An Unfortunate Tale of Data Lost but Mostly Recovered: Best Practices for Working on Production Databases

Last Friday, a co-worker came up to me and asked if there was a way to rollback an update because they accidentally updated all the records in a table in a customer’s live database.

Blurgh.

After consulting our DBA, we decided to restore a backup and update the values in the live database from the backup. But guess what? The customer’s maintenance plan was disabled and backups hadn’t been created in over a week.

Blurgh.

Well, at least we could still update the records using the week-old database. Before we did that, we figured we should probably create a backup of the database, though. So, I executed the backup maintenance plan, and it completed. With a “good” backup saved (successfully created but containing the erroneous update), we set out to restore the week-old backup, but it was gone. The maintenance plan was configured to delete backups more than one week old.

Blurgh!

No problem, our DBA has a tool that can extract data from the transaction logs to create an undo query. We just need the transaction logs, but–of course–those were also lost with the full backup.

BLURGH!

Luckily, the field that was lost was a timestamp, and we were able to reasonably reconstruct the values based on timestamps in related tables. That was our final option, and in the end it worked out okay. The field was of low importance to the customer, and they were more-or-less indifferent about the potential loss of data. If this had been a more important field, this could have been a catastrophic sequence of events, though. The worst part is that this entire data-scare was simply the result of unawareness and recklessness that could have been completely avoided by employing safer practices. And so, with that, I present to you my best practices for working with production databases.

Best Practices for Working on Production Databases

Certain data within a customer’s production database can be critical to their day-to-day operations, and there are few things less pleasant than telling them that you accidentally changed some of their data and are unable to get it back. Knowing how to work safely in a production database is the first step to avoiding finding yourself in that position. Mistakes do happen, though, so it’s a good idea to be prepared with multiple recovery options, should one of your precautions fail. Here is a list of best practices that you can employ to help ensure no critical data is lost while working with a production database.

Work in a Test Environment

The safest way to avoid production database mishaps is to not work with the customer’s production database. This may not be an option for what you’re trying to accomplish, but it is clearly the safest choice when viable.

Make Sure You Have a Database is Backup

Check the customer’s maintenance plan history to ensure that an up-to-date backup exists. This is your disaster recovery plan. If you find that the backup job has either been failing or not running at all, do not update the customer’s database. Work with the customer to create a successful backup before proceeding.

Backup the Table(s) You’re Working With

Also backup individual tables by using a simple SELECT-INTO statement. Append “_Backup” to the table name along with a date stamp to create a unique and descriptive name for your backup.

SELECT * INTO SomeTable_BackupYYYYMMDD FROM SomeTable

This backup will not have the same triggers and indexes, but it will provide you with the data you need in the event that records are unintentionally modified.

Be sure to clean up once you’re sure you no longer need the table backup.

DROP TABLE SomeTable_BackupYYYYMMDD

Backup the Stored Procedure(s) You’re Working With

If you’re working with stored procedures, you can back them up by saving a .sql script. To do this, locate the stored procedure in SQL Server Management Studio’s (SSMS) Object Explorer, right-click > Modify or Script Stored Procedure as > ALTER To > File …, and save the script to the desired location.

You can also get the current stored procedure by using the sp_helptext command. Note that the result of sp_helptext is subject to SSMS’s character limit.

sp_helptext 'someProcedure'

Copy the results to a new query, and save it to your backup location.

Be Aware of Table Triggers

If you are modifying records in a table that has data manipulation language (DML) triggers, there is the potential for the query to get stuck in a suspended state. If the process is then killed while suspended, it could put the transaction in a killed/rollback state for an extended period of time, which could hinder customer operations.

You can easily check for triggers on a table by using the sp_helptrigger command.

sp_helptrigger 'SomeTable'

In cases of multi-row updates on tables with DML triggers, it is best to consult a DBA if you are unsure what the impact will be.

Use Transactions

At the top of your query, start a new transaction by using the BEGIN TRAN command. Execute your queries and verify the results. If you wish to keep the results, complete the transaction by using the COMMIT command. If you wish to discard the results, use the ROLLBACK command.

Here is a suggested workflow for applying changes within a transaction:

  1. Begin transaction
  2. Query to establish how many rows should be affected
  3. Query to update/insert/delete
  4. Query to verify results
  5. Rollback transaction
  6. Once results have been verified in the transaction, modify #5 to commit the transaction

Note that tables will be locked while once they are modified in a transaction until the transaction is committed or rolled-back.

Advertisement

Find Duplicate Database Entries with LINQPad

A co-worker and I were chatting about a code problem he was having that was likely due to duplicate entries in a database table. He thought that records in the table were unique based on two columns, but that didn’t seem to agree with what was happening in code. He wanted to write a SQL query to identify any duplicates, but he didn’t know how to do it. He doesn’t write a lot of SQL and wasn’t comfortable with it. He uses LINQ everyday, though, so I suggested he do it through one of my favorite tools: LINQPad.

Regardless of whether you’re doing it in SQL or LINQPad, the approach is the same: group by the fields and filter to show only groups with more than one item.

So, let’s do that in LINQPad. Here are the quick-steps to get you caught up to the point of writing your query:

  1. Open LINQPad
  2. Configure connection
  3. Configure query to use the connection
  4. Write query

There are two things to keep in mind when writing queries in LINQPad. First, the table names will be pluralized. “SomeObject” becomes “SomeObjects.” Second, field names will be case-sensitive and always start with a capital letter. Now let’s get to business…

SomeObjects.GroupBy(x => x.FirstColumn + "|" + x.SecondColumn)
    .Where(x => x.Count() > 1)

Boom! That’s all there is to it. I concatenate the columns to create a group key, and filter the results to only show groups with more than one item.

%d bloggers like this: