Friday 3 October, 2008

Disaster Recovery Procedures in SQL Server 2005 Part 1

Problem
SQL Server 2005 has given us a lot of options on implementing high availability and disaster recovery. More than the technologies themselves, it is important to come up with the proper procedures as we manage different disaster recovery scenarios. How do we come up with procedures for various SQL Server disaster recovery scenarios?

Solution
This series of articles will look at different disaster recovery scenarios and the procedures involved in your recovery plan. When considering disaster recovery options for your SQL Server 2005 database, you should include as many technologies as you can so you'll have a pool of options to choose from if a disaster arises. While having these technologies in place is important, it is the process that goes with them that makes them effective. For this article, let's take a simple scenario where a user accidentally dropped or truncated a table about 5 hours after a database backup was taken. Restoring from that backup would mean losing 5 hours' worth of data, and most companies would rather accept a loss of time than a loss of data. Plus, if this was a very large database, it would take quite some time to recover and bring it online. We'll take this scenario to build a procedural approach that recovers the database as quickly as possible while minimizing data loss. We will use the Northwind database to demonstrate the process. Remember to change Northwind's database recovery model to FULL before working through the steps below, as shown here.
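If Northwind is currently using the SIMPLE recovery model, one way to switch it to FULL is:

USE master
GO
ALTER DATABASE Northwind
SET RECOVERY FULL
GO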

1) Make sure you have a very good backup

For the scenario above, let's say you have a daily backup running at 6:00 AM and there are no database snapshots created on a regular basis. Your database is configured to use a single MDF and LDF file, which is not ideal from a disaster recovery standpoint. Let's generate a full database backup of our Northwind database, which will be the starting point for database recovery. Here is that code:
 

USE master
GO
BACKUP DATABASE Northwind
TO DISK = N'D:\DBBackup\NorthwindBackup.bak'
WITH NAME = N'Full Database Backup',
DESCRIPTION = 'Starting point for recovery',
INIT, STATS = 10
GO

Looking at the Northwind database schema, you won't easily be able to drop the Products, Orders and Customers tables due to foreign key constraints defined by other tables like the Order Details table. But I bet you can easily drop the Order Details table. Let's simulate that disaster by dropping the table at around 11:00 AM.

DROP TABLE  [Order Details]

2) Contain the problem

Since the database only has a single MDF and LDF file, we can't do much to contain the problem. All we can do is take the database away from its users by setting it to restricted user mode:

USE master
GO

ALTER DATABASE Northwind
SET RESTRICTED_USER
WITH ROLLBACK IMMEDIATE
GO

This effectively takes the database offline for regular users, rolling back and terminating all active connections. This is where the clock starts ticking and we need to take action. Keep in mind your recovery point objective (RPO) and recovery time objective (RTO) while proceeding with the next steps.

3) Backup the transaction log

A good DBA knows that the first thing to do when disaster strikes is to back up the transaction log - assuming your database is set to the FULL recovery model. This ensures you still have all the active transactions in the log since your last backup. In our scenario, the last backup - a FULL backup, in this case - occurred at 6:00 AM, so this log backup captures everything from then up to the present.

BACKUP LOG Northwind
TO DISK = N'D:\DBBackup\NorthwindBackupLog.trn'
WITH NAME = N'Transaction Log Backup',
DESCRIPTION = 'Getting everything to current point in time.',
STATS = 10
GO

4) Restore the database to a known good point-in-time

Now, a user who accidentally dropped a table or caused other damage to a database will rarely tell you immediately. Sometimes you may have to dig it up for yourself, but that takes quite a lot of time. Since we want to bring the database back online as fast as we can, let's just assume a "known good" point-in-time and leave the digging for later. In the script below, I have chosen 10:42 AM as my STOPAT parameter's "known good" point-in-time for demonstration purposes.

RESTORE DATABASE Northwind
FROM DISK = N'D:\DBBackup\NorthwindBackup.bak'
WITH NORECOVERY, RESTRICTED_USER
GO

RESTORE LOG Northwind
FROM DISK = N'D:\DBBackup\NorthwindBackupLog.trn'
WITH RESTRICTED_USER,
STOPAT = '2008-09-23 10:42:44.00', RECOVERY
-- use a "known good" point in time
GO

Although we have restored the database to a "known good" point-in-time, we don't really know how much data we've lost. We need to find out the exact time when the very last INSERT statement occurred prior to the DROP TABLE statement so that we can recover as much data as we possibly can. But we don't want to do this directly on the server, as we need to get the database back online as quickly as we can. This is where the succeeding steps prove to be very valuable.

You can validate whether the dropped table has been restored by running a query against it.

SELECT *
FROM Northwind.dbo.[Order Details]
GO

5) Create a snapshot of the restored point

We will create a database snapshot of the restored database for further processing. This database snapshot will be our reference in recovering data to the exact time prior to the DROP TABLE statement.

USE master
GO

CREATE DATABASE Northwind_RestorePointSnapshot
ON
( NAME = N'Northwind',
  FILENAME = N'D:\DBBackup\NorthwindData_RestorePointSnapshot.snap' )
AS SNAPSHOT OF [Northwind]
GO

Depending on the table schema, we can opt to leave it as it is - as we will do for the Order Details table - or do a few more tasks. If the table has an existing IDENTITY column, we need to create a gap between the column's current maximum value and the assumed number of rows we will need to recover. This, of course, depends on the number of transactions that occur on the server. To identify the current value of the IDENTITY column, you can run the DBCC CHECKIDENT command as shown below:

DBCC CHECKIDENT ('tableName')
--Displays the current identity value for the restored table
GO

This will return the current maximum value of the IDENTITY column. Let's say the number of transactions on this table per day is around 4000 records; we can create a gap between the maximum value and the next value. If the maximum value for the IDENTITY column is 25000, we add 4000 to it and run the DBCC CHECKIDENT command again with the RESEED parameter (we are simply assuming that you can recover the lost data within a day, hence the value 4000):

DBCC CHECKIDENT ('tableName', RESEED, 29000)
--Reseeds the IDENTITY column so new rows start after 29000, leaving a gap for the recovered rows
GO

6) Bring the database back online

Once you have managed to do that, change the database option to bring it online and allow users to connect back and run their applications.

USE master 
GO 

ALTER DATABASE Northwind 
   SET MULTI_USER
GO

Now the database is back online and the dropped table has been restored. Although everyone is happy by now, our job as a DBA does not stop here. Remember that we still need to recover the lost data from the "known good" point-in-time up to the moment before the DROP TABLE command was executed. That is the only way we can recover as much data as we can. Though there are a few third-party tools that can read the transaction log and recover the data by replaying those transactions, most of us do not have the luxury of playing around with those tools. So our next best bet is to use RESTORE with the STOPAT option. It's a bit tedious and sometimes very stressful, as one mistake plunges you back into repeating the entire process. All we need here is to find out the times of our backups up to the end of the tail (transaction log) backup. In our scenario, the last backup occurred at 6:00 AM and our "known good" point-in-time is 10:42:44 AM. Therefore, you can start doing a RESTORE with the STOPAT option from 10:42:44 AM onward, increasing the STOPAT time value by as little as a second at a time. If you are not quite sure when the last backup occurred, you can always query the msdb database.

SELECT *
FROM msdb.dbo.backupset AS a
INNER JOIN msdb.dbo.backupmediafamily AS b
ON a.media_set_id = b.media_set_id
WHERE database_name = 'Northwind'
ORDER BY backup_finish_date

Note the backup_finish_date and type columns (type is D for a full database backup and L for a log backup), as these give you an idea of the times you need to consider for the STOPAT value in your RESTORE command.

7) Restore another copy of the damaged database with a different name for investigation

Restoring another copy of the damaged database under a different name allows us to work on recovering the data without worrying about availability, since we've already managed to bring up the production database. Just make sure you select a different name and database file location for the restored database or you'll end up damaging the database you just brought online. We will simply repeat what we did in step #4, but with a twist - a different name and database file location.

RESTORE DATABASE Northwind_recover
FROM DISK = N'D:\DBBackup\NorthwindBackup.bak'
WITH MOVE N'Northwind' TO N'D:\DBBackup\NorthwindData_recover.mdf',
MOVE N'Northwind_Log' TO N'D:\DBBackup\NorthwindLog_recover.ldf',
STANDBY = N'D:\DBBackup\Northwind_UNDO.bak',
STATS = 10
GO

RESTORE LOG Northwind_recover
FROM DISK = N'D:\DBBackup\NorthwindBackupLog.trn'
WITH STANDBY = N'D:\DBBackup\Northwind_UNDO.bak',
STATS = 10, STOPAT = '2008-09-23 10:42:44.00'
GO

Document the value of your STOPAT parameter, as this is the most critical parameter you'll work with during this process. Since we just repeated the process in step #4, we know for a fact that the DROP TABLE command had not yet been executed at this point in time.

8) Restore the transaction log by increasing the value of the STOPAT parameter

We run another RESTORE LOG command, increasing the STOPAT parameter value by a minute - from 10:42:44.00 to 10:43:44.00.

RESTORE LOG Northwind_recover
FROM DISK = N'D:\DBBackup\NorthwindBackupLog.trn'
WITH STANDBY = N'D:\DBBackup\Northwind_UNDO.bak',
STATS = 10, STOPAT = '2008-09-23 10:43:44.00'
GO

This is the part where it becomes iterative. Don't get frustrated at this point, as it can be really tedious. You can increase the value by a minute, 5 minutes, or 10 minutes, documenting the time at each step. Remember to run a test query on the dropped object after each RESTORE LOG command. I would recommend keeping a table for this activity that looks something like this:

TIME           OBJECT EXISTED?
10:43:44.00    YES
10:48:44.00    YES
10:58:44.00    YES
11:03:44.00    NO

With this information, you know for a fact that the table was dropped between 10:58:44.00 and 11:03:44.00. You can repeat step #8, increasing the value of the STOPAT parameter by a minute or even a second, since you now have a smaller interval to work with. If you find yourself overshooting the time value of the STOPAT parameter, go back to step #7 armed with the tabular information you documented in step #8, which makes the restore process a bit faster. Just remember to use the WITH RECOVERY option on the last RESTORE LOG statement, like this:

RESTORE LOG Northwind_recover
FROM DISK = N'D:\DBBackup\NorthwindBackupLog.trn'
WITH STATS = 10, STOPAT = '2008-09-23 11:01:44.00', RECOVERY
GO

Once you've managed to restore the database to the time just before the DROP TABLE command was executed, you can do a comparison between what was restored on the production database and what was recovered. You can do this in a number of different ways. Since we already have a database snapshot created earlier, we will use that together with the TableDiff utility. Although the tool was designed for replication, we can use it for disaster recovery as well. A previous tip, SQL Server 2005 tablediff command line utility, gives an overview of how to use this tool; just note that your source database will be the one you recovered and the destination database will be your database snapshot. This is where your database snapshot proves to be very important, especially if you are dealing with more than one object, which is normally the case. If you are not comfortable with command-line utilities, a GUI version was created by the folks at SQLTeam.com. You might want to check that out as well and include it in your DBA toolbox.
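For reference, a tablediff invocation for this scenario might look something like the following sketch - the server name is a placeholder, and the -f switch writes out a T-SQL script you can review before applying anything to production:

tablediff.exe -sourceserver "MYSERVER" -sourcedatabase "Northwind_recover" -sourceschema "dbo" -sourcetable "Order Details" -destinationserver "MYSERVER" -destinationdatabase "Northwind_RestorePointSnapshot" -destinationschema "dbo" -destinationtable "Order Details" -f "D:\DBBackup\OrderDetailsDiff.sql"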

You can also do an INSERT...SELECT, where you insert records into the production database based on a query against the recovered database. Since our Order Details table does not have an IDENTITY column, we can create our own reference by inserting the records into a work table and using the ROW_NUMBER() function:

--This inserts records into a work table and assigns a dummy identity value for reference
SELECT ROW_NUMBER() OVER (ORDER BY OrderID) AS ROWID, *
INTO Northwind_recover.dbo.OrderDetailsRecover
FROM Northwind_recover.dbo.[Order Details]

--This inserts recovered records from the recovered database into the production database based on
--the dummy identity value we have assigned for reference

INSERT INTO Northwind.dbo.[Order Details] (OrderID, ProductID, UnitPrice, Quantity, Discount)
SELECT OrderID, ProductID, UnitPrice, Quantity, Discount
FROM Northwind_recover.dbo.OrderDetailsRecover
WHERE ROWID >
    (
    SELECT COUNT(*)
    FROM Northwind_RestorePointSnapshot.dbo.[Order Details]
    )

Notice that we used our database snapshot to identify the difference between what we managed to restore and what we have recovered.

Next Steps

  • It is important to have a disaster recovery plan in place and the procedures necessary for recovery. While it is impossible to come up with procedures for every type of disaster, it helps to list the disasters that may happen, prepare a disaster recovery plan with procedures for each, and document them accordingly.
  • Simulate this particular process by going through the steps outlined above.
  • You can download the Northwind database used in the sample here.

Using SQL Server datetime functions GETDATE, DATENAME and DATEPART

Transact-SQL includes a set of functions that let you retrieve the current date and time or retrieve the individual parts of a DATETIME or SMALLDATETIME value. For example, you can extract the day, month or year from a datetime value, as well as the quarter, week, hour or even the millisecond. In this article, I describe each of these functions and provide examples that demonstrate how to use these functions to retrieve datetime data in SQL Server. Note that this article assumes that you have a working knowledge of T-SQL and the DATETIME and SMALLDATETIME data types. For more information about these types, see part one in this series, Basics for working with DATETIME and SMALLDATETIME in SQL Server 2005.

Retrieving the current date and time

One of the handiest datetime functions in T-SQL is GETDATE, which retrieves the current date and time based on the clock settings on the local system. To use GETDATE, simply call the function in your T-SQL statement without specifying any arguments, as in the following example:

SELECT GETDATE() AS [Current Date/Time]

In this case, I use GETDATE in the SELECT list to retrieve the date/time value. (Note that you must include the ending set of parentheses even if you don't pass in any arguments.) The statement returns results similar to the following:

Current Date/Time

2008-07-29 10:45:13.327

By default, the GETDATE function returns the datetime value in the format shown here. However, you can change the format of the results by using the CONVERT function. For information about using CONVERT, refer to part two in this tip.
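As a quick illustration, style 101 produces the U.S. mm/dd/yyyy format and style 120 the ODBC canonical format:

SELECT CONVERT(VARCHAR(10), GETDATE(), 101) AS [US Format],
       CONVERT(VARCHAR(19), GETDATE(), 120) AS [ODBC Canonical]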

Another Transact-SQL function that is just as easy to use is GETUTCDATE, which retrieves the current Coordinated Universal Time (UTC) -- also referred to as Greenwich Mean Time. The retrieved value is based on the clock and time zone settings on the local system. As you saw with GETDATE, you call GETUTCDATE within your Transact-SQL statement without including any arguments, as shown in the following example:

SELECT GETUTCDATE() AS [UTC Date/Time]

When you run this statement, you receive results similar to the following:

UTC Date/Time

2008-07-29 17:45:13.327

Notice that the time returned here is seven hours later than the time shown in the previous example. I ran both of these statements at the same time on a system configured for the Pacific time zone (during daylight savings time).

As you've seen in the last two examples, the functions are included within the SELECT list. However, the functions can be especially beneficial when using them to define a default value in your table definition. For example, the following three statements create the Orders table -- including a DATETIME column (OrderDate) -- insert data into the table and retrieve that data:

CREATE TABLE Orders 
(
OrderID INT PRIMARY KEY IDENTITY, 
Product VARCHAR(30) NOT NULL, 
OrderAmt INT NOT NULL, 
OrderDate DATETIME NOT NULL DEFAULT GETDATE()
) 
GO
INSERT INTO Orders (Product, OrderAmt) 
VALUES('Test Product', 12) 
GO
SELECT * FROM Orders

The OrderDate column definition includes a DEFAULT clause that specifies GETDATE as the default value. As a result, when you insert a row into the table, the current date and time are automatically inserted into the column, as shown in the results returned by the SELECT statement:

OrderID  Product       OrderAmt  OrderDate
1        Test Product  12        2008-07-29 10:46:47.420

You can use the information as a timestamp in order to track when records are added and to assist in auditing the data, if necessary. This is also handy for other operations that use the timestamp when retrieving data. For example, an extract, transform and load (ETL) process might reference the timestamp when determining whether to extract or update data.
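For example, a minimal sketch of an incremental extract against the Orders table might look like this (the @LastExtract variable and its value are hypothetical placeholders for the timestamp your ETL process recorded on its previous run):

DECLARE @LastExtract DATETIME
SET @LastExtract = '2008-07-29 00:00:00' -- hypothetical: when the last ETL run finished

SELECT OrderID, Product, OrderAmt, OrderDate
FROM Orders
WHERE OrderDate > @LastExtract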

Retrieving the year, month or day

In some cases, you might want to retrieve the year, month or day from a DATETIME or SMALLDATETIME value. One approach is to use the YEAR, MONTH or DAY function to retrieve the necessary data (as an integer). The following SELECT statement is an example of how this works:

SELECT YEAR(PostTime) AS [Year], 
MONTH(PostTime) AS [Month], 
DAY(PostTime) AS [Day] 
FROM DatabaseLog
WHERE DatabaseLogID = 1

The SELECT clause includes three column expressions. The first one uses the YEAR function to retrieve the year from the PostTime column in the DatabaseLog table (in the AdventureWorks sample database). When you call the YEAR function, you specify the column name (or other expression) as an argument to the function. The MONTH and DAY functions work the same way. The second column expression in the SELECT clause uses the MONTH function to retrieve the month from the PostTime column, and the third expression uses DAY to retrieve the day. The following results show you the type of information that the statement returns:

Year  Month  Day
2005  10     14

Each value is extracted from the PostTime column and returned as an integer. (The value stored in the table is 2005-10-14 01:58:27.567.)

These functions are an easy way to retrieve the year, month or day, but, in some cases, you might want more control over the type of values returned as well as the format of those values. In addition, you might want to extract the time from the date/time value. Fortunately, Transact-SQL supports functions that provide this type of functionality.

Retrieving parts of a date/time value

Like the YEAR, MONTH and DAY functions, the DATEPART function returns an integer representing a specific part of the date/time value. For example, the following SELECT statement returns the same results as the preceding example:

SELECT DATEPART(yy, PostTime) AS [Year], 
DATEPART(mm, PostTime) AS [Month], 
DATEPART(dd, PostTime) AS [Day] 
FROM DatabaseLog
WHERE DatabaseLogID = 1

The first thing to note is that, when you call DATEPART, you specify two arguments. The first argument determines the date/time component to retrieve, and the second argument is the source column. For the first argument, you must use one of the supported abbreviations to specify the datetime part. The following table lists the date/time parts you can retrieve and the abbreviations you must use to retrieve those parts:

Date/time part   Abbreviations
year             yy, yyyy
quarter          qq, q
month            mm, m
day of year      dy, y
day              dd, d
week             wk, ww
weekday          dw
hour             hh
minute           mi, n
second           ss, s
millisecond      ms

For some datetime parts, more than one abbreviation is supported. For example, you can use "yy" or "yyyy" as your first DATEPART argument to retrieve the year from the date/time value. Notice that the table includes abbreviations for date/time parts other than year, month or day. In other words, you can retrieve the quarter, the day of the year, the week of the year, and the weekday as shown in the following SELECT statement:

SELECT DATEPART(qq, PostTime) AS [Quarter], 
DATEPART(dy, PostTime) AS [DayOfYear], 
DATEPART(wk, PostTime) AS [Week], 
DATEPART(dw, PostTime) AS [Weekday] 
FROM DatabaseLog
WHERE DatabaseLogID = 1

As in the preceding example, each instance of DATEPART includes two arguments: the date/time part abbreviation and the source column. The statement returns the following results:

Quarter  DayOfYear  Week  Weekday
4        287        42    6

Notice that the weekday is shown as 6. By default, SQL Server begins the week with Sunday, so weekday 6 is equivalent to Friday.
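You can check that setting with @@DATEFIRST and change it with SET DATEFIRST if you want the week to start on a different day; for example:

SELECT @@DATEFIRST AS [First Day of Week] -- returns 7 (Sunday) under the U.S. English default

SET DATEFIRST 1 -- make Monday day 1
SELECT DATEPART(dw, PostTime) AS [Weekday]
FROM DatabaseLog
WHERE DatabaseLogID = 1
-- the Friday in our example row is now reported as weekday 5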

The preceding two examples retrieved only values related to dates. However, as the following example shows, you can also retrieve data related to time:

SELECT DATEPART(hh, PostTime) AS [Hour], 
DATEPART(mi, PostTime) AS [Minute], 
DATEPART(ss, PostTime) AS [Second], 
DATEPART(ms, PostTime) AS [Millisecond] 
FROM DatabaseLog
WHERE DatabaseLogID = 1

In this case, the statement is retrieving the hour, minute, second and millisecond, as shown in the following results:

Hour  Minute  Second  Millisecond
1     58      27      567

The primary limitation of the DATEPART function is that it returns only integers, which is why Friday is shown as 6. However, if you want to display actual names of days and months, you can use the DATENAME function. The DATENAME function works exactly like the DATEPART function. DATENAME takes the same number of arguments and supports the same abbreviations. For example, if you want to retrieve the year, month and day, as you saw in an earlier example, you simply replace DATEPART with DATENAME:

SELECT DATENAME(yy, PostTime) AS [Year], 
DATENAME(mm, PostTime) AS [Month], 
DATENAME(dd, PostTime) AS [Day] 
FROM DatabaseLog
WHERE DatabaseLogID = 1

Now your results will look like the following:

Year  Month    Day
2005  October  14

The month value is now October, rather than 10. The year and day, however, remain integers because that's the only way to represent them. You can also use the DATENAME function for other date/time components, as in the following example:

SELECT DATENAME(qq, PostTime) AS [Quarter], 
DATENAME(dy, PostTime) AS [DayOfYear], 
DATENAME(wk, PostTime) AS [Week], 
DATENAME(dw, PostTime) AS [Weekday] 
FROM DatabaseLog
WHERE DatabaseLogID = 1

Once again, I've replaced DATEPART with DATENAME, but changed nothing else. The statement returns the following results.

Quarter  DayOfYear  Week  Weekday
4        287        42    Friday

Notice that the quarter, day of the year and week are still integers, but the weekday now says Friday, rather than 6. You can also use DATENAME to retrieve the time components of a date/time value, but the results will always be integers, as you would expect.

 

Monitoring Changes in Your Database Using DDL Triggers

Introduction

Additions, deletions, or changes to objects in a database can cause a great deal of hardship and require a DBA or developer to rewrite existing code that references the affected entities. To make matters worse, tracking down the problematic alteration can be like looking for a needle in a haystack. Utilizing a DDL trigger in conjunction with a single user-created table to document such changes can considerably minimize the headaches involved in tracking and locating schema changes.

Creating the Table and DDL Trigger

The first step in implementing such a tracking strategy is to create a table that will be used to record all DDL actions fired from within a database. The code below creates a table in the AdventureWorks sample database that will hold all such DDL actions:

USE AdventureWorks
GO
CREATE TABLE AuditLog
(ID        INT PRIMARY KEY IDENTITY(1,1),
Command    NVARCHAR(1000),
PostTime   NVARCHAR(24),
HostName   NVARCHAR(100),
LoginName  NVARCHAR(100)
)
GO

After creating the table to hold our DDL events it is now time to create a DDL trigger that will be specific to the AdventureWorks database and will fire on all DDL_DATABASE_LEVEL_EVENTS:

CREATE TRIGGER Audit ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
AS
DECLARE @data XML
DECLARE @cmd NVARCHAR(1000)
DECLARE @posttime NVARCHAR(24)
DECLARE @spid NVARCHAR(6)
DECLARE @loginname NVARCHAR(100)
DECLARE @hostname NVARCHAR(100)
SET @data = EVENTDATA()
SET @cmd = @data.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(1000)')
-- strip leading/trailing whitespace and remove carriage returns added by the SSMS scripting wizard
SET @cmd = LTRIM(RTRIM(REPLACE(@cmd, CHAR(13) + CHAR(10), '')))
SET @posttime = @data.value('(/EVENT_INSTANCE/PostTime)[1]', 'NVARCHAR(24)')
SET @spid = @data.value('(/EVENT_INSTANCE/SPID)[1]', 'nvarchar(6)')
SET @loginname = @data.value('(/EVENT_INSTANCE/LoginName)[1]',
    'NVARCHAR(100)')
SET @hostname = HOST_NAME()
INSERT INTO dbo.AuditLog(Command, PostTime,HostName,LoginName)
 VALUES(@cmd, @posttime, @hostname, @loginname)
GO

The purpose of the trigger is to capture the EVENTDATA() that is created when the trigger fires, parse the data from the XML variable, and insert it into the appropriate columns of our AuditLog table. The parsing of the EVENTDATA() is rather straightforward, except when extracting the command text, which uses the following code:

SET @cmd = LTRIM(RTRIM(REPLACE(@cmd, CHAR(13) + CHAR(10), '')))

The LTRIM and RTRIM strip all leading and trailing white space, while the REPLACE removes the carriage returns that are added when using the scripting wizard from SSMS. This will make it possible later to use SSRS string functions to further parse the command text and offer greater detail.

Once the table and trigger have been created, you can run a quick test to ensure everything is working properly:

UPDATE STATISTICS Production.Product
GO
CREATE TABLE dbo.Test(col INT)
GO
DROP TABLE dbo.Test
GO
-- View log table
SELECT *
FROM dbo.AuditLog
GO

The results of the above query should show one row for each DDL action fired - the UPDATE STATISTICS, CREATE TABLE and DROP TABLE statements - along with the post time, host name and login name for each.
Conclusions

By creating a table to hold all DDL actions and a database-level DDL trigger, we can successfully capture all DDL changes to our database and gain a far greater ability to track and monitor any such change.

As performance is most often the deciding factor in whether to implement such change control, I have limited excessive parsing and formatting in the above trigger. Consider this the first step: documenting. Later I will post how to utilize Reporting Services to provide reports showing:

1. DDL action: CREATE, ALTER, DROP, etc.

2. The schema and object affected

3. Workstation executing DDL statements

4. Drill down report to show object dependencies

That will use the documenting objects created above to provide greater insight and detail external of your production environment.
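In the meantime, a rough first cut at item 1 is possible in plain T-SQL. This sketch simply assumes each logged command begins with its DDL verb:

SELECT LEFT(LTRIM(Command), CHARINDEX(' ', LTRIM(Command) + ' ') - 1) AS DDLAction,
       COUNT(*) AS ActionCount
FROM dbo.AuditLog
GROUP BY LEFT(LTRIM(Command), CHARINDEX(' ', LTRIM(Command) + ' ') - 1)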

 
