CDC in SQL server 2008 R2 (Part I)

Vishvvas
Posted by in Sql Server category on for Intermediate level | Points: 250 | Views : 23092 red flag
Rating: 5 out of 5  
 1 vote(s)

This article (Part I) explores one of the feature (of SQL server 2008 and higher) of providing the auditing abilities i.e. CDC – Change Data Capture. We will discuss the intrinsic of CDC in Part II.

Introduction and background:

Auditing considerations are important while designing the applications especially enterprise ones. In transaction based system, auditing and tracking are crucial component of any system from the perspective of security as well as data. Data auditing needs to be employed for configuration or master data wherein the changes would have large impact.  

Traditional techniques are using triggers, stored procedures or hooking the logic in front end. All these approaches have pros and cons. Triggers were popular choice but there inherent performance issues or possibility of bottlenecks along-with dependency on table structure is making them tricky.

Stored procedures are also preferred but it has limitations of not able to track changes outside it.

Triggers and stored procedures are not agnostic of table structure changes and also prone to bugs.

Although triggers are automatic, most of the database architects are refraining themselves from recommending those. Not to miss is that triggers also add maintenance overheads.

The third approach taken is where the auditing mechanism is implemented in UI through auditing components and such calls are hooked to the places where auditing is to be accomplished. This looks simpler but has lot of disadvantages as it has to be implemented at many places introducing redundancy, additional database trip and also such operation needs to be included in a transaction along-with the database operation.  This also incurs maintenance overhead.

To address such issues, Microsoft has come up with more sophisticated and atomic approach called CDC- Change Data Capture. It provides ability to record all data change operations such as INSERT, UPDATE and DELETE. The data is captured asynchronously which means it is not captured in transactions containing these operations rather it is captured through the transaction logs reducing the performance impact.


Objective

To learn about the CDC feature of SQL server (version 2008 and higher)


Using the code

Note: For the examples, the schema for Employee table from sample database i.e. AdvntureWorks in SQL 2008 R2 is used (after some changes in structure). For demonstration, the scripts are also attached.

Setting up CDC has few administrative tasks to be performed on the database for who CDC is to be implemented.

Let’s create a database called “TrialDB”.

USE master;

go

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'TrialDB')

      DROP DATABASE TrialDB;

go

 

CREATE DATABASE TrialDB;

Go

The next step is to enable the CDC for this database.

USE TrialDB;

GO

 

EXECUTE sys.sp_cdc_enable_db;

GO

USE master;

go

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'TrialDB')

      DROP DATABASE TrialDB;

go

 

CREATE DATABASE TrialDB;

Go

The next step is to enable the CDC for this database.

USE TrialDB;

GO

 

EXECUTE sys.sp_cdc_enable_db;

GO

The execution of this command results into 3 actions

1.       Create the Schema named “cdc”

2.       Create database user named “cdc”

3.       Create 5 tables with schema “cdc”

This is depicted in following diagram.



Is it all done? Not yet, the CDC is enabled for database but it has to be enabled for individual tables too.

Let’s create a table called “Employee”. This table originally belongs to database “AdventureWorks”.

SET ANSI_NULLS ON

GO

 

SET QUOTED_IDENTIFIER ON

GO

 

CREATE TABLE [Employee](

      [EmployeeID] [int] IDENTITY(1,1) NOT NULL,

      [NationalIDNumber] [nvarchar](15) NOT NULL,

      [ContactID] [int] NOT NULL,

      [LoginID] [nvarchar](256)  NULL,

      [ManagerID] [int] NULL,

      [Title] [nvarchar](50)  NULL,

      [BirthDate] [datetime]  NULL,

      [MaritalStatus] [nchar](1)  NULL,

      [Gender] [nchar](1)  NULL,

      [HireDate] [datetime]  NULL, 

      [VacationHours] [smallint]  NULL,

      [SickLeaveHours] [smallint]  NULL,

     

      [rowguid] [uniqueidentifier] ROWGUIDCOL   NULL,

      [ModifiedDate] [datetime]  NULL,

 CONSTRAINT [PK_Employee_EmployeeID] PRIMARY KEY CLUSTERED

(

      [EmployeeID] ASC

)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

) ON [PRIMARY]

 

GO

 

Once the table is created, the CDC can be enabled as shown in following screenshot. This shows the script as well as a new table created named “cdc.Employee_CT”.

Is it all done? Not yet, the CDC is enabled for database but it has to be enabled for individual tables too.

Let’s create a table called “Employee”. This table originally belongs to database “AdventureWorks”.

 

Hurray, the CDC is set up for database table and table and we are ready to plunge into database operations and see the results thereof.

INSERT:  Following query would insert a record into Employee table

INSERT INTO [TrialDB].[DBO].[Employee]

           ([NationalIDNumber]          

           ,[ContactID]

           ,[LoginID]

           ,[ManagerID]

           ,[Title]

           ,[BirthDate]

           ,[MaritalStatus]

           ,[Gender]

           ,[HireDate]        

           ,[VacationHours]

           ,[SickLeaveHours] 

)

VALUES

       (14417807

           ,1209

           ,'trial\guy1'

           ,16

           ,'Production Technician - WC60'

           ,NULL

           ,'M'

           ,'M'

           ,GETDATE()          

           ,21

           ,30

           )

 

So this INSERT operation shall be captured along-with data and we can confirm this with following query.

 

We can see the data inserted captured along-with some system data.

__$operation  denotes the database operation

1: DELETE

2: INSERT

3: Before UPDATE

4: After UPDATE

UPDATE: We will update the inserted record and change the data for column “NationalIDNumber” as depicted below. 

As seen, there are 2 records in the CDC table one for record before update and other for after update. It is interesting and very useful for security sensitive systems.

Let’s insert one more record.

INSERT INTO [TrialDB].[DBO].[Employee]

           ([NationalIDNumber]

           ,[ContactID]

           ,[LoginID]

           ,[ManagerID]

           ,[Title]

           ,[BirthDate]

           ,[MaritalStatus]

           ,[Gender]

           ,[HireDate]     

           ,[VacationHours]

           ,[SickLeaveHours]    

          )

     VALUES

           (14417809

           ,1210

           ,'trial\guy1'

           ,16

           ,'Production Technician - WC60'

           ,NULL

           ,'M'

           ,'M'

           ,GETDATE()          

           ,21

           ,30

           )

GO

Following is snapshot of data in CDC table where we can see the records for this insert.

DELETE: We would delete the first record and see how it is captured.

Summary and Conclusion

The auditing and tracking are important system need for enterprise applications. Especially in Finance domain where the systems are transactions base, it attains far more importance.  Traditionally the triggers, stored procedure and front end components were implemented. They have pro and cons and fit for particular scenarios. This has performance impacts as well as maintenance overhead. With Microsoft’s CDC feature in SQL server 2008, it seems that it is fool proof feature addressing the needs of atomicity, performance and more detailed and sophisticated approach.

We saw a practical example and also went through database operations and their data capture.

This looks like very promising feature and it would be easy to settle for at least new project. For existing systems, there is little one time overhead for setting up the CDC but it could be worth given its benefits.

HAPPY PROGRAMMING!!!

Reference

http://msdn.microsoft.com/en-us/library/bb522489(v=sql.105).aspx

 

Page copy protected against web site content infringement by Copyscape

About the Author

Vishvvas
Full Name: Vishwas Sutar
Member Level: HonoraryPlatinum
Member Status: Member,MVP
Member Since: 5/30/2011 2:13:10 AM
Country: India

http://www.dotnetfunda.com
Extensive and rich experience across gamut of technologies and programming languages like PB,VB,C++,VB.NET, C#, Classic ASP,ASP.NET, ASP.NET MVC.

Login to vote for this post.

Comments or Responses

Posted by: Ashuthinks on: 7/17/2012 | Points: 25
Something very new great article :)

Login to post response

Comment using Facebook(Author doesn't get notification)