By Lynn Pettis, 2011/02/18 (first published: 2009/09/15)
Introduction
One of the questions frequently asked on forums is how to delete a large number of records from a table in SQL Server. The question can come from several directions: sometimes it is the result of excessive growth of the database's transaction log, sometimes the table must remain available to users during the delete process, and sometimes the user is simply looking for an efficient way to delete records on a regular basis. With this in mind, I'd like to present one method of deleting records from a table that allows the table to remain available to users and helps keep the transaction log from growing excessively. I will show how this can be done in SQL Server 2000 and SQL Server 2005/2008.
The Setup
The first thing we need to do, however, is set up the table from which we will be deleting a large number of records. Most requests for help in this area come from individuals asking how to delete several million rows of data from a multi-million row table. For this code we will create a table with 20 million rows of data, and we will delete approximately 10%, or around 2 million, of those rows. The test data generated below will have transaction dates (TDate) from January 1, 2000 through December 31, 2009. We will purge all data from the year 2000.
USE [SandBox] -- This is my test bed database
GO

-- Create the test table
create table dbo.LAPTest (
    AccountID int,
    Amount    money,
    TDate     datetime
)

create clustered index IX_LAPTest_TDate on dbo.LAPTest (
    TDate asc
)
GO

--===== Build the table 100000 rows at a time to "mix things up"
DECLARE @Counter INT
SET @Counter = 0

WHILE @Counter < 20000000
BEGIN
    --===== Add 100000 rows to the test table
    INSERT INTO dbo.LAPTest (AccountID, Amount, TDate)
    SELECT TOP 100000
           AccountID = ABS(CHECKSUM(NEWID())) % 50000 + 1,
           Amount    = CAST(CHECKSUM(NEWID()) % 10000 / 100.0 AS MONEY),
           TDate     = CAST(RAND(CHECKSUM(NEWID())) * 3653.0 + 36524.0 AS DATETIME)
      FROM master.dbo.syscolumns t1
     CROSS JOIN master.dbo.syscolumns t2

    --===== Increment the counter
    SET @Counter = @Counter + 100000
END
GO
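Before running the purge, it can be helpful to confirm how many rows are actually in scope. This quick check is my addition, not part of the original article; it uses the same year-boundary arithmetic the delete loops below will use, and the count will only be roughly 2 million since TDate is randomly generated:

declare @year smallint
set @year = 2000

-- Count the Year 2000 rows the purge will remove
select count(*) as RowsToPurge
  from dbo.LAPTest
 where TDate >= dateadd(yyyy, (@year - 1900), 0)
   and TDate <  dateadd(yyyy, (@year - 1900) + 1, 0)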
SQL Server 2000
The first code snippet I will write is for deleting the records from the table in SQL Server 2000. Although this code will also work in SQL Server 2005, it will not work in future versions of SQL Server. For that reason, I will later rewrite the code I develop for SQL Server 2000 to use the new capabilities available in SQL Server 2005. I have made every effort to make this code generic, and it should work with minimal changes. Since it may be used in databases set to the SIMPLE, BULK_LOGGED, or FULL recovery model, I have included conditional logic that performs a transaction log backup only if the database is not using the SIMPLE recovery model.
declare @recoverymodel varchar(12),
        @dbname        sysname,
        @backupdevice  sysname,
        @batchsize     int,
        @loopcnt       int,
        @year          smallint,
        @backupfile    nvarchar(512)

select @recoverymodel = cast(DATABASEPROPERTYEX(db_name(), 'Recovery') as varchar(12)),
       @dbname        = db_name(),
       @backupdevice  = 'D:\Backups\',
       @batchsize     = 10000,
       @loopcnt       = 0,
       @year          = 2000

set rowcount @batchsize -- used in SQL Server 2000 to limit the rows affected

while @batchsize <> 0
begin
    delete from dbo.LAPTest
     where TDate >= dateadd(yyyy, (@year - 1900), 0)
       and TDate <  dateadd(yyyy, (@year - 1900) + 1, 0)

    set @batchsize = @@rowcount

    if @batchsize > 0 and @recoverymodel <> 'SIMPLE'
    begin
        set @loopcnt = @loopcnt + 1
        set @backupfile = @backupdevice + @dbname + N'_'
                        + cast(@loopcnt as nvarchar(8)) + N'.trn'

        backup log @dbname to disk = @backupfile
            with description = N'T-Log Backup during purge'
    end
end

set rowcount 0 -- reset rows affected after processing, SQL Server 2000
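If the date arithmetic in the WHERE clause looks cryptic: date 0 in SQL Server is January 1, 1900, so adding (@year - 1900) years to it yields midnight on January 1 of the purge year, giving clean range boundaries without string conversions. A quick illustration (my addition):

-- DATEADD from the zero date (1900-01-01) produces exact year boundaries
select dateadd(yyyy, 2000 - 1900, 0)       as YearStart,    -- 2000-01-01 00:00:00.000
       dateadd(yyyy, (2000 - 1900) + 1, 0) as NextYearStart -- 2001-01-01 00:00:00.000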
SQL Server 2005
Now we will take the code above and modify it to use the new features introduced in SQL Server 2005. This also allows the code to be used in SQL Server 2008 and future versions. As you look at the code you will see very little difference between the two versions: the SET ROWCOUNT statements are gone, and a TOP clause has been added to the DELETE statement.
declare @recoverymodel varchar(12),
        @dbname        sysname,
        @backupdevice  sysname,
        @batchsize     int,
        @loopcnt       int,
        @year          smallint,
        @backupfile    nvarchar(512)

select @recoverymodel = cast(DATABASEPROPERTYEX(db_name(), 'Recovery') as varchar(12)),
       @dbname        = db_name(),
       @backupdevice  = 'D:\Backups\',
       @batchsize     = 10000,
       @loopcnt       = 0,
       @year          = 2000

while @batchsize <> 0
begin
    delete top (@batchsize) from dbo.LAPTest
     where TDate >= dateadd(yyyy, (@year - 1900), 0)
       and TDate <  dateadd(yyyy, (@year - 1900) + 1, 0)

    set @batchsize = @@rowcount

    if @batchsize > 0 and @recoverymodel <> 'SIMPLE'
    begin
        set @loopcnt = @loopcnt + 1
        set @backupfile = @backupdevice + @dbname + N'_'
                        + cast(@loopcnt as nvarchar(8)) + N'.trn'

        backup log @dbname to disk = @backupfile
            with description = N'T-Log Backup during purge'
    end
end
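After either version completes, a quick sanity check (again my addition, run against the test table from the setup) confirms that the Year 2000 rows are gone and that roughly 18 million of the original 20 million rows remain:

select sum(case when TDate >= '20000101'
                 and TDate <  '20010101' then 1 else 0 end) as Year2000Rows, -- expect 0
       count(*) as RemainingRows -- roughly 18 million
  from dbo.LAPTest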
Variations
Now that you have the basic generic code to delete large volumes of data from a table, you may decide that you don't want 200+ transaction log backup files should you ever have to restore to a point in time. If your database is not using the SIMPLE recovery model, you could modify the condition in the IF statement that decides whether the transaction log backup should run. One change would be to run the transaction log backup only if (@loopcnt modulo (some value) = 0 and @batchsize > 0) OR (@batchsize = 0). In addition to this logic, you could also add a differential backup after the purge completes. A sketch of this variation follows.
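Here is a minimal sketch of that variation. The @batchcnt variable and the interval of 20 are my additions for illustration (declare and initialize @batchcnt alongside the other variables); @loopcnt still numbers the backup files as in the original code. This replaces the IF block inside the WHILE loop, after @batchsize is set from @@rowcount:

-- Count every pass through the loop
set @batchcnt = @batchcnt + 1

-- Back up the log every 20th batch, plus once after the final, empty pass;
-- the interval of 20 is arbitrary, so tune it to your log-growth tolerance
if ((@batchcnt % 20 = 0 and @batchsize > 0) or @batchsize = 0)
   and @recoverymodel <> 'SIMPLE'
begin
    set @loopcnt = @loopcnt + 1
    set @backupfile = @backupdevice + @dbname + N'_'
                    + cast(@loopcnt as nvarchar(8)) + N'.trn'

    backup log @dbname to disk = @backupfile
        with description = N'T-Log Backup during purge'
end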
Conclusion
With these sample code snippets, I hope you can see how easily you can batch your deletes, allowing the table to remain relatively open for continued use while keeping your transaction log files manageable when your database is using the BULK_LOGGED or FULL recovery model.
Source: http://www.sqlservercentral.com/articles/T-SQL/67898/