Archiving a large table (SQL Server 2008)

I have a very large table, populated with approximately 100 million records per quarter.

I currently move data manually from the existing table to another database using a script, to keep the backup size down and to avoid putting load on the production database while queries are running.

Is there a better way, for example some kind of scheduled job that moves data from the production database to another database and then efficiently deletes the moved records from the original database every day or week?

Please note that my log file is growing rapidly because of the heavy INSERT volume into this table, and when I move the data to the archive database, the DELETEs are fully logged as well.

thanks

+4
4 answers

Let me repeat the requirements:

  • reduce backup size
  • reduce the number of records in the database by archiving
  • archive data without excessive logging

To reduce the size of the backup, you need to move the data to another database.

As for logging, you need to look at the rules for minimal logging and make sure you satisfy them. Ensure that the database you are inserting into uses the SIMPLE or BULK_LOGGED recovery model.

For the archive insert you want to disable the nonclustered indexes (and rebuild them after the insert completes), use trace flag 610 if there is a clustered index, and take a table lock on the destination table. There are many more rules in the linked article that you should check, but these are the basics.
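A minimal sketch of that load pattern, with hypothetical names: dbo.BigTable is the source, Archive.dbo.BigTable is the destination, and IX_BigTable_CreatedDate stands in for its nonclustered indexes.

ALTER INDEX IX_BigTable_CreatedDate ON Archive.dbo.BigTable DISABLE;   -- disable nonclustered indexes before the load

DBCC TRACEON (610);   -- allows minimal logging into a non-empty clustered index when the other rules are met

INSERT INTO Archive.dbo.BigTable WITH (TABLOCK)   -- the table lock is one of the minimal-logging requirements
SELECT *
FROM dbo.BigTable
WHERE CreatedDate < '20130101';   -- archive cut-off, adjust to your quarter

DBCC TRACEOFF (610);

ALTER INDEX IX_BigTable_CreatedDate ON Archive.dbo.BigTable REBUILD;   -- rebuild after the insert completes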

There is no minimal logging for deletes, but you can limit log file growth by deleting in chunks using the TOP clause. The basic idea (switch to the SIMPLE recovery model for the duration of the delete to limit log file growth):

SELECT NULL;                                   -- sets @@ROWCOUNT to 1 so the loop body runs at least once
WHILE @@ROWCOUNT > 0
    DELETE TOP (50000) FROM dbo.BigTable       -- the table being archived (name is a placeholder)
    WHERE CreatedDate < '20130101';            -- predicate matching only rows already copied to the archive

Adjust the TOP number to control how much is written to the log per delete. You will also want to make sure that the predicate is correct so that you delete only what you intend to. The loop deletes 50,000 rows at a time and repeats as long as rows are still being affected, stopping once no rows match.
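If you do flip the recovery model for the delete window, a rough outline follows, assuming the database is named ProdDb and normally runs in FULL recovery; switching back requires a full or differential backup to restart the log backup chain.

ALTER DATABASE ProdDb SET RECOVERY SIMPLE;

-- run the chunked DELETE loop shown above

ALTER DATABASE ProdDb SET RECOVERY FULL;
BACKUP DATABASE ProdDb TO DISK = N'D:\Backups\ProdDb_after_archive.bak';   -- restart the log backup chain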

If you really need minimal logging for everything, you can partition the source table by week, create a clone of the source table (with the same partition function and an identical index structure), switch each old partition from the source table into the cloned table, insert from the cloned table into the archive table, and then truncate the cloned table. The advantage is that you truncate rather than delete. The disadvantage is that it is much harder to set up, maintain and query (you get one heap or b-tree per partition, so unless all queries can use partition elimination, a clustered index or table scan has to scan several b-trees/heaps instead of one).
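A rough sketch of that switch-and-truncate pattern, assuming a weekly partition function on CreatedDate and a staging clone dbo.BigTable_Staging that is empty, sits on the same filegroup as the partition being switched, and has an identical structure:

-- Metadata-only operation: moves the whole partition out of the source table almost instantly
ALTER TABLE dbo.BigTable SWITCH PARTITION 1 TO dbo.BigTable_Staging;

-- Copy the switched-out rows to the archive database (this insert can itself be minimally logged)
INSERT INTO Archive.dbo.BigTable WITH (TABLOCK)
SELECT * FROM dbo.BigTable_Staging;

-- TRUNCATE only logs the page deallocations, unlike DELETE
TRUNCATE TABLE dbo.BigTable_Staging;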

+6

Have you ever thought about using SSIS for this? I use SSIS for archiving and backups. You can also put the same script into a T-SQL task and schedule it with SQL Server Agent, or simply pass the script to an Agent job directly.
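A rough sketch of the Agent route, using the msdb job procedures; the job name, database name, and the usp_ArchiveBigTable procedure wrapping the insert/delete logic are all made up for illustration:

USE msdb;

EXEC dbo.sp_add_job @job_name = N'Archive BigTable';

EXEC dbo.sp_add_jobstep
    @job_name = N'Archive BigTable',
    @step_name = N'Move and purge old rows',
    @subsystem = N'TSQL',
    @database_name = N'ProdDb',
    @command = N'EXEC dbo.usp_ArchiveBigTable;';   -- hypothetical wrapper procedure

EXEC dbo.sp_add_jobschedule
    @job_name = N'Archive BigTable',
    @name = N'Weekly, Sunday 02:00',
    @freq_type = 8,                 -- weekly
    @freq_interval = 1,             -- Sunday
    @freq_recurrence_factor = 1,
    @active_start_time = 020000;    -- 02:00:00

EXEC dbo.sp_add_jobserver @job_name = N'Archive BigTable';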

+3

You can use table partitioning instead of moving data:

http://technet.microsoft.com/en-us/library/dd578580(v=sql.100).aspx

http://msdn.microsoft.com/en-us/library/ms345146(v=sql.90).aspx

To move data periodically, you can use SQL Server's job scheduling feature to run the SSIS package.

Data Transformation Services (DTS) might also be an option.

+2

Partitioning, definitely. It eliminates the need for a separate database. There is a good example here.

If you don't want to change your architecture, I suggest using SSIS to move the data rather than scripts.

+2

Source: https://habr.com/ru/post/1441588/

