DELETE Optimization on SQL Server

DELETE statements on SQL Server are sometimes slow, and I often need to optimize them to reduce the time they take. I have looked around for advice on how to do this and found many suggestions. I would like to know your favorite and most effective methods for taming this beast, and how and why they work.

What I have so far:

  • make sure foreign keys have indexes

  • make sure the WHERE conditions are indexed

  • use WITH (ROWLOCK)

  • drop unused indexes, delete, then rebuild the indexes

now, your move.

+36
sql sql-server
Jun 05 '09 at 11:31
15 answers

The following article on fast, ordered delete operations may interest you.

Performing fast SQL Server delete operations

The solution focuses on using a view to simplify the execution plan produced for a batched delete operation. This is achieved by referencing the table once rather than twice, which in turn reduces the amount of I/O required.
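A rough sketch of the idea, with a made-up dbo.BigTable, archived_date column and batch size (none of these names come from the linked article):

 -- The view defines one ordered batch of rows to remove.
 CREATE VIEW dbo.vBigTableDeleteBatch
 AS
     SELECT TOP (2000) *
     FROM dbo.BigTable
     WHERE archived_date < '2009-01-01'
     ORDER BY archived_date;
 GO

 -- Deleting through the view removes that batch while touching the
 -- base table only once per batch.
 DELETE FROM dbo.vBigTableDeleteBatch;

Repeat the DELETE in a loop (stopping when @@ROWCOUNT = 0) until no matching rows remain.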

+21
Jun 05 '09 at 13:14

I have much more experience with Oracle, but most likely this also applies to SQL Server:

  • when deleting a large number of rows, take out a table lock so the database does not have to take many row locks.
  • if the table you are deleting from is referenced by other tables, make sure those other tables have indexes on their foreign key columns (otherwise the database will do a full table scan of each referencing table for every deleted row, to verify that the delete does not violate the foreign key constraint); see the sketch below.
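For illustration, a small sketch with hypothetical parent/child tables (the names are not from this answer): without such an index, every row deleted from the parent forces a scan of the child table.

 -- Index the referencing (child) column so the FK checks done during
 -- the delete become index seeks instead of full scans of dbo.Orders.
 CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
     ON dbo.Orders (CustomerID);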
+12
Jun 05 '09 at 11:37

I wonder if it is time for garbage-collector semantics? You mark the row for deletion, and the server actually deletes it later during a sweep. You would not want this for every delete, because sometimes a row has to go right now, but it would be convenient at times.

+9
Jun 05 '09 at 14:02

Honestly, deleting a million rows from a table scales just as badly as inserting or updating a million rows. It is the size of the rowset that is the problem, and there is not much you can do about that.

My suggestions:

  • Make sure that the table has a primary key and a clustered index (this is important for all operations).
  • Ensure that the clustered index is chosen so that minimal page reorganization occurs when a large block of rows is deleted.
  • Make sure your selection criteria are SARGable.
  • Ensure that all foreign key constraints are currently trusted (see the check below).
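As a rough illustration of the last point (the table and constraint names are made up), untrusted constraints can be found and re-validated like this:

 -- Foreign keys created or re-enabled WITH NOCHECK are "not trusted",
 -- so the optimizer cannot rely on them.
 SELECT name, OBJECT_NAME(parent_object_id) AS referencing_table
 FROM sys.foreign_keys
 WHERE is_not_trusted = 1;

 -- Re-validate one of them so it becomes trusted again.
 ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customers;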
+5
Jun 05 '09 at 12:02

Summary of answers as of 2014-11-05

This answer has been flagged as community wiki, since this is an ever-evolving topic with many nuances but relatively few possible answers.

The first issue is to ask yourself which scenario you are optimizing for. This is usually either performance with a single user on the db, or scaling with many users on the db. Sometimes the answers are exact opposites.

To optimize for a single user

  • Hint TABLOCK (example below)
  • Drop indexes not needed for the delete, then rebuild them afterwards
  • Batch using something like SET ROWCOUNT 20000 (or whatever fits your log space) and loop (perhaps with a WAITFOR DELAY) until you have gotten rid of it all (@@ROWCOUNT = 0)
  • If deleting a large % of the table, simply copy the rows to keep into a new table and drop the old one
  • Partition the rows to delete, then drop the partition. [More details...]

To optimize for multiple users

  • Hint row locks with ROWLOCK (example below)
  • Use a clustered index
  • Design a clustered index to minimize page reorganization if large blocks are removed.
  • Update an is_deleted column, then do the actual deletion later during a maintenance window

For general optimization

  • Make sure that FKs have indexes on their source tables
  • Make sure the WHERE clause columns have indexes
  • Define the rows to delete in the WHERE clause with a view or derived table, rather than referencing the table directly. [More details...]
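As a rough illustration of the TABLOCK and ROWLOCK hints mentioned above (table, column and dates are made up):

 -- Single user / maintenance window: lock the whole table up front.
 DELETE FROM dbo.AuditLog WITH (TABLOCK)
 WHERE LoggedAt < '2009-01-01';

 -- Many concurrent users: keep locks at row granularity.
 DELETE FROM dbo.AuditLog WITH (ROWLOCK)
 WHERE LoggedAt < '2009-01-01';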
+5
Dec 11 '14 at 19:04

(if indexes are not used, why do they even exist?)

One option I've used in the past is to do the work in batches. A rough way would be to use SET ROWCOUNT 20000 (or whatever) and loop (possibly with a WAITFOR DELAY) until you get rid of it all (@@ROWCOUNT = 0). See the sketch below.

This can help reduce the impact on other systems.
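A minimal sketch of that loop, assuming a hypothetical dbo.AuditLog table; on newer versions, DELETE TOP (n) is generally preferred over SET ROWCOUNT for DML:

 WHILE 1 = 1
 BEGIN
     -- Delete one batch; the TOP clause caps transaction and log size.
     DELETE TOP (20000) FROM dbo.AuditLog
     WHERE LoggedAt < '2009-01-01';

     IF @@ROWCOUNT = 0 BREAK;      -- nothing left to delete

     WAITFOR DELAY '00:00:05';     -- let other work get through
 END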

+4
Jun 05 '09 at 11:37

The problem is that you have not defined your conditions well enough. That is, what exactly are you optimizing for?

For example, is the system down for nightly maintenance, with no users in the system? And are you deleting a large percentage of the database?

If you are offline and deleting a large %, it may make sense to simply build a new table with the data to keep, drop the old table, and rename the new one. If you are deleting a small %, you probably want to batch things, in as large batches as your log space allows. It depends entirely on your database, but dropping indexes for the duration of the rebuild can hurt or help, if it is even possible given that the system is "offline". A sketch of the copy-and-swap approach follows.
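A rough sketch of that copy-and-swap approach with made-up names (indexes, constraints and permissions have to be recreated on the new table before it takes over):

 -- Copy only the rows to keep into a new table.
 SELECT *
 INTO dbo.Orders_keep
 FROM dbo.Orders
 WHERE OrderDate >= '2009-01-01';

 -- Drop the original and let the new table take its place.
 DROP TABLE dbo.Orders;
 EXEC sp_rename 'dbo.Orders_keep', 'Orders';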

If you are online, what is the likelihood that your deletes will conflict with user activity (and is that activity predominantly reads, updates, or what)? Are you trying to optimize the user experience or the speed of your own query? If you are deleting from a table that is frequently updated by other users, you still need to batch, but with smaller batch sizes. Even if you do something like a table lock to enforce isolation, that does not do much good if your delete statement takes an hour.

When you define your conditions better, you can choose one of the other answers here. I like the link in Rob Sanders' post for batching things.

+4
Jun 05 '09 at 14:11

If you have many foreign key tables, start at the bottom of the chain and work up. The final delete will go faster and lock fewer things if there are no child records to cascade-delete (which I would NOT enable if I had a large number of child tables, as that would kill performance).

Delete in batches.

If you have foreign key tables that are no longer used (you would be surprised how often production databases end up with old tables that nobody will get rid of), get rid of them, or at least break the FK/PK connection. There is no point in checking a table for records if it is not in use.

Do not delete at all: mark records as deleted, then exclude the marked records from all queries. This is best set up at database design time. Many people use it because it is also the fastest way to restore records that were accidentally deleted. But it is a lot of work to retrofit onto an existing system. See the sketch below.
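A rough sketch of that soft-delete pattern with made-up names (on an existing system every query, or a covering view, would also need to apply the filter):

 -- One-time schema change.
 ALTER TABLE dbo.Orders ADD is_deleted bit NOT NULL DEFAULT 0;

 -- "Deleting" becomes a cheap single-row update.
 UPDATE dbo.Orders SET is_deleted = 1 WHERE OrderID = 42;

 -- Queries (or a view) exclude the marked rows.
 SELECT OrderID, OrderDate
 FROM dbo.Orders
 WHERE is_deleted = 0;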

+3
Jun 05 '09 at 13:08

I will add one more:

Verify that your transaction isolation level and database options are set appropriately. If your SQL Server is configured not to use row versioning, or your other queries use an isolation level that waits on the rows being deleted, you may be setting yourself up for very poor performance while the operation runs.
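For instance, a minimal sketch of turning on row versioning for reads (the database name is made up; the statement needs exclusive access to the database to complete):

 -- Readers see the last committed version of a row instead of
 -- blocking behind the rows the delete has locked.
 ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;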

+2
Jun 05 '09 at 12:13

On very large tables where you have a very specific set of criteria for the delete, you can also partition the table, switch out the partition, and then process the deletions separately.

The SQLCAT team has been using this technique on really large volumes of data. I found some references to it here, but I will try to find something more definitive.
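A rough sketch of the switch-out step, with made-up names (the real setup needs a partition function and scheme, aligned indexes, and a staging table with an identical structure on the same filegroup):

 -- Move the partition holding the rows to delete into a staging table
 -- (a metadata-only operation), then get rid of it cheaply.
 ALTER TABLE dbo.Sales SWITCH PARTITION 3 TO dbo.Sales_stage;
 TRUNCATE TABLE dbo.Sales_stage;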

+2
Jun 05 '09 at 12:26

I think the big trap with deletes that kills performance is that after every deleted row, SQL updates all the related indexes for every column in that row. What about dropping all the indexes before the bulk delete? See the sketch below.
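A rough sketch of that drop-and-rebuild approach with made-up names (only worthwhile when the delete is large relative to the table, since the rebuilds themselves are not free):

 -- Drop nonclustered indexes that would only slow the delete down.
 DROP INDEX IX_Orders_CustomerID ON dbo.Orders;

 -- Run the bulk delete with fewer indexes to maintain per row.
 DELETE FROM dbo.Orders WHERE OrderDate < '2005-01-01';

 -- Recreate the index afterwards.
 CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
     ON dbo.Orders (CustomerID);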

+2
Jun 17

There are deletes, and then there are deletes. If you are aging out data as part of a trimming job, you can hopefully delete contiguous blocks of rows by the clustered key. If you have to age out data from a huge table where the rows are not contiguous, it is very, very painful.

+1
Jun 05 '09 at 13:09

If it is true that UPDATEs are faster than DELETEs, you could add a DELETED status column and filter on it in your SELECTs. Then run a proc at night that does the actual deletes.

+1
Jun 05 '09 at 13:32

Do you have foreign keys with referential integrity enabled? Do you have triggers enabled?

+1
Jun 05 '09 at 14:40

Simplify any use of functions in your WHERE clause! Example:

 DELETE FROM Claims WHERE dbo.YearMonthGet(DataFileYearMonth) = dbo.YearMonthGet(@DataFileYearMonth) 

This form of the WHERE clause took 8 minutes to delete 125,837 records.

The YearMonthGet function built a date from the year and month of the input date, with the day set to 1. This was so that we deleted records by year and month, but not by day of the month.

I rewrote the WHERE clause:

 WHERE YEAR(DataFileYearMonth) = YEAR(@DataFileYearMonth) AND MONTH(DataFileYearMonth) = MONTH(@DataFileYearMonth) 

Result: it takes only about 38-44 seconds to delete those 125,837 records!

-1
Mar 27 '16 at 19:25


