DELETE Optimization on SQL Server

DELETE statements on SQL Server are sometimes slow, and I often need to optimize them to reduce the time they take. I have looked around for advice on how to do this and found many suggestions. I would like to know your favorite and most effective methods for taming this beast, and how and why they work.

What I have so far:

  • make sure foreign keys have indexes

  • make sure the WHERE conditions are indexed

  • use WITH (ROWLOCK)

  • drop unused indexes, delete, then rebuild the indexes

now, your move.

+36
sql sql-server
Jun 05 '09 at 11:31
15 answers

The following article on fast, ordered delete operations may interest you.

Performing fast SQL Server delete operations

The solution focuses on using a view to simplify the execution plan produced for a batched delete operation. This is achieved by referencing the table once rather than twice, which in turn reduces the amount of I/O required.
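A rough sketch of the idea, with a made-up dbo.BigTable, archived_date column and batch size (none of these names come from the linked article):

 -- The view defines one ordered batch of rows to remove.
 CREATE VIEW dbo.vBigTableDeleteBatch
 AS
     SELECT TOP (2000) *
     FROM dbo.BigTable
     WHERE archived_date < '2009-01-01'
     ORDER BY archived_date;
 GO

 -- Deleting through the view removes that batch while touching the
 -- base table only once per batch.
 DELETE FROM dbo.vBigTableDeleteBatch;

Repeat the DELETE in a loop (stopping when @@ROWCOUNT = 0) until no matching rows remain.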

+21
Jun 05 '09 at 13:14

I have much more experience with Oracle, but most likely this also applies to SQL Server:

  • when deleting a large number of rows, take out a table lock so the database does not have to take many row locks.
  • if the table you are deleting from is referenced by other tables, make sure those other tables have indexes on their foreign key columns (otherwise the database will do a full table scan of each referencing table for every deleted row, to verify that the delete does not violate the foreign key constraint); see the sketch below.
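For illustration, a small sketch with hypothetical parent/child tables (the names are not from this answer): without such an index, every row deleted from the parent forces a scan of the child table.

 -- Index the referencing (child) column so the FK checks done during
 -- the delete become index seeks instead of full scans of dbo.Orders.
 CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
     ON dbo.Orders (CustomerID);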
+12
Jun 05 '09 at 11:37

I wonder if it is time for garbage-collector semantics? You mark the row for deletion, and the server actually deletes it later during a sweep. You would not want this for every delete, because sometimes a row has to go right now, but it would be convenient at times.

+9
Jun 05 '09 at 14:02

Honestly, deleting a million rows from a table scales just as badly as inserting or updating a million rows. It is the size of the rowset that is the problem, and there is not much you can do about that.

My suggestions:

  • Make sure that the table has a primary key and a clustered index (this is important for all operations).
  • Ensure that the clustered index is chosen so that minimal page reorganization occurs when a large block of rows is deleted.
  • Make sure your selection criteria are SARGable.
  • Ensure that all foreign key constraints are currently trusted (see the check below).
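As a rough illustration of the last point (the table and constraint names are made up), untrusted constraints can be found and re-validated like this:

 -- Foreign keys created or re-enabled WITH NOCHECK are "not trusted",
 -- so the optimizer cannot rely on them.
 SELECT name, OBJECT_NAME(parent_object_id) AS referencing_table
 FROM sys.foreign_keys
 WHERE is_not_trusted = 1;

 -- Re-validate one of them so it becomes trusted again.
 ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customers;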
+5
Jun 05 '09 at 12:02

Summary of answers as of 2014-11-05

This answer has been flagged as community wiki, since this is an ever-evolving topic with many nuances but relatively few possible answers.

The first issue is to ask yourself which scenario you are optimizing for. This is usually either performance with a single user on the db, or scaling with many users on the db. Sometimes the answers are exact opposites.

To optimize for a single user

  • Hint TABLOCK (example below)
  • Drop indexes not needed for the delete, then rebuild them afterwards
  • Batch using something like SET ROWCOUNT 20000 (or whatever fits your log space) and loop (perhaps with a WAITFOR DELAY) until you have gotten rid of it all (@@ROWCOUNT = 0)
  • If deleting a large % of the table, simply copy the rows to keep into a new table and drop the old one
  • Partition the rows to delete, then drop the partition. [More details...]

To optimize for multiple users

  • Hint row locks with ROWLOCK (example below)
  • Use a clustered index
  • Design a clustered index to minimize page reorganization if large blocks are removed.
  • Update an is_deleted column, then do the actual deletion later during a maintenance window

For general optimization

  • Make sure that FKs have indexes on their source tables
  • Make sure the WHERE clause columns have indexes
  • Define the rows to delete in the WHERE clause with a view or derived table, rather than referencing the table directly. [More details...]
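As a rough illustration of the TABLOCK and ROWLOCK hints mentioned above (table, column and dates are made up):

 -- Single user / maintenance window: lock the whole table up front.
 DELETE FROM dbo.AuditLog WITH (TABLOCK)
 WHERE LoggedAt < '2009-01-01';

 -- Many concurrent users: keep locks at row granularity.
 DELETE FROM dbo.AuditLog WITH (ROWLOCK)
 WHERE LoggedAt < '2009-01-01';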
+5
Dec 11 '14 at 19:04

(if indexes are not used, why do they even exist?)

One option I've used in the past is to do the work in batches. A rough way would be to use SET ROWCOUNT 20000 (or whatever) and loop (possibly with a WAITFOR DELAY) until you get rid of it all (@@ROWCOUNT = 0). See the sketch below.

This can help reduce the impact on other systems.
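A minimal sketch of that loop, assuming a hypothetical dbo.AuditLog table; on newer versions, DELETE TOP (n) is generally preferred over SET ROWCOUNT for DML:

 WHILE 1 = 1
 BEGIN
     -- Delete one batch; the TOP clause caps transaction and log size.
     DELETE TOP (20000) FROM dbo.AuditLog
     WHERE LoggedAt < '2009-01-01';

     IF @@ROWCOUNT = 0 BREAK;      -- nothing left to delete

     WAITFOR DELAY '00:00:05';     -- let other work get through
 END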

+4
Jun 05 '09 at 11:37

The problem is that you have not defined your conditions well enough. That is, what exactly are you optimizing for?

For example, is the system down for nightly maintenance, with no users in the system? And are you deleting a large percentage of the database?

If you are offline and deleting a large %, it may make sense to simply build a new table with the data to keep, drop the old table, and rename the new one. If you are deleting a small %, you probably want to batch things, in as large batches as your log space allows. It depends entirely on your database, but dropping indexes for the duration of the rebuild can hurt or help, if it is even possible given that the system is "offline". A sketch of the copy-and-swap approach follows.
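A rough sketch of that copy-and-swap approach with made-up names (indexes, constraints and permissions have to be recreated on the new table before it takes over):

 -- Copy only the rows to keep into a new table.
 SELECT *
 INTO dbo.Orders_keep
 FROM dbo.Orders
 WHERE OrderDate >= '2009-01-01';

 -- Drop the original and let the new table take its place.
 DROP TABLE dbo.Orders;
 EXEC sp_rename 'dbo.Orders_keep', 'Orders';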

If you are online, what is the likelihood that your deletes will conflict with user activity (and is that activity predominantly reads, updates, or what)? Are you trying to optimize the user experience or the speed of your own query? If you are deleting from a table that is frequently updated by other users, you still need to batch, but with smaller batch sizes. Even if you do something like a table lock to enforce isolation, that does not do much good if your delete statement takes an hour.

When you define your conditions better, you can choose one of the other answers here. I like the link in Rob Sanders' post for batching things.

+4
Jun 05 '09 at 14:11

If you have many foreign key tables, start at the bottom of the chain and work up. The final delete will go faster and lock fewer things if there are no child records to cascade-delete (which I would NOT enable if I had a large number of child tables, as that would kill performance).

Delete in batches.

If you have foreign key tables that are no longer used (you would be surprised how often production databases end up with old tables that nobody will get rid of), get rid of them, or at least break the FK/PK connection. There is no point in checking a table for records if it is not in use.

Do not delete at all: mark records as deleted, then exclude the marked records from all queries. This is best set up at database design time. Many people use it because it is also the fastest way to restore records that were accidentally deleted. But it is a lot of work to retrofit onto an existing system. See the sketch below.
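A rough sketch of that soft-delete pattern with made-up names (on an existing system every query, or a covering view, would also need to apply the filter):

 -- One-time schema change.
 ALTER TABLE dbo.Orders ADD is_deleted bit NOT NULL DEFAULT 0;

 -- "Deleting" becomes a cheap single-row update.
 UPDATE dbo.Orders SET is_deleted = 1 WHERE OrderID = 42;

 -- Queries (or a view) exclude the marked rows.
 SELECT OrderID, OrderDate
 FROM dbo.Orders
 WHERE is_deleted = 0;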

+3
Jun 05 '09 at 13:08

I will add one more:

Verify that your transaction isolation level and database options are set appropriately. If your SQL Server is configured not to use row versioning, or your other queries use an isolation level that waits on the rows being deleted, you may be setting yourself up for very poor performance while the operation runs.
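For instance, a minimal sketch of turning on row versioning for reads (the database name is made up; the statement needs exclusive access to the database to complete):

 -- Readers see the last committed version of a row instead of
 -- blocking behind the rows the delete has locked.
 ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;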

+2
Jun 05 '09 at 12:13

On very large tables where you have a very specific set of criteria for the delete, you can also partition the table, switch out the partition, and then process the deletions separately.

The SQLCAT team has been using this technique on really large volumes of data. I found some references to it here, but I will try to find something more definitive.
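A rough sketch of the switch-out step, with made-up names (the real setup needs a partition function and scheme, aligned indexes, and a staging table with an identical structure on the same filegroup):

 -- Move the partition holding the rows to delete into a staging table
 -- (a metadata-only operation), then get rid of it cheaply.
 ALTER TABLE dbo.Sales SWITCH PARTITION 3 TO dbo.Sales_stage;
 TRUNCATE TABLE dbo.Sales_stage;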

+2
Jun 05 '09 at 12:26

I think the big trap with deletes that kills performance is that after every deleted row, SQL updates all the related indexes for every column in that row. What about dropping all the indexes before the bulk delete? See the sketch below.
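A rough sketch of that drop-and-rebuild approach with made-up names (only worthwhile when the delete is large relative to the table, since the rebuilds themselves are not free):

 -- Drop nonclustered indexes that would only slow the delete down.
 DROP INDEX IX_Orders_CustomerID ON dbo.Orders;

 -- Run the bulk delete with fewer indexes to maintain per row.
 DELETE FROM dbo.Orders WHERE OrderDate < '2005-01-01';

 -- Recreate the index afterwards.
 CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
     ON dbo.Orders (CustomerID);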

+2
Jun 17

There are deletes, and then there are deletes. If you are aging out data as part of a trimming job, you can hopefully delete contiguous blocks of rows by the clustered key. If you have to age out data from a huge table where the rows are not contiguous, it is very, very painful.

+1
Jun 05 '09 at 13:09

If it is true that UPDATEs are faster than DELETEs, you could add a DELETED status column and filter on it in your SELECTs. Then run a proc at night that does the actual deletes.

+1
Jun 05 '09 at 13:32

Do you have foreign keys with referential integrity enabled? Do you have triggers enabled?

+1
Jun 05 '09 at 14:40

Simplify any use of functions in your WHERE clause! Example:

 DELETE FROM Claims WHERE dbo.YearMonthGet(DataFileYearMonth) = dbo.YearMonthGet(@DataFileYearMonth) 

This form of the WHERE clause took 8 minutes to delete 125,837 records.

The YearMonthGet function built a date from the year and month of the input date, with the day set to 1. This was so that we deleted records by year and month, but not by day of the month.

I rewrote the WHERE clause:

 WHERE YEAR(DataFileYearMonth) = YEAR(@DataFileYearMonth) AND MONTH(DataFileYearMonth) = MONTH(@DataFileYearMonth) 

Result: it takes only about 38-44 seconds to delete those 125,837 records!

-1
Mar 27 '16 at 19:25


