Running updates on a large, heavily used table

I have a large table (~170 million rows, 2 nvarchar columns and 7 int columns) in SQL Server 2005 that is constantly being inserted into. Everything works fine performance-wise, but every once in a while I have to update a set of rows in the table, and that causes problems. It works great if I update a small set, but if I have to update a set of 40,000 records or so, it takes about 3 minutes and locks the table, which causes problems since the inserts start to fail.

If I just run a SELECT to return the data that needs to be updated, it returns the 40,000 records in about 2 seconds. It is only the updates that take forever. This is reflected in the execution plan, where the clustered index update accounts for 90% of the cost, and the index seek and the top operator used to fetch the rows account for 10%. The column being updated is not part of any index key, so it should not have to reorganize anything.

Does anyone have any ideas on how to speed this up? Right now my plan is to write a service that watches for when these updates need to happen, pulls back the records that need updating, and then loops through and updates them one at a time. That will satisfy my business needs, but it is one more module to support, and I would love to be able to fix this purely from the DBA side.

Thanks for any thoughts!

+6
4 answers

The bulk brute-force (and simplest) way is to have a basic service, as you mentioned. This has the advantage of being able to scale with server load and/or data load.

For example, if you have a set of updates that needs to happen as soon as possible, you could increase the batch size. Conversely, for less urgent updates, you could have the service slow down when each batch takes too long, to take some pressure off the database.

This type of "heartbeat" service is quite common in systems and can be very powerful in the right situations.
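
For illustration, a minimal T-SQL sketch of what such a heartbeat job might run on each tick; dbo.BigTable, its NeedsUpdate flag, the NewValue column, and the batch size are made-up placeholders, not the poster's real schema:

    -- Hypothetical names: dbo.BigTable with a NeedsUpdate flag and a NewValue column.
    DECLARE @BatchSize int, @Rows int;
    SET @BatchSize = 5000;             -- raise for urgent changes, lower under heavy insert load
    SET @Rows = 1;

    WHILE @Rows > 0
    BEGIN
        UPDATE TOP (@BatchSize) dbo.BigTable
        SET    SomeValue   = NewValue,
               NeedsUpdate = 0
        WHERE  NeedsUpdate = 1;

        SET @Rows = @@ROWCOUNT;

        IF @Rows > 0
            WAITFOR DELAY '00:00:02';  -- pause so concurrent inserts keep flowing
    END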

0

Actually, it can reorganize pages if you are updating the nvarchar columns. Depending on what the update does to those columns, it can cause a record to become larger than the space that was reserved for it before the update. (See the explanation of how nvarchar is stored at http://www.databasejournal.com/features/mssql/physical-database-design-consideration.html .)

Say a record contains a 20-character string stored in an nvarchar column: it takes 20 * 2 + 2 bytes of space (2 for the pointer). That is what gets written by the original insert into your table (according to the index structure). SQL Server only uses as much space as your nvarchar value really needs.

Now along comes an update that puts a 40-character string in there. Oops, the space reserved for the record in the leaf structure of your index is suddenly too small, so the record is moved to another physical location, with a pointer left in the old place indicating the actual location of the updated record.
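
A quick way to see the byte arithmetic (DATALENGTH reports the size of the value itself; the extra 2-byte overhead per value lives in the row structure and is not shown):

    -- nvarchar stores 2 bytes per character.
    SELECT DATALENGTH(N'abcdefghijklmnopqrst')                     AS TwentyChars,  -- 40 bytes
           DATALENGTH(N'abcdefghijklmnopqrstabcdefghijklmnopqrst') AS FortyChars;   -- 80 bytes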

All that relocation leaves your index fragmented, and because the whole physical structure has to change, you see a lot of index work happening behind the scenes. It is very likely that the locks escalate to an exclusive table lock.
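
If you want to confirm that this is what is happening, SQL Server 2005 exposes fragmentation through a DMV; dbo.BigTable is again a placeholder name:

    -- Check fragmentation of every index on the (hypothetical) dbo.BigTable.
    SELECT i.name,
           ips.index_type_desc,
           ips.avg_fragmentation_in_percent,
           ips.page_count
    FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.BigTable'),
                                          NULL, NULL, 'LIMITED') AS ips
    JOIN   sys.indexes AS i
           ON i.object_id = ips.object_id
          AND i.index_id  = ips.index_id;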

I am not sure what the best way to deal with this is. Personally, if possible, I would take an exclusive table lock, drop the index, do the updates, and then reindex. Since your updates fragment the index anyway, this may well be the fastest option. However, it requires a maintenance window.
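
A rough sketch of that maintenance-window approach, using ALTER INDEX DISABLE/REBUILD instead of a literal drop and re-create; the names are invented, and only a nonclustered index is touched, since disabling the clustered index would make the table unreadable:

    -- Maintenance window only: the index is unusable while disabled.
    ALTER INDEX IX_BigTable_SomeValue ON dbo.BigTable DISABLE;

    UPDATE dbo.BigTable                 -- the big 40k-row update; predicate is hypothetical
    SET    SomeValue = NewValue
    WHERE  NeedsUpdate = 1;

    ALTER INDEX IX_BigTable_SomeValue ON dbo.BigTable REBUILD;   -- reindex afterwards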

+1

You should break your update up into several smaller updates (say, 10,000 rows at a time, TEST!) rather than one big update of 40,000 rows.

This way you will avoid a table lock. SQL Server will only take out about 5,000 locks (page or row) before it starts trying to escalate to a table lock, and even that is not entirely predictable (memory pressure, etc.). Smaller updates done in this fashion will at least sidestep the concurrency problems you are experiencing.

You can drive the batched updates from a service or from a cursor.
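
For example, a rough sketch of the batched version in T-SQL, with each batch committed on its own so its locks are released; the table, flag column, and 10,000 batch size are placeholders, and per the advice above the size needs testing, since escalation kicks in around 5,000 locks:

    DECLARE @Done bit;
    SET @Done = 0;

    WHILE @Done = 0
    BEGIN
        BEGIN TRAN;

        UPDATE TOP (10000) dbo.BigTable   -- hypothetical table and flag column
        SET    SomeValue   = NewValue,
               NeedsUpdate = 0
        WHERE  NeedsUpdate = 1;

        IF @@ROWCOUNT = 0
            SET @Done = 1;

        COMMIT;                           -- releases this batch's locks
    END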

Read this for more information: http://msdn.microsoft.com/en-us/library/ms184286.aspx

Hope this helps

Robert

+1

It is odd that your analyzer says the time goes into updating the clustered index. Does the data size change during the update? It sounds like the varchar columns are driving a reorganization of the data, which may also require pointer updates (as KMB indicated). In that case you may want to increase the percentage of free space kept on the data and index pages so that they can grow without relocation/redistribution (see the sketch after the list below). Since an update is an I/O-intensive operation (unlike reads, which can be served from the buffer), performance also depends on several factors:

1) Are your tables partitioned by the data?
2) Is the whole table on the same SAN disk (or is the SAN striping good)?
3) How verbose is the transaction logging? Can the transaction log buffer be increased to handle the extra logging needed to support the massive inserts?
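
On the free-space suggestion, that page-level reserve is the index fill factor; a hedged sketch of setting it on a rebuild (the 80 percent figure and the index name are invented examples):

    -- Leave 20% free space on each page so growing nvarchar values are less
    -- likely to force page splits.  Names and numbers are examples only.
    ALTER INDEX PK_BigTable ON dbo.BigTable
    REBUILD WITH (FILLFACTOR = 80, PAD_INDEX = ON);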

It also matters which API/language you are using. For example, JDBC supports batch updates, which make things a bit more efficient when you perform multiple updates.

0
