What is the most efficient way to update a table during a merge operation?

Like many developers, I perform many data merge operations, particularly with SQL Server.

Historically, I used an old trick: -

1) Performing a left join with existing data and inserting something for which I do not have a corresponding record.

2) After 1), updating the rows in my target table.

I have to take a performance hit on 1). It's unavoidable. However, at 2), I was pretty wasteful. Instead of just updating the material that needs updating, I updated everything I matched (whether the underlying data changed or not).

It turns out that SQL Server is not very versed in this update. He does not perform a preliminary check to determine that what you are about to update is not the same as what you use to update it. Consequently, updates made on these lines result in a physical record and the effect of any indexes that reference the field.

So, from my POV, my options are as follows: -

1) Continue, as usual, bathing in the current wastefulness of my routine (and updating indexes daily on large databases)

  • Pros: it's easy.
  • Cons: this is shit.

2) Write more UPDATE statements that update a specific field if the field has been changed.

eg.

UPDATE p2 SET [SpecificField] = p1.[SpecificField] FROM @source p1, Dest p2 WHERE p2.ExternalKey = p1.ExternalKey AND COALESCE(p1.[SpecificField],'') <> COALESCE(p2.[SpecificField],'') 
  • Pros: it is very specific, it is updated only if necessary.
  • Cons: many different update statements for tables with many columns.

3) Something infinitely better that the community offers.

I would love to go with 3). Are my options really limited to 1 or 2? The note. I looked at the MERGE INTO. The same problems, really.

+4
source share
1 answer

With MERGE INTO , you have the opportunity to add an additional search clause to the WHEN (NOT) MATCHED . For instance,

 MERGE INTO table_to_upsert AS target USING source_table AS source ON target.key1 = source.key1 AND target.key2 = source.key2 AND target.key3 = source.key3 WHEN MATCHED AND (target.value <> source.value) THEN UPDATE SET target.value = source.value, target.timestamp = source.timestamp WHEN MATCHED AND (target.userid <> source.userid) THEN UPDATE SET target.userid = source.userid, target.timestamp = source.timestamp WHEN NOT MATCHED THEN INSERT (key1, key2, key3, value, userid, timestamp) VALUES (source.key1, source.key2, source.key3, source.value, source.userid, source.timestamp) 

However, this will not solve your problem, if several columns are updated at the same time, MERGE will accept the first WHEN (NOT) MATCHED , which evaluates to true (similar to the CASE statement).

+1
source

Source: https://habr.com/ru/post/1403488/


All Articles