I have a web application that uses a fairly large table (millions of rows, about 30 columns). Let's call it TableA. Among the 30 columns, this table has a primary key named "id" and another column named "campaignID".
Within the application, users can upload new datasets related to new “campaigns”.
These datasets have the same structure as TableA, but usually only about 10,000-20,000 rows.
Each row in the new dataset will have a unique "id", but they will all have the same campaign identifier. In other words, the user uploads the full data for the new “campaign”, so all 10,000 rows have the same “campaign ID”.
Typically, users upload data for a new campaign, so there are no rows in TableA with the same campaign ID. Since each "id" is unique to its campaign, the id of every new row will also be unique in TableA.
However, in the rare case where a user uploads a new set of rows for a campaign that is already in the database, the requirement was to first delete all old rows for that campaign from TableA, and then insert the rows from the new dataset.
So my stored procedure was simple:
- BULK INSERT the new data into a temporary table (#TableB)
- Delete all existing rows in TableA with the same campaign ID.
- INSERT INTO TableA ([columns]) SELECT [columns] FROM #TableB
- Drop #TableB
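A minimal sketch of what such a procedure body could look like, for illustration only. The file path, the field and row terminators, the column names col1 and col2, and the data types are all placeholders (the real table has about 30 columns), and wrapping the delete and insert in a transaction is an assumption rather than something stated above.

```sql
-- Staging table with the same structure as TableA (columns abbreviated; types assumed).
CREATE TABLE #TableB (
    id         INT         NOT NULL PRIMARY KEY,
    campaignID INT         NOT NULL,
    col1       VARCHAR(50) NULL,   -- placeholder column
    col2       VARCHAR(50) NULL    -- placeholder column
);

-- Load the uploaded file into the staging table.
BULK INSERT #TableB
FROM 'C:\uploads\campaign.csv'     -- hypothetical path, must be readable by the server
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

BEGIN TRANSACTION;

-- Remove every existing row that belongs to the uploaded campaign.
DELETE FROM TableA
WHERE campaignID IN (SELECT campaignID FROM #TableB);

-- Insert the full new dataset.
INSERT INTO TableA (id, campaignID, col1, col2)
SELECT id, campaignID, col1, col2
FROM #TableB;

COMMIT TRANSACTION;

DROP TABLE #TableB;
```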
This worked perfectly.
But the new requirement is to give users 3 options, when uploading new data, for handling "duplicates": instances in which the user uploads data for a campaign that is already in the table.
- Delete ALL the data in TableA with the same campaign ID, and then insert all the new data from #TableB. (This is the old behavior. With this option, there will never be duplicates.)
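- If a row in #TableB has the same id as a row in TableA, update that row in TableA with the row from #TableB. (In effect, "replace" the old data with the new data.)
- If a row in #TableB has the same id as a row in TableA, ignore that row in #TableB. (In effect, keep the original data and ignore the new data.)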
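In MySQL, I would use LOAD DATA INFILE with the "REPLACE" or "IGNORE" option. However, I don't know how to do this in SQL Server/T-SQL.
Any solution needs to be efficient, given that TableA has millions of rows and #TableB (the new dataset) may have 10k-20k rows.
Googling turned up a "MERGE" command (apparently supported in SQL Server 2008), but I am on SQL Server 2005.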
In rough pseudocode, I need something like this:
Option 1:
[already working, as described above]
Option 2 (replace):
merge into TableA as Target
using #TableB as Source
on TableA.id = #TableB.id
when matched then
update row in TableA with row from #TableB
when not matched then
insert row from #TableB
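On SQL Server 2005, which has no MERGE statement, one common way to get the "replace" behavior is an UPDATE joined to #TableB followed by an INSERT guarded by NOT EXISTS. The following is a sketch under assumptions, not a tested solution: col1 and col2 stand in for the remaining columns, and the transaction is added so that both steps succeed or fail together.

```sql
BEGIN TRANSACTION;

-- Step 1: overwrite rows in TableA whose id also appears in #TableB.
UPDATE a
SET a.campaignID = b.campaignID,
    a.col1       = b.col1,         -- placeholder column
    a.col2       = b.col2          -- placeholder column
FROM TableA AS a
INNER JOIN #TableB AS b
    ON a.id = b.id;

-- Step 2: insert the rows from #TableB whose id is not yet in TableA.
INSERT INTO TableA (id, campaignID, col1, col2)
SELECT b.id, b.campaignID, b.col1, b.col2
FROM #TableB AS b
WHERE NOT EXISTS (SELECT 1 FROM TableA AS a WHERE a.id = b.id);

COMMIT TRANSACTION;
```

Because "id" is the primary key of TableA, both the join and the NOT EXISTS check can be resolved through that index, so the work should scale with the 10k-20k rows in #TableB rather than with the size of TableA.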
Option 3 (ignore):
merge into TableA as Target
using #TableB as Source
on TableA.id = #TableB.id
when matched then
do nothing
when not matched then
insert row from #TableB
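Without MERGE, the "ignore" behavior needs only a single statement: insert the rows from #TableB that have no match in TableA and leave everything else untouched. Again a sketch under assumptions, with col1 and col2 as placeholders for the remaining columns.

```sql
-- Insert only the rows whose id does not already exist in TableA;
-- rows with a matching id are skipped, so the original data is kept.
INSERT INTO TableA (id, campaignID, col1, col2)
SELECT b.id, b.campaignID, b.col1, b.col2
FROM #TableB AS b
WHERE NOT EXISTS (SELECT 1 FROM TableA AS a WHERE a.id = b.id);
```

A LEFT JOIN with a "WHERE a.id IS NULL" filter would express the same thing; which form performs better is worth checking against the actual execution plan.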