Removing duplicates in an SSIS data flow

I am working on an SSIS data flow task.

The source table comes from an old database that is denormalized.

The destination table is normalized.

SSIS raises an error because the data cannot be transferred: the rows contain duplicate values in the primary key column.

It would be nice if SSIS could check the destination for the current record (by checking the key) and, if it already exists, ignore it. Then it could continue with the next record.

Is there any way to handle this scenario?

+6
1 answer

Assuming your destination table is a subset of your source table, you should be able to use the Sort Transformation to pull in only the columns that you need for your destination table, and then check "Remove rows with duplicate sort values" to get a distinct list of records based on the selected columns.

Then just direct the output of the sort to your destination and you should be good to go.
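
To illustrate what the duplicate-removal option of the Sort Transformation does conceptually, here is a minimal Python sketch. It is not SSIS code; the function name `distinct_by_sort_columns` and the sample column names are hypothetical and chosen only for illustration. It assumes the rows are plain dictionaries and that the selected columns contain no NULL-like values that would break sorting.

```python
from typing import Any, Dict, Iterable, Iterator, List


def distinct_by_sort_columns(
    rows: Iterable[Dict[str, Any]], sort_columns: List[str]
) -> Iterator[Dict[str, Any]]:
    """Sort rows on sort_columns and yield one row per distinct key.

    Only the selected columns are kept, mirroring the advice to pass
    just the columns the destination table needs.
    """
    def key_of(row: Dict[str, Any]) -> tuple:
        return tuple(row[column] for column in sort_columns)

    previous_key = object()  # sentinel that never equals a real key
    for row in sorted(rows, key=key_of):
        current_key = key_of(row)
        if current_key != previous_key:
            # First row seen for this key: emit it, skip later duplicates.
            yield {column: row[column] for column in sort_columns}
            previous_key = current_key


# Hypothetical denormalized source: customer data repeated on every order.
source_rows = [
    {"customer_id": 2, "name": "Bob", "order_id": 10},
    {"customer_id": 1, "name": "Ann", "order_id": 11},
    {"customer_id": 2, "name": "Bob", "order_id": 12},
]

customers = list(distinct_by_sort_columns(source_rows, ["customer_id", "name"]))
print(customers)
```

Note that this sketch removes duplicates across all selected columns together. Two rows that share a key value but differ in another selected column would both survive, so a primary key violation is only avoided when the selected columns are consistent for each key.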

+16

Source: https://habr.com/ru/post/891045/

