The best way to compare the contents of two flat files

We have many pipe-delimited (|) flat files that we process daily in SQL Server using an SSIS package. Each flat file is divided into a header section, a content section, and a footer section. We regularly receive a newer version of the same files. To reduce the processing load, we are trying to implement a comparison between two versions of the same file so that only the rows that differ need to be processed.

Which method will be more efficient?

  • Saving both versions of the same file in separate SQL Server tables with a checksum column, then filtering for rows whose checksum values do not match.

  • Implementing similar checksum logic in C#, or using any other comparison algorithm available in C#.

Suggestions for any other algorithm that achieves the same result are also welcome.
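To make the checksum idea concrete, here is a minimal sketch of the second option done outside the database. It is written in Python rather than C# purely for brevity, and it rests on assumptions that are not stated in the question: the header and footer are each a single line, and the first pipe-delimited field of every content row is a unique key. The function names and sample data are hypothetical.

```python
import hashlib


def content_rows(text):
    """Return content rows, assuming a one-line header and a one-line footer."""
    lines = text.splitlines()
    return lines[1:-1]


def checksums_by_key(rows):
    """Map each row's key (assumed: first pipe-delimited field) to a SHA-256 digest of the row."""
    return {
        row.split("|", 1)[0]: hashlib.sha256(row.encode("utf-8")).hexdigest()
        for row in rows
    }


def changed_or_new_keys(old_text, new_text):
    """Keys whose row is new in the newer file or whose checksum differs from the older file."""
    old = checksums_by_key(content_rows(old_text))
    new = checksums_by_key(content_rows(new_text))
    return sorted(key for key, digest in new.items() if old.get(key) != digest)


# Hypothetical sample data: header line, three content rows, footer line.
version1 = "HDR|2024-01-01\n1|alice|10\n2|bob|20\n3|carol|30\nFTR|3"
version2 = "HDR|2024-01-02\n1|alice|10\n2|bob|25\n4|dave|40\nFTR|3"

print(changed_or_new_keys(version1, version2))  # ['2', '4']
```

Only the rows for keys 2 (changed) and 4 (new) would need to be processed. Rows that exist only in the older file, such as key 3 here, are not reported by this sketch and would need a second pass in the opposite direction.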

1 answer

Well, if you are already loading both versions into SQL Server, a quick way is to use EXCEPT or INTERSECT, depending on your purpose.

select * from version2
except
select * from version1

This returns the rows in version2 that do not match any row in version1. You can also select only specific columns if those are the only ones you want to compare.
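To see how the two operators behave in each direction, the sketch below runs the same kind of queries against an in-memory SQLite database from Python. This is only an illustration: SQLite stands in for SQL Server, and the columns (id, name, amount) and sample rows are hypothetical. Only the table names version1 and version2 come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE version1 (id INTEGER, name TEXT, amount INTEGER);
    CREATE TABLE version2 (id INTEGER, name TEXT, amount INTEGER);
    INSERT INTO version1 VALUES (1, 'alice', 10), (2, 'bob', 20), (3, 'carol', 30);
    INSERT INTO version2 VALUES (1, 'alice', 10), (2, 'bob', 25), (4, 'dave', 40);
""")


def run(sql):
    """Execute a query and return its rows sorted, so the output order is stable."""
    return sorted(conn.execute(sql).fetchall())


# Rows present in version2 but not in version1: new or changed rows.
new_or_changed = run("SELECT * FROM version2 EXCEPT SELECT * FROM version1")

# Rows present in version1 but not in version2: removed rows or old values.
removed_or_old = run("SELECT * FROM version1 EXCEPT SELECT * FROM version2")

# Rows identical in both versions: these can be skipped.
unchanged = run("SELECT * FROM version1 INTERSECT SELECT * FROM version2")

print(new_or_changed)  # [(2, 'bob', 25), (4, 'dave', 40)]
print(removed_or_old)  # [(2, 'bob', 20), (3, 'carol', 30)]
print(unchanged)       # [(1, 'alice', 10)]
```

A changed row shows up in both EXCEPT results, once with its new values and once with its old ones, so the direction of the query decides which side of the change you see.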


Source: https://habr.com/ru/post/1681421/

