Checking whether a large CSV file (1 million rows) has the same data as a MySQL table

I am trying to find an efficient way to compare the contents of a CSV file with a MySQL table (more than 1 million rows to compare). I did something similar before by loading all the rows into an array, but that only works for a small number of rows because it runs out of memory.

My question is: is there a recommended way to do this? Are there any libraries or tools that could help?

I would appreciate your answers.

+6
3 answers

Assuming this is a sanity check and you expect zero differences, how about dumping the table to a CSV file in the same format and then using command-line tools (diff or cmp) to check that they match?

You need to make sure that the CSV dump is ordered and formatted in the same way as the source file.
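If diff or cmp are not available, the same comparison can be sketched in Python. This is only a minimal illustration, assuming the table has already been exported to a second CSV file with the same row order and formatting; the function name `first_difference` is made up for this example. It reads both files in lockstep, so memory use stays constant regardless of file size.

```python
import csv
from itertools import zip_longest


def first_difference(path_a, path_b):
    """Return the 1-based number of the first differing row, or None if the files match.

    Assumes both files are sorted and formatted the same way.
    """
    with open(path_a, newline="") as file_a, open(path_b, newline="") as file_b:
        rows_a = csv.reader(file_a)
        rows_b = csv.reader(file_b)
        # zip_longest pads the shorter file with None, so a length
        # mismatch is reported as a difference too.
        pairs = zip_longest(rows_a, rows_b)
        for row_number, (row_a, row_b) in enumerate(pairs, start=1):
            if row_a != row_b:
                return row_number
    return None
```

Because it compares parsed fields rather than raw bytes, differences in quoting alone (for example `"a"` versus `a`) are not reported, which may or may not be what you want.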

+10

In addition to @therefromhere's excellent answer, you can also calculate a hash of the data, both in MySQL and for the source file, and then compare the two.
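One way to sketch the hashing idea is shown below. Note that this variant computes both hashes on the client side in Python rather than inside MySQL, and it combines per-row SHA-256 values by addition so that row order does not matter. The names `rows_digest` and `csv_digest` and the choice of field separator are assumptions made for this example.

```python
import csv
import hashlib

MODULUS = 2 ** 256  # keeps the running sum the size of one SHA-256 value


def rows_digest(rows):
    """Return (row_count, combined_hash) for an iterable of rows.

    Each row is hashed separately and the hashes are summed, so the
    result is independent of row order and only one row is held in
    memory at a time.
    """
    total = 0
    count = 0
    for row in rows:
        # Assumption: fields never contain the unit separator character,
        # and NULL (None) is treated the same as an empty string.
        normalized = "\x1f".join("" if value is None else str(value) for value in row)
        row_hash = hashlib.sha256(normalized.encode("utf-8")).digest()
        total = (total + int.from_bytes(row_hash, "big")) % MODULUS
        count += 1
    return count, total


def csv_digest(path):
    """Compute the digest of a CSV file by streaming over its rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        return rows_digest(csv.reader(handle))
```

The same `rows_digest` function can be fed rows from a database cursor that streams results, and the two tuples compared. This only works if the database values convert to exactly the same text as in the CSV file (dates, decimals and NULLs are the usual trouble spots). A match means the two sides are almost certainly equal; a mismatch does not tell you which rows differ.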

+2

I have never tried this myself, but MySQL has a CSV storage engine. Perhaps you can get MySQL to read the file directly, as if it were just another database table. You will probably need to first create an empty table that matches the CSV file, so that the .frm file is created in the data directory. Then you can replace the empty CSV file in the data directory with your CSV file. You may need to run REPAIR TABLE afterwards, since the data was not imported through MySQL.

http://dev.mysql.com/doc/refman/5.1/en/csv-storage-engine.html

0

Source: https://habr.com/ru/post/913281/

