I know that this question has been asked a hundred times, but, unfortunately, none of these answers helped, since most of them are many years old, it should justify the other.
I have two tables, recordss +140kk rows/+24GBand extras +89kk rows/+70GB.
Each row extrahas a foreign key relation to records, indexes are correctly set between two tables. Deletes from recordscascade deletion to the associated string extra.
I need to clear old records during production . The execution DELETE FROM records WHERE WHERE created < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY)) ORDER BY id LIMIT 1000;lasts forever (I killed him after 20 minutes, when he was still in init). This same action c SELECTcompletes in just a few milliseconds.
Since it SELECTworks so fast, I tried it with id IN (SELECT id subquery^), unfortunately, that did not change, so I killed it after 15 minutes.
Since deleting individual records is quick, I ended up doing this:
for i in `seq 1 100000`; do
mysql database -e "SELECT id FROM records WHERE created < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY)) ORDER BY id LIMIT 1000;" | sed 's;/|;;g' | awk '{if(NR>1)print "DELETE FROM records WHERE id = ",$1,";" }' | mysql database;
now=$(date +"%Y/%m/%d %T")
echo "[ $now ] $i.000"
done
This allows you to process a thousand records per second at the beginning, but after several cycles it decreases to a thousand in 10-20 seconds. Since I need to clear 100 thousand lines, it will take almost a month to complete, which is about the same timeframe as the records to clear, so it will never end (especially since it requires several databases).
The database is stored on the SSD Crucial_CT500MX200SSD1, and the software MariaDB 10.1with the engine InnoDB. innodb_flush_log_at_trx_commitset to 0to avoid unnecessary disk usage.
atop , , mem .
: https://gist.github.com/Slind14/0da34e09dba91cf411db2ead5ad666ef