Removing huge chunks of data from MySQL InnoDB

I need to delete a huge portion of my data in my production database, which is about 100 GB in size. If possible, I would like to minimize downtime.

My selection criteria for removal will likely be something like:

    REMOVE * FROM POSITION WHERE USER.ID = 5 AND UPDATED_AT < 100

What is the best way to remove it?

  • Create an index?
  • Write a script that deletes the rows sequentially, paging through them 1000 at a time?
+4
4 answers

The best way is to delete them step by step using the LIMIT clause (around 10,000 rows per batch), but without applying any ordering. This lets MySQL release the results faster, and the transactions will not grow huge. You can easily do this from any programming language you have installed that has a MySQL connector. Be sure to commit after each statement.
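A minimal sketch of that loop in plain SQL, as an alternative to an external script, assuming the table and columns really are named position, user_id, and updated_at as the question's pseudo-query suggests:

    DELIMITER $$
    CREATE PROCEDURE purge_position()
    BEGIN
      REPEAT
        -- Delete at most 10,000 matching rows per pass; with autocommit
        -- enabled, each pass commits as its own small transaction.
        DELETE FROM position
        WHERE user_id = 5 AND updated_at < 100
        LIMIT 10000;
      UNTIL ROW_COUNT() = 0 END REPEAT;  -- stop once a pass deletes nothing
    END$$
    DELIMITER ;

    CALL purge_position();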

An index will definitely help, but building one takes some time on a 100 GB table (it is still worth creating if you are going to reuse it in the future). By the way, your query as written is invalid: it refers to USER.ID, but the USER table never appears in the statement. You also have to be careful with the index definition so that the optimizer can actually use it.
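For instance, assuming the filter columns actually live on the POSITION table itself, a composite index with the equality column first lets the optimizer use both predicates (names taken from the question's pseudo-query):

    -- Equality column (user_id) first, range column (updated_at) second,
    -- so the optimizer can use the index for both conditions.
    CREATE INDEX idx_user_updated ON position (user_id, updated_at);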

0

You can try the method mentioned in the MySQL docs:

  • Select the rows that are not to be deleted into an empty table that has the same structure as the original table:

    INSERT INTO t_copy SELECT * FROM t WHERE ...;

  • Use RENAME TABLE to atomically move the original table out of the way and rename the copy to the original name:

    RENAME TABLE t TO t_old, t_copy TO t;

  • Drop the original table:

    DROP TABLE t_old;
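This recipe presumes that the empty copy already exists before the INSERT; one way to create it with an identical structure (a sketch):

    CREATE TABLE t_copy LIKE t;  -- same columns and indexes, zero rows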

+7

If possible, use row-based binary logging rather than statement-based binary logging (it reduces locking), at least for the duration of this operation. Perform the deletion in batches (1000 rows is a decent size). Use the primary key as the criterion for each batch, and order by the primary key so that you delete rows that are physically close to each other.
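A sketch of one such batch, assuming position has an auto-increment primary key named id (a hypothetical column, not shown in the question):

    -- Delete 1000 rows per pass in primary-key order; adjacent ids sit
    -- close together in InnoDB's clustered index. Rerun until 0 rows affected.
    DELETE FROM position
    WHERE user_id = 5 AND updated_at < 100
    ORDER BY id
    LIMIT 1000;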

+2

A while back, I wanted to remove more than 99% of the data from a table. It was a session table with more than 250 million rows, and I only wanted to keep the most recent 500K. The fastest way I found was to select the 500,000 rows I wanted into another table, drop the old table, and rename the new one to take its place. It was about 100 times faster than a regular DELETE, which has to locate the matching records and update the table in place.
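That is essentially the copy-and-swap recipe from the answer above; sketched with hypothetical names (a sessions table with a last_active column), it would look like this:

    CREATE TABLE sessions_new LIKE sessions;  -- empty copy, same schema
    -- Keep only the newest 500K rows.
    INSERT INTO sessions_new
        SELECT * FROM sessions ORDER BY last_active DESC LIMIT 500000;
    RENAME TABLE sessions TO sessions_old, sessions_new TO sessions;
    DROP TABLE sessions_old;  -- frees the old table's disk space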

This also has the added benefit of shrinking the table's footprint on disk if you use InnoDB with innodb_file_per_table = 1, because an InnoDB tablespace never shrinks on its own; dropping the old table deletes its .ibd file outright.
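You can check whether the server runs in that mode before counting on the space coming back:

    -- ON means each InnoDB table lives in its own .ibd file, which is
    -- removed (returning the space to the OS) when the table is dropped.
    SHOW VARIABLES LIKE 'innodb_file_per_table';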

0

Source: https://habr.com/ru/post/1309512/

