Interpreting MySQL information_schema.tables DATA_LENGTH, INDEX_LENGTH, and DATA_FREE

I hope someone can explain why two hours of data cleansing produced only a 32 KB reduction in the data usage reported by my MySQL instance. Here are the details:

I have a MySQL database (running on Amazon RDS) from which I am trying to clear out data. I am doing this to avoid running out of storage space: Amazon caps you at 1 TB, and if we take no action we will eventually hit that limit.

I use this command to calculate the size of my tables and indexes:

select * from information_schema.tables; 
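For readability, the same information can be pulled per table and converted to gigabytes. This is a sketch against the standard `information_schema.tables` columns (`DATA_LENGTH`, `INDEX_LENGTH`, `DATA_FREE`); it assumes you run it while connected to the schema you want to measure:

```sql
-- Per-table sizes for the current schema, largest first.
-- DATA_LENGTH covers the clustered index (row data), INDEX_LENGTH the
-- secondary indexes, and DATA_FREE the allocated-but-unused space.
SELECT TABLE_NAME,
       ROUND(DATA_LENGTH  / 1024 / 1024 / 1024, 2) AS data_gb,
       ROUND(INDEX_LENGTH / 1024 / 1024 / 1024, 2) AS index_gb,
       ROUND(DATA_FREE    / 1024 / 1024 / 1024, 2) AS free_gb
FROM information_schema.tables
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY DATA_LENGTH + INDEX_LENGTH DESC;
```

Note that for InnoDB these values are estimates based on sampled statistics, not exact byte counts.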

In particular, two InnoDB tables consume most of my storage. I have a process that iterates through my largest table, deleting records. At time t = 0, I ran the query above and got the following for the data length and index length:

  • Data Length: 56431116288
  • Index Length: 74233151488

After two hours of continuously running my database cleaner, I ran the same query again and got:

  • Data Length: 56431083520
  • Index Length: 74126147584

That means I shaved off only 32 KB of table data but about 102 MB of index data.

The index shrinking makes sense, but the reduction in table data is absurdly small. No other data could have been inserted during this time, because I am running this test against a copy of my database (one of the nice things about RDS is that you can spin up a complete replica of your database and run experiments on it). I also confirmed that the AUTO_INCREMENT value was identical both times.

Can someone explain why the data length has barely changed? Is data length just a quick-and-dirty approximation? Is there some other MySQL compaction step? Or am I completely misinterpreting what these fields mean?

Thanks!

Update

I think I may have figured this out. At time t = 0:

  • DATA_FREE = 77594624

Four hours later:

  • DATA_FREE = 256901120

This means DATA_FREE grew by approximately 171 MB.

Does this mean that if I insert another 171 MB, it will come out of the DATA_FREE pool, and therefore my data length will not increase?

In other words, say I start with a fresh InnoDB table and insert 20 GB of data (assuming that 20 GB includes all the extra InnoDB overhead; I understand data stored in InnoDB takes more space than in MyISAM), then delete all of it, then insert 10 GB of data. When I run select * from information_schema.tables, should I then see a data length of 20 GB with 10 GB in DATA_FREE? That is, I should not expect a data length of 30 GB with DATA_FREE of 0 GB, nor a data length of 10 GB with DATA_FREE of 10 GB, right?

Update 2

This Stack Overflow answer also seems to confirm my analysis.

1 answer

The "data length" of a table includes any free space that may exist within the table. You will likely have to OPTIMIZE TABLE to defragment it and release that space. Note that this can lock the table for some time while it does its work.
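A minimal sketch of that step, with `my_big_table` standing in for your actual table name. For InnoDB, OPTIMIZE TABLE does not optimize in place; MySQL maps it to a full table rebuild (recreate + analyze), which is why it can take a long time and needs enough free storage to hold a rebuilt copy:

```sql
-- Rebuild the table to reclaim fragmented space.
-- On InnoDB this runs as a recreate + analyze and may lock the table.
OPTIMIZE TABLE my_big_table;

-- Afterwards, check the reclaimed space for that table.
SELECT DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.tables
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'my_big_table';
```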

Using the InnoDB storage engine ( CREATE TABLE ( ... ) ENGINE=InnoDB; ) makes table optimization largely unnecessary and also makes the database faster. If you are not already using it, you should probably start. :)
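For an existing table on another engine, the switch is a one-line ALTER; `my_big_table` is again a placeholder, and note the statement rewrites the whole table, so it takes time and space proportional to the table's size:

```sql
-- Convert an existing table to InnoDB (full table rewrite).
ALTER TABLE my_big_table ENGINE=InnoDB;
```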


Source: https://habr.com/ru/post/900756/
