Why is my table more than 4 times larger than expected? (rows * bytes per row)

I am considering a simple table in MySQL that has 4 columns with the following sizes:

  • unsigned bigint (8 bytes)
  • unsigned bigint (8 bytes)
  • unsigned smallint (2 bytes)
  • unsigned tinyint (1 byte)

So I would expect 19 bytes per row.

This table has 1,654,150 rows, so the data size should be 31,428,850 bytes (or about 30 megabytes).

But phpMyAdmin shows that the data occupies 136.3 MiB (not including the index on bigint 1, smallint, tinyint, which takes another 79 MiB).
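To make the gap concrete, here is a minimal sketch of the arithmetic in Python. It only uses the figures quoted above; the variable names are illustrative.

```python
# Naive size estimate from the column widths given in the question.
COLUMN_BYTES = {"bigint_1": 8, "bigint_2": 8, "smallint": 2, "tinyint": 1}
ROWS = 1_654_150
REPORTED_DATA_MIB = 136.3  # data size as read from phpMyAdmin

MIB = 1024 * 1024

row_bytes = sum(COLUMN_BYTES.values())
expected_bytes = row_bytes * ROWS
expected_mib = expected_bytes / MIB
ratio = REPORTED_DATA_MIB / expected_mib

print(f"bytes per row:     {row_bytes}")
print(f"expected size:     {expected_bytes:,} bytes ({expected_mib:.1f} MiB)")
print(f"reported/expected: {ratio:.1f}x")
```

The ratio comes out a little above 4.5, which is where the "more than 4 times" in the title comes from.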

The Storage Engine is InnoDB, and the Primary Key is bigint 1, bigint 2 (user ID and item unique identifier).


Edit: As requested in the comments, here is the output of SHOW CREATE TABLE storage:

 CREATE TABLE `storage` (
   `fbid` bigint(20) unsigned NOT NULL,
   `unique_id` bigint(20) unsigned NOT NULL,
   `collection_id` smallint(5) unsigned NOT NULL,
   `egg_id` tinyint(3) unsigned NOT NULL,
   PRIMARY KEY (`fbid`,`unique_id`),
   KEY `fbid` (`fbid`,`collection_id`,`egg_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8
+6
3 answers

If the table sees frequent inserts, deletes, and updates, you can try running OPTIMIZE TABLE to see how much the table shrinks. There may be fragmentation and unused space in the data file.

The data size that phpMyAdmin shows will not match your calculation. When you first create a table, the data usage it shows is not 0; it will be 16 KB, 32 KB, or some similar amount. The size also does not change with each inserted record. This is how InnoDB manages the table file in the way it considers efficient.

Check SHOW TABLE STATUS FROM {db_name} and look at the Avg_row_length for the table. It will not be 19 bytes either.
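As a rough cross-check, the average row length implied by the numbers in the question can be worked out by hand. This sketch assumes the row count in the question is exact, whereas the Rows value from SHOW TABLE STATUS is only an estimate for InnoDB.

```python
MIB = 1024 * 1024

data_length = 136.3 * MIB  # data size as displayed by phpMyAdmin
rows = 1_654_150           # assumed exact here; InnoDB only estimates it

avg_row_length = data_length / rows
print(f"implied bytes per row: {avg_row_length:.1f}")
```

That works out to a bit over 86 bytes per row, not 19.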

+2

Your indexes have their own tables on disk (although you cannot see them directly). The total size of your database is the size of the tables plus the size of the index tables.

Run

 show create table <tablename>; 

This lists any indexes. Imagine adding the size of your table to the size of a table made up of the two columns in your primary key. Added together, they give the size that you see.

+2

The on-disk data size for InnoDB is usually 2-3 times larger than what you would calculate. This is due to:

  • Per-column overhead (length)
  • Per-row overhead (transaction ID, etc.)
  • Per-block overhead (16 KB blocks; link to the next block in the B+Tree)
  • Blocks average about 69% full.
  • MVCC (Multi-Version Concurrency Control). This means that old and new copies of a row may coexist while a transaction is in progress.
  • Etc.

One thing that could help: almost no application needs BIGINT (8 bytes) for identifiers. Consider INT UNSIGNED (4 bytes, limit of about 4 billion) or MEDIUMINT UNSIGNED (3 bytes, limit of about 16 million), etc. You have 2 BIGINTs but 4 copies of them, because the secondary key implicitly includes the PK columns.
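For reference, a small sketch of those limits and of the raw bytes at stake. The saving only applies if every id value actually fits in the smaller type, which the question does not tell us, and it ignores the overheads listed above.

```python
def unsigned_max(n_bytes: int) -> int:
    """Largest value an unsigned integer of n_bytes can hold."""
    return 2 ** (8 * n_bytes) - 1

TYPES = {"MEDIUMINT UNSIGNED": 3, "INT UNSIGNED": 4, "BIGINT UNSIGNED": 8}
for name, size in TYPES.items():
    print(f"{name:<20} {size} bytes  max {unsigned_max(size):,}")

ROWS = 1_654_150
COPIES = 4  # two BIGINT columns, each present in the data and in the secondary key
saved_per_copy = TYPES["BIGINT UNSIGNED"] - TYPES["INT UNSIGNED"]
raw_saving = ROWS * COPIES * saved_per_copy  # hypothetical: both ids fit in INT UNSIGNED
print(f"raw saving: {raw_saving:,} bytes")
```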

The PRIMARY KEY is stored with the data, so it carries very little overhead. The secondary key, which is effectively 4 columns, is a separate BTree with a similar set of overheads.
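A back-of-the-envelope model of the above, as a sketch only. The per-record header sizes are my assumptions for a COMPACT-style row format, not something taken from the question, and the model ignores page headers, non-leaf pages, MVCC copies, and pre-allocated free space.

```python
ROWS = 1_654_150
MIB = 1024 * 1024
FILL_FACTOR = 0.69  # average block fill from the list above

# Assumed per-record overheads; not taken from the question.
RECORD_HEADER = 5   # record header bytes
TRX_ID = 6          # transaction id, clustered index only
ROLL_PTR = 7        # rollback pointer, clustered index only

# Clustered index (the table itself): all four columns plus the hidden fields.
clustered_row = (8 + 8 + 2 + 1) + RECORD_HEADER + TRX_ID + ROLL_PTR

# Secondary key `fbid`: (fbid, collection_id, egg_id) plus the PK column
# it does not already contain (unique_id).
secondary_row = (8 + 2 + 1) + 8 + RECORD_HEADER

def estimate_mib(row_bytes: int) -> float:
    """Leaf-level size in MiB for ROWS records at the assumed fill factor."""
    return row_bytes * ROWS / FILL_FACTOR / MIB

data_mib = estimate_mib(clustered_row)
index_mib = estimate_mib(secondary_row)
print(f"data:  {clustered_row} bytes/row, about {data_mib:.0f} MiB")
print(f"index: {secondary_row} bytes/row, about {index_mib:.0f} MiB")
```

Both estimates come out below the reported 136.3 MiB and 79 MiB, so the factors left out of the model still matter. The point is only that 19 bytes per row is far too optimistic as a starting figure.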

Even MyISAM has overhead:

  • At least 1 byte per row. (1 in your case)
  • 1 byte per 8 NULLable columns (you have none)
  • Some wasted space after DELETEs or UPDATEs. (UPDATEs will not be a problem in your case because the records are FIXED size.)
  • The PRIMARY KEY is like any other index
  • All keys have the 69% problem; 1 KB blocks

(Since you have no VARCHAR or TEXT columns, I do not need to discuss CHARACTER SET issues.)

In InnoDB, SHOW TABLE STATUS is often off by a factor of 2 when estimating the number of rows. Avg_row_length is computed as Data_length / Rows, so it is usually off as well.

I do not recommend OPTIMIZE TABLE for InnoDB tables; it is almost never worth the effort.

When ALTER TABLE .. ADD INDEX .. is executed, older versions of MySQL rebuild the entire table and its indexes. In doing so, you get the effect of OPTIMIZE. (It is unlikely, but not impossible, for this to increase the data size.) Newer versions only build the new index. Which version are you running?

Each INDEX is a separate BTree (except the PK in InnoDB, and except FULLTEXT and SPATIAL indexes).

+1

Source: https://habr.com/ru/post/920704/