Speed ​​up mysql table creation (from selection)

I have a query that works in about 1.6 seconds and returns 150Kish strings. I am creating a temporary table based on the memory engine from this query, and this operation takes about 6.5 seconds. I sifted through a lot of documentation on the mysql site, and also futzed with the indices in the temp table (there is almost no difference whatsoever when I have indexes, not with them), trying to find a way to create the table a little more efficient way. Does anyone have any suggestions regarding configuration options that can speed up the creation of a temporary storage table (or non temp table)? To find out what exactly I am looking for (I think), these are some table configuration parameters that can increase the speed of creation. I went through most of them listed on mysql 5.7, but nothing changed the time it took to create a temporary / standard table in memory.

CREATE TEMPORARY TABLE IF NOT EXISTS myTempTable (ugID INT,stID INT, INDEX st (`st`) ) ENGINE=MEMORY AS select ugID, stID from a_complex_query_that_runs_in_under_2_seconds_but_takes_6_to_write_to_a_temptable_with_150k_rows id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY <derived15> ALL NULL NULL NULL NULL 559 1 PRIMARY creatorEntity eq_ref PRIMARY PRIMARY 4 creatorAssignment.OwnerObjectID 1 Using index 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 152580 Using where; Using join buffer 15 DERIVED VMAssignment ref AssociatedObjectID,AssociatedObject AssociatedObject 2 2902 Using where 2 DERIVED test_ range AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssociatedObjectID 4 NULL 214 Using where; Using temporary; Using filesort 2 DERIVED cfv_ ref CustomFieldID,EntityID,CFEntity,CFValue,CFEntityValue CustomFieldID 4 testCloud.test_.AssignmentValue 4232 Using where 2 DERIVED ent_ ref VMStoreID VMStoreID 4 testCloud.cfv_.EntityID 1 Using index 3 UNION test_ ref AssociatedObjectID,AssociatedObject AssociatedObject 2 2902 Using where; Using temporary; Using filesort 3 UNION cfv_ ref CustomFieldID,EntityID,CFEntity,CFValue,CFEntityValue CFValue 772 func,func 4 Using where 3 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.cfv_.EntityID 1 Using index 4 UNION test_ const PRIMARY PRIMARY 4 1 Using index 5 UNION test_ index_merge AssociatedObjectID,AssociatedObject AssociatedObjectID,AssociatedObject 4,6 NULL 1 Using intersect(AssociatedObjectID,AssociatedObject); Using where; Using index; Using temporary; Using filesort 5 UNION entity_ ref PRIMARY,ClassName ClassName 2 23326 Using where 5 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.entity_.ID 1 Using index 6 UNION entity_ ref PRIMARY,ClassName ClassName 2 23326 Using where; Using temporary; Using filesort 6 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.entity_.ID 1 Using index 6 UNION test_ ref AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssignmentValue 768 testCloud.entity_.State 1 Using where 7 UNION entity_ ref PRIMARY,ClassName ClassName 2 23326 Using where; Using temporary; Using filesort 7 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.entity_.ID 1 Using index 7 UNION test_ ref AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssignmentValue 768 testCloud.entity_.Zip 1 Using where 8 UNION test_ range AssociatedObjectID,AssociatedObject AssociatedObjectID 4 NULL 150 Using where; Using temporary; Using filesort 8 UNION entity_ ref PRIMARY,ClassName,Address Address 456 func,func 4 Using where 8 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.entity_.ID 1 Using index 9 UNION test_ range AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssociatedObjectID 4 NULL 203 Using where; Using temporary; Using filesort 9 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.test_.AssignmentValue 1 Using where; Using index 10 UNION test_ const PRIMARY PRIMARY 4 1 Using index 11 UNION test_ ref AssociatedObjectID,AssociatedObject AssociatedObjectID 4 1 Using where; Using temporary; Using filesort 11 UNION entity_ ref PRIMARY,ClassName ClassName 2 23326 Using where 11 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.entity_.ID 1 Using index 12 UNION test_ ref AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssociatedObject 2 2902 Using where; Using temporary; Using filesort 12 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.test_.AssignmentValue 1 Using where; Using index 13 UNION test_ range AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssociatedObjectID 4 NULL 239 Using where; Using temporary; Using filesort 13 UNION list_stores ref VMListID,VMStoreID,VMStoreID_2,VMStoreID_3,VMStoreID_4,VMStoreID_5,VMStoreID_6,VMStoreID_7,VMStoreID_8,VMStoreID_9,VMStoreID_10,VMStoreID_11,VMStoreID_12,VMStoreID_13,VMStoreID_14,VMStoreID_15,VMStoreID_16 VMListID 4 testCloud.test_.AssignmentValue 318 Using where 13 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.list_stores.VMStoreID 1 Using index 14 UNION test_ range AssignmentValue,AssociatedObjectID,AssociatedObject,assignmentQuery AssociatedObjectID 4 NULL 70 Using where; Using temporary; Using filesort 14 UNION sto_ ref PRIMARY,RetailerID RetailerID 4 testCloud.test_.AssignmentValue 63 Using where; Using index 14 UNION ent_ ref VMStoreID VMStoreID 4 testCloud.sto_.ID 1 Using index NULL UNION RESULT <union2,3,4,5,6,7,8,9,10,11,12,13,14> ALL NULL NULL NULL NULL NULL 
+5
source share
3 answers

There’s not much to work with. ( SHOW CREATE TABLE and EXPLAIN ) But I see one thing:

 Using intersect(AssociatedObjectID,AssociatedObject) 

This can almost always be improved using a "composite" index:

 INDEX(AssociatedObjectID, AssociatedObject) 

There may be other cases for composite indexes, but we will need to see the query.

", but the time it takes to populate the temp table with a query - creating 152K rows will probably take longer than inserting this number of rows.

Try ENGINE=InnoDB instead of ENGINE=MEMORY . InnoDB has made great strides.

Also try not to include INDEX st (st) in CREATE , but then add it. Building an index in stages can be the slowest part of CREATE .

There are probably no settings that affect the performance of MEMORY .

+2
source

The first thing to do is find out where the bottleneck is. As other comments pointed out in the comments, this does not populate the temporary table.

Insert Rows 150k

It takes about 5 seconds. Woah, which leaves me stupid, but I will redeem myself later.

  $dbh->query('CREATE TEMPORARY TABLE IF NOT EXISTS myTempTable (ugID INT,stID INT, INDEX st (`st`) r '); $dbh->beginTransaction(); $stmt = $dbh->prepare("INSERT INTO myTempTable(ugId,stID) values(:u, :s)"); for ($i=0; $i < 150000 ; $i++) { $stmt->execute(array("u"=> $i, "s"=> $i)); } $res = $dbh->query("SELECT COUNT(*) from myTempTable"); $dbh->commit(); 

CREATE TABLE LIKE

The above experiment is bad, because it loops around 150 thousand times and can even be. But what if we created a persistent table as above and then did CREATE TABLE LIKE (this is the original problem)

BTW Removing the TEMPORARY keyword from the above code increased the time to 9 seconds. And now in the mysql console.

 mysql> CREATE TEMPORARY TABLE IF NOT EXISTS myTempTable (ugID INT,stID INT, INDEX st (`stId`) ) ENGINE=MEMORY AS (SELECT * FROM ugTable); Query OK, 150000 rows affected (0.17 sec) Records: 150000 Duplicates: 0 Warnings: 0 

Fraction of a second. So it’s clear that the fault is how the data is loaded, not the process of filling out the table.

Reduce IO.

Your complex query reads a lot of lines. 150 thousand. Rows in a set of large tables - this is a lot of input-output. You can speed this up by making sure that the server has enough memory to cache the entire set of tables participating in memory here.

This would be much simpler if the data size on the disk were smaller. It may be time to re-examine your tables and see if there is any normalization step that you missed. Time to see if some indexes can be changed. For example, you have impressive indexes such as AssignmentValue 768 . Even when these indexes are used, it is still pretty much IO.

All this data must be read on disk before loading up to 150 thousand of them into a temporary table that is in memory.

0
source

It is recommended that you create an index after the table is populated with your data. This will reduce runtime. The theory is that the index will be created as a whole in memory after insertion and then written to disk, which is much faster than updating the index with each insertion. The index will also have the perfect tree layout when created after insertion.

0
source

Source: https://habr.com/ru/post/1268244/


All Articles