Query Optimization GROUP BY

The database is MySQL with the MyISAM engine.

Table definition:

CREATE TABLE IF NOT EXISTS matches ( id int(11) NOT NULL AUTO_INCREMENT, game int(11) NOT NULL, user int(11) NOT NULL, opponent int(11) NOT NULL, tournament int(11) NOT NULL, score int(11) NOT NULL, finish tinyint(4) NOT NULL, PRIMARY KEY ( id ), KEY game ( game ), KEY user ( user ), KEY i_gfu ( game , finish , user ) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3149047 ; 

I set the index to (game, finish, user) , but for this GROUP BY query it still takes 0.4 - 0.6 seconds to run:

 SELECT user AS player , COUNT( id ) AS times FROM matches WHERE finish = 1 AND game = 19 GROUP BY user ORDER BY times DESC 

EXPLAIN output:

 | id | select_type | table | type | possible_keys | key | key_len | | 1 | SIMPLE | matches | ref | game,i_gfu | i_gfu | 5 | | ref | rows | Extra | | const,const | 155855 | Using where; Using temporary; Using filesort | 

Is there any way to do this faster? The table contains about 800 thousand records.


EDIT: I changed COUNT(id) to COUNT(*) , and the time decreased to 0.08-0.12 seconds. I think I tried this before doing the index and forgot to change it again.

The conclusion of the explanation Using the index explains the acceleration:

 | rows | Extra | | 168029 | Using where; Using index; Using temporary; Using filesort | 

(Side question: is this a drop in the coefficient of 5 normal?)

There are about 2,000 users, so the final sort, even if it uses filesort, does not degrade performance. I tried without ORDER BY , and it still takes almost the same time.

+6
source share
6 answers

EXPLAIN checks that the query used an index (game, finish, user) . This seems to me to be the best indicator. Could this be a hardware problem? What is your operating system and processor?

+1
source

Get rid of the key "game" - it is redundant with the help of "i_gfu". Since "id" is a unique counter (id), it simply returns the number of rows in each group, so you can get rid of it and replace it with count (*). Try this and paste the EXPLAIN output:

 SELECT user AS player, COUNT(*) AS times FROM matches WHERE finish = 1 AND game = 19 GROUP BY user ORDER BY times DESC 
+7
source

Uh, hard. Try reordering your index: first put the user column (so make an index (user, finish, game) ), as this increases the likelihood that GROUP BY will be able to use the index. However, in general, GROUP BY can only use indexes if you restrict the aggregate functions used for MIN and MAX (see http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html and http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html ). Your order is not very helpful.

+2
source

I believe that most of the time is spent on extracting and, more importantly, sorting (twice, including skipped, reading the index) 150k rows of 800k. I doubt that you can optimize it much more than it already is.

+1
source

As others have noted, you may have reached the limit of your ability to customize the query itself. Then you should see what max_heap_table_size and tmp_table_size on your server. The default value is 16 MB, which may be too small for your table.

+1
source

One of the drawbacks of this request is that you order by aggregation. This means that you cannot return rows until a complete set of results has been created; no index can exist (for mysql myisam, anyway) to fix this.

You can quite denormalize your data to overcome this; For example, you can add an insert / update trigger to bind an account value in a pivot table with an index so that you can immediately start returning rows.

+1
source

Source: https://habr.com/ru/post/888639/


All Articles