MySQL does not use the multi-column index for some queries

I have a MyISAM table in MySQL with several million records. Greatly simplified, it looks like this:

 CREATE TABLE `test` (
   `text` varchar(5) DEFAULT NULL,
   `number` int(5) DEFAULT NULL,
   KEY `number` (`number`) USING BTREE,
   KEY `text_number` (`text`,`number`) USING BTREE
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

It is populated with this data:

 INSERT INTO `test` VALUES ('abcd', '1');
 INSERT INTO `test` VALUES ('abcd', '2');
 INSERT INTO `test` VALUES ('abcd', '3');
 INSERT INTO `test` VALUES ('abcd', '4');
 INSERT INTO `test` VALUES ('bbbb', '1');
 INSERT INTO `test` VALUES ('bbbb', '2');
 INSERT INTO `test` VALUES ('bbbb', '3');

When I run the following query:

 EXPLAIN SELECT * FROM `test` WHERE (`text` = 'bbbb' AND `number` = 2) 

EXPLAIN reports `number` as the key to use. But for the following query:

 EXPLAIN SELECT * FROM `test` WHERE (`text` = 'bbbb' AND `number` = 1) 

it reports `text_number` as the key to use, which makes more sense to me, since this composite key exactly matches the two columns in the WHERE clause. With this few records performance is not a problem, but with several million records a query that uses the `number` index takes 4 seconds, while one that uses the `text_number` index completes in a few milliseconds.

Is there a logical explanation for this? How can I change which index MySQL uses? I know I can use USE INDEX, but I want MySQL to find the better execution plan on its own. This happens on both MySQL 5.1 and 5.5, with the same results.

2 answers

If you expect some queries to be critical for performance, it is reasonable to tell MySQL which index to use, for example:

 EXPLAIN SELECT * FROM `test` USE INDEX (number) WHERE (`text` = 'bbbb' AND `number` = 1);

or

 EXPLAIN SELECT * FROM `test` USE INDEX (text_number) WHERE (`text` = 'bbbb' AND `number` = 1);

As a rule, you can rely on the built-in query optimizer for most queries, but for critical or problematic ones it is better to take a closer look.


Two things that can improve your performance:

  • Run ANALYZE TABLE to update the optimizer's index statistics so that it knows the current key distribution. You can control how optimizer statistics are generated in several ways.
  • Create a second covering index with the columns in a different order (for example, reversed). In practice, this means determining the best one or two covering indexes for a set of three or four columns, which you work out through benchmarking and EXPLAIN analysis. See this answer for some tips on this.
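
For the `test` table from the question, the two suggestions above might look roughly like the sketch below. The index name `number_text` is made up for illustration, and reversing the column order is only one possible variant. Whether the optimizer then picks a different key depends on the real data distribution, so this is something to try and verify with EXPLAIN rather than a guaranteed fix.

```sql
-- Refresh the key distribution statistics used by the optimizer.
ANALYZE TABLE `test`;

-- Inspect the stored index statistics (see the Cardinality column).
SHOW INDEX FROM `test`;

-- Hypothetical second composite index with the column order reversed.
ALTER TABLE `test` ADD INDEX `number_text` (`number`, `text`);

-- Check which key the optimizer chooses now.
EXPLAIN SELECT * FROM `test` WHERE (`text` = 'bbbb' AND `number` = 2);
```

On a table with several million rows, both ANALYZE TABLE and ALTER TABLE can take a while on MyISAM and lock the table while they run, so it is safer to try them on a copy first.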

Source: https://habr.com/ru/post/1501864/

