Accelerate MySQL (MyISAM) COUNT with WHERE clauses

We are introducing a system that analyzes books. The system is written in PHP, and for each book, a cycle is run through the words and each of them is analyzed, setting certain flags (which translate database fields) from various regular expressions and other tests.

The result is a matches table, similar to the example below:

  + ------------------------ + -------------- + ------ + - --- + --------- + ---------------- +
 |  Field |  Type |  Null |  Key |  Default |  Extra |
 + ------------------------ + -------------- + ------ + - --- + --------- + ---------------- +
 |  id |  bigint (20) |  NO |  PRI |  NULL |  auto_increment |
 |  regex |  varchar (250) |  YES |  |  NULL |  |
 |  description |  varchar (250) |  NO |  |  NULL |  |
 |  phonic_description |  varchar (255) |  NO |  |  NULL |  |
 |  is_high_frequency |  tinyint (1) |  NO |  |  NULL |  |
 |  is_readable |  tinyint (1) |  NO |  |  NULL |  |
 |  book_id |  bigint (20) |  YES |  |  NULL |  |
 |  matched_regex |  varchar (255) |  YES |  |  NULL |  |
 |  [...] |  |  |  |  |  |
 + ------------------------ + -------------- + ------ + - --- + --------- + ---------------- +

Most missed tinyint fields are either 0 or 1. There are 25 fields in the match table.

There are ~ 2,000,000 rows in the match table; the result of the analysis is ~ 500 books.

Currently, there is a site reporting area that queries the matches table as follows:

 SELECT COUNT(*) FROM matches WHERE is_readable = 1 AND other_flag = 0 AND another_flag = 1 

However, it currently takes more than a minute to get the main report of the index, since each request takes about 0.7 seconds. I cache this at the request level, but it still takes too much time to load the page.

How am I not very good at how to manage data sets, for example, can someone advise me on a better way to store or query this data? Are there any optimizations that I can use with MySQL to improve the performance of these COUNT s, or am I better off using a different database or data structure?

We are currently using MySQL with MyISAM and VPS tables, so there can be no question of switching to a new database system.

+4
source share
2 answers

Add multiple pointers to this table when you select more than one field. The below index should help a lot. These index types are very good for boolean / int columns. For indexes with varchar values ​​read here: http://dev.mysql.com/doc/refman/5.0/en/create-index.html

 ALTER TABLE `matches` ADD INDEX ( `is_readable`, `other_flag`, `another_flag` ) 

Another thing is to check your queries using EXPLAIN {YOUR WHOLE SQL STATEMENT} to check which index the database is using. Therefore, in this example, you should run the query:

 EXPLAIN ALTER TABLE `matches` ADD INDEX ( `is_readable`, `other_flag`, `another_flag` ) 

Additional information on EXPLAIN: http://dev.mysql.com/doc/refman/5.0/en/explain.html

+4
source

You need to use indexes, create them in the columns that you use WHERE most often.

 ALTER TABLE `matches` ADD INDEX ( `is_readable` ) 

etc..

You can also create indexes based on multiple columns if you make the same type of query again and again. phpMyAdmin has an index parameter on the table structure page below.

+5
source

Source: https://habr.com/ru/post/1391862/


All Articles