Performance of DISTINCT and sorting: MySQL vs. PHP

In situations like this, which approach, or combination of approaches, is faster?

$year = db_get_fields("select distinct year from car_cache order by year desc"); 

or

 $year = db_get_fields("select year from car_cache");
 $year = array_unique($year);
 sort($year);

I have heard that DISTINCT in MySQL can be a big performance hit for large queries, and this table can have a million rows or more. I also wondered which storage engine, InnoDB or MyISAM, would work best. I know that many optimizations are query-dependent. Year is an unsigned number, but the other fields are varchars of various lengths, and that can make a difference. For instance:

 $line = db_get_fields("select distinct line from car_cache where year='$postyear' and make='$postmake' order by line desc"); 

I have read that InnoDB's composite (multi-column) indexes can make such queries very fast, but the DISTINCT and ORDER BY clauses are red flags.

1 answer

Ask MySQL to do as much work as possible. If it is inefficient at what it does, it is most likely not set up correctly: either the table lacks the right indexes for the query you are running, or server settings (such as the sort buffers) need tuning.

If you have an index on the year column, then using DISTINCT should be efficient. If you do not, a full table scan is required to find the distinct rows. If you instead sort and de-duplicate in PHP rather than in MySQL, you transfer (potentially) far more data from MySQL to PHP, and PHP consumes far more memory holding all that data before eliminating the duplicates.
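For example, here is a minimal sketch against the car_cache table from the question (the index name is illustrative, and I am assuming year has no suitable index yet):

```sql
-- Hypothetical index so MySQL can resolve DISTINCT year / ORDER BY year
-- from the index alone instead of scanning the whole table.
ALTER TABLE car_cache ADD INDEX idx_year (year);

-- Check the plan; "Using index" in the Extra column means no table scan.
EXPLAIN SELECT DISTINCT year FROM car_cache ORDER BY year DESC;
```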

Here are some examples from a dev database that I have. Note also that this database sits on a different server on the network from where the queries are issued.

 SELECT COUNT(SerialNumber) FROM `readings`;
 > 97698592

 SELECT SQL_NO_CACHE DISTINCT `SerialNumber` FROM `readings`
 ORDER BY `SerialNumber` DESC LIMIT 10000;
 > Fetched 10000 records. Duration: 0.801 sec, fetched in: 0.082 sec

 EXPLAIN *above_query*
 +----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------------------------------------------+
 | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra                                                     |
 +----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------------------------------------------+
 |  1 | SIMPLE      | readings | range | NULL          | PRIMARY | 18      | NULL |   19 | Using index for group-by; Using temporary; Using filesort |
 +----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------------------------------------------+

If I run the same query but replace the SerialNumber column with one that is not indexed, it runs effectively forever, because MySQL must examine all 97 million rows.

Part of the efficiency comes down to how much data you expect to get back. If I modify the queries above to work on the time column (a reading timestamp), it takes 1 min 40 sec to fetch a distinct list of 273,505 times, and most of that overhead is spent transferring the records over the network. So keep in mind how much data you are returning: you want it as small as possible for what you are actually trying to retrieve.
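One way to keep the transferred data small is to constrain the query to just the range you need. This is only an illustration against my readings table; the date range here is made up:

```sql
-- Illustrative: a WHERE range keeps the result set (and therefore the
-- network transfer) small, rather than pulling every distinct value.
SELECT DISTINCT `time`
FROM `readings`
WHERE `time` >= '2012-01-01' AND `time` < '2012-02-01'
ORDER BY `time` DESC;
```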

Regarding your final request:

 select distinct line from car_cache where year='$postyear' and make='$postmake' order by line desc 

That query should not be a problem; just make sure you have a composite index on year and make, and possibly an index on line.
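A sketch of such an index (the index name is illustrative): put the equality columns from the WHERE clause first and the DISTINCT/ORDER BY column last, so the query can potentially be answered from the index alone.

```sql
-- Hypothetical composite index for:
--   WHERE year = ? AND make = ? ... DISTINCT line ... ORDER BY line DESC
ALTER TABLE car_cache ADD INDEX idx_year_make_line (year, make, line);
```

With an index like this, EXPLAIN should show "Using index" in the Extra column, meaning no table rows need to be read at all.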

As a final note: I use InnoDB for the readings table, and my server is 5.5.23-55-log Percona Server (GPL), Release 25.3 , which is Percona Inc.'s release of MySQL.

Hope this helps.


Source: https://habr.com/ru/post/921785/

