I have an optimization problem with a rather large table (~ 1.7M rows).
When choosing rows, two columns are used, allowing you to call them colA and colB. They are both of type "double" (5 decimal places) and vary from:
colA: -90 ~ 90 colB: -180 ~ 180
Without an index, any form request:
SELECT * FROM table where colA BETWEEEN a and b AND colB BETWEEN c and d
it takes about the same time to start (~ 1 second), regardless of the range (a, b) and (c, d) (since MySQL must check every row).
If I add an index to colA and colB, two things will happen: queries in which the range (a, b) and (c, d) is small, for example:
SELECT * FROM table where colA BETWEEEN -4 and 4 AND colB BETWEEN 3 and 7
very fast (~ 1/10 of a second). However, the runtime increases with the range between the requested values. For instance:
SELECT * FROM table where colA BETWEEEN -80 and 80 AND colB BETWEEN -150 and 150
it takes about a minute.
I know how B-trees work for strings, but I'm not sure about the mechanism when the data is numeric and the query is executed using a range.
If anyone can suggest how to optimize this query, I would be grateful. One thought is to use an index for small ranges and tell MySQL not to use it for larger ranges, however I could not find a command that allows this.
thanks
EDIT: Explanations
I forgot to forget something. The results are ordered by rand () - I know how inefficient it is, but I have not seen another way to get a limited number of rows from a table in random order.
Adding rand () does not affect runtime when there is no index, but significantly increases the time taken when there is.
EDIT2: This is the use of composite indexes.
SMALL RANGE:
"explain select * from the table where colA between 35 and 38 and colB between -10 and 5 ORDER BY RAND () LIMIT 20"
9783 lines
NO INDEX (fast)
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | table | ALL | NULL | NULL | NULL | NULL | 1673784 | Using where | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
WITH INDEX (very fast)
+----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | table | range | test | test | 18 | NULL | 136222 | Using where | +----+-------------+-------+-------+---------------+------+---------+------+--------+-------------+
LARGE RANGE:
"explain select * from the table where colA between -80 and 80 and colB between -150 and 150 ORDER BY RAND () LIMIT 20;"
1631862 rows
NO INDEX (fast)
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | table | ALL | NULL | NULL | NULL | NULL | 1673784 | Using where | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
WITH INDEX (very slow:> 60 seconds)
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | table | ALL | test | NULL | NULL | NULL | 1673784 | Using where | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
EDIT3:
To summarize: (all queries are limited to returning 20 rows)
large range with rand () with index: 45 seconds
large range without rand (), with index: 0.003 seconds
large range with random, no index: 1 second
large range without rand, without index: 0.003 seconds
Anomaly: "large range with rand () with index, 45 seconds."