I have a table called product_category, which has a composite primary key consisting of 2 unsigned integers and no additional secondary indices:
create table product_category ( prod_id int unsigned not null, cat_id mediumint unsigned not null, primary key (cat_id, prod_id)
The table currently has 125 million rows
select count(*) as c from product_category; c = 125,524,947
with the following index / power:
show indexes from product_category; Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality ===== ========== ======== ============ =========== ========= =========== product_category 0 PRIMARY 1 cat_id A 1162276 product_category 0 PRIMARY 2 prod_id A 125525826
If I run a query similar to yours (1st run is not cached with cold / empty buffers either):
select prod_id, count(*) as c from product_category where cat_id between 1600 and 2000 -- using between to include a wider range of data group by prod_id having c = ( select count(*) as c from product_category where cat_id between 1600 and 2000 group by prod_id order by c desc limit 1 ) order by prod_id;
I get the following results:
(cold run) +---------+---+ | prod_id | c | +---------+---+ | 34957 | 4 | | 717812 | 4 | | 816612 | 4 | | 931111 | 4 | +---------+---+ 4 rows in set (0.18 sec) (2nd run) +---------+---+ | prod_id | c | +---------+---+ | 34957 | 4 | | 717812 | 4 | | 816612 | 4 | | 931111 | 4 | +---------+---+ 4 rows in set (0.14 sec)
The outline of the explanation is as follows:
+----+-------------+------------------+-------+---------------+---------+---------+------+--------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------------+-------+---------------+---------+---------+------+--------+-----------------------------------------------------------+ | 1 | PRIMARY | product_category | range | PRIMARY | PRIMARY | 3 | NULL | 194622 | Using where; Using index; Using temporary; Using filesort | | 2 | SUBQUERY | product_category | range | PRIMARY | PRIMARY | 3 | NULL | 194622 | Using where; Using index; Using temporary; Using filesort | +----+-------------+------------------+-------+---------------+---------+---------+------+--------+-----------------------------------------------------------+
If I run the regilero query:
SELECT c,prod_id FROM ( SELECT c,prod_id,CASE WHEN @mmax<=c THEN @mmax:=c ELSE 0 END 'mymax' FROM ( SELECT COUNT(*) as c, prod_id FROM product_category WHERE cat_id between 1600 and 2000 GROUP BY prod_id ORDER BY c DESC ) res1 ,(SELECT @mmax:=0) initmax ORDER BY c DESC ) res2 WHERE mymax>0;
I get the following results:
(cold) +---+---------+ | c | prod_id | +---+---------+ | 4 | 931111 | | 4 | 34957 | | 4 | 717812 | | 4 | 816612 | +---+---------+ 4 rows in set (0.17 sec) (2nd run) +---+---------+ | c | prod_id | +---+---------+ | 4 | 34957 | | 4 | 717812 | | 4 | 816612 | | 4 | 931111 | +---+---------+ 4 rows in set (0.13 sec)
The outline of the explanation is as follows:
+----+-------------+------------------+--------+---------------+---------+---------+------+--------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------------+--------+---------------+---------+---------+------+--------+-----------------------------------------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 92760 | Using where | | 2 | DERIVED | <derived4> | system | NULL | NULL | NULL | NULL | 1 | Using filesort | | 2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 92760 | | | 4 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used | | 3 | DERIVED | product_category | range | PRIMARY | PRIMARY | 3 | NULL | 194622 | Using where; Using index; Using temporary; Using filesort | +----+-------------+------------------+--------+---------------+---------+---------+------+--------+-----------------------------------------------------------+
Finally, an attempt to use cyberwiki:
drop procedure if exists cyberkiwi_variant; delimiter
I get the following results:
(cold and 2nd run) +---------+---+ | prod_id | c | +---------+---+ | 816612 | 4 | | 931111 | 4 | | 34957 | 4 | | 717812 | 4 | +---------+---+ 4 rows in set (0.14 sec)
The outline of the explanation is as follows:
+----+-------------+------------------+-------+---------------+---------+---------+------+--------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------------+-------+---------------+---------+---------+------+--------+-----------------------------------------------------------+ | 1 | SIMPLE | product_category | range | PRIMARY | PRIMARY | 3 | NULL | 194622 | Using where; Using index; Using temporary; Using filesort | +----+-------------+------------------+-------+---------------+---------+---------+------+--------+-----------------------------------------------------------+
So, it seems that all tested methods have approx. the same time intervals from 0.14 to 0.18 seconds that seem pretty effective to me, given the size of the table and the number of rows requested.
Hope this helps - http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html