I am trying to query a partitioned table (by month) approaching rows of 20M. I need to group DATE (transaction_utc) as well as country_id. The rows that are returned if I disconnect the group by groups and aggregates are a little more than 40k, which is not so much, however adding a group makes the query much slower, unless GROUP BY is in the transaction_utc column, in which case it gets FAST .
I tried to optimize this first query below by changing the query and / or indexes and reaching a point lower (about 2 times faster than the original), but still stuck in a 5-second query to sum 45 thousand rows, which seems too big.
For reference, this field is a completely new 24 logical core, 64-gigabyte RAM, Mariadb-5.5.x server with more INNODB buffer pool than the index space on the server, so there should be no pressure on the RAM or CPU.
So, I'm looking for ideas on what makes this slow down, and suggestions for speeding it up. Any feedback would be greatly appreciated! :)
Ok, on the details ...
The following query (the one I really need) takes about 5 seconds (+/-) and returns less than 100 rows.
SELECT lss.`country_id` AS CountryId , Date(lss.`transaction_utc`) AS TransactionDate , c.`name` AS CountryName, lss.`country_id` AS CountryId , COALESCE(SUM(lss.`sale_usd`),0) AS SaleUSD , COALESCE(SUM(lss.`commission_usd`),0) AS CommissionUSD FROM `sales` lss JOIN `countries` c ON lss.`country_id` = c.`country_id` WHERE ( lss.`transaction_utc` BETWEEN '2012-09-26' AND '2012-10-26' AND lss.`username` = 'someuser' ) GROUP BY lss.`country_id`, DATE(lss.`transaction_utc`)
EXPLAIN SELECT for the same query is as follows. Note that it does not use the transaction_utc key. Shouldn't I use my coverage index?
id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE lss ref idx_unique,transaction_utc,country_id idx_unique 50 const 1208802 Using where; Using temporary; Using filesort 1 SIMPLE c eq_ref PRIMARY PRIMARY 4 georiot.lss.country_id 1
Now for a couple of other parameters that I tried to try to determine what was going on ...
The following query (with a modified group) takes about 5 seconds (+/-) and returns only 3 rows:
SELECT lss.`country_id` AS CountryId , DATE(lss.`transaction_utc`) AS TransactionDate , c.`name` AS CountryName, lss.`country_id` AS CountryId , COALESCE(SUM(lss.`sale_usd`),0) AS SaleUSD , COALESCE(SUM(lss.`commission_usd`),0) AS CommissionUSD FROM `sales` lss JOIN `countries` c ON lss.`country_id` = c.`country_id` WHERE ( lss.`transaction_utc` BETWEEN '2012-09-26' AND '2012-10-26' AND lss.`username` = 'someuser' ) GROUP BY lss.`country_id`
The following query (remote group) takes 4-5 seconds (+/-) and returns 1 row:
SELECT lss.`country_id` AS CountryId , DATE(lss.`transaction_utc`) AS TransactionDate , c.`name` AS CountryName, lss.`country_id` AS CountryId , COALESCE(SUM(lss.`sale_usd`),0) AS SaleUSD , COALESCE(SUM(lss.`commission_usd`),0) AS CommissionUSD FROM `sales` lss JOIN `countries` c ON lss.`country_id` = c.`country_id` WHERE ( lss.`transaction_utc` BETWEEN '2012-09-26' AND '2012-10-26' AND lss.`username` = 'someuser' )
The following query takes .00X seconds (+/-) and returns ~ 45k rows. This for me shows that with max we try to group 45K rows into less than 100 groups (as in my initial request):
SELECT lss.`country_id` AS CountryId , DATE(lss.`transaction_utc`) AS TransactionDate , c.`name` AS CountryName, lss.`country_id` AS CountryId , COALESCE(SUM(lss.`sale_usd`),0) AS SaleUSD , COALESCE(SUM(lss.`commission_usd`),0) AS CommissionUSD FROM `sales` lss JOIN `countries` c ON lss.`country_id` = c.`country_id` WHERE ( lss.`transaction_utc` BETWEEN '2012-09-26' AND '2012-10-26' AND lss.`username` = 'someuser' ) GROUP BY lss.`transaction_utc`
DIAGRAM TABLE:
CREATE TABLE IF NOT EXISTS `sales` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `user_linkshare_account_id` int(11) unsigned NOT NULL, `username` varchar(16) NOT NULL, `country_id` int(4) unsigned NOT NULL, `order` varchar(16) NOT NULL, `raw_tracking_code` varchar(255) DEFAULT NULL, `transaction_utc` datetime NOT NULL, `processed_utc` datetime NOT NULL , `sku` varchar(16) NOT NULL, `sale_original` decimal(10,4) NOT NULL, `sale_usd` decimal(10,4) NOT NULL, `quantity` int(11) NOT NULL, `commission_original` decimal(10,4) NOT NULL, `commission_usd` decimal(10,4) NOT NULL, `original_currency` char(3) NOT NULL, PRIMARY KEY (`id`,`transaction_utc`), UNIQUE KEY `idx_unique` (`username`,`order`,`processed_utc`,`sku`,`transaction_utc`), KEY `raw_tracking_code` (`raw_tracking_code`), KEY `idx_usd_amounts` (`sale_usd`,`commission_usd`), KEY `idx_countries` (`country_id`), KEY `transaction_utc` (`transaction_utc`,`username`,`country_id`,`sale_usd`,`commission_usd`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=19696320 ;