Optimizing queries over volatile data

I am trying to solve a latency problem with a mysql-5.0 db.

  • The query itself is extremely simple: SELECT SUM(items) FROM tbl WHERE col = 'val'
  • There is an index on col, and in the worst case the sum covers no more than 10,000 rows (the average count(items) per col value is about 10); a sketch of the assumed table shape follows this list.
  • The table has up to 2M rows.
  • The query runs frequently, and occasionally the execution time spikes to 10 s, although 99% of executions take well under 1 s.
  • The query is effectively uncacheable: almost every such query is followed within the next minute by an insert into the same table, and showing stale values is out of the question (this is payment information).
  • "Good enough" means hitting that target on ~100% of queries.
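
For concreteness, a minimal sketch of the table shape the bullets above seem to describe (the id column, the varchar width, and the MyISAM engine are assumptions, not stated in the question):

 -- hypothetical table shape; types and engine are guesses
 create table tbl (
   id int not null auto_increment primary key,
   col varchar(10) not null,
   items int not null,
   key ix_col (col)
 ) engine = myisam;

 -- the query whose latency occasionally spikes
 select sum(items) from tbl where col = 'val';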

The result I'm looking for is every query completing in under 1 s. Is there a way to improve the select time without changing the table? Failing that, are there table changes that would help? I have thought about keeping a separate table in which the running total for each col value is updated right after every insert, but maybe there are better ways to do this?

+4
2 answers

Another approach is to add a summary table:

 create table summary ( col varchar(10) primary key, items int not null ); 

and add triggers to tbl:

on insert:

 insert into summary values( new.col, new.items ) on duplicate key update items = items + new.items;

on delete:

 update summary set summary.items = summary.items - old.items where summary.col = old.col 

on update:

 update summary set summary.items = summary.items - old.items where summary.col = old.col;
 update summary set summary.items = summary.items + new.items where summary.col = new.col;
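
For completeness, here is what the insert trigger looks like wrapped in full create trigger syntax; the trigger name is invented, and the delete and update triggers follow the same pattern:

 create trigger tbl_after_insert after insert on tbl
 for each row
   insert into summary values ( new.col, new.items )
   on duplicate key update items = items + new.items;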

This will slow down your inserts, but it means the select only has to hit a single row in the summary table:

 select items from summary where col = 'val'; 

The biggest problem with this is loading the initial summary values. If you can take the application offline, you can easily initialize the summary table from tbl:

 insert into summary select col, sum(items) from tbl group by col; 

However, if you need to keep the service running, it is much trickier. If you have a replica, you can stop replication, build the summary table, set up the triggers, restart replication, switch the service over to the replica, and then repeat the process on the retired primary.
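
As a rough sketch of that sequence on the replica (assuming statement-based replication, so the triggers fire for replicated writes; the application switchover itself happens outside SQL):

 -- on the replica:
 stop slave;
 create table summary ( col varchar(10) primary key, items int not null );
 insert into summary select col, sum(items) from tbl group by col;
 -- create the insert / delete / update triggers shown above
 start slave;
 -- replication catches up and the triggers keep summary current;
 -- then point the application at the replica and repeat on the retired primary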

If you cannot do that, you can update the summary table one col value at a time to reduce the impact:

 lock tables tbl write, summary write;
 delete from summary where col = 'val';
 insert into summary select col, sum(items) from tbl where col = 'val' group by col;
 unlock tables;

Or, if you can tolerate a longer outage:

 lock tables tbl write, summary write;
 delete from summary;
 insert into summary select col, sum(items) from tbl group by col;
 unlock tables;
+2

A covering index should help:

 create index cix on tbl (col, items); 

This lets MySQL compute the sum from the index alone, without reading the data file, which should be faster.
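
As a quick check, EXPLAIN should show "Using index" in the Extra column once the covering index is picked up, meaning the query is answered from the index alone (exact output varies by version):

 explain select sum(items) from tbl where col = 'val';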

You should also keep an eye on how effective your key buffer is and whether to allocate more memory to it. You can check by querying the server status and looking at the 'Key%' values:

 SHOW STATUS LIKE 'Key%'; 

MySQL manual - show status

What matters is the ratio of Key_read_requests (the number of index lookups) to Key_reads (the number of lookups that had to read index blocks from disk). The more disk reads, the slower queries will run. You can improve the hit rate by increasing key_buffer_size in the configuration file.
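
A minimal sketch of how to eyeball that ratio and act on it; the ~1% threshold and the 256M figure are illustrative assumptions, not recommendations from this answer:

 show status like 'Key_read_requests';
 show status like 'Key_reads';
 -- miss ratio = Key_reads / Key_read_requests; if it stays well above ~1%,
 -- raise key_buffer_size in my.cnf, for example:
 --   [mysqld]
 --   key_buffer_size = 256M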

+1

Source: https://habr.com/ru/post/1302248/

