MySQL UPDATE operation to avoid massive TRX sizes

Question

MySQL UPDATE operation to avoid massive TRX sizes

I often write datascrubs that update millions of rows of data. Data is stored in a MySQL 24x7x365 OLTP database using InnoDB. Updates can clean every row of the table (in which the database ends with acquiring a lock at the table level) or it can simply clear 10% of the rows in the table (which can still be in millions).

To avoid creating large transaction sizes and minimizing competition, I usually try to split my single massive UPDATE statement into a series of small UPDATE transactions. Thus, I end up writing a loop that restricts my UPDATE WHERE clause as follows:

(warning: this is just pseudo code to get the point)

@batch_size=10000;
@max_primary_key_value = select max(pk) from table1

for (int i=0; i<=@max_primary_key_value; i=i+@batch_size)
{
start transaction;

update IGNORE table1
set col2 = "flag set"
where col2 = "flag not set"
and pk > i
and pk < i+@batchsize;

commit;
}

This approach simply sucks for many reasons.

UPDATE , . , UPDATE . 1/2 ... , . , - , - , .

, , , .

?

+3

database mysql locking innodb

Matthew Quinlan 29 . '09 21:46

2

Eric Petroelje · Answer 1 · 2009-12-29T21:51:47+0000

, , , , LIMIT .

-:

do {
  update table1 set col2 = 'flag set' where col2 = 'flat not set' LIMIT 10000
} while (ROW_COUNT() > 0)

Matthew Quinlan · Answer 2 · 2010-01-06T18:00:09+0000

, ... ( , " " ). , SQL update .

mass_update (IN updatestmt TEXT, IN batchsiz INT) - : , - : UPDATE, TWICE "0 "

SET @sql = CONCAT( updatestmt," LIMIT ", batchsiz );
-- had to use CONCAT because "PREPARE stmt FROM" cannot accept dynamic LIMIT parameter
-- reference: http://forums.mysql.com/read.php?98,75640,75640#msg-75640
PREPARE stmt FROM @sql;

select @sql; --display SQL to screen
SET @cumrowcount=0;
SET @batchnum=0;
SET @now := now(); -- @now is a STRING variable... not a datetime

    increment: repeat
        SET @batchnum=@batchnum+1;
        EXECUTE stmt;
        set @rowcount = ROW_COUNT();
        set @cumrowcount = @cumrowcount + @rowcount;
        select @batchnum as "Iteration",
               @cumrowcount as "Cumulative Rows",
               TIMESTAMPDIFF(SECOND,STR_TO_DATE(@now,"%Y-%m-%d %H:%i:%s"),now()) as "Cumulative Seconds",
               now() as "Timestamp";
        until @rowcount <= 0
    end repeat increment;

    DEALLOCATE PREPARE stmt;  -- REQUIRED
END

, , UPDATE, , " 0 ".

!

MySQL UPDATE operation to avoid massive TRX sizes

More articles: