Using PHP and MySQL ... How to free up memory?

The task:

We have two structurally similar tables on two servers. Table 1 on the first server has unique key columns A, B, C; we insert its rows into Table 2, which has unique key columns B, C, D.

Table 1 has approximately 5 million rows; because of the unique key constraints, only about 3 million of them will end up inserted into Table 2.

We need to read all the rows from Table 1 and insert them into Table 2. If a row does not yet exist in Table 2, it is inserted; if it matches an existing row, we increment its counter and update the "cron_modified_date" column in Table 2.
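In MySQL this insert-or-bump-counter pattern maps directly onto INSERT ... ON DUPLICATE KEY UPDATE. A minimal sketch, with illustrative column values (the B, C, D names stand in for the real unique key columns):

```sql
-- Sketch only: insert a row, or, when the unique key (B, C, D) already
-- exists, bump the counter and refresh cron_modified_date.
INSERT INTO table2 (B, C, D, count, cron_modified_date)
VALUES ('b1', 'c1', 'd1', 1, '2018-02-10')
ON DUPLICATE KEY UPDATE
    count = count + 1,
    cron_modified_date = VALUES(cron_modified_date);
```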

PHP is version 5.5, MySQL is version 5.7, and the database server has 6 GB of RAM.

When the script below runs, processing becomes very slow after about 2 million records: the RAM is never freed, and once all of it is consumed the script stops making progress entirely.

As you can see, I reset the variables and close the database connection, but the RAM on the database server is not freed. After some reading, I thought PHP's garbage collector might need to be invoked manually to free resources, but calling it does not free the RAM either.

What am I doing wrong here, and how can I process millions of records with PHP and MySQL?

Is there any other way to free up RAM while the script is running, so that it can run to completion?

    /* Fetch records count for batch insert */
    $queryCount = "SELECT count(*) as totalRecords FROM TABLE1 WHERE created_date >= '2018-02-10'";
    $rowsCount = $GLOBALS['db']->execRaw($queryCount)->fetchAll();
    $recordsPerIteration = 50000;
    $totalCount = $rowsCount[0]['totalRecords'];
    $start = 0;
    gc_disable();
    if ($totalCount > 0) {
        while ($totalCount > 0) {
            $query = "SELECT * FROM TABLE1 WHERE created_date >= '2018-02-10'
                      ORDER BY suggestion_id DESC LIMIT ".$start.",".$recordsPerIteration;
            print "sql is $query";
            $getAllRows = $GLOBALS['db']->execRaw($query)->fetchAll();
            $GLOBALS['db']->queryString = null;
            $GLOBALS['db']->close();
            foreach ($getAllRows as $getRow) {
                $insertRow = "INSERT INTO TABLE2
                    (Name, Company, ProductName, Status, cron_modified_date)
                    VALUES (".$GLOBALS['db_ab']->quote($getRow['Name']).",
                        ".$GLOBALS['db_ab']->quote($getRow['Company']).",
                        ".$GLOBALS['db_ab']->quote($getRow['ProductName']).",
                        ".$getRow['Status'].",
                        ".$GLOBALS['db_ab']->quote($getRow['created_date']).")
                    ON DUPLICATE KEY UPDATE count = (count + 1),
                        cron_modified_date = '".$getRow['created_date']."'";
                $GLOBALS['db_ab']->execRaw($insertRow);
                $GLOBALS['db_ab']->queryString = null;
                $getRow = null;
                $insertRow = null;
                $GLOBALS['db_ab']->close();
            }
            gc_enable();
            $totalCount = $totalCount - $recordsPerIteration;
            $start += $recordsPerIteration;
            $getAllRows = null;
            gc_collect_cycles();
        }
    }

Solution


After the suggestions provided by @ABelikov and a bit of trial and error, the code below finally works fine and frees the RAM after each 50K-row insert.

Below are the main findings.

  • Release the DB connection variables after each operation that involves a large amount of data, and reconnect to the database, so that the database buffers are flushed.
  • Group the INSERT statements and execute them as one multi-row insert, instead of issuing one INSERT per loop iteration.

    Thank you guys for the valuable advice and help.

    /* Fetch records count for batch insert */
    $queryCount = "SELECT count(*) as totalRecords FROM TABLE1 WHERE created_date >= '2018-02-10'";
    $rowsCount = $GLOBALS['db']->execRaw($queryCount)->fetchAll();
    $recordsPerIteration = 50000;
    $totalCount = $rowsCount[0]['totalRecords'];
    $start = 0;
    if ($totalCount > 0) {
        while ($totalCount > 0) {
            $query = "SELECT * FROM TABLE1 WHERE created_date >= '2018-02-10'
                      ORDER BY suggestion_id DESC LIMIT ".$start.",".$recordsPerIteration;
            print "sql is $query";
            $getAllRows = $GLOBALS['db']->execRaw($query)->fetchAll();
            $GLOBALS['db']->queryString = null;
            $GLOBALS['db']->close();
            $insertRow = "INSERT INTO TABLE2
                (Name, Company, ProductName, Status, cron_modified_date) VALUES ";
            foreach ($getAllRows as $getRow) {
                $insertRow .= "(".$GLOBALS['db_ab']->quote($getRow['Name']).",
                    ".$GLOBALS['db_ab']->quote($getRow['Company']).",
                    ".$GLOBALS['db_ab']->quote($getRow['ProductName']).",
                    ".$getRow['Status'].",
                    ".$GLOBALS['db_ab']->quote($getRow['created_date'])."),";
            }
            $insertRow = rtrim($insertRow, ','); // Remove last ','
            $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1),
                cron_modified_date = '".$getRow['created_date']."'";
            $GLOBALS['db_ab']->execRaw($insertRow);
            // Flushing all data to free up RAM
            $GLOBALS['db_ab'] = null;
            $GLOBALS['db'] = null;
            $insertRow = null;
            $totalCount = $totalCount - $recordsPerIteration;
            $start += $recordsPerIteration;
            $getAllRows = null;
            print "\n Records needs to process ".$totalCount."\n";
        }
    }
2 answers

1. Multiple-row INSERT solution

You can speed up the script by using a multiple-row INSERT, see https://dev.mysql.com/doc/refman/5.5/en/insert.html

INSERT INTO tbl_name (a, b, c) VALUES (1,2,3), (4,5,6), (7,8,9);

You need to keep only the VALUES part inside your foreach and move everything else out of the loop:

    $insertRow = "INSERT INTO TABLE2
        (Name, Company, ProductName, Status, cron_modified_date) VALUES ";
    foreach ($getAllRows as $getRow) {
        $insertRow .= "(".$GLOBALS['db_ab']->quote($getRow['Name']).",
            ".$GLOBALS['db_ab']->quote($getRow['Company']).",
            ".$GLOBALS['db_ab']->quote($getRow['ProductName']).",
            ".$getRow['Status'].",
            ".$GLOBALS['db_ab']->quote($getRow['created_date'])."),";
    }
    $insertRow = rtrim($insertRow, ','); // Remove last ','
    $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1),
        cron_modified_date = '".$getRow['created_date']."'";
    $GLOBALS['db_ab']->execRaw($insertRow);
    $GLOBALS['db_ab']->queryString = null;
    $getRow = null;
    $insertRow = null;
    $GLOBALS['db_ab']->close();

This will only help if your foreach body typically executes more than once.
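As a sketch of the same batching idea in a testable form: the statement-building can be pulled into a pure function that takes the rows and any SQL-escaping callable (for example PDO::quote). The function name, the chunk size, and the use of array_chunk are illustrative assumptions, not part of the original answer; the table and column names come from the question.

```php
<?php
// Sketch: build one multi-row INSERT ... ON DUPLICATE KEY UPDATE statement
// for a chunk of rows. $quote is any SQL-escaping function (e.g. PDO::quote);
// buildBatchInsert is a hypothetical helper name.
function buildBatchInsert(array $rows, callable $quote)
{
    $values = array();
    foreach ($rows as $row) {
        $values[] = '(' . $quote($row['Name']) . ','
                  . $quote($row['Company']) . ','
                  . $quote($row['ProductName']) . ','
                  . (int) $row['Status'] . ','
                  . $quote($row['created_date']) . ')';
    }
    return 'INSERT INTO TABLE2 (Name, Company, ProductName, Status, cron_modified_date) VALUES '
         . implode(',', $values)
         . ' ON DUPLICATE KEY UPDATE count = count + 1,'
         . ' cron_modified_date = VALUES(cron_modified_date)';
}

// Usage sketch: send one statement per chunk instead of one per row.
// foreach (array_chunk($getAllRows, 1000) as $chunk) {
//     $GLOBALS['db_ab']->execRaw(buildBatchInsert($chunk, array($GLOBALS['db_ab'], 'quote')));
// }
```

Using VALUES(cron_modified_date) in the update clause also avoids the subtle bug in the original code, where the date of only the last fetched row was applied to every duplicate.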

2. MySQL server-side solution

Try using a TRANSACTION: https://dev.mysql.com/doc/refman/5.7/en/commit.html http://php.net/manual/en/pdo.begintransaction.php

Just start one at the beginning of the script and commit it at the end. Depending on your server, this can help a lot. But be careful! It depends on your MySQL server configuration. Test it first.
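A minimal sketch of the transaction wrapper (the function name and the $batches list of pre-built INSERT statements are assumptions for illustration; $pdo is any PDO-compatible handle):

```php
<?php
// Sketch: run all batch statements inside one transaction so the server
// commits once instead of once per statement.
function runInTransaction($pdo, array $batches)
{
    $pdo->beginTransaction();
    try {
        foreach ($batches as $sql) {
            $pdo->exec($sql);      // no per-statement commit
        }
        $pdo->commit();            // single commit for the whole batch
    } catch (Exception $e) {
        $pdo->rollBack();          // undo everything on any failure
        throw $e;
    }
}
```

With autocommit on (the MySQL default), every statement otherwise pays for its own commit; grouping them into one transaction trades that overhead for a larger redo log, which is why the answer warns that the benefit depends on server configuration.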

  • MySQL has no practical memory limit for this; PHP does.
  • Most of the time is spent transferring data between MySQL and PHP.

Unless I misunderstand the task, you are working too hard at it. Just let MySQL do all the work. No arrays, no chunking, just one SQL statement:

    INSERT INTO table2 (Name, Company, ProductName, Status, cron_modified_date, count)
        SELECT Name, Company, ProductName, Status, created_date, 1
            FROM table1
        ON DUPLICATE KEY UPDATE
            count = count + 1,
            cron_modified_date = created_date;

Note that you may need the VALUES() pseudo-function in some of the UPDATE assignments.
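For example (a sketch following the single-statement insert above), VALUES(col) refers to the value the INSERT attempted to write for col, here table1.created_date for that row:

```sql
-- Sketch: VALUES(cron_modified_date) is the value the INSERT tried to
-- write, i.e. created_date from the matching table1 row.
INSERT INTO table2 (Name, Company, ProductName, Status, cron_modified_date, count)
    SELECT Name, Company, ProductName, Status, created_date, 1
        FROM table1
    ON DUPLICATE KEY UPDATE
        count = count + 1,
        cron_modified_date = VALUES(cron_modified_date);
```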

This avoids hauling all (or even 50,000) rows into PHP, which is probably the source of the memory problem. A simple variable in PHP takes about 40 bytes; MySQL is designed to handle any number of rows without blowing up the RAM.


Source: https://habr.com/ru/post/1275370/

