I am trying to do the following using Python and the MySQLdb interface:
- Read the contents of a table with several million rows.
- Process and transform each row.
- Place the changed rows in another table.
It seems reasonable to iterate over the rows, process each one on the fly, and insert the transformed row into the new table as I go.
It works:
import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(
    host="somehost", user="someuser",
    passwd="somepassword", db="somedb")

cursor1 = conn.cursor(MySQLdb.cursors.Cursor)
query1 = "SELECT * FROM table1"
cursor1.execute(query1)

cursor2 = conn.cursor(MySQLdb.cursors.Cursor)
for row in cursor1:
    values = some_function(row)
    query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
    cursor2.execute(query2, values)

cursor2.close()
cursor1.close()
conn.commit()
conn.close()
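I suppose the INSERT side could at least be sped up by batching rows through executemany instead of making one round-trip per row. A sketch of the loop above rewritten that way (the batch size of 1000 is an arbitrary choice):

query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
batch = []
for row in cursor1:
    batch.append(some_function(row))
    if len(batch) >= 1000:
        # Send many rows in one INSERT round-trip.
        cursor2.executemany(query2, batch)
        batch = []
if batch:
    # Flush whatever is left after the loop.
    cursor2.executemany(query2, batch)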
But it is slow and memory-consuming, because the client-side cursor buffers the entire result set in client memory before the loop even starts. If I use a server-side cursor for the SELECT instead:
cursor1 = conn.cursor(MySQLdb.cursors.SSCursor)
Then I get the 2014 error:
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method SSCursor.__del__ of <MySQLdb.cursors.SSCursor object at 0x925d6ec>> ignored
So it seems I can't run another statement on the same connection while iterating over a server-side cursor's result set, which leaves me stuck with the very slow client-side iterator.
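From the error, I gather that a server-side cursor streams the result set, and MySQL refuses any other command on that connection until the set is fully fetched. The only workaround I can think of is a second connection dedicated to the inserts; a sketch, using the same credentials and some_function as above:

import MySQLdb
import MySQLdb.cursors

# One connection streams the SELECT; a second one runs the INSERTs,
# so the streaming result set is never interrupted.
conn_read = MySQLdb.connect(host="somehost", user="someuser",
                            passwd="somepassword", db="somedb")
conn_write = MySQLdb.connect(host="somehost", user="someuser",
                             passwd="somepassword", db="somedb")

cursor1 = conn_read.cursor(MySQLdb.cursors.SSCursor)
cursor1.execute("SELECT * FROM table1")

cursor2 = conn_write.cursor()
query2 = "INSERT INTO table2 VALUES (%s, %s, %s)"
for row in cursor1:
    cursor2.execute(query2, some_function(row))

cursor2.close()
cursor1.close()
conn_write.commit()
conn_read.close()
conn_write.close()

The executemany batching from earlier could presumably be combined with this as well. But is a second connection really necessary?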
Any suggestions?