I am trying to populate an MS SQL Server 2005 database from Python on Windows. I am inserting millions of rows, and by 7 million I am using almost a gigabyte of memory. The test below consumes about 4 megabytes of RAM for every 100 thousand rows inserted:
import pyodbc

connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
connection.autocommit = True
cursor = connection.cursor()

for i in range(100000):  # originally "while 1", which never reaches close()
    cursor.execute("insert into x (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)",
                   1, 2, 3, 4, 5, 6)

connection.close()  # was mdbconn.close(), a name not defined above
Hacky solution: I ended up spawning a new process with the multiprocessing module for each batch of inserts, so the memory is reclaimed when the process exits. It is still puzzling why inserting rows this way consumes so much memory. Any ideas?