I am working on a project that brings together several data sources keyed on registered users. One query in particular is giving me a lot of trouble:
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    from sqlalchemy import create_engine
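For context, the engine setup and the failing call look roughly like this (the connection string below is just a placeholder, and query_folder_users holds the SQL text for the query in question, not shown here):

    # placeholder DSN -- the real one points at our production MySQL server
    prod_engine = create_engine('mysql://user:password@host:3306/dbname')

    # query_folder_users is a plain SQL string (omitted here)
    users = pd.read_sql_query(query_folder_users, prod_engine)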
No matter what I try, I get this error (almost always after about three seconds, sometimes instantly):
    InterfaceError: (InterfaceError) 2013: Lost connection to MySQL server during query
I tried several things, for example adding the pool_timeout and pool_recycle options to the create_engine call, as documented here: http://docs.sqlalchemy.org/en/latest/core/engines.html
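Concretely, the variant I tried looked something like this (the recycle and timeout values are just the ones I picked, and the DSN is again a placeholder):

    # recycle pooled connections well before MySQL's wait_timeout,
    # and don't wait long for a free connection from the pool
    prod_engine = create_engine(
        'mysql://user:password@host:3306/dbname',
        pool_recycle=3600,
        pool_timeout=30,
    )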
I also tried

    users = pd.read_sql_query(query_folder_users, prod_engine, chunksize=10000)

but got the same error.
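For reference, with chunksize set, read_sql_query returns an iterator of DataFrames rather than one big frame, so consuming it looks roughly like this (a sketch of what I had in mind):

    # iterate over 10,000-row chunks and stitch them back together
    chunks = pd.read_sql_query(query_folder_users, prod_engine, chunksize=10000)
    users = pd.concat(chunks, ignore_index=True)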
Interestingly, this query works fine when I run it in Sequel Pro; it starts returning rows immediately and takes only ~10 seconds to complete. The output is about 550,000 rows.
There are several other threads and posts that I found, but none of them seems to cover what I need:

- https://groups.google.com/forum/#!topic/sqlalchemy/TWL7aWab9ww
- Disable SQLAlchemy
- http://blog.fizyk.net.pl/blog/reminder-set-pool_recycle-for-sqlalchemys-connection-to-mysql.html
Reading the docs here (http://dev.mysql.com/doc/refman/5.5/en/error-lost-connection.html), I noticed this passage:
Sometimes the "during query" form happens when millions of rows are being sent as part of one or more queries. If you know that this is happening, you should try increasing net_read_timeout from its default of 30 seconds to 60 seconds or longer, sufficient for the data transfer to complete.
It looks like I might need to change this setting, but I can't find anything in the SQLAlchemy docs that mentions it.
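The closest thing I could come up with is setting the variable per connection with an event listener, something like the sketch below (untested; the 600-second value and the DSN are placeholders, and I'm not sure this is even the right place to hook in):

    from sqlalchemy import create_engine, event

    # placeholder DSN
    prod_engine = create_engine('mysql://user:password@host:3306/dbname')

    # runs on every new DBAPI connection the pool opens, so the session
    # variable is set before any query is issued on that connection
    @event.listens_for(prod_engine, 'connect')
    def raise_net_read_timeout(dbapi_connection, connection_record):
        cursor = dbapi_connection.cursor()
        cursor.execute("SET SESSION net_read_timeout = 600")  # placeholder value
        cursor.close()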
Has anyone encountered this problem before? If so, how did you fix it?