Redshift + SQLAlchemy re-query

I am doing something along the lines of:

conn_string = "postgresql+pg8000://%s:%s@%s:%d/%s" % (db_user, db_pass, host, port, schema)
engine = sqlalchemy.create_engine(conn_string,
                                  execution_options={'autocommit': True},
                                  encoding='utf-8',
                                  isolation_level="AUTOCOMMIT")
conn = engine.connect()
rows = conn.execute(sql_query)

to run queries against a Redshift cluster. Recently I have been performing maintenance tasks, such as running VACUUM REINDEX on large tables that get truncated and reloaded every day.

The problem is that this command takes about 7 minutes for one specific table (the table is huge: 60 million rows across 15 columns), and when I run it with the method above, it simply never finishes and never errors out. I can see in the AWS cluster console that parts of the vacuum run for about 5 minutes and then just stop. No Python errors, no cluster errors, nothing.

My theory is that the connection is lost during the command. So how do I prove it? Has anyone else run into this? What can I change in the connection string to keep the connection alive longer?
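One way to test the lost-connection theory: open a second, fresh connection and poll Redshift's `svv_vacuum_progress` system view. If the vacuum is still advancing server-side while the original Python call is stuck, the client connection (not the cluster) is the problem. `svv_vacuum_progress` and its columns are real Redshift features; the helper and the stubbed sample rows below are mine, since this needs a live cluster to run for real:

```python
# Poll this from a SECOND connection (e.g. with conn.execute from the
# question) while the original vacuum call appears to hang.
PROGRESS_SQL = (
    "SELECT table_name, status, time_remaining_estimate "
    "FROM svv_vacuum_progress;"
)

def describe_progress(rows):
    """Turn (table_name, status, time_remaining_estimate) tuples
    into readable status lines."""
    return ["%s: %s (eta %s)" % (t, s, eta or "n/a") for t, s, eta in rows]

# Stubbed result set, standing in for conn.execute(PROGRESS_SQL):
sample = [("big_table", "Sorting rows", "2h 18m")]
for line in describe_progress(sample):
    print(line)
```

If the view keeps reporting progress after your Python call has gone silent, you have proved the theory.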

EDIT:

I changed my connection following the comments here:

conn = sqlalchemy.engine.create_engine(conn_string,
                                       execution_options={'autocommit': True},
                                       encoding='utf-8',
                                       connect_args={"keepalives": 1,
                                                     "keepalives_idle": 60,
                                                     "keepalives_interval": 60},
                                       isolation_level="AUTOCOMMIT")

And it worked for a while. However, the same behavior then started again with even larger tables, where the VACUUM REINDEX actually takes about 45 minutes (at least that is my estimate; the command never finishes in Python).

How can I make this work regardless of how long the query takes?
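One caveat worth checking on the edited connection: `keepalives`, `keepalives_idle`, and `keepalives_interval` are libpq connection parameters, understood by drivers such as psycopg2; as far as I can tell, pg8000 does not accept them, so with a `postgresql+pg8000://` URL those connect_args may not be doing anything. A driver-independent fallback is to enable keepalives on the TCP socket yourself. A sketch using only the standard library (the `TCP_KEEP*` option names are Linux-specific, hence the guards; this is not a pg8000 API):

```python
import socket

def enable_keepalive(sock, idle=60, interval=60, count=5):
    """Turn on TCP keepalive probes so an idle connection is not
    silently dropped by an intermediate firewall/NAT while a long
    server-side command (like a vacuum) runs."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-only constants; guard for portability.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock

sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
```

In practice you would apply this to the socket underlying your database connection, or switch to a driver (psycopg2) that accepts the libpq keepalive parameters directly.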

1 answer

It sounds like the connection is being dropped by an idle timeout (which would line up with the roughly 5-minute mark you describe), not by Redshift aborting the vacuum: the vacuum keeps running on the cluster after the client connection goes away.

Out of curiosity, what node type is the cluster (dc1/ds2)? Either way, you do not need to hold the original connection open for the whole run. Start the vacuum, then check on its progress from a fresh session instead of waiting on the blocked python call.
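If the vacuum really does keep running on the cluster after the client connection drops, one robust pattern is to stop waiting on the original call entirely: issue the VACUUM, let the connection die if it must, and poll for completion from fresh connections. A minimal sketch; `check` and `connect` are hypothetical placeholders for your own SQLAlchemy code (for example, a query against svv_vacuum_progress and a call to engine.connect()):

```python
import time

def wait_for_vacuum(check, connect, attempts=120, delay=30.0):
    """Poll check(conn), which should return True once the vacuum is
    done, reconnecting with connect() whenever the connection dies.
    Gives up after `attempts` polls spaced `delay` seconds apart."""
    conn = connect()
    for _ in range(attempts):
        try:
            if check(conn):
                return True
        except Exception:
            # Connection dropped mid-poll: get a fresh one and retry.
            conn = connect()
        time.sleep(delay)
    return False
```

With this shape, a dropped connection costs you one poll interval instead of the whole maintenance run.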


Source: https://habr.com/ru/post/1675552/

