"could not receive data from server: Connection timed out" or "connection not open" when connecting to a Postgres database on a separate AWS instance

I am using Ruby 1.9.3 on my application server, which runs on an AWS EC2 instance. I have a Postgres DB running on a separate EC2 instance, and both instances are in the same security group. My Ruby code connects to the database through the Sequel ORM gem ( http://sequel.rubyforge.org/ ).

I have configured this Postgres 9.1.4 database to accept connections from the application server instance.

However, from time to time I notice in the application server logs that it has problems connecting to the Postgres DB instance, and I see error messages like these:

PG::Error: could not receive data from server: Connection timed out 

or

 PG::Error: connection not open 

So I went to the Postgres DB instance and looked at /var/log/postgresql/postgresql-9.1-main.log, where I see a bunch of messages like this:

 2012-11-07 08:15:17 UTC LOG: could not receive data from client: Connection timed out
 2012-11-07 08:15:17 UTC LOG: unexpected EOF on client connection

I searched the Internet, including Stack Overflow, and made sure my PostgreSQL did not have SSL enabled (I have "ssl = off" in my postgresql.conf file).

At this point, I'm not sure what exactly is going wrong in the Postgres configuration. I would rather not change the maximum number of connections or the timeout values on my production server without a good, proven reason.

The application server can connect to the database most of the time, and this problem only occurs intermittently.

On the Ruby side, this is an error trace for "connection not open" when calling Postgres:

 PG::Error: connection not open
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:145:in `async_exec'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:145:in `block in execute_query'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/database/logging.rb:33:in `log_yield'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:145:in `execute_query'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:132:in `block in execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:111:in `check_disconnect_errors'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:132:in `execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:372:in `_execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:234:in `block (2 levels) in execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:379:in `check_database_errors'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:234:in `block in execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/database/connecting.rb:229:in `block in synchronize'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/connection_pool/threaded.rb:105:in `hold'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/database/connecting.rb:229:in `synchronize'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:234:in `execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/dataset/actions.rb:744:in `execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:483:in `fetch_rows'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/model/base.rb:785:in `primary_key_lookup'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/model/base.rb:124:in `[]'

Similarly, this is the trace for "could not receive data from server":

 PG::Error: could not receive data from server: Connection timed out
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:124:in `block'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:124:in `ensure in check_disconnect_errors'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:124:in `check_disconnect_errors'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:132:in `execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:372:in `_execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:234:in `block (2 levels) in execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:379:in `check_database_errors'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:234:in `block in execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/database/connecting.rb:229:in `block in synchronize'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/connection_pool/threaded.rb:105:in `hold'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/database/connecting.rb:229:in `synchronize'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:234:in `execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/dataset/actions.rb:744:in `execute'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/adapters/postgres.rb:483:in `fetch_rows'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/model/base.rb:785:in `primary_key_lookup'
 /var/lib/gems/1.9.1/gems/sequel-3.38.0/lib/sequel/model/base.rb:124:in `[]'

I noticed that if I run the application server and the Postgres DB on the same instance, there are no connection problems, at least not so far. Maybe Postgres is less lenient with non-local database connections?

Please let me know what I might have missed. I appreciate it!

1 answer

The usual explanation for this would be connectivity problems between the two instances.

Alternatively, if it is not connectivity, this could be a protocol synchronization issue: both ends may be trying to read from the socket while neither is writing. For example, the client may be waiting for the server to send a response while the server is waiting for the client to send data.

It will be very difficult to debug if it is intermittent and random, since you cannot just tcpdump it and analyze it.

I would add more server-side logging: log_statement = 'all' and a log_line_prefix that shows the client's IP address, the backend start time, and the backend pid. That way you can start correlating these failures with the session activity that preceded them, to work out whether it is specific clients, specific tasks, or really just random.
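As a rough sketch of what those settings might look like in postgresql.conf: the escape sequences below are the documented log_line_prefix placeholders, but the exact prefix layout is only an illustration, not something taken from the question.

```conf
# postgresql.conf, diagnostic logging (illustrative layout)
log_statement = 'all'                 # log every statement; expect verbose logs on a busy server
log_line_prefix = '%m %r %s [%p] '    # %m timestamp with ms, %r client host and port,
                                      # %s backend (session) start time, %p backend pid
```

Both settings can be picked up with a configuration reload, so a full server restart should not be needed.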

Is the Sequel ORM gem using libpq underneath, or its own implementation of the PostgreSQL protocol? If it is the latter, that is a likely culprit.

Update: it looks like Sequel can use the pg gem ( libpq ), the postgres gem, or perhaps postgres-pr (whatever that is). It prefers pg if it is installed.
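To check which driver is likely in play on the application server, something like the following could help. It is a minimal sketch: the helper names are hypothetical, and the fallback order after pg (postgres, then postgres-pr) is an assumption about how Sequel's adapter tries the drivers.

```ruby
# Candidate drivers, in the order Sequel's postgres adapter is assumed to try them.
PG_DRIVER_PREFERENCE = %w[pg postgres postgres-pr].freeze

# Returns the first preferred driver found in the given list of gem names,
# or nil if none of the candidates is present.
def preferred_pg_driver(gem_names)
  PG_DRIVER_PREFERENCE.find { |name| gem_names.include?(name) }
end

# Names of all gems installed for the current Ruby, as reported by RubyGems.
def installed_gem_names
  Gem::Specification.map(&:name).uniq
end
```

Calling `preferred_pg_driver(installed_gem_names)` on the application server should return "pg" if the libpq-based driver is installed. The `async_exec` frame in the question's trace already points in that direction.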

Since it looks like you are already using the pg gem, you probably need to do some diagnostic work to track down where the problem occurs (specific queries, specific clients, and so on) and try to find a way to reproduce it. PostgreSQL's csvlog can be useful here, since it lets you load the logs into a table or script and analyze them more easily.
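As an illustration of that kind of analysis, here is a small sketch that scans csvlog output for the two server-side messages from the question and counts them per client host. The column positions are an assumption based on the csvlog column order documented for PostgreSQL 9.1, and the helper names are hypothetical.

```ruby
require 'csv'

# Column positions in a csvlog row (assumed from the PostgreSQL 9.1 documentation).
COL_LOG_TIME        = 0
COL_PID             = 3
COL_CONNECTION_FROM = 4   # "host:port" of the client
COL_SESSION_START   = 8
COL_MESSAGE         = 13

# Server-side messages seen in the question's log.
FAILURE_PATTERNS = [
  'could not receive data from client',
  'unexpected EOF on client connection'
].freeze

# Parses csvlog text and returns one hash per row whose message matches a failure pattern.
def connection_failures(csv_text)
  CSV.parse(csv_text).each_with_object([]) do |row, failures|
    message = row[COL_MESSAGE].to_s
    next unless FAILURE_PATTERNS.any? { |pattern| message.include?(pattern) }

    failures << {
      time: row[COL_LOG_TIME],
      pid: row[COL_PID],
      client: row[COL_CONNECTION_FROM],
      session_start: row[COL_SESSION_START],
      message: message
    }
  end
end

# Counts failures per client host, dropping the trailing ":port" if present.
def failures_by_client(failures)
  failures.each_with_object(Hash.new(0)) do |failure, counts|
    host = failure[:client].to_s.sub(/:\d+\z/, '')
    counts[host] += 1
  end
end
```

For example, `failures_by_client(connection_failures(File.read(path)))`, with `path` pointing at a csvlog file, gives a quick view of whether the timeouts cluster on particular clients. Grouping by `:pid` or `:session_start` instead would show whether long-lived sessions are the ones affected.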


Source: https://habr.com/ru/post/1444530/

