How to manage a PostgreSQL connection pool with Sidekiq?

Problem: I have a Rails application that starts several hundred Sidekiq worker processes. They all connect to the PostgreSQL database, which is not entirely happy about providing 250 connections: it can, but if all the worker processes happen to hit the database at the same time, it falls over.

Option 1: I was thinking about putting PgBouncer in front of the database, but I can't use it in transaction pooling mode right now, because I depend heavily on setting search_path at the start of each job to pick which "country" (PostgreSQL schema) it should work in (the Apartment gem). So I would have to use session pooling mode. However, as far as I know, that would require me to disconnect after each job in order to return the connection to the pool, and that would be very expensive, right? Am I missing something?
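For context, that per-job schema switching looks roughly like the sketch below. This is a minimal illustration, not my actual code, and it assumes the first job argument carries the schema name (real setups often use apartment-sidekiq instead). It also shows why transaction pooling breaks things: search_path is session state, and under transaction pooling consecutive statements from one job can land on different server connections.

```ruby
# Minimal sketch of per-job schema switching in a Sidekiq server middleware.
# Assumption: the first job argument is the "country" schema name.
class CountrySchemaMiddleware
  def call(_worker, job, _queue)
    conn = ActiveRecord::Base.connection
    schema = job["args"].first
    conn.execute("SET search_path TO #{conn.quote_column_name(schema)}, public")
    yield
  ensure
    # Reset session state so the pooled connection is clean for the next job.
    ActiveRecord::Base.connection.execute("SET search_path TO public")
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware { |chain| chain.add CountrySchemaMiddleware }
end
```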

Option 2: using an application-level connection pool is also an option, but I'm not quite sure how to set that up for PostgreSQL with Sidekiq.
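For what it's worth, a hand-rolled application-level pool could look something like the sketch below, built on the connection_pool gem (the same gem Sidekiq uses internally for its Redis pool); the pool size and connection parameters are placeholders. In a Rails app, ActiveRecord's own pool (the pool: setting in database.yml) usually plays this role already.

```ruby
# Sketch of an application-level PostgreSQL pool using the connection_pool gem.
# Size, timeout and database name are illustrative only.
require "connection_pool"
require "pg"

PG_POOL = ConnectionPool.new(size: 5, timeout: 5) do
  PG.connect(dbname: "myapp_production")
end

# Borrow a connection only for the duration of the query, then return it.
PG_POOL.with do |conn|
  conn.exec("SELECT 1")
end
```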

Option 3: is there something I haven't thought of?

2 answers

Option 1: you're right, session pooling will require you to reset and reconnect, which adds overhead. How expensive that is depends on the access pattern, i.e. what share of the total work the connection setup / TCP handshake makes up, and what kind of latency you need. Benchmarking it is definitely worth doing, but if the connections are short-lived, the overhead will be quite noticeable.
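A quick way to get a feel for that reconnect cost is something like the sketch below (the database name is a placeholder); compare the per-connect time against your typical job duration.

```ruby
# Rough benchmark of connection setup cost; dbname is a placeholder.
require "benchmark"
require "pg"

n = 50
total = Benchmark.realtime do
  n.times do
    conn = PG.connect(dbname: "myapp_production")
    conn.exec("SELECT 1")
    conn.close
  end
end
puts format("avg connect + trivial query: %.1f ms", total / n * 1000)
```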

Option 2/3: you can rate limit or throttle your Sidekiq jobs (a small configuration sketch follows the list below). There are several projects dedicated to this:

Queue Limits

  • sidekiq-limit_fetch: limits the number of workers that can run the specified queues at the same time. You can pause queues dynamically and resize the queue distribution, and it keeps track of the number of busy workers per queue. Supports a global mode (multiple Sidekiq processes) and an additional blocking queue mode.
  • Sidekiq::Throttler: middleware for Sidekiq that adds the ability to rate limit job execution on a per-worker basis.
  • sidekiq-rate-limiter: Redis-backed, per-worker rate limits for job processing.
  • Sidekiq::Throttled: concurrency and threshold throttling.

I found these via the Sidekiq Related Projects wiki:

https://github.com/mperham/sidekiq/wiki/Related-Projects
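As an illustration of the first option above, sidekiq-limit_fetch lets you cap queue concurrency from an initializer roughly like this (the queue name and limit are made up, and you should double-check the gem's README for the exact, current API):

```ruby
# config/initializers/sidekiq_limits.rb (illustrative)
# Cap the "reports" queue so at most 5 workers run it at once,
# which indirectly caps the database connections those jobs can hold.
Sidekiq::Queue["reports"].limit = 5
```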

If your application needs a connection per process and you can't restructure it so that more threads share a connection, then it's either PgBouncer or an application-level connection pool. Either way, the connection pool is effectively going to throttle or limit your application in some way in order to protect the database.
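If you do go the PgBouncer route with session pooling, the relevant knobs look roughly like this; all values are placeholders, not recommendations.

```ini
; Illustrative pgbouncer.ini fragment; values are placeholders.
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp_production

[pgbouncer]
listen_port = 6432
; session mode is needed here because search_path is per-session state
pool_mode = session
; number of server connections PgBouncer keeps open to PostgreSQL
default_pool_size = 50
; clients beyond the pool size queue up instead of hitting PostgreSQL directly
max_client_conn = 500
```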


Sidekiq only requires one connection per worker thread. If you set the concurrency to a reasonable value, say 10-25, I don't think you should be using 250 simultaneous database connections. How many worker processes are you running, and what is their concurrency?
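The arithmetic behind that: total database connections is roughly worker processes times concurrency, so 10 processes at concurrency 25 is 250 connections, while concurrency 10 brings it down to 100. A sidekiq.yml sketch (values illustrative):

```yaml
# config/sidekiq.yml (illustrative)
:concurrency: 10
:queues:
  - default
```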

In addition, as described on that page, even if you have a high concurrency setting, you can still create a connection pool that is shared by the threads within the process.
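In a Rails app that shared pool is ActiveRecord's, sized via database.yml; a common sketch is to match it to the Sidekiq concurrency (the environment variable name here is just an example):

```yaml
# config/database.yml (sketch): one pool per process, shared by its threads.
production:
  adapter: postgresql
  database: myapp_production
  pool: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 10) %>
```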

