How to optimize PostgreSQL max_connections and node-postgres connection pool?

In short, I am having trouble supporting more than 5,000 read requests per minute to a data API built with PostgreSQL, Node.js, and node-postgres. The bottleneck appears to be between the API and the database. Here are the implementation details.

I am using an AWS PostgreSQL RDS database instance (m4.4xlarge - 64 GB of memory, 16 vCPU, 350 GB SSD, no provisioned IOPS) behind a Node.js data API. By default, RDS max_connections = 5000. The Node API is load balanced across two clusters with 4 processes each (2 EC2 instances with 4 vCPUs, running the API with PM2 in cluster mode). I am using node-postgres to connect the API to PostgreSQL RDS and am trying to use its pooling feature. Below is a sample of the connection pool code:

var Pool = require('pg').Pool;

var pool = new Pool({
    user: settings.database.username,
    password: settings.database.password,
    host: settings.database.readServer,
    database: settings.database.database,
    max: 25,                  // maximum number of clients in the pool
    idleTimeoutMillis: 1000   // close idle clients after 1 second
});

/* Example of pool usage */
pool.query('SELECT my_column FROM my_table', function(err, result){
    /* Callback code here */
});

Using this implementation and a load tester, I can support about 5,000 requests per minute with an average response time of about 190 ms (which is what I expect). As soon as I push more than 5,000 requests per minute, response time climbs to over 1,200 ms in the best case, and in the worst case the API starts timing out frequently. Monitoring shows that the EC2 instances running the Node.js API stay under 10% CPU utilization, so my focus is on the database and the API's connection to the database.

I have tried increasing (and decreasing, for that matter) the node-postgres max connection setting, but there was no change in the API's response/timeout behavior. I also tried provisioned IOPS on RDS, with no improvement. Interestingly, when I scaled RDS up to m4.10xlarge (160 GB of memory, 40 vCPU), RDS CPU load dropped significantly, but overall API performance got significantly worse (I could not even support the 5,000 requests per minute that I could on the smaller instance).

In many ways I am in unfamiliar territory here and am unsure how best to determine which of these moving parts is bottlenecking API performance above 5,000 requests per minute. As noted, I have tried various adjustments based on a review of the PostgreSQL configuration documentation and the node-postgres documentation, but to no avail.

If anyone has any tips on how to diagnose or optimize, I would really appreciate it.

UPDATE

After scaling to m4.10xlarge, I ran a series of load tests, varying the number of requests per minute and the max connections setting in each pool. Here are some monitoring screenshots:

monitoring indicators

db connections

3 answers

To support more than 5,000 requests per minute while keeping the same response times, you need better hardware...

Simple math says: 5,000 requests * 190 ms average = 950,000 ms of database time per minute. Spread over 16 cores, that is roughly 60,000 ms per core, and since each core only has 60,000 ms of wall-clock time in a minute, your database was essentially running at full capacity.
(I assume you actually had a little spare CPU, since some of that time is spent on the network.)

Now the really interesting part of your question is your scale-up attempt: m4.10xlarge (160 GB of memory, 40 vCPU).
The drop in CPU utilization indicates that scaling up freed database time resources - so you need to push more queries at it!
Two suggestions:

  • Try increasing the connection pool to max: 70 and watch the network traffic (depending on the amount of data, you could be saturating the network); see the sketch after this list.
  • Also, are your application's DB requests asynchronous? Make sure your application can actually push more concurrent queries.
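For example, here is a minimal sketch of a larger, asynchronously used pool, reusing the settings object from the question; the max value of 70 and the query are illustrative, not tested numbers:

var Pool = require('pg').Pool;

var pool = new Pool({
    user: settings.database.username,
    password: settings.database.password,
    host: settings.database.readServer,
    database: settings.database.database,
    max: 70,                 // larger pool so the bigger instance can be kept busy
    idleTimeoutMillis: 1000
});

// Each call checks a client out of the pool, runs, and returns it
// without blocking other in-flight queries.
function getMyColumn(callback) {
    pool.query('SELECT my_column FROM my_table', function(err, result) {
        if (err) return callback(err);
        callback(null, result.rows);
    });
}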

Since you are interested in read performance, you can set up replication between two (or more) PostgreSQL instances and then use Pgpool-II to load balance between the instances.
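As a rough illustration, the load-balancing portion of pgpool.conf might look like the fragment below; the hostnames are placeholders, and the replication/clustering-mode settings (which depend on your Pgpool-II version) are omitted:

# pgpool.conf (fragment) - hostnames are placeholders
load_balance_mode = on            # spread read queries across the backends

backend_hostname0 = 'replica1.example.internal'
backend_port0     = 5432
backend_weight0   = 1

backend_hostname1 = 'replica2.example.internal'
backend_port1     = 5432
backend_weight1   = 1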

The advantage of horizontal scaling is that you won't start hitting maximum instance sizes in AWS if you decide you need to handle 10,000 concurrent reads next week.

You can also get some HA in your architecture.

-

Many times people will use pgbouncer as a connection pooler, even if they already have one built into their application code. pgbouncer works very well and is usually easier to configure and manage than pgpool, but it does not do load balancing. I am not sure if it would help you in this scenario.
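For completeness, a minimal pgbouncer.ini sketch; the endpoint and database name are placeholders, and authentication settings are left out:

; pgbouncer.ini (fragment) - values are placeholders
[databases]
mydb = host=my-rds-endpoint.example.com port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction        ; reuse server connections between transactions
max_client_conn = 500
default_pool_size = 50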


In my experience, the best thing an API service can do is use a separate Pool for each API call, according to the call's priority:

 var highPriority = new Pool({ ..., max: 20 }); // for high-priority API calls
 var lowPriority = new Pool({ ..., max: 5 });   // for low-priority API calls

Then you simply use the appropriate pool for each API call to get the best service availability.
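For example, a sketch of routing calls to the appropriate pool, assuming an Express-style app; the routes and queries are made up for illustration:

// Critical endpoints use the larger pool, background/reporting endpoints the smaller one.
app.get('/api/orders', function(req, res) {
    highPriority.query('SELECT * FROM orders WHERE user_id = $1', [req.query.userId],
        function(err, result) {
            if (err) return res.status(500).end();
            res.json(result.rows);
        });
});

app.get('/api/reports/daily', function(req, res) {
    lowPriority.query('SELECT count(*) FROM orders', function(err, result) {
        if (err) return res.status(500).end();
        res.json(result.rows[0]);
    });
});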


Source: https://habr.com/ru/post/1259058/

