Caching architecture suggestions for a specific scenario

SETUP:

We have a .NET application distributed across 6 local servers, each with a local database (Oracle), plus 1 main server and 1 load-balancing machine. Requests go to the load balancer, which redirects each incoming request to one of the 6 local servers. At certain intervals, data is collected on the main server and redistributed to the 6 local servers so that each of them can make decisions with complete data.

Each local server has a cache component that caches incoming requests based on different parameters (location, incoming parameters, etc.). For each request, the local server decides whether to go to the database (Oracle) or serve the response from the cache. In both cases, however, the local server must still access the database to do 1 insert and 1 update per request.

PROBLEM:

On peak days, each local server receives 2000 requests per second and the system starts to slow down (CPU: 90%). I am trying to increase capacity before adding another local server to the mix. After running some benchmarks, the bottleneck, as always, turns out to be the unavoidable 1 insert and 1 update in the database for each request.

TRIED METHODS:

To reduce the frequency of database calls, I created a Windows service that sits between the database and the .NET application. It contains a pipe server, receives each insert and update from the main .NET application, and saves them in a Hashtable. Then, at a fixed time interval, the service goes to the database once and performs the inserts and updates in batch mode. The idea was to access the database less often. Although this had a positive effect, it did not help the system as much as I expected. Most of the CPU load comes from oracle.exe, and it grows with the number of queries per second.
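The batching idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Windows service: the `WriteBatcher` class and the `flush_fn` callback are hypothetical names, and coalescing updates by key (last one wins) is an assumption that only holds if a later update fully replaces an earlier one for the same key.

```python
import threading


class WriteBatcher:
    """Buffers per-request writes and hands them to the database in batches."""

    def __init__(self, flush_fn):
        self._flush_fn = flush_fn   # hypothetical callback that performs the batch DB call
        self._inserts = []          # every insert is kept
        self._updates = {}          # assumption: updates to the same key coalesce, last one wins
        self._lock = threading.Lock()

    def add_insert(self, row):
        with self._lock:
            self._inserts.append(row)

    def add_update(self, key, values):
        with self._lock:
            self._updates[key] = values

    def flush(self):
        """Swap the buffers out under the lock, then write outside the lock."""
        with self._lock:
            inserts, self._inserts = self._inserts, []
            updates, self._updates = self._updates, {}
        if inserts or updates:
            self._flush_fn(inserts, updates)
        return len(inserts), len(updates)
```

In a real service, `flush()` would be called from a timer at the chosen interval. Swapping the buffers under the lock keeps request threads from waiting on the database round trip.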

I am trying to avoid database access as much as I can, and apart from the workaround above, the only way to avoid the database seems to be increasing the cache hit ratio, which is about 81% at present. Since each local server has its own cache, I actually miss a lot of requests that are already cached elsewhere: when two similar requests are redirected to different servers, the second request cannot benefit from the cached result of the first.
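To make the effect of per-server caches concrete, here is a small simulation sketch. It assumes round-robin load balancing and unbounded caches, neither of which is stated in the question, and the function names are made up for illustration.

```python
from itertools import cycle


def hit_ratio_per_server(keys, n_servers):
    """Round-robin the requests over servers that each keep a private cache."""
    caches = [set() for _ in range(n_servers)]
    hits = 0
    for key, cache in zip(keys, cycle(caches)):
        if key in cache:
            hits += 1
        else:
            cache.add(key)          # first time this server sees the key: a miss
    return hits / len(keys)


def hit_ratio_shared(keys):
    """The same request stream against one cache shared by all servers."""
    seen = set()
    hits = 0
    for key in keys:
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(keys)


requests = ["a", "b", "c"] * 4      # 12 requests over 3 distinct cache keys
print(hit_ratio_per_server(requests, 6))   # 0.5
print(hit_ratio_shared(requests))          # 0.75
```

With private caches every server has to miss once per key, so the same traffic produces more misses than it would against a single shared cache.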

I do not have much experience in system architecture, so I would be grateful for any help with this issue. Any suggestions on caching architectures, tuning, or tools are welcome.

Thanks in advance. I hope I made my question clear.

+6
3 answers

I know that the question is old, but I wanted everyone to know how we solved our problem.

After several optimization attempts, it turned out that all we needed was solid-state drives for the 6 local machines. CPU usage dropped to 30% right after we installed them. This is the first time I have seen a hardware upgrade contribute this much to performance.

If you have a high load, try upgrading to an SSD before making changes to your software or architecture.

Thank you all for your answers.

+1

To me, this looks like a use case for TimesTen. In that case you can eliminate the local databases and go back to a single one. If you currently have local Oracle databases, you can implement a cache grid. Most likely it would be an AWT (Asynchronous Write Through) cache. See Oracle In-Memory Database Cache Concepts. It is not a cheap option, but it is worth exploring. You can focus on business logic and not worry about speed. This, of course, only works well if the application code is already tuned and the SQL is efficient and scalable. SQL must be prepared (using bind variables) to get the best performance.

The application connects to the cache and no longer talks to the database directly. You create cache tables in a cache group for the tables you want cached. All tables in a SQL statement must be cached, otherwise the full SQL is passed through to the Oracle database. The grid has a cache fusion mechanism, so you do not have to worry about where the data sits in your grid. The current release includes .NET support. The data is consistent and is propagated asynchronously to the Oracle database. If the required data is in the cache and you take down the Oracle database, the application can keep working. As soon as the database comes back, synchronization resumes. Very powerful.
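As a rough illustration of the asynchronous write-through idea (not the TimesTen API, which is not shown here), the sketch below keeps reads and writes in memory and pushes queued writes to the backing database separately. The class name and the `backend_write` callback are hypothetical, and the callback is assumed to raise `ConnectionError` while the database is unavailable.

```python
from collections import deque


class AsyncWriteThroughCache:
    """Serves reads and writes from memory and propagates writes later, in order."""

    def __init__(self, backend_write):
        self._data = {}
        self._pending = deque()               # writes not yet applied to the database
        self._backend_write = backend_write   # hypothetical; raises ConnectionError when DB is down

    def put(self, key, value):
        self._data[key] = value               # visible to readers immediately
        self._pending.append((key, value))    # queued for asynchronous propagation

    def get(self, key):
        return self._data.get(key)

    def propagate(self):
        """Apply queued writes in order; stop at the first failure and keep the rest."""
        written = 0
        while self._pending:
            key, value = self._pending[0]
            try:
                self._backend_write(key, value)
            except ConnectionError:
                break                         # database unavailable: retry on the next call
            self._pending.popleft()
            written += 1
        return written

    def pending(self):
        return len(self._pending)
```

A background thread or timer would call `propagate()` periodically. While the database is down the queue simply grows, and it drains once writes succeed again, which mirrors the "application keeps working, synchronization resumes" behaviour described above.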

+4

2000 requests per second per server is about 24000 rps for the database. This is a BIG database load. Try optimizing, scaling, or clustering the database.

Maybe a NoSQL DB (Redis / RavenDB / MongoDB) as middleware would be right for you. The local servers would read from and write to a sharded NoSQL DB, and the aggregated data would be synchronized with Oracle outside peak times.
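A minimal sketch of that middleware idea, with in-memory counters standing in for the NoSQL nodes. Everything here is an assumption made for illustration: the `ShardedStore` name, hashing the key to pick a shard, and treating the per-request writes as counters that can be aggregated before the off-peak sync.

```python
import hashlib
from collections import Counter


class ShardedStore:
    """In-memory stand-in for a sharded NoSQL store that aggregates writes per key."""

    def __init__(self, n_shards):
        self._shards = [Counter() for _ in range(n_shards)]

    def shard_for(self, key):
        # A stable hash, so every server maps the same key to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self._shards)

    def record(self, key, amount=1):
        self._shards[self.shard_for(key)][key] += amount

    def read(self, key):
        return self._shards[self.shard_for(key)][key]

    def drain(self):
        """Collect the aggregated totals from all shards and reset them.

        The result is what an off-peak job would write to Oracle in one batch.
        """
        totals = Counter()
        for shard in self._shards:
            totals.update(shard)
            shard.clear()
        return dict(totals)
```

During peak hours the local servers would only call `record()` and `read()`. A scheduled job would call `drain()` off-peak and write the totals to Oracle.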

+1

Source: https://habr.com/ru/post/885705/

