CPU usage and threads

We have a transaction-intensive process at a single client site, running on a server with four quad-core processors (16 cores in total). The process is designed to use all available cores, so on this installation we take the input queue, split it into 16 parts, and hand each part to its own core. This works well and keeps up with the transaction volume on the box.
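A minimal sketch of the partitioning described above, in Python. The names `partition`, `process_part` and `run` are hypothetical, and the real system may split the queue differently (for example by key rather than round-robin); the point is only to show one part per core.

```python
from concurrent.futures import ProcessPoolExecutor

N_PARTS = 16  # 4 processors x 4 cores, as on the existing box


def partition(items, n_parts):
    """Split items into n_parts lists, round-robin, so sizes differ by at most one."""
    return [items[i::n_parts] for i in range(n_parts)]


def process_part(part):
    """Hypothetical stand-in for the real transaction logic: returns the count handled."""
    return len(part)


def run(items, n_parts=N_PARTS):
    """Process every part in its own worker process and return the total handled."""
    parts = partition(items, n_parts)
    with ProcessPoolExecutor(max_workers=n_parts) as pool:
        return sum(pool.map(process_part, parts))
```

When calling `run(...)` from a script, wrap the call in an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform.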

Looking at CPU usage on the box, it never seems to exceed 33%. We now have a new customer with at least twice the transaction volume of the existing one. Some of us argue that since CPU utilization is well below the maximum, we should go with the same configuration.

Others argue that there is no direct correlation between CPU usage and transaction processing speed, and that since the core logic of the application is built around the number of available cores, it makes sense to get a box with proportionally more cores for the new client to handle the increased volume.

Does anyone have a sense of who is right here?

Thanks,

1 answer

To determine the best configuration for your new customer, it is important to understand the reason for the low CPU utilization.

Most likely, the reason is one of the following:

  • Your process is limited by memory bandwidth. In this case, faster RAM will help if the motherboard supports it. If possible, a redesign that reduces the amount of data touched during processing will also improve performance. Adding more CPU cores alone will not.

  • Your process is limited by disk I/O. Faster disk connections (SATA, etc.) and/or a move to SSDs may help, but more CPU power will not.

  • Your process is limited by synchronization contention. In this case, adding more threads for more cores can even be counterproductive. Redesigning your algorithm may help.
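As a rough first check before reaching for a profiler, one can compare the CPU time a piece of work consumes with the wall-clock time it takes. The sketch below is only an illustration: `cpu_wall_ratio`, `busy` and `waiting` are hypothetical names, and the ratio only shows whether the process spends its time running or waiting. It does not tell the causes listed above apart, so profiler or Perfmon data is still needed for that.

```python
import time


def cpu_wall_ratio(work):
    """Run work() once and return CPU seconds used divided by wall seconds elapsed."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # CPU time of the whole process, all threads
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy():
    """Stand-in for compute-heavy work: keeps the CPU occupied."""
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def waiting():
    """Stand-in for work that blocks (I/O, a lock): sleeps without using CPU."""
    time.sleep(0.2)
```

A ratio close to 1 for single-threaded work suggests the code is mostly running; a ratio close to 0 suggests it is mostly blocked on something other than the CPU.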

Having said that, I have also seen processes that are definitely CPU-bound fail to reach 100% CPU utilization on modern processors (Core i7, etc.), because in some situations with Turbo Boost active, Task Manager reports less than 100%.

As 9000 said, you need to find out where your bottlenecks are under load. Perfmon may provide enough data to find out.

Another belated thought: you can restrict your process on the existing machine to a subset of the cores (but still more than a third of them, so that in theory the CPU does not become the bottleneck because of the restriction) and check whether overall throughput degrades. If it does not, adding more cores will not improve performance.
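The comparison at the end of that experiment can be sketched as follows. The function name `cores_matter`, the 5% tolerance and the sample figures are all assumptions made for illustration, not measurements from the system in question. The restriction itself would be applied outside this code, for example with Task Manager's "Set affinity" on Windows or `taskset` on Linux.

```python
def cores_matter(throughput_by_cores, tolerance=0.05):
    """Return True if throughput fell by more than `tolerance` (a fraction)
    when the process was restricted to fewer cores.

    throughput_by_cores maps core count -> measured transactions per second.
    """
    full = max(throughput_by_cores)      # largest core count tested
    reduced = min(throughput_by_cores)   # smallest core count tested
    baseline = throughput_by_cores[full]
    restricted = throughput_by_cores[reduced]
    return restricted < baseline * (1.0 - tolerance)


# Hypothetical measurements, invented for this example.
flat = {16: 1000.0, 8: 990.0}      # throughput barely changes with fewer cores
scaling = {16: 1000.0, 8: 600.0}   # throughput drops sharply with fewer cores
```

With these made-up numbers, `cores_matter(flat)` is `False` (more cores are unlikely to help) and `cores_matter(scaling)` is `True` (the CPU is a plausible bottleneck).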


Source: https://habr.com/ru/post/1341468/

