Why is a multi-threaded Java program slow and not consuming much CPU time?

My Java program uses java.util.concurrent.Executor to run several threads, each of which runs a Runnable class. That class reads a comma-delimited text file from the C: drive and goes through its lines, splitting and parsing the text into floats. The data is then stored in:

static Vector
static ConcurrentSkipListMap

My computer runs 64-bit Windows 7 on an Intel Core i7 processor with 6 × 2 cores and 24 GB of RAM. The program runs for 2 minutes and finishes all 1700 files, but CPU usage is only about 10% to 15%, no matter how many threads I assign with:

 Executor executor=Executors.newFixedThreadPool(50); 

Executors.newFixedThreadPool(500) gives neither better CPU utilization nor a shorter run time. There is no network traffic; everything is on the local C: drive. There is enough RAM for more threads, although increasing the thread count to 1000 throws an OutOfMemoryError.

Why don't more threads translate into more CPU usage and less processing time?

Edit: My hard drive is a 200GB SSD.

Edit: I finally found the problem. Each thread writes its results to a log file that is shared by all threads. The more times I ran the application, the larger the log file grew and the slower it got, and because the file is shared, it slowed the whole process down. After I stopped writing to the log file, all tasks complete in 10 seconds!
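If the log output is still wanted, one possible way to keep it without every thread touching the shared file is to queue messages in memory and write them out once at the end. The following is only a sketch under assumptions: the class name BufferedLogDemo, the file name results.log, and the message format are all hypothetical and not taken from the original program.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BufferedLogDemo {
    // Lock-free queue: worker threads append messages without touching the disk.
    static final Queue<String> LOG = new ConcurrentLinkedQueue<>();

    static void log(String message) {
        LOG.add(message);
    }

    // Writes all queued messages in one pass, replacing any previous file content
    // (Files.write creates or truncates the file by default).
    static void flush(Path logFile) {
        try {
            Files.write(logFile, LOG, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 8; i++) {
            final int id = i;
            // Stand-in for the real file-parsing task.
            pool.execute(() -> log("task " + id + " done"));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        flush(Paths.get("results.log")); // hypothetical log file name
    }
}
```

Because the file is rewritten on each run in this sketch, it also does not grow from one run to the next.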

+4
3 answers

The OutOfMemoryError probably comes from Java's own memory limits. Try passing JVM arguments to raise the maximum memory (for example, -Xmx for the maximum heap size).
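To see which limit the JVM is actually running with, a small check like the one below may help. It is a minimal sketch; the class name HeapLimitDemo is hypothetical, and the 8g value in the launch command is only an example, not a recommendation.

```java
class HeapLimitDemo {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Launch with, for example: java -Xmx8g HeapLimitDemo
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with and without the -Xmx argument shows whether the setting is being picked up.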

As for speed, Adam Blass starts off with a good suggestion. If it is the same file over and over, then I would guess that having multiple threads try to read it at the same time leads to contention for locks on the file. More threads would mean even more contention, which could even degrade overall performance. So avoid this and load the file just once, if possible. Even if it is a large file, you have 24 GB of RAM. You can hold a fairly large file in memory, but you may need to increase the JVM's allowed memory so that it can load the entire file.

If you are using multiple files, consider this fact: your disk can only read one file at a time. Having multiple threads use the disk at the same time is therefore unlikely to be efficient unless the threads spend a lot of time processing. Since you have so little CPU usage, it may be that a thread loads part of the file, quickly processes the part that was buffered, and then spends a lot of time waiting for the rest of the file to load. This may apply even if you are loading the same file over and over.

In short: disk IO is probably your culprit. You need to reduce it so that the threads do not compete so much for the file contents.

Edit:

After further consideration, it is more likely a synchronization problem. The threads are probably getting held up trying to add to the results list. If access is frequent, this leads to a huge amount of contention for locks on the shared collection. Consider having each thread save its results in a local list (e.g. ArrayList, which is not thread-safe), and then copying all the values to the final, shared list in chunks to reduce contention.
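The local-list idea could look roughly like the sketch below. It assumes the shared collection is a Vector of Float, as hinted at in the question; the class name LocalBatchDemo, the helper parseLine, and the sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LocalBatchDemo {
    // Shared, synchronized collection standing in for the question's static Vector.
    static final Vector<Float> RESULTS = new Vector<>();

    // Hypothetical parser: splits one comma-delimited line into floats.
    static List<Float> parseLine(String line) {
        List<Float> out = new ArrayList<>();
        for (String field : line.split(",")) {
            out.add(Float.parseFloat(field.trim()));
        }
        return out;
    }

    // Each task fills a private ArrayList, then takes the Vector's lock once.
    static Runnable taskFor(List<String> lines) {
        return () -> {
            List<Float> local = new ArrayList<>();
            for (String line : lines) {
                local.addAll(parseLine(line)); // no locking here
            }
            RESULTS.addAll(local); // one synchronized call per task
        };
    }

    // Runs one task per "file" (here: a list of lines) and waits for completion.
    static void runAll(List<List<String>> files, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> lines : files) {
            pool.execute(taskFor(lines));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
    }

    public static void main(String[] args) {
        runAll(Arrays.asList(
                Arrays.asList("1.0,2.5", "3"),
                Arrays.asList("4.5,5")), 2);
        System.out.println("Parsed values: " + RESULTS.size());
    }
}
```

With this shape, the number of lock acquisitions on the shared Vector is one per task rather than one per parsed value.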

+4

You are probably limited by IO, not CPU.

Can you reduce the number of times you open a file to read it? Perhaps open it once, read all the lines, keep them in memory, and then iterate over them.
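A minimal sketch of that read-once approach, assuming the file is UTF-8 text and small enough to fit in the heap; the class name ReadOnceDemo and its helper methods are hypothetical:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ReadOnceDemo {
    // Reads the whole file in a single pass; later work uses only the in-memory list.
    static List<String> loadLines(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses comma-delimited lines into float rows, skipping blank lines.
    static List<float[]> parse(List<String> lines) {
        List<float[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.split(",");
            float[] row = new float[fields.length];
            for (int i = 0; i < fields.length; i++) {
                row[i] = Float.parseFloat(fields[i].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // The path is supplied by the caller; no particular file name is assumed.
        List<String> lines = loadLines(Paths.get(args[0]));
        List<float[]> rows = parse(lines);
        System.out.println("Rows parsed: " + rows.size());
    }
}
```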

Otherwise, you will have to look for a fast drive. SSDs can be quite fast.

+1

Have your threads somehow been given a low priority on the system? In that case, increasing the number of threads would not produce a corresponding increase in CPU usage, since the amount of CPU time allotted to your program may be throttled elsewhere.

Are there any configuration / initialization files where something like this is possible?

+1

Source: https://habr.com/ru/post/1493094/