Is multithreading useful for processing files on your hard drive?

Regarding performance and speed, is it useful to use multithreading to process files on a hard drive (for example, to move files from one disk to another, or to check file integrity)?

I think the speed of my hard drive will basically determine the speed of my processing.

+6
2 answers

Multithreading can help, at least sometimes. The reason is that if you are writing to a "regular" hard drive (i.e. not a solid-state drive), then what slows you down the most is the drive's seek time (that is, the time it takes for the drive to move its read/write heads from one distance along the radius of the disk to another). This movement is slow compared to the rest of the system, and the seek time increases with the distance the head has to move. So, for example, the worst-case scenario would be if the head had to move from the edge of the disk to the center of the disk after each operation.

Of course, the ideal solution is for the disk head never to seek, or only very rarely, and if you can arrange things so that your program only needs to read/write one file sequentially, that will be the fastest. Or even better, switch to an SSD, where there is no disk head and the seek time is effectively zero. :)
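To make the "one file, sequentially" case concrete, here is a minimal sketch of an integrity check that reads a file front to back in large chunks. The function name `sha256_of_file` and the 1 MiB chunk size are my own choices for illustration, not anything the question specifies.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash one file with a single front-to-back pass of fixed-size reads."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Each read continues where the previous one stopped, so on an
        # unfragmented file the drive can stream data without seeking.
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks also keeps memory use flat regardless of file size; the best chunk size depends on the drive and OS, so treat 1 MiB as a starting point to measure from.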

But sometimes you need your disk to read and write several files in parallel, in which case the disk head will (by necessity) seek back and forth a lot. So how does multithreading help in this scenario? The answer is: with a sufficiently intelligent disk I/O subsystem (such as SCSI; I'm not sure whether IDE can do this), the I/O logic maintains a queue of all outstanding read and write requests and dynamically reorders that queue so that the requests are executed in an order that minimizes head movement. This is known as the Elevator Algorithm, because it is similar to the logic an elevator uses to maximize the number of people it can transport in a given period of time.
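The reordering idea can be illustrated with a toy model. This is only a sketch: it assumes each request is a bare track number and that cost equals the distance the head travels, which real drives and OS schedulers do not expose this simply. The names `elevator_order` and `head_travel` are made up for the example, and the variant shown reverses at the last pending request rather than at the disk edge.

```python
def elevator_order(requests, head):
    """Sweep upward from the head position, then service the rest on the way back."""
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    return up + down


def head_travel(order, head):
    """Total distance the head moves when servicing requests in the given order."""
    total = 0
    for track in order:
        total += abs(track - head)
        head = track
    return total


# Hypothetical queue of pending track numbers, head currently at track 53.
requests = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53

print(head_travel(requests, start))                         # arrival order: 640
print(head_travel(elevator_order(requests, start), start))  # elevator order: 299
```

In this made-up queue, servicing requests in one sweep up and one sweep down cuts head travel by more than half compared with arrival order.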

Of course, the OS I/O subsystem can implement this optimization only if it knows in advance which I/O requests are pending... and if you have only one thread issuing I/O requests, then the I/O subsystem will only know about the current request (i.e., it cannot "peek" into your user-space thread to see what that thread will ask for next). And, of course, your user-space thread does not know the details of the disk layout, so it is difficult (impossible?) to implement the elevator algorithm in user space.

But if your program has N threads reading/writing the disk at once, then the OS I/O subsystem will know about N I/O requests at the same time and can reorder those requests at its discretion to maximize disk performance.
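As a rough sketch of the "N threads" setup for the integrity-check use case from the question, the following hashes several files concurrently with a thread pool, so that several reads can be outstanding at once. The helper names `file_digest` and `digest_many` and the default of 4 workers are assumptions for illustration; whether this actually beats a single thread depends on the drive, the OS scheduler, and the file layout, so it is worth measuring.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of one file, read sequentially in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_many(paths, workers=4):
    """Hash several files at once so multiple reads are in flight together."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input paths.
        return dict(zip(paths, pool.map(file_digest, paths)))
```

Usage would look like `digest_many(["a.bin", "b.bin"])`, with hypothetical file names, returning a dict that maps each path to its hex digest.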

+8

Perhaps your main concern should be code maintainability. Threading helps a lot there, IMO, because it does not permit the kind of hacks that single-threaded code lets you get away with.

0

Source: https://habr.com/ru/post/887123/

