Running du in parallel

I have a very large disk drive (16T). I want to run "du" to find out how much space each subdirectory takes. However, this takes a very long time. Fortunately, I have a cluster of computers at my disposal, so I could run "du" in parallel, with each job working on a separate subdirectory, and write a simple script that does this. Is there such a thing already, or should I write it myself?

3 answers

This is easy to do using GNU Parallel:

parallel du ::: */* 
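
If GNU Parallel is not installed, a similar effect can be sketched with find and xargs. This is a minimal sketch, not the answer's own method: the function name par_du and its two arguments are hypothetical, and it assumes a find that supports -mindepth/-maxdepth/-print0 and an xargs that supports -0 and -P (GNU and BSD versions do). It prints one summary line per immediate subdirectory, in kilobytes, sorted by size.

```shell
#!/bin/sh
# par_du ROOT JOBS  (hypothetical helper, for illustration only)
# Summarize each immediate subdirectory of ROOT, running up to JOBS
# du processes at the same time.
par_du() {
    root=${1:-.}   # directory whose subdirectories are measured
    jobs=${2:-4}   # maximum number of concurrent du processes
    # -mindepth 1 -maxdepth 1 selects only the immediate subdirectories;
    # -print0 and -0 keep names containing spaces intact.
    find "$root" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -n 1 -P "$jobs" du -sk |
        sort -n
}
```

For example, `par_du /data 8` would run up to eight du processes over the subdirectories of /data (a placeholder path). Whether this is actually faster depends on the storage, as the next answer explains.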

Your question does not make clear how your storage is set up (RAID, NAS, NFS, or something else).

But almost regardless of the underlying technology, running du in parallel may not be such a good idea after all: it is quite likely to slow the work down.

A disk array has limited IOPS capacity, and several du processes will all draw from that same pool. Worse, even a single du often slows down all other I/O considerably, even though it does not need much disk bandwidth.

By analogy, if you have only one processor, running a parallel make (make -j N) will slow the build down, because process switching has significant overhead.

The same principle applies to disks, especially rotating disks. The only situation in which you get a significant speedup is when you have N disks mounted at independent directories (something like /mnt/disk1 , /mnt/disk2 , ..., /mnt/diskN ). In that case, you should run N du processes, one per disk.
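
The one-process-per-disk case could be sketched as below. The function name du_per_disk is hypothetical, and the mount points in the usage comment are the placeholder paths from the paragraph above; the sketch simply starts one background du for each argument and waits for all of them.

```shell
#!/bin/sh
# du_per_disk MOUNTPOINT...  (hypothetical helper, for illustration only)
# Start one du per mount point so each physical disk is scanned by
# exactly one process, then wait for all of them to finish.
du_per_disk() {
    for mnt in "$@"; do
        du -sk "$mnt" &   # one background du per disk
    done
    wait                  # block until every du has exited
}

# Usage (placeholder paths):
#   du_per_disk /mnt/disk1 /mnt/disk2 /mnt/disk3
```

The lines come out in completion order, not argument order, so pipe the result through sort if a stable order matters.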

One common way to increase speed is to mount your disks with the noatime flag. Without this flag, a massive disk scan generates a lot of write activity to update access times. With noatime, those writes are avoided and du runs much faster.
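
To see whether a filesystem is already mounted with noatime, something like the following could be used. This is a sketch under assumptions: has_noatime is a hypothetical helper, it reads /proc/mounts by default (Linux-specific), and it does not handle mount points containing spaces, which that file stores in escaped form. Note that many Linux systems default to relatime, which already cuts down on access-time writes, so the gain from noatime may be smaller there.

```shell
#!/bin/sh
# has_noatime MOUNTPOINT [MOUNTS_FILE]  (hypothetical helper)
# Exit status 0 if MOUNTPOINT is listed with the noatime option.
# MOUNTS_FILE defaults to /proc/mounts, whose fields are:
#   device  mountpoint  fstype  options  dump  pass
has_noatime() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")          # options are comma-separated
            for (i = 1; i <= n; i++)
                if (opts[i] == "noatime") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Remounting needs root; /mnt/disk1 is a placeholder path:
#   mount -o remount,noatime /mnt/disk1
```

To make the setting persistent, noatime would normally be added to the options column of the relevant /etc/fstab entry.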


Is there such a thing or should I write it myself?

I wrote sn for myself, but you may find it useful too.

 sn p . 

will give you the sizes of everything in the current directory. It runs in parallel and is faster than du on large directories.


Source: https://habr.com/ru/post/971853/
