What is the fastest way to get a list of files recursively contained in a directory?

I have a directory containing millions of files spread across a folder hierarchy. The directory lives on a large remote NFS file system. I would like to get a list of these files as quickly as possible.

Is it possible to go faster than find . > list.txt ? What factors affect the speed? I use Python, but any solution will do as long as it is fast.
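Since the question mentions Python, here is a minimal sketch using os.scandir (Python 3.5+). Where the file system supplies the entry type in the directory listing itself, scandir avoids a separate stat() round trip per entry; on a remote NFS mount those round trips are usually the dominant cost. Names here are illustrative, not a definitive implementation.

 import os

 def list_files(root):
     # os.scandir() can read each entry's type from the directory
     # listing itself (when the file system provides it), so no
     # extra stat() call per file is needed.
     stack = [root]
     while stack:
         path = stack.pop()
         try:
             with os.scandir(path) as it:
                 for entry in it:
                     if entry.is_dir(follow_symlinks=False):
                         stack.append(entry.path)
                     else:
                         yield entry.path
         except OSError:
             continue  # skip directories we cannot read

 with open("list.txt", "w") as out:
     for p in list_files("."):
         out.write(p + "\n")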

+4
2 answers

On Linux, this was the fastest for me. Use bash globbing (enabled with shopt -s globstar in bash 4.0 and later) and printf, as follows:

 printf "%s\n" directory/**/file
 printf "%s\x00" directory/**/filename-with-special-characters | xargs -0 command

It seems to be much faster than

 find directory -name file 

or

 ls -1R directory | grep file 

or even, surprisingly,

 ls directory/**/file 

This was on a local file system: an x86_64 machine with an ext4 file system on an SSD, and a directory tree of more than 600,000 directories with a few files in each.
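For comparison, Python's glob module supports the same ** pattern (glob.glob with recursive=True, Python 3.5+); a rough Python equivalent of the globbing approach above:

 import glob

 # ** matches any number of directory levels when recursive=True
 for path in glob.glob("directory/**/file", recursive=True):
     print(path)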

+3

Depending on what output you want, I recommend

 ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'

which prints a tree-like view of the directory hierarchy under the current directory (note that it lists directories, not individual files).
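If what you need is the full path of every file rather than a tree view, a plain os.walk loop in Python prints exactly that; a minimal sketch:

 import os

 # Print the full path of every file under the current directory
 for dirpath, _dirnames, filenames in os.walk("."):
     for name in filenames:
         print(os.path.join(dirpath, name))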

0

Source: https://habr.com/ru/post/1434163/

