How to combine a huge number of files

I would like to merge my files. I use

cat *.txt > newFile 

But I have almost 500,000 files, and the shell complains that

 Argument list too long. 

Is there an efficient and fast way to merge half a million files?

thanks

+4
2 answers

If your directory structure is shallow (no subdirectories), you can simply:

 find . -name '*.txt' -exec cat {} \; > newFile 

(Matching only '*.txt' also keeps find from feeding newFile itself back into the output.)

If you have subdirectories, you can limit the depth of the search, or you can consider moving some files into subdirectories so that you don't run into this problem in the first place!
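For example, assuming a find that supports the -maxdepth option (a GNU/BSD extension, not POSIX), limiting the search to the top level looks like this:

 find . -maxdepth 1 -name '*.txt' -exec cat {} \; > newFile 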

This is not particularly efficient, and some versions of find allow:

 find . -name '*.txt' -exec cat {} \+ > newFile 

for greater efficiency. (Note that the backslash before + is not required, but I find the symmetry with the previous example pleasing.)
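If your find does not support the + form, you can get the same batching effect by piping to xargs instead; this is a sketch assuming a find and xargs with null-delimiter support (-print0/-0), which the GNU and BSD versions have:

 find . -name '*.txt' -print0 | xargs -0 cat > newFile 

xargs packs as many filenames as fit into each cat invocation, so the number of processes stays small regardless of the file count.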

+8

You can do it in a loop:

 for a in *.txt ; do cat "$a" >> newFile ; done 

This has the disadvantage of spawning a new cat instance for each file, which can be expensive, but if the files are reasonably large the I/O time should dominate the processor time required to create new processes.

If ordering matters, I would recommend creating a file listing the filenames in the correct order, since I am not 100% sure about the ordering guarantees of shell globbing (as used in the question).
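A minimal sketch of that approach, assuming the filenames contain no spaces, quotes, or newlines (xargs splits on blanks and interprets quote characters), and using filelist as an illustrative name:

 # printf is a shell builtin, so half a million glob matches never hit the exec limit
 printf '%s\n' *.txt > filelist
 # inspect or reorder filelist here, then concatenate in that order;
 # xargs batches the names so each cat invocation stays under ARG_MAX
 xargs cat < filelist > newFile 

Note that filelist deliberately does not end in .txt, so rerunning the first command won't pick up the list itself.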

+1

Source: https://habr.com/ru/post/1501265/

