Split a large text file every n lines

I have a folder containing several text files. I am trying to split every text file into pieces of 10,000 lines each, keeping the base file name, i.e. if filename1.txt contains 20,000 lines, the output will be filename1-1.txt (10,000 lines) and filename1-2.txt (10,000 lines).

I tried to use split -10000 filename1.txt, but this does not preserve the base file name, and I have to repeat the command for each text file in the folder. I also tried for f in *.txt; do split -10000 $f.txt; done. That didn't work either.

Any idea how I can do this? Thank you

2 answers
for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done

Or, spread over several lines:

for f in filename*.txt
do
    split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"
done

How it works:

  • -d tells split to use numeric suffixes

  • -a1 tells split to use suffixes that are one character long

  • -l10000 tells split to split every 10,000 lines

  • --additional-suffix=.txt tells split to append .txt to each output file name

  • "$f" is the input file for split to read

  • "${f%.txt}-" is the output prefix: the input file name with the .txt suffix removed and a hyphen appended
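The "${f%.txt}-" prefix is plain POSIX parameter expansion, so you can check what split will receive before running it. A minimal sketch:

```shell
# ${var%pattern} removes the shortest match of pattern from the end of $var.
f=filename1.txt
prefix="${f%.txt}-"   # strip the trailing ".txt", append a hyphen
echo "$prefix"        # prints: filename1-
```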

As an example, suppose we start with these files:

$ ls
filename1.txt  filename2.txt

We run the command:

$ for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done

After split has run, we have:

$ ls
filename1-0.txt  filename1-1.txt  filename1.txt  filename2-0.txt  filename2-1.txt  filename2.txt

If your version of split does not support --additional-suffix, you can rename the output files in a separate step:

for f in filename*.txt
do 
    split -d -a1 -l10000 "$f" "${f%.txt}-"
    for g in "${f%.txt}-"*
    do 
        mv "$g" "$g.txt"
    done
done
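The split-then-rename loop can be tried end to end on a toy input. A self-contained sketch, assuming GNU coreutils split and with the numbers scaled down (10 lines per piece, one 25-line file) so the result is easy to inspect:

```shell
# Scaled-down demo of the split-then-rename loop above.
tmp=$(mktemp -d) && cd "$tmp"
seq 25 > filename1.txt                     # sample input: 25 lines

for f in filename*.txt
do
    split -d -a1 -l10 "$f" "${f%.txt}-"    # produces filename1-0, -1, -2
    for g in "${f%.txt}-"*
    do
        mv "$g" "$g.txt"                   # rename each piece to .txt
    done
done

ls
```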

Alternatively, with awk:

awk 'FNR%10000==1{if(FNR==1)c=0; close(out); out=FILENAME; sub(/\.txt$/,"-"++c".txt",out)} {print > out}' *.txt
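Spelled out with comments, the same awk program reads as follows. A sketch that builds its own 12,000-line sample file so the chunk boundary is visible:

```shell
tmp=$(mktemp -d) && cd "$tmp"
seq 12000 > filename1.txt        # sample input: 12000 lines

awk '
FNR % 10000 == 1 {               # first line of each 10000-line chunk
    if (FNR == 1) c = 0          # entering a new input file: reset counter
    close(out)                   # close the previous chunk, if any
    out = FILENAME
    sub(/\.txt$/, "-" ++c ".txt", out)   # filename1.txt -> filename1-1.txt
}
{ print > out }
' *.txt

wc -l filename1-*.txt
```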

Source: https://habr.com/ru/post/1613896/
