Quickly remove the first n lines from many text files

Question


I need to create an output text file by deleting the first two lines of the input file.

I am currently using sed "1,2d" input.txt > output.txt

I need to do this for thousands of files, so I use python:

import os
for filename in somelist:
  os.system('sed "1,2d" %s-in.txt > %s-out.txt'%(filename,filename))

but it is rather slow.

I need to keep the original file, so I cannot edit it in place.

Is there a faster way to do this? Using something other than sed? Perhaps a scripting language other than Python? Should I write a short program in C, or is writing to disk likely to be the bottleneck?

+3

performance python file-io sed

Samizdis Aug 19 '10 at 12:38


3 answers

You can do this in Python directly, without launching sed:

import os
import shutil

path = '/some/path/to/files/'
for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        # skip the first two lines
        for _ in range(2):
            read.readline()
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)
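
The same idea can be wrapped in a small function that takes the number of lines to drop, matching the "first n lines" in the title. This is only a sketch: the name strip_first_lines and its parameters are made up for illustration, and it assumes the files can be treated as raw bytes with "\n" line endings. Opening in binary mode skips text decoding, which may help a little when there are thousands of files.

```python
import shutil


def strip_first_lines(src_path, dst_path, n=2):
    """Copy src_path to dst_path, leaving out the first n lines.

    Hypothetical helper, not part of the original answer.
    The source file is only read, never modified.
    """
    # Binary mode: no decoding work for data that is copied unchanged.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for _ in range(n):
            # readline() returns b"" at end of file, so short files are safe.
            src.readline()
        # Stream the remainder in large chunks instead of line by line.
        shutil.copyfileobj(src, dst)
```

With the naming from the question, a call would look like strip_first_lines(name + "-in.txt", name + "-out.txt") inside the loop over somelist.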

+4

nosklo Aug 19 '10 at 15:36

for file in *.ext
do
    sed -i.bak -n '3,$p' "$file"
done

or simply

sed -i.bak -n '3,$p' *.ext
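
For readers who would rather stay in Python, the standard fileinput module offers a rough equivalent of sed -i.bak: it rewrites each file in place and keeps the original under a backup suffix. The sketch below is an assumption-laden illustration, not part of this answer; drop_head_in_place is a hypothetical name, and the files are assumed to be readable as text in the default encoding.

```python
import fileinput
import sys


def drop_head_in_place(paths, n=2, backup=".bak"):
    """Remove the first n lines of each file, keeping path + backup as a copy.

    Hypothetical helper, roughly comparable to: sed -i.bak -n '3,$p' files
    """
    paths = list(paths)
    if not paths:
        # An empty file list would make fileinput fall back to stdin.
        return
    with fileinput.input(files=paths, inplace=True, backup=backup) as lines:
        for line in lines:
            # filelineno() restarts at 1 for every file in the list.
            if lines.filelineno() > n:
                # With inplace=True, stdout is redirected into the current file.
                sys.stdout.write(line)
```

Because the original ends up under the .bak name rather than the original name, this only suits the question if that renaming is acceptable.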

+3

ghostdog74 Aug 19 '10 at 12:56


Cascabel · Accepted Answer · 2010-08-19T12:40:46+0000

Use tail. For example:

tail -n +3 input.txt > output.txt
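
If the loop over thousands of files stays in Python, tail can be started without going through a shell, which the os.system call in the question does for every file. The following is a minimal sketch under that assumption; tail_from_line is a hypothetical name, and it assumes a tail that accepts -n +K is on the PATH. Whether this is measurably faster depends on the system, so it is worth timing against the pure-Python copy.

```python
import subprocess


def tail_from_line(src_path, dst_path, first_kept=3):
    """Write src_path from line first_kept onward into dst_path using tail.

    Hypothetical helper; first_kept=3 drops the first two lines.
    """
    with open(dst_path, "wb") as dst:
        # Argument list form: no shell is started and no quoting is needed.
        # "--" stops option parsing in case a file name begins with "-".
        subprocess.run(
            ["tail", "-n", "+%d" % first_kept, "--", src_path],
            stdout=dst,
            check=True,  # raise CalledProcessError if tail fails
        )
```

For the files in the question, each call would be tail_from_line(name + "-in.txt", name + "-out.txt").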

