This generator function does what you are looking for:
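    def reverse_lines(filename, BUFSIZE=4096):
        # Binary mode allows seeking to arbitrary byte offsets;
        # lines are yielded as bytes, without the trailing newline.
        with open(filename, "rb") as f:
            f.seek(0, 2)     # jump to the end of the file
            p = f.tell()     # offset where the already-processed part starts
            remainder = b""  # partial line carried over to the next chunk
            while True:
                sz = min(BUFSIZE, p)
                p -= sz
                f.seek(p)
                buf = f.read(sz) + remainder
                if b"\n" not in buf:
                    # No complete line yet: keep accumulating.
                    remainder = buf
                else:
                    i = buf.index(b"\n")
                    # Everything after the first newline consists of complete lines.
                    for L in buf[i+1:].split(b"\n")[::-1]:
                        yield L
                    remainder = buf[:i]
                if p == 0:
                    break
            yield remainder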
It works by reading a buffer from the end of the file (4 KB by default) and yielding all the complete lines in it in reverse order. The partial line at the start of the buffer is carried over and joined to the next chunk. It then moves back another 4 KB and does the same until it reaches the beginning of the file. The code may need to hold more than 4 KB in memory if the section being processed contains no line feed (very long lines).
You can use the code as follows:

    for L in reverse_lines("my_big_file"):
        ... process L ...
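To see how partial lines are carried over between chunks, here is a minimal sketch of the same idea that runs on an in-memory stream with a deliberately tiny buffer. The name `reverse_lines_stream` and the sample data are made up for illustration; it assumes a seekable binary stream and yields `bytes`.

```python
import io


def reverse_lines_stream(f, bufsize=4096):
    """Yield the lines of a seekable binary stream, last line first."""
    f.seek(0, io.SEEK_END)
    pos = f.tell()
    remainder = b""  # partial line carried over from the previous chunk
    while pos > 0:
        size = min(bufsize, pos)
        pos -= size
        f.seek(pos)
        buf = f.read(size) + remainder
        # Split at the first newline: head may be incomplete, tail is complete.
        head, sep, tail = buf.partition(b"\n")
        if not sep:
            # No newline in this chunk: the line is longer than the buffer.
            remainder = buf
            continue
        for line in reversed(tail.split(b"\n")):
            yield line
        remainder = head
    yield remainder


# The middle line is much longer than the 4-byte buffer, so it is
# accumulated across several chunks before being yielded.
data = b"first\nsecond line is long\nthird"
lines = list(reverse_lines_stream(io.BytesIO(data), bufsize=4))
print(lines)
```

With a 4-byte buffer, the long middle line spans several reads, so `remainder` grows until a newline finally shows up. This is the "more than 4 KB in memory" case on a small scale.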