Reading a CSV in Python starting from a given line_num

I need to read a CSV file with several million lines. The file grows throughout the day. Each time I process the file (zipping each line into a dict), I start from the beginning again, although I only create dicts for the new lines.

To get to the new lines, I have to iterate over every line with the CSV reader and compare its line number with the saved "last line" number (as far as I know).

Is there a way to skip straight to that line number?

2 answers

You cannot jump to a specific line number unless the line size is fixed and you know that size. When I say you cannot, I mean you cannot without loading the entire file into memory and splitting it on \n characters.

If your CSV has a fixed row size, for example:

id,code,quantity
0001,ABC43,00100
0002,D2ZAD,00020
....

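where each line has the same length, you can seek to byte offset linesize*(linenumber+1), where linesize is the length of one line including its newline character and linenumber is the zero-based index of the data row you want to reach (the +1 skips the header line).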
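
The offset arithmetic above can be sketched as follows. This is only an illustration, assuming every line (header included) is exactly the same number of bytes and ends in a single \n; the function name read_fixed_width_row and the in-memory sample data are made up for the example.

```python
import csv
import io


def read_fixed_width_row(f, linenumber, linesize):
    """Return data row `linenumber` (zero-based) from a binary file object
    in which every line, header included, is `linesize` bytes long."""
    # Jump straight to the row; +1 skips the header line.
    f.seek(linesize * (linenumber + 1))
    raw = f.read(linesize)
    # Parse the single line with the csv module to honor CSV quoting rules.
    return next(csv.reader([raw.decode("ascii").rstrip("\r\n")]))


# In-memory stand-in for a real file opened with open(path, "rb").
data = b"id,code,quantity\n0001,ABC43,00100\n0002,D2ZAD,00020\n"
linesize = len(b"id,code,quantity\n")  # 17 bytes, newline included

row = read_fixed_width_row(io.BytesIO(data), 1, linesize)
print(row)  # ['0002', 'D2ZAD', '00020']
```

The file is opened in binary mode so that seek() works on plain byte offsets. If the file uses \r\n line endings, linesize has to count both characters.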
Otherwise, you need to read through the whole file to get to the n-th line. There is a built-in module named linecache that can help you; see: Go to a specific line in Python?
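
As a rough sketch of the variable-length case: linecache.getline returns a single line by its 1-based number, but it reads and caches the whole file in memory to do so. The second half uses itertools.islice, which is not mentioned above and is shown only as one possible way to skip rows that were already processed. The sample file and the last_line value are hypothetical.

```python
import csv
import itertools
import linecache
import os
import tempfile

# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("id,code,quantity\n1,ABC43,100\n2,D2ZAD,20\n3,XY,5\n")
    path = tmp.name

# linecache: line numbers are 1-based, so line 1 is the header here.
third_line = linecache.getline(path, 3)

# islice: skip the data rows handled earlier, build dicts for the rest.
last_line = 2  # hypothetical count of data rows processed in a previous run
with open(path, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    new_rows = [dict(zip(header, row))
                for row in itertools.islice(reader, last_line, None)]

print(third_line.strip())
print(new_rows)

# Clean up the cache and the temporary file.
linecache.clearcache()
os.remove(path)
```

Both approaches still read every skipped line from disk; they only avoid building a dict for each one. Because linecache caches file contents, a file that keeps growing would need linecache.checkcache(path) before each lookup to pick up new lines.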



Source: https://habr.com/ru/post/1526791/

