What is wrong with this Python program that processes a .csv file?

I have a text file with a list of strings.

I want to search the .csv file for rows that start with these strings and put them in a new .csv file.

In this case, the text file is called "output.txt", the source .csv is "input.csv", and the new .csv file is "corrected.csv".

The code:

    import csv
    file = open('output.txt')

    while 1:
        line = file.readline()
        writer = csv.writer(open('corrected.csv','wb'), dialect = 'excel')
        for row in csv.reader('input.csv'):
            if not row[0].startswith(line):
                writer.writerow(row)
        writer.close()
        if not line:
            break
        pass

The error:

    Traceback (most recent call last):
      File "C:\Python32\Sample Program\csvParser.py", line 9, in <module>
        writer.writerow(row)
    TypeError: 'str' does not support the buffer interface

New error:

    Traceback (most recent call last):
      File "C:\Python32\Sample Program\csvParser.py", line 12, in <module>
        for row in reader:
    _csv.Error: line contains NULL byte

That problem was that the CSV file had been saved with tabs instead of commas. Now there is a new problem:

    Traceback (most recent call last):
      File "C:\Python32\Sample Program\csvParser.py", line 13, in <module>
        if row[0] not in lines:
    IndexError: list index out of range

The CSV file contains more than 500 data records ... does it matter?

+1
python csv
Oct 21 '11 at 18:20
4 answers

If you look at the documentation, here is how the reader is initialized:

 spamReader = csv.reader(open('eggs.csv', 'r'), ... 

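Note the open('eggs.csv', 'r') call: csv.reader is given an open file object. In your csv.reader('input.csv') call you pass the file name instead, so the string itself is treated as the input and parsed character by character, and the file is never read.

Replace that call with: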

 csv.reader(open('input.csv', 'r', newline = '')) 
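For illustration, a minimal sketch of the same idea on Python 3, where both the reader and the writer get file objects opened in text mode with newline=''. Opening the output in text mode rather than 'wb' is also what avoids the "'str' does not support the buffer interface" error from the question. The helper name copy_rows is made up for this example and is not part of the csv module.

```python
import csv

def copy_rows(src_path, dst_path):
    """Copy every row of src_path to dst_path; return the number of rows copied.

    copy_rows is a hypothetical helper used only for this illustration.
    """
    # On Python 3 the csv module works on text, so the files are opened with
    # 'r'/'w' (not 'rb'/'wb'), and newline='' leaves line endings to csv.
    with open(src_path, 'r', newline='') as src, \
         open(dst_path, 'w', newline='') as dst:
        reader = csv.reader(src)                  # a file object, not a file name
        writer = csv.writer(dst, dialect='excel')
        count = 0
        for row in reader:
            writer.writerow(row)
            count += 1
    return count
```

With the file names from the question, this would be called as copy_rows('input.csv', 'corrected.csv').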
+6
Oct 21 '11 at 18:27

csv.reader does not open the file for you; it expects a file object. A better solution would be the following:

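    import csv

    # Read the strings to match, one per line, without the trailing newline.
    lines = []
    with open('output.txt', 'r') as f:
        for line in f:
            lines.append(line.rstrip('\n'))

    # newline='' lets the csv module handle line endings itself (Python 3).
    with open('corrected.csv', 'w', newline='') as correct:
        writer = csv.writer(correct, dialect='excel')
        with open('input.csv', 'r', newline='') as mycsv:
            reader = csv.reader(mycsv)
            for row in reader:
                # Keep only rows whose first field is not one of the strings.
                if row[0] not in lines:
                    writer.writerow(row)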
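The code above compares the first field for exact equality with the strings, while the question talks about rows that start with them. A possible sketch of prefix matching follows, relying on the fact that str.startswith accepts a tuple of prefixes. The names load_prefixes, filter_rows and keep_matches are assumptions made for this example; whether matching or non-matching rows should be kept depends on what the asker actually wants.

```python
import csv

def load_prefixes(txt_path):
    """Return the non-blank lines of txt_path as a tuple of prefixes."""
    with open(txt_path, 'r') as f:
        return tuple(line.rstrip('\n') for line in f if line.strip())

def filter_rows(csv_path, out_path, prefixes, keep_matches=True):
    """Write the rows whose first field starts with any prefix.

    With keep_matches=False the rows that do NOT match are written instead.
    Both function names are hypothetical, chosen for this illustration.
    """
    with open(csv_path, 'r', newline='') as src, \
         open(out_path, 'w', newline='') as dst:
        writer = csv.writer(dst, dialect='excel')
        for row in csv.reader(src):
            if not row:
                continue  # an empty input line yields an empty row
            # startswith() with a tuple is True if any one prefix matches.
            matches = row[0].startswith(prefixes)
            if matches == keep_matches:
                writer.writerow(row)
```

With the files from the question: filter_rows('input.csv', 'corrected.csv', load_prefixes('output.txt')).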
+2
Oct 21 '11 at 18:27

Your last problem:

        if row[0] not in lines:
    IndexError: list index out of range

The error message mentions a list index.
The only list index it could be referring to is 0.
If 0 is out of range, then len(row) must be zero.
If len(row) is zero, then the corresponding line in the input file must be empty.
If a line in the input file is empty, what do you want to do?

(a) ignore that input line completely?
(b) raise a (fatal) error?
(c) log an error message and continue?
(d) something else?
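Options (a) to (c) could be sketched roughly as below. The generator name rows_with_policy and its on_empty parameter are invented for this example. The sketch relies only on csv.reader returning an empty list for a blank line and on the reader's line_num attribute.

```python
import csv
import logging

def rows_with_policy(lines, on_empty='skip'):
    """Yield parsed rows; handle empty rows as 'skip', 'raise' or 'log'.

    `lines` is any iterable of strings, for example an open text file.
    The function name and the policy names are hypothetical.
    """
    reader = csv.reader(lines)
    for row in reader:
        if row:
            yield row
        elif on_empty == 'raise':
            # (b) treat an empty line as a fatal error
            raise ValueError('empty row at line %d' % reader.line_num)
        elif on_empty == 'log':
            # (c) report the empty line and carry on
            logging.warning('skipping empty row at line %d', reader.line_num)
        # (a) on_empty == 'skip': drop the empty row silently
```

The loop from the earlier code would then iterate over rows_with_policy(mycsv) instead of over the reader directly, so row[0] is only evaluated for non-empty rows.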

0
Oct. 25 2018-11-11T00:

Try this:

    import csv
    import cStringIO

    file = open('output.txt')

    while True:
        line = file.readline()
        buf = cStringIO.StringIO()
        writer = csv.writer(buf, dialect = 'excel')
        for row in csv.reader(open('input.csv')):
            if not row[0].startswith(line):
                writer.writerow(row)
        writer.close()
        output = open('corrected.csv', 'wb')
        output.write(buf.getvalue())
        if not line:
            break
        pass

In my experience, building the whole output in a cStringIO buffer and then writing the entire buffer to the file in one go is faster.
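The cStringIO module exists only on Python 2, while the tracebacks in the question come from Python 3.2. On Python 3 the same buffering idea might look like the sketch below, which uses io.StringIO instead. The function name write_via_buffer is an assumption for this example, and whether buffering is actually faster is not measured here.

```python
import csv
import io

def write_via_buffer(rows, out_path):
    """Serialize rows into an in-memory buffer, then write it out once.

    write_via_buffer is a hypothetical helper for this illustration.
    """
    buf = io.StringIO(newline='')        # text buffer, no newline translation
    writer = csv.writer(buf, dialect='excel')
    for row in rows:
        writer.writerow(row)
    # csv.writer objects have no close() method; only the file is closed,
    # which the with statement takes care of.
    with open(out_path, 'w', newline='') as out:
        out.write(buf.getvalue())
```

Here rows can be any iterable of rows, such as the filtered output of a csv.reader loop.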

-2
Oct 21 '11 at 18:39


