Search csv file for specific items in columns

Question

I am trying to create a loop that searches through a csv file for rows with a common third and fourth column and performs an operation on them.
The file I have is as follows:

name1,x,y,z,notes
name2,a,b,c,notes
name3,a,y,z,notes

My code reads the first row, takes row[2] and row[3] from it, and then searches every row in the file for that combination of values. Unfortunately, I cannot figure out how to test for them.

for row in csvfile:
    row_identify = row[2:3]
    for row in csvfile:
        if row_identify in row:
            print row
        else:
            print "not here"

I want it to print the first and third lines (since y and z would be row_identify). I assumed I could simply test whether row_identify is in each row, but this does not seem to work. I also tried using

row_identify =  str(row[2]),str(row[3])

but it doesn’t work either.

python csv

jcross Sep 28 '15 at 4:20

ozgur · Answer 1 · 2015-09-28T04:34:36+0000

You can group the rows by their third and fourth columns, using the pair of values as a dictionary key:

>>> import collections
>>> similarities = collections.defaultdict(list)

>>> for row in csvfile:
...     similarities[(row[2], row[3])].append(row)

>>> print similarities 
{('y', 'z'): [['name1', 'x', 'y', 'z', 'notes'], 
              ['name3', 'a', 'y', 'z', 'notes']], 
 ('b', 'c'): [['name2', 'a', 'b', 'c', 'notes']]
}
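
For anyone who wants to run the grouping idea end to end, here is a minimal sketch. It assumes Python 3 and that the rows come from `csv.reader`; the `SAMPLE` string and the `group_by_columns` helper are hypothetical names made up for this illustration, with `io.StringIO` standing in for a real file.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question; io.StringIO stands in for a real file.
SAMPLE = "name1,x,y,z,notes\nname2,a,b,c,notes\nname3,a,y,z,notes\n"


def group_by_columns(lines, first=2, second=3):
    """Group parsed CSV rows by the pair of values at two column indices."""
    groups = defaultdict(list)
    for row in csv.reader(lines):
        # The (third, fourth) column pair is the dictionary key.
        groups[(row[first], row[second])].append(row)
    return groups


groups = group_by_columns(io.StringIO(SAMPLE))
for key, rows in groups.items():
    # Show each key with the names of the rows that share it.
    print(key, [row[0] for row in rows])
```

Any key whose list holds more than one row marks a set of rows with a common third and fourth column.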

inspectorG4dget · Answer 2 · 2015-09-28T04:50:47+0000

If you want the rows whose 3rd and 4th columns match those of the first row, you could do this:

import csv
import operator

key = operator.itemgetter(2,3)
with open('path/to/input') as infile:
    rows = csv.reader(infile)
    holyGrail = key(next(rows))
    for row in rows:
        if key(row) != holyGrail:
            continue
        do_stuff(row)
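
The same filter can be tried without a file on disk. The sketch below is only an illustration under a few assumptions: Python 3, a hypothetical helper named `matching_rows`, and the question's sample data in a hypothetical `SAMPLE` string. Unlike the loop above, it also returns the first row itself, since the question asks for both the first and third lines.

```python
import csv
import io
import operator

# Sample data copied from the question; io.StringIO stands in for a real file.
SAMPLE = "name1,x,y,z,notes\nname2,a,b,c,notes\nname3,a,y,z,notes\n"


def matching_rows(lines, columns=(2, 3)):
    """Return the first row plus every later row sharing its key columns."""
    key = operator.itemgetter(*columns)
    reader = csv.reader(lines)
    try:
        first = next(reader)
    except StopIteration:
        return []  # empty input: nothing to compare against
    target = key(first)
    # Keep the first row, then every later row whose key columns match it.
    return [first] + [row for row in reader if key(row) == target]


matches = matching_rows(io.StringIO(SAMPLE))
for row in matches:
    print(row)
```

Comparing the whole key, as `key(row) == target` does, checks both columns at their positions, whereas `row_identify in row` only asks whether a single element of the row equals `row_identify`.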

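If, on the other hand, you want to group all rows by those columns and report which rows share each pair, you could do this: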

import csv
import operator
from collections import defaultdict as dd

key = operator.itemgetter(2,3)
info = operator.itemgetter(0,1)
similarities = dd(list)
with open('path/to/input') as infile:
    for i,row in enumerate(csv.reader(infile)):
        similarities[key(row)].append((i,info(row)))

for k, rows in similarities.items():
    print("The following rows all have the id <{}> (the data follows):".format(k), ', '.join([str(i) for i, _ in rows]))
    print('\n'.join(['\t' + '\t'.join(row) for _, row in rows]))
