Python algorithm for counting the appearance of a specific word in csv

I just started learning python. I am wondering what are the effective ways to count the occurrence of a particular word in a CSV file, besides just using it for a loop to go line by line and read.

To be more specific, let's say I have a CSV file containing two columns: Name and Rating, with millions of records.

How to count the appearance of β€œA” under β€œGrade”?

Python code examples will be greatly appreciated!

+4
source share
3 answers

In the main example, using csv and collections.Counter (Python 2.7+) with the standard Python libraly:

 import csv import collections grades = collections.Counter() with open('file.csv') as input_file: for row in csv.reader(input_file, delimiter=';'): grades[row[1]] += 1 print 'Number of A grades: %s' % grades['A'] print grades.most_common() 

Output (for a small data set):

 Number of A grades: 2055 [('A', 2055), ('B', 2034), ('D', 1995), ('E', 1977), ('C', 1939)] 
+7
source

As you saw, there are many ways to solve this problem.

The best way to evaluate them for speed is time using timeit . For instance:

 % python -m timeit -c 'import csv with open("./grades.csv") as grades: table = csv.DictReader(grades) sum(1 for row in table if row["Grade"] == "A") ' 10000 loops, best of 3: 129 usec per loop 
+2
source

Of course, you should read all the ratings, which in this case also means reading the entire file. You can use the csv module to easily read comma separated value files:

 import csv my_reader = csv.reader(open('my_file.csv')) ctr = 0 for record in my_reader: if record[1] == 'A': ctr += 1 print(ctr) 

This is pretty fast, and I could not have done better with the Counter method:

 from collections import Counter grades = [rec[1] for rec in my_reader] # generator expression was actually slower result = Counter(grades) print(result) 

Last but not least, lists have a count method:

 from collections import Counter grades = [rec[1] for rec in my_reader] result = grades.count('A') print(result) 
0
source

Source: https://habr.com/ru/post/1396009/


All Articles