Dictionaries do not keep their items sorted by value (and before Python 3.7 they had no guaranteed order at all). In addition, if you want to count elements in a data set, it is better to use collections.Counter(), which is optimized for this purpose and more Pythonic. Then you can simply use Counter.most_common(N) to get the N most common elements in the Counter object.
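As a minimal sketch of how Counter and most_common() behave, here is a small example. The sample string and the variable names are made up purely for illustration:

```python
from collections import Counter

# Hypothetical sample text, used only for illustration.
text = "banana"

# Counter accepts any iterable; iterating a string counts its characters.
counter = Counter(text)

# most_common(n) returns a list of (element, count) pairs,
# ordered from the highest count to the lowest.
top_two = counter.most_common(2)
print(top_two)  # [('a', 3), ('n', 2)]
```

Note that most_common() returns a list of tuples rather than printing anything, so you can format or filter the result however you like.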
Also, with regard to opening files, you can simply use the with statement, which automatically closes the file at the end of the block. And it's better not to print the final result inside your function; instead, you can make your function a generator by yielding the intended lines and then print them whenever you want.
    from collections import Counter

    def frequencies(filename, top_n):
        # The with statement closes the file automatically.
        with open(filename) as infile:
            content = infile.read()
        # Characters to ignore when counting.
        invalid = "''`,.?!:;-_\n’' '"
        counter = Counter(ch for ch in content if ch not in invalid)
        for letter, count in counter.most_common(top_n):
            yield '{:8} appears {} times.'.format(letter, count)
Then use a for loop to iterate over the generator:
    for line in frequencies(filename, 100):
        print(line)