Read all the data in the dictionary:
    from collections import defaultdict
    from operator import itemgetter

    scores = defaultdict(int)
    with open('my_file.txt') as fobj:
        for line in fobj:
            name, score = line.split()
            scores[name] += int(score)
and sorting:
    for name, score in sorted(scores.items(), key=itemgetter(1), reverse=True):
        print(name, score)
prints:
    d 3
    b 2
    a 1
    c 0
Performance
To compare the performance of this answer with the Counter version from @SvenMarnach, I wrapped both approaches in functions. Here fobj is a file object open for reading. I use io.StringIO, so I/O delays should hopefully not be measured:
    from collections import Counter

    def counter(fobj):
        scores = Counter()
        fobj.seek(0)
        for line in fobj:
            key, score = line.split()
            scores.update({key: int(score)})
        return scores.most_common()

    from collections import defaultdict
    from operator import itemgetter

    def default(fobj):
        scores = defaultdict(int)
        fobj.seek(0)
        for line in fobj:
            name, score = line.split()
            scores[name] += int(score)
        return sorted(scores.items(), key=itemgetter(1), reverse=True)
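For completeness, here is a minimal sketch of how such a StringIO-backed file object could be set up for the timing runs. The sample data is an assumption, not the original my_file.txt:

```python
import io
from collections import Counter

# Hypothetical sample data standing in for my_file.txt
fobj = io.StringIO("a 1\nb 2\nd 3\nc 0\nd 0\n")

def counter(fobj):
    scores = Counter()
    fobj.seek(0)  # rewind so each timing run reads the whole "file"
    for line in fobj:
        key, score = line.split()
        scores.update({key: int(score)})
    return scores.most_common()

print(counter(fobj))  # [('d', 3), ('b', 2), ('a', 1), ('c', 0)]
```

The seek(0) matters: %timeit calls the function many times, and without rewinding every run after the first would see an exhausted file and measure an empty loop.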
Results for collections.Counter:

    %timeit counter(fobj)
    10000 loops, best of 3: 59.1 µs per loop
Results for collections.defaultdict:

    %timeit default(fobj)
    10000 loops, best of 3: 15.8 µs per loop
It looks like defaultdict is about four times faster. I would not have guessed that. But when it comes to performance, you need to measure.