Suppose I have a large list of words. For instance:
>>> with open('/usr/share/dict/words') as f:
...     words = [word for word in f.read().split('\n') if word]
If I want to build an index of these words keyed by their first letter, that is easy:
d = {}
for word in words:
    if word[0].lower() in 'aeiou':
        d.setdefault(word[0].lower(), []).append(word)
The result is something like this:
{'a':[list of 'a' words], 'e':[list of 'e' words], 'i': etc...}
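To make that shape concrete, here is a minimal sketch of the same loop run on a small made-up list. The sample words are an assumption standing in for `/usr/share/dict/words`, and the first letter is lowercased once into `c` rather than twice.

```python
# Hypothetical sample standing in for /usr/share/dict/words
words = ['Apple', 'banana', 'egg', 'Iris', 'avocado', 'umbrella', 'kiwi']

d = {}
for word in words:
    c = word[0].lower()      # lowercase the first letter once
    if c in 'aeiou':         # keep only words that start with a vowel
        # create the list on first sight of the key, then append to it
        d.setdefault(c, []).append(word)

# Each vowel key maps to its words in their original order and case
for key in sorted(d):
    print('%s: %r' % (key, d[key]))
```

Words starting with a consonant ('banana', 'kiwi') are skipped, and 'o' never appears as a key because no sample word starts with it.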
Is there a way to do this with a Python 2.7 / 3+ dict comprehension? In other words, is it possible with the dict comprehension syntax to append to the list represented by the key as the dict is being built?
For example:
index={k[0].lower():XXX for k in words if k[0].lower() in 'aeiou'}
Where XXX appends to the list for the key, or creates that list, while the index is being built.
Edit
Taking the suggestions and benchmarking them:
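import functools
from collections import defaultdict

def f1():
    # plain loop with dict.setdefault
    d = {}
    for word in words:
        c = word[0].lower()
        if c in 'aeiou':
            d.setdefault(c, []).append(word)

def f2():
    # set comprehension used only for its side effect on d
    d = {}
    {d.setdefault(word[0].lower(), []).append(word)
     for word in words if word[0].lower() in 'aeiou'}

def f3():
    # same side-effect comprehension, but on a defaultdict
    d = defaultdict(list)
    {d[word[0].lower()].append(word)
     for word in words if word[0].lower() in 'aeiou'}

def f4():
    # functools.reduce over (letter, word) pairs
    d = functools.reduce(
        lambda d, w: d.setdefault(w[0], []).append(w[1]) or d,
        ((w[0].lower(), w) for w in words if w[0].lower() in 'aeiou'),
        {})

def f5():
    # plain loop with a defaultdict
    d = defaultdict(list)
    for word in words:
        c = word[0].lower()
        if c in 'aeiou':
            d[c].append(word)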
This produces the following benchmark:
   rate/sec f4 f2 f1 f3 f5
f4       11
The straight loop with a defaultdict is the fastest, followed by the comprehension and the loop using setdefault.
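For anyone who wants to repeat the comparison, here is a minimal sketch of a timing harness built on the standard `timeit` module. It is not the harness that produced the table above. The sample list, the function names `loop_setdefault` and `loop_defaultdict` (standing in for f1 and f5), and the repetition count are all assumptions made for this sketch.

```python
import timeit
from collections import defaultdict

# Hypothetical sample; the original test used /usr/share/dict/words
words = ['Apple', 'banana', 'egg', 'Iris', 'avocado', 'umbrella', 'kiwi'] * 1000

def loop_setdefault():
    # counterpart of f1: plain loop with dict.setdefault
    d = {}
    for word in words:
        c = word[0].lower()
        if c in 'aeiou':
            d.setdefault(c, []).append(word)
    return d

def loop_defaultdict():
    # counterpart of f5: plain loop with a defaultdict
    d = defaultdict(list)
    for word in words:
        c = word[0].lower()
        if c in 'aeiou':
            d[c].append(word)
    return d

# Check that both variants build the same index before comparing speed
assert loop_setdefault() == dict(loop_defaultdict())

timings = {}
for func in (loop_setdefault, loop_defaultdict):
    # number=20 is an arbitrary repetition count chosen for this sketch
    timings[func.__name__] = timeit.timeit(func, number=20)

# Report total seconds for all repetitions, fastest first
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('%-18s %.4f s' % (name, seconds))
```

The absolute numbers depend on the machine and on the word list, so only the relative ordering is worth reading from a run like this.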
Thanks for the ideas!