For the record, here is another approach:
>>> n, s = 3, 'abcdabcxdabc'
>>> L = [(s[i:i+n], i) for i in range(len(s)-n+1)]
>>> L
[('abc', 0), ('bcd', 1), ('cda', 2), ('dab', 3), ('abc', 4), ('bcx', 5), ('cxd', 6), ('xda', 7), ('dab', 8), ('abc', 9)]
>>> d = {t: [i for u, i in L if u == t] for t, _ in L}
>>> d
{'abc': [0, 4, 9], 'bcd': [1], 'cda': [2], 'dab': [3, 8], 'bcx': [5], 'cxd': [6], 'xda': [7]}
>>> {k: (len(v), v) for k, v in d.items()}
{'abc': (3, [0, 4, 9]), 'bcd': (1, [1]), 'cda': (1, [2]), 'dab': (2, [3, 8]), 'bcx': (1, [5]), 'cxd': (1, [6]), 'xda': (1, [7])}
In one line:
>>> {k: (len(v), v) for L in ([(s[i:i+n], i) for i in range(len(s)-n+1)],) for k, v in ((t, [i for u, i in L if u == t]) for t, _ in L)}
{'abc': (3, [0, 4, 9]), 'bcd': (1, [1]), 'cda': (1, [2]), 'dab': (2, [3, 8]), 'bcx': (1, [5]), 'cxd': (1, [6]), 'xda': (1, [7])}
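The one-liner works because `for L in (expr,)` loops over a one-element tuple, which binds `L` exactly once inside the comprehension. A minimal sketch of that idiom on its own, reusing `s` and `n` from above (the name `counts` is only illustrative):

```python
s, n = 'abcdabcxdabc', 3

# `for L in ([...],)` iterates over a 1-tuple, so the list of substrings
# is built once and is available under the name L in the rest of the
# comprehension.
counts = {
    t: L.count(t)
    for L in ([s[i:i+n] for i in range(len(s) - n + 1)],)
    for t in L
}

print(counts['abc'], counts['dab'])  # 3 2
```

The same binding trick is what lets the one-liner refer to `L` twice without rebuilding it.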
What I would do in the "real world":
>>> def substrings(s, n):
...     d = {}
...     tis = ((s[i:i+n], i) for i in range(len(s)-n+1))
...     for t, i in tis:
...         d.setdefault(t, []).append(i)
...     return {k: (len(v), v) for k, v in d.items()}
...
>>> substrings(s, n)
{'abc': (3, [0, 4, 9]), 'bcd': (1, [1]), 'cda': (1, [2]), 'dab': (2, [3, 8]), 'bcx': (1, [5]), 'cxd': (1, [6]), 'xda': (1, [7])}
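The "real world" version differs from the one-liner on one point: the dict is built in O(n) rather than O(n^2), where n here counts the substrings (not the substring length used in the code). The one-liner rescans the whole list for every entry, while the loop appends each index exactly once.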
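To make the difference concrete, here is a hedged side-by-side sketch. The function names `substrings_quadratic` and `substrings_linear` are made up for this illustration, and the linear variant uses `collections.defaultdict` in place of `setdefault`, which is a stylistic swap rather than the code shown above. The "linear" label assumes average-case constant-time dict operations for short substrings.

```python
from collections import defaultdict

def substrings_quadratic(s, n):
    # Comprehension version: for every (substring, index) pair,
    # the inner list comprehension rescans all of L.
    L = [(s[i:i+n], i) for i in range(len(s) - n + 1)]
    d = {t: [i for u, i in L if u == t] for t, _ in L}
    return {k: (len(v), v) for k, v in d.items()}

def substrings_linear(s, n):
    # Single pass: each start index is appended to its bucket once.
    d = defaultdict(list)
    for i in range(len(s) - n + 1):
        d[s[i:i+n]].append(i)
    return {k: (len(v), v) for k, v in d.items()}

s, n = 'abcdabcxdabc', 3
result = substrings_linear(s, n)
print(result == substrings_quadratic(s, n))  # True
print(result['abc'])                         # (3, [0, 4, 9])
```

Both functions return the same mapping; only the amount of work done to build it differs, which starts to matter for long input strings.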