Your first approach doesn't work because the groups are "consumed" when you create the list with

    list(groupby("cccccaaaaatttttsssssss"))
From the groupby docs:

"The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible."
Let me break it into stages.
    from itertools import groupby

    a = list(groupby("cccccaaaaatttttsssssss"))
    print(a)
    b = a[0][1]
    print(b)
    print('So far, so good')
    print(list(b))
    print('What?!')
Output
    [('c', <itertools._grouper object at 0xb715104c>), ('a', <itertools._grouper object at 0xb715108c>), ('t', <itertools._grouper object at 0xb71510cc>), ('s', <itertools._grouper object at 0xb715110c>)]
    <itertools._grouper object at 0xb715104c>
    So far, so good
    []
    What?!
Our itertools._grouper object at 0xb715104c is empty because it shares its contents with the "parent" iterator returned by groupby, and those elements are gone: the first call to list iterated over the whole parent, advancing it past every group.
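If you do need the contents of every group, one way around this is to turn each group into a list while groupby is still positioned on it, before it moves on to the next key. This is only a minimal sketch; the name groups is mine, not from the question.

```python
from itertools import groupby

# Materialize each group right away, while groupby is still positioned on it.
# Once groupby advances to the next key, the previous group can no longer be read.
groups = [(k, list(g)) for k, g in groupby("cccccaaaaatttttsssssss")]

print(groups[0])          # key and contents of the first group
print(len(groups[0][1]))  # its length
```

The trade-off is memory: every group is now held in a list, which is fine for short inputs like this one.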
This really is no different from what happens if you try to iterate twice over any iterator, like a simple generator expression.
    g = (c for c in 'python')
    print(list(g))
    print(list(g))
Output
    ['p', 'y', 't', 'h', 'o', 'n']
    []
By the way, here is another way to get the length of a groupby group if you don't need its contents; it's a little cheaper (and uses less RAM) than building a list just to find its length.
    from itertools import groupby

    for k, g in groupby("cccccaaaaatttttsssssss"):
        print(k, sum(1 for _ in g))
Output
    c 5
    a 5
    t 5
    s 7
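If you want to keep the counts rather than print them, the same idea fits in a list comprehension. This is a sketch only; run_lengths is a hypothetical helper name, not something from itertools. It works because each group is fully counted before groupby advances to the next key.

```python
from itertools import groupby

def run_lengths(iterable):
    # Hypothetical helper: count each group as it is produced,
    # so no group is read after groupby has moved on.
    return [(k, sum(1 for _ in g)) for k, g in groupby(iterable)]

print(run_lengths("cccccaaaaatttttsssssss"))
```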