If I understand your definition of "sparse" correctly, this function should be exactly what you want:
    # python ≥ 2.5
    import itertools, heapq

    def make_sparse(sequence):
        grouped = sorted(sequence)
        item_counts = []
        for item, item_seq in itertools.groupby(grouped):
            count = max(enumerate(item_seq))[0] + 1
            item_counts.append((-count, item))
These examples produce the following output:
    ['duck', 'goose', 'duck', 'goose', 'duck', 'dog', 'duck', 'goose']
    ['duck', 'goose', 'duck', 'goose', 'duck', 'dog', 'duck', 'goose', 'duck', 'dog', 'duck', 'goose']
    ['a', 'b', 'a', 'c', 'a', 'b', 'a', 'c', 'a', 'a']
A quick note: in the first and second examples, reversing the output order might look more optimal.
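For anyone who wants to try the idea end to end, here is a minimal Python 3 sketch of one way the counting step could feed an interleaving loop. It is an assumption about how the `(-count, item)` pairs might be used, not necessarily the original code: the pairs go into a `heapq` min-heap, the most frequent remaining item is popped each turn, and the item just emitted is held back for one turn so it cannot repeat immediately. The name `make_sparse_sketch` and the use of `collections.Counter` for counting are my own choices for illustration.

```python
import heapq
from collections import Counter


def make_sparse_sketch(sequence):
    """Yield the items of `sequence`, trying to keep equal items apart.

    Hypothetical sketch: items must be hashable, and mutually orderable
    so that ties between equal counts can be broken by the heap.
    """
    # Negated counts make heapq (a min-heap) pop the most frequent item first.
    heap = [(-count, item) for item, count in Counter(sequence).items()]
    heapq.heapify(heap)
    if not heap:
        return

    # The "held" item is the one emitted last; it stays out of the heap
    # for one turn so that it is not emitted twice in a row.
    held_count, held_item = heapq.heappop(heap)
    yield held_item
    held_count += 1

    while heap:
        count, item = heapq.heappop(heap)
        yield item
        count += 1
        # Put the previously held item back if it still has copies left.
        if held_count < 0:
            heapq.heappush(heap, (held_count, held_item))
        held_count, held_item = count, item

    # Only one distinct item remains, so repeats are unavoidable here.
    while held_count < 0:
        yield held_item
        held_count += 1


# Hypothetical input with four ducks, three geese and one dog.
print(list(make_sparse_sketch(['duck'] * 4 + ['goose'] * 3 + ['dog'])))
```

With inputs that have the same item counts as the lists shown above (the exact original inputs are assumed), this sketch yields the same orderings. The trailing `'a', 'a'` in the third list shows the limit of the approach: once a single item outnumbers all the others combined, some adjacent repeats cannot be avoided.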