Iterator over all sections in k groups?

Question

Iterator over all sections in k groups?

Let's say I have a list L. How can I get an iterator over all sections of groups K?

Example: L = [2,3,5,7,11,13], K = 3

List of all possible sections from 3 groups:

[ [ 2 ], [ 3, 5], [ 7,11,13] ] [ [ 2,3,5 ], [ 7, 11], [ 13] ] [ [ 3, 11 ], [ 5, 7], [ 2, 13] ] [ [ 3 ], [ 11 ], [ 5, 7, 2, 13] ] etc...

=== UPDATE ===

I was working on a solution that seems to work, so I just copy it.

 # -*- coding: utf-8 -*- import itertools # return ( list1 - list0 ) def l1_sub_l0( l1, l0 ) : """Substract two lists""" # copy_l1 = list( l1 ) copy_l0 = list( l0 ) # for xx in l0 : # if copy_l1.count( xx ) > 0 : # copy_l1.remove( xx ) copy_l0.remove( xx ) # return [ copy_l1, copy_l0 ] # def gen_group_len( n, k ) : """Generate all possible group sizes""" # avoid doubles stop_list = [] # for t in itertools.combinations_with_replacement( xrange( 1, n - 1 ), k - 1 ) : # last_n = n - sum( t ) # valid group size if last_n >= 1 : res = tuple( sorted( t + ( last_n, ) ) ) # if res not in stop_list : yield res stop_list.append( res ) # group_len = (1, 1, 3) def gen( group_len, my_list ) : """Generate all possible partitions of all possible group sizes""" # if len( group_len ) == 1 : yield ( tuple( my_list ), ) # else : # need for a stop list if 2 groups of same size stop_list = [] # for t in itertools.combinations( my_list, group_len[ 0 ] ) : # reduced_list = l1_sub_l0( my_list, t )[ 0 ] # for t2 in gen( group_len[ 1: ], reduced_list ) : # tmp = set( ( t, t2[ 0 ] ) ) # if tmp not in stop_list : yield ( t, ) + t2 # avoid doing same thing twice if group_len[ 1 ] == group_len[ 0 ] : stop_list.append( tmp ) # my_list = [ 3,5,7,11,13 ] n = len( my_list ) k = 3 # group_len_list = list( gen_group_len( n, k ) ) print "for %i elements, %i configurations of group sizes" % ( n, len( group_len_list ) ) print group_len_list # for group_len in group_len_list : # print "group sizes", group_len # for x in gen( group_len, my_list ) : print x # print "==="

Exit:

 for 5 elements, 2 configurations of group sizes [(1, 1, 3), (1, 2, 2)] group sizes (1, 1, 3) ((3,), (5,), (7, 11, 13)) ((3,), (7,), (5, 11, 13)) ((3,), (11,), (5, 7, 13)) ((3,), (13,), (5, 7, 11)) ((5,), (7,), (3, 11, 13)) ((5,), (11,), (3, 7, 13)) ((5,), (13,), (3, 7, 11)) ((7,), (11,), (3, 5, 13)) ((7,), (13,), (3, 5, 11)) ((11,), (13,), (3, 5, 7)) === group sizes (1, 2, 2) ((3,), (5, 7), (11, 13)) ((3,), (5, 11), (7, 13)) ((3,), (5, 13), (7, 11)) ((5,), (3, 7), (11, 13)) ((5,), (3, 11), (7, 13)) ((5,), (3, 13), (7, 11)) ((7,), (3, 5), (11, 13)) ((7,), (3, 11), (5, 13)) ((7,), (3, 13), (5, 11)) ((11,), (3, 5), (7, 13)) ((11,), (3, 7), (5, 13)) ((11,), (3, 13), (5, 7)) ((13,), (3, 5), (7, 11)) ((13,), (3, 7), (5, 11)) ((13,), (3, 11), (5, 7)) ===

+6

python data-partitioning

usual me Aug 21 '13 at 9:08

source share

4 answers

A simple alternative representation of this problem is to assign one of three cluster labels to each element.

 import itertools def neclusters(l, k): for labels in itertools.product(range(k), repeat=len(l)): partition = [[] for i in range(k)] for i, label in enumerate(labels): partition[label].append(l[i]) yield partition

as with @val's answer, this can be wrapped to remove partitions with empty clusters.

+2

joeln Apr 05 '15 at 11:54

source share

Edited: as @moose noted, the following only identify sections in which adjacent indexes are in the same cluster. Performing this partition into all permutations will give the desired answer.

Itertools is very useful for this kind of combinatorial listing. First, we consider your task as the equivalent task of selecting all sets of k-1 individual split points in an array. This is allowed by itertools.combinations , which returns combinations without replacing a specific size from a given iteration, and the return values are in the order that they were found in the original iterable.

Thus, your problem is solved as follows:

 import itertools def neclusters(l, K): for splits in itertools.combinations(range(len(l) - 1), K - 1): # splits need to be offset by 1, and padded splits = [0] + [s + 1 for s in splits] + [None] yield [l[s:e] for s, e in zip(splits, splits[1:])]

numpy split to make these kinds of partitions subject to offset offsets, so here's an alternative that generates lists of numpy arrays:

 import itertools def neclusters(l, K): for splits in itertools.combinations(range(len(l) - 1), K - 1): yield np.split(l, 1 + np.array(splits))

+1

joeln Mar 2 '14 at 4:54

source share

Filter out partitions of size k using more_itertools.partitions (note the trailing "s"):

Given

 import itertools as it import more_itertools as mit iterable = [2, 3, 5, 7, 11] k = 3

demonstration

 res = [p for perm in it.permutations(iterable) for p in mit.partitions(perm) if len(p) == k] len(res) # 720 res # [[[2], [3], [5, 7, 11]], # [[2], [3, 5], [7, 11]], # [[2], [3, 5, 7], [11]], # [[2, 3], [5], [7, 11]], # [[2, 3], [5, 7], [11]], # [[2, 3, 5], [7], [11]], # ... # [[3], [2], [5, 7, 11]], # [[3], [2, 5], [7, 11]], # [[3], [2, 5, 7], [11]], # [[3, 2], [5], [7, 11]], # [[3, 2], [5, 7], [11]], # [[3, 2, 5], [7], [11]], # [[3], [2], [5, 11, 7]], # ... # ]

This version provides permutation entry sections. Sections of repeating elements may be included, for example [[3,], [5,], [7, 11, 13]] and [[7, 11, 13]], [3,], [5,]] .

Note: more_itertools is a third-party package. Install via > pip install more_itertools .

+1

pylang Apr 18 '19 at 21:00

source share

val · Accepted Answer · 2013-08-21T10:03:26+0000

This works, although it is probably superintensive (I sort them all to avoid double counting):

 def clusters(l, K): if l: prev = None for t in clusters(l[1:], K): tup = sorted(t) if tup != prev: prev = tup for i in xrange(K): yield tup[:i] + [[l[0]] + tup[i],] + tup[i+1:] else: yield [[] for _ in xrange(K)]

It also returns empty clusters, so you probably want to wrap this to get only non-empty ones:

 def neclusters(l, K): for c in clusters(l, K): if all(x for x in c): yield c

Counting only for verification:

 def kamongn(n, k): res = 1 for x in xrange(nk, n): res *= x + 1 for x in xrange(k): res /= x + 1 return res def Stirling(n, k): res = 0 for j in xrange(k + 1): res += (-1)**(kj) * kamongn(k, j) * j ** n for x in xrange(k): res /= x + 1 return res >>> sum(1 for _ in neclusters([2,3,5,7,11,13], K=3)) == Stirling(len([2,3,5,7,11,13]), k=3) True

It works!

Exit:

 >>> clust = neclusters([2,3,5,7,11,13], K=3) >>> [clust.next() for _ in xrange(5)] [[[2, 3, 5, 7], [11], [13]], [[3, 5, 7], [2, 11], [13]], [[3, 5, 7], [11], [2, 13]], [[2, 3, 11], [5, 7], [13]], [[3, 11], [2, 5, 7], [13]]]

Iterator over all sections in k groups?

More articles: