Divide a list of strings into groups at random

Given a list of n strings, I want to divide it into b groups (b <= n), where each group has between i and j items (j >= i).

Example: Say

 List<string> lst=new List<string>(new string[]{"a","b","c","d"}); 

(so n = 4)

Suppose the function that does this has the signature

 List<List<string>> DivideIntoGroup(List<string> lst, int b, int i, int j) 

One possible result of DivideIntoGroup(lst, 3, 1, 2) is

 {"a"}, {"b","c"}, {"d"} 

How do I write the DivideIntoGroup function?

2 answers

I am not a C# expert, so I will give you a purely mathematical solution and hope you can translate it into your language.

Your task has two separate parts: splitting the elements into b groups of i to j elements each, and randomness. The second part is easy: shuffle the elements randomly first, then cut the shuffled list into groups. Let's move on to the interesting part:
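A minimal C# sketch of the "shuffle first, then cut" step. It assumes the group sizes have already been chosen by some other means. The names GroupSplitter, Shuffle and SplitBySizes are made up for illustration and do not come from the question.

```csharp
using System;
using System.Collections.Generic;

public static class GroupSplitter
{
    // Fisher-Yates shuffle: each permutation is equally likely,
    // up to the quality of the random number generator.
    public static void Shuffle<T>(IList<T> items, Random rng)
    {
        for (int k = items.Count - 1; k > 0; k--)
        {
            int r = rng.Next(k + 1);   // 0 <= r <= k
            (items[k], items[r]) = (items[r], items[k]);
        }
    }

    // Shuffles a copy of the list, then cuts it into consecutive
    // groups of the given sizes. The input list is left untouched.
    public static List<List<string>> SplitBySizes(
        List<string> lst, IReadOnlyList<int> sizes, Random rng)
    {
        int total = 0;
        foreach (int s in sizes) total += s;
        if (total != lst.Count)
            throw new ArgumentException("Group sizes must sum to the list length.");

        var shuffled = new List<string>(lst);
        Shuffle(shuffled, rng);

        var groups = new List<List<string>>();
        int start = 0;
        foreach (int size in sizes)
        {
            groups.Add(shuffled.GetRange(start, size));
            start += size;
        }
        return groups;
    }
}
```

For the list from the question, GroupSplitter.SplitBySizes(lst, new[] { 1, 2, 1 }, new Random()) returns three groups of sizes 1, 2 and 1 holding the four strings in random order.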

How do you split n elements into b groups containing from i to j elements each? The straightforward approach is to pick a random number between i and j as the size of the first group, then of the second, and so on. However, nothing guarantees that the last group will not end up with a size outside the range i to j. Such an approach also does not give a uniform distribution over the valid splits.

The correct approach is to choose the size of the first group with a probability proportional to the number of ways the rest of the split can be completed once that many elements are taken. So you need to know how many solutions exist in total for task(n, b, i, j), and how many exist for task(n-k, b-1, i, j) if we take k elements into the first group. Once you can compute these counts, pick each k with probability task(n-k, b-1, i, j) / task(n, b, i, j) for the first group, then do the same for the second group, and so on.

So now the question is: how many solutions exist for task(n, b, i, j)? Since task(n, b, i, j) = sum(k=i to j) task(n-k, b-1, i, j), with task(0, 0, i, j) = 1, task(n, 0, i, j) = 0 for n > 0, and terms with n-k < 0 counting as 0, you can easily compute these numbers by recursion (use dynamic programming so that no value is calculated more than once).
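The recurrence could be written in C# roughly as follows. This is only a sketch: it assumes 0 <= i <= j, and the class and method names (GroupCounter, Count) are invented for illustration. Since i and j never change, the memo table is keyed on (n, b) alone.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class GroupCounter
{
    // Number of ordered sequences of b group sizes, each between
    // i and j, that sum to n. Assumes 0 <= i <= j.
    public static BigInteger Count(int n, int b, int i, int j)
    {
        return Count(n, b, i, j, new Dictionary<(int, int), BigInteger>());
    }

    private static BigInteger Count(
        int n, int b, int i, int j, Dictionary<(int, int), BigInteger> memo)
    {
        // Base case: no groups left, so there must be no elements left either.
        if (b == 0) return n == 0 ? BigInteger.One : BigInteger.Zero;

        // Quick pruning: b groups hold at least b*i and at most b*j elements.
        if (n < (long)b * i || n > (long)b * j) return BigInteger.Zero;

        if (memo.TryGetValue((n, b), out BigInteger cached)) return cached;

        // task(n, b) = sum over k of task(n - k, b - 1).
        BigInteger total = BigInteger.Zero;
        for (int k = i; k <= j && k <= n; k++)
            total += Count(n - k, b - 1, i, j, memo);

        memo[(n, b)] = total;
        return total;
    }
}
```

For the example in the question, GroupCounter.Count(4, 3, 1, 2) gives 3, matching the three size sequences (1, 1, 2), (1, 2, 1) and (2, 1, 1).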

PS: there may be a closed-form expression for the number of solutions, but I cannot see it right away, and as long as n * b remains relatively small (< 10^6) the recursive solution should work.

EDIT
PS2: in fact, the values of task(n, b, i, j) can get quite large, so consider using big integers.
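Putting the pieces of this answer together, the group sizes could be drawn as in the sketch below. It uses System.Numerics.BigInteger for the counts and picks each size k with probability task(n-k, b-1, i, j) / task(n, b, i, j), so that every valid sequence of sizes is equally likely. SizeSampler, SampleSizes and NextBelow are hypothetical names, and 0 <= i <= j is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class SizeSampler
{
    // Counting recurrence from the answer, memoized on (n, b).
    private static BigInteger Count(
        int n, int b, int i, int j, Dictionary<(int, int), BigInteger> memo)
    {
        if (b == 0) return n == 0 ? BigInteger.One : BigInteger.Zero;
        if (n < (long)b * i || n > (long)b * j) return BigInteger.Zero;
        if (memo.TryGetValue((n, b), out BigInteger cached)) return cached;
        BigInteger total = BigInteger.Zero;
        for (int k = i; k <= j && k <= n; k++)
            total += Count(n - k, b - 1, i, j, memo);
        memo[(n, b)] = total;
        return total;
    }

    // Uniform random integer in [0, limit), limit > 0, by rejection sampling.
    private static BigInteger NextBelow(BigInteger limit, Random rng)
    {
        byte[] bytes = limit.ToByteArray();   // little-endian, sign bit in the top byte
        byte top = bytes[bytes.Length - 1];
        byte mask = 0;                        // smallest 2^m - 1 that covers the top byte
        while (mask < top) mask = (byte)((mask << 1) | 1);
        BigInteger candidate;
        do
        {
            rng.NextBytes(bytes);
            bytes[bytes.Length - 1] &= mask;  // keeps the candidate non-negative
            candidate = new BigInteger(bytes);
        } while (candidate >= limit);
        return candidate;
    }

    // Draws b sizes in [i, j] summing to n, uniformly over all valid sequences.
    public static List<int> SampleSizes(int n, int b, int i, int j, Random rng)
    {
        var memo = new Dictionary<(int, int), BigInteger>();
        if (Count(n, b, i, j, memo).IsZero)
            throw new ArgumentException("No valid split exists.");

        var sizes = new List<int>();
        while (b > 0)
        {
            // ticket selects one of the task(n, b) remaining solutions.
            BigInteger ticket = NextBelow(Count(n, b, i, j, memo), rng);
            for (int k = i; k <= j && k <= n; k++)
            {
                BigInteger ways = Count(n - k, b - 1, i, j, memo);
                if (ticket < ways)
                {
                    sizes.Add(k);
                    n -= k;
                    b--;
                    break;
                }
                ticket -= ways;
            }
        }
        return sizes;
    }
}
```

The resulting sizes can then be used to cut a shuffled copy of the list into groups.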


Here is what I would do as a solution (this is pseudocode, of course):

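 func( n, b, i, j ) {
     if (n == 0) return                      // finished
     if (i > j or i > min(j, n)) return      // no solution possible down this path
     if (b == 1) {                           // last group: avoids dividing by b - 1 = 0
         // form the last group from the n remaining elements
         return
     }
     out = choose_random_between(i, min(j, n))
     current_ave_of_cells_per_group = (n - out) / (b - 1)
     if current_ave_of_cells_per_group < i
         func(n, b, i, min(out - 1, n))      // 'out' was too large, retry with a smaller upper bound
     else if current_ave_of_cells_per_group > j
         func(n, b, out + 1, min(j, n))      // 'out' was too small, retry with a larger lower bound
     else
         // form the group consisting of 'out' elements
         func(n - out, b - 1, i, min(j, n - out))
 }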
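A possible C# take on the same greedy idea, written as a variant rather than a direct translation of the pseudocode: instead of drawing a size and retrying when the remaining average falls outside i to j, it computes the feasible range for each group up front and draws from that. GreedyGrouper is an invented name, and the extra Random parameter is an assumption added to the signature from the question. Note that this does not make every valid split equally likely, and it keeps the elements in their original order, so shuffle the list first if the grouping itself should be random.

```csharp
using System;
using System.Collections.Generic;

public static class GreedyGrouper
{
    public static List<List<string>> DivideIntoGroup(
        List<string> lst, int b, int i, int j, Random rng)
    {
        int n = lst.Count;
        // A split exists only if b*i <= n <= b*j.
        if (b < 1 || i < 0 || i > j || n < (long)b * i || n > (long)b * j)
            throw new ArgumentException("No valid split exists for these parameters.");

        var groups = new List<List<string>>(b);
        int start = 0;
        for (int remainingGroups = b; remainingGroups > 0; remainingGroups--)
        {
            int remainingItems = n - start;
            int others = remainingGroups - 1;

            // The other groups must be able to absorb what is left over:
            // others * i <= remainingItems - size <= others * j.
            int lo = (int)Math.Max(i, remainingItems - (long)others * j);
            int hi = (int)Math.Min(j, remainingItems - (long)others * i);

            int size = rng.Next(lo, hi + 1);   // upper bound of Next is exclusive
            groups.Add(lst.GetRange(start, size));
            start += size;
        }
        return groups;
    }
}
```

With the list from the question, GreedyGrouper.DivideIntoGroup(lst, 3, 1, 2, new Random()) always returns three groups of one or two items each that together cover all four strings.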

Source: https://habr.com/ru/post/1434449/

