How to divide a sequence of numbers into N groups so that the sum of each group is as close as possible to the mean of those sums?

I have the following problem: I have M numbers arranged in a row. I need to split the row into N groups so that the sums of the numbers in each group are as close as possible to the mean of those sums, according to some metric. The exact metric is not important: we can minimize the sum of absolute differences, or the variance, etc., depending on which leads to the simplest solution.

A similar problem is set partitioning, which is NP-hard. However, here we have an additional constraint: each group must consist of consecutive numbers, so there may be a solution that does not require a brute-force search. The numbers are large.

EDIT

Example:

Numbers: 1 2 3 4 5 6 7 8 9 10, must be divided into 3 groups

Let's say we want to minimize the sum of absolute differences (SAD).

Groups: (1) 1 2 3 4 5 6 (sum = 21); (2) 7 8 (sum = 15); (3) 9 10 (sum = 19)

Mean = (21 + 15 + 19) / 3 = 18.33, SAD = (21 - 18.33) + (18.33 - 15) + (19 - 18.33) = 6.67. This is what we want to minimize.
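The arithmetic in the example can be checked with a few lines of JavaScript. This is only a sketch: the function name sumOfAbsoluteDifferences and the shape of its return value are made up for illustration.

```javascript
// Sum of absolute differences (SAD) between each group's sum
// and the mean of those sums. Hypothetical helper, not from the question.
function sumOfAbsoluteDifferences(groups) {
  const sums = groups.map(g => g.reduce((acc, x) => acc + x, 0));
  const mean = sums.reduce((acc, s) => acc + s, 0) / sums.length;
  const sad = sums.reduce((acc, s) => acc + Math.abs(s - mean), 0);
  return { sums, mean, sad };
}

// The grouping from the example: (1..6), (7, 8), (9, 10).
const example = sumOfAbsoluteDifferences([[1, 2, 3, 4, 5, 6], [7, 8], [9, 10]]);
console.log(example.sums, example.mean.toFixed(2), example.sad.toFixed(2));
```

Note that the mean of the group sums is always the total divided by the number of groups, wherever the cuts are placed.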

5 answers

Once you know what the target sum should be, you can build groups close to that sum. If your metric is well behaved, you should be able to use binary search to find what the actual target sum is. When you are aiming for a particular sum, you can walk through the list adding numbers to a group until the group's sum passes the target. Then either keep or drop that last number. Go through the whole list this way and see which group sums deviate most from the target. Then go back through the list, trying combinations of group sizes that fall within that deviation. It should be fast enough. Otherwise, use dynamic programming.
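The answer does not spell out the dynamic programming fallback, so the following is only one possible formulation, assuming the metric is the SAD. It relies on the target being fixed at total / N wherever the cuts fall, so each group's cost can be added up independently. The names bestConsecutiveSplit, best and cut are invented for this sketch. Costs are scaled by N (|N * groupSum - total|) so that integer input stays in integer arithmetic.

```javascript
// best[k][i] = minimal scaled cost of splitting the first i numbers into k groups.
// cut[k][i]  = start index of the last group in that optimal split.
function bestConsecutiveSplit(nums, groups) {
  const m = nums.length;
  if (groups < 1 || groups > m) throw new RangeError("need 1 <= groups <= nums.length");

  // prefix[i] = sum of the first i numbers, so any group sum is one subtraction.
  const prefix = [0];
  for (const x of nums) prefix.push(prefix[prefix.length - 1] + x);
  const total = prefix[m];

  // Scaled cost of the group nums[from..to-1]: N * |groupSum - total / N|.
  const cost = (from, to) => Math.abs(groups * (prefix[to] - prefix[from]) - total);

  const best = Array.from({ length: groups + 1 }, () => new Array(m + 1).fill(Infinity));
  const cut = Array.from({ length: groups + 1 }, () => new Array(m + 1).fill(-1));
  best[0][0] = 0;

  for (let k = 1; k <= groups; k++) {
    for (let i = k; i <= m; i++) {
      for (let j = k - 1; j < i; j++) {
        const candidate = best[k - 1][j] + cost(j, i);
        if (candidate < best[k][i]) {
          best[k][i] = candidate;
          cut[k][i] = j;
        }
      }
    }
  }

  // Walk the cut table backwards to recover the start index of every group.
  const starts = [];
  let end = m;
  for (let k = groups; k >= 1; k--) {
    end = cut[k][end];
    starts.unshift(end);
  }
  return { sad: best[groups][m] / groups, starts };
}

const dpResult = bestConsecutiveSplit([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3);
console.log(dpResult.sad.toFixed(2), dpResult.starts);
```

This runs in O(N * M^2) time, which is far smaller than the number of possible splits once N grows. For the example it finds a SAD of 6.67, the same value as the grouping shown in the question.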

+2

Sort the array in descending order, keep three running sums, and on each loop iteration add the current number to the smallest sum. The answer is (10,5,4), (9,6,3), (8,7,2,1).

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    using namespace std;

    // Returns the largest of three values.
    int maximum(int x, int y, int z)
    {
        int max = x;   /* assume x is the largest */
        if (y > max) { /* if y is larger than max, assign y to max */
            max = y;
        }
        if (z > max) { /* if z is larger than max, assign z to max */
            max = z;
        }
        return max;    /* max is the largest value */
    }

    int main()
    {
        int array[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int size = sizeof(array) / sizeof(array[0]);
        int part1 = 0;
        int part2 = 0;
        int part3 = 0;

        sort(array, array + size, greater<int>());

        // Add each number to the part with the smallest running sum.
        for (int x = 0; x < size; x++) {
            if (part1 < part2 && part1 < part3) {
                part1 += array[x];
            } else if (part2 < part3) {
                part2 += array[x];
            } else {
                part3 += array[x];
            }
        }

        printf("first part1 = %d\n", part1);
        printf("first part2 = %d\n", part2);
        printf("first part3 = %d\n", part3);
        printf("-------------------------------\n");
        printf("largest number = %d\n", maximum(part1, part2, part3));
    }

+1

I think I see where you are coming from. As a programmer, I would think about it in numerical order. I put something together quickly, since it is Valentine's Day and I am off to dinner :) Here is a simple version:

    a = all numbers added together
    b = number of groups
    m = a/b (value is mean)
    c = array(a) DESC (add all numbers to an array in descending order)

    foreach c
        if ((m - (c[0] + c[1])) < (m - (c[0])))
            if ((m - (c[0] + c[1] + c[2])) < (m - (c[0] + c[1])))
                ...
            else
                g1 = c[0], c[1]
                c = c - (c[0], c[1])
        else
            g1 = c[0]
            c = c - c[0]

    foreach c
        if ((m - (c[0] + c[1])) < (m - (c[0])))
            ...
        else
            g2 = c[0]

I put this together quickly, so it is not exact, but I hope you can see the sequence and the procedure. Of course, all the c values would be selected dynamically, as would each foreach loop. You may also need a foreach at the end to handle any remaining numbers and add them to the group whose sum they bring closest to the mean.
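A runnable reading of the procedure above might look like the sketch below. It is an interpretation, not the author's code: it compares absolute distances to the mean, it walks the numbers in whatever order they are passed (the original order keeps groups consecutive, as the question requires, while passing a descending-sorted copy follows the pseudocode more literally), and it forces a new group when the remaining numbers are only just enough to fill the remaining groups. It assumes groups <= nums.length. The name greedyGroups is made up.

```javascript
// Greedy sketch: keep adding numbers to the current group while that brings
// its sum closer to the mean, then start the next group.
function greedyGroups(nums, groups) {
  const mean = nums.reduce((acc, x) => acc + x, 0) / groups;
  const done = [];   // finished groups
  let current = [];  // group being filled
  let sum = 0;       // sum of the current group

  nums.forEach((x, idx) => {
    const remaining = nums.length - idx;           // numbers not yet placed, including x
    const stillToOpen = groups - done.length - 1;  // groups to start after the current one
    const canClose = current.length > 0 && stillToOpen > 0;
    const movesAway = Math.abs(sum + x - mean) >= Math.abs(sum - mean);

    // Close the group if x would move it away from the mean, or if every
    // remaining number is needed to give each remaining group one element.
    if (canClose && (movesAway || remaining === stillToOpen)) {
      done.push(current);
      current = [];
      sum = 0;
    }
    current.push(x);
    sum += x;
  });

  done.push(current);
  return done;
}

console.log(greedyGroups([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3));
```

On the numbers 1..10 in their original order this yields (1..6), (7, 8), (9, 10), the grouping used in the question. A greedy pass like this is fast but is not guaranteed to find the minimum in general.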

Happy Valentine's Day!

0

Here is a working JavaScript solution (although not fully tested).

It essentially uses dynamic scripting to build nested loops for a brute-force search (over ordered combinations) that finds the start index of each group in the array.

    var A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    var G = 3;

    function find(line, groups) {
        var length = line.length;
        var mean = line.sum() / groups;
        var temp = [0];
        var bestsad = 4294967295;
        var beststarts = [];

        // Build the source text of one nested for loop per group boundary.
        var dynamic = "var x0 = 0; ";
        for (var i = 1; i < groups; i++) {
            dynamic += "for(var x" + i + "=x" + (i - 1) + "+1;x" + i + "<" + length + ";x" + i + "++) ";
            temp.push("x" + i);
        }
        dynamic += "{ var sad = getSAD(line, mean, [" + temp.join(",") + "]);";
        dynamic += "if(sad < bestsad) { bestsad = sad; beststarts = [" + temp.join(",") + "] ;} }";

        // Run the generated loops; they update bestsad and beststarts.
        eval(dynamic);

        console.log("Best SAD " + bestsad);
        console.log("Best Start Indexes " + beststarts);
        return beststarts;
    }

    // SAD of the split whose groups begin at the given start indexes.
    function getSAD(line, mean, starts) {
        var sums = [];
        var sad;
        for (var i = 0; i < starts.length - 1; i++) {
            sums.push(line.slice(starts[i], starts[i + 1]).sum());
        }
        sums.push(line.slice(starts[starts.length - 1]).sum());
        sad = sums.sad(mean);
        return sad;
    }

    Array.prototype.sum = function() {
        var result = 0;
        for (var i = 0; i < this.length; i++) result += this[i];
        return result;
    };

    Array.prototype.sad = function(mean) {
        var result = 0;
        for (var i = 0; i < this.length; i++) result += Math.abs(this[i] - mean);
        return result;
    };

    find(A, G);

Here is the script that the string variable dynamic contains and that gets executed:

    var x0 = 0;
    for (var x1 = x0 + 1; x1 < 10; x1++)
        for (var x2 = x1 + 1; x2 < 10; x2++) {
            var sad = getSAD(line, mean, [0, x1, x2]);
            if (sad < bestsad) {
                bestsad = sad;
                beststarts = [0, x1, x2];
            }
        }

Why not just use a vector of group indexes plus recursion? For this kind of recursive task, the iterative approach is the fastest. Of course, the overhead (and added complexity) of dynamic scripting cancels out any benefit on small arrays, but on real data (large arrays) it pays off quickly.
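For comparison, the index-vector-plus-recursion variant mentioned above could be sketched as follows. This is not the code from this answer: startVectors, sadForStarts and findBestSplit are names invented here, and a generator is used so that start vectors are produced one at a time without eval.

```javascript
// Recursively yields every vector of group start indexes, e.g. [0, x1, x2].
function* startVectors(length, groups, prefixStarts = [0]) {
  if (prefixStarts.length === groups) {
    yield prefixStarts;
    return;
  }
  const last = prefixStarts[prefixStarts.length - 1];
  // Leave room for the start indexes that still have to follow this one.
  const stillNeeded = groups - prefixStarts.length - 1;
  for (let next = last + 1; next < length - stillNeeded; next++) {
    yield* startVectors(length, groups, [...prefixStarts, next]);
  }
}

// SAD of the split whose groups begin at the given start indexes.
function sadForStarts(line, starts) {
  const mean = line.reduce((acc, x) => acc + x, 0) / starts.length;
  let sad = 0;
  starts.forEach((start, g) => {
    const end = g + 1 < starts.length ? starts[g + 1] : line.length;
    let sum = 0;
    for (let i = start; i < end; i++) sum += line[i];
    sad += Math.abs(sum - mean);
  });
  return sad;
}

// Brute force: try every start vector and keep the one with the lowest SAD.
function findBestSplit(line, groups) {
  let best = { sad: Infinity, starts: null };
  let tried = 0;
  for (const starts of startVectors(line.length, groups)) {
    tried++;
    const sad = sadForStarts(line, starts);
    if (sad < best.sad) best = { sad, starts };
  }
  return { ...best, tried };
}

const bruteForce = findBestSplit([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3);
console.log(bruteForce.tried, bruteForce.sad.toFixed(2), bruteForce.starts);
```

For 10 numbers and 3 groups it tries 36 start vectors. Whether it is slower or faster than the eval version on large arrays is not measured here.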

0

This is an interesting problem. I will use your example of dividing the numbers 1..10 into three groups to illustrate my answer. The solution applies to any set of numbers and any number of groups. Of course, when the set of numbers is large, you cannot use the brute-force method. That said, large sets can be handled in a similar way, but more on that later.

Suppose we have M consecutive numbers, indicated by (1..M), in a set, and we want to divide them into N groups.

The first thing to determine is the value that you will compare with the sum of each group. This is simply the sum of a set of numbers divided by the number of groups N.

In the example, sumOf(1..M) = 55 and N = 3, so 55 / 3 = 18.33 is the value that each group should sum to. You want to minimize the difference between the group sums and 18.33.

As another example, if you want to divide the set of numbers 1..20 into two groups, you need to minimize the difference between the group sums and sumOf(1..20) / 2 = 210 / 2 = 105.

The next step is to find all possible groups. This is another interesting problem: given the constraint that groups contain consecutive numbers, the total number of combinations of groups is not as large as you might expect.

Finding combinations is a recursive task, and it’s easy enough to work out a general equation.

Let's start with a simple case. How many combinations of 1 group are there in the 10 numbers of the set (1..10)? Well, there is only one group: the numbers (1..10).

Now, how many combinations of 2 groups are there in 10 numbers? The answer is M-1, or 10-1 = 9, namely

    (1), (2..10)
    (1..2), (3..10)
    (1..3), (4..10)
    (1..4), (5..10)
    (1..5), (6..10)
    (1..6), (7..10)
    (1..7), (8..10)
    (1..8), (9..10)
    (1..9), (10)

Thus, a set of size M has M-1 combinations of 2 groups. This is the base case of the recursion.

How many combinations of 3 groups are there in 10 numbers?

Well, the first group will be one of the following

    (1), (1..2), (1..3), (1..4), (1..5), (1..6), (1..7), (1..8)

Given any of them as the first group, let's find out how many combinations of 2 groups exist in the remaining numbers.

Let the first group of three = (1). We have nine numbers left, and they can make 9-1 = 8 different combinations of 2 groups. Let the first group of three = (1..5). We have five numbers left, and they can make 5-1 = 4 different combinations of 2 groups.

So overall we will have

    (1)    -> 8 combinations
    (1..2) -> 7 combinations
    (1..3) -> 6 combinations
    (1..4) -> 5 combinations
    (1..5) -> 4 combinations
    (1..6) -> 3 combinations
    (1..7) -> 2 combinations
    (1..8) -> 1 combination

giving sumOf(1..8), or in general sumOf(1..M-2), combinations of groups. sumOf(1..8) = 8 * 9 / 2 = 36.

Thus, there are 36 combinations of 3 groups of 10 numbers, where each group contains consecutive numbers.

As an aside, for 3 groups of 100 numbers you have sumOf(1..98) = 98 * 99 / 2 = 4851 combinations of groups. As M increases you get more combinations, and at some value of M the brute-force method becomes impractical.

The above approach can be used to develop a simple recursive algorithm for obtaining all combinations of groups in the set (1..M).

In addition, a simple equation can be developed for any number N of groups in a set of M numbers. For example, if you move to 4 groups of 10 numbers, then you have situations such as the first group being (1..3), after which you find the combinations of 3 groups in the remaining 7 numbers. There will be sumOf(1..M-2) of them with M = 7, that is sumOf(1..5), and so on.
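The answer does not write the general equation down. One closed form that agrees with every count above is the binomial coefficient C(M-1, N-1): a split into N consecutive groups is a choice of N-1 cut positions among the M-1 gaps between neighbouring numbers. The sketch below computes it; the function name countSplits is made up.

```javascript
// Number of ways to split M numbers in a row into N non-empty consecutive
// groups: C(M - 1, N - 1).
function countSplits(m, n) {
  if (n < 1 || n > m) return 0;
  const k = Math.min(n - 1, m - n); // C(a, k) = C(a, a - k); use the smaller k
  let result = 1;
  for (let i = 1; i <= k; i++) {
    // After this step result equals C(m - 1 - k + i, i), which is an integer.
    result = (result * (m - 1 - k + i)) / i;
  }
  return result;
}

console.log(countSplits(10, 1));  // 1
console.log(countSplits(10, 2));  // 9
console.log(countSplits(10, 3));  // 36
console.log(countSplits(100, 3)); // 4851
console.log(countSplits(10, 4));  // 84
```

The first four values match the counts derived step by step above. The value for 4 groups of 10 numbers comes from the same formula. Results stay exact only while they fit in a JavaScript number without rounding.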

In any case, back to the problem. You have all the combinations of groups, and you can iterate through the groups and calculate the SAD for each combination and choose the one that minimizes the SAD.

When the number of combinations is very large and you cannot look at every combination, you can try a randomized or evolutionary-algorithm approach, in which you start with a number of randomly selected combinations, then randomly move numbers from one group to another and keep the combinations with the lowest SAD. Continue this step until you no longer see an improvement in the SAD.

Or you can do as @Robert King suggested, starting with one combination and improving it by moving numbers from one group to another.
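A minimal, deterministic sketch of that "start with one combination and improve it" idea is shown below. It is an assumption about how such a local search could look, not code from either answer: it only moves a single number across a boundary between adjacent groups, and it accepts a move only when the SAD strictly decreases. The names scaledCost and improveSplit are invented. Costs are scaled by the number of groups so that integer input stays in integer arithmetic.

```javascript
// N * SAD for the split whose groups begin at the given start indexes.
function scaledCost(nums, starts) {
  const groups = starts.length;
  const total = nums.reduce((acc, x) => acc + x, 0);
  let cost = 0;
  for (let g = 0; g < groups; g++) {
    const end = g + 1 < groups ? starts[g + 1] : nums.length;
    let sum = 0;
    for (let i = starts[g]; i < end; i++) sum += nums[i];
    cost += Math.abs(groups * sum - total);
  }
  return cost;
}

// Hill climbing: shift one boundary by one position at a time and keep the
// move only if it lowers the cost. Stops when no single move helps.
function improveSplit(nums, initialStarts) {
  let starts = initialStarts.slice();
  let cost = scaledCost(nums, starts);
  let improved = true;
  while (improved) {
    improved = false;
    for (let g = 1; g < starts.length; g++) {
      for (const delta of [-1, 1]) {
        const moved = starts[g] + delta;
        const upper = g + 1 < starts.length ? starts[g + 1] : nums.length;
        if (moved <= starts[g - 1] || moved >= upper) continue; // would empty a group
        const candidate = starts.slice();
        candidate[g] = moved;
        const candidateCost = scaledCost(nums, candidate);
        if (candidateCost < cost) {
          starts = candidate;
          cost = candidateCost;
          improved = true;
        }
      }
    }
  }
  return { starts, sad: cost / starts.length };
}

// Start from a deliberately poor split: (1), (2), (3..10).
const climbed = improveSplit([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 1, 2]);
console.log(climbed.starts, climbed.sad.toFixed(2));
```

On this small example the search ends at a SAD of 6.67, but a local search like this can stop in a local minimum on other inputs, which is where random restarts or the evolutionary variant described above would come in.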

0

Source: https://habr.com/ru/post/1396373/

