How can I write this combinatorics algorithm more efficiently?

Question

How can I write this combinatorics algorithm more efficiently?

A group contains a set of entities, and each entity has a meaning.

Each object can be part of more than one group.

Problem . Find the largest N groups where each object appears no more than once as a result. If necessary, the entity can be excluded from the group.

Example:

Entities with values:

  A = 2

  B = 2

  C = 2

  D = 3

  E = 3

Groups

  1: (A,B,C) total value: 2+2+2 = 6

  2: (B,D) total value: 2 + 3 = 5

  3: (C,E) total value: 2 + 3 = 5

  4: (D) total value: 3

  5: (E) total value: 3

**Answers**:

  Largest 1 group is obviously (A,B,C) with total value 6

  Largest 2 groups are (B,D), (C,E) with total value 10

  Largest 3 groups are either {(A,B,C),(D),(E)}, {(A,B),(C,E),(D)} or  {(A,C), (B,D), (E)} with total value 12

The input to the algorithm should be:

Set of objects with values
Groups containing one or more objects
Number of groups as a result

If there are several answers, then finding one of them is enough.

I included an example to try to fix the problem, the number of entities in practice should be less than about 50, and the number of groups should be less than the number of entities. The number of found N groups will be from 1 to 10.

, N , , , . , , , , .

, , , ? .

, "" , "" . (B, C, D, E) ( . 1 (A, B, C) (A, B), (A, C), (A) , .

+4

algorithm time-complexity combinatorics

Ponyboy 23 . '17 22:35

2

, , :

, , n- . 3 , 3 .

, , , , . 3 , . , .

#

var entities = new Dictionary<char, int>() { { 'A', 2 }, { 'B', 2 }, { 'C', 2 }, { 'D', 3 }, { 'E', 3 } };
var groups = new List<string>() { "ABC", "BD", "CE", "D", "E" };
var solutions = new List<Tuple<List<string>, int>>();

for(int i = 0; i < groups.Max(x => x.Length); i++)
{
    var solution = new List<string>();
    foreach (var group in groups.OrderByDescending(x => x.Length > i ? entities[x[i]] : -1))
        if (!group.ToCharArray().Any(c => solution.Any(g => g.Contains(c))))
            solution.Add(group);

    solutions.Add(new Tuple<List<string>, int>(solution, solution.Sum(g => g.ToCharArray().Sum(c => entities[c]))));
}

solutions.Dump();
solutions.OrderByDescending(x => x.Item2).First().Dump();

:

0

gordy 23 . '17 23:49

Kota Mori · Accepted Answer · 2017-06-24T05:31:26+0000

. , .

. v - K, . G - K x M, : G(i,j)=1 , i j G(i,j)=0 .

x - M, : x[j]=1 , j. y - K, : y[i]=1 , i .

- x y, sum(v*y) :

G x >= y...
sum(x) = N... N .

R. lpSolve, lpsolve.

library(lpSolve)


solver <- function(values, groups, N)
{
  n_group <- ncol(groups)
  n_entity <- length(values)

  object <- c(rep(0, n_group), values)

  lhs1 <- cbind(groups, -diag(n_entity))
  rhs1 <- rep(0, n_entity)
  dir1 <- rep(">=", n_entity)

  lhs2 <- matrix(c(rep(1, n_group), rep(0, n_entity)), nrow=1)
  rhs2 <- N
  dir2 <- "="

  lhs   <- rbind(lhs1, lhs2)
  rhs   <- c(rhs1, rhs2)
  direc <- c(dir1, dir2)

  lp("max", object, lhs, direc, rhs, all.bin=TRUE)
}


values <- c(A=2, B=2, C=2, D=3, E=3)
groups <- matrix(c(1,1,1,0,0,
                   0,1,0,1,0,
                   0,0,1,0,1,
                   0,0,0,1,0,
                   0,0,0,0,1),
                 nrow=5, ncol=5)
rownames(groups) <- c("A", "B", "C", "D", "E")

ans <- solver(values, groups, 1)
print(ans)
names(values)[tail(ans$solution, length(values))==1]
# Success: the objective function is 6     
# [1] "A" "B" "C"

ans <- solver(values, groups, 2)
print(ans)
names(values)[tail(ans$solution, length(values))==1]
# Success: the objective function is 10 
# [1] "B" "C" "D" "E"

ans <- solver(values, groups, 3)
print(ans)
names(values)[tail(ans$solution, length(values))==1]
# Success: the objective function is 12 
# [1] "A" "B" "C" "D" "E"

, . .

# how does it scale?
n_entity <- 50
n_group  <- 50
N <- 10
entity_names <- paste("X", 1:n_entity, sep="")
values <- sample(1:10, n_entity, replace=TRUE)
names(values) <- entity_names
groups <- matrix(sample(c(0,1), n_entity*n_group, 
                        replace=TRUE, prob=c(0.99, 0.01)),
                 nrow=n_entity, ncol=n_group)
rownames(groups) <- entity_names

ans <- solver(values, groups, N)
print(ans)
names(values)[tail(ans$solution, length(values))==1]

How can I write this combinatorics algorithm more efficiently?

More articles: