Set the coverage approximation to R

I am trying to solve or implement an approximation to fix the cover problem in R. Given this data format.

  sets n
1   s1 1
2   s1 2
3   s1 3
4   s2 2
5   s2 4
6   s3 3
7   s3 4
8   s4 4
9   s4 5

A unique number of elements in a column nare:

unique(d$n)
[1] 1 2 3 4 5

I would like to calculate fewer sets (column sets) that span all unique elements in n (universe). There are two sets in this example: s1 {1, 2, 3} and s4 {4, 5}. I read about it on Wikipedia and on the Internet, and I know that you can use the greedy algorithm to find an approximation. I also checked this link , which mentions two packages to solve such problems, LPsolveand Rsymphony, but I don’t know how to start. In my real life example, I have more than 40,000 sets, each of which numbers from 1,000 to 10,000 elements and unimaginable or unique elements of 80,000.

Any help or guidance on how to start or continue will be greatly appreciated.

<strong> data

d <- structure(list(sets = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 
4L, 4L), .Label = c("s1", "s2", "s3", "s4"), class = "factor"), 
    n = c(1, 2, 3, 2, 4, 3, 4, 4, 5)), .Names = c("sets", "n"
), row.names = c(NA, -9L), class = "data.frame")
+4
1

lpSolve CRAN . , , ( pg 4/5) http://math.mit.edu/~goemans/18434S06/setcover-tamara.pdf , , ?lp:

library( lpSolve)
?lp
# In Details: "Note that every variable is assumed to be >= 0!"
# go from your long-form rep of the sets to a wide form for a matrix representation
( items.mat<- t(table(d$sets,d$n))  )  # could have reversed order of args to skip t()
#---------
> dimnames(items.mat) = list( items=1:5, sets=paste0("s", 1:4) )
> items.mat
     sets
items s1 s2 s3 s4
    1  1  0  0  0
    2  1  1  0  0
    3  1  0  1  0
    4  0  1  1  1
    5  0  0  0  1
#---------
f.obj <-  rep(1,4)  # starting values of objective parameters by column (to be solved)
f.dir <- rep(">=",5) # the constraint "directions" by row
f.rhs <- rep(1,5)    # the inequality values by row (require all items to be present)

lp ("min", f.obj, items.mat, f.dir, f.rhs)$solution
#[1] 1 0 0 1

, s1 s4 . " " "".

+3

Source: https://habr.com/ru/post/1652710/


All Articles