Check all values against each other and form groups from the resulting matrix

I feel like I am asking the wrong question and trying to reinvent the wheel. What am I missing?

I have a bunch of values, say 8, that I need to test against each other. I wrote a function that returns a matrix indicating whether any two values are in the same group or not. For lack of a better idea, let me paste it here:

    data.text <- 
"1     2     3     4     5     6     7     8
1  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
2  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
3  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
4 FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
5 FALSE FALSE FALSE FALSE  TRUE  TRUE    NA FALSE
6 FALSE FALSE FALSE FALSE  TRUE  TRUE    NA FALSE
7 FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
8 FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE"

data <- read.table(text=data.text, header = TRUE)
data <- as.matrix(data)
colnames(data) <- 1:8

So row 1 says that value 1 is in a group with itself (column 1) and with values 2 and 3, but not with values 4-8. Values 5 and 6 are within the same group.

I am trying to use this information to create separate group identifiers and, for each group, a vector of all its elements:

  • Group 1: 1, 2, 3
  • Group 2: 5, 6

What I have done so far:

# row and column index of every TRUE value
groups <- which(data, arr.ind = TRUE)

# sort each row in ascending order in order to find duplicate pairs
groups.sorted <- t(apply(groups, 1, sort))

# drop double statements ("1 and 2", "2 and 1")
groups.unique <- unique(groups.sorted)

# drop obvious information ("1 and 1")
groups.real <- groups.unique[groups.unique[,1] != groups.unique[,2],]
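
The same list of pairs can probably be obtained in one step. A minimal sketch, assuming the lower triangle only mirrors the upper one; the matrix is rebuilt by hand here so that the snippet runs on its own, and `pairs` is just an illustrative name:

```r
# Rebuild the example matrix from the question.
data <- matrix(FALSE, 8, 8,
               dimnames = list(as.character(1:8), as.character(1:8)))
diag(data) <- TRUE
data[1:3, 1:3] <- TRUE
data[5:6, 5:6] <- TRUE
data[5:6, 7] <- NA

# Keep only cells above the diagonal: this drops "1 and 1" as well as
# the mirrored "2 and 1" entries. NA cells are ignored by which().
pairs <- which(data & upper.tri(data), arr.ind = TRUE)
pairs
```

This still yields pairs only (1-2, 1-3, 2-3, 5-6), so the grouping step itself remains open.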

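This only gives me pairs, though. How do I get from here to the information that 1, 2 and 3 belong to one group?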



You can use igraph:

require(igraph)
components(graph_from_adjacency_matrix(data))$membership
#1 2 3 4 5 6 7 8 
#1 1 1 2 3 3 4 5
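
To get from a membership vector to the group lists asked for in the question, one option is `split()`. This is a minimal sketch: the membership vector is typed in by hand from the output above so that it runs without igraph, and `real.groups` is an illustrative name.

```r
# Membership vector copied from the output above (names are the values 1..8).
membership <- c(1, 1, 1, 2, 3, 3, 4, 5)
names(membership) <- 1:8

# One list element per component, holding the values that belong to it.
groups <- split(as.integer(names(membership)), membership)

# Keep only components with more than one member.
real.groups <- unname(groups[lengths(groups) > 1])
real.groups
```

With the example data this leaves the two groups from the question, 1, 2, 3 and 5, 6.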


Another igraph approach, building the graph from the index pairs of the TRUE cells:

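library(igraph)

# each (row, col) index pair of a TRUE cell becomes an undirected edge
graph.dat <- graph_from_data_frame(which(data, arr.ind = TRUE), directed = FALSE)
V(graph.dat)$label <- V(graph.dat)$name
V(graph.dat)$degree <- degree(graph.dat)
# connected components; components() replaces the older clusters()
components(graph.dat, mode = "weak")$membership
# 1 2 3 4 5 6 7 8 
# 1 1 1 2 3 3 4 5 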

Alternatively, in base R:

groups <- unique(lapply(apply(data, 2, which), unique))
names(groups) <- seq(length(groups))
groups
#$`1`
#[1] 1 2 3

#$`2`
#[1] 4

#$`3`
#[1] 5 6

#$`4`
#[1] 7

#$`5`
#[1] 8

To get a data frame, use stack:

stack(groups)
#  values ind
#1      1   1
#2      2   1
#3      3   1
#4      4   2
#5      5   3
#6      6   3
#7      7   4
#8      8   5
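
One caveat with the unique-columns idea: it seems to rely on every member of a group having exactly the same TRUE pattern. If values are only chained (1-2 and 2-3 are marked, 1-3 is not), the columns differ and the groups overlap. Below is a minimal base R sketch of one way around that, computing the transitive closure first. The 4-value matrix `adj` is a hypothetical example, not data from the question.

```r
# Hypothetical 4-value example: 1-2 and 2-3 are linked, 1-3 is not marked.
adj <- matrix(FALSE, 4, 4)
diag(adj) <- TRUE
adj[1, 2] <- adj[2, 1] <- TRUE
adj[2, 3] <- adj[3, 2] <- TRUE

# Treat NA as "not linked" and make the relation symmetric.
adj[is.na(adj)] <- FALSE
adj <- adj | t(adj)

# Transitive closure: extend reachability until nothing changes.
reach <- adj
repeat {
  m <- reach * 1                 # logical -> numeric for %*%
  nxt <- (m %*% m) > 0
  if (all(nxt == reach)) break
  reach <- nxt
}

# Each distinct row of the closure is one group.
groups <- unique(lapply(seq_len(nrow(reach)), function(i) which(reach[i, ])))
groups
```

Here values 1, 2 and 3 end up in one group and 4 in its own. For larger inputs the igraph answers are likely the better choice, since the repeated matrix product gets expensive.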

Source: https://habr.com/ru/post/1620090/

