Group similar numbers of a vector

Say I have the following vector:

c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)

and I would like to "group" its elements by value: numbers that differ by at most 3 should be considered to belong to the same group. That is, for every number that appears in the vector, I would like to get all the numbers close to it. For instance,

    4  --> c(4, 5, 5)
    5  --> c(4, 5, 5, 8)
    8  --> c(5, 5, 8)
    12 --> c(12, 12, 12, 13, 15, 15)

etc.

It might be useful to get their indices as well. Is there a smart way to achieve this?

+6
4 answers

You can use this little function:

    similar <- function(vec, val, bound = 3, index = FALSE) {
      # positions of all elements within `bound` of `val`
      close.index <- which(abs(vec - val) <= bound)
      if (index) return(close.index)
      return(vec[close.index])
    }

    x <- c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)

    similar(x, 5)
    # [1] 4 5 5 8
    similar(x, 5, index = TRUE)
    # [1] 1 2 3 4
    similar(x, 5, bound = 7)
    # [1] 4 5 5 8 12 12 12
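The function above handles one value at a time, whereas the question asks for the group of every number in the vector. A minimal sketch of how that could be done, assuming it is enough to loop over the distinct values with `lapply` (the names `vals`, `groups` and `indices` are illustrative, not part of the answer):

```r
# Same helper as above, repeated so that this snippet runs on its own.
similar <- function(vec, val, bound = 3, index = FALSE) {
  close.index <- which(abs(vec - val) <= bound)
  if (index) return(close.index)
  vec[close.index]
}

x <- c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)
vals <- unique(x)  # hypothetical name: the distinct values of x

# One list element per distinct value, named by that value.
groups  <- setNames(lapply(vals, function(v) similar(x, v)), vals)
indices <- setNames(lapply(vals, function(v) similar(x, v, index = TRUE)), vals)

groups[["12"]]   # 12 12 12 13 15 15
indices[["12"]]  # 5 6 7 8 9 10
```

The list names are the values converted to character, so a group is looked up with a string key such as `"12"`.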
+9

This may not be the most elegant version, but does it do what you want?

    x <- c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)
    vals <- unique(x)

    # print indices
    for (i in seq_along(vals))
      print(which((x >= vals[i] - 3) & (x <= vals[i] + 3)))

    # print values
    for (i in seq_along(vals))
      print(x[which((x >= vals[i] - 3) & (x <= vals[i] + 3))])

    [1] 1 2 3
    [1] 1 2 3 4
    [1] 2 3 4
    [1] 5 6 7 8 9 10
    [1] 5 6 7 8 9 10
    [1] 5 6 7 8 9 10 11
    [1] 9 10 11 12 13
    [1] 11 12 13
    [1] 11 12 13 14
    [1] 13 14
    [1] 15 16 17 18 19
    [1] 20
    [1] 4 5 5
    [1] 4 5 5 8
    [1] 5 5 8
    [1] 12 12 12 13 15 15
    [1] 12 12 12 13 15 15
    [1] 12 12 12 13 15 15 18
    [1] 15 15 18 19 20
    [1] 18 19 20
    [1] 18 19 20 23
    [1] 20 23
    [1] 37 37 37 37 37
    [1] 41

It is actually a little more elegant to use abs:

    for (i in seq_along(vals))
      print(which(abs(x - vals[i]) <= 3))

    for (i in seq_along(vals))
      print(x[which(abs(x - vals[i]) <= 3)])
+2

Here is a short solution giving you the whole group as a list:

    x = c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)
    m = unique(x)
    setNames(apply(abs(outer(m, m, '-')), 2, function(u) m[u <= 3]), m)
    #$`4`
    #[1] 4 5
    #$`5`
    #[1] 4 5 8
    #$`8`
    #[1] 5 8
    #$`12`
    #[1] 12 13 15
    #$`13`
    #[1] 12 13 15
    #$`15`
    #[1] 12 13 15 18
    #$`18`
    #[1] 15 18 19 20
    #$`19`
    #[1] 18 19 20
    #$`20`
    #[1] 18 19 20 23
    #$`23`
    #[1] 20 23
    #$`37`
    #[1] 37
    #$`41`
    #[1] 41

To get the indices into x instead, the same approach applies:

 setNames(apply(abs(outer(m,m,'-')), 2, function(u) which(x %in% m[u<=3])),m) 
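The `outer` call compares every distinct value with every other one, which is fine for a vector of this size. As a possible alternative for long vectors, here is a sketch that assumes `x` is already sorted in increasing order (as it is in the question), so that each group is a contiguous run of positions. It uses base R's `findInterval` with its `left.open` argument (available since R 3.3.0); the names `bound`, `first`, `last` and `idx` are illustrative and do not come from the answer.

```r
x <- c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)
stopifnot(!is.unsorted(x))  # assumption: the shortcut only works on sorted input

bound <- 3
m <- unique(x)

# Count of elements strictly below m - bound, plus one: first position of each group.
first <- findInterval(m - bound, x, left.open = TRUE) + 1L
# Count of elements less than or equal to m + bound: last position of each group.
last <- findInterval(m + bound, x)

# Each group is the run of positions first:last, named by its value.
idx <- setNames(Map(seq, first, last), m)

idx[["12"]]       # positions 5 to 10
x[idx[["12"]]]    # 12 12 12 13 15 15
```

This replaces the pairwise comparison with two binary searches per distinct value, at the cost of requiring sorted input.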
+2

It may be useful to create lists with numbers as "keys":

    data <- c(4, 5, 5, 8, 12, 12, 12, 13, 15, 15, 18, 19, 20, 23, 37, 37, 37, 37, 37, 41)
    uniques <- unique(data)

    # values within 3 of each distinct number
    myGroups <- lapply(uniques, function(n) Filter(function(x) abs(x - n) <= 3, data))
    names(myGroups) <- uniques

    # positions of those values in data
    myIndices <- lapply(uniques, function(n) which(abs(data - n) <= 3))
    names(myIndices) <- uniques

Then, to access the group and indices for "12", say:

    > myGroups[['12']]
    [1] 12 12 12 13 15 15
    > myIndices[['12']]
    [1] 5 6 7 8 9 10
0

Source: https://habr.com/ru/post/989209/

