Search for factors that match more than one value

Suppose you have the following data frame:

x=data.frame(c(1,1,2,2,2,3),c("A","A","B","B","B","B"))
names(x)=c("v1","v2")

x
  v1 v2
1  1  A
2  1  A
3  2  B
4  2  B
5  2  B
6  3  B

In this data frame, I want each value in v1 to correspond to a single label in v2. However, as this example shows, the label B has more than one corresponding value.

Is there an elegant and quick way to find which labels in v2 correspond to more than one value in v1?

Ideally, the result would show the values, which in this example are c(2,3), as well as the row numbers, which in this example are r=c(5,6).

1 answer

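In base R, we can use ave to group 'v1' by 'v2' and flag the last occurrence of each distinct 'v1' value within the groups that have more than one unique value.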

i1 <- with(x, ave(v1, v2, FUN = function(x) 
    length(unique(x))>1 & !duplicated(x, fromLast=TRUE)))!=0
x[i1,]
#   v1 v2
#5  2  B
#6  3  B
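
The condition inside ave combines two separate checks, and it may be easier to follow when they are split apart. The following is only a sketch of the same logic in base R, not the answer's exact code; the names multi, last_in_group, rows and values are made up for illustration.

```r
# Recreate the example data from the question.
x <- data.frame(v1 = c(1, 1, 2, 2, 2, 3),
                v2 = c("A", "A", "B", "B", "B", "B"))

# Check 1: does the row's v2 group contain more than one distinct v1 value?
multi <- ave(x$v1, x$v2, FUN = function(v) length(unique(v))) > 1

# Check 2: is the row the last occurrence of its (v2, v1) combination?
last_in_group <- !duplicated(x[c("v2", "v1")], fromLast = TRUE)

# Rows that pass both checks, and the v1 values found there.
rows <- which(multi & last_in_group)
values <- x$v1[rows]

rows    # 5 6
values  # 2 3
```

Using fromLast = TRUE keeps the last row of each repeated value, which is why row 5 rather than row 3 is reported for the value 2.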

Or with data.table:

library(data.table)
i1 <- setDT(x)[, .I[uniqueN(v1)>1 & !duplicated(v1, fromLast=TRUE)], v2]$V1
x[i1, 'v1', with = FALSE][, rn := i1][]
#   v1 rn
#1:  2  5
#2:  3  6
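
If only the labels themselves are needed, rather than the row numbers, a shorter base R sketch is to split 'v1' by 'v2' and keep the groups with more than one distinct value. This is an alternative to the approaches above, and the names vals_by_label and ambiguous are hypothetical.

```r
# Recreate the example data from the question.
x <- data.frame(v1 = c(1, 1, 2, 2, 2, 3),
                v2 = c("A", "A", "B", "B", "B", "B"))

# Distinct v1 values for each v2 label, as a named list.
vals_by_label <- lapply(split(x$v1, x$v2), unique)

# Keep only the labels that map to more than one distinct value.
ambiguous <- vals_by_label[lengths(vals_by_label) > 1]

names(ambiguous)  # "B"
ambiguous$B       # 2 3
```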

Source: https://habr.com/ru/post/1656499/

