R: Unwanted version of setdiff?

Here's the normal behavior of setdiff:

x <- rep(letters[1:4], 2)
x
# [1] "a" "b" "c" "d" "a" "b" "c" "d"

y <- letters[1:2]
y
# [1] "a" "b"

setdiff(x, y)
# [1] "c" "d"

... but what if I want y to be removed only once, and therefore get the following result?

# "c" "d" "a" "b" "c" "d"

I assume there is a simple solution using setdiff or %in%, but I just don't see it.

2 answers

match returns a vector of the positions of the (first) matches of its first argument in its second. It can be used to construct an index:

x[-match(y, x)]
# [1] "c" "d" "a" "b" "c" "d"
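
To make the indexing step concrete, here is a small sketch of what match returns, plus a guarded variant. The helper name remove_first is hypothetical (it is not part of base R), and the sketch assumes y has no duplicates. The guard matters because match returns NA for values of y that are not in x, and because x[-integer(0)] returns an empty vector rather than x.

```r
x <- rep(letters[1:4], 2)
y <- letters[1:2]

# match() gives the position of the first occurrence of each element of y in x
pos <- match(y, x)
pos
# [1] 1 2

# Hypothetical helper: drop the first occurrence of each value of y,
# ignoring values of y that do not occur in x
remove_first <- function(x, y) {
  pos <- match(y, x)
  pos <- pos[!is.na(pos)]            # unmatched values give NA
  if (length(pos) == 0) return(x)    # x[-integer(0)] would be empty
  x[-pos]
}

remove_first(x, c("a", "z"))
# [1] "b" "c" "d" "a" "b" "c" "d"
```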

If there are duplicates in 'y' and you want to remove elements in proportion to how often they occur there, then the first thing that came to mind is a for loop:

y <- c("a","b","a")
x2 <- x
for( i in seq_along(y) ){ x2 <- x2[-match(y[i],x2)] }

x2
# [1] "c" "d" "b" "c" "d"
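
The loop above fails if some value of y is not found in x, or is listed more often in y than it occurs in x, because match then returns NA. A minimal sketch of the same loop wrapped in a function that skips such values follows; the name remove_each is hypothetical.

```r
# Hypothetical helper: remove one occurrence from x for every element of y,
# skipping elements of y that can no longer be found in x
remove_each <- function(x, y) {
  for (value in y) {
    pos <- match(value, x)
    if (!is.na(pos)) x <- x[-pos]
  }
  x
}

x <- rep(letters[1:4], 2)

remove_each(x, c("a", "b", "a"))
# [1] "c" "d" "b" "c" "d"

remove_each(x, c("a", "a", "a", "z"))
# [1] "b" "c" "d" "b" "c" "d"
```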

An alternative is to work with counts, using table and intersect:

c(table(x[x %in% intersect(x, y)]) - table(y[y %in% intersect(x, y)]),
  table(x[!x %in% intersect(x, y)]))
# a b c d 
# 0 1 2 2 
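
The table-based expression returns counts, not the remaining elements. As a sketch, the counts can be expanded back into a vector with rep. This assumes no value occurs more often in y than in x (otherwise a count is negative and rep fails), and the original order of x is lost because table sorts its names. The variable names remaining and leftover are introduced here for illustration only.

```r
x <- rep(letters[1:4], 2)
y <- c("a", "b", "a")

# remaining count of each value after removing the elements of y
remaining <- c(table(x[x %in% intersect(x, y)]) - table(y[y %in% intersect(x, y)]),
               table(x[!x %in% intersect(x, y)]))

# repeat each value as many times as it is left over
leftover <- rep(names(remaining), remaining)
leftover
# [1] "b" "c" "c" "d" "d"
```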

Another option, which like 42's answer is based on counts:

# count how often each value in the union of x and y occurs in y
myCounts <- table(factor(y, levels=sort(union(x, y))))

# extract these elements from x
x[-unlist(lapply(names(myCounts),
                 function(i) which(i == x)[seq_len(myCounts[i])]))]

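The key part is [seq_len(myCounts[i])], which keeps, for each value, only as many positions in x as that value has occurrences in y.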
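
To see what that subscript does, here is a step-by-step sketch of the intermediate values. It assumes y is still c("a", "b", "a") from the first answer; the variable idx is introduced here for illustration only.

```r
x <- rep(letters[1:4], 2)
y <- c("a", "b", "a")
myCounts <- table(factor(y, levels = sort(union(x, y))))

which("a" == x)                            # all positions of "a" in x
# [1] 1 5
which("a" == x)[seq_len(myCounts["a"])]    # "a" occurs twice in y: keep both
# [1] 1 5
which("b" == x)[seq_len(myCounts["b"])]    # "b" occurs once in y: keep the first
# [1] 2
which("c" == x)[seq_len(myCounts["c"])]    # "c" is not in y: nothing to remove
# integer(0)

# all positions to drop, collected over every value
idx <- unlist(lapply(names(myCounts),
                     function(i) which(i == x)[seq_len(myCounts[i])]))
x[-idx]
# [1] "c" "d" "b" "c" "d"
```

Note that, as with match, x[-idx] only behaves as intended when idx is non-empty and contains no NA, that is, when y shares at least one value with x and never lists a value more often than x contains it.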


Source: https://habr.com/ru/post/1666127/