Delete duplicate numbers in a sequence

I have a vector of the form

c(3,3,...,9,9,...,2,2,...,3,3,...,7,7,...)

I want to remove consecutive duplicate numbers without changing the order. That is, I would like to get something like

c(3,9,2,3,7,...)

How to do it in R?

+4
3 answers

We can use rleid and duplicated. rleid (from data.table) creates run-length identifiers, so that only adjacent equal elements form one group. We then get a logical index of the duplicated identifiers, negate it, and use it to subset the vector.

library(data.table)
v1[!duplicated(rleid(v1))]
#[1] 3 9 2 3 7

Or, as indicated in the OP, we can use rle from base R and extract the values.

rle(v1)$values
#[1] 3 9 2 3 7

data

 v1 <- c(3,3,9,9,2,2,3,3,7,7)
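
To make the two approaches above easier to follow, here is a minimal base-R sketch that inspects what rle returns and rebuilds run identifiers from it. The name run_id is only illustrative, and the rep/seq_along construction is an assumed stand-in for rleid (it is not how data.table implements it); it is used here so the example runs without data.table.

```r
v1 <- c(3, 3, 9, 9, 2, 2, 3, 3, 7, 7)

# rle() splits the vector into runs of equal adjacent values
runs <- rle(v1)
runs$lengths  # 2 2 2 2 2
runs$values   # 3 9 2 3 7

# Assumed base-R stand-in for rleid(): one integer id per run,
# repeated for as many elements as the run contains
run_id <- rep(seq_along(runs$lengths), runs$lengths)
run_id        # 1 1 2 2 3 3 4 4 5 5

# Keep only the first element of each run
v1[!duplicated(run_id)]
#[1] 3 9 2 3 7
```

Because the second run of 3s gets its own identifier (4), it is kept, which is what distinguishes this from unique(v1).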
+6

Alternatively, keep only the elements whose difference from the previous element is not 0. A base-R option:

v[c(1,diff(v))!=0]
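
The diff approach relies on the values being numeric. As a hedged sketch, the same idea can be written with a direct comparison of each element to its predecessor, which also works for character vectors. The function name drop_adjacent_dups is hypothetical, and the sketch assumes the input contains no NA values.

```r
# Hypothetical helper: drop elements equal to their immediate predecessor.
# Assumes v has no NA values (a comparison with NA would yield NA).
drop_adjacent_dups <- function(v) {
  if (length(v) < 2L) return(v)
  # The first element is always kept; each later one is kept
  # only if it differs from the element before it
  keep <- c(TRUE, v[-1L] != v[-length(v)])
  v[keep]
}

drop_adjacent_dups(c(3, 3, 9, 9, 2, 2, 3, 3, 7, 7))
#[1] 3 9 2 3 7
drop_adjacent_dups(c("a", "a", "b", "a", "a"))
#[1] "a" "b" "a"
```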
+9

Just for fun, here is an Rcpp version of the solution:

library(Rcpp)
cppFunction('NumericVector remove_multiples(NumericVector& vec) {   
   NumericVector c_vec(clone(vec));
   NumericVector::iterator it = std::unique(c_vec.begin(),c_vec.end());
   c_vec.erase(it,c_vec.end());
   return(c_vec);
  }'
)

x <- c(1,1,1,2,2,2,1,1,3,4,4,1,1)
remove_multiples(x)
#[1] 1 2 1 3 4 1
+2

Source: https://habr.com/ru/post/1628345/