Delete consecutive duplicate entries

How can I delete consecutive duplicate entries in R? I think with() could be used, but I can't work out how to apply it. Here is an example:

    read.table(text = "
    a t1
    b t2
    b t3
    b t4
    c t5
    c t6
    b t7
    d t8")

Sample data: D

    events time
    a      t1
    b      t2
    b      t3
    b      t4
    c      t5
    c      t6
    b      t7
    d      t8

Required Result:

    events time
    a      t1
    b      t4
    c      t6
    b      t7
    d      t8


4 answers

Here is another approach, assuming your data.frame is called d:

    # as.character() works whether the first column is a factor or a character vector
    d[cumsum(rle(as.character(d[,1]))$lengths),]
      V1 V2
    1  a t1
    4  b t4
    6  c t6
    7  b t7
    8  d t8
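To see why this works, it can help to look at the intermediate values. The following is only a sketch: it assumes d was read without a header (so the columns are V1 and V2, as in the output above), and the names runs, last_rows and result are illustrative rather than part of the answer.

```r
d <- read.table(text = "
a t1
b t2
b t3
b t4
c t5
c t6
b t7
d t8", stringsAsFactors = FALSE)

# rle() collapses the first column into runs of identical consecutive values
runs <- rle(d[, 1])
runs$values   # "a" "b" "c" "b" "d"
runs$lengths  # 1 3 2 1 1

# The cumulative sum of the run lengths is the row index where each run ends
last_rows <- cumsum(runs$lengths)  # 1 4 6 7 8

# Indexing with those positions keeps only the last row of every run
result <- d[last_rows, ]
```

Because the runs are found by position rather than by value, the second run of b (row 7) is kept separately from the first.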

EDIT: This is not quite right, since it keeps only one row for b, so it fails when b reappears later in the data. You can also use the duplicated() function:

    x <- read.table(text = "
      events time
      a t1
      b t2
      b t3
      b t4
      c t5
      c t6
      d t7", header = TRUE)
    # Making sure the data is correctly ordered!
    x <- x[order(x[,1], x[,2]), ]
    x[!duplicated(x[,1], fromLast=TRUE), ]
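The limitation noted in the edit can be checked against the question's full data, where b reappears after c. A minimal sketch, with full, ordered and kept as illustrative names:

```r
full <- read.table(text = "
events time
a t1
b t2
b t3
b t4
c t5
c t6
b t7
d t8", header = TRUE, stringsAsFactors = FALSE)

# Sorting by event puts both runs of b next to each other
ordered <- full[order(full[, 1], full[, 2]), ]

# duplicated() works on values, not runs, so only the last b overall survives
kept <- ordered[!duplicated(ordered[, 1], fromLast = TRUE), ]
kept$time  # "t1" "t7" "t6" "t8": the row b t4 is lost
```

So this approach only matches the required result when no event value occurs in more than one run.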

A solution in base R using split-apply-combine: tail returns the last row of each group, and rle in combination with mapply creates a new events vector that gives each run its own label, so that order is preserved when an event reappears:

    x <- read.table(text = "
      events time
      a t1
      b t2
      b t3
      b t4
      c t5
      c t6
      b t7
      d t8", header = TRUE)
    # create vector of new.events (ie, preserve reappearing objects)
    occurences <- rle(as.character(x$events))[["lengths"]]
    new.events <- unlist(mapply(rep, x = letters[seq_along(occurences)], times = occurences))
    # split into sublists per event
    s1 <- split(x, list(new.events))
    # get last element from list
    s2 <- lapply(s1, tail, n = 1)
    # combine again
    do.call(rbind, s2)

This gives the desired result.
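One caveat, offered as an observation rather than as part of the answer: labelling the runs with letters only works for up to 26 runs. A possible variation, assuming numeric run ids are acceptable as group labels, builds the grouping vector with rep and seq_along instead (run.lengths, run.id and res are illustrative names):

```r
x <- read.table(text = "
  events time
  a t1
  b t2
  b t3
  b t4
  c t5
  c t6
  b t7
  d t8", header = TRUE, stringsAsFactors = FALSE)

# length of each run of identical consecutive events
run.lengths <- rle(x$events)$lengths

# one integer id per run, repeated for every row in that run: 1 2 2 2 3 3 4 5
run.id <- rep(seq_along(run.lengths), times = run.lengths)

# split by run, keep the last row of each run, and combine again
res <- do.call(rbind, lapply(split(x, run.id), tail, n = 1))
```

The row names of res are the run ids rather than the original row numbers.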


And for good measure, a version using head and tail:

    dat[with(dat,c(tail(events,-1) != head(events,-1),TRUE)),]
      events time
    1      a   t1
    4      b   t4
    6      c   t6
    7      b   t7
    8      d   t8
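The logical index compares each event with the one that follows it and always keeps the final row. Spelled out step by step, as a sketch that assumes dat holds the question's data with a header (cur, nxt and keep are illustrative names):

```r
dat <- read.table(text = "
events time
a t1
b t2
b t3
b t4
c t5
c t6
b t7
d t8", header = TRUE, stringsAsFactors = FALSE)

cur <- head(dat$events, -1)  # every event except the last
nxt <- tail(dat$events, -1)  # every event except the first

# TRUE where the next event differs, i.e. the current row ends a run;
# the last row always ends a run, hence the trailing TRUE
keep <- c(nxt != cur, TRUE)

result <- dat[keep, ]
```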

Source: https://habr.com/ru/post/949451/

