A short solution with the data.table package:
library(data.table)
setDT(samp)[, flag := 0][code!="", flag := 1*(rleid(code)-1 > 0), by = id]
Or:
setDT(samp)[, flag := 0][code!="", flag := 1*(code!=code[1] & code!=''), by = id][]
which gives the desired result:
> samp
    id year type code flag
 1:  1 2010    1  abc    0
 2:  1 2010    2  abc    0
 3:  1 2011    1         0
 4:  1 2011    2         0
 5:  1 2012    1  xyz    1
 6:  1 2012    2  xyz    1
 7:  2 2010    1         0
 8:  2 2010    2         0
 9:  2 2011    1  lmn    0
10:  2 2011    2         0
11:  2 2012    1  efg    1
12:  2 2012    2  efg    1
13:  3 2010    1  def    0
14:  3 2010    2  def    0
15:  3 2011    1  klm    1
16:  3 2011    2  klm    1
17:  3 2012    1  nop    1
18:  3 2012    2  nop    1
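To see why the `rleid` version works, it may help to look at what `rleid()` returns inside a single `id`. The following is only a sketch: the vectors `codes_id1`, `codes_id3` and `codes_back` are hypothetical stand-ins (the first two mirror the non-empty codes of id 1 and id 3, the third is an invented case where the code returns to its first value).

```r
library(data.table)

# Hypothetical vectors mirroring the non-empty codes of id 1 and id 3
codes_id1 <- c("abc", "abc", "xyz", "xyz")
codes_id3 <- c("def", "def", "klm", "klm", "nop", "nop")

# rleid() numbers consecutive runs of equal values: 1 for the first run, 2 for the next, ...
runs_id1 <- rleid(codes_id1)
runs_id3 <- rleid(codes_id3)

# Everything after the first run gets flag 1
flag_id1 <- 1 * (runs_id1 - 1 > 0)
flag_id3 <- 1 * (runs_id3 - 1 > 0)

# Invented case: the code goes back to its first value
codes_back <- c("abc", "xyz", "abc")
flag_rleid <- 1 * (rleid(codes_back) - 1 > 0)    # flags every run after the first
flag_first <- 1 * (codes_back != codes_back[1])  # flags only values that differ from the first
```

On the sample data both one-liners agree, but `flag_rleid` and `flag_first` differ in the last case, so which one to pick depends on whether a return to the first code should still count as a change.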
Or, when the year also matters:
setDT(samp)[, flag := 0][code!="", flag := 1*(rleid(code, year)-1 > 0), id]
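The difference between `rleid(code)` and `rleid(code, year)` only shows up when the same code spans more than one year, which does not happen in the sample data. A small sketch with hypothetical vectors `code_x` and `year_x`:

```r
library(data.table)

# Hypothetical group: the same code in two different years
code_x <- c("abc", "abc", "abc", "abc")
year_x <- c(2010, 2010, 2011, 2011)

# Runs based on code alone: a single run, so nothing is flagged
flag_code <- 1 * (rleid(code_x) - 1 > 0)

# Runs based on code and year together: a new run starts when either changes
flag_code_year <- 1 * (rleid(code_x, year_x) - 1 > 0)
```

Here `flag_code` stays 0 throughout, while `flag_code_year` becomes 1 from 2011 onwards.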
A possible alternative in base R:
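# Flag every element that comes after the first run of identical values
f <- function(x) {
  x <- rle(x)$lengths
  1 * (rep(seq_along(x), times = x) - 1 > 0)
}
samp$flag <- 0
nonempty <- samp$code != ''
# Note the comma in samp[nonempty, ]: it subsets rows of a data.frame.
# ave() returns character here (its input is character), hence as.numeric().
samp$flag[nonempty] <- as.numeric(with(samp[nonempty, ], ave(as.character(code), id, FUN = f)))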
NOTE: It's better not to give the object the same name as a function.
Data used:
samp <- data.frame(id = rep(1:3, each = 6),
                   year = rep(2010:2012, 3, each = 2),
                   type = rep(1:2, 9),
                   code = c("abc","abc","","","xyz","xyz",
                            "","","lmn","","efg","efg",
                            "def","def","klm","klm","nop","nop"))