I have the following data frame containing 0, 1 and NA values for identifiers A through E over one year:
dat <- data.frame(
id = c("A", "B", "C", "D", "E"),
jan = c(0, 0, NA, 1, 0),
feb = c(0, 1, 1, 0, 0),
mar = c(0, 0, 1, 0, 1),
apr = c(0, NA, 0, NA, 1),
may = c(0, NA, 0, 0, 0),
jun = c(0, 0, 0, 0, 0),
jul = c(0, 0, 0, 0, 1),
aug = c(NA, 0, 0, 1, 1),
sep = c(NA, 0, 0, 1, NA),
okt = c(NA, 0, 0, 0, NA),
nov = c(NA, 0, 0, 0, 1),
dez = c(NA, 0, 0, 0, 0)
)
> dat
id jan feb mar apr may jun jul aug sep okt nov dez
A 0 0 0 0 0 0 0 NA NA NA NA NA
B 0 1 0 NA NA 0 0 0 0 0 0 0
C NA 1 1 0 0 0 0 0 0 0 0 0
D 1 0 0 NA 0 0 0 1 1 0 0 0
E 0 0 1 1 0 0 1 1 NA NA 1 0
I would like to count the number of 1s for each identifier over this one year, subject to the following conditions:
- The first occurrence of 1 is always counted
- NAs should be treated as 0s
- A second occurrence of 1 is counted only if it is immediately preceded by six or more consecutive 0s / NAs
In my example, the counts would be:
> dat
id jan feb mar apr may jun jul aug sep okt nov dez count
1 A 0 0 0 0 0 0 0 NA NA NA NA NA => 0
2 B 0 1 0 NA NA 0 0 0 0 0 0 0 => 1
3 C NA 1 1 0 0 0 0 0 0 0 0 0 => 1
4 D 1 0 0 NA 0 0 0 1 1 0 0 0 => 2
5 E 0 0 1 1 0 0 1 1 NA NA 1 0 => 1
The function must be applied in the form apply(dat[, -1], 1, my_fun) and return a vector containing the counts (i.e. 0, 1, 1, 2, 1). Does anyone know how to achieve this?
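
For reference, here is a minimal sketch of one way the rules above could be expressed, using run-length encoding with base R's rle(). The function name count_spaced_ones and its min_gap argument are hypothetical names chosen for illustration. It assumes that "preceded by six or more 0s / NAs" means the run of 0s / NAs directly before a 1 is at least six long, which is what the expected counts for D and E suggest.

```r
dat <- data.frame(
  id  = c("A", "B", "C", "D", "E"),
  jan = c(0, 0, NA, 1, 0),
  feb = c(0, 1, 1, 0, 0),
  mar = c(0, 0, 1, 0, 1),
  apr = c(0, NA, 0, NA, 1),
  may = c(0, NA, 0, 0, 0),
  jun = c(0, 0, 0, 0, 0),
  jul = c(0, 0, 0, 0, 1),
  aug = c(NA, 0, 0, 1, 1),
  sep = c(NA, 0, 0, 1, NA),
  okt = c(NA, 0, 0, 0, NA),
  nov = c(NA, 0, 0, 0, 1),
  dez = c(NA, 0, 0, 0, 0)
)

# Hypothetical helper: counts runs of 1s in one row.
# min_gap is an assumed parameter for the required number of 0s / NAs.
count_spaced_ones <- function(x, min_gap = 6) {
  x <- as.numeric(x)
  x[is.na(x)] <- 0                 # treat NA as 0
  r <- rle(x)                      # consecutive runs of 0s and 1s
  one_runs <- which(r$values == 1) # positions of the runs of 1s
  if (length(one_runs) == 0) return(0)
  later <- one_runs[-1]            # every run of 1s after the first
  # The first run always counts; a later run counts only if the run of 0s
  # directly before it has at least min_gap elements.
  1 + sum(r$lengths[later - 1] >= min_gap)
}

unname(apply(dat[, -1], 1, count_spaced_ones))  # 0 1 1 2 1
```

Working on runs rather than single months means that adjacent 1s (such as aug and sep for D) are handled as one occurrence, and the gap check reduces to looking at the length of the preceding run of 0s.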