I have a dataframe like this:
set.seed(123)
a <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
df <- data.frame(
  V1 = sample(a, 4, replace = TRUE),
  V2 = sample(a, 4, replace = TRUE),
  V3 = sample(a, 4, replace = TRUE),
  V4 = sample(a, 4, replace = TRUE)
)
which looks like this:
  V1 V2 V3 V4
1  C  I  E  G
2  H  A  E  F
3  D  E  I  A
4  H  I  E  I
For each row, I would like to count the unique values that do not appear in a previous row, comparing against every previous row in turn and keeping the smallest count. The result would look like this:
  V1 V2 V3 V4 V5
1  C  I  E  G  4
2  H  A  E  F  3
3  D  E  I  A  2
4  H  I  E  I  1
V5 is 4 for line 1, as this is the first line and its values are all unique.
V5 is 3 for line 2, since H, A and F were not in line 1.
V5 is 2 for line 3, since 1) D and I were not in line 2, and 2) D and A were not in line 1.
V5 is 1 for line 4, since 1) H was not in line 1, 2) I was not in line 2, and 3) H was not in line 3.
If line 4 were HIEA, V5 for line 4 would still be 1, because the smallest count decides: it has only 1 value not in line 3 (H) and only 1 not in line 2 (I), even though it has 2 values not in line 1 (H and A).
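To make the rule concrete, here is a rough base R sketch of the calculation as I understand it. It assumes V5 is the minimum, over all previous lines, of the number of distinct values in the current line that are missing from that previous line, and that the first line simply gets its number of distinct values. The helper name count_new is made up for illustration, and the data frame is typed in from the table above rather than sampled, so the result does not depend on the random number generator.

```r
# The example table from the question, entered directly.
df <- data.frame(
  V1 = c("C", "H", "D", "H"),
  V2 = c("I", "A", "E", "I"),
  V3 = c("E", "E", "I", "E"),
  V4 = c("G", "F", "A", "I"),
  stringsAsFactors = FALSE
)

# count_new() is a hypothetical helper. For row i of character matrix m it
# counts, for each earlier row, the distinct values of row i that are absent
# from that earlier row, and returns the smallest of those counts.
count_new <- function(m, i) {
  current <- unique(m[i, ])
  # Assumption: the first row has nothing to compare against, so every
  # distinct value counts as new.
  if (i == 1) return(length(current))
  min(vapply(
    seq_len(i - 1),
    function(j) length(setdiff(current, m[j, ])),
    integer(1)
  ))
}

m <- as.matrix(df)
df$V5 <- vapply(seq_len(nrow(m)), function(i) count_new(m, i), integer(1))
print(df)
```

This compares every line with every earlier line, so the work grows quadratically with the number of lines. That is fine for a small table like this one, but I would welcome something more efficient or more idiomatic.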