I want to filter out all rows where var3 < 5, while keeping at least one row for each value of var1.
> foo <- data.frame(var1=c(1, 1, 8, 8, 5, 5, 5), var2=c(1,2,3,2,4,6,8), var3=c(7,1,1,1,1,1,6))
> foo
var1 var2 var3
1 1 1 7
2 1 2 1
3 8 3 1
4 8 2 1
5 5 4 1
6 5 6 1
7 5 8 6
subset(foo, (foo$var3>=5)) would delete rows 2-6, and I would lose var1 == 8 entirely.
- I want to delete a row if another row with the same var1 value fulfills the condition foo$var3 >= 5 (see row 5).
- I want to keep the row, setting var2 and var3 to NA, if none of the rows with that var1 value satisfy the condition foo$var3 >= 5.
As a result, I expect:
var1 var2 var3
1 1 1 7
3 8 NA NA
7 5 8 6
This is the closest I got:
> foo$var3[ foo$var3 < 5 ] = NA
> foo$var2[ is.na(foo$var3) ] = NA
> foo
var1 var2 var3
1 1 1 7
2 1 NA NA
3 8 NA NA
4 8 NA NA
5 5 NA NA
6 5 NA NA
7 5 8 6
Now I just need to know how to conditionally delete the right rows (2, 3 or 4, 5, 6): delete a row if var2 and var3 are NA and its var1 value still occurs in another row.
But of course, a much simpler or more elegant way to approach this little problem would be welcome.
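For reference, the rule described above could be sketched in base R roughly as follows. This is only one possible approach under my reading of the requirements, not necessarily the most elegant one. The names ok, group_ok, keep and result are made up for illustration, and the sketch keeps every row that satisfies the condition rather than exactly one per var1 value.

```r
# Rebuild the example data so the sketch is self-contained.
foo <- data.frame(var1 = c(1, 1, 8, 8, 5, 5, 5),
                  var2 = c(1, 2, 3, 2, 4, 6, 8),
                  var3 = c(7, 1, 1, 1, 1, 1, 6))

# TRUE for rows that satisfy the condition.
ok <- foo$var3 >= 5

# TRUE for rows whose var1 value has at least one row satisfying the condition.
group_ok <- foo$var1 %in% foo$var1[ok]

# Keep rows that satisfy the condition, plus the first row of every
# var1 value for which no row satisfies it.
keep <- ok | (!group_ok & !duplicated(foo$var1))
result <- foo[keep, ]

# Blank out var2 and var3 on the placeholder rows that were kept
# only to preserve their var1 value.
result[!ok[keep], c("var2", "var3")] <- NA

print(result)
#   var1 var2 var3
# 1    1    1    7
# 3    8   NA   NA
# 7    5    8    6
```

Because the placeholder row is chosen with duplicated(), it is always the first occurrence of that var1 value (row 3 here, not row 4).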