I have a data frame in the following format (this is a simplified version):
eval.num eval.count fitness fitness.mean green.h.0 green.v.0 offset.0 random
1 1 1500 1500 100 120 40 232342
2 2 1000 1250 100 120 40 11843
3 3 1250 1250 100 120 40 981340234
4 4 1000 1187.5 100 120 40 4363453
5 1 2000 2000 200 100 40 345902
6 1 3000 3000 150 90 10 943
7 1 2000 2000 90 90 100 9304358
8 2 1800 1900 90 90 100 284333
However, the eval.count column in my real data is incorrect and I need to fix it. For each row, it should report the number of rows with the same values for green.h.0, green.v.0 and offset.0, counting only that row and the rows before it, never the rows after it.
The example above shows the expected values; assume the eval.count values in the actual data are wrong.
How can I add a new column (say, "count") that, for each row, counts the rows up to and including it that have the same values for the specified columns?
I previously got help on a similar problem by selecting all rows with the same values for the specified columns, so I assumed I could wrap that selection in a loop over the rows, but that seems inefficient.
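For reference, the computation being asked for can be sketched as a single pass that keeps a running tally per combination of key values. This is only a minimal illustration in plain Python, not a data-frame solution; the function name running_counts and the list-of-dicts row layout are assumptions made for the example.

```python
from collections import defaultdict

def running_counts(rows, key_columns):
    """For each row, count the rows seen so far (including it) with the same key values."""
    seen = defaultdict(int)   # key tuple -> number of rows seen so far with that key
    counts = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        seen[key] += 1
        counts.append(seen[key])
    return counts

# The three key columns from the example table above (hypothetical in-memory layout).
keys = ["green.h.0", "green.v.0", "offset.0"]
values = [
    (100, 120, 40), (100, 120, 40), (100, 120, 40), (100, 120, 40),
    (200, 100, 40), (150, 90, 10), (90, 90, 100), (90, 90, 100),
]
rows = [dict(zip(keys, v)) for v in values]

counts = running_counts(rows, keys)
print(counts)  # [1, 2, 3, 4, 1, 1, 1, 2]
```

Each row is visited once, so the cost grows linearly with the number of rows, whereas re-selecting all matching rows inside a loop rescans the data for every row. Data-frame libraries usually offer the same idea as a grouped cumulative count.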