Here is my problem:
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': [1, 0, 0, 1, 0]})
I would like to generate something like the output of pd.crosstab, but the value at the intersection of each row and column should come from an aggregation of the third column:
     Ar   Br   Cr
one  0.5  0    0
two  1    0    0
For example, there are two rows with “one” in column “A” and “Ar” in column “B”, and the corresponding values in column “C” are 1 and 0. We sum those values (1 + 0) and divide by the number of values, so we get (1 + 0) / 2 = 0.5. Whenever a combination is missing (for example, “Cr” and “one”), it is set to zero. Any thoughts?
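To make the computation concrete, here is a minimal sketch of one way it could be done. It assumes that "sum divided by count" can be expressed as the mean of column “C”, and it uses the values and aggfunc arguments of pd.crosstab, with fillna(0) for the missing combinations. The name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': [1, 0, 0, 1, 0]})

# Mean of C for every (A, B) pair. Pairs that never occur in df
# come out as NaN, so they are replaced with 0 afterwards.
result = pd.crosstab(df['A'], df['B'], values=df['C'], aggfunc='mean').fillna(0)

print(result)
```

A pivot_table call with index='A', columns='B', values='C', aggfunc='mean' and fill_value=0 should give the same table; crosstab is used here only because it is the function mentioned above.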