I have a dataset with missing values:
id  category  value
1   A         NaN
2   B         NaN
3   A         10.5
4   C         NaN
5   A         2.0
6   B         1.0
I need to fill the NaNs so I can use the data in a model. Each time a category appears for the first time, its value is NaN. For categories like A and B, which have more than one row, I want to replace the NaNs with the mean of that category. For category C, which appears only once, I just want to fill in the mean of the rest of the data.
I know that for cases like C I can simply fill with the mean over all rows, as in the line below, but I'm stuck on computing the per-category means for A and B and using them to replace the NaNs.
df['value'] = df['value'].fillna(df['value'].mean())
I need the final df to look like this:
id  category  value
1   A         6.25
2   B         1.0
3   A         10.5
4   C         4.15
5   A         2.0
6   B         1.0
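One possible way to get there, as a minimal sketch rather than a definitive answer: fill in two passes, first with the per-category mean via groupby(...).transform("mean"), then with the overall mean of the already filled column for whatever is still NaN. The frame below is rebuilt from the sample data above, and the name group_mean is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "category": ["A", "B", "A", "C", "A", "B"],
    "value": [np.nan, np.nan, 10.5, np.nan, 2.0, 1.0],
})

# Pass 1: per-row mean of the row's own category (NaNs are skipped).
# A category with no observed values at all, like C, gets NaN here.
group_mean = df.groupby("category")["value"].transform("mean")
df["value"] = df["value"].fillna(group_mean)

# Pass 2: anything still NaN gets the mean of the column as filled so far.
df["value"] = df["value"].fillna(df["value"].mean())

print(df)
```

With this order, C is filled from the column after A and B have been filled, which is what gives 4.15 in the expected output. If the overall mean should instead come only from the originally observed values (10.5, 2.0 and 1.0, i.e. 4.5), it would have to be computed before pass 1.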