I have this dataframe example:
       animal gender     name  first  second  third
    0     dog      m      Ben      5       6      3
    1     dog      f    Lilly      2       3      5
    2     dog      m      Bob      3       2      1
    3     cat      f     Puss      1       4      4
    4     cat      m  Inboots      3       6      5
    5    wolf      f     Lady    NaN       0      3
    6    wolf      m   Summer      2       2      1
    7    wolf      m     Grey      4       2      3
    8    wolf      m     Wind      2       3      5
    9    lion      f     Elsa      5       1      4
    10   lion      m    Simba      3       3      3
    11   lion      f     Nala      4       4      2
I suspect I may need hierarchical indexing for this, but I haven't gotten that far in pandas yet. Still, I really need to do some (apparently too advanced) things with this data and haven't figured out how. What I would like to end up with is a plot (probably a scatter plot, although a line plot would do just as well for now).
1) I would like a figure with 4 subplots, one for each animal. The title of each subplot should be the animal.
2) In each subplot, I would like to plot the numbers (for example, the number of births per year), i.e. the values of "first", "second" and "third" for a given row, and give each line a label so that the "name" shows up in the legend. Within each subplot (each animal), I would like to distinguish males from females (for example, males in blue and females in red) and also plot the average for the animal (i.e. the mean of each column for that animal) in black.
3) Note: the values are plotted against 1, 2, 3, i.e. the column number. So, for example, for the first subplot, titled "dog", I would like to plot something like plt.plot(np.array([1,2,3]), x, 'b', np.array([1,2,3]), y, 'r', np.array([1,2,3]), np.mean([x, y], axis=0), 'k'), where x would be (in the first case) 5, 6, 3, with the legend entry for this blue line showing 'Ben'; y would be 2, 3, 5, with the legend entry for the red line showing 'Lilly'; and the black line would be 3.5, 4.5, 4, labelled in the legend as the mean (for each of the subplots).
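To make point 3 concrete, here is a minimal runnable sketch of that single "dog" subplot using only the two rows mentioned (Ben and Lilly). The variable names `cols` and `mean` are my own, and the `Agg` backend line is only an assumption for running without a display; drop it for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive plotting
import matplotlib.pyplot as plt
import numpy as np

cols = np.array([1, 2, 3])      # column numbers for first, second, third
x = np.array([5, 6, 3])         # Ben (male)
y = np.array([2, 3, 5])         # Lilly (female)
mean = np.mean([x, y], axis=0)  # element-wise average of the two rows

fig, ax = plt.subplots()
ax.plot(cols, x, "b", label="Ben")      # male in blue
ax.plot(cols, y, "r", label="Lilly")    # female in red
ax.plot(cols, mean, "k", label="mean")  # average in black
ax.set_title("dog")
ax.set_xticks(cols)
ax.legend()
```

Plotting each line with its own `ax.plot` call (rather than one combined `plt.plot`) is what lets every line carry its own legend label.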
I hope I made myself clear enough. I understand that without seeing the final figure, it can be difficult to imagine it, but ... well, if I knew how to do this, I would not ask ...
So, in short, I would like to loop over the dataframe at different levels, with the animals in separate subplots and, within each subplot, a comparison of males and females plus the average across them.
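For reference, the kind of loop I have in mind might look roughly like the sketch below. It is only a guess at one possible approach: it assumes `groupby` on the "animal" column is enough (no hierarchical index), and the names `value_cols`, `xs`, `colors` and the two-column grid layout are my own choices, not requirements.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive plotting
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "animal": ["dog"] * 3 + ["cat"] * 2 + ["wolf"] * 4 + ["lion"] * 3,
    "gender": list("mfmfmfmmmfmf"),
    "name": ["Ben", "Lilly", "Bob", "Puss", "Inboots", "Lady",
             "Summer", "Grey", "Wind", "Elsa", "Simba", "Nala"],
    "first": [5, 2, 3, 1, 3, np.nan, 2, 4, 2, 5, 3, 4],
    "second": [6, 3, 2, 4, 6, 0, 2, 2, 3, 1, 3, 4],
    "third": [3, 5, 1, 4, 5, 3, 1, 3, 5, 4, 3, 2],
})

value_cols = ["first", "second", "third"]
xs = np.arange(1, len(value_cols) + 1)  # plot against column numbers 1, 2, 3
colors = {"m": "b", "f": "r"}           # male in blue, female in red

# One group per animal, kept in order of first appearance.
groups = df.groupby("animal", sort=False)
ncols = 2
nrows = math.ceil(groups.ngroups / ncols)
fig, axes = plt.subplots(nrows, ncols, squeeze=False, figsize=(10, 4 * nrows))

for ax, (animal, group) in zip(axes.flat, groups):
    # One line per row, coloured by gender and labelled with the name.
    for _, row in group.iterrows():
        ax.plot(xs, row[value_cols].astype(float),
                colors[row["gender"]] + "o-", label=row["name"])
    # Column-wise mean for this animal in black (NaN values are skipped).
    ax.plot(xs, group[value_cols].mean(), "ko-", label="mean")
    ax.set_title(animal)
    ax.set_xticks(xs)
    ax.set_xticklabels(value_cols)
    ax.legend()

fig.tight_layout()
```

Because the grid size is derived from the number of groups, the same loop should carry over to a larger dataframe with more animals, although any unused axes in the last row would stay empty.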
My actual dataframe is much larger, so ideally I would like the solution to be robust but still understandable (for a novice programmer).
To give an idea of what a subplot should look like, here is one made in Excel:
