Suppose I have a pandas DataFrame with the structure shown below. In practice it can be much larger, and both the number of level 1 index values and the number of level 2 index values under each level 1 value will vary, so the solution should not make any assumptions about them:
import pandas

index = pandas.MultiIndex.from_tuples([
    ("a", "s"),
    ("a", "u"),
    ("a", "v"),
    ("b", "s"),
    ("b", "u")])
result = pandas.DataFrame([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10]], index=index, columns=["x", "y"])
Which looks like this:
     x   y
a s  1   2
  u  3   4
  v  5   6
b s  7   8
  u  9  10
Now let's say that I want to add a "total" row (labeled "t", holding the column sums) at the end of each of the level 1 groups "a" and "b". Given the above as input, I would like my code to produce something like this:
      x   y
a s   1   2
  u   3   4
  v   5   6
  t   9  12
b s   7   8
  u   9  10
  t  16  18
Here is the code I have so far:
for level, _ in result.groupby(level=0):
    x_sum = result.loc[level]["x"].sum()
    y_sum = result.loc[level]["y"].sum()
    total = pandas.DataFrame([[x_sum, y_sum]], columns=result.columns, index=pandas.MultiIndex.from_tuples([(level, "t")]))
    # DataFrame.append was removed in pandas 2.0; pandas.concat appends the row the same way.
    result = pandas.concat([result, total])
But this adds the "total" rows at the very end instead of at the end of each group:
      x   y
a s   1   2
  u   3   4
  v   5   6
b s   7   8
  u   9  10
a t   9  12
b t  16  18
Sorting with result.sort_index() does not do what I want either, since the "t" rows end up between "s" and "u":
      x   y
a s   1   2
  t   9  12
  u   3   4
  v   5   6
b s   7   8
  t  16  18
  u   9  10
What am I doing wrong?
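
For reference, one possible way to get the layout above is to place each total directly after its own group while the pieces are being assembled, instead of appending everything and sorting afterwards. The following is only a minimal sketch under some assumptions: all columns are numeric, plain column sums are what is wanted, and the label "t" is not already used in the second index level. The helper name `with_totals` is made up for illustration and is not part of pandas.

```python
import pandas


def with_totals(frame, label="t"):
    """Return a new frame with a sum row (hypothetical helper) after each level-0 group."""
    pieces = []
    # sort=False keeps the level-0 groups in their order of first appearance.
    for key, group in frame.groupby(level=0, sort=False):
        # One-row frame holding this group's column sums, indexed by (key, label).
        total = pandas.DataFrame(
            [group.sum().tolist()],
            columns=frame.columns,
            index=pandas.MultiIndex.from_tuples([(key, label)]),
        )
        # Emit the group first, then its total, so the total lands last in the group.
        pieces.extend([group, total])
    return pandas.concat(pieces)


# Sample data from the question.
index = pandas.MultiIndex.from_tuples([
    ("a", "s"),
    ("a", "u"),
    ("a", "v"),
    ("b", "s"),
    ("b", "u")])
result = pandas.DataFrame([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10]], index=index, columns=["x", "y"])

out = with_totals(result)
print(out)
```

Because the rows are concatenated in the desired order, no call to sort_index() is needed afterwards; calling it would move the "t" rows back between "s" and "u".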