Pandas: slice a MultiIndex DataFrame over a range of the secondary index

It has been shown that a multi-indexed pandas Series can be sliced on its second index level:

    import numpy as np
    import pandas as pd

    buckets = np.repeat(range(3), [3,5,7])
    sequence = np.hstack(map(range,[3,5,7]))
    s = pd.Series(np.random.randn(len(sequence)),
                  index=pd.MultiIndex.from_tuples(zip(buckets, sequence)))
    print s

    0  0    0.021362
       1    0.917947
       2   -0.956313
    1  0   -0.242659
       1    0.398657
       2    0.455909
       3    0.200061
       4   -1.273537
    2  0    0.747849
       1   -0.012899
       2    1.026659
       3   -0.256648
       4    0.799381
       5    0.064147
       6    0.491336

Then, to get the first three rows where the first index level equals 1, you simply write:

    s[1].ix[range(3)]

    0   -0.242659
    1    0.398657
    2    0.455909

This works great for one-dimensional series, but not for DataFrames:

    buckets = np.repeat(range(3), [3,5,7])
    sequence = np.hstack(map(range,[3,5,7]))
    d = pd.DataFrame(np.random.randn(len(sequence),2),
                     index=pd.MultiIndex.from_tuples(zip(buckets, sequence)))
    print d

                0         1
    0 0  1.217659  0.312286
      1  0.559782  0.686448
      2 -0.143116  1.146196
    1 0 -0.195582  0.298426
      1  1.504944 -0.205834
      2  0.018644 -0.979848
      3 -0.387756  0.739513
      4  0.719952 -0.996502
    2 0  0.065863  0.481190
      1 -1.309163  0.881319
      2  0.545382  2.048734
      3  0.506498  0.451335
      4  0.872743 -0.070985
      5 -1.160473  1.082550
      6  0.331796 -0.366597

    d[1].ix[range(3)]

    0  0    0.312286
       1    0.686448
       2    1.146196
    Name: 1

This returns the data column labeled 1 and its first three rows, ignoring the first level of the index. How can you get the first three rows where the first index level equals 1 in a multi-indexed DataFrame?

2 answers
    d.xs(1)[0:3]

              0         1
    0 -0.716206  0.119265
    1 -0.782315  0.097844
    2  2.042751 -1.116453
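For readers on Python 3 and a current pandas, the same idea might be sketched as below. The setup rebuilds a frame shaped like the one in the question. The seeded generator and the name first_three are assumptions made for illustration and are not part of the original answer. xs(1) selects the rows whose first index level equals 1 and drops that level, and .iloc[:3] then takes the first three rows by position.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (the values are random).
buckets = np.repeat(range(3), [3, 5, 7])
sequence = np.hstack([np.arange(n) for n in [3, 5, 7]])
index = pd.MultiIndex.from_arrays([buckets, sequence])
rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
d = pd.DataFrame(rng.standard_normal((len(sequence), 2)), index=index)

# Cross-section at first level == 1, then the first three rows by position.
first_three = d.xs(1).iloc[:3]
print(first_three)
```

The result is indexed by the second level only, because xs drops the level it selects on by default.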

Alternatively, .loc is more efficient and selects on both index levels in a single step.

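s.loc[pd.IndexSlice[1, :3]] returns the rows where level 0 equals 1 and level 1 falls in the label range 0 to 3. Note that .loc slices by label and includes both endpoints, so :2 is the slice that yields exactly the first three rows here.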
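Applied to the DataFrame from the question, the .loc approach could look like the sketch below. It assumes a sorted MultiIndex, which label slicing on a level requires, and the seeded data and the names up_to_3 and first_three are illustrative only. The trailing ", :" selects all columns.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (the values are random).
buckets = np.repeat(range(3), [3, 5, 7])
sequence = np.hstack([np.arange(n) for n in [3, 5, 7]])
index = pd.MultiIndex.from_arrays([buckets, sequence])
rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
d = pd.DataFrame(rng.standard_normal((len(sequence), 2)), index=index)

idx = pd.IndexSlice

# Label slices are inclusive: ":3" covers second-level labels 0, 1, 2 and 3.
up_to_3 = d.loc[idx[1, :3], :]

# ":2" gives exactly the first three rows for first level == 1.
first_three = d.loc[idx[1, :2], :]
print(first_three)
```

Unlike the xs approach, the result keeps both index levels.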


Source: https://habr.com/ru/post/1446511/

