Pandas multi-index slicing by level names

Recent versions of Pandas support multi-index slicers. However, to use them properly you need to know the integer position of each level.

For example, the following:

idx = pd.IndexSlice
dfmi.loc[idx[:,:,['C1','C3']],idx[:,'foo']]

assumes we know that the third row level is the one we want to index with C1 and C3, and that the second column level is the one we want to index with foo.
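To make the positional dependence concrete, here is a minimal sketch of such a frame. The level names, labels and values are invented for illustration; only the `.loc` call itself comes from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: three row levels and two column levels (all names made up).
rows = pd.MultiIndex.from_product(
    [["A0", "A1"], ["B0", "B1"], ["C0", "C1", "C2", "C3"]],
    names=["level_name_1", "level_name_2", "level_name_3"],
)
cols = pd.MultiIndex.from_product(
    [["a", "b"], ["bar", "foo"]], names=["lvl0", "lvl1"]
)
dfmi = pd.DataFrame(
    np.arange(len(rows) * len(cols)).reshape(len(rows), len(cols)),
    index=rows,
    columns=cols,
)

idx = pd.IndexSlice
# Purely positional: the third row slot receives ['C1', 'C3'],
# the second column slot receives 'foo'.
result = dfmi.loc[idx[:, :, ["C1", "C3"]], idx[:, "foo"]]
```

If the levels were stored in a different order, the same call would apply each selector to the wrong level, which is the problem described here.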

Sometimes I know the names of the levels, but not their positions in the multi-index. Is there a way to use multi-index slicers in this case?

For example, say I know which slice I want to apply to each level name, e.g. as a dictionary:

'level_name_1' -> ':' 
'level_name_2' -> ':'
'level_name_3' -> ['C1', 'C3']

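but I do not know the position (depth) of these levels in the multi-index. Does Pandas have a built-in indexing mechanism for this?

Can I still use pd.IndexSlice somehow if I know the level names, but not their positions?

PS: I know I could use reset_index() and then work with flat columns, but I would like to avoid resetting the index (even temporarily). I could also use query, but query requires the index names to be valid Python identifiers (no spaces, etc.).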


The closest thing I have seen to this is:

df.xs('C1', level='foo')

where foo is the name of the level and C1 is the value of interest.

xs also supports multiple keys, for example:

df.xs(('one', 'bar'), level=('second', 'first'), axis=1)

However, xs does not appear to support slices or ranges (which pd.IndexSlice does).
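As a rough illustration of name-based selection with xs, here is a small self-contained sketch. The frame, its labels and the level names `first` and `second` are assumptions chosen to match the calls above.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns named 'first' and 'second'.
cols = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=cols)

# Select by level name only; the position of 'second' does not matter.
by_one = df.xs("one", level="second", axis=1)

# Several keys at once, listed in the same order as `level`.
single = df.xs(("one", "bar"), level=("second", "first"), axis=1)
```

Each key here is a single label, so this covers exact matches per level but not the slice or list selections asked about in the question.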

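As far as I know, this is not supported directly. You can work around it by looking up each level's position from its name: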

In [10]: import numpy as np; import pandas as pd

In [11]: midx = pd.MultiIndex.from_product([list(range(3)),['a','b','c'],pd.date_range('20130101',periods=3)],names=['numbers','letters','dates'])

In [12]: midx.names.index('letters')
Out[12]: 1

In [13]: midx.names.index('dates')
Out[13]: 2

In [18]: df = pd.DataFrame(np.random.randn(len(midx),1),index=midx)

In [19]: df
Out[19]: 
                                   0
numbers letters dates               
0       a       2013-01-01  0.261092
                2013-01-02 -1.267770
                2013-01-03  0.008230
        b       2013-01-01 -1.515866
                2013-01-02  0.351942
                2013-01-03 -0.245463
        c       2013-01-01 -0.253103
                2013-01-02 -0.385411
                2013-01-03 -1.740821
1       a       2013-01-01 -0.108325
                2013-01-02 -0.212350
                2013-01-03  0.021097
        b       2013-01-01 -1.922214
                2013-01-02 -1.769003
                2013-01-03 -0.594216
        c       2013-01-01 -0.419775
                2013-01-02  1.511700
                2013-01-03  0.994332
2       a       2013-01-01 -0.020299
                2013-01-02 -0.749474
                2013-01-03 -1.478558
        b       2013-01-01 -1.357671
                2013-01-02  0.161185
                2013-01-03 -0.658246
        c       2013-01-01 -0.564796
                2013-01-02 -0.333106
                2013-01-03 -2.814611

Here is the dictionary of level names → indexers:

In [20]: slicers = { 'numbers' : slice(0,1), 'dates' : slice('20130102','20130103') }

Create an indexer that is empty by default (i.e. selects everything on every level):

In [21]: indexer = [ slice(None) ] * len(df.index.levels)

In [22]: for n, idx in slicers.items():
   ....:     indexer[df.index.names.index(n)] = idx

Then use it (the indexer is built as a list, but it must be passed to .loc as a tuple):

In [23]: df.loc[tuple(indexer),:]
Out[23]: 
                                   0
numbers letters dates               
0       a       2013-01-02 -1.267770
                2013-01-03  0.008230
        b       2013-01-02  0.351942
                2013-01-03 -0.245463
        c       2013-01-02 -0.385411
                2013-01-03 -1.740821
1       a       2013-01-02 -0.212350
                2013-01-03  0.021097
        b       2013-01-02 -1.769003
                2013-01-03 -0.594216
        c       2013-01-02  1.511700
                2013-01-03  0.994332
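The steps above can be wrapped in a small helper. This is only a sketch: `build_indexer` is a hypothetical name, not a Pandas function, and the data uses deterministic values and `pd.Timestamp` bounds instead of the random numbers and date strings shown above.

```python
import numpy as np
import pandas as pd


def build_indexer(index, slicers):
    """Turn {level name: indexer} into a positional tuple usable with .loc.

    Levels not mentioned in `slicers` select everything.
    (Hypothetical helper, not part of Pandas.)
    """
    indexer = [slice(None)] * index.nlevels
    for name, sel in slicers.items():
        # list.index raises ValueError if the level name is unknown
        indexer[index.names.index(name)] = sel
    return tuple(indexer)


midx = pd.MultiIndex.from_product(
    [list(range(3)), ["a", "b", "c"], pd.date_range("20130101", periods=3)],
    names=["numbers", "letters", "dates"],
)
df = pd.DataFrame(np.arange(len(midx)), index=midx, columns=["value"])

rows = build_indexer(
    df.index,
    {
        "numbers": slice(0, 1),
        "dates": slice(pd.Timestamp("2013-01-02"), pd.Timestamp("2013-01-03")),
    },
)
subset = df.loc[rows, :]

# Lists of labels work too, as in the question's dictionary.
letters = df.loc[build_indexer(df.index, {"letters": ["a", "c"]}), :]
```

The same helper could presumably be applied to `df.columns` to build the second argument of `.loc` when the columns are also a multi-index. Note that slicing a multi-index this way generally requires the index to be sorted.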

Source: https://habr.com/ru/post/1543903/

