Pandas - filter multi-index by condition for all values inside the index

Question

I am trying to filter a dataframe with a multi-index like the following.

import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.rand(8),
             index=[list('AABBCCDD'),
                    ['M', 'F']*4])
data['Count'] = [1,2,15,17,8,12,11,20]

I would like to select all rows where the "Count" for both "M" and "F" within a given outer-level index is greater than 10. So for the example frame, all "B" and "D" rows should be selected, but none of the other rows. The only way I have found to do this is to iterate over the outer index, but since loops in pandas are almost never the best approach, I think there should be a better solution.

+4

python pandas dataframe

elphz Apr 16 '18 at 16:44

3 answers

Use groupby.transform:

res = data[data.groupby(data.index.get_level_values(0))['Count'].transform('min') > 10]

print(res)

#             0  Count
# B M  0.143501     15
#   F  0.964689     17
# D M  0.092362     11
#   F  0.981470     20
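To see why this mask works, here is a minimal sketch that uses fixed counts and drops the random column (both simplifications are assumptions made for illustration, not part of the original answer). `transform('min')` returns a Series aligned with the original rows, where every row carries the minimum of its outer-level group, so comparing it with 10 yields a row-level boolean mask.

```python
import pandas as pd

# Same index layout as the question, with fixed counts only.
index = pd.MultiIndex.from_arrays([list('AABBCCDD'), ['M', 'F'] * 4])
data = pd.DataFrame({'Count': [1, 2, 15, 17, 8, 12, 11, 20]}, index=index)

# Each row receives the minimum Count of its outer-level group.
group_min = data.groupby(level=0)['Count'].transform('min')

# A group passes only if its smallest Count is above the threshold.
mask = group_min > 10
res = data[mask]
print(res)
```

Grouping with `level=0` here is equivalent to grouping by `data.index.get_level_values(0)` as in the answer above.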

+2

jpp Apr 16 '18 at 16:55

Option 1

Unstacking and stacking with a level mask

data.unstack()[data.Count.gt(10).all(level=0)].stack()

            0  Count
B F  0.778883     17
  M  0.548054     15
D F  0.035073     20
  M  0.544838     11

Option 2

Using the level argument of pandas.Series.all and pd.DataFrame.reindex.
This avoids unstacking / stacking.

mask = data.Count.gt(10).all(level=0)
data.reindex(mask.index[mask], level=0)

            0  Count
B M  0.548054     15
  F  0.778883     17
D M  0.544838     11
  F  0.035073     20
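Note that the `level` argument of `Series.all` was deprecated and then removed in pandas 2.0, so on recent versions the mask used in both options has to be built another way. The following is a sketch under that assumption, using `groupby(level=0).all()` and fixed counts; selecting rows with `.loc` is a substitution made here for illustration, not the original answer's method.

```python
import pandas as pd

# Same index layout as the question, with fixed counts only.
index = pd.MultiIndex.from_arrays([list('AABBCCDD'), ['M', 'F'] * 4])
data = pd.DataFrame({'Count': [1, 2, 15, 17, 8, 12, 11, 20]}, index=index)

# One boolean per outer-level label: True if every Count in the group is > 10.
mask = data['Count'].gt(10).groupby(level=0).all()

# Outer-level labels that pass the test.
keep = mask.index[mask]

# A list of labels passed to .loc selects on the first index level.
res = data.loc[list(keep)]
print(res)
```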

+2

piRSquared Apr 16 '18 at 17:16

Wen · Accepted Answer · 2018-04-16T16:47:07+0000

Use groupby with filter + all to apply the threshold:

data.groupby(level=0).filter(lambda x : x['Count'].gt(10).all())
Out[495]: 
            0  Count
B M  0.232856     15
  F  0.536026     17
D M  0.375064     11
  F  0.795447     20

Or, similar to jpp's answer, using isin:

s=data.Count.min(level=0).gt(10)
data.loc[data.index.get_level_values(0).isin(s[s].index)]
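The `min(level=0)` call relies on the `level` argument that newer pandas releases (2.0 and later) no longer accept. A hedged equivalent of the same idea, assuming a recent pandas and using fixed counts for reproducibility, computes the per-group minimum with `groupby(level=0)` and keeps the `isin` selection unchanged:

```python
import pandas as pd

# Same index layout as the question, with fixed counts only.
index = pd.MultiIndex.from_arrays([list('AABBCCDD'), ['M', 'F'] * 4])
data = pd.DataFrame({'Count': [1, 2, 15, 17, 8, 12, 11, 20]}, index=index)

# Per-group minimum, then test it against the threshold.
s = data['Count'].groupby(level=0).min().gt(10)

# s[s].index holds the outer-level labels whose minimum exceeds 10.
res = data.loc[data.index.get_level_values(0).isin(s[s].index)]
print(res)
```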
