Setting value by slice index and conditional strings

Question

Setting value by slice index and conditional strings

An attempt to set the col 'X' value by slicing to multindex, as well as taking into account the conditional value of the "Z" column. I can set the col 'X' value pretty easily, but I'm stuck trying to figure out the conditional.

import pandas as pd

FOOBAR = (['foo','foo','foo','foo','bar','bar','bar','bar'])
NUM1 = ([5,5,6,6,8,8,5,5])
NUM2 = ([1,1,2,2,3,3,1,1])
NUM3 = ([1001,1002,1002,1002,1003,1004,1004,1005])

#build and name index using data
index = pd.MultiIndex.from_arrays([FOOBAR,NUM1,NUM2,NUM3], 
                                  names=['iFOOBAR','iNUM1','iNUM2','iNUM3'])

df = pd.DataFrame({'X': [ 0, 1, 2, 3, 4, 5, 6, 7],
                   'Y': [ 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'Z': [ 1, 2, 2, 4, 5, 6, 7, 7],
                   'FL': [0.1,0.1,0.2,0.2,0.4,0.4,0.1,0.1]
                   }, index=index)

df.sortlevel(inplace=True)
idx = pd.IndexSlice

#original df
#                            FL  X  Y  Z
#iFOOBAR iNUM1 iNUM2 iNUM3              
#bar     5     1     1004   0.1  6  G  7
#                    1005   0.1  7  H  7
#        8     3     1003   0.4  4  E  5
#                    1004   0.4  5  F  6
#foo     5     1     1001   0.1  0  A  1
#                    1002   0.1  1  B  2
#        6     2     1002   0.2  2  C  2
#                    1002   0.2  3  D  4

#set value in 'X' based on index
newdf = df.loc[idx['foo',5,1,:], idx['X']] = 999

#new df
#                            FL    X  Y  Z
#iFOOBAR iNUM1 iNUM2 iNUM3                
#bar     5     1     1004   0.1    6  G  7
#                    1005   0.1    7  H  7
#        8     3     1003   0.4    4  E  5
#                    1004   0.4    5  F  6
#foo     5     1     1001   0.1  999  A  1
#                    1002   0.1  999  B  2
#        6     2     1002   0.2    2  C  2
#                    1002   0.2    3  D  4

#set value in 'X' base on index and 'Z' == 2 ???
#nextdf = df.loc[idx['foo',5,1,:], idx['Z'== 2]], 'X' = 999

#next df: desired output
#                            FL    X  Y  Z
#iFOOBAR iNUM1 iNUM2 iNUM3                
#bar     5     1     1004   0.1    6  G  7
#                    1005   0.1    7  H  7
#        8     3     1003   0.4    4  E  5
#                    1004   0.4    5  F  6
#foo     5     1     1001   0.1    0  A  1
#                    1002   0.1  999  B  2
#        6     2     1002   0.2    2  C  2
#                    1002   0.2    3  D  4

+4

python pandas dataframe multi-index

crashMOGWAI Aug 10 '16 at 17:11

source share

4 answers

This is actually quite complicated. It seems like there might be a better way, but here's one approach that depends on a little knowledge of indexing internal components - creating a set of places that match your criteria, and then transferring everything to iloc.

In [80]: cond1 = df.index.get_locs(idx['foo',5, 1, :])

In [81]: cond2, = (df['Z'] == 2).nonzero()

In [82]: row_indexer = pd.Index(cond1).intersection(cond2)

In [83]: row_indexer
Out[83]: Int64Index([5], dtype='int64')

In [84]: col_indexer = df.columns.get_loc('X')

In [85]: df.iloc[row_indexer, col_indexer] = 999

In [90]: df
Out[90]: 
                            FL    X  Y  Z
iFOOBAR iNUM1 iNUM2 iNUM3                
bar     5     1     1004   0.1    6  G  7
                    1005   0.1    7  H  7
        8     3     1003   0.4    4  E  5
                    1004   0.4    5  F  6
foo     5     1     1001   0.1    0  A  1
                    1002   0.1  999  B  2
        6     2     1002   0.2    2  C  2
                    1002   0.2    3  D  4

+2

chrisb 10 . '16 17:57

DataFrame, df, ,

index1 = df.loc[df['Z'] == 2].index
index2 = df.loc[idx['foo',5,1,:]].index
df.loc[index1.intersection(index2), 'X'] = 999

Since yours dfdoes not have a unique index, you can create a faux index column and then continue in a similar way:

df['index'] = np.arange(len(df))
index1 = df.loc[df['Z'] == 2, 'index']
index2 = df.loc[idx['foo',5,1,:], 'index']
df.ix[np.intersect1d(index1, index2), 'X'] = 999
df = df.drop('index', axis=1)
print(df)

gives

                            FL    X  Y  Z
iFOOBAR iNUM1 iNUM2 iNUM3                
bar     5     1     1004   0.1    6  G  7
                    1005   0.1    7  H  7
        8     3     1003   0.4    4  E  5
                    1004   0.4    5  F  6
foo     5     1     1001   0.1    0  A  1
                    1002   0.1  999  B  2
        6     2     1002   0.2    2  C  2
                    1002   0.2    3  D  4

Note that chrisb solution is more efficient because it does not generate sub-DataFrames. It prepares ordinal indexers and then cuts once.

+2

unutbu Aug 10 '16 at 18:13

source share

Figured it out ...

mask = df.loc[idx[:],idx['Z']] == 2
df.loc[idx[mask,5,1,:],idx['X']] = 999

0

crashMOGWAI Aug 11 '16 at 23:13

source share

Merlin · Accepted Answer · 2016-08-12T00:45:25+0000

Try the following: (single line)

df.loc[idx[:,5,1,(df['Z'] == 2)],idx['X']] = 999
df


                           FL    X  Y  Z
iFOOBAR iNUM1 iNUM2 iNUM3                
bar     5     1     1004   0.1    6  G  7
                    1005   0.1    7  H  7
        8     3     1003   0.4    4  E  5
                    1004   0.4    5  F  6
foo     5     1     1001   0.1    0  A  1
                    1002   0.1  999  B  2
        6     2     1002   0.2    2  C  2
                    1002   0.2    3  D  4
In [126]:

Setting value by slice index and conditional strings

More articles: