How to use .loc with multiple conditions and groupby

Question

I have a df that is grouped by Key. Within each group, I want to flag every row whose discharge date is the same as another row's discharge date, but only if at least one of those rows has a Num1 value in the range 5-12. I found a similar question here, but could not work out how to combine several conditions.

import pandas as pd

df = pd.DataFrame({
    'Key': ['10003', '10003', '10003', '10003', '10003', '10003', '10034', '10034'],
    'Num1': [12, 13, 13, 13, 12, 13, 16, 13],
    'Num2': [121, 122, 122, 124, 125, 126, 127, 128],
    'admit': [20120506, 20120508, 20121010, 20121010, 20121010, 20121110, 20120516, 20120520],
    'discharge': [20120508, 20120508, 20121012, 20121016, 20121023, 20121111, 20120520, 20120520]})
# Parse the integer YYYYMMDD values into datetimes
df['admit'] = pd.to_datetime(df['admit'], format='%Y%m%d')
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')

Initial df:

    Key     Num1    Num2    admit       discharge
0   10003   12      121     2012-05-06  2012-05-08
1   10003   13      122     2012-05-08  2012-05-08
2   10003   13      122     2012-10-10  2012-10-12
3   10003   13      124     2012-10-10  2012-10-16
4   10003   12      125     2012-10-10  2012-10-23
5   10003   13      126     2012-11-10  2012-11-11
6   10034   16      127     2012-05-16  2012-05-20
7   10034   13      128     2012-05-20  2012-05-20

Desired final df:

    Key     Num1    Num2    admit       discharge   flag
0   10003   12      121     2012-05-06  2012-05-08  1
1   10003   13      122     2012-05-08  2012-05-08  1
2   10003   13      122     2012-10-10  2012-10-12  0
3   10003   13      124     2012-10-10  2012-10-16  0
4   10003   12      125     2012-10-10  2012-10-23  0
5   10003   13      126     2012-11-10  2012-11-11  0
6   10034   16      127     2012-05-16  2012-05-20  0
7   10034   13      128     2012-05-20  2012-05-20  0

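What I have so far (this only checks for duplicated discharge dates within each Key, not the Num1 range):

num1_range = [5, 6, 7, 8, 9, 10, 11, 12]
df.loc[df.groupby('Key').apply(lambda x: x.duplicated(subset='discharge', keep=False)).values, 'flag'] = 1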

python pandas

MartyB Mar 07 '18 at 4:30

1 answer

Wen · Accepted Answer · 2018-03-07T04:35:41+0000

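You can use filter for this. Group by both Key and discharge, then keep only the groups that contain more than one row and have at least one Num1 value in num1_range: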

df.loc[df.groupby(['Key','discharge']).Num1.filter(lambda x : (x.isin(num1_range).any())&(len(x)>1)).index,'flag']=1
df
Out[317]: 
     Key  Num1  Num2      admit  discharge  flag
0  10003    12   121 2012-05-06 2012-05-08   1.0
1  10003    13   122 2012-05-08 2012-05-08   1.0
2  10003    13   122 2012-10-10 2012-10-12   NaN
3  10003    13   124 2012-10-10 2012-10-16   NaN
4  10003    12   125 2012-10-10 2012-10-23   NaN
5  10003    13   126 2012-11-10 2012-11-11   NaN
6  10034    16   127 2012-05-16 2012-05-20   NaN
7  10034    13   128 2012-05-20 2012-05-20   NaN
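
Note that the output above leaves unflagged rows as NaN (and the flagged ones as 1.0), whereas the desired result has 0 and 1. As a minimal sketch of one way to get an integer flag directly, the two conditions can be built as boolean masks and combined. The names shared_date, in_range and any_in_range are only illustrative, and between(5, 12) is assumed to be equivalent to num1_range because Num1 holds integers:

```python
import pandas as pd

df = pd.DataFrame({
    'Key': ['10003', '10003', '10003', '10003', '10003', '10003', '10034', '10034'],
    'Num1': [12, 13, 13, 13, 12, 13, 16, 13],
    'discharge': pd.to_datetime(
        ['20120508', '20120508', '20121012', '20121016',
         '20121023', '20121111', '20120520', '20120520'], format='%Y%m%d'),
})

# Condition 1: the discharge date occurs more than once within the same Key.
shared_date = df.duplicated(subset=['Key', 'discharge'], keep=False)

# Condition 2: at least one row with that (Key, discharge) has Num1 in 5..12.
# Assumption: Num1 is integer-valued, so between(5, 12) matches num1_range.
in_range = df['Num1'].between(5, 12)
any_in_range = in_range.groupby([df['Key'], df['discharge']]).transform('any')

# Both conditions must hold; cast the boolean mask to 0/1.
df['flag'] = (shared_date & any_in_range).astype(int)
print(df)
```

Alternatively, setting df['flag'] = 0 before the .loc assignment in the answer should also avoid the NaN values.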
