IF ELSE using Numpy and Pandas

After searching several forums for similar issues, it seems that one way to quickly apply a conditional statement across rows is to use the Numpy function np.where() with Pandas. I am having problems with the following task:

I have a dataset whose rows look like this:

PatientID    Date1      Date2       ICD
1234         12/14/10   12/12/10    313.2, 414.2, 228.1
3213         8/2/10     9/5/12      232.1, 221.0

I am trying to create a conditional statement so that:

 1. if strings '313.2' or '414.2' exist in df['ICD'] return 1
 2. if strings '313.2' or '414.2' exist in df['ICD'] and Date1>Date2 return 2
 3. Else return 0

Given that Date1 and Date2 are both in datetime format, and my data frame is named df, I have the following code:

df['NewColumn'] = np.where(df.ICD.str.contains('313.2|414.2').astype(int), 1, np.where(((df.ICD.str.contains('313.2|414.2').astype(int))&(df['Date1']>df['Date2'])), 2, 0))

However, this code only returns a column of 1s and 0s and never includes a 2. How else can I perform this task?

2 answers

First, pass the pattern to contains as a raw string (prepend r), like this:

In [115]:
df['NewColumn'] = np.where(df.ICD.str.contains(r'313.2|414.2').astype(int), 1, np.where(((df.ICD.str.contains(r'313.2|414.2').astype(int))&(df['Date1']>df['Date2'])), 2, 0))
df

Out[115]:
   PatientID      Date1      Date2                ICD  NewColumn
0       1234 2010-12-14 2010-12-12  313.2,414.2,228.1          1
1       3213 2010-08-02 2012-09-05        232.1,221.0          0

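This returns 1 for the first row because the first condition already matches, so the nested check is never reached. To get 2, test the combined condition first: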

In [122]:
df['NewColumn'] = np.where( (df.ICD.str.contains(r'313.2|414.2').astype(int)) & ( df['Date1'] > df['Date2'] ), 2 , 
                           np.where( df.ICD.str.contains(r'313.2|414.2').astype(int), 1, 0 ) )
df

Out[122]:
   PatientID      Date1      Date2                ICD  NewColumn
0       1234 2010-12-14 2010-12-12  313.2,414.2,228.1          2
1       3213 2010-08-02 2012-09-05        232.1,221.0          0
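
As an alternative to nesting np.where calls, the same priority order can be written with np.select, which takes a list of conditions and returns the choice for the first one that matches. The following is a minimal sketch, not part of the original answer: it rebuilds the two sample rows from the question and assumes the dates are month/day/year. It also escapes the dots in the pattern, because an unescaped `.` in a regular expression matches any character, and the r prefix alone does not change that.

```python
import numpy as np
import pandas as pd

# Sample rows from the question; the m/d/y date format is an assumption.
df = pd.DataFrame({
    "PatientID": [1234, 3213],
    "Date1": pd.to_datetime(["12/14/10", "8/2/10"], format="%m/%d/%y"),
    "Date2": pd.to_datetime(["12/12/10", "9/5/12"], format="%m/%d/%y"),
    "ICD": ["313.2, 414.2, 228.1", "232.1, 221.0"],
})

# Escape the dots so that "." matches a literal dot only.
has_code = df["ICD"].str.contains(r"313\.2|414\.2")
date1_later = df["Date1"] > df["Date2"]

# Conditions are tried in order, so the stricter one goes first.
df["NewColumn"] = np.select(
    [has_code & date1_later, has_code],
    [2, 1],
    default=0,
)
print(df)
```

Listing the stricter condition first plays the same role as swapping the nested np.where calls above.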

You can do this in pandas itself. numpy works on arrays, whereas pandas works on labeled columns and is built on top of numpy.

Note that if you want to match exactly 313.2 (so that, for example, 2313.25 gives False), compare against the whole string:

df['ICD'].astype(str) == '313.2'

This returns a Series of True and False values.
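
To make the difference concrete, here is a small sketch comparing substring matching with exact matching. It is an illustration under assumptions rather than the answerer's code: the Series `icd` and the set `targets` are made-up names, and the value 2313.25 is the counterexample mentioned above. Because the ICD cells in the question hold several comma-separated codes, comparing a whole cell to '313.2' would not match the first row, so the sketch splits each cell into individual codes first.

```python
import pandas as pd

# Hypothetical sample: a multi-code cell, the 2313.25 counterexample, and a non-match.
icd = pd.Series(["313.2, 414.2, 228.1", "2313.25", "232.1, 221.0"])
targets = {"313.2", "414.2"}

# Substring search: "2313.25" contains "313.2", so it is reported as a hit.
substring_hit = icd.str.contains(r"313.2|414.2")

# Exact match per code: split on commas, strip spaces, compare whole codes.
exact_hit = icd.apply(
    lambda cell: any(code.strip() in targets for code in cell.split(","))
)

print(substring_hit.tolist())  # [True, True, False]
print(exact_hit.tolist())      # [True, False, False]
```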

def classify(df):
    icd = df['ICD'].astype(str)
    boolean = (icd == '313.2') | (icd == '414.2')
    boolean2 = boolean & (df['Date1'] > df['Date2'])
    # test the stricter condition first so that 2 stays reachable
    if boolean2.any():
        return 2
    if boolean.any():
        # do something
        return 1

and so on.

Pandas also has isin(), which can be used for this kind of check.

Documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html
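
A possible way to use isin() here is sketched below. It is an assumption about how the suggestion could be applied, not code from the answer: isin() compares whole values, so the comma-separated cells are first split and exploded into one code per row, then the results are grouped back by the original row index. The names `icd`, `targets` and `hit` are illustrative.

```python
import pandas as pd

# Hypothetical sample in the same shape as the ICD column from the question.
icd = pd.Series(["313.2, 414.2, 228.1", "2313.25", "232.1, 221.0"])
targets = ["313.2", "414.2"]

codes = icd.str.split(",").explode().str.strip()  # one code per row, original index kept
hit = codes.isin(targets).groupby(level=0).any()  # True if any code of the row matches

print(hit.tolist())  # [True, False, False]
```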

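Also, there is a problem with the order of your conditions. Whenever condition 2 is true, condition 1 is true as well. Since condition 1 is checked first, the result is always 1.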

So check condition 2 first, and only if it fails check condition 1. Otherwise 2 is never returned.
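
The ordering point can also be expressed as a plain if/return function applied to each row. This is only a sketch under assumptions: `classify_row` and `TARGETS` are hypothetical names, the m/d/y date format is assumed, and the third row (PatientID 9999) is invented to show the case that returns 1. A row-wise apply is usually slower than the vectorized np.where approach, but it keeps the order of the checks easy to read.

```python
import pandas as pd

TARGETS = {"313.2", "414.2"}  # codes from the question

def classify_row(row):
    """Return 2, 1 or 0 for one row, checking the stricter case before the looser one."""
    codes = {code.strip() for code in str(row["ICD"]).split(",")}
    if not codes & TARGETS:
        return 0
    if row["Date1"] > row["Date2"]:
        return 2
    return 1

df = pd.DataFrame({
    "PatientID": [1234, 3213, 9999],  # 9999 is a made-up row
    "Date1": pd.to_datetime(["12/14/10", "8/2/10", "1/1/10"], format="%m/%d/%y"),
    "Date2": pd.to_datetime(["12/12/10", "9/5/12", "2/1/10"], format="%m/%d/%y"),
    "ICD": ["313.2, 414.2, 228.1", "232.1, 221.0", "414.2"],
})

df["NewColumn"] = df.apply(classify_row, axis=1)
print(df["NewColumn"].tolist())  # [2, 0, 1]
```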


Source: https://habr.com/ru/post/1622932/

