Pandas: use if-else to populate a new column

I have a DataFrame like this:

 col1  col2
    1     0
    0     1
    0     0
    0     0
    3     3
    2     0
    0     4

I would like to add a column that is 1 if col2 is > 0 and 0 otherwise. If I were using R, I would do something like:

 df1[,'col3'] <- ifelse(df1$col2 > 0, 1, 0) 

How would this be done in Python / pandas?

2 answers

You can convert the boolean series df.col2 > 0 to an integer series (True becomes 1 and False becomes 0):

 df['col3'] = (df.col2 > 0).astype('int') 

(To create a new column, you just need to name it and assign it a Series, an array, or a list of the same length as your DataFrame.)

This gives col3 like:

    col2  col3
 0     0     0
 1     1     1
 2     0     0
 3     0     0
 4     3     1
 5     0     0
 6     4     1
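For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample values shown in the question; the variable name df is just the one used in this answer.

```python
import pandas as pd

# Sample data matching the DataFrame shown in the question.
df = pd.DataFrame({'col1': [1, 0, 0, 0, 3, 2, 0],
                   'col2': [0, 1, 0, 0, 3, 0, 4]})

# The comparison yields a boolean Series; astype(int) maps True/False to 1/0.
df['col3'] = (df['col2'] > 0).astype(int)

print(df['col3'].tolist())  # [0, 1, 0, 0, 1, 0, 1]
```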

Another way to create the column is to use np.where, which lets you specify the value for both the true case and the false case, and is arguably closer to the syntax of R's ifelse function. For instance:

 >>> np.where(df['col2'] > 0, 4, -1)
 array([-1,  4, -1, -1,  4, -1,  4])
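Applied to the original question, the same call with 1 and 0 as the two values can be assigned straight to the new column. This is a small sketch assuming the col2 values from the question:

```python
import numpy as np
import pandas as pd

# Only col2 is needed to illustrate the condition.
df = pd.DataFrame({'col2': [0, 1, 0, 0, 3, 0, 4]})

# np.where(condition, value_if_true, value_if_false) mirrors R's
# ifelse(test, yes, no); the resulting array has one entry per row.
df['col3'] = np.where(df['col2'] > 0, 1, 0)

print(df['col3'].tolist())  # [0, 1, 0, 0, 1, 0, 1]
```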

I assume you are using pandas (because of the 'df' notation). If so, you can build col3 as a boolean flag by using .gt (greater than) to compare col2 with zero. Multiplying the result by one converts the boolean flags to ones and zeros.

 import pandas as pd

 df1 = pd.DataFrame({'col1': [1, 0, 0, 0, 3, 2, 0], 'col2': [0, 1, 0, 0, 3, 0, 4]})
 df1['col3'] = df1.col2.gt(0) * 1

 >>> df1
    col1  col2  col3
 0     1     0     0
 1     0     1     1
 2     0     0     0
 3     0     0     0
 4     3     3     1
 5     2     0     0
 6     0     4     1

You can also use apply with a lambda expression to achieve the same result, but I find the method above simpler for your example.

 df1['col3'] = df1['col2'].apply(lambda x: 1 if x > 0 else 0) 
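As a quick sanity check, the sketch below computes the column with each approach mentioned on this page and confirms they agree on the sample data. The names via_astype, via_gt, via_where and via_apply are made up for this illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 0, 0, 0, 3, 2, 0],
                    'col2': [0, 1, 0, 0, 3, 0, 4]})

# Four ways to turn "col2 > 0" into a 1/0 column.
via_astype = (df1['col2'] > 0).astype(int)
via_gt = df1['col2'].gt(0) * 1
via_where = pd.Series(np.where(df1['col2'] > 0, 1, 0), index=df1.index)
via_apply = df1['col2'].apply(lambda x: 1 if x > 0 else 0)

# Compare every result against the first one.
results = [via_astype, via_gt, via_where, via_apply]
all_equal = all(r.tolist() == via_astype.tolist() for r in results)
print(all_equal)  # True
```

The first three operate on the whole column at once, while apply calls the Python function once per row, so it will generally be the slowest choice on large DataFrames.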

Source: https://habr.com/ru/post/984020/

