Updating a column in a Pandas DataFrame based on the state of another column

I want to add a text tag to a new column in a Pandas DataFrame. The following example works, but I get a copy warning, and I don't quite understand whether I should ignore it in this case.

The DataFrame simply holds either a character or an empty string in each cell:

In [1]: import pandas as pd

In [2]: df=pd.DataFrame({('A'):['x','','x',''], ('B'):['x','x','','']})

In [3]: df
Out[3]:
   A  B
0  x  x
1     x
2  x
3

Create a new column named msg

In [4]: df['msg'] = ''

In [5]: df
Out[5]:
   A  B msg
0  x  x
1     x
2  x
3

Set the "msg" column to "red;" where "A" is not an empty string

In [6]: df['msg'][df['A'] != ''] = 'red;'

In [7]: df
Out[7]:
   A  B  msg
0  x  x  red;
1     x
2  x     red;
3

Append "blue;" depending on the values of column "B"

In [8]: df['msg'][df['B'] != ''] += 'blue;'

In [9]: df
Out[9]:
   A  B       msg
0  x  x  red;blue;
1     x     blue;
2  x         red;
3

As an alternative, I found that using numpy.where led to the desired result. What is the right way to do this in Pandas?

import numpy as np

df['msg'] += np.where(df['A'] != '','green;', '')
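
For comparison, here is a minimal sketch of the same steps with a single .loc call per assignment, which is the form the pandas warning message itself suggests in place of chained indexing such as df['msg'][mask] = value. It rebuilds the example frame so that it runs on its own; nothing here goes beyond the example above.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', '', 'x', ''], 'B': ['x', 'x', '', '']})
df['msg'] = ''

# One indexing call: rows are picked by the boolean mask, the column by label.
df.loc[df['A'] != '', 'msg'] = 'red;'

# += on a .loc selection reads the selected rows, appends, and writes back.
df.loc[df['B'] != '', 'msg'] += 'blue;'

print(df['msg'].tolist())  # ['red;blue;', 'blue;', 'red;', '']
```

Because the row mask and the column label go through one .loc call, pandas assigns into df itself rather than into a possible temporary copy.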

Update: 4/15/2018

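On reflection, it would be useful to pull values from the DataFrame itself (not only a fixed tag). Adapting @COLDSPEED's answer (v is a tag frame like the one defined in that answer):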

df['msg'] = (v.where(df.applymap(len) > 0, '') + 
             df.where(df[['B']].applymap(len)>0,'')).agg(''.join, axis=1)


   A  B         msg
0  x  x  red;blue:x
1     x      blue:x
2  x           red;
3
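
A simpler sketch of the same idea, assuming the goal is what the output above suggests: a fixed "red;" tag for column A, and "blue:" followed by the cell value for column B. The names a_tag and b_tag are illustrative only and do not come from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['x', '', 'x', ''], 'B': ['x', 'x', '', '']})

# Fixed tag for A: 'red;' where A is non-empty, '' elsewhere.
a_tag = pd.Series(np.where(df['A'] != '', 'red;', ''), index=df.index)

# Tag plus cell value for B: 'blue:' + value where B is non-empty, '' elsewhere.
b_tag = ('blue:' + df['B']).where(df['B'] != '', '')

df['msg'] = a_tag + b_tag
print(df['msg'].tolist())  # ['red;blue:x', 'blue:x', 'red;', '']
```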

You can do this with DataFrame.where and str.join.

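# Frame of tags with the same shape and labels as df
v = pd.DataFrame(
    np.repeat([['red;', 'blue;']], len(df), axis=0),
    columns=df.columns,
    index=df.index
)
# Keep a tag only where the matching cell is non-empty, then join each row
df['msg'] = v.where(df.applymap(len) > 0, '').agg(''.join, axis=1)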

df
   A  B        msg
0  x  x  red;blue;
1     x      blue;
2  x          red;
3              
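
The same where/join idea can be sketched without applymap, which is deprecated in recent pandas releases (2.1 and later) in favour of DataFrame.map. This variant assumes a hypothetical tags dictionary mapping column names to tags (not part of the answer) and uses df.ne('') as the mask, so it extends to more columns by adding dictionary entries.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', '', 'x', ''], 'B': ['x', 'x', '', '']})

# Hypothetical mapping from column name to tag; extend it for more columns.
tags = {'A': 'red;', 'B': 'blue;'}
cols = list(tags)

# Frame of tags aligned with df: every row of column c holds tags[c].
v = pd.DataFrame({c: tags[c] for c in cols}, index=df.index)

# df.ne('') is True where a cell is non-empty; other positions become ''.
df['msg'] = v.where(df[cols].ne(''), '').agg(''.join, axis=1)
print(df['msg'].tolist())  # ['red;blue;', 'blue;', 'red;', '']
```

Restricting the mask to df[cols] keeps the msg column itself out of the comparison if the snippet is re-run on a frame that already has it.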

pandas.DataFrame.dot
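This works with an array of dtype object. dot multiplies each boolean by the corresponding string and sums the results, which concatenates them.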

a = np.array(['red;', 'blue;'], object)

df.assign(msg=df.astype(bool).dot(a))

   A  B        msg
0  x  x  red;blue;
1     x      blue;
2  x          red;
3                 
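
To see why dot produces strings here, the sketch below spells out the arithmetic it relies on: in Python, True * 'red;' is 'red;' and False * 'red;' is '', and adding strings concatenates them. It uses df.ne('') for the mask instead of astype(bool) purely for illustration; the names tags, mask, kept and dropped are not from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['x', '', 'x', ''], 'B': ['x', 'x', '', '']})
tags = np.array(['red;', 'blue;'], dtype=object)

# String repetition with booleans: True acts as 1, False as 0.
kept = True * 'red;'      # 'red;'
dropped = False * 'red;'  # ''

mask = df.ne('')          # True where the cell is non-empty
# Per row: mask['A'] * 'red;' + mask['B'] * 'blue;'
msg = mask.dot(tags)
print(msg.tolist())  # ['red;blue;', 'blue;', 'red;', '']
```

The tags array must be of dtype object so that NumPy falls back to Python's own * and + on the elements.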

You can use dot and replace

(df!='').dot(df.columns).replace({'A':'red;','B':'blue;'},regex=True)
Out[379]: 
0    red;blue;
1        blue;
2         red;
3             
dtype: object

#df['msg']=(df!='').dot(df.columns).replace({'A':'red;','B':'blue;'},regex=True)

Source: https://habr.com/ru/post/1696189/

