Pandas Dataframe: how to update multiple columns by applying a function?

I have a Dataframe df like this:

   A   B   C    D
2  1   O   s    h
4  2   P    
7  3   Q
9  4   R   h    m

I have a function f that computes C and D for a row from its value in B:

def f(p): #p is the value of column B for a row. 
     return p+'k', p+'n'

How can I fill in the missing values for rows 4 and 7 by applying the function f to the DataFrame?

The expected result is as follows:

   A   B   C    D
2  1   O   s    h
4  2   P   Pk   Pn
7  3   Q   Qk   Qn
9  4   R   h    m

The function f must be used, since the actual function is very complex. In addition, it should only be applied to rows that have no values in C and D.

4 answers

There may be a more elegant way, but I would do this:

df['C'] = df['B'].apply(lambda x: f(x)[0])
df['D'] = df['B'].apply(lambda x: f(x)[1])

This applies f to column B and takes the first and second elements of the returned tuple. Note that it overwrites C and D in every row, including rows that already had values. It returns:

   A  B   C   D
0  1  O  Ok  On
1  2  P  Pk  Pn
2  3  Q  Qk  Qn
3  4  R  Rk  Rn

EDIT:

Or, more concisely, assigning both columns at once:

df[['C','D']] = df['B'].apply(lambda x: pd.Series(f(x)))
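
Both statements above call f for every row and overwrite existing values. Since the question asks for f to run only on rows where C and D are missing, here is a minimal sketch of one way to restrict it with a boolean mask. It assumes the gaps are stored as NaN (the question does not say whether they are NaN or empty strings); the sample frame and the names mask, b and filled are made up for illustration.

```python
import numpy as np
import pandas as pd

def f(p):  # the function from the question
    return p + 'k', p + 'n'

# Sample data rebuilt from the question; gaps are assumed to be NaN
df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': list('OPQR'),
     'C': ['s', np.nan, np.nan, 'h'],
     'D': ['h', np.nan, np.nan, 'm']},
    index=[2, 4, 7, 9])

# Rows where both C and D are missing
mask = df['C'].isna() & df['D'].isna()

# Call f once per selected row and expand the tuples into two columns
b = df.loc[mask, 'B']
filled = pd.DataFrame(b.map(f).tolist(), index=b.index, columns=['C', 'D'])

# Write the results back; assignment aligns on index and column labels
df.loc[mask, ['C', 'D']] = filled
```

With this approach f is never called for rows 2 and 9, which matters if the real function is expensive.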

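You can use DataFrame.update with overwrite=False, which only fills values that are missing (NaN) in df: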

df.update(df.B.apply(lambda x: pd.Series(dict(zip(['C','D'],f(x))))), overwrite=False)

In [350]: df
Out[350]:
   A  B   C   D
2  1  O   s   h
4  2  P  Pk  Pn
7  3  Q  Qk  Qn
9  4  R   h   m
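
update(..., overwrite=False) only fills NaN, so if the blanks in the question are actually empty strings they would be left untouched. The sketch below assumes that case and converts the empty strings to NaN first; the sample frame and the names cols and computed are illustrative, not part of the original answer.

```python
import pandas as pd

def f(p):  # the function from the question
    return p + 'k', p + 'n'

# Sample data; here the gaps are assumed to be empty strings
df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': list('OPQR'),
     'C': ['s', '', '', 'h'],
     'D': ['h', '', '', 'm']},
    index=[2, 4, 7, 9])

cols = ['C', 'D']

# Replace empty strings with NaN so that update() treats them as missing
df[cols] = df[cols].mask(df[cols] == '')

# Compute candidate values for every row (f is called once per row)
computed = pd.DataFrame(df['B'].map(f).tolist(), index=df.index, columns=cols)

# Fill only the missing cells; existing values are kept
df.update(computed, overwrite=False)
```

Note that this still evaluates f for every row and then discards the results that are not needed.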

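Another option: copy df, compute C and D for every row, then fill only the missing values of the copy (the result ends up in df1):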

df1 = df.copy()

df[['C','D']] = df.apply(lambda x: pd.Series([x['B'] + 'k', x['B'] + 'n']), axis=1)

df1.update(df, overwrite=False)

There is a simpler way to do this, provided the table is not too big:

def f(row): # row is one row of the DataFrame (a Series)
    if row['C']=='':
        row['C']=row['B']+'k'
    if row['D']=='':
        row['D']=row['B']+'n'
    return row
df=df.apply(f,axis=1)
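
The row-wise function above hard-codes the + 'k' and + 'n' logic and only checks for empty strings. A possible variant, sketched here under the assumption that a gap may be either NaN or an empty string, keeps the question's f untouched and wraps it in a hypothetical helper fill_row:

```python
import numpy as np
import pandas as pd

def f(p):  # the function from the question
    return p + 'k', p + 'n'

def is_blank(value):
    # Treat both NaN and the empty string as "missing"
    return pd.isna(value) or value == ''

def fill_row(row):
    # Hypothetical wrapper: call f only when both C and D are missing
    row = row.copy()  # avoid mutating the object that apply passes in
    if is_blank(row['C']) and is_blank(row['D']):
        row['C'], row['D'] = f(row['B'])
    return row

# Sample data rebuilt from the question, mixing NaN and '' on purpose
df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': list('OPQR'),
     'C': ['s', np.nan, '', 'h'],
     'D': ['h', np.nan, '', 'm']},
    index=[2, 4, 7, 9])

df = df.apply(fill_row, axis=1)
```

Row-wise apply is slow on large frames, which is why this is only reasonable for small tables.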

You can do it simply like this:

df.loc[df.C.isnull(), 'C'] = df.loc[df.C.isnull(), 'B'] + 'k'

df.loc[df.D.isnull(), 'D'] = df.loc[df.D.isnull(), 'B'] + 'n'

See the pandas documentation section indexing-view-versus-copy if you want to find out why I use loc.


Source: https://habr.com/ru/post/1607603/

