Manipulate specific columns (sample features) conditional on another column's entries (feature value) using a pandas / numpy dataframe

My input dataframe (abbreviated) is as follows:

>>> import numpy as np
>>> import pandas as pd

>>> df_in = pd.DataFrame([[1, 2, 'a', 3, 4], [6, 7, 'b', 8, 9]],
...                     columns=(['c1', 'c2', 'col', 'c3', 'c4']))
>>> df_in
   c1  c2 col  c3  c4
0   1   2   a   3   4
1   6   7   b   8   9

It should be manipulated as follows:

if a row (sample) has a specific value in column "col" (feature), for example "b" here, then the entries in columns "c1" and "c2" of that row should be replaced with np.nan.

Required Result:

>>> df_out = pd.DataFrame([[1, 2, 'a', 3, 4], [np.nan, np.nan, 'b', 8, 9]],
...                       columns=(['c1', 'c2', 'col', 'c3', 'c4']))
>>> df_out
    c1   c2 col  c3  c4
0  1.0  2.0   a   3   4
1  NaN  NaN   b   8   9

So far I have managed to get the desired result with the following code:

>>> dic = {'col' : ['c1', 'c2']}          # auxiliary

>>> b_w = df_in[df_in['col'] == 'b']      # Subset with 'b' in 'col'
>>> b_w = b_w.drop(dic['col'], axis=1)    # Drop 'c1', 'c2'; concat refills them with NaN

>>> b_wo = df_in[df_in['col'] != 'b']     # Subset without 'b' in 'col'

>>> df_out = pd.concat([b_w, b_wo])       # Both Subsets together again
>>> df_out
    c1   c2  c3  c4 col
1  NaN  NaN   8   9   b
0  1.0  2.0   3   4   a

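However, the original structure is lost (for example, int values become float, and the order changes). There is probably a more direct way to do this in pandas or numpy.

Any suggestions? Thanks. :)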

2 answers

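You can use loc with a boolean row condition and a list of the columns to modify, then assign the new value in place: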

df_in.loc[df_in.col == 'b', ['c1', 'c2']] = np.nan

df_in
#     c1   c2 col  c3  c4
# 0  1.0  2.0   a   3   4
# 1  NaN  NaN   b   8   9
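If keeping c1 and c2 as integers matters, one option is pandas' nullable integer dtype. The following is only a sketch under assumptions: it needs pandas 1.0 or newer, the names target_cols and is_b are hypothetical, and the missing entries end up as <NA> rather than np.nan, which may or may not be acceptable downstream. It also works on a copy, so df_in stays unchanged.

```python
import pandas as pd

df_in = pd.DataFrame([[1, 2, 'a', 3, 4], [6, 7, 'b', 8, 9]],
                     columns=['c1', 'c2', 'col', 'c3', 'c4'])

target_cols = ['c1', 'c2']        # hypothetical helper name
is_b = df_in['col'] == 'b'        # boolean mask over the rows

df_out = df_in.copy()             # leave df_in untouched
for c in target_cols:
    # 'Int64' (capital I) is the nullable integer dtype; mask() blanks
    # the rows where is_b is True with <NA> instead of np.nan.
    df_out[c] = df_out[c].astype('Int64').mask(is_b)

print(df_out)
print(df_out.dtypes)
```

The row and column order stay as in df_in, since nothing is split and concatenated.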

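For a pure pandas solution, see the answer by @Psidom.

An alternative is pandas → numpy → pandas, i.e. dataframe → numpy.array → dataframe (a run time difference of about 10%). The assignment itself is done with numpy indexing.

Code: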

cols, df_out = df_in.columns, df_in.values   # object array, since the dtypes are mixed
for i in [0, 1]:                             # positions of 'c1' and 'c2'
    df_out[df_out[:, 2] == 'b', i] = np.nan  # column 2 is 'col'
df_out = pd.DataFrame(df_out, columns=cols)
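Note that a dataframe rebuilt from an object array has object dtype in every column. A possible variant of the same round trip, sketched here under assumptions, looks the column positions up by name instead of hard-coding 0, 1 and 2, replaces the loop with a single indexed assignment, and asks pandas to re-infer the dtypes at the end. The names arr, is_b and target are hypothetical.

```python
import numpy as np
import pandas as pd

df_in = pd.DataFrame([[1, 2, 'a', 3, 4], [6, 7, 'b', 8, 9]],
                     columns=['c1', 'c2', 'col', 'c3', 'c4'])

cols = df_in.columns
arr = df_in.to_numpy(dtype=object, copy=True)      # explicit copy, object dtype

is_b = arr[:, cols.get_loc('col')] == 'b'          # boolean mask over the rows
target = [cols.get_loc(c) for c in ['c1', 'c2']]   # column positions by name

# np.ix_ builds an open mesh, so every (masked row, target column) pair is set
arr[np.ix_(is_b, target)] = np.nan

# infer_objects() lets pandas pick better dtypes than object where it can
df_out = pd.DataFrame(arr, columns=cols).infer_objects()
print(df_out.dtypes)
```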

Source: https://habr.com/ru/post/1651121/
