Pandas merge two columns with null values

I have a df with two columns and I want to combine both columns ignoring the NaN values. The trick is that sometimes both columns have NaN values, in which case I want the new column to also have NaN. Here is an example:

 df = pd.DataFrame({'foodstuff':['apple-martini', 'apple-pie', None, None, None], 'type':[None, None, 'strawberry-tart', 'dessert', None]})
 df
 Out[10]:
        foodstuff             type
 0  apple-martini             None
 1      apple-pie             None
 2           None  strawberry-tart
 3           None          dessert
 4           None             None

I tried to solve this with fillna:

 df['foodstuff'].fillna('') + df['type'].fillna('') 

and I got:

 0      apple-martini
 1          apple-pie
 2    strawberry-tart
 3            dessert
 4
 dtype: object

Row 4 has become an empty string. What I want in this situation is a NaN value, since both of the columns being combined are NaN:

 0      apple-martini
 1          apple-pie
 2    strawberry-tart
 3            dessert
 4               None
 dtype: object

4 answers

Use fillna on one column, with the other column supplying the fill values:

 df['foodstuff'].fillna(df['type']) 

Result:

 0      apple-martini
 1          apple-pie
 2    strawberry-tart
 3            dessert
 4               None
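For anyone who wants to try this, here is a minimal, self-contained sketch of the fillna approach. fillna accepts a Series and aligns it on the index, so each missing value in foodstuff is filled from the same row of type, and rows where both are missing stay missing. The names merged and alt are only illustrative, and combine_first is shown as an equivalent spelling, not something the answer above used.

```python
import pandas as pd

df = pd.DataFrame({
    'foodstuff': ['apple-martini', 'apple-pie', None, None, None],
    'type': [None, None, 'strawberry-tart', 'dessert', None],
})

# Fill the gaps in 'foodstuff' with the value from the same row of 'type'.
# Row 4 is missing in both columns, so it stays missing.
merged = df['foodstuff'].fillna(df['type'])

# combine_first expresses the same idea: take 'foodstuff' where present,
# otherwise fall back to 'type'.
alt = df['foodstuff'].combine_first(df['type'])

print(merged)
```

Note that if both columns held a value in the same row, this would keep the foodstuff value and not concatenate the two, which is fine for the data in the question but may not be what you want in general.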

You can use combine with a lambda:

 df['foodstuff'].combine(df['type'], lambda a, b: ((a or "") + (b or "")) or None, None) 

(a or "") returns "" if a is None. The same logic is then applied to the concatenation: if the concatenated result is an empty string, the outer "or None" turns it into None.
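One caveat with the "or" trick: it relies on the missing values being None. A float NaN is truthy in Python, so (a or "") would return NaN unchanged and the string concatenation would then fail. The sketch below is one way to handle both cases, using pd.isna. The helper name join_or_none and the NaN placed in the last row are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

def join_or_none(a, b):
    # Hypothetical helper: treat None and NaN alike as "missing".
    a = "" if pd.isna(a) else a
    b = "" if pd.isna(b) else b
    # An empty concatenation means both inputs were missing.
    return (a + b) or None

foodstuff = pd.Series(
    ['apple-martini', 'apple-pie', None, None, None], dtype=object
)
# The last value is deliberately a float NaN instead of None.
kind = pd.Series(
    [None, None, 'strawberry-tart', 'dessert', float('nan')], dtype=object
)

# combine calls the function element by element on the two Series.
combined = foodstuff.combine(kind, join_or_none)
print(combined)
```

Unlike fillna, this variant concatenates the two strings when both columns have a value in the same row.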


You can always replace the empty strings in the new column with NaN afterwards.

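 import numpy as np
 # Assign the result back; calling replace(..., inplace=True) on df['new_col'] may not update df in recent pandas versions
 df['new_col'] = df['new_col'].replace(r'^\s*$', np.nan, regex=True)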

Full code:

 import pandas as pd
 import numpy as np

 df = pd.DataFrame({'foodstuff':['apple-martini', 'apple-pie', None, None, None], 'type':[None, None, 'strawberry-tart', 'dessert', None]})
 # Concatenate the two columns, treating missing values as empty strings
 df['new_col'] = df['foodstuff'].fillna('') + df['type'].fillna('')
 # Turn empty or whitespace-only strings back into NaN
 df['new_col'] = df['new_col'].replace(r'^\s*$', np.nan, regex=True)
 df

Output:

        foodstuff             type          new_col
 0  apple-martini             None    apple-martini
 1      apple-pie             None        apple-pie
 2           None  strawberry-tart  strawberry-tart
 3           None          dessert          dessert
 4           None             None              NaN

  • fillna('') on both columns together
  • sum(1) to concatenate them row by row
  • replace('', np.nan) to turn the empty strings back into NaN

 df.fillna('').sum(1).replace('', np.nan)

 0      apple-martini
 1          apple-pie
 2    strawberry-tart
 3            dessert
 4                NaN
 dtype: object
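The three steps can also be written out one at a time, which makes the intermediate results easier to inspect. This sketch swaps sum(1) for agg(''.join, axis=1), an explicit row-wise string join; that substitution is my own choice, not part of the answer above, and the variable names filled, joined and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'foodstuff': ['apple-martini', 'apple-pie', None, None, None],
    'type': [None, None, 'strawberry-tart', 'dessert', None],
})

# Step 1: replace every missing value with an empty string.
filled = df.fillna('')

# Step 2: join the strings of each row (used here in place of sum(1)).
joined = filled.agg(''.join, axis=1)

# Step 3: rows that were missing in both columns are now '', so map them to NaN.
result = joined.replace('', np.nan)

print(result)
```

Because the join runs over all columns of the frame, this generalizes to more than two columns without further changes.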

Source: https://habr.com/ru/post/1013732/

