How to combine two data frames excluding the NaN value column?

Question

How to combine two data frames excluding the NaN value column?

Given df1:

   size_a  size_b
0       1       2
1       1       5
2       2       3
3       2       9
4       3       1
5       3       5
6       4       4

and df2:

   size_a  size_b
0       1       2
1       2     NaN
2       3     NaN

I want the result to be as follows:

   size_a  size_b
0       1       2
1       2       3
2       2       9
3       3       1
4       3       5

For the intersection, I want to consider only the non-NaN values of df2: wherever df2 has a NaN, that column should be ignored when matching rows.

+4

python pandas

javed Aug 7 '17 at 14:12


3 answers

You can do this with merge and concat:

First, merge:

part1 = pd.merge(df1, df2)

Next, handle the NaNs:

nans = df2[df2.size_b.isnull()]
part2 = pd.merge(df1, nans[["size_a"]], on="size_a")

Finally, concat:

pd.concat([part1, part2], ignore_index=True)

Output:

   size_a size_b
0       1      2
1       2      3
2       2      9
3       3      1
4       3      5
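The three steps above can be put together as a self-contained sketch. The frames are rebuilt from the question. Dropping the NaN rows of df2 before the first merge and casting size_b back to int at the end are additions made here for tidiness, not part of the answer itself.

```python
import numpy as np
import pandas as pd

# Frames reconstructed from the question.
df1 = pd.DataFrame({"size_a": [1, 1, 2, 2, 3, 3, 4],
                    "size_b": [2, 5, 3, 9, 1, 5, 4]})
df2 = pd.DataFrame({"size_a": [1, 2, 3],
                    "size_b": [2, np.nan, np.nan]})

# Rows of df2 with a known size_b must match df1 on both columns.
complete = df2.dropna(subset=["size_b"])
part1 = pd.merge(df1, complete, on=["size_a", "size_b"])

# Rows of df2 with NaN in size_b only need to match df1 on size_a.
nans = df2[df2["size_b"].isnull()]
part2 = pd.merge(df1, nans[["size_a"]], on="size_a")

# Stack both parts; size_b may have become float during the merge.
result = pd.concat([part1, part2], ignore_index=True)
result["size_b"] = result["size_b"].astype(int)
print(result)
```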

+3

Huang Aug 7 '17 at 14:49

Another option is to merge on size_a and then filter with query:

df_out = df1.merge(df2, on='size_a',suffixes=('','_y'))

df_out.query('size_b_y == size_b or size_b_y != size_b_y').drop('size_b_y',axis=1)

Output:

   size_a  size_b
0       1       2
2       2       3
3       2       9
4       3       1
5       3       5

Note: size_b_y != size_b_y is a way to test for NaN, because NaN is never equal to itself.
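A minimal sketch of why that comparison works: under IEEE 754 rules NaN compares unequal to everything, including itself, so x != x is true only for NaN entries. The series s and the frame df below are made-up illustration data, not taken from the question.

```python
import numpy as np
import pandas as pd

# Illustration data (hypothetical).
s = pd.Series([2.0, np.nan, 3.0])

# Comparing a value with itself is True only where the value is NaN.
is_nan_by_comparison = s != s
is_nan_by_method = s.isnull()

# The same trick inside DataFrame.query, where isnull() is less convenient.
df = pd.DataFrame({"x": [2.0, np.nan, 3.0]})
nan_rows = df.query("x != x")
print(is_nan_by_comparison.tolist(), list(nan_rows.index))
```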

+2

Scott Boston Aug 7 '17 at 15:07

Scratch'N'Purr · Accepted Answer · 2017-08-07T15:10:36+0000

One way is to first join on the column(s) that must match unconditionally. This reduces the number of conditional filters you have to build downstream. In the example above, size_a is such a column:

new_df = df1.merge(df2, how='inner', on='size_a')

Next, keep the rows where size_b matches in both frames or where the value from df2 is NaN:

new_df = new_df[(new_df['size_b_x'] == new_df['size_b_y']) | new_df['size_b_y'].isnull()]

Finally, drop the column that came from df2 (the one with the _y suffix):

new_df = new_df.drop('size_b_y', axis=1)
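Put together, the approach might look like the sketch below. The frames are rebuilt from the question. Renaming size_b_x back to size_b and resetting the index are extra steps assumed here so the result matches the layout asked for in the question; the answer itself stops at the drop.

```python
import numpy as np
import pandas as pd

# Frames reconstructed from the question.
df1 = pd.DataFrame({"size_a": [1, 1, 2, 2, 3, 3, 4],
                    "size_b": [2, 5, 3, 9, 1, 5, 4]})
df2 = pd.DataFrame({"size_a": [1, 2, 3],
                    "size_b": [2, np.nan, np.nan]})

# Join on the column that must always match; size_b gets _x/_y suffixes.
new_df = df1.merge(df2, how="inner", on="size_a")

# Keep rows where size_b agrees, or where df2 has no value to compare.
keep = (new_df["size_b_x"] == new_df["size_b_y"]) | new_df["size_b_y"].isnull()
new_df = new_df[keep]

# Drop df2's column, then restore the original column name and a clean index.
new_df = (new_df.drop(columns="size_b_y")
                .rename(columns={"size_b_x": "size_b"})
                .reset_index(drop=True))
print(new_df)
```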
