Print the value of a column in a data frame that is not contained in another data frame

I have two data frames:

import pandas as pd

df1 = pd.DataFrame({'System':['b0001','b0002']})
df2 = pd.DataFrame({'System':['b0001']})

I want to print the values in the System column of df1 that are NOT contained in the System column of df2. The output should be only:

b0002

My current code is:

for i in df1.index:
    if df1.System[i] not in df2.System:
        print (df1.System[i])

But the output is:

b0001
b0002

I can't understand why it still prints b0001. I tried isin, and the result is the same.

Any help would be appreciated.

4 answers

The pandas way to do this is to use isin, like this:

df1[~df1.System.isin(df2.System)]

Output:

  System
1  b0002

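As for your loop, it is missing .values. The in operator applied to a Series checks the index labels (only 0 here), not the values, so 'b0001' is never found in df2.System: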

for i in df1.index:
    if df1.System[i] not in df2.System.values:
        print (df1.System[i])

Output:

b0002
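
To make the difference concrete, here is a minimal sketch that uses the two frames from the question. The variable name missing is only for illustration. It compares membership tests against the Series itself and against its .values, then builds the same result without a loop:

```python
import pandas as pd

df1 = pd.DataFrame({'System': ['b0001', 'b0002']})
df2 = pd.DataFrame({'System': ['b0001']})

# `in` on a Series tests the index labels (here just 0), not the stored values.
print('b0001' in df2.System)         # False: 'b0001' is not an index label
print(0 in df2.System)               # True: 0 is an index label
print('b0001' in df2.System.values)  # True: the underlying array contains it

# Vectorised equivalent of the corrected loop (illustrative variable name).
missing = df1.loc[~df1.System.isin(df2.System), 'System'].tolist()
print(missing)
```

The first check returning False is why the original loop printed every value of df1.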

With numpy:

np.setdiff1d(df1.System.values, df2.System.values)

array(['b0002'], dtype=object)
# This solution only prints unique elements in df1 which are not in df2

np.setdiff1d(df1,df2)
Out[236]: array(['b0002'], dtype=object)
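
A small sketch of what "unique" means in practice. The frames left and right below are made-up data with a repeated value, not the frames from the question. np.setdiff1d returns the sorted unique values, while isin keeps every matching row in its original order:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a repeated value, chosen only to illustrate the difference.
left = pd.DataFrame({'System': ['b0003', 'b0002', 'b0001', 'b0002']})
right = pd.DataFrame({'System': ['b0001']})

# setdiff1d de-duplicates and sorts its result.
unique_sorted = np.setdiff1d(left.System.values, right.System.values)

# isin filters rows, so duplicates and the original order are preserved.
kept_rows = left.loc[~left.System.isin(right.System), 'System'].tolist()

print(unique_sorted.tolist())  # ['b0002', 'b0003']
print(kept_rows)               # ['b0003', 'b0002', 'b0002']
```

Which one fits depends on whether repeated values and row order matter for your case.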

set(df1.System).difference(set(df2.System))
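
A runnable sketch of the set-based approach, using the frames from the question. The name only_in_df1 is illustrative. Note that column access is case-sensitive, so the column has to be spelled System, exactly as it was created:

```python
import pandas as pd

df1 = pd.DataFrame({'System': ['b0001', 'b0002']})
df2 = pd.DataFrame({'System': ['b0001']})

# The column is named 'System'; df1.system would raise AttributeError.
only_in_df1 = set(df1['System']) - set(df2['System'])

# Sets are unordered, so sort before printing to get a stable order.
for value in sorted(only_in_df1):
    print(value)
```

Like the numpy answer, this drops duplicates, and it does not preserve the row order of df1.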

Source: https://habr.com/ru/post/1676872/

