I want to find all elements that are in df but not in df1, and also all elements that are in df1 but not in df.
df = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9])
df1 = sc.parallelize([4, 5, 6, 7, 8, 9, 10])
df2 = df.subtract(df1)
df2.show()
df3 = df1.subtract(df)
df3.show()
I just want to check the result to see whether I understand this function correctly, but I got an error saying that the PipelinedRDD object does not have the 'show' attribute. Any suggestions?
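For reference, here is a minimal sketch of how the two results might be inspected instead. It assumes `sc` is an already running SparkContext (as in the pyspark shell), and the helper name `rdd_differences` is made up for illustration. `sc.parallelize` returns an RDD rather than a DataFrame, and `show()` is a DataFrame method, so the sketch uses `collect()` to bring the elements back to the driver.

```python
def rdd_differences(sc):
    """Return both one-way differences as sorted Python lists.

    `sc` is assumed to be an existing SparkContext, for example the one
    that the pyspark shell creates. The function name is hypothetical.
    """
    rdd = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9])
    rdd1 = sc.parallelize([4, 5, 6, 7, 8, 9, 10])

    # subtract() returns a new RDD holding the elements of the left RDD
    # that do not appear in the right one. RDDs have no show() method,
    # so collect() is used to fetch the elements as a plain list.
    only_in_rdd = rdd.subtract(rdd1).collect()
    only_in_rdd1 = rdd1.subtract(rdd).collect()

    # The order of elements after subtract() is not guaranteed,
    # so sort before displaying or comparing.
    return sorted(only_in_rdd), sorted(only_in_rdd1)
```

Calling `print(rdd_differences(sc))` in the shell should give `[1, 2, 3]` for the first difference and `[10]` for the second. For large RDDs, `take(n)` is a safer way to peek at a few elements than `collect()`, which pulls everything to the driver.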