We have two data frames:
The expected data frame:
+------+---------+--------+----------+-------+--------+
|emp_id| emp_city|emp_name| emp_phone|emp_sal|emp_site|
+------+---------+--------+----------+-------+--------+
|     3|  Chennai|  rahman|9848022330|  45000|SanRamon|
|     1|Hyderabad|     ram|9848022338|  50000|      SF|
|     2|Hyderabad|   robin|9848022339|  40000|      LA|
|     4|  sanjose|   romin|9848022331|  45123|SanRamon|
+------+---------+--------+----------+-------+--------+
and the actual data frame:
+------+---------+--------+----------+-------+--------+
|emp_id| emp_city|emp_name| emp_phone|emp_sal|emp_site|
+------+---------+--------+----------+-------+--------+
|     3|  Chennai|  rahman|9848022330|  45000|SanRamon|
|     1|Hyderabad|     ram|9848022338|  50000|      SF|
|     2|Hyderabad|   robin|9848022339|  40000|      LA|
|     4|  sanjose|  romino|9848022331|  45123|SanRamon|
+------+---------+--------+----------+-------+--------+
The difference between the two data frames is:
+------+--------+--------+----------+-------+--------+
|emp_id|emp_city|emp_name| emp_phone|emp_sal|emp_site|
+------+--------+--------+----------+-------+--------+
|     4| sanjose|  romino|9848022331|  45123|SanRamon|
+------+--------+--------+----------+-------+--------+
We use the except function, df1.except(df2). The problem is that it returns each differing row in full. We want to see which columns differ within that row (in this case, the emp_name values “romin” and “romino” differ). We have had great difficulty with this, and any help would be appreciated.
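To make the desired result concrete, here is a rough sketch of the kind of column-level comparison we have in mind. It is only an illustration under assumptions: that emp_id uniquely identifies a row in both data frames, that both frames share the same columns, and that the column types shown (Int and String) are close to the real ones. The names ColumnDiffSketch and columnDiffs are made up for this example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

object ColumnDiffSketch {

  // Hypothetical helper: returns one row per (key, column) pair whose value differs.
  // Assumes `key` is unique in both frames and at least one non-key column exists.
  def columnDiffs(expected: DataFrame, actual: DataFrame, key: String): DataFrame = {
    val valueCols = expected.columns.filter(_ != key)
    val joined = expected.alias("e")
      .join(actual.alias("a"), col(s"e.$key") === col(s"a.$key"))

    valueCols.map { c =>
      joined
        // <=> is null-safe equality, so two nulls count as equal
        .filter(!(col(s"e.$c") <=> col(s"a.$c")))
        .select(
          col(s"e.$key").as(key),
          lit(c).as("column"),
          // cast to string so every per-column result has the same schema
          col(s"e.$c").cast("string").as("expected"),
          col(s"a.$c").cast("string").as("actual"))
    }.reduce(_ union _)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("column-diff-sketch")
      .getOrCreate()
    import spark.implicits._

    val names = Seq("emp_id", "emp_city", "emp_name", "emp_phone", "emp_sal", "emp_site")

    val expected = Seq(
      (3, "Chennai", "rahman", "9848022330", 45000, "SanRamon"),
      (1, "Hyderabad", "ram", "9848022338", 50000, "SF"),
      (2, "Hyderabad", "robin", "9848022339", 40000, "LA"),
      (4, "sanjose", "romin", "9848022331", 45123, "SanRamon")
    ).toDF(names: _*)

    val actual = Seq(
      (3, "Chennai", "rahman", "9848022330", 45000, "SanRamon"),
      (1, "Hyderabad", "ram", "9848022338", 50000, "SF"),
      (2, "Hyderabad", "robin", "9848022339", 40000, "LA"),
      (4, "sanjose", "romino", "9848022331", 45123, "SanRamon")
    ).toDF(names: _*)

    columnDiffs(expected, actual, "emp_id").show()
    spark.stop()
  }
}
```

For the sample data above, this should report a single difference: emp_id 4, column emp_name, expected romin, actual romino. Rows whose key exists in only one of the frames are dropped by the inner join, so they would still need something like except to be detected.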