This seems like a simple problem, but I can't figure it out. I would like to remove duplicates from a data frame (df) when two columns hold the same values, even if those values are in reverse order. For example, take the following data frame:
    a <- c(rep("A", 3), rep("B", 3), rep("C", 2))
    b <- c('A', 'B', 'B', 'C', 'A', 'A', 'B', 'B')
    df <- data.frame(a, b)

      a b
    1 A A
    2 A B
    3 A B
    4 B C
    5 B A
    6 B A
    7 C B
    8 C B
If I look up the duplicates now, I get the following rows:
    df[duplicated(df), ]

      a b
    3 A B
    6 B A
    8 C B
However, I would also like to delete row 6 in this data frame, since "A", "B" is the same as "B", "A". How can I do this automatically?
Ideally, I could indicate which two columns should be compared, since data frames can have different columns and can be quite large.
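To make the requirement concrete, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes the values in both columns can be compared as strings: each pair is put into a canonical order with pmin() and pmax(), so that ("B", "A") and ("A", "B") produce the same key, and duplicated() is then applied to those keys. The helper name drop_unordered_dups and its arguments are hypothetical.

```r
# Hypothetical helper: drop rows whose values in two chosen columns
# repeat an earlier row, ignoring the order of the two values.
drop_unordered_dups <- function(df, col1, col2) {
  x <- as.character(df[[col1]])
  y <- as.character(df[[col2]])
  # pmin/pmax put each pair into a canonical (smaller, larger) order
  key <- data.frame(lo = pmin(x, y), hi = pmax(x, y))
  # keep only the first occurrence of each unordered pair
  df[!duplicated(key), , drop = FALSE]
}

a <- c(rep("A", 3), rep("B", 3), rep("C", 2))
b <- c("A", "B", "B", "C", "A", "A", "B", "B")
df <- data.frame(a, b)

result <- drop_unordered_dups(df, "a", "b")
print(result)
```

On the example data this keeps rows 1, 2 and 4, because "C", "B" is also treated as the same pair as "B", "C". Since the two column names are passed as arguments, other columns in the data frame are left alone and play no part in the comparison.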
Thanks!