Row identity in an R data frame

I want to check whether two rows of a data frame are identical. I expected the identical() function to be suitable for this task, but it does not behave as I expected. Here is a minimal example:

    > x=factor(c("x","x"),levels=c("x","y"))
    > y=factor(c("y","y"),levels=c("x","y"))
    > df=data.frame(x,y)
    > df
      x y
    1 x y
    2 x y
    > identical(df[1,],df[2,])
    [1] FALSE
    > df[1,]==df[2,]
         x    y
    1 TRUE TRUE

Can someone explain to me why identical() returns FALSE?

Thank you,
Thomas

2 answers
    identical(df[1,],df[2,])
    #[1] FALSE
    all.equal(df[1,],df[2,])
    #[1] "Attributes: < Component 2: Mean relative difference: 1 >"
    all.equal(df[1,],df[2,],check.attributes = FALSE)
    #[1] TRUE
    anyDuplicated(df[1:2,])>0
    #[1] TRUE

Try this function:

    > all.equal(df[1,],df[2,])
    [1] "Attributes: < Component 2: Mean relative difference: 1 >"

(In general, comparing factors can give "unexpected" results ...) In this case, identical(), which tries to match everything including attributes, finds that the row.names differ. You can see this from dput():

    > dput(df[1,])
    structure(list(x = structure(1L, .Label = c("x", "y"), class = "factor"),
        y = structure(2L, .Label = c("x", "y"), class = "factor")),
        .Names = c("x", "y"), row.names = 1L, class = "data.frame")
    > dput(df[2,])
    structure(list(x = structure(1L, .Label = c("x", "y"), class = "factor"),
        y = structure(2L, .Label = c("x", "y"), class = "factor")),
        .Names = c("x", "y"), row.names = 2L, class = "data.frame")
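Since the two dput() results differ only in row.names (1L versus 2L), one way to make identical() usable here is to remove that difference before comparing. The following is a minimal sketch, not something proposed in the thread: the variable names r1, r2, same_after_reset and same_as_list are made up for illustration, and it assumes you only care about the column contents, not the row labels.

```r
# Rebuild the data frame from the question.
x <- factor(c("x", "x"), levels = c("x", "y"))
y <- factor(c("y", "y"), levels = c("x", "y"))
df <- data.frame(x, y)

# Extract the two rows; each one keeps its original row name (1 and 2).
r1 <- df[1, ]
r2 <- df[2, ]

# Reset the row names on both copies so that attribute no longer differs.
rownames(r1) <- NULL
rownames(r2) <- NULL
same_after_reset <- identical(r1, r2)

# Alternative: as.list() on a data frame drops the class and row.names,
# leaving only the named columns to compare.
same_as_list <- identical(as.list(df[1, ]), as.list(df[2, ]))

print(same_after_reset)
print(same_as_list)
```

Both comparisons should come out TRUE for this example, while identical(df[1, ], df[2, ]) on the untouched rows remains FALSE.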

In this example, a simple == comparison works:

    > df[1,]==df[2,]
         x    y
    1 TRUE TRUE
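The == comparison returns one logical value per column, so to get a single TRUE/FALSE answer it has to be collapsed, for example with all(). Below is a rough sketch of that idea. The helper rows_equal is hypothetical (it is not part of base R or of this thread), and it treats any comparison involving NA as "not equal", which may or may not be what you want.

```r
# Rebuild the data frame from the question.
x <- factor(c("x", "x"), levels = c("x", "y"))
y <- factor(c("y", "y"), levels = c("x", "y"))
df <- data.frame(x, y)

# Hypothetical helper: compare rows i and j of data frame d column by column.
# d[i, ] == d[j, ] yields a logical matrix with one entry per column;
# all() collapses it, and isTRUE() turns an NA result into FALSE.
rows_equal <- function(d, i, j) {
  isTRUE(all(d[i, ] == d[j, ]))
}

result <- rows_equal(df, 1, 2)
print(result)
```

Because both rows come from the same data frame, each factor column is compared against itself with the same level set, which is the situation where == on factors is well defined.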

Source: https://habr.com/ru/post/1485965/

