Comparing values in a row

I am trying to compare the values in each row of a data frame and delete every row whose values match, using this:

dat[!dat[1]==dat[2]] 

Where

 > dat 

returns

 n1 n2
 n1 n4
 n4 n5
 n1 n3
 n4 n4

So I want it to compare the values and delete the last row, since both columns hold the same value there. But when I run the code above, I get

 Error in Ops.factor(left, right) : level sets of factors are different 

str(dat) reads

 'data.frame': 5 obs. of 2 variables:
  $ V1: Factor w/ 2 levels "n1","n4": 1 1 2 1 2
  $ V2: Factor w/ 4 levels "n2","n3","n4",..: 1 3 4 2 3
2 answers

I suspect that when you created your data, you inadvertently and implicitly converted your columns to factors. This can happen when you read data from a source, for example with read.csv or read.table. The following example illustrates it:

 dat <- read.table(text="
 n1 n2
 n1 n4
 n4 n5
 n1 n3
 n4 n4")
 str(dat)
 'data.frame': 5 obs. of 2 variables:
  $ V1: Factor w/ 2 levels "n1","n4": 1 1 2 1 2
  $ V2: Factor w/ 4 levels "n2","n3","n4",..: 1 3 4 2 3
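To see where the error comes from, here is a minimal sketch that builds the two columns as factors explicitly, so it does not depend on the `stringsAsFactors` default (which changed to `FALSE` in R 4.0.0). The names `v1`, `v2` and `msg` are illustrative and do not come from the question.

```r
# Build the two columns explicitly as factors; their level sets differ.
v1 <- factor(c("n1", "n1", "n4", "n1", "n4"))
v2 <- factor(c("n2", "n4", "n5", "n3", "n4"))

levels(v1)  # the labels that occur in the first column
levels(v2)  # the labels that occur in the second column

# `==` on two factors requires matching level sets, so this comparison fails.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  {
    v1 == v2
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The comparison is rejected because the two factors were built from different sets of labels, not because any individual pair of values is problematic.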

The remedy is to pass the stringsAsFactors=FALSE argument to read.table():

 dat <- read.table(text="
 n1 n2
 n1 n4
 n4 n5
 n1 n3
 n4 n4", stringsAsFactors=FALSE)
 str(dat)
 'data.frame': 5 obs. of 2 variables:
  $ V1: chr "n1" "n1" "n4" "n1" ...
  $ V2: chr "n2" "n4" "n5" "n3" ...

Then your code works (except that I suspect you missed the comma):

 dat[!dat[1]==dat[2], ]
   V1 V2
 1 n1 n2
 2 n1 n4
 3 n4 n5
 4 n1 n3
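If the data frame already contains factor columns and re-reading the data is not convenient, one possible alternative is to compare the columns as character vectors. This is only a sketch under that assumption; `keep` and `result` are illustrative names, and the data frame is rebuilt with explicit factors so the example behaves the same in any R version.

```r
# Recreate the question's data with factor columns, regardless of R version.
dat <- data.frame(
  V1 = factor(c("n1", "n1", "n4", "n1", "n4")),
  V2 = factor(c("n2", "n4", "n5", "n3", "n4"))
)

# Compare the labels rather than the factors themselves.
keep <- as.character(dat$V1) != as.character(dat$V2)

# Keep only the rows whose two columns differ (note the trailing comma,
# which selects rows and keeps all columns).
result <- dat[keep, ]
print(result)
```

The columns in `result` are still factors; only the comparison is done on character values.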

One solution would be to tell data.frame() not to convert character vectors to factors (using stringsAsFactors=FALSE):

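 x <- c('n1', 'n1', 'n4', 'n1', 'n4')
 y <- c('n2', 'n4', 'n5', 'n3', 'n4')
 df <- data.frame(x, y, stringsAsFactors=FALSE)
 # keep only the rows whose columns differ; unlike -which(df$x == df$y),
 # this stays correct when no rows match
 df <- df[df$x != df$y, ]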

After creating the data frame, the code removes the rows in which both columns match, which gives the desired result.


Source: https://habr.com/ru/post/918240/

