bind_rows in dplyr throws an unusual error

I hope I am not duplicating an existing question. I am working on a 32-bit Win7 machine with R version 3.2.0, dplyr version 0.4.1, and RStudio 0.98.1103.

I have two CSV files that are read into the variables x and y (sep = "|", header = TRUE, stringsAsFactors = FALSE). Both come from the same Oracle table, and the query used to create both files pulled the same 29 variables.

    > identical(names(x), names(y))
    [1] TRUE

However, when I load the dplyr package and try to use 'bind_rows' as dat <- bind_rows(x, y), I get the following error:

    > bind_rows(x,y)
    Error: incompatible type (data index: 2, column: 'rmnumber', was collecting: integer (dplyr::Collecter_Impl<13>), incompatible with data of type: factor
    In addition: Warning messages:
    1: In rbind_all(list(x, ...)) : Unequal factor levels: coercing to character
    2: In rbind_all(list(x, ...)) : Unequal factor levels: coercing to character
    3: In rbind_all(list(x, ...)) : Unequal factor levels: coercing to character

I looked at the "rmnumber" column and confirmed that everything in it is either numeric, as expected, or "NA", as expected for NULL values in the table. I also tried bind_rows(list(x, y)) and it returned the same error.

The primitive "rbind" works fine with these variables without any noticeable loss of precision.

Has anyone seen this error? Do you have any potential solutions outside of using rbind?

Thanks!

#

I don't think this is useful, but I built my own data frames and, of course, "bind_rows" worked fine:

    > x.df <- data.frame(first_name = c("abc"), last_name = c("def"), rmnum = (1:15), addy = ("some_address"))
    > y.df <- data.frame(first_name = c("abc"), last_name = c("def"), rmnum = (1:15), addy = ("some_address"))
    > bind_rows(x.df, y.df)
    Source: local data frame [30 x 4]

       first_name last_name rmnum         addy
    1         abc       def     1 some_address
    2         abc       def     2 some_address
    3         abc       def     3 some_address
    4         abc       def     4 some_address
    5         abc       def     5 some_address
    6         abc       def     6 some_address
    7         abc       def     7 some_address
    8         abc       def     8 some_address
    9         abc       def     9 some_address
    10        abc       def    10 some_address
    ..        ...       ...   ...          ...

Column class check

    > identical(sapply(x, class), sapply(y, class))
    [1] FALSE
    > class(x$rmnumber);class(y$rmnumber)
    [1] "integer"
    [1] "character"
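When identical() on the class vectors returns FALSE, it helps to list exactly which columns disagree instead of checking them one by one. The following is a minimal base R sketch. The data frames here are hypothetical stand-ins for the real x and y, and the comparison assumes both have the same column names in the same order, which the question confirms with identical(names(x), names(y)).

```r
# Hypothetical stand-ins for the asker's x and y; only the shape matters.
x <- data.frame(id = 1:3, rmnumber = c(101L, 102L, NA),
                stringsAsFactors = FALSE)
y <- data.frame(id = 4:6, rmnumber = c("103", "104", NA),
                stringsAsFactors = FALSE)

# First class of every column, as a named character vector.
cls_x <- vapply(x, function(col) class(col)[1], character(1))
cls_y <- vapply(y, function(col) class(col)[1], character(1))

# Names of the columns whose classes disagree between the two data frames.
mismatched <- names(cls_x)[cls_x != cls_y]
print(mismatched)
```

With 29 columns this narrows the search to the handful that need attention before binding.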

I can't understand why they are different. The information came from the same table and they were read into variables using the same code.
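One plausible explanation, which the question does not confirm, is that the reader guesses each column's type separately for each file. If one file contains even a single value in "rmnumber" that is not a number and not a recognized missing-value string, the whole column is read as character (or as factor when stringsAsFactors = TRUE). The file contents below are invented for illustration, and "N/A" is only an assumed example of such a stray value.

```r
# Hypothetical file contents; the real exports are not shown in the question.
file_x <- "id|rmnumber\n1|101\n2|102\n3|NA"
file_y <- "id|rmnumber\n4|103\n5|N/A\n6|104"

# Same reading code for both files, as in the question.
x <- read.table(text = file_x, sep = "|", header = TRUE,
                stringsAsFactors = FALSE)
y <- read.table(text = file_y, sep = "|", header = TRUE,
                stringsAsFactors = FALSE)

print(class(x$rmnumber))  # "integer": every value is numeric or NA
print(class(y$rmnumber))  # "character": "N/A" is not a default NA string

# Declaring the column types and the missing-value strings up front
# removes the guessing, so both files produce the same classes.
y2 <- read.table(text = file_y, sep = "|", header = TRUE,
                 colClasses = c("integer", "integer"),
                 na.strings = c("NA", "N/A"))
print(class(y2$rmnumber))
```

Passing colClasses also makes a genuinely malformed value fail loudly at read time instead of silently changing the column type.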

Solution

Many thanks to @Pascal for helping me solve this problem. A simple data type conversion solved my problem:

    > y$rmnumber <- as.integer(y$rmnumber)
    > dat2 <- bind_rows(x,y)
    > dat2
    Source: local data frame [99,884 x 24]
1 answer

The error message is telling you this: in one data.frame, "rmnumber" is of class integer, and in the other data.frame, "rmnumber" is of class factor. Columns of different classes cannot be bound together.

Let me use your example.

    x.df <- data.frame(first_name = c("abc"), last_name = c("def"), rmnum = (1:15), addy = ("some_address"))
    y.df <- data.frame(first_name = c("abc"), last_name = c("def"), rmnum = (1:15), addy = ("some_address"))

Check the class of each column in "x.df" and "y.df":

    sapply(x.df, class)
    # first_name  last_name      rmnum       addy
    #   "factor"   "factor"  "integer"   "factor"
    sapply(y.df, class)
    # first_name  last_name      rmnum       addy
    #   "factor"   "factor"  "integer"   "factor"

Everything is fine: the classes are consistent between the two data.frames. Now convert "y.df$rmnum" to a factor:

    y.df$rmnum <- factor(y.df$rmnum)
    class(y.df$rmnum)
    # [1] "factor"

Try binding now:

 bind_rows(x.df, y.df) 

    Error: incompatible type (data index: 2, column: 'rmnum', was collecting: integer (dplyr::Collecter_Impl<13>), incompatible with data of type: factor

This is the same error message. So in one of your data.frames "rmnumber" is an integer, and in the other it is a factor. You must convert the factor "rmnumber" to an integer, or vice versa.
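One caveat when converting a factor to an integer: calling as.integer() directly on a factor returns the internal level codes, not the values the labels represent. A small base R sketch with an invented factor shows the difference. Going through as.character() first is the usual way to recover the original numbers.

```r
# Invented factor whose labels happen to be numbers.
f <- factor(c("10", "20", "20", "35"))

# Direct conversion yields the level codes (positions in levels(f)).
codes <- as.integer(f)
print(codes)   # 1 2 2 3

# Converting the labels to character first yields the original values.
values <- as.integer(as.character(f))
print(values)  # 10 20 20 35
```

For a character column, as in the question's class check, as.integer() alone is enough, since there are no level codes involved.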


Source: https://habr.com/ru/post/987096/

