I hope I am not duplicating an existing question. I am working on a 32-bit Windows 7 machine with R 3.2.0, dplyr 0.4.1, and RStudio 0.98.1103.
I have two CSV files that were read into the variables x and y (using sep = "|", header = TRUE, stringsAsFactors = FALSE). Both files come from the same Oracle table, and the queries used to create them pulled the same 29 variables.
> identical(names(x), names(y))
[1] TRUE
However, when I load the dplyr package and try to use 'bind_rows', as in dat <- bind_rows(x, y), I get the following error:
> bind_rows(x,y)
Error: incompatible type (data index: 2, column: 'rmnumber', was collecting: integer (dplyr::Collecter_Impl<13>), incompatible with data of type: factor
In addition: Warning messages:
1: In rbind_all(list(x, ...)) : Unequal factor levels: coercing to character
2: In rbind_all(list(x, ...)) : Unequal factor levels: coercing to character
3: In rbind_all(list(x, ...)) : Unequal factor levels: coercing to character
I looked at the "rmnumber" column and confirmed that everything in it is either numeric, as expected, or "NA", as expected for NULL values in the table. I also tried bind_rows(list(x, y)), which returned the same error.
Base 'rbind' works fine on these data frames, with no noticeable loss of precision.
Has anyone seen this error? Do you have any potential solutions outside of using rbind?
Thanks!
I doubt this is useful, but I built my own data frames and, of course, 'bind_rows' worked fine:
> x.df <- data.frame(first_name = c("abc"), last_name = c("def"), rmnum = (1:15), addy = ("some_address"))
> y.df <- data.frame(first_name = c("abc"), last_name = c("def"), rmnum = (1:15), addy = ("some_address"))
> bind_rows(x.df, y.df)
Source: local data frame [30 x 4]

   first_name last_name rmnum         addy
1         abc       def     1 some_address
2         abc       def     2 some_address
3         abc       def     3 some_address
4         abc       def     4 some_address
5         abc       def     5 some_address
6         abc       def     6 some_address
7         abc       def     7 some_address
8         abc       def     8 some_address
9         abc       def     9 some_address
10        abc       def    10 some_address
..        ...       ...   ...          ...
Column class check
> identical(sapply(x, class), sapply(y, class))
[1] FALSE
> class(x$rmnumber);class(y$rmnumber)
[1] "integer"
[1] "character"
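With 29 columns, it can help to list exactly which ones disagree instead of checking them one at a time. The following is a minimal sketch in base R, assuming both data frames have the same column names in the same order (which the identical(names(x), names(y)) check confirmed for the real data). The two small data frames here are hypothetical stand-ins, not the actual files.

```r
# Hypothetical stand-ins for the two data frames read from the CSV files
x <- data.frame(id = 1:3, rmnumber = c(101L, 102L, NA),
                stringsAsFactors = FALSE)
y <- data.frame(id = 4:6, rmnumber = c("201", "20A", NA),
                stringsAsFactors = FALSE)

# Take the first class of each column so the result is a plain character vector
cls_x <- vapply(x, function(col) class(col)[1], character(1))
cls_y <- vapply(y, function(col) class(col)[1], character(1))

# Names of the columns whose classes disagree between x and y
mismatched <- names(cls_x)[cls_x != cls_y]
mismatched
# [1] "rmnumber"
```

Any column reported here is a candidate for the kind of type conflict that makes 'bind_rows' fail.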
I can't understand why they are different. The data came from the same table and were read into the variables using the same code.
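One possible explanation, offered as a guess and not a diagnosis of these particular files: base R's CSV readers infer each column's type separately for every file, so a single non-numeric value in one extract is enough to turn that column into character (or factor) there while it stays integer in the other. The sketch below assumes read.csv and uses made-up pipe-delimited text; the "N/A" value is a hypothetical offender.

```r
# Two hypothetical pipe-delimited extracts with the same columns
csv_a <- "id|rmnumber\n1|101\n2|102\n3|NA"
csv_b <- "id|rmnumber\n4|201\n5|N/A\n6|203"

a <- read.csv(text = csv_a, sep = "|", header = TRUE, stringsAsFactors = FALSE)
b <- read.csv(text = csv_b, sep = "|", header = TRUE, stringsAsFactors = FALSE)

class(a$rmnumber)  # "integer": "NA" is read as a missing value
class(b$rmnumber)  # "character": "N/A" is not a recognised missing-value string

# Values that are not missing but would become NA if converted to integer
vals   <- b$rmnumber
as_int <- suppressWarnings(as.integer(vals))
bad    <- vals[!is.na(vals) & is.na(as_int)]
bad
# [1] "N/A"
```

Running the last three lines on the real y$rmnumber before converting it would show whether any values are silently turned into NA by as.integer.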
Solution
Many thanks to @Pascal for helping me solve this problem. A simple data type conversion solved my problem:
> y$rmnumber <- as.integer(y$rmnumber)
> dat2 <- bind_rows(x,y)
> dat2
Source: local data frame [99,884 x 24]
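An alternative to converting after the fact is to state the column type when reading, so both files end up with the same class regardless of their contents. This is only a sketch: it assumes the files are read with read.csv, the helper name read_extract and the sample text are hypothetical, and rbind stands in for 'bind_rows' so the example runs without dplyr.

```r
# Hypothetical pipe-delimited extracts; the second contains a non-numeric value
csv_a <- "id|rmnumber\n1|101\n2|102\n3|NA"
csv_b <- "id|rmnumber\n4|201\n5|N/A\n6|203"

# Hypothetical helper: read an extract, forcing rmnumber to character
read_extract <- function(txt) {
  read.csv(text = txt, sep = "|", header = TRUE, stringsAsFactors = FALSE,
           colClasses = c(rmnumber = "character"))
}

a <- read_extract(csv_a)
b <- read_extract(csv_b)

# The column classes now agree, so row-binding has no type conflict
identical(sapply(a, class), sapply(b, class))
# [1] TRUE

combined <- rbind(a, b)
nrow(combined)
# [1] 6
```

Reading as character keeps every raw value intact; the column can then be converted with as.integer once any unexpected values have been dealt with.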