I have 2 data frames.
- Template - I will use data types from this data frame.
- df - I want to change the data types of this data frame based on a template.
Suppose I have the data frame below, which I use as a template.

    # template
    id <- c(1,2,3,4)
    a <- c(1,4,5,6)
    b <- as.character(c(0,1,1,4))
    c <- as.character(c(0,1,1,0))
    d <- c(0,1,1,0)
    template <- data.frame(id,a,b,c,d, stringsAsFactors = FALSE)

    > str(template)
    'data.frame': 4 obs. of 5 variables:
     $ id: num 1 2 3 4
     $ a : num 1 4 5 6
     $ b : chr "0" "1" "1" "4"
     $ c : chr "0" "1" "1" "0"
     $ d : num 0 1 1 0

I am looking for the following:
- df should have the same data types as the template.
- df should have the same columns as the template.
**Note:** It should add extra columns filled with NA if they are not available in df.

    # df
    id <- c(6,7,12,14,1,3,4,4)
    a <- c(0,1,13,1,3,4,5,6)
    b <- c(1,4,12,3,4,5,6,7)
    c <- c(0,0,13,3,4,45,6,7)
    e <- c(0,0,13,3,4,45,6,7)
    df <- data.frame(id,a,b,c,e)

    > str(df)
    'data.frame': 8 obs. of 5 variables:
     $ id: num 6 7 12 14 1 3 4 4
     $ a : num 0 1 13 1 3 4 5 6
     $ b : num 1 4 12 3 4 5 6 7
     $ c : num 0 0 13 3 4 45 6 7
     $ e : num 0 0 13 3 4 45 6 7

The required output is:

    > output
      id  a  b  c  d
    1  6  0  1  0 NA
    2  7  1  4  0 NA
    3 12 13 12 13 NA
    4 14  1  3  3 NA
    5  1  3  4  4 NA
    6  3  4  5 45 NA
    7  4  5  6  6 NA
    8  4  6  7  7 NA

    > str(output)
    'data.frame': 8 obs. of 5 variables:
     $ id: num 6 7 12 14 1 3 4 4
     $ a : num 0 1 13 1 3 4 5 6
     $ b : chr "1" "4" "12" "3" ...
     $ c : chr "0" "0" "13" "3" ...
     $ d : logi NA NA NA NA NA NA ...

My attempt so far is:
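
    library(data.table)

    # Read the template (both objects are assumed to be data.tables here)
    template <- fread("template.csv", header = TRUE, stringsAsFactors = FALSE)

    # Convert every template column to character and strip unwanted characters
    n <- names(template)
    template[,(n) := lapply(.SD,function(x) gsub("[^A-Za-z0-90 _/.-]","", as.character(x)))]

    # Same cleaning for df
    n <- names(df)
    df[,(n) := lapply(.SD,function(x) gsub("[^A-Za-z0-90 _/.-]","", as.character(x)))]

    # Stack template and df, filling missing columns with NA
    output <- rbindlist(list(template,df),use.names = TRUE,fill = TRUE,idcol="template")
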
After that, I write the output data frame with write.csv and then re-read it to get the data types, but this messes up the data types. Please suggest a suitable way to handle this.
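
For reference, the behaviour described above could be sketched in base R roughly as follows. This is only an illustration under some assumptions: `match_template` is a hypothetical helper name (not from any package), every template column is assumed to have a class with a matching `as.<class>` function (such as `as.numeric` or `as.character`), and columns missing from df are left as plain logical NA, as in the required output shown earlier.

```r
# Example data, same values as in the question
template <- data.frame(
  id = c(1, 2, 3, 4),
  a  = c(1, 4, 5, 6),
  b  = as.character(c(0, 1, 1, 4)),
  c  = as.character(c(0, 1, 1, 0)),
  d  = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)
df <- data.frame(
  id = c(6, 7, 12, 14, 1, 3, 4, 4),
  a  = c(0, 1, 13, 1, 3, 4, 5, 6),
  b  = c(1, 4, 12, 3, 4, 5, 6, 7),
  c  = c(0, 0, 13, 3, 4, 45, 6, 7),
  e  = c(0, 0, 13, 3, 4, 45, 6, 7)
)

# Hypothetical helper: keep only the template's columns, in the template's
# order, coercing each one to the class of the matching template column.
match_template <- function(df, template) {
  out <- lapply(names(template), function(col) {
    if (col %in% names(df)) {
      # Assumption: an as.<class>() function exists for the template's class
      convert <- match.fun(paste0("as.", class(template[[col]])[1]))
      convert(df[[col]])
    } else {
      # Column not present in df: fill with (logical) NA
      rep(NA, nrow(df))
    }
  })
  names(out) <- names(template)
  as.data.frame(out, stringsAsFactors = FALSE)
}

output <- match_template(df, template)
str(output)
```

Columns that exist only in df (here `e`) are dropped because the loop iterates over the template's names. If the NA columns should also carry the template's type, the same `convert` call could be applied to the `rep(NA, nrow(df))` vector.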