Assign value to df$column from another df?

Question


Example: I have a df in which the first column is:

dat <- c("A","B","C","A")

and then I have another df whose first two columns are:

 dat2[, 1]
 [1] A B C
 Levels: A B C
 dat2[, 2]
 [1] 21000 23400 26800

How can I add the values from the second df (dat2) to the first df (dat)? The first df contains repeated values, and for every "A" I want the corresponding value (21000) from the second df added in a new column.

+5

r

Mike jj Sep 05 '17 at 22:32


4 answers

A third option, which I prefer, is left_join from dplyr. It seems to be faster than merge on large data frames.

 library(dplyr)

 dat1 <- data.frame(x1 = c("A","B","C","A"), stringsAsFactors = FALSE)
 dat2 <- data.frame(x1 = c("A","B","C"), x2 = c(21000, 23400, 26800), stringsAsFactors = FALSE)

 # Keep every row of dat1 and attach the matching x2 from dat2
 dat1 <- left_join(dat1, dat2, by = "x1")

+2

Sam zipper Sep 06 '17 at 12:08


Let's run a big-data race with microbenchmark, just for fun!

Create large data frames:

 # Every key appears many times in both frames, so this is a many-to-many join
 dat1 <- data.frame(x1 = rep(c("A","B","C","A"), 1000), stringsAsFactors = FALSE)
 # runif(1, 0) draws a single value, which is recycled down the whole x2 column
 dat2 <- data.frame(x1 = rep(c("A","B","C", "D"), 1000), x2 = runif(1,0), stringsAsFactors = FALSE)

On your marks, get set, GO!

 library(dplyr)           # provides left_join
 library(microbenchmark)

 mbm <- microbenchmark(
   left_join = left_join(dat1, dat2, by = "x1"),
   merge     = merge(dat1, dat2, by = "x1"),
   times = 20
 )

Many, many seconds later... left_join is MUCH faster than merge for large data frames.
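One thing worth noting about the benchmark data: the keys repeat in both data frames, so every "A" in dat1 is paired with every "A" in dat2, and the result is far larger than either input. The sketch below is only an illustration under my own assumptions: it uses base R merge, a smaller repeat count `n`, and a hypothetical de-duplicated table called `lookup`. It shows the row counts for the many-to-many case and for the one-value-per-key case the question describes.

```r
# Scaled-down version of the benchmark data (n is an assumption; the answer used 1000)
n <- 100
dat1 <- data.frame(x1 = rep(c("A", "B", "C", "A"), n), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = rep(c("A", "B", "C", "D"), n),
                   x2 = runif(4 * n),
                   stringsAsFactors = FALSE)

# Many-to-many: "A" gives 2n * n rows, "B" and "C" give n * n rows each
joined <- merge(dat1, dat2, by = "x1")
nrow(joined)          # 4 * n^2

# Hypothetical lookup table with one row per key (first occurrence kept)
lookup <- dat2[!duplicated(dat2$x1), ]
joined_unique <- merge(dat1, lookup, by = "x1")
nrow(joined_unique)   # same as nrow(dat1)
```

With the original 1000 repeats, the first join produces 4,000,000 rows, which goes a long way toward explaining the "many, many seconds".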

+2

Rich pauloo Sep 06 '17 at 3:02


Use the merge function.

 # Input data
 dat <- data.frame(ID = c("A", "B", "C", "A"))
 dat2 <- data.frame(ID = c("A", "B", "C"), value = c(1, 2, 3))

 # Merge two data.frames by specified column
 merge(dat, dat2, by = "ID")

   ID value
 1  A     1
 2  A     1
 3  B     2
 4  C     3
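Note that the output above is sorted by ID, so the rows no longer follow the order of dat. If the original order matters, one possible approach (a sketch, not the only way) is to record the row position before merging and sort on it afterwards. The helper column `row_id` is a name I made up for the example, and `all.x = TRUE` is added so that rows without a match are kept with NA.

```r
dat  <- data.frame(ID = c("A", "B", "C", "A"), stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c("A", "B", "C"), value = c(1, 2, 3), stringsAsFactors = FALSE)

# Remember the original row position (row_id is a hypothetical helper column)
dat$row_id <- seq_len(nrow(dat))

# all.x = TRUE keeps rows of dat that have no partner in dat2 (value becomes NA)
merged <- merge(dat, dat2, by = "ID", all.x = TRUE)

# Restore the original order and drop the helper column
merged <- merged[order(merged$row_id), c("ID", "value")]
rownames(merged) <- NULL
merged
```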

+1

PoGibas Sep 05 '17 at 10:38


D.sen · Accepted Answer · 2017-09-05T22:37:02+0000

Creating a reproducible data frame ...

 dat1 <- data.frame(x1 = c("A","B","C","A"), stringsAsFactors = FALSE)
 dat2 <- data.frame(x1 = c("A","B","C"), x2 = c(21000, 23400, 26800), stringsAsFactors = FALSE)

Then use the match function.

 dat1$dat2_vals <- dat2$x2[match(dat1$x1, dat2$x1)]

It is important that the columns being matched are stored as character, not factor, or the elements may not match as expected. I mention this because of the Levels attribute shown for your dat2.
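To make the match step a little more concrete, here is a small sketch of what it returns. The extra key "Z" and the column name `via_names` are my own additions for illustration. match gives, for each element of dat1$x1, its position in dat2$x1, or NA when there is no such key, and those positions then index dat2$x2. A named vector gives the same lookup.

```r
dat1 <- data.frame(x1 = c("A", "B", "C", "A", "Z"), stringsAsFactors = FALSE)
dat2 <- data.frame(x1 = c("A", "B", "C"),
                   x2 = c(21000, 23400, 26800),
                   stringsAsFactors = FALSE)

# Position of each dat1 key within dat2$x1; "Z" is absent, so it yields NA
idx <- match(dat1$x1, dat2$x1)
dat1$dat2_vals <- dat2$x2[idx]

# Alternative: a named vector used as a lookup table
lookup <- setNames(dat2$x2, dat2$x1)
dat1$via_names <- unname(lookup[dat1$x1])

dat1
```

Both new columns hold the same values, with NA in the row for the unmatched key.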
