Return the existing value as-is for non-matches when using the match function in R

Question

I have a much larger existing data frame. For this small example, I would like to replace some of the state values in df1 with newstate from df2, matching on the first column. My problem is that NA is returned for the rows that have no mapping in the new data frame (df2).

Existing data frame:

    state = c("CA","WA","OR","AZ")
    first = c("Jim","Mick","Paul","Ron")
    df1 <- data.frame(first, state)

      first state
    1   Jim    CA
    2  Mick    WA
    3  Paul    OR
    4   Ron    AZ

New data frame, matching part of the existing data frame:

    state = c("CA","WA")
    newstate = c("TX", "LA")
    first = c("Jim","Mick")
    df2 <- data.frame(first, state, newstate)

      first state newstate
    1   Jim    CA       TX
    2  Mick    WA       LA

I tried to use match, but it returns NA for state wherever the first value in df1 has no counterpart in df2.

    df1$state <- df2$newstate[match(df1$first, df2$first)]

      first state
    1   Jim    TX
    2  Mick    LA
    3  Paul  <NA>
    4   Ron  <NA>

Is there a way to ignore the non-matches, or to have a non-match return the existing value as-is? This is an example of the desired result: the states for Jim and Mick are updated, but the states for Paul and Ron do not change.

      first state
    1   Jim    TX
    2  Mick    LA
    3  Paul    OR
    4   Ron    AZ

+5

merge r match dataframe

panstotts Oct 4 '14 at 3:16

3 answers

I think you will get better behavior with character vectors than with factors.

    > df1 <- data.frame(first, state, stringsAsFactors=FALSE)
    > state = c("CA","WA")
    > newstate = c("TX", "LA")
    > first = c("Jim","Mick")
    > df2 <- data.frame(first, state, newstate, stringsAsFactors=FALSE)
    > df1[ match(df2$first, df1$first ), "state"] <- df2$newstate
    > df1
      first state
    1   Jim    TX
    2  Mick    LA
    3  Paul    OR
    4   Ron    AZ
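The following minimal sketch is not part of the original answer, and its variable names are illustrative only. It shows one reason character vectors may behave better here: assigning a value that is not an existing factor level stores NA, while a character vector simply takes the new string.

```r
# Illustrative only: state_f and state_c are hypothetical names.
state_f <- factor(c("CA", "WA", "OR", "AZ"))  # factor with levels AZ, CA, OR, WA
state_c <- c("CA", "WA", "OR", "AZ")          # plain character vector

# "TX" is not an existing level, so the element becomes NA.
# R normally warns about the invalid factor level; the warning is silenced here.
suppressWarnings(state_f[1] <- "TX")

# The character vector accepts any string.
state_c[1] <- "TX"

is.na(state_f[1])  # TRUE
state_c[1]         # "TX"
```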

+3

42- Oct 4 '14 at 4:14

    library(data.table)
    DT1 <- as.data.table(df1)
    DT2 <- as.data.table(df2)
    setkey(DT1, first, state)
    setkey(DT2, first, state)
    DT1[DT2]
    #    first state newstate
    # 1:   Jim    CA       TX
    # 2:  Mick    WA       LA

Note that [.data.table also has a nomatch argument, which controls whether rows without a match are dropped (nomatch=0) or kept with NA (nomatch=NA):

    DT2[DT1, nomatch=0]
    #    first state newstate
    # 1:   Jim    CA       TX
    # 2:  Mick    WA       LA

    DT2[DT1, nomatch=NA]
    #    first state newstate
    # 1:   Jim    CA       TX
    # 2:  Mick    WA       LA
    # 3:  Paul    OR       NA
    # 4:   Ron    AZ       NA
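The joins above return the matched rows but do not write newstate back into state. One possible way to get the result asked for in the question is a data.table update join, sketched below. It is not part of the original answer, and it assumes a data.table version that supports the on= argument (1.9.6 or later). The tables are rebuilt here so the block runs on its own.

```r
library(data.table)

# Rebuild the example tables so this block is self-contained.
DT1 <- data.table(first = c("Jim", "Mick", "Paul", "Ron"),
                  state = c("CA", "WA", "OR", "AZ"))
DT2 <- data.table(first = c("Jim", "Mick"),
                  state = c("CA", "WA"),
                  newstate = c("TX", "LA"))

# Update join: for rows of DT1 whose `first` matches a row of DT2,
# overwrite state by reference with DT2's newstate (the i. prefix refers to DT2).
# Rows of DT1 without a match are left untouched.
DT1[DT2, on = "first", state := i.newstate]

DT1
```

Because := modifies DT1 in place, no reassignment is needed, and Paul and Ron keep their original states.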

+2

Ricardo Saporta Oct 4 '14 at 3:23

Data munger · Accepted Answer · 2014-10-04T23:08:37+0000

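Is this what you want? BTW, if you really don't want to work with factors, use stringsAsFactors = FALSE in your data.frame call. Note the use of nomatch = 0 in the match call: unmatched rows get index 0, which selects nothing from df2$newstate, and indx != 0 restricts the assignment to the matched rows of df1.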

    > state = c("CA","WA","OR","AZ")
    > first = c("Jim","Mick","Paul","Ron")
    > df1 <- data.frame(first, state, stringsAsFactors = FALSE)
    > state = c("CA","WA")
    > newstate = c("TX", "LA")
    > first = c("Jim","Mick")
    > df2 <- data.frame(first, state, newstate, stringsAsFactors = FALSE)
    > df1
      first state
    1   Jim    CA
    2  Mick    WA
    3  Paul    OR
    4   Ron    AZ
    > df2
      first state newstate
    1   Jim    CA       TX
    2  Mick    WA       LA
    >
    > # create an index for the matches
    > indx <- match(df1$first, df2$first, nomatch = 0)
    > df1$state[indx != 0] <- df2$newstate[indx]
    > df1
      first state
    1   Jim    TX
    2  Mick    LA
    3  Paul    OR
    4   Ron    AZ
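The same idea can be wrapped in a small reusable function. The sketch below is an assumption for illustration, not part of the original answer: replace_matched is a hypothetical helper name, and it uses the default NA for non-matches instead of nomatch = 0.

```r
# Hypothetical helper, not from the original answers.
# Returns `old` with values replaced wherever `key` is found in `lookup_key`;
# positions without a match keep their existing value.
replace_matched <- function(old, key, lookup_key, lookup_value) {
  idx <- match(key, lookup_key)       # NA where key has no match
  hit <- !is.na(idx)                  # TRUE only for matched positions
  old[hit] <- lookup_value[idx[hit]]  # overwrite matched positions only
  old
}

df1 <- data.frame(first = c("Jim", "Mick", "Paul", "Ron"),
                  state = c("CA", "WA", "OR", "AZ"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(first = c("Jim", "Mick"),
                  newstate = c("TX", "LA"),
                  stringsAsFactors = FALSE)

df1$state <- replace_matched(df1$state, df1$first, df2$first, df2$newstate)
df1
```

Filtering on !is.na(idx) plays the same role as indx != 0 in the answer above: only rows with a match are assigned, so Paul and Ron are left unchanged.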
