Combining two data frames, both with coordinates based on the nearest location

I have one large data frame (~130,000 rows) containing local variables and another (~7,000 rows) containing view density. Both have x and y coordinates, but these coordinates do not always match, e.g.:

df1 <- data.frame(X = c(2,4,1,2,5), Y = c(6,7,8,9,8), V1 = c("A", "B", "C", "D", "E"), V2 = c("G", "H", "I", "J", "K"))

and

df2 <- data.frame(X = c(2,4,6), Y = c(5,9,7), Dens = c(12, 17, 10))

I would like to add a column to df1 containing the density (Dens) from df2 if a df2 point is close enough. If no point is close enough, I would like it to show NA, e.g.:

X Y   V1   V2    Dens
2 6   A    G      12
4 7   B    H      NA     
1 8   C    I      17
2 9   D    J      NA
5 8   E    K      10
1 answer

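First, define a function that finds, for a given row of df1, the closest point in df2. It uses squared Euclidean distance, i.e. (x1 - x2)^2 + (y1 - y2)^2. If your coordinates are lat/lon, you will need to adapt this: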

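# For one row of df1, return the nearest row of df2 and its squared distance
mydist <- function(row){
  # squared Euclidean distance from this point to every point in df2
  dists <- (row[["X"]] - df2$X)^2 + (row[["Y"]] - df2$Y)^2
  nearest <- which.min(dists)
  return(cbind(df2[nearest,], distance = dists[nearest]))
}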
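For lat/lon coordinates, one option is to swap the squared Euclidean distance for a great-circle (haversine) distance. The following is only a sketch under stated assumptions: X is longitude and Y is latitude in decimal degrees, the Earth is treated as a sphere of radius 6371 km, and the names haversine_km, mydist_geo and ref are made up for this illustration.

```r
# Great-circle distance in km between one point and a vector of points.
# Assumes longitudes/latitudes in decimal degrees and a spherical Earth.
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Same idea as mydist, but with the reference table passed in explicitly
# and the distance returned in km rather than squared coordinate units.
mydist_geo <- function(row, ref) {
  dists <- haversine_km(row[["X"]], row[["Y"]], ref$X, ref$Y)
  nearest <- which.min(dists)
  cbind(ref[nearest, ], distance = dists[nearest])
}

# Hypothetical example data (X = longitude, Y = latitude)
ref <- data.frame(X = c(0, 10), Y = c(0, 50), Dens = c(1, 2))
pt  <- data.frame(X = 1, Y = 1)
res <- mydist_geo(pt, ref)
```

With this variant, any cut-off applied to the distance column would be in kilometres.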

Then use lapply to call mydist on every row of df1 and bind the results together:

z <- cbind(df1, do.call(rbind, lapply(1:nrow(df1), function(x) mydist(df1[x,])))) 

This gives:

   X Y V1 V2 X Y Dens distance
1  2 6  A  G 2 5   12        1
2  4 7  B  H 4 9   17        4
3  1 8  C  I 2 5   12       10
21 2 9  D  J 4 9   17        4
22 5 8  E  K 4 9   17        2
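With ~130,000 rows, building a one-row data frame per point and rbind-ing them all can get slow. A possible variation, keeping the same squared-distance logic, is to compute only the index of the nearest df2 row for each point and assemble the result once at the end. This is a sketch that has not been benchmarked on data of that size; nearest_idx, sq_dist and z2 are names invented here.

```r
df1 <- data.frame(X = c(2, 4, 1, 2, 5), Y = c(6, 7, 8, 9, 8),
                  V1 = c("A", "B", "C", "D", "E"),
                  V2 = c("G", "H", "I", "J", "K"))
df2 <- data.frame(X = c(2, 4, 6), Y = c(5, 9, 7), Dens = c(12, 17, 10))

# For each row of df1, the index of the nearest row of df2.
# Ties go to the first minimum, as with which.min in mydist.
nearest_idx <- vapply(seq_len(nrow(df1)), function(i) {
  which.min((df1$X[i] - df2$X)^2 + (df1$Y[i] - df2$Y)^2)
}, integer(1))

# Squared distance to the matched point, computed in one vectorised step
sq_dist <- (df1$X - df2$X[nearest_idx])^2 + (df1$Y - df2$Y[nearest_idx])^2

# Attach only the columns that are needed, avoiding duplicate X/Y names
z2 <- cbind(df1, Dens = df2$Dens[nearest_idx], distance = sq_dist)
```

Here z2 carries the same Dens and distance values as the lapply version, without the repeated X and Y columns.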

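If you want a cut-off, set a threshold on the distance and replace Dens with NA for points that are too far away: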

# distance holds squared distances, so the threshold is in squared units
threshold <- 5
z$Dens[z$distance > threshold] <- NA

   X Y V1 V2 X Y Dens distance
1  2 6  A  G 2 5   12        1
2  4 7  B  H 4 9   17        4
3  1 8  C  I 2 5   NA       10
21 2 9  D  J 4 9   17        4
22 5 8  E  K 4 9   17        2

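(Only the row with distance 10 exceeds the threshold.) Since some coordinates match exactly, you could also handle those with merge first, and run the nearest-point search only on the rows that remain unmatched (see dplyr::anti_join).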
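One way to read the merge suggestion is: join the exact coordinate matches first, then run the nearest-point search only for the rows that are still missing a value. The sketch below uses base R merge with all.x = TRUE instead of dplyr::anti_join to stay dependency-free. The data are hypothetical (the first df1 point was changed to coincide with a df2 point), and it assumes df2$Dens itself contains no NA, since NA is used to spot unmatched rows.

```r
# Hypothetical data: the point (2, 5) appears in both tables
df1 <- data.frame(X = c(2, 4, 1), Y = c(5, 7, 8), V1 = c("A", "B", "C"))
df2 <- data.frame(X = c(2, 4, 6), Y = c(5, 9, 7), Dens = c(12, 17, 10))

# Left join on exact coordinates; rows without a match get Dens = NA.
# Note that merge sorts the result by the key columns.
merged <- merge(df1, df2, by = c("X", "Y"), all.x = TRUE)

# Squared-distance cut-off, as in the threshold step
threshold <- 5

# Nearest-point search only for the rows that did not match exactly
for (i in which(is.na(merged$Dens))) {
  d <- (merged$X[i] - df2$X)^2 + (merged$Y[i] - df2$Y)^2
  if (min(d) <= threshold) {
    merged$Dens[i] <- df2$Dens[which.min(d)]
  }
}
```

Exact matching on floating-point coordinates only works if the values are truly identical in both tables, so rounding beforehand may be needed.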


Source: https://habr.com/ru/post/1619635/

