geosphere distHaversine() & dplyr - error "wrong length for a vector, should be 2"

I cannot resolve the error "wrong length for a vector, should be 2" when trying to calculate the distance between two points (the thresholds / ends of a runway). To make things worse, I don't understand the existing answers, for example those under "Error R: the wrong length for the vector should be 2", well enough to apply them to my case. A simplified data frame (runway end positions) is as follows:

 runways <- data.frame(
   RWY_ID = c(1, 2, 3),
   RWY    = c("36R", "36L", "01"),
   LAT    = c(40.08, 40.12, 40.06),
   LON    = c(116.59, 116.57, 116.62),
   LAT2   = c(40.05, 40.07, 40.09),
   LON2   = c(116.6, 116.57, 116.61)
 )

Using the distHaversine() function from the geosphere package, I am trying to calculate the distance:

 runways <- mutate(runways,
                   CTD = distHaversine(c(LON, LAT), c(LON2, LAT2)))

I'm not sure what I'm doing wrong here. If I pull out a single LON/LAT position, it is a numeric vector of the correct length.

 myv <- c(runways$LON[1], runways$LAT[1])
 myv
 [1] 116.59  40.08
 str(myv)
  num [1:2] 116.6 40.1
2 answers

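You need to use rowwise(), so that distHaversine is passed one pair of points at a time rather than all rows at once. Without it, c(LON, LAT) inside mutate concatenates the two whole columns into a single vector of length 6, which is what triggers the error: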

 runways %>%
   rowwise() %>%
   mutate(CTD = distHaversine(c(LON, LAT), c(LON2, LAT2)))
 ## Source: local data frame [3 x 7]
 ## Groups: <by row>
 ##
 ## # A tibble: 3 × 7
 ##   RWY_ID    RWY   LAT    LON  LAT2   LON2      CTD
 ##    <dbl> <fctr> <dbl>  <dbl> <dbl>  <dbl>    <dbl>
 ## 1      1    36R 40.08 116.59 40.05 116.60 3446.540
 ## 2      2    36L 40.12 116.57 40.07 116.57 5565.975
 ## 3      3     01 40.06 116.62 40.09 116.61 3446.509

Alternatively, distHaversine can handle matrices, so you can use cbind instead of c:

 runways %>%
   mutate(CTD = distHaversine(cbind(LON, LAT), cbind(LON2, LAT2)))
 ##   RWY_ID RWY   LAT    LON  LAT2   LON2      CTD
 ## 1      1 36R 40.08 116.59 40.05 116.60 3446.540
 ## 2      2 36L 40.12 116.57 40.07 116.57 5565.975
 ## 3      3  01 40.06 116.62 40.09 116.61 3446.509

At scale, the latter approach is almost certainly better, since working row by row does not take advantage of vectorization and can therefore be slow.
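
To see why c() fails where cbind() works, it may help to look at what each one produces from whole columns. This is a minimal base-R sketch using the first-end coordinates from the question; the names LON, LAT, as_vector and as_matrix are only for illustration, and no geosphere call is involved:

```r
# First-end coordinates from the question, as plain vectors
LON <- c(116.59, 116.57, 116.62)
LAT <- c(40.08, 40.12, 40.06)

# c() concatenates both columns into one long vector,
# which is not a single (lon, lat) pair
as_vector <- c(LON, LAT)
length(as_vector)   # 6

# cbind() keeps one point per row: a 3 x 2 matrix of (lon, lat)
as_matrix <- cbind(LON, LAT)
dim(as_matrix)      # 3 2
```

A plain vector passed to distHaversine has to be exactly one lon/lat pair, while a two-column matrix is treated as one point per row, which is why the cbind version is vectorized over all runways.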


I have something to add, since I spent a lot of time on this. In the end, I found out that I should not use the dt$variable form inside dplyr. Inside dplyr verbs, you must refer to the column by its bare name.
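
A small sketch of the difference, assuming dplyr and geosphere are installed and reusing the question's data (the names ok and msg are only for illustration). The likely reason is that runways$LON always refers to the whole column, even inside rowwise(), whereas the bare name LON refers to the current row only:

```r
library(dplyr)
library(geosphere)

runways <- data.frame(
  RWY  = c("36R", "36L", "01"),
  LAT  = c(40.08, 40.12, 40.06),
  LON  = c(116.59, 116.57, 116.62),
  LAT2 = c(40.05, 40.07, 40.09),
  LON2 = c(116.60, 116.57, 116.61)
)

# Bare column names: within rowwise(), c(LON, LAT) has length 2
ok <- runways %>%
  rowwise() %>%
  mutate(CTD = distHaversine(c(LON, LAT), c(LON2, LAT2))) %>%
  ungroup()

# runways$LON is the full column, so c(runways$LON, runways$LAT)
# has length 6 and distHaversine() stops with an error
msg <- tryCatch(
  {
    runways %>%
      rowwise() %>%
      mutate(CTD = distHaversine(c(runways$LON, runways$LAT),
                                 c(runways$LON2, runways$LAT2)))
    "no error"
  },
  error = function(e) "error"
)
msg
```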


Source: https://habr.com/ru/post/1259601/

