Matching georeferenced data to a shapefile in R

I have a georeferenced dataset of events in a data frame of the form:

    LONGITUDE LATITUDE VAR1
         33.4      4.4    5
         33.4      4.4    3
         33.4      4.4    1
         30.4      4.2    2
         28.4      5.1    2

It counts deaths in georeferenced events. In addition, I have a shapefile of the country's provinces, for example:

    > str(shapefile)
    'data.frame': 216 obs. of 5 variables:
     $ CONSTI_COD   : num 1 2 3 4 5 6 7 8 9 10 ...
     $ Area         : num 20 11.7 10.7 223.3 38.7 ...
     $ PROVINCE_NAME: Factor w/ 216 levels "CENTRAL","COAST",..: 4 4 4 4 4 4 4 4 2 2 ...
     $ Shape_Leng   : num 0.193 0.152 0.201 0.872 0.441 ...
     $ Shape_Area   : num 0.001628 0.000947 0.000867 0.018135 0.003145 ...
     ..@ polygons :List of 216
     .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
     .. .. .. ..@ Polygons :List of 1
     .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
     .. .. .. .. .. .. ..@ labpt : num [1:2] 36.9 -1.3
     .. .. .. .. .. .. ..@ area : num 0.00163
     .. .. .. .. .. .. ..@ hole : logi FALSE
     .. .. .. .. .. .. ..@ ringDir: int 1
     .. .. .. .. .. .. ..@ coords : num [1:151, 1:2] 36.8 36.8 36.8 36.9 36.9 ...
     .. .. .. ..@ plotOrder: int 1
     .. .. .. ..@ labpt : num [1:2] 36.9 -1.3
     .. .. .. ..@ ID : chr "0"
     .. .. .. ..@ area : num 0.00163
    [...etc]

What I need to do is match the event data to the provinces, i.e. add a fourth column to the first data frame indicating which province each event occurred in, based on its coordinates. So I would have something like this:

    LONGITUDE LATITUDE VAR1 PROVINCE
         33.4      4.4    5  CENTRAL
         33.4      4.4    3  CENTRAL
         33.4      4.4    1  CENTRAL
         30.4      4.2    2    COAST
         28.4      5.1    2    COAST

Is this possible? I think I came across a post explaining how to do this some time ago (outside of Stack Overflow), but I can't find it now.

Thanks!

(Sorry if there is a similar question here. I did a search, but I did not find an answer, maybe because I really do not know what I'm looking for. I would really appreciate a link to a similar post.)

1 answer

What you are describing is a "spatial join" (or "spatial intersection" or "overlay"). This is fairly simple using the over function from the sp package.

Here is an example.

First, let's download and import a polygon shapefile of the world's countries.

    download.file(paste0('http://www.naturalearthdata.com/http//',
                         'www.naturalearthdata.com/download/110m/cultural/',
                         'ne_110m_admin_0_countries.zip'),
                  f <- tempfile())
    unzip(f, exdir=tempdir())
    library(rgdal)
    countries <- readOGR(tempdir(), 'ne_110m_admin_0_countries')

Now we create some random coordinate data that falls within the extent of the polygon shapefile. Then we define the x and y columns as coordinates and assign the same CRS as the polygons (this may not hold for your data, so be sure to assign the correct coordinate system).

    pts <- data.frame(x=runif(10, -180, 180), y=runif(10, -90, 90),
                      VAR1=LETTERS[1:10])
    coordinates(pts) <- ~x+y  # pts needs to be a data.frame for this to work
    proj4string(pts) <- proj4string(countries)
    plot(countries)
    points(pts, pch=20, col='red')

Now we can perform a spatial overlay:

    over(pts, countries)$admin
    #  [1] <NA>      <NA>      Turkey    <NA>
    #  [5] Macedonia <NA>      China     Argentina
    #  [9] <NA>      Canada
    # 177 Levels: Afghanistan Albania ... Zimbabwe

Note that in this case some of the random points fell in the ocean (i.e., outside all polygons). When overlaid on the polygon object, these points return NA.
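The NA behaviour can be checked on a small example that needs no download. The following is only a sketch: the two unit squares and their PROVINCE_NAME labels are made-up stand-ins for real province polygons, the helper `sq` is hypothetical, and no CRS is set on either object (so both share an undefined CRS).

```r
library(sp)

# Hypothetical helper: a 1 x 1 square with its lower-left corner at (xmin, ymin).
# Vertices are listed clockwise and hole = FALSE marks it as an outer ring.
sq <- function(xmin, ymin, id) {
  xy <- cbind(c(xmin, xmin, xmin + 1, xmin + 1, xmin),
              c(ymin, ymin + 1, ymin + 1, ymin, ymin))
  Polygons(list(Polygon(xy, hole = FALSE)), ID = id)
}

# Two toy "provinces" (not real boundaries)
polys <- SpatialPolygons(list(sq(0, 0, "a"), sq(1, 0, "b")))
provinces <- SpatialPolygonsDataFrame(
  polys,
  data.frame(PROVINCE_NAME = c("CENTRAL", "COAST"), row.names = c("a", "b"))
)

# Three points: one inside each square, one outside both
pts <- data.frame(x = c(0.5, 1.5, 5), y = c(0.5, 0.5, 5), VAR1 = c(5, 3, 1))
coordinates(pts) <- ~x + y

hit <- over(pts, provinces)$PROVINCE_NAME
outside <- is.na(hit)  # TRUE for points that fall in no polygon
```

Here `outside` flags the third point, so `pts[!outside, ]` would keep only the events that were matched to a polygon.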

Now we can cbind the required attribute to pts:

    cbind.data.frame(pts, country=over(pts, countries)$admin)
    #             x          y VAR1   country
    # 1   -52.59404 -37.422879    A      <NA>
    # 2   -33.88867 -40.194482    B      <NA>
    # 3    38.84383  37.272460    C    Turkey
    # 4   -84.04949   7.118878    D      <NA>
    # 5    20.98272  40.920470    E Macedonia
    # 6  -155.32951 -37.612497    F      <NA>
    # 7    99.40166  38.630049    G     China
    # 8   -61.84025 -27.412885    H Argentina
    # 9   -37.65287  -3.666080    I      <NA>
    # 10 -112.81197  59.959475    J    Canada
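Since rgdal was retired from CRAN in 2023, the same join can also be done with the sf package, which is a different route from the sp::over approach above. The sketch below applies it to the column names from the question. It rests on several assumptions: the coordinates are longitude/latitude in WGS84 (EPSG:4326), the two rectangles are made-up stand-ins for the real province polygons (which you would normally load with `st_read()`), and the helper `rect` is hypothetical.

```r
library(sf)

# Event data as given in the question
events <- data.frame(
  LONGITUDE = c(33.4, 33.4, 33.4, 30.4, 28.4),
  LATITUDE  = c(4.4, 4.4, 4.4, 4.2, 5.1),
  VAR1      = c(5, 3, 1, 2, 2)
)

# Hypothetical helper: a rectangular polygon from its bounding coordinates
rect <- function(xmin, xmax, ymin, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

# Toy stand-in for the provinces layer (not real boundaries).
# With real data this would be something like st_read("<your shapefile>").
provinces <- st_sf(
  PROVINCE_NAME = c("CENTRAL", "COAST"),
  geometry = st_sfc(rect(32, 35, 3, 6), rect(27, 32, 3, 6), crs = 4326)
)

# Turn the events into points, keeping the coordinate columns (assumed WGS84)
pts <- st_as_sf(events, coords = c("LONGITUDE", "LATITUDE"),
                crs = 4326, remove = FALSE)

# Left join: every event is kept; unmatched events would get NA
joined <- st_join(pts, provinces["PROVINCE_NAME"])

# Back to a plain data frame with the column name used in the question
result <- st_drop_geometry(joined)
names(result)[names(result) == "PROVINCE_NAME"] <- "PROVINCE"
```

If the real layers use different coordinate systems, `st_transform()` can bring one into the CRS of the other before the join.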

Source: https://habr.com/ru/post/970717/

