Extract data using a matching data matrix in R

I have two datasets with latitude, longitude and temperature data. One dataset corresponds to a geographic area of interest, with lat / long pairs that form the border and contents of the area (matrix dimensions = 4518x2).

The other dataset contains lat / long and temperature data for a larger area that covers the area of interest (matrix dimensions = 10875x3).

My question is: how do you extract the corresponding row data (lat, long, temperature) from the second dataset that corresponds to the first lat / long dataset?

I tried many “for loops”, “subset” and “unique” commands, but I can’t get the corresponding temperature data.

Thanks in advance!


Edit 10/31: I forgot to mention that I am using R to process this data.

The lat / long data for the area of interest was provided as a set of 4518 files, with the lat / long coordinates in the name of each file:

    x <- dir()
    lenx <- length(x)
    g <- strsplit(x, "_")
    coord1 <- matrix(NA, nrow=lenx, ncol=1)
    coord2 <- matrix(NA, nrow=lenx, ncol=1)
    for(i in 1:lenx) {
      coord1[i,1] <- unlist(g)[2+3*(i-1)]
      coord2[i,1] <- unlist(g)[3+3*(i-1)]
    }
    coord1 <- as.numeric(coord1)
    coord2 <- as.numeric(coord2)
    coord <- cbind(coord1, coord2)

The lat / long and temperature data were obtained from a NetCDF file with temperature data for 10,875 lat / long pairs:

    long <- tempcd$var[["Temp"]]$size[1]
    lat <- tempcd$var[["Temp"]]$size[2]
    time <- tempcd$var[["Temp"]]$size[3]
    proj <- tempcd$var[["Temp"]]$size[4]
    temp <- matrix(NA, nrow=lat*long, ncol = time)
    lat_c <- matrix(NA, nrow=lat*long, ncol=1)
    long_c <- matrix(NA, nrow=lat*long, ncol =1)
    counter <- 1
    for(i in 1:lat){
      for(j in 1:long){
        temp[counter,] <- get.var.ncdf(precipcd, varid= "Prcp", count = c(1,1,time,1), start=c(j,i,1,1))
        counter <- counter+1
      }
    }
    temp_gcm <- cbind(lat_c, long_c, temp)

So now the question is: how do I extract the rows of "temp_gcm" that correspond to the lat / long pairs in "coord"?

2 answers

Noah

I can think of a few ways you could do this. The simplest, though not the most efficient, would be to use R's which() function, which takes a logical argument, while iterating over the data frame whose rows you want to match. Of course, this assumes that there can be no more than one match in the larger dataset. Based on your datasets, I would do something like this:

    # Assumes temp_gcm and coord are data frames (attach() and $ do not work on matrices)
    attach(temp_gcm)  # makes the temp_gcm columns available by name
    attach(coord)     # makes the coord columns available by name
    matched.temp = vector(length = nrow(coord))  # To store matching results
    for (i in seq_len(nrow(coord))) {
      matched.temp[i] = temp[which(lat_c == coord1[i] & long_c == coord2[i])]
    }
    # Now add the results column to the coord data frame (indexes match)
    coord$temperature = matched.temp

The call which(lat_c == coord1[i] & long_c == coord2[i]) returns a vector of the indices of all rows in the data frame temp_gcm whose lat_c and long_c match coord1 and coord2, respectively, from row i of coord in the current iteration (NOTE: I am assuming this vector will have a length of 1, i.e. there is only one possible match). matched.temp[i] is then assigned the value from the temp column of temp_gcm that satisfies the logical condition. The goal is to build a vector of matched values whose indices line up with the rows of the coord data frame.

Hope this helps. Note that this is a rudimentary approach, and I would advise looking into the merge() function as well as apply() to do this in a more concise way.
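As a rough illustration of the merge() route, here is a minimal sketch on made-up toy data. It assumes temp_gcm has a single temperature column and that the column names are lat_c, long_c, temp, coord1 and coord2; gcm_df, coord_df and matched are names introduced only for this example.

```r
# Toy stand-ins for the real data (values are made up for illustration).
temp_gcm <- cbind(lat_c  = c(40.5, 40.5, 41.0, 41.0),
                  long_c = c(-120.5, -120.0, -120.5, -120.0),
                  temp   = c(10.1, 11.2, 12.3, 13.4))
coord <- cbind(coord1 = c(41.0, 40.5),
               coord2 = c(-120.5, -120.0))

# merge() works on data frames, so convert the matrices first.
gcm_df   <- as.data.frame(temp_gcm)
coord_df <- as.data.frame(coord)

# Inner join: keep only the rows of gcm_df whose lat/long pair
# also appears in coord_df.
matched <- merge(coord_df, gcm_df,
                 by.x = c("coord1", "coord2"),
                 by.y = c("lat_c", "long_c"))
print(matched)
```

Note that merge() sorts the result by the key columns by default, so the row order may differ from coord. Also, coordinates parsed from file names and coordinates read from a NetCDF file may differ by tiny rounding errors; if fewer rows come back than expected, rounding both sets to the same number of decimals before merging may help.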


I added an extra column of zeros to use as the result of the if statement. "x" is the number of rows in temp_gcm. "y" is the number of columns (representing time steps). "temp_s" is the standardized temperature data.

    indicator <- matrix(0, nrow = x, ncol = 1)
    precip_s <- cbind(precip_s, indicator)
    temp_s <- cbind(temp_s, indicator)
    for(aa in 1:x){
      current_lat <- latitudes[aa,1]    # Latitudes corresponding to larger area
      current_long <- longitudes[aa,1]  # Longitudes corresponding to larger area
      for(ab in 1:lenx){                # lenx corresponds to nrow(coord)
        if(current_lat == coord[ab,1] & current_long == coord[ab,2]) {
          precip_s[aa,(y/12+1)] <- 1    # y/12+1 corresponds to the "indicator" column
          temp_s[aa,(y/12+1)] <- 1
        }
      }
    }
    precip_s <- precip_s[precip_s[,(y/12+1)]>0,]  # Removes rows with "0"s remaining in the "indicator" column
    temp_s <- temp_s[temp_s[,(y/12+1)]>0,]
    precip_s <- precip_s[,-(y/12+1)]  # Removes the "indicator" column
    temp_s <- temp_s[,-(y/12+1)]
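The same row selection can probably be done without the nested loops or the indicator column. The sketch below uses made-up toy data and assumes latitudes and longitudes are one-column matrices with one row per row of temp_s, as in the loop above; grid_key, coord_key, keep and temp_sub are names introduced only for this example.

```r
# Toy stand-ins (made-up values): 4 grid cells, 2 time steps.
latitudes  <- matrix(c(40.5, 40.5, 41.0, 41.0), ncol = 1)
longitudes <- matrix(c(-120.5, -120.0, -120.5, -120.0), ncol = 1)
temp_s     <- matrix(c(0.1, 0.2, 0.3, 0.4,
                       1.1, 1.2, 1.3, 1.4), nrow = 4)
coord      <- cbind(c(41.0, 40.5), c(-120.5, -120.0))

# Build one text key per lat/long pair, then test membership in one step.
grid_key  <- paste(latitudes[, 1], longitudes[, 1], sep = "_")
coord_key <- paste(coord[, 1], coord[, 2], sep = "_")
keep <- grid_key %in% coord_key

# drop = FALSE keeps the result a matrix even if only one row matches.
temp_sub <- temp_s[keep, , drop = FALSE]
print(temp_sub)
```

Because keep is an ordinary logical vector, it can be reused to subset precip_s in the same way, and sum(keep) gives a quick check on how many of the rows in coord were actually found.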

Source: https://habr.com/ru/post/1443105/

