Select rasters from a stack based on a partial layer-name match

I have a stack of rasters (one per species), and a data frame with lat/long columns along with the species name.

    fls <- list.files(pattern = "median")
    s <- stack(fls)
    df <- c("x", "y", "species name")

I want to select one raster at a time so that I can use it with the extract function. The selection should be a partial match on the species name column, because the raster names may not coincide with the names in the species list: the case may differ, the raster layer name may be longer (for example, species_name_median), or there may be a _ instead of a space.

    # pseudocode
    for (i in 1:length(df.species name)) {
      result <- extract(s[[ partial match to "species name[i]" ]], df.xy)
    }

I hope it is clear that I just want to use one raster at a time for the extraction. I can easily select one raster using s[[i]], but there is no guarantee that each species in the list has its own matching raster file.

2 answers

If your query points are in a data.frame with x and y coordinates and the species name that identifies the layer to query, you can do everything with these two commands:

    # Find the layer to match on using 'grepl' and 'which',
    # converting all names to lowercase for consistency
    df$layer <- lapply( df$species , function(x) which( grepl( tolower(x) , tolower(names(s)) ) ) )

    # Extract each value from the appropriate layer in the stack
    df$Value <- sapply( seq_len(nrow(df)) , function(x) extract( s[[ df$layer[x] ]] , df[ x , 1:2 ] ) )

How it works

Starting from the first line:

  • First, we define a new column, df$layer, which holds the index of the RasterLayer in the stack that we need to use for each row.
  • lapply loops over all the elements of the df$species column and applies an anonymous function, using each element of df$species in turn as the input variable x. lapply is a loop construct, although it doesn't look like one.
  • In the first iteration we take the first element of df$species, which is now x, and use it in grepl (grep that returns a logical vector) to find which layer names of our stack s contain our species. We apply tolower() to both the pattern (x) and the strings being searched (names(s)) so that case differences do not matter; otherwise "Tiger" would not match "tiger".
  • grepl returns a logical vector showing which elements contain the pattern; for example, grepl("abc", c("xyz", "wxy", "acb", "zxabcty")) returns FALSE, FALSE, FALSE, TRUE. We use which to obtain the indices of the TRUE elements.
  • The idea is that for each row we get one and only one layer in the stack that matches the species name, so the single TRUE index is the index of the layer we want.
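The one-match assumption in the last bullet can be checked explicitly. Below is a minimal sketch in base R, assuming a hypothetical helper called match_layer (not part of the raster package) and made-up species names; it returns NA when a species has no matching layer or more than one, instead of an empty or multi-element index.

```r
# Hypothetical helper: index of the single layer whose name contains `species`
# (case-insensitive, literal match), or NA if there are zero or several matches.
match_layer <- function(species, layer_names) {
  hits <- which(grepl(tolower(species), tolower(layer_names), fixed = TRUE))
  if (length(hits) == 1L) hits else NA_integer_
}

# Layer names as they would come from names(s); "jaguar" has no layer
layer_names <- c("LIon_medIan", "PANTHeR_MEAN_AVG", "tiger.Mean.JULY_2012")
species     <- c("lion", "Tiger", "jaguar")

layer_idx <- vapply(species, match_layer, integer(1),
                    layer_names = layer_names, USE.NAMES = FALSE)
layer_idx
```

Rows whose index is NA can then be reported or skipped before any values are extracted.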

In the second line, with sapply:

  • sapply is an iterator similar to lapply, but it returns a vector rather than a list of values. TBH you can use either in this use case.
  • This time we iterate over the sequence of numbers from 1 to nrow(df).
  • We use the row number as the input variable x of another anonymous function.
  • We extract the value at the "x" and "y" coordinates (columns 1 and 2, respectively) of the current row (given by x) of the data.frame, using the layer we found in the first command.
  • We assign the result to another column of our data.frame, which then contains the extracted value at this x/y for the corresponding layer.
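If there are many points, an alternative to extracting row by row is to call extract once per layer. The sketch below is only an illustration under assumptions: the two small rasters, their names and the integer layer column (with NA where no layer matched) are all made up, and the loop is a variation on the approach above rather than part of it.

```r
library(raster)

set.seed(1)

# Two toy layers with clearly separated value ranges (illustrative data only)
s <- stack(raster(matrix(1:100, ncol = 10)),
           raster(matrix(101:200, ncol = 10)))
names(s) <- c("lion_median", "tiger_median")

# Query points; `layer` is assumed to hold the matched layer index,
# or NA where no layer matched
df <- data.frame(x = runif(6), y = runif(6),
                 layer = c(1L, 2L, NA, 1L, 2L, 2L))

# One extract() call per layer instead of one per row
df$Value <- NA_real_
for (k in unique(na.omit(df$layer))) {
  rows <- which(df$layer == k)
  df$Value[rows] <- extract(s[[k]], df[rows, c("x", "y")])
}
df
```

Rows with an NA layer keep an NA Value, so unmatched species stay visible in the result.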

I hope this helps!

And a worked example with some data:

    require( raster )

    # Sample rasters - note the scale of values in each layer
    # Tens
    r1 <- raster( matrix( sample(1:10,100,replace=TRUE) , ncol = 10 ) )
    # Hundreds
    r2 <- raster( matrix( sample(1e2:1.1e2,100,replace=TRUE) , ncol = 10 ) )
    # Thousands
    r3 <- raster( matrix( sample(1e3:1.1e3,100,replace=TRUE) , ncol = 10 ) )

    # Stack the rasters
    s <- stack( r1,r2,r3 )

    # Name the layers in the stack
    names(s) <- c("LIon_medIan" , "PANTHeR_MEAN_AVG" , "tiger.Mean.JULY_2012")

    # Data of points to query on
    df <- data.frame( x = runif(10) , y = runif(10) , species = sample( c("lion" , "panther" , "Tiger" ) , 10 , replace = TRUE ) )

    # Run the previous code
    df$layer <- lapply( df$species , function(x) which( grepl( tolower(x) , tolower(names(s)) ) ) )
    df$Value <- sapply( seq_len(nrow(df)) , function(x) extract( s[[ df$layer[x] ]] , df[ x , 1:2 ] ) )

    # And the result (note the scale of Values is consistent with the
    # scale of values in each rasterLayer in the stack)
    df
    #            x         y species layer Value
    # 1  0.4827577 0.7517476    lion     1     1
    # 2  0.8590993 0.9929104    lion     1     3
    # 3  0.8987446 0.4465397   tiger     3  1084
    # 4  0.5935572 0.6591223 panther     2   107
    # 5  0.6382287 0.1579990 panther     2   103
    # 6  0.7957626 0.7931233    lion     1     4
    # 7  0.2836228 0.3689158   tiger     3  1076
    # 8  0.5213569 0.7156062    lion     1     3
    # 9  0.6828245 0.1352709 panther     2   103
    # 10 0.7030304 0.8049597 panther     2   105

Have you tried subsetting your RasterStack?

Something like this:

    # assuming df.species.name holds the partial species names
    for (i in seq_along(df.species.name)) {
      result <- subset(s, grep(df.species.name[i], names(s), ignore.case = TRUE, value = TRUE))
    }

It would be interesting to know how much the raster names and the species names can differ. Knowing that would make it possible to improve the approach by adjusting the regex if necessary. Here you will find many links to grep. Try ?grep too.
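As one possible way of adjusting the regex, the sketch below builds a pattern in which any run of spaces, underscores or dots in the species name matches any such run in the layer name. The helper species_pattern and the layer names are hypothetical, and the sketch assumes the species names contain no other regex metacharacters.

```r
# Hypothetical helper: make separators interchangeable, so that
# "Panthera leo" also matches "Panthera_leo" or "panthera.leo"
species_pattern <- function(species) {
  gsub("[ _.]+", "[ _.]+", trimws(species))
}

# Illustrative layer names, as they might come from names(s)
layer_names <- c("Panthera_leo_median", "panthera.tigris.MEAN", "Lynx_lynx_median")

lion_layer  <- grep(species_pattern("Panthera leo"), layer_names,
                    ignore.case = TRUE, value = TRUE)
tiger_layer <- grep(species_pattern("panthera tigris"), layer_names,
                    ignore.case = TRUE, value = TRUE)
lion_layer
tiger_layer
```

The returned names can be passed to subset(s, ...) as in the loop above.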


Source: https://habr.com/ru/post/944885/