Using sapply in cross validation

I have a question about sapply in R. In my example, I use it for leave-one-out cross-validation:

    ##' Calculates the LOO CV score for given data and regression prediction function
    ##'
    ##' @param reg.data: regression data; data.frame with columns 'x', 'y'
    ##' @param reg.fcn: regr.prediction function; arguments:
    ##'                   reg.x: regression x-values
    ##'                   reg.y: regression y-values
    ##'                   x: x-value(s) of evaluation point(s)
    ##'                 value: prediction at point(s) x
    ##' @return LOOCV score
    loocv <- function(reg.data, reg.fcn)
    {
      ## Help function to calculate leave-one-out regression values
      loo.reg.value <- function(i, reg.data, reg.fcn)
        return(reg.fcn(reg.data$x[-i], reg.data$y[-i], reg.data$x[i]))

      ## Calculate LOO regression values using the help function above
      n <- nrow(reg.data)
      loo.values <- sapply(seq(1, n), loo.reg.value, reg.data, reg.fcn)

      ## Calculate and return MSE
      return(???)
    }

My questions about sapply are as follows:

  • Is it possible to use several arguments and functions, i.e. sapply(X1,FUN1,X2,FUN2,..), where X1 and X2 are the arguments for the functions FUN1 and FUN2 respectively?
  • In the above code, I apply 1:n to the loo.reg.value function. However, this function has three arguments: the integer i, the regression data reg.data and the regression function reg.fcn. If the function passed to sapply has more than one argument and my X covers only one of them, does sapply use X as the "first argument"? So would this be the same as sapply(c(1:n,reg.data,reg.fcn),loo.reg.value, reg.data, reg.fcn)?

Thanks for the help.

+4
2 answers

In response to the first question: yes, you can use several functions, but the second and subsequent functions must be passed to the first function, which passes them on to the next function, and so on. The functions must therefore be written to accept additional arguments and pass them along.

For example:

    foo <- function(x, f1, ...) f1(x, ...)
    bar <- function(y, f2, ...) f2(y, ...)
    foobar <- function(z, f3, ...) f3(z)

    sapply(1:10, foo, f1 = bar, y = 2, f2 = foobar, z = 4, f3 = seq_len)

    > sapply(1:10, foo, f1 = bar, y = 2, f2 = foobar, z = 4, f3 = seq_len)
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    1    1    1    1    1    1    1    1    1     1
    [2,]    2    2    2    2    2    2    2    2    2     2
    [3,]    3    3    3    3    3    3    3    3    3     3
    [4,]    4    4    4    4    4    4    4    4    4     4

This is a silly example, but it shows how to pass additional arguments to foo(), initially through the ... argument of sapply(). It also shows how to make foo() and the subsequent functions accept additional arguments that need to be passed on, simply by using ... both in the function definition and in the call to the next function, e.g. f2(y, ...). Note that I also avoid positional matching problems by naming all the extra arguments supplied to foo().

Regarding question 2, I think the way you explain it overcomplicates things. For example, you have duplicated reg.data and reg.fcn in the object that sapply() iterates over, which is incorrect (it means that you iterate over all three things combined in c(1:n,reg.data,reg.fcn), not over 1:n).
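To see what that combined object looks like, here is a small sketch with made-up data. The data frame values and the use of mean as a stand-in for reg.fcn are illustrative assumptions only, not part of the question.

```r
# Made-up inputs, for illustration only
n <- 3
reg.data <- data.frame(x = 1:3, y = c(2, 4, 6))
reg.fcn <- mean   # stand-in for a prediction function

# c() flattens everything into one list: the n indices,
# the two columns of reg.data, and the function itself
combined <- c(1:n, reg.data, reg.fcn)

is.list(combined)
length(combined)   # n + 2 + 1 elements
length(1:n)        # n elements, which is what sapply() should iterate over
```

sapply() would call the function once per element of that list, including once for each column and once for the function, which is not what the cross-validation loop needs.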

sapply(1:n, fun, arg1, arg2) is equivalent to calling

    fun(1, arg1, arg2)
    fun(2, arg1, arg2)
    ....
    fun(n, arg1, arg2)

and collecting the results, while sapply(1:n, fun, arg1 = bar, arg2 = foobar) is equivalent to

    fun(1, arg1 = bar, arg2 = foobar)
    fun(2, arg1 = bar, arg2 = foobar)
    ....
    fun(n, arg1 = bar, arg2 = foobar)
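A quick way to convince yourself of this equivalence is to compare sapply() against an explicit loop. This is only a sketch: fun here is a hypothetical toy function, and the values 10 and 1 are arbitrary.

```r
# Hypothetical function: one varying argument, two fixed ones
fun <- function(i, arg1, arg2) i * arg1 + arg2

n <- 5

# Extra arguments after FUN are handed to every call unchanged
via.sapply <- sapply(1:n, fun, arg1 = 10, arg2 = 1)

# The same calls written out as a loop
by.hand <- numeric(n)
for (i in 1:n) by.hand[i] <- fun(i, arg1 = 10, arg2 = 1)

identical(via.sapply, by.hand)
```

Only the first argument changes from call to call; arg1 and arg2 are the same every time.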
+2

The function you pass to sapply can take as many arguments as you like (within reason), but all arguments except the first are reused unchanged for each application. Have you tried running this code? It looks like it should work.
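As a rough check, the sapply() call from the question can be run with a stand-in prediction function. Everything named here that is not in the question (lin.pred, the made-up data on the line y = 2x + 1) is an assumption for illustration, and the MSE step is left out because the question does not show it.

```r
# Hypothetical prediction function: fit a straight line, predict at x
lin.pred <- function(reg.x, reg.y, x) {
  fit <- lm(y ~ x, data = data.frame(x = reg.x, y = reg.y))
  unname(predict(fit, newdata = data.frame(x = x)))
}

# Made-up data lying exactly on y = 2x + 1
reg.data <- data.frame(x = 1:6, y = 2 * (1:6) + 1)

# Helper and sapply() call as in the question
loo.reg.value <- function(i, reg.data, reg.fcn)
  reg.fcn(reg.data$x[-i], reg.data$y[-i], reg.data$x[i])

n <- nrow(reg.data)
loo.values <- sapply(seq(1, n), loo.reg.value, reg.data, lin.pred)

# Each left-out point is predicted from the remaining ones
all.equal(loo.values, reg.data$y)
```

Here i takes the values 1 to n in turn, while reg.data and lin.pred are passed along unchanged to every call of loo.reg.value.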

+1

Source: https://habr.com/ru/post/1492985/

