Hadley Wickham suggested that bootstrapping could be done using the dplyr package; his suggestion was improved and then implemented in the broom package. Can k-fold cross-validation be implemented as well?
I think the first step (selecting the training group) is very simple:
    crossvalidate <- function(df, k = 5) {
      n <- nrow(df)
      idx <- sample(rep_len(1:k, n))
      attr(df, "indices") <- lapply(1:k, function(i) which(idx != i))
      attr(df, "drop") <- TRUE
      attr(df, "group_sizes") <- nrow(df) - unclass(table(idx))
      attr(df, "biggest_group_size") <- max(attr(df, "group_sizes"))
      attr(df, "labels") <- data.frame(replicate = 1:k)
      attr(df, "vars") <- list(quote(replicate))
      class(df) <- c("grouped_df", "tbl_df", "tbl", "data.frame")
      df
    }
But for some reason I can't find documentation for attr(, "indices") to find out whether I can somehow use the "other" indices, the ones that were not selected for training, as the test group indices. Do you have any ideas?
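To make clear what I mean by the "other" indices, here is a minimal base-R sketch that does not touch dplyr's internal attributes at all. The helper name `fold_indices` is hypothetical (it is not part of dplyr or broom), and it returns ordinary 1-based row numbers; whether `attr(, "indices")` expects the same convention is exactly what I have not been able to confirm.

```r
# Hypothetical helper, not part of dplyr or broom: for each of the k folds,
# return the training rows and the complementary ("other") test rows.
fold_indices <- function(n, k = 5) {
  # Assign every row a fold label 1..k, in random order
  idx <- sample(rep_len(seq_len(k), n))
  lapply(seq_len(k), function(i) {
    list(
      train = which(idx != i),  # rows used for fitting in fold i
      test  = which(idx == i)   # held-out rows for fold i
    )
  })
}

# Usage on a built-in data set
set.seed(1)
folds <- fold_indices(nrow(mtcars), k = 5)
train_1 <- mtcars[folds[[1]]$train, ]
test_1  <- mtcars[folds[[1]]$test, ]
```

With this layout each row lands in exactly one test set, so the question is really whether the grouped data frame can carry the `test` part alongside the `train` part.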