Why do R packages load random numbers?

Recently, I was reading the documentation for the caret package when I noticed this:

Also note that some packages load random numbers upon loading (either directly or through the namespace), and this may effect [sic] reproducibility.

Why would a package generate random numbers when it is loaded? This seems to run counter to the idea of reproducible research and may interfere with my own attempts at set.seed(). (I have started setting seeds closer to the code that needs random numbers generated, precisely because I worry about the side effects of loading packages.)

1 answer

One example of a package that does this is ggplot2, as mentioned by Hadley Wickham in response to a GitHub issue about the tidyverse.

When the package is attached, a tip is randomly selected for display to the user (and with some probability no tip is displayed at all). If we look at its .onAttach() function as it existed until January 2018, we see that in an interactive session it calls runif() and possibly sample(), each of which advances the state of the random number generator:

    .onAttach <- function(...) {
      if (!interactive() || stats::runif(1) > 0.1) return()

      tips <- c(
        "Need help? Try the ggplot2 mailing list: http://groups.google.com/group/ggplot2.",
        "Find out what changed in ggplot2 at http://github.com/tidyverse/ggplot2/releases.",
        "Use suppressPackageStartupMessages() to eliminate package startup messages.",
        "Stackoverflow is a great place to get help: http://stackoverflow.com/tags/ggplot2.",
        "Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/",
        "Want to understand how all the pieces fit together? Buy the ggplot2 book: http://ggplot2.org/book/"
      )

      tip <- sample(tips, 1)
      packageStartupMessage(paste(strwrap(tip), collapse = "\n"))
    }

    release_questions <- function() {
      c(
        "Have you built the book?"
      )
    }
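To make the effect concrete, here is a minimal base-R sketch of what such a hook does to a seeded session. The function noisy_attach() is a hypothetical stand-in for attaching a package whose startup code draws from the RNG; it is not part of ggplot2 or any other package.

```r
# Hypothetical stand-in for a package attach hook that consumes one random number.
noisy_attach <- function() {
  invisible(stats::runif(1))
}

# Reference run: seed, then draw.
set.seed(42)
baseline <- runif(3)

# Same seed, but the "package" is attached between set.seed() and the draw.
set.seed(42)
noisy_attach()
shifted <- runif(3)

# Same again, but the seed is set after the "package" is attached.
set.seed(42)
noisy_attach()
set.seed(42)
reseeded <- runif(3)

identical(baseline, shifted)   # FALSE: the stream was advanced by one draw
identical(baseline, reseeded)  # TRUE: seeding after attaching avoids the problem
```

This is also why the habit described in the question works: calling set.seed() immediately before the code that needs random numbers, rather than at the top of the script, leaves no room for a package to draw in between.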

However, this behavior of ggplot2 has since been fixed by a commit from Jim Hester, which restores the seed once the startup code has run:

    .onAttach <- function(...) {
      withr::with_preserve_seed({
        if (!interactive() || stats::runif(1) > 0.1) return()

        tips <- c(
          "Need help? Try the ggplot2 mailing list: http://groups.google.com/group/ggplot2.",
          "Find out what changed in ggplot2 at http://github.com/tidyverse/ggplot2/releases.",
          "Use suppressPackageStartupMessages() to eliminate package startup messages.",
          "Stackoverflow is a great place to get help: http://stackoverflow.com/tags/ggplot2.",
          "Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/",
          "Want to understand how all the pieces fit together? Buy the ggplot2 book: http://ggplot2.org/book/"
        )

        tip <- sample(tips, 1)
        packageStartupMessage(paste(strwrap(tip), collapse = "\n"))
      })
    }
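The idea behind preserving the seed can be sketched in base R: save .Random.seed before running the code and put it back afterwards. The helper preserve_seed() below is a hypothetical illustration of that idea, not withr's actual implementation, which may differ in its details.

```r
# Hypothetical helper: run `code`, then restore the RNG state it found on entry.
preserve_seed <- function(code) {
  # .Random.seed lives in the global environment and only exists once the
  # RNG has been seeded or used in the current session.
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_seed) {
    old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
  }
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  })
  force(code)
}

set.seed(1)
expected <- runif(2)

set.seed(1)
picked <- preserve_seed(sample(10, 1))  # this draw is undone from the caller's view
actual <- runif(2)

identical(expected, actual)  # TRUE
```

Because the state is restored in on.exit(), it is put back even if the wrapped code exits early or raises an error.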

So a package may have various reasons for doing this, but package authors have ways to keep this side effect from reaching the user.


Source: https://habr.com/ru/post/1276238/

