I have a problem where I wrap a while loop around code that I believe could be vectorized efficiently, but at each step my stopping condition depends on the value drawn at that step. Consider this example as a representative model of my problem:
Generate standard normal N(0,1) random variables using rnorm() until you draw a value greater than some arbitrary threshold k.
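A minimal sketch of the naive loop version (k = 2 is an arbitrary choice here for illustration):

    k <- 2
    draws <- numeric(0)
    repeat {
      x <- rnorm(1)          # one N(0,1) draw per iteration
      draws <- c(draws, x)
      if (x > k) break       # stop at the first value above k
    }
    draws                    # ends with the first value greater than k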
EDIT: The caveat of my problem, discussed in the comments, is that I do not know a priori a good estimate of how many samples will be needed before my stopping condition is met.
One approach:

1. Using a while loop, sample standard normals in appropriately sized blocks (e.g. rnorm(50) to fetch 50 standard normals at a time, or rnorm(1) if k is close to zero). Check this vector for any observation greater than k.
2. If there is one, stop and return all the values drawn so far, up to and including the first exceedance. Otherwise, concatenate your existing vector with a new one obtained by repeating step 1 (see the sketch below).
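A sketch of this chunked approach, assuming a block size of 50:

    k <- 2
    block_size <- 50
    draws <- numeric(0)
    repeat {
      block <- rnorm(block_size)
      hit <- which(block > k)                      # positions exceeding k, if any
      if (length(hit) > 0) {
        draws <- c(draws, block[seq_len(hit[1])])  # keep up to the first hit
        break
      }
      draws <- c(draws, block)                     # no hit: keep the block and redraw
    }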
Another approach would be to deliberately overestimate the total number of random draws needed for a given k. For k = 2, this might mean drawing 1000 standard normals at once using rnorm(1000) and truncating at the first value above k.
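A sketch of this over-drawing approach; 1000 is a deliberate overestimate, since P(Z > 2) is about 0.023, so roughly 44 draws are needed on average:

    k <- 2
    block <- rnorm(1000)
    first <- which(block > k)[1]       # index of the first exceedance (NA if none)
    if (!is.na(first)) {
      draws <- block[seq_len(first)]   # truncate at the first value above k
    } else {
      draws <- block                   # overestimate failed: would need more draws
    }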
Using the vectorization that R offers, the second approach beats the loop version whenever the number of excess draws is not much larger than necessary. But in my problem I have no good intuition about how many draws I will need, so I have to be conservative and over-draw heavily.
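A rough way to compare the two on a given machine, using base R's system.time(); the crossover point depends on k and on how much the vectorized version over-draws:

    loop_version <- function(k) {
      draws <- numeric(0)
      repeat {
        x <- rnorm(1)
        draws <- c(draws, x)
        if (x > k) break
      }
      draws
    }

    vectorized_version <- function(k, n = 1000) {
      block <- rnorm(n)
      first <- which(block > k)[1]
      if (is.na(first)) block else block[seq_len(first)]  # ignores the rare retry case
    }

    system.time(replicate(1e4, loop_version(2)))
    system.time(replicate(1e4, vectorized_version(2)))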
The question is: is there a way to get a highly vectorized procedure like approach 2, but with the conditional checking of approach 1? Or are small vectorized chunks like rnorm(50) the "fastest" way, given that the fully vectorized method is faster per element but more wasteful?