In R, how to locally shuffle vector elements

Question

In R, how to locally shuffle vector elements

I have the following vector in R. Think of them as a vector of numbers.

x = c(1,2,3,4,...100)

I want to randomize this vector “locally” based on some input number “locality coefficient”. For example, if the locality coefficient is 3, then the first 3 elements are taken and randomized, they are followed by the next 3 elements and so on. Is there an effective way to do this? I know if I use a sample, it will iterate over the entire array. thanks in advance

+6

r sample

broccoli Jul 14 '13 at 15:09

source share

6 answers

Arun did not like how ineffective my other answer was, so there is something very quick for him here;)

It requires only one call to runif() and order() and does not use sample() at all.

 x <- 1:100 k <- 3 n <- length(x) x[order(rep(seq_len(ceiling(n/k)), each=k, length.out=n) + runif(n))] # [1] 3 1 2 6 5 4 8 9 7 11 12 10 13 14 15 18 16 17 # [19] 20 19 21 23 22 24 27 25 26 29 28 30 33 31 32 36 34 35 # [37] 37 38 39 40 41 42 43 44 45 47 48 46 51 49 50 52 54 53 # [55] 55 57 56 58 60 59 62 63 61 66 64 65 68 67 69 71 70 72 # [73] 75 74 73 76 77 78 81 80 79 84 82 83 86 85 87 89 88 90 # [91] 93 92 91 94 96 95 97 98 99 100

+7

Josh o'brien Jul 14 '13 at 18:40

source share

For recording, the boot package (supplied with the R base) includes the permutation.array() function, which is used only for this purpose:

 x <- 1:100 k <- 3 ii <- boot:::permutation.array(n = length(x), R = 2, strata = (seq_along(x) - 1) %/% k)[1,] x[ii] # [1] 2 1 3 6 5 4 9 7 8 12 11 10 15 13 14 16 18 17 # [19] 21 19 20 23 22 24 26 27 25 28 29 30 33 31 32 36 35 34 # [37] 38 39 37 41 40 42 43 44 45 46 47 48 51 50 49 53 52 54 # [55] 57 55 56 59 60 58 63 61 62 65 66 64 67 69 68 72 71 70 # [73] 75 73 74 76 77 78 79 80 81 82 83 84 86 87 85 89 88 90 # [91] 93 91 92 94 95 96 97 98 99 100

+5

Josh o'brien Jul 14 '13 at 16:08

source share

This will remove the items at the end (with a warning):

 locality <- 3 x <- 1:100 c(apply(matrix(x, nrow=locality, ncol=length(x) %/% locality), 2, sample)) ## [1] 1 2 3 4 6 5 8 9 7 12 10 11 13 15 14 16 18 17 19 20 21 22 24 23 26 25 27 28 30 29 32 33 31 35 34 36 38 39 37 ## [40] 42 40 41 43 44 45 47 48 46 51 49 50 54 52 53 55 57 56 58 59 60 62 61 63 64 65 66 67 69 68 71 72 70 74 75 73 78 77 76 ## [79] 80 81 79 83 82 84 87 85 86 88 89 90 92 93 91 96 94 95 99 98 97

+2

Matthew lundberg Jul 14 '13 at 15:19

source share

 v <- 1:16 scramble <- function(vec,n) { res <- tapply(vec,(seq_along(vec)+n-1)%/%n, FUN=function(x) x[sample.int(length(x), size=length(x))]) unname(unlist(res)) } set.seed(42) scramble(v,3) #[1] 3 2 1 6 5 4 9 7 8 12 10 11 15 13 14 16 scramble(v,4) #[1] 2 3 1 4 5 8 6 7 10 12 9 11 14 15 16 13

+2

Rolling Jul 14 '13 at 15:24

source share

I like Matthew's approach better, but here is how I did it:

 x <- 1:100 fact <- 3 y <- ceiling(length(x)/fact) unlist(lapply(split(x, rep(1:y, each =fact)[1:length(x)]), function(x){ if (length(x)==1) return(x) sample(x) }), use.names = FALSE) ## [1] 3 1 2 6 4 5 8 9 7 11 10 12 13 15 14 17 16 18 ## [19] 20 21 19 24 23 22 26 27 25 29 30 28 31 32 33 35 34 36 ## [37] 39 37 38 41 42 40 45 43 44 47 46 48 51 49 50 52 53 54 ## [55] 57 56 55 59 60 58 63 62 61 64 66 65 67 68 69 70 71 72 ## [73] 75 73 74 77 76 78 80 79 81 82 84 83 85 86 87 90 89 88 ## [91] 92 91 93 96 94 95 98 99 97 100

+1

Tyler rinker Jul 14 '13 at 15:20

source share

Arun · Accepted Answer · 2013-07-14T15:21:42+0000

Common decision:

Edit: as @MatthewLundberg's comments, the problem I pointed out with "repeating numbers in x" can be easily overcome by working on seq_along(x) , which would mean that the resulting values will be indexes. So it will be like this:

 k <- 3 x <- c(2,2,1, 1,3,4, 4,6,5, 3) xs <- seq_along(x) y <- sample(xs) x[unlist(split(y, (match(y, xs)-1) %/% k), use.names = FALSE)] # [1] 2 2 1 3 4 1 4 5 6 3

Old answer:

The bottleneck here is the number of calls to the sample function. And while your numbers are not repeating, I think you can do it with one call to sample as follows:

 k <- 3 x <- 1:20 y <- sample(x) unlist(split(y, (match(y,x)-1) %/% k), use.names = FALSE) # [1] 1 3 2 5 6 4 8 9 7 12 10 11 13 14 15 17 16 18 19 20

Put it all together in a function (I like @ Roland's scramble name):

 scramble <- function(x, k=3) { xs <- seq_along(x) ys <- sample(xs) idx <- unlist(split(ys, (match(ys, xs)-1) %/% k), use.names = FALSE) x[idx] } scramble(x, 3) # [1] 2 1 2 3 4 1 5 4 6 3 scramble(x, 3) # [1] 1 2 2 1 4 3 6 5 4 3

To reduce the answer (and get it faster), even more after the comment by @flodel:

 scramble <- function(x, k=3L) { xs <- seq_along(x) ys <- sample(xs) x[unlist(split(xs[ys], (ys-1) %/% k), use.names = FALSE)] }

In R, how to locally shuffle vector elements

Common decision:

Old answer:

More articles: