Create a list of a million random items

Question

Create a list of a million random items

How to efficiently generate a list of millions of random elements in a circuit? The following code reaches maximum recursion depth with 0.1 million.

(unfold (lambda(x)(= x 1000000)) (lambda(x)(random 1000)) (lambda(x)(+ x 1)) 0)

+4

scheme

Fakrudeen Mar 13 '10 at 8:09

source share

6 answers

Eli barzilay · Answer 1 · 2010-03-13T19:59:19+0000

It really depends on the system you are using, but here is a general way to do it in a simple scheme:

 (let loop ([n 1000000] [r '()]) (if (zero? n) r (loop (- n 1) (cons (random 1000) r))))

One note about running this code as it is: if you just type it in REPL, it will print the resulting list, and usually it will involve using a lot more memory than the list. So it’s better to do something like

 (define l ...same...)

There are many other tools that you can use for varying degrees of convenience. unfold is one of them and the other is for , as shown in the PLT Schema :

 (for/list ([i (in-range 1000000)]) (random 1000))

PeterM · Answer 2 · 2010-03-13T08:20:56+0000

I don’t know much scheme, but could you just use tail recursion (which is actually just a loop) instead of expanding (or any other higher order function)?

amit_grepclub · Answer 3 · 2010-03-13T08:25:30+0000

Use do-loop-construct as described here .

Davorak · Answer 4 · 2010-03-13T11:29:40+0000

Someone will correct me if I am wrong, but the Fakrudeen code should be optimized since it is tail recursive. Or it should be with proper implementation to unfold. It should never reach the maximum depth of recursion.

What version of the circuit are you using Fakrudeen? DrScheme does not score just a million random numbers.

Loïc Faure-Lacroix · Answer 5 · 2013-11-07T16:48:05+0000

Taking Chicken-Scheme as an implementation, try some results.

 (use srfi-1) (use extras) (time (unfold (lambda(x)(= x 1000000)) (lambda(x)(random 1000)) (lambda(x)(+ x 1)) 0)) (time (let loop ([n 1000000] [r '()]) (if (zero? n) r (loop (- n 1) (cons (random 1000) r))))) (define (range min max body) (let loop ((current min) (ret '())) (if (= current max) ret (loop (+ current 1) (cons (body current ret) ret))))) (time (range 0 1000000 (lambda params (random 1000))))

Results are shown here with csc -O3 t.scm

 0.331s CPU time, 0.17s GC time (major), 12/660 GCs (major/minor) 0.107s CPU time, 0.02s GC time (major), 1/290 GCs (major/minor) 0.124s CPU time, 0.022s GC time (major), 1/320 GCs (major/minor)

As you can see, the author’s version is much slower than using regular tail recursive calls. It’s hard to say why the call unfolds much more slowly, but I assume that this is due to the fact that it takes much longer to complete the function calls.

The other two versions are very similar. My version is almost the same, except that I am creating a high-order function that can be reused.

Unlike a simple loop, it can be reused to create a range of functions. The position and the current list are sent to the function if necessary.

A higher order version is probably the best way to do it, even if it takes a little longer to complete. This is probably also related to function calls. It can be optimized by deleting parameters, and it will work almost as fast as the named let.

The advantage of a higher order version is that the user does not need to write the loop itself and it can be used with an abstract lambda function.

Edit

A look at this particular case. Ef we have to create a million elements in the range from 0 to 999, we could create a vector of a fixed length of a million and with values from 0 to 999 in it. Shuffle the thing back. Then the whole random process will depend on the shuffle function, which should not create new values for memory exchange, can be accelerated than generate random numbers. However, the shuffle method still relies somewhat on random ones.

Edit 2

If you really need a list, you can leave with a vector instead.

Here is my second implementation with vector-map

 (time (vector-map (lambda (xy) (random 1000)) (make-vector 1000000))) # 0.07s CPU time, 0/262 GCs (major/minor)

As you can see, this is much faster than using a list.

Edit 3 fun

 (define-syntax bigint (er-macro-transformer (lambda (exp rename compare) (let ((lst (map (lambda (x) (random 1000)) (make-list (cadr exp))))) (cons 'list lst))))) 100000 0.004s CPU time, 0/8888 GCs (major/minor)

This is probably not a good idea, but I felt it might be interesting. Since this is a macro, it will be executed at compile time. The compilation time will be huge, but as you can see, the speed improvement is also huge. Unfortunately, using a chicken, I could not get it to make a list of a million. I assume that the type that it can use to create the list overflows and gets access to invalid memory.

To answer the question in the comments:

I am not a professional Scheme. I also have a lot to do with this, and as I understand it, a named loop or high-order function should be a way. A high order function is good because it is reusable. You can define

 (define (make-random-list quantity maxran) ...)

Then another interesting part, since the circuit is all about high-order functions. Then you can replace the make-random-list implementation with anything. If you need to do some compilation time, define a macro, otherwise use a function. All that really matters is reusability. It should be fast and not use memory.

Common sense tells us that executing smaller execution will be faster, tail recursive calls do not imply that they will consume memory. And when you're not sure, you can hide the implementation in a function that can be optimized later.

Gozoner · Answer 6 · 2013-11-10T14:48:25+0000

The MIT scheme limits the computation stack. Given the size of your problem, you probably ran out of stack size. Fortunately, you can provide a command-line option for resizing the stack. Try:

 $ mit-scheme --stack <number-of-1024-word-blocks>

There are other command line options, check mit-scheme --help

Note that the MIT scheme, in my experience, is one of the few schemes with a limited stack size. This explains why your code often runs in other circuits.

Regarding your question about efficiency. The unfold procedure is probably not implemented using the tail-recursive / iterative algorithm. Here is the tail recursive version with the tail recursive version of "list reverse in-place":

 (define (unfold stop value incr n0) (let collecting ((n n0) (l '())) (if (stop n) (reverse! l) (collecting (incr n) (cons (value n) l))))) (define (reverse! list) (let reving ((list list) (rslt '())) (if (null? list) rslt (let ((rest (cdr list))) (set-cdr! list rslt) (reving rest list)))))

Note:

 $ mit-scheme --version MIT/GNU Scheme microcode 15.3 Copyright (C) 2011 Massachusetts Institute of Technology This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Image saved on Tuesday November 8, 2011 at 10:45:46 PM Release 9.1.1 || Microcode 15.3 || Runtime 15.7 || SF 4.41 || LIAR/x86-64 4.118 || Edwin 3.116 Moriturus te saluto.

Create a list of a million random items

Edit

Edit 2

Edit 3 fun

More articles: