Look for a clear and concise web page explaining why low-order bits of random numbers are usually not random

Question

Look for a clear and concise web page explaining why low-order bits of random numbers are usually not random

I am compiling an internal "every developer should know" page on the wiki page.

I have seen a lot of discussion regarding rand() % N , but not one web page that explains all this.

For example, I'm curious if this problem is only for C and Linux, or if it also applies to Windows, C ++. Java, .Net, Python, Perl.

Please help me figure this out. Also, how do non-random numbers get? Thanks!

+4

random bits

Hamish grubijan Apr 01 '10 at 14:15

source share

2 answers

Check http://en.wikipedia.org/wiki/Linear_congruential_generator , which is most likely the algorithm used for most built-in random number generators.

Scrolling down, find the paragraph starting with "Another problem with LCG is that the low bits of the generated sequence have a much shorter period." For some understanding, rand() % N

+2

brainjam Apr 4 '10 at 20:02

source share

Ukko · Accepted Answer · 2010-04-23T16:08:54+0000

I do not have a web page to link to you, but I may have a back-from-envelope explanation that will help. How simple random number generators work by following the steps

Use the last number generated by n or the sample number.
Multiply this number by a special large number
Add another special large amount
Divide it into the third special large quantity and discard the rest
Return result

Now, if you are thinking about what happens in all but Step 4, you perform operations where only the least significant bits can change the least significant bits of the result. Adding 1001 and 100 ... 00001 will end ... 02 (Ha, although I was talking to base 2, this is actually the number 12 for a giggle.) No matter what is at the top end of the calculation. Similarly, when you multiply it, it will end with 1, no matter what.

A similar problem exists in the upper end, a billion times a billion will invariably dominate the contribution of hundreds of places of a decaying number. This indicates that the middle is the place where good material comes from. Many bits interact here - high, medium, and low.

This is the goal of the division step, it cuts off the bottom piece of the result when there is not much interaction. The upper chunk usually does not break because the computer discards the upper bits when the multiplications no longer fit into the machine word.

In the end, although the cut-off points are somewhat arbitrary, and you can be more picky than the people who developed the algorithm and still chop off a few more bits.

The question for you is how bad they are, they can be very bad. The easiest way to see this is to combine the individual numbers into tuples and draw them. Therefore, if you have random numbers a, b, c, d, ... graph (a,b), (c,d), ... and see the results. This is called a spectral test, and Rand does a great job of it. I have a link for try http://random.mat.sbg.ac.at/results/karl/spectraltest/

Look for a clear and concise web page explaining why low-order bits of random numbers are usually not random

More articles: