How do I efficiently convert multiple bytes to an integer within a range?

I am writing something that reads bytes (just a List<int>) from a random number source that is very slow. Because of this, and because of my own requirements, I want to pull as few bytes from the source as possible.

Now I'm trying to implement a method whose signature looks like this:

 int getRandomInteger(int min, int max) 

I have two theories on how I can get bytes from my random source and convert them to an integer.

Approach #1 is the naive one: extract (max - min) / 256 bytes and add them up. It works, but it pulls a lot of bytes from my slow random source. For example, for a random integer between zero and one million it would extract almost 4,000 bytes, which is unacceptable.

Approach #2 sounds ideal to me, but I can't come up with the algorithm. It goes like this:

Take min: 0, max: 1000 as an example.

  • Compute ceil(rangeSize / 256), which in this case is ceil(1000 / 256) = 4. Now read one (1) byte from the source.
  • Scale this byte from the range 0-255 to the range 0-3 (or 1-4) and let it determine which group we use. For instance, if the byte is 250, we select the fourth group (which represents the last 250 numbers of our range, 750-1000).
  • Now read a second byte, scale it from 0-255 to 0-250, and let it determine the position within the chosen group. So if this second byte is, for example, 120, our final integer is 750 + 120 = 870.

In this scenario we only needed 2 bytes. However, it gets much more complicated if the range is 0-1000000, because then we need several levels of "groups".

How do I implement something like this? I am fine with Java / C# / JavaScript code or pseudocode.

I would also like to get the result without losing entropy / randomness, so I'm a little concerned about scaling integers.

+4
4 answers

Unfortunately, your approach #1 is broken. For example, if min is 0 and max is 510, you would add 2 bytes. There is only one way to get the result 0: both bytes are zero. The probability of this is (1/256)^2. However, there are many ways to get other values, say 100 = 100 + 0, 99 + 1, 98 + 2, ... Thus the probability of getting 100 is much larger: 101 * (1/256)^2.
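
To see the bias concretely, here is a small Java sketch that counts how many ordered pairs of bytes add up to a given total. The class and method names are made up for illustration; nothing here comes from the question's code.

```java
final class ByteSumBias {
    /** Counts the ordered pairs (a, b), each byte in 0..255, with a + b == target. */
    static int waysToSum(int target) {
        int ways = 0;
        for (int a = 0; a <= 255; a++) {
            int b = target - a;
            if (b >= 0 && b <= 255) {
                ways++;
            }
        }
        return ways;
    }
}
```

`ByteSumBias.waysToSum(0)` gives 1 while `ByteSumBias.waysToSum(100)` gives 101, so the sum of two bytes is far from uniform.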

A more or less standard way to do what you want is:

 Let R = max - min + 1   -- the number of possible random output values
 Let N = 2^k >= mR, m>=1 -- a power of 2 at least as big as some multiple of R that you choose.
 loop
   b = a random integer in 0..N-1 formed from k random bits
 while b >= mR           -- reject b values that would bias the output
 return min + floor(b/m)

This is called the rejection method. It throws away randomly selected binary numbers that would bias the output. If max - min + 1 is a power of 2, then you will have zero rejections.
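
A minimal Java sketch of the rejection method for the simplest case, m = 1, might look like the following. `ByteSource` is a hypothetical stand-in for the slow random source, and the method name just mirrors the signature from the question; treat it as an illustration under those assumptions rather than a finished implementation.

```java
final class RejectionSampler {
    /** Hypothetical stand-in for the slow source: returns one random byte, 0..255. */
    interface ByteSource {
        int nextByte();
    }

    /**
     * Returns a uniform integer in [min, max] using m = 1:
     * draw k bits, where 2^k >= R, and retry whenever the draw is >= R.
     */
    static int getRandomInteger(int min, int max, ByteSource source) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        long range = (long) max - (long) min + 1;              // R, between 1 and 2^32
        int bits = 64 - Long.numberOfLeadingZeros(range - 1);   // k = ceil(log2(R))
        int bytes = (bits + 7) / 8;                             // whole bytes per attempt
        long mask = (1L << bits) - 1;
        while (true) {
            long b = 0;
            for (int i = 0; i < bytes; i++) {
                b = (b << 8) | (source.nextByte() & 0xFF);
            }
            b &= mask;                                          // keep only k bits
            if (b < range) {
                return (int) (min + b);                         // accepted: no bias
            }
            // otherwise reject this draw and try again
        }
    }
}
```

For the range 0..1000 this reads 2 bytes per attempt and keeps 10 bits, so a draw is rejected only when those 10 bits are 1001 or more.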

If you have m = 1 and max - min + 1 is just one more than a power of 2, then nearly half of the draws will be rejected. In this case you definitely want a larger m.

In general, larger m values lead to fewer rejections, but of course they require more bits per number. There is a probabilistically optimal algorithm for choosing m.

Some of the other solutions presented here have problems, but unfortunately I don't have time to comment right now. Maybe in a couple of days if there is interest.
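
The trade-off can be put into numbers with a short Java sketch. The helper names are hypothetical, and it simply takes the largest m that fits in k bits, which is not the optimal choice mentioned above.

```java
final class RejectionRate {
    /** Largest m such that m * R still fits in k bits, i.e. floor(2^k / R). Assumes k <= 62. */
    static long largestMultiplier(long range, int bits) {
        return (1L << bits) / range;
    }

    /** Probability that a k-bit draw is rejected, given R and m with m * R <= 2^k. */
    static double rejectionProbability(long range, long m, int bits) {
        long n = 1L << bits;                 // N = 2^k
        return (double) (n - m * range) / n; // share of draws with b >= m * R
    }
}
```

With R = 129 (one more than a power of 2), k = 8 and m = 1, the rejection probability is 127/256, which is almost one half. With k = 16 the largest multiplier is m = 508, and only 4 of the 65536 possible draws are rejected.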

+2

3 bytes (together) give you a random integer in the range 0..16777215. You can use 20 bits of this value to get a number in the range 0..1048575 and throw away values > 1,000,000.

+1
 range 1 to r
 256^a >= r                  // first find 'a'
 get 'a' bytes into array A[]
 num = 0
 for i = 0 to len(A)-1
   num += A[i] * 2^(8*i)     // byte i is shifted left by 8*i bits
 next
 random number = 1 + (num mod r)
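
One caveat with the final mod step: unless r divides 256^a exactly, some results come up more often than others. The Java sketch below (class and method names are made up for illustration) counts how often each remainder appears when a single byte is reduced modulo a range.

```java
final class ModuloBias {
    /** For every byte value 0..255, counts how often (byte % range) lands on each remainder. */
    static int[] residueCounts(int range) {
        int[] counts = new int[range];
        for (int b = 0; b <= 255; b++) {
            counts[b % range]++;
        }
        return counts;
    }
}
```

For a range of 3 the remainder 0 is hit 86 times and the remainders 1 and 2 are hit 85 times each, so the output is slightly skewed. Rejecting the leftover draws instead of wrapping them removes that skew.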
+1

Your random source gives you 8 random bits per call. For an integer in the range [min, max] you need ceil(log2(max - min + 1)) bits.
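
As a side note, the bit count can be computed without floating point. This is a small Java sketch with a hypothetical helper name; it relies on `Long.numberOfLeadingZeros` from the standard library.

```java
final class BitsNeeded {
    /** ceil(log2(max - min + 1)), computed with integer arithmetic only. Assumes min <= max. */
    static int bitsFor(int min, int max) {
        long range = (long) max - (long) min + 1;          // number of distinct values
        return 64 - Long.numberOfLeadingZeros(range - 1);  // bit length of (range - 1)
    }
}
```

For the range 0..1000000 this gives 20 bits, so 3 bytes per attempt are enough.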

Suppose you can get random bytes from the source using some function:

 bool RandomBuf(BYTE* pBuf , size_t nLen); // fill buffer with nLen random bytes 

Now you can use the following function to generate a random value in a given range:

 #include <cmath>   // ceil, log
 #include <limits>  // std::numeric_limits
 #include <utility> // std::swap
 // BYTE, UINT and ULONGLONG are the Windows typedefs from <windows.h>

 // --------------------------------------------------------------------------
 // produce a uniformly-distributed integral value in range [nMin, nMax]
 // T is char/BYTE/short/WORD/int/UINT/LONGLONG/ULONGLONG
 template <class T>
 T RandU(T nMin, T nMax)
 {
     static_assert(std::numeric_limits<T>::is_integer, "RandU: integral type expected");

     if (nMin > nMax)
         std::swap(nMin, nMax);

     if (nMin == nMax) // single value; also avoids a 64-bit shift below
         return nMin;

     if (0 == (T)(nMax - nMin + 1)) // all range of type T
     {
         T nR;
         return RandomBuf((BYTE*)&nR, sizeof(T)) ? *(T*)&nR : nMin;
     }

     ULONGLONG nRange = (ULONGLONG)nMax - (ULONGLONG)nMin + 1;    // number of discrete values
     UINT nRangeBits = (UINT)ceil(log((double)nRange) / log(2.)); // bits for storing nRange discrete values

     ULONGLONG nR;
     do
     {
         if (!RandomBuf((BYTE*)&nR, sizeof(nR)))
             return nMin;

         nR = nR >> ((sizeof(nR) << 3) - nRangeBits); // keep nRangeBits random bits
     }
     while (nR >= nRange); // ensure value in range [0..nRange-1]

     return nMin + (T)nR; // [nMin..nMax]
 }

Since you always get a multiple of 8 bits, you can save the unused bits between calls (for example, you might need only 9 bits out of 16). This requires some bit manipulation, and it is up to you whether that is worth the effort.
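
A rough Java sketch of such a bit buffer is shown below. `BitBuffer` and `ByteSource` are hypothetical names standing in for your own source; the idea is only that leftover bits stay in a field until the next request.

```java
final class BitBuffer {
    /** Hypothetical stand-in for the slow source: returns one random byte, 0..255. */
    interface ByteSource {
        int nextByte();
    }

    private final ByteSource source;
    private long buffer = 0;   // unused random bits, stored in the low end
    private int available = 0; // number of unused bits currently in buffer

    BitBuffer(ByteSource source) {
        this.source = source;
    }

    /** Returns count random bits (0..32), fetching a new byte only when the buffer runs short. */
    long nextBits(int count) {
        if (count < 0 || count > 32) {
            throw new IllegalArgumentException("count must be in 0..32");
        }
        while (available < count) {
            buffer = (buffer << 8) | (source.nextByte() & 0xFF);
            available += 8;
        }
        available -= count;
        long result = (buffer >>> available) & ((1L << count) - 1);
        buffer &= (1L << available) - 1; // forget the bits that were just handed out
        return result;
    }
}
```

Two consecutive 9-bit requests then cost 3 bytes in total instead of 4, because the 7 bits left over from the first request are reused by the second.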

You can save even more if you use the "half bits": suppose you want to generate numbers in the range [1..5]. For each random value you need log2(5) = 2.32 bits. Using 32 random bits, you can actually generate floor(32 / 2.32) = 13 random values in this range, although it takes extra effort.
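
One possible way to do this, sketched in Java under my own assumptions (the class, the method name and the choice to signal a redraw with null are all hypothetical): treat the 32 bits as one number, reject the small biased tail, and read off 13 base-5 digits.

```java
final class PackedDigits {
    private static final long POW = 1220703125L; // 5^13, the largest power of 5 below 2^32
    private static final long LIMIT = 3 * POW;   // largest multiple of 5^13 below 2^32

    /**
     * Splits one 32-bit random word (0..2^32-1) into 13 values in [1..5].
     * Returns null when the word falls into the biased tail and must be redrawn.
     */
    static int[] thirteenValues(long word32) {
        if (word32 < 0 || word32 > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("expected an unsigned 32-bit value");
        }
        if (word32 >= LIMIT) {
            return null;                 // reject: keeping it would bias the digits
        }
        long v = word32 % POW;           // uniform in 0..5^13-1
        int[] out = new int[13];
        for (int i = 0; i < 13; i++) {
            out[i] = (int) (v % 5) + 1;  // next base-5 digit, shifted to 1..5
            v /= 5;
        }
        return out;
    }
}
```

Roughly 85% of all 32-bit words are accepted, and each accepted word yields 13 values at once.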

+1

Source: https://habr.com/ru/post/1445224/

