The usefulness of `rand()` - or who should call `srand()`?

Background info: I use rand(), std::rand(), std::random_shuffle() and other functions in my code for scientific computing. To be able to reproduce my results, I always explicitly specify the random seed and set it via srand(). This worked fine until recently, when I realized that libxml2 also calls srand() lazily on its first use, which happened after my own early call to srand().

I filed a bug report against libxml2 about its srand() call, but I got this answer:

Initialize libxml2 first, then. This is a perfectly legal call for a library to make. You should not expect that nobody else calls srand(), and the man page nowhere states that calling srand() multiple times should be avoided.

So this is actually my question. If the general policy is that every library can/should/will call srand(), and I can also call it here and there, I don't really see how that can be useful at all. Or how is rand() useful then?

That is why I thought the general (unwritten) policy was that no library should ever call srand(), and that the application should call it only once, at the beginning. (Not taking multithreading into account. I guess in that case you should use something else anyway.)

I also tried to research which other libraries actually call srand(), but I did not find any. Are there any?

My current workaround is this ugly code:

    {
        // On the first call to xmlDictCreate,
        // libxml2 will initialize some internal randomize system,
        // which calls srand(time(NULL)).
        // So, do that first call here now, so that we can use our
        // own random seed.
        xmlDictPtr p = xmlDictCreate();
        xmlDictFree(p);
    }

    srand(my_own_seed);

Probably the only clean solution would be to not use rand() at all and to use only my own random generator (perhaps via C++11 <random>). But that is not really the question. The question is: who should call srand(), and if everyone does, how is rand() useful then?

+46
c++ c random srand
Oct 10 '14 at 7:33
4 answers

Use the new <random> header instead. It lets you have multiple engine instances, using different algorithms and, more importantly for you, independent seeds.

[edit] To answer the "useful" part: rand generates random numbers. That is what it is good for. If you need fine-grained control, including reproducibility, you need not only a known seed but also a known algorithm. srand at best gives you a fixed seed, so it is not a complete solution anyway.
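
As a rough illustration of the "independent seeds" point, here is a minimal sketch in which the caller owns a `std::mt19937` engine, so calls to `srand()`/`rand()` made elsewhere cannot change its sequence. The names `draw` and `hypothetical_library_call` are made up for this example; the latter only imitates a library that reseeds the global generator and is not libxml2's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <ctime>
#include <random>
#include <vector>

// Hypothetical stand-in for a library that reseeds and consumes the
// global rand() state behind the caller's back.
inline void hypothetical_library_call() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    (void)std::rand();
}

// Draws n values from an engine that is private to this function.
// If disturb is true, the global rand() state is modified before each draw.
std::vector<std::uint32_t> draw(std::uint32_t seed, std::size_t n, bool disturb) {
    std::mt19937 engine(seed);  // named algorithm, explicit seed
    std::vector<std::uint32_t> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (disturb) hypothetical_library_call();
        out.push_back(static_cast<std::uint32_t>(engine()));
    }
    return out;
}
```

With this sketch, `draw(42, 5, false)` and `draw(42, 5, true)` should return the same five values, because the engine's state lives in the local `engine` object rather than in the global `rand()` state.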

+33
Oct 10 '14 at 7:36

Well, the obvious thing has been stated several times by others: use the new C++11 generators. I am repeating it for a different reason, though. You use the output for scientific computations, and rand usually implements a rather poor generator. (By now many mainstream implementations use MT19937, which, apart from poor state recovery, is not so bad, but you have no guarantee of a particular algorithm, and at least one mainstream compiler still uses a very poor LCG.)

Do not do scientific computations with a poor generator. It does not really matter if you have things like hyperplanes in your random numbers when you write some silly game that shoots little birds on a mobile phone, but it matters a great deal for scientific simulations. Never use a bad generator. Just don't.

Important note: std::random_shuffle (the version with two parameters) may actually call rand, which is a pitfall you should be aware of if you use it, even if you otherwise use the new C++11 generators found in <random>.
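
One way to sidestep that pitfall is `std::shuffle`, which takes the generator as an explicit argument (the two-parameter `std::random_shuffle` was deprecated in C++14 and removed in C++17). The following is only a sketch; the helper name `shuffled_indices` is invented for this example. Note that the resulting permutation is reproducible for a given standard library, but may differ between standard library implementations.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Returns the indices 0..n-1 in an order determined only by `seed`
// (for a given standard library implementation).
std::vector<int> shuffled_indices(std::size_t n, std::uint32_t seed) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);          // 0, 1, ..., n-1
    std::mt19937 engine(seed);                 // private engine, never touches rand()
    std::shuffle(v.begin(), v.end(), engine);  // generator passed explicitly
    return v;
}
```

Calling `shuffled_indices(10, 7)` twice should give the same permutation, regardless of any `srand()` calls made in between.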

As for the actual issue: calling srand twice (or even more often) is not a problem. You can in principle call it as often as you want; all it does is change the seed, and therefore the pseudorandom sequence that follows. I wonder why an XML library would want to call it at all, but they are right in their answer: it is not illegitimate for them to do so. But it also does not matter.
The only thing you need to make sure of is that either you do not care about getting any particular pseudorandom sequence (that is, any sequence will do and you are not interested in reproducing an exact one), or you are the last one to call srand, which overrides any prior calls.

That said, implementing your own generator with good statistical properties and a sufficiently long period in 3-5 lines of code is not that hard either, with a little care. The main advantage (besides speed) is that you control exactly where your state lives and who modifies it.
It is unlikely that you will ever need periods much longer than 2^128, because of the sheer prohibitive time it takes to actually consume that many numbers. A 3 GHz computer that consumes one number every cycle would run for about 10^21 years on a period of 2^128 (roughly 3.4 × 10^38 numbers at 3 × 10^9 per second is about 10^29 seconds), so this is not much of an issue for humans with average lifespans. Even assuming that the supercomputer you run your simulation on is a trillion times faster, your great-great-grandchildren will not live to see the end of the period.
In that light, periods like the 2^19937 that current "state of the art" generators deliver are really ridiculous; that is trying to improve the generator at the wrong end, if you ask me (it is better to make sure that generators are statistically sound and recover quickly from a worst-case state, etc.). But, of course, opinions may differ here.

This site has a couple of fast generators with implementations. They are xorshift generators combined with an add or multiply step and a small lag (from 2 to 64 machine words), which results in generators that are both fast and high quality (there is also a test suite, and the site's author has written a couple of papers on the subject). I am using a modification of one of them myself (the 2-word 128-bit version ported to 64 bits, with the shift triples modified accordingly).
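
To give an idea of how small such a generator can be, here is a sketch of one publicly known member of that family, xorshift128+ (two 64-bit words of state, a xorshift step combined with an addition), seeded through splitmix64. This is not the answerer's modified variant, which is not shown in the post; the struct and function names are chosen for this example only.

```cpp
#include <cstdint>

// splitmix64: commonly used to expand a single 64-bit seed into generator state.
// Advances `x` and returns the next output.
inline std::uint64_t splitmix64(std::uint64_t& x) {
    std::uint64_t z = (x += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

// xorshift128+ with the shift triple (23, 18, 5); period 2^128 - 1.
struct Xorshift128Plus {
    std::uint64_t s[2];  // the entire state: you control where it lives

    explicit Xorshift128Plus(std::uint64_t seed) {
        // Two consecutive splitmix64 outputs are distinct, so the state
        // cannot be all zero (the one state xorshift must avoid).
        s[0] = splitmix64(seed);
        s[1] = splitmix64(seed);
    }

    std::uint64_t next() {
        std::uint64_t s1 = s[0];
        const std::uint64_t s0 = s[1];
        const std::uint64_t result = s0 + s1;  // the "add" step
        s[0] = s0;
        s1 ^= s1 << 23;                        // xorshift steps
        s[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5);
        return result;
    }
};
```

Two `Xorshift128Plus` objects constructed with the same seed produce the same sequence, and nothing outside the object can alter it. Whether this particular generator is good enough for a given simulation is something to check against a statistical test suite rather than assume.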

+25
Oct 10 '14 at 9:29

This problem is addressed by C++11 random number generation, i.e. you can create an instance of a class:

    std::default_random_engine e1;

which gives you full control over the random numbers generated from the e1 object (as opposed to whatever is used inside libxml). The general rule of thumb would therefore be to use the new construct, since you can generate your random numbers independently.
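
One caveat worth sketching: which algorithm `std::default_random_engine` uses is implementation-defined, so for results that are reproducible across compilers a named engine such as `std::mt19937` is the safer choice. The C++ standard pins down its output: the 10000th invocation of a default-constructed `std::mt19937` must produce 4123659995. The helper name below is invented for this example.

```cpp
#include <cstdint>
#include <random>

// Returns the 10000th value of a default-constructed std::mt19937.
// The C++ standard requires this value to be 4123659995.
std::uint32_t ten_thousandth_mt19937_value() {
    std::mt19937 engine;    // default-constructed, uses the default seed 5489
    engine.discard(9999);   // skip the first 9999 values
    return static_cast<std::uint32_t>(engine());
}
```

The distributions in `<random>` (for example `std::uniform_int_distribution`) are not specified to this level of detail, so values drawn through them may still differ between standard library implementations.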

Very good documentation

To address your concerns: I also think it is bad practice to call srand() in a library like libxml. More to the point, though, srand() and rand() are not designed for the context you are trying to use them in. They are good enough when you just need some random numbers, as libxml does. When you need reproducibility and want to be sure you are independent of others, the new <random> header is the way to go. So, to summarize, I don't think this is good practice on the library's side, but it is hard to blame them for it. Besides, I cannot imagine them changing it, since a billion other pieces of software probably depend on it.

+8
Oct 10 '14 at 7:39

The real answer is that if you want to be sure that your sequence of random numbers is not altered by any other code, you need a random number generator that is private to your work. Note that calling srand is only a small part of this. For example, if you call some function in another library that calls rand, it will also disrupt your sequence of random numbers.

In other words, if you want predictable behavior from code that depends on random number generation, the generator must be completely separate from any other code that uses random numbers.
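
The effect can be sketched as follows: even with a fixed seed and no second `srand()` call, a single `rand()` call made by other code shifts everything that follows. The function `hypothetical_library_call` is a made-up stand-in for such code, and `draw_three` is likewise a name chosen for this example.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a library function that happens to call rand().
inline void hypothetical_library_call() {
    (void)std::rand();  // silently consumes one value of the shared sequence
}

// Seeds the global generator and draws three values, optionally letting
// the "library" consume one value after the first draw.
std::vector<int> draw_three(unsigned seed, bool with_library_call) {
    std::srand(seed);
    std::vector<int> out;
    out.push_back(std::rand());
    if (with_library_call) hypothetical_library_call();
    out.push_back(std::rand());
    out.push_back(std::rand());
    return out;
}
```

If `clean = draw_three(42, false)` and `disturbed = draw_three(42, true)`, then `disturbed[1]` equals `clean[2]`: the second value your code sees is really the third value of the sequence, because the library call consumed one in between.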

Others have suggested using C++11 random number generation, which is one solution.

On Linux and other compatible libraries, you can also use rand_r, which takes a pointer to an unsigned int holding the seed used for that sequence. So if you initialize that seed variable and then use it in all your calls to rand_r, it will produce a sequence that is unique to YOUR code. This is, of course, still the same old rand generator, just with a separate seed. The main reason I mention it is that you can fairly easily do something like this:

    int myrand()
    {
        static unsigned int myseed = ... some initialization of your choice ...;
        return rand_r(&myseed);
    }

and simply call myrand instead of std::rand (it should also be feasible to make it work with the version of std::random_shuffle that takes a random generator parameter).
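
As a sketch of the same idea applied to shuffling, the `rand_r` state can be wrapped in a small generator type and passed to `std::shuffle` (used here instead of `std::random_shuffle`, which was removed in C++17). This assumes a POSIX system that provides `rand_r`; the names `RandRGenerator` and `shuffle_with_rand_r` are invented for this example.

```cpp
#include <stdlib.h>  // rand_r (POSIX), RAND_MAX
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Minimal generator type usable with std::shuffle.
// Its state is a member, so it is independent of srand()/rand().
struct RandRGenerator {
    using result_type = unsigned int;
    unsigned int state;

    explicit RandRGenerator(unsigned int seed) : state(seed) {}

    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return RAND_MAX; }

    // rand_r returns a value in [0, RAND_MAX] and updates `state`.
    result_type operator()() { return static_cast<result_type>(rand_r(&state)); }
};

// Returns the indices 0..n-1 shuffled using a private rand_r sequence.
std::vector<int> shuffle_with_rand_r(std::size_t n, unsigned int seed) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    RandRGenerator gen(seed);
    std::shuffle(v.begin(), v.end(), gen);
    return v;
}
```

As the answer says, this is still the same old rand-quality generator; the sketch only isolates its state, it does not improve its statistical properties.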

+6
Oct 10 '14 at 8:01
