Is there a problem with this UUID generation code?

I have code that uses UUIDs as database identifiers. For simplicity I went with v4 (random), and I see no real reason to use any of the other, less random UUID versions. My UUID class is defined roughly as follows (simplified):

    class uuid {
    public:
        static uuid create_v4();

    public:
        // cut out for simplification...

    public:
        uint8_t bytes[16];
    };

where the actual generation code is as follows:

    namespace {
        uint32_t rand32()
        {
            // we need to do this, because there is no
            // guarantee that RAND_MAX is >= 0xffffffff
            // in fact, it is LIKELY to be 0x7fffffff
            const uint32_t r1 = rand() & 0x0ff;
            const uint32_t r2 = rand() & 0xfff;
            const uint32_t r3 = rand() & 0xfff;
            return (r3 << 20) | (r2 << 8) | r1;
        }
    }

    uuid uuid::create_v4()
    {
        static const uint16_t c[] = {
            0x8000,
            0x9000,
            0xa000,
            0xb000,
        };

        uuid uuid;

        const uint32_t rand_1 = (rand32() & 0xffffffff);
        const uint32_t rand_2 = (rand32() & 0xffff0fff) | 0x4000;
        const uint32_t rand_3 = (rand32() & 0xffff0fff) | c[rand() & 0x03];
        const uint32_t rand_4 = (rand32() & 0xffffffff);

        uuid.bytes[0x00] = (rand_1 >> 24) & 0xff;
        uuid.bytes[0x01] = (rand_1 >> 16) & 0xff;
        uuid.bytes[0x02] = (rand_1 >> 8 ) & 0xff;
        uuid.bytes[0x03] = (rand_1      ) & 0xff;

        uuid.bytes[0x04] = (rand_2 >> 24) & 0xff;
        uuid.bytes[0x05] = (rand_2 >> 16) & 0xff;
        uuid.bytes[0x06] = (rand_2 >> 8 ) & 0xff;
        uuid.bytes[0x07] = (rand_2      ) & 0xff;

        uuid.bytes[0x08] = (rand_3 >> 24) & 0xff;
        uuid.bytes[0x09] = (rand_3 >> 16) & 0xff;
        uuid.bytes[0x0a] = (rand_3 >> 8 ) & 0xff;
        uuid.bytes[0x0b] = (rand_3      ) & 0xff;

        uuid.bytes[0x0c] = (rand_4 >> 24) & 0xff;
        uuid.bytes[0x0d] = (rand_4 >> 16) & 0xff;
        uuid.bytes[0x0e] = (rand_4 >> 8 ) & 0xff;
        uuid.bytes[0x0f] = (rand_4      ) & 0xff;

        return uuid;
    }

This looks correct to me, but recently I got an error from the DB saying that the UUID I was trying to insert was a duplicate. Since that is supposed to be extremely unlikely, I have to assume there is a problem with my code. Does anyone see something wrong? Is my random UUID generation not random enough?

NOTE: I cannot use Boost's random number generation or its UUID library. I would like to, but I am tied to a specific system with specific versions of the libraries installed, and getting a new enough version of Boost to have those features is pretty much impossible.

1 answer

The code looks reasonable to me. As mentioned in the comments, it is questionable whether rand() is a good choice for this task, but the way you use it seems like a reasonable way to build 32 bits of data, assuming you are on a newer version of the library, one that ensures the low-order bits are as random as the high-order bits (which you also mentioned in the comments).
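
If the library version is in doubt, one way to depend less on the low-order bits is to take the upper bits of each rand() result instead. The following is only a sketch of that idea: rand32_high_bits is a hypothetical name, not part of the question's code, and it relies solely on the C guarantee that RAND_MAX is at least 32767.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdint.h>

// Hypothetical variant of the question's rand32(). It assembles 32 bits
// from three rand() calls, but skips the lowest bits of each result,
// which were the weak ones in some older rand() implementations.
uint32_t rand32_high_bits()
{
    // The C standard only guarantees 15 usable bits per call.
    assert(RAND_MAX >= 32767);

    const uint32_t r1 = (static_cast<uint32_t>(rand()) >> 7) & 0x0ff; // bits 7..14
    const uint32_t r2 = (static_cast<uint32_t>(rand()) >> 3) & 0xfff; // bits 3..14
    const uint32_t r3 = (static_cast<uint32_t>(rand()) >> 3) & 0xfff; // bits 3..14

    return (r3 << 20) | (r2 << 8) | r1;
}
```

This does not make rand() any stronger as a generator; it only avoids the bits that were historically the worst.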

So, as long as rand() does even a moderately good job, it seems unlikely that you would get a duplicate. I therefore assume that some other failure occurred. A few possibilities come to mind:

  • time(0). This seems unlikely. If it returned -1 to indicate an error on two different runs, that could cause this problem. However, the only way it should be able to fail is if it is passed an invalid address (which is definitely not the case here).
  • Multithreaded use. I don't think rand() is thread-safe. If this code was used in a multithreaded setting, that could lead to unexpected behavior.
  • Cron causing trouble. If the workstation's clock was inaccurate and was reset automatically (for example, via rdate) to synchronize with a server, a cron job could run more than once at the same wall-clock time. I was able to reproduce this by creating a cron job that dumps the current date to a file every minute and then setting the date repeatedly; as a result, it wrote the same date/time (to the second) to the file more than once. With the one-second resolution of the time function, this can lead to duplicate seeds.
  • Faulty code writing the UUID to the database. Even if the UUID generator works fine, there may be another bug that writes the same UUID to the database twice.
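
The duplicate-seed scenario can be sketched in a few lines. This assumes the generator is seeded once per process with something like srand(time(0)), which the question does not actually show. The names sample, first_outputs, same and mix_seed are hypothetical and exist only for illustration.

```cpp
#include <cstdlib>

// The first few rand() outputs after seeding; stands in for
// "the first UUID a freshly started process would generate".
struct sample { int v[4]; };

sample first_outputs(unsigned seed)
{
    srand(seed);
    sample s;
    for (int i = 0; i < 4; ++i)
        s.v[i] = rand();
    return s;
}

bool same(const sample& a, const sample& b)
{
    for (int i = 0; i < 4; ++i)
        if (a.v[i] != b.v[i])
            return false;
    return true;
}

// Hypothetical mitigation: fold a per-process value (for example a
// process id) into the timestamp, so that two runs started in the same
// second do not end up with the same seed.
unsigned mix_seed(unsigned long t, unsigned long pid)
{
    // Multiplying by an odd constant spreads nearby pids across the
    // low 32 bits before they are XORed into the timestamp.
    return static_cast<unsigned>(t ^ (pid * 2654435761UL));
}
```

For any `t`, `same(first_outputs(t), first_outputs(t))` holds, which is why two runs seeded in the same second would go on to produce the same UUIDs. On a POSIX system the second argument to `mix_seed` could come from getpid(); whether that is enough depends on how the processes are started.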

These are just wild guesses. Of these, the third is my favorite, but the fourth is the one I would suspect first if I were looking at my own code.
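
One more thing that can be checked independently of the randomness: RFC 4122 puts the version in the high nibble of byte 6 and the variant in the top two bits of byte 8. As far as I can read the masks in the question, the version lands in bytes[0x06] as expected, but the variant nibble ends up in bytes[0x0a] rather than bytes[0x08]. That affects format conformance, not the chance of a collision. A minimal sketch of stamping the fields directly on the byte array, with hypothetical helper names:

```cpp
#include <stdint.h>

// Hypothetical helper: overwrite the version and variant fields of an
// already random 16-byte buffer, following the RFC 4122 layout.
void stamp_v4(uint8_t bytes[16])
{
    bytes[6] = (bytes[6] & 0x0f) | 0x40; // version 4 in the high nibble of byte 6
    bytes[8] = (bytes[8] & 0x3f) | 0x80; // variant bits 10 in the top of byte 8
}

// Hypothetical check: does the buffer carry v4 version and RFC 4122 variant bits?
bool looks_like_v4(const uint8_t bytes[16])
{
    return (bytes[6] >> 4) == 0x4 && (bytes[8] >> 6) == 0x2;
}
```

Running `looks_like_v4` over generated values in a test is a cheap way to confirm the layout.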


Source: https://habr.com/ru/post/1433011/

