How does adding a random byte *increase* duplicates?

Here is a Python function that generates my own specific kind of UUID (it's a long story why I can't use uuid.uuid1()):

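from datetime import datetime
from random import choice
from time import time

def uuid():
    sec = hex(int(time()))[2:]                  # seconds since the epoch, hex
    usec = hex(datetime.now().microsecond)[2:]  # microseconds, hex, not padded
    rand = hex(choice(range(256)))[2:]          # one random byte, hex, not padded

    return (sec + usec + rand).upper()
    # e.g. 534AD79CDF1D27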

Now let's run it for a long time and see whether we get duplicates:

UUIDs     Duplicates
100000    2
200000    8
300000    8
400000    8
500000    8
600000    9
700000    9
800000    9
900000    9
1000000   10
1100000   14
1200000   14
1300000   14
1400000   17
1500000   17
1600000   18
1700000   21
1800000   24
1900000   24
2000000   27

Oops! Almost 30 duplicates ... Now here is a new version of the function, without the random byte at the end:

def uuid():
    sec = hex(int(time()))[2:]
    usec = hex(datetime.now().microsecond)[2:]

    return (sec + usec).upper()
    # e.g. 534ADA2AC4A41

Let's see how many duplicates we get now:

UUIDs     Duplicates
100000    0
200000    0
300000    0
400000    0
500000    0
600000    0
700000    0
800000    0
900000    0
1000000   0
1100000   0
1200000   0
1300000   0
1400000   0
1500000   0
1600000   0
1700000   0
1800000   0
1900000   0
2000000   0

Ok, would you look at that? Not a single duplicate! Also, in case you are wondering how I count the duplicates, here is the code:

len([x for x, y in Counter(ids).items() if y > 1])
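For anyone who wants to run that count on its own, here is a minimal self-contained sketch. The helper name count_duplicates and the sample list are made up for illustration; only the one-line expression comes from the question.

```python
from collections import Counter

def count_duplicates(ids):
    """Return how many distinct values occur more than once in ids."""
    return len([x for x, y in Counter(ids).items() if y > 1])

# Hypothetical sample: "A1" and "B2" repeat, "C3" does not.
sample = ["A1", "B2", "A1", "C3", "B2", "B2"]
print(count_duplicates(sample))  # 2
```

Note that this counts distinct repeated values, not the number of extra copies, so an ID generated three times still counts once.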

Now, to the question: How does adding a randomly generated byte increase the number of duplicates?

+4
2 answers

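The cause is that hex() does not zero-pad its output. hex(int(time())) is 8 nybbles long, now and for decades to come, so that field is not the problem. Its nybble count does not vary.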

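hex(datetime.now().microsecond) is another matter. Its length ranges from 1 nybble (e.g. 9 us) to 5 nybbles (e.g. 999999 us). "But that alone doesn't cause duplicates," you might say, and you would be right, as your second function shows: after a fixed-width prefix, different microsecond values always give different strings.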
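A quick way to see those widths is to print a few values. This is only an illustrative sketch; the helper name hex_digits and the sample values are not from the question.

```python
def hex_digits(n):
    """Hex text of n without the '0x' prefix, exactly as hex(n)[2:] gives it."""
    return hex(n)[2:]

# The timestamp from the question's sample ID: always 8 digits these days.
print(hex_digits(0x534AD79C), len(hex_digits(0x534AD79C)))  # 534ad79c 8

# Microsecond values: the width depends on the value.
for us in (9, 255, 4096, 999999):
    print(us, hex_digits(us), len(hex_digits(us)))
# 9 9 1
# 255 ff 2
# 4096 1000 4
# 999999 f423f 5
```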

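The random byte, however, is variable-length too! It can be 1 nybble or 2 nybbles long. So now your uuid ends with two adjacent variable-width fields, and they can collide: a 3-nybble usec followed by a 2-nybble rand is indistinguishable from a 4-nybble usec followed by a 1-nybble rand. For example, these two pairs produce the same string: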

usec = 0xabc
rand = 0xde

usec = 0xabcd
rand = 0xe
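The two pairs above can be checked directly. A minimal sketch, where the helper name suffix is made up and simply mirrors how the first uuid() joins its last two fields:

```python
def suffix(usec, rand):
    """Concatenate usec and rand as unpadded hex, like the first uuid()."""
    return hex(usec)[2:] + hex(rand)[2:]

a = suffix(0xabc, 0xde)   # 3-nybble usec + 2-nybble rand
b = suffix(0xabcd, 0xe)   # 4-nybble usec + 1-nybble rand
print(a, b, a == b)       # abcde abcde True
```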

To fix this, make the fields fixed-width, for example with format:

usec = format(datetime.now().microsecond, '05x') # hexify `microsecond` with 5 fixed hex digits
+2

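usec can be 1 to 5 characters long and rand can be 1 to 2 characters long, so different (usec, rand) combinations can end up producing the same string.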

For example, usec = '12' with rand = '3' gives the same result as usec = '1' with rand = '23' (i.e. '123').

To avoid this, pad usec to 5 characters and rand to 2 characters.
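Putting that together, a padded version might look like the sketch below. The names pack and uuid_fixed are hypothetical, and padding sec to 8 digits is an extra assumption (it is already 8 digits wide today); only the 5-digit usec and 2-digit rand widths come from the answers.

```python
from datetime import datetime
from random import randrange
from time import time

def pack(sec, usec, rand):
    """Join the three fields as fixed-width, zero-padded, upper-case hex."""
    return (format(sec, '08x') + format(usec, '05x') + format(rand, '02x')).upper()

def uuid_fixed():
    # Same inputs as the original uuid(): epoch seconds, microseconds, one random byte.
    return pack(int(time()), datetime.now().microsecond, randrange(256))

# The pairs that used to collide now give different strings:
print(pack(0x534AD79C, 0xabc, 0xde))   # 534AD79C00ABCDE
print(pack(0x534AD79C, 0xabcd, 0xe))   # 534AD79C0ABCD0E
print(uuid_fixed())                    # always 15 characters
```

With fixed widths every field can be read back unambiguously from its position, so two IDs are equal only if all three fields are equal.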

+1

Source: https://habr.com/ru/post/1536339/

