Is it safe to discard bytes from a UUID and still expect it to remain unique?

I wrote the following module, which encodes a UUID in an arbitrary base:

http://pypi.python.org/pypi/shortuuid/

It currently shortens a UUID to 22 characters with the default alphabet while preserving its uniqueness, but I would like to know how many digits I could cut while keeping as much of that uniqueness as possible.

Are all UUID digits equally random / unique, or are some digits more random than others? For example, if the first few digits are the identifier of the machine / application, then obviously they will be less random than the last few. I have not noticed anything like this in my experiments, but I want to be sure before I advise people on it.

If I truncate it to, say, 8 digits, is the probability of a collision 1/57^8, or is the randomness unevenly distributed across the digits?
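For reference, here is a rough sketch of the arithmetic behind that figure. It assumes every kept digit is uniformly random over a 57-character alphabet, which is exactly the assumption in question, and it uses the standard birthday approximation; the function name is only illustrative. Note that 1/57^8 is the chance that two specific IDs collide, while the chance of any collision among many IDs grows much faster.

```python
import math

ALPHABET_SIZE = 57   # alphabet size assumed in the question
DIGITS = 8           # number of digits kept after truncation


def collision_probability(n_ids: int, space: int) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / (2 * space))."""
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


space = ALPHABET_SIZE ** DIGITS
print(f"possible values: {space}")
print(f"equivalent bits: {math.log2(space):.1f}")
for n in (1_000, 1_000_000, 10_000_000):
    p = collision_probability(n, space)
    print(f"{n:>10} ids -> P(any collision) ~ {p:.3g}")
```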

+4
2 answers

Because of how UUIDs are designed, this is highly version dependent. And yes, some digits will be more random than others. http://en.wikipedia.org/wiki/Uuid#Version_1_.28MAC_address.29
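A small sketch with Python's standard uuid module illustrates the point for version 1. The node value below is a hypothetical placeholder passed in only to make the output reproducible; by default uuid1() derives it from the machine, typically from the MAC address.

```python
import uuid

# Hypothetical node value, used only so the example is reproducible.
# By default uuid1() takes it from the host (typically the MAC address).
NODE = 0x123456789ABC

a = uuid.uuid1(node=NODE)
b = uuid.uuid1(node=NODE)

print(a)
print(b)

# Last 12 hex digits: the node field, shared by every UUID from this host.
print(a.hex[-12:], b.hex[-12:])
# Hex digit at index 12: the version number, always "1" for uuid1().
print(a.hex[12], b.hex[12])
# First 8 hex digits: the low bits of the timestamp, which change fastest.
print(a.hex[:8], b.hex[:8])
```

So for version 1, keeping only the trailing digits would keep the least unique part, while the leading digits vary with time rather than at random.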

One way to work around this is to take a hash (sha256, for example) of the UUID. The bits of such a hash should be uniformly distributed.
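A minimal sketch of that idea, not the module's actual implementation: hash the UUID bytes with sha256, then encode part of the digest with a short alphabet. The 57-character alphabet and the short_id helper are assumptions made for illustration.

```python
import hashlib
import uuid

# Illustrative 57-character alphabet (an assumption for this sketch).
ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def short_id(u: uuid.UUID, length: int = 8) -> str:
    """Hash the UUID, then emit `length` base-57 digits of the digest."""
    digest = hashlib.sha256(u.bytes).digest()
    number = int.from_bytes(digest, "big")
    digits = []
    for _ in range(length):
        # Peel off one base-57 digit at a time from the low end.
        number, remainder = divmod(number, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(digits)


u = uuid.uuid1()
print(u, "->", short_id(u))
```

Hashing only spreads the existing variation evenly across the output; it does not add any. A truncated hash is still limited by the number of digits kept.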

Note that I have not done a thorough analysis here. My answer should be in the right ballpark, but I can't guarantee that it is completely correct.

+4

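It depends on which version you are dealing with. From version 3 onward, the digits should be fairly evenly distributed: versions 3 and 5 are hashes of a namespace and a name, and version 4 is randomly generated. Even then, the version and variant bits are fixed.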

http://en.wikipedia.org/wiki/Universally_unique_identifier
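As a quick check of this with Python's standard uuid module, the sketch below generates a batch of version 4 UUIDs and looks for hex positions that never change. The constant_positions helper is a hypothetical name introduced for the example.

```python
import uuid


def constant_positions(hex_ids):
    """Return the indexes at which every hex string has the same character."""
    return [i for i in range(32) if len({h[i] for h in hex_ids}) == 1]


ids = [uuid.uuid4().hex for _ in range(1000)]

# Index 12 holds the version digit, so it should be reported as constant.
print(constant_positions(ids))
# Index 16 holds the variant bits, so it only takes the values 8, 9, a, b.
print(sorted({h[16] for h in ids}))

# Versions 3 and 5 are hashes of a namespace and a name: they look evenly
# distributed, but the same input always gives the same UUID.
first = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
second = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(first == second)
```

So even for version 4, one hex digit carries no randomness and another carries only two bits of it; the remaining 122 bits are random.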

+1

Source: https://habr.com/ru/post/1334833/

