Is it wrong to use a hash for a unique identifier?

I want to use a unique ID generated by PHP in a database table that will probably never hold more than 10,000 records. I don't want the creation time to be visible, and I don't want a purely numeric value, so I am using:

sha1(uniqid(mt_rand(), true)) 

Is it wrong to use a hash for a unique identifier? Don't all hashes eventually lead to collisions, or are the chances so remote that they need not be considered in this case?

Another point: if the number of characters to be hashed is less than the number of characters in the sha1 hash, will it not always be unique?

+4
5 answers

If you have 2 keys, the theoretical best-case probability of a collision is 1 in 2^X, where X is the number of bits in your hash algorithm. It is a "best case" because the input will usually be ASCII, which does not use the full character set, and hash functions do not distribute perfectly, so in real life they collide more often than the theoretical figure suggests.
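
To put a rough number on this for the table size in the question, the sketch below extends the two-key case with the standard birthday approximation, p ≈ n(n-1)/2 / 2^X. It assumes an ideal hash, so treat the result as an optimistic lower bound; the function name collisionProbability is made up for illustration.

```php
<?php
// Approximate probability that at least two of $n keys collide in an
// ideal $bits-bit hash: p ≈ n(n-1)/2 / 2^bits (valid while p is small).
function collisionProbability(int $n, int $bits): float
{
    $pairs = $n * ($n - 1) / 2;      // number of distinct key pairs
    return $pairs / pow(2, $bits);   // each pair collides with chance 1 / 2^bits
}

// 10,000 records and the 160-bit SHA-1 digest from the question
printf("%.2e\n", collisionProbability(10000, 160));
```

For 10,000 records and 160 bits this comes out on the order of 10^-41, and for two keys it reduces to the 1 in 2^X figure.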

To answer your last question:

Another point: if the number of characters to be hashed is less than the number of characters in the sha1 hash, will it not always be unique?

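Yes, for practical purposes that's true (SHA-1 does not formally guarantee it, but no collision is forced when there are fewer possible inputs than outputs). But then you have another problem: generating unique keys of that size in the first place. The easiest way is usually a checksum, so just choose a digest large enough that the collision space is small enough for your comfort.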

As @wayne suggests, a popular approach is to concatenate microtime() with your random salt (and base64_encode the result, which packs more bits into each character than hex does but adds no entropy of its own).
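
A minimal sketch of that approach, under a few assumptions that are not in the answer: the salt comes from random_bytes() (PHP 7+), the digest is SHA-1, and the base64 output is made URL-safe and stripped of padding. makeId() is a hypothetical helper name.

```php
<?php
// Sketch: ID built from microtime() plus a random salt, hashed with SHA-1
// and base64-encoded. makeId() is a hypothetical helper name.
function makeId(): string
{
    $salt   = random_bytes(16);                 // random salt from the OS random source
    $digest = sha1(microtime() . $salt, true);  // true: return the raw 20-byte digest
    // URL-safe base64 without padding: 27 characters instead of 40 hex digits
    return rtrim(strtr(base64_encode($digest), '+/', '-_'), '=');
}

echo makeId(), "\n";
```

The encoded ID carries the same 160 bits as the hex form; it is only shorter to store and display.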

+5

How bad would it be if two IDs were the same? Murphy's Law applies: if a million-to-one, or even a 100,000-to-one, chance is acceptable, then go right ahead! The real chance is much, much lower, but if your system would blow up when it happens, then that is a design problem you need to solve first. Then proceed with confidence.

Here is a question and answer on what the probabilities really are: Collision Probability SHA1
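
One way to address the design problem mentioned above is to let the storage layer reject duplicates and simply retry. The sketch below is only an illustration: the $existing array stands in for a table column with a UNIQUE index, and insertWithUniqueId() and its parameters are hypothetical names.

```php
<?php
// Sketch: retry on collision. $existing stands in for a table column with
// a UNIQUE index; $generate is any ID generator. All names are hypothetical.
function insertWithUniqueId(array &$existing, callable $generate, int $maxAttempts = 5): string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $id = $generate();
        if (!isset($existing[$id])) {   // a real table would rely on the UNIQUE constraint
            $existing[$id] = true;
            return $id;
        }
        // Collision: loop again and try a fresh ID.
    }
    throw new RuntimeException("No unique ID after $maxAttempts attempts");
}

$table = [];
$id = insertWithUniqueId($table, fn() => sha1(uniqid((string) mt_rand(), true)));
```

With this in place a collision costs one extra attempt instead of a failure, whatever its actual probability.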

+3

Use sha1(time()) instead; then you remove the random possibility of a repeated hash for as long as the time can be represented in fewer characters than the sha1 hash (which will likely be longer than you will be able to find a working PHP parser ;)).
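
One thing to check before relying on this, assuming more than one record could ever be created within the same second: time() has one-second resolution, so the input, and therefore the hash, repeats within that second. The sketch uses a fixed, made-up timestamp in place of time() so the effect is reproducible.

```php
<?php
// time() returns whole seconds, so two calls in the same second hash identically.
$timestamp = 1700000000;               // made-up fixed value standing in for time()
$first  = sha1((string) $timestamp);
$second = sha1((string) $timestamp);   // "generated" within the same second

var_dump($first === $second);          // bool(true)

// One second later the hash is different.
var_dump(sha1((string) ($timestamp + 1)) === $first);   // bool(false)
```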

+2

Random numbers produced by a computer are by no means truly random, you know. The only truly random data you can get from a computer, assuming a Unix environment, comes from /dev/random, but reading it is a blocking operation that depends on user interaction, such as moving the mouse or typing on the keyboard. Reading from /dev/urandom is less secure, but it is probably better than using only ASCII characters, and it responds instantly.
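
In PHP you rarely need to open these devices yourself. As a sketch, assuming PHP 7 or later, random_bytes() draws from the operating system's secure random source, and bin2hex() turns 20 bytes into a 40-character ID with the same shape as a sha1 hash. randomId() is a hypothetical name.

```php
<?php
// Sketch: random hex ID taken straight from the OS random source.
// randomId() is a hypothetical helper; 20 bytes matches the size of a SHA-1 digest.
function randomId(int $bytes = 20): string
{
    return bin2hex(random_bytes($bytes));   // 2 hex characters per byte
}

echo randomId(), "\n";
```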

+2

You can create a seemingly random, reversible identifier from the integer primary key using this class:
http://blog.kevburnsjr.com/php-unique-hash

This way you won’t need to store (or index) the hash, and you don’t have to worry about collisions.
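
The general idea can be sketched as follows. This is not the linked class, only an illustration under stated assumptions: IDs are below 2^31, PHP is a 64-bit build (so the intermediate products fit in an integer), and MULTIPLIER and XOR_KEY are arbitrary made-up constants. All function names are hypothetical.

```php
<?php
// Sketch of a reversible ID obfuscation (not the linked class). Works modulo
// 2^31 and assumes 64-bit PHP so the intermediate products cannot overflow.
const MASK       = 0x7FFFFFFF;   // 2^31 - 1, keeps the low 31 bits
const MULTIPLIER = 1580030173;   // hypothetical: any odd number below 2^31
const XOR_KEY    = 1163945558;   // hypothetical secret below 2^31

// Multiplicative inverse of an odd number modulo 2^31 (Newton iteration:
// each round doubles the number of correct low bits, starting from 3).
function inverseMod231(int $a): int
{
    $x = $a;
    for ($i = 0; $i < 5; $i++) {
        $x = ($x * ((2 - (($a * $x) & MASK)) & MASK)) & MASK;
    }
    return $x;
}

// Scramble the key, then print it in base 36 so it is not purely numeric.
function encodeId(int $id): string
{
    $scrambled = (($id * MULTIPLIER) & MASK) ^ XOR_KEY;
    return base_convert((string) $scrambled, 10, 36);
}

// Undo the XOR, then multiply by the inverse to recover the original key.
function decodeId(string $code): int
{
    $scrambled = ((int) base_convert($code, 36, 10)) ^ XOR_KEY;
    return ($scrambled * inverseMod231(MULTIPLIER)) & MASK;
}

$code = encodeId(42);
var_dump(decodeId($code) === 42);
```

Because the mapping is one-to-one over the 31-bit range, two different keys can never produce the same code. It is obfuscation rather than encryption, so it hides the sequence from casual view only.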

0

Source: https://habr.com/ru/post/1478912/

