Hashes vs. Numeric Identifiers

When building a web application that somehow displays a unique identifier for a recurring entity (videos on YouTube, or a book section on a site like mine), is it better to use a uniform-length identifier such as a hash, or the item's unique key in the database (1, 2, 3, etc.)?

Apart from disclosing a little information about the internals of your application, which in my opinion is not significant, why would a hash be better than simply using the unique key?

In short: What is better to use as a public unique identifier - a hash value or a unique key from the database?

Edit: I am reopening this question because Dmitry raised a good point about not tying the naming to a specific property. Would such a tie prevent me from optimizing or normalizing the database in the future?

The platform is PHP/Python with ISAM on MySQL.

+4
8 answers

Unless you are trying to hide the state of your internal object identifier, hashes are needlessly slow (to generate and to compare), needlessly long, needlessly ugly, and needlessly capable of colliding. GUIDs are also long and ugly, which makes them just as unsuitable for human consumption as hashes.

For inventory-like things, just use a sequential (or sharded) counter. If you migrate to another database, you only need to initialize the new counter to a value at least as large as your largest existing record identifier. Almost every database server gives you a way to do this.

If you are trying to hide the state of your counter, perhaps because you are counting users and do not want competitors to know how many you have, I suggest not exposing your internal identifiers at all. If you insist on exposing them and do not want the drawbacks of hashes, you can use a variable-length linear feedback shift register (LFSR) to generate identifiers.
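
To make the shift-register idea concrete, here is a minimal sketch in Python. It is an illustration only, not necessarily what the answer had in mind: it uses a fixed 16-bit register rather than a variable-length one, and the function name `lfsr16_ids` and the seed value are my own choices. The taps correspond to the polynomial x^16 + x^14 + x^13 + x^11 + 1, which is commonly cited as maximal-length, so the register visits every nonzero 16-bit value once before repeating.

```python
def lfsr16_ids(seed=0xACE1):
    """Yield each nonzero 16-bit value once, in a scrambled-looking order.

    Sketch only: a fixed-width Fibonacci LFSR with taps for
    x^16 + x^14 + x^13 + x^11 + 1. The seed is an arbitrary nonzero start.
    """
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a nonzero 16-bit value")
    state = seed
    while True:
        yield state
        # XOR the tapped bits to produce the new top bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        if state == seed:
            return  # full cycle completed, stop before repeating


# Usage: hand out the next identifier each time a record is created.
gen = lfsr16_ids()
first_three = [next(gen) for _ in range(3)]
```

Note that this only obscures the counter from casual observers. Anyone who knows or recovers the taps can step the sequence forwards or backwards, so it should not be treated as a security measure.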

+4

I usually use hashes if I don't want the user to guess the next identifier in the series. But for your sections of the book, I would stick with numeric identifiers.

+2

Using hashes is preferable in case you ever need to rebuild your database for some reason and the ordering changes. The ordinal numbers will shift, but the hashes will stay the same.

Relying not on the order in which you put things into the box but on the properties of the things themselves just seems... safer.

But watch out for collisions, obviously.
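
A rough sketch of what this could look like in Python, assuming identifiers are derived from a record's title and body and truncated to a fixed number of hex characters. The function names, the choice of SHA-256, and the truncation length are all my assumptions, not something stated in the answer. The second function uses the standard birthday-bound approximation to estimate how likely a collision is for a given number of records.

```python
import hashlib
import math


def content_id(title: str, body: str, hex_chars: int = 12) -> str:
    """Derive an identifier from the record's content, not its row position."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from hashing alike.
    data = f"{title}\x00{body}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def collision_probability(n_items: int, hex_chars: int) -> float:
    """Birthday-bound estimate: p ~ 1 - exp(-n(n-1) / (2 * 16**hex_chars))."""
    space = 16 ** hex_chars
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


# Usage: same content gives the same identifier, however the table is rebuilt.
section_id = content_id("Chapter 1", "It was a dark and stormy night.")
risk_short = collision_probability(1_000_000, 8)   # short IDs, many records
risk_long = collision_probability(1_000, 12)       # longer IDs, few records
```

The trade-off is that an identifier derived from content changes whenever the content is edited, so in practice it would probably be computed once at creation time and stored.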

+2

With hashes you:

  • Can merge the database with a similar one (or with a backup) if necessary.
  • Do not give away anything that might help a guessing attack, even a little.
  • Do not disclose more private information about a user than necessary. For example, if someone sees user number 2 in your current database, they learn that this is one of your oldest users.
  • (Assuming you use a long hash or a GUID) help yourself a great deal in case you get bought by YouTube and they decide to integrate your databases.
  • Help yourself in case a search engine that indexes by GUID comes along.

Please tell us if the last 6 months have given you some clarity on this matter ...

+1

Hashes are not guaranteed to be unique, nor, I believe, consistent.

0

Do your users need to remember or use the value? Or are you looking at this from a security point of view?

From a security point of view it should not matter, because you should not rely solely on people failing to guess a different but valid identifier of something they are not supposed to see in order to keep them from seeing it.

0

Yes, I don't think you are looking for a hash; you are more likely looking for a GUID. If you are on the .NET platform, try System.Guid.

However, the most important reason not to use a GUID is performance. Inserting and searching on (long) strings in a database is quite suboptimal. Integers are fast. So unless you really need it, don't do it.

0

Hashes have the advantage that you can check whether they are well-formed before running any lookup against your database to see whether they exist. This can help you fend off attacks with random hashes, since you do not have to burden your database with bogus lookups.

So if your hash has some well-defined format, for example with a checksum at the end, you can check whether it is valid without touching the database.
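
One way to sketch this in Python, assuming the public identifier is the numeric key followed by a short keyed checksum. The format `<id>-<tag>`, the key, the tag length, and the function names are all hypothetical choices for illustration. A truncated HMAC is used here instead of a plain checksum so that outsiders cannot compute valid tags themselves.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-me"  # hypothetical server-side secret
TAG_HEX_CHARS = 8           # assumed tag length; longer is harder to guess


def _tag(record_id: int) -> str:
    """Keyed checksum of the numeric id, truncated to TAG_HEX_CHARS."""
    mac = hmac.new(SECRET_KEY, str(record_id).encode("ascii"), hashlib.sha256)
    return mac.hexdigest()[:TAG_HEX_CHARS]


def make_public_id(record_id: int) -> str:
    return f"{record_id}-{_tag(record_id)}"


def parse_public_id(public_id: str):
    """Return the record id if the checksum matches, else None.

    No database access happens here, so malformed or random
    identifiers are rejected before any lookup is attempted.
    """
    number, sep, tag = public_id.partition("-")
    if not sep or not (number.isascii() and number.isdigit()):
        return None
    if len(number) > 18:  # reject absurdly long inputs early
        return None
    expected = _tag(int(number)).encode("ascii")
    if not hmac.compare_digest(tag.encode("utf-8"), expected):
        return None
    return int(number)


# Usage: only query the database when parse_public_id returns a number.
public = make_public_id(42)
record_id = parse_public_id(public)
```

This filters out junk cheaply, but it does not replace an access check on the record itself once it has been loaded.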

0

Source: https://habr.com/ru/post/1277688/

