Irreversible unique identifier from String

The problem sounds pretty often, but for some reason I can’t find something that helps me with this ...
I feel like I lack a fundamental knowledge of hashing and encryption.

Problem
Suppose I have a phone number that (hopefully unique and) is used as an identifier.
But I do not want to use my personal number as an identifier in public interfaces.
What I need is a solution that obfuscates the string in one direction, but still preserves uniqueness, so when someone uses the algorithm, he will get the same ID.

Decision (?)
Is there a hashing algorithm that guarantees uniqueness when the input does not exceed the hashed output length, but still (almost) cannot be undone.
How about using a fixed RSA encryption public key? The output should be unique, but the attacker will have to break one key in order to decrypt all the numbers. Sounds like a bad idea ...

Refresh (based on response)
Obviously, I'm looking for a cryptographic hashing algorithm with a low probability of collision.
Now (that I slept a bit and) thought that after a few facts I can think:

  • In any case, I have to deal with collisions. When I use the phone number as an identifier without further verification, then anyony can indicate "this is mine."
  • Rainbow tables will always be a problem. Since there are a number of managed phone numbers, and everyone should be able to generate a hash from the phone number (I can’t even use secret salt), my only opportunity is to use an intensive algorithm and salt, which makes the rainbow table unique, I think.

With that said: I can decide to use a hash. Thus, no one can immediately tell (without attacking him) which phone number is being used. That seems to be all.

+4
source share
1 answer

What you basically want is a hashing algorithm (as your question says). But where it gets complicated, these are two lines:

  • "guarantees uniqueness when the input does not exceed the hashed output length"
  • "but still (almost) impossible to undo"

Depending on the input length, you can prove the uniqueness (or not collision) of yourself with several for cycles and some time. Therefore, for an example phone number, you can easily prove that all phone numbers for SHA1 do not collide.

If your input space is large, you can be comfortable with the fact that a modern hash function (like SHA-1 or SHA-3) has a very low chance of collision ( birthday ), but there are no guarantees. Although people have been looking for a collision for SHA-1 for a long time and have found them, I think the current gap cost of the only SHA1 was 2 million in a project called HashClash. Currently, the recommendation is for people to switch to SHA-3 , where no collisions were detected. (collisions for SHA-1, I think it would take something like 2 ^ 51 operations to find what might be good enough for your needs).

enter image description here

In the second part of your question, “abandonment cannot be undone”. You may strive to make something computationally impracticable. But with infinite time, an attacker can cancel any hash.

This link is an excellent review of non-cryptographic current hashing algorithms . Unfortunately, you probably cannot use any of the ones mentioned in the article, because you need to be resistant to reversal, so you do not need a fast hashing algorithm. Slower algorithms make things more computationally impracticable.

Suppose for a second that the attacker knows that the 160-bit SHA1 hash (or any other hash that you use) is a phone number. In this case, it will not be difficult for him to create a rainbow table for each hash value of possible phone numbers. This is true for any hashing algorithm. What people usually do to avoid this is the Salt source phrase. This helps to make the building of the rainbow table impossible because the salt is secret and the number of possibilities is huge.

+6
source

Source: https://habr.com/ru/post/1483986/


All Articles