Create human-readable/usable, short but unique identifiers

  • Need to handle more than 1,000 but fewer than 10,000 new records per day

  • Cannot use GUIDs/UUIDs, auto-increment numbers, etc.

  • Ideally should be 5 or 6 characters long; can be alphabetic, of course

  • Would like to reuse existing, well-known algorithms, if available

Is there anything out there?

+44
database identity
Mar 03 '12 at 5:19
4 answers

Base 62 is used by tinyurl and bit.ly for their shortened URLs. It is a well-understood method for creating "unique", human-readable IDs. Of course, you will have to store the created IDs and check for duplicates on creation to ensure uniqueness (see the code at the bottom of this answer).

Base 62 uniqueness figures

5 characters in base 62 give you 62^5 unique IDs = 916,132,832 (~1 billion). At 10,000 IDs per day you will be fine for 91,000+ days.

6 characters in base 62 give you 62^6 unique IDs = 56,800,235,584 (56+ billion). At 10,000 IDs per day you will be fine for 5+ million days.

Base 36 uniqueness figures

6 characters give you 36^6 unique IDs = 2,176,782,336 (2+ billion)

7 characters give you 36^7 unique IDs = 78,364,164,096 (78+ billion)
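The arithmetic above can be reproduced with a few lines of C#. This is only a sketch; `IdCapacity`, `Count` and `DaysUntilExhausted` are names made up for this illustration.

```csharp
using System;

public static class IdCapacity
{
    // Number of distinct IDs of the given length over an alphabet of the given size.
    public static long Count(int alphabetSize, int length)
    {
        long total = 1;
        for (int i = 0; i < length; i++)
            total = checked(total * alphabetSize); // throws on overflow instead of wrapping
        return total;
    }

    // Whole days until the ID space is used up at a constant daily rate.
    public static long DaysUntilExhausted(int alphabetSize, int length, long idsPerDay)
    {
        return Count(alphabetSize, length) / idsPerDay;
    }
}
```

For example, `IdCapacity.DaysUntilExhausted(62, 5, 10000)` gives the number of days a 5-character base 62 space lasts at 10,000 IDs per day. These figures are upper bounds: with randomly generated IDs, duplicates start to appear long before the space is full, which is why the duplicate check is needed.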

The code:

    using System;
    using System.Text;

    public class RandomIdGeneratorTests
    {
        public void TestRandomIdGenerator()
        {
            // create five IDs of six base 62 characters
            for (int i = 0; i < 5; i++) Console.WriteLine(RandomIdGenerator.GetBase62(6));

            // create five IDs of eight base 36 characters
            for (int i = 0; i < 5; i++) Console.WriteLine(RandomIdGenerator.GetBase36(8));
        }
    }

    public static class RandomIdGenerator
    {
        private static readonly char[] _base62chars =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
            .ToCharArray();

        // System.Random is not thread-safe; guard it with a lock if this
        // class is called from multiple threads.
        private static readonly Random _random = new Random();

        public static string GetBase62(int length)
        {
            var sb = new StringBuilder(length);

            for (int i = 0; i < length; i++)
                sb.Append(_base62chars[_random.Next(62)]);

            return sb.ToString();
        }

        // Base 36 uses only the first 36 characters (digits and upper-case letters).
        public static string GetBase36(int length)
        {
            var sb = new StringBuilder(length);

            for (int i = 0; i < length; i++)
                sb.Append(_base62chars[_random.Next(36)]);

            return sb.ToString();
        }
    }

Output:

 z5KyMg
 wd4SUp
 uSzQtH
 UPrGAT
 UIf2IS

 QCF9GNM5
 0UV3TFSS
 3MG91VKP
 7NTRF10T
 AJK3AJU7
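
The generator above does not itself guarantee uniqueness; as noted at the top of the answer, created IDs have to be stored and checked. A minimal sketch of that check follows, assuming an in-memory `HashSet` as a stand-in for the real store (in a database, a unique index on the ID column would play this role). `UniqueIdIssuer` and its `maxAttempts` parameter are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class UniqueIdIssuer
{
    private const string Base62Chars =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Stand-in for the persistent store (assumption for this sketch).
    private readonly HashSet<string> _issued = new HashSet<string>();
    private readonly Random _random = new Random();
    private readonly int _length;

    public UniqueIdIssuer(int length)
    {
        _length = length;
    }

    // Generates random base 62 IDs until one is found that was not issued before.
    public string Next(int maxAttempts = 10)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            var sb = new StringBuilder(_length);
            for (int i = 0; i < _length; i++)
                sb.Append(Base62Chars[_random.Next(Base62Chars.Length)]);

            string candidate = sb.ToString();
            if (_issued.Add(candidate)) // Add returns false for a duplicate
                return candidate;
        }
        throw new InvalidOperationException(
            "Could not find an unused ID; the ID space may be nearly full.");
    }
}
```

Usage would look like `var issuer = new UniqueIdIssuer(6); string id = issuer.Next();`. Like the generator above, this sketch is not thread-safe.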
+73
Mar 03 '12

I recommend http://hashids.org/, which converts any number (e.g. a DB ID) into a string (using a salt).

It also lets you decode that string back into the number, so you do not need to store the string in the database.

It has libraries for JavaScript, Ruby, Python, Java, Scala, PHP, Perl, Swift, Clojure, Objective-C, C, C++11, Go, Erlang, Lua, Elixir, ColdFusion, Groovy, Kotlin, Nim, VBA and CoffeeScript, as well as for Node.js and .NET.
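
The idea of deriving the string from the number, so that it can be decoded rather than stored, can be illustrated with a plain base 62 conversion. Note that this is not the Hashids algorithm: there is no salt here, so consecutive numbers produce visibly consecutive strings. `Base62Codec` is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Text;

public static class Base62Codec
{
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Converts a non-negative number to its base 62 representation.
    public static string Encode(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value == 0) return "0";

        var sb = new StringBuilder();
        while (value > 0)
        {
            sb.Insert(0, Alphabet[(int)(value % 62)]); // prepend the next digit
            value /= 62;
        }
        return sb.ToString();
    }

    // Converts a base 62 string back to the original number.
    public static long Decode(string text)
    {
        if (string.IsNullOrEmpty(text)) throw new ArgumentException("Empty ID.", nameof(text));

        long value = 0;
        foreach (char c in text)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Invalid character: " + c);
            value = checked(value * 62 + digit);
        }
        return value;
    }
}
```

With this alphabet, `Base62Codec.Encode(12345)` yields `"3D7"` and `Base62Codec.Decode("3D7")` yields `12345`.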

+11
Mar 31 '15 at 15:11

I had requirements similar to the OP's. I looked at the available libraries, but most of them are based on randomness, and I did not want that. I could not find anything that was not random-based and still very short, so I ended up rolling my own, based on the technique Flickr uses, but modified to require less coordination and to keep working longer without contacting the server.

In short:

  • A central server issues ID blocks consisting of 32 IDs each
  • The local ID generator maintains a pool of ID blocks from which it generates an ID each time one is requested. When the pool runs low, it fetches more ID blocks from the server to fill it up again.
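
The two steps above can be sketched roughly as follows. This is not the suid implementation: the bit layout (block number times 32 plus an offset), the `fetchBlock` delegate standing in for the call to the central server, and the `minBlocks` refill threshold are all assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class BlockIdGenerator
{
    private const int BlockSize = 32;          // IDs per block, as described above
    private readonly Func<long> _fetchBlock;   // hypothetical call to the central server
    private readonly int _minBlocks;           // refill when fewer blocks than this remain
    private readonly Queue<long> _pool = new Queue<long>();
    private long _currentBlock;
    private int _usedInCurrent = BlockSize;    // forces a block to be taken on first use

    public BlockIdGenerator(Func<long> fetchBlock, int minBlocks)
    {
        _fetchBlock = fetchBlock;
        _minBlocks = Math.Max(1, minBlocks);
    }

    public long Next()
    {
        if (_usedInCurrent == BlockSize)
        {
            // Current block is used up: top up the pool if it is low,
            // then take the next block from it.
            while (_pool.Count < _minBlocks)
                _pool.Enqueue(_fetchBlock());

            _currentBlock = _pool.Dequeue();
            _usedInCurrent = 0;
        }
        // Assumed layout: block number in the high bits, offset (0..31) in the low bits.
        return _currentBlock * BlockSize + _usedInCurrent++;
    }
}
```

As long as the server never hands out the same block number twice, two generators can never produce the same ID, and each one only needs the server once per `minBlocks * 32` IDs or so. The resulting number could then be base 36 encoded for display. The sketch is not thread-safe.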

Disadvantages:

  • Central coordination required
  • IDs are more or less predictable (less so than regular database IDs, but they are not random)

Advantages:

  • Stays within 53 bits (the maximum size of integers in JavaScript / PHP)
  • Very short IDs
  • Base 36 encoded, so easy for people to read, write and pronounce
  • IDs can be generated locally for a very long time before the server has to be contacted again (depending on the pool settings)
  • Theoretically no chance of collisions

I have published both a JavaScript library for the client side and a Java EE server implementation. Implementing servers in other languages should also be easy.

Here are the projects:

suid - Distributed Service-Unique IDs that are short and sweet

suid-server-java - implementation of the suid server for the Java EE technology stack.

Both libraries are available under a Creative Commons open source license. I hope this helps someone else who is looking for short unique IDs.

+4
Jun 10 '15 at 20:30

I used base 36 when I solved this problem for an application I was developing a couple of years ago. I needed to generate a human-readable number that was unique (within the current calendar year). I chose to use the time in milliseconds since midnight on January 1 of the current year (so the values can repeat from one year to the next) and convert it to a base 36 number. If the system under development hit a fatal problem, it generated the base 36 number (7 characters), which was shown to the end user through the web interface. The user could then relay the problem (and the number) to a technical support person, who could use it to find the point in the logs where the stack trace started. A number like 56af42g7 is infinitely easier for a user to read and relay than a timestamp like 2016-01-21T15:34:29.933-08:00 or a random UUID like 5f0d3e0c-da96-11e5-b5d2-0a1d41d68578.
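
A minimal sketch of this scheme, assuming lower-case base 36 digits and left-padding with zeros to seven characters (neither detail is stated above); `YearlyTimestampId` is a hypothetical name. Seven characters are enough because even a leap year has fewer milliseconds (31,622,400,000) than 36^7.

```csharp
using System;
using System.Text;

public static class YearlyTimestampId
{
    private const string Base36Chars = "0123456789abcdefghijklmnopqrstuvwxyz";

    // Milliseconds since midnight on January 1 of the timestamp's year,
    // written in base 36 and left-padded with '0' to seven characters.
    public static string FromTimestamp(DateTime timestamp)
    {
        var startOfYear = new DateTime(timestamp.Year, 1, 1, 0, 0, 0, timestamp.Kind);
        long millis = (timestamp - startOfYear).Ticks / TimeSpan.TicksPerMillisecond;

        var sb = new StringBuilder();
        do
        {
            sb.Insert(0, Base36Chars[(int)(millis % 36)]); // prepend the next digit
            millis /= 36;
        } while (millis > 0);

        return sb.ToString().PadLeft(7, '0');
    }
}
```

For example, a timestamp exactly one second into the year (1,000 ms) becomes `"00000rs"`. Two events in the same millisecond would get the same value, so this only works when such events are rare enough, as with fatal errors.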

+2
Feb 24 '16 at 1:42


