When to use a primary key with auto-increment and when is it not?

I am trying to figure out the “best practices” for deciding whether to add an automatically increasing integer as the primary key to the table.

Say I have a table containing data on chemical elements. The atomic number of each element is unique and will never change. So instead of using an auto-add integer for each column, it probably makes sense to just use an atomic number, right?

Would it be equally true if I had a table of books? Should I use ISBN or an auto-incrementing integer for the primary key? Or a table of employees containing each person's SSN?

+47
sql database-design
Feb 02 '10 at 17:39
source share
7 answers

There are many stack overflow issues already addressed that can help you with your issues. See here , here , here and here .

The term you should look for: surrogate keys .

Hope this helps.

+13
Feb 02 '10 at 17:43
source share

This is a very controversial issue, with a lot of emotions on both sides.

In my humble opinion, if there is a good, affordable natural key - like ISBN - I use it. Anyway, I'm going to store it in a database. Yes, the natural key is usually larger than the integer auto-increment key, but I think this problem is bloated. Today, disk space is cheap. I would be more worried about this, and it took more time. If you were talking about an 80-byte text field as the primary key, I would say no. But if you are thinking of using a 10-byte ISBN instead of an 8-byte big integer, I cannot imagine that this brings most of the performance.

Sometimes there is a performance advantage for natural keys. Suppose, for example, that I want to find how many copies of this book have been sold. I don’t care about any information from the main record of the book. If the primary key is ISBN, I can simply write "select count (*) from the sale, where isbn = '143573338X'". If I used the auto-increment key, I would need to create a connection to search for isbn, and the query would be more complex and slow, for example, "select count (*) from book join sale using (bookid), where isbn = '143573338X'". (And I can assure you that since this particular ISBN is for my book, the number of sales records is very small, so combining and reading one additional record is a big percentage difference!)

Another advantage of natural keys is that when you have to work with a database and you look at records that refer to this table by keywords, it is easy to see which record they refer to.

On the other hand, if there is no good, obvious natural key, do not try to fuss along with the madman. I saw people trying to make a natural key by combining the first 6 letters of a customer’s name, his year of birth, and his zip code, and then pray for it to be unique. Such stupidity just creates problems for you. Often people end up taking a serial number to make it unique anyway, and at that point, why bother? Why not just use the sequence number on its own as the key?

+9
02 Feb 2018-10-02T00
source share

You have an idea right here.

Auto-increment should be used as a unique key if there is no longer a unique key about the models you are modeling. So, for Elements, you can use Atomic Number or Books ISBN number.

But if people post messages on the bulletin board, then they need a unique identifier, but they do not contain naturally, so we assign the next number from the list.

It makes sense to use natural keys where possible, just remember to make the field as the primary key and make sure it is indexed for performance

+4
Feb 02 '10 at 17:43
source share

Regarding the use of ISBN and SSN, you really need to think about how many rows in other tables will refer to them through foreign keys, because these identifiers take up much more space than an integer and, therefore, can lead to disk space and, possibly worse connection performance.

+3
Feb 02 '10 at 17:49
source share

I am trying to figure out the “best practices” for deciding whether to add an automatically increasing integer as the primary key to the table.

Use it as a unique identifier with a data set, where PKey is not part of user-managed data.

Say I have a table containing data on chemical elements. The atomic number of each element is unique and will never change. So instead of using an auto-add integer for each column, it probably makes sense to just use an atomic number, right?

Yes.

Would it be equally true if I had a table of books? Should I use ISBN or an auto-incrementing integer for the primary key? Or a table of employees containing each person's SSN?

ISBNs / SS # s are assigned by third parties, and due to their large storage size, it will be a very inefficient way to uniquely identify a string. Remember that PKeys are useful when you join tables. Why use a large data format, such as ISBN, which will be a set of text characters as a unique identifier when a small and compact format, such as Integer, is available?

+2
Feb 02 '10 at 17:44
source share

The main problem that I encountered while automatically adding an integer approach is when you export your data for transfer to another db instance or even the backup and restore operation. Since the integer has nothing to do with the data it refers to, there is no way to determine if you have duplicates when restoring or adding data to an existing database. If you do not need a connection between the data contained in the string and the PC, I would just use guid. Not very convenient to view, but it solves the above problem.

+2
Feb 02 '10 at 17:44
source share

An old topic that I know, but another thing to keep in mind, is that, given that most RDBMSes allocate blocks to disk using a PC, using an automatic incremental PC will simply increase your competition. This may not be a problem for your database for children with which you work, but trust me that it can cause serious performance problems at the far end of the city.

If you should use an auto-increment identifier, consider using it as part of a PC. Put it on the end to keep it unique .....

In addition, it is best to exhaust all the possibilities for natural PCs before moving on to a surrogate. People are usually lazy with this.

0
Aug 22 '14 at 1:37
source share



All Articles