Can a slug be used to link one record to another in MongoDB?

Say we have two models:

User:
- _id
- name
- email

Company:
- _id
- name
- slug

Now suppose I need to link a user to a company. A user can belong to only one company. To do this, I can add a new field named companyID to the user model. However, I do not send the _id field to the front end; all requests coming to the API will contain only the slug. There are two ways to do this:

1) Store the company's slug as the link: then I can take the slug sent in the request and query for the company directly.

2) Store the company's _id as the link: then I first need to query for the company by slug, and then use the returned _id to request the required data.
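A minimal in-memory sketch of option 2, assuming plain Python dicts stand in for the two collections. The names `companies`, `users`, `find_company_by_slug`, `assign_company`, and `company_of` are hypothetical, and the `_id` values are made-up strings rather than real ObjectIds; a real application would issue the equivalent queries through its MongoDB driver.

```python
# In-memory stand-ins for the two collections (assumption: real code would
# query MongoDB; the _id values are made-up strings, not real ObjectIds).
companies = {
    "c1": {"_id": "c1", "name": "Acme Corp", "slug": "acme"},
    "c2": {"_id": "c2", "name": "Globex", "slug": "globex"},
}
users = {
    "u1": {"_id": "u1", "name": "Ann", "email": "ann@example.com", "companyID": None},
}


def find_company_by_slug(slug):
    """First query of option 2: resolve the public slug to a company document."""
    for company in companies.values():
        if company["slug"] == slug:
            return company
    return None


def assign_company(user_id, slug):
    """Store the company's _id (not its slug) on the user document."""
    company = find_company_by_slug(slug)
    if company is None:
        raise KeyError(f"no company with slug {slug!r}")
    users[user_id]["companyID"] = company["_id"]
    return users[user_id]


def company_of(user_id):
    """Follow the stored _id back to the company document."""
    return companies.get(users[user_id]["companyID"])


assign_company("u1", "acme")
print(company_of("u1")["name"])
```

The slug appears only at the API boundary; once it has been resolved, everything stored in the user document refers to the company by `_id`.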

Which approach is better? Is there any added benefit to using _id values for relationships?

+5
2 answers

The second approach is better: store the company's _id.

Using _id is the most practical way to query for information; even complex queries can be resolved with _id, since it is a unique ObjectId generated by MongoDB. It also works with population (a Mongoose feature), which is the process of automatically replacing specified paths in a document with documents from other collections. We can populate a single document, multiple documents, a plain object, multiple plain objects, or all objects returned from a query.

+2

Go with the second approach. When deciding which field to use as a join key, several issues must be taken into account (this is true for all databases, not just Mongo):

  • The field must be unique. I'm not sure exactly what the "slug" field is in your schema, but if there is any chance that it might be duplicated, do not use it.
  • The field should not change. Strictly speaking, you can change a key field, but the only safe way to do so is to change it in all child tables simultaneously and atomically. This is difficult to do reliably because: a) you need to know which tables use the field (maybe another developer added a table that you don't know about); b) if you update the tables one at a time, you introduce race conditions; c) if any of the updates fails, you are left with inconsistent data and broken parent-child links. Some SQL databases have a cascading update feature to solve this problem, but Mongo does not. This is a hard enough problem that you really, really do not want to change a key field unless you have to.
  • The field should be indexed. Strictly speaking, this is not a requirement, but if you are going to join on the field, many queries will run against it, so you will want an index on it.
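The "should not change" point can be sketched with a small in-memory example, assuming dicts stand in for documents. The helper names `make_data` and `orphans` are hypothetical. The same slug rename is applied twice: once when children are linked by slug, once when they are linked by `_id`.

```python
def make_data(link_field):
    """Build a company and two child documents linked by the given field."""
    company = {"_id": "c1", "name": "Acme", "slug": "acme"}
    users = [
        {"_id": "u1", "company": company[link_field]},
        {"_id": "u2", "company": company[link_field]},
    ]
    return company, users


def orphans(company, users, link_field):
    """Return the ids of users whose link no longer matches the company."""
    return [u["_id"] for u in users if u["company"] != company[link_field]]


# Linked by slug: renaming the slug breaks every child until each is updated.
company, users = make_data("slug")
company["slug"] = "acme-inc"
broken_by_slug = orphans(company, users, "slug")

# Linked by _id: the same rename touches one document and breaks nothing.
company, users = make_data("_id")
company["slug"] = "acme-inc"
broken_by_id = orphans(company, users, "_id")

print(broken_by_slug, broken_by_id)
```

With the slug as the link, every child document has to be rewritten as part of the rename; with `_id` as the link, the slug is just another attribute of the company.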

For these reasons, it is almost always recommended to use a field that serves solely as a key, with no real-world information stored in it. Many people have been burned by using things like social security numbers, driver's license numbers, etc. as key fields, either because duplicates can occur (for example, SSNs can be duplicated if people use fake numbers, or be missing if people do not have one) or because the numbers can change (for example, driver's license numbers).

In addition, you can choose the format of a dedicated key field to optimize the speed of generating unique values and of indexing. For example, if you use SSNs, you need to check each new SSN against the rest of the database to make sure it is unique. That takes time if you have millions of records. The same goes for slugs, which are text fields that need to be hashed and checked against the index. OTOH, MongoDB's ObjectIds work much like UUIDs, which means they can be generated without first looking up existing records (the algorithm gives a high statistical probability of uniqueness).
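The difference in generation cost can be illustrated with a rough sketch, assuming a set of strings stands in for the existing slugs. `new_random_id` and `unique_slug` are hypothetical helpers; the random id here only imitates the idea of lookup-free generation and is not MongoDB's actual ObjectId algorithm.

```python
import re
import secrets


def new_random_id():
    """Generate a 12-byte random hex id with no lookup against existing data.

    Assumption: this only imitates lookup-free id generation; it is not
    MongoDB's actual ObjectId algorithm.
    """
    return secrets.token_hex(12)


def unique_slug(name, existing_slugs):
    """Derive a slug from a name, checking existing slugs until one is free."""
    base = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    candidate, suffix = base, 2
    while candidate in existing_slugs:  # each iteration is a uniqueness check
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate


taken = {"acme-corp", "acme-corp-2"}
print(new_random_id())
print(unique_slug("Acme Corp", taken))
```

The slug path has to consult existing data before it can return, whereas the random id is produced without reading anything.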

The bottom line is that there are very good reasons not to use a "real" field as your key if you can help it. Luckily for you, MongoDB already gives you an excellent key field that satisfies all of the above criteria: the _id field. So you should use it. Even if the slug is not a "real" field and you generate it just like the _id field, why bother? Why should a record have two unique identifiers?

The second issue in your situation is that you are not exposing the _id field to the user. Intuitively, it seems like valuable information that should not be handed out willy-nilly. But the truth is that it has no informational value in itself because, as noted above, a key should not contain any real-world information. Security belongs in the request handling: make sure that the user making the request has permission to access the record and the specific fields requested. Hiding a key is classic security through obscurity, which does not actually increase security.
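A minimal sketch of putting the check in the request handler rather than relying on a hidden key, assuming an in-memory dict stands in for the companies collection. The `members` field, the `Forbidden` exception, and `get_company` are all hypothetical names introduced for illustration.

```python
# Assumption: each company document lists the ids of users allowed to read it.
companies = {
    "c1": {"_id": "c1", "name": "Acme", "members": {"u1"}},
    "c2": {"_id": "c2", "name": "Globex", "members": {"u2"}},
}


class Forbidden(Exception):
    """Raised when the requester may not read the requested record."""


def get_company(requesting_user_id, company_id):
    """Return a company only if the requester is one of its members."""
    company = companies.get(company_id)
    # Treat "missing" and "not allowed" the same way so that guessing ids
    # reveals nothing about which records exist.
    if company is None or requesting_user_id not in company["members"]:
        raise Forbidden(company_id)
    return company


print(get_company("u1", "c1")["name"])
try:
    get_company("u1", "c2")  # u1 knows the id but is not a member
except Forbidden:
    print("denied")
```

Knowing the id `c2` does not help user `u1`; the permission check decides access, whether the public identifier is an `_id` or a slug.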

The only time to hide your primary key is when you use a poorly designed key that contains useful information. For example, an invoice identifier that increases by 1 for each invoice can be used by someone to work out how many orders you receive in a day. Auto-increment identifiers are also easy to guess (if my invoice is No. 5, can I view invoice No. 6?). Fortunately, Mongo's ObjectIds are not simple sequential numbers, so there is very little information leak (an ObjectId does embed its creation time; beyond that, perhaps timing attacks on the generation algorithm? And if you are worried about that, you need much deeper security considerations than this post :-) ).
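One piece of information an ObjectId does carry is its creation time: the first 4 of its 12 bytes are a big-endian count of seconds since the Unix epoch. A small sketch using only the standard library, where `objectid_created_at` is a hypothetical helper and the id is an arbitrary illustrative value, not one from the question:

```python
from datetime import datetime, timezone


def objectid_created_at(hex_id):
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(hex_id[:8], 16)  # big-endian seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Arbitrary illustrative id (assumption: not taken from the question's data).
print(objectid_created_at("507f1f77bcf86cd799439011").isoformat())
```

Whether a creation timestamp counts as sensitive depends on the application; it is far less revealing than a sequential invoice number, but it is not nothing.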

Look at it another way: if the slug reliably points to a specific company and user, then how is it any more secure than just using the _id?

However, there are a few cases where exposing a secondary key (such as a slug) is useful, none of which are security related. For example, if you later need to migrate database platforms and have to regenerate the keys because the new platform cannot use your old ones; or if users will enter identifiers by hand, in which case it helps to give them something easier to remember, such as a slug. But even in such situations, you can use the slug as a convenient identifier for users while still using the company _id for the actual link in your database (as in your option 2). Check out this discussion about the pros and cons of exposing _ids to users: https://softwareengineering.stackexchange.com/questions/218306/why-not-expose-a-primary-key

So my recommendation would be to go ahead and give the user the company _id (along with the slug, if you want readable text, for example for URLs, although Mongo ids can also be used in a URL). They can send it back to you to get the user, and you can (after the appropriate permission checks) do the join and send back the user data. If you do not want to expose the company _id, I would recommend your option 2, which is essentially the same except that you add an extra query to get the company _id first. IMHO, that is a waste of cycles for no real improvement in security, but if there are other considerations, it is still acceptable. And both of these options are better than using the slug as a primary key.

+2

Source: https://habr.com/ru/post/1274113/

