Why does db.insert(dict) add the _id key to the dict object when using pymongo?

I use pymongo as follows:

    from pymongo import *
    a = {'key1':'value1'}
    db1.collection1.insert(a)
    print a

Will print

 {'_id': ObjectId('53ad61aa06998f07cee687c3'), 'key1': 'value1'} 

on the console. I understand that _id is being added to the mongo document. But why is it also added to the Python dictionary? I did not intend that. What is the purpose of this? I may want to use this dictionary for other purposes as well, yet it gets updated as a side effect of inserting it as a document. If I need to, say, serialize this dictionary to a JSON object, I will get

 ObjectId('53ad610106998f0772adc6cb') is not JSON serializable 

error. Shouldn't the insert function leave the dictionary unchanged when inserting the document into the db?

3 answers

Like many other database systems, Pymongo adds the unique identifier needed to retrieve the data from the database as soon as it is inserted. (What happens if you insert two dictionaries with the same content {'key1':'value1'} into the database? How would you tell the one you want from the other?)

This is explained in the Pymongo docs:

When a document is inserted, the special key "_id" is automatically added if the document does not already contain an "_id" key. The value of "_id" must be unique across the collection.
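The side effect asked about comes from the key being added to the very dict object that was passed in. A minimal sketch of how to keep your own dictionary clean by inserting a copy follows. Note that `fake_insert` is a hypothetical stand-in written for this illustration, not part of Pymongo; it only mimics the "add `_id` in place if missing" behavior so the example runs without a database.

```python
import uuid


def fake_insert(document):
    """Hypothetical stand-in for collection.insert(): adds '_id' in place if missing."""
    if "_id" not in document:
        document["_id"] = uuid.uuid4().hex  # placeholder id, not a real ObjectId
    return document["_id"]


# Passing a shallow copy leaves the caller's dict untouched.
a = {"key1": "value1"}
inserted_id = fake_insert(dict(a))

# Passing the dict itself mutates it, as in the question.
b = {"key1": "value1"}
fake_insert(b)

print(a)           # still has only 'key1'
print(sorted(b))   # now contains both '_id' and 'key1'
```

With a real collection the same idea would presumably be `collection.insert(dict(a))`, or `copy.deepcopy(a)` if the document contains nested dicts or lists that you also want to protect.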

If you want to change this behavior, you can give the document an _id key before inserting it. In my opinion this is a bad idea: it easily leads to collisions, and you lose the useful information stored in a “real” ObjectId, such as the creation time, which is great for sorting and the like.

    >>> a = {'_id': 'hello', 'key1':'value1'}
    >>> collection.insert(a)
    'hello'
    >>> collection.find_one({'_id': 'hello'})
    {u'key1': u'value1', u'_id': u'hello'}
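To make the point about the creation time concrete: the first four bytes of an ObjectId hold the number of seconds since the Unix epoch. The sketch below decodes that value from the hex string shown in the question using only the standard library. The helper name `objectid_creation_time` is made up for this illustration.

```python
from datetime import datetime, timezone


def objectid_creation_time(oid_hex):
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    # The first 8 hex digits (4 bytes) are seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_creation_time("53ad61aa06998f07cee687c3")
print(created.isoformat())
```

When working with actual `ObjectId` instances, the `generation_time` attribute gives the same information without manual decoding.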

Or, if your problem occurs when serializing to JSON, you can use the utilities in the bson module:

    >>> a = {'key1':'value1'}
    >>> collection.insert(a)
    ObjectId('53ad6d59867b2d0d15746b34')
    >>> from bson import json_util, ObjectId
    >>> json_util.dumps(collection.find_one({'_id': ObjectId('53ad6d59867b2d0d15746b34')}))
    '{"key1": "value1", "_id": {"$oid": "53ad6d59867b2d0d15746b34"}}'

(You can check that this is valid JSON on sites like jsonlint.com.)
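If you would rather stay with the standard `json` module, two simple alternatives are to drop the `_id` key before serializing, or to let `json.dumps` fall back to `str()` for types it does not know. The sketch below uses `uuid.UUID` as a stand-in for `ObjectId` (an assumption made only so the example runs without Pymongo installed); both types are rejected by `json.dumps` unless a fallback is supplied.

```python
import json
import uuid

# uuid.UUID stands in for bson.ObjectId here (assumption for illustration):
# json.dumps cannot serialize either one without help.
doc = {
    "_id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "key1": "value1",
}

# Option 1: drop the key before serializing.
without_id = {k: v for k, v in doc.items() if k != "_id"}
plain_json = json.dumps(without_id)

# Option 2: let json call str() on anything it cannot serialize itself.
with_str_id = json.dumps(doc, default=str)

print(plain_json)
print(with_str_id)
```

Unlike `json_util.dumps`, the `default=str` route is one-way: the id comes back as a plain string when the JSON is loaded again, not as an ObjectId.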


The docs answer your question clearly:

MongoDB stores documents on disk in BSON format. BSON is a binary representation of JSON documents, although it contains more data types than JSON.

The field value can be any of the BSON data types, including other documents, arrays, and arrays of documents. The following document contains values of various types:

    var mydoc = {
        _id: ObjectId("5099803df3f4948bd2f98391"),
        name: { first: "Alan", last: "Turing" },
        birth: new Date('Jun 23, 1912'),
        death: new Date('Jun 07, 1954'),
        contribs: [ "Turing machine", "Turing test", "Turingery" ],
        views : NumberLong(1250000)
    }

See the MongoDB documentation to learn more about BSON.


_id acts as the primary key for documents; unlike in SQL databases, it is required in MongoDB.

To make _id serializable, you have 2 options:

  • Set _id to a JSON-serializable type in your documents before inserting them (for example, int or str), but keep in mind that it must be unique for each document.

  • Use custom JSON encoder / decoder classes built on the BSON helpers:

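     import json
     from bson.json_util import default as bson_default
     from bson.json_util import object_hook as bson_object_hook

     class BSONJSONEncoder(json.JSONEncoder):
         # Fall back to the BSON converter for types json cannot handle (e.g. ObjectId)
         def default(self, o):
             return bson_default(o)

     class BSONJSONDecoder(json.JSONDecoder):
         # Turn {"$oid": ...} and similar markers back into BSON types
         def __init__(self, **kwargs):
             json.JSONDecoder.__init__(self, object_hook=bson_object_hook, **kwargs)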
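For the first option above, one way to get a unique, JSON-friendly id is to generate a random string before inserting. The following is a sketch only: the helper `with_string_id` is a hypothetical name, and using `uuid4` is one possible choice of generator, not something the answer prescribes.

```python
import json
import uuid


def with_string_id(document):
    """Return a copy of the document with a random string '_id' (hypothetical helper)."""
    copy = dict(document)
    # Keep an existing '_id' if the caller already chose one.
    copy.setdefault("_id", uuid.uuid4().hex)
    return copy


doc = with_string_id({"key1": "value1"})
encoded = json.dumps(doc)  # works: every value is a plain str
print(encoded)
```

The trade-off is the one mentioned in the first answer: a random string id carries no creation time, so you would need a separate field if you want to sort by insertion date.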

Source: https://habr.com/ru/post/971416/

