What is the best representation of the MongoDB _id field in PostgreSQL?


MongoDB's _id is defined as:

 ObjectId is a 12-byte BSON type, constructed using: a 4-byte value representing the seconds since the Unix epoch, a 3-byte machine identifier, a 2-byte process id, and a 3-byte counter, starting with a random value. 
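To make the layout quoted above concrete, here is a minimal Python sketch that slices a 24-digit hex ObjectId into the four fields it names. The names ObjectIdParts and split_object_id are hypothetical, and reading every numeric field as big-endian is an assumption made for illustration.

```python
from typing import NamedTuple


class ObjectIdParts(NamedTuple):
    timestamp: int  # seconds since the Unix epoch (bytes 0-3)
    machine: bytes  # machine identifier (bytes 4-6)
    pid: int        # process id (bytes 7-8), big-endian assumed
    counter: int    # counter (bytes 9-11)


def split_object_id(hex_id: str) -> ObjectIdParts:
    """Split a hex-encoded ObjectId into the fields of the quoted layout."""
    raw = bytes.fromhex(hex_id)  # raises ValueError on non-hex input
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex digits)")
    return ObjectIdParts(
        timestamp=int.from_bytes(raw[0:4], "big"),
        machine=raw[4:7],
        pid=int.from_bytes(raw[7:9], "big"),
        counter=int.from_bytes(raw[9:12], "big"),
    )


parts = split_object_id("507f1f77bcf86cd799439011")
print(parts.timestamp)  # 1350508407
```

Whatever column type is chosen, it only has to hold these 12 bytes, either directly or as 24 hex characters.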

What would be the most efficient representation of this field in PostgreSQL?

1 answer

I used char(24) with the constraint CHECK (decode(mongo_id::text, 'hex'::text) > '\x30'::bytea). This constraint does not validate that the ObjectId is meaningful, but it does ensure that only values in a valid (hex-decodable) format are stored. Storing the ObjectId as plain text keeps the values easy to read.
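If you also want to reject malformed values before they reach the database, an application-side format check could look like the Python sketch below. The function name is_valid_object_id_text is hypothetical, and the check is a stand-in rather than a translation of the CHECK above: it additionally insists on exactly 24 hex digits.

```python
import re

# Exactly 24 hexadecimal digits, upper or lower case.
_OBJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def is_valid_object_id_text(value: str) -> bool:
    """Return True if value has the textual shape of an ObjectId."""
    return _OBJECT_ID_RE.fullmatch(value) is not None


print(is_valid_object_id_text("507f1f77bcf86cd799439011"))  # True
print(is_valid_object_id_text("not-an-object-id"))          # False
```

Like the constraint, this checks only the format; it says nothing about whether the value was ever issued by MongoDB.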

Another option is to use the bytea type for the column and enter the data as '\xOBJECT_ID', where the \x prefix tells PostgreSQL to read the hex text OBJECT_ID as an array of bytes. This takes up less space than char(24) (which can matter if you have millions of rows), but to read the values in a non-binary format you need something like encode(mongo_id::bytea, 'hex'), which can be burdensome.
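The size difference and the extra conversion step can be seen outside the database too. The Python sketch below mirrors the hex decode/encode round trip with bytes.fromhex and bytes.hex; the helper names to_binary and to_text are hypothetical, and the sizes printed are payload only, ignoring any per-value overhead the database adds.

```python
def to_binary(hex_id: str) -> bytes:
    """Hex text -> raw bytes, the form a bytea column would hold."""
    return bytes.fromhex(hex_id)


def to_text(raw_id: bytes) -> str:
    """Raw bytes -> lowercase hex text, the human-readable form."""
    return raw_id.hex()


hex_id = "507f1f77bcf86cd799439011"
raw_id = to_binary(hex_id)

print(len(hex_id))  # 24 characters as text
print(len(raw_id))  # 12 bytes as binary
print(to_text(raw_id) == hex_id)  # True: the round trip is lossless
```

The binary form halves the payload, at the cost of a conversion every time a person or a text-based tool needs to read the value.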

Also, some platforms, such as Redshift, may have problems with the bytea data type.

If you need easy access to the metadata in the ObjectId, you can parse it and store it separately (for example, in a jsonb column or in a separate column for each relevant attribute). The "created at" part of the metadata is probably the only interesting attribute.
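As a rough illustration of pulling out that creation time before storing it in its own column, the Python sketch below reads the first 4 bytes (8 hex digits) as seconds since the Unix epoch. The function name object_id_created_at is hypothetical, and the sample id is just an example value.

```python
from datetime import datetime, timezone


def object_id_created_at(hex_id: str) -> datetime:
    """Return the creation time embedded in a hex-encoded ObjectId (UTC)."""
    if len(hex_id) != 24:
        raise ValueError("expected 24 hex digits")
    seconds = int(hex_id[:8], 16)  # first 4 bytes: seconds since the epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = object_id_created_at("507f1f77bcf86cd799439011")
print(created.isoformat())  # 2012-10-17T21:13:27+00:00
```

The resulting value could be written to a timestamp column alongside the id, so queries by creation time do not need to decode the ObjectId.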


Source: https://habr.com/ru/post/981829/

