What is the more idiomatic option in Datomic land for this schema?

Question

My question is which schema is more idiomatic for Datomic.

Let's say we have User, Post and Topic entities.

A Post can be owned by a Topic, a User, or another Post (responses). Now, do I:

a) Create an attribute :posts, which is just a list of Posts, and add it to every entity that needs a reference to a number of Posts?

or

b) Establish a more explicit relationship, so that a Post has an attribute :post/author, which is a reference to the User, and possibly an attribute :post/belongs-to, which can refer to either a Topic or another Post?

Remarks: If I do b, I seem to get more semantic relationships. I can, for example, do (:post/_author user-entity), which describes the nature of the relationship better than (:posts user-entity) does (after all, what does it mean that a User has :posts? Are those Posts favorited by the User, Posts written by the User, or what?).

Another side effect of b is that I can create a new Post without modifying any other entity. If I do a, I need to create the Post and also insert it into the User's :posts attribute, which takes two operations instead of one.

However, I have the feeling that a might be the more idiomatic way of doing this. It seems, for example, that it would be easier to see how the :posts list has changed over time, should I want to, if the User references :posts rather than each Post referencing the User via the :post/author attribute.

Which would be preferable and why?


clojure datomic

Henrik Nov 19 '13 at 10:19


2 answers

I think it depends mainly on your access patterns. If you need the related posts every time you access an entity that can embed them, it makes sense to embed them (a). If most of the time you access them separately, then keeping them separate might be better (b).

Or you can do both (c): treat the single Post entity as canonical, and treat the copies embedded in the various entities as cached versions. You will then need a script or batch job that updates the embedded posts every time the canonical version is updated. This makes reading easier, since the information is always present, but writing harder, because you need to synchronize the copies yourself. This pattern can also only be used if you can accept some inconsistency between the canonical version and the embedded ones, and if the resynchronization delay is not critical for you.

Note: this advice is not specific to Datomic. These are techniques borrowed from the NoSQL world, and I am by no means a specialist.


DjebbZ Nov 21 '13 at 9:22


Alex Stoddard · Accepted Answer · 2013-12-30T18:13:45+0000

Your option (b) is essentially the idiomatic, and the only, way to model the data.

All Datomic schema is encoded only as the values an attribute can take in the entity-attribute-value (EAV) structure.

See http://docs.datomic.com/schema.html for a key sentence taken from the docs:

Each Datomic database has a schema that describes the set of attributes that can be associated with entities. A schema only defines the characteristics of the attributes themselves. It does not define which attributes can be associated with which entities.

Entities themselves are very abstract (internally they are just numbers); all the interesting properties of entities are encoded as attribute assertions. Entities are not typed! You create the semantics of an entity through the attributes you assert for it, e.g. :user/firstname, :post/title, :post/content, :topic/description, etc. This is why you really want your attributes to be namespaced.

A special case of this is the :db.type/ref attribute type, where the value "V" in EAV is itself another entity. This is what creates semantic associations between entities. You give each attribute a "name" (its :db/ident) to capture what the E ↔ E relationship actually means. So you might have a :db.type/ref attribute with the :db/ident :post/author.
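As a rough sketch, the schema for such reference attributes could be written as the transaction data below. The attribute names :post/author and :post/belongs-to come from the question; :post/title, the cardinalities and the doc strings are assumptions made for illustration. Older Datomic releases also required :db/id and :db.install/_attribute entries on each attribute map, which are left out here.

```clojure
;; Schema transaction data (EDN). Note that nothing here says "a Post has an
;; author": the schema only describes the attributes themselves.
[{:db/ident       :post/title          ; assumed attribute, for illustration
  :db/valueType   :db.type/string
  :db/cardinality :db.cardinality/one}

 {:db/ident       :post/author         ; V is another entity (a user)
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one
  :db/doc         "The user who wrote this post"}

 {:db/ident       :post/belongs-to     ; V may be a topic or another post
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one
  :db/doc         "The topic or parent post this post was made in"}]
```

Because a ref attribute does not constrain what kind of entity it points to, the same :post/belongs-to attribute can target either a topic or a post.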

Note that all :db.type/ref attributes are inherently bidirectional. So if Eu is an entity representing a user and Ep is an entity representing a post, then the following are equivalent in Datomic assertions and queries:

 [Ep :post/author Eu]
 [Eu :post/_author Ep]
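One way to see this in query form: a single data pattern serves both directions, depending on which variable is bound as input. The two Datalog queries below are a sketch that assumes a :post/author ref attribute already exists in the schema.

```clojure
;; From a post to its author: ?post is the input, ?user is found.
[:find ?user
 :in $ ?post
 :where [?post :post/author ?user]]

;; From a user to their posts: the same clause, with ?user as the input.
[:find ?post
 :in $ ?user
 :where [?post :post/author ?user]]
```

No second attribute and no extra bookkeeping on the user entity is needed to walk the relationship backwards.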

Having all relationships between entities be just more attribute assertions is really flexible. If you later want to add the concept of favorite posts, this is just another :db.type/ref attribute. Create it with a :db/ident such as :user/favorites, and assert the relationships between existing users and posts (which have different user entities as their authors).

 [aUser :user/favorites somePost]
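Spelled out as transaction data, this might look like the sketch below. The cardinality is an assumption (a user presumably has many favorites), and the two long numbers are placeholder entity ids, not real ones.

```clojure
;; First transaction: install the new attribute.
[{:db/ident       :user/favorites
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/many}]

;; A later transaction: assert that an existing user favorites an existing post.
;; 17592186045418 (user) and 17592186045420 (post) are placeholder ids.
[[:db/add 17592186045418 :user/favorites 17592186045420]]
```

Existing users and posts are untouched by the schema change; only the new assertions are added.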

There is no concept of attributes bound up in a collection, so what you propose in (a) is not properly expressible in Datomic. You must use a query to aggregate posts. Deleting a post would be modeled by retracting the entity itself. Such a retracted post would still be visible in the history of the database.
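A minimal sketch of both ideas, assuming a :post/author ref attribute: the first form is a Datalog query that aggregates posts per user, and the second is transaction data that retracts a post. The entity id is a placeholder.

```clojure
;; Query: number of posts per author, computed at query time rather than
;; stored as a list on the user.
[:find ?user (count ?post)
 :where [?post :post/author ?user]]

;; Transaction data: retract every attribute of one post.
;; 17592186045420 is a placeholder entity id.
[[:db.fn/retractEntity 17592186045420]]
```

Newer Datomic versions also accept :db/retractEntity as the name of the same built-in function.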

That leaves the problem of how to specify an order for lists of entities. You need to use either a "natural" ordering, such as by post date (captured either in the asserting transaction or as an explicit attribute of the post), or an explicit ordering by attribute, e.g. by a numeric :post/up-votes attribute.
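Both kinds of ordering can be sketched as queries that return the sort key next to each post. The attribute name :post/up-votes is an assumed numeric attribute used for illustration; :db/txInstant is the timestamp Datomic records on every transaction.

```clojure
;; "Natural" order: return the instant of the transaction that asserted
;; the post's author, i.e. roughly when the post was created.
[:find ?post ?when
 :in $ ?user
 :where [?post :post/author ?user ?tx]
        [?tx :db/txInstant ?when]]

;; Explicit order: return a numeric attribute to sort on.
;; Posts with no :post/up-votes value are not matched by this query.
[:find ?post ?votes
 :in $ ?user
 :where [?post :post/author ?user]
        [?post :post/up-votes ?votes]]
```

Query results are an unordered set of tuples, so the actual sorting happens in application code, for example with Clojure's sort-by on the second element of each tuple.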

If you need a semantic grouping of entities where the "sub-entities" are meaningful only, and exist only, as part of something larger (for example, line-item entities on an order), then see Datomic components.
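For the order example, a component attribute could be declared roughly as follows. The name :order/line-items is hypothetical; :db/isComponent is the schema flag that marks the referenced entities as parts of the referring one.

```clojure
;; Schema transaction data: line items exist only as part of their order.
[{:db/ident       :order/line-items   ; hypothetical attribute name
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/many
  :db/isComponent true}]
```

With this flag set, retracting an order with the built-in retractEntity function also retracts its line items.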
