ArangoDB is a document database as well as a graph database? How is this possible?

Can someone explain how a document database can also work as a graph database?

What is the difference between ArangoDB and Neo4j?

1 answer

Disclaimer: I am Max from ArangoDB, one of the main developers.

First of all, a longer discussion of this and other related questions can be found in my article "Graphs in data modeling - is the emperor naked?", but I will try to answer both questions briefly.

(1) Storing a graph in a document store is relatively straightforward (as it is in a relational database). For example, you can simply store a document for each vertex in a "vertex collection" and a document for each edge in an "edge collection". You only have to make sure that each edge records which vertex it comes from and which vertex it goes to. In ArangoDB, we use the _from and _to attributes in the edge document for this.
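As a minimal sketch of this layout, assume plain Python dictionaries stand in for documents. The collection names (persons, knows), the sample data, and the helper dangling_edges are made up for illustration; only the _from and _to attributes come from the description above, and this is not ArangoDB's actual API:

```python
# Hypothetical in-memory stand-in for a "vertex collection":
# one document per vertex, keyed by an id of the form "collection/key".
vertices = {
    "persons/alice": {"_id": "persons/alice", "name": "Alice", "age": 31},
    "persons/bob": {"_id": "persons/bob", "name": "Bob", "age": 25},
    "persons/carol": {"_id": "persons/carol", "name": "Carol", "age": 40},
}

# Hypothetical stand-in for an "edge collection" called knows:
# one document per edge, recording where it starts (_from) and ends (_to).
# Edges are ordinary documents, so they can carry extra attributes too.
edges = [
    {"_from": "persons/alice", "_to": "persons/bob", "since": 2015},
    {"_from": "persons/bob", "_to": "persons/carol", "since": 2019},
]


def dangling_edges(vertices, edges):
    """Return the edges whose _from or _to does not name a stored vertex."""
    return [
        e for e in edges
        if e["_from"] not in vertices or e["_to"] not in vertices
    ]
```

The check in dangling_edges is the "make sure each edge stores its endpoints" requirement expressed as code: every edge document must point at two existing vertex documents.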

However, the crucial capability of a graph database is that it can answer graph queries efficiently. Typical graph queries are (a) "what are the neighbours of a vertex in the graph?", (b) "what is the shortest path from vertex A to vertex B in the graph?" or (c) "give me all the vertices that I can reach from vertex A by following edges". Whereas (a) only needs a good index on the edge collection, (b) and (c) involve an a priori unknown number of steps in the graph. Therefore, (b) and (c) cannot be done efficiently with traditional database query languages such as SQL, simply because they would involve a lot of communication between client and server, or at least a very complicated expression with a variable number of joins. I call queries like (b) and (c) "graphy", without rigorously defining the term.
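The difference between the three query types can be sketched in a few lines of Python. This is an illustration under assumptions, not how ArangoDB implements it: the edge list, the function names, and the use of a dictionary as the "index" are all hypothetical. Query (a) is a single index lookup, while (b) and (c) loop for a number of steps that is not known in advance:

```python
from collections import defaultdict, deque

# Hypothetical edge documents; only _from and _to matter here.
edges = [
    {"_from": "a", "_to": "b"},
    {"_from": "b", "_to": "c"},
    {"_from": "c", "_to": "d"},
    {"_from": "a", "_to": "c"},
]


def build_out_index(edges):
    """Index on _from: maps each vertex to the vertices its edges point to."""
    index = defaultdict(list)
    for e in edges:
        index[e["_from"]].append(e["_to"])
    return index


def neighbours(index, vertex):
    """Query (a): one index lookup, no matter how large the graph is."""
    return list(index.get(vertex, []))


def shortest_path(index, start, goal):
    """Query (b): breadth-first search; the number of steps is unknown upfront."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in index.get(current, []):
            if nxt not in parent:
                parent[nxt] = current
                queue.append(nxt)
    return None  # goal is not reachable from start


def reachable(index, start):
    """Query (c): every vertex reachable from start (start itself excluded)."""
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in index.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


index = build_out_index(edges)
```

If loops like these run on the client, each iteration costs a round trip to the server, which is the communication overhead described above. Running them inside the database server avoids that.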

Therefore, my short answer to the question "How can a document store be a graph database?" is: store the graph as described above and implement graphy queries in the database server, accessible from within the query language of the data store. Essentially, the same could be done with a relational database and some considerable extensions to SQL.

With ArangoDB, we have managed to combine document, graph, and key/value features into a single, coherent query language. Therefore, we call ArangoDB a "multi-model database", because it combines these three data models seamlessly. You can even mix the data models in a single query!
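To make "mixing data models in one query" concrete, here is a small Python sketch. It is only an analogy and does not show ArangoDB's query language: the data, the attribute names, and the function friends_older_than are all hypothetical. The point is that a single request combines a lookup by key, a one-hop graph step, and a filter on document attributes:

```python
# Hypothetical vertex documents, keyed by id.
vertices = {
    "persons/alice": {"_id": "persons/alice", "name": "Alice", "age": 31},
    "persons/bob": {"_id": "persons/bob", "name": "Bob", "age": 25},
    "persons/carol": {"_id": "persons/carol", "name": "Carol", "age": 40},
}

# Hypothetical edge documents.
edges = [
    {"_from": "persons/alice", "_to": "persons/bob"},
    {"_from": "persons/alice", "_to": "persons/carol"},
]


def friends_older_than(vertices, edges, start_key, min_age):
    """Names of direct neighbours of start_key whose age exceeds min_age."""
    # Key/value model: fetch one document directly by its key.
    start = vertices[start_key]
    # Graph model: follow the outgoing edges of that vertex (one hop).
    friend_ids = [e["_to"] for e in edges if e["_from"] == start["_id"]]
    # Document model: filter the neighbours on an ordinary attribute.
    return sorted(
        vertices[f]["name"] for f in friend_ids if vertices[f]["age"] > min_age
    )
```

For example, friends_older_than(vertices, edges, "persons/alice", 30) returns only Carol, because Bob fails the age filter.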

This leads to my answer to question (2), which is obviously a bit biased:

In contrast to ArangoDB, which is a distributed multi-model database in the above sense, Neo4j is a classical graph database. It stores graphs, lets you query them with "graphy queries", and has a storage and query engine that is optimized for this. Neo4j is particularly good at path pattern matching with its built-in query language, Cypher. It lets you attach properties to vertices and edges, but it is not a full-featured document store. It is not optimized for processing document queries that use multiple secondary indexes, and it does not do joins. Furthermore, Neo4j is not sharded.

Neo4j is written in Java; ArangoDB is written in C++ and embeds Google's V8 to run JavaScript extensions.

For a performance comparison, see this post.


Source: https://habr.com/ru/post/989149/

