Performance-oriented design of a graph-based (key/value) database

I am starting to design a graph-based (or key/value) database library for C++, which many here will find similar to projects such as http://neo4j.org/ .

Since this is a very early design phase, my requirements are simple, loosely specified and (I admit) probably still quite naive:

  • Directed acyclic graph
    • A tree-like structure with several roots and many leaves.
    • Branches may contain links to other branches.
    • But no cycles
    • The graph is represented as key/value pairs, where keys and values are mostly simple types (integers), but some may refer to more complex types such as strings.
  • Queries
    • Simple queries generally return edges, i.e. which edges starting from this root match a given key, value, or key/value tuple?
    • Queries over chains of keys (key, key, key, value)
  • Access patterns and performance
    • The emphasis is on fast lookup.
    • Edges are added.
    • But edges and nodes are never removed from the graph. That is, the graph will grow, but it will never shrink.
    • An optimization pass may be run over the graph to improve its memory layout for cache use.
    • The graph is about 1 MB to 2 GB in size and should, for the most part, fit in main memory.
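
To make the requirements above concrete, here is a minimal C++ sketch of an append-only directed acyclic graph with integer keys and values on its edges. All names (`Dag`, `Edge`, `add_edge`, `edges_with_key`, `follow`) are hypothetical and chosen only for illustration. It assumes nodes are dense indices and that an edge which would close a cycle is simply rejected.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical names throughout; nothing here comes from an existing library.
struct Edge {
    std::uint32_t key;     // integer key labelling the edge
    std::uint32_t value;   // integer value attached to the edge
    std::uint32_t target;  // index of the destination node
};

class Dag {
public:
    // Nodes are dense indices; they are only ever added, never removed.
    std::uint32_t add_node() {
        out_.emplace_back();
        return static_cast<std::uint32_t>(out_.size() - 1);
    }

    // Adds from -> to unless that would close a cycle. The reachability
    // check is a depth-first search, so it costs O(v + e) in the worst case.
    bool add_edge(std::uint32_t from, std::uint32_t to,
                  std::uint32_t key, std::uint32_t value) {
        if (from >= out_.size() || to >= out_.size()) return false;
        if (reaches(to, from)) return false;  // 'from' reachable from 'to': cycle
        out_[from].push_back(Edge{key, value, to});
        return true;
    }

    // Simple query: edges leaving 'node' whose key matches.
    std::vector<Edge> edges_with_key(std::uint32_t node, std::uint32_t key) const {
        std::vector<Edge> result;
        if (node >= out_.size()) return result;
        for (const Edge& e : out_[node])
            if (e.key == key) result.push_back(e);
        return result;
    }

    // Key-chain query: follow keys[0], keys[1], ... from 'root' and return
    // every node reached after the last key (duplicates are not removed).
    std::vector<std::uint32_t> follow(std::uint32_t root,
                                      const std::vector<std::uint32_t>& keys) const {
        std::vector<std::uint32_t> frontier{root};
        for (std::uint32_t key : keys) {
            std::vector<std::uint32_t> next;
            for (std::uint32_t node : frontier)
                for (const Edge& e : edges_with_key(node, key))
                    next.push_back(e.target);
            frontier.swap(next);
        }
        return frontier;
    }

private:
    // True if 'goal' can be reached from 'start' (including start == goal).
    bool reaches(std::uint32_t start, std::uint32_t goal) const {
        std::vector<bool> seen(out_.size(), false);
        std::vector<std::uint32_t> stack{start};
        while (!stack.empty()) {
            std::uint32_t n = stack.back();
            stack.pop_back();
            if (n == goal) return true;
            if (seen[n]) continue;
            seen[n] = true;
            for (const Edge& e : out_[n]) stack.push_back(e.target);
        }
        return false;
    }

    std::vector<std::vector<Edge>> out_;  // adjacency list per node
};
```

A linear scan per node is only reasonable while out-degrees stay small. Sorting each node's edges by key, or compacting all edges into one contiguous array during the optimization pass, would be natural next steps for lookup speed and cache behaviour.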

Given these rough requirements as the problem, what would be your main concerns regarding:

  • Memory storage: layout, allocation
    • e.g. a fixed-size block pool?
    • Allocating memory according to a clustering algorithm
  • Fast queries
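
As an illustration of the fixed-size block pool idea, here is a minimal C++ sketch. The class name `BlockPool` and its interface are hypothetical. It assumes the stored type is default-constructible and copy-assignable, and it relies on the stated property that the graph never shrinks, so there is no free list and allocation only moves forward.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical pool of fixed-size blocks. Because the graph never shrinks,
// there is no free list: allocation only bumps a cursor forward.
template <typename T, std::size_t BlockSize = 4096>
class BlockPool {
public:
    // Stores a copy of 'item' and returns its index, which stays valid forever.
    std::size_t push(const T& item) {
        if (size_ % BlockSize == 0)  // current block is full (or no block yet)
            blocks_.push_back(std::make_unique<T[]>(BlockSize));
        blocks_[size_ / BlockSize][size_ % BlockSize] = item;
        return size_++;
    }

    T& operator[](std::size_t i) { return blocks_[i / BlockSize][i % BlockSize]; }
    const T& operator[](std::size_t i) const {
        return blocks_[i / BlockSize][i % BlockSize];
    }

    std::size_t size() const { return size_; }
    std::size_t block_count() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<T[]>> blocks_;  // each block holds BlockSize items
    std::size_t size_ = 0;                      // number of items stored so far
};
```

Unlike a single growing `std::vector<T>`, existing items are never moved when the pool grows, so pointers and indices into it remain stable. The cost is one extra indirection per access. Items appended together land in the same block, which is where a clustering strategy could decide the insertion order.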



[...] O(v + e) (v is the number of vertices, e is the number of edges) [...]

[...]: CheapInsert and ExpensiveInsert. [...]


Source: https://habr.com/ru/post/1716740/

