Are there versions of C ++ STL associative data structures optimized for multiple partial copies?

I have a big tree that grows as my algorithm advances. Each node contains a collection, which I suppose is implemented as a balanced binary search tree. Each node set must remain fixed after creating the node before using it when creating this node children.

I am afraid, however, that copying each set is overly expensive. Instead, I would prefer that each newly created node set use all the relevant parts of the parent node set. In short, I am happy to copy O (log n) from the set, but not O (n).

Are there any variants of STL associative data structures that offer such partial copy optimization? Perhaps in Boost? Such a data structure would be trivial to implement in Haskell or OCaML, of course, but this would require more effort in C ++.

+6
source share
3 answers

I know that it is usually not recommended to propose a different language, but the standard Haskell container libraries do just that. I remember seeing a video (was it Simon Peyton Jones?), Speaking about this exact problem, and how the Haskell solution turned out to be much faster than the C ++ solution for a given programmer effort. Of course, this was for a specific problem, which had many sets with a lot of common elements.

There are quite a few studies in this area. If you are looking for keywords, I suggest looking for “functional data structures” instead of “immutable data structures” since most functional paradigms benefit from immutability in general. Structures such as the finger tree have been developed to solve this problem.

I do not know the C ++ library that implements these data structures. Nothing prevents you from reading the relevant documents (or the Haskell source code, which is about 1 thousand lines for Data.Set , including tests) and implementing it yourself, but I know that this is not what you would like to hear. You will also need reference counting for shared nodes, which for such deep structures can have higher overheads than even simple garbage collectors.

+1
source

This is practically impossible in C ++, since the concept of an immutable container does not exist. You may know that you will not make any changes, and that some kind of joint presentation would be preferable, but there is no compiler and library, and there is no way to report this to them.

0
source

Each node contains a collection, which I suppose is implemented as a balanced binary search tree. Each node set must remain fixed after this node creation before using it when creating this node children.

This is a rather unique case. I would recommend using std :: vector instead. (No!) The code that creates the node can still use set and switches to vector at the last second. However, vector smaller and takes up only a small number of memory allocations (one if you use reserve ), which makes the algorithm much faster.

 typedef unsigned int treekeytype; typedef std::vector<unsigned int> minortreetype; typedef std::pair<treekeytype, minortreetype> majornode typedef std::set<treekeytype, minortreetype> majortype; majortype majortree; void func(majortype::iterator perform) { std::set<unsigned int> results; results.assign(perform->second.begin(), perform->second.end()); majortree[perform->first+1].assign(results.begin(), results.end()); //the only change is here majortype::iterator next = majortree.find(perform->first+1); func(next); } 

You can even use std::lower_bound and std::upper_bound to still receive O (log (n)) memory images, since they are still sorted the same as the set, so you won’t lose efficiency. This is a net profit if you do not need to insert / delete often.

I am afraid, however, that copying each set is overly expensive.

If this fear is caused by the fact that each set contains basically the same nodes as its parents, and the data is expensive (for copying or in memory, depending on what has changed), only with the change of several nodes, so that instead of the subtree contained std::shared_pointers the data itself. This means that the data itself will not be copied, only pointers.

I understand that this is not what you were striving for, but, as James Kanjie said, I do not know such a container. Besides the possibly weird and dangerous use of the STL rope class. Note what I said and meant STL, not the standard C ++ library. They are different.

0
source

Source: https://habr.com/ru/post/898603/


All Articles