C++ custom allocator that uses an underlying memory pool

I use a memory pool class that reuses allocated memory addresses, and a custom allocator that wraps that class. The following code snippet gives you the basic idea of the interface.

    template <class alloc>
    class memory_pool : boost::noncopyable, public allocator_traits<void>
    {
    public:
        memory_pool(typename alloc::size_type alloc_size);
        memory_pool(typename alloc::size_type alloc_size, alloc const&);
        template <typename U>
        memory_pool(typename alloc::size_type alloc_size,
                    typename alloc::template rebind<U>::other const&);
        virtual ~memory_pool();

        pointer allocate  ();               /* throw(std::bad_alloc) */
        void    collect   ();
        void    deallocate(pointer) throw(); /* noexcept */
    };

    pointer allocate()
    {
        /* Checks whether a suitable chunk of memory is available in an
           internal linked list. If so, that chunk is returned and the next
           chunk moves up. Otherwise, new memory is allocated by the
           underlying allocator. */
    }

    void deallocate(pointer)
    {
        /* Interprets the passed pointer as a chunk of memory and stores it
           in a linked list. Note that the memory isn't actually
           deallocated. */
    }

    void collect()
    {
        /* Actually deallocates the chunks held in the linked list. This is
           called at least once, during destruction. */
    }
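To make the free-list behavior described in the comments concrete, here is a minimal, self-contained sketch (the name `simple_pool` and all details are hypothetical, it handles a single fixed chunk size, and unlike the real class it can only collect chunks that were handed back):

```cpp
#include <cstdlib>
#include <new>

// Minimal fixed-size pool: freed chunks are linked together and reused.
class simple_pool {
    struct node { node* next; };
    node*       free_list_ = nullptr; // chunks available for reuse
    std::size_t chunk_size_;          // size of every allocation
public:
    explicit simple_pool(std::size_t chunk_size)
        : chunk_size_(chunk_size < sizeof(node) ? sizeof(node) : chunk_size) {}
    ~simple_pool() { collect(); }

    void* allocate() {
        if (free_list_) {                   // reuse a previously freed chunk
            node* n = free_list_;
            free_list_ = n->next;
            return n;
        }
        void* p = std::malloc(chunk_size_); // fall back to the real heap
        if (!p) throw std::bad_alloc();
        return p;
    }
    void deallocate(void* p) noexcept {     // push onto the free list; no real free
        node* n = static_cast<node*>(p);
        n->next = free_list_;
        free_list_ = n;
    }
    void collect() noexcept {               // actually release everything cached
        while (free_list_) {
            node* n = free_list_;
            free_list_ = n->next;
            std::free(n);
        }
    }
};
```

The free list is threaded through the freed chunks themselves, so the bookkeeping costs no extra memory per chunk.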

Of course, the need for something like this is limited. However, it is very useful in situations where:

- The default allocator for the type is very naive (for example, it avoids allocating large chunks even when that would be appropriate).
- You allocate and free the same memory size many times.
- The type you want to use the allocator with is very small (for example, built-in types such as char, short, int, etc.).

In theory, an implementation of memory_pool can allocate a multiple of the actual allocation size each time it needs more memory (from the underlying memory manager). Objects that sit close together are friendlier to any cache and/or prefetching algorithm. I implemented such a memory pool, with some overhead to handle correct allocation, splitting and deallocation (we cannot free every address the user passes to deallocate; we must free only the addresses that mark the beginning of each memory block we previously allocated).
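The block-carving idea can be sketched as follows (a minimal illustration with hypothetical names, not the actual implementation; note that only the recorded block starts are ever passed to `std::free`):

```cpp
#include <cstdlib>
#include <new>
#include <vector>

// Sketch: allocate a multiple of the chunk size at once, carve the block
// into chunks, and remember only the block starts -- those are the only
// addresses that may legally be released with std::free later.
class block_pool {
    std::vector<void*> blocks_;      // starts of the big blocks we malloc'ed
    std::vector<void*> free_chunks_; // individual chunks ready to hand out
    std::size_t chunk_size_;
    std::size_t chunks_per_block_;
public:
    explicit block_pool(std::size_t chunk_size,
                        std::size_t chunks_per_block = 64)
        : chunk_size_(chunk_size), chunks_per_block_(chunks_per_block) {}
    ~block_pool() { for (void* b : blocks_) std::free(b); }

    void* allocate() {
        if (free_chunks_.empty()) {
            char* block = static_cast<char*>(
                std::malloc(chunk_size_ * chunks_per_block_));
            if (!block) throw std::bad_alloc();
            blocks_.push_back(block); // remember the only freeable address
            for (std::size_t i = 0; i < chunks_per_block_; ++i)
                free_chunks_.push_back(block + i * chunk_size_);
        }
        void* p = free_chunks_.back();
        free_chunks_.pop_back();
        return p;
    }
    void deallocate(void* p) noexcept { free_chunks_.push_back(p); }
};
```

Consecutive allocations come out of the same block, which is what gives the cache/prefetch locality mentioned above.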

I tested both cases with the following really simple code:

    std::list<int, allocator<int>> list;

    std::clock_t t = std::clock();
    for (int i = 0; i < 1 << 16; ++i)
    {
        for (int j = 0; j < 1 << 16; ++j)
            list.push_back(j);
        list.unique();
        for (int j = 0; j < 1 << 16; ++j)
            list.pop_back();
    }
    std::cout << (std::clock() - t) / CLOCKS_PER_SEC << std::endl;

std::list calls allocator::allocate(1, 0) every time push_back is called. unique() makes sure that every element is touched and compared with the next one. However, the result was disappointing: the minimum overhead needed to manage the pool of chunks is greater than any possible gain the system provides.

Can you think of a scenario in which this will improve performance?

EDIT: Of course, this is much faster than std::allocator.

2 answers

C++0x has better support for scoped allocators such as memory pools.

Profile your code. It is very difficult to predict what benefit this will bring unless your algorithm follows very regular allocation/deallocation patterns, such as LIFO.

It is very easy to write a fast allocator when all allocated objects are the same size. I once wrote something along the lines of

    template <size_t ObjectSize>
    class fixed_allocator
    {
        // ...
    };

    template <typename T>
    class allocator : public fixed_allocator<sizeof(T)>
    {
        // ...
    };

Before you can design an allocator, you need to be sure of what will be allocated and how. The answers for operator new are "anything" and "anyhow", which is why it is sometimes suboptimal. If you cannot answer these questions properly, your allocator probably won't be a big improvement.
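One way to flesh out the size-based dispatch above is to route every allocation of a given size through one shared free list, so that distinct types of the same size reuse each other's chunks. This is a hedged sketch under C++11 minimal-allocator rules, with hypothetical names (`size_pool`, `pool_allocator`), not the answerer's actual code:

```cpp
#include <cstddef>
#include <list>
#include <new>

// One free list per object size; all same-sized types share it.
template <std::size_t ObjectSize>
class size_pool {
    struct node { node* next; };
    static node*& head() { static node* h = nullptr; return h; }
public:
    static void* take() {
        if (node* n = head()) { head() = n->next; return n; } // reuse
        return ::operator new(ObjectSize < sizeof(node) ? sizeof(node)
                                                        : ObjectSize);
    }
    static void give(void* p) noexcept {  // cache instead of freeing
        node* n = static_cast<node*>(p);
        n->next = head();
        head() = n;
    }
};

// Minimal C++11 allocator that forwards single-object requests to the pool.
template <typename T>
struct pool_allocator {
    using value_type = T;
    pool_allocator() = default;
    template <typename U> pool_allocator(const pool_allocator<U>&) {}

    T* allocate(std::size_t n) {
        if (n != 1) return static_cast<T*>(::operator new(n * sizeof(T)));
        return static_cast<T*>(size_pool<sizeof(T)>::take());
    }
    void deallocate(T* p, std::size_t n) noexcept {
        if (n != 1) { ::operator delete(p); return; }
        size_pool<sizeof(T)>::give(p);
    }
};
template <typename T, typename U>
bool operator==(const pool_allocator<T>&, const pool_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const pool_allocator<T>&, const pool_allocator<U>&) { return false; }
```

A container such as `std::list<int, pool_allocator<int>>` rebinds this to its node type, so all node allocations of that size go through the shared free list. (The static free list is never released here; a real version would add a collect step.)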


Can you think of a scenario in which this will improve performance?

Something that performs a lot of allocations and deallocations (10k+ per second) would benefit from not having to go to the general-purpose heap on every single allocate/free. However, that only pays off if the combined saving from deferring allocs/frees into groups is larger than the cost of processing a group (basically, you have to amortize the group-handling overhead across the members of the group).

Using contiguous memory helps any node-/pointer-based structure, such as trees (but only up to a point). However, the real-world gains can be completely different from what you planned (or not exist at all!). When you go down the path of building systems like this yourself, you should profile your code and already have an idea of how it will be used (i.e. there is no point in writing a custom pool allocator for something that makes so few allocations that the gain doesn't matter at all).

Something like this can be handy for debugging, since you get a nice interface for tagging memory to track leaks and overwrites/invalid writes. So even if it performs the same as the standard system, you can gain in other ways.
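As an illustration of that debugging angle (purely a sketch, not anyone's actual implementation): because every request funnels through one interface, it is cheap to tag live pointers and catch leaks and double/invalid frees.

```cpp
#include <cstdlib>
#include <unordered_set>

// Tracks every pointer handed out; catches double frees and counts leaks.
class tracking_pool {
    std::unordered_set<void*> live_; // pointers handed out and not yet freed
public:
    void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        if (p) live_.insert(p);      // tag the allocation
        return p;
    }
    bool deallocate(void* p) {       // returns false on double/invalid free
        if (live_.erase(p) == 0) return false;
        std::free(p);
        return true;
    }
    std::size_t leaked() const { return live_.size(); } // still-live count
};
```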


Source: https://habr.com/ru/post/892135/
