Suggestions for Improving Distributor Algorithm Implementation

Question

Suggestions for Improving Distributor Algorithm Implementation

I have a Visual Studio 2008 C ++ application where I use a custom allocator for standard containers, so their memory comes from a memory mapped file, not from a heap. This valve is used for 4 different use cases:

Fixed-size 104-byte structure std::vector< SomeType, MyAllocator< SomeType > > foo;
200-byte fixed-size structure
304 byte fixed size structure.
n-byte lines std::basic_string< char, std::char_traits< char >, MyAllocator< char > > strn;

I need to be able to allocate approximately 32 MB for each of them.

The allocator monitors memory usage with std::map pointers to allocation size. typedef std::map< void*, size_t > SuperBlock; Each SuperBlock represents 4 MB of memory.

In case there is not enough space for one superblock, there is std::vector< SuperBlock > .

The algorithm used for the dispenser is as follows:

For each SuperBlock: is there a place at the end of the SuperBlock? put there distribution. (Quickly)
If not, search within each SuperBlock for enough free space and place the selection there. (Slow)
Nothing yet? select another SuperBlock and place the selection at the beginning of the new SuperBlock.

Unfortunately, step 2 can become VERY slow after a while. As I make copies of objects and destroy temporary variables, I get a lot of fragmentation. This causes a lot of deep searching in the memory structure. Fragmentation occurs because I have a limited amount of memory to work with (see note below)

Can anyone suggest improvements to this algorithm that will speed up the process? Do I need two separate algorithms (1 for fixed-size distributions and one for row distributor)?

Note. For those who need a reason: I use this algorithm on Windows Mobile, where the limit is 32 MB for the heap. This way a regular std::allocator will not trim it. I need to allocate a 1 GB allocation to a large area of memory in order to have enough space and what it does.

+6

c ++ algorithm allocator

Paulh May 17, '11 at 16:39

source share

6 answers

Do you have a separate memory allocation pool for each type of fixed size you use? Thus, there will be no fragmentation, since the selected objects will always be aligned along n-byte boundaries. Of course, this does not help for variable-length strings.

Here is an example of placing small objects in Alexandrescu A modern C ++ design that illustrates this principle and may give you some ideas.

+6

Tim martin May 17, '11 at 16:49

source share

Based on Tim's answer, I personally would use something similar to BiBOP.

The basic idea is simple: use fixed-sized pools.

There are some clarifications.

Firstly, the size of the pools is usually fixed. It depends on your distribution procedure, as a rule, if you know which OS you are running on the card at least 4 KB at the same time, when you use malloc, then you use this value. For a memory mapped file, you can enlarge it.

The advantage of fixed-size pools is that it fights fragmentation very well. All pages are the same size, you can easily recycle a blank page with 256 bytes per 128-byte page.

There is still fragmentation for large objects that are usually allocated outside this system. But this is low, especially if you put large objects in several page sizes, so that the memory will be easily recyclable.

Secondly, how to handle pools? Using linked lists.

Pages are usually untyped (by themselves), so you have a free list of pages for preparing new pages and placing "recycled" pages.

For each size category, you have a list of "busy" pages in which memory is allocated. For each page you save:

placement size (for this page)
number of selected objects (for checking emptiness)
pointer to the first free cell
pointer to the previous and next page (may point to the "head" of the list)

Each free cell in itself is a pointer (or index, depending on your size) with the next free cell.

The list of "busy" pages of a given size is managed simply:

when deleting: if you clear the page, remove it from the list and drag it to the processed pages, otherwise update the list of free cells of this page (note: finding the beginning of the current page is usually a simple module operation at the address)
on insertion: search starting from the chapter, as soon as you find the page is incomplete, move it in front of the list (if it is not already specified) and insert your element

This layout is really memory efficient, and only one page is for indexing.

For applications with multi-threaded / multi-processor processes, you will need to add synchronization (usually a mutex per page) if you can get inspiration from Google tcmalloc (try to find another page instead of blocking, use the -local cache stream to remember which page you used last time).

Having said that, have you tried Boost.Interprocess? It provides dispensers.

+2

Matthieu M. May 17, '11 at 17:31

source share

For fixed sizes, you can easily use a small memory allocator, such as a allocator, where you allocate a large block that breaks into pieces of a fixed size. Then you create a vector of pointers to the available chunks and pop / push when you select / release. It's very fast.

For variable length elements, this is more complicated: you either have to deal with finding available continuous space or use some other approach. You might want to save another map of all the free nodes ordered by block size so that you can reduce the size of the map, and if the next available node says only 5% is too big, return it instead of trying to find usable free space of the exact size.

+1

Mark b May 17 '11 at 16:52

source share

My penchant for variable elements would be, if practicable, to avoid direct pointers to data and to save handles instead. Each descriptor will be a superblock index and an index for an element in the superblock. Each superblock must have a list of elements located from top to bottom, and elements located from bottom to top. Each distribution of elements is preceded by its length and the index of the element that it represents; use a single index bit to indicate whether the item is “pinned”.

If an item matches after the last selected item, just select it. If it hits a pinned item, move the next selection mark past the pinned item, find the next higher pinned item, and try highlighting again. If an element collides with a list of elements, but somewhere there is enough free space, compact the contents of the block (if one or several elements are fixed, it may be better to use another superblock, if available). Depending on the usage patterns, you may need to start only by compactifying the material that has been added since the last collection; if it does not provide enough space, then compacts everything.

Of course, if you have only a few individual element sizes, you can use simple, fixed-sized distributors.

+1

supercat May 17, '11 at 16:59

source share

I agree with Tim's memory pools to avoid fragmentation.

However, you can avoid some perversions by storing pointers rather than objects in your vectors, perhaps ptr_vector ?

+1

Douglas leeder May 17, '11 at 17:03

source share

David Rodríguez - dribeas · Accepted Answer · 2011-05-17T16:57:49+0000

For fixed size objects, you can create a fixed size allocator. Basically, you select blocks, break them into sub-blocks of the appropriate size, and create a linked list with the result. The selection from such a block is O (1), if there is available memory (just delete the first element from the list and return the pointer to it), as well as release it (add the block to the free list). During selection, if the list is empty, take a new superblock, section and add all the blocks to the list.

For a variable-sized list, you can simplify it to a fixed-size block by selecting only blocks of known sizes: 32 bytes, 64 bytes, 128 bytes, 512 bytes. You will have to analyze memory usage to come up with different buckets so that you do not waste too much memory. For large objects, you can return to the dynamic size distribution pattern, which will be slow, but hopefully the number of large objects is limited.

Suggestions for Improving Distributor Algorithm Implementation

More articles: