Class template for cache-aligned memory usage in C++

(It takes a lot of information to explain my question; this is already the condensed version.)

I am trying to implement a class template that allocates and accesses data aligned to cache lines. This works very well for single elements; however, implementing array support is a problem.

Semantically, the code should provide this mapping in memory for one element like this:

cache_aligned<element_type>* my_el = 
          new(cache_line_size) cache_aligned<element_type>();
| element | buffer |

Access (for now) works as follows:

*my_el; // returns cache_aligned<element_type>
**my_el; //returns element_type
(*my_el)->member_of_element(); // calls a member of element_type

However, for an array, I would like to have this:

 cache_aligned<element_type>* my_el_array = 
         new(cache_line_size) cache_aligned<element_type>[N];
 | element 0 | buffer | element 1 | buffer | ... | element (N-1) | buffer |

So far I have the following code:

template <typename T>
class cache_aligned {
    private:
        T instance;
    public:
        cache_aligned()
        {}
        // copy the wrapped value itself (other is a T, not a cache_aligned<T>)
        cache_aligned(const T& other)
        :instance(other)
        {}
        static void* operator new (size_t size, uint c_line_size) {
             return c_a_malloc(size, c_line_size);
        }
        static void* operator new[] (size_t size, uint c_line_size) {
             // subtract the (assumed) pointer-sized array bookkeeping overhead
             int num_el = (size - sizeof(cache_aligned<T>*))
                              / sizeof(cache_aligned<T>);
             return c_a_array(sizeof(cache_aligned<T>), num_el, c_line_size);
        }
        static void operator delete (void* ptr) {
             free_c_a(ptr);
        }
        // counterpart to operator new[] above
        static void operator delete[] (void* ptr) {
             free_c_a(ptr);
        }
        T* operator-> () {
             return &instance;
        }
        T& operator * () {
             return instance;
        }
};

The cache-aligned malloc helper functions:

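void* c_a_array(uint size, ulong num_el, uint c_line_size) {
    // reserve room for every padded element plus one slot for the original pointer
    void* mem = malloc((size + c_line_size) * num_el + sizeof(void*));
    void** ptr = (void**)((char*)mem + sizeof(void*));
    ptr[-1] = mem; // stash the pointer returned by malloc for free_c_a
    return ptr;
}

void free_c_a(void* ptr) {
    free(((void**)ptr)[-1]);
}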
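
Note that the helper above only offsets the pointer by sizeof(void*); it does not round the address to a cache-line boundary. Below is a minimal sketch of the usual "over-allocate, round up, stash the original pointer" technique. The names aligned_malloc_sketch and aligned_free_sketch are hypothetical, and the alignment is assumed to be a power of two that is at least sizeof(void*).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns `size` bytes whose address is a multiple of
// `alignment`. Assumes `alignment` is a power of two >= sizeof(void*).
void* aligned_malloc_sketch(std::size_t size, std::size_t alignment) {
    // Over-allocate: room for the stashed pointer plus worst-case padding.
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (raw == nullptr) return nullptr;

    // First address at which the user block could start.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    // Round up to the next multiple of `alignment`.
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;

    void** user = reinterpret_cast<void**>(aligned);
    user[-1] = raw;  // remember what malloc returned, for the matching free
    return user;
}

void aligned_free_sketch(void* ptr) {
    if (ptr != nullptr) std::free(static_cast<void**>(ptr)[-1]);
}
```

The slot at index -1 always lies inside the malloc'ed block, because the aligned address is never lower than raw + sizeof(void*).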

The problem is that data access should work as follows:

my_el_array[i]; // returns cache_aligned<element_type>
*(my_el_array[i]); // returns element_type
my_el_array[i]->member_of_element();

My ideas for solving it:

(1) something similar to this to overload the sizeof statement:

static size_t operator sizeof () {
   return sizeof(cache_aligned<T>) + c_line_size;
}

-> impossible, since overloading the sizeof operator is illegal

(2) something similar to this, to overload the [] operator:

static T& operator [] (uint index, cache_aligned<T>* ptr) {
    return ptr + ((sizeof(cache_aligned<T>) + c_line_size) * index);
}

-> not possible in C++ either

(3)

template <typename T> class cache_aligned {
    private:
          T instance;
          bool buffer[CACHE_LINE_SIZE]; 
          // CACHE_LINE_SIZE defined as macro
    public:
          // trivial operators and methods ;)
};

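-> I am not sure whether this is reliable; I am using gcc-4.5.1 on Linux...

(4) replace T instance; with T* instance_ptr; and use the [] operator to compute the position of an element, with the memory laid out like this:

| instance_ptr | ----> | element 0 | buffer | ... | element (N-1) | buffer |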

I know that alignment support is part of C++0x. However, gcc does not provide it yet.

Greetz, sema


If c_line_size is an integral compile-time constant, it is better to pad cache_aligned with a char array whose length depends on sizeof T.
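
A minimal sketch of padding with a char array sized from sizeof(T), assuming the cache-line size is known at compile time. The class name padded, the example type point, and the default of 64 bytes are assumptions made for illustration; they are not part of the original code. Because the padding makes sizeof(padded<T>) a multiple of the line size, ordinary array indexing already steps from one element to the next.

```cpp
#include <cstddef>

// Hypothetical wrapper; Line is the assumed cache-line size in bytes.
// Assumes alignof(T) divides Line, so no extra tail padding is added.
template <typename T, std::size_t Line = 64>
class padded {
    // Bytes needed to round sizeof(T) up to the next multiple of Line.
    // A full extra line is used when sizeof(T) is already a multiple,
    // because zero-length arrays are not standard C++.
    static const std::size_t pad_size = Line - (sizeof(T) % Line);
    T instance;
    char pad[pad_size];
public:
    padded() : instance() {}
    padded(const T& other) : instance(other) {}
    T* operator->() { return &instance; }
    T& operator*() { return instance; }
};

// Example element type, purely illustrative.
struct point {
    int x;
    int y;
    int sum() const { return x + y; }
};

// The stride between array elements is a whole number of cache lines.
static_assert(sizeof(padded<point>) % 64 == 0,
              "stride must be a multiple of the line size");
```

With this, arr[i] yields the wrapper, *arr[i] the element, and arr[i]->sum() a member call. The padding only fixes the stride; the start of the array still has to come from an allocator that returns a cache-line-aligned address.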

, 2 T-s .

. , 2 - , - , .


Source: https://habr.com/ru/post/1786197/

