Quick element access for multi-dimensional representation of a continuous array

I have a multidimensional array represented contiguously in memory. I want this view to be hidden, and just let the user access the elements of the array as if it were multidimensional: for example. my_array[0][3][5] or my_array(0,3,5) or something similar. The dimensions of the object are not determined until runtime, but the object is created with a type that determines how many dimensions it has. This elemental search will need to be called billions of times, and therefore, we hope that for each call minimal overhead is required.

I looked at similar issues, but really did not find a good solution. Using the [] operator requires the creation of N-1 dimensional objects, which is great for multidimensional structures, such as vector vectors, because the object already exists, but for an adjacent array it seems that it will get tangled up very quickly and require some kind of slicing through source array.

I also looked at overloading () , which seems more promising, but requires specifying the number of arguments, which will vary depending on the number of dimensions of the array. I was thinking about using list or vector initialization, but wanted to avoid instantiating objects.

I am a little familiar with templates and draw that there must be some way with the powerful powers of C ++ templates to indicate a unique overload () for arrays with different types (for example, different dimension numbers). But I used only templates in really basic general cases, such as using a function like float or double .

I present something like this:

 template<typename TDim> class MultiArray { public: MultiArray() {} //build some things ~MultiArray() {} //destroy some things // The number of arguments would be == to TDim for the instantiated class float& operator() (int dim1, int dim2, ...) { //convert to contiguous index and return ref to element // I believe the conversion equation is something like: // dim1 + Maxdim1 * ( dim2 + MaxDim2 * ( dim3 + MaxDim3 * (...))) } private: vector<float> internal_array; vector<int> MaxDimX; // Each element says how large each corresponding dim is. }; 

So, if I initialize this class and try to access the element, it will look something like this:

 my_array = MultiArray<4>(); element = my_array(2,5,4,1); 

How can I do this using templates? Is it possible?

+5
source share
4 answers
 template<class T> struct slice { T* data = 0; std::size_t const* stride = 0; slice operator[](std::size_t I)const { return{ data + I* *stride, stride + 1 }; } operator T&()const { return *data; } T& operator=(typename std::remove_const<T>::type in)const { *data = std::move(in); return *data; } }; 

store vector<T> data and std::vector<std::size_t> stride steps, where stride[0] is the step of the element that wants to get the first index.

 template<class T> struct buffer { std::vector<T> data; std::vector<std::size_t> strides; buffer( std::vector<std::size_t> sizes, std::vector<T> d ): data(std::move(d)), strides(sizes) { std::size_t scale = 1; for (std::size_t i = 0; i<sizes.size(); ++i){ auto next = scale*strides[sizes.size()-1-i]; strides[sizes.size()-1-i] = scale; scale=next; } } slice<T> get(){ return {data.data(), strides.data()}; } slice<T const> get()const{ return {data.data(), strides.data()}; } }; 

. Real-time example .

If you use insufficient [] , this refers to the first element of the subarray in question. If you use too much, it does UB. It performs a zero dimensional check both in size and in size.

Both can be added, but will cost performance.

The number of measurements is dynamic. You can divide the buffer into two types: one that owns the buffer, and another that provides a dimensional representation.

+5
source

It seems to me that you can use Boost.MultiArray , boost::multi_array_ref . boost::multi_array_ref does exactly what you want: it transfers a continuous array of data into an object that can be thought of as a multidimensional array. You can also use boost::multi_array_ref::array_view for slicing purposes.

I can’t provide you any test results, but, in my experience, I can say that boost::multi_array_ref works pretty fast.

+3
source

If you can use C ++ 17 to collapse the variational pattern, and the main order of the lines , I suppose you can write something like (caution: not verified)

 template <template ... Args> float & operator() (Args ... dims) { static_assert( sizeof...(Args) == TDim , "wrong number of indexes" ); // or SFINAE enable instead of static_assert() std::size_t pos { 0U }; std::size_t i { 0U }; ( pos *= MaxDimX[i++], pos += dims, ... ); return internal_array[pos]; } 

OTPS (Off Topic Post Scriptum): your MaxDimX , if I understand correctly, is a vector of dimensions; therefore there must be an unsigned integer not signed by int ; usually used for indexes is std::size_t [see Note 1].

OTPS 2: if you know the compilation time, the number of dimensions ( TDim , right?) Instead of std::vector , I suggest using std::array ; I mean

 std::array<std::size_t, TDim> MaxDimX; 

- EDIT -

If you cannot use C ++ 17, you can use the trick to initialize an unused array to get something like this.

I mean

 template <template ... Args> float & operator() (Args ... dims) { using unused = int[]; static_assert( sizeof...(Args) == TDim , "wrong number of indexes" ); // or SFINAE enable instead of static_assert() std::size_t pos { 0U }; std::size_t i { 0U }; (void)unused { (pos *= MaxDimX[i++], pos += dims, 0) ... }; return internal_array[pos]; } 

Note 1: as indicated by Julius, the use of a signed or unsigned integer for indices is controversial.

Therefore, I am trying to explain why I suggest using the unsigned value ( std::size_t , for example) for them.

The fact is that (as far as I know) the entire standard library of templates is designed to use an unsigned integer for index values. You can see this by the value returned by the size() method, and by the fact that access methods that get an index like at() or operator[] get an unsigned value.

Right or wrong, the language itself is designed to return std::size_t from the old sizeof() and from a later version of sizeof...() . The same class std::index_sequence is an alias for std::integer_sequence with fixed unsigned, again std::size_t , type.

In a world designed to use unsigned integers for indices, using a signed integer is possible for them, but IMHO is dangerous (because it is prone to errors).

+1
source

I have used this template several times when creating variable-size matrix class class templates.

Matrix.h

 #ifndef MATRIX_H template<typename Type, size_t... Dims> class Matrix { public: static const size_t numDims_ = sizeof...(Dims); private: size_t numElements_; std::vector<Type> elements_; std::vector<size_t> strides_; // Technically this vector contains the size of each dimension... (its shape) // actual strides would be the width in memory of each element to that dimension of the container. // A better name for this container would be dimensionSizes_ or shape_ public: Matrix() noexcept; template<typename... Arg> Matrix( Arg&&... as ) noexcept; const Type& operator[]( size_t idx ) const; size_t numElements() const { return elements_.size(); } const std::vector<size_t>& strides() const { return strides_; } const std::vector<Type>& elements() const { return elements_; } }; // matrix #include "Matrix.inl" #endif // MATRIX_H 

Matrix.inl

 template<typename Type, size_t... Dims> Matrix<Type, Dims...>::Matrix() noexcept : strides_( { Dims... } ) { using std::begin; using std::end; auto mult = std::accumulate( begin( strides_ ), end( strides_ ), 1, std::multiplies<>() ); numElements_ = mult; elements_.resize( numElements_ ); } // Matrix template<typename Type, size_t... Dims> template<typename... Arg> Matrix<Type, Dims...>::Matrix( Arg&&... as ) noexcept : elements_( { as... } ), strides_( { Dims... } ){ numElements_ = elements_.size(); } // Matrix template<typename T, size_t... d> const T& Matrix<T,d...>::operator[]( size_t idx ) const { return elements_[idx]; } // Operator[] 

Matrix.cpp

 #include "Matrix.h" #include <vector> #include <numeric> #include <functional> #include <algorithm> 

main.cpp

 #include <vector> #include <iostream> #include "matrix.h" int main() { Matrix<int, 3, 3> mat3x3( 1, 2, 3, 4, 5, 6, 7, 8, 9 ); for ( size_t idx = 0; idx < mat3x3.numElements(); idx++ ) { std::cout << mat3x3.elements()[idx] << " "; } std::cout << "\n\nnow using array index operator\n\n"; for ( size_t idx = 0; idx < mat3x3.numElements(); idx++ ) { std::cout << mat3x3[idx] << " "; } std::cout << "\n\ncheck the strides\n\n"; for ( size_t idx = 0; idx < mat3x3.numDims_; idx++ ) { std::cout << mat3x3.strides()[idx] << " "; } std::cout << "\n\n"; std::cout << "=================================\n\n"; Matrix<float, 5, 2, 9, 7> mf5x2x9x7; // Check Strides // Num Elements // Total Size std::cout << "The total number of dimensions are: " << mf5x2x9x7.numDims_ << "\n"; std::cout << "The total number of elements are: " << mf5x2x9x7.numElements() << "\n"; std::cout << "These are the strides: \n"; for ( size_t n = 0; n < mf5x2x9x7.numDims_; n++ ) { std::cout << mf5x2x9x7.strides()[n] << " "; } std::cout << "\n"; std::cout << "The elements are: "; for ( size_t n = 0; n < mf5x2x9x7.numElements(); n++ ) { std::cout << mf5x2x9x7[n] << " "; } std::cout << "\n"; std::cout << "\nPress any key and enter to quit." << std::endl; char c; std::cin >> c; return 0; } // main 

This is a simple variable multi-dimensional matrix class. Same Type <T>

You can create a matrix of float, ints, chars, etc. different sizes, such as 2x2 , 2x3 , 5x3x7 , 4x9x8x12x2x19 . This is a very simple but universal class.

It uses std::vector<> , so the search time is linear. The larger the multidimensional matrix grows in size, the larger the inner container will grow depending on the size of each dimension; it can “explode” quite quickly if each individual size has a large size, for example: a 9x9x9 is only 3 dimensional volumetric matrix , which has much more elements than 2x2x2x2x2 , which is 5 dimensional volumetric matrix . The first matrix has elements 729 , where the second matrix has only elements 32 .

I did not enable the default constructor , copy constructor, move constructor, and any overloaded constructors that accept either std::container<T> or another Matrix<T,...> . This can be done as an exercise for the OP.

I also did not include any simple functions that would give the size of common elements from the main container, as well as the number of general sizes that would be the size of the strides container. The OP should be able to implement this very simply.

As for strides and for indexing with multidimensional OP coordinates, it will be necessary to use stride values ​​to calculate the corresponding indices again. I leave this as the main exercise.

EDIT . I went ahead and added a default constructor, moved some members to a private section of the class, and added several access functions. I did this because I just wanted to demonstrate the strength of this class in the main function even when creating an empty container of my type.

Even more, you can answer Yakk with your step and cut algorithm, and you can easily connect it directly to this class, giving you the full functionality of what you are looking for.

+1
source

Source: https://habr.com/ru/post/1273902/


All Articles