Slow sparse matrix operation using std :: vector

I am trying to implement the functionality of a MATLAB sparse function.

Paste the value into a sparse matrix with a specific index to:

If a value with the same index is already present in the matrix, new and old values ​​are added.

Otherwise, the new value is added to the matrix.

The addNode function runs correctly, but the problem is that it is extremely slow. I call this function in a loop about 100,000 times, and it takes more than 3 minutes to execute the program. While MATLAB performs this task in seconds. Is there a way to optimize the code or use stl algorithms instead of my own function to achieve what I want?

Code:

 struct SparseMatNode { int x; int y; float value; }; std::vector<SparseMatNode> SparseMatrix; void addNode(int x, int y, float val) { SparseMatNode n; nx = x; ny = y; n.value = val; bool alreadyPresent = false; int i = 0; for(i=0; i<SparseMatrix.size(); i++) { if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y)) { alreadyPresent = true; break; } } if(alreadyPresent) { SparseMatrix[i].value += val; if(SparseMatrix[i].value == 0.0f) SparseMatrix.erase(SparseMatrix.begin + i); } else SparseMatrix.push_back(n); } 
+4
source share
5 answers

The first one thinks that it stands out that you implement your own functionality to search for an element: what is std::find . So, instead of:

 bool alreadyPresent = false; int i = 0; for(i=0; i<SparseMatrix.size(); i++) { if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y)) { alreadyPresent = true; break; } } 

You must write:

 auto it = std::find(SparseMatrix.begin(), SparseMatrix().end(), Comparer); 

where Comparer is a function that compares two SparseMatNode .

But the main improvement will be achieved through the use of an appropriate container. Instead of std::vector you would be much better off using an associative container. Thus, element search will have O(logN) complexity instead of O(N) . You can slightly modify your SparseMatNode class as follows:

 typedef std::pair<int, int> Coords; typedef std::pair<const Coords, float> SparseMatNode; 

You can cover these typedefs inside the class to provide a better interface, of course.

And then:

 std::unordered_map<Coords, float> SparseMatrix; 

This method can be used:

 auto it = SparseMatrix.find(std::make_pair(x, y)); 

to find items much more efficiently.

+1
source

Sparse matrices are usually not saved as a triplet vector when you try.

MATLAB (like many other libraries) uses a compressed sparse column (CSC) data structure, which is very efficient for static matrices. The MATLAB sparse function also does not create a matrix one record at a time (as you try) - it takes an array of triplet records and packs the entire sequence into a CSC matrix. If you are trying to build a static sparse matrix, this is the way to go.

If you want a dynamic sparse matrix object that supports efficient insertion and deletion of records, you can look at different structures - perhaps std::map triplets or an array of columns - see here for more information on data formats.

In addition, there are many good libraries. If you want to do sparse matrix operations / factorization, etc. - SuiteSparse is a good option, otherwise Eigen also has good sparse support.

+5
source

Sparse matrices are usually stored in a compressed sparse row (CSR) or a compressed sparse column (CSC, also called Harwell-Boeing). MATLAB uses CSC, IIRC by default, while most sparse matrix packets tend to use CSR.

In any case, if it concerns the use of products, rather than training exercises, I would recommend using a matrix package with support for sparse matrices. In the C ++ world, my favorite is Eigen .

+3
source

Have you tried sorting your sparse node vector? Performing a linear search becomes expensive every time you add a node. You can embed in a place and always do a binary search.

+2
source

Since a sparse matrix can be huge and needs to be compressed, you can use std::unordered_map . I assume that the indices of the matrix ( x and y ) are always positive.

 #include <unordered_map> const size_t MAX_X = 1000*1000*1000; std::unordered_map <size_t, float> matrix; void addNode (size_t x, size_t y, float val) { size_t index = x + y*MAX_X; matrix[index] += val; //this function can be still faster if (matrix[index] == 0) //using find() / insert() methods matrix.erase(index); } 

If std::unordered_map not available on your system, you can try std::tr1::unordered_map or stdext::hash_map ...

If you can use more memory, use double instead of float , this will slightly improve your processing speed.

-1
source

Source: https://habr.com/ru/post/1446881/


All Articles