The best way to solve sparse linear systems in C++: is a GPU an option?

I am currently working on a project where we need to minimize

|Ax - b|^2.

In this case, A is a very sparse matrix, and A'A has no more than 5 nonzero elements in each row.

We work with images, and A'A has size NxN, where N is the number of pixels. In this case N = 76800. We plan to move to RGB, at which point the size will be 3Nx3N.

In Matlab, (A'A)\(A'b) takes about 0.15 s using doubles.
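
For reference, the Matlab expression solves the normal equations of the least-squares problem. A short derivation, writing A' as A^T:

```latex
\begin{aligned}
f(x) &= \|Ax - b\|^2 = x^\top A^\top A\,x - 2\,b^\top A\,x + b^\top b \\
\nabla f(x) &= 2\,A^\top A\,x - 2\,A^\top b = 0
\quad\Longrightarrow\quad A^\top A\,x = A^\top b
\end{aligned}
```

A'A is symmetric positive semidefinite, and positive definite when A has full column rank, which is the case Cholesky-type solvers and conjugate gradients rely on.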

I have already experimented a bit with Eigen's sparse solvers. I tried:

SimplicialLLT
SimplicialLDLT
SparseQR
ConjugateGradient

and a few different orderings. The best so far is

SimplicialLDLT

which takes 0.35 - 0.5 s using AMDOrdering.

By comparison, ConjugateGradient takes about 6 s when initialized with 0.

The code used for the solve:

    // Requires <Eigen/Sparse>, <iostream>, <cstdio> and <omp.h>.
    // A_tot is the square system matrix (Eigen::SparseMatrix<float>), b_tot the right-hand side.
    A_tot.makeCompressed();
    // Create solver
    Eigen::SimplicialLDLT<Eigen::SparseMatrix<float>, Eigen::Lower, Eigen::AMDOrdering<int> > solver;
    // Eigen::ConjugateGradient<Eigen::SparseMatrix<float>, Eigen::Lower> cg;
    solver.analyzePattern(A_tot);
    // Time the factorization and the solve
    double t1 = omp_get_wtime();
    solver.compute(A_tot);
    if (solver.info() != Eigen::Success)
    {
        std::cerr << "Decomposition Failed" << std::endl;
        getchar();
    }
    Eigen::VectorXf opt = solver.solve(b_tot);
    double t2 = omp_get_wtime();
    std::cout << "Time for normal equations: " << t2 - t1 << std::endl;

Is there a way to speed this up in C++? Ideally the solve should take under 0.1 s.

, . , , Eigen SuiteSparse OpenMP. ? , ? ConjugateGradient ?

Edit:

! cuSparse Nvidia. , . , , . , ?

, A , , . 3D- , , 50 . , , ? , , , , , .

Cuda, .

, : Benchmark?, MatlabGPU


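If the matrix is very sparse and corresponds to a 2D grid, SimplicialLDLT is hard to beat. Since the sparsity pattern is fixed, call analyzePattern once and then call factorize instead of compute, which should save some time on every solve. Also, because the problem lives on a regular grid, you can try skipping the reordering step by using NaturalOrdering (not 100% sure this helps, so benchmark it). If that is still not fast enough, look into a Cholesky solver specialized for this kind of matrix.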


Source: https://habr.com/ru/post/1669169/

