I believe that your main need is to save memory. First, when you multiply a matrix by its transpose, you do not need any extra memory for the transpose: all of its cells are directly accessible through the original matrix (tA[i, j] = A[j, i]). Roughly a third of the memory is saved.
I see that the computation time cannot be neglected either. Since the resulting matrix is symmetric, you need only compute one half and copy each value into the mirrored position. About half the computation time is saved.
Finally, if many of the resulting products are zero, you can save even more memory by storing the result in a sparse format; with scipy, the COO format is convenient for building it incrementally.

Here is the code (well, actual Python, using scipy; the input matrix is A[M][N]):
I = []
J = []
V = []
for i in range(M):
    for j in range(i, M):  # only the upper triangle, j >= i
        X = 0.0
        for k in range(N):
            X += A[i][k] * A[j][k]  # tA[k][j] == A[j][k], so A is never transposed in memory
        if X != 0.0:  # or abs(X) > epsilon if floating point accuracy is a concern ...
            I.append(i)
            J.append(j)
            V.append(X)
            if i != j:  # mirror the off-diagonal entries; the diagonal is stored once
                I.append(j)
                J.append(i)
                V.append(X)
The I, J, V lists are then assembled into a COO matrix:
from scipy import sparse
RESULT = sparse.coo_matrix((V, (I, J)), shape=(M, M))
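As a quick sanity check, here is a self-contained sketch (the small random matrix and the seed are my own assumptions, just for illustration) that runs the loop above and compares the sparse result against the dense product A @ A.T computed by numpy:

```python
import numpy as np
from scipy import sparse

M, N = 4, 3
rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(M, N)).astype(float)  # small example matrix

I, J, V = [], [], []
for i in range(M):
    for j in range(i, M):  # only the upper triangle
        X = sum(A[i][k] * A[j][k] for k in range(N))  # tA[k][j] == A[j][k]
        if X != 0.0:
            I.append(i); J.append(j); V.append(X)
            if i != j:  # mirror off-diagonal entries
                I.append(j); J.append(i); V.append(X)

RESULT = sparse.coo_matrix((V, (I, J)), shape=(M, M))
assert np.allclose(RESULT.toarray(), A @ A.T)  # matches the dense product
```

Note that COO sums duplicate (i, j) entries on conversion, which is why the diagonal must be appended only once.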