The problem is that you are multiplying the sparse matrix by a dense vector using the same old dense matrix-vector multiplication algorithm. You will never see a speedup that way.
Suppose you have an n×n matrix A (dense or sparse) and an n-vector x . To compute y = Ax , we can write:
y = [0]*n
for i in range(n):
    for j in range(n):
        y[i] += A[i,j]*x[j]
This works whether the matrix A is dense or sparse. But suppose A is sparse: we still iterate over every column of A to compute a single entry of y , even though most of the terms will be zero. So the outer loop runs n iterations, and the inner loop runs n iterations as well.
If we know which entries of A are nonzero, we can do much better. Suppose we have, for each row i , a list of the column indices of its nonzero entries; call it nonzero[i] . Then we can replace the inner loop with an iteration over this list:
y = [0]*n
for i in range(n):
    for j in nonzero[i]:
        y[i] += A[i,j]*x[j]
So although the outer loop still runs n iterations, the inner loop only does as many iterations as there are nonzero entries in the row.
This is where the speedup of sparse matrix-vector multiplication comes from.
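The two loops above can be sketched together on a small hand-built example (the matrix, vector, and sizes here are illustrative, not from the original answer):

```python
# Small illustrative example: a 4x4 matrix with only 4 nonzero entries.
n = 4
A = [[0, 2, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 3],
     [0, 0, 4, 0]]
x = [1, 2, 3, 4]

# Precompute, for each row, the column indices of its nonzero entries.
nonzero = [[j for j in range(n) if A[i][j] != 0] for i in range(n)]

# Sparse-aware product: the inner loop visits only the nonzero columns.
y = [0] * n
for i in range(n):
    for j in nonzero[i]:
        y[i] += A[i][j] * x[j]

print(y)  # → [4, 0, 13, 12], same as the dense double loop
```

Here the inner loops do 4 iterations in total instead of 16, which is exactly the ratio of nonzero entries to total entries.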
Use numpy!
But there is another problem: you are doing matrix multiplication in pure Python, which (due to type checking, non-contiguous data structures, etc.) is slow. The solution is to use numpy , which provides fast algorithms and data structures. You can then use scipy's sparse matrices , as they implement the fast sparse matrix-vector product for you.
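A minimal sketch of what that looks like (the matrix values here are illustrative):

```python
import numpy as np
import scipy.sparse

# A small dense matrix with mostly zero entries.
dense = np.array([[0., 2., 0.],
                  [1., 0., 3.],
                  [0., 0., 4.]])
x = np.array([1., 2., 3.])

# Convert to compressed sparse row (CSR) format, then multiply.
sparse = scipy.sparse.csr_matrix(dense)
y = sparse.dot(x)  # fast sparse matrix-vector product

print(y)  # → [ 4. 10. 12.], identical to dense.dot(x)
```

CSR is a good default format for matrix-vector products; it stores only the nonzero values together with their column indices and per-row offsets.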
Experiment
We can demonstrate all of this with a quick experiment. First we create a 10,000 × 10,000 dense matrix A :
>>> import numpy as np
>>> n = 10000
>>> A = np.random.sample((n,n))
Then we make a sparse matrix B by thresholding A . B is the same size as A , but only about 10% of its entries are nonzero:
>>> B = np.where(A < .1, A, 0).astype(float)
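As a quick sanity check on the thresholding step (a sketch, not part of the original experiment; n is smaller here just to keep it fast), roughly 10% of B's entries should be nonzero, since A's values are uniform on [0, 1):

```python
import numpy as np

n = 1000  # smaller than the experiment's 10,000, just for a quick check
A = np.random.sample((n, n))
B = np.where(A < .1, A, 0).astype(float)

# Fraction of nonzero entries should be close to 0.1.
density = np.count_nonzero(B) / B.size
print(density)
```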
Now we make a dense vector to multiply A and B by:
>>> x = np.random.sample(n)
>>> %timeit A.dot(x)
10 loops, best of 3: 46.7 ms per loop
>>> %timeit B.dot(x)
10 loops, best of 3: 43.7 ms per loop
It takes just as long to compute Ax as it does to compute Bx , even though B is "sparse". Of course, it really isn't: it is stored as a dense matrix that happens to have a lot of zero entries. Let's make it actually sparse:
>>> import scipy.sparse
>>> sparse_B = scipy.sparse.csr_matrix(B)
>>> %timeit sparse_B.dot(x)
100 loops, best of 3: 12 ms per loop
There is our speedup! Now, just for fun, what if we store A as a sparse matrix, even though it is really dense?
>>> sparse_A = scipy.sparse.csr_matrix(A)
>>> %timeit sparse_A.dot(x)
10 loops, best of 3: 112 ms per loop
Ouch! But this is to be expected: storing a dense matrix in a sparse format incurs overhead on every multiplication.
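You can see part of that overhead directly in memory usage. A sketch (sizes here are illustrative): CSR stores, for every nonzero entry, both its value and its column index, plus per-row offsets, so a fully dense matrix stored as CSR takes strictly more space and more work per entry than a plain array:

```python
import numpy as np
import scipy.sparse

n = 100
A = np.random.sample((n, n))          # essentially every entry is nonzero
sparse_A = scipy.sparse.csr_matrix(A)

dense_bytes = A.nbytes                # 8 bytes per entry
csr_bytes = (sparse_A.data.nbytes     # the nonzero values themselves
             + sparse_A.indices.nbytes  # a column index per nonzero
             + sparse_A.indptr.nbytes)  # one row-offset per row, plus one

print(dense_bytes, csr_bytes)  # CSR is larger for a dense matrix
```

So sparse formats only pay off when the matrix really is sparse; for a dense matrix they add both memory and bookkeeping per entry.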