I believe that your main need is to save memory. First, when you multiply a matrix by its transpose, you do not need any extra memory for the transpose: all of its cells are directly accessible through the original matrix (tA[i, j] = A[j, i]). Roughly a third of the memory is saved.
I see that the computation time cannot be neglected either. Since the resulting matrix is symmetric, you need only compute one half and copy each value into the mirrored position. About half the computation time is saved.
Finally, if many of the resulting products are zero, you can save even more memory by storing the result in a sparse format; with scipy, the COO format is convenient for building it incrementally.

Here is the code (well, actual Python, using scipy; the input matrix is A[M][N]):
I = []
J = []
V = []
for i in range(M):
    for j in range(i, M):  # only the upper triangle, j >= i
        X = 0.0
        for k in range(N):
            X += A[i][k] * A[j][k]  # tA[k][j] == A[j][k], so A is never transposed in memory
        if X != 0.0:  # or abs(X) > epsilon if floating point accuracy is a concern ...
            I.append(i)
            J.append(j)
            V.append(X)
            if i != j:  # mirror the off-diagonal entries; the diagonal is stored once
                I.append(j)
                J.append(i)
                V.append(X)
The I, J, V lists are then assembled into a COO matrix:
from scipy import sparse
RESULT = sparse.coo_matrix((V, (I, J)), shape=(M, M))
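As a quick sanity check, here is a self-contained sketch (the small random matrix and the seed are my own assumptions, just for illustration) that runs the loop above and compares the sparse result against the dense product A @ A.T computed by numpy:

```python
import numpy as np
from scipy import sparse

M, N = 4, 3
rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(M, N)).astype(float)  # small example matrix

I, J, V = [], [], []
for i in range(M):
    for j in range(i, M):  # only the upper triangle
        X = sum(A[i][k] * A[j][k] for k in range(N))  # tA[k][j] == A[j][k]
        if X != 0.0:
            I.append(i); J.append(j); V.append(X)
            if i != j:  # mirror off-diagonal entries
                I.append(j); J.append(i); V.append(X)

RESULT = sparse.coo_matrix((V, (I, J)), shape=(M, M))
assert np.allclose(RESULT.toarray(), A @ A.T)  # matches the dense product
```

Note that COO sums duplicate (i, j) entries on conversion, which is why the diagonal must be appended only once.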