The einsum-based version is roughly 90 times faster than your loop for N = 100k (640 ms vs. 7.02 ms):
%timeit np.array([np.dot(np.dot(vecs[i, ...], mats[i, ...]), vecs[i, ...].T) for i in range(N)])
1 loops, best of 3: 640 ms per loop

%timeit np.einsum('...i,...ij,...j->...', vecs, mats, vecs)
100 loops, best of 3: 7.02 ms per loop

np.allclose(np.array([np.dot(np.dot(vecs[i, ...], mats[i, ...]), vecs[i, ...].T) for i in range(N)]), np.einsum('...i,...ij,...j->...', vecs, mats, vecs))
Out[45]: True
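If you want to reproduce the comparison outside IPython, here is a minimal self-contained sketch. The array shapes are an assumption on my part (`vecs` as `(N, d)` and `mats` as `(N, d, d)`, with a small `N` and `d` so it runs quickly), and the helper names `quad_forms_loop` and `quad_forms_einsum` are made up for illustration.

```python
import numpy as np

# Assumed shapes: one length-d vector and one d x d matrix per sample.
rng = np.random.default_rng(0)
N, d = 1000, 3
vecs = rng.standard_normal((N, d))
mats = rng.standard_normal((N, d, d))


def quad_forms_loop(vecs, mats):
    """Compute v_i^T M_i v_i for each i with an explicit Python loop."""
    return np.array([np.dot(np.dot(vecs[i], mats[i]), vecs[i])
                     for i in range(len(vecs))])


def quad_forms_einsum(vecs, mats):
    """Same computation in a single einsum call.

    The ellipsis stands for the leading batch axis; i and j are summed over.
    """
    return np.einsum('...i,...ij,...j->...', vecs, mats, vecs)


loop_result = quad_forms_loop(vecs, mats)
einsum_result = quad_forms_einsum(vecs, mats)

# Both approaches should agree up to floating-point rounding.
print(np.allclose(loop_result, einsum_result))
```

The actual speedup will depend on `N`, `d`, and your NumPy build, so time it on your own data rather than relying on the numbers above.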