I am looking for a NumPy-only way of calculating the Mahalanobis distance between two NumPy arrays (x and y). The following code calculates it correctly using SciPy's cdist function. Since cdist computes matrix entries that are unnecessary in my case (I only keep the diagonal of the result), I want a more direct calculation that uses only NumPy.
import numpy as np
from scipy.spatial.distance import cdist
x = np.array([[[1,2,3,4,5],
[5,6,7,8,5],
[5,6,7,8,5]],
[[11,22,23,24,5],
[25,26,27,28,5],
[5,6,7,8,5]]])
i,j,k = x.shape
xx = x.reshape(i,j*k).T
y = np.array([[[31,32,33,34,5],
[35,36,37,38,5],
[5,6,7,8,5]],
[[41,42,43,44,5],
[45,46,47,48,5],
[5,6,7,8,5]]])
yy = y.reshape(i,j*k).T
results = cdist(xx,yy,'mahalanobis')
results = np.diag(results)
print(results)
[ 2.28765854 2.75165028 2.75165028 2.75165028 0. 2.75165028
2.75165028 2.75165028 2.75165028 0. 0. 0. 0.
0. 0. ]
My attempt:
VI = np.linalg.inv(np.cov(xx,yy))
print(np.sqrt(np.dot(np.dot((xx-yy),VI),(xx-yy).T)))
Can anyone fix this method?
Here is the formula for it:
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.mahalanobis.html#scipy.spatial.distance.mahalanobis
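For reference, the linked formula is sqrt((u - v) VI (u - v)^T), where VI is the inverse covariance matrix. Below is a minimal NumPy-only sketch of one way to get the row-by-row distances directly. It rests on an assumption about what cdist does when no VI is passed: that it estimates the covariance from the rows of both inputs stacked together. The function name paired_mahalanobis is hypothetical and not part of NumPy or SciPy. Note that np.cov(xx, yy) treats each row as a variable by default, so with these shapes it returns a 30x30 matrix rather than the 2x2 matrix the formula needs.

```python
import numpy as np

def paired_mahalanobis(xx, yy):
    """Mahalanobis distance between xx[n] and yy[n] for every row n.

    Assumption: the covariance is estimated from the rows of xx and yy
    stacked together, with columns treated as variables.
    """
    stacked = np.vstack([xx, yy])                      # shape (2 * n_rows, n_features)
    VI = np.linalg.inv(np.cov(stacked, rowvar=False))  # inverse covariance, (n_features, n_features)
    delta = xx - yy                                    # row-wise differences
    # For each row n: delta[n] @ VI @ delta[n], without building the full pairwise matrix
    squared = np.einsum('ij,jk,ik->i', delta, VI, delta)
    return np.sqrt(squared)

# Same data and reshaping as in the question
x = np.array([[[1, 2, 3, 4, 5],
               [5, 6, 7, 8, 5],
               [5, 6, 7, 8, 5]],
              [[11, 22, 23, 24, 5],
               [25, 26, 27, 28, 5],
               [5, 6, 7, 8, 5]]])
y = np.array([[[31, 32, 33, 34, 5],
               [35, 36, 37, 38, 5],
               [5, 6, 7, 8, 5]],
              [[41, 42, 43, 44, 5],
               [45, 46, 47, 48, 5],
               [5, 6, 7, 8, 5]]])
i, j, k = x.shape
xx = x.reshape(i, j * k).T
yy = y.reshape(i, j * k).T

result = paired_mahalanobis(xx, yy)
print(result)
```

The einsum call evaluates the quadratic form only for matching rows, so the full 15x15 pairwise matrix that cdist returns is never built. If the covariance should come from a different sample, VI could be computed separately and substituted.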