I have an array with shape
arr.shape = (200, 600, 20).
I want to compute scipy.stats.kendalltau for each pairwise combination of the last two dimensions. For instance:
    kendalltau(arr[:, 0, 0], arr[:, 1, 0])
    kendalltau(arr[:, 0, 0], arr[:, 1, 1])
    kendalltau(arr[:, 0, 0], arr[:, 1, 2])
    ...
    kendalltau(arr[:, 0, 0], arr[:, 2, 0])
    kendalltau(arr[:, 0, 0], arr[:, 2, 1])
    kendalltau(arr[:, 0, 0], arr[:, 2, 2])
    ...
    ...
    kendalltau(arr[:, 598, 19], arr[:, 599, 19])
so that I cover all combinations of arr[:, i, xi] with arr[:, j, xj] where i < j, xi in [0, 20) and xj in [0, 20). This is (600 choose 2) * 400 = 71,880,000 separate calculations, but since each takes about 0.002 s on my machine, it should not take much longer than one day with the multiprocessing module.
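As a rough check of that estimate, here is a minimal sketch of the arithmetic. The 0.002 s per call is the timing quoted above; the worker count `n_workers = 4` is only a hypothetical value for illustration, and ideal scaling across processes is assumed.

```python
n_series = 600        # size of the second dimension
n_x = 20              # size of the third dimension
sec_per_call = 0.002  # approximate time per kendalltau call (from above)
n_workers = 4         # hypothetical number of worker processes

# Number of (i, j) pairs with i < j, i.e. "600 choose 2".
n_pairs = n_series * (n_series - 1) // 2

# Every (xi, xj) combination is evaluated for each pair.
n_calls = n_pairs * n_x * n_x

serial_hours = n_calls * sec_per_call / 3600.0
parallel_hours = serial_hours / n_workers  # assumes ideal scaling

print(n_pairs, n_calls, serial_hours, parallel_hours)
```

So a single process would need roughly 40 hours, which is why I plan to spread the work over several processes.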
What is the best way to iterate over these columns (using i < j)? I suspect that I should avoid something like
    for i in range(600):
        for j in range(i + 1, 600):
            for xi in range(20):
                for xj in range(20):
                    kendalltau(arr[:, i, xi], arr[:, j, xj])
What is the simplest way to do this?
Edit: I changed the title, since Kendall's tau is not very important for the question. I understand that I could do something like
    import itertools as it

    for i, j in it.combinations(range(600), 2):           # all pairs with i < j
        for xi, xj in it.product(range(20), range(20)):   # all (xi, xj)
            kendalltau(arr[:, i, xi], arr[:, j, xj])
but there should be a better, more vectorized way to do this with numpy.
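To make concrete what I mean by "vectorized", here is a minimal sketch that builds all index combinations as numpy arrays instead of looping. It uses `np.triu_indices` for the i < j pairs and `np.indices` for the (xi, xj) grid. The small array sizes and the names `i_all`, `j_all`, `xi_all`, `xj_all` are only assumptions for illustration, and this vectorizes the index generation only, not kendalltau itself. I am not sure it is the best approach.

```python
import numpy as np

# Small stand-in array so the sketch runs quickly;
# the real array has shape (200, 600, 20).
rng = np.random.default_rng(0)
arr = rng.standard_normal((50, 5, 3))
n_obs, n_series, n_x = arr.shape

# All (i, j) with i < j, in the same order as itertools.combinations.
i_idx, j_idx = np.triu_indices(n_series, k=1)

# All (xi, xj), in the same order as itertools.product.
xi_idx, xj_idx = np.indices((n_x, n_x)).reshape(2, -1)

# Pair every (i, j) with every (xi, xj): repeat the outer indices,
# tile the inner ones.
i_all = np.repeat(i_idx, n_x * n_x)
j_all = np.repeat(j_idx, n_x * n_x)
xi_all = np.tile(xi_idx, i_idx.size)
xj_all = np.tile(xj_idx, i_idx.size)

# Fancy indexing then selects the columns for every task at once.
left = arr[:, i_all, xi_all]    # shape (n_obs, n_tasks)
right = arr[:, j_all, xj_all]   # shape (n_obs, n_tasks)
```

With the real dimensions each index array would hold about 72 million entries, and `left` and `right` would not fit comfortably in memory, so I assume the index arrays would have to be processed in chunks (for example, one chunk per worker process).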