I built a correlation matrix from a small set of tests and ended up with the following. True entries are values that exceed a given threshold (for example, results = relation_matrix > 0.75):
[[False False False  True]
 [False False  True False]
 [False  True False  True]
 [ True False  True False]]
Note that I also set the diagonal (top left to bottom right) to False. I only need half of the matrix, because it is mirrored across that diagonal.
Is there a function in NumPy (or another library) that returns the row / column indices of the True values? When I run this against real data (200 thousand rows), it has to be fast, without an explicit inner loop in Python; 200k * 200k individual checks would be very slow. I assume there is a matrix / NumPy / scikit-learn function that does this, but I could not find it.
The expected result, using 1-based row / column indices, would be:
[[1, 4], [2, 3], [3, 2], [3, 4], [4, 1], [4, 3]]
Ideally, given that this is a mirror image, it would be:
[[1, 4], [2, 3], [3, 4]]
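To make the desired behaviour concrete, here is a minimal sketch of one possible approach. It assumes `results` is a square boolean NumPy array like the one above, and it uses `np.argwhere` together with `np.triu`; I have not checked how well this scales to a 200k x 200k matrix, where memory may become the limiting factor.

```python
import numpy as np

# The small boolean matrix from the example above.
results = np.array([
    [False, False, False, True],
    [False, False, True, False],
    [False, True, False, True],
    [True, False, True, False],
])

# All (row, column) positions that are True, as 0-based indices.
all_pairs = np.argwhere(results)

# Keep only the entries strictly above the diagonal (k=1),
# so each mirrored pair is reported once.
upper_pairs = np.argwhere(np.triu(results, k=1))

# Shift to 1-based indices to match the expected output in the question.
all_pairs_1 = (all_pairs + 1).tolist()
upper_pairs_1 = (upper_pairs + 1).tolist()

print(all_pairs_1)
print(upper_pairs_1)
```

Both calls are vectorised, so there is no Python-level loop over rows or columns.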
Jon m