I have large 2D arrays of unsorted (X, Y) points, and I need to find which points lie within a given distance of each other (a nearest-neighbor search). I have used cKDTree and query_ball_tree successfully for arrays of about 500,000 (X, Y) points. However, when I run the same algorithm on data sets of more than 1,000,000 points, query_ball_tree raises a MemoryError.
I am using 64-bit Windows with 16 GB of RAM, and I have tried both the 32-bit and 64-bit versions of Python and the extension modules (scipy and numpy).
from scipy.spatial import cKDTree

def Construct_SearchTree(AllXyPoints):
    # Maxdist is the search radius (a global, not shown here)
    KDsearch = cKDTree(AllXyPoints)
    return KDsearch.query_ball_tree(KDsearch, Maxdist)
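For reference, here is a minimal self-contained sketch of the same call that can be run as-is. The random points, the point count, and the radius value are assumptions for illustration only, not my real data. It shows that the result is a Python list holding one list of neighbor indices per point, so its size grows with the total number of neighbor pairs found within the radius.

```python
import numpy as np
from scipy.spatial import cKDTree

MAXDIST = 0.01  # hypothetical search radius, chosen only for this sketch


def construct_search_tree(all_xy_points, maxdist):
    """Build a k-d tree and query it against itself for neighbors within maxdist."""
    kd_search = cKDTree(all_xy_points)
    # Returns a list of lists: entry i holds the indices of all points
    # within maxdist of point i (including i itself).
    return kd_search.query_ball_tree(kd_search, maxdist)


# Random stand-in data (assumption): 1,000 points in the unit square.
rng = np.random.default_rng(0)
points = rng.random((1000, 2))

neighbors = construct_search_tree(points, MAXDIST)

# Total number of stored indices across all per-point lists.
total_entries = sum(len(n) for n in neighbors)
print(len(neighbors), total_entries)
```

With denser data or a larger radius, total_entries grows much faster than the number of points, which is presumably where the memory goes in my case.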
My questions:
1) Does anyone know of an alternative to cKDTree / query_ball_tree that consumes less memory? In this case, speed is less important than memory usage.
2) I was hoping that switching from 32-bit to 64-bit Python and extensions would solve the MemoryError. What could be the reason that it does not?
Thanks for your help.