I'm using Python for this task, and to be direct: I want to find a "pythonic" way to remove "duplicates" from an array of arrays, where "duplicates" are entries that lie within a threshold distance of each other. For example, given this array:
[[ 5.024, 1.559, 0.281], [ 6.198, 4.827, 1.653], [ 6.199, 4.828, 1.653]]
Note that [ 6.198, 4.827, 1.653] and [ 6.199, 4.828, 1.653] are really close to each other: their Euclidean distance is about 0.0014, so they are almost "duplicates". I want my final result to be simply:
[[ 5.024, 1.559, 0.281], [ 6.198, 4.827, 1.653]]
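As a quick check of the distance quoted above, here is a minimal sketch using NumPy. The variable names `a`, `b`, and `distance` are only illustrative.

```python
import numpy as np

# The two nearly identical points from the example above.
a = np.array([6.198, 4.827, 1.653])
b = np.array([6.199, 4.828, 1.653])

# Euclidean distance: sqrt(0.001**2 + 0.001**2 + 0**2)
distance = np.linalg.norm(a - b)
print(round(distance, 4))
```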
The algorithm I have now is:
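    import numpy as np

    # unique_cluster_centers is a list of points (each a list of floats);
    # self.tolerance is the distance threshold.
    to_delete = []
    for n, i in enumerate(unique_cluster_centers):
        if i in to_delete:
            # Already marked as a near-duplicate of an earlier point;
            # skipping it avoids deleting both members of a close pair.
            continue
        for ii in unique_cluster_centers[n + 1:]:
            if np.linalg.norm(np.array(i) - np.array(ii)) <= self.tolerance:
                to_delete.append(ii)
    for i in to_delete:
        try:
            unique_cluster_centers.remove(i)
        except ValueError:
            # The point was already removed.
            pass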
But it is very slow. I would like to know a faster and more "pythonic" way to solve this problem. My tolerance is 0.0001.
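One possible direction, offered as a sketch and not a definitive answer, is to compute all pairwise distances at once with NumPy broadcasting and then keep each point only if no earlier kept point lies within the tolerance. The function name `drop_near_duplicates` is hypothetical. The tolerance of 0.01 in the usage lines is an assumption chosen for illustration: the example pair is about 0.0014 apart, so the stated tolerance of 0.0001 would keep both points.

```python
import numpy as np

def drop_near_duplicates(points, tolerance):
    """Greedily keep a point unless it lies within `tolerance`
    (Euclidean distance) of a point that was already kept."""
    pts = np.asarray(points, dtype=float)
    if len(pts) == 0:
        return pts
    # Pairwise differences via broadcasting, shape (n, n, dims).
    diff = pts[:, None, :] - pts[None, :, :]
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    keep = np.ones(len(pts), dtype=bool)
    for i in range(len(pts)):
        if keep[i]:
            # Mark every later point that is close to kept point i.
            close = dist[i] <= tolerance
            close[: i + 1] = False
            keep[close] = False
    return pts[keep]

# Usage with the example data; 0.01 is an illustrative tolerance.
centers = [[5.024, 1.559, 0.281],
           [6.198, 4.827, 1.653],
           [6.199, 4.828, 1.653]]
result = drop_near_duplicates(centers, tolerance=0.01)
print(result.tolist())
# → [[5.024, 1.559, 0.281], [6.198, 4.827, 1.653]]
```

This moves the distance computation out of the Python loop, at the cost of holding an n-by-n distance matrix in memory. For a large number of points, a spatial index such as `scipy.spatial.cKDTree` with its `query_pairs` method may scale better, though that is not tested here.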