About the curse of dimensionality

My question is about this topic, which I have been reading about a little. Basically, my understanding is that in higher dimensions all points end up being very close to each other.

My doubt is whether this means that calculating distances in the usual way (Euclidean, for instance) is still valid or not. If it were still valid, it would mean that when comparing vectors in high dimensions, the two most similar ones would not differ much from a third one, even if that third one were completely unrelated.

Is this right? And in that case, how would you tell whether you have a match or not?

1 answer

Basically, the distance measurement is still correct, but it becomes meaningless when you have "real world" data that is noisy.

The effect we are talking about here is that a large distance between two points in one dimension is quickly overshadowed by the small distances in all the other dimensions. That is why, in the end, all points end up at roughly the same distance from each other. There is a good illustration for this:

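Say we split the data on its value in each dimension, and each dimension (with a range of 0..1) is divided once: values in [0, 0.5) fall on one side and values in [0.5, 1] on the other. Keeping one half in every dimension covers 12.5% of the space in 3 dimensions. In 5 dimensions it is only 3.1%. In 10 dimensions it is less than 0.1%.

So each dimension still allows half of its whole value range! Yet all of that ends up in less than 0.1% of the total space: the points can differ a lot in each dimension, but that is negligible over the whole space.

You can go further and cut away only 10% of the range in each dimension, so that values in [0, 0.9) are allowed. You still end up with less than 35% of the whole space covered in 10 dimensions. In 50 dimensions it is about 0.5%. So wide ranges of values in each dimension get crammed into a very small part of the space.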
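
The percentages in this illustration are just the kept fraction of each axis raised to the number of dimensions. Here is a minimal sketch to check them; the function name `covered_fraction` is made up for this example, and it assumes every axis keeps the same fraction of the range 0..1.

```python
def covered_fraction(keep_per_dim: float, dims: int) -> float:
    """Fraction of the unit hypercube left when every axis keeps
    `keep_per_dim` of its 0..1 range (hypothetical helper)."""
    return keep_per_dim ** dims


# The (kept fraction, dimensions) pairs used in the illustration.
cases = [(0.5, 3), (0.5, 5), (0.5, 10), (0.9, 10), (0.9, 50)]

for keep, dims in cases:
    share = covered_fraction(keep, dims)
    print(f"keep {keep:.0%} per axis in {dims:>2} dims -> {share:.2%} of the space")
```

Because the kept fraction is below 1, the product shrinks exponentially with the number of dimensions, which is why even a generous 90% per axis leaves almost nothing at 50 dimensions.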

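
To see the "all points end up at about the same distance" effect directly, here is a rough sketch that measures how much farther the farthest point is than the nearest one. It assumes uniformly random points in the unit cube, which real data will not be. The name `relative_contrast` and the parameters `n_points` and `seed` are made up for this example. It uses only the standard library (`math.dist` needs Python 3.8 or later).

```python
import math
import random


def relative_contrast(dims: int, n_points: int = 500, seed: int = 0) -> float:
    """(farthest - nearest) / nearest Euclidean distance from one random
    query point to `n_points` random points in the unit cube.
    Hypothetical helper; uniform data is an assumption of this sketch."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dims)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


for dims in (2, 10, 100, 1000):
    print(f"{dims:>4} dims: relative contrast = {relative_contrast(dims):.2f}")
```

On data like this the contrast should drop sharply as the dimension grows, so the nearest neighbour is barely nearer than anything else. That is the sense in which the Euclidean distance is still correct but stops being informative.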


Source: https://habr.com/ru/post/1745635/

