As an example, suppose you measure temperature, wind, and precipitation. We will call these items "features." Valid values might be:
- Temperature: -50 to 100 °F (I'm in Minnesota, USA).
- Wind: 0 to 120 mph (not sure if this is realistic, but bear with me).
- Precipitation: 0 to 100
Start by normalizing your data. Temperature has a range of 150 units, wind 120 units, and precipitation 100 units. Multiply your wind values by 1.25 and your precipitation values by 1.5 to put them on approximately the same scale as temperature. You can get fancy here and make rules that weight one feature more heavily than others. In this example, wind has a huge possible range but usually stays within a much smaller one, so you may want to weight it less so that it does not skew your results.
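The scaling step could be sketched in Python roughly like this. The factors 1.25 and 1.5 come from the ranges above; the `scale` function, the `SCALE` constant, and the `(temp, wind, precip)` tuple layout are names I made up for illustration.

```python
# Factors that stretch each feature to roughly the 150-unit temperature range:
# 120 * 1.25 = 150 (wind) and 100 * 1.5 = 150 (precipitation).
SCALE = (1.0, 1.25, 1.5)  # order assumed here: (temp, wind, precip)

def scale(point, factors=SCALE):
    """Return a copy of point with each feature multiplied by its factor."""
    return tuple(value * factor for value, factor in zip(point, factors))

print(scale((70.0, 10.0, 20.0)))  # (70.0, 12.5, 30.0)
```

To weight a feature less, as suggested for wind, you would simply pick a smaller factor for it.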
Now imagine each measurement as a point in multidimensional space. This example uses three-dimensional space (temperature, wind, precipitation). The nice part is that if we add more features, we simply increase the dimension of our space, and the math stays the same. Either way, we want to find the historical points that are closest to our current point. The easiest way to do that is Euclidean distance. So measure the distance from the current point to each historical point and keep the closest matches:
    for each historicalpoint
        distance = sqrt(
            pow(currentpoint.temp - historicalpoint.temp, 2) +
            pow(currentpoint.wind - historicalpoint.wind, 2) +
            pow(currentpoint.precip - historicalpoint.precip, 2))
        if distance is smaller than the largest distance in our match collection
            add historicalpoint to our match collection
            remove the match with the largest distance from our match collection
    next
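For reference, here is one way the loop above might look in Python. It is only a sketch: it assumes the points are plain `(temp, wind, precip)` tuples that have already been scaled, and `nearest_matches` and the sample data are hypothetical. `heapq.nsmallest` does the "keep only the k closest" bookkeeping for us.

```python
import heapq
import math

def nearest_matches(current, history, k=3):
    """Return the k historical points closest to current (Euclidean distance)."""
    return heapq.nsmallest(k, history, key=lambda p: math.dist(current, p))

# Made-up sample data, already scaled.
history = [
    (70.0, 12.5, 30.0),
    (20.0, 50.0, 0.0),
    (68.0, 10.0, 33.0),
    (-10.0, 5.0, 15.0),
]
current = (69.0, 11.0, 30.0)

print(nearest_matches(current, history, k=2))
# [(70.0, 12.5, 30.0), (68.0, 10.0, 33.0)]
```

`math.dist` needs Python 3.8 or later and works for any number of dimensions, so adding a feature only means adding an element to each tuple.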
This is a brute-force approach. If you have the time, you can get much fancier. Multidimensional data can be represented as trees, such as k-d trees or R-trees. If you have a lot of data, comparing your current observation with every historical observation will be too slow. Trees speed up the search. You might also want to look at data clustering and nearest neighbor search.
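To give a rough idea of the tree approach, here is a minimal k-d tree sketch in pure Python. The function names and the nested-tuple node layout are my own choices for illustration, and it returns only the single closest point. In practice you would probably reach for an existing implementation (SciPy, for example, ships one in `scipy.spatial.KDTree`).

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point becomes this node
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, depth=0, best=None):
    """Return the (distance, point) pair closest to target."""
    if node is None:
        return best
    point, left, right = node
    dist = math.dist(point, target)
    if best is None or dist < best[0]:
        best = (dist, point)
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best

# Made-up sample data, already scaled.
history = [(70.0, 12.5, 30.0), (20.0, 50.0, 0.0), (68.0, 10.0, 33.0), (-10.0, 5.0, 15.0)]
tree = build_kdtree(history)
print(nearest(tree, (69.0, 11.0, 30.0))[1])  # (70.0, 12.5, 30.0)
```

The tree is built once and then reused for every query, which is where the savings over the brute-force loop come from when you have a lot of historical data.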
Greetings.