Ensure minimum distance between adjacent points

I have a list / data frame of 15-25 data points. They all lie between 0 and 100, and there are some clusters (for example, around 72). When displaying this data, I want to spread the points out so that every pair of neighboring points is at least 2 apart (for example, 69.4 and 71.4 as two neighboring points).

However, I need to preserve the overall order and keep each point as close to its original value as possible.

My point list is just

scores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981, 57.851, 72.038, 72.204, 72.296, 73.472, 75.925, 80.748, 85.998) 

I want to increase the distance between the points inside clusters. For example, the points at 72.038 to 72.296 would move down to give a more even spread:

 spacedScores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981, 57.851, 67.925, 69.925, 71.925, 73.925, 75.925, 80.748, 85.998) 

Any suggestions on how to do this most cleanly in R?

Clarification: I'm not necessarily looking for a mathematically optimal solution, just something reasonably good. I also often see cases where some points need to be moved up, while other points are fine as they are.

2 answers

You can use diff(scores) to find the distance between the points (I assume the values are sorted).

Then use which(diff(scores) < 2) to identify the "bad points" and move them backward so that the distance = 2.
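As a small illustration of these two calls on the data from the question (the names gaps and bad are only illustrative), the sketch below lists the gaps and the positions that are too close. Index i in the result refers to the gap between point i and point i + 1.

```r
scores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981,
            57.851, 72.038, 72.204, 72.296, 73.472, 75.925, 80.748, 85.998)

# diff() returns one value per pair of neighbors: 15 gaps for 16 points
gaps <- diff(scores)

# positions whose gap to the next point is smaller than 2
bad <- which(gaps < 2)
print(bad)         # 8 10 11 12
print(gaps[bad])   # the gaps that are too small
```

The run of consecutive indices 10, 11, 12 is the cluster around 72; index 8 is the pair 55.981 / 57.851, which is only 1.87 apart.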

The problem is that moving one point to correct one distance can cause the previous or next distance to become <2, so you have to repeat this several times.

Here is an example where I iterate toward a solution. You can add a counter to avoid an infinite loop.

scores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981, 57.851, 72.038, 72.204, 72.296, 73.472, 75.925, 80.748, 85.998)
spacedScores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981, 57.851, 67.925, 69.925, 71.925, 73.925, 75.925, 80.748, 85.998)

plot(scores, pch=20)
points(spacedScores, pch='x', col="red")

badPoints <- which(diff(scores) < 2)
while (length(badPoints) > 0) {
  scores[badPoints] <- scores[badPoints] - (2 - diff(scores)[badPoints])
  badPoints <- which(diff(scores) < 2)
}
points(scores, pch='o', col="green")

Here is the result: the starting points are in black, the adjusted points in green, and the spaced points you specified in red.

example plot
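The counter mentioned in this answer could look roughly like the sketch below. It wraps the same loop in a function with an iteration cap. The function name space_points_down and the parameters min_gap, max_iter and tol are my own additions, not part of the answer; tol is a small tolerance so that floating-point rounding (a gap of 1.9999999999 instead of 2) does not keep the loop running.

```r
# Hypothetical helper: the loop from the answer plus an iteration cap.
space_points_down <- function(x, min_gap = 2, max_iter = 1000, tol = 1e-9) {
  x <- sort(x)                       # the approach assumes sorted values
  iter <- 0
  bad <- which(diff(x) < min_gap - tol)
  while (length(bad) > 0 && iter < max_iter) {
    # move the lower point of each offending pair down by the missing distance
    x[bad] <- x[bad] - (min_gap - diff(x)[bad])
    bad <- which(diff(x) < min_gap - tol)
    iter <- iter + 1
  }
  if (length(bad) > 0) {
    warning("max_iter reached; some gaps are still below min_gap")
  }
  x
}

scores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981,
            57.851, 72.038, 72.204, 72.296, 73.472, 75.925, 80.748, 85.998)
spaced <- space_points_down(scores)
print(round(spaced, 3))
```

Because points are only ever moved downward, the lowest affected point ends up furthest from its original value; on this data the cluster around 72 is stretched down to start near 67.5.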


I put together a hacky brute-force method that keeps iterating until every diff is at least 2, with the least modification to the dataset required:

scores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981, 57.851, 72.038, 72.204, 72.296, 73.472, 75.925, 80.748, 85.998)
done <- 0
while (any(diff(scores)<2)) {
  diffs <- diff(scores)
  closevals <- which(diffs < 2)
  first <- closevals[which.min(diffs[closevals])]
  if (which.min(diff(scores[(first-1):(first+1)])) == 1) {
    scores[1:(first-1)] <- scores[1:(first-1)] - (2 - (scores[first] - scores[first-1]))
  } else {
    scores[(first+1):length(scores)] <- scores[(first+1):length(scores)] + (2 - (scores[first+1] - scores[first]))
  }
}

> scores
 [1] 13.343 17.998 25.413 27.721 33.361 47.263 52.298 55.981 57.981 72.168
[11] 74.168 76.168 78.168 80.621 85.444 90.694

Edit: I just saw that a nicer and simpler answer has been given (with exactly the same results). The only reason I am not deleting this more complicated answer is that my loop also checks whether widening the gap by adding the missing amount to the higher values works better, instead of always subtracting 2 - diff() from the lower values.

I hope my solution works better on real data :)
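For comparison, here is a loop-free sketch that is not taken from either answer. It assumes sorted input and relies on the observation that "every gap is at least min_gap" is the same as "x minus min_gap * (position - 1) is non-decreasing", which cummax() and cummin() can enforce in a single pass. The function names space_up, space_down and space_both are hypothetical.

```r
# Push points upward only: smallest upward shifts that reach the minimum gap.
space_up <- function(x, min_gap = 2) {
  offset <- min_gap * (seq_along(x) - 1)
  cummax(x - offset) + offset
}

# Push points downward only: smallest downward shifts that reach the minimum gap.
space_down <- function(x, min_gap = 2) {
  offset <- min_gap * (seq_along(x) - 1)
  rev(cummin(rev(x - offset))) + offset
}

# Average of both passes: clusters spread in both directions.
# The average of two valid spacings also keeps every gap >= min_gap.
space_both <- function(x, min_gap = 2) {
  (space_up(x, min_gap) + space_down(x, min_gap)) / 2
}

scores <- c(13.343, 17.998, 25.413, 27.721, 33.361, 47.263, 52.298, 55.981,
            57.851, 72.038, 72.204, 72.296, 73.472, 75.925, 80.748, 85.998)
print(round(space_down(scores), 3))
print(round(space_up(scores), 3))
print(round(space_both(scores), 3))
```

On this data the downward pass should agree with the iterative "move down" loop up to floating-point rounding. The averaged version splits the movement between both directions, which may suit the case where some points need to move up and others down; it is a heuristic, not a proven optimum.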


Source: https://habr.com/ru/post/1341853/