How to find the smallest ellipse covering a given fraction of the set of points from R?

I am wondering: Is there any function / smart way to find the smallest ellipse covering a given fraction of the set of 2d points in R? . With the smallest value, I mean the ellipse with the smallest area.

Clarification: I am fine with an approximately correct solution if the number of points is large (as I believe, the exact solution would be to try all combinations of subsets of points)

This question may sound like a duplicate of the Ellipse question containing the percentage of given points in R , but a way of formulating this question. The resulting answer does not lead to the smallest ellipse. For example, using the solution indicated in the Ellipse containing the percentage of given points in R :

require(car) x <- runif(6) y <- runif(6) dataEllipse(x,y, levels=0.5) 

The resulting ellipse is obviously not the smallest ellipse containing half the dots, which, I think, will be a small ellipse covering the three dots in the upper left corner.

enter image description here

+5
source share
2 answers

I think I have a solution that requires two functions: cov.rob from the MASS package and ellipsoidhull from the cluster package. cov.rob(xy, quantile.used = 50, method = "mve") finds about the "best" 50 points out of a total of 2d points in xy that are contained in an ellipse of minimal size. However, cov.rob does not directly return this ellipse, but rather some other ellipse estimated from the best points (the goal is to reliably estimate the covariance matrix). To find the minimum ellipse of the actual, we can give the best ellipsoidhull points that find the minimum ellipse, and we can use predict.ellipse to infer the coordinates of the path that defines the shell of the ellipse.

I am not 100% sure that this method is the simplest and / or 100% working (it seems that the second step of using ellipsoidhull can be avoided, but I did not understand how to do this.). It seems to at least work on my toy example ...

Suffice it to say, here is the code:

 library(MASS) library(cluster) # Using the same six points as in the question xy <- cbind(x, y) # Finding the 3 points in the smallest ellipse (not finding # the actual ellipse though...) fit <- cov.rob(xy, quantile.used = 3, method = "mve") # Finding the minimum volume ellipse that contains these three points best_ellipse <- ellipsoidhull( xy[fit$best,] ) plot(xy) # The predict() function returns a 2d matrix defining the coordinates of # the hull of the ellipse lines(predict(best_ellipse), col="blue") 

enter image description here

Looks good! You can also view the ellipse object for more information.

 best_ellipse ## 'ellipsoid' in 2 dimensions: ## center = ( 0.36 0.65 ); squared ave.radius d^2 = 2 ## and shape matrix = ## xy ## x 0.00042 0.0065 ## y 0.00654 0.1229 ## hence, area = 0.018 

Here is a handy feature that adds an ellipse to an existing base graphic:

 plot_min_ellipse <- function(xy, points_in_ellipse, color = "blue") { fit <- cov.rob(xy, quantile.used = points_in_ellipse, method = "mve") best_ellipse <- ellipsoidhull( xy[fit$best,] ) lines(predict(best_ellipse), col=color) } 

Use it for more points:

 x <- runif(100) y <- runif(100) xy <- cbind(x, y) plot(xy) plot_min_ellipse(xy, points_in_ellipse = 50) 

enter image description here

+4
source

This is very similar to the two-dimensional confidence interval. Try http://stat.ethz.ch/R-manual/R-devel/library/cluster/html/ellipsoidhull.html . You will probably need to run it in each combination of N points, and then select the smallest result.

0
source

Source: https://habr.com/ru/post/1206454/


All Articles