I do not understand why people are jumping on the OP. In context, this is clearly a programming question: it asks how to obtain the empirical frequency of data points falling inside a given ellipse, not a theoretical probability. The OP even posted code and a graph showing what they were trying to get.
Perhaps they do not fully understand the statistical theory behind a 95% ellipse, but they did not ask about that. Besides, making plots and computing frequencies like this is a great way to get a handle on the theory.
In any case, here is code that answers the narrowly defined question of how to count the points inside the ellipse derived from the normal distribution (which is what underlies dataEllipse). The idea is to transform your data to a unit circle via principal components, and then find the proportion of points within a certain radius of the origin.
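library(car)  # provides dataEllipse

within.ellipse <- function(x, y, plot.ellipse=TRUE) {
  # Allow a two-column matrix to be passed as the single argument x
  if(missing(y) && is.matrix(x) && ncol(x) == 2) {
    y <- x[,2]
    x <- x[,1]
  }
  if(plot.ellipse) dataEllipse(x, y, levels=0.95)
  # Principal component scores, rescaled to unit variance
  d <- scale(prcomp(cbind(x, y), scale.=TRUE)$x)
  # Radius of the 95% normal-theory ellipse in the transformed space
  rad <- sqrt(2 * qf(.95, 2, nrow(d) - 1))
  # Proportion of points falling inside that radius
  mean(sqrt(d[,1]^2 + d[,2]^2) < rad)
}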
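As a cross-check on the function above, the same proportion can be computed without the principal component step: the distance from the origin after rescaling the components is just the Mahalanobis distance from the sample mean. The sketch below uses only base R; the name coverage.mahalanobis and the simulated data are mine, for illustration only.

```r
# Proportion of points inside the normal-theory data ellipse, computed
# directly from squared Mahalanobis distances (base R only, no plotting).
# The function name is hypothetical, chosen for this example.
coverage.mahalanobis <- function(x, y, level = 0.95) {
  xy <- cbind(x, y)
  # Squared distance of each point from the sample mean, measured in
  # units of the sample covariance
  d2 <- mahalanobis(xy, center = colMeans(xy), cov = cov(xy))
  # Same radius as in within.ellipse, squared: 2 * F quantile on (2, n - 1) df
  rad2 <- 2 * qf(level, 2, nrow(xy) - 1)
  mean(d2 < rad2)
}

# Illustrative data: correlated bivariate normal sample
set.seed(1)
x <- rnorm(500)
y <- 0.5 * x + rnorm(500)
coverage.mahalanobis(x, y)
```

For well-behaved data like this the result should land close to, but generally not exactly at, 0.95, and it should agree with within.ellipse(x, y, plot.ellipse=FALSE).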
It was also claimed that a 95% data ellipse contains, by definition, 95% of the data. That is, of course, not so, at least for normal-theory ellipses. If your distribution is particularly ill-behaved, the coverage frequency may not even converge to the nominal level as the sample size increases. Consider a generalized Pareto distribution, for example:
library(evd)