Explanation of the jitter function in R

According to the documentation, the description of the jitter function is: "Add a small amount of noise to a numeric vector."

What does it mean?

Is a random number generated for each element of the vector and added to it?

2 answers

Jittering really does just mean adding random noise to a vector of numeric values. By default, the jitter function does this by drawing samples from a uniform distribution. If the amount argument is not provided, the range of the noise is chosen according to the data.
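As a minimal sketch of the point above, the snippet below compares jitter() with uniform noise added by hand. The vector x and the value 0.2 are illustrative assumptions, not part of the original answer. My reading of ?jitter is that, when amount is omitted, the half-width of the noise defaults to factor/5 times the smallest gap between distinct values, which would be 0.2 for integer data like this; treat that as an assumption to check against the help page.

```r
# Hypothetical example vector with many tied values.
x <- rep(1:3, each = 4)

# Explicit amount: the noise is drawn uniformly from [-0.2, 0.2].
set.seed(42)
jittered <- jitter(x, amount = 0.2)

# The same thing built by hand with runif(), using the same seed.
set.seed(42)
by_hand <- x + runif(length(x), min = -0.2, max = 0.2)

# Default call: the noise range is derived from the data.
set.seed(42)
jittered_default <- jitter(x)

# Compare the two constructions and inspect the largest default perturbation.
print(all.equal(jittered, by_hand))
print(max(abs(jittered_default - x)))
```

Each element gets its own independent random offset, so tied values end up at slightly different positions.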

I think the term "jittering" also covers distributions other than the uniform, and it is typically used to better visualize overlapping values, such as integer covariates. This helps to show where the density of observations is high. It is good practice to mention in the figure legend that some of the values have been jittered, even if it is obvious. Here is an example visualization using the jitter function, as well as jittering with a normal distribution, where I arbitrarily chose sd = 0.1:

    n <- 500
    set.seed(1)
    # Integer x values (1, 2, 3) with a continuous response
    dat <- data.frame(integer = rep(1:3, each=n),
                      continuous = c(rnorm(n, mean=1), rnorm(n, mean=2), rnorm(n, mean=3))^2)
    par(mfrow=c(3,1))
    plot(dat, main="No jitter for x-axis", xlab="Integer", ylab="Continuous")
    plot(jitter(dat[,1]), dat[,2], main="Jittered x-axis (uniform distr.)", xlab="Integer", ylab="Continuous")
    plot(dat[,1]+rnorm(3*n, sd=0.1), dat[,2], main="Jittered x-axis (normal distr.)", xlab="Integer", ylab="Continuous")

[Figure: three stacked scatterplots of the example data: no jitter, uniform jitter, and normal jitter on the x-axis]


A really good explanation of jittering and why it is needed can be found in the swirl course on Regression Models in R.

It takes Sir Francis Galton's data on the relationship between the heights of parents and their children and plots it first without jitter and then with jitter.

This is without jitter (plot(child ~ parent, galton)):

[Figure: scatterplot of child height against parent height, without jitter]

This is with jitter (please ignore the regression lines) (plot(jitter(child, 4) ~ parent, galton)):

[Figure: scatterplot of jittered child height against parent height, with regression lines]

The course says that without jitter, many people have the same height, so points fall on top of each other, which is why some of the circles in the first plot look darker than others. However, by applying R's "jitter" function to the children's heights, we can spread out the data to simulate measurement errors and make the high-frequency heights more visible.
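The overplotting effect can be sketched without the galton data set, which does not ship with base R. The simulated heights below are purely hypothetical stand-ins (rounded to whole inches so that ties occur); only the call jitter(child, 4) mirrors the one used in the course.

```r
# Hypothetical stand-in for the galton data: heights rounded to whole inches.
set.seed(1)
parent <- round(rnorm(200, mean = 68, sd = 2))
child  <- round(rnorm(200, mean = 68, sd = 2.5))

# Count points that sit exactly on top of an earlier point.
overlaps_before <- sum(duplicated(data.frame(parent, child)))

# Jitter the child heights, as in plot(jitter(child, 4) ~ parent, galton).
child_jittered <- jitter(child, 4)
overlaps_after <- sum(duplicated(data.frame(parent, child_jittered)))

print(c(before = overlaps_before, after = overlaps_after))

# Side-by-side plots: stacked points on the left, spread-out points on the right.
par(mfrow = c(1, 2))
plot(child ~ parent, main = "Without jitter")
plot(child_jittered ~ parent, main = "With jitter")
```

The count of exact overlaps drops sharply after jittering, which is what makes the dense regions visible in the second plot.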


Source: https://habr.com/ru/post/949015/

