Creating binned scatterplots for two variables in ggplot2 in R

I have a data frame with two columns, x and y, each of which contains values from 0 to 100 (the data is paired). I want to plot them against each other using binned scatterplots. If I used a regular scatter plot, this would be easy to do:

geom_point(aes(x=x, y=y))

but instead I would like to bin the points into N bins from 0 to 100, compute the average value of x in each bin and the average value of y for the points in that bin, and show those averages as a scatter plot, so that the plot shows how the mean values correlate rather than the original data points.

Is there a smart / quick way to do this in ggplot2 using some combination of geom_smooth() and geom_point()? Or should it be pre-computed manually and then plotted?

2 answers

I suggest geom_bin2d.

DF <- data.frame(x=1:100,y=1:100+rnorm(100))

library(ggplot2)
p <- ggplot(DF,aes(x=x,y=y)) + geom_bin2d()
print(p)


Yes, you can use stat_summary_bin.

set.seed(42)
x <- runif(1e4)
y <- x^2 + x + 4 * rnorm(1e4)
df <- data.frame(x=x, y=y)

library(ggplot2)
(ggplot(df, aes(x=x,y=y)) +
  geom_point(alpha = 0.4) +
  # fun.y was renamed to fun in ggplot2 3.3.0
  stat_summary_bin(fun='mean', bins=20,
                   color='orange', size=2, geom='point'))
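
Note that stat_summary_bin draws each point at the centre of its bin rather than at the mean of x within the bin. If the mean of x per bin is also wanted, as the question asks, one option is to pre-compute the averages and plot them. Below is a minimal sketch of that approach, not a definitive implementation. The simulated data, the name n_bins and the choice of 20 equal-width bins are assumptions made for illustration.

```r
set.seed(42)
# Simulated paired data on [0, 100] (illustrative only)
df <- data.frame(x = runif(1e4, 0, 100))
df$y <- df$x + rnorm(1e4, sd = 20)

n_bins <- 20  # assumed value of N

# Assign each row to one of n_bins equal-width bins covering [0, 100]
df$bin <- cut(df$x, breaks = seq(0, 100, length.out = n_bins + 1),
              include.lowest = TRUE)

# Mean of x and mean of y within each bin (one row per non-empty bin)
binned <- aggregate(cbind(x, y) ~ bin, data = df, FUN = mean)

# Plot the per-bin means; guarded so the summary step runs without ggplot2
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(binned, aes(x = x, y = y)) +
    geom_point(color = 'orange', size = 2)
  print(p)
}
```

With equal-width bins and roughly uniform x the difference from stat_summary_bin is small, but it can matter when x is unevenly distributed inside the bins.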


Source: https://habr.com/ru/post/1653614/
