R - difference scatter plot

I was wondering if there is a way to subtract two scatterplots from each other in R. I have two distributions with the same axes and want to lay one on top of the other and subtract them, thereby creating a difference scatter plot.

Here are my two graphs:

[Two images: hexbin plots of the two distributions]

and my script for the graphs:

    library(hexbin)
    library(RColorBrewer)

    setwd("/Users/home/")
    df <- read.table("data1.txt")
    x <- df$c2
    y <- df$c3

    bin <- hexbin(x, y, xbins = 2000)
    my_colors <- colorRampPalette(rev(brewer.pal(11, "Spectral")))
    d <- plot(bin, main = "", colramp = my_colors, legend = FALSE)

Any advice on how to do this would be very helpful.

EDIT: I found an additional way to do this:

    xbnds <- range(x1, x2)
    ybnds <- range(y1, y2)
    bin1 <- hexbin(x1, y1, xbins = 200, xbnds = xbnds, ybnds = ybnds)
    bin2 <- hexbin(x2, y2, xbins = 200, xbnds = xbnds, ybnds = ybnds)
    erodebin1 <- erode.hexbin(smooth.hexbin(bin1))
    erodebin2 <- erode.hexbin(smooth.hexbin(bin2))
    hdiffplot(erodebin1, erodebin2)
1 answer

Well, as a starting point, here is some sample data. Both sets are random normal samples; the second one is shifted so that it is centered on (2, 2).

    df1 <- data.frame(
      x = rnorm(1000),
      y = rnorm(1000)
    )

    df2 <- data.frame(
      x = rnorm(1000, 2),
      y = rnorm(1000, 2)
    )

To make sure the bins are identical, it is best to build a single hexbin object. To do that, I use dplyr's bind_rows with the .id argument to keep track of which data.frame each row came from (this would be even easier if you had a single data.frame with a grouping variable).

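    library(hexbin)
    library(dplyr)

    # Stack the two data.frames; the "df" column records the source (A or B)
    bothDF <- bind_rows(A = df1, B = df2, .id = "df")

    # IDs = TRUE keeps the cell id of every point, which hexTapply needs
    bothHex <- hexbin(
      x = bothDF$x,
      y = bothDF$y,
      IDs = TRUE
    )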
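If dplyr is not available, the same combined table can be sketched in base R. This is only an illustration of the "single data.frame with a grouping variable" idea: the column name df mirrors the .id argument used above, and set.seed is an addition made here purely for reproducibility.

```r
set.seed(1)  # assumption: fixed seed, only so the example is reproducible

df1 <- data.frame(x = rnorm(1000), y = rnorm(1000))
df2 <- data.frame(x = rnorm(1000, 2), y = rnorm(1000, 2))

# Tag every row with its source, then stack the two data.frames
bothDF <- rbind(
  data.frame(df = "A", df1),
  data.frame(df = "B", df2)
)

# Number of rows contributed by each source
table(bothDF$df)
```

The result has the same three columns (df, x, y) as the bind_rows version, so the later steps should not care which of the two was used.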

Then we use a combination of hexbin and dplyr to count the occurrences of each group in each cell. First, hexTapply applies table across the bins (you need to wrap the column in factor to make sure that all levels are shown in every cell; this is not needed if your column is already a factor). The result is then simplified and turned into a data.frame, which is modified with mutate to calculate the difference in counts, and finally joined to a table that gives the x and y values for each of the cell ids.

    counts <-
      hexTapply(bothHex, factor(bothDF$df), table) %>%
      simplify2array %>%
      t %>%
      data.frame() %>%
      mutate(id = as.numeric(row.names(.)),
             diff = A - B) %>%
      left_join(data.frame(id = bothHex@cell,
                           hcell2xy(bothHex)))

head(counts) gives:

      A B  id diff          x         y
    1 1 0   7    1 -1.3794467 -3.687014
    2 1 0  71    1 -0.8149939 -3.178209
    3 1 0  79    1  1.4428172 -3.178209
    4 1 0  99    1 -1.5205599 -2.923806
    5 2 0 105    2  0.1727985 -2.923806
    6 1 0 107    1  0.7372513 -2.923806
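The per-cell counts can also be cross-checked without hexTapply. The sketch below assumes that, with IDs = TRUE, the cID slot of the hexbin object holds the cell id of every input point, and simply tabulates it against the grouping column. The names tab, pos and check are made up for this illustration, and the data is regenerated with a fixed seed so the block runs on its own.

```r
library(hexbin)

set.seed(1)  # assumption: fixed seed for reproducibility
bothDF <- rbind(
  data.frame(df = "A", x = rnorm(1000),    y = rnorm(1000)),
  data.frame(df = "B", x = rnorm(1000, 2), y = rnorm(1000, 2))
)
bothHex <- hexbin(x = bothDF$x, y = bothDF$y, IDs = TRUE)

# Rows are cell ids, columns are the two sources
tab <- table(cell = bothHex@cID, df = bothDF$df)

# Cell centres, matched to the table rows by cell id (not by position)
ids <- as.integer(rownames(tab))
xy  <- hcell2xy(bothHex)
pos <- match(ids, bothHex@cell)

check <- data.frame(
  id = ids,
  A  = as.vector(tab[, "A"]),
  B  = as.vector(tab[, "B"]),
  x  = xy$x[pos],
  y  = xy$y[pos]
)
check$diff <- check$A - check$B

head(check)
```

Because both samples have 1000 points, the diff column should sum to zero, which is a quick sanity check on either version.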

Finally, we use ggplot2 to plot the resulting data, since it offers more control than hexbin (and makes it easier to use a variable other than the count as the fill).

    library(ggplot2)

    counts %>%
      ggplot(aes(x = x, y = y,
                 fill = diff)) +
      geom_hex(stat = "identity") +
      coord_equal() +
      scale_fill_gradient2()

[Image: the resulting hexagonal difference plot, with fill showing A - B]

From there, it's easy to adjust the axes, colors, etc.
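As one example of such tweaks, the sketch below sets explicit colors, a legend title and axis labels. The color names, labels and theme are arbitrary choices made for illustration, and the per-cell differences are rebuilt in a compact way (via the cID slot, assuming IDs = TRUE stores one cell id per point) so that the block runs on its own.

```r
library(hexbin)
library(ggplot2)

set.seed(1)  # assumption: fixed seed for reproducibility
grp <- rep(c("A", "B"), each = 1000)
x <- c(rnorm(1000), rnorm(1000, 2))
y <- c(rnorm(1000), rnorm(1000, 2))
hb <- hexbin(x, y, IDs = TRUE)

# Per-cell difference in counts (A minus B), with the cell centres
tab <- table(hb@cID, grp)
pos <- match(as.integer(rownames(tab)), hb@cell)
xy  <- hcell2xy(hb)
counts <- data.frame(
  x    = xy$x[pos],
  y    = xy$y[pos],
  diff = as.vector(tab[, "A"] - tab[, "B"])
)

p <- ggplot(counts, aes(x = x, y = y, fill = diff)) +
  geom_hex(stat = "identity") +
  coord_equal() +
  # Diverging scale centred on zero: blue where B dominates, red where A does
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, name = "A - B") +
  labs(x = "x", y = "y", title = "Difference in counts per hexagon") +
  theme_minimal()

print(p)
```

Keeping the midpoint at zero means cells where the two groups have equal counts stay white, which makes the sign of the difference easy to read.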


Source: https://habr.com/ru/post/1011594/