Hexbin: apply a function to each bin

I would like to build a hexbin plot where each bin displays the ratio between the class 1 and class 2 points falling into that bin (optionally on a log scale).

    library(hexbin)

    x <- rnorm(10000)
    y <- rnorm(10000)
    h <- hexbin(x, y)
    plot(h)
    l <- as.factor(c(rep(1, 2000), rep(2, 8000)))

Any suggestions on how to implement this? Is there a way to display, for each bin, a function computed from that bin's statistics?

+4
2 answers

@cryo111's answer has the most important ingredient: IDs = TRUE. After that, you just need to decide what to do with Inf values and how much to scale the ratio to get integers that produce a pretty plot.

    library(hexbin)
    library(data.table)

    set.seed(1)
    x = rnorm(10000)
    y = rnorm(10000)

    h = hexbin(x, y, IDs = TRUE)

    # put all the relevant data in a data.table
    # (the labels 1,1,1,2 are repeated explicitly to the length of x)
    dt = data.table(x, y, l = rep(c(1, 1, 1, 2), length.out = length(x)), cID = h@cID)

    # group by cID and calculate whatever statistic you like
    # in this case, ratio of 1 to 2's,
    # and then Inf are set to be equal to the largest ratio
    dt[, list(ratio = sum(l == 1)/sum(l == 2)), keyby = cID][,
         ratio := ifelse(ratio == Inf, max(ratio[is.finite(ratio)]), ratio)][,
         # scale up (I chose a scaling manually to get a prettier graph)
         # and convert to integer and change h
         as.integer(ratio*10)] -> h@count

    plot(h)

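As an alternative to capping Inf at the largest finite ratio, the per-bin statistic can be smoothed so that it is always finite. The following is only a sketch in base R plus hexbin: the pseudocount of 0.5 and the variable names are assumptions made for illustration, and the class labels are the ones from the question (first 2000 points in class 1).

```r
library(hexbin)

set.seed(1)
x <- rnorm(10000)
y <- rnorm(10000)
# class labels as in the question: first 2000 points are class 1, the rest class 2
l <- c(rep(1, 2000), rep(2, 8000))

h <- hexbin(x, y, IDs = TRUE)

# per-bin class counts; tapply returns one value per distinct cell ID,
# ordered by cell ID
n1 <- tapply(l == 1, h@cID, sum)
n2 <- tapply(l == 2, h@cID, sum)

# assumption: a pseudocount of 0.5 keeps the log-ratio finite
# even when a bin holds no points of one class
log_ratio <- log((n1 + 0.5) / (n2 + 0.5))

# sanity check: the values line up with the non-empty cells stored in h@cell
stopifnot(all(as.integer(names(log_ratio)) == h@cell))
```

The resulting vector has one entry per non-empty bin, so it can be shifted, scaled and rounded to integers in the same way as the ratio above before being assigned to h@count.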

+3

You can determine the number of class 1 and class 2 points in each bin using:

    library(hexbin)
    library(plyr)

    x=rnorm(10000)
    y=rnorm(10000)

    #generate hexbin object with IDs=TRUE
    #the object includes then a slot with a vector cID
    #cID maps point (x[i],y[i]) to cell number cID[i]
    HexObj=hexbin(x,y,IDs = TRUE)

    #find count statistics for first 2000 points (class 1) and the rest (class 2)
    CountDF=merge(count(HexObj@cID[1:2000]),
                  count(HexObj@cID[2001:length(x)]),
                  by="x",
                  all=TRUE
                 )

    #replace NAs by 0
    CountDF[is.na(CountDF)]=0

    #check if all points are included
    sum(CountDF$freq.x)+sum(CountDF$freq.y)

But plotting them is another story. For example, what if a bin contains no class 2 points? Then the ratio is undefined. In addition, as I understand it, hexbin is just a two-dimensional histogram: it counts the number of points falling into each bin. I do not think it can handle non-integer data such as the ratios in your case.
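One way around both concerns is to plot a statistic that is always defined, such as the fraction of class 1 points in each bin, and to draw it outside of the count slot. The sketch below uses hcell2xy from hexbin to get the centres of the non-empty cells and colours a base-graphics scatter plot by that fraction. The grey palette, the ten colour steps and the variable names are assumptions chosen for illustration, and filled circles stand in for proper hexagons.

```r
library(hexbin)

set.seed(1)
x <- rnorm(10000)
y <- rnorm(10000)
# classes as in the question: the first 2000 points are class 1
is_class1 <- seq_along(x) <= 2000

h <- hexbin(x, y, IDs = TRUE)

# fraction of class 1 points per bin; every bin listed in h@cell
# contains at least one point, so the mean is always defined
frac1 <- as.numeric(tapply(is_class1, h@cID, mean))

# centres of the non-empty cells, in the same order as h@cell
centres <- hcell2xy(h)

# assumption: a 10-step grey palette, darker meaning more class 1
pal <- grey(seq(0.9, 0.1, length.out = 10))
idx <- cut(frac1, breaks = seq(0, 1, length.out = 11),
           include.lowest = TRUE, labels = FALSE)

plot(centres$x, centres$y, pch = 16, col = pal[idx],
     xlab = "x", ylab = "y")
```

Because the fraction lies between 0 and 1, no special handling of Inf is needed, and a log-odds transform can still be applied afterwards if a log scale is preferred.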

+1

Source: https://habr.com/ru/post/1491736/

