Calculation of the maximum histogram value

How to calculate the maximum histogram value when creating a chart?

I want to place a vertical line on a chart with an annotation, and I want the vertical position of the text to be proportional to the maximum value on the y axis. For instance:

    library(ggplot2)

    df <- data.frame(x = runif(1000))
    p <- ggplot(data=df, aes(x)) + geom_histogram()
    p + geom_vline(aes(xintercept=0.5),color='red') +
      geom_text(aes(0.55, 10, label='line'), angle = 90, color='red')

produces a histogram with a red vertical line at x = 0.5 and the rotated label "line" beside it.

I would like to pass geom_text() a y value equal to 1/3 of the histogram's maximum count, since I think this is the best way to position the text consistently, but I do not know how to calculate this count.

3 answers

stat_bin uses binwidth = range / 30 by default. I'm not sure exactly how it places the bins, but this should be a pretty reasonable approximation:

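    # 31 equally spaced breaks give 30 bins; include.lowest keeps min(df$x) in the first bin
    breaks <- seq(min(df$x), max(df$x), length.out = 31)
    max(table(cut(df$x, breaks, include.lowest = TRUE)))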
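
Rather than reproducing stat_bin's binning by hand, one option is to let ggplot2 do the binning and read the counts back from the plot. The following is a minimal sketch, not the method any of the answers here used: it relies on ggplot2's exported layer_data() function, and the fixed seed, the explicit bins = 30, and the names bin_data, max_count and p_annotated are assumptions made for illustration.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
df <- data.frame(x = runif(1000))

# bins = 30 is the default bin count; stating it silences the "pick better value" message
p <- ggplot(df, aes(x)) + geom_histogram(bins = 30)

# layer_data() returns the data ggplot2 computed for a layer; layer 1 is the histogram
bin_data <- layer_data(p, 1)
max_count <- max(bin_data$count)

# Place the label at one third of the tallest bar
p_annotated <- p +
  geom_vline(xintercept = 0.5, colour = "red") +
  annotate("text", x = 0.55, y = max_count / 3, label = "line",
           angle = 90, colour = "red")
```

Printing p_annotated draws the plot. Because the counts come from the same binning the plot uses, the label height stays in step with the bars even if the bin settings change.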

In general, a simple linear search for the maximum of one-dimensional data is implemented as follows (in my case, in ANSI C):

    #include <stdio.h>
    #include <errno.h>

    int printMaxHistValue(int* yValues, int* xValues, int numPoints)
    {
        int i, maxX, maxY, maxIndex = 0;

        if (numPoints <= 0) {
            printf("Invalid number of points in histogram! Need at least 1 point! Exiting\n");
            return EINVAL;
        }

        /* Start from the first bin so the result is valid for any input */
        maxX = xValues[0];
        maxY = yValues[0];

        /* Linear scan: remember the largest y value and where it occurred */
        for (i = 1; i < numPoints; i++) {
            if (yValues[i] > maxY) {
                maxY = yValues[i];
                maxX = xValues[i];
                maxIndex = i;
            }
        }

        printf("Found the maximum histogram value of y=%d at bin/x-value of %d (which corresponds to i=%d)\n",
               maxY, maxX, maxIndex);

        return 0; /* EOK is not defined by the standard <errno.h> */
    }

Hope this example helps :)
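
In the R setting of the question, the same search is available through the built-in which.max(). This is a minimal sketch that uses base R's hist() as a stand-in for the binning; its breaks are not guaranteed to match those of geom_histogram, and the seed and variable names are assumptions for illustration.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
x <- runif(1000)

# A single number for breaks is only a suggestion; hist() picks "pretty" break points
h <- hist(x, breaks = 30, plot = FALSE)

# which.max() returns the index of the first maximum, like the loop in the C version
i <- which.max(h$counts)
max_count <- h$counts[i]

cat(sprintf("Maximum count %d in bin %d (midpoint %.3f)\n",
            max_count, i, h$mids[i]))
```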


You can use the hist function, which calculates the counts. Just make sure you give it the same bin breaks as geom_histogram. If no binwidth is given to geom_histogram, the default is range / 30. From a look at how geom_histogram generates the bins, I think this should work:

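    require(plyr)  # provides round_any()

    # Bin width assumed above for geom_histogram when none is given: range / 30
    binwidth <- diff(range(df$x)) / 30

    # Snap the outer breaks to multiples of the bin width
    min.brea <- round_any(min(df$x), binwidth, floor)
    max.brea <- round_any(max(df$x), binwidth, ceiling)
    breaks <- seq(min.brea, max.brea, binwidth)

    histdata <- hist(df$x, breaks=breaks, plot=FALSE, right=FALSE)
    max.value <- max(histdata$counts)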

The round_any function comes from plyr.


Source: https://habr.com/ru/post/912134/

