Adding different percentiles to boxplots in R

I am new to R and recently used it to make some boxplots, to which I also added the mean and standard deviation. I was wondering whether I can add some kind of tick mark or circle at different percentiles. Say I want to mark the 85th and 90th percentiles in each HOUR box: is there a way to do this? My data consist of a year of loads in MW per hour, and my output consists of 24 boxes, one for each hour, for each month. I do one month at a time because I'm not sure whether there is a way to produce all 96 plots (every month, weekday/weekend, for 4 different zones) at once. Thanks in advance for your help.

    JANWD <- read.csv("C:\\My Directory\\MWBox2.csv")
    JANWD.df <- data.frame(JANWD)
    JANWD.sub <- subset(JANWD.df, MONTH < 2 & weekend == "NO")
    KeepCols <- c("Hour", "Houston_Load")
    HWD <- JANWD.sub[, KeepCols]

    sd <- tapply(HWD$Houston_Load, HWD$Hour, sd)
    means <- tapply(HWD$Houston_Load, HWD$Hour, mean)

    boxplot(Houston_Load ~ Hour, data = HWD, xlab = "WEEKDAY HOURS",
            ylab = "MW Differnce", ylim = c(-10, 20), smooth = TRUE,
            col = "bisque", range = 0)
    points(sd, pch = 22, col = "blue")
    points(means, pch = 23, col = "red")

    # Output of the subset of data used to run boxplot for month january in Houston
    str(HWD)
    'data.frame': 504 obs. of 2 variables:
     $ Hour        : int 1 2 3 4 5 6 7 8 9 10 ...
     $ Houston_Load: num 1.922 2.747 -2.389 0.515 1.922 ...

    # OUTPUT of the original data
    str(JANWD)
    'data.frame': 8783 obs. of 9 variables:
     $ Date        : Factor w/ 366 levels "1/1/2012","1/10/2012",..: 306 306 306 306 306 306 306 306 306 306 ...
     $ Hour        : int 1 2 3 4 5 6 7 8 9 10 ...
     $ MONTH       : int 8 8 8 8 8 8 8 8 8 8 ...
     $ weekend     : Factor w/ 2 levels "NO","YES": 1 1 1 1 1 1 1 1 1 1 ...
     $ TOTAL_LOAD  : num 0.607 5.111 6.252 7.607 0.607 ...
     $ Houston_Load: num -2.389 0.515 1.922 2.747 -2.389 ...
     $ North_Load  : num 2.95 4.14 3.55 3.91 2.95 ...
     $ South_Load  : num -0.108 0.267 0.54 0.638 -0.108 ...
     $ West_Load   : num 0.154 0.193 0.236 0.311 0.154 ...
1 answer

Here is one way, using quantile() to compute the relevant percentiles for you. I add the marks using rug().

    set.seed(1)
    X <- rnorm(200)
    boxplot(X, yaxt = "n")
    ## compute the required quantiles
    qntl <- quantile(X, probs = c(0.85, 0.90))
    ## add them as a rug plot to the left hand side
    rug(qntl, side = 2, col = "blue", lwd = 2)
    ## add the box and axes
    axis(2)
    box()
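Since the question also mentions circles, here is a minimal sketch of the same idea using points() instead of rug(). It relies on the fact that a single box is drawn at x = 1 by default; the symbols and colours are arbitrary choices for illustration, not part of the original answer.

```r
set.seed(1)
X <- rnorm(200)
boxplot(X)

## compute the required quantiles, as before
qntl <- quantile(X, probs = c(0.85, 0.90))

## a single box is centred at x = 1, so plot the symbols there
points(x = rep(1, length(qntl)), y = qntl,
       pch = c(21, 24), bg = c("red", "blue"), cex = 1.2)

## label the symbols using the names quantile() attaches ("85%", "90%")
legend("bottomleft", legend = names(qntl), pch = c(21, 24),
       pt.bg = c("red", "blue"), bty = "n")
```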

Update: In response to the OP providing str() output, here is an example similar to the data the OP has to hand:

    set.seed(1) ## make reproducible
    HWD <- data.frame(Hour = rep(0:23, 10), Houston_Load = rnorm(24*10))

Now suppose you want ticks at the 85th and 90th percentiles for each Hour? If so, we need to split the data by Hour and compute the percentiles via quantile(), as I showed earlier:

    quants <- sapply(split(HWD$Houston_Load, list(HWD$Hour)),
                     quantile, probs = c(0.85, 0.9))

which gives:

    R> quants <- sapply(split(HWD$Houston_Load, list(HWD$Hour)),
    +                  quantile, probs = c(0.85, 0.9))
    R> quants
                0         1        2         3         4         5        6
    85% 0.3576510 0.8633506 1.581443 0.2264709 0.4164411 0.2864026 1.053742
    90% 0.6116363 0.9273008 2.109248 0.4218297 0.5554147 0.4474140 1.366114
                7         8        9       10        11        12       13       14
    85% 0.5352211 0.5175485 1.790593 1.394988 0.7280584 0.8578999 1.437778 1.087101
    90% 0.8625322 0.5969672 1.830352 1.519262 0.9399476 1.1401877 1.763725 1.102516
               15        16        17        18       19        20       21
    85% 0.6855288 0.4874499 0.5493679 0.9754414 1.095362 0.7936225 1.824002
    90% 0.8737872 0.6121487 0.6078405 1.0990935 1.233637 0.9431199 2.175961
              22        23
    85% 1.058648 0.6950166
    90% 1.145783 0.8436541
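To make the layout of this result explicit, the sketch below rebuilds the same example data and shows that quants is a matrix with one row per percentile and one column per hour, indexed by the names shown in the printout. The vapply() variant is only a suggested alternative, not something the answer uses; it should produce the same values while checking that every group returns exactly two numbers.

```r
set.seed(1) ## same example data as above
HWD <- data.frame(Hour = rep(0:23, 10), Houston_Load = rnorm(24 * 10))

by_hour <- split(HWD$Houston_Load, HWD$Hour)
quants <- sapply(by_hour, quantile, probs = c(0.85, 0.9))

dim(quants)                ## 2 rows (percentiles) by 24 columns (hours)
quants["90%", "5"]         ## 90th percentile for Hour 5
quants[, c("0", "23")]     ## both percentiles for the first and last hour

## sanity check: the 85th percentile can never exceed the 90th
stopifnot(all(quants["85%", ] <= quants["90%", ]))

## alternative (assumption: not part of the original answer)
quants2 <- vapply(by_hour, quantile, FUN.VALUE = numeric(2),
                  probs = c(0.85, 0.9))
```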

Now we can draw the marks at the x locations of the boxes:

    boxplot(Houston_Load ~ Hour, data = HWD, axes = FALSE)
    xlocs <- 1:24 ## where to draw marks
    tickl <- 0.15 ## half-length of the marks
    for (i in seq_len(ncol(quants))) {
        segments(x0 = rep(xlocs[i] - tickl, 2), y0 = quants[, i],
                 x1 = rep(xlocs[i] + tickl, 2), y1 = quants[, i],
                 col = c("red", "blue"), lwd = 2)
    }
    title(xlab = "Hour", ylab = "Houston Load")
    axis(1, at = xlocs, labels = xlocs - 1)
    axis(2)
    box()
    legend("bottomleft", legend = paste(c("0.85", "0.90"), "quantile"),
           bty = "n", lty = "solid", lwd = 2, col = c("red", "blue"))

The resulting figure should look like this:

[Figure: extended boxplot example]
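The question also asks whether the plots for every month and zone can be produced in one go. One possible approach, sketched here under stated assumptions, is to wrap the steps above in a function and loop over the subsets. The helper plot_hourly() and the simulated data frame dat are hypothetical; only the column names (Hour, MONTH, weekend, Houston_Load, North_Load) are taken from the str() output in the question.

```r
## Hypothetical helper: hourly boxplot with percentile ticks for one zone
plot_hourly <- function(dat, zone, probs = c(0.85, 0.90),
                        cols = c("red", "blue"), tickl = 0.15, ...) {
  groups <- split(dat[[zone]], dat$Hour)
  ## one row per percentile, one column per hour
  quants <- do.call(cbind, lapply(groups, quantile, probs = probs))
  boxplot(groups, range = 0, xlab = "Hour", ylab = zone, ...)
  xlocs <- seq_along(groups)
  for (j in seq_along(probs)) {
    segments(x0 = xlocs - tickl, y0 = quants[j, ],
             x1 = xlocs + tickl, y1 = quants[j, ],
             col = cols[j], lwd = 2)
  }
  invisible(quants)
}

## Simulated stand-in for the OP's data (assumption: two months, two zones)
set.seed(1)
dat <- expand.grid(Hour = 1:24, day = 1:20, MONTH = 1:2)
dat$weekend <- ifelse(dat$day %% 7 %in% c(0, 6), "YES", "NO")
dat$Houston_Load <- rnorm(nrow(dat))
dat$North_Load <- rnorm(nrow(dat))

## One panel per month/zone combination, weekdays only
op <- par(mfrow = c(2, 2))
for (m in 1:2) {
  for (zone in c("Houston_Load", "North_Load")) {
    wd <- dat[dat$MONTH == m & dat$weekend == "NO", ]
    plot_hourly(wd, zone, main = paste("Month", m, "-", zone))
  }
}
par(op)
```

For the full set of plots, the same loop could be extended over all months, both weekend levels, and all four zone columns, with the output sent to a multi-page device such as pdf() rather than a single 2 x 2 layout.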


Source: https://habr.com/ru/post/1434648/

