Violin plot: how is the range of adjacent values determined, and why does it differ from boxplot?

In theory, the violin plot from the vioplot package is a box plot plus a density function.

In the "boxplot part" of the violin plot,

  • the black box corresponds to the IQR (it does, see below) and

  • the thin middle line should cover the same range as the boxplot whiskers (the adjacent values, 1.5 IQR by default), but it does not (see below). Can anyone explain why they differ?

    library(vioplot)
    a <- rnorm(100)
    range(a)
    a <- c(a, 2, 8, 2.9, 3, 4, -3, -5)  # add some outliers

    par(mfrow = c(1, 2))
    boxplot(a, range = 1.5)
    vioplot(a, range = 1.5)

[Figure: boxplot (left) and violin plot (right) generated by the code above]

Hintze, J. L. and R. D. Nelson (1998). Violin plots: a box plot-density trace synergism. The American Statistician, 52(2): 181-4.

1 answer

The two functions do not draw the same thing. Consider a simpler example:

    library(vioplot)
    b <- c(1:10, 20)

    par(mfrow = c(1, 2))
    boxplot(b, range = 1.5)
    vioplot(b, range = 1.5)


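In R's boxplot (as described in the ggplot help), the upper whisker extends from the hinge to the highest value that lies within 1.5 * IQR of the hinge, where IQR is the interquartile range. The whisker therefore always ends at an observed data point.

In vioplot, the upper end is computed as upper[i] <- min(q3[i] + range*iqd, data.max). That is the fence itself, capped at the data maximum, and it does not have to coincide with any observation.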
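The vioplot rule can be sketched as a small function. This is only an illustration: vioplot_whiskers is a hypothetical helper, not part of the package, and the lower end is assumed to mirror the upper-end line quoted above.

```r
# Hypothetical helper reproducing the vioplot rule quoted above.
vioplot_whiskers <- function(x, range = 1.5) {
  q1  <- unname(quantile(x, 0.25))
  q3  <- unname(quantile(x, 0.75))
  iqd <- q3 - q1
  c(
    # Assumption: the lower end mirrors the upper-end formula.
    lower = max(q1 - range * iqd, min(x)),
    # Fence capped at the data maximum, as in the quoted line.
    upper = min(q3 + range * iqd, max(x))
  )
}

vioplot_whiskers(c(1:10, 20))
```

Neither end has to be a value that occurs in the data, which is the difference from the boxplot whiskers.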

So, for b from the example above:

    # vioplot draws
    quantile(b, 0.75) + 1.5 * IQR(b)
    # 16

    # boxplot draws
    max(b[b <= quantile(b, 0.75) + 1.5 * IQR(b)])
    # 10
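As a cross-check, the boxplot side can also be read from boxplot.stats() in base R, which is what boxplot() uses to compute its statistics. A minimal sketch, assuming the same vector b as above:

```r
b <- c(1:10, 20)

# coef plays the role of the range argument of boxplot()
bs <- boxplot.stats(b, coef = 1.5)

bs$stats[5]  # upper whisker end: the largest observation inside the fence
bs$out       # observations beyond the whiskers, drawn as outlier points
```

For this vector the upper whisker ends at 10 and the value 20 is reported as an outlier, whereas the vioplot formula gives 16.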

Source: https://habr.com/ru/post/1609959/

