Can boxplot in base R display “NA” as a level of the grouping factor?

Question


I want this:

[image: a boxplot with NA as a category name]

I thought that passing na.action=na.pass to boxplot would let NA appear among the group names. Here is some sample code:

    # Build a fake dataset
    set.seed(212012)
    nn <- 100
    sample_data <- data.frame(
      score = c(rpois(nn, 1), rpois(nn, 2), rpois(nn, 1.5), rpois(nn, 3)),
      category = c(rep(0, nn), rep(1, nn), rep(2, nn), rep(NA, nn))
    )
    boxplot(score ~ category, data = sample_data, na.action = na.pass)

But this produces the following:

[image: the resulting boxplot]

The “simple” way to get what I want is the following code, but it is clumsy for exploratory data analysis:

    sample_data$category2 <- sample_data$category
    sample_data$category2[is.na(sample_data$category)] <- 'NA'
    boxplot(score ~ category2, data = sample_data)

Any hints from the R gurus out there? I learned about na.pass from this more general discussion, and about the origin of na.pass from Prof. Ripley here. But apparently no distinction is made between missing data (NA) in the values that are split by the factor and missing data in the factor itself. Am I missing something simple, or is this more of a feature request?

+4

r missing-data boxplot

Nathan vanhoudnos Feb 01 '12 at 18:10


1 answer

Justin · Accepted Answer · 2012-02-01T18:27:04+0000

boxplot( score ~ factor(category,exclude=NULL), data=sample_data)

The default behavior of factor is exclude=NA. I assume that boxplot calls factor internally if the grouping variable is not already a factor. Passing exclude=NULL simply makes the factor keep your NA values as a level of their own.
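To see the effect of exclude on its own, here is a minimal sketch using a small made-up vector (the names category, score, f_default, f_with_na, f_added, and groups are illustrative only and are not the data from the question). It compares the levels produced by the default call with those produced by exclude=NULL, and shows addNA from base R as an alternative way to get the same levels.

```r
# Hypothetical toy data, not the dataset from the question
category <- c(0, 1, 2, NA, NA)
score    <- c(10, 20, 30, 40, 50)

# Default: exclude = NA, so NA does not become a level
f_default <- factor(category)
levels(f_default)    # "0" "1" "2"

# exclude = NULL keeps NA as an additional level
f_with_na <- factor(category, exclude = NULL)
levels(f_with_na)    # "0" "1" "2" NA

# addNA() is a base R alternative that adds NA as a level
f_added <- addNA(factor(category))

# Splitting by the factor with an NA level keeps a group for the NA rows
groups <- split(score, f_with_na)
lengths(groups)
```

With NA as a real level, the rows whose category is missing form their own group instead of being dropped, which is what lets the extra box appear in the plot.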
