R - violin plot with columns of varying length

I am looking for a way to build violin plots with many violins (one per column). The problem is that my columns vary in length. For example, the data looks something like this:

    "V1" "V2"
    "V1" 9 255.5
    "V2" 432 286
    "V3" 161 322.5
    "V4" 320.5 277
    "V5" 253.5 153.5
    "V6" 301 155.5
    "V7" 113 218.5
    "V8" 341 394
    "V9" 138 93.5
    ........
    "V38166" 62 152
    "V38167" NA 20.5
    "V38168" NA 12
    "V38169" NA 40.5
    "V38170" NA 88
    "V38171" NA 2.5
    "V38172" NA 279.5
    "V38173" NA 161.5
    "V38174" NA 14.5

As you can see, the first column contains several NA values because it has fewer records. Keep in mind that there may be more columns. The question is: can I draw a violin plot when any of the columns contains NA?

I tried this:

    jpeg("violinplot.jpg", width = 1000, height = 1000)
    do.call(vioplot, c(statsDataFrame, list(names = nameList)))
    dev.off()

statsDataFrame is the full data frame that I posted above. However, when I run the script, I get the following error:

    Error in quantile.default(data, 0.25) :
      missing values and NaN not allowed if 'na.rm' is FALSE
    Calls: do.call -> <Anonymous> -> quantile -> quantile.default
    Execution halted

which essentially complains about the NA values. I tried both na.rm = FALSE and na.rm = TRUE, for example:

    jpeg("stats/AllDistanceViolinPlot.jpg", width = 1000, height = 1000)
    do.call(vioplot, c(columnViolinDistanceDataUnlist, na.rm = FALSE, list(names = tfListRow)))
    dev.off()

and

    jpeg("stats/AllDistanceViolinPlot.jpg", width = 1000, height = 1000)
    do.call(vioplot, c(columnViolinDistanceDataUnlist, na.rm = TRUE, list(names = tfListRow)))
    dev.off()

but to no avail.

Does anyone have any suggestions on how to do this or if this can be done?

Thank you for your help.

1 answer

You need to remove the NA values. That rules out a data.frame as the container, because the cleaned columns would have unequal lengths, but you also want to use do.call, which accepts a list. So I would use lapply to drop the NA values from each data.frame column: each column is returned as a list element, and you can still use do.call (assuming your data is called df):

 do.call( vioplot, lapply(df, function(x) x[!is.na(x)]) ) 

Or, as @BrianDiggs points out, you can use the even more concise and elegant:

 do.call(vioplot, lapply(df, na.omit)) 
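
To make the idea concrete, here is a minimal sketch on a toy data frame. The values, the `plot_args` name, and the temporary output path are made up for illustration. It also shows one way to pass `names` along with the cleaned columns. The `unname()` call is a precaution, not part of the answer above: it makes sure the vectors are passed positionally, since the first parameter of `vioplot` is called `x`.

```r
# Toy data frame standing in for the real data (values are illustrative).
df <- data.frame(
  V1 = c(9, 432, 161, 320.5, NA, NA),
  V2 = c(255.5, 286, 322.5, 277, 20.5, 12)
)

# Drop the NA values column by column. The result is a plain list,
# so the elements are allowed to have different lengths.
cleaned <- lapply(df, function(x) x[!is.na(x)])
lengths(cleaned)  # V1 keeps 4 values, V2 keeps 6

# Build the argument list for do.call: the data vectors (unnamed, so they
# are matched positionally) followed by any named options such as names.
plot_args <- c(unname(cleaned), list(names = names(df)))

# Draw the plot only if vioplot and a JPEG device are available.
if (requireNamespace("vioplot", quietly = TRUE) && capabilities("jpeg")) {
  out_file <- file.path(tempdir(), "violinplot.jpg")  # hypothetical path
  jpeg(out_file, width = 1000, height = 1000)
  do.call(vioplot::vioplot, plot_args)
  dev.off()
}
```

Cleaning the data before the call avoids relying on an `na.rm` argument, which `vioplot` may not accept depending on the installed version.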

Source: https://habr.com/ru/post/1492180/

