R: applying a function to each row

I have a data frame that contains several rows and several columns.

I also have a character vector that contains the names of some columns of that data frame. The number of columns may vary.

For each row, I have to determine whether at least one of these columns is not NA (basically any(!is.na(df[namescolumns])), but for each row), and then make a subset of the rows for which this is TRUE.

Actually, any(!is.na(df[1,][namescolumns])) works well, but only for the first row.

I could easily write a for loop, which is my first reflex as a programmer, especially since it already works for the first row, but I am sure that this is not the R way and that there is a way to do this with an "apply" function (lapply, mapply, sapply, tapply or another). I just can't figure out which one and how.

Thanks.

2 answers

Try using apply over the first dimension (rows):

 apply(df, 1, function(x) any(!is.na(x[namescolumns])))

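When the function returns more than one value per row, apply returns the results transposed, in which case you can wrap the entire statement inside t(.). Here each row yields a single logical value, so the result is already a plain logical vector with one element per row.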
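As a minimal sketch of how this can be used to subset the rows, assuming a small made-up data frame (the data and the names keep, keep2 and result are illustrative and not from the question). It also shows rowSums as a vectorized alternative that avoids the conversion to a matrix of strings that apply performs on a mixed-type data frame:

```r
# Hypothetical sample data, for illustration only
df <- data.frame(
  a = c(1, NA, 3, NA),
  b = c("x", NA, NA, NA),
  c = c(10, 20, 30, 40)
)
namescolumns <- c("a", "b")

# One logical per row: TRUE if at least one of the selected columns is not NA
keep <- apply(df[namescolumns], 1, function(x) any(!is.na(x)))

# Vectorized alternative: count the non-NA values in each row
keep2 <- rowSums(!is.na(df[namescolumns])) > 0

# Keep only the rows where the test is TRUE (rows 1 and 3 here)
result <- df[keep, ]
```

Both keep and keep2 hold the same values, so either can be used as the row index.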


You can use a combination of Reduce and lapply

 no.na.in.cols <- Reduce(`&`, lapply(colnames, function (name) !is.na(df[name])))

to get a logical index that is TRUE for the rows that have no NA in any of the columns listed in colnames, which in turn can be used to subset the data.

 df[no.na.in.cols,]

For instance, given:

 df <- data.frame(
   a = c(1,2,3,4,NA,6,7),
   b = c(2,4,6,8,10,12,14),
   c = c("one","two","three","four","five","six","seven"),
   d = c("a",NA,"c","d","e","f","g")
 )
 colnames <- c("a","d")

You can get:

 > df[Reduce(`&`, lapply(colnames, function (name) !is.na(df[name]))),]
   a  b     c d
 1 1  2   one a
 3 3  6 three c
 4 4  8  four d
 6 6 12   six f
 7 7 14 seven g
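Note that combining the per-column tests with `&` keeps the rows where every listed column is non-NA, while the question asks for rows where at least one of them is non-NA. A sketch of that variant, which swaps `&` for `|`; the data frame and the names cols, any.not.na and all.not.na are made up for illustration:

```r
# Hypothetical sample data, for illustration only
df <- data.frame(
  a = c(1, NA, 3, NA),
  d = c("a", NA, NA, "d"),
  b = c(2, 4, 6, 8)
)
cols <- c("a", "d")

# Per-column tests: one logical vector per column, TRUE where the value is not NA
not.na <- lapply(cols, function(name) !is.na(df[[name]]))

# `|` : TRUE if at least one of the columns is not NA in that row
any.not.na <- Reduce(`|`, not.na)

# `&` : TRUE only if all of the columns are not NA in that row
all.not.na <- Reduce(`&`, not.na)

# Subset with the "at least one" test; only row 2 is dropped here
subset.any <- df[any.not.na, ]
```

Using df[[name]] instead of df[name] returns each column as a plain vector, so the combined result is an ordinary logical vector rather than a one-column matrix.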

Source: https://habr.com/ru/post/1468782/