Getting a subset of a data frame by searching for records with NA in specific columns

Question

Suppose we have a data frame that contains NA values:

 > data
    A  B  C  D
    1  3 NA  4
    2  1  3  4
   NA  3  3  5
    4  2 NA NA
    2 NA  4  3
    1  1  1  2

I want to know a general method for getting the subset of rows that have an NA value in column C or column A. So the output should be:

    A  B  C  D
    1  3 NA  4
   NA  3  3  5
    4  2 NA NA

I tried the subset command, subset(data, A==NA | C==NA), but that didn't work. Any ideas?

+4

r subset

Christian bueno Jul 12 '13 at 20:08

2 answers

A very handy function for this kind of task is complete.cases. It checks each row for NA values and returns FALSE if the row contains any NA, and TRUE if it contains none.

So, you need to select only the two columns of interest, apply complete.cases(.) to them, negate the result, and use it to index the rows of the original data, as follows:

 # assuming your data is in 'df'
 df[!complete.cases(df[, c("A", "C")]), ]
 #    A B  C  D
 # 1  1 3 NA  4
 # 3 NA 3  3  5
 # 4  4 2 NA NA
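Since the question asks for a general method, the same idea can be wrapped in a small helper that takes any set of column names. This is only a sketch: the name rows_with_na is hypothetical (not part of base R), and the data frame is rebuilt from the question's example.

```r
# Rebuild the example data from the question.
df <- read.table(text = "
 A  B  C  D
 1  3 NA  4
 2  1  3  4
NA  3  3  5
 4  2 NA NA
 2 NA  4  3
 1  1  1  2", header = TRUE)

# Hypothetical helper: keep the rows that have an NA in at least one of `cols`.
rows_with_na <- function(data, cols) {
  # drop = FALSE keeps a data frame even when `cols` names a single column
  data[!complete.cases(data[, cols, drop = FALSE]), , drop = FALSE]
}

res <- rows_with_na(df, c("A", "C"))
print(res)  # rows 1, 3 and 4 of the example data
```

Passing a single column, e.g. rows_with_na(df, "B"), works the same way because of drop = FALSE.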

+12

Arun Jul 12 '13 at 20:32

Janesh devkota · Accepted Answer · 2013-07-12T20:20:12+0000

Here is one possibility:

 # Read your data
 data <- read.table(text="
   A  B  C  D
   1  3 NA  4
   2  1  3  4
  NA  3  3  5
   4  2 NA NA
   2 NA  4  3
   1  1  1  2", header=TRUE, sep="")

 # Now subset your data: is.na() tests for missing values
 subset(data, is.na(C) | is.na(A))
 #    A B  C  D
 # 1  1 3 NA  4
 # 3 NA 3  3  5
 # 4  4 2 NA NA
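As for why the attempt in the question returns nothing: comparing anything with NA using == gives NA rather than TRUE, and subset() treats NA conditions as FALSE. The short sketch below illustrates this on a cut-down version of the example data (only columns A and C are kept; the variable names cmp, empty and hits are just for illustration).

```r
data <- data.frame(A = c(1, 2, NA, 4, 2, 1),
                   C = c(NA, 3, 3, NA, 4, 1))

# Comparing with NA yields NA for every element, never TRUE.
cmp <- data$A == NA
print(cmp)

# subset() drops rows whose condition is NA, so nothing is selected.
empty <- subset(data, A == NA | C == NA)
print(nrow(empty))

# is.na() returns a proper TRUE/FALSE vector, so the filter works.
hits <- subset(data, is.na(A) | is.na(C))
print(hits)
```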