How to get na.omit with data.table to omit NA in each column

Question


Say I have

 az<-data.table(a=1:6,b=6:1,c=4)
 az[b==4,c:=NA]
 az
    a b  c
 1: 1 6  4
 2: 2 5  4
 3: 3 4 NA
 4: 4 3  4
 5: 5 2  4
 6: 6 1  4

I can get the sum of all columns with

 az[,lapply(.SD,sum)]
     a  b  c
 1: 21 21 NA

This is what I want for a and b, but c is NA. This is apparently easy enough to fix by doing

 az[,lapply(na.omit(.SD),sum)]
     a  b  c
 1: 18 17 20

This is what I want for c, but I did not want to omit the values of a and b in the row where c is NA. This is a contrived example; in my real data there may be 1000+ columns with NAs scattered at random. Is there a way to get na.omit, or something else, to act on each column rather than on the whole table, without relying on a loop through each column as a vector?


r data.table

Dean MacGregor May 29 '13 at 18:46


1 answer

Blue magister · Accepted Answer · 2013-05-29T18:59:08+0000

Expanding on my comment:

Many base functions let you decide how to treat NA. For example, sum has the argument na.rm:

 az[,lapply(.SD,sum,na.rm=TRUE)]
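As a quick check, here is a minimal sketch that rebuilds the table from the question and applies the na.rm approach. The variable name res is only illustrative and is not part of the original post.

```r
library(data.table)

# Rebuild the example table from the question
az <- data.table(a = 1:6, b = 6:1, c = 4)
az[b == 4, c := NA]

# Extra arguments to lapply() are passed on to sum(), so each column
# is summed with its own NA values dropped and no rows are removed
res <- az[, lapply(.SD, sum, na.rm = TRUE)]

# Column totals: a = 21, b = 21, c = 20
print(res)
```

Columns a and b keep all six values, and only the single NA in c is skipped.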

More generally, you can also apply the na.omit function to each column vector separately:

 az[,lapply(.SD,function(x) sum(na.omit(x)))]
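To make the difference concrete, the sketch below contrasts calling na.omit on .SD as a whole with calling it on each column inside the function. It reuses the table from the question; the names row_wise and col_wise are illustrative only.

```r
library(data.table)

# Rebuild the example table from the question
az <- data.table(a = 1:6, b = 6:1, c = 4)
az[b == 4, c := NA]

# na.omit(.SD) works on the whole table: any row containing an NA is
# dropped first, so a and b also lose their value from row 3
row_wise <- az[, lapply(na.omit(.SD), sum)]

# na.omit(x) inside the function sees one column at a time, so only
# the NA entries of that particular column are removed
col_wise <- az[, lapply(.SD, function(x) sum(na.omit(x)))]

# row_wise totals: a = 18, b = 17, c = 20
# col_wise totals: a = 21, b = 21, c = 20
print(row_wise)
print(col_wise)
```

The per-column form is mainly useful when the function being applied has no na.rm argument of its own; for sum, na.rm=TRUE gives the same totals.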

