R tapply: different R releases produce different outputs

Problem

Here is a simple tapply example:

z=data.frame(s=as.character(NA), rows=c(1,2,1), cols=c(1,1,2), stringsAsFactors=FALSE)
tapply(z$s, list(z$rows, z$cols), identity) 

In R v3.3.3 (Another Canoe, 2017-03-06) for Windows, this returns:

#   1  2 
# 1 NA NA
# 2 NA NA

In R v3.4.0 (You Stupid Darkness, 2017-04-21) for Windows, this returns:

#   1  2 
# 1 NA NA
# 2 NA ""

R NEWS reference

According to NEWS.R-3.4.0:

tapply() gets a new option default = NA that allows changing the previously hardcoded value.

In this case, however, the default seems to be an empty string instead.
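
As a point of reference, here is a minimal sketch of how the new argument is meant to be used, assuming R 3.4.0 or later. The toy vector and the choice of sum are illustrative and not taken from the example above. Cells with no data should be filled with the supplied default instead of NA:

```r
# Requires R >= 3.4.0, where tapply() gained the 'default' argument.
x    <- c(10, 20, 30)   # illustrative values, not from the question
rows <- c(1, 2, 1)
cols <- c(1, 1, 2)

# Cell (2,2) has no observations, so it is filled with 'default'.
filled <- tapply(x, list(rows, cols), sum, default = 0)
filled
```

With a non-NA default the fill value is explicit, so this does not by itself explain the empty string seen with default = NA.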

Inconsistencies between data types

The new behavior is inconsistent with the numeric and logical cases, where all-NA results are still returned:

z=data.frame(s=as.numeric(NA), rows=c(1,2,1), cols=c(1,1,2), stringsAsFactors=FALSE)
tapply(z$s, list(z$rows, z$cols), identity)

#    1  2
# 1 NA NA
# 2 NA NA

The same holds for s=NA, which is equivalent to s=as.logical(NA).
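
A quick check of the NA types involved, as a small sketch (the variable names are illustrative). The three variants differ only in the storage type of the missing value they carry:

```r
# A bare NA is logical; the typed variants carry their own storage type.
na_types <- c(
  bare      = typeof(NA),
  numeric   = typeof(as.numeric(NA)),
  character = typeof(as.character(NA))
)
na_types

# s=NA and s=as.logical(NA) are the same value.
same_as_logical <- identical(NA, as.logical(NA))
same_as_logical
```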

An even worse case

Now let s in z contain actual strings along with an NA.

z=data.frame(s=c('a', NA, 'c'), rows=c(1,2,1), cols=c(1,1,2), stringsAsFactors=FALSE)
m=tapply(z$s, list(z$rows, z$cols), identity)
z;m

#      s rows cols
# 1    a    1    1
# 2 <NA>    2    1
# 3    c    1    2

#   1   2  
# 1 "a" "c"
# 2 NA  "" 

A workaround:

m[!nzchar(m)]=NA; m
#   1   2  
# 1 "a" "c"
# 2 NA  NA 

This repairs the case above: cell (2,2), which has no data, is NA again. But what if tapply legitimately returns empty strings?

z=data.frame(s=c('a', NA, ''), rows=c(1,2,1), cols=c(1,1,2), stringsAsFactors=FALSE)
m=tapply(z$s, list(z$rows, z$cols), identity)
z;m

#      s rows cols
# 1    a    1    1
# 2 <NA>    2    1
# 3         1    2

#   1   2 
# 1 "a" ""
# 2 NA  ""

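Now cells (1,2) and (2,2) look identical, so the workaround above would set both to NA. That is wrong for (1,2), which holds a genuine empty string.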

What is the right approach here? How can the cell at rows=2, cols=2, which has no data, be reported as missing (NA)?
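
One possible way to keep genuine empty strings while still marking cells that have no data, sketched under the assumption that counting observations per cell with table() is acceptable here. The name n_obs is illustrative:

```r
z <- data.frame(s = c("a", NA, ""), rows = c(1, 2, 1), cols = c(1, 1, 2),
                stringsAsFactors = FALSE)
m <- tapply(z$s, list(z$rows, z$cols), identity)

# Count observations per cell. table() lays the cells out in the same
# order as tapply() here, since both use the sorted factor levels.
n_obs <- as.vector(table(z$rows, z$cols))

# Only cells with no observations at all are reset to NA;
# a cell holding a real "" is left untouched.
m[n_obs == 0L] <- NA
m
```

This relies on the grouping columns containing no NA themselves. It does not depend on which value tapply uses to fill empty cells, so it should give the same result in both releases.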

Finally, is this something that should be reported to the R developers?


Source: https://habr.com/ru/post/1676738/

