Unexpected behavior in a subset of the aggregate function in R

Question

I have a data frame in the following format:

manufacturer  pricegroup leads
harley        <2500      #
honda         <5000      #
...           ...        ..

I use the aggregate function to output data as follows:

aggregate( leads ~ manufacturer + pricegroup, data=leaddata, 
    FUN=sum, subset=(manufacturer==c("honda","harley")))

I noticed that this does not return the correct results: the totals for each manufacturer get smaller the more manufacturers I add to the subset. However, if I use:

aggregate( leads ~ manufacturer + pricegroup, data=leaddata, 
    FUN=sum, subset=(manufacturer=="honda" | manufacturer=="harley"))

This returns the correct numbers. For the life of me, I can't figure out why. I would just use the OR operator, except that I will be passing in the list of manufacturers dynamically. Any thoughts on why the first construction doesn't work? Better yet, any thoughts on how to make it work? Thanks!

r logic aggregate subset

josibake Apr 26 '15 at 3:28

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2015-04-26T04:19:32+0000

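Basically, == compares the two vectors element by element and recycles the shorter one, so the column is checked against "honda", then "harley", then "honda" again, and so on. Instead, you should use %in% (as suggested by MrFlick) or |, as you did in your second attempt.

== only compares each value with the value at the same position in the other (recycled) vector, not with every value in it.

Here's an example: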

set.seed(1)   ## Output below is from R < 3.6.0; newer versions draw a different sample
v1 <- sample(letters[1:5], 10, TRUE)
v2 <- c("a", "b")   ## Will be recycled to rep(c("a", "b"), 5) when comparing with v1

data.frame(v1, v2, 
           `==` = v1 == v2, 
           `%in%` = v1 %in% v2, 
           `|` = v1 == "a" | v1 == "b", 
           check.names = FALSE)
#    v1 v2    ==  %in%     |
# 1   b  a FALSE  TRUE  TRUE
# 2   b  b  TRUE  TRUE  TRUE
# 3   c  a FALSE FALSE FALSE
# 4   e  b FALSE FALSE FALSE
# 5   b  a FALSE  TRUE  TRUE
# 6   e  b FALSE FALSE FALSE
# 7   e  a FALSE FALSE FALSE
# 8   d  b FALSE FALSE FALSE
# 9   d  a FALSE FALSE FALSE
# 10  a  b FALSE  TRUE  TRUE

As you can see, == is TRUE only where the values of "v1" and "v2" match at the same position.
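
Applied to the original question, the same idea might look like the sketch below. The contents of leaddata and the vector name wanted are made up for illustration (the question does not show the real data); only the aggregate() call shape comes from the question.

```r
# Hypothetical stand-in for the asker's leaddata (real values are not shown)
leaddata <- data.frame(
  manufacturer = rep(c("honda", "harley", "yamaha"), times = 4),
  pricegroup   = rep(c("<2500", "<5000"), each = 6),
  leads        = 1:12,
  stringsAsFactors = FALSE
)

# Hypothetical vector of manufacturers, e.g. built at runtime
wanted <- c("honda", "harley")

# %in% tests each row against every element of 'wanted'
res_in <- aggregate(leads ~ manufacturer + pricegroup, data = leaddata,
                    FUN = sum, subset = manufacturer %in% wanted)

# Explicit OR, as in the question's working version
res_or <- aggregate(leads ~ manufacturer + pricegroup, data = leaddata,
                    FUN = sum,
                    subset = manufacturer == "honda" | manufacturer == "harley")

# == recycles 'wanted' along the column, so matching rows are silently dropped
res_eq <- aggregate(leads ~ manufacturer + pricegroup, data = leaddata,
                    FUN = sum, subset = manufacturer == wanted)

isTRUE(all.equal(res_in, res_or))  # %in% and | agree
sum(res_in$leads)                  # 48
sum(res_eq$leads)                  # 18, an undercount
```

With k manufacturers in the vector, == checks each row against only one of the k recycled values, which would explain why the totals shrink as more manufacturers are added.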
