Aggregate function with a list variable

I am trying to write an R script that summarizes measures in a data frame, and I would like it to respond dynamically to changes in the structure of the data frame. For example, I have the following block.

library(plyr) #loading plyr just to access baseball data frame
MyData <- baseball[,cbind("id","h")]
AggHits <- aggregate(x=MyData$h, by=list(MyData[,"id"]), FUN=sum)

This block creates a data frame (AggHits) with the total number of hits (h) for each player (id). Yay.

Suppose I want to bring in team as well. How do I change the arguments so that AggHits holds the total number of hits for each combination of "id" and "team"? I tried the following, and the second line throws an error: arguments must have same length

MyData <- baseball[,cbind("id","team","h")]
AggHits <- aggregate(x=MyData$h, by=list(MyData[,cbind("id","team")]), FUN=sum)

More generally, I would like to write the second line so that it automatically aggregates h over every variable except h. I can generate the vector of grouping variables quite easily with setdiff.

# set the list of variables to summarize by as everything except hits
SumOver <- setdiff(colnames(MyData),"h")

# total up all the hits - again this line throws an error
AggHits <- aggregate(x=MyData$h, by=list(MyData[,cbind(SumOver)]), FUN=sum)

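I know I could read the csv and refer to each column explicitly with the dollar sign ($) notation, but I want the script to adapt to whatever columns the csv contains.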

I know ddply can do this, but I would like a solution that does not rely on ddply.

Thanks!

ANSWER

MyData <- baseball[,c("id","team","h")]
SumOver <- setdiff(colnames(MyData),"h")
AggHits <- aggregate(x=MyData$h, by=MyData[SumOver], FUN=sum)
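A minimal sketch of why the original call fails, using a small made-up data frame (`toy` and its values are invented for illustration, not taken from baseball). A data frame is already a list of columns, so wrapping it in list() produces a one-element list whose single element has length equal to the number of columns, not the number of rows:

```r
# Toy stand-in for the baseball data; the values are invented.
toy <- data.frame(
  id   = c("a", "a", "b", "b"),
  team = c("X", "X", "X", "Y"),
  h    = c(1L, 2L, 3L, 4L),
  stringsAsFactors = FALSE
)
SumOver <- setdiff(colnames(toy), "h")

# list() around a data frame gives ONE grouping element (the whole data frame).
wrapped <- list(toy[, SumOver])
n_groupers   <- length(wrapped)       # number of elements in the list
grouper_size <- length(wrapped[[1]])  # number of columns, not number of rows

# The failing call, with the error captured as text instead of stopping the script.
err <- tryCatch(
  aggregate(x = toy$h, by = wrapped, FUN = sum),
  error = function(e) conditionMessage(e)
)

# Single-bracket indexing keeps a data frame, i.e. a list with one
# element per grouping column, each as long as toy$h.
AggHits <- aggregate(x = toy$h, by = toy[SumOver], FUN = sum)
print(AggHits)
```

Because x is passed as a bare vector, the summed column in the result is named x rather than h.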

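If you want to group on all the non-integer columns (ID, Team, League), select them by class and pass them as the grouping data frame (by=MyData[cols.to.group.on]):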

MyData <- plyr::baseball
cols <- names(MyData)[sapply(MyData, class) != "integer"]
aggregate(MyData$h, by=MyData[cols], sum)
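Selecting on class != "integer" works when every measure is stored as an integer, but it would also pick up any double-valued column. A possible variation, sketched on an invented data frame (`toy`, `avg` and `is_group` are illustrative names, and the rule "grouping columns are character or factor" is an assumption about your data):

```r
# Toy data with an integer measure (h) and a double measure (avg); values invented.
toy <- data.frame(
  id   = c("a", "a", "b", "b"),
  team = c("X", "X", "X", "Y"),
  year = c(2000L, 2001L, 2000L, 2001L),
  h    = c(1L, 2L, 3L, 4L),
  avg  = c(0.25, 0.30, 0.28, 0.31),
  stringsAsFactors = FALSE
)

# The class test from above also selects the double column avg.
by_class <- names(toy)[sapply(toy, class) != "integer"]

# Assumed rule: only character or factor columns are grouping variables.
is_group <- vapply(toy, function(col) is.character(col) || is.factor(col), logical(1))
cols <- names(toy)[is_group]

AggHits <- aggregate(toy$h, by = toy[cols], FUN = sum)
print(AggHits)
```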

Alternatively, use the formula interface of aggregate in R:

data(baseball, package = "plyr")

MyData  <- baseball[,c("id","h", "team")]
AggHits <- aggregate(h ~ ., data = MyData, sum)
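If the data frame carries more columns than you want to group on, `h ~ .` would group on all of them. One way to keep the formula interface while still driving it from a character vector is stats::reformulate. A minimal sketch on an invented data frame (`toy` is illustrative; SumOver is built the same way as in the question):

```r
# Toy stand-in for the baseball data; the values are invented.
toy <- data.frame(
  id   = c("a", "a", "b", "b"),
  team = c("X", "X", "X", "Y"),
  h    = c(1L, 2L, 3L, 4L),
  stringsAsFactors = FALSE
)
SumOver <- setdiff(colnames(toy), "h")

# Build the formula h ~ id + team from the character vector.
f <- reformulate(SumOver, response = "h")

AggHits <- aggregate(f, data = toy, FUN = sum)
print(AggHits)
```

With the formula interface the summed column keeps its name (h). Note that its default na.action is na.omit, so rows with missing values in any variable of the formula are dropped before aggregating.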

Source: https://habr.com/ru/post/1532890/

