Try the "yearmon" class in the zoo package, since it sorts chronologically. Below we create a sample data frame, DF, and then add a YearMonth column of class "yearmon". Finally, we perform the aggregation. The actual processing is just the last two lines (the rest only creates the sample data frame).
    Lines <- "Instrument AccountValue monthYear ExitTime
    JPM 6997 april-07 2007-04-10
    JPM 7261 mei-07 2007-05-29
    JPM 7545 juli-07 2007-07-18
    JPM 7614 juli-07 2007-07-19
    JPM 7897 augustus-07 2007-08-22
    JPM 7423 november-07 2007-11-02
    KFT 6992 mei-07 2007-05-14
    KFT 6944 mei-07 2007-05-21
    KFT 7069 juli-07 2007-07-09
    KFT 6919 juli-07 2007-07-16"

    library(zoo)
    DF <- read.table(textConnection(Lines), header = TRUE)
    DF$YearMonth <- as.yearmon(DF$ExitTime)
    aggregate(AccountValue ~ YearMonth + Instrument, DF, sum)
This gives the following:
    > aggregate(AccountValue ~ YearMonth + Instrument, DF, sum)
      YearMonth Instrument AccountValue
    1  Apr 2007        JPM         6997
    2  May 2007        JPM         7261
    3  Jul 2007        JPM        15159
    4  Aug 2007        JPM         7897
    5  Nov 2007        JPM         7423
    6  May 2007        KFT        13936
    7  Jul 2007        KFT        13988
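To see why "yearmon" sorts chronologically where month-name strings would not, here is a minimal sketch. The three dates are made up for illustration. A "yearmon" value is stored as a number, year + (month - 1)/12, so ordinary numeric ordering puts the months in calendar order.

```r
library(zoo)

# Illustrative dates only, deliberately out of order
dates <- as.Date(c("2007-11-02", "2007-04-10", "2007-07-18"))

ym <- as.yearmon(dates)

# Underlying numeric representation: year + (month - 1)/12
as.numeric(ym)

# Sorting therefore follows calendar order (April, July, November)
sorted <- sort(ym)
sorted

# By contrast, sorting formatted month labels is alphabetical,
# and the result depends on the locale's month names
sort(format(dates, "%B-%y"))
```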
A slightly different approach, with slightly different output, is to use read.zoo directly. It produces one column per instrument and one row per year/month. We read in the columns, assigning them the appropriate classes and using "NULL" for the monthYear column, since we will not use it. We also specify that the time index is the third of the remaining columns and that the input is to be split into columns by the first column. FUN = as.yearmon converts the time index from the "Date" class to the "yearmon" class, and aggregate = sum sums all values that fall in the same month.
    z <- read.zoo(textConnection(Lines), header = TRUE, index = 3, split = 1,
      colClasses = c("character", "numeric", "NULL", "Date"),
      FUN = as.yearmon, aggregate = sum)
The resulting zoo object is as follows:
    > z
               JPM   KFT
    Apr 2007  6997    NA
    May 2007  7261 13936
    Jul 2007 15159 13988
    Aug 2007  7897    NA
    Nov 2007  7423    NA
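For experimenting with the read.zoo arguments on their own, the following is a small sketch on a made-up subset of four rows. It assumes a zoo version whose read.zoo accepts a text argument (used here in place of textConnection), and it spells out index.column, which the shorter index = 3 above matches by partial argument matching. The names Lines2 and z2 are just for this example.

```r
library(zoo)

# Four illustrative rows: two JPM rows fall in the same month
Lines2 <- "Instrument AccountValue monthYear ExitTime
JPM 6997 april-07 2007-04-10
JPM 7545 juli-07 2007-07-18
JPM 7614 juli-07 2007-07-19
KFT 7069 juli-07 2007-07-09"

z2 <- read.zoo(text = Lines2, header = TRUE,
  # "NULL" drops monthYear, leaving Instrument, AccountValue, ExitTime
  colClasses = c("character", "numeric", "NULL", "Date"),
  index.column = 3,   # ExitTime is the 3rd remaining column
  split = 1,          # one output column per Instrument
  FUN = as.yearmon,   # convert the Date index to year/month
  aggregate = sum)    # sum values sharing the same year/month

z2
```

The two July rows for JPM collapse into one summed value, and KFT has NA for April because it has no row in that month.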
We may prefer to keep it as a zoo object in order to use other functionality in zoo, or we can convert it to a data frame with data.frame(Time = time(z), coredata(z)), which makes the time a separate column, or with as.data.frame(z), which uses the time as row names. fortify.zoo(z) also works.
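The three conversions can be compared side by side. This is only a sketch: it builds a small two-row zoo object by hand (values copied from the output above) so that it runs on its own, and the variable names are arbitrary.

```r
library(zoo)

# Small stand-in for z: two months, two instruments
z <- zoo(cbind(JPM = c(6997, 7261), KFT = c(NA, 13936)),
         as.yearmon(as.Date(c("2007-04-01", "2007-05-01"))))

# Time becomes an ordinary column next to the data columns
df_time <- data.frame(Time = time(z), coredata(z))

# Time becomes the row names; only the data columns remain
df_rownames <- as.data.frame(z)

# Time becomes the first column of the result
df_fortify <- fortify.zoo(z)

str(df_time)
str(df_rownames)
str(df_fortify)
```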