Error in aggregate function

I have a dataframe:

    head(df)
         Year Find Found
    6982 1901  267   246
    6983 1901  271   251
    6984 1902  317   236
    6985 1903  339   244
    6986 1904  339   260
    6987 1903  345    15
    5255 1902   47    45
    5256 1901   46    NA
    5257 1906   45   150
    5258 1905   42    24
    5259 1910   42    78
    5260 1910   41    NA

When I try to aggregate it by year:

 aggdata <-aggregate(df, by=list(Year), FUN=sum, na.rm=TRUE) 

I get this error:

 Error in aggregate.data.frame(AndelKvinnorUttax, by = list(Year), FUN = sum, : object 'Year' not found 

I cannot find the problem.

My workaround:

 aggr=cbind(aggregate(data=df,Find~Year, FUN=sum,na.rm=TRUE),aggregate(data=df,Found~Year, FUN=sum,na.rm=TRUE))[,c(1,2,4)] 

Does anyone know a better way?

Regards!

2 answers

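With the data.frame method, aggregate does not look up Year inside the data frame you pass in. The by argument is evaluated in the calling environment, where no object named Year exists. You must say explicitly where to find Year, for example: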

    aggdata <- aggregate(df, by=list(df$Year), FUN=sum, na.rm=TRUE)
    #   Group.1 Year Find Found
    # 1    1901 5703  584   497
    # 2    1902 3804  364   281
    # 3    1903 3806  684   259
    # 4    1904 1904  339   260
    # 5    1905 1905   42    24
    # 6    1906 1906   45   150
    # 7    1910 3820   83    78
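Note that the call above also sums the Year column itself, which is why Year shows values such as 5703. A minimal sketch of one way to avoid that, assuming the sample data from the question (the names `agg_by` and the rebuilt `df` are only for illustration): aggregate just the value columns and give the `by` list a name.

```r
# Sample data copied from the question (original row names omitted).
df <- data.frame(
  Year  = c(1901, 1901, 1902, 1903, 1904, 1903, 1902, 1901, 1906, 1905, 1910, 1910),
  Find  = c(267, 271, 317, 339, 339, 345, 47, 46, 45, 42, 42, 41),
  Found = c(246, 251, 236, 244, 260, 15, 45, NA, 150, 24, 78, NA)
)

# Aggregate only the value columns. Naming the element of the 'by' list
# labels the grouping column Year instead of Group.1.
agg_by <- aggregate(df[c("Find", "Found")], by = list(Year = df$Year),
                    FUN = sum, na.rm = TRUE)
print(agg_by)
```

The result should have one row per year and the columns Year, Find and Found, with no summed copy of Year.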

Alternatively, since you already use the formula method in your "solution", why not use it for the real solution?

Use . to indicate "all other variables".

Also, the formula method handles NA values differently: by default (na.action = na.omit) it drops every row that contains an NA before the function is applied. To keep those rows, pass na.rm = TRUE for sum and na.action = na.pass for aggregate.

    aggregate(. ~ Year, df, sum, na.rm = TRUE, na.action="na.pass")
    #   Year Find Found
    # 1 1901  584   497
    # 2 1902  364   281
    # 3 1903  684   259
    # 4 1904  339   260
    # 5 1905   42    24
    # 6 1906   45   150
    # 7 1910   83    78
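To see why na.action matters here, this small sketch compares the default behaviour with na.pass on the sample data from the question (the names `agg_omit` and `agg_pass` are made up for this illustration):

```r
# Sample data copied from the question (original row names omitted).
df <- data.frame(
  Year  = c(1901, 1901, 1902, 1903, 1904, 1903, 1902, 1901, 1906, 1905, 1910, 1910),
  Find  = c(267, 271, 317, 339, 339, 345, 47, 46, 45, 42, 42, 41),
  Found = c(246, 251, 236, 244, 260, 15, 45, NA, 150, 24, 78, NA)
)

# Default na.action (na.omit): rows with an NA in any column are removed
# before summing, so their Find values are lost as well.
agg_omit <- aggregate(. ~ Year, df, sum, na.rm = TRUE)

# na.pass keeps those rows; na.rm = TRUE then skips only the NA cells.
agg_pass <- aggregate(. ~ Year, df, sum, na.rm = TRUE, na.action = na.pass)

# Compare the Find totals for the two years that contain an NA in Found.
print(agg_omit[agg_omit$Year %in% c(1901, 1910), ])
print(agg_pass[agg_pass$Year %in% c(1901, 1910), ])
```

With the default, the Find total for 1901 is 538 instead of 584, and for 1910 it is 42 instead of 83, because the rows with a missing Found value are discarded entirely. The Found totals are the same in both results.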

For variety (and somewhat simpler syntax), there is of course also data.table:

    library(data.table)
    DT <- data.table(df)
    DT[, lapply(.SD, sum, na.rm=TRUE), by = Year]
    #    Year Find Found
    # 1: 1901  584   497
    # 2: 1902  364   281
    # 3: 1903  684   259
    # 4: 1904  339   260
    # 5: 1906   45   150
    # 6: 1905   42    24
    # 7: 1910   83    78

Source: https://habr.com/ru/post/1500298/

