Group and summarize while keeping other columns in R

I have a data frame that I group with the group_by function and then summarize with the summarize function (dplyr) in R.

MM_group<-group_by(SYC,Method,Maturity)

My dataset looks like this:

 Year           Group  County Seed.Brand Seed.Variety Seed.Maturity
1 2014 Group 0 No-till Yankton     Asgrow       AG0832           0.8
2 2014 Group 0 No-till   Brown     Asgrow       AG0934           0.9
3 2014 Group 0 No-till   Brown     Asgrow       AG0934           0.9
4 2014 Group 0 No-till   Brown     Asgrow       AG0934           0.9
5 2014 Group 0 No-till   Brown    Pioneer        90Y90           0.9
6 2014 Group 0 No-till   Brown     Asgrow       AG0934           0.9

  Yield  Method Maturity digits
1 73.23 No-till        0      0
2 65.14 No-till        0      0
3 63.63 No-till        0      0
4 61.57 No-till        0      0
5 60.20 No-till        0      0

I group by Method and Maturity. I am trying to get the county and year of the maximum yield for each combination of Method and Maturity.

I have done the following:

summarize(MM_group,Max_Yield=max(Yield))

       Method Maturity Max_Yield
           <chr>    <chr>     <dbl>
1      Irrigated        0    69.600
2      Irrigated        1    86.013
3      Irrigated        2    88.750
4      Irrigated        3    79.650
5        No-till        0    79.470
6        No-till        1    79.856
7        No-till        2    85.860
8        No-till        3    68.530
9  Non-irrigated        0    83.210
10 Non-irrigated        1    81.916
11 Non-irrigated        2   103.740
12 Non-irrigated        3    94.410

But that does not give me the county name and the year. I know that I can use cbind or a join to get this data, but I wonder if there is a simpler way to do this.

Expected Result:

          Method Maturity Max_Yield  Year                  Group
           <chr>    <chr>     <dbl> <int>                 <fctr>
1      Irrigated        0    69.600  2012 Group 0 or 1 Irrigated
2      Irrigated        1    86.013  2012 Group 0 or 1 Irrigated
3      Irrigated        2    88.750  2013 Group 2 or 3 Irrigated
4      Irrigated        3    79.650  2013 Group 2 or 3 Irrigated
5        No-till        0    79.470  2013        Group 0 No-till
6        No-till        1    79.856  2012        Group 1 No-till
7        No-till        2    85.860  2013        Group 2 No-till
8        No-till        3    68.530  2014        Group 3 No-till
9  Non-irrigated        0    83.210  2013  Group 0 Non-irrigated
10 Non-irrigated        1    81.916  2012  Group 1 Non-irrigated
11 Non-irrigated        2   103.740  2014  Group 2 Non-irrigated
12 Non-irrigated        3    94.410  2014  Group 3 Non-irrigated 
3 answers

Try

summarize(MM_group, 
          rank = which.max(Yield),
          Year_rank = Year[rank],
          County_rank = County[rank])
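
A minimal, self-contained sketch of the same which.max idea, in case it helps to try it on a small example. The data frame SYC_toy and its values are made up for illustration (only the column names follow the question), the output column names are my own, and the .groups argument assumes dplyr 1.0.0 or later. Note that which.max returns the first position of the maximum, so a tie within a group keeps the first tied row.

```r
library(dplyr)

# Hypothetical toy data; values are invented, column names follow the question.
SYC_toy <- data.frame(
  Year     = c(2012, 2013, 2014, 2012, 2013),
  County   = c("Brown", "Yankton", "Brown", "Clay", "Brown"),
  Method   = c("No-till", "No-till", "No-till", "Irrigated", "Irrigated"),
  Maturity = c("0", "0", "1", "0", "0"),
  Yield    = c(65.1, 73.2, 61.6, 69.6, 60.2),
  stringsAsFactors = FALSE
)

best_rows <- SYC_toy %>%
  group_by(Method, Maturity) %>%
  summarize(
    Max_Yield     = max(Yield),
    # Index the other columns at the position of the group's largest Yield.
    Year_at_max   = Year[which.max(Yield)],
    County_at_max = County[which.max(Yield)],
    .groups       = "drop"
  )

print(best_rows)
```

This keeps everything inside one summarize call, so no cbind or join is needed afterwards.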

We can use

SYC %>%
   group_by(Method, Maturity) %>%
   slice(which.max(Yield)) %>% 
   rename(Max_Yield = Yield) %>%
   select(Method, Maturity, Max_Yield, Year, Group)

Use arrange and slice:

library(dplyr)
SYC %>% 
  arrange(Method, Maturity, desc(Yield)) %>% 
  group_by(Method, Maturity) %>%
  slice(1) %>%
  ungroup %>%
  select(Method, Maturity, Yield, Year, Group) %>%
  rename(Max_Yield = Yield)
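
As a possible alternative to sorting and then taking the first row, newer dplyr versions (1.0.0 and later, as far as I know) provide slice_max, which picks the top row per group directly. The sketch below uses an invented data frame named toy, with one deliberate tie in Yield; with_ties = FALSE asks for exactly one row per group even when the maximum is shared.

```r
library(dplyr)

# Hypothetical toy data with a tie in the first group; values are invented.
toy <- data.frame(
  Year     = c(2012, 2013, 2014),
  Group    = c("Group 0 No-till", "Group 0 No-till", "Group 1 No-till"),
  Method   = "No-till",
  Maturity = c("0", "0", "1"),
  Yield    = c(70, 70, 61.6),
  stringsAsFactors = FALSE
)

top_rows <- toy %>%
  group_by(Method, Maturity) %>%
  # Keep one row per group: the one with the largest Yield.
  slice_max(Yield, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  # select() can rename while selecting.
  select(Method, Maturity, Max_Yield = Yield, Year, Group)

print(top_rows)
```

With the default with_ties = TRUE, both tied rows of the first group would be returned, which may or may not be what you want.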

Source: https://habr.com/ru/post/1681052/

