val df = Seq(
  (1, "a", "10"), (1, "b", "12"), (1, "c", "13"), (2, "a", "14"),
  (2, "c", "11"), (1, "b", "12"), (2, "c", "12"), (3, "r", "11")
).toDF("col1", "col2", "col3")
So, I have a Spark DataFrame with three columns.
I need to perform two levels of groupBy on it, as described below.
Level 1: If I group by col1 and sum col3, I get two columns: col1 and sum(col3). I lose col2 here.
Level 2: If I group by col1 and col2 and sum col3, I get three columns: col1, col2 and sum(col3).
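To make the two levels concrete, here is roughly what I mean, reusing df from above in spark-shell. The names col3AsInt, level1 and level2, and the aliases sum_level1 and sum_level2, are made up for this illustration, and the cast assumes col3 always holds integer strings.

```scala
import org.apache.spark.sql.functions.{col, sum}

// Assumption: col3 contains numeric strings, so cast it explicitly before summing.
val col3AsInt = col("col3").cast("int")

// Level 1: one row per col1; col2 is not available in the result.
val level1 = df.groupBy("col1").agg(sum(col3AsInt).as("sum_level1"))

// Level 2: one row per (col1, col2) pair.
val level2 = df.groupBy("col1", "col2").agg(sum(col3AsInt).as("sum_level2"))

level1.show()
level2.show()
```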
What I actually need is to perform both levels of groupBy and have both sums (the level 1 sum(col3) and the level 2 sum(col3)) in one final DataFrame.
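A minimal sketch of one possible way to get there: compute both aggregates and join them on col1. The names totalsByCol1, totalsByCol1Col2, combined and the sum_level1 / sum_level2 aliases are hypothetical, the cast assumes col3 holds integer strings, and I am not sure this is the most efficient approach on Spark 1.6.

```scala
import org.apache.spark.sql.functions.{col, sum}

// Assumption: col3 contains numeric strings, so cast it explicitly before summing.
val col3Value = col("col3").cast("int")

// Per-col1 totals (level 1) and per-(col1, col2) totals (level 2).
val totalsByCol1 = df.groupBy("col1").agg(sum(col3Value).as("sum_level1"))
val totalsByCol1Col2 = df.groupBy("col1", "col2").agg(sum(col3Value).as("sum_level2"))

// Join on col1 so that every (col1, col2) row also carries the total for its col1.
val combined = totalsByCol1Col2.join(totalsByCol1, "col1")
combined.show()
```

With this, combined should have one row per (col1, col2) pair, with both sums side by side.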
How can I do this? Can someone explain?
Spark: 1.6.2, Scala: 2.10