Julia: Compute summary values for column x for each unique value in column y of a DataFrame

I would like to apply some functions, such as mean and variance, to column x of my DataFrame for each unique value in column y. I can imagine writing a loop that manually splits the DataFrame to get there, but I would rather not reinvent the wheel for something that is most likely a common operation.

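using DataFrames
using Random      # provides randstring on Julia 0.7 and later
using Statistics  # provides mean on Julia 0.7 and later
mydf = DataFrame(y = [randstring(1) for i in 1:1000], x = rand(1000))
# I could imagine a function that looks like this (hypothetical, not a real API):
# apply(f = mean, across = mydf.x, by = mydf.y)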
1 answer

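Yes, this is a common operation, known as the split-apply-combine strategy. DataFrames provides several functions for it, including by and aggregate. For example, with aggregate: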

julia> aggregate(mydf, :y, mean)
62×2 DataFrames.DataFrame
│ Row │ y   │ x_mean   │
├─────┼─────┼──────────┤
│ 1   │ "0" │ 0.454196 │
│ 2   │ "1" │ 0.541434 │
│ 3   │ "2" │ 0.36734  │
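
As far as I know, by and aggregate are no longer available in DataFrames.jl 1.x, where the same split-apply-combine step is written with groupby and combine. The following is a minimal sketch under that assumption. The seed, the variable name stats, and the output column names x_mean and x_var are my own choices for illustration and do not come from the question.

```julia
using DataFrames
using Random      # randstring, Random.seed!
using Statistics  # mean, var

Random.seed!(1)   # arbitrary seed, only so that repeated runs give the same data
mydf = DataFrame(y = [randstring(1) for _ in 1:1000], x = rand(1000))

# Split: group the rows by the value in column y.
groups = groupby(mydf, :y)

# Apply and combine: compute mean and variance of x within each group
# and collect the results into one row per unique y.
stats = combine(groups,
                :x => mean => :x_mean,
                :x => var  => :x_var)

# Show the first few rows of the result.
println(first(stats, 3))
```

Each source => function => target pair adds one output column, so further statistics can be added by appending more pairs to the combine call.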

Source: https://habr.com/ru/post/1677280/

