How to collapse duplicate timestamps in a Julia `DataFrame`

I have a `DataFrame` object that looks like this:

| Row | timestamp           | price | volume |
|-----|---------------------|-------|--------|
| 1   | 2011-08-14T14:14:40 | 10.40 | 0.779  |
| 2   | 2011-08-14T15:15:17 | 10.40 | 0.101  |
| 3   | 2011-08-14T15:15:17 | 10.40 | 0.316  |
| ... | ................... | ..... | .....  |

The timestamps are not unique, so I cannot convert it to a `TimeArray` until that is resolved. How can I collapse the duplicate timestamps by taking the average price and the sum of the volumes?

Thanks for any pointers!

1 answer

You can use `by`:

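using DataFrames
using Statistics  # provides mean on Julia 0.7 and later

# Toy data: category "a" appears twice and should collapse into one row
df = DataFrame(
  cat    = ["a", "b", "c", "a"],
  prices = [1, 2, 3, 4],
  vol    = [10, 20, 30, 40],
)

# For each group of rows sharing the same :cat, return a one-row DataFrame
# holding the mean price and the summed volume.
# Note: `by` and `sub[:col]` belong to older DataFrames.jl releases (before 1.0).
df2 = by(df, :cat) do sub
  DataFrame(prices = mean(sub[:prices]), vol = sum(sub[:vol]))
end

df2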

3×3 DataFrames.DataFrame
│ Row │ cat │ prices │ vol │
├─────┼─────┼────────┼─────┤
│ 1   │ "a" │ 2.5    │ 50  │
│ 2   │ "b" │ 2.0    │ 20  │
│ 3   │ "c" │ 3.0    │ 30  │
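
On DataFrames.jl 1.0 and later, where `by` is no longer available, the same idea can be expressed with `groupby` and `combine`. The following is only a sketch applied to the question's data: it assumes the columns are named `timestamp`, `price`, and `volume` as in the table, and it rebuilds the three visible rows by hand.

```julia
using DataFrames, Dates, Statistics

# The three rows shown in the question (the elided rows are left out)
df = DataFrame(
    timestamp = DateTime.(["2011-08-14T14:14:40",
                           "2011-08-14T15:15:17",
                           "2011-08-14T15:15:17"]),
    price     = [10.40, 10.40, 10.40],
    volume    = [0.779, 0.101, 0.316],
)

# One output row per distinct timestamp: mean of price, sum of volume.
# sort=true orders the groups by timestamp.
collapsed = combine(groupby(df, :timestamp; sort=true),
                    :price  => mean => :price,
                    :volume => sum  => :volume)
```

After this, `collapsed` has one row per unique timestamp, which is the precondition the question mentions for converting to a `TimeArray`.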

If you need totals by day, month, etc., you might also be interested in this related answer.
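
As a rough illustration of aggregating by day, one option is to derive a `Date` column from the timestamps and group on that instead. This is a sketch for DataFrames.jl 1.0 or later, not the approach from the linked answer; the `day` column name and the third sample row are made up for the example.

```julia
using DataFrames, Dates, Statistics

# Sample data: the last row is hypothetical, added so that two days are present
df = DataFrame(
    timestamp = DateTime.(["2011-08-14T14:14:40",
                           "2011-08-14T15:15:17",
                           "2011-08-15T09:30:00"]),
    price     = [10.40, 10.40, 10.60],
    volume    = [0.779, 0.101, 0.316],
)

# Truncate each timestamp to its calendar day
# (for monthly totals, firstdayofmonth.(Date.(df.timestamp)) could be used instead)
df.day = Date.(df.timestamp)

# One row per day: mean price and total volume
daily = combine(groupby(df, :day; sort=true),
                :price  => mean => :price,
                :volume => sum  => :volume)
```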


Source: https://habr.com/ru/post/1677282/