Postgres: calculating the change in cumulative data

I collect data from several API sources with Python and add it to two tables in Postgres.

I then use this data to build reports, joining, grouping and filtering it. I add thousands of rows every day.

Cost, revenue and sales are always cumulative: each data point covers the period from t1 for that product, and t2 is the time the data was retrieved.

So the latest data pull includes all previous data back to t1. t1 and t2 are timestamp without time zone columns in Postgres. I am currently using Postgres 10.

Example:

id, vendor_id, product_id, t1, t2, cost, revenue, sales
1, a, a, 2018-01-01, 2018-04-18, 50, 200, 34
2, a, b, 2018-05-01, 2018-04-18, 10, 100, 10
3, a, c, 2018-01-02, 2018-04-18, 12, 100, 9
4, a, d, 2018-01-03, 2018-04-18, 12, 100, 8
5, b, e, 2018-25-02, 2018-04-18, 12, 100, 7

6, a, a, 2018-01-01, 2018-04-17, 40, 200, 30
7, a, b, 2018-05-01, 2018-04-17, 0, 95, 8
8, a, c, 2018-01-02, 2018-04-17, 10, 12, 5
9, a, d, 2018-01-03, 2018-04-17, 8, 90, 4
10, b, e, 2018-25-02, 2018-04-17, 9, 0-, 3

Cost and revenue come from two tables, and I join them on vendor_id, product_id and t2.

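Is there a way to go through all the data, "shift" it and subtract, so that instead of cumulative data I have per-period data?

Should I do this before storing the data, or when building the reports?

For reference, when I currently need a report on the change between two dates, I run two subqueries and join them, roughly like this: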

with report1 as (select ...),
report2 as (select ...)
select .. from report1 left outer join report2 on ...

JR

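You can use LAG() for this.

From the Postgres documentation:

... returns value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default (which must be of the same type as value). Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.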
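
To make the offset and default behaviour concrete, here is a minimal Python sketch that emulates lag() over a single, already ordered partition. The helper name and the list-based interface are assumptions for illustration only; this is not how Postgres implements it.

```python
def lag(values, offset=1, default=None):
    """Emulate lag() over one partition whose rows are already ordered."""
    return [
        # Look `offset` rows back; fall back to `default` when no such row exists.
        values[i - offset] if 0 <= i - offset < len(values) else default
        for i in range(len(values))
    ]


# Cumulative cost of one product ordered by t2 (values taken from the sample data).
cumulative_cost = [16, 35, 50]
previous_cost = lag(cumulative_cost)           # offset defaults to 1, default to None
previous_or_zero = lag(cumulative_cost, 1, 0)  # explicit default instead of None
```

The first row has no predecessor, so it receives the default, which is why the query wraps the subtraction in coalesce.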

with sample_data as (
        select 1 as id, 'a'::text vendor_id, 'a'::text product_id, '2018-01-01'::date as t1, '2018-04-18'::date as t2, 50 as cost, 200 as revenue, 36 as sales
        union all
        select 2 as id, 'a'::text vendor_id, 'b'::text product_id, '2018-01-01'::date as t1, '2018-04-18'::date as t2, 55 as cost, 200 as revenue, 34 as sales
        union all
        select 3 as id, 'a'::text vendor_id, 'a'::text product_id, '2018-01-01'::date as t1, '2018-04-17'::date as t2, 35 as cost, 150 as revenue, 25 as sales
        union all
        select 4 as id, 'a'::text vendor_id, 'b'::text product_id, '2018-01-01'::date as t1, '2018-04-17'::date as t2, 25 as cost, 140 as revenue, 23 as sales
        union all
        select 5 as id, 'a'::text vendor_id, 'a'::text product_id, '2018-01-01'::date as t1, '2018-04-16'::date as t2, 16 as cost, 70 as revenue, 12 as sales
        union all
        select 6 as id, 'a'::text vendor_id, 'b'::text product_id, '2018-01-01'::date as t1, '2018-04-16'::date as t2, 13 as cost, 65 as revenue, 11 as sales
)
-- For each (vendor_id, product_id), subtract the previous pull's cumulative value.
-- lag() is null on the first pull, so coalesce falls back to the cumulative value itself.
select sd.*
    , coalesce(cost - lag(cost) over w, cost) as cost_new
    , coalesce(revenue - lag(revenue) over w, revenue) as revenue_new
    , coalesce(sales - lag(sales) over w, sales) as sales_new
from sample_data sd
window w as (partition by vendor_id, product_id order by t2)
order by t2 desc
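
Since the question mentions that the data is collected with Python, the same differencing could also be done there. The following is only a sketch under assumptions: rows are plain tuples in the order (vendor_id, product_id, t2, cost, revenue, sales), t2 is an ISO date string so that string sorting matches date order, and the function name deltas is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# (vendor_id, product_id, t2, cost, revenue, sales): same values as sample_data.
rows = [
    ("a", "a", "2018-04-18", 50, 200, 36),
    ("a", "b", "2018-04-18", 55, 200, 34),
    ("a", "a", "2018-04-17", 35, 150, 25),
    ("a", "b", "2018-04-17", 25, 140, 23),
    ("a", "a", "2018-04-16", 16, 70, 12),
    ("a", "b", "2018-04-16", 13, 65, 11),
]


def deltas(rows):
    """Return (vendor_id, product_id, t2, cost_new, revenue_new, sales_new) tuples."""
    result = []
    # Sorting by (vendor_id, product_id, t2) plays the role of
    # "partition by vendor_id, product_id order by t2".
    ordered = sorted(rows, key=itemgetter(0, 1, 2))
    for _, group in groupby(ordered, key=itemgetter(0, 1)):
        previous = (0, 0, 0)  # first pull: the cumulative value is the change
        for vendor_id, product_id, t2, *totals in group:
            changes = tuple(now - before for now, before in zip(totals, previous))
            result.append((vendor_id, product_id, t2, *changes))
            previous = tuple(totals)
    return result


for row in deltas(rows):
    print(row)
```

Starting each group from zeros mirrors the coalesce fallback in the query, as long as the cumulative values are never null.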

Source: https://habr.com/ru/post/1696275/

