Window functions and more "local" aggregation

Suppose I have this table:

 select * from window_test;

  k | v
 ---+---
  a | 1
  a | 2
  b | 3
  a | 4

Ultimately, I want to get:

  k | min_v | max_v
 ---+-------+-------
  a |     1 |     2
  b |     3 |     3
  a |     4 |     4

But I would be just as happy to get this (since I can easily filter it with distinct):

  k | min_v | max_v
 ---+-------+-------
  a |     1 |     2
  a |     1 |     2
  b |     3 |     3
  a |     4 |     4

Is it possible to achieve this using PostgreSQL 9.1+ window functions? I am trying to figure out whether I can make it use a separate partition for the first and last occurrences of k = a in this example (ordered by v).

3 answers

This returns the desired result with sample data. Not sure if it will work for real world data:

 select k,
        min(v) over (partition by group_nr) as min_v,
        max(v) over (partition by group_nr) as max_v
 from (
   -- the running sum of the flags numbers each run of equal k
   select *, sum(group_flag) over (order by v,k) as group_nr
   from (
     -- flag every row whose k differs from the previous row (ordered by v)
     select *, case when lag(k) over (order by v) = k then null else 1 end as group_flag
     from window_test
   ) t1
 ) t2
 order by min_v;

I have not used DISTINCT, so this returns the second (four-row) form of the result.
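To see why the nested query above separates the two runs of k = a, it can help to look at the intermediate columns on their own. The following is only a sketch: it assumes the four sample rows from the question, inlined as a VALUES list so that it runs without the real table, and the CTE name flagged is made up for this illustration.

```sql
-- Sample rows from the question, inlined so the snippet is self-contained.
WITH window_test (k, v) AS (
  VALUES ('a', 1), ('a', 2), ('b', 3), ('a', 4)
),
flagged AS (  -- hypothetical name, used only in this sketch
  SELECT k, v,
         -- 1 where k differs from the previous row (or there is no previous row)
         CASE WHEN lag(k) OVER (ORDER BY v) = k THEN NULL ELSE 1 END AS group_flag
    FROM window_test
)
SELECT k, v, group_flag,
       -- running sum of the flags: all rows of one run share the same number
       sum(group_flag) OVER (ORDER BY v) AS group_nr
  FROM flagged
 ORDER BY v;
```

With the sample data, group_nr comes out as 1, 1, 2, 3, so the first two rows of a share one partition and the last a gets its own. Aggregating with GROUP BY k, group_nr instead of the window aggregates should give the three-row form directly.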


EDIT: I came up with the following query, which uses no window functions:

 WITH RECURSIVE tree AS (
   SELECT k, v, ''::text as next_k, 0 as next_v, 0 AS level
     FROM window_test
   UNION ALL
   SELECT c.k, c.v, t.k, t.v + level, t.level + 1
     FROM tree t
     JOIN window_test c ON c.k = t.k AND c.v + 1 = t.v),
 partitions AS (
   SELECT t.k, t.v, t.next_k, coalesce(nullif(t.next_v, 0), t.v) AS next_v, t.level
     FROM tree t
    WHERE NOT EXISTS (SELECT 1 FROM tree WHERE next_k = t.k AND next_v = t.v))
 SELECT min(k) AS k, v AS min_v, max(next_v) AS max_v
   FROM partitions p
  GROUP BY v
  ORDER BY 2;

I have now provided two working queries; I hope one of them suits you.

SQL Fiddle for this option.


Another way to achieve this is to use a support sequence.

  • Create a support sequence:

     CREATE SEQUENCE wt_rank START WITH 1; 
  • Query:

     WITH source AS (
       SELECT k, v, coalesce(lag(k) OVER (ORDER BY v), k) AS prev_k
         FROM window_test
        CROSS JOIN (SELECT setval('wt_rank', 1)) AS ri),
     ranking AS (
       SELECT k, v, prev_k,
              CASE WHEN k = prev_k THEN currval('wt_rank')
                   ELSE nextval('wt_rank') END AS rank
         FROM source)
     SELECT r.k, min(s.v) AS min_v, max(s.v) AS max_v
       FROM ranking r
       JOIN source s ON r.v = s.v
      GROUP BY r.rank, r.k
      ORDER BY 2;

Doesn't this do the job for you, without the need for windows, partitions or coalescing? It just uses a traditional SQL trick: find the closest tuples via a self-join and the minimum of the difference:

 SELECT k, min(v), max(v)
   FROM (
     SELECT k, v, v + min(d) lim
       FROM (
         SELECT x.*, y.k n, y.v - x.v d
           FROM window_test x
           LEFT JOIN window_test y ON x.k <> y.k AND y.v - x.v > 0) z
      GROUP BY k, v, n) w
  GROUP BY k, lim
  ORDER BY 2;

I think this is probably a more "relational" solution, but I'm not sure about its efficiency.


Source: https://habr.com/ru/post/915886/

