How can I traverse a tree from bottom to top to calculate the (weighted) average node value in PostgreSQL?

The typical example for, say, summing up a whole tree in PostgreSQL uses WITH RECURSIVE (Common Table Expressions). However, these examples usually go from top to bottom, flatten the tree, and apply an aggregate function to the entire result set. I have not found a suitable example (on StackOverflow, Google, etc.) for the problem I'm trying to solve:

Consider an unbalanced tree where each node can have an associated value. Most values are attached to leaf nodes, but other nodes may have values as well. If a node (leaf or not) has an explicitly attached value, this value can be used directly without further calculation (its subtree can then be ignored). If the node has no value, its value should be calculated as the average of its direct children.

However, since none of the nodes is guaranteed to have an attached value, I need to go from bottom to top to get the overall average. In a nutshell, starting with the leaves, I need to apply AVG() to each set of siblings and use this (intermediate) result as the value for the parent node (if it has none). This (new) parent value (explicitly attached, or the average of its children) is in turn used to calculate the average at the next level up (the average of the parent and its siblings).

An example situation:

A
+- B (6)
+- C
   +- D
      +- E (10)
      +- F (2)
+- H (18)
   +- I (102)
   +- J (301)

I need to calculate the average value for A, which should be 10 (because (6+6+18)/3 = 10, and I and J are ignored).
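The expected result can be checked step by step. A small sketch using only the values from the tree above (the aliases d_value, a_value and children are just illustrative names):

```sql
-- D has no value of its own: average of its children E (10) and F (2)
select avg(v)::int as d_value
from (values (10), (2)) as children(v);      -- 6

-- C has no value and a single child D, so C = 6 as well.
-- A: average of B (6), C (6) and H (18); I and J are never consulted
select avg(v)::int as a_value
from (values (6), (6), (18)) as children(v); -- 10
```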


Sample data:

create table tree(id int primary key, parent int, caption text, node_value int);
insert into tree values
(1, 0, 'A', null),
(2, 1, 'B', 6),
(3, 1, 'C', null),
(4, 3, 'D', null),
(5, 4, 'E', 10),
(6, 4, 'F', 2),
(7, 1, 'H', 18),
(8, 7, 'I', 102),
(9, 7, 'J', 301);

A recursive plpgsql function:

create or replace function get_node_value(node_id int)
returns int language plpgsql as $$
declare
    val int;
begin
    -- use the explicitly attached value if there is one
    select node_value
    from tree
    where id = node_id
    into val;
    -- otherwise average the recursively resolved values of the direct children
    if val is null then
        select avg(get_node_value(id))
        from tree
        where parent = node_id
        into val;
    end if;
    return val;
end;
$$;

select get_node_value(1);

 get_node_value 
----------------
             10
(1 row)
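
Since the function resolves any node, not just the root, it can also be called once per row to make the intermediate averages visible. A usage sketch against the sample data above (the alias resolved is an illustrative name):

```sql
-- resolve every node of the sample tree, not only the root
select id, caption, node_value, get_node_value(id) as resolved
from tree
order by id;
```

For the sample data, C and D should both resolve to 6, while nodes with an explicit value simply return it. Each call recurses into the subtree again, so on a large tree this is likely to be slow.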


The same can be done with an sql-language function instead of plpgsql:

create or replace function get_node_value_sql(node_id int)
returns int language sql as $$
    select coalesce(
        node_value,
        (
            select avg(get_node_value_sql(id))::int
            from tree
            where parent = node_id
        )
    )
    from tree 
    where id = node_id;
$$;
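
Both functions return int, so every intermediate average is rounded before it is passed up to the parent. If that matters, a variant that keeps numeric precision might look like the sketch below (get_node_value_numeric is a hypothetical name, not part of the original answer):

```sql
-- hypothetical variant: same logic, but averages stay numeric all the way up
create or replace function get_node_value_numeric(node_id int)
returns numeric language sql as $$
    select coalesce(
        node_value::numeric,
        (
            select avg(get_node_value_numeric(id))
            from tree
            where parent = node_id
        )
    )
    from tree
    where id = node_id;
$$;

select get_node_value_numeric(1);
```

The result can then be rounded once at the end with round() if an integer is needed.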

A bottom-up variant with a recursive cte:

with recursive bottom_up(id, parent, caption, node_value, level, calculated) as (
    -- start from the leaves (nodes without children) at level 0
    select 
        *, 
        0, 
        node_value calculated
    from tree t
    where not exists (
        select id
        from tree
        where parent = t.id)
union all
    -- climb to the parent, carrying the child's value up
    -- unless the parent has a value of its own
    select 
        t.*, 
        b.level + 1,
        case when t.node_value is null then b.calculated else t.node_value end
    from tree t
    join bottom_up b on t.id = b.parent
)

-- average per node and level first, then across the levels of each node
select id, parent, caption, avg(calculated)::int calculated
from (
    select id, parent, caption, level, avg(calculated)::int calculated
    from bottom_up
    group by 1, 2, 3, 4
    ) s
group by 1, 2, 3
order by 1;
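
A different way to look at the problem, and not the approach used above: the root value is a weighted sum of the first explicit value found on each path, where every child gets an equal share of its parent's weight. That can be computed top-down. The sketch below assumes the root has id = 1 and that every leaf has a value; the names weighted, weight, c, n and root_value are illustrative.

```sql
with recursive weighted(id, node_value, weight) as (
    -- the root starts with the full weight
    select id, node_value, 1.0::numeric
    from tree
    where id = 1
union all
    -- descend only below nodes without an explicit value;
    -- each child gets an equal share of its parent's weight
    select t.id, t.node_value, w.weight / c.n
    from weighted w
    join (
        select parent, count(*) as n
        from tree
        group by parent
    ) c on c.parent = w.id
    join tree t on t.parent = w.id
    where w.node_value is null
)
-- only nodes with an explicit value contribute to the result
select round(sum(weight * node_value), 2) as root_value
from weighted
where node_value is not null;
```

For the sample data the weights are 1/3 for B and H and 1/6 each for E and F, so the sum should come out at 10 after rounding. If a leaf without a value exists, the shares no longer match what avg() does (avg skips nulls), so the assumption matters.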


Source: https://habr.com/ru/post/1665205/

