Aggregating a single column in a query with many columns

Is there a way to aggregate a single column when the query has many other columns?

I tried this answer, which works, but it made my query much more verbose.

My current query looks like this:

SELECT t1.foo1, t1.foo2, t2.foo3, t2.foo4,
       string_agg(t3.aggregated_field, ', ')
FROM   tbl1 t1
LEFT   JOIN tbl2 t2 ON t1.id = t2.fkeyid
LEFT   JOIN tbl3 t3 ON t2.id = t3.fkeyid
GROUP  BY t1.foo1, t1.foo2, t2.foo3, t2.foo4, t2.foo5, t2.foo6
ORDER  BY t2.foo5, t2.foo6

The real query has many more fields and LEFT JOINs. The important part is that all these fields come from 1:1 or 1:0 relationships, except for one field from a 1:n relationship, which I want to aggregate. It is represented by t3.aggregated_field in the pseudo query above.

Since I use an aggregate function, every field listed in SELECT and ORDER BY must be either aggregated or part of the GROUP BY. This makes my query far more verbose than it already is.

That is, assuming foo1 is the primary key, whenever this field repeats, all the other fields except aggregated_field are identical as well. I want these duplicate rows collapsed into a single row carrying the aggregated field value (basically a SELECT DISTINCT with an aggregate column).

Is there a better way to do this (without having to put all the other fields in GROUP BY), or should I just iterate over the result set in my application code and run a query per row to fetch this 1:n relationship?


The server runs PostgreSQL 9.1.9, more precisely:

PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit

2 answers

Simple query

This can be much simpler with PostgreSQL 9.1 or later. As explained in this closely related answer, it is enough to GROUP BY the primary key of a table. Since:

foo1 is the primary key

... you can simplify your example to:

SELECT foo1, foo2, foo3, foo4, foo5, foo6,
       string_agg(aggregated_field, ', ')
FROM   tbl1
GROUP  BY 1
ORDER  BY foo7, foo8;  -- have to be spelled out, since not in select list!
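To see the primary-key shortcut in action, here is a minimal, self-contained sketch. The tables parent and child and all their columns are hypothetical, invented purely for illustration; it assumes PostgreSQL 9.1 or later.

```sql
-- Hypothetical schema, for illustration only (not from the question).
CREATE TEMP TABLE parent (
    id   integer PRIMARY KEY,
    name text,
    city text
);
CREATE TEMP TABLE child (
    parent_id integer,
    tag       text
);

INSERT INTO parent VALUES (1, 'alice', 'Oslo'), (2, 'bob', 'Rome');
INSERT INTO child  VALUES (1, 'x'), (1, 'y');

-- Grouping by the primary key alone is enough: p.name and p.city are
-- functionally dependent on p.id, so they may appear ungrouped.
SELECT p.id, p.name, p.city,
       string_agg(c.tag, ', ' ORDER BY c.tag) AS tags
FROM   parent p
LEFT   JOIN child c ON c.parent_id = p.id
GROUP  BY p.id
ORDER  BY p.id;
```

Note that the shortcut only covers columns of the table whose primary key is in the GROUP BY. Columns from other joined tables still have to be grouped explicitly (or covered by their own primary key).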

Query with multiple tables

However, since you have:

there are many more fields and LEFT JOINs; the important part is that all these fields come from 1:1 or 1:0 relationships, except for one field from a 1:n relationship, which I want to aggregate

... it should be faster and simpler to aggregate first and join later:

SELECT t1.foo1, t1.foo2, ...
     , t2.bar1, t2.bar2, ...
     , a.aggregated_col
FROM   tbl1 t1
LEFT   JOIN tbl2 t2 ON ...
...
LEFT   JOIN (
   SELECT some_id, string_agg(agg_col, ', ') AS aggregated_col
   FROM   agg_tbl
   GROUP  BY some_id
   ) a ON a.some_id = ?.some_id  -- ? stands for the alias of the table that agg_tbl references
ORDER  BY ...

This way, most of your query does not need aggregation at all.
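As a concrete illustration of aggregating first and joining later, the following sketch reuses the table names from the question. The column definitions and sample rows are assumptions made up for the example, not the asker's real schema.

```sql
-- Hypothetical tables modelled on the question's pseudo query.
CREATE TEMP TABLE tbl1 (id integer PRIMARY KEY, foo1 text, foo2 text);
CREATE TEMP TABLE tbl2 (id integer PRIMARY KEY, fkeyid integer, foo3 text, foo5 integer);
CREATE TEMP TABLE tbl3 (fkeyid integer, aggregated_field text);

INSERT INTO tbl1 VALUES (1, 'a', 'b'), (2, 'c', 'd');
INSERT INTO tbl2 VALUES (10, 1, 'e', 2), (20, 2, 'f', 1);
INSERT INTO tbl3 VALUES (10, 'x'), (10, 'y');

SELECT t1.foo1, t1.foo2, t2.foo3, a.aggregated_col
FROM   tbl1 t1
LEFT   JOIN tbl2 t2 ON t2.fkeyid = t1.id
LEFT   JOIN (
   -- Collapse the 1:n side to one row per key before joining.
   SELECT fkeyid,
          string_agg(aggregated_field, ', ' ORDER BY aggregated_field) AS aggregated_col
   FROM   tbl3
   GROUP  BY fkeyid
   ) a ON a.fkeyid = t2.id
ORDER  BY t2.foo5;  -- no outer GROUP BY, so any column may be used here
```

Because the subquery returns at most one row per fkeyid, the outer joins never multiply rows, and the outer SELECT and ORDER BY lists can grow without touching any GROUP BY clause.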

I recently provided a test case in an SQL Fiddle to prove the point in this related answer:

Since you are referring to this related answer: no, DISTINCT will not help at all in this case.


If the main problem is that the fields (fooX) are computed expressions, this may help:

SELECT foo1, foo2, foo3, foo4, foo5, foo6,
       string_agg(aggregated_field, ', ')
FROM   tbl1
GROUP  BY 1, 2, 3, 4, 5, 6
ORDER  BY 5, 6

The numbers 1, 2, ... refer to the fields by their position in the SELECT list.


Source: https://habr.com/ru/post/1238437/

