I am trying to understand the significant speed differences I see between similar database queries, and I was hoping for some insight into why some aggregations are so much slower than others.
I noticed some performance issues with the search query for a document, and most of the cost seems to come from a json_agg call:
SELECT containers.*, json_agg(content_items.*) AS items
FROM containers
INNER JOIN content_items ON containers.id = content_items.container_id
GROUP BY containers.id
ORDER BY containers.order_date DESC, containers.id DESC
LIMIT 25 OFFSET 0;
EXPLAIN ANALYZE shows a total query time of about 500 ms, with more than 400 ms spent in the aggregation step:
GroupAggregate (cost=11921.58..12607.34 rows=17540 width=1553) (actual time=78.818..484.071 rows=17455 loops=1)
Just switching json_agg to array_agg brings the total time down to around 150 ms, although about half of that is still spent in aggregation:
GroupAggregate (cost=11921.58..12607.34 rows=17540 width=1553) (actual time=81.975..147.207 rows=17455 loops=1)
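For reference, the only change in this variant is the aggregate function; the rest of the query is identical:

SELECT containers.*, array_agg(content_items.*) AS items
FROM containers
INNER JOIN content_items ON containers.id = content_items.container_id
GROUP BY containers.id
ORDER BY containers.order_date DESC, containers.id DESC
LIMIT 25 OFFSET 0;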
Running the same query without grouping or aggregation brings the total time down to around 25 ms, although it then returns a variable number of containers, since LIMIT 25 applies to the joined rows rather than to distinct containers.
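The unaggregated version is roughly this shape (one row per content item, so 25 rows no longer means 25 containers):

SELECT containers.*, content_items.*
FROM containers
INNER JOIN content_items ON containers.id = content_items.container_id
ORDER BY containers.order_date DESC, containers.id DESC
LIMIT 25 OFFSET 0;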
Is there a reason json_agg imposes such a penalty? Is there a way to fetch a fixed number of container rows along with all of their content_items, and do the grouping at the application level instead?
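For example, I imagine something like this two-step approach, where the application collects the container ids from the first query and then groups the items from the second query itself (just a sketch, not tested):

-- Step 1: fetch the page of 25 containers
SELECT *
FROM containers
ORDER BY order_date DESC, id DESC
LIMIT 25 OFFSET 0;

-- Step 2: fetch all items for those containers,
-- using the ids the application collected in step 1
SELECT *
FROM content_items
WHERE container_id IN (/* ids from step 1 */);

-- Note: unlike the INNER JOIN version, step 1 also returns
-- containers that have no content_items at all.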