Using an external query results in a subquery in postgresql

Question

Using an external query results in a subquery in postgresql

I have two points and contacts tables, and I'm trying to get the average points.score for each contact grouped on a monthly basis. Note that the points and contacts are not connected, I just want the sum of points created for the month divided by the number of contacts that existed this month.

So, I need to summarize the points grouped by created_at month, and I need to take the contact counter ONLY THAT MONTH ONLY. This is the last part that deceives me. I'm not sure how I can use a column from an external query in a subquery. I tried something like this:

 SELECT SUM(score) AS points_sum, EXTRACT(month FROM created_at) AS month, date_trunc('MONTH', created_at) + INTERVAL '1 month' AS next_month, (SELECT COUNT(id) FROM contacts WHERE contacts.created_at <= next_month) as contact_count FROM points GROUP BY month, next_month ORDER BY month

So, I retrieve the actual month when my points are added up and at the same time get the start of next_month so I can say: "Get me the number of contacts in which their created value is <next_month"

But he complains that column next_month doesn't exist This is understandable, since nothing is known about the outer query in the subquery. Qualifying with points.next_month does not work either.

So can someone point me in the right direction how to do this?

Tables:

Points

 score | created_at 10 | "2011-11-15 21:44:00.363423" 11 | "2011-10-15 21:44:00.69667" 12 | "2011-09-15 21:44:00.773289" 13 | "2011-08-15 21:44:00.848838" 14 | "2011-07-15 21:44:00.924152"

Contacts

 id | created_at 6 | "2011-07-15 21:43:17.534777" 5 | "2011-08-15 21:43:17.520828" 4 | "2011-09-15 21:43:17.506452" 3 | "2011-10-15 21:43:17.491848" 1 | "2011-11-15 21:42:54.759225"

sum, month and next_month (without subquery)

 sum | month | next_month 14 | 7 | "2011-08-01 00:00:00" 13 | 8 | "2011-09-01 00:00:00" 12 | 9 | "2011-10-01 00:00:00" 11 | 10 | "2011-11-01 00:00:00" 10 | 11 | "2011-12-01 00:00:00"

+6

sql postgresql correlated-subquery

brad Nov 16 '11 at 0:04

source share

1 answer

Erwin brandstetter · Accepted Answer · 2011-11-16T00:20:20+0000

Edit

Now with the current amount of contacts. My first project used new contacts per month, which clearly does not meet the OP requirements.

 WITH c AS ( SELECT created_at ,count(id) OVER (order BY created_at) AS ct FROM contacts ), p AS ( SELECT date_trunc('month', created_at) AS month ,sum(score) AS points_sum FROM points GROUP BY 1 ) SELECT p.month ,EXTRACT(month FROM p.month) AS month_nr ,p.points_sum ,( SELECT c.ct FROM c WHERE c.created_at < (p.month + interval '1 month') ORDER BY c.created_at DESC LIMIT 1) AS contacts FROM p ORDER BY 1

It has been working for several months for several years.
It is assumed that the points table is missing a month. If you want all the months, including those missing in points , to generate a list of months with generate_series () and LEFT JOIN to it.
Create the current amount in CTE using the window function.
Both CTEs are not strictly necessary - just for performance and simplification.
Get combination_contacts in a subquery.

Your original request form may work as follows:

 SELECT month ,EXTRACT(month FROM month) AS month_nr ,points_sum ,(SELECT count(*) FROM contacts c WHERE c.created_at < (p.month + interval '1 month')) AS contact_count FROM ( SELECT date_trunc('MONTH', created_at) AS month ,sum(score) AS points_sum FROM points p GROUP BY 1 ) p ORDER BY 1

Correcting the immediate cause of your error is to put the aggregate in a subquery. You mixed levels in such a way that it is impossible.
I expect my option to be a little faster with large tables. Not sure about smaller tables. It would be great if you reported test results.
Plus a small fix: < instead of <= .

Using an external query results in a subquery in postgresql

Tables:

Edit

More articles: