Get a single sum of a joined table column

Question

I have a problem and I hope there is a simple solution. I will try to make it as simple as possible:

A ticket belongs to an attendee.
Example:

 select * from tickets JOIN attendees ON attendees.id = tickets.attendee_id

An attendee has a decimal column called revenue.

However, I need to run a query that returns a variety of information about tickets, including the total revenue. The problem is that if 2 tickets belong to the same attendee, the query counts that attendee's revenue twice. How can I sum each attendee's revenue only once?

I do not want to use subqueries as my ORM makes this difficult. Also, the subquery solution does not scale if I want to do this for multiple columns.

Here is what I have:

1 attendee with a revenue of 100
2 tickets that belong to this attendee

 Select count(tickets.*) as tickets_count,
        sum(attendees.revenue) as atendees_revenue
 from tickets
 LEFT OUTER JOIN attendees ON attendees.id = tickets.attendee_id;

=> This returns a revenue sum of 200. I want it to be 100, since there is one attendee in the database with a revenue of 100. I do NOT want the attendee to be counted twice.
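
For anyone who wants to reproduce the double counting, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table layouts are assumptions inferred from the query above, and count(tickets.id) replaces the PostgreSQL-specific count(tickets.*).

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (id INTEGER PRIMARY KEY, revenue NUMERIC);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, attendee_id INTEGER);
    -- One attendee with a revenue of 100 who owns two tickets.
    INSERT INTO attendees (id, revenue) VALUES (1, 100);
    INSERT INTO tickets (id, attendee_id) VALUES (1, 1), (2, 1);
""")

row = conn.execute("""
    SELECT count(tickets.id) AS tickets_count,
           sum(attendees.revenue) AS atendees_revenue
    FROM tickets
    LEFT OUTER JOIN attendees ON attendees.id = tickets.attendee_id
""").fetchone()

# The join yields one row per ticket, so the revenue of 100 is summed twice.
print(row)  # (2, 200)
```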

Please let me know if this is possible.

+5

sql duplicate-removal aggregate-functions postgresql window-functions

Binary Logic Nov 01

4 answers

How about a simple division:

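 Select count(tickets.*) as tickets_count,
        sum(attendees.revenue) / count(attendees.id) as atendees_revenue
 from tickets
 LEFT OUTER JOIN attendees ON attendees.id = tickets.attendee_id;

This should handle duplicates, triplicates, etc. Note that it returns the average revenue per joined row, so in general it matches the intended total only when the result covers a single attendee.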
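
A rough check of the division approach, sketched with Python's sqlite3 module in place of PostgreSQL. The helper revenue_by_division and the sample rows are hypothetical and exist only for this illustration. It runs the same expression for one attendee and then for two attendees.

```python
import sqlite3

def revenue_by_division(attendees, tickets):
    """Run the division query on a fresh in-memory database (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE attendees (id INTEGER PRIMARY KEY, revenue NUMERIC)")
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, attendee_id INTEGER)")
    conn.executemany("INSERT INTO attendees VALUES (?, ?)", attendees)
    conn.executemany("INSERT INTO tickets VALUES (?, ?)", tickets)
    return conn.execute("""
        SELECT sum(attendees.revenue) / count(attendees.id)
        FROM tickets
        LEFT OUTER JOIN attendees ON attendees.id = tickets.attendee_id
    """).fetchone()[0]

# One attendee (revenue 100) with two tickets: 200 / 2.
one_attendee = revenue_by_division([(1, 100)], [(1, 1), (2, 1)])

# Two attendees (revenue 100 and 50) with two tickets each: 300 / 4.
two_attendees = revenue_by_division(
    [(1, 100), (2, 50)],
    [(1, 1), (2, 1), (3, 2), (4, 2)],
)

print(one_attendee, two_attendees)  # 100 75
```

With two attendees the wanted total would be 150, so the division only gives the right figure in the single-attendee case from the question.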

+3

heriberto perez Nov 20 '17 at 23:03

You were actually pretty close. There are many ways to do this, and if I understand your question correctly, this should do it:

 SELECT COUNT(*) AS ticketsCount,
        SUM(DISTINCT attendees.revenue) AS revenueSum
 FROM tickets
 LEFT JOIN attendees ON attendees.id = tickets.attendee_id

0

Kad Nov 01

The previous answer is almost correct. You just need to make DISTINCT work when several attendees have the same revenue. You can fix this quite simply if your id is of a numeric type:

 SELECT COUNT(*) AS ticketsCount,
        SUM(DISTINCT attendees.id + attendees.revenue) - SUM(DISTINCT attendees.id) AS revenueSum
 FROM tickets
 LEFT JOIN attendees ON attendees.id = tickets.attendee_id
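
To see the difference between the two DISTINCT variants, here is a small sketch using Python's sqlite3 module as a stand-in for PostgreSQL. The sample data is made up for illustration: two attendees share the same revenue, and one of them holds two tickets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (id INTEGER PRIMARY KEY, revenue NUMERIC);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, attendee_id INTEGER);
    -- Two attendees with the same revenue; attendee 1 holds two tickets.
    INSERT INTO attendees VALUES (1, 100), (2, 100);
    INSERT INTO tickets VALUES (1, 1), (2, 1), (3, 2);
""")

plain_distinct, id_offset = conn.execute("""
    SELECT SUM(DISTINCT attendees.revenue),
           SUM(DISTINCT attendees.id + attendees.revenue) - SUM(DISTINCT attendees.id)
    FROM tickets
    LEFT JOIN attendees ON attendees.id = tickets.attendee_id
""").fetchone()

# Plain DISTINCT merges the two equal revenues; the id offset keeps them apart.
print(plain_distinct, id_offset)  # 100 200
```

Keep in mind that id + revenue can still coincide for different attendees (for example id 1 with revenue 2 and id 2 with revenue 1), so the offset trick is not safe for arbitrary data.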

0

mikezig92 Aug 26 '17 at 21:14

Erwin Brandstetter · Accepted Answer · 2012-11-01 01:05

To get the result without a subquery, you have to resort to an advanced window function workaround:

 SELECT sum(count(*)) OVER () AS tickets_count,
        sum(min(a.revenue)) OVER () AS atendees_revenue
 FROM tickets t
 JOIN attendees a ON a.id = t.attendee_id
 GROUP BY t.attendee_id
 LIMIT 1;

Explanation

The key to understanding this is the sequence of events in the query:

aggregate functions -> window functions -> DISTINCT -> LIMIT

More details here:

The best way to get a result counter before applying LIMIT

Step by step:

1. I GROUP BY t.attendee_id, which you would normally do in a subquery.
2. Then I sum over the counts to get the total number of tickets. Not very efficient, but forced by your requirement. The aggregate function count(*) is wrapped in the window function sum( ... ) OVER () to arrive at the not-so-common expression sum(count(*)) OVER ().
3. And I sum the minimum revenue per attendee to get the sum without duplicates.
   You could also use max() or avg() instead of min() to the same effect, as revenue is guaranteed to be the same in every row per attendee.
   This could be simpler if DISTINCT were allowed in window functions, but PostgreSQL has not yet implemented this feature. Per the documentation:
   "Aggregate window functions, unlike normal aggregate functions, do not allow DISTINCT or ORDER BY to be used within the function argument list."
4. The final step is to get a single row. This could be done with DISTINCT (SQL standard), since all rows are the same. LIMIT 1 will be faster, though. Or use the SQL-standard form FETCH FIRST 1 ROWS ONLY.
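
The steps above can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL, which assumes an SQLite build with window function support (3.25 or later). The sample rows are invented for illustration: attendee 1 has a revenue of 100 and two tickets, attendee 2 has a revenue of 50 and one ticket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (id INTEGER PRIMARY KEY, revenue NUMERIC);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, attendee_id INTEGER);
    INSERT INTO attendees VALUES (1, 100), (2, 50);
    INSERT INTO tickets VALUES (1, 1), (2, 1), (3, 2);
""")

# GROUP BY collapses the join to one row per attendee; the window sums then
# add up the per-attendee ticket counts and the per-attendee revenue.
row = conn.execute("""
    SELECT sum(count(*)) OVER () AS tickets_count,
           sum(min(a.revenue)) OVER () AS atendees_revenue
    FROM tickets t
    JOIN attendees a ON a.id = t.attendee_id
    GROUP BY t.attendee_id
    LIMIT 1
""").fetchone()

print(row)  # (3, 150)
```

Every grouped row carries the same window totals, which is why LIMIT 1 is enough to return the single result row.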
