I have simplified your request; there is no need to use COALLESCE with COUNT() , because COUNT() never returns NULL . For SUM() I prefer to use IFNULL() because it is shorter and more readable. The results below contain only data for classroom_id # 16, # 17 and # 46 for easier comparison with the example presented in the question. Actual result sets are larger and contain all the classroom_id present in the tables. However, their presence is not required to understand how and why it works.
SELECT t.classroom_id, t.title, COUNT(DISTINCT r.redeemed_code_id) AS totalRewards, COUNT(DISTINCT ocm.user_id) AS totalStudents, IFNULL(SUM(r.points), 0) AS totalPoints FROM `organisation_classrooms` t LEFT JOIN `classroom_redeemed_codes` r ON r.classroom_id = t.classroom_id AND r.inactive = 0 AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0) LEFT JOIN `organisation_classrooms_myusers` ocm ON ocm.classroom_id = t.classroom_id WHERE t.organisation_id = 37383 GROUP BY t.classroom_id ORDER BY t.classroom_id ASC
Try to break it into pieces and put it together. First, let's see which users are selected:
Request # 1
SELECT t.classroom_id, t.title, ocm.user_id FROM `organisation_classrooms` t LEFT JOIN `organisation_classrooms_myusers` ocm ON ocm.classroom_id = t.classroom_id WHERE t.organisation_id = 37383 ORDER BY t.classroom_id ASC
I deleted the classroom_redeemed_codes table and its fields, deleted GROUP BY and replaced the aggregated COUNT(ocm.user_id) function COUNT(ocm.user_id) with ocm.user_id to find out which users were selected.
The result shows that this part of the query is correct:
classroom_id | title | user_id -------------+-------+-------- 16 | BLUE | 2 16 | BLUE | 1 17 | GREEN | 508835 17 | GREEN | 508826 46 | PINK | NULL
In class number 16 there are 2 users, 2 more in # 7, and in class number 46 - no. Returning a GROUP BY will return the correct values ββ(2, 2, 0) in the totalStudents column.
Now check the link to the classroom_redeemed_codes table:
Request # 2
SELECT t.classroom_id, t.title, r.redeemed_code_id, r.points FROM `organisation_classrooms` t LEFT JOIN `classroom_redeemed_codes` r ON r.classroom_id = t.classroom_id AND r.inactive = 0 AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0) WHERE t.organisation_id = 37383 ORDER BY t.classroom_id ASC
Result:
classroom_id | title | redeemed_code_id | points -------------+-------+------------------+------- 16 | BLUE | 7 | 50 17 | GREEN | 8 | 25 17 | GREEN | 9 | 75 46 | PINK | 5 | 250 46 | PINK | 6 | 100
Again, classroom_id will produce (1, 2, 2) in the totalRewards column and (50, 100, 350) in the totalPoints column, which is correct.
The problem starts when you want to combine them into a single query. No matter which connection you use, for the provided input, you will get strings (2 * 1, 2 * 2, 1 * 2) for classroom_id , with values ββ16, 17 and 46 (in that order). The values ββwe multiply in parentheses are the number of rows for each classroom_id in the first and in the above query result.
Combined
Let's try a query that selects rows before grouping them:
SELECT t.classroom_id, t.title, r.redeemed_code_id, ocm.user_id, r.points FROM `organisation_classrooms` t LEFT JOIN `classroom_redeemed_codes` r ON r.classroom_id = t.classroom_id AND r.inactive = 0 AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0) LEFT JOIN `organisation_classrooms_myusers` ocm ON ocm.classroom_id = t.classroom_id WHERE t.organisation_id = 37383 ORDER BY t.classroom_id ASC
It returns this result set:
classroom_id | title | redeemed_code_id | user_id | points -------------+-------+------------------+---------+------- 16 | BLUE | 7 | 2 | 50 16 | BLUE | 7 | 1 | 50 <- * -------------+-------+------------------+---------+------- 17 | GREEN | 8 | 508835 | 25 17 | GREEN | 8 | 508826 | 25 <- * 17 | GREEN | 9 | 508835 | 75 17 | GREEN | 9 | 508826 | 75 <- * -------------+-------+------------------+---------+------- 46 | PINK | 5 | NULL | 250 46 | PINK | 6 | NULL | 100
I added horizontal rules to split rows that belong to the same group when we add the GROUP BY . This is basically a way to execute an SQL query with GROUP BY , regardless of the name of the actual software that implements it.
As you can see, for each class it combines all the redeemed codes associated with the class with all users associated with the class. If you add more users and redeem codes for classes # 16, # 17 and # 46 in your tables, you will get a much larger set of results.
The next step when executing a GROUP BY query is to create one row from each group specified above. There are no problems with the classroom_id and title columns; they contain one value in each group. For redeemed_code_id and user_id your query counts different values ββand this works fine. The problem is adding points . If you just SUM() them, you will get a redemption code added for each user_id in the group. If you use SUM(DISTINCT points) , this is also not true, because it will ignore duplicates, even if they are different in the classroom_redeemed_codes table.
You want to add points for DISTINCT redeemed_code_id . I noted in the result above, set the lines you don't need.
This is not possible using this query, because when calculating aggregate values, each column is independent of the other. We need a query that selects the desired rows before grouping them.
Idea
We can try to add the missing columns (with NULL values) to the two simple queries above, UNION ALL then select them from this and GROUP BY .
First make sure that it selects what we need:
SELECT t.classroom_id, t.title, NULL AS redeemed_code_id, ocm.user_id, NULL AS points FROM `organisation_classrooms` t LEFT JOIN `organisation_classrooms_myusers` ocm ON ocm.classroom_id = t.classroom_id WHERE t.organisation_id = 37383 UNION ALL SELECT t.classroom_id, t.title, r.redeemed_code_id, NULL AS user_id, r.points FROM `organisation_classrooms` t LEFT JOIN `classroom_redeemed_codes` r ON r.classroom_id = t.classroom_id AND r.inactive = 0 AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0) WHERE t.organisation_id = 37383 ORDER BY classroom_id
Attention! The ORDER BY applies to the UNION result set. If you want to order rows from each SELECT (this does not help, because UNION does not support order), you must enclose this query in parentheses and place ORDER BY clauses there.
The result looks great:
classroom_id | title | redeemed_code_id | user_id | points -------------+-------+------------------+---------+------- 16 | BLUE | NULL | 1 | NULL 16 | BLUE | NULL | 2 | NULL 16 | BLUE | 7 | NULL | 50 -------------+-------+------------------+---------+------- 17 | GREEN | 8 | NULL | 25 17 | GREEN | 9 | NULL | 75 17 | GREEN | NULL | 508826 | NULL 17 | GREEN | NULL | 508835 | NULL -------------+-------+------------------+---------+------- 46 | PINK | 5 | NULL | 250 46 | PINK | 6 | NULL | 100 46 | PINK | NULL | NULL | NULL
Now we can put some bracket around the query above (strip ORDER BY ) and use it in another query, grouping the data by classroom_id , counting users and bought codes and summing their points.
You will get a query that looks awful, and in your current database schema, it scans when your tables have several hundred rows. That is why I will not write here .
Attention! Its performance can be improved by adding missing indexes to your tables in the fields that appear in the ON , WHERE , ORDER BY and GROUP BY clauses of the query.
This will lead to significant improvement, but I will not rely heavily on it. For really large tables (hundreds of thousands of rows), it scans anyway.
Another idea
We can also add GROUP BY on both Query # 1 and Query # 2 and UNION ALL after this:
SELECT t.classroom_id, t.title, NULL AS totalRewards, COUNT(DISTINCT ocm.user_id) AS totalStudents, NULL AS totalPoints FROM `organisation_classrooms` t LEFT JOIN `organisation_classrooms_myusers` ocm ON ocm.classroom_id = t.classroom_id WHERE t.organisation_id = 37383 GROUP BY t.classroom_id UNION ALL SELECT t.classroom_id, t.title, COUNT(DISTINCT redeemed_code_id) AS totalRewards, NULL AS totalStudents, SUM(points) AS totalPoints FROM `organisation_classrooms` t LEFT JOIN `classroom_redeemed_codes` r ON r.classroom_id = t.classroom_id AND r.inactive = 0 AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0) WHERE t.organisation_id = 37383 GROUP BY t.classroom_id ORDER BY classroom_id, totalRewards
This creates a good set of results:
classroom_id | title | totalRewards | totalStudents | totalPoints -------------+-------+--------------+---------------+------------- 16 | BLUE | NULL | 2 | NULL 16 | BLUE | 1 | NULL | 50 17 | GREEN | NULL | 2 | NULL 17 | GREEN | 2 | NULL | 100 46 | PINK | NULL | 0 | NULL 46 | PINK | 2 | NULL | 350
This query can be embedded in another query that groups by classroom_id and SUM() common columns above to get the final result. But then again, the final query is big and ugly, and it doesn't work very fast for large tables. And again, thatβs why I donβt write it here .
Conclusion
This can be done in one query, but it does not look good, and it does not work well on large tables.
Regarding performance, put EXPLAIN in front of your query, then check the values ββin the type , key and Extra columns of the result. See the Documentation for an explanation of the possible meanings of these columns, what to try to achieve and what should not be.
Both queries, created by me on both ideas, produce associations of the range or ALL and having Using filesort in the Extra column (they are all slow). In contrast, using them as subqueries in large queries will not improve the way they are executed.
I recommend that you run individual SELECT queries from the last code example as two separate queries ; they return odd and even rows from the above result set. Then combine the results into client code . It will work faster this way.