MySQL group by date, forcibly returns null if it does not exist

I built a query to allow me to return the average rating of some applications.

But for some of them, we don’t have data for a month, because the application is new (let's say the application is missing from this month, and therefore we collect data from this month)

SELECT DATE_FORMAT(date, '%Y-%m'), app_id, AVG(rank) FROM wadstats.applestore_ranking where app_id IN (100, 2, 3, 4, 5, 6) GROUP BY MONTH(date), app_id ORDER BY CASE WHEN app_id = 100 THEN 1 ELSE 2 END, date ASC 

I needed to show app_id = 100 first

But for app_id = 8, I have no data for August.

Then the results look like

 '2015-07', '100', '3.9355' '2015-04', '100', '49.5000' '2015-08', '100', '5.2258' '2015-05', '100', '16.3333' '2015-09', '100', '6.1333' '2015-06', '100', '7.5667' '2015-10', '100', '5.7727' '2015-04', '2', '6.0000' '2015-08', '2', '9.8710' '2015-05', '2', '6.4667' '2015-09', '2', '8.9667' '2015-06', '2', '8.5333' '2015-10', '2', '9.9545' '2015-07', '2', '10.5806' '2015-05', '3', '56.3929' '2015-09', '3', '55.1667' '2015-06', '3', '35.2500' '2015-07', '3', '38.7143' '2015-04', '3', '38.7500' '2015-08', '3', '52.5500' '2015-09', '4', '30.2105' '2015-06', '4', '27.9231' '2015-10', '4', '30.0000' '2015-07', '4', '47.0000' '2015-08', '4', '32.6818' '2015-06', '5', '46.8667' '2015-10', '5', '86.6667' '2015-07', '5', '63.5185' '2015-04', '5', '24.2500' '2015-08', '5', '67.3571' '2015-10', '6', '30.1818' 

Instead, I want to have zero for every month, even if there was no data for that month

Expected results

 '2015-07', '100', '3.9355' '2015-04', '100', '49.5000' '2015-08', '100', '5.2258' '2015-05', '100', '16.3333' '2015-09', '100', '6.1333' '2015-06', '100', '7.5667' '2015-10', '100', '5.7727' '2015-04', '2', '6.0000' '2015-08', '2', '9.8710' '2015-05', '2', '6.4667' '2015-09', '2', '8.9667' '2015-06', '2', '8.5333' '2015-10', '2', '9.9545' '2015-07', '2', '10.5806' '2015-05', '3', '56.3929' '2015-09', '3', '55.1667' '2015-06', '3', '35.2500' '2015-07', '3', '38.7143' '2015-04', '3', '38.7500' '2015-08', '3', '52.5500' '2015-09', '4', '30.2105' '2015-06', '4', '27.9231' '2015-05', '4', NULL '2015-10', '4', '30.0000' '2015-07', '4', '47.0000' '2015-08', '4', '32.6818' '2015-06', '5', '46.8667' '2015-10', '5', '86.6667' '2015-07', '5', '63.5185' '2015-04', '5', '24.2500' '2015-08', '5', '67.3571' '2015-04', '6', NULL '2015-05', '6', NULL '2015-06', '6', NULL '2015-07', '6', NULL '2015-08', '6', NULL '2015-09', '6', NULL '2015-10', '6', '30.1818' 

If I need to have 0 instead of NULL, that will be fine too, but I need every month in the DB to have a value for each app_id

Thank you very much in advance

+5
source share
3 answers

thanks drew

Here's a way to execute in one request, which Drew explains to us in detail.

Thanks so much for your advice using the CROSS JOIN combination

  SELECT inr.date AS 'month', inr.app_id, r.title, AVG(r.rank) as 'ranking' FROM ( SELECT DISTINCT DATE_FORMAT(mh.date, '%Y-%m') date, r.app_id FROM applestore_ranking mh CROSS JOIN applestore_ranking r WHERE mh.app_id = 100 AND r.app_id IN (100, 1, 2, 3, 4, 5) ) inr LEFT JOIN applestore_ranking r ON r.app_id = inr.app_id AND inr.date = DATE_FORMAT(r.date, '%Y-%m') GROUP BY inr.date, inr.app_id ORDER BY CASE WHEN inr.app_id = 100 THEN 1 ELSE 2 END 
+1
source

The following query uses a view (alias inr ) to combine the combinations YearMonth / app_id. He then uses this in a left join to retrieve the data, regardless of whether it exists in the applestore_ranking table.

Use ifnull() if you want null to be displayed, not NULL. Thus, this part will become ifnull(AVG(r.rank),0) as Rank .

Note that this would help to simply combine this step and then highlight the code for inr select yourself and see its simple output. This would facilitate understanding of the left join, which then follows.

The concept of helpers table is used all the time in sql. Sometimes they are combined on the fly and abandoned. In other cases, they are permanent.

Scheme

 create schema appleSandbox; use appleSandbox; -- drop table applestore_ranking; create table applestore_ranking ( id int auto_increment primary key, app_id int not null, date date not null, -- not a great column name rank int not null ); -- truncate table applestore_ranking; insert applestore_ranking (app_id,date,rank) values (2,'2015-08-01',1),(2,'2015-09-05',10),(2,'2015-09-12',11),(2,'2015-10-01',14), (6,'2015-10-01',7),(6,'2015-10-05',6),(6,'2015-10-14',2), (100,'2015-09-01',16),(100,'2015-10-01',16),(100,'2015-10-05',17),(100,'2015-10-14',18); create table monthHelper ( -- load this up with a few years worth id int auto_increment primary key, theDate date not null, -- slightly better column name wantToSee int not null -- do we want to see it in results or not? 0=no, 1=yes ); -- note only a few wantToSee have been turned on to 1 insert monthHelper(theDate,wantToSee) values ('2015-05-01',0),('2015-06-01',0),('2015-07-01',0),('2015-08-01',1),('2015-09-01',1),('2015-10-01',1),('2015-11-01',0),('2015-12-01',0), ('2016-01-01',0),('2016-02-01',0),('2016-03-01',0); -- etc 

Request

 SELECT DATE_FORMAT(inr.theDate, '%Y-%m') as YearMonth, inr.app_id, AVG(r.rank) as Rank FROM ( select distinct mh.theDate,r.app_id from monthhelper mh cross join applestore_ranking r where mh.wantToSee=1 and r.app_id IN (100,2,3,4,5,6) ) inr left join applestore_ranking r on r.app_id=inr.app_id and year(inr.theDate)=year(r.date) and month(inr.theDate)=month(r.date) GROUP BY MONTH(inr.theDate), inr.app_id ORDER BY CASE WHEN inr.app_id = 100 THEN 1 ELSE 2 END, inr.theDate ASC 

results

 +-----------+--------+---------+ | YearMonth | app_id | Rank | +-----------+--------+---------+ | 2015-08 | 100 | NULL | | 2015-09 | 100 | 16.0000 | | 2015-10 | 100 | 17.0000 | | 2015-08 | 2 | 1.0000 | | 2015-08 | 6 | NULL | | 2015-09 | 2 | 10.5000 | | 2015-09 | 6 | NULL | | 2015-10 | 2 | 14.0000 | | 2015-10 | 6 | 5.0000 | +-----------+--------+---------+ 

Cleanup

 drop schema AppleSandbox; 
+3
source

It depends on how you implemented your table - do the rows contain even if the statistics (since they were not loaded, i.e.) are missing or are all these rows / dates completely omitted? For the latter, you can use the "number trick" (explained in detail here on SO ), when you have strings with zero ranks, you can just stick on LEFT JOIN (see MYSQL documentation ).

Having guessed the results you published and the results you really want, you should go for the trick.

0
source

Source: https://habr.com/ru/post/1234273/


All Articles