PostgreSQL query to count / group by day and display days without data

I need to create a PostgreSQL query that returns

  • a day
  • the number of objects found for that day

It is important that every day is displayed in the results, even if no objects were found on that day. (This was discussed earlier, but I was not able to get things to work in my particular case.)

First, I found a SQL query that generates a range of days I can join against:

    SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
    FROM generate_series(0, 365, 1) AS offs

Results in:

        date
    ------------
     2013-03-28
     2013-03-27
     2013-03-26
     2013-03-25
     ...
     2012-03-28
    (366 rows)

Now I'm trying to join a table called "sharer_emailshare" that has a "created" column:

    Table 'public.sharer_emailshare'
     column  | type
    ---------+--------------------------
     id      | integer
     created | timestamp with time zone
     message | text
     to      | character varying(75)

Here is the best GROUP BY query I have so far:

    SELECT d.date, count(se.id)
    FROM (
        SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
        FROM generate_series(0, 365, 1) AS offs
    ) d
    JOIN sharer_emailshare se
        ON (d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))
    GROUP BY d.date;

Results:

        date    | count
    ------------+-------
     2013-03-27 |    11
     2013-03-24 |     2
     2013-02-14 |     2
    (3 rows)

Desired Results:

        date    | count
    ------------+-------
     2013-03-28 |     0
     2013-03-27 |    11
     2013-03-26 |     0
     2013-03-25 |     0
     2013-03-24 |     2
     2013-03-23 |     0
     ...
     2012-03-28 |     0
    (366 rows)

If I understand correctly, this happens because I'm using a plain (implied INNER) JOIN, and this is the expected behavior, as discussed in the Postgres docs.

I looked through dozens of StackOverflow solutions, and all the ones with working queries seem to be specific to MySQL / Oracle / MSSQL, and I'm having a hard time translating them to PostgreSQL.

The person who asked this question found an answer that works with Postgres, but posted it behind a pastebin link that expired some time ago.

I tried switching to LEFT OUTER JOIN, RIGHT JOIN, RIGHT OUTER JOIN, and CROSS JOIN, using a CASE statement to sub in a different value when null, using COALESCE to provide a default value, etc., but I was not able to use them in a way that gets me what I need.

Any help is appreciated! And I promise I'll get around to reading that giant PostgreSQL book soon ;)

+54
sql join group-by postgresql
Mar 28 '13 at 20:04
3 answers

You just need a left outer join instead of an inner join:

    SELECT d.date, count(se.id)
    FROM (
        SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
        FROM generate_series(0, 365, 1) AS offs
    ) d
    LEFT OUTER JOIN sharer_emailshare se
        ON d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD')
    GROUP BY d.date;
+47
Mar 28 '13 at 20:06

Extending the helpful answer of Gordon Linoff, I would suggest a couple of improvements, such as:

  • Use ::date instead of date_trunc('day', ...).
  • Join on a date type, not a character type (it's cleaner).
  • Use specific date ranges so they're easier to change later. In this case, I select the year leading up to the most recent entry in the table, something that couldn't easily be done with the other query.
  • Compute the counts for an arbitrary subquery (using a CTE). You just need to cast the column of interest to the date type and name it date_column.
  • Include a column for the cumulative total. (Why not?)

Here is my query:

    WITH dates_table AS (
        SELECT created::date AS date_column
        FROM sharer_emailshare
        WHERE showroom_id = 5
    )
    SELECT series_table.date,
           COUNT(dates_table.date_column),
           SUM(COUNT(dates_table.date_column)) OVER (ORDER BY series_table.date)
    FROM (
        SELECT (last_date - b.offs) AS date
        FROM (
            SELECT GENERATE_SERIES(0, last_date - first_date, 1) AS offs, last_date
            FROM (
                SELECT MAX(date_column) AS last_date,
                       (MAX(date_column) - '1 year'::interval)::date AS first_date
                FROM dates_table
            ) AS a
        ) AS b
    ) AS series_table
    LEFT OUTER JOIN dates_table
        ON (series_table.date = dates_table.date_column)
    GROUP BY series_table.date
    ORDER BY series_table.date

I tested the query and it gives the same results, plus a column for the cumulative total.
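
As a smaller illustration of the first two points only (not a replacement for the query above), the day series from the question could be joined on date values directly. This is a sketch that assumes just the sharer_emailshare table as described in the question; note that created::date converts the timestamp using the session's time zone, so day boundaries follow that setting.

```sql
-- Sketch: same 366-day window as the question, but joined on the date type
-- instead of on 'YYYY-MM-DD' strings.
SELECT d.date, count(se.id) AS count
FROM (
    -- date - integer yields a date, so no to_char/date_trunc is needed
    SELECT current_date - offs AS date
    FROM generate_series(0, 365) AS offs
) d
LEFT JOIN sharer_emailshare se
    ON se.created::date = d.date   -- cast uses the session time zone
GROUP BY d.date
ORDER BY d.date DESC;
```

Because count(se.id) counts only non-NULL values, days with no matching rows come back as 0 rather than being dropped.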

+27
Mar 14 '14 at 2:16

Based on Gordon Linoff's answer, I realized that another problem was a WHERE clause that I didn't mention in the original question.

Instead of a bare WHERE, I used a subquery:

    SELECT d.date, count(se.id)
    FROM (
        SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
        FROM generate_series(0, 365, 1) AS offs
    ) d
    LEFT OUTER JOIN (
        SELECT * FROM sharer_emailshare WHERE showroom_id = 5
    ) se
        ON (d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))
    GROUP BY d.date;
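
A possible alternative, sketched here on the assumption that showroom_id is a column of sharer_emailshare (as the subquery above implies), is to move the filter into the join condition. A condition in the ON clause is applied while matching rows, before unmatched days are padded with NULLs; a WHERE on se.showroom_id runs after the join and discards those padded rows, which is why the bare WHERE made the empty days disappear again.

```sql
-- Sketch: the filter lives in ON, so days without matching rows are kept.
SELECT d.date, count(se.id)
FROM (
    SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
    FROM generate_series(0, 365, 1) AS offs
) d
LEFT OUTER JOIN sharer_emailshare se
    ON d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD')
   AND se.showroom_id = 5   -- applied during the join, not after it
GROUP BY d.date;
```

Both forms should return one row per generated day; which one reads better is largely a matter of taste.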
+7
Mar 28 '13 at 22:20


