How to create date range + count earlier dates from another table in PostgreSQL?

I have the following table:

links:

 created_at           active
 2017-08-12 15:46:01  false
 2017-08-13 15:46:01  true
 2017-08-14 15:46:01  true
 2017-08-15 15:46:01  false

When a date range is given, I have to retrieve a time series that tells me, for each day in the range, how many active links were created on or before that (moving) date.

Output (for the date range 2017-08-12 - 2017-08-17):

 day         count
 2017-08-12  0      (there are 0 active links created on 2017-08-12 and earlier)
 2017-08-13  1      (there is 1 active link created on 2017-08-13 and earlier)
 2017-08-14  2      (there are 2 active links created on 2017-08-14 and earlier)
 2017-08-15  2      ...
 2017-08-16  2
 2017-08-17  2

I came up with the following query to create dates:

 SELECT date_trunc('day', dd)::date
 FROM   generate_series('2017-08-12'::timestamp
                      , '2017-08-17'::timestamp
                      , '1 day'::interval) dd

But rolling calculations confuse me, and I am not sure how to proceed. Can this be solved using a window function?

6 answers

This should be faster:

 SELECT day::date
      , sum(ct) OVER (ORDER BY day) AS count
 FROM   generate_series(timestamp '2017-08-12'
                      , timestamp '2017-08-17'
                      , interval  '1 day') day
 LEFT   JOIN (
    SELECT date_trunc('day', created_at) AS day, count(*) AS ct
    FROM   links
    WHERE  active  -- fastest
    GROUP  BY 1
    ) t USING (day)
 ORDER  BY 1;

count() only counts non-null values, so you could use count(active OR NULL). But the fastest way to count is to exclude irrelevant rows with a WHERE clause to begin with. Since we add all days with generate_series() anyway, this is the best option.
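
To make the counting variants mentioned above concrete, here is a minimal sketch. The inline VALUES list is only a stand-in for the links table from the question, and the output column names are made up for illustration. FILTER requires PostgreSQL 9.4 or later.

```sql
-- Sketch only: the CTE below shadows/stands in for the real links table.
WITH links(created_at, active) AS (
   VALUES (timestamp '2017-08-12 15:46:01', false)
        , (timestamp '2017-08-13 15:46:01', true)
        , (timestamp '2017-08-14 15:46:01', true)
        , (timestamp '2017-08-15 15:46:01', false)
   )
SELECT count(*)                       AS all_rows          -- counts every row
     , count(active OR NULL)          AS active_via_null   -- false OR NULL is NULL, so it is skipped
     , count(*) FILTER (WHERE active) AS active_via_filter -- same result, more readable
FROM   links;
```

With the four sample rows this returns 4, 2 and 2. When only active rows are needed, as in the query above, a plain WHERE active avoids reading the other rows into the aggregate at all.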

Since generate_series() returns timestamp (not date), I use date_trunc() to get matching timestamps (very slightly faster).
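
The type remark can be checked directly. The following sketch (the alias d and the output column names are arbitrary) shows the type the series produces when called with timestamp arguments, and that a truncated created_at value compares equal to the generated value without any cast:

```sql
SELECT pg_typeof(d)        AS series_type   -- timestamp without time zone
     , pg_typeof(d::date)  AS cast_type     -- date
     , date_trunc('day', timestamp '2017-08-13 15:46:01') = d AS matches
FROM   generate_series(timestamp '2017-08-13'
                     , timestamp '2017-08-13'
                     , interval  '1 day') d;
```

This returns a single row with matches = true, which is why the join on the truncated timestamp works without converting either side to date.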

+1
source

I would just use aggregation and cumulative sums, assuming you have at least one row per day:

 select date_trunc('day', created_at)::date as created_date,
        sum(active::int) as actives,
        sum(sum(active::int)) over (order by date_trunc('day', created_at)::date) as running_actives
 from links
 group by created_date;

You only need to generate the dates if there are gaps in the data. If you do generate them, I would recommend adding where active. You could add it now as well; I left it out only to be sure that the filter does not create gaps.
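
To illustrate what a gap means here, the sketch below runs the same kind of aggregation against the question's sample rows. The inline VALUES list is an assumption standing in for the real table.

```sql
-- Sketch only: the CTE stands in for the links table.
WITH links(created_at, active) AS (
   VALUES (timestamp '2017-08-12 15:46:01', false)
        , (timestamp '2017-08-13 15:46:01', true)
        , (timestamp '2017-08-14 15:46:01', true)
        , (timestamp '2017-08-15 15:46:01', false)
   )
SELECT created_at::date AS created_date
     , sum(active::int) AS actives
     , sum(sum(active::int)) OVER (ORDER BY created_at::date) AS running_actives
FROM   links
GROUP  BY created_at::date
ORDER  BY created_date;
```

This returns four rows, for 2017-08-12 through 2017-08-15, with running totals 0, 1, 2, 2. The days 2017-08-16 and 2017-08-17 from the requested range do not appear because no rows exist for them, which is the case where generated dates are needed.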

+1
source

Demo

http://rextester.com/OGZV44492

SQL

 SELECT date_trunc('day', dd)::date AS day,
        (SELECT COUNT(*)
         FROM   links
         WHERE  active = true
         AND    date(created_at) <= date_trunc('day', dd)) AS "count"
 FROM   generate_series('2017-08-12'::timestamp
                      , '2017-08-17'::timestamp
                      , '1 day'::interval) dd

Explanation

The SQL above makes a simple subquery to count the number of rows in the links table whose date is less than or equal to each date in the generated range.

+1
source

I think a query like this might help you:

 with t as (
    SELECT date_trunc('day', dd)::date
    FROM   generate_series('2017-08-12'::timestamp
                         , '2017-08-17'::timestamp
                         , '1 day'::interval) dd
    )
 select distinct
        t.date_trunc
        -- order by the generated day so that days without links still sort in date order
      , count(case when links.active = 'true' then 1 end) over (order by t.date_trunc) count
 from   t
 left   join links on t.date_trunc = cast(links.created_at as date)
 order  by t.date_trunc;

0
source

If some days are missing from the table, you will need generate_series() to create them. Since this is mostly a combination of the two previous answers, credit goes to them ;)

However, the join is best done after the GROUP BY, which returns only one row per day, rather than before it, which would make the JOIN larger.

 WITH dailydata AS (
    SELECT d::DATE, COALESCE(n,0) n
    FROM   generate_series('2000-01-01'::DATE
                         , '2000-10-01'::DATE
                         , '1 DAY'::INTERVAL) d
    LEFT   JOIN (
       SELECT created_at::DATE d, count(*) AS n
       FROM   links
       WHERE  active
       GROUP  BY d
       ) data USING (d)
    )
 SELECT d, n, sum(n) OVER (ORDER BY d)
 FROM   dailydata;
0
source
 CREATE TABLE links (
    created_at timestamp
  , active     boolean
 );

 INSERT INTO links (created_at, active) VALUES
    ('2017-08-12 15:46:01', false)
  , ('2017-08-13 15:46:01', true)
  , ('2017-08-14 15:46:01', true)
  , ('2017-08-15 15:46:01', false)
 ;

 WITH cal AS (
    SELECT gs AS deet
    FROM   generate_series('2017-08-11'::date, '2017-08-16'::date, '1day'::interval) gs
    )
 SELECT cal.deet
        -- order by the calendar day: l.created_at is NULL for days without links
      , SUM(1) FILTER (WHERE l.active = True) OVER (ORDER BY cal.deet) AS cumsum
 FROM   cal
 LEFT   JOIN links l ON date_trunc('days', l.created_at) = cal.deet
 ORDER  BY cal.deet;
0
source

Source: https://habr.com/ru/post/1271774/

