Cumulative Totals in PostgreSQL

Question

I am using COUNT and GROUP BY to get the number of subscribers registered each day:

  SELECT created_at, COUNT(email) FROM subscriptions GROUP BY created_at;

Result:

 created_at  count
 -----------------
 04-04-2011  100
 05-04-2011  50
 06-04-2011  50
 07-04-2011  300

I want to get the cumulative total of subscribers for each day instead. How can I do that?

 created_at  count
 -----------------
 04-04-2011  100
 05-04-2011  150
 06-04-2011  200
 07-04-2011  500

+43

sql aggregate-functions postgresql

Khairul Apr 18 '11 at 4:14

5 answers

Use:

 SELECT a.created_at,
        (SELECT COUNT(b.email)
           FROM SUBSCRIPTIONS b
          WHERE b.created_at <= a.created_at) AS count
   FROM SUBSCRIPTIONS a

+6

OMG Ponies Apr 18 '11 at 4:19

 SELECT s1.created_at,
        COUNT(s2.email) AS cumul_count
   FROM subscriptions s1
  INNER JOIN subscriptions s2 ON s1.created_at >= s2.created_at
  GROUP BY s1.created_at

+2

Andriy M Apr 18

I assume that you want only one row per day, and that you also want to show days without any subscriptions (if no one signs up on a particular date, do you want that date shown with the previous day's total?). If so, you can use a recursive WITH query:

 with recursive serialdates(adate) as (
     select cast('2011-04-04' as date)
     union all
     select adate + 1
       from serialdates
      where adate < cast('2011-04-07' as date)
 )
 select D.adate,
        ( select count(distinct email)
            from subscriptions
           where created_at between date_trunc('month', D.adate) and D.adate )
   from serialdates D

+2

Endy Tjahjono Apr 18 '11 at 7:23

The best way is to have a calendar table: calendar (date date, month int, quarter int, half int, weekly int, year int)

You can then join against this table to build a summary over whichever field you need.

-3

mentat Jul 18 '14 at 9:56

intgr · Accepted Answer · 2011-04-18 09:12

With larger datasets, window functions are the most efficient way to run this kind of query: the table is scanned only once, rather than once per date as with a self-join. The query also looks much simpler. :) PostgreSQL 8.4 and later support window functions.

It looks like this:

 SELECT created_at,
        sum(count(email)) OVER (ORDER BY created_at)
   FROM subscriptions
  GROUP BY created_at;

Here OVER creates the window; ORDER BY created_at means that the per-day counts are summed in created_at order, so each row gets the total of its own count and all earlier ones.
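To see the running total on the numbers from the question without setting up a server, here is a minimal sketch in Python. It makes several assumptions that are not part of the answer: SQLite (3.25 or later, via the standard sqlite3 module) stands in for PostgreSQL, dates are stored as ISO text so that they sort correctly, the email addresses are made up, and the per-day count is moved into a subquery instead of nesting count() inside sum(). The helper name running_totals is hypothetical.

```python
import sqlite3


def running_totals(rows):
    """Return (created_at, cumulative_count) pairs for (created_at, email) rows.

    Assumption: SQLite 3.25+ is used here only as a stand-in for PostgreSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subscriptions (created_at TEXT, email TEXT)")
    conn.executemany("INSERT INTO subscriptions VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT created_at,
               SUM(daily) OVER (ORDER BY created_at) AS total
          FROM (SELECT created_at, COUNT(email) AS daily
                  FROM subscriptions
                 GROUP BY created_at) AS per_day
         ORDER BY created_at
        """
    ).fetchall()
    conn.close()
    return result


# Daily sign-up counts taken from the question (dates rewritten as ISO text).
daily_counts = {
    "2011-04-04": 100,
    "2011-04-05": 50,
    "2011-04-06": 50,
    "2011-04-07": 300,
}

# One made-up, unique email per subscription row.
rows = [
    (day, f"user-{day}-{i}@example.com")
    for day, n in daily_counts.items()
    for i in range(n)
]

totals = running_totals(rows)
for day, total in totals:
    print(day, total)
```

The totals come out as 100, 150, 200 and 500, matching the result asked for in the question.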

Edit: If you want to remove duplicate emails within a single day, you can use sum(count(distinct email)). Unfortunately, this will not remove duplicates that span different dates.

If you want to remove all duplicates, I think the easiest way is to use a subquery with DISTINCT ON. This attributes each email to its earliest date (because I'm ordering by created_at ascending, it picks the earliest one):

 SELECT created_at,
        sum(count(email)) OVER (ORDER BY created_at)
   FROM (
         SELECT DISTINCT ON (email) created_at, email
           FROM subscriptions
          ORDER BY email, created_at
        ) AS subq
  GROUP BY created_at;

If you create an index on (email, created_at), this query shouldn't be too slow either.

(If you want to test, here's how I created the sample dataset.)

 create table subscriptions as
   select date '2000-04-04' + (i/10000)::int as created_at,
          'foofoobar@foobar.com' || (i%700000)::text as email
     from generate_series(1,1000000) i;
 create index on subscriptions (email, created_at);
