Amount for the month and put months in columns

Background

I have time series data on a monthly basis, and I would like to summarize the values ​​for each identifier, grouped by month, and then have the month names as columns and not as rows.

Example

+----+------------+-------+-------+ | id | extra_info | month | value | +----+------------+-------+-------+ | 1 | abc | jan | 10 | | 1 | abc | feb | 20 | | 2 | def | jan | 10 | | 2 | def | feb | 5 | | 1 | abc | jan | 15 | | 3 | ghi | mar | 15 | 

Desired Result

 +----+------------+-----+-----+-----+ | id | extra_info | jan | feb | mar | +----+------------+-----+-----+-----+ | 1 | abc | 25 | 20 | 0 | | 2 | def | 10 | 5 | 0 | | 3 | ghi | 0 | 0 | 15 | 

Current approach

I can easily group by month by adding up the values. Which bothers me:

 ----------------------------------- | id | extra_info | month | value | +----+------------+-------+-------+ | 1 | abc | jan | 25 | | 1 | abc | feb | 20 | | 2 | def | jan | 10 | | 2 | def | feb | 5 | | 3 | ghi | mar | 15 | 

But now I need these months as column names. Not sure where to go from here.

Additional Information

  • In terms of language, this request should be run in postgres.
  • These months are examples, it is obvious that the real data set is much larger and covers all 12 months by thousands of identifiers.

Any ideas from SQL gurus are greatly appreciated!

+4
source share
2 answers

tablefunc module

I would use crosstab() for this. Install the optional tablefunc module if you do not already have it:

 CREATE EXTENSION tablefunc 

The basics are here:
PostgreSQL Cross Forward Request

How to work with additional columns:
Rotate on multiple columns using Tablefunc

Extended use:
A dynamic alternative to rotation with CASE and GROUP BY

Customization

 CREATE TEMP TABLE tbl (id int, extra_info varchar(3), month date, value int); INSERT INTO tbl (id, extra_info, month, value) VALUES (1, 'abc', '2012-01-01', 10), (1, 'abc', '2012-02-01', 20), (2, 'def', '2012-01-01', 10), (2, 'def', '2012-02-01', 5), (1, 'abc', '2012-01-01', 15), (3, 'ghi', '2012-03-01', 15); 

I am using the actual date in the base table, as I assume they just hide it to simplify your question. But with monthly names there would be no ORDER BY .

Query

 SELECT * FROM crosstab( $$SELECT id, extra_info, to_char(month, 'mon'), sum(value) AS value FROM tbl GROUP BY 1,2,month ORDER BY 1,2,month$$ ,$$VALUES ('jan'::text), ('feb'), ('mar'), ('apr'), ('may'), ('jun') , ('jul'), ('aug'), ('sep'), ('oct'), ('nov'), ('dec')$$ ) AS ct (id int, extra text , jan int, feb int, mar int, apr int, may int, jun int , jul int, aug int, sep int, oct int, nov int, dec int); 

Result:

  id | extra | jan | feb | mar | apr | may | jun | jul | aug | sep | oct | nov | dec ----+-------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+----- 1 | abc | 25 | 20 | | | | | | | | | | 2 | def | 10 | 5 | | | | | | | | | | 3 | ghi | | | 15 | | | | | | | | | 

Installing the tablefunc module requires some overhead and some training, but the resulting queries are much faster and shorter and more universal.

+2
source

You can use the aggregate function with a CASE expression to turn rows into columns:

 select id, extra_info, sum(case when month = 'jan' then value else 0 end) jan, sum(case when month = 'feb' then value else 0 end) feb, sum(case when month = 'mar' then value else 0 end) mar, sum(case when month = 'apr' then value else 0 end) apr, sum(case when month = 'may' then value else 0 end) may, sum(case when month = 'jun' then value else 0 end) jun, sum(case when month = 'jul' then value else 0 end) jul, sum(case when month = 'aug' then value else 0 end) aug, sum(case when month = 'sep' then value else 0 end) sep, sum(case when month = 'oct' then value else 0 end) oct, sum(case when month = 'nov' then value else 0 end) nov, sum(case when month = 'dec' then value else 0 end) "dec" from yt group by id, extra_info 

See SQL Fiddle with Demo

+6
source

Source: https://habr.com/ru/post/1481267/


All Articles