SQL (postgresql) query for year extraction and sum

Developer https://www.devze.com 2023-01-23 21:03 Source: web
I'm trying to collect some simple statistics on a table. Let's say the simplified structure is the following:

  ID  |      CREATIONDATE        |    VALUE    |
------------------------------------------------
  1   | 2007-01-06 13:54:00.000  |      7      |
  2   | 2007-03-07 15:03:00.000  |      8      |
  3   | 2008-07-02 18:55:00.000  |      12     |
  4   | 2008-09-10 19:12:00.000  |      1      |
  5   | 2010-01-06 13:54:00.000  |      4      |
  6   | 2010-01-07 07:13:00.000  |      5      |

I'd like to sum the values based on the year of the creation date, so currently I execute the following:

select extract('year' from i.creationdate) as d1,
       sum(i.value)
  from items i
  group by d1
  order by d1;

which will output

2007;15
2008;13
2010;9

Is there any way I can improve the query so that it automatically outputs 0 for the missing year 2009, even though no row in the table has a date in 2009?

2007;15
2008;13
2009;0
2010;9


Use a left outer join from a generate_series call:

select year_list.year,
       coalesce(item_list.val, 0) as total
  from (select generate_series(2007, 2010) as year) as year_list
       left outer join
       (select date_part('year', creationdate) as item_year,
               sum(value) as val
          from items
         group by item_year) as item_list
       on item_list.item_year = year_list.year
 order by year_list.year;

To avoid hard-coding the years in generate_series, replace them with subqueries that select the min() and max() of the year part.
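The min()/max() substitution could look roughly like the sketch below. It assumes the table is called items, as in the question, and the aliases y, yr and total are made up for illustration. It has not been run against the asker's schema.

```sql
-- Sketch: derive the year range from the data instead of hard-coding it.
-- Assumes table "items" with columns creationdate and value (from the question).
SELECT y.yr,
       coalesce(sum(i.value), 0) AS total
  FROM generate_series(
           (SELECT min(extract(year FROM creationdate))::int FROM items),
           (SELECT max(extract(year FROM creationdate))::int FROM items)
       ) AS y(yr)                       -- one row per year in the range
       LEFT JOIN items i
              ON extract(year FROM i.creationdate) = y.yr
 GROUP BY y.yr                          -- years without rows keep a NULL sum
 ORDER BY y.yr;
```

With the six sample rows from the question, this should return 2007 through 2010, with 0 for 2009. If items is empty, both bounds are NULL and generate_series produces no rows at all.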


Use generate_series in a JOIN:

SELECT 
    generate_series as d1, 
    COALESCE(sum(i.value) , 0)
FROM 
    generate_series(2007, 2010) 
        LEFT JOIN items i ON generate_series = extract('year' from i.creationdate)
GROUP BY
    d1 
ORDER BY
    d1;


I would do this in the software which is doing the query, as opposed to in the query itself.

I have done such things (e.g. in Oracle) by using for loops in SQL (no doubt there are similar things in Postgres), but I came to the conclusion that it was the wrong approach: the solution was harder to maintain than if I had simply written the code in the application's programming language.

SQL returns data that exists and operates on data that exists, and in this case there is no data for the year in question. That's why SQL is not a good fit for this. Just ask SQL for the data you do have, and when you generate the output for the user in your programming language (e.g. an HTML table), add a for loop that iterates from the first year to the last and prints zero wherever there is no result.


An easy (but not very nice) way is to create a (temporary) table holding all the years in one column. Then you just left join the data table to your new table, so that years without data still appear.
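A minimal sketch of that idea follows. The helper table name years and its column yr are hypothetical, and the year values are typed in by hand to match the question's sample data.

```sql
-- Hypothetical helper table: one row per year that should appear in the output.
CREATE TEMPORARY TABLE years (yr integer PRIMARY KEY);
INSERT INTO years (yr) VALUES (2007), (2008), (2009), (2010);

-- Left join so that years with no matching items rows are kept.
SELECT y.yr,
       coalesce(sum(i.value), 0) AS total
  FROM years y
       LEFT JOIN items i
              ON extract(year FROM i.creationdate) = y.yr
 GROUP BY y.yr
 ORDER BY y.yr;
```

The drawback is that the years table has to be kept up to date by hand, which is probably why this answer calls the approach not very nice.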
