How to split start/end time columns into discrete chunks with PostgreSQL?

Developer https://www.devze.com 2023-01-17 04:07 Source: Internet

We have some tables, which have a structure like:

start, -- datetime
end,   -- datetime
cost   -- decimal

So, for example, there might be rows like:

01/01/2010 10:08am, 01/01/2010 1:56pm, 135.00
01/01/2010 11:01am, 01/01/2010 3:22pm, 118.00
01/01/2010 06:19pm, 01/02/2010 1:43am, 167.00

Etc...

I'd like to transform this (with a function?) into output like:

10:00am, 10:15am, X, Y, Z
10:15am, 10:30am, X, Y, Z
10:30am, 10:45am, X, Y, Z
10:45am, 11:00am, X, Y, Z
11:00am, 11:15am, X, Y, Z
....

Where:

X = the number of rows that overlap this chunk of time
Y = the cost / expense attributed to this chunk of time
Z = the total number of minutes the matching rows cover within this chunk

For example, for the data above we might have:

10:00am, 10:15am, 1, (135/228 minutes*7), 7

  • The first row starts at 10:08am, so only 7 minutes are used from 10:00-10:15.
  • There are 228 minutes in the start->end time.

....

11:00am, 11:15am, 2, ((135+118)/(228+261) minutes*(15+14)), 29

  • The second row starts right after 11:00am, so we need 15 minutes from the first row, plus 14 minutes from the second row
  • There are 261 minutes in the second start->end time
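
The arithmetic in the two examples above can be checked directly in PostgreSQL. This is only a sketch of the question's own formula (pooled cost divided by pooled minutes, times the minutes that fall inside the chunk); the column aliases are made up for illustration.

```sql
-- Cost per minute of the matching rows, multiplied by the minutes
-- that fall inside the chunk (formula taken from the question).
SELECT
    -- 10:00-10:15: one row, 135 over 228 minutes, 7 minutes in the chunk
    round(135.0 / 228 * 7, 2) AS expense_10_00,
    -- 11:00-11:15: two rows, (135+118) over (228+261) minutes, 15+14 minutes in the chunk
    round((135 + 118)::numeric / (228 + 261) * (15 + 14), 2) AS expense_11_00;
```

Note that the second expression pools both rows into a single per-minute rate, as the question does, rather than prorating each row separately.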

....

I believe I've done the math right here, but I need to figure out how to turn this into a PostgreSQL function so that it can be used within a report.

Ideally, I'd like to be able to call the function with an arbitrary duration, e.g. 15, 30, or 60 minutes, and have it split the data up based on that.

Any ideas?


Here is my try. Given this table definition:

CREATE TABLE interval_test
(
  "start" timestamp without time zone,
  "end" timestamp without time zone,
  "cost" integer
);
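
To try the query against the rows from the question, the table could be populated as follows. This assumes the dates in the question are written as MM/DD/YYYY, so the third row ends on January 2nd.

```sql
-- Sample rows from the question (dates assumed to be MM/DD/YYYY).
INSERT INTO interval_test ("start", "end", "cost") VALUES
    ('2010-01-01 10:08', '2010-01-01 13:56', 135),  -- 228 minutes
    ('2010-01-01 11:01', '2010-01-01 15:22', 118),  -- 261 minutes
    ('2010-01-01 18:19', '2010-01-02 01:43', 167);
```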

This query seems to do what you want, though I'm not sure it is the best solution. Note that it requires PostgreSQL 8.4 or later, because it uses window functions and WITH (recursive) queries.

WITH RECURSIVE intervals(period_start) AS (
    -- Anchor: the first period starts at the hour containing the earliest "start".
    SELECT date_trunc('hour', MIN("start")) AS period_start
      FROM interval_test

  UNION ALL
    -- Step: keep adding 15-minute periods until the latest "end" is reached.
    SELECT intervals.period_start + INTERVAL '15 MINUTES'
      FROM intervals
     WHERE (intervals.period_start + INTERVAL '15 MINUTES') < (SELECT MAX("end") FROM interval_test)
)
SELECT DISTINCT
    period_start,
    period_start + INTERVAL '15 MINUTES' AS period_end,

    -- X: number of rows overlapping this period
    COUNT(*) OVER (PARTITION BY period_start) AS record_count,

    -- Z: total time the overlapping rows spend inside this period
    SUM(LEAST(period_start + INTERVAL '15 MINUTES', "end")::timestamp - GREATEST(period_start, "start")::timestamp)
        OVER (PARTITION BY period_start) AS total_time,

    -- Y: (total cost / total minutes of the overlapping rows) * minutes inside this period
    (SUM(cost) OVER (PARTITION BY period_start)
        / (EXTRACT(EPOCH FROM SUM("end" - "start") OVER (PARTITION BY period_start)) / 60))
    * ((EXTRACT(EPOCH FROM SUM(LEAST(period_start + INTERVAL '15 MINUTES', "end")::timestamp - GREATEST(period_start, "start")::timestamp)
        OVER (PARTITION BY period_start))) / 60)
      AS expense

FROM interval_test
INNER JOIN intervals
    ON (intervals.period_start, intervals.period_start + INTERVAL '15 MINUTES')
       OVERLAPS (interval_test."start", interval_test."end")
ORDER BY period_start ASC;
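
The question also asks for a function that accepts an arbitrary duration. A minimal sketch of one way to do that is to wrap the same logic in an SQL function that takes the period length as a parameter. The function name `split_intervals` and the parameter name `bucket_size` are hypothetical, and the sketch swaps the window functions plus DISTINCT for a plain GROUP BY, which should produce the same rows. Referencing a parameter by name inside an SQL function needs PostgreSQL 9.2 or later; on older versions `$1` would be used instead.

```sql
-- Hypothetical wrapper: split_intervals and bucket_size are illustrative names.
CREATE OR REPLACE FUNCTION split_intervals(bucket_size interval)
RETURNS TABLE (
    period_start timestamp,
    period_end   timestamp,
    record_count bigint,
    total_time   interval,
    expense      double precision
)
LANGUAGE sql STABLE AS $$
    WITH RECURSIVE intervals(period_start) AS (
        -- First period starts at the hour containing the earliest "start".
        SELECT date_trunc('hour', MIN(t."start"))
          FROM interval_test t
      UNION ALL
        -- Add periods of bucket_size until the latest "end" is reached.
        SELECT i.period_start + bucket_size
          FROM intervals i
         WHERE i.period_start + bucket_size < (SELECT MAX(t."end") FROM interval_test t)
    )
    SELECT i.period_start,
           i.period_start + bucket_size,
           COUNT(*),
           -- Time the overlapping rows spend inside this period
           SUM(LEAST(i.period_start + bucket_size, t."end")
               - GREATEST(i.period_start, t."start")),
           -- (total cost / total minutes of the rows) * minutes inside this period
           (SUM(t.cost)
               / (EXTRACT(EPOCH FROM SUM(t."end" - t."start")) / 60)
               * (EXTRACT(EPOCH FROM SUM(LEAST(i.period_start + bucket_size, t."end")
                                         - GREATEST(i.period_start, t."start"))) / 60)
           )::double precision
      FROM interval_test t
      JOIN intervals i
        ON (i.period_start, i.period_start + bucket_size) OVERLAPS (t."start", t."end")
     GROUP BY i.period_start
     ORDER BY i.period_start
$$;

-- Example call with 30-minute periods:
SELECT * FROM split_intervals(INTERVAL '30 minutes');
```

Because the first period is anchored with date_trunc('hour', ...), durations that divide an hour evenly (15, 30, 60 minutes) line up with clock boundaries; other durations simply count forward from that first hour.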