Efficient datawarehousing algorithm for counting Rolling Quarter value

Developer https://www.devze.com 2023-03-11 08:51 Source: Internet

Consider a dataset with six months of data:

// Month-01 = 1
// Month-02 = 5
// Month-03 = 3
// Month-04 = 2
// Month-05 = 7
// Month-06 = 8

The rolling quarter (the sum of the last 3 months, including the current one) is then as follows:

// QTR-01 = N/A
// QTR-02 = N/A
// QTR-03 = 9
// QTR-04 = 10
// QTR-05 = 12
// QTR-06 = 17
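
To make the definition concrete, here is a minimal Python sketch of the same calculation, assuming the monthly values are already in chronological order. The name `rolling_sum` and its `window` parameter are illustrative only and are not part of the original problem.

```python
from collections import deque
from typing import List, Optional


def rolling_sum(values: List[int], window: int = 3) -> List[Optional[int]]:
    """Sum of the last `window` values at each position.

    Positions with fewer than `window` values available yield None (the N/A rows).
    """
    result: List[Optional[int]] = []
    recent = deque()   # values currently inside the window
    running = 0        # sum of the values in `recent`
    for v in values:
        recent.append(v)
        running += v
        if len(recent) > window:
            running -= recent.popleft()  # drop the value that left the window
        result.append(running if len(recent) == window else None)
    return result


monthly = [1, 5, 3, 2, 7, 8]
print(rolling_sum(monthly))  # [None, None, 9, 10, 12, 17]
```

Each value is added once and removed once, so the pass is linear in the number of months.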

An inefficient way to compute this in SQL would be the following (it is not exact; please just consider the general idea):

foreach row { id, month, qtr, ... } in database.table
{
  // one additional query per row
  qtrValue = select sum(top 3 month) from database.table where table.id = row.id;
  update row set row.qtr = qtrValue;
}

Can you suggest an efficient algorithm and/or data warehouse design for this problem? It doesn't matter whether it involves a relational database or not.

A Moving SUM window aggregate function would accomplish what you are looking to do.

Something along these lines:

SELECT SUM(Month)
       OVER (ORDER BY MonthNumber  -- MonthNumber is an assumed column giving chronological order
             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM database.table

There is a PARTITION BY option that restricts the aggregation to groups of rows sharing the same values in one or more columns (for example, one rolling sum per id). The exact syntax varies by database platform. If your platform doesn't support window aggregates, all hope is not lost, but it will take a bit more SQL to accomplish the same task in set-based notation.
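
As a rough illustration of the window approach with PARTITION BY, here is a sketch that runs the query in an in-memory SQLite database from Python. The table `monthly` and its columns `id`, `month_no` and `amount` are hypothetical stand-ins for the asker's table, and the sketch assumes an SQLite build with window-function support (3.25 or later). The `COUNT(*) OVER w = 3` check is one possible way to return NULL for the first two months instead of a partial sum.

```python
import sqlite3


def rolling_quarter(rows):
    """rows: iterable of (id, month_no, amount) tuples.

    Returns (id, month_no, qtr) tuples, where qtr is NULL/None until
    three months are available for that id.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE monthly (id INTEGER, month_no INTEGER, amount INTEGER)"
    )
    con.executemany("INSERT INTO monthly VALUES (?, ?, ?)", rows)
    query = """
        SELECT id,
               month_no,
               -- emit the sum only when the frame holds a full three rows
               CASE WHEN COUNT(*) OVER w = 3 THEN SUM(amount) OVER w END AS qtr
        FROM monthly
        WINDOW w AS (PARTITION BY id
                     ORDER BY month_no
                     ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
        ORDER BY id, month_no
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


sample = [(1, m, v) for m, v in enumerate([1, 5, 3, 2, 7, 8], start=1)]
for row in rolling_quarter(sample):
    print(row)
```

Note that a ROWS frame counts physical rows, so this sketch assumes every id has one row per month with no gaps.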

Well, my date dimension simply has a MonthNumberInEpoch column, an integer that increments for each calendar month starting at the epoch of dimDate.

So, I can write something like:

with
q_00 as (-- sales monthly
  select
      MonthNumberInEpoch
    , sum (SaleAmount) as SalesMonthly
  from dbo.factSale  as f
  join dbo.dimDate   as d on d.DateKey = f.DateKey
  group by MonthNumberInEpoch
)
select
     a.MonthNumberInEpoch
  , (a.SalesMonthly + b.SalesMonthly + c.SalesMonthly) as SalesThreeMonths
from q_00 as a
join q_00 as b on b.MonthNumberInEpoch + 1 = a.MonthNumberInEpoch
join q_00 as c on c.MonthNumberInEpoch + 2 = a.MonthNumberInEpoch
;
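
To check the self-join against the sample data from the question, here is a small sketch that runs it in an in-memory SQLite database from Python. It assumes the monthly totals are already aggregated, so a plain table named `q_00` stands in for the CTE above; the function name `three_month_sales` is illustrative only.

```python
import sqlite3


def three_month_sales(monthly):
    """monthly: iterable of (MonthNumberInEpoch, SalesMonthly) tuples.

    Returns (MonthNumberInEpoch, SalesThreeMonths) for every month that
    has both of its two preceding months present.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE q_00 ("
        "MonthNumberInEpoch INTEGER PRIMARY KEY, SalesMonthly INTEGER)"
    )
    con.executemany("INSERT INTO q_00 VALUES (?, ?)", monthly)
    query = """
        SELECT a.MonthNumberInEpoch,
               a.SalesMonthly + b.SalesMonthly + c.SalesMonthly AS SalesThreeMonths
        FROM q_00 AS a
        JOIN q_00 AS b ON b.MonthNumberInEpoch + 1 = a.MonthNumberInEpoch
        JOIN q_00 AS c ON c.MonthNumberInEpoch + 2 = a.MonthNumberInEpoch
        ORDER BY a.MonthNumberInEpoch
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


sample = list(enumerate([1, 5, 3, 2, 7, 8], start=1))
print(three_month_sales(sample))  # [(3, 9), (4, 10), (5, 12), (6, 17)]
```

Because these are inner joins, months without two preceding months (the N/A rows, or any month following a gap) do not appear in the result at all. A LEFT JOIN would be one way to keep them with a NULL total.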