开发者

DateTime as part of PK in FACT Table for warehouses

开发者 https://www.devze.com 2023-03-08 22:57 出处:网络
I know generally its not good idea to have a DateTime column as your PK, however for my situation I believe it makes more sense than having a surrogate key in the Fact Table.

I know generally its not good idea to have a DateTime column as your PK, however for my situation I believe it makes more sense than having a surrogate key in the Fact Table.

The reasons are...

  1. The data inserted into the fact table is always sequential. i.e. I would never insert date time value that is older than the last value already in Fact table.
  2. The date time field is not the only column of the PK (composite PK), The PK is of course itself and the dimension FK's surrogate key.
  3. The way i be querying the data is nearly always based on time.
  4. A surrogate key on the Fact table would tell me nothing about the row. Each row is already unique and to find that particular fact I would always filter on the Date time first and the values in the dimensions.
  5. There is no separate datetime dimension table. No requirement now or in the foreseeable future to have named points in time etc.

Side notes - time will be in UTC and using SQL 2008 R2.

What I'm asking is are giving the situation - what are the disadvantages to doing 开发者_开发知识库this? Will I come up against unforeseen issues? Is this actaully a good thing to be doing when querying that data back later?

Would like to know peoples view points on a DateTime field as the first column of a composite PK.


It's almost an essential feature of any data warehouse that date/time is a component of a key in most tables. There's nothing "wrong" with that.

A surrogate key generally shouldn't be the only key of a table, so perhaps your question is really "Should I create a surrogate key on my table as well?". My suggestion is that if you don't have a reason to create a surrogate key then don't. The time to create a surrogate is when you find that you need it.


Most fact tables have composite keys and date-time or often DateKey, TimeKey are part of it. Actually, quite common.

The dimDate and dimTime are simply used to avoid having "funny" date-time functions in the WHERE clause of a query. For example

-- sales on
-- weekends for previous 28 weeks
-- 
select sum(f.SaleAmount)
from factSale as f
join dimDate  as d on d.DateKey = f.DateKey 
where d.IsWeekend = 'yes'
  and d.WeeksAgo between 1 and 28 ;

So here I can have indexes on IsWeekend and WeeksAgo (DateKey too). If these were replaced by date-time functions, this would cause row-by-row processing.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号