开发者

SQL ERD newbie question

开发者 https://www.devze.com 2022-12-16 13:17 出处:网络
If I have a datawarehouse erd - which has the relationsips etc How will i know what type of join to 开发者_如何学运维use , inner, outer left, outer right, full outer,,

If I have a datawarehouse erd - which has the relationsips etc

How will i know what type of join to 开发者_如何学运维use , inner, outer left, outer right, full outer,,

surely if the database has been created correctly they all would be inner joins ? bar data quality issues


Each type of join does something different, so it all depends on the aim of the individual query.

An example of when you wouldn't want to use an INNER JOIN is if you have a User table containing a list of all your application's users and a Logon table that logs each time someone logs on. If you want to find all users that have not logged on, one way would be to use an OUTER JOIN. This does not mean you have an incorrect database or data quality issues at all.


No, they won't be all INNER JOINs.

This is only edependent on what you want to accomplish with your query. Every kind of JOIN has its use. I suggest you to read the manual of your database vendor to become familiar with them.


You will probably still need to use outer joins between facts.

You should definitely be able to inner join to all dimensions though. Unfortunately, we can't do that with the data warehouse at work because we lack a record in most dimension tables to represent null records.

EDIT

Facts are numeric measures (eg. dollars in a financial transaction) while dimensions are descriptors of data (eg. client involved in financial transaction, date of birth of client). You're best off referencing a good data warehousing book. I recommend The Data Warehouse Toolkit by Ralph Kimball.

A fact table contains both facts and foreign keys to dimensions. A dimension table contains the dimension ID, textual descriptions (eg. client name) and possibly foreign keys to other dimensions (eg. date dimension ID for the date of birth field).

Sometimes, a dimension attribute may be logically null. This is not necessarily stored as null, and at my workplace, we use 0 to represent null dimensions. The one that comes to mind at the moment is a death verification method (eg. viewed death certificate) field that we have in our client dimension table. Of course, most of our clients aren't dead, so we populate this field with 0. Our death verification method dimension table, however, only stores actual death verification methods, so it does not have a record with 0 (not applicable) and we are therefore forced to use left joins to it.

0

精彩评论

暂无评论...
验证码 换一张
取 消