Could someone give me a good, p开发者_开发问答ractical definition of what a data warehouse is?
I'm surprised no one has posted Inmon's definition:
A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process
From the same page you can pick up Kimball's definition:
A copy of transaction data specifically structured for query and analysis
I think that, unfortunately, data warehousing is a wide-ranging field. There is a lot of variety with very few standard paradigms, specifically I'm thinking of Kimball's dimensional modelling. Inmon does not have as a specific a methodology as Kimball's and thus some 3NF models may or may not conform to his principles.
Because Inmon has broadened his scope for what warehousing is meant to accomplish, it can encompass unstructured data. However, analysis of unstructured data is very different than traditional analysis.
As applied to SQL Server, typically the largest Data Warehouses on SQL Server are modelled dimensionally, because this lends itself well to the non-distributed, non-massively parallel model. Massively parallel systems like Teradata generally perform a lot better with 3NF models. These are still table-based systems with the various tables connected with foreign key constraints (perhaps not enforced, but at least logical).
Of course, we are also seeing NoSQL data processing systems like Map/Reduce which are not really databases at all in the sense of normalized, denormalized or non/poorly-normalized relational databases which we have had for 40 years now.
i just started with Datawarehousing and Buisness Intelligence and looking around the web you can find some interesting links :
- Get Start With Datawarehousing
I think this two links could help you to understand the concepts of datawarehousing.
sorry, im new i can post only one link ^^
we're sorry, but as a spam prevention mechanism, new users can only post a maximum of one hyperlink. Earn 10 reputation to post more hyperlinks.
A database optimized for retrieval, in general denormalized data, usually a star schema(but could be snowflake) and uses dimensional modeling (fact and dimension tables)
While this is not an academic definition, it might serve as a practical one. A data warehouse is a collection of datamarts and will combine datasets across the breadth of an organization.
A datamart will contain datasets specific to certain portions of the business. In the datamart you will find fact tables, measurable pieces of information, along with dimensions, attributes of your measurable pieces.
A true data warehouse will have conformed dimension tables that can be shared across datamarts.
An example...
Your company may build a datamart around sales. And another datamart around human resources. If the customer dimension table is shared across both these datamarts, it would be considered a conformed dimension. All three of these entities together would make up a data warehouse.
As someone else stated you can find more detailed information by searching for Ralph Kimball's Data Strategies.
Definition : Datawarehouse is a database used for analysis purpose rather than for transaction processing
Check the below link for more informaion on datawarehouse
http://www.idatastage.com/datawarehouse/
精彩评论