Unioning Pseudo-partitioned tables into a single view_问答_开发者

Unioning Pseudo-partitioned tables into a single view

开发者 https://www.devze.com 2023-04-12 00:30 出处：网络

Lets say we have 5 tables Fact_2011 Fact_2010 Fact_2009 Fact_2008 Fact_2007 each of which stores only transactions for the year indicated by the extension of the table\'s name.

Lets say we have 5 tables

Fact_2011
Fact_2010
Fact_2009
Fact_2008
Fact_2007

each of which stores only transactions for the year indicated by the extension of the table's name.

We then create a separate index over each of these tables with the column "Year" as the first column of the index.

Lastly, we create a view, vwFact, which is the union of all of the tables:

SELECT * FROM Fact_2011
UNION
SELECT * FROM Fact_2010
UNION
SELECT * FROM Fact_2009
UNION
SELECT * FROM Fact开发者_运维问答_2008
UNION
SELECT * FROM Fact_2007

and then perform a queries like this:

SELECT * FROM vwFact WHERE YEAR = 2010

or in less likely situations,

SELECT * FROM vwFact WHERE YEAR > 2010

How efficient would these queries be compared to actually partitioning the data by Year or is it essentially the same? Is having an index by Year over each of these pseudo partitioned tables what is needed to prevent the SQL engine from wasting more than a trivial amount of time to determine that a physical table that contains records outside of the sought date range is not worth scanning? Or is this pseudo partitioning approach exactly what MS partitioning (by year) is doing?

It seems to me that if the query executed is

SELECT Col1Of200 FROM vwFact WHERE YEAR = 2010

that real partitioning would have a distinct advantage, because the pseudo partitioning first has to execute the view to pull back all of the columns from the Fact_2010 table and then filter down to the one column that the end user is selecting, while with MSSQL partitioning, it would be more of a direct up front selection of only the sought column's data.

Comments?

I have implemented partitioned views on SQL Server 2000 with great success

Make sure that you have a check constraint on each table that will restrict the year column to the year. So on the Fact_2010 table it would be Check Year = 2010

then also make the view UNION ALLs not just UNION

now when you query the view for one year it should just access 1 table, you can verify this with the execution plan

if you don't have the check constraints in place it will touch all the tables that are part of the view

that real partitioning would have a distinct advantage, because the pseudo partitioning first has to execute the view to pull back all of the columns from the Fact_2010 table and then filter down to the one column that the end user is selecting

If you have the constraints in place the optimizer is smart enough to just go the tables you need