Lets say we have 5 tables
Fact_2011
Fact_2010
Fact_2009
Fact_2008
Fact_2007
each of which stores only transactions for the year indicated by the extension of the table's name.
We then create a separate index over each of these tables with the column "Year" as the first column of the index.
Lastly, we create a view, vwFact
, which is the union of all of the tables:
SELECT * FROM Fact_2011
UNION
SELECT * FROM Fact_2010
UNION
SELECT * FROM Fact_2009
UNION
SELECT * FROM Fact开发者_运维问答_2008
UNION
SELECT * FROM Fact_2007
and then perform a queries like this:
SELECT * FROM vwFact WHERE YEAR = 2010
or in less likely situations,
SELECT * FROM vwFact WHERE YEAR > 2010
How efficient would these queries be compared to actually partitioning the data by Year or is it essentially the same? Is having an index by Year
over each of these pseudo partitioned tables what is needed to prevent the SQL engine from wasting more than a trivial amount of time to determine that a physical table that contains records outside of the sought date range is not worth scanning? Or is this pseudo partitioning approach exactly what MS partitioning (by year) is doing?
It seems to me that if the query executed is
SELECT Col1Of200 FROM vwFact WHERE YEAR = 2010
that real partitioning would have a distinct advantage, because the pseudo partitioning first has to execute the view to pull back all of the columns from the Fact_2010
table and then filter down to the one column that the end user is selecting, while with MSSQL partitioning, it would be more of a direct up front selection of only the sought column's data.
Comments?
I have implemented partitioned views on SQL Server 2000 with great success
Make sure that you have a check constraint on each table that will restrict the year column to the year. So on the Fact_2010 table it would be Check Year = 2010
then also make the view UNION ALLs not just UNION
now when you query the view for one year it should just access 1 table, you can verify this with the execution plan
if you don't have the check constraints in place it will touch all the tables that are part of the view
that real partitioning would have a distinct advantage, because the pseudo partitioning first has to execute the view to pull back all of the columns from the Fact_2010 table and then filter down to the one column that the end user is selecting
If you have the constraints in place the optimizer is smart enough to just go the tables you need
精彩评论