I have a fact table containing 8 million rows, growing by about 1 million rows per month. The table already has indexes on it, and it is used by an IBM Cognos environment to generate reports. I am currently looking for ways to optimize SELECT statements against this table.
As a first attempt, I partitioned the table (each partition holds an equal share of the rows), and the queries are suitable for partition pruning, so only one partition is touched per query. For some reason, though, I am getting equal or even worse performance than before, which is weird. Can someone explain how to optimize this?
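For reference, here is a minimal sketch of the kind of range-partitioned fact table described above; the column names, types, and partition bounds are illustrative assumptions, not the real definition:

    -- Hypothetical range-partitioned fact table; all names are assumptions.
    CREATE TABLE ft_costs (
      cost_date DATE   NOT NULL,
      dept_id   NUMBER NOT NULL,
      amount    NUMBER
    )
    PARTITION BY RANGE (cost_date) (
      PARTITION p2010 VALUES LESS THAN (DATE '2011-01-01'),
      PARTITION p2011 VALUES LESS THAN (DATE '2012-01-01'),
      PARTITION pmax  VALUES LESS THAN (MAXVALUE)
    );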
My second idea is to implement the fact table as an index-organized table, but then it would have to have all of its columns in the primary key. Is this all right, and would there be a performance gain?
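For context, an index-organized table in Oracle is declared with ORGANIZATION INDEX; a minimal sketch with hypothetical columns, mirroring the all-columns primary key mentioned above:

    -- Hypothetical IOT; rows are stored in the primary key index itself.
    CREATE TABLE ft_costs_iot (
      cost_date DATE   NOT NULL,
      dept_id   NUMBER NOT NULL,
      amount    NUMBER NOT NULL,
      CONSTRAINT pk_ft_costs_iot PRIMARY KEY (cost_date, dept_id, amount)
    )
    ORGANIZATION INDEX;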
My third idea is to denormalize the fact table so that it contains all the columns that are currently joined in from the star schema dimensions. Would there be a performance gain?
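A minimal sketch of this third idea, assuming hypothetical dimension tables dim_dept and dim_product that do not appear in the original post:

    -- Denormalized copy of the fact table with dimension attributes folded in.
    CREATE TABLE ft_costs_denorm AS
    SELECT f.cost_date,
           f.amount,
           d.dept_name,       -- pulled in from dim_dept (hypothetical)
           p.product_name     -- pulled in from dim_product (hypothetical)
    FROM   ft_costs    f
    JOIN   dim_dept    d ON d.dept_id    = f.dept_id
    JOIN   dim_product p ON p.product_id = f.product_id;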
EDIT: Here is the execution plan:
I have managed to reduce the access time to the fact table FT_COSTS by a factor of 3 (the cost was 42,000, now it is 14,900) after I created indexes containing the partitioning criteria; before that I was getting worse results than with the unpartitioned table. I used this link to solve my partitioning problem: Range partition skip check
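For illustration, an index "containing the partitioning criteria" might look like the following minimal sketch; the column names are assumptions, with the partition key leading the index and LOCAL making the index partitioned the same way as the table:

    -- Hypothetical local index led by the partition key.
    CREATE INDEX ix_ft_costs_date_dept
      ON ft_costs (cost_date, dept_id)
      LOCAL;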
From what I see now, the main bottleneck is the GROUP BY, which raises the cost from 34,000 to 85,000, more than doubling it. Does anyone have an idea for a workaround for this?
What does the GROUP BY actually GROUP BY?
The explain plan indicates 1,238,320 rows in the hash join going into the GROUP BY and the same number coming out of the top-level SELECT. That suggests that the optimizer doesn't actually believe you'd be doing any real aggregation here.
Reducing the cost of a GROUP BY generally requires that you create pre-computed aggregates, typically by creating one or more materialized views.
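For illustration, a minimal materialized view sketch, assuming hypothetical columns dept_id, cost_date, and amount on FT_COSTS; with ENABLE QUERY REWRITE the optimizer can transparently answer matching GROUP BY queries from the pre-computed summary instead of the base table:

    -- Hypothetical pre-computed aggregate over FT_COSTS.
    CREATE MATERIALIZED VIEW mv_costs_monthly
      BUILD IMMEDIATE
      REFRESH COMPLETE ON DEMAND
      ENABLE QUERY REWRITE
    AS
    SELECT dept_id,
           TRUNC(cost_date, 'MM') AS cost_month,
           SUM(amount)            AS total_amount,
           COUNT(*)               AS row_count
    FROM   ft_costs
    GROUP  BY dept_id, TRUNC(cost_date, 'MM');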
Partition pruning can be a tricky blighter to get working.
Have you got an EXPLAIN PLAN of your query? Does it show PARTITION RANGE SINGLE? If it doesn't, then the query is ignoring the partitioning. If it does, then you have some other problem. (A sketch of how to generate the plan is at the end of this answer.)
My money is on the first of these branches: partitioning physically re-orders the table. This means that execution plans which don't fit with the partitioning strategy can run worse than they did against the unpartitioned table.
To get any further with this we need to see some details. At the very least the partitioning clause for your table and the part of the query which you say is suitable for this approach. The EXPLAIN PLAN would be very helpful too. The more details you give us the better: tuning is all about the specifics because each case is peculiar.
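As a minimal sketch of generating that plan (the WHERE clause and column name are placeholder assumptions), DBMS_XPLAN.DISPLAY includes the Pstart/Pstop columns that reveal whether pruning happened:

    -- Generate a plan for the query and display it, partition info included.
    EXPLAIN PLAN FOR
      SELECT *
      FROM   ft_costs
      WHERE  cost_date BETWEEN DATE '2011-01-01' AND DATE '2011-01-31';

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);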
"Could you possibly explain why the group by has so high cost and how it can be reduced ? "
GROUP BY means sorting (or hashing) the rows. That can be expensive if you have a lot of data, because it requires memory - or disk writes - and CPU cycles.
As for reducing the cost, it's a bit difficult to offer advice on a query I haven't seen. What I can say is this: queries take time, and queries which use a lot of data take longer. The secret of tuning is to understand what is a reasonable amount of time for a given query. Cost is irrelevant if the query runs fast enough.
If you look at the end of the execution plan, it shows that the table FT_COSTS is accessed completely (TABLE ACCESS FULL). Because the entire table is scanned, the costs of all the joins you have added to get the data pile up, and the final cost appears large. My suggestion is to put an appropriate index on the table so that the query uses the index instead of scanning the entire table; then you will see a drastic change in performance!
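As a minimal sketch of that suggestion (the column name dept_id is an assumption), a locally partitioned bitmap index on a low-cardinality join column is a common pattern on fact tables, and it also enables star transformation:

    -- Hypothetical bitmap index on a dimension foreign key of the fact table.
    CREATE BITMAP INDEX bix_ft_costs_dept
      ON ft_costs (dept_id)
      LOCAL;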