I just read this blog post and, in short, it say that if an SQL server isn't doing a good enough job building query plans, then the last thi开发者_JAVA技巧ng you want to do is start hard coding stuff. So, that got me thinking; how could you "hard code" stuff without hard coding stuff. (Yes, that's the way I tend to think.)
Is the following; 1) possible, 2) a good idea and 3) are there any tools to help DBAs do it?
You have a slow query and for some reason you don't like the query plan your DBMS is choosing. So you start playing around with it (on the dev sever) by forcing different things until you get a plan that's better. Then you try and decompose the DBMS's decision process to find why it didn't choose that one and then reverse engineer what it will take to get it to choose the better one.
Edit: The first answers seem to be answering a slightly different question than I was trying to ask.
The situation: you have a query that's not fast enough but you think you can make it faster (at this point in time) by hard coding parts of the query plan. However you don't want to do that in prod for any number of reason.
The (proposed) solution: Build and test hard coded query plans. When you find one that's fast enough, try and back out what the DBMS is looking at (statistics etc.) when it chooses not to use the better plan. Once you have that, use that as a guide for what to tune.
The question: Is the above a practical way to go about the task? Are their enough knobs to turn to make it work? Are their any tools that show that data was used to make query plan decisions?
First of all, as the commenter points out on the post you link to, there is a bit of FUD going on here -- to summarize for those that don't want to click the link: "Things can go bad... Buy my book!"
Second what you list in your question is not hard coding, it is using hard coding to figure out the best ways to work with a DB. This seems ok to me. As long as you don't leave the hard coded hints in there, things should be ok, SQL server can still change the optimization as data changes.
This Microsoft Support page points to how to create a stats only database. No concerns about tearing anything up but with current index statistics to be able to do some query plan analysis. It may help.
If the optimizer chooses a plan that you are not expecting, it's usually a sign that the underlying data structures are not formed properly. An index could be missing. You could be attempting a join across two columns with different data types. You could be searching or joining on a complex generated column. I've actually seen stuff like this before:
SELECT *
FROM
t1
INNER JOIN t2 ON LTRIM(RTRIM(t1.code)) = LTRIM(RTRIM(t2.code))
This is crying out for improvements in the underlying data structures!
The fact that you're asking would (in most cases other than yours, I'm sure) suggest that you don't have enough experience with SQL Server to be in a position to micromanage the optimizer well. I and the SQL Server wizards I know (a couple of whom do this quite effectively - but always in edge cases, in my experience) would say that you'll end up discovering a better way to have written your query in the first place and your hints won't really end up being useful. But YMMV, as usual.
I'd spend my time rephrasing my queries and grokking query plans. And looking for the other things we all know but sometimes forget to check that can also cause surprises, like whether statistics are current, and whether we're testing against enough accurately representative data.
Hints and locking modes are the two adjustments people can start messing with too quickly when they start running out of other ideas. With the optimizer, it's especially important to know "why" what you're expecting isn't happening, and "why" the change you made caused the effect you observed.
You can hardcode index names and optimization hints in your query.
Here is one query converted and a possible conversion
SELECT Customers.CompanyName, Sum (Orders.TotalAmount) As TotalAmount
FROM Customers
INNER JOIN Orders
ON Customers.ID = Orders.CustomerID
WHERE Customer.Country IN ('USA', 'Canada')
Now for the hardcoded hints
SELECT Customers.CompanyName, Sum (Orders.TotalAmount) As TotalAmount
FROM Customers WITH (NOLOCK, IdxCountry)
INNER LOOP JOIN Orders (TABLOCK, IdxCustomerOrders)
ON Customers.ID = Orders.CustomerID
WHERE Customer.Country IN ('USA', 'Canada')
The second query forces the execution plan to use the following
- Customers has NOLOCK, and must use Index IDXCountry
- Orders has TABLOCK, and must use Index IdxCustomersOrders
- The JOIN HINT is now LOOP
Can you softcode Index and Join hints? Not as a part of a regular query. You can, however, build and execute Dynamic SQL for the purpose.
SQL Server is very good at query optimization, and the more you read about it, the better queries you will write. I think you should read up on Query Optimization and Index Tuning to understand this better
What are your most common sql optimizations?
How can I learn SQL Server index tuning?
To answer your question(s)
- Si Si. It is possible to hardcode without hardcoding.
- It is not a good idea to do this yourself unless you monitor performance and execution very regularly.
- More than tools for DBAs, it is a question of knowledge of SQL Server that will help. Read more on query optimization and it will automagically take you there.
Yes, you should figure out why it came up with the sub-optimal plan. This shouldn't be a hit or miss. You should manually figure out the best plan, looks for the differences and then for the reason. The number on cause of bad plans is invalid cardinality calculations. Skew will kill you.
** This is how you would do it in Oracle, you can tell me how SQL Server works with bind variables **
Say you have a Yes/No column and an index on the column because it winds up frequently in where clauses. Out of 10000 rows, one is yes, the other 99.99% of rows are no.
If you write your query with a bind variable as in: (Parameterized Query in SS speak)
SELECT * FROM table WHERE yn_col = :1
There's no way for the optimizer to cache a single plan for this query. If you ask for Yes, then an index is appropriate, a FTS wouldn't be too bad. IF you ask for No then a FTS would be appropriate but an index scan followed by table access by row-id would be a nightmare.
You have two choices, fix the plan at a FTS, that way it's always the same time. To the end user it looks consistent. But if you want to take advantage of the index, you'll need to write this as two queries without parameterizing it so that the optimizer will see this as 'Yes' and 'No' and if there's a histogram on the column values, the cardinality will be correctly calculated.
You have no other choice here, unless SQL Server has a capacity to cache bind variable sensitive plans as Oracle added in 11g.
精彩评论