
Oracle SQL IN Statement : is this executed sequentially?

Developer (https://www.devze.com), 2023-03-22 13:57, source: web

TableX

number doc
number item
number parentItem

With data:

1, 1000, 0
1, 1010, 1000
1, 1020, 1000
1, 2000, 0
1, 2010, 2000

TableY

number doc
number item
varchar2(16) SomeData

With data:

1, 1000, "1000 Data"
1, 2000, "2000 Data"
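
For anyone who wants to reproduce this, the two tables can be set up roughly as follows. This is only a sketch: the column names and types follow the description above, while the statement layout, the single-quoted string literals (Oracle's string syntax), and the absence of keys or indexes are my own assumptions.

```sql
-- Sketch of the sample schema described above; no keys or indexes are assumed.
create table TableX (
  doc        number,
  item       number,
  parentItem number
);

create table TableY (
  doc      number,
  item     number,
  SomeData varchar2(16)
);

-- Sample rows for TableX: (doc, item, parentItem)
insert into TableX (doc, item, parentItem) values (1, 1000, 0);
insert into TableX (doc, item, parentItem) values (1, 1010, 1000);
insert into TableX (doc, item, parentItem) values (1, 1020, 1000);
insert into TableX (doc, item, parentItem) values (1, 2000, 0);
insert into TableX (doc, item, parentItem) values (1, 2010, 2000);

-- Sample rows for TableY: (doc, item, SomeData)
insert into TableY (doc, item, SomeData) values (1, 1000, '1000 Data');
insert into TableY (doc, item, SomeData) values (1, 2000, '2000 Data');

commit;
```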

I use the following SQL query to get the "SomeData" from TableY:

select x.doc, x.item, y.SomeData from TableX x
join TableY y
on y.doc = x.doc and y.item IN (x.item, x.ParentItem)

Which should result in:

1, 1000, "1000 Data"
1, 1010, "1000 Data"
1, 1020, "1000 Data"
1, 2000, "2000 Data"
1, 2010, "2000 Data"

My question is: is the IN statement evaluated sequentially, or does that depend on the execution path Oracle takes?

Edit: What I mean is, when there is an entry in TableY for an item from TableX (like 1000), will this value be used first in the JOIN, or will the ParentItem be used first? Or is the JOIN on ParentItem only done when the JOIN on the item fails?


In general, when working with databases there is no order unless you specify one. I believe IN always examines the entire list rather than short-circuiting when it finds a match, but there's no guarantee about the order the list will be processed in (not that it matters if it's checking the entire list).
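
A small experiment makes the "no short-circuit" point visible. The sketch below assumes TableX and TableY exist with the sample data from the question; the extra 1010 row is hypothetical and is added only to show what happens when both the item and its parent have an entry in TableY.

```sql
-- Hypothetical extra row: item 1010 now has its own entry in TableY.
insert into TableY (doc, item, SomeData) values (1, 1010, '1010 Data');

select x.doc, x.item, y.SomeData
from TableX x
join TableY y
  on y.doc = x.doc and y.item in (x.item, x.ParentItem)
where x.item = 1010;

-- Two rows come back for item 1010: one where y.item = x.item ('1010 Data')
-- and one where y.item = x.ParentItem ('1000 Data').
-- Without an ORDER BY, the order of those two rows is not guaranteed.

-- Undo the experiment.
rollback;
```

So the join does not stop at the first match, and neither column takes priority over the other.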


Based on your revised question:

Oracle can't create an execution plan based on what the rows contain, so the optimizer will develop a plan that looks for both columns in the most efficient way possible. What that plan will be depends largely on the size of your tables and the indexes you've created.

It's almost always faster to process a table once, so the optimizer is likely to choose a route where it can easily examine both columns over one where it has to examine the columns one-at-a-time. This means that, unless it has a better avenue, a full-table scan is likely. A little experimentation shows me that even if there's an index covering both columns, the optimizer will opt for the full-table scan.

Interestingly, this appears to be one of the few scenarios where it may be better to have the two columns indexed separately. If there are separate indexes on both columns, it seems the optimizer will scan both indexes, then use a BITMAP OR operation to get the unique set of rowids from the two result sets.
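
As a sketch of the separate-index setup described above: I am assuming the two columns in question are TableX.item and TableX.parentItem, and the index names are hypothetical. Whether the optimizer actually combines the two indexes depends on your data, statistics, and Oracle version.

```sql
-- Hypothetical index names; one single-column index per column in the IN list.
create index tablex_item_ix       on TableX (item);
create index tablex_parentitem_ix on TableX (parentItem);

-- For comparison, a composite index over both columns
-- (the kind that was not chosen in the experiment described above):
-- create index tablex_item_parent_ix on TableX (item, parentItem);

-- Refresh optimizer statistics so the plan reflects the current data.
begin
  dbms_stats.gather_table_stats(ownname => user, tabname => 'TABLEX');
end;
/
```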

The caveat here is that my research was done on large tables I had available, using highly artificial scenarios. You should build your query in the simplest, easiest to read (and maintain) way you can, then test its actual performance and look at its explain plan for yourself. Only if you've determined that there is a performance problem (or that there is likely to be one) should you worry about finding another, more efficient (but perhaps less self-explanatory) way of writing the query. In general, if you've got a well-thought-out, sargable query, the optimizer will do a good job of picking the most efficient path available.
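
To look at the plan yourself, something along these lines should work. It uses the query from the question with Oracle's EXPLAIN PLAN statement and the DBMS_XPLAN package, and it assumes the default PLAN_TABLE is available to your user.

```sql
-- Ask the optimizer for its plan without running the query.
explain plan for
select x.doc, x.item, y.SomeData
from TableX x
join TableY y
  on y.doc = x.doc and y.item in (x.item, x.ParentItem);

-- Show the plan that was just stored in PLAN_TABLE.
select * from table(dbms_xplan.display);
```

Run it once with and once without your candidate indexes and compare the access paths it reports.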


The order of evaluation doesn't matter. An IN statement is simply shorthand for y.item = x.item OR y.item = x.ParentItem OR ...
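
Written out, the query from the question is equivalent to the first statement below. If what you actually want is "use the item's own row, and fall back to the parent only when there is none", neither IN nor OR will give you that. The second statement is one possible sketch of that behaviour, assuming there is at most one TableY row per (doc, item); the aliases yi and yp are just illustrative.

```sql
-- Same result as the IN version: either match qualifies, with no priority.
select x.doc, x.item, y.SomeData
from TableX x
join TableY y
  on y.doc = x.doc
 and (y.item = x.item or y.item = x.ParentItem);

-- Fallback variant (assumes at most one TableY row per doc/item):
-- prefer the item's own row, use the parent's row only if there is none.
select x.doc,
       x.item,
       case when yi.item is not null then yi.SomeData
            else yp.SomeData
       end as SomeData
from TableX x
left join TableY yi on yi.doc = x.doc and yi.item = x.item
left join TableY yp on yp.doc = x.doc and yp.item = x.ParentItem
where yi.item is not null
   or yp.item is not null;
```

With the sample data from the question, both statements return the same five rows, because no item has an entry in TableY for both itself and its parent.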
