We have the following database schema in Oracle 10g Express Edition: Image
One of our queries looks like this:
select
*
from
torder_item oi_0
where
oi_0.id in
(
select
max(oi_1.id)
from
torder_item oi_1, torder o
where
oi_1.torder_id = o.id
group by
oi_1.tproduct_id
)
or oi_0.id in
(
select
max(oi_2.id)
from
torder_item oi_2, tproduct p
where
oi_2.tproduct_id = p.id
group
by 开发者_JS百科p.group_id
);
The problem is, the query runs very slow. I currently have less than 4000 rows in each table, but the query execution time is above 6 seconds on my computer. And it is a simplified version. If I change 'or in' to 'union':
select
*
from
torder_item oi_0
where
oi_0.id in
((
select
max(oi_1.id)
from
torder_item oi_1, torder o
where
oi_1.torder_id = o.id
group by
oi_1.tproduct_id
)
union
(
select
max(oi_2.id)
from
torder_item oi_2, tproduct p
where
oi_2.tproduct_id = p.id
group
by p.group_id
));
it returns the same results, but executes instantly. Unfortunately, we are using Hibernate, which doesn't seem to support union, so I cannot just change the query like this. This is the trace of the original query:
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 0.14 0 10 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 6.19 6.19 0 31136 0 96
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 6.24 6.34 0 31146 0 96
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 5
Rows Row Source Operation
------- ---------------------------------------------------
96 FILTER (cr=31136 pr=0 pw=0 time=14041 us)
1111 TABLE ACCESS FULL TORDER_ITEM (cr=14 pr=0 pw=0 time=3349 us)
96 FILTER (cr=7777 pr=0 pw=0 time=1799577 us)
102096 HASH GROUP BY (cr=7777 pr=0 pw=0 time=1584153 us)
1234321 TABLE ACCESS FULL TORDER_ITEM (cr=7777 pr=0 pw=0 time=35809 us)
0 FILTER (cr=23345 pr=0 pw=0 time=4354068 us)
5075 HASH GROUP BY (cr=23345 pr=0 pw=0 time=4250913 us)
1127665 HASH JOIN (cr=23345 pr=0 pw=0 time=2716544 us)
1127665 TABLE ACCESS FULL TORDER_ITEM (cr=7105 pr=0 pw=0 time=38500 us)
3818430 TABLE ACCESS FULL TPRODUCT (cr=16240 pr=0 pw=0 time=22423 us)
I've tried to add indexes and perform analyze on the tables, but it didn't help.
Does anyone have an idea why it is so slow and how to improve it?
Here is the test data if anyone wants to reproduce the problem.
You've already found the solution to your performance problem. You could use a view and query that view from hibernate.
I don't know if Hibernate supports this type of EXISTS
queries but here's how it could be written:
select
*
from
torder_item oi_0
where
EXISTS
(
select
*
from
torder_item oi_1, torder o
where
oi_1.torder_id = o.id
group by
oi_1.tproduct_id
having
oi_0.id = max(oi_1.id)
)
or EXISTS
(
select
*
from
torder_item oi_2, tproduct p
where
oi_2.tproduct_id = p.id
group
by p.group_id
having
oi_0.id = max(oi_2.id)
);
As per my comment below your question, I think both queries are equivalent to:
select
*
from
torder_item oi_0
where
oi_0.id in
(
select
max(oi_1.id)
from
torder_item oi_1
group by
oi_1.tproduct_id
)
However, I understand that the query given in the question is simplified and this may not be true of the real query.
Why is it so slow?
Because for each row of TORDER_ITEM Oracle performs first subquery and then - if oi_0.id does not occur in the subquery results - the second subquery. That is why you see such big numbers in "Rows" column of the plan output (for example 3818430 means that the TPRODUCT table, which has 3762 rows, was full-scanned 1015 times)
In case of union the execution plan is different: first both subqueries are performed and the result (96 unique ids) is kept in memory and with this result Oracle visits each row of TORDER_ITEM - so in fact each subquery was executed one time instead of 1000.
Don't ask me though why the optimizer isn't smart enough to do something similar in case of first query.
I hope Hibernate supports outer joins. My proposition here is to left join TORDER_ITEM with first subquery and then with the second subquery and filter those rows for which there was something in first or second subquery. I mean
SELECT oi_0.*
FROM torder_item oi_0
LEFT JOIN (SELECT MAX(oi_1.id) id
FROM torder_item oi_1
/* you don't need the join with torder here, it isn't used anyway */
GROUP BY oi_1.tproduct_id
) subquery1 ON subquery1.id = oi_0.id
LEFT JOIN (SELECT MAX(oi_2.id) id
FROM torder_item oi_2,
tproduct p
WHERE oi_2.tproduct_id = p.id
GROUP BY p.group_id
) subquery2 ON subquery2.id = oi_0.id
WHERE subquery1.id IS NOT NULL OR subquery2.id IS NOT NULL
精彩评论