I need help开发者_如何学Python in optimizing the following query. It is taking a long time to finish. It takes almost 213 seconds . because of some constraints, I can not add an index and have to live with existing ones.
INSERT INTO temp_table_1
( USER_ID, role_id, participant_code, status_id )
WITH A AS
(SELECT USER_ID user_id,ROLE_ID, STATUS_ID,participant_code
FROM USER_ROLE WHERE participant_code IS NOT NULL), --1
B AS
(SELECT ROLE_ID
FROM CMP_ROLE
WHERE GROUP_ID = 3),
C AS (SELECT USER_ID FROM USER) --2
SELECT USER_ID,ROLE_ID,PARTICIPANT_CODE,MAX(STATUS_ID)
FROM A INNER JOIN B USING (ROLE_ID)
INNER JOIN C USING (USER_ID)
GROUP BY USER_ID,role_id,participant_code ;
--1 = query when ran alone takes 100+ seconds
--2 = query when ran alone takes 19 seconds
DELETE temp_table_1
WHERE ROWID NOT IN
( SELECT a.ROWID
FROM temp_table_1 a,
USER_ROLE b
WHERE a.status_id = b.status_id
AND ( b.ACTIVE IN ( 1 ) OR ( b.ACTIVE IN ( 0,3 )
AND SYSDATE BETWEEN b.effective_from_date AND b.effective_to_date ))
);
It seems like the person who wrote the query is trying to get everything into a temp table first and then deleting records from the temp table. whatever is left is the actual results.
Can't it be done such a way that there is no need for the delete? We just get the results needed since that will save time?
Here is a query which naively combines the two queries above, so make sure that you check and compare the outputs from the two methods.
select
r.user_id, r.role_id, r.participant_code, max(status_id)
from
user_role r,
cmp_role c
where
r.role_id = c.role_id
and r.active in (0,1,3)
and r.participant_code is not null
and sysdate between r.effective_from_date and r.effective_to_date
and c.group_id = 3
group by
r.user_id, r.role_id, r.participant_code;
It is not necessary to use a temporary table and then delete records afterwards to get the required results. Although, there may have been a reason for its use, maybe performance?
Also, it looks like the query and join to the USER
table is unnecessary as the USER_ID
is available from USER_ROLES
. I have omitted it from the query above. Hopefully that gives you a good start to improving it.
This should be semantically equivalent to the set left in the temp table after the delete in the existing code. Though I would agree with AR, that the User table is not needed unless it contains less user_id's than user_role. Otherwise it does not limit the set in any way. If User contains more user_id's than user_role, that will not change the resulting set. User_role is the main driver in this query with small limiting coming from the cmp_role table.
select a.user_id,
a.role_id,
a.participant_code,
a.status_id
from (select a.user_id,
a.role_id,
a.participant_code,
max(status_id) status_id
from user_role a,
(select role_id
from cmp_role
where group_id = 3
) b
where a.participant_code is not null
and a.active in (0, 1, 3)
and sysdate between a.effective_from_date and a.effective_to_date
and a.role_id = b.role_id
group by a.user_id,
a.role_id,
a.participant_code
) a
user c
where a.user_id = c.user_id;
If performance is still poor, then one would probably look at indexes on some of the fields that are used for limiting the data (user_role.role_id, user_role.participant_code, user_role.active, user_role.effective_from_date, user_role.effective_to_date).
Of course an explain plan or trace will be needed to get the full story on what Oracle is doing when it executes this query based on your data and structure.
Let's point out some obvious things.
--1 = query when ran alone takes 100+ seconds
--2 = query when ran alone takes 19 seconds
A full table scan on a USER table should not take 19 seconds. A full table scan on a USER_ROLE table should not take over 100 seconds, even if it has tens of millions of rows. Of course, if you really do have twenty million users then those timings are slightly less unreasonable but still not acceptable.
You need to understand why it takes your system so long to do simple look-ups. An EXPLAIN PLAN will hope us understand the joins but that won't solve your core problem: why does it take so long to retrieve the USER_ROLE data? Is it a complicated view? Does it have hundreds of millions of queries? Do you have an index on PARTICIPANT_CODE which is not helpful to this query?
What about other queries using these tables? Are they problematic as well? If so, you need to investigate more. Either the system is taking too long to do something or it is waiting too on some resource. What you need to do is run a 10046 trace against this query and establish where the time is going. This trace will report on the wait events for your session. That will give you some decent information on which to proceed. It is much better than guessing.
Oracle has exposed the Wait Interface since 9i. Roger Schrag wrote a pretty decent introduction. Read it now. (If you're on 10g or higher you should also read his follow-up article).
精彩评论