Hey guys, trying to optimize this query to solve a duplicate user issue:
SELECT userid, 'ismaster' AS name, 'false' AS propvalue FROM user
WHERE userid NOT IN (SELECT userid FROM userprop WHERE name = 'ismaster');
The problem is that the select开发者_如何学JAVA after the NOT IN is 120.000 records and it's taking forever.
Using the explain prefix as suggested in the comments returns:
QUERY PLAN
--------------------------------------------------------------------------------
--
Seq Scan on user (cost=5559.38..122738966.99 rows=61597 width=8)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=5559.38..7248.33 rows=121395 width=8)
-> Seq Scan on userprop (cost=0.00..4962.99 rows=121395 width=8
)
Filter: ((name)::text = 'ismaster'::text)
(6 rows)
Any suggestion?
Did you put index on userid?
Or try another variation:
SELECT userid, 'ismaster' AS name, 'false' AS propvalue FROM user
WHERE NOT EXISTS
(SELECT * FROM userprop
WHERE userpop.userid = user.userid
AND name = 'ismaster');
Is the name column indexed? How selective is the name value? Also anytime you want to have someone recommend changes to a query provide the query plan, even on what appears to be a simple query. That way we really know what the planner is doing.
According to this answer, using a LEFT JOIN ... IS NULL
might be either faster or slower than NOT EXISTS
, depending on the RDBMS, though they're equivalent on PostGres.
SELECT u.userid, 'ismaster' AS name, 'false' AS propvalue FROM user u
LEFT JOIN userprop up ON u.userid = up.userid AND up.name <> 'ismaster'
WHERE up.userid IS NULL
精彩评论