Is it possible to distribute a LIMIT clause to subqueries?

I'm JOINing the results of this query:

  SELECT
      twitter_personas.id
    , 'TwitterPersona'
    , twitter_personas.name
  FROM twitter_personas
UNION ALL
  SELECT
      facebook_personas.id
    , 'FacebookPersona'
    , facebook_personas.name
  FROM facebook_personas
-- and more UNION ALL statements pertaining to my other services

to a scoring table. The JOIN itself isn't the problem, but the query plan is "wrong": PostgreSQL finds the top 50 scores, then JOINs to the full view above, meaning it's doing a whole lot of work for nothing, since we're only interested in the top 50. But 50 is a variable - it might change (depends on UI concerns, and might be paginated at some point, yada, yada).

I made the query very fast by limiting my result set directly in the subquery:

SELECT
    personas.id
  , personas.type
  , personas.name
  , xs.value
FROM (
  SELECT
      twitter_personas.id
    , 'TwitterPersona'
    , twitter_personas.name
  FROM twitter_personas
  WHERE id IN (
    SELECT persona_id
    FROM xs
    ORDER BY
      xs.value DESC
    LIMIT 50)
UNION ALL
  SELECT
      facebook_personas.id
    , 'FacebookPersona'
    , facebook_personas.name
  FROM facebook_personas
  WHERE id IN (
    SELECT persona_id
    FROM xs
    ORDER BY
      xs.value DESC
    LIMIT 50)) AS personas(id, type, name)
  INNER JOIN xs ON xs.persona_id = personas.id
ORDER BY
  xs.value DESC
LIMIT 50

My question is: how can I distribute the 50 from the outer query to the inner queries? This query executes very fast (90ms) compared to the original merge against the full result set of the UNION ALL, which takes 15 seconds. Maybe there's an even better way to do this?
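
(One way I can at least avoid repeating the literal 50 everywhere is to bind it once as a prepared-statement parameter; a rough sketch with the remaining UNION ALL branches elided, and the top_personas name is just an example:)

PREPARE top_personas(int) AS
SELECT
    personas.id
  , personas.type
  , personas.name
  , xs.value
FROM (
  SELECT
      twitter_personas.id
    , 'TwitterPersona'
    , twitter_personas.name
  FROM twitter_personas
  WHERE id IN (
    SELECT persona_id
    FROM xs
    ORDER BY
      xs.value DESC
    LIMIT $1)
-- ... the other UNION ALL branches, each using LIMIT $1 ...
) AS personas(id, type, name)
  INNER JOIN xs ON xs.persona_id = personas.id
ORDER BY
  xs.value DESC
LIMIT $1;

EXECUTE top_personas(50);

That keeps the number in one place, but it doesn't answer whether the planner itself can push the limit down.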

Here are, for reference purposes, my query plans. First, the "bad" one, taking nearly 15 seconds:

                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=0.00..31072.27 rows=50 width=176) (actual time=304.299..14403.551 rows=50 loops=1)
  ->  Subquery Scan personas_ranked  (cost=0.00..253116556.67 rows=407303 width=176) (actual time=304.298..14403.511 rows=50 loops=1)
        ->  Nested Loop Left Join  (cost=0.00..253112483.64 rows=407303 width=112) (actual time=304.297..14403.474 rows=50 loops=1)
              ->  Nested Loop  (cost=0.00..252998394.22 rows=407303 width=108) (actual time=304.283..14402.815 rows=50 loops=1)
                    Join Filter: ("*SELECT* 1".id = xs.persona_id)
                    ->  Index Scan Backward using xs_value_index on xs xs  (cost=0.00..459.97 rows=10275 width=12) (actual time=0.013..0.208 rows=50 loops=1)
                    ->  Append  (cost=0.00..15458.35 rows=407303 width=88) (actual time=0.006..244.217 rows=398435 loops=50)
                          ->  Subquery Scan "*SELECT* 1"  (cost=0.00..15420.65 rows=406562 width=88) (actual time=0.006..199.945 rows=398434 loops=50)
                                ->  Seq Scan on twitter_personas  (cost=0.00..11355.02 rows=406562 width=88) (actual time=0.005..134.607 rows=398434 loops=50)
                          ->  Subquery Scan "*SELECT* 2"  (cost=0.00..14.88 rows=150 width=502) (actual time=0.002..0.002 rows=0 loops=49)
                                ->  Seq Scan on email_personas  (cost=0.00..13.38 rows=150 width=502) (actual time=0.001..0.001 rows=0 loops=49)
                          ->  Subquery Scan "*SELECT* 3"  (cost=0.00..21.80 rows=590 width=100) (actual time=0.001..0.001 rows=0 loops=49)
                                ->  Seq Scan on facebook_personas  (cost=0.00..15.90 rows=590 width=100) (actual time=0.001..0.001 rows=0 loops=49)
                          ->  Subquery Scan "*SELECT* 4"  (cost=0.00..1.03 rows=1 width=25) (actual time=0.018..0.019 rows=1 loops=49)
                                ->  Seq Scan on web_personas  (cost=0.00..1.02 rows=1 width=25) (actual time=0.017..0.018 rows=1 loops=49)
              ->  Index Scan using people_personas_pkey on people_personas  (cost=0.00..0.27 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=50)
                    Index Cond: (people_personas.persona_id = "*SELECT* 1".id)
Total runtime: 14403.711 ms

The rewritten query, taking only 90 ms:

                                                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=2830.93..2831.05 rows=50 width=108) (actual time=83.914..83.925 rows=50 loops=1)
  ->  Sort  (cost=2830.93..2832.30 rows=551 width=108) (actual time=83.912..83.918 rows=50 loops=1)
        Sort Key: xs.value
        Sort Method:  top-N heapsort  Memory: 28kB
        ->  Hash Join  (cost=875.60..2812.62 rows=551 width=108) (actual time=8.394..79.326 rows=10275 loops=1)
              Hash Cond: ("*SELECT* 1".id = xs.persona_id)
              ->  Append  (cost=588.41..2509.59 rows=551 width=4) (actual time=5.078..69.901 rows=10275 loops=1)
                    ->  Subquery Scan "*SELECT* 1"  (cost=588.41..1184.14 rows=200 width=4) (actual time=5.078..42.428 rows=10274 loops=1)
                          ->  Nested Loop  (cost=588.41..1182.14 rows=200 width=4) (actual time=5.078..40.220 rows=10274 loops=1)
                                ->  HashAggregate  (cost=588.41..590.41 rows=200 width=4) (actual time=5.066..7.900 rows=10275 loops=1)
                                      ->  Index Scan Backward using xs_value_index on xs xs  (cost=0.00..459.97 rows=10275 width=12) (actual time=0.005..2.079 rows=10275 loops=1)
                                ->  Index Scan using twitter_personas_id_index on twitter_personas  (cost=0.00..2.95 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=10275)
                                      Index Cond: (twitter_personas.id = xs.persona_id)
                    ->  Subquery Scan "*SELECT* 2"  (cost=588.41..649.27 rows=200 width=4) (actual time=13.017..13.017 rows=0 loops=1)
                          ->  Nested Loop  (cost=588.41..647.27 rows=200 width=4) (actual time=13.016..13.016 rows=0 loops=1)
                                ->  HashAggregate  (cost=588.41..590.41 rows=200 width=4) (actual time=5.267..6.909 rows=10275 loops=1)
                                      ->  Index Scan Backward using xs_value_index on xs xs  (cost=0.00..459.97 rows=10275 width=12) (actual time=0.007..2.292 rows=10275 loops=1)
                                ->  Index Scan using facebook_personas_id_index on facebook_personas  (cost=0.00..0.27 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=10275)
                                      Index Cond: (facebook_personas.id = xs.persona_id)
                    ->  Subquery Scan "*SELECT* 3"  (cost=588.41..648.77 rows=150 width=4) (actual time=12.568..12.568 rows=0 loops=1)
                          ->  Nested Loop  (cost=588.41..647.27 rows=150 width=4) (actual time=12.566..12.566 rows=0 loops=1)
                                ->  HashAggregate  (cost=588.41..590.41 rows=200 width=4) (actual time=5.015..6.538 rows=10275 loops=1)
                                      ->  Index Scan Backward using xs_value_index on xs xs  (cost=0.00..459.97 rows=10275 width=12) (actual time=0.002..2.065 rows=10275 loops=1)
                                ->  Index Scan using email_personas_id_index on email_personas  (cost=0.00..0.27 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=10275)
                                      Index Cond: (email_personas.id = xs.persona_id)
                    ->  Subquery Scan "*SELECT* 4"  (cost=0.00..27.41 rows=1 width=4) (actual time=0.629..0.630 rows=1 loops=1)
                          ->  Nested Loop Semi Join  (cost=0.00..27.40 rows=1 width=4) (actual time=0.628..0.628 rows=1 loops=1)
                                Join Filter: (web_personas.id = xs.persona_id)
                                ->  Seq Scan on web_personas  (cost=0.00..1.01 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
                                ->  Index Scan Backward using xs_value_index on xs xs  (cost=0.00..459.97 rows=10275 width=12) (actual time=0.002..0.421 rows=1518 loops=1)
              ->  Hash  (cost=158.75..158.75 rows=10275 width=12) (actual time=3.307..3.307 rows=10275 loops=1)
                    ->  Seq Scan on xs xs  (cost=0.00..158.75 rows=10275 width=12) (actual time=0.006..1.563 rows=10275 loops=1)
Total runtime: 84.066 ms


The reason the database can't do this for you is that the ORDER BY xs.value DESC is processed before the LIMIT: to know the first (or last) 50 entries, it has to (logically) compute all the entries first. If you simply put a limit in each branch of the UNION, you only get the first 50 entries among those that were already within the top 50 of their persona kind, which might be a different set. If that is acceptable to you, you can manually rewrite the query as you did, but the database system can't make that rewrite for you.
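
If you do go with the manual rewrite, you can at least write the top-50 lookup only once by factoring it into a CTE that every branch reuses; a rough sketch along the lines of your rewritten query (top_ids is just an example name, and the remaining persona tables would follow the same pattern):

WITH top_ids AS (
  SELECT persona_id
  FROM xs
  ORDER BY
    xs.value DESC
  LIMIT 50)
SELECT
    personas.id
  , personas.type
  , personas.name
  , xs.value
FROM (
  SELECT
      twitter_personas.id
    , 'TwitterPersona'
    , twitter_personas.name
  FROM twitter_personas
  WHERE id IN (
    SELECT persona_id
    FROM top_ids)
UNION ALL
  SELECT
      facebook_personas.id
    , 'FacebookPersona'
    , facebook_personas.name
  FROM facebook_personas
  WHERE id IN (
    SELECT persona_id
    FROM top_ids)
-- and the other UNION ALL branches, each reusing top_ids
) AS personas(id, type, name)
  INNER JOIN xs ON xs.persona_id = personas.id
ORDER BY
  xs.value DESC
LIMIT 50;

The limit still appears twice (in the CTE and at the end) but no longer once per branch, and PostgreSQL will typically materialize top_ids once and reuse it in every branch. Check the plan against your data.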


The planner has to join all the rows in xs to each table in the UNION because it has no way of knowing in advance that the join won't eliminate rows from the result set, which could change which rows end up in the top 50.

Can you do a two-step with a temporary table, like this?

create temporary table top50 as
select xs.persona_id
, xs.value
from xs
order by value desc
limit 50;

select *
from top50
join personas_view on top50.persona_id = personas_view.id;
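
If you go this route, it may also help to analyze the temporary table so the planner has statistics for it, and to repeat the ordering on value in the final select, since the join output isn't guaranteed to come back in the temporary table's order; for example:

analyze top50;

select *
from top50
join personas_view on top50.persona_id = personas_view.id
order by top50.value desc;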