Help me understand this particular use of nested SELECT statements

Developer https://www.devze.com 2023-03-08 08:02 Source: Internet
From this site:

Tables:

CREATE TABLE PilotSkills
(pilot_name CHAR(15) NOT NULL,
plane_name CHAR(15) NOT NULL,
PRIMARY KEY (pilot_name, plane_name));

CREATE TABLE Hangar
(plane_name CHAR(15) NOT NULL PRIMARY KEY);

Query:

SELECT DISTINCT pilot_name
  FROM PilotSkills AS PS1
  WHERE NOT EXISTS
       (SELECT *
          FROM Hangar
         WHERE NOT EXISTS
               (SELECT *
                  FROM PilotSkills AS PS2
                 WHERE (PS1.pilot_name = PS2.pilot_name)
                   AND (PS2.plane_name = Hangar.plane_name)));

I understand the problem it's used for (set division), including the analogy that describes it as "There ain't no planes in this hangar that I can't fly!". What I don't understand is exactly what's at work here, and how it comes together to do what it says it's doing.

Having trouble stating with specifics my difficulty at the moment...


Edit: Let me just first ask what something like this does, exactly:

SELECT DISTINCT pilot_name
  FROM PilotSkills
  WHERE NOT EXISTS
       (SELECT *
          FROM Hangar)

I think I'm missing some fundamental understanding here...

Edit: Irrelevant, and it wouldn't be meaningful without the third nested SELECT, right?


What we want is a distinct list of pilots that can fly every plane in the hangar. For that to be true for a given pilot, there cannot exist a plane they cannot fly. So, for each pilot, we look at every plane in the hangar and see if there is one they cannot fly. If there is one the pilot cannot fly, we remove that pilot from the list. Whoever is left can fly all planes in the hangar.

Said more formally, find a distinct list of pilot names such that for a given pilot, there does not exist a plane in the set of planes (Hangar) such that the plane does not exist in the set of the given pilot's skills.

"find a distinct list of pilot names..."

Select Distinct pilot_name
From PilotSkills As PS1
...

"...such that for a given pilot, there does not exist a plane in the set of planes (Hangar)..."

Select Distinct pilot_name
From PilotSkills As PS1
Where Not Exists    (
                    Select 1
                    From Hangar

"...such that the plane does not exist in the set of the given pilot's skills."

Select Distinct pilot_name
From PilotSkills As PS1
Where Not Exists    (
                    Select 1
                    From Hangar As H
                    Where Not Exists    (
                                        Select 1
                                        From PilotSkills As PS2
                                        Where PS2.pilot_name = PS1.pilot_name
                                            And PS2.plane_name = H.plane_name
                                        )
                    )


As a minor comment first: Select * is more than this query needs. Inside Exists only the presence of a row matters, so Select 1 (or a single column) states the intent more clearly, although most optimizers ignore the select list of an Exists subquery either way. That said, to try to break down the workflow:

  1. Select Pilot_Name From PilotSkills - We're interested in pilot names eventually.
  2. Where Not Exists (Select * From Hangar ...) - We only keep a pilot if the Hangar subquery returns no rows for that pilot.
  3. Where Not Exists (Select * From PilotSkills ...) - The Hangar subquery only returns planes for which the pilot from the outer query has no matching PilotSkills row, that is, planes the pilot cannot fly.
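
To see what the two inner levels produce before the outer Not Exists discards pilots, it can help to list the "missing" pilot/plane pairs directly. The following is only a sketch for a scratch database: the sample rows mirror the small data set used in another answer on this page, and the derived table P and the Cross Join are introduced here purely for illustration.

```sql
-- Scratch tables matching the question's DDL.
CREATE TABLE Hangar
(plane_name CHAR(15) NOT NULL PRIMARY KEY);

CREATE TABLE PilotSkills
(pilot_name CHAR(15) NOT NULL,
 plane_name CHAR(15) NOT NULL,
 PRIMARY KEY (pilot_name, plane_name));

-- Illustrative sample rows (an assumption, not part of the question).
INSERT INTO Hangar VALUES ('B-1 Bomber');
INSERT INTO Hangar VALUES ('F-14 Fighter');
INSERT INTO PilotSkills VALUES ('Celko', 'F-14 Fighter');
INSERT INTO PilotSkills VALUES ('Higgins', 'B-1 Bomber');
INSERT INTO PilotSkills VALUES ('Higgins', 'F-14 Fighter');

-- Pair every pilot with every hangar plane, then keep only the
-- pairs that have no matching row in PilotSkills.
SELECT P.pilot_name, H.plane_name
  FROM (SELECT DISTINCT pilot_name FROM PilotSkills) AS P
 CROSS JOIN Hangar AS H
 WHERE NOT EXISTS
       (SELECT 1
          FROM PilotSkills AS PS2
         WHERE PS2.pilot_name = P.pilot_name
           AND PS2.plane_name = H.plane_name);
-- With the rows above: one row, Celko / B-1 Bomber.
```

Any pilot who appears in this result has at least one plane they cannot fly, so the outer Not Exists in the original query removes exactly those pilots.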

Describing it as a double negative (from the other answer) is a great way to understand it. It can probably be achieved more directly.
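
One commonly used alternative, shown here as a hedged sketch rather than a drop-in replacement, counts matches instead of using the double negative: a pilot qualifies when the number of hangar planes they can fly equals the number of planes in the hangar. It relies on the primary keys from the question (no duplicate rows in either table), and the sample rows are illustrative assumptions.

```sql
-- Scratch tables matching the question's DDL.
CREATE TABLE Hangar
(plane_name CHAR(15) NOT NULL PRIMARY KEY);

CREATE TABLE PilotSkills
(pilot_name CHAR(15) NOT NULL,
 plane_name CHAR(15) NOT NULL,
 PRIMARY KEY (pilot_name, plane_name));

-- Illustrative sample rows (an assumption, not part of the question).
INSERT INTO Hangar VALUES ('B-1 Bomber');
INSERT INTO Hangar VALUES ('F-14 Fighter');
INSERT INTO PilotSkills VALUES ('Celko', 'F-14 Fighter');
INSERT INTO PilotSkills VALUES ('Higgins', 'B-1 Bomber');
INSERT INTO PilotSkills VALUES ('Higgins', 'F-14 Fighter');

-- Keep only skills for planes that are in the hangar, then compare
-- each pilot's count with the total number of hangar planes.
SELECT PS.pilot_name
  FROM PilotSkills AS PS
  JOIN Hangar AS H
    ON H.plane_name = PS.plane_name
 GROUP BY PS.pilot_name
HAVING COUNT(*) = (SELECT COUNT(*) FROM Hangar);
-- With the rows above: Higgins.
```

The two forms differ when Hangar is empty: the Not Exists version returns every pilot (there is no plane they cannot fly), while the counting version returns none, because the join produces no rows to group.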


Conceptually it is just a double negative.

Select all the pilots for which there does not exist a plane in the hangar that they cannot fly.
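
Written as a formula, the double negative is the usual rewrite of "for all", which SQL has no general predicate for. This is only a sketch; the predicate name CanFly is introduced here for illustration and means that PilotSkills contains the row (p, h).

```latex
\{\, p \mid \neg \exists\, h \in \mathrm{Hangar} : \neg\, \mathrm{CanFly}(p, h) \,\}
\;=\;
\{\, p \mid \forall\, h \in \mathrm{Hangar} : \mathrm{CanFly}(p, h) \,\}
```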

But it seems you are asking about the mechanics of the query itself? It uses two levels of correlated subquery.

To simplify the explanation slightly, reduce the data to a minimal number of rows and add a Pilots table (the outer instance of PilotSkills in the original query is only used to get the list of pilots). The query then looks like this:

SELECT pilot_name
  FROM Pilots
  WHERE NOT EXISTS
       (SELECT *
          FROM Hangar
         WHERE NOT EXISTS
               (SELECT *
                  FROM PilotSkills 
                 WHERE (Pilots.pilot_name = PilotSkills.pilot_name)
                   AND (PilotSkills.plane_name = Hangar.plane_name)));

Pilots

pilot_name 
===========
'Celko'    
'Higgins'  

Hangar

plane_name
=============
'B-1 Bomber'
'F-14 Fighter'

PilotSkills

pilot_name    plane_name
=========================
'Celko'    'F-14 Fighter'
'Higgins'  'B-1 Bomber'
'Higgins'  'F-14 Fighter'

If you want to know which pilots can fly all the planes in the hangar then

  1. For each Pilots.pilot_name in turn
  2. Look at each Hangar.plane_name in turn
  3. And check if there is a corresponding row in PilotSkills for that pilot_name,plane_name

If step 3 is false then we know that there is at least one plane in the hangar the pilot cannot fly, and we can stop processing that Pilots row and go on to the next one. If step 3 is true then we must return to step 2 and check the next plane in the Hangar. If we finish processing all planes in the hangar and for each one there has been a corresponding row in PilotSkills, then we know that this pilot can fly all planes.

Or to put it another way we know there does not exist a plane (as we have checked them all) for which there does not exist a matching row in the PilotSkills table.
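
For anyone who wants to trace the steps against real rows, here is a minimal script that loads the sample data above and runs the simplified query. It is a sketch for a scratch database; the DDL for Pilots is an assumption, since only its contents are given in this answer.

```sql
-- Pilots is the helper table introduced in this answer; its DDL is assumed.
CREATE TABLE Pilots
(pilot_name CHAR(15) NOT NULL PRIMARY KEY);

CREATE TABLE Hangar
(plane_name CHAR(15) NOT NULL PRIMARY KEY);

CREATE TABLE PilotSkills
(pilot_name CHAR(15) NOT NULL,
 plane_name CHAR(15) NOT NULL,
 PRIMARY KEY (pilot_name, plane_name));

-- Sample rows from the tables shown above.
INSERT INTO Pilots VALUES ('Celko');
INSERT INTO Pilots VALUES ('Higgins');
INSERT INTO Hangar VALUES ('B-1 Bomber');
INSERT INTO Hangar VALUES ('F-14 Fighter');
INSERT INTO PilotSkills VALUES ('Celko', 'F-14 Fighter');
INSERT INTO PilotSkills VALUES ('Higgins', 'B-1 Bomber');
INSERT INTO PilotSkills VALUES ('Higgins', 'F-14 Fighter');

-- Celko: 'B-1 Bomber' has no PilotSkills row, so the Hangar subquery
--        returns a row and the outer NOT EXISTS rejects Celko.
-- Higgins: both planes have PilotSkills rows, so the Hangar subquery
--          returns nothing and the outer NOT EXISTS keeps Higgins.
SELECT pilot_name
  FROM Pilots
 WHERE NOT EXISTS
       (SELECT *
          FROM Hangar
         WHERE NOT EXISTS
               (SELECT *
                  FROM PilotSkills
                 WHERE (Pilots.pilot_name = PilotSkills.pilot_name)
                   AND (PilotSkills.plane_name = Hangar.plane_name)));
-- With the rows above: Higgins.
```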
