开发者

Nested query to find details in table B for maximum value in table A

开发者 https://www.devze.com 2022-12-25 11:55 出处:网络
I\'ve got a huge bunch of flights travelling between airports. Each airport has an ID and (x,y) coordinates.

I've got a huge bunch of flights travelling between airports.

Each airport has an ID and (x,y) coordinates.

For a given list of flights belonging to a user, I want to find the northernmost (highest y) airport visited.


Here's the query I'm currently using:

SELECT name,iata,icao,apid,x,y 
  FROM airports 
 WHERE y=(SELECT MAX(y) 
            FROM airports AS a
               , flights AS f 
           WHERE (f.src_apid=a.apid OR f.dst_apid=a.apid) AND f.uid=[user_id]
         )

This works beautifully and reasonably fast as long as y is unique (= there's only one airport at that latitude), but fails once it isn't. Unfortunately this happens quite often, as eg. military and civilian airports have separate entries even though they occupy the same coordinates.

What I'd really want to do is find the airport with MAX(y) in the subquery and return the actual matching airport (a.apid), instead of returning the value of y and then matching it again. Any suggestions?


Assume the user has only this one flight, from apid '3728':

mysql> select * from flights where uid=35 and src_apid=3728 limit 1;
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
| uid  | src_apid | src_time | dst_apid | distance | code | seat | seat_type | class | reason | plid | alid | trid | fid    | duration | registration | note         | upd_time            | opp  | src_date   | mode |
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
|   35 |     3728 | NULL     |     3992 |     4116 | NW16 | 23C  | A         | Y     | L      |  167 | 3731 | NULL | 107493 | 08:00:00 |              | del. typhoon | 2008-10-04 10:40:58 | Y    | 2001-08-22 | F    | 
+---开发者_StackOverflow社区---+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+

And there are two airports at the same coordinates:

mysql> select * from airports where y=21.318681;
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
| name                  | city     | country       | iata | icao | x           | y         | elevation | apid | uid  | timezone | dst  |
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
| Honolulu Intl         | Honolulu | United States | HNL  | PHNL | -157.922428 | 21.318681 |        13 | 3728 | NULL |      -10 | N    | 
| Hickam Air Force Base | Honolulu | United States |      | PHIK | -157.922428 | 21.318681 |        13 | 7055 |    3 |      -10 | N    | 
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+

If you run the original query, the subquery will return y=21.318681, which in turn will match either apid 3728 (correct) or apid 7055 (wrong).


What about this:

SELECT name,iata,icao,apid,x,y 
FROM airports AS a, flights AS f 
WHERE f.src_apid=a.apid OR f.dst_apid=a.apid
ORDER BY y DESC LIMIT 1

You take all flights of the concerned users, order them from northern to southern, and take the first one from the list.


How does the following query perform? It works by first finding the northmost Y cordinate in the set of airports visited. Then an identical query is performed which is filtered by the Y coordinate in the previous query. The final step is to find the airport.

drop table airports;
drop table flights;

create table airports(
   apid    int         not null
  ,apname  varchar(50) not null
  ,x       int         not null
  ,y       int         not null
  ,primary key(apid)
  ,unique(apname)
);

create table flights(
   flight_id int         not null auto_increment
  ,src_apid  int         not null
  ,dst_apid  int         not null
  ,user_id   varchar(20) not null
  ,foreign key(src_apid) references airports(apid)
  ,foreign key(dst_apid) references airports(apid)
  ,primary key(flight_id)
  ,index(user_id)
);

insert into airports(apid, apname, x, y) values(1, 'Northpole Civilian',     50, 100);
insert into airports(apid, apname, x, y) values(2, 'Northpole Military',     50, 100);
insert into airports(apid, apname, x, y) values(3, 'Transit point',          50, 50);
insert into airports(apid, apname, x, y) values(4, 'Southpole Civilian',     50, 0);
insert into airports(apid, apname, x, y) values(5, 'Southpole Military',     50, 0);

insert into flights(src_apid, dst_apid, user_id) values(4, 3, 'Family guy');
insert into flights(src_apid, dst_apid, user_id) values(3, 1, 'Family guy');

insert into flights(src_apid, dst_apid, user_id) values(5, 3, 'Mr Bazooka');
insert into flights(src_apid, dst_apid, user_id) values(3, 2, 'Mr Bazooka');

select airports.apid
      ,airports.apname
      ,airports.x
      ,airports.y
  from (select max(a.y) as y
          from flights  f
          join airports a on (a.apid = f.src_apid or a.apid = f.dst_apid)
         where f.user_id = 'Family guy'
       ) as northmost 
  join (select a.apid
              ,a.y
          from flights  f
          join airports a on (a.apid = f.src_apid or a.apid = f.dst_apid)
         where f.user_id = 'Family guy'
       ) as userflights on(northmost.y = userflights.y)   
  join airports on(userflights.apid = airports.apid);

Edit. Alternative query that may be less confusing to the optimizer

select airports.*
  from (select case when s.y > d.y then s.apid else d.apid end as apid
              ,case when s.y > d.y then s.y    else d.y    end as northmost
          from flights  f
          join airports s on(f.src_apid = s.apid)
          join airports d on(f.dst_apid = d.apid)
         where f.user_id = 'Family guy'
         order by northmost desc
         limit 1
       ) as user_flights
  join airports on(airports.apid = user_flights.apid);


third attempt, using assumed user (userid,name) table

select u.name, ap.name
     , ap.iata
     , ap.icao
     , ap.apid
     , ap.x
     , max(ap.y)  
  from users u
     , airports ap
     , flights f
 where u.userid=f.userid
   and (   f.src_apid=ap.apid 
        OR f.dst_apid=ap.apid
       )
group by u.name, ap.name,ap.iata,ap.icao,ap.apid,ap.x 

you can now restrict the query to the one user you are interested in ..

comment on GROUP BY:

  • strictly speaking MySQL would allow me to write that group by as 'group by u.name, ap.name'.
  • Other SQL dialects don't, they ask that all selected fields that are not aggregated be in the GROUP BY statement.
  • So I tend to be 'chicken' when selecting my GROUP BY fields ...


OK, perhaps something like this:

SELECT name, iata, icao, apid, x, y
  FROM airports
  WHERE y = (SELECT MAX(A.y)
               FROM airports AS a
             INNER JOIN flights AS f
               ON (F.SRC_APID = A.APID OR
                   F.DST_APID = A.APID)
               WHERE f.uid = [user_id]) AND
        apid IN (SELECT SRC_APID AS APID
                   FROM FLIGHTS
                   WHERE UID = [user_id]
                 UNION ALL
                 SELECT DEST_APID AS APID
                   FROM FLIGHTS
                   WHERE UID = [user_id])

Can't guarantee how this will perform, but perhaps it's a step in the right direction.

Share and enjoy.

0

精彩评论

暂无评论...
验证码 换一张
取 消