What is the difference between using a cross join and putting a comma between the two tables?

Developer https://www.devze.com 2023-01-20 10:13 Source: the web
What is the difference between

select * from A, B

and

select * from A cross join B

? They seem to return the same results.

Is the second version preferred over the first? Is the first version completely syntactically wrong?


They return the same results because they are semantically identical. This:

select * 
  from A, B

...is (wince) ANSI-89 syntax. Without a WHERE clause to link the tables together, the result is a cartesian product, which is exactly what the alternative produces as well:

    select * 
      from A 
cross join B

...but the CROSS JOIN is ANSI-92 syntax.
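A quick way to check the equivalence is to run both forms against throwaway tables. The sketch below uses Python's built-in sqlite3 module; the columns and rows of A and B are made up for illustration and are not from the question. Row order is not guaranteed by either form, so the results are sorted before comparing.

```python
import sqlite3

# In-memory database with two tiny, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a_id INTEGER);
    CREATE TABLE B (b_name TEXT);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES ('x'), ('y');
""")

# ANSI-89 comma form and ANSI-92 CROSS JOIN form of the same query.
comma_rows = sorted(conn.execute("SELECT * FROM A, B").fetchall())
cross_rows = sorted(conn.execute("SELECT * FROM A CROSS JOIN B").fetchall())

print(len(comma_rows))           # 3 rows in A times 2 rows in B
print(comma_rows == cross_rows)  # both forms yield the same cartesian product
```

This only demonstrates that the result sets match in SQLite; it says nothing about query plans on other engines.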

About Performance

There's no performance difference between them.

Why Use ANSI-92?

The reason to use ANSI-92 syntax is OUTER JOIN support (i.e. LEFT, FULL, RIGHT). ANSI-89 syntax doesn't have any, so many databases implemented their own, which doesn't port to any other database. Examples: Oracle's (+), SQL Server's =*


I stumbled upon this post from another SO question. A big difference is the scoping that cross join creates. If you put a cross apply or another join after B in the first ('comma') variant, that cross apply or join can only refer to the table(s) after the comma. For example, the following:

select * from A, B join C on C.SomeField = A.SomeField and C.SomeField = B.SomeField 

would create an error:

The multi-part identifier "A.SomeField" could not be bound.

because the join on C only scopes to B, whereas the same with cross join...

select * from A cross join B join C on C.SomeField = A.SomeField and C.SomeField = B.SomeField 

...is accepted. The same applies to cross apply: in the comma form, a cross apply on a function placed after B can only use fields of B, whereas in the cross join form it can use fields from both A and B. This also works in reverse: if you want a join to apply to only one of the tables, you can achieve that by separating the tables with a comma.


They are the same and should (almost) never be used.


Besides brevity (favoring ,) and consistency (favoring CROSS JOIN), the sole difference is precedence.

The comma is lower precedence than other joins.


For example, the explicit form of

SELECT *
FROM a
  CROSS JOIN b
  JOIN c ON a.id = c.id

is

SELECT *
FROM (
  a
  CROSS JOIN b
)
  INNER JOIN c ON a.id = c.id

which is valid.

Whereas the explicit form of

SELECT *
FROM a,
  b
  JOIN c ON a.id = c.id

is

SELECT *
FROM a
  CROSS JOIN (
    b
    INNER JOIN c ON a.id = c.id
  )

which is invalid (the join condition references a, which is not in scope inside the parentheses).


In your example, there are only two tables, so the two queries are exactly equivalent.


The first version was originally the only way to join two tables. But it has a number of problems so the JOIN keyword was added in the ANSI-92 standard. They give the same results but the second is more explicit and is to be preferred.


To add to the answers already given:

select * from A, B

This was the only way of joining prior to the 1992 SQL standard. So if you wanted an inner join, you'd have to use the WHERE clause for the criteria:

select * from A, B
where A.x = B.y;

One problem with this syntax was that there was no standard for outer joins. Another was that this gets unreadable with many tables and is hence prone to errors and less maintainable.

select * from A, B, C, D
where B.id = C.id_b
and C.id_d = D.id;

Here we have a cross join of A with B/C/D. On purpose or not? Maybe the programmer just forgot the and B.id = A.id_b (or whatever), maybe that line was deleted by mistake, or maybe it really was meant to be a cross join. Who could say?

Here is the same with explicit joins

select * 
from A
cross join B
inner join C on C.id_b = B.id
inner join D on D.id = C.id_d;

No doubt about the programmer's intentions anymore.
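To see that the two four-table queries above really select the same rows, here is a minimal sketch using Python's built-in sqlite3 module. The column names follow the queries above, but the table definitions and sample rows are assumptions made up for this check.

```python
import sqlite3

# Hypothetical schema and data matching the column names used above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
    CREATE TABLE C (id_b INTEGER, id_d INTEGER);
    CREATE TABLE D (id INTEGER);
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (10), (20);
    INSERT INTO C VALUES (10, 100), (20, 200);
    INSERT INTO D VALUES (100), (200);
""")

# Comma syntax: A has no condition, so it is cross joined with the rest.
implicit_sql = """
    SELECT A.id, B.id, C.id_b, C.id_d, D.id
    FROM A, B, C, D
    WHERE B.id = C.id_b
      AND C.id_d = D.id
"""

# Explicit syntax: the cross join with A is spelled out.
explicit_sql = """
    SELECT A.id, B.id, C.id_b, C.id_d, D.id
    FROM A
    CROSS JOIN B
    INNER JOIN C ON C.id_b = B.id
    INNER JOIN D ON D.id = C.id_d
"""

implicit_rows = sorted(conn.execute(implicit_sql).fetchall())
explicit_rows = sorted(conn.execute(explicit_sql).fetchall())

print(len(implicit_rows))              # 2 matching B/C/D chains times 2 rows in A
print(implicit_rows == explicit_rows)  # same rows either way
```

The rows are identical; what changes is that the explicit version states the cross join with A on purpose instead of leaving the reader to guess.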

The old comma-separated syntax was replaced for good reasons and should not be used anymore.


These are the examples of implicit and explicit cross joins. See http://en.wikipedia.org/wiki/Join_%28SQL%29#Cross_join.


In reply to the comments about the utility of cross joins: there is one very useful and valid use of cross joins (or commas) in the admittedly somewhat obscure world of Postgres generate_series and PostGIS spatial SQL. You can cross join against generate_series to extract the nth geometry out of a Geometry Collection or Multi-(Polygon/Point/Linestring). See: http://postgis.refractions.net/documentation/manual-1.4/ST_GeometryN.html

SELECT n, ST_AsEWKT(ST_GeometryN(the_geom, n)) As geomewkt
  FROM (
    VALUES (ST_GeomFromEWKT('MULTIPOINT(1 2 7, 3 4 7, 5 6 7, 8 9 10)') ),
          ( ST_GeomFromEWKT('MULTICURVE(CIRCULARSTRING(2.5 2.5,4.5 2.5, 3.5 3.5), (10 11, 12 11))') )
    )  As foo(the_geom)
CROSS JOIN generate_series(1,100) n
  WHERE n <= ST_NumGeometries(the_geom);

This can be very useful if you want the area, centroid, bounding box, or any of the many other operations you can perform on a single geometry, for geometries that are contained within a larger one.

I have always written such queries using a comma before generate_series, until one day when I wondered if this really meant cross join, which brought me to this post. Obscure, but definitely useful.
