开发者

mysql JOIN - Left, right or Outer?

开发者 https://www.devze.com 2023-04-13 03:27 出处:网络
I have 2 tables with the following columns: Table 1 +--------------+---------------+------+-----+---------+----------------+

I have 2 tables with the following columns: Table 1

+--------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| symbol | varchar(45) | NO | MUL | NULL | |
| v_dt | date | NO | | NULL | |
| e_dt | date | NO | | NULL | |
| col_s | decimal(10,4) | NO | | NULL | |
| col_o | decimal(10,4) | NO | | NULL | |
| col_b | decimal(10,4) | NO | | NULL | |
| col_a | decimal(10,4) | NO | | NULL | |
| col_l | decimal(10,4) | NO | | NULL | |
| col_v | bigint(20) | NO | | NULL | |
| col_t | enum('a','b') | NO | | NULL | |
+--------------+---------------+------+-----+---------+----------------+

and Table 2:

+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| t_date | date | NO | PRI | NULL | |
| e_date | date | NO | | NULL | |
| symbol | varchar(45) | NO | PRI | NULL | |
| col_i | decimal(10,6) | NO | | NULL | |
| col_b | decimal(10,6) | NO | | NULL | |
| col_d | decimal(10,6) | NO | | NULL | |
| col_g | decimal(10,6) | NO | | NULL | |
| col_v | decimal(10,6) | NO | | NULL | |
| col_t | decimal(10,6) | NO | | NULL | |
| col_r | decimal(10,6) | NO | | NULL | |
+------------+---------------+------+-----+---------+-------+

I want to retrieve table1.col_b, table1.col_a and table2.col_d so, I wrote the following

SELECT col_b, col_a, col_d FROM table1 LEFT JOIN table2 ON table1.symbol=table2.symbol;

but instead of returning 40 rows that it should have, it kept running to retrieve 4000 rows with irrelevant data so I am guessing I have written the JOIN incorrectly. Can you pls. let me know what is the correct way to write this to retrieve the required data. thx! result of an INNER JOIN (I changed the query to include dates so the results make more sense, I hope)

SELECT v_dt, e_dt, col_d FROM table1 INNER JOIN table2 ON table1.symbol=table2.symbol;

| 2011-09-26   | 2011-11-18 |   2.030130 |
| 2011-09-27   | 2011-11-18 |   2.030130 |
| 2011-09-28   | 2011-11-18 |   2.030130 |
| 2011-09-29   | 2011-11-18 |   2.030130 |
| 2011-09-30   | 2011-11-18 |   2.030130 |
| 2011-10-03   | 2011-11-18 |   2.030130 |
| 2011-10-04   | 2011-11-18 |   2.030130 |
| 2011-10-05   | 2011-11-18 |   2.030130 |
| 2011-10-06   | 2011-11-18 |   2.030130 |
| 2011-10-07   | 2011-11-18 |   2.030130 |
| 2011-10-10   | 2011-11-18 |   2.030130 |
| 2011-10-11   | 2011-11-18 |   2.030130 |
| 2011-10-12   | 2011-11-18 |   2.030130 |
| 2011-10-13   | 2011-11-18 |   2.030130 |
| 2011-10-14   | 2011-11-18 |   2.030130 |
| 2011-08-09   | 2011-11-18 |   1.628250 |
| 2011-08-10   | 2011-11-18 |   1.628250 |
| 2011-08-11   | 2011-11-18 |   1.628250 |
| 2011-08-12   | 2011-11-18 |   1.628250 |
| 2011-08-15   | 2011-11-18 |   1.628250 |
| 2011-08-16   | 2011-11-18 |   1.628250 |
| 2011-08-17   | 2011-11-18 |   1.628250 |
| 2011-08-18   | 2011-11-18 |   1.628250 |
| 2011-08-19   | 2011-11-18 |   1.628250 |
| 2011-08-22   | 2011-11-18 |   1.628250 |
| 2011-08-24   | 2011-11-18 |   1.628250 |
| 2011-08-25   | 2011-11-18 |   1.628250 |
| 2011-08-26   | 2011-11-18 |   1.628250 |
| 2011-08-29   | 2011-11-18 |   1.628250 |
| 2011-08-30   | 2011-11-18 |   1.628250 |
| 2011-08-31   | 2011-11-18 |   1.628250 |
| 2011-09-01   | 2011-11-18 |   1.628250 |
| 2011-09-02   | 2011-11-18 |   1.628250 |
| 2011-09-06   | 2011-11-18 |   1.628250 |
| 2011-09-07   | 2011-11-18 |   1.628250 |
| 2011-09-08   | 2011-11-18 |   1.628250 |
| 2011-09-09   | 2011-11-18 |   1.628250 |
| 2011-09-13   | 2011-11-18 |   1.628250 |
| 2011-09-14   | 2011-11-18 |   1.628250 |
| 2011-09-15   | 2011-11-18 |   1.628250 |
| 2011-09-16   | 2011-11-18 |   1.628250 |
| 2011-09-19   | 2011-11-18 |   1.628250 |
| 2011-09-20   | 2011-11-18 |   1.628250 |
| 2011-09-21   | 2011-11-18 |   1.628250 |
| 2011-09-22   | 2011-11-18 |   1.628250 |
| 2011-09-23   | 2011-11-18 |   1.628250 |
| 2011-09-26   | 2011-11-18 |   1.628250 |
| 2011-09-27   | 2011-11-18 |   1.628250 |
| 2011-09-28   | 2011-11-18 |   1.628250 |
| 2011-09-29   | 2011-11-18 |   1.628250 |
| 2011-09-30   | 2011-11-18 |   1.628250 |
| 2011-10-03   | 2011-11-18 |   1.628250 |
| 2011-10-04   | 2011-11-18 |   1.628250 |
| 2011-10-05   | 2011-11-18 |   1.628250 |
| 2011-10-06   | 2011-11-18 |   1.628250 |
| 2011-10-07   | 2011-11-18 |   1.628250 |
| 2011-10-10   | 2011-11-18 |   1.628250 |
| 2011-10-11   | 2011-11-18 |   1.628250 |
| 2011-10-12   | 2011-11-18 |   1.628250 |
| 2011-10-13   | 2011-11-18 |   1.628250 |
| 2011-10-14   | 2011-11-18 |   1.628250 |
| 2011-08-09   | 2011-11-18 |   1.254390 |
| 2011-08-10   | 2011-11-18 |   1.254390 |
| 2011-08-11   | 2011-11-18 |   1.254390 |
| 2011-08-12   | 2011-11-18 |   1.254390 |
| 2011-08-15   | 2011-11-18 |   1.254390 |
| 2011-08-16   | 2011-11-18 |   1.254390 |
| 2011-08-17   | 2011-11-18 |   1.254390 |
| 2011-08-18   | 2011-11-18 |   1.254390 |
| 2011-08-19   | 2011-11-18 |   1.254390 |
| 2011-08-22   | 2011-11-18 |   1.254390 |
| 2011-08-24   | 2011-11-18 |   1.254390 |
| 2011-08-25   | 2011-11-18 |   1.254390 |
| 2011-08-26   | 2011-11-18 |   1.254390 |
| 2011-08-29   | 2011-11-18 |   1.254390 |
| 2011-08-30   | 2011-11-18 |   1.254390 |
| 2011-08-31   | 2011-11-18 |   1.254390 |
| 2011-09-01   | 2011-11-18 |   1.254390 |
| 2011-09-02   | 2011-11-18 |   1.254390 |
| 2011-09-06   | 2011-11-18 |   1.254390 |
| 2011-09-07   | 2011-11-18 |   1.254390 |
| 2011-09-08   | 2011-11-18 |   1.254390 |
| 2011-09-09   | 2011-11-18 |   1.254390 |
| 2011-09-13   | 2011-11-18 |   1.254390 |
| 2011-09-14   | 2011-11-18 |   1.254390 |
| 2011-09-15   | 2011-11-18 |   1.254390 |
| 2011-09-16   | 2011-11-18 |   1.254390 |
| 2011-09-19   | 2011-11-18 |   1.254390 |
| 2011-09-20   | 2011-11-18 |   1.254390 |
| 2011-09-21   | 2011-11-18 |   1.254390 |
| 2011-09-22   | 2011-11-18 |   1.254390 |
| 2011-09-23   | 2011-11-18 |   1.254390 |
| 2011-09-26   | 2011-11-18 |   1.254390 |
| 2011-09-27   | 2011-11-18 |   1.254390 |
| 2011-09-28   | 2011-11-18 |   1.254390 |
| 2011-09-29   | 2011-11-18 |   1.254390 |
| 2011-09-30   | 2011-11-18 |   1.254390 |
| 2011-10-03   | 2011-11-18 |   1.254390 |
| 2011-10-04   | 2011-11-18 |   1.254390 |
| 2011-10-05   | 2011-11-18 |   1.254390 |
| 2011-10-06   | 2011-11-18 |   1.254390 |
| 2011-10-07   | 2011-11-18 |   1.254390 |
| 2011-10-10   | 2011-11-18 |   1.254390 |
| 2011-10-11   | 2011-11-18 |   1.254390 |
| 2011-10-12   | 2011-11-18 |   1.254390 |
| 2011-10-13   | 2011-11-18 |   1.254390 |
| 2011-10-14   | 2011-11-18 |   1.254390 |
| 2011-08-09   | 2011-11-18 |   1.019710 |
| 2011-08-10   | 2011-11-18 |   1.019710 |
| 2011-08-11   | 2011-11-18 |   1.019710 |
| 2011-08-12   | 2011-11-18 |   1.019710 |
| 2011-08-15   | 2011-11-18 |   1.019710 |
| 2011-08-16   | 2011-11-18 |   1.019710 |
| 2011-08-17   | 2011-11-18 |   1.019710 |
| 2011-08-18   | 2011-11-18 |   1.019710 |
| 2011-08-19   | 2011-11-18 |   1.019710 |

all I was hoping to get was the below 2 columns from table 1 and the 3rd column being col_d from table 2. hence I was expecting 45 rows.

+--------------+------------+
| 2011-08-09   | 2013-01-18 |
| 2011-08-10   | 2013-01-18 |
| 2011-08-11   | 2013-01-18 |
| 2011-08-12   | 2013-01-18 |
| 2011-08-15   | 2013-01-18 |
| 2011-08-16   | 2013-01-18 |
| 2011-08-17   | 2013-01-18 |
| 2011-08-18   | 2013-01-18 |
| 2011-08-19   | 2013-01-18 |
| 2011-08-22   | 2013-01-18 |
| 2011-08-24   | 2013-01-18 |
| 2011-08-25   | 2013-01-18 |
| 2011-08-26   | 2013-01-18 |
| 2011-08-29   | 2013-01-18 |
| 2011-08-30   | 2013-01-18 |
| 2011-08-31   | 2013-01-18 |
| 2011-09-01   | 2013-01-18 |
| 2011-09-02   | 2013-01-18 |
| 2011-09-06   | 2013-01-18 |
| 2011-09-07   | 2013-01-18 |
| 2011-09-08   | 2013-01-18 |
| 2011-09-09   | 2013-01-18 |
| 2011-09-13   | 2013-01-18 |
| 2011-09-14   | 2013-01-18 |
| 2011-09-15   | 2013-01-18 |
| 2011-09-16   | 2013-01-18 |
| 2011-09-20   | 2013-01-18 |
| 2011-09-21   | 2013-01-18 |
| 2011-09-22   | 2013-01-18 |
| 2011-09-23   | 2013-01-18 |
| 2011-09-26   | 2013-01-18 |
| 2011-09-27   | 2013-01-18 |
| 2011-09-28   | 2013-01-18 |
| 2011-09-29   | 2013-01-18 |
| 2011-09-30   | 2013-01-18 |
| 2011-10-03   | 2013-01-18 |
| 2011-10-04   | 2013-01-18 |
| 2011-10-05   | 2013-01-18 |
| 2011-10-06   | 2013-01-18 |
| 2011-10-07   | 2013-01-18 |
| 2011-10-10   | 2013-01-18 |
| 2011-10-11   | 2013-01-18 |
| 2011-10-12   | 2013-01-18 |
| 2011-10-13   | 2013-01-18 |
| 2011-10-14   | 2013-01-18 |
+--------------+------------+
45 rows in set (0.02 sec)

adding the GROUP BY v_dt, e_dt has brought it down to the 45 rows, and columns of table1 are correct. only problem is it is now showing the same value (5.530000) for table2.col_d which is not what it should be :-(

+--------------+------------+----------+
| 2011-08-09   | 2013-01-18 | 5.530000 |
| 2011-08-10   | 2013-01-18 | 5.530000 |
| 2011-08-11   | 2013-01-18 | 5.530000 |
| 2011-08-12   | 2013-01-18 | 5.530000 |
| 2011-08-15   | 2013-01-18 | 5.530000 |
| 2011-08-16   | 2013-01-18 | 5.530000 |
| 2011-08-17   | 2013-01-18 | 5.530000 |
| 2011-08-18   | 2013-01-18 | 5.530000 |
| 2011-08-19   | 2013-01-18 | 5.530000 |
| 2011-08-22   | 2013-01-18 | 5.530000 |
| 2011-08-24   | 2013-01-18 | 5.530000 |
| 2011-08-25   | 2013-01-18 | 5.530000 |
| 2011-08-26   | 2013-01-18 | 5.530000 |
| 2011-08-29   | 2013-01-18 | 5.530000 |
| 2011-08-30   | 2013-01-18 | 5.530000 |
| 2011-08-31   | 2013-01-18 | 5.530000 |
| 2011-09-01   | 2013-01-18 | 5.530000 |
| 2011-09-02   | 2013-01-18 | 5.530000 |
| 2011-09-06   | 2013-01-18 | 5.530000 |
| 2011-09-07   | 2013-01-18 | 5.530000 |
| 2011-09-08   | 2013-01-18 | 5.530000 |
| 2011-09-09   | 2013-01-18 | 5.530000 |
| 2011-09-13   | 2013-01-18 | 5.530000 |
| 2011-09-14   | 2013-01-18 | 5.530000 |
| 2011-09-15   | 2013-01-18 | 5.530000 |
| 2011-09-16   | 2013-01-18 | 5.530000 |
| 2011-09-20   | 2013-01-18 | 5.530000 |
| 2011-09-21   | 2013-01-18 | 5.530000 |
| 2011-09-22   | 2013-01-18 | 5.530000 |
| 2011-09-23   | 2013-01-18 | 5.530000 |
| 2011-09-26   | 2013-01-18 | 5.530000 |
| 2011-09-27   | 2013-01-18 | 5.530000 |
| 2011-09-28   | 2013-01-18 | 5.530000 |
| 2011-09-29   | 2013-01-18 | 5.530000 |
| 2011-09-30   | 2013-01-18 | 5开发者_JAVA百科.530000 |
| 2011-10-03   | 2013-01-18 | 5.530000 |
| 2011-10-04   | 2013-01-18 | 5.530000 |
| 2011-10-05   | 2013-01-18 | 5.530000 |
| 2011-10-06   | 2013-01-18 | 5.530000 |
| 2011-10-07   | 2013-01-18 | 5.530000 |
| 2011-10-10   | 2013-01-18 | 5.530000 |
| 2011-10-11   | 2013-01-18 | 5.530000 |
| 2011-10-12   | 2013-01-18 | 5.530000 |
| 2011-10-13   | 2013-01-18 | 5.530000 |
| 2011-10-14   | 2013-01-18 | 5.530000 |
+--------------+------------+----------+

this is the updated query. select table1.v_dt, table1.e_dt, table2.col_b from table1 inner join table2 on table1.symbol=table2.symbol group by v_dt, e_dt; to narrow the results further, I have also run

select table1.v_dt, table1.e_dt, table2.col_b from table1 inner join table2 on table1.symbol='P00055000' group by v_dt, e_dt; but I still get the same results, and then

select symbol, count(*) from table2 where symbol='P00055000' group by symbol;

+--------------------+----------+
| P00055000 |       40 |
+--------------------+----------+
1 row in set (0.02 sec)


If you have two tables A and B:

  • A LEFT JOIN B = Select all the rows A and any rows from B that match the join condition, or NULL if B does not match the join condition
  • A RIGHT JOIN B = Select all the rows B and any rows from A that match the join condition, or NULL if A does not match the join condition
  • A INNER JOIN B = Select only the rows from A and B that match the join condition

For any column selected in the LEFT/RIGHT joins on the join table, a NULL value will be present if the row does not match the join condition.


For example say we have two tables

Table A (id, name, description)
Table B (id, a_id, group_id, date)

When we run a LEFT JOIN below we expect that the results from the query would return NULL for B.group_id and B.date if the no record in the B table could be found.

SELECT A.id, A.name, A.description, B.group_id, B.date FROM A
LEFT JOIN B ON B.a_id=A.id

Resulting in

id | name | description | group_id | date
1  | Test | Test        | 2        | 2011-3-4
2  | Test | Test        | NULL     | NULL

As you can see the first row successfully found a record in both A and B that matches the join conditions. On the other hand, row 2 could not find a record in B that matched so it added NULL values instead.


Let's look at a RIGHT JOIN. This is actually the inverse of a LEFT JOIN. We expect that all records from B will contain their data, but if a record in A does not match, it will CONTAIN NULLS.

SELECT A.id, A.name, A.description, B.group_id, B.date FROM A
RIGHT JOIN B ON B.a_id=A.id

Resulting in

id   | name | description | group_id | date
1    | Test | Test        | 2        | 2011-3-4
NULL | NULL | NULL        | 3        | 2011-5-6

As you can see the first row successfully found a record in both A and B that matches the join conditions (same as the LEFT JOIN). On the other hand, row 2 could not find a record in A that matched so it added NULL values instead.


Finally let's look at the INNER JOIN. An INNER JOIN is typically the most useful join in that it returns only the records that match in both A and B, which is more often what you are looking for

SELECT A.id, A.name, A.description, B.group_id, B.date FROM A
INNER JOIN B ON B.a_id=A.id

Resulting in

id   | name | description | group_id | date
1    | Test | Test        | 2        | 2011-3-4

Now we only return the record that matches in both A and B and ignore all the rest. Hope this clears things up for you.


I think you're looking for inner join. Left join returns all the records from table1 and either the matching table2 values or nulls if there is no matching table2.

Inner join returns each matching pair (based on your join conditions) from the two sets, but does not return any records where there are no matches.

http://en.wikipedia.org/wiki/Join_(SQL)

The reason you're getting so many records is because a join returns the Cartesian product of the two tables. As an example, if you have a table with 10 records and another with 20 records, you end up with 200 records. This is obviously useless on its own, so you add a join condition to only return the records that logically go together, generally based on a foreign key relationship. So, for:

Table1 - Table1Key, Table1Value
Table2 - Table2Key, Table1KeyReference, Table2Value

and the values

Table1
1,'1'
2,'2'
3,'3'
4,'4'
5,'5'

Table2
1,1,'1-1'
2,1,'1-2'
3,2,'2-3'

With that data, the following statement

select *
    from Table1
        inner join Table2

returns

Table1Key,Table1Value,Table2Key,Table1KeyReference, Table2Value
1,'1',1,1,'1-1'
1,'1',2,1,'1-2'
1,'1',3,2,'2-3'
2,'2',1,1,'1-1'
2,'2',2,1,'1-2'
2,'2',3,2,'2-3'
....

Etc. From these 15 combinations, only are "correct" so we need to add a join condition:

select *
    from Table1
        inner join Table2 on(Table1.Table1Key=Table2.Table1KeyReference)

Which returns:

1,'1',1,1,'1-1'
1,'1',2,1,'1-2'
2,'2',3,2,'2-3'

The reason this returns only the 3 in the Table2 is that each row in Table2 corresponds to 1 and only row in Table1.

The problem in your case is that you haven't defined that relationship between the two tables. If there are 40 P00055000 symbols in your second table and there are 2 with that symbol in your first, then you'll get 80 results from your join. If there are then 40 different symbols, each with an additional 40 rows in the second table and 2 in the first, then you'll end up with 3200 rows, etc. A lot of this may be meaningless data, and this is because you haven't strongly defined what the relationship is between the two tables.

So, that all said, what results do you want when you do your select? For a given row in Table1, if you joined it with every row from table 2, which rows would you want back? From there, you should be able to craft the correct join condition. If these two rows are tightly related (parent child, etc.) you should consider re-factoring the tables to join on primary/foreign key relationships and remove duplicate data.

As an aside, I'm assuming these aren't your actual table and field names. If they are, change them to something meaningful. The length of your column names doesn't impact query performance, but makes sharing, discussing, and modifying your code infinitely easier.

0

精彩评论

暂无评论...
验证码 换一张
取 消