SQL logic: two tables (group by?)

Developer https://www.devze.com 2022-12-12 16:25 Source: Internet
I have two tables, a table of movies (mt, the primary table), and a table listing where those movies are available (st, the secondary table). For each mt record, there are multiple st records. I need a query which will join the records, and allow me to run queries on both tables. I currently use an inner join like this:

SELECT * FROM
(
    SELECT ROW_NUMBER() OVER(ORDER BY " + orderField + @") AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           [...]
           st.ID AS st_ID,
           st.title AS st_title,
           [...]
    FROM mt AS mt 
    st AS st
    INNER JOIN sttable AS st on mt.ID =st.ID
    WHERE st.title=@variable <> 0 AND mt.title = @variable
)    
    as DerivedTableName
    WHERE RowNum between 
    ((@pageIndex - 1) * @pageSize + 1) and @pageIndex*@pageSize

The problem with this is that I need to be able to loop through mt using RowNum and pageIndex, and the query returns more than one record for each movie (for example, if there are 8 records for a particular movie, 8, not 1, records are returned). I have tried using GROUP BY, but the problem with that is that it will not allow me to perform queries on fields in the subordinate table (st).

Any help as to the appropriate logic would be very much appreciated.


Here's a solution that does a GROUP BY in the derived table subquery, so you get only one row per movie title, and calculates the ROW_NUMBER() from that.

Then join the result of the derived table to st again in the outer query. There will still be multiple rows per movie title, but the RowNum will repeat so you can filter for your @pageIndex correctly.

SELECT * FROM
(
    SELECT ROW_NUMBER() OVER(ORDER BY " + orderField + @") AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           [...]
    FROM mt AS mt 
    INNER JOIN sttable AS st ON mt.ID =st.ID
    WHERE mt.title = @variable
    GROUP BY mt.ID, mt.title -- plus any other mt columns in the select-list
) mt1
INNER JOIN sttable AS st1 ON (mt1.mt_ID = st1.ID)
WHERE mt1.RowNum BETWEEN 
    ((@pageIndex - 1) * @pageSize + 1) AND @pageIndex*@pageSize;
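To see how the repeated RowNum behaves, here is a minimal sketch that runs a query of the same shape against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) only stands in for SQL Server here. The sample rows, the fixed ORDER BY mt.title in place of orderField, the missing title filter, and the names make_db and fetch_page are all assumptions for illustration. The join on mt.ID = st.ID is copied from the query above.

```python
import sqlite3

def make_db():
    """Build a tiny in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mt (ID INTEGER PRIMARY KEY, title TEXT);
        -- Assumption: sttable.ID holds the ID of the movie it belongs to,
        -- because the query above joins on mt.ID = st.ID.
        CREATE TABLE sttable (ID INTEGER, title TEXT);
        INSERT INTO mt VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
        INSERT INTO sttable VALUES
            (1, 'Store A'), (1, 'Store B'), (1, 'Store C'),
            (2, 'Store A'),
            (3, 'Store B'), (3, 'Store C');
    """)
    return conn

# One RowNum per movie (derived table), then join back to sttable so the
# RowNum repeats on every availability row of that movie.
PAGE_QUERY = """
SELECT mt1.RowNum, mt1.mt_ID, mt1.mt_title, st1.title AS st_title
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY mt.title) AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title
    FROM mt
    INNER JOIN sttable AS st ON mt.ID = st.ID
    GROUP BY mt.ID, mt.title
) AS mt1
INNER JOIN sttable AS st1 ON mt1.mt_ID = st1.ID
WHERE mt1.RowNum BETWEEN (:page_index - 1) * :page_size + 1
                     AND :page_index * :page_size
ORDER BY mt1.RowNum, st1.title
"""

def fetch_page(conn, page_index, page_size):
    """Return all availability rows for the movies on the requested page."""
    params = {"page_index": page_index, "page_size": page_size}
    return conn.execute(PAGE_QUERY, params).fetchall()

conn = make_db()
for row in fetch_page(conn, page_index=1, page_size=2):
    print(row)
```

With a page size of 2, the first page contains two movies but four rows, because the page boundary is applied to movies rather than to joined rows.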

You'll have to loop over the result of the outer query in your display code and begin a new row of output whenever the value RowNum changes. This is a pretty obvious technique.
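As a rough sketch of that display loop, here it is in Python. The language of your display code isn't stated, so this is only an illustration. The row layout (RowNum, mt_title, st_title) and the function name group_rows_by_rownum are assumptions, and the rows are made up.

```python
from itertools import groupby
from operator import itemgetter

def group_rows_by_rownum(rows):
    """Collapse (RowNum, mt_title, st_title) rows into one entry per movie.

    groupby only merges adjacent items, so rows sharing a RowNum must
    arrive next to each other (sort the query result by RowNum first).
    """
    movies = []
    for row_num, group in groupby(rows, key=itemgetter(0)):
        group = list(group)
        movies.append({
            "row_num": row_num,
            "title": group[0][1],  # same for every row in the group
            "available_at": [r[2] for r in group],
        })
    return movies

# Hypothetical query result: RowNum repeats for each availability row.
rows = [
    (1, "Alien", "Store A"),
    (1, "Alien", "Store B"),
    (2, "Brazil", "Store A"),
]

for movie in group_rows_by_rownum(rows):
    places = ", ".join(movie["available_at"])
    print(f"{movie['row_num']}. {movie['title']}: {places}")
```

A new output row starts each time RowNum changes, which is all the loop needs to know about the paging logic.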

If you instead want to do something like MySQL's GROUP_CONCAT() function, this is tricky in Microsoft SQL Server (I assume you're using Microsoft).

See blogs like http://blog.shlomoid.com/2008/11/emulating-mysqls-groupconcat-function.html that describe use of the FOR XML PATH('') trick.
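For reference, this is the kind of one-row-per-movie result that GROUP_CONCAT() gives you. The sketch below uses SQLite's built-in group_concat() from Python purely to show the output shape. It is not SQL Server syntax, and the tables, rows, and the function name movies_with_locations are made up for illustration.

```python
import sqlite3

def movies_with_locations(conn):
    """Return one row per movie with its locations joined into one string."""
    return conn.execute("""
        SELECT mt.ID, mt.title, group_concat(st.title, ', ') AS locations
        FROM mt
        INNER JOIN sttable AS st ON mt.ID = st.ID
        GROUP BY mt.ID, mt.title
        ORDER BY mt.title
    """).fetchall()

demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE mt (ID INTEGER PRIMARY KEY, title TEXT);
    -- Assumption: sttable.ID is the ID of the movie the row belongs to.
    CREATE TABLE sttable (ID INTEGER, title TEXT);
    INSERT INTO mt VALUES (1, 'Alien'), (2, 'Brazil');
    INSERT INTO sttable VALUES (1, 'Store A'), (1, 'Store B'), (2, 'Store A');
""")

# The order of the concatenated values within each string is not guaranteed.
for movie_id, title, locations in movies_with_locations(demo):
    print(movie_id, title, locations)
```

With one row per movie, RowNum paging works directly and no grouping loop is needed in the display code.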

PS: Your sample SQL query didn't make sense given your verbal description, so I did my best to write something sensible. No guarantees it matches your schema.


Re your comment: I don't think you need to order by any column of st inside the subquery. I intended there to be no columns from st included in the select-list of the subquery. The only reason there's an instance of st joined in the subquery is to restrict the rows of mt to those that have matching rows in st. But the GROUP BY mt.ID makes sure there's only one row per row of mt (actually since this is SQL Server and not MySQL you'll need to name all the mt columns of the select-list in the GROUP BY clause).


Re your second comment:

"I want to first display mt rows that have corresponding st records that have been most recently added"

You can add other columns to the grouped query if you wrap them in aggregate functions. For instance, the latest date_added per mt row is MAX(st.date_added), and you can add this column to the subquery.

However, don't use ORDER BY in the subquery. There's seldom any reason to sort a subquery, since the order may be altered anyway by using the subquery result in the JOIN or other operations you'd use a subquery in.

You should sort in the outer query:

SELECT * FROM
(
    SELECT ROW_NUMBER() OVER(ORDER BY " + orderField + @") AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           [...] -- other mt.* columns
           MAX(st.date_added) AS latest_date_added
    FROM mt AS mt 
    INNER JOIN sttable AS st ON mt.ID =st.ID
    WHERE mt.title = @variable
    GROUP BY mt.ID, mt.title -- plus the other mt.* columns
) mt1
INNER JOIN sttable AS st1 ON (mt1.mt_ID = st1.ID)
WHERE mt1.RowNum BETWEEN 
    ((@pageIndex - 1) * @pageSize + 1) AND @pageIndex*@pageSize
ORDER BY mt1.latest_date_added DESC, st1.date_added DESC;
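Here is a small runnable sketch of that query shape, again using an in-memory SQLite database from Python as a stand-in for SQL Server. The date_added values (ISO strings, so they sort correctly as text), the sample rows, the fixed ORDER BY mt.title in place of orderField, and the function name fetch_latest_first are assumptions. The title filter is left out to keep the example short.

```python
import sqlite3

LATEST_FIRST_QUERY = """
SELECT mt1.mt_title, mt1.latest_date_added, st1.title, st1.date_added
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY mt.title) AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           MAX(st.date_added) AS latest_date_added
    FROM mt
    INNER JOIN sttable AS st ON mt.ID = st.ID
    GROUP BY mt.ID, mt.title
) AS mt1
INNER JOIN sttable AS st1 ON mt1.mt_ID = st1.ID
WHERE mt1.RowNum BETWEEN (:page_index - 1) * :page_size + 1
                     AND :page_index * :page_size
-- Sorting happens here, in the outer query, not in the subquery.
ORDER BY mt1.latest_date_added DESC, st1.date_added DESC
"""

def fetch_latest_first(conn, page_index, page_size):
    """Return one page of rows, movies with the newest st record first."""
    params = {"page_index": page_index, "page_size": page_size}
    return conn.execute(LATEST_FIRST_QUERY, params).fetchall()

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE mt (ID INTEGER PRIMARY KEY, title TEXT);
    -- Assumption: sttable.ID is the ID of the movie the row belongs to.
    CREATE TABLE sttable (ID INTEGER, title TEXT, date_added TEXT);
    INSERT INTO mt VALUES (1, 'Alien'), (2, 'Brazil');
    INSERT INTO sttable VALUES
        (1, 'Store A', '2010-01-05'), (1, 'Store B', '2010-03-01'),
        (2, 'Store A', '2010-04-10'), (2, 'Store C', '2010-02-20');
""")

for row in fetch_latest_first(db, page_index=1, page_size=10):
    print(row)
```

Note that the outer ORDER BY only sorts the rows within the selected page. Which movies land on which page is still decided by the ordering used for ROW_NUMBER().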


I don't think you need to have both this:

FROM mt AS mt 
  st AS st

and this:

INNER JOIN sttable AS st on mt.ID = st.ID