I have 2 tables with lots of data that I need to join. The problem is that the 2 tables hold mostly the same data, and the join sometimes produces undesired, though not unexpected. results. Here is an example:
week_end_date nugly payroll_code rate hours check_number
--------------------------------------------------------------
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 530957
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 DD00000105382
week_end_date nugly trx_number pay_code hours rate
2010-01-17 AU9T8K ETS00000010771815 HRLY-W 40.00000 13.00000
2010-01-17 AU9T8K ETS00000010771684 HRLY-W 40.00000 13.00000
I'm looking to the the check # and the trx_number combined in the join, but I end up with a cross join because everything is the same that I'm joining on. For every case I have like this, I really don't care with trx_number ends up with which check #.
Any thoughts?
Here are the current results:
week_end_date nugly payroll_code rate hours check_number trx_number
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 DD00000105382 ETS00000010771815
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 530957 ETS00000010771815
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 DD00000105382 ETS00000010771684
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 530957 ETS00000010771684
What I'd like is:
week_end_date nugly payroll_code rate hours check_number trx_number
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 DD00000105382 ETS00000010771815
2010-01-17 AU9T8K HRLY-W 13.00000 40.00000 530957 ETS00000010771684
Where I don't really care which trx_number is with which check_number.
Here is my current query:
SELECT c.week_end_date, c.nugly, c.payroll_code, c.rate, c.hours, c.check_number, t.trx_number
FROM checksByNuglyPaycode c
LEFT OUTER JOIN trxNumByNuglyPaycode t ON c.db_id = t.db_id AND c.fridate = t.fridate
AND c.nugly = t.nugly AND c.trx_type = t.trx_type AND c.payroll_code = t.pay_code
AND c.hours = t.hours AND c.rate = t.rate AND c.week_end_date = t.week_end_date
WHERE t.db_id = 'lal' AND c.nugly = 'AU9T8K' AND c.payroll_code = 'HRLY-W'
AND c.fridate = '2010-01-2开发者_如何学Go2' AND c.week_end_date = '2010-01-17'
ORDER BY c.fridate, c.nugly, payroll_code
The where clause is obviously specifically for this case, in the final query, there will not be a where clause.
It's a guess, but could you use something like ROW_NUMBER to make a sort of identity field for each table and then join on them?
Something like
CREATE VIEW vwOrderedTable1
AS
SELECT ROW_NUMBER() OVER(ORDER BY week_end_date) AS 'RowNumber',
week_end_date,
nugly,
payroll_code...
FROM Table1
GO
CREATE VIEW vwOrderedTable2
AS
SELECT ROW_NUMBER() OVER(ORDER BY week_end_date) AS 'RowNumber',
week_end_date,
nugly,
'payroll_code' = pay_code...
FROM Table2
GO
SELECT *
FROM vwOrderedTable1
INNER JOIN vwOrderedTable2 ON vwOrderedTable1.RowNumber = vwOrderedTable2.RowNumber
What about something like the query you've already got as a sub-query wrapped up with:
SELECT MAX(trx_number), week_end_date, nugly, payroll_code, rate, hours, check_number
FROM
(
QUERY_YOU_ALREADY_HAVE
) tbl
GROUP BY week_end_date, nugly, payroll_code, rate, hours, check_number
Could you provide a bit more information? What's your current query look like? How are the tables structured? Are you trying to join on keys?
You mention you're "looking" to the check # and trx_number... are these the results?
Do you have an example of what would be returned for the lines you've provided?
I like Alex's answer, it seems solid.
Perhaps a silly comment, but if you don't care what trx_number you get, and in your sample query you're using a left join that implies you only want "all" the data from checksByNuglyPaycode, why are you joining the two tables? do you need all the data for both tables? More like a full outer join?
So if there is a record in checksByNuglyPaycode that is not in trxNumByNuglyPaycode you would need it?
And if there is a record in trxNumByNuglyPaycode and not in checksByNuglyPaycode you would need that?
Interesting aside, if you do want all the data from both tables you could use the select below. Note: it uses distinct which would likely be slow, and I think full outer joins can also be slow... it's been quite a while...
SELECT DISTINCT
ISNULL(c.week_end_date, t.week_end_date), ISNULL(c.nugly,t.nugly),
ISNULL(c.payroll_code,t.payroll_code), ISNULL(c.rate,t.rate),
ISNULL(c.hours, t.hours), c.check_number, t.trx_number
FROM
checksByNuglyPaycode c
FULL OUTER JOIN
trxNumByNuglyPaycode t ON c.db_id = t.db_id
AND c.fridate = t.fridate
AND c.nugly = t.nugly
AND c.trx_type = t.trx_type
AND c.payroll_code = t.pay_code
AND c.hours = t.hours
AND c.rate = t.rate
AND c.week_end_date = t.week_end_date
ORDER BY
c.fridate, c.nugly, payroll_code
精彩评论