开发者

relationships produce duplicate records on query, DISTINCT does not work, any other solutions available?

开发者 https://www.devze.com 2023-03-27 01:48 出处:网络
I am new to this portal. I have a very simple problem to be solved. It is related to the ANSI SQL. I am writing a reports using BIRT and I am fetching the data from several tables. I understand how th

I am new to this portal. I have a very simple problem to be solved. It is related to the ANSI SQL. I am writing a reports using BIRT and I am fetching the data from several tables. I understand how the SQL joins work but maybe not fully. I researched google for hours and I could not find relevant answer.

My problem is that one of the relationships in the code produce a duplicate result (the same row is copied - duplicated). I was so determined to solve it I used every type of join available. Some of this SQL was produced already. I shall post my code below. I know that one of the solutions to my problem is use of the 'DISTINCT' keyword. I have used it and it does not solve my problem.

Can anyone propose any solution to that?

Sample code:

SELECT DISTINCT
   partmaster.partdesc,
   partmaster.uom,
   traders.name AS tradername,
   worksorders.id AS worksorderno,
   worksorders.partid,
   worksorders.quantity,
   worksorders.duedate,
   worksorders.traderid,
   worksorders.orderid,
   routingoperations.partid,
   routingoperations.methodid,
   routingoperations.operationnumber,
   routingoperations.workcentreid,
   routingoperations.settime,
   routingoperations.runtime,
   routingoperations.perquantity,
   routingoperations.description,
   routingoperations.alternativeoperation,
   routingoperations.alternativeoperationpreference,
   machines.macdesc,
   machines.msection,
   allpartmaster.partnum,
   allpartmaster.nbq,
   allpartmaster.partdesc,
   routingoperationtools.toolid,
   tools.tooldesc,
   CAST (emediadetails.data as VARCHAR(MAX)) AS cplandata
FROM worksorders
INNER JOIN partmaster ON worksorders.partid = partmaster.partnum
INNER JOIN traders traders ON worksorders.traderid = traders.id
INNER JOIN routingoperations routingoperations ON worksorders.partid = routingoperations.partid  
       AND worksorders.routingmethod = routingoperations.methodid
INNER JOIN allpartmaster allpartmaster ON routingoperations.partid = allpartmaster.partnum 
LEFT OUTER JOIN machines machines ON routingoperations.workcentreid = machines.macid
LEFT OUTER JOIN routingoperationtools routingoperationtools ON routingoperationtools.partid = routingoperations.partid 
           AND routingoperationtools.routingmethod开发者_JAVA技巧 = routingoperations.methodid 
           AND routingoperationtools.operationnumber = routingoperations.operationnumber     
LEFT OUTER JOIN tools tools ON tools.toolid = routingoperationtools.toolid 
LEFT OUTER JOIN emediadetails ON emediadetails.keyvalue1 = worksorders.id 
            AND emediadetails.keyvalue2 = routingoperations.operationnumber 
            AND emediadetails.emediaid = 'worksorderoperation' 

I do not have too much of the test data but I know that one row is copied twice as the result of the query below even tho I used DISTINCT keyword. I know that my problem is rather specific and not general but the solution that someone will propose may help others with the similar problem.


I can't solve your problem for you without some test data, but I have some helpful hints.

In principle, you should be really careful with DISTINCT - its a great way of hiding bugs in your query. Only use DISTINCT if you are confident that the underlying data contains legitimate duplicates. If your joins are wrong, and you're getting a cartesian product, you can remove the duplicates from the results with DISTINCT - but that doesn't stop the cartesian product being generated. You'll get very poor performance, and possibly incorrect data.

Secondly, I am pretty sure that DISTINCT works properly - you are almost certainly not getting duplicates, but it may be hard to spot the difference between two rows. Leading or trailing spaces in text columns, for instance could be to blame.

Finally, to work through this problem, I'd recommend building the query up join by join, and seeing where you get the duplicate - that's the join that's to blame.

So, start with:

SELECT 

                    traders.name AS tradername,
                    worksorders.id AS worksorderno,
                    worksorders.partid,
                    worksorders.quantity,
                    worksorders.duedate,
                    worksorders.traderid,
                    worksorders.orderid
                FROM worksorders
                    INNER JOIN traders traders ON
                        worksorders.traderid = traders.id

and build up to the next join.


Are you sure the results are exact duplicates? Makes sure there isn't one column that actually has a different value.

0

精彩评论

暂无评论...
验证码 换一张
取 消