开发者

How to JOIN to a table that has multiple values in the column?

开发者 https://www.devze.com 2023-01-24 01:21 出处:网络
I\'ve got a person table that contains an error code field that can contain multiple error codes (001, 002, 003...). I know that\'s a schema problem but this is a vendor application and I have no cont

I've got a person table that contains an error code field that can contain multiple error codes (001, 002, 003...). I know that's a schema problem but this is a vendor application and I have no control over the schema, so I have to work with what I've got.

There is also a Error table that contains ErrorCode (char(3)) and Descript (char(1000)). In my query the Person.ErrorCode is joined to the Error.ErrorCode to get the value of the corresponding description.

For person records where there is only one error code, I can get the corresponding Descript with no problem. What I'm trying to do is somehow concat the Descript values for records where there are multiple errors.

For example, here's some sample data from Error table:

ErrorCode     Descript
001           Problem with person file
002           Problem with address file
003           Problem with grade

Here are the columns resulting from my SELECT on Person with a JOIN on E开发者_StackOverflow中文版rror:

Person.RecID   Person.ErrorCode  Error.Descript
12345          001               Problem with person file
12346          003               Problem with grade
12347          002,003

What I'm trying to get is this:

Person.RecID   Person.ErrorCode  Error.Descript
12345          001               Problem with person file
12346          003               Problem with grade
12347          002,003           Problem with address file, Problem with grade

Suggestions appreciated!


You should see: "Arrays and Lists in SQL Server 2005 and Beyond, When Table Value Parameters Do Not Cut it" by Erland Sommarskog, then there are many ways to split string in SQL Server. This article covers the PROs and CONs of just about every method. in general, you need to create a split function. This is how a split function can be used to join rows:

SELECT
    * 
    FROM dbo.yourSplitFunction(@Parameter) b
        INNER JOIN YourCodesTable          c ON b.ListValue=c.CodeValue

I prefer the number table approach to split a string in TSQL but there are numerous ways to split strings in SQL Server, see the previous link, which explains the PROs and CONs of each.

For the Numbers Table method to work, you need to do this one time table setup, which will create a table Numbers that contains rows from 1 to 10,000:

SELECT TOP 10000 IDENTITY(int,1,1) AS Number
    INTO Numbers
    FROM sys.objects s1
    CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)

Once the Numbers table is set up, create this split function:

CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn  char(1)      --REQUIRED, the character to split the @List string on
    ,@List     varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN 
(

    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
        FROM (SELECT
                  LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
                  FROM (
                           SELECT @SplitOn + @List + @SplitOn AS List2
                       ) AS dt
                      INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
                  WHERE SUBSTRING(List2, number, 1) = @SplitOn
             ) dt2
        WHERE ListValue IS NOT NULL AND ListValue!=''

);
GO 

You can now easily split a CSV string into a table and join on it:

DECLARE @ErrorCode table (ErrorCode varchar(20), Description varchar(30))
INSERT @ErrorCode VALUES ('001','Problem with person file')
INSERT @ErrorCode VALUES ('002','Problem with address file')
INSERT @ErrorCode VALUES ('003','Problem with grade')

DECLARE @Person table (RecID int, ErrorCode varchar(20))
INSERT @Person VALUES (12345 ,'001'    )
INSERT @Person VALUES (12346 ,'003'    )
INSERT @Person VALUES (12347 ,'002,003')


SELECT
    p.RecID,c.ListValue,e.Description
    FROM @Person                                        p
        CROSS APPLY dbo.FN_ListToTable(',',p.ErrorCode) c
        INNER JOIN @ErrorCode                           e ON c.ListValue=e.ErrorCode

OUTPUT:

RecID       ListValue     Description              
----------- ------------- -------------------------
12345       001           Problem with person file 
12346       003           Problem with grade       
12347       002           Problem with address file
12347       003           Problem with grade       

(4 row(s) affected)

you can use the XML trick to concatenate the rows back together:

SELECT
    t1.RecID,t1.ErrorCode
        ,STUFF(
                   (SELECT
                        ', ' + e.Description
                        FROM @Person                                        p
                            CROSS APPLY dbo.FN_ListToTable(',',p.ErrorCode) c
                            INNER JOIN @ErrorCode                           e ON c.ListValue=e.ErrorCode
                        WHERE t1.RecID=p.RecID
                        ORDER BY p.ErrorCode
                        FOR XML PATH(''), TYPE
                   ).value('.','varchar(max)')
                   ,1,2, ''
              ) AS ChildValues
    FROM @Person t1
    GROUP BY t1.RecID,t1.ErrorCode

OUTPUT:

RecID       ErrorCode            ChildValues
----------- -------------------- -----------------------------------------------
12345       001                  Problem with person file
12346       003                  Problem with grade
12347       002,003              Problem with address file, Problem with grade

(3 row(s) affected)

This returns the same result set as above, but may perform better:

SELECT
    t1.RecID,t1.ErrorCode
        ,STUFF(
                   (SELECT
                        ', ' + e.Description
                        FROM (SELECT ListValue FROM dbo.FN_ListToTable(',',t1.ErrorCode)) c
                            INNER JOIN @ErrorCode e ON c.ListValue=e.ErrorCode
                        ORDER BY c.ListValue
                        FOR XML PATH(''), TYPE
                   ).value('.','varchar(max)')
                   ,1,2, ''
              ) AS ChildValues
    FROM @Person t1
    GROUP BY t1.RecID,t1.ErrorCode


Denormalize person.errorcode before the join with error.errorcode

I don't mean denormalize on the table level, I mean with a view or sql code.


You can use a Common Table Expression to pretend that the person table is normal:

;WITH PersonPrime as (
    SELECT RecID,ErrorCode,CAST(null as varchar(100)) as Remain from Person where Value not like '%,%'
    UNION ALL
    SELECT RecID,SUBSTRING(ErrorCode,1,CHARINDEX(',',ErrorCode)-1),SUBSTRING(ErrorCode,CHARINDEX(',',ErrorCode)+1,100) from Person where Value like '%,%'
    UNION ALL
    SELECT RecID,Remain,null FROM PersonPrime where Remain not like '%,%'
    UNION ALL
    SELECT RecID,SUBSTRING(Remain,1,CHARINDEX(',',Remain)-1),SUBSTRING(Remain,CHARINDEX(',',Remain)+1,100) from PersonPrime where Remain like '%,%'
)
SELECT RecID,ErrorCode from PersonPrime

And now use PersonPrime where you'd have used Person in your original query. You'll want to CAST that null to a varchar column as wide as ErrorCode in the Person table


Concatenating the error descriptions might not be the way to go. It adds unnecessary complexity to the SQL statement that will be extremely problematic to debug. Any future additions or changes to the SQL will also be difficult. Your best bet is to generate a resultset that is normalized, even though your schema is not.

SELECT Person.RecID, Person.ErrorCode, Error.ErrorCode, Error.Descript
  FROM Person INNER JOIN Error
    ON REPLACE(Person.ErrorCode, ' ', '') LIKE '%,' + CONVERT(VARCHAR,Error.ErrorCode) + ',%'

If a person has multiple error codes set, then this will return one row for each error specified (ignoring duplicates). Using your example, it will return this.

Person.RecID   Person.ErrorCode   Error.ErrorCode   Error.Descript
12345          001                001               Problem with person file
12346          003                003               Problem with grade
12347          002,003            002               Problem with address file
12347          002,003            003               Problem with grade


By grouping the errors together and concatenating them is an option:

SELECT *, GROUP_CONCAT(Person.ErrorCode) FROM Person
GROUP BY Person.RecID
0

精彩评论

暂无评论...
验证码 换一张
取 消