How to write an SQL query that examines the previous row?_问答_开发者

I have table (already 开发者_JS百科sorted) in SQL server with the following:

stringname
==========
stringA
stringA
stringB
stringB
stringA
stringC
stringC
stringC
stringA

I would like to output the following:

stringname previous_stringname count
========== =================== =====
stringA    NULL                1
stringA    stringA             1
stringA    stringB             1
stringA    stringC             1
stringB    stringA             1
stringB    stringB             1
stringC    stringA             1
stringC    stringC             2

That is, for each stringname in the original table and for each of that stringname's previous entry, I would like to output it with the number of times it had each previous string (with NULL for the first entry).

How would I write an SQL query for this?

I am using SQL Server 2008.

Recursion is not necessary; just use:

select b.stringname as stringname, a.stringname as previous_stringname
    into #tmp
    from (select stringname, row_number() over (order by id /* $/0 */) as row from testing) a
    right outer join (select stringname, row_number() over (order by id /* $/0 */) as row from testing) b
    on a.row = b.row - 1;
select *, count(*) as [count] from #tmp group by stringname, previous_stringname;

The folllowing will do the trick, but in return for the answer I want you to Google "recursion" and see what Google suggests to search for :)

Clarification: The recursion happens by joining on the row - 1 to row between the temp table and the CTE. This method is dependent on having an independent IDENTITY column (ID in this case) and uses ROWNUMBER() to account for any possible gaps in IDs. Since ROW_NUMBER() can't be used in a JOIN I had to resort to using a subquery in the recursive part of the CTE. Even if you know you have continuous IDs, I'd recommend using the ROW_NUMBER for this kind of query anyway just to be safe because gaps will mess it up.

CREATE TABLE #tmp (id INT IDENTITY(1,1),stringname NVARCHAR(MAX))

INSERT #tmp (stringname)

VALUES
('stringA')
,('stringA')
,('stringB')
,('stringB')
,('stringA')
,('stringC')
,('stringC')
,('stringC')
,('stringA')

;WITH StringNames
AS(
SELECT
    ROW_NUMBER() OVER (ORDER BY ID) AS Row  --Accounts for gaps in ID
    ,stringname 
    ,CAST(NULL AS NVARCHAR(MAX)) AS previous_stringname     
FROM #tmp
WHERE id = 1
UNION ALL
SELECT t.Row
    ,t.stringname
    ,s.stringname AS previous_stringname
    FROM (
            SELECT
            ROW_NUMBER() OVER (ORDER BY ID) AS Row --Accounts for gaps in ID 
            ,stringname
            FROM #tmp) AS t
JOIN StringNames AS s ON t.row - 1 = s.row
)

SELECT 
    DISTINCT
    stringname
    ,previous_stringname
    ,COUNT(*) AS count  
 FROM StringNames
 GROUP BY 
    stringname
    ,previous_stringname
ORDER BY stringname

@Wil, why do you need the identity column?

Data

CREATE TABLE #table (stringname NVARCHAR(MAX))

INSERT  #table (stringname)
VALUES  ('stringA')
        ,('stringA')
        ,('stringB')
        ,('stringB')
        ,('stringA')
        ,('stringC')
        ,('stringC')
        ,('stringC')
        ,('stringA')

Query

;WITH [cteRowNumbers] AS (
    SELECT  ROW_NUMBER() OVER (ORDER BY $/0) AS [RowNumber],
            [stringname],
            CAST(NULL AS NVARCHAR(MAX)) AS [previous_stringname]
    FROM #table
)
,[cteStringNames] AS (
    SELECT  [RowNumber], 
            [stringname], 
            [previous_stringname]
    FROM (
        SELECT TOP 1 
                [RowNumber], 
                [stringname], 
                [previous_stringname]
        FROM [cteRowNumbers]
    ) t
    UNION ALL
    SELECT  t.[RowNumber],
            t.[stringname],
            s.[stringname] AS [previous_stringname]
    FROM [cteRowNumbers] AS t
    INNER JOIN [cteStringNames] AS s 
        ON t.[RowNumber] - 1 = s.[RowNumber]
)

SELECT  [stringname],
        [previous_stringname],
        COUNT(*) AS [count]
FROM [cteStringNames]
GROUP BY stringname, previous_stringname
ORDER BY stringname