I have a table (already sorted) in SQL Server with the following:
stringname
==========
stringA
stringA
stringB
stringB
stringA
stringC
stringC
stringC
stringA
I would like to output the following:
stringname  previous_stringname  count
==========  ===================  =====
stringA     NULL                 1
stringA     stringA              1
stringA     stringB              1
stringA     stringC              1
stringB     stringA              1
stringB     stringB              1
stringC     stringA              1
stringC     stringC              2
That is, for each stringname in the original table, I would like to list each value that immediately preceded it, together with the number of times that pairing occurred (with NULL as the previous value for the first row).
How would I write an SQL query for this?
I am using SQL Server 2008.
Recursion is not necessary; just use:
select b.stringname as stringname, a.stringname as previous_stringname
into #tmp
from (select stringname, row_number() over (order by id /* $/0 */) as row from testing) a
right outer join (select stringname, row_number() over (order by id /* $/0 */) as row from testing) b
on a.row = b.row - 1;
select stringname, previous_stringname, count(*) as [count] from #tmp group by stringname, previous_stringname;
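If the temp table isn't needed, the same join can probably be written as a single statement. This is only a sketch under the same assumptions as the query above: a table named testing with an id column that defines the order. The CTE name numbered and the aliases cur, prev and rn are made up for illustration.

```sql
-- Number the rows once, then pair each row with the one before it.
;with numbered as (
    select stringname,
           row_number() over (order by id) as rn  -- assumes id defines the order
    from testing
)
select cur.stringname,
       prev.stringname as previous_stringname,    -- NULL for the first row
       count(*) as [count]
from numbered as cur
left outer join numbered as prev
    on prev.rn = cur.rn - 1
group by cur.stringname, prev.stringname
order by cur.stringname, prev.stringname;
```

The left outer join keeps the first row, which has no predecessor, so it shows up with a NULL previous_stringname.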
The following will do the trick, but in return for the answer I want you to Google "recursion" and see what Google suggests to search for :)
Clarification: The recursion happens by joining row - 1 to row between the temp table and the CTE. This method depends on having an independent IDENTITY column (ID in this case) and uses ROW_NUMBER() to account for any possible gaps in the IDs. Since ROW_NUMBER() can't be used in a JOIN condition, I had to resort to a subquery in the recursive part of the CTE. Even if you know you have continuous IDs, I'd recommend using ROW_NUMBER() for this kind of query anyway just to be safe, because gaps will mess it up.
CREATE TABLE #tmp (id INT IDENTITY(1,1),stringname NVARCHAR(MAX))
INSERT #tmp (stringname)
VALUES
('stringA')
,('stringA')
,('stringB')
,('stringB')
,('stringA')
,('stringC')
,('stringC')
,('stringC')
,('stringA')
;WITH StringNames
AS(
SELECT
f.Row
,f.stringname
,CAST(NULL AS NVARCHAR(MAX)) AS previous_stringname
FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY ID) AS Row --Accounts for gaps in ID
,stringname
FROM #tmp) AS f
WHERE f.Row = 1 --First row by ID, even if ID 1 itself is missing
UNION ALL
SELECT t.Row
,t.stringname
,s.stringname AS previous_stringname
FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY ID) AS Row --Accounts for gaps in ID
,stringname
FROM #tmp) AS t
JOIN StringNames AS s ON t.row - 1 = s.row
)
SELECT
stringname
,previous_stringname
,COUNT(*) AS count
FROM StringNames
GROUP BY
stringname
,previous_stringname
ORDER BY stringname
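The point about gaps can be checked in isolation. The snippet below is a minimal sketch with made-up data: the table variable @demo and its three rows are hypothetical and only there to show that ROW_NUMBER() stays consecutive when an ID is missing.

```sql
-- Hypothetical demo data: id 2 is missing, as if a row had been deleted.
DECLARE @demo TABLE (id INT, stringname NVARCHAR(50))
INSERT @demo (id, stringname)
VALUES (1, 'stringA'), (3, 'stringB'), (4, 'stringC')

SELECT id
,ROW_NUMBER() OVER (ORDER BY id) AS Row --Consecutive even though id is not
,stringname
FROM @demo
ORDER BY id
```

Here the ids 1, 3, 4 are numbered 1, 2, 3. A join on id - 1 would find no predecessor for id 3, whereas a join on Row - 1 still pairs it with the row before it.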
@Wil, why do you need the identity column?
Data
CREATE TABLE #table (stringname NVARCHAR(MAX))
INSERT #table (stringname)
VALUES ('stringA')
,('stringA')
,('stringB')
,('stringB')
,('stringA')
,('stringC')
,('stringC')
,('stringC')
,('stringA')
Query
;WITH [cteRowNumbers] AS (
SELECT ROW_NUMBER() OVER (ORDER BY $/0) AS [RowNumber],
[stringname],
CAST(NULL AS NVARCHAR(MAX)) AS [previous_stringname]
FROM #table
)
,[cteStringNames] AS (
SELECT [RowNumber],
[stringname],
[previous_stringname]
FROM (
SELECT TOP 1
[RowNumber],
[stringname],
[previous_stringname]
FROM [cteRowNumbers]
ORDER BY [RowNumber] -- make TOP 1 pick the first row deterministically
) t
UNION ALL
SELECT t.[RowNumber],
t.[stringname],
s.[stringname] AS [previous_stringname]
FROM [cteRowNumbers] AS t
INNER JOIN [cteStringNames] AS s
ON t.[RowNumber] - 1 = s.[RowNumber]
)
SELECT [stringname],
[previous_stringname],
COUNT(*) AS [count]
FROM [cteStringNames]
GROUP BY stringname, previous_stringname
ORDER BY stringname