开发者

SQL Server sample of each type of record

开发者 https://www.devze.com 2023-02-12 11:49 出处:网络
I have a table full of records for various semesters, and I want a query to select the top 5 rows from each semester.

I have a table full of records for various semesters, and I want a query to select the top 5 rows from each semester.

I'd like one resultset sorted by a key field (serial) and one resultset with random sorts within each semester.

The开发者_如何转开发 table looks like this:

  • serial int
  • semester char(4)
  • grade int

Table data:

Semester|Serial|NewSerial
SP10    |1     |1001
SP10    |2     |1002
SP10    |3     |1003
SP10    |4     |1004
SP10    |5     |1005
SP10    |6     |1006
SP10    |7     |1007
SP10    |8     |1008
SP10    |9     |1009
SP10    |10    |1010
FA10    |1     |2001
FA10    |2     |2002
FA10    |3     |2003
FA10    |4     |2004
FA10    |5     |2005
FA10    |6     |2006
FA10    |7     |2007
FA10    |8     |2008
FA10    |9     |2009
FA10    |10    |2010
FA09    |1     |3001
FA09    |2     |3002
FA09    |3     |3003
FA09    |4     |3004
FA09    |5     |3005
FA09    |6     |3006
FA09    |7     |3007
FA09    |8     |3008
FA09    |9     |3009
FA09    |10    |3010

Result 1: ordered by serial, top 5

Semester|Serial|NewSerial
SP10    |1     |1001
SP10    |2     |1002
SP10    |3     |1003
SP10    |4     |1004
SP10    |5     |1005
FA10    |1     |2001
FA10    |2     |2002
FA10    |3     |2003
FA10    |4     |2004
FA10    |5     |2005
FA09    |1     |3001
FA09    |2     |3002
FA09    |3     |3003
FA09    |4     |3004
FA09    |5     |3005

Result 2: order by random within semester (sampling)

Semester|Serial|NewSerial
SP10    |3     |1003
SP10    |1     |1001
SP10    |5     |1005
SP10    |2     |1002
SP10    |4     |1004
FA10    |2     |2002
FA10    |1     |2001
FA10    |4     |2004
FA10    |3     |2003
FA10    |5     |2005
FA09    |3     |3003
FA09    |1     |3001
FA09    |2     |3002
FA09    |5     |3005
FA09    |4     |3004


these should do it:

WITH [query_semester] AS (
    SELECT RN = ROW_NUMBER() OVER (PARTITION BY semester
    ORDER BY serial),
    serial, semester, grade
    FROM SemesterTable)
SELECT serial, semester, grade
FROM [query_semester] WHERE [RN] <= 5;

and the random one:

WITH [query_semester] AS (
    SELECT RN = ROW_NUMBER() OVER (PARTITION BY semester
    ORDER BY NEWID()),
    serial, semester, grade
    FROM SemesterTable)
SELECT serial, semester, grade
FROM [query_semester] WHERE [RN] <= 5;


Show us some sample data, and expected results.

You will need to SELECT TOP 5 You will also need to GROUP BY the semester type (if that is a date so be it).

Instead of TOP 5 use HAVING count...


No idea what criteria you are using to make up the new serial, but if it just needs to be serially incrementing by semester (in the thousand position), then

select semester, serial,
    newserial = (DENSE_RANK() over (order by semester) * 1000) + rn
from
(   select *, rn=ROW_NUMBER() over (partition by semester order by serial)
    from tbl
) x
where rn <= 5
order by semester

For the order by serial. Swap order by serial for order by newid() to get the random sampling.

Notes:

  1. SQL Server 2005+ at compatibility level 90+
  2. The formula for newserial works fine because each semester can only have 5 items so there won't be clashing issues

EDIT

It seems counter-intuitive in the question, but if newserial is just a regular field (it wasn't listed), then the select part becomes simply

select semester, serial, newserial
0

精彩评论

暂无评论...
验证码 换一张
取 消