开发者

Difference between TOP X and Row_Number()

开发者 https://www.devze.com 2023-03-31 17:08 出处:网络
Recently I\'ve faced weird issue. I have two simple queries where one of them uses TOP X and other one does the same by using ROW_NUMBER and then selects items with rowNumber between 1 and X, and both

Recently I've faced weird issue. I have two simple queries where one of them uses TOP X and other one does the same by using ROW_NUMBER and then selects items with rowNumber between 1 and X, and both of them ordered by same column, but the result is completely different.

For example, let's say we have a simple DB as below with some dummy data:

CREATE TABLE [dbo].[Test](
[Id] [int] IDENTITY(1,1) NOT NULL,
[NDate] [datetime] NOT NULL,
CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED 
(
[Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
SET IDENTITY_INSERT [dbo].[Test] ON
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (1, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (2, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (3, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (4, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (5, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (6, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (7, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (8, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (9, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (10, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (11, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (12, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (13, '2011-08-21 00:00:00.000')
INSERT [开发者_如何学Cdbo].[Test] ([Id], [NDate]) VALUES (14, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (15, '2011-08-21 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (16, '2011-08-21 00:00:00.000')
SET IDENTITY_INSERT [dbo].[Test] OFF

Now if we perform below queries, we will get different results. If we use TOP 10, we will get below result:

SELECT TOP 10 [Id],[NDate] FROM [Test] order by NDate desc
RESULT=>
Id  NDate
4   2011-08-24 00:00:00.000
3   2011-08-24 00:00:00.000
2   2011-08-24 00:00:00.000
1   2011-08-24 00:00:00.000
11  2011-08-21 00:00:00.000
10  2011-08-21 00:00:00.000
9   2011-08-21 00:00:00.000
8   2011-08-21 00:00:00.000
7   2011-08-21 00:00:00.000
6   2011-08-21 00:00:00.000


select id,NDate from (
  select ROW_NUMBER() over (order by NDate DESC) as RNumber, Id,NDate
  from Test) as t
where RNumber between 1 and 10
RESULT=>
id  NDate
1   2011-08-24 00:00:00.000
2   2011-08-24 00:00:00.000
3   2011-08-24 00:00:00.000
4   2011-08-24 00:00:00.000
5   2011-08-21 00:00:00.000
6   2011-08-21 00:00:00.000
7   2011-08-21 00:00:00.000
8   2011-08-21 00:00:00.000
9   2011-08-21 00:00:00.000
10  2011-08-21 00:00:00.000

The issue is, When you are using LINQ to SQL, and you want to do pagination, generated query for selecting first page will be TOP X, while for the other pages will be using ROW_NUMBER, and the result is, some items will never appear in the listing.


You need to implement a secondary sorting.

Example:

select id,NDate from (
  select ROW_NUMBER() over (order by NDate DESC, id) as RNumber, Id,NDate -- Note: NDate DESC, id
  from Test) as t
where RNumber between 1 and 10

Example:

SELECT TOP 10 [Id],[NDate] FROM [Test] order by NDate desc, ID   -- Note: NDate DESC, id

Otherwise, you might as well expect random records per unique NDate.

If you want the same records for each query, you have to specify that secondary sort column.


I agree with @hamlin11, but putting it more simply :-)

In the first example you are sorting by

order by NDate desc
/* to get same results as example 2, change this to
   order by NDate desc, id 
*/

and in the second by

order by NDate DESC, id

In the second example you are sorting by Id, that is why the ID column is in order, you haven't done this in the first example.

It might be better to use data like this, so you can see more clearly what is happening:

/****** Object:  Table [dbo].[Test]    Script Date: 08/27/2011 07:56:29 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Test](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [NDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Test] ON
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (1, '2011-08-10 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (2, '2011-08-11 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (3, '2011-08-12 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (4, '2011-08-13 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (5, '2011-08-14 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (6, '2011-08-15 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (7, '2011-08-16 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (8, '2011-08-31 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (9, '2011-08-30 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (10, '2011-08-29 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (11, '2011-08-28 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (12, '2011-08-27 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (13, '2011-08-26 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (14, '2011-08-25 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (15, '2011-08-24 00:00:00.000')
INSERT [dbo].[Test] ([Id], [NDate]) VALUES (16, '2011-08-23 00:00:00.000')
SET IDENTITY_INSERT [dbo].[Test] OFF
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号