I have a query as follows;
SELECT COUNT(Id) FROM Table
The table contains 33 million records - it contains a primary key on Id and no other indices.
The query takes 30 seconds.
The actual execution plan shows it uses a clustered index scan.
We have analysed the table and found it isn't fragmented using the first query shown in this link: http://sqlserverpedia.com/wiki/Index_Maintenance.
Any ideas as to why this query is so slow and how to fix it.
The Table Definition:
CREATE TABLE [dbo].[DbConversation](
[ConversationID] [int] IDENTITY(1,1) NOT NULL,
[ConversationGroupID] [int] NOT NULL,
[InsideIP] [uniqueidentifier] NOT NULL,
[OutsideIP] [uniqueidentifier] NOT NULL,
[ServerPort] [int] NOT NULL,
[BytesOutbound] [bigint] NOT NULL,
[BytesInbound] [bigint] NOT NULL,
[ServerOutside] [bit] NOT NULL,
[LastFlowTime] [datetime] NOT NULL,
[LastClientPort] [int] NOT NULL,
[Protocol] [tinyint] NOT NULL,
[TypeOfService] [tinyint] NOT NULL,
CONSTRAINT [PK_Conversation_1] PRIMARY KEY CLUSTERED
(
[ConversationID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
One thing I have noticed is the datab开发者_开发知识库ase is set to grow in 1Mb chunks.
It's a live system so we restricted in what we can play with - any ideas?
UPDATE:
OK - we've improved performance in the actual query of interest by adding new non-clustered indices on appropriate columns so it's not a critical issue anymore.
SELECT COUNT
is still slow though - tried it with NOLOCK hints - no difference.
We're all thinking it's something to do with the Autogrowth set to 1Mb rather than a larger number, but surprised it has this effect. Can MDF fragmentation on the disk be a possible cause?
Is this a frequently read/inserted/updated table? Is there update/insert activity concurrent with your select?
My guess is the delay is due to contention.
I'm able to run a count on 189m rows in 17 seconds on my dev server, but there's nothing else hitting that table.
If you aren't too worried about contention or absolute accuracy you can do:
exec sp_spaceused 'MyTableName'
which will give a count based on meta-data.
If you want a more exact count but don't necessarily care if it reflect concurrent DELETE
or INSERT
activity you can do your current query with a NOLOCK
hint:
SELECT COUNT(id) FROM MyTable WITH (NOLOCK)
which will not get row-level locks for your query and should run faster.
Thoughts:
Use
SELECT COUNT(*)
which is correct for "how many rows" (as per ANSI SQL). Even if ID is the PK and thus not nullable, SQL Server will count ID. Not rows.If you can live with approximate counts, then use sys.dm_db_partition_stats. See my answer here: Fastest way to count exact number of rows in a very large table?
If you can live with dirty reads use
WITH (NOLOCK)
use [DatabaseName]
select tbl.name, dd.rows from sysindexes dd
inner join sysobjects tbl on dd.id = tbl.id where dd.indid < 2 and tbl.xtype = 'U'
select sum(dd.rows)from sysindexes dd
inner join sysobjects tbl on dd.id = tbl.id where dd.indid < 2 and tbl.xtype = 'U'
By using these queries you can fetch all tables' count within 0-5 seconds
use where clause according to your requirement.....
Another idea: When the files grow with 1MB parts, it may be fragmented on the file system. You can't see this by SQL, you see it using a disk defragmentation tool.
精彩评论