开发者

How to make SQL Server indexes take less space?

开发者 https://www.devze.com 2023-02-01 08:28 出处:网络
I have a database created by some application. Whole database is more than 50 gb, some problems with backups are occurring and my task is to get this database as small as possible.

I have a database created by some application. Whole database is more than 50 gb, some problems with backups are occurring and my task is to get this database as small as possible.

Especially one table is very big(22 gb), from which 16.5 gb is taken by indexes, rest 5.5 gb is data. It contains little more than 12 000 000 rows.

Could You tell me is it possible to shrink the indexes? I've already tried rebuilding, reorganizing, recreating clustered index, dbcc cleantable. I also know that nvarchar type takes twice size than varchar, so I changed columns type to varchar, but thanks to that I saved only about 2 gb(1 gb on data and 1 gb on indexes).

Here is an sql for this table(fld0 and fld1 are always NULL):

CREATE TABLE [dbo].[DOC8](
 [ASSOCIATION] [nvarchar](64) NULL DEFAULT (NULL),
 [DOCID] [char](32) NOT NULL,
 [FLD0] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD1] [nvarchar](2048) NULL DEFAULT (NULL),
 [FLD10] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD2] [nvarchar](32) NULL DEFAULT (NULL),
 [FLD3] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD4] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD5] [datetime] NULL DEFAULT (NULL),
 [FLD6] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD7] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD8] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD9] [datetime] NULL DEFAULT (NULL),
 [PARENTID] [char](32) NULL DEFAULT (NULL),
 [POOLID] [char](32) NULL DEFAULT (NULL),
 [PROPERTIES] [ntext] NULL DEFAULT (NULL),
 [FLD11] [nvarchar](255) NULL DEFAULT (NULL),
 [FLD12] [nvarchar](255) NULL DEFAULT (NULL),
PRIMARY KEY CLUSTERED 
(
 [DOCID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_0] ON [dbo].[DOC8] 
(
 [ASSOCIATION] ASC,
 [PARENTID] ASC,
 [POOLID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_1] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD0] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_10] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD11] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_11] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD12] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_2] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD2] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_3] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD3] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_4] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD4] ASC,
 [FLD5] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_5] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD6] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_6] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD7] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_7] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD8] ASC,
 [FLD9] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_8] ON [dbo].[DOC8] 
(
 [POOLID] ASC,
 [FLD10] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF开发者_运维问答, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [DOC8_IDX_9] ON [dbo].[DOC8] 
(
 [PARENTID] ASC,
 [POOLID] ASC,
 [DOCID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO


Looking at your table definition

Primary key:

  • You have a char(32) clustered index. This 32 bytes appears in every non-clustered index

Columns:

  • Why char(32)? -> varchar(32)
  • Why nvarchar()? -> varchar
  • Why datetime? -> smalldatetime
  • Why ntext? -> varchar(max)

Indexes:

  • Can you use INCLUDE rather than having key columns

Your main issue is bad choice of clustered index. If you can't fix that, pretty much everything else is pointless: unless you delete data...

When you changed to varchar, you saved 2GB out of 22GB. That's 9% which is quite reasonable whith no other optimisations. It also shows that you don't need nvarchar...

If you change to an int surrogate key, you'd save 28 bytes per row per non-clustered index. That's 3.7 GB minimum (12 x 12,000,000 x 28) but will be more because of more rows per page. And it's transparent to the client code.

Then you start checking for index usage...

However, you should look at capacity planning and plan for data growth. The changes I suggest here will reduce growth rate as well as current size, but if you need another 50 million rows then you need to plan for this. For example, can you compress your backups?


Find indexes not in use and remove them. This will also cut down on the amount of writes the drive has to make for dui's (deletes, updates, and inserts). See Brent Ozar's blog on how to do this: http://www.toadworld.com/platforms/sql-server/w/wiki/10062.find-indexes-not-in-use

Basically, if your reads/write is low ( <0.1 ), then the index is hurting you more than it is helping and possibly needs to go anyway. You will need to carefully consider before dropping an index though.

You may also benefit from filtered indexes. http://sqlfool.com/2009/04/filtered-indexes-what-you-need-to-know/


You can check if all the indexes are actually used and drop the ones you don't need, but generally you are approaching the problem in a wrong way - even if you manage to make the database smaller today, it will keep growing and in a few months you'll be facing the same problem again.

Unless the database contains lots of unnecessary data and/or indexes that you can delete, you have to find a way to manage a database of such size.

Before you start dropping columns and indexes, change columns' data types, you need to be absolutely sure how this will affect the application operation and performance. However, this is a 3rd party software and its developers will not answer your questions for the next few days (they don't work until end of the year), so there is absolutely no way to predict all negative effects of such blind database structure changes. The system won't probably fail immediately, but your changes combined with future software updates (prepared by people unaware of modifications made by you) may produce catastrophic results.


Definately review the index, do you really need all those indexes of the pattern [POOLID],[FLD]? If their existance can't be justified, then they shouldn't exist.

I'm assuming the DOCID etc are GUIDS without hyphens. If you could start again, I would chose to use the inbuilt guid type rather than char32, which would halve the size of the indecies you have, but that's not a simple change because microsoft were silly and made it so you had to put hyphens in their guid types and then the application has to have extra space in structures or translate the hyphens out.


After playing around with this problem I came to the final stage where: - data reduced from 5 451 477 MB to 4 088 609 MB - indexes reduced from 15 361 391 MB to 6 003 094 MB

Here are steps which I made(maybe it will be useful for somebody):

  1. Changed 9 column types from nvarchar to varchar - data: 4 460 305 MB, indexes - 14 456 383 MB
  2. Removed reduntant columns which occured in more than one index - data: 4 460 305 MB, indexes - 9 592 320 MB
  3. Changed two columns from char(32) to varchar(32) and one column from ntext to varchar(max) - data: 4 088 609 MB, indexes: 9 294 117 MB
  4. Created new integer indentity column, deleted clustered index which was made from DOCID char(32) column, created new non-clustered index with DOCID column, created new clustered index with newly added identity column - data: 4 088 609 MB, indexes: 6 003 094 MB

Thank You for helping me to solve this problem :)

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号