We are currently looking at setting our string columns to nvarchar(max) rather than specifying a specific length, to avoid any problems where there isn't enough room in the database to store a string. I'm just wondering whether this is a good idea or whether it could cause problems; if it were fine to do, why would anyone specify a length like nvarchar(10) rather than nvarchar(max)? We also use varbinary(max) a lot, since we don't know how much binary data we will need, so I'm not sure how much that contributes either, given that our inserts are not as fast as I think they should be. This is an example table:
CREATE TABLE [dbo].[SAMPLETABLE] (
[ID] [uniqueidentifier] NOT NULL,
[FIELD1] [int] NOT NULL,
[FIELD2] [nvarchar] (2000) NULL,
[FIELD3] [nvarchar] (max) NULL,
[FIELD4] [uniqueidentifier] NULL,
[FIELD5] [int] NULL,
[FIELD6] [nvarchar] (2000) NULL,
[FIELD7] [varbinary] (max) NULL,
[FIELD8] [varbinary] (max) NULL,
[FIELD9] [varbinary] (max) NULL,
[FIELD10] [uniqueidentifier] NULL,
[FIELD11] [nvarchar] (2000) NULL,
[FIELD12] [varbinary] (max) NULL,
[FIELD13] [varbinary] (max) NULL,
[FIELD14] [bit] NULL,
[FIELD15] [uniqueidentifier] NULL,
[FIELD16] [varbinary] (max) NULL,
[FIELD17] [bit] NULL,
[FIELD18] [tinyint] NULL,
[FIELD19] [datetime] NULL,
[FIELD20] [nvarchar] (2000) NULL,
PRIMARY KEY CLUSTERED
(
[ID] ASC
)
) ON [PRIMARY]
GO
Given a table design like that, would changing the nvarchar(2000) columns to nvarchar(max) make things any worse (or better)? Does SQL Server frown upon designs like this?
If you're happy for J. Random Developer, 6 months down the line, to insert a work by Shakespeare into each column, then fine.
For me, a big part of data modelling is seriously thinking about what data I do want to allow in each column, and which data I wish to prohibit. I then apply appropriate CHECK
constraints to achieve those restrictions (as best SQL Server allows). Having a sensible length check available "for free" has always seemed like a bonus.
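For instance, a minimal sketch of what I mean (the table and column names here are hypothetical, not from the question):

CREATE TABLE dbo.Customer (
    CustomerID int NOT NULL PRIMARY KEY,
    -- the declared length already rejects anything over 50 characters
    Email nvarchar(50) NOT NULL,
    -- an extra CHECK constraint tightens the rule further
    CONSTRAINT CK_Customer_Email_Format CHECK (Email LIKE N'%_@_%.__%')
);
GO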
You're also not doing much "future proofing" - changing the length of a (n)varchar column to a larger value at a later date is, I believe, purely a meta-data operation. So I'd say size the columns appropriately for the data you're expecting to deal with today (and okay, for the next year or so). If you need to expand them later, it takes seconds to do.
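For example, widening the hypothetical column above later is a one-line change (assuming, as noted, that growing a variable-length column is a metadata-only operation):

-- widen Email from 50 to 200 characters; existing rows are untouched
ALTER TABLE dbo.Customer
    ALTER COLUMN Email nvarchar(200) NOT NULL;
GO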
Let's hope you don't use the column for searching or need unique values...
Index keys cannot be more than 900 bytes wide, so you can probably never index such a column. That is one downside, because it means:
- really bad searching performance
- no unique constraints
It can be worked around with a computed column (see the sketch below), but then why not just store what you need?
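A minimal sketch of that workaround against the question's sample table, assuming SQL Server 2016 or later (earlier versions limit HASHBYTES input to 8,000 bytes) and made-up column/index names:

-- persisted, deterministic computed column holding a hash of the BLOB column
ALTER TABLE dbo.SAMPLETABLE
    ADD FIELD3_Hash AS CAST(HASHBYTES('SHA2_256', FIELD3) AS varbinary(32)) PERSISTED;

-- an index on the hash turns equality searches into seeks instead of scans;
-- a filtered UNIQUE index on it would only approximate a unique constraint on FIELD3
CREATE INDEX IX_SAMPLETABLE_FIELD3_Hash
    ON dbo.SAMPLETABLE (FIELD3_Hash);
GO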
Switching from the in-row types to the BLOB types is always a big decision. You have to internalize that the BLOB types (VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX)) are completely different types internally from the in-row types:
- They go into a different allocation unit, see Table and Index Organization
- They cannot be indexed
- They are subject to different storage options in sp_tableoption (see the sketch after this list)
- They use a completely different handling code path, see Performance comparison of varchar(max) vs. varchar(N)
- Online operations are not allowed on tables with BLOBs (see Exception at ALTER INDEX ... REBUILD ... WITH (ONLINE = ON)). Granted, since your table already has BLOBs, this won't be a new issue
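As an illustration of the storage options mentioned above (a sketch against the question's sample table), sp_tableoption can force MAX-typed values off-row:

-- store all MAX-typed values off the data row, keeping rows narrow;
-- with 0 (the default), values up to 8,000 bytes stay in-row when they fit
EXEC sp_tableoption 'dbo.SAMPLETABLE', 'large value types out of row', 1;
GO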
So switching all columns to BLOB types might bring in a lot of side effects you have not considered: the inability to index the BLOB columns, the loss of online operations, general performance degradation due to the inherently slower BLOB code path, and so on. The most serious hurdle may be that you won't be able to index the columns after making them BLOBs. If that is not a show stopper, then you'll have to test and measure the performance impact.
The data modeling concerns others have raised are in general valid, but I understand that often, in the real world, the theory works only in theory...
The answer is the same as the answer to "Why do I need to specify an int when I can store all numbers as strings?" - because it aids:
- efficiency of speed.
- efficiency of storage.
- the author/architect's intention.
- cuts down on data errors, since only a certain kind of data will fit.
But it won't cause any obvious "problems" immediately because nvarchar(10) is a subset of nvarchar(max).
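For instance (a made-up illustration, not from the question's schema), the declared length rejects wrong data at insert time:

CREATE TABLE dbo.PhoneDemo (
    PhoneNumber nvarchar(10) NOT NULL
);
GO

-- succeeds: the value fits within 10 characters
INSERT INTO dbo.PhoneDemo (PhoneNumber) VALUES (N'5551234567');

-- fails with "String or binary data would be truncated";
-- an nvarchar(max) column would happily accept it
INSERT INTO dbo.PhoneDemo (PhoneNumber) VALUES (N'this is not a phone number');
GO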
Here is the same answer I gave to another guy who wanted endless tables:
Database alternative to MySQL made for millions of TABLES
The table above is a less-than-optimal design for any relational data storage. It's pop-goes-the-weasel for data: pick a column and you might get the data you want.
Perhaps a NoSQL solution would work better, so you can have dynamic data and not worry about column limits.
If we are going to answer questions, I think it also behooves us to offer best (or at least better) practices when there are alternatives to a bad design.
Think of the guy coming after you, as KM said above.
In my experience, though, not many 2000-character fields end up indexed. I think it's much better to use nvarchar(max) than some arbitrary length where you might have to truncate data if it isn't long enough.
An example I saw was an error log table whose designers were not prepared to store the call stack in an nvarchar(max) field, so they stored only the first few thousand characters, resulting in truncated call stacks with the most interesting sections missing.