Which is the best collation for European + English languages?

开发者 (Developer) https://www.devze.com 2023-01-10 15:37 Source: web

Hi there,

I am developing for European languages as well as English. The strings are stored as NVARCHAR in SQL Server 2005.

So which collation is the best one to use? Does "Latin1_General_CI_AS" cover everything? There are variations as well, such as Latin1_General_CP1_CI_AS, Latin1_General_BIN, Latin1_General_BIN2, etc.

Comments and suggestions appreciated.

Regards, DEE


For general-purpose sorting, "Latin1_General" is probably the best choice for Western European languages and English.

I believe that if the code page (e.g., CP1) is not specified, it defaults to code page 1252, which is also what CP1 signifies. So my understanding is that Latin1_General_CI_AS and Latin1_General_CP1_CI_AS are equivalent. Given that, I think Latin1_General_CP1_CI_AS is the better choice because it states the code page explicitly. Whether you use CI_AS, CS_AS, or CI_AI depends only on whether you want case sensitivity and/or accent sensitivity. With CI, "a" == "A"; with AI, "á" == "â".
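To make the CI/CS and AI/AS suffixes concrete, here is a rough sketch in Python rather than T-SQL. It is only an approximation of the idea: the helper names `strip_accents` and `equal_under` are made up for this illustration, and SQL Server's own comparison rules are more involved than case folding plus accent stripping.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (rough stand-in for AI)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def equal_under(a: str, b: str, case_sensitive: bool, accent_sensitive: bool) -> bool:
    """Approximate an equality test under a CI/CS + AI/AS suffix pair."""
    if not accent_sensitive:
        a, b = strip_accents(a), strip_accents(b)
    if not case_sensitive:
        a, b = a.casefold(), b.casefold()
    return a == b

# CI_AS: case is ignored, accents are not
print(equal_under("a", "A", case_sensitive=False, accent_sensitive=True))   # True
print(equal_under("á", "â", case_sensitive=False, accent_sensitive=True))   # False
# CI_AI: both case and accents are ignored
print(equal_under("á", "â", case_sensitive=False, accent_sensitive=False))  # True
# CS_AS: nothing is ignored
print(equal_under("a", "A", case_sensitive=True, accent_sensitive=True))    # False
```

In SQL Server itself the same choice is made by picking the collation suffix, not by writing code like this.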

The _BIN and _BIN2 options signify that the collation will be binary based on the code point values. For sorting purposes, you probably do not want that because the order would not necessarily match any kind of dictionary order. However, if you are only using the index for searching for data, then one of those might be appropriate because it could be faster. Relatively little computation is necessary to convert a character value to the associated key value.
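As a small illustration of why code point order does not match dictionary order, here is a Python sketch (not T-SQL). Python compares strings by code point, which is in the same spirit as a binary collation; `dictionary_key` is a hypothetical helper that only roughly imitates a case-insensitive, accent-insensitive sort and is not SQL Server's algorithm.

```python
import unicodedata

def dictionary_key(text: str) -> str:
    """Rough dictionary-style sort key: ignore accents and case."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()

words = ["apple", "Zebra", "éclair", "banana"]

# Plain sorted() compares code points: 'Z' (90) < 'a' (97) < 'b' (98) < 'é' (233)
binary_order = sorted(words)
# Sorting on the key puts words where a reader would expect them
dictionary_order = sorted(words, key=dictionary_key)

print(binary_order)      # ['Zebra', 'apple', 'banana', 'éclair']
print(dictionary_order)  # ['apple', 'banana', 'éclair', 'Zebra']
```

The binary sort needs no key computation at all, which is the speed advantage mentioned above, but uppercase and accented words end up in places a reader would not expect.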

Edit: As Martin points out in the comments, the code page does not matter unless you are using char, varchar, or text. If you stick completely with the Unicode types (nchar, nvarchar, ntext), the code page does not come into play. It is used only when a Unicode character is converted to a single-byte character.
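To show what "the code page comes into play" can mean in practice, here is a hedged Python sketch that pushes a few strings through code page 1252 and back. It uses Python's `cp1252` codec, which is an assumption standing in for the conversion SQL Server performs; the server's own conversion may substitute characters differently. The helper names are made up for this example.

```python
def to_cp1252(text: str) -> bytes:
    """Encode to Windows-1252, substituting '?' for characters it cannot represent."""
    return text.encode("cp1252", errors="replace")

def survives_cp1252(text: str) -> bool:
    """True if the text comes back unchanged after a trip through code page 1252."""
    return to_cp1252(text).decode("cp1252") == text

for sample in ["café", "naïve", "€", "Łódź"]:
    print(sample, to_cp1252(sample), survives_cp1252(sample))
```

The Western European samples survive the round trip, while the Polish "Łódź" loses its Ł and ź. That loss only happens in single-byte columns; data kept in nvarchar is not converted.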
