
Optimizing SQL Server database

Developer https://www.devze.com 2023-01-29 08:20 Source: Internet
My database has one very large table with over 2 billion rows and 3 columns: Id (uniqueidentity), Type (int, between 0 and 10; 0 = most used, 10 = least used), and Data (binary data, between 1 and 10 MB).

What are some ways I can optimize this database? (primarily select queries)

*Note: I might add a few more columns to this table later (e.g. location, date...)


Assuming that the id column is the clustered index key, and assuming that by uniqueidentity you mean uniqueidentifier:

  • Do you need the uniqueidentifier type? Why?
  • What other alternatives have you considered?
  • Do you populate the data using sequential GUIDs or not?

GUIDs are a notoriously poor choice for clustered keys. See "GUIDs as PRIMARY KEYs and/or the clustering key" for a more detailed discussion:

But, a GUID that is not sequential - like one that has its values generated in the client (using .NET) OR generated by the newid() function (in SQL Server) can be a horribly bad choice - primarily because of the fragmentation that it creates in the base table but also because of its size. It's unnecessarily wide (it's 4 times wider than an int-based identity - which can give you 2 billion (really, 4 billion) unique rows). And, if you need more than 2 billion you can always go with a bigint (8-byte int) and get 2^63-1 rows.
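The fragmentation point in the quote can be illustrated with a toy model. The sketch below is not SQL Server code: it uses a sorted Python list as a stand-in for a clustered index and counts how many inserts land somewhere other than the end. The helper name `count_mid_inserts`, the row count `N`, and the fixed seed are assumptions made for the illustration, and SQL Server orders uniqueidentifier values by its own byte rules, which this sketch does not reproduce.

```python
import bisect
import random
import uuid


def count_mid_inserts(keys):
    """Insert keys one by one into a sorted list (a stand-in for a clustered
    index) and count how many land anywhere other than the end."""
    index = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos != len(index):
            # In a real B-tree this row would go into an existing page,
            # which may have to split if it is already full.
            mid_inserts += 1
        index.insert(pos, key)
    return mid_inserts


N = 10_000                # hypothetical sample size, far below 2 billion
rng = random.Random(0)    # fixed seed so the run is repeatable

# Random 128-bit keys, similar in spirit to newid() or client-generated GUIDs.
random_keys = [uuid.UUID(int=rng.getrandbits(128)) for _ in range(N)]
# Ever-increasing keys, similar in spirit to an identity column.
sequential_keys = list(range(N))

random_mid = count_mid_inserts(random_keys)
sequential_mid = count_mid_inserts(sequential_keys)

print(f"random keys:     {random_mid} of {N} inserts landed mid-index")
print(f"sequential keys: {sequential_mid} of {N} inserts landed mid-index")
```

With ever-increasing keys every insert appends at the end, while with random keys almost every insert lands in the middle of the existing data, which is the pattern that drives page splits and fragmentation.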

Also read "Disk space is cheap...That's not the point!" as a follow-up.
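To put rough numbers on the key-width argument, here is a back-of-the-envelope calculation in Python. It is only a sketch: the 2 billion row count comes from the question, the byte widths are the standard sizes of the three types, and row and page overhead is ignored. The `copies` parameter is an assumption for illustration; it reflects the fact that the clustering key is also carried in every nonclustered index.

```python
ROWS = 2_000_000_000  # row count taken from the question ("over 2 billion")

# Storage width of each candidate key type, in bytes.
KEY_BYTES = {"int": 4, "bigint": 8, "uniqueidentifier": 16}


def key_storage_gb(key_type, rows=ROWS, copies=1):
    """Raw size of the key values alone, in GB (10**9 bytes).

    copies = 1 for the clustered index, plus one for each nonclustered
    index that carries the clustering key. Overhead is ignored.
    """
    return KEY_BYTES[key_type] * rows * copies / 10**9


INT_MAX = 2**31 - 1      # largest value of a 4-byte signed int
BIGINT_MAX = 2**63 - 1   # largest value of an 8-byte signed bigint

for key_type in KEY_BYTES:
    print(f"{key_type:17s} {key_storage_gb(key_type):6.1f} GB per copy of the key")

print("int max:   ", INT_MAX)
print("bigint max:", BIGINT_MAX)
```

At 2 billion rows the GUID key takes about 32 GB per copy against 8 GB for an int and 16 GB for a bigint, and that difference repeats in every nonclustered index. Since the table already holds more than 2 billion rows, a positive-only int identity would be at or past its limit, so bigint is the more realistic narrow alternative here.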

Other than this, you need to do your homework and post the details such a question requires: the exact table and index definitions and the prevalent data access patterns (by key, by range, filters, sort order, joins, etc.).

Have you done any work to identify problems so far? If not, start with Waits and Queues, a proven methodology to identify performance bottlenecks. Once you measure and find places that need improvement, we can advise how to improve.


  • Add one or more indexes. Decide which column(s) make the most appropriate clustered index key.

  • Decide whether storing up to 10MB of binary data in each (otherwise small) row is a good use of a database.
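To give a sense of scale for the second point, a quick estimate of the total binary payload, sketched in Python. The row count and the 1-10 MB range come from the question; decimal units (1 MB = 10^6 bytes) and the helper name `total_blob_pb` are assumptions for the illustration.

```python
ROWS = 2_000_000_000   # row count taken from the question
MB = 10**6             # decimal megabyte, an assumption for this estimate
PB = 10**15            # decimal petabyte


def total_blob_pb(rows, bytes_per_row):
    """Total size of the Data column in petabytes, ignoring all overhead."""
    return rows * bytes_per_row / PB


low = total_blob_pb(ROWS, 1 * MB)    # every row at the 1 MB minimum
high = total_blob_pb(ROWS, 10 * MB)  # every row at the 10 MB maximum

print(f"Data column alone: between {low:.0f} PB and {high:.0f} PB")
```

Under these assumptions the Data column alone would hold somewhere between 2 and 20 petabytes, so where and how the binary data is stored is likely to matter far more than the other two columns.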

[Updated in response to Remus's comment]
