Clustered vs NonClustered Primary Key

Source: https://www.devze.com, 2022-12-17 08:46 (reposted from the web)

begin transaction;
create table person_id(person_id integer primary key);
insert into person_id values(1);
... snip ...
insert into person_id values(50000);
commit;

This code takes about 0.9 seconds on my machine and creates a db file of 392K. These numbers become 1.4 seconds and 864K if I change the second line to:

create table person_id(person_id integer nonclustered primary key);

Why is this the case?

A great answer to this question is available over at the DBA StackExchange: https://dba.stackexchange.com/questions/7741/when-should-a-primary-key-be-declared-non-clustered/7744#7744

Clustering the primary key stores it with the rows, so it takes up less space (there are no separate index blocks). Its main benefit, however, is typically that range scans can read rows that sit in the same block, which reduces IO operations. That becomes rather important when you have a large data set (not 50k ints).

I think 50k ints is a rather artificial benchmark and not one you care about in the real world.

[Only as an idea]

Maybe when you explicitly tell it to use the integer column as the clustered key, it does just that. But when you tell it not to use your integer column, it still creates an index behind the scenes and chooses a different data type for it, say one twice as large. Each of those entries then has to reference the corresponding record in the table, and there you go: the size explodes.
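One way to check the "index behind the scenes" idea is to look at what the database actually created. The question does not name the engine, so the sketch below assumes SQLite (the single db file hints at it) and uses Python's standard sqlite3 module. In SQLite, a column declared exactly as integer primary key becomes an alias for the rowid and is stored with the rows, while any other declared type gets a separate automatic index. The helper name measure and the file name test.db are made up for this illustration.

```python
import os
import sqlite3
import tempfile


def measure(column_def, rows=50000):
    """Build the question's table with the given column definition.

    Returns (names of the indexes on the table, db file size in bytes).
    Assumes SQLite; `measure` is a hypothetical helper, not from the question.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.db")
        conn = sqlite3.connect(path)
        conn.execute(f"create table person_id(person_id {column_def})")
        # Same inserts as in the question: values 1..rows in one transaction.
        conn.executemany(
            "insert into person_id values(?)",
            ((i,) for i in range(1, rows + 1)),
        )
        conn.commit()
        # Automatic indexes are listed in sqlite_master like any other index.
        names = [
            row[0]
            for row in conn.execute(
                "select name from sqlite_master "
                "where type = 'index' and tbl_name = 'person_id'"
            )
        ]
        conn.close()
        return names, os.path.getsize(path)


for column_def in ("integer primary key", "integer nonclustered primary key"):
    names, size = measure(column_def)
    print(column_def, names, size)
```

If the assumption holds, the first variant should report no index at all and the second should report one automatic index plus a noticeably larger file. The exact sizes depend on the SQLite version and page size, so they may not match the 392K and 864K from the question.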