开发者

Indexing row Id vs row DateTime

开发者 https://www.devze.com 2023-03-24 04:40 出处:网络
I have a typical table that has a primary key integer Id, and a DateTime column to record when the row was created. In theory, the order of the Id column should always be in the same order of the Date

I have a typical table that has a primary key integer Id, and a DateTime column to record when the row was created. In theory, the order of the Id column should always be in the same order of the DateTime column. My application does an ORDER BY CreateDateTime DESC and I was going to add an index for the CreateDateTime column but then I realized the clustered index on the primary key should accomplish the same thing and even though it's not semantically correct, maybe I should just 开发者_Python百科sort by Id to prevent creating another index. Would you add the CreateDateTime index anyways? What about if it were a LastUpdatedDateTime column (implies occasional updates to the index) instead?


"in theory"... does it matter if practice doesn't match theory?

If yes, create the index and order by that date field. If no, use the primary key.


If you are not going to do a date range search, don't add the index and order by the primary key. I'd argue the semantics are still correct in that you are listing by the order in which the rows were inserted. The PK is keeping track of that order for you rather than a DateTime.

This is assuming that CreateDateTime is always using getdate() or comparable when the row is inserted. If you anticipate that date being created some other way, I'd go with the index on CreateDateTime and use that in the Order By clause.


Use the Id. It will better account for timezone changes


I've had several scenarios where we've had a clustered index on the datetime column. This works out well both because the data is always increasing (meaning hotspots on the page, but no page splits) and because almost all queries were on date range not on any of the other data. This was for large volumes of transaction data though, not for entities.

Are you going to be be doing a lot of queries against the table specifically based on the datetime column? If not then I don't think you will benefit from an index there - stick to the actual identifier as the clustered key. Even more so if we're talking about a last updated column - what percentage of your queries would make use of such an index? Doesn't seem like it would gain you much at all (but it could cost you plenty depending on the frequency the data is updated).

Exactly what kind of entities are we talking about? Are these users, ambulances, barbie dolls, forum posts, volcanoes, something else? How many rows will you be adding per day, week, month? What kind of queries are you running against them? Having an idea of the volume and types of queries can go a long way in determining index strategies.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号