I'm working on simple log in page using OpenID: if the user has just registered for an OpenID, then I need to create a new entry in the database for the user, otherwise I just display their alias with a greeting. Every time somebody gets authenticated with their Open ID, I must find their alias by looking up which user has the given OpenID and it seems that it might be fairly slow if the primary key is the UserID (and there are millions of users).
I'm using SQL Server 2008 and I have two tables in my database (Users and OpenIDs): I plan the chec开发者_C百科k if the Open ID exists in the OpenIDs table, then use the corresponding UserID to get the rest of the user information from the Users table.
The Users table is indexed by UserID and has the following columns:
- UserID (pk)
- Alias
- OpenID (fk)
The OpenIDs table is indexed by OpenID and has the following columns:
- OpenID (pk)
- UserID (fk)
Alternately, I can index the Users table by UserID and OpenID (i.e have 2 indexes) and completely drop the OpenIDs table.
What would be the recommended way to improve the query for a user with the matching OpenID in this case: index the Users table with two keys or use the OpenIDs table to find the matching UserID?
May be the answers to What are some best practises and “rules of thumb” for creating database indexes? can help you.
Without knowing what kind of queries you'll be running in detail, I would recommend indexing the two foreign key columns - Users.OpenID
and OpenIDs.UserID
.
Indexing the foreign keys is typically a good idea to help with JOIN conditions and other queries.
But quite honestly, if you use the OpenIDs
table only to check the existance of an OpenID
, you'd be much better off just indexing (possibly a unique index?) that column in the Users
table and be done with it. That OpenIDs
table as you have it now serves no real purpose at all - just takes up space for redundant information.
Other than that: you need to observe how your application behaves, samples some usage data, and then see what kind of queries are running the most often, and the longest, and then start doing performance tweaking. Don't over-do the ahead-of-time performance optimizations - too many indices can be worse than having none at all !
Every time somebody gets authenticated with their Open ID, I must find their alias by looking up which user has the given OpenID and it seems that it might be fairly slow if the primary key is the UserID (and there are millions of users).
Actually, quite the contrary! If you have a value that's unique amongst millions of rows, finding that single value is actually quite quick - even with millions of users. It will take only a handful (max. 5-6) comparisons, and bang! you have your one user out of a million. If you have an index on that OpenID
column, that should be pretty fast indeed. Such a highly selective index (one value picks out 1 in a million) work very very efficiently.
精彩评论