I need to create a primary key scheme for a system that will need peer to peer replication. So I'm planning to combine a unique system ID and a sequential number in some way to come up with unique ID's. I want to make sure I'll never run out of ID's, so I'm thinking about using a varchar field, since I could always add another character if I start running out. But I've read that integers are better optimized for this. So I have some questions...
1) Are integers really better optimized? And if they are, how much of a performance difference is there between varchars and integers? I'm going to use firebird for now. But I may switch later. Or possibly support multiple db's. So I'm looking for generalizations, if that's possible.
2) If integers are significantly better optimized, why is that? And is it likely that varchars will catch up in the future, so eventually it won't matter anyway?
My varchar keys won't have any meaning, except for the unique system ID part. But I may want to obscure that somehow. Also, I plan to efficiently use all the bits of each cha开发者_JAVA技巧racter. I don't, for example, plan to code the integer 123 as the character string "123". So I don't think varchars will require more space than integers.
For MySQL, according to Alexey here, the answer is surprisingly "not much". He concludes:
So, if you have an application and you need to have some table field with a small set of possible values, I'd still suggest you to use ENUM, but now we can see that performance hit may not be as large as you expect. Though again a lot depends on your data and queries.
You will likely not run out of integers.
For example in MySQL, max value for BigInt is 18,446,744,073,709,551,615. So if you insert 100 million rows per second, it will take you 5849 years before you run out of numbers.
- varchar requires extra storage for length information
- comparison and sorting requires collation processing
- varchar may not match across systems because of collation
- int gives 4 billion rows, bigint (8 bytes) gives 18 trillion rows
- pre-bigint, I've seen decimal (19, 0) which also gives 18 trillion rows
Using varchar will end in tears...
To be clear: you're developing a system that may have more then 4 billion rows (you don't know), has replication, you don't know what RDBMS you'll use, and you don't understand how varchar differs from an integer?
精彩评论