Would an indexed compare of two string fields be faster if those fields were md5'd during the imp开发者_StackOverflow中文版ort, and the md5 field used in the compare instead of the source string field?
Possibly, if the strings were very long or very similar.
More importantly though, if the strings are very long, the index for the MD5 column would likely be smaller. Smaller means more likely to fit into memory, and faster to read from the disk. So if there were any performance improvement to be gained by doing this, most of it would probably come from that.
That said, you'd have to weigh any performance gain in reading with the extra cost of calculating the checksum and writing the value and the increased table size due to the new column being added.
If the source string is larger than the md5sum (32 bytes if you hex encode it), yes, using the md5sum would be faster.
While a low probability, you rish hash collision uising the md5 sum for equality though. i.e. 2 fields might comapare as equal while they're not, so think twice about that.
Probably not. It may depend on the type of index used. You can specify USING HASH
when creating the index. In this case, MySQL will use some kind of hashing (probably MD5 as well) to build the index with, but it will avoid the risk of collision. When two values have the same hash, the hash will just function as a range for both rows.
精彩评论