In MySQL, the default index type is a B-tree, and accessing an element in a B-tree takes logarithmic amortized time, O(log(n)).
On the other hand, accessing an element in a hash table is O(1).

Why is a hash table not used instead of a B-tree to access data inside a database?
In a hash table, you can only access elements by their primary key. This is faster than with a tree algorithm (O(1) instead of O(log n)), but you cannot select ranges (everything between x and y). Tree algorithms support range queries in O(log n), whereas a hash index can only answer them with a full table scan, O(n).
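The range-query difference is easy to see in a small sketch. This is a hypothetical in-memory model (a plain Python dict standing in for a hash index, and a sorted key list with binary search standing in for a tree index), not how MySQL's index code actually works:

```python
# Range query: hash-style lookup vs. ordered-structure lookup (illustrative sketch).
from bisect import bisect_left, bisect_right

rows = {k: "row-%d" % k for k in range(1000)}   # "hash index": key -> row
sorted_keys = sorted(rows)                      # "tree index": keys kept in order

def range_scan_hash(lo, hi):
    # No ordering information: every entry must be examined -> O(n)
    return [rows[k] for k in rows if lo <= k <= hi]

def range_scan_sorted(lo, hi):
    # Binary search to both boundaries -> O(log n + number of matches)
    i, j = bisect_left(sorted_keys, lo), bisect_right(sorted_keys, hi)
    return [rows[k] for k in sorted_keys[i:j]]
```

Both return the same rows; the difference is that the hash version touches all n entries no matter how narrow the range is.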
Also, the constant overhead of hash indexes is usually bigger (it vanishes in asymptotic notation, but it still exists). Tree algorithms are also usually easier to maintain: they grow with the data, scale, etc.
Hash indexes work with a pre-defined hash size, so you end up with a set of "buckets" in which the objects are stored; the bucket is then scanned again to find the right object inside that partition. If the hash size is large relative to the data, you carry a lot of overhead for few elements; if it is small, lookups degrade into longer scans within each bucket.
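A minimal chained hash table makes the bucket mechanics concrete. This is an illustrative toy, not a database implementation; the bucket count and class name are made up for the example:

```python
# Toy hash table with separate chaining: hash to a bucket, then scan the bucket.
class ChainedHash:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for pair in bucket:
            if pair[0] == key:      # key already present: overwrite
                pair[1] = value
                return
        bucket.append([key, value])

    def get(self, key):
        # After the O(1) hash step, the bucket itself is scanned linearly.
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return None

h = ChainedHash()
h.put("a", 1)
h.put("b", 2)
```

With only 8 buckets, loading many keys makes each `get` scan a long chain; with many buckets and few keys, most buckets sit empty.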
Today's hash table algorithms usually scale, but scaling can be inefficient. There are indeed scalable hashing algorithms; don't ask me how they work, it's a mystery to me too. AFAIK they evolved from scalable replication, where re-hashing is not easy. The scheme is called RUSH (Replication Under Scalable Hashing), and such algorithms are thus called RUSH algorithms.
However, there may be a point where your index exceeds a tolerable size relative to your hash sizes, and the entire index needs to be rebuilt. Usually this is not a problem, but for huge databases it can take days.
The trade-off for tree algorithms is small, and they are suitable for almost every use case, which is why they are the default. However, if you have a very precise use case and know exactly what (and only what) will be needed, you can take advantage of hash indexes.
Actually, it seems that MySQL uses both kinds of indexes, either a hash table or a B-tree, according to the following link. The difference is that a B-tree index lets you use column comparisons in expressions with the =, >, >=, <, <=, or BETWEEN operators, while a hash index can be used only for equality comparisons with the = or <=> operators.
The time complexity of a hash table is constant only when the table is sufficiently sized (there must be enough buckets to hold the data). The size of a database table is not known in advance, so the table must be rehashed now and then to get optimal performance out of a hash table, and rehashing is also expensive.
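The cost of that rehash is easy to quantify: every stored entry must be re-placed. A minimal sketch (the function and bucket layout are invented for illustration):

```python
# Why rehashing is expensive: growing the table touches every stored entry.
def rehash(buckets, new_size):
    new_buckets = [[] for _ in range(new_size)]
    moved = 0
    for bucket in buckets:
        for key, value in bucket:
            # Every pair must be hashed again against the new bucket count.
            new_buckets[hash(key) % new_size].append((key, value))
            moved += 1
    return new_buckets, moved

old = [[("k%d" % i, i)] for i in range(8)]   # 8 buckets, one entry each
new, moved = rehash(old, 16)
```

Doubling an index with a billion entries therefore does a billion re-placements in one burst, which is exactly the pause the answer above warns about.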
I think hash maps don't scale as well, and they can be expensive when the entire map needs to be rehashed.
- MySQL supports HASH indexes in only a couple of situations: ENGINE=MEMORY (which is rarely used) and internally for a "hash join". Even when you ask an InnoDB table for a HASH index, it silently turns it into a BTree.
- Hash comes close to O(1), but technically a single lookup is more like O(N) in the worst case, because of the need to handle "collisions".
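The collision worst case is easy to provoke deliberately. A hedged sketch: the BadHash class below is invented purely to force every key into one chain, which is the adversarial case a real index must survive:

```python
# Pathological hashing: every key collides, so the table degenerates
# into one long chain and each lookup becomes a linear scan.
class BadHash:
    def __init__(self, v):
        self.v = v
    def __hash__(self):
        return 42               # all keys share the same hash value
    def __eq__(self, other):
        return self.v == other.v

d = {BadHash(i): i for i in range(100)}   # one 100-entry collision chain
# d[BadHash(k)] now compares against up to all 100 colliding keys.
```

Good hash functions make this unlikely, but never impossible, which is why the worst-case bound is linear.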
- MySQL picked BTree because it is more flexible than Hash (it can handle ranges) while not being significantly slower.
- Arguably, BTree comes close to O(1) in practice due to caching of blocks: non-leaf nodes tend to be cached and stay in RAM, even if the leaf nodes come and go (for large tables).
- MySQL maintains a BTree dynamically; while you can ask to rebuild an index (cf. OPTIMIZE), it is rarely worth the effort.
- In InnoDB, the data is stored in a BTree ordered by the PRIMARY KEY. Secondary keys are also stored in separate BTrees, ordered by the secondary key column(s); the only other info in a leaf node is the PRIMARY KEY value. Hence, a secondary-key lookup needs two BTree lookups (unless all the necessary columns are among the secondary+primary columns -- this is called "covering").
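That two-lookup pattern can be sketched with two plain mappings standing in for the two BTrees. The table, column names, and helper functions below are hypothetical, chosen only to mirror the clustered-index layout described above:

```python
# InnoDB-style layout sketch: the clustered (primary) index holds the rows;
# a secondary index leaf holds only the secondary key and the PRIMARY KEY.
primary = {                                  # PRIMARY KEY -> full row
    1: {"id": 1, "email": "a@x", "name": "Ann"},
    2: {"id": 2, "email": "b@x", "name": "Bob"},
}
secondary_email = {"a@x": 1, "b@x": 2}       # email -> PRIMARY KEY only

def lookup_by_email(email):
    pk = secondary_email[email]   # lookup 1: secondary index
    return primary[pk]            # lookup 2: clustered index

def covering_lookup(email):
    # If the query needs only (email, id), the secondary index alone
    # answers it -- the "covering" case, with no second lookup.
    return {"email": email, "id": secondary_email[email]}
```

The covering case is why adding an extra column to a secondary index can halve the work of a hot query.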
I conclude by saying that Big-O may be interesting, but the details of the implementation add complexity, as does performance for arbitrarily large tables.
In addition to the nice answers here, here is some perspective when thinking about how to build a database.
First, robust hash tables are typically implemented with a bucketing scheme, such as the quadratic probing used in some implementations of JavaScript "objects" (i.e. hash tables). You can see a bucketed hash table implementation in JavaScript here.

You'll notice in such an implementation that there is a lot more processing going on than the O(1) notation suggests. First, the input runs through the hashing function, which iterates over the length of the input string with several computational steps per character. These steps are fast because they are done in registers rather than RAM. Next, the hash value is used to fetch a bucket, which is an array or linked list; you then iterate through the bucket's items and compare each item's key with the input key, which is another string comparison. In all likelihood I would estimate there are at least 100 computational steps to fetch even a simple string from a hash table, and all of those string comparisons add up.
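The per-character work hidden inside "O(1)" is visible in any classic string hash. Here is djb2 as a sketch (a well-known textbook hash, used only to illustrate the loop, not the one any particular engine uses):

```python
# djb2 string hash: the "constant time" lookup starts with a loop whose
# cost is proportional to the key's length.
def djb2(s):
    h = 5381
    for ch in s:                              # one multiply-add per character
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF   # keep it to 32 bits
    return h
```

So the O(1) really means "O(key length) to hash, plus bucket handling", which is the answer's point: constants matter.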
In addition, the buckets might be half empty, which wastes a lot of space. Finally, when the hash table reaches a certain occupancy, it has to double in size and re-process and recompute everything. This can cause a noticeable glitch in a UI application.
B+trees, on the other hand, are a more compact data structure. You are still doing string comparisons, but you only follow at most, I would say, 20 links down the tree (in terms of depth), then scan the children of the last node to find the exact match.
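That "at most ~20 links" claim follows from the fanout. A rough back-of-the-envelope calculation (the function and fanout values are assumptions for illustration; real engines size nodes to disk pages):

```python
# Rough B+tree depth estimate: with fanout f, n keys need about log_f(n) levels.
def btree_depth(n_keys, fanout):
    depth, capacity = 1, fanout
    while capacity < n_keys:      # grow the tree one level at a time
        capacity *= fanout
        depth += 1
    return depth
```

With a fanout of 100, a billion keys fit in about 5 levels; even a tiny binary fanout keeps a million keys at around 20, which is where the estimate in the text comes from.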
In this sense, I think that in reality B+trees or B-trees will perform on par with hash tables, especially naive implementations. Both systems can be optimized and fine-tuned, and I still think they will be close to equal; only testing will tell. But trees have the advantage of being more compact in memory. So after thinking about this for a long time and weighing every aspect of the equation, I would choose B+trees as the ideal solution for finding items quickly by key.
Pick DB/OS was based on hashing and worked well. With more memory these days to support efficient sparse hash tables, and redundant hashing to support modest range queries, I would say hashing may yet have its place (some would rather have other forms of non-range similarity-matching, such as wildcards and regexps). We also recommend copying to keep collision chains contiguous when memory hierarchies have big speed differences.
Another thing that may impact the choice: hash tables work well for mapping a key to exactly one value. However, when one key maps to a large number of elements (very common for a single column of a table), you can easily lose the O(1) behaviour depending on exactly how duplicates are handled. BTrees do not have that problem and handle lots of duplicate entries excellently.
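The duplicate-key point can be sketched in a few lines. The column and values below are invented for illustration; the structure is what matters: the "one value" slot silently becomes a list whose length you pay for on every fetch:

```python
# Hash index on a non-unique column: each key's slot grows into a list,
# so fetching "the value" for a key is O(k) in the number of duplicates.
from collections import defaultdict

by_city = defaultdict(list)
for user_id, city in [(1, "NYC"), (2, "NYC"), (3, "LA"), (4, "NYC")]:
    by_city[city].append(user_id)
```

A tree index stores the duplicates as adjacent entries, so it returns them with the same ordered scan it uses for ranges.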