For binary search trees, the average computational complexity of the basic operations is usually given as O(log N) (equivalently, the internal path length — the sum of all node depths — averages O(N log N)). Below is a text snippet from an algorithms book:
Average of internal path length is D(n) = O(n log n). Thus, the expected depth of any node is O(log n). As an example, the randomly generated 500-node tree has nodes at expected depth 9.98.
It is tempting to say immediately that this result implies that the average running time of all the operations discussed above (insert, findMin, findMax, delete) on a binary search tree is O(log n), but this is not entirely true. The reason is that, because of deletions, it is not clear that all binary search trees are equally likely. In particular, the deletion algorithm described above favors making the left subtrees deeper than the right, because we are always replacing a deleted node with a node from the right subtree. The exact effect of this strategy is still unknown, but it seems to be only a theoretical novelty. It has been shown that if we alternate insertions and deletions Θ(n²) times, then the trees will have an expected depth of Θ(√n).
After a quarter-million random insert/delete pairs, the tree that was somewhat right-heavy looks decidedly unbalanced (average depth = 12.51).
My questions on above text snippet are:
- What does the author mean by "because of deletions, it is not clear that all binary search trees are equally likely"?
- How did the author arrive at an expected depth of 9.98 for a 500-node tree, given that log₂ 500 is not 9.98?
- In the last paragraph, how did he get an average depth of 12.51?
Thanks!
1) It considers binary trees generated by a (roughly uniform) random sequence of insertions and deletions. When doing insertions only, the standard model is that every insertion order — every permutation of the keys — is equally likely. The resulting shapes are not equally likely: balanced shapes arise from more insertion orders than degenerate ones, which is why the expected depth comes out O(log n). You can check this by building a tree from every permutation of (1, 2, 3), or of (1, 2, 3, 4).
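Here is a small sketch (not from the book) of that check: insert every permutation of (1, 2, 3) into an empty BST and tally the resulting shapes. All 6 insertion orders are equally likely, but the balanced shape arises from 2 of them, while each degenerate shape arises from only 1.

```python
from itertools import permutations

def insert(root, key):
    """Insert key into the BST rooted at `root`; return the (new) root."""
    if root is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < root["key"] else "right"
    root[side] = insert(root[side], key)
    return root

def shape(node):
    """A hashable fingerprint of the tree's shape (key values ignored)."""
    return None if node is None else (shape(node["left"]), shape(node["right"]))

counts = {}
for order in permutations((1, 2, 3)):
    root = None
    for key in order:
        root = insert(root, key)
    counts[shape(root)] = counts.get(shape(root), 0) + 1

balanced = ((None, None), (None, None))  # root with two leaf children
print(len(counts), "distinct shapes;", counts[balanced], "orders give the balanced one")
# → 5 distinct shapes; 2 orders give the balanced one
```

The two degenerate chains (from orders 1,2,3 and 3,2,1) each occur once, so the balanced shape is twice as likely as either chain.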
In the algorithm described in the book, when a node with two subtrees is deleted, its key is replaced by the smallest key of its right subtree, and that successor node is then removed from the right subtree. This not only makes some (unbalanced) shapes more likely than they were without deletions, but also makes it likely that trees end up deeper on their left branches than on their right ones: every deletion of a two-child node removes a node from the right subtree, never the left. (See what happens when you repeatedly delete the root of a balanced tree with this strategy.)
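A sketch of that experiment, assuming the successor-copy variant of deletion (the book's exact code may differ): build a perfectly balanced 15-node tree, delete the current root four times, and compare the heights of the two sides.

```python
def insert(root, key):
    if root is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < root["key"] else "right"
    root[side] = insert(root[side], key)
    return root

def delete(root, key):
    if root is None:
        return None
    if key < root["key"]:
        root["left"] = delete(root["left"], key)
    elif key > root["key"]:
        root["right"] = delete(root["right"], key)
    elif root["left"] is None:
        return root["right"]
    elif root["right"] is None:
        return root["left"]
    else:
        succ = root["right"]          # find the smallest node of the right subtree
        while succ["left"] is not None:
            succ = succ["left"]
        root["key"] = succ["key"]     # copy the successor's key up...
        root["right"] = delete(root["right"], succ["key"])  # ...and remove it below
    return root

def height(node):
    return -1 if node is None else 1 + max(height(node["left"]), height(node["right"]))

# Build a perfectly balanced tree on 1..15 by inserting in level order.
root = None
for key in (8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15):
    root = insert(root, key)

for _ in range(4):                    # delete the current root four times
    root = delete(root, root["key"])

print("left height:", height(root["left"]), " right height:", height(root["right"]))
# → left height: 2  right height: 1
```

All four deletions pull replacement keys out of the right subtree, so the left subtree is untouched while the right one shrinks — exactly the left-heavy bias the book mentions.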
2) Consider size 511 instead of 500. If the tree is perfectly balanced, it has depth 9 (= log₂(n + 1)); that is the minimum depth a tree with that many elements can have. For random shapes this is only the minimum, not the average — the average has to be greater, since there are shapes with depth as large as n (note that while the expected depth has to be greater than log₂ n, it can still be O(log n)). I don't know exactly how the author gets 9.98: maybe clever maths — the expected depth of a node in a randomly built BST is known to be about 2 ln n ≈ 1.386 log₂ n, minus lower-order terms — or maybe just by measuring a large random sample.
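One way a figure like 9.98 could arise by measurement (the book does not say how it was obtained): build randomly ordered 500-node BSTs and average the node depths. The result lands a bit above log₂ 500 ≈ 8.97, consistent with the 2 ln n asymptotics.

```python
import math
import random

def insert(root, key):
    if root is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < root["key"] else "right"
    root[side] = insert(root[side], key)
    return root

def total_depth(node, depth=0):
    """Sum of node depths (root at depth 0) over the whole tree."""
    if node is None:
        return 0
    return depth + total_depth(node["left"], depth + 1) + total_depth(node["right"], depth + 1)

random.seed(1)
n, trials = 500, 20
avg = 0.0
for _ in range(trials):
    keys = list(range(n))
    random.shuffle(keys)                 # a uniformly random insertion order
    root = None
    for k in keys:
        root = insert(root, k)
    avg += total_depth(root) / n
avg /= trials

print(f"measured average depth: {avg:.2f}  (log2(500) = {math.log2(500):.2f})")
```

The single 9.98 in the book is presumably one such sample rather than an exact expectation, which would explain why it doesn't match any clean closed form.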
3) Most likely by running a simulation: the quoted 12.51 is the measured average depth of one particular tree ("somewhat right-heavy") after a quarter-million random insert/delete pairs. It is clear, however, that this deletion strategy will, on average, tend to make trees less balanced over time.
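A scaled-down sketch of that experiment (assumed setup: delete a random existing key, insert a fresh random key, using the asymmetric successor-based deletion; the book runs a quarter-million pairs on a 500-node tree, here we run 25,000 to keep the runtime small):

```python
import random

def insert(root, key):
    if root is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < root["key"] else "right"
    root[side] = insert(root[side], key)
    return root

def delete(root, key):
    if root is None:
        return None
    if key < root["key"]:
        root["left"] = delete(root["left"], key)
    elif key > root["key"]:
        root["right"] = delete(root["right"], key)
    elif root["left"] is None:
        return root["right"]
    elif root["right"] is None:
        return root["left"]
    else:
        succ = root["right"]          # always replace from the right subtree
        while succ["left"] is not None:
            succ = succ["left"]
        root["key"] = succ["key"]
        root["right"] = delete(root["right"], succ["key"])
    return root

def total_depth(node, depth=0):
    if node is None:
        return 0
    return depth + total_depth(node["left"], depth + 1) + total_depth(node["right"], depth + 1)

def inorder(node, out):
    if node is not None:
        inorder(node["left"], out)
        out.append(node["key"])
        inorder(node["right"], out)

random.seed(0)
n = 500
keys = [random.random() for _ in range(n)]
root = None
for k in keys:
    root = insert(root, k)
before = total_depth(root) / n

pairs = 25_000
for _ in range(pairs):
    i = random.randrange(n)
    root = delete(root, keys[i])      # remove a random existing key...
    keys[i] = random.random()
    root = insert(root, keys[i])      # ...and insert a fresh random one

after = total_depth(root) / n
out = []
inorder(root, out)
print(f"size {len(out)}, avg depth before: {before:.2f}, after {pairs} pairs: {after:.2f}")
```

A single run like this is noisy, but over many pairs the average depth drifts the way the book describes; pushing the pair count toward Θ(n²) is what produces the Θ(√n) expected depth quoted in the snippet.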