How to get the size of a binary tree?_问答_开发者

I have a very simple binary tree structure, something like:

struct nmbintree_s {
    unsigned int size;
    int (*cmp)(const void *e1, const void *e2);
    void (*destructor)(void *data);
    nmbintree_node *root;
};

struct nmbintree_node_s {
    void *data;
    struct nmbintree_node_s *right;
    struct nmbintree_node_s *left;
};

Sometimes i need to extract a 'tree' from another and i need to get the size to the 'extracted tree' in order to update the size of the initial 'tree' .

I was thinking on two approaches:

1) Using a recursive function, something like:

unsigned int nmbintree_size(struct nmbin开发者_如何转开发tree_node* node) {
  if (node==NULL) {
    return(0);
  } 
  return( nmbintree_size(node->left) + nmbintree_size(node->right) + 1 );
}

2) A preorder / inorder / postorder traversal done in an iterative way (using stack / queue) + counting the nodes.

What approach do you think is more 'memory failure proof' / performant ?

Any other suggestions / tips ?

NOTE: I am probably going to use this implementation in the future for small projects of mine. So I don't want to unexpectedly fail :).

Just use a recursive function. It's simple to implement this way and there is no need to make it more compilcated.

If you were doing it "manually" you'd basically end up implementing the same thing, just that you wouldn't use the system call stack for temporary variables but your own stack. Usually this won't have any advantages outweighing the more complicated code.

If you later find out that a substantial time in your program is spend calculating the sizes of trees (which probably won't happen) you can still start to profile things and try how a manual implementation performs. But then it might also better to do algorithmic improvements like already keeping track of the changes in size during the extraction process.

If your "very simple" binary tree isn't balanced, then the recursive option is scary, because of the unconstrained recursion depth. The iterative traversals have the same time problem, but at least the stack/queue is under your control, so you needn't crash. In fact, with flags and an extra pointer in each node and exclusive access, you can iterate over your tree without any stack/queue at all.

Another option is for each node to store the size of the sub-tree below it. This means that whenever you add or remove something, you have to track all the way up to the root updating all the sizes. So again if the tree isn't balanced that's a hefty operation.

If the tree is balanced, though, then it isn't very deep. All options are failure-proof, and performance is estimated by measurement :-) But based on your tree node struct, either it's not balanced or else you're playing silly games with flags in the least significant bits of pointers...

There might not be much point being very clever with this. For many practical uses of a binary tree (in particular if it's a binary search tree), you realise sooner rather than later that you want it to be balanced. So save your energy for when you reach that point :-)

How big is this tree, and how often do you need to know its size? As sth said, the recursive function is the simplest and probably the fastest.

If the tree is like 10^3 nodes, and you change it 10^3 times per second, then you could just keep an external count, which you decrement when you remove a node, and increment when you add one. Other than that, simple is best.

Personally, I don't like any solution that requires decorating the nodes with extra information like counts and "up" pointers (although sometimes I do it). Any extra data like that makes the structure denormalized, so changing it involves extra code and extra chances for errors.