How to implement a generic hierarchy of structures with injected functionality_问答_开发者

I want to implement a generic hierarchy for tree structures, which can later be used in an implementation-independent way to describe generic algorithms over trees.

I started with this hierarchy:

interface BinaryTree<Node> {
    Node left(Node);
    bool hasLeft(Node);

    Node right(Node);
    bool hasRight(Node);
}

interface BinaryTreeWithRoot<Node> : BinaryTree<Node> {
    Node root();
}

interface BinaryTreeWithParent<Node> : BinaryTree<Node> {
    Node parent(Node);
    bool hasParent(Node);
}

Now, basically I want to be able to implement the concept of a subtree in a universal way: For each class T : BinaryTree, I want a 'class' Subtree(T) which provides the same functionality of T (so it must derive from it), and also rewrites the root() functionality.

Something like this:

class Subtree<T, Node> : T, BinaryTreeWithRoot<Node> 
    where T : BinaryTree<Node>
{
    T reference;
    Node root;

    void setRoot(Node root) {
        this.root = root;
    }

    override Node BinaryTreeWithRoot<Node>::root() {
        return this.root;
    }

    // Now, inherit all the functionality of T, so an instance of this class can be used anywhere where T can.
    forall method(arguments) return reference.method(arguments);
}

Now with this code I'm not sure how to create an object of type subtree, since the tree object should in some way be injected.

One approach is to create a subtree class for each tree class i create, but this means code duplication, and, after all, its the same thing.

So, one approach to this is mixins, which allow a generic class to derive from its template parameter.

I'm also interested how such a hierarchy can be implemented in Haskell, since Haskell has a great type system and I think it will be easier to inject such functionality.

For example in Haskell it may be something like:

class BinaryTree tree node where
    left :: tree -> node -> node
    right :: tree -> node -> node

class BinaryTreeWithRoot node where
    left :: tree -> node -> node
    right :: tree -> node -> node -- but this is a duplication of the code of BinaryTree
    root :: tree -> node

instance BinaryTree (BinaryTreeWithRoot node) where
    left = left
    right = righ开发者_开发百科t

data (BinaryTree tree node) => Subtree tree node = 
 ...

instance BinaryTreeWithRoot (Subtree tree node) where ...

I'm interested if and how this can be done in within an oop language (c++,c#,d,java), as c++ and d provide mixins out of the box (and i'm not sure for d), and out of curiosity with the Haskell type system.

Since D has "real" templates, not generics, making a template class inherit from its template parameter is trivial:

class A {}
class B(T) : T {
    static assert(is(B!T : T));  // Passes.
}

As far as making Subtree work in D, something like this should do it, assuming you also have a template class Node:

class Subtree(T) : T, BinaryTreeWithRoot!(Node!(T))
{
    T reference;
    Node root;

    void setRoot(Node root) {
        this.root = root;
    }

    override Node root() {
        return this.root;
    }
}

However, IIUC (correct me if I'm wrong), T is the payload of the tree and could therefore be a primitive. If that's the case, you would be better off getting your ability to use a Subtree!(T) as a T via alias this, which allows for subtyping without inheritance and works with primitives:

class Subtree(T) : BinaryTreeWithRoot!(Node!(T))
{
    T reference;
    alias reference this;  // Make this implicitly convertible to reference.
    Node root;

    void setRoot(Node root) {
        this.root = root;
    }

    override Node root() {
        return this.root;
    }
}

Creating a tree interface like this in Haskell is ... unusual. Both Node and Subtree are superfluous. This is partially due to algebraic types, and partially because Haskell data is immutable so different techniques are required to accomplish certain things (like setting the root node). It is possible to do it, the interface would look something like:

class BinaryTree tree where
    left :: tree a -> Maybe (tree a)
    right :: tree a -> Maybe (tree a)

-- BinaryTreeWithRoot inherits the BinaryTree interface
class BinaryTree tree => BinaryTreeWithRoot tree where
    root :: tree a -> tree a

Then, with a pretty standard binary tree definition:

data Tree a =
  Leaf
  | Branch a (Tree a) (Tree a)

instance BinaryTree Tree where
  left Leaf = Nothing
  left (Branch _ l r) = Just l
  right Leaf = Nothing
  right (Branch _ l r) = Just r

data TreeWithRoot a =
  LeafR (TreeWithRoot a)
  | BranchR a (TreeWithRoot a) (TreeWithRoot a) (TreeWithRoot a)

instance BinaryTree TreeWithRoot where
-- BinaryTree definitions omitted

instance BinaryTreeWithRoot TreeWithRoot where
  root (LeafR rt) = rt
  root (BranchR _ rt l r) = rt

Since this interface returns a Maybe (tree a), you can also use left and right to check if the branches exist instead of using separate methods.

There's nothing particularly wrong with it, but I don't believe I've ever seen anyone actually implement this approach. The more usual techniques are either to define traversals in terms of Foldable and Traversable or create a zipper. Zippers are simple to derive manually, but there are several generic zipper implementations, such as zipper, pez, and syz.

In C# 4 I would use dynamics for achieving this goal. For example you could try defining the SubtTree class as:

public class Subtree<T, Node> : DynamicObject, BinaryTreeWithRoot<Node> where T : BinaryTree<Node>
{
    private readonly T tree;

    public Subtree(T tree)
    {
        this.tree = tree;
    }
}

and override appropriate methods of DynamicObject using methods/properties of tree. More information (and sample code) can be found in this great blog post about Using C# 4.0 dynamic to drastically simplify your private reflection code.

It is worth mentioning that due to usage of dynamic capabilities and reflection, small performance overhead will be introduced as well as reduced safety (as it may involve violation of encapsulation).

I think the approach through "BinaryTree"s assumes too much of a fixed structure and unnecessarily defines your interface in a nongeneric way. Doing this makes it difficult to reuse algorithms as your tree expands into non-binary forms. Instead, you will need to code your interfaces for multiple styles when such is not necessary or useful.

Also, coding with hasLeft/hasRight checks means every access is a two step process. Checking existence of a fixed position will not provide for efficient algorithms. Instead, I think you will find that adding a generic property that may be binary left/right or binary red/black or a character index or anything else will allow much more reuse of your algorithms and checking that data can be done only by those algorithms that need it (specific binary algorithms).

From a semantic view, you want to encode some basic properties first, and then specialise. When you are "at" a node inside an algorithm, you want to be able to first find the children edges. This should be a container range of edge structures that allow you to navigate to the children nodes. Since it can be a generic container, it could have 0, 2, 5, 1, or even a 100 edges in it. Many algorithms do not care. If it has 0, iterating over the range will do nothing - no need to check hasX or hasY. For each edge, you should be able to get the node of the child, and recurse for whatever algorithm you wish.

This is basically the approach boost takes in it's Graph library, and it allows for expansion of tree algorithms to graphs where they are applicable, for even better generic algorithm reuse.

So already you have a basic interface with this

TreeNode:
  getChildEdges: () -> TreeEdgeRange

TreeEdge:
  getChildNode: () -> TreeNode

and whatever range-to-object your favorite language enjoys. D has a particularly useful range syntax, for instance.

You will want to have some basic Tree object that gives you nodes. Something like

Tree:
  getTreeNodes: () -> TreeNodeRange

starts you out.

Now, if you want to support BinaryTrees, do so as a restriction on this interface. Note that you don't really need any new interface methods, you just need to enforce more invariants - that every TreeNode has 0, 1, or 2 childEdges. Just make an interface type that indicates this semantic restriction:

BinaryTree : Tree

And if you want to support rooted trees, adding an interface layer with

RootedTree : Tree:
  getRoot: () -> TreeNode

adds that capability.

The basic idea is that you shouldn't have to add interface methods to add semantic requirements if you are making your classes more specific down the hierarchy. Only add interface methods if there is a new semantic behavior that needs accessed. Otherwise - enforce new invariants on the generic interface.

Eventually, you'll want to decorate nodes and edges with structures that hold data about the node or edge, so you can build Tries and Red-Black trees and all the great tools of advanced algorithmics. So you will want to have

PropertiedTreeNode<Property> : TreeNode:
  getProperty: () -> Property

PropertiedTreeEdge<Property> : TreeEdge:
  getProperty: () -> Property

Since this is something you will want to allow generic algorithms to work on, the type information of whether a property is a part of the Tree or not should be generic and something algorithms can ignore. This puts you on the design track of boost, where these issues have been resolved very elegantly. I would recommend studying that library if you want ideas on how to build a generic tree algorithm library.

If you follow the above guidelines of types-equating-to-semantic-descriptions, then thee SubTree should be obvious - it's exactly the same type as the tree it is coming from! In fact, you really should not have a SubTree type at all. Instead, you should just have a method of the specific TreeNode type you are dealing with

PropertiedTreeNode<Property>:
  getSubTree: () -> PropertiedTree<Property>

And, as in boost, as you encode more of the information of the capabilities of Tree in it's generic properties, you can get new Tree types with broader interface contracts.

As you pointed out, one approach is to create a subtree class for each tree class i create, This means code duplication, but it can be somehow "avoided", or better, automated, using reflection and T4. I did it myself for a past project and it works quite well!

You may start from Oleg Synch blog for an overview on T4. Here a good example of automatically generated classes: http://www.olegsych.com/2007/12/how-to-use-t4-to-generate-decorator-classes/

You could do:-

public class Node {
    public Node Left {get; set:}
    public Node Right {get; set;}
    public Node Parent {get; set;}  // if you want to be able to go up the tree
    public Node Root {get; set;}    // only if you want a direct link to root
}

A sub-tree is just a tree, and every tree can be represented as the root node of that tree and then that node can have properties enabling navigation through the tree.

Make this a generic Node<T> and store the value too. If you don't like public setters make them private and set them only in a constructor or some safe AddLeft(...), etc. methods.

You can get rid of Root too and just traverse up the Parent links until you find a null Parent value (or reach the top node for your subtree case).