Looking for a way to store a parent-child-grandchild structure_问答_开发者

I have a need to store a structure where N parents will have 1 to N children and each child will have 1 to N children. I would like to store this in a db in a manner that is both relatively performant and highly extensible w/o requiring db schema changes.

Each parent must be unique and N parents might have the same child. However, that child may have different children depending on the parent. (clear as mud?)

It may be easier to describe this as parentA may have a male child that has certain attributes (brown hair, brown eyes). parentb also has a male child but this child has blonde hair and blue eyes. I need to store each of these children (male and female) and each attribute (hair and eye color) in a normalized fashion and relate them in such a way that when I query parenta I get al of their children and those childrens attributes.

I have done a bit with tree structures and hierachical structures in SQL but am having a hard time conceptualizing开发者_如何学编程 this particular scenario in a manner that meets my requirements of performance and extensibility. Children and associated attributes will be added at regular (if not frequent) intervals. Thanks in advance. I know clarification will be required.

Additional clarification

Ok, it appears a different example may be needed. Let's use the good-old example of a car.

CarA and CarB both have steering wheels, engines, and tires. CarA's steering wheel has radio controls on it. CarB's steering wheel does not. CarA has a six cylinder motor and CarB has an eight cylander motor. I need to model the relationship between each car and each feature with that feature's attribute. Am I helping at all? -rb

if this is fixed at three layers and they are conceptually distinct (as in your extended example) then i think you are being confused by the idea of trees where they are not necessary. just use tables and relations as you would with any other problem.

a table for parents
a table for children (if they have always exactly two parents, parents can be fields, otherwise you also need a table for the relationship)
a table for attributes and another for the many-to-many relationship between that and children [or store these in the children table - see comment from DForck42 below]

trees are necessary where the nodes at different levels are "the same thing". but they're not a great fit with sql, so i wouldn't try to use them where they don't seem to be necessary.

update. from your comments below i think that you are saying that children are divided into classes or types, and that the possible attributes depends on the type of the child, but that the values of those attributes depends on the parent.

in that case you have a completely different problem, more like OO inheritance. the simplest solution that i see is that you can have a different table for each kind of child. then each table has different columns for the different attributes. child tables refer to parent tables.

so you would have a parent table with IDs. then you might have a child table for "admin sites". each row of that child table would reference a parent via the ID and contain URL, CSS, etc as columns. another child type, like "database config page" would be in another table, with a different set of attributes.

if you have attributes in common then you can either repeat them in each table or have a "superclass" table.

solutions like this can get quite complicated and i'd suggest asking another question once you have a clearer explanation of what you want. there's a good explanation of the options here - http://www.sqlalchemy.org/docs/orm/inheritance.html (ignore the parts relevant to SQLAlchemy and just look at how they are using tables in different ways to model inheritance).

The way I read your question, you only need five tables.

 -> Parent
    ParentId, Col1, Col2, Col3

 -> Child
    ChildId, Col1, Col2, Col3

 -> Grandchild
    GrandchildId, Col1, Col2, Col3

 -> ParentToChild
    ParentId, ChildId

 -> ChildToGrandchild
    ChildId, GrandchildId

That stores all the relationships, and it would be up to you to make constraints for the logic you want; with this implementation, N to N relationships are possible for (Parent, Child), and (Child, Grandchild).

Well, here's another approach. You only need two tables. The first one is where you store all your 'objects' (whatever they are) which make up your hierarchy:

ObjectID | ObjectName | ...

The second is the relationship table:

RelID | ParentID | ChildID

The relationship table can include a constraint that ensures no object is the child of more than one parent, which gives you integrity pretty much for free.

Now traversing the table(s) to extract hierarchy can get tricky, but it can be done with a relatively simple stored proc. There are two catches. First, all your objects should share the same table, and thus the same unique IDs (ideally). Second is how many levels of recursion your DB supports. In my experience the 32 levels supported by SQL Server have been more than adequate, for example. However, doing the traversal in code rather than in the DB can kill performance.

There are other ways to approach this. If you Google for database hierarchical data you'll find a few, including a formal CS paper or two.

I've used this method in the past and I find it simple and performant enough.

What's wrong with the following approach:

create Table Persons    {
  PersonID int Primary Key,
  Name  varchar(100),
  MotherID int {Foreign Key},
  FatherID int {Foreign Key}
}

create Table Attributes
{
    PersonID int {Foreign Key},
    AttributeName varchar(10),
    AttributeValue varchar(10)
}

You'd get all the attributes for a given persons children by:

Select
      Persons.Name,
      Attributes.AttributeName,
      Attributes.AttributeValue
From
      Persons
Left Join
      Atttributes
On
     Persons.PersonID = Attributes.PersonID  
Where
     MotherID = @PersonID or FatherID = @PersonID