I have a project where I hope to store hierarchical information (specifically, categories and subcategories) in what is basically a flat database system (in short, it's a collection of XML records). I'd like to store information about categories and subcategories in the system:
- Animals
- Invertebrates
- Vertebrates
- Weather
- Buildings
- Skyscrapers
- Historic buildings
...and so forth.
Everything in the system, for better or worse, is stored as an XML record; this is just how the storage system works.
Which means that each category in the system is also stored as an XML record, like so:
<record id="12345">
<attribute name="Skyscrapers" />
<attribute type="Category" />
</record>
So I am wondering how to implement a hierarchy under these constraints.
I'm used to data storage in a relational database. In those cases I pretty much always use the nested set model. It seems like this wouldn't be a good choice in this case because:
- Each time you insert an item, you
have to change the
right
and/orleft
values for many of the nodes. I can't do a bulk update on the XML files, so I'd have to update each one individually. - Although there are search functions that let me filter by
"less-than" or "greater-than" (so I
could in theory pull only the
relevant child nodes or parent nodes
of a given category), I can't order
the XML records by attributes. So
I'd have to retrieve all of the
documents, transform them into a
list of objects that can be sorted
(in this case with Python) and then
sort them using a
lambda
function.
Since my data storage model isn't significantly different than storing data using NoSQL I was wondering if anyone using that storage mechanism has come up with a good t开发者_C百科rick for handling and storing hierarchical data.
This class (based on the Peewee ORM) allows you to handle hierarchical data with a flat relational database (PostgreSQL, MySQL and SQLite are supported):
https://github.com/mathieurodic/peewee-tree/blob/master/node.py
You can make a few changes in the class methods so the changes are also applied to the XML file you are manipulating.
Not sure how applicable this would be for your use case, but as a thought, maybe using Beautiful Soup will help. Perhaps it's default hierarchical representation will be sufficient for your needs.
精彩评论