Hierarchical text data structure parser suggestions needed

Developer https://www.devze.com 2023-03-01 04:55 Source: Internet
Given the following hierarchical text input (JunOS-like, in fact), I need to parse it into a suitable data structure that I can query to obtain a user-specified branch of the tree. I then want to linearize that branch (?) into some sort of mapping that lets the user change, insert, or delete entries, and finally write the result back to an output file as a tree again. The original data would be stored in a "version" file to allow later "history" or "rollback" operations covering the full set of operations just described.

version 1.0;
description "Example data";

weights {
    weight low {
        value 1;
        description Forgetable;
    }
    weight medium {
        value 2;
        description Important;
    }
    weight high {
        value 3;
        description Critical;
    }
}

tags {
    tag foo {
        description "Some foo";
    }
    tag bar {
        description "Some bar";
    }
    tag baz {
        description "Some baz";
    }
}

tag-sets {
    tag-set foo\ bar {
        tag [ foo bar ];
        description Foo\ and\ bar;
    }
    tag-set "foo bar baz" {
        tag-set "foo bar";
        tag baz;
        description "Foo, bar and baz";
    }
}

Questions:

1) What data structure suits this input best? What C structure would you suggest?

2) I do not want to use yacc/lex to parse it (unnecessary extra steps, and it complicates collaborative work since not everybody, myself included, likes or knows how to use these tools). What parsing method is the easiest to implement for this sort of parsing problem?

3) What method do you suggest for maintaining the "types" of nodes in source code? It seems quite tricky to me at the moment (in fact I have no idea how to do it yet). For instance, there is a node of type "version" that takes a "word" as its argument. It is also known that the "version" node exists only in the root branch of the hierarchy. Another example: there are several "description" nodes that take either a "word" or a "string" as their argument, and a "description" node may belong to any node of the hierarchy. And so on. How should I cope with this sort of problem?

Note to explain the purpose: the resulting utility will "version" data stored in text files quite similar to the example above, and the user will query/change/insert/delete the data to maintain some specific kind of information (say, a todo list, as an example). Consider it a sort of simple database rather than a configuration file or the like (sorry for my English). The idea is to provide a) a CLI, b) a command-line tool, and c) the option for users to edit the data in their own editor if they do not want to use a) or b)...

At least some "general" suggestions would be highly appreciated.


I would use a recursive descent parser combined with some sort of hash table or map for data storage. The format closely resembles JSON, though not exactly: strings, numbers, lists, and dictionaries all seem to be supported. A simple "Object"-type class (similar to JavaScript's) would do the trick for storing that.
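To make the recursive descent idea concrete, here is a rough C sketch for the grammar the example input seems to follow: a statement is one or more words ended by either ";" or a "{ ... }" block of nested statements. Everything named here (Node, Parser, parse_config, find_child, free_tree) is made up for illustration. It also stores the result as a first-child/next-sibling tree rather than a hash table, keeps "[" and "]" as ordinary words, and does not handle comments, so treat it as a starting point under those assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One statement: keyword plus arguments, and child statements for blocks. */
typedef struct Node {
    char **words;          /* e.g. {"weight", "low"}, unquoted and unescaped */
    size_t nwords;
    int is_block;          /* 1 for "words { ... }", 0 for "words;" */
    struct Node *children; /* first child statement */
    struct Node *next;     /* next sibling statement */
} Node;

typedef struct { const char *p; int error; } Parser;
static Node *parse_list(Parser *ps, int nested);
static void skip_space(Parser *ps) { while (isspace((unsigned char)*ps->p)) ps->p++; }

/* Read a "quoted string" or a bare word; a backslash makes the next char literal. */
static char *read_word(Parser *ps) {
    char *out = malloc(strlen(ps->p) + 1);
    size_t len = 0;
    int quoted = (*ps->p == '"');
    if (!out) { ps->error = 1; return NULL; }
    if (quoted) ps->p++;
    for (;; ps->p++) {
        char c = *ps->p;
        if (quoted ? c == '"' : (c == '\0' || isspace((unsigned char)c) || strchr(";{}", c))) break;
        if (c == '\0') { free(out); ps->error = 1; return NULL; } /* unterminated string */
        if (c == '\\' && ps->p[1] != '\0') c = *++ps->p;
        out[len++] = c;
    }
    if (quoted) ps->p++; /* skip the closing quote */
    out[len] = '\0';
    return out;
}

/* statement := word+ ( ';' | '{' statement* '}' ) */
static Node *parse_statement(Parser *ps) {
    Node *n = calloc(1, sizeof *n);
    if (!n) { ps->error = 1; return NULL; }
    for (;;) {
        skip_space(ps);
        char c = *ps->p;
        int ends = (c == ';' || c == '{');
        if (c == '\0' || c == '}' || (ends && n->nwords == 0)) { ps->error = 1; return n; }
        if (ends) {
            ps->p++;
            if (c == '{') { n->is_block = 1; n->children = parse_list(ps, 1); }
            return n;
        }
        char **w = realloc(n->words, (n->nwords + 1) * sizeof *w);
        if (!w) { ps->error = 1; return n; }
        n->words = w;
        if (!(w[n->nwords] = read_word(ps))) return n;
        n->nwords++;
    }
}

/* Parse sibling statements up to '}' (nested) or end of input (top level). */
static Node *parse_list(Parser *ps, int nested) {
    Node *head = NULL, **tail = &head;
    for (;;) {
        skip_space(ps);
        if (*ps->p == '}' || *ps->p == '\0') {
            if ((*ps->p == '}') != (nested != 0)) ps->error = 1; /* unbalanced braces */
            else if (nested) ps->p++;
            return head;
        }
        if (!(*tail = parse_statement(ps)) || ps->error) return head;
        tail = &(*tail)->next;
    }
}

void free_tree(Node *n) {
    if (!n) return;
    for (size_t i = 0; i < n->nwords; i++) free(n->words[i]);
    free(n->words);
    free_tree(n->children);
    free_tree(n->next);
    free(n);
}

/* Parse a whole document into *out; returns 0 on success, -1 on error. */
int parse_config(const char *text, Node **out) {
    assert(text && out);
    Parser ps = { text, 0 };
    *out = parse_list(&ps, 0);
    if (ps.error) { free_tree(*out); *out = NULL; }
    return ps.error ? -1 : 0;
}

/* First sibling whose keyword is kw and, if name is non-NULL, whose first argument is name. */
Node *find_child(Node *n, const char *kw, const char *name) {
    for (; n; n = n->next)
        if (n->nwords > 0 && !strcmp(n->words[0], kw) &&
            (!name || (n->nwords > 1 && !strcmp(n->words[1], name)))) return n;
    return NULL;
}
```

With the example input loaded into a string, find_child(root, "weights", NULL) would give the weights block, and find_child(weights->children, "weight", "high") the branch for weight high. Since quoting and escapes are dropped while parsing, code that writes the tree back out would have to re-quote any word that contains whitespace or one of ; { }.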

For managing the history of the data structure, you could implement something similar to OMeta worlds (see: http://www.vpri.org/pdf/rn2008001_worlds.pdf). It leverages a prototypal object model for managing scope and history.
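As a very loose illustration of that idea in C (not the paper's implementation), each "world" below records only the changes made in it and falls back to its parent for everything else. Discarding a world is a rollback, and committing replays its changes onto the parent. The World and Entry types, the world_* function names, and the use of flattened path strings such as "weights/weight low/value" as keys are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Entry {
    char *key, *value;       /* value == NULL marks a deletion */
    struct Entry *next;
} Entry;

typedef struct World {
    struct World *parent;    /* older state this world is layered on */
    Entry *entries;          /* only the changes made in this world */
} World;

static char *dup_str(const char *s) {
    char *d = s ? malloc(strlen(s) + 1) : NULL;
    if (d) strcpy(d, s);
    return d;
}

/* Create a new, empty world on top of parent (parent may be NULL for the base). */
World *world_sprout(World *parent) {
    World *w = calloc(1, sizeof *w);
    if (w) w->parent = parent;
    return w;
}

static Entry *find_local(const World *w, const char *key) {
    for (Entry *e = w->entries; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}

/* Record a change in this world only; value == NULL records a deletion. */
int world_set(World *w, const char *key, const char *value) {
    Entry *e = find_local(w, key);
    char *v = dup_str(value);
    if (value && !v) return -1;
    if (!e) {
        e = calloc(1, sizeof *e);
        if (!e || !(e->key = dup_str(key))) { free(e); free(v); return -1; }
        e->next = w->entries;
        w->entries = e;
    }
    free(e->value);
    e->value = v;
    return 0;
}

/* Look up a key: newest world first, then fall back through the parents. */
const char *world_get(const World *w, const char *key) {
    for (; w; w = w->parent) {
        Entry *e = find_local(w, key);
        if (e) return e->value;   /* may be NULL: deleted in this world */
    }
    return NULL;
}

/* Rollback: throw this world's changes away; the parent is untouched. */
void world_discard(World *w) {
    Entry *e = w->entries;
    while (e) {
        Entry *next = e->next;
        free(e->key);
        free(e->value);
        free(e);
        e = next;
    }
    free(w);
}

/* Commit: replay this world's changes onto its parent, then drop the world.
   On failure (-1) the parent may hold a partial commit and w is left alive. */
int world_commit(World *w) {
    assert(w && w->parent);
    for (Entry *e = w->entries; e; e = e->next)
        if (world_set(w->parent, e->key, e->value) != 0) return -1;
    world_discard(w);
    return 0;
}
```

An editing session could then sprout a world from the loaded data, apply the user's changes with world_set, and either commit or discard at the end. Keeping a chain of committed worlds instead of merging them would give the "history" part.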


You could start with an existing JSON parser and modify it accordingly.
