I want to parse configuration files like apache2.conf, which look like this:
<Group group1>
param1 1
<SomeGroup group3>
param3 3
</SomeGroup>
</Group>
<Group group2>
param2 2
</Group>
Regexp:
re.findall(r'\</?[^\>]+\>([\s\S]+)\<//?[^\>]+\>', text, re.MULTILINE)
If I use a lazy regexp, it cuts the text off like this:
<Group group1>
param1 1
<SomeGroup group3>
param3 3
</SomeGroup>
If I use a greedy regexp, it matches the whole text. So, what is the correct way to parse it? Or are there any libraries for this?
Augeas has Python bindings.
There is no way to do this with a regexp alone. Python's re module has no way to keep track of how deeply the blocks are nested, so a regexp can only parse very simple, non-nested input. See here for other options: Any python libs for parsing apache config files?
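If a library is more than you need, the nesting can be tracked by hand with a stack. The following is only a minimal sketch, assuming one directive or tag per line and no line continuations or quoted arguments; parse_config and the dict layout are names made up for this example, not part of any library.

```python
import re

# Per-line patterns: regexps are fine for a single line, the stack
# below handles the nesting that a regexp cannot track.
OPEN_TAG = re.compile(r'<(\w+)\s*([^>]*)>$')   # e.g. <Group group1>
CLOSE_TAG = re.compile(r'</(\w+)>$')           # e.g. </Group>


def parse_config(text):
    """Parse Apache-style nested blocks into a tree of dicts (hypothetical helper)."""
    root = {"name": None, "arg": None, "params": [], "children": []}
    stack = [root]  # stack[-1] is the block currently being filled
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        close = CLOSE_TAG.match(line)
        if close:
            # A closing tag must match the innermost open block.
            if len(stack) == 1 or stack[-1]["name"] != close.group(1):
                raise ValueError(f"line {lineno}: unexpected {line}")
            stack.pop()
            continue
        opened = OPEN_TAG.match(line)
        if opened:
            block = {"name": opened.group(1), "arg": opened.group(2).strip(),
                     "params": [], "children": []}
            stack[-1]["children"].append(block)
            stack.append(block)  # descend into the new block
            continue
        # Anything else is treated as "key value".
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        stack[-1]["params"].append((parts[0], value))
    if len(stack) != 1:
        raise ValueError(f"unclosed <{stack[-1]['name']}>")
    return root
```

Calling parse_config(text) on the sample from the question gives a root dict whose "children" list holds the two Group blocks, with SomeGroup nested inside the first one. Real Apache configs have more syntax than this handles, so treat it as a starting point rather than a full parser.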