I have to parse a file that looks like this:
versioninfo
{
    "editorversion" "400"
    "editorbuild" "4715"
}
visgroups
{
}
world
{
    "id" "1"
    "mapversion" "525"
    "classname" "worldspawn"
    solid
    {
        "id" "2"
        side
        {
            "id" "1"
            "plane" "(-544 -400 0) (-544 -240 0) (-272 -240 0)"
        }
        side
        {
            "id" "2"
            "plane" "(-544 -240 -16) (-544 -400 -16) (-272 -400 -16)"
        }
    }
}
I have a parser written from scratch, but it has a few bugs that I can't track down and I imagine it'll be difficult to maintain if the format changes in the future. I decided to use the GOLD Parsing System to generate a parser, instead. My grammar looks like this:
"Start Symbol" = <SectionList>
! SETS
{Section Chars} = {AlphaNumeric} + [_]
{Property Chars} = {Printable} - ["]
! TERMINALS
SectionName = {Section Chars}+
PropertyPart = '"' {Property Chars}* '"'
! RULES
<SectionList> ::= <Section>
                | <Section> <SectionList>
<SectionBody> ::= <PropertyList>
                | <SectionList>
                | <PropertyList> <SectionList>
<Section> ::= SectionName '{' '}'
            | SectionName '{' <SectionBody> '}'
<PropertyList> ::= <Property>
                 | <Property> <PropertyList>
<Property> ::= PropertyPart PropertyPart
There are no errors and it parses my 2000-line test file just fine. However, this is my first time writing a custom grammar, so I'm not sure if I'm doing it correctly.
Are there any improvements I could make to the grammar above?
Below are some changes I would suggest for better performance.
1) Make the list rules left-recursive. GOLD generates a shift-reduce (LALR) parser, and with left recursion it can reduce after every list item instead of stacking up the whole list before the first reduction.
<SectionList> ::= <Section>
                | <SectionList> <Section>
<PropertyList> ::= <Property>
                 | <PropertyList> <Property>
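To make the difference concrete, here is a rough Python sketch (not GOLD itself, and not a measurement of GOLD's speed) that simulates the shifts and reductions for a flat list and reports the peak stack depth. The function name max_stack_depth and the "Item"/"List" symbols are made up for this illustration.

```python
def max_stack_depth(n_items, left_recursive):
    """Simulate shift-reduce parsing of a list of n_items items.

    Returns the deepest the parse stack gets. Toy model only: it tracks
    grammar symbols, not the parser states a real LALR engine would use.
    """
    if n_items < 1:
        raise ValueError("the list rules require at least one item")
    stack = []
    peak = 0
    for _ in range(n_items):
        stack.append("Item")              # shift the next item
        peak = max(peak, len(stack))
        if left_recursive:
            # List ::= Item | List Item
            # A reduction is possible right after every item.
            if stack[-2:] == ["List", "Item"]:
                stack[-2:] = ["List"]
            else:
                stack[-1:] = ["List"]
        # Right-recursive (List ::= Item | Item List): nothing can be
        # reduced until the last item has been shifted.
    if not left_recursive:
        stack[-1:] = ["List"]             # List ::= Item (the last item)
        while stack[-2:] == ["Item", "List"]:
            stack[-2:] = ["List"]         # List ::= Item List, unwinding
    assert stack == ["List"]
    return peak


print(max_stack_depth(2000, left_recursive=True))
print(max_stack_depth(2000, left_recursive=False))
```

With left recursion the stack never holds more than two symbols, while with right recursion it grows with the length of the list. Both forms accept exactly the same input.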
2) The third alternative in the rule below only allows a <PropertyList> before a <SectionList>; properties cannot appear after a section or between different <Section>s. Make sure that matches your requirements.
<SectionBody> ::= <PropertyList>
                | <SectionList>
                | <PropertyList> <SectionList>
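A quick way to see which orderings this rule allows is to write each direct child of a body as one letter and treat the rule as a regular expression. This is only a sketch in Python; the names ORIGINAL_BODY, INTERLEAVED_BODY and accepted are invented here, and the relaxed pattern is a hypothetical alternative, not something your format is known to need.

```python
import re

# One letter per direct child of a section body:
#   "P" = a property, "S" = a nested section.
# The three alternatives of <SectionBody>, in order:
ORIGINAL_BODY = re.compile(r"P+|S+|P+S+")
# Hypothetical relaxed rule: properties and sections in any order.
INTERLEAVED_BODY = re.compile(r"[PS]+")


def accepted(children, pattern=ORIGINAL_BODY):
    """Return True if the sequence of children matches the body rule."""
    return pattern.fullmatch(children) is not None


print(accepted("PPPS"))                   # properties, then a section
print(accepted("SP"))                     # property after a section
print(accepted("PSP"))                    # property between/after sections
print(accepted("PSP", INTERLEAVED_BODY))  # same input, relaxed rule
```

The empty body is not covered by either pattern because the grammar handles it in <Section> itself (SectionName '{' '}').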
I can help further if you describe the language in terms of "it should accept this, it shouldn't accept that" rather than with a single sample input, which does not give a complete picture of your language. Alternatively, tell me about the bugs you ran into, and we can derive the language description from those.
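As an illustration of what such an accept/reject list could look like, here is a small hand-written recognizer in Python that follows the grammar from the question. It is a sketch under assumptions: the character sets are approximated with ASCII regular expressions ({AlphaNumeric} as A-Z, a-z, 0-9 and {Printable} as anything except a quote or newline), and all function names are made up for this example.

```python
import re

# SectionName, PropertyPart and the two brace terminals; whitespace is skipped.
TOKEN = re.compile(
    r'\s*(?:(?P<name>[A-Za-z0-9_]+)|(?P<prop>"[^"\n]*")|(?P<brace>[{}]))'
)


def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def kind_at(tokens, pos):
    return tokens[pos][0] if pos < len(tokens) else None


def expect(tokens, pos, kind, value=None):
    if kind_at(tokens, pos) != kind or (value is not None and tokens[pos][1] != value):
        raise SyntaxError(f"expected {value or kind} at token {pos}")
    return pos + 1


def parse_section_list(tokens, pos):
    # <SectionList>: one or more sections
    pos = parse_section(tokens, pos)
    while kind_at(tokens, pos) == "name":
        pos = parse_section(tokens, pos)
    return pos


def parse_section(tokens, pos):
    # <Section> ::= SectionName '{' [ <SectionBody> ] '}'
    pos = expect(tokens, pos, "name")
    pos = expect(tokens, pos, "brace", "{")
    while kind_at(tokens, pos) == "prop":      # <PropertyList>
        pos = expect(tokens, pos, "prop")      # key
        pos = expect(tokens, pos, "prop")      # value
    if kind_at(tokens, pos) == "name":         # nested <SectionList>
        pos = parse_section_list(tokens, pos)
    return expect(tokens, pos, "brace", "}")


def accepts(text):
    """Return True if text is a sentence of the grammar, False otherwise."""
    try:
        tokens = tokenize(text)
        return parse_section_list(tokens, 0) == len(tokens)
    except SyntaxError:
        return False


should_accept = ['visgroups { }', 'world { "id" "1" solid { } }']
should_reject = ['', '"id" "1"', 'world { "id" }', 'world { "id" "1"', 'my-section { }']
for sample in should_accept + should_reject:
    print(repr(sample), accepts(sample))
```

Whether each rejected case (an empty file, a top-level property, a key without a value, a missing closing brace, a section name with a hyphen) really should be rejected is exactly the kind of decision that needs to come from you.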
Regards, V M Rakesh (rakesh.vm@gmail.com)