Nested brackets/chars '(' and ')' in grammar/ANTLRWorks warning: Decision can match input such as ... using multiple alternatives_问答_开发者

Nested brackets/chars '(' and ')' in grammar/ANTLRWorks warning: Decision can match input such as ... using multiple alternatives

开发者 https://www.devze.com 2023-03-23 22:49 出处：网络

The grammar below parses ( left part = right part # comment ), # comment is optional. Two questions: Sometimes warning (ANTLRWorks 1.4.2):

The grammar below parses ( left part = right part # comment ), # comment is optional.

Two questions:

Sometimes warning (ANTLRWorks 1.4.2): Decision can match input such as "{Int, Word}" using multiple alternatives: 1, 2 (referencing id2)
But only sometimes!
The next extension should be that the comment (id2) can contain chars '(' and ')'.

The grammar:

grammar NestedBrackets1a1;

//==========================================================
// Lexer Rules
//==========================================================

Int
  :  Digit+
  ;

fragment Digit
  :  '0'..'9'
  ;

Special
  : ( TCUnderscore | TCQuote )
  ;
TCListStart   : '(' ; 
TCListEnd     : ')' ;   
fragment TCUnderscore  : '_' ;
fragment TCQuote       : '"' ;

// A word must start with a letter
Word
  :  ( 'a'..'z' | 'A'..'Z' | Special ) ('a'.开发者_开发技巧.'z' | 'A'..'Z' | Special | Digit )*
  ;

Space
  :  ( ' ' | '\t' | '\r' | '\n' ) { $channel = HIDDEN; }
  ;

//==========================================================
// Parser Rules
//==========================================================

assignment
  :  TCListStart id1 '=' id1 ( comment )? TCListEnd
  ;

id1
  :  Word+
  ;

comment
  : '#' ( id2 )*
  ;

id2
  :  ( Word | Int )+
  ;

ANTLRStarter wrote:

Sometimes warning (ANTLRWorks 1.4.2): Decision can match input such as "{Int, Word}" using multiple alternatives: 1, 2 (referencing id2) But only sometimes!

No, the grammar you posted will always produce this warning. Perhaps you don't always notice it (your IDE-plugin or ANTLRWorks might show it in a tab you don't have opened), but the warning is there. Convince yourself by creating a lexer/parser from the command line:

java -cp antlr-3.4-complete.jar org.antlr.Tool NestedBrackets1a1.g

will produce:

warning(200): NestedBrackets1a1.g:49:19:
Decision can match input such as "{Int, Word}" using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input

This is because you have a * after ( id2 ) inside your comment rule, and id2 also is a repetition of tokens: ( Word | Int )+. Let's say your input is "# foo bar" (a # followed by two Word tokens). ANTLR can now parse the input in more than 1 way: the 2 tokens "foo" and "bar" could be matched by ( id2 )*, where id2 matches a single Word token at a time, but "foo" and "bar" could also be matches in one go of the id2 rule.

Look at the merged rules:

comment
  :  '#' ( ( Word | Int )+ )*
  ;

See how you're repeating a repetition: ( ( ... )+ )*? This is usually a problem, as it is in your case.

Resolve this problem by either replacing the * with a ?:

comment
  :  '#' ( id2 )?
  ;

id2
  :  ( Word | Int )+
  ;

or by removing the +:

comment
  :  '#' ( id2 )*
  ;

id2
  :  ( Word | Int )
  ;

ANTLRStarter wrote:

The next extension should be that the comment (id2) can contain chars '(' and ')'.

That is asking for trouble since a comment is followed by a TCListEnd, which is a ). I don't recommend letting a comment match ).

EDIT

Note that comments are usually stripped from the source file while tokenizing the input source. That way you don't need to account for them in your parser rules. You can do that by "skipping" these tokens in a lexer rule:

Comment
  :  '#' ~('\r' | '\n')* {skip();}
  ;

Nested brackets/chars '(' and ')' in grammar/ANTLRWorks warning: Decision can match input such as ... using multiple alternatives

EDIT

精彩评论

关注公众号

热门标签

图文推荐

Nested brackets/chars '(' and ')' in grammar/ANTLRWorks warning: Decision can match input such as ... using multiple alternatives

EDIT

更多 问答 相关资讯：

精彩评论

关注公众号

热门标签

图文推荐

更多问答相关资讯：