How to force ANTLR to generate NoViableAltException?

Developer (https://www.devze.com), 2022-12-20 04:41, source: the web
I'm working with ANTLR 3.2. I have a simple grammar that consists of atoms (which are either the characters "0" or "1"), and a rule which accumulates a comma-separated sequence of them into a list.

When I pass in "00" as input, I don't get an error, which surprises me because this should not be valid input:

C:\Users\dan\workspace\antlrtest\test>java -cp antlr-3.2.jar org.antlr.Tool Test.g
C:\Users\dan\workspace\antlrtest\test>javac -cp antlr-3.2.jar *.java
C:\Users\dan\workspace\antlrtest\test>java -cp .;antlr-3.2.jar TestParser
[0]

How can I force an error to be generated in this case? It's particularly puzzling because when I use the interpreter in ANTLRWorks on this input, it does show a NoViableAltException.

I find that if I change the grammar to require, say, a semicolon at the end, an error is generated, but that solution isn't available to me in the real grammar I am working on.

Here is the grammar, which is self-contained and runnable:

grammar Test;

@parser::members {
  public static void main(String[] args) throws Exception {
    String text = "00";
    ANTLRStringStream in = new ANTLRStringStream(text);
    TestLexer lexer = new TestLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    System.out.println(new TestParser(tokens).mainRule());
  }
}

mainRule returns [List<String> words]
@init{$words = new ArrayList<String>();}
  :  w=atom {$words.add($w.text);} (',' w=atom {$words.add($w.text);} )*
  ;


atom: '0' | '1';

WS
  :  ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; }
  ;


At the end of your mainRule, you should add an EOF token. Otherwise the parser stops as soon as the next token cannot continue the rule, and it does not complain about the input that is left over.

Also, the atom rule should really be a lexer rule instead of a parser rule (lexer rule names start with a capital letter).

Try this instead:

grammar Test;

@parser::members {
  public static void main(String[] args) throws Exception {
    String text = "0,1  ,  1  , 0,1";
    ANTLRStringStream in = new ANTLRStringStream(text);
    TestLexer lexer = new TestLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    System.out.println(new TestParser(tokens).mainRule());
  }
}

mainRule returns [List<String> words]
@init{$words = new ArrayList<String>();}
  :  w=Atom {$words.add($w.text);} (',' w=Atom {$words.add($w.text);} )* EOF
  ;

Atom
  :  '0' | '1'
  ;

WS
  :  ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; }
  ;

EDIT

To clarify: as you already found out, EOF is not mandatory. It only forces the parser to go through the entire input. With the grammar above, a NoViableAltException is thrown when the lexer stumbles upon a character that is not handled by any of your lexer rules. Since you define three tokens in your grammar (0, 1 and ,) and your input, "00", does not contain any other characters, no NoViableAltException is thrown. If you change your input to something like "0?0", then a NoViableAltException will pop up.

Your parser matches the first 0, does not find a , after it, and simply stops parsing, since you did not "tell" it to parse all the way to the end of the input.
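
To make the stopping behaviour concrete, here is a small hand-written Java sketch that mirrors the control flow of mainRule. It is not the code ANTLR generates, and it does not use the ANTLR runtime; the class name EofDemo, the requireEof flag and the use of IllegalStateException are all made up for illustration. It only shows why "00" is accepted when the rule does not end in EOF and rejected when it does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parser ANTLR would generate from mainRule.
// It mirrors the rule's control flow only; it is NOT ANTLR's generated code.
public class EofDemo {

    // Parses: atom (',' atom)*  and, if requireEof is true, a trailing EOF.
    static List<String> parse(String input, boolean requireEof) {
        List<String> words = new ArrayList<>();
        int[] pos = {0};
        words.add(atom(input, pos));
        // The (',' atom)* loop: keep going only while the next token is ','.
        while (peek(input, pos) == ',') {
            pos[0]++; // consume ','
            words.add(atom(input, pos));
        }
        // Without this check the method returns as soon as the loop exits
        // and never looks at whatever input is left.
        if (requireEof && peek(input, pos) != '\0') {
            throw new IllegalStateException(
                "expected EOF at index " + pos[0] + " but found '" + peek(input, pos) + "'");
        }
        return words;
    }

    // Skips whitespace (like the WS rule), then returns the next character
    // without consuming it, or '\0' at the end of the input.
    static char peek(String input, int[] pos) {
        while (pos[0] < input.length() && Character.isWhitespace(input.charAt(pos[0]))) {
            pos[0]++;
        }
        return pos[0] < input.length() ? input.charAt(pos[0]) : '\0';
    }

    // Matches a single '0' or '1' and returns its text.
    static String atom(String input, int[] pos) {
        char c = peek(input, pos);
        if (c != '0' && c != '1') {
            throw new IllegalStateException("expected '0' or '1' at index " + pos[0]);
        }
        pos[0]++;
        return String.valueOf(c);
    }

    public static void main(String[] args) {
        System.out.println(parse("00", false));              // prints [0]
        System.out.println(parse("0,1  ,  1  , 0,1", true)); // prints [0, 1, 1, 0, 1]
        try {
            parse("00", true);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With requireEof set to false the second 0 is never examined, which matches the [0] output in the question. With it set to true the leftover 0 is noticed and reported.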

Hope that clarifies things. If not, let me know.
