Antlr backtrack option not working

Developer https://www.devze.com 2023-03-27 07:34 Source: web
I am not sure, but I think the ANTLR backtrack option is not working properly.

Here is my grammar:

grammar Test;
options {
  backtrack=true;
  memoize=true;
}

prog:   (code)+;

code
    :   ABC {System.out.println("ABC");}
    |   OTHER {System.out.println("OTHER");}
    ;

ABC : 'ABC';
OTHER : .;

If the input stream is "ABC" then I'll see ABC printed.

If the input stream is "ACD" then I'll see 3 times OTHER printed.

But if the input stream is "ABD" then I'll see:

line 1:2 mismatched character 'D' expecting 'C'
line 1:3 required (...)+ loop did not match anything at input ''

but I expect to see three times OTHER, since the input should match the second rule if the first rule fails.

That doesn't make any sense. Why didn't the parser backtrack when it saw that the last character was not 'C'? However, it was fine with "ACD".

Could someone please help me solve this issue??? Thanks for your time!!!


The option backtrack=true applies to parser rules only, not lexer rules.
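
To make the difference concrete, here is a minimal hand-written sketch in plain Java. It is not ANTLR-generated code and does not use the ANTLR runtime; the class and method names are hypothetical. tokenizeCommitted models the behaviour the error message in the question suggests (the assumption being that the lexer commits to ABC once it has seen "AB"), while tokenizeWithFallback models the tokenization the question expected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not generated by ANTLR and not part of its runtime.
final class LexerFallbackSketch {

    // Expected behaviour: emit ABC only when the input at the current position
    // really starts with "ABC"; otherwise emit a single character as OTHER.
    static List<String> tokenizeWithFallback(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (input.startsWith("ABC", pos)) {
                tokens.add("ABC");
                pos += 3;
            } else {
                tokens.add("OTHER(" + input.charAt(pos) + ")");
                pos += 1;
            }
        }
        return tokens;
    }

    // Assumed model of the failing behaviour: after seeing "AB" the lexer is
    // committed to ABC and reports an error if the third character is not 'C'.
    static List<String> tokenizeCommitted(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (input.startsWith("AB", pos)) {
                if (!input.startsWith("ABC", pos)) {
                    throw new IllegalStateException(
                        "mismatched character at index " + (pos + 2) + ", expecting 'C'");
                }
                tokens.add("ABC");
                pos += 3;
            } else {
                tokens.add("OTHER(" + input.charAt(pos) + ")");
                pos += 1;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenizeWithFallback("ABD"));
        System.out.println(tokenizeCommitted("ACD"));
    }
}
```

Both methods agree on "ABC" and "ACD"; they differ only on inputs such as "ABD", where the committed version throws instead of falling back to three OTHER tokens.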

EDIT

The only workaround I am aware of is to let "AB" followed by any character other than "C" be matched by the same ABC rule, and then manually emit the individual tokens.

A demo:

grammar Test;

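@lexer::members {
  // Queue of pending tokens, so that one lexer rule can produce several tokens.
  List<Token> tokens = new ArrayList<Token>();

  // Queue every token, including the ones the lexer emits by default;
  // otherwise nextToken() below would find an empty queue and return EOF.
  @Override
  public void emit(Token token) {
    state.token = token;
    tokens.add(token);
  }

  // Convenience helper for manually emitting a token with the given type and text.
  public void emit(int type, String text) {
    emit(new CommonToken(type, text));
  }

  @Override
  public Token nextToken() {
    super.nextToken();
    if(tokens.size() == 0) {
      return Token.EOF_TOKEN;
    }
    return tokens.remove(0);
  }
}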

prog
  :  code+
  ;

code
  :   ABC   {System.out.println("ABC");}
  |   OTHER {System.out.println("OTHER");}
  ;

ABC
  :  'ABC'
  |  'AB' t=~'C' 
     {
       emit(OTHER, "A"); 
       emit(OTHER, "B"); 
       emit(OTHER, String.valueOf((char)$t));
     }
  ;

OTHER 
  :  . 
  ;


Another solution, which might be simpler: I made use of "syntactic predicates". Note that with this grammar a lone 'A' is still emitted with token type ABC; only its text differs.

grammar ABC;

@lexer::header {package org.inanme.antlr;}
@parser::header {package org.inanme.antlr;}

prog: (code)+ EOF;
code: ABC {System.out.println($ABC.text);}
    | OTHER {System.out.println($OTHER.text);};

ABC : ('ABC') => 'ABC' | 'A';
OTHER : .;