
ANTLR grammar not handling my "not" operator correctly

Developer https://www.devze.com 2023-04-03 19:56 Source: web
I am trying to parse a small expression language (I didn't define the language; it comes from a vendor), and everything is fine until I try to use the not operator, which is a tilde in this language.

My grammar has been heavily influenced by these two links (aka shameless cut and pasting):

http://www.codeproject.com/KB/recipes/sota_expression_evaluator.aspx
http://www.alittlemadness.com/2006/06/05/antlr-by-example-part-1-the-language

The language consists of three expression types that can be combined with the and, or, and not operators; parentheses change precedence. The expressions are:

Skill("name") > some_number (can also be <, >=, <=,  =, !=)
SkillExists("name")
LoggedIn("name") (this one can also have name@name)

This input works fine:

Skill("somename") > 1 | (LoggedIn("somename") & SkillExists("othername"))

However, as soon as I try to use the not operator I get a NoViableAltException, and I can't figure out why. I have compared my grammar to the ECalc.g one at the codeproject.com link and they seem to match, so there must be some subtle difference I can't see. This input fails:

Skill("somename") < 10 ~ SkillExists("othername")

My Grammar:

grammar UserAttribute;

options {
output=AST;
ASTLabelType=CommonTree;
}

tokens {
SKILL = 'Skill' ;
SKILL_EXISTS = 'SkillExists' ;
LOGGED_IN = 'LoggedIn';
GT = '>';
LT = '<';
LTE = '<=';
GTE = '>=';
EQUALS = '=';
NOT_EQUALS = '!=';  
AND = '&';
OR = '|' ;
NOT = '~';
LPAREN   = '(';
RPAREN = ')';
QUOTE = '"';
AT = '@';       
}

/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/  
expression : orexpression EOF!; 
orexpression    : andexpression (OR^ andexpression)*;
andexpression   : notexpression (AND^ notexpression)*;  
notexpression : primaryexpression | NOT^ primaryexpression;
primaryexpression : term | LPAREN! orexpression RPAREN!;
term    : skill_exists | skill | logged_in;
skill_exists    : SKILL_EXISTS LPAREN QUOTE NAME QUOTE RPAREN;
logged_in : LOGGED_IN LPAREN QUOTE NAME (AT NAME)? QUOTE RPAREN;
skill:  SKILL LPAREN QUOTE NAME QUOTE RPAREN ((GT | LT| LTE | GTE | EQUALS | NOT_EQUALS)? NUMBER*)?;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/
NAME    : ('a'..'z' | 'A'..'Z' | '_')+;
NUMBER  : ('0'..'9')+ ;
WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = HIDDEN; } ;


I have 2 remarks:

1

Since you're parsing a single expression (expression : orexpression EOF!;), the input Skill("somename") < 10 ~ SkillExists("othername") is invalid not only in your grammar but in any expression parser I know of. A notexpression takes only a right-hand-side operand, so ~ SkillExists("othername") is one complete expression and Skill("somename") < 10 is another. Between those two expressions there is no OR or AND operator. It is the same as trying to evaluate true false instead of true | false or true & false.

In short, your grammar disallows:

Skill("somename") < 10 ~ SkillExists("othername")

but allows for:

Skill("somename") < 10 & SkillExists("othername")

which seems logical to me.
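To make the point concrete, here is the unchanged notexpression rule from the question, annotated with a few inputs. The accept/reject notes are my reading of the grammar as posted, not the output of a test run, so treat them as a sketch:

```antlr
// Unchanged rule from the question: NOT is a prefix (unary) operator.
notexpression
  : primaryexpression
  | NOT^ primaryexpression
  ;

// Inputs the grammar as posted should accept:
//   ~ SkillExists("othername")
//   Skill("somename") < 10 & ~ SkillExists("othername")
//   Skill("somename") < 10 | ~ (LoggedIn("somename") & SkillExists("othername"))
//
// Input it rejects (no AND/OR between the two operands):
//   Skill("somename") < 10 ~ SkillExists("othername")
```

In other words, the tilde has to follow an & or | (or start the expression); it cannot join two operands by itself.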

2

I don't quite understand your skill rule (which is ambiguous, btw):

skill
 : SKILL LPAREN QUOTE NAME QUOTE RPAREN 
     ((GT | LT| LTE | GTE | EQUALS | NOT_EQUALS)? NUMBER*)?
 ;

This means that the operator is optional and that zero or more numbers can follow it, so the following inputs are all valid:

  • Skill("foo") = 10 20
  • Skill("foo") 10 20 30
  • Skill("foo") <

Perhaps you meant:

skill
 : SKILL LPAREN QUOTE NAME QUOTE RPAREN 
     ((GT | LT| LTE | GTE | EQUALS | NOT_EQUALS)^ NUMBER)?
 ;

instead? (the ? becomes a ^ and the * is removed)

If I only change that rule and parse the input:

Skill("somename") < 10 & SkillExists("othername")

the following AST is created:

[Image: the AST produced for the input above]

(as you can see, the AST needs to be better formed: i.e. you need some rewrite rules in your skill_exists, logged_in and skill rules)
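One possible way to add those rewrite rules, as a sketch only: drop the parentheses and quotes and root each call at its keyword token. The helper rule skill_call is a name I made up for this example; it is not part of the original grammar, and I have not run this exact fragment through ANTLR.

```antlr
skill_exists
  : SKILL_EXISTS LPAREN QUOTE NAME QUOTE RPAREN
    -> ^(SKILL_EXISTS NAME)
  ;

logged_in
  : LOGGED_IN LPAREN QUOTE NAME (AT NAME)? QUOTE RPAREN
    -> ^(LOGGED_IN NAME+)   // one NAME, or two when the name@name form is used
  ;

// skill_call is a hypothetical helper rule, introduced only for this sketch
skill_call
  : SKILL LPAREN QUOTE NAME QUOTE RPAREN
    -> ^(SKILL NAME)
  ;

// the comparison operator, when present, becomes the root of the subtree
skill
  : skill_call ((GT | LT | LTE | GTE | EQUALS | NOT_EQUALS)^ NUMBER)?
  ;
```

With rules along these lines, Skill("somename") < 10 should end up as a tree rooted at <, with the (Skill somename) subtree and the number 10 as its children. Splitting the call into its own rule keeps the rewrite (->) and the inline ^ operator in separate rules, since ANTLR v3 does not let you mix the two in one alternative.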


EDIT

If you want successive expressions to have an implied AND token between them, do something like this:

grammar UserAttribute;

...
tokens {
...
I_AND;     // <- added a token without any text (imaginary token)
AND = '&';
...
}

andexpression
  :  (notexpression -> notexpression) (AND? notexpression -> ^(I_AND $andexpression notexpression))*
  ;  

...

As you can see, since the AND token is now optional, it can no longer be used inside the rewrite rule (it may not have been matched at all), so you have to use the imaginary token I_AND as the root instead.

If you now parse the input:

Skill("somename") < 10 ~ SkillExists("othername")

you will get the following AST:

[Image: the AST produced for the input above]
