
ANTLR Parser Rules With String Literals

开发者 https://www.devze.com 2023-04-06 12:20 Source: the web

Say my parser rules look like this:

rule1 : 'functionA' '(' expression-inside-parenthesis ')'; 
expression-inside-parenthesis: ....;

But I never defined any lexer rule for 'functionA', '(' and ')'. Would these be considered tokens by the parser? For '(' and ')', there is only 1 character anyway and I suppose there would be no difference. But for 'functionA', if I never defined it as a token in my lexer rules, how could the parser see it as a token?


JavaMan wrote:

how could the parser see it as a token?

ANTLR creates a token for you behind the scenes.

The rule:

rule1 : 'functionA' '(' expression-inside-parenthesis ')';
// parser rules ...
// lexer rules  ...

is equivalent to:

rule1 : FA '(' expression-inside-parenthesis ')';
// parser rules ...
FA : 'functionA';
// lexer rules  ...

For tokens that consist of only 1 character and do not occur within other tokens, like '(' and ')', it is okay to define them "on the fly" inside your parser rule. But as soon as your lexer grammar also contains identifier-like tokens, it's best to explicitly define a token like 'functionA' yourself inside the lexer grammar. Defining such tokens explicitly makes it clearer in what order the lexer tries to tokenize your input.
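To see why rule order matters once identifier-like tokens are involved, here is a toy Python model of a lexer. It is only a sketch and not ANTLR's actual implementation: it assumes ANTLR 4-style behavior (the longest match wins, and on a tie the rule listed first wins). The `tokenize` helper, the rule lists, and the `LPAREN`/`RPAREN` names are made up for illustration, and `ID` is widened to `[A-Za-z]+` here so that it overlaps with 'functionA'.

```python
import re

def tokenize(text, rules):
    """Toy lexer: at each position take the longest match; on a tie the earlier rule wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater: a later rule must match MORE text to take over.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Hypothetical rule lists: same rules, different order.
KEYWORD_FIRST = [("FA", r"functionA"), ("ID", r"[A-Za-z]+"),
                 ("LPAREN", r"\("), ("RPAREN", r"\)")]
ID_FIRST = [("ID", r"[A-Za-z]+"), ("FA", r"functionA"),
            ("LPAREN", r"\("), ("RPAREN", r"\)")]

print(tokenize("functionA(x)", KEYWORD_FIRST))  # first token is FA
print(tokenize("functionA(x)", ID_FIRST))       # first token is ID
print(tokenize("functionAB", KEYWORD_FIRST))    # longer ID match beats FA
```

With FA listed before ID the keyword is recognized; with the order swapped the same text comes out as an ID, which is the kind of surprise that explicit, deliberately ordered lexer rules help you avoid.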

EDIT

And in case you've used a literal token and also defined a lexer rule that matches the same text, like this:

parse : 'functionA' ID;
FA    : 'functionA';
ID    : 'a'..'z'+;

then ANTLR interprets the parse rule as this:

parse : FA ID;
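The substitution described above can be modeled roughly in Python. This is a sketch under assumptions, not ANTLR's code: `resolve_literals` is a hypothetical helper, it only recognizes lexer rules whose whole body is a single quoted literal, and the `T__N` names for literals without a matching lexer rule are illustrative (the names ANTLR actually generates depend on the version).

```python
import re

def resolve_literals(parser_rule, lexer_rules):
    """Replace each quoted literal in a parser rule with a token name.

    lexer_rules maps token name -> rule body, e.g. {"FA": "'functionA'"}.
    Returns the rewritten rule and a dict of implicitly created tokens.
    """
    # Map literal text -> token name for rules that are exactly one literal.
    by_literal = {}
    for name, body in lexer_rules.items():
        m = re.fullmatch(r"'([^']*)'", body.strip())
        if m:
            by_literal.setdefault(m.group(1), name)

    implicit = {}  # literal text -> made-up token name

    def replace(match):
        literal = match.group(1)
        if literal in by_literal:
            return by_literal[literal]  # reuse the explicitly defined token
        # No explicit rule: invent a token name (illustrative naming only).
        return implicit.setdefault(literal, f"T__{len(implicit)}")

    return re.sub(r"'([^']*)'", replace, parser_rule), implicit

lexer = {"FA": "'functionA'", "ID": "'a'..'z'+"}
print(resolve_literals("parse : 'functionA' ID;", lexer))
print(resolve_literals("rule1 : 'functionA' '(' expr ')';", {}))
```

The first call reuses FA because a lexer rule for that exact literal exists; the second call has no lexer rules at all, so every literal gets a token created behind the scenes.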