JavaCC action in token definition

devze.com (https://www.devze.com), 2023-03-09 13:34, source: the web
I was wondering whether it is possible to hook into JavaCC's lexer so that it calls a function to check whether a character is valid.

I'm asking because I'm trying to implement something like this:

TOKEN {
    <ID: id($char)>
}

where id() is:

//Check to see if the character is an ID character
boolean id(char currentCharacter) {
    int type = Character.getType(currentCharacter);

    return type == Character.LOWERCASE_LETTER || type == Character.MATH_SYMBOL;
}

Is this at all possible?


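No, you can't. JavaCC compiles the token definitions into a finite state machine when the lexer is generated, so a regular expression cannot call arbitrary Java code to decide whether a character matches.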

What you can do is implement a lexical action that validates the characters of the matched string and adds the result of that validation to the issued token (e.g. by setting the value of a custom field). But you cannot use the result of the validation to guide the lexer.
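The validation step of such a lexical action can be sketched in plain Java. In a JavaCC grammar, a block of Java code placed after a token's regular expression runs once that token has matched, and the matched token is available there as matchedToken. The SketchToken class below is only a stand-in for the generated Token class, and the validId field is a hypothetical custom field you would have to add yourself; neither name comes from JavaCC.

```java
class IdValidation {
    // The predicate from the question: lowercase letters and math symbols.
    static boolean id(char currentCharacter) {
        int type = Character.getType(currentCharacter);
        return type == Character.LOWERCASE_LETTER || type == Character.MATH_SYMBOL;
    }

    // Stand-in for the Token class that JavaCC generates (assumption:
    // only the image field is modelled; validId is a hypothetical addition).
    static class SketchToken {
        String image;
        boolean validId;

        SketchToken(String image) {
            this.image = image;
        }
    }

    // What the body of the lexical action would do: check every character
    // of the matched text and record the result on the token. The lexer has
    // already decided what to match; this only annotates the outcome.
    static void annotate(SketchToken token) {
        boolean ok = !token.image.isEmpty();
        for (int i = 0; ok && i < token.image.length(); i++) {
            ok = id(token.image.charAt(i));
        }
        token.validId = ok;
    }
}
```

Inside the grammar, the action attached to the ID token would then amount to a single call along the lines of annotate(matchedToken), and the parser can inspect the flag later.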

Instead, define the ID token by listing every allowed character or character range:

TOKEN : {
    < ID: [ "a"-"z", "α"-"ω", ... ] > // The enumeration is to be continued
}

Note: If you don't use Unicode escapes, don't forget to tell JavaCC the exact encoding of your grammar file.

This is tedious but it is how the lexer works.
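The list does not have to be typed by hand. As a rough sketch, a small Java program can walk the Basic Multilingual Plane, apply the id() predicate from the question, and print the matching ranges as Unicode escapes ready to paste into the grammar, which also sidesteps the file-encoding issue. The class and method names here are made up for illustration, characters outside the BMP are not covered, and the result depends on the Unicode tables of the JDK that runs it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

class CharClassGenerator {
    // The predicate from the question: lowercase letters and math symbols.
    static boolean id(char currentCharacter) {
        int type = Character.getType(currentCharacter);
        return type == Character.LOWERCASE_LETTER || type == Character.MATH_SYMBOL;
    }

    // Builds a bracketed character list from every BMP code unit the
    // predicate accepts, merging neighbouring characters into ranges.
    static String toCharList(IntPredicate accepts) {
        List<String> ranges = new ArrayList<>();
        for (int c = 0; c <= 0xFFFF; c++) {
            if (!accepts.test(c)) {
                continue;
            }
            int end = c;
            while (end < 0xFFFF && accepts.test(end + 1)) {
                end++;
            }
            ranges.add(format(c, end));
            c = end; // resume scanning after this range
        }
        return "[ " + String.join(", ", ranges) + " ]";
    }

    // Formats one range as quoted four-digit hexadecimal escapes.
    static String format(int from, int to) {
        String low = String.format("\"\\u%04x\"", from);
        if (from == to) {
            return low;
        }
        return low + "-" + String.format("\"\\u%04x\"", to);
    }
}
```

Calling CharClassGenerator.toCharList(c -> CharClassGenerator.id((char) c)) and printing the result gives the full list for the predicate in the question; the output is long, so it is best generated once and pasted between the angle brackets of the ID definition.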

An alternative is to accept any single character as an identifier, and validate it in the parser, or even later:

TOKEN : {
    < ID: ~[] >
}

I see no reason to do that, though.
