What trick does Java use to avoid spaces in >>?

Developer https://www.devze.com 2022-12-29 17:02 Source: web

The Java Generics book, while contrasting C++ templates and Java generics, says:

(In C++, a problem arises because >> without the space denotes the right-shift operator. Java fixes the problem by a trick in the grammar.)

What is this trick?


The OpenJDK javac parser, JavacParser, massages the lexer tokens GTGTGTEQ (>>>=), GTGTEQ (>>=), GTEQ (>=), GTGTGT (>>>) and GTGT (>>) into the token with one fewer '>' character when parsing type arguments.

Here is a snippet of the magic from JavacParser#typeArguments():

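    switch (S.token()) {
    case GTGTGTEQ:          // ">>>=" becomes ">>="
        S.token(GTGTEQ);
        break;
    case GTGTEQ:            // ">>=" becomes ">="
        S.token(GTEQ);
        break;
    case GTEQ:              // ">=" becomes "="
        S.token(EQ);
        break;
    case GTGTGT:            // ">>>" becomes ">>"
        S.token(GTGT);
        break;
    case GTGT:              // ">>" becomes ">"
        S.token(GT);
        break;
    default:                // otherwise a plain ">" is required here
        accept(GT);
        break;
    }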

One can clearly see that it is indeed a trick, and it's in the grammar :)
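
To see the same idea outside javac, here is a minimal, self-contained sketch. It is not javac code: the class ShiftSplitDemo, its methods, and the string-based tokens are all hypothetical, chosen only to illustrate the technique. The lexer uses maximal munch, so it emits ">>" and ">>>" as single tokens, and the parser peels one '>' off such a token whenever it needs to close a type-argument list.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy parser; names are illustrative and not taken from javac.
class ShiftSplitDemo {
    // Maximal-munch lexer: runs of up to three '>' become one token.
    static Deque<String> lex(String src) {
        Deque<String> tokens = new ArrayDeque<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            int j = i;
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            } else if (Character.isJavaIdentifierStart(c)) {
                while (j < src.length() && Character.isJavaIdentifierPart(src.charAt(j))) j++;
            } else if (c == '>') {
                while (j < src.length() && j - i < 3 && src.charAt(j) == '>') j++;
            } else {
                j++; // any other character is a one-character token
            }
            tokens.add(src.substring(i, j));
            i = j;
        }
        return tokens;
    }

    // type := IDENT [ '<' type { ',' type } '>' ]
    // Returns the nesting depth of the type arguments (0 for a plain name).
    static int parseType(Deque<String> t) {
        String name = t.poll();
        if (name == null || !Character.isJavaIdentifierStart(name.charAt(0))) {
            throw new IllegalStateException("expected identifier, got " + name);
        }
        if (!"<".equals(t.peek())) return 0;
        t.poll(); // consume '<'
        int depth = parseType(t);
        while (",".equals(t.peek())) {
            t.poll(); // consume ','
            depth = Math.max(depth, parseType(t));
        }
        closeTypeArguments(t);
        return depth + 1;
    }

    // Consume exactly one '>' and push back whatever is left of the token.
    static void closeTypeArguments(Deque<String> t) {
        String tok = t.poll();
        if (tok == null) throw new IllegalStateException("expected '>'");
        switch (tok) {
            case ">>>": t.push(">>"); break;
            case ">>":  t.push(">");  break;
            case ">":   break;
            default: throw new IllegalStateException("expected '>', got " + tok);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseType(lex("Map<String, List<List<String>>>")));
    }
}
```

The lexer never needs to know it is inside a generic type; only the parser routine that closes a type-argument list knows about the split.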


This is actually being fixed in the next version of C++ (C++0x). There really isn't much of a trick: if you encounter >> while parsing a generic or template at a point where you expected >, then you already have enough information to generate an error message. And if you have enough information to generate an error message, you also have enough information to interpret >> as two separate tokens: > followed by >.


It's a simple parser/lexer hack. The lexical analyser normally recognises the pair >> as a single token. However, when in the middle of parsing a generic type, the parser tells the lexer not to recognise >>.
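
A minimal sketch of that variant, assuming an on-demand lexer with a mode flag that the parser can flip. The class ModalLexer, the field inTypeArguments, and the helper tokens are hypothetical names used for illustration; the helper simply sets the flag for the whole input, whereas a real parser would toggle it on entering and leaving a type-argument list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical on-demand lexer; class and field names are illustrative only.
class ModalLexer {
    private final String src;
    private int pos = 0;
    // Set by the parser while it is inside a type-argument list.
    boolean inTypeArguments = false;

    ModalLexer(String src) { this.src = src; }

    // Returns the next token, or null at end of input.
    String next() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        if (pos >= src.length()) return null;
        char c = src.charAt(pos);
        int start = pos;
        if (Character.isJavaIdentifierStart(c)) {
            while (pos < src.length() && Character.isJavaIdentifierPart(src.charAt(pos))) pos++;
        } else if (c == '>' && !inTypeArguments) {
            // Normal mode: maximal munch, so ">>" and ">>>" are single tokens.
            while (pos < src.length() && pos - start < 3 && src.charAt(pos) == '>') pos++;
        } else {
            pos++; // one-character token; in type-argument mode this includes '>'
        }
        return src.substring(start, pos);
    }

    // Stand-in for a parser: lexes the whole input in one fixed mode.
    static List<String> tokens(String src, boolean typeMode) {
        ModalLexer lexer = new ModalLexer(src);
        lexer.inTypeArguments = typeMode;
        List<String> out = new ArrayList<>();
        for (String t = lexer.next(); t != null; t = lexer.next()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a >> b", false));
        System.out.println(tokens("List<List<String>>", true));
    }
}
```

Compared with splitting a token after the fact, this design couples the lexer to parser state, which is the price of keeping the parser itself free of token rewriting.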

Historically, C++ didn't do this for the sake of implementation simplicity, but it can (and will) be fixed using the same trick.


It's not really a trick; they just defined the grammar such that a right-shift token is synonymous with two right angle brackets (thus allowing that token to close a template). You can still create ambiguities that have to be resolved with parentheses, but unambiguous sequences are parsed without developer intervention. This is also done in C++0x.
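
From the Java programmer's side, the effect is simply that both readings of >> coexist in the same source file. A small illustration (the class NestedGenericsDemo and its method demo are made up for this example):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing both meanings of ">>" side by side.
class NestedGenericsDemo {
    static int demo() {
        // Here ">>" closes two type-argument lists; no space is needed.
        List<List<Integer>> rows = new ArrayList<>();
        rows.add(new ArrayList<>());
        // Here ">>" and ">>>" are the shift operators.
        rows.get(0).add(64 >> 2);   // signed right shift: 16
        rows.get(0).add(-1 >>> 28); // unsigned right shift: 15
        return rows.get(0).get(0) + rows.get(0).get(1);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 31
    }
}
```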


The Java Language Specification, Third Edition, shows the full grammar. Both shift operators are listed in the InfixOp production, and there is no (obvious) trick. Which of >, >> or >>> is intended is decided by the scanner using a lookahead technique.
