How do I write a non-greedy match in LEX / FLEX?

Developer https://www.devze.com 2023-01-25 07:38 Source: web

I'm trying to parse a legacy language (which is similar to 'C') using FLEX and BISON. Everything is working nicely except for matching strings.

This rather odd legacy language doesn't support escape sequences in string literals, so a backslash is just an ordinary character and the following are all valid string literals:

"hello"
""
"\"

I'm using the following rule to match string literals:

\".*\"            { yylval.strval = _strdup( yytext ); return LIT_STRING; }

Unfortunately this is a greedy match, so it matches code like the following:

"hello", "world"

As a single string (hello", "world).

The usual non-greedy quantifier .*? doesn't seem to work in FLEX. Any ideas?


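Flex has no non-greedy quantifiers, so just prohibit quotes between the opening and closing quotes by using a negated character class. Note that in flex [^"] also matches newlines.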

\"[^"]*\"
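
To see why the negated character class behaves like a non-greedy match, here is a minimal C sketch of what the rule does by hand. The function name match_simple_string is hypothetical and not part of flex; it only mirrors the pattern \"[^"]*\" under the assumption that the input is a NUL-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the pattern \"[^"]*\" by hand.
 * Returns the length of the string literal starting at s (both quotes
 * included), or 0 if s does not start with a complete literal. */
size_t match_simple_string(const char *s)
{
    size_t i = 1;

    assert(s != NULL);

    if (s[0] != '"')
        return 0;               /* must start with an opening quote */

    /* [^"]* : consume anything that is not a quote (newlines included) */
    while (s[i] != '\0' && s[i] != '"')
        i++;

    if (s[i] != '"')
        return 0;               /* ran out of input: no closing quote */

    return i + 1;               /* include the closing quote */
}
```

Because the loop cannot step over a quote, the match always ends at the first closing quote. On the input "hello", "world" the sketch reports a length of 7, which is just "hello"; calling it again at the next quote picks up "world".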


Backslash escaped quotes

If string literals may contain backslash-escaped quotes, the following rule also allows them:

\"(\\.|[^\n"\\])*\" {
        fprintf( yyout, "STRING: %s\n", yytext );
    }

and disallows newlines inside string constants.

E.g.:

>>> "a\"b""c\d"""
STRING: "a\"b"
STRING: "c\d"
STRING: ""

and fails on the following, because the backslash escapes the closing quote and the string is never terminated:

>>> "\"
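
The behavior of the escaped-quote rule can be checked with a small hand-written matcher. This is only a sketch: match_escaped_string is a hypothetical name, not something flex generates, and it assumes a NUL-terminated input buffer. It follows the two alternatives of \"(\\.|[^\n"\\])*\" directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring \"(\\.|[^\n"\\])*\" by hand.
 * Returns the length of the literal starting at s (both quotes
 * included), or 0 if there is no complete literal at s. */
size_t match_escaped_string(const char *s)
{
    size_t i = 1;

    assert(s != NULL);

    if (s[0] != '"')
        return 0;                       /* must start with a quote */

    for (;;) {
        char c = s[i];

        if (c == '"')
            return i + 1;               /* unescaped closing quote */
        if (c == '\0' || c == '\n')
            return 0;                   /* unterminated, or raw newline */
        if (c == '\\') {
            /* \\. : a backslash followed by any character but newline */
            if (s[i + 1] == '\0' || s[i + 1] == '\n')
                return 0;
            i += 2;
        } else {
            i++;                        /* [^\n"\\] : ordinary character */
        }
    }
}
```

Applied repeatedly to the input "a\"b""c\d""" it yields lengths 6, 5 and 2, matching the three STRING lines above, and it returns 0 for "\" because the backslash consumes the only remaining quote.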

When implementing such C-like features, make sure to look for existing Lex implementations, e.g.: http://www.lysator.liu.se/c/ANSI-C-grammar-l.html
