I used the following to get it to work partially:
%{
#define OR 2
#define AND 3
.........
.........
%}
delim [ \t]
ws {delim}*
letter [A-Za-z]
digit [0-9]
comments [/]+({letter}|{digit}|{delim})*
%%
{comments} {return(COMMENT);}
......................
......................
%%
int main()
{
    int tkn = 0;
    /* yylex() returns 0 at end of input */
    while ((tkn = yylex()) != 0)
    {
        switch (tkn)
        {
        case COMMENT:
            printf("GOT COMMENT\n");
            break;
        }
    }
    return 0;
}
This works fine. The problem is that the regex does not recognize special characters, because [/]+({letter}|{digit}|{delim})* only allows letters, digits, spaces and tabs after the slashes. How can I change the regex to accept any characters up to the end of the line?
Couldn't you just use
[/]+.*
It will match one or more / characters and then anything up to the end of the line. Of course, this will not cover comments like /* COMMENT */.
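To try the pattern outside of flex, here is a minimal sketch in C using POSIX regcomp/regexec. The helper name find_slash_comment is made up for illustration, and REG_NEWLINE is used on the assumption that '.' should stop at a newline the way it does in flex.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: looks for the first run of slashes plus the rest of
 * that line. Returns 1 and fills *start and *len on a match, 0 otherwise. */
static int find_slash_comment(const char *text, size_t *start, size_t *len)
{
    regex_t re;
    regmatch_t m;
    int found;

    assert(text != NULL && start != NULL && len != NULL);

    /* REG_NEWLINE keeps '.' from matching '\n', so the match ends at the
     * end of the line, as it would in flex. */
    if (regcomp(&re, "[/]+.*", REG_EXTENDED | REG_NEWLINE) != 0)
        return 0;

    found = (regexec(&re, text, 1, &m, 0) == 0);
    if (found) {
        *start = (size_t)m.rm_so;
        *len = (size_t)(m.rm_eo - m.rm_so);
    }
    regfree(&re);
    return found;
}
```

Calling it on "x = 1; // set x!@#" followed by another line reports the comment including the special characters. Note that [/]+ also accepts a single slash, so "a / b" is reported as a match starting at the division sign; requiring two slashes avoids that.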
Maybe it's late, but I find this more appropriate: \/[\/]+.*
This covers two or more slashes followed by the rest of the line.
Following is the explanation from regex101.com:
\/
matches the character / literally (case sensitive)
[\/]+
matches a single character present in the list
+ Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\/ matches the character / literally (case sensitive)
.*
matches any character (except for line terminators)
A single-line comment expression starting with '//' can be captured by the following regular expression.
\/\/[^\r\n]*
\/\/
matches the double-slash
[^\r\n]*
matches as many characters that are not carriage-return or line-feed as it can find.
However, the C language allows a single line comment to be extended to the next line when the last character in the line is a backslash (\). Therefore, you may want to use the following.
\/\/[^\r\n]*(?:(?<=\\)\r?\n[^\r\n]*)*
\/\/
matches the double-slash
[^\r\n]*
matches as many characters that are not carriage-return (\r) or line-feed (\n) as it can find
(?:
start a non-capturing group
(?<=\\)
assert that a backslash (\) immediately precedes the current position
\r?\n
match the end of a line
[^\r\n]*
matches as many characters that are not carriage-return (\r) or line-feed (\n) as it can find
)*
complete the non-capturing group and let it repeat 0 or more times
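The regex above relies on a lookbehind, which POSIX regular expressions and flex patterns do not offer, so the same rule can be written by hand. This is only a sketch: line_comment_length is a hypothetical helper, and it assumes the caller has already located the opening "//".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: p points at the "//" that opens a comment.
 * Returns the comment length, following backslash-newline continuations. */
static size_t line_comment_length(const char *p)
{
    size_t i = 2; /* skip the opening "//" */

    assert(p != NULL && p[0] == '/' && p[1] == '/');

    for (;;) {
        /* consume everything up to the end of the current line */
        while (p[i] != '\0' && p[i] != '\r' && p[i] != '\n')
            i++;

        /* stop at end of input, or if the line did not end in a backslash */
        if (p[i] == '\0' || p[i - 1] != '\\')
            break;

        if (p[i] == '\r' && p[i + 1] == '\n')
            i += 2; /* CRLF line break */
        else if (p[i] == '\n')
            i += 1; /* LF line break */
        else
            break;  /* lone CR: not treated as a line break, like \r?\n */
    }
    return i;
}
```

For the input "// first \" followed by a newline and "second", the returned length covers both lines, while a comment without a trailing backslash ends at its own line.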
Note that this method has problems. Depending on what you are doing, you may want to find and use a lexical scanner. A lexical scanner can avoid the following problems.
Scanning the text
/* Comment appears to have // a comment inside it */
will match
// a comment inside it */
Scanning the text
char* a = "string appears to have // a comment";
will match
// a comment";
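To illustrate what a scanner does differently in the two cases above, here is a small hand-written sketch in C that skips string literals, character literals and block comments before it accepts a "//". The function name find_line_comment is hypothetical, and the sketch ignores things a real C lexer must handle (trigraphs, line splices inside tokens, and so on).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first "//" that really
 * starts a comment, or -1 if there is none. */
static long find_line_comment(const char *s)
{
    size_t i = 0;

    assert(s != NULL);

    while (s[i] != '\0') {
        if (s[i] == '"' || s[i] == '\'') {
            /* skip a string or character literal, honouring escapes */
            char quote = s[i++];
            while (s[i] != '\0' && s[i] != quote) {
                if (s[i] == '\\' && s[i + 1] != '\0')
                    i++; /* skip the escaped character too */
                i++;
            }
            if (s[i] == quote)
                i++;
        } else if (s[i] == '/' && s[i + 1] == '*') {
            /* skip a block comment up to and including its closing marker */
            i += 2;
            while (s[i] != '\0' && !(s[i] == '*' && s[i + 1] == '/'))
                i++;
            if (s[i] != '\0')
                i += 2;
        } else if (s[i] == '/' && s[i + 1] == '/') {
            return (long)i; /* a genuine line comment starts here */
        } else {
            i++;
        }
    }
    return -1;
}
```

With this approach both example texts above yield no match, while "int x; // real" is reported at the position of its double slash.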
Why can't you just write
"//"|"/*" {return(COMMENT);}
?
The following regular expression works just fine for me.
\/\/.*