
How to REGEX // in C? Single line comments

Developer https://www.devze.com 2023-02-22 21:43 Source: Internet

I used the following to get it to work partially:

        %{
        #define OR 2

        #define AND 3
        .........
        .........
        %}

        delim     [ \t]
        ws        {delim}*
        letter    [A-Za-z]
        digit     [0-9]
        comments  [/]+({letter}|{digit}|{delim})*

        %%

        {comments} {return(COMMENT);}
        ......................
        ......................
        %%
        int main()
        {
            int tkn = 0;
            /* yylex() returns 0 at end of input, which ends the loop */
            while ((tkn = yylex()) != 0)
            {
                switch (tkn)
                {
                case COMMENT:
                    printf("GOT COMMENT");
                    break;
                }
            }
            return 0;
        }

This works for simple comments. The problem is that the regex does not recognize special characters, because [/]+({letter}|{digit}|{delim})* only allows letters, digits, spaces and tabs after the slashes. How can I change the regex to accept any characters up to the end of the line?


Couldn't you just use

[/]+.*

It will match one or more / characters and then everything up to the end of the line. Of course, this will not cover block comments like /* COMMENT */.


This may be late, but I find it more appropriate to use \/[\/]+.* which matches two or more slashes and then the rest of the line.

The following is the explanation from regex101.com

\/

matches the character / literally (case sensitive)

[\/]+

matches a single character present in the list, between one and unlimited times, as many times as possible, giving back as needed (greedy); \/ matches the character / literally (case sensitive)

.*

matches any character (except for line terminators), between zero and unlimited times


A single-line comment expression starting with '//' can be captured by the following regular expression.

\/\/[^\r\n]*

\/\/ matches the double-slash
[^\r\n]* matches as many characters that are not carriage-return or line-feed as it can find.
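
As a rough illustration outside of lex, the same pattern can be tried with the POSIX regex API in <regex.h>. This is only a sketch: the function name find_line_comment and its offset-based interface are made up for this example, and it assumes a POSIX system. Note that a slash needs no escaping in POSIX syntax, and that \r and \n are turned into real characters by the C compiler before regcomp sees them.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the original post): locate the first
 * "//" comment in `text` using the POSIX regex API.
 * On success, stores the byte offsets of the match in *start and *end
 * (end is one past the last matched byte) and returns 1.
 * Returns 0 if nothing matches or the pattern fails to compile.
 */
int find_line_comment(const char *text, size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m;
    int found = 0;

    assert(text != NULL && start != NULL && end != NULL);

    /* The bracket expression excludes real CR and LF characters, so the
     * match stops at the end of the line even with special characters. */
    if (regcomp(&re, "//[^\r\n]*", REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, text, 1, &m, 0) == 0) {
        *start = (size_t)m.rm_so;
        *end = (size_t)m.rm_eo;
        found = 1;
    }
    regfree(&re);
    return found;
}
```

Called on the text "x = 1; // note: #$%!" followed by a newline and more code, it reports the span from the first slash up to, but not including, the newline.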

However, the C language allows a single line comment to be extended to the next line when the last character in the line is a backslash (\). Therefore, you may want to use the following.

\/\/[^\r\n]*(?:(?<=\\)\r?\n[^\r\n]*)*

\/\/ matches the double-slash
[^\r\n]* matches as many characters that are not carriage-return (\r) or line-feed (\n) as it can find
(?: start a non-capturing group
(?<=\\) assert that a backslash (\) immediately precedes the current position
\r?\n match the end of a line
[^\r\n]* matches as many characters that are not carriage-return (\r) or line-feed (\n) as it can find
)* complete the non-capturing group and let it repeat 0 or more times
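
The lookbehind (?<=\\) is not available in lex/flex patterns or in POSIX regular expressions, so in C the same rule may be easier to write by hand. The sketch below mirrors the regex above; line_comment_end is a hypothetical name, and the function assumes the caller has already found the two slashes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given that text[pos] and text[pos + 1] are the two
 * slashes of a "//" comment, return the index one past the comment's last
 * character. A backslash directly before "\n" or "\r\n" continues the
 * comment onto the next line, as in the regex above.
 */
size_t line_comment_end(const char *text, size_t pos)
{
    size_t i;

    assert(text != NULL && text[pos] == '/' && text[pos + 1] == '/');

    i = pos + 2;
    while (text[i] != '\0') {
        if (text[i] == '\n' || text[i] == '\r') {
            /* Width of the line terminator: 1 for LF, 2 for CR LF,
             * 0 for a lone CR (which the regex does not continue past). */
            size_t eol = 0;
            if (text[i] == '\n')
                eol = 1;
            else if (text[i + 1] == '\n')
                eol = 2;

            /* Stop unless a backslash sits right before the terminator. */
            if (eol == 0 || text[i - 1] != '\\')
                break;
            i += eol;
        } else {
            i++;
        }
    }
    return i;
}
```

For a comment whose first line ends in a backslash, the returned index lies at the end of the following line instead of the first one.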

Note that this method has problems. Depending on what you are doing, you may want to find and use a lexical scanner. A lexical scanner can avoid the following problems.

  1. Scanning the text

    /* Comment appears to have // a comment inside it */

    will match

    // a comment inside it */

  2. Scanning the text

    char* a = "string appears to have // a comment";

    will match

    // a comment";
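
To show how a scanner can avoid both problems, here is a minimal hand-written sketch in C. The name find_real_line_comment is invented for this example. It is a simplification: it assumes ordinary string and character literals with backslash escapes, and it ignores trigraphs and backslash-newline splices.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the offset of the first genuine "//" comment
 * in `src`, or -1 if there is none. Slashes inside string literals,
 * character literals and block comments are skipped.
 */
long find_real_line_comment(const char *src)
{
    size_t i = 0;

    assert(src != NULL);

    while (src[i] != '\0') {
        if (src[i] == '"' || src[i] == '\'') {
            /* Skip a string or character literal, honoring escapes. */
            char quote = src[i++];
            while (src[i] != '\0' && src[i] != quote) {
                if (src[i] == '\\' && src[i + 1] != '\0')
                    i++;        /* also skip the escaped character */
                i++;
            }
            if (src[i] == quote)
                i++;
        } else if (src[i] == '/' && src[i + 1] == '*') {
            /* Skip a block comment; "//" inside it is plain text. */
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != '\0')
                i += 2;
        } else if (src[i] == '/' && src[i + 1] == '/') {
            return (long)i;     /* a real single-line comment starts here */
        } else {
            i++;
        }
    }
    return -1;
}
```

With this approach, both example texts above yield -1, while a line such as int x = 1; // real yields the offset of its first slash.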


Why can't you just write

"//"|"/*"    {return(COMMENT);}

?


The following regular expression works just fine for me.

\/\/.*