I need a regular expression to find a string that's not commented out

Developer https://www.devze.com 2023-01-18 09:05 Source: Internet
My regular expression is currently:

includes.push\("([^\"\"]*\.js)"\)

but it matches all of the following lines:

/*includes.push("javascriptfile.js")*/
/*
includes.push("javascriptfile.js")
*/
includes.push("javascriptfile.js");
includes.push("javascriptfile.js")

And I don't want it to match the lines within comments.
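
To reproduce the problem, here is a minimal sketch that runs the regex above over the four cases. The `sample`, `pattern` and `found` names are illustrative and not part of the question; the regex itself is copied unchanged, including the unescaped dot after `includes`.

```javascript
// Hypothetical reproduction; variable names are illustrative only.
var sample = [
  '/*includes.push("javascriptfile.js")*/',
  '/*',
  'includes.push("javascriptfile.js")',
  '*/',
  'includes.push("javascriptfile.js");',
  'includes.push("javascriptfile.js")'
].join("\n");

// The regex from the question, written as a global regex literal.
var pattern = /includes.push\("([^\"\"]*\.js)"\)/g;

// match() with the g flag returns every matching substring, or null.
var found = sample.match(pattern) || [];
console.log(found.length); // the commented-out calls are counted as well
```

Every occurrence is reported, whether or not it sits inside a comment, which is the behaviour the question is trying to avoid.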

Any regex experts out there got any ideas?

Thanks :o)

Edit: I have tested a regex slightly adapted from madgnome's answer. It picks up the multiline ones in my test. Can you see any problems with it?

includes\.push\("([^\"\"]*\.js)"\)(?!\n*\*/)

new test is:

/*includes.push("javascriptfile.js")*/
/*
includes.push("javascriptfile.js")
*/
includes.push("javascriptfile.js");
includes.push("javascriptfile.js");
/*includes.push("javascriptfile.js")*/
/*
includes.push("javascriptfile.js")
*/

The new test also includes comments underneath the initial includes lines.


Depending on your language, you could use negative lookbehind and negative lookahead:

(?<!/\*)includes\.push\("([^\"\"]*\.js)"\)(?!\*/)
  • (?<!/\*) asserts that the regex /\* cannot be matched immediately before the current position
  • (?!\*/) asserts that the regex \*/ cannot be matched immediately after the current position

This regex won't work for multiline comments like your second example, because the comment delimiters are not directly adjacent to the call. You should trim the input before use.

Edit: You are using JavaScript, and negative lookbehind is not available in JavaScript engines that predate ES2018, so you could use only the negative lookahead, like this:

includes\.push\("([^\"\"]*\.js)"\)(?![\r\n\s]*\*/)

(This regex works for multiline comments like your second example, but it won't work with malformed comments, i.e. a */ without a matching /*.)
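
As a rough comparison of the two patterns above, the following sketch runs both over the four cases from the question. It assumes an engine with ES2018 lookbehind support (for example a current Node.js or Chrome). The `sample` input, the file names `a.js` to `d.js`, and the `fileNames` helper are made up for illustration, and `[^\"\"]` is written as the equivalent `[^"]`.

```javascript
// Illustrative test input; names below are not from the original answer.
var sample = [
  '/*includes.push("a.js")*/',   // commented out on a single line
  '/*',
  'includes.push("b.js")',       // inside a multiline comment
  '*/',
  'includes.push("c.js");',      // live code, with semicolon
  'includes.push("d.js")'        // live code, no semicolon
].join("\n");

// Lookbehind + lookahead variant (needs an ES2018-capable engine).
var withLookbehind = /(?<!\/\*)includes\.push\("([^"]*\.js)"\)(?!\*\/)/g;
// Lookahead-only variant.
var lookaheadOnly = /includes\.push\("([^"]*\.js)"\)(?![\r\n\s]*\*\/)/g;

// Collect the captured file names for a given global regex.
function fileNames(regex, input) {
  var names = [];
  var m;
  regex.lastIndex = 0; // restart the scan in case the regex was used before
  while ((m = regex.exec(input)) !== null) {
    names.push(m[1]);
  }
  return names;
}

var namesA = fileNames(withLookbehind, sample);
var namesB = fileNames(lookaheadOnly, sample);
console.log(namesA);
console.log(namesB);
```

On this input the first pattern still reports `b.js`, while the second reports only `c.js` and `d.js`. Note that the lookahead only inspects what directly follows the closing parenthesis, so a commented-out call that ends in a semicolon before the `*/` would still be matched.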


You could instead match either a comment (single- or multi-line) or a string literal, and then inspect the resulting array of matches:

// Sample input: comments, a string containing comment markers, and real includes
var text =
    "// \"foo\" \n" +
    "var s = \"no /* comment */ in here \"; \n" +
    "/*includes.push(\"javascriptfileA.js\")*/\n" +
    "/*\n" +
    "includes.push(\"javascriptfileB.js\")\n" +
    "*/\n" +
    "includes.push(\"javascriptfileC.js\");\n" +
    "includes.push(\"javascriptfileD.js\")\n";

console.log("--------------------------------------\ntext:\n");

// Each hit is either a whole comment or a whole string literal
var hits = text.match(/\/\/[^\r\n]*|\/\*[\s\S]*?\*\/|"(?:\\.|[^\\"])*"/g);

console.log(text);

console.log("--------------------------------------\nhits:\n");

// Keep only the string literals; hits that start with "/" are comments
for (var i = 0; i < hits.length; i++) {
  var hit = hits[i];
  if (hit.indexOf("\"") === 0) {
    console.log(hit);
  }
}

produces:

--------------------------------------
text:

// "foo" 
var s = "no /* comment */ in here "; 
/*includes.push("javascriptfileA.js")*/
/*
includes.push("javascriptfileB.js")
*/
includes.push("javascriptfileC.js");
includes.push("javascriptfileD.js")

--------------------------------------
hits:

"no /* comment */ in here "
"javascriptfileC.js"
"javascriptfileD.js"

A short explanation of the regex:

//[^\r\n]*         # match a single line comment
|                  # OR
/\*[\s\S]*?\*/     # match a multi-line comment
|                  # OR
"(?:\\.|[^\\"])*"  # match a string literal

Tested online on IDEone.
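
The same idea can be taken one step further so that only the uncommented `includes.push(...)` file names are returned, rather than every string literal. The sketch below is one possible way to do that, not part of the original answer: the `findIncludes` helper and its variable names are hypothetical. It adds the `includes.push` pattern as an extra alternative with a capture group and keeps only the matches where that group is set.

```javascript
// Hypothetical helper, not part of the original answer.
function findIncludes(source) {
  // Alternatives, tried in this order at each position:
  //   1. single-line comment
  //   2. multi-line comment
  //   3. includes.push("...js") with the file name captured in group 1
  //   4. any other string literal
  var token = /\/\/[^\r\n]*|\/\*[\s\S]*?\*\/|includes\.push\("([^"]*\.js)"\)|"(?:\\.|[^\\"])*"/g;
  var files = [];
  var m;
  while ((m = token.exec(source)) !== null) {
    // Group 1 is only set when the includes.push alternative matched.
    if (m[1] !== undefined) {
      files.push(m[1]);
    }
  }
  return files;
}

var source =
    "// \"foo\" \n" +
    "var s = \"no /* comment */ in here \"; \n" +
    "/*includes.push(\"javascriptfileA.js\")*/\n" +
    "/*\n" +
    "includes.push(\"javascriptfileB.js\")\n" +
    "*/\n" +
    "includes.push(\"javascriptfileC.js\");\n" +
    "includes.push(\"javascriptfileD.js\")\n";

var result = findIncludes(source);
console.log(result);
```

Because comments and other string literals are consumed as whole tokens, a call that appears inside them is never seen by the `includes.push` alternative. This is still only a tokenizing approximation; source that contains regex literals with quotes or slashes in them, for instance, could confuse it.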
