I want to parse some C source files and find all string literals (such as "foo").
Something like this works:
String line = "myfunc(\"foo foo foo\", \"bar\");";
System.out.println(line);
String patternStr = "\\\"([^\"]+)\\\"";
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher("");
String s;
if(line.matches(".*"+patternStr+".*"))
matcher.reset(line);
while(matcher.find()) {
System.out.println(" FOUND "+matcher.groupCount()+" groups");
System.out.println(matcher.group(1));
}
This works as long as the strings contain no escaped quotes, such as:
String line = "myfunc(\"foo \\\"foo\\\" foo\", \"bar\");";
I don't know how to write an expression in Java that means "no bare \", but allow \ followed by any character". I've found something similar for C here: http://wordaligned.org/articles/string-literals-and-regular-expressions
Thanks in advance.
What about strings inside comments:
/* foo "this is not a string" bar */
And what about a single double quote inside a comment:
/* " */ printf("text");
You don't want to capture " */ printf(" as a string.
In other words: if the above could occur in your C code, use a parser instead of regex.
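To illustrate what "use a parser" could mean at its simplest, here is a minimal sketch of a hand-written scanner that skips comments and character literals before collecting string literals. The class name `CStringScanner` and the method `scan` are hypothetical, and the sketch makes simplifying assumptions: it ignores line continuations and trigraphs, and it treats the file name in `#include "foo.h"` as a string literal.

```java
import java.util.ArrayList;
import java.util.List;

public class CStringScanner {
    /** Returns the contents of each string literal found outside comments. */
    static List<String> scan(String src) {
        List<String> out = new ArrayList<>();
        int i = 0, n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (src.startsWith("/*", i)) {
                // Block comment: jump past the closing */ (or to the end if unterminated).
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
            } else if (src.startsWith("//", i)) {
                // Line comment: jump past the end of the line.
                int end = src.indexOf('\n', i);
                i = (end < 0) ? n : end + 1;
            } else if (c == '"' || c == '\'') {
                // String or character literal: find the matching unescaped delimiter.
                int j = i + 1;
                while (j < n && src.charAt(j) != c) {
                    if (src.charAt(j) == '\\') j++; // skip the escaped character
                    j++;
                }
                // Only terminated string literals are collected; char literals are skipped.
                if (c == '"' && j < n) out.add(src.substring(i + 1, j));
                i = j + 1;
            } else {
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String src = "/* \" */ printf(\"text\"); // \"nope\"\n"
                   + "puts(\"a \\\"b\\\" c\");";
        for (String s : scan(src)) {
            System.out.println(s);
        }
        // prints:
        // text
        // a \"b\" c
    }
}
```

Unlike a single regex, the scanner always knows whether it is inside a comment, so the quote in `/* " */` never opens a string.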
Between double-quotes, you want to allow an escape sequence or any character other than a double-quote. You want to test them in that order to allow the longer alternative the opportunity to match.
Pattern pattern = Pattern.compile("\"((\\\\.|[^\"])+)\"");
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
System.out.println(" FOUND "+matcher.groupCount()+" groups");
System.out.println(matcher.group(1));
}
Output:
 FOUND 2 groups
foo \"foo\" foo
 FOUND 2 groups
bar
Try the following:
String patternStr = "\"(([^\"\\\\]|\\\\.)*)\"";
(All I did was convert the regexp from the article you mentioned, /"([^"\\]|\\.)*"/, to Java.)
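For reference, a small self-contained sketch of that pattern applied to the line from the question might look like the following. The class name `StringLiteralFinder` and the helper `findLiterals` are made up for illustration; the pattern string is the one given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringLiteralFinder {
    // Regex after Java unescaping: "(([^"\\]|\\.)*)"
    // Inside the quotes, each step consumes either a character that is neither
    // a quote nor a backslash, or a backslash followed by any one character.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(([^\"\\\\]|\\\\.)*)\"");

    /** Returns the contents (without the surrounding quotes) of each literal in the line. */
    static List<String> findLiterals(String line) {
        List<String> result = new ArrayList<>();
        Matcher matcher = STRING_LITERAL.matcher(line);
        while (matcher.find()) {
            result.add(matcher.group(1)); // group 1 is everything between the quotes
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "myfunc(\"foo \\\"foo\\\" foo\", \"bar\");";
        for (String s : findLiterals(line)) {
            System.out.println(s);
        }
        // prints:
        // foo \"foo\" foo
        // bar
    }
}
```

Because the character class excludes the backslash, a backslash can only be consumed together with the character that follows it, so an escaped quote never ends the match early.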