I am trying to write a program using regex. The format for an identifier, as I might have explained in another question of mine, is that it can only begin with a letter (and the rest of it can contain whatever). I have this worked out for the most part. However, anything within quotes cannot count as an identifier either.
Currently I am using Pattern pattern = Pattern.compile("[A-Za-z][_A-Za-z0-9]*");
as my pattern, which requires the first character to be a letter. So how can I edit this to check whether a word is surrounded by quotation marks (and EXCLUDE those words)?
Use negative lookaround assertions:
"(?<!\")\\b[A-Za-z][_A-Za-z0-9]*\\b(?!\")"
Example:
Pattern pattern = Pattern.compile("(?<!\")\\b[A-Za-z][_A-Za-z0-9]*\\b(?!\")");
Matcher matcher = pattern.matcher("Foo \"bar\" baz");
while (matcher.find())
{
    System.out.println(matcher.group());
}
Output:
Foo
baz
See it working online: ideone.
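One caveat: the lookarounds only inspect the characters directly next to a word, so in a quoted phrase of three or more words, such as "bar mid qux", the middle word is still reported. If that matters, a different technique (not the one used above) is to let the regex consume whole quoted strings and keep only the identifiers it captures outside them. The following is a minimal sketch under the assumption that quotes are balanced and never escaped; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; class and method names are illustrative only.
final class IdentifierScanner {
    // Alternative 1 consumes a whole quoted string; alternative 2 captures
    // an identifier in group 1. Assumes balanced quotes with no escapes.
    private static final Pattern TOKEN =
            Pattern.compile("\"[^\"]*\"|\\b([A-Za-z][_A-Za-z0-9]*)\\b");

    static List<String> identifiers(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            // Group 1 is null when the match was a quoted string.
            if (matcher.group(1) != null) {
                result.add(matcher.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(identifiers("Foo \"bar mid qux\" baz")); // [Foo, baz]
    }
}
```

Because the quoted alternative comes first and matching proceeds left to right, everything between a pair of quotes is swallowed before the identifier alternative gets a chance to see it.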
Use lookarounds.
"(?<![\"A-Za-z])[A-Z...
The (?<![\"A-Za-z]) part means "only match if the previous character is neither a quotation mark nor a letter".
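To see why the letter range belongs in that lookbehind, it may help to compare it with a lookbehind that rejects only a quotation mark. The sketch below is not the truncated pattern above; as an assumption for illustration, it simply prefixes the question's original pattern with each lookbehind. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are not from the original answers.
final class LookbehindDemo {
    // The question's identifier pattern, reused as the body of both variants.
    private static final String BODY = "[A-Za-z][_A-Za-z0-9]*";

    // Lookbehind rejects only a preceding quotation mark.
    static final Pattern QUOTE_ONLY = Pattern.compile("(?<!\")" + BODY);

    // Lookbehind rejects a preceding quotation mark or letter.
    static final Pattern QUOTE_OR_LETTER = Pattern.compile("(?<![\"A-Za-z])" + BODY);

    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "Foo \"bar\" baz";
        System.out.println(findAll(QUOTE_ONLY, input));      // [Foo, ar, baz]
        System.out.println(findAll(QUOTE_OR_LETTER, input)); // [Foo, baz]
    }
}
```

With the quote-only lookbehind, the engine fails at the b of "bar" but then starts a fresh match one character later and reports "ar". Adding the letters to the lookbehind blocks a match from starting in the middle of a word.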