In Java, I was unable to get a regex to behave the way I wanted, and wrote this little JUnit test to demonstrate the problem:
public void testLookahead() throws Exception {
    Pattern p = Pattern.compile("ABC(?!!)");
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find());

    p = Pattern.compile("[A-Z]{3}(?!!)");
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find());

    p = Pattern.compile("[A-Z]{3}(?!!)", Pattern.CASE_INSENSITIVE);
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find()); //fails, why?

    p = Pattern.compile("[A-Za-z]{3}(?!!)");
    assertTrue(p.matcher("ABC").find());
    assertTrue(p.matcher("ABCx").find());
    assertFalse(p.matcher("ABC!").find());
    assertFalse(p.matcher("ABC!x").find());
    assertFalse(p.matcher("blah/ABC!/blah").find()); //fails, why?
}
Every assertion passes except the two marked with the comment. The four groups of assertions are identical except for the pattern. Why would adding case-insensitivity break the matcher?
Your tests fail because, in both cases, the patterns [A-Z]{3}(?!!) (with CASE_INSENSITIVE) and [A-Za-z]{3}(?!!) find at least one match in "blah/ABC!/blah": they find bla twice, once in each blah.
A simple test shows this:
Pattern p = Pattern.compile("[A-Z]{3}(?!!)", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("blah/ABC!/blah");
while (m.find()) {
    System.out.println(m.group());
}
prints:
bla
bla
Those two assertions fail because find() returns true as soon as any substring of the input matches the pattern, not only a substring at ABC. Specifically, the bla in blah matches the regular expression (three letters not followed by an exclamation mark), since the next character is h. The case-sensitive patterns correctly find nothing because blah isn't upper-case, and the only upper-case run, ABC, is followed by an exclamation mark.
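To see this for all four patterns at once, here is a minimal sketch that reports the first substring find() locates in the failing input. The class name LookaheadDemo and the helper firstMatch are made up for this illustration and are not part of the original test.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {

    // Hypothetical helper (not from the original test): returns the first
    // substring that find() locates, followed by "@" and its start index,
    // or null when the pattern matches nowhere in the input.
    public static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() + "@" + m.start() : null;
    }

    public static void main(String[] args) {
        String input = "blah/ABC!/blah";

        // Case-sensitive patterns: the only candidate is ABC, which is
        // followed by '!', so the negative lookahead rejects it.
        System.out.println(firstMatch("ABC(?!!)", 0, input));        // null
        System.out.println(firstMatch("[A-Z]{3}(?!!)", 0, input));   // null

        // Once lower-case letters are allowed, "bla" at index 0 qualifies:
        // it is three letters and the next character is 'h', not '!'.
        System.out.println(
            firstMatch("[A-Z]{3}(?!!)", Pattern.CASE_INSENSITIVE, input)); // bla@0
        System.out.println(firstMatch("[A-Za-z]{3}(?!!)", 0, input));      // bla@0
    }
}
```

The lookahead itself behaves the same in every case. What changes is which three-character runs the character class accepts, so the matcher never needs ABC to succeed.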