
What kind of state is retained by Java's matcher.find() after an unsuccessful match with quantifiers?

Source: https://www.devze.com (reposted from the web), 2023-03-12 11:42

In the following, I expect the second find() to succeed, but it does not. Why?

Matcher matcher = 
    Pattern.compile("\\s*asdf").matcher("apple banana cookie");

// returns false as expected
matcher.find();

// resets group info (which wasn't being used explicitly anyway), but not the matcher's other state.
matcher.usePattern(Pattern.compile("\\s*banana")); 

// returns false, expected true.
System.out.println(matcher.find());

If the quantifier is removed from the first regex (so that it becomes simply "asdf"), the second match succeeds. Inspecting the Matcher object shows that some kind of group information is stored after the first unsuccessful find(), which I wouldn't have expected. find() is supposed to start either at the beginning of the input (if there was no previous match) or just past the end of the last successful match. usePattern() is supposed to preserve the Matcher's position in the input and discard group information (which, again, I wasn't using explicitly).

I'm missing something, but I don't know what. I suspect I have to implement this with lookingAt() and update the region myself (such as this example), but I don't know why the approach above isn't working.


Your first regex consumes the entire string (\\s*). When the second regex is run, there is nothing left to match.

If you call matcher.reset() it works as expected.
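
As a minimal sketch of that suggestion, the snippet below reuses the strings from the question and calls reset() right after usePattern(). The class name ResetBeforeReuse and the helper switchPattern are made up for illustration; only find(), usePattern(), reset(), start() and end() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetBeforeReuse {
    // Hypothetical helper (not from the original post): swaps in a new
    // pattern and clears the matcher's state, so the next find() starts
    // again from index 0 of the input.
    static Matcher switchPattern(Matcher matcher, Pattern next) {
        matcher.usePattern(next); // replaces the pattern, discards group info
        matcher.reset();          // discards match state, restores the default region
        return matcher;
    }

    public static void main(String[] args) {
        Matcher matcher = Pattern.compile("\\s*asdf").matcher("apple banana cookie");
        System.out.println(matcher.find()); // the input contains no "asdf"

        switchPattern(matcher, Pattern.compile("\\s*banana"));
        if (matcher.find()) {
            System.out.println("found at " + matcher.start() + ".." + matcher.end());
        }
    }
}
```

Note that reset() also restores the default region and sets the append position to zero, so this only fits when scanning again from the start of the input is acceptable.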


It looks like the documentation is a little misleading about (or rather, it simply doesn't specify) the behavior when you call find() after a failure.

I suppose that the expected usage is that find() is called repeatedly until failure, but never after failure without resetting.

Looking at the source code confirms that Matcher has an index (the field last) from which it starts searching on the next find(). When find() fails, that index is advanced to the end and isn't reset.

reset() resets that index, usePattern() doesn't.
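
If rewinding to the start is not what you want, the lookingAt()-with-region approach mentioned in the question avoids relying on find()'s internal index, because region() is documented to reset the matcher before setting the new bounds. The following is only a sketch under that assumption. RegionScan and matchAt are hypothetical names. Unlike find(), lookingAt() is anchored at the region start, so each pattern must match exactly at the current position.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionScan {
    // Hypothetical helper (not from the original post): tries `pattern`
    // anchored at `pos` and returns the end index of the match, or -1.
    static int matchAt(Matcher matcher, Pattern pattern, int pos) {
        matcher.usePattern(pattern);
        // region() resets the matcher, then restricts it to [pos, regionEnd).
        matcher.region(pos, matcher.regionEnd());
        return matcher.lookingAt() ? matcher.end() : -1;
    }

    public static void main(String[] args) {
        String input = "apple banana cookie";
        Matcher matcher = Pattern.compile("\\s*asdf").matcher(input);

        int pos = 0;
        String[] attempts = {"\\s*asdf", "\\s*apple", "\\s*banana", "\\s*cookie"};
        for (String regex : attempts) {
            int end = matchAt(matcher, Pattern.compile(regex), pos);
            System.out.println(regex + " at " + pos + " -> " + end);
            if (end >= 0) {
                pos = end; // advance only past successful matches
            }
        }
    }
}
```

Here the caller tracks the position explicitly in pos, so a failed attempt cannot move it.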
