Ruby Regexp: + vs *. special behaviour?

开发者 https://www.devze.com 2022-12-25 00:29 Source: Internet
Using ruby regexp I get the following results:

>> 'foobar'[/o+/]
=> "oo"
>> 'foobar'[/o*/]
=> ""

But:

>> 'foobar'[/fo+/]
=> "foo"
>> 'foobar'[/fo*/]
=> "foo"

The documentation says:

*: zero or more repetitions of the preceding

+: one or more repetitions of the preceding

So I expect 'foobar'[/o*/] to return the same result as 'foobar'[/o+/].

Does anybody have an explanation for that?


'foobar'[/o*/] is matching the zero o's that appear before the f, at position 0.
'foobar'[/o+/] can't match there because there needs to be at least one o, so it instead matches all the o's starting at position 1.

Specifically, the matches you are seeing are

'foobar'[/o*/] => '<>foobar'
'foobar'[/o+/] => 'f<oo>bar'
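
To see those positions for yourself, here is a minimal sketch that reproduces the `<...>` notation above. `mark_match` is a hypothetical helper name (not part of Ruby); it only relies on `MatchData#pre_match` and `MatchData#post_match`.

```ruby
# Hypothetical helper: wrap the first match in <...> so that an
# empty match becomes visible.
def mark_match(string, pattern)
  m = pattern.match(string)
  return string unless m # no match at all: return the string unchanged

  "#{m.pre_match}<#{m[0]}>#{m.post_match}"
end

puts mark_match('foobar', /o*/)  # <>foobar
puts mark_match('foobar', /o+/)  # f<oo>bar
puts mark_match('foobar', /fo*/) # <foo>bar
```

The empty `<>` in the first line is the zero-length match at position 0.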


This is a common misunderstanding of how regexp works.

Although * is greedy and the pattern isn't anchored to the start of the string, the regexp engine still starts looking at the beginning of the string and returns the leftmost match it can find. With /o+/, there is no match at position 0 (the character there is "f"), and since + means one or more, the engine has to move on and try again at the next position (this has nothing to do with greediness) until a match is found or all positions have been tried.

With /o*/, which as you know means zero or more times, the engine also finds no o at position 0, but zero repetitions is a valid match, so it succeeds right there with the empty string and stops (as it should, because o* simply means that the o is optional). Greediness only decides how much is consumed from the position where the match starts; it never makes the engine skip a successful match at an earlier position in favour of a longer one later on.
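
A small sketch of the leftmost-match behaviour, assuming nothing beyond core Ruby. `leftmost` is a hypothetical helper name; it uses the optional start position of `String#match` to show that o* does pick up both o's once the search begins at index 1.

```ruby
# Hypothetical helper: index and text of the leftmost match found
# at or after `start`, or nil when nothing matches.
def leftmost(text, pattern, start = 0)
  m = text.match(pattern, start)
  m && [m.begin(0), m[0]]
end

p leftmost('foobar', /o*/)    # => [0, ""]    empty match wins at index 0
p leftmost('foobar', /o+/)    # => [1, "oo"]  index 0 fails, index 1 succeeds
p leftmost('foobar', /o*/, 1) # => [1, "oo"]  greedy once it starts on an o
p leftmost('fbar', /fo*/)     # => [0, "f"]   zero o's after f is fine for *
p leftmost('fbar', /fo+/)     # => nil        + needs at least one o
```

The last two lines show why /fo*/ and /fo+/ agree on 'foobar': the required f pins the start of the match, so the two quantifiers only differ when no o follows it.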
