
regular expression back referencing

Developer https://www.devze.com 2022-12-12 10:54 Source: Internet

Why does this snippet:

'He said "Hello"' =~ /(\w)\1/

match "ll"? I thought that the \w part matches "H", and hence \1 refers to "H", so nothing should be matched. Why do I get this result?


I thought that the \w part matches "H"

\w matches any alphanumeric character or underscore. It can match H, but that alone is not enough: the regular expression goes on to require the same character again immediately afterwards, and H is not followed by another H in your text. The only character that appears twice in a row is l, so the regular expression matches ll.
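To see this for yourself, here is a small sketch that inspects the match object rather than just the return value of =~. The variable names (text, whole_match, captured, offset) are only illustrative and are not from the question.

```ruby
text = 'He said "Hello"'

# String#match returns a MatchData object (or nil when nothing matches).
m = text.match(/(\w)\1/)

whole_match = m[0]        # the full text matched by the pattern
captured    = m[1]        # what the (\w) group captured
offset      = m.begin(0)  # zero-based index where the match starts

puts "matched #{whole_match.inspect}, group 1 = #{captured.inspect}, at index #{offset}"
```

The group captures a single l, and \1 then consumes the second l, which is why the whole match is two characters long.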


You're thinking of /^(\w)\1/. The caret symbol specifies that the match must start at the beginning of the line. Without that, the match can start anywhere in the string (it will find the first match).
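A quick way to compare the two behaviours, as a sketch: the second string ('aardvark') is a made-up example, chosen only because it starts with a doubled letter.

```ruby
text = 'He said "Hello"'

# Unanchored: the engine may start the match anywhere in the string.
unanchored = (text =~ /(\w)\1/)

# Anchored with ^: the match must begin at the start of a line,
# and "He" is not a doubled character, so there is no match.
anchored = (text =~ /^(\w)\1/)

# A string that does begin with a doubled word character.
anchored_hit = ('aardvark' =~ /^(\w)\1/)

puts unanchored.inspect
puts anchored.inspect
puts anchored_hit.inspect
```

=~ returns the index of the match, or nil when there is none.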


And you're right, nothing matched at that position. The regex engine then moved on through the string, found a match further along, and returned that to you.
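The "try a position, fail, move on" behaviour can be imitated by hand. This is only a rough sketch of the idea, not how the regex engine is actually implemented: it tries the pattern anchored with \A at each successive starting index and stops at the first one that works. The names manual_position and engine_position are illustrative.

```ruby
text = 'He said "Hello"'

# Try each starting index in turn; \A pins the pattern to the start
# of the remaining substring, so no skipping ahead is allowed.
manual_position = (0...text.length).find do |i|
  text[i..-1].match?(/\A(\w)\1/)
end

# What the engine reports for the unanchored pattern.
engine_position = (text =~ /(\w)\1/)

puts "manual scan: #{manual_position}, engine: #{engine_position}"
```

Both approaches land on the same index, the first l of "Hello".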

\w of course matches any word character, not just 'H'.


The point is that "\1" matches the exact text captured by the "(\w)" group once more. Only the letter "l" is doubled, so that is what your regex matches.
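The difference between repeating the pattern and repeating the captured text shows up if you compare \w{2} with (\w)\1. A minimal sketch, with illustrative variable names:

```ruby
text = 'He said "Hello"'

# \w{2} repeats the pattern: any two word characters will do.
any_two = text[/\w{2}/]

# (\w)\1 repeats the captured text: the second character must
# be identical to the first.
same_twice = text[/(\w)\1/]

puts any_two.inspect
puts same_twice.inspect
```

String#[] with a regex returns the matched substring, so the first call returns the leading "He" while the second skips ahead to the doubled letter.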

A nice page for toying around with Ruby and regular expressions is Rubular.
