
Decode the regexp string that matches the word in string

Developer https://www.devze.com 2022-12-15 00:11 Source: Internet

I have the following regexp

var value = "hello";
@"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>\W.*?)"

I'm trying to figure out what it means, because it doesn't work against a single word. For example, it matches "they said hello us" but fails for just "hello".

Can you please help me decode what this regexp string means?

PS: it's a .NET regexp.


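It's because of the \W in the last part. \W matches any non-word character, that is, anything other than a letter, digit, or underscore.

In "they said hello us" there is a space after hello, but in "hello" there is nothing after it, so the required \W has nothing to match. That's why it fails.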

If you change it to (?<end>\W*.*?) it may work.
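
To make the difference concrete, here is a minimal sketch that runs both the original pattern and the relaxed one against a few inputs. The class and method names (EndGroupDemo, Original, Relaxed) are made up for illustration and are not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndGroupDemo
{
    // Pattern from the question: the "end" group requires at least one non-word character.
    public static string Original(string value) =>
        @"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>\W.*?)";

    // Variant suggested above: \W* lets the "end" group match an empty string.
    public static string Relaxed(string value) =>
        @"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>\W*.*?)";

    public static void Main()
    {
        foreach (var input in new[] { "they said hello us", "hello", "Othello" })
        {
            Console.WriteLine("{0,-20} original={1,-5} relaxed={2}",
                input,
                Regex.IsMatch(input, Original("hello")),
                Regex.IsMatch(input, Relaxed("hello")));
        }
    }
}
```

The original pattern should match only the first input, while the relaxed one matches all three, including "Othello". So loosening the end group fixes the single-word case but does not by itself restrict the match to whole words.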

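Actually, the regex itself does not make sense to me; it should rather be something like

@"\b" + Regex.Escape(value) + @"\b"

\b is a word boundary. Note the verbatim (@) strings: in a regular C# string, "\b" is a backspace character, not the regex word-boundary token.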


The regex may be trying to find a pattern comprising whole words, so that your hello example doesn't match, say, Othello. If so, the word boundary regex, \b, is tailor-made for the purpose:

@"\b(" + Regex.Escape(value) + @")\b"
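
As a rough sketch of how the word-boundary version behaves, the snippet below wraps it in a helper and tries the inputs mentioned in this thread. ContainsWord and WholeWordDemo are hypothetical names chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Hypothetical helper: true when value occurs in input as a whole word.
    public static bool ContainsWord(string input, string value) =>
        Regex.IsMatch(input, @"\b(" + Regex.Escape(value) + @")\b");

    public static void Main()
    {
        Console.WriteLine(ContainsWord("hello", "hello"));              // True
        Console.WriteLine(ContainsWord("they said hello us", "hello")); // True
        Console.WriteLine(ContainsWord("Othello", "hello"));            // False
    }
}
```

One caveat with this approach: \b only behaves as expected when value itself starts and ends with a word character. A search term such as "c++" would need different handling.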


If this is a .NET regex and the Regex.Escape() part is replaced with just 'hello', RegexBuddy says it means:

(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)

Options: case insensitive

Match the regular expression below and capture its match into backreference with name “start” «(?<start>.*?\W*?)»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match a single character that is a “non-word character” «\W*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the regular expression below and capture its match into backreference with name “term” «(?<term>hello)»
   Match the characters “hello” literally «hello»
Match the regular expression below and capture its match into backreference with name “end” «(?<end>\W.*?)»
   Match a single character that is a “non-word character” «\W»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
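
To see what each named group from the breakdown above actually captures, here is a small sketch that applies the pattern to the sentence from the question. NamedGroupsDemo and Capture are illustrative names only; RegexOptions.IgnoreCase mirrors the "case insensitive" option listed above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupsDemo
{
    public const string Pattern = @"(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)";

    // Hypothetical helper: runs the pattern once, case-insensitively.
    public static Match Capture(string input) =>
        Regex.Match(input, Pattern, RegexOptions.IgnoreCase);

    public static void Main()
    {
        Match m = Capture("they said hello us");
        // Brackets make leading and trailing spaces visible.
        Console.WriteLine("start=[{0}]", m.Groups["start"].Value); // start=[they said ]
        Console.WriteLine("term=[{0}]", m.Groups["term"].Value);   // term=[hello]
        Console.WriteLine("end=[{0}]", m.Groups["end"].Value);     // end=[ ]

        // With nothing after the word, the mandatory \W cannot match.
        Console.WriteLine(Capture("hello").Success);               // False
    }
}
```

Because the trailing .*? is lazy and nothing follows it in the pattern, the "end" group captures only the single non-word character, not the rest of the sentence.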