Regex: word boundary but for white space, beginning of line or end of line only

https://www.devze.com 2023-01-21 21:17 Source: web
I am looking for something like a word boundary that covers only these 3 cases:

  1. beginning of string
  2. end of string
  3. whitespace

Is there something like that, since \b also covers -, / etc.?

I would like to replace \b in this pattern with something like what is described above:

(\b\d*\sx\s|\b\d*x|\b)


Try replacing \b with (?:^|\s|$)

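That means

(
  ?:  non-capturing group: group the alternatives without capturing them
  ^   match the beginning of the string (or of a line, in multiline mode)
  |   or
  \s  match a whitespace character
  |   or
  $   match the end of the string (or of a line, in multiline mode)
)

Note that, unlike \b, the \s alternative consumes a character, so the whitespace becomes part of the match.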

Works for me in Python and JavaScript.
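
A minimal Python sketch of the difference, assuming a simplified digits-only pattern rather than the full pattern from the question. The sample string and the names EDGE and text are made up for illustration:

```python
import re

# The whitespace-or-edge group suggested above.
EDGE = r"(?:^|\s|$)"

# Illustrative sample: "20" follows "-" and "30" follows "/".
text = "10 a-20 b/30 40"

# \b fires at every word/non-word transition, including after "-" and "/".
with_word_boundary = re.findall(r"\b\d+", text)

# EDGE only lets a number start at the string start or after whitespace.
# The digits are captured in a group so the consumed whitespace is left out.
with_edge = re.findall(EDGE + r"(\d+)", text)

# Without a capture group, the consumed whitespace is part of the match.
raw_match = re.search(EDGE + r"\d+", "a 40").group()

print(with_word_boundary)  # ['10', '20', '30', '40']
print(with_edge)           # ['10', '40']
print(repr(raw_match))     # ' 40'
```

Because \s consumes the whitespace, two adjacent tokens cannot share the same separating space. If that matters, a lookaround-based anchor avoids consuming it.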


OK, so your real question is:

How do I match a unit, optionally preceded by a quantity, but only if there is either nothing or a space right before the match?

Use

 (?<!\S)\b(?:\d+\s*x\s*)?\d+(?:\.\d+)?\s*ml\b

Explanation

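(?<!\S): Negative lookbehind. Assert that the character just before the match is not a non-whitespace character, i.e. the match starts at the beginning of the string or right after whitespace.

\b: Match a word boundary.

(?:\d+\s*x\s*)?: Optionally match a quantity (integers only) followed by x, e.g. "2 x ".

\d+(?:\.\d+)?: Match a number (decimals optional).

\s*ml\b: Match ml, optionally preceded by whitespace, and require a word boundary after it.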
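
To see how the pieces behave together, here is a short Python sketch that runs the pattern above over a few made-up inputs. The name ML_PATTERN and the sample strings are assumptions for illustration only:

```python
import re

# The pattern from the answer, unchanged.
ML_PATTERN = re.compile(r"(?<!\S)\b(?:\d+\s*x\s*)?\d+(?:\.\d+)?\s*ml\b")

# Illustrative sentence: the last amount is glued to "a-" and must be skipped.
sentence = "2 x 330 ml or 0.5ml, not a-250 ml"
found = ML_PATTERN.findall(sentence)

# "-" directly before the number is a non-space character, so the lookbehind fails.
after_hyphen = ML_PATTERN.search("size-250 ml")

# "mls" has no word boundary right after "ml", so the trailing \b fails.
plural = ML_PATTERN.search("250 mls")

print(found)         # ['2 x 330 ml', '0.5ml']
print(after_hyphen)  # None
print(plural)        # None
```

All groups in the pattern are non-capturing, so findall returns the full matched text for each hit.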


Boundaries that you get with \b are not whitespace-sensitive. \b is a conditional assertion that matches at any transition between a word character and a non-word character (\w\W or \W\w), or between a word character and the start or end of the string. See this answer for how to write your anchor more precisely, so that you can deal with whitespace the way you want.
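
One common way to write a whitespace-sensitive anchor is a pair of lookarounds, (?<!\S) on the left and (?!\S) on the right. This is a sketch of that general technique, not necessarily what the linked answer proposes. The sample text and variable names are made up:

```python
import re

# Illustrative text: "foo" appears standalone, before "-", and inside parentheses.
text = "foo foo-bar (foo) foo"

# \b matches at any \w/\W transition, so "-", "(" and ")" all count as boundaries.
word_boundary_starts = [m.start() for m in re.finditer(r"\bfoo\b", text)]

# (?<!\S) = no non-space character before; (?!\S) = no non-space character after.
# Together they accept only start of string, end of string, or whitespace.
whitespace_starts = [m.start() for m in re.finditer(r"(?<!\S)foo(?!\S)", text)]

print(word_boundary_starts)  # [0, 4, 13, 18]
print(whitespace_starts)     # [0, 18]
```

Both lookarounds are zero-width, so unlike (?:^|\s|$) they do not consume the surrounding whitespace.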
