Why does my regex containing \d{1,} together with a negative lookahead still match, where it shouldn't?

Developer https://www.devze.com 2023-03-28 15:50 Source: Internet
I'm trying to match a coordinate pair in a String using a Regex in Java. I explicitly want to exclude strings using negative lookahead.

to be matched:

558,228
558,228,
558,228,589
558,228,A,B,C

NOT to be matched:

558,228,<Text>

The Regex ^558,228(?!,<).* does the job, while ^\d{1,},\d{1,}(?!,<).* doesn't. It's the same regex with the metacharacter \d instead of values. Any ideas why?


The reason is the .* part at the end, which matches whatever the earlier parts leave over. Because \d{1,} is allowed to match fewer than three digits, the engine backtracks when the lookahead fails after 228: ^\d{1,},\d{1,}(?!,<) then matches 558,22 (the next characters are "8,", not ",<", so the lookahead succeeds) and .* matches the remaining part 8,<Text>.
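To make the backtracking visible, here is a minimal sketch that wraps the two halves of the question's regex in capturing groups and prints what each one consumed. The class name BacktrackDemo and the helper split are made up for this illustration; only java.util.regex from the standard library is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answers.
public class BacktrackDemo {

    // Same regex as in the question, with capturing groups added around
    // the coordinate part (group 1) and the trailing .* (group 2).
    private static final Pattern GREEDY =
            Pattern.compile("^(\\d{1,},\\d{1,})(?!,<)(.*)");

    // Returns {group 1, group 2}, or null if the regex does not match.
    static String[] split(String input) {
        Matcher m = GREEDY.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("558,228,<Text>");
        // The second \d{1,} gives back the digit 8 so the lookahead can succeed.
        System.out.println("coordinates: " + parts[0]); // prints: coordinates: 558,22
        System.out.println("rest:        " + parts[1]); // prints: rest:        8,<Text>
    }
}
```

The literal version ^558,228(?!,<).* cannot do this, because the fixed text 228 has nothing to give back.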


The problem is the \d{1,} part in combination with the .* at the end.

In your case

558,228,<Text>

The lookahead fails right after 228, so the second \d{1,} backtracks by one digit. The ^\d{1,},\d{1,}(?!,<) part then matches "558,22" and the .* matches the rest, "8,<Text>".

You can solve this using the possessive quantifier ++:

^\d+,\d++(?!,<)(.*)

See it here online on Regexr

\d++ uses the seldom-used possessive quantifier, which is useful here. ++ means: match at least once, as many times as possible, and do not backtrack. The regex will therefore not give back digits once it has matched them, so the lookahead is only ever tested directly after the complete second number.
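As a rough check, the sketch below runs the five sample strings from the question through both the greedy and the possessive variant. The class name PossessiveDemo and the helper matches are hypothetical names chosen for this example; the patterns themselves are the ones discussed above, without the capturing group.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answers.
public class PossessiveDemo {

    // Greedy version from the question (\d{1,} is equivalent to \d+).
    static final Pattern GREEDY = Pattern.compile("^\\d+,\\d+(?!,<).*");

    // Possessive version from the answer: \d++ never gives digits back.
    static final Pattern POSSESSIVE = Pattern.compile("^\\d+,\\d++(?!,<).*");

    static boolean matches(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "558,228",
            "558,228,",
            "558,228,589",
            "558,228,A,B,C",
            "558,228,<Text>"   // the only one that should be rejected
        };
        for (String s : samples) {
            System.out.println(s
                    + " greedy=" + matches(GREEDY, s)
                    + " possessive=" + matches(POSSESSIVE, s));
        }
        // Only the last sample differs: greedy=true, possessive=false.
    }
}
```

Only the second number needs the possessive quantifier. If the first \d+ gives back a digit, the following comma can no longer match, so that path fails on its own.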

Java Quantifier tutorial
