Regex (Java) to remove all characters up to but not including (a number or a letter a-f followed by a number)

Developer https://www.devze.com 2023-02-15 17:11 Source: Internet
I need help constructing a regular expression in Java that removes all characters up to, but not including, the first occurrence of either a number or a letter a-f followed by a number.

Here's what I came up with (doesn't work): string.replaceFirst(".+?(\\d|[a-f]\\d)","");

That line of code replaces the entire string with an empty string. My reasoning was that .+? matches every character up to either \\d (a digit) or [a-f]\\d (any of the letters a-f followed by a digit).

This doesn't work, though. Can I have some help?

Thanks

EDIT: changed replace to replaceFirst


First off, replace() acts on literals, not regexes. You should use replaceFirst or replaceAll depending on what you want. Your regex problem is that you're including the suffix as part of the string to replace. You can give this a try:

input.replaceFirst(".+?(\\d|[a-f]\\d)","$1")

Here I just include the suffix in the replacement string as well. The more correct approach is to make that a zero-width assertion so that it doesn't get included in the region to replace. You can use a positive lookahead:

input.replaceFirst(".+?(?=(\\d|[a-f]\\d))", "")
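To compare the two variants side by side, here is a minimal sketch. The class name, method names, and the sample string are made up for illustration; only the two patterns come from the answer above.

```java
class StripPrefixDemo {
    // Variant 1: the suffix is part of the match, so it is captured
    // in group 1 and written back through "$1" in the replacement.
    static String viaBackreference(String input) {
        return input.replaceFirst(".+?(\\d|[a-f]\\d)", "$1");
    }

    // Variant 2: the suffix sits in a positive lookahead, which is
    // zero-width, so it is checked but never consumed or replaced.
    static String viaLookahead(String input) {
        return input.replaceFirst(".+?(?=(\\d|[a-f]\\d))", "");
    }

    public static void main(String[] args) {
        String sample = "xyz-b7 rest"; // hypothetical input
        System.out.println(viaBackreference(sample)); // b7 rest
        System.out.println(viaLookahead(sample));     // b7 rest
    }
}
```

Both give the same result on this input. The lookahead version just keeps the replacement string empty, which matches the "remove the prefix" intent more directly.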


The other answers given here have a problem: if the string already starts with a number, or with a-f followed by a number, they can still match and remove the first character, because .+? has to consume at least one character. Not sure if that's a relevant scenario. This more convoluted pattern should work though:

"([^a-f\\d]|([a-f](?!\\d)))+"

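(that is, everything that's not a digit or a-f, or a-f not followed by a digit). If you pass it to replaceFirst, anchor it with a leading ^, otherwise it can match a run of characters later in the string when nothing matches at the start.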
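A rough sketch of how this pattern might be used with replaceFirst. The class and method names are hypothetical, and the leading ^ in stripPrefix is an assumption on my part rather than part of the pattern as posted; the second method shows what happens without it.

```java
class NegatedPrefixDemo {
    // Anchored at the start: removes only a leading run of characters
    // that are neither digits nor a-f, or a-f not followed by a digit.
    static String stripPrefix(String input) {
        return input.replaceFirst("^([^a-f\\d]|([a-f](?!\\d)))+", "");
    }

    // Same pattern without the anchor, for comparison. replaceFirst
    // then removes the first matching run wherever it occurs.
    static String stripUnanchored(String input) {
        return input.replaceFirst("([^a-f\\d]|([a-f](?!\\d)))+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPrefix("xyz-b7 rest")); // b7 rest
        System.out.println(stripPrefix("b7 rest"));     // b7 rest (unchanged)
        System.out.println(stripPrefix("7up"));         // 7up (unchanged)
        System.out.println(stripUnanchored("7up"));     // 7
    }
}
```

The last call is the reason for the anchor: when the string already starts with the target, the unanchored version skips ahead and deletes text after it.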


I'd suggest something along the lines of

string.replaceFirst(".*?(?=(\\d|[a-f]\\d))", "");


s = s.replaceFirst(".*?(?=[a-f]?\\d)", "");

Using .*? instead of .+? ensures that the first character gets checked by the lookahead, solving the problem @johusman mentioned (a string that already starts with the target would otherwise lose its first character). And while your (\\d|[a-f]\\d) isn't causing a problem, [a-f]?\\d is both more efficient and more readable.
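The difference between the two quantifiers only shows up when the string already starts with the target. A small sketch to illustrate, with made-up class and method names and sample inputs:

```java
class LazyQuantifierDemo {
    // .+? must consume at least one character before the lookahead
    // is tried, so the character at index 0 is never checked.
    static String withPlus(String s) {
        return s.replaceFirst(".+?(?=[a-f]?\\d)", "");
    }

    // .*? may match the empty string, so the lookahead is tried at
    // index 0 first and a leading target is left alone.
    static String withStar(String s) {
        return s.replaceFirst(".*?(?=[a-f]?\\d)", "");
    }

    public static void main(String[] args) {
        System.out.println(withPlus("b7 rest"));     // 7 rest (lost the 'b')
        System.out.println(withStar("b7 rest"));     // b7 rest
        System.out.println(withPlus("xyz-b7 rest")); // b7 rest
        System.out.println(withStar("xyz-b7 rest")); // b7 rest
    }
}
```

When there is a non-matching prefix, both behave the same; .*? only changes the outcome for the edge case.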
