
Android - Java - Regular Expression question - consecutive words not being matched

开发者 https://www.devze.com 2023-02-02 00:58 Source: Internet

For my example, I am trying to replace ALL occurrences of "the" and "a" in a string with a space, including cases where these words are next to characters such as quotes and other punctuation.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String oldString = "A test of the exp.";
Pattern p = Pattern.compile("(((\\W|\\A)the(\\W|\\Z))|((\\W|\\A)a(\\W|\\Z)))",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(oldString);
String newString = m.replaceAll(" ");

"A test of the exp." returns "test of exp." - Yeah!

"A test of the a exp." returns "test of a exp." - Boooo!

"The a in this test is a the." returns "a in this test is the." - DoubleBoooo!

Any help would be greatly appreciated. Thanks!


String resultString = subjectString.replaceAll("(?i)\\b(?:a|the)\\b", " "); // (?i) makes the match case-insensitive, as in the question

\b matches at a word boundary (i.e. at the start or end of a word, where "word" is a sequence of letters, digits, or underscores). It is zero-width: it matches a position and does not consume the neighbouring character.

(?:...) is a non-capturing group, needed to separate the alternative words (in this case a and the) from the surrounding word boundary anchors.
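
To check the word-boundary approach against the three strings from the question, here is a minimal sketch. The class name ArticleStripper and the method stripArticles are hypothetical, and it assumes Pattern.CASE_INSENSITIVE is wanted so that "A" and "The" match as in the question.

```java
import java.util.regex.Pattern;

class ArticleStripper {
    // Hypothetical helper, not from the thread.
    // \b is zero-width, so adjacent words such as "the a" can both match.
    private static final Pattern ARTICLES =
            Pattern.compile("\\b(?:a|the)\\b", Pattern.CASE_INSENSITIVE);

    static String stripArticles(String input) {
        // Each matched word is replaced by a single space.
        return ARTICLES.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        String[] samples = {
            "A test of the exp.",
            "A test of the a exp.",
            "The a in this test is a the."
        };
        for (String s : samples) {
            // Brackets make leading and trailing spaces visible.
            System.out.println("[" + stripArticles(s) + "]");
        }
    }
}
```

Because the spaces around each removed word are left in place, the result contains runs of spaces; collapse them afterwards if that matters.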


Or, as a simplified version of @Robokop's solution:

Pattern.compile("(\\b(the|a)\\b)",Pattern.CASE_INSENSITIVE);

or

Pattern.compile('\b(the|a)\b',Pattern.CASE_INSENSITIVE);

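Not sure about quoting in Java. (Java string literals take double quotes and the backslash must be doubled, so only the first form compiles.)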
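
On the quoting question, a small sketch of the pitfall, with a hypothetical class name QuotingDemo: inside a Java string literal, "\\b" reaches the regex engine as \b (word boundary), while "\b" is the escape for a single backspace character, so that pattern compiles but looks for literal backspaces.

```java
import java.util.regex.Pattern;

class QuotingDemo {
    // "\\b" in source is the two-character regex \b, a word boundary.
    static final Pattern WORD_BOUNDARY =
            Pattern.compile("\\b(the|a)\\b", Pattern.CASE_INSENSITIVE);

    // "\b" in source is one backspace character (U+0008), so this pattern
    // only matches "the" or "a" surrounded by literal backspaces.
    static final Pattern BACKSPACE =
            Pattern.compile("\b(the|a)\b", Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        String s = "A test of the exp.";
        System.out.println(WORD_BOUNDARY.matcher(s).find());
        System.out.println(BACKSPACE.matcher(s).find());
    }
}
```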


Pattern.compile("(\\bthe\\b)|(\\ba\\b)",Pattern.CASE_INSENSITIVE);
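
For comparison, a sketch of why the pattern in the question misses consecutive words: its (\W|\A) and (\W|\Z) groups consume the separator, so the space after "the" is no longer available as the leading \W of the following "a". Counting matches makes this visible. The class MatchCounter and the method countMatches are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchCounter {
    // Pattern from the question: the neighbouring characters are consumed.
    static final Pattern CONSUMING = Pattern.compile(
            "(((\\W|\\A)the(\\W|\\Z))|((\\W|\\A)a(\\W|\\Z)))",
            Pattern.CASE_INSENSITIVE);

    // Word-boundary version: \b consumes nothing.
    static final Pattern BOUNDARY = Pattern.compile(
            "(\\bthe\\b)|(\\ba\\b)", Pattern.CASE_INSENSITIVE);

    static int countMatches(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "A test of the a exp.";
        System.out.println("consuming: " + countMatches(CONSUMING, s));
        System.out.println("boundary:  " + countMatches(BOUNDARY, s));
    }
}
```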