
How to modify this regular expression to be case insensitive while searching for curse words?

Developer https://www.devze.com 2023-04-12 15:00 Source: the web

At the moment, this profanity filter finds darn and golly but not Darn or Golly or DARN or GOLLY.

List<String> bannedWords = Arrays.asList("darn", "golly", "gosh");

StringBuilder re = new StringBuilder();
for (String bannedWord : bannedWords)
{
    if (re.length() > 0)
        re.append("|");
    String quotedWord = Pattern.quote(bannedWord);
    re.append(quotedWord);
}

inputString = inputString.replaceAll(re.toString(), "[No cursing please!]");

How can it be modified to be case insensitive?


Start the expression with (?i).

I.e., change re.toString() to "(?i)" + re.toString().

From the documentation of Pattern:

(?idmsux-idmsux) Nothing, but turns match flags i d m s u x on - off

where i is the CASE_INSENSITIVE flag.
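
To make the suggestion concrete, here is a minimal sketch of the question's loop with the (?i) prefix applied. The class name InlineFlagFilter and the method names buildRegex and censor are made up for illustration; only the loop and the replacement text come from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the names here are illustrative only.
class InlineFlagFilter {

    // Joins the quoted banned words with "|" and prepends the inline (?i) flag.
    static String buildRegex(List<String> bannedWords) {
        StringBuilder re = new StringBuilder();
        for (String bannedWord : bannedWords) {
            if (re.length() > 0) {
                re.append("|");
            }
            re.append(Pattern.quote(bannedWord));
        }
        // The flag at the start of the expression applies to every alternative.
        return "(?i)" + re;
    }

    // Replaces every banned word in the input, regardless of case.
    static String censor(String input, List<String> bannedWords) {
        return input.replaceAll(buildRegex(bannedWords), "[No cursing please!]");
    }

    public static void main(String[] args) {
        List<String> bannedWords = Arrays.asList("darn", "golly", "gosh");
        // Prints: [No cursing please!], that was [No cursing please!] close.
        System.out.println(censor("Darn, that was GOLLY close.", bannedWords));
    }
}
```

Note that Pattern.quote only protects regex metacharacters inside each word; the case-insensitive flag still applies to the quoted text.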


You need to set the CASE_INSENSITIVE flag, or simply add (?i) to the beginning of your regex.

StringBuilder re = new StringBuilder("(?i)");

You'll also need to change your conditional, because the builder now starts out holding the four characters of (?i):

if (re.length() > 4)

Setting the flag via @ratchetFreak's answer is probably best, however. It allows for your condition to stay the same (which is more intuitive) and gives you a clear idea of what's going on in the code.

For more info, see this question and, in particular, this answer, which gives a decent explanation of using regexes in Java.
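
If you do go with the (?i) prefix inside the StringBuilder, one way to avoid the bare 4 in the condition is to compare against the prefix's own length. This is only a sketch; the class name PrefixedBuilderFilter and the constant FLAG_PREFIX are hypothetical and not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical class; FLAG_PREFIX and buildRegex are illustrative names.
class PrefixedBuilderFilter {

    static final String FLAG_PREFIX = "(?i)";

    static String buildRegex(List<String> bannedWords) {
        // The builder starts with the flag, so "empty" now means
        // "no longer than the prefix" rather than "length zero".
        StringBuilder re = new StringBuilder(FLAG_PREFIX);
        for (String bannedWord : bannedWords) {
            if (re.length() > FLAG_PREFIX.length()) {
                re.append("|");
            }
            re.append(Pattern.quote(bannedWord));
        }
        return re.toString();
    }

    public static void main(String[] args) {
        List<String> bannedWords = Arrays.asList("darn", "golly", "gosh");
        String cleaned = "Oh GOSH".replaceAll(buildRegex(bannedWords), "[No cursing please!]");
        // Prints: Oh [No cursing please!]
        System.out.println(cleaned);
    }
}
```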


Use a precompiled java.util.regex.Pattern:

Pattern p = Pattern.compile(re.toString(), Pattern.CASE_INSENSITIVE); // do this only once

inputString = p.matcher(inputString).replaceAll("[No cursing please!]");
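
As a rough sketch of what "do this only once" could look like in practice, the pattern can live in a static final field so it is compiled a single time when the class loads. The class name CompiledPatternFilter, the field names, and the use of streams to join the words are assumptions for illustration; the question builds the expression with a loop instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical class; names are illustrative, not from the original post.
class CompiledPatternFilter {

    private static final List<String> BANNED_WORDS = Arrays.asList("darn", "golly", "gosh");

    // Compiled once at class initialization and reused for every call.
    private static final Pattern BANNED = Pattern.compile(
            BANNED_WORDS.stream()
                    .map(Pattern::quote)
                    .collect(Collectors.joining("|")),
            Pattern.CASE_INSENSITIVE);

    static String censor(String input) {
        return BANNED.matcher(input).replaceAll("[No cursing please!]");
    }

    public static void main(String[] args) {
        // Prints: [No cursing please!] it
        System.out.println(censor("DARN it"));
    }
}
```

By default, CASE_INSENSITIVE only folds case for US-ASCII characters; if the banned list ever contains non-ASCII words, combine it with Pattern.UNICODE_CASE.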