RegEx for removing words that contain a character *not* in a specified set

Developer (开发者) https://www.devze.com 2023-02-07 18:35 Source: Internet
I've got a string, let's say something like:

' abc a b c ab ac ae '

I've got a set of characters that I like, let's say something like:

['a', 'b', 'c']

I'm trying to remove any word that contains a character that is not in the set. I'm using JS, but a regex is a regex, so any language-agnostic help would be oodles of help.

I tried something like this, but it didn't do the voodoo I was hoping for:

var str = ' abc a b c ab ac ae ';
var regex = new RegExp(' [a|b|c].[^a|b|c]+[a|b|c]. ', 'gi');
console.log(str.replace(regex, ' '));

Thanks :)


At the start of a character class, ^ means "not", so [^abc] matches any character that is not a, b, or c. Try the regex [abc][^abc]+, which should match ae.

Edit: modified the regex so that the negated class no longer matches whitespace: [a-c][^a-c\s]+
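
To see what the two patterns above actually match on the sample string, here is a small sketch in JavaScript. The variable names are mine, not the answerer's. It also checks one thing about the attempt in the question: inside a character class, | is a literal pipe rather than alternation.

```javascript
const str = ' abc a b c ab ac ae ';

// First suggestion: [^abc] also matches a space, so any liked character
// that is followed by a space counts as a match.
const first = str.match(/[abc][^abc]+/g);

// Edited suggestion: whitespace is excluded from the negated class.
const edited = str.match(/[a-c][^a-c\s]+/g);

// Inside [...] the | is literal, so [a|b|c] also accepts a pipe character.
const pipeIsLiteral = /[a|b|c]/.test('|');

// The edited pattern can match only part of a word.
const partial = 'abe'.match(/[a-c][^a-c\s]+/)[0];

console.log(first);         // includes "c ", "a ", ... and finally "ae "
console.log(edited);        // [ 'ae' ]
console.log(pipeIsLiteral); // true
console.log(partial);       // be
```

So the edited pattern finds ae here, but for a word like abe it matches only be. On its own it locates the offending part of a word rather than the whole word.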


Something like this ought to do it:

\s[abc]*[^abc\s]+[abc]*\s

This also matches the space on either side, which, judging from your code, is what you want.
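
As a rough check, this is how that pattern behaves when dropped into the replace call from the question. The second input is a made-up example of mine, added to show one caveat: because each match consumes the space on both sides, two offending words in a row share a space and the second one is skipped.

```javascript
const re = /\s[abc]*[^abc\s]+[abc]*\s/g;

// Sample string from the question: only " ae " matches.
const output = ' abc a b c ab ac ae '.replace(re, ' ');

// Hypothetical input with two adjacent offending words.
const adjacent = ' xy zz '.replace(re, ' ');

console.log(JSON.stringify(output));   // " abc a b c ab ac "
console.log(JSON.stringify(adjacent)); // " zz "
```

The pattern also allows only one run of unwanted characters per word, so a word that alternates between liked and unliked characters more than once would not match.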


To me, [^abc] is a class that matches 99% of everything (that is, what you don't want), and in reality that's too broad a brush to be the only positive requirement. It needs a counterweight that shaves it down a little more. That shaving can't be done very well in a negative sense, as with a negated character class. It has already had to be modified to include \s.

It might be better to look for words made up of nothing but the allowed characters, then exclude those from the match.

/(?!\b[abc]+\b)\b\w+\b/

expanded:

/     # Rx delim
  (?! \b[abc]+\b )             # Not \b [abc]+ \b in front of us
  \b \w+ \b                    # Match this word, it needs to go
/x    # Rx delim; the x (expanded) modifier is not available in JavaScript
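
Since the question is in JavaScript, here is a minimal sketch of the lookahead idea applied there. The helper name removeWordsOutsideSet and the whitespace collapsing at the end are my own assumptions, not part of the answer. It also assumes the allowed characters are word characters (matched by \w), because the pattern relies on \b.

```javascript
// Hypothetical helper: the name and signature are illustrative only.
function removeWordsOutsideSet(text, allowed) {
  // Escape characters that are special inside a character class.
  const cls = allowed.map(ch => ch.replace(/[\\\]\[^-]/g, '\\$&')).join('');
  // Skip words made only of allowed characters; match every other word.
  const re = new RegExp('(?!\\b[' + cls + ']+\\b)\\b\\w+\\b', 'g');
  // Remove the matched words, then collapse the leftover runs of whitespace.
  return text.replace(re, '').replace(/\s+/g, ' ');
}

const cleaned = removeWordsOutsideSet(' abc a b c ab ac ae ', ['a', 'b', 'c']);
const cleaned2 = removeWordsOutsideSet('cab bad abba', ['a', 'b', 'c']);

console.log(JSON.stringify(cleaned));  // " abc a b c ab ac "
console.log(JSON.stringify(cleaned2)); // "cab abba"
```

Building the pattern from the array keeps the set of liked characters in one place. Because the match is defined with \w and \b, anything that is not a word character (a hyphen, for example) acts as a word separator here.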