Negating Alternation In Regular Expressions

Developer https://www.devze.com 2023-01-20 13:44 Source: Internet
I can use "alternation" in a regular expression to match any occurrence of "cat" or "dog" thusly:

(cat|dog)

Is it possible to NEGATE this alternation, and match anything that is NOT "cat" or "dog"?

If so, how?

For Example:

Let's say I'm trying to match END OF SENTENCE in English, in an approximate way.

To Wit:

(\.)(\s+[A-Z][^.]|\s*?$)

With the following paragraph:

The quick brown fox jumps over the lazy dog. Once upon a time Dr. Sanches, Mr. Parsons and Gov. Mason went to the store. Hello World.

I incorrectly find "end of sentence" at Dr., Mr., and Gov.

(I'm testing using http://regexpal.com/ in case you want to see what I'm seeing with the above example)

Since this is incorrect, I would like to say something like:

!(Dr\.|Mr\.|Gov\.)(\.)(\s+[A-Z][^.]|\s*?$)

Of course, this isn't working, which is why I seek help.

I also tried !/(Dr.|Mr.|Gov.)/, and !~ which were no help whatsoever.

How can I avoid matches for "Dr.", "Mr." and "Gov.", etc?

Thanks in advance.


Not directly. You would normally do this using a negative lookbehind (?<!…), but JavaScript's regex flavor did not support lookbehind before ECMAScript 2018. In an engine without lookbehind, you will have to filter the matches after the fact to discard those you don't want.
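The "filter after the fact" idea can be sketched in JavaScript roughly as follows. This is only an illustration: the helper name findSentenceEnds and the ABBREVIATIONS list are assumptions made up for this example, and the pattern is the one from the question with the global flag added.

```javascript
// Hypothetical abbreviation list, taken from the example paragraph.
const ABBREVIATIONS = ["Dr", "Mr", "Gov"];

// Hypothetical helper: returns the index of every period that looks like
// a sentence end and is not preceded by a listed abbreviation.
function findSentenceEnds(text) {
  // Pattern from the question, plus the global flag so exec() can iterate.
  const candidate = /(\.)(\s+[A-Z][^.]|\s*?$)/g;
  const ends = [];
  let match;
  while ((match = candidate.exec(text)) !== null) {
    // Everything before the period; check whether it ends in an abbreviation.
    const before = text.slice(0, match.index);
    const isAbbreviation = ABBREVIATIONS.some(
      (abbr) => new RegExp("\\b" + abbr + "$").test(before)
    );
    if (!isAbbreviation) {
      ends.push(match.index);
    }
  }
  return ends;
}
```

Called on the paragraph from the question, this returns the indices 43, 119 and 132, which are the periods after "dog", "store" and "World".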


In languages like Perl or awk, there's the !~ operator:

$string !~ /(cat|dog)/

In ActionScript, you can just use the NOT operator ! to negate the result of a match. See here for reference, and here for a comparison of regex flavors.
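As a rough JavaScript equivalent of !~, the negation is applied to the boolean result of a test. The function name mentionsNeither is an assumption for this sketch. Note that this only answers "does the string contain neither word?"; it does not help with skipping individual matches inside a larger pattern.

```javascript
// No global flag, so test() keeps no state between calls.
const catOrDog = /(cat|dog)/;

// Hypothetical helper: true when the text contains neither "cat" nor "dog".
function mentionsNeither(text) {
  return !catOrDog.test(text);
}
```

For example, mentionsNeither("a bird") is true, while mentionsNeither("hotdog") is false because "dog" occurs inside the word.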


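You can negate the result of a test (str stands for your input string):

!/(cat|dog)/.test(str)

EDIT: You should've included the programming language in your question. It's ActionScript, right? I'm not an ActionScript coder, but AFAIK the ! has to be applied to the result of the test rather than to the pattern itself, like this:

var noMatch:Boolean = !/(cat|dog)/.test(str);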


(?!NotThisStuff) is what you want, otherwise known as a negative lookahead group.

Unfortunately, it will not work as you intend. /(?!Dr\.)(\.)/ will still return the periods that belong to "Dr. Sanches", because the lookahead is checked at the position of the period itself and only looks forward from there. The regex parser will say to itself, "Yep, this '.' isn't 'Dr.'" and accept it. /((?!Dr).)/ won't work either, though I believe it should.
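One way to make a negative lookahead useful here is to anchor it at the start of the word rather than at the period. The following JavaScript sketch makes that assumption; the function name is hypothetical, and for brevity it does not check what follows the period.

```javascript
// Hypothetical helper: indices of periods that end a word which is not
// one of the listed abbreviations.
function sentenceEndsViaLookahead(text) {
  // \b anchors at a word start; the lookahead rejects the position when
  // the word is "Dr", "Mr" or "Gov" immediately followed by a period.
  const wordThenPeriod = /\b(?!(?:Dr|Mr|Gov)\.)\w+\./g;
  const ends = [];
  for (const match of text.matchAll(wordThenPeriod)) {
    // The period is the last character of the matched text.
    ends.push(match.index + match[0].length - 1);
  }
  return ends;
}
```

On the example paragraph this yields 43, 119 and 132. The lookahead works here because it is evaluated before "Dr." is consumed, so it can actually see the abbreviation.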

And what's more, you'll end up looking through all the sentence "ends" anyway. ActionScript doesn't have a "match all," only a match first. You have to set the global flag (add g to the end of your regex) and call exec until the result object is null.

var string:String = 'The quick brown fox jumps over the lazy dog. Once upon a time Dr. Sanches, Mr. Parsons and Gov. Mason went to the store. Hello World.';

// With the global flag, each exec() call resumes after the previous match.
var regx:RegExp = /(?!Dr\.)(\.)/g;
var result:Object = regx.exec(string);

for (var i:int = 0; i < 10; i++) { // paranoia: cap the number of iterations
  if (result == null || result.index == 0) break; // again: paranoia
  trace(result.index, result);
  result = regx.exec(string);
}

// trace results:    
//43 .,.
//64 .,.
//77 .,.
//94 .,.
//119 .,.
//132 .,.
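
For comparison, here is a sketch of the same scan in JavaScript using a negative lookbehind. It assumes an engine that implements ECMAScript 2018 lookbehind, so it is not expected to work in older JavaScript engines; the function name is hypothetical.

```javascript
// Hypothetical helper: indices of periods that end a sentence and are not
// preceded by "Dr", "Mr" or "Gov".
function sentenceEndsViaLookbehind(text) {
  // (?<!...) rejects a period directly preceded by a listed abbreviation;
  // (?=...) requires whitespace plus a capital letter, or the end of input.
  const sentenceEnd = /(?<!\b(?:Dr|Mr|Gov))\.(?=\s+[A-Z]|\s*$)/g;
  return Array.from(text.matchAll(sentenceEnd), (m) => m.index);
}
```

With the example string above, this returns 43, 119 and 132, skipping the periods at 64, 77 and 94 that the trace output lists.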