I am trying to figure out a regular expression which matches any string that doesn't start with mpeg. A generalization of this is matching any string which doesn't start with a given regular expression.
I tried something like the following:
[^m][^p][^e][^g].*
The problem with this is that it requires at least 4 characters to be present in the string. I was not able to figure out a good way to handle this, or a general-purpose way to handle the generalized case.
I will be using this in Python.
^(?!mpeg).*
This uses a negative lookahead to only match a string where the beginning doesn't match mpeg. Essentially, it requires that "the position at the beginning of the string cannot be a position where, if we started matching the regex mpeg, we could successfully match", thus matching anything which doesn't start with mpeg, and not matching anything that does.
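As a minimal sketch of how this behaves in Python, and of the generalization to an arbitrary prefix regex: the helper name not_starting_with is hypothetical, and it assumes the prefix you pass in is itself a valid regex.

```python
import re

def not_starting_with(prefix_pattern):
    """Compile a regex matching any string that does not start with prefix_pattern.

    Hypothetical helper: prefix_pattern is treated as a regex, not a literal.
    """
    # (?!...) is a negative lookahead: it succeeds only when prefix_pattern
    # cannot be matched at the current position (here, the start of the string).
    return re.compile(r"^(?!%s).*" % prefix_pattern)

no_mpeg = not_starting_with("mpeg")

print(bool(no_mpeg.match("mpeg4 video")))  # False: starts with mpeg
print(bool(no_mpeg.match("mp3")))          # True: shorter than the prefix
print(bool(no_mpeg.match("")))             # True: the empty string is fine too
```

Note that, unlike the character-class attempt in the question, this does not need the string to be at least 4 characters long.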
However, I'd be curious about the context in which you're using this - there might be other options aside from regex which would be either more efficient or more readable, such as...
if not inputstring.startswith("mpeg"):
Don't lose your mind with regex.
# Slicing is safe on short strings, so no length check is needed.
if mystring[:4] == "mpeg":
    print("do something")
Or use startswith() with the "not" keyword:
if not mystring.startswith("mpeg"):
    print("do something")
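For literal prefixes, here is a small sketch of the non-regex route. The sample names are made up for illustration. str.startswith also accepts a tuple of prefixes, which covers several literal prefixes at once.

```python
# Made-up sample data, for illustration only.
names = ["mpeg4.avi", "mp3.ogg", "", "video.mpeg", "jpeg.png"]

# Keep every name that does not start with "mpeg". Short and empty
# strings are handled correctly without any length check.
kept = [n for n in names if not n.startswith("mpeg")]
print(kept)  # ['mp3.ogg', '', 'video.mpeg', 'jpeg.png']

# Passing a tuple rejects names that start with any of the prefixes.
kept_multi = [n for n in names if not n.startswith(("mpeg", "jpeg"))]
print(kept_multi)  # ['mp3.ogg', '', 'video.mpeg']
```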
Try a look-ahead assertion:
(?!mpeg)^.*
Or if you want to use negated classes only:
^(.{0,3}$|[^m]|m([^p]|p([^e]|e([^g])))).*$
Your regexp wouldn't match "npeg". I think you would need to come up with
^($|[^m]|m($|[^p]|p($|[^e]|e($|[^g]))))
which is quite horrible.
Another alternative would be
^(.{0,3}$|[^m]|.[^p]|..[^e]|...[^g])
which is only slightly better.
So I think you should really use a look-ahead assertion as suggested by Dav and Gumbo :-)
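A quick sanity check of the patterns in this thread, sketched under the assumption that str.startswith is the reference behaviour. The sample strings and variable names are my own.

```python
import re

lookahead = re.compile(r"^(?!mpeg).*")
# Negated-class alternatives quoted in the answers above.
classes_only = re.compile(r"^(.{0,3}$|[^m]|m([^p]|p([^e]|e([^g])))).*$")
nested = re.compile(r"^($|[^m]|m($|[^p]|p($|[^e]|e($|[^g]))))")
flat = re.compile(r"^(.{0,3}$|[^m]|.[^p]|..[^e]|...[^g])")
# The attempt from the question, for contrast.
attempt = re.compile(r"[^m][^p][^e][^g].*")

samples = ["", "m", "mpe", "mpeg", "mpeg2", "npeg", "mpag", "video"]

for s in samples:
    expected = not s.startswith("mpeg")
    results = {
        "lookahead": bool(lookahead.match(s)),
        "classes_only": bool(classes_only.match(s)),
        "nested": bool(nested.match(s)),
        "flat": bool(flat.match(s)),
        "attempt": bool(attempt.match(s)),
    }
    print(repr(s), "expected:", expected, results)
```

On these samples the first four patterns agree with startswith, while the attempt from the question rejects short strings and strings like "npeg" that share a character with mpeg at the same position.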