I'm trying to match all occurrences of "string" in something like the following sequence, except those inside @@:
as87dio u8u u7o @string@ ou os8 string os u
i.e. the second occurrence should be matched but not the first.
Can anyone give me a solution?
You can use negative lookahead and lookbehind:
(?<!@)string(?!@)
EDIT
NOTE: As per Mark's comments below, this would also not match the string in @string or string@.
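To see how the lookaround version behaves, here is a minimal sketch in Python (my choice of language; the question doesn't name one), run against the sample text from the question. It also shows the caveat from the note: a single adjacent @ on either side is enough to block the match.

```python
import re

# Sample text from the question.
text = "as87dio u8u u7o @string@ ou os8 string os u"

# "string" not immediately preceded by "@" and not immediately followed by "@".
pattern = re.compile(r"(?<!@)string(?!@)")

# Collect (start offset, matched text) for every hit.
matches = [(m.start(), m.group()) for m in pattern.finditer(text)]
print(matches)  # only the second, unwrapped occurrence is reported

# Caveat: one adjacent "@" is enough to reject the match.
print(pattern.findall("@string"))  # []
print(pattern.findall("string@"))  # []
```

Because lookarounds are zero-width, the match is exactly "string" with no neighbouring characters attached.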
You can try:
(?:[^@])string(?:[^@])
OK. If you want to NOT match a character, you put it in a character class (square brackets) and start it with the ^ character, which negates it. For example, [^a] means any character but a lowercase 'a'.
So if you want NOT at-sign, followed by string, followed by another NOT at-sign, you want
[^@]string[^@]
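Now, the problem is that the character classes will each match a character, so in your example we'd get " string ", which includes the leading and trailing whitespace. There's another construct that groups without capturing, and that is parens with a ?: at the beginning: (?: ). So you surround the ends with that. (Note that a non-capturing group still consumes its character, so the neighbours remain part of the overall match; if you need only the word itself, put string in its own capturing group and read that group.)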
(?:[^@])string(?:[^@])
OK, but now it doesn't match at the start of the input (which, confusingly, is the ^ character doing double-duty outside a character class) or at the end of the input, $. So we have to use the OR character | to say "give me a non-at-sign OR start of string" and at the end "give me a non-at-sign OR end of string", like this:
(?:[^@]|^)string(?:[^@]|$)
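As a rough check of the pattern above, here is a small Python sketch (Python is my assumption, not something the question specifies). The second pattern, with a capturing group around the word, is a hypothetical variant I've added to pull out just "string"; it is not part of the original answer.

```python
import re

# Sample text from the question.
text = "as87dio u8u u7o @string@ ou os8 string os u"

# Pattern from the answer: a non-at-sign (or start/end of input) on each side.
plain = re.compile(r"(?:[^@]|^)string(?:[^@]|$)")
whole = [m.group(0) for m in plain.finditer(text)]
print(whole)  # the overall match still includes the neighbouring spaces

# Hypothetical variant: capture the word so it can be read on its own.
captured = re.compile(r"(?:[^@]|^)(string)(?:[^@]|$)")
words = [m.group(1) for m in captured.finditer(text)]
print(words)  # just the word, without its neighbours

# The ^ and $ alternatives let a bare "string" match at the input edges.
print(plain.findall("string"))   # ['string']
print(plain.findall("@string"))  # []
```

One side effect of consuming the neighbours: in "string string" the shared space is used up by the first match, so the second occurrence is skipped. The lookaround version does not have this problem.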
EDIT: The negative lookbehind and lookahead is a simpler (and clever) solution, but it is not available in all regular expression engines.
Now a follow-up question. If you had the word "astringent" would you still want to match the "string" inside? In other words, does "string" have to be a word by itself? (Despite my initial reaction, this can get pretty complicated :) )
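If the answer is yes, i.e. "string" must be a word by itself, one possible sketch (assuming an engine with \b word boundaries and lookarounds, shown here with Python's re module) is to add \b on both sides. The test sentence is made up for illustration.

```python
import re

# Assumption: "string" must be a standalone word and must not touch an "@".
pattern = re.compile(r"(?<!@)\bstring\b(?!@)")

# Made-up sample: "astringent" and "@string@" should both be skipped.
sample = "astringent string @string@"
found = pattern.findall(sample)
print(found)
```

This only covers the simple case; what counts as a "word" (underscores, digits, hyphens) depends on the engine's definition of \b.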