I have a regex I need to match against a path like so: "C:\Documents and Settings\User\My Documents\ScanSnap\382893.pd~". I need a regex that matches all paths except those ending in '~' or '.dat'. The problem I am having is that I don't understand how to match and negate the exact string '.dat' and only at the end of the path, i.e. I don't want to match {d,a,t} elsewhere in the path.
I have built the regex, but need to not match .dat:
[\w\s:\.\\]*[^~]$[^\.dat]
[\w\s:\.\\]*
This matches all word characters, whitespace, the colon, periods, and backslashes.
[^~]$[^\.dat]$
This causes matches ending in '~' to fail. It seems that I should be able to follow up with a negated match for '.dat', but the match fails in my regex tester.
I think my answer lies in grouping, judging from what I've read. Would someone point me in the right direction? I should add that I am using a file-watching program that allows regex matching, and I have only one line to specify the regex.
This entry seems similar: Regex to match multiple strings
You want to use a negative look-ahead:
^((?!\.dat$)[\w\s:\.\\])*$
By the way, your character group ([\w\s:\.\\]) doesn't allow a tilde (~) in it. Did you intend to allow a tilde in the filename if it wasn't at the end? If so:
^((?!~$|\.dat$)[\w\s:\.\\~])*$
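To see how the look-ahead version behaves, here is a minimal sketch in Python. The watcher program's regex engine is not named in the question, so Python's `re` module is only a stand-in, and the helper name `is_watched` and the sample paths other than the one from the question are made up for illustration.

```python
import re

# Look-ahead pattern from the answer above: a character is consumed only if
# the text remaining at that position is not exactly "~" or ".dat".
PATTERN = re.compile(r'^((?!~$|\.dat$)[\w\s:\.\\~])*$')


def is_watched(path: str) -> bool:
    """Return True when the path does not end in '~' or '.dat'."""
    return PATTERN.match(path) is not None


# Hypothetical sample paths; only the ScanSnap folder comes from the question.
samples = [
    r'C:\Documents and Settings\User\My Documents\ScanSnap\382893.pdf',
    r'C:\Documents and Settings\User\My Documents\ScanSnap\382893.pd~',
    r'C:\Documents and Settings\User\My Documents\ScanSnap\index.dat',
    r'C:\data files\report.data',
]

for path in samples:
    print(is_watched(path), path)
```

The last sample is there to check that 'd', 'a', and 't' elsewhere in the path, or '.dat' followed by more characters, do not cause a rejection.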
The following regex:
^.*(?<!\.dat|~)$
matches any string that does NOT end with a '~' or with '.dat'.
^ # the start of the string
.* # gobble up the entire string (without line terminators!)
(?<!\.dat|~) # looking back, there should not be '.dat' or '~'
$ # the end of the string
In plain English: match a string only when looking behind from the end of the string, there is no sub-string '.dat' or '~'.
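One caveat if you test this outside the watcher program: not every engine accepts a look-behind whose alternatives differ in length. Python's `re` module, for example, rejects `(?<!\.dat|~)` because it requires fixed-width look-behinds. A sketch of an equivalent form, assuming Python as the test bed (the helper name `ends_ok` is hypothetical), splits the condition into two look-behinds:

```python
import re

# Same idea as ^.*(?<!\.dat|~)$, written as two fixed-width look-behinds so
# that Python's re module accepts it.
PATTERN = re.compile(r'^.*(?<!\.dat)(?<!~)$')


def ends_ok(path: str) -> bool:
    """Return True when the path ends in neither '.dat' nor '~'."""
    return PATTERN.match(path) is not None


for path in (r'C:\scans\382893.pdf', r'C:\scans\382893.pd~',
             r'C:\scans\index.dat', r'C:\scans\report.data'):
    print(ends_ok(path), path)
```

Both look-behinds are checked at the same position, the end of the string, so the match succeeds only when neither suffix is present.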
Edit: the reason why your attempt failed is because a negated character class, [^...] will just negate a single character. A character class always matches a single character. So when you do [^.dat], you're not negating the string ".dat" but you're matching a single character other than '.', 'd', 'a' or 't'.
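A quick way to convince yourself of this, sketched in Python with made-up file names:

```python
import re

# [^.dat] is ONE character that is not '.', 'd', 'a', or 't'.
negated_class = re.compile(r'[^.dat]')

# Every character of '.dat' is in the excluded set, so nothing is found.
print(negated_class.findall('.dat'))       # []

# In 'data.txt' only the 'x' lies outside the excluded set.
print(negated_class.findall('data.txt'))   # ['x']

# Anchored at the end, the class wrongly rejects any name ending in 't' ...
print(re.search(r'[^.dat]$', 'notes.txt') is None)   # True
# ... while accepting names that end in some other character.
print(re.search(r'[^.dat]$', 'scan.pdf') is not None)  # True
```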
^((?!\.dat$)[\w\s:\.\\])*$
This is just a comment on an earlier answer suggestion:
. within a character class, [], is a literal . and does not need escaping.
^((?!\.dat$)[\w\s:.\\])*$
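As a small sanity check, the escaped and unescaped forms can be compared side by side. This is only a sketch using Python's `re` module and a few made-up paths, not a proof for every engine:

```python
import re

# The same look-ahead pattern with the dot in the class escaped and unescaped.
escaped = re.compile(r'^((?!\.dat$)[\w\s:\.\\])*$')
unescaped = re.compile(r'^((?!\.dat$)[\w\s:.\\])*$')

# Hypothetical sample paths for the comparison.
paths = [r'C:\scans\382893.pdf', r'C:\scans\index.dat', r'C:\scans\a#b.pdf']

for path in paths:
    a = escaped.match(path) is not None
    b = unescaped.match(path) is not None
    print(a, b, path)
```

The third path contains '#', which is outside the class in both forms, so it shows that an unescaped dot inside [] does not turn into a match-anything wildcard.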
I'm sorry to post this as a new solution, but I apparently don't have enough credibility to simply comment on an answer yet.
I believe you are looking for this:
[\w\s:\.\\]*([^~]|[^\.dat])$
which matches, as before, all word characters, whitespace, colons, periods (.), and backslashes, and is then meant to reject a tilde (~) or '.dat' at the end of the string. You may also want to add a caret (^) at the very beginning if you know that the match should start at the beginning of a line.
^[\w\s:\.\\]*([^~]|[^\.dat])$