I would like to capture anything up to, but not including, a particular pattern. My actual problem has to do with parsing information out of HTML, but I am distilling it down to an example to, hopefully, clarify my question.
Source
xaxbxcabcabc
Desired Match
xaxbxc
If I use a lookahead, the greedy .* runs up to the last occurrence, so the match still includes the first one:
.*(?=abc) => xaxbxcabc
I would like something along the lines of a negated character class, just for a negated pattern.
.*[^abc] // where abc would be treated as a pattern rather than as a character list (which matches anything but a, b, or c)
I am using http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx for testing
A non-greedy (lazy) quantifier, *?, could be useful here, e.g.
^(?<captured>.*?)abc.*$
Edit: Just to be clear, the explicit capture is (of course) not needed; the really important part is just
(.*?)abc
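To see the difference between the lazy and greedy forms on the example from the question, here is a minimal sketch. It uses Python's re module purely for illustration (an assumption on my part; the thread is about .NET, but this pattern behaves the same way there), and the variable names are made up for the example.

```python
import re

source = "xaxbxcabcabc"

# Lazy: .*? grows one character at a time and stops at the first "abc".
lazy = re.search(r"(.*?)abc", source)
prefix_lazy = lazy.group(1)

# Greedy, for comparison: .* grabs everything, then backtracks only
# as far as the last "abc".
greedy = re.search(r"(.*)abc", source)
prefix_greedy = greedy.group(1)

print(prefix_lazy)    # xaxbxc
print(prefix_greedy)  # xaxbxcabc
```

Note that . does not match a newline by default, so for multi-line HTML you would probably want re.DOTALL (RegexOptions.Singleline in .NET).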
If you anchor the regex and use a lazy quantifier, you'll solve the problem:
"^.*?(?=abc)"
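A rough sketch of how this could be used, again in Python's re module as a stand-in for the .NET tester. The helper name text_before is hypothetical. Because the lookahead does not consume the delimiter, the whole match is already the desired text and no capture group is needed.

```python
import re

# Anchored, lazy, with a lookahead: the match ends right before the first "abc".
PATTERN = re.compile(r"^.*?(?=abc)")

def text_before(source):
    """Return the text before the first "abc", or None if "abc" is absent."""
    match = PATTERN.search(source)
    return match.group(0) if match else None

print(text_before("xaxbxcabcabc"))  # xaxbxc
print(text_before("no delimiter"))  # None
```

The anchor mostly matters when you ask for all matches rather than the first one, since it stops the engine from reporting further matches later in the string.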
Why not use a replace:
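// requires: using System.Text.RegularExpressions;
string result = new Regex("abc.*$", RegexOptions.Singleline).Replace(input, ""); // Singleline lets . match newlines, so the removal runs to the end of multi-line input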
This will remove everything from the first matching phrase onwards, leaving you with all of the content up until that point.
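The same idea can be sketched in Python for quick experimentation. The function name strip_from_marker and its marker parameter are hypothetical, and re.DOTALL is used on the assumption that real input (HTML) spans several lines; it is the Python counterpart of .NET's RegexOptions.Singleline.

```python
import re

def strip_from_marker(source, marker="abc"):
    """Remove everything from the first occurrence of marker to the end."""
    # re.escape keeps the marker literal; DOTALL lets . cross newlines.
    pattern = re.escape(marker) + r".*$"
    return re.sub(pattern, "", source, count=1, flags=re.DOTALL)

print(strip_from_marker("xaxbxcabcabc"))  # xaxbxc
```

If the marker never appears, the input comes back unchanged, which may or may not be what you want when "no match" should be treated as an error.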