I've been trying to solve this for the last two hours, but it just doesn't work :(
I have downloaded the HTML code of a web page and removed all double white spaces and all new lines, so the whole code is a single-line string.
Now I have to extract one piece of data from it:
page.com/users/(this)/xxxxx/.....
match = Regex.Match(htmlCode, "page.com/users/(.*)/xxxxx/");
string user = match.Groups[1].ToString();
But it doesn't work: I always get (this)/xxxxx/ plus the rest of the HTML code.
Does anyone know why this doesn't work?
Instead of the greedy (.*), use ([^/]*). Your .* is matching everything after that, including the /xxxxx/ portion.
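To make the difference concrete, here is a minimal sketch comparing the greedy (.*) from the question with ([^/]*). The input string, the user names alice and bob, and the UserExtractor/ExtractUser names are made up for illustration; escaping the dot in page.com is an extra tightening that the original pattern does not have.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UserExtractor
{
    // Hypothetical helper: returns the path segment between "page.com/users/"
    // and "/xxxxx/", or an empty string when the pattern is not found.
    public static string ExtractUser(string htmlCode)
    {
        // [^/]* cannot cross a slash, so the capture stops at the first "/".
        // The dot in "page.com" is escaped so it only matches a literal dot.
        Match match = Regex.Match(htmlCode, @"page\.com/users/([^/]*)/xxxxx/");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Made-up one-line input containing "/xxxxx/" twice.
        string htmlCode = "<a href=\"page.com/users/alice/xxxxx/1\">a</a> "
                        + "<a href=\"page.com/users/bob/xxxxx/2\">b</a>";

        // Greedy pattern from the question: (.*) extends to the LAST "/xxxxx/",
        // so the capture swallows the markup between the two links.
        string greedy = Regex.Match(htmlCode, "page.com/users/(.*)/xxxxx/").Groups[1].Value;
        Console.WriteLine(greedy);

        // Negated character class: the capture ends at the first slash.
        Console.WriteLine(ExtractUser(htmlCode)); // alice
    }
}
```

Because the whole page was collapsed into one line, nothing stops .* before the final /xxxxx/ in the string, which is why the capture in the question kept growing.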
Make the .* more specific, for example [^/]+, which means there has to be at least one character there and it can be anything except a /.
Try:
match = Regex.Match(htmlCode, "page.com/users/([^/]*)/xxxxx/");
string user = match.Groups[1].ToString();
Try page.com/users/([^/]*)/xxxxx/