How to extract all URLs from text in C# (ASP.NET MVC)

Developer https://www.devze.com 2023-03-25 19:43 Source: Internet
I am creating an ASP.NET MVC application with a text box where users can type text that may include URLs. On the server, I want to parse that text and extract all the URLs in it.

Possible input text

abc.com, xyz.com, http://foo.com
baar.com 
http://baz.com, www.foobar.com
mosso.com
http://subfoo.foo.com
bar.baz.com
foobar.net baaz2.com  morebaaz.com

Expected output array

abc.com
xyz.com
foo.com
baar.com
baz.com
foobar.com
mosso.com
subfoo.foo.com
bar.baz.com   
foobar.net 
baaz2.com  
morebaaz.com


How about this:

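// Strip the prefixes first, then split on every separator in the sample (spaces, commas, line breaks).
string[] domains = text.Replace("http://", "").Replace("www.", "").Replace("ftp://", "").Split(new char[] { ' ', ',', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);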

Then you can check whether each string is a valid URL:

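// Requires: using System.Text.RegularExpressions;
// Built once and reused; RegexOptions.Compiled is wasted if the Regex is recreated on every call.
private static readonly Regex UrlPattern = new Regex(
    @"^[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*[^\.\,\)\(\s]$",
    RegexOptions.Compiled | RegexOptions.IgnoreCase);

public static bool isValidUrl(string url)
{
    return UrlPattern.IsMatch(url);
}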

Hope it helps...

EDIT: Sorry, the validation failed. Fixed it now.
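
The split-then-validate idea above can be put together roughly as follows. This is only a sketch: the class name DomainExtractor and the method Extract are made up for illustration, and the pattern is simply the one from isValidUrl. Note that Replace("www.", "") removes that text anywhere in the input, not just at the start of a URL.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative, not from the answer.
public static class DomainExtractor
{
    // Same pattern as isValidUrl, compiled once and reused.
    private static readonly Regex DomainPattern = new Regex(
        @"^[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*[^\.\,\)\(\s]$",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string[] Extract(string text)
    {
        return text
            .Replace("http://", "")
            .Replace("ftp://", "")
            .Replace("www.", "")
            // Split on every separator seen in the sample: spaces, commas, line breaks.
            .Split(new[] { ' ', ',', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            // Keep only the pieces that pass the validation pattern.
            .Where(candidate => DomainPattern.IsMatch(candidate))
            .ToArray();
    }
}
```

Calling DomainExtractor.Extract(text) with the text from the question should return the domains in the order they appear, and plain words without a dot are dropped by the pattern.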


First, you want to set the delimiters. Your sample separates the URLs with commas, spaces, and line breaks, so split on all of them.

var urlArray = inputString.Split(new[] { ',', ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

Then you can loop through each string and check to see if a url needs to be trimmed.

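// A foreach iteration variable is read-only, so index into the array to update it in place.
for (int i = 0; i < urlArray.Length; i++)
{
    int schemeIndex = urlArray[i].IndexOf("http://", StringComparison.Ordinal);
    if (schemeIndex >= 0) // or other things you want to filter out
    {
        urlArray[i] = urlArray[i].Substring(schemeIndex + 7);
    }
}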

Now you can display each string in the array!
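
If there are several "other things you want to filter out", one way to handle them is a small helper that strips a list of known prefixes from the start of each string. This is a sketch under assumptions: the class UrlTrimmer, the method StripPrefixes, and the prefix list (including "https://", which is not in the question) are all made up for illustration.

```csharp
using System;

// Hypothetical helper; the prefix list is an assumption based on the sample input.
public static class UrlTrimmer
{
    // Schemes come before "www." so that "http://www.foo.com" loses both.
    private static readonly string[] Prefixes = { "http://", "https://", "ftp://", "www." };

    public static string StripPrefixes(string url)
    {
        string result = url.Trim();
        foreach (string prefix in Prefixes)
        {
            // Only strip at the start, so the rest of the string is left untouched.
            if (result.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            {
                result = result.Substring(prefix.Length);
            }
        }
        return result;
    }
}
```

Inside the loop you could then write urlArray[i] = UrlTrimmer.StripPrefixes(urlArray[i]); instead of the single "http://" check.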
