There are the Uri.IsWellFormedUriString and Uri.TryCreate methods, but they seem to return true for file paths, etc.
How do I check whether a string is a valid (not necessarily active) HTTP URL for input validation purposes?
Try this to validate HTTP URLs (uriName is the URI you want to test):
Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult)
&& uriResult.Scheme == Uri.UriSchemeHttp;
Or, if you want to accept both HTTP and HTTPS URLs as valid (per J0e3gan's comment):
Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp || uriResult.Scheme == Uri.UriSchemeHttps);
This method works fine for both http and https URLs. Just one line :)
if (Uri.IsWellFormedUriString("https://www.google.com", UriKind.Absolute))
MSDN: IsWellFormedUriString
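One caveat worth checking: IsWellFormedUriString validates the syntax of a URI of any scheme, so on its own it does not restrict the input to HTTP(S). Below is a minimal sketch, assuming a hypothetical helper named IsWellFormedHttpUrl (not part of the answer above), that adds a scheme check on top of it:

```csharp
using System;

public static class UrlChecks
{
    // Hypothetical helper: a well-formed absolute URI that also uses http or https.
    public static bool IsWellFormedHttpUrl(string candidate)
    {
        if (!Uri.IsWellFormedUriString(candidate, UriKind.Absolute))
            return false;

        // TryCreate is used (rather than the constructor) so no exception can escape.
        return Uri.TryCreate(candidate, UriKind.Absolute, out Uri parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp || parsed.Scheme == Uri.UriSchemeHttps);
    }

    public static void Main()
    {
        // Well formed, but not an HTTP URL.
        Console.WriteLine(Uri.IsWellFormedUriString("ftp://example.com/file.txt", UriKind.Absolute));
        Console.WriteLine(IsWellFormedHttpUrl("ftp://example.com/file.txt"));
        Console.WriteLine(IsWellFormedHttpUrl("https://www.example.com/page?id=1"));
    }
}
```

The first call shows the plain well-formedness check applied to an ftp:// URL; the other two show the combined check on an ftp:// and an https:// URL.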
Try this:
bool IsValidURL(string URL)
{
string Pattern = @"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$";
Regex Rgx = new Regex(Pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
return Rgx.IsMatch(URL);
}
It accepts URLs like these:
- http(s)://www.example.com
- http(s)://stackoverflow.example.com
- http(s)://www.example.com/page
- http(s)://www.example.com/page?id=1&product=2
- http(s)://www.example.com/page#start
- http(s)://www.example.com:8080
- http(s)://127.0.0.1
- 127.0.0.1
- www.example.com
- example.com
public static bool CheckURLValid(this string source)
{
Uri uriResult;
return Uri.TryCreate(source, UriKind.Absolute, out uriResult) && uriResult.Scheme == Uri.UriSchemeHttp;
}
Usage:
string url = "htts://adasd.xc.";
if (url.CheckURLValid())
{
//valid process
}
UPDATE: (single line of code) Thanks @GoClimbColorado
public static bool CheckURLValid(this string source) => Uri.TryCreate(source, UriKind.Absolute, out Uri uriResult) && uriResult.Scheme == Uri.UriSchemeHttps;
Usage:
string url = "htts://adasd.xc.";
if (url.CheckURLValid())
{
//valid process
}
All the answers here either allow URLs with other schemes (e.g., file://, ftp://) or reject human-readable URLs that don't start with http:// or https:// (e.g., www.google.com), which is not good when dealing with user input.
Here's how I do it:
public static bool ValidHttpURL(string s, out Uri resultURI)
{
if (!Regex.IsMatch(s, @"^https?:\/\/", RegexOptions.IgnoreCase))
s = "http://" + s;
if (Uri.TryCreate(s, UriKind.Absolute, out resultURI))
return (resultURI.Scheme == Uri.UriSchemeHttp ||
resultURI.Scheme == Uri.UriSchemeHttps);
return false;
}
Usage:
string[] inputs = new[] {
"https://www.google.com",
"http://www.google.com",
"www.google.com",
"google.com",
"javascript:alert('Hack me!')"
};
foreach (string s in inputs)
{
Uri uriResult;
bool result = ValidHttpURL(s, out uriResult);
Console.WriteLine(result + "\t" + uriResult?.AbsoluteUri);
}
Output:
True https://www.google.com/
True http://www.google.com/
True http://www.google.com/
True http://google.com/
False
After Uri.TryCreate you can check Uri.Scheme to see whether it is HTTP(S).
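A minimal sketch of that idea, assuming a hypothetical helper named IsHttpOrHttps. Uri lowercases the scheme while parsing, so comparing against Uri.UriSchemeHttp and Uri.UriSchemeHttps should also cover input such as HTTPS://EXAMPLE.COM:

```csharp
using System;

public static class SchemeCheck
{
    // Hypothetical helper name; the logic is only TryCreate followed by a Scheme test.
    public static bool IsHttpOrHttps(string candidate)
    {
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out Uri parsed))
            return false;

        return parsed.Scheme == Uri.UriSchemeHttp
            || parsed.Scheme == Uri.UriSchemeHttps;
    }

    public static void Main()
    {
        string[] candidates =
        {
            "http://example.com",
            "HTTPS://EXAMPLE.COM/path",
            "ftp://example.com",
            "not a url"
        };

        // Print each candidate next to the result of the check.
        foreach (string candidate in candidates)
        {
            Console.WriteLine(candidate + "\t" + IsHttpOrHttps(candidate));
        }
    }
}
```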
As an alternative approach to using a regex, this code uses Uri.TryCreate per the OP, but then also checks the result to ensure that its Scheme is one of http or https:
bool passed =
Uri.TryCreate(url, UriKind.Absolute, out Uri uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp
|| uriResult.Scheme == Uri.UriSchemeHttps);
This would return bool:
Uri.IsWellFormedUriString(a.GetAttribute("href"), UriKind.Absolute)
Problem: Valid URLs should begin with one of the following “prefixes”: https, http, www
- Url must contain http:// or https://
- Url may contain only one instance of www.
- Url Host name type must be Dns
- Url max length is 100
Solution:
public static bool IsValidUrl(string webSiteUrl)
{
if (webSiteUrl.StartsWith("www."))
{
webSiteUrl = "http://" + webSiteUrl;
}
return Uri.TryCreate(webSiteUrl, UriKind.Absolute, out Uri uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp
|| uriResult.Scheme == Uri.UriSchemeHttps)
// The host must have at least two dot-separated parts besides "www."
&& uriResult.Host.Replace("www.", "").Split('.').Length > 1
&& uriResult.HostNameType == UriHostNameType.Dns
// The host must not end with a dot
&& uriResult.Host.Length > uriResult.Host.LastIndexOf(".") + 1
&& webSiteUrl.Length <= 100;
}
Validated with Unit Tests
Positive Unit Test:
[TestCase("http://www.example.com/")]
[TestCase("https://www.example.com")]
[TestCase("http://example.com")]
[TestCase("https://example.com")]
[TestCase("www.example.com")]
public void IsValidUrlTest(string url)
{
bool result = UriHelper.IsValidUrl(url);
Assert.AreEqual(result, true);
}
Negative Unit Test:
[TestCase("http.www.example.com")]
[TestCase("http:www.example.com")]
[TestCase("http:/www.example.com")]
[TestCase("http://www.example.")]
[TestCase("http://www.example..com")]
[TestCase("https.www.example.com")]
[TestCase("https:www.example.com")]
[TestCase("https:/www.example.com")]
[TestCase("http:/example.com")]
[TestCase("https:/example.com")]
public void IsInvalidUrlTest(string url)
{
bool result = UriHelper.IsValidUrl(url);
Assert.AreEqual(result, false);
}
Note: the IsValidUrl method should not accept a relative URL such as example.com
See:
Should I Use Relative or Absolute URLs?
return Uri.TryCreate(url, UriKind.Absolute, out Uri uri) && uri != null;
Here url is the string you have to test.
I've created this function to help me with URL validation; you can customize it as you like. Note that it is written in Python 3.10.6.
import re


def url_validator(url: str) -> bool:
    """
    Filter URLs so that only valid ones are followed.

    :param url: the URL to check
    :type url: str
    :return: True if the passed URL is valid, otherwise False
    :rtype: bool
    """
    # The following regex is copied from the Django source code
    # to validate a URL.
    regex = re.compile(
        r"^(?:http|ftp)s?://"  # http://, https://, ftp:// or ftps://
        r"(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|"  # domain...
        r"localhost|"  # localhost...
        r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"  # ...or ip
        r"(?::\d+)?"  # optional port
        r"(?:/?|[/?]\S+)$",
        re.IGNORECASE,
    )

    # Add any sites you want to reject to this list.
    blocked_sites: list[str] = []
    for site in blocked_sites:
        if site in url:
            return False

    # If the URL is not blocked, return True only when it matches the regex.
    return bool(regex.match(url))