I have never used regex before and I was wondering how to write a regular expression in PHP that gets the domain of a URL. For example: http://www.hegnar.no/bors/article488276.ece --> hegnar.no
You don't need a regex for this task.
Check PHP's built-in function parse_url: http://php.net/manual/en/function.parse-url.php
Just use parse_url()
if you are specifically dealing with URLs.
For example:
$url = "http://www.hegnar.no/bors/article488276.ece";
$url_u_want = parse_url($url, PHP_URL_HOST);
Docs
EDIT: To strip the www. from the front, use:
$url_u_want = preg_replace("/^www\./", "", $url_u_want);
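The two steps above can be wrapped into one function. This is only a sketch: the name host_without_www is made up for illustration, and it assumes you want null back when parse_url() finds no host (for example, when the string has no scheme).

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the host of $url
// without a leading "www.", or null when parse_url() finds no host.
function host_without_www(string $url): ?string
{
    // With PHP_URL_HOST, parse_url() returns a string, null (no host),
    // or false (seriously malformed URL).
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    // Remove a single leading "www." (case-insensitive).
    return preg_replace('/^www\./i', '', $host);
}

echo host_without_www('http://www.hegnar.no/bors/article488276.ece'), "\n"; // hegnar.no
```

Note that a bare string such as 'hegnar.no' is parsed as a path rather than a host, so this sketch returns null for it.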
$page = "http://google.no/page/page_1.html";
preg_match_all("/((?:[a-z][a-z\\.\\d\\-]+)\\.(?:[a-z][a-z\\-]+))(?![\\w\\.])/", $page, $result, PREG_PATTERN_ORDER);
print_r($result);
$host = parse_url($url, PHP_URL_HOST);
$host = array_reverse(explode('.', $host));
$host = $host[1].'.'.$host[0];
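A slightly more defensive sketch of the same "last two labels" idea, in case the host has fewer than two labels. The function name last_two_labels is hypothetical. Keep in mind the approach itself is a heuristic: for a host like www.bbc.co.uk (used here purely as an illustration) it yields co.uk, not bbc.co.uk.

```php
<?php
// Hypothetical helper: returns the last two dot-separated labels of the
// URL's host, or null if there is no host or it has fewer than two labels.
function last_two_labels(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    $labels = explode('.', $host);
    if (count($labels) < 2) {
        return null; // e.g. "localhost" or "bannedadsense"
    }
    // Take the final two labels and join them back together.
    return implode('.', array_slice($labels, -2));
}

echo last_two_labels('http://www.hegnar.no/bors/article488276.ece'), "\n"; // hegnar.no
```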
See
PHP Regex for extracting subdomains of arbitrary domains
and
Javascript/Regex for finding just the root domain name without sub domains
The problem with parse_url alone is that a URL whose host has no .com, .net, etc. still yields a result: here it returns bannedadsense as if that were valid, when in fact bannedadsense is not a domain.
$url = 'http://bannedadsense/isbanned'; // this URL fails the preg_match check below
//$url = 'http://bannedadsense.com/isbanned'; // this URL passes the preg_match check below
$domain = parse_url($url, PHP_URL_HOST);
// returns "bannedadsense", as if it were a valid domain
So we need one more check to catch hosts with no dot extension (.com, .net, .org, etc.):
if (preg_match("/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/i", $domain)) {
    echo $domain;
} else {
    echo "<br>";
    echo "false";
}
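For reuse, the parse_url call and the check above could be combined roughly like this. It is a sketch, not a full hostname validator: the function name host_if_dotted_domain is invented for illustration, and the pattern (copied from above) requires the first label to be at least three characters, so a short host such as t.co would be rejected.

```php
<?php
// Hypothetical helper: returns the URL's host only if it matches the
// pattern above (at least one dot followed by letters), otherwise null.
function host_if_dotted_domain(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    $pattern = '/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/';
    return preg_match($pattern, $host) === 1 ? $host : null;
}

echo host_if_dotted_domain('http://bannedadsense/isbanned') ?? 'false', "\n";     // false
echo host_if_dotted_domain('http://bannedadsense.com/isbanned') ?? 'false', "\n"; // bannedadsense.com
```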