Apologies if this has been answered before, but as with everything, Google gives a billion results, all leading to the wrong answer.
I have a URL/email parser that links URLs and email addresses on my website (PHP). Everything was fine until I gained some international customers with multi-part domain suffixes (.com.au etc.).
This is the function I currently have...
function linkScan($string1) {
    $pattern1 = "/(?<![\/\d\w])(http:\/\/)?([\w\d\-]+)((\.([\w\d\-])+){2,})([\/\?\w\d\.\-_&=+%]*)?/i";
    $pattern2 = "/([\w\d\.\-\_]+)@([\w\d\.\_\-]+)/mi";
    $replace1 = "<a href=\"http://$2$3$6\" target=\"_blank\">$0</a>";
    $replace2 = "<a href=\"mailto:$0\">$0</a>";
    $string2 = preg_replace($pattern1, $replace1, $string1);
    $string3 = preg_replace($pattern2, $replace2, $string2);
    $string3 = convertSmartQuotes($string3);
    return $string3;
}
It works fine until it finds an email address such as someone@somewhere.com.au.
Because it looks for the URLs first, it finds the somewhere.com.au portion and makes it a link; then, when the email scan happens, the address is ignored because of the HTML tags now embedded in it.
What I want to do is force the use of a subdomain in the URLs (whether that be www or otherwise), and not care whether there is an http:// in front of it. But because the regex only checks that there are three parts (subdomain, domain, .com), it mistakenly treats the .com in .com.au as the domain part.
It should find...
subdomain.domain.com
subdomain.domain.com.au
It should not find...
domain.com
domain.com.au (which it is currently finding)
If anyone can help me with the regular expression, that would be fantastic. Thanks.
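One possible way around the ordering problem described above (URLs being linked before emails) is to match both in a single pass, with the email alternative tried first, so an address is consumed whole before the URL branch can see its domain. This is only a rough sketch: linkScanSinglePass is a hypothetical name, the patterns are simplified versions of the ones in the question, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not the function from the question): links emails and
// URLs in one pass so the two scans cannot interfere with each other.
function linkScanSinglePass(string $text): string
{
    // The "email" alternative is tried before the "url" alternative at every
    // position, so someone@somewhere.com.au is matched as an email first.
    // The lookbehind stops the url branch from starting mid-word or mid-host.
    $pattern = '~
        (?<email>[\w.\-]+@[\w\-]+(?:\.[\w\-]+)+)
        |
        (?<![/\w@.\-])
        (?<url>(?:https?://)?[\w\-]+(?:\.[\w\-]+){2,}(?:[/?][\w/?.\-&=+%]*)?)
    ~xi';

    return preg_replace_callback($pattern, function (array $m): string {
        $shown = htmlspecialchars($m[0], ENT_QUOTES);
        if (($m['email'] ?? '') !== '') {
            return '<a href="mailto:' . $shown . '">' . $shown . '</a>';
        }
        // Add a scheme only when the matched text does not already have one.
        $href = preg_match('~^https?://~i', $m[0]) ? $shown : 'http://' . $shown;
        return '<a href="' . $href . '" target="_blank">' . $shown . '</a>';
    }, $text);
}

echo linkScanSinglePass('Mail someone@somewhere.com.au or visit www.example.com.au'), "\n";
```

Note that this only addresses the email case. The url branch still counts labels, so a bare domain.com.au would still be linked; telling it apart from subdomain.domain.com needs knowledge of the suffixes themselves.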
You need a list of all top-level domains and their structure. The Mozilla project has such a list; it is several hundred lines, so incorporating it into a regex may be cumbersome, although certainly not impossible. https://wiki.mozilla.org/TLD_List Update: superseded by http://publicsuffix.org/
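Rather than folding the whole list into a regex, one option might be to let the regex find candidate hosts and then check each one against the list in plain PHP. The sketch below assumes a tiny hard-coded sample of suffixes purely for illustration (a real version would load the full list from publicsuffix.org and handle its wildcard and exception rules, which this ignores); SAMPLE_SUFFIXES and hasSubdomain are hypothetical names.

```php
<?php
// Illustrative sample only, NOT the real list from publicsuffix.org.
const SAMPLE_SUFFIXES = ['com', 'org', 'net', 'au', 'com.au', 'net.au', 'uk', 'co.uk'];

// Hypothetical helper: true when $host has at least one label in front of
// its registrable domain, e.g. the "www" in www.domain.com.au.
function hasSubdomain(string $host, array $suffixes = SAMPLE_SUFFIXES): bool
{
    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // Drop labels from the left one at a time, so the longest candidate
    // suffix is tested first and "com.au" wins over plain "au".
    for ($i = 1; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            // $i labels precede the suffix: one is the domain itself,
            // any others are subdomains.
            return $i >= 2;
        }
    }

    return false; // suffix not in the list: treat the host as not linkable
}

foreach (['subdomain.domain.com', 'subdomain.domain.com.au', 'domain.com', 'domain.com.au'] as $host) {
    echo $host, ' => ', hasSubdomain($host) ? 'link' : 'skip', "\n";
}
```

A check like this could be called from a preg_replace_callback handler, returning the matched text unchanged whenever hasSubdomain reports false.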
Anyway, quite likely you are Doing It Wrong. What are you trying to accomplish?
Regex has a nice list of expressions and also includes a nice tester to make sure your expression works.