I have these URLs:
$output = "href=\"/one/two/three\"
href=\"one/two/three\"
src=\"windows.jpg\"
action=\"http://www.google.com/docs\"";
When I apply the regular expression:
$base_url_page = "http://mainserver/";
$output = preg_replace( "/(href|src|action)(\s*)=(\s*)(\"|\')(\/+|\/*)(.*)(\"|\')/ismU", "$1=\"" . $base_url_page . "$6\"", $output );
I get this:
$output = "href=\"http://mainserver/one/two/three\"
href=\"http://mainserver/one/two/three\"
src=\"http://mainserver/windows.jpg\"
action=\"http://mainserver/http://www.google.com/docs\"";
How can I modify the regular expression so that absolute URLs are left alone, instead of producing http://mainserver/http://www.google.com/docs?
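Try:
$output = preg_replace( "/(href|src|action)\s*=\s*[\"'](?!http)\/*([^\"']*)[\"']/i", "$1=\"" . $base_url_page . "$2\"", $output );
I have simplified your regex and added a negative lookahead, (?!http), which makes sure the value you're matching doesn't start with http. The double quotes inside the pattern are escaped so the PHP string stays valid. I also dropped the U (ungreedy) modifier: with it, \/* would lazily match nothing and the leading slash would end up in the captured path, giving http://mainserver//one/two/three. As it stands, this regex allows neither single nor double quotes inside the URL.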
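To see the lookahead idea end to end, here is a minimal sketch run against the sample input from the question. The function name prefix_relative_urls is hypothetical, and the lookahead is widened from (?!http) to skip any value that begins with a URL scheme (http:, https:, mailto: and so on); that widening is an assumption about what is wanted, not something stated in the question.

```php
<?php
// Hypothetical helper, not from the original answers: prefixes relative
// href/src/action values with $base and leaves absolute URLs untouched.
function prefix_relative_urls(string $html, string $base): string
{
    // (?![a-z][a-z0-9+.\-]*:) rejects values that start with a scheme
    // such as http:, https: or mailto:. There is no U modifier, so \/*
    // stays greedy and swallows any leading slashes before the capture.
    $pattern = '/(href|src|action)\s*=\s*["\'](?![a-z][a-z0-9+.\-]*:)\/*([^"\']*)["\']/i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            // $m[1] is the attribute name, $m[2] the path without leading slashes.
            return $m[1] . '="' . $base . $m[2] . '"';
        },
        $html
    );
}

$output = 'href="/one/two/three"' . "\n"
        . 'href="one/two/three"' . "\n"
        . 'src="windows.jpg"' . "\n"
        . 'action="http://www.google.com/docs"';

echo prefix_relative_urls($output, 'http://mainserver/'), "\n";
// href="http://mainserver/one/two/three"
// href="http://mainserver/one/two/three"
// src="http://mainserver/windows.jpg"
// action="http://www.google.com/docs"
```

preg_replace_callback is used here instead of a replacement string so that a base URL containing $ or \ followed by digits cannot be misread as a backreference. Protocol-relative values such as //cdn.example.com/x are not treated specially in this sketch and would still be prefixed.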
$output = preg_replace( "/(href|src|action)\s*=\s*[\"'](?!http)(\/+|\/*)([^\"']*)[\"']/ismU", "$1=\"" . $base_url_page . "$3\"", $output );