I want to rewrite all links in an HTML document string in PHP like this: replace href='LINK' with href='MY_DOMAIN?URL=LINK'. Because LINK becomes a URL parameter, it must be urlencoded. I'm trying this:
preg_replace('/href="(.+)"/', 'href="http://'.$host.'/?url='.urlencode('${1}').'"', $html);
but '${1}' is passed to urlencode as a plain string literal, not as the URL captured by the regex. What do I need to do to make this code work?
Well, to answer your question, you have two choices with Regex.
You can use the e modifier to the regex, which tells preg_replace that the replacement is PHP code and should be executed. This is typically seen as not great, since it's really no better than eval (and the modifier was deprecated in PHP 5.5 and removed in PHP 7.0)...
preg_replace($regex, "'href=\"http://{$host}?url='.urlencode('\\1').'\"'", $html);
The other option (which is better IMHO) is to use preg_replace_callback:
$callback = function ($match) use ($host) {
    // $match[1] is the captured link, so it can be encoded at replace time
    return 'href="http://'.$host.'?url='.urlencode($match[1]).'"';
};
$html = preg_replace_callback($regex, $callback, $html);
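For completeness, here is a self-contained sketch of the callback approach. The function name rewriteLinksWithRegex and the host example.com are made up for illustration, and the pattern is assumed to be the '/href="([^"]+)"/' one from the other answer, so it only handles double-quoted href values:

```php
<?php
// Hypothetical helper: wraps every double-quoted href in a redirect URL.
function rewriteLinksWithRegex(string $html, string $host): string
{
    // Assumed pattern: captures everything between the quotes of href="...".
    $regex = '/href="([^"]+)"/';

    $result = preg_replace_callback(
        $regex,
        function (array $match) use ($host): string {
            // $match[1] holds the original link; encode it as a query value.
            return 'href="http://' . $host . '?url=' . urlencode($match[1]) . '"';
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$out = rewriteLinksWithRegex('<a href="http://example.org/a?b=1">x</a>', 'example.com');
echo $out, "\n";
// prints: <a href="http://example.com?url=http%3A%2F%2Fexample.org%2Fa%3Fb%3D1">x</a>
```

Note that single-quoted or unquoted href attributes are left untouched by this pattern.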
But also never forget, don't parse HTML with regex...
So in practice, the better (more robust) way of doing it would be:
$dom = new DOMDocument();
$dom->loadHTML($html);
$aTags = $dom->getElementsByTagName('a');
foreach ($aTags as $aElement) {
    $href = $aElement->getAttribute('href');
    $href = 'http://'.$host.'?url='.urlencode($href);
    $aElement->setAttribute('href', $href);
}
$html = $dom->saveHTML();
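A slightly more defensive sketch of the same DOM approach is below. It assumes the DOM extension is available; the function name rewriteLinksWithDom and the host example.com are hypothetical. It silences libxml warnings for imperfect markup and skips anchors that have no href:

```php
<?php
// Hypothetical helper: rewrites the href of every <a> element via DOMDocument.
function rewriteLinksWithDom(string $html, string $host): string
{
    $dom = new DOMDocument();

    // Collect parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors etc. have nothing to rewrite
        }
        $href = $a->getAttribute('href');
        $a->setAttribute('href', 'http://' . $host . '?url=' . urlencode($href));
    }

    // saveHTML() serializes the whole document, including the doctype
    // and the <html>/<body> wrappers that loadHTML() adds.
    return $dom->saveHTML();
}

$result = rewriteLinksWithDom(
    '<p><a href="http://example.org/a?b=1">x</a> <a name="top">y</a></p>',
    'example.com'
);
echo $result;
```

Unlike the regex version, this works regardless of how the href attribute is quoted in the source.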
Use the 'e' modifier.
preg_replace('/href="([^"]+)"/e', "'href=\"http://{$host}?url='.urlencode('\\1').'\"'", $html);
http://uk.php.net/preg-replace - example #4