When I curl the following
<?php
$ch = 开发者_开发百科curl_init();
curl_setopt ($ch, CURLOPT_PORT, "8081");
curl_setopt ($ch, CURLOPT_URL, "http://192.168.0.14:8081/comingEpisodes/" );
curl_setopt($ch, CURLOPT_USERPWD, "user:pass");
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
$curl_response = curl_exec($ch);
curl_close($ch);
echo $curl_response;
?>
The page is returned however the images aren't. I located the problem. 192.168.0.14 is my local host. I am calling a page from an app the runs off port 8081. Curl seems to drop the port and change 192.168.0.14 to locahost and therefore the images are no longer linked to the right place. How do I make sure that the port remains so the images remain. Thanks
EDIT: I think the /comingEpisodes after the port is also part of the problem..
Unless you're building a 100% proxy, you're dumping the contents of the cURL pull in to a browser. The results now refer from the page that the cURL results are dumped to, not from the originating cURL request.
Basically, if you visit http://localhost and the above code resides in the index.php
, that page is requesting the :8081/comingEpisodes contents and dumping it within the context of the originating http://locahost. The browser is now basing all the found content from http://localhost and not as if it were from the curl request.
You could replace all the content links within the document before it's output to some "proxy.php?retrieve=old_url" and then make all those now call through the same cURL context, but that's the basis of a web proxy.
End-User Intermediary End-Website
(http://localhost) (localhost/index.php) (http://192.168.0.14:8081/comingEpisodes/)
------------------ --------------------- ------------------------------------------
Initial visit--------->
cURL Request------------->
Page Content (html basically)
Echoed back to user<------
Content<---------------
Finds <img> etc.------>
/comingEpisodes/img1.jpg // 404 error, it's actually on :8081
// that localhost has no idea about
// because it's being hidden using cURL
VERY SIMPLE DEMO
<?php
//
// Very Dummied-down proxy
//
// Either get the url of the content they need, or use the default "page root"
// when none is supplied. This is not robust at all, as this really only handles
// relative urls (e.g. src="images/foo.jpg", something like src="http://foo.com/"
// would become src="index.php?proxy=http://foo.com/" which makes the below turn
// into "http://www.google.com/http://foo.com/")
$_target = 'http://www.google.com/' . (isset($_GET['proxy']) ? $_GET['proxy'] : '');
// Build the cURL request to get the page contents
$cURL = curl_init($_target);
try
{
// setup cURL to your liking
curl_setopt($cURL, CURLOPT_RETURNTRANSFER, 1);
// execute the request
$page = curl_exec($cURL);
// Forward along the content type (so images, files, etc all are understood correctly)
$contentType = curl_getinfo($cURL, CURLINFO_CONTENT_TYPE);
header('Content-Type: ' . $contentType);
// close curl, we're done.
curl_close($cURL);
// test against the content type. If it HTML then we need to re-parse
// the page to add our proxy intercept in the URL so the visitor keeps using
// our cURL request above for EVEYRTHING it needs from this site.
if (strstr($contentType,'text/html') !== false)
{
//
// It's html, replace all the references to content using URLs
//
// First, load our DOM parser
$html = new DOMDocument();
$html->formatOutput = true;
@$html->loadHTML($page); // was getting parse errors, added @ for demo purposes.
// simple demo, look for image references and change them
foreach ($html->getElementsByTagName('img') as $img)
{
// take a typical image:
// <img src="logo.jpg" />
// and make it go through the proxy (so it uses cURL again:
// <img src="index.php?proxy=logo.jpg" />
$img->setAttribute('src', sprintf('%s?proxy=%s', $_SERVER['PHP_SELF'], urlencode($img->getAttribute('src'))));
}
// finally dump it to client with the urls changed
echo $html->saveHTML();
}
else
{
// Not HTML, just dump it.
echo $page;
}
}
// just in case, probably want to do something with this.
catch (Exception $ex)
{
}
精彩评论