开发者

How to enable gzip compression using PHP Simple HTML DOM Parser

开发者 https://www.devze.com 2022-12-31 22:43 出处:网络
I have tried a few things to enable gzip compression using PHP Simple HTML DOM Parser but nothing has seemed to work thus far.Using ini_set I\'ve manged to change the user agent, so I figured it might

I have tried a few things to enable gzip compression using PHP Simple HTML DOM Parser but nothing has seemed to work thus far. Using ini_set I've manged to change the user agent, so I figured it might be possible to also enable gzip compression?

include("simpdom/simple_html_dom.php");
ini_set('zlib.output_compression', 'On');   
$url = 'http://www.whatsmyip.org/http_compression/';
$html = file_get_html($url);
print $html;

The website above tests it. Please let me know if I am going about this the wrong way completely.

====

UPDATE

For anyone else trying to achieve the same 开发者_StackOverflow社区thing, it's best to just use cURL, then use the dom parser like so:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); // Define target site
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Return page in string
curl_setopt($cr, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.3 Safari/533.2');
curl_setopt($ch, CURLOPT_ENCODING , "gzip");     
curl_setopt($ch, CURLOPT_TIMEOUT,5); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow redirects

$return = curl_exec($ch); 
$info = curl_getinfo($ch); 
curl_close($ch); 

$html = str_get_html("$return");


CURLOPT_ENCODING is so that the response comes back (accepted as) gzipped data - the server settings (ob_start("ob_gzhandler") or php_ini..) tell the server to OUTPUT gzipped data.

Just like if you went to that page with a browser that didn't support gzip. To accept gzip data, you have to use curl so you can make that distinction.


Just add the following line at the very top of the PHP script that outputs the data:

  ob_start("ob_gzhandler");

Reference

-------Update--------

You can also try to enable gzip Compresion sitewide via a .htaccess file. Something like This should gzip your sites content but images:

# Insert filter
SetOutputFilter DEFLATE

# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html

# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip

# MSIE masquerades as Netscape, but it is fine
# BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html

# Don't compress images
#SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary

# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号