Write PDF file from URL using urllib2

Developer https://www.devze.com 2023-02-23 13:22 Source: Internet
I'm trying to save a dynamically generated PDF file from a web server using Python's urllib2 module. I use the following code to fetch the data from the server and write it to a file on the local disk:

import urllib2
import cookielib

theurl = 'https://myweb.com/?pdf&var1=1'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders.append(('Cookie', cookie))
request = urllib2.Request(theurl)

print("... Sending HTTP GET to %s" % theurl)
f = opener.open(request)
data = f.read()
f.close()
opener.close()

FILE = open('report.pdf', "w")
FILE.write(data)
FILE.close()

This code runs without errors, but Adobe Reader does not recognize the written PDF file. If I make the request manually in Firefox, I receive the file and can view it without problems. Comparing the received HTTP headers (Firefox and urllib2), the only difference is a header field called "Transfer-Encoding = chunked". Firefox receives this field, but it does not seem to be received when I make the urllib2 request. Any suggestions?


Try changing

FILE = open('report.pdf', "w")

to

FILE = open('report.pdf', "wb")

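The extra 'b' opens the file in binary mode. Currently you are writing binary data in text mode, and on platforms such as Windows text mode translates every newline byte ('\n') into '\r\n' on write, which corrupts the PDF.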
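For reference, here is a minimal sketch of the whole download with the binary-mode fix applied. The function name download_binary and its cookie_header parameter are hypothetical, introduced only for illustration. The try/except import is an assumption so that the same sketch runs on Python 3, where urllib2 and cookielib were renamed to urllib.request and http.cookiejar.

```python
import shutil
from contextlib import closing

try:  # Python 3 module names
    from http.cookiejar import CookieJar
    from urllib.request import HTTPCookieProcessor, build_opener
except ImportError:  # Python 2 module names, as used in the question
    from cookielib import CookieJar
    from urllib2 import HTTPCookieProcessor, build_opener


def download_binary(url, dest_path, cookie_header=None):
    """Fetch url and write the response body to dest_path unchanged.

    download_binary and cookie_header are illustrative names, not part
    of the original question.
    """
    opener = build_opener(HTTPCookieProcessor(CookieJar()))
    if cookie_header is not None:
        # Same manual Cookie header the question adds to the opener.
        opener.addheaders.append(("Cookie", cookie_header))
    # closing() guarantees the response is closed on both Python 2 and 3.
    with closing(opener.open(url)) as response:
        # "wb" writes the bytes exactly as received; no newline translation.
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest_path
```

With the names from the question, the call would look like download_binary(theurl, 'report.pdf', cookie_header=cookie). Copying with shutil.copyfileobj streams the body in chunks instead of holding the whole PDF in memory, which is a design choice of this sketch rather than something the fix requires.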
