I'm trying to use a Python script to upload a zip file to a site. The site provides an API for exactly that purpose, but when I tried to use it, an encoding error showed up when joining all the strings to send in the httplib request. I've traced the string in question to the filedata (my zip file).
Traceback (most recent call last):
  File "/Library/Application Junk/ProjectManager/Main.py", line 146, in OnUpload
    CurseUploader.upload_file('77353ba57bdeb5346d1b3830ed36171279763e35', 'wow', slug, version, VersionID, 'r', logText or '', 'creole', '', 'plain', zipPath)
  File "/Library/Application Junk/ProjectManager/CurseUploader.py", line 83, in upload_file
    content_type, body = encode_multipart_formdata(params, [('file', filepath)])
  File "/Library/Application Junk/ProjectManager/CurseUploader.py", line 153, in encode_multipart_formdata
    body = '\r\n'.join(L)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb in position 10: ordinal not in range(128)
EDIT: As requested, the full code.
EDIT2: I tried, as suggested, to encode all the non-ASCII strings to ASCII. It throws the same error, but now on the line L[i] = value.encode("ascii").
from httplib import HTTPConnection
from os.path import basename, exists
from mimetools import choose_boundary
try:
    import simplejson as json
except ImportError:
    import json

def get_game_versions(game):
    """
    Return the JSON response as given from /game-versions.json from curseforge.com of the given game

    `game`
        The shortened version of the game, e.g. "wow", "war", or "rom"
    """
    conn = HTTPConnection('%(game)s.curseforge.com' % { 'game': game })
    conn.request("GET", '/game-versions.json')
    response = conn.getresponse()
    assert response.status == 200, "%(status)d %(reason)s from /game-versions.json" % { 'status': response.status, 'reason': response.reason }
    assert response.content_type == 'application/json'
    data = json.loads(response.read())
    return data

def upload_file(api_key, game, project_slug, name, game_version_ids, file_type, change_log, change_markup_type, known_caveats, caveats_markup_type, filepath):
    """
    Upload a file to CurseForge.com on your project

    `api_key`
        The api-key from http://www.curseforge.com/home/api-key/
    `game`
        The shortened version of the game, e.g. "wow", "war", or "rom"
    `project_slug`
        The slug of your project, e.g. "my-project"
    `name`
        The name of the file you're uploading, this should be the version's name, do not include your project's name.
    `game_version_ids`
        A set of game version ids.
    `file_type`
        Specify 'a' for Alpha, 'b' for Beta, and 'r' for Release.
    `change_log`
        The change log of the file. Up to 50k characters is acceptable.
    `change_markup_type`
        Markup type for your change log. creole or plain is recommended.
    `known_caveats`
        The known caveats of the file. Up to 50k characters is acceptable.
    `caveats_markup_type`
        Markup type for your known caveats. creole or plain is recommended.
    `filepath`
        The path to the file to upload.
    """
    assert len(api_key) == 40
    assert 1 <= len(game_version_ids) <= 3
    assert file_type in ('r', 'b', 'a')
    assert exists(filepath)
    params = []
    params.append(('name', name))
    for game_version_id in game_version_ids:
        params.append(('game_version', game_version_id))
    params.append(('file_type', file_type))
    params.append(('change_log', change_log))
    params.append(('change_markup_type', change_markup_type))
    params.append(('known_caveats', known_caveats))
    params.append(('caveats_markup_type', caveats_markup_type))
    content_type, body = encode_multipart_formdata(params, [('file', filepath)])
    print('Got here?')
    headers = {
        "User-Agent": "CurseForge Uploader Script/1.0",
        "Content-type": content_type,
        "X-API-Key": api_key}
    conn = HTTPConnection('%(game)s.curseforge.com' % { 'game': game })
    conn.request("POST", '/projects/%(slug)s/upload-file.json' % {'slug': project_slug}, body, headers)
    response = conn.getresponse()
    if response.status == 201:
        print "Successfully uploaded %(name)s" % { 'name': name }
    elif response.status == 422:
        assert response.content_type == 'application/json'
        errors = json.loads(response.read())
        print "Form error with uploading %(name)s:" % { 'name': name }
        for k, items in errors.iteritems():
            for item in items:
                # The mapping must supply 'item'; passing 'name' here raised a KeyError
                print " %(k)s: %(item)s" % { 'k': k, 'item': item }
    else:
        print "Error with uploading %(name)s: %(status)d %(reason)s" % { 'name': name, 'status': response.status, 'reason': response.reason }

def is_ascii(s):
    return all(ord(c) < 128 for c in s)

def encode_multipart_formdata(fields, files):
    """
    Encode data in multipart/form-data format.

    `fields`
        A sequence of (name, value) elements for regular form fields.
    `files`
        A sequence of (name, filename) elements for data to be uploaded as files

    Return (content_type, body) ready for httplib.HTTP instance
    """
    boundary = choose_boundary()
    L = []
    for key, value in fields:
        if value is None:
            value = ''
        elif value is False:
            continue
        L.append('--%(boundary)s' % {'boundary': boundary})
        L.append('Content-Disposition: form-data; name="%(name)s"' % {'name': key})
        L.append('')
        L.append(value)
    for key, filename in files:
        f = file(filename, 'rb')
        filedata = f.read()
        f.close()
        L.append('--%(boundary)s' % {'boundary': boundary})
        L.append('Content-Disposition: form-data; name="%(name)s"; filename="%(filename)s"' % { 'name': key, 'filename': basename(filename) })
        L.append('Content-Type: application/zip')
        L.append('')
        L.append(filedata)
    L.append('--%(boundary)s--' % {'boundary': boundary})
    L.append('')
    for i in range(len(L)):
        value = L[i]
        if not is_ascii(value):
            L[i] = value.encode("ascii")
    body = '\r\n'.join(L)
    content_type = 'multipart/form-data; boundary=%(boundary)s' % { 'boundary': boundary }
    return content_type, body

How can I work around it?
EDIT3: As requested, the full result of printing the vars
fields: [('name', u'2.0.3'), ('game_version', u'1'), ('game_version', u'4'), ('game_version', u'9'), ('file_type', 'r'), ('change_log', u'====== 2.0.3\n* Jaliborc: Fixed a bug causing wrong items to be shown for leather, mail and plate slots\n* Jaliborc: Items are now organized by level as well\n\n====== 2.0.2\n* Jaliborc: Completly rewritten the categories dropdown to fix a bug\n\n====== 2.0.1\n* Jaliborc: Updated for patch 4.2\n* Jaliborc: Included all Firelands items\n\n===== 2.0.0\n* Jaliborc: Now works with 4.1\n* Jaliborc: Completely redesigned and improved\n* Jaliborc: Includes **all** items in-game right from the start\n* Jaliborc: Searches trough thousands of items in a blaze\n* Jaliborc: Mostly //Load on Demand//\n* Jaliborc: Only works on English clients. Versions for other clients should be released in a close future.\n\n====== 1.8.7\n* Added linkerator support for multiple chat frames\n\n====== 1.8.6\n* Fixed a bug when linking an item from the chat frame. \n\n====== 1.8.5\n* Added compatibility with WoW 3.3.5\n\n====== 1.8.3\n* Bumped TOC for 3.3\n\n====== 1.8.2\n* Bumped TOC for 3.2\n\n====== 1.8.1\n* TOC Bump + Potential WIM bugfix\n\n===== 1.8.0\n* Added "Heirloom" option to quality selector\n* Fixed a bug causing the DB to be reloaded on item scroll\n* Cleaned up the code a bit. Still need to work on the GUI/localization\n* Altered slash commands. See addon description for details.\n\n====== 1.7.2\n* Bumped the max item ID to check from 40k to 60k. Glyphs, etc, should now appear.\n\n====== 1.7.1\n* Fixed a crash issue when linking tradeskills\n\n===== 1.7.0\n* Made Wrath compatible\n* Seems to be causing a lot more CPU usage now, will investigate later.'), ('change_markup_type', 'creole'), ('known_caveats', ''), ('caveats_markup_type', 'plain')]
files: [('file', u'/Users/Jaliborc/Desktop/Ludwig 2.0.3.zip')]
It appears to contain some unicode strings. Should I encode them all?
It is highly likely that ISO-8859-1 is not the solution to your first problem. You need to be aware that any_random_gibberish.decode('ISO-8859-1') simply cannot fail.
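To see why that decode can never fail, here is a small sketch (not from the original post; the variable names are made up for illustration) that runs every possible byte value through ISO-8859-1. It is written so that it should behave the same on Python 2.7 and Python 3.

```python
# Every one of the 256 possible byte values maps to a code point in ISO-8859-1,
# so decoding arbitrary binary data with it always "succeeds", even for a zip file.
all_bytes = bytes(bytearray(range(256)))

decoded = all_bytes.decode("ISO-8859-1")    # never raises
round_trip = decoded.encode("ISO-8859-1")   # gives back the original bytes

# By contrast, strict ASCII rejects any byte above 0x7f.
try:
    all_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

So a successful ISO-8859-1 decode tells you nothing about whether the data was text in the first place.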
Secondly, I'm not sure why an encoding would be needed when uploading a file. Surely the object of the exercise is to reproduce the file exactly on the server; decoding a zip file into unicode objects seems very strange.
It would be a very good idea if you were to publish the error that you got ("an encoding error showed up when reading the file") and the full traceback, then somebody can help you. Also needed: URL for the API that you mentioned.
Update: You say that you got an "ascii error" in the line body = '\r\n'.join(L). A reasonable guess, based on your limited information, is that you have this problem:
>>> "".join([u"foo", "\xff"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
u"foo" + "\xff" produces the same result.
What is happening is that you have a mixture of unicode and str objects. Concatenating them requires converting the str object to unicode, and this is attempted using the default encoding, normally ascii, which will fail when the str object is not ASCII.
In this case the problem is not with the str objects, but with the unicode objects: you just can't send unencoded unicode objects down the wire.
I suggest that you replace this code:
for key, filename in files:
    f = file(filename, 'r')
    filedata = f.read().decode("ISO-8859-1")
with this:
for key, filename in files:
    f = file(filename, 'rb') # Specify binary mode in case this gets run on Windows
    filedata = f.read() # don't decode it
and immediately after entering that function, print its args so that you can see exactly which are unicode:
print "fields:", repr(fields)
print "files:", repr(files)
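If the repr output is long, as it will be with a big change log, a compact type report can be easier to scan. This is only a sketch: `describe_types` and `sample_fields` are hypothetical names, not part of the uploader script.

```python
def describe_types(pairs):
    """Return (key, type name) for each (key, value) pair so unicode values stand out."""
    return [(key, type(value).__name__) for key, value in pairs]


# Hypothetical sample shaped like the `fields` argument: one text value, one byte string.
sample_fields = [("name", u"2.0.3"), ("file_type", b"r")]
report = describe_types(sample_fields)
```

On Python 2 the type names come out as unicode and str; on Python 3 the same values report as str and bytes.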
It is quite likely that the unicode objects can all be safely coerced to ascii by doing (explicitly) unicode_object.encode("ascii").
Update 2: It is worth investigating why some of your values are unicode and some are str. It appears that all the unicode values can be safely encoded as ascii:
new = [(k, v.encode('ascii') if isinstance(v, unicode) else v) for k, v in original]
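Putting the advice together, the body can be built entirely out of byte strings: text parts are encoded explicitly and the file data is left exactly as read. The following is a minimal sketch under stated assumptions, not the CurseForge script itself. `to_bytes` and `build_multipart_body` are hypothetical helpers, the boundary comes from `uuid` rather than `mimetools.choose_boundary`, and the UTF-8 default is an assumption about what the server accepts (pass `encoding="ascii"` to follow the suggestion above strictly).

```python
import uuid

TEXT_TYPE = type(u"")  # `unicode` on Python 2, `str` on Python 3


def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; pass byte strings (such as file data) through untouched."""
    if isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    return value


def build_multipart_body(fields, files, boundary=None, encoding="utf-8"):
    """Assemble a multipart/form-data body as a single byte string.

    `fields` is a sequence of (name, text) pairs. `files` is a sequence of
    (name, filename, data) triples, where `data` is the raw content read
    from a file opened in 'rb' mode.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append("--%s" % boundary)
        parts.append('Content-Disposition: form-data; name="%s"' % name)
        parts.append("")
        parts.append(value)
    for name, filename, data in files:
        parts.append("--%s" % boundary)
        parts.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (name, filename))
        parts.append("Content-Type: application/zip")
        parts.append("")
        parts.append(data)  # raw bytes, never decoded
    parts.append("--%s--" % boundary)
    parts.append("")
    # Encode every text part explicitly, then join bytes with bytes,
    # so no implicit ascii conversion can be triggered.
    body = b"\r\n".join(to_bytes(part, encoding) for part in parts)
    content_type = "multipart/form-data; boundary=%s" % boundary
    return content_type, body
```

The returned pair can be used the same way as the result of encode_multipart_formdata in the question: content_type goes into the Content-type header and body is passed to conn.request.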