I'm writing a small BitTorrent tracker on top of the Django framework, as part of a larger project. However, I'm having problems decoding the "info_hash" parameter of the announce request.
Basically, uTorrent takes the SHA1 hash of the torrent in question and URL-encodes the raw bytes of the digest (not its hex representation), which are then sent to the tracker in a GET request as the info_hash parameter.
The info_hash
A44B44B0EE8D85A9F7135489D522A19DA2C87C91
gets encoded as:
%a4KD%b0%ee%8d%85%a9%f7%13T%89%d5%22%a1%9d%a2%c8%7c%91
However, Django decodes this to the Unicode string:
u'\ufffdKD\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\x13T\ufffd\ufffd"\ufffd\ufffd\ufffd\ufffd|\ufffd'
instead of a string literal like this:
\xa4KD\xb0\xee\x8d\x85\xa9\xf7\x13T\x89\xd5"\xa1\x9d\xa2\xc8|\x91
How can I stop Django from trying to translate the info_hash to Unicode, so I can then unquote it? My goal is to get a string literal that I can then encode to a hex string.
Any thoughts? Apologies if there's some concept about encoding that I'm missing. Thanks!
What is your settings.DEFAULT_ENCODING? Also, what does the hash look like in the HTTP headers? It shouldn't be modified at all during encoding, as shown below:
>>> import urllib
>>> urllib.urlencode({'hash':"A44B44B0EE8D85A9F7135489D522A19DA2C87C91"})
'hash=A44B44B0EE8D85A9F7135489D522A19DA2C87C91'
Since:
>>> urllib.quote('A44B44B0EE8D85A9F7135489D522A19DA2C87C91') == 'A44B44B0EE8D85A9F7135489D522A19DA2C87C91'
True
And therefore:
>>> urllib.unquote('%a4KD%b0%ee%8d%85%a9%f7%13T%89%d5%22%a1%9d%a2%c8%7c%91') == 'A44B44B0EE8D85A9F7135489D522A19DA2C87C91'
False
Django decodes all GET data using the default encoding. You'll need to get the query string yourself, possibly from os.environ['QUERY_STRING'] or request.environ['QUERY_STRING'].
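As a rough illustration of that approach, here is a minimal sketch that pulls info_hash out of the raw query string and percent-decodes it to bytes before anything tries to interpret it as text. It assumes Python 3 (the snippets above are Python 2, where urllib.unquote on a byte string plays the same role), and extract_info_hash is a hypothetical helper name, not part of Django.

```python
from urllib.parse import unquote_to_bytes


def extract_info_hash(query_string):
    """Return info_hash from a raw query string as an uppercase hex string.

    Hypothetical helper: it works on the undecoded query string so the
    percent-escapes are turned into raw bytes, never into Unicode text.
    Returns None if the parameter is absent.
    """
    for pair in query_string.split("&"):
        name, _, value = pair.partition("=")
        if name == "info_hash":
            raw = unquote_to_bytes(value)  # the 20 raw digest bytes
            return raw.hex().upper()
    return None


# The query string from the question (the port parameter is made up here).
qs = "info_hash=%a4KD%b0%ee%8d%85%a9%f7%13T%89%d5%22%a1%9d%a2%c8%7c%91&port=6881"
print(extract_info_hash(qs))  # A44B44B0EE8D85A9F7135489D522A19DA2C87C91
```

In a Django view, the raw string would presumably come from request.META['QUERY_STRING'] rather than from request.GET, since request.GET has already been decoded by that point.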