
Is this code useful?

开发者 (Developer) https://www.devze.com 2022-12-31 09:15 Source: the web
import urllib

def _oauth_escape(val):
    if isinstance(val, unicode):  # useful?
        val = val.encode("utf-8")  # useful?
    return urllib.quote(val, safe="~")

I think the two marked lines are not useful. Am I right?

Update:

I think unicode is the same thing as 'utf-8'. Is that right?


UTF-8 is an encoding, a recipe for concretely representing Unicode data as a series of bytes. It is one of many such encodings. Python 2 str objects are bytestrings, which can hold arbitrary binary data, such as text in a specific encoding.

Python 2's unicode type is an abstract, not-encoded way to represent text. unicode strings can be encoded in any of many encodings.
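To make the "one of many encodings" point concrete, here is a small sketch. It is written for Python 3, where str plays the role of Python 2's unicode and bytes plays the role of Python 2's str; the variable names are only illustrative.

```python
# One piece of abstract text, three different byte representations.
text = "\u00e4"  # the single character "ä"

utf8_bytes = text.encode("utf-8")       # two bytes
latin1_bytes = text.encode("latin-1")   # one byte
utf16_bytes = text.encode("utf-16-le")  # two bytes, different from UTF-8

for name, data in [("utf-8", utf8_bytes),
                   ("latin-1", latin1_bytes),
                   ("utf-16-le", utf16_bytes)]:
    # Decoding with the matching encoding recovers the original text.
    print(name, data, data.decode(name) == text)
```

The text is the same in all three cases; only the bytes chosen to represent it differ.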


As others have said already, Unicode and UTF-8 are not the same. UTF-8 is one of many encodings for Unicode.

Think of unicode objects as "unencoded" unicode strings, while string objects are encoded in a particular encoding (unfortunately, string objects don't have an attribute that tells you how they are encoded).
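The missing "how am I encoded" attribute matters in practice. A rough illustration, again in Python 3 terms (bytes standing in for the Python 2 str type), is that the same bytes decode to different text depending on which encoding you assume:

```python
# The bytes themselves carry no record of the encoding that produced them.
raw = b"\xc3\xa4"

as_utf8 = raw.decode("utf-8")      # one character: "ä"
as_latin1 = raw.decode("latin-1")  # two characters: "Ã¤"

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```

Both calls succeed, so nothing tells you which interpretation was intended; you have to know the encoding from context.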

val.encode("utf-8") converts the unicode object into a UTF-8 encoded str object.

In Python 2.6 this step is necessary, because urllib.quote can't handle non-ASCII unicode input properly:

>>> import urllib
>>> urllib.quote(u"")
''
>>> urllib.quote(u"ä")
/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py:1216: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  res = map(safe_map.__getitem__, s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib.py", line 1216, in quote
    res = map(safe_map.__getitem__, s)
KeyError: u'\xe4'
>>> urllib.quote(u"ä".encode("utf-8"))
'%C3%A4'

In Python 3.x, however, where all strings are Unicode (the Python 3 equivalent of an encoded string is a bytes object), the explicit encode is no longer necessary:

>>> import urllib.parse
>>> urllib.parse.quote("ä")
'%C3%A4'
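For reference, a Python 3 version of the helper from the question might look like the sketch below. The name oauth_escape is only illustrative; the sketch relies on urllib.parse.quote accepting both str and bytes and encoding str as UTF-8 by default.

```python
from urllib.parse import quote


def oauth_escape(val):
    """Percent-encode val, leaving "~" unescaped.

    No isinstance check or explicit encode is needed here:
    quote() encodes str input as UTF-8 by default and also accepts bytes.
    """
    return quote(val, safe="~")


print(oauth_escape("\u00e4"))                  # str input
print(oauth_escape("\u00e4".encode("utf-8")))  # bytes input, same result
print(oauth_escape("a b/c~"))                  # "/" is escaped, "~" is kept
```

Passing safe="~" also replaces the default safe="/", so slashes are percent-encoded, matching the behaviour of the original function.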


In Python 3.0 all strings are Unicode, but in earlier versions one has to explicitly encode unicode strings to byte strings before passing them to functions such as urllib.quote. Could that be it?

(UTF-8 is not the only encoding for Unicode, but it is the most common one. Read this.)
