In Python, unicode strings have an encode method that converts from unicode to a byte string, and byte strings have a decode method that does the reverse.
But I'm confused: what is the encode method on a byte string for?
It's useful for non-text codecs, which transform one byte string into another byte string:
>>> 'Hello, world!'.encode('hex')
'48656c6c6f2c20776f726c6421'
>>> 'Hello, world!'.encode('base64')
'SGVsbG8sIHdvcmxkIQ==\n'
>>> 'Hello, world!'.encode('zlib')
'x\x9c\xf3H\xcd\xc9\xc9\xd7Q(\xcf/\xcaIQ\x04\x00 ^\x04\x8a'
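The same byte-to-byte transforms can also be reached through the codecs module. This is a minimal sketch, assuming Python 2.6+ or Python 3.4+, where codecs.encode and codecs.decode accept the 'hex', 'base64' and 'zlib' codec names. The variable names are only illustrative.

```python
import codecs

data = b'Hello, world!'

# Byte-to-byte codecs: the input and the output are both byte strings.
hex_bytes = codecs.encode(data, 'hex')
b64_bytes = codecs.encode(data, 'base64')
zlib_bytes = codecs.encode(data, 'zlib')

print(repr(hex_bytes))
print(repr(b64_bytes))

# Decoding with the same codec name restores the original bytes.
print(codecs.decode(zlib_bytes, 'zlib') == data)
```

Because no text encoding is involved, these codecs never go through the default-encoding step.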
For text codecs, it first decodes the byte string to Unicode using the default encoding, then encodes the result back to a byte string.
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
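>>> reload(sys)  # site.py deletes setdefaultencoding at startup; reload restores it
<module 'sys' (built-in)>
>>> sys.setdefaultencoding('latin-1')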
>>> '\xc4'.encode('utf-8')
'\xc3\x84'
Here, '\xc4' is Latin-1 for Ä and '\xc3\x84' is UTF-8 for Ä.
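The implicit two-step conversion can be written out by hand. This is a rough sketch: encode_via_default is a hypothetical helper, not part of Python, and its default_encoding argument stands in for whatever sys.getdefaultencoding() returns.

```python
def encode_via_default(byte_string, default_encoding, target_encoding):
    """Hypothetical helper: spell out what str.encode does implicitly."""
    # Step 1: decode the bytes to Unicode with the (assumed) default encoding.
    text = byte_string.decode(default_encoding)
    # Step 2: encode the Unicode text with the requested target encoding.
    return text.encode(target_encoding)

raw = b'\xc4'  # Latin-1 byte for the character Ä

# With latin-1 standing in for the default encoding, the conversion succeeds.
latin1_result = encode_via_default(raw, 'latin-1', 'utf-8')
print(repr(latin1_result))

# With the usual 'ascii' default, step 1 fails before any encoding happens.
try:
    encode_via_default(raw, 'ascii', 'utf-8')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
print(ascii_failed)
```

This would also explain why calling encode on a non-ASCII byte string can raise a UnicodeDecodeError rather than an encode error.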
Why don't you want to read the fine Python documentation yourself?
http://docs.python.org/release/2.5.2/lib/string-methods.html
"""
encode([encoding[, errors]])
Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error, see section 4.8.1. For a list of possible encodings, see section 4.8.3.
New in version 2.0.
Changed in version 2.3: Support for 'xmlcharrefreplace' and 'backslashreplace' and other error handling schemes added.
"""
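The error handling schemes named in the quoted documentation can be compared side by side. This is a small sketch that encodes one non-ASCII unicode character to ASCII under each scheme. It assumes Python 2.6+ or Python 3.3+, where the u'' literal is accepted, and the names text and results are illustrative.

```python
text = u'\xc4'  # the character Ä, which has no ASCII representation

# Try each non-strict error handler listed in the documentation.
results = {}
for handler in ('ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace'):
    results[handler] = text.encode('ascii', handler)
    print(handler, repr(results[handler]))

# The default handler, 'strict', raises a UnicodeError subclass instead.
try:
    text.encode('ascii')
    strict_error = None
except UnicodeError as exc:
    strict_error = type(exc).__name__
print(strict_error)
```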