Python unicode woes

Source: https://www.devze.com, 2023-02-18 23:54 (reposted from the web)
What is the correct way to convert '\xbb' into a unicode string? I have tried the following and only get UnicodeDecodeError:

unicode('\xbb', 'utf-8')

'\xbb'.decode('utf-8')


Since it comes from Word it's probably CP1252.

>>> print '\xbb'.decode('cp1252')
»
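To see why the UTF-8 attempts fail while CP1252 works, here is a minimal sketch. It is written for Python 3, so a bytes literal stands in for the Python 2 str from the question; the variable names are only illustrative.

```python
# Python 3 sketch: b'\xbb' plays the role of the Python 2 str '\xbb'.
raw = b'\xbb'

# 0xBB is 0b10111011, which in UTF-8 is a continuation byte.
# A continuation byte cannot start a character, so decoding fails.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# In CP1252 the same byte maps to U+00BB,
# RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK.
text = raw.decode('cp1252')

print(utf8_ok, ascii(text), hex(ord(text)))
```

The point is that the byte itself is fine; it just is not valid UTF-8 on its own, so the codec has to match whatever produced the data.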


It looks to be Latin-1 encoded. You should use:

unicode('\xbb', 'Latin-1')
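Latin-1 and CP1252 agree on 0xBB, so either one decodes this particular byte. They differ in the 0x80 to 0x9F range, which is where Word-style curly quotes live in CP1252. The following Python 3 sketch uses a made-up sample byte string (not taken from the question) to show the difference.

```python
# Hypothetical sample: curly quotes as CP1252 bytes, followed by 0xBB.
raw = b'\x93quoted\x94 \xbb'

as_cp1252 = raw.decode('cp1252')   # 0x93/0x94 become U+201C/U+201D
as_latin1 = raw.decode('latin-1')  # 0x93/0x94 become C1 control characters

# Both codecs agree on the last character, U+00BB.
same_last_char = as_cp1252[-1] == as_latin1[-1]

print(ascii(as_cp1252))
print(ascii(as_latin1))
print(same_last_char)
```

If the text really came from Word, CP1252 is probably the safer guess, because Latin-1 will silently turn curly quotes into invisible control characters instead of raising an error.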


It's not clear what you are trying to do, but in Python 3 all strings are Unicode by default. In Python 2.x you have to write u'my unicode string \xbb' (single-, double-, or triple-quoted) to get a unicode string. When you want to print a unicode string, you have to encode it in a character set that the output device (e.g. the terminal) supports, for instance u'my unicode string \xbb'.encode('iso-8859-1').
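As a rough illustration of the encoding step, this Python 3 sketch encodes the same string for three different output character sets. The choice of target encodings is an assumption for the example; in practice it depends on the terminal or file.

```python
# Python 3: an ordinary str literal is already Unicode.
s = 'my unicode string \xbb'

as_latin1 = s.encode('iso-8859-1')  # U+00BB becomes the single byte 0xBB
as_utf8 = s.encode('utf-8')         # U+00BB becomes the two bytes 0xC2 0xBB

# ASCII has no representation for U+00BB, so a strict encode would raise
# UnicodeEncodeError; errors='replace' substitutes '?' instead.
as_ascii = s.encode('ascii', errors='replace')

print(as_latin1)
print(as_utf8)
print(as_ascii)
```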
