Django \u characters in my UTF8 strings

Developer https://www.devze.com 2023-01-05 20:18 Source: Internet
I am adding UTF-8 data to a database in Django.

As the data goes into the database, everything looks fine - the characters (for example): “Hello” are UTF-8 encoded.

My MySQL database is UTF-8 encoded. When I examine the data from the DB by doing a select, my example string looks like this: ?Hello?. I assume this is showing the characters as UTF-8 encoded.

However, when I select the data from the database in the terminal or for export as a web service, my string looks like this: \u201cHello World\u201d.

Does anyone know how I can display my characters correctly?

Do I need to perform some additional UTF-8 encoding somewhere?

Thanks, Nick.


u'\u201cHello World\u201d'

Is the correct Python representation of the Unicode text “Hello World”. The smartquote characters are being displayed using a \uXXXX hex escape rather than verbatim because there are often problems with writing Unicode characters to the terminal, particularly on Windows. (It looks like MySQL tried to write them to the terminal but failed, resulting in the ? placeholders.)
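A rough sketch of where ? placeholders like these can come from. It assumes, without confirmation from the question, that the client or terminal charset cannot represent curly quotes. It is written in Python 3 syntax, and the ascii codec is only a stand-in for whatever limited charset is actually in use:

```python
# Python 3: every str is Unicode, so no u'' prefix is needed.
text = "\u201cHello\u201d"  # the same string as "“Hello”"

# Assumption: the output charset has no curly quotes ("ascii" is a stand-in).
# errors="replace" swaps each unencodable character for "?".
lossy = text.encode("ascii", errors="replace")

# errors="backslashreplace" keeps the information as \uXXXX escapes instead.
escaped = text.encode("ascii", errors="backslashreplace")

print(lossy)    # bytes with "?" in place of each quote
print(escaped)  # bytes with \u201c and \u201d spelled out
```

In both cases the stored data is unchanged; only the way it is rendered for a limited output channel differs.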

On a terminal that does manage to correctly input and output Unicode characters, you can confirm that they're the same thing:

Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41) [GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u'\u201cHello World\u201d'==u'“Hello World”'
True

Likewise, in byte strings, \x sequences are the same as the characters they stand for:

>>> '\x61'=='a'
True

Now if you've got \u or \x sequences escaping Python and making their way into an exported file, then you've done something wrong with the export. Perhaps you used repr() somewhere by mistake.
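As an illustration of how escapes can end up in exported data, here is a small sketch in Python 3 (the session above is Python 2, where repr() of a unicode string produced the escapes that ascii() produces here). The JSON part is an assumption about what the web-service export might look like; the question does not say which format is used:

```python
import json

text = "\u201cHello World\u201d"  # the same string as "“Hello World”"

# ascii() gives the escaped, debugging-oriented form, quotes included.
# Writing this to a file instead of the text itself leaks \u sequences.
debug_form = ascii(text)

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True).
# That output is still valid JSON and decodes back to the original text.
json_escaped = json.dumps(text)
json_raw = json.dumps(text, ensure_ascii=False)

# For a file or HTTP response, encode the text itself to UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

print(debug_form)
print(json_escaped)
print(json.loads(json_escaped) == text)
```

If the consumer of the export is a JSON parser, the escaped form is harmless, since it decodes to the same characters. The escapes are only a problem when they are shown to a reader or written verbatim into non-JSON output.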
