I'm working with large files in French and German: writing strings of characters from one file to another, collecting data from them, and so forth. Unfortunately, I have no idea what to import in order to let Python handle these characters.
Even when collecting data from files that Python has already converted (in French you get odd-looking things like écouteur ça), I get KeyErrors when checking dicts for keys that I know have already been added, but only when those keys contain special characters, as in the écouteur ça example.
For example, after the tuple ('écouteur', 'ça') has been added to a dict that counts how often any given pair of words occurs together, I get a KeyError when looking up ('écouteur', 'ça') in that dict, but not when looking up other tuples that don't contain accented characters.
Does anyone know a quick way to get around this issue at every level?
Best, Georgina
"Unicode in Python, Completely Demystified"
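As a rough illustration of the usual advice (decode text as you read it, work with Unicode strings throughout, encode only when you write), here is a minimal sketch. It assumes Python 3 and UTF-8 files, which may not match your setup. The helper names `read_words` and `count_pairs` and the demo file are made up for this example. It also normalizes everything to NFC, on the guess that some of the failed lookups come from the same accented letter being stored in two different forms (one precomposed code point versus a base letter plus a combining mark).

```python
import os
import tempfile
import unicodedata
from collections import Counter


def nfc(text):
    """Return text in NFC form so visually identical strings compare equal."""
    return unicodedata.normalize("NFC", text)


def read_words(path, encoding="utf-8"):
    """Decode the file while reading it (assumed UTF-8) and split on whitespace."""
    with open(path, "r", encoding=encoding) as handle:
        return nfc(handle.read()).split()


def count_pairs(words):
    """Count how often each pair of adjacent words occurs."""
    return Counter(zip(words, words[1:]))


# Hypothetical demo file: the first "ça" is written in decomposed form
# (c + combining cedilla), the second in precomposed form.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "w", encoding="utf-8") as handle:
    handle.write("\u00e9couteur c\u0327a \u00e9couteur \u00e7a\n")

pair_counts = count_pairs(read_words(demo_path))

# Normalize the lookup key the same way as the stored keys.
key = tuple(nfc(word) for word in ("écouteur", "ça"))
print(pair_counts[key])  # both spellings of the pair are counted together
```

If your files are in Latin-1 or another encoding rather than UTF-8, pass that name as `encoding`. Decoding with the wrong encoding is a common source of garbled accented characters.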