UnicodeDecodeError only with cx_freeze

Developer https://www.devze.com 2023-03-03 18:20 Source: Internet

I get the error "UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 7338: ordinal not in range(128)" when I run the program after freezing my script with cx_freeze. The Python 3 script runs fine when I run it normally; the error only appears when I run the frozen executable. I would post my code, but I don't know which parts are relevant, so if certain parts would help, let me know and I will post them. I think I ran into this problem once before and solved it, but that was a while ago and I can't remember what the cause was or how I fixed it. Any help or pointers in the right direction would be greatly appreciated. Thanks in advance.


Tell us exactly which version of Python on what platform.

Show the full traceback that you get when the error happens. Look at it yourself. What is the last line of your code that appears? What do you think is the bytes string that is being decoded? Why is the ascii codec being used?

Note that automatic conversion of bytes to str with a default codec (e.g. ascii) is NOT done by Python 3.x. So either you are doing it explicitly or cx_freeze is.
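A minimal sketch of that point, using made-up sample bytes: Python 3 raises a TypeError when str and bytes are mixed rather than silently decoding with a default codec, so a decode only happens where some code asks for one. The helper name describe_mixing is hypothetical.

```python
raw = b"abc\xa0"  # sample bytes, as you would get from a file opened in binary mode


def describe_mixing(raw_bytes):
    """Try to concatenate str and bytes; Python 3 refuses instead of decoding."""
    try:
        return "value: " + raw_bytes
    except TypeError as exc:
        return "TypeError: %s" % exc


mixing_result = describe_mixing(raw)

# Decoding has to be requested explicitly, with a named codec.
decoded = raw.decode("cp1252")

print(mixing_result)
print(ascii(decoded))
```

So if the ascii codec shows up in the traceback, something (your code, a library, or the frozen runtime's choice of default file encoding) selected it.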

Update after further info in comments.

Excel does not save csv files in ASCII. It saves them in what MS calls "the ANSI codepage", which varies by locale. If you don't know what yours is, it is probably cp1252. To check, do this:

>>> import locale; print(locale.getpreferredencoding())
cp1252

If Excel did save files in ASCII, your offending '\xa0' byte would have been replaced by '?' and you would not be getting a UnicodeDecodeError.
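To illustrate the difference, here is a small sketch that encodes a made-up sample string both ways. It only simulates what a lossy ASCII export would produce; it does not reproduce Excel's own export code.

```python
text = "price\u00a0100"  # sample text containing a NO-BREAK SPACE

# A lossy ASCII export replaces the unencodable character with '?'.
as_ascii = text.encode("ascii", errors="replace")

# cp1252 can represent U+00A0 directly, as the single byte 0xa0.
as_cp1252 = text.encode("cp1252")

print(as_ascii)
print(as_cp1252)
```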

Saving your files in UTF-8 would require you to open them with encoding='utf8', and if the ascii codec were still used you would have the same problem (except that you'd get a grumble about 0xc2 instead of 0xa0).
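The 0xc2 comes from the UTF-8 encoding of U+00A0, which is two bytes long. A short sketch showing which byte the ascii codec would trip over first:

```python
utf8_bytes = "\u00a0".encode("utf-8")  # NO-BREAK SPACE as UTF-8

try:
    utf8_bytes.decode("ascii")
    bad_byte = None
except UnicodeDecodeError as exc:
    # exc.object is the bytes being decoded, exc.start the failing index.
    bad_byte = exc.object[exc.start]

print(utf8_bytes, hex(bad_byte))
```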

You don't need to post all four of your csv files on the web. Just run this little script (untested):

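import sys

# Report every line that contains a NO-BREAK SPACE (U+00A0).
# Assumes the files are in cp1252; change the encoding if your ANSI codepage differs.
for filename in sys.argv[1:]:
    with open(filename, encoding='cp1252') as f:
        for lino, line in enumerate(f, 1):
            if '\xa0' in line:
                print(ascii(filename), lino, ascii(line))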

The '\xa0' is a NO-BREAK SPACE aka &nbsp; ... you may want to edit your files to change these to ordinary spaces.
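If editing by hand is impractical, the replacement could be scripted. This is a sketch only: the function name replace_nbsp is hypothetical, cp1252 is an assumption about your files, and it rewrites the file in place, so try it on a copy first.

```python
def replace_nbsp(path, encoding="cp1252"):
    """Replace NO-BREAK SPACEs with ordinary spaces; return how many were found."""
    # newline="" keeps the file's original line endings untouched.
    with open(path, encoding=encoding, newline="") as f:
        text = f.read()
    count = text.count("\xa0")
    if count:
        with open(path, "w", encoding=encoding, newline="") as f:
            f.write(text.replace("\xa0", " "))
    return count
```

Calling replace_nbsp on each csv file returns the number of replacements made, which also tells you which files were affected.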

Probably you will need to ask on the cx_freeze mailing list to get an answer to why this error is happening. They will want to know the full traceback. Get some practice -- show it here.

By the way, "offset 7338" is rather large -- do you expect lines that long in your csv file? Perhaps something is reading all of your file ...
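One way to test that guess is to treat 7338 as a byte offset into the whole file and see where it lands. The sketch below assumes exactly that, which may not hold if the decoder was handed only part of the file; locate_offset is a hypothetical helper name.

```python
def locate_offset(path, offset):
    """Return (line number, column, byte value) for a byte offset, or None if out of range."""
    with open(path, "rb") as f:
        data = f.read()
    if offset >= len(data):
        return None
    # Lines are 1-based; columns are 0-based byte positions within the line.
    line_no = data.count(b"\n", 0, offset) + 1
    line_start = data.rfind(b"\n", 0, offset) + 1
    return line_no, offset - line_start, data[offset]
```

If the byte found at that offset is 0xa0 (160), the whole file was probably being decoded in one go.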


That error itself indicates that the bytes being decoded contain a byte that isn't a valid ASCII character:

>>> b'abc\xa0'.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 3: ordinal not in range(128)

I certainly don't know why this would only happen when a script is frozen. You could wrap the whole script in a try/except and manually print out all or part of the string in question.

EDIT: here's how that might look

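try:
    main()  # hypothetical entry point: put your script's code here
except UnicodeDecodeError as e:
    # e.object is the bytes being decoded; show up to 50 bytes either side of the bad one
    print("Exception happened in string '...%r...'" % (e.object[max(e.start - 50, 0):e.start + 51],))
    raise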


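Fix by setting the default encoding (Python 2 only; Python 3 has no reload() builtin and no sys.setdefaultencoding()):

import sys
reload(sys)
sys.setdefaultencoding("utf-8")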


Use the decode() method on those lines. You can also specify the encoding, e.g. myString.decode('cp1252'). Note that in Python 3 only bytes objects have decode(); str does not.
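For csv files, an alternative to decoding line by line is to name the encoding when opening the file, so the platform default (which may differ inside a frozen executable) is never consulted. A minimal sketch, assuming the files are cp1252; read_rows is a hypothetical helper name.

```python
import csv


def read_rows(path, encoding="cp1252"):
    """Read a csv file with an explicit encoding instead of the platform default."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, encoding=encoding, newline="") as f:
        return list(csv.reader(f))
```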

See also: http://docs.python.org/release/3.0.1/howto/unicode.html#unicode-howto
