
CSV Module's writer won't let me write binary out

Developer https://www.devze.com 2023-02-18 06:23 Source: Internet

I tried opening the file with the 'w' flag, but the output was double-spaced, which broke reading it back. So I switched to 'wb', which I understood to be the correct mode. Now that I am using 'wb', I can't get csv.writer.writerow() to work. I have encoded all my strings and am lost as to why I keep getting this error. Every question I have found says that b'string here' or myString.encode('ascii') solves this error, but neither solves it for me. Here is what I have:

    dataWriter = csv.writer(open(fileName, 'wb'))
    for i in range(self.ui.table.rowCount()):
        rowData = [self.ui.table.item(i,0).text().encode('utf-8')\
        ,self.ui.table.item(i,1).text().encode('utf-8')\
        ,self.ui.table.item(i,2).text().encode('utf-8')\
        ,self.ui.table.item(i,3).text().encode('utf-8')\
        ,self.ui.table.item(i,4).text().encode('utf-8')]
        dataWriter.writerow(rowData)

I figured this would work, but it still gives me the following error: "TypeError: must be bytes or buffer, not str" on the line "dataWriter.writerow(rowData)".

Any help would be appreciated. Thank you.


You appear to be running Python 3.x. Advice about using binary mode for csv files applies to Python 2.x. The codecs module is not required for 3.x; just pass encoding=whatever when you open the file. What 3.x does need is for the file to be opened with newline=''. This applies to both reading and writing, although it is not documented for writing (a bug report has been submitted). That sorts out your double-spacing problem, and then this will work:

import csv
data = [
    ['\xfforick', 123.456],
    ['polonius', 987.564],
    ]
with open('demo.csv', 'w', newline='', encoding='utf8') as f:
    writer = csv.writer(f)
    for row in data:
        writer.writerow(row)

Contents of output file:

>>> open('demo.csv', 'rb').read()
b'\xc3\xbforick,123.456\r\npolonius,987.564\r\n'
>>>
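To check that the file reads back cleanly, here is a minimal round-trip sketch. It assumes a scratch file in a temporary directory (the path is made up for illustration) and reuses the same newline='' and encoding arguments on the reading side. Note that csv.reader returns every field as a string, so the floats come back as text.

```python
import csv
import os
import tempfile

rows = [['\xfforick', 123.456], ['polonius', 987.564]]

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'demo.csv')

# Write in text mode; newline='' leaves the csv module's own line endings alone.
with open(path, 'w', newline='', encoding='utf8') as f:
    csv.writer(f).writerows(rows)

# Read back with the same newline and encoding settings.
with open(path, 'r', newline='', encoding='utf8') as f:
    read_back = list(csv.reader(f))

print(read_back)
```

If you need the numbers as floats again, convert them yourself after reading.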

Suggestion: give some consideration to legibility of your code ... instead of

for i in range(self.ui.table.rowCount()):
    rowData = [self.ui.table.item(i,0).text().encode('utf-8')\
    ,self.ui.table.item(i,1).text().encode('utf-8')\
    ,self.ui.table.item(i,2).text().encode('utf-8')\
    ,self.ui.table.item(i,3).text().encode('utf-8')\
    ,self.ui.table.item(i,4).text().encode('utf-8')]
    dataWriter.writerow(rowData)

try

table = self.ui.table
for i in range(table.rowCount()):
    row = [table.item(i, j).text() for j in range(5)]
    writer.writerow(row)


In Python 3, opening a file in binary mode gives you an io.BufferedWriter, which accepts bytes, not strings. Calling encode turns your strings into bytes, but I think csv.writer.writerow converts those bytes back into strings before writing, which is why the binary file rejects them.
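A small sketch of both effects, using in-memory buffers as stand-ins for real files (an assumption made so it runs anywhere). The text buffer shows what the writer does with a bytes field, and the binary buffer shows the writer's string output being rejected.

```python
import csv
import io

# Text-mode target: a bytes field is passed through str(), so its repr ends up in the file.
text_buffer = io.StringIO()
csv.writer(text_buffer).writerow([b'abc', 'def'])
text_output = text_buffer.getvalue()

# Binary-mode target: the writer still hands over a str, which a binary stream refuses.
binary_buffer = io.BytesIO()
try:
    csv.writer(binary_buffer).writerow(['abc', 'def'])
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(repr(text_output))
print(error_name)
```

So encoding the fields does not help either way: the fix belongs in how the file is opened, not in the row data.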

Instead of opening the file in binary mode, you should try to figure out what's causing the double spacing. I have two questions:

  1. What platform are you using?

  2. What is the output of print(repr(self.ui.table.item(i,4).text()))?

My guess is that brandizzi's strip() method will work, but if not, we'll need to do some troubleshooting.

Edit: Ok, John Machin's post clears it all up. The correct way to fix this problem in Python 3 is to open the file with newline='', which disables automatic newline translation. This bug report contains some helpful information.
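For anyone who wants to see the double spacing without a Windows box, here is a rough sketch. It assumes that passing newline='\r\n' is a fair stand-in for Windows text-mode translation on write, and the helper name and file names are made up for illustration.

```python
import csv
import os
import tempfile

def write_row(path, newline):
    """Write one CSV row in text mode, then return the raw bytes on disk."""
    with open(path, 'w', newline=newline, encoding='utf8') as f:
        csv.writer(f).writerow(['a', 'b'])
    with open(path, 'rb') as f:
        return f.read()

tmp_dir = tempfile.mkdtemp()

# The writer emits '\r\n'; translation then turns the '\n' into '\r\n' again.
translated = write_row(os.path.join(tmp_dir, 'translated.csv'), '\r\n')

# newline='' switches translation off, so the writer's '\r\n' is stored as-is.
untranslated = write_row(os.path.join(tmp_dir, 'plain.csv'), '')

print(translated)
print(untranslated)
```

The extra '\r' in the translated file is what shows up as a blank line between records.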


Maybe you could let the codecs module do the Unicode encoding for you, and try something like this instead:

import codecs, csv

with codecs.open(fileName, 'w', encoding = 'utf_8') as f:
    writer = csv.writer(f)
    writer.writerow(['some string', 'some other string'])


I'm hardly surprised. If you were to write out a byte of value 13, how is the module supposed to tell if this is part of a binary field, or the start of a new record in the CSV? CSV files are not suitable for storing binary data.

If you absolutely need it to be in there, you could look into Base64 encoding...
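Something along these lines, as a sketch only: the payload and the 'blob-1' label are invented for illustration, and an in-memory buffer stands in for the file. The binary value is stored as Base64 text, so bytes such as 13 and 10 never appear raw in the CSV.

```python
import base64
import csv
import io

# Made-up binary payload that includes CR (13) and LF (10).
payload = bytes([0, 13, 10, 255])

# In-memory stand-in for a file opened with newline=''.
buffer = io.StringIO(newline='')
encoded_field = base64.b64encode(payload).decode('ascii')
csv.writer(buffer).writerow(['blob-1', encoded_field])

# Read the row back and recover the original bytes.
buffer.seek(0)
name, encoded = next(csv.reader(buffer))
decoded = base64.b64decode(encoded)

print(name, decoded == payload)
```

The cost is size: Base64 text is about a third larger than the raw bytes.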

Martin
