Can't Parse CSV in Plone 4

I am trying to batch load some dummy content from a CSV file into a development site to do some testing. I'm using Plone 4.0.1 with Python 2.6.5 on Mac OS X 10.6.6.

1) I thought I would create a quick script that would iterate through a CSV file and then create some of my custom content types (similar to http://plone.org/documentation/kb/batch-adding-users). In Plone 3, I had been able to parse CSV files in this fashion.

However, I got an AttributeError on split. I'm copying from my ipython (ipzope) testing:

>>> portal
<PloneSite at /Plone>
>>> portal['Scripts']['dummydata.csv']
<File at /Plone/Scripts/dummydata.csv>
>>> dummy = portal['Scripts']['dummydata.csv']
>>> dummy
<File at /Plone/Scripts/dummydata.csv>
>>> dummy.data.split('\n')
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
AttributeError: split

>>> dummy.split('\n')                                               
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
AttributeError: split

2) Ultimately, I'd like to use csv from the standard library, which also did not work.

>>> import csv
>>> csv
<module 'csv' from '/Applications/Plone/Python-2.6/lib/python2.6/csv.pyc'>
>>> spamReader = csv.reader(dummy, delimiter=',', quotechar='"')
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
TypeError: argument 1 must be an iterator

>>> spamReader = csv.reader(dummy.data, delimiter=',', quotechar='"') 
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
TypeError: argument 1 must be an iterator

Any ideas?

Best, Patrick


You could try something like this:

>>> from StringIO import StringIO
>>> csv_io = StringIO(dummy.data)
>>> csv_reader = csv.reader(csv_io, delimiter=',', quotechar='"')
>>> for i in csv_reader: print i
['a', 'b', 'c']
['d', 'e', 'f']
...

More info about python and csv can be found here: http://docs.python.org/library/csv.html

Bye, Giacomo


The core problem here is relying on the internals of the File object. For example, the File implementation may use a linked data structure in which the top-level "File.data" object links to a "File.data.data" object, splitting the content across several database objects so that part of the file can be read without loading the whole thing from the database. In that case, accessing the attribute directly gives you only part of the file data. This is just one example of the kind of gotchas you can run into when relying on internals.
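To make the linked-structure point concrete, here is a toy sketch. The Chunk class, its "next" attribute and the sample data are all hypothetical and are not Zope's actual classes; the only point is that reading the top-level attribute can return a fragment while str() returns everything.

```python
class Chunk(object):
    """Hypothetical chunked storage node, NOT Zope's real implementation."""

    def __init__(self, data, next=None):
        self.data = data   # only this chunk's part of the content
        self.next = next   # the following chunk, or None at the end

    def __str__(self):
        # Walk the whole chain and join every chunk into one string.
        parts = []
        node = self
        while node is not None:
            parts.append(node.data)
            node = node.next
        return "".join(parts)


# Two chunks standing in for one stored file.
chain = Chunk("a,b,c\n", Chunk("d,e,f\n"))

first_chunk_only = chain.data   # just the first fragment
whole_file = str(chain)         # the complete contents
has_split = hasattr(chain, "split")
```

An object like this has no split method of its own, so if dummy.data were such a wrapper rather than a plain string, that might also account for the AttributeError in the question.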

Instead, access the file contents using an interface explicitly supported by the File object. Assuming the File object is an OFS.Image.File instance, the only thing I see it supporting is str(). So if your only option is to use an OFS.Image.File instance in the ZODB, your best bet is str(dummy).split('\n').
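As a rough sketch of that approach combined with the csv module: the FakeFile class below is a hypothetical stand-in for the ZODB object (only its str() behaviour matters), and rows_from_file_object is an illustrative helper name, not part of Plone or Zope.

```python
import csv


class FakeFile(object):
    """Hypothetical stand-in for the ZODB File object; only str() is used."""

    def __init__(self, text):
        self._text = text

    def __str__(self):
        return self._text


def rows_from_file_object(file_obj):
    # str() returns the full contents without touching internals.
    text = str(file_obj)
    # csv.reader accepts any iterable of lines, so a list of lines works.
    return list(csv.reader(text.splitlines(), delimiter=',', quotechar='"'))


dummy = FakeFile('a,b,c\n"d,1",e,f\n')
rows = rows_from_file_object(dummy)
```

Handing the lines to csv.reader instead of splitting on commas by hand keeps quoted fields such as "d,1" intact. One caveat: splitlines() would break a quoted field that contains an embedded newline.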

That said, this will likely be very memory-intensive for large files. You'd be much better off loading your CSV from a file on the filesystem, which the Python csv module can read efficiently without loading it all into memory.
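A minimal sketch of the filesystem route follows. It is written for Python 3; on Python 2.6, as used in the question, the file would be opened with mode 'rb' instead of newline=''. The helper name iter_csv_rows and the temporary sample file are assumptions made only so the example is self-contained.

```python
import csv
import os
import tempfile


def iter_csv_rows(path):
    # Python 3 form; on Python 2 open the file with mode 'rb' instead.
    with open(path, newline='') as handle:
        # The reader pulls lines from the file lazily, one row at a time.
        for row in csv.reader(handle, delimiter=',', quotechar='"'):
            yield row


# Write a small sample file so the sketch can run on its own.
fd, sample_path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w', newline='') as handle:
    handle.write('a,b,c\r\nd,e,f\r\n')

rows = list(iter_csv_rows(sample_path))
os.remove(sample_path)
```

In a real batch-loading script you would loop over iter_csv_rows(path) and create one content item per row, rather than collecting everything into a list as done here for demonstration.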
