I have a class that assists in importing a special type of file, and a 'factory' class that allows me to do these in batch. The factory class uses a generator so the client can iterate through the importers. My question is, did I use the iterator correctly? Is this an acceptable idiom? I've just started using Python.
class FileParser:
""" uses an open filehandle to do stuff """
class BatchImporter:
def __init__(self, files):
self.files=files
def parsers(self):
for file in self.files:
try:
fh = open(开发者_运维知识库file, "rb")
parser = FileParser(fh)
yield parser
finally:
fh.close()
def verifyfiles(
def cleanup(
---
importer = BatchImporter(filelist)
for p in BatchImporter.parsers():
p.method1()
...
You could make one thing a little simpler: Instead of try
...finally
, use a with
block:
with open(file, "rb") as fh:
yield FileParser(fh)
This will close the file for you automatically as soon as the with
block is left.
It's absolutely fine to have a method that's a generator, as you do. I would recommend making all your classes new-style (if you're on Python 2, either set __metaclass__ = type
at the start of your module, or add (object)
to all your base-less class
statements), because legacy classes are "evil";-); and, for clarity and conciseness, I would also recomment coding the generator differently...:
def parsers(self):
for afile in self.files:
with open(afile, "rb") as fh:
yield FileParser(fh)
but neither of these bits of advice condemns in any way the use of generator methods!-)
Note the use of afile
in lieu of file
: the latter is a built-in identifier, and as a general rule it's better to get used to not "hide" built-in identifiers with your own (it doesn't bite you here, but it will in many nasty ways in the future unless you get into the right habit!-).
The design is fine if you ask me, though using finally the way you use it isn't exactly idiomatic. Use catch and maybe re-raise the exception (using the raise keyword alone, otherwise you mess the stacktrace up), and for bonus points, don't catch: but catch Exception: (otherwise, you catch SystemExit and KeyboardInterrupt).
Or simply use the with-statement as shown by Tim Pietzcker.
In general, it isn't safe to close the file after you yield a parser object that will try to read it. Consider this code:
parsers = list(BatchImporter.parsers())
for p in parsers:
# the file object that p holds will already be closed!
If you're not writing a long-running daemon process, most of the time you don't need to worry about closing files -- they will all get closed when your program exits, or when the file objects are garbage-collected. (And if you use CPython, that will happen as soon as all references to them are lost, since CPython uses reference counting.)
Nevertheless, taking care to free resources is a good habit to acquire, so I would probably write the FileParser class this way:
class FileParser:
def __init__(self, file_or_filename, closing=False):
if hasattr(file_or_filename, 'read'):
self.f = file_or_filename
self._need_to_close = closing
else:
self.f = open(file_or_filename, 'rb')
self._need_to_close = True
def close(self):
if self._need_to_close:
self.f.close()
self._need_to_close = False
and then BatchImporter.parsers would become
def parsers(self):
for file in self.files:
yield FileParser(file)
or, if you love functional programming
def parsers(self):
return itertools.imap(FileParser, self.files)
An aside: if you're new to Python, I recommend you take a look at the Python style guide (also known as PEP 8). Two-space indents look weird.
精彩评论