I'm trying to write a generic reader of all sorts of medical image formats that we come accross. I thought, let's learn from the pros and went to imitate how PIL generically reads files ("Python Imaging Library", Formats).
As I understand it, PIL has an open function that loops throuh a list of possible accept functions. When one works, it uses the related factory function to instantiate the appropriate object.
So I went to do this and my (stripped-down) efforts are here:
pluginID = [] # list of all registered plugin ID开发者_运维技巧s
OPEN = {} # plugins have open and (maybe) accept functions as a tuple
_initialized = False
import os, sys
def moduleinit():
'''Explicitly initializes the library. This function
loads all available file format drivers.
This routine has been lifted from PIL, the Python Image Library'''
global _initialized
global pluginID
if _initialized:
return
visited = {}
directories = sys.path
try:
directories = directories + [os.path.dirname(__file__)]
except NameError:
pass
# only check directories (including current, if present in the path)
for directory in filter(isDirectory, directories):
fullpath = os.path.abspath(directory)
if visited.has_key(fullpath):
continue
for file in os.listdir(directory):
if file[-19:] == "TestReaderPlugin.py":
f, e = os.path.splitext(file)
try:
sys.path.insert(0, directory)
try: # FIXME: this will not reload and hence pluginID
# will be unpopulated leading to "cannot identify format"
__import__(f, globals(), locals(), [])
finally:
del sys.path[0]
except ImportError:
print f, ":", sys.exc_value
visited[fullpath] = None
if OPEN:
_initialized = True
return 1
class Reader:
'''Base class for image file format handlers.'''
def __init__(self, fp=None, filename=None):
self.filename = filename
if isStringType(filename):
import __builtin__
self.fp = __builtin__.open(filename) # attempt opening
# this may fail if not implemented
self._open() # unimplemented in base class but provided by plugins
def _open(self):
raise NotImplementedError(
"StubImageFile subclass must implement _open"
)
# this is the generic open that tries to find the appropriate handler
def open(fp):
'''Probe an image file
Supposed to attempt all opening methods that are available. Each
of them is supposed to fail quickly if the filetype is invalid for its
respective format'''
filename=fp
moduleinit() # make sure we have access to all the plugins
for i in pluginID:
try:
factory, accept = OPEN[i]
if accept:
fp = accept(fp)
# accept is expected to either return None (if unsuccessful)
# or hand back a file handle to be used for opening
if fp:
fp.seek(0)
return factory(fp, filename=filename)
except (SyntaxError, IndexError, TypeError):
pass # I suppose that factory is allowed to have these
# exceptions for problems that weren't caught with accept()
# hence, they are simply ignored and we try the other plugins
raise IOError("cannot identify format")
# --------------------------------------------------------------------
# Plugin registry
def register_open(id, factory, accept=None):
pluginID.append(id)
OPEN[id] = factory, accept
# --------------------------------------------------------------------
# Internal:
# type stuff
from types import StringType
def isStringType(t):
return isinstance(t, StringType)
def isDirectory(f):
'''Checks if an object is a string, and that it points to a directory'''
return isStringType(f) and os.path.isdir(f)
The important bit behind the scenes is a registration of all format plugins upon the first time an attempt is made to open a file (moduleinit). Every eligible plugin must be in an accessible path and named *TestReaderPlugin.py. It will get (dynamically) imported. Each plugin module has to call a register_open to provide an ID, a method to create the file and an accept function to test candidate files.
An example plugin will look like this:
import TestReader
def _accept(filename):
fp=open(filename,"r")
# we made it here, so let's just accept this format
return fp
class exampleTestReader(TestReader.Reader):
format='example'
def _open(self):
self.data = self.fp.read()
TestReader.register_open('example', exampleTestReader, accept=_accept)
TestReader.open() is the function that a user will use:
import TestReader
a=TestReader.open(filename) # easy
So - where is the problem? Firstly, I'm still on the search for the pythonic way. Is this it? My reasons to doubt it is that the magic in the moduleinit stage looks messy. It is copied straight from PIL. Main problem: If you reload(TestReader), it will all stop working because ID gets initialized to [], but the plugins will not get reloaded.
Are there better ways to set up a generic reader that
1. allows a simple open(filename) call for all formats and 2. requires only nicely encapsulated plugins to be provided for whatever format you want. 3. works on reloads?Some guidelines:
- Use the concept of "peek" into a buffer to test if there is data data you could understand.
- Knowing the name of the importer is something a user does not want to know (what if you have 100 importers) use a "facade" interface medicimage.open(filepath)
- To work on reload you have to implement a little bit of logic, there are exaples out there on how to achieve that
精彩评论