I have a number of functions that parse data from files, usually returning a list of results.
If I encounter a dodgy line in the file, I want to soldier on and process the valid lines, and return them. But I also want to report the error to the calling function. The reason I want to report it is so that the calling function can notify the user that the file needs looking at. I don't want to start doing GUI things in the parse function, as that seems to be a big violation of separation of concerns. The parse function does not have access to the console I'm writing error messages to anyway.
This leaves me wanting to return the successful data, but also raise an exception because of the error, which clearly I can't do.
Consider this code:
try:
    parseResult = parse(myFile)
except MyErrorClass as e:
    HandleErrorsSomehow(str(e))
def parse(file):  # file is a list of lines from an actual file
    err = False
    result = []
    for line in file:
        processedLine = Process(line)
        if not processedLine:
            err = True
        else:
            result.append(processedLine)
    return result
    if err:  # unreachable: the function has already returned
        raise MyErrorClass("Something went wrong")
Obviously the last three lines make no sense, but I can't figure out a nice way to do this. I guess I could do return (err, result), and call it like
parseErr, parseResult = parse(file)
if parseErr:
    HandleErrorsSomehow()
But returning error codes seems un-pythonic enough, let alone returning tuples of error codes and actual result values.
The fact that I feel like I want to do something so strange in an application that shouldn't really be terribly complicated, is making me think I'm probably doing something wrong. Is there a better solution to this problem? Or is there some way that I can use finally to return a value and raise an exception at the same time?
Nobody says the only valid way to treat an "error" is to throw an exception.
In your design the caller wants two pieces of information: (1) the valid data, (2) whether an error occurred (and probably something about what went wrong where, so it can format a useful error message). That is a completely valid and above-board case for returning a pair of values.
An alternative design would be to pass a mutable collection down to the function as a parameter and let it fill any error messages it wants to emit into that. That will often simplify the plumbing in the caller, especially if there are several layers of calls between the parser and the code that knows how to do something with the error messages afterwards.
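A minimal sketch of that mutable-collection design. The names process_line, parse and errors are illustrative, and process_line is only a hypothetical stand-in for Process that treats anything other than an integer as a bad line:

```python
def process_line(line):
    """Hypothetical stand-in for Process(): an int, or None for a bad line."""
    try:
        return int(line)
    except ValueError:
        return None


def parse(lines, errors):
    """Return the valid results; append one message to `errors` per bad line."""
    result = []
    for lineno, line in enumerate(lines, start=1):
        processed = process_line(line)
        if processed is None:
            errors.append("line %d: could not parse %r" % (lineno, line))
        else:
            result.append(processed)
    return result


errors = []  # the caller owns the collection and decides what to do with it
values = parse(["1", "oops", "3"], errors)
```

Because the list is created by the caller, intermediate layers only have to pass it along; none of them needs to know how the messages are eventually shown.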
Emit a warning instead, and let the code decide how to handle it.
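One way this could look with the standard warnings module. ParseWarning is a hypothetical category made up for this sketch, and int() stands in for the real per-line processing:

```python
import warnings


class ParseWarning(UserWarning):
    """Hypothetical warning category for lines the parser had to skip."""


def parse(lines):
    result = []
    for lineno, line in enumerate(lines, start=1):
        try:
            result.append(int(line))  # stand-in for the real processing
        except ValueError:
            warnings.warn("line %d: could not parse %r" % (lineno, line),
                          ParseWarning)
    return result


# The caller decides how warnings are handled; here they are recorded in a list.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # don't let the default filter drop repeats
    values = parse(["1", "oops", "3"])

messages = [str(w.message) for w in caught]
```

A caller that would rather fail fast could install an "error" filter for ParseWarning instead, which turns the first warning into an exception (at the cost of losing the partial result).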
Another possible design is to invert control, by passing the error handler in as a parameter.
(Also, don't feel like you have to tell Python how to accumulate data in a list. It knows already. It's not hard to make a list comprehension work here.)
def sample_handler():
    print("OMG, I wasn't expecting that; oh well.")

def parse(file, handler):  # file is a list of lines from an actual file
    result = [Process(line) for line in file]
    if not all(result):
        handler()  # i.e. if there are any false-ish values
    return [r for r in result if r]  # remove false-ish values, if any

parseResult = parse(myFile, sample_handler)
Depending on the caller's design, using callbacks might be reasonable:
def got_line(line):
    print('Got valid line', line)

def got_error(error):
    print('Got error', error)

def parse(file, on_line, on_error):  # callbacks renamed so they don't clash with the loop variable
    for line in file:
        processedLine = Process(line)
        if not processedLine:
            on_error(MyErrorClass("Something went wrong"))
        else:
            on_line(processedLine)

parse(some_file, got_line, got_error)
I'd like to propose an alternative solution; using a class.
class MyParser(object):
    def __init__(self):
        self.warnings = []

    def parse(self, file):
        ...
Now the parse function can append warnings to the warnings list, and the user can check this list if they so desire.
As soon as my functions start becoming more advanced than just "process this and return my value" I like to consider using a class instead. It makes for great clustering of related code into one object and it often makes for cleaner code and simpler usage than functions returning tuples of information.
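For illustration only, here is one way such a class might be filled in. LineParser and its digit check are assumptions made for this sketch, not the real parsing logic:

```python
class LineParser(object):
    """Hypothetical parser that keeps its warnings as instance state."""

    def __init__(self):
        self.warnings = []

    def parse(self, lines):
        self.warnings = []  # start fresh on every call
        result = []
        for lineno, line in enumerate(lines, start=1):
            stripped = line.strip()
            if stripped.isdigit():  # stand-in validity check
                result.append(int(stripped))
            else:
                self.warnings.append(
                    "line %d: could not parse %r" % (lineno, line))
        return result


parser = LineParser()
values = parser.parse(["1", "oops", "3"])
if parser.warnings:
    # The caller, not the parser, decides how to tell the user.
    print("File needs looking at: %s" % "; ".join(parser.warnings))
```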
I come from the .NET world, so I'm not sure how this translates into Python...
In cases like yours (where you want to process numerous items in a single call) I'd return a MyProcessingResults object that held two collections, for example:
MyProcessingResults.ProcessedLines - holds all the valid data you parsed.
MyProcessingResults.Errors - holds all the errors (on the assumption that you have more than one and you want to explicitly know about all of them).
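A rough Python counterpart could be a namedtuple. The names ProcessingResults, processed_lines and errors are assumptions for this sketch, and float() stands in for the real per-line processing:

```python
from collections import namedtuple

# Hypothetical Python counterpart of MyProcessingResults.
ProcessingResults = namedtuple("ProcessingResults",
                               ["processed_lines", "errors"])


def parse(lines):
    processed, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        try:
            processed.append(float(line))  # stand-in for the real processing
        except ValueError:
            errors.append((lineno, line))  # remember where and what went wrong
    return ProcessingResults(processed, errors)


results = parse(["1.5", "n/a", "2"])
```

The caller can read results.processed_lines and results.errors by name, or unpack the result like an ordinary two-element tuple.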