I am working with a temperamental web app that I'm not going to name. It runs into problems from time to time, and when it does, it writes stack traces and error messages to an exception.log file. I want to know about these problems in a timely manner, so I've got a Python script that scans the log regularly (hooray for cron). If the size of exception.log is greater than zero, the script dumps the contents of the file into an email to me, then moves them to exception_archive.log.
My current tactic is to read in the file, send email and write to the exception archive if necessary, and if both of those steps were successful, to just
target = open(target_log, 'w')
target.close()
to zorch the original log. However, since I can't predict when the system will write to exception.log, there is at least one point in the script where I could lose data - the system could write something to the log after I've read existing data and decided to overwrite the file. Also, I have learned from painful experience that if exception.log does not exist, the temperamental web app will not recreate it - it'll just drop exception data on the floor. So the naïve solution of "rename and re-create the log file" only pushes the problem down a layer.
The core of the question, then, is: How can I (transfer|move|read-write-erase) data from one text file to another in such a way that if new data is written to the file while my script is executing, there is zero or minimal chance of losing that data? I suspect that this is either a Hard Problem, or a Solved Problem that I just haven't heard the solution to. I can't extend the app itself, either - management is very skeptical of tinkering with it, plus it's not in Python, so I'd have to start from scratch.
Additional context:
[me@server ~]$ uname -a
Linux server.example.com 2.6.9-101.ELsmp #1 SMP Thu Jul 21 17:28:56 EDT 2011 i686 i686 i386 GNU/Linux
[me@server ~]$ python
Python 2.3.4 (#1, May 5 2011, 17:13:16)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-11)] on linux2
It's running on cruddy shared hosting, which is part of why I call it "temperamental." I also call it worse things for running Python 2.3 in 2011. This would probably be easier if I had a modern Python to work with.
I'm going to go with a variation on Kevin's answer below - since I control the crontab, I'm going to have the script look for anything in the right timestamp range and operate on that. This has the side benefit that the relevant information can all live in the Python script and be a single source of truth.
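To make the timestamp-range idea concrete, here is a rough sketch of the filtering step. The log format is a pure assumption on my part: it supposes each entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that stack-trace lines without a timestamp belong to the entry above them. `TIMESTAMP_FORMAT`, `parse_timestamp`, and `lines_in_range` are hypothetical names, and the code targets a modern Python 3, so it would need adapting for 2.3.

```python
from datetime import datetime

# Assumption: the real app's timestamp layout is unknown; adjust both values.
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'
TIMESTAMP_LENGTH = 19


def parse_timestamp(line):
    """Return the datetime at the start of the line, or None if there is none."""
    try:
        return datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
    except ValueError:
        return None


def lines_in_range(lines, start, end):
    """Return the lines of every entry whose timestamp t has start <= t < end.

    Lines with no leading timestamp (e.g. stack-trace frames) are treated as
    continuations of the most recent timestamped line.
    """
    selected = []
    keep = False
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None:
            keep = start <= stamp < end
        if keep:
            selected.append(line)
    return selected
```

Using a half-open range (`start <= t < end`) means consecutive cron runs can hand over the previous run's `end` as the next `start` without reporting an entry twice or skipping one, as long as the app's clock and the script's clock agree.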
I would avoid deleting the exception log while the web app is still running. Just scan the log for updates without making any changes.
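import os

# last_known_size must be saved somewhere (e.g. a small state file) so it persists between runs of this script
# log_path, exception_archive and email_me stand in for your own path, open archive file and mail function
current_size = os.path.getsize(log_path)
if current_size > last_known_size:  # found an update!
    log = open(log_path, 'r')
    log.seek(last_known_size)  # skip everything already processed
    new_data = log.read(current_size - last_known_size)
    log.close()
    exception_archive.write(new_data)
    email_me(new_data)
    last_known_size = current_size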
If you're worried the log file will grow too large this way, delete it periodically during low-activity hours (say, 2 AM), when it is unlikely the app will be writing anything to it.
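Here is a slightly fuller sketch of the same offset-tracking idea, including one way to persist the offset between runs. The state-file approach and the names `load_offset`, `save_offset`, and `read_new_data` are my own illustration, not anything the app provides. It is written for a modern Python 3 (`with` blocks, `os.replace`), so it would need adapting for Python 2.3.

```python
import os


def load_offset(state_path):
    """Return the byte offset saved by the previous run, or 0 if there is none."""
    try:
        with open(state_path) as f:
            return int(f.read().strip() or 0)
    except (OSError, ValueError):
        # Missing or unreadable state file: treat the whole log as new.
        return 0


def save_offset(state_path, offset):
    """Persist the offset, writing a temp file first so a crash can't leave it half-written."""
    tmp_path = state_path + '.tmp'
    with open(tmp_path, 'w') as f:
        f.write(str(offset))
    os.replace(tmp_path, state_path)


def read_new_data(log_path, offset):
    """Return (new_bytes, new_offset) for everything appended after offset."""
    size = os.path.getsize(log_path)
    if size < offset:
        # The log shrank, so it was truncated or replaced; start from the top.
        offset = 0
    with open(log_path, 'rb') as log:
        log.seek(offset)
        data = log.read(size - offset)
    return data, offset + len(data)
```

The intended flow is: call `load_offset`, pass the result to `read_new_data`, archive and email the returned bytes, and only then call `save_offset` with the new offset. If the email or the archive write fails, the offset is left alone and the next cron run picks up the same data again. Because the log itself is never modified, anything the app writes mid-run simply shows up on the next pass.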
Rename exception.log to a temporary filename, then process the temp file. (I'm assuming "temperamental web app" will simply recreate exception.log if it doesn't exist).
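A minimal sketch of the rename step, in case it helps. Since the recreate-on-demand behaviour is only an assumption, this version puts an empty exception.log back right after the rename. `rotate_log` and `work_path` are hypothetical names, and `work_path` is assumed to be on the same filesystem so the rename is a single atomic step.

```python
import os


def rotate_log(log_path, work_path):
    """Move log_path to work_path and put an empty log back in its place.

    Returns True if there was something to rotate, False otherwise.
    """
    if not os.path.exists(log_path) or os.path.getsize(log_path) == 0:
        return False
    os.rename(log_path, work_path)
    # Append mode creates the file if it is missing but never truncates it,
    # so nothing is lost if the app already recreated the log in between.
    open(log_path, 'a').close()
    return True
```

Two caveats apply. There is still a brief window between the rename and the recreate during which exception.log does not exist. And if the app keeps the file open rather than reopening it for each write, its output will follow the renamed file, so the temp file may keep growing while you process it.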