I'm working on a Python server which concurrently handles transactions on a number of databases, each storing performance data about a different application. Concurrency is accomplished via the multiprocessing module, so each transaction thread starts in a new process, and shared-memory data protection schemes are not viable. I am using SQLite as my DBMS, and have opted to set up each application's DB in its own file. Unfortunately, this introduces a race condition on DB creation: if two processes attempt to create a DB for the same new application at the same time, both will create the file where the DB is to be stored. My research leads me to believe that one cannot lock a file before it is created; is there some other mechanism I can use to ensure that the file is not created and then written to concurrently?
Thanks in advance, David
The usual Unix-style way of handling this for regular files is to just try to create the file and see if it fails. In Python's case, that would be:
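    import os

    try:
        fd = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:  # Python 3; on Python 2 catch OSError and check errno.EEXIST
        pass  # Someone else created it already.
    else:
        os.close(fd)  # We created it; close the descriptor when done.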
At the very least, you can use this method to try to create a "lock file" with a similar name to the database. If the lock file is created, you go ahead and make the database. If not, you do whatever you need to for the "database exists" case.
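A minimal sketch of the lock-file idea, assuming Python 3 and the standard sqlite3 module. The function name `create_db_once`, the `.lock` suffix and the `samples` table are made up for illustration and are not from the question:

```python
import os
import sqlite3


def create_db_once(db_path):
    """Return True if this process created the database, False otherwise."""
    lock_path = db_path + ".lock"  # hypothetical naming convention
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one caller succeeds.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False  # another process owns (or owned) the creation step
    os.close(fd)
    conn = sqlite3.connect(db_path)
    try:
        # Placeholder schema, purely for illustration.
        conn.execute("CREATE TABLE IF NOT EXISTS samples (ts REAL, value REAL)")
        conn.commit()
    finally:
        conn.close()
    return True
```

A process that gets False back would still have to wait until the winner has finished setting up the schema before using the database; how to signal that is left open here.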
Name your database files in such a way that they are guaranteed not to collide.
http://docs.python.org/library/tempfile.html
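One way this could look with `tempfile.mkstemp`, which creates a file under a name that did not exist before. The helper name `new_unique_db_file` and the `<app>_….db` naming pattern are assumptions for the sake of the example:

```python
import os
import tempfile


def new_unique_db_file(db_dir, app_name):
    """Create an empty, uniquely named file in db_dir and return its path."""
    # mkstemp picks an unused name and creates the file in one atomic step.
    fd, path = tempfile.mkstemp(prefix=app_name + "_", suffix=".db", dir=db_dir)
    os.close(fd)
    return path
```

Each call yields a different file, so the mapping from an application to its path would have to be recorded somewhere else.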
You could capture the error when trying to create the file in your code and in your exception handler, check if the file exists and use the existing file instead of creating it.
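As a rough sketch of that approach, assuming Python 3.3 or later, the built-in `open()` accepts an exclusive-creation mode `"x"` that raises `FileExistsError` when the file is already there. The helper name `open_new_or_existing` is hypothetical:

```python
def open_new_or_existing(path):
    """Return (file_object, created) for path."""
    try:
        # Mode "x" is exclusive creation: it fails if the file already exists.
        return open(path, "xb"), True
    except FileExistsError:
        # Someone else got there first; fall back to the existing file.
        return open(path, "r+b"), False
```

The `created` flag tells the caller whether it is responsible for initializing the file's contents.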
You didn't mention the platform, but on Linux open(), or os.open() in Python, takes a flags parameter which you can use. The O_CREAT flag creates a file if it does not exist, and the O_EXCL flag gives you an error if the file already exists. You'll also be needing O_RDONLY, O_WRONLY or O_RDWR for specifying the access mode. You can find these constants in the os module.
For example: fd = os.open(filename, os.O_RDWR | os.O_CREAT | os.O_EXCL)
You can use the POSIX O_EXCL and O_CREAT flags to open(2) to guarantee that only a single process gets the file and thus the database; O_EXCL won't work over NFSv2 or earlier, and it'd be pretty shaky to rely on it for other network filesystems.
The liblockfile library implements a network-filesystem safe locking mechanism described in the open(2) manpage, which would be convenient; but I only see pre-made Ruby and Perl bindings. Depending upon your needs, maybe providing Python bindings would be useful, or perhaps just re-implementing the algorithm:
O_EXCL Ensure that this call creates the file: if this flag is
specified in conjunction with O_CREAT, and pathname
already exists, then open() will fail. The behavior of
O_EXCL is undefined if O_CREAT is not specified.
When these two flags are specified, symbolic links are not
followed: if pathname is a symbolic link, then open()
fails regardless of where the symbolic link points to.
O_EXCL is only supported on NFS when using NFSv3 or later
on kernel 2.6 or later. In environments where NFS O_EXCL
support is not provided, programs that rely on it for
performing locking tasks will contain a race condition.
Portable programs that want to perform atomic file locking
using a lockfile, and need to avoid reliance on NFS
support for O_EXCL, can create a unique file on the same
file system (e.g., incorporating hostname and PID), and
use link(2) to make a link to the lockfile. If link(2)
returns 0, the lock is successful. Otherwise, use stat(2)
on the unique file to check if its link count has
increased to 2, in which case the lock is also successful.
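The algorithm quoted above might be re-implemented in Python roughly as follows. This is only a sketch: the function names `acquire_lock` and `release_lock` are invented here, and building the unique name from hostname and PID assumes one locking attempt per process at a time, which fits a multiprocessing setup but not threads.

```python
import os
import socket


def acquire_lock(lock_path):
    """Try to take lock_path using the link(2) scheme; return True on success."""
    # Unique file on the same file system, named after hostname and PID.
    unique = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT)  # no reliance on O_EXCL
    os.close(fd)
    try:
        try:
            os.link(unique, lock_path)
            return True  # link(2) succeeded: lock acquired
        except OSError:
            # link(2) reported failure; a link count of 2 on the unique
            # file still means the lock was taken by this process.
            return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)  # the lockfile itself stays until released


def release_lock(lock_path):
    """Give the lock up by removing the lockfile."""
    os.unlink(lock_path)
```

A process that gets True would create the database and then call `release_lock`, or simply leave the lockfile in place as a marker that creation has already been claimed.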