Prevent a file from being created in Python

I'm working on a Python server which concurrently handles transactions on a number of databases, each storing performance data about a different application. Concurrency is accomplished via the multiprocessing module, so each transaction thread starts in a new process, and shared-memory data protection schemes are not viable. I am using SQLite as my DBMS, and have opted to set up each application's DB in its own file. Unfortunately, this introduces a race condition on DB creation: if two processes attempt to create a DB for the same new application at the same time, both will create the file where the DB is to be stored. My research leads me to believe that one cannot lock a file before it is created. Is there some other mechanism I can use to ensure that the file is not created and then written to concurrently?

Thanks in advance, David

The usual Unix-style way of handling this for regular files is to just try to create the file and see if it fails. In Python's case, that would be:

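import os

try:
    fd = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
except FileExistsError:  # an OSError subclass in Python 3.3+; on Python 2, catch OSError and check errno.EEXIST
    pass  # Someone else created it already.
else:
    os.close(fd)  # We created the file; close the descriptor once it is no longer needed.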

At the very least, you can use this method to try to create a "lock file" with a similar name to the database. If the lock file is created, you go ahead and make the database. If not, you do whatever you need to for the "database exists" case.
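The lock-file idea above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper name `create_db_once`, the `.lock` suffix, and the `samples` table are all hypothetical and chosen only for illustration.

```python
import os
import sqlite3


def create_db_once(db_path):
    """Create and initialise the database only if this process wins the race.

    Returns True if this process created the database, False if another
    process had already claimed it.
    """
    lock_path = db_path + ".lock"  # hypothetical naming convention
    try:
        # O_CREAT | O_EXCL makes "check and create" a single atomic step.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False  # someone else created (or is creating) the database
    os.close(fd)

    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema, for illustration only.
        conn.execute("CREATE TABLE IF NOT EXISTS samples (ts REAL, value REAL)")
        conn.commit()
    finally:
        conn.close()
    return True
```

In this sketch the lock file is left in place as a permanent "already claimed" marker. A process that gets `False` back may still need to wait until the winner has finished initialising the database before using it; that coordination is not shown here.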

Name your database files in such a way that they are guaranteed not to collide.

http://docs.python.org/library/tempfile.html
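As a rough illustration of the tempfile suggestion, `tempfile.mkstemp` both picks a unique name and creates the file, so two processes never receive the same path. The helper name `new_unique_db_path` and the prefix/suffix scheme below are assumptions made for this sketch.

```python
import os
import tempfile


def new_unique_db_path(directory, app_name):
    """Reserve a fresh, uniquely named database file inside `directory`."""
    # mkstemp creates the file itself, so the chosen name cannot be
    # claimed by another process between naming and creation.
    fd, path = tempfile.mkstemp(prefix=app_name + "-", suffix=".db", dir=directory)
    os.close(fd)  # the empty file can be opened later with sqlite3.connect(path)
    return path
```

Unique names avoid the collision, but every process then needs some way to learn which file belongs to which application. That bookkeeping is outside the scope of this sketch.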

You could catch the error raised when your code tries to create the file and, in the exception handler, check whether the file exists and use the existing file instead of creating it.

You didn't mention the platform, but on Linux, open(2) (os.open() in Python) takes a flags parameter which you can use. The O_CREAT flag creates the file if it does not exist, and the O_EXCL flag gives you an error if the file already exists. You'll also need one of O_RDONLY, O_WRONLY or O_RDWR to specify the access mode. You can find these constants in the os module.

For example: fd = os.open(filename, os.O_RDWR | os.O_CREAT | os.O_EXCL)

You can use the POSIX O_EXCL and O_CREAT flags to open(2) to guarantee that only a single process gets the file and thus the database. However, O_EXCL won't work over NFSv2 or earlier, and it would be pretty shaky to rely on it for other network filesystems.

The liblockfile library implements a network-filesystem safe locking mechanism described in the open(2) manpage, which would be convenient; but I only see pre-made Ruby and Perl bindings. Depending upon your needs, maybe providing Python bindings would be useful, or perhaps just re-implementing the algorithm:

   O_EXCL Ensure that this call creates the file: if this flag is
          specified in conjunction with O_CREAT, and pathname
          already exists, then open() will fail.  The behavior of
          O_EXCL is undefined if O_CREAT is not specified.

          When these two flags are specified, symbolic links are not
          followed: if pathname is a symbolic link, then open()
          fails regardless of where the symbolic link points to.

          O_EXCL is only supported on NFS when using NFSv3 or later
          on kernel 2.6 or later.  In environments where NFS O_EXCL
          support is not provided, programs that rely on it for
          performing locking tasks will contain a race condition.
          Portable programs that want to perform atomic file locking
          using a lockfile, and need to avoid reliance on NFS
          support for O_EXCL, can create a unique file on the same
          file system (e.g., incorporating hostname and PID), and
          use link(2) to make a link to the lockfile.  If link(2)
          returns 0, the lock is successful.  Otherwise, use stat(2)
          on the unique file to check if its link count has
          increased to 2, in which case the lock is also successful.
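The link(2) procedure quoted above might be re-implemented in Python along these lines. This is only a sketch: the function names `acquire_lock` and `release_lock` are hypothetical, it assumes at most one caller per process at a time (the unique name is built from hostname and PID only), and it has not been exercised on an actual NFS mount.

```python
import os
import socket


def acquire_lock(lock_path):
    """Try to take `lock_path` using the link(2) technique from open(2).

    Returns True if the lock was acquired, False otherwise.
    """
    # Unique file on the same filesystem, incorporating hostname and PID.
    unique = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
    try:
        try:
            os.link(unique, lock_path)
            return True  # link(2) succeeded: the lock is ours
        except OSError:
            # link(2) reported failure. Fall back to the link count:
            # 2 means the link was in fact made, so the lock is ours.
            return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)  # only the lock file itself is left behind


def release_lock(lock_path):
    """Release a lock previously taken with acquire_lock()."""
    os.unlink(lock_path)
```

A caller would wrap database creation between `acquire_lock(path)` returning `True` and a later `release_lock(path)`. What to do when the lock is not obtained (retry, wait, or assume the database already exists) is left to the application.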