Just out of curiosity, what is the preferred way to achieve interprocess synchronization on Linux? The sem*(2) family of system calls seems to have a very clunky and dated interface, while there are three ways to lock files: fcntl(), flock() and lockf().
What are the internal differences (if any), and how would you justify the usage of each?
Neither. Current versions of the pthread_* primitives (e.g. pthread_mutex_t) all allow you to place the variables in shared segments that are created via shm_open. You just have to pass an extra attribute to the init calls.
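To illustrate that approach, here is a minimal sketch (my own example, not from this answer): a pthread_mutex_t placed in a segment created with shm_open and initialized with the PTHREAD_PROCESS_SHARED attribute. The segment name /demo_lock and the struct are invented for the demo.

/* Sketch: process-shared pthread mutex living in a shm_open segment. */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct shared_area {
    pthread_mutex_t lock;
    int counter;
};

int main(void)
{
    /* Create (or open) the shared segment and size it. */
    int fd = shm_open("/demo_lock", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(struct shared_area)) < 0) { perror("ftruncate"); return 1; }

    struct shared_area *area = mmap(NULL, sizeof(*area),
                                    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (area == MAP_FAILED) { perror("mmap"); return 1; }

    /* The "extra parameter": mark the mutex as process-shared before init.
     * In a real program only the first process should perform the init. */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&area->lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&area->lock);
    area->counter++;                    /* critical section shared across processes */
    pthread_mutex_unlock(&area->lock);

    printf("counter = %d\n", area->counter);
    return 0;
}

If a lock holder may crash, a robust mutex (pthread_mutexattr_setrobust) is worth considering as well, so the lock is not lost with the dead process.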
Don't use semaphores (sem_t) if you don't have to: they are too low-level and get interrupted by IO etc.
Don't abuse file locking for interprocess control. It is not made for that. In particular, you have no way to flush file meta-data such as locks, so you never know when a lock / unlock will become visible to a second process.
You are suffering from a wealth of choices born of a rich history, as DarkDust noted. For what it's worth, my decision tree goes something like this:
Use mutexes when only one process/thread can have access at a time.
Use semaphores when two or more (but nevertheless a finite number of) processes/threads can use a resource.
Use POSIX semaphores unless you really need something SysV semaphores have - e.g. UNDO, PID of last operation, etc. (see the sketch after this list).
Use file locking on files, or when the above don't fit your requirements in some way.
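As a concrete illustration of the POSIX-semaphore branch, here is a small sketch of my own (not from the answer; the name /my_resource is invented): a named semaphore created with sem_open, which unrelated processes can open by name.

/* Sketch: POSIX named semaphore used as an interprocess lock. */
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>

int main(void)
{
    /* Create the semaphore if it doesn't exist yet, initial value 1 (binary use). */
    sem_t *sem = sem_open("/my_resource", O_CREAT, 0600, 1);
    if (sem == SEM_FAILED) { perror("sem_open"); return 1; }

    sem_wait(sem);                  /* acquire */
    /* ... touch the shared resource ... */
    sem_post(sem);                  /* release */

    sem_close(sem);
    /* sem_unlink("/my_resource"); -- call once, when no process needs it any more */
    return 0;
}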
The different locking/semaphore implementations all came to life on different systems. On System V Unix you had semget/semop, POSIX defined a different implementation with sem_init, sem_wait and sem_post, and flock originated in 4.2BSD, as far as I could find out.
Since they all gained a certain significance, Linux now supports them all to make porting easy. Also, flock is a mutex (either locked or unlocked), but the sem* functions (both SysV and POSIX) are semaphores: they allow an application to grant several processes concurrent access, e.g. you could allow 4 processes simultaneous access to a resource with semaphores. You can implement a mutex with semaphores, but not the other way round. I remember that in the excellent "Advanced UNIX Programming" by Marc J. Rochkind he demonstrated how to transmit data between processes via semaphores (very inefficiently, he did it just to prove it can be done). But I couldn't find anything reliable about efficiency.
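To make the counting behaviour concrete, here is a hedged sketch of my own (not from the answer; the key value and the count of 4 are invented) using the SysV semget/semop interface to let up to four processes hold the resource at once:

/* Sketch: SysV semaphore initialized to 4, so up to four processes
 * may hold a "slot" at the same time. */
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* The Linux semctl(2) man page requires the caller to define this union. */
union semun { int val; struct semid_ds *buf; unsigned short *array; };

int main(void)
{
    /* One semaphore in the set, created with a fixed key for the demo. */
    int semid = semget((key_t)0x1234, 1, IPC_CREAT | 0600);
    if (semid < 0) { perror("semget"); return 1; }

    /* Allow four concurrent holders.  (Only one process should do this init.) */
    union semun arg;
    arg.val = 4;
    if (semctl(semid, 0, SETVAL, arg) < 0) { perror("semctl"); return 1; }

    struct sembuf acquire = { 0, -1, SEM_UNDO };  /* take one slot */
    struct sembuf release = { 0, +1, SEM_UNDO };  /* give it back */

    if (semop(semid, &acquire, 1) < 0) { perror("semop"); return 1; }
    /* ... up to four processes can be in this section at the same time ... */
    if (semop(semid, &release, 1) < 0) { perror("semop"); return 1; }

    return 0;
}

SEM_UNDO is the kind of SysV-only feature mentioned above: the kernel reverses the operation if the process exits while holding a slot.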
I guess it's more like "Use what you want".
A potentially significant difference might be the fairness of the resource distribution. I don't know the details of the implementation of the semget/semop family, but I suspect that it is typically implemented as a "traditional" semaphore as far as scheduling goes. Generally, I believe the released threads are handled on a FIFO basis (first one waiting for the semaphore is released first). I don't think this would happen with file locking since I suspect (again just guessing) that the handling is not performed at the kernel level.
I had existing code sitting around to test semaphores for IPC purposes, so I compared the two situations (one using semop and one using lockf). I did a poor man's test and just ran two instances of the application. The shared semaphore was used to sync the start. When running the semop test, both processes finished 3 million loops almost in sync. The lockf loop, on the other hand, was not nearly as fair. One process would typically finish while the other one had only completed half the loops.
The loop for the semop test looked like the following. The semwait and semsignal functions are just wrappers for the semop calls.
ct = myclock();
for ( i = 0; i < loops; i++ )
{
    ret = semwait( "test", semid, 0 );
    if ( ret < 0 ) { perror( "semwait" ); break; }

    /* print progress every 128 iterations */
    if (( i & 0x7f ) == 0x7f )
        printf( "\r%d%%", (int)(i * 100.0 / loops ));

    ret = semsignal( semid, 0 );
    if ( ret < 0 ) { perror( "semsignal" ); break; }
}
printf( "\nsemop time: %d ms\n", myclock() - ct );
The total run time for both methods was about the same, although the lockf version was sometimes actually faster overall because of the unfair scheduling: once the first process finished, the other process had uncontested access for about 1.5 million iterations and ran extremely fast.
When running uncontested (single process obtaining and releasing the locks), the semop version was faster. It took about 2 seconds for 1 million iterations while the lockf version took about 3 seconds.
This was run on the following version:
[]$ uname -r
2.6.11-1.1369_FC4smp