I'm totally blown away by the quality of the Windows SRW (slim reader/writer) lock implementation. It's faster than critical sections and has just a few bytes of memory overhead.
Unfortunately, it is only available on Windows Vista/Windows 7.
As this is a pure user-land implementation, does anybody know if there is a cross-platform implementation of it? Has anybody reverse-engineered their solution?
And please, I don't want to add something like Boost just to pull in a solution of less than 100 LOC.
If you want something "portable" in the sense of conforming to some standard: if you are using POSIX threads, there is pthread_rwlock_init() and friends. These are of course not typically used on Windows, but rather on Unix-type OSes.
But if you mean "portable" in the sense of "portable to multiple versions of Windows", there are some undocumented calls in ntdll which implement RW locks: RtlAcquireResourceShared() and RtlAcquireResourceExclusive().
Here are some prototypes from WINE's implementation:
void WINAPI RtlInitializeResource(LPRTL_RWLOCK rwl);
void WINAPI RtlDeleteResource(LPRTL_RWLOCK rwl);
BYTE WINAPI RtlAcquireResourceExclusive(LPRTL_RWLOCK rwl, BYTE fWait);
BYTE WINAPI RtlAcquireResourceShared(LPRTL_RWLOCK rwl, BYTE fWait);
void WINAPI RtlReleaseResource(LPRTL_RWLOCK rwl);
Note that you may have to GetProcAddress() these from ntdll.dll yourself.
As for the structure referenced... Here's what WINE declares:
typedef struct _RTL_RWLOCK {
    RTL_CRITICAL_SECTION rtlCS;
    HANDLE hSharedReleaseSemaphore;
    UINT uSharedWaiters;
    HANDLE hExclusiveReleaseSemaphore;
    UINT uExclusiveWaiters;
    INT iNumberActive;
    HANDLE hOwningThreadId;
    DWORD dwTimeoutBoost;
    PVOID pDebugInfo;
} RTL_RWLOCK, *LPRTL_RWLOCK;
If you don't want to use pthreads and you don't want to link to sketchy undocumented functionality, you can look up a rwlock implementation and build it yourself in terms of other operations, say InterlockedCompareExchange(), or perhaps higher-level primitives such as semaphores and events.
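As a rough sketch of the "higher-level primitives" route, here is a blocking reader-writer lock built from a mutex and a condition variable. This is an assumption-laden example, not anyone's production code: it uses pthread primitives in place of Windows semaphores and events so that it runs on POSIX systems, the simple_rwlock name and functions are invented for illustration, and it makes no attempt at fairness (writers can starve while readers keep arriving).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical blocking rwlock: a mutex protects the two counters,
   and a condition variable parks threads that cannot enter yet. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int readers;   /* number of active readers */
    int writer;    /* 1 while a writer holds the lock */
} simple_rwlock;

/* Returns 0 on success, otherwise the pthread error code. */
int simple_rwlock_init(simple_rwlock *l) {
    l->readers = 0;
    l->writer = 0;
    int rc = pthread_mutex_init(&l->mu, NULL);
    if (rc != 0) return rc;
    rc = pthread_cond_init(&l->cv, NULL);
    if (rc != 0) pthread_mutex_destroy(&l->mu);
    return rc;
}

void simple_rwlock_rdlock(simple_rwlock *l) {
    pthread_mutex_lock(&l->mu);
    while (l->writer)                      /* readers wait only for a writer */
        pthread_cond_wait(&l->cv, &l->mu);
    l->readers++;
    pthread_mutex_unlock(&l->mu);
}

void simple_rwlock_rdunlock(simple_rwlock *l) {
    pthread_mutex_lock(&l->mu);
    assert(l->readers > 0);
    if (--l->readers == 0)                 /* last reader out wakes writers */
        pthread_cond_broadcast(&l->cv);
    pthread_mutex_unlock(&l->mu);
}

void simple_rwlock_wrlock(simple_rwlock *l) {
    pthread_mutex_lock(&l->mu);
    while (l->writer || l->readers > 0)    /* writers need the lock empty */
        pthread_cond_wait(&l->cv, &l->mu);
    l->writer = 1;
    pthread_mutex_unlock(&l->mu);
}

void simple_rwlock_wrunlock(simple_rwlock *l) {
    pthread_mutex_lock(&l->mu);
    assert(l->writer == 1);
    l->writer = 0;
    pthread_cond_broadcast(&l->cv);        /* wake readers and writers */
    pthread_mutex_unlock(&l->mu);
}
```

Every acquire and release goes through the mutex here, so this version will not come close to SRW lock performance; it only shows the state that has to be tracked.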
You can certainly roll your own using the same ideas as slim rwlock (at least what I imagine they did, since this is fairly straightforward). I outlined the approach in some detail in this other question.
For your case, you can mostly ignore the "fair" aspect, but the implementation is essentially the same. In particular, if you are willing to let an indefinite stream of readers block writers, you always let readers in when the lock already has readers in it (i.e., state (2) and (3) more or less collapse together).
In your case, for the cross-platform angle, you would need to implement the blocking with either Windows events or pthread condvars, but the details are similar in either case. Or, if you really want to avoid blocking at all, your only choice is spinning (ideally using the pause instruction to be nice to the CPU), which makes things even easier by removing the whole fallback to blocking code.
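To make the spinning variant concrete, here is a minimal sketch using C11 atomics. It is not the SRW lock algorithm and not the closed-source implementation mentioned in this answer; the spin_rwlock name, the state layout (top bit for the writer, low bits for the reader count), and the SPIN_PAUSE macro are all assumptions chosen for the example. It is reader-preferring, so a steady stream of readers can starve writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define SPIN_PAUSE() _mm_pause()   /* x86 pause instruction */
#else
#define SPIN_PAUSE() ((void)0)     /* no-op fallback on other CPUs */
#endif

#define SPIN_WRITER 0x80000000u    /* top bit: a writer holds the lock */

/* Hypothetical spin rwlock; the low 31 bits of state count active readers. */
typedef struct { _Atomic uint32_t state; } spin_rwlock;

void spin_rwlock_init(spin_rwlock *l) { atomic_init(&l->state, 0); }

/* Succeeds whenever no writer holds the lock, even if readers are present. */
bool spin_rwlock_try_rdlock(spin_rwlock *l) {
    uint32_t s = atomic_load_explicit(&l->state, memory_order_relaxed);
    while (!(s & SPIN_WRITER)) {
        /* On failure the CAS reloads s, and the loop re-checks the writer bit. */
        if (atomic_compare_exchange_weak_explicit(&l->state, &s, s + 1,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

void spin_rwlock_rdlock(spin_rwlock *l) {
    while (!spin_rwlock_try_rdlock(l)) SPIN_PAUSE();
}

void spin_rwlock_rdunlock(spin_rwlock *l) {
    uint32_t prev = atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
    assert((prev & ~SPIN_WRITER) > 0);  /* a reader must have held the lock */
    (void)prev;
}

/* Succeeds only when there are no readers and no writer (state == 0). */
bool spin_rwlock_try_wrlock(spin_rwlock *l) {
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(&l->state, &expected,
            SPIN_WRITER, memory_order_acquire, memory_order_relaxed);
}

void spin_rwlock_wrlock(spin_rwlock *l) {
    while (!spin_rwlock_try_wrlock(l)) SPIN_PAUSE();
}

void spin_rwlock_wrunlock(spin_rwlock *l) {
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

On Windows the compare-and-swap would map to InterlockedCompareExchange(); a blocking version would add a wait on an event or condvar where this sketch spins.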
A good implementation is probably a couple hundred LOC. I wrote one (closed source, so I cannot share it) and it performs excellently (better than the slim lock, in fact).