
2 questions on locks performance/use

On the server application, I need to assign a unique ID to each connected client, so I am doing it this way:

private short GetFreeID()
{
    // ClientIDPool: collection of flags where "true" marks an ID already in use.
    lock (this.mUsedPlayerIDsSynchronization)
    {
        // Scan for the first unused ID.
        for (short I = 1; I < 500; I++)
        {
            if (ClientIDPool[I] == false)
            {
                ClientIDPool[I] = true;
                return I;
            }
        }
        return -1; // no free ID left
    }
}

My first question: could this be done more efficiently, i.e. with better performance? I have read here that we should learn to write code without locks, and that for some atomic operations there are other options.

Second question: what if I wanted to lock the whole class so that no changes can be made within it? For example: one client will update a second client's data; can I lock the whole second client class so that it is absolutely blocked? I still think "lock" only makes sure that the code inside its block is entered by one thread at a time, so I do not know whether "lock(client2)" means that nothing in that class can be changed until the lock is released.


Locks are often the simplest way of getting things right, which is very important. Quite often it doesn't matter if there's a more efficient way of doing things, so long as you've got clear code and it performs well enough.

A more performant approach here, however, would be to either generate a random GUID, or if you do want to reuse IDs, have a "pool" (e.g. a LinkedList) of unused IDs. You could then take from the pool very quickly, and return the ID to the pool (again quickly) once you're done.

Alternatively, if you really just need an integer and it doesn't have to be a low one, you could have a static variable which starts at 0 and which you just increment each time - you can do this without locking using Interlocked.Increment should you wish. I doubt that you'll run out of 64-bit integers, for example :)
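For example, a minimal sketch of that counter approach (the type and field names are my own):

using System.Threading;

class ClientIdGenerator
{
    private static long nextId; // 64-bit, so running out is not a practical concern

    // Thread safe without any lock: the increment is a single atomic operation.
    public static long NextId()
    {
        return Interlocked.Increment(ref nextId);
    }
}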

As for your second question: yes, locks are advisory. If everything within the class takes out the same lock before changing any fields (and the fields are private) then that prevents other code from misbehaving... but each bit of code does need to take out the lock.

EDIT: If you really only need an integer, I would still suggest just using Interlocked.Increment - even if your traffic increases 1000-fold, you could use a 64 bit integer instead. However, if you want to reuse IDs then I'd suggest creating a new type to represent the "pool". Give that a counter of how many have been created, so that if you run out you can assign a new item. Then just store the available ones in a Queue<int>, LinkedList<int> or Stack<int> (it's not going to matter very much which you use). Assuming you can trust your own code to return IDs sensibly, you can make the API as simple as:

int AllocateID()
void ReturnID(int id)

AllocateID would check to see if the pool is empty, and allocate a new ID if so. Otherwise, it would just remove the first entry from the pool and return that. ReturnID would just add the specified ID to the pool.
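A rough sketch of such a pool type, using Queue&lt;int&gt; (the internal names and locking choice are my own):

using System.Collections.Generic;

class IdPool
{
    private readonly object sync = new object();
    private readonly Queue<int> available = new Queue<int>();
    private int created; // how many IDs have been handed out so far

    public int AllocateID()
    {
        lock (sync)
        {
            // Reuse an ID from the pool if there is one; otherwise mint a new one.
            return available.Count > 0 ? available.Dequeue() : ++created;
        }
    }

    public void ReturnID(int id)
    {
        lock (sync)
        {
            available.Enqueue(id);
        }
    }
}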


You are locking while scanning through an array.

You'd be better off having two stacks: one with the free IDs and one with the IDs in use. That way you can just pop an ID off the first stack and push it onto the second.

This way you hold the lock for a much shorter time.
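A rough sketch of that idea, assuming the 1–499 ID range from the question (I use a HashSet for the "in use" side so a specific ID can be handed back later):

using System.Collections.Generic;

class StackIdPool
{
    private readonly object sync = new object();
    private readonly Stack<short> free = new Stack<short>();
    private readonly HashSet<short> inUse = new HashSet<short>();

    public StackIdPool()
    {
        // Push in reverse order so the lowest IDs are handed out first.
        for (short id = 499; id >= 1; id--)
            free.Push(id);
    }

    public short GetFreeID()
    {
        lock (sync) // held only for a pop and an add, not a full array scan
        {
            if (free.Count == 0)
                return -1;
            short id = free.Pop();
            inUse.Add(id);
            return id;
        }
    }

    public void Release(short id)
    {
        lock (sync)
        {
            if (inUse.Remove(id))
                free.Push(id);
        }
    }
}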


You can allocate state in thread-local memory. Thread-local memory is thread safe (as long as you don't pass pointers around).

You can use two integers to generate the unique number, and only one of them needs to be synchronized.

Integer 1: an incrementing integer which represents the thread; a new number is generated each time you initialize a thread (which should be a rare event).

Integer 2: on thread initialization you start this integer at 0.

You will use both integers together -- both stored in thread-local memory -- as the unique ID, and integer 2 is incremented normally (unlocked).

This way the generation of unique integers is absolutely thread safe -- i.e., you don't have to use an atomic CPU instruction such as Interlocked.Increment (which does incur hardware-level performance penalties).
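A minimal sketch of that scheme, assuming it is acceptable to pack both integers into one 64-bit ID (the names and the bit layout are my own choices):

using System;
using System.Threading;

static class ThreadLocalIdGenerator
{
    // Integer 1: assigned once per thread, so this Interlocked call is rare and cheap.
    private static int nextThreadSlot;
    private static readonly ThreadLocal<long> threadSlot =
        new ThreadLocal<long>(() => Interlocked.Increment(ref nextThreadSlot));

    // Integer 2: per-thread counter, incremented without any synchronization.
    [ThreadStatic]
    private static long counter;

    public static long NextId()
    {
        counter++; // unlocked: only the current thread ever touches this field
        // Thread slot in the high bits, per-thread counter in the low bits.
        return (threadSlot.Value << 32) | (counter & 0xFFFFFFFF);
    }
}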

-- edit : cache coherence --

from :

Cache Coherence

To decrease the time required for memory access, caches are used: recently accessed memory is duplicated in the CPU cache, which is significantly faster than main memory. Future accesses to the same address use the data saved in the cache, decreasing fetch time. Problems appear in SMP (symmetric multiprocessing) systems, where several processors each have their own cache memory: when one processor changes a variable in a memory region used by several processors simultaneously, it actually changes its own copy of the variable, located in its cache, while the shared variable still has the original value. This problem cannot be solved by using the volatile keyword on a shared variable, since that only guarantees that a write-to-memory instruction will be present in the resulting program; the cache-related behaviour remains unspecified. Of course, it is possible to disable the CPU cache by mapping memory as no-cache (the PAGE_NOCACHE protection flag in the VirtualAlloc() Win32 API function), but along with a significant slowdown this imposes some limitations: for example, interlocked instructions may raise a hardware exception on no-cache memory.

For SMP systems to work correctly, data stored in the cache of more than one processor should be the same in all caches. This means that CPU caches must be synchronized (kept coherent) at the hardware level. But it is important to note that cache synchronization (the flow of cache-coherency traffic) happens asynchronously with program execution: when one CPU changes the value of a shared variable, another CPU temporarily observes the old value. This means the CPU continues execution without waiting for the cache-coherency operation to be completed. Furthermore, if two variables (a then b) were changed by the first CPU, another CPU could observe that b has changed earlier than a.

Interlocked instructions differ considerably on this point. Specifically, an interlocked instruction is a command to operate directly on physical memory under a locked bus. This means that cache inconsistency does not affect programs where shared variables are accessed only with interlocked instructions (note that both sides, the one that reads a variable and the one that writes to it, should use interlocked instructions).

-- edit : further clarification --

Under your current design Interlocked.Increment is indeed your best bet, but it is far from ideal. Your CPU has a very fast on-chip cache (often clocked at the same speed as the CPU). If you have memory local to your thread, it will be pulled into the cache of the CPU your thread is on, which means your CPU won't have to go to main memory and can fly at full speed.

If you use Interlocked.Increment your CPU will have to:

  1. Lock the bus.
  2. Increment the 32-bit word.
  3. Release the bus.

You can do without this. I might appear to be pedantic, as the overhead might only amount to a roughly 100% relative slowdown of a single increment. However, in an industrial server app with 4 physical CPUs and 16 cores, with this UID generator firing on every request... trust me, your bus will be screwed. Micro-optimization is an important area in programming, especially as we are scaling horizontally now.


I'd suggest you use GUIDs, unless you are required to use a short. Guid.NewGuid() is thread safe, so you eliminate the need for locks or other synchronization mechanisms.
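For example (the variable name is my own):

using System;

// Guid.NewGuid() is safe to call from any thread; no locking required.
string clientId = Guid.NewGuid().ToString();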


Do you care about the ID returned? You could either increment the client ID using Interlocked.Increment each time or generate a GUID (the former is likely to be faster).

Then use a simple counter to track how many clients are connected rather than scanning the array every time.
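A small sketch of that combination (the type and member names are my own):

using System.Threading;

class ClientTracker
{
    private long nextId;        // incremented once per new client ID
    private int connectedCount; // how many clients are currently connected

    public long OnClientConnected()
    {
        Interlocked.Increment(ref connectedCount);
        return Interlocked.Increment(ref nextId); // the new client's ID
    }

    public void OnClientDisconnected()
    {
        Interlocked.Decrement(ref connectedCount);
    }

    public int ConnectedCount
    {
        get { return Volatile.Read(ref connectedCount); }
    }
}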
