I have a generic Dictionary that I am using as a cache in a threaded .NET project (C#). I will be doing a lot of reads on the dictionary (potentially hundreds or more per second at peak times).
I have a simple Get(int id) method that should return a CustomObject if it's in the cache, or null otherwise.
My question: Is it faster/more efficient to lock the dictionary, or just use a try/catch block?
Assume the dictionary is stored in a variable named "dic".
Lock Sample:
public CustomObject Get(int id)
{
lock(dic)
{
if (dic.ContainsKey(id))
return dic[id];
}
return null;
}
Try/Catch Sample:
public CustomObject Get(int id)
{
try
{
return dic[id];
}
catch
{
return null;
}
}
I think you should test it in your own environment. Basically:
- Lock is cheap
- A try block that does not throw is cheap, maybe even cheaper than lock
- A try block that throws is very expensive
So the question is how often you expect a cache miss, and therefore a thrown exception. I would go for lock(), as its execution time does not depend on whether you get a cache hit. That makes it more predictable and measurable while still being very cheap. I don't think that hundreds of hits per second would be any problem.
Simple tests I've made indicate that a cache miss with try/catch is very, very expensive.
Edit:
Simple test shows that:
- try without a throw costs about 2 ms for 100k retrievals
- lock costs about 6 ms
- try with a throw costs about 4 seconds
Which means: go for lock(), because it is more efficient than try/catch if you get more than one cache miss per few thousand tries, and it is much more stable because it does not depend on luck.
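As a minimal sketch of the lock-based option, the lookup can use Dictionary.TryGetValue, which avoids both the exception on a miss and the double lookup of ContainsKey followed by the indexer. The class name LockedCache, the Set method, and the private sync object are hypothetical names added for illustration; CustomObject is a stand-in for the type from the question.

```csharp
using System.Collections.Generic;

// Stand-in for the question's type (assumption for this sketch).
public class CustomObject { }

// Hypothetical wrapper class; not part of the original posts.
public class LockedCache
{
    // Dedicated lock object, so no outside code can lock on the dictionary itself.
    private readonly object sync = new object();
    private readonly Dictionary<int, CustomObject> dic = new Dictionary<int, CustomObject>();

    public void Set(int id, CustomObject value)
    {
        lock (sync)
        {
            dic[id] = value; // adds the entry or overwrites an existing one
        }
    }

    // Returns the cached object, or null on a cache miss. A miss never throws.
    public CustomObject Get(int id)
    {
        lock (sync)
        {
            CustomObject value;
            return dic.TryGetValue(id, out value) ? value : null;
        }
    }
}
```

With this shape the cost of Get is the same for a hit and a miss, which is the predictability argument made above.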
You can go ahead and write off the try-catch option. I do not know whether it is slower, but I do know that it will not always yield correct, consistent, and predictable results if another thread is updating the Dictionary. The problem is that at some point the writer will have the Dictionary in a half-baked state, and there is no telling what the readers will see. This just will not work.
Option 1: If .NET 4.0 is available to you, then I would use ConcurrentDictionary.
Option 2: If you are using .NET 3.5, then you can download the Reactive Extensions backport. ConcurrentDictionary is included in System.Threading.dll.
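For the ConcurrentDictionary route, a minimal sketch of the cache might look like the following. The class name ConcurrentCache and its Set method are hypothetical, and CustomObject is a stand-in for the question's type. No explicit lock is needed, because ConcurrentDictionary handles synchronization internally.

```csharp
using System.Collections.Concurrent;

// Stand-in for the question's type (assumption for this sketch).
public class CustomObject { }

// Hypothetical wrapper class; not part of the original posts.
public class ConcurrentCache
{
    private readonly ConcurrentDictionary<int, CustomObject> dic =
        new ConcurrentDictionary<int, CustomObject>();

    public void Set(int id, CustomObject value)
    {
        dic[id] = value; // thread-safe add or overwrite
    }

    // Returns the cached object, or null on a cache miss.
    public CustomObject Get(int id)
    {
        CustomObject value;
        return dic.TryGetValue(id, out value) ? value : null;
    }
}
```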
Option 3: Another idea is to keep two separate copies of the Dictionary. One would only be used for reading, while the other would serve as the official copy that accepts updates. Any time you update the "official" Dictionary, you would clone it and overwrite the reference of the copy.
public class Example
{
// This is the official version which can accept updates.
private readonly Dictionary<int, CustomObject> official = new Dictionary<int, CustomObject>();
// This is a readonly copy. This must be marked as volatile for this to work correctly.
private volatile Dictionary<int, CustomObject> copy = new Dictionary<int, CustomObject>();
public Example()
{
}
public void Set(int id, CustomObject value)
{
lock (official)
{
// Update the official dictionary.
official[id] = value;
// Now create a clone of the official dictionary.
var clone = new Dictionary<int, CustomObject>();
foreach (var kvp in official)
{
clone.Add(kvp.Key, kvp.Value);
}
// Swap out the reference.
copy = clone;
}
}
public CustomObject Get(int id)
{
// No lock is required here.
CustomObject value = null;
if (copy.TryGetValue(id, out value))
{
return value;
}
return null;
}
}
This option does not work well if there are a lot of items in the Dictionary or if updates to the official copy happen frequently. But it is a trick I do use from time to time.
Option 4: An equally reasonable approach would be to stick with the plain old lock.
I would go with lock/Interlocked in your scenario. You could get corrupted or invalid data if you don't lock while something else writes to the dictionary.
If you are very concerned about performance, you can use the Interlocked class. There are many techniques for using it to achieve locking with better performance than lock.
An implementation would be possible by using Interlocked.CompareExchange and using a control variable.
Interlocked sample:
I found this sample on the Microsoft site (copied here so that you can see it):
Source: http://msdn.microsoft.com/en-us/library/system.threading.interlocked(v=VS.100).aspx#Y1291
if(0 == Interlocked.Exchange(ref usingResource, 1))
{
Console.WriteLine("{0} acquired the lock", Thread.CurrentThread.Name);
//Code to access a resource that is not thread safe would go here.
//Simulate some work
Thread.Sleep(500);
Console.WriteLine("{0} exiting lock", Thread.CurrentThread.Name);
//Release the lock
Interlocked.Exchange(ref usingResource, 0);
return true;
}
else
{
Console.WriteLine(" {0} was denied the lock", Thread.CurrentThread.Name);
return false;
}
Note that, in this sample, almost everything inside the if block is locked!
The variable usingResource is a field of the class, passed by reference to the method. It is the control variable I mentioned earlier.
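The CompareExchange variant mentioned above could be sketched as follows, applied to the cache from the question. This is only an illustration under stated assumptions: SpinGuardedCache, Enter, Exit, and Set are hypothetical names, CustomObject is a stand-in type, and the guard is not reentrant (a thread that calls Enter twice without Exit will spin forever). Whether this actually beats lock should be measured for your workload.

```csharp
using System.Collections.Generic;
using System.Threading;

// Stand-in for the question's type (assumption for this sketch).
public class CustomObject { }

// Hypothetical class; not part of the original posts.
public class SpinGuardedCache
{
    // Control variable: 0 = free, 1 = taken.
    private int usingResource = 0;
    private readonly Dictionary<int, CustomObject> dic = new Dictionary<int, CustomObject>();

    private void Enter()
    {
        var spinner = new SpinWait();
        // Atomically change 0 -> 1. If the old value was not 0, another
        // thread holds the guard, so back off briefly and retry.
        while (Interlocked.CompareExchange(ref usingResource, 1, 0) != 0)
        {
            spinner.SpinOnce();
        }
    }

    private void Exit()
    {
        // Release the guard.
        Interlocked.Exchange(ref usingResource, 0);
    }

    public void Set(int id, CustomObject value)
    {
        Enter();
        try { dic[id] = value; }
        finally { Exit(); } // always release, even if the write throws
    }

    // Returns the cached object, or null on a cache miss.
    public CustomObject Get(int id)
    {
        Enter();
        try
        {
            CustomObject value;
            return dic.TryGetValue(id, out value) ? value : null;
        }
        finally { Exit(); }
    }
}
```

Unlike the MSDN sample, which gives up when the resource is busy, this sketch waits until the control variable becomes free, so every Get and Set eventually runs.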
It is always safer and cleaner to go with locks. However, the performance will vary based on how many threads are contending to acquire the lock.
Locks & Contention => NOT CHEAP!!
Please note that a lock is not free. Even when it is NOT contended, lock() takes at minimum about 50 ns. With contention it gets very bad, because waiting threads may have to be blocked and woken up again by the operating system.
A benchmark result like 6 ms per 100k retrievals with lock() is consistent with an uncontended lock(): it works out to 60 ns per locked retrieval. On multicore machines, a concurrent data structure protected by synchronization primitives has to be benchmarked with contention from multiple threads; only then do we learn how it really behaves.
When several threads try to take the lock at the same time, things get unhappy very fast: waiting threads are descheduled and rescheduled, sleeping threads are subject to the scheduler's time slices, and contention quickly pushes the cost from tens or hundreds of nanoseconds to tens of milliseconds.
Basically, lock is the root of all evil if you want high throughput when pushing data through a data structure shared between multiple threads on a multicore CPU, or high responsiveness between threads.
For that, you have to look the other way: concurrent data structures based on spinlocks, and redesigning the way threads interact so that they are loosely coupled and each interaction takes minimal time and touches a minimal number of items.
Probably the best choice here is ConcurrentDictionary from the TPL, available for .NET 3.5 (look for TPL on NuGet) and in newer versions.
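To see the effect of contention rather than take any number on faith, a rough probe along these lines can be run with one thread and then with several. It is a sketch, not a rigorous benchmark: ContentionProbe and Measure are hypothetical names, the dictionary size of 1000 is arbitrary, and the timings depend entirely on your hardware and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper; not part of the original posts.
public static class ContentionProbe
{
    // Starts threadCount threads that each perform readsPerThread locked
    // lookups, and returns the wall-clock time until all of them finish.
    public static TimeSpan Measure(int threadCount, int readsPerThread)
    {
        var sync = new object();
        var dic = new Dictionary<int, string>();
        for (int i = 0; i < 1000; i++) dic[i] = i.ToString();

        long hits = 0;
        var start = new ManualResetEventSlim(false);
        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                start.Wait(); // release all threads together
                long local = 0;
                for (int i = 0; i < readsPerThread; i++)
                {
                    lock (sync)
                    {
                        string value;
                        if (dic.TryGetValue(i % 1000, out value)) local++;
                    }
                }
                Interlocked.Add(ref hits, local);
            });
            threads[t].Start();
        }

        var watch = Stopwatch.StartNew();
        start.Set();
        foreach (var thread in threads) thread.Join();
        watch.Stop();

        // Sanity check: every lookup should have been a hit.
        if (hits != (long)threadCount * readsPerThread)
            throw new InvalidOperationException("unexpected number of hits");
        return watch.Elapsed;
    }
}
```

For example, comparing ContentionProbe.Measure(1, 100000) with ContentionProbe.Measure(8, 100000) gives a first impression of how the same lock behaves with and without competing threads. Swapping the locked Dictionary for a ConcurrentDictionary in the loop body gives the corresponding comparison for that option.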
This is an improved version of the Example class from Brien Gideon's answer. It is a good idea when reads outnumber writes many times over. I made the class generic, replaced volatile with Interlocked methods, and repaired one possible race condition during first-time initialization.
using System.Threading;
using System.Collections.Generic;
public class Example<TKeys, TValues>
{
private readonly object obj = new object();
// Read/Write version.
private readonly Dictionary<TKeys, TValues> official = new Dictionary<TKeys, TValues>();
// Read/Replace version.
// It must be initialized as its own new, empty dictionary as well.
// The two fields cannot share a reference, or the class would not be
// thread safe initially, because Set would be inserting data into
// official and copy at the same time.
private Dictionary<TKeys, TValues> copy = new Dictionary<TKeys, TValues>();
public void Set(TKeys id, TValues value)
{
lock (obj)
{
// Update the official dictionary.
official[id] = value;
// Now create a clone of official dictionary
var clone = new Dictionary<TKeys, TValues>(official);
// interlocked sets atomically latest reference
Interlocked.Exchange(ref copy, clone);
}
}
public bool TryGetValue(TKeys id, out TValues value)
{
// Interlocked reads the latest reference atomically.
// Exposing the full TryGetValue also makes this safer for struct
// value types: on a miss the caller gets false instead of
// mistaking the default value for a real entry.
return
Interlocked
.CompareExchange(ref copy, null, null)
.TryGetValue(id, out value)
;
}
}