I have a HashMap in my application. The map is in a singleton, and access to update or read it is protected using synchronized methods.
My problem occurs when testing with large numbers (20,000+) of concurrent threads. When threads are writing to the map using put(), I'm getting an OutOfMemory exception.
Read operations are fine (I can simulate 1,000,000+ threads) without any issue.
Any recommendations on how I can make my HashMap more performant for writes? This may also be a limitation of my approach of storing so much data in memory?
I suspect you're running out of memory because of the sheer number of threads rather than the map itself. The message on the OutOfMemoryError should tell you whether it's the heap, PermGen, or a failure to create a new native thread.
Each thread in Java reserves roughly 256-512 KB for its stack, allocated from native memory outside the heap. So 20,000 threads * 256 KB is about 5 GB, which is far beyond what a typical JVM process can accommodate.
You should limit the number of threads to a few hundred at most. Take a look at Java 5/6's java.util.concurrent package, in particular ThreadPoolExecutor.
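As a rough illustration (not from the original answer; the class name and pool size below are just placeholders), a fixed-size pool lets you submit 20,000 tasks without creating 20,000 thread stacks:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class BoundedWriters {
        public static void main(String[] args) throws InterruptedException {
            // A few hundred worker threads instead of one thread per task.
            ExecutorService pool = Executors.newFixedThreadPool(200);

            for (int i = 0; i < 20000; i++) {
                final int taskId = i;
                pool.submit(new Runnable() {
                    public void run() {
                        // the map.put(...) call would go here
                        doWrite(taskId);
                    }
                });
            }

            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }

        private static void doWrite(int taskId) {
            // placeholder for the real write to the shared map
        }
    }

Each submitted task reuses a worker thread's stack, so memory use stays bounded no matter how many tasks you queue.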
Sounds like your problem is memory, not performance.
Try writing the least recently accessed keys and values (those sharing a hash code) to a file and clearing them from memory.
If a key whose entries were spilled to a file is then requested, write the next least recently used hash code's keys and values to a file, clear them from memory, and read the requested file back into memory.
Consider multiple levels of hashmaps (each with different keys) to improve the performance of this; a rough sketch of the in-memory eviction part is shown below.
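A minimal sketch of the in-memory side of this idea, assuming a single-level map. The BoundedLruMap name and the spillToDisk hook are made up here, and the map would still need external synchronization, since LinkedHashMap is not thread-safe:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Access-ordered map that evicts the least recently used entry
    // once a size limit is reached; evicted entries could be spilled to disk.
    class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
        private final int maxInMemoryEntries;

        BoundedLruMap(int maxInMemoryEntries) {
            super(16, 0.75f, true); // true = access order, so iteration tracks recency
            this.maxInMemoryEntries = maxInMemoryEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            if (size() > maxInMemoryEntries) {
                spillToDisk(eldest.getKey(), eldest.getValue()); // hypothetical overflow hook
                return true; // drop the entry from memory
            }
            return false;
        }

        private void spillToDisk(K key, V value) {
            // serialization to a file (and reloading on a miss) is omitted here
        }
    }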
Have you tried a ConcurrentHashMap? Under the right coding circumstances you won't need any synchronization. There are multiple striped locks internally to reduce contention, and many nice compound atomic operations like putIfAbsent that may allow you to drop external locks entirely.
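For example, a get-or-create operation can be made atomic without any external lock (the Registry and Session names below are purely illustrative):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class Registry {
        private final ConcurrentMap<String, Session> sessions =
                new ConcurrentHashMap<String, Session>();

        // No external synchronized block needed: putIfAbsent is atomic.
        Session getOrCreate(String userId) {
            Session existing = sessions.get(userId);
            if (existing != null) {
                return existing;
            }
            Session fresh = new Session(userId);
            Session raced = sessions.putIfAbsent(userId, fresh);
            return raced != null ? raced : fresh;
        }

        static class Session {
            final String userId;
            Session(String userId) { this.userId = userId; }
        }
    }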
As for memory, I suspect you really are storing a lot in the JVM. Use a monitoring tool like VisualVM to check it out, or add more memory to the JVM allocation. Also consider a cache like EHCache, which internally uses a ConcurrentHashMap, can automatically overflow to disk, and has all kinds of nice bounding options.
If you are using JDK 1.5+, ConcurrentHashMap is a good choice; it is designed for exactly this kind of concurrent access.
See: What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?
Also, I think put() may allocate new memory inside the map and so take more time, while get() does not, so more threads will end up blocked in put().
Also, optimize the hashCode() method of your key class. It's important, because hash code calculation is an intensive operation in your case. If the key object is immutable, calculate the hash code just once, save it in a field, and return it directly from hashCode().
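A minimal sketch of that pattern, assuming an immutable key with two illustrative fields:

    public final class CacheKey {
        private final String namespace;
        private final long id;
        private final int hash; // computed once because the key is immutable

        public CacheKey(String namespace, long id) {
            this.namespace = namespace;
            this.id = id;
            this.hash = 31 * namespace.hashCode() + (int) (id ^ (id >>> 32));
        }

        @Override
        public int hashCode() {
            return hash; // no recomputation on every put()/get()
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CacheKey)) return false;
            CacheKey other = (CacheKey) o;
            return id == other.id && namespace.equals(other.namespace);
        }
    }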
If you would like to keep your current implementation, you might also want to consider changing the amount of memory allocated to the application by changing the -Xms and -Xmx parameters passed to Java. Many other parameters exist as well. It may be necessary to do this regardless of the implementation used.
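For example, an invocation might look like this (the sizes and jar name are only placeholders; -Xss sets the per-thread stack size, which matters when you have many threads):

    java -Xms512m -Xmx4g -Xss256k -jar your-app.jar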
You can use ConcurrentHashMap instead, and it has several advantages over a regular map. I am not sure whether you are using Java 5; it is available only from version 5 onwards.
Also, think once again about whether your logic really requires synchronization on read operations. If it doesn't, you can remove it and gain some performance.
If you really are facing a low-memory issue, you can run the JVM with the larger memory options mentioned above. Give it a try. :)
Make the hashCode() method of your keys efficient. You can rely on libraries such as Pojomatic to do that for you.
As far as the last part of your question:
Any recommendations on how i can make my hashmap more performant for writes? This may also be a limitation with my approach of storing so much data in memory?
I use a tool to take a look at what the application is doing. It can do heap and thread dumps. It also has a monitor that displays the CPU, classes loaded, threads, heap, and PermGen. It's called Java VisualVM, and it's part of JDK 1.6; the executable is in the bin folder of the JDK. I'm going to use it for tracking down some threading issues in our code.
HTH, James
An OutOfMemoryError can be caused by the large number of objects stored, not by the large number of threads, and an OOME is not a performance problem.
BTW, you can use ConcurrentHashMap for fast concurrent reads and writes instead of one global lock.