How can I find out how much memory my cached objects are using?

Developer https://www.devze.com 2023-03-09 13:24 Source: Internet
We are trying to cache the results of database selects in a hash map so that we don't have to execute them multiple times. Whenever the database changes, we pick up the changes in the app through a "refresh list" feature that we have added.

We now have a large number of lists to fetch, so loading the pick lists from the database takes too much time.

So I have some questions regarding this issue:

  1. How can I find out how much memory a list is using? (I have tried running the garbage collector and taking the difference in memory before and after, but there are many lists, so this takes too much time.)

  2. How can I optimize the list refresh?

Thanks for the help.


How can I find out how much memory the list is using?

  • JProfiler
  • VisualVM

How can I optimize the list refresh?

Make sure you're using the correct collection type for your data. Have a look here.

Also have a look at the Guava collections.


One last thing: ignis is quite right to advise you not to use System.gc(); this might be the very reason you're having performance problems. This is why.


First, while not wanting to generalize when it comes to performance problems, the issues you're seeing are unlikely to be purely down to memory use, though if the lists are large this could come into play when they're refreshed and a large number of objects become eligible for collection.

To solve issues relating to garbage collection there are a few rules of thumb, but it always comes down to breaking out a profiler and tuning the garbage collector - there's more on that here.

But before that: any loading of data from a database is going to involve iterating over a result set, so the biggest optimization you can make will be to reduce the size of the result sets. There are a couple of ways to do that:

  1. If you are using a map, try to use keys that can be built without loading anything, and load the value only when a lookup misses.
  2. Once loaded, only refresh the rows that have changed since you last loaded the data, though this obviously doesn't solve the start-up problem.
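The second point can be sketched roughly as follows. This is only an illustration, not code from the question: the Row type, the modifiedAt field and the loadChangedSince loader are hypothetical names, and it assumes each row carries a last-modified timestamp that the query can filter on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

final class IncrementalCache {
    // Hypothetical row type: a primary key, a payload, and a last-modified timestamp.
    static final class Row {
        final int id;
        final String value;
        final long modifiedAt; // assumed: e.g. epoch millis from a modified_at column

        Row(int id, String value, long modifiedAt) {
            this.id = id;
            this.value = value;
            this.modifiedAt = modifiedAt;
        }
    }

    private final Map<Integer, Row> rows = new ConcurrentHashMap<>();
    // Hypothetical loader, e.g. backed by "SELECT ... WHERE modified_at > ?".
    private final LongFunction<List<Row>> loadChangedSince;
    private long highWaterMark = Long.MIN_VALUE;

    IncrementalCache(LongFunction<List<Row>> loadChangedSince) {
        this.loadChangedSince = loadChangedSince;
    }

    // Fetches only rows modified after the newest row seen so far;
    // returns how many rows were applied to the cache.
    synchronized int refresh() {
        List<Row> changed = loadChangedSince.apply(highWaterMark);
        for (Row r : changed) {
            rows.put(r.id, r);
            highWaterMark = Math.max(highWaterMark, r.modifiedAt);
        }
        return changed.size();
    }

    Row get(int id) {
        return rows.get(id);
    }

    public static void main(String[] args) {
        // In-memory stand-in for the database table.
        List<Row> table = new ArrayList<>();
        table.add(new Row(1, "red", 100L));
        table.add(new Row(2, "green", 100L));

        IncrementalCache cache = new IncrementalCache(since -> {
            List<Row> out = new ArrayList<>();
            for (Row r : table) {
                if (r.modifiedAt > since) out.add(r);
            }
            return out;
        });

        System.out.println(cache.refresh());    // first call loads every row
        table.set(1, new Row(2, "blue", 200L)); // simulate an update in the database
        System.out.println(cache.refresh());    // later calls load only changed rows
        System.out.println(cache.get(2).value);
    }
}
```

Note that a timestamp filter like this never sees deleted rows, so deletions would need separate handling (a soft-delete flag, for example).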

Now all that said, I would not recommend you write your own caching code in the first place. The reasons I say this are:

  1. All modern RDBMSs cache, so provided your queries are performant, getting the actual result set should not be a bottleneck.
  2. Hibernate provides not only ORM but also a robust and well-understood caching solution.
  3. If you really need to cache massive datasets, use Coherence or similar - the cache can be started in a separate JVM and your application doesn't need to take the load hit.


You have two problems here: discovering how much memory is in use, and managing a cache. I'm not sure that the two are really closely related, although they may be.

Discovering how much memory an object uses isn't extremely difficult: one excellent article you can use for reference is "Sizeof for Java" from JavaWorld. It avoids the whole garbage collection approach, which has a ton of holes in it (it's slow, and it measures the heap rather than the object, meaning that other objects you may not want factor into your results, etc.).
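If all you need is a cheap way to compare lists against each other without touching the garbage collector, one rough option is to count the bytes Java serialization produces for each list. To be clear, this is a different technique from the one in the article, and it is only a proxy: serialized size is not heap footprint, and it only works when the list and its elements are Serializable. The class name SerializedSize below is made up for this sketch. (For real shallow object sizes, java.lang.instrument.Instrumentation.getObjectSize exists, but it requires running with a Java agent.)

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

final class SerializedSize {
    // Returns the number of bytes Java serialization writes for the object graph.
    // This is a relative proxy for size, NOT the actual heap footprint.
    static long of(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        ArrayList<String> small = new ArrayList<>();
        ArrayList<String> large = new ArrayList<>();
        for (int i = 0; i < 10; i++) small.add("item-" + i);
        for (int i = 0; i < 10_000; i++) large.add("item-" + i);

        System.out.println("small: " + of(small) + " bytes serialized");
        System.out.println("large: " + of(large) + " bytes serialized");
    }
}
```

This tells you which lists are the big ones, which is usually enough to decide where to look with a profiler.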

Managing the time to initialize the cache is another problem. I work for a company that has a data grid as a product, and thus I'm biased; be aware.

One option is not using a cache at all, but using a data grid. I work for GigaSpaces Technologies, and I feel ours is the best; we can load data from a database on startup, and hold your data there as a distributed, transactional data store in memory (so your greatest cost is network access.) We have a community edition as well as full-featured platforms, depending on your need and budget. (The community edition is free.) We support various protocols, including JDBC, JPA, JMS, Memcached, a Map API (similar to JCache), and a native API.

Other similar options include Coherence, which is itself a data grid, and Terracotta DSO, which can distribute an object graph on a JVM heap.

You can also look at the cache projects themselves: two of them are Ehcache and OSCache. (Again: bias. I was one of the people who started OpenSymphony, so I've a soft spot for OSCache.) In your case, what would happen is not a preload of cache - note that I don't know your application, so I'm guessing and might be wrong - but a cache on demand. When you acquire data, you'd check the cache for data first and fetch from the DB only if the data is not in cache, and load the cache on read.
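To make the cache-on-demand idea concrete, here is a minimal sketch using only the JDK. It is not how Ehcache or OSCache are used; the OnDemandCache class and its loader function are hypothetical stand-ins for whatever runs your database query for a single key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class OnDemandCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    // Hypothetical loader: runs the database query for one key.
    private final Function<K, V> loader;

    OnDemandCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, calling the loader only when the key is missing.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Drops one entry so that the next read reloads it from the database.
    void invalidate(K key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        AtomicInteger queries = new AtomicInteger(); // counts simulated DB hits
        OnDemandCache<String, String> cache = new OnDemandCache<>(name -> {
            queries.incrementAndGet();
            return "rows for " + name;
        });

        cache.get("countries");
        cache.get("countries");            // served from the cache
        System.out.println(queries.get()); // one query for two reads

        cache.invalidate("countries");     // e.g. after the table changed
        cache.get("countries");
        System.out.println(queries.get()); // a second query after invalidation
    }
}
```

With this shape, start-up cost disappears because nothing is loaded until it is asked for, and a "refresh" becomes an invalidate of the affected keys rather than a reload of every list.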

Of course, you can also look at memcached, although I obviously prefer my employer's offering here.


Be aware that invoking

System.gc()

or

Runtime.getRuntime().gc()

is a bad idea unless you really need to do that. You should leave the task of deciding when to free objects to the VM, unless profiling has shown that forcing a collection is the only way to make the application go faster on your client's VM.


I tend to use YourKit for this sort of thing. It costs money but IMO is worth every penny (no connection other than as a customer).
