Profiling Lucene in Nutch

Developer https://www.devze.com 2023-01-23 16:55 Source: web
I'm trying to profile Nutch using VisualVM. Lucene is the part of the Nutch core responsible for generating URL indexes and for searching those indexes in response to a query. I'm running Nutch through Apache Tomcat, and I would like to determine how much time Nutch spends in various function calls (including Lucene calls). But when I profile with VisualVM, I get a bunch of profiling data about Tomcat and nothing about Nutch or Lucene. What am I doing wrong here?


I had the same experience trying to locate Lucene time inside Tomcat calls. What you have to do is:

  1. Use VisualVM 1.2.2.
  2. Choose the relevant process and press "Profile".
  3. Check the "Settings" checkbox. This should open a "CPU settings" tab, with fields you can fill.
  4. Under "Start profiling From classes:" write an entry point in your code (e.g. com.my.company.NutchUser).
  5. Uncheck "Profile new runnables".
  6. Choose "Profile only classes:" and under it write: org.apache.lucene.* org.apache.nutch.*
  7. Press the "Profile CPU" button.

I believe that if you do all that, then run your process and take occasional snapshots, you will be fine.

Alternatively, this guy suggests doing stack sampling instead of profiling: you grab the thread stacks at intervals and look at which calls keep showing up. I have never done it, but it sounds interesting.
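For what it's worth, here is a rough sketch of what stack sampling could look like from inside the JVM. This is not the method from the post mentioned above, just one way to illustrate the idea, using the standard `Thread.getAllStackTraces()` call. The class name `StackSampler`, the helper `topMatchingFrame`, and the sample count and interval are all made up for the example. It also assumes the sampler runs inside the same JVM as Tomcat (for instance, started from a background thread in the webapp); run standalone, `main` samples only its own JVM and will most likely print nothing.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StackSampler {

    /**
     * Returns "Class.method" for the frame nearest the top of the stack whose
     * class name starts with one of the given prefixes, or null if none match.
     * (Hypothetical helper, named for this example only.)
     */
    static String topMatchingFrame(StackTraceElement[] stack, List<String> prefixes) {
        for (StackTraceElement frame : stack) {          // index 0 is the top of the stack
            for (String prefix : prefixes) {
                if (frame.getClassName().startsWith(prefix)) {
                    return frame.getClassName() + "." + frame.getMethodName();
                }
            }
        }
        return null;
    }

    /**
     * Takes `samples` snapshots of all live threads in this JVM, sleeping
     * `intervalMs` between snapshots, and counts how often each matching
     * method is the topmost Lucene/Nutch frame.
     */
    static Map<String, Integer> sample(int samples, long intervalMs, List<String> prefixes)
            throws InterruptedException {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
                String frame = topMatchingFrame(stack, prefixes);
                if (frame != null) {
                    hits.merge(frame, 1, Integer::sum);
                }
            }
            Thread.sleep(intervalMs);
        }
        return hits;
    }

    public static void main(String[] args) throws InterruptedException {
        // Package prefixes taken from the "Profile only classes" step above.
        List<String> prefixes = List.of("org.apache.lucene.", "org.apache.nutch.");
        // 100 samples, 50 ms apart: arbitrary values chosen for illustration.
        Map<String, Integer> hits = sample(100, 50, prefixes);
        hits.entrySet().stream()
            .sorted((a, b) -> b.getValue() - a.getValue())   // most frequent first
            .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
    }
}
```

Methods that show up in a large share of the samples are where the time is going. If you would rather not touch the webapp at all, running the JDK's `jstack <pid>` against the Tomcat process a few times and reading the dumps by eye gets you the same kind of information.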
