Getting more info from Rprof()

Developer https://www.devze.com 2022-12-17 01:20 Source: web

Related topics: profiling, r

I've been trying to dig into what the time-hogs are in some R code I've written, so I'm using Rprof. The output isn't yet very helpful though:

> summaryRprof()
$by.self
                      self.time self.pct total.time total.pct
"$<-.data.frame"           2.38     23.2       2.38      23.2
"FUN"                      2.04     19.9      10.20      99.6
"[.data.frame"             1.74     17.0       5.54      54.1
"[.factor"                 1.42     13.9       2.90      28.3
...

Is there some way to dig deeper and find out which specific invocations of $<-.data.frame, and FUN (which is probably from by()), etc. are actually the culprits? Or will I need to refactor the code and make smaller functional chunks in order to get more fine-grained results?

The only reason I'm resisting refactoring is that I'd have to pass data structures into the functions, and all the passing is by value, so that seems like a step in the wrong direction.

Thanks.

The existing CRAN packages profr and proftools are useful for this. The latter can use Rgraphviz, which isn't always installable.

The R Wiki page on profiling has additional info and a nice script by Romain that can also visualize the results (but requires Graphviz).
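As a rough illustration of the proftools route, here is a minimal sketch. It assumes proftools is installed (the code skips the analysis otherwise), and slow_fill() is a hypothetical stand-in workload, not the code from the question.

```r
# Hypothetical workload standing in for the real code being profiled.
slow_fill <- function(n) {
  df <- data.frame(x = numeric(n))
  for (i in seq_len(n)) df$x[i] <- i^2   # each pass calls `$<-.data.frame`
  df
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)   # sample the call stack every 10 ms
invisible(slow_fill(20000))
Rprof(NULL)                         # stop profiling

if (requireNamespace("proftools", quietly = TRUE)) {
  pd <- proftools::readProfileData(prof_file)
  # Per-function totals, similar in spirit to summaryRprof().
  print(proftools::flatProfile(pd))
  # Call paths that account for at least 10% of total time.
  print(proftools::hotPaths(pd, total.pct = 10))
} else {
  message("proftools is not installed; try install.packages('proftools')")
}
```

The hot-paths view is the part that goes beyond summaryRprof(): it shows the chain of callers leading to an expensive function, which is usually enough to tell which invocation is the costly one.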

Rprof takes samples of the call stack at regular time intervals. That's the good news.

What I would do is get access to the raw stack samples (stackshots) that it collects, and pick several at random and examine them. What I'm looking for is call sites (not just functions, but the places where one function calls another) that appear on multiple samples. For example, if a call site appears on 50% of samples, then that's what it costs, because its possible removal would save roughly 50% of total time. (Seems obvious, right? But it's not well known.)
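Getting at the raw stackshots could look something like the sketch below. The workload slow_fill() is a hypothetical stand-in, the sample size of 5 is arbitrary, and the code assumes Rprof's default settings (no line or memory profiling), where the output file has one header line followed by one stack per line.

```r
# Hypothetical workload standing in for the real code being profiled.
slow_fill <- function(n) {
  df <- data.frame(x = numeric(n))
  for (i in seq_len(n)) df$x[i] <- i^2   # each pass calls `$<-.data.frame`
  df
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)   # sample the call stack every 10 ms
invisible(slow_fill(20000))
Rprof(NULL)                         # stop profiling

raw <- readLines(prof_file)
# With default settings the first line is a header ("sample.interval=...");
# every following line is one stack sample, innermost call first.
stacks <- raw[-1]

# Look at a handful of samples chosen at random.
picked <- sample(stacks, min(5, length(stacks)))
print(picked)
```

Reading each printed line from right to left gives the chain of calls from the top level down to whatever was running when the sample was taken.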

Not every costly call site is optimizable, but some are, unless the program is already as fast as possible.

(Don't be distracted by issues like how many samples you need to look at. If something is going to save you a reasonable fraction of time, then it appears on a similar fraction of samples. The exact number doesn't matter. What matters is that you find it. Also don't be distracted by graph and recursion and time measurement and counting issues. What matters is, for each call site you see, the fraction of stack samples that show it.)

Parsing the output that Rprof generates isn't too hard, and then you get access to absolutely everything.
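For what it's worth, a parser along those lines might look like the sketch below. The function name call_site_fractions() is my own, the demo lines are made up for illustration (not real profiler output), and it assumes the default Rprof format with no line or memory profiling. Note that it only identifies a call site by its caller/callee pair; telling apart two different calls to the same function from the same caller would need line.profiling = TRUE and extra parsing.

```r
# For each call site (caller -> callee pair), compute the fraction of
# stack samples on which it appears, given raw lines from an Rprof() file.
call_site_fractions <- function(lines) {
  # Drop the header line and any blank lines; keep only stack samples.
  samples <- lines[!grepl("^sample\\.interval=", lines)]
  samples <- samples[nzchar(trimws(samples))]

  sites_per_sample <- lapply(samples, function(s) {
    # Each sample lists quoted function names, innermost call first,
    # so reverse it to read from the outermost caller inward.
    stack <- rev(scan(text = s, what = "", quote = "\"", quiet = TRUE))
    if (length(stack) < 2) return(character(0))
    # Count a call site once per sample so recursion does not inflate it.
    unique(paste(head(stack, -1), tail(stack, -1), sep = " -> "))
  })

  all_sites <- unlist(sites_per_sample)
  if (length(all_sites) == 0) {
    return(data.frame(call_site = character(0), fraction = numeric(0)))
  }
  counts <- table(all_sites)
  out <- data.frame(call_site = names(counts),
                    fraction = as.vector(counts) / length(samples),
                    stringsAsFactors = FALSE)
  out <- out[order(-out$fraction, out$call_site), ]
  rownames(out) <- NULL
  out
}

# Made-up sample lines in the default Rprof format, for illustration only.
demo_lines <- c(
  'sample.interval=20000',
  '"$<-.data.frame" "$<-" "FUN" "lapply" "by" ',
  '"[.data.frame" "[" "FUN" "lapply" "by" ',
  '"[.factor" "[" "FUN" "lapply" "by" ',
  '"Sys.time" '
)
print(call_site_fractions(demo_lines))
```

On real data you would pass readLines() of the file given to Rprof(). The call sites at the top of the result that sit in your own code, rather than deep inside base R, are the ones worth examining first.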