I'm working on a little rigid body simulation. I use the Irrlicht engine for display and OpenMesh to work with the meshes.
Now I profiled my app using VerySleepy and noticed that most of the time is spent within the following functions (exclusive time, i.e. not counting time spent in subfunctions):
RtlCompareMemoryUlong 30% within module "ntdll" sourcefile "unknown"
KiFastSystemCallRet 21% within module "ntdll" sourcefile "unknown"
RtlFillMemoryUlong 9% within module "ntdll" sourcefile "unknown"
So more than half of the time is spent in those functions, yet I don't call them anywhere in my code and I don't understand what they are doing. I doubt it's connected to the graphics, since I'm only displaying very simple meshes.
Can someone give me a hint on how to figure out why those functions are called and how to get rid of that?
Thanks!
ntdll is the user-mode library that exposes the NT native API and the stubs that make system calls. Chances are those functions are called internally by other functions to do low-level operations, which is why you're seeing a lot of time spent in them: they're the building blocks of higher-level functionality. Ignore them and look elsewhere (up the call stack) for performance tweaking; you're not likely to be able to get rid of the OS calls from your application. ;)
The performance problem is probably that these functions are being called a lot, not in these functions themselves. You can guess from the names what they're used for. KiFastSystemCallRet in particular indicates your app went into Kernel mode.
Ignore the ntdll functions in your profile, and focus only on the functions that you wrote/control.
Use a better profiler. On OS X, the CPU Instruments app that comes with Xcode gives excellent diagnostic information that makes spotting performance problems easy.
What you want to see is the callstack during all this time. That will show you which library and function is calling that OS function all the time. Once you know that, it's simply a matter of calling into that library function less often.
RtlCompareMemory / RtlFillMemory sound like they're probably the underlying implementations for memcmp() / memset().
Regardless, you want to change the settings of your profiler to show system call time under the calling app / library function so you can see where the calls are ultimately coming from.
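If the profiler cannot attribute system time to its callers, one rough alternative is to time your own functions inclusively. The sketch below is a hand-rolled helper, not part of VerySleepy, Irrlicht or OpenMesh; `ScopedTimer`, `TimingEntry` and `timingTable` are hypothetical names, and the table is not thread-safe.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulated inclusive wall-clock time and call count for one label.
struct TimingEntry {
    long long calls = 0;
    std::chrono::steady_clock::duration total{};
};

// Global table of timings, keyed by label (hypothetical helper, single-threaded use only).
inline std::map<std::string, TimingEntry>& timingTable() {
    static std::map<std::string, TimingEntry> table;
    return table;
}

// Measures the time between construction and destruction and adds it to the
// entry for its label. Everything called inside the scope, including OS
// functions in ntdll, is charged to that label.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        TimingEntry& entry = timingTable()[label_];
        entry.calls += 1;
        entry.total += std::chrono::steady_clock::now() - start_;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Placing `ScopedTimer t("collision");` at the top of a function and dumping `timingTable()` after a few hundred frames shows which of your own functions the time ends up under.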
Frank Krueger is right. You need insight into the call stack as your program runs. Here's a simple explanation of why that is so. It may surprise you that you do not need special tools or a large number of samples.
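The claim about needing few samples can be checked with a little arithmetic. The sketch below assumes the stack samples are independent and that the costly call is on the stack for a fixed fraction `f` of the run time; `probSeenAtLeastTwice` is a hypothetical helper name, and "at least twice" is just one reasonable threshold for trusting what you see.

```cpp
#include <cassert>
#include <cmath>

// Probability that a call which is on the stack a fraction f of the time
// shows up in at least two of n independent stack samples:
//   1 - P(seen zero times) - P(seen exactly once)
double probSeenAtLeastTwice(double f, int n) {
    assert(f >= 0.0 && f <= 1.0 && n >= 2);
    const double never = std::pow(1.0 - f, n);
    const double once = n * f * std::pow(1.0 - f, n - 1);
    return 1.0 - never - once;
}
```

With `f = 0.5` and `n = 10` this evaluates to 1013/1024, so about ten manual pauses in a debugger are very likely to show the same caller more than once.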
When your program is stuck in system calls all the time, you should take that as a symptom rather than as the actual problem.
Memory fragmentation and paging out is the usual suspect, but it could be a myriad of things.
In my experience, performance problems are seldom as obvious as one specific call you are making. Optimizing the way it is commonly suggested is usually useless at a really low level. It does catch things that amount to bugs: code that is correct but does something unintended, like allocating something and deleting it over and over. For problems like this one, though, you often need a deep understanding of everything that is happening to figure out exactly where the issue is. As I said, if you are stuck in system calls a lot, it is surprisingly often related to memory management.
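As an illustration of the "allocating something and deleting it over and over" pattern, here is a minimal sketch. The `Contact` struct and the two step functions are made up for this example and are not taken from the asker's code; whether this is the actual cause can only be seen from the call stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-step scratch data for a rigid body simulation.
struct Contact {
    float depth;
    int bodyA;
    int bodyB;
};

// Pattern A: a fresh vector on every step. Each call allocates heap memory,
// fills it, and frees it again when the function returns.
std::size_t stepAllocating(std::size_t contactCount) {
    std::vector<Contact> contacts(contactCount);
    return contacts.size();
}

// Pattern B: keep the buffer alive between steps. clear() keeps the capacity,
// so once the buffer has grown, later steps reuse the same memory.
class StepScratch {
public:
    std::size_t step(std::size_t contactCount) {
        contacts_.clear();
        contacts_.resize(contactCount);
        return contacts_.size();
    }

    std::size_t capacity() const { return contacts_.capacity(); }

private:
    std::vector<Contact> contacts_;
};
```

Both variants compute the same thing; the second one only touches the heap when a step needs more contacts than any earlier step did.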