AdamMil
What it means is that hit counting for that function will be more difficult, assuming you use the callbacks for that purpose. But I suspect you guys don't rely on the callbacks much anyway -- instead, you probably use a technique called instrumentation, where extra bits of code are inserted into the program before it's compiled by the JIT compiler. These bits of code are what let you profile down to a per-line level. Instrumentation can be difficult to do properly, but once you've got the system in place, you don't need the function-call callbacks anymore -- you can do everything through instrumentation. I don't know how your profiler works internally, but if I had to guess, I would say that in performance profiling mode it doesn't need the callbacks and instead uses instrumentation for everything. And if it does use the callbacks, not having them wouldn't have a large negative effect on the profiler.

Again, I don't know how your profiler works internally, but I'm assuming that it uses instrumentation to add to each function a prologue that increments the function hit counter and initializes a timing variable, an epilogue that calculates the elapsed time, and a bit of code before each line that increments that line's counter and calculates the elapsed time since the previous line. Extrapolating from the numbers given in your advertisements, the instrumentation overhead for a simple one-line function is 6 times as much as the instrumentation overhead for one line of code. In addition, the function call itself takes time and raises callbacks in the profiling API, which takes even more time. Inlining that small function could eliminate all that extra time.

The negative effect on the profiler really depends on how your profiler is implemented. I assume that it would get the wrong hit count for the function. As I said in a previous message, it's a tradeoff between correct timing and correct hit counting.

There is one possible complication.
The JIT compiler only inlines very short and simple functions, and if you emit a prologue and epilogue for hit counting, and additional code for each line of a function, the instrumentation itself could cause the JIT compiler to not inline a function that it otherwise would have, because now it's too long or too complicated. One way around that would be to 1) not use a prologue/epilogue for function timing and hit counting, and instead use the function-call callbacks (maybe you do this already), 2) not add the per-line instrumentation for one-line functions (one-line functions could be treated specially, with all the timing/hit counting done with the callbacks), and 3) enable JIT inlining. (I'll refer to this as idea #1.) The effect would be that simple one-line functions would not be instrumented at all, and the JIT would be sure to inline them, and other short functions might also be inlined because you don't have the complicated prologue/epilogue. Hit counting would be affected by enabling JIT inlining because, as you mentioned, the callbacks wouldn't be called, but the timings would be closer to correct.

There are other things you could do. You could have an option to not add any instrumentation at all, using the callbacks only, with JIT inlining enabled. That would give a very accurate picture of the slowest functions, but wouldn't provide per-line profiling. You could then instruct the profiler to run again, and only instrument functions that are either large and clearly not inlineable (Microsoft has published some criteria that the JIT uses to determine whether something should be inlined), or have timings that were not close to zero on the last run. That would allow the JIT compiler to inline exactly (or almost exactly, depending on the internal workings of the JIT compiler) the same set of functions that it normally would, giving nearly-perfect timings and not misleading anyone as to what they need to optimize. (I'll refer to this as idea #2.)
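To make the second pass of idea #2 concrete, here is a rough sketch of the selection step. Everything in it is hypothetical: `FirstPassResult`, `SelectForInstrumentation`, and both thresholds are names and values made up for illustration, not anything from the profiler or from the JIT's published criteria.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record produced by the first, callbacks-only run.
struct FirstPassResult {
    std::string name;         // function name
    std::size_t ilSizeBytes;  // size of the function's IL body
    double elapsedMs;         // total time attributed to it in the first run
};

// Pick the functions worth instrumenting on the second run: those that are
// too large to be inlined anyway, or whose first-run time was not close to
// zero. Both thresholds are assumptions supplied by the caller.
std::vector<std::string> SelectForInstrumentation(
    const std::vector<FirstPassResult>& firstPass,
    std::size_t maxInlineableIlBytes,
    double negligibleMs) {
    std::vector<std::string> selected;
    for (const auto& f : firstPass) {
        const bool tooLargeToInline = f.ilSizeBytes > maxInlineableIlBytes;
        const bool measurableTime = f.elapsedMs > negligibleMs;
        if (tooLargeToInline || measurableTime) {
            selected.push_back(f.name);
        }
    }
    return selected;
}
```

Small functions with near-zero time are left alone, so the JIT is free to inline them just as it would in a normal run.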
These are all advanced options, of course, and might be too confusing for the average user. (Perhaps you have a good tech writer who could make it accessible.) For the average user, you might have two modes -- the standard mode which you have now, and something like idea #1. The mode you have now gives perfect hit counting, but inaccurate (and in some cases very inaccurate) timing. Idea #1 gives inaccurate hit counting and somewhat better timing (in some cases much better timing). Idea #2 gives inaccurate hit counting and is a 2-step process, but the timings could be nearly perfect. And in my mind, as I said, timing is the most important thing. Hit counting is second.

Anyway, this is turning into a long ramble. But I think this is something you guys should think about and work on for the next release. Let me know if you get it right. I still need to find a good profiler.
I'm not sure that inlining has to be disabled... From what I know of the profiling API, the JIT will call the JITInlining method when it's about to inline a function, giving the function IDs of the two functions involved, and allowing you to instruct the JIT to not inline them. (Inlining can also be disabled with a global flag.)

    HRESULT JITInlining(FunctionID callerID, FunctionID calleeID, BOOL *pfShouldInline)

It would seem that inlining can be controlled. However, I'm sure you instrument the IL or something, and instrumenting around inlined functions to properly track when they were called (in order to get correct hit counting) would be difficult or impossible. But that difficult task doesn't seem necessary. It would still be a useful feature of the profiler if it allowed the JIT to inline functions without instrumentation. The function hit counts wouldn't be what people might expect, but the profiling results would do a better job of leading people to things that need to be optimized. With JIT inlining disabled, people can be led to optimize the wrong things, which would show an improvement in the profiler, but no actual improvement in the release build of their application.

It's a tradeoff between correctness in timing and correctness in hit counting. The current implementation gives correct hit counting but incorrect timing. I personally would prefer to sacrifice hit counting for better timing, because timing's what's important for profiling in most applications. This could perhaps be an option inside the profiler. I think it's better to have the option than to be stuck with one way of doing things... or if I had to be stuck, I'd prefer to be stuck with more correct timings and less correct hit counts. Just my opinion as an evaluator of the profiler...
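As a rough sketch of what per-function control could look like, the class below answers the JITInlining callback from a list of callees that must stay out of line and allows everything else. The callback signature is the one quoted above; the rest is assumed for illustration: `InliningPolicy`, `KeepOutOfLine`, and `InliningAllowed` are hypothetical names, and the type aliases and result codes are stand-ins so the snippet compiles without the Windows and CLR headers.

```cpp
#include <cstdint>
#include <unordered_set>

// Stand-ins for the real Windows/CLR definitions (assumptions for this sketch).
using HRESULT = long;
using BOOL = int;
using FunctionID = std::uintptr_t;
constexpr HRESULT kSuccess = 0;  // stands in for S_OK
constexpr HRESULT kFail = -1;    // stands in for an error such as E_POINTER

// Hypothetical policy object: a real profiler would forward its JITInlining
// callback here.
class InliningPolicy {
public:
    // Mark a callee that must never be inlined, e.g. because the user
    // asked for exact hit counts on it.
    void KeepOutOfLine(FunctionID callee) { pinned_.insert(callee); }

    // Same shape as the profiling API callback: the JIT asks before
    // inlining calleeID into callerID, and we answer via pfShouldInline.
    HRESULT JITInlining(FunctionID callerID, FunctionID calleeID,
                        BOOL* pfShouldInline) {
        (void)callerID;  // this sketch decides per callee only
        if (pfShouldInline == nullptr) return kFail;
        const bool pinned = pinned_.count(calleeID) != 0;
        *pfShouldInline = pinned ? 0 : 1;
        if (!pinned) allowed_.insert(calleeID);
        return kSuccess;
    }

    // True if we let the JIT inline this callee at least once, which means
    // its hit count from the callbacks may be too low.
    bool InliningAllowed(FunctionID callee) const {
        return allowed_.count(callee) != 0;
    }

private:
    std::unordered_set<FunctionID> pinned_;
    std::unordered_set<FunctionID> allowed_;
};
```

Recording which callees were allowed to be inlined would let the results view flag their hit counts as approximate instead of presenting them as exact.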
I don't think you read my message. I ran the application for 10-15 seconds and it reported Main() taking 262 seconds. I ran it for one second and it reported Main() taking 40 seconds. How do you explain this?
This topic has been superseded by the following two topics:
* Total seconds counts are wrong
* Time excluding children seems wrong

Go there for the continuation of this discussion.