AdamMil
I see why it's reporting 70 seconds, but I don't think it has to be that way. I assume that by non-profiled child methods you are referring to code that's excluded by the filter (the filter that by default excludes methods that have no source code). This problem doesn't only affect Main(), then. In fact, any function that makes a call into an external (filtered out) method, either directly or indirectly, would have inaccurate timing.

And I don't think that by following my logic you'd wind up with no time being spent in my source code, unless of course that was actually the case. I would simply expect it to show that the majority of the time is spent sitting in various parts of the System.Windows.Forms namespace, waiting for events, etc. (You might just say that 70 seconds were spent in "external methods", referring to methods that were filtered out.) The remaining time would be the true time spent in my application (or at least the parts of my application that I, with the filter, decided that I wanted to focus on). Following my logic, I wouldn't have a large number of seconds mysteriously being attributed to the wrong functions.

Perhaps you're applying your filter too early in the profiling process. I imagine that when you filter methods, you're not timing them at all. This seems consistent with the problem. A better method would be to still time filtered methods (not at a per-line level, of course) and then to simply not display them in the results. This would give you the information you need to determine the correct time. It also increases the flexibility of the profiler because you'd be able to change the filter after the results have already been collected if you wanted to view the results in a different way.
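A rough sketch of the "time everything, filter only at display time" idea, for illustration only. It assumes the profiler records a self time (time in the method body, excluding callees) for every method, filtered or not. The names `MethodSample`, `Filter`, `namespaceFilter` and `buildReport` are hypothetical and say nothing about how the actual profiler is built.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// One record per method, collected for every method, filtered or not.
// (Hypothetical data model, not the real profiler's.)
struct MethodSample {
    std::string name;    // fully qualified method name
    double selfSeconds;  // time in the method body itself, excluding callees
};

// A display filter: returns true if the method should appear in the results.
using Filter = std::function<bool(const std::string&)>;

// Builds a filter that shows only methods whose name starts with `prefix`.
Filter namespaceFilter(std::string prefix) {
    return [prefix](const std::string& name) {
        return name.rfind(prefix, 0) == 0;  // true when name starts with prefix
    };
}

// Applies the filter after collection. Time spent in hidden methods is
// not attributed to their callers; it goes into one "external methods" bucket.
std::map<std::string, double> buildReport(const std::vector<MethodSample>& samples,
                                          const Filter& isVisible) {
    std::map<std::string, double> report;
    for (const auto& s : samples) {
        const std::string key = isVisible(s.name) ? s.name : "external methods";
        report[key] += s.selfSeconds;
    }
    return report;
}
```

Because the samples are kept for every method, calling `buildReport` again with a different filter gives a different view of the same run without profiling again.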
0 votes
What it means is that doing hit counting for that function will be more difficult, assuming you use the callbacks for that purpose. But I suspect you guys don't need/use the callbacks so much anyway -- instead, you probably use a method called instrumentation where extra bits of code are inserted into the program before it's compiled by the JIT compiler. These bits of code are what enable you to get profiling down to a per-line level. Instrumentation can be difficult to do properly, but once you've got the system in place, you don't need the function-call callbacks anymore -- you can do everything through instrumentation. I don't know how your profiler works internally, but if I had to guess, I would say that in performance profiling mode, it doesn't need the callbacks and instead uses instrumentation for everything. And if it does use the callbacks, not having them wouldn't have a large negative effect on the profiler.

Again, I don't know how your profiler works internally, but I'm assuming that it adds to each function through instrumentation a prologue that increments the function hit counter and initializes a timing variable, an epilogue that calculates the elapsed time, and a bit of code before each line that increments that line's counter and calculates the elapsed time since the previous line. Extrapolating from the numbers given in your advertisements, the instrumentation overhead for a simple one-line function is 6 times as much as the instrumentation overhead for one line of code. In addition, the function call itself takes time, and raises callbacks in the profiling API, which takes even more time. Inlining that small function could eliminate all that extra time.

The negative effect on the profiler really depends on how your profiler is implemented. I assume that it would get the wrong hit count for the function. As I said in a previous message, it's a tradeoff between correct timing and correct hit counting. There is one possible complication.
The JIT compiler only inlines very short and simple functions, and if you emit a prologue and epilogue for hit counting, and additional code for each line of a function, the instrumentation itself could cause the JIT compiler to not inline a function that it otherwise would have, because now it's too long or too complicated.

One way around that would be to 1) not use a prologue/epilogue for function timing and hit counting, and instead use the function-call callbacks (maybe you do this already), 2) not add the per-line instrumentation for one-line functions (one-line functions could be treated specially, with all the timing/hit counting done with the callbacks), and 3) enable JIT inlining. (I'll refer to this as idea #1.) The effect would be that simple one-line functions would not be instrumented at all, and the JIT would be sure to inline them, and other short functions might also be inlined because you don't have the complicated prologue/epilogue. Hit counting would be affected by enabling JIT inlining because, as you mentioned, the callbacks wouldn't be called, but the timings would be closer to correct.

There are other things you could do. You could have an option to not add any instrumentation at all, using the callbacks only, with JIT inlining enabled. That would give a very accurate picture of the slowest functions, but wouldn't provide per-line profiling. You could then instruct the profiler to run again, and only instrument functions that are either large and clearly not inlineable (Microsoft has published some criteria that the JIT uses to determine whether something should be inlined), or have timings that were not close to zero on the last run. That would allow the JIT compiler to inline exactly (or almost exactly, depending on the internal workings of the JIT compiler) the same set of functions that it normally would, giving nearly-perfect timings and not misleading anyone as to what they need to optimize. (I'll refer to this as idea #2.)
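The second pass of idea #2 could be sketched roughly as below. This is only an illustration under stated assumptions: `FirstPassResult`, `SelectionPolicy`, `shouldInstrument` and `selectForSecondPass` are hypothetical names, and both thresholds are placeholders supplied by the caller rather than the JIT's real inlining criteria.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// What the callback-only first pass (no instrumentation, inlining enabled)
// is assumed to have recorded for each method. Hypothetical data model.
struct FirstPassResult {
    std::string name;
    std::size_t ilSizeBytes;  // size of the method body in IL
    double totalSeconds;      // time measured in the first pass
};

// Placeholder thresholds. Neither value is taken from the JIT; the caller
// would have to pick them from whatever criteria have been published.
struct SelectionPolicy {
    std::size_t maxInlineableIlBytes;  // larger methods are treated as never inlined
    double negligibleSeconds;          // timings at or below this count as "close to zero"
};

// A method gets per-line instrumentation in the second pass if it is
// clearly too big to be inlined, or if it showed measurable time.
bool shouldInstrument(const FirstPassResult& m, const SelectionPolicy& p) {
    const bool clearlyNotInlineable = m.ilSizeBytes > p.maxInlineableIlBytes;
    const bool hadMeasurableTime = m.totalSeconds > p.negligibleSeconds;
    return clearlyNotInlineable || hadMeasurableTime;
}

// Returns the names of the methods to instrument on the second run.
std::vector<std::string> selectForSecondPass(const std::vector<FirstPassResult>& results,
                                             const SelectionPolicy& policy) {
    std::vector<std::string> selected;
    for (const auto& m : results) {
        if (shouldInstrument(m, policy)) {
            selected.push_back(m.name);
        }
    }
    return selected;
}
```

Small methods with near-zero time are left alone, so the JIT stays free to inline them just as it would in a normal run.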
These are all advanced options, of course, and might be too confusing for the average user. (Perhaps you have a good tech writer who could make it accessible.) For the average user, you might have two modes -- the standard mode which you have now, and something like idea #1. The mode you have now gives perfect hit counting, but inaccurate (and in some cases very inaccurate) timing. Idea #1 gives inaccurate hit counting and somewhat better timing (in some cases much better timing). Idea #2 gives inaccurate hit counting and is a 2-step process, but the timings could be nearly perfect. And in my mind, as I said, timing is the most important thing. Hit counting is second.

Anyway, this is turning into a long ramble. But I think this is something you guys should think about and work on for the next release. Let me know if you get it right. I still need to find a good profiler.
0 votes
I'm not sure that inlining has to be disabled... From what I know of the profiling API, the JIT will call the JITInlining method when it's about to inline a function, giving the function IDs of the two functions involved, and allowing you to instruct the JIT to not inline them. (Inlining can also be disabled with a global flag.)

    HRESULT JITInlining(FunctionID callerID, FunctionID calleeID, BOOL *pfShouldInline)

It would seem that inlining can be controlled. However, I'm sure you instrument the IL or something, and instrumenting around inlined functions to properly track when they were called (in order to get correct hit counting) would be difficult or impossible. But that difficult task doesn't seem necessary. It would still be a useful feature of the profiler if it allowed the JIT to inline functions without instrumentation. The function hit counts wouldn't be what people might expect, but the profiling results would do a better job of leading people to things that need to be optimized. With JIT inlining disabled, people can be led to optimize the wrong things, which would show an improvement in the profiler, but no actual improvement in the release build of their application.

It's a tradeoff between correctness in timing and correctness in hit counting. The current implementation gives correct hit counting but incorrect timing. I personally would prefer to sacrifice hit counting for better timing, because timing's what's important for profiling in most applications. This could perhaps be an option inside the profiler. I think it's better to have the option than to be stuck with one way of doing things... or if I had to be stuck, I'd prefer to be stuck with more correct timings and less correct hit counts. Just my opinion as an evaluator of the profiler...
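As an illustration of per-function control, here is a minimal sketch of a policy shaped like the JITInlining callback quoted above. It is not real profiler code: `HRESULT`, `BOOL` and `FunctionID` are redefined as stand-ins so the sketch compiles without the Windows SDK headers, and `InliningPolicy`, `markInstrumented`, `kOk`, `kErrNullPointer`, `kTrue` and `kFalse` are hypothetical names. The assumption is that the profiler keeps a set of functions it has instrumented and blocks inlining only for those.

```cpp
#include <cstdint>
#include <unordered_set>

// Stand-in types so this compiles on its own. A real profiler would get
// HRESULT, BOOL and FunctionID from the Windows / profiling API headers.
using HRESULT = long;
using BOOL = int;
using FunctionID = std::uintptr_t;

constexpr HRESULT kOk = 0;               // stand-in for a success result
constexpr HRESULT kErrNullPointer = -1;  // hypothetical error value, not a real HRESULT constant
constexpr BOOL kTrue = 1;
constexpr BOOL kFalse = 0;

// Hypothetical policy object: inlining is refused only for callees that
// carry instrumentation; everything else is left to the JIT.
class InliningPolicy {
public:
    // Records that a function has been instrumented and must keep its own frame.
    void markInstrumented(FunctionID id) { instrumented_.insert(id); }

    // Same parameter shape as the callback signature in the post.
    HRESULT JITInlining(FunctionID callerID, FunctionID calleeID, BOOL* pfShouldInline) {
        (void)callerID;  // the caller is not needed for this simple policy
        if (pfShouldInline == nullptr) {
            return kErrNullPointer;
        }
        *pfShouldInline = instrumented_.count(calleeID) != 0 ? kFalse : kTrue;
        return kOk;
    }

private:
    std::unordered_set<FunctionID> instrumented_;
};
```

With a policy like this, hit counts would stay exact for the instrumented functions, while uninstrumented ones could still be inlined and timed closer to how they behave in a release build.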
0 votes