It is generally assumed that using a profiler is much better (for finding performance problems, as opposed to measuring things) than anything else, and certainly better than the naive method of simply pausing the program at random.
That assumption is only conventional wisdom; it has no basis in theory or practice. There are numerous peer-reviewed papers on profiling, but none that I have read even consider the question, let alone justify the assumption. It is a blind spot in academia, not a big one, but it is there.
Now, to your question:
In the screenshot of the call stack, what you are seeing is the so-called hot path, which accounts for roughly 60% of the thread's CPU time. Assuming the code with "saxon" in the name is what interests you, it is this:
net.sf.saxon.event.ReceivingContentHandler.startElement
net.sf.saxon.event.ProxyReceiver.startContent
net.sf.saxon.event.ProxyReceiver.startContent
net.sf.saxon.event.StartTagBuffer.startContent
net.sf.saxon.event.ProxyReceiver.startContent
com.saxonica.ee.validate.ValidationStack.startContent
com.saxonica.ee.validate.AttributeValidator.startContent
net.sf.saxon.event.TeeOutputter.startContent
net.sf.saxon.event.ProxyReceiver.startContent
net.sf.saxon.event.ProxyReceiver.startContent
net.sf.saxon.event.Sink.startContent
First of all, it seems to me this code should be doing I/O, or at least waiting for some other process to supply it with content. If so, you should be looking at wall-clock time, not CPU time.
Secondly, the problem could be at any of the call sites where one of these functions calls the next. If any such call is not strictly necessary and could be skipped or done less often, that would reduce the total time by a significant fraction. My suspicion falls on StartTagBuffer and the validation code, but you know the code better than I do.
There are other points that I could make, but these are the main ones.
ADDED after your edit to the question: I take it you are looking for ways to optimize your code, not just ways to get numbers for their own sake.
It still looks like CPU time, not wall time, because no I/O appears in the hot paths. Perhaps that is fine in your case, but consider what it means: out of your 12 minutes of wall-clock time, 11 minutes could be spent waiting on I/O, with only 1 minute in the CPU. If so, you could cut 30 seconds of fat out of the CPU part and shorten the run by only 30 seconds. That is why I prefer wall-clock sampling: it gives the overall picture.
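To make the distinction concrete, here is a small Java sketch (the class and method names are mine, not from your code) that measures the same stretch of work two ways, using the standard ThreadMXBean. A sleep stands in for an I/O wait: it shows up in wall-clock time but barely in CPU time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class WallVsCpu {
    // Returns {wallMillis, cpuMillis} for a workload that both waits and computes.
    static long[] measure() throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long wall0 = System.nanoTime();
        long cpu0  = bean.getCurrentThreadCpuTime();

        Thread.sleep(200);                            // stands in for I/O wait: wall grows, CPU barely does
        long x = 0;
        for (int i = 0; i < 20_000_000; i++) x += i;  // stands in for computation: both grow
        if (x == 42) System.out.print("");            // keep the loop from being optimized away

        long wallMs = (System.nanoTime() - wall0) / 1_000_000;
        long cpuMs  = (bean.getCurrentThreadCpuTime() - cpu0) / 1_000_000;
        return new long[] { wallMs, cpuMs };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] t = measure();
        // Wall time exceeds CPU time by roughly the 200 ms "I/O" wait.
        System.out.println("wall=" + t[0] + "ms, cpu=" + t[1] + "ms");
    }
}
```

A CPU-time profiler simply never sees that 200 ms gap, which is the point of the 11-minutes-waiting example above.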
Looking only at hot paths will not give you the true picture. For example, if the hot path shows function F on it for, say, 40% of the time, that only means F costs at least 40%. It can cost much more, because it can also appear on other, less hot paths. So you may have a juicy opportunity for speedup that gets little weight in the particular path the profiler chose to show you, and you pay little attention to it. In fact, a big time-taker may not show up at all, either because on any particular hot path there is always something slightly bigger (such as new), or because it goes by several different names (for example, template class constructors).
It does not show you line-level information. If you want to examine a supposedly expensive routine and find out why it is expensive, you need to look at the lines inside it. There is a tendency, when looking at a whole routine, to say: "It is just doing what it has to do." But if you look at a specific expensive line of code, which is most often a method call, you can ask: "Is this call really necessary? Maybe I already have that information." That is a much more specific question about what you can fix.
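Here is a minimal, entirely hypothetical Java sketch of that question being asked and answered (none of these names come from your code): the expensive line inside the loop turns out not to be necessary on every iteration, because the information it fetches never changes.

```java
import java.util.List;

public class HoistExample {
    static int loads;                                        // counts how often the expensive call runs
    static String loadSchema() { loads++; return "schema"; } // stand-in for an expensive call
    static void validate(String record, String schema) { }   // no-op for the sketch

    // Before: the sampled hot line reloads the schema on every iteration.
    static int runNaive(List<String> records) {
        loads = 0;
        for (String r : records) validate(r, loadSchema());
        return loads;
    }

    // After asking "maybe I already have that information": load once, reuse.
    static int runHoisted(List<String> records) {
        loads = 0;
        String schema = loadSchema();
        for (String r : records) validate(r, schema);
        return loads;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "b", "c");
        System.out.println("naive: " + runNaive(records)
                + " loads, hoisted: " + runHoisted(records) + " load");
    }
}
```

Nothing about the routine as a whole looked wrong; only staring at the one line exposed the redundant work.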
Can it show you raw stack samples? In my experience those are much more informative than any summary, such as a hot path, that a profiler can present. The thing to do is examine one sample and come to a full understanding of what the program was doing, and why, at that moment. Then do the same with a few more samples. You will see things that do not need to be done, which you can fix to get significant speedups. (If the code is already close to optimal, it is good to know that too.) The point is that you are looking for problems, not measurements. Statistically it is crude, but it is good enough, and no problem will escape it.
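For what it is worth, the JVM lets you take such raw samples yourself with no tooling at all. The sketch below (all names are hypothetical, and spin() merely stands in for your real workload) pauses at random moments, snapshots another thread's stack with Thread.getStackTrace(), and counts which methods keep showing up, which is the random-pause technique in miniature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class RandomPauseSampler {
    // Pause at random moments, snapshot the worker's stack, and count how
    // often each method appears. Methods present in most samples account
    // for most of the wall-clock time.
    static Map<String, Integer> sample(Thread worker, int samples) throws InterruptedException {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            Thread.sleep(ThreadLocalRandom.current().nextInt(20, 80)); // random pause point
            for (StackTraceElement frame : worker.getStackTrace()) {
                counts.merge(frame.getClassName() + "." + frame.getMethodName(), 1, Integer::sum);
            }
        }
        return counts;
    }

    static volatile long sink;  // defeats dead-code elimination
    static void spin() {        // hypothetical hot routine standing in for the real work
        while (!Thread.currentThread().isInterrupted()) {
            long s = 0;
            for (int i = 0; i < 1_000_000; i++) s += i;
            sink = s;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(RandomPauseSampler::spin);
        worker.setDaemon(true);
        worker.start();

        Map<String, Integer> counts = sample(worker, 10);
        worker.interrupt();

        // spin() should dominate the samples, just as a real hot routine would.
        counts.entrySet().stream()
              .sorted((a, b) -> b.getValue() - a.getValue())
              .limit(5)
              .forEach(e -> System.out.println(e.getValue() + "/10  " + e.getKey()));
    }
}
```

The counts are only a convenience here; the real value, as argued above, is reading each full stack and asking why the program was there at that instant.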