The question almost answers itself. What does your bare-metal application do when it is not running this process/algorithm? Measure one or the other, or both. If your bare-metal application does not completely consume the processor with this algorithm, then you already have an operating system of sorts, to the extent that you are managing how much time this application/function gets. You can use several techniques, starting with a simple counter in a loop checked against a timer, to see how many counts per timer period you get when the algorithm receives time slices versus when it does not. You can also just time the algorithm itself, etc.
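A minimal sketch of the counter-in-a-loop idea, assuming the idle loop increments a counter and a periodic timer tick samples it. The class name `IdleLoadEstimator`, its methods, and the baseline calibration step are hypothetical and not from any library. The baseline is the number of idle counts per timer period measured with the algorithm disabled.

```cpp
#include <cstdint>

// Hypothetical helper: estimates CPU load from idle-loop counts.
// The idle loop is the only writer of count_; the timer tick only reads it.
class IdleLoadEstimator {
public:
    // baseline_counts: idle counts per timer period with the algorithm
    // disabled, i.e. when the processor does nothing but the idle loop.
    explicit IdleLoadEstimator(std::uint32_t baseline_counts)
        : baseline_(baseline_counts) {}

    // Call on every pass through the idle loop.
    void idle_tick() { count_ = count_ + 1; }

    // Call once per timer period. Returns the estimated load in percent
    // for the period that just ended.
    std::uint32_t sample_percent() {
        const std::uint32_t now = count_;
        const std::uint32_t counts = now - last_;  // wrap-safe unsigned difference
        last_ = now;
        if (baseline_ == 0 || counts >= baseline_) {
            return 0;  // no baseline, or at least as idle as the baseline
        }
        const std::uint64_t busy = baseline_ - counts;
        return static_cast<std::uint32_t>((busy * 100u) / baseline_);
    }

private:
    std::uint32_t baseline_;
    volatile std::uint32_t count_ = 0;
    std::uint32_t last_ = 0;
};
```

With a baseline of 1000 counts per period, a period in which the idle loop only manages 250 counts would be reported as roughly 75% load. The estimate is only as good as the baseline, so the baseline should be taken with the same clock, flash, and compiler settings as the real run.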
I assume that when you say CPU you mean the whole system, since your performance depends heavily on both your code and what it is talking to. If you run from flash on a Cortex-M4, then depending on the clock frequency you can burn processor cycles simply waiting for instructions or data (and you can very easily get a false sense of the CPU performance of an algorithm when it is not the algorithm that is burning the clocks). Caches mask/manipulate this performance and can have a dramatic effect if you are not careful and do not know what they are doing. Since this is a C++ question, your compiler plays a big role in performance, as does your code of course; minimal changes to the command line or the code can very easily make it several times faster or slower.
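Because the result depends on wait states, caches, and compiler flags, it helps to time the same code under each configuration and compare. A minimal sketch, assuming a free-running 32-bit counter is available: `measure_ticks` and `read_counter` are hypothetical names. On a Cortex-M4 the counter could be the DWT cycle counter (`DWT->CYCCNT` in CMSIS) if the part implements it and it has been enabled, which is an assumption here.

```cpp
#include <cstdint>

// Times one call of `work` in ticks of a free-running 32-bit counter.
// `read_counter` is a hypothetical callable returning the current count.
template <typename ReadCounter, typename Work>
std::uint32_t measure_ticks(ReadCounter read_counter, Work work) {
    const std::uint32_t start = read_counter();
    work();
    const std::uint32_t end = read_counter();
    // Unsigned subtraction stays correct across a single counter wraparound.
    return end - start;
}
```

The number returned includes every stall the system adds (flash waits, cache misses, bus contention), so it describes the system running the algorithm, not the algorithm alone.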
If the algorithm is part of an ISR and the processor is put to sleep otherwise, you can use a GPIO pin and an oscilloscope to get a feel for the ratio of time awake to time asleep.
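A minimal sketch of the GPIO marker, assuming the pin is driven through a memory-mapped output register. `ScopePin`, `timed_isr_body`, `reg`, and `mask` are hypothetical names; the real register address and pin bit come from your part's reference manual.

```cpp
#include <cstdint>

// RAII marker: drives a pin high while the object is alive, low on exit.
// `reg` stands in for a GPIO output data register and `mask` for the pin bit;
// both are placeholders for whatever your hardware provides.
class ScopePin {
public:
    ScopePin(volatile std::uint32_t& reg, std::uint32_t mask)
        : reg_(reg), mask_(mask) {
        reg_ = reg_ | mask_;   // pin high: work starts
    }
    ~ScopePin() {
        reg_ = reg_ & ~mask_;  // pin low: work done, about to sleep again
    }
    ScopePin(const ScopePin&) = delete;
    ScopePin& operator=(const ScopePin&) = delete;

private:
    volatile std::uint32_t& reg_;
    std::uint32_t mask_;
};

// Wraps the ISR's work so the pin is high for exactly its duration.
template <typename Work>
void timed_isr_body(volatile std::uint32_t& reg, std::uint32_t mask, Work algorithm) {
    ScopePin marker(reg, mask);
    algorithm();
}
```

On the oscilloscope, the high time divided by the period approximates the fraction of time spent in the algorithm. The read-modify-write used here is not atomic, so if other code shares the same register, a dedicated set/clear register (where the part has one) would be the safer choice.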