I ran some performance tests on Windows Mobile devices and noticed that some algorithms perform significantly better on some hosts and much worse on others, even after accounting for the difference in clock speed.
Statistics for reference (all results were generated from the same binary, compiled with Visual Studio 2005 targeting ARMv4):
Intel XScale PXA270
- Algorithm A: 22642 ms
- Algorithm B: 29271 ms
ARM1136EJ-S core (integrated into the MSM7201A chip)
- Algorithm A: 24874 ms
- Algorithm B: 29504 ms
ARM926EJ-S core (integrated into the OMAP 850 chip)
- Algorithm A: 70215 ms
- Algorithm B: 31652 ms (!)
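To make the anomaly easier to see, here is a small sketch that computes the ratio of Algorithm A's time to Algorithm B's time on each core, using only the timings listed above. The function name `a_to_b_ratios` and the dictionary layout are my own for illustration; clock speeds are not included because they are not given here.

```python
# Timings in milliseconds, copied from the list above.
TIMINGS_MS = {
    "PXA270": {"A": 22642, "B": 29271},
    "ARM1136EJ-S (MSM7201A)": {"A": 24874, "B": 29504},
    "ARM926EJ-S (OMAP 850)": {"A": 70215, "B": 31652},
}


def a_to_b_ratios(timings):
    """Return, per core, the time of Algorithm A divided by the time of Algorithm B."""
    return {core: t["A"] / t["B"] for core, t in timings.items()}


if __name__ == "__main__":
    for core, ratio in a_to_b_ratios(TIMINGS_MS).items():
        print(f"{core}: A/B = {ratio:.2f}")
```

On the first two cores A takes less time than B (ratio below 1), while on the ARM926EJ-S A takes more than twice as long as B, even though B's time is nearly the same on all three.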
, B , , FPU.