Which -mfpu value (neon, vfp, or vfpv3) should be specified when evaluating and comparing floating-point performance on an ARM processor?

I want to evaluate the floating-point performance of some ARM processors. I am using lmbench and pi_css5, but I am confused about which floating-point unit the tests actually use.

From cat /proc/cpuinfo (below), I think there are three kinds of floating-point support: neon, vfp, and vfpv3. From this question and answer, it seems to depend on the compiler. However, I do not know what I should specify in the compile flag (-mfpu=neon/vfp/vfpv3): should I compile the program once with each of them, or simply not specify -mfpu at all?

 cat /proc/cpuinfo
 Processor        : ARMv7 Processor rev 4 (v7l)
 BogoMIPS         : 532.00
 Features         : swp half thumb fastmult vfp edsp neon vfpv3 tls
 CPU implementer  : 0x41
 CPU architecture : 7
 CPU variant      : 0x2
 CPU part         : 0xc09
 CPU revision     : 4
2 answers

It can be a little more complicated than you expected. The GCC options page does not explain the FPU versions, but the ARM guide for their own compiler does. You should also note that Linux does not report complete information about FPU features; it only reports vfp, vfpv3, vfpv3d16, or vfpv4.
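
The FPU-related flags that Linux does report can be picked out of the Features line programmatically. This is a minimal sketch, assuming one `key : value` field per line as /proc/cpuinfo prints it; the function name `fpu_features` and the set of flag names it filters for are illustrative choices for this example, not part of any existing tool.

```python
# Flag names treated as FPU/SIMD related. This set is an assumption for the
# sketch, taken from the names mentioned in this thread.
FPU_FLAGS = {"vfp", "vfpv3", "vfpv3d16", "vfpv4", "neon"}

# Sample text copied from the question; on a real board you would read
# /proc/cpuinfo instead.
SAMPLE_CPUINFO = """\
Processor       : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 532.00
Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls
"""


def fpu_features(cpuinfo_text):
    """Return the FPU/SIMD flags from the first 'Features' line, in order."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return [flag for flag in value.split() if flag in FPU_FLAGS]
    return []


if __name__ == "__main__":
    print(fpu_features(SAMPLE_CPUINFO))  # ['vfp', 'neon', 'vfpv3']
```

On the target itself, the same function could be called with the contents of /proc/cpuinfo, for example `fpu_features(open("/proc/cpuinfo").read())`.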

Back to your question: you should choose the largest feature set common to all the processors you are comparing, compile your code for it, and compare the results. On the other hand, if one CPU has vfpv4 and the other has vfpv3, which would you consider the better one?
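
One way to read "common to all the processors" is as a set intersection over the flags each board reports. The sketch below works under loud assumptions: `BOARD_B` is a made-up second board, the helper names are hypothetical, and the preference order in `pick_mfpu` (neon, then vfpv3, then vfp) is an illustrative choice that covers only the three values from the question and deliberately ignores the d16 and vfpv4 variants.

```python
# Flags reported by the board in the question.
BOARD_A = ["swp", "half", "thumb", "fastmult", "vfp", "edsp", "neon", "vfpv3", "tls"]
# Hypothetical second board without NEON (an assumption for illustration only).
BOARD_B = ["swp", "half", "thumb", "fastmult", "vfp", "edsp", "vfpv3", "tls"]

# Only the three values asked about in the question are considered here.
FPU_FLAGS = {"vfp", "vfpv3", "neon"}


def common_fpu_features(*feature_lists):
    """Return the FPU flags that every board reports."""
    if not feature_lists:
        return set()
    common = set(FPU_FLAGS)
    for features in feature_lists:
        common &= set(features)
    return common


def pick_mfpu(common):
    """Pick an -mfpu value from the common flags (assumed preference order)."""
    for flag in ("neon", "vfpv3", "vfp"):
        if flag in common:
            return flag
    return None


if __name__ == "__main__":
    shared = common_fpu_features(BOARD_A, BOARD_B)
    print(sorted(shared), "->", pick_mfpu(shared))
```

With these two boards the shared flags are vfp and vfpv3, so the sketch suggests vfpv3. For a board that reports vfpv3d16, this simple intersection would not be enough and the mapping would have to be extended.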

If your question is as simple as choosing between neon, vfp, or vfpv3, select neon (source).

 -mfpu=neon selects VFPv3 with NEON coprocessor extensions. 

From the GCC manual:

If the selected floating-point hardware includes the NEON extension (for example, -mfpu=neon), note that floating-point operations are not generated by GCC's auto-vectorization pass unless -funsafe-math-optimizations is also specified. This is because NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic (in particular, denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision.

See, for example, Subset IEEE-754 floating point numbers on ios ... for more information on this topic.


I tried each of them, and it seems that using -mfpu=neon together with -march=armv7-a and -mfloat-abi=softfp is the correct configuration.
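
To repeat the "try each of them" comparison, the compiler command lines for the three -mfpu values can be generated rather than typed by hand. This is only a sketch: the source file name `bench.c`, the output names, the `gcc` compiler name (a cross toolchain would use its own prefix), and the `-O2` optimization level are assumptions for illustration; only -march=armv7-a, -mfloat-abi=softfp, and the -mfpu values come from this thread.

```python
def gcc_command(source, mfpu, compiler="gcc"):
    """Build a GCC command line (as a list of arguments) for one -mfpu value."""
    return [
        compiler,
        "-O2",                  # assumed optimization level, not from the thread
        "-march=armv7-a",
        "-mfpu=" + mfpu,
        "-mfloat-abi=softfp",
        source,
        "-o",
        "bench_" + mfpu,        # hypothetical output name, one binary per FPU
    ]


if __name__ == "__main__":
    # Print one command per FPU option from the question.
    for mfpu in ("vfp", "vfpv3", "neon"):
        print(" ".join(gcc_command("bench.c", mfpu)))
```

Each printed line can be run on the board (or passed to `subprocess.run` as a list), and the resulting binaries timed against each other.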

In addition, here is a very useful link on ARM benchmarks (ARM Cortex-A8 vs. Intel Atom). Another useful article, on ARM Cortex-A processors and GCC command lines, clears up the SIMD coprocessor configuration.


Source: https://habr.com/ru/post/1502043/

