In this case, checked arithmetic is faster than unchecked for two reasons:
The compiler was able to determine that overflow checking was unnecessary for most of the arithmetic, so the added overhead of checking was negligible. This is an artifact of the simplicity of the test. leppie's answer gives a good example of an algorithm where the difference is significant.
The code inserted to implement the checking happened to shift a key branch destination so that it lands on an aligned address. You can see this in two ways:
Replace int a = 0; with int a = args.Length;. Run the tests and observe that the performance inversion is gone. The reason is that the extra code changes the alignment of the branch destination.
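The experiment above can be tried with a harness along the following lines. This is only a sketch: the original test program is not shown here, so the class name OverflowBench, the methods RunChecked, RunUnchecked and TimeMs, and the use of Stopwatch are all assumptions. Only the loop body is taken from the disassembly listing. Moving the loop into its own method also changes the code the JIT generates, so the addresses and alignment you observe will not match the listing, and the timings depend on the runtime and hardware.

```csharp
using System;
using System.Diagnostics;

public static class OverflowBench
{
    // Loop body taken from the listing, with overflow checking enabled.
    public static int RunChecked(int seed)
    {
        checked
        {
            int a = seed;
            for (int i = 0; i < 100000000; i += 3)
            {
                if (i == 1000)
                    i *= 2;
                if (i % 35 == 0)
                    ++a;
            }
            return a;
        }
    }

    // Same loop with overflow checking disabled.
    public static int RunUnchecked(int seed)
    {
        unchecked
        {
            int a = seed;
            for (int i = 0; i < 100000000; i += 3)
            {
                if (i == 1000)
                    i *= 2;
                if (i % 35 == 0)
                    ++a;
            }
            return a;
        }
    }

    // Hypothetical timing helper: runs the loop once and returns elapsed milliseconds.
    public static long TimeMs(Func<int, int> body, int seed)
    {
        var sw = Stopwatch.StartNew();
        body(seed);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main(string[] args)
    {
        // args.Length is not a compile-time constant, as in the experiment above.
        int seed = args.Length;

        // Warm-up calls so that JIT compilation is not included in the timings.
        RunChecked(seed);
        RunUnchecked(seed);

        Console.WriteLine($"checked:   {TimeMs(RunChecked, seed)} ms");
        Console.WriteLine($"unchecked: {TimeMs(RunUnchecked, seed)} ms");
    }
}
```

To approximate the constant case, replace seed with the literal 0 inside both methods and compare the timings again in a Release build run outside the debugger.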
Inspect the assembly below. I obtained it by adding Process.EnterDebugMode(); and Debugger.Break(); to the end of Main and running the Release-mode .exe from the command line. Note that when the checked code performs the test for i % 35 == 0 and the result is false, it branches to 00B700CA, which is an aligned instruction. Contrast this with the unchecked code, which branches to 012D00C3. Even though the checked code has an additional jo instruction, its cost is outweighed by the savings from the aligned branch.
Checked
            int a = 0;
00B700A6  xor   ebx,ebx
            for (int i = 0; i < 100000000; i += 3) {
00B700A8  xor   esi,esi
                if (i == 1000)
00B700AA  cmp   esi,3E8h
00B700B0  jne   00B700B7
                    i *= 2;
00B700B2  mov   esi,7D0h
                if (i % 35 == 0)
00B700B7  mov   eax,esi
00B700B9  mov   ecx,23h
00B700BE  cdq
00B700BF  idiv  eax,ecx
00B700C1  test  edx,edx
00B700C3  jne   00B700CA
                    ++a;
00B700C5  add   ebx,1
00B700C8  jo    00B70128
            for (int i = 0; i < 100000000; i += 3) {
00B700CA  add   esi,3
00B700CD  jo    00B70128
00B700CF  cmp   esi,5F5E100h
00B700D5  jl    00B700AA
            }
Unchecked
            int a = 0;
012D00A6  xor   ebx,ebx
            for (int i = 0; i < 100000000; i += 3) {
012D00A8  xor   esi,esi
                if (i == 1000)
012D00AA  cmp   esi,3E8h
012D00B0  jne   012D00B4
                    i *= 2;
012D00B2  add   esi,esi
                if (i % 35 == 0)
012D00B4  mov   eax,esi
012D00B6  mov   ecx,23h
012D00BB  cdq
012D00BC  idiv  eax,ecx
012D00BE  test  edx,edx
012D00C0  jne   012D00C3
                    ++a;
012D00C2  inc   ebx
            for (int i = 0; i < 100000000; i += 3) {
012D00C3  add   esi,3
012D00C6  cmp   esi,5F5E100h
012D00CC  jl    012D00AA
            }
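To make the alignment comparison concrete, the short sketch below computes how far each branch destination from the two listings sits from a few power-of-two boundaries. It is plain address arithmetic, not a measurement. The class name BranchTargets and the helper OffsetFrom are hypothetical, and the choice of boundaries (2, 4, 8, 16) is an assumption, since the answer does not say which boundary it has in mind.

```csharp
using System;

public static class BranchTargets
{
    // Distance of an address past the previous multiple of a power-of-two boundary.
    // A result of 0 means the address is aligned to that boundary.
    public static uint OffsetFrom(uint address, uint boundary)
    {
        return address & (boundary - 1);
    }

    public static void Main()
    {
        uint checkedTarget = 0x00B700CA;   // destination of jne in the checked listing
        uint uncheckedTarget = 0x012D00C3; // destination of jne in the unchecked listing

        // Assumed set of boundaries to inspect; the answer does not name one.
        foreach (uint boundary in new uint[] { 2, 4, 8, 16 })
        {
            Console.WriteLine(
                $"boundary {boundary,2}: checked offset {OffsetFrom(checkedTarget, boundary)}, " +
                $"unchecked offset {OffsetFrom(uncheckedTarget, boundary)}");
        }
    }
}
```

The checked destination is an even address and the unchecked destination is an odd one, which is the difference the answer points to. Whether that alone accounts for the timing gap depends on the processor.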