If you want to understand what the compiler does, you just need to pull up the assembly. I recommend this site (I have already entered the code from the question): https://godbolt.org/g/FwZZOb .
The first example is more interesting.
    int div(unsigned int num, unsigned int num2) {
        if( num >= num2 ) return num % num2;
        return num;
    }

    int div2(unsigned int num, unsigned int num2) {
        return num % num2;
    }
Produces:
    div(unsigned int, unsigned int):          # @div(unsigned int, unsigned int)
            mov     eax, edi
            cmp     eax, esi
            jb      .LBB0_2
            xor     edx, edx
            div     esi
            mov     eax, edx
    .LBB0_2:
            ret

    div2(unsigned int, unsigned int):         # @div2(unsigned int, unsigned int)
            xor     edx, edx
            mov     eax, edi
            div     esi
            mov     eax, edx
            ret
Basically, the compiler will not optimize the branch away, for very specific and logical reasons. If integer division cost about the same as a comparison, the branch would be pretty pointless. But integer division (which the modulus is typically computed together with) is actually very expensive: http://www.agner.org/optimize/instruction_tables.pdf . The numbers vary greatly by architecture and operand size, but the latency is usually somewhere between 15 and 100 cycles.
By taking a branch before executing the modulus, you can save a lot of work. Note, though, that the compiler also does not convert the branch-free code into a branch at the assembly level. That is because the branch has a drawback too: if the modulus turns out to be necessary anyway, you have just wasted a little time.
There is no way to make a reasonable determination of the correct optimization without knowing the relative frequency with which idx < idx_max will be true. Therefore, compilers (gcc and clang do the same thing) prefer to translate the code fairly literally, leaving this choice in the hands of the developer.
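To make the trade-off concrete, here is a minimal sketch of the two forms side by side. The names mod_branch, mod_plain, next_index and forms_agree are hypothetical and only for illustration; next_index assumes the question's code is a wrap-around increment, where idx < idx_max holds almost every time and the division is therefore almost always skipped.

```cpp
#include <cassert>

// Branching form: the division only runs when num >= num2.
unsigned mod_branch(unsigned num, unsigned num2) {
    if (num >= num2) return num % num2;
    return num;
}

// Plain form: the division runs on every call.
unsigned mod_plain(unsigned num, unsigned num2) {
    return num % num2;
}

// Hypothetical wrap-around increment (assumed shape of the question's code).
// For idx in [0, idx_max) the modulus is needed only once per idx_max calls.
unsigned next_index(unsigned idx, unsigned idx_max) {
    ++idx;
    if (idx >= idx_max) idx %= idx_max;
    return idx;
}

// Checks that both forms return the same value for all small inputs.
bool forms_agree(unsigned limit) {
    for (unsigned num = 0; num < limit; ++num)
        for (unsigned num2 = 1; num2 < limit; ++num2)
            if (mod_branch(num, num2) != mod_plain(num, num2)) return false;
    return true;
}
```

Both forms compute the same result, so which one is faster comes down to how often the branch lets you skip the division. That is something to measure on your own data rather than assume.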
So this branch could be a very reasonable choice.
The second branch should be completely pointless, because a comparison and an assignment cost about the same. However, you can see in the link that compilers will still not perform this optimization if they only have a reference to the variable. If the value is a local variable (as in the code you showed), then the compiler will optimize the branch away.
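A rough sketch of the two situations, assuming the second snippet is a compare-then-assign on a flag. The function names are hypothetical, and the exact code from the question may differ. My guess at the reason for the difference (an assumption, not something the assembly tells you directly) is that with a reference, an unconditional store would write to memory that the original code leaves untouched when the flag is already clear, so the compiler keeps the check.

```cpp
#include <cassert>

// Flag reached through a reference: the compiler only sees a memory location,
// and (per the answer above) keeps the compare-and-branch before the store.
void clear_if_set(bool& flag) {
    if (flag) flag = false;
}

// Flag as a local value: no outside memory is involved, so the branch can be
// dropped entirely. The result is false whatever the input was.
bool clear_local(bool flag) {
    if (flag) flag = false;
    return flag;
}
```

Pasting both functions into the compiler explorer should show whether your compiler keeps the branch in the first one and folds the second down to a constant.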
In summary, the first piece of code is probably a reasonable optimization; the second is probably just a tired programmer.