This varies greatly depending on the processor and (in some cases) the type(s) of the operands.
Older/simpler processors typically use a shift-and-add multiplication algorithm something like this:
    integer operator*(integer const &other) {
        unsigned temp1 = other.value;   // multiplier: controls the loop
        unsigned temp2 = value;         // multiplicand: shifted left each step
        unsigned answer = 0;

        while (temp1 != 0) {
            if (temp1 & 1)              // add the shifted multiplicand for each set bit
                answer += temp2;
            temp2 <<= 1;
            temp1 >>= 1;
        }
        return integer(answer);
    }
Since the loop executes only while temp1 != 0, it will obviously not execute at all if temp1 starts out at 0 (but, as written, it makes no attempt to optimize for the other operand being 0).
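For illustration (this is my own sketch, not part of the original code above), a variant that loops over whichever operand is smaller, so the early exit helps no matter which side happens to be zero or small, might look like this:

    integer operator*(integer const &other) {
        unsigned multiplier   = other.value;
        unsigned multiplicand = value;

        // Loop over the smaller value so the early exit (and the shorter loop)
        // applies regardless of which operand is zero or small. Unsigned
        // multiplication is commutative, so swapping does not change the result.
        if (multiplicand < multiplier) {
            unsigned t   = multiplicand;
            multiplicand = multiplier;
            multiplier   = t;
        }

        unsigned answer = 0;
        while (multiplier != 0) {
            if (multiplier & 1)
                answer += multiplicand;
            multiplicand <<= 1;
            multiplier >>= 1;
        }
        return integer(answer);
    }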
As it stands, however, this is basically a one-bit-at-a-time algorithm. For example, when multiplying 32-bit operands, if each bit has a 50:50 chance of being set, we would expect roughly 16 iterations on average.
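To make the data dependence concrete, here is a small self-contained sketch (my own, not from the answer above) that counts how many loop iterations and how many additions the shift-and-add loop performs for a given multiplier:

    #include <cstdio>

    // Count the work done by the shift-and-add loop for a given multiplier:
    // it iterates until the highest set bit has been shifted out, and it adds
    // once for every set bit.
    static void count_work(unsigned multiplier) {
        unsigned iterations = 0, additions = 0;
        for (unsigned m = multiplier; m != 0; m >>= 1) {
            ++iterations;
            if (m & 1)
                ++additions;
        }
        std::printf("multiplier %#10x: %2u iterations, %2u additions\n",
                    multiplier, iterations, additions);
    }

    int main() {
        count_work(0);           //  0 iterations: the early exit
        count_work(1);           //  1 iteration, 1 addition
        count_work(0x8000u);     // 16 iterations, 1 addition
        count_work(0xFFFFFFFFu); // 32 iterations, 32 additions
    }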
A newer, higher-performance CPU will typically work with at least two bits at a time, and possibly more. Instead of having a single piece of hardware carry out several iterations, it typically has separate (though essentially identical) hardware for each stage of the multiplication (although these usually do not show up as separate stages in the normal pipeline diagram for the processor).
This means the multiplication will have the same latency (and throughput) regardless of the operands. On average this improves latency and throughput a little, but it does mean every operation takes the same time, whatever the operands.
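As a software sketch of the idea (again my own illustration, not a description of real hardware), processing two multiplier bits per step gives a fixed step count for 32-bit operands, so the time no longer depends on their values:

    #include <cstdint>

    // Radix-4 (two-bits-at-a-time) shift-and-add multiply with a fixed number
    // of steps: always 16 steps for 32-bit operands, whatever their values.
    std::uint32_t multiply_radix4(std::uint32_t a, std::uint32_t b) {
        std::uint32_t answer = 0;
        for (int step = 0; step < 16; ++step) {
            switch (b & 3) {                    // examine two multiplier bits per step
                case 1: answer += a;            break;  // add 1 * a
                case 2: answer += a << 1;       break;  // add 2 * a
                case 3: answer += a + (a << 1); break;  // add 3 * a
                default:                        break;  // 0: add nothing
            }
            a <<= 2;                            // advance the multiplicand two bit positions
            b >>= 2;
        }
        return answer;
    }

Real multipliers do not literally loop; the stages are laid out as separate (pipelined) hardware, commonly with Booth recoding and carry-save adder trees, but the fixed step count is what makes the latency operand-independent.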