Does unsigned math require more CPU instructions?

Take a C++ integral variable i and suppose you multiply its value by 2.

If i is signed, I assume the operation is roughly equivalent, at least mathematically, to:

 i = i << 1; 

But if i is unsigned, then, since unsigned values do not overflow but wrap modulo their range, the operation is presumably something like this:

 i = (i << 1) & (decltype(i))-1; 

Now, I suppose the actual machine instructions are probably shorter than these shift sequences for the multiplication. But does a modern CPU, say x86, have a special instruction for unsigned / modulo math? Or does math with unsigned values, as a rule, require an extra instruction compared to math with signed values?

(Yes, it would be silly to worry about this while programming; I'm asking out of pure curiosity.)

+6
7 answers

No, it does not require more instructions, at least on x86.

For some operations (for example, addition and subtraction), the same instruction is used for both signed and unsigned types. This is possible because they behave identically when signed values use a two's complement representation.
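For illustration, here is a minimal C++ sketch of my own (not from the original answer), assuming 32-bit two's complement int as on x86: the same bit pattern, fed through the same addition, yields the same result bits under both interpretations:

 #include <cstdint>
 #include <cstdio>
 #include <cstring>
 int main() {
     std::uint32_t ua = 0xFFFFFFFEu;            // as unsigned: 4294967294
     std::int32_t sa;
     std::memcpy(&sa, &ua, sizeof sa);          // same bits read as signed: -2
     std::uint32_t usum = ua + 3u;              // wraps around to 1
     std::int32_t ssum = sa + 3;                // -2 + 3 = 1
     std::uint32_t ssum_bits;
     std::memcpy(&ssum_bits, &ssum, sizeof ssum_bits);
     std::printf("%u %u\n", usum, ssum_bits);   // prints "1 1": identical bit patterns
     return 0;
 }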

There is also no difference for a left shift: the leftmost bit is simply discarded by the hardware, so there is no need for the bitwise AND in your example.

For other operations (for example, right shift), there are separate instructions: SHR (logical shift right) for unsigned values and SAR (arithmetic shift right) for signed values, which preserves the sign bit.
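A small sketch of my own (assuming 32-bit two's complement int, and arithmetic shift for signed >>, which is implementation-defined in C++ but what x86 compilers do) showing why the two shift instructions have to differ:

 #include <cstdint>
 #include <cstdio>
 int main() {
     std::int32_t s = -8;                // bit pattern 0xFFFFFFF8
     std::uint32_t u = 0xFFFFFFF8u;      // the same bit pattern as unsigned
     std::printf("%d\n", s >> 1);        // -4: copies of the sign bit shift in (SAR-like)
     std::printf("%u\n", u >> 1);        // 2147483644: zeros shift in (SHR-like)
     return 0;
 }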

There are also separate instructions for signed and unsigned multiplication and division: MUL / IMUL and DIV / IDIV, where IMUL and IDIV are used for signed values.
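And a companion sketch (again my own, assuming 32-bit int) of why multiplication and division need the distinction at all: the same bit pattern denotes different numbers, so the signed and unsigned quotients differ:

 #include <cstdint>
 #include <cstdio>
 int main() {
     std::int32_t s = -6;                // bit pattern 0xFFFFFFFA
     std::uint32_t u = 0xFFFFFFFAu;      // same bits read as 4294967290
     std::printf("%d\n", s / 2);         // -3: the signed interpretation
     std::printf("%u\n", u / 2);         // 2147483645: the unsigned interpretation
     return 0;
 }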

+11

As others have already written: it does not matter to the CPU. Signed and unsigned instructions take the same time. Some operations are even easier to carry out in unsigned arithmetic and may take a cycle less than the signed variant (division by a power of two is one example).
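To illustrate that point (a sketch of my own, assuming arithmetic shift for signed >>, which is what common compilers do): for unsigned operands a division by 2 is just a right shift, while for signed operands the two disagree on negative odd values, so the compiler has to emit extra fix-up instructions in the signed case:

 #include <cstdio>
 int main() {
     unsigned u = 7u;
     int s = -7;
     std::printf("%u %u\n", u / 2, u >> 1);   // 3 3: the shift is the division
     std::printf("%d %d\n", s / 2, s >> 1);   // -3 -4: / truncates toward zero, >> rounds down
     return 0;
 }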

However, this is only half the story.

C++ defines signed integer overflow as undefined behavior and unsigned integers as wrapping modulo 2^N. This opens up completely different optimization opportunities, which lead to different code.

One example:

 int foo(int a) { return a * 1000 / 500; }
 unsigned bar(unsigned a) { return a * 1000 / 500; }

Here foo can be optimized to:

 int foo(int a) { return a * 2; }

And bar remains as it is.

Note that mathematically these two functions are the same, but they start to give different results once the argument exceeds INT_MAX / 1000.
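For a concrete illustration (my own numbers, assuming a 32-bit unsigned int), here is what bar computes for an argument above INT_MAX / 1000; for foo the same call would be undefined behavior, which is exactly what licenses the optimization:

 #include <cstdio>
 unsigned bar(unsigned a) { return a * 1000 / 500; }   // same bar as above
 int main() {
     unsigned a = 5000000u;                  // larger than INT_MAX / 1000 (2147483)
     // a * 1000 = 5'000'000'000, which wraps to 705'032'704 modulo 2^32,
     // so bar(a) is 1'410'065 rather than a * 2 = 10'000'000.
     std::printf("%u %u\n", bar(a), a * 2);
     return 0;
 }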

Since the effect of signed overflow is undefined, the compiler is allowed to simply assume that no overflow happens when it simplifies expressions. For unsigned arithmetic this is not the case, and the compiler must emit code that actually performs the multiplication and the division. That is, of course, slower than the optimized version.

Note: most compilers are conservative when it comes to such optimizations and only do them if you ask for them, because they tend to break code that checks for overflow. Other compilers, especially in the embedded and DSP worlds, always do such optimizations, even at low optimization levels. Programmers who write for these machines know the subtle details, so this is rarely a problem.

OTOH, we have discussed more than one question on Stack Overflow in which C/C++ programmers fell into this trap.

+11

Assuming wraparound on overflow, which is what most (all?) two's complement hardware gives you, the arithmetic instructions are the same for signed and unsigned operands, and << on an unsigned type is equivalent to multiplication by 2. So the only problem you face is when you do arithmetic with a type that is smaller than the register used to hold it.

The promotion rules to at least int (or unsigned int) in arithmetic expressions are largely designed to avoid this: when you multiply, say, an unsigned short by 2, the result is an int (or an unsigned int if short and int are the same size). Either way, there is no need to take any modulus as long as the type matches the size of the register. And with two's complement you do not need different multiplication instructions for signed and unsigned in C++ anyway: the wraparound comes out the same, except insofar as the hardware offers an overflow flag and you care about its value (-1 * 2 is an unsigned overflow but not a signed overflow, although the resulting bit pattern is the same).

The only time a mask might be needed is if/when you convert the result back to unsigned short. Even then, I think that with care an implementation can sometimes leave extra "irrelevant" bits at the top of the int register that holds the intermediate unsigned short value. You know that those extra high bits do not affect the results of addition, multiplication or subtraction modulo 2^16, and that they will be masked off when the value is stored to memory (provided there is an instruction that stores the low 2 bytes of an int register to 2 bytes of memory, the modulus is essentially free). So the implementation only has to be careful to mask before division, right shift, comparison and whatever else I have forgotten, or to use suitable narrower instructions if available.
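A short sketch of my own that illustrates both points, assuming a 16-bit unsigned short and a 32-bit int: the multiplication happens at int width with no masking, and the reduction modulo 2^16 only happens when the value is converted (or stored) back as an unsigned short:

 #include <cstdio>
 int main() {
     unsigned short x = 0xFFFF;                 // 65535
     int wide = x * 2;                          // x is promoted to int: 131070, no masking yet
     unsigned short narrow =
         static_cast<unsigned short>(wide);     // reduced modulo 65536: 65534 (0xFFFE)
     std::printf("%d %u\n", wide, (unsigned)narrow);
     return 0;
 }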

+2

As far as I know, most processors have unsigned hardware operations, and I'm pretty sure x86 does.

+1

Unsigned math wraps around and is therefore implicitly performed modulo its range.
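For example (a minimal sketch, assuming a 32-bit unsigned int):

 #include <cstdio>
 int main() {
     unsigned x = 4294967295u;       // UINT_MAX for a 32-bit unsigned int
     std::printf("%u\n", x + 1);     // wraps to 0: arithmetic modulo 2^32
     return 0;
 }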

+1

interjay's answer covers the basics. A couple more details:

Now, I suppose the actual machine instructions are probably shorter than these shift sequences for the multiplication.

It depends on the processor. In the old days, when transistors were expensive, processors like the 6500 and 6800 had instructions that only shifted one bit left or right at a time.

Later, when chips got bigger and could afford more opcode bits for operands, "barrel shifters" were implemented that can shift by an arbitrary number of bits in one cycle. That is what modern processors use.

Or does math with unsigned values, as a rule, require an extra instruction compared to math with signed values?

No, never. Where an operation differs between unsigned and signed, there is a separate instruction for each of them.

+1

I think you have this backwards: for unsigned data types, bit shifting does exactly what it says on the tin, and the vacated bits are filled with zeros. This gives you precisely the correct modular arithmetic operation on the values of the type for a left shift, namely multiplication. A right shift has no arithmetic counterpart, because Z/nZ is not a division ring in general and there is no notion of division; a right shift is just truncating division by a power of two.
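A brief sketch of my own illustrating that view, assuming a 32-bit unsigned int: a left shift is multiplication by 2 modulo 2^32, while a right shift is plain truncating division by 2:

 #include <cstdio>
 int main() {
     unsigned x = 0xC0000001u;       // 3221225473
     std::printf("%u\n", x << 1);    // 2147483650 == (2 * 3221225473) % 4294967296
     std::printf("%u\n", x >> 1);    // 1610612736 == 3221225473 / 2, truncated
     return 0;
 }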

Signed types, on the other hand, suffer from an ambiguity, since there are different ways to interpret a bit pattern as a signed integer. With a left shift on a two's complement machine you get the expected "wrap-around" multiplication, but there is no canonical choice of behaviour for a right shift. In the old C standard I believe this was implementation-defined, but I think C99 made the behaviour concrete.

+1

Source: https://habr.com/ru/post/892436/

