Is this an optimization error?

Question


Here is some assembler output from my compiler. It is the MPLAB C30 C compiler, based on GCC v3.23, for a dsPIC33FJ128GP802, a 16-bit medium-speed DSP/MCU.

    212:               inline uint16_t ror_16(uint16_t word, int num)
    213:               {
     078C4  608270     and.w w1,#16,w4
     078C6  DE0204     lsr w0,w4,w4
     078C8  780101     mov.w w1,w2
     078CA  EA8102     com.w w2,w2
     078CC  EA8183     com.w w3,w3
     078CE  610170     and.w w2,#16,w2
     078D0  DD0002     sl w0,w2,w0
     078D2  700004     ior.w w0,w4,w0
    214:                   num &= 16;     // limit to 16 shifts
    215:                   return (word >> num) | (word << (16 - num));
    216:               }
     078D4  060000     return

In particular, I am interested in the following:

    and.w w1,#16,w4     AND W1 with 16, storing result in W4
    lsr   w0,w4,w4      Logical shift right W0 by W4 times, storing result in W4
    mov.w w1,w2         Move W1 to W2
    com.w w2,w2         Logical complement of W2, stored in W2
    com.w w3,w3         Logical complement of W3, stored in W3  <-- This line is confusing me
    and.w w2,#16,w2     AND W2 with 16, storing result in W2
    sl    w0,w2,w0      (Logical) shift left W0 by W2 times, storing result in W0
    ior.w w0,w4,w0      Inclusive OR of W0 and W4, stored in W0
    return              Return from function

W0..W15 is an array of sixteen on-chip 16-bit registers.

This effectively simplifies to (in primitive RTL):

    W4 := W1 & 16
    W4 := W0 LSR W4
    W2 := W1
    W2 := COM W2
    W3 := COM W3
    W2 := W2 & 16
    W0 := W0 SL W2
    W0 := W0 | W4
    return
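The sequence above can be traced on a host machine with a small C model. This is only a sketch: the register file is modelled as a plain array, the names `model_ror_16`, `shift_right`, `shift_left` and the `w3_garbage` parameter are hypothetical, and it assumes that a register shift by 16 or more yields 0 on this chip, which the listing itself does not confirm.

```c
#include <stdint.h>

/* ASSUMPTION: a register-count shift of 16 or more produces 0.
   This is modelled explicitly because such a shift is not portable C. */
static uint16_t shift_right(uint16_t value, uint16_t count) {
    return count > 15 ? 0 : (uint16_t)(value >> count);
}

static uint16_t shift_left(uint16_t value, uint16_t count) {
    return count > 15 ? 0 : (uint16_t)((uint32_t)value << count);
}

/* Hypothetical model of the inlined sequence. w3_garbage stands for
   whatever the surrounding code happened to leave in W3. */
uint16_t model_ror_16(uint16_t word, uint16_t num, uint16_t w3_garbage) {
    uint16_t w[16] = {0};
    w[0] = word;                     /* first argument  */
    w[1] = num;                      /* second argument */
    w[3] = w3_garbage;

    w[4] = w[1] & 16;                /* and.w w1,#16,w4 */
    w[4] = shift_right(w[0], w[4]);  /* lsr   w0,w4,w4  */
    w[2] = w[1];                     /* mov.w w1,w2     */
    w[2] = (uint16_t)~w[2];          /* com.w w2,w2     */
    w[3] = (uint16_t)~w[3];          /* com.w w3,w3: result is never read */
    w[2] = w[2] & 16;                /* and.w w2,#16,w2 */
    w[0] = shift_left(w[0], w[2]);   /* sl    w0,w2,w0  */
    w[0] = w[0] | w[4];              /* ior.w w0,w4,w0  */
    return w[0];
}
```

In this model the return value does not depend on `w3_garbage`, so the `com.w w3,w3` instruction is dead as far as this function's result is concerned. Under the stated shift assumption the model also returns `word` unchanged for every `num`, because the shift count is only ever 0 or 16.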

Now I'm confused about why it computes the complement of W3 when only two arguments are passed (W0 and W1; the compiler uses the W array to pass arguments to functions with small arguments). W3 is never used in the calculation and is never returned. In fact, it doesn't even have data in it: nothing is stored in it by the function, and only the caller will have some data in it (although functions are not required to preserve W0..W7, so the caller should not rely on it). Why is it included in the code? Is it just a compiler glitch or bug, or am I missing something?

And it is not only this code: I see the same strangeness in other parts of the code. Even code that does something as simple as complementing a 16-bit variable always seems to use two registers. It has me lost!


assembly gcc compiler-construction

Thomas o Jan 22 '11 at 21:47


1 answer

Doug currie · Accepted Answer · 2011-01-22T22:33:26+0000

The function is not coded to limit num to 16 (by which I suspect you mean 0 to 16); it limits num to either 0 or 16.

Instead of

 num &= 16

you might want

 num > 16 ? (num & 15) : num

Re: the question, since the function is inlined, it can only be answered by looking at where it is used. Perhaps W3 is used for something in the surrounding code. Or it may be a "bug", but one that affects only performance, not correctness.

If num can only be 0 or 16 (as in your code), then (16 - num) can also only be 16 or 0, which is why C30 can do the "subtraction" with a complement and a mask.
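The equivalence is easy to check on a host machine. The sketch below is not C30 output; the function names are made up for illustration. It compares `16 - (num & 16)` with `(~num) & 16` for every 16-bit value.

```c
#include <stdint.h>

/* Returns 1 if, for this num, the complement-and-mask form gives the
   same value as the subtraction it replaces. */
int complement_matches_subtraction(uint16_t num) {
    uint16_t masked = num & 16;                     /* always 0 or 16 */
    uint16_t by_subtraction = (uint16_t)(16 - masked);
    uint16_t by_complement = (uint16_t)~num & 16;   /* com.w then and.w */
    return by_subtraction == by_complement;
}

/* Returns 1 if the identity holds for all 65536 possible inputs. */
int identity_holds_for_all(void) {
    for (uint32_t n = 0; n <= 0xFFFFu; n++) {
        if (!complement_matches_subtraction((uint16_t)n)) {
            return 0;
        }
    }
    return 1;
}
```

Only bit 4 of `num` survives the mask, so subtracting from 16 just flips that one bit, which is exactly what complementing and re-masking does.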

FYI, when I don't inline the function, in C30 I get:

    34:                uint16_t ror_16(uint16_t word, int num)
    35:                {
     05AF4  608170     and.w 0x0002,#16,0x0004
     05AF6  DE0102     lsr 0x0000,0x0004,0x0004
     05AF8  EA8081     com.w 0x0002,0x0002
     05AFA  6080F0     and.w 0x0002,#16,0x0002
     05AFC  DD0001     sl 0x0000,0x0002,0x0000
     05AFE  700002     ior.w 0x0000,0x0004,0x0000
    36:                    num &= 16;     // limit to 16 shifts
    37:                    return (word >> num) | (word << (16 - num));
    38:                }
     05B00  060000     return

I would code it as

    34:                uint16_t ror_16(uint16_t word, int num)
    35:                {
     05AF4  780100     mov.w 0x0000,0x0004
    36:                    num &= 15;     // mod 16
     05AF6  60806F     and.w 0x0002,#15,0x0000
    37:                    return (num == 0) ? word : ((word >> num) | (word << (16 - num)));
     05AF8  320004     bra z, 0x005b02
     05AFA  DE1080     lsr 0x0004,0x0000,0x0002
     05AFC  100070     subr.w 0x0000,#16,0x0000
     05AFE  DD1000     sl 0x0004,0x0000,0x0000
     05B00  708100     ior.w 0x0002,0x0000,0x0004
    38:                }
     05B02  780002     mov.w 0x0004,0x0000
     05B04  060000     return
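For reference, here is the same idea as a standalone function that can be tested off-target. It is a variation on the code in the listing above, not the exact source: the name `ror_16_portable`, the unsigned count parameter and the widening cast are my own choices, made so the function behaves the same on hosts where `int` is wider than 16 bits.

```c
#include <stdint.h>

/* Rotate a 16-bit word right by num positions (num is taken mod 16). */
uint16_t ror_16_portable(uint16_t word, unsigned num) {
    num &= 15;                 /* reduce the count to 0..15 */
    if (num == 0) {
        return word;           /* avoids a left shift by 16 */
    }
    /* Widen before shifting left so no bits are lost, then truncate. */
    return (uint16_t)((word >> num) | ((uint32_t)word << (16 - num)));
}
```

For example, `ror_16_portable(0x1234, 4)` gives `0x4123`, and a count of 16 or 0 returns the word unchanged.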
