GCC optimization, const static object, and restrict

I am working on an embedded project, and I am trying to add more structure to some code that uses macros to optimize access to the USART registers. I would like to organize the preprocessor #define'd register addresses in const structures. If I define the structures as compound literals in a macro and pass them to inlined functions, gcc is smart enough to bypass the pointer in the generated assembly and hard-code the values of the structure members directly in the code. For instance:

C1:

    struct uart {
        volatile uint8_t *ucsra, *ucsrb, *ucsrc, *udr;
        volatile uint16_t *ubrr;
    };

    #define M_UARTX(X)              \
        ( (struct uart) {           \
            .ucsra = &UCSR##X##A,   \
            .ucsrb = &UCSR##X##B,   \
            .ucsrc = &UCSR##X##C,   \
            .ubrr  = &UBRR##X,      \
            .udr   = &UDR##X,       \
        } )

    void inlined_func(const struct uart *p, other_args...) {
        ...
        (*p->ucsra) = 0;
        (*p->ucsrb) = 0;
        (*p->ucsrc) = 0;
    }
    ...
    int main() {
        ...
        inlined_func(&M_UARTX(0), other_parms...);
        ...
    }

Here UCSR0A, UCSR0B, etc. are defined as the UART registers in the form of lvalues, for example:

 #define UCSR0A (*(uint8_t*)0xFFFF) 

gcc was able to eliminate the structure literal completely, and all assignments like those shown in inlined_func() are written directly to the register address, without reading the register address into a machine register and without indirect addressing:

A1:

    movb $0, UCSR0A
    movb $0, UCSR0B
    movb $0, UCSR0C

This writes the values directly to the USART registers without loading the addresses into a machine register, so the structure literal never needs to be emitted into the object file at all. The structure literal becomes a compile-time structure, with no generated-code cost for the abstraction.

I wanted to get rid of the macro and tried to use a static constant structure defined in the header:

C2:

    #define M_UART0 M_UARTX(0)
    #define M_UART1 M_UARTX(1)

    static const struct uart * const uart[2] = { &M_UART0, &M_UART1 };
    ....
    int main() {
        ...
        inlined_func(uart[0], other_parms...);
        ...
    }

However, gcc cannot completely remove the structure:

A2:

    movl __compound_literal.0, %eax
    movb $0, (%eax)
    movl __compound_literal.0+4, %eax
    movb $0, (%eax)
    movl __compound_literal.0+8, %eax
    movb $0, (%eax)

This loads the register addresses into a machine register and uses indirect addressing to write to the registers. Does anyone know how I can convince gcc to generate the A1 assembly for the C2 code? I have tried applying the __restrict modifier in various ways, to no avail.

2 answers

After many years of experience with UART and USART, I came to the following conclusions:

Do not use a struct to map 1:1 onto the UART registers.

Compilers can add padding between struct members without your knowledge, thereby breaking the 1:1 correspondence.
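If a struct overlay is used anyway, the padding concern can at least be checked at compile time. The following is only a sketch: the `uart_regs` struct and the offsets in it are hypothetical and not taken from any real datasheet. It assumes a C11 compiler for `static_assert`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical register block: names and offsets are assumptions made
   for illustration, not a real device's register map. */
struct uart_regs {
    volatile uint8_t  status;   /* expected at offset 0 */
    volatile uint8_t  control;  /* expected at offset 1 */
    volatile uint16_t baud;     /* expected at offset 2 */
    volatile uint8_t  data;     /* expected at offset 4 */
};

/* Compilation fails if the compiler's layout differs from the expected map. */
static_assert(offsetof(struct uart_regs, status)  == 0, "status offset");
static_assert(offsetof(struct uart_regs, control) == 1, "control offset");
static_assert(offsetof(struct uart_regs, baud)    == 2, "baud offset");
static_assert(offsetof(struct uart_regs, data)    == 4, "data offset");
```

A check like this does not prevent padding; it only turns a silent layout mismatch into a build error.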

Writing to UART registers is best done directly or through a function.

Remember to use volatile when defining pointers to registers.
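A minimal sketch of register access through small functions with volatile pointers might look like the following. The names `reg_write8`, `reg_read8`, `uart_put_byte`, and `fake_udr` are made up for illustration; `fake_udr` is an ordinary variable standing in for a hardware register so the sketch can be compiled on a host.

```c
#include <stdint.h>

/* Write one byte through a volatile pointer, so the compiler may not
   drop or merge the access. */
static inline void reg_write8(volatile uint8_t *reg, uint8_t value)
{
    *reg = value;
}

/* Read one byte through a volatile pointer. */
static inline uint8_t reg_read8(const volatile uint8_t *reg)
{
    return *reg;
}

/* Stand-in for a hardware data register (assumption: on a real target
   this would be a fixed address taken from the datasheet). */
static volatile uint8_t fake_udr;

/* Hypothetical helper: send one byte by writing it to the data register. */
static void uart_put_byte(uint8_t byte)
{
    reg_write8(&fake_udr, byte);
}
```

Calling `uart_put_byte(0x41)` stores 0x41 in the stand-in register; on real hardware the same call shape would write to the device instead.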

There is very little performance gain from using assembly language.

Assembly language should only be used if the UART is accessed through processor ports rather than being memory-mapped, because the C language has no support for ports. Accessing UART registers through pointers is very efficient (generate an assembly listing and check). Coding in assembly and verifying it can sometimes take longer.

Isolate UART functionality in a separate library

This is a good candidate for a library. Also, once the code has been tested, leave it alone. Libraries do not have to be (re)compiled all the time.


Using structures “across compile domains” is a cardinal sin in my book, meaning using a struct to point at something, anything: file data, memory, etc. The reason is that it will fail; it is not reliable, regardless of the compiler. There are many compiler-specific flags and pragmas for this, but the best solution is simply not to do it. If you want to point at address plus 8, then point at address plus 8, using a pointer or an array. In this particular case I have had too many compilers fail at that as well, so I write assembler PUT32/GET32 and PUT16/GET16 functions to guarantee that the compiler does not mess with my register accesses. As with structs, you will get burned one fine day and have a hell of a time figuring out why your 32-bit register had only 8 bits written to it. The overhead of the jump to a function is worth the peace of mind, reliability, and portability of the code. It also makes your code extremely portable: you can put in wrappers for the put and get functions to go across networks, or run your hardware in an HDL simulator and reach into the simulation to read and write registers, all with a single chunk of code that does not change from simulation to embedded to OS device driver to application-layer function.
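The accessor idea can be sketched in C as follows. This is not the answer's own code: the answer writes PUT32/GET32 in assembler, whereas here they are plain C functions backed by a simulated register array so that the example runs on a host. `SIM_BASE`, `SIM_WORDS`, `UART_CTRL`, and `uart_enable` are made-up names and values.

```c
#include <stdint.h>

/* Simulated register file standing in for hardware (assumption made so
   the sketch runs on a host). */
#define SIM_BASE  0x40000000u
#define SIM_WORDS 16u
static uint32_t sim_regs[SIM_WORDS];

/* On a target, these two bodies would be replaced by real register
   accesses; the index is wrapped so the simulation never goes out of
   bounds. */
void PUT32(uint32_t addr, uint32_t value)
{
    sim_regs[((addr - SIM_BASE) / 4u) % SIM_WORDS] = value;
}

uint32_t GET32(uint32_t addr)
{
    return sim_regs[((addr - SIM_BASE) / 4u) % SIM_WORDS];
}

/* Driver code addresses registers as "base plus offset" instead of
   overlaying a struct. */
#define UART_BASE  SIM_BASE
#define UART_CTRL  (UART_BASE + 8u)   /* hypothetical register at base + 8 */

/* Hypothetical driver routine: set bit 0 of the control register. */
void uart_enable(void)
{
    PUT32(UART_CTRL, GET32(UART_CTRL) | 1u);
}
```

Because `uart_enable` only goes through PUT32 and GET32, swapping those two functions for a hardware, network, or simulator backend leaves the driver code unchanged.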


Source: https://habr.com/ru/post/1300116/

