ld has a little-known feature, -r / --relocatable, which combines several object files into one that can later be linked into the final product. If you can get LTO to happen at this step, but not later, you get the kind of “partial” LTO you are looking for.
Sadly, ld -r by itself won’t work; it simply combines all the LTO information so that it can be processed later. But invoking it through the gcc driver (gcc -r) seems to work:
a.c
int a() {
    return 42;
}
b.c
int a(void);
int b() {
    return a();
}
c.c
int b(void);
int c() {
    return b();
}
d.c
int c(void);
int main() {
    return c();
}
$ gcc -O3 -flto -c [a-d].c
$ gcc -O3 -r -nostdlib a.o b.o -o g1.o
$ gcc -O3 -r -nostdlib c.o d.o -o g2.o
$ gcc -O3 -fno-lto g1.o g2.o
$ objdump -d a.out
...
00000000000004f0 <main>:
 4f0:   e9 1b 01 00 00          jmpq   610 <b>
...
0000000000000610 <b>:
 610:   b8 2a 00 00 00          mov    $0x2a,%eax
 615:   c3                      retq
...
So main() got optimized to return b();, and b() got optimized to return 42;, but there were no interprocedural optimizations between the two groups.