As part of a self-education project, I looked at how g++ handles the std::complex type and was puzzled by this simple function:
#include <complex>
std::complex<double> c;
void get(std::complex<double> &res){
res=c;
}
Compiled with g++-6.3 -O3 (or also -Os) for Linux64, I got this result:
movsd c(%rip), %xmm0
movsd %xmm0, (%rdi)
movsd c+8(%rip), %xmm0
movsd %xmm0, 8(%rdi)
ret
So it moves the real and imaginary parts separately, as two 64-bit floats. However, I would expect the assembly to use two movups instead of four movsd, i.e. to move the real and imaginary parts together as one 128-bit packet:
movups c(%rip), %xmm0
movups %xmm0, (%rdi)
ret
On my machine (Intel Broadwell) this is not only twice as fast as the movsd version, it also needs only 16 bytes, whereas the movsd version needs 36 bytes.
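One way to experiment with a whole-object copy without writing assembly by hand is to copy all 16 bytes in a single call. The sketch below is only an illustration: get_memcpy is a hypothetical helper, the value stored in c is a sample, and whether a given compiler lowers the fixed-size std::memcpy to one 128-bit load/store pair is an assumption that has to be checked in the generated assembly.

```cpp
#include <cassert>
#include <complex>
#include <cstring>

// std::complex<double> has the same layout as double[2] (real part first),
// so the whole object is 16 bytes, i.e. one 128-bit packet.
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "complex<double> is expected to be two packed doubles");

std::complex<double> c(3.0, 4.0);  // sample value, assumed for illustration

// Hypothetical helper: copies real and imaginary parts in one 16-byte block
// instead of two separate 8-byte assignments.
void get_memcpy(std::complex<double> &res) {
    std::memcpy(reinterpret_cast<double *>(&res),
                reinterpret_cast<const double *>(&c),
                2 * sizeof(double));
}
```

Comparing the output of g++ -O3 -S for get_memcpy and for the original get shows whether the compiler treats the two forms differently.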
What is the reason g++ builds the assembly with movsd?
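- Is there a compiler flag, in addition to -O3, that would make g++ emit movups?
- Does movups have drawbacks I am not aware of?
- Is g++ simply not optimized for this case?
- Something else?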
More context: I am comparing the following two function signatures:
std::complex<double> get(){
return c;
}
void get(std::complex<double> &res){
res=c;
}
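The first version has to return the real and imaginary parts in separate registers (xmm0 and xmm1) because of the SystemV ABI. The second version could take advantage of SSE operations that work on 128 bits, but that does not happen with my g++ version.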
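From the caller's side the two signatures are interchangeable, which is why only the generated code is worth comparing. A minimal sketch, where the value stored in c and the helper same_result are assumptions added for illustration:

```cpp
#include <cassert>
#include <complex>

std::complex<double> c(1.5, -2.5);  // sample value, assumed for illustration

// Version 1: returns by value.
std::complex<double> get() { return c; }

// Version 2: writes through a reference, so the result goes straight to memory.
void get(std::complex<double> &res) { res = c; }

// Hypothetical helper: checks that both versions deliver the same value.
bool same_result() {
    std::complex<double> by_value = get();
    std::complex<double> by_ref;
    get(by_ref);
    return by_value == by_ref;
}
```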
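Edit: As kennytm's answer suggests, g++ appears to produce suboptimal assembly here. It always uses 4 movsd to copy a std::complex from one memory location to another, for example also for: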
void get(std::complex<double> *res){
res[1]=res[0];
}
So I have reported this on gcc-bugzilla.