g++ handling of std::complex copies

As part of a self-education project, I looked at how g++ handles the std::complex type and was puzzled by this simple function:

#include <complex>  
std::complex<double> c;

void get(std::complex<double> &res){
    res=c;
}

Compiled with g++-6.3 -O3 (or -Os) for Linux64, I got this result:

    movsd   c(%rip), %xmm0
    movsd   %xmm0, (%rdi)
    movsd   c+8(%rip), %xmm0
    movsd   %xmm0, 8(%rdi)
    ret

So it moves the real and imaginary parts separately, as two 64-bit floats. However, I would expect the assembly to use two movups instead of four movsd, i.e. to move the real and imaginary parts together as one 128-bit packet:

    movups  c(%rip), %xmm0
    movups  %xmm0, (%rdi)
    ret

On my machine (Intel Broadwell) this is not only twice as fast as the movsd version, it also needs only 16 bytes, whereas the movsd version takes 36 bytes.

Why does g++ generate the assembly with movsd?

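  1. Is there a flag I should pass in addition to -O3 to get movups?
  2. Is there a drawback to movups that I am not aware of?
  3. Is g++ simply not performing this optimization?
  4. Something else?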

More context: I am choosing between the following two function signatures:

std::complex<double> get(){
    return c;
}

void get(std::complex<double> &res){
    res=c;
}

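The first version returns the real and imaginary parts in two separate registers (xmm0 and xmm1), as the SystemV ABI requires. With the second version I hoped to benefit from SSE operations that work on 128 bits, but that does not happen with g++.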
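For illustration, the 128-bit copy can be requested explicitly with SSE2 intrinsics. This is only a sketch under assumptions: it targets x86-64 (where SSE2 is always available), the helper name get_sse is hypothetical and not part of the original question, and it relies on the standard's guarantee that a std::complex<double> can be accessed as an array of two doubles. The compiler will typically emit one unaligned 128-bit load and one store for it, but the exact instructions (movups or movupd) should be checked in the generated assembly.

```cpp
#include <complex>
#include <emmintrin.h>  // SSE2 intrinsics: _mm_loadu_pd, _mm_storeu_pd

std::complex<double> c;

// Hypothetical helper (not from the original post): copy c into res
// with a single unaligned 128-bit load and a single 128-bit store.
void get_sse(std::complex<double> &res) {
    // std::complex<double> is guaranteed to be accessible as double[2]:
    // element 0 is the real part, element 1 the imaginary part.
    const double *src = reinterpret_cast<const double *>(&c);
    double *dst = reinterpret_cast<double *>(&res);
    _mm_storeu_pd(dst, _mm_loadu_pd(src));
}
```

The unaligned variants are used because nothing here promises 16-byte alignment for the reference passed in.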


Edit: following kennytm's comment about g++, note that a std::complex copy compiles to 4 movsd in a case like this:

void get(std::complex<double> *res){
    res[1]=res[0];
}

For this case, see gcc-bugzilla.
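A portable way to express the same copy as one 16-byte block is a fixed-size memcpy. The sketch below is an illustration only: the helper name copy_first_to_second is hypothetical, and whether a particular g++ version lowers the memcpy to a single 128-bit load and store is an assumption that has to be verified in the compiler output.

```cpp
#include <complex>
#include <cstring>

// Hypothetical helper (not from the original post): copy res[0] into res[1]
// as one 16-byte block instead of assigning the two doubles separately.
void copy_first_to_second(std::complex<double> *res) {
    static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
                  "std::complex<double> is expected to be two packed doubles");
    // The standard guarantees array-of-two-doubles access to
    // std::complex<double>; res[0] and res[1] do not overlap.
    std::memcpy(reinterpret_cast<double *>(&res[1]),
                reinterpret_cast<const double *>(&res[0]),
                2 * sizeof(double));
}
```

Unlike an intrinsics-based version, this does not depend on x86 headers, so it leaves the choice of instructions entirely to the compiler.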


I would say number 3: g++ simply does not perform this optimization.

Both clang and icc use SSE here. See https://godbolt.org/g/55lPv0.

get(std::complex<double>&):
        movups    c(%rip), %xmm0
        movups    %xmm0, (%rdi)  
        ret

Source: https://habr.com/ru/post/1016408/

