You can do this as part of the host-to-device copy. Each copy takes one of the contiguous input arrays on the host and copies it to the device in strided fashion, so the real and imaginary parts end up interleaved. The storage layout of complex data types in CUDA is compatible with the layout defined for complex types in Fortran and C++, i.e. a structure with a real part followed by an imaginary part.
float *real_vec;        // host vector, real part
float *imag_vec;        // host vector, imaginary part
float2 *complex_vec_d;  // device vector, single-precision complex
cudaError_t cudaStat;   // status of each copy
float *tmp_d = (float *) complex_vec_d;

// Real parts: n rows of one float each, written every 2 floats on the device
cudaStat = cudaMemcpy2D (tmp_d, 2 * sizeof(tmp_d[0]),
                         real_vec, 1 * sizeof(real_vec[0]),
                         sizeof(real_vec[0]), n, cudaMemcpyHostToDevice);
// Imaginary parts: same stride, offset by one float
cudaStat = cudaMemcpy2D (tmp_d + 1, 2 * sizeof(tmp_d[0]),
                         imag_vec, 1 * sizeof(imag_vec[0]),
                         sizeof(imag_vec[0]), n, cudaMemcpyHostToDevice);
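To see why those pitch arguments produce an interleaved layout, here is a rough host-only sketch in plain C. It treats each float as a one-element "row" and copies rows with different source and destination pitches, which is how the two cudaMemcpy2D calls above are being used. The helpers copy2d_host and interleave_complex are hypothetical names made up for this illustration; they are not part of the CUDA API and only mimic the row/pitch behaviour on host memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host-only stand-in for a pitched 2D copy (hypothetical helper, not a
   CUDA API): copies `height` rows of `width` bytes each. Consecutive rows
   start `spitch` bytes apart in src and `dpitch` bytes apart in dst. */
void copy2d_host(void *dst, size_t dpitch,
                 const void *src, size_t spitch,
                 size_t width, size_t height)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t row;

    /* A row must fit inside both pitches. */
    assert(width <= dpitch && width <= spitch);

    for (row = 0; row < height; ++row) {
        memcpy(d + row * dpitch, s + row * spitch, width);
    }
}

/* Interleave split real/imaginary arrays into (re, im) pairs, using the
   same pitch, width and height arguments as the two copies above.
   `out` must have room for 2 * n floats. */
void interleave_complex(float *out, const float *re,
                        const float *im, size_t n)
{
    /* Real parts land at out[0], out[2], out[4], ... */
    copy2d_host(out, 2 * sizeof(float), re, sizeof(float),
                sizeof(float), n);
    /* Imaginary parts land at out[1], out[3], out[5], ... */
    copy2d_host(out + 1, 2 * sizeof(float), im, sizeof(float),
                sizeof(float), n);
}
```

For example, with re = {1, 2, 3} and im = {10, 20, 30}, calling interleave_complex(out, re, im, 3) leaves out holding 1, 10, 2, 20, 3, 30, which is the real-then-imaginary layout a float2 array expects.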