OpenCL struct values are correct on the CPU but not on the GPU

I have a struct in a file that is included by both the host code and the kernel code:

    typedef struct {
        float x, y, z, dir_x, dir_y, dir_z;
        int radius;
    } WorklistStruct;

I create this struct in my C++ host code and pass it to the OpenCL kernel through a buffer.

If I select a CPU device for the computation, printing an item with the statement below gives the following results:

    printf("item:[%f,%f,%f][%f,%f,%f]%d,%d\n",
           item.x, item.y, item.z,
           item.dir_x, item.dir_y, item.dir_z,
           item.radius, sizeof(float));

Host:

 item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4 

Device (CPU):

 item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4 

If I choose a GPU (AMD) device for the computation instead, strange things happen:

Host:

 item:[58.406261,57.786015,58.137501][2.000000,2.000000,2.000000]2,4 

Device (GPU):

 item:[58.406261,2.000000,0.000000][0.000000,0.000000,0.000000]0,0 

Notably, even sizeof(float) is garbage on the GPU.
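One thing worth ruling out here, as an assumption rather than a confirmed cause: `sizeof` yields a `size_t`, while `%d` expects an `int`, so the last argument of the printf call above does not match its conversion specifier. A minimal host-side C++ sketch that makes the two agree with a cast (the helper name `format_float_size` is made up for illustration):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats sizeof(float) so that the argument type
// matches the conversion specifier. sizeof yields size_t, so the value is
// cast to int before it is passed for "%d".
std::string format_float_size() {
    char buf[32];
    int written = std::snprintf(buf, sizeof(buf), "%d",
                                static_cast<int>(sizeof(float)));
    // snprintf returns the number of characters it wanted to write.
    assert(written > 0 && written < static_cast<int>(sizeof(buf)));
    return std::string(buf);
}
```

In kernel code the equivalent cast would be `(int)sizeof(float)`. Whether this changes the GPU output in this particular case is not something the thread confirms.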

I suspect that the memory layout of the floats differs between the devices.

Note: the struct is an element of an array of such structs, and every struct in this array is garbage on the GPU.
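A layout suspicion like this can be made concrete by recording what the host compiler actually produces. The sketch below is only an illustration: `WorklistLayout` and `host_worklist_layout` are hypothetical names, and the numbers it returns would still have to be compared by hand against values computed in the kernel (for example `sizeof(WorklistStruct)` written into an output buffer).

```cpp
#include <cassert>
#include <cstddef>

// Same definition as the shared header in the question.
typedef struct {
    float x, y, z, dir_x, dir_y, dir_z;
    int radius;
} WorklistStruct;

// Hypothetical record of the layout facts worth comparing across devices.
struct WorklistLayout {
    std::size_t size;        // sizeof(WorklistStruct)
    std::size_t off_dir_x;   // byte offset of dir_x
    std::size_t off_radius;  // byte offset of radius
};

// Collects the layout as the host compiler sees it.
WorklistLayout host_worklist_layout() {
    WorklistLayout layout = {
        sizeof(WorklistStruct),
        offsetof(WorklistStruct, dir_x),
        offsetof(WorklistStruct, radius)
    };
    // Sanity check: the last member must fit inside the struct.
    assert(layout.off_radius + sizeof(int) <= layout.size);
    return layout;
}
```

On a typical host with 4-byte `float` and `int`, this gives a size of 28 bytes with `dir_x` at offset 12 and `radius` at offset 24. A device that reports a different size would point to a real layout mismatch rather than a printing problem.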

Does anyone have an idea why this happens and how I can predict it?

EDIT: I added another %d to the format string and passed the constant 1 for it. The printed result was 1065353216.
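For reference, 1065353216 is 0x3F800000, which is the IEEE 754 single-precision bit pattern of 1.0f. That would be consistent with a float's bits being printed through an integer conversion, although the thread does not establish where that happens. A small host-side sketch to check the number (the function names are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a float as an unsigned 32-bit integer.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));  // well-defined type punning
    return bits;
}

// Demonstrates that the value printed in the question equals the bits of 1.0f.
void check_printed_value() {
    assert(float_bits(1.0f) == 1065353216u);
    assert(float_bits(1.0f) == 0x3F800000u);
}
```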

EDIT: Here are the two structs that I use:

    typedef struct {
        float x, y, z,             // base coordinates
              dir_x, dir_y, dir_z; // direction
        int radius;                // radius
    } WorklistStruct;

    typedef struct {
        float base_x, base_y, base_z; // base point
        float radius;                 // radius
        float dir_x, dir_y, dir_z;    // initial direction
    } ReturnStruct;

I tested a few other things, and this seems to be a problem with printf. The values themselves appear to be correct: I copied them into the return struct, read them back on the host, and they were correct.

I would rather not post all the related code, since it is several hundred lines. If nobody has an idea, I will condense it into a smaller example.

Oh, and for printing I use #pragma OPENCL EXTENSION cl_amd_printf : enable.

Edit: Looks like a problem with printf. I just don't use it anymore.

2 answers

It looks like I compiled against the wrong OpenCL headers. If I run the code on the Intel platform (OpenCL 1.2), everything works. On my AMD platform (OpenCL 1.1), however, I get the strange values.

I will try other headers.


There is an easy way to check what is happening:

1 - Create data on the host side and initialize it:

    int num_points = 128;
    std::vector<WorklistStruct> works(num_points);
    std::vector<ReturnStruct> returns(num_points);
    for (WorklistStruct &work : works) {
        work = InitializeItSomehow();
        std::cout << work.x << " " << work.y << " " << work.z << std::endl;
        std::cout << work.radius << std::endl;
    }
    // Same stuff with returns
    ...

2 - Create the buffers on the device side using the CL_MEM_COPY_HOST_PTR flag, then map them and check data integrity:

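    cl::Buffer dev_works(..., CL_MEM_COPY_HOST_PTR, (void*)&works[0]);
    cl::Buffer dev_rets(..., CL_MEM_COPY_HOST_PTR, (void*)&returns[0]);
    // Then map the buffers to check the data.
    // Map(...) is shorthand here; with the C++ bindings, mapping goes
    // through cl::CommandQueue::enqueueMapBuffer.
    WorklistStruct *mapped_works = dev_works.Map(...);
    ReturnStruct *mapped_rets = dev_rets.Map(...);
    // Output values & unmap buffers
    ...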

3 - Check the consistency of the data on the device side, as you did before.
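The integrity check in step 2 can be sketched as a byte-wise comparison between the original host data and whatever was read back through the mapped pointer. This is an illustration only: `first_mismatch` is a hypothetical helper, and the `readback` vector stands in for data copied out of the mapped buffer, so no OpenCL calls appear here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Same definition as the shared header in the question.
typedef struct {
    float x, y, z, dir_x, dir_y, dir_z;
    int radius;
} WorklistStruct;

// Returns the index of the first element whose bytes differ, or -1 if both
// arrays are identical. A length difference counts as a mismatch at the
// first index that only one of the arrays has.
// Assumption: the struct has no padding bytes, so memcmp is meaningful.
std::ptrdiff_t first_mismatch(const std::vector<WorklistStruct>& host,
                              const std::vector<WorklistStruct>& readback) {
    std::size_t common = host.size() < readback.size() ? host.size()
                                                       : readback.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (std::memcmp(&host[i], &readback[i], sizeof(WorklistStruct)) != 0) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    if (host.size() != readback.size()) {
        return static_cast<std::ptrdiff_t>(common);
    }
    return -1;
}

// Usage example with made-up data: corrupt one element and locate it.
void first_mismatch_example() {
    WorklistStruct item = {20.0f, 7.0f, 35.0f, 0.0f, -3.0f, 0.0f, 1};
    std::vector<WorklistStruct> host(4, item);
    std::vector<WorklistStruct> readback = host;
    assert(first_mismatch(host, readback) == -1);
    readback[2].radius = 0;
    assert(first_mismatch(host, readback) == 2);
}
```

If the comparison succeeds on the host but the kernel still prints different values, the transfer itself is probably fine and the problem lies in how the kernel reads or prints the data.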

Also, make sure that the code (presumably a header) included by both the kernel and the host-side code is pure OpenCL C (the AMD compiler can sometimes "swallow" errors), and that you pass the include directory when building the OpenCL program (the "-I" option at the clBuildProgram stage).

Edit: At each step, collect the return codes (or catch the exceptions). The -Werror option at the clBuildProgram stage can also be useful.
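A rough sketch of both suggestions, written so that it compiles without the OpenCL headers. The names `check_cl` and `build_options` are hypothetical, plain `int` stands in for `cl_int`, and the only OpenCL facts relied on are that success is reported as 0 (CL_SUCCESS) and that `-I <dir>` and `-Werror` are build options accepted by clBuildProgram.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: turns a non-zero OpenCL status into an exception.
// Assumption: int stands in for cl_int, and 0 means CL_SUCCESS.
void check_cl(int err, const char* what) {
    if (err != 0) {
        throw std::runtime_error(std::string(what) +
                                 " failed with OpenCL error " +
                                 std::to_string(err));
    }
}

// Hypothetical helper: assembles the options string for clBuildProgram,
// adding the include directory of the shared header and -Werror.
std::string build_options(const std::string& include_dir) {
    assert(!include_dir.empty());
    return "-I " + include_dir + " -Werror";
}
```

In real host code, every `cl*` call would have its status passed through something like `check_cl`, and the string from `build_options` would be given as the options argument of clBuildProgram.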


Source: https://habr.com/ru/post/1202789/

