I have a 10-character char array that I would like to pass as an argument to the comparator used by the Thrust sort function.
To allocate memory for this array, I use cudaMalloc. However, cudaMalloc allocates in global memory, so whenever a thread wants to read data from this array, it must access global memory.
But this array is small, and I believe it would be more efficient if it were stored in shared memory, or even in the registers of each thread. Is it possible to achieve this with Thrust, and if so, how?
Here is the comparator:
struct comp {
    int *data_to_sort;
    char *helpingArray;
    comp(int *data_ptr) { this->data_to_sort = data_ptr; }
    __host__ __device__ bool operator()(const int &a, const int &b) {
        // ... (comparison body not shown)
    }
};
Then I allocate memory for helpingArray in global memory and pass it, as part of the comp struct, to the sort function.
Please note that the data_to_sort array is stored in global memory because it contains the data that needs to be sorted; we cannot avoid this.
This works fine, and the sort is faster than the CPU sort. However, I believe that if I avoid storing helpingArray in global memory, the sort will become much faster.
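To make the idea concrete, here is a rough, unbenchmarked sketch of one thing that might be tried: keeping the 10 chars as a fixed-size member of the comparator instead of a pointer, so the table is copied by value together with the functor when Thrust hands it to its kernels. The names comp_embedded and sort_with_embedded_helper and the ordering rule (rank each value by helper[value % 10]) are hypothetical and only there for illustration, since the real comparison body is not shown above. Whether the compiler actually keeps the table in registers is not guaranteed; a dynamically indexed array may end up in local memory.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <vector>

// Hypothetical comparator: the 10-char table is a by-value member, so it
// travels with the functor rather than being reached through a pointer
// obtained from cudaMalloc.
struct comp_embedded {
    char helper[10];

    __host__ __device__
    explicit comp_embedded(const char (&src)[10]) {
        for (int i = 0; i < 10; ++i) helper[i] = src[i];
    }

    // Illustrative ordering only (an assumption, not the original logic):
    // rank each value by helper[value % 10], break ties by the value itself.
    // Assumes non-negative inputs so that value % 10 is a valid index.
    __host__ __device__
    bool operator()(const int &a, const int &b) const {
        char ra = helper[a % 10];
        char rb = helper[b % 10];
        if (ra != rb) return ra < rb;
        return a < b;
    }
};

// Copies the input to the device, sorts it there with the embedded-table
// comparator, and copies the result back to the host.
std::vector<int> sort_with_embedded_helper(const std::vector<int> &input,
                                           const char (&helper)[10]) {
    thrust::device_vector<int> d(input.begin(), input.end());
    thrust::sort(d.begin(), d.end(), comp_embedded(helper));
    std::vector<int> out(d.size());
    thrust::copy(d.begin(), d.end(), out.begin());
    return out;
}
```

With this layout no cudaMalloc is needed for the helper table at all; only the data being sorted lives in global memory. It would still need to be timed against the pointer-based version to see whether it helps.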