I was stepping through some C/CUDA code in the debugger, for example:
for (uint i = threadIdx.x; i < 8379; i += 256)
    sum += d_PartialHistograms[blockIdx.x + i * HISTOGRAM64_BIN_COUNT];
I was completely confused because the debugger stepped over the whole loop in a single step, although the result was correct. I then realized that when I put curly braces around the loop body, as in the following snippet, the debugger behaved as expected and stepped through each iteration.
for (uint i = threadIdx.x; i < 8379; i += 256) {
    sum += d_PartialHistograms[blockIdx.x + i * HISTOGRAM64_BIN_COUNT];
}
So are for loops without curly braces handled differently in C or by the debugger, or is this perhaps specific to CUDA?
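As far as I understand the C grammar, the body of a for loop is a single statement, and braces only turn it into a compound statement, so both forms should compute the same thing. To rule out a language-level difference, here is a minimal host-side sketch in plain C. It is not the original kernel: the function names, the BIN_COUNT and STRIDE values, and the start and bin parameters (standing in for threadIdx.x and blockIdx.x) are all assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's constants; values are illustrative. */
#define BIN_COUNT 64u
#define STRIDE 256u

/* Braceless form: the loop body is the single statement that follows the header.
   The caller must supply at least bin + (n - 1) * BIN_COUNT + 1 elements. */
unsigned strided_sum_no_braces(const unsigned *data, unsigned n,
                               unsigned start, unsigned bin)
{
    unsigned sum = 0;
    for (unsigned i = start; i < n; i += STRIDE)
        sum += data[bin + i * BIN_COUNT];
    return sum;
}

/* Braced form: the same statement wrapped in a compound statement. */
unsigned strided_sum_braces(const unsigned *data, unsigned n,
                            unsigned start, unsigned bin)
{
    unsigned sum = 0;
    for (unsigned i = start; i < n; i += STRIDE) {
        sum += data[bin + i * BIN_COUNT];
    }
    return sum;
}
```

Calling both functions on the same input should return equal sums, which matches my observation that the result was correct either way. That would suggest the difference lies in how the compiler emits line information or how the debugger steps through it, though that part is my guess.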
thanks
zenna