poona January 28, 2010, 5:02pm 1
This is my kernel

    __device__ void AddOne(int *acc)
    {
        atomicAdd(acc, 1);
    }

    extern "C" __global__ void test(int* var)
    {
        __shared__ int acc;
        if(threadIdx.x==0) acc = 0;
        __syncthreads();
        AddOne(&acc);
        __syncthreads();
        if(threadIdx.x==0) var[0] = acc;
    }

I compiled this using the following command and I had no errors or warnings while compiling using nvcc.

    nvcc kernel.cu --ptx -arch sm_11

If I run this program, the module fails to load using cuModuleLoad(). Even the error returned isn't any of the ones listed in the reference manual.
If I replace the line

    atomicAdd(acc, 1);

with

    acc[0] = 1;

the module gets loaded correctly, and I am able to retrieve the value 1 from the kernel.
Can someone help me with this? Thanks for reading.
I have a compute capability 1.1 device and I am running version 2.3 of the toolkit.
Shared memory atomic operations are not supported on compute capability 1.1 hardware.
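As a rough illustration of the point above, the device's compute capability can be queried at run time before relying on shared-memory atomics. This is a minimal sketch using the runtime API rather than the driver API from the original post. The helper name `supportsSharedAtomics` and the choice of device 0 are assumptions made for the example. The 1.2 threshold is the one the programming guide gives for 32-bit atomics on shared memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API): 32-bit atomics on shared
// memory need compute capability 1.2 or higher.
bool supportsSharedAtomics(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 2);
}

int main() {
    cudaDeviceProp prop;
    // Device 0 is an assumption; a multi-GPU system may need another index.
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("Device: %s, compute capability %d.%d\n",
                prop.name, prop.major, prop.minor);
    std::printf("Shared-memory atomicAdd: %s\n",
                supportsSharedAtomics(prop.major, prop.minor)
                    ? "supported" : "not supported");
    return 0;
}
```

On a 1.1 device like the one in this thread, the check would report that shared-memory atomics are not supported, which matches the failed module load.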
poona January 28, 2010, 5:57pm 3
Right. Just saw it in the appendix. Is the atomic operation expensive to use, assuming I am using it on shared data?
poona January 28, 2010, 6:15pm 4
I am running this code instead.

    __device__ int acc;

    __device__ void AddOne()
    {
        atomicAdd(&acc, 1);
    }

    extern "C" __global__ void test(int *var)
    {
        if(threadIdx.x==0) acc = 0;
        __syncthreads();
        AddOne();
        __syncthreads();
        if(threadIdx.x==0) var[0] = acc;
    }

I am invoking the kernel with a single block containing 31 threads.
The first time I run ./a.out I get the right value (31). But the next time I run it, I get 62. If I keep running the app, each run seems to add to the value the variable had at the end of the previous invocation. According to the guide, a __device__ variable should have the lifetime of the application, so I am not sure how it is persisting across multiple invocations.
    31
    poona@poona_desktop:~/development/cuda# ./a.out
    62
    poona@poona_desktop:~/development/cuda# ./a.out
    93
    poona@poona_desktop:~/development/cuda# ./a.out
    124
    poona@poona_desktop:~/development/cuda# ./a.out
    155
    poona@poona_desktop:~/development/cuda# ./a.out
    186
    poona@poona_desktop:~/development/cuda# ./a.out
    217
    poona@poona_desktop:~/development/cuda# ./a.out
and so on.
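One way to rule out stale state in a case like this is to reset the counter from the host before each launch instead of relying on the in-kernel `acc = 0`. The following is only a sketch under stated assumptions: it uses the runtime API (`cudaMemcpyToSymbol` / `cudaMemcpyFromSymbol`) from a current toolkit rather than the driver API and toolkit 2.3 used in the post, and the names `addOnePerThread` and `countThreads` are hypothetical. It does not explain why the original program accumulates across runs.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ int acc;  // global-memory counter, declared as in the post

// Every thread adds one to the counter (global-memory atomic).
__global__ void addOnePerThread() {
    atomicAdd(&acc, 1);
}

// Hypothetical helper: resets acc from the host, launches a single block,
// and returns the final count, or -1 if any CUDA call fails.
int countThreads(int threads) {
    int value = 0;
    // Explicit reset before the launch.
    if (cudaMemcpyToSymbol(acc, &value, sizeof(int)) != cudaSuccess) return -1;
    addOnePerThread<<<1, threads>>>();
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;
    // Copy the result back to the host.
    if (cudaMemcpyFromSymbol(&value, acc, sizeof(int)) != cudaSuccess) return -1;
    return value;
}

int main() {
    // Two launches in a row; each starts from a freshly reset counter.
    std::printf("%d\n", countThreads(31));
    std::printf("%d\n", countThreads(31));
    return 0;
}
```

Because the reset happens on the host before the kernel starts, the count no longer depends on whatever value the variable held from an earlier launch.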
Try declaring acc explicitly as __global__.
poona January 28, 2010, 6:23pm 6
Can we declare a var as __global__? I tried it with

    __global__ int acc;

and I got these errors

    kernel_atomic.cu(1): warning: invalid attribute for variable "acc"
    kernel_atomic.cu(6): error: identifier "acc" is undefined
    kernel_atomic.cu(14): error: identifier "acc" is undefined
    kernel_atomic.cu(23): error: identifier "acc" is undefined
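The compiler output above is consistent with how the qualifiers are defined: `__global__` marks a kernel function, while `__device__` is the qualifier for a variable that lives in global memory. A minimal sketch of the two used together follows. It assumes the runtime API and a current toolkit, and the value 42 and the names `readAcc` and `fetchAcc` are purely illustrative.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __device__ declares a variable that resides in global memory on the GPU.
__device__ int acc = 42;  // 42 is an arbitrary illustrative value

// __global__ declares a kernel: a function the host launches on the device.
__global__ void readAcc(int *var) {
    if (threadIdx.x == 0) var[0] = acc;
}

// Hypothetical helper: launches the kernel and returns the value it copied
// out, or -1 if a CUDA call fails.
int fetchAcc() {
    int *d_var = nullptr;
    int h_var = -1;
    if (cudaMalloc(&d_var, sizeof(int)) != cudaSuccess) return -1;
    readAcc<<<1, 31>>>(d_var);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(&h_var, d_var, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_var);
    return err == cudaSuccess ? h_var : -1;
}

int main() {
    std::printf("%d\n", fetchAcc());
    return 0;
}
```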