Using CUDA with OpenGL and DirectX: CUDA mapping functions run too slow

The mapping functions for OpenGL and DirectX are really slow :( !!!

I thought that using OpenGL or DirectX with CUDA would increase the speed of my program, but the result was not what I expected…

When I run the program without calling the CUDA kernel or the mapping functions, the speed is 1500 fps (with OpenGL)!

void Test() {
    // My code here
}

When I run the program with only the map and unmap functions, the speed is 115 fps (REALLY BAD…)

void Test() {
    cudaGLMapBufferObject();
    // cdKernel();
    cudaGLUnmapBufferObject();
    // My code here
}

And finally, when I also call the kernel, the speed is 65 fps (unbelievable)

void Test() {
    cudaGLMapBufferObject();
    cdKernel();
    cudaGLUnmapBufferObject();
    // My code here
}

It’s really bad, because calling only the “map” and “unmap” functions drops the speed from 1500 fps to 115 fps… ?_? !!! (That is slower than what I get with SetDIBitsToDevice + CUDA, which is 320 fps.)

Can anyone explain this to me, please?

Hit the search button at the top right. This is asked very often.