If I comment out all of the host functions, everything works fine. Also, when I debug with Nsight, oddly only some of the threads hit the following error:
I have some news. The compiler warns “Cannot tell what pointer points to, assuming global memory” and points to the line with “do_sth(a,b)” inside the global function, so I think I’m on the right track.
Now the question: how do I tell the compiler that the pointer refers to device memory?
In case anyone runs into the same issue: I was able to solve the problem. I don’t know why, but I had to force-inline the device code (`__forceinline__`). The working code now looks like this:
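For anyone who wants to try the same thing, here is a minimal sketch of what force-inlining a device function can look like. This is not the original listing: only the name `do_sth` comes from the post above, while its signature and body, the kernel name `kernel`, the helper `run_do_sth_check`, and the test data are all made up for illustration. My guess as to why it helps is that once the call is inlined, the compiler can see that the pointers come from the kernel's global-memory arguments, but I have not verified that.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the do_sth(a, b) mentioned in the post; the real
// signature and body are not shown there. __forceinline__ asks nvcc to inline
// this function into its caller.
__device__ __forceinline__ void do_sth(float *a, const float *b)
{
    *a += *b;
}

// Hypothetical kernel: every thread passes pointers into global memory.
__global__ void kernel(float *out, const float *in, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        do_sth(&out[i], &in[i]);
    }
}

// Runs the kernel on made-up data. Returns the number of wrong elements,
// or -1 if a CUDA call failed.
int run_do_sth_check()
{
    const int n = 256;
    const size_t bytes = n * sizeof(float);
    float h_out[n], h_in[n];
    for (int i = 0; i < n; ++i) {
        h_out[i] = 1.0f;
        h_in[i] = static_cast<float>(i);
    }

    float *d_out = nullptr;
    float *d_in = nullptr;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) { cudaFree(d_out); return -1; }

    cudaMemcpy(d_out, h_out, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    kernel<<<(n + 127) / 128, 128>>>(d_out, d_in, n);
    cudaError_t err = cudaDeviceSynchronize();  // surfaces launch and runtime errors
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    cudaFree(d_in);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != 1.0f + static_cast<float>(i)) ++mismatches;
    }
    return mismatches;
}

int main()
{
    int result = run_do_sth_check();
    std::printf("result: %d (0 means all elements matched)\n", result);
    return result == 0 ? 0 : 1;
}
```

Compiling with plain `nvcc` should be enough for this sketch. Whether the compiler warning goes away will depend on the toolkit version and target architecture, so treat it as a starting point, not a guaranteed fix.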