I am new to GPU programming and am exploring all the available ways of writing my application. I have looked at the Parallel Thread Execution (PTX) ISA version 1.0 Release 1.0 document. I understood most of what is written there, but I am still not able to write a complete program. Can anyone help me out and give me a very basic code example, along with the compilation instructions, using PTX?
May I ask why you want to write your program in PTX? Why not use CUDA? CUDA generates PTX in the background, right? Or am I just a stupid moron who doesn't know anything about GPGPU?
I think their reason might be to avoid some optimization bug in nvcc. PTX also gives finer control over memory, so that you can, for example, try to force a variable to stay in a register. With PTX you can write your own optimized code, just as people hand-optimize in traditional assembly.
(So I would say inline PTX assembly within CUDA code would be extremely helpful.)
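For what it's worth, here is a minimal sketch of what inline PTX inside CUDA code can look like. It assumes a CUDA toolkit recent enough to accept `asm()` statements in device code (the PTX ISA 1.0 era tools mentioned above may not), and the names `add_ptx`, `add_kernel`, and the file name `add_ptx.cu` are made up for illustration. The kernel adds two integer arrays, with the addition itself written as a single PTX `add.s32` instruction.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: adds two 32-bit signed integers using one inline PTX
// instruction. "=r" marks a 32-bit register output, "r" a 32-bit register input.
__device__ int add_ptx(int a, int b) {
    int result;
    asm("add.s32 %0, %1, %2;" : "=r"(result) : "r"(a), "r"(b));
    return result;
}

// Hypothetical kernel: each thread adds one pair of elements.
__global__ void add_kernel(const int* a, const int* b, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = add_ptx(a[i], b[i]);
    }
}

int main() {
    const int n = 4;
    int h_a[n] = {1, 2, 3, 4};
    int h_b[n] = {10, 20, 30, 40};
    int h_out[n] = {0, 0, 0, 0};

    // Allocate device buffers and copy the inputs over.
    int *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    cudaMalloc(&d_a, n * sizeof(int));
    cudaMalloc(&d_b, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_a, h_a, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, n * sizeof(int), cudaMemcpyHostToDevice);

    // One block of n threads is enough for this tiny example.
    add_kernel<<<1, n>>>(d_a, d_b, d_out, n);

    // Check for launch errors, then copy the result back
    // (the blocking copy also waits for the kernel to finish).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) {
        printf("%d + %d = %d\n", h_a[i], h_b[i], h_out[i]);
    }

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return 0;
}
```

Assuming the file is saved as `add_ptx.cu`, `nvcc add_ptx.cu -o add_ptx` should build it. Running `nvcc -ptx add_ptx.cu` instead writes out the PTX that nvcc generates, which is probably the easiest way to get a complete, working PTX file to read and modify. Note that PTX registers are virtual, so the final register allocation is still decided when the PTX is assembled for the target GPU.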