40. Ray Tracing
Ray tracing uses a separate rendering pipeline from both the graphics and compute pipelines (see Ray Tracing Pipeline).
Within the ray tracing pipeline, a pipeline trace ray instruction can be called to perform a ray traversal that invokes the various ray tracing shader stages during its execution. The relationship between the ray tracing pipeline object and the geometries present in the acceleration structure traversed is passed into the ray tracing command in a VkBuffer object known as a shader binding table. OpExecuteCallableKHR can also be used in ray tracing pipelines to invoke a callable shader.
During execution, control alternates between scheduling and other operations. The scheduling functionality is implementation-specific and is responsible for workload execution. The shader stages are programmable. Traversal, which refers to the process of traversing acceleration structures to find potential intersections of rays with geometry, is fixed function.
The programmable portions of the pipeline are exposed in a single-ray programming model, with each invocation handling one ray at a time. Memory operations can be synchronized using standard memory barriers. The Workgroup scope and variables with a storage class of Workgroup must not be used in the ray tracing pipeline.
40.1. Shader Call Instructions
A shader call is an instruction which may cause execution to continue elsewhere by creating one or more invocations that execute a different shader stage.
The following table lists all shader call instructions and which stages each one can directly call.
| Instruction | Intersection | Any-Hit | Closest Hit | Miss | Callable |
|---|---|---|---|---|---|
| X | X | X | X | |
| X | X | X | X | |
| X | ||||
| X | ||||
| X | X | |||
| X | X | |||
| X | X |
The invocations created by shader call instructions are grouped into subgroups by the implementation. Those subgroups may be unrelated to the subgroup of the parent invocation.
Pipeline trace ray instructions can be used recursively; invoked shaders can themselves execute pipeline trace ray instructions, to a maximum depth defined by the maxRecursionDepth or maxRayRecursionDepth limit.
Shaders directly invoked from the API always have a recursion depth of 0; each shader executed by a pipeline trace ray instruction has a recursion depth one higher than the recursion depth of the shader which invoked it. Applications must not invoke a shader with a recursion depth greater than the value of maxRecursionDepth or maxPipelineRayRecursionDepth specified in the pipeline.
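The depth rule above can be sketched on the host side as follows. This is an illustration only, not part of the API: `can_trace_ray` and `clamp_recursion_depth` are hypothetical helper names, and the sketch assumes the application tracks the current recursion depth itself (for example in the ray payload).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: may a shader running at `current_depth` execute
 * another pipeline trace ray instruction? Shaders invoked by that
 * instruction run at current_depth + 1, which must not exceed the
 * recursion depth specified in the pipeline. */
bool can_trace_ray(uint32_t current_depth, uint32_t max_pipeline_depth)
{
    /* A shader already running beyond the limit is an application error. */
    assert(current_depth <= max_pipeline_depth);
    return current_depth < max_pipeline_depth;
}

/* Hypothetical helper: clamp the depth an application would like to
 * request at pipeline creation to the device limit. */
uint32_t clamp_recursion_depth(uint32_t requested, uint32_t device_limit)
{
    return requested < device_limit ? requested : device_limit;
}
```

With a pipeline depth of 1, a ray generation shader (depth 0) may trace, but the closest hit or miss shaders it reaches (depth 1) may not trace further.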
There is no explicit recursion limit for other shader call instructions which may recurse (e.g. OpExecuteCallableKHR) but there is an upper bound determined by the stack size.
An invocation repack instruction is a ray tracing instruction where the implementation may change the set of invocations that are executing. When a repack instruction is encountered, the invocation is suspended and a new invocation begins and executes the instruction. After executing the repack instruction (which may result in other ray tracing shader stages executing) the new invocation ends and the original invocation is resumed, but it may be resumed in a different subgroup or at a different SubgroupLocalInvocationId within the same subgroup. When a subset of invocations in a subgroup execute the invocation repack instruction, those that do not execute it remain in the same subgroup at the same SubgroupLocalInvocationId.
The OpTraceRayKHR, OpTraceRayMotionNV, OpReorderThreadWithHintNV, OpReorderThreadWithHitObjectNV, OpReportIntersectionKHR, and OpExecuteCallableKHR instructions are invocation repack instructions.
The invocations that are executing before a shader call instruction, after the instruction, or are created by the instruction, are shader-call-related.
If the implementation changes the composition of subgroups, the values of SubgroupSize, SubgroupLocalInvocationId, SMIDNV, WarpIDNV, and built-in variables that are derived from them (SubgroupEqMask, SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, SubgroupLtMask) must be changed accordingly by the invocation repack instruction. The application must use Volatile semantics on these BuiltIn variables when used in the ray generation, closest hit, miss, intersection, and callable shaders. Similarly, the application must use Volatile semantics on any RayTmaxKHR decorated BuiltIn used in an intersection shader.
| Note | Subgroup operations are permitted in the programmable ray tracing shader stages. However, shader call instructions place a bound on where results of subgroup instructions or subgroup-scoped instructions that execute the dynamic instance of that instruction are potentially valid. For example, care must be taken when the result of a ballot operation computed before an invocation repack instruction is used after that repack instruction. The ballot may be incorrect as the set of invocations could have changed. While the … For clock operations, the value of a … |
When a ray tracing shader executes a dynamic instance of an invocation repack instruction which results in another ray tracing shader being invoked, their instructions are related by shader-call-order.
For ray tracing invocations that are shader-call-related:
- memory operations on StorageBuffer, Image, and ShaderRecordBufferKHR storage classes can be synchronized using the ShaderCallKHR scope.
- the CallableDataKHR, IncomingCallableDataKHR, RayPayloadKHR, HitAttributeKHR, and IncomingRayPayloadKHR storage classes are system-synchronized and no application availability and visibility operations are required.
- memory operations within a single invocation before and after the shader call instruction are ordered by program-order and do not require explicit synchronization.
40.2. Ray Tracing Commands
Ray tracing commands provoke work in the ray tracing pipeline. Ray tracing commands are recorded into a command buffer and when executed by a queue will produce work that executes according to the currently bound ray tracing pipeline. A ray tracing pipeline must be bound to a command buffer before any ray tracing commands are recorded in that command buffer.
To dispatch ray tracing use:

    // Provided by VK_NV_ray_tracing
    void vkCmdTraceRaysNV(
        VkCommandBuffer commandBuffer,
        VkBuffer raygenShaderBindingTableBuffer,
        VkDeviceSize raygenShaderBindingOffset,
        VkBuffer missShaderBindingTableBuffer,
        VkDeviceSize missShaderBindingOffset,
        VkDeviceSize missShaderBindingStride,
        VkBuffer hitShaderBindingTableBuffer,
        VkDeviceSize hitShaderBindingOffset,
        VkDeviceSize hitShaderBindingStride,
        VkBuffer callableShaderBindingTableBuffer,
        VkDeviceSize callableShaderBindingOffset,
        VkDeviceSize callableShaderBindingStride,
        uint32_t width,
        uint32_t height,
        uint32_t depth);

- commandBuffer is the command buffer into which the command will be recorded.
- raygenShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the ray generation shader stage.
- raygenShaderBindingOffset is the offset in bytes (relative to raygenShaderBindingTableBuffer) of the ray generation shader being used for the trace.
- missShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the miss shader stage.
- missShaderBindingOffset is the offset in bytes (relative to missShaderBindingTableBuffer) of the miss shader being used for the trace.
- missShaderBindingStride is the size in bytes of each shader binding table record in missShaderBindingTableBuffer.
- hitShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the hit shader stages.
- hitShaderBindingOffset is the offset in bytes (relative to hitShaderBindingTableBuffer) of the hit shader group being used for the trace.
- hitShaderBindingStride is the size in bytes of each shader binding table record in hitShaderBindingTableBuffer.
- callableShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the callable shader stage.
- callableShaderBindingOffset is the offset in bytes (relative to callableShaderBindingTableBuffer) of the callable shader being used for the trace.
- callableShaderBindingStride is the size in bytes of each shader binding table record in callableShaderBindingTableBuffer.
- width is the width of the ray trace query dimensions.
- height is the height of the ray trace query dimensions.
- depth is the depth of the ray trace query dimensions.

When the command is executed, a ray generation group of width × height × depth rays is assembled.
To dispatch ray tracing use:

    // Provided by VK_KHR_ray_tracing_pipeline
    void vkCmdTraceRaysKHR(
        VkCommandBuffer commandBuffer,
        const VkStridedDeviceAddressRegionKHR* pRaygenShaderBindingTable,
        const VkStridedDeviceAddressRegionKHR* pMissShaderBindingTable,
        const VkStridedDeviceAddressRegionKHR* pHitShaderBindingTable,
        const VkStridedDeviceAddressRegionKHR* pCallableShaderBindingTable,
        uint32_t width,
        uint32_t height,
        uint32_t depth);

- commandBuffer is the command buffer into which the command will be recorded.
- pRaygenShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the ray generation shader stage.
- pMissShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the miss shader stage.
- pHitShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the hit shader stage.
- pCallableShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the callable shader stage.
- width is the width of the ray trace query dimensions.
- height is the height of the ray trace query dimensions.
- depth is the depth of the ray trace query dimensions.

When the command is executed, a ray generation group of width × height × depth rays is assembled.
When invocation mask image usage is enabled in the bound ray tracing pipeline, the pipeline uses an invocation mask image specified by the command:

    // Provided by VK_HUAWEI_invocation_mask
    void vkCmdBindInvocationMaskHUAWEI(
        VkCommandBuffer commandBuffer,
        VkImageView imageView,
        VkImageLayout imageLayout);

- commandBuffer is the command buffer into which the command will be recorded.
- imageView is an image view handle specifying the invocation mask image. imageView may be VK_NULL_HANDLE, which is equivalent to specifying a view of an image filled with ones.
- imageLayout is the layout that the image subresources accessible from imageView will be in when the invocation mask image is accessed.

To dispatch ray tracing, with some parameters sourced on the device, use:

    // Provided by VK_KHR_ray_tracing_pipeline
    void vkCmdTraceRaysIndirectKHR(
        VkCommandBuffer commandBuffer,
        const VkStridedDeviceAddressRegionKHR* pRaygenShaderBindingTable,
        const VkStridedDeviceAddressRegionKHR* pMissShaderBindingTable,
        const VkStridedDeviceAddressRegionKHR* pHitShaderBindingTable,
        const VkStridedDeviceAddressRegionKHR* pCallableShaderBindingTable,
        VkDeviceAddress indirectDeviceAddress);

- commandBuffer is the command buffer into which the command will be recorded.
- pRaygenShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the ray generation shader stage.
- pMissShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the miss shader stage.
- pHitShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the hit shader stage.
- pCallableShaderBindingTable is a VkStridedDeviceAddressRegionKHR that holds the shader binding table data for the callable shader stage.
- indirectDeviceAddress is a buffer device address which is a pointer to a VkTraceRaysIndirectCommandKHR structure containing the trace ray parameters.

vkCmdTraceRaysIndirectKHR behaves similarly to vkCmdTraceRaysKHR except that the ray trace query dimensions are read by the device from indirectDeviceAddress during execution.
The VkTraceRaysIndirectCommandKHR structure is defined as:

    // Provided by VK_KHR_ray_tracing_pipeline
    typedef struct VkTraceRaysIndirectCommandKHR {
        uint32_t width;
        uint32_t height;
        uint32_t depth;
    } VkTraceRaysIndirectCommandKHR;

- width is the width of the ray trace query dimensions.
- height is the height of the ray trace query dimensions.
- depth is the depth of the ray trace query dimensions.

The members of VkTraceRaysIndirectCommandKHR have the same meaning as the similarly named parameters of vkCmdTraceRaysKHR.
To dispatch ray tracing, with some parameters sourced on the device, use:

    // Provided by VK_KHR_ray_tracing_maintenance1 with VK_KHR_ray_tracing_pipeline
    void vkCmdTraceRaysIndirect2KHR(
        VkCommandBuffer commandBuffer,
        VkDeviceAddress indirectDeviceAddress);

- commandBuffer is the command buffer into which the command will be recorded.
- indirectDeviceAddress is a buffer device address which is a pointer to a VkTraceRaysIndirectCommand2KHR structure containing the trace ray parameters.

vkCmdTraceRaysIndirect2KHR behaves similarly to vkCmdTraceRaysIndirectKHR except that shader binding table parameters as well as dispatch dimensions are read by the device from indirectDeviceAddress during execution.
The VkTraceRaysIndirectCommand2KHR structure is defined as:

    // Provided by VK_KHR_ray_tracing_maintenance1 with VK_KHR_ray_tracing_pipeline
    typedef struct VkTraceRaysIndirectCommand2KHR {
        VkDeviceAddress raygenShaderRecordAddress;
        VkDeviceSize raygenShaderRecordSize;
        VkDeviceAddress missShaderBindingTableAddress;
        VkDeviceSize missShaderBindingTableSize;
        VkDeviceSize missShaderBindingTableStride;
        VkDeviceAddress hitShaderBindingTableAddress;
        VkDeviceSize hitShaderBindingTableSize;
        VkDeviceSize hitShaderBindingTableStride;
        VkDeviceAddress callableShaderBindingTableAddress;
        VkDeviceSize callableShaderBindingTableSize;
        VkDeviceSize callableShaderBindingTableStride;
        uint32_t width;
        uint32_t height;
        uint32_t depth;
    } VkTraceRaysIndirectCommand2KHR;

- raygenShaderRecordAddress is a VkDeviceAddress of the ray generation shader binding table record used by this command.
- raygenShaderRecordSize is a VkDeviceSize number of bytes corresponding to the ray generation shader binding table record at base address raygenShaderRecordAddress.
- missShaderBindingTableAddress is a VkDeviceAddress of the first record in the miss shader binding table used by this command.
- missShaderBindingTableSize is a VkDeviceSize number of bytes corresponding to the total size of the miss shader binding table at missShaderBindingTableAddress that may be accessed by this command.
- missShaderBindingTableStride is a VkDeviceSize number of bytes between records of the miss shader binding table.
- hitShaderBindingTableAddress is a VkDeviceAddress of the first record in the hit shader binding table used by this command.
- hitShaderBindingTableSize is a VkDeviceSize number of bytes corresponding to the total size of the hit shader binding table at hitShaderBindingTableAddress that may be accessed by this command.
- hitShaderBindingTableStride is a VkDeviceSize number of bytes between records of the hit shader binding table.
- callableShaderBindingTableAddress is a VkDeviceAddress of the first record in the callable shader binding table used by this command.
- callableShaderBindingTableSize is a VkDeviceSize number of bytes corresponding to the total size of the callable shader binding table at callableShaderBindingTableAddress that may be accessed by this command.
- callableShaderBindingTableStride is a VkDeviceSize number of bytes between records of the callable shader binding table.
- width is the width of the ray trace query dimensions.
- height is the height of the ray trace query dimensions.
- depth is the depth of the ray trace query dimensions.

The members of VkTraceRaysIndirectCommand2KHR have the same meaning as the similarly named parameters of vkCmdTraceRaysKHR.
Indirect shader binding table buffer parameters must satisfy the same memory alignment and binding requirements as their counterparts in vkCmdTraceRaysIndirectKHR and vkCmdTraceRaysKHR.
40.3. Shader Binding Table
A shader binding table is a resource which establishes the relationship between the ray tracing pipeline and the acceleration structures that were built for the ray tracing pipeline. It indicates the shaders that operate on each geometry in an acceleration structure. In addition, it contains the resources accessed by each shader, including indices of textures, buffer device addresses, and constants. The application allocates and manages shader binding tables as VkBuffer objects.
Each entry in the shader binding table consists of shaderGroupHandleSize bytes of data, either as queried by vkGetRayTracingShaderGroupHandlesKHR to refer to those specified shaders, or all zeros to refer to a zero shader group. A zero shader group behaves as though it is a shader group consisting entirely of VK_SHADER_UNUSED_KHR. The remainder of the data specified by the stride is application-visible data that can be referenced by a ShaderRecordBufferKHR block in the shader.
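As a rough sketch of the record layout just described, the stride of one record can be computed as the handle size plus the application data size, rounded up to an alignment. The alignment parameter is an assumption here: this section does not state the alignment rules, and the sketch takes the value as an input (typically a device limit such as the shader group handle alignment). The helper names `align_up` and `sbt_record_stride` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * Assumes `alignment` is a non-zero power of two. */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Stride of one shader binding table record: the shader group handle
 * (handle_size bytes) followed by application-visible data, padded to
 * the assumed alignment. */
uint64_t sbt_record_stride(uint64_t handle_size,
                           uint64_t app_data_size,
                           uint64_t handle_alignment)
{
    return align_up(handle_size + app_data_size, handle_alignment);
}
```

For example, a 32-byte handle followed by 12 bytes of application data, with an assumed 32-byte alignment, gives a 64-byte stride. A record with no application data keeps the 32-byte stride.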
The shader binding tables to use in a ray tracing pipeline are passed to the vkCmdTraceRaysNV, vkCmdTraceRaysKHR, or vkCmdTraceRaysIndirectKHR commands. Shader binding tables are read-only in shaders that are executing on the ray tracing pipeline.
Shader variables identified with the ShaderRecordBufferKHR storage class are used to access the provided shader binding table. Such variables must be:
- typed as OpTypeStruct, or an array of this type,
- identified with a Block decoration, and
- laid out explicitly using the Offset, ArrayStride, and MatrixStride decorations as specified in Offset and Stride Assignment.
The Offset decoration for any member of a Block-decorated variable in the ShaderRecordBufferKHR storage class must not cause the space required for that variable to extend outside the range [0, maxStorageBufferRange).
Accesses to the shader binding table from ray tracing pipelines must be synchronized with the VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR pipeline stage and an access type of VK_ACCESS_SHADER_READ_BIT.
| Note | Because different shader record buffers can be associated with the same shader, a shader variable with |
40.3.1. Indexing Rules
In order to execute the correct shaders and access the correct resources during a ray tracing dispatch, the implementation must be able to locate shader binding table entries at various stages of execution. This is accomplished by defining a set of indexing rules that compute shader binding table record positions relative to the buffer’s base address in memory. The application must organize the contents of the shader binding table’s memory in a way that application of the indexing rules will lead to correct records.
Ray Generation Shaders
Only one ray generation shader is executed per ray tracing dispatch.
For vkCmdTraceRaysKHR, the location of the ray generation shader is specified by the pRaygenShaderBindingTable->deviceAddress parameter — there is no indexing. All data accessed must be less than pRaygenShaderBindingTable->size bytes from deviceAddress. pRaygenShaderBindingTable->stride is unused, and must be equal to pRaygenShaderBindingTable->size.
For vkCmdTraceRaysNV, the location of the ray generation shader is specified by the raygenShaderBindingTableBuffer and raygenShaderBindingOffset parameters — there is no indexing.
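The vkCmdTraceRaysKHR rule that the ray generation region's stride equals its size can be checked before recording the command. The sketch below uses a stand-in struct, `SbtRegion`, that mirrors the three members of VkStridedDeviceAddressRegionKHR so that it compiles without the Vulkan headers. The struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the members of VkStridedDeviceAddressRegionKHR. */
typedef struct {
    uint64_t deviceAddress;
    uint64_t stride;
    uint64_t size;
} SbtRegion;

/* Build a ray generation region holding exactly one record. */
SbtRegion make_raygen_region(uint64_t record_address, uint64_t record_size)
{
    SbtRegion region = { record_address, record_size, record_size };
    return region;
}

/* The ray generation region is not indexed, so stride must equal size. */
bool raygen_region_is_valid(const SbtRegion *region)
{
    assert(region != NULL);
    return region->stride == region->size;
}
```

In real code the same check would be applied to the VkStridedDeviceAddressRegionKHR passed as pRaygenShaderBindingTable.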
Hit Shaders
The base for the computation of intersection, any-hit, and closest hit shader locations is the instanceShaderBindingTableRecordOffset value stored with each instance of a top-level acceleration structure (VkAccelerationStructureInstanceKHR). This value determines the beginning of the shader binding table records for a given instance.
In the following rule, geometryIndex refers to the geometry index of the intersected geometry within the instance.
The sbtRecordOffset and sbtRecordStride values are passed in as parameters to traceNV() or traceRayEXT() calls made in the shaders. See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language Specification for more details. In SPIR-V, these correspond to the SBTOffset and SBTStride parameters to the pipeline trace ray instructions.
The result of this computation is then added to pHitShaderBindingTable->deviceAddress, a device address passed to vkCmdTraceRaysKHR, or hitShaderBindingOffset, a base offset passed to vkCmdTraceRaysNV.
For vkCmdTraceRaysKHR, the complete rule to compute a hit shader binding table record address in the pHitShaderBindingTable is:
- pHitShaderBindingTable->deviceAddress + pHitShaderBindingTable->stride × (instanceShaderBindingTableRecordOffset + geometryIndex × sbtRecordStride + sbtRecordOffset)
All data accessed must be less than pHitShaderBindingTable->size bytes from the base address.
For vkCmdTraceRaysNV, the offset and stride come from direct parameters, so the full rule to compute a hit shader binding table record address in the hitShaderBindingTableBuffer is:
- hitShaderBindingOffset + hitShaderBindingStride × (instanceShaderBindingTableRecordOffset + geometryIndex × sbtRecordStride + sbtRecordOffset)
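The hit shader indexing rule can be mirrored on the host, for instance to validate a shader binding table layout while debugging. This is a minimal sketch with hypothetical function names. `base` stands for either pHitShaderBindingTable->deviceAddress or hitShaderBindingOffset, and the bounds assertion reflects the vkCmdTraceRaysKHR requirement that accessed data lies within size bytes of the base.

```c
#include <assert.h>
#include <stdint.h>

/* Record index within the hit shader binding table:
 * instanceShaderBindingTableRecordOffset
 *   + geometryIndex * sbtRecordStride + sbtRecordOffset */
uint64_t hit_record_index(uint32_t instance_offset,
                          uint32_t geometry_index,
                          uint32_t sbt_record_stride,
                          uint32_t sbt_record_offset)
{
    return (uint64_t)instance_offset
         + (uint64_t)geometry_index * sbt_record_stride
         + sbt_record_offset;
}

/* Address (or buffer offset) of the record: base + stride * index. */
uint64_t hit_record_address(uint64_t base,
                            uint64_t stride,
                            uint64_t record_index,
                            uint64_t table_size)
{
    uint64_t offset = stride * record_index;
    /* The record must start inside the table. */
    assert(offset < table_size);
    return base + offset;
}
```

With an instance offset of 2, geometry index 1, sbtRecordStride 2 and sbtRecordOffset 1, the record index is 5. With a 64-byte stride and a base of 0x1000, that record starts at 0x1140.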
Miss Shaders
A miss shader is executed whenever a ray query fails to find an intersection for the given scene geometry. Multiple miss shaders may be executed throughout a ray tracing dispatch.
The base for the computation of miss shader locations is pMissShaderBindingTable->deviceAddress, a device address passed into vkCmdTraceRaysKHR, or missShaderBindingOffset, a base offset passed into vkCmdTraceRaysNV.
The missIndex value is passed in as a parameter to traceNV() or traceRayEXT() calls made in the shaders. See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language Specification for more details. In SPIR-V, this corresponds to the MissIndex parameter to the pipeline trace ray instructions.
For vkCmdTraceRaysKHR, the complete rule to compute a miss shader binding table record address in the pMissShaderBindingTable is:
- pMissShaderBindingTable->deviceAddress + pMissShaderBindingTable->stride × missIndex
All data accessed must be less than pMissShaderBindingTable->size bytes from the base address.
For vkCmdTraceRaysNV, the offset and stride come from direct parameters, so the full rule to compute a miss shader binding table record address in the missShaderBindingTableBuffer is:
- missShaderBindingOffset + missShaderBindingStride × missIndex
Callable Shaders
A callable shader is executed when requested by a ray tracing shader. Multiple callable shaders may be executed throughout a ray tracing dispatch.
The base for the computation of callable shader locations is pCallableShaderBindingTable->deviceAddress, a device address passed into vkCmdTraceRaysKHR, or callableShaderBindingOffset, a base offset passed into vkCmdTraceRaysNV.
The sbtRecordIndex value is passed in as a parameter to executeCallableNV() or executeCallableEXT() calls made in the shaders. See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language Specification for more details. In SPIR-V, this corresponds to the SBTIndex parameter to the OpExecuteCallableNV or OpExecuteCallableKHR instruction.
For vkCmdTraceRaysKHR, the complete rule to compute a callable shader binding table record address in the pCallableShaderBindingTable is:
- pCallableShaderBindingTable->deviceAddress + pCallableShaderBindingTable->stride × sbtRecordIndex
All data accessed must be less than pCallableShaderBindingTable->size bytes from the base address.
For vkCmdTraceRaysNV, the offset and stride come from direct parameters, so the full rule to compute a callable shader binding table record address in the callableShaderBindingTableBuffer is:
- callableShaderBindingOffset + callableShaderBindingStride × sbtRecordIndex
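Miss and callable tables are both indexed linearly (base + stride × index), so an application can check that the largest missIndex or sbtRecordIndex its shaders use stays inside the table. The sketch below is one conservative way to do this, assuming every record occupies a full stride. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of whole records that fit in a table of `table_size` bytes. */
uint64_t sbt_record_count(uint64_t table_size, uint64_t stride)
{
    assert(stride != 0);
    return table_size / stride;
}

/* True if record `index` lies entirely inside the table
 * (conservative: assumes each record occupies a full stride). */
bool sbt_index_in_range(uint32_t index, uint64_t table_size, uint64_t stride)
{
    return index < sbt_record_count(table_size, stride);
}

/* Address (or buffer offset) of a miss or callable record. */
uint64_t sbt_linear_address(uint64_t base, uint64_t stride, uint32_t index)
{
    return base + stride * (uint64_t)index;
}
```

For a 256-byte table with a 64-byte stride, indices 0 through 3 are in range and index 4 is not.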
40.4. Ray Tracing Pipeline Stack
Ray tracing pipelines have a potentially large set of shaders which may be invoked in various call chain combinations to perform ray tracing. To store parameters for a given shader execution, an implementation may use a stack of data in memory. This stack must be sized to the sum of the stack sizes of all shaders in any call chain executed by the application.
If the stack size is not set explicitly, the stack size for a pipeline is:
- rayGenStackMax + min(1, maxPipelineRayRecursionDepth) × max(closestHitStackMax, missStackMax, intersectionStackMax + anyHitStackMax) + max(0, maxPipelineRayRecursionDepth-1) × max(closestHitStackMax, missStackMax) + 2 × callableStackMax
where rayGenStackMax, closestHitStackMax, missStackMax, anyHitStackMax, intersectionStackMax, and callableStackMax are the maximum stack values queried by the respective shader stages for any shaders in any shader groups defined by the pipeline.
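The default stack size computation above can be written out as a short function, which may be useful as a baseline when comparing against an application-provided size. This is a sketch only: the `StackMax` struct and the function names are hypothetical, and the per-stage maxima are assumed to have been gathered by the application beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-stage maximum stack sizes across all shader groups in the pipeline. */
typedef struct {
    uint64_t ray_gen;
    uint64_t closest_hit;
    uint64_t miss;
    uint64_t any_hit;
    uint64_t intersection;
    uint64_t callable;
} StackMax;

uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Default pipeline stack size, following the formula in the text. */
uint64_t default_pipeline_stack_size(const StackMax *s, uint32_t max_depth)
{
    assert(s != NULL);
    /* First trace level: traversal may also run intersection + any-hit. */
    uint64_t first_level = max_u64(max_u64(s->closest_hit, s->miss),
                                   s->intersection + s->any_hit);
    /* Deeper levels are traced from closest hit or miss shaders. */
    uint64_t deeper_level = max_u64(s->closest_hit, s->miss);
    uint64_t first_count  = max_depth < 1 ? max_depth : 1;  /* min(1, depth)     */
    uint64_t deeper_count = max_depth > 0 ? max_depth - 1 : 0; /* max(0, depth-1) */
    return s->ray_gen
         + first_count * first_level
         + deeper_count * deeper_level
         + 2 * s->callable;
}
```

With a recursion depth of 0, only the ray generation and callable terms contribute.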
This stack size is potentially significant, so an application may want to provide a more accurate stack size after pipeline compilation. The value that the application provides is the maximum value of the sum of all shaders in a call chain across all possible call chains, taking into account any application specific knowledge about the properties of the call chains.
| Note | For example, if an application has two types of closest hit and miss shaders that it can use but the first level of rays will only use the first kind (possibly reflection) and the second level will only use the second kind (occlusion or shadow ray, for example) then the application can compute the stack size by something similar to:
This is guaranteed to be no larger than the default stack size computation which assumes that both call levels may be the larger of the two. |
40.5. Ray Tracing Capture Replay
In a similar way to bufferDeviceAddressCaptureReplay, the rayTracingPipelineShaderGroupHandleCaptureReplay feature allows the querying of opaque data which can be used in a future replay.
During the capture phase, capture/replay tools are expected to query opaque data for shader group handle replay using vkGetRayTracingCaptureReplayShaderGroupHandlesKHR.
Providing the opaque data during replay, using VkRayTracingShaderGroupCreateInfoKHR::pShaderGroupCaptureReplayHandle at pipeline creation time, causes the implementation to generate identical shader group handles to those in the capture phase, allowing capture/replay tools to reuse previously recorded shader binding table buffer contents or to obtain the same handles by calling vkGetRayTracingCaptureReplayShaderGroupHandlesKHR again.
40.6. Ray Tracing Validation
Ray tracing validation can help identify the root cause of application issues and improve performance. Unlike existing validation layers, ray tracing validation performs checks at an implementation level, which helps identify potential problems that may not be caught by the layers.
By enabling the ray tracing validation feature, warnings and errors can be delivered straight from a ray tracing implementation to the application through a messenger callback registered with the implementation, where they can be processed through existing application-side debugging or logging systems.