Improvements to workgroup reduce + scan #876
Merged
Changes from 19 commits
74 commits
keptsecret 09f16c2 minor fixes, example
keptsecret 6f5f8b0 bug fixes and example
keptsecret 1bac247 fix to data accessor indexing
keptsecret 305ac7b added template spec for vector dim 1
keptsecret c08063d added inclusive scan
keptsecret b1d804f exclusive scan working
keptsecret 3cf98ab removed outdated comment
keptsecret 7b310e0 minor changes to config usage
keptsecret 4b4e7e8 add 1 level scans
keptsecret 2e5f29f fixes to 1 level scans
keptsecret 054b269 added handling >1 vectors on level 1 scan (untested)
keptsecret 1b5282c move load/store smem into scan funcs, setup config for 3 levels
keptsecret c6dc5bc change to use coalesced indexing for 2-level scans
keptsecret aa0c36c added 3-level scans
keptsecret 74c359b minor bug fixes
keptsecret ce244e2 changes to data accessor usage
keptsecret 90b19d8 wg reduction uses reduce instead of scan
keptsecret d2a1663 fixes to calculating levels in config
keptsecret ea39d9e fixes to 3-level scan
keptsecret 2982e5e Merge branch 'master' into improve-workgroup-scan-2
keptsecret 1c0e72e split config into new file
keptsecret 59d02fe merge master
keptsecret 507904f minor fixes
keptsecret 542592f soome changes to arithmetic config
keptsecret a9930a0 removed referencing workgroupID in scans
keptsecret 55d89c5 no need to store locals in reduce
keptsecret 4e4f26e added workgroup accessor concepts, refactor accessor usage
keptsecret 56f013e Merge branch 'master' into improve-workgroup-scan-2
keptsecret 004c95a fixed minor bug
keptsecret ccacddb store temporaries with data accessor
keptsecret 9c59677 minor fixes
keptsecret eb44262 moved indexing functionality to config struct
keptsecret 573ce44 reduction returns value instead of saving directly to storage
keptsecret 49ca655 fixes to 2-level scan indexing
keptsecret a639145 fixes to 3-level scan and minor stuff
keptsecret 7751359 some minor fixes
keptsecret fd6f527 latest example
keptsecret 27d84c8 merge master, fix conflicts
keptsecret 350c6a3 more util funcs in config, fix some calculations
keptsecret 14e5d15 added generic data/shared mem accessors
keptsecret f07329e fix include guard
keptsecret 48a7d16 changes to arithmetic accessor concepts
keptsecret 20a54be concept macro for checking types
keptsecret d83ac5c revert concept macro addition
keptsecret 00787bf added generic read/write accessors
keptsecret c0dfc1e more refactor for accessor concept changes
keptsecret 55840a3 don't pass scalar_t as index type
keptsecret d758ff7 refactor accessor to match accessor template
keptsecret b062ede simplified indexing functions
keptsecret 472aa0b more fixes to indexing
keptsecret c483941 share level 0 scan between 2-level and 3-level scans (and reduce)
keptsecret 951ff99 reduce duplicate vars in config
keptsecret 127c6d9 some fixes to indexing
keptsecret 90d3579 fix scans for level 1+
keptsecret 203c03a some indexing fixes for 3-level reduce/scan
keptsecret 0b16307 fix 3-level scan downsweep step
keptsecret 83991b9 added tuple.hlsl
keptsecret 209adb4 added some comments to config funcs for future debugging
keptsecret 0a5dc30 merge master, fix example conflict
keptsecret f82b405 Merge branch 'master' into improve-workgroup-scan-2
keptsecret 7d77d30 change indexing to uint16_t
keptsecret 7b15a54 do inclusive scan on upsweep and shift left on downsweep
keptsecret 37aa99b some adjustments to config and func usages
keptsecret da6c313 split out level 0 scans into its own struct
keptsecret e230d06 fixes to 3 level scan
keptsecret 3da175d padding to shared mem indexing to avoid bank conflict
keptsecret 32732e7 fix padding bugs
keptsecret 7a2065a update to latest example
keptsecret 3a90fa8 Merge branch 'master' into improve-workgroup-scan-2
keptsecret ce77b46 uncomment some concept requires
keptsecret fc1bc51 removed redundant stuff, make config more readable
keptsecret 10b7f50 fix some bugs, readability fix
keptsecret 437c194 use x-macros for config compat between hlsl and cpp
keptsecret 029cfeb improved readability for config, include all new files
Submodule examples_tests updated 21 files
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| // Copyright (C) 2025 - DevSH Graphics Programming Sp. z O.O. | ||
| // This file is part of the "Nabla Engine". | ||
| // For conditions of distribution and use, see copyright notice in nabla.h | ||
| #ifndef _NBL_BUILTIN_HLSL_WORKGROUP2_ARITHMETIC_INCLUDED_ | ||
| #define _NBL_BUILTIN_HLSL_WORKGROUP2_ARITHMETIC_INCLUDED_ | ||
| | ||
| | ||
| #include "nbl/builtin/hlsl/functional.hlsl" | ||
| #include "nbl/builtin/hlsl/workgroup/ballot.hlsl" | ||
| #include "nbl/builtin/hlsl/workgroup/broadcast.hlsl" | ||
| #include "nbl/builtin/hlsl/workgroup2/shared_scan.hlsl" | ||
| | ||
| | ||
| namespace nbl | ||
| { | ||
| namespace hlsl | ||
| { | ||
| namespace workgroup2 | ||
| { | ||
| | ||
| template<class Config, class BinOp, class device_capabilities=void> | ||
| struct reduction | ||
| { | ||
| template<class DataAccessor, class ScratchAccessor> | ||
| static void __call(NBL_REF_ARG(DataAccessor) dataAccessor, NBL_REF_ARG(ScratchAccessor) scratchAccessor) | ||
| { | ||
| impl::reduce<Config,BinOp,Config::LevelCount,device_capabilities> fn; | ||
| fn.template __call<DataAccessor,ScratchAccessor>(dataAccessor, scratchAccessor); | ||
| } | ||
| }; | ||
| | ||
| template<class Config, class BinOp, class device_capabilities=void> | ||
| struct inclusive_scan | ||
| { | ||
| template<class DataAccessor, class ScratchAccessor> | ||
| static void __call(NBL_REF_ARG(DataAccessor) dataAccessor, NBL_REF_ARG(ScratchAccessor) scratchAccessor) | ||
| { | ||
| impl::scan<Config,BinOp,false,Config::LevelCount,device_capabilities> fn; | ||
| fn.template __call<DataAccessor,ScratchAccessor>(dataAccessor, scratchAccessor); | ||
| } | ||
| }; | ||
| | ||
| template<class Config, class BinOp, class device_capabilities=void> | ||
| struct exclusive_scan | ||
| { | ||
| template<class DataAccessor, class ScratchAccessor> | ||
| static void __call(NBL_REF_ARG(DataAccessor) dataAccessor, NBL_REF_ARG(ScratchAccessor) scratchAccessor) | ||
| { | ||
| impl::scan<Config,BinOp,true,Config::LevelCount,device_capabilities> fn; | ||
| fn.template __call<DataAccessor,ScratchAccessor>(dataAccessor, scratchAccessor); | ||
| } | ||
| }; | ||
| | ||
| } | ||
| } | ||
| } | ||
| | ||
| #endif | ||
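For readers unfamiliar with the three entry points in this header, here is a minimal sequential sketch in plain C++ of what `reduction`, `inclusive_scan` and `exclusive_scan` are expected to compute, and of how thin wrappers can forward to a shared `impl::scan` selected by a boolean template flag. This is not the PR's implementation: `DemoConfig`, `ItemCount`, `plus_op`, `ArrayAccessor` and the `get`/`set` accessor methods are hypothetical names chosen for illustration, `call` stands in for `__call` (double-underscore identifiers are reserved in C++), `LevelCount` is carried but ignored, and the scratch accessor is omitted because nothing here runs in parallel. The value-returning `reduction` follows the wording of commit 573ce44 rather than the `void` signature shown in the diff above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch
{
// Hypothetical config: only the name LevelCount appears in the diff.
template<std::size_t Items, std::uint16_t Levels = 1>
struct DemoConfig
{
    static constexpr std::uint16_t LevelCount = Levels; // ignored by this sequential reference
    static constexpr std::size_t ItemCount = Items;     // assumption, not from the PR
};

// Hypothetical binary operator that carries its identity element.
template<typename T>
struct plus_op
{
    using type_t = T;
    static constexpr T identity = T(0);
    T operator()(T a, T b) const { return a + b; }
};

// Hypothetical data accessor: get/set over a plain array.
template<typename T, std::size_t N>
struct ArrayAccessor
{
    std::array<T, N> data;
    T get(std::size_t i) const { return data[i]; }
    void set(std::size_t i, T v) { data[i] = v; }
};

namespace impl
{
// One shared scan; the Exclusive flag decides whether element i sees itself.
template<class Config, class BinOp, bool Exclusive>
struct scan
{
    template<class DataAccessor>
    static void call(DataAccessor& accessor)
    {
        BinOp op;
        typename BinOp::type_t running = BinOp::identity;
        for (std::size_t i = 0; i < Config::ItemCount; ++i)
        {
            const typename BinOp::type_t value = accessor.get(i);
            if (Exclusive)
            {
                accessor.set(i, running);     // write the prefix before folding in value
                running = op(running, value);
            }
            else
            {
                running = op(running, value); // fold in value, then write
                accessor.set(i, running);
            }
        }
    }
};
} // namespace impl

template<class Config, class BinOp>
struct inclusive_scan
{
    template<class DataAccessor>
    static void call(DataAccessor& accessor) { impl::scan<Config, BinOp, false>::call(accessor); }
};

template<class Config, class BinOp>
struct exclusive_scan
{
    template<class DataAccessor>
    static void call(DataAccessor& accessor) { impl::scan<Config, BinOp, true>::call(accessor); }
};

template<class Config, class BinOp>
struct reduction
{
    // Returns the folded value instead of writing it back (assumption, see lead-in).
    template<class DataAccessor>
    static typename BinOp::type_t call(const DataAccessor& accessor)
    {
        BinOp op;
        typename BinOp::type_t result = BinOp::identity;
        for (std::size_t i = 0; i < Config::ItemCount; ++i)
            result = op(result, accessor.get(i));
        return result;
    }
};

// Small self-check on the input {1, 2, 3, 4}.
inline void demo()
{
    using Cfg = DemoConfig<4>;
    using Op = plus_op<std::uint32_t>;
    ArrayAccessor<std::uint32_t, 4> inc;
    inc.data = {{1u, 2u, 3u, 4u}};
    ArrayAccessor<std::uint32_t, 4> exc = inc;

    const std::uint32_t sum = reduction<Cfg, Op>::call(inc);
    inclusive_scan<Cfg, Op>::call(inc); // inc becomes {1, 3, 6, 10}
    exclusive_scan<Cfg, Op>::call(exc); // exc becomes {0, 1, 3, 6}

    assert(sum == 10u);
    assert(inc.get(3) == sum); // last inclusive element equals the reduction
    assert(exc.get(0) == Op::identity);
}
} // namespace sketch
```

A sequential reference like this can be useful when validating the workgroup versions: the last element of the inclusive scan should equal the reduction, and the exclusive scan is the inclusive scan shifted by one position with the operator's identity in front.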