Description
Motivation.
Almost all PassConfig field names contain enable_, which is unnecessarily verbose. The names are also long and sometimes not descriptive enough. Finally, enable_fusion should be split into separate rmsnorm+quant and activation+quant flags so that the two fusions can be controlled independently.
Proposed Change.
We should rename the flags:
- enable_async_tp -> fuse_gemm_comms
- enable_attn_fusion -> fuse_attn_quant
- enable_fi_allreduce_fusion -> fuse_allreduce_rms
- enable_fusion -> fuse_norm_quant, fuse_act_quant
- enable_noop -> eliminate_noops
- enable_sequence_parallelism -> enable_sp
For future RoPE-based fusion passes, the flags will look like:
- enable_qknorm_rope_fusion -> fuse_qknorm_rope
- enable_rope_cache_fusion -> fuse_rope_cache
- ...
We can deprecate the original flags in the next release and map them to the new ones, then remove them one or two releases later (supporting both in the meantime shouldn't be hard). These flags will be used less often once the -O optimization levels land anyway.
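A rough sketch of what the old-to-new mapping could look like during the deprecation window. This is not the actual vLLM implementation: `RENAMED_FLAGS` and `migrate_pass_config` are hypothetical names, and the config is modeled as a plain dict purely for illustration. Only the flag names come from the rename list in this RFC.

```python
import warnings

# Old flag -> new flag(s), taken from the rename list in this RFC.
RENAMED_FLAGS = {
    "enable_async_tp": ("fuse_gemm_comms",),
    "enable_attn_fusion": ("fuse_attn_quant",),
    "enable_fi_allreduce_fusion": ("fuse_allreduce_rms",),
    "enable_fusion": ("fuse_norm_quant", "fuse_act_quant"),
    "enable_noop": ("eliminate_noops",),
    "enable_sequence_parallelism": ("enable_sp",),
}


def migrate_pass_config(config: dict) -> dict:
    """Hypothetical helper: return a copy of `config` that uses only new names."""
    # Keep every key that is not a deprecated flag as-is.
    migrated = {k: v for k, v in config.items() if k not in RENAMED_FLAGS}
    for old, new_names in RENAMED_FLAGS.items():
        if old not in config:
            continue
        warnings.warn(
            f"{old} is deprecated; use {', '.join(new_names)} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        for new in new_names:
            # An explicitly set new flag wins over the deprecated one.
            migrated.setdefault(new, config[old])
    return migrated
```

With this sketch, `enable_fusion=True` would turn on both `fuse_norm_quant` and `fuse_act_quant` unless one of them is set explicitly, which keeps old configs behaving the same while still allowing the split control described above.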
Feedback Period.
1 week, 11/3 - 11/7
CC List.
@zou3519 @youkaichao @mgoin @ilmarkov @nvpohanh @pavanimajety
Any Other Things.
With passes following a common construction convention, we can also add a full_pass_pipeline arg where users can control the exact order of the passes if necessary, but that is less likely to be needed urgently and can be added later.
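To make the full_pass_pipeline idea concrete, here is a minimal sketch under heavy assumptions: `PASS_REGISTRY` and `run_pipeline` are hypothetical, and strings stand in for graphs so the example is self-contained. The real passes and their construction convention are not specified in this RFC.

```python
from typing import Callable

# Hypothetical registry: pass name -> callable that transforms a graph.
# Each stand-in pass just appends a tag so the applied order is visible.
PASS_REGISTRY: dict[str, Callable[[str], str]] = {
    "eliminate_noops": lambda g: g + "|noops",
    "fuse_norm_quant": lambda g: g + "|norm_quant",
    "fuse_act_quant": lambda g: g + "|act_quant",
}


def run_pipeline(graph: str, full_pass_pipeline: list[str]) -> str:
    """Apply the named passes in exactly the order the user listed them."""
    for name in full_pass_pipeline:
        if name not in PASS_REGISTRY:
            raise ValueError(f"unknown pass: {name}")
        graph = PASS_REGISTRY[name](graph)
    return graph
```

The point of the sketch is only that a shared naming and construction convention lets the order be expressed as a plain list of pass names.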