deepspeedai / DeepSpeed Public

Notifications You must be signed in to change notification settings
Fork 4.6k
Star 40.6k

Code
Issues 1.1k
Pull requests 104
Discussions
Actions
Projects
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Discussions
Actions
Projects
Security
Insights

Pull requests: deepspeedai/DeepSpeed

Labels 31 Milestones 0

New pull request New

104 Open 3,403 Closed

Author

Filter by author

Uh oh!

There was an error while loading. Please reload this page.

Label

Filter by label

Uh oh!

There was an error while loading. Please reload this page.

Use alt + click/return to exclude labels

or ⇧ + click/return for logical OR

Projects

Filter by project

Uh oh!

There was an error while loading. Please reload this page.

Milestones

Filter by milestone

Uh oh!

There was an error while loading. Please reload this page.

Reviews

Filter by reviews

No reviews Review required Approved review Changes requested

Assignee

Filter by who’s assigned

Assigned to nobody

Uh oh!

There was an error while loading. Please reload this page.

Sort

Sort by

Newest Oldest Most commented Least commented Recently updated Least recently updated Best match

Most reactions

Pull requests list

Enabled compiled autograd for backward pass

#7667 opened Nov 4, 2025 by deepcharm

Loading…

PyTorch-compatible backward API

#7665 opened Nov 3, 2025 by tohtana • Draft

Add single parameter allgather optimization for zero3

#7661 opened Oct 31, 2025 by aeeeeeep

Loading…

allow seperate learning rate "muon_lr" and "adam_lr" for muon optimizer

#7658 opened Oct 30, 2025 by delock

Loading…

[ROCm] Relax tolerances for FP8 unit test for fp16 and bf16 cases

#7655 opened Oct 28, 2025 by rraminen

Loading…

Put Muon optimizer momentum buffer on GPU

#7648 opened Oct 24, 2025 by delock

Loading…

Add torch.compile test conf to existing tests

#7646 opened Oct 23, 2025 by NirSonnenschein

Loading…

Configures workflow for offline unit tests

#7512 opened Aug 24, 2025 by porfanid

Loading…

Add world-size getter in Engine

#7479 opened Aug 9, 2025 by WoosungMyung

Loading…

Add EXAONE 4.0 model support for DeepSpeed inference v2 @

#7456 opened Jul 29, 2025 by notkisk • Draft

[AMD][ROCm] Improve support of AMD

#7448 opened Jul 24, 2025 by k-artem

Loading…

Create COMMITTERS_RESPONSIBILITY.md

#7300 opened May 21, 2025 by PKUWZP

Loading…

HF2UCP: Converting a pytorch_model.bin or .safetensors checkpoint to UCP

#7212 opened Apr 10, 2025 by Schwidola0607

Loading…

gather output layout support for column parallel

#7181 opened Mar 28, 2025 by inkcherry

Loading…

[bugfix] update results of state_dict loading, embedding resizing to secondary partitions (hpz)

#7130 opened Mar 11, 2025 by cyr0930

Loading…

[Draft] Add support for seq split in Domino

#7111 opened Mar 4, 2025 by duanhx1037 • Draft

Update Domino for Llama3

#7084 opened Feb 26, 2025 by shenzheyu • Draft

Fix, pipeline model with moe cause error when send grad

#7055 opened Feb 19, 2025 by wukong1992

Loading…

Add pyproject.toml with legacy build backend to keep most logic in setup.py

#7033 opened Feb 13, 2025 by loadams

Loading…

4 of 5 tasks

Enable python 3.11 and 3.12 tests

#7007 opened Feb 6, 2025 by loadams

Loading…

Enabled high-performance Automatic Tensor Parallelism (auto TP) for the MoE models on multiple GPUs/HPUs

#6964 opened Jan 21, 2025 by gyou2021

Loading…

[FPDT] Support FPDT Based on Intel Backend

#6956 opened Jan 16, 2025 by YizhouZ • Draft

Update sharded_moe.py to support top2 gate with Tutel

#6948 opened Jan 14, 2025 by xenshinu

Loading…

Fix: forbid repeated deepspeed.initialize on training objects

#6874 opened Dec 16, 2024 by traincheck-team

Loading…

Training ops kernels: Speeding up the Llama-based MoE architectures

#6734 opened Nov 8, 2024 by RezaYazdaniAminabadi • Draft

Previous 1 2 3 4 5 Next

Previous Next

ProTip! no:milestone will show everything without a milestone.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!