Use scan and host offloading for llama model #123
Closed
Port `scan` and host offloading for the llama model, based on @tengyifei's prototype in 1 and 2. The sharding schema in torchprime/torch_xla_models/configs/model/scaling/llama-fsdp.yaml also plays well with the `scan` code.

Currently there is a NaN issue when using `scan` with the flash attention kernel, related to pytorch/xla#8734. That issue needs to be resolved before this produces correct output.
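For context, a minimal sketch of the `scan` structure this PR relies on, assuming torch_xla's experimental `scan(fn, init, xs)` API. The toy step function, the shapes, and the `save_on_cpu`-based offloading shown here are illustrative assumptions, not the exact wiring of this PR or the prototype:

```python
import torch
import torch_xla.core.xla_model as xm
from torch_xla.experimental.scan import scan

device = xm.xla_device()

# One decoder-layer step: `carry` is the hidden state flowing through the
# stack, `w` is this layer's slice of the stacked per-layer weights.
def step(carry, w):
    h = torch.tanh(carry @ w)  # stand-in for the real decoder-layer math
    return h, h                # (new carry, per-layer output)

hidden = torch.randn(8, 128, device=device)  # initial carry
weights = torch.randn(4, 128, 128, device=device, requires_grad=True)  # 4 stacked layers

# scan traces the step once and loops it over the leading dim of `weights`,
# instead of unrolling every layer into the HLO graph.
final_hidden, per_layer_out = scan(step, hidden, weights)

# Host offloading sketch (an assumption, not necessarily how this PR does it):
# save_on_cpu moves tensors saved for backward to host RAM, trading HBM
# footprint for host-device transfer time.
with torch.autograd.graph.save_on_cpu():
    final_hidden, _ = scan(step, hidden, weights)
final_hidden.sum().backward()
```

The payoff of scanning the layer stack is compile time and graph size: the layer body is traced once rather than once per layer, which is what makes the llama-fsdp.yaml sharding annotations apply uniformly across layers.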