Commit 21b5b36

update readme, add more detailed instructions and minor fixes
1 parent 92ccc79 commit 21b5b36

2 files changed (+4, -4 lines)


lion/README.md

Lines changed: 4 additions & 4 deletions
@@ -32,11 +32,11 @@ Another practical benefit is that Lion has faster runtime (steps / sec) in our e
 
 - Lion outperforms AdamW on various architectures trained from scratch on ImageNet or pre-trained on ImageNet-21K.
 
-<img src="./fig/i1k.png" width="60%">
+<img src="./fig/i1k.png" width="65%">
 
 - Lion saves up to 5x the pre-training cost on JFT-300M.
 
-<img src="./fig/jft-ft.png" width="100%">
+<img src="./fig/jft-ft.png" width="90%">
 
 - Results after fine-tuning with higher resolution and Polyak averaging.
 Our obtained ViT-L/16 matches the previous ViT-H/14 results trained by AdamW while being 2x smaller.
@@ -58,13 +58,13 @@ Our obtained ViT-L/16 matches the previous ViT-H/14 results trained by AdamW whi
 
 - On diffusion models, Lion exceeds AdamW in terms of the FID score and saves up to 2.3x the training compute. Left to right: 64x64, 128x128, 256x256 image generation trained on ImageNet.
 
-<img src="./fig/diffusion.png" width="100%">
+<img src="./fig/diffusion.png" width="90%">
 
 ### **Language modeling**
 
 - Lion saves up to 2x compute on the validation perplexity when performing the language modeling task (Left: on Wiki-40B, Right: on PG-19). Lion achieves larger gains on larger Transformers.
 
-<img src="./fig/lm.png" width="70%">
+<img src="./fig/lm.png" width="65%">
 
 - Lion achieves better average in-context learning ability when training LMs compared to Adafactor.
 
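For context on the bullets above: the comparisons are against AdamW, and the faster runtime (steps / sec) mentioned in the first hunk's header comes from Lion's simpler, sign-based update with a single momentum state. Below is a minimal sketch of one Lion step as described in the paper, written as a standalone PyTorch helper; the name `lion_step` and its default hyperparameters are illustrative, not this repo's API.

```python
import torch

@torch.no_grad()
def lion_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.1):
    # Update direction: sign of an interpolation between momentum and gradient.
    update = (beta1 * momentum + (1 - beta1) * grad).sign()
    # Decoupled weight decay, as in AdamW.
    param.mul_(1 - lr * wd)
    param.add_(update, alpha=-lr)
    # Track only an exponential moving average of gradients (the sole optimizer state).
    momentum.mul_(beta2).add_(grad, alpha=1 - beta2)
```

Because every coordinate of the update has unit magnitude before scaling by `lr`, the paper recommends a learning rate roughly 3-10x smaller (with weight decay correspondingly larger) than what works for AdamW.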
lion/fig/diffusion.png

5.06 KB