Commit a94d6d7

Update README.md
1 parent 58abeea commit a94d6d7

1 file changed: +16 -6 lines changed

README.md

Lines changed: 16 additions & 6 deletions
@@ -90,8 +90,17 @@ python -m paddle.distributed.launch --selected_gpus 0 \
 <img src="doc/distill.gif" width="550">
 </p>
 
-# EDL Framework
-## Quickstart:EDL Resnet50 experiments on a single machine in docker:
+<h2 align="center"> Release 0.2.0 </h2>
+
+<h3 align="center"> Checkpoint based elastic training on multiple GPUs </h3>
+
+- We have several training nodes running on each GPU.
+- A master node is responsible for checkpoint saving and all the other nodes are elastic nodes.
+- When elastic nodes join or leave the current training job, training hyper-parameters are adjusted automatically.
+- Newly joining training nodes load the checkpoint from the remote FS automatically.
+- A model checkpoint is saved every several steps, as specified by the user.
+
+<h3 align="center"> Resnet50 experiments on a single machine in docker </h3>
 
 1. Start a JobServer on one node which generates changing scripts.

@@ -137,10 +146,11 @@ python -u paddle_edl.demo.collective.job_client_demo \
 The whole example is [here](example/demo/collective)
 
 
-## FAQ
+<h2 align="center"> FAQ </h2>
 
-TBD
-
-## License
 
+<h2 align="center"> License </h2>
 EDL is provided under the [Apache-2.0 license](LICENSE).
+
+<h2 align="center"> Contribution </h2>
+If you want to contribute code to Paddle Serving, please reference
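
The "Checkpoint based elastic training" bullets added in this commit describe a pattern that can be sketched in plain Python. This is only an illustration under stated assumptions, not EDL's implementation: the README does not say which hyper-parameters are adjusted or how, so the linear scaling of learning rate and global batch size is an assumption, a local directory stands in for the remote FS, and every name here (`scaled_hyperparams`, `save_checkpoint`, `load_checkpoint`, `train`) is hypothetical.

```python
import json
import os
import tempfile


def scaled_hyperparams(base_lr, per_node_batch, num_nodes):
    """Rescale hyper-parameters for the current node count.

    ASSUMPTION: linear scaling; the README only says they are "adjusted".
    """
    if num_nodes < 1:
        raise ValueError("need at least one training node")
    return {"lr": base_lr * num_nodes, "global_batch": per_node_batch * num_nodes}


def save_checkpoint(ckpt_dir, step, state):
    """Write to a temp file, then rename, so readers never see a partial file."""
    os.makedirs(ckpt_dir, exist_ok=True)
    final_path = os.path.join(ckpt_dir, "latest.json")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp_path, final_path)


def load_checkpoint(ckpt_dir):
    """Return (step, state) from the latest checkpoint, or (0, {}) if none exists."""
    path = os.path.join(ckpt_dir, "latest.json")
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]


def train(ckpt_dir, is_master, total_steps, save_every, nodes_at_step,
          base_lr=0.1, per_node_batch=32):
    """Toy training loop: resume from checkpoint, rescale, save periodically."""
    # A newly joining node starts from the latest checkpoint, if any.
    step, state = load_checkpoint(ckpt_dir)
    while step < total_steps:
        # Re-derive hyper-parameters each step, since nodes may join or leave.
        hp = scaled_hyperparams(base_lr, per_node_batch, nodes_at_step(step))
        state["last_lr"] = hp["lr"]  # stand-in for a real optimizer update
        step += 1
        # Only the master node saves, every `save_every` steps (user-chosen).
        if is_master and step % save_every == 0:
            save_checkpoint(ckpt_dir, step, state)
    return step, state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ckpt_dir:
        # Two nodes for the first five steps, four afterwards.
        train(ckpt_dir, True, 10, 5, lambda s: 2 if s < 5 else 4)
        print(load_checkpoint(ckpt_dir))
```

Writing to a temporary file and renaming it with `os.replace` means a node that joins mid-save reads either the previous checkpoint or the new one, never a half-written file. A real remote file system may need a different atomicity mechanism.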
