Conversation

@jlamypoirier (Collaborator) commented on Jan 13, 2023

  • Add GPU profiling
  • Ultra-fast model creation so we can benchmark/profile quickly. The model is created in ~300 ms (weights are created and initialized directly on the GPU) instead of 20-30 s of loading (plus several minutes the first time). On the downside, we no longer benchmark the loading time and can't use custom weights, but that could be added back easily (and it would still be much faster than before). A minimal sketch of the idea is shown after this list.
  • Rewrite the benchmarking code and pipeline initialization to support the new features.
  • Use the logging module instead of plain prints, and add timestamps so progress is easier to track (see the example after this list)
  • Add some safety checks on arguments
  • Improve performance metrics and add several more
  • Simplify device placement (with an argument instead of a separate class) and data types
  • Move the common ds/hf code into the Pipeline class
  • Lots of refactoring and cleanup
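
A minimal sketch of the direct-on-GPU creation idea above, assuming a transformers BLOOM config/model pair matching the benchmark flags used below and PyTorch >= 2.0 for the device context manager (the PR's actual implementation may differ):

    import torch
    from transformers import BloomConfig, BloomForCausalLM

    # Build a config from scratch instead of downloading/loading a checkpoint.
    config = BloomConfig(hidden_size=2048, n_head=16, n_layer=24)

    # Instantiate with randomly initialized weights directly on the GPU
    # (the torch.device context manager requires PyTorch >= 2.0).
    with torch.device("cuda"):
        model = BloomForCausalLM(config)
    model = model.to(torch.float16).eval()

The logging/timestamp bullet only needs the standard library; for example:

    import logging

    logging.basicConfig(
        format="%(asctime)s [%(levelname)s] %(message)s",
        level=logging.INFO,
    )
    logging.getLogger(__name__).info("model created")
    # -> 2023-01-13 12:00:00,123 [INFO] model created
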
@jlamypoirier changed the title from "[WIP] Profiling" to "[WIP] Profiling and misc" on Jan 13, 2023
@jlamypoirier changed the title from "[WIP] Profiling and misc" to "Profiling and misc" on Jan 13, 2023
Base automatically changed from dockerfile to main January 14, 2023 01:52
@mayank31398 (Collaborator) left a comment:

Added a few changes. Most of it looks good to me :)

input_tokens[t] = input_tokens[t].to(self.device)

t1 = time.perf_counter()
with torch.no_grad():
Collaborator:

Is this needed?
The generate method already runs in a torch.no_grad() context by default.

Collaborator Author:

It's probably redundant, but a bit of extra safety can't hurt. Same with the model.eval() call in __init__.

Comment:

Any reason not to use inference_mode()?

Collaborator Author:

I didn't know about this one, thanks for pointing it out. I will leave it for future work though.

Collaborator Author:

#12

Comment:

Sure, I'm not sure of the context here, but it should be a harmless replacement ... it includes everything no_grad does.
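
For reference, a minimal sketch of the suggested swap, assuming input_tokens is the tokenizer's dict of tensors as in the snippet above (the standalone helper is illustrative, not the PR's actual method):

    import torch

    def generate(model, input_tokens, max_new_tokens):
        # torch.inference_mode() disables gradient tracking like torch.no_grad(),
        # and also skips the view/version-counter bookkeeping, so for pure
        # inference it is a drop-in replacement that is never slower.
        with torch.inference_mode():
            return model.generate(**input_tokens, max_new_tokens=max_new_tokens)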

# BLOOM AliBi
hf-1b-bloom-fp32:
python src/main.py --hidden_size 2048 --n_head 16 --n_layer 24 --pipeline_class HF_GPU_Pipeline --model_class BLOOM --dtype float32 --batch_size ${batch_size}
python3 src/main.py --hidden_size 2048 --n_head 16 --n_layer 24 --pipeline_class HF_Pipeline --model_class BLOOM --dtype float32 --batch_size ${batch_size}
Collaborator Author:

@mayank31398 I don't see why we should have it as a module. It's a script and not installed as a package.

return config, tokenizer, model_class
return sum(p.numel() for p in self.model.parameters())

def aggregate_and_format_metrics(self, metrics: List[Dict[str, Any]]):
Collaborator Author:

@mayank31398 It's better to keep this encapsulated in the class that generates these metrics.
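
A rough illustration of keeping the aggregation on the pipeline, assuming per-batch metric dicts with latency_s and num_tokens keys (these field names are made up for the sketch, not taken from the PR):

    from typing import Any, Dict, List

    class Pipeline:
        def aggregate_and_format_metrics(self, metrics: List[Dict[str, Any]]) -> Dict[str, str]:
            # Aggregate the per-batch measurements produced by this same class,
            # so metric names and units only have to stay consistent in one place.
            total_time = sum(m["latency_s"] for m in metrics)
            total_tokens = sum(m["num_tokens"] for m in metrics)
            return {
                "Latency (mean)": f"{total_time / len(metrics):.3f} s",
                "Throughput": f"{total_tokens / total_time:.1f} tokens/s",
            }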

@jlamypoirier jlamypoirier merged commit 4be387d into main Jan 30, 2023
jlamypoirier added a commit that referenced this pull request Jan 30, 2023
@jlamypoirier jlamypoirier deleted the profiling branch January 30, 2023 23:39

Labels: none yet
3 participants