- Notifications
You must be signed in to change notification settings - Fork 4
📦 onnx support for santacoder #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Makefile Outdated
| ||
RUN_HF := python3 src/main.py --pipeline_class=HF_Pipeline | ||
RUN_DS := deepspeed --num_gpus 1 src/main.py --pipeline_class=DS_Pipeline | ||
RUN_HF := python -m src.main --pipeline_class=HF_Pipeline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not an installed package so we should refer to it by path, not module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python3 src/main.py --pipeline_class=ONNX_Pipeline --pretrained_model=mayank31398/santacoder --tokenizer=mayank31398/santacoder --trust_remote_code --dtype=float16 --batch_size=1 --max_input_length=-1 device=cuda Traceback (most recent call last): File "src/main.py", line 11, in <module> from src.metrics import Metrics ModuleNotFoundError: No module named 'src' make: *** [Makefile:73: santacoder-onnx] Error 1
For me it doesn't work :)
are you pip installing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ill need to convert everything in the code to use relative paths to make this work I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I had another look, I guess you're right. The -m
is needed to do the imports properly (though it still needs to be called from the repo root), in the dockerfile I'm just adding to the pythonpath. Relative paths wouldn't work either.
Please keep python3
as the executable, python
tends to refer to python2
Makefile Outdated
optimized-santacoder: | ||
${RUN_HF} --pretrained_model=olivierdehaene/optimized-santacoder --tokenizer=bigcode/santacoder --trust_remote_code ${EXP_ARGS} | ||
| ||
.PHONY: santacoder-cpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to have different targets for cpu and gpu, $EXTRA_ARGS already covers it.
Why do we need mayank31398/santacoder
? How is it different from the other targets above?
Please build the commands like the other ones above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mayank/santacoder
has a fix for a bug. This bug is not encountered when running the vanilla generate method.
But for compiling the graph, onnx runs the model by passing dummy tensors and doesn't take the generate() path.
Its only a single line change in modeling_gpt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had created a PR for this: https://huggingface.co/bigcode/santacoder/discussions/23#640f55462ec22527ea436d5e
| ||
| ||
class ONNX_Pipeline(Pipeline): | ||
def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The vast majority of this function is identical to the base class. Seems like you just need to override methods like _get_config,
_create_model, etc. and add some assertions like in
DS_Pipeline`
| ||
self.dtype = dtype | ||
| ||
if self.dtype != torch.float32: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still doesn't support float16?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does but not for santacoder. it requires a merged graph (at least if you compile via optimum)
93d9057
to d84fb61
Compare db1d5f8
to f889370
Compare
No description provided.