ODSC 39392/triton #128
Conversation
Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA).
To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application. When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated. If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public.
…a-science into ODSC-39392/triton
```python
if model_name and model_version:
    header['model-name'] = model_name
    header['model-version'] = model_version
elif not model_version and not model_name:
```
Since we are doing XOR logic here, we can reduce this to:
```python
elif bool(model_version) ^ bool(model_name):
    raise ValueError("`model_name` and `model_version` have to be provided together.")
```
Good call! fixed.
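For reference, a minimal standalone sketch of the validation this thread settled on (the function name and the dict-returning shape are illustrative, not the PR's exact code):

```python
def build_prediction_headers(model_name=None, model_version=None):
    """Illustrative sketch: require model name and version to be given together."""
    header = {}
    if model_name and model_version:
        header["model-name"] = model_name
        header["model-version"] = model_version
    elif bool(model_version) ^ bool(model_name):
        # Exactly one of the two values was provided.
        raise ValueError(
            "`model_name` and `model_version` have to be provided together."
        )
    return header
```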
```python
    infrastructure.web_concurrency
)
runtime.set_spec(runtime.CONST_ENV, environment_variables)
if hasattr(runtime, "inference_server") and runtime.inference_server and runtime.inference_server.lower() == "triton":
```
"triton" should be a constant
fixed.
We can use upper() instead; then we can use the same constant (MODEL_DEPLOYMENT_INFERENCE_SERVER_TRITON) for the "TRITON" in the next line :)
fixed.
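A rough sketch of where this thread ends up, assuming a module-level constant and an upper-cased comparison (the constant name comes from the review comment; the helper name and surrounding code are assumptions for illustration):

```python
MODEL_DEPLOYMENT_INFERENCE_SERVER_TRITON = "TRITON"  # constant name taken from the review comment

def flag_triton_container(runtime, environment_variables):
    """Illustrative sketch: mark a Triton deployment through an environment variable."""
    if (
        hasattr(runtime, "inference_server")
        and runtime.inference_server
        and runtime.inference_server.upper() == MODEL_DEPLOYMENT_INFERENCE_SERVER_TRITON
    ):
        environment_variables["CONTAINER_TYPE"] = MODEL_DEPLOYMENT_INFERENCE_SERVER_TRITON
```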
```python
... .with_model_uri(<model_id>)\
... .with_env({"key":"value", "key2":"value2"})\
... .with_inference_server("triton")
... deployment = ModelDeployment()\
```
Should there be `>>>` here? And remove the extra indentation?
yes, good catch.
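The point is standard doctest formatting: a new statement in a docstring example starts on a `>>>` line, and backslash-continued lines use `...` at a consistent indentation. A hedged sketch of how the flagged line would look (the chained builder calls here are illustrative, not necessarily the exact API used in the PR):

```python
>>> deployment = ModelDeployment()\
...     .with_infrastructure(infrastructure)\
...     .with_runtime(runtime)
```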
```diff
         return req.body
     else:
-        request_kwargs["auth"] = header.get("signer")
+        request_kwargs["auth"] = header.pop("signer")
```
Shall we pop signer for dry_run as well?
done
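A small sketch of the agreed behaviour: the signer is pulled out of the header dict with pop() so it is used as the request's auth but never serialized as an HTTP header, on both the dry-run and the real path (the function name, the dry_run flag, and the requests usage are assumptions for illustration, not the PR's exact code):

```python
import requests

def send_prediction_request(endpoint, data, header, dry_run=False, **request_kwargs):
    """Illustrative sketch: use the signer as auth and keep it out of the headers."""
    # pop() (rather than get()) removes the signer so it is not sent as a header.
    request_kwargs["auth"] = header.pop("signer", None)
    if dry_run:
        # Build the request without sending it, mirroring the `return req.body` branch above.
        req = requests.Request("POST", endpoint, json=data, headers=header, **request_kwargs)
        return req.prepare().body
    response = requests.post(endpoint, json=data, headers=header, **request_kwargs)
    return response.json()
```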
```python
)
runtime.set_spec(runtime.CONST_ENV, environment_variables)
if hasattr(runtime, "inference_server") and runtime.inference_server and runtime.inference_server.lower() == "triton":
    environment_variables["CONTAINER_TYPE"] = "TRITON"
```
Suggested change:
```diff
-environment_variables["CONTAINER_TYPE"] = "TRITON"
+environment_variables["CONTAINER_TYPE"] = MODEL_DEPLOYMENT_INFERENCE_SERVER_TRITON
```
fixed
…into ODSC-39392/triton
```python
if model_name and model_version:
    header['model-name'] = model_name
    header['model-version'] = model_version
elif bool(model_version) ^ bool(model_name):
```
We can rearrange this if/elif:
```python
if bool(model_version) ^ bool(model_name):
    raise ValueError("`model_name` and `model_version` have to be provided together.")
header['model-name'] = model_name
header['model-version'] = model_version
```
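For completeness, a runnable sketch of that rearranged flow; the guard runs first, and the truthiness check before assigning is an addition here so the headers are only set when values were actually passed (names remain illustrative, as in the earlier sketch):

```python
def build_prediction_headers(model_name=None, model_version=None):
    """Illustrative sketch: validate the pair first, then assign."""
    if bool(model_version) ^ bool(model_name):
        raise ValueError(
            "`model_name` and `model_version` have to be provided together."
        )
    header = {}
    if model_name:  # after the guard above, model_version is present as well
        header["model-name"] = model_name
        header["model-version"] = model_version
    return header
```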
Description
https://jira.oci.oraclecorp.com/browse/ODSC-39392
Testing Result
Int Test Result
You can check the PDF below for the testing results.
inference-approach1.pdf