feat: Add cloud profiler to training_utils #828
… support for local file uploads
@mkovalski: I am assigning as owner of this PR; feel free to ping reviewers as needed to make sure the review process progresses in a timely fashion, or provide guidance on who might better own the process of getting the PR reviewed, passing continuous testing, and merged. Reach out if you have questions.
if not environment_variables.http_handler_port:
    raise MissingEnvironmentVariableException(
        "'AIP_HTTP_HANDLER_PORT' must be set."
Should the user set this through an environment variable, or is it set by the service?
This is set by the service.
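For reference, a minimal sketch of the check being discussed, assuming the value is read straight from the process environment. The PR itself goes through an `environment_variables` module; the helper name `get_http_handler_port` and the locally defined exception class are stand-ins for illustration only.

```python
import os


class MissingEnvironmentVariableException(Exception):
    """Local stand-in for the exception named in the diff."""


def get_http_handler_port(environ=None):
    """Return the handler port, or raise if the service did not set it.

    Hypothetical helper: `environ` defaults to os.environ and can be
    swapped for a plain dict in tests.
    """
    environ = os.environ if environ is None else environ
    port = environ.get("AIP_HTTP_HANDLER_PORT")
    if not port:
        # The variable is expected to be injected by the service, so a
        # missing value is reported as an error rather than defaulted.
        raise MissingEnvironmentVariableException(
            "'AIP_HTTP_HANDLER_PORT' must be set."
        )
    return port
```

Since the service sets the variable, failing loudly when it is absent seems preferable to silently picking a default port.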
from google.cloud.aiplatform.training_utils.cloud_profiler.plugins import base_plugin
from typing import List
from werkzeug import wrappers
Wrap this import in a try/except that raises an informative ImportError.
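One way this suggestion could look, as a rough sketch rather than what the PR ends up doing: a small helper that imports an optional dependency and re-raises with a message telling the user what to install. The helper name `import_or_explain` and the wording of the hint are hypothetical.

```python
import importlib


def import_or_explain(module_name, install_hint):
    """Import an optional dependency, or raise an ImportError with guidance.

    Hypothetical helper: `install_hint` is free-form text supplied by the
    caller, for example which package or extra to install.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"Could not import {module_name!r}, which the cloud profiler "
            f"requires. {install_hint}"
        ) from err


# Example call (not executed here, since werkzeug may not be installed):
# wrappers = import_or_explain(
#     "werkzeug.wrappers", "Please install werkzeug to use the profiler."
# )
```

The same effect can be had with a plain `try: from werkzeug import wrappers` / `except ImportError` at module top; the helper only avoids repeating the message for each optional import.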
google/cloud/aiplatform/training_utils/cloud_profiler/webserver.py (two outdated, resolved threads)
setup.py Outdated
full_extra_require = list(
    set(tensorboard_extra_require + metadata_extra_require + xai_extra_require)
    set(
TF version should be handled explicitly since TB, XAI, and Profiler have different version bounds.
Done.
Adds the ability to profile Vertex training jobs using the TensorBoard profiler.
Fixes #519