Commit 6eea5f4 (parent e1fcfff), authored by vertex-sdk-bot, committed by copybara-github

feat: Add Model Garden deploy SDK documentation and use cases.

PiperOrigin-RevId: 751151322

1 file changed: vertexai/model_garden/README.md (+199 −0 lines)
# Vertex Model Garden SDK for Python

The Vertex Model Garden SDK helps developers use [Model Garden](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models) open models to build AI-powered features and applications.

The SDK supports use cases like the following:

- Deploy an open model
- Export open model weights
## Installation

To install the
[google-cloud-aiplatform](https://pypi.org/project/google-cloud-aiplatform/)
Python package, run the following command:

```shell
pip3 install --upgrade --user "google-cloud-aiplatform>=1.84"
```
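The command above pins the package to version 1.84 or newer. As a minimal local sketch (a hypothetical helper, not part of the SDK, ignoring pre-release suffixes such as `rc1`), a dotted version string can be checked against that pin by comparing its numeric parts:

```python
def meets_minimum(installed: str, required: str = "1.84") -> bool:
    """Return True if `installed` satisfies the `>=required` pin.

    Compares dotted version strings part by part; pre-release
    suffixes are out of scope for this sketch.
    """
    def parts(version: str) -> list[int]:
        return [int(p) for p in version.split(".")]

    return parts(installed) >= parts(required)


print(meets_minimum("1.90.1"))  # → True (newer than the 1.84 pin)
```

In real code, `packaging.version.Version` handles pre-release and build metadata correctly; the sketch only illustrates why the `>=1.84` constraint is written the way it is.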
## Usage

For detailed instructions, see [deploy an open model](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/use-models#deploy_an_open_model) and the [deployment notebook tutorial](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_deployment_tutorial.ipynb).
## Quick Start: Default Deployment

This is the simplest way to deploy a model. If you provide just a model name, the SDK uses the default deployment configuration.

```python
from vertexai.preview import model_garden

model = model_garden.OpenModel("google/paligemma@paligemma-224-float32")
endpoint = model.deploy()
```

**Use case:** Fast prototyping, first-time users evaluating model outputs.
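The model name passed to `OpenModel` above follows the `publisher/model@version` convention. A small illustration of splitting such an identifier (a hypothetical local helper; the SDK takes the full string as-is):

```python
def parse_model_name(name: str) -> tuple[str, str, str]:
    """Split a Model Garden name into (publisher, model, version)."""
    path, _, version = name.partition("@")
    publisher, _, model = path.partition("/")
    return publisher, model, version


print(parse_model_name("google/paligemma@paligemma-224-float32"))
# → ('google', 'paligemma', 'paligemma-224-float32')
```

Hugging Face IDs such as `Qwen/Qwen2-1.5B-Instruct` carry no `@version` part, so the version component comes back empty.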

## List Deployable Models

You can list all models that are currently deployable via Model Garden:

```python
from vertexai.preview import model_garden

models = model_garden.list_deployable_models()
```

To list only Hugging Face models, or to filter by keyword:

```python
models = model_garden.list_deployable_models(list_hf_models=True, model_filter="stable-diffusion")
```

**Use case:** Discover available models before deciding which one to deploy.
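The `model_filter` argument narrows the listing by keyword. Conceptually this behaves like a case-insensitive substring match over model names, sketched here with made-up catalog entries (an assumption about the filter's semantics, not the SDK call itself):

```python
def filter_models(names: list[str], keyword: str) -> list[str]:
    """Keep only model names that contain the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [n for n in names if kw in n.lower()]


# Made-up sample entries standing in for the real listing.
catalog = [
    "stabilityai/stable-diffusion-xl-base-1.0",
    "google/paligemma@paligemma-224-float32",
    "Qwen/Qwen2-1.5B-Instruct",
]
print(filter_models(catalog, "stable-diffusion"))
# → ['stabilityai/stable-diffusion-xl-base-1.0']
```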

## Hugging Face Model Deployment

Deploy a model directly from Hugging Face using its model ID:

```python
model = model_garden.OpenModel("Qwen/Qwen2-1.5B-Instruct")
endpoint = model.deploy()
```

**Use case:** Leverage community or third-party models without custom container setup.

If the model is gated, you may need to provide a Hugging Face access token:

```python
endpoint = model.deploy(hugging_face_access_token="your_hf_token")
```

**Use case:** Deploy gated Hugging Face models that require authentication.

## List Deployment Configurations

You can inspect the available deployment configurations for a model:

```python
model = model_garden.OpenModel("google/paligemma@paligemma-224-float32")
deploy_options = model.list_deploy_options()
```

**Use case:** Evaluate compatible machine specs and containers before deployment.

## Customize Deployment: Machine and Resource Configuration

Specify exact hardware resources and endpoint/model display names:

```python
endpoint = model.deploy(
    machine_type="g2-standard-4",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=1,
    endpoint_display_name="paligemma-endpoint",
    model_display_name="paligemma-model",
)
```

**Use case:** Production configuration, performance tuning, scaling.

## EULA Acceptance

Some models require acceptance of an end-user license agreement (EULA). Pass `eula=True` to accept it when prompted:

```python
model = model_garden.OpenModel("google/gemma2@gemma-2-27b-it")
endpoint = model.deploy(eula=True)
```

**Use case:** First-time deployment of EULA-protected models.

## Spot VM Deployment

Schedule workloads on Spot VMs for lower cost:

```python
endpoint = model.deploy(spot=True)
```

**Use case:** Cost-sensitive development and batch workloads.

## Fast Tryout Deployment

Enable the experimental fast-deploy path available for popular models:

```python
endpoint = model.deploy(fast_tryout_enabled=True)
```

**Use case:** Interactive experimentation without a full production setup.

## Dedicated Endpoints

Create a dedicated, DNS-isolated endpoint:

```python
endpoint = model.deploy(use_dedicated_endpoint=True)
```

**Use case:** Traffic isolation for enterprise or regulated workloads.

## Reservation Affinity

Use shared or specific Compute Engine reservations:

```python
endpoint = model.deploy(
    reservation_affinity_type="SPECIFIC_RESERVATION",
    reservation_affinity_key="compute.googleapis.com/reservation-name",
    reservation_affinity_values="projects/YOUR_PROJECT/zones/YOUR_ZONE/reservations/YOUR_RESERVATION",
)
```

**Use case:** Optimized resource usage with pre-reserved capacity.
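The `reservation_affinity_values` string above is the full Compute Engine resource name of the reservation. A tiny sketch of assembling it from its parts (a hypothetical helper for readability; the SDK only sees the final string):

```python
def reservation_resource_name(project: str, zone: str, reservation: str) -> str:
    """Build the full resource name for a specific reservation."""
    return f"projects/{project}/zones/{zone}/reservations/{reservation}"


print(reservation_resource_name("my-project", "us-central1-a", "my-reservation"))
# → projects/my-project/zones/us-central1-a/reservations/my-reservation
```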

## Custom Container Image

Override the default serving container with a custom image:

```python
endpoint = model.deploy(
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/custom-container:latest"
)
```

**Use case:** Custom inference servers or fine-tuned environments.

## Advanced Full Container Configuration

Further customize startup probes, health checks, shared memory, and gRPC ports:

```python
endpoint = model.deploy(
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/custom-container:latest",
    container_command=["python3"],
    container_args=["serve.py"],
    container_ports=[8888],
    container_env_vars={"ENV": "prod"},
    container_predict_route="/predict",
    container_health_route="/health",
    serving_container_shared_memory_size_mb=512,
    serving_container_grpc_ports=[9000],
    serving_container_startup_probe_exec=["/bin/check-start.sh"],
    serving_container_health_probe_exec=["/bin/health-check.sh"],
)
```

**Use case:** Production-grade deployments requiring deep customization of runtime behavior and monitoring.
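With this many knobs, it can help to assemble the container arguments in a dict and sanity-check them locally before calling `deploy`. A sketch of such checks (hypothetical client-side validations, not something the SDK performs):

```python
def check_container_config(cfg: dict) -> list[str]:
    """Return human-readable problems found in a container config dict."""
    problems = []
    # HTTP routes must be absolute paths.
    for key in ("container_predict_route", "container_health_route"):
        route = cfg.get(key)
        if route is not None and not route.startswith("/"):
            problems.append(f"{key} should start with '/'")
    # Ports must fall in the valid TCP range.
    for key in ("container_ports", "serving_container_grpc_ports"):
        for port in cfg.get(key, []):
            if not 1 <= port <= 65535:
                problems.append(f"{key} has out-of-range port {port}")
    return problems


cfg = {
    "container_predict_route": "/predict",
    "container_health_route": "/health",
    "container_ports": [8888],
    "serving_container_grpc_ports": [9000],
}
print(check_container_config(cfg))  # → []
```

A config dict that passes the checks could then be passed through to `deploy` as keyword arguments.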

## Documentation

You can find complete documentation for the Vertex AI SDKs and Model Garden in the Google Cloud [documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview).

## Contributing

See [Contributing](https://github.com/googleapis/python-aiplatform/blob/main/CONTRIBUTING.rst) for more information on contributing to the Vertex AI Python SDK.

## License

The contents of this repository are licensed under the [Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
