Commit 9518b51

Update models documentation (#1556)
(cherry picked from commit cbc160b)
1 parent 3315bff commit 9518b51

2 files changed: +20 -7 lines changed

docs/deployments/realtime-api/models.md

Lines changed: 19 additions & 6 deletions
@@ -187,7 +187,20 @@ Usage varies based on the predictor type:
187187

188188
### Python
189189

190-
To use live model reloading with the Python predictor, the model path(s) must be specified in the API's `predictor` configuration (via the `model_path` or `models` field). When models are specified in this manner, your `PythonPredictor` class must implement the `load_model()` function, and models can be retrieved by using the `get_model()` method of the `python_client` that's passed to the predictor's constructor:
190+
To use live model reloading with the Python predictor, the model path(s) must be specified in the API's `predictor` configuration (via the `model_path` or `models` field). When models are specified in this manner, your `PythonPredictor` class must implement the `load_model()` function, and models can be retrieved by using the `get_model()` method of the `python_client` that's passed into your predictor's constructor.
191+
192+
The `load_model()` function that you implement in your `PythonPredictor` can return anything that you need to make a prediction. There is one caveat: whatever the return value is, it must be unloadable from memory via the `del` keyword. The following frameworks have been tested to work:
193+
194+
* PyTorch (CPU & GPU)
195+
* ONNX (CPU & GPU)
196+
* scikit-learn/MLflow (CPU)
197+
* NumPy (CPU)
198+
* Pandas (CPU)
199+
* Caffe (not tested, but should work on CPU & GPU)
200+
201+
Python data structures containing these types are also supported (e.g. lists and dicts).
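
As a rough illustration of the rule above, here is a minimal sketch of a `load_model()` that returns a plain dict, which Python can release with `del`. The file name `model.pkl`, the assumption that the model path is a directory, the `config` constructor argument, and the use of `pickle` are all hypothetical choices made for this sketch, not something the documentation prescribes.

```python
import os
import pickle
import tempfile


class PythonPredictor:
    def __init__(self, config, python_client):
        # python_client is supplied by the platform; this sketch only stores it.
        self.client = python_client

    def load_model(self, model_path):
        # Assumption: model_path is a directory holding one pickled object
        # named "model.pkl" (a hypothetical file name for this sketch).
        with open(os.path.join(model_path, "model.pkl"), "rb") as f:
            weights = pickle.load(f)
        # A plain dict of ordinary Python objects can be unloaded with `del`.
        return {"weights": weights, "source": model_path}


# Local stand-in for the call that normally happens behind the scenes.
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, "model.pkl"), "wb") as f:
    pickle.dump([0.5, 1.5], f)

predictor = PythonPredictor(config={}, python_client=None)
loaded = predictor.load_model(model_dir)
```

Calling `load_model()` directly like this is only a way to check the function on your own machine; in a deployed API it is invoked for you when `get_model()` is called.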
202+
203+
The `load_model()` function takes a single argument, which is a path (on disk) to the model to be loaded. It is called behind the scenes when you call the `python_client`'s `get_model()` method from your predictor's `predict()` method. Whatever `load_model()` returns will be the exact return value of `python_client.get_model()`. Here is the schema for `python_client.get_model()`:
191204

192205
```python
193206
def get_model(model_name, model_version):
@@ -201,11 +214,11 @@ def get_model(model_name, model_version):
201214
model_version (string, optional): Version of the model to retrieve. Can be omitted or set to "latest" to select the highest version.
202215
203216
Returns:
204-
The model as loaded by the load_model() method.
217+
The value that's returned by your predictor's load_model() method.
205218
"""
206219
```
207220

208-
For example:
221+
Here's an example:
209222

210223
```python
211224
class PythonPredictor:
@@ -232,7 +245,7 @@ class PythonPredictor:
232245
return model.predict(payload)
233246
```
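
To exercise a predictor like this outside the platform, one option is a small stand-in client. The sketch below is hypothetical: `FakePythonClient` and its `model_paths` mapping are not part of Cortex and do not reflect how the real client manages models. It only mirrors the documented contract: `get_model()` returns exactly what `load_model()` returns, and `"latest"` selects the highest version.

```python
class FakePythonClient:
    """Hypothetical local stand-in for python_client; not part of Cortex."""

    def __init__(self, load_model, model_paths):
        # model_paths maps (model_name, version string) -> path on disk.
        self._load_model = load_model
        self._model_paths = model_paths

    def get_model(self, model_name, model_version="latest"):
        # Accept only "latest" or a numeric version string.
        if model_version != "latest" and not model_version.isnumeric():
            raise ValueError("model_version must be 'latest' or a numeric string")
        versions = [v for (name, v) in self._model_paths if name == model_name]
        if not versions:
            raise KeyError(model_name)
        if model_version == "latest":
            # Compare versions numerically so that "10" ranks above "9".
            model_version = max(versions, key=int)
        # Return whatever load_model() returns, unchanged.
        return self._load_model(self._model_paths[(model_name, model_version)])


def load_model(model_path):
    # Placeholder loader for the sketch: records which path was requested.
    return {"path": model_path}


client = FakePythonClient(
    load_model,
    {("text", "1"): "/models/text/1", ("text", "2"): "/models/text/2"},
)
latest = client.get_model("text")
pinned = client.get_model("text", "1")
```

Here `latest` resolves to version 2 and `pinned` to version 1. The stand-in reloads on every call, which is a simplification for the sketch.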
234247

235-
`python_client.get_model()` can also accept a model version if a version other than the highest version number is desired:
248+
`python_client.get_model()` can also accept a model version if a version other than the highest is desired:
236249

237250
```python
238251
class PythonPredictor:
@@ -285,7 +298,7 @@ class TensorFlowPredictor:
285298
return self.client.predict(payload, query_params["model"])
286299
```
287300

288-
`tensorflow_client.predict()` can also accept a model version if a version other than the highest version number is desired:
301+
`tensorflow_client.predict()` can also accept a model version if a version other than the highest is desired:
289302

290303
```python
291304
class TensorFlowPredictor:
@@ -339,7 +352,7 @@ class ONNXPredictor:
339352
return self.client.predict(payload, query_params["model"])
340353
```
341354

342-
`onnx_client.predict()` can also accept a model version if a version other than the highest version number is desired:
355+
`onnx_client.predict()` can also accept a model version if a version other than the highest is desired:
343356

344357
```python
345358
class ONNXPredictor:

pkg/workloads/cortex/lib/client/python.py

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ def get_model(self, model_name: Optional[str] = None, model_version: str = "late
9898
model_version (string, optional): Version of the model to retrieve. Can be omitted or set to "latest" to select the highest version.
9999
100100
Returns:
101-
The model as loaded by the load_model() method.
101+
The value that's returned by your predictor's load_model() method.
102102
"""
103103

104104
if model_version != "latest" and not model_version.isnumeric():
