docs/cluster-management/config.md (+2 -9)
@@ -48,20 +48,13 @@ log_group: cortex
  spot: false
 ```
 
-The docker images used by Cortex are listed below. They can be overridden to use custom images by specifying them in your cluster configuration file.
+The default docker images used for your Predictors are listed in the instructions for [system packages](../deployments/system-packages.md), and can be overridden in your [API configuration](../deployments/api-configuration.md).
 
-You can follow these [instructions](../deployments/system-packages.md) to build and push custom Docker images to a registry and configure Cortex to use them.
+The docker images used by the Cortex cluster can also be overridden, although this is not common. They can be configured by adding any of these keys to your cluster configuration file (default values are shown):
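The key list itself is not captured here. As a hedged sketch of the pattern (the key names below are assumptions, modeled on the `image_python_serve` key that appears later in this diff; the real list and defaults live in the cluster configuration docs):

```yaml
# cluster.yaml -- illustrative only; these key names are assumptions
image_operator: cortexlabs/operator   # hypothetical key for the operator image
image_fluentd: cortexlabs/fluentd     # hypothetical key for the log shipper image
```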
docs/cluster-management/security.md (+3 -2)
@@ -20,7 +20,7 @@ It is recommended to use an IAM user with the `AdministratorAccess` policy to cr
 
 ### Operator
 
-The operator requires read permissions for any S3 bucket containing exported models, read and write permissions for the Cortex S3 bucket, read and write permissions for the Cortex CloudWatch log group, and read and write permissions for CloudWatch metrics. The policy below may be used to restrict the Operator's access:
+The operator requires read permissions for any S3 bucket containing exported models, read and write permissions for the Cortex S3 bucket, read and write permissions for the Cortex CloudWatch log group, read and write permissions for CloudWatch metrics, and read permissions for ECR. The policy below may be used to restrict the Operator's access:
 
 ```json
 {
@@ -42,7 +42,8 @@ The operator requires read permissions for any S3 bucket containing exported mod
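A hedged sketch of an ECR read statement consistent with the new requirement (these are standard ECR read actions, but the exact set Cortex requires is an assumption):

```json
{
    "Effect": "Allow",
    "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
    ],
    "Resource": "*"
}
```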
docs/cluster-management/telemetry.md (+1 -1)
@@ -6,7 +6,7 @@ By default, Cortex sends anonymous usage data to Cortex Labs.
 
 ## What data is collected?
 
-If telemetry is enabled, events and errors are collected. Each time you run a command an event will be sent with a randomly generated unique CLI ID and the name of the command. For example, if you run `cortex deploy`, Cortex Labs will receive an event of the structure {id: 1234, command: "deploy"}. In addition, the operator sends heartbeats that include cluster metrics like the types of instances running in your cluster.
+If telemetry is enabled, events and errors are collected. Each time you run a command an event will be sent with a randomly generated unique CLI ID and the name of the command. For example, if you run `cortex deploy`, Cortex Labs will receive an event of the structure `{id: 1234, command: "deploy"}`. In addition, the operator sends heartbeats that include cluster metrics like the types of instances running in your cluster.
docs/deployments/api-configuration.md (+7 -3)
@@ -16,6 +16,7 @@ Reference the section below which corresponds to your Predictor type: [Python](#
   path: <string> # path to a python file with a PythonPredictor class definition, relative to the Cortex root (required)
   config: <string: value> # arbitrary dictionary passed to the constructor of the Predictor (optional)
   python_path: <string> # path to the root of your Python folder that will be appended to PYTHONPATH (default: folder containing cortex.yaml)
+  image: <string> # docker image to use for the Predictor (default: cortexlabs/python-serve[-gpu])
   env: <string: string> # dictionary of environment variables
   tracker:
     key: <string> # the JSON key in the response to track (required if the response payload is a JSON object)
@@ -44,7 +45,7 @@ Reference the section below which corresponds to your Predictor type: [Python](#
   max_unavailable: <string | int> # maximum number of replicas that can be unavailable during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%)
 ```
 
-See additional documentation for [autoscaling](autoscaling.md), [compute](compute.md), and [prediction monitoring](prediction-monitoring.md).
+See additional documentation for [autoscaling](autoscaling.md), [compute](compute.md), [prediction monitoring](prediction-monitoring.md), and [overriding API images](system-packages.md).
 
 ## TensorFlow Predictor
 
@@ -58,6 +59,8 @@ See additional documentation for [autoscaling](autoscaling.md), [compute](comput
   signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
   config: <string: value> # arbitrary dictionary passed to the constructor of the Predictor (optional)
   python_path: <string> # path to the root of your Python folder that will be appended to PYTHONPATH (default: folder containing cortex.yaml)
+  image: <string> # docker image to use for the Predictor (default: cortexlabs/tf-api)
+  tf_serve_image: <string> # docker image to use for the TensorFlow Serving container (default: cortexlabs/tf-serve[-gpu], which is based on tensorflow/serving)
   env: <string: string> # dictionary of environment variables
   tracker:
     key: <string> # the JSON key in the response to track (required if the response payload is a JSON object)
@@ -86,7 +89,7 @@ See additional documentation for [autoscaling](autoscaling.md), [compute](comput
   max_unavailable: <string | int> # maximum number of replicas that can be unavailable during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%)
 ```
 
-See additional documentation for [autoscaling](autoscaling.md), [compute](compute.md), and [prediction monitoring](prediction-monitoring.md).
+See additional documentation for [autoscaling](autoscaling.md), [compute](compute.md), [prediction monitoring](prediction-monitoring.md), and [overriding API images](system-packages.md).
 
 ## ONNX Predictor
 
@@ -99,6 +102,7 @@ See additional documentation for [autoscaling](autoscaling.md), [compute](comput
   model: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model.onnx) (required)
   config: <string: value> # arbitrary dictionary passed to the constructor of the Predictor (optional)
   python_path: <string> # path to the root of your Python folder that will be appended to PYTHONPATH (default: folder containing cortex.yaml)
+  image: <string> # docker image to use for the Predictor (default: cortexlabs/onnx-serve[-gpu])
   env: <string: string> # dictionary of environment variables
   tracker:
     key: <string> # the JSON key in the response to track (required if the response payload is a JSON object)
@@ -127,4 +131,4 @@ See additional documentation for [autoscaling](autoscaling.md), [compute](comput
   max_unavailable: <string | int> # maximum number of replicas that can be unavailable during an update; can be an absolute number, e.g. 5, or a percentage of desired replicas, e.g. 10% (default: 25%)
 ```
 
-See additional documentation for [autoscaling](autoscaling.md), [compute](compute.md), and [prediction monitoring](prediction-monitoring.md).
+See additional documentation for [autoscaling](autoscaling.md), [compute](compute.md), [prediction monitoring](prediction-monitoring.md), and [overriding API images](system-packages.md).
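To illustrate the two new TensorFlow image fields together, a hedged sketch of an API configuration (the API name and repository URLs are placeholders, not taken from this diff):

```yaml
# cortex.yaml -- illustrative sketch
- name: my-tf-api
  predictor:
    ...
    image: <repository_url>/my-tf-api:latest             # overrides cortexlabs/tf-api
    tf_serve_image: <repository_url>/my-tf-serve:latest  # overrides cortexlabs/tf-serve[-gpu]
```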
docs/deployments/system-packages.md (+20 -18)
@@ -36,24 +36,22 @@ class PythonPredictor:
 
 ## Custom Docker image
 
+### Create a Dockerfile
+
 Create a Dockerfile to build your custom image:
 
 ```bash
 mkdir my-api && cd my-api && touch Dockerfile
 ```
 
-The Docker images used to deploy your models are listed below. Based on the Cortex Predictor and compute type specified in your API configuration, choose a Cortex image to use as the base for your custom Docker image.
-
-### Base Cortex images for model serving
+The default Docker images used to deploy your models are listed below. Based on the Cortex Predictor and compute type specified in your API configuration, choose a Cortex image to use as the base for your custom Docker image:
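As a hedged sketch of a custom Dockerfile built on one of those base images (the base image and the installed package are assumptions for illustration):

```dockerfile
# Dockerfile -- illustrative; cortexlabs/python-serve is assumed as the base
FROM cortexlabs/python-serve

# install an extra system package on top of the default serving image
RUN apt-get update \
  && apt-get install -y tree \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*
```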
-Update your cluster configuration file to point to your image:
+Update your API configuration file to point to your image:
 
 ```yaml
-# cluster.yaml
+# cortex.yaml
 
-# ...
-image_python_serve: <repository_url>:latest
-# ...
+- name: my-api
+  ...
+  predictor:
+    image: <repository_url>:latest
+  ...
 ```
 
-Update your cluster for the change to take effect:
+*Note: for [TensorFlow Predictors](#tensorflow-predictor), two containers run together to serve predictions: one that runs your Predictor code (`cortexlabs/tf-api`), and TensorFlow Serving, which loads the SavedModel (`cortexlabs/tf-serve[-gpu]`). A second field, `tf_serve_image`, can be used to override the TensorFlow Serving image. The default image (`cortexlabs/tf-serve[-gpu]`) is based on the official TensorFlow Serving image (`tensorflow/serving`). Unless a different version of TensorFlow Serving is required, this image shouldn't need to be overridden, since it only loads the SavedModel and does not run your Predictor code.*
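If a different TensorFlow Serving version were required, the override might start from a Dockerfile like this hedged sketch (the version tag is a placeholder, and whether a plain `tensorflow/serving` base includes everything Cortex expects is an assumption worth verifying):

```dockerfile
# Dockerfile -- hypothetical custom TensorFlow Serving image pinned to a
# specific upstream release; referenced via tf_serve_image in cortex.yaml
FROM tensorflow/serving:1.14.0
```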