* **Minimal configuration:** Cortex deployments are defined in a single `cortex.yaml` file (see the sketch after this list).
* **Multi framework:** deploy TensorFlow, PyTorch, scikit-learn, and other models.
* **Autoscaling:** automatically scale APIs to handle production workloads.
* **CPU / GPU support:** run inference on CPU or GPU instances.
* **Spot instances:** save money with EC2 spot instances.
* **Rolling updates:** update deployed APIs with no downtime.
* **Log streaming:** stream logs from deployed models to your CLI.
* **Prediction monitoring:** monitor API performance and track predictions.
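
Below is a minimal sketch of what a `cortex.yaml` might contain. The schema varies by Cortex version, and the key names, model path, and handler filename here are illustrative assumptions rather than the authoritative interface:

```yaml
# Hypothetical cortex.yaml -- key names and values are illustrative assumptions,
# not the authoritative schema; consult the Cortex docs for your version.
- kind: deployment
  name: iris

- kind: api
  name: classifier
  model: s3://my-bucket/iris/model.onnx  # assumed path to an exported model
  request_handler: handler.py            # optional pre/post-processing code
```

Running `cortex deploy` against a file like this is what sends the configuration to the cluster, as described below.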
<br>
Cortex is an open source alternative to serving models with SageMaker or building your own model deployment platform.
The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
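
As a rough illustration of the request handling code mentioned above, a handler module would typically expose pre- and post-processing hooks around the model call. The hook names and signatures below are assumptions for the sake of the sketch, not Cortex's documented API:

```python
# handler.py -- hypothetical request handler. The hook names and signatures
# are illustrative assumptions, not Cortex's documented interface.
import numpy as np

LABELS = ["setosa", "versicolor", "virginica"]  # example class names

def pre_inference(sample, metadata):
    # Turn the JSON payload into the numeric input the model expects.
    features = [
        sample["sepal_length"],
        sample["sepal_width"],
        sample["petal_length"],
        sample["petal_width"],
    ]
    return {"input": np.array(features, dtype=np.float32).reshape(1, -1)}

def post_inference(prediction, metadata):
    # Map the model's class index back to a human-readable label.
    return {"class": LABELS[int(np.argmax(prediction))]}
```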
**Note:** Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any DevOps work.