Cortex is an open-source platform for deploying, managing, and scaling machine learning models in production.
- Supports deploying TensorFlow, PyTorch, sklearn and other models as realtime or batch APIs
- Ensures high availability with availability zones and automated instance restarts
- Scales to handle production workloads with request-based autoscaling
- Runs inference on spot instances with on-demand backups
- Manages traffic splitting for A/B testing
    # cluster.yaml
    region: us-east-1
    availability_zones: [us-east-1a, us-east-1b]
    api_gateway: public
    instance_type: g4dn.xlarge
    min_instances: 10
    max_instances: 100
    spot: true

    $ cortex cluster up --config cluster.yaml

    ○ configuring autoscaling ✓
    ○ configuring networking ✓
    ○ configuring logging ✓
    ○ configuring metrics dashboard ✓

    cortex is ready!
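To make the request-based autoscaling bullet more concrete, here is a minimal sketch of one common rule: divide the number of in-flight requests by a per-replica concurrency target, round up, and clamp the result to the configured bounds. The function `desired_replicas` and its `target_concurrency` parameter are hypothetical names chosen for illustration. This is not presented as Cortex's actual algorithm.

```python
import math


def desired_replicas(in_flight_requests: int,
                     target_concurrency: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Return a replica count for the given load, clamped to [min, max].

    Hypothetical helper for illustration only; not part of the Cortex API.
    """
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")
    # Replicas needed so each one handles at most target_concurrency requests.
    needed = math.ceil(in_flight_requests / target_concurrency)
    # Never go below the floor or above the ceiling.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # 45 in-flight requests at 4 per replica would need 12 replicas,
    # which the max_replicas bound then clamps.
    print(desired_replicas(45, 4, min_replicas=1, max_replicas=10))
```

Clamping keeps a traffic spike from scaling past what the cluster can schedule, and keeps a quiet period from scaling the API to zero replicas.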
- Implement request handling in Python
- Customize compute, autoscaling, and networking for each API
- Package dependencies, code, and configuration for reproducible deployments
- Test locally before deploying to your cluster
    # predictor.py
    from transformers import pipeline

    class PythonPredictor:
        def __init__(self, config):
            self.model = pipeline(task="text-generation")

        def predict(self, payload):
            return self.model(payload["text"])[0]
    # cortex.yaml
    name: text-generator
    kind: RealtimeAPI
    predictor:
      path: predictor.py
    compute:
      gpu: 1
      mem: 4Gi
    autoscaling:
      min_replicas: 1
      max_replicas: 10
    networking:
      api_gateway: public
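Before deploying, the request-handling logic can be exercised locally without downloading a model. The sketch below is one possible approach, not a Cortex feature: it swaps the transformers pipeline for a hypothetical `StubModel` and adds an optional `model` argument to the predictor so a test can inject the stub. The stub's return shape (a list of dicts keyed by `generated_text`) is an assumption about what the text-generation pipeline produces.

```python
class StubModel:
    """Stands in for the transformers pipeline in local tests (hypothetical)."""

    def __call__(self, text):
        # Assumed output shape: a list with one dict per generated sequence.
        return [{"generated_text": text + " production"}]


class PythonPredictor:
    def __init__(self, config, model=None):
        # The `model` argument is an addition for testing; a deployed
        # predictor would build the real pipeline here instead.
        self.model = model if model is not None else StubModel()

    def predict(self, payload):
        # Same logic as predictor.py: return the first generated result.
        return self.model(payload["text"])[0]


def smoke_test():
    """Call predict() the way an incoming JSON request would."""
    predictor = PythonPredictor(config={})
    return predictor.predict({"text": "deploy machine learning models to"})


if __name__ == "__main__":
    print(smoke_test())
```

A check like this catches payload-key mistakes and indexing errors in `predict` before any cluster resources are used.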
    $ cortex deploy cortex.yaml

    creating https://example.com/text-generator

    $ curl https://example.com/text-generator \
        -X POST -H "Content-Type: application/json" \
        -d '{"text": "deploy machine learning models to"}'

    "deploy machine learning models to production"
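The same request can be sent from Python. This is a minimal sketch using only the standard library. The helper names `build_request` and `generate` are hypothetical, and the URL is the placeholder from the example above, so substitute the endpoint that `cortex deploy` prints for your API.

```python
import json
import urllib.request


def build_request(url: str, text: str) -> urllib.request.Request:
    """Build the JSON POST request that the curl command sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(url: str, text: str, timeout: float = 30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(url, text), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `generate("https://example.com/text-generator", "deploy machine learning models to")` would issue the request. Building the request in a separate function makes the payload and headers easy to inspect without touching the network.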
- Monitor API performance
- Aggregate and stream logs
- Customize prediction tracking
- Update APIs without downtime
    $ cortex get

    realtime api       status   replicas   last update   latency   requests

    text-generator     live     34         9h            247ms     71828
    object-detector    live     13         15h           23ms      828459

    batch api          running jobs   last update

    image-classifier   5              10h
$ pip install cortex
See the installation guide for next steps.