- Deploy TensorFlow, PyTorch, ONNX, scikit-learn, and other models.
- Define preprocessing and postprocessing steps in Python.
- Configure APIs as realtime or batch.
- Deploy multiple models per API.
- Monitor API performance and track predictions.
- Update APIs with no downtime.
- Stream logs from APIs.
- Perform A/B tests.
- Test locally, scale on your AWS account.
- Autoscale to handle production traffic.
- Reduce cost with spot instances.
Define any realtime or batch inference pipeline as a simple Python API, regardless of framework.
# predictor.py

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
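Because a predictor is a plain Python class, its logic can be exercised without a cluster. The sketch below is illustrative only: `EchoModel`, `clean`, and the `max_chars` config key are hypothetical stand-ins (not part of Cortex or transformers), used so the example runs without downloading a model. It shows where preprocessing and postprocessing steps could sit around the model call.

```python
# Hypothetical stand-in for a real model so the sketch runs offline.
class EchoModel:
    def __call__(self, text):
        # Mimics the list-of-dicts shape returned by a text-generation pipeline.
        return [{"generated_text": text + " ..."}]


def clean(text, max_chars):
    # Preprocessing: collapse whitespace and truncate overly long prompts.
    return " ".join(text.split())[:max_chars]


class PythonPredictor:
    def __init__(self, config):
        # `max_chars` is an assumed config key, not a Cortex setting.
        self.max_chars = config.get("max_chars", 200)
        self.model = EchoModel()

    def predict(self, payload):
        prompt = clean(payload["text"], self.max_chars)
        result = self.model(prompt)[0]
        # Postprocessing: return only the generated string.
        return result["generated_text"]


predictor = PythonPredictor(config={"max_chars": 50})
output = predictor.predict({"text": "  machine   learning is  "})
print(output)
```

Swapping `EchoModel` for a real model leaves `predict` unchanged, so the same checks can run before deployment.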
Configure autoscaling, monitoring, compute resources, update strategies, and more.
# cortex.yaml

- name: text-generator
  predictor:
    path: predictor.py
  networking:
    api_gateway: public
  compute:
    gpu: 1
  autoscaling:
    min_replicas: 3
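To give a feel for what a replica floor such as `min_replicas: 3` means under load, here is a rough sketch of how a request-based autoscaler might choose a replica count. This is not Cortex's actual algorithm; the function and every parameter name other than `min_replicas` are assumptions made for illustration.

```python
import math


def desired_replicas(in_flight_requests, target_per_replica, min_replicas, max_replicas):
    """Return a replica count sized to current load, clamped to the configured bounds."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    needed = math.ceil(in_flight_requests / target_per_replica)
    return max(min_replicas, min(max_replicas, needed))


# With min_replicas: 3 (as in the config above) and an assumed ceiling of 10:
print(desired_replicas(0, 4, 3, 10))    # idle traffic stays at the floor
print(desired_replicas(25, 4, 3, 10))   # ceil(25 / 4) = 7 replicas
print(desired_replicas(100, 4, 3, 10))  # capped at the ceiling
```

The floor keeps capacity warm for sudden traffic, while the ceiling bounds spend.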
Handle traffic with request-based autoscaling. Minimize spend with spot instances and multi-model APIs.
$ cortex get text-generator

endpoint: https://example.com/text-generator

status   last-update   replicas   requests   latency
live     10h           10         100000     100ms
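Once an endpoint is live, a client could call it over HTTP. The sketch below only builds the request so that it runs offline. It assumes the API accepts a JSON body via POST with a `text` key (matching `payload["text"]` in the predictor), and it uses the placeholder endpoint from the output above. `build_request` is a hypothetical helper, not part of Cortex.

```python
import json
import urllib.request

# Placeholder endpoint taken from the `cortex get` output above.
ENDPOINT = "https://example.com/text-generator"


def build_request(text):
    """Build (but do not send) a JSON POST request carrying the prompt."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("machine learning is")
print(request.get_method(), request.full_url)

# To call a live API, something like this would send the request:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.loads(response.read()))
```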
Integrate Cortex with any data science platform and CI/CD tooling, without changing your workflow.
# predictor.py

import tensorflow
import torch
import transformers
import mlflow

...
Run Cortex on your AWS account (GCP support is coming soon), maintaining control over resource utilization and data access.
# cluster.yaml

region: us-west-2
instance_type: g4dn.xlarge
spot: true
min_instances: 1
max_instances: 5
You don't need to bring your own cluster or containerize your models; Cortex automates your cloud infrastructure.
$ cortex cluster up

configuring networking ...
configuring logging ...
configuring metrics ...
configuring autoscaling ...

cortex is ready!
bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.20/get-cli.sh)"
See our installation guide, then deploy one of our examples or bring your own models to build realtime APIs and batch APIs.