
Deploy machine learning models to production

Cortex is an open source platform for deploying, managing, and scaling machine learning in production.


Model serving infrastructure

  • Supports deploying TensorFlow, PyTorch, scikit-learn, and other models as realtime or batch APIs
  • Ensures high availability with availability zones and automated instance restarts
  • Scales to handle production workloads with request-based autoscaling
  • Runs inference on spot instances with on-demand backups
  • Manages traffic splitting for A/B testing

Configure your cluster:

```yaml
# cluster.yaml

region: us-east-1
availability_zones: [us-east-1a, us-east-1b]
api_gateway: public
instance_type: g4dn.xlarge
min_instances: 10
max_instances: 100
spot: true
```

Spin up your cluster on your AWS account:

```bash
$ cortex cluster up --config cluster.yaml

○ configuring autoscaling ✓
○ configuring networking ✓
○ configuring logging ✓
○ configuring metrics dashboard ✓

cortex is ready!
```

Reproducible model deployments

  • Implement request handling in Python
  • Customize compute, autoscaling, and networking for each API
  • Package dependencies, code, and configuration for reproducible deployments
  • Test locally before deploying to your cluster

Implement a predictor:

```python
# predictor.py

from transformers import pipeline


class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
```
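Since predictors are plain Python classes, they can be exercised locally before deploying. A minimal sketch of such a test, with a stub standing in for the transformers pipeline (the stub and its canned completion are assumptions for illustration, so the example runs without downloading a model):

```python
# test_predictor.py -- illustrative only; StubPipeline mimics the
# output shape of transformers' text-generation pipeline.

class StubPipeline:
    def __call__(self, text):
        # transformers returns a list of {"generated_text": ...} dicts
        return [{"generated_text": text + " production"}]


class PythonPredictor:
    def __init__(self, config):
        # In predictor.py this is pipeline(task="text-generation")
        self.model = StubPipeline()

    def predict(self, payload):
        return self.model(payload["text"])[0]


predictor = PythonPredictor(config={})
result = predictor.predict({"text": "deploy machine learning models to"})
print(result["generated_text"])  # -> deploy machine learning models to production
```

Swapping the stub for the real pipeline reproduces the deployed behavior, which is what makes the packaged deployments reproducible.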

Configure an API:

```yaml
# cortex.yaml

name: text-generator
kind: RealtimeAPI
predictor:
  path: predictor.py
compute:
  gpu: 1
  mem: 4Gi
autoscaling:
  min_replicas: 1
  max_replicas: 10
networking:
  api_gateway: public
```

Deploy to production:

```bash
$ cortex deploy cortex.yaml

creating https://example.com/text-generator

$ curl https://example.com/text-generator \
    -X POST -H "Content-Type: application/json" \
    -d '{"text": "deploy machine learning models to"}'

"deploy machine learning models to production"
```
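The same request can be made from Python. A sketch using only the standard library (the endpoint URL is the placeholder printed by the deploy step, not a real service):

```python
import json
from urllib import request


def build_request(url, text):
    """Build the POST request equivalent to the curl invocation."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        "https://example.com/text-generator",
        "deploy machine learning models to",
    )
    # Requires a live endpoint; prints the generated text as JSON.
    with request.urlopen(req) as resp:
        print(resp.read().decode())
```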

API management

  • Monitor API performance
  • Aggregate and stream logs
  • Customize prediction tracking
  • Update APIs without downtime

Manage your APIs:

```bash
$ cortex get

realtime api       status   replicas   last update   latency   requests
text-generator     live     34         9h            247ms     71828
object-detector    live     13         15h           23ms      828459

batch api          running jobs   last update
image-classifier   5              10h
```

Get started

```bash
$ pip install cortex
```

See the installation guide for next steps.
