Productionizing Machine Learning with a Microservices Architecture
Yaron Haviv, CTO, Iguazio
85% of AI Projects Never Make It to Production
Research Environment: • Manual extraction • In-memory analysis • Small-scale training • Manual evaluation
Production Pipeline (build from scratch with a large team): • Real-time ingestion • Preparation at scale • Train with many params & large data • Real-time events & data features
• ETL • Streaming • APIs • Sync
Because Model Development Is Just the First Step
Develop and Test Locally: weeks with one data scientist or developer
Package ─ • Dependencies • Parameters • Run scripts • Build
Scale-out ─ • Load-balance • Data partitions • Model distribution • AutoML
Tune ─ • Parallelism • GPU support • Query tuning • Caching
Instrument ─ • Monitoring • Logging • Versioning • Security
Automate ─ • CI/CD • Workflows • Rolling upgrades • A/B testing
Production: months with a large team of developers, scientists, data engineers and DevOps
What Is an Automated ML Pipeline?
• Ingest: ETL, Streaming, Logs, Scrapers, ..
• Prepare: Join, Aggregate, Split, ..
• Train: with hyper-params, multiple algorithms
• Validate: selected model with test data
• Deploy ++: test, deploy, monitor model & API servers
Data passed between steps: base features, train + test datasets, model, reports, metrics, RT features, feedback
End-to-end pipeline orchestration and tracking
Serverless: ML & Analytics Functions
Features/Data: Fast, Secure, Versioned
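The five stages above can be sketched as plain Python functions chained by a small driver. This is only an illustration of the data flow (base features, train + test datasets, model, report), not the platform's API: every function name, the synthetic data, and the single-threshold "model" are assumptions made for the example.

```python
import random


def ingest(n=200, seed=0):
    """Stand-in for ETL/streaming ingestion: returns (amount, label) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.random() < 0.3
        amount = rng.gauss(80.0 if label else 40.0, 10.0)
        rows.append((amount, int(label)))
    return rows


def prepare(rows, test_fraction=0.25):
    """Split the base features into train and test datasets."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def accuracy(threshold, rows):
    """Fraction of rows where 'amount > threshold' matches the label."""
    hits = sum((amount > threshold) == bool(label) for amount, label in rows)
    return hits / len(rows)


def train(train_rows, thresholds):
    """Try several hyper-parameter values and keep the most accurate one."""
    return max(thresholds, key=lambda t: accuracy(t, train_rows))


def validate(model, test_rows, min_accuracy=0.8):
    """Score the selected model on held-out test data and build a report."""
    score = accuracy(model, test_rows)
    return {"accuracy": score, "passed": score >= min_accuracy}


def deploy(model):
    """Wrap the model as a callable, standing in for an API server."""
    return lambda amount: int(amount > model)


def run_pipeline():
    rows = ingest()
    train_rows, test_rows = prepare(rows)
    model = train(train_rows, thresholds=[30, 40, 50, 60, 70, 80])
    report = validate(model, test_rows)
    serve = deploy(model) if report["passed"] else None
    return model, report, serve
```

Calling run_pipeline() returns the selected threshold, the validation report, and a serving callable (or None when validation fails). In a real deployment each stage would be a separately packaged function with tracked inputs and outputs.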
Modern Data-Science Platform Architecture
Pipeline Orchestration: • Auto ML • Experiment Tracking • Feature Store • Workflows (Kubeflow)
Serverless Automation: • Managed Functions and Services
Shared GPU/CPU Resources
Data layer: • Data lake or object store • Real-time data and DBaaS
Serverless Enables: Resource Elasticity, Automated Deployment and Operations

                 Serverless Today      Data Prep and Training
Task lifespan    Millisecs to mins     Secs to hours
Scaling          Load-balancer         Partition, shuffle, reduce, Hyper-params, RDD
State            Stateless             Stateful
Input            Event                 Params, Datasets

So why not use Serverless for training and data prep?
Time we extend Serverless to data science!
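The "Scaling" and "Input" rows for data prep and training can be illustrated with a small fan-out: instead of load-balancing events, the same task is run once per hyper-parameter set over a shared dataset and the results are reduced to the best run. This is a minimal sketch using the standard library thread pool as a stand-in for a cluster runtime. The names train_task and fan_out and the toy constant-prediction "model" are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def train_task(params, dataset):
    """Hypothetical training task: predict the constant params['c'] and
    report the mean squared error over the dataset."""
    c = params["c"]
    loss = sum((x - c) ** 2 for x in dataset) / len(dataset)
    return {"params": params, "loss": loss}


def fan_out(grid, dataset, workers=4):
    """Run one task per hyper-parameter set in parallel, then reduce the
    results to the run with the lowest loss."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: train_task(p, dataset), grid))
    return min(results, key=lambda r: r["loss"])
```

Unlike an event-driven function, each task here receives parameters and a dataset as input and may run for a long time, which is the gap the table points at.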
ML & Analytics Functions Architecture
ML Function (Inputs, Outputs):
• User Code OR ML service
• Runtime / SaaS (e.g. Spark, Dask, Horovod, Nuclio, ..)
Connected to: • Data / Feature stores • Secrets • Artifacts & Models • Ops • ML Pipeline
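One way to picture this architecture is user code that receives a context object from the runtime (carrying parameters and secrets) plus its inputs, and reports artifacts and metrics back through that context. The sketch below is illustrative only: RunContext, its methods, and user_code are hypothetical names and do not reproduce any specific library's API.

```python
from dataclasses import dataclass, field


@dataclass
class RunContext:
    """Hypothetical context that a runtime would hand to user code."""
    params: dict
    secrets: dict = field(default_factory=dict)
    artifacts: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)

    def log_artifact(self, key, value):
        # A real runtime would persist this to an artifact store.
        self.artifacts[key] = value

    def log_metric(self, key, value):
        # A real runtime would send this to experiment tracking.
        self.metrics[key] = value


def user_code(context, inputs):
    """User code: scale the input dataset and report the outputs."""
    scale = context.params.get("scale", 1.0)
    scaled = [x * scale for x in inputs["dataset"]]
    context.log_artifact("scaled_dataset", scaled)
    context.log_metric("rows", len(scaled))
    return context
```

Because the code only touches the context and its inputs, the same function could be run locally or handed to a different runtime without changes, which is the point of separating user code from the runtime in the diagram.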
Kubeflow + Serverless: Automated ML Pipelines
What is Kubeflow?
▪ Operators for ML frameworks (lifecycle management, scale-out, ..)
▪ Managed notebooks
▪ ML Pipeline Automation
▪ With Serverless, we automate the deployment, execution, scaling and monitoring of our code
Automating The Development & Tracking Workflow
• Write and test locally
• Specify runtime configuration (image, deps, cpu/gpu/mem, data, volumes, ..)
• Run/scale on the cluster
• Build (if needed)
• Document & Publish
• Run in a Pipeline
• Track experiments/runs, functions and data
• Use published functions
MLOps Automation: The CI/CD Way
• Write and test locally
• Specify runtime & pipeline config (image, deps, cpu/gpu/mem, data, volumes, .., steps, trigger)
• Build (if needed)
• Document & Publish
• Run in a Pipeline
• Track experiments/runs, functions and data
• Process pull request (automated)
• Feedback (comment)
Demo: https://github.com/mlrun/demo-github-actions
Fraud Prevention Case Study: Payoneer
• 4M global customers
• 200 countries and territories - streaming global commerce
• Understanding illicit patterns of behavior in real time based on 90 different parameters
• Proactively preventing money laundering before it occurs
Goal: move from fraud detection to prevention and cut time to production
Traditional Fraud-Detection Architecture (Hadoop)
• SQL Server (operational database) • ETL to the DWH every 30 min • Data warehouse (mirror table) • Offline processing (SQL) • Feature vector • Batch prediction using R Server
• 40 minutes to identify a suspicious money-laundering account
• 40 precious minutes (fraud is detected after the fact)
• Long and complex process to production
Moving To Real-Time Fraud Prevention
• SQL Server (operational database) • CDC (real-time) • Real-time ingestion • Online + offline feature store • Model training (sklearn) • Model inferencing (Nuclio) • Block account! • Queue • Analysis
• 12 seconds to detect and prevent fraud
• Automated dev to production using a serverless approach
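The real-time path (ingest an event, update online features, run inference, block the account) can be sketched in a few lines. This is a toy illustration, not Payoneer's system: the sliding-window features, the fixed-threshold rule standing in for the trained model, and all names and limits are assumptions for the example.

```python
from collections import defaultdict, deque


class OnlineFeatureStore:
    """Keeps per-account events from a sliding time window in memory."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.events = defaultdict(deque)

    def update(self, account, timestamp, amount):
        """Add an event, evict expired ones, and return fresh features."""
        q = self.events[account]
        q.append((timestamp, amount))
        while q and q[0][0] <= timestamp - self.window:
            q.popleft()
        return {"tx_count": len(q), "tx_total": sum(a for _, a in q)}


def infer(features, max_count=3, max_total=10_000.0):
    """Hypothetical rule standing in for a trained model."""
    return features["tx_count"] > max_count or features["tx_total"] > max_total


def handle_event(store, event, blocked):
    """Real-time path: update features, score them, block if suspicious."""
    features = store.update(event["account"], event["ts"], event["amount"])
    if infer(features):
        blocked.add(event["account"])
    return features
```

Because the features are updated on every incoming event, the decision to block is made while the activity is happening rather than after a batch ETL cycle.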
Models Require Continuous Monitoring And Updates
MLOps lifecycle with drift detection:
• Automated data-prep and training
• Automated model deployment
• Real-time model & drift monitoring
• Periodic drift analysis
• Automated remediation
• Retrain, ensembles, …
Diagram components: • Training • Batch (Parquet) • Reference data • Serving • Tracking stream • Real-Time Model Monitoring • TSDB • Model Analysis • Requests • Serverless Drift Detection • Fix
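A periodic drift analysis compares the distribution of a feature in the reference (training) data against what the serving side has recently seen. The slides do not name a specific metric. The sketch below uses the Population Stability Index (PSI) as one common choice, and the 0.2 alert threshold is a rule-of-thumb assumption rather than a value from the talk. It also assumes the reference data is not constant, so the bin width is non-zero.

```python
import math


def bin_fractions(values, lo, hi, bins):
    """Share of values per equal-width bin; out-of-range values are
    clamped into the first or last bin."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1
    return [c / len(values) for c in counts]


def psi(reference, serving, bins=10, eps=1e-6):
    """Population Stability Index between reference and serving samples.
    Bin edges come from the reference data; eps avoids log(0)."""
    lo, hi = min(reference), max(reference)
    ref = bin_fractions(reference, lo, hi, bins)
    cur = bin_fractions(serving, lo, hi, bins)
    total = 0.0
    for r, s in zip(ref, cur):
        r, s = max(r, eps), max(s, eps)
        total += (s - r) * math.log(s / r)
    return total


def drift_detected(reference, serving, threshold=0.2):
    """Flag drift when the PSI exceeds the (assumed) threshold."""
    return psi(reference, serving) > threshold
```

A serverless drift-detection function could run a check like this on each batch from the tracking stream and trigger remediation, such as retraining, when drift_detected returns True.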
Demo !