Productionizing Machine Learning with a Microservices Architecture

Yaron Haviv, CTO, Iguazio

85% of AI Projects Never Make It to Production
• Research environment: manual extraction, in-memory analysis, small-scale training, manual evaluation
• Production pipeline (build from scratch with a large team): real-time ingestion, preparation at scale, training with many params & large data, real-time events & data features
• Data paths shown in the diagram: ETL, streaming, APIs, sync

Because Model Development Is Just the First Step
Develop and test locally: weeks with one data scientist or developer.
Getting to production: months with a large team of developers, scientists, data engineers and DevOps.
• Package: dependencies, parameters, run scripts, build
• Scale-out: load-balance, data partitions, model distribution, AutoML
• Tune: parallelism, GPU support, query tuning, caching
• Instrument: monitoring, logging, versioning, security
• Automate: CI/CD, workflows, rolling upgrades, A/B testing

What Is an Automated ML Pipeline?
• Ingest: ETL, streaming, logs, scrapers, ..
• Prepare: join, aggregate, split, ..
• Train: with hyper-params, multiple algorithms
• Validate: selected model with test data
• Deploy ++: test, deploy, monitor model & API servers
End-to-end pipeline orchestration and tracking
Serverless: ML & analytics functions
Features/Data: fast, secure, versioned
Artifacts passed between stages: base features, train + test datasets, model, reports, metrics, real-time (RT) features, feedback
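The stages above can be sketched as plain functions that share one tracked run object. This is a minimal illustration and not the speaker's implementation: the `Run` class, the toy dataset, and the threshold "model" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Run:
    """Hypothetical tracker for the artifacts and metrics of one pipeline run."""
    artifacts: dict[str, Any] = field(default_factory=dict)
    metrics: dict[str, float] = field(default_factory=dict)


def accuracy(threshold: int, rows: list) -> float:
    # Fraction of rows where "x > threshold" matches the label.
    return sum(int(x > threshold) == y for x, y in rows) / len(rows)


def ingest(run: Run) -> None:
    # Stand-in for ETL / streaming / logs: toy (value, label) rows.
    run.artifacts["raw"] = [(x, int(x > 50)) for x in range(0, 100, 5)]


def prepare(run: Run) -> None:
    # Split into train and test datasets (every 4th row is held out).
    raw = run.artifacts["raw"]
    run.artifacts["test"] = raw[::4]
    run.artifacts["train"] = [r for i, r in enumerate(raw) if i % 4 != 0]


def train(run: Run) -> None:
    # "Training" over hyper-params: keep the threshold with the best train accuracy.
    candidates = range(0, 100, 10)
    run.artifacts["model"] = max(
        candidates, key=lambda t: accuracy(t, run.artifacts["train"])
    )


def validate(run: Run) -> None:
    # Evaluate the selected model on the held-out test data.
    run.metrics["test_accuracy"] = accuracy(
        run.artifacts["model"], run.artifacts["test"]
    )


def deploy(run: Run) -> None:
    # Promote the model only if validation passed (0.9 is an assumed bar).
    run.artifacts["deployed"] = run.metrics["test_accuracy"] >= 0.9


STEPS = [ingest, prepare, train, validate, deploy]


def run_pipeline() -> Run:
    run = Run()
    for step in STEPS:
        step(run)
    return run


result = run_pipeline()
```

Keeping every step's output on the run object is what makes end-to-end tracking possible: each artifact and metric can be inspected after the fact.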

Modern Data-Science Platform Architecture
• Pipeline orchestration: AutoML, experiment tracking, feature store, workflows (Kubeflow)
• Serverless automation: managed functions and services
• Shared GPU/CPU resources
• Data layer: data lake or object store, real-time data and DBaaS

Serverless Enables: Resource Elasticity, Automated Deployment and Operations
                 Serverless Today       Data Prep and Training
Task lifespan    Millisecs to mins      Secs to hours
Scaling          Load-balancer          Partition, shuffle, reduce, hyper-params, RDD
State            Stateless              Stateful
Input            Event                  Params, datasets
So why not use serverless for training and data prep?
Time we extend serverless to data science!
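One way to read the "Scaling" row is that a training job fans out over hyper-parameter combinations, each of which can run as an independent task taking params as input. The sketch below illustrates that idea only: a local thread pool stands in for a serverless runtime, and `train_task` with its loss formula is a hypothetical placeholder for a real training run.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def train_task(params: dict) -> dict:
    """One independent task: takes params in, returns a result record."""
    # Hypothetical loss surface standing in for a real training run.
    loss = (params["lr"] - 0.1) ** 2 + 0.01 * (params["depth"] - 4) ** 2
    return {"params": params, "loss": loss}


def grid(space: dict) -> list[dict]:
    """Expand {"name": [values, ...]} into every parameter combination."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*space.values())]


def run_grid(space: dict, workers: int = 4) -> dict:
    """Run all combinations in parallel and return the lowest-loss result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(train_task, grid(space)))
    return min(results, key=lambda r: r["loss"])


best = run_grid({"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]})
```

Because each task depends only on its params, the same code could scale out by swapping the pool for remote workers, which is the elasticity the table contrasts with event-driven serverless.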

ML & Analytics Functions Architecture
• ML function: user code or ML service, running on a runtime / SaaS (e.g. Spark, Dask, Horovod, Nuclio, ..)
• Inputs and outputs: connect the function to the ML pipeline
• Attached resources: data / feature stores, secrets, artifacts & models, ops
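A rough sketch of the pattern on this slide: user code receives its params, inputs, and secrets from a context object and writes results and artifacts back to it, so the runtime can handle everything else. `FunctionContext` and its methods are hypothetical names invented for this example and do not reproduce any specific library's API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FunctionContext:
    """Hypothetical context a runtime could hand to user code."""
    params: dict[str, Any] = field(default_factory=dict)
    inputs: dict[str, Any] = field(default_factory=dict)
    secrets: dict[str, str] = field(default_factory=dict)
    results: dict[str, float] = field(default_factory=dict)
    artifacts: dict[str, Any] = field(default_factory=dict)

    def log_result(self, key: str, value: float) -> None:
        # Record a scalar output (for example a metric) for tracking.
        self.results[key] = value

    def log_artifact(self, key: str, obj: Any) -> None:
        # Record a larger output (for example a dataset or model).
        self.artifacts[key] = obj


def summarize(ctx: FunctionContext) -> None:
    """User code: reads inputs and params from ctx, writes outputs back."""
    values = ctx.inputs["dataset"]
    cap = ctx.params.get("cap", float("inf"))
    clipped = [min(v, cap) for v in values]
    ctx.log_result("mean", sum(clipped) / len(clipped))
    ctx.log_artifact("clipped", clipped)


ctx = FunctionContext(params={"cap": 10}, inputs={"dataset": [2, 4, 30]})
summarize(ctx)
```

The user function never touches storage or credentials directly, which is what lets the same code run locally or as a pipeline step.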

Kubeflow + Serverless: Automated ML Pipelines
What is Kubeflow?
▪ Operators for ML frameworks (lifecycle management, scale-out, ..)
▪ Managed notebooks
▪ ML pipeline automation
With serverless, we automate the deployment, execution, scaling and monitoring of our code.

Automating the Development & Tracking Workflow
• Write and test locally
• Specify runtime configuration: image, deps, cpu/gpu/mem, data, volumes, ..
• Run/scale on the cluster
• Build (if needed)
• Document & publish
• Run in a pipeline
• Track experiments/runs, functions and data
• Use published functions

MLOps Automation: The CI/CD Way
• Write and test locally
• Specify runtime & pipeline config: image, deps, cpu/gpu/mem, data, volumes, .., steps, trigger
• Build (if needed)
• Document & publish
• Run in a pipeline
• Track experiments/runs, functions and data
• Process pull request (automated)
• Feedback (comment)
Demo: https://github.com/mlrun/demo-github-actions

Fraud Prevention Case Study: Payoneer
• 4M global customers
• 200 countries and territories, streaming global commerce
• Understanding illicit patterns of behavior in real time based on 90 different parameters
• Proactively preventing money laundering before it occurs
Goal: move from fraud detection to prevention and cut time to production

Traditional Fraud-Detection Architecture (Hadoop)
Flow: SQL Server operational database → ETL to the DWH every 30 min → data warehouse → mirror table → offline processing (SQL) → feature vector → batch prediction using R Server
• 40 minutes to identify a suspicious money-laundering account: 40 precious minutes (fraud is detected after the fact)
• Long and complex process to production

Moving to Real-Time Fraud Prevention
Flow: SQL Server operational database → CDC (real-time) → real-time ingestion → online + offline feature store → model training (sklearn) → model inferencing (Nuclio) → block account!
Also in the diagram: queue, analysis
• 12 seconds to detect and prevent fraud
• Automated dev to production using a serverless approach
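The real-time path can be sketched as an event handler that updates per-account features and scores each transaction as it arrives. Everything specific here is an assumption for illustration: the in-memory dictionary stands in for an online feature store, the window size and the rule in `score` are placeholders for a trained model, and the event fields are invented. The `handler(context, event)` shape follows Nuclio's Python handlers, with a local `Event` stand-in so the sketch runs without Nuclio.

```python
import json
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # assumed sliding-window length

# In-memory stand-in for an online feature store: account -> recent (ts, amount).
recent_txns = defaultdict(deque)


def update_features(account: str, amount: float, ts: float) -> dict:
    """Add the transaction and return features over the sliding window."""
    window = recent_txns[account]
    window.append((ts, amount))
    # Evict transactions that fell out of the window.
    while window and window[0][0] < ts - WINDOW_SECONDS:
        window.popleft()
    return {"count": len(window), "total": sum(a for _, a in window)}


def score(features: dict) -> bool:
    """Placeholder rule standing in for a trained model's prediction."""
    return features["count"] >= 3 and features["total"] > 10_000


class Event:
    """Local stand-in for the event object a serverless runtime would pass."""

    def __init__(self, body: str):
        self.body = body


def handler(context, event):
    """Score one transaction event and decide whether to block the account."""
    txn = json.loads(event.body)
    features = update_features(txn["account"], txn["amount"], txn["ts"])
    action = "block" if score(features) else "allow"
    return {"account": txn["account"], "action": action, "features": features}
```

With these placeholder settings, three 5,000 transactions on one account within a minute return `"allow"`, `"allow"`, then `"block"`; the decision is made per event instead of after a batch ETL cycle.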

Models Require Continuous Monitoring and Updates
MLOps lifecycle with drift detection:
• Automated data-prep and training
• Automated model deployment
• Real-time model & drift monitoring
• Periodic drift analysis
• Automated remediation
• Retrain, ensembles, …
Diagram components: training, batch (Parquet) reference data, serving, requests, tracking stream, real-time model monitoring, TSDB, model analysis, serverless drift detection, fix
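Periodic drift analysis compares the reference (training) data with what the model sees in serving. The slides do not name a drift metric; the sketch below uses the Population Stability Index (PSI) as one common choice. The bin count, the `eps` smoothing term, and the retrain threshold are assumptions for illustration.

```python
import math


def bin_fractions(values, lo, width, bins):
    """Fraction of values falling in each of `bins` equal-width bins."""
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def psi(reference, serving, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature."""
    lo = min(min(reference), min(serving))
    hi = max(max(reference), max(serving))
    if hi == lo:
        return 0.0  # both samples are a single constant value
    width = (hi - lo) / bins
    ref = bin_fractions(reference, lo, width, bins)
    cur = bin_fractions(serving, lo, width, bins)
    # eps keeps the log finite when a bin is empty in one sample.
    return sum(
        (c - r) * math.log((c + eps) / (r + eps)) for r, c in zip(ref, cur)
    )


def needs_retrain(reference, serving, threshold=0.25):
    """Flag drift for remediation; the threshold is an assumed rule of thumb."""
    return psi(reference, serving) > threshold


reference = list(range(100))
shifted = [v + 50 for v in reference]
```

Here `needs_retrain(reference, reference)` is false and `needs_retrain(reference, shifted)` is true; in a setup like the one on the slide, such a flag would be what triggers the automated remediation step.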

Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.
