Machine Learning for Developers
Danilo Poccia, Technical Evangelist, @danilop danilop
©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon.com 1994
Three types of data-driven development
  - Retrospective analysis and reporting: Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR
  - Here-and-now real-time processing and dashboards: Amazon Kinesis, Amazon EC2, AWS Lambda
  - Predictions to enable smart applications
Machine learning and smart applications
  Machine learning is the technology that automatically finds patterns in your data and uses them to make predictions for new data points as they become available.
  Your data + machine learning = smart applications
Smart applications by example
  - Based on what you know about the user: Will they use your product?
  - Based on what you know about an order: Is this order fraudulent?
  - Based on what you know about a news article: What other articles are interesting?
And a few more examples…
  - Fraud detection: detecting fraudulent transactions, filtering spam emails, flagging suspicious reviews, …
  - Personalization: recommending content, predictive content loading, improving user experience, …
  - Targeted marketing: matching customers and offers, choosing marketing campaigns, cross-selling and up-selling, …
  - Content classification: categorizing documents, matching hiring managers and resumes, …
  - Churn prediction: finding customers who are likely to stop using the service, free-tier upgrade targeting, …
  - Customer support: predictive routing of customer emails, social media listening, …
Why aren't there more smart applications?
  1. Machine learning expertise is rare
  2. Building and scaling machine learning technology is hard
  3. Closing the gap between models and applications is time-consuming and expensive
2001
Decentralized, two-pizza teams
  - Agility, autonomy, accountability, and ownership
  - "DevOps"
  - "Microservices"
Introducing Amazon ML
  - Easy to use, managed machine learning service built for developers
  - Robust, powerful machine learning technology based on Amazon's internal systems
  - Create models using your data already stored in the AWS cloud
  - Deploy models to production in seconds
Easy to use and developer-friendly
  - Use the intuitive, powerful service console to build and explore your initial models
    - Data retrieval
    - Model training, quality evaluation, fine-tuning
    - Deployment and management
  - Automate model lifecycle with fully featured APIs and SDKs
    - Java, Python, .NET, JavaScript, Ruby, PHP
  - Easily create smart iOS and Android applications with AWS Mobile SDK
Powerful machine learning technology
  - Based on Amazon's battle-hardened internal systems
  - Not just the algorithms:
    - Smart data transformations
    - Input data and model quality alerts
    - Built-in industry best practices
  - Grows with your needs
    - Train on up to 100 GB of data
    - Generate billions of predictions
    - Obtain predictions in batches or real-time
Integrated with AWS data ecosystem
  - Access data that is stored in S3, Amazon Redshift, or MySQL databases in RDS
  - Output predictions to S3 for easy integration with your data flows
  - Use AWS Identity and Access Management (IAM) for fine-grained data-access permission policies
Fully managed model and prediction services
  - End-to-end service, with no servers to provision and manage
  - One-click production model deployment
  - Programmatically query model metadata to enable automatic retraining workflows
  - Monitor prediction usage patterns with Amazon CloudWatch metrics
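The automatic retraining workflow mentioned above could be driven by a small decision function. This is only a sketch: it assumes the model metadata is a dictionary with a `Status` string and a `CreatedAt` epoch timestamp, and both the `should_retrain` helper and the 30-day threshold are hypothetical, not part of the service.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def should_retrain(metadata, max_age_days=30, now=None):
    """Return True when a model has failed, was deleted, or is too old.

    `metadata` is assumed to carry 'Status' and 'CreatedAt' (epoch seconds).
    """
    now = time.time() if now is None else now
    status = metadata.get('Status')
    if status in ('FAILED', 'DELETED'):
        return True
    if status != 'COMPLETED':
        # Still pending or training: wait rather than start another run.
        return False
    age_days = (now - metadata['CreatedAt']) / SECONDS_PER_DAY
    return age_days > max_age_days
```

A scheduled job could call this with the metadata it fetched for the current model and start a new training run only when it returns True.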
Building smart applications with Amazon ML
  1. Train model
     - Create a Datasource object pointing to your data
     - Explore and understand your data
     - Transform data and train your model
  2. Evaluate and optimize
  3. Retrieve predictions
Create a Datasource object
  >>> import boto
  >>> ml = boto.connect_machinelearning()
  >>> # Point the Datasource at the input data and its schema file in S3
  >>> ds = ml.create_data_source_from_s3(
  ...     data_source_id='my_datasource',
  ...     data_spec={
  ...         'DataLocationS3': 's3://bucket/input/',
  ...         'DataSchemaLocationS3': 's3://bucket/input/.schema'},
  ...     compute_statistics=True)
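The `DataSchemaLocationS3` entry above points at a schema file that describes the columns of the input data. As a rough sketch of what producing such a file might look like: the field names below follow the Amazon ML schema layout as best understood here and should be checked against the service documentation, and the attribute names (`traffic_level`, `rain_mm`, and so on) are purely hypothetical.

```python
import json


def build_schema(target, attributes, has_header=True):
    """Build a schema dictionary for a CSV input file.

    `attributes` is a list of (name, type) pairs; the type strings used
    below (NUMERIC, CATEGORICAL, BINARY) are assumed to be accepted values.
    """
    return {
        'version': '1.0',
        'targetAttributeName': target,
        'dataFormat': 'CSV',
        'dataFileContainsHeader': has_header,
        'attributes': [
            {'attributeName': name, 'attributeType': kind}
            for name, kind in attributes
        ],
    }


# Hypothetical columns for a regression target.
schema = build_schema(
    target='traffic_level',
    attributes=[
        ('day_of_week', 'CATEGORICAL'),
        ('is_holiday', 'BINARY'),
        ('rain_mm', 'NUMERIC'),
        ('traffic_level', 'NUMERIC'),
    ],
)
schema_json = json.dumps(schema, indent=2)
```

The resulting `schema_json` string is what would be uploaded next to the data, at the location named in `DataSchemaLocationS3`.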
Explore and understand your data
Train your model
  >>> import boto
  >>> ml = boto.connect_machinelearning()
  >>> # Train a regression model on the Datasource created earlier
  >>> model = ml.create_ml_model(
  ...     ml_model_id='my_model',
  ...     ml_model_type='REGRESSION',
  ...     training_data_source_id='my_datasource')
Building smart applications with Amazon ML
  1. Train model
  2. Evaluate and optimize
     - Understand model quality
     - Adjust model interpretation
  3. Retrieve predictions
Explore model quality
Fine-tune model interpretation
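This slide carries only a title. One plausible reading, assumed here, is that fine-tuning the interpretation of a binary classification model means moving the score cut-off above which a record counts as positive. The sketch below uses made-up scores and labels to show how precision and recall shift as the threshold moves; it does not call the service.

```python
def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) when scores >= threshold are positive."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative data only: model scores and the true 0/1 labels.
scores = [0.95, 0.80, 0.60, 0.40, 0.20, 0.05]
labels = [1, 1, 0, 1, 0, 0]

strict = precision_recall(scores, labels, threshold=0.7)
lenient = precision_recall(scores, labels, threshold=0.3)
```

With this data the stricter threshold gives higher precision and lower recall than the lenient one, which is the trade-off to weigh for the application at hand.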
Building smart applications with Amazon ML
  1. Train model
  2. Evaluate and optimize
  3. Retrieve predictions
     - Batch predictions
     - Real-time predictions
Batch predictions
  - Asynchronous, large-volume prediction generation
  - Request through service console or API
  - Best for applications that deal with batches of data records
  >>> import boto
  >>> ml = boto.connect_machinelearning()
  >>> # Score every record in the Datasource and write the results to S3
  >>> batch_prediction = ml.create_batch_prediction(
  ...     batch_prediction_id='my_batch_prediction',
  ...     batch_prediction_data_source_id='my_datasource',
  ...     ml_model_id='my_model',
  ...     output_uri='s3://examplebucket/output/')
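Because batch predictions are asynchronous, the caller has to wait for the job to finish before reading results from S3. A minimal polling sketch follows; `wait_for_completion` is a hypothetical helper, the status strings are assumed to match what the service reports, and the status lookup is passed in as a callable so the loop itself makes no service calls.

```python
import time

# Assumed terminal status values for a batch prediction job.
TERMINAL_STATES = ('COMPLETED', 'FAILED', 'DELETED')


def wait_for_completion(get_status, poll_seconds=30, max_polls=120,
                        sleep=time.sleep):
    """Call get_status() until it returns a terminal state, then return it."""
    status = None
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise RuntimeError('gave up waiting; last status was %r' % (status,))
```

With boto, `get_status` could be something like `lambda: ml.get_batch_prediction('my_batch_prediction')['Status']`, assuming the response carries a `Status` key.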
Real-time predictions
  - Synchronous, low-latency, high-throughput prediction generation
  - Request through service API or server or mobile SDKs
  - Best for interactive applications that deal with individual data records
  >>> import boto
  >>> ml = boto.connect_machinelearning()
  >>> # Send a single record to the model's real-time endpoint
  >>> ml.predict(
  ...     ml_model_id='my_model',
  ...     predict_endpoint='example_endpoint',
  ...     record={'key1': 'value1', 'key2': 'value2'})
  {
      'Prediction': {
          'predictedValue': 13.284348,
          'details': {
              'Algorithm': 'SGD',
              'PredictiveModelType': 'REGRESSION'
          }
      }
  }
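The record in the example above maps string keys to string values, so application data usually needs a small conversion step before the call, and the response needs unpacking afterwards. A sketch of both, with hypothetical helper names (`to_record`, `predicted_value`); the choice of `'1'`/`'0'` for booleans and an empty string for missing values is an assumption, not a documented requirement.

```python
def to_record(features):
    """Convert a dict of feature values into a string-to-string record."""
    record = {}
    for name, value in features.items():
        if value is None:
            record[str(name)] = ''          # assumed encoding for "missing"
        elif isinstance(value, bool):
            record[str(name)] = '1' if value else '0'
        else:
            record[str(name)] = str(value)
    return record


def predicted_value(response):
    """Pull the numeric prediction out of a regression response."""
    return response['Prediction']['predictedValue']


# The response shape shown on the slide.
example_response = {
    'Prediction': {
        'predictedValue': 13.284348,
        'details': {'Algorithm': 'SGD', 'PredictiveModelType': 'REGRESSION'},
    }
}
value = predicted_value(example_response)
```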
<demo> … </demo>
Architecture patterns for smart applications
Batch predictions with EMR
  1. Raw data in S3
  2. Process data with EMR
  3. Aggregated data in S3
  4. Query for predictions with Amazon ML batch API
  5. Predictions in S3
  6. Your application
Batch predictions with Amazon Redshift
  1. Structured data in Amazon Redshift
  2. Query for predictions with Amazon ML batch API
  3. Predictions in S3
  4. Load predictions into Amazon Redshift, or read prediction results directly from S3
  5. Your application
Real-time predictions for interactive applications
  - Your application
  - Query for predictions with Amazon ML real-time API
Adding predictions to an existing data flow
  - Your application
  - Amazon DynamoDB
  - Trigger event with Lambda
  - Query for predictions with Amazon ML real-time API
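The DynamoDB, Lambda, and real-time API pattern above might look roughly like the following Lambda handler. This is a sketch under several assumptions: the table's stream is configured to include the new item image, only string and number attributes matter, and the prediction call is injected as a plain function (`predict`, hypothetical) so the handler logic can be exercised without any AWS access.

```python
def unwrap(image):
    """Flatten a DynamoDB-typed item such as {'rain_mm': {'N': '2.5'}}."""
    record = {}
    for name, typed_value in image.items():
        # Each typed value is a one-entry dict like {'S': 'rain'} or {'N': '2.5'}.
        for type_code, value in typed_value.items():
            if type_code in ('S', 'N'):
                record[name] = value
    return record


def make_handler(predict):
    """Build a Lambda handler that scores every inserted or modified item."""
    def handler(event, context):
        results = []
        for stream_record in event.get('Records', []):
            if stream_record.get('eventName') not in ('INSERT', 'MODIFY'):
                continue  # ignore deletions
            image = stream_record['dynamodb'].get('NewImage', {})
            results.append(predict(unwrap(image)))
        return results
    return handler
```

In a deployed function, `predict` would wrap the real-time prediction call shown earlier; in a test it can be any function that accepts a record dictionary.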
Where is the Craftsmanship?
Urban Traffic Model
  - Date
  - Day of the week
  - Is it a holiday?
  - Weather
  - Rain (mm)
  - Temperature
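Some of the inputs listed above are not raw measurements but values derived from the date, which is where the craftsmanship comes in. A small sketch of that derivation follows; the `traffic_features` helper, the key names, and the holiday set are all hypothetical.

```python
from datetime import date

# Fixed English names so the output does not depend on the system locale.
DAY_NAMES = ('Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday')


def traffic_features(day, holidays, weather, rain_mm, temperature):
    """Assemble one input row for the traffic model from raw observations."""
    return {
        'date': day.isoformat(),
        'day_of_week': DAY_NAMES[day.weekday()],  # weekday(): Monday is 0
        'is_holiday': day in holidays,
        'weather': weather,
        'rain_mm': rain_mm,
        'temperature': temperature,
    }


# Illustrative values only.
holidays = {date(2015, 12, 25)}
row = traffic_features(date(2015, 12, 25), holidays, 'rain', 3.0, 4.5)
```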
75% of users select movies based on recommendations
Abraham Wald
Pay-as-you-go and inexpensive
  - Data analysis, model training, and evaluation: $0.42 per instance hour
  - Batch predictions: $0.10 per 1,000 predictions
  - Real-time predictions: $0.10 per 1,000 predictions, plus an hourly capacity reservation charge
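The prices above translate into a quick back-of-the-envelope estimate. The sketch below uses only the figures from the slide; the hourly capacity reservation rate is not stated there, so it is a required argument rather than a default, and the function names are hypothetical.

```python
TRAINING_PRICE_PER_INSTANCE_HOUR = 0.42   # from the slide
PREDICTION_PRICE_PER_1000 = 0.10          # from the slide, batch and real-time


def training_cost(instance_hours):
    """Cost of data analysis, model training, and evaluation."""
    return instance_hours * TRAINING_PRICE_PER_INSTANCE_HOUR


def batch_cost(predictions):
    """Cost of a batch prediction job."""
    return predictions / 1000.0 * PREDICTION_PRICE_PER_1000


def realtime_cost(predictions, reserved_hours, hourly_reservation_rate):
    """Cost of real-time predictions plus the capacity reservation.

    hourly_reservation_rate is not given on the slide; supply the actual rate.
    """
    per_prediction = predictions / 1000.0 * PREDICTION_PRICE_PER_1000
    return per_prediction + reserved_hours * hourly_reservation_rate
```

For example, one million batch predictions come to 1,000,000 / 1,000 × $0.10 = $100, and ten instance hours of training come to $4.20.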
