© 2019 Tryolabs An Introduction to Machine Learning and How to Teach Machines to See Facundo Parodi Tryolabs May 2019

Who are we?

Outline
• Machine Learning
  • Types of Machine Learning Problems
  • Steps to solve a Machine Learning Problem
• Deep Learning
  • Artificial Neural Networks
• Image Classification
  • Convolutional Neural Networks

What is a Cat?
• Occlusion
• Diversity
• Deformation
• Lighting variations

Introduction to Machine Learning

What is Machine Learning?
• The subfield of computer science that “gives computers the ability to learn without being explicitly programmed”. (Arthur Samuel, 1959)
• “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” (Tom Mitchell, 1997)
• Using data for answering questions: Training, then Predicting

The Big Data Era (Data, Devices, Services)
• Data already available everywhere
• Low storage costs: everyone has several GBs for “free”
• Hardware more powerful and cheaper than ever before
• Everyone has a computer fully packed with sensors:
  • GPS
  • Cameras
  • Microphones
• Permanently connected to the Internet
• Cloud Computing:
  • Online storage
  • Infrastructure as a Service
• User applications:
  • YouTube
  • Gmail
  • Facebook
  • Twitter

Types of Machine Learning Problems: Supervised, Unsupervised, Reinforcement

Supervised
• Learn from examples for which we know the desired output (what we want to predict).
• Is this a cat or a dog?
• Are these emails spam or not?
• Predict the market value of houses, given the square meters, number of rooms, neighborhood, etc.
• Classification: output is a discrete variable (e.g., cat/dog)
• Regression: output is continuous (e.g., price, temperature)

Unsupervised
• There is no desired output. Learn something about the data: latent relationships.
• I have photos and want to put them in 20 groups.
• I want to find anomalies in the credit card usage patterns of my customers.
• Useful for learning structure in the data (clustering), finding hidden correlations, reducing dimensionality, etc.
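
The clustering idea mentioned above can be sketched with k-means, one of the unsupervised algorithms named later in the deck. This is a minimal illustration, not a production implementation: the toy points are made up, and the initial centers are simply assumed to be the first k points.

```python
# Minimal k-means sketch (illustrative only).
# Assumption: the initial centers are the first k points of the dataset.
def kmeans(points, k, iterations=10):
    centers = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centers[i] = [sum(dim) / len(cluster) for dim in zip(*cluster)]
    return centers, clusters

# Hypothetical 2-D data with two visible groups; no labels are given.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
centers, clusters = kmeans(points, k=2)
```

No desired output is supplied; the groups emerge from the distances between points alone.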

Reinforcement
• An agent interacts with an environment and watches the result of the interaction.
• The environment gives feedback via a positive or negative reward signal.

Steps to Solve a Machine Learning Problem
1. Data Gathering: collect data from various sources
2. Data Preprocessing: clean data to have homogeneity
3. Feature Engineering: making your data more useful
4. Algorithm Selection & Training: selecting the right machine learning model
5. Making Predictions: evaluate the model

Data Gathering
• Might depend on human work
  • Manual labeling for supervised learning.
  • Domain knowledge. Maybe even experts.
• May come for free, or “sort of”
  • E.g., Machine Translation.
• The more the better: some algorithms need large amounts of data to be useful (e.g., neural networks).
• The quantity and quality of data dictate the model accuracy

Data Preprocessing
• Is there anything wrong with the data?
  • Missing values
  • Outliers
  • Bad encoding (for text)
  • Wrongly-labeled examples
  • Biased data
    • Do I have many more samples of one class than the rest?
• Need to fix/remove data?
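
Two of the checks listed above, missing values and class imbalance, can be sketched in a few lines. The dataset here is hypothetical, and dropping incomplete rows is only one possible fix (imputing a value is another).

```python
from collections import Counter

# Hypothetical toy dataset: (feature value, label); None marks a missing value.
samples = [(0.7, "cat"), (None, "cat"), (0.2, "dog"), (0.9, "cat"), (0.8, "cat")]

# How many samples have a missing feature value?
missing = sum(1 for value, _ in samples if value is None)

# Do I have many more samples of one class than the rest?
class_counts = Counter(label for _, label in samples)

# One possible fix (an assumption, not the only option): drop incomplete rows.
clean = [(value, label) for value, label in samples if value is not None]
```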

Feature Engineering
• What is a feature? A feature is an individual measurable property of a phenomenon being observed.
• Our inputs are represented by a set of features.
• To classify spam email, features could be:
  • Number of words that have been ch4ng3d like this.
  • Language of the email (0=English, 1=Spanish)
  • Number of emojis
• Example: “Buy ch34p drugs from the ph4rm4cy now :) :) :)” ➔ feature engineering ➔ (2, 0, 3)
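
The spam example above can be reproduced with a small feature extractor. This is a sketch under stated assumptions: the function name `extract_features` is made up, a "ch4ng3d" word is taken to be any word mixing letters and digits, only the ":)" emoticon is counted, and the language code is passed in rather than detected.

```python
import re

def extract_features(text, language_code):
    """Turn an email text into the (changed words, language, emojis) tuple."""
    words = text.split()
    # Assumed heuristic: a word was "ch4ng3d" if it mixes letters and digits.
    changed = sum(
        1 for w in words
        if re.search(r"[A-Za-z]", w) and re.search(r"\d", w)
    )
    # Only the ":)" emoticon is counted in this sketch.
    emojis = text.count(":)")
    return (changed, language_code, emojis)

# language_code follows the slide's encoding: 0=English, 1=Spanish.
features = extract_features("Buy ch34p drugs from the ph4rm4cy now :) :) :)", language_code=0)
# features == (2, 0, 3)
```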

Feature Engineering
• Extract more information from existing data, not adding “new” data per se
  • Making it more useful
  • With good features, most algorithms can learn faster
• It can be an art
  • Requires thought and knowledge of the data
• Two steps:
  • Variable transformation (e.g., dates into weekdays, normalizing)
  • Feature creation (e.g., n-grams for texts, whether a word is capitalized to detect names, etc.)
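
The two steps above can be illustrated with a few small helpers. All function names and sample values are hypothetical; min-max scaling is just one common way of normalizing.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Variable transformation: a date becomes its weekday.
def to_weekday(d):
    return WEEKDAYS[d.weekday()]

# Variable transformation: min-max normalization to the range [0, 1].
def normalize(values):
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# Feature creation: word n-grams of a text.
def ngrams(text, n=2):
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Feature creation: is the word capitalized (a hint that it may be a name)?
def is_capitalized(word):
    return word[:1].isupper()

weekday = to_weekday(date(2019, 5, 20))
scaled = normalize([10, 20, 30])
bigrams = ngrams("buy cheap drugs now")
```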

Algorithm Selection & Training
• Supervised
  • Linear classifier
  • Naive Bayes
  • Support Vector Machines (SVM)
  • Decision Tree
  • Random Forests
  • k-Nearest Neighbors
  • Neural Networks (Deep learning)
• Unsupervised
  • PCA
  • t-SNE
  • k-means
  • DBSCAN
• Reinforcement
  • SARSA–λ
  • Q-Learning

Algorithm Selection & Training
• Goal of training: making the correct prediction as often as possible
• Incremental improvement: predict, adjust, repeat
• Use of metrics for evaluating performance and comparing solutions
• Hyperparameter tuning: more an art than a science
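
As an illustration of using a metric to compare solutions, here is accuracy, the fraction of correct predictions. The slide does not name a specific metric, so accuracy is an assumption, and the labels and model outputs below are invented.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the known labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

# Hypothetical ground truth and the outputs of two candidate models.
labels = ["cat", "dog", "cat", "cat"]
model_a = ["cat", "dog", "dog", "cat"]
model_b = ["dog", "dog", "dog", "cat"]

score_a = accuracy(model_a, labels)
score_b = accuracy(model_b, labels)
```

With a single number per model, two solutions can be compared directly; here model A is right more often than model B.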

Making Predictions
• Training Phase: Samples ➔ Feature extraction ➔ Features (with Labels) ➔ Machine Learning model
• Prediction Phase: Input ➔ Feature extraction ➔ Features ➔ Trained classifier ➔ Label
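
The two phases can be sketched with a 1-nearest-neighbor classifier (k-Nearest Neighbors with k=1), chosen here only because it is short. The feature vectors reuse the spam encoding from earlier; the training data and the "spam"/"ham" labels are hypothetical.

```python
def train(features, labels):
    # Training phase: for 1-nearest-neighbor, "training" just stores the examples.
    return list(zip(features, labels))

def predict(model, x):
    # Prediction phase: return the label of the closest stored example.
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = min(model, key=lambda pair: squared_distance(pair[0], x))
    return nearest[1]

# Hypothetical features: (changed words, language, emojis).
train_features = [(2, 0, 3), (0, 0, 0), (3, 1, 2), (0, 1, 1)]
train_labels = ["spam", "ham", "spam", "ham"]

model = train(train_features, train_labels)
label_1 = predict(model, (1, 0, 3))
label_2 = predict(model, (0, 0, 1))
```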

Summary
• Machine Learning is the intelligent use of data to answer questions
• Enabled by an exponential increase in computing power and data availability
• Three big types of problems: supervised, unsupervised, reinforcement
• 5 steps to every machine learning solution:
  1. Data Gathering
  2. Data Preprocessing
  3. Feature Engineering
  4. Algorithm Selection & Training
  5. Making Predictions

Deep Learning
“Any sufficiently advanced technology is indistinguishable from magic.” (Arthur C. Clarke)

Artificial Neural Networks
• First model of artificial neural networks proposed in 1943
• Analogy to the human brain greatly exaggerated
• Given some inputs (𝑥), the network calculates some outputs (𝑦), using a set of weights (𝑤)
[Figures: Perceptron (Rosenblatt, 1957); Two-layer Fully Connected Neural Network]
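
How a network turns inputs 𝑥 into outputs 𝑦 using weights 𝑤 can be sketched as a forward pass through two fully connected layers. The weights are arbitrary illustrative numbers, biases are omitted for brevity, and the sigmoid between the layers is an assumption (the slide does not say which activation is used).

```python
import math

def dense(x, weights):
    # Each output is a weighted sum of all inputs (one row of weights per output).
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    hidden = [sigmoid(z) for z in dense(x, w1)]  # first layer + non-linearity
    return dense(hidden, w2)                     # second layer produces y

# Hypothetical inputs and weights: 2 inputs -> 2 hidden units -> 1 output.
x = [1.0, 2.0]
w1 = [[0.5, -0.5], [1.0, -0.5]]
w2 = [[2.0, -2.0]]
y = forward(x, w1, w2)
```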

Loss function
• Weights must be adjusted (learned from the data)
• Idea: define a function that tells us how “close” the network is to generating the desired output
• Minimize the loss ➔ optimization problem
• With a continuous and differentiable loss function, we can apply gradient descent
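
Gradient descent on a differentiable loss can be sketched with the smallest possible model: a single weight 𝑤 in 𝑦 = 𝑤𝑥. The mean squared error loss, the toy data, the learning rate and the step count are all assumptions chosen for illustration.

```python
# Hypothetical data generated by y = 2x; the "network" is a single weight w.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def loss(w):
    # Mean squared error: how far the predictions w*x are from the desired outputs.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    # Derivative of the loss with respect to w.
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0
learning_rate = 0.05
initial_loss = loss(w)
for _ in range(100):
    w -= learning_rate * gradient(w)  # step against the gradient
final_loss = loss(w)
```

Each step moves 𝑤 in the direction that lowers the loss, so 𝑤 approaches 2 and the loss approaches zero.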

The Rise, Fall, Rise, Fall and Rise of Neural Networks
• Perceptron gained popularity in the 60s
  • Belief that it would lead to true AI
• XOR problem and AI Winter (1969 – 1986)
• Backpropagation to the rescue! (1986)
  • Training of multilayer neural nets
  • LeNet-5 (Yann LeCun et al., 1998)
• Unable to scale. Lack of good data and processing power
• Regained popularity since ~2006
  • Train each layer at a time
  • Rebranded field as Deep Learning
  • Old ideas rediscovered (e.g., Convolution)
• Breakthrough in 2012 with AlexNet (Krizhevsky et al.)
  • Use of GPUs
  • Convolution

Image Classification with Deep Neural Networks

Digital Representation of Images
[Figure: an image and its digital representation]

The Convolution Operator
[Figure, animated over four slides: Input ⊙ Kernel, summed (∑) = Output]

The Convolution Operation
[Figure: Input, Kernel, Feature Map]

The Convolution Operation
• Takes spatial dependencies into account
• Used as a feature extraction tool
• Differentiable operation ➔ the kernels can be learned
• Traditional ML: Input ➔ Feature extraction ➔ Features ➔ Trained classifier ➔ Output
• Deep Learning: Input ➔ Trained classifier ➔ Output
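
The operation described above (multiply a patch of the input element-wise by the kernel, sum the result, slide to the next position) can be sketched as follows. The assumptions: stride 1, no padding, and no kernel flip, as is usual in convolutional networks; the image and the edge-detecting kernel are made-up examples.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; each output is sum(patch * kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Element-wise product of the patch and the kernel, then summed.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        output.append(row)
    return output

# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
# feature_map == [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only where the edge is. In a trained network the kernel values are not hand-picked but learned.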

Non-linear Activation Functions
• Increase the network’s capacity
  ▪ Convolution, matrix multiplication and summation are linear
• Sigmoid: $f(x) = \frac{1}{1 + e^{-x}}$
• ReLU: $f(x) = \max(0, x)$
• Hyperbolic tangent: $\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$
[Figure: ReLU]
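
The three activation functions above translate directly into code. This sketch follows the formulas as written; `tanh` is spelled out from the exponential form rather than delegated to `math.tanh`, purely for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def tanh(x):
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)

# Values at a few sample points.
samples = [-2.0, 0.0, 2.0]
sigmoid_values = [sigmoid(x) for x in samples]
relu_values = [relu(x) for x in samples]
tanh_values = [tanh(x) for x in samples]
```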

The Pooling Operation
• Used to reduce dimensionality
• Most common: Max pooling
• Makes the network invariant to small transformations, distortions and translations.

Example (2x2 Max Pooling):

   12   20   30    0
    8   12    2    0          20   30
   34   70   37    4    ➔    112   37
  112  100   25   12
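
The 2x2 max pooling example above can be reproduced with a short function. This is a sketch assuming non-overlapping windows (stride equal to the window size); the function name `max_pool` is made up.

```python
def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            row.append(max(
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ))
        pooled.append(row)
    return pooled

# The 4x4 input from the slide.
feature_map = [
    [12, 20, 30, 0],
    [8, 12, 2, 0],
    [34, 70, 37, 4],
    [112, 100, 25, 12],
]
pooled = max_pool(feature_map)
# pooled == [[20, 30], [112, 37]]
```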

Putting it all together
• Feature extraction: Input ➔ Conv Layer ➔ Non-Linear Function ➔ Pooling ➔ Conv Layer ➔ Non-Linear Function ➔ Pooling ➔ Conv Layer ➔ Non-Linear Function ➔ …
• Classification: Flatten ➔ Fully Connected Layers
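
A heavily reduced version of this pipeline, with a single conv/non-linearity/pooling stage followed by flatten and one fully connected layer, might look like the sketch below. Every number in it (the image, the kernel, the fully connected weights) is hand-picked for illustration; in a real network these values are learned.

```python
def conv(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
         for j in range(len(image[0]) - kw + 1)]
        for i in range(len(image) - kh + 1)
    ]

def relu(fmap):
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    return [
        [max(fmap[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, len(fmap[0]) - size + 1, size)]
        for i in range(0, len(fmap) - size + 1, size)
    ]

def flatten(fmap):
    return [v for row in fmap for v in row]

def fully_connected(x, weights):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

# Hypothetical 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
kernel = [[-1, 0, 1] for _ in range(3)]       # hand-picked edge detector
fc_weights = [[0.25] * 4, [-0.25] * 4]        # two output scores, made-up weights

features = flatten(max_pool(relu(conv(image, kernel))))  # feature extraction
scores = fully_connected(features, fc_weights)           # classification
```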

Training Convolutional Neural Networks
• Image classification is a supervised problem
• Gather images and label them with desired output
• Train the network with backpropagation!
[Figure, two slides: the Convolutional Network's prediction (first Dog, then Cat) is compared with the label (Cat) by the Loss Function]
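
How a loss function separates a wrong prediction from a right one can be sketched with softmax and cross-entropy. The slides do not say which loss is used, so cross-entropy is an assumption here, and the network scores are invented.

```python
import math

def softmax(scores):
    # Turn raw network scores into probabilities that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, true_index):
    # Low when the true class gets high probability, high otherwise.
    return -math.log(probabilities[true_index])

classes = ["cat", "dog"]
label = classes.index("cat")

# Hypothetical raw scores for the same cat image, before and after training.
wrong = softmax([0.5, 2.0])   # the network favors "dog"
right = softmax([2.0, 0.5])   # the network favors "cat"

loss_wrong = cross_entropy(wrong, label)
loss_right = cross_entropy(right, label)
```

The wrong prediction yields the larger loss, and backpropagation adjusts the weights in the direction that reduces it.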

Surpassing Human Performance

Deep Learning in the Wild

Deep Learning is Here to Stay
• Data
• Architectures
• Frameworks
• Power
• Players

Conclusions
• Machine learning algorithms learn from data to find hidden relations, to make predictions, to interact with the world, …
• A machine learning algorithm is only as good as its input data
  • Good model + Bad data = Bad Results
• Deep learning is making significant breakthroughs in: speech recognition, language processing, computer vision, control systems, …
• If you are not using or considering using Deep Learning to understand or solve vision problems, you almost certainly should be

Resources
Our work
• Tryolabs Blog: https://www.tryolabs.com/blog
• Luminoth (Computer Vision Toolkit): https://www.luminoth.ai
To Learn More…
• Google Machine Learning Crash Course: https://developers.google.com/machine-learning/crash-course/
• Stanford course CS229: Machine Learning: http://cs229.stanford.edu/
• Stanford course CS231n: Convolutional Neural Networks for Visual Recognition: http://cs231n.stanford.edu/

Thank you!

"An Introduction to Machine Learning and How to Teach Machines to See," a Presentation from Tryolabs
