Machine Learning Tutorial
What's in it for you?
  • Why Machine Learning?
  • What is Machine Learning?
  • Types of Machine Learning
  • Machine Learning Algorithms: Linear Regression, Decision Trees, Support Vector Machine
  • Use Case: Classify whether a recipe is for a cupcake or a muffin using SVM
Why Machine Learning?
  • Because machines can drive your car for you!
  • Because machines can unlock your phone with your face!
  • Because machines can now detect 50 eye diseases!
Why Machine Learning?
Nobody likes spam posts on Facebook that goad users into interacting through likes, shares, comments, and other actions. This tactic, known as "engagement bait," takes advantage of Facebook's News Feed algorithm by boosting engagement in order to get greater reach.
To eliminate engagement bait, the company reviewed and categorized hundreds of thousands of posts to train a machine learning model that detects different types of engagement bait.
(Figure: a new post is fed to the machine, which scans for keywords and phrases such as "This" and checks the click-through rate. It concludes "This is a tag bait!" and blocks the post.)
Google's DeepMind project AlphaGo, a computer program that plays the board game Go, has defeated the world's number one Go player, Ke Jie.
What is Machine Learning?
Machine learning is the science of making computers learn and act like humans by feeding them data and information, without being explicitly programmed!
(Figure: an ordinary system compared with a system with artificial intelligence; with machine learning the system learns, predicts, and improves.)
What is Machine Learning?
  01 Define Objective
  02 Collect Data
  03 Prepare Data
  04 Select Algorithm
  05 Train Model
  06 Test Model
  07 Predict
  08 Deploy
Do you want to predict a category? That's classification! For instance, predicting whether the stock price will increase or decrease.
Do you want to predict a quantity? That's regression! For instance, predicting the age of a person based on height, weight, health, and other factors.
Do you want to detect an anomaly? That's anomaly detection! For instance, detecting anomalous money withdrawals.
Do you want to discover structure in unexplored data? That's clustering! For instance, finding groups of customers with similar behavior in a large database of customer data containing their demographics and past buying records.
Can you tell what's happening in the following cases?
  A. Grouping documents into different categories based on the topic and content of each document
  B. Identifying handwritten digits in images correctly
  C. Behavior of a website indicating that the site is not working as designed
  D. Predicting the salary of an individual based on his or her years of experience
Types of Machine Learning
  • Supervised
  • Unsupervised
  • Reinforcement
Supervised Learning
Supervised learning is a method used to enable machines to classify or predict objects, problems, or situations based on labeled data fed to the machine.
(Figure: labeled data (circle, square, triangle) goes into model training; new data (a square and a circle) goes into the trained model, which outputs a prediction.)
Unsupervised Learning
In unsupervised learning, the machine learning model finds hidden patterns in unlabeled data.
(Figure: unlabeled data goes into model training, which produces the output.)
Reinforcement Learning
Reinforcement learning is an important type of machine learning where an agent learns how to behave in an environment by performing actions and seeing the results.
(Figure: the agent sends an action to the environment; the environment returns a new state to the agent.)
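The agent and environment loop can be sketched in a few lines. This is a toy illustration, not part of the original slides: the `TwoDoorEnvironment` class, its two actions, and the numeric reward are all hypothetical, and the "result" the agent sees is assumed to be a reward number returned alongside the new state.

```python
class TwoDoorEnvironment:
    """Hypothetical environment: going 'right' is rewarded, going 'left' is not."""

    REWARDS = {"left": 0.0, "right": 1.0}

    def step(self, action):
        # The new state is simply the side the agent ended up on.
        new_state = action
        reward = self.REWARDS[action]
        return new_state, reward


class Agent:
    """Tries every action once, then repeats the one with the best average reward."""

    def __init__(self, actions):
        self.totals = {a: 0.0 for a in actions}
        self.counts = {a: 0 for a in actions}

    def choose(self):
        for action, count in self.counts.items():
            if count == 0:  # explore actions that were never tried
                return action
        return max(self.totals, key=lambda a: self.totals[a] / self.counts[a])

    def learn(self, action, reward):
        self.totals[action] += reward
        self.counts[action] += 1


def run(steps=10):
    env = TwoDoorEnvironment()
    agent = Agent(["left", "right"])
    history = []
    for _ in range(steps):
        action = agent.choose()            # ACTION
        _state, reward = env.step(action)  # NEW STATE and its result
        agent.learn(action, reward)
        history.append(action)
    return history


history = run()
print(history)
```

After trying each action once, the agent settles on the rewarded action. Real reinforcement learning methods also use the state and balance exploration against exploitation; this sketch leaves both out.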
Supervised vs Unsupervised
Supervised         Unsupervised
Labeled data       Non-labeled data
Direct feedback    No feedback
Predict output     Find hidden structure in data
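To make the contrast concrete, here is a minimal sketch on one-dimensional data. Everything in it is an assumption made for illustration: the numbers, the labels, the nearest-centroid classifier standing in for "supervised", and the largest-gap split standing in for "unsupervised".

```python
def train_centroids(values, labels):
    """Supervised: use the labels to compute one average value per class."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Predict the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def split_at_largest_gap(values):
    """Unsupervised: no labels, just find structure (two groups) in the data."""
    ordered = sorted(values)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, index = max(gaps)
    return ordered[: index + 1], ordered[index + 1:]


# Hypothetical measurements and labels
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["circle", "circle", "circle", "square", "square", "square"]

centroids = train_centroids(values, labels)
prediction = predict(centroids, 4.0)
groups = split_at_largest_gap(values)
print(prediction, groups)
```

The supervised half returns a named class because it was given names to learn from; the unsupervised half can only report that two groups exist.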
Machine Learning Algorithms
There are many interesting machine learning algorithms; let's have a look at a few of them:
  • Linear Regression
  • Decision Trees
  • Support Vector Machine
Linear Regression
y = mx + c
Linear regression is a linear model, i.e. a model that assumes a linear relationship between the input variables (x) and a single output variable (y). It is perhaps one of the most well-known and well-understood algorithms in statistics and machine learning!
Linear Regression
Imagine we are predicting distance travelled (y) from speed (x). Our linear regression model representation for this problem would be:
y = m * x + c
or
distance = m * speed + c
where m = coefficient (slope) and c = y-intercept.
Time is constant
  • Speed = 10 m/s, Distance = 36 km
  • Speed = 20 m/s, Distance = 52 km
  • Speed = 30 m/s, Distance = ?
Linear Regression
y = mx + c, where y is the distance travelled in a fixed interval of time, x is the speed of the person, m is the positive slope of the line, and c is the y-intercept of the line.
As the speed increases, distance also increases; hence the variables have a positive relationship.
(Figure: distance plotted against speed.)
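One way to fill in the missing distance is to take the two known observations at face value and draw the straight line through them. This is only a sketch under that assumption; the helper name `line_through` is made up here, and with just two points the line is determined exactly rather than fitted.

```python
def line_through(p1, p2):
    """Return slope m and intercept c of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    c = y1 - m * x1
    return m, c


# (speed in m/s, distance in km) pairs from the example above
m, c = line_through((10, 36), (20, 52))
predicted_distance = m * 30 + c  # distance = m * speed + c at 30 m/s
print(m, c, predicted_distance)
```

With these inputs the slope is about 1.6 and the intercept about 20, which puts the prediction at roughly 68 km.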
Distance is constant
  • Speed = 10 m/s, Time = 100 s
  • Speed = 20 m/s, Time = 50 s
  • Speed = 30 m/s, Time = ?
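Linear Regression
If distance is assumed to be constant, let's see the relationship between speed and time.
y = mx + c, where y is the time taken to travel a fixed distance, x is the speed of the person, and m is the negative slope of the line.
As the speed increases, time decreases; hence the variables have a negative relationship. Strictly, time = distance / speed is an inverse relationship, so a straight line with negative slope only approximates it.
(Figure: time plotted against speed.)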
Linear Regression
Let's see the mathematical implementation of linear regression! Suppose we have a dataset that looks like:
x  y
1  3
2  2
3  2
4  4
5  3
Linear Regression
Let's plot these points!
Mean(x) = 3
Mean(y) = 2.8
(Figure: scatter plot of the five (x, y) points.)
Linear Regression
Now, let's find the regression equation of the best-fit line!
y = mx + c
To find this equation for our data, we need to find the slope (m) and the intercept (c).
Linear Regression
y = mx + c
$m = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$, where $\bar{x} = 3$ and $\bar{y} = 2.8$ are the means of x and y.

x  y  x - mean(x)  y - mean(y)  (x - mean(x))^2  (x - mean(x)) * (y - mean(y))
1  3  -2            0.2         4                -0.4
2  2  -1           -0.8         1                 0.8
3  2   0           -0.8         0                 0
4  4   1            1.2         1                 1.2
5  3   2            0.2         4                 0.4
Total                           10                2

$m = 2 / 10 = 0.2$
Linear Regression
With m = 2/10 = 0.2, we can calculate the value of c. The regression line passes through the mean values (3, 2.8):
y = 0.2 x + c
2.8 = 0.2 * 3 + c
2.8 = 0.6 + c
c = 2.8 - 0.6
c = 2.2
Linear Regression
Hence this is our regression line!
y = (0.2 * x) + 2.2
(Figure: the regression line drawn through the scatter plot.)
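The hand calculation can be checked with a short script. This is a minimal sketch of the same least-squares formulas; the function name `fit_line` is an illustrative choice, not something from the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept using the mean-deviation formulas."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator: sum of (x - mean(x)) * (y - mean(y))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Denominator: sum of (x - mean(x))^2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    c = y_mean - m * x_mean  # the line passes through the mean point
    return m, c


xs = [1, 2, 3, 4, 5]
ys = [3, 2, 2, 4, 3]
m, c = fit_line(xs, ys)
print(round(m, 3), round(c, 3))  # 0.2 2.2
```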
Linear Regression
Now, let's predict the values of y using x = {1, 2, 3, 4, 5} and plot the points! Here yp is the predicted value of y.
y = (0.2 * x) + 2.2
yp = (0.2 * 1) + 2.2 = 2.4
yp = (0.2 * 2) + 2.2 = 2.6
yp = (0.2 * 3) + 2.2 = 2.8
yp = (0.2 * 4) + 2.2 = 3.0
yp = (0.2 * 5) + 2.2 = 3.2
Linear Regression
Plot the predicted values along with the actual values to see the difference. The gap between each actual point and the line is the error.
x  y  yp
1  3  2.4
2  2  2.6
3  2  2.8
4  4  3.0
5  3  3.2
(Figure: actual points and the regression line, with the error marked for each point.)
Linear Regression
So, our goal is to reduce this error!
Linear Regression
Minimizing the distance: there are several ways to measure the distance between the line and the data points, such as the sum of squared errors, the sum of absolute errors, and the root mean square error. We keep moving the line through the data points until we find the best-fit line, the one with the least squared distance between the data points and the regression line.
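The three error measures named above can be computed directly for the example data. The sketch below evaluates them for the fitted line and, for comparison, for one arbitrary alternative line; the alternative (m = 0.5, c = 1.0) is a hypothetical choice made only to show that its squared error is larger.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]
ys = [3, 2, 2, 4, 3]


def errors(m, c):
    """Return (sum of squared errors, sum of absolute errors, RMSE) for y = m*x + c."""
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in residuals)
    sae = sum(abs(r) for r in residuals)
    rmse = sqrt(sse / len(residuals))
    return sse, sae, rmse


best = errors(0.2, 2.2)   # the regression line found above
other = errors(0.5, 1.0)  # hypothetical alternative line
print(best)
print(other)
```

For the regression line the sum of squared errors comes to about 2.4, against about 3.75 for the alternative, which is what "best fit" means in the least-squares sense.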
Decision Trees
A decision tree is a tree-shaped algorithm used to determine a course of action. Each branch of the tree represents a possible decision, occurrence, or reaction.
Decision Trees
We have a dataset that tells us if it is a good day to play golf!
Outlook   Temp  Humidity  Windy  Play Golf
Rainy     Hot   High      FALSE  No
Rainy     Hot   High      TRUE   No
Overcast  Hot   High      FALSE  Yes
Sunny     Mild  High      FALSE  Yes
Sunny     Cool  Normal    FALSE  Yes
Sunny     Cool  Normal    TRUE   No
Overcast  Cool  Normal    TRUE   Yes
Rainy     Mild  High      FALSE  No
Rainy     Cool  Normal    FALSE  Yes
Sunny     Mild  Normal    FALSE  Yes
Rainy     Mild  Normal    TRUE   Yes
Overcast  Mild  High      TRUE   Yes
Overcast  Hot   Normal    FALSE  Yes
Sunny     Mild  High      TRUE   No
Decision Trees
Let's determine whether you should play golf when the day is sunny and windy.
Decision Trees
Suppose we draw our tree like this!
(Figure: a candidate tree built from an Outlook node with branches Sunny, Overcast, and Rainy and a Humidity node with branches Normal and High; the leaves are Play and Don't Play.)
Decision Trees
But is this the right decision tree? To find out, we should calculate entropy and information gain!
  • Entropy: the measure of randomness or "impurity" in the dataset. Entropy should be low!
  • Information gain: the measure of the decrease in entropy after the dataset is split, also known as entropy reduction. Information gain should be high!
Decision Trees
Let's look at entropy!
a) Entropy of the target class of the dataset (whole entropy):
Play Golf: Yes = 9, No = 5, Total = 14
$\text{Entropy(PlayGolf)} = E(5, 9) = I(5/14, 9/14) = I(0.36, 0.64) = -(0.36 \log_2 0.36) - (0.64 \log_2 0.64) = 0.94$
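The same entropy value can be reproduced from the class counts alone. A minimal sketch, where the function name `entropy` and the count-list interface are illustrative choices:

```python
from math import log2


def entropy(counts):
    """Shannon entropy (base 2) of a class distribution given as raw counts."""
    total = sum(counts)
    # Classes with zero count contribute nothing, so they are skipped.
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)


play_golf_entropy = entropy([9, 5])  # 9 "Yes" and 5 "No" out of 14 days
print(round(play_golf_entropy, 2))   # 0.94
```

A perfectly mixed split such as `entropy([7, 7])` gives 1.0, and a pure one such as `entropy([4, 0])` gives 0, which is why low entropy is the goal.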
Decision Trees
Let's look at entropy!
Outlook    Yes  No  Total
Sunny      3    2   5
Overcast   4    0   4
Rainy      2    3   5
Total               14

Entropy(PlayGolf, Outlook) = P(Sunny) * E(3, 2) + P(Overcast) * E(4, 0) + P(Rainy) * E(2, 3)
= 5/14 * I(3, 2) + 4/14 * I(4, 0) + 5/14 * I(2, 3)
= 0.693
Similarly, we can calculate the entropy of the other predictors: Temperature, Humidity, and Windy!
Decision Trees
Now, let's look at information gain!
Gain(Outlook) = Entropy(PlayGolf) - Entropy(PlayGolf, Outlook) = 0.940 - 0.693 = 0.247
The information gain of the other three attributes can be calculated in the same way:
Gain(Temp) = Entropy(PlayGolf) - Entropy(PlayGolf, Temp) = 0.029
Gain(Humidity) = Entropy(PlayGolf) - Entropy(PlayGolf, Humidity) = 0.152
Gain(Windy) = Entropy(PlayGolf) - Entropy(PlayGolf, Windy) = 0.048
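All four gains can be recomputed from the golf dataset. The sketch below encodes the 14 rows as tuples and applies the gain formula to each attribute; the names `ROWS`, `COLUMNS`, and `information_gain` are chosen here for illustration.

```python
from collections import Counter, defaultdict
from math import log2

COLUMNS = ("Outlook", "Temp", "Humidity", "Windy")
# Each row: (Outlook, Temp, Humidity, Windy, Play Golf)
ROWS = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Sunny", "Mild", "High", True, "No"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column_index):
    """Entropy of the target minus the weighted entropy after splitting on one column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column_index]].append(row[-1])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - weighted


gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS)}
best_attribute = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"Gain({name}) = {gain:.3f}")
print("Best split:", best_attribute)
```

The attribute with the largest gain is the one to split on first.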
Decision Trees
Now, let's build the decision tree! We choose the attribute with the largest information gain as the root node.
Outlook (root node)
  Sunny -> Windy (branch node)
    True -> Don't Play (leaf node)
    False -> Play (leaf node)
  Overcast -> Play (leaf node)
  Rainy -> Don't Play (leaf node)
Decision Trees
So, we wanted to know if it's a good day to play golf when it's sunny and windy! Following the tree from the root: Outlook is Sunny, then Windy is True, which leads to the Don't Play leaf.
Decision Trees Uh-Oh, it’s not a good day to play golf! You can watch a golf game at home! :D
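Once a tree is built, using it is just a chain of if-statements. The sketch below hard-codes the tree as read from the slide's diagram: Outlook at the root and a Windy test under Sunny. Treating the Rainy branch as a plain Don't Play leaf is an assumption taken from that reading, and the function name `play_golf` is hypothetical.

```python
def play_golf(outlook, windy):
    """Walk the decision tree: Outlook at the root, Windy under the Sunny branch."""
    if outlook == "Overcast":
        return True       # leaf: Play
    if outlook == "Sunny":
        return not windy  # Windy = True -> Don't Play, False -> Play
    return False          # Rainy (assumed leaf): Don't Play


sunny_and_windy = play_golf("Sunny", windy=True)
print("Play" if sunny_and_windy else "Don't Play")
```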
Support Vector Machine
Support Vector Machine (SVM) is a widely used classification algorithm! The idea is simple: the algorithm creates a separating line that divides the classes in the best possible manner, for example dog or cat, disease or no disease.
Support Vector Machine
Suppose we have labeled sample data that gives the height and weight of males and females. (Figure: scatter plot of the samples, with axes Height and Weight.)
How can a machine classify whether a new data point is a male or a female?
We draw decision lines. If we consider decision line 1, we will classify the new data point as male; if we consider decision line 2, it will be female!
We need to know which line divides the classes correctly, but how?
Support Vector Machine
The goal is to choose a hyperplane with the greatest possible margin between the decision line and the nearest point within the training set.
Distance margin: the distance between the hyperplane and the nearest data point from either set. These nearest points are the support vectors.
(Figure: Line 1 with its support vectors and distance margin, on axes Height and Weight.)
Support Vector Machine
When we draw the hyperplanes, we observe that Line 1 has the maximum distance margin, so it will classify the new data point correctly.
Result: the new data point is male!
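The margin comparison can be sketched numerically. The sample points, the two candidate lines, and the new data point below are all made up for illustration (the slides give no numbers). Each line is written as w1 * height + w2 * weight + b = 0, and its margin is the smallest perpendicular distance to any training point. A real SVM searches over all lines; this sketch only compares two.

```python
from math import hypot

# Hypothetical (height in cm, weight in kg) samples
males = [(180, 80), (175, 75), (185, 90)]
females = [(160, 55), (165, 60), (155, 50)]

# Candidate decision lines as (w1, w2, b): w1*height + w2*weight + b = 0
line_1 = (1.0, 1.0, -237.5)  # height + weight = 237.5
line_2 = (1.0, 0.0, -172.0)  # height = 172


def signed_value(line, point):
    w1, w2, b = line
    return w1 * point[0] + w2 * point[1] + b


def margin(line, points):
    """Smallest perpendicular distance from the line to any point."""
    w1, w2, _ = line
    return min(abs(signed_value(line, p)) for p in points) / hypot(w1, w2)


def classify(line, point):
    return "male" if signed_value(line, point) > 0 else "female"


training = males + females
margin_1 = margin(line_1, training)
margin_2 = margin(line_2, training)
best_line = line_1 if margin_1 > margin_2 else line_2

new_point = (170, 72)  # hypothetical new data point
print(margin_1, margin_2)
print(classify(line_1, new_point), classify(line_2, new_point))
print(classify(best_line, new_point))
```

Both lines separate the training data, yet they disagree on the new point. Choosing the line with the larger margin settles the disagreement.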
Support Vector Machine Let’s understand this with the help of an example!
Support Vector Machine
Problem statement: classifying muffin and cupcake recipes using support vector machines (muffin vs cupcake).
Support Vector Machine
Let's have a look at our dataset:
Type     Flour  Milk  Sugar  Butter  Egg  Baking Powder  Vanilla  Salt
Muffin   55     28    3      7       5    2              0        0
Muffin   47     24    12     6       9    1              0        0
Muffin   47     23    18     6       4    1              0        0
Muffin   45     11    17     17      8    1              0        0
Muffin   50     25    12     6       5    2              1        0
Muffin   55     27    3      7       5    2              1        0
Muffin   54     27    7      5       5    2              0        0
Muffin   47     26    10     10      4    1              0        0
Muffin   50     17    17     8       6    1              0        0
Muffin   50     17    17     11      4    1              0        0
Cupcake  39     0     26     19      14   1              1        0
Cupcake  42     21    16     10      8    3              0        0
Cupcake  34     17    20     20      5    2              1        0
Cupcake  39     13    17     19      10   1              1        0
Cupcake  38     15    23     15      8    0              1        0
Cupcake  42     18    25     9       5    1              0        0
Cupcake  36     14    21     14      11   2              1        0
Cupcake  38     15    31     8       6    1              1        0
Cupcake  36     16    24     12      9    1              1        0
Cupcake  34     17    23     11      13   0              1        0
What's the difference between a muffin and a cupcake? It turns out muffins have more flour, while cupcakes have more butter and sugar.
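As a rough illustration of the maximum-margin idea on this dataset, the sketch below uses only the Flour column. With a single feature, the maximum-margin boundary is the midpoint between the closest recipes of the two classes, and those two recipes play the role of support vectors. This is a simplified stand-in under that one-feature assumption, not a full SVM implementation; the names `fit_max_margin_threshold` and `classify` are hypothetical.

```python
# Flour values copied from the dataset above
muffin_flour = [55, 47, 47, 45, 50, 55, 54, 47, 50, 50]
cupcake_flour = [39, 42, 34, 39, 38, 42, 36, 38, 36, 34]


def fit_max_margin_threshold(high_class, low_class):
    """One-feature maximum-margin boundary between two separable classes."""
    low_support = max(low_class)    # cupcake closest to the muffins
    high_support = min(high_class)  # muffin closest to the cupcakes
    if low_support >= high_support:
        raise ValueError("classes are not separable on this feature")
    threshold = (low_support + high_support) / 2
    margin = (high_support - low_support) / 2
    return threshold, margin


def classify(flour, threshold):
    return "Muffin" if flour > threshold else "Cupcake"


threshold, margin = fit_max_margin_threshold(muffin_flour, cupcake_flour)
print(threshold, margin)         # 43.5 1.5
print(classify(48, threshold))   # a hypothetical new recipe with 48 flour
```

A two-feature version (for example Flour and Sugar) would need a proper SVM solver to find the separating line, but the principle of maximizing the gap to the nearest recipes is the same.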
Support Vector Machine
Hence, we have built a classifier using SVM that can classify whether a recipe is for a cupcake or a muffin!
Key Takeaways
  • What is machine learning?
  • Types of machine learning
  • Regression: line of best fit
  • Building a decision tree
  • Classification using SVM
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners Part - 1 | Simplilearn


Editor's Notes

  • #4 We can talk about how AI-enabled machines are able to detect major diseases these days.
  • #9 Hence, machine learning is being used to reduce the spread of content that is spammy, sensational, or misleading, in order to promote more meaningful and authentic conversations on Facebook.
  • #70 A hyperplane is a line that linearly separates and classifies a set of data.