ROHINI COLLEGE OF ENGINEERING AND TECHNOLOGY
3.1 INTRODUCTION TO MACHINE LEARNING
Machine learning is a subfield of artificial intelligence that involves the development of
algorithms and statistical models that enable computers to improve their performance in tasks
through experience. These algorithms and models are designed to learn from data and make
predictions or decisions without explicit instructions.
Machine learning enables computers to learn automatically from past data. It uses various
algorithms to build mathematical models and to make predictions from historical data or
information. It is currently used for tasks such as image recognition, speech recognition,
email filtering, Facebook auto-tagging, recommender systems, and many more.
With the help of sample historical data, which is known as training data, machine learning
algorithms build a mathematical model that helps in making predictions or decisions without
being explicitly programmed.
How does Machine Learning work?
A machine learning system learns from historical data, builds a prediction model, and,
whenever it receives new data, predicts the output for it. The accuracy of the predicted output
depends on the amount and quality of the training data: in general, a larger and more
representative dataset helps to build a model that predicts the output more accurately.
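The learn-then-predict cycle described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data (hours studied and exam result) and the function names `fit` and `predict` are hypothetical, and a 1-nearest-neighbour rule stands in for whatever model a real system would use.

```python
# Hypothetical historical data: (hours_studied, exam_result) pairs.
history = [(1.0, "fail"), (2.0, "fail"), (3.0, "fail"),
           (6.0, "pass"), (7.0, "pass"), (8.0, "pass")]


def fit(examples):
    """'Training' for 1-nearest-neighbour simply stores the examples."""
    return list(examples)


def predict(model, x):
    """Return the label of the stored example whose input is closest to x."""
    _, nearest_label = min(model, key=lambda ex: abs(ex[0] - x))
    return nearest_label


model = fit(history)          # learn from historical data
print(predict(model, 2.5))    # new, unseen input -> fail
print(predict(model, 6.5))    # new, unseen input -> pass
```

With more stored examples, the nearest neighbour of a new input tends to be closer to it, which is one simple way to see why more data can help.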
Features of Machine Learning:
o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
CS3491-ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
o It is a data-driven technology.
o Machine learning is similar to data mining, as it also deals with large amounts of
data.
There are several types of machine learning, including supervised learning, unsupervised
learning, and reinforcement learning.
Supervised learning involves training a model on labeled data, while unsupervised
learning involves training a model on unlabeled data.
Reinforcement learning involves training a model through trial and error.
Machine learning is used in a wide variety of applications, including image and speech
recognition, natural language processing, and recommender systems.
Definition of learning: A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at tasks in T, as measured by
P, improves with experience E.
Examples
Handwriting recognition learning problem
Task T : Recognizing and classifying handwritten words within images
Performance P : Percent of words correctly classified
Training experience E : A dataset of handwritten words with given classifications
A robot driving learning problem
Task T : Driving on highways using vision sensors
Performance P : Average distance traveled before an error
Training experience E : A sequence of images and steering commands recorded
while observing a human driver
Definition: A computer program which learns from experience is called a machine learning
program or simply a learning program.
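As a small illustration of the performance measure P in the handwriting example, the sketch below computes the percentage of words classified correctly. The function name `percent_correct` and the word lists are hypothetical and are not taken from any real recogniser.

```python
def percent_correct(predicted, actual):
    """Performance measure P: percentage of items classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty dataset")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical outputs of a recogniser before and after more training.
actual        = ["cat", "dog", "sun", "map"]
early_guesses = ["cat", "dig", "son", "mop"]   # little experience E
later_guesses = ["cat", "dog", "sun", "mop"]   # more experience E

print(percent_correct(early_guesses, actual))  # 25.0
print(percent_correct(later_guesses, actual))  # 75.0
```

In the terms of the definition, the program has learned if P rises in this way as E grows.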
3.1.1 Classification of Machine Learning
Machine learning implementations are classified into four major categories, depending
on the nature of the learning “signal” or “response” available to the learning system:
A. SUPERVISED LEARNING:
Supervised learning is the machine learning task of learning a function that maps an input to
an output based on example input-output pairs. The given data is labeled.
Both classification and regression problems are supervised learning problems.
Example — Consider the following data regarding patients entering a clinic. The data
consists of the gender and age of the patients and each patient is labeled as “healthy” or
“sick”.
Gender   Age   Label
M        48    Sick
M        67    Sick
F        53    Healthy
M        49    Sick
F        32    Healthy
M        34    Healthy
M        21    Healthy
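One possible way to learn from this labelled data is sketched below. It is a deliberately simple model chosen for illustration, not a method the notes prescribe: it ignores gender and searches for a single age threshold, and the names `accuracy`, `fit_age_threshold` and `predict` are hypothetical.

```python
# Patient records from the table: (gender, age, label).
patients = [
    ("M", 48, "Sick"), ("M", 67, "Sick"), ("F", 53, "Healthy"),
    ("M", 49, "Sick"), ("F", 32, "Healthy"), ("M", 34, "Healthy"),
    ("M", 21, "Healthy"),
]


def accuracy(threshold, records):
    """Fraction of records matched by the rule 'age >= threshold means Sick'."""
    hits = sum((age >= threshold) == (label == "Sick")
               for _, age, label in records)
    return hits / len(records)


def fit_age_threshold(records):
    """Try the midpoints between consecutive ages and keep the best one."""
    ages = sorted({age for _, age, _ in records})
    candidates = [(a + b) / 2 for a, b in zip(ages, ages[1:])]
    return max(candidates, key=lambda t: accuracy(t, records))


threshold = fit_age_threshold(patients)


def predict(age):
    """Apply the learned rule to a new patient."""
    return "Sick" if age >= threshold else "Healthy"


print(threshold)                  # 41.0
print(predict(60), predict(25))   # Sick Healthy
```

The learned rule misclassifies one training record (the healthy 53-year-old), which shows that a model fitted to labelled data need not reproduce every label exactly.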
Supervised learning is the type of machine learning in which machines are trained using
well "labelled" training data and, on the basis of that data, predict the output. Labelled
data means that the input data is already tagged with the correct output.
In supervised learning, the training data provided to the machine works as the supervisor
that teaches the machine to predict the output correctly. It applies the same concept as a student
learning under the supervision of a teacher.
Supervised learning is a process of providing input data as well as the correct output data to the
machine learning model. The aim of a supervised learning algorithm is to find a mapping
function that maps the input variable (x) to the output variable (y).
In the real-world, supervised learning can be used for Risk Assessment, Image
classification, Fraud Detection, spam filtering, etc.
How does Supervised Learning work?
In supervised learning, models are trained using a labelled dataset, from which the model learns
about each type of data. Once the training process is completed, the model is tested on test
data (a held-out portion of the labelled data that was not used for training), and it predicts
the output for that data.
The working of supervised learning can be easily understood from the example and diagram below:
Suppose we have a dataset of different types of shapes, which includes squares, rectangles,
triangles, and hexagons. The first step is to train the model on each shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides, then it will be labelled as a hexagon.
After training, we test our model using the test set, and the task of the model is to identify
the shape.
The machine has already been trained on all types of shapes, so when it encounters a new shape, it
classifies the shape on the basis of its number of sides and predicts the output.
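The shape example can be sketched as a model that records which label goes with each combination of features. The feature encoding (number of sides, whether all sides are equal), the training examples and the function names are assumptions made for this illustration.

```python
from collections import Counter, defaultdict

# Hypothetical labelled training set:
# (number_of_sides, all_sides_equal) -> label.
training = [
    ((4, True), "square"), ((4, True), "square"),
    ((4, False), "rectangle"),
    ((3, True), "triangle"), ((3, False), "triangle"),
    ((6, True), "hexagon"),
]


def fit(examples):
    """Record, for each feature combination, the label seen most often."""
    votes = defaultdict(Counter)
    for features, label in examples:
        votes[features][label] += 1
    return {features: counts.most_common(1)[0][0]
            for features, counts in votes.items()}


def predict(model, sides, all_equal):
    """Look up the learned label; unseen combinations are reported as unknown."""
    return model.get((sides, all_equal), "unknown")


model = fit(training)
print(predict(model, 4, True))    # square
print(predict(model, 5, True))    # unknown (no five-sided shape in training)
```

The second call shows a limit of this lookup approach: it cannot label a shape whose features never appeared in the training data.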
Steps Involved in Supervised Learning:
o First, determine the type of training dataset.
o Collect/gather the labelled training data.
o Split the labelled dataset into a training dataset, a validation dataset, and a test dataset.
o Determine the input features of the training dataset, which should carry enough information
for the model to predict the output accurately.
o Determine a suitable algorithm for the model, such as a support vector machine, a decision
tree, etc.
o Execute the algorithm on the training dataset. Sometimes we need a validation set, held out
from the training data, to tune the control parameters of the algorithm.
o Evaluate the accuracy of the model on the test set. If the model predicts the correct
output for most of the test examples, the model is accurate.
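The splitting step in the list above can be sketched as follows. The 60/20/20 proportions, the fixed seed and the function name `split_dataset` are illustrative assumptions, not values the notes prescribe.

```python
import random


def split_dataset(records, train=0.6, validation=0.2, seed=0):
    """Shuffle a copy of the records and cut it into three disjoint parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


records = list(range(10))  # stand-in for ten labelled examples
train_set, val_set, test_set = split_dataset(records)
print(len(train_set), len(val_set), len(test_set))  # 6 2 2
```

Keeping the three parts disjoint matters: a model evaluated on examples it was trained on would report an accuracy that is too optimistic.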
Types of supervised Machine learning Algorithms:
Supervised learning can be further divided into two types of problems:
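1. Regression
Regression algorithms are used when the output variable is continuous and depends on the
input variable. They are used to predict continuous quantities, as in weather forecasting,
market trends, etc. Popular regression algorithms that come under supervised learning
include linear regression, polynomial regression, and regression trees.
2. Classification
Classification algorithms are used when the output variable is categorical, that is, when each
input must be assigned to one of a fixed set of classes, such as “spam” and “not spam”.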
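As a minimal sketch of the regression case, the code below fits a straight line to a handful of points by ordinary least squares and uses it to predict a continuous value. The data is hypothetical and chosen to lie exactly on y = 2x + 1, so the result is easy to check by hand; `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)          # 2.0 1.0
print(slope * 5.0 + intercept)   # prediction for x = 5 -> 11.0
```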
B. UNSUPERVISED LEARNING:
As the name suggests, unsupervised learning is a machine learning technique in which
models are not supervised using a labelled training dataset. Instead, the models themselves find
the hidden patterns and insights in the given data. It can be compared to the learning that takes
place in the human brain while learning new things. It can be defined as:
Unsupervised learning is a type of machine learning in which models are trained using an
unlabeled dataset and are allowed to act on that data without any supervision.
Unsupervised learning cannot be directly applied to a regression or classification problem
because, unlike supervised learning, we have the input data but no corresponding output data. The
goal of unsupervised learning is to find the underlying structure of the dataset, group the data
according to similarities, and represent the dataset in a compressed format.
Why use Unsupervised Learning?
Below are some main reasons which describe the importance of Unsupervised Learning:
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is similar to the way a human learns to think from their own experiences,
which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes
unsupervised learning all the more important.
o In the real world, we do not always have input data with the corresponding output, so
such cases call for unsupervised learning.
Working of Unsupervised Learning
The working of unsupervised learning can be understood from the diagram below:
Here, we have taken unlabeled input data, which means it is not categorized and the
corresponding outputs are not given. This unlabeled input data is fed to the machine
learning model in order to train it. The model first interprets the raw data to find hidden patterns
in it, and then applies a suitable algorithm, such as k-means clustering or hierarchical clustering.
Once applied, the algorithm divides the data objects into groups according
to the similarities and differences between the objects.
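The grouping step can be illustrated with a bare-bones k-means on one-dimensional data. This is a sketch under simplifying assumptions: the points are hypothetical, the starting centroids are passed in by hand rather than chosen randomly, and the loop runs a fixed number of iterations instead of testing for convergence.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]   # hypothetical unlabelled data
centroids, clusters = k_means(points, centroids=[1.0, 2.0])
print(centroids)   # [2.0, 11.0]
print(clusters)    # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

No labels are supplied at any point; the two groups emerge only from the distances between the points.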
Types of Unsupervised Learning Algorithm:
Unsupervised learning algorithms can be further categorized into two types of problems:
o Clustering: Clustering is a method of grouping objects into clusters such that objects
with the most similarities remain in one group and have few or no similarities with the objects
of another group. Cluster analysis finds the commonalities between the data objects and
categorizes them according to the presence or absence of those commonalities.
o Association: Association rule learning is an unsupervised learning method used for
finding relationships between variables in a large database. It determines the sets of
items that occur together in the dataset. Association rules make marketing strategy more
effective: for example, people who buy item X (say, bread) also tend to purchase item Y
(butter or jam). A typical example of association rule learning is Market Basket Analysis.
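The bread-and-butter rule can be made concrete with the two standard measures used in market basket analysis, support and confidence. The baskets below are invented for illustration, and the helper names are hypothetical.

```python
# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "jam"},
    {"bread", "butter", "milk"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(set(itemset) <= t for t in transactions)


def support(itemset, transactions):
    """Fraction of transactions that contain the whole itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)


print(support({"bread", "butter"}, baskets))       # 0.6
print(confidence({"bread"}, {"butter"}, baskets))  # 0.75
```

Here the rule "bread implies butter" holds in three of the four baskets that contain bread, so its confidence is 0.75.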
C. REINFORCEMENT LEARNING:
Reinforcement learning is the problem of getting an agent to act in the world so as to
maximize its rewards. Reinforcement learning is a part of machine learning in which agents are
self-trained through reward and punishment mechanisms. It is about taking the best possible action
or path in a specific situation, based on observations, to gain maximum reward and minimum
punishment. The reward acts as a signal for positive and negative behaviors. Essentially, an agent
(or several) is built that can perceive and interpret the environment in which it is placed;
furthermore, it can take actions and interact with that environment.
The learner is not told which actions to take, as in most forms of machine learning, but instead
must discover which actions yield the most reward by trying them. For example, consider
teaching a dog a new trick: we cannot tell it what to do or what not to do, but we can
reward or punish it when it does the right or wrong thing.
Reinforcement learning (RL) is based on rewarding desired behaviors or punishing
undesired ones. Instead of one input producing one output, the algorithm produces a variety of
outputs and is trained to select the right one based on certain variables – Gartner
Reinforcement learning is a type of machine learning technique where a computer agent learns to
perform a task through repeated trial-and-error interactions with a dynamic environment. This
learning approach enables the agent to make a series of decisions that maximize a reward metric
for the task without human intervention and without being explicitly programmed to achieve the task.
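Trial-and-error learning from rewards can be sketched with tabular Q-learning, one common reinforcement learning algorithm, chosen here for illustration and not named in the notes. The environment is a hypothetical five-cell corridor whose only reward is at the right-hand end; the learning rate, discount factor and exploration rate are illustrative values, not tuned ones.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = {"left": -1, "right": +1}
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative values


def step(state, action):
    """Environment: move one cell, stay in the corridor, reward 1 at the goal."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    """Learn action values by acting, observing rewards, and updating."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < EPSILON:
                action = rng.choice(list(ACTIONS))   # explore
            else:                                    # exploit, random tie-break
                action = max(ACTIONS,
                             key=lambda a: (q[state][a], rng.random()))
            next_state, reward = step(state, action)
            target = reward + GAMMA * max(q[next_state].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy: the highest-valued action in each non-goal cell.
policy = [max(q_table[s], key=q_table[s].get) for s in range(GOAL)]
print(policy)
```

The agent is never told that moving right is correct; the preference for "right" in every cell arises only from the reward it collects at the goal.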
D. SEMI-SUPERVISED LEARNING:
In semi-supervised learning, an incomplete training signal is given: a training set with some
(often many) of the target outputs missing.
A special case of this principle, known as transduction, arises when the entire set of
problem instances is known at learning time, except that part of the targets are missing.
Semi-supervised learning is an approach to machine learning that combines a small amount of
labeled data with a large amount of unlabeled data during training. It falls
between unsupervised learning and supervised learning.
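One simple way to combine a few labeled points with many unlabeled ones is self-training, sketched below: the unlabeled point closest to any already-labeled point receives that point's label, and the process repeats. This is only one of several semi-supervised strategies, and the one-dimensional data and the function name `self_train` are assumptions made for the example.

```python
def self_train(labelled, unlabelled):
    """Repeatedly label the unlabelled point closest to any labelled point."""
    labelled = dict(labelled)        # point -> label
    remaining = list(unlabelled)
    while remaining:
        # Find the (unlabelled, labelled) pair with the smallest distance.
        point, neighbour = min(
            ((u, l) for u in remaining for l in labelled),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        labelled[point] = labelled[neighbour]   # assign a pseudo-label
        remaining.remove(point)
    return labelled


# Hypothetical 1-D data: two labelled points and six unlabelled ones.
seed_labels = {1.0: "A", 9.0: "B"}
unlabelled = [2.0, 3.0, 4.0, 6.0, 7.0, 8.0]
result = self_train(seed_labels, unlabelled)
print(result[4.0], result[6.0])   # A B
```

Because every unlabeled point is known in advance and only its label is missing, this example also fits the transduction setting described above.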
Categorizing based on required Output
Another categorization of machine learning tasks arises when one considers the desired output
of a machine-learned system:
1. Classification: When inputs are divided into two or more classes, the learner must produce
a model that assigns unseen inputs to one or more (multi-label classification) of these classes.
This is typically tackled in a supervised way. Spam filtering is an example of classification,
where the inputs are email (or other) messages and the classes are “spam” and “not spam”.
2. Regression: This is also a supervised problem, in which the outputs are continuous
rather than discrete.
3. Clustering: When a set of inputs is to be divided into groups. Unlike in classification, the
groups are not known beforehand, making this typically an unsupervised task.
Machine learning comes into the picture when problems cannot be solved using typical
approaches, such as explicitly programmed rules. ML algorithms combined with new computing
technologies promote scalability and improve efficiency. Modern ML models can be used to make
predictions ranging from outbreaks of disease to the rise and fall of stocks.