
SUPPORT VECTOR MACHINES (SVMs)

- INTRODUCTION AND KEY CONCEPTS


Introduction to SVM
● Definition: A Support Vector Machine (SVM) is
a supervised learning model used for
classification and regression tasks. It finds the
optimal hyperplane that best separates data
into different classes.
● Applications: Image classification, text
classification, bioinformatics, etc.
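To ground the definition, a minimal classification sketch using scikit-learn (an assumed dependency; the iris dataset and RBF kernel are illustrative choices, not prescribed by this deck):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit an SVM classifier; SVC searches for the optimal separating hyperplane
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))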
Why SVM?
● SVM is effective in high-dimensional spaces.
● It is particularly useful when the number of
dimensions exceeds the number of samples.
● SVM works well for both linear and non-linear
classification tasks.
Linear Classification in SVM
● Linear SVM: Classifies data into two classes by
finding a linear hyperplane (decision boundary)
that maximizes the margin between the classes.
● Hyperplane: A flat affine subspace of dimension d − 1 (in a d-dimensional space) that divides the data points into two classes.
Maximum Margin Hyperplane
● Definition: The hyperplane that maximizes the distance to the closest data points from either class is called the Maximum Margin Hyperplane (MMH).
● Key Concept: A larger margin generally leads to better generalization of the model.
To visualize the Maximum Margin Hyperplane (used in Support Vector
Machines or SVM), let's break down the components involved and
how you can represent them in a diagram.

Key Concepts:
1. Hyperplane: A decision boundary that separates data points
belonging to different classes. In two dimensions, this is simply a
line; in higher dimensions, it’s a hyperplane.
2. Maximum Margin: The margin is the distance between the
hyperplane and the closest data points from either class. The
goal of SVM is to maximize this margin to ensure better
generalization on unseen data.
3. Support Vectors: The data points that lie closest to the hyperplane and are used to define the maximum margin. These are critical points, as they directly influence the positioning of the hyperplane.
Equation of Hyperplane
● A hyperplane in a d-dimensional space is defined by the equation:
w ⋅ x + b = 0
● where:
● w is the normal vector to the hyperplane,
● x is the feature vector of a data point,
● b is the bias term.
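To make the equation concrete, a small sketch with hypothetical values (w, b, and the test points are invented for illustration): the sign of w ⋅ x + b tells us which side of the hyperplane a point falls on.

import numpy as np

# Hypothetical hyperplane in 2-D: w = (1, -1), b = 0.5
w = np.array([1.0, -1.0])
b = 0.5

def side(x):
    # Sign of w . x + b gives the predicted class (+1 or -1)
    return int(np.sign(np.dot(w, x) + b))

print(side(np.array([2.0, 1.0])))  # 2 - 1 + 0.5 = +1.5  ->  +1
print(side(np.array([0.0, 2.0])))  # 0 - 2 + 0.5 = -1.5  ->  -1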
1. Objective of the Optimization Problem:
● The objective of SVM is to maximize the
margin between the hyperplane and the
closest data points from each class, known
as support vectors.
● This margin maximization helps in achieving
better generalization on new data, thus
enhancing the model’s robustness.
Objective Function of Support Vector Machine (SVM)
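In its standard hard-margin form, the objective is:

minimize (1/2)‖w‖²  subject to  yᵢ(w ⋅ xᵢ + b) ≥ 1  for all i = 1, …, n

Since the width of the margin equals 2/‖w‖, minimizing ‖w‖² is equivalent to maximizing the margin.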
Solving the Optimization Problem in SVM

1. Using the Lagrange Multiplier Method:


The SVM optimization problem can be efficiently solved
using the Lagrange multiplier method. This approach helps
to transform the problem into a form that is easier to solve,
particularly by converting it from a primal form to a dual
form. The dual form allows us to focus only on the support
vectors, simplifying the computation and providing flexibility
for handling complex data distributions.
Dual Formulation of SVM
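In standard form, the dual problem obtained from the Lagrangian is:

maximize over α:  Σᵢ αᵢ − (1/2) Σᵢ Σⱼ αᵢ αⱼ yᵢ yⱼ (xᵢ ⋅ xⱼ)
subject to:  αᵢ ≥ 0  and  Σᵢ αᵢ yᵢ = 0

Only points with αᵢ > 0 (the support vectors) contribute to the solution, and the data appear only through dot products xᵢ ⋅ xⱼ, which is exactly what kernel methods later exploit.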
Key Takeaway:
The dual formulation of SVM focuses only on the
critical data points (support vectors), making
SVM both efficient and adaptable to non-linear
classification through kernel methods.
Soft Margin SVM
● Definition: Soft Margin SVM allows some
misclassification in order to prevent overfitting,
especially in noisy datasets.
● Objective: Maximize the margin while minimizing a penalty for misclassifications.
● Mathematical Formulation: The problem becomes:
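With slack variables ξᵢ measuring each point's margin violation, the standard soft-margin problem is:

minimize (1/2)‖w‖² + C Σᵢ ξᵢ  subject to  yᵢ(w ⋅ xᵢ + b) ≥ 1 − ξᵢ  and  ξᵢ ≥ 0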
Role of Regularization Parameter C
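In brief, C controls the trade-off between margin width and training error: a small C tolerates more margin violations (stronger regularization, wider margin), while a large C penalizes misclassification heavily and can overfit noisy data. A quick sketch of the effect (synthetic data and C values chosen purely for illustration):

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two overlapping Gaussian blobs, so some margin violations are unavoidable
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # A small C keeps more points inside the margin, so more support vectors
    print(f"C={C}: number of support vectors = {clf.n_support_.sum()}")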
Non-linear SVM
● Problem: SVM works best for linearly separable
data, but most real-world data is not linearly
separable.
● Solution: Use a non-linear transformation to map the input data into a higher-dimensional feature space where it becomes linearly separable.
Feature Space Transformation
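A classic example of such a transformation is the degree-2 polynomial map for two-dimensional inputs:

φ(x₁, x₂) = (x₁², √2·x₁x₂, x₂²)

Points separable only by a circle in the original plane become separable by a flat plane in this 3-D feature space. Note also that φ(x) ⋅ φ(z) = (x ⋅ z)², a fact the kernel trick below exploits.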
Kernels in SVM
● Definition: A kernel function allows us to compute
the dot product in the higher-dimensional feature
space without explicitly computing the
transformation.
● Key Advantage: Kernels enable the use of non-
linear SVMs without high computational cost.
Polynomial Kernel
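In standard form:

K(x, z) = (x ⋅ z + c)ᵈ

where d is the polynomial degree and c ≥ 0 weights the lower-order terms.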
Radial Basis Function (RBF) Kernel
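In standard form, with width parameter σ (equivalently γ = 1/(2σ²)):

K(x, z) = exp(−‖x − z‖² / (2σ²)) = exp(−γ‖x − z‖²)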
The Kernel Trick
● Definition: The Kernel Trick is a technique used to
compute the dot product in the high-dimensional
feature space efficiently by applying a kernel
function.
● Benefit: It avoids the need to explicitly compute
the transformation, saving on computation time.
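A small numeric check of the trick, reusing the degree-2 map φ from the Feature Space Transformation slide (values chosen for illustration): the kernel value equals the feature-space dot product without ever computing φ.

import numpy as np

def phi(x):
    # Explicit degree-2 feature map for 2-D input
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = np.dot(phi(x), phi(z))  # dot product in 3-D feature space
kernel = np.dot(x, z) ** 2         # kernel K(x, z) = (x . z)^2, no mapping needed
print(explicit, kernel)            # both print 16.0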
Key SVM Advantages
● Effective in high-dimensional spaces
● Memory-efficient (the decision function depends only on the support vectors)
● Versatile through the kernel trick
● Robust to overfitting
● Effective for binary classification problems
● Good generalization ability
● Optimal for large-margin separation
● Versatile application range
● Effective with sparse data
● Effective for outlier detection
SVM in Text Classification
● Application: SVM is commonly used in text
classification tasks like spam email detection
and sentiment analysis.
● Why SVM?: The high-dimensional, sparse feature space of text plays to SVM's strengths.
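A hedged sketch of such a pipeline (the toy corpus and labels are invented; TF-IDF plus a linear SVM is a common pairing, not the only option):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["win a free prize now", "meeting at noon tomorrow",
         "free cash offer", "lunch with the team"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (hypothetical)

# TF-IDF yields a high-dimensional sparse feature space, a good fit for a linear SVM
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)
print(model.predict(["claim your free prize"]))  # expected: [1]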
SVM for Image Classification
● Application: SVMs are widely used in image
classification tasks, such as face recognition
and object detection.
● Why SVM?: SVM's ability to handle high-
dimensional feature spaces (pixel values)
makes it effective for image recognition tasks.
SVM vs. Other Algorithms
● Comparison with Logistic Regression: SVM maximizes
the margin, which helps with generalization, whereas
logistic regression uses a probabilistic approach.
● Comparison with Decision Trees: SVM typically
performs better in high-dimensional spaces, while
decision trees can be more interpretable.
Training SVM
● Steps:
1. Select a kernel (linear, polynomial, RBF).
2. Choose the regularization parameter C.
3. Train the model on the dataset.
4. Optimize the objective function (maximize the
margin and minimize misclassification).
Evaluating SVM Model
● Metrics: Precision, Recall, F1-Score, and
Accuracy are commonly used to evaluate SVM
models.
● Cross-validation: Often used to evaluate SVM’s
generalization performance.
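A minimal sketch of 5-fold cross-validation with an F1 score (the dataset and parameters are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
scores = cross_val_score(SVC(kernel="rbf", C=1.0), X, y, cv=5, scoring="f1")
print("mean F1 over 5 folds:", scores.mean())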
Challenges in SVM
● Computational Complexity: SVMs can be
computationally expensive, especially with
large datasets.
● Choice of Kernel: Selecting the right kernel and
hyperparameters can be challenging and
requires experimentation.
Tuning SVM Hyperparameters
● Key Hyperparameters:
1. C (regularization parameter).
2. Kernel Type (linear, polynomial,
RBF).
3. Kernel Parameters (e.g., σ for RBF
kernel).
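A sketch of tuning all three with a grid search (the grid values are illustrative; note that scikit-learn parameterizes the RBF width as gamma rather than σ):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
param_grid = {
    "C": [0.1, 1, 10],              # regularization parameter
    "kernel": ["linear", "rbf"],    # kernel type
    "gamma": ["scale", 0.01, 0.1],  # RBF kernel parameter (ignored by the linear kernel)
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)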
SVM in Multi-class Classification
● Approach: SVM is inherently a binary
classifier, but can be extended to multi-class
classification using strategies like:
1. One-vs-One: Create one classifier for
each pair of classes.
2. One-vs-All: Create one classifier for each
class against all other classes.
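A brief sketch of both strategies in scikit-learn (iris has k = 3 classes, so one-vs-one trains k(k − 1)/2 = 3 pairwise classifiers and one-vs-rest trains 3):

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC, LinearSVC

X, y = load_iris(return_X_y=True)

# SVC handles multi-class via one-vs-one internally
ovo = SVC(kernel="linear").fit(X, y)

# Explicit one-vs-rest wrapper: one classifier per class against all others
ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)
print(ovo.score(X, y), ovr.score(X, y))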
Summary of SVM
● SVM is a powerful machine learning algorithm
used for both classification and regression
tasks.
● It works by finding a hyperplane that best
separates data points into different classes.
● Kernels allow SVM to handle non-linearly
separable data by mapping it into higher-
dimensional spaces.
