
Data Mining With Clustering AND Classification

This document discusses data mining techniques including clustering and classification. Clustering is an unsupervised learning technique that organizes data into groups of similar objects. Major clustering methods include distance-based, hierarchical, and partitioning. Classification is a supervised learning technique that predicts categorical class labels. It involves constructing a model from a training set and using it to classify new data. Major classification techniques discussed include decision trees, Bayesian classification, and association rule mining.

Uploaded by

Amanjyot Singh Oberoi


DATA MINING WITH CLUSTERING AND CLASSIFICATION

DATA MINING
Data mining is the process of discovering new
correlations, patterns, and trends by digging into
(mining) large amounts of data stored in data warehouses,
using artificial intelligence, statistical, and
mathematical techniques.
It is currently used in a wide range of profiling
practices, such as marketing, fraud detection, and
scientific discovery.
From a managerial perspective:
• Analyzing trends
• Wealth generation
• Security
• Strategic decision making

MODELS OF DATA MINING
Predictive Model: Predictive models can be used to
forecast explicit values, based on patterns determined
from known results. For example, from a database of
customers who have already responded to a particular
offer, a model can be built that predicts which prospects
are likeliest to respond to the same offer.

Predictive data mining is further categorized into:

Classification
Regression
Descriptive Model: Descriptive models describe
patterns in existing data and are generally used to
create meaningful subgroups, such as demographic
clusters.

Descriptive data mining is further classified into:

Clustering
Association
Sequential analysis
CLUSTERING
• Clustering can be considered the most important
unsupervised learning technique; like every other
problem of this kind, it deals with finding structure
in a collection of unlabeled data.

• Clustering is “the process of organizing objects into
groups whose members are similar in some way”.

• A cluster is therefore a collection of objects that
are “similar” to one another and “dissimilar” to
the objects belonging to other clusters.
Where to use clustering?
• Data mining
• Information retrieval
• Text mining
• Web analysis
• Marketing
• Medical diagnostics

Major clustering methods
• Distance-based
• Hierarchical
• Partitioning
• Probabilistic
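
As an illustration of a partitioning method that groups objects by distance, here is a minimal k-means sketch. The slides do not name a specific algorithm, so k-means is swapped in as an example; the function name `kmeans`, the sample points, and the parameters `iterations` and `seed` are all assumptions made for this sketch.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Partition points into k clusters by Euclidean distance (illustrative sketch)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No class labels are supplied anywhere: the groups emerge only from the distances between points, which is what makes this an unsupervised technique.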
CLASSIFICATION
• Predicts categorical class labels
• Constructs a model from the training set and the
values (class labels) of a classifying attribute, then
uses that model to classify new data
Classification: A Two-Step Process
Model construction: describing a set of predetermined classes
• Each tuple is assumed to belong to a predefined class, as determined
by the class label attribute (supervised learning)
• The set of tuples used for model construction is the training set
• The model is represented as classification rules, decision trees, or
mathematical formulae
Model usage: classifying previously unseen objects
• Estimate the accuracy of the model using a test set
• The known label of each test sample is compared with the model's
classification result
• The accuracy rate is the percentage of test set samples that are
correctly classified by the model
• The test set must be independent of the training set; otherwise the
accuracy estimate is over-optimistic and over-fitting goes undetected
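
The two steps can be sketched as a holdout split followed by an accuracy check. This is a minimal sketch under stated assumptions: the helper names `holdout_split` and `accuracy_rate`, the toy records, and the threshold model are hypothetical and do not come from the slides.

```python
import random


def holdout_split(records, test_fraction=0.3, seed=0):
    """Shuffle the records and split them into disjoint training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_rate(model, test_set):
    """Percentage of test samples whose predicted label matches the known label."""
    correct = sum(1 for value, label in test_set if model(value) == label)
    return 100.0 * correct / len(test_set)


# Hypothetical labeled records: (value, class label).
records = [(x, "high" if x >= 5 else "low") for x in range(10)]
train, test = holdout_split(records)


def model(value):
    """Stand-in for a model built from the training set; the threshold is assumed."""
    return "high" if value >= 4 else "low"


print(len(train), len(test))
print(accuracy_rate(model, test))
```

Because the test records are removed from the training set before the model is built, the reported accuracy reflects performance on samples the model has not seen.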
Classification Process: Model Construction

Figure: the training data is fed to a classification algorithm, which
produces the classifier (model).

Training Data

NAME   RANK            YEARS  TENURED
Mike   Assistant Prof  3      no
Mary   Assistant Prof  7      yes
Bill   Professor       2      yes
Jim    Associate Prof  7      yes
Dave   Assistant Prof  6      no
Anne   Associate Prof  3      no

Classifier (Model)

IF rank = ‘professor’ OR years > 6
THEN tenured = ‘yes’
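
The learned rule can be written as a small function and checked against the training tuples. This is only a sketch: the function name `predict_tenured` is hypothetical, and returning 'no' when the rule does not fire is an assumption, since the slide states only the 'yes' branch.

```python
# Training tuples from the slide: (name, rank, years, tenured).
training_data = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]


def predict_tenured(rank, years):
    """IF rank = 'professor' OR years > 6 THEN 'yes'; otherwise 'no' (assumed default)."""
    return "yes" if rank.lower() == "professor" or years > 6 else "no"


# Collect the training tuples that the rule gets wrong.
mismatches = [
    name
    for name, rank, years, tenured in training_data
    if predict_tenured(rank, years) != tenured
]
print(mismatches)  # prints [] because the rule fits every training tuple
```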
Classification Process: Model Usage in Prediction

Figure: the classifier is applied to the testing data to estimate its
accuracy, and then to unseen data.

Testing Data

NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes

Unseen Data

(Jeff, Professor, 4)  Tenured?
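
To make the prediction step concrete, the sketch below applies the rule from the model construction slide (IF rank = 'professor' OR years > 6 THEN tenured = 'yes') to the testing data and then to the unseen tuple. The function name `predict_tenured` is hypothetical, and the 'no' default is an assumption because the slide gives only the 'yes' branch.

```python
# Testing tuples from the slide: (name, rank, years, tenured).
testing_data = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]


def predict_tenured(rank, years):
    """Rule learned from the training data; 'no' is the assumed default."""
    return "yes" if rank.lower() == "professor" or years > 6 else "no"


# Compare each known test label with the classifier's result.
correct = sum(
    1
    for _, rank, years, tenured in testing_data
    if predict_tenured(rank, years) == tenured
)
accuracy = 100.0 * correct / len(testing_data)
print(accuracy)  # prints 75.0: Merlisa (7 years, labeled 'no') is misclassified

# Classify the unseen tuple (Jeff, Professor, 4).
jeff = predict_tenured("Professor", 4)
print(jeff)  # prints yes
```

The rule fits every training tuple but misses one of the four test tuples, which shows why accuracy is estimated on data held out from model construction.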
Classification Techniques

Classification by Decision Tree
Bayesian Classification
Classification by Backpropagation
Classification Based on Association Rule Mining
Classification vs Clustering
Supervised learning (classification)
• Supervision: the training data (observations,
measurements, etc.) are accompanied by labels
indicating the class of the observations
• New data is classified based on the training set

Unsupervised learning (clustering)
• The class labels of the training data are unknown
• Given a set of measurements, observations, etc., the
aim is to establish the existence of classes or clusters in
the data
