Introduction to Data Science
By Dr. V. Sumathy, Assistant Professor, Department of Data Science, Loyola College, Chennai 600 034
Agenda
➔ Big data
➔ Definition of Data Science
➔ Application of Data Science
➔ Process of Data Science Project
➔ Skillset to acquire
➔ Placement Opportunities
➔ Questions and Discussion
➔ Big data vs. Traditional data
➔ Need of Data Science
➔ Introduction to Predictive and Prescriptive models
➔ Learning Resources
Definition of Data Science
01 Data science is the study of data to extract meaningful insights for business.
02 It is a multidisciplinary approach that combines domain expertise, programming skills, machine learning algorithms, and knowledge of maths and statistics.
Need for Data Science
➔ Data Abundance
➔ Business Value
➔ Complexity of Data
➔ Advancement of Technology
➔ Decision Making
➔ Personalisation
➔ Social Impact
Big Data
Streaming Data
➔ Volume
➔ Variety of data
➔ Velocity
➔ Veracity
➔ Value
Features involved in Pricing model
01 Base Fare: A flat fee charged at the beginning of every ride, regardless of distance or time.
02 Per-Mile Rate: The amount charged for the distance traveled during the trip.
03 Per-Minute Rate: The amount charged for the duration of the trip, typically based on the time spent in the vehicle.
04 Booking Fee: A fee charged to cover operational costs, such as insurance and customer support.
Features involved in Pricing model (continued)
05 Surge Pricing: During times of high demand, Uber may implement surge pricing, which increases the fares to encourage more drivers to be available. Surge pricing multipliers can vary depending on the level of demand in the area.
06 Tolls and Surcharges: Additional fees may be added for tolls, airport pickups, or other special circumstances.
07 Promotions and Discounts: Uber frequently offers promotions, discounts, and referral bonuses that can affect the final price of a ride.
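The pricing features listed above can be combined into a single fare estimate. The sketch below is only an illustration: the slides do not give Uber's actual formula, so the way the surge multiplier is applied (to the metered part of the fare only), the `RateCard` and `estimate_fare` names, and all rate values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class RateCard:
    """Hypothetical rates for one city; the values used below are made up."""
    base_fare: float    # 01 flat fee charged at the start of the ride
    per_mile: float     # 02 charge per mile traveled
    per_minute: float   # 03 charge per minute of trip time
    booking_fee: float  # 04 fee covering operational costs


def estimate_fare(rates, miles, minutes,
                  surge_multiplier=1.0, tolls=0.0, discount=0.0):
    """Combine the seven pricing features into one estimated price.

    Assumption: surge (05) scales only the metered part of the fare,
    while the booking fee, tolls (06) and discounts (07) are added or
    subtracted afterwards. Real pricing rules may differ.
    """
    metered = rates.base_fare + rates.per_mile * miles + rates.per_minute * minutes
    total = metered * surge_multiplier + rates.booking_fee + tolls - discount
    return round(max(total, 0.0), 2)  # a fare is never negative


rates = RateCard(base_fare=2.0, per_mile=1.5, per_minute=0.25, booking_fee=1.0)
normal_fare = estimate_fare(rates, miles=10, minutes=20)
surge_fare = estimate_fare(rates, miles=10, minutes=20,
                           surge_multiplier=1.5, tolls=3.0, discount=2.0)
print(normal_fare, surge_fare)
```

In a data science setting, these same quantities would typically become the input columns (features) of a pricing model rather than terms in a fixed formula.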
Features involved in product recommendation
01 User Preferences and History
02 Implicit Feedback
03 Collaborative Filtering
04 Content-Based Filtering
05 Item Attributes
06 Explicit Feedback
07 Contextual Information
08 Seasonality and Trends
Features involved in spam detection
Sources:
➔ Sender Reputation
➔ Keyword/content
➔ Whitelist/Blacklist
➔ HTML code
➔ Attachments
➔ Header analysis
➔ Images
Credit card / Loan sanction analysis
Data science in defence
➔ Predictive Maintenance
➔ Mission Planning and Optimization
➔ Target Identification and Tracking
➔ Health Monitoring and Medical Research
➔ Cybersecurity and Information Assurance
and many more
Data science in Rocket launching
➔ Risk Assessment and Safety Analysis
➔ Real-Time Monitoring and Control
➔ Weather Forecasting and Environmental Conditions
➔ Launch Site Selection and Infrastructure Planning
and many more
Types of Data Analysis
➔ Predictive
➔ Prescriptive
➔ Diagnostic
➔ Descriptive
Obtain Data
Sources:
➔ Open source
➔ Real time data
➔ Video
➔ Secondary/Primary data
➔ Text data
➔ Real world data
➔ Images
Scrub Data
01 Handle missing values
02 Drop unwanted columns
03 Duplicate data
04 Handle outliers
05 Data Transformation
06 Data discretization
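The six scrubbing steps can be sketched with pandas on a tiny made-up table. The column names (`row_id`, `age`, `income`), the data values, the median fill, the 1.5 × IQR outlier rule, min-max scaling, and the age bins are all assumptions chosen for illustration; the slides do not prescribe a particular technique for any step.

```python
import pandas as pd

# Hypothetical raw data: one missing age, one duplicated record, one extreme income.
raw = pd.DataFrame({
    "row_id": [1, 2, 3, 4, 5, 6, 7],
    "age":    [25, None, 40, 40, 58, 33, 47],
    "income": [30, 42, 55, 55, 61, 48, 900],
})


def scrub(df: pd.DataFrame) -> pd.DataFrame:
    out = df.drop(columns=["row_id"])                    # 02 drop unwanted columns
    out = out.drop_duplicates().reset_index(drop=True)   # 03 remove duplicate rows
    out["age"] = out["age"].fillna(out["age"].median())  # 01 fill missing values

    # 04 outliers: keep rows whose income lies within 1.5 * IQR of the quartiles
    q1, q3 = out["income"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out["income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    out = out[in_range].reset_index(drop=True)

    # 05 transformation: min-max scale income to the range [0, 1]
    low, high = out["income"].min(), out["income"].max()
    out["income_scaled"] = (out["income"] - low) / (high - low)

    # 06 discretization: turn numeric age into labelled groups
    out["age_group"] = pd.cut(out["age"], bins=[0, 30, 50, 120],
                              labels=["young", "middle", "senior"])
    return out


clean = scrub(raw)
print(clean)
```

The order of the steps matters in practice: here duplicates are removed before the median is computed so that the repeated record does not count twice.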
Explore Data
01 Create Histogram
02 Create Boxplot
03 Create scatterplot
04 Generate descriptive statistics
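As a rough illustration of the exploration step, the sketch below uses only the Python standard library to compute the numbers that sit behind a histogram (bin counts) and a boxplot (quartiles), along with basic descriptive statistics. The data values and the helper names `describe` and `histogram_counts` are made up for this example; drawing the actual charts would need a plotting library, which is not shown here.

```python
import statistics
from collections import Counter


def describe(values):
    """Descriptive statistics, including the quartiles a boxplot displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin; keys are the lower bin edges."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical sample, e.g. trip durations in minutes.
durations = [12, 15, 17, 21, 22, 25, 28, 31, 34, 45]
summary = describe(durations)
bins = histogram_counts(durations, bin_width=10)
print(summary)
print(bins)
```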
Model Building
Model Building
In simple words, a model is an equation that helps in making decisions, be it for predictive, prescriptive, descriptive, or diagnostic analysis. Example:
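The example from the original slide is not preserved in this transcript. As a stand-in illustration (not the author's example), the sketch below fits the equation y = intercept + slope × x to a few made-up points using ordinary least squares and then uses the equation to make a prediction. The data and the function names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Apply the fitted equation to a new input."""
    return intercept + slope * x


# Hypothetical data that lies exactly on the line y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_line(xs, ys)
prediction = predict(intercept, slope, 6)
print(intercept, slope, prediction)
```

Here the "model" is literally the pair of numbers (intercept, slope); more complex models follow the same idea with more parameters.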
Training and Test data set
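The content of this slide is not preserved in the transcript. A minimal sketch of the idea, assuming a simple random shuffle and a hypothetical 80/20 ratio: part of the data is used to fit the model (training set) and the rest is held back to check it on unseen records (test set). The function name and its parameters are illustrative only.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(10))  # stand-in for ten data rows
train, test = train_test_split(records, test_fraction=0.2)
print(len(train), len(test))
```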
Evaluation metrics
Evaluation metrics
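The content of these slides is not preserved in the transcript, so the choice of metrics here is an assumption. For a binary classifier (for example spam vs. not spam), commonly used metrics are accuracy, precision, recall, and F1 score, all derived from the counts of true/false positives and negatives. The labels below are made up.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```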
Types of Machine Learning Algorithms
Skill set to acquire
➔ Statistics: Descriptive and Inferential Statistics
➔ Mathematics: eigenvalues, eigenvectors, projection (Linear algebra)
➔ Programming language: Python, Spark
➔ DBMS, SQL, NoSQL
➔ Visualisation Tools: PowerBI/Tableau
➔ Cloud: Azure/GCP
➔ Web Scraping
Explore data
➔ Kaggle
➔ UCI Repository
➔ Data.gov
➔ Twitter API
➔ MIMIC-III
➔ The World Bank Open Data
Learning resources
➔ Udemy
➔ Coursera
➔ NPTEL
➔ LinkedIn
➔ Medium.com
➔ YouTube videos by Krish Naik
Build Your Profile
Blocks:
➔ Aptitude skill
➔ Mini project
➔ Certifications
➔ LinkedIn profile
➔ Hackathon ranks
➔ USP
Thanks!
