Ladle Patel TCS R&D Innovation Labs ladlepatelr@gmail.com Mob:+91-9742123444
Machine Learning Examples
• Spam or non-spam
• Clustering
• Recommendations
• Market basket analysis
What is machine learning?
• It is a field of artificial intelligence, itself a sub-field of computer science, in which we teach a computer by example and then ask it to make predictions for new examples automatically.
Ex: 1) Is an email spam or not spam? 2) Product recommendation. 3) What will tomorrow's temperature be?
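To make "teach by example, then predict for a new example" concrete, here is a minimal sketch of the spam case. It is not a production spam filter: the four training emails are made up, and the word-count scoring rule is an assumption chosen only because it is easy to read.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears in spam and in non-spam examples."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Label new text by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["not spam"][w] for w in words)
    return "spam" if spam_score > ham_score else "not spam"

# Hypothetical labelled examples (the "teaching" step).
examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]
model = train(examples)

# Predictions for emails the model has never seen.
print(predict(model, "claim your free prize"))
print(predict(model, "agenda for lunch"))
```

The point of the sketch is the split between the two steps: `train` only ever sees labelled examples, and `predict` only ever sees a new, unlabelled email.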
Types Of Machine Learning
Terminology
• Observations: items or entities used for learning or evaluation. In the context of spam detection, the observations are emails.
• Features: attributes used to represent an observation. Ex: in housing price prediction, size, area, number of floors, etc.
• Labels: values or categories assigned to observations. In the context of spam detection, the label marks an email as spam or not spam.
• Training and test data: observations used to train a learning algorithm (training data) and to evaluate it (test data).
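The four terms can be seen together in a small sketch based on the housing price example. All numbers are invented for illustration, and `train_test_split`, `size_sqft`, and the 25% test fraction are assumptions of this sketch, not something the slides specify.

```python
import random

# Each observation is one house. Its features describe it;
# its label (the price) is the value we want to predict.
# All values are made up for illustration.
observations = [
    {"features": {"size_sqft": 850, "floors": 1}, "label": 120_000},
    {"features": {"size_sqft": 900, "floors": 1}, "label": 130_000},
    {"features": {"size_sqft": 1200, "floors": 2}, "label": 185_000},
    {"features": {"size_sqft": 1500, "floors": 2}, "label": 240_000},
    {"features": {"size_sqft": 1700, "floors": 2}, "label": 265_000},
    {"features": {"size_sqft": 2000, "floors": 3}, "label": 320_000},
    {"features": {"size_sqft": 2300, "floors": 3}, "label": 355_000},
    {"features": {"size_sqft": 2600, "floors": 3}, "label": 410_000},
]

def train_test_split(data, test_fraction=0.25, seed=42):
    """Shuffle a copy of the data and hold out a fraction for evaluation."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train_data, test_data = train_test_split(observations)
print(len(train_data), "training observations,", len(test_data), "test observations")
```

Keeping the test observations out of training is what makes the evaluation meaningful: the algorithm is scored on houses it has not seen.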
Tools or programming languages
• MATLAB
• Octave
• R
• SAS
• SPSS
• Python
• etc.
What is the problem?
• Most traditional analytical tools run on a single machine, which limits the amount of data they can process.
Example
• Spam or non-spam
TF-IDF
                   i  work  on  spark  hadoop
I work on spark    1  1     1   1      0
I work on hadoop   1  1     1   0      1
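The two example sentences above can be turned into vectors with a few lines of plain Python. This sketch first rebuilds the term counts and then applies an IDF weight. The smoothed formula log((N + 1) / (df + 1)) is one common variant and is an assumption here, since the slide does not say which formula it uses.

```python
import math
from collections import Counter

docs = ["I work on spark", "I work on hadoop"]
tokenized = [d.lower().split() for d in docs]
vocab = ["i", "work", "on", "spark", "hadoop"]

# Term frequency: how many times each vocabulary word occurs in each document.
tf = [[Counter(tokens)[w] for w in vocab] for tokens in tokenized]

# Document frequency: in how many documents each word occurs.
n_docs = len(docs)
df = {w: sum(1 for tokens in tokenized if w in tokens) for w in vocab}

# Smoothed inverse document frequency (assumed variant).
idf = {w: math.log((n_docs + 1) / (df[w] + 1)) for w in vocab}

# TF-IDF: term frequency scaled by how rare the word is across documents.
tfidf = [[count * idf[w] for count, w in zip(row, vocab)] for row in tf]

for doc, counts, weights in zip(docs, tf, tfidf):
    print(doc, counts, [round(x, 3) for x in weights])
```

With this variant, words that occur in both sentences ("i", "work", "on") get a weight of zero, so only "spark" and "hadoop" distinguish the two documents.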
Cross-Industry Standard Process for Data Mining (CRISP-DM)
ML Use Cases
• Marketing. Ex: customer segmentation, product mix, recommendation
• Sales. Ex: demand forecasting
• Risk. Ex: fraud detection
• Customer support. Ex: call centers
ML Use Cases (cont.)
• Healthcare. Ex: survival analysis
• Consumer financial. Ex: credit card fraud
• Retail. Ex: market basket analysis
• Insurance
• Manufacturing
Thanks

Apache Spark with Machine Learning
