©2018 dataiku, Inc. 2nd Class: The data science workflow & introduction to algorithms
Curriculum
● September 27th at 12PM ET: The data science workflow, building a predictive model flow
The plan for today
7 steps of a data project
Advanced version of the workflow
[Figure: workflow diagram; labels: Dataset 1, Dataset 2, Dataset n, Scored dataset]
Business understanding
Data understanding
Data preparation
Model Creation
Evaluation
Deployment
Most Advanced version of the workflow
Machine Learning: What are we talking about?
copyright @Machine Learning Mastery
Different types of Machine Learning
● Supervised: data is labeled, algorithm predicts an output feature from the input data
● Unsupervised: data isn't labeled, algorithm learns the inherent structure of the data and makes a prediction
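To make the labeled versus unlabeled distinction concrete, here is a minimal sketch in plain Python. The toy data and the helper names `predict_nearest` and `split_at_largest_gap` are assumptions made up for illustration; they are not from the course material. The first function uses labels to predict (a supervised, nearest-neighbor style guess), while the second sees only raw values and looks for structure (an unsupervised grouping).

```python
# Supervised: every input comes with a label, so a model can learn input -> label.
# (Toy data, invented for this illustration.)
labeled = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (8.5, "large")]


def predict_nearest(x, examples):
    """Return the label of the labeled example whose value is closest to x."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label


# Unsupervised: the same inputs without labels; we can only look for structure.
unlabeled = [1.0, 1.2, 8.0, 8.5]


def split_at_largest_gap(values):
    """Split sorted values into two groups at the widest gap between neighbors."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


print(predict_nearest(7.0, labeled))
print(split_at_largest_gap(unlabeled))
```

Both functions see the same numbers; only the supervised one can name its output, because names exist only in the labels.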
Different types of Machine Learning
● If your target is numerical (continuous): regression
● If your target is categorical (discrete): classification
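The difference between a numerical and a categorical target can be sketched in a few lines of plain Python. Everything here is an illustrative assumption: the toy data, the function names `fit_line` and `fit_midpoint_classifier`, and the choice of a least-squares line and a midpoint threshold as stand-ins for regression and classification in general.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def fit_midpoint_classifier(xs, labels):
    """Two-class rule: cut halfway between the mean feature value of each class."""
    classes = sorted(set(labels))
    means = {
        c: sum(x for x, lab in zip(xs, labels) if lab == c) / labels.count(c)
        for c in classes
    }
    low, high = sorted(classes, key=lambda c: means[c])
    threshold = (means[low] + means[high]) / 2
    return lambda x: high if x > threshold else low


# Numerical target: the prediction is a number.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope * 5 + intercept)

# Categorical target: the prediction is one of a fixed set of classes.
classify = fit_midpoint_classifier([1, 2, 8, 9], ["fail", "fail", "pass", "pass"])
print(classify(6.0))
```

The input feature is numeric in both cases; what changes is the kind of value the model is asked to return.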
Different types of Prediction
How can we choose an algorithm?
Different types of Machine Learning
Goal:
Examples:
Most common types of Machine Learning
Linear Regressions
Logistic Regressions
Most common types of Tree Based Models
The decision tree
More complicated tree
More trees!
Random Forest
Gradient Boosting Trees
Isolation Forest
K-means clustering http://www.naftaliharris.com/blog/visualizing-k-means-clustering/
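For readers who prefer code to the interactive visualization, the following is a minimal sketch of the standard k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). The function name `kmeans`, the toy points, and the fixed starting centroids are assumptions chosen for illustration; real implementations usually pick starting centroids at random and stop when assignments no longer change.

```python
def kmeans(points, centroids, iterations=10):
    """Run a fixed number of k-means rounds; returns (centroids, clusters)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Toy data: two visibly separate groups (invented for this sketch).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Note that no labels are passed in: the grouping comes only from distances between points, which is what makes k-means an unsupervised method.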
Questions?
About Dataiku - Your Path to Enterprise AI

Applied Data Science Course Part 2: the data science workflow and basic models deep dive