Deep Learning with PyTorch
“What is deep learning?”
Machine learning is turning things (data) into numbers and finding patterns in those numbers. The computer does this part. How? Code & math. We’re going to be writing the code.
Machine Learning vs. Deep Learning: Artificial Intelligence ⊃ Machine Learning ⊃ Deep Learning (deep learning is a subset of machine learning, which is a subset of artificial intelligence).
Traditional programming vs. a machine learning algorithm:
• Traditional programming starts with inputs and rules, and makes the output. For example, the rules for a recipe: 1. Cut vegetables, 2. Season chicken, 3. Preheat oven, 4. Cook chicken for 30 minutes, 5. Add vegetables.
• A machine learning algorithm starts with inputs and the output, and figures out the rules (the numbered steps above).
“Why use machine learning (or deep learning)?”
Good reason: Why not? Better reason: For a complex problem, can you think of all the rules? (probably not)
Source: 2020 Machine Learning Roadmap video.
“If you can build a simple rule-based system that doesn’t require machine learning, do that.” (maybe not a very simple one…)
— A wise software engineer… (actually rule 1 of Google’s Machine Learning Handbook)
What deep learning is good for 🤖✅
• Problems with long lists of rules: when the traditional approach fails, machine learning/deep learning may help.
• Continually changing environments: deep learning can adapt (‘learn’) to new scenarios.
• Discovering insights within large collections of data: can you imagine trying to hand-craft rules for what 101 different kinds of food look like?
What deep learning is (typically) not good for 🤖🚫
• When you need explainability: the patterns learned by a deep learning model are typically uninterpretable by a human.
• When the traditional approach is a better option: if you can accomplish what you need with a simple rule-based system.
• When errors are unacceptable: the outputs of a deep learning model aren’t always predictable.
• When you don’t have much data: deep learning models usually require a fairly large amount of data to produce great results (though we’ll see how to get great results without huge amounts of data).
Machine Learning vs. Deep Learning:
• Structured data → Machine Learning (e.g. algorithm: gradient boosted machine)
• Unstructured data → Deep Learning (e.g. algorithm: neural network)
Machine Learning vs. Deep Learning: common algorithms (depending on how you represent your problem, many algorithms can be used for both):
• Structured data: Random forest, Gradient boosted models, Naive Bayes, Nearest neighbour, Support vector machine, …many more (since the advent of deep learning these are often referred to as “shallow algorithms”).
• Unstructured data: Neural networks, such as a Fully connected neural network, Convolutional neural network, Recurrent neural network, Transformer, …many more. This is what we’re focused on building (with PyTorch).
“What are neural networks?”
Neural Networks: Inputs → Numerical encoding → Learns representation → Representation outputs → Outputs.
• Inputs: e.g. images, text, audio (choose the appropriate neural network for your problem).
• Numerical encoding: before data gets used with a neural network, it needs to be turned into numbers, e.g. [[116, 78, 15], [117, 43, 96], [125, 87, 23], …]
• Learns representation (patterns/features/weights): each of the nodes inside the network is called a “hidden unit” or “neuron”.
• Representation outputs: e.g. [[0.983, 0.004, 0.013], [0.110, 0.889, 0.001], [0.023, 0.027, 0.985], …]
• Outputs (a human can understand these): e.g. “Ramen, Spaghetti”, “Not a disaster”, “Hey Siri, what’s the weather today?”
Anatomy of Neural Networks (overall architecture):
• Input layer (data goes in here): # units/neurons = 2
• Hidden layer(s) (learns patterns in data): # units/neurons = 3
• Output layer (outputs learned representation or prediction probabilities): # units/neurons = 1
Each layer is usually a combination of linear (straight line) and/or non-linear (not-straight line) functions.
Note: “patterns” is an arbitrary term; you’ll often hear “embedding”, “weights”, “feature representation”, “feature vectors” all referring to similar things.
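As a preview of where we’re headed, here’s a minimal sketch in PyTorch of the 2 → 3 → 1 architecture above (the ReLU non-linearity is an illustrative choice, not something fixed by the slide):

import torch
from torch import nn

# Minimal sketch of the anatomy above: 2 input units -> 3 hidden units -> 1 output unit
model = nn.Sequential(
    nn.Linear(in_features=2, out_features=3),  # input layer -> hidden layer (linear function)
    nn.ReLU(),                                 # non-linear function
    nn.Linear(in_features=3, out_features=1),  # hidden layer -> output layer (linear function)
)

x = torch.rand(1, 2)  # one sample with 2 features
print(model(x))       # one output value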
Types of Learning: Supervised Learning; Unsupervised & Self-supervised Learning; Transfer Learning. We’ll be writing code to do these, but the style of code can be adopted across learning paradigms.
“What is deep learning actually used for?”
Source: 2020 Machine Learning Roadmap video.
Deep Learning Use Cases (some):
• Recommendation
• Sequence to sequence (seq2seq): Translation; Speech recognition (“Hey Siri, who’s the biggest big dog of them all?”)
• Classification/regression: Computer Vision; Natural Language Processing (NLP), e.g. spam detection:
  To: daniel@mrdbourke.com “Hey Daniel, This deep learning course is incredible! I can’t wait to use what I’ve learned!” → Not spam
  To: daniel@mrdbourke.com “Hay daniel… C0ongratu1ations! U win $1139239230” → Spam
“What is PyTorch?”
What is PyTorch?
• Most popular research deep learning framework*
• Write fast deep learning code in Python (able to run on a GPU/many GPUs)
• Able to access many pre-built deep learning models (Torch Hub/torchvision.models)
• Whole stack: preprocess data, model data, deploy model in your application/cloud
• Originally designed and used in-house by Facebook/Meta (now open-source and used by companies such as Tesla, Microsoft, OpenAI)
*Source: paperswithcode.com/trends February 2022
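A quick sanity check that PyTorch is installed (a minimal sketch; the printed version will vary with your install):

import torch

print(torch.__version__)          # e.g. 1.10.0+cu113 (varies)
print(torch.cuda.is_available())  # True if PyTorch can see a CUDA-enabled GPU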
Why PyTorch? Research favourite Source: paperswithcode.com/trends February 2022
Why PyTorch? Source: @fchollet Twitter and PyTorch
Why PyTorch?
What is a GPU/TPU? GPU (Graphics Processing Unit) TPU (Tensor Processing Unit)
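PyTorch can run on CPUs and GPUs; a common pattern is to write device-agnostic code, so the same code uses a GPU when one is available and falls back to the CPU otherwise (a minimal sketch):

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3])
tensor = tensor.to(device)  # move the tensor to the GPU if one is available
print(tensor.device)        # cuda:0 or cpu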
“What is a tensor?”
Neural Networks (the same pipeline, revisited): Inputs → Numerical encoding → Learns representation (patterns/features/weights) → Representation outputs → Outputs. Before data gets used with an algorithm, it needs to be turned into numbers. The numerical encodings (e.g. [[116, 78, 15], [117, 43, 96], [125, 87, 23], …]) and representation outputs (e.g. [[0.983, 0.004, 0.013], [0.110, 0.889, 0.001], [0.023, 0.027, 0.985], …]) are tensors!
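For example, an image is often encoded as a tensor of shape [colour_channels, height, width]. A sketch, with random values standing in for real pixel values (the 224x224 size is just an illustrative choice):

import torch

image = torch.rand(size=(3, 224, 224))  # [colour_channels, height, width]
print(image.shape)  # torch.Size([3, 224, 224])
print(image.ndim)   # 3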
“What are we going to cover?”
Source: @elonmusk Twitter
What we’re going to cover (broadly):
• Now: PyTorch basics & fundamentals (dealing with tensors and tensor operations)
• Later:
  • Preprocessing data (getting it into tensors)
  • Building and using pretrained deep learning models
  • Fitting a model to the data (learning patterns)
  • Making predictions with a model (using patterns)
  • Evaluating model predictions
  • Saving and loading models
  • Using a trained model to make predictions on custom data
How: 👩‍🔬 👩‍🍳 (we’ll be cooking up lots of code!)
What we’re going to cover: a PyTorch workflow (one of many).
“How should I approach this course?”
How to approach this course:
1. Code along. Motto #1: if in doubt, run the code!
2. Explore and experiment. Motto #2: Experiment, experiment, experiment!
3. Visualize what you don’t understand. Motto #3: Visualize, visualize, visualize!
4. Ask questions (including the “dumb” ones)
5. Do the exercises 🛠
6. Share your work 🤗
How not to approach this course. Avoid: “I can’t learn ______” 🧠🔥🔥🔥
Resources:
• Course materials (this course): https://www.github.com/mrdbourke/pytorch-deep-learning
• Course Q&A: https://www.github.com/mrdbourke/pytorch-deep-learning/discussions
• Course online book: https://learnpytorch.io
• All things PyTorch: PyTorch website & forums
Let’s code!
Tensor dimensions:

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

torch.Size([1, 3, 3])

• dim=0: the outermost brackets (size 1)
• dim=1: the rows (size 3)
• dim=2: the elements within each row (size 3)
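Reproducing the tensor above in code to inspect each dimension:

import torch

tensor = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])

print(tensor.shape)     # torch.Size([1, 3, 3])
print(tensor[0])        # index into dim=0 -> the inner 3x3 matrix
print(tensor[0][1])     # then dim=1 -> tensor([3, 6, 9])
print(tensor[0][1][2])  # then dim=2 -> tensor(9)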
Matrix multiplication: torch.matmul(A, B)

A = [[5, 0, 3],      B = [[4, 7],
     [3, 7, 9],           [6, 8],
     [3, 5, 2]]           [8, 1]]
      (3x3)               (3x2)

Result (3x2) = [[ 44, 38],
                [126, 86],
                [ 58, 63]]

Each element of the result is a dot product of a row of A with a column of B. Labelling the elements [[A, B, C], [D, E, F], [G, H, I]] and [[J, K], [L, M], [N, O]], the result is:

[[A*J + B*L + C*N, A*K + B*M + C*O],
 [D*J + E*L + F*N, D*K + E*M + F*O],
 [G*J + H*L + I*N, G*K + H*M + I*O]]

For example, the first element: 5*4 + 0*6 + 3*8 = 20 + 0 + 24 = 44.
Two rules: the numbers on the inside must match (3x3 @ 3x2 works because the inner 3s match), and the new size is the same as the outside numbers (3x3 @ 3x2 → 3x2).
For a live demo, check out www.matrixmultiplication.xyz
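The same multiplication in PyTorch, using the values from above:

import torch

A = torch.tensor([[5, 0, 3],
                  [3, 7, 9],
                  [3, 5, 2]])  # 3x3

B = torch.tensor([[4, 7],
                  [6, 8],
                  [8, 1]])     # 3x2

print(torch.matmul(A, B))
# tensor([[ 44,  38],
#         [126,  86],
#         [ 58,  63]])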
Supervised learning (overview):
1. Initialise with random weights (only at the beginning), e.g. [[0.092, 0.210, 0.415], [0.778, 0.929, 0.030], [0.019, 0.182, 0.555], …]
2. Show examples: Inputs → Numerical encoding, e.g. [[116, 78, 15], [117, 43, 96], [125, 87, 23], …]
3. Update representation outputs, e.g. [[0.983, 0.004, 0.013], [0.110, 0.889, 0.001], [0.023, 0.027, 0.985], …] → Outputs: Ramen, Spaghetti
4. Repeat with more examples
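These four steps map onto a typical PyTorch training loop. A minimal sketch (the data, model, loss function and optimizer here are hypothetical placeholders):

import torch
from torch import nn

X = torch.rand(100, 3)  # hypothetical numerically-encoded inputs
y = torch.rand(100, 1)  # hypothetical ideal outputs (labels)

model = nn.Linear(in_features=3, out_features=1)  # 1. initialised with random weights
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):       # 4. repeat with more examples
    y_pred = model(X)          # 2. show examples (forward pass)
    loss = loss_fn(y_pred, y)  # measure how wrong the representation outputs are
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()           # 3. update the weights/representation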
Tensor attributes:
• Shape: the length (number of elements) of each of the dimensions of a tensor. Code: tensor.shape (or tensor.size())
• Rank/dimensions: the total number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, a tensor has rank n. Code: tensor.ndim (or tensor.dim())
• Specific axis or dimension (e.g. “1st axis” or “0th dimension”): a particular dimension of a tensor. Code: tensor[0], tensor[:, 1], …
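A quick demonstration of rank going from 0 to 3 (the scalar/vector/MATRIX/TENSOR names are just illustrative):

import torch

scalar = torch.tensor(7)
vector = torch.tensor([7, 7])
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
TENSOR = torch.tensor([[[1, 2, 3],
                        [4, 5, 6]]])

print(scalar.ndim, vector.ndim, MATRIX.ndim, TENSOR.ndim)  # 0 1 2 3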