Discovering
activation functions
between layers
INTRODUCTION TO DEEP LEARNING WITH PYTORCH
Maham Faisal Khan
Senior Data Science Content Developer
Limitations of the sigmoid and softmax functions
Sigmoid function:
Bounded between 0 and 1
Can be used anywhere in the network
Gradients:
Approach zero for low and high values of x
Cause the function to saturate
Sigmoid function saturation can lead to
vanishing gradients during backpropagation.
This is also a problem for softmax.
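As a quick numerical illustration of saturation (a minimal sketch, not from the slides): at a large input, the gradient of the sigmoid is effectively zero.
import torch

x = torch.tensor(10.0, requires_grad=True)
torch.sigmoid(x).backward()
print(x.grad)   # roughly 4.5e-05: the gradient has almost vanished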
Introducing ReLU
Rectified Linear Unit (ReLU):
f(x) = max(x, 0)
for positive inputs, the output is equal to
the input
for strictly negative inputs, the output is
equal to zero
overcomes the vanishing gradients problem
In PyTorch:
relu = nn.ReLU()
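For example (a minimal sketch, not from the slides), applying ReLU to a tensor zeroes out the negative entries while leaving positive ones unchanged:
import torch
import torch.nn as nn

relu = nn.ReLU()
x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
print(relu(x))   # negative inputs become 0, positive inputs pass through: [0., 0., 0., 1.5]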
Introducing Leaky ReLU
Leaky ReLU:
For positive inputs, it behaves like ReLU
For negative inputs, it multiplies the input by a small coefficient (0.01 by default)
The gradients for negative inputs are therefore never zero
In PyTorch:
leaky_relu = nn.LeakyReLU(negative_slope=0.05)
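A quick check (a minimal sketch, not from the slides) shows negative inputs being scaled by negative_slope instead of being zeroed:
import torch
import torch.nn as nn

leaky_relu = nn.LeakyReLU(negative_slope=0.05)
x = torch.tensor([-2.0, 0.0, 1.5])
print(leaky_relu(x))   # -2.0 * 0.05 = -0.1, so approximately [-0.10, 0.00, 1.50]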
Let's practice!
A deeper dive into
neural network
architecture
Maham Faisal Khan
Senior Data Science Content Developer
Layers are made of neurons
Linear layers are fully connected
Each neuron of a layer is connected to each neuron of the previous layer
A neuron of a linear layer:
computes a linear operation using all neurons of the previous layer
contains N + 1 learnable parameters, where N is the dimension of the previous layer's outputs
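To make the N + 1 count concrete (a minimal sketch with hypothetical sizes): a linear layer with 4 inputs and 2 neurons stores a 2 x 4 weight matrix plus 2 biases, i.e. 2 * (4 + 1) = 10 parameters.
import torch.nn as nn

layer = nn.Linear(4, 2)
print(layer.weight.shape)   # torch.Size([2, 4]) -- one row of 4 weights per neuron
print(layer.bias.shape)     # torch.Size([2])    -- one bias per neuron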
Layer naming convention
Tweaking the number of hidden layers
The input and output layer dimensions are fixed:
the input layer dimension depends on the number of features, n_features
the output layer dimension depends on the number of categories, n_classes
model = nn.Sequential(nn.Linear(n_features, 8),
                      nn.Linear(8, 4),
                      nn.Linear(4, n_classes))
We can use as many hidden layers as we want
Increasing the number of hidden layers = increasing the number of parameters = increasing the model capacity (see the sketch below)
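For instance (a sketch with hypothetical dimensions n_features = 16 and n_classes = 3), adding one extra hidden layer to the model above increases the parameter count:
import torch.nn as nn

n_features, n_classes = 16, 3   # hypothetical values for illustration

two_hidden = nn.Sequential(nn.Linear(n_features, 8),
                           nn.Linear(8, 4),
                           nn.Linear(4, n_classes))
three_hidden = nn.Sequential(nn.Linear(n_features, 8),
                             nn.Linear(8, 8),
                             nn.Linear(8, 4),
                             nn.Linear(4, n_classes))

print(sum(p.numel() for p in two_hidden.parameters()))     # 187 parameters
print(sum(p.numel() for p in three_hidden.parameters()))   # 259 parameters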
Counting the number of parameters
Given the following model:

model = nn.Sequential(nn.Linear(8, 4),
                      nn.Linear(4, 2))

Manually calculating the number of parameters:
the first layer has 4 neurons, each with 8 + 1 parameters = 36 parameters
the second layer has 2 neurons, each with 4 + 1 parameters = 10 parameters
total = 46 learnable parameters

Using PyTorch:
.numel() : returns the number of elements in the tensor

total = 0
for parameter in model.parameters():
    total += parameter.numel()
print(total)

46
Let's practice!
Learning rate and
momentum
Maham Faisal Khan
Senior Data Science Content Developer
Updating weights with SGD
Training a neural network = solving an optimization problem.
Stochastic Gradient Descent (SGD) optimizer
import torch.optim as optim

sgd = optim.SGD(model.parameters(), lr=0.01, momentum=0.95)
Two parameters:
learning rate: controls the step size
momentum: controls the inertia of the optimizer
Bad values can lead to:
long training times
poor overall performance (low accuracy)
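A minimal sketch of how this optimizer is used in one training step (the model, loss function, and data below are placeholders, not from the slides):
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Linear(4, 2))    # placeholder model
criterion = nn.CrossEntropyLoss()         # placeholder loss function
sgd = optim.SGD(model.parameters(), lr=0.01, momentum=0.95)

features = torch.randn(8, 4)              # dummy batch of 8 samples
targets = torch.randint(0, 2, (8,))       # dummy class labels

sgd.zero_grad()                           # reset gradients from the previous step
loss = criterion(model(features), targets)
loss.backward()                           # compute gradients
sgd.step()                                # update weights using lr and momentum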
Impact of the learning rate: optimal learning rate
Impact of the learning rate: small learning rate
Impact of the learning rate: high learning rate
Without momentum
lr = 0.01, momentum = 0: after 100 steps, the minimum found is at x = -1.23 and y = -0.14
With momentum
lr = 0.01, momentum = 0.9: after 100 steps, the minimum found is at x = 0.92 and y = -2.04
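The plots above were generated from a specific function that is not reproduced here; the sketch below runs the same kind of comparison on a hypothetical non-convex function, so the exact numbers will differ:
import torch
import torch.optim as optim

def minimize(momentum):
    # Hypothetical 1-D non-convex function with a shallow minimum near x = 1.13
    # and a deeper minimum near x = -1.30 (not the function from the slides)
    x = torch.tensor(2.0, requires_grad=True)
    optimizer = optim.SGD([x], lr=0.01, momentum=momentum)
    for _ in range(100):
        optimizer.zero_grad()
        loss = x**4 - 3 * x**2 + x
        loss.backward()
        optimizer.step()
    return x.item()

print(minimize(momentum=0.0))   # tends to settle in the nearby shallow minimum
print(minimize(momentum=0.9))   # accumulated velocity can carry it past the bump toward the deeper minimum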
Summary
Learning rate
Controls the step size
Too small leads to long training times
Too high leads to poor performance
Typical values between 10⁻² and 10⁻⁴

Momentum
Controls the inertia
Null momentum can lead to the optimizer being stuck in a local minimum
Non-null momentum can help find the function minimum
Typical values between 0.85 and 0.99
Let's practice!
Layer initialization
and transfer
learning
Maham Faisal Khan
Senior Data Science Content Developer
Layer initialization
import torch.nn as nn
layer = nn.Linear(64, 128)
print(layer.weight.min(), layer.weight.max())
(tensor(-0.1250, grad_fn=<MinBackward1>), tensor(0.1250, grad_fn=<MaxBackward1>))
Layer weights are initialized to small values
Layer outputs can explode if inputs and weights are not normalized
Weights can be initialized using different methods (e.g., with a uniform distribution)
Layer initialization
import torch.nn as nn
layer = nn.Linear(64, 128)
nn.init.uniform_(layer.weight)
print(layer.weight.min(), layer.weight.max())
(tensor(0.0002, grad_fn=<MinBackward1>), tensor(1.0000, grad_fn=<MaxBackward1>))
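One way to apply an initializer to every linear layer of a model (a sketch; the zero-bias choice is just a common convention, not from the slides):
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 128),
                      nn.Linear(128, 10))

for module in model:
    if isinstance(module, nn.Linear):
        nn.init.uniform_(module.weight)   # re-initialize weights in place
        nn.init.zeros_(module.bias)       # start biases at zero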
Transfer learning and fine-tuning
Transfer learning: reusing a model trained on a first task for a second similar task, to
accelerate the training process.
import torch
layer = nn.Linear(64, 128)
torch.save(layer, 'layer.pth')
new_layer = torch.load('layer.pth')
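The snippet above saves the layer object itself; a common alternative (a sketch, not shown in the slides) is to save only the weights as a state_dict and load them into a layer of the same shape:
import torch
import torch.nn as nn

layer = nn.Linear(64, 128)
torch.save(layer.state_dict(), 'layer_weights.pth')    # hypothetical file name

new_layer = nn.Linear(64, 128)                         # must match the saved shapes
new_layer.load_state_dict(torch.load('layer_weights.pth'))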
Transfer learning and fine-tuning
Fine-tuning = A type of transfer learning
Smaller learning rate
Not every layer is trained (we freeze some of them)
Rule of thumb: freeze early layers of network and fine-tune layers closer to output layer
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 128),
                      nn.Linear(128, 256))

for name, param in model.named_parameters():
    if name == '0.weight':
        param.requires_grad = False
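Extending the idea (a sketch with assumed details): freeze every parameter of the first layer, then give the optimizer only the parameters that still require gradients:
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Linear(64, 128),
                      nn.Linear(128, 256))

for name, param in model.named_parameters():
    if name.startswith('0.'):    # '0.weight' and '0.bias' belong to the first layer
        param.requires_grad = False

# Only the unfrozen parameters are passed to the optimizer for fine-tuning
trainable = (p for p in model.parameters() if p.requires_grad)
optimizer = optim.SGD(trainable, lr=0.001, momentum=0.9)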
Let's practice!