DEEP LEARNING
UNIT - III
By,
DR. HIMANI DESHPANDE (TSEC, MUMBAI)
UNIT – III AUTOENCODERS
3.1
Introduction, Linear Autoencoder, Undercomplete Autoencoder, Overcomplete
Autoencoders, Regularization in Autoencoders
3.2
Denoising Autoencoders, Sparse Autoencoders, Contractive Autoencoders
3.3
Application of Autoencoders: Image Compression
AUTOENCODERS
¡ An autoencoder is a type of artificial neural network used to learn efficient
codings of unlabeled data (unsupervised learning).
AUTOENCODERS
¡ Autoencoders are designed to reproduce their input, especially for images.
¡ The key point is to reproduce the input from a learned encoding.
AUTOENCODER ARCHITECTURE
AUTOENCODER
HIGHLIGHT NOTES
¡ Just as we highlight and learn the important points for exams instead of learning the whole book chapter, an autoencoder focuses on reproducing the significant information, with some loss.
AUTO ENCODERS
AUTOENCODERS
AUTOENCODERS
AUTOENCODERS
Why can't we just copy the input to the output? Because then the latent layers will not learn anything.
AUTOENCODERS
PCA AND AUTOENCODERS
PCA AND AUTOENCODERS
PCA AND AUTOENCODERS
APPLICATION
SELF DRIVING CARS
SELF DRIVING CARS
PROPERTIES OF AUTOENCODERS
¡ DATA SPECIFIC
¡ UNSUPERVISED
¡ LOSSY
PROPERTIES OF AUTOENCODERS
• Autoencoders are data-specific, which means that they will only be able to compress
data similar to what they have been trained on.
• For example, an autoencoder trained on pictures of faces would do a rather poor job of
compressing pictures of trees, because the features it would learn would be face-specific.
• Autoencoders are lossy, which means that the decompressed outputs will be
degraded compared to the original inputs.
• Autoencoders are learned automatically from data examples, which is a useful
property: it means that it is easy to train specialized instances of the algorithm that will
perform well on a specific type of input. It doesn’t require any new engineering, just
appropriate training data.
PARTS OF AUTO ENCODERS
¡ Encoder : This part of the network encodes or compresses the input data into a
latent-space representation. The compressed data typically looks garbled, nothing
like the original data.
¡ Decoder : This part of network decodes or reconstructs the encoded data(latent
space representation) back to original dimension. The decoded data is a lossy
reconstruction of the original data.
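As an illustrative sketch (not from the original slides), the encoder and decoder can be built as two separate Keras models; the layer sizes (784-dimensional input, 32-dimensional code) are assumptions chosen for illustration:

from keras.layers import Input, Dense
from keras.models import Model

# Encoder: compresses the 784-dimensional input into a 32-dimensional latent code
inp = Input(shape=(784,))
code = Dense(32, activation='relu')(inp)
encoder = Model(inp, code)

# Decoder: reconstructs the original 784 dimensions from the latent code
latent = Input(shape=(32,))
recon = Dense(784, activation='sigmoid')(latent)
decoder = Model(latent, recon)

# Full autoencoder: encoder followed by decoder
autoencoder = Model(inp, decoder(encoder(inp)))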
AUTO-ENCODER
¡ An NN encoder maps the input object (e.g. a 28 × 28 = 784-pixel image) to a code: a compact representation of the input, usually with fewer than 784 dimensions.
¡ An NN decoder takes the code and can reconstruct the original object.
¡ The encoder and decoder are learned together.
AUTOENCODERS
¡ The input x passes through the encoder (weights W) to produce the code c in the hidden layer (here linear), also called the bottleneck layer; the code then passes through the decoder (weights W') to produce the reconstruction x̂ at the output layer.
¡ Training minimizes ‖x − x̂‖², i.e. makes the reconstruction as close as possible to the input.
¡ The output of the hidden layer is the code.
TRAINING AUTOENCODERS
¡ Code size
¡ Number of layers
¡ Number of nodes per layer
¡ Loss function
TRAINING AUTOENCODERS
If we are working with image data, the most popular loss
functions for reconstruction are MSE Loss and L1 Loss.
In case the inputs and outputs are within the range [0,1], as in
MNIST, we can also make use of Binary Cross Entropy as the
reconstruction loss.
TRAINING AUTOENCODERS
You need to set 4 hyperparameters before training an autoencoder:
1.Code size: The code size or the size of the bottleneck is the most important hyperparameter used to tune the
autoencoder. The bottleneck size decides how much the data has to be compressed. This can also act as a
regularisation term.
2.Number of layers: Like all neural networks, an important hyperparameter to tune autoencoders is the depth
of the encoder and the decoder. While a higher depth increases model complexity, a lower depth is faster to
process.
3.Number of nodes per layer: The number of nodes per layer defines the weights we use per layer. Typically,
the number of nodes decreases with each subsequent layer in the autoencoder as the input to each of these
layers becomes smaller across the layers.
4.Reconstruction Loss: The loss function we use to train the autoencoder is highly dependent on the type of
input and output we want the autoencoder to adapt to. If we are working with image data, the most popular
loss functions for reconstruction are MSE Loss and L1 Loss. In case the inputs and outputs are within the range
[0,1], as in MNIST, we can also make use of Binary Cross Entropy as the reconstruction loss.
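As a hedged sketch (the specific sizes and values below are illustrative assumptions, not values from the slides), these four hyperparameters might be set in Keras as follows:

from keras.layers import Input, Dense
from keras.models import Model

code_size = 32                                    # 1. code size (bottleneck)
# 2. number of layers and 3. number of nodes per layer: 784 -> 128 -> 32 -> 128 -> 784
inp = Input(shape=(784,))
h = Dense(128, activation='relu')(inp)
code = Dense(code_size, activation='relu')(h)
h = Dense(128, activation='relu')(code)
out = Dense(784, activation='sigmoid')(h)

autoencoder = Model(inp, out)
# 4. reconstruction loss: MSE here; 'binary_crossentropy' for [0,1] inputs such as MNIST
autoencoder.compile(optimizer='adam', loss='mse')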
AUTO ENCODERS ARCHITECTURE
AUTOENCODER
¡ Autoencoders resemble multilayer perceptron neural networks: like multilayer perceptrons, autoencoders have an input layer, some hidden layers, and an output layer.
¡ The key difference between a multilayer
perceptron network and an
autoencoder is that the output layer of
an autoencoder has the same number
of neurons as that of the input layer.
AUTOENCODER
h = g(W xᵢ + b)
x̂ᵢ = f(W* h + c)
The model is trained to minimize a loss function which ensures that x̂ᵢ is close to xᵢ.
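A minimal NumPy sketch of this forward pass (an illustration, not from the slides), assuming sigmoid activations for g and f and tied decoder weights W* = Wᵀ:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, d, k = 100, 784, 32              # examples, input dim, code dim (assumed sizes)
X = np.random.rand(n, d)
W = np.random.randn(d, k) * 0.01    # encoder weights
b = np.zeros(k)                     # encoder bias
c = np.zeros(d)                     # decoder bias

H = sigmoid(X @ W + b)              # h = g(W x + b)
X_hat = sigmoid(H @ W.T + c)        # x_hat = f(W* h + c), with W* = W.T (tied weights)

loss = np.mean((X - X_hat) ** 2)    # reconstruction loss keeping x_hat close to x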
AUTO ENCODERS ARCHITECTURE
AUTO ENCODERS ARCHITECTURE
ENCODER
TYPES OF AE
¡ Linear Autoencoder
¡ Undercomplete Autoencoder
¡ Overcomplete Autoencoder
¡ Denoising Autoencoder
¡ Sparse Autoencoder
¡ Contractive Autoencoder
LINEAR AUTOENCODERS
¡ A linear autoencoder is a type of autoencoder that uses only linear transformations,
such as matrix multiplication and addition, to compress and reconstruct the data.
¡ A linear autoencoder consists of two parts: an encoder and a decoder. The encoder
takes the input data and maps it to a lower-dimensional space. The decoder then
takes the compressed representation and reconstructs the original data.
¡ The goal of training a linear autoencoder is to minimize the reconstruction error
between the input and output:
L(x, x') = ||x - x'||^2
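A hedged Keras sketch of a linear autoencoder (layer sizes are illustrative assumptions): both the encoder and the decoder use linear activations only, and training minimizes the squared reconstruction error above.

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(784,))
code = Dense(32, activation='linear')(inp)       # linear encoder: a matrix multiply plus bias
out = Dense(784, activation='linear')(code)      # linear decoder
linear_autoencoder = Model(inp, out)
linear_autoencoder.compile(optimizer='adam', loss='mse')   # L(x, x') = ||x - x'||^2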
LINEAR AUTOENCODERS
¡ A linear autoencoder is a type of autoencoder that uses only linear transformations.
In other words, the encoder and decoder are composed of only linear layers. The
advantage of using a linear autoencoder is that it is computationally efficient and can
be trained on large datasets.
LINEAR AUTOENCODERS
UNDERCOMPLETE AUTOENCODER
UNDERCOMPLETE AUTOENCODER
Ø Let us consider the case where dim(h) < dim(xᵢ)
Ø If we are still able to reconstruct x̂ᵢ perfectly from h, then what does it say about h?
Ø h is a loss-free encoding of xᵢ; it captures all the important characteristics of xᵢ
An autoencoder where dim(h) < dim(xᵢ) is called an undercomplete autoencoder
UNDERCOMPLETE AUTOENCODERS
¡ An undercomplete autoencoder is one of the simplest types of autoencoders.
¡ The way it works is very straightforward: an undercomplete autoencoder takes in an image and tries to predict the same image as output, thus reconstructing the image from the compressed bottleneck region.
¡ Undercomplete autoencoders are truly unsupervised as they do not take any form of
label, the target being the same as the input.
UNDERCOMPLETE AE
OVERCOMPLETE AUTOENCODER
APPLICATION
¡ Applications of undercomplete autoencoders include compression, recommendation
systems as well as outlier detection.
OVERCOMPLETE AUTOENCODER
OVERCOMPLETE AUTOENCODER
Ø In such a case the autoencoder could learn a trivial encoding by simply copying xᵢ into h and then copying h into x̂ᵢ
Ø Such an identity encoding is useless in practice as it does not really tell us anything about the important characteristics of the data
An autoencoder where dim(h) ≥ dim(xᵢ) is called an overcomplete autoencoder
APPLICATION
¡ Overcomplete autoencoders have very rare applications, mostly hypothetical scenarios.
¡ BMI → Height & Weight calculation.
¡ Sometimes, in spite of knowing the BMI, we might need our network to gain knowledge about height or weight.
The sparse autoencoder is based on regularization.
UNDERCOMPLETE AND OVERCOMPLETE AUTOENCODERS
ARCHITECTURE
The only difference between the two is in the encoding
output's size.
UNDERCOMPLETE AND OVERCOMPLETE AUTOENCODERS
¡ We can make our latent-space representation learn useful features by giving it smaller dimensions than the input data. In this case the autoencoder is undercomplete. By training an undercomplete representation, we force the autoencoder to learn the most salient features of the training data. If we give the autoencoder too much capacity (for example, almost the same dimensions for the input data and the latent space), then it will just learn the copying task without extracting useful features or information from the data.
¡ If the dimension of the latent space is equal to or greater than that of the input data, the autoencoder is overcomplete. In that case even a linear encoder and linear decoder can learn to copy the input to the output without learning anything useful about the data distribution.
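An illustrative Keras sketch (the dimensions are assumptions) showing that the only structural difference between the two is the size of the code relative to the input:

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(784,))

# Undercomplete: dim(h) < dim(x), forces the network to learn salient features
under_code = Dense(32, activation='relu')(inp)
undercomplete = Model(inp, Dense(784, activation='sigmoid')(under_code))

# Overcomplete: dim(h) >= dim(x), can learn a trivial copy unless regularized
over_code = Dense(1024, activation='relu')(inp)
overcomplete = Model(inp, Dense(784, activation='sigmoid')(over_code))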
STACKED AUTOENCODER
A stacked autoencoder is a neural network consisting of several layers of sparse autoencoders, where the output of each hidden layer is connected to the input of the successive hidden layer.
Regularization in
Autoencoders
It is fine to lose some data in this case
REGULARIZATION
¡ A reliable autoencoder must make a trade-off between two important properties:
• Sensitive enough to its inputs so that it can accurately reconstruct the input data
• Able to generalize well, even when evaluated on unseen data
¡ As a result, the loss function of such an autoencoder is composed of two different parts.
¡ The first part is the reconstruction loss (e.g. mean squared error) measuring the difference between the input data and the output data.
¡ The second part acts as a regularization term, which prevents the autoencoder from overfitting.
REGULARIZATION
¡ Regularization helps with the effects of out-of-control parameters by using different methods to
minimize parameter size over time.
¡ Regularization coefficients L1 and L2 help fight overfitting by making certain weights smaller. Smaller-
valued weights lead to simpler hypotheses, which are the most generalizable.
¡ Unregularized weights with several higher-order polynomials in the feature sets tend to overfit the
training set.
REGULARIZATION
ü While poor generalization could happen even in undercomplete autoencoders, it is an even more serious problem for overcomplete autoencoders
ü Here, the model can simply learn to copy xᵢ to h and then h to x̂ᵢ
ü To avoid poor generalization, we need to introduce regularization
REGULARIZATION
¡ The simplest solution is to add an L2-regularization term to the objective function.
'm' is the number of rows and 'n' the number of columns of the data matrix X (the images).
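A plausible form of this regularized objective (reconstructed here for clarity; the exact notation on the original slide may differ), assuming a squared reconstruction error over m examples with n components each:

\min_{W, W^{*}, b, c} \; \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( \hat{x}_{ij} - x_{ij} \right)^{2} \; + \; \lambda \, \|\Theta\|^{2}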
REGULARIZATION
¡ The simplest solution is to add an L2-regularization term to the objective function.
Θ = combination of all the weights and biases
Θ = [ w1, w2, w3, ….. ]
REGULARIZATION
¡ The regularized autoencoders use a loss function that helps the model to have other
properties besides copying input to the output.
¡ We can generally find two types of regularized autoencoder:
¡ the denoising autoencoder and
¡ the sparse autoencoder.
DENOISING AUTOENCODERS
¡ Autoencoders are neural networks which are commonly used for feature selection and extraction. However, when there are more nodes in the hidden layer than there are inputs, the network risks learning the so-called "Identity Function", also called the "Null Function", meaning that the output equals the input, making the autoencoder useless.
¡ Denoising autoencoders solve this problem by corrupting the data on purpose, randomly turning some of the input values to zero. In general, the percentage of input nodes which are set to zero is about 50%. Other sources suggest a lower count, such as 30%. It depends on the amount of data and input nodes you have.
¡ When calculating the loss function, it is important to compare the output values with the original input, not with the corrupted input. That way, the risk of learning the identity function instead of extracting features is eliminated.
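A hedged NumPy sketch of this corruption step (the 50% rate and the array shapes are assumptions for illustration); note that training compares the reconstruction against the original, uncorrupted input:

import numpy as np

rng = np.random.default_rng(0)
x_clean = rng.random((1000, 784))                 # original (clean) inputs

# Corrupt on purpose: randomly set about 50% of the input values to zero
mask = rng.random(x_clean.shape) > 0.5
x_corrupted = x_clean * mask

# Train on corrupted inputs but clean targets, e.g. in Keras:
# autoencoder.fit(x_corrupted, x_clean, epochs=10, batch_size=256)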
DENOISING
AUTOENCODERS
¡ Denoising autoencoders are a robust variant of the standard autoencoder.
¡ They have the same structure as a standard autoencoder but are trained using samples to which some amount of noise has been added.
¡ Thus, we map these noisy samples to their clean versions.
¡ This ensures that the network doesn't learn an identity mapping, which would be pointless.
¡ So, to summarise, denoising autoencoders are used where you want to learn a more robust latent representation for a particular set of input data.
DENOISING AUTOENCODER
DENOISING AUTOENCODER
SPARSE AUTOENCODER
¡ A sparse autoencoder is simply an autoencoder whose training criterion involves a sparsity
penalty. In most cases, we would construct our loss function by penalizing activations of
hidden layers so that only a few nodes are encouraged to activate when a single sample is
fed into the network.
¡ There are actually two different ways to construct our sparsity penalty:
¡ L1 regularization and
¡ KL-divergence.
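A hedged Keras sketch of the L1 variant (the layer sizes and the penalty weight 1e-5 are illustrative assumptions): the activity regularizer penalizes the code-layer activations so that only a few units are active for each sample.

from keras.layers import Input, Dense
from keras.models import Model
from keras import regularizers

inp = Input(shape=(784,))
# L1 penalty on the *activations* of the code layer encourages sparsity
code = Dense(64, activation='relu',
             activity_regularizer=regularizers.l1(1e-5))(inp)
out = Dense(784, activation='sigmoid')(code)
sparse_autoencoder = Model(inp, out)
sparse_autoencoder.compile(optimizer='adam', loss='binary_crossentropy')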
SPARSE AUTOENCODER
SPARSE AUTOENCODER
SPARSE AUTOENCODER
¡ A sparse autoencoder uses a sparsity enforcer that directs a single-layer network to learn a code dictionary which minimizes the error in reproducing the input while constraining the number of code words used for reconstruction.
¡ The sparse autoencoder consists of a single hidden layer, which is connected to the input vector by a weight matrix forming the encoding step. The hidden layer outputs to a reconstruction vector, using a tied weight matrix to form the decoder.
CONTRACTIVE AE
CONTRACTIVE AUTOENCODER
¡ A Contractive Autoencoder is an autoencoder that adds a penalty term to the classical reconstruction cost
function.
¡ This penalty term corresponds to the Frobenius norm of the Jacobian matrix of the encoder activations with
respect to the input.
The Frobenius norm of a matrix is defined as the square root of the sum of the squares of the elements of the matrix. To compute it: find the sum of the squares of the elements and take the square root of the calculated value.
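In symbols (added for clarity), for a matrix A with entries a_{ij}:

\|A\|_F = \sqrt{\sum_{i} \sum_{j} a_{ij}^{2}}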
CONTRACTIVE AUTOENCODER
¡ The Contractive Autoencoder was proposed by researchers at the Université de Montréal in 2011 in the paper Contractive Auto-Encoders: Explicit Invariance During Feature Extraction. The idea behind it is to make autoencoders robust to small changes in the training dataset.
¡ To deal with the above challenge posed by basic autoencoders, the authors proposed to add another penalty term to the loss function of the autoencoder.
¡ The loss function:
The contractive autoencoder adds an extra term to the loss function of the autoencoder, as given below. This penalty term is the Frobenius norm of the Jacobian of the encoder; the Frobenius norm is just a generalization of the Euclidean norm to matrices.
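A standard way to write this loss (reconstructed here; the original slide shows it as an image), where h = f(x) is the encoder output and λ weights the penalty:

L(x, \hat{x}) = \|x - \hat{x}\|^{2} + \lambda \, \|J_f(x)\|_F^{2},
\qquad \|J_f(x)\|_F^{2} = \sum_{ij} \left( \frac{\partial h_j(x)}{\partial x_i} \right)^{2}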
DENOISING AND CONTRACTIVE AUTOENCODER
¡ There is a connection between the denoising autoencoder and the contractive autoencoder:
¡ the denoising reconstruction error is equivalent to a contractive penalty on the reconstruction function that maps x to r = g(f(x)).
¡ In other words, denoising autoencoders make the reconstruction function resist small but finite sized
perturbations of the input, whereas contractive autoencoders make the feature extraction function resist
infinitesimal perturbations of the input.
DEEP AUTO-ENCODER
¡ Of course, the auto-encoder can be deep: the encoder stacks several layers (weights W₁, W₂, …) down to a bottleneck code layer, and the decoder mirrors them (W₂', W₁', …) back up to the output layer, whose output x̂ is trained to be as close as possible to the input x.
¡ Symmetric (tied) encoder and decoder weights are not necessary.
¡ Such deep auto-encoders can be initialized layer-by-layer by RBMs.
APPLICATIONS
¡ Watermark removal
¡ Denoising
¡ Dimensionality reduction
¡ Image compression
¡ Image colorization
¡ Feature variation
USE OF AUTO ENCODERS
¡ Data denoising and dimensionality reduction for data visualization are considered the two main practical applications of autoencoders. With appropriate dimensionality and sparsity constraints, autoencoders can learn data projections that are more interesting than PCA or other basic techniques.
¡ Autoencoders can also be used for image reconstruction, basic image colorization, data compression, converting gray-scale images to colored images, generating higher-resolution images, etc.
APPLICATIONS
Autoencoders present an efficient way to learn a representation of
your data that focuses on the signal, not the noise. You can use them
for a variety of tasks such as:
•Image Compression
•Dimensionality reduction
•Feature extraction
•Denoising of data/images
•Imputing missing data
IMAGE COMPRESSION
IMAGE COMPRESSION
¡ Autoencoders are a deep learning model for transforming data from a high-
dimensional space to a lower-dimensional space. They work by encoding the data,
whatever its size, to a 1-D vector. This vector can then be decoded to reconstruct
the original data (in this case, an image).
¡ An autoencoder consists of two parts: an encoder network and a decoder network.
The encoder network compresses the input data, while the decoder network
reconstructs the compressed data back into its original form. The compressed data,
also known as the bottleneck layer, is typically much smaller than the input data.
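As a hedged illustration (the architecture below is an assumed example, not from the slides), the encoder half alone can be used to compress images into their bottleneck representation, and the full model to reconstruct them:

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(784,))
code = Dense(32, activation='relu')(inp)              # bottleneck: 784 -> 32 (about 24x smaller)
out = Dense(784, activation='sigmoid')(code)
autoencoder = Model(inp, out)

encoder = Model(inp, code)                            # compression model
# compressed = encoder.predict(images)                # 32 numbers per image instead of 784
# reconstructed = autoencoder.predict(images)         # lossy reconstruction of the originals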
AUTOENCODERS: APPLICATIONS
¡ Denoising: input clean image + noise and train to reproduce the clean image.
AUTOENCODERS: APPLICATIONS
¡ Image colorization: input black and white and train to produce color images
AUTOENCODERS: APPLICATIONS
¡ Watermark removal
FEATURE VARIATION
DIMENSIONALITY REDUCTION
PROPERTIES OF AUTOENCODERS
¡ Data-specific: Autoencoders are only able to compress data similar to what they have been trained on.
¡ Lossy: The decompressed outputs will be degraded compared to the original inputs.
¡ Learned automatically from examples: It is easy to train specialized instances of the algorithm that will
perform well on a specific type of input.
https://www.edureka.co/blog/autoencoders-tutorial/
CAPACITY
¡ As with other NNs, overfitting is a problem when capacity is too large for the data.
¡ Autoencoders address this through some combination of:
¡ Bottleneck layer – fewer degrees of freedom than in possible outputs.
¡ Training to denoise.
¡ Sparsity through regularization.
¡ Contractive penalty.
BOTTLENECK LAYER (UNDERCOMPLETE)
¡ Suppose the input images are n×n and the latent space has dimension m < n×n.
¡ Then the latent space is not sufficient to reproduce all images.
¡ Needs to learn an encoding that captures the important features in training data, sufficient for approximate
reconstruction.
SIMPLE BOTTLENECK LAYER IN KERAS
from keras.layers import Input, Dense
from keras.models import Model

input_img = Input(shape=(784,))            # 28x28 images, flattened
encoding_dim = 32                          # size of the bottleneck code
encoded = Dense(encoding_dim, activation='relu')(input_img)    # encoder
decoded = Dense(784, activation='sigmoid')(encoded)            # decoder
autoencoder = Model(input_img, decoded)
¡ Maps 28x28 images into a 32-dimensional vector.
¡ Can also use more layers and/or convolutions.
https://blog.keras.io/building-autoencoders-in-keras.html
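A possible training recipe for the model above, along the lines of the cited Keras blog (the epoch count and batch size are assumptions):

from keras.datasets import mnist
import numpy as np

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0   # flatten and scale to [0,1]
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,                # input and target are the same image
                epochs=50, batch_size=256,
                validation_data=(x_test, x_test))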
DENOISING AUTOENCODERS
¡ Basic autoencoder trains to minimize the loss between x and the reconstruction g(f(x)).
¡ Denoising autoencoders train to minimize the loss between x and g(f(x+w)), where w is random noise.
¡ Same possible architectures, different training data.
¡ Kaggle has a dataset on damaged documents.
https://blog.keras.io/building-autoencoders-in-keras.html
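A brief sketch of preparing such training data (the noise level 0.5 is an assumption, and x_train is assumed to be prepared as in the MNIST example above): the corrupted x + w is the input and the clean x is the target.

import numpy as np

noise_factor = 0.5
x_noisy = x_train + noise_factor * np.random.normal(size=x_train.shape)   # x + w
x_noisy = np.clip(x_noisy, 0.0, 1.0)

# Same architecture as before, different training data:
# autoencoder.fit(x_noisy, x_train, epochs=50, batch_size=256)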
DENOISING AUTOENCODERS
¡ Denoising autoencoders can’t simply memorize the input output relationship.
¡ Intuitively, a denoising autoencoder learns a projection from a neighborhood of our
training data back onto the training data.
SPARSE AUTOENCODERS
¡ Construct a loss function to penalize activations within a layer.
¡ Usually regularize the weights of a network, not the activations.
¡ Individual nodes of a trained model that activate are data-dependent.
¡ Different inputs will result in activations of different nodes through the network.
¡ Selectively activate regions of the network depending on the input data.
https://www.jeremyjordan.me/autoencoders/
SPARSE AUTOENCODERS
¡ Construct a loss function to penalize activations within the network.
¡ L1 regularization: penalize the absolute value of the vector of activations a in layer h for observation i
¡ KL divergence: use the cross-entropy between the average activation and the desired activation
https://www.jeremyjordan.me/autoencoders/
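Written out (added for clarity), with a^(h)_i the activations of layer h for observation i, ρ the desired average activation, and ρ̂_j the observed average activation of hidden unit j:

\Omega_{L1} = \lambda \sum_{i} \left| a^{(h)}_{i} \right|

\Omega_{KL} = \sum_{j} \rho \log \frac{\rho}{\hat{\rho}_{j}} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_{j}}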
CONTRACTIVE AUTOENCODERS
¡ Arrange for similar inputs to have similar activations.
¡ I.e., the derivatives of the hidden-layer activations are small with respect to the input.
¡ Denoising autoencoders make the reconstruction function (encoder+decoder) resist
small perturbations of the input
¡ Contractive autoencoders make the feature extraction function (ie. encoder) resist
infinitesimal perturbations of the input.
https://www.jeremyjordan.me/autoencoders/
CONTRACTIVE AUTOENCODERS
¡ Contractive autoencoders make the feature extraction function (ie. encoder) resist infinitesimal perturbations of
the input.
https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf
AUTOENCODERS
¡ Both the denoising and contractive autoencoder can perform well
¡ Advantage of the denoising autoencoder: simpler to implement; it requires adding only one or two lines of code to a regular autoencoder, and there is no need to compute the Jacobian of the hidden layer
¡ Advantage of the contractive autoencoder: the gradient is deterministic; second-order optimizers (conjugate gradient, LBFGS, etc.) can be used, and it might be more stable than the denoising autoencoder, which uses a sampled gradient
¡ To learn more on contractive autoencoders:
¡ Contractive Auto-Encoders: Explicit Invariance During Feature Extraction. Salah Rifai, Pascal Vincent, Xavier Muller, Xavier Glorot and Yoshua Bengio, 2011.
https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf