Machine Learning - Convolutional Neural Network

Convolutional Neural Network for Visual Recognition

Outline
• Quick overview of Artificial Neural Network (ANN)
• What is Convolution? What is a Convolutional Neural Network (CNN)? Why use one?
• How does it work?
• Demo
• Code
• References
• Discussion

7/24/18 Creative Commons BY-SA-NC

Neural Network
source: http://www.kurzweilai.net/images/neuron_structure1.jpg and https://theclevermachine.files.wordpress.com/2014/09/perceptron2.png

Feedforward and Backpropagation
source: https://theclevermachine.wordpress.com/2014/09/11/a-gentle-introduction-to-artificial-neural-networks/

Activation Function
image source: https://www.gabormelli.com/RKB/Neuron_Activation_Function

Why Convolutional Neural Network?
Image source: https://www.coursera.org/lecture/convolutional-neural-networks/why-convolutions-Xv7B5
• Reduce the number of weights required for training: the same filter weights are shared across all positions in the image.
• Use filters to capture local information, which gives a more meaningful search: move from pixel recognition to pattern recognition.
• Sparsity of connections: each output value depends only on a small local region of the input, which is equivalent to most weights of a fully connected layer being 0. This can increase space and time efficiency.
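The reduction in weights can be made concrete with a little arithmetic. The sketch below compares a fully connected layer with a convolutional layer on a 32x32x3 input. The specific sizes (6 filters of 5x5 with no padding, hence a 28x28x6 output) and the function names are assumptions chosen for illustration, not values from the slide.

```python
def fc_weights(n_inputs, n_outputs):
    """Weights in a fully connected layer, with one bias per output."""
    return n_inputs * n_outputs + n_outputs


def conv_weights(filter_h, filter_w, in_channels, n_filters):
    """Weights in a conv layer: each filter (plus one bias) is reused at every position."""
    return (filter_h * filter_w * in_channels + 1) * n_filters


n_inputs = 32 * 32 * 3    # flattened input image
n_outputs = 28 * 28 * 6   # same number of output values as the assumed conv layer

fc_params = fc_weights(n_inputs, n_outputs)
conv_params = conv_weights(5, 5, 3, 6)

print("fully connected:", fc_params)   # 14455392
print("convolutional:  ", conv_params)  # 456
```

Both layers produce the same number of output values, but the convolutional layer does so with a few hundred weights because every position in the image reuses the same filters.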

What is Convolution?
Image source: https://www.youtube.com/watch?v=cOmkIsWfAcg
• In mathematics, a convolution is the integral measuring how much two functions overlap as one is shifted over the other.
• A convolution mixes two functions: one is flipped and slid across the other, and at each shift the pointwise products are integrated (summed, in the discrete case).
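For sampled signals the integral becomes a sum, (f * g)[n] = sum over k of f[k] * g[n - k]. The following is a minimal pure-Python sketch of that discrete form; the function name and the sample values are illustrative assumptions.

```python
def convolve_1d(f, g):
    """Full discrete convolution: out[n] = sum_k f[k] * g[n - k]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            # f[i] * g[j] contributes to output index n = i + j
            out[i + j] += fi * gj
    return out


signal = [1, 2, 3]
kernel = [0, 1, 0.5]
print(convolve_1d(signal, kernel))  # [0, 1, 2.5, 4.0, 1.5]
```

Accumulating into index i + j is equivalent to flipping one sequence and sliding it over the other, which is why no explicit flip appears in the loop.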

Image Convolution
image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
• Original image: function f
• Filter: function g
• Image convolution: f * g
Example: f * g for each of the filters g1, g2, ..., gn
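A small sketch of f * g on a binary image may help here. It slides a 3x3 filter over a 5x5 image with stride 1 and no padding. Note that, as in most deep-learning libraries, the filter is not flipped (strictly a cross-correlation); for the symmetric filter used here the result is the same. The image and filter values are illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


f = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
g = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
for row in conv2d_valid(f, g):
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

Each filter g1, g2, ..., gn applied this way yields its own output map, so n filters produce an output volume of depth n.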

Approach
image source: cs231n_2017_lecture5.pdf slide-38

Convolution
image source: cs231n_2017_lecture5.pdf slide-39

CNN Layers
source: partially from cs231n_2017
A simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail:
• INPUT [e.g. 32x32x3]
  - Holds the raw pixel values of the image: width 32, height 32, and three color channels R, G, B.
• CONV layer [32x32x6]
  - Holds the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume. This may result in a volume such as [32x32x6] if we decide to use 6 filters.
• RELU layer [32x32x6]
  - Applies an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x6]).
• POOL layer [16x16x6]
  - Performs a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16x16x6].
• FC (i.e. fully-connected) layer [400x1] > [120x1] > [84x1]
  - Computes the class scores, resulting in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer is connected to all the numbers in the previous volume.
Note: the original cs231n notes use 12 filters; this example uses 6.
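The volume sizes above follow from the standard output-size formula (W - F + 2P) / S + 1. The sketch below traces the INPUT, CONV, RELU and POOL shapes. The slide does not state the filter size, padding or stride, so the 5x5 filter with padding 2 and the 2x2 pooling with stride 2 are assumptions that happen to reproduce the listed sizes.

```python
def conv_output_size(size, filter_size, stride=1, padding=0):
    """Spatial output size of a conv or pool layer: (W - F + 2P) // S + 1."""
    return (size - filter_size + 2 * padding) // stride + 1


def trace_shapes(n_filters=6):
    """Return (layer name, volume shape) pairs for INPUT-CONV-RELU-POOL-FC."""
    shapes = [("INPUT", (32, 32, 3))]
    # Assumed: 5x5 filters, stride 1, padding 2 keep the 32x32 spatial size.
    side = conv_output_size(32, filter_size=5, stride=1, padding=2)
    shapes.append(("CONV", (side, side, n_filters)))
    # ReLU is elementwise, so the shape is unchanged.
    shapes.append(("RELU", (side, side, n_filters)))
    # Assumed: 2x2 pooling with stride 2 halves width and height.
    pooled = conv_output_size(side, filter_size=2, stride=2)
    shapes.append(("POOL", (pooled, pooled, n_filters)))
    # Final class scores for the 10 CIFAR-10 categories.
    shapes.append(("FC", (1, 1, 10)))
    return shapes


for name, shape in trace_shapes():
    print(name, shape)
```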

Convolution
source: cs231n
Calculation Demo: http://cs231n.github.io/convolutional-networks/

Image source: cs231n_2017_lecture5.pdf slide-39

Activation Function - ReLU
• Replaces negative values with zero: max(0, x).
• When we use ReLU, we should watch for dead units in the network (units that never activate). If there are many dead units while training our network, we might want to consider using Leaky ReLU instead.
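A minimal sketch of both activations, plus one rough way to look for dead units, is shown below. The 0.01 slope for Leaky ReLU, the helper name `dead_unit_fraction`, and the sample values are assumptions for illustration; checking a single batch only estimates which units are dead.

```python
def relu(x):
    """Standard ReLU: negative inputs become 0."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: negative inputs keep a small slope (0.01 is an assumed value)."""
    return x if x > 0 else slope * x


def dead_unit_fraction(pre_activations):
    """Fraction of units whose ReLU output is 0 for every sample in the batch.

    pre_activations[u][s] is the input to unit u for sample s.
    """
    dead = sum(
        1 for unit in pre_activations if all(relu(v) == 0.0 for v in unit)
    )
    return dead / len(pre_activations)


# Four hypothetical units, three samples each.
batch = [
    [-1.0, -0.5, -2.0],  # never positive: dead on this batch
    [0.3, -0.2, 1.1],
    [-2.0, -3.0, -0.1],  # never positive: dead on this batch
    [1.0, 2.0, 0.5],
]
print(dead_unit_fraction(batch))          # 0.5
print([leaky_relu(v) for v in batch[0]])  # small negative values, not zeros
```

With Leaky ReLU the first unit still passes a small signal (and gradient) for negative inputs, which is why it is a common remedy for dead units.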

Max-Pooling
Image source: cs231n
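Max-pooling keeps the largest value in each window and discards the rest. The following is a small sketch for one depth slice, assuming a 2x2 window with stride 2 (the most common setting); the function name and the 4x4 input values are illustrative assumptions.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Downsample one depth slice by taking the max of each size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for r in range(0, h - size + 1, stride):
        row = []
        for c in range(0, w - size + 1, stride):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window))
        out.append(row)
    return out


slice_4x4 = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
print(max_pool_2d(slice_4x4))  # [[6, 8], [3, 4]]
```

Pooling is applied to each depth slice independently, so the depth of the volume is unchanged while width and height are halved.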

Architecture Example
source: https://medium.com/machine-learning-bites/deeplearning-series-convolutional-neural-networks-a9c2f2ee1524

Conv Layer
image source: cs231n_2017_lecture5.pdf slide-39

Operation - Convolution
image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

Operation - Activation
Image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

Operation - Pooling
image source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

Architecture Example

AlexNet - Trained Filters
source: cs231n
Example filters learned by Krizhevsky et al. Each of the 96 filters shown here is of size [11x11x3], and each one is shared by the 55*55 neurons in one depth slice. Notice that the parameter sharing assumption is relatively reasonable: if detecting a horizontal edge is important at some location in the image, it should intuitively be useful at some other location as well due to the translationally-invariant structure of images. There is therefore no need to relearn to detect a horizontal edge at every one of the 55*55 distinct locations in the Conv layer output volume.
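The saving from parameter sharing can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (96 filters of size 11x11x3 and 55*55 neurons per depth slice); the one-bias-per-filter convention and the variable names are assumptions for illustration.

```python
n_filters = 96
weights_per_filter = 11 * 11 * 3       # 363 weights in one [11x11x3] filter
positions_per_slice = 55 * 55          # neurons in one depth slice

# With sharing: one set of weights (plus one bias) per filter.
shared = n_filters * (weights_per_filter + 1)

# Without sharing: every neuron would own its weights and bias.
n_neurons = positions_per_slice * n_filters
unshared = n_neurons * (weights_per_filter + 1)

print("neurons in the layer:     ", n_neurons)   # 290400
print("parameters with sharing:  ", shared)      # 34944
print("parameters without sharing:", unshared)   # 105705600
```

Sharing reduces the parameter count of this one layer by a factor of 55*55 = 3025.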

Summary
source: partially from cs231n_2017_lecture5.pdf slide-76
• Workflow
  1. Initialize all filter weights and parameters with random numbers.
  2. Use the original images as input:
     2.1 Apply filters to the original image > Conv layer
     2.2 Apply an activation function (e.g. ReLU) to the Conv layer > Feature Map
     2.3 Apply a pooling filter to the Feature Map > smaller Feature Map (optional)
     2.4 Flatten the Feature Map > Fully Connected network (FC)
     2.5 Apply ANN training (forward and backward propagation) to the FC
     2.6 Optimize the weights: calculate the error, adjust the weights, and loop over the original images until the probability of the correct class is high.
  3. Test the result. If happy, save the filters (weights and parameters) for future use; otherwise loop.
• ConvNets stack CONV, POOL, FC layers: [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX, where
  - N is usually up to ~5, M is large, 0 <= K <= 2
  - Trend towards smaller filters and deeper architectures
  - Trend towards getting rid of POOL/FC layers (just CONV)
• But!!
  - Recent advances such as ResNet/GoogLeNet challenge this paradigm.
  - The proposed Capsule Neural Network can overcome some shortcomings of ConvNets.
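The stacking pattern [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX can be read as a small recipe. The sketch below expands it into a flat list of layer names for given N, M and K; the function name `expand_pattern` and the `pool` flag (standing in for the optional "POOL?") are hypothetical helpers, not part of any library.

```python
def expand_pattern(n, m, k, pool=True):
    """Expand [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX into layer names."""
    layers = []
    for _ in range(m):
        layers += ["CONV", "RELU"] * n   # N conv/activation pairs per block
        if pool:                         # "POOL?" means pooling is optional
            layers.append("POOL")
    layers += ["FC", "RELU"] * k         # K fully connected blocks
    layers.append("SOFTMAX")             # final class probabilities
    return layers


# The simple CIFAR-10 style network: one conv block, one FC block.
print("-".join(expand_pattern(n=1, m=1, k=1)))
# CONV-RELU-POOL-FC-RELU-SOFTMAX

# A deeper variant: two conv layers before each of two pooling steps.
print("-".join(expand_pattern(n=2, m=2, k=1)))
```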

Various CNN Architectures
From https://www.jeremyjordan.me/convnet-architectures/
These architectures serve as rich feature extractors which can be used for image classification, object detection, image segmentation, and many other more advanced tasks.
Classic network architectures (included for historical purposes)
• [LeNet-5](https://www.jeremyjordan.me/convnet-architectures/#lenet5)
• [AlexNet](https://www.jeremyjordan.me/convnet-architectures/#alexnet)
• [VGG 16](https://www.jeremyjordan.me/convnet-architectures/#vgg16)
Modern network architectures
• [Inception](https://www.jeremyjordan.me/convnet-architectures/#inception)
• [ResNet](https://www.jeremyjordan.me/convnet-architectures/#resnet)
• [DenseNet](https://www.jeremyjordan.me/convnet-architectures/#densenet)

Network Performance
Source: https://www.semanticscholar.org/paper/An-Analysis-of-Deep-Neural-Network-Models-for-Canziani-Paszke/28ee688947cf9d31fc48f07a0497cd75200a9485 and https://arxiv.org/pdf/1605.07678.pdf

Reference
• [How to Select Activation Function for Deep Neural Network](https://engmrk.com/activation-function-for-dnn/)
• [Using Convolutional Neural Networks for Image Recognition](https://ip.cadence.com/uploads/901/cnn_wp-pdf)
• [Activation Functions: Neural Networks](https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6)
• [Convolutional Neural Networks Tutorial in TensorFlow](http://adventuresinmachinelearning.com/convolutional-neural-networks-tutorial-tensorflow/)
• [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/pdf/1512.00567.pdf)

Demo
• [Demo - filtering](https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/): building image
• [Demo - cs231n](http://cs231n.stanford.edu/): end-to-end architecture in real time
• [Demo - convolution calculation](http://cs231n.github.io/convolutional-networks/): dot product
• [Demo - cifar10](https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html): filter/ReLU in detail

Code
• [image classification with Tensorflow](https://github.com/rkuo/ml-tensorflow/blob/master/cnn-cifar10/cnn-cifar10-keras-v0.2.0.ipynb): use tensorflow local
• [image classification with Keras](https://github.com/rkuo/ml-tensorflow/blob/master/cnn-cifar10/cnn-cifar10-keras-v0.2.0.ipynb): use keras local
• [catsdogs](https://github.com/rkuo/fastai/blob/master/lesson1-catsdogs/Fastai_2_Lesson1.ipynb): use fastai with pre-trained model = resnet34
• [tableschairs](https://github.com/rkuo/fastai/blob/master/lesson1-tableschairs/Fastai_2_Lesson1a-tableschairs.ipynb): switch data

Image Classification with Tensorflow

Image Classification with Keras

TablesChairs with Fastai

Catsdogs Model with Fastai

Supplement Slides

Why Convolutional Neural Network?
Image source: https://www.youtube.com/watch?v=QsxKKyhYxFQ
• Reduce the number of weights required for training: the same filter weights are shared across all positions in the image.
• Use filters to capture local information, which gives a more meaningful search: move from pixel recognition to pattern recognition.
• Sparsity of connections: each output value depends only on a small local region of the input, which is equivalent to most weights of a fully connected layer being 0. This can increase space and time efficiency.

LeNet 5
source: Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE 86(11): 2278–2324, 1998.
- 2 Conv
- 2 Subsampling
- 2 FC
- Gaussian connections (output layer)

Inception v3
