Python for Image Understanding: Deep Learning with Convolutional Neural Nets
The document discusses deep learning and its significance in image understanding through convolutional neural networks, highlighting its ability to learn features automatically as opposed to manually designed features. It outlines the deep learning process, including data preprocessing, model architecture selection, training, optimization, and debugging techniques, while emphasizing the importance of regularization methods like dropout and batch normalization. Furthermore, it provides resources for implementing deep learning in Python and references various research studies related to the field.
Introduction to deep learning with convolutional neural networks by Roelof Pieters, a PhD candidate and data science consultant.
Overview and definition of deep learning, emphasizing its algorithmic nature and learning capabilities across various levels of abstraction.
Deep learning eliminates handcrafted features, using learned features adaptable to visual and linguistic information, structured as hierarchical models.
Historical advancements in audio and image recognition via deep learning (e.g., ImageNet classification by Krizhevsky et al. in 2012).
Explaining the inner workings of deep learning models, including activation functions and optimization strategies.
Available Python libraries for deep learning, including Theano, Caffe, Keras, and Lasagne.
Training a neural network involves preprocessing data, selecting architecture, optimizing, and regularization.
Steps for effective data preprocessing, including normalization and PCA, essential for improving model performance.
Selecting the appropriate neural network architecture, such as a DBN (deep belief network), CNN (convolutional neural network), or RNN (recurrent neural network), for specific tasks.
Framework for training neural networks focusing on data processing, architecture selection, training, and optimization.
Visualizing and interpreting loss curves, accuracy, and weights to debug and improve training processes.
Strategies for optimizing hyperparameters and employing techniques like dropout and data augmentation to reduce overfitting.
Use of ensemble techniques to enhance model predictions by aggregating outputs of multiple models.
Discussion of saliency maps and the vulnerability of convnets to adversarial "fooling" images, plus resources for further learning.
Python for Image Understanding: Deep Learning with Convolutional Neural Nets. Roelof Pieters, PhD candidate at KTH & Data Science consultant at Graph Technologies. @graphific, London 2015, roelof@graph-technologies.com
A Definition: "Deep learning is a set of algorithms in machine learning that attempt to learn in multiple levels, corresponding to different levels of abstraction." (much debated definition)
A typology. Deep learning is:
• a host of statistical machine learning techniques
• enables the automatic learning of feature hierarchies
• generally based on artificial neural networks
Old vs new school? Manually designed features are often over-specified, incomplete, and take a long time to design and validate. Learned features are easy to adapt and fast to learn. Deep learning provides a very flexible, (possibly?) universal, learnable framework for representing world, visual and linguistic information. Deep learning can learn unsupervised (from raw text/audio/images/whatever content) and supervised (with specific labels like positive/negative). Summary by Richard Socher.
Deep Learning with Python: Python has a wide range of deep-learning-related libraries available, from low level to high level:
• Theano: efficient GPU-powered math (deeplearning.net/software/theano)
• Caffe: computer-vision oriented DL framework, model zoo, prototxt model definitions; pythonification ongoing! (caffe.berkeleyvision.org)
• Pylearn2: wrapper for Theano, YAML, experiment-oriented (deeplearning.net/software/pylearn2)
• Keras: Theano wrapper, models in Python code, abstracts Theano away (keras.io)
• Lasagne: Theano extension, models in Python code, Theano not hidden (lasagne.readthedocs.org/en/latest)
We will use Lasagne in our examples.
DrawCNN: visualizing the units' connections. Agrawal et al., Analyzing the performance of multilayer neural networks for object recognition. ECCV, 2014. Szegedy et al., Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013. Zeiler, M. et al., Visualizing and Understanding Convolutional Networks. ECCV, 2014.
Training a (deep) Neural Network: 1. Preprocess the data 2. Choose architecture 3. Train 4. Optimize/Regularize 5. Tips/Tricks
Training a (deep) Neural Network: 1. Preprocess the data 2. Choose architecture 3. Train (Code Finally!) 4. Optimize/Regularize 5. Tips/Tricks
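Step 1 in code: a minimal NumPy sketch of the usual preprocessing (zero-centering, variance normalization, and optional PCA whitening). X is a hypothetical data matrix of shape (n_samples, n_features); the statistics computed on training data should be reused on test data.

    import numpy as np

    # X: (n_samples, n_features) training data, e.g. flattened images (hypothetical)
    X = X.astype(np.float64)
    X -= X.mean(axis=0)                  # zero-center every feature
    X /= X.std(axis=0) + 1e-8            # normalize to unit variance

    # optional: PCA whitening
    cov = np.dot(X.T, X) / X.shape[0]    # covariance matrix of the centered data
    U, S, _ = np.linalg.svd(cov)         # eigenvectors U, eigenvalues S
    X_rot = np.dot(X, U)                 # decorrelate (rotate into eigenbasis)
    X_white = X_rot / np.sqrt(S + 1e-5)  # whiten: unit variance per component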
Training a (deep) Neural Network (…): layer definitions, layer parameters
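A minimal sketch of such layer definitions in Lasagne; the input shape (32x32 RGB), filter counts, and 10-way output below are hypothetical, chosen only for illustration:

    import theano.tensor as T
    import lasagne
    from lasagne.layers import (InputLayer, Conv2DLayer, MaxPool2DLayer,
                                DenseLayer, DropoutLayer)
    from lasagne.nonlinearities import rectify, softmax

    # layer definitions: stack layers by passing each as input to the next
    input_var = T.tensor4('inputs')
    network = InputLayer(shape=(None, 3, 32, 32), input_var=input_var)
    # layer parameters: num_filters, filter_size, pool_size, p, num_units, ...
    network = Conv2DLayer(network, num_filters=32, filter_size=(3, 3),
                          nonlinearity=rectify)
    network = MaxPool2DLayer(network, pool_size=(2, 2))
    network = Conv2DLayer(network, num_filters=64, filter_size=(3, 3),
                          nonlinearity=rectify)
    network = MaxPool2DLayer(network, pool_size=(2, 2))
    network = DropoutLayer(network, p=0.5)
    network = DenseLayer(network, num_units=256, nonlinearity=rectify)
    network = DenseLayer(network, num_units=10, nonlinearity=softmax)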
Data Augmentation (realtime data augmentation at Kaggle's #1 National Data Science Bowl ≋ Deep Sea ≋ team) http://benanne.github.io/2015/03/17/plankton.html
• rotation: random with angle between 0° and 360° (uniform)
• translation: random with shift between -10 and 10 pixels (uniform)
• rescaling: random with scale factor between 1/1.6 and 1.6 (log-uniform)
• flipping: yes or no (bernoulli)
• shearing: random with angle between -20° and 20° (uniform)
• stretching: random with stretch factor between 1/1.3 and 1.3 (log-uniform)
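One way such a recipe might be implemented with scikit-image; the parameter ranges follow the slide, and folding rescaling and stretching into per-axis scale factors is an assumption of this sketch:

    import numpy as np
    from skimage.transform import AffineTransform, warp

    def augment(image, rng=np.random):
        # draw the random parameters listed above
        rotation = rng.uniform(0, 2 * np.pi)                        # 0..360 degrees
        translation = rng.uniform(-10, 10, size=2)                  # pixel shift
        scale = np.exp(rng.uniform(np.log(1 / 1.6), np.log(1.6)))   # log-uniform
        stretch = np.exp(rng.uniform(np.log(1 / 1.3), np.log(1.3))) # log-uniform
        shear = np.deg2rad(rng.uniform(-20, 20))                    # shearing angle

        tform = AffineTransform(scale=(scale * stretch, scale / stretch),
                                rotation=rotation, shear=shear,
                                translation=translation)
        out = warp(image, tform.inverse, mode='reflect')
        if rng.rand() > 0.5:                                        # flip (bernoulli)
            out = out[:, ::-1]
        return out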
Batch Normalization as regularization
• Normalize the activations in each layer within a minibatch
• Learn a per-layer scale (gamma) and shift (beta) as parameters, so the network can restore the original representation if needed
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. S. Ioffe and C. Szegedy (2015)
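A minimal NumPy sketch of the batch-normalization forward pass at training time; gamma and beta are the learned per-feature scale and shift, and at test time the minibatch statistics are replaced by running averages:

    import numpy as np

    def batch_norm_forward(x, gamma, beta, eps=1e-5):
        # x: (batch_size, n_features) activations of one layer for a minibatch
        mu = x.mean(axis=0)                    # per-feature minibatch mean
        var = x.var(axis=0)                    # per-feature minibatch variance
        x_hat = (x - mu) / np.sqrt(var + eps)  # normalize the activations
        return gamma * x_hat + beta            # learned scale and shift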
Training a (deep) Neural Network: 1. Preprocess the data 2. Choose architecture 3. Train 4. Optimize/Regularize (Debug) 5. Further Tips & Tricks to improve Model Accuracy
Other "Tricks":
• Ensembles
• Finetuning a pre-trained/earlier-trained net
• Sticking extracted layer features in another classifier (e.g. an SVM), as sketched below
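A sketch of the last trick with scikit-learn: extract activations from a hidden layer of the hypothetical Lasagne net defined earlier and train a linear SVM on them. The names penultimate, X_train, y_train, X_test, and y_test are assumptions of this sketch:

    import theano
    import lasagne
    from sklearn.svm import LinearSVC

    # penultimate: the last hidden DenseLayer of the trained net (hypothetical);
    # network and input_var come from the layer-definition sketch earlier
    feats = lasagne.layers.get_output(penultimate, deterministic=True)
    extract = theano.function([input_var], feats)

    svm = LinearSVC(C=1.0)                       # linear SVM on deep features
    svm.fit(extract(X_train), y_train)
    print(svm.score(extract(X_test), y_test))    # accuracy on held-out data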
Ensembles
• majority vote for hard predictions (i.e. classes)
• average vote for soft predictions (continuous scale)
• make sure classifiers are uncorrelated
• cross-validate ensemble weights (by grid search, or rank average)
• stacking
• blending
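A minimal sketch of the two voting schemes (array names are hypothetical):

    import numpy as np

    def average_vote(preds_soft):
        # preds_soft: list of (n_samples, n_classes) probability arrays
        return np.mean(preds_soft, axis=0).argmax(axis=1)

    def majority_vote(preds_hard, n_classes):
        # preds_hard: list of (n_samples,) predicted-class arrays
        votes = np.stack(preds_hard)             # (n_models, n_samples)
        counts = np.apply_along_axis(
            lambda v: np.bincount(v, minlength=n_classes), 0, votes)
        return counts.argmax(axis=0)             # most-voted class per sample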
Ensembles (10 similar nets with varying hyperparameters on the same tiny-imagenet dataset): avg of single nets: 0.3647; predict by mean of all: 0.4244; leave out model 9: 0.4259
Saliency Maps. K. Simonyan, A. Vedaldi, A. Zisserman, "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps", ICLR Workshop 2014. First we predict on a pixel level.
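A sketch of such a saliency map computed with Theano on the hypothetical Lasagne net from earlier. Following Simonyan et al., the saliency of a pixel is the magnitude of the gradient of the winning class score with respect to that pixel, maximized over color channels:

    import theano
    import theano.tensor as T
    import lasagne

    # network, input_var: the trained net from the layer-definition sketch
    probs = lasagne.layers.get_output(network, deterministic=True)
    score = probs[0, T.argmax(probs[0])]          # score of the predicted class
    saliency = T.grad(score, input_var)           # gradient w.r.t. input pixels
    saliency_fn = theano.function([input_var],
                                  T.abs_(saliency).max(axis=1))
    # saliency_fn(batch) returns one (height, width) map per input image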
Fooling ConvNets. Szegedy, Christian, et al., "Intriguing properties of neural networks." arXiv preprint, 2013. Nguyen, Anh, Jason Yosinski, and Jeff Clune, "Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images." arXiv preprint, 2014. Then we do our "magic".
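A hedged sketch of the gradient-ascent fooling idea behind these papers: repeatedly nudge the pixels in the direction that increases the score of a wrong target class until the net confidently misclassifies. image and wrong_class are hypothetical inputs; the step size and iteration count are arbitrary:

    import theano
    import theano.tensor as T
    import lasagne

    # network, input_var: the trained net from the layer-definition sketch
    probs = lasagne.layers.get_output(network, deterministic=True)
    target = T.iscalar('target')                  # index of the wrong class
    grad_fn = theano.function([input_var, target],
                              T.grad(probs[0, target], input_var))

    adversarial = image.copy()                    # image: (1, 3, 32, 32) float32
    for _ in range(100):                          # small gradient-ascent steps
        adversarial += 0.1 * grad_fn(adversarial, wrong_class)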