Object Detection using Deep Neural Networks

1 Object Detection By Usman Qayyum 4, Dec, 2018

Talk Covers Three Papers (Object Detection -> Embedded Computing) 2 SqueezeNet-2016SSD-2016 TinySSD-2018 =+

Image Classification/Object Detection ● Autonomous vehicles, smart video surveillance, facial detection and various applications, fast and robust object detection is need of an hour ● Nonly recognizing and classifying every object in an image, but localizing each one by drawing the appropriate bounding box around it. 3

CNN Migration (Image Classification) 4

Object Detection as Classification CNN deer? cat? background?

Object Detection as Classification with Sliding Window CNN deer? cat? background?

Object Detection as Classification with Box Proposals

Box Proposal Method : Selective Search Segmentation As Selective Search for Object Recognition. van de Sande et al. ICCV 2011

Idea behind Object Detectors ● Box Proposals ● Classifier Algorithm 11

RCNN Rich feature hierarchies for accurate object detection and semantic segmentation. Girshick et al. CVPR 2014. https://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf

Fast-RCNN Fast R-CNN. Girshick. ICCV 2015. https://arxiv.org/abs/1504.08083 Idea: No need to recompute features for every box independently, Regress refined bounding box coordinates.

Faster-RCNN Ren et al. NIPS 2015. https://arxiv.org/abs/1506.01497 Idea: Integrate the Bounding Box Propos als as part of the CNN predictions

YOLO- You Only Look Once ● Single Shot Detector Redmon et al. CVPR 2016. https://arxiv.org/abs/1506.02640 Idea: No bounding box proposals. Predict a class and a box for every location in a grid.

SSD: Single Shot Detector Liu et al. ECCV 2016. Idea: Similar to YOLO, but denser grid map, multiscale grid maps. + Data augm entation + Hard negative mining + Other design choices in the network.

-The overall objective loss function is a weighted sum of the localization loss and the confidence loss(conf) N: the number of matched default boxes l: predicted boxes g: the ground truth box x=1 denotes some certain default box is matched to a ground truth box17 1 ( , , , ) ( ( , ) ( , , ))conf locL x c l g L x c L x l g N   SSD: Single Shot Detector

AI Workload Migration Embedded (Mobile/Edge) Server/Clou d Execution/Inference Training Execution/Inference Intelligence & Analytics Key Use Cases Vision | Audio | Security Benefits Low Latency | Privacy

How ? (AI in Embedded Devices) Pruning Quantization22

SqueezeNet (Parameter Reduction) ● Strategy 1. Replace 3x3 filters with 1x1 filters ○ Parameters per filter: (3x3 filter) = 9 * (1x1 filter) ● Strategy 2. Decrease the number of input channels to 3x3 filters ○ Total # of parameters: (# of input channels) * (# of ﬁlters) * ( # of parameters per filter) ● Strategy 3. Downsample late in the network so that convolution layers have large activation maps ○ Size of activation maps: the size of input data, the choice of layers in which to downsample in the CNN architecture 23 Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and< 0.5 MB model size."

Strategy#1 Conv1x1 or Kernel Reduction 24

Microarchitecture – Fire Module 25 Squeeze Layer Set s1x1 < (e1x1 + e3x3), limits the # of input channels to 3*3 filters Strategy 2. Decrease the number of input channels to 3x3 filters Total # of parameters: (# of input channels) * (# of ﬁlters) * ( # of parameters per filter) How much can we limit s1x1? Strategy 1. Replace 3*3 filters with 1*1 filters Parameters per filter: (3*3 filter) = 9 * (1*1 filter) How much can we replace 3*3 with 1*1? (e1x1 vs e3x3 )?

Expand ● In the "expand" modules, what are the tradeoffs when we turn the knob between mostly 1x1 and mostly 3x3 filters? ● Hypothesis: if having more weights leads to higher accuracy, then having all 3x3 filters should give the highest accuracy 27

Macroarchitecture 29 Strategy 3. Downsample late in the network so that convolution layers have large activation maps Size of activation maps: the size of input data, the choice of layers in which to downsample in the CNN architecture

TinySSD (SSD with Microarchitecture) 31

Object Detection using Deep Neural Networks

In this document

More Related Content

What's hot

Similar to Object Detection using Deep Neural Networks

More from Usman Qayyum

Recently uploaded

Object Detection using Deep Neural Networks