ONNX and Edge Deployments

Thom Lane, 19th September 2018. ONNX & Edge Deployments (ONNX: Open Neural Network Exchange)

Agenda
1. What is ONNX?
2. Creating/finding ONNX models
3. Visualizing ONNX models
4. Deploying ONNX models
5. Optimizing ONNX models

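What is ONNX? The Open Neural Network Exchange Format: an open format for representing models so they can move between frameworks and tools (CoreML is among those pictured).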

Where can you get ONNX models from?
• Model Zoo
• Train Your Own
• Fine-tuning (with Apache MXNet on AWS SageMaker)
Icons: Fine Tuning by Emma Mitchell, Teacher by Gregor Cresnar, take by Adrien Coquet, from the Noun Project

Which model to choose for edge?
Model               Accuracy (ImageNet top-1)   Computation (millions of Mult-Adds)   Size (millions of parameters)
VGG16               71.5%                       15,300                                138
1.0 MobileNet-224   70.6%                       569                                   4.2
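
To make the trade-off concrete, the figures in the comparison can be turned into ratios and an approximate on-device weight size. This is a minimal sketch: the 4-bytes-per-parameter figure assumes float32 weights, which the slide does not state, and the variable names are illustrative.

```python
# Figures from the comparison above:
# (top-1 accuracy in %, millions of mult-adds, millions of parameters)
models = {
    "VGG16": (71.5, 15300, 138.0),
    "1.0 MobileNet-224": (70.6, 569, 4.2),
}

BYTES_PER_PARAM = 4  # assumption: weights are stored as float32


def weight_size_mb(millions_of_params):
    """Approximate weight storage in megabytes (10^6 bytes)."""
    return millions_of_params * BYTES_PER_PARAM


vgg_acc, vgg_madds, vgg_params = models["VGG16"]
mob_acc, mob_madds, mob_params = models["1.0 MobileNet-224"]

compute_ratio = vgg_madds / mob_madds   # times more mult-adds for VGG16
param_ratio = vgg_params / mob_params   # times more parameters for VGG16
accuracy_gap = vgg_acc - mob_acc        # top-1 percentage points given up

print(f"VGG16 needs {compute_ratio:.1f}x the mult-adds of MobileNet")
print(f"VGG16 has {param_ratio:.1f}x the parameters of MobileNet")
print(f"Accuracy given up: {accuracy_gap:.1f} points")
print(f"Weights: {weight_size_mb(vgg_params):.1f} MB vs "
      f"{weight_size_mb(mob_params):.1f} MB")
```

Under these assumptions MobileNet gives up less than one point of top-1 accuracy for well over an order of magnitude less compute and storage.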

Depthwise Separable Convolution
Regular Convolution: 3x10x10 input -> 3x3(x3) Convolution -> 16x10x10 output. # of params: 432. # of computations: 43,200.
Depthwise Separable Convolution: 3x10x10 input -> 3x3(x1) Convolution -> 3x10x10 -> 1x1(x3) Convolution -> 16x10x10 output. # of params: 27+48 = 75. # of computations: 2,700+4,800 = 7,500.
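
The parameter and computation counts on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes no bias terms, stride 1, and padding that keeps the 10x10 spatial size; the function names are illustrative and not from any library.

```python
def regular_conv(c_in, c_out, k, h, w):
    """Counts for a standard k x k convolution producing an h x w output."""
    params = k * k * c_in * c_out   # one k x k x c_in filter per output channel
    mult_adds = params * h * w      # every filter is applied at every position
    return params, mult_adds


def depthwise_separable_conv(c_in, c_out, k, h, w):
    """Counts for a k x k depthwise conv followed by a 1x1 pointwise conv."""
    depthwise_params = k * k * c_in   # one k x k x 1 filter per input channel
    pointwise_params = c_in * c_out   # 1x1 filters that mix the channels
    params = depthwise_params + pointwise_params
    mult_adds = (depthwise_params + pointwise_params) * h * w
    return params, mult_adds


# Shapes from the slide: 3x10x10 input, 16x10x10 output, 3x3 kernels
regular = regular_conv(c_in=3, c_out=16, k=3, h=10, w=10)
separable = depthwise_separable_conv(c_in=3, c_out=16, k=3, h=10, w=10)

print("regular:  ", regular)
print("separable:", separable)
```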

MobileNet Example: take MobileNet from the ONNX Model Zoo (pretrained on ImageNet) and fine-tune it on Caltech 101, using Apache MXNet on AWS SageMaker.

Apache MXNet Overview: Scalable, Debuggable, Optimized libraries, Flexible, 7 frontend languages, Portable. Icons: Speech Bubble by Weltenraser, Scale by Ben Davis, Bug by Nociconist, Mobile by Rafael Garcia Motta, flexible by AdbA Icons, from the Noun Project

Gluon Sample (setup): Data, Model, Loss, Optimizer & Trainer

Gluon Sample (training loop): Forward & Backward, Update Parameters
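
The code on these two slides did not survive extraction. As a stand-in, here is a framework-free sketch of the same stages (data, model, loss, optimizer, then forward, backward, update) on a toy linear-regression problem. It is not Gluon code: the dataset, the hand-written gradient, and all names are assumptions for illustration. In Gluon the framework computes the gradients and a Trainer applies the update.

```python
# Data: noise-free points on the line y = 2x + 1 (hypothetical toy dataset)
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]

# Model: a single linear unit with weight w and bias b
params = {"w": 0.0, "b": 0.0}


def model(x):
    return params["w"] * x + params["b"]


# Loss: squared error between prediction and target
def loss_fn(pred, target):
    return (pred - target) ** 2


# Optimizer: plain stochastic gradient descent
learning_rate = 0.1

for epoch in range(200):
    for x, y in data:
        pred = model(x)                        # forward pass
        grad_pred = 2.0 * (pred - y)           # backward pass: dLoss/dpred
        grad_w = grad_pred * x                 # chain rule: dLoss/dw
        grad_b = grad_pred                     # chain rule: dLoss/db
        params["w"] -= learning_rate * grad_w  # update parameters
        params["b"] -= learning_rate * grad_b

final_loss = sum(loss_fn(model(x), y) for x, y in data) / len(data)
print(params, final_loss)
```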

AWS SageMaker Overview: Build, Train & Tune, Deploy

How to visualize ONNX models? Netron: lutzroeder.github.io/netron

How to deploy ONNX models?
• AWS SageMaker
• AWS Fargate with Model Servers
• AWS Greengrass
• Custom Deployments
Icons: Raspberry Pi by Ben Davis, clouds by Viktor Vorobyev, from the Noun Project

AWS Greengrass Group Deployments: Lambda function, ML Model Resource, Device Resource, Subscription, Device

How to optimize ONNX models?
1. Use half-precision (float16) if possible, e.g. on a Mali GPU
2. Use quantization with calibration if possible (experimental)
3. Compile the model with the TVM Stack
TVM Stack: frontends (MXNet, ONNX, CoreML); NNVM; TVM; TVM Compiler; backends (CUDA, LLVM, OpenCL); TVM Runtime lib
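
To illustrate what "quantization with calibration" means, here is a minimal sketch of 8-bit affine quantization in which the value range is taken from a small set of representative samples. It is a generic min/max scheme chosen for illustration, not the method of any particular library, and the sample values are hypothetical.

```python
def calibrate(samples):
    """Calibration: observe the value range on representative data."""
    return min(samples), max(samples)


def make_quantizer(lo, hi, num_bits=8):
    """Build quantize/dequantize functions for the calibrated range [lo, hi]."""
    if hi <= lo:
        raise ValueError("calibration range must be non-empty")
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (hi - lo) / (qmax - qmin)      # real-valued width of one integer step
    zero_point = round(qmin - lo / scale)  # integer that represents roughly 0.0

    def quantize(x):
        q = round(x / scale) + zero_point
        return max(qmin, min(qmax, q))     # clip values outside the range

    def dequantize(q):
        return (q - zero_point) * scale

    return quantize, dequantize, scale


# Hypothetical activations collected while running calibration data
calibration_samples = [-1.0, -0.5, 0.0, 0.25, 1.0]
lo, hi = calibrate(calibration_samples)
quantize, dequantize, scale = make_quantizer(lo, hi)

for value in [-1.0, -0.3, 0.0, 0.7, 1.0]:
    q = quantize(value)
    print(value, "->", q, "->", round(dequantize(q), 4))
```

Each stored value shrinks from 32 bits to 8, at the cost of a rounding error of at most about one quantization step for values inside the calibrated range.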

What type of optimizations? NNVM (Graph Optimizations): Pruning, Fusing. TVM (Tensor Optimizations): Tiling, Vectorization.

What type of optimizations? Pruning (NNVM graph optimization): a Dropout node in a chain of Conv layers is removed, since dropout does nothing at inference time.
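
Pruning of this kind can be sketched on a toy graph represented as a list of operator names. The representation, the position of the Dropout node, and the function name are hypothetical. A real compiler works on a full dataflow graph rather than a flat list.

```python
def prune_for_inference(ops):
    """Drop operators that act as the identity at inference time."""
    inference_noops = {"Dropout"}
    return [op for op in ops if op not in inference_noops]


# Hypothetical training-time graph: the Dropout position is an assumption
graph = ["Conv", "Conv", "Dropout", "Conv", "Conv"]
pruned = prune_for_inference(graph)
print(pruned)
```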

What type of optimizations? Fusing (NNVM graph optimization): a Conv followed by a Relu is replaced by a single "Conv with Relu" operator.
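
A minimal sketch of why fusing helps, using a 1-D convolution in plain Python: the unfused path materializes an intermediate list between the two operators, while the fused operator applies the Relu as each output is produced. The functions and inputs are illustrative assumptions, not NNVM or TVM code.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as used in deep learning)."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]


def relu(values):
    return [max(0.0, v) for v in values]


def conv1d_relu(signal, kernel):
    """Fused operator: Relu is applied as each output element is computed."""
    n = len(signal) - len(kernel) + 1
    out = []
    for i in range(n):
        acc = sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        out.append(max(0.0, acc))  # no intermediate conv output is stored
    return out


signal = [1.0, -2.0, 3.0, -4.0, 5.0]
kernel = [1.0, 0.5]

unfused = relu(conv1d(signal, kernel))  # two passes, one temporary list
fused = conv1d_relu(signal, kernel)     # one pass
print(unfused, fused)
```

Both paths give the same result. The fused version saves one pass over memory and one operator launch.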

What type of optimizations? Tiling (TVM tensor optimization): the data layout N*C*H*W is re-tiled to N*(C/16)*H*W*16, so that channels are grouped in blocks of 16.
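
The effect of the N*(C/16)*H*W*16 layout can be seen by computing flat memory offsets for both layouts. The sketch below assumes row-major storage and a channel count divisible by 16. The function names and tensor shape are illustrative.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in an N*C*H*W layout."""
    return ((n * C + c) * H + h) * W + w


def blocked_offset(n, c, h, w, C, H, W, block=16):
    """Flat offset of the same element in an N*(C/block)*H*W*block layout."""
    c_outer, c_inner = divmod(c, block)
    return (((n * (C // block) + c_outer) * H + h) * W + w) * block + c_inner


N, C, H, W = 1, 32, 4, 4  # hypothetical tensor shape

# Offsets of the first 16 channels at one fixed pixel (n=0, h=1, w=2)
plain = [nchw_offset(0, c, 1, 2, C, H, W) for c in range(16)]
blocked = [blocked_offset(0, c, 1, 2, C, H, W) for c in range(16)]

print("N*C*H*W offsets:", plain)    # spaced H*W elements apart
print("blocked offsets:", blocked)  # 16 consecutive elements
```

In the blocked layout the 16 channel values of a pixel sit next to each other in memory, so they can be loaded together as one contiguous chunk.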

What type of optimizations? Vectorization (TVM tensor optimization): four scalar additions (1 + 3 = 4, 2 + 2 = 4, 1 + 0 = 1, 1 + 1 = 2) are carried out as one vector addition, [1, 2, 1, 1] + [3, 2, 0, 1] = [4, 4, 1, 2].

Summary
1. Creating/finding ONNX models
• ONNX Model Zoo
• And fine-tune with Apache MXNet and AWS SageMaker
2. Visualizing ONNX models
• Netron
3. Deploying ONNX models
• AWS Greengrass
4. Optimizing ONNX models
• TVM Stack

Thanks! And don’t forget to check out: https://medium.com/apache-mxnet
