ONNX & Edge Deployments: Open Neural Network Exchange
Thom Lane, 19th September 2018
Agenda
1. What is ONNX?
2. Creating/finding ONNX models
3. Visualizing ONNX models
4. Deploying ONNX models
5. Optimizing ONNX models
What is ONNX?
What is ONNX? Open Neural Network Exchange Format CoreML
ONNX Partners
Creating/finding ONNX models
Where can you get ONNX models from? Model Zoo, Train Your Own, Fine-tuning
Icons: Fine Tuning by Emma Mitchell, Teacher by Gregor Cresnar, take by Adrien Coquet from the Noun Project
Where can you get ONNX models from? Model Zoo, Train Your Own, Fine-tuning. Tooling: Apache MXNet on AWS SageMaker
Which model to choose for edge?
Model              Accuracy: ImageNet top-1   Computation: millions of mult-adds   Size: millions of parameters
VGG16              71.5%                      15300                                138
1.0 MobileNet-224  70.6%                      569                                  4.2
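The trade-off in this comparison can be made concrete with a little arithmetic. The sketch below uses only the figures from the slide; the dictionary layout and variable names are hypothetical.

```python
# Figures copied from the comparison slide; the data layout is an assumption.
models = {
    "VGG16": {"top1": 71.5, "mult_adds_m": 15300, "params_m": 138},
    "1.0 MobileNet-224": {"top1": 70.6, "mult_adds_m": 569, "params_m": 4.2},
}

big = models["VGG16"]
small = models["1.0 MobileNet-224"]

# How many times cheaper and smaller MobileNet is, and what it costs in accuracy.
compute_ratio = big["mult_adds_m"] / small["mult_adds_m"]
size_ratio = big["params_m"] / small["params_m"]
accuracy_gap = big["top1"] - small["top1"]

print(f"mult-adds: {compute_ratio:.1f}x fewer")
print(f"parameters: {size_ratio:.1f}x fewer")
print(f"top-1 accuracy: {accuracy_gap:.1f} points lower")
```

On these numbers MobileNet gives up less than one point of top-1 accuracy for a reduction of well over an order of magnitude in both computation and size, which is why it suits edge devices.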
Depthwise Separable Convolution
Regular Convolution: 3x10x10 input, 3x3(x3) Convolution, 16x10x10 output. # of params: 432. # of computations: 43,200.
Depthwise Separable Convolution: 3x10x10 input, 3x3(x1) Convolution, 3x10x10 intermediate, 1x1(x3) Convolution, 16x10x10 output. # of params: 27+48 = 75. # of computations: 2,700+4,800 = 7,500.
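The parameter and computation counts on this slide can be reproduced with a short calculation. This is a sketch under assumptions the slide does not state: stride 1, padding that keeps the 10x10 spatial size, and no bias terms. The function names are hypothetical.

```python
def regular_conv_cost(in_ch, out_ch, kernel, height, width):
    """Parameters and multiply-adds of a standard convolution (no bias)."""
    params = kernel * kernel * in_ch * out_ch
    mult_adds = params * height * width  # every weight is used at every output position
    return params, mult_adds


def depthwise_separable_cost(in_ch, out_ch, kernel, height, width):
    """Parameters and multiply-adds of a depthwise conv followed by a 1x1 conv."""
    depthwise_params = kernel * kernel * in_ch  # one kernel x kernel filter per input channel
    pointwise_params = in_ch * out_ch           # 1x1 convolution that mixes channels
    params = depthwise_params + pointwise_params
    mult_adds = params * height * width
    return params, mult_adds


# Shapes from the slide: 3x10x10 input, 16x10x10 output, 3x3 kernels.
regular = regular_conv_cost(in_ch=3, out_ch=16, kernel=3, height=10, width=10)
separable = depthwise_separable_cost(in_ch=3, out_ch=16, kernel=3, height=10, width=10)
print(regular)    # (432, 43200)
print(separable)  # (75, 7500)
```

The saving grows with the number of output channels, because the expensive 3x3 filters no longer scale with them.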
MobileNet Example: MobileNet from ONNX Model Zoo (pretrained on ImageNet). Fine-tune on Caltech101. Apache MXNet on AWS SageMaker.
Apache MXNet Overview: Scalable, Debuggable, Optimized libraries, Flexible, 7 frontend languages, Portable
Icons: Speech Bubble by Weltenraser, Scale by Ben Davis, Bug by Nociconist, Mobile by Rafael Garcia Motta, flexible by AdbA Icons from the Noun Project
Gluon Sample: Data, Model, Loss, Optimizer & Trainer
Gluon Sample: Forward & Backwards, Update Parameters
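The code shown on the two Gluon Sample slides did not survive in this transcript. To illustrate the same sequence of steps (data, model, loss, optimizer, forward and backward pass, parameter update), here is a framework-free sketch in plain Python. It is not Gluon code and does not use the MXNet API; the toy data, the single-weight model, and every name in it are assumptions made for illustration.

```python
# Data: samples of the line y = 2x + 1 (hypothetical toy dataset).
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

# Model: one weight and one bias.
params = {"w": 0.0, "b": 0.0}


def model(x, params):
    return params["w"] * x + params["b"]


# Loss: squared error.
def loss_fn(pred, target):
    return 0.5 * (pred - target) ** 2


# Optimizer: plain stochastic gradient descent with a fixed learning rate.
learning_rate = 0.05


def train_epoch(data, params, learning_rate):
    total_loss = 0.0
    for x, y in data:
        pred = model(x, params)                  # forward
        total_loss += loss_fn(pred, y)
        grad_pred = pred - y                     # backward: d(loss)/d(pred)
        grad_w = grad_pred * x
        grad_b = grad_pred
        params["w"] -= learning_rate * grad_w    # update parameters
        params["b"] -= learning_rate * grad_b
    return total_loss / len(data)


losses = [train_epoch(data, params, learning_rate) for _ in range(200)]
print(params, losses[-1])
```

In a framework such as Gluon the backward pass and the update are handled by automatic differentiation and a trainer object rather than written by hand; the loop structure is what carries over.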
AWS SageMaker Overview: Build, Train & Tune, Deploy
Visualizing ONNX models
How to visualize ONNX models? lutzroeder.github.io/netron
Deploying ONNX models
How to deploy ONNX models? AWS SageMaker, AWS Fargate with Model Servers, AWS GreenGrass, Custom Deployments
Icons: Raspberry Pi by Ben Davis, clouds by Viktor Vorobyev from the Noun Project
AWS GreenGrass Overview
AWS GreenGrass Group Deployments: Lambda function, ML Model Resource, Device Resource, Subscription, Device
Optimizing ONNX models
How to optimize ONNX models?
1. Use half-precision (float16) if possible: e.g. Mali-GPU
2. Use quantization with calibration if possible (experimental)
3. Compile model with TVM Stack
TVM Stack diagram: frontends (MXNet, ONNX, CoreML); NNVM; TVM (TVM Compiler, TVM Runtime lib); backends (CUDA, LLVM, OpenCL)
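To make "quantization with calibration" concrete, here is a minimal sketch of one common scheme: symmetric int8 quantization, where the scale is chosen from the largest magnitude observed on a calibration set. The slide does not say which scheme or tool is meant, so the scheme, the function names, and the sample values are all assumptions.

```python
def calibrate_scale(calibration_values, num_bits=8):
    """Choose a symmetric scale from the largest magnitude seen during calibration."""
    max_abs = max(abs(v) for v in calibration_values)
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to integers, clipping anything outside the calibrated range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in q_values]


# Hypothetical calibration data: activations observed on representative inputs.
calibration = [-1.5, -0.2, 0.0, 0.7, 2.54]
scale = calibrate_scale(calibration)

activations = [0.5, -1.0, 3.0]  # 3.0 lies outside the calibrated range
q = quantize(activations, scale)
restored = dequantize(q, scale)
print(q, restored)
```

Values inside the calibrated range survive the round trip almost unchanged, while the out-of-range value is clipped to the calibration maximum. This is why the calibration set needs to be representative of real inputs.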
What type of optimizations?
NNVM (Graph Optimizations): Pruning, Fusing
TVM (Tensor Optimizations): Tiling, Vectorization
Pruning (NNVM graph optimization): Conv, Conv, Dropout becomes Conv, Conv
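The pruning illustration, where a Dropout node disappears from a chain of Conv nodes, can be sketched as a tiny graph pass. Dropout acts as the identity at inference time, so removing it does not change the output. This toy works on a linear list of operator names; it is not how NNVM represents graphs, and the function name is hypothetical.

```python
def prune_inference_noops(ops, noop_types=("Dropout",)):
    """Drop operators that do nothing at inference time from a linear chain."""
    return [op for op in ops if op not in noop_types]


graph = ["Conv", "Conv", "Dropout"]
pruned = prune_inference_noops(graph)
print(pruned)  # ['Conv', 'Conv']
```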
Fusing (NNVM graph optimization): Conv, Conv, Relu becomes Conv, Conv with Relu
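Fusing a Conv with the Relu that follows it means computing both in a single pass, so the intermediate tensor between them is never materialized. The sketch below shows the idea on a 1-D convolution in plain Python; the functions and data are hypothetical and stand in for what a compiler would generate.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as used in deep learning)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    return [max(0.0, v) for v in values]


def conv1d_relu_fused(signal, kernel):
    """Same result as relu(conv1d(...)), without the intermediate list."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]


signal = [1.0, -2.0, 3.0, -4.0, 5.0]
kernel = [1.0, 0.5]

unfused = relu(conv1d(signal, kernel))    # two passes, one temporary list
fused = conv1d_relu_fused(signal, kernel)  # one pass
print(unfused, fused)
```

Both paths give the same values; the fused version simply touches memory once instead of twice.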
Tiling (TVM tensor optimization): layout N*C*H*W becomes N*(C/16)*H*W*16
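The layout change from N*C*H*W to N*(C/16)*H*W*16 splits the channel axis into blocks of 16 and moves the position within a block to the innermost dimension, so that 16 channels of one pixel sit next to each other in memory. Here is a sketch of the index arithmetic on a flat buffer; the helper names are hypothetical and the code assumes C is a multiple of the block size.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in N*C*H*W layout."""
    return ((n * C + c) * H + h) * W + w


def blocked_offset(n, c, h, w, C, H, W, block=16):
    """Flat offset of the same element in N*(C/block)*H*W*block layout."""
    c_outer, c_inner = divmod(c, block)
    return (((n * (C // block) + c_outer) * H + h) * W + w) * block + c_inner


def to_blocked(flat, N, C, H, W, block=16):
    """Copy a flat N*C*H*W buffer into the channel-blocked layout."""
    if C % block != 0:
        raise ValueError("C must be a multiple of the block size")
    out = [None] * len(flat)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = nchw_offset(n, c, h, w, C, H, W)
                    dst = blocked_offset(n, c, h, w, C, H, W, block)
                    out[dst] = flat[src]
    return out


N, C, H, W = 1, 32, 2, 2
flat = list(range(N * C * H * W))
blocked = to_blocked(flat, N, C, H, W)

# The first 16 entries are channels 0..15 of pixel (h=0, w=0).
print(blocked[:16])
```

With the data arranged this way, a 16-wide vector instruction can load all 16 channel values of one pixel in a single contiguous read.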
Vectorization (TVM tensor optimization): four scalar additions (1 + 3 = 4, 2 + 2 = 4, 1 + 0 = 1, 1 + 1 = 2) become one vector addition, (1, 2, 1, 1) + (3, 2, 0, 1) = (4, 4, 1, 2)
Summary
1. Creating/finding ONNX models
  • ONNX Model Zoo
  • And fine-tune with Apache MXNet and AWS SageMaker
2. Visualizing ONNX models
  • Netron
3. Deploying ONNX models
  • AWS GreenGrass
4. Optimizing ONNX models
  • TVM Stack
Thanks! And don’t forget to check out: https://medium.com/apache-mxnet
