PYTORCH DEEP LEARNING FRAMEWORK: STATUS AND DIRECTIONS
JOSEPH SPISAK, PRODUCT MANAGER, PYTORCH | ON DEVICE
AGENDA
• OVERVIEW: PRINCIPLES, MAKEUP, ..
• DYNAMIC VS. STATIC FRAMEWORKS
• ON DEVICE: STATE OF THE STATE
• PYTORCH MOBILE: OVERVIEW AND INTEGRATION
Research to Production at Facebook – Early 2017: reimplementation takes weeks or months
Research to Production at Facebook – Early 2018: enabling model or model-fragment transfer
Research to Production at Facebook
PYTORCH OVERVIEW
WHAT IS PYTORCH?
• SIMPLICITY OVER COMPLEXITY
• HARDWARE-ACCELERATED INFERENCE
• DISTRIBUTED TRAINING
• DYNAMIC NEURAL NETWORKS
• EAGER & GRAPH-BASED EXECUTION
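A minimal eager-mode sketch (illustrative, not from the deck) of what "dynamic" and "eager" mean in practice: each operation executes immediately, and gradients are available right after the computation runs.

import torch

# Eager execution: each op runs as soon as it is called.
x = torch.randn(3, requires_grad=True)
y = (x * x).sum()   # a tiny "network", defined by running it
y.backward()        # autograd differentiates what actually executed
print(x.grad)       # gradients ready immediately: equal to 2 * x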
PYTORCH = RESEARCH PROTOTYPING + PRODUCTION DEPLOYMENT
CORE PRINCIPLES
• DEVELOPER EFFICIENCY
• BUILDING FOR SCALE
DEVELOPER EFFICIENCY: ENABLING A HIGH VELOCITY OF MODEL ITERATION AND INNOVATION
CLEAN APIS
TORCHSCRIPT
Models are Python: TorchScript programs are an optimizable subset of Python.
+ Same "models are programs" idea
+ Production deployment
+ No Python dependency
+ Compilation for performance optimization

class RNN(nn.Module):
    def __init__(self, W_h, U_h, W_y, b_h, b_y):
        super(RNN, self).__init__()
        self.W_h = nn.Parameter(W_h)
        self.U_h = nn.Parameter(U_h)
        self.W_y = nn.Parameter(W_y)
        self.b_h = nn.Parameter(b_h)
        self.b_y = nn.Parameter(b_y)

    def forward(self, x, h):
        y = []
        for t in range(x.size(0)):
            h = torch.tanh(x[t] @ self.W_h + h @ self.U_h + self.b_h)
            y += [torch.tanh(h @ self.W_y + self.b_y)]
            if t % 10 == 0:
                print("stats: ", h.mean(), h.var())
        return torch.stack(y), h

# one annotation!
script_rnn = torch.jit.script(RNN(W_h, U_h, W_y, b_h, b_y))
TORCHSCRIPT
Models are Python: TorchScript programs are an optimizable subset of Python.
+ Same "models are programs" idea
+ Prod deployment
+ No Python dependency
+ Optimizable (incl. codegen!)
TENSORBOARD
~1,230 CONTRIBUTORS · 50%+ YOY GROWTH · 23K PYTORCH FORUM USERS
GROWTH IN ARXIV MENTIONS IN RESEARCH PAPERS (chart: monthly counts from Jan 2017 through Jul 2019, rising from near zero to roughly 500)
FRAMEWORKS: DYNAMIC VS. STATIC
Declarative Toolkits: Computation Graph
DECLARATIVE TOOLKITS
Declare and compile a model; repeatedly execute the model in a VM.
(diagram: a Python script hands a compiled graph of Conv2d / BatchNorm / ReLU blocks to a toolkit VM for execution)
DECLARATIVE TOOLKITS: COMPUTATION GRAPH
• Declare a computation
• Placeholder variables
• Compile it
• Run it in a Session

import tensorflow as tf
import numpy as np

trX = np.linspace(-1, 1, 101)
trY = 2 * trX + np.random.randn(*trX.shape) * 0.33

X = tf.placeholder("float")
Y = tf.placeholder("float")

def model(X, w):
    return tf.multiply(X, w)

w = tf.Variable(0.0, name="weights")
y_model = model(X, w)
cost = tf.square(Y - y_model)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})
    print(sess.run(w))
Placeholder variables, highlighted from the listing above:

X = tf.placeholder("float")
Y = tf.placeholder("float")
Model definition, highlighted from the listing above:

def model(X, w):
    return tf.multiply(X, w)

w = tf.Variable(0.0, name="weights")
y_model = model(X, w)
cost = tf.square(Y - y_model)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
The Session is a separate, Turing-complete virtual machine:

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})
    print(sess.run(w))
Imperative Toolkits: Define-by-Run
IMPERATIVE TOOLKITS
Run a series of computations, implicitly defining the model as execution goes.
(diagram: Python instructions dispatch Conv2d / BatchNorm / ReLU ops directly on the native runtime)
IMPERATIVE TOOLKITS
• Define a model by execution
• No separate compilation stage
• No separate execution engine

import torch
from torch.autograd import Variable

trX = torch.linspace(-1, 1, 101)
trY = 2 * trX + torch.randn(*trX.size()) * 0.33

w = Variable(trX.new([0.0]), requires_grad=True)
for i in range(100):
    for (x, y) in zip(trX, trY):
        X = Variable(x)
        Y = Variable(y)
        print(X)  # intermediate values can be inspected anywhere
        print(Y)
        y_model = X * w.expand_as(X)
        cost = (Y - y_model) ** 2
        cost.backward(torch.ones(*cost.size()))
        w.data = w.data - 0.01 * w.grad.data
        w.grad.data.zero_()
print(w)
Model constructed and values computed as we define it.
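For readers on current PyTorch: the Variable wrapper above was merged into Tensor in PyTorch 0.4, so the same define-by-run loop can be written without it. A sketch under that assumption, with the learning rate and data generation taken from the listing above:

import torch

trX = torch.linspace(-1, 1, 101)
trY = 2 * trX + torch.randn(trX.size()) * 0.33

w = torch.zeros(1, requires_grad=True)
for epoch in range(100):
    for x, y in zip(trX, trY):
        cost = (y - x * w) ** 2      # graph built and evaluated on the fly
        cost.backward()              # single-element tensor, so no grad arg needed
        with torch.no_grad():        # plain gradient-descent step
            w -= 0.01 * w.grad
            w.grad.zero_()
print(w)  # should approach the true slope, 2.0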
PYTORCH FOR EMBEDDED: STATE OF THE STATE
HOW DO I RUN PYTORCH MODELS ON DEVICE?
1. EXPORT ONNX-FORMATTED MODELS
2. PYTORCH MOBILE
PYTORCH ONNX SUPPORT
ONNX Ecosystem
ARCHITECTURE AND FLOW
PyTorch Model + Sample Input → JIT Tracer and TorchScript → Torch IR → ONNX Exporter (Optimizer; Torch IR to ONNX IR Translator) → ONNX Graph
Invoked via torch.onnx.export()
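A minimal sketch of that flow in code, assuming a stock torchvision model and an illustrative file name (neither is from the slides): torch.onnx.export() traces the model with the sample input and writes the resulting ONNX graph.

import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True).eval()
sample_input = torch.randn(1, 3, 224, 224)  # the "Sample Input" above

# Trace the model through the JIT tracer and emit an ONNX graph.
torch.onnx.export(model, sample_input, "resnet18.onnx")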
Partners
PYTORCH MOBILE: PREVIEW RELEASE
WHAT IS PYTORCH MOBILE? IT'S PYTORCH, FOR MOBILE, BUT NO PYTHON 😃
WHAT CAN IT RUN? ANY TORCHSCRIPT MODEL.
LOOPS? YES. FUNCTIONS? YES. TUPLES? YES. NAMEDTUPLE? YES.
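A hedged sketch (names and shapes invented for illustration) exercising those features: a scripted function containing a loop and returning a NamedTuple, all of which survive torch.jit.script and can ship to mobile.

from typing import List, NamedTuple

import torch

class Stats(NamedTuple):
    total: torch.Tensor
    count: int

def running_sum(xs: List[torch.Tensor]) -> Stats:
    total = torch.zeros(1)
    for x in xs:                  # loops: yes
        total = total + x
    return Stats(total, len(xs))  # namedtuple: yes

scripted = torch.jit.script(running_sum)          # functions: yes
print(scripted([torch.ones(1), torch.ones(1)]))   # Stats(total=2., count=2)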
PYTORCH MOBILE (PYTORCH 1.3, EXPERIMENTAL)
End-to-end workflows for mobile in iOS and Android.
• AUTHOR A MODEL IN PYTORCH: no separate runtime to export to
• MODEL OPTIMIZATION (OPTIONAL):
  qmodel = quantization.convert(my_mobile_model)
  torch.jit.script(qmodel).save("my_mobile_model.pt")
• ANDROID (MAVEN): implementation 'org.pytorch:pytorch_android:1.3.0'
• iOS (COCOAPODS): pod 'LibTorch'
COMING SOON
• Build-level optimization and selective compilation
• Whole-program optimization with link-time optimization
QUANTIZATION (PYTORCH 1.3, EXPERIMENTAL)
• Neural network inference is expensive
• IoT and mobile devices have limited resources
• Quantizing models enables efficient inference at scale
4X LESS MEMORY USAGE · 2-4X SPEEDUPS IN COMPUTE

model = ResNet50()
model.load_state_dict(torch.load("model.pt"))
qmodel = quantization.prepare(
    model, {"": quantization.default_qconfig})
qmodel.eval()
for batch, target in data_loader:
    qmodel(batch)  # calibration pass to record activation ranges
qmodel = quantization.convert(qmodel)
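One way to sanity-check the 4x memory claim is to compare serialized sizes of the float and quantized models; a sketch assuming model and qmodel from the listing above, with illustrative file names:

import os

import torch

torch.jit.save(torch.jit.script(model), "model_fp32.pt")
torch.jit.save(torch.jit.script(qmodel), "model_int8.pt")

fp32 = os.path.getsize("model_fp32.pt")
int8 = os.path.getsize("model_int8.pt")
print(f"fp32: {fp32} B, int8: {int8} B, ratio: {fp32 / int8:.1f}x")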
HOW DO I USE IT?
HOW DO I USE IT?
TorchScript: a static, high-performance subset of Python.
1. Prototype your model with PyTorch
2. Control flow is preserved
3. First-class support for lists, dicts, etc.

import torch
from typing import List
from torch import Tensor

class MyModule(torch.nn.Module):
    def __init__(self, N, M, state: List[Tensor]):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))
        self.state = state

    def forward(self, input):
        self.state.append(input)
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output

# Compile the model code to a static representation
my_module = MyModule(3, 4, [torch.rand(3, 4)])
my_script_module = torch.jit.script(my_module)

# Save the compiled code and model data
# so it can be loaded elsewhere
my_script_module.save("my_script_module.pt")
HOW DO I USE IT?
ANDROID (GRADLE): implementation 'org.pytorch:pytorch_android:1.3.0'
iOS (COCOAPODS): pod 'LibTorch'
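Before wiring up the mobile app, the saved artifact can be smoke-tested from Python; a sketch using the file saved by the TorchScript listing above (on device, the same file is loaded through the Java or Objective-C bindings instead):

import torch

loaded = torch.jit.load("my_script_module.pt")
print(loaded(torch.rand(4)))  # this module's forward() expects a length-4 vector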
HOW DOES IT WORK?
ANDROID demo: https://github.com/pytorch/android-demo-app
iOS demo: https://github.com/pytorch/ios-demo-app
WHAT'S HERE TODAY?
• Full TorchScript support
• Pre-built binary releases in JCenter and CocoaPods
• Java bindings
• All forward CPU operators
• Some optimized float operators (based on Caffe2Go)
• Some optimized quantized operators (based on QNNPACK, with XNNPACK WIP)
WHAT'S COMING UP?
• Faster. Smaller.
• Customized builds
• Obj-C/Swift API? Kotlin wrapper?
• GPU support?? Accelerator support??
THANK YOU