PlasmaControl
diff --git a/benchmarks/benchmarking.ipynb b/benchmarks/benchmarking.ipynb
Lines changed: 572 additions & 86 deletions
diff --git a/docs/benchmarking.png b/docs/benchmarking.png
68.1 KB
diff --git a/docs/flow_graph.png b/docs/flow_graph.png
38.1 KB
diff --git a/docs/proposal.rst b/docs/proposal.rst
Lines changed: 52 additions & 0 deletions
diff --git a/docs/proposal.tex b/docs/proposal.tex
Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,52 @@
+============================================================================================
+Keras2c: A simple library for converting Keras neural networks to real-time friendly C code.
+============================================================================================
+
+Abstract
+********
+With the growth of machine learning models and neural networks in measurement and control systems comes the need to deploy these models in a way that is compatible with existing systems. Existing options for deploying neural networks either introduce very high latency, require expensive and time-consuming work to integrate into existing code bases, or only support a very limited subset of model types. We have therefore developed a new method, called Keras2c, which is a simple library for converting Keras/TensorFlow neural network models into real-time compatible C code. It supports a wide range of Keras layer and model types, including multidimensional convolutions, recurrent layers, as well as multi-input/output models and shared layers. Keras2c re-implements the core components of Keras/TensorFlow required for predictive forward passes through neural networks in pure C, relying only on standard library functions. The core functionality consists of only ~1200 lines of code, making it extremely lightweight and easy to integrate into existing codebases. Keras2c has been successfully tested in experiments and is currently in use on the plasma control system at the DIII-D National Fusion Facility at General Atomics in San Diego.
+
+Motivation
+**********
+TensorFlow is one of the most popular libraries for developing and training neural networks, and contains a high-level Python API called Keras that has become extremely popular due to its ease of use and rich feature set. As the use of machine learning and neural networks grows in the field of diagnostic and control systems, one of the central challenges remains how to deploy the resulting trained models in a way that can be easily integrated into existing systems, particularly for real-time predictions using machine learning models. Given that most machine learning development traditionally takes place in Python, most deployment schemes involve calling out to a Python process (often running on a distant network-connected server) and using the existing Python libraries to pass data through the model. This introduces large latency, and is generally not feasible for real-time applications. Other options include rewriting the entire network using the existing TensorFlow C/C++ API, though this is extremely time-consuming, and requires linking the resulting code against the full TensorFlow library, containing millions of lines of code and with a binary size up to several GB. More recently, TensorFlow introduced a new possibility, called "TensorFlow Lite", a reduced library designed to run on mobile and IoT devices. However, TensorFlow Lite only supports a very limited subset of the full Keras API. Therefore, we present a new option, Keras2c, a simple library for converting Keras/TensorFlow neural network models into real-time compatible C code.
+
+Method
+******
+
+Keras2c consists of two primary components: a backend library of C functions that each implement a single layer of a neural net (e.g., Dense, Conv2D, LSTM), and a Python script that generates C code to call the layer functions in the right order to implement the network. The total library of backend layer functions is only ~1200 lines of code, and uses only C standard library functions, yet covers a very wide range of Keras functionality, summarized below:
+
+Supported Layers
+################
+- **Core Layers**: Dense, Activation, Flatten, Input, Reshape, Permute, RepeatVector
+- **Convolution Layers**: Convolution (1D/2D/3D, with arbitrary stride/dilation/padding), Cropping (1D/2D/3D), UpSampling (1D/2D/3D), ZeroPadding (1D/2D/3D)
+- **Pooling Layers**: MaxPooling (1D/2D/3D), AveragePooling (1D/2D/3D), GlobalMaxPooling (1D/2D/3D), GlobalAveragePooling (1D/2D/3D)
+- **Recurrent Layers**: SimpleRNN, GRU, LSTM (stateful or stateless)
+- **Embedding Layers**: Embedding
+- **Merge Layers**: Add, Subtract, Multiply, Average, Maximum, Minimum, Concatenate, Dot
+- **Normalization Layers**: BatchNormalization
+- **Layer Wrappers**: TimeDistributed, Bidirectional
+- **Activations**: ReLU, tanh, sigmoid, hard sigmoid, exponential, softplus, softmax, softsign, LeakyReLU, PReLU, ELU, ThresholdedReLU
+
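As a rough illustration of what a single backend layer function could look like, the sketch below implements a Dense layer in plain C using only the standard library. It is not taken from the Keras2c source: the names ``dense_forward``, ``activation_fn``, and ``relu`` are hypothetical, and the kernel is assumed to be stored row-major with shape (n_in, n_out), which is the shape Keras uses for Dense kernels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical activation signature; not the Keras2c API. */
typedef float (*activation_fn)(float);

/* Rectified linear unit: max(x, 0). */
float relu(float x) { return x > 0.0f ? x : 0.0f; }

/* Dense layer forward pass for one sample:
 *   output[j] = act(bias[j] + sum_i input[i] * kernel[i * n_out + j])
 * kernel is assumed row-major with shape (n_in, n_out).
 * Passing act == NULL gives a linear (identity) activation. */
void dense_forward(const float *input, size_t n_in,
                   const float *kernel, const float *bias,
                   size_t n_out, activation_fn act, float *output)
{
    assert(input && kernel && bias && output);
    for (size_t j = 0; j < n_out; ++j) {
        float acc = bias[j];
        for (size_t i = 0; i < n_in; ++i) {
            acc += input[i] * kernel[i * n_out + j];
        }
        output[j] = act ? act(acc) : acc;
    }
}
```

The function allocates nothing and writes into a caller-provided buffer, which is one way to keep execution time predictable in a real-time setting.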
+
+.. figure:: flow_graph.png
+ :align: center
+ :scale: 50 %
+
+ Workflow of converting a Keras model to C code with Keras2c
+
+The Keras2c Python script takes in a trained Keras model and extracts the weights and other parameters, and parses the graph structure to determine the order in which the layer functions must be called to obtain the correct results. It then generates C code for a predictor function that can be called with a set of inputs to generate predictions. It also generates helper functions for initialization and cleanup, to handle memory allocation (by default all variables are declared on the stack, though it also supports the option of dynamically allocating memory before execution). In addition to simple sequential models, Keras2c also supports more complicated architectures created using the Keras functional API, including multi-input/multi-output networks with complicated branching and merging internal structures.
+
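To make the idea of a generated predictor function more concrete, here is a hand-written sketch of what such a function might look like for a tiny two-layer network. It is an assumption about the general shape of the output, not actual Keras2c output: the function name ``example_model_predict``, the helper ``dense``, the layer sizes, and all weight values are made up for illustration. Weights are embedded as constant arrays, and the intermediate buffer lives on the stack.

```c
#include <assert.h>
#include <stddef.h>

#define N_IN 2
#define N_HIDDEN 3
#define N_OUT 1

/* Illustrative weights; a generator would fill these in from the
 * trained model. Kernels are row-major with shape (inputs, outputs). */
static const float W1[N_IN * N_HIDDEN] = {1.0f, -1.0f, 0.5f,
                                          0.0f,  2.0f, -1.0f};
static const float B1[N_HIDDEN] = {0.5f, 0.0f, 0.0f};
static const float W2[N_HIDDEN * N_OUT] = {2.0f, 1.0f, 4.0f};
static const float B2[N_OUT] = {-1.0f};

/* Linear dense layer: out[j] = b[j] + sum_i in[i] * w[i * n_out + j]. */
static void dense(const float *in, size_t n_in, const float *w,
                  const float *b, size_t n_out, float *out)
{
    for (size_t j = 0; j < n_out; ++j) {
        float acc = b[j];
        for (size_t i = 0; i < n_in; ++i) {
            acc += in[i] * w[i * n_out + j];
        }
        out[j] = acc;
    }
}

/* Predictor: calls the layers in the order given by the model graph
 * (Dense -> ReLU -> Dense). No heap allocation is performed. */
void example_model_predict(const float input[N_IN], float output[N_OUT])
{
    assert(input && output);
    float hidden[N_HIDDEN]; /* intermediate activations on the stack */

    dense(input, N_IN, W1, B1, N_HIDDEN, hidden);
    for (size_t i = 0; i < N_HIDDEN; ++i) {
        if (hidden[i] < 0.0f) {
            hidden[i] = 0.0f; /* ReLU applied in place */
        }
    }
    dense(hidden, N_HIDDEN, W2, B2, N_OUT, output);
}
```

A caller would pass an input array of length ``N_IN`` and an output array of length ``N_OUT``. For a branching functional-API model, the same pattern would apply with one buffer per intermediate tensor and the calls arranged in dependency order.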
+To confirm that the generated code accurately reproduces the outputs of the original model, Keras2c also generates sample input/output pairs from the original network. It then automatically tests the generated code with the same inputs to verify that the generated code produces equivalent outputs.
+
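The comparison step of such a test can be sketched as an elementwise check against the reference outputs within a tolerance. The function name ``outputs_match`` and the choice of an absolute tolerance are assumptions made for this example; the source does not state how Keras2c defines equivalent outputs.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every element of actual is within tol of the matching
 * element of expected, and 0 otherwise. A NaN in either array fails
 * the check, because the comparison (err <= tol) is false for NaN. */
int outputs_match(const float *actual, const float *expected,
                  size_t n, float tol)
{
    assert(actual && expected);
    for (size_t i = 0; i < n; ++i) {
        float err = actual[i] - expected[i];
        if (err < 0.0f) {
            err = -err; /* absolute difference without needing libm */
        }
        if (!(err <= tol)) {
            return 0;
        }
    }
    return 1;
}
```

In a test harness, ``expected`` would hold the outputs recorded from the original Keras model and ``actual`` the outputs of the generated C function for the same inputs. Exact equality is usually too strict, since floating-point summation order can differ between implementations.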
+Benchmarks
+**********
+
+Keras2c has also been benchmarked against Python Keras/TensorFlow for single-CPU performance, and the generated code has been shown to be significantly faster for small- to medium-sized models.
+(All tests were conducted on an Intel Core i7-8750H CPU @ 2.20GHz, single threaded, with 32GB RAM. Keras2c was compiled with GCC 7.4.0 with -O3 optimization. Python Keras v2.2.4, TensorFlowCPU v1.13.1, mkl v2019.1.)
+
+.. figure:: benchmarking.png
+ :align: center
+
+ Benchmarking results, Keras2c vs Keras/TensorFlow in Python.
+
+ 
@@ -0,0 +1,160 @@
+\documentclass{article}
+\usepackage[letterpaper, portrait, margin=1in]{geometry}
+\usepackage[utf8]{inputenc}
+\usepackage{amsfonts,amsmath,amssymb,amsthm}
+\usepackage{bm,nicefrac}
+\usepackage{graphicx}
+\usepackage{hyperref}
+\usepackage{subcaption}
+\usepackage[section]{placeins}
+\usepackage{textgreek}
+\usepackage{changepage}
+\usepackage{authblk}
+
+\title{Keras2c: A simple library for converting Keras neural networks
+to real-time friendly C}
+\author[1]{Rory Conlin}
+\author[2]{Keith Erickson}
+\author[3]{Joe Abbate}
+\author[1,2]{Egemen Kolemen}
+\affil[1]{Dept. of Mechanical \& Aerospace Engineering, Princeton University}
+\affil[2]{Princeton Plasma Physics Laboratory}
+\affil[3]{Dept. of Astrophysical Sciences, Princeton University}
+
+
+\begin{document}
+
+\maketitle
+
+
+\section*{Abstract}\label{abstract}
+
+With the growth of machine learning models and neural networks in
+measurement and control systems comes the need to deploy these models in
+a way that is compatible with existing systems. Existing options for
+deploying neural networks either introduce very high latency, require
+expensive and time-consuming work to integrate into existing code bases,
+or only support a very limited subset of model types. We have therefore
+developed a new method, called Keras2c, which is a simple library for
+converting Keras/TensorFlow neural network models into real-time
+compatible C code. It supports a wide range of Keras layer and model
+types, including multidimensional convolutions, recurrent layers, as
+well as multi-input/output models and shared layers. Keras2c re-implements
+the core components of Keras/TensorFlow required for predictive forward
+passes through neural networks in pure C, relying only on standard
+library functions. The core functionality consists of only
+\textasciitilde{}1200 lines of code, making it extremely lightweight and
+easy to integrate into existing codebases. Keras2c has been successfully
+tested in experiments and is currently in use on the plasma control
+system at the DIII-D National Fusion Facility at General Atomics in San
+Diego.
+
+
+\section*{Motivation}\label{motivation}
+
+TensorFlow is one of the most popular libraries for developing and
+training neural networks, and contains a high-level Python API called
+Keras that has become extremely popular due to its ease of use and rich
+feature set. As the use of machine learning and neural networks grows in
+the field of diagnostic and control systems, one of the central
+challenges remains how to deploy the resulting trained models in a way
+that can be easily integrated into existing systems, particularly for
+real-time predictions using machine learning models. Given that most
+machine learning development traditionally takes place in Python, most
+deployment schemes involve calling out to a Python process (often
+running on a distant network-connected server) and using the existing
+Python libraries to pass data through the model. This introduces large
+latency, and is generally not feasible for real-time applications. Other
+options include rewriting the entire network using the existing
+TensorFlow C/C++ API, though this is extremely time-consuming, and
+requires linking the resulting code against the full TensorFlow library,
+containing millions of lines of code and with a binary size up to
+several GB. More recently, TensorFlow introduced a new possibility,
+called ``TensorFlow Lite'', a reduced library designed to run on mobile
+and IoT devices. However, TensorFlow Lite only supports a very limited
+subset of the full Keras API. Therefore, we present a new option,
+Keras2c, a simple library for converting Keras/TensorFlow neural network
+models into real-time compatible C code.
+
+
+\section*{Method}\label{method}
+
+
+
+Keras2c consists of two primary components: a backend library of C
+functions that each implement a single layer of a neural net (e.g., Dense,
+Conv2D, LSTM), and a Python script that generates C code to call the
+layer functions in the right order to implement the network. The total
+library of backend layer functions is only $\sim$1200 lines
+of code, and uses only C standard library functions, yet covers a very
+wide range of Keras functionality, summarized below:
+
+
+
+\subsubsection*{Supported Functionality}\label{supported-layers}
+
+\begin{itemize}
+ \setlength\itemsep{0em}
+\item
+ \textbf{Core Layers}: Dense, Activation, Flatten, Input, Reshape, Permute, RepeatVector
+\item
+ \textbf{Convolution Layers}: Convolution (1D/2D/3D, with arbitrary stride/dilation/padding), Cropping (1D/2D/3D), UpSampling (1D/2D/3D), ZeroPadding (1D/2D/3D)
+\item
+ \textbf{Pooling Layers}: MaxPooling (1D/2D/3D), AveragePooling (1D/2D/3D), GlobalMaxPooling (1D/2D/3D), GlobalAveragePooling (1D/2D/3D)
+\item
+ \textbf{Recurrent Layers}: SimpleRNN, GRU, LSTM (stateful or stateless)
+\item
+ \textbf{Embedding Layers}: Embedding
+\item
+ \textbf{Merge Layers}: Add, Subtract, Multiply, Average, Maximum, Minimum, Concatenate, Dot
+\item
+ \textbf{Normalization Layers}: BatchNormalization
+\item
+ \textbf{Layer Wrappers}: TimeDistributed, Bidirectional
+\item
+ \textbf{Activations}: ReLU, tanh, sigmoid, hard sigmoid, exponential, softplus, softmax, softsign, LeakyReLU, PReLU, ELU, ThresholdedReLU
+\end{itemize}
+
+
+
+The Keras2c Python script takes in a trained Keras model and extracts
+the weights and other parameters, and parses the graph structure to
+determine the order in which the layer functions must be called to obtain
+the correct results. It then generates C code for a predictor function that
+can be called with a set of inputs to generate predictions. It also
+generates helper functions for initialization and cleanup, to handle
+memory allocation (by default all variables are declared on the stack,
+though it also supports the option of dynamically allocating memory
+before execution). In addition to simple sequential models, Keras2c also
+supports more complicated architectures created using the Keras
+functional API, including multi-input/multi-output networks with
+complicated branching and merging internal structures.
+\begin{figure}[h!]
+\centering
+\includegraphics[width=3.5in]{flow_graph.png}
+\caption{Workflow of converting a Keras model to C code with Keras2c}
+\end{figure}
+
+To confirm that the generated code accurately reproduces the outputs of
+the original model, Keras2c also generates sample input/output pairs
+from the original network. It then automatically tests the generated
+code with the same inputs to verify that the generated code produces
+equivalent outputs.
+
+
+\section*{Benchmarks}\label{benchmarks}
+
+Keras2c has also been benchmarked against Python Keras/TensorFlow for
+single-CPU performance, and the generated code has been shown to be
+significantly faster for small- to medium-sized models. (All tests were
+conducted on an Intel Core i7-8750H CPU @ 2.20GHz, single threaded, with
+32GB RAM. Keras2c was compiled with GCC 7.4.0 with -O3 optimization.
+Python Keras v2.2.4, TensorFlowCPU v1.13.1, mkl v2019.1.)
+
+\begin{figure}[h]
+\centering
+\includegraphics[width=6in]{benchmarking.png}
+\caption{Benchmarking results, Keras2c vs Keras/TensorFlow in Python.}
+\end{figure}
+
+\end{document}