Commit 166c7f1 (parent a11309c): Add README.
1 file changed: README.md (+211, -0)

# TensorFlow Model Deployment

A tutorial exploring multiple approaches to deploying / serving a trained TensorFlow (or Keras) model, or multiple models, in a production environment for prediction / inference.

The code samples provided here were originally developed with TensorFlow 1.2, 1.3, or 1.4. However, unless explicitly specified, they should work with all versions >= 1.0.

Table of Contents
=================
1. [Import the Model Graph from the Meta File](#importGraph)
2. [Create the Model Graph from Scratch](#createGraph)
3. [Restore Multiple Models](#restoreMultiple)
4. [Freeze a Model before Serving it](#freeezeModel)
5. [Convert a Keras model to a TensorFlow model](#convertKeras)
6. [Deploy Multiple Frozen Models](#multiFreezed)
7. [Serve a Model via Web Services](#webServices)

During training, TensorFlow generates the following three files for each checkpoint (although, optionally, you can choose not to create the meta file). You can ignore the file named checkpoint, as it is not used in the prediction process.

1. Meta file: holds the compressed Protobuf graph of the model and all the other associated metadata, such as collections and operations.
2. Index file: holds an immutable key-value table linking each serialized tensor name to the location of its data in the data file.
3. Data file: a TensorBundle collection that saves the values of all variables, such as weights.

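For context, a minimal sketch of how such a checkpoint is produced during training (the single weights variable and the /trained path below are purely illustrative):

```python
import tensorflow as tf

# A trivial stand-in for a real model graph: a single variable
weights = tf.get_variable("weights", shape=[10, 2])

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writes model_ckpt.meta, model_ckpt.index and model_ckpt.data-00000-of-00001
    # (plus the small "checkpoint" bookkeeping file) into /trained
    saver.save(sess, "/trained/model_ckpt")
```
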
### Import the Model Graph from the Meta File
<a name="importGraph"></a>
One common approach is to restore the model graph from the meta file, and then restore the weights and other data from the data file (the index file will be used as well). Here is a sample code snippet:

```python
import tensorflow as tf

with tf.Session(graph=tf.Graph()) as sess:
    # Import the graph structure from the meta file
    saver = tf.train.import_meta_graph("/trained/model_ckpt.meta")
    # Restore the weights from the checkpoint data / index files
    saver.restore(sess, "/trained/model_ckpt")

    # Retrieve Ops from the collection

    # Run sess to predict
```

A small trick here is where to place the following line of code (the saver) when you define the model graph for training. By default, only variables defined above this line will be saved into the meta file. If you don't plan to retrain the model, you can place the code defining your train ops, such as the optimizer, loss, and accuracy, below this line so that your model file can be reasonably smaller.

```
saver = tf.train.Saver()
```

You normally need to leave some hooks in the trained model so that you can easily feed the data for prediction. For example, you can save the logits and the image_placeholder into a collection during training, and later retrieve them for prediction.

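A minimal sketch of that collection mechanism (the placeholder shape and the tiny dense layer are invented just for illustration):

```python
import numpy as np
import tensorflow as tf

# --- Training time: build the graph and register the hooks in a collection ---
image_placeholder = tf.placeholder(tf.float32, shape=[None, 4], name="images")
logits = tf.layers.dense(image_placeholder, 2, name="logits")
tf.add_to_collection("logits", logits)
tf.add_to_collection("image_placeholder", image_placeholder)

# --- Prediction time (after import_meta_graph() and saver.restore() in practice) ---
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    logits_op = tf.get_collection("logits")[0]
    images_op = tf.get_collection("image_placeholder")[0]
    predictions = sess.run(logits_op, feed_dict={images_op: np.zeros((1, 4), dtype=np.float32)})
```
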
A concrete example can be found in the train() and predict() methods
[here](https://github.com/bshao001/DmsMsgRcg/blob/Sliding_Window_Version/misc/imgconvnets.py).

This applies to the case where the graphs used for inference and training are the same or very similar. When the inference graph is very different from the training graph, this approach is not preferred, as it would require the graph built for training to accommodate both training and inference, making it unnecessarily large.

### Create the Model Graph from Scratch
<a name="createGraph"></a>
Another common approach is to create the model graph from scratch instead of restoring the graph from the meta file. This is extremely useful when the graph for inference is considerably different from the graph for training. The TensorFlow NMT model (https://github.com/tensorflow/nmt) is one such case.

```
import tensorflow as tf
# Replace this with your valid ModelCreator
import ModelCreator

with tf.Session() as sess:
    # Replace this line with your valid ModelCreator and its arguments
    model = ModelCreator(training=False)
    # Restore model weights
    model.saver.restore(sess, "/trained/model_ckpt")
```

A concrete example can be found in the constructor (\_\_init\_\_ method)
[here](https://github.com/bshao001/ChatLearner/blob/master/chatbot/botpredictor.py).

### Restore Multiple Models
<a name="restoreMultiple"></a>
Sometimes you may need to load multiple trained models into a single TF session to work together on a task. For example, in a face recognition application, you may need one model to detect faces in a given image, and then another model to recognize those faces. In a typical photo OCR application, you normally require three models working as a pipeline: model one detects the text areas (blocks) in a given image; model two segments characters from the text strings detected by the first model; and model three recognizes those characters.

Loading multiple models into a single session can be tricky if you don't do it properly. Here are the steps to follow:

1. For each of the models, you need a unique model_scope, and you must define all the variables within that scope when building the graph for training:

```
with tf.variable_scope(model_scope):
    # Define variables here
```

2. At the time of restoring models, do the following:

```
tf.train.import_meta_graph(os.path.join(result_dir, result_file + ".meta"))
all_vars = tf.global_variables()
model_vars = [var for var in all_vars if var.name.startswith(model_scope)]
saver = tf.train.Saver(model_vars)
saver.restore(sess, os.path.join(result_dir, result_file))
```

Here, a TF session object (sess) is often passed into the method, as you don't want to create a separate session here. Also, don't be fooled by the frequently used form of this statement:

```
saver = tf.train.import_meta_graph("/trained/model_ckpt.meta")
```

When the right-hand side is run inside a TF session, the model graph is imported. It returns a saver, but you don't have to use it. In my experience, if this saver is used to restore the data (weights), it won't work for loading multiple models: it will complain about all kinds of conflicts.

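Putting the two steps together, a sketch of loading two scoped models into one session might look like this (the scope names and checkpoint paths are hypothetical):

```python
import os
import tensorflow as tf

def load_scoped_model(sess, model_scope, result_dir, result_file):
    """Import one model's graph and restore only the variables under its scope."""
    tf.train.import_meta_graph(os.path.join(result_dir, result_file + ".meta"))
    model_vars = [v for v in tf.global_variables() if v.name.startswith(model_scope)]
    saver = tf.train.Saver(model_vars)
    saver.restore(sess, os.path.join(result_dir, result_file))

with tf.Session() as sess:
    # Hypothetical scopes and checkpoint locations, for illustration only
    load_scoped_model(sess, "detector", "/trained/detector", "model_ckpt")
    load_scoped_model(sess, "classifier", "/trained/classifier", "model_ckpt")
    # Both models now live in the same graph / session, each under its own scope
```
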
A whole working example can be found in my [DmsMsgRcg](https://github.com/bshao001/DmsMsgRcg/tree/Sliding_Window_Version)
project:
- Training: https://github.com/bshao001/DmsMsgRcg/blob/Sliding_Window_Version/misc/imgconvnets.py
- Predictor Definition: https://github.com/bshao001/DmsMsgRcg/blob/Sliding_Window_Version/misc/cnnpredictor.py
- Final Application: https://github.com/bshao001/DmsMsgRcg/blob/Sliding_Window_Version/mesgclsf/msgclassifier.py

### Freeze a Model before Serving it
<a name="freeezeModel"></a>
Sometimes a trained model (file) can be very big; ranging from half a GB to several GB is common. At inference time, you don't have to deal with the big file if you choose to freeze the model. This process can normally shrink the model file to 20% to 30% of its original size, making inference considerably faster.

Here are the three steps to achieve this:

1. Restore / load the trained model:

```
saver = tf.train.import_meta_graph("/trained/model_ckpt.meta")
graph = tf.get_default_graph()
input_graph_def = graph.as_graph_def()
sess = tf.Session()
saver.restore(sess, "/trained/model_ckpt")
```

2. Choose the output for the frozen model:

```
output_node_names = []
output_node_names.append("prediction_node")  # Specify the real node name
output_graph_def = tf.graph_util.convert_variables_to_constants(
    sess,
    input_graph_def,
    output_node_names
)
```

Here, you may need to use the following code to check the output node name:

```
for op in graph.get_operations():
    print(op.name)
```

Keep in mind that when you request to output an operation, all the other operations it depends on will also be saved. Therefore, you only need to specify the final output operation in the inference graph for freezing purposes.

3. Serialize and write the output graph and trained weights to the file system:

```
output_file = "model_file.pb"
with tf.gfile.GFile(output_file, "wb") as f:
    f.write(output_graph_def.SerializeToString())

sess.close()
```

A concrete working example, including how to use the frozen model for prediction, can be found
[here](https://github.com/bshao001/DmsMsgRcg/blob/master/misc/freezemodel.py).

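For reference, a sketch of how the frozen graph might then be loaded for prediction (the tensor names input_images / prediction_node and the image_batch variable are placeholders for your own; see the linked example for the real code):

```python
import tensorflow as tf

# Load the serialized GraphDef from the frozen .pb file
graph_def = tf.GraphDef()
with tf.gfile.GFile("model_file.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Import it into a fresh graph; all weights are now constants inside the graph
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

# Hypothetical tensor names; list them with graph.get_operations() if unsure
input_tensor = graph.get_tensor_by_name("input_images:0")
output_tensor = graph.get_tensor_by_name("prediction_node:0")

with tf.Session(graph=graph) as sess:
    # image_batch: your input data, shaped to match the input placeholder
    predictions = sess.run(output_tensor, feed_dict={input_tensor: image_batch})
```
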
### Convert a Keras model to a TensorFlow model
<a name="convertKeras"></a>
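One possible approach, sketched here under the assumption of tf.keras on TensorFlow 1.x (the .h5 path is hypothetical), is to grab the Keras backend session and freeze it exactly as in the previous section:

```python
import tensorflow as tf
from tensorflow import keras

# Hypothetical path to a trained Keras model saved as HDF5
model = keras.models.load_model("/trained/keras_model.h5")

# In TF 1.x, Keras runs on top of a regular TF session that we can freeze from
sess = keras.backend.get_session()
output_node_names = [out.op.name for out in model.outputs]
frozen_graph_def = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), output_node_names)

with tf.gfile.GFile("keras_model.pb", "wb") as f:
    f.write(frozen_graph_def.SerializeToString())
```

The resulting keras_model.pb can then be served like any other frozen TensorFlow model.
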
### Deploy Multiple Frozen Models
<a name="multiFreezed"></a>
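One way to approach this, sketched under the assumption that each model has already been frozen into its own .pb file (the detector.pb / classifier.pb names are hypothetical), is to give every frozen graph its own tf.Graph and session:

```python
import tensorflow as tf

def load_frozen_graph(pb_file, prefix):
    """Import a frozen GraphDef into its own graph, under its own name prefix."""
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(pb_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name=prefix)
    return graph

# Hypothetical file names; each model gets its own graph and its own session
detection_graph = load_frozen_graph("detector.pb", "detector")
classification_graph = load_frozen_graph("classifier.pb", "classifier")

detection_sess = tf.Session(graph=detection_graph)
classification_sess = tf.Session(graph=classification_graph)
```

Because each frozen graph lives in its own tf.Graph, the node names of the different models cannot clash.
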
### Serve a Model via Web Services
<a name="webServices"></a>
Although this does not relate directly to the problem of how to serve a trained model in TensorFlow, it is a commonly encountered issue.

We train a machine learning model using Python and TensorFlow; however, we often need to make use of that model to provide services to other environments, such as a web application or a mobile application, or to clients written in different programming languages, such as Java or C#.

Both a REST API and a SOAP API can meet your needs here. A REST API is relatively lightweight, but a SOAP API is not that complicated either. You can pick either of them based on your personal preference.

- REST API

- SOAP API

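As an illustration only, a prediction service could be wrapped in a small REST endpoint like the sketch below (Flask and the predict_fn helper are assumptions for this sketch, not part of this project):

```python
# A minimal REST sketch, assuming Flask is installed and that predict_fn(inputs)
# wraps a loaded TensorFlow model (e.g. the frozen graph shown earlier).
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON payload such as {"inputs": [[...], [...]]}
    inputs = request.get_json()["inputs"]
    outputs = predict_fn(inputs)  # hypothetical helper running the TF session
    return jsonify({"outputs": outputs})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A Java or C# client then only needs to make an HTTP POST request; it never touches TensorFlow directly.
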
### TensorFlow Serving

# References:
1. http://cv-tricks.com/how-to/freeze-tensorflow-models/
