Commit 6664bab

authored
(I)Document Refinement (#358)
(1)training installation guide refined (2)training quick start guide refined
1 parent 67bfd49 commit 6664bab

2 files changed

+91
-45
lines changed

docs/markdown/install/training.md

Lines changed: 64 additions & 26 deletions
@@ -2,52 +2,90 @@
22

33
## Prerequisites
44
* [anaconda3](https://www.anaconda.com/products/individual)
5+
Anaconda is used to create a virtual environment, which makes it easier to build the runtime environment and manage library dependencies. Here we mainly use it to create a virtual Python environment and install the CUDA runtime libraries.
56
* [CUDA](https://developer.nvidia.com/cuda-downloads)
6-
7+
A CUDA environment is essential for running deep neural networks on GPUs. The CUDA installation package you download should match your system and your NVIDIA driver version.
78
## Configure environment
8-
Hyperpose training library can be directly used by putting Hyperpose in the directory and import.
9-
But it has to install the prerequist environment to make it available.
9+
There are two ways to install the hyperpose Python training library.
1010

11-
The following instructions have been tested on the environments below:
11+
All the following instructions have been tested on the environments below:
1212
* Ubuntu 18.04, Tesla V100-DGXStation, Nvidia Driver Version 440.33.01, CUDA Version=10.2
1313
* Ubuntu 18.04, Tesla V100-DGXStation, Nvidia Driver Version 410.79, CUDA Version=10.0
1414
* Ubuntu 18.04, TITAN RTX, Nvidia Driver Version 430.64, CUDA Version=10.1
1515
* Ubuntu 18.04, TITAN Xp, Nvidia Driver Version 430.26, CUDA Version=10.2
1616

17+
Before anything else, we recommend creating an Anaconda virtual environment. It isolates the libraries already on your computer from the ones hyperpose needs to install, and it handles the cudatoolkit and cudnn dependencies in a simple way.
18+
To create the virtual environment, run the following commands in bash:
1719
```bash
1820
# >>> create virtual environment (choose yes)
1921
conda create -n hyperpose python=3.7
2022
# >>> activate the virtual environment, start installation
2123
conda activate hyperpose
22-
# >>> install cuda and cudnn using conda
24+
# >>> install cudatoolkit and cudnn library using conda
2325
conda install cudatoolkit=10.0.130
2426
conda install cudnn=7.6.0
25-
# >>> install tensorflow of version 2.0.0
26-
pip install tensorflow-gpu==2.0.0
27-
# >>> install the newest version tensorlayer from github
28-
pip install tensorlayer==2.2.3
29-
# >>> install other requirements (numpy<=17.0.0 because it has conflicts with pycocotools)
30-
pip install opencv-python
31-
pip install numpy==1.16.4
32-
pip install pycocotools
33-
pip install matplotlib
34-
# >>> now the configuration is done, check whether the GPU is avaliable.
35-
python
36-
>>> import tensorflow as tf
37-
>>> import tensorlayer as tl
38-
>>> tf.test.is_gpu_available()
39-
# >>> if the output is true, congratulation! you can import and run hyperpose now
40-
>>> from hyperpose import Config,Model,Dataset
4127
```
28+
29+
After configuring and activating the conda environment, we can begin to install hyperpose.
30+
(I)The first method is to put the hyperpose Python module in the working directory and import it (recommended).
31+
After git-cloning the source [repository](https://github.com/tensorlayer/hyperpose.git), you can directly import the hyperpose Python library from the root directory of the cloned repository.
32+
To make the import work, you should install the prerequisite dependencies as follows:
33+
you can either install them according to the requirements.txt in the [repository](https://github.com/tensorlayer/hyperpose.git)
34+
```bash
35+
# install according to the requirements.txt
36+
pip install -r requirements.txt
37+
```
38+
or install the libraries one by one
39+
```bash
40+
# >>> install tensorflow of version 2.3.1
41+
pip install tensorflow-gpu==2.3.1
42+
# >>> install tensorlayer of version 2.2.3
43+
pip install tensorlayer==2.2.3
44+
# >>> install other requirements (numpy<=1.17.0 because newer versions conflict with pycocotools)
45+
pip install opencv-python
46+
pip install numpy==1.16.4
47+
pip install pycocotools
48+
pip install matplotlib
49+
```
50+
This installation method uses the latest source code and is therefore less likely to run into compatibility problems.
51+
(II)The second method is to install from the PyPI repository.
52+
We have already uploaded the hyperpose Python library to PyPI, so you can install it using pip, which gives you the latest stable version.
53+
```bash
54+
pip install hyperpose
55+
```
56+
This will download and install all dependencies automatically.
57+
58+
Now that the dependent libraries and hyperpose itself are installed, let's check whether the installation succeeded.
59+
Run the following commands in bash:
60+
```bash
61+
# >>> now the configuration is done, check whether the GPU is available.
62+
python
63+
>>> import tensorflow as tf
64+
>>> import tensorlayer as tl
65+
>>> tf.test.is_gpu_available()
66+
# >>> if the output is True, congratulations! you can import and run hyperpose now
67+
>>> from hyperpose import Config,Model,Dataset
68+
```
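Before running the GPU check above, it can help to confirm that every package from the install steps can actually be located by Python. The sketch below is only an illustration using the standard library; the module list is an assumption derived from the pip package names in this guide (for example, `opencv-python` is imported as `cv2`), and `find_missing` is a hypothetical helper, not part of hyperpose.

```python
import importlib.util

# Import names assumed from the pip packages listed in this guide
# (opencv-python is imported as cv2). Adjust to your own setup.
REQUIRED_MODULES = [
    "tensorflow", "tensorlayer", "cv2", "numpy",
    "pycocotools", "matplotlib", "hyperpose",
]


def find_missing(modules):
    """Return the names in `modules` that Python cannot locate for import."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            found = False
        if not found:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that the modules can be found; it does not import them or verify their versions, so the GPU check with tensorflow is still needed.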
69+
4270
## Extra configuration for exporting model
43-
For training, the above configuration is enough, but to export model into **onnx** format for inference,one should install the
44-
following two extra library:
45-
* tf2onnx (necessary ,used to convert .pb format model into .onnx format model) [reference](https://github.com/onnx/tensorflow-onnx)
71+
The hyperpose Python training library handles the whole pipeline for developing a pose estimation system, including training, evaluating and testing. Its goal is to produce a .npz file that contains the trained model weights. For the training platform, the environment configuration above is enough. However, most inference engines, such as [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html), only accept models in .pb or .onnx format. Thus, one needs to convert the trained model, loaded with the weights from the .npz file, into .pb or .onnx format for deployment, which requires the extra configuration below:
72+
73+
(I)Convert to .pb format:
74+
To convert the model into .pb format, we use *@tf.function* to decorate the *infer* function of each model class, so we can use the *get_concrete_function* function from tensorflow to construct the frozen model computation graph and then save it in .pb format.
75+
We already provide a script with a CLI to facilitate conversion, located at [export_pb.py](https://github.com/tensorlayer/hyperpose/blob/master/export_pb.py). The only library needed here is **tensorflow**, which we have already installed.
76+
77+
(II)Convert to .onnx format:
78+
To convert the model into .onnx format, we first convert the model into .pb format, then convert it from .pb format into .onnx format. Two extra libraries are needed:
79+
* tf2onnx
80+
*tf2onnx* is required here; it converts a .pb format model into a .onnx format model. For details, see the [reference](https://github.com/onnx/tensorflow-onnx).
81+
Install tf2onnx by running:
4682
```bash
4783
pip install -U tf2onnx
4884
```
49-
* graph_transforms (unnecesary,used to check the input and output node of the .pb file if one doesn't know)
50-
build graph_transforms according to [reference](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool)
85+
86+
* graph_transforms
87+
*graph_transforms* is used to inspect the input and output nodes of the .pb file when they are not known. When converting a .pb file into a .onnx file using tf2onnx, one is required to provide the input and output node names of the computation graph stored in the .pb file, so one may need *graph_transforms* to inspect the .pb file and get the node names.
88+
Build graph_transforms according to [tensorflow tools](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool)
5189

5290

5391

docs/markdown/quick_start/training.md

Lines changed: 27 additions & 19 deletions
@@ -3,25 +3,25 @@
33
## Prerequisites
44
* Make sure you have configured the 'hyperpose' virtual environment following the training installation guide (if not, refer to [training installation](../install/training.md)).
55
* Make sure your GPU is available (tf.test.is_gpu_available() should return True).
6-
* Make sure the Hyperpose training Library is under the root directory of the project(where you write train.py and eval.py)
6+
* Make sure the hyperpose training library is under the root directory of the project (where you write train.py and eval.py), or that you have installed hyperpose through PyPI.
77

88
## Train a model
99
The training procedure of Hyperpose is to set the model architecture, model backbone and dataset.
10-
User specify these configuration using the seting functions of Config module with predefined enum value.
10+
Users specify these configurations using the setup functions of the *Config* module with predefined enum values.
1111
Code as simple as the following is enough for training.
1212
```python
1313
# >>> import modules of hyperpose
1414
from hyperpose import Config,Model,Dataset
1515
# >>> set the model name to distinguish models (necessary)
1616
Config.set_model_name(args.model_name)
17-
# >>> set model architecture(and model backbone when in need)
17+
# >>> set model architecture (and set model backbone when in need)
1818
Config.set_model_type(Config.MODEL.LightweightOpenpose)
1919
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
2020
# >>> set dataset to use
2121
Config.set_dataset_type(Config.DATA.MSCOCO)
2222
# >>> set training type
2323
Config.set_train_type(Config.TRAIN.Single_train)
24-
# >>> configuration is done, get config object to assemble the system
24+
# >>> configuration is done, get config object and assemble the system
2525
config=Config.get_config()
2626
model=Model.get_model(config)
2727
dataset=Dataset.get_dataset(config)
@@ -30,21 +30,22 @@ train=Model.get_train(config)
3030
train(model,dataset)
3131
```
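The training snippet above reads `args.model_name` without showing where `args` comes from. One plausible way to supply it is a small `argparse` parser, sketched below. The `--model_name` flag mirrors the attribute used in the snippet, but the default value and the `parse_args` helper are assumptions for illustration and are not taken from the hyperpose scripts.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; `argv=None` falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument parser for a training script."
    )
    # The default below is a placeholder chosen for this sketch.
    parser.add_argument(
        "--model_name",
        type=str,
        default="default_name",
        help="name used to distinguish models and to build the save directory",
    )
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of the real command line.
args = parse_args(["--model_name", "my_pose"])
print(args.model_name)  # prints "my_pose"
```

In a real script, calling `parse_args()` with no argument would read the options from the command line, and `args.model_name` could then be passed to `Config.set_model_name`.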
3232
Then the integrated training pipeline will start.
33-
for each model, Hyperpose will save all the related files in the direatory:
34-
./save_dir/model_name, where *model_name* is the name user set by using *Config.set_model_name*
33+
For each model, Hyperpose saves all the related files in the directory
34+
*./save_dir/model_name*, where *model_name* is the name set with *Config.set_model_name*.
3535
The directory and its contents are listed below:
3636
* directory to save model ./save_dir/model_name/model_dir
3737
* directory to save train result ./save_dir/model_name/train_vis_dir
3838
* directory to save evaluate result ./save_dir/model_name/eval_vis_dir
39+
* directory to save test result ./save_dir/model_name/test_vis_dir
3940
* directory to save dataset visualize result ./save_dir/model_name/data_vis_dir
4041
* file path to save train log ./save_dir/model_name/log.txt
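The layout listed above can be written down as a small path helper, which is handy when a script needs to locate the saved weights or the log. This is a sketch based only on the paths in this guide; `model_paths` is a hypothetical helper and not a function provided by hyperpose.

```python
from pathlib import Path


def model_paths(model_name, save_dir="./save_dir"):
    """Build the per-model paths described in this guide."""
    root = Path(save_dir) / model_name
    return {
        "model_dir": root / "model_dir",          # saved model weights
        "train_vis_dir": root / "train_vis_dir",  # training results
        "eval_vis_dir": root / "eval_vis_dir",    # evaluation results
        "test_vis_dir": root / "test_vis_dir",    # test results
        "data_vis_dir": root / "data_vis_dir",    # dataset visualizations
        "log": root / "log.txt",                  # training log
    }


paths = model_paths("my_pose")
# The guide states that evaluation loads the weights from this file.
newest_weights = paths["model_dir"] / "newest_model.npz"
print(newest_weights)
```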
4142

42-
The above code section show the simplest way to use Hyperpose training library, to make full use of Hyperpose training library,
43-
you can refer to [training tutorial](../tutorial/training.md)
43+
We provide a training script with a CLI, located at [train.py](https://github.com/tensorlayer/hyperpose/blob/master/train.py), to demonstrate the usage of the hyperpose Python training library. Users can use the script directly to train their own model or use it as a template for further modification.
4444

4545
## Eval a model
46-
The evaluate procedure using Hyperpose is almost the same to the training procedure:
47-
the model will be loaded from the ./save_dir/model_name/model_dir/newest_model.npz
46+
The evaluation procedure using Hyperpose is almost the same as the training procedure.
47+
The model will be loaded from ./save_dir/model_name/model_dir/newest_model.npz.
48+
The code for evaluation is as follows:
4849
```python
4950
# >>> import modules of hyperpose
5051
from hyperpose import Config,Model,Dataset
@@ -68,29 +69,36 @@ It should be noted that:
6869
1.the model architecture, model backbone and dataset type should be the same as the configuration under which the model was trained.
6970
2.the evaluation metrics follow the official evaluation metrics of the dataset
7071

71-
The above code section show the simplest way to use Hyperpose training library to evaluate a model trained by Hyperpose, to make full use of Hyperpose training library, you can refer to [training tutorial](../tutorial/training.md)
72+
We also provide an evaluation script with a CLI, located at [eval.py](https://github.com/tensorlayer/hyperpose/blob/master/eval.py), to demonstrate how to evaluate a model trained by hyperpose. Users can use the script directly to evaluate their own model or use it as a template for further modification.
73+
74+
The code sections above show the simplest way to use the Hyperpose training library to train and evaluate a model. To make full use of the Hyperpose training library, refer to the [training tutorial](../tutorial/training.md).
7275

7376
## Export a model
77+
The trained model weights are saved as a .npz file. For deployment, one should load the model with the trained weights from the .npz file and convert it into .pb and .onnx format.
7478
To export a model trained by Hyperpose, follow two steps:
7579
* (1)convert the trained .npz model into .pb format
76-
this can be done either call the export_pb.py from Hyperpose repo
80+
We use the *@tf.function* decorator to produce the static computation graph and save it into the .pb format.
81+
We already provide a script with a CLI to facilitate conversion, located at [export_pb.py](https://github.com/tensorlayer/hyperpose/blob/master/export_pb.py).
82+
To convert a model with model_type=**your_model_type** and model_name=**your_model_name** developed by hyperpose, place the trained model weight file **newest_model.npz** at *./save_dir/your_model_name/model_dir/newest_model.npz* and run the following command:
7783
```bash
78-
python export_pb.py --model_type=your_model_type --model_name=your_model_name
84+
python export_pb.py --model_type=your_model_type --model_name=your_model_name
7985
```
80-
then the converted model will be put in the ./save_dir/model_name/forzen_model_name.pb
81-
one can also export himself by loading model and using get_concrete_function by himself, please refer the tutorial for details
82-
* (2)convert the frozen .pb format model by tensorflow-onnx
86+
Then the **frozen_your_model_name.pb** will be produced at path *./save_dir/your_model_name/frozen_your_model_name.pb*.
87+
One can also export the model manually by loading it and using *get_concrete_function*; please refer to the [tutorial](../tutorial/training.md) for more details.
88+
* (2)convert the frozen .pb format model into .onnx format
89+
We use *tf2onnx* library to convert the .pb format model into .onnx format.
8390
Make sure you have installed the extra requirements for exporting models from [training installation](../install/training.md)<br>
84-
if you don't know the input and output name of the pb model,you should use the function *summarize_graph* function
85-
of graph_transforms from tensorflow
91+
If you don't know the input and output node names of the .pb model, use the *summarize_graph* function
92+
of *graph_transforms* from tensorflow (see [tensorflow tools](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool) for more details).
93+
8694
```bash
8795
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=your_frozen_model.pb
8896
```
8997
Then, once you know the input and output nodes of your .pb model, use tf2onnx:
9098
```bash
9199
python -m tf2onnx.convert --graphdef your_frozen_model.pb --output output_model.onnx --inputs input0:0,input1:0... --outputs output0:0,output1:0,output2:0...
92100
```
93-
args follow inputs and outputs are the names of input and output nodes in .pb graph repectly, for example, if the input node name is **x** and output node name is **y1**,**y2**, then the convert bash should be:
101+
The arguments following *--inputs* and *--outputs* are the names of the input and output nodes in the .pb graph, respectively. For example, if the input node name is **x** and the output node names are **y1** and **y2**, the conversion command should be:
94102
```bash
95103
python -m tf2onnx.convert --graphdef your_frozen_model.pb --output output_model.onnx --inputs x:0 --outputs y1:0,y2:0
96104
```
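When the node names come from a script rather than being typed by hand, the same command can be assembled programmatically. The sketch below builds the argument list for the tf2onnx command shown above. `tf2onnx_command` is a hypothetical helper, and appending `:0` to every node name is an assumption that follows the example in this guide (the first output tensor of each node).

```python
import shlex


def tf2onnx_command(graphdef, output, inputs, outputs):
    """Build the tf2onnx.convert command as a list of arguments."""

    def tensor_names(nodes):
        # Assumption: each node exposes the tensor "<node>:0", as in the guide.
        return ",".join(f"{node}:0" for node in nodes)

    return [
        "python", "-m", "tf2onnx.convert",
        "--graphdef", graphdef,
        "--output", output,
        "--inputs", tensor_names(inputs),
        "--outputs", tensor_names(outputs),
    ]


cmd = tf2onnx_command("your_frozen_model.pb", "output_model.onnx", ["x"], ["y1", "y2"])
command_line = " ".join(shlex.quote(part) for part in cmd)
print(command_line)
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems when file paths contain spaces.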
