2. Convert Tool

2.1. Usage

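Convert Tool is located at SGS_IPU_SDK/Scripts/ConvertTool/ConvertTool.py. It currently converts models from six frameworks (tensorflow_graphdef, tensorflow_savemodel, keras, tflite, caffe, and onnx) into SGS floating-point network models. Before using it, run the following script from the SGS_IPU_SDK directory to export the library paths (skip this step if you have already done it):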

cd ~/SGS_IPU_SDK
source cfg_env.sh

The platforms that Convert Tool currently supports are listed in its help output:

python3 ConvertTool.py -h
usage: ConvertTool.py [-h]
    {tensorflow_graphdef,tensorflow_savemodel,keras,tflite,caffe,onnx} ...
Convert Tool
    positional arguments: {tensorflow_graphdef,tensorflow_savemodel,keras,tflite,caffe,onnx}
        platform info

    tensorflow_graphdef
        tensorflow graphdef commands

    tensorflow_savemodel
        tensorflow save_model commands

    keras
        keras commands

    tflite
        tflite commands

    caffe
        caffe commands

    onnx
       onnx commands

optional arguments:
    -h, --help show this help message and exit

To see the parameters that each platform needs for conversion, run python3 ConvertTool.py {platform} -h. The details and help command for each platform follow:

2.1.1 tensorflow_graphdef commands

python3 ConvertTool.py tensorflow_graphdef -h
usage: ConvertTool.py tensorflow_graphdef [-h]
    --graph_def_file GRAPH_DEF_FILE
    --input_arrays INPUT_ARRAYS (optional)
    --output_arrays OUTPUT_ARRAYS (optional)
    --input_config INPUT_CONFIG
    --output_file OUTPUT_FILE
    --input_shapes INPUT_SHAPES (optional)

optional arguments:
    -h, --help
        show this help message and exit.

    --graph_def_file
        GRAPH_DEF_FILE Full filepath of file containing frozen GraphDef.

    --input_arrays
        INPUT_ARRAYS Names of the input arrays, comma-separated.

    --output_arrays
        OUTPUT_ARRAYS Names of the output arrays, comma-separated.

    --input_config
        INPUT_CONFIG Input config path.

    --output_file
        OUTPUT_FILE Full filepath of out Model path.

    --input_shapes
        INPUT_SHAPES Shapes corresponding to --input_arrays, colon-separated. For many models each shape takes the form batch size, input array height, input array width, input array depth. Default: None

Usage example:

python3 ConvertTool.py tensorflow_graphdef \
--graph_def_file ~/SGS_Models/tensorflow/resnet_v2_50/resnet_v2_50.pb \
--output_file ./resnet_v2_float.sim \
--input_shapes 1,299,299,3 \
--input_config ~/SGS_Models/tensorflow/resnet_v2_50/input_config.ini

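Parameter description:

--graph_def_file: Path to the input model, a frozen TensorFlow GraphDef file in .pb format.

--output_file: Output model file, in flatbuffer format with the .sim suffix.

--input_shapes: Shape of each network input tensor, in NHWC order. Separate dimensions with commas ( , ) and separate multiple shapes with colons ( : ). The number of shapes must match the number of inputs.

--input_config: Path to the input_config.ini file, which holds the configuration of the input tensors.

Optional parameters:

--input_arrays: Names of the network input tensors, given as strings. Separate multiple inputs with commas ( , ), for example: --input_arrays='Input1','Input2'.

--output_arrays: Names of the network output tensors, given as strings. Separate multiple outputs with commas ( , ).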
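The separator rules for --input_shapes and --input_arrays are easy to get wrong when a model has several inputs. The following is a minimal sketch of how a command line could be assembled from Python. The helper names format_input_shapes and build_graphdef_command are hypothetical and are not part of the SDK; only the flags documented above are used, and the file names are placeholders.

```python
import shlex


def format_input_shapes(shapes):
    """Join dimensions with commas and separate shapes with colons."""
    return ":".join(",".join(str(dim) for dim in shape) for shape in shapes)


def build_graphdef_command(graph_def_file, output_file, input_config,
                           input_shapes=None, input_arrays=None,
                           output_arrays=None):
    """Return the argument list for a tensorflow_graphdef conversion."""
    argv = [
        "python3", "ConvertTool.py", "tensorflow_graphdef",
        "--graph_def_file", graph_def_file,
        "--output_file", output_file,
        "--input_config", input_config,
    ]
    # Optional flags are appended only when a value is supplied.
    if input_shapes:
        argv += ["--input_shapes", format_input_shapes(input_shapes)]
    if input_arrays:
        argv += ["--input_arrays", ",".join(input_arrays)]
    if output_arrays:
        argv += ["--output_arrays", ",".join(output_arrays)]
    return argv


argv = build_graphdef_command(
    "resnet_v2_50.pb", "./resnet_v2_float.sim", "input_config.ini",
    input_shapes=[[1, 299, 299, 3]],
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The resulting list can be passed to subprocess.run once the environment from cfg_env.sh is in place.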

2.1.2 tensorflow_savemodel commands

python3 ConvertTool.py tensorflow_savemodel -h
usage: ConvertTool.py tensorflow_savemodel [-h]
    --saved_model_dir SAVED_MODEL_DIR
    --input_config INPUT_CONFIG
    --output_file OUTPUT_FILE
    --output_arrays OUTPUT_ARRAYS (optional)
    --input_arrays INPUT_ARRAYS (optional)
    --input_shapes INPUT_SHAPES (optional)
    --tag_set TAG_SET (optional)
    --signature_key SIGNATURE_KEY (optional)

optional arguments:
    -h, --help
        show this help message and exit.

    --saved_model_dir
        SAVED_MODEL_DIR SavedModel directory to convert.

    --input_config
        INPUT_CONFIG Input config path.

    --output_file
        OUTPUT_FILE Full filepath of out Model path.

    --debug
        Run gdb in Debug mode.

    --input_arrays
        INPUT_ARRAYS Names of the input arrays, comma-separated.
        Default: None.

    --output_arrays
        OUTPUT_ARRAYS Names of the output arrays, comma-separated.
        Default: None.

    --input_shapes
        INPUT_SHAPES Shapes corresponding to --input_arrays, colon-separated. For many models each shape takes the form batch size, input array height, input array width, input array depth.
        Default: None.

    --tag_set
        TAG_SET Set of tags identifying the MetaGraphDef within the SavedModel to analyze. All tags in the tag set must be present.
        Default: None.

    --signature_key
        SIGNATURE_KEY Key identifying SignatureDef containing inputs and outputs.
        Default: DEFAULT_SERVING_SIGNATURE_DEF_KEY

Usage example:

python3 ConvertTool.py tensorflow_savemodel \
--saved_model_dir ~/test/tensorflow_model/save_model \
--input_config ~/test/tensorflow_model/input_config.ini \
--output_file ~/test/tensorflow_model/save_model_float.sim \
--tag_set test_saved_model \
--signature_key test_signature

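Parameter description:

--saved_model_dir: Path to the input model, a directory generated by TensorFlow's saved_model.builder.

--output_file: Output model file, in flatbuffer format with the .sim suffix.

--input_config: Path to the input_config.ini file, which holds the configuration of the input tensors.

Optional parameters:

--input_arrays: Names of the network input tensors, given as strings. Separate multiple inputs with commas ( , ), for example: --input_arrays='Input1','Input2'. If omitted, the names are read from saved_model_dir.

--output_arrays: Names of the network output tensors, given as strings. Separate multiple outputs with commas ( , ). If omitted, the names are read from saved_model_dir.

--input_shapes: Shape of each network input tensor, in NHWC order. Separate dimensions with commas ( , ) and separate multiple shapes with colons ( : ). The number of shapes must match the number of inputs. If omitted, the shapes are read from saved_model_dir.

--tag_set: Must match the tag specified when the model was saved. If not set, it defaults to 'serve'.

--signature_key: Must match the signature specified when the model was saved. If not set, it defaults to 'DEFAULT_SERVING_SIGNATURE_DEF_KEY'.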

2.1.3 keras commands

python3 ConvertTool.py keras -h
usage: ConvertTool.py keras [-h]
    --model_file MODEL_FILE
    --input_config INPUT_CONFIG
    --output_file OUTPUT_FILE
    --input_arrays INPUT_ARRAYS (optional)
    --output_arrays OUTPUT_ARRAYS (optional)
    --input_shapes INPUT_SHAPES (optional)
    --custom_objects CUSTOM_OBJECTS (optional)

optional arguments:
    -h, --help show this help message and exit
    --model_file MODEL_FILE Full filepath of HDF5 file containing the tf.keras model.

    --input_config INPUT_CONFIG Input config path.

    --output_file OUTPUT_FILE Full filepath of out Model path.

    --input_arrays INPUT_ARRAYS Names of the input arrays, comma-separated.
    Default: None.

    --input_shapes INPUT_SHAPES Shapes corresponding to --input_arrays, colon-separated. For many models each shape takes the form batch size, input array height, input array width, input array depth.
    Default: None.

    --output_arrays OUTPUT_ARRAYS Names of the output arrays, comma-separated.
    Default: None.

    --custom_objects CUSTOM_OBJECTS Dict mapping names (strings) to custom classes or functions to be considered during model deserialization.
    Default: None.

Usage example:

python3 ConvertTool.py keras \
--model_file ./TEST_h5/resnet50/resnet50.h5 \
--input_config ./TEST_h5/resnet50/input_config.ini \
--output_file ./TEST_h5/resnet50/resnet50_float.sim

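Parameter description:

--model_file: Path to the input model, a Keras file in h5 format.

--output_file: Output model file, in flatbuffer format with the .sim suffix.

--input_config: Path to the input_config.ini file, which holds the configuration of the input tensors.

Optional parameters:

--input_arrays: Names of the network input tensors, given as strings. Separate multiple inputs with commas ( , ), for example: --input_arrays='Input1','Input2'.

--output_arrays: Names of the network output tensors, given as strings. Separate multiple outputs with commas ( , ).

--input_shapes: Shape of each network input tensor, in NHWC order. Separate dimensions with commas ( , ) and separate multiple shapes with colons ( : ). The number of shapes must match the number of inputs.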

--custom_objects: Dict mapping names (strings) to custom classes or functions to be considered during model deserialization (default None).

2.1.4 tflite commands

python3 ConvertTool.py tflite -h
usage: ConvertTool.py tflite [-h]
    --model_file MODEL_FILE
    --input_config INPUT_CONFIG
    --output_file OUTPUT_FILE

optional arguments:
    -h, --help show this help message and exit

    --model_file MODEL_FILE Full filepath of tflite file containing the tflite model.

    --input_config INPUT_CONFIG Input config path.

    --output_file OUTPUT_FILE Full filepath of out Model path.

Usage example:

python3 ConvertTool.py tflite \
--model_file ~/test/tensorflow_model/Debug_save_model_float.tflite \
--input_config ~/test/tensorflow_model/input_config.ini \
--output_file ~/test/tensorflow_model/save_model_float.sim

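Parameter description:

--model_file: Path to the input model, a file in tflite format (must be a non-quantized model).

--output_file: Output model file, in flatbuffer format with the .sim suffix.

--input_config: Path to the input_config.ini file, which holds the configuration of the input tensors.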

2.1.5 caffe commands

python3 ConvertTool.py caffe -h
usage: ConvertTool.py caffe [-h]
    --model_file MODEL_FILE
    --weight_file WEIGHT_FILE
    --input_config INPUT_CONFIG
    --output_file OUTPUT_FILE
    --input_arrays INPUT_ARRAYS (optional)
    --output_arrays OUTPUT_ARRAYS (optional)

optional arguments:
    -h, --help show this help message and exit

    --model_file MODEL_FILE Full filepath of tflite file containing the caffe model.

    --weight_file WEIGHT_FILE Full filepath of tflite file containing the caffe weight.

    --input_config INPUT_CONFIG Input config path.

    --output_file OUTPUT_FILE Full filepath of out Model path.

    --input_arrays INPUT_ARRAYS Names of the input arrays, comma-separated.
    Default: None.

    --output_arrays OUTPUT_ARRAYS Names of the output arrays, comma-separated.
    Default: None.

Usage example:

python3 ConvertTool.py caffe \
--model_file ~/SGS_Models/caffe/caffe_resnet50_conv/caffe_resnet50_conv.prototxt \
--weight_file ~/SGS_Models/caffe/caffe_resnet50_conv/caffe_resnet50_conv.caffemodel \
--input_config ~/SGS_Models/caffe/caffe_resnet50_conv/input_config.ini \
--output_file ./resnet50.sim

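Parameter description:

--model_file: Path to the Caffe model file.

--weight_file: Path to the Caffe weight file.

--input_config: Path to the input_config.ini file, which holds the input configuration.

--output_file: Output path of the converted model.

Optional parameters:

--input_arrays: Names of the Caffe model's input nodes; use the names of the inputs. Separate multiple input nodes with commas ( , ).

--output_arrays: Names of the model's output nodes; use the top names of the last layers. Separate multiple output nodes with commas ( , ).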

2.1.6 onnx commands

python3 Path_to_SGS_IPU_SDK/Scripts/ConvertTool/ConvertTool.py onnx -h
usage: ConvertTool.py onnx [-h]
    --model_file MODEL_FILE
    --input_arrays INPUT_ARRAYS (optional)
    --input_shapes INPUT_SHAPES
    --output_arrays OUTPUT_ARRAYS (optional)
    --input_config INPUT_CONFIG
    --output_file OUTPUT_FILE

optional arguments:
    -h, --help show this help message and exit

    --model_file MODEL_FILE Full filepath of tflite file containing the onnx model.

    --input_arrays INPUT_ARRAYS Names of the input arrays, comma-separated. (default None).

    --input_shapes INPUT_SHAPES Shapes corresponding to --input_arrays, colon-separated. For many models each shape takes the form N C H W (default None)

    --output_arrays OUTPUT_ARRAYS Names of the output arrays, comma-separated. (default None)

    --input_config INPUT_CONFIG Input config path.

    --output_file OUTPUT_FILE Full filepath of out Model path.

Usage example:

python3 ~/SGS_IPU_SDK/Scripts/ConvertTool/ConvertTool.py onnx \
--model_file ~/SGS_Models/onnx/onnx_mobilenet_v2/onnx_mobilenet_v2.onnx \
--input_shapes 1,3,224,224 \
--input_config ~/SGS_Models/onnx/onnx_mobilenet_v2/input_config.ini \
--output_file ./onnx_mobilenet_v2_float.sim

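Parameter description:

--model_file: Path to the ONNX model file.

--input_shapes: Input shapes of the ONNX model. Separate the shapes of multiple inputs with colons ( : ).

--input_config: Path to the input_config.ini file, which holds the input configuration.

--output_file: Output path of the converted model.

Optional parameters:

--input_arrays: Names of the ONNX model's input nodes; use the names of the inputs. Separate multiple inputs with commas ( , ).

--output_arrays: Names of the model's output nodes; use the output names of the last layers. Separate multiple output nodes with commas ( , ).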

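2.1.6.1. Notes on ONNX model conversion

If the model contains post-processing operators, remove them first. Only conversion of the backbone network is currently supported.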

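2.1.7 Notes

When conversion finishes, Convert Tool generates two files. For example, if --output_file is set to ./resnet50.sim, the tool produces Debug_resnet50.sim and resnet50.sim. resnet50.sim is the final converted model. Debug_resnet50.sim is an intermediate file that has not been optimized and keeps the same network structure as the original framework model, so it is useful for debugging after conversion, but it cannot run in the IPU SDK.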
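A build script may want to know both file names in advance, for example to check that conversion succeeded. The sketch below derives them from the --output_file value. The helper name expected_outputs is hypothetical, and it assumes that the Debug_ file is written to the same directory as the final model, which the text above implies but does not state outright.

```python
import posixpath


def expected_outputs(output_file):
    """Return (final_model, debug_model) paths for a given --output_file."""
    directory, name = posixpath.split(output_file)
    # Assumption: the Debug_ file is written next to the final model.
    return output_file, posixpath.join(directory, "Debug_" + name)


final_model, debug_model = expected_outputs("./resnet50.sim")
print(final_model)   # ./resnet50.sim
print(debug_model)   # ./Debug_resnet50.sim
```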

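2.2. input config settings

The --input_config parameter takes the path to input_config.ini, the configuration file for the input tensors. The file has three main purposes:

•   Configure the normalization used to preprocess images for the network model;

•   Configure the quantization of the network model's inputs and outputs;

•   Configure the quantization of the convolutions in the network model.

The main goal of input_config.ini is to let a network model be adapted quickly to SigmaStar chips. During training, different frameworks and training datasets call for different image normalization methods, so for the model to predict accurately in deployment, the preprocessing normalization used during training has to be reproduced.

Once the per-channel RGB means and std_value are set, they are written into the model at conversion time. On the hardware, the image then only needs to be resized to the network's input size; normalization is done inside the network.

In addition, the image input format used on the hardware may differ considerably from the RGB used in training. Configuring these options correctly stores the information in the converted model, so the model can be deployed directly on SigmaStar hardware.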

[INPUT_CONFIG]
;Names of the input arrays, comma-separated. The image input must be first.
inputs='data';
;Memory formats of input arrays, comma-separated.
;One of RGB, BGR, RGBA, BGRA, YUV_NV12, RAWDATA_S16_NHWC
;Each entry in the list should match an entry in inputs arrays.
training_input_formats=BGR;
input_formats=BGR;
;Indicate whether the input data needs to be quantized.
;Each entry in the list should match an entry in inputs arrays.
quantizations=TRUE;
;mean_values parameter for image models,
;Each entry in the list matches an RGB channel of (RGB, BGR, RGBA, BGRA, YUV_NV12)
mean_red=0.0;
mean_green=0.0;
mean_blue=0.0;
;std_value parameter for image models,
std_value=1.0;

[OUTPUT_CONFIG]
;Names of the output arrays, comma-separated.
outputs='prob';
;Indicate whether the output data needs to be dequantized.
;Each entry in the list should match an entry in outputs arrays.
dequantizations=TRUE;

[CONV_CONFIG]
;input_format=ALL_INT16;
tensor_arrays='conv1-1,conv2-1';

The file is divided into three main sections:

[INPUT_CONFIG]
[OUTPUT_CONFIG]
[CONV_CONFIG]

Each section is described below.

Note

String values such as tensor names must be enclosed in single quotes (''), for example outputs='detectionBoxes,detectionClasses,detectionScores,numDetections';

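2.2.1 INPUT_CONFIG

inputs: Names of the network input tensors. Separate multiple input tensors with commas ( , ). The order of the model's input tensors follows the order configured in inputs. The total length of all input names must not exceed 1024 characters.

training_input_formats: Image formats used when the network was trained, one per entry in inputs and in the same order, separated by commas ( , ). Each must be one of RGB, BGR, or RAWDATA_S16_NHWC (if training_input_formats is RAWDATA_S16_NHWC, input_formats must also be RAWDATA_S16_NHWC). training_input_formats may differ from input_formats. For example, on a SigmaStar development board input_formats may be YUV_NV12 while training_input_formats is RGB.

input_formats: Image input formats used when the network model runs on a SigmaStar chip, one per entry in inputs and in the same order, separated by commas ( , ). The supported formats are: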

RGB
BGR
RGBA
BGRA
YUV_NV12
RAWDATA_S16_NHWC

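Note:
* training_input_formats and input_formats must not be configured as either of the following combinations:
training_input_formats=RGB;
input_formats=BGR;
or
training_input_formats=BGR;
input_formats=RGB;
* For grayscale images, configure as follows:
training_input_formats=RGB;
input_formats=GRAY;
See 9.1, key points for converting grayscale models.
* When RAWDATA_S16_NHWC is configured, do not set mean_red, mean_green, mean_blue, or std_value. See 9.2, key points for converting RAWDATA_S16_NHWC models.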
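The pairing rules above can be checked before running the converter. The following sketch encodes only the constraints stated in this section. The function check_formats and the two sets are hypothetical helpers, not part of the SDK, and GRAY is included among the input formats because the grayscale note uses it even though the format list does not show it.

```python
TRAINING_FORMATS = {"RGB", "BGR", "RAWDATA_S16_NHWC"}
# GRAY appears in the grayscale note, so it is accepted here as well.
INPUT_FORMATS = {"RGB", "BGR", "RGBA", "BGRA", "YUV_NV12", "GRAY",
                 "RAWDATA_S16_NHWC"}


def check_formats(training_format, input_format):
    """Return a list of problems; an empty list means the pair looks valid."""
    problems = []
    if training_format not in TRAINING_FORMATS:
        problems.append("unknown training_input_formats: " + training_format)
    if input_format not in INPUT_FORMATS:
        problems.append("unknown input_formats: " + input_format)
    # RGB paired with BGR (in either order) is explicitly disallowed.
    if {training_format, input_format} == {"RGB", "BGR"}:
        problems.append("RGB and BGR must not be mixed")
    # RAWDATA_S16_NHWC for training requires the same value for input.
    if (training_format == "RAWDATA_S16_NHWC"
            and input_format != "RAWDATA_S16_NHWC"):
        problems.append("input_formats must also be RAWDATA_S16_NHWC")
    return problems


print(check_formats("RGB", "YUV_NV12"))  # []
print(check_formats("BGR", "RGB"))
```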

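quantizations: Indicates whether the data of each input tensor needs to be quantized, TRUE or FALSE. The default is TRUE. The number of entries equals the number of inputs. Separate the entries for multiple input tensors with commas ( , ), with no spaces in between.

mean_red / mean_green / mean_blue: During training, images are usually preprocessed. For an image with RGB channels, each channel is normalized, conventionally as (pixel - mean) / std_value, where mean_red / mean_green / mean_blue are the mean values of the corresponding channels. If the network does no normalization, set these values to 0. The number of entries for each mean equals the number of inputs. Separate the entries for multiple input tensors with commas ( , ), with no spaces in between.

std_value: The divisor in the formula above. If the network does no normalization, set this value to 1. If each channel has its own std_value, separate the values with colons ( : ), in RGB order. Separate the entries for multiple input tensors with commas ( , ), with no spaces in between.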
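To make the roles of the mean and std_value settings concrete, here is a small sketch that assumes the conventional per-channel normalization (value - mean) / std. The functions normalize_pixel and format_std_value are hypothetical helpers, and the numeric values are illustrative only, not taken from any particular model.

```python
def normalize_pixel(rgb, mean_rgb, std_rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean_rgb, std_rgb))


def format_std_value(std_rgb):
    """Format per-channel std values in R:G:B order, colon-separated."""
    return ":".join(str(s) for s in std_rgb)


# With mean 0 and std 1 the pixel is unchanged, matching "no normalization".
print(normalize_pixel((10, 20, 30), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))
# Scaling to [-1, 1] uses 127.5 for both the mean and the std.
print(normalize_pixel((255, 0, 51), (127.5,) * 3, (127.5,) * 3))
# Illustrative per-channel std values written the way the file expects.
print("std_value=" + format_std_value((58.4, 57.1, 57.4)) + ";")
```

Whatever values are used here should be the ones the model was trained with, since they are written into the converted model.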

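yuv420_h_pitch_alignment: Alignment applied in the H (horizontal) direction when YUV data is the network input. The default is 16.

yuv420_v_pitch_alignment: Alignment applied in the V (vertical) direction when YUV data is the network input. The default is 2.

xrgb_h_pitch_alignment: Alignment applied in the H (horizontal) direction when XRGB data is the network input. The default is 16.

Note

These three options can only be set to integer multiples of their default values, for example: yuv420_h_pitch_alignment=32;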
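The multiple-of-default rule is simple to verify programmatically. The sketch below uses the three defaults listed above; the names PITCH_DEFAULTS and is_valid_pitch_alignment are hypothetical.

```python
PITCH_DEFAULTS = {
    "yuv420_h_pitch_alignment": 16,
    "yuv420_v_pitch_alignment": 2,
    "xrgb_h_pitch_alignment": 16,
}


def is_valid_pitch_alignment(option, value):
    """True when value is a positive integer multiple of the option's default."""
    default = PITCH_DEFAULTS[option]
    return isinstance(value, int) and value > 0 and value % default == 0


print(is_valid_pitch_alignment("yuv420_h_pitch_alignment", 32))  # True
print(is_valid_pitch_alignment("yuv420_h_pitch_alignment", 24))  # False
```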

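2.2.2 OUTPUT_CONFIG

outputs: Names of the network output tensors (open the model in Netron to view them). Separate multiple output tensors with commas ( , ). When converting a network with post-processing, only the outputs names differ between the backbone network and the complete network; all other settings should be identical. The order of the model's output tensors follows the order configured in outputs. The total length of all output names must not exceed 1024 characters.

dequantizations: Indicates whether the data of each output tensor needs to be dequantized, TRUE or FALSE. The default is TRUE. The number of entries equals the number of outputs. Separate the entries for multiple output tensors with commas ( , ), with no spaces in between. This option only takes effect when running on the board.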
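When several models share the same layout, it can be convenient to generate input_config.ini instead of editing it by hand. The following is a minimal sketch for a model with a single image input. The function render_input_config is a hypothetical helper, it emits only the [INPUT_CONFIG] and [OUTPUT_CONFIG] keys described above, and it follows the quoting and trailing-semicolon style of the sample file.

```python
def render_input_config(input_name, output_names, training_format="BGR",
                        input_format="BGR", mean_rgb=(0.0, 0.0, 0.0),
                        std_value=1.0):
    """Render input_config.ini text for a model with one image input."""
    mean_red, mean_green, mean_blue = mean_rgb
    lines = [
        "[INPUT_CONFIG]",
        # String values are wrapped in single quotes, as the note requires.
        f"inputs='{input_name}';",
        f"training_input_formats={training_format};",
        f"input_formats={input_format};",
        "quantizations=TRUE;",
        f"mean_red={mean_red};",
        f"mean_green={mean_green};",
        f"mean_blue={mean_blue};",
        f"std_value={std_value};",
        "",
        "[OUTPUT_CONFIG]",
        "outputs='" + ",".join(output_names) + "';",
        # One TRUE per output, comma-separated with no spaces.
        "dequantizations=" + ",".join(["TRUE"] * len(output_names)) + ";",
    ]
    return "\n".join(lines) + "\n"


text = render_input_config("data", ["prob"])
print(text)
```

Writing the returned text to a file and passing its path to --input_config completes the workflow; models with several inputs need comma-separated entries for each key, which this sketch does not handle.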

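2.2.3 CONV_CONFIG

input_format: Quantization mode for all convolutions in the network. The default is UINT8; the available options are ALL_UINT8, ALL_INT16, CONV2D_INT16, and DEPTHWISE_INT16.

ALL_UINT8: Quantize all convolutions as UINT8.
ALL_INT16: Quantize all convolutions as INT16.
CONV2D_INT16: Quantize only regular convolutions as INT16.
DEPTHWISE_INT16: Quantize only depthwise convolutions as INT16.

In UINT8 mode, convolutions use less bandwidth and run faster. In INT16 mode, convolution precision improves substantially, at some cost in speed.

tensor_arrays: Quantization mode for specific convolution layers. When the network as a whole uses the default UINT8 but certain convolution layers need higher precision, list the name of the first input (the input id name) of each of those layers. Separate multiple names with commas ( , ).

Note

The name of a convolution's first input can be viewed with Netron. Per-layer quantization settings have no effect on the first convolution layer in the network.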

2.2.4 OPTIMIZE_CONFIG

skip_concatenation: Applies to Concatenation layers in the network that concatenate only along the last dimension. In the Fixed model, such a Concatenation layer is converted to skip_concatenation, which greatly reduces run time.

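2.2.5 Summary of input and output configuration

training_input_formats	input_formats	Data alignment when running on the board
RGB/BGR	RGB/BGR	No alignment needed
RGB/BGR	RGBA/BGRA	W = ALIGN_UP(W * 4, `xrgb_h_pitch_alignment`) / 4
RGB/BGR	YUV_NV12/GRAY	H = ALIGN_UP(H, `yuv420_v_pitch_alignment`); W = ALIGN_UP(W, `yuv420_h_pitch_alignment`)
RAWDATA_S16_NHWC	RAWDATA_S16_NHWC	Last dimension = ALIGN_UP(last dimension, `8`)

The model output is INT16. If dequantizations is TRUE, the output is converted to float32 when running on the board. Output data is aligned as follows: last dimension = ALIGN_UP(last dimension, 8). See section 5.2.1 for how to remove the unused data.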
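The alignment formulas in the table can be worked through in a few lines. This sketch assumes ALIGN_UP rounds a value up to the next multiple of the alignment, which is the usual meaning of the name. The functions align_up and aligned_input_size are hypothetical helpers, and the default arguments mirror the default pitch alignments from section 2.2.1.

```python
def align_up(value, alignment):
    """Round value up to the nearest multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def aligned_input_size(height, width, input_format,
                       yuv420_h=16, yuv420_v=2, xrgb_h=16):
    """Return the (H, W) the board expects for a given input format."""
    if input_format in ("RGB", "BGR"):
        return height, width  # no alignment needed
    if input_format in ("RGBA", "BGRA"):
        # The row pitch in bytes (4 per pixel) is aligned, then converted back.
        return height, align_up(width * 4, xrgb_h) // 4
    if input_format in ("YUV_NV12", "GRAY"):
        return align_up(height, yuv420_v), align_up(width, yuv420_h)
    raise ValueError("unsupported input format: " + input_format)


print(aligned_input_size(299, 299, "YUV_NV12"))  # (300, 304)
print(aligned_input_size(299, 299, "RGBA"))      # (299, 300)
print(align_up(1001, 8))                         # 1008, last output dimension
```

For example, a 1001-element output is padded to 1008 elements on the board, so the last 7 values are padding to be discarded.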