Revision History
Revision Number | Description | Revision Date |
---|---|---|
001 | Initial version | December 16 |
Table of Contents
1 Introduction
1.1 Related Information
1.2 Installing Intel® Deep Learning SDK Deployment Tool
1.3 Conventions and Symbols
1.4 Introducing the Intel® Deep Learning SDK Deployment Tool
2 Using the Intel® Deep Learning SDK Deployment Tool
2.1 Typical Usage Model
2.2 Model Optimizer Overview
2.2.1 Prerequisites
2.2.2 Running the Model Optimizer
2.2.3 Known Issues and Limitations
2.3 Inference Engine Overview
2.3.1 Building the Sample Applications
2.3.2 Running the Sample Applications
3 End-to-End User Scenarios
3.1 Inferring an Image Using the Intel® Math Kernel Library for Deep Neural Networks Plugin
1 Introduction
The Intel® Deep Learning SDK Deployment Tool User Guide provides guidance on how to use the Deployment Tool to optimize trained deep learning models and integrate the inference with application logic using a unified API. See the End-to-End User Scenarios chapter to find usage samples.
This guide does not provide information about the Intel® Deep Learning SDK Training Tool. For this information, see the Intel® Deep Learning SDK Training Tool User Guide.
1.1 Related Information
For more information on SDK requirements, new features, known issues and limitations, refer to the Release Notes document.
1.2 Installing Intel® Deep Learning SDK Deployment Tool
For installation steps please refer to the Intel® Deep Learning SDK Deployment Tool Installation Guide.
1.3 Conventions and Symbols
The following conventions are used in this document.
Abbreviation | Definition |
---|---|
SDK | Software Development Kit |
API | Application Programming Interface |
IR | Internal representation of a deep learning network |
CNN | Convolutional Neural Network |
1.4 Introducing the Intel® Deep Learning SDK Deployment Tool
The Intel® Deep Learning SDK Deployment Tool is a feature of the Intel® Deep Learning SDK, which is a free set of tools for data scientists, researchers, and software developers to develop, train, and deploy deep learning solutions.
With the Intel® Deep Learning SDK Deployment Tool you can:
- Optimize trained deep learning networks through model compression and weight quantization, tailored to the characteristics of the endpoint device.
- Deliver a unified API to integrate inference with application logic.
The Deployment Tool comprises two main components:
Model Optimizer
Model Optimizer is a cross-platform command line tool that:
- Takes as input a trained network, including its topology, parameters, and adjusted weights and biases. The input network must be produced with the Caffe* framework.
- Performs horizontal and vertical fusion of the network layers.
- Prunes unused branches in the network.
- Applies weights compression methods.
- Produces as output an Internal Representation (IR) of the network – a pair of files that describe the whole model:
- Topology file – an .xml file that describes the network topology.
- Trained data file – a .bin file that contains the weights and biases binary data.
- The produced IR is used as an input for the Inference Engine.
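For illustration only, the IR pair produced for a converted network might appear on disk as follows. The Artifacts output directory is described later in this guide; the file names shown here are placeholders that depend on the network:
Artifacts/<NetworkName>/<network_name>.xml
Artifacts/<NetworkName>/<network_name>.bin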
Inference Engine
Inference Engine is a runtime which:
- Takes as input an IR produced by Model Optimizer
- Optimizes inference execution for target hardware
- Delivers an inference solution with a reduced footprint on embedded inference platforms
- Enables seamless integration with application logic and eases transitions between Intel® platforms by supporting the same API across a variety of platforms.
2 Using the Intel® Deep Learning SDK Deployment Tool
2.1 Typical Usage Model
The typical usage flow of the Deployment Tool to perform inference of a trained deep neural network model is as follows. You can train a model using the Intel® Deep Learning SDK Training Tool or the Caffe* framework.
- Provide the model in the Caffe* format to the Model Optimizer to produce the IR of the model based on the network topology, weight and bias values, and other parameters.
- Test the model in the IR format using the Inference Engine in the target environment. The Deployment Tool contains sample Inference Engine applications. For more information, see the Running the Sample Applications section.
- Integrate the Inference Engine in your application and deploy the model in the target environment.
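A minimal command-line sketch of this flow is shown below, assuming a model trained with Caffe*; all paths and file names are placeholders that must be adapted to your installation, and the individual options are described in the Running the Model Optimizer and Running the Sample Applications sections:
./ModelOptimizer -w <path_to_model>/alexnet.caffemodel -i -p FP32 -d <path_to_model>/deploy.prototxt -f 1 -b 1
./classification_sample -i <path_to_image>/cat.bmp -m <path_to_ir>/alexnet.xml -p MKLDNNPlugin -pp /path/to/plugins/directory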
2.2 Model Optimizer Overview
The Model Optimizer is a cross-platform command line tool that facilitates transition between training and deployment environments.
The Model Optimizer:
- Converts a trained model from a framework-specific format to a unified framework-independent format (IR). The current version supports conversion of models in Caffe* format only.
- Can optimize a trained model by removing redundant layers and fusing layers, for instance, Batch Normalization and Convolution layers.
The Model Optimizer takes a trained model in Caffe* format (a .prototxt file with the network topology and a .caffemodel file with the network weights) and outputs a model in the IR format (an .xml file with the network topology and a binary .bin file with the network weights).
The Model Optimizer is also included in distributions of the Intel® Deep Learning SDK Training Tool.
2.2.1 Prerequisites
- The Model Optimizer is distributed as a set of binary files (an executable binary and a set of shared objects) for 64-bit Ubuntu* OS only. The files can reside in any directory with write permissions set.
- The Caffe* framework and all of its prerequisites must be installed. The libcaffe.so shared object must be available (see the example check after this list).
- The Model Optimizer works with the Berkeley* community version of Caffe* and the Intel® distribution of Caffe*. It may fail to work with other versions of Caffe*.
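For example, assuming Caffe* was built under ${CAFFE_ROOT} (a placeholder for your Caffe* installation directory, as used later in this guide), you can verify that the shared object is present and make it visible to the loader:
ls ${CAFFE_ROOT}/build/lib/libcaffe.so
export LD_LIBRARY_PATH=${CAFFE_ROOT}/build/lib:$LD_LIBRARY_PATH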
2.2.2 Running the Model Optimizer
To run the Model Optimizer, perform the following steps:
- Add the path to the libcaffe.so shared object and the path to the Model Optimizer executable binary to LD_LIBRARY_PATH.
- Change the current directory to the Model Optimizer bin directory. For example:
cd /opt/intel/deep_learning_sdk_2016.1.0.<build_number>/deployment_tools/model_optimizer
- Run the ./ModelOptimizer command with the desired command line arguments:
- "-w" - Path to a binary file with the model weights (a .caffemodel file)
- "-i" - Generate IR
- "-p" - Desired precision (for now, must be FP32, because the Intel® MKL-DNN plugin currently supports only FP32)
- "-d" - Path to a file with the network topology (a .prototxt file)
- "-b" - Batch size; an optional parameter, equal to the number of CPU cores by default
- "-ms" - Mean image values per channel
- "-mf" - File with the mean image in the binaryproto format
- "-f" - Network normalization factor (for now, must be set to 1, which corresponds to the FP32 precision)
Some models require subtracting the image mean from each image during both training and deployment. There are two options available for the subtraction:
- "-ms" - subtracts mean values per channel
- "-mf" - subtracts the whole mean image
The mean image file must be in the binaryproto format. For the ilsvrc12 dataset, the mean image file can be downloaded with the get_ilsvrc_aux.sh script from Caffe*:
./data/ilsvrc12/get_ilsvrc_aux.sh
The Model Optimizer creates a text .xml file and a binary .bin file with the model in the IR format in the Artifacts directory inside your current directory.
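As an illustrative example, the following command converts a Caffe* model to the IR format and subtracts the whole mean image via "-mf". All paths and file names are placeholders; imagenet_mean.binaryproto is typically among the files fetched by the get_ilsvrc_aux.sh script mentioned above:
./ModelOptimizer -w <path_to_model>/alexnet.caffemodel -i -p FP32 -d <path_to_model>/deploy.prototxt -f 1 -b 1 -mf <path_to_caffe>/data/ilsvrc12/imagenet_mean.binaryproto
An equivalent invocation that uses per-channel mean values ("-ms") instead is shown in the Inferring an Image Using the Intel® Math Kernel Library for Deep Neural Networks Plugin section.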
2.2.3 Known Issues and Limitations
The current version of the Model Optimizer has the following limitations:
- It is distributed for 64-bit Ubuntu* OS only.
- It can process models in Caffe* format only.
- It can process popular image classification network models, including AlexNet, GoogleNet, VGG-16, LeNet, and ResNet-152, and fully convolutional network models like FCN8 that are used for image segmentation. It may fail to support a custom network.
2.3 Inference Engine Overview
The Inference Engine facilitates the deployment of deep learning solutions by delivering a unified API to integrate the inference with application logic.
The current version of the Inference Engine supports inference of popular image classification networks, including LeNet, AlexNet, GoogleNet, VGG-16, VGG-19, and ResNet-152, and fully convolutional networks like FCN8 used for image segmentation.
The Inference Engine package contains headers, libraries, and two sample console applications:
- Sample Application for Image Classification - The application demonstrates how you can use Inference Engine for inference of popular image classification networks like AlexNet and GoogleNet.
- Sample Application for Image Segmentation - The application demonstrates how you can use Inference Engine for inference of image segmentation networks like FCN8.
2.3.1 Building the Sample Applications
The recommended build environment is the following:
- Ubuntu* x86_64 version 14.04 or higher, GCC* version 4.8 or higher
- CMake* version 2.8 or higher.
You can build the sample applications using the CMake file in the samples directory.
In the samples directory, create a new directory that will be used for building:
$ mkdir build
$ cd build
Run CMake to generate Make files:
$ cmake <path_to_samples_directory>
Run Make to build the application:
$ make
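Putting these steps together, and assuming the samples are located under <path_to_DLSDK>/deployment_tools/inference_engine/samples (the exact location may differ in your installation), the build sequence might look like the following; "cmake .." is equivalent to passing the path to the samples directory from the build directory:
$ cd <path_to_DLSDK>/deployment_tools/inference_engine/samples
$ mkdir build
$ cd build
$ cmake ..
$ make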
2.3.2 Running the Sample Applications
Running the sample application for image classification
Running the application with the -h option shows the usage prompt:
$ ./classification_sample --help
classification_sample [OPTION]
Options:
-h, --help Print a usage message.
-i "<path1>""<path2>" ..., --images "<path1>""<path2>" ...
Path to a folder with images or paths to image files: a .ubyte file for LeNet or a .bmp file for the other networks.
-m "<path>", --model "<path>" Path to an .xml file with a trained model.
-p "<name>", --plugin "<name>" Plugin name. For example MKLDNNPlugin.
-pp "<path>", --plugin_path Path to a plugin folder.
-ni N, --niter N The number of iterations to do inference; 1 by default.
-l "<path>", --label "<path>" Path to a file with labels for a model.
-nt N, --ntop N Number of top results to output; 10 by default.
-pc, --performance_counts Enables printing of performance counts.
The sample commands below demonstrate using the sample application for image classification to perform inference on an image with a trained AlexNet network and the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) plugin:
$ ./classification_sample -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet.xml -p MKLDNNPlugin -pp /path/to/plugins/directory
By default, the application outputs the top 10 inference results. Add the --ntop or -nt option to the previous command to modify the number of top output results. For example, to get the top 5 results, you can use the following command:
$ ./classification_sample -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet.xml -p MKLDNNPlugin -nt 5 -pp /path/to/plugins/directory
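The remaining options can be combined in the same way. For example, the following command (the labels file path is a placeholder) runs 100 inference iterations, prints performance counters, and maps the results to human-readable labels:
$ ./classification_sample -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet.xml -p MKLDNNPlugin -pp /path/to/plugins/directory -ni 100 -pc -l <path_to_labels>/labels.txt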
Running the sample application for image segmentation
Running the application with the -h option shows the usage message:
$ ./segmentation_sample -h
segmentation_sample [OPTION]
Options:
-h Print a usage message.
-i "" Path to a .bmp image.
-m "" Path to an .xml file with a trained model.
-p "" Plugin name. For example MKLDNNPlugin.
You can use the following command to do inference on an image using a trained FCN8 network:
$ ./segmentation_sample -i <path_to_image>/inputImage.bmp -m <path_to_model>/fcn8.xml -p MKLDNNPlugin -pp /path/to/plugins/directory
The application output is a segmented image (out.bmp).
3 End-to-End User Scenarios
3.1 Inferring an Image Using the Intel® Math Kernel Library for Deep Neural Networks Plugin
- Configure the Model Optimizer as described above and go to the folder with the binaries:
cd <path_to_DLSDK>/deployment_tools/model_optimizer
- Add the path to the libcaffe.so shared object and the Model Optimizer folder to the LD_LIBRARY_PATH variable:
export LD_LIBRARY_PATH=${CAFFE_ROOT}/build/lib:<path_to_DLSDK>/deployment_tools/model_optimizer
- Configure the Model Optimizer for the Intel® MKL-DNN plugin using the command line arguments listed in the Running the Model Optimizer section, and run the following command:
./ModelOptimizer -w <path_to_network.caffemodel> -i -p FP32 -d <path_to_deploy.prototxt> -f 1 -b 1 -ms "104.00698793,116.66876762,122.67891434"
After a successful run, the IR of the model is located in the following directory:
<path_to_DLSDK>/deployment_tools/model_optimizer/bin/Artifacts/<NetworkName>/
- Compile the Inference Engine classification sample application as described in the Building the Sample Applications section.
- Go to the directory with the compiled binaries:
cd <path_to_DLSDK>/deployment_tools/inference_engine/bin/intel64/
- Infer an image using the trained and optimized model:
$ ./classification_sample -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet.xml -p MKLDNNPlugin -pp /path/to/plugins/directory
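For convenience, these steps can be collected into a single shell script. This is a sketch only: CAFFE_ROOT and DLSDK_ROOT are placeholder variables introduced here for readability, and all model, image, and plugin paths must be adapted to your installation.
# End-to-end sketch of the scenario above; every path below is a placeholder.
export CAFFE_ROOT=/path/to/caffe
export DLSDK_ROOT=/path/to/deep_learning_sdk
export LD_LIBRARY_PATH=${CAFFE_ROOT}/build/lib:${DLSDK_ROOT}/deployment_tools/model_optimizer

# 1. Convert the trained Caffe* model to the IR format.
cd ${DLSDK_ROOT}/deployment_tools/model_optimizer
./ModelOptimizer -w <path_to_network.caffemodel> -i -p FP32 -d <path_to_deploy.prototxt> -f 1 -b 1 -ms "104.00698793,116.66876762,122.67891434"

# 2. Infer an image with the classification sample using the produced IR.
cd ${DLSDK_ROOT}/deployment_tools/inference_engine/bin/intel64/
./classification_sample -i <path_to_image>/cat.bmp -m ${DLSDK_ROOT}/deployment_tools/model_optimizer/bin/Artifacts/<NetworkName>/<network_name>.xml -p MKLDNNPlugin -pp /path/to/plugins/directory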