Using Inference Engine Samples
The Inference Engine sample applications are simple console applications that demonstrate how to use Intel's Deep Learning Inference Engine in your applications.
Samples in the Samples Directory
The following sample applications are available in the samples directory in the Inference Engine installation directory:
Sample | Description |
---|---|
Image Classification Sample | Inference of image classification networks like AlexNet and GoogLeNet (the sample supports only images as inputs) |
Image Classification Sample, pipelined | Maximize performance via pipelined execution (the sample supports only images as inputs) |
Security Barrier Camera Sample | Vehicle Detection followed by the Vehicle Attributes |
Object Detection for Faster R-CNN Sample | Inference of object detection networks like Faster R-CNN (the sample supports only images as inputs) |
Image Segmentation Sample | Inference of image segmentation networks like FCN8 (the sample supports only images as inputs) |
Object Detection for SSD Demonstration, Async API Performance Showcase | Demonstration application for SSD-based Object Detection networks, new Async API performance showcase, and simple OpenCV interoperability (supports video and camera inputs) |
Object Detection for SSD Sample | Inference of object detection networks based on the SSD. This sample is a simplified version that supports only images as inputs |
Neural Style Transfer Sample | Style Transfer sample (the sample supports only images as inputs) |
Hello Infer Request Classification Sample | Inference of image classification networks via Infer Request API (the sample supports only images as inputs) |
Interactive Face Detection Sample | Face Detection coupled with Age-Gender and Head-Pose, supports video and camera inputs |
Security Barrier Camera Example | Supports images/video and camera inputs |
Validation Application | Infers a pack of images, resulting in total accuracy (only images as inputs) |
Samples That Support Pre-Trained Models Shipped With the Product
Several pre-trained models are provided with the product. The table below shows which samples and devices support each model. The samples are available in <INSTALL_DIR>/deployment_tools/inference_engine/samples
Model | Sample Supported on the Model | CPU | Intel® Integrated Graphics | HETERO:FPGA,CPU | Intel® Movidius™ Myriad™ 2 VPU |
---|---|---|---|---|---|
face-detection-adas-0001 | Interactive Face Detection Sample | x | x | x | |
age-gender-recognition-retail-0013 | Interactive Face Detection Sample | x | x | x | x |
head-pose-estimation-adas-0001 | Interactive Face Detection Sample | x | x | x | x |
vehicle-license-plate-detection-barrier-0007 | Security Barrier Camera Sample | x | x | x | x |
vehicle-attributes-recognition-barrier-0010 | Security Barrier Camera Sample | x | x | x | x |
license-plate-recognition-barrier-0001 | Security Barrier Camera Sample | x | x | x | x |
person-detection-retail-0001 | Object Detection Sample | x | x | x | |
person-detection-retail-00012 | Any sample that supports SSD-based models | x | x | x | |
face-detection-retail-0004 | Any sample that supports SSD-based models | x | x | x | x |
person-vehicle-bike-detection-crossroad-0066 | Any sample that supports SSD-based models | x | x | x | |
Inferring Your Model with the Inference Engine Samples
Building the Sample Applications on Linux
Supported Linux build environment:
- Ubuntu* 16.04 LTS 64-bit or CentOS* 7.4 64-bit
- GCC* 5.4.0 (for Ubuntu* 16.04) or GCC* 4.8.5 (for CentOS* 7.4)
- CMake* version 2.8 or higher.
- OpenCV* 3.3 or later (required for some samples and demonstrations). Use the Intel® CV SDK installation download and instructions to complete this installation.
Follow these steps to prepare your Linux computer for the samples:
- Go to the samples directory:
<INSTALL_DIR>/deployment_tools/inference_engine/samples/
- Create a directory. This example uses a directory named build:
mkdir build
- Go to the new directory:
cd build
- Run CMake to generate the Make files with or without debug information:
- Without debug information:
cmake -DCMAKE_BUILD_TYPE=Release <path_to_inference_engine_samples_directory>
- With debug information:
cmake -DCMAKE_BUILD_TYPE=Debug <path_to_inference_engine_samples_directory>
- Build the application:
make
The sample application binaries are in <INSTALL_DIR>/deployment_tools/inference_engine/samples/intel64/Release/
Building the Sample Applications on Windows*
Supported Windows build environment:
- Microsoft Windows* 10
- Microsoft Visual Studio* 2015
- CMake* 2.8 or later
- OpenCV* 3.3 or later. Use the Intel® CV SDK installation download and instructions to complete this installation.
- Intel C++ Compiler 2017 Redistributable package for Windows
Follow these steps to prepare your Windows computer for the samples:
- Go to the samples directory.
- Double-click create_msvc_solution.bat
- Open Microsoft Visual Studio* 2015
- Build samples\build\Samples.sln
Set Your Environment Variables
Use these steps to make sure your application can find the Inference Engine libraries.
For Linux, execute the following command to set the environment variable:
source <INSTALL_DIR>/deployment_tools/inference_engine/bin/setupvars.sh
where <INSTALL_DIR> is the Intel CV SDK installation directory.
Running the Samples
Image Classification Sample
Description
The Image Classification sample application does inference using image classification networks, like AlexNet* and GoogLeNet*. The sample application reads command line parameters and loads a network and an image to the Inference Engine plugin. When inference is done, the application creates an output image and outputs data to the standard output stream.
Running the Application
Running the application with the -h option results in the message:
$ ./classification_sample -h
InferenceEngine:
    API version ............ <version>
    Build .................. <number>
classification_sample [OPTION]
Options:
    -h  Print a usage message.
    -i "<path1>""<path3>"  Required. Path to a directory with images or path to an image files: a .ubyte file for LeNet* and a .bmp file for the other networks.
    -m "<path>"  Required. Path to an .xml file with a trained model.
    -l "<absolute_path>"  Optional. Absolute path to library with MKL-DNN (CPU) custom layers (*.so).
    Or
    -c "<absolute_path>"  Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
    -pp "<path>"  Path to a plugin directory.
    -d "<device>"  Specify the target device to infer on; CPU, Intel® Integrated Graphics, or MYRIAD is acceptable. Sample will look for a suitable plugin for device specified
    -nt "<integer>"  Number of top results (default 10)
    -ni "<integer>"  Number of iterations (default 1)
    -pc  Enables per-layer performance report
Running the application with an empty list of options results in an error message and the usage list above.
To do inference on an image using a trained AlexNet network on Intel® Processors:
$ ./classification_sample -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet_fp32.xml
Output Description
By default the application outputs top-10 inference results. Add the -nt option to the previous command to modify the number of top output results. For example, to get the top-5 results on Intel® HD Graphics, use the command:
$ ./classification_sample -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet_fp32.xml -nt 5 -d GPU
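As an illustration of what the -nt option selects, the top-N step can be sketched in C++ as follows. This is a minimal sketch, not the sample's own code: the function name topN and the plain vector of scores are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the indices of the n highest scores,
// ordered from the highest score to the lowest.
std::vector<std::size_t> topN(const std::vector<float>& scores, std::size_t n) {
    std::vector<std::size_t> idx(scores.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});  // 0, 1, 2, ...
    n = std::min(n, idx.size());                        // never request more than exist
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(n),
                      idx.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          return scores[a] > scores[b];  // descending by score
                      });
    idx.resize(n);
    return idx;
}
```

Calling topN(scores, 5) on the network output would correspond to running the sample with -nt 5.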
Image Classification - Pipelined
Description
This sample demonstrates how to build and execute inference in pipelined mode, using classification networks as an example.
The pipelined mode might increase the throughput of the pictures. The latency of one inference will be the same as for synchronous execution. The throughput increases for the following reasons:
- Some plugins are heterogeneous internally: data transfer, execution on a remote device, and pre-processing and post-processing on the host
- An explicit heterogeneous plugin executes different parts of the network on different devices
When two or more devices are involved in the inference of one picture, creating several infer requests and starting asynchronous inference lets you use the devices most efficiently. If two devices are involved in execution, the optimal value for -nireq is 2.
To do this efficiently, the Classification Sample Async uses a round-robin algorithm for inference requests. It starts by executing the current inference request and switches to waiting for the previous request results. After finishing the wait, the application switches inference requests and repeats the procedure.
Another required aspect for good throughput is the number of iterations. Only with a large number of iterations can you emulate the application work and see performance results.
The batch mode is independent of the pipelined mode. The pipelined mode works efficiently with any batch size.
The sample application reads command line parameters and loads a network and an image to the Inference Engine plugin. Then the application creates the number of infer requests specified in the -nireq parameter and loads pictures for inference.
Then, in a loop, it starts inference for the current infer request and switches to waiting for another one. When results are ready, the inference requests are swapped.
When inference is done, the application outputs data to the standard output stream.
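The round-robin scheme described above can be sketched with standard C++ futures. This is a simplified illustration under stated assumptions, not the sample's implementation: runRoundRobin and the Infer callback are hypothetical names, std::async stands in for the Inference Engine infer requests, and each slot in the vector plays the role of one of the -nireq requests.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for one inference: maps an input to a result.
using Infer = std::function<int(int)>;

// Runs `infer` over all inputs while keeping up to `nireq` requests in flight.
// Results are returned in input order.
std::vector<int> runRoundRobin(const std::vector<int>& inputs,
                               std::size_t nireq, const Infer& infer) {
    assert(nireq > 0);
    std::vector<std::future<int>> slots(nireq);  // one slot per infer request
    std::vector<int> results;
    results.reserve(inputs.size());
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        std::future<int>& slot = slots[i % nireq];
        // Before reusing a slot, wait for the request previously started in it.
        if (slot.valid()) results.push_back(slot.get());
        // Start the next request; this returns immediately.
        slot = std::async(std::launch::async, infer, inputs[i]);
    }
    // Drain the requests that are still in flight, oldest first.
    for (std::size_t k = 0; k < nireq; ++k) {
        std::future<int>& slot = slots[(inputs.size() + k) % nireq];
        if (slot.valid()) results.push_back(slot.get());
    }
    return results;
}
```

With nireq set to 2, one request runs while the other is being waited on, which mirrors the two-device case mentioned above.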
Running the Application
Running the application with the -h option results in the message:
./classification_sample -h
InferenceEngine:
    API version ............ <version>
    Build .................. <number>
classification_sample [OPTION]
Options:
    -h  Print a usage message.
    -i "<path1>""<path3>"  Required. Path to a directory with images or path to an image files: a .ubyte file for LeNet and a .bmp file for the other networks.
    -m "<path>"  Required. Path to an .xml file with a trained model.
    -l "<absolute_path>"  Optional. Absolute path to library with Intel® MKL-DNN (CPU) custom layers (*.so).
    Or
    -c "<absolute_path>"  Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
    -pp "<path>"  Path to a plugin directory.
    -d "<device>"  Specify the target device to infer on; CPU, Intel® Integrated Graphics or MYRIAD is acceptable. Sample will look for a suitable plugin for device specified
    -nt "<integer>"  Number of top results (default 10)
    -ni "<integer>"  Number of iterations (default 1)
    -pc  Enables per-layer performance report
Running the application with an empty list of options results in an error message and the usage list above.
To do inference on an image using a trained AlexNet network on FPGA with a fallback to Intel® Processors:
$ ./classification_sample_async -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet_fp32.xml -nt 5 -d HETERO:FPGA,CPU -nireq 2 -ni 200
Output Description
By default the application outputs top-10 inference results for each infer request. In addition, it reports the throughput value measured in frames per second.
Security Barrier Camera Sample
Description
Showcases Vehicle Detection, followed by Vehicle Attributes and License Plate Recognition applied on top of Vehicle Detection. The sample uses the following pre-trained models from the intel_models directory:
- vehicle-license-plate-detection-barrier-0007: The primary detection network to find the vehicles and license plates.
- vehicle-attributes-recognition-barrier-0010: Executed on top of the results from vehicle-license-plate-detection-barrier-0007. The vehicle attributes network reports the general vehicle attributes, like the vehicle type and color, where type is something like car, van, or bus.
- license-plate-recognition-barrier-0001: Executed on top of the results from vehicle-license-plate-detection-barrier-0007. The license plate recognition network reports a string for each recognized license plate.
For topology details, see the descriptions in the intel_models directory.
Other demonstration objectives:
- Show images/video/camera as inputs, via OpenCV*
- Show an example of simple network pipelining: Attributes and LPR networks are executed on top of the Vehicle Detection results
- Show vehicle attributes and licence plate information for each detected vehicle
How it Works
The application reads command line parameters and loads the specified networks. The Vehicle/License-Plate Detection network is required, and the other two are optional.
Upon getting a frame from the OpenCV* VideoCapture, the application performs inference of vehicles and license plates, then performs another two inferences using the Vehicle Attributes and LPR networks (if those are specified on the command line) and displays the results.
Running the Application
Running the application with the -h option results in the message:
$ ./security_barrier_sample -h
InferenceEngine:
    API version ............ 1.0
[ INFO ] Parsing input parameters
interactive_vehicle_detection [OPTION]
Options:
    -h  Print a usage message.
    -i "<path>"  Required. Path to a video or image file. Default value is "cam" to work with camera.
    -m "<path>"  Required. Path to the Vehicle/License-Plate Detection model (.xml) file.
    -m_va "<path>"  Optional. Path to the Vehicle Attributes model (.xml) file.
    -m_lpr "<path>"  Optional. Path to the License-Plate Recognition model (.xml) file.
    -l "<absolute_path>"  For Intel® MKL-DNN (CPU)-targeted custom layers, if any. Absolute path to a shared library with the kernels impl.
    Or
    -c "<absolute_path>"  For Intel® Integrated Graphics-targeted custom kernels, if any. Absolute path to the xml file with the kernels desc.
    -d "<device>"  Specify the target device for Vehicle Detection (CPU, Intel® Integrated Graphics, FPGA, MYRYAD, or HETERO).
    -d_va "<device>"  Specify the target device for Vehicle Attributes (CPU, Intel® Integrated Graphics, FPGA, MYRYAD, or HETERO).
    -d_lpr "<device>"  Specify the target device for License Plate Recognition (CPU, Intel® Integrated Graphics, FPGA, MYRYAD, or HETERO).
    -pc  Enables per-layer performance statistics.
    -r  Output Inference results as raw values.
    -t  Probability threshold for Vehicle/Licence-Plate detections.
Running the application with an empty list of options results in an error message and the usage list above.
Demonstration Output
The demonstration uses OpenCV* to display the resulting frame with detections rendered as bounding boxes and text.
Object Detection for Faster R-CNN Sample
Description
VGG16-Faster-RCNN is a public CNN that can be easily obtained from GitHub.
The sample application reads command line parameters and loads a network and an image to the Inference Engine plugin. When inference is done, the application creates an output image and outputs data to the standard output stream.
Downloading and Converting a Caffe* Model
- Download test.prototxt from https://raw.githubusercontent.com/rbgirshick/py-faster-rcnn/master/models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt
- Download the pretrained models from https://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz?dl=0
- Unzip the archive and make sure you have the file named VGG16_faster_rcnn_final.caffemodel.
To convert the source model correctly, run the Model Optimizer with the extension for the Python proposal layer:
python3 ${MO_ROOT_PATH}/mo_caffe.py --input_model <path_to_model>/VGG16_faster_rcnn_final.caffemodel --input_proto <path_to_model>/deploy.prototxt --extensions <path_to_object_detection_sample>/fasterrcnn_extensions
Running the Application
Running the application with the -h option results in the message:
$ ./object_detection_sample -h
InferenceEngine:
    API version ............ <version>
    Build .................. <number>
object_detection_sample [OPTION]
Options:
    -h  Print a usage message.
    -i "<path>"  Required. Path to an image file.
    -m "<path>"  Required. Path to an .xml file with a trained model.
    -l "<absolute_path>"  Optional. Absolute path to library with MKL-DNN (CPU) custom layers (*.so).
    Or
    -c "<absolute_path>"  Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
    -pp "<path>"  Path to a plugin directory.
    -d "<device>"  Specify the target device to infer on; CPU or Intel® Integrated Graphics is acceptable. The sample looks for a suitable plugin for the device specified
    -ni "<integer>"  Number of iterations (default 1)
    -pc  Enables per-layer performance report
Running the application with an empty list of options results in an error message and the usage list above.
Use the following command to do inference on Intel® Processors on an image using a trained Faster R-CNN network:
$ ./object_detection_sample -i <path_to_image>/inputImage.bmp -m <path_to_model>/faster-rcnn.xml -d CPU
Output Description
The application outputs an image named out_0.bmp with detected objects enclosed in rectangles. It outputs the list of classes of the detected objects along with the respective confidence values and the coordinates of the rectangles to the standard output stream.
Using this Sample with the Intel Person Detection Model
This model has a non-default (for Faster-RCNN) output layer name. To score it correctly, add the option --bbox_name detector/bbox/ave_pred to the command line.
Usage example:
./object_detection_sample -i /home/user/people.jpg -m /<ie_path>/intel_models/person-detection-retail-0001/FP32/person-detection-retail-0001.xml --bbox_name detector/bbox/ave_pred -d CPU
Object Detection SSD, Async API Performance Showcase Sample
Description
This demonstration showcases Object Detection with SSD and new Async API. Async API usage can improve overall frame-rate of the application, because rather than wait for inference to complete, the app can continue doing things on the host, while accelerator is busy. Specifically, this demonstration keeps two parallel infer requests and while the current is processed, the input frame for the next is being captured. This essentially hides the latency of capturing, so that the overall framerate is rather determined by the MAXIMUM(detection time, input capturing time) and not the SUM(detection time, input capturing time).
The technique can be generalized to any available parallel slack, such as doing inference while simultaneously encoding the resulting (previous) frames, or running further inference, like emotion detection on top of the face detection results.
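The MAXIMUM versus SUM relationship above can be made concrete with a few lines of arithmetic. This is an illustrative sketch only: the FrameTimes struct, the function names, and the timings used below are hypothetical, and the model ignores all overheads other than capture and detection.

```cpp
#include <algorithm>

// Hypothetical per-frame timings, in milliseconds.
struct FrameTimes {
    double captureMs;  // time to capture (decode) one input frame
    double detectMs;   // time to run detection on one frame
};

// Sync mode: capture and detection run back-to-back, so their times add up.
double syncFrameMs(const FrameTimes& t) {
    return t.captureMs + t.detectMs;
}

// Async mode: capture of the next frame overlaps with detection of the
// current one, so the slower of the two stages sets the pace.
double asyncFrameMs(const FrameTimes& t) {
    return std::max(t.captureMs, t.detectMs);
}

// Converts a per-frame time in milliseconds to frames per second.
double framesPerSecond(double frameMs) {
    return 1000.0 / frameMs;
}
```

With a hypothetical 10 ms capture and 30 ms detection, syncFrameMs gives 40 ms per frame (25 fps), while asyncFrameMs gives 30 ms per frame (about 33 fps).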
Be aware of performance caveats though. When running tasks in parallel, avoid over-using shared compute resources. For example, if performing inference on the FPGA with a mostly idle CPU, perform parallel tasks on the CPU. When doing inference on Intel® Integrated Graphics, you have little gain in tasks like having resulting video encoding on the same Intel® Integrated Graphics in parallel because the device is already busy.
For more performance implications and tips for the Async API, see the Optimization Guide.
Other demonstration objectives:
- Video as input support via OpenCV*
- Visualization of the resulting bounding boxes and text labels (from the .labels file) or class number (if no file is provided)
- OpenCV* provides resulting bounding boxes, labels, and other information. You can copy and paste this code without pulling Inference Engine samples helpers into your application.
- Demonstrate the Async API in action. For this, the demonstration features two modes with a Tab key toggle.
- Old-style "Sync" way - The frame capturing with OpenCV* executes back-to-back with Detection
- "Truly Async" way - The Detection is performed on the current frame, while the OpenCV* captures the next frame.
How it Works
The application reads command line parameters and loads a network to the Inference Engine. Upon getting a frame from the OpenCV*'s VideoCapture it performs inference and displays the results.
The new Async API operates with the notion of an Infer Request, which encapsulates the inputs and outputs and separates scheduling from waiting for the result (see the next section). Here is what makes the performance look different:
- In the default ("Sync") mode, the frame is captured and then immediately processed, as shown in this pseudo-code:
    while(true) {
        capture frame
        populate CURRENT InferRequest
        start CURRENT InferRequest //this call is async and returns immediately
        wait for the CURRENT InferRequest
        display CURRENT result
    }
  This is a reference implementation in which the new Async API is used in a serialized/synchronous fashion.
- In "true" Async mode, the next frame is captured and its request is started while the current request is still being processed:
    while(true) {
        capture frame
        populate NEXT InferRequest
        start NEXT InferRequest //this call is async and returns immediately
        wait for the CURRENT InferRequest (processed in a dedicated thread)
        display CURRENT result
        swap CURRENT and NEXT InferRequests
    }
  In this case, the NEXT request is populated in the main (application) thread while the CURRENT request is processed. The processing is handled in a dedicated thread, internal to the Inference Engine runtime.
Async API
In this release, the Inference Engine offers a new API based on the notion of Infer Requests. With this API, requests encapsulate input and output allocation. You access the blob with the GetBlob method.
You can execute a request asynchronously in the background and wait until you need the result. In the meantime your application can continue:
#include <inference_engine.hpp>
using namespace InferenceEngine;

// load plugin for the device as usual
auto enginePtr = PluginDispatcher({"../../../lib/intel64", ""}).getSuitablePlugin(
        getDeviceFromStr("GPU"));
InferencePlugin plugin(enginePtr);
// load network
CNNNetReader network_reader;
network_reader.ReadNetwork("Model.xml");
network_reader.ReadWeights("Model.bin");
// load the network to the plugin and create an infer request
auto executable_network = plugin.LoadNetwork(network_reader.getNetwork(), {});
auto async_infer_request = executable_network.CreateInferRequest();
// populate inputs etc
auto input = async_infer_request.GetBlob(input_name);
...
// start the async infer request (puts the request to the queue and immediately returns)
async_infer_request.StartAsync();
// Continue execution on the host until you need the request results
//...
async_infer_request.Wait(IInferRequest::WaitMode::RESULT_READY);
auto output = async_infer_request.GetBlob(output_name);
You have no direct way to measure the execution time of an infer request that is running asynchronously, unless you measure the Wait executed immediately after the StartAsync. But this essentially means serialization and synchronous execution.
This is what the sample does in the default "SYNC" mode and reports as the Detection time/fps message on the screen. In the truly asynchronous ("ASYNC") mode, the host continues execution in the main thread, in parallel to the infer request. If the request completes before Wait is called in the main thread (that is, before OpenCV* has decoded a new frame), the time between StartAsync and Wait would be an incorrect measure of the inference time. That is why the inference speed is not reported in the "ASYNC" mode.
For more information about the new, request-based Inference Engine API, including ASYNC execution, see Integrate with customer application New Request API.
Running the Application
Running the application with the -h option results in the message:
$ ./object_detection_demo_ssd_async -h
InferenceEngine:
    API version ............ [version]
    Build ..................
object_detection_demo_ssd_async [OPTION]
Options:
    -h  Print a usage message.
    -i "[path]"  Required. Path to an video file. Use "cam" to capture input from the camera).
    -m "[path]"  Required. Path to an .xml file with a trained model.
    -l "[absolute_path]"  Optional. Absolute path to library with Intel® MKL-DNN (CPU) custom layers (*.so).
    Or
    -c "[absolute_path]"  Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
    -d "[device]"  Specify the target device to infer on; CPU, Intel® Integrated Graphics, FPGA, and Intel® Movidius™ Myriad™ 2 Vision Processing Unit are accepted.
    -pc  Enables per-layer performance report.
    -t  Probability threshold for detections (default is 0.5).
    -r  Output inference results as raw values to the console.
Running the application with an empty list of options results in an error message and the usage list above.
Command Description
Use the following command to perform inference on Intel® Integrated Graphics with the example pre-trained GoogLeNet-based SSD* available at https://software.intel.com/file/609199/download
$ ./object_detection_demo_ssd_async -i <path_to_video>/inputVideo.mp4 -m <path_to_model>/ssd.xml -d GPU
The network must be converted from the Caffe* format (*.prototxt + *.model) to the Inference Engine format (*.xml + *.bin) before using this command. See the Model Optimizer Developer Guide.
The only GUI knob is using 'Tab' to switch between the synchronized execution and the true Async mode.
Output Description
The output uses OpenCV* to display the resulting frame with detections rendered as bounding boxes and labels, if provided. In default mode, the sample reports:
- OpenCV* time: Frame decoding + time to render the bounding boxes, labels, and display of the results.
- Detection time: Inference time for the object detection network. This is reported in SYNC mode.
- Wallclock time: The combined application-level performance.
Object Detection with SSD-VGG Sample
Description
This topic demonstrates how to run the Object Detection sample application, which does inference using object detection networks like SSD-VGG on Intel® Processors and Intel® HD Graphics.
The sample application reads command line parameters and loads a network and an image to the Inference Engine plugin. When inference is done, the application creates an output image and outputs data to the standard output stream.
Running the Application
Running the application with the -h option results in the message:
$ ./object_detection_sample_ssd -h
InferenceEngine:
    API version ............ <version>
    Build .................. <number>
object_detection_sample_ssd [OPTION]
Options:
    -h  Print a usage message.
    -i "<path>"  Required. Path to an image file.
    -m "<path>"  Required. Path to an .xml file with a trained model.
    -l "<absolute_path>"  Optional. Absolute path to library with MKL-DNN (CPU) custom layers (*.so).
    Or
    -c "<absolute_path>"  Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
    -pp "<path>"  Path to a plugin directory.
    -d "<device>"  Specify the target device to infer on; CPU, Intel® Integrated Graphics or MYRIAD is acceptable. The sample looks for a suitable plugin for the specified device.
    -ni "<integer>"  Number of iterations (default 1)
    -pc  Enables per-layer performance report
Running the application with an empty list of options results in an error message and the usage list above.
Use the following command to do inference on Intel® Processors on an image using a trained SSD network:
$ ./object_detection_sample_ssd -i <path_to_image>/inputImage.bmp -m <path_to_model>/VGG_ILSVRC2016_SSD.xml -d CPU
Output Description
The application outputs an image named out_0.bmp with detected objects enclosed in rectangles. It outputs the list of classes of the detected objects along with the respective confidence values and the coordinates of the rectangles to the standard output stream.
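As a rough illustration of how class, confidence, and rectangle coordinates can be recovered from an SSD output, consider the sketch below. It is not the sample's code. It assumes the layout commonly produced by the SSD DetectionOutput layer: seven floats per detection (image id, class label, confidence, then x_min, y_min, x_max, y_max normalized to the range 0 to 1); verify this layout against your model before relying on it. The Detection struct, decodeDetections, and the threshold parameter are hypothetical names introduced for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for one decoded detection, in pixel coordinates.
struct Detection {
    int label;
    float confidence;
    int xmin, ymin, xmax, ymax;
};

// Decodes a flat SSD-style output buffer (assumed: 7 floats per detection)
// and keeps only detections whose confidence reaches the threshold.
std::vector<Detection> decodeDetections(const std::vector<float>& raw,
                                        int imageWidth, int imageHeight,
                                        float threshold) {
    const std::size_t kStride = 7;  // assumed record size
    std::vector<Detection> out;
    for (std::size_t i = 0; i + kStride <= raw.size(); i += kStride) {
        const float* d = raw.data() + i;
        if (d[2] < threshold) continue;  // d[2] is the confidence
        out.push_back({static_cast<int>(d[1]), d[2],
                       static_cast<int>(d[3] * imageWidth),
                       static_cast<int>(d[4] * imageHeight),
                       static_cast<int>(d[5] * imageWidth),
                       static_cast<int>(d[6] * imageHeight)});
    }
    return out;
}
```

The normalized coordinates are scaled by the image size so that the rectangles can be drawn directly on the output image.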
Neural Style Transfer Sample
Description
This topic demonstrates how to build and run the Neural Style Transfer sample (NST sample) application, which does inference using models of style transfer topology.
Running the Application
Running the application with the -h option results in the message:
$ ./style_transfer_sample -h
InferenceEngine:
    API version ............ <version>
    Build .................. <number>
style_transfer_sample [OPTION]
Options:
    -h  Print a usage message.
    -i "<path1>""<path3>"  Required. Path to a directory with images or path to an image files: a .ubyte file for LeNet and a .bmp file for the other networks.
    -m "<path>"  Required. Path to an .xml file with a trained model.
    -l "<absolute_path>"  Optional. Absolute path to library with MKL-DNN (CPU) custom layers (*.so).
    Or
    -c "<absolute_path>"  Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
    -pp "<path>"  Path to a plugin directory.
    -p "<name>"  Plugin name. For example Intel® MKL-DNN. If this parameter is pointed, the sample looks for this plugin only
    -d "<device>"  Specify the target device to infer on; CPU or Intel® Integrated Graphics is acceptable. The sample looks for a suitable plugin for the specified device.
    -nt "<integer>"  Number of top results (default 10)
    -ni "<integer>"  Number of iterations (default 1)
    -pc  Enables per-layer performance report
Running the application with an empty list of options results in an error message and the usage list above.
To do inference on an image using a trained NST network model on Intel® Processors, use the following command:
$ ./style_transfer_sample -i <path_to_image>/cat.bmp -m <path_to_model>/1_decoder_FP32.xml
Output Description
The application outputs one or more styled images, starting with one named out1.bmp, redrawn in the style of the model used for inference. The style of the output images depends on the model used for the sample.
Hello Infer Request Classification
Description
This topic demonstrates how to run the Hello Infer Request Classification sample application. The sample is a simplified version of the Image Classification Sample. It is intended to demonstrate the use of the new Infer Request API of the Inference Engine in applications. See Integrate with customer application New Request API for details.
Running the Application
To do inference on an image using a trained AlexNet network on Intel® Processors:
$ ./hello_request_classification <path_to_model>/alexnet_fp32.xml <path_to_image>/cat.bmp CPU
Output Description
The application outputs the top-10 inference results.
Interactive Face Detection
Description
Showcases the Object Detection task applied to face recognition using a sequence of neural networks. The Async API can improve the overall frame-rate of the application because the application can continue operating while the accelerator is busy. This demonstration maintains two parallel inference requests for the Age Gender and Head Pose detection that are run simultaneously.
Other demonstration objectives:
- Video as input support via OpenCV*.
- Visualization of the resulting face bounding boxes from Face Detection network.
- Visualization of age gender and head pose information for each detected face.
- The OpenCV* provides resulting bounding boxes, labels, and other information. You can copy and paste this code without pulling Inference Engine sample helpers into your application.
How it Works
- The application loads up to three networks, depending on the -d option.
- The application gets a frame from the OpenCV* video capture.
- The application performs inference on the frame using the Face Detection network.
- The application performs two simultaneous inferences, using the Age Gender and Head Pose detection networks, if these are specified in the command-line.
- The application displays the results.
The new Async API operates with the notion of an Infer Request, which encapsulates the inputs and outputs and separates scheduling from waiting for the result. This changes the performance as follows:
In the default mode (Sync mode), the frame is captured and immediately processed:
    while(true) {
        capture frame
        populate FaceDetection InferRequest
        wait for the FaceDetection InferRequest
        populate AgeGender InferRequest using dyn batch technique
        populate HeadPose InferRequest using dyn batch technique
        wait AgeGender
        wait HeadPose
        display detection results
    }
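The step in which the Age Gender and Head Pose requests run simultaneously and are then both waited on can be sketched with standard C++ futures. This is a minimal sketch under stated assumptions, not the demonstration's code: FaceAttributes, analyzeFace, and the two callbacks are hypothetical, and std::async stands in for the two Inference Engine infer requests.

```cpp
#include <future>

// Hypothetical result of the two secondary networks for one face.
struct FaceAttributes {
    int age;
    float yaw;
};

// Starts both estimations at once, then waits for each result.
// AgeFn is assumed to return int and PoseFn to return float.
template <typename AgeFn, typename PoseFn>
FaceAttributes analyzeFace(int faceId, AgeFn estimateAge, PoseFn estimateYaw) {
    // Both calls return immediately; the work runs in the background.
    std::future<int> age = std::async(std::launch::async, estimateAge, faceId);
    std::future<float> yaw = std::async(std::launch::async, estimateYaw, faceId);
    // Wait for both results; the total time is roughly that of the slower one.
    const int ageValue = age.get();
    const float yawValue = yaw.get();
    return {ageValue, yawValue};
}
```

Because both requests are started before either is waited on, the combined time approaches that of the slower network rather than the sum of the two.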
Running the Application
Running the application with the -h option results in the message:
$ ./interactive_face_detection -h
InferenceEngine:
    API version ............ <version>
    Build .................. <number>
interactive_face_detection [OPTION]
Options:
    -h  Print a usage message.
    -i "<path>"  Optional. Path to an video file. Default value is "cam" to work with camera.
    -m "<path>"  Required. Path to an .xml file with a trained face detection model.
    -m_ag "<path>"  Optional. Path to an .xml file with a trained age gender model.
    -m_hp "<path>"  Optional. Path to an .xml file with a trained head pose model.
    -l "<absolute_path>"  Required for Intel® MKL-DNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels impl.
    Or
    -c "<absolute_path>"  Required for Intel® Integrated Graphics-targeted custom kernels.Absolute path to the xml file with the kernels desc.
    -d "<device>"  Specify the target device for Face Detection (CPU, Intel® Integrated Graphics, FPGA, or MYRYAD. The sample looks for a suitable plugin for the specified device.
    -d_ag "<device>"  Specify the target device for Age Gender Detection (CPU, Intel® Integrated Graphics, FPGA, or MYRYAD. The sample looks for a suitable plugin for the specified device.
    -d_hp "<device>"  Specify the target device for Head Pose Detection (CPU, Intel® Integrated Graphics, FPGA, or MYRYAD. The sample looks for a suitable plugin for the specified device.
    -pc  Enables per-layer performance report.
    -r  Inference results as raw values.
    -t  Probability threshold for detections.
Running the application with an empty list of options results in an error message and the usage list above.
To do inference on Intel® Integrated Graphics with an example pre-trained GoogLeNet-based SSD*:
./object_detection_demo_ssd_async -i <path_to_video>/inputVideo.mp4 -m <path_to_model>/ssd.xml -d Intel® Integrated Graphics
Before running this command, use the Model Optimizer to convert the network from the Caffe* format (*.prototxt + *.model) to the Inference Engine format (*.xml + *.bin).
Demonstration Output
The demonstration uses OpenCV* to display the resulting frame with detections that are rendered as bounding boxes. Labels are included if available. In default mode, the sample reports:
- OpenCV* time: frame decoding + time to render the bounding boxes and labels and to display the results
- Face detection time: inference time for the face detection network
- Age Gender + Head Pose time: combined inference time of simultaneously executed age gender and head pose networks
Image Segmentation Sample
Description
This topic describes how to run the Image Segmentation sample application, which does inference using image segmentation networks like FCN8.
The sample application reads command line parameters and loads a network and an image to the Inference Engine plugin. When inference is done, the application creates an output image.
Running the Application
Running the application with the -h option results in the message:
    $ ./segmentation_sample -h
    InferenceEngine:
        API version ............ <version>
        Build .................. <number>

    segmentation_sample [OPTION]
    Options:

        -h                        Print a usage message.
        -i "<path1>""<path3>"     Required. Path to a directory with images or path to an image files: a .ubyte file for LeNet and a .bmp file for the other networks.
        -m "<path>"               Required. Path to an .xml file with a trained model.
          -l "<absolute_path>"    Optional. Absolute path to library with MKL-DNN (CPU) custom layers (*.so).
              Or
          -c "<absolute_path>"    Optional. Absolute path to Intel® Integrated Graphics custom layers config (*.xml).
        -pp "<path>"              Path to a plugin directory.
        -d "<device>"             Specify the target device to infer on; CPU or Intel® Integrated Graphics is acceptable. The sample looks for a suitable plugin for the specified device.
        -ni "<integer>"           Number of iterations (default 1)
        -pc                       Enables per-layer performance report
Running the application with an empty list of options results in an error message and the usage list above.
To do inference on an image on Intel® Processors using a trained FCN8 network:
$ ./segmentation_sample -i <path_to_image>/inputImage.bmp -m <path_to_model>/fcn8.xml
Output Description
The application outputs a segmented image named out.bmp.
Using the Validation Application to Check Accuracy on a Dataset
The Inference Engine Validation application lets you score common topologies that use standard input and output configurations. These topologies include AlexNet and SSD. The Validation application collects simple validation metrics for these topologies: it supports Top-1/Top-5 counting for classification networks and 11-point mAP calculation for object detection networks.
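Top-1/Top-5 counting can be illustrated with a small sketch. The code below is not taken from the Validation application; the function name `top_k_hits` and the score values are hypothetical and only show the general idea: an image counts as a hit if its expected class is among the k highest-scoring classes.

```python
from typing import Sequence


def top_k_hits(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> int:
    """Count images whose expected class is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this image.
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top:
            hits += 1
    return hits


# Hypothetical scores for three images over four classes (illustration only).
scores = [
    [0.10, 0.70, 0.15, 0.05],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.20, 0.60],
]
labels = [1, 2, 0]  # expected class ID for each image

top1 = top_k_hits(scores, labels, k=1)
top3 = top_k_hits(scores, labels, k=3)
print(f"Top1 accuracy: {100.0 * top1 / len(labels):.2f}%")
print(f"Top3 accuracy: {100.0 * top3 / len(labels):.2f}%")
```

The same function gives Top-5 counting with `k=5` when the network has at least five output classes.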
Possible Validation application uses:
- Check whether the Inference Engine scores the public topologies well
- Verify that a custom topology is compatible with the default input/output configuration and compare its accuracy with the public ones
- Use the Validation application as another sample: although the code is much more complex than in the classification and object detection samples, it is still open and can be reused
The application loads a network to the Inference Engine plugin. Then:
- The application reads the validation set (the -i option):
  - If -i specifies a directory, the application tries to load labels first. To do so, the application searches for a file with the same base name as the model, but with a .labels extension. The application then searches the specified directory and adds all images from sub-directories whose names are equal to a known label to the validation set. If there are no sub-directories whose names are equal to known labels, the validation set is considered empty.
  - If -i specifies a .txt file, the application reads the .txt file, considering every line that has the format <relative_path_from_txt_to_img> <ID>, where ID is the image number that the network should classify.
- The application reads the number of images specified by -b and loads the images to the plugin. When all images are loaded, the plugin does inference and the Validation application collects the statistics.
NOTE: Image load time is not part of the inference time reported by the application.
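The directory case described above can be sketched as follows. This is only an illustration, not the application's code. It assumes that the .labels file holds one label per line and that the class ID is the zero-based line index; the actual file layout may differ. The helper names `read_labels` and `collect_validation_set` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def read_labels(labels_file: Path) -> List[str]:
    """Assumption: one label per line; the line index is the class ID."""
    return [line.strip() for line in labels_file.read_text().splitlines() if line.strip()]


def collect_validation_set(image_dir: Path, labels: List[str]) -> List[Tuple[Path, int]]:
    """Add images from sub-directories whose names equal a known label."""
    label_to_id = {name: idx for idx, name in enumerate(labels)}
    samples = []
    for sub in sorted(p for p in image_dir.iterdir() if p.is_dir()):
        class_id = label_to_id.get(sub.name)
        if class_id is None:
            continue  # sub-directory name is not a known label: skipped
        for image in sorted(sub.iterdir()):
            if image.is_file():
                samples.append((image, class_id))
    return samples  # an empty list means the validation set is empty


# Build a tiny throwaway dataset to exercise the functions.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "model.labels").write_text("apron\ncollie\ncoral reef\n")
    for rel in ["apron/apron1.bmp", "apron/apron2.bmp", "collie/a_big_dog.jpg", "unknown/x.bmp"]:
        target = root / "dataset" / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    labels = read_labels(root / "model.labels")
    samples = collect_validation_set(root / "dataset", labels)

for image, class_id in samples:
    print(image.name, class_id)
```

The image under `unknown/` is ignored because no label with that name exists.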
Optionally, use the --dump option to retrieve the inference results. This option creates an inference report named dumpfileXXXX.csv in the following format, using semicolon-separated values:
- Image_path
- Flag representing correctness of prediction
- ID of the Top-1 class
- Probability that the image belongs to the Top-1 class
- ID of the Top-2 class
- Probability that the image belongs to the Top-x class, where x is an integer
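A report in this format can be post-processed with a few lines of code. The sketch below makes several assumptions that the text above does not confirm: there is no header row, the correctness flag is written as 1 or 0, and the class ID/probability pairs follow each other in Top-1, Top-2, ... order. The sample rows are invented for illustration.

```python
import csv
import io


def parse_dump(text: str):
    """Parse semicolon-separated rows: path; flag; (class ID; probability) pairs."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        path, flag, rest = row[0], row[1], row[2:]
        # Pair up the remaining fields as (class ID, probability).
        top = [(int(rest[i]), float(rest[i + 1])) for i in range(0, len(rest) - 1, 2)]
        # Assumption: "1" marks a correct prediction.
        records.append({"path": path, "correct": flag.strip() == "1", "top": top})
    return records


# Invented sample content; a real dumpfileXXXX.csv may look different.
sample = (
    "dataset/apron/apron1.bmp;1;411;0.93;500;0.02\n"
    "dataset/collie/a_big_dog.jpg;0;230;0.55;231;0.31\n"
)
records = parse_dump(sample)
wrong = [r["path"] for r in records if not r["correct"]]
print("Misclassified images:", wrong)
```

Listing the misclassified images this way is a quick starting point for checking where a topology loses accuracy.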
CLI Options
    Usage: validation_app [OPTION]

    Available options:

        -h                        Print a usage message
        -t                        Type of the network being scored ("C" by default)
          -t "C" for classification
          -t "OD" for object detection
        -i [path]                 Required. Directory with validation images, directorys grouped by labels or a .txt file list for classification networks or a VOC-formatted dataset for object detection networks
        -m [path]                 Required. Path to an .xml file with a trained model
        -l [absolute_path]        Required for Intel® MKL-DNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernel implementations
        -c [absolute_path]        Required for Intel® Integrated Graphics-targeted custom kernels.Absolute path to the xml file with the kernel descriptions
        -d [device]               Specify the target device to infer on; CPU, Intel® Integrated Graphics, FPGA or MYRIAD is acceptable. The sample looks for a suitable plugin for the specified device. The plugin is CPU by default.
        -b N                      Batch size value. If not specified, the batch size value is determined from IR
        -ppType                   Preprocessing type. One of "None", "Resize", "ResizeCrop"
        -ppSize N                 Preprocessing size (used with ppType="ResizeCrop")
        -ppWidth W                Preprocessing width (overrides -ppSize, used with ppType="ResizeCrop")
        -ppHeight H               Preprocessing height (overrides -ppSize, used with ppType="ResizeCrop")
        --dump                    Dump filenames and inference results to a csv file

    Classification-specific options:
        -Czb true                 "Zero is a background" flag. Some networks are trained with a modified dataset where the class IDs are enumerated from 1, but 0 is an undefined "background" class (which is never detected)

    Object detection-specific options:
        -ODkind                   Kind of an object detection network: SSD
        -ODa [path]               Required for OD networks. Path to the directory containing .xml annotations for images
        -ODc                      Required for OD networks. Path to the file containing classes list
        -ODsubdir                 Directory between the image path (-i) and image name, specified in the .xml. Use JPEGImages for VOC2007
Option Categories
- Common options are usually named with a single letter or word, such as -b or --dump. These options have the same meaning in all validation_app modes.
- Network type-specific options are named with an acronym of the network type (such as C or OD), followed by a letter or a word addendum. These options are specific to the network type. For instance, ODa makes sense only for an object detection network.
The next section shows how to use the Validation application in classification mode to score a classification CNN on a pack of images.
Running the Application in Classification Mode
This section demonstrates how to run the Validation application in classification mode to score a classification CNN on a pack of images.
To do inference of a chosen pack of images:
    $ ./validation_app -t C -i <path to images main directory or .txt file> -m <model to use for classification> -d <CPU|Intel® Integrated Graphics>
Source dataset format: directories as classes
A correct list of files looks similar to:
    <path>/dataset
        /apron
            /apron1.bmp
            /apron2.bmp
        /collie
            /a_big_dog.jpg
        /coral reef
            /reef.bmp
        /Siamese
            /cat3.jpg
To score this dataset, put the -i <path>/dataset option in the command line.
Source dataset format: a list of images
This example uses a single list file in which each line has the format image_name-tabulation-class_index (image name, tab character, class index). A correct list of files:
    <path>/dataset
        /apron1.bmp
        /apron2.bmp
        /a_big_dog.jpg
        /reef.bmp
        /cat3.jpg
        /labels.txt
where labels.txt contains:
    apron1.bmp 411
    apron2.bmp 411
    cat3.jpg 284
    reef.bmp 973
    a_big_dog.jpg 231
To score this dataset, put the -i <path>/dataset/labels.txt option in the command line.
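For illustration, a list file like labels.txt can be read with a short script. This is a sketch under stated assumptions, not the application's parser: each line is split on its last run of whitespace (which accepts a tab), image paths are resolved relative to the list file, and lines that do not match are skipped. The function name `read_image_list` is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def read_image_list(list_file: Path) -> List[Tuple[Path, int]]:
    """Read '<relative image path><whitespace><class index>' lines."""
    samples = []
    for line in list_file.read_text().splitlines():
        # Split on the last whitespace run so that paths may contain spaces.
        parts = line.rsplit(maxsplit=1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # ignore lines that do not match the expected format
        samples.append((list_file.parent / parts[0], int(parts[1])))
    return samples


# Recreate the labels.txt example in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "labels.txt"
    list_path.write_text(
        "apron1.bmp\t411\n"
        "apron2.bmp\t411\n"
        "cat3.jpg\t284\n"
        "reef.bmp\t973\n"
        "a_big_dog.jpg\t231\n"
    )
    samples = read_image_list(list_path)

for image, class_index in samples:
    print(image.name, class_index)
```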
Output Description
A progress bar shows the inference progress. Upon completion, the common information is displayed.
    Network load time: time spent on topology load in ms
    Model: path to chosen model
    Model Precision: precision of a chosen model
    Batch size: specified batch size
    Validation dataset: path to a validation set
    Validation approach: Classification networks
    Device: device type
You see statistics such as the average inference time, and top-1 and top-5 accuracy:
    Average infer time (ms): 588.977 (16.98 images per second with batch size = 10)
    Top1 accuracy: 70.00% (7 of 10 images were detected correctly, top class is correct)
    Top5 accuracy: 80.00% (8 of 10 images were detected correctly, top five classes contain required class)
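The throughput figure in this output is consistent with dividing the batch size by the average inference time, which suggests that the reported time covers one whole batch. The short calculation below reproduces the numbers from the sample output under that assumption; the function name is hypothetical.

```python
def images_per_second(avg_infer_ms: float, batch_size: int) -> float:
    """Throughput, assuming avg_infer_ms is the time to infer one whole batch."""
    return batch_size * 1000.0 / avg_infer_ms


fps = images_per_second(588.977, 10)
top1_accuracy = 100.0 * 7 / 10  # 7 of 10 images correct
top5_accuracy = 100.0 * 8 / 10  # 8 of 10 images correct

print(f"{fps:.2f} images per second")
print(f"Top1 accuracy: {top1_accuracy:.2f}%")
print(f"Top5 accuracy: {top5_accuracy:.2f}%")
```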
Using Object Detection with the Validation Application
Description
This section describes how to run the Validation application in object detection mode to score an SSD object detection CNN on a pack of images.
Running SSD on the VOC Dataset
Use these steps to score SSD on the original dataset that was used to test it during its training.
- Go to the SSD author's GitHub page to select the pre-trained SSD-300.
- From the same page, download the VOC2007 test dataset:
    $ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    $ tar -xvf VOCtest_06-Nov-2007.tar
- Use the Model Optimizer to convert the model. For help, see https://software.intel.com/en-us/articles/CVSDK-ModelOptimizer
- Create a proper class file (made from the original labelmap_voc.prototxt):
    none_of_the_above 0
    aeroplane 1
    bicycle 2
    bird 3
    boat 4
    bottle 5
    bus 6
    car 7
    cat 8
    chair 9
    cow 10
    diningtable 11
    dog 12
    horse 13
    motorbike 14
    person 15
    pottedplant 16
    sheep 17
    sofa 18
    train 19
    tvmonitor 20
- Save it as VOC_SSD_Classes.txt
- Score the model on the dataset:
    $ ./validation_app -d CPU -t OD -ODa "<...>/VOCdevkit/VOC2007/Annotations" -i "<...>/VOCdevkit" -m "<...>/vgg_voc0712_ssd_300x300.xml" -ODc "<...>/VOC_SSD_Classes.txt" -ODsubdir JPEGImages
- You see a progress bar followed by your data:
    Progress: [....................] 100.00% done
    [ INFO ] Processing output blobs
    Network load time: 27.70ms
    Model: /home/user/models/ssd/withmean/vgg_voc0712_ssd_300x300/vgg_voc0712_ssd_300x300.xml
    Model Precision: FP32
    Batch size: 1
    Validation dataset: /home/user/Data/SSD-data/testonly/VOCdevkit
    Validation approach: Object detection network
    Average infer time (ms): 166.49 (6.01 images per second with batch size = 1)
    Average precision per class table:

    Class   AP
    1       0.796
    2       0.839
    3       0.759
    4       0.695
    5       0.508
    6       0.867
    7       0.861
    8       0.886
    9       0.602
    10      0.822
    11      0.768
    12      0.861
    13      0.874
    14      0.842
    15      0.797
    16      0.526
    17      0.792
    18      0.795
    19      0.873
    20      0.773

    Mean Average Precision (mAP): 0.7767
For comparison, the Mean Average Precision (mAP) value is listed in a table on the SSD author's page and in the arXiv paper.
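As a rough illustration of the 11-point mAP metric, the sketch below computes an 11-point interpolated average precision in the PASCAL VOC2007 style: precision is sampled at recall levels 0.0, 0.1, ..., 1.0, taking the maximum precision at any recall at or above each level. It is not the Validation application's code, and the precision/recall points are invented. The second part averages the per-class AP values from the table above to show how the mAP line follows from them.

```python
from typing import Sequence


def eleven_point_ap(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    """11-point interpolated average precision over recall levels 0.0 to 1.0."""
    total = 0.0
    for i in range(11):
        level = i / 10.0
        # Best precision achieved at any recall >= this level (0 if none).
        candidates = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(candidates, default=0.0)
    return total / 11.0


# Invented precision/recall points for a single class.
ap = eleven_point_ap([0.2, 0.4, 0.6, 0.8], [1.0, 0.8, 0.6, 0.5])
print(f"AP: {ap:.4f}")

# Per-class AP values copied from the output table above (classes 1 to 20).
per_class_ap = [
    0.796, 0.839, 0.759, 0.695, 0.508, 0.867, 0.861, 0.886, 0.602, 0.822,
    0.768, 0.861, 0.874, 0.842, 0.797, 0.526, 0.792, 0.795, 0.873, 0.773,
]
mean_ap = sum(per_class_ap) / len(per_class_ap)
print(f"Mean of the listed AP values: {mean_ap:.4f}")
```

Because the table shows AP values rounded to three digits, their mean can differ from the reported mAP in the last digit.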