# Benchmark Python* Application
This topic demonstrates how to run the Benchmark Application demo, which performs inference using convolutional networks.

On start-up, the application reads the command-line parameters and loads a network and images or binary files to the Inference Engine plugin, which is chosen according to the specified device. The number of infer requests and the execution approach depend on the mode defined with the `-api` command-line parameter.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channel order. If you trained your model to work with RGB order, you need to manually rearrange the default channel order in the sample or demo application, or reconvert your model using the Model Optimizer tool with the `--reverse_input_channels` argument specified. For more information about the argument, refer to the **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
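
The manual channel rearrangement mentioned in the note can be sketched as follows. This is a minimal, library-free illustration that assumes an image stored as rows of `[B, G, R]` pixel lists; the helper name `bgr_to_rgb` is hypothetical and is not part of the demo.

```python
def bgr_to_rgb(image):
    """Return a copy of an HWC image with the channel order of each pixel reversed."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 image: one pure-blue and one pure-red pixel, stored in BGR order.
bgr_image = [[[255, 0, 0], [0, 0, 255]]]
rgb_image = bgr_to_rgb(bgr_image)
```

With NumPy arrays in HWC layout, the same reversal is usually written as `image[:, :, ::-1]`.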
For synchronous mode, the primary metric is latency. The application creates one infer request and executes the `Infer` method. The number of executions is defined by one of the following:
* Number of iterations defined with the `-niter` command-line argument
* Time duration specified with the `-t` command-line argument
* Both of them (execution continues until both conditions are met)
* Predefined duration if neither `-niter` nor `-t` is specified. The predefined duration depends on the device.
During execution, the application collects two types of metrics:
* Latency for each infer request executed with the `Infer` method
* Duration of all executions

The reported latency is the mean of all collected latencies. The reported throughput is derived from the reported latency and additionally depends on the batch size.
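
The synchronous measurement described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's actual code: `run_sync` and its parameters are hypothetical names, `infer` is any callable standing in for the `Infer` method, `default_duration_s` stands in for the device-dependent predefined duration, and the throughput formula (`batch_size * 1000 / latency_ms`) is an assumed reading of "derived from the reported latency".

```python
import time


def run_sync(infer, niter=None, duration_s=None, batch_size=1, default_duration_s=1.0):
    """Call `infer` repeatedly and return (iterations, mean_latency_ms, fps)."""
    if niter is None and duration_s is None:
        # Assumption: a fixed stand-in for the device-dependent default duration.
        duration_s = default_duration_s
    latencies_ms = []
    start = time.perf_counter()
    while True:
        elapsed = time.perf_counter() - start
        niter_done = niter is None or len(latencies_ms) >= niter
        time_done = duration_s is None or elapsed >= duration_s
        if niter_done and time_done:
            break  # stop only when every requested condition is met
        t0 = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    if not latencies_ms:
        raise ValueError("no iterations were executed")
    mean_latency_ms = sum(latencies_ms) / len(latencies_ms)
    fps = batch_size * 1000.0 / mean_latency_ms  # assumed throughput formula
    return len(latencies_ms), mean_latency_ms, fps


# Usage: a 1 ms sleep plays the role of a single inference.
count, latency_ms, fps = run_sync(lambda: time.sleep(0.001), niter=5)
```

Passing both `niter` and `duration_s` makes the loop continue until both conditions hold, which mirrors the "both of them" case in the list above.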
For asynchronous mode, the primary metric is throughput in frames per second (FPS). The application creates a certain number of infer requests and executes the `StartAsync` method. The number of infer requests is specified with the `-nireq` command-line parameter. The number of executions is defined by one of the following:
* Number of iterations defined with the `-niter` command-line argument
* Time duration specified with the `-t` command-line argument
* Both of them (execution continues until both conditions are met)
* Predefined duration if neither `-niter` nor `-t` is specified. The predefined duration depends on the device.
The infer requests are executed asynchronously. A callback is used to wait for the previous execution to complete. The application measures all infer request executions and reports the throughput metric based on the batch size and the total execution duration.
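
The asynchronous scheme can be illustrated with a small standard-library sketch. It is not the application's implementation: a thread pool stands in for the Inference Engine, `run_async` is a hypothetical helper, only the iteration-count stopping condition is modeled, and throughput is assumed to be `batch_size * iterations / duration`.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_async(infer, niter, nireq=4, batch_size=1):
    """Keep up to `nireq` calls of `infer` in flight; return (iterations, duration_ms, fps)."""
    free_slots = threading.Semaphore(nireq)  # one slot per infer request

    def on_done(_future):
        free_slots.release()  # completion callback frees the request for reuse

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=nireq) as pool:
        for _ in range(niter):
            free_slots.acquire()  # wait until a previous execution has completed
            pool.submit(infer).add_done_callback(on_done)
    # Leaving the `with` block waits for all in-flight executions to finish.
    duration_s = time.perf_counter() - start
    fps = batch_size * niter / duration_s
    return niter, duration_s * 1000.0, fps


# Usage: a 5 ms sleep plays the role of a single inference.
count, duration_ms, fps = run_async(lambda: time.sleep(0.005), niter=20, nireq=4)
```

The semaphore plays the role of the pool of infer requests: a new execution starts only after a callback signals that one of the `nireq` requests is idle again.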
Note that benchmark_app usually delivers optimal performance for any device out of the box.

**In most cases you do not need to tune the application options explicitly, and the plain device name is enough**, for example:

    benchmark_app -m <model> -i <input> -d CPU

Running the application with the `-h` or `--help` option yields the following usage message:

    usage: benchmark_app.py [-h] [-i PATH_TO_INPUT] -m PATH_TO_MODEL
                            [-pp PLUGIN_DIR] [-d TARGET_DEVICE]
                            [-l PATH_TO_EXTENSION] [-c PATH_TO_CLDNN_CONFIG]
                            [-api {sync,async}] [-niter NUMBER_ITERATIONS]
                            [-nireq NUMBER_INFER_REQUESTS] [-b BATCH_SIZE]
                            [-stream_output [STREAM_OUTPUT]] [-t TIME]
                            [-progress [PROGRESS]] [-nstreams NUMBER_STREAMS]
                            [-nthreads NUMBER_THREADS] [-pin {YES,NO}]
                            [--exec_graph_path EXEC_GRAPH_PATH]

      -h, --help            Show this help message and exit.
      -i PATH_TO_INPUT, --path_to_input PATH_TO_INPUT
                            Optional. Path to a folder with images and/or binaries
                            or to specific image or binary file.
      -m PATH_TO_MODEL, --path_to_model PATH_TO_MODEL
                            Required. Path to an .xml file with a trained model.
      -pp PLUGIN_DIR, --plugin_dir PLUGIN_DIR
                            Optional. Path to a plugin folder.
      -d TARGET_DEVICE, --target_device TARGET_DEVICE
                            Optional. Specify a target device to infer on: CPU,
                            GPU, FPGA, HDDL or MYRIAD.
                            Use "-d HETERO:<comma separated devices list>" format to specify HETERO plugin.
      -l PATH_TO_EXTENSION, --path_to_extension PATH_TO_EXTENSION
                            Optional. Required for CPU custom layers. Absolute
                            path to a shared library with the kernels
      -c PATH_TO_CLDNN_CONFIG, --path_to_cldnn_config PATH_TO_CLDNN_CONFIG
                            Optional. Required for GPU custom kernels. Absolute
                            path to an .xml file with the kernels description.
      -api {sync,async}, --api_type {sync,async}
                            Optional. Enable using sync/async API. Default value
      -niter NUMBER_ITERATIONS, --number_iterations NUMBER_ITERATIONS
                            Optional. Number of iterations. If not specified, the
                            number of iterations is calculated depending on a
      -nireq NUMBER_INFER_REQUESTS, --number_infer_requests NUMBER_INFER_REQUESTS
                            Optional. Number of infer requests. Default value is
                            determined automatically for device.
      -b BATCH_SIZE, --batch_size BATCH_SIZE
                            Optional. Batch size value. If not specified, the
                            batch size value is determined from IR
      -stream_output [STREAM_OUTPUT]
                            Optional. Print progress as a plain text. When
                            specified, an interactive progress bar is replaced
                            with a multiline output.
      -t TIME, --time TIME  Optional. Time in seconds to execute topology.
      -progress [PROGRESS]  Optional. Show progress bar (can affect performance
                            measurement). Default values is "False".
      -nstreams NUMBER_STREAMS, --number_streams NUMBER_STREAMS
                            Optional. Number of streams to use for inference on the CPU/GPU in throughput mode
                            (for HETERO device case use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>).
      -nthreads NUMBER_THREADS, --number_threads NUMBER_THREADS
                            Number of threads to use for inference on the CPU
                            (including HETERO case).
      -pin {YES,NO}, --infer_threads_pinning {YES,NO}
                            Optional. Enable ("YES" is default value) or disable
                            ("NO")CPU threads pinning for CPU-involved inference.
      --exec_graph_path EXEC_GRAPH_PATH
                            Optional. Path to a file where to store executable
                            graph information serialized.
      -pc [PERF_COUNTS], --perf_counts [PERF_COUNTS]
                            Optional. Report performance counters.

Running the application with an empty list of options yields the usage message given above and an error message.

The application supports topologies with one or more inputs. If a topology is not data sensitive, you can skip the input parameter. In this case, inputs are filled with random values.
If a model has only image input(s), provide a folder with images or a path to an image as input.
If a model has some specific (non-image) input(s), prepare binary file(s) filled with data of the appropriate precision and provide a path to them as input.
If a model has mixed input types, the input folder should contain all required files. Image inputs are filled with image files one by one. Binary inputs are filled with binary files one by one.
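
Preparing a binary input file might look like the sketch below. It assumes, purely for illustration, an FP32 input of shape `(1, 3, 8, 8)` stored as raw contiguous values with no header; the helper `write_fp32_input`, the file name `input.bin`, and the shape are hypothetical, and the real shape and precision must match your model.

```python
import os
import random
import tempfile
from array import array


def write_fp32_input(path, shape):
    """Write random 32-bit float values for a tensor of `shape` as raw bytes."""
    count = 1
    for dim in shape:
        count *= dim
    values = array("f", (random.random() for _ in range(count)))  # "f" = C float
    with open(path, "wb") as f:
        values.tofile(f)
    return count


# Usage: create the file in a temporary folder (hypothetical name and shape).
input_path = os.path.join(tempfile.mkdtemp(), "input.bin")
num_values = write_fp32_input(input_path, (1, 3, 8, 8))
```

The resulting path, or the folder that contains it, can then be passed with `-i`.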
To run the demo, you can use public or pre-trained models. To download the pre-trained models, use the OpenVINO [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader) or go to [https://download.01.org/opencv/](https://download.01.org/opencv/).

> **NOTE**: Before running the demo with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md).

For example, to run inference on an image using a trained network with multiple outputs on CPU, run the following command:

    python3 benchmark_app.py -i <path_to_image>/inputImage.bmp -m <path_to_model>/multiple-output.xml -d CPU

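
When the demo is launched from a script, the command line can be assembled programmatically. The sketch below uses only flags listed in the usage message above; the helper `build_benchmark_cmd` and the file names `model.xml` and `image.bmp` are hypothetical placeholders.

```python
import shlex


def build_benchmark_cmd(model, input_path=None, device="CPU", api=None, niter=None, time_s=None):
    """Return the benchmark_app.py command as a list of arguments."""
    cmd = ["python3", "benchmark_app.py", "-m", model, "-d", device]
    if input_path is not None:
        cmd += ["-i", input_path]
    if api is not None:
        cmd += ["-api", api]  # "sync" or "async"
    if niter is not None:
        cmd += ["-niter", str(niter)]
    if time_s is not None:
        cmd += ["-t", str(time_s)]
    return cmd


cmd = build_benchmark_cmd("model.xml", input_path="image.bmp", api="sync", niter=100)
command_line = " ".join(shlex.quote(part) for part in cmd)
```

The list form can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting issues with paths that contain spaces.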
The application outputs the number of executed iterations, the total duration of execution, latency, and throughput.
Additionally, if you set the `-pc` parameter, the application outputs performance counters.
If you set `--exec_graph_path`, the application reports serialized executable graph information.

    [Step 8/9] Measuring performance (Start inference asyncronously, 60000 ms duration, 4 inference requests in parallel using 4 streams)
    Progress: |................................| 100.00%

    [Step 9/9] Dumping statistics report
    Progress: |................................| 100.00%

    Count:      4408 iterations
    Duration:   60153.52 ms
    Throughput: 73.28 FPS

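
The reported throughput can be cross-checked from the count and duration in the sample output. The short calculation below assumes a batch size of 1, which the output does not state, and the formula `batch_size * iterations / duration` for asynchronous mode.

```python
iterations = 4408       # "Count" from the sample output
duration_ms = 60153.52  # "Duration" from the sample output
batch_size = 1          # assumption: not shown in the sample output

fps = batch_size * iterations / (duration_ms / 1000.0)
print(f"{fps:.2f} FPS")  # prints "73.28 FPS"
```

The result matches the `Throughput` line in the sample output, which is consistent with the assumed batch size.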
* [Using Inference Engine Samples](./docs/IE_DG/Samples_Overview.md)
* [Model Optimizer](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
* [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader)