inference-engine/ie_bridges/python/sample/benchmark_app/README.md

# Benchmark Python* Application

This topic demonstrates how to run the Benchmark Application demo, which performs inference using convolutional networks.

## How It Works

Upon start-up, the application reads command-line parameters and loads a network and images/binary files to the Inference Engine
plugin, which is chosen depending on the specified device. The number of infer requests and the execution approach depend
on the mode defined with the `-api` command-line parameter.

> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channel order. If you trained your model to work with RGB order, you need to either manually rearrange the default channel order in the sample or demo application, or reconvert your model using the Model Optimizer tool with the `--reverse_input_channels` argument specified. For more information about the argument, refer to the **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
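
If you choose to rearrange the channels in your own pre-processing code, the swap itself is small. The sketch below is not taken from the sample: it assumes an interleaved 8-bit buffer with exactly three channels per pixel, and the `rgb_to_bgr` helper name is hypothetical.

```python
def rgb_to_bgr(pixels: bytes) -> bytes:
    """Swap the first and third channel of an interleaved 3-channel buffer."""
    if len(pixels) % 3 != 0:
        raise ValueError("expected 3 bytes per pixel")
    swapped = bytearray(pixels)
    # Exchange the byte at offset 0 of each pixel with the byte at offset 2.
    swapped[0::3], swapped[2::3] = swapped[2::3], swapped[0::3]
    return bytes(swapped)


# Two pixels: pure red, then (10, 20, 30).
two_pixels = bytes([255, 0, 0, 10, 20, 30])
print(list(rgb_to_bgr(two_pixels)))  # [0, 0, 255, 30, 20, 10]
```

Applying the function twice returns the original buffer, so the same helper converts in either direction.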

### Synchronous API

For synchronous mode, the primary metric is latency. The application creates one infer request and executes the `Infer` method. The number of executions is defined by one of the following:
* Number of iterations defined with the `-niter` command-line argument
* Time duration specified with the `-t` command-line argument
* Both of them (execution continues until both conditions are met)
* Predefined duration if neither `-niter` nor `-t` is specified. The predefined duration value depends on the device.

During the execution, the application collects two types of metrics:
* Latency for each infer request executed with the `Infer` method
* Duration of all executions

The reported latency value is calculated as the mean of all collected latencies. The reported throughput value is derived from the reported latency and additionally depends on the batch size.
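
The synchronous flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the application's code: `infer` is any callable standing in for the `Infer` method, the 60-second default is a placeholder for the device-dependent predefined duration, and the throughput formula `batch_size * 1000 / latency_ms` is one plausible reading of "derived from the reported latency".

```python
import time
from statistics import mean


def should_continue(iteration, elapsed_s, niter=None, duration_s=None,
                    default_duration_s=60.0):
    """Return True while at least one requested stop condition is unmet."""
    if niter is None and duration_s is None:
        # Neither -niter nor -t given: fall back to a predefined duration.
        return elapsed_s < default_duration_s
    need_more_iters = niter is not None and iteration < niter
    need_more_time = duration_s is not None and elapsed_s < duration_s
    return need_more_iters or need_more_time


def run_sync(infer, batch_size=1, **limits):
    """Run `infer` back to back and return (count, mean latency ms, FPS)."""
    latencies_ms = []
    start = time.perf_counter()
    while should_continue(len(latencies_ms), time.perf_counter() - start, **limits):
        t0 = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    latency_ms = mean(latencies_ms)
    # Assumed formula: one batch completes every `latency_ms` milliseconds.
    throughput_fps = batch_size * 1000.0 / latency_ms
    return len(latencies_ms), latency_ms, throughput_fps


# Stand-in workload of roughly 1 ms per call.
count, latency_ms, fps = run_sync(lambda: time.sleep(0.001), niter=5)
print(count, round(latency_ms, 2), round(fps, 1))
```

When both `niter` and `duration_s` are passed, the loop keeps running until the iteration count and the time limit are both reached, which matches the third bullet above.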

### Asynchronous API

For asynchronous mode, the primary metric is throughput in frames per second (FPS). The application creates a certain number of infer requests and executes the `StartAsync` method. The number of infer requests is specified with the `-nireq` command-line parameter. The number of executions is defined by one of the following:
* Number of iterations defined with the `-niter` command-line argument
* Time duration specified with the `-t` command-line argument
* Both of them (execution continues until both conditions are met)
* Predefined duration if neither `-niter` nor `-t` is specified. The predefined duration value depends on the device.

The infer requests are executed asynchronously. A callback is used to wait for the previous execution to complete. The application measures all infer request executions and reports the throughput metric based on the batch size and the total execution duration.
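
To make the asynchronous scheme concrete, here is a minimal sketch that keeps at most `nireq` calls in flight and derives throughput from the batch size and the total duration. It does not use the Inference Engine API: worker threads stand in for infer requests, `infer` is an arbitrary callable, and only an iteration limit is modeled.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_async(infer, nireq, niter, batch_size=1):
    """Run `niter` calls with at most `nireq` in flight and return FPS."""
    in_flight = set()
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=nireq) as pool:
        for _ in range(niter):
            if len(in_flight) == nireq:
                # All request slots are busy: block until one completes.
                _, in_flight = wait(in_flight, return_when=FIRST_COMPLETED)
            in_flight.add(pool.submit(infer))
        wait(in_flight)  # drain the remaining requests
    duration_s = time.perf_counter() - start
    return batch_size * niter / duration_s


# Stand-in workload of roughly 10 ms per call, four requests in parallel.
fps = run_async(lambda: time.sleep(0.01), nireq=4, niter=8)
print(round(fps, 1))
```

With a 10 ms workload, a single request would top out near 100 FPS, while four parallel requests can approach 400 FPS. This is why throughput, not per-request latency, is the primary metric in this mode.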

## Running
Note that `benchmark_app` usually produces optimal performance for any device out of the box.

**So in most cases you don't need to set the app options explicitly, and the plain device name is enough**, e.g.:
```
$benchmark_app -m <model> -i <input> -d CPU
```

Running the application with the `-h` or `--help` option yields the following usage message:

```
usage: benchmark_app.py [-h] [-i PATH_TO_INPUT] -m PATH_TO_MODEL
                        [-pp PLUGIN_DIR] [-d TARGET_DEVICE]
                        [-l PATH_TO_EXTENSION] [-c PATH_TO_CLDNN_CONFIG]
                        [-api {sync,async}] [-niter NUMBER_ITERATIONS]
                        [-nireq NUMBER_INFER_REQUESTS] [-b BATCH_SIZE]
                        [-stream_output [STREAM_OUTPUT]] [-t TIME]
                        [-progress [PROGRESS]] [-nstreams NUMBER_STREAMS]
                        [-nthreads NUMBER_THREADS] [-pin {YES,NO}]
                        [--exec_graph_path EXEC_GRAPH_PATH]
                        [-pc [PERF_COUNTS]]

Options:
  -h, --help            Show this help message and exit.
  -i PATH_TO_INPUT, --path_to_input PATH_TO_INPUT
                        Optional. Path to a folder with images and/or binaries
                        or to specific image or binary file.
  -m PATH_TO_MODEL, --path_to_model PATH_TO_MODEL
                        Required. Path to an .xml file with a trained model.
  -pp PLUGIN_DIR, --plugin_dir PLUGIN_DIR
                        Optional. Path to a plugin folder.
  -d TARGET_DEVICE, --target_device TARGET_DEVICE
                        Optional. Specify a target device to infer on: CPU,
                        GPU, FPGA, HDDL or MYRIAD.
                        Use "-d HETERO:<comma separated devices list>" format to specify HETERO plugin.
  -l PATH_TO_EXTENSION, --path_to_extension PATH_TO_EXTENSION
                        Optional. Required for CPU custom layers. Absolute
                        path to a shared library with the kernels
                        implementations.
  -c PATH_TO_CLDNN_CONFIG, --path_to_cldnn_config PATH_TO_CLDNN_CONFIG
                        Optional. Required for GPU custom kernels. Absolute
                        path to an .xml file with the kernels description.
  -api {sync,async}, --api_type {sync,async}
                        Optional. Enable using sync/async API. Default value
                        is async.
  -niter NUMBER_ITERATIONS, --number_iterations NUMBER_ITERATIONS
                        Optional. Number of iterations. If not specified, the
                        number of iterations is calculated depending on a
                        device.
  -nireq NUMBER_INFER_REQUESTS, --number_infer_requests NUMBER_INFER_REQUESTS
                        Optional. Number of infer requests. Default value is
                        determined automatically for device.
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Optional. Batch size value. If not specified, the
                        batch size value is determined from IR
  -stream_output [STREAM_OUTPUT]
                        Optional. Print progress as a plain text. When
                        specified, an interactive progress bar is replaced
                        with a multiline output.
  -t TIME, --time TIME  Optional. Time in seconds to execute topology.
  -progress [PROGRESS]  Optional. Show progress bar (can affect performance
                        measurement). Default values is "False".
  -nstreams NUMBER_STREAMS, --number_streams NUMBER_STREAMS
                       Optional. Number of streams to use for inference on the CPU/GPU in throughput mode
                       (for HETERO device case use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>).
  -nthreads NUMBER_THREADS, --number_threads NUMBER_THREADS
                        Number of threads to use for inference on the CPU
                        (including HETERO case).
  -pin {YES,NO}, --infer_threads_pinning {YES,NO}
                        Optional. Enable ("YES" is default value) or disable
                        ("NO")CPU threads pinning for CPU-involved inference.
  --exec_graph_path EXEC_GRAPH_PATH
                        Optional. Path to a file where to store executable
                        graph information serialized.
  -pc [PERF_COUNTS], --perf_counts [PERF_COUNTS]
                        Optional. Report performance counters.

```

Running the application with an empty list of options yields the usage message given above and an error message.

The application supports topologies with one or more inputs. If a topology is not data sensitive, you can skip the input parameter. In this case, inputs are filled with random values.
If a model has only image input(s), provide a folder with images or a path to an image as input.
If a model has some specific input(s) (not images), prepare binary file(s) filled with data of the appropriate precision and provide a path to them as input.
If a model has mixed input types, the input folder should contain all required files. Image inputs are filled with image files one by one. Binary inputs are filled with binary files one by one.
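
As a rough illustration of preparing a binary input, the sketch below writes a file of raw 32-bit floats using only the standard library. The exact layout the application expects is not spelled out here, so treat the details as assumptions: headerless values in native byte order, a hypothetical input of 10 elements, and the file name `input.bin`.

```python
import os
import tempfile
from array import array


def write_binary_input(path, values, typecode="f"):
    """Write `values` as raw, headerless data and return the byte count.

    Typecode "f" is a 4-byte float on common platforms.
    """
    data = array(typecode, values)
    with open(path, "wb") as f:
        data.tofile(f)
    return len(data) * data.itemsize


# Hypothetical input with 10 elements of 32-bit float precision.
path = os.path.join(tempfile.mkdtemp(), "input.bin")
size = write_binary_input(path, [0.0] * 10)
print(size, os.path.getsize(path))
```

The resulting path (or its parent folder) is what you would pass with `-i`. Check that the byte count matches the element count of the model input times the size of its precision.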

To run the demo, you can use public or pre-trained models. To download the pre-trained models, use the OpenVINO [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader) or go to [https://download.01.org/opencv/](https://download.01.org/opencv/).

> **NOTE**: Before running the demo with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md).

For example, to run inference on an image using a trained network with multiple outputs on CPU, run the following command:

```
python3 benchmark_app.py -i <path_to_image>/inputImage.bmp -m <path_to_model>/multiple-output.xml -d CPU
```
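
If you want to script several runs, for example to compare the `sync` and `async` modes, a small helper can assemble the command line. The sketch below uses only flags listed in the usage message above; the `build_command` helper and the `model.xml` path are hypothetical.

```python
import shlex


def build_command(model, device="CPU", api="async", input_path=None,
                  niter=None, time_s=None, nireq=None):
    """Assemble a benchmark_app.py argument list from documented options."""
    cmd = ["python3", "benchmark_app.py", "-m", model, "-d", device, "-api", api]
    optional = (("-i", input_path), ("-niter", niter),
                ("-t", time_s), ("-nireq", nireq))
    for flag, value in optional:
        if value is not None:  # emit only the options that were requested
            cmd += [flag, str(value)]
    return cmd


for api in ("sync", "async"):
    print(shlex.join(build_command("model.xml", api=api, niter=100)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to launch the run.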

## Demo Output

The application outputs the number of executed iterations, the total duration of execution, latency, and throughput.
Additionally, if you set the `-pc` parameter, the application outputs performance counters.
If you set `--exec_graph_path`, the application stores serialized executable graph information in the specified file.

```
[Step 8/9] Measuring performance (Start inference asyncronously, 60000 ms duration, 4 inference requests in parallel using 4 streams)
Progress: |................................| 100.00%

[Step 9/9] Dumping statistics report
Progress: |................................| 100.00%

Count:      4408 iterations
Duration:   60153.52 ms
Latency:    51.8244 ms
Throughput: 73.28 FPS

```
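
The numbers in this sample report can be cross-checked against the description of the asynchronous mode, where throughput is based on the batch size and the total duration. The sketch below parses the four summary lines and recomputes the FPS value. A batch size of 1 is an assumption, since the sample run does not state it.

```python
import re

REPORT = """\
Count:      4408 iterations
Duration:   60153.52 ms
Latency:    51.8244 ms
Throughput: 73.28 FPS
"""


def parse_report(text):
    """Map each 'Name: value unit' line to a lower-cased name and a float."""
    fields = {}
    for name, value in re.findall(r"^(\w+):\s+([\d.]+)", text, flags=re.MULTILINE):
        fields[name.lower()] = float(value)
    return fields


stats = parse_report(REPORT)
batch_size = 1  # assumption: not stated in the sample output
fps = batch_size * stats["count"] / (stats["duration"] / 1000.0)
print(round(fps, 2))  # 73.28, matching the reported Throughput line
```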

## See Also
* [Using Inference Engine Samples](./docs/IE_DG/Samples_Overview.md)
* [Model Optimizer](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
* [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader)