inference-engine/ie_bridges/python/sample/benchmark_app/README.md

# Benchmark Application Python* Demo

This topic demonstrates how to run the Benchmark Application demo, which performs inference using convolutional networks.

## How It Works

> **NOTE**: To achieve benchmark results similar to the official published results, set CPU frequency to 2.9 GHz and GPU frequency to 1 GHz.

Upon start-up, the application reads command-line parameters and loads a network and images to the Inference Engine plugin. The number of infer requests and the execution approach depend on the mode defined with the `-api` command-line parameter.

> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channel order. If you trained your model to work with RGB order, you need to manually rearrange the default channel order in the sample or demo application, or reconvert your model using the Model Optimizer tool with the `--reverse_input_channels` argument specified. For more information about the argument, refer to the **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).

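The manual channel rearrangement mentioned in the note above can be sketched in plain Python. This is only an illustration, assuming an image stored as nested lists of `[R, G, B]` pixels; the helper name `rgb_to_bgr` is hypothetical and is not part of the demo.

```python
def rgb_to_bgr(image):
    """Return a copy of `image` with the channel order of every pixel reversed.

    `image` is assumed to be a list of rows, each row a list of
    [R, G, B] pixels (a hypothetical layout chosen for illustration).
    """
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 RGB image: one pure-red pixel followed by one pure-blue pixel.
rgb_image = [[[255, 0, 0], [0, 0, 255]]]
bgr_image = rgb_to_bgr(rgb_image)
```

With an image held in a NumPy array of shape height x width x channels, the same reversal is commonly written as `image[..., ::-1]`.
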
### Synchronous API
For synchronous mode, the primary metric is latency. The application creates one infer request and executes the `Infer` method. The number of executions is defined by one of two values:
* Number of iterations defined with the `-niter` command-line argument
* Predefined duration if `-niter` is skipped. The predefined duration value depends on the device.

During the execution, the application collects two types of metrics:
* Latency for each infer request executed with the `Infer` method
* Duration of all executions

The reported latency value is calculated as the mean of all collected latencies. The reported throughput value is derived from the reported latency and additionally depends on the batch size.

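The relationship between the two reported values can be sketched as follows. The exact formula used by the application is not given here, so this snippet assumes that throughput equals the batch size multiplied by the number of inferences per second implied by the mean latency; the function name `sync_metrics` is hypothetical.

```python
from statistics import mean


def sync_metrics(latencies_ms, batch_size):
    """Return (mean latency in ms, throughput in FPS).

    Assumption: throughput = batch_size * 1000 / mean latency in ms.
    """
    latency_ms = mean(latencies_ms)
    throughput_fps = batch_size * 1000.0 / latency_ms
    return latency_ms, throughput_fps


# Four made-up latency samples, in milliseconds, for a batch size of 1.
latency, fps = sync_metrics([15.0, 16.0, 15.5, 15.5], batch_size=1)
```
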
### Asynchronous API
For asynchronous mode, the primary metric is throughput in frames per second (FPS). The application creates a certain number of infer requests and executes the `StartAsync` method. The number of infer requests is specified with the `-nireq` command-line parameter. The number of executions is defined by one of two values:
* Number of iterations defined with the `-niter` command-line argument
* Predefined duration if `-niter` is skipped. The predefined duration value depends on the device.

The infer requests are executed asynchronously. The `Wait` method is used to wait for the previous execution to complete. The application measures all infer request executions and reports the throughput metric based on the batch size and the total execution duration.

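The asynchronous flow described above can be imitated with the Python standard library. The sketch below does not use the Inference Engine API: a thread pool stands in for the infer requests, `submit` plays the role of `StartAsync`, `result()` plays the role of `Wait`, and `fake_infer` is a hypothetical placeholder that only sleeps. The throughput formula (batch size times executions, divided by total duration) is likewise an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_infer(duration_s=0.001):
    """Hypothetical stand-in for one inference; it only sleeps."""
    time.sleep(duration_s)


def run_async(niter, nireq, batch_size):
    """Run `niter` fake inferences over `nireq` slots and return FPS."""
    slots = [None] * nireq  # one pending future per infer request
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=nireq) as pool:
        for i in range(niter):
            slot = i % nireq
            if slots[slot] is not None:
                slots[slot].result()  # wait for the previous execution
            slots[slot] = pool.submit(fake_infer)  # start the next one
        for pending in slots:
            if pending is not None:
                pending.result()  # drain the requests still in flight
    total_s = time.perf_counter() - start
    # Assumed formula: frames processed divided by total duration.
    return batch_size * niter / total_s


fps = run_async(niter=20, nireq=2, batch_size=1)
```
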
## Running

Run the application with the `-h` or `--help` option to print the usage message:
```
python3 benchmark_app.py -h
```

The command yields the following usage message:
```
usage: benchmark_app.py [-h] -i PATH_TO_IMAGES -m PATH_TO_MODEL
                        [-c PATH_TO_CLDNN_CONFIG] [-l PATH_TO_EXTENSION]
                        [-api {sync,async}] [-d TARGET_DEVICE]
                        [-niter NUMBER_ITERATIONS]
                        [-nireq NUMBER_INFER_REQUESTS]
                        [-nthreads NUMBER_THREADS] [-b BATCH_SIZE]
                        [-pin {YES,NO}]

Options:
  -h, --help            Show this help message and exit.
  -i PATH_TO_IMAGES, --path_to_images PATH_TO_IMAGES
                        Required. Path to a folder with images or to image
                        files.
  -m PATH_TO_MODEL, --path_to_model PATH_TO_MODEL
                        Required. Path to an .xml file with a trained model.
  -c PATH_TO_CLDNN_CONFIG, --path_to_cldnn_config PATH_TO_CLDNN_CONFIG
                        Optional. Required for GPU custom kernels. Absolute
                        path to an .xml file with the kernels description.
  -l PATH_TO_EXTENSION, --path_to_extension PATH_TO_EXTENSION
                        Optional. Required for GPU custom kernels. Absolute
                        path to an .xml file with the kernels description.
  -api {sync,async}, --api_type {sync,async}
                        Optional. Enable using sync/async API. Default value
                        is sync
  -d TARGET_DEVICE, --target_device TARGET_DEVICE
                        Optional. Specify a target device to infer on: CPU,
                        GPU, FPGA, HDDL or MYRIAD. Use "-d HETERO:<comma
                        separated devices list>" format to specify HETERO
                        plugin. The application looks for a suitable plugin
                        for the specified device.
  -niter NUMBER_ITERATIONS, --number_iterations NUMBER_ITERATIONS
                        Optional. Number of iterations. If not specified, the
                        number of iterations is calculated depending on a
                        device.
  -nireq NUMBER_INFER_REQUESTS, --number_infer_requests NUMBER_INFER_REQUESTS
                        Optional. Number of infer requests (default value is
                        2).
  -nthreads NUMBER_THREADS, --number_threads NUMBER_THREADS
                        Number of threads to use for inference on the CPU
                        (including Hetero cases).
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Optional. Batch size value. If not specified, the
                        batch size value is determined from IR
  -pin {YES,NO}, --infer_threads_pinning {YES,NO}
                        Optional. Enable ("YES" is default value) or disable
                        ("NO")CPU threads pinning for CPU-involved inference.
```

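For readers who want to experiment with the option names, a parser accepting a subset of them can be sketched with `argparse`. This is not the application's actual source: the function name `build_parser` is hypothetical, only six options are reproduced, and the defaults are the ones stated in the usage message (`sync` for `-api`, `2` for `-nireq`).

```python
import argparse


def build_parser():
    """Build a parser for a subset of the options in the usage message."""
    parser = argparse.ArgumentParser(prog="benchmark_app.py")
    parser.add_argument("-i", "--path_to_images", required=True,
                        help="Path to a folder with images or to image files.")
    parser.add_argument("-m", "--path_to_model", required=True,
                        help="Path to an .xml file with a trained model.")
    parser.add_argument("-api", "--api_type", choices=["sync", "async"],
                        default="sync", help="Enable using sync/async API.")
    parser.add_argument("-niter", "--number_iterations", type=int,
                        default=None, help="Number of iterations.")
    parser.add_argument("-nireq", "--number_infer_requests", type=int,
                        default=2, help="Number of infer requests.")
    parser.add_argument("-b", "--batch_size", type=int, default=None,
                        help="Batch size value.")
    return parser


# Parse a command line similar to the example used in this document.
args = build_parser().parse_args(
    ["-i", "inputImage.bmp", "-m", "multiple-output.xml", "-api", "async"]
)
```
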
Running the application with an empty list of options yields the usage message given above and an error message.

To run the demo, you can use public or pre-trained models. To download the pre-trained models, use the OpenVINO [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader) or go to [https://download.01.org/opencv/](https://download.01.org/opencv/).

> **NOTE**: Before running the demo with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md).

For example, to run inference on an image using a trained network with multiple outputs on CPU, run the following command:

```
python3 benchmark_app.py -i <path_to_image>/inputImage.bmp -m <path_to_model>/multiple-output.xml -d CPU
```

## Demo Output

The application output depends on the API used. For the synchronous API, the application outputs latency and throughput:
```
[ INFO ] Start inference synchronously (10 s duration)
[BENCHMARK RESULT] Latency is 15.5520 msec
[BENCHMARK RESULT] Throughput is 1286.0082 FPS
```

For the asynchronous API, the application outputs only throughput:
```
[ INFO ] Start inference asynchronously (10 s duration, 8 inference requests in parallel)
[BENCHMARK RESULT] Throughput is 1444.2591 FPS
```

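When the results need to be collected by a script, the `[BENCHMARK RESULT]` lines can be extracted with a regular expression. The sketch below assumes the line format stays exactly as in the samples above; the names `RESULT_LINE` and `parse_results` are hypothetical.

```python
import re

# Matches lines such as "[BENCHMARK RESULT] Latency is 15.5520 msec".
RESULT_LINE = re.compile(r"\[BENCHMARK RESULT\]\s+(\w+) is ([\d.]+)\s+(\w+)")


def parse_results(output):
    """Return {metric name: (value, unit)} for every result line in `output`."""
    results = {}
    for line in output.splitlines():
        match = RESULT_LINE.search(line)
        if match:
            name, value, unit = match.groups()
            results[name] = (float(value), unit)
    return results


sample = """\
[ INFO ] Start inference synchronously (10 s duration)
[BENCHMARK RESULT] Latency is 15.5520 msec
[BENCHMARK RESULT] Throughput is 1286.0082 FPS
"""
results = parse_results(sample)
```

If throughput is assumed to equal the batch size multiplied by 1000 and divided by the latency in milliseconds, the synchronous sample figures would correspond to a batch size of about 20 (1286.0082 × 15.5520 / 1000 ≈ 20). The batch size actually used for the sample run is not stated in this document.
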
## See Also
* [Using Inference Engine Samples](./docs/IE_DG/Samples_Overview.md)
* [Model Optimizer](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
* [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader)