inference-engine/samples/validation_app/README.md

   1 # Validation App {#InferenceEngineValidationApp}
   2
   3 Inference Engine Validation Application ("validation app" for short) is a tool that allows the user to score common topologies with
   4 de facto standard inputs and outputs configuration. Such as AlexNet or SSD. Validation app allows the user to collect simple
   5 validation metrics for the topologies. It supports Top1/Top5 counting for classification networks and 11-points mAP calculation for
   6 object detection networks.
   7
   8 Possible usages of the tool:
   9 * Check if Inference Engine scores the public topologies well (the development team uses the validation app for regular testing and the user bugreports are always welcome)
  10 * Verify if the user's custom topology compatible with the default input/output configuration and compare its accuracy with the public ones
  11 * Using Validation App as another sample: although the code is much more complex than in classification and object detection samples, it's still open and could be re-used
  12
  13 This document describes the usage and features of Inference Engine Validation Application.
  14
  15 ## Validation Application options
  16
  17 Let's list <code>validation_app</code> CLI options and describe them:
  18
  19         Usage: validation_app [OPTION]
  20
  21         Available options:
  22
  23             -h                        Print a usage message
  24             -t <type>                 Type of the network being scored ("C" by default)
  25               -t "C" for classification
  26               -t "OD" for object detection
  27             -i <path>                 Required. Folder with validation images, folders grouped by labels or a .txt file list for classification networks or a VOC-formatted dataset for object detection networks
  28             -m <path>                 Required. Path to an .xml file with a trained model
  29             -l <absolute_path>        Required for MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernel implementations
  30             -c <absolute_path>        Required for clDNN (GPU)-targeted custom kernels.Absolute path to the xml file with the kernel descriptions
  31             -d <device>               Specify the target device to infer on; CPU, GPU, FPGA or MYRIAD is acceptable. Sample will look for a suitable plugin for device specified (CPU by default)
  32             -b N                      Batch size value. If not specified, the batch size value is determined from IR
  33             -ppType <type>            Preprocessing type. One of "None", "Resize", "ResizeCrop"
  34             -ppSize N                 Preprocessing size (used with ppType="ResizeCrop")
  35             -ppWidth W                Preprocessing width (overrides -ppSize, used with ppType="ResizeCrop")
  36             -ppHeight H               Preprocessing height (overrides -ppSize, used with ppType="ResizeCrop")
  37             --dump                    Dump filenames and inference results to a csv file
  38
  39             Classification-specific options:
  40               -Czb true               "Zero is a background" flag. Some networks are trained with a modified dataset where the class IDs are enumerated from 1, but 0 is an undefined "background" class (which is never detected)
  41
  42             Object detection-specific options:
  43               -ODkind <kind>          Kind of an object detection network: SSD
  44               -ODa <path>             Required for OD networks. Path to the folder containing .xml annotations for images
  45               -ODc <file>             Required for OD networks. Path to the file containing classes list
  46               -ODsubdir <name>        Folder between the image path (-i) and image name, specified in the .xml. Use JPEGImages for VOC2007
  47
  48
  49 There are three categories of options here.
  50 1. Common options, usually named with a single letter or word, such as <code>-b</code> or <code>--dump</code>. These options have a common sense in all validation_app modes.
  51 2. Network type-specific options. They are named as an acronym of the network type (such as <code>C</code> or <code>OD</code>, followed by a letter or a word addendum. These options are specific for the network type. For instance, <code>ODa</code> option makes sense only for an object detection network.
  52
  53 Let's show how to use Validation Application in all its common modes
  54
  55 ## Running classification
  56
  57 This topic demonstrates how to run the Validation Application, in classification mode to score a classification CNN on a pack of images.
  58
  59 You can use the following command to do inference of a chosen pack of images:
  60 ```bash
  61 ./validation_app -t C -i <path to images main folder or .txt file> -m <model to use for classification> -d <CPU|GPU>
  62 ```
  63
  64 ### Source dataset format: folders as classes
  65 A correct bunch of files should look something like:
  66
  67         <path>/dataset
  68                 /apron
  69                         /apron1.bmp
  70                         /apron2.bmp
  71                 /collie
  72                         /a_big_dog.jpg
  73                 /coral reef
  74                         /reef.bmp
  75                 /Siamese
  76                         /cat3.jpg
  77
  78 To score this dataset you should put `-i <path>/dataset` option to the command line
  79
  80 ### Source dataset format: a list of images
  81 Here we use a single list file in the format "image_name-tabulation-class_index". The correct bunch of files:
  82
  83         <path>/dataset
  84                 /apron1.bmp
  85                 /apron2.bmp
  86                 /a_big_dog.jpg
  87                 /reef.bmp
  88                 /cat3.jpg
  89                 /labels.txt
  90
  91 where `labels.txt` looks like:
  92
  93         apron1.bmp 411
  94         apron2.bmp 411
  95         cat3.jpg 284
  96         reef.bmp 973
  97         a_big_dog.jpg 231
  98
  99 To score this dataset you should put `-i <path>/dataset/labels.txt` option to the command line
 100
 101 ### Outputs
 102
 103 Progress bar will be shown representing a progress of inference.
 104 After inference is complete common info will be shown:
 105 <pre class="brush: bash">
 106 Network load time: time spent on topology load in ms
 107 Model: path to chosen model
 108 Model Precision: precision of the chosen model
 109 Batch size: specified batch size
 110 Validation dataset: path to a validation set
 111 Validation approach: Classification networks
 112 Device: device type
 113 </pre>
 114 Then application shows statistics like average infer time, top 1 and top 5 accuracy, for example:
 115 <pre class="brush: bash">
 116 Average infer time (ms): 588.977 (16.98 images per second with batch size = 10)
 117
 118 Top1 accuracy: 70.00% (7 of 10 images were detected correctly, top class is correct)
 119 Top5 accuracy: 80.00% (8 of 10 images were detected correctly, top five classes contain required class)
 120 </pre>
 121
 122 ### How it works
 123
 124 Upon the start-up the validation application reads command line parameters and loads a network to the Inference Engine plugin.
 125 Then program reads validation set (<code>-i</code> option):
 126
 127 - If it specifies a directory, the program tries to load labels first. To do this, app searches for the file with the same name as model but with ".labels" extension (instead of .xml).
 128   Then it searches the folder specified and adds to the validation set all images from subfolders whose names are equal to some known label.
 129   If there are no subfolders whose names are equal to known labels, validation set is considered to be empty.
 130
 131 - If it specifies a .txt file, the application reads file considering every line has an expected format: <code>&lt;relative_path_from_txt_to_img&gt; &lt;ID&gt;</code>.
 132   <code>ID</code> is a number of image that network should classify.
 133
 134 After that, the app reads the number of images specified by the <code>-b</code> option and loads them to plugin.
 135
 136 <strong>Note:</strong> Images loading time is not a part of inference time reported by the application.
 137
 138 When all images are loaded, a plugin executes inferences and the Validation Application collects the statistics.
 139
 140 It is possible to retrieve infer result by specifying <code>--dump</code> option.
 141
 142 This option enables creation (if possible) of an inference report with the name in format <code>dumpfileXXXX.csv</code>.
 143
 144 The structure of the report is a number of lines, each contains semicolon separated values:
 145 * image_path;
 146 * flag representing correctness of prediction;
 147 * id of Top-1 class;
 148 * probability that the image belongs to Top-1 class;
 149 * id of Top-2 class;
 150 * probability that the image belongs to Top-2 class;
 151 * etc.
 152
 153 ## Object detection
 154
 155 This topic demonstrates how to run the Validation Application, in object detection mode to score an object detection
 156 CNN on a pack of images.
 157
 158 Validation app was validated with SSD CNN. Any network that can be scored by the IE and has the same input and output
 159 format as one of these should be supported as well.
 160
 161 ### Running SSD on the VOC dataset
 162
 163 SSD could be scored on the original dataset that was used to test it during its training. To do that:
 164
 165 1. From the SSD author's page (<code>https://github.com/weiliu89/caffe/tree/ssd</code>) download Pre-trained SSD-300:
 166
 167         https://drive.google.com/open?id=0BzKzrI_SkD1_WVVTSmQxU0dVRzA
 168
 169 2. Download VOC2007 testing dataset (this link could be found on the same github page):
 170   ```bash
 171   $wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
 172   tar -xvf VOCtest_06-Nov-2007.tar
 173   ```
 174 3. Convert the model with Model Optimizer
 175
 176 4. Create a proper class file (made from the original <code>labelmap_voc.prototxt</code>)
 177
 178         none_of_the_above 0
 179         aeroplane 1
 180         bicycle 2
 181         bird 3
 182         boat 4
 183         bottle 5
 184         bus 6
 185         car 7
 186         cat 8
 187         chair 9
 188         cow 10
 189         diningtable 11
 190         dog 12
 191         horse 13
 192         motorbike 14
 193         person 15
 194         pottedplant 16
 195         sheep 17
 196         sofa 18
 197         train 19
 198         tvmonitor 20
 199
 200 ...and save it as <code>VOC_SSD_Classes.txt</code> file.
 201
 202 5. Score the model on the dataset:
 203   ```bash
 204   ./validation_app -d CPU -t OD -ODa "<...>/VOCdevkit/VOC2007/Annotations" -i "<...>/VOCdevkit" -m "<...>/vgg_voc0712_ssd_300x300.xml" -ODc "<...>/VOC_SSD_Classes.txt" -ODsubdir JPEGImages
 205   ```
 206 As a result you should see a progressbar that will count from 0% to 100% during some time and then you'll see this:
 207
 208         Progress: [....................] 100.00% done
 209         [ INFO ] Processing output blobs
 210         Network load time: 27.70ms
 211         Model: /home/user/models/ssd/withmean/vgg_voc0712_ssd_300x300/vgg_voc0712_ssd_300x300.xml
 212         Model Precision: FP32
 213         Batch size: 1
 214         Validation dataset: /home/user/Data/SSD-data/testonly/VOCdevkit
 215         Validation approach: Object detection network
 216
 217         Average infer time (ms): 166.49 (6.01 images per second with batch size = 1)
 218         Average precision per class table:
 219
 220         Class   AP
 221         1       0.796
 222         2       0.839
 223         3       0.759
 224         4       0.695
 225         5       0.508
 226         6       0.867
 227         7       0.861
 228         8       0.886
 229         9       0.602
 230         10      0.822
 231         11      0.768
 232         12      0.861
 233         13      0.874
 234         14      0.842
 235         15      0.797
 236         16      0.526
 237         17      0.792
 238         18      0.795
 239         19      0.873
 240         20      0.773
 241
 242         Mean Average Precision (mAP): 0.7767
 243
 244 This value (Mean Average Precision) is specified in a table on the SSD author's page (<code>https://github.com/weiliu89/caffe/tree/ssd</code>) and in their arXiv paper (http://arxiv.org/abs/1512.02325)
 245
 246 ## See Also
 247
 248 * [Using Inference Engine Samples](@ref SamplesOverview)