inference-engine/samples/validation_app/README.md

# Validation Application

The Inference Engine Validation Application is a tool that runs inference on deep learning models with
standard input and output configurations and collects simple
validation metrics for topologies. It supports the **top-1** and **top-5** metrics for Classification networks and
the 11-point **mAP** metric for Object Detection networks.
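
To make the two classification metrics concrete, the following Python sketch shows how top-1 and top-5 accuracy are commonly computed from per-class scores. It is an illustration only, not the application's implementation; the function name `top_k_accuracy` and the toy scores are assumptions made for this example.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy data: three samples, three classes (hypothetical values).
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
top1 = top_k_accuracy(scores, labels, k=1)  # only the first sample is a hit
top2 = top_k_accuracy(scores, labels, k=2)  # the first two samples are hits
```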

Possible use cases of the tool:
* Check whether the Inference Engine infers the public topologies well (the engineering team uses the Validation Application for
  regular testing)
* Verify that a custom model is compatible with the default input/output configuration and compare its
  accuracy with that of the public models
* Use the Validation Application as another sample: although the code is much more complex than in the classification and object
  detection samples, the source code is open and can be reused.

## Validation Application Options

The Validation Application provides the following command-line interface (CLI):
```sh
Usage: validation_app [OPTION]

Available options:

    -h                        Print a help message
    -t <type>                 Type of an inferred network ("C" by default)
      -t "C" for classification
      -t "OD" for object detection
    -i <path>                 Required. Path to a directory with validation images. For Classification models, the directory must contain folders named as labels with images inside or a .txt file with a list of images. For Object Detection models, the dataset must be in VOC format.
    -m <path>                 Required. Path to an .xml file with a trained model
    -l <absolute_path>        Required for CPU custom layers. Absolute path to a shared library with the kernel implementations
    -c <absolute_path>        Required for GPU custom kernels. Absolute path to an .xml file with the kernel descriptions.
    -d <device>               Target device to infer on: CPU (default), GPU, FPGA, or MYRIAD. The application looks for a suitable plugin for the specified device.
    -b N                      Batch size value. If not specified, the batch size value is taken from IR
    -ppType <type>            Preprocessing type. Options: "None", "Resize", "ResizeCrop"
    -ppSize N                 Preprocessing size (used with ppType="ResizeCrop")
    -ppWidth W                Preprocessing width (overrides -ppSize, used with ppType="ResizeCrop")
    -ppHeight H               Preprocessing height (overrides -ppSize, used with ppType="ResizeCrop")
    --dump                    Dump file names and inference results to a .csv file

    Classification-specific options:
      -Czb true               "Zero is a background" flag. Some networks are trained with a modified dataset where the class IDs are enumerated from 1, but 0 is an undefined "background" class (which is never detected)

    Object detection-specific options:
      -ODkind <kind>          Type of an Object Detection model. Options: SSD
      -ODa <path>             Required for Object Detection models. Path to a directory containing an .xml file with annotations for images.
      -ODc <file>             Required for Object Detection models. Path to a file containing a list of classes
      -ODsubdir <name>        Directory between the path to images (specified with -i) and image name (specified in the .xml file). For VOC2007 dataset, use JPEGImages.
```
The tool options are divided into two categories:
1. **Common options** named with a single letter or a word, such as `-b` or `--dump`.
   These options are the same in all Validation Application modes.
2. **Network type-specific options** named as an acronym of the network type (`C` or `OD`)
   followed by a letter or a word.

## General Workflow

When executed, the Validation Application performs the following steps:

1. Loads a model to an Inference Engine plugin.
2. Reads the validation set (specified with the `-i` option):
    - If you specified a directory, the application tries to load labels first. To do this, it searches for a file
      with the same name as the model, but with the `.labels` extension (instead of `.xml`).
      Then it searches the specified folder, detects its sub-folders named as known labels, and adds all images from these sub-folders to the validation set. If there are no such sub-folders, the validation set is considered empty.

    - If you specified a `.txt` file, the application reads this file, expecting every line to be in the correct format.
      For more information about the format, refer to the <a href="#preparing">Prepare a Dataset</a> section below.

3. Reads the batch size value specified with the `-b` option (or takes it from the IR if `-b` is not specified) and loads this number of images to the plugin.
   **Note**: Image loading time is not a part of the inference time reported by the application.

4. The plugin infers the model, and the Validation Application collects the statistics.
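
The directory branch of step 2 can be sketched in Python as follows. This is a hedged approximation of the behavior described above, not the application's source: the helper names `labels_file_for` and `collect_images` are hypothetical, the set of image suffixes is an assumption taken from the formats mentioned in this document, and the known labels are passed in directly because the layout of the `.labels` file is not described here.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Assumption: only the image formats mentioned in this document.
IMAGE_SUFFIXES = {".bmp", ".jpg"}


def labels_file_for(model_xml: str) -> Path:
    """Same path as the model, with `.labels` in place of `.xml`."""
    return Path(model_xml).with_suffix(".labels")


def collect_images(dataset_dir: str,
                   known_labels: Iterable[str]) -> List[Tuple[Path, str]]:
    """Return (image path, label) pairs from sub-folders named as known labels."""
    known = set(known_labels)
    found = []
    for sub in sorted(Path(dataset_dir).iterdir()):
        if sub.is_dir() and sub.name in known:
            for image in sorted(sub.iterdir()):
                if image.suffix.lower() in IMAGE_SUFFIXES:
                    found.append((image, sub.name))
    # The list stays empty when no sub-folder matches a known label.
    return found
```

An empty result from `collect_images` corresponds to the "validation set is considered empty" case in step 2.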

You can also save the inference results by specifying the `--dump` option. This CLI option creates
(if possible) an inference report in the `.csv` format. The report is generated only
for Classification models.

Each line of the report contains semicolon-separated values:
* image path
* a flag representing correctness of the prediction
* ID of the Top-1 class
* probability that the image belongs to the Top-1 class, in percent
* ID of the Top-2 class
* probability that the image belongs to the Top-2 class, in percent
* ... (the same pair of values for each subsequent class)

This is an example line from such a report:
```bash
"ILSVRC2012_val_00002138.bmp";1;1;8.5;392;6.875;123;5.875;2;5.5;396;5;
```
It means that the given image was predicted correctly (the flag is *1*). The most probable prediction is that this image
represents class *1* with a probability of *0.085* (8.5%).
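
For post-processing, a report line like the one above can be split into its fields with Python's standard `csv` module. The sketch below is an assumption-laden reading of the format: the names `ReportRow` and `parse_report_line` are hypothetical, and treating the flag value `1` as "correct" is inferred from the example rather than from a format specification.

```python
import csv
from typing import List, NamedTuple, Tuple


class ReportRow(NamedTuple):
    image: str
    correct: bool
    predictions: List[Tuple[int, float]]  # (class ID, probability in percent)


def parse_report_line(line: str) -> ReportRow:
    """Split one semicolon-separated report line into typed fields."""
    fields = next(csv.reader([line], delimiter=";"))
    # Drop the empty field produced by the trailing ';'.
    fields = [f for f in fields if f != ""]
    image, flag, rest = fields[0], fields[1], fields[2:]
    # The remaining values come in (class ID, percent) pairs.
    pairs = [(int(rest[i]), float(rest[i + 1]))
             for i in range(0, len(rest) - 1, 2)]
    return ReportRow(image, flag == "1", pairs)


row = parse_report_line(
    '"ILSVRC2012_val_00002138.bmp";1;1;8.5;392;6.875;123;5.875;2;5.5;396;5;'
)
```

Dividing a parsed percentage by 100 gives the probability quoted above, for example 8.5 becomes 0.085.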

## <a name="preparing"></a>Prepare a Dataset

You must prepare the dataset before running the Validation Application. The format of the dataset depends on
the type of model you are going to validate. Make sure that the dataset format is applicable
to the chosen model type.

### Dataset Format for Classification: Folders as Classes

In this case, a dataset has the following structure:
```sh
|-- <path>/dataset
    |-- apron
        |-- apron1.bmp
        |-- apron2.bmp
    |-- collie
        |-- a_big_dog.jpg
    |-- coral reef
        |-- reef.bmp
    |-- Siamese
        |-- cat3.jpg
```

This structure means that each folder in the dataset directory must be named after one of the classes and contain all images of this class. In the given example, there are two images that represent the class `apron`, while the three other classes have only one image
each.

**NOTE:** A dataset can contain images of both `.bmp` and `.jpg` formats.

The correct way to use such a dataset is to specify the path as `-i <path>/dataset`.

### Dataset Format for Classification: List of Images (ImageNet-like)

If you want to use this dataset format, create a single file with a list of images. In this case, the set of files must be similar to the following:
```bash
|-- <path>/dataset
    |-- apron1.bmp
    |-- apron2.bmp
    |-- a_big_dog.jpg
    |-- reef.bmp
    |-- cat3.jpg
    |-- labels.txt
```

Where `labels.txt` looks like:
```bash
apron1.bmp 411
apron2.bmp 411
cat3.jpg 284
reef.bmp 973
a_big_dog.jpg 231
```

Each line of the file must contain the name of the image and the ID of the class
that it represents in the format `<image_name> <tab> <class_id>`, that is, the two values separated by a tab character. For example, `apron1.bmp` represents the class with ID `411`.

**NOTE:** A dataset can contain images of both `.bmp` and `.jpg` formats.

The correct way to use such a dataset is to specify the path as `-i <path>/dataset/labels.txt`.
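
Before running a long validation, it can help to check the list file yourself. The following Python sketch is a hypothetical pre-flight checker, not part of the Validation Application: the name `parse_image_list` is an assumption, and it accepts any run of whitespace between the two fields, which may be more lenient than the application itself.

```python
from typing import List, Tuple


def parse_image_list(text: str) -> List[Tuple[str, int]]:
    """Parse '<image_name> <class_id>' lines into (name, class ID) pairs."""
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines in this sketch
        try:
            # Split on the last whitespace run so the class ID is the final field.
            name, class_id = line.rsplit(None, 1)
            entries.append((name, int(class_id)))
        except ValueError:
            raise ValueError(
                f"line {line_no}: expected '<image_name> <class_id>', got {line!r}"
            ) from None
    return entries
```

Calling `parse_image_list(open("labels.txt").read())` on the example file above would return five pairs, starting with `("apron1.bmp", 411)`.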

### Dataset Format for Object Detection (VOC-like)

Object Detection SSD models can be inferred on the original dataset that was used as a testing dataset during the model training.
To prepare the VOC dataset, follow the steps below:

1. Download the pre-trained SSD-300 model from the SSD GitHub* repository at
   [https://github.com/weiliu89/caffe/tree/ssd](https://github.com/weiliu89/caffe/tree/ssd).

2. Download the VOC2007 testing dataset:
   ```bash
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
   tar -xvf VOCtest_06-Nov-2007.tar
   ```
3. Convert the model with the [Model Optimizer](docs/Model_Optimizer_Developer_Guide/prepare_trained_model/convert_model/Convert_Model_From_Caffe.md).

4. Create a proper `.txt` class file from the original `labelmap_voc.prototxt`. The new file must be in
   the following format:
   ```sh
   none_of_the_above 0
   aeroplane 1
   bicycle 2
   bird 3
   boat 4
   bottle 5
   bus 6
   car 7
   cat 8
   chair 9
   cow 10
   diningtable 11
   dog 12
   horse 13
   motorbike 14
   person 15
   pottedplant 16
   sheep 17
   sofa 18
   train 19
   tvmonitor 20
   ```
   Save this file as `VOC_SSD_Classes.txt`.
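
Step 4 can be automated with a short script. The Python sketch below assumes that `labelmap_voc.prototxt` consists of `item { name: "..." label: N display_name: "..." }` entries; that layout is an assumption about the file, so check it against your copy before relying on the result. The function name `labelmap_to_classes` is hypothetical.

```python
import re

# Assumed prototxt layout: item { name: "..." label: N display_name: "..." }
ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
NAME_RE = re.compile(r'^\s*name:\s*"([^"]*)"', re.MULTILINE)
LABEL_RE = re.compile(r"^\s*label:\s*(\d+)", re.MULTILINE)


def labelmap_to_classes(prototxt: str) -> str:
    """Convert label map items to '<name> <label>' lines, one per class."""
    lines = []
    for body in ITEM_RE.findall(prototxt):
        name = NAME_RE.search(body)    # matches `name:`, not `display_name:`
        label = LABEL_RE.search(body)
        if name and label:
            lines.append(f"{name.group(1)} {label.group(1)}\n")
    return "".join(lines)
```

Writing the returned text to `VOC_SSD_Classes.txt` should give a file in the format shown in step 4, provided the assumed layout holds.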

## Validate Classification Models

Once you have prepared the dataset (refer to the <a href="#preparing">Prepare a Dataset</a> section above),
run the following command to infer a classification model on the selected dataset:
```bash
./validation_app -t C -i <path_to_images_directory_or_txt_file> -m <path_to_classification_model>/<model_name>.xml -d <CPU|GPU>
```

## Validate Object Detection Models

**Note**: The Validation Application was validated with the SSD CNN. Any network that can be inferred by the Inference Engine
and has the same input and output format as SSD should be supported as well.

Once you have prepared the dataset (refer to the <a href="#preparing">Prepare a Dataset</a> section above),
run the following command to infer an Object Detection model on the selected dataset:
```bash
./validation_app -d CPU -t OD -ODa "<path_to_VOC_dataset>/VOCdevkit/VOC2007/Annotations" -i "<path_to_VOC_dataset>/VOCdevkit" -m "<path_to_model>/vgg_voc0712_ssd_300x300.xml" -ODc "<path_to_classes_file>/VOC_SSD_Classes.txt" -ODsubdir JPEGImages
```
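
If you launch the tool from a script, the two commands above can be assembled programmatically. The helper below is a hypothetical convenience wrapper (the name `build_validation_command` and its parameters are assumptions); it uses only the options documented in this README and returns an argument list suitable for `subprocess.run`.

```python
from typing import List, Optional


def build_validation_command(model_xml: str, images: str,
                             network_type: str = "C", device: str = "CPU",
                             annotations: Optional[str] = None,
                             classes_file: Optional[str] = None,
                             subdir: Optional[str] = None,
                             app: str = "./validation_app") -> List[str]:
    """Build the validation_app argument list for a C or OD run."""
    cmd = [app, "-t", network_type, "-i", images, "-m", model_xml, "-d", device]
    if network_type == "OD":
        # -ODa and -ODc are documented as required for Object Detection.
        if annotations is None or classes_file is None:
            raise ValueError("-ODa and -ODc are required for Object Detection models")
        cmd += ["-ODa", annotations, "-ODc", classes_file]
        if subdir is not None:
            cmd += ["-ODsubdir", subdir]
    return cmd
```

Passing the list form to `subprocess.run(cmd, check=True)` avoids shell quoting problems with paths that contain spaces.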

## Understand Validation Application Output

During the validation process, an interactive progress bar shows the current validation stage. When it is
full, the validation process is over, and you can analyze the output.

Key data from the output:
* **Network load time** - time spent on topology loading, in ms
* **Model** - path to the chosen model
* **Model Precision** - precision of the chosen model
* **Batch size** - specified batch size
* **Validation dataset** - path to the validation set
* **Validation approach** - type of the model: Classification or Object Detection
* **Device** - device type

The following is an example of output for Classification models, which reports the average inference time and
the **Top-1** and **Top-5** metric values:
```bash
Average infer time (ms): 588.977 (16.98 images per second with batch size = 10)

Top1 accuracy: 70.00% (7 of 10 images were detected correctly, top class is correct)
Top5 accuracy: 80.00% (8 of 10 images were detected correctly, top five classes contain required class)
```
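
The figures in this output are related by simple arithmetic, which the Python sketch below reproduces. It assumes that the reported average inference time is measured per batch, which is consistent with the numbers shown; the function names are hypothetical.

```python
def images_per_second(avg_infer_ms: float, batch_size: int) -> float:
    """Throughput, assuming the average time is measured per batch."""
    return batch_size / (avg_infer_ms / 1000.0)


def accuracy_percent(correct: int, total: int) -> float:
    """Share of correctly classified images, in percent."""
    return 100.0 * correct / total


throughput = round(images_per_second(588.977, 10), 2)  # 16.98, as reported
top1 = accuracy_percent(7, 10)                         # 70.0
top5 = accuracy_percent(8, 10)                         # 80.0
```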

The following is an example of output for Object Detection models:

```bash
Progress: [....................] 100.00% done
[ INFO ] Processing output blobs
Network load time: 27.70ms
Model: /home/user/models/ssd/withmean/vgg_voc0712_ssd_300x300/vgg_voc0712_ssd_300x300.xml
Model Precision: FP32
Batch size: 1
Validation dataset: /home/user/Data/SSD-data/testonly/VOCdevkit
Validation approach: Object detection network

Average infer time (ms): 166.49 (6.01 images per second with batch size = 1)
Average precision per class table:

Class   AP
1       0.796
2       0.839
3       0.759
4       0.695
5       0.508
6       0.867
7       0.861
8       0.886
9       0.602
10      0.822
11      0.768
12      0.861
13      0.874
14      0.842
15      0.797
16      0.526
17      0.792
18      0.795
19      0.873
20      0.773

Mean Average Precision (mAP): 0.7767
```

This output shows the resulting `mAP` metric value for the SSD300 model used to prepare the
dataset. This value matches the result stated in the
[SSD GitHub* repository](https://github.com/weiliu89/caffe/tree/ssd) and in the
[original arXiv paper](http://arxiv.org/abs/1512.02325).
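
As a rough illustration of how these numbers fit together, the Python sketch below implements the standard PASCAL VOC2007-style 11-point interpolated average precision and averages the per-class table from the example output. It assumes that the application's "11-point mAP" follows that standard definition; the function names are hypothetical, and this is not the application's code.

```python
from typing import Sequence


def eleven_point_ap(recalls: Sequence[float],
                    precisions: Sequence[float]) -> float:
    """Mean of the best precision at recall >= 0.0, 0.1, ..., 1.0."""
    total = 0.0
    for i in range(11):
        threshold = i / 10
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates, default=0.0)  # 0 if this recall is never reached
    return total / 11


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Per-class AP values copied from the example output above.
TABLE_AP = [0.796, 0.839, 0.759, 0.695, 0.508, 0.867, 0.861, 0.886, 0.602, 0.822,
            0.768, 0.861, 0.874, 0.842, 0.797, 0.526, 0.792, 0.795, 0.873, 0.773]
table_map = mean_average_precision(TABLE_AP)
```

Averaging the three-decimal table values gives approximately 0.7768, which agrees with the reported 0.7767 to within the rounding of the table.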

## See Also

* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)