# Validation App {#InferenceEngineValidationApp}
Inference Engine Validation Application ("validation app" for short) is a tool that allows the user to score common topologies, such as AlexNet or SSD, with
de facto standard input and output configurations. The validation app allows the user to collect simple
validation metrics for the topologies. It supports Top-1/Top-5 counting for classification networks and 11-point mAP calculation for
object detection networks.
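The 11-point interpolated average precision mentioned above can be sketched in a few lines (a minimal illustration of the metric as defined for PASCAL VOC 2007, not the application's actual implementation; the function name is hypothetical):

```python
def eleven_point_ap(pr_points):
    """11-point interpolated average precision (PASCAL VOC 2007 style).

    pr_points: list of (recall, precision) points of a detector's
    precision-recall curve. For each threshold t in {0.0, 0.1, ..., 1.0},
    take the maximum precision over all points with recall >= t,
    then average the 11 values.
    """
    ap = 0.0
    for i in range(11):
        t = i / 10.0
        candidates = [p for r, p in pr_points if r >= t]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0
```

The reported mAP is then the mean of this per-class value over all classes.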
Possible usages of the tool:
* Check if Inference Engine scores the public topologies well (the development team uses the validation app for regular testing; user bug reports are always welcome)
* Verify whether the user's custom topology is compatible with the default input/output configuration and compare its accuracy with that of the public topologies
* Use the Validation App as another sample: although the code is much more complex than in the classification and object detection samples, it is still open and can be re-used
13 This document describes the usage and features of Inference Engine Validation Application.
15 ## Validation Application options
17 Let's list <code>validation_app</code> CLI options and describe them:
<pre class="brush: bash">
Usage: validation_app [OPTION]

    -h                        Print a usage message
    -t <type>                 Type of the network being scored ("C" by default)
      -t "C" for classification
      -t "OD" for object detection
    -i <path>                 Required. Folder with validation images, folders grouped by labels or a .txt file list for classification networks or a VOC-formatted dataset for object detection networks
    -m <path>                 Required. Path to an .xml file with a trained model
    -l <absolute_path>        Required for MKLDNN (CPU)-targeted custom layers. Absolute path to a shared library with the kernel implementations
    -c <absolute_path>        Required for clDNN (GPU)-targeted custom kernels. Absolute path to the xml file with the kernel descriptions
    -d <device>               Specify the target device to infer on; CPU, GPU, FPGA or MYRIAD is acceptable. The sample will look for a suitable plugin for the specified device (CPU by default)
    -b N                      Batch size value. If not specified, the batch size value is determined from IR
    -ppType <type>            Preprocessing type. One of "None", "Resize", "ResizeCrop"
    -ppSize N                 Preprocessing size (used with ppType="ResizeCrop")
    -ppWidth W                Preprocessing width (overrides -ppSize, used with ppType="ResizeCrop")
    -ppHeight H               Preprocessing height (overrides -ppSize, used with ppType="ResizeCrop")
    --dump                    Dump filenames and inference results to a csv file

    Classification-specific options:
      -Czb true               "Zero is a background" flag. Some networks are trained with a modified dataset where the class IDs are enumerated from 1, but 0 is an undefined "background" class (which is never detected)

    Object detection-specific options:
      -ODkind <kind>          Kind of an object detection network: SSD
      -ODa <path>             Required for OD networks. Path to the folder containing .xml annotations for images
      -ODc <file>             Required for OD networks. Path to the file containing a list of classes
      -ODsubdir <name>        Folder between the image path (-i) and the image name, specified in the .xml. Use JPEGImages for VOC2007
</pre>
49 There are three categories of options here.
1. Common options, usually named with a single letter or word, such as <code>-b</code> or <code>--dump</code>. These options have the same meaning in all validation_app modes.
2. Network type-specific options. They are named as an acronym of the network type (such as <code>C</code> or <code>OD</code>), followed by a letter or a word addendum. These options are specific to the network type. For instance, the <code>ODa</code> option makes sense only for an object detection network.
Let's show how to use the Validation Application in its common modes.
55 ## Running classification
This topic demonstrates how to run the Validation Application in classification mode to score a classification CNN on a pack of images.
59 You can use the following command to do inference of a chosen pack of images:
<pre class="brush: bash">
./validation_app -t C -i <path to images main folder or .txt file> -m <model to use for classification> -d <CPU|GPU>
</pre>
64 ### Source dataset format: folders as classes
A correct dataset layout looks something like this:
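For instance (these folder and file names are purely illustrative; the subfolder names must match labels known to the model):

<pre class="brush: bash">
<path>/dataset
    /apron
        /apron1.bmp
        /apron2.bmp
    /collie
        /a_big_dog.jpg
    /coral reef
        /reef.bmp
    /Siamese
        /cat3.jpg
</pre>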
To score this dataset, pass the `-i <path>/dataset` option on the command line.
80 ### Source dataset format: a list of images
Here we use a single list file in the format <code><image_name><tabulation><class_index></code>. A correct dataset layout looks like this:
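For instance (illustrative file names):

<pre class="brush: bash">
<path>/dataset
    /apron1.bmp
    /apron2.bmp
    /a_big_dog.jpg
    /reef.bmp
    /cat3.jpg
    /labels.txt
</pre>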
91 where `labels.txt` looks like:
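For instance (illustrative names and class indices; the indices must match the model's class numbering):

<pre class="brush: bash">
apron1.bmp 411
apron2.bmp 411
a_big_dog.jpg 231
reef.bmp 973
cat3.jpg 284
</pre>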
To score this dataset, pass the `-i <path>/dataset/labels.txt` option on the command line.
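If your images are already grouped into per-class folders but you want the list-file format instead, a short script can generate <code>labels.txt</code> (a hypothetical helper, not part of the Validation Application; it assigns class IDs by sorted folder name, which must match the model's class order):

```python
import os

def write_labels_txt(dataset_dir, out_path):
    """Write one "<relative_image_path><tab><class_id>" line per image
    found in the per-class subfolders of dataset_dir. Class IDs are
    assigned in sorted order of the subfolder names."""
    classes = sorted(d for d in os.listdir(dataset_dir)
                     if os.path.isdir(os.path.join(dataset_dir, d)))
    with open(out_path, "w") as out:
        for class_id, name in enumerate(classes):
            folder = os.path.join(dataset_dir, name)
            for image in sorted(os.listdir(folder)):
                out.write("%s\t%d\n" % (os.path.join(name, image), class_id))
```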
A progress bar is shown to represent the progress of inference.
After inference is complete, common information is shown:
105 <pre class="brush: bash">
106 Network load time: time spent on topology load in ms
107 Model: path to chosen model
108 Model Precision: precision of the chosen model
109 Batch size: specified batch size
110 Validation dataset: path to a validation set
Validation approach: Classification networks
</pre>
Then the application shows statistics such as average infer time and Top-1 and Top-5 accuracy. For example:
115 <pre class="brush: bash">
116 Average infer time (ms): 588.977 (16.98 images per second with batch size = 10)
118 Top1 accuracy: 70.00% (7 of 10 images were detected correctly, top class is correct)
Top5 accuracy: 80.00% (8 of 10 images were detected correctly, top five classes contain required class)
</pre>
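The Top-1/Top-5 numbers above follow the usual definition, which can be sketched as follows (an illustration only, not the application's code; the function name is hypothetical):

```python
def top_k_accuracy(ranked_predictions, ground_truth, k):
    """ranked_predictions: per-image lists of class IDs, most probable first.
    ground_truth: the expected class ID for each image.
    Returns the fraction of images whose true class is among the top k."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, ground_truth)
               if truth in ranked[:k])
    return hits / float(len(ground_truth))
```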
Upon start-up, the Validation Application reads the command-line parameters and loads a network into the Inference Engine plugin.
Then the program reads the validation set specified by the <code>-i</code> option:
- If the option specifies a directory, the application tries to load labels first. To do this, it searches for a file with the same name as the model but with a <code>.labels</code> extension (instead of <code>.xml</code>).
Then it searches the specified folder and adds to the validation set all images from subfolders whose names match a known label.
If there are no subfolders whose names match known labels, the validation set is considered empty.
- If the option specifies a <code>.txt</code> file, the application reads it, expecting every line to have the format <code><relative_path_from_txt_to_img> <ID></code>,
where <code>ID</code> is the index of the class that the network should assign to the image.
After that, the application reads the number of images specified by the <code>-b</code> option and loads them to the plugin.
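The batching step described above amounts to walking the image list in chunks of the batch size (a trivial sketch, not the tool's code):

```python
def batches(items, batch_size):
    """Yield successive chunks of batch_size items; the last chunk
    may be smaller when len(items) is not a multiple of batch_size."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```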
136 <strong>Note:</strong> Images loading time is not a part of inference time reported by the application.
When all images are loaded, the plugin executes inference and the Validation Application collects the statistics.
It is possible to dump inference results by specifying the <code>--dump</code> option.

This option enables creation (if possible) of an inference report named in the format <code>dumpfileXXXX.csv</code>.

The report consists of a number of lines, each containing semicolon-separated values:
146 * flag representing correctness of prediction;
148 * probability that the image belongs to Top-1 class;
150 * probability that the image belongs to Top-2 class;
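Assuming the first semicolon-separated value of each line is the correctness flag described above (the exact column layout is an assumption here; adjust the index to match your dump), the overall accuracy can be recomputed from such a report:

```python
def accuracy_from_dump(csv_text, flag_column=0):
    """Recompute accuracy from semicolon-separated dump lines, where
    column flag_column holds 1 for a correct prediction and 0 otherwise."""
    rows = [line.split(";") for line in csv_text.splitlines() if line.strip()]
    correct = sum(int(row[flag_column]) for row in rows)
    return correct / float(len(rows))
```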
## Running object detection

This topic demonstrates how to run the Validation Application in object detection mode to score an object detection
CNN on a pack of images.
The validation app was validated with the SSD CNN. Any network that can be scored by the Inference Engine and has the same input and output
format as SSD should be supported as well.
161 ### Running SSD on the VOC dataset
SSD can be scored on the original dataset that was used to test it during its training. To do that:
165 1. From the SSD author's page (<code>https://github.com/weiliu89/caffe/tree/ssd</code>) download Pre-trained SSD-300:
167 https://drive.google.com/open?id=0BzKzrI_SkD1_WVVTSmQxU0dVRzA
2. Download the VOC2007 testing dataset (the link can be found on the same GitHub page):
<pre class="brush: bash">
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
$ tar -xvf VOCtest_06-Nov-2007.tar
</pre>
3. Convert the model with the Model Optimizer.
176 4. Create a proper class file (made from the original <code>labelmap_voc.prototxt</code>)
200 ...and save it as <code>VOC_SSD_Classes.txt</code> file.
202 5. Score the model on the dataset:
<pre class="brush: bash">
./validation_app -d CPU -t OD -ODa "<...>/VOCdevkit/VOC2007/Annotations" -i "<...>/VOCdevkit" -m "<...>/vgg_voc0712_ssd_300x300.xml" -ODc "<...>/VOC_SSD_Classes.txt" -ODsubdir JPEGImages
</pre>
As a result, you should see a progress bar that counts from 0% to 100% over some time, followed by output like this:
<pre class="brush: bash">
Progress: [....................] 100.00% done
[ INFO ] Processing output blobs
Network load time: 27.70ms
Model: /home/user/models/ssd/withmean/vgg_voc0712_ssd_300x300/vgg_voc0712_ssd_300x300.xml
Model Precision: FP32
Validation dataset: /home/user/Data/SSD-data/testonly/VOCdevkit
Validation approach: Object detection network

Average infer time (ms): 166.49 (6.01 images per second with batch size = 1)
Average precision per class table:

Mean Average Precision (mAP): 0.7767
</pre>
This value of Mean Average Precision matches the value reported in a table on the SSD author's page (<code>https://github.com/weiliu89/caffe/tree/ssd</code>) and in the arXiv paper (<code>http://arxiv.org/abs/1512.02325</code>).
## See Also

* [Using Inference Engine Samples](@ref SamplesOverview)