# Overview of Inference Engine Python* API

> **NOTE:** This is a preview version of the Inference Engine Python\* API for evaluation purposes only.
> Module structure and the API itself may be changed in future releases.

This API provides a simplified interface for the Inference Engine functionality that allows you to:

* load and configure Inference Engine plugins based on device names
* perform inference in synchronous and asynchronous modes with an arbitrary number of infer requests (the number of infer requests may be limited by target device capabilities)
Currently, the Inference Engine Python\* API is supported on Ubuntu\* 16.04, Microsoft Windows\* 10, and CentOS\* 7.3 OSes.
Supported Python\* versions:

* On Ubuntu 16.04: 2.7, 3.5, 3.6
* On Windows 10: 3.5, 3.6
* On CentOS 7.3: 3.4, 3.5, 3.6
## Setting Up the Environment

To configure the environment for the Inference Engine Python\* API, run:

* On Ubuntu 16.04: `source <INSTALL_DIR>/bin/setupvars.sh`
* On Windows 10: `call <INSTALL_DIR>\deployment_tools\inference_engine\python_api\setenv.bat`

The script automatically detects the latest installed Python\* version and configures the required environment if the version is supported.
If you want to use a certain version of Python\*, set the environment variable `PYTHONPATH=<INSTALL_DIR>/deployment_tools/inference_engine/python_api/<desired_python_version>`
after running the environment configuration script.
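To verify that the environment is configured correctly, you can try importing the API from the selected Python interpreter. This is a minimal sanity check, assuming the package is exposed as the `inference_engine` module after the setup script runs:

```py
# Minimal check that the Python* API bindings are reachable.
# Assumption: the setup script has put the bindings on PYTHONPATH and the
# package is importable as `inference_engine`; adjust the import if your
# installation exposes it under a different name.
from inference_engine import IENetwork, IEPlugin

print(IENetwork, IEPlugin)  # should print the imported classes without errors
```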
## <a name="ienetlayer-class"></a>IENetLayer

This class stores the main information about a layer and allows you to modify some layer parameters.

### Class attributes:

* `name` - Name of the layer
* `precision` - Layer base operating precision. Provides getter and setter interfaces.
* `layout` - Returns the layout of the layer shape.
* `shape` - Returns the shape of the layer as a list.
* `parents` - Returns a list with the names of layers preceding this layer.
* `children` - Returns a list with the names of layers following this layer.
* `affinity` - Layer affinity set by the user or a default affinity set by the `IEPlugin.set_initial_affinity()` method.
The affinity attribute provides getter and setter interfaces, so the layer affinity can be modified directly.
For example:
```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
>>> plugin.set_initial_affinity(net)
>>> for l in net.layers.values():
...     if l.type == "Convolution":
...         l.affinity = "CPU"
```
To correctly set affinity for the network, you must first initialize and properly configure the HETERO plugin.
The `set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})` call configures the plugin fallback devices and their order.
The `plugin.set_initial_affinity(net)` call sets the affinity parameter of the model layers according to their support on the fallback devices.

After the default affinity is set by the plugin, override the default values by setting affinity manually as described
in the example above.

To understand how default and non-default affinities are set:

1. Call `net.layers` right after loading the model and check that the layer affinity parameter is empty.
2. Call `plugin.set_initial_affinity(net)`.
3. Call `net.layers` and check the layer affinity parameters to see how the plugin set the default affinity.
4. Set the layer affinity as described above.
5. Call `net.layers` again and check the layer affinity parameters to see how they changed after the manual affinity setting.

A short sketch of these steps is shown below. Please refer to `affinity_setting_demo.py` to see the full usage pipeline.
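The following is a minimal sketch of steps 1-5 above. It assumes the `inference_engine` import shown earlier, a HETERO:FPGA,CPU configuration like the one used in this section, and placeholder IR paths (`path_to_xml_file`, `path_to_bin_file`):

```py
from inference_engine import IENetwork, IEPlugin

# Placeholder paths to the IR files; replace them with a real model
net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)

plugin = IEPlugin(device="HETERO:FPGA,CPU")
plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})

# Step 1: affinity is empty right after the model is loaded
print([l.affinity for l in net.layers.values()])

# Steps 2-3: let the plugin assign default affinities and inspect them
plugin.set_initial_affinity(net)
print({name: l.affinity for name, l in net.layers.items()})

# Steps 4-5: override the affinity manually and check the result
for l in net.layers.values():
    if l.type == "Convolution":
        l.affinity = "CPU"
print({name: l.affinity for name, l in net.layers.items()})
```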
* `weights` - Dictionary with layer weights, biases, or custom blobs if any
* `params` - Layer-specific parameters. Provides getter and setter interfaces to get and modify layer parameters.
Please note that some modifications can be ignored and/or overwritten by the target plugin (for example, a modification of the
convolution kernel size will be reflected in the layer parameters, but the plugin will eventually ignore it and
use the initial kernel size). A short usage sketch is shown after this list.
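The snippet below illustrates the `params` getter/setter pair by reading and updating a layer parameter dictionary. It is a sketch only: the layer name `conv0` and the `stride` key are assumptions that depend on the actual model, and `net` is an `IENetwork` instance created as in the examples above:

```py
# 'conv0' and 'stride' are hypothetical; use names that exist in your model
layer = net.layers['conv0']
print(layer.params)        # dictionary of layer-specific parameters

params = layer.params
params['stride'] = '2,2'   # modify a parameter (the target plugin may ignore it)
layer.params = params      # write the updated dictionary back to the layer
```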
## <a name="ienetwork-class"></a>IENetwork

This class contains the information about the network model read from the IR and allows you to manipulate some model parameters, such as
layer affinities and output layers.

### Class Constructor

* `__init__(model: str, weights: str)`

* `model` - Path to the `.xml` file of the IR
* `weights` - Path to the `.bin` file of the IR
### Class attributes:

* `name` - Name of the loaded network
* `inputs` - A dictionary that maps input layer names to [InputInfo](#inputinfo-class) objects.
For example, to get the shape of the input layer:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.inputs
{'data': <inference_engine.ie_api.InputInfo object at 0x7efe042dedd8>}
>>> net.inputs['data'].shape
```
* `outputs` - A dictionary that maps output layer names to [OutputInfo](#outputinfo-class) objects.
For example, to get the shape of the output layer:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.outputs
{'prob': <inference_engine.ie_api.OutputInfo object at 0x7efe03ab95d0>}
>>> net.outputs['prob'].shape
```
* `batch_size` - Batch size of the network. Provides getter and setter interfaces to get and modify the
network batch size. For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.batch_size = 4
>>> net.inputs['data'].shape
```
* `layers` - Returns a dictionary that maps network layer names to [IENetLayer](#ienetlayer-class)
objects containing layer properties in topological order. For example, to list all network layers:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.layers
{'conv0': <inference_engine.ie_api.IENetLayer object at 0x7f3a4c102370>
```
* `stats` - Returns a `LayersStatsMap` object containing a dictionary that maps network layer names to calibration statistics
represented by [LayerStats](#layerstats-class) objects.
The `LayersStatsMap` class is inherited from the built-in Python `dict` and overrides the default `update()` method to allow
setting or modifying layer calibration statistics. For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.stats.update({
        "conv1_2d" : LayerStats(min=(-25, -1, 0), max=(63, 124, 70)),
        "conv2_2d" : LayerStats(min=(-5, -1, 0, 1, -7, 2), max=(63, 124, 70, 174, 99, 106)),
    })
```

For more details about low-precision inference, please refer to the "Low-Precision 8-bit Integer Inference"
section in the Inference Engine Developer Guide documentation.
### Class Methods

* `from_ir(model: str, weights: str)`

> **NOTE:** The method is deprecated. Please use the `IENetwork()` class constructor to create a valid instance of `IENetwork`.

The class method serves to read the model from the `.xml` and `.bin` files of the IR.

* `model` - Path to the `.xml` file of the IR
* `weights` - Path to the `.bin` file of the IR

Returns an instance of the `IENetwork` class. For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net
<inference_engine.ie_api.IENetwork object at 0x7fd7dbce54b0>
```
### Instance Methods

* `add_outputs(outputs)`

The method serves to mark any intermediate layer as an output layer to retrieve the inference results
from the specified layers.

* `outputs` - List of layer names to be set as model outputs. If only one layer is to be set as an output, its name can be provided as a string.

For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.add_outputs(["conv5_1/dwise", "conv2_1/expand"])
>>> net.outputs
['prob', 'conv5_1/dwise', 'conv2_1/expand']
```

> **NOTE**: The last layers (nodes without successors in the graph representation of the model) are set as outputs
> by default. In the case above, the `prob` layer is a default output and `conv5_1/dwise`, `conv2_1/expand` are user-defined
> outputs.
* `reshape(input_shapes: dict)`

The method reshapes the network to change spatial dimensions, batch size, or any dimension.
> **NOTE:** Before using this method, make sure that the target shape is applicable to the network. Changing the network shape to an arbitrary value may lead to unpredictable behavior.

* `input_shapes` - A dictionary that maps input layer names to tuples with the target shape

For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> input_layer = next(iter(net.inputs))
>>> n, c, h, w = net.inputs[input_layer].shape
>>> net.reshape({input_layer: (n, c, h*2, w*2)})
```
* `serialize(path_to_xml, path_to_bin)`

The method serializes the network and stores it in files.

* `path_to_xml` - Path to a file, where a serialized model will be stored
* `path_to_bin` - Path to a file, where serialized weights will be stored

For example:

```py
>>> net = IENetwork(model=path_to_model, weights=path_to_weights)
>>> net.serialize(path_to_xml, path_to_bin)
```
## <a name="layerstats-class"></a>LayerStats

Layer calibration statistic container.

### Class Constructor

* `__init__(min: tuple = (), max: tuple = ())`

* `min` - Tuple with per-channel minimum layer activation values
* `max` - Tuple with per-channel maximum layer activation values
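A short sketch of constructing a `LayerStats` object and attaching it to a network through the `stats` attribute described above. The layer name and the statistic values are illustrative only, and `net` is assumed to be an `IENetwork` instance:

```py
# Per-channel min/max activation values for a hypothetical layer "conv1_2d"
stats = LayerStats(min=(-25, -1, 0), max=(63, 124, 70))
net.stats.update({"conv1_2d": stats})
```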
## <a name="inputinfo-class"></a>InputInfo

This class contains the information about the network input layers.

### Class attributes:

* `precision` - Precision of the input data provided by the user. Provides setter and getter interfaces
to get and modify the input layer precision.
List of applicable precisions: FP32, FP16, I32, I16, I8, U32, U16
> **NOTE**: Support of any calculation precision depends on the target plugin.
* `layout` - Layout of the input data provided by the user. Provides setter and getter interfaces
to get and modify the input layer layout.
List of applicable layouts: NCHW, NHWC, OIHW, C, CHW, HW, NC, CN, BLOCKED
* `shape` - Input layer data shape
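A common use of these attributes is adjusting the input precision and layout before loading the network to a plugin. The sketch below assumes a network with an input named `data`; whether a particular precision is accepted depends on the target plugin:

```py
# Assumes `net` is an IENetwork instance with an input layer named 'data'
input_info = net.inputs['data']
print(input_info.precision, input_info.layout, input_info.shape)

# Request half-precision input data and NCHW layout (support depends on the plugin)
input_info.precision = "FP16"
input_info.layout = "NCHW"
```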
## <a name="outputinfo-class"></a>OutputInfo

This class contains the information about the network output layers.

### Class attributes:

* `precision` - Precision of the output data. Provides setter and getter interfaces
to get and modify the output layer precision.
* `layout` - Layout of the output data provided by the user
* `shape` - Output layer data shape
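For illustration, the snippet below reads the same kind of information for an output layer; the output name `prob` is an assumption carried over from the earlier examples:

```py
# Assumes `net` is an IENetwork instance with an output layer named 'prob'
output_info = net.outputs['prob']
print(output_info.precision, output_info.layout, output_info.shape)
```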
## <a name="ieplugin-class"></a>IEPlugin Class

This class is the main plugin interface and serves to initialize and configure the plugin.

### Class Constructor

* `__init__(device: str, plugin_dirs=None)`

* `device` - Target device name. Supported devices: CPU, GPU, FPGA, MYRIAD, HETERO
* `plugin_dirs` - List of paths to plugin directories

### Class attributes:

* `device` - A name of the device that was specified to initialize IEPlugin
* `version` - A version of the plugin
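A minimal sketch of constructing a plugin and reading these attributes:

```py
from inference_engine import IEPlugin

# `plugin_dirs` is optional and can point to custom plugin locations
plugin = IEPlugin(device="CPU", plugin_dirs=None)
print(plugin.device)   # 'CPU'
print(plugin.version)  # plugin version string
```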
### Instance Methods

* `load(network: IENetwork, num_requests: int=1, config=None)`

Loads a network that was read from the IR to the plugin and creates an executable network from the network object.
You can create as many networks as you need and use them simultaneously (up to the limitation of the hardware
resources).

* `network` - A valid `IENetwork` instance
* `num_requests` - A positive integer value of infer requests to be created. The number of infer requests may be limited
by the device capabilities.
* `config` - A dictionary of plugin configuration keys and their values

For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net
<inference_engine.ie_api.ExecutableNetwork object at 0x7f5140bbcd38>
```
* `set_initial_affinity(net: IENetwork)`

Sets initial affinity for the model layers according to the HETERO plugin logic. Applicable only if
IEPlugin was initialized for a HETERO device.

* `net` - A valid instance of IENetwork

For a usage example, see the `affinity` attribute of the `IENetLayer` class.
* `add_cpu_extension(extension_path: str)`

Loads an extensions library to the plugin. Applicable only for a CPU device and a HETERO device with CPU.

* `extension_path` - A full path to the CPU extensions library

For example:

```py
>>> plugin = IEPlugin(device="CPU")
>>> plugin.add_cpu_extension(ext_lib_path)
```
* `set_config(config: dict)`

Sets a configuration for the plugin. Refer to `SetConfig()` in the Inference Engine C++ documentation for the list of acceptable
keys and values.

* `config` - A dictionary of keys and values of acceptable configuration parameters

For a usage example, see the `affinity` attribute of the `IENetLayer` class, where `set_config()` is used to configure the HETERO plugin.
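For a standalone illustration, the snippet below reuses the configuration keys that appear elsewhere in this document (`TARGET_FALLBACK` for the HETERO plugin and `DYN_BATCH_ENABLED` for dynamic batching); whether a particular key is accepted depends on the target plugin:

```py
# HETERO plugin: set the fallback device order (same key as in the affinity example above)
hetero_plugin = IEPlugin(device="HETERO:FPGA,CPU")
hetero_plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})

# CPU plugin: enable dynamic batching (same key as in the set_batch example below)
cpu_plugin = IEPlugin(device="CPU")
cpu_plugin.set_config({"DYN_BATCH_ENABLED": "YES"})
```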
* `get_supported_layers(net: IENetwork)`

Returns the set of layers supported by the plugin. Please note that for the CPU plugin, support of
a layer may depend on an extension loaded by the `add_cpu_extension()` method.

* `net` - A valid instance of IENetwork

Returns a set of layers supported by the plugin.

See also the `affinity` attribute of the `IENetLayer` class; a usage sketch is shown below.
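A typical use of this method is checking which layers of a model the plugin cannot run, for example before deciding on a HETERO fallback configuration. A minimal sketch, assuming `net` is an `IENetwork` instance and the returned set contains layer names:

```py
# ext_lib_path is a placeholder; CPU layer support may depend on loaded extensions
plugin = IEPlugin(device="CPU")
plugin.add_cpu_extension(ext_lib_path)

supported = plugin.get_supported_layers(net)
unsupported = set(net.layers.keys()) - supported
print("Layers not supported by the CPU plugin:", unsupported)
```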
## <a name="executablenetwork"></a>ExecutableNetwork Class

This class represents a network instance loaded to the plugin and ready for inference.

### Class Constructor

There is no explicit class constructor. To make a valid instance of `ExecutableNetwork`, use the `load()` method of the `IEPlugin` class.

### Class attributes:

* `requests` - A tuple of `InferRequest` instances. For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=3)
>>> exec_net.requests
(<inference_engine.ie_api.InferRequest object at 0x7f66f56c57e0>,
<inference_engine.ie_api.InferRequest object at 0x7f66f56c58b8>,
<inference_engine.ie_api.InferRequest object at 0x7f66f56c5900>)
```
### Instance Methods

* `infer(inputs=None)`

Starts synchronous inference for the first infer request of the executable network and returns output data.
Wraps the `infer()` method of the `InferRequest` class.

* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

Returns a dictionary that maps output layer names to `numpy.ndarray` objects with output data of the layer. For example:

```py
>>> net = IENetwork(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> res = exec_net.infer({'data': img})
>>> res
{'prob': array([[[[2.83426580e-08]],
```

For an illustration of input data preparation, please see the samples (for example, `classification_sample.py`).
* `start_async(request_id, inputs=None)`

Starts asynchronous inference for the specified infer request.
Wraps the `async_infer()` method of the `InferRequest` class.

* `request_id` - Index of the infer request to start inference
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

Returns a handler of the specified infer request, which is an instance of the `InferRequest` class. For example:

```py
>>> infer_request_handle = exec_net.start_async(request_id=0, inputs={input_blob: image})
>>> infer_status = infer_request_handle.wait()
>>> res = infer_request_handle.outputs[out_blob]
```

For more details about processing infer requests, see the `classification_sample_async.py` (simplified case) and
`object_detection_demo_ssd_async.py` (real asynchronous use case) samples. A throughput-oriented sketch is also shown below.
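The following is a minimal sketch of a pipeline that keeps several infer requests busy by cycling over them, which is the typical way to use `start_async()` for throughput. It assumes `exec_net` was loaded with more than one request, a single input blob name `input_blob`, an output blob name `out_blob`, and an iterable of preprocessed `frames`; all of these names are placeholders:

```py
# Simple round-robin over the available infer requests (names are placeholders)
num_requests = len(exec_net.requests)
in_flight = set()   # request ids that have been started at least once
results = []

for i, frame in enumerate(frames):
    request_id = i % num_requests
    request = exec_net.requests[request_id]

    # If this request slot was already used, wait for it and collect its result
    if request_id in in_flight:
        request.wait()
        results.append(request.outputs[out_blob])

    exec_net.start_async(request_id=request_id, inputs={input_blob: frame})
    in_flight.add(request_id)

# Drain the requests that are still running
# (note: these tail results come in request order, not frame order)
for request_id in sorted(in_flight):
    request = exec_net.requests[request_id]
    request.wait()
    results.append(request.outputs[out_blob])
```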
## <a name="inferrequest"></a>InferRequest Class

This class provides an interface to infer requests of `ExecutableNetwork` and serves to handle the execution of infer requests
and to set and get output data.

### Class Constructor

There is no explicit class constructor. To make a valid `InferRequest` instance, use the `load()` method of the `IEPlugin`
class with a specified number of requests to get an `ExecutableNetwork` instance, which stores the infer requests.

### Class attributes:

* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
* `outputs` - A dictionary that maps output layer names to `numpy.ndarray` objects with output data of the layer

For example:

```py
>>> exec_net.requests[0].inputs['data'][:] = image
>>> exec_net.requests[0].infer()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)), 0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```
### Instance Methods

It is not recommended to run inference directly on an `InferRequest` instance.
To run inference, please use the simplified `infer()` and `start_async()` methods of `ExecutableNetwork`.

* `infer(inputs=None)`

Starts synchronous inference of the infer request and fills the outputs array.

* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

For example:

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)), 0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```
* `async_infer(inputs=None)`

Starts asynchronous inference of the infer request and fills the outputs array.

* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

For example:

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> exec_net.requests[0].wait()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)), 0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```
* `wait(timeout=-1)`

Waits for the result to become available. Blocks until the specified timeout elapses or the result
becomes available, whichever comes first.
> **NOTE:** There are special values of the timeout parameter:
> * 0 - Immediately returns the inference status. It does not block or interrupt execution.
>   To find out what the statuses mean, please refer to `InferenceEngine::StatusCode` in the Inference Engine C++ documentation.
> * -1 - Waits until the inference result becomes available (default value)

* `timeout` - Time to wait in milliseconds or the special (0, -1) values described above.
If not specified, `timeout` is set to -1 by default.

See the `async_infer()` method of the `InferRequest` class for a usage example; a short polling sketch is also shown below.
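The sketch below shows a non-blocking status check followed by a blocking wait. The concrete status codes returned for `timeout=0` are defined by `InferenceEngine::StatusCode`; `out_blob` is a placeholder output name:

```py
# Assumes a request was previously started with async_infer() or start_async()
request = exec_net.requests[0]

status = request.wait(timeout=0)  # poll: returns the current status immediately
print("current status code:", status)

request.wait()                    # block (timeout=-1 by default) until the result is ready
res = request.outputs[out_blob]
```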
* `get_perf_counts()`

Queries performance measures per layer to get feedback on which layers consume the most time.
> **NOTE**: Performance counters data and format depend on the plugin.

For example:

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> exec_net.requests[0].get_perf_counts()
{'Conv2D': {'exec_type': 'jit_avx2_1x1',
            'status': 'EXECUTED',
            'layer_type': 'Convolution'},
 'Relu6': {'exec_type': 'undef',
           'layer_type': 'Clamp'}
```
* `set_batch(batch: int)`

Sets a new batch size for a certain infer request when dynamic batching is enabled in the executable network that created this request.
> **NOTE:** Support of dynamic batch size depends on the target plugin.

* `batch` - New batch size to be used by all the following inference calls for this request

For example:

```py
>>> plugin.set_config({"DYN_BATCH_ENABLED": "YES"})
>>> exec_net = plugin.load(network=net)
>>> exec_net.requests[0].set_batch(inputs_count)
```

Please refer to `dynamic_batch_demo.py` to see the full usage example.