DR1 Detailed level documentation (#5896)

author Efimov Alexander/AI Tools Lab/./Samsung Electronics <a.efimov@samsung.com>

Wed, 31 Jul 2019 00:00:50 +0000 (03:00 +0300)

committer 이성재/On-Device Lab(SR)/Principal Engineer/삼성전자 <sj925.lee@samsung.com>

Wed, 31 Jul 2019 00:00:50 +0000 (09:00 +0900)
diff --git a/docs/fig/compiler_flow.png b/docs/fig/compiler_flow.png

new file mode 100644 (file)

index 0000000..25daa0c

Binary files /dev/null and b/docs/fig/compiler_flow.png differ
diff --git a/docs/fig/nnfw_compiler_structure.png b/docs/fig/nnfw_compiler_structure.png

new file mode 100644 (file)

index 0000000..4c650c1

Binary files /dev/null and b/docs/fig/nnfw_compiler_structure.png differ
diff --git a/docs/fig/nnfw_compiler_structure.pptx b/docs/fig/nnfw_compiler_structure.pptx

new file mode 100644 (file)

index 0000000..9b5585d

Binary files /dev/null and b/docs/fig/nnfw_compiler_structure.pptx differ
diff --git a/docs/fig/runtime_nativeapi_flow.png b/docs/fig/runtime_nativeapi_flow.png

new file mode 100644 (file)

index 0000000..1f9c882

Binary files /dev/null and b/docs/fig/runtime_nativeapi_flow.png differ
diff --git a/docs/project/19_NN_Compiler_and_Runtime_(DLD_Rev._1.0).md b/docs/project/19_NN_Compiler_and_Runtime_(DLD_Rev._1.0).md

new file mode 100644 (file)

index 0000000..ee48964
--- /dev/null
+++ b/docs/project/19_NN_Compiler_and_Runtime_(DLD_Rev._1.0).md
@@ -0,0 +1,672 @@
+# SW Detailed Level Design
+
+**Table of Contents**
+
+1. [Overview](#overview)  
+1.1 [Scope](#scope)  
+1.2 [Design Consideration](#consideration)  
+1.2.1 [ML Framework](#design_framework)  
+1.2.2 [NN Compiler](#design_compiler)  
+1.2.3 [NN Runtime](#design_runtime)  
+1.2.4 [Loco IR](#design_loco_ir)  
+1.2.5 [Neurun IR](#design_runtime_ir)  
+1.2.6 [Common IR](#design_common_ir)  
+1.2.7 [Backends](#design_backends)  
+1.3 [Constraints](#constraints)
+2. [SW Detailed Structure Design](#structure)  
+2.1 [NN Compiler Feature](#compilerfeature)  
+2.2 [NN Compiler Structure](#compilerstructure)  
+2.3 [NN Runtime Feature](#runtimefeature)  
+2.4 [NN Runtime Structure](#runtimestructure)  
+2.5 [NN Package Feature](#packagefeature)  
+2.6 [NN Package Structure](#packagestructure)  
+2.7 [Open Source Software License Pre-Review](#opensource)
+3. [SW Detailed Operation Design](#operation)  
+3.1 [{NN Compiler} Detailed Design](#compiler)  
+3.1.1 [Major Function](#compilerfunction)  
+3.1.2 [Operation Sequence](#compilersequence)  
+3.2 [{NN Runtime} Detailed Design](#runtime)  
+3.2.1 [Major Function](#runtimefunction)  
+3.2.2 [Operation Sequence](#runtimesequence)
+4. [Interface Design](#interface)
+5. [SW Code Structure](#codestructure)  
+5.1 [nnfw Project Code Structure](#nnfw_code_structure)  
+5.2 [NN Runtime Code Structure](#nn_runtime_code_structure)  
+5.3 [NN Compiler Code Structure](#nn_compiler_code_structure)
+
+<div style="page-break-after: always;"></div>
+
+**Revision History**
+
+| Ver. | Date       | Contents            | Author           | Approver          |
+| ---- | ---------- | ------------------- | ---------------- | ----------------- |
+| 0.1  | 2019.07.24 | Initial DR1 version | A\. Efimov       |                   |
+| 0.2  | 2019.07.26 | PM part review      | A\. Kondrashov   |                   |
+| 1.0  | 2019.07.26 | Final DR1 version   | A\. Efimov       | Sung-Jae Lee      |
+
+**Terminology and Abbreviation**
+
+Terms | Description
+--- | ---
+OS | Operating System
+OS API | Application interface of OS
+HW | Hardware
+SW | Software
+NN | Neural Network
+NN Model | Neural network model (Instance of NN built with ML framework)
+NN Compiler | The compiler for neural network
+NN Package | Serialized representation of NN model generated by NN Compiler
+NN Runtime | Infrastructure to perform neural network inference
+ML framework | The machine learning framework
+TF/TF Lite | TensorFlow/TensorFlow Lite ML framework
+IR | Intermediate representation
+CI/CI system | Continuous integration system
+UI | The user interface
+GUI | The graphical user interface
+CLI | The command-line interface
+CG | Computational Graph
+RNN | Recurrent Neural Network
+
+<div style="page-break-after: always;"></div>
+
+<a name="overview"/>
+
+## 1 Overview
+
+<a name="scope"/>
+
+### 1.1 Scope
+
+The main goals of the project are to develop an `NN Compiler` and an `NN Runtime` that enable neural network inference on the specified SW and HW platforms.
+
+
+**Goals for 2019:**
+
+- Support heterogeneous computing in `NN Runtime`
+- Support TensorFlow `NN Model` format
+- Support ONNX `NN Model` format
+- Improve inference performance in `NN Runtime`
+- Develop `Common IR` for both `NN Compiler` and `NN Runtime`
+- Support `RNN` in `NN Compiler` and `NN Runtime`
+- Support custom operators in `NN Runtime`
+
+<a name="consideration"/>
+
+### 1.2 Design Consideration
+
+The figure below illustrates the overall software stack of the `NN Compiler and Runtime` project.
+Modules are grouped into layers, distinguished by color. Upper layers depend on the layers beneath them.
+Gray and white modules are outside the scope of this project, but may be included in the sources as externals.
+
+![nnfw_components](../fig/nnfw_components.png)
+
+The major modules of the project are listed below.
+For each module, the first-level bullet describes its role, and the second-level bullets give the rationale for the design choice.
+
+<a name="design_framework"/>
+
+#### 1.2.1 ML Framework
+
+- TensorFlow format as an input for `NN Compiler`
+  - We chose the TensorFlow format as an input for `NN Compiler` because it is the dominant format and
+    is actively developed to meet the needs of the data science community.
+    Almost all widely used `NN Models` are available in this format, and the community continues to use this framework intensively.
+- ONNX format as an input for `NN Compiler`
+  - We chose ONNX as an input format for `NN Compiler`
+    because it makes it possible to support neural networks created with other ML frameworks,
+    with the help of the various converters developed in the ONNX project.
+
+<a name="design_compiler"/>
+
+#### 1.2.2 NN Compiler
+
+- Unification of `NN Runtime` interface
+  - `NN Compiler` accepts several input model formats and converts them into one unified storage format, so there is no need to implement a separate frontend for every input format.
+
+<a name="design_runtime"/>
+
+#### 1.2.3 NN Runtime
+
+- Provide a lightweight interface for on-device application developers to run inference on an `NN Model`
+  - `NN Runtime` should be easy to use, so we provide a relatively small interface for loading and executing compiled neural models produced by `NN Compiler`.
+- Provide a common runtime interface
+  - `Android NN API` was chosen in order to provide compatibility with existing software.
+- `NN Runtime` should be able to provide improved performance
+  - In addition to `NN Compiler` optimizations, `NN Runtime` can apply device-specific optimizations.
+  - `NN Runtime` has information about the computational resources on the device, so it can use different processing units to achieve the best performance.
+
+<a name="design_loco_ir"/>
+
+#### 1.2.4 Loco IR
+
+- Flexible `NN Compiler` IR
+  - We need to support two different input formats (TensorFlow and ONNX), so we need a flexible and general IR to express networks from both formats.
+
+<a name="design_runtime_ir"/>
+
+#### 1.2.5 Neurun IR
+
+- `NN Runtime` IR that handles specific hardware information
+  - To achieve maximum performance, the IR needs to store additional hardware-specific information.
+
+<a name="design_common_ir"/>
+
+#### 1.2.6 Common IR
+
+- Serialized IR that is responsible for storing an `NN Model` inside an `NN Package`
+  - To make it easier for `NN Runtime` to extract the model from an `NN Package`, this IR should be close to `Neurun IR`.
+  - Loading the IR should introduce little performance and memory overhead, so we chose the most appropriate serialization technology. Similarity to `Neurun IR` should help to achieve this goal too.
+
+<a name="design_backends"/>
+
+#### 1.2.7 Backends
+
+- Provide computational kernels for operations
+  - This layer provides various libraries for computations on both CPU and GPU.
+  - Some of these libraries could be part of the target platform, for example ARM Compute Library.
+
+<a name="constraints"/>
+
+### 1.3 Constraints
+
+The target platform for the `NN Compiler` part of the project is a Linux-based x86 workstation.
+The target platform for the `NN Runtime` part of the project is a Linux-based (including Tizen) ARM device.
+
+The target hardware model for `NN Runtime` is Odroid XU4.
+
+<a name="structure"/>
+
+## 2 SW Detailed Structure Design
+
+Top-level components are:
+
+- `NN Compiler` described in [NN Compiler](#design_compiler) section
+- `NN Runtime` described in [NN Runtime](#design_runtime) section
+- `NN Package`, a file format that contains the graph description (`Common IR`, described in the [Common IR](#design_common_ir) section), metadata attached to this graph, and implementations of custom operation kernels.
+
+<a name="compilerfeature"/>
+
+### 2.1 NN Compiler Feature
+
+`NN Compiler` is a set of tools designed to transform original ML framework models into a unified format designed for `NN Runtime`.
+`NN Compiler` runs on the application developer's workstation.
+This unified format is a file with a defined structure, the `NN Package`.
+
+Supported input model formats are:
+
+- TensorFlow `NN Model` in `Protocol Buffers` format
+- ONNX `NN Model` in `Protocol Buffers` format
+
+<a name="compilerstructure"/>
+
+### 2.2 NN Compiler Structure
+
+The diagram below shows the dependencies between `NN Compiler` components.
+
+![nnfw_compiler_structure](../fig/nnfw_compiler_structure.png)
+
+**tf2circle**
+
+This is an executable designed to transform a TensorFlow `NN Model` into an `NN Package`.
+It combines the modules from the level below it in the component diagram.
+
+**onnx2circle**
+
+This is an executable designed to transform an ONNX `NN Model` into an `NN Package`.
+It combines the modules from the level below it in the component diagram.
+
+**Loco IR**
+
+This component is a collection of structures used to represent an `NN Model` in `NN Compiler`, together with algorithms to process these structures.
+
+**TF Importer**
+
+This component is designed to parse a TensorFlow `NN Model` protobuf file and create the corresponding `Loco IR`.
+
+**ONNX Importer**
+
+This component is designed to parse an ONNX `NN Model` protobuf file and create the corresponding `Loco IR`.
+
+**Optimizer of Loco IR**
+
+This component applies optimizations to `Loco IR`. It accepts `Loco IR` as input and produces semantically identical (in terms of observable behavior) `Loco IR` as output.
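The optimizer contract (same observable behavior before and after a pass) can be illustrated with a small sketch. This is not the `Loco IR` API: the dictionary-based graph, the op names `Input`, `Identity`, `Add` and `Mul`, and the functions `evaluate` and `remove_identity` are all hypothetical and exist only for this example.

```python
def evaluate(graph, output, feeds):
    """Evaluate node `output`; `graph` maps a node name to (op, input names)."""
    op, inputs = graph[output]
    if op == "Input":
        return feeds[output]
    args = [evaluate(graph, name, feeds) for name in inputs]
    if op == "Identity":
        return args[0]
    if op == "Add":
        return args[0] + args[1]
    if op == "Mul":
        return args[0] * args[1]
    raise ValueError(f"unknown op {op}")


def remove_identity(graph):
    """Hypothetical pass: drop Identity nodes and rewire their consumers."""
    def resolve(name):
        # Follow chains of Identity nodes back to the real producer.
        while graph[name][0] == "Identity":
            name = graph[name][1][0]
        return name

    return {
        name: (op, [resolve(i) for i in inputs])
        for name, (op, inputs) in graph.items()
        if op != "Identity"
    }


graph = {
    "x": ("Input", []),
    "y": ("Input", []),
    "id": ("Identity", ["x"]),
    "sum": ("Add", ["id", "y"]),
    "out": ("Mul", ["sum", "sum"]),
}
optimized = remove_identity(graph)
feeds = {"x": 2, "y": 3}
# The pass changes the structure but not the observable result.
assert evaluate(graph, "out", feeds) == evaluate(optimized, "out", feeds)
```

The sketch assumes the requested output node is not itself an `Identity`; a real pass would also have to preserve graph outputs.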
+
+**Loco IR Verifier**
+
+This component checks the correctness and consistency of `Loco IR` in terms of its structure. For example, every operation input must be attached to some operation output.
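The example rule (every operation input must be attached to some operation output) could be checked roughly as follows. This is a minimal sketch under assumed data structures: the `Op` class and the `verify` function are hypothetical and do not reflect the real `Loco IR` types.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical operation: its name doubles as the name of its output."""
    name: str
    inputs: list = field(default_factory=list)  # names of producing operations


def verify(ops):
    """Return a list of error messages; an empty list means the graph is consistent."""
    produced = {op.name for op in ops}
    errors = []
    for op in ops:
        for src in op.inputs:
            if src not in produced:
                errors.append(f"{op.name}: input '{src}' is not attached to any operation output")
    return errors


good = [Op("input"), Op("conv", ["input"]), Op("relu", ["conv"])]
bad = [Op("input"), Op("relu", ["conv"])]  # 'conv' is never produced
```

Returning a list of messages instead of raising on the first problem lets a caller report every dangling input in one run.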
+
+**Common IR Exporter**
+
+This component is designed to serialize the computational graph from `Loco IR` to `Common IR`.
+
+**Packager**
+
+This component is designed to pack all needed data and files in `NN Package`.
+
+**Interpreter (locomotiv)**
+
+This component is designed to interpret `NN Model` that is represented as `Loco IR`.
+
+<a name="runtimefeature"/>
+
+### 2.3 NN Runtime Feature
+
+`NN Runtime` is an infrastructure for performing inference of `NN Model`.
+`NN Runtime` is designed to work on the user's Linux-based device.
+
+An `NN Model` can be provided to `NN Runtime` through two APIs:
+
+- Our implementation of `Android NNAPI` specification
+- `Native NN Runtime` API. This API accepts `NN Package` produced by `NN Compiler`
+
+<a name="runtimestructure"/>
+
+### 2.4 NN Runtime Structure
+
+The image below shows the dependencies between `NN Runtime` components.
+
+![nnfw_runtime_structure](../fig/nnfw_runtime_structure.png)
+
+**Neurun Native Frontend**
+
+This component is designed to provide an API for loading and processing of `NN Package` into `Neurun IR`.
+
+**Android NN API Frontend**
+
+This component is an implementation of `Android NN API` that provides compatibility with applications using this API.
+
+**NN Package Loader**
+
+This component is designed to parse `Common IR` and create corresponding `Neurun IR`.
+
+**Custom operations manager**
+
+This component is designed to manage custom operations (storing custom operation kernels, shape inference related code, etc.).
+
+**Runtime Core**
+
+This component ties together the components of `NN Runtime`. It chooses between interpretation and compilation, and selects the executor.
+
+**Interpreter**
+
+This component is designed to interpret `Neurun IR` if it is impossible to compile the model.
+
+**Compiler**
+
+This component prepares computational kernels for inference.
+The preparation process includes:
+
+- Backend assignment
+- Splitting computational graph into tasks
+- Configuration of tensors and computational kernels
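The first two preparation steps can be sketched as below. This is an illustration only, not the Neurun implementation: it assumes the operations are already in execution order, and the backend names, the `support` table, and the functions `assign_backends` and `split_into_tasks` are hypothetical.

```python
from itertools import groupby


def assign_backends(ops, support, fallback="cpu"):
    """ops: list of (name, op_type); support: backend -> set of op types, in preference order."""
    assignment = []
    for name, op_type in ops:
        # Pick the first backend that supports this op type, else the fallback.
        backend = next((b for b, types in support.items() if op_type in types), fallback)
        assignment.append((name, backend))
    return assignment


def split_into_tasks(assignment):
    """Group consecutive operations that share a backend into one task."""
    return [
        (backend, [name for name, _ in group])
        for backend, group in groupby(assignment, key=lambda item: item[1])
    ]


ops = [("conv1", "Conv2D"), ("relu1", "Relu"), ("custom1", "MyOp"), ("softmax", "Softmax")]
support = {"acl_cl": {"Conv2D", "Relu"}, "acl_neon": {"Softmax"}}
tasks = split_into_tasks(assign_backends(ops, support))
```

Grouping consecutive operations per backend reduces the number of hand-offs between backends; a real scheduler would also weigh profiling data and data-transfer cost.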
+
+**Execution Manager**
+
+This component is designed for execution of compiled model.
+
+**Profiler**
+
+This component is designed for collecting, storing and processing profiling information.
+
+**Scheduler**
+
+This component is designed for optimal backend assignment.
+
+**Memory Manager**
+
+This component is designed for managing tensor memory.
+
+**Neurun IR**
+
+This component contains a collection of structures that represent `NN Model` in `NN Runtime`.
+
+**Kernels**
+
+This component contains implementation of NN operations.
+
+Supported backends are:
+- pure CPU backend
+- Arm Compute Library NEON
+- Arm Compute Library OpenCL
+
+<a name="packagefeature"/>
+
+### 2.5 NN Package Feature
+
+`NN Package` is the input of `NN Runtime` and the output of `NN Compiler`.
+
+`NN Package` contains all the data (such as the model, `MANIFEST`, and custom_op) required to run a given model.
+
+<a name="packagestructure"/>
+
+### 2.6 NN Package Structure
+
+`nnpackage` is a Zip archive with the following structure:
+
+```
+nnpackage
+├── custom_op
+├── metadata
+│   └── MANIFEST
+└── mymodel.model
+```
+
+- `mymodel.model` is a model file that contains the computation graph and weights.
+- `metadata` is a directory that contains all metadata including `MANIFEST`.
+- `MANIFEST` is a collection of attributes about this package.
+- `custom_op` is a directory that contains implementation objects.
+
+The model is stored in `Common IR` format; for a description, see the High Level Design document.
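The archive layout above can be reproduced with Python's standard `zipfile` module. This is a minimal sketch, not the project's Packager: the functions `build_nnpackage` and `list_nnpackage` are hypothetical, and writing `MANIFEST` as JSON is an assumption, since the document does not specify its format.

```python
import io
import json
import zipfile


def build_nnpackage(model_bytes, manifest, custom_ops=None, model_name="mymodel.model"):
    """Return the bytes of a Zip archive laid out like the nnpackage tree above."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(f"nnpackage/{model_name}", model_bytes)
        # Assumption: MANIFEST is serialized as JSON for this example only.
        zf.writestr("nnpackage/metadata/MANIFEST", json.dumps(manifest))
        for name, blob in (custom_ops or {}).items():
            zf.writestr(f"nnpackage/custom_op/{name}", blob)
    return buf.getvalue()


def list_nnpackage(data):
    """List the archive members, sorted by path."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return sorted(zf.namelist())


package = build_nnpackage(b"\x00\x01", {"name": "demo"}, {"my_op.so": b"kernel"})
members = list_nnpackage(package)
```

Building the archive in memory keeps the example free of filesystem side effects; writing `package` to a file would produce the on-disk package.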
+
+<a name="opensource"/>
+
+### 2.7 Open Source Software License Pre-Review
+
+All components of the project are under the Apache 2.0 license.
+
+Third-party components used in the project:
+
+| Module              | License                     | Review Comments          | Scope of Derivative work   | Open Possibility | PL's Decision |
+| ------------------- | --------------------------- | ------------------------ | -------------------------- | ---------------- | ------------- |
+| FlatBuffers         | Apache 2.0 License          | *TODO usage*             | nnfw                       | O(Open)          | O(Open)       |
+| TensorFlow          | Apache 2.0 License          | NN Runtime externals     | nnfw                       | O(Open)          | O(Open)       |
+| Protocol Buffers    | BSD License                 | NN model format          | nnc                        | O(Open)          | O(Open)       |
+| Arm Compute Library | MIT License                 | nnfw backend             | nnfw                       | O(Open)          | O(Open)       |
+
+<div style="page-break-after: always;"></div>
+
+<a name="operation"/>
+
+## 3 SW Detailed Operation Design
+
+<a name="compiler"/>
+
+### 3.1 {NN Compiler} Detailed Design
+
+<a name="compilerfunction"/>
+
+#### 3.1.1 Major Function
+
+`NN Compiler` is a set of tools.
+These tools accept serialized TensorFlow or ONNX `NN Models` as input and produce an `NN Package`.
+Each tool has a command-line interface described in the High Level Documentation.
+
+<a name="compilersequence"/>
+
+#### 3.1.2 Operation Sequence
+
+The figure below illustrates the flow of data in `NN Compiler`.
+
+![compiler_flow](../fig/compiler_flow.png)
+
+<a name="runtime"/>
+
+### 3.2 {NN Runtime} Detailed Design
+
+<a name="runtimefunction"/>
+
+#### 3.2.1 Major Function
+
+`NN Runtime` is an environment for inference of `NN Models` on a device.
+
+There are two ways to load an `NN Model` into the system:
+
+- Our implementation of `Android NN API`
+- `NN Package` file
+
+<a name="runtimesequence"/>
+
+#### 3.2.2 Operation Sequence
+
+The figure below illustrates the common flow of data in `NN Runtime` when using the `native API`.
+
+![runtime_nativeapi_flow](../fig/runtime_nativeapi_flow.png)
+
+The figure below illustrates the common flow of data in `NN Runtime` when using the `Android NN API`.
+
+![nnfw_nnapi_flow](../fig/nnfw_nnapi_flow.png)
+
+<div style="page-break-after: always;"></div>
+
+<a name="interface"/>
+
+## 4 Interface Design
+
+Interfaces will be described by the final DR.
+
+<div style="page-break-after: always;"></div>
+
+
+<a name="codestructure"/>
+
+## 5 SW Code Structure
+
+<a name="nnfw_code_structure"/>
+
+### 5.1 High level code structure of nnfw project:
+
+directory | description
+--- | ---
+compiler | `NN Compiler` source code
+docs | nnfw Documentation
+infra | build and maintenance scripts
+packaging | Tizen package related files
+res | test neural networks
+runtimes | `NN Runtime` source code
+scripts | alias for infra related scripts
+tests | testing and benchmarking tools
+tools | utilities for `NN Model` analysis, test data generation, etc.
+
+<a name="nn_runtime_code_structure"/>
+
+### 5.2 NN Runtime code structure:
+
+runtimes  
+├── contrib  
+│   ├── android_tflite  
+│   ├── benchmark_acl  
+│   ├── custom_op  
+│   ├── detection  
+│   ├── labs  
+│   ├── nnpackage  
+│   ├── tflite_classify  
+│   ├── TFLiteSharp  
+│   ├── tflite_test  
+│   └── uben  
+├── include  
+├── libs  
+│   ├── ARMComputeEx  
+│   ├── cker  
+│   ├── cpp14  
+│   ├── jsoncpp  
+│   ├── misc  
+│   ├── profiling  
+│   ├── rua  
+│   ├── tflite  
+│   ├── xdata  
+│   ├── xprobe  
+│   └── xray  
+├── logging  
+│   ├── include  
+│   └── src  
+└── neurun  
+    ├── backend  
+    ├── core  
+    ├── frontend  
+    └── test  
+
+<a name="nn_compiler_code_structure"/>
+
+### 5.3 NN Compiler code structure:
+
+compiler  
+├── adtidas  
+│   └── include  
+├── angkor  
+│   ├── include  
+│   └── src  
+├── ann-api  
+│   └── include  
+├── ann-ref  
+│   └── src  
+├── caffegen  
+│   └── src  
+├── cli  
+│   ├── include  
+│   └── src  
+├── coco  
+│   ├── core  
+│   └── generic  
+├── cwrap  
+│   ├── include  
+│   └── src  
+├── enco  
+│   ├── cli  
+│   ├── core  
+│   ├── frontend  
+│   └── test  
+├── encodump  
+│   └── src  
+├── enco-intf  
+│   ├── cmdline  
+│   └── frontend  
+├── exo-tflite  
+│   ├── include  
+│   └── src  
+├── hermes  
+│   ├── include  
+│   └── src  
+├── hermes-std  
+│   ├── include  
+│   └── src  
+├── i5diff  
+│   └── src  
+├── loco  
+│   ├── include  
+│   └── src  
+├── locoex-customop  
+│   ├── include  
+│   └── src  
+├── loco-exporter  
+│   ├── include  
+│   ├── schema  
+│   └── src  
+├── locomotiv  
+│   ├── include  
+│   └── src  
+├── locop  
+│   ├── include  
+│   └── src  
+├── mir  
+│   ├── include  
+│   ├── proto  
+│   ├── src  
+│   └── unittests  
+├── mir2loco  
+│   ├── include  
+│   └── src  
+├── mirunner  
+├── moco  
+├── moco-log  
+│   ├── include  
+│   └── src  
+├── moco-onnx  
+│   ├── include  
+│   ├── proto  
+│   └── src  
+├── mocotest-onnx  
+│   ├── Const_000  
+│   └── Identity_000  
+├── mocotest-tf  
+├── moco-tf  
+│   ├── doc  
+│   ├── include  
+│   ├── proto  
+│   └── src  
+├── morph  
+│   ├── include  
+│   └── src  
+├── nest  
+│   └── core  
+├── nike  
+│   ├── include  
+│   └── src  
+├── nnc  
+│   ├── cmake  
+│   ├── doc  
+│   ├── driver  
+│   ├── include  
+│   ├── pass  
+│   ├── passes  
+│   ├── support  
+│   ├── tests  
+│   ├── unittests  
+│   └── utils  
+├── nnkit  
+│   ├── actions  
+│   ├── backends  
+│   ├── libs  
+│   └── tools  
+├── nnkit-caffe  
+│   ├── backend  
+│   └── support  
+├── nnkit-intf  
+│   ├── action  
+│   ├── backend  
+│   ├── cmdline  
+│   └── tensor  
+├── nnkit-misc  
+│   ├── backend  
+│   └── cmdline  
+├── nnkit-mocotf  
+│   ├── backend  
+│   └── support  
+├── nnkit-onnxrt  
+│   ├── backend  
+│   └── support  
+├── nnkit-tf  
+│   ├── backend  
+│   └── support  
+├── nnkit-tflite  
+│   ├── backend  
+│   └── support  
+├── nnop  
+│   ├── include  
+│   └── src  
+├── nnsuite  
+│   └── conv  
+├── onnxkit  
+│   └── src  
+├── pepper-strcast  
+│   ├── include  
+│   └── src  
+├── plier-tf  
+│   ├── include  
+│   ├── proto  
+│   └── src  
+├── pp  
+│   ├── include  
+│   └── src  
+├── safemain  
+├── stdex  
+│   ├── include  
+│   └── src  
+├── tf2tflite  
+│   └── src  
+├── tfgraph-xform  
+├── tfinfo  
+│   ├── include  
+│   └── src  
+├── tfkit  
+│   └── src  
+├── tflchef  
+│   ├── core  
+│   ├── proto  
+│   ├── tests  
+│   ├── tflite  
+│   └── tools  
+└── tfldump  
+    ├── driver  
+    ├── include  
+    ├── schema  
+    └── src