From 4abea47b2a51977ec92059fe627cba4111eb2bc5 Mon Sep 17 00:00:00 2001
From: =?utf8?q?=D0=A0=D0=BE=D0=BC=D0=B0=D0=BD=20=D0=9C=D0=B8=D1=85=D0=B0?=
 =?utf8?q?=D0=B9=D0=BB=D0=BE=D0=B2=D0=B8=D1=87=20=D0=A0=D1=83=D1=81=D1=8F?=
 =?utf8?q?=D0=B5=D0=B2/AI=20Tools=20Lab=20/SRR/Staff=20Engineer/=EC=82=BC?=
 =?utf8?q?=EC=84=B1=EC=A0=84=EC=9E=90?=
Date: Mon, 3 Sep 2018 21:46:13 +0300
Subject: [PATCH] Update HLD, SRS, and DLD documentation (#1304)

* include the content from the separate rst files into each document
* remove the redundant rst files
* correct the reference to the architecture image

Signed-off-by: Roman Rusyaev
---
 .../project/18_NN_Compiler_and_Optimizer_DLD.rst   | 590 ++++++++++++++++++++-
 .../project/18_NN_Compiler_and_Optimizer_HLD.rst   | 133 ++++-
 .../project/18_NN_Compiler_and_Optimizer_SRS.rst   |  62 ++-
 contrib/nnc/doc/project/caffe_importer_details.rst | 112 ----
 contrib/nnc/doc/project/cli_details.rst            | 131 -----
 contrib/nnc/doc/project/model_ir_overview.rst      |  43 --
 .../nnc/doc/project/project_purpose_and_scope.rst  |  19 -
 .../nnc/doc/project/project_sw_hw_constraints.rst  |  51 --
 contrib/nnc/doc/project/project_target_model.rst   |  21 -
 .../doc/project/project_terms_and_abbreviation.rst |  39 --
 contrib/nnc/doc/project/soft_backend_details.rst   | 168 ------
 11 files changed, 770 insertions(+), 599 deletions(-)
 delete mode 100644 contrib/nnc/doc/project/caffe_importer_details.rst
 delete mode 100644 contrib/nnc/doc/project/cli_details.rst
 delete mode 100644 contrib/nnc/doc/project/model_ir_overview.rst
 delete mode 100644 contrib/nnc/doc/project/project_purpose_and_scope.rst
 delete mode 100644 contrib/nnc/doc/project/project_sw_hw_constraints.rst
 delete mode 100644 contrib/nnc/doc/project/project_target_model.rst
 delete mode 100644 contrib/nnc/doc/project/project_terms_and_abbreviation.rst
 delete mode 100644 contrib/nnc/doc/project/soft_backend_details.rst

diff --git a/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_DLD.rst b/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_DLD.rst
index e320b1c..0ddcce4 100644
--- a/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_DLD.rst
+++ b/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_DLD.rst
@@ -25,7 +25,44 @@ SW Detailed Level Design
 | 2.0    | 2018.09.03    | DR2 version                | Roman Rusyaev                         | Sung-Jae Lee        |
 +--------+---------------+----------------------------+---------------------------------------+---------------------+

-.. include:: project_terms_and_abbreviation.rst
+|
+
+**Terminology and Abbreviation**
+
+.. list-table::
+   :widths: 10 30
+   :header-rows: 0
+
+   * - OS
+     - Operating System
+   * - OS API
+     - Application programming interface of the OS
+   * - HW
+     - Hardware
+   * - SW
+     - Software
+   * - NN
+     - Neural Network
+   * - NN model
+     - Neural network model (an instance of an NN built with an ML framework)
+   * - NN compiler
+     - The compiler for neural networks
+   * - ML framework
+     - The machine learning framework
+   * - TF/TF Lite
+     - TensorFlow/TensorFlow Lite ML framework
+   * - IR
+     - Intermediate representation
+   * - CI/CI system
+     - Continuous integration system
+   * - UI
+     - The user interface
+   * - GUI
+     - The graphical user interface
+   * - CLI
+     - The command-line interface
+   * - CG
+     - Computational Graph


|


Overview
========

Scope
-----

-.. include:: project_purpose_and_scope.rst
-.. include:: project_target_model.rst
+The main goal of the project is to develop a compiler for neural networks that produces an executable artifact for a specified SW and HW platform.
+
+The development scope includes the following components:
+
+- Develop an importer module that parses, verifies, and represents an NN model for further optimization and compilation
+- Develop code emitters that produce executable binaries for CPU and GPU
+
+|
+| **Goals for 2018:**
+
+- Support the TensorFlow Lite NN model format
+- Support the Caffe NN model format
+- Support the Caffe2 NN model format (Optional)
+- Support compilation of the MobileNet NN
+- Support compilation of the Inception v3 NN
+- Support ARM CPU
+- Support ARM GPU (Mali)
+- Support Tizen OS
+- Support SmartMachine OS (Optional)
+
+|
+
+.. list-table:: Table 1-1. Target Model
+   :widths: 23 50 20
+   :header-rows: 1
+
+   * - Product
+     - Target Model Name
+     - Comment
+
+   * - Tizen phone
+     - Tizen TM2
+     - Reference device
+
+   * - Tizen device
+     - Odroid XU4
+     - Reference board
+
+   * - SmartMachine target
+     - Microvision mv8890, exynos8890
+     - Reference device


Design Consideration
====================

Constraints
-----------

See constraints in SW Requirements Specification.

-.. include:: project_sw_hw_constraints.rst
+|
+
+.. list-table:: Table 1-2. Assumptions, Dependencies and the Constraints
+   :widths: 23 40 23
+   :header-rows: 1
+
+   * - Item
+     - Assumptions, Dependencies and the Constraints
+     - Reference
+
+   * - Tizen SW Platform
+     - The following items should be provided:
+
+       - Tizen API
+       - Tizen kernel
+       - Tizen FW
+       - Tizen SDK
+       - Tizen naming convention
+
+       |
+     -
+       - `www.tizen.org `_
+       - `wiki.tizen.org `_
+       - `developer.tizen.org `_
+
+   * - SmartMachine OS Platform
+     - The following items should be provided:
+
+       - SmartMachine API
+       - SmartMachine kernel
+       - SmartMachine FW
+       - SmartMachine SDK
+       - SmartMachine naming convention
+
+       |
+     -
+       - `Platform confluence `_
+       - `Github `_
+       - `Functional Safety confluence `_
+
+   * - Host OS
+     - Linux-based OS (Ubuntu, Arch Linux, etc.)
+     -
+       - `Ubuntu site `_
+       - `Archlinux site `_
+
+   * - Tizen target HW
+     - The reference device should be provided: Tizen TM2
+     -
+
+   * - SmartMachine target HW
+     - The reference device should be provided
+     -


SW Detailed Structure Design
============================

Major Function

To provide access to NN model representation in order to perform transformations and optimizations improving performance of code generation.

-.. include:: model_ir_overview.rst
+Overview
+````````
+Model IR consists of four main parts:
+
+* Graph - represents the computation graph
+* Node - a container for a single computational operation in the computation graph
+* Operation description - represents a single operation
+* Visitor - declares an interface used for graph traversal
+
+Graph
+`````
+The `Graph` contains information about the graph's input/output nodes and a list of all nodes in the graph.
+
+It is responsible for allocating nodes and keeps references to all allocated nodes.
+The `Graph` class takes care of graph traversal, honoring all node input/output dependencies.
+
+Node
+````
+Each node contains:
+
+- Node id (used to uniquely address the node in the computation graph)
+- Node name (set by the importer, used to distinguish inputs/outputs)
+- Operation description - a reference to an `OpDescription` subclass
+- List of inputs (each represented by a node reference and an output index of that node)
+- List of outputs (the nodes that consume any resulting data from this node)
+
+Operation Description
+`````````````````````
+All operations in the computation graph are represented by subclasses of the `OpDescription` class.
+
+Every operation description contains:
+
+- The number of inputs/outputs the operation takes
+- The shapes of the input/output tensors (initialized by the importer or by `ShapeInference`)
+- Any information specific to the operation (e.g. a convolution kernel)
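+
+For illustration, here is a hedged sketch of what a concrete operation description
+might look like; the class and member names below are hypothetical, so see the
+actual IR headers for the real interface:
+
+.. code-block:: c++
+
+    // Hypothetical sketch of an OpDescription subclass for a 2D convolution.
+    // The real nnc classes may differ in names and members.
+    class Conv2DOp : public OpDescription
+    {
+    public:
+      Conv2DOp(TensorVariant kernel, Shape strides)
+          : _kernel(std::move(kernel)), _strides(std::move(strides)) {}
+
+      // operation-specific information: the convolution kernel and strides
+      const TensorVariant &getKernel() const { return _kernel; }
+      const Shape &getStrides() const { return _strides; }
+
+    private:
+      TensorVariant _kernel;
+      Shape _strides;
+    };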
+
+Visitor
+```````
+The base class used for traversing the computation graph.
+
+It defines the set of operations on IR nodes that an IR user has to provide.
+
+It is intended to be the only mechanism used for graph processing.


{Import NN model} Detailed Design
---------------------------------

Development of the TF Lite importer (frontend) is in progress. The detailed desi

Caffe importer
``````````````

-.. include:: caffe_importer_details.rst
+Basics
+######
+
+Caffe models consist of *layers*, which is more or less a synonym for "NN operation".
+
+Layers consume and produce *blobs*, which is more or less a synonym for "tensor".
+
+Every layer has a type (for example, "Convolution"), a name, bottom blobs (input tensors), and top blobs (output tensors).
+
+Note that a layer's name and its output blob are frequently identical, which can be confusing. Remember: "name" is just the name of a layer, while "top" and "bottom" are names of the *blobs*, and they must be consistent across the sequence of layers (i.e. every bottom blob of a layer must have the same name as a top blob of some other layer).
+
+Example:
+
+.. code-block:: none
+
+    layer {
+      name: "conv1_3x3_s2"
+      type: "Convolution"
+      bottom: "data"
+      top: "conv1_3x3_s2"
+      param {
+        lr_mult: 1
+        decay_mult: 1
+      }
+      phase: TEST
+      convolution_param {
+        num_output: 32
+        bias_term: false
+        pad: 0
+        kernel_size: 3
+        stride: 2
+        weight_filler {
+          type: "xavier"
+          std: 0.01
+        }
+      }
+    }
+
+Model files
+###########
+
+Models are frequently distributed as a pair of files:
+
+* a `.prototxt` file, containing the text version of the model used for deployment (without the various layers needed only for training, etc.)
+* a `.caffemodel` binary file, containing the version of the model that was (or still can be) used for training, together with the trained model weights
+
+Ideally, the Caffe importer should support this layout as well: accept both files, read the first one to get the NN model architecture, and read the second one to get the weights.
+
+Currently we do the following instead: take the first file (`.prototxt`), fill it with weights (it still remains a `.prototxt` file), and then use it as input to the importer. Filling a `.prototxt` with weights can be done with the `caffegen` tool in `contrib/caffegen`: run `caffegen init < "path-to-prototxt" > "path-to-output"`. The result is another `.prototxt` file, but this time it is filled with weights (note that this file will be large).
+
+Now this `.prototxt` file can be turned into a binary `.caffemodel` file. There is no convenient ready-made tool for this, so currently the code from `src/caffe/util/io.cpp` in the Caffe sources is reused: it contains functions that read and write Caffe models in text and binary formats (`ReadProtoFromTextFile`, `WriteProtoToTextFile`, `ReadProtoFromBinaryFile` and `WriteProtoToBinaryFile`). These functions are copied into `proto_reader.cpp` in the nncc sources and used to read and write Caffe models.
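+
+As a minimal sketch, the text-to-binary conversion can also be done directly with the
+Protocol Buffers API; this mirrors what the `io.cpp` helpers do. It assumes a
+`caffe.pb.h` header generated from `caffe.proto` with `protoc`, and the function name
+is illustrative:
+
+.. code-block:: c++
+
+    // Sketch: convert a weight-filled text .prototxt into a binary .caffemodel.
+    #include <fstream>
+
+    #include <google/protobuf/text_format.h>
+    #include <google/protobuf/io/zero_copy_stream_impl.h>
+
+    #include "caffe.pb.h"  // generated from src/caffe/proto/caffe.proto
+
+    bool textModelToBinary(const char *textPath, const char *binPath)
+    {
+      caffe::NetParameter net;
+      std::ifstream in(textPath);
+      google::protobuf::io::IstreamInputStream input(&in);
+      // parse the human-readable model description
+      if (!google::protobuf::TextFormat::Parse(&input, &net))
+        return false;
+      // serialize the same message in the binary wire format
+      std::ofstream out(binPath, std::ios::binary);
+      return net.SerializeToOstream(&out);
+    }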
+
+Caffe model preparation for importer
+####################################
+
+Caffe layers such as `BatchNorm`, `Scale` and `Split` are not supported yet, so we have to manually remove them from the `.prototxt` models.
+
+After that it is necessary to make sure that the top blobs of preceding layers still connect to the bottom blobs of the following layers. Example:
+
+.. code-block:: none
+
+    layer {
+      bottom: "x"
+      top: "blob1"
+    }
+    layer {
+      type: "Split"
+      bottom: "blob1"
+      top: "blob2"
+    }
+    layer {
+      bottom: "blob2"
+      top: "y"
+    }
+
+After removing the `Split` layer, the first layer still outputs "blob1", but the last layer expects "blob2", which no longer exists. The result should therefore be:
+
+.. code-block:: none
+
+    layer {
+      bottom: "x"
+      top: "blob1"
+    }
+    layer {
+      bottom: "blob1"
+      top: "y"
+    }
+
+
+Model format
+############
+
+The model format is defined by a Protocol Buffers schema, which can be found in the Caffe sources at `src/caffe/proto/caffe.proto`.
+
+Note that layers are not called layers there; they are called *parameters*.
+
+The main structure describing the whole model is called `NetParameter`.
+
+`NetParameter` contains a sequence of `LayerParameter` messages. Each of them has the properties "name", "type", "top", "bottom", "blobs" (essentially the "kernels", where applicable for the layer), and a property corresponding to one specific layer type, for example `ConvolutionParameter`.
+
+**Note1:** most of these properties are technically optional; some of them are *repeated*, which means there can be zero or more of them (as in the case of "top" and "bottom": a layer may have zero or more inputs and outputs).
+
+**Note2:** sometimes the bottom and top blobs of a layer have the same name. This means that Caffe reuses the memory for this layer (i.e. it puts the layer's result into the same memory as its input). The computation graph is still correctly defined, because the order of layers is significant.
+
+Important notes
+###############
+
+* The `InnerProduct` layer is just another name for a `FullyConnected` or `Dense` layer.
+* Layers such as `Pooling`, `InnerProduct` and `Convolution` all have 4D inputs and 4D outputs. This can be unexpected, especially for `InnerProduct`. Check the Caffe docs for details (for example `InnerProduct `_).
+* The `Pooling` layer has a `global_pooling` property, which is basically a way to automatically pool over the whole height and width of the input tensor. It means that the pooling window size is not given as numbers, which in turn means that it is impossible to support this in the Caffe importer without knowing the shape of the input. Currently this `global_pooling` property is manually replaced with a concrete pooling window size.
+* At the time of writing, Caffe **does not** have a DepthwiseConv layer.
+* The `Split` layer, quite surprisingly, just makes a few identical copies of the input (bottom) blob; it does not actually split the blob into parts.


{Generate the code} Detailed Design
-----------------------------------

Soft backend
````````````

Generation of C++ source code for CPU
#####################################

-.. include:: soft_backend_details.rst
+Glossary
+~~~~~~~~
++ **Tensor** - a class that represents a multidimensional array.
+  It provides the user-facing artifact interface (holds input data and results) and
+  keeps temporary data generated by one operation and consumed by another.
++ **Operation** - a neural network layer implementation, such as a 2D convolution or an activation function.
+  It consumes ``Tensor`` objects as input and produces one or more output ``Tensors``.
++ **Shape** - a class that stores the number and sizes of a ``Tensor``'s dimensions.
++ **Artifact** - the product of a soft backend run; the artifact provides the interface for the user and
+  implements inference of the provided computational graph.
+
+Overview
+~~~~~~~~
+The soft backend takes a pointer to a computational graph (Model IR) and
+generates an artifact in the form of a C++ source code file, a header with the artifact interface, and
+a binary file containing the parameters of the compiled network.
+
+Example of a generated artifact interface:
+
+.. code-block:: c++
+
+    class NNModel
+    {
+    public:
+      // Constructor of the artifact; it takes the path to a file with NN parameters.
+      // The contents of this file are not stored in the code directly
+      // because they could be enormous
+      NNModel(const std::string &parametersPath);
+
+      // Setter of the NN input named "layer1";
+      // the contents of the in Tensor are copied into the internals of the Model
+      void set_layer1(const Tensor &in);
+
+      // Getter of the NN output named "result".
+      // The Model creates the result object during inference and
+      // holds a reference to it until the model is destroyed
+      // or another inference is executed
+      std::shared_ptr<Tensor> get_result();
+
+      // This method exists in every artifact;
+      // it has a fixed name that does not depend on the Model.
+      // It is responsible for inferring the result tensors from the input data
+      void doInference();
+    };
+
+Common usage:
+
+.. code-block:: c++
+
+    std::string pathToParameters = "nnmodel.params";
+    NNModel model(pathToParameters);
+    Tensor inputTensor;
+    for (int i = 0; i < numInputs; ++i)
+    {
+      fillInput(inputTensor);
+      model.set_layer1(inputTensor);
+      model.doInference();
+      std::shared_ptr<Tensor> result = model.get_result();
+      showResult(*result);
+    }
+
+
+The soft backend has three main phases: analysis, serialization, and artifact generation.
+
+* Analysis is implemented by the ``ModelAnalyzer`` class,
+* Serialization is implemented by the ``Serializer`` class,
+* Generation of the header, code, and parameter files is implemented by the ``BaseCodeGenerator`` class and its derived classes.
+
+General generation sequence
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The main backend sequence can be found in the ``BaseCodeGenerator::generate`` method:
+
+1. Apply the ``ShapeInference`` visitor to determine the output Shapes of operations;
+   this simplifies the artifact code: there is no need to compute shapes during inference.
+2. Apply the ``ModelAnalyzer`` visitor to generate the inference sequence and
+   find the artifact's input, output, and temporary Tensors.
+3. Apply the ``Serializer`` visitor to the inference sequence generated by ``ModelAnalyzer``
+   to create the binary array of parameters.
+4. Call the ``formatTensorNames`` method, which adjusts input and output names
+   to the target language naming convention (removing symbols (like '/') that are invalid in C++ names, etc.).
+5. Create the artifact output directory (set by the ``--output-dir`` option;
+   whether this operation is possible should be checked by the driver system).
+6. Create the header file in the output directory and write its contents
+   by calling the virtual ``materializeHeader`` method overridden
+   in the particular soft backend class (CPPCodeGenerator, CCodeGenerator, etc.).
+   This phase consumes the data gathered by ``ModelAnalyzer``.
+7. Create the code file in the output directory and write its contents
+   by calling the virtual ``materializeCode`` method overridden
+   in the particular soft backend class (CPPCodeGenerator, CCodeGenerator, etc.).
+   This phase consumes the data gathered by ``ModelAnalyzer``.
+8. Create and fill the file with model parameters.
+   This file contains a header (magic number, version of the protocol, etc.)
+   that identifies the file and guards against combining a model with the
+   wrong parameter file, followed by the raw data collected by the serializer.
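+
+The sequence above can be summarized in code. Below is a condensed, hedged sketch of
+what ``BaseCodeGenerator::generate`` does; the method and parameter names follow the
+description above, and the exact signatures in the codebase may differ:
+
+.. code-block:: c++
+
+    // Condensed sketch of the generation pipeline described above.
+    void BaseCodeGenerator::generate(Graph *graph)
+    {
+      ShapeInference shapeInference;   // 1. infer output shapes up front
+      graph->accept(&shapeInference);
+
+      ModelAnalyzer analyzer;          // 2. build the inference sequence and
+      graph->accept(&analyzer);        //    find input/output/temporary tensors
+
+      Serializer serializer;           // 3. pack operation parameters into bytes
+      serializer.serialize(analyzer.getInferenceSequence());
+
+      formatTensorNames(analyzer);     // 4. make names valid C++ identifiers
+
+      createOutputDir();               // 5. e.g. the --output-dir path
+
+      materializeHeader(headerFile, analyzer);  // 6. artifact interface
+      materializeCode(codeFile, analyzer);      // 7. artifact implementation
+      writeParamFile(paramFile, serializer);    // 8. header + raw parameters
+    }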
+
+Inference sequence construction
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+During phase 2 of the general sequence, the ``ModelAnalyzer`` object walks the computational graph
+in topological order and creates a layout of its operations. This sequence is represented
+by a list of ``ModelAnalyzer::OpDescr`` objects.
+
+A ``ModelAnalyzer::OpDescr`` object contains the name of the operation,
+a pointer to the corresponding CG node, and the ids of the input and output Tensors.
+
+Information about artifact variables (Tensors) is stored in an array of ``TensorDescription`` objects;
+each such object holds the name of a ``Tensor`` and its properties (whether it is an input/output/temporary).
+Every input node emits a ``Tensor`` variable of "input" type.
+Every named node that is not an input emits a ``Tensor`` variable of "output" type that holds the operation output.
+A node without a particular name creates a temporary ``Tensor`` that is not accessible outside the Model.
+
+Serialization
+~~~~~~~~~~~~~
+The ``Serializer`` object visits CG nodes and stores the corresponding data in raw binary format
+in an internal array of bytes. Every operation receives a unique offset in this array
+where its data is stored (with the exception of operations that have nothing to serialize, for example relu).
+This offset is stored in the ``_paramStartOffset`` field of the ``ModelAnalyzer::OpDescr`` object.
+
+Shapes and strides are stored as arrays of integers: the first value serialized is the array size, followed by all dimension values.
+Pads are stored as vectors too, but they have one additional value to save:
+the padding type. It is not used in inference yet, because all ``Tensor`` ``Shapes`` are available at compile time.
+To serialize ``Tensor`` objects (like a convolution kernel or fully-connected weights), the ``Serializer`` dumps the ``Shape`` first,
+then the actual data in the form of a C multidimensional array (the data is stored contiguously, like ``int a[100][100][3]``).
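+
+As a simplified illustration of this scheme (the real ``Serializer`` may differ in
+types and byte-order handling; ``appendBytes`` is a hypothetical helper, and ``Shape``
+is assumed to expose ``rank()`` and ``dim(i)`` accessors):
+
+.. code-block:: c++
+
+    #include <cstddef>
+    #include <cstdint>
+    #include <vector>
+
+    // Hypothetical helper: append n raw bytes to the parameter buffer.
+    static void appendBytes(std::vector<char> &out, const void *p, size_t n)
+    {
+      const char *bytes = static_cast<const char *>(p);
+      out.insert(out.end(), bytes, bytes + n);
+    }
+
+    // A Shape is written as its size followed by all dimension values.
+    void serializeShape(const Shape &shape, std::vector<char> &out)
+    {
+      int32_t rank = shape.rank();
+      appendBytes(out, &rank, sizeof(rank));
+      for (int32_t i = 0; i < rank; ++i)
+      {
+        int32_t dim = shape.dim(i);
+        appendBytes(out, &dim, sizeof(dim));
+      }
+    }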
+
+Def files
+~~~~~~~~~
+The generator has a number of ``.def`` files. These files contain code snippets used to generate the artifact:
+class and type declarations for the artifact, the library of operations, and support functions.
+
+The build system converts them into headers containing char arrays with the contents of the ``.def`` files.
+If the generator needs to include some snippet in the generated code, it just prints the contents of the corresponding generated array.
+
+Header generation
+~~~~~~~~~~~~~~~~~
+The ``materializeHeader`` method of a ``BaseCodeGenerator`` derivative implements generation of the header file.
+
+The C++ backend generates an artifact class in the header file that contains:
++ a constructor that takes the path to the parameter file
++ a destructor that frees the acquired resources
++ setters of model inputs. Every setter has a unique name taken from a CG node.
+These setters correspond to the "input" tensors found by the ``ModelAnalyzer`` object.
++ getters of model products. Every getter has a unique name taken from a CG node.
+These getters correspond to the "output" tensors found by the ``ModelAnalyzer`` object.
+
+The header file also contains a number of helper functions and types (``Shape``, ``Tensor``) defined in the ``.def`` files.
+These types and methods are needed by users of the artifact.
+
+Code generation
+~~~~~~~~~~~~~~~
+The ``materializeCode`` method of a ``BaseCodeGenerator`` derivative implements generation of the code file.
+
+First the backend writes the required snippets from the ``.def`` files. This includes operation implementations and
+helper functions (like parameter file mapping and unmapping).
+
+Then the artifact interface implementation is written: the artifact constructor, destructor, setters, getters, and ``doInference``.
+
+The constructor and destructor call support functions from the included snippets to manage parameters.
+
+Setters and getters are trivial and contain assignments or ``return`` statements.
+
+The ``doInference`` function contains an "initializer" section and the actual inference.
+The "initializer" section resets all "output" variables so that references to them are dropped.
+The inference part is generated from the inference sequence that was constructed by the ``ModelAnalyzer`` object.
+Operations are represented by calls into the support library.
+The general form of an operation call looks like ``opName(outputTensor, paramsPtr, inputTensor1, inputTensor2);``.
+If an operation defines a temporary variable, then the temporary variable is allocated before the point of the call.
+The ``paramsPtr`` argument corresponds to the expression
+``<start of the parameter buffer> + <data offset of the operation>``, where the
+data offset is defined by the ``Serializer``.


Interface Design
================

-.. include:: cli_details.rst
+Overview
+--------
+``nnc`` provides a command line user interface for running the compilation pipeline with the settings the user wants.
+To get all available options of ``nnc``, pass the ``--help`` flag on the command line.
+
+Here is the list of available ``nnc`` options:
+
+``$ nnc --help``
+
+::
+
+    Usage: nnc OPTIONS
+    Available OPTIONS
+        --help, -h         - print usage and exit
+        --debug            - turn debugging on (optional: provide filename)
+        --debug-area       - if specified, debug code will be executed
+                             only in given areas
+        --caffe            - treat input file as Caffe model
+        --tflite           - treat input file as Tensor Flow Lite model
+        --target           - select target language to emit for given
+                             architecture. Valid values are 'x86-c++',
+                             'interpreter'
+        --nnmodel, -m      - specify input file with NN model
+        --output, -o       - specify name for output files
+        --output-dir, -d   - specify directory for output files
+        --input-model-data - interpreter option: specify file with
+                             neural network input data. This file
+                             contains array of floats in binary form
+        --input-node       - interpreter option: set input node
+                             in Computational Graph
+        --output-node      - interpreter option: set output node
+                             in Computational Graph
+        --res-filename     - interpreter option: file for result tensor
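+
+For example, a typical invocation that compiles a Caffe model into C++ sources might
+look like this (the file and directory names are illustrative):
+
+::
+
+    $ nnc --caffe -m inception_v3.prototxt --target x86-c++ -o nnmodel -d ./out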
+
+
+Option declaration design
+-------------------------
+``nnc`` has a convenient and flexible mechanism for declaring command line options that allows various option settings to be adjusted.
+A command line option is represented by the ``Option`` template class and is defined at global scope, which allows the option to be constructed before
+the ``main`` function runs. Therefore ``parseCommandLine`` (see `Command line parser design`_) can be called at an arbitrary point in the program with all options already declared.
+The ``Option`` class has only one constructor, with the following parameters:
+
+.. list-table::
+   :widths: 10 30
+   :header-rows: 0
+
+   * - **optnames**
+     - Names of the option. An option can have several names (`aliases`), for example ``-f`` and ``--file-name``
+   * - **descr**
+     - Option description. This text is shown when the ``--help`` option is passed or when the command line is incorrect
+   * - **default_val**
+     - The option value accepted by default. It is assigned to the option if no value for the option is passed on the command line
+   * - **is_optional**
+     - If this parameter is set to ``false`` and the option is not passed on the command line, an error message is shown and ``nnc`` terminates
+   * - **vals**
+     - Valid values for the option. All other values are treated as invalid
+   * - **checker**
+     - A pointer to a function that the command line parser calls to verify the option
+   * - **seps** (spaces by default)
+     - Symbols that separate the option name from its value
+   * - **enabled**
+     - If this parameter is set to ``false``, the option is hidden from users
+
+|
+
+When an ``Option`` is constructed, it registers itself with the command line parser, which is a singleton object of the ``CommandLine`` class,
+so that once all options have been constructed the command line parser contains all of them.
+
+Most option parameters should be set with special helper functions that make it possible to declare options in a simpler form:
+
+.. list-table::
+   :widths: 10 30
+   :header-rows: 0
+
+   * - **optname**
+     - Converts option names for the ``Option`` constructor; if an option has several names, they must be separated by commas
+   * - **overview**
+     - Converts a string containing the option description for the ``Option`` constructor. This function can split long description lines for prettier printing
+   * - **optvalues**
+     - Converts the valid option values for the ``Option`` constructor; if an option has several values, they must be separated by commas
+   * - **separators**
+     - Converts a comma-separated string of symbols for the ``Option`` constructor
+
+|
+
+Here is an example of declaring a ``target`` option that selects a specific `backend`:
+
+.. code-block:: c++
+
+    Option<std::string> target(
+        optname("--target"),
+        overview("select target language "
+                 "to emit for given architecture. "
+                 "Valid values are 'x86-c++', 'interpreter'"),
+        std::string(),            // default value is empty
+        optional(false),          // required option
+        optvalues("x86-c++, interpreter"),
+        nullptr,                  // no option checker
+        separators("="));
+
+|
+
+After command line parsing, an ``Option`` object can be used as an object of the type with which ``Option`` was instantiated, for example:
+
+.. code-block:: c++
+
+    ...
+    if ( target == "x86-c++" )
+      ...
+    ...
+    std::string targ_val = target;
+    ...
+    f(target.c_str());
+    ...
+
+
+Command line parser design
+--------------------------
+The command line parser is represented by a singleton object of the ``CommandLine`` class, which contains all registered options (see `Option declaration design`_).
+The main public method of the ``CommandLine`` class is ``parseCommandLine``, which parses the command line. When this method is invoked, the command line parser performs the following steps:
+
+- verify that the next option on the command line is an option that was registered with the command line parser; otherwise print an error message saying the option is not recognized
+- set the value of the current command line argument on the ``Option`` object and check that this value is valid for the option; if the value is invalid, an error message is printed
+- verify that all required options are present on the command line; if some are missing, an error message is shown
+- invoke the checker function of every option that provides one
+
+|
+
+**Note**. Since ``Option`` is a template class and can be instantiated with various types, the command line parser accesses options through the interface presented by the ``BaseOption`` class, which is the superclass of the ``Option`` class.
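+
+Putting the pieces together, here is a minimal, hedged sketch of how a driver might
+declare an option with a checker and run the parser. The checker signature, its
+error handling, and the unqualified ``parseCommandLine`` entry point are assumptions
+for illustration and may differ from the actual API:
+
+.. code-block:: c++
+
+    #include <cstdlib>
+    #include <iostream>
+
+    // Hypothetical checker: invoked by the parser after the value is set.
+    static void checkOutputDir(const Option<std::string> &opt)
+    {
+      if (static_cast<std::string>(opt).empty())
+      {
+        std::cerr << "output directory must not be empty" << std::endl;
+        std::exit(1);
+      }
+    }
+
+    Option<std::string> outputDir(
+        optname("--output-dir, -d"),
+        overview("specify directory for output files"),
+        std::string("."),      // default: current directory
+        optional(true),
+        optvalues(""),         // any value is accepted
+        checkOutputDir,        // checker invoked during parsing
+        separators("="));
+
+    int main(int argc, const char *argv[])
+    {
+      // all globally declared options are already registered here
+      parseCommandLine(argc, argv);
+      std::string dir = outputDir;  // implicit conversion to std::string
+      std::cout << "output dir: " << dir << std::endl;
+      return 0;
+    }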


SW Code Structure
=================

diff --git a/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_HLD.rst b/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_HLD.rst
index 9e5fee4..a561caf 100644
--- a/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_HLD.rst
+++ b/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_HLD.rst
@@ -23,7 +23,44 @@ SW High Level Design
 | 1.0   | 2018.06.22  | Final DR1 version          | Vostokov Sergey          | Sung-Jae Lee        |
 +-------+-------------+----------------------------+--------------------------+---------------------+

-.. include:: project_terms_and_abbreviation.rst
+|
+
+**Terminology and Abbreviation**
+
+.. list-table::
+   :widths: 10 30
+   :header-rows: 0
+
+   * - OS
+     - Operating System
+   * - OS API
+     - Application programming interface of the OS
+   * - HW
+     - Hardware
+   * - SW
+     - Software
+   * - NN
+     - Neural Network
+   * - NN model
+     - Neural network model (an instance of an NN built with an ML framework)
+   * - NN compiler
+     - The compiler for neural networks
+   * - ML framework
+     - The machine learning framework
+   * - TF/TF Lite
+     - TensorFlow/TensorFlow Lite ML framework
+   * - IR
+     - Intermediate representation
+   * - CI/CI system
+     - Continuous integration system
+   * - UI
+     - The user interface
+   * - GUI
+     - The graphical user interface
+   * - CLI
+     - The command-line interface
+   * - CG
+     - Computational Graph


|


Overview
========

Scope
-----

-.. include:: project_purpose_and_scope.rst
-.. include:: project_target_model.rst
+The main goal of the project is to develop a compiler for neural networks that produces an executable artifact for a specified SW and HW platform.
+
+The development scope includes the following components:
+
+- Develop an importer module that parses, verifies, and represents an NN model for further optimization and compilation
+- Develop code emitters that produce executable binaries for CPU and GPU
+
+|
+| **Goals for 2018:**
+
+- Support the TensorFlow Lite NN model format
+- Support the Caffe NN model format
+- Support the Caffe2 NN model format (Optional)
+- Support compilation of the MobileNet NN
+- Support compilation of the Inception v3 NN
+- Support ARM CPU
+- Support ARM GPU (Mali)
+- Support Tizen OS
+- Support SmartMachine OS (Optional)
+
+|
+
+.. list-table:: Table 1-1. Target Model
+   :widths: 23 50 20
+   :header-rows: 1
+
+   * - Product
+     - Target Model Name
+     - Comment
+
+   * - Tizen phone
+     - Tizen TM2
+     - Reference device
+
+   * - Tizen device
+     - Odroid XU4
+     - Reference board
+
+   * - SmartMachine target
+     - Microvision mv8890, exynos8890
+     - Reference device


Design Consideration
====================

Constraints
-----------

See constraints in SW Requirements Specification.

-.. include:: project_sw_hw_constraints.rst
+|
+
+.. list-table:: Table 1-2. Assumptions, Dependencies and the Constraints
+   :widths: 23 40 23
+   :header-rows: 1
+
+   * - Item
+     - Assumptions, Dependencies and the Constraints
+     - Reference
+
+   * - Tizen SW Platform
+     - The following items should be provided:
+
+       - Tizen API
+       - Tizen kernel
+       - Tizen FW
+       - Tizen SDK
+       - Tizen naming convention
+
+       |
+     -
+       - `www.tizen.org `_
+       - `wiki.tizen.org `_
+       - `developer.tizen.org `_
+
+   * - SmartMachine OS Platform
+     - The following items should be provided:
+
+       - SmartMachine API
+       - SmartMachine kernel
+       - SmartMachine FW
+       - SmartMachine SDK
+       - SmartMachine naming convention
+
+       |
+     -
+       - `Platform confluence `_
+       - `Github `_
+       - `Functional Safety confluence `_
+
+   * - Host OS
+     - Linux-based OS (Ubuntu, Arch Linux, etc.)
+     -
+       - `Ubuntu site `_
+       - `Archlinux site `_
+
+   * - Tizen target HW
+     - The reference device should be provided: Tizen TM2
+     -
+
+   * - SmartMachine target HW
+     - The reference device should be provided
+     -


SW System Architecture Design
=============================

diff --git a/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_SRS.rst b/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_SRS.rst
index e42f77a..e4e382d 100644
--- a/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_SRS.rst
+++ b/contrib/nnc/doc/project/18_NN_Compiler_and_Optimizer_SRS.rst
@@ -32,8 +32,64 @@ Introduction
 Purpose and scope
 -----------------

-.. include:: project_purpose_and_scope.rst
-.. include:: project_terms_and_abbreviation.rst
+The main goal of the project is to develop a compiler for neural networks that produces an executable artifact for a specified SW and HW platform.
+
+The development scope includes the following components:
+
+- Develop an importer module that parses, verifies, and represents an NN model for further optimization and compilation
+- Develop code emitters that produce executable binaries for CPU and GPU
+
+|
+| **Goals for 2018:**
+
+- Support the TensorFlow Lite NN model format
+- Support the Caffe NN model format
+- Support the Caffe2 NN model format (Optional)
+- Support compilation of the MobileNet NN
+- Support compilation of the Inception v3 NN
+- Support ARM CPU
+- Support ARM GPU (Mali)
+- Support Tizen OS
+- Support SmartMachine OS (Optional)
+
+|
+
+**Terminology and Abbreviation**
+
+.. list-table::
+   :widths: 10 30
+   :header-rows: 0
+
+   * - OS
+     - Operating System
+   * - OS API
+     - Application programming interface of the OS
+   * - HW
+     - Hardware
+   * - SW
+     - Software
+   * - NN
+     - Neural Network
+   * - NN model
+     - Neural network model (an instance of an NN built with an ML framework)
+   * - NN compiler
+     - The compiler for neural networks
+   * - ML framework
+     - The machine learning framework
+   * - TF/TF Lite
+     - TensorFlow/TensorFlow Lite ML framework
+   * - IR
+     - Intermediate representation
+   * - CI/CI system
+     - Continuous integration system
+   * - UI
+     - The user interface
+   * - GUI
+     - The graphical user interface
+   * - CLI
+     - The command-line interface
+   * - CG
+     - Computational Graph


SW System Architecture
======================

@@ -48,7 +104,7 @@ The main components of the compiler are the following:
 - Code emitter (Produces the binary to take advantages of CPU and/or GPU)


-.. image:: ../images/nncc_idef0_a1.png
+.. image:: images/nncc_idef0_a1.png
    :scale: 100%


diff --git a/contrib/nnc/doc/project/caffe_importer_details.rst b/contrib/nnc/doc/project/caffe_importer_details.rst
deleted file mode 100644
index f7a60c7..0000000
--- a/contrib/nnc/doc/project/caffe_importer_details.rst
+++ /dev/null
@@ -1,112 +0,0 @@
-Basics
-######
-
-Caffe models consist of *layers*, which is more or less a synonym to "NN operation".
- -Layers input and output *blobs*, which is more or less a synonym to "tensor". - -Every layer has a type (for example "Convolution"), a name, bottom blobs (input tensors), top blobs (output tensors). - -Note that frequently layer's name and output blob might be identical, which might be confusing. So remember: "name" is just a name of a layer. But "top" and "bottom" are names of the *blobs*, and they should be consistent over the sequence of layers (i.e. a bottom blob of every layer must have the same name as a top blob of some other layer). - -Example: - -.. code-block:: none - - layer { - name: "conv1_3x3_s2" - type: "Convolution" - bottom: "data" - top: "conv1_3x3_s2" - param { - lr_mult: 1 - decay_mult: 1 - } - phase: TEST - convolution_param { - num_output: 32 - bias_term: false - pad: 0 - kernel_size: 3 - stride: 2 - weight_filler { - type: "xavier" - std: 0.01 - } - } - } - -Model files -########### - -Models are frequently distributed as a pair of files: - -* `.prototxt` file, containing the text version of the model used for deployment (without various layers needed for training etc) -* `.caffemodel` binary file, containing the version of the model that was (or still can) be used for training, and containing trained model weights - -Ideally, Caffe importer should also support this - it should also accept two files like this, read the first one to get the NN model architecture, read the second one to get the weights. - -Instead, currently we do the following - we just take the first file (`.prototxt`), fill it with weights (it will still remain a `.prototxt` file), and then use it as input to the importer. Filling a `.prototxt` with weights can be done using `caffegen` tool in `contrib/caffegen` - just run `caffegen init < "path-to-prototxt" > "path-to-output"`. This command will result in another `.prototxt` file, but this time it will be filled with weights (yes, the size of this file will be large). - -Now this `.prototxt` file can be turned into a binary `.caffemodel` file. Unfortunately, I don't know a convenient way to do this, so currently I just take the code from `src/caffe/util/io.cpp` from Caffe sources, which has functions that can read and write Caffe models in text and binary formats (these functions are `ReadProtoFromTextFile`, `WriteProtoToTextFile`, `ReadProtoFromBinaryFile` and `WriteProtoToBinaryFile`), then I insert them to the `proto_reader.cpp` in nncc sources, and use them to read and write Caffe models. - -Caffe model preparation for importer -#################################### - -Currently we do not support Caffe layers like `BatchNorm`, `Scale`, `Split`, so we have to manually remove them from the `.prototxt` models. - -After this it is necessary to make sure that top blobs of previous layers connect to the bottom blobs of following layers. Example: - -.. code-block:: none - - layer { - bottom: "x" - top: "blob1" - } - layer { - type: "Split" - bottom: "blob1" - top: "blob2" - } - layer { - bottom: "blob2" - top: "y" - } - -After removing the `Split` layer the first layer will output "blob1", but the last layer will accept "blob2", which doesn't exist. So, the result should be: - -.. code-block:: none - - layer { - bottom: "x" - top: "blob1" - } - layer { - bottom: "blob1" - top: "y" - } - - -Model format -############ - -Defined by Protocol Buffers library schema, can be found in Caffe's sources in `src/caffe/proto/caffe.proto`. - -Note that layers are not called layers there, they are called *parameters*. 
- -The main structure describing the whole model is called `NetParameter`. - -`NetParameter` contains a sequence of `LayerParameters`. Each of them has properties "name", "type", "top", "bottom", "blobs" (basically - "kernels", if it is applicable for this layer of course), and a property corresponding to one specific layer type, for example `ConvolutionParameter`. - -**Note1:** most of these properties are technically optional; some of them are *repeated*, which means there can be zero or more of them (like in the case of "top" and "bottom" - a layer may have zero or more inputs and outputs). - -**Note2:** sometimes you'll see that for some layers bottom and top blobs have the same name. It means, that Caffe will reuse the memory for this layer (i.e. it will put the calculation result for this layer into the same memory as its input). Still, the computation graph is still correctly defined, because the order of layers is significant. - -Important notes -############### - -* `InnerProduct` layer is just another name for `FullyConnected` or `Dense` layer. -* Layers such `Pooling`, `InnerProduct`, `Convolution` all have 4D inputs and 4D outputs. This can be unexpected for `InnerProduct` especially. Check Caffe docs for details (for example `InnerProduct `_). -* `Pooling` layer has `global_pooling` property which basically a way to automatically pool over the whole height and width of the input tensor. It means that pooling window size won't be set as numbers, which in turn means that it is impossible to implement this in the Caffe importer without knowing the shape of the input. Currently I just change this `global_pooling` property to concrete pooling windows size. -* At the time of writing Caffe **does not** have a DepthwiseConv layer. -* `Split` layer, quite surprisingly, just makes a few identical copies of the input (bottom) blob; it doesn't actually split the blob into parts. \ No newline at end of file diff --git a/contrib/nnc/doc/project/cli_details.rst b/contrib/nnc/doc/project/cli_details.rst deleted file mode 100644 index 951c057..0000000 --- a/contrib/nnc/doc/project/cli_details.rst +++ /dev/null @@ -1,131 +0,0 @@ -Overview --------- -``nnc`` provides a command line user interface to run compilation pipeline with settings that user wants. -To get all available options of ``nnc`` the flag ``--help`` is to be passed to command line. - -Here is a list of available ``nnc`` options: - -``$ nnc --help`` - -:: - - Usage: nnc OPTIONS - Available OPTIONS - --help, -h - print usage and exit - --debug - turn debugging on (optional: provide filename) - --debug-area - if specified, debug code will be executed - only in given areas - --caffe - treat input file as Caffe model - --tflite - treat input file as Tensor Flow Lite model - --target - select target language to emit for given - architecture. Valid values are 'x86-c++', - 'interpreter' - --nnmodel, -m - specify input file with NN model - --output, -o - specify name for output files - --output-dir, -d - specify directory for output files - --input-model-data - interpreter option: specify file with - neural network input data. 
This file - contains array of floats in binary form - --input-node - interpreter option: set input node - in Computational Graph - --output-node - interpreter option: set output node - in Computational Graph - --res-filename - interpreter option: file for result tensor - - - -Option declaration design -------------------------- -``nnc`` has a convenient and flexible mechanism for declaring command line options that allows users to adjust various option settings. -A command line option is represented by ``Option`` template class and defined in global scope which allows option to be constructed before -``main`` function. This allows calling ``parseCommandLine`` (see `Command line parser design`_) in arbitrary point of program having already all declared options. -The ``Option`` class has only one constructor with the following parameters: - -.. list-table:: - :widths: 10 30 - :header-rows: 0 - - * - **optnames** - - Names of option. The option can have a several names (`aliases`), for example: ``-f``, ``--file-name`` - * - **descr** - - Option Description. This text will be shown if ``--help`` option is passed or if command line is incorrect - * - **default_val** - - Option value accepted by default. This value will be set to option value if value for option isn't passed to command line - * - **is_optional** - - If this parameter set to ``false`` and option isn't passed to command line then error message will be shown and ``nnc`` will be terminated - * - **vals** - - Valid values for option. Other values are interpreted as invalid - * - **checker** - - Pointer to function that will be called by command line parser to verify option - * - **seps** (by default is spaces) - - Symbols that separate option name from value - * - **enabled** - - If this option is set to ``false`` then it won't be shown for users - -| - -When ``Option`` is constructed it registers itself to command line parser that is singleton of ``CommandLine`` class object, -so that when all options are registered the command line parser will contain all of them. - -Most of option parameters should be set with special helper functions that give the user an opportunity to declare options in a simpler form: - -.. list-table:: - :widths: 10 30 - :header-rows: 0 - - * - **optname** - - Convert option names for ``Option`` constructor, if option has several names then they must be separated by a comma - * - **overview** - - Convert string contains option description for ``Option`` constructor. This function can split long lines of description for more pretty printing - * - **optvalues** - - Convert option valid values for ``Optioin`` constructor, if option has several values then they must be separated by a comma - * - **separators** - - Convert string of symbols separated by a comma for ``Option`` constructor - -| - -This is an example of a declaration a ``target`` option that selects a specific `backend`: - -.. code-block:: c++ - - Option target( - optname("--target"), - overview("select target language" - "to emit for given architecture. " - "Valid values are 'x86-c++', 'interpreter'), - std::string(), // default value is empty - optional(false), // required option - optvalues("x86-c++, interpreter"), - nullptr, // no option checker - separators("=")); - -| - -After command line parsing ``Option`` object can be used as an object of type with which ``Option`` was instantiated, for example: - -.. code-block:: c++ - - ... - if ( target == "x86-c++" ) - ... - ... - std::string targ_val = target; - ... - f(target.c_str()); - ... 
- - -Command line parser design --------------------------- -Command line parser is presented by a singleton object of ``CommandLine`` class that contains all registered options (see `Option declaration design`_). -The ``CommandLine`` class has a main public method ``parseCommandLine`` that parses the command line. When this method is invoked the command line parser implements the following steps: - -- verify that next option in command line is either option that was registered in command line parser or print error message that option is not recognized -- set the value of the current command line argument to ``Option`` object and check that this value is valid for the option, if value is invalid error message will be printed -- verify that all required options are present in command line if they are not an error message will be shown -- invoke checker function for all options if these functions are available - -| - -**Note**. Since ``Option`` class is template class and can take a various type the command line parser accesses to options via interface that presented by ``BaseOption`` class and that is supperclass for ``Options`` class - diff --git a/contrib/nnc/doc/project/model_ir_overview.rst b/contrib/nnc/doc/project/model_ir_overview.rst deleted file mode 100644 index ad24763..0000000 --- a/contrib/nnc/doc/project/model_ir_overview.rst +++ /dev/null @@ -1,43 +0,0 @@ -Overview -```````` -Model IR consists of 4 main parts: - -* Graph - represents the computation graph -* Node - container for single computational operation in computation graph -* Operation description - represents a single operation -* Visitor - declares an interface used for graph traversal - -Graph -````` -Graph contains information about graph input/output nodes and list of all nodes in graph. - -Responsible for allocating nodes and keeps all allocated node references. -`Graph` class takes care of graph traversal considering all node input/output dependencies. - -Node -```` -Each node contains: - -- Node id ( used to uniquely address node in computation graph ) -- Node name( set by importer, used to distinguish inputs/outputs ) -- Operation description - reference to OpDescription subclass -- List of inputs( each represented by node reference and output index from that node ) -- List of outputs( List of nodes which take any resulting data from this node ) - -Operation Description -````````````````````` -All operations in computation graph are represented by subclasses of `OpDescription` class. - -Every Operation's description contains: - -- Number of inputs/outputs operation takes -- Shapes of input/output tensors ( initialised in Importer/ShapeInference ) -- Any information specific to operation( i.e. convolution kernel ) - -Visitor -``````` -Base class used for traversing computation graph. - -Defines set of operations on IR nodes to be provided by IR user. - -Supposed to be the only way used for graph processing. \ No newline at end of file diff --git a/contrib/nnc/doc/project/project_purpose_and_scope.rst b/contrib/nnc/doc/project/project_purpose_and_scope.rst deleted file mode 100644 index 3793ef0..0000000 --- a/contrib/nnc/doc/project/project_purpose_and_scope.rst +++ /dev/null @@ -1,19 +0,0 @@ -The main goal of the project is to develop a compiler for neural networks to produce executable artifact for specified SW and HW platform. 
- -The development scope includes the following components: - -- Develop importer module to parse, verify and represent NN model for further optimization and compilation -- Develop code emitters to produce executable binary for CPU and GPU - -| -| **2018 year goals:** - -- Support TensorFlow Lite NN model format -- Support Caffe NN model format -- Support Caffe2 NN model format (Optional) -- Support compilation of MobileNet NN -- Support compilation of Inception v3 NN -- Support ARM CPU -- Support ARM GPU (Mali) -- Support Tizen OS -- Support SmartMachine OS (Optional) diff --git a/contrib/nnc/doc/project/project_sw_hw_constraints.rst b/contrib/nnc/doc/project/project_sw_hw_constraints.rst deleted file mode 100644 index f47c13f..0000000 --- a/contrib/nnc/doc/project/project_sw_hw_constraints.rst +++ /dev/null @@ -1,51 +0,0 @@ -| - -.. list-table:: Table 1-2. Assumptions, Dependencies and the Constraints - :widths: 23 40 23 - :header-rows: 1 - - * - Item - - Assumptions, Dependencies and the Constraints - - Reference - - * - Tizen SW Platform - - The following items should be provided: - - Tizen API - - Tizen kernel - - Tizen FW - - Tizen SDK - - Tizen naming convention - | - - - - `www.tizen.org `_ - - `wiki.tizen.org `_ - - `developer.tizen.org `_ - - * - SmartMachine OS Platform - - The following items should be provided: - - SmartMachine API - - SmartMachine kernel - - SmartMachine FW - - SmartMachine SDK - - SmartMachine naming convention - | - - - - `Platform confluence `_ - - `Github `_ - - `Functional Safety confluence `_ - - - * - Host OS - - Linux-based OS (Ubuntu, Archlinux, etc) - - - - `Ubuntu site `_ - - `Archlinux site `_ - - * - Tizen target HW - - The reference device should be provided: Tizen TM2 - - - - * - SmartMachine target HW - - The reference device should be provided - - - diff --git a/contrib/nnc/doc/project/project_target_model.rst b/contrib/nnc/doc/project/project_target_model.rst deleted file mode 100644 index e16ab71..0000000 --- a/contrib/nnc/doc/project/project_target_model.rst +++ /dev/null @@ -1,21 +0,0 @@ -| - -.. list-table:: Table 1-1. Target Model - :widths: 23 50 20 - :header-rows: 1 - - * - Product - - Target Model Name - - Comment - - * - Tizen phone - - Tizen TM2 - - Reference device - - * - Tizen device - - Odroid XU4 - - Reference board - - * - SmartMachine target - - Microvision mv8890, exynos8890 - - Reference device diff --git a/contrib/nnc/doc/project/project_terms_and_abbreviation.rst b/contrib/nnc/doc/project/project_terms_and_abbreviation.rst deleted file mode 100644 index 59b2e01..0000000 --- a/contrib/nnc/doc/project/project_terms_and_abbreviation.rst +++ /dev/null @@ -1,39 +0,0 @@ -| - -**Terminology and Abbreviation** - -.. 
list-table:: - :widths: 10 30 - :header-rows: 0 - - * - OS - - Operating System - * - OS API - - Application interface of OS - * - HW - - Hardware - * - SW - - Software - * - NN - - Neural Network - * - NN model - - Neural network model (Instance of NN built with ML framework) - * - NN compiler - - The compiler for neural network - * - ML framework - - The machine learning framework - * - TF/TF Lite - - Tensorflow/Tensorflow Lite ML framework - * - IR - - Intermediate representation - * - CI/CI system - - Continuous integration system - * - UI - - The user interface - * - GUI - - The graphical user interface - * - CLI - - The command-line interface - * - CG - - Computational Graph - diff --git a/contrib/nnc/doc/project/soft_backend_details.rst b/contrib/nnc/doc/project/soft_backend_details.rst deleted file mode 100644 index 15f93e1..0000000 --- a/contrib/nnc/doc/project/soft_backend_details.rst +++ /dev/null @@ -1,168 +0,0 @@ -Glossary -~~~~~~~~ -+ **Tensor** - class that represents multidimensional array. - Provides user artifact interface (holds input data and results) and - keep temporary data generated by one operation and used by other. -+ **Operation** - neural network layer implementation, like 2d convolution, activation function, etc. - It consumes ``Tensor`` objects as input and produces one or multiple output ``Tensors`` -+ **Shape** - class that stores number and values of ``Tensor`` dimensions. -+ **Artifact** - product of soft backend execution, this artifact provides interface for user and - implements inference of provided computational graph - -Overview -~~~~~~~~ -Soft backend takes a pointer to a computational graph(Model IR) and -generates artifact in form of C++ source code file, header with artifact interfaces and -binary file that containing parameters of compiled network. - -Example of generated artifact interface: - -.. code-block:: c++ - - class NNModel - { - public: - // Constructor of artifact, it takes path to file with NN parameters - // Contents of this file is not stored in code directly - // because it could be enormous - NNModel(const std::string ¶metersPath); - - // Setter of NN input named "layer1" - // contents of in Tensor are copied to internals of Model - void set_layer1(const Tensor &in); - - // Getter of NN output named "result" - // Model creates result object during inference and - // holds reference to it until model is destroyed - // or another inference executed - std::stared_ptr get_result(); - - // This method exists in every aftifact, - // has fixed name that is not dependent on Model - // It is responsible for inference of result tensors from input data - void doInference(); - }; - -Common usage: - -.. code-block:: c++ - - std::string pathToParameters = "nnmodel.params"; - NNModel model(pathToParameters); - Tensor inputTensor; - for (int i = 0; i < numInputs; ++i) - { - fillInput(inputTensor); - model.set_layer1(inputTensor); - model.doInference(); - std::shared_ptr result = model.get_result(); - showResult(*result); - } - - -Soft backend has three main phases: analysis, serialization and artifact generation. - -* Analysis implemented by ``ModelAnalyzer`` class, -* Serialization implemented by ``Serializer`` class, -* Generation of header, code and parameter files implemented by ``BaseCodeGenerator`` class and derived classes. - -General generation sequence -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Main backend sequence can be found in ``BaseCodeGenerator::generate`` method: - -1. 
Apply ``ShapeInference`` visitor to determine output Shapes of operations, - this simplifies artifact code: no need to compute shapes during inference. -2. Apply ``ModelAnalyzer`` visitor to generate inference sequence and - find artifact inputs, output and temporary Tensors. -3. Apply ``Serializer`` visitor on inference sequence generated by ``ModelAnalyzer`` - to create binary array of parameters. -4. Call ``formatTensorNames`` method that adjusts input and output names - to target language naming convention(remove symbols(like '/') invalid in C++ names, etc.). -5. Create artifact output directory(set by ``--output-dir`` option, - possibility of this operation should be checked by driver systems). -6. Create header file in output directory, write it contents - by calling virtual ``materializeHeader`` method overrided - in particular soft backend class(CPPCodeGenerator, CCodeGenerator, etc). - This phase consumes data gathered by ``ModelAnalyzer``. -7. Create code file in output directory, write it contents - by calling virtual ``materializeCode`` method overrided - in particular soft backend class(CPPCodeGenerator, CCodeGenerator, etc); - This phase consumes data gathered by ``ModelAnalyzer``. -8. Create and fill file with model parameters. - This file contains a header (magic number, version of protocol, etc.) - to identify this file and avoid errors of wrong model + parameters combination - and raw data colected by serializer. - -Inference sequence construction -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -``ModelAnalyzer`` object walks computational graph on phase 2 from general sequence -in topological order and creates layout of it's operations. This sequence is represented -by list of ``ModelAnalyzer::OpDescr`` objects. - -``ModelAnalyzer::OpDescr`` object contains name of operation, -pointer to corresponding CG Node, ids of input and output Tensors. - -Information about artifact variables(Tensors) are stored in array of ``TensorDescription`` objects, -Mentioned object holds name of ``Tensor`` and it's properties(is it input/output/temporary). -Every ``Node`` node emits "input" type ``Tensor`` variable. -Every node with name(what is not ``Node``) emits "output" type ``Tensor`` variable that holds operation output. -Node without particular name creates temporary ``Tensor``, that is not accessible outside Model; - -Serialization -~~~~~~~~~~~~~ -``Serializer`` object visits CG nodes and stores corresponding data in raw binary format -in internal array of bytes. Every operation receives unique -(with exception of operations that has nothing to serialize, for example relu) offset in this array -where it's data stored. This offset is stored in ``_paramStartOffset`` of ``ModelAnalyzer::OpDescr`` object. - -Shape, strides are stored as arrays of integers: first value serialized is array size, then all dimension values. -Pads are stored as vectors too, but they have additional value to save: -padding type. It is not used in inference yet, because all ``Tensor`` ``Shapes`` are available at compile time. -To serialize ``Tensor`` objects (like convolution kernel, fully-connected weights) ``Serializer`` dumps ``Shape`` first, -then actual data in form of C multidimensional array(data stored continuously like ``int a[100][100][3]``). - -Def files -~~~~~~~~~ -Generator has number of ``.def`` files. This files contain code snipets used to generate the artifact. -This is classes and type declarations for artifact, library of operations, support functions. 
- -Build system converts them into headers containing char arrays with contents of ``def`` files. -Is generator need to include some snippet into generated code it just prints contents of this generated array. - -Header generation -~~~~~~~~~~~~~~~~~ -``materializeHeader`` method of derivative of ``BaseCodeGenerator`` implements generation of header file. - -C++ backend generates artifact class in header file that contains: -+ constructor that takes path to parameter file -+ destructor that frees taken resources -+ setters of model inputs. Every function has unique name taken from CG node. -These setters correspond to "input" tensors found by ``ModelAnalyzer`` object. -+ getters of model products. Every function has unique name taken from CG node; -These getters corrspond to "output" tensors found by `ModelAnalyzer` object. - -Also header file contains a number of helper functions and types(``Shape``, ``Tensor``) defined in ``.def`` files. -These types and methods are needed by users of the artifact. - -Code generation -~~~~~~~~~~~~~~~ -``materializeCode`` method of derivative of ``BaseCodeGenerator`` implements generation of code file. - -First the backend writes the required snippets from ``.def`` files. This includes operation implementations, -helper functions (like parameter file mapping and unmapping). - -Then the artifact interface implementation is written: artifact constructor, destructor, setters, getters, ``doInference``. - -Constructor and destructor call support functions from included snippets to manage parameters. - -Setters and getters are trivial and contain assignments or ``return`` statements. - -``doInference`` function contains "initilizer" section and actual inference. -"initializer" section resets all "output" variables so reference to them are dropped. -Inference part is generated from the inference sequence that was constructed by ``ModelAnalyzer`` object. -Operations are represented by calls from support library. -General form of operation looks like ``opName(outputTensor, paramsPtr, inputTensor1, inputTensor2);``. -If an operation defines a temporary variable, then the temporary variable is allocated before the point of call. -``paramsPtr`` arguments corresponds to expression -`` + `` -data offset is defined by ``Serializer``. \ No newline at end of file -- 2.7.4