Jihoon Lee [Thu, 14 Oct 2021 11:06:23 +0000 (20:06 +0900)]
[Realizer] Implement slice realizer
This patch implements the slice realizer and its tests
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Fri, 15 Oct 2021 11:37:20 +0000 (20:37 +0900)]
[layer/test] Bug fix for dropout + unittest
This patch adds a bug fix for dropout in both training and inference
mode. Further, unittests are added:
- when dropout_rate is 0 or 100%, all the values are checked
- when dropout_rate is between 0 and 100, a weak check ensures that the
values are either equal or one of the values (golden vs output) is 0
- when dropout_rate r is between 0 and 100, a strong check ensures that
at least 100 - 2*r percent of the values match
All the checks are performed in the unittests.
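The weak and strong checks above can be sketched as follows; this is a hypothetical NumPy illustration of the described semantics, not nntrainer's test code (`weak_check` and `strong_check` are invented names):

```python
import numpy as np

def weak_check(golden, output):
    """Weak check: every element either matches the golden value or was
    zeroed out by dropout (in either the golden or the output tensor)."""
    golden, output = np.asarray(golden), np.asarray(output)
    return bool(np.all((golden == output) | (golden == 0) | (output == 0)))

def strong_check(golden, output, rate):
    """Strong check: with dropout rate r (as a fraction), at least
    1 - 2*r of the values must match exactly, since each tensor
    independently zeroes roughly a fraction r of its elements."""
    golden, output = np.asarray(golden), np.asarray(output)
    matched = np.mean(golden == output)
    return bool(matched >= 1.0 - 2.0 * rate)
```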
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Thu, 14 Oct 2021 03:30:07 +0000 (12:30 +0900)]
[Recurrent] Add timestep property to recurrent layers
This patch adds timestep property setting support to the
recurrent wrapper.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Fri, 15 Oct 2021 03:55:36 +0000 (12:55 +0900)]
[packaging] Debian launchpad buildfix for focal
This patch provides a build fix for focal on Launchpad.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 13:17:34 +0000 (22:17 +0900)]
[network] Bug fix for trainable with supportBackwarding
This patch resolves a bug in calling calcDerivative on layers which
do not support backwarding.
The issue arises from the assumption that no layer requiring
backwarding comes after the last non-trainable layer. But this is
not valid for the multi-input scenario. With multi-input, there can be
two or more input layers, none of which support backwarding, while the
rest of the model is trainable.
This patch changes this error check. Below are the updated semantics:
1. After initialization of the graph, a check is added to ensure that
for each trainable layer, all the layers ahead of it support
backwarding, so that the trainable layer can be trained. If any layer
ahead of it does not support backwarding, an error is thrown.
2. A layer is only trainable if its trainable property is set to true
(defaults to true) and contains at least 1 weight. If a layer does not
contain any weights, the layer is treated as non-trainable.
3. When backwarding the model, backwarding is called only for layers
which support backwarding, and skipped for layers which do not.
The updated semantics ensure the dependency of the flow of
derivatives and allow a mixture of layers which do and do not
support backwarding.
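The initialization-time check described in point 1 can be sketched as follows. This is a hypothetical illustration of the semantics, not nntrainer's graph API; `layers` is a forwarding-ordered list of `(trainable, supports_backwarding)` flags:

```python
def check_backwarding(layers):
    """For each trainable layer, every layer ahead of it (i.e. executed
    after it in forwarding order, between it and the loss) must support
    backwarding so its derivative can reach it; otherwise raise."""
    for i, (trainable, _) in enumerate(layers):
        if not trainable:
            continue
        for _, supports_bwd in layers[i + 1:]:
            if not supports_bwd:
                raise ValueError(
                    "trainable layer has a non-backwarding layer ahead of it")
```

Note how two non-backwarding input layers followed by a trainable layer pass the check (the multi-input case from above), while a non-backwarding layer after a trainable one fails.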
Resolves #1017
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Thu, 14 Oct 2021 06:55:15 +0000 (15:55 +0900)]
[Model/API] renew addWithReferenceLayers
This patch adds a refactored version of addWithReferenceLayers
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Wed, 13 Oct 2021 13:26:53 +0000 (22:26 +0900)]
[Model] Add Model::addWithReferenceLayers
This patch adds the addWithReferenceLayers prototype
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 11:58:30 +0000 (20:58 +0900)]
[layer] Support setBatch for attention layer
This patch provides setBatch support for the attention layer.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 10:36:49 +0000 (19:36 +0900)]
[layer] Add constructor for attention layer
This patch adds a constructor for the attention layer.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 03:28:34 +0000 (12:28 +0900)]
[layer] Support single timestep for lstm
This patch adds support for single timestep for lstm.
This is achieved with two external properties:
1. timestep - provides the current timestep for which lstm will run
2. max_timestep - the maximum timestep till which lstm will run
This patch also verifies that this LSTM implementation already does gradient stacking appropriately.
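The two properties can be illustrated with a minimal sketch of driving a single-timestep cell across an unrolled sequence; `step_lstm` is a hypothetical single-step cell function, not nntrainer's LSTM API:

```python
def run_unrolled(step_lstm, inputs, max_timestep):
    """Run a single-timestep cell for timestep = 0 .. max_timestep - 1,
    threading the hidden/cell state through each call. `timestep` is the
    current step; `max_timestep` bounds how far the cell runs."""
    h = c = None
    outputs = []
    for timestep in range(max_timestep):
        h, c = step_lstm(inputs[timestep], h, c, timestep=timestep)
        outputs.append(h)
    return outputs
```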
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 10:20:11 +0000 (19:20 +0900)]
[layer] Bug fix of setBatch for LSTM/GRU/RNN
LSTM/GRU/RNN request tensors from the manager, and the shapes of these
tensors depend on the batch size. However, the layers did not override
setBatch to update the batch size of the requested tensors. This patch
provides the corresponding bugfix.
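The shape of the bug can be sketched as follows; this is a hypothetical illustration (the class and method names are invented, not nntrainer's layer interface):

```python
class RecurrentLayerSketch:
    """A layer that requests tensors whose first dimension is the batch
    size must override set_batch, or the shapes keep the stale batch."""

    def __init__(self):
        self.requested_shapes = {}  # name -> [batch, channel, height, width]

    def request_tensor(self, name, shape):
        self.requested_shapes[name] = list(shape)

    def set_batch(self, batch):
        # The fix: update the batch dimension of every requested tensor.
        for shape in self.requested_shapes.values():
            shape[0] = batch
```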
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 10:06:59 +0000 (19:06 +0900)]
[layer] Remove setBatch for init context
This patch removes setBatch for the init context from the layer
interface. setBatch now only needs to be set for the runContext.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 14 Oct 2021 08:51:37 +0000 (17:51 +0900)]
[graph/model] Support for multi-label/input for the model
This patch adds support for multi-label while training the model.
- Multiple labels/inputs are now allowed for the model by taking the
dimensions from the graph rather than from the first and last nodes
- Outputs are now also taken from the graph for validation
Another bug fix is added related to setBatch. The cached input and label
dimensions were not updated when the batch size was updated in the
network graph. This patch fixes the batch size update in the network
graph.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Thu, 14 Oct 2021 02:00:01 +0000 (11:00 +0900)]
[Sharing] Skip saving shared weights
If a weight is not the original, skip saving it
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 16:57:05 +0000 (01:57 +0900)]
[Recurrent] Implement finalizing graph
This patch implements finalizing the graph with/without the return
sequence property
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 16:12:03 +0000 (01:12 +0900)]
[Recurrent] Implement unrolling
This patch implements unrolling in RecurrentRealizer
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 15:00:10 +0000 (00:00 +0900)]
[LayerNode] Add cloneConfiguration function
This patch adds the cloneConfiguration function, which creates a new
node from an existing node
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 14:50:28 +0000 (23:50 +0900)]
[Recurrent] Add verification and preparation
This patch adds logic to verify and add connections between inputs and
external inputs
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 12:41:43 +0000 (21:41 +0900)]
[Realizer] Implement remap realizer
This patch introduces the remap realizer, which remaps identifiers
inside a graph representation. Please refer to the test to see what this
realizer does.
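A minimal sketch of what identifier remapping could look like, assuming the graph representation is a list of nodes with names and input connections (this representation and the function name are assumptions, not nntrainer's actual structures):

```python
def remap_identifiers(graph, fn):
    """Apply `fn` to every identifier in the graph representation:
    each node's own name and each of its input connection names."""
    for node in graph:
        node["name"] = fn(node["name"])
        node["input_layers"] = [fn(i) for i in node["input_layers"]]
    return graph
```

A typical use would be prefixing every identifier with a scope, so a subgraph can be instantiated multiple times without name collisions.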
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 11:02:53 +0000 (20:02 +0900)]
[Recurrent] Add skeleton of recurrent realizer
This patch adds the skeleton and some basic verification
for the recurrent realizer
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
jijoongmoon [Wed, 13 Oct 2021 11:45:58 +0000 (20:45 +0900)]
[ Conv1D ] Add Skeleton code for Conv1D Layer
This commit includes:
. Skeleton code for conv1D
. Padding1D Property
. minor fixes
Signed-off-by: jijoongmoon <jijoong.moon@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 08:37:51 +0000 (17:37 +0900)]
[Realizer] Apply flatten realizer
This patch applies the flatten realizer at model compile. Later,
neuralnet will not have a model_graph until compile() is called
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 08:34:21 +0000 (17:34 +0900)]
[Model] Add memory optimization property
This patch adds a memory optimization property to the neural network.
The main purpose is to fix the memory optimization boolean so that it is
applied only at neuralnet::compile();
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 12 Oct 2021 06:49:03 +0000 (15:49 +0900)]
[Test/realizer] Add flatten realizer test
This patch adds flatten realizer test
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Wed, 13 Oct 2021 10:48:28 +0000 (19:48 +0900)]
[Sharing] Implement tensor sharing
This patch implements tensor sharing.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Wed, 13 Oct 2021 05:38:24 +0000 (14:38 +0900)]
[test/layers] Add gru layer testing
This patch adds a gru layer unittest to the layer golden tests.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Sat, 9 Oct 2021 08:54:21 +0000 (17:54 +0900)]
[Graph] Add realizer test skeleton
This patch adds a realizer test skeleton, separating utils into compiler
test utils
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Sat, 9 Oct 2021 08:26:47 +0000 (17:26 +0900)]
[Graph/recurrent] Add concept of realizer
This patch adds a graph realizer. The graph realizer preprocesses the
graph, which can effectively be done as a lowering step of compile
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Sat, 9 Oct 2021 07:54:53 +0000 (16:54 +0900)]
[Interpreter] Change signature of interpreter
Instead of returning a networkgraph from the interpreter, it returns the
graph representation, which is a specification to generate a graph.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Sat, 9 Oct 2021 07:15:06 +0000 (16:15 +0900)]
[Fix] Tflite Interpreter disable
This patch updates the tflite interpreter to pass the test.
The main problem was that, previously, tensors to be saved were
identified by whether they were unallocated. Now this is decided
explicitly by their kind (weight, input, output)
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Sat, 9 Oct 2021 01:51:20 +0000 (10:51 +0900)]
[inputs] remove multi input realization
This patch removes the multi-input realization behavior for now; it is
not used elsewhere, so it is safe to remove
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Wed, 13 Oct 2021 02:18:36 +0000 (11:18 +0900)]
[layer] Update dropout rate property name
Update dropout rate property name from `dropout` to `dropout_rate`.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Fri, 8 Oct 2021 11:25:07 +0000 (20:25 +0900)]
[Debian] Disable debug on normal package build
This patch disables debug on normal package builds
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Fri, 8 Oct 2021 11:36:58 +0000 (20:36 +0900)]
[test] Added unittests for LSTM
This patch adds unittests for LSTM layer
1. single and multi timesteps
2. with and without return sequences
3. with setting activations differently
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 8 Oct 2021 08:33:12 +0000 (17:33 +0900)]
[layer] Attention support for different key value
This patch adds support for different values of key and value to
be given to the attention layer.
Corresponding unittests are also added.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Fri, 8 Oct 2021 03:55:45 +0000 (12:55 +0900)]
[Fix] Prevent connecting input layer placed in the middle
This patch updates the network graph to prevent making connections when
an input layer is in the middle
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Fri, 8 Oct 2021 04:37:14 +0000 (13:37 +0900)]
[Trivial] Add tensor dim constructor from array
This patch adds tensor dim constructor from array
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Thu, 7 Oct 2021 11:06:45 +0000 (20:06 +0900)]
[Trivial] Open up util_func.h header
This patch opens up util_func.h to devel packages
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Wed, 6 Oct 2021 16:54:15 +0000 (01:54 +0900)]
[Neuralnet] set input, output layers
This patch enables setting multiple input and output layers explicitly
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Wed, 6 Oct 2021 16:38:50 +0000 (01:38 +0900)]
[Neuralnet] Add property of input layers, label layers
This patch adds input layers and label layers properties to neuralnet
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Wed, 6 Oct 2021 16:19:09 +0000 (01:19 +0900)]
[graph] update getter of input/output dims
This patch updates input/output dims to properly reflect the model input
and output dimensions, not just a single object.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Wed, 6 Oct 2021 15:39:09 +0000 (00:39 +0900)]
[Graph] Identify model_input, model_label
This patch adds the ability of the graph to identify model_input and
model_label in a determined order, following the semantics described in #1374
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Thu, 7 Oct 2021 08:34:31 +0000 (17:34 +0900)]
[test] Add unittest for attention layer
This patch adds unittest for attention layer.
- Backwarding implementation is fixed for attention layer
- wider-coverage unittests are added
- the layer golden test is updated to generate float input data, which
is needed for the attention
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 5 Oct 2021 07:23:27 +0000 (16:23 +0900)]
[test] Add unittest for attention layer
This patch adds unittest for attention layer:
- unittest generator for layers is updated to work for multi-input
layers
- initial unittest for attention layer is added
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 5 Oct 2021 07:21:51 +0000 (16:21 +0900)]
[layer] attention layer backwarding match
This patch adds a bug fix for the backwarding operation of the attention
layer.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 5 Oct 2021 05:27:06 +0000 (14:27 +0900)]
[layer] Attention layer bugfix
This patch adds bugfix for the forwarding of the attention layer.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 5 Oct 2021 05:25:29 +0000 (14:25 +0900)]
[layer] Bug fix for softmax operation
The current implementation of the softmax operation unintentionally
flattens the tensor and calculates softmax over the last 3 dimensions
of the given tensor.
This patch updates the softmax operation to apply only along the
last dimension.
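The fixed behavior can be illustrated with a minimal NumPy sketch (not nntrainer's Tensor API) applying softmax along the last axis only:

```python
import numpy as np

def softmax_last_dim(x):
    """Softmax along the last axis only (the fixed behavior), rather
    than flattening and normalizing over the last 3 dimensions."""
    x = np.asarray(x, dtype=float)
    shifted = x - x.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)
```

With this, each row along the last dimension sums to 1 independently, instead of the whole b x h x w slice normalizing jointly.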
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Wed, 6 Oct 2021 12:18:04 +0000 (21:18 +0900)]
[WeightSharing] Remove zero grad
This patch removes the zero grad function, at the cost that the layer
should handle those scenarios itself
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 11:12:45 +0000 (20:12 +0900)]
[Test] Add recurrent model test
This patch contains an initial test for the recurrent model.
In this patch, three fc layers share the same weights
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 11:08:12 +0000 (20:08 +0900)]
[fclayer] Update gradient to accumulate
This patch updates the gradient calculation to accumulate for the fully
connected layer.
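A sketch of the change, assuming the usual fully connected gradient with row-vector inputs (this is an illustration, not nntrainer's Tensor code): the weight gradient is accumulated in place rather than overwritten, so shared weights collect contributions from every layer that uses them.

```python
import numpy as np

def calc_gradient_accumulate(grad, derivative, input_):
    """grad += input^T . derivative (in place), instead of grad = ...,
    so repeated calls across shared layers sum their contributions."""
    grad += input_.T @ derivative
    return grad
```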
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 11:03:10 +0000 (20:03 +0900)]
[Tensor pool] Query execution order by source
This patch enables querying execution order by the source tensor, as a
dependent tensor does not have the ground truth.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 06:52:47 +0000 (15:52 +0900)]
[Layer] Add constant derivative layer
This patch adds constant derivative layer. This layer will be used to
simulate a backward operation without any loss.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 04:43:37 +0000 (13:43 +0900)]
[WeightSharing] enable weight sharing from manager
This patch enables weight sharing from manager.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 04:27:25 +0000 (13:27 +0900)]
[WeightSharing] Pass shared_name from the original
This patch creates shared_weight_names from the original source
and passes them to the manager.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Fri, 1 Oct 2021 08:07:45 +0000 (17:07 +0900)]
[Property] Add shared_from key to the layer node
This patch adds the shared_from key to the layer node
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Tue, 5 Oct 2021 02:05:25 +0000 (11:05 +0900)]
[WeightSharing] Implement isFirst/lastAccess
This patch implements isFirstAccess and isLastAccess, making nntrainer
ready for weight sharing, while fixing an overriding issue in the
example pow layer
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Fri, 1 Oct 2021 05:41:31 +0000 (14:41 +0900)]
[Recurrent] Add zero grad / delegate apply gradient
This patch adds a grad-zeroing mechanism and delegates applying
gradients to the network graph.
The main reason for this change is that, when sharing gradients and
derivatives: 1. the value has to be accumulated starting from zero, and
2. the gradient has to be applied only at the last access.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Fri, 1 Oct 2021 03:55:17 +0000 (12:55 +0900)]
[Recurrent] Propagate Trainable variable to weights
This patch propagates the trainable variable to weights, to prepare for
weight sharing
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Thu, 7 Oct 2021 04:22:26 +0000 (13:22 +0900)]
[meson] Disable enable-debug by default
This patch sets enable-debug to false by default; it was mistakenly
enabled by #1607.
enable-debug is set to true for the ubuntu and tizen builds in CI only,
with unit_test set to true in the CI.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
jijoong.moon [Fri, 24 Sep 2021 04:44:24 +0000 (13:44 +0900)]
[ API ] Add Inference in CCAPI to get loss value
Add Inference API to get the loss value
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Parichay Kapoor [Thu, 7 Oct 2021 03:32:33 +0000 (12:32 +0900)]
[pkg] Enable debug mode for CI
This patch enables debug mode for the CI build for both ubuntu and
tizen. This enables all the debug tests to be done in the CI which were
disabled till now.
Fixes required to enable DEBUG mode are also added.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Wed, 6 Oct 2021 07:32:47 +0000 (16:32 +0900)]
[layer] Add checks for layer tensor overwrite bug
This patch adds checks to ensure that newly created layer tensors do not
overwrite existing tensors. These checks are enabled only in DEBUG
mode so that they run only in CI, and are called after each
operation: forwarding, calcGradient, and calcDerivative.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 09:46:52 +0000 (18:46 +0900)]
[conv] Update temporary memory requests
This patch updates the request for temporary memory in the convolution
layer.
- im2col and col2im results are the same size and used exclusively of
each other, but both were requested for backwarding. So, instead of
requesting both, they can share their memory.
- as the values in these tensors can be discarded between forwarding and
backwarding, two independent tensors are requested for forwarding and
backwarding so that the memory can be reused in the intermediate
duration.
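The condition that lets two buffers alias the same memory can be sketched with a generic lifetime-overlap check; `can_share` is a hypothetical helper, not nntrainer's TensorPool API:

```python
def can_share(intervals_a, intervals_b):
    """True if two tensors' execution-order lifetime intervals never
    overlap, so they can alias one allocation (e.g. the im2col result
    used in calcGradient vs the col2im result used in calcDerivative)."""
    return all(a_end <= b_start or b_end <= a_start
               for a_start, a_end in intervals_a
               for b_start, b_end in intervals_b)
```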
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Wed, 6 Oct 2021 03:17:28 +0000 (12:17 +0900)]
[Dataset/test] Update batch before creating tensor
Dataset sample creation now creates a tensor after updating the batch
size to one
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
resolves #1604
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Fri, 1 Oct 2021 08:23:04 +0000 (17:23 +0900)]
[layer] Add backwarding for attention layer
This patch adds backwarding for attention layer. Corresponding unittests
will be added in the next patch.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 1 Oct 2021 06:09:21 +0000 (15:09 +0900)]
[layer/test] Add basic test for attention
This patch adds basic unittest for attention layer.
To achieve this, the existing tests are modified to support multiple
inputs in the test format.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 1 Oct 2021 06:01:18 +0000 (15:01 +0900)]
[layer] Scaffolding attention layer
This patch adds the initial commit for attention layer.
- add class description
- add basic forwarding
This implements the common form of attention layer where key and value
are the same tensor. The other format will be supported soon.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 1 Oct 2021 05:59:49 +0000 (14:59 +0900)]
[layernode] Bug fix for finalize
This patch adds a bug fix for finalize of the layer node. The checks of
the input dimensions and input shapes have been fixed for when multiple
inputs are expected to be set.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Fri, 24 Sep 2021 08:10:25 +0000 (17:10 +0900)]
[Test] Add warmup to the golden layer
This patch adds warmup forwarding to the layer golden test
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Fri, 24 Sep 2021 08:00:39 +0000 (17:00 +0900)]
[Test] Add conv2d golden tests
**Changes proposed in this PR:**
- Conv2d Golden tests
- remove _golden_ from the names of test cases, as it is attached in the
extension
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Tue, 28 Sep 2021 10:59:50 +0000 (19:59 +0900)]
[batchnorm] Optimize batch norm backwarding
Remove the extra full-size memory requirement. The difference in memory
requirement can be significant: the earlier requirement was b*c*h*w,
whereas now it is just c, under the assumption that batch norm
normalizes along axis=channel.
This is achieved by reordering of the operations.
Note: this change has no performance impact.
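The savings can be made concrete with the stated formulas (the shapes below are assumed, illustrative values only):

```python
def temp_elems_before(b, c, h, w):
    """Temporary elements before: one full-size buffer, b*c*h*w."""
    return b * c * h * w

def temp_elems_after(b, c, h, w):
    """Temporary elements after: one value per channel, just c."""
    return c
```

For example, with b=32, c=64, h=w=28, the temporary drops from about 1.6M elements to 64.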
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 28 Sep 2021 10:40:33 +0000 (19:40 +0900)]
[batchnorm] Optimize batch norm forward memory
Reduce the memory consumption of batch norm forwarding by re-using
the output tensor as temporary memory rather than requesting new memory.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Wed, 6 Oct 2021 02:35:50 +0000 (11:35 +0900)]
[rebase] Rebase fix
This patch adds rebase fix.
Further some of the temporary fixes in the previous commits are also
removed.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Mon, 6 Sep 2021 09:39:37 +0000 (18:39 +0900)]
[in-place] Make input layer work in-place
This patch makes the input layer work in-place. This is done by
supporting externally allocated tensors in the tensorPool, and making
the inputs of the input layer and the labels externally allocated
tensors. The input layer is updated to work in-place.
Further the methodology to set inputs and labels has also been updated.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Mon, 6 Sep 2021 04:50:40 +0000 (13:50 +0900)]
[inplace opt] Support in-place no-op flatten layer
This patch updates the flatten layer to be a no-op layer. This is done
with the flatten layer setting the input and output shapes at finalize
time and making flatten layer execute in-place. Changes in this patch:
1. requestPreallocatedTensor() in TensorPool now returns a new tensor
which will eventually share the memory with the preallocated tensor,
rather than returning the preallocated tensor itself. This allows tensor
metadata (like name, shape, etc) to be changed while sharing the memory.
This is done by storing the dependency link between tensors in the
token. Corresponding unittests are also added.
2. Manager now supports giving shared tensors for outputs (shared with
some inputs) to support in-place running of some layers.
3. Flatten layer is updated to be a basic no-op and to perform
flattening once at compile time.
4. Update flatten layer supportBackwarding to true
5. Input layer updated to not edit tensor mapping. Input layer will be
updated to be in-place in the next patch.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 08:59:21 +0000 (17:59 +0900)]
[graph/manager] Enable memory v1 optimizations
This patch adds interface to enable memory optimizations with the neural
network. Enabling the interface changes the planner being used for the
memory allocation.
With this patch, OptimizedV1Planner is put to use when enabling
optimizations.
Unittest of models is updated to disable optimizations in the
non-optimized test cases.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 08:40:09 +0000 (17:40 +0900)]
[bug fixes] Add bug fixes to manager + tensor pool
This patch adds the corresponding bug fixes:
- end validity for each requested tensor is fixed
- execution order for requestInputs and requestOutputs are fixed
- output is specially handled for activation layer as it uses output in
derivative calculation
- gradient of the weights is kept valid for calcDerivative as well
because gradients are applied to weights after calcDerivative.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 08:37:43 +0000 (17:37 +0900)]
[planner] Update optimized v1 planner
The optimized v1 planner is updated to reuse expired allocations which
are not at the top of the sorted list. This is a heuristic which works
well for the training memory usage scenario.
The unittest is also updated to be more generic when checking the
interval overlap with the allocations.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 08:31:02 +0000 (17:31 +0900)]
[networkgraph] Bug fix in execution order calculation
Added a bug fix in the execution order calculation, where some execution
values were missed in the previous implementation.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 08:27:52 +0000 (17:27 +0900)]
[flatten] Remove in-place implementation
Flatten layer in-place implementation manipulates the tensor stored in
its input and output. This violates the execution order usage of the
tensors and does not work with the memory planning setup.
This patch makes the flatten layer work out-of-place with a copy.
A follow-up patch will soon make flatten work in-place with a proper
method.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 1 Oct 2021 04:24:10 +0000 (13:24 +0900)]
[rebase] Add rebase fix
Add a rebase fix to the memory unittest. The memory unittest was updated
to shuffle the validity of the requests, but the unittest check was
not ready for it. This patch sorts the validity before
performing the check, corresponding to the previously added shuffle.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 7 Sep 2021 01:59:53 +0000 (10:59 +0900)]
[memoryPool] Release memory on destroy
This patch adds releasing all the memory upon destruction of the memory
pool.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 04:53:29 +0000 (13:53 +0900)]
[planner] Optimized v1 planner unittests
This patch adds unittests for the optimized v1 planner.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 3 Sep 2021 04:50:57 +0000 (13:50 +0900)]
[planner] Add optimized v1 planner
This patch adds the optimized v1 planner for memory sharing.
This planner assigns memory in increasing order of validity start,
and then by decreasing order of validity end. This matches
the memory use pattern while training the model.
This planner is supposed to work better for training than for inference.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
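The described ordering can be sketched as follows (a minimal first-fit sketch in Python with illustrative names, not the actual OptimizedV1Planner interface): requests are sorted by ascending start of validity, ties broken by descending end, then placed greedily while expired assignments are dropped.

```python
def plan(requests):
    """requests: list of (size, start, end) validity intervals.
    Returns byte offsets assigned in the described order: ascending
    start of validity, then descending end of validity."""
    order = sorted(range(len(requests)),
                   key=lambda i: (requests[i][1], -requests[i][2]))
    offsets = [0] * len(requests)
    assigned = []  # (offset, size, end) of currently live assignments
    for i in order:
        size, start, end = requests[i]
        # drop assignments whose validity ended before this request starts
        assigned = [a for a in assigned if a[2] > start]
        # first-fit: scan live assignments in offset order for a gap
        offset = 0
        for off, sz, _ in sorted(assigned):
            if offset + size <= off:
                break
            offset = max(offset, off + sz)
        offsets[i] = offset
        assigned.append((offset, size, end))
    return offsets
```

Overlapping lifetimes get disjoint offsets, while memory whose validity has ended is reused by later requests.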
Parichay Kapoor [Thu, 2 Sep 2021 06:59:30 +0000 (15:59 +0900)]
[rebase] Rebase fix
This patch adds a rebase fix.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 27 Aug 2021 12:30:59 +0000 (21:30 +0900)]
[resnet] Enable resnet application
This patch enables resnet application with some fixes:
- setBatchSize updated for handling gradient tensors
- added input layer for resnet
- passed arguments correctly for resnet test application
- tensor allocation for ENABLE_TEST mode updated
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Fri, 24 Sep 2021 11:58:10 +0000 (20:58 +0900)]
[batchnorm] Optimize batch norm layer
This patch optimizes batch norm layer and tries to share the
calculations performed in calcGradient and calcDerivative.
- reuse dbeta and dgamma calculations
- reduce number of required temporary variables
- create all the required tensor variables with context
- add support for checking if the layer is trainable or not via run
context
- support average operation with the output tensor already allocated
- this patch reduces as much memory as possible without sacrificing
speed. More memory optimization is possible at the expense of speed but
has been omitted for now.
Note: this patch has slight improvement in performance, and adds no
extra operations.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
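The sharing idea can be sketched numerically (a hedged NumPy sketch assuming per-feature normalization over the batch axis; `BNSketch` and its method names are illustrative, not nntrainer's API): since sum(dy) equals dbeta and sum(dy * xhat) equals dgamma, calcDerivative can reuse the reductions already computed in calcGradient instead of recomputing them.

```python
import numpy as np

class BNSketch:
    def forward(self, x, gamma, beta, eps=1e-5):
        mu = x.mean(axis=0)
        self.invstd = 1.0 / np.sqrt(x.var(axis=0) + eps)
        self.xhat = (x - mu) * self.invstd  # cached for both backward passes
        return gamma * self.xhat + beta

    def calc_gradient(self, dy):
        # the two reductions needed for the weight gradients
        self.dgamma = (dy * self.xhat).sum(axis=0)
        self.dbeta = dy.sum(axis=0)
        return self.dgamma, self.dbeta

    def calc_derivative(self, dy, gamma):
        # reuses xhat, invstd, dgamma and dbeta from the passes above
        n = dy.shape[0]
        return gamma * self.invstd * (dy - self.dbeta / n
                                      - self.xhat * self.dgamma / n)
```

A convenient sanity check: for a constant upstream gradient the input derivative vanishes, because the batch mean terms cancel exactly.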
Parichay Kapoor [Fri, 24 Sep 2021 09:08:38 +0000 (18:08 +0900)]
[batchnorm] Optimize batch normalization implementation
This patch optimizes batch normalization implementation.
Reduces the temporary memory allocation and provides a speedup for the
layer execution. With this patch, resnet18 runtime improves by approx
5%.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Fri, 24 Sep 2021 06:05:15 +0000 (15:05 +0900)]
[Test] Add bn layer inference test
**Changes proposed in this PR:**
- bn layer inference mode test
- add option to skip backwarding golden test
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Tue, 28 Sep 2021 08:20:34 +0000 (17:20 +0900)]
[rebase] Add rebase fix
This patch adds rebase fix:
- isLabelAvailable is checked based on allocation rather than on size
- var_grad initialize variable is temporarily updated until externally
allocated tensors are supported (#1544)
- 2 dataset unittests are disabled until externally allocated tensors are
supported with #1544
- other minor edits arising from rebase to main
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 26 Aug 2021 10:26:27 +0000 (19:26 +0900)]
[cleanup/fix] Cleanup + bugfix
This patch provides cleanup for manager and related classes:
- cleanup for Manager
- cleanup for Var_Grad and Weights
Fixes:
- Graphcore, NetworkGraph, and NeuralNet destructors fixed to not clear
their lists until they are destructed, as this caused a bug in the copy
constructor
- initialize and allocation for tensors are merged for model, graph and
manager interface. This removes unnecessary confusion and possible bugs
from these classes
- Pass appropriate start and end execution orders to manager for
allocation
- Support setInputs to set multiple inputs
- clean inputs/labels which are allocated by the dataset; otherwise the
manager tries to clear them. This is temporary until the dataset stops
using the manager
- memory pool to start giving tokens from 1 instead of 0. token 0 is
treated as non-requested tensor memory
- add interface to check if the tensor pool and memory pool have
allocated memory
- initialize all the tensors' tokens to 0 when finalizing multiple times
- tensor behavior update to not allow updateBatch() if it is allocated
- update setBatch to deallocate already allocated tensors, then update
the batch size, and then initialize and allocate the tensors
TODO:
- add an enableMemoryOptimization() option
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
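The token convention from the list above can be illustrated with a minimal sketch (hypothetical Python, not the actual MemoryPool interface): valid tokens start at 1, so a zero token can unambiguously mark tensor memory that was never requested.

```python
class MemoryPoolSketch:
    NOT_REQUESTED = 0  # token 0 is reserved for non-requested tensor memory

    def __init__(self):
        self.sizes = []  # request sizes; token t maps to self.sizes[t - 1]

    def request_memory(self, size):
        self.sizes.append(size)
        return len(self.sizes)  # first valid token is 1, never 0

    def is_requested(self, token):
        return token != self.NOT_REQUESTED
```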
Parichay Kapoor [Tue, 28 Sep 2021 05:45:50 +0000 (14:45 +0900)]
[manager] Temporarily handle external tensors
With rebase, the var_grads representing inputs cannot be null tensors.
This patch provides a temporary fix for this issue. Proper fix for the
issue is added in #1544.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Tue, 28 Sep 2021 05:37:26 +0000 (14:37 +0900)]
[layernode] Bug fix for loss
getLoss() was adding the loss from the loss layer to the layer's total
loss on every call, which resulted in wrong values.
This patch adds the corresponding fix.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
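The bug class is easy to reproduce with a sketch (illustrative Python; the real fix lives in the C++ LayerNode, not here): an accessor that accumulates on every call drifts, while the fixed version derives the total from its parts each time.

```python
class LossAccessorSketch:
    def __init__(self, loss_layer_loss):
        self.loss_layer_loss = loss_layer_loss
        self.total = 0.0

    def get_loss_buggy(self):
        # BUG: adds the loss-layer loss again on every call
        self.total += self.loss_layer_loss
        return self.total

    def get_loss_fixed(self):
        # FIX: compute the total from its parts instead of accumulating
        return self.loss_layer_loss
```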
Parichay Kapoor [Wed, 25 Aug 2021 08:26:21 +0000 (17:26 +0900)]
[Manager] Manager to use TensorPool for all requests
This patch updates manager to use TensorPool for all of its requests.
Other changes are as listed below:
- NetworkGraph now caches the inputs and labels Var_Grads, which can be
used directly by the model before execution.
- setLabels and setInputs in the model have been updated. Further,
manual setting of inputs and labels has been removed.
- Introduce clear in memory pool to clear any allocations and requests
- TensorPool clears any requests made before making more requests in
finalize
- models unittest updated to not pass the label when the loss is not given
in the model. Passing a label without a loss in the model now results in an error
- Manager updated to use the tensor pool for all the memory requests. The
allocation, initialization and deallocation have been correspondingly simplified
Much needed cleanup will be done in the next commit.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
[squash commit]
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Jihoon Lee [Thu, 23 Sep 2021 08:00:03 +0000 (17:00 +0900)]
[Test] Add bn layer training
**Changes proposed in this PR:**
- Add batchnormalization layer test (for channel, width axis)
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Jihoon Lee [Thu, 30 Sep 2021 09:58:56 +0000 (18:58 +0900)]
[Fix] gcc-7 has compile error
This patch fixes gcc-7 compile errors (-Wsign-compare,
-Wmaybe-uninitialized).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>
Parichay Kapoor [Mon, 27 Sep 2021 09:35:50 +0000 (18:35 +0900)]
[fix] Rebase fix
This patch adds fix when rebase to the main branch.
Due to significant changes to the main branch, certain patches have
been taken from future PRs:
- from #1539, commit
b58c72a54ab445f34f5839314df65e849f393b11 has been
cherry-picked to solve the issue of tensor shape related to batch size
changes
- from #1539, memory pool to start giving tokens from 1 instead of 0. token 0 is
treated as non-requested tensor memory
These patches might seem out of place, but instead of taking
partial functionality, either full commits have been cherry-picked here
or minor functionality has been manually imported.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Thu, 2 Sep 2021 06:59:30 +0000 (15:59 +0900)]
[tensorpool] Make tensor list ordered
This patch makes the tensor list in the tensorpool ordered.
This ensures that the requested tensors are initialized in the order of
their requests, which follows the order of the sorted graph.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
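The ordering guarantee can be sketched as follows (hypothetical Python, not the actual TensorPool interface; the tensor names are illustrative): requests are kept in a list in arrival order with a name-to-index map for lookup, so initialization walks them in the order of the sorted graph.

```python
class OrderedTensorPoolSketch:
    def __init__(self):
        self.tensors = []  # (name, dim) tuples, request order preserved
        self.index = {}    # name -> position, for lookup by name

    def request(self, name, dim):
        self.index[name] = len(self.tensors)
        self.tensors.append((name, dim))

    def finalize_order(self):
        # initialization follows request order, i.e. the sorted-graph order
        return [name for name, _ in self.tensors]
```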
Parichay Kapoor [Wed, 25 Aug 2021 04:48:46 +0000 (13:48 +0900)]
[Manager] Use TensorPool for Gradients
Use TensorPool for gradients of the weights.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
Parichay Kapoor [Mon, 23 Aug 2021 10:35:59 +0000 (19:35 +0900)]
[Manager] Manager use TensorPool for Weights
This patch updates manager to use TensorPool for its weights.
Corresponding changes to Var_Grad/Weights and Tensors are also added.
- setBatchSize() now takes dimensions from the user, and then provides
these updated dimensions to the manager. We can possibly remove
setBatch(RunLayerContext), which will be finalized in the next commit.
Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>