review.tizen.org Git - platform/core/ml/nntrainer.git/log

[meson] Update meson for ubuntu 20.04

Update meson to work with ubuntu 20.04
Also add some missing checks

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[docs] Add missing dependencies

Add missing dependencies required to build nntrainer with meson

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[Layer] Add eval mode for the training

**Changes proposed in this PR:**
- This patch adds an eval mode for the training forward pass and
fixes the batch normalization layer accordingly

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[ Fix ] Fix Logistic Regression Example Error

This PR includes fixes for the logistic regression application

Change forwarding function

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

Enable trainable property to layer

Set trainable value to false in constructor in activation layer, flatten_layer

Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>

[Tools] Fix bug that translayer cannot detect bn

Batch normalization layers from tf 2.3 were not detected in transLayer, so
this patch adds a new type to detect the batch normalization layer

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[transfer learning] Enable test on ubuntu

Enable testing of the trained model on ubuntu
Added check to ensure that nnstreamer is enabled

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[manager] Optimize input/output memory for inference

Optimize input/output memory for inference by using a shared buffer
where max(sum(input_l, output_l) for l in all layers) memory
is allocated for inference.
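
As a rough illustration of the sizing rule above, the sketch below computes the shared buffer size as the largest input-plus-output footprint of any single layer. `LayerIO` and `sharedBufferSize` are hypothetical names chosen for this example, not nntrainer's API.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-layer footprint; not an nntrainer type.
struct LayerIO {
  std::size_t input_bytes;
  std::size_t output_bytes;
};

// Size of one buffer that can hold the input and output of any single
// layer: max over all layers l of (input_l + output_l).
inline std::size_t sharedBufferSize(const std::vector<LayerIO> &layers) {
  std::size_t peak = 0;
  for (const auto &l : layers)
    peak = std::max(peak, l.input_bytes + l.output_bytes);
  return peak;
}
```

Without sharing, the same model would need the sum of all layer footprints rather than the maximum.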

Baseline working unittest added with models unittest which ensures
that inference works with and without optimizations without any
failures. Value verification tests are done by the nnstreamer subplugin of
nntrainer.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

Support sum value in profiler

Now profiler will show the avg, min, max, sum values

Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>

[Test] Disable deriv verification when opt is on

This patch disables derivative verification but only checks the whole
return derivatives.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Conv2d] Optimize layer loop

This optimizes layer loops by

- Minimize padding calculation
- Maximize cache hits by transposing the matrix
- Maximize cache hits by reordering the loop order
- ~use single offset to minimize offset calculation~
- ~add shortcut when kernel size is 1~

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Conv2d] Reuse im2col array by batch

This patch enables reusing the im2col array across a batch, which saves
the time spent initializing it to zero.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Conv2d] Change conv2d gemm to dot

- Change conv2dgemm to dot to enable optimization path inside dot
operation
- Add beta option to dot operation (C = alpha*A*B + beta*C)
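
A minimal sketch of what the beta option means, assuming row-major dense matrices stored in `std::vector<float>`. `dotSketch` is a hypothetical stand-in written as a naive triple loop; it is not the optimized path the commit refers to.

```cpp
#include <cstddef>
#include <vector>

// Row-major C (m x n) = alpha * A (m x k) * B (k x n) + beta * C.
// With beta == 0 the old contents of C are ignored, so C does not have to
// be zeroed beforehand; with beta == 1 the product is accumulated into C.
inline void dotSketch(const std::vector<float> &A, const std::vector<float> &B,
                      std::vector<float> &C, std::size_t m, std::size_t k,
                      std::size_t n, float alpha = 1.0f, float beta = 0.0f) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < k; ++p)
        acc += A[i * k + p] * B[p * n + j];
      const float prev = (beta == 0.0f) ? 0.0f : beta * C[i * n + j];
      C[i * n + j] = alpha * acc + prev;
    }
  }
}
```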

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[bugfix] Fix model path and dataset path in model_loader.cpp

Fix model path and dataset path to involve working directory path

Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>

[dist/tizen] Enable base unittests for tizen build

Enable nntrainer unittests for tizen build
Not sure why or when this got commented out,
but let's enable it

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[model] Optimize model input/output

Optimize the model's extra input/output memory allocation, which counts towards peak memory allocation.
Memory is allocated for the input of the input layer and the output/gradient of the output layer.
However, that memory is never used, as train_run() allocates a new buffer and passes it to the
input layer/loss layer.
This patch uses the already allocated memory from the input/loss layer to collect input/label data.

This patch also removes the extra parameters from forwarding/backwarding and the corresponding
with_val functions. Further, the two types of forwarding in the loss layer have been merged into one function.
Now, the loss layer and input layer do not need to be distinguished and can be treated as regular layers.

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[Conv] Optimize im2col

This patch optimizes im2col by...

- Add padding as an argument instead of passing pad value
- Skip creating padded tensor and assignment for padded index
- Refactor variable names for clarity

See also #824

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Tensor] Optimize accessor

This patch...
- inlines some accessors with the noexcept specifier to speed them up
- adds getValuePadded to reduce the memory copy needed to make a padded tensor

see also #825

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Cc: Parichay Kapoor <pk.kappor@samsung.com>
Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Fix] Assign default value for max_deriv size

This patch initializes max_derivative_size to avoid unexpected termination

resolves #834

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[model/test] Duplicate models test for optimization

Run models test twice, once with all the optimizations enabled
and then once with all the optimizations disabled.

This ensures that both the modes work properly.

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[activation] Making activation in-place

Made the activation layer in-place.
Each layer now allocates memory for its output rather than for its input.

For activation layer, if its memory is optimized, then the memory
for the layer behind activation layer is not allocated.
And the memory for the derivative of the activation layer is shared
among all such layers.
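
To make the in-place idea concrete, here is a small sketch using ReLU as the example activation. The function names are hypothetical and the buffers are plain `std::vector<float>`, not nntrainer tensors; the point is only that the activation overwrites its buffer instead of writing to a second one.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Forward: overwrite the buffer with ReLU(x), so no separate output
// buffer is needed for the activation.
inline void reluInPlace(std::vector<float> &buf) {
  for (float &v : buf)
    v = std::max(v, 0.0f);
}

// Backward: for ReLU the derivative mask can be recovered from the
// activation's own output (out > 0 exactly when in > 0), so the
// incoming derivative can also be updated in place.
inline void reluBackwardInPlace(const std::vector<float> &out,
                                std::vector<float> &deriv) {
  for (std::size_t i = 0; i < out.size(); ++i)
    if (out[i] <= 0.0f)
      deriv[i] = 0.0f;
}
```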

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[layer] Use gradient instead of variable for derivative

Use gradient instead of variable for derivative
Manager internally sets gradient memory same as variable for the optimization
but hides this kind of optimizations from the layer

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[manager] Manager tracks input/output memory

Manager tracks input/output memory and allocates it
based on if the execution is training or inference

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[inputlayer] Input layer must be non-trainable

Input layer must always be non-trainable as it does not support backwarding operation

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[layer] Move layer input/output management to manager

Move layer inputs/outputs memory management to the manager.
This is accomplished by replacing the use of NetBuffers with Var_Grad.

Now, all the memory of weights, gradients, inputs, outputs and derivatives
is managed by the manager, which allows more optimizations to be done with
inputs/outputs.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[Profiler] Change profiler specs

- Profiler time unit is changed: milli -> microsecond
- Now report is ordered by key

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Profiler] Apply ops level profiler

This patch attaches ops level profiler

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Profiler] Add event registerator

Profiler can now dynamically register an event and send it to
ProfileListener as of this patch, which also fixes a few bugs

resolves #814

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Manager] Add MMaped memory

There was a requirement to separate weight memory region and grad memory
region.
To easily separate those two, this patch introduces a new abstraction,
`MMapedMemory`, while separating the weight and grad mmap

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Manager/Fix] Disallow copy ctor of manager

Since the manager holds memory, it shouldn't be copied, as ownership
becomes unclear. This patch deletes the copy ctor / assignment ops and
changes the signatures of members and functions that use the manager
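
The pattern described above might look roughly like the following. `MemoryOwner` and `bytesManaged` are hypothetical names for illustration; the actual manager class has a different interface.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// A class that owns a buffer: copying is disabled so that exactly one
// object is responsible for the memory.
class MemoryOwner {
public:
  explicit MemoryOwner(std::size_t n) : buf_(new float[n]()), size_(n) {}

  MemoryOwner(const MemoryOwner &) = delete;
  MemoryOwner &operator=(const MemoryOwner &) = delete;

  std::size_t size() const noexcept { return size_; }

private:
  std::unique_ptr<float[]> buf_;
  std::size_t size_;
};

// Functions that previously took the owner by value now take a reference.
inline std::size_t bytesManaged(const MemoryOwner &m) {
  return m.size() * sizeof(float);
}

static_assert(!std::is_copy_constructible<MemoryOwner>::value,
              "copying the owner must not compile");
```

Any accidental pass-by-value then fails at compile time instead of silently duplicating or double-freeing the buffer.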

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Android] Manage ndk to deal with changes

1. Upgrade ndk version to 29
2. Add dependent library
3. Fix syntax for Application.mk

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Tensor] Add Tensor Wrap method

Add some factory methods to Tensor:
1. borrow external memory and use it
2. create from a shared pointer without a copy

To restrict unwanted use, those methods are static methods
called `Tensor::Wrap`
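
A sketch of the two factory variants, assuming a tensor that stores its data as `std::shared_ptr<float>`. `TensorSketch` is hypothetical and much smaller than the real `Tensor`; only the static `Wrap` name comes from the commit message.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

class TensorSketch {
public:
  // 1. Borrow caller-owned memory. The no-op deleter means this object
  //    never frees the buffer; the caller must keep it alive.
  static TensorSketch Wrap(float *buf, std::size_t len) {
    return TensorSketch(std::shared_ptr<float>(buf, [](float *) {}), len);
  }

  // 2. Share memory that is already managed by a shared_ptr, without
  //    copying the data.
  static TensorSketch Wrap(std::shared_ptr<float> buf, std::size_t len) {
    return TensorSketch(std::move(buf), len);
  }

  float *data() const noexcept { return data_.get(); }
  std::size_t size() const noexcept { return len_; }

private:
  // Private constructor: external memory can only be adopted via Wrap.
  TensorSketch(std::shared_ptr<float> d, std::size_t len)
    : data_(std::move(d)), len_(len) {}

  std::shared_ptr<float> data_;
  std::size_t len_;
};
```

Keeping these as named static methods rather than constructors makes the borrowing explicit at the call site.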

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[TensorDim] Add initializer list ctor

This patch adds a tensordim
initializer list ctor to easily pass as a functional argument

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[tensor] argmax bugfix

Apply memory allocation bugfix to argmax
where an empty vector is being addressed

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[tensor] Set stride for shared tensor

Set stride for shared tensor

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[layer] Support in-place batch normalization

Support in-place batch normalization where the batch normalization
input/output is not stored and is over-written by the next layer.

This patch removes the input/output memory requirement when using
batch normalization layer.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[ ARGMAX ] Fix bug about argmax

Need to fix how argmax is calculated in tensor

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[Test] Add macro to check if backbone is enabled

When backbone is not enabled, tests that depend on it fail.
This patch adds a define in the test so that the test can pass when backbone
is not enabled

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[svace] Initialize uninitialized members

nnstreamer_layer had two uninitialized members.
This patch initializes those two

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[svace] Error handling for applications/test

1. Fix inconsistent alloc/dealloc(new/free)
2. Add try catch to some statements
3. Fix memory leak from `asprintf`

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[svace] assure file to be closed before remove

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Docs] Remove unnecessary HTML link for feature/privilege.

This patch removes the unnecessary HTML link for feature/privilege.

Signed-off-by: Sangjung Woo <sangjung.woo@samsung.com>

[Optim] Add shortcut to dot product

When dimension is 1, it is vector by matrix or vector by vector
multiplication. This patch adds a shortcut in that situation

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Fix] fix lda, ldb param

**Changes proposed in this PR:**
- lda, ldb and ldc describe the memory layout, so they should be set in terms
of the memory layout. This patch fixes the issue and adds a corresponding test

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Profiler] Add basic profilerlistener

This patch adds global profiler listener for various purpose

From this patch,
1. Profiler can be called globally with a designated event key
2. Listener reporting suite included
3. Enum key has changed to int key to deal with unhashable
key compile error in few platforms.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

v2)
1. Change listener to RAII object (with forcing profiler, event
designation)
2. Add unsubscribe method
3. Change event register to set to prevent notifying a listener twice
4. Change semantics to not allow adding the same listener twice
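
The v2 points above (RAII listener, unsubscribe, a set of listeners, no duplicate subscription) can be sketched as follows. `ProfilerSketch` and `ListenerSketch` are hypothetical classes for illustration only and do not mirror the real profiler interface.

```cpp
#include <cstddef>
#include <set>

class ListenerSketch;

class ProfilerSketch {
public:
  // A set holds each listener at most once, so a listener is never
  // notified twice. Returns false if it was already subscribed.
  bool subscribe(ListenerSketch *l) { return listeners_.insert(l).second; }
  void unsubscribe(ListenerSketch *l) { listeners_.erase(l); }
  void notify(int event_key, long duration_us);
  std::size_t listenerCount() const { return listeners_.size(); }

private:
  std::set<ListenerSketch *> listeners_;
};

class ListenerSketch {
public:
  // RAII: subscribe on construction, unsubscribe on destruction.
  explicit ListenerSketch(ProfilerSketch &p) : profiler_(p) {
    profiler_.subscribe(this);
  }
  ~ListenerSketch() { profiler_.unsubscribe(this); }
  ListenerSketch(const ListenerSketch &) = delete;
  ListenerSketch &operator=(const ListenerSketch &) = delete;

  void onNotify(int event_key, long duration_us) {
    last_key = event_key;
    total_us += duration_us;
  }

  int last_key = -1;
  long total_us = 0;

private:
  ProfilerSketch &profiler_;
};

inline void ProfilerSketch::notify(int event_key, long duration_us) {
  for (ListenerSketch *l : listeners_)
    l->onNotify(event_key, duration_us);
}
```

Because the listener unsubscribes in its destructor, the profiler never holds a dangling pointer as long as the profiler outlives its listeners.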

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Test] Add profiler test

**Changes proposed in this PR:**
- Add profiler test
- Wire profiler sources / header to the build system

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Profiler] Separate Profiler for wider use

This patch extracts profiler from neuralnet.

Also, this separates `ProfileListener`, which
should be used on the client side, while `Profiler`
is used on the library side

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[meson.build] Change join_paths to / in meson.build files

Replace join_paths in meson.build files with /

Check issue #709 for more details

Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>

[Android] Integrate openblas into android

Android ndk was not building on top of openblas

This patch fixes the problem

resolves #794

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[mnist] Update saved model file

As saving the optimizer parameters has been updated, the previous
model file gives wrong result. This patch adds the new model file.

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[network] Rework the backwarding

- remove forwarding from backwarding
backwarding should just do backwarding and no more
- moved backwarding back to neuralnetwork so that graph
does not have to care about how to backward etc.
Graph just provides iterators for iterating the graph
in reverse. Graph does not know that layers have backwarding etc.

Also this removes dependency of graph from optimizer.

V2:
Added comment fixes for the corresponding PR

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[optimizer] Move optimizer out of layer

This patch moves optimizer out of layer.
Now backwarding just calculates derivatives and gradient
but does not apply the gradient.
Applying the gradient is done by the model.

Layer still supports the applyGradient operation but requires the optimizer
as an argument.
This decouples layers from optimizers so that they can operate independently.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[optimizer] Simplify optimizer initialize

As there is just one optimizer, shared by the layers, it must be initialized just once by the neural network.
Also, addOptimizerVariables() is moved out of initialize(), as initialize() should work
on the optimizer's parameters and should not need the list of weights.

Also remove set_tensor argument which was redundant

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[optimizer] Move optimizer variables to weights

Move optimizer variables to weights
Now all the weight related tensors are handled by weights themselves
So, optimizer can be shared across all layers, no need to create new
copies for all layers

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[vgg] Added pytorch model for vgg16

Added pytorch model for vgg16
This is to benchmark against tf and nntrainer

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[vgg] Update to official vgg16 model

Update the nntrainer and tensorflow to use official VGG16 model architecture
The FC layers setup is different as the cifar100 dataset has just 100 output classes
rather than the 1000 classes of imagenet.
Further, the number of epochs is reduced to 1.
When training, this can be increased appropriately.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[MNIST] Added pytorch version

Added pytorch version of MNIST for benchmarking purpose
This code is only tested with CPU

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[ndk] Add enable profile flag

This patch adds an enable profile flag to the ndk build for profiling purposes

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Experiment] Add profiler

This patch adds an `enable-profile` option to enable profiling. This
patch also adds simple profiling logic to `neuralnet::inference`

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Meson] Add ndk-build to be part of ndk build

**Changes proposed in this PR:**
- Add option to build library using ndk

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Chores] CustomShortcut bug

As the ini format has changed, the ini for customshortcut needs to change

This patch handles it.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[manager] Share gradient memory for all layer

This patch allows sharing the memory for gradients across all the layers.
The maximum size of the gradient is allocated, and all layers have unique tensors
which internally point to this tensor.

This optimization feature can be disabled for a model (as done with automated models unittest)

Manager is also moved to nntrainer/tensor as manager is managing all the weights (tensors) and will
in future manage all the inputs/outputs.
If the functionality of manager is extended, then it can be appropriately moved.

See also #774
Resolves #766

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[layers/manager] Register weights with manager

All the weights of the layer are now registered with manager
Manager allocates memory for these weights and will in future
handle their updates etc

See also #774 #766

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[weight] Updated weights to be vector

Updated weights of layer to be a vector rather than a shared_ptr array
This is for easier management and for updating weights internally when
gradients share the memory

See also #774 #766

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[manager] Added nntrainer manager for weights

Added manager to manage all the allocated weights
This patch also adds manager to the model and passes manager to the
initialize which allows weights to be added to the manager.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[weight/var_grad] Make internal variable as shared_ptr

Internal variables in weights/var_grad, namely, the variable and gradient itself
are changed to shared_ptr so that weights can be shared without worrying about
shallow copies.

Also changed the copy constructor to not create a new Tensor, as the copy constructor
of weight will get called and creating one is unnecessary and unintentional overhead.
As weight is just wrapper over tensor, their copy constructors should follow
same behavior as tensor which is to not create new memory.
Added clone as an alternative to create new copy of a given weight.
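
The copy semantics described above can be illustrated with a small sketch. `WeightSketch` is hypothetical and uses `std::vector<float>` in place of a tensor; the only names taken from the commit message are the idea of shared_ptr members and `clone`.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

class WeightSketch {
public:
  explicit WeightSketch(std::size_t n)
    : var_(std::make_shared<std::vector<float>>(n, 0.0f)),
      grad_(std::make_shared<std::vector<float>>(n, 0.0f)) {}

  // Default copy: only the shared_ptrs are copied, so the copy and the
  // original refer to the same variable and gradient memory.
  WeightSketch(const WeightSketch &) = default;
  WeightSketch &operator=(const WeightSketch &) = default;

  // Explicit deep copy for callers that need independent storage.
  WeightSketch clone() const {
    WeightSketch w(*this);
    w.var_ = std::make_shared<std::vector<float>>(*var_);
    w.grad_ = std::make_shared<std::vector<float>>(*grad_);
    return w;
  }

  std::vector<float> &variable() { return *var_; }
  std::vector<float> &gradient() { return *grad_; }

private:
  std::shared_ptr<std::vector<float>> var_;
  std::shared_ptr<std::vector<float>> grad_;
};
```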

See also #774 #766

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[ CONV2D ] separate conv2d_gemm and im2col

It is better to split conv2d_gemm and im2col

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[unittest] Enable disabled unittest

Enable fc layer disabled unittest

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[var_grad] Trainable inferred from gradient

Trainable property of a variable was earlier inferred by storing a trainable variable
Now, trainable will be inferred using gradient.uninitialized()

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[tensor] Update tensor operation signature

Update tensor operation signature to return a Tensor reference as the retval
rather than a tensor itself. This avoids creating dummy tensors as a return (which might have been
optimized by the compiler, but let's do it manually as the input is also a reference).

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[CustomLayer] Update readme.md

Add readme.md about how to run and expected output

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [ ]Passed [ ]Failed [X]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Custom] Add actual example

**Changes proposed in this PR:**
- Add an example to create the custom layer to be used with ini
- Add an example to create the custom layer to be used with api

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Custom] Add an example scaffolding

Add a layer example that depends on the user's custom code
This patch generates scaffolding to the `Application/Custom` folder

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[ Graph ] remove grad mem buffer for backwarding

This PR includes,
  . remove grad memory buffer in n_buffers for graph. We do not need
  this because we could use the var memory buffer of n_buffers for
  backwarding.
  . For MNIST, memory consumption is reduced from 3.5 to 2.6

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ModelLoader] Use vector<string> when create layer

When creating a layer from an ini, enum based properties were used.
This prevents adding new properties without changing the api header.

This patch moves to setting layer properties to vector<string>
to enable setting properties without changing the api header, eventually
enabling custom properties in custom layer.
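
As an illustration of string-based properties, the sketch below splits each entry into a key and a value. The `key=value` format and the name `parseProperties` are assumptions made for this example; the commit message only states that properties are passed as `vector<string>`.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Split "key=value" strings into a map. Which keys are meaningful is left
// to the layer that receives them, so a custom layer can define its own
// keys without any change to a shared enum or header.
inline std::map<std::string, std::string>
parseProperties(const std::vector<std::string> &props) {
  std::map<std::string, std::string> out;
  for (const std::string &p : props) {
    const std::string::size_type eq = p.find('=');
    if (eq == std::string::npos || eq == 0)
      throw std::invalid_argument("expected key=value, got: " + p);
    out[p.substr(0, eq)] = p.substr(eq + 1);
  }
  return out;
}
```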

**Semantics change proposed in this PR**
Ini won't ignore properties that are not supported, since model_loader
would not know whether they are supported or not

See also #716

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[bnlayer] bug fix for inference

Batch normalization bug fix for inference mode
when add() was used instead of add_i()
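
The class of bug named above is easy to reproduce in miniature. In this sketch `VecSketch` is a hypothetical stand-in for a tensor; it assumes, as the naming suggests, that `add()` returns a new object while `add_i()` modifies the object in place.

```cpp
#include <vector>

struct VecSketch {
  std::vector<float> v;

  // Out-of-place: returns a new object and leaves *this untouched.
  VecSketch add(float s) const {
    VecSketch r(*this);
    for (float &e : r.v)
      e += s;
    return r;
  }

  // In-place: modifies *this and returns a reference to it.
  VecSketch &add_i(float s) {
    for (float &e : v)
      e += s;
    return *this;
  }
};
```

Calling `t.add(1.0f);` and discarding the result leaves `t` unchanged, whereas `t.add_i(1.0f);` updates it.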

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[tensor] Support multiply/divide with given output

Support multiply/divide with given output tensor
This reduces temporary allocations for bn layer

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[pooling] Reduce temporary mem allocations

Reduce temporary memory allocations for pooling
Remove unnecessary temporary memory allocations which can be
replaced with a slice view
Also removed unnecessary setting memory to zero

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[activation] Reduce temporary memory alloc

Reduce temporary memory allocations by activation layer
by using the hidden and ret_derivative class variables
This temporarily increases peak memory but alloc-dealloc is removed
from every iteration

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[regex] Make regex static const

Make regex static const
Although it is using a static string, that memory is always being allocated inside regex
Making it static const constructs it only once instead of on every call
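
A minimal sketch of the pattern, with a made-up function name and a made-up pattern; the regex actually used in the code base is not shown in the commit message.

```cpp
#include <regex>
#include <string>

// The regex is compiled once, on the first call, and reused afterwards.
// A non-static local std::regex would be rebuilt (and would allocate)
// on every call.
inline bool isKeyValue(const std::string &s) {
  static const std::regex re("([A-Za-z_]+)\\s*=\\s*(.+)");
  return std::regex_match(s, re);
}
```

Since C++11, initialization of a function-local static is thread-safe, so this does not need extra locking.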

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[neuralnet] Skip backwarding for non-trainable layers

This patch skips the backwarding for the non-trainable layers.
Further, the last trainable layer skips calcDerivative as well.
This results in far fewer calculations and, more importantly, reduced memory.

See also #732

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[Layer] Add built-in ops to the context

**Changes proposed in this PR:**
- Add default layers to the global context

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[AppContext] Fix key is case sensitive

Under the current semantics, the type key should be case insensitive; however, it was
case sensitive. This patch fixes the issue.
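
One common way to get case-insensitive keys is to normalize them on both insertion and lookup, as in this sketch. `FactoryIndexSketch` and `toLowerKey` are hypothetical names; the commit message does not say which approach the fix uses.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Lower-case a key so that "Conv2D", "conv2d" and "CONV2D" compare equal.
inline std::string toLowerKey(std::string key) {
  std::transform(key.begin(), key.end(), key.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return key;
}

class FactoryIndexSketch {
public:
  void add(const std::string &key, int id) { index_[toLowerKey(key)] = id; }

  // Returns -1 when the key is unknown.
  int find(const std::string &key) const {
    auto it = index_.find(toLowerKey(key));
    return it == index_.end() ? -1 : it->second;
  }

private:
  std::map<std::string, int> index_;
};
```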

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[conv2d] More optimizations for conv2d

This patch provides more optimizations for conv2d
by avoiding more memcopies and operations along with modification
to internal interface of conv2d_gemm operation

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[conv2d] Bug fix for regularization loss

Regularization loss for the conv2d layer took the average over output filters
rather than adding them up. This patch fixes it.

See also #761

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[conv2d] Refactor conv2d layer

Conv2d layer has some issues #761
This patch addresses some of them:
- Weight is now independent of the filter size. Different filter
weights have now been combined. This has resulted in easier addressing of weights
- Above combining of weights also reduced many mem-copies of weights to bring it in a particular shape
- Moved to use getSlice() to get access to some data than creating copies

Now, all layers have fixed number of weights.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[AppContext] Register Default ops at the beginning

**Changes proposed in this PR:**
- Register default optimizer at the beginning of the load

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[Deps] Remove openmp dependency

Openmp is no longer used. It is deleted to reduce memory consumption

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[layers] Split backwarding into smaller functions

Split layer backwarding into smaller functions for optimization purposes
- calcDerivative() - calculate the derivative to be passed to previous layers
this function must be implemented by all derived layers
- calcGradient() - calculate the gradient for the weights of the layer
- applyGradient() - apply the gradients to the weights of the layer
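
To show how the three steps relate, here is a sketch for a toy layer with a single scalar weight, y = w * x. The method names follow the list above; everything else (`ScaleLayerSketch`, the plain SGD update, the learning-rate argument) is an assumption for illustration and not the real layer interface.

```cpp
#include <cstddef>
#include <vector>

class ScaleLayerSketch {
public:
  explicit ScaleLayerSketch(float w) : w_(w) {}

  std::vector<float> forwarding(const std::vector<float> &x) {
    x_ = x; // keep the input: calcGradient needs it
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
      y[i] = w_ * x[i];
    return y;
  }

  // Derivative handed to the previous layer: dL/dx = w * dL/dy.
  std::vector<float> calcDerivative(const std::vector<float> &dy) const {
    std::vector<float> dx(dy.size());
    for (std::size_t i = 0; i < dy.size(); ++i)
      dx[i] = w_ * dy[i];
    return dx;
  }

  // Gradient of the layer's own weight: dL/dw = sum_i x_i * dL/dy_i.
  void calcGradient(const std::vector<float> &dy) {
    grad_ = 0.0f;
    for (std::size_t i = 0; i < dy.size(); ++i)
      grad_ += x_[i] * dy[i];
  }

  // Weight update; plain SGD stands in for any optimizer here.
  void applyGradient(float lr) { w_ -= lr * grad_; }

  float weight() const { return w_; }

private:
  float w_;
  float grad_ = 0.0f;
  std::vector<float> x_;
};
```

With the steps separated, a caller can run only the ones a given layer needs, for example skipping `calcGradient()` and `applyGradient()` for a layer whose weights are frozen.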

Also, input->backwarding() now throws rather than silently doing nothing.

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[var_grad] Add var_grad for input/output lists

Added var_grad for input/output lists which also combines derivatives
This is the baseclass for the weights
This update will help with graph class

**Self evaluation:**
1. Build test: [x]Passed [ ]Failed [ ]Skipped
2. Run test: [x]Passed [ ]Failed [ ]Skipped

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>

[AppContext] Add registerer,invoke factory methods

**Changes proposed in this PR:**
- Add factory registerer
- Add factory invoker
- Register built-in objects to each layers(postponed)
- Add tests
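
A registerer/invoker pair of this kind is often built on a map from a string key to a factory function. The sketch below shows that general pattern; `ContextSketch`, `LayerBase`, `FlattenSketch`, `registerFactory` and `createObject` are hypothetical names here and should not be read as the actual AppContext interface.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct LayerBase {
  virtual ~LayerBase() = default;
  virtual std::string type() const = 0;
};

struct FlattenSketch : LayerBase {
  std::string type() const override { return "flatten"; }
};

class ContextSketch {
public:
  using Factory = std::function<std::unique_ptr<LayerBase>()>;

  // Registerer: associate a type key with a factory function.
  void registerFactory(const std::string &key, Factory f) {
    if (!factories_.emplace(key, std::move(f)).second)
      throw std::invalid_argument("key already registered: " + key);
  }

  // Invoker: build a new object from a previously registered key.
  std::unique_ptr<LayerBase> createObject(const std::string &key) const {
    auto it = factories_.find(key);
    if (it == factories_.end())
      throw std::invalid_argument("unknown key: " + key);
    return it->second();
  }

private:
  std::map<std::string, Factory> factories_;
};
```

Built-in and user-defined types can then be created through the same lookup, which is what makes custom layers possible without touching the core.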

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Jihoon Lee <jhoon.it.lee@samsung.com>

[ GRAPH ] Remove unused function and add doxygen note

In the neural network class, there are functions which should be moved to
graph.
This PR removes member functions which are no longer used and adds
doxygen comments to the graph header.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ ANDROID ] Enable graph for android build

Fix Android.mk to support graph

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ GRAPH ] Split initialization & Assign Memory

Split initialization and memory assignment

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ NNSTREAMER ] Fix NNStreamer Filter for graph

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ NNSTREAMER FILTER ] Fix nnstreamer filter to support graph

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ Fix ] istrequal to check length of string

Fix istrequal
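
A sketch of a case-insensitive comparison that checks the length first. `istrequalSketch` is a hypothetical re-implementation for illustration; the actual function body is not shown in the commit message.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Case-insensitive equality. The length check matters: comparing only
// over the range of `a` would treat "fc" as equal to "fc_layer".
inline bool istrequalSketch(const std::string &a, const std::string &b) {
  if (a.size() != b.size())
    return false;
  return std::equal(a.begin(), a.end(), b.begin(),
                    [](unsigned char x, unsigned char y) {
                      return std::tolower(x) == std::tolower(y);
                    });
}
```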

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ GRAPH ] Support Backbone Network

This PR includes :
. Modification of Network Graph to enable Backbone network support
. Fixes for unittest

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ GRAPH ] Add Compiled Variable

- In order to make sure to run initialize after success of compile(),
compiled variable is used.
- additional unittest cases are fixed.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>

[ UNITTEST ] Fix unittest & Applications to support NetworkGraph

Unittest and Applications need to be fixed to support NetworkGraph.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>