skykongkong8 [Fri, 6 Oct 2023 07:17:36 +0000 (16:17 +0900)]
[neon] Optimize sgemv
- Instead of declaring explicit register variables, declaring the function inline lets the compiler reduce the number of registers in use.
- This way, we can load more variables and accelerate the sgemv computation (a minimal sketch follows below).
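As an illustration only, a minimal sketch of the idea, assuming an ARMv8.2-A fp16 NEON target; the helper name `fma_step` is hypothetical, not the actual nntrainer symbol:

```cpp
#include <arm_neon.h>

// Hypothetical inline helper: the compiler allocates NEON registers per call
// site instead of pinning explicitly declared register variables in the caller.
static inline float16x8_t fma_step(float16x8_t acc, const __fp16 *a,
                                   const __fp16 *x) {
  float16x8_t va = vld1q_f16(a); // 8 fp16 matrix elements
  float16x8_t vx = vld1q_f16(x); // 8 fp16 vector elements
  return vfmaq_f16(acc, va, vx); // acc += a * x, element-wise
}
```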
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Tue, 10 Oct 2023 02:29:13 +0000 (11:29 +0900)]
[Tensor] Add output axis in weight spec
Add an output axis to the weight spec to identify the direction along which scale and zero-point values are multiplied (an illustrative sketch follows below).
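For illustration, a minimal per-axis dequantization sketch under assumed names and an assumed [outer, axis, inner] memory layout; it is not the nntrainer implementation:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Each slice along the output axis uses its own scale and zero point.
std::vector<float> dequantize_axis(const std::vector<uint8_t> &q,
                                   const std::vector<float> &scale,
                                   const std::vector<uint8_t> &zero_point,
                                   std::size_t axis_len, std::size_t inner) {
  std::vector<float> out(q.size());
  std::size_t outer = q.size() / (axis_len * inner);
  for (std::size_t o = 0; o < outer; ++o)
    for (std::size_t a = 0; a < axis_len; ++a)
      for (std::size_t i = 0; i < inner; ++i) {
        std::size_t idx = (o * axis_len + a) * inner + i;
        out[idx] = scale[a] * (static_cast<float>(q[idx]) - zero_point[a]);
      }
  return out;
}
```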
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
jijoong.moon [Tue, 12 Sep 2023 08:18:40 +0000 (17:18 +0900)]
[ Tensor ] Add Output Axis to dequantize api
This PR adds an output axis parameter to the dequantize API.
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
skykongkong8 [Thu, 5 Oct 2023 01:43:47 +0000 (10:43 +0900)]
[blas/neon] Use intermediate fp32 values in dimension-shrinking computation
- Previously, sgemm and sgemv in NEON intrinsics depended on two conditions:
1. the column or row length had to be divisible by 8
2. they worked entirely with fp16 variables (which might cause precision loss during accumulation)
- With this commit, sgemm and sgemv are expected to:
1. support every column length with adaptive-length compute optimization
2. use a temporary fp32 array to preserve accumulated values, especially in large-scale Tensors
3. accelerate conversion between such fp32 arrays and fp16 Tensors (and vice versa) with NEON to improve runtime
4. keep the number of registers in mind to avoid register spilling
- More optimizations w.r.t. time and memory are in progress (a sketch of the fp32 accumulation scheme follows below)
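A hedged sketch of items 2 and 3 above, with assumed function names; the actual nntrainer kernels differ in blocking and loop structure:

```cpp
#include <arm_neon.h>
#include <cstddef>

// Widen fp16 inputs to fp32, accumulate in a temporary fp32 buffer,
// and handle any trailing elements scalar-wise (adaptive-length tail).
void fma_accumulate_fp32(const __fp16 *a, const __fp16 *x, float *acc,
                         std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    float32x4_t va = vcvt_f32_f16(vld1_f16(a + i)); // fp16 -> fp32
    float32x4_t vx = vcvt_f32_f16(vld1_f16(x + i));
    float32x4_t vs = vld1q_f32(acc + i);
    vst1q_f32(acc + i, vfmaq_f32(vs, va, vx));      // accumulate in fp32
  }
  for (; i < n; ++i)
    acc[i] += static_cast<float>(a[i]) * static_cast<float>(x[i]);
}

// Convert the fp32 buffer back to an fp16 destination with NEON.
void copy_fp32_to_fp16(const float *src, __fp16 *dst, std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    vst1_f16(dst + i, vcvt_f16_f32(vld1q_f32(src + i))); // fp32 -> fp16
  for (; i < n; ++i)
    dst[i] = static_cast<__fp16>(src[i]);
}
```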
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Thu, 5 Oct 2023 05:40:12 +0000 (14:40 +0900)]
[Bug] Fix build error in yolo with fp16
- This patch fixes a build error in yolo_v2_loss with fp16 enabled
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghak PARK [Wed, 20 Sep 2023 10:37:39 +0000 (19:37 +0900)]
[Coverity] Fix Coverity issue
To fix the uninitialized output_axis, set the default output_axis to 3
- in the tensor.h file
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 19 Sep 2023 08:44:52 +0000 (17:44 +0900)]
[Coverity] Fix Coverity Issue
Fix Coverity Issues
- set default output_axis = 3
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 19 Sep 2023 08:33:12 +0000 (17:33 +0900)]
[Coverity] Fix Coverity Issue
Remove local reference return
- it is already removed in the latest version but not yet merged into the main branch
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 19 Sep 2023 08:29:22 +0000 (17:29 +0900)]
[Coverity] Fix Coverity issue
Fix a 'may be NULL and is dereferenced' issue in blas_neon.cpp
Check for NULL if malloc fails
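A minimal sketch of the guard, using a hypothetical helper name:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Never dereference the buffer when malloc returns nullptr.
inline float *checked_alloc(std::size_t len) {
  float *buf = static_cast<float *>(std::malloc(len * sizeof(float)));
  if (buf == nullptr)
    throw std::bad_alloc();
  return buf;
}
```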
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
SeoHyungjun [Tue, 19 Sep 2023 07:04:52 +0000 (16:04 +0900)]
[Svace] Fix Ahub issues
Added exception handling when malloc fails.
The value of the idx variable has been changed to be initialized when declared.
In the tensor constructor, output_axis was changed to be initialized to 3.
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
Donghak PARK [Tue, 29 Aug 2023 08:00:38 +0000 (17:00 +0900)]
[Coverity] Fix Coverity issue
Fix Coverity issue : auto_causes_copy issue
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 29 Aug 2023 07:52:01 +0000 (16:52 +0900)]
[Coverity] Fix issue on Draw_Classification Application
Fix Coverity issue
- leaked_storage issue: on failure we need to destroy in_data
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 29 Aug 2023 06:52:21 +0000 (15:52 +0900)]
[Typo] Fix typo
Fix typo on Simpleshot application's README.md file
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 29 Aug 2023 06:33:41 +0000 (15:33 +0900)]
[Coverity] Fix Coverity issue on task_runner.cpp
Fix Coverity issue
- returned_null: getcwd can potentially return nullptr, so check its value
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
SeoHyungjun [Thu, 24 Aug 2023 09:41:39 +0000 (18:41 +0900)]
[Ahub] Fix Ahub issue
The second argument of tensor_dtype is used as std::regex(string),
but it did not include handling for std::regex_error. A getRegex()
function was created because the same pattern is used in other code.
The getRegex() function takes a string and returns a std::regex object.
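A hedged sketch of what such a helper might look like; the actual signature and error handling in nntrainer may differ:

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Wrap regex construction so an invalid pattern surfaces as a clear error
// instead of an unhandled std::regex_error.
std::regex getRegex(const std::string &str) {
  try {
    return std::regex(str);
  } catch (const std::regex_error &) {
    throw std::invalid_argument("invalid regex pattern: " + str);
  }
}
```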
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
SeoHyungjun [Thu, 24 Aug 2023 08:10:51 +0000 (17:10 +0900)]
[Ahub] Fix Ahub issue
Previously, nntrainer only supported fp32 and did not need to change
the data type, but this needs to change to support fp16. If the tensor
type is fp16, get input_dim from InitLayerContext through
getInputDimensions and call setDataType. getInputDimensions returns a
const object, so a getMutableInputDimensions function was added because
the return object of getInputDimensions cannot be modified.
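A hypothetical sketch of the const/mutable split described here; the real nntrainer types and signatures may differ:

```cpp
#include <vector>

struct TensorDim { /* dimension and data-type fields elided */ };

class InitLayerContext {
  std::vector<TensorDim> input_dims;

public:
  // read-only access for ordinary callers
  const std::vector<TensorDim> &getInputDimensions() const { return input_dims; }
  // mutable access, needed when the data type must be switched (e.g. to fp16)
  std::vector<TensorDim> &getMutableInputDimensions() { return input_dims; }
};
```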
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
SeoHyungjun [Wed, 23 Aug 2023 11:57:49 +0000 (20:57 +0900)]
[Ahub] Fix Ahub issue
The condition of the while statement has been modified to resolve
'Dereferencing iterator lhs_iter though it is already past the end
of its container'. In fact, because of the preceding if condition, the
loop only runs when the two containers are the same size, so there was
no actual problem with using '||'. However, it has been edited for clarity.
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
SeoHyungjun [Wed, 23 Aug 2023 07:26:03 +0000 (16:26 +0900)]
[Ahub] Fix Ahub issue
Restoring did not work as expected after changing the ostream type,
so code was added to restore the variable out, which was missing.
Also moved the code that set the nesting depth at the wrong point to
the correct location.
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
SeoHyungjun [Wed, 23 Aug 2023 04:44:48 +0000 (13:44 +0900)]
[Ahub] Fix Ahub issue
The 'initialized' variable receives a pointer via malloc.
If malloc fails, it will be null.
However, since this case is not handled, accessing initialized[i] dereferences null + i.
Exception handling has been added to prevent this problem.
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
SeoHyungjun [Wed, 23 Aug 2023 02:34:29 +0000 (11:34 +0900)]
[Ahub] Fix Ahub issue
Changed 'auto' to 'auto &' in multiout_realizer.cpp.
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
SeoHyungjun [Wed, 23 Aug 2023 02:33:13 +0000 (11:33 +0900)]
[Ahub] Fix Ahub issue
Fixed the part in nntrainer_logger.cpp where buf was used without being initialized.
The array is now initialized with ASCII 0 ('\0').
Signed-off-by: SeoHyungjun <hyungjun.seo@samsung.com>
hyunil park [Mon, 18 Sep 2023 08:15:03 +0000 (17:15 +0900)]
[Application] bugfix for YOLO version 2
Training did not proceed because the layer names were different.
- the reorg layer name is changed from reorg to reorg_layer
Signed-off-by: hyunil park <hyunil46.park@samsung.com>
jijoong.moon [Mon, 18 Sep 2023 14:17:07 +0000 (23:17 +0900)]
Revert "[Release] NNTrainer v0.5.1 release"
This reverts commit 258967dcad51caffde61a06f46bde7f6329622d9.
Signed-off-by: Jijoong Moon <jijoong.moon@samsung.com>
hyeonseok lee [Thu, 14 Sep 2023 08:17:49 +0000 (17:17 +0900)]
[Application] handle getcwd return null pointer
- Handle the case where getcwd returns NULL to prevent a null pointer dereference
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
jijoong.moon [Tue, 12 Sep 2023 01:38:27 +0000 (10:38 +0900)]
[Release] NNTrainer v0.5.1 release
NNTrainer v0.5.1 is released.
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Seungbaek Hong [Wed, 13 Sep 2023 05:48:35 +0000 (14:48 +0900)]
[Application] Setup meson file for libyolov2_loss_layer.so
Setup meson file for making libyolov2_loss_layer.so in yolo v2
application.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
Seungbaek Hong [Wed, 31 May 2023 06:14:12 +0000 (15:14 +0900)]
[Wait for #2177,#2213][Application] rebase for yolo v2
I've rebased #2177 (loss for yolo) and #2213 (custom layer for yolo),
because the author of PR #2177 is absent for now.
If someone needs to use yolo v2, use this PR.
I'll update the documentation for this later.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
hyeonseok lee [Thu, 23 Mar 2023 10:38:20 +0000 (19:38 +0900)]
[Application] match nntrainer and pytorch yolo model
- Match option values such as epsilon and momentum
- This commit matches the nntrainer yolo v2 output with the pytorch yolo v2 output
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
hyeonseok lee [Thu, 6 Apr 2023 11:22:53 +0000 (20:22 +0900)]
[Application] implement yolo v2 loss backwarding
- Implement yolo v2 loss layer backwarding
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
hyeonseok lee [Wed, 22 Mar 2023 04:44:40 +0000 (13:44 +0900)]
[Application] implement yolo v2 forward
- Implement yolo v2 forward
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
hyeonseok lee [Thu, 9 Mar 2023 02:22:12 +0000 (11:22 +0900)]
[yolo_v2] yolo v2 loss scaffold
- Added yolo v2 loss scaffold
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
Donghyeon Jeong [Tue, 12 Sep 2023 00:09:35 +0000 (09:09 +0900)]
[Tensor] Unsigned Quantized Tensor
- Quantized tensor values are unsigned with zero points
- The layer context dequantizes the quantized tensor when a weight is requested
- Templatize the dequantize function
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Tue, 5 Sep 2023 02:25:40 +0000 (11:25 +0900)]
[Tensor] Quantized Tensor (Int 4) with Scale
- Quantized Tensor is now available as int4 with scale.
- Two int4 values share one int8, each using 4 bits (see the sketch below).
- Dequantization is performed by multiplying the scaling factor at a given index (b, c, h, w).
- Only read (getValueQint4), write (setValue), and dequantization operations are allowed.
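An illustrative pack/unpack sketch; the nibble order and signedness here are assumptions, not necessarily the layout nntrainer uses:

```cpp
#include <cstdint>

// Assumed packing: high nibble holds the even index, low nibble the odd index.
inline uint8_t pack_qint4(int8_t even, int8_t odd) {
  return static_cast<uint8_t>(((even & 0x0F) << 4) | (odd & 0x0F));
}

inline int8_t unpack_qint4(uint8_t byte, bool even) {
  int8_t v = even ? static_cast<int8_t>(byte >> 4)
                  : static_cast<int8_t>(byte & 0x0F);
  return (v & 0x08) ? static_cast<int8_t>(v | 0xF0) : v; // sign-extend 4 bits
}
```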
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Seungbaek Hong [Tue, 30 May 2023 07:04:34 +0000 (16:04 +0900)]
[Application] add re-organization layer to Yolo v2
Added Re-organization layer to yolo v2 examples
of nntrainer and pytorch.
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
Donghyeon Jeong [Wed, 30 Aug 2023 23:19:30 +0000 (08:19 +0900)]
[Tensor] Quantized Tensor (Int 8) with Scale
- Quantized Tensor is now available as int8 with scale.
- Dequantization is performed by multiplying values by a per-channel scaling factor.
- Only read, write, and dequantization operations are allowed.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
hyunil park [Tue, 29 Aug 2023 09:23:25 +0000 (18:23 +0900)]
[sub-plugin] Modify synchronization mechanism between push_data and getSample
Modify synchronization mechanism between push_data and getSample
- Remove some member variables
- Add a member function to check the queue
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: hyunil park <hyunil46.park@samsung.com>
Debadri Samaddar [Fri, 8 Sep 2023 11:13:34 +0000 (16:43 +0530)]
[blas/neon] SGEMM Neon execution for any M value
Used padded calculations for SGEMM with NEON for any value of M,
where M is the number of rows in the output matrix.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Thu, 31 Aug 2023 09:09:20 +0000 (14:39 +0530)]
[blas/neon] Optimized SGEMM when both inputs are transposed
Optimized sgemm stub when both A and B are transposed
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Wed, 30 Aug 2023 14:01:27 +0000 (19:31 +0530)]
[blas/neon] Added unit test for NEON fp16 SGEMM
Added UT for NEON fp16 implementation of SGEMM.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Wed, 30 Aug 2023 13:57:47 +0000 (19:27 +0530)]
[blas/neon] NEON fp16 implementation of SGEMM
SGEMM fp16 implementation for Android (ARM) using NEON.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Donghak PARK [Tue, 29 Aug 2023 13:38:39 +0000 (22:38 +0900)]
[Typo] Fix typo
Fix typo
- nntrainer/models/neuralnet.cpp
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Tue, 29 Aug 2023 13:32:56 +0000 (22:32 +0900)]
[tflite_export] Set model batch 1
Currently, after training a model and exporting it to the tflite format, it is exported with the batch size used for training.
However, inference does not need the training batch size, and when TensorFlow converts a model to the TensorFlow Lite format, the batch size is set to 1.
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Debadri Samaddar [Tue, 29 Aug 2023 08:37:19 +0000 (14:07 +0530)]
[blas/neon] Optimization on SGEMV fp16 implementation
Optimized fp16 implementation of SGEMV using NEON to run on Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
hyunil park [Tue, 8 Aug 2023 02:42:30 +0000 (11:42 +0900)]
[sub-plugin] Add function to stop model training
nnstreamer tensor_trainer calls this function to stop model training
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: hyunil park <hyunil46.park@samsung.com>
Debadri Samaddar [Thu, 24 Aug 2023 13:23:18 +0000 (18:53 +0530)]
[ blas/neon ] Add NEON fp16 function for isamax
Enable neon isamax function for Android (ARM) fp16 computation.
Add unit test for fp16 isamax function in Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Tue, 22 Aug 2023 15:57:34 +0000 (21:27 +0530)]
[ blas/neon ] Add NEON fp16 function for scopy
Enable neon scopy function for Android (ARM) fp16 computation.
Add unit test for fp16 scopy function in Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Thu, 17 Aug 2023 14:47:55 +0000 (20:17 +0530)]
[ blas/neon ] Add NEON fp16 function for sscal
Enable neon sscal function for Android (ARM) fp16 computation.
Add unit test for fp16 sscal function in Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Thu, 10 Aug 2023 10:54:36 +0000 (16:24 +0530)]
[ blas/neon ] Add NEON fp16 function for snrm2
Enable neon snrm2 function for Android (ARM) fp16 computation.
Add unit test for fp16 snrm2 function in Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
skykongkong8 [Wed, 23 Aug 2023 01:59:40 +0000 (10:59 +0900)]
[trivial] Add reviewers
- add new reviewers: Sungsik Kong, Donghyeon Jeong
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Tue, 22 Aug 2023 04:33:23 +0000 (13:33 +0900)]
[layer] Verify ln, bn layers with fp16
- issue: adding a cosine similarity check for fp32/fp16 revealed mismatched cosine similarity between Tensors in the case of near-zero Tensors. Nevertheless, the absolute value difference and MSE pass our epsilon value, so we should come back here for a sanity check.
- Same result for the multi-headed attention layer as well (only for near-zero Tensors).
- Added skip_cosine_similarity_check param to avoid this issue
- Macro for enable-fp16 option
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 18 Aug 2023 05:37:34 +0000 (14:37 +0900)]
[layer] Verify positional encoding layer with fp16
- added tensor_type getting code into layer
- added test case in positional encoding layer unittest
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Thu, 17 Aug 2023 04:26:55 +0000 (13:26 +0900)]
[ bug ] bugfix for wrong data generation trial
- since we handle this by casting all the data at the end of binary data file generation, we do not need to pass the input data type in the first place
- newly generated .tar file included
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Tue, 22 Aug 2023 01:59:59 +0000 (10:59 +0900)]
[TensorPool] Check tensor type in view
This PR enables the TensorPool view to filter calls from a different tensor type.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
hyunil park [Fri, 28 Jul 2023 00:01:13 +0000 (09:01 +0900)]
[sub-plugin] Add function to load an existing model
An existing model registered in model_load_path is used when training a new model.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: hyunil park <hyunil46.park@samsung.com>
Donghyeon Jeong [Thu, 17 Aug 2023 05:54:57 +0000 (14:54 +0900)]
[Android] Add unit-testing executable build
This patch adds an additional unit-test executable build for Android.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 17 Aug 2023 04:40:23 +0000 (13:40 +0900)]
[Tensor] remove unused code
- Remove unused code.
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Wed, 16 Aug 2023 06:08:42 +0000 (15:08 +0900)]
[Tensor] Fix in Mixed Precision Support
- Fix parts left unchanged in the mixed precision support
- Remove unused code
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 11 Aug 2023 04:49:34 +0000 (13:49 +0900)]
[unittest] specify softmax template type
Template type in activation functions needs to be specified to avoid
errors on ndk-build.
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Thu, 10 Aug 2023 06:46:44 +0000 (15:46 +0900)]
[layers] Dump acti_func into header
- For easier maintenance, move everything into the header, since only a few functions are left after applying templates to acti_func.cpp
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Thu, 10 Aug 2023 01:27:02 +0000 (10:27 +0900)]
[gtest] Add dataset generation code for all layers in fp16
- Add code block for generating fp16 dataset for every layer
- Add new .tar.gz file that contains above
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
jijoong.moon [Thu, 10 Aug 2023 12:51:57 +0000 (21:51 +0900)]
[ Bug Fix ] fix the error in FP32 only case
There are configuration bugs for the FP32-only case.
This PR fixes the configuration and some of the ENABLE_FP16 compiler
macro errors.
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Debadri Samaddar [Tue, 8 Aug 2023 11:14:16 +0000 (16:44 +0530)]
[ blas/neon ] Add NEON fp16 function for sdot
Enable neon sdot function for Android (ARM) fp16 computation.
Add unit test for fp16 sdot function in Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Donghyeon Jeong [Thu, 10 Aug 2023 09:06:39 +0000 (18:06 +0900)]
[Bug] Change the string format of the tensor datatype
Substitute a hyphen for the underscore when defining the tensor datatype.
The _ (underscore) character used in std::regex is treated as a quantifier in LLVM.
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Wed, 9 Aug 2023 07:27:57 +0000 (16:27 +0900)]
Fix cosine similarity calculation error
Computing cosine similarity in FP16 gives inaccurate results, so compute it in double.
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Thu, 10 Aug 2023 05:19:00 +0000 (14:19 +0900)]
[Bug] Fix bug when Android build
- Due to different compiler settings, a trivial code fix for default
template instantiation is required.
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Thu, 10 Aug 2023 00:31:38 +0000 (09:31 +0900)]
[gtest] Verify attention layer with fp16
- Add fp16 test case
- Modify the epsilon value in cosine similarity to a proper decimal number & significant digits
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Thu, 10 Aug 2023 00:30:22 +0000 (09:30 +0900)]
[layers/activation_func] Apply template on activation functions
**Changes proposed in this PR:**
- For mixed precision, activation functions should be revised into function templates to avoid bulky code
- In order to use a function template for setActivation, we need another function template to handle multiple types of activation functions
- Minor fixes for template instantiation; this will be revised properly for fp16 use in the next PR
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Tue, 8 Aug 2023 07:41:08 +0000 (16:41 +0900)]
[gtest] Add dataset file for attention layer
* Now the nnlayergolden binary file for the attention layer gtest is automatically generated at build time
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
jijoong.moon [Thu, 10 Aug 2023 01:29:44 +0000 (10:29 +0900)]
[Bug] Fix the nhwc test bug
We need to add the format information during the layer test.
This PR adds the format change for the input tensor.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
skykongkong8 [Mon, 7 Aug 2023 06:36:07 +0000 (15:36 +0900)]
[bug] Fix zero division error
* add edge-case handling in the cosine_similarity function for zero-valued tensors (a hedged sketch follows below)
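A minimal sketch of such a guard, accumulating in double; the names and the zero-norm convention are assumptions, not the exact nntrainer test-util code:

```cpp
#include <cmath>
#include <cstddef>

// If either vector has zero norm, the ratio is undefined, so return a fixed
// value instead of dividing by zero.
template <typename T>
double cosine_similarity(const T *a, const T *b, std::size_t len) {
  double dot = 0.0, na = 0.0, nb = 0.0;
  for (std::size_t i = 0; i < len; ++i) {
    dot += static_cast<double>(a[i]) * static_cast<double>(b[i]);
    na += static_cast<double>(a[i]) * static_cast<double>(a[i]);
    nb += static_cast<double>(b[i]) * static_cast<double>(b[i]);
  }
  if (na == 0.0 || nb == 0.0)            // zero-valued tensor: avoid 0/0 (NaN)
    return (na == nb) ? 1.0 : 0.0;
  return dot / (std::sqrt(na) * std::sqrt(nb));
}
```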
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Mon, 7 Aug 2023 05:58:01 +0000 (14:58 +0900)]
[unittest/layer] Enable fp16 golden test in fc layer
* fp16 tensor validation metrics
  * value-by-value: epsilon 1e-2, since _FP16 has 3 decimal digits of precision
  * cosine similarity
  * mean squared error: epsilon 1e-4, since it is a 'squared' value
* Add fclayer fp16 tensor golden data at build time
* fix the cosine_similarity function to avoid a zero division error (NaN value generation)
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Mon, 7 Aug 2023 01:48:12 +0000 (10:48 +0900)]
Fix meson build options to support ARM properly
- Check for non-android ARM machines
- Use blas_neon.cpp only for ARM machines
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Debadri Samaddar [Fri, 4 Aug 2023 09:42:21 +0000 (15:12 +0530)]
[Bug] Fix redundant call to sgemv fp16 function
Added conditions for handling the function call based on the USE__FP16 identifier.
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Thu, 3 Aug 2023 14:25:17 +0000 (19:55 +0530)]
[ GTEST ] Add gtest for NEON fp16 tensor unittest in Android
Enables the gtest for half precision NEON functions in Android(ARM).
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
Debadri Samaddar [Thu, 3 Aug 2023 11:20:55 +0000 (16:50 +0530)]
[ blas/neon ] Add NEON fp16 function for saxpy
Enable neon saxpy function for Android (ARM) __fp16 computation
Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
skykongkong8 [Fri, 4 Aug 2023 00:52:22 +0000 (09:52 +0900)]
[test] Enable fp16 golden test data
* generation: works with genLayerTests.py and uses record_single_fp16
* data comparison: from sizeCheckedReadTensor, read with the _FP16 memory size offset
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Wed, 2 Aug 2023 05:15:51 +0000 (14:15 +0900)]
[Compiler] Preserve connection order in multi-out realizer
Create multiout nodes in the given connection order when building the frequency map.
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
hyeonseok lee [Thu, 27 Jul 2023 12:57:40 +0000 (21:57 +0900)]
[bugfix] added warning flag to compile with gcc 13
- Added Wno-maybe-uninitialized flag
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
DongHak Park [Fri, 14 Apr 2023 08:35:07 +0000 (17:35 +0900)]
[TFLite Export] Add Realized Path for Fused Op
Made a realized path for fused ops:
1. Check trainable
- check whether a node is trainable or not for fusing
2. Conv + ReLU fusing
3. Batch normalization fusing
Signed-off-by: DongHak Park <donghak.park@samsung.com>
DongHak Park [Fri, 14 Apr 2023 08:27:46 +0000 (17:27 +0900)]
[TFLite Export] Add variable, functions TfOpNodes for Fused OP export
To export the TFLite format with fused ops, add some variables and functions:
1. Add getter, setter, and replace for weights
- for fused ops we need to adjust weights after the OpNode is made
2. Add the isToBeRemove variable
- after the OpNode is made, check the condition and mark it to be removed
3. Add additional_props
- for the BatchNormalization fused op we need additional props from nntrainer
- made a vector<float> variable to save the additional data
Signed-off-by: DongHak Park <donghak.park@samsung.com>
hyeonseok lee [Fri, 21 Jul 2023 11:12:38 +0000 (20:12 +0900)]
remove warning flags related to compile with gcc-13
- Remove warning flags which help to compile with gcc 13.
- Remove the multiout testcase because this test cannot guarantee the multiout layer order
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
Seungbaek Hong [Wed, 19 Jul 2023 02:21:02 +0000 (11:21 +0900)]
[ahub] fix ahub issues
Fix some issues of svace and coverity.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
hyeonseok lee [Mon, 17 Jul 2023 11:42:13 +0000 (20:42 +0900)]
[graph_node] handle deprecated stl iterator
- Explicitly provide the parameter, since the default parameter for the STL iterator is deprecated.
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
Adwaith Anand [Wed, 28 Jun 2023 10:19:43 +0000 (15:49 +0530)]
[ Tensor ] Support NHWC for dot, add/multiply_strided and other ops
This PR includes changes of Tensor and TensorDim to support NHWC
computation for dot, add_strided, multiply_strided, cat, split,
and transpose. It also includes unittests to evaluate.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Adwaith Anand <adwaith.a@samsung.com>
Signed-off-by: Manohara HK <manohara.hk@samsung.com>
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Donghyeon Jeong [Thu, 3 Aug 2023 04:52:03 +0000 (13:52 +0900)]
[Bug] Fix unchanged work in Apply template
FP16 is separated from FP32 in the apply function.
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Wed, 2 Aug 2023 08:23:07 +0000 (17:23 +0900)]
[ blas/neon ] Add neon_blas files
* Enable neon sgemv function in Android (ARM) __fp16 computation
* note: this pr includes a significant part of PR#1981 of nnstreamer/nntrainer
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Tue, 1 Aug 2023 02:42:00 +0000 (11:42 +0900)]
[Bug] Fix generating nan values in tensor
- Gradient tensor values were inconsistently set to NaN
- NaN values caused incorrect backwarding in the neural net
- Replacing malloc with calloc prevents allocating memory whose values read as NaN (see the sketch below)
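A one-line sketch of the rationale, with a hypothetical helper name:

```cpp
#include <cstddef>
#include <cstdlib>

// malloc leaves the block uninitialized, so reading it as float may yield NaN;
// calloc zero-fills, giving a defined starting value for the gradient buffer.
inline float *alloc_gradient(std::size_t n) {
  return static_cast<float *>(std::calloc(n, sizeof(float)));
}
```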
Signed-off-by: Donghyeon Jeong <djeong20@illinois.edu>
jijoong.moon [Fri, 28 Jul 2023 13:57:29 +0000 (22:57 +0900)]
[ Tensor ] Templatize apply member function
In order to support gcc-13 & ndk-build, the apply member function
needs to be templatized. It also makes more sense to define the apply
function this way.
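A hedged sketch of a templated apply; the real Tensor::apply signature in nntrainer may differ:

```cpp
#include <cstddef>
#include <functional>

// The element type is a template parameter so fp32 and fp16 paths share one
// definition of the element-wise apply loop.
template <typename T> class TensorLike {
  T *data_ = nullptr;
  std::size_t size_ = 0;

public:
  TensorLike &apply(std::function<T(T)> f) {
    for (std::size_t i = 0; i < size_; ++i)
      data_[i] = f(data_[i]);
    return *this;
  }
};
```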
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
jijoong.moon [Fri, 28 Jul 2023 10:49:52 +0000 (19:49 +0900)]
[ Mixed ] fix apply using casted function
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
jijoong.moon [Thu, 27 Jul 2023 00:14:57 +0000 (09:14 +0900)]
[ Mixed Tensor ] add tensor type property in initContext
This PR add the tensor type (Format, Weight Tensor DataType,
Activation Tensor DataType) in initContext.
- Remove the tensor type variables and setter, getter member function
in layer, layer_devel, loss layer etc.
- add tensor type setter in initContext
- set the var_grad (input & output) Tensor Type according to the model Tensor Data Type.
- Add ModelTensorTypeInfo: e.g. FP16_FP16 (Weight FP16, Activation FP16)
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
jijoong.moon [Wed, 26 Jul 2023 05:39:17 +0000 (14:39 +0900)]
[ Mixed Tensor ] Bug Fixes
This PR includes bug fixes for mixed tensor support.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Donghyeon Jeong [Wed, 26 Jul 2023 02:43:29 +0000 (11:43 +0900)]
[Tensor] Enable FP16 in gcc-13
- divide in tensor now supports FP16
- ranged in test util supports FP16
- fix zoneout_rate from fp16 to float
Signed-off-by: Donghyeon Jeong <djeong20@illinois.edu>
Donghyeon Jeong [Tue, 25 Jul 2023 09:11:19 +0000 (18:11 +0900)]
[Bug] Fix tensor_pool unittest error
Signed-off-by: Donghyeon Jeong <djeong20@illinois.edu>
Donghyeon Jeong [Tue, 25 Jul 2023 08:38:22 +0000 (17:38 +0900)]
Enable gcc-13 compile with FP16
- Match FP16 types to avoid greater conversion rank error
- Replace deprecated functions in gcc-13
- Add apply function for FP16 in Tensor
Signed-off-by: Donghyeon Jeong <djeong20@illinois.edu>
jijoong.moon [Mon, 24 Jul 2023 22:47:33 +0000 (07:47 +0900)]
[ Mixed Tensor ] Enable FP32 unittest cases
This PR enables the FP32 unittest cases. It includes various fixes and
adds compiler preprocessor pragmas.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Donghyeon Jeong [Thu, 20 Jul 2023 07:51:37 +0000 (16:51 +0900)]
[Bug] Fix memory access error in addValue
- Previously, memory access to tensor data was incorrect
- Changed to direct access to the data with the index instead of calculating the index
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 20 Jul 2023 04:19:50 +0000 (13:19 +0900)]
[Tensor] check data allocation in add/multiply_strided
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Thu, 20 Jul 2023 08:25:49 +0000 (17:25 +0900)]
[WIP] [__fp16] Verify through __fp16 unittests
* Uncomment __fp16 testcases, then verify & debug
* fix missing functions or variables in tensor and blas_interface
* TODO: lastly, fix the setDist function and find an erf function
Signed-off-by: skykongkong8 <kssjustin98@gmail.com>
Donghyeon Jeong [Wed, 19 Jul 2023 07:30:06 +0000 (16:30 +0900)]
[unittest] static cast answer data to fp16
- static_cast<__fp16> is needed to avoid a narrowing conversion error
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Wed, 19 Jul 2023 01:06:37 +0000 (10:06 +0900)]
[unittest] Add data type for testing tensor
- add Tdatatype to avoid errors
- default data type is FP32
- Tformat & Tdatatype are used to create TensorType
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>