platform/core/ml/nntrainer.git
7 months ago[Application] Remove the vocab and merges file in Repo remove_vocab
jijoong.moon [Wed, 23 Oct 2024 00:23:15 +0000 (09:23 +0900)]
[Application] Remove the vocab and merges file in Repo

Remove these files due to security issues.

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Change-Id: Ic2f39d100a199c4b742047b3634bdebb57392983
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
7 months ago[Unittest] re-activate the deactivated unittest
Seungbaek Hong [Mon, 21 Oct 2024 09:56:41 +0000 (18:56 +0900)]
[Unittest] re-activate the deactivated unittest

I have reactivated the deactivated unittest except for the test
cases that cause errors.

- enable mol_attention_layer unittest
- enable reduce_mean_layer unittest
- enable unittest_models except for test cases that cause errors

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
7 months ago[Unittest] Add lr_scheduler_cosine unit test
Donghak PARK [Mon, 21 Oct 2024 09:00:57 +0000 (18:00 +0900)]
[Unittest] Add lr_scheduler_cosine unit test

Add lr_scheduler_cosine Unit test case
- property setting test
- finalize test
- getLearningRate test

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
7 months ago[Unittest] Add network_graph unit test
Donghak PARK [Mon, 21 Oct 2024 05:50:29 +0000 (14:50 +0900)]
[Unittest] Add network_graph unit test

Add more unit tests for network_graph
- add reinitialize
- catch exceptions

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
7 months ago[Trivial] Moved Testing Rotary Embedding to Unittest Dir
Yash Singh [Wed, 16 Oct 2024 07:13:01 +0000 (12:43 +0530)]
[Trivial] Moved Testing Rotary Embedding to Unittest Dir

Moved testing_rotary_emb.cpp to unittest Directory.

Signed-off-by: Yash Singh <yash.singh@samsung.com>
7 months ago[GPU/Enhance] Registering Attention kernels and removing cl_context dependency
Yash Singh [Tue, 8 Oct 2024 07:13:17 +0000 (12:43 +0530)]
[GPU/Enhance] Registering Attention kernels and removing cl_context dependency

Added registerCLKernel function to register custom OpenCL kernels as well as in-house kernels.
Modified attention kernels to remove cl_context related dependencies.
Added initAttentionCLKernels function to register default attention kernels.
Modified unittest to remove layer_context dependency.
Added attention_kernel_strings.h to handle attention kernels in one place.
Rebased the PR with current log.

Signed-off-by: Yash Singh <yash.singh@samsung.com>
7 months ago[GPU/OpenCl] Kernel optimization
Yash Singh [Tue, 3 Sep 2024 11:39:35 +0000 (17:09 +0530)]
[GPU/OpenCl] Kernel optimization

Kernel Optimized for GPU. Some trivial changes in code.

Signed-off-by: Yash Singh <yash.singh@samsung.com>
7 months ago[Trivial] Unnecessary comments Removed
Yash Singh [Thu, 29 Aug 2024 06:34:51 +0000 (12:04 +0530)]
[Trivial] Unnecessary comments Removed

Comments removed from the code.

Signed-off-by: Yash Singh <yash.singh@samsung.com>
7 months ago[Trivial] New line at the end
Yash Singh [Wed, 28 Aug 2024 12:29:58 +0000 (17:59 +0530)]
[Trivial] New line at the end

Added newline at the end in new files.

Signed-off-by: Yash Singh <yash.singh@samsung.com>
7 months ago[GPU/OpenCL] Initial version of Rotary Embedding with OpenCL ops
Yash Singh [Wed, 28 Aug 2024 12:10:24 +0000 (17:40 +0530)]
[GPU/OpenCL] Initial version of Rotary Embedding with OpenCL ops

Added the initial version of the Rotary Embedding kernel for GPU. This includes both FP32 and FP16 implementations of the GPU kernel.

Signed-off-by: Yash Singh <yash.singh@samsung.com>
7 months ago[format] formatting files using clang-format
lhw414 [Sat, 12 Oct 2024 11:52:43 +0000 (20:52 +0900)]
[format] formatting files using clang-format

There was a formatting issue in the previous version due to incorrect clang-format settings, but I have now properly configured the settings in the local environment and reapplied clang-format.

Signed-off-by: lhw414 <dlgusdn0414@snu.ac.kr>
7 months ago[lr] fix scheduler functions
lhw414 [Sat, 12 Oct 2024 11:49:39 +0000 (20:49 +0900)]
[lr] fix scheduler functions

Fix cosineAnnealingLearningRateScheduler functions (finalize(), setProperty(), exportTo(), getLearningRate())

The previous version had an issue where incorrect formatting caused a class name change, leading to infinite recursion. In this version, that issue has been fixed.

Signed-off-by: lhw414 <dlgusdn0414@snu.ac.kr>
7 months ago[format] add new-line
lhw414 [Thu, 10 Oct 2024 12:55:50 +0000 (21:55 +0900)]
[format] add new-line

add new-line in lr_scheduler_cosine.h and lr_scheduler_cosine.cpp

Signed-off-by: lhw414 <dlgusdn0414@snu.ac.kr>
7 months ago[lr] add cosineAnnealinglr scheduler
lhw414 [Mon, 7 Oct 2024 05:48:31 +0000 (14:48 +0900)]
[lr] add cosineAnnealinglr scheduler

This commit implements the cosine annealing learning rate scheduler.
It includes methods for setting learning rate properties and error handling.
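
For readers unfamiliar with the technique, cosine annealing can be sketched as below. This is a minimal illustration, not the code added by this commit: the function name `cosine_annealing_lr` and the parameters `max_lr`, `min_lr`, and `decay_steps` are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not nntrainer's scheduler class): the rate starts at
// max_lr for step 0 and follows half a cosine wave down to min_lr at
// step == decay_steps.
inline double cosine_annealing_lr(double max_lr, double min_lr,
                                  std::size_t step, std::size_t decay_steps) {
  assert(decay_steps > 0 && "decay_steps must be positive");
  if (step >= decay_steps)
    return min_lr; // hold the floor once the schedule has finished
  const double pi = std::acos(-1.0);
  const double progress =
    static_cast<double>(step) / static_cast<double>(decay_steps);
  return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + std::cos(pi * progress));
}
```

Holding `min_lr` after the last step is a design choice of this sketch; a scheduler could just as well restart the cycle.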

Signed-off-by: lhw414 <dlgusdn0414@snu.ac.kr>
7 months ago[Loss] Remove Empty KLD Loss files
Donghak PARK [Fri, 18 Oct 2024 02:31:16 +0000 (11:31 +0900)]
[Loss] Remove Empty KLD Loss files

Currently, the KLD Loss file has only the function name and is empty.
Therefore, I will remove this file now and implement it after retesting with test cases and verification in the next issue #2757.

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
7 months ago[Test] Add unit tests for newly added tensor type
Donghyeon Jeong [Wed, 16 Oct 2024 07:37:29 +0000 (16:37 +0900)]
[Test] Add unit tests for newly added tensor type

This PR adds unit tests for the recently implemented tensor data types.
The purpose is to ensure their functionality and correctness by performing various test cases.

**Changes proposed in this PR:**
- Added unit tests to test functions for CharTensor and ShortTensor
- Check if both tensor data are contiguous before copy()

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
7 months ago[unittest] add negative unittest cases
Seungbaek Hong [Mon, 14 Oct 2024 11:37:56 +0000 (20:37 +0900)]
[unittest] add negative unittest cases

- added negative unittest cases
- deleted unittest cases that cause GTEST.MEANINGLESS_ASSERTION defect

**Self evaluation:**
1. Build test:   [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
7 months ago[header] add missing headers
hyeonseok [Mon, 30 Sep 2024 06:54:48 +0000 (15:54 +0900)]
[header] add missing headers

 - Added missing header "algorithm"

Signed-off-by: hyeonseok <hs89.lee@samsung.com>
7 months ago[ GPU ] fix return by reference to return by value
Eunju Yang [Thu, 10 Oct 2024 06:53:32 +0000 (15:53 +0900)]
[ GPU ] fix return by reference to return by value

- In the previous code, the `registerClKernel` function returned a reference
to a SharedPtrClKernel, which has two problems:
- returning by reference cannot return nullptr, leading to an error in
line 169 of `cl_context.cpp`
- returning a `shared_ptr` by reference may not increase the
reference counter, leading to an early-free problem.
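
The two problems listed above can be illustrated with a small sketch. `Kernel`, `SharedKernel`, and `KernelRegistry` are hypothetical stand-ins; the real `registerClKernel` has a different signature that this log does not show.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for the kernel type and the context that owns it.
struct Kernel {
  std::string name;
};
using SharedKernel = std::shared_ptr<Kernel>;

class KernelRegistry {
public:
  // Returning by value lets the failure path return nullptr, and the copy
  // handed to the caller increments the reference count.
  SharedKernel registerKernel(const std::string &name) {
    if (name.empty())
      return nullptr; // a `SharedKernel &` return has no object to bind here
    auto it = kernels_.find(name);
    if (it == kernels_.end())
      it = kernels_.emplace(name, std::make_shared<Kernel>(Kernel{name})).first;
    return it->second;
  }

  // Drops the registry's own references to every kernel.
  void clear() { kernels_.clear(); }

private:
  std::map<std::string, SharedKernel> kernels_;
};
```

Because the caller holds its own copy, the kernel stays alive even after the registry is cleared.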

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
7 months agoFix for compiling with testcoverage
jijoong.moon [Wed, 16 Oct 2024 01:58:16 +0000 (10:58 +0900)]
Fix for compiling with testcoverage

In order to generate the test coverage report with gcov, the build option needs to
be fixed.

- remove meson exclude option

Resolves:

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
7 months ago[coverity] fix coverity issue
Donghyeon Jeong [Wed, 25 Sep 2024 07:35:36 +0000 (16:35 +0900)]
[coverity] fix coverity issue

This PR resolves the issue by checking the value returned by malloc.
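
A minimal sketch of the pattern, assuming a hypothetical helper `alloc_zeroed` (the function actually touched by this commit is not named in the log):

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a zero-filled float buffer, or nullptr when
// the allocation fails. The caller releases the buffer with std::free.
inline float *alloc_zeroed(std::size_t count) {
  float *buf = static_cast<float *>(std::malloc(count * sizeof(float)));
  if (buf == nullptr)
    return nullptr; // never touch the buffer when malloc failed
  std::memset(buf, 0, count * sizeof(float));
  return buf;
}
```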

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
7 months ago[Filter] util to get nth info
Jaeyun Jung [Mon, 23 Sep 2024 06:24:57 +0000 (15:24 +0900)]
[Filter] util to get nth info

Use util function to get nth tensor-info ptr.
Also, remove unnecessary header include in filter subplugin.

Signed-off-by: Jaeyun Jung <jy1210.jung@samsung.com>
7 months ago[GPU/OpenCL] RMSNorm Bug Fix - Index value of alpha corrected in kernel logic.
Niket Agarwal [Thu, 10 Oct 2024 11:00:58 +0000 (16:30 +0530)]
[GPU/OpenCL] RMSNorm Bug Fix - Index value of alpha corrected in kernel logic.
Updated RMSNorm with the new shared_ptr flow.
Replaced clCreateKernel with registerClKernel.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Niket Agarwal <niket.a@samsung.com>
7 months ago[GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers
Niket Agarwal [Wed, 9 Oct 2024 07:34:20 +0000 (13:04 +0530)]
[GPU/OpenCL] Updated the SwiGLU, Reshape and Concat Layers
Updated the swiglu, reshape, and concat layers with the new shared_ptr flow.
Replaced clCreateKernel with registerClKernel for all these layers.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Niket Agarwal <niket.a@samsung.com>
7 months ago[update] GPU FC layer updated with latest pipeline changes
Debadri Samaddar [Mon, 7 Oct 2024 06:29:27 +0000 (11:59 +0530)]
[update] GPU FC layer updated with latest pipeline changes

Removed cl_context dependencies from fc_layer_cl.
Modified blas function calls accordingly.

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
7 months ago[bugfix/blas] Fixed sgemm_cl pointer check
Debadri Samaddar [Thu, 3 Oct 2024 09:32:08 +0000 (15:02 +0530)]
[bugfix/blas] Fixed sgemm_cl pointer check

Fixed failing condition in sgemm_cl and function call argument

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
7 months ago[bugfix] Fix sgemv_cl function call from blas_kernel_interface
Debadri Samaddar [Wed, 18 Sep 2024 11:00:20 +0000 (16:30 +0530)]
[bugfix] Fix sgemv_cl function call from blas_kernel_interface

Fixed the sgemv_cl function call. The unittest had been failing after recent changes.

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
8 months ago[ Print ] Update print result of model summary
Eunju Yang [Wed, 2 Oct 2024 05:31:32 +0000 (14:31 +0900)]
[ Print ] Update print result of model summary

- This commit updates the model summary print of the layer with multiple
inputs.

[ASIS]
             concat0              concat            1:1:14:2              input0
                                                     1:1:4:2              input1
                                                     1:1:8:2              input2

[TOBE]
             concat0              concat            1:1:14:2              input0
                                                                          input1
                                                                          input2

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
8 months ago[ App ] Multi-Input Example Update
Eunju Yang [Wed, 2 Oct 2024 05:24:32 +0000 (14:24 +0900)]
[ App ] Multi-Input Example Update

- This commit is related to issue #2660
- When using multi-inputs, users must currently feed the data in reverse order due
to a known bug that needs fixing. The previous example did not show this
because it used random data with the same dimensions for every input.
- To provide a more accurate example to NNTrainer users, I have
temporarily updated this example.
- Once the issue is handled, further updates will be necessary.

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
8 months ago[gpu/enhance] Utility for registering Blas kernels during initialization
Debadri Samaddar [Tue, 24 Sep 2024 04:49:47 +0000 (10:19 +0530)]
[gpu/enhance] Utility for registering Blas kernels during initialization

Default Blas kernel registration during cl_context initialization
Remove RunLayerContext dependency from unit tests

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
8 months ago[bugfix] Fix memcheck in CacheLoader unit tests
Donghyeon Jeong [Mon, 30 Sep 2024 04:43:08 +0000 (13:43 +0900)]
[bugfix] Fix memcheck in CacheLoader unit tests

This pull request fixes the issue of failing Continuous Integration.
The new patch checks for nullptr after a flush operation on the cache pool.
This adjustment is expected to rectify the previous failures encountered during the CI process.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
8 months ago[ LORA ] Bugfix in LoRA support in FC Layer
Eunju Yang [Thu, 5 Sep 2024 04:55:57 +0000 (13:55 +0900)]
[ LORA ] Bugfix in LoRA support in FC Layer

- In the previous code, LoRA didn't work for the case batch_size > 1.
- Tensors used in LoRA-related computation were not updated when the
batch size was updated.
- `setBatch()` function is implemented for `FullyConnectedLayer`.
- BugFix in Lifespan of loraTmp Tensor: FORWARD_DERIV_LIFESPAN ->
FORWARD_GRAD_LIFESPAN

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
8 months ago[ FC ] update incremental_forwarding to support LoRA and multi-batch
Eunju Yang [Mon, 2 Sep 2024 08:47:02 +0000 (17:47 +0900)]
[ FC ] update incremental_forwarding to support LoRA and multi-batch

- This commit adds code to support LoRA in incremental_forwarding.
- This commit updates incremental_forwarding to support multi-batch
input. However, this approach is not ideal because it cannot be
parallelized across the batch axis. I left a note about this issue in a comment.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
8 months ago[enhance/gpu] Removing layer_context dependency
Debadri Samaddar [Fri, 13 Sep 2024 07:53:30 +0000 (13:23 +0530)]
[enhance/gpu] Removing layer_context dependency

Removed layer_context dependency from blas OpenCL kernels.
Temporarily commented out cl_layers to avoid build failure.

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
8 months ago[enhance] Registering OpenCL kernels at cl_context
Debadri Samaddar [Wed, 11 Sep 2024 08:05:22 +0000 (13:35 +0530)]
[enhance] Registering OpenCL kernels at cl_context

Register custom kernels as well as in-house kernels at cl_context initialization

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
9 months ago[ Tizen7.0 ] Include some headers in -dev header for neuralnet.h
Eunju Yang [Wed, 11 Sep 2024 05:14:13 +0000 (14:14 +0900)]
[ Tizen7.0 ] Include some headers in -dev header for neuralnet.h

- In the previous PR (77e56f1), neuralnet.h was included in the dev package.
- However, some headers used in neuralnet.h were missing.
- This PR adds the headers that neuralnet.h depends on.
- This PR was tested to confirm that it supports the ReinforcementLearning app on
Tizen7.0.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
9 months ago[Tizen7.0] Tizen7.0 Backporting
Eunju Yang [Mon, 26 Aug 2024 05:17:18 +0000 (14:17 +0900)]
[Tizen7.0] Tizen7.0 Backporting

- This commit adds some updates for Tizen7.0 backporting
- Type mismatch bug is fixed.
- Unused variable is removed.
- Missing header files are added in spec file.
- spec file is updated

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
9 months ago[CI] Fix meson ubuntu ci build
Donghak PARK [Fri, 23 Aug 2024 03:00:15 +0000 (12:00 +0900)]
[CI] Fix meson ubuntu ci build

Fix build bug
- Currently, there is a bug in the matrix used in CI where the first Meson build runs successfully but subsequent builds fail due to the presence of a 'build' folder. I would like to fix this issue.
- Before running the Meson build, ensure that any existing folders named 'build' are deleted.
- fix gcc version to 13

Resolves:
- #2715

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Co-authored-by: hyeonseok <hs89.lee@samsung.com>
Co-authored-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[ Tizen7.0 ] Include neuralnet.h in -dev header
Eunju Yang [Thu, 5 Sep 2024 07:12:35 +0000 (16:12 +0900)]
[ Tizen7.0 ] Include neuralnet.h in -dev header

- Update the code to include `neuralnet.h` in -dev header.
- Some applications, e.g., ReinforcementLearning, use `forwarding` and
`backwarding` directly. To support them, this commit adds the header to
the dev package.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
9 months ago[bugfix] fix coverity issues release/tizen7.0 accepted/tizen/7.0/unified/20240830.164841
Donghyeon Jeong [Thu, 29 Aug 2024 03:53:20 +0000 (12:53 +0900)]
[bugfix] fix coverity issues

This PR resolves coverity issues in the ShortTensor class.
Replace the max_abs() implementation with maxValue(), since the maximum absolute value of an unsigned integer equals its maximum value.
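
The reasoning can be sketched as follows: for an unsigned element type, |x| == x, so the largest absolute value is simply the largest value. `max_value` here is a hypothetical free function over a plain vector, not the ShortTensor member.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ShortTensor data (unsigned 16-bit elements).
// No abs() pass is needed because every element is already non-negative.
inline std::uint16_t max_value(const std::vector<std::uint16_t> &data) {
  if (data.empty())
    return 0; // convention chosen for this sketch only
  return *std::max_element(data.begin(), data.end());
}
```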

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[Tizen7.0] Tizen7.0 Backporting
Eunju Yang [Mon, 26 Aug 2024 05:17:18 +0000 (14:17 +0900)]
[Tizen7.0] Tizen7.0 Backporting

- This commit adds some updates for Tizen7.0 backporting
- Type mismatch bug is fixed.
- Unused variable is removed.
- Missing header files are added in spec file.
- spec file is updated

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
9 months ago[ NNStreamer ] disable nnstreamer trainer
jijoong.moon [Mon, 22 Apr 2024 05:43:58 +0000 (14:43 +0900)]
[ NNStreamer ] disable nnstreamer trainer

Describe a commit content (Until 80 columns per line) in detail ASAP.

**Changes proposed in this PR:**
- Added TOC generator for README.md

Resolves:

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
9 months ago[ SPEC ] change fp16
jijoong.moon [Mon, 22 Apr 2024 05:07:21 +0000 (14:07 +0900)]
[ SPEC ] change fp16

Describe a commit content (Until 80 columns per line) in detail ASAP.

**Changes proposed in this PR:**
- Added TOC generator for README.md

Resolves:

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
9 months agotemporary code for layer initialization
hyeonseok lee [Thu, 21 Mar 2024 07:17:53 +0000 (16:17 +0900)]
temporary code for layer initialization

 - Temporary code for layer initialization

Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
9 months ago[BUILD] Remove Flag and Add FLEXIBLE PAGE Option
Donghak PARK [Tue, 13 Aug 2024 02:55:28 +0000 (11:55 +0900)]
[BUILD] Remove Flag and Add FLEXIBLE PAGE Option

Remove the flag from Android.mk
Add APP_SUPPORT_FLEXIBLE_PAGE_SIZES to Application.mk

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
9 months ago[BUILD] add APP_SUPPORT_FLEXIBLE_PAGE_SIZES
Donghak PARK [Tue, 13 Aug 2024 02:32:16 +0000 (11:32 +0900)]
[BUILD] add APP_SUPPORT_FLEXIBLE_PAGE_SIZES

To support the 16k page size, set APP_SUPPORT_FLEXIBLE_PAGE_SIZES
to true, according to the Android guide.

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
9 months ago[BUILD] Add more 16k shared lib package option on Android.mk
Donghak PARK [Tue, 13 Aug 2024 01:28:12 +0000 (10:28 +0900)]
[BUILD] Add more 16k shared lib package option on Android.mk

After #2699: add the options to all Android.mk files

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <donghak.park@samsung.com>
9 months ago[ BUILD ] Add 16K shared lib package option for Android
jijoong.moon [Wed, 7 Aug 2024 02:06:54 +0000 (11:06 +0900)]
[ BUILD ] Add 16K shared lib package option for Android

Android encourages the use of the 16KB package for shared libraries. This PR
adds the 16KB package option and also recommends using an NDK of
version r27 or higher.

Resolves:

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
9 months ago[enhance] Using 64 bit for LayerKernel enum
Debadri Samaddar [Tue, 27 Aug 2024 10:19:24 +0000 (15:49 +0530)]
[enhance] Using 64 bit for LayerKernel enum

Enhanced LayerKernel enum and mask for 64-bit values
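
A sketch of why the wider type matters, assuming the enum values are used as bit flags in a mask. The members of `KernelId` below are invented for illustration and are not nntrainer's `LayerKernel` entries.

```cpp
#include <cstdint>

// Hypothetical kernel ids with a 64-bit underlying type.
enum class KernelId : std::uint64_t {
  SGEMV = 1ULL << 0,
  SGEMM = 1ULL << 1,
  LATE_KERNEL = 1ULL << 40, // would not fit in a 32-bit underlying type
};

using KernelMask = std::uint64_t;

// Returns `mask` with the bit for `id` set.
constexpr KernelMask with_kernel(KernelMask mask, KernelId id) {
  return mask | static_cast<KernelMask>(id);
}

// Returns true when the bit for `id` is set in `mask`.
constexpr bool has_kernel(KernelMask mask, KernelId id) {
  return (mask & static_cast<KernelMask>(id)) != 0;
}
```

With a 32-bit mask, at most 32 distinct flags are available; the 64-bit type doubles that.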

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
9 months ago[Tensor] ShortTensor class with unsigned 16-bit integer
Donghyeon Jeong [Mon, 5 Aug 2024 10:49:22 +0000 (19:49 +0900)]
[Tensor] ShortTensor class with unsigned 16-bit integer

In this PR, a new type of tensor, the ShortTensor class, is designed explicitly for handling unsigned 16-bit integer data types.
This new tensor class aims to provide users with more options when working with various data types.
Note that the ShortTensor class does not support mathematical operations like multiplication or addition.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[ blas/bugfix ] Fix irrelevant function call
skykongkong8 [Mon, 26 Aug 2024 02:17:19 +0000 (11:17 +0900)]
[ blas/bugfix ] Fix irrelevant function call

- Since the current function implementations do not use CBLAS params, they should call the functions from cblas.h directly.

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
9 months ago[bugfix] Resolve fp16 enabled build error
Donghyeon Jeong [Fri, 23 Aug 2024 02:27:29 +0000 (11:27 +0900)]
[bugfix] Resolve fp16 enabled build error

This PR resolves the build error after #2704 when enable_fp16 is true.

This fixes:
blas_interface.cpp:141:9: error: ‘order’ was not declared in this scope
  141 |   sgemv(order, TransA, M, N, alpha, A_, lda, X_, incX, beta, Y_, incY);
      |         ^~~~~

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [ ]Passed [X]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[ Tensor ] Remove CBLAS params from Tensor related files.
skykongkong8 [Mon, 12 Aug 2024 04:15:53 +0000 (13:15 +0900)]
[ Tensor ] Remove CBLAS params from Tensor related files.

- Remove CBLAS params from tensor-related files since nntrainer is no longer fully dependent on CBLAS.
- Making tensors aware of CBLAS-related parameters made no sense in the first place.
- CBLAS params will be declared only when functions from CBLAS are called.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
9 months ago[ CAPI ] fix the Native API Ref Doc
jijoong.moon [Thu, 22 Aug 2024 05:38:45 +0000 (14:38 +0900)]
[ CAPI ] fix the Native API Ref Doc

Add MODULE in submodule name in doc file.

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
9 months ago[coverity] fix coverity issue
Donghyeon Jeong [Wed, 14 Aug 2024 11:49:11 +0000 (20:49 +0900)]
[coverity] fix coverity issue

This PR resolves the coverity issues of resource leak, unreachable code, and missing break.

**Changes proposed in this PR:**
- use static arrays instead of dynamic allocation to avoid resource leaks.
- remove unreachable code and add missing break statement.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[GPU/OPENCL] RMSNorm Accuracy Fix
Thummala Pallavi [Fri, 2 Aug 2024 06:41:34 +0000 (12:11 +0530)]
[GPU/OPENCL] RMSNorm Accuracy Fix

The alpha values were not picked correctly.

Signed-off-by: Thummala Pallavi <t.pallavi@samsung.com>
9 months ago[Layer] enhance ConcatLayer algorithms for efficient concatenation and split
Donghyeon Jeong [Wed, 14 Aug 2024 01:14:25 +0000 (10:14 +0900)]
[Layer] enhance ConcatLayer algorithms for efficient concatenation and split

This PR renovates reshape/concatenation algorithms to facilitate efficient concatenation and split in ConcatLayer.

Previously, dimension 2 (height) was set as a standard axis to operate concatenation.
However, this causes an overhead by copying a tensor size of 1 when the concat dimension is 3 (width).

The new algorithm consolidates all dimensions to the first and last axes based on the concat dimension, sets the standard axis to be 3, and performs concat and split.
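
One way to read the consolidation idea is sketched below: every input is viewed as a 2-D `[outer, inner]` matrix, where `outer` is the product of the dimensions before the concat axis and `inner` is the product of the concat axis and everything after it, so each row is copied as one contiguous run. `Blob` and `concat_along` are hypothetical names, and the sketch assumes contiguous data and matching non-concat dimensions without validating them.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Dim = std::array<std::size_t, 4>; // batch:channel:height:width

struct Blob { // hypothetical stand-in for a contiguous 4-D tensor
  Dim dim;
  std::vector<float> data;
};

// Concatenate along `axis` by joining the rows of the 2-D views.
inline Blob concat_along(const std::vector<Blob> &inputs, std::size_t axis) {
  Blob out;
  out.dim = inputs.at(0).dim;
  out.dim[axis] = 0;

  std::size_t outer = 1; // product of the dimensions before the concat axis
  for (std::size_t d = 0; d < axis; ++d)
    outer *= inputs[0].dim[d];

  std::vector<std::size_t> inner(inputs.size(), 1); // row length per input
  std::size_t out_inner = 0;                        // row length of the output
  for (std::size_t i = 0; i < inputs.size(); ++i) {
    for (std::size_t d = axis; d < 4; ++d)
      inner[i] *= inputs[i].dim[d];
    out.dim[axis] += inputs[i].dim[axis];
    out_inner += inner[i];
  }

  out.data.resize(outer * out_inner);
  for (std::size_t row = 0; row < outer; ++row) {
    float *dst = out.data.data() + row * out_inner;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
      const float *src = inputs[i].data.data() + row * inner[i];
      dst = std::copy(src, src + inner[i], dst); // one contiguous run per input
    }
  }
  return out;
}
```

Splitting is the same loop with source and destination swapped.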

**Changes proposed in this PR:**
- Revise creating helper dimension logic in finalize().
- Update forwarding() and calcDeriv() workflow to be efficient.
- Add descriptions for the new concat algorithm.

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[Tensor] Add optional output tensor for tensor concatenation
Donghyeon Jeong [Mon, 12 Aug 2024 02:13:29 +0000 (11:13 +0900)]
[Tensor] Add optional output tensor for tensor concatenation

This PR adds an optional feature in Tensor::cat to pass the output tensor to the function.
This change allows the user-given tensor to store the result of the concatenation without creating a new tensor.

**Changes proposed in this PR:**
- Add optional argument output (the output tensor) to the cat function.
- Add negative test cases for tensor concatenation.
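
The optional-output idea can be sketched in one dimension. `cat_into` and `cat` below are hypothetical helpers over `std::vector<float>`, not the `Tensor::cat` signature, and the size check mirrors the kind of condition a negative test case would exercise.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Parts = std::vector<std::vector<float>>;

// Total number of elements across all parts.
inline std::size_t total_size(const Parts &parts) {
  std::size_t n = 0;
  for (const auto &p : parts)
    n += p.size();
  return n;
}

// Writes the concatenation into a caller-provided buffer; no new buffer is
// created. Throws when the buffer size does not match.
inline void cat_into(const Parts &parts, std::vector<float> &output) {
  if (output.size() != total_size(parts))
    throw std::invalid_argument("output size does not match the inputs");
  auto dst = output.begin();
  for (const auto &p : parts)
    dst = std::copy(p.begin(), p.end(), dst);
}

// Convenience form that allocates the result itself.
inline std::vector<float> cat(const Parts &parts) {
  std::vector<float> output(total_size(parts));
  cat_into(parts, output);
  return output;
}
```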

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
9 months ago[Android] Support Android NDK r27 and higher
Donghyeon Jeong [Wed, 14 Aug 2024 02:05:30 +0000 (11:05 +0900)]
[Android] Support Android NDK r27 and higher

This PR enables NNTrainer to use Android NDK r27 to support compiling 16 KB-aligned shared libraries.

While -fpu is ignored and -mfloat-abi option is not valid with AArch64 targets, removing these options has no effect on using current NEON instructions for armv8.2.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Layer] Improve forwarding logic of ConcatLayer
Donghyeon Jeong [Thu, 8 Aug 2024 08:27:20 +0000 (17:27 +0900)]
[Layer] Improve forwarding logic of ConcatLayer

This PR updates current ConcatLayer forwarding for faster computation.

**Changes proposed in this PR:**
- Utilize the Tensor::concat() operation to perform forwarding and replace manual mapping and copying.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Tensor] CharTensor class with signed 8-bit integer
Donghyeon Jeong [Tue, 2 Apr 2024 07:19:56 +0000 (16:19 +0900)]
[Tensor] CharTensor class with signed 8-bit integer

In this PR, a new type of tensor, the CharTensor class, is designed explicitly for handling signed 8-bit integer data types that have already undergone quantization.
This new tensor class aims to provide users with more options when working with tensors and their respective data types.
Currently, the CharTensor class does not support mathematical operations like multiplication or addition. However, these features will be added in future updates.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[ matrix_transpose/bugfix ] Prevent reading/saving data from/to unallocated memory
skykongkong8 [Tue, 6 Aug 2024 04:35:37 +0000 (13:35 +0900)]
[ matrix_transpose/bugfix ] Prevent reading/saving data from/to unallocated memory

- The previous transpose kernel occasionally loaded from and saved to unallocated memory, and then masked the result.
- Now it does not read that memory in the first place, and instead loads with a for-loop.
- This slows down the fp16 matrix transpose, but the cost will not be dominant in total model latency.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Generalize redundant micro hgemm kernel implementation
skykongkong8 [Wed, 7 Aug 2024 11:41:39 +0000 (20:41 +0900)]
[ hgemm ] Generalize redundant micro hgemm kernel implementation

- The previous implementation naively used fixed-size ukernels for the K-direction accumulation.
- Such kernels were excessively long, but performed better than looping through a single K-iteration.
- However, recent test results have shown that just stacking 4 K iterations and looping through such a ukernel preserves the performance with better code readability.
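
The loop shape described above can be sketched for a single output element. This uses plain `float` and a hypothetical `dot_k4` helper; the real ukernels operate on fp16 tiles with NEON intrinsics, which this log does not show.

```cpp
#include <cstddef>

// One output element of C = A * B: accumulate along K four steps at a time,
// then finish the leftover K % 4 steps one by one.
inline float dot_k4(const float *a, const float *b, std::size_t K) {
  float acc = 0.0f;
  std::size_t k = 0;
  for (; k + 4 <= K; k += 4) {
    acc += a[k] * b[k] + a[k + 1] * b[k + 1] + a[k + 2] * b[k + 2] +
           a[k + 3] * b[k + 3];
  }
  for (; k < K; ++k)
    acc += a[k] * b[k];
  return acc;
}
```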

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[Layer] add Weight Layer
Seungbaek Hong [Tue, 30 Jul 2024 06:17:17 +0000 (15:17 +0900)]
[Layer] add Weight Layer

- This layer contains only weights for building tensor-level graph

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
10 months ago[ hgemm ] Apply hgemm util funcs at frequently used functions
skykongkong8 [Wed, 7 Aug 2024 01:26:45 +0000 (10:26 +0900)]
[ hgemm ] Apply hgemm util funcs at frequently used functions

- get_prev_mltpl_of_2p_n is frequently used in many hgemm kernels.
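
Going by its name, the utility rounds a size down to the previous multiple of 2^n. A sketch under that assumption follows; `prev_multiple_of_pow2` is a hypothetical function, and the actual definition of `get_prev_mltpl_of_2p_n` is not shown in this log.

```cpp
// Assumed meaning: the largest multiple of 2^n that is <= x.
// Requires n to be smaller than the bit width of unsigned int.
constexpr unsigned int prev_multiple_of_pow2(unsigned int x, unsigned int n) {
  return (x >> n) << n; // clear the lowest n bits
}
```

A kernel that processes 8 columns per step could use `prev_multiple_of_pow2(N, 3)` as the bound of its main loop and handle the remaining columns separately.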

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ trivial ] Add missing docs and error message
skykongkong8 [Wed, 7 Aug 2024 01:21:08 +0000 (10:21 +0900)]
[ trivial ] Add missing docs and error message

- Add missing doxygen tags: transpose boolean params
- Error message: emit an error when trying to use the full-fp16 kernel with the experimental kernel build

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Add hgemm experimental kernel
skykongkong8 [Thu, 1 Aug 2024 11:58:31 +0000 (20:58 +0900)]
[ hgemm ] Add hgemm experimental kernel

- According to the current paper, accumulating up to 64 ~ 128 w.r.t. the K-direction is fine.
- Since both the conventional error metric and the newly introduced metric (max component relative error) are fine as well, introduce the experimental kernel.
- The build option -Dhgemm-experimental-kernel=true enables this kernel for the Android build.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Implement hgemm_small
skykongkong8 [Thu, 1 Aug 2024 11:29:39 +0000 (20:29 +0900)]
[ hgemm ] Implement hgemm_small

- Forcibly adding zero-padding made indexing of small dimensions clumsy and redundant.
- Implement an explicit hgemm_small function to cover the M<8, N<16, K<16 case

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[refactor] Restructure getStringDataType function
Donghyeon Jeong [Fri, 2 Aug 2024 04:02:23 +0000 (13:02 +0900)]
[refactor] Restructure getStringDataType function

This patch updates the getStringDataType function structure to utilize method overriding.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Tensor] Update tensorbase for efficient creation of new tensor class.
Donghyeon Jeong [Tue, 30 Jul 2024 02:09:39 +0000 (11:09 +0900)]
[Tensor] Update tensorbase for efficient creation of new tensor class.

This PR updates the TensorBase class so that mathematical operations are no longer required when creating a new tensor class.
This change allows developers to easily create new classes without implementing math operations.
Note that these functions should be implemented to utilize tensor operations fully.

**Changes proposed in this PR:**
- Change math operation function from pure virtual function to virtual function
- Add a private function to get the data type as a string
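
The first change can be sketched as follows. This is a heavily simplified, hypothetical stand-in (`TensorBaseSketch`, `MinimalTensor`, and `sum()` are illustrative names, and `getStringDataType()` is public here only so the sketch is easy to exercise); it is not the actual TensorBase interface.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, simplified stand-in for a tensor base class.
class TensorBaseSketch {
public:
  virtual ~TensorBaseSketch() = default;

  // Still pure virtual: every tensor class must name its data type.
  virtual std::string getStringDataType() const = 0;

  // No longer pure virtual: the default reports that the operation is
  // unsupported, so a new tensor class compiles without overriding it.
  virtual float sum() const {
    throw std::invalid_argument("sum() is not supported for " +
                                getStringDataType());
  }
};

// A new tensor class only has to provide the non-math essentials.
class MinimalTensor : public TensorBaseSketch {
public:
  std::string getStringDataType() const override { return "MINIMAL"; }
};
```

Calling an operation that a class has not overridden then fails at run time with a descriptive message instead of blocking compilation.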

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months agoBUG FIX : Concat GPU Layer and CPU layer unittest cases name overlapping.
Niket Agarwal [Thu, 1 Aug 2024 05:56:52 +0000 (11:26 +0530)]
BUG FIX : Concat GPU Layer and CPU layer unittest cases name overlapping.

Renamed the concat GPU test cases in unittest_layers_concat_cl to distinguish them from the concat CPU test cases.

**Self evaluation:**
1. Build test:   [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Niket Agarwal <niket.a@samsung.com>
10 months ago[Doc] NNTrainer Tool Utilization Guide
Donghyeon Jeong [Fri, 26 Jul 2024 07:53:32 +0000 (16:53 +0900)]
[Doc] NNTrainer Tool Utilization Guide

This PR adds a guide for executing unit tests on the Android device.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Android] Verify Android NDK Installation and Configuration
Donghyeon Jeong [Fri, 26 Jul 2024 08:17:44 +0000 (17:17 +0900)]
[Android] Verify Android NDK Installation and Configuration

This patch checks if Android NDK is installed and configured before building using NDK in the Android test script.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[doc] Extend code documentation
Donghyeon Jeong [Wed, 31 Jul 2024 03:15:33 +0000 (12:15 +0900)]
[doc] Extend code documentation

This PR adds summary content to help users quickly understand the role and scope of the Tensor API.

**Self-evaluation:**
1. Build test: [ ]Passed [ ]Failed [X]Skipped
2. Run test:   [ ]Passed [ ]Failed [X]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[GPU/OpenCL] Initial version of Concat Layer with OpenCL ops
Niket Agarwal [Wed, 3 Jul 2024 10:42:38 +0000 (16:12 +0530)]
[GPU/OpenCL] Initial version of Concat Layer with OpenCL ops

Added naive version of OpenCL implementation for Concat Layer.
Incorporated kernel for ops used.
Added unit test for Concat_cl.

Signed-off-by: Niket Agarwal <niket.a@samsung.com>
10 months ago[ unittest ] Implement max_componentwise_relative_error
skykongkong8 [Mon, 15 Jul 2024 10:45:24 +0000 (19:45 +0900)]
[ unittest ] Implement max_componentwise_relative_error

- When comparing outputs computed with different precisions, the max componentwise relative error is needed.
- (trivial) Use a more precise comparison in the code that detects zeroDivisionError in the cosine similarity function
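
One common definition of this metric is max over i of |test_i - ref_i| / |ref_i|. The sketch below implements that form; the function name is hypothetical, and the metric in the actual unittest utility may be defined differently (for GEMM, the denominator is sometimes built from |A||B| instead). Skipping components whose reference is exactly zero is also an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: max_i |test_i - ref_i| / |ref_i|.
// Components with a reference value of exactly zero are skipped.
inline double
max_componentwise_relative_error_sketch(const std::vector<float> &ref,
                                        const std::vector<float> &test) {
  assert(ref.size() == test.size());
  double worst = 0.0;
  for (std::size_t i = 0; i < ref.size(); ++i) {
    if (ref[i] == 0.0f)
      continue; // avoid division by zero
    const double diff = std::fabs(static_cast<double>(test[i]) -
                                  static_cast<double>(ref[i]));
    worst = std::max(worst, diff / std::fabs(static_cast<double>(ref[i])));
  }
  return worst;
}
```

Unlike an averaged metric such as MSE, a single badly rounded element dominates the result, which is the point when comparing fp16 against fp32 outputs.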

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ unittest ] Use bounded value generator in hgemm unittests
skykongkong8 [Mon, 15 Jul 2024 10:20:01 +0000 (19:20 +0900)]
[ unittest ] Use bounded value generator in hgemm unittests

- According to recent papers, values distributed in [0, 1) or [-1, 1) are widely used when comparing fp16 and fp32 precision.
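
A bounded generator of this kind could look like the sketch below. The name `generate_bounded` and the fixed default seed are assumptions made for illustration, not the helper used in the unittests.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: fills a buffer with values drawn uniformly from
// [low, high). A fixed seed keeps the test inputs reproducible.
inline std::vector<float> generate_bounded(std::size_t n, float low,
                                           float high, unsigned seed = 42) {
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> dist(low, high);
  std::vector<float> out(n);
  for (float &v : out)
    v = dist(rng);
  return out;
}
```

For example, `generate_bounded(M * K, -1.0f, 1.0f)` would produce an input matrix whose dot products stay in a range where fp16 accumulation error is meaningful to compare.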

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ unittest ] Add TCs for checking padding-using GEMM
skykongkong8 [Mon, 15 Jul 2024 09:44:13 +0000 (18:44 +0900)]
[ unittest ] Add TCs for checking padding-using GEMM

- Add TCs checking for padding w.r.t. M, K, N, MK, KN, MKN

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Implement NYI functions from matrix A/B hgemm_padding
skykongkong8 [Mon, 15 Jul 2024 09:41:43 +0000 (18:41 +0900)]
[ hgemm ] Implement NYI functions from matrix A/B hgemm_padding

- Missing implementations might trigger unittest failures on Android.
- This patch supports the padding function for all combinations of the following conditions: matrix A / B, trans / noTrans, M / K / N direction

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Implement matrix noTrans A w.r.t. MK padding
skykongkong8 [Fri, 12 Jul 2024 05:15:03 +0000 (14:15 +0900)]
[ hgemm ] Implement matrix noTrans A w.r.t. MK padding

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ trivial ] Fix typo and add missing doxygen tags
skykongkong8 [Wed, 10 Jul 2024 10:07:44 +0000 (19:07 +0900)]
[ trivial ] Fix typo and add missing doxygen tags

- Fix typo and add missing doxygen tags
- Add more exact explanation for doxygen tag briefs

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Move hgemm_padding related files to explicit directory
skykongkong8 [Wed, 10 Jul 2024 09:10:38 +0000 (18:10 +0900)]
[ hgemm ] Move hgemm_padding related files to explicit directory

- Padding matrices is not an optimal approach, but it can serve as a sub-optimal option for now.
- The final goal for this directory is to delete the directory itself.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Remove unnecessary K1 GEMM functions
skykongkong8 [Wed, 10 Jul 2024 08:43:39 +0000 (17:43 +0900)]
[ hgemm ] Remove unnecessary K1 GEMM functions

- From a memory-layout perspective, when K = 1 the matrix transpose condition has no effect on the GEMM algorithm.
- Remove all K1 noTrans / transA / transB / transAB functions and unify them into a single function.
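
The reasoning is that an M x 1 matrix and its 1 x M transpose occupy memory identically, so with K = 1 the product reduces to an outer product regardless of the transpose flags. A minimal sketch, using `float` instead of fp16 and the hypothetical name `gemm_k1_sketch`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical unified K=1 routine:
//   C (MxN, row-major) = alpha * a * b^T + beta * C
// `a` holds the M elements of A and `b` the N elements of B. With K = 1,
// the transposed and non-transposed layouts of A and B are the same bytes,
// so no transpose flag is needed.
inline void gemm_k1_sketch(std::size_t M, std::size_t N, float alpha,
                           const std::vector<float> &a,
                           const std::vector<float> &b, float beta,
                           std::vector<float> &C) {
  assert(a.size() == M && b.size() == N && C.size() == M * N);
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t n = 0; n < N; ++n)
      C[m * N + n] = alpha * a[m] * b[n] + beta * C[m * N + n];
}
```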

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm/refactor ] Refactor hgemm file structure
skykongkong8 [Wed, 10 Jul 2024 08:34:43 +0000 (17:34 +0900)]
[ hgemm/refactor ] Refactor hgemm file structure

- Kernel functions are used regardless of matrix transpose, so they need to be included from a separate file.
- For further optimal implementation of matrix A / B / AB transpose blocking-kernel sequences, divide them into separate files for convenience.
- The 'hgemm' function itself should reside in the hgemm directory.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ unittest ] Add TC for K=1 hgemm case
skykongkong8 [Wed, 10 Jul 2024 04:38:53 +0000 (13:38 +0900)]
[ unittest ] Add TC for K=1 hgemm case

- Missing optimizations for the K=1 GEMM case were recently detected.
- Add such TC accordingly.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ trivial/hgemm ] Move hgemm_K1 to hgemm directory
skykongkong8 [Wed, 10 Jul 2024 04:36:15 +0000 (13:36 +0900)]
[ trivial/hgemm ] Move hgemm_K1 to hgemm directory

- For consistency, hgemm_K1 function should reside under hgemm directory

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ trivial ] Add doxygen tags for hgemm padding functions
skykongkong8 [Wed, 10 Jul 2024 04:27:26 +0000 (13:27 +0900)]
[ trivial ] Add doxygen tags for hgemm padding functions

- Add doxygen tags for hgemm padding functions

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Implement packing-blocking-kernel sequence for hgemm transB
skykongkong8 [Wed, 10 Jul 2024 03:27:39 +0000 (12:27 +0900)]
[ hgemm ] Implement packing-blocking-kernel sequence for hgemm transB

- Previously, hgemm transB computation was relying on transposing the entire matrix and using non-transpose sequence.
- For optimal performance, matrix packing-blocking-kernel sequence for transB case is explicitly implemented.
- Note that the current implementation only supports the 8x16 gemm kernel.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Separate source / header files for hgemm packing function
skykongkong8 [Wed, 10 Jul 2024 01:48:53 +0000 (10:48 +0900)]
[ hgemm ] Separate source / header files for hgemm packing function

- For easier implementation and maintenance of hgemm packing functions, separate them.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ Trivial/bugfix ] Add missing library to include
skykongkong8 [Wed, 10 Jul 2024 01:46:46 +0000 (10:46 +0900)]
[ Trivial/bugfix ] Add missing library to include

- add stdlib.h to hgemm_util.h

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months ago[ hgemm ] Implement matrix padding function
skykongkong8 [Wed, 10 Jul 2024 01:42:10 +0000 (10:42 +0900)]
[ hgemm ] Implement matrix padding function

- Since the current kernel / blocking functions support fixed shapes only, implement a padding function as a temporary solution.
- Note that a flexible kernel / blocking implementation should be added for optimal performance.
- The current implementation separates the padding functions for matrix A and B, but they will eventually be governed by a single function.
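
The padding idea can be sketched as copying the matrix into a zero-filled buffer whose dimensions are rounded up to the kernel's block sizes. The names `round_up`, `PaddedMatrix`, and `pad_matrix` are hypothetical, and `float` stands in for fp16.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round x up to the next multiple of `unit` (unit must be > 0).
inline std::size_t round_up(std::size_t x, std::size_t unit) {
  return (x + unit - 1) / unit * unit;
}

// Hypothetical result type: padded data plus its padded dimensions.
struct PaddedMatrix {
  std::vector<float> data;
  std::size_t rows;
  std::size_t cols;
};

// Copies a row-major rows x cols matrix into a zero-initialized buffer
// whose dimensions are multiples of row_unit and col_unit.
inline PaddedMatrix pad_matrix(const std::vector<float> &src,
                               std::size_t rows, std::size_t cols,
                               std::size_t row_unit, std::size_t col_unit) {
  assert(row_unit > 0 && col_unit > 0);
  assert(src.size() == rows * cols);
  PaddedMatrix out{{}, round_up(rows, row_unit), round_up(cols, col_unit)};
  out.data.assign(out.rows * out.cols, 0.0f);
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      out.data[r * out.cols + c] = src[r * cols + c];
  return out;
}
```

Zero padding leaves the GEMM result unchanged in the original M x N region, at the cost of an extra copy and wasted multiply-adds, which is why the commit treats it as temporary.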

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <ss.kong@samsung.com>
10 months agofix: incorrect C/C++ preprocessor macro
MyungJoo Ham [Fri, 26 Jul 2024 05:52:27 +0000 (14:52 +0900)]
fix: incorrect C/C++ preprocessor macro

When -DENABLE_ENCODER is given, you do
 #ifdef ENABLE_ENCODER
not
 #ifdef DENABLE_ENCODER
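
A small sketch of the difference: the `#define` below simulates passing -DENABLE_ENCODER on the compiler command line, and the two function names are hypothetical.

```cpp
// Simulates passing -DENABLE_ENCODER on the compiler command line:
// the "-D" is the option prefix, and the macro that gets defined is
// ENABLE_ENCODER.
#define ENABLE_ENCODER 1

constexpr bool encoder_enabled() {
#ifdef ENABLE_ENCODER // correct: matches the macro the option defines
  return true;
#else
  return false;
#endif
}

constexpr bool encoder_enabled_wrong_check() {
#ifdef DENABLE_ENCODER // wrong: this name is never defined by the option
  return true;
#else
  return false;
#endif
}
```

The wrong check compiles without any diagnostic and silently disables the guarded code, which makes this kind of typo easy to miss.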

CC: @baek2sm
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
10 months ago[bugfix] Resolves Android build warnings
Donghyeon Jeong [Mon, 22 Jul 2024 07:36:37 +0000 (16:36 +0900)]
[bugfix] Resolves Android build warnings

This PR resolves warnings that occur during the Android build. The list is as follows.

**Changes proposed in this PR:**
- Fix functions that override virtual functions but are not marked override.
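
The kind of fix involved can be sketched with two hypothetical types (`LayerBaseSketch` and `ConcatLikeSketch` are illustrative names, not classes from the repository):

```cpp
// Hypothetical base class with a virtual function.
struct LayerBaseSketch {
  virtual ~LayerBaseSketch() = default;
  virtual int outputs() const { return 1; }
};

struct ConcatLikeSketch : LayerBaseSketch {
  // `override` asks the compiler to verify that this really overrides a
  // base-class virtual. Clang can warn when the keyword is missing, for
  // example through -Winconsistent-missing-override.
  int outputs() const override { return 2; }
};
```

Beyond silencing the warning, the keyword turns a signature mismatch (for example a dropped `const`) into a compile error instead of a silently separate function.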

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[bugfix] Android build error when fp16 is enabled
Donghyeon Jeong [Mon, 22 Jul 2024 07:34:52 +0000 (16:34 +0900)]
[bugfix] Android build error when fp16 is enabled

This PR fixes issues of undefined symbols of one of the tensor constructors.
The function implementation is moved to the header file to resolve this issue.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Tensor] Operational Improvements and Functionality Simplification
Donghyeon Jeong [Fri, 12 Jul 2024 07:33:11 +0000 (16:33 +0900)]
[Tensor] Operational Improvements and Functionality Simplification

This commit moves several operations implementations to each Tensor class for easier management.
This allows users to create a new data type Tensor without unnecessary modification to the Tensor class.

**Changes proposed in this PR:**
- static function Tensor::cat() uses each tensor's member function concat().
- Tensor::copy() logic is simplified by not differentiating by its data type.
- Tensor::copy_with_stride() uses an internal function to operate.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Tensor] Update newly added features
Donghyeon Jeong [Tue, 9 Jul 2024 11:57:46 +0000 (20:57 +0900)]
[Tensor] Update newly added features

This commit updates recently added features in tensor, including add_i_partial() and ele_mul().
The newly added functions have been implemented according to the revised tensor structure.

**Changes proposed in this PR:**
- Update Float/HalfTensor class with newly added function, add_i_partial().
- Apply BLAS operations in basic arithmetic operations in Tensor.
- height-width transpose in half-precision can be SIMD accelerated.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[bugfix] Fix issues occurred in Tensor class refactoring
Donghyeon Jeong [Fri, 8 Mar 2024 02:37:22 +0000 (11:37 +0900)]
[bugfix] Fix issues occurred in Tensor class refactoring

This commit aims to fix several issues that arose due to the refactoring of the Tensor class.

**Changes proposed in this PR:**
- In this commit, the copy constructor is implemented to prevent incorrect behavior of the default copy constructor.
- Tensor add_i() has been newly implemented to fix previous incorrect implementations.
- Add chain() function that returns LazyTensor

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Refactor] Deprecate TensorV2 and replace Tensor class with TensorV2
Donghyeon Jeong [Wed, 6 Mar 2024 05:31:48 +0000 (14:31 +0900)]
[Refactor] Deprecate TensorV2 and replace Tensor class with TensorV2

This commit deprecates the existing TensorV2 class and replaces Tensor class with the new TensorV2 class.
The previous Tensor class has been removed and all its usages have been updated to use the TensorV2 class.
Additionally, all instances of TensorV2 usage within the NNTrainer have been removed.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
10 months ago[Application] Bug fix in RL example
Eunju Yang [Thu, 25 Jul 2024 01:52:14 +0000 (10:52 +0900)]
[Application] Bug fix in RL example

**Changes proposed in this PR:**
- This commit updates the DQN example.
- In the previous code, there was a bug: copying the main Net to the Target Net was not written as intended.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <ej.yang@samsung.com>
10 months ago[Android] Add android test script
Donghyeon Jeong [Tue, 23 Jul 2024 10:16:35 +0000 (19:16 +0900)]
[Android] Add android test script

This patch adds a script to run unit tests on Android devices.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>