MyungJoo Ham [Thu, 1 Feb 2024 06:29:01 +0000 (15:29 +0900)]
meson script condition fix
Whether to include fp16 code should depend on whether
fp16 is enabled, not on the platform name directly.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
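The condition fix above can be sketched as a meson fragment. This is illustrative only: the option name `enable-fp16` and the file name are assumptions, not the actual nntrainer build script.

```meson
# Illustrative sketch: gate fp16 sources on the fp16 feature flag
# rather than on the platform name. Option and file names are assumed.
if get_option('enable-fp16')
  nntrainer_sources += files('blas_neon_fp16.cpp')
endif
```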
MyungJoo Ham [Thu, 1 Feb 2024 06:28:41 +0000 (15:28 +0900)]
blas_neon.cpp: unsigned int type mismatch
Please do not ignore compiler warnings.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
MyungJoo Ham [Thu, 1 Feb 2024 10:17:00 +0000 (19:17 +0900)]
dist/Tizen: disable fp16 in Tizen
The NNTrainer FP16 implementation relies on NEON, which requires
the armv8.2-a ISA.
Tizen aarch64 targets armv8.0-a and therefore cannot support fp16-neon,
so fp16 is disabled for armv7l and aarch64.
Tizen x86/x64 does not support fp16 either.
This re-enables the Tizen build of nntrainer.
Please do not break the build in the main branch!
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
skykongkong8 [Mon, 22 Jan 2024 08:40:13 +0000 (17:40 +0900)]
[ Tensor ] Support non-contiguous case in sin, cos, inv_sqrt_i
- Outside the BLAS paths, the sin, cos, and inv_sqrt_i functions can also support the non-contiguous case.
- Fix the related functions and add unit tests accordingly.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Tue, 16 Jan 2024 01:19:38 +0000 (10:19 +0900)]
[ Trivial ] Add exception in inv_sqrt_i function
- In case of a non-contiguous Tensor, it is impossible to apply SIMD instructions. Add an exception accordingly.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Mon, 15 Jan 2024 07:47:10 +0000 (16:47 +0900)]
[ Trivial ] Refactor trigonometric functions
- In case of a non-contiguous Tensor, it is impossible to apply SIMD instructions. Add an exception accordingly.
- Rename the functions for intuitiveness: sin_transform -> sin, cos_transform -> cos
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Wed, 31 Jan 2024 06:25:29 +0000 (15:25 +0900)]
[ Bug ] Fix coverity issues
- Make non-const variables const, since their values are never changed in practice
- Use const auto & to avoid object copies
Resolves:
```
non-const type variable, but its value is never changed.
auto_causes_copy
```
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Wed, 31 Jan 2024 06:28:38 +0000 (15:28 +0900)]
[coverity] Fix coverity issues
This PR resolves the coverity issues that were identified.
**Changes proposed in this PR:**
- Specify the return type of the lambda function
- Use a reference to avoid copying the object.
This fixes:
- Use of auto that causes a copy (AUTO_CAUSES_COPY)
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Jiho Chu [Wed, 31 Jan 2024 01:21:08 +0000 (10:21 +0900)]
[FIX] Fix coverity issues
Issue:
1740106
1742375
1747001
Signed-off-by: Jiho Chu <jiho.chu@samsung.com>
hyeonseok lee [Tue, 30 Jan 2024 07:58:24 +0000 (16:58 +0900)]
[bug] fix coverity issues
- Specify the lambda return type to avoid object copy
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
Donghak PARK [Fri, 26 Jan 2024 00:17:08 +0000 (09:17 +0900)]
[CI] Rename label & upgrade Node version & add workflow failure handling
To improve the convenience and robustness of the GitHub Actions setup, make the following modifications:
1. Upgrade from Node 16 to Node 20, in accordance with the guidelines
- change the gitaction-script version to v7
- change gitaction-upload-artifact to v4
- ref : https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/
2. If the check_count check fails, no additional actions are executed and the workflow terminates immediately.
3. Adopt more descriptive and clearer names for better understanding.
**Changes proposed in this PR:**
renamed: .github/workflows/Upload.yml -> .github/workflows/check_count.yml
modified: .github/workflows/labeler.yml
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Mon, 22 Jan 2024 01:36:49 +0000 (10:36 +0900)]
[TensorV2] Multiplication support
This PR adds support for performing the multiplication operation on two tensors.
**Changes proposed in this PR:**
- TensorV2 includes member functions to perform tensor multiplication.
- FloatTensor and HalfTensor take TensorV2 as input/output to perform multiplication.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
MyungJoo Ham [Fri, 26 Jan 2024 04:46:42 +0000 (13:46 +0900)]
blas_neon: fix compiler errors in aarch64/Linux
With stricter compilers, the fp16 code does not compile.
Fix the type mismatches to enable testing outside Android.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Donghak PARK [Wed, 24 Jan 2024 04:38:09 +0000 (13:38 +0900)]
[CI] Add Pylint gitaction for gitaction ci
Add a pylint yml file for Python linting
- we are moving from TAOS CI to GitHub Actions
- the pylint workflow file is taken from the TensorFlow repository
- ref : https://github.com/tensorflow/tensorflow/blob/master/.github/workflows/pylint-presubmit.yml
- to test it, the formatting of the Python files is fixed
**Changes proposed in this PR:**
- pylint.yml
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Fri, 26 Jan 2024 02:19:51 +0000 (11:19 +0900)]
[coverity] Remove no effect code
This PR fixes a Coverity issue indicating code with no effect.
**Changes proposed in this PR:**
- Remove negative check (unsigned int is always non-negative).
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghak PARK [Thu, 25 Jan 2024 08:15:08 +0000 (17:15 +0900)]
[Trivial] Add new member & update CODEOWNERS
Add new member & update CODEOWNERS
**Changes proposed in this PR:**
modified: .github/CODEOWNERS
modified: CONTRIBUTING.md
modified: README.md
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Tue, 23 Jan 2024 02:22:37 +0000 (11:22 +0900)]
[TensorV2] multiply_strided() skeleton
This pull request introduces a basic structure of tensor multiplication operations that support different strided inputs and outputs.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Mon, 22 Jan 2024 04:22:41 +0000 (13:22 +0900)]
[Test] Generate TensorV2 in unit test
This PR includes the implementation of test util functions to create a tensor filled with values to utilize in unit testing.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghak PARK [Wed, 24 Jan 2024 07:22:11 +0000 (16:22 +0900)]
[CI] Add cpp file format checker
This patch adds a Github Action workflow to check cpp file format
- the workflow file is imported from deviceMLOps.MLAgent
- it uses the cpp_linter marketplace action
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Wed, 24 Jan 2024 01:20:50 +0000 (10:20 +0900)]
[CI] Add Clean meson build for gitaction ci
Add a clean meson build .yml file for CI
- The file was taken from nnstreamer's GitHub Actions setup and modified to fit nntrainer.
- ref : https://github.com/nnstreamer/nntrainer/blob/main/docs/getting-started.md
**Changes proposed in this PR:**
- .github/workflows/ubuntu_clean_meson_build.yml
Resolves:
- Add gitaction ci
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Tue, 23 Jan 2024 04:30:47 +0000 (13:30 +0900)]
[bugfix] Resolve segfault in tensor apply
This PR fixes a bug where a segmentation fault occurs when the output tensor is empty.
The fix initializes the output tensor when empty to avoid this error.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 19 Jan 2024 10:40:02 +0000 (19:40 +0900)]
[TensorV2] Enable copying data from Tensor
This PR enables deep copies of a contiguous tensor via the following functions: copy(), copyData(), and copy_with_strides().
The copy function copies the target tensor completely, regardless of the dimensions of the input tensor. All elements and properties of the original tensor are copied, so using copy creates a new tensor with the same size and shape as the original.
The copyData function, on the other hand, requires the sizes of the input and target tensors to match. It only copies the data of the original tensor, so if the size or shape differs, the copy may not be done properly.
Note that the copy and copyData functions support copying data across tensor data types, while copy_with_strides only supports copying between tensors of the same data type.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 19 Jan 2024 04:49:37 +0000 (13:49 +0900)]
[TensorV2] Reshape functionality
This commit implements reshaping a tensor to the given TensorDim under the following conditions:
1. The tensor to reshape is contiguous.
2. The length of the data matches the new TensorDim.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 19 Jan 2024 02:23:37 +0000 (11:23 +0900)]
[TensorV2] Add support for applying operators with broadcasting
This PR enables functionality to apply the given operator, such as multiply and divide, with broadcasting to the tensor.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Eunju Yang [Thu, 18 Jan 2024 02:45:18 +0000 (11:45 +0900)]
[DOCS] add instructions to create meson.build in how-to-create-model.md
* This commit adds a missing part to docs/how-to-create-model.md.
* It explains how to write the meson.build file of a new application (under Applications/MyApp/jni/) so that it can be built.
Signed-off-by: Eunju Yang <ej.yang@samsung.com>
Donghyeon Jeong [Tue, 16 Jan 2024 04:29:16 +0000 (13:29 +0900)]
[Tensor] Add broadcast support for operations
This PR adds broadcasting support so that future operations can broadcast their operands.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 12 Jan 2024 04:49:18 +0000 (13:49 +0900)]
[TensorV2] Multiplication operation skeleton
This pull request adds a basic implementation of tensor multiplication operations to our codebase.
The new functionality allows users to perform multiplication of tensors by simply calling a function.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
MyungJoo Ham [Thu, 11 Jan 2024 07:50:48 +0000 (16:50 +0900)]
meson: do not force-enable ml-api when it is not explicitly enabled.
The previous meson logic force-enabled ml-api whenever it was not
disabled and the common headers were found.
The new logic disables ml-api, even if the common headers are found,
when ml-inference is not found.
This allows building nntrainer on a system where only the common ML
headers are available, without any meson options.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Donghyeon Jeong [Thu, 11 Jan 2024 04:22:36 +0000 (13:22 +0900)]
[Test] Enabled unit testing for TensorV2 class
This PR enables unit testing for the TensorV2 class by adding a suite of tests that cover public methods.
More tests will be added in a future PR to further validate the TensorV2 class.
**Changes proposed in this PR:**
- Edit meson build file to include tensor v2 unit tests
- Fix public methods usage due to changed function use.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Tue, 9 Jan 2024 23:34:23 +0000 (08:34 +0900)]
[ unittest ] Add unittest for inv_sqrt_i with fp16
- There was a request in PR#2396 to add a unit test for inv_sqrt_i
- The test compares fp16 and fp32 Tensors with eps = 1e-3
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
MyungJoo Ham [Wed, 17 Jan 2024 02:23:09 +0000 (11:23 +0900)]
util_simd: make typename consistent (__fp16 --> _FP16)
_FP16 is the macro to unify different fp16 typenames
across different architectures or libraries.
Note that util_simd.cpp has the correct name (_FP16) while
the header has the incorrect naming (__fp16).
Although this does not break the build or execution,
this is not good for readability and dependency clean-ups.
CC: @skykongkong8
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
MyungJoo Ham [Tue, 16 Jan 2024 12:56:57 +0000 (21:56 +0900)]
Use getDataType() instead of getTensorType().data_type
Suggested by @djeong20 at https://github.com/nnstreamer/nntrainer/pull/2409#pullrequestreview-1822817983
Co-authored-by: Donghyeon Jeong <54725479+djeong20@users.noreply.github.com>
MyungJoo Ham [Fri, 12 Jan 2024 06:06:29 +0000 (15:06 +0900)]
fix: multi-head-attention incorrect macro usage.
1. Fix re-definitions of macros
2. Determine the mask num at runtime, not at compile time.
Whether users want fp16 cannot be known at compile time.
Fixes #2407
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
skykongkong8 [Mon, 15 Jan 2024 05:46:17 +0000 (14:46 +0900)]
[ util ] Implement swish function in util
- To accelerate the swish activation function, implement a swish calculation function for:
- neon fp32 / fp16
- raw fp32 / fp16
- The SIMD calculation of the exponential function is based on neon_mathfun
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Mon, 15 Jan 2024 05:35:21 +0000 (14:35 +0900)]
[ BLAS ] Refactor neon_mathfun
- For easier use of neon_mathfun, refactor it to avoid duplicated-symbol errors
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Wed, 10 Jan 2024 02:18:12 +0000 (11:18 +0900)]
[TensorV2] apply() function to apply a given function
This pull request implements a new function called apply(), which applies a given function element-by-element to a tensor.
The resulting tensor has the same shape as the input tensor, but each element has been transformed by the given function.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Mon, 8 Jan 2024 05:26:13 +0000 (14:26 +0900)]
[Tensor] Added getters and setters for private members of TensorBase class.
This PR adds accessors (getters) and mutators (setters) for the private data members of the TensorBase class.
This change allows TensorV2 to interact with these variables through the provided methods, improving encapsulation and making the code more maintainable.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Tue, 9 Jan 2024 10:28:39 +0000 (19:28 +0900)]
[ layer ] Apply neon simd acceleration in rotary embedding computation
- The previous rotary embedding computation was naively implemented with a for-loop
- With SIMD code, I expect this to be considerably faster without precision loss
- The current implementation only supports NEON SIMD (ARMv8)
- Trivial typo fixes included
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Tue, 9 Jan 2024 10:22:35 +0000 (19:22 +0900)]
[ util ] Add util_simd file
- This introduces util_simd. The code is kept separate from the BLAS files for the following reasons:
1. It is not 'Basic Linear Algebra Subprograms' functionality.
2. It is only used in very specific situations.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Mon, 8 Jan 2024 00:23:35 +0000 (09:23 +0900)]
[Tensor] Fix comparison operator
A tensor containing a NaN value cannot be equal to any other tensor, since NaNs are never equal. The comparison operator logic is changed to handle NaN values accordingly.
**Changes proposed in this PR:**
- Comparison operator returns false if tensor has a NaN value.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Mon, 8 Jan 2024 05:21:11 +0000 (14:21 +0900)]
[ BLAS ] Add inv sqrt inplace function
- Implement inv sqrt inplace function with neon / raw
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Mon, 8 Jan 2024 07:39:47 +0000 (16:39 +0900)]
[ Tensor ] Add trigonometric transformation functions in Tensor
- Add sin / cos transform functions to Tensor, for both the BLAS and raw paths
- Add unittest accordingly
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Mon, 8 Jan 2024 01:59:11 +0000 (10:59 +0900)]
[ BLAS ] Add trigonometric transformation functions
- To accelerate trigonometric calculations, add transformation functions to the BLAS module
- Add zlib license file
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Wed, 3 Jan 2024 05:05:38 +0000 (14:05 +0900)]
[Tensor] Support additional weight initialization
This PR enables additional weight initializers with a probability distribution.
**Changes proposed in this PR:**
- Functions to set tensor with random distribution are implemented.
- Tensor now supports various initializers besides zero and one.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Tue, 26 Dec 2023 12:10:23 +0000 (21:10 +0900)]
[Tensor] Sum by axis in column-major order
This PR enables a summation of tensor elements by axis in column-major order.
**Changes proposed in this PR:**
- Use sgemv in sum() with CblasColMajor when the tensor is column-major.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Tue, 2 Jan 2024 08:05:27 +0000 (17:05 +0900)]
[Tensor] Comparison operator overloading
This PR includes the implementation of comparison operators for TensorV2-related classes.
**Changes proposed in this PR:**
- TensorBase comparison operator compares Tensor information such as TensorDim.
- Float/HalfTensor comparison operator checks Tensor data.
- Destructor implementation is removed and set to default due to the rule of five.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Wed, 27 Dec 2023 04:35:43 +0000 (13:35 +0900)]
[TensorDim] Add column-major storage order
This PR defines the Tensor storage order in the TensorDim class to support Row-major and Column-major order.
**Changes proposed in this PR:**
- Add enum class StorageOrder to define storage order
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 28 Dec 2023 01:26:22 +0000 (10:26 +0900)]
[FP16] Include HalfTensor when enable_fp16
In this PR, HalfTensor is included only when FP16 is enabled.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Wed, 20 Dec 2023 02:04:09 +0000 (11:04 +0900)]
[Tensor] Enable additional constructors
In this PR, multiple constructors are supported in the original Tensor class.
- TensorV2 constructors decide which Tensor to create.
- TensorBase constructors handle initialization that is shared.
- Float/HalfTensor constructors manage their own unique initialization.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Tue, 26 Dec 2023 01:18:11 +0000 (10:18 +0900)]
[Tensor] Support source tensor allocation
In this PR, the float/half tensor can be allocated based on the source tensor.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Fri, 15 Dec 2023 00:26:04 +0000 (09:26 +0900)]
[ Ahub ] Fix Ahub issues
- Fixes : TensorV2 may not initialize itensor
- Fixes : itensor is dynamically allocated but never freed (no destructor)
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Thu, 7 Dec 2023 09:48:42 +0000 (18:48 +0900)]
[Tensor] Resolve SrcSharedTensorV2 cyclic dependency
This PR fixes the issue of SrcSharedTensorV2 containing TensorV2 which creates cyclic dependency.
(SrcSharedTensorV2 -> TensorV2 -> TensorBase -> SrcSharedTensorV2)
**Changes proposed in this PR:**
- SrcSharedTensorV2 owns TensorBase instead of TensorV2
- Rename SrcSharedTensorV2 as SrcSharedTensorBase accordingly
- Add functions to create and get shared data tensor
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 8 Dec 2023 00:23:43 +0000 (09:23 +0900)]
[Tensor] Add Float/HalfTensor Implementation
In this PR, FloatTensor and HalfTensor's override methods are implemented.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
jijoong.moon [Mon, 11 Dec 2023 07:08:37 +0000 (16:08 +0900)]
[ API ] Add Tensor CPP API for Auto Grad
In this PR,
. Add a skeleton Tensor class to the C++ API which inherits from nntrainer::var_grad.
. Include a setter and getter for the source layer that creates this tensor.
. Add a unit test case
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <jijoong.moon@samsung.com>
Donghyeon Jeong [Thu, 7 Dec 2023 12:23:45 +0000 (21:23 +0900)]
[Util] Fix error in using fp16.h functions
This PR fixes multiple definition error when using FP32/16 conversion functions in fp16.h
**Changes proposed in this PR:**
- Function definition is moved to fp16.cpp.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Tue, 5 Dec 2023 02:19:26 +0000 (11:19 +0900)]
[Tensor] Add Tensor member functions
This PR extends the current Tensor member functions.
**Changes proposed in this PR:**
- Add member functions to get tensor information.
- Implement skeleton code.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghak PARK [Mon, 4 Dec 2023 02:08:34 +0000 (11:08 +0900)]
[Unit Test] Fix unittest_interpreter
The unit test interpreter had previously been disabled due to build errors.
The TFLite-related unit tests have since been separated and their dependencies removed, reflecting changes made in nntrainer.
**Changes proposed in this PR:**
- modified: ../test/unittest/compiler/meson.build
- modified: ../test/unittest/compiler/unittest_interpreter.cpp
Resolves:
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Mon, 4 Dec 2023 02:21:45 +0000 (11:21 +0900)]
[Doc] Add TensorV2 class diagram
- Add class diagram of TensorV2
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 1 Dec 2023 06:24:46 +0000 (15:24 +0900)]
[Tensor] Refactored TensorV2 class
In this PR, the TensorV2 structure is refactored to use inheritance instead of a type erasure pattern.
**Changes proposed in this PR:**
- TensorV2 is a target, expected for a user to use, that contains a TensorBase pointer.
- TensorBase is an abstract class that provides default infrastructure code.
- FloatTensor class inherits the TensorBase class and overrides the pure virtual methods with 32-bit floating point calculation.
- HalfTensor class inherits the TensorBase class and overrides the pure virtual methods with 16-bit floating point calculation.
**Note**
This is a skeleton to build the structure with no implementation.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghak PARK [Mon, 27 Nov 2023 10:55:27 +0000 (19:55 +0900)]
[Unit Test] Add unittest_export for tflite export
I have created a file named "unittest_export" to test the tflite export functionality.
- So far, I have only tested this feature on a network graph basis.
- In order to conduct more accurate testing, I have generated a model for testing purposes.
- To facilitate future tests, I have developed a function called "run_tflite," which can retrieve outputs using the model name and input data.
- I have included the MNIST FULL model to assess the application's compatibility with existing models.
**Changes proposed in this PR:**
- Added unittest_export.cpp
- modify meson.build
- update unittest_interpreter.cpp
Related : #2371
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Mon, 27 Nov 2023 10:53:56 +0000 (19:53 +0900)]
[Unit Test] Remove tflite export related part in unittest_interpreter
The TFLite-related unit tests are removed from the interpreter test and moved to the unittest_export file.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Mon, 27 Nov 2023 10:25:21 +0000 (19:25 +0900)]
[Unit Test] Update meson.build file to add export test
Update the meson.build file to add the export test
- add unittest_export.cpp for the tflite export test
- add some dependencies for the unit test
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Fri, 1 Dec 2023 05:09:22 +0000 (14:09 +0900)]
[Tensor] Remove current TensorV2
- Remove TensorV2 class and related classes for designing a new pattern
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 1 Dec 2023 00:18:36 +0000 (09:18 +0900)]
[Trivial] Edit author list
- Edit authors list for tensor files
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 23 Nov 2023 03:49:44 +0000 (12:49 +0900)]
[Tensor] Apply function for TensorV2
- Add TensorV2 member function apply, which applies given function element by element.
- Add unit test for newly added TensorV2 function verification.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 23 Nov 2023 01:45:11 +0000 (10:45 +0900)]
[Tensor] TensorV2 class operator overloading
- Add the implementation of the operator overloading that is required.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 23 Nov 2023 01:41:17 +0000 (10:41 +0900)]
[Tensor] Addition setter/getter for TensorV2
This PR extends the current setter and getter for TensorV2 data.
**Self-evaluation:**
1. Build test: [ ]Passed [ ]Failed [X]Skipped
2. Run test: [ ]Passed [ ]Failed [X]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Thu, 23 Nov 2023 00:37:06 +0000 (09:37 +0900)]
[Tensor] Remove data type as input for Float/Half Tensor
As discussed in #2367, the Float/HalfTensor data type does not change. There is no need to take the data type as input, so it is removed.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghak PARK [Thu, 23 Nov 2023 13:50:41 +0000 (22:50 +0900)]
[Trivial] Update gitignore file
**Update gitignore file**
The current build process involves downloading the necessary encoder,
ctre-unicode, and json files for running LLM from an external repository
which is not tracked within the nntrainer repo.
In order to prevent developers from accidentally uploading
these files upstream and to make the process more convenient,
we will be updating the gitignore file.
**Changes proposed in this PR:**
- Update gitignore file
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghak PARK [Thu, 23 Nov 2023 13:17:07 +0000 (22:17 +0900)]
[Exporter] Update node_exporter
Update node_exporter.cpp and node_exporter.h
The issue was that new properties added to the layer node caused the existing
TFLite export code to fail to recognize the layer_node.
This problem was resolved by adding these properties to the node_exporter.
Added props:
- props::Packed
- props::Print
**Changes proposed in this PR:**
- modified: nntrainer/utils/node_exporter.cpp
- modified: nntrainer/utils/node_exporter.h
Resolves:
- tflite export error #2371
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
Donghyeon Jeong [Tue, 21 Nov 2023 23:13:07 +0000 (08:13 +0900)]
[Tensor] Initial Draft of Tensor Version 2
In this PR, the initial working version of Tensor V2 is included.
**Changes proposed in this PR:**
- Create a TensorV2 class that follows a type erasure design pattern.
- Create a new SrcSharedTensorV2, which is a class template.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Fri, 24 Nov 2023 06:56:11 +0000 (15:56 +0900)]
[neon/bugfix] Fix ewva function
- The ewva function was implemented incorrectly: operands should be added, not multiplied.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
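The fixed behavior can be sketched as a scalar reference (the routine in blas_neon.cpp is NEON-vectorized; this signature is an illustrative assumption, not nntrainer's actual one):

```cpp
// Scalar sketch of element-wise vector add (ewva). The bug was using
// multiplication where addition belongs; the correct behavior is Y += X.
// Signature is illustrative, not the real blas_neon.cpp interface.
void ewva(unsigned int N, const float *X, float *Y) {
  for (unsigned int i = 0; i < N; ++i)
    Y[i] = X[i] + Y[i]; // add, not multiply
}
```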
Donghyeon Jeong [Wed, 22 Nov 2023 00:26:27 +0000 (09:26 +0900)]
[Ahub] Fix AnalysisHub defects
**Changes proposed in this PR:**
- Fix uninitialized class members in the constructor
- Fix potential uninitialized data
- Add try-block to catch exceptions
- Check if malloc returns null
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 17 Nov 2023 08:15:19 +0000 (17:15 +0900)]
[Tensor] Include Half Tensor when FP16 is enabled
**Changes proposed in this PR:**
- Edit meson.build file to add half_tensor.cpp when enable_fp16 is true
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Donghyeon Jeong [Fri, 10 Nov 2023 04:26:39 +0000 (13:26 +0900)]
[Tensor] HalfTensor class for 16-bit floating point
This PR includes creating the HalfTensor class which separates 16-bit floating point calculation from nntrainer::Tensor.
**Changes proposed in this PR:**
- Create a HalfTensor class that only handles 16-bit floating point calculation.
- Remove operations for Quantized Tensor.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
hyunil park [Thu, 9 Nov 2023 06:26:00 +0000 (15:26 +0900)]
[Sub-plugin] Refactoring sub-plugin class
- Change NNTrainerTrain class name to NNTrainerImpl
- Change InputTensorsInfo class name to TensorsQueue
- Add push method to TensorsQueue
- Change member variables of NNTrainerImpl and TensorsQueue to private and rename some variables and methods
Signed-off-by: hyunil park <hyunil46.park@samsung.com>
Seungbaek Hong [Wed, 5 Jul 2023 07:05:31 +0000 (16:05 +0900)]
[Application] Add multi_input dataloader example
Added multi-input dataloader example application.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
Donghyeon Jeong [Fri, 10 Nov 2023 03:19:48 +0000 (12:19 +0900)]
[Tensor] FloatTensor class for 32-bit floating point
This PR includes creating the FloatTensor class which separates 32-bit floating point calculation from nntrainer::Tensor.
**Changes proposed in this PR:**
- Create a FloatTensor class that only handles 32-bit floating point calculation.
- Remove operations for Quantized Tensor.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Tue, 7 Nov 2023 00:41:44 +0000 (09:41 +0900)]
[trivial/bugfix] Add `inference_only_option` in multi_head_attention unittest
- We were using the `inference_only` option for the multi_head_attention fp16 unittest, since we do not have a loss scaling implementation yet.
- However, I discovered that the declaration of this option was missing, which could cause a malfunction at build time; it has been added accordingly.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 27 Oct 2023 08:29:11 +0000 (17:29 +0900)]
[gtest] Add test suites for multiHeadAttention with w16a16
- We already had a proper implementation of the multi-head attention layer with half-precision, but did not have any unittest cases.
- Add unittests accordingly
- Fix typo : last line indent
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 27 Oct 2023 07:41:03 +0000 (16:41 +0900)]
[layer] Support fp16 in embedding layer
- Add _FP16 code block in calcGrad in the embedding layer (forwarding does not need it)
- Add unittest accordingly
- Explicit code in gtest for the embedding layer:
- In the layer gtest, each test suite runs without knowing any context of its adjacent layers
- Thus, for atypical layers like the embedding layer (whose input and output have different data types), we should either refactor the code or process it explicitly
- Room for memory optimization
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 27 Oct 2023 04:48:12 +0000 (13:48 +0900)]
[gtest] Add gtest data for embedding layer
- Generating gtest data for the embedding layer must be differentiated from the other data for the following reasons:
1. The embedding layer takes 32-bit input, even when working with fp16 models
2. The embedding layer has a particular object called 'IndexSlices', and additional processing is needed to make it behave the way it does in NNTrainer
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 27 Oct 2023 01:41:02 +0000 (10:41 +0900)]
[gtest/trivial] Change notation in gtest: fp16fp16 to w16a16
- The previous notation 'fp16fp16' does not convey its true meaning: data files for layers with half-precision weights and activations
- Thus, I would like to propose the new notation 'w16a16', both for better readability and to avoid confusion once mixed-precision support lands in the near future.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 27 Oct 2023 00:29:34 +0000 (09:29 +0900)]
[layer] Support fp16 in dropout layer
- Confirm that the dropout layer needs no code changes to support multiple data types
- Add unittest accordingly
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Thu, 26 Oct 2023 02:33:11 +0000 (11:33 +0900)]
[layer] Support fp16 in lstm layer
- Add _FP16 code block to enable float16 functionality
- Add unittest accordingly
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Thu, 26 Oct 2023 01:58:30 +0000 (10:58 +0900)]
[layer] Support fp16 in concat layer
- Add _FP16 code block to enable float16 functionality
- Add unittest accordingly
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Fri, 3 Nov 2023 05:59:26 +0000 (14:59 +0900)]
[Utils] Conversion to/from half-precision floating point
This PR adds utility functions for converting between 16-bit floating point numbers (in bit representation) and 32-bit IEEE-format floating point numbers.
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
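The core of such a conversion can be sketched at the bit level. This is a deliberately simplified illustration (truncating rounding, subnormals flushed to zero, NaN collapsed to infinity), not the actual nntrainer utility:

```cpp
#include <cstdint>
#include <cstring>

// Simplified fp32 -> fp16 bit conversion sketch (not the nntrainer
// implementation): truncates the mantissa, flushes subnormal results
// to zero, and clamps overflow to infinity.
uint16_t fp32_to_fp16_bits(float f) {
  uint32_t x;
  std::memcpy(&x, &f, sizeof(x));           // reinterpret IEEE-754 bits
  uint16_t sign = (x >> 16) & 0x8000;       // move sign to bit 15
  int32_t exp = ((x >> 23) & 0xFF) - 127;   // unbias the fp32 exponent
  uint32_t mant = x & 0x7FFFFF;             // 23-bit mantissa
  if (exp > 15)
    return sign | 0x7C00;                   // overflow -> +/-inf
  if (exp < -14)
    return sign;                            // subnormal -> +/-0 (flushed)
  return sign | static_cast<uint16_t>((exp + 15) << 10) |
         static_cast<uint16_t>(mant >> 13); // rebias, truncate mantissa
}
```

A production version would additionally implement round-to-nearest-even and preserve NaN payloads.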
Donghyeon Jeong [Thu, 2 Nov 2023 00:50:45 +0000 (09:50 +0900)]
[Tensor] Support multiple data types in copyData
Previously, copyData only supported copying data of the same data type. Copying between different data types is needed, given the increased demand for mixed-precision flexibility.
**Changes proposed in this PR:**
- copyData supports copying data of different type with the use of NEON
- Remove the flate function
- utilize copyData in dequantize
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
Seungbaek Hong [Thu, 19 Oct 2023 09:44:31 +0000 (18:44 +0900)]
[docs] add how-to-create-model document
I have added a tutorial document on how users can build their own models using NNTrainer.
The current tutorial is only a first draft and needs updating.
It seems the API needs to be updated for ease of use,
as setting up data is very inconvenient for users right now.
(For that reason, this example uses a random data generator,
so users cannot yet learn how to train with real data.)
I also added this tutorial link to the README on the first page,
and since the list of maintainers and contributors currently occupies too much space,
I moved this part to the bottom of the README.
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
Seungbaek Hong [Wed, 18 Oct 2023 06:06:24 +0000 (15:06 +0900)]
[trivial] fix typo errors and delete duplicated script
Fix typo errors in the llama implementation and delete the duplicated
script (multi_head_attention copy.h).
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
skykongkong8 [Tue, 29 Aug 2023 02:19:56 +0000 (11:19 +0900)]
[TensorDim] Fix TensorDim constructor
- Previously, there was no default constructor covering 1-batch, 1-channel, 1-height, 1-width together with the newly added TensorType option (regardless of format).
- Since a lot of code in the FP32 implementation uses this case, I had no choice but to explicitly feed the tensor_type (which contains the fp16 info) to the previously defined tensor.
- For cleaner code, I would like to propose a new default constructor for better construction of TensorDim instances.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Donghyeon Jeong [Wed, 1 Nov 2023 00:00:57 +0000 (09:00 +0900)]
[bugfix] memory overwrite error fix in unittest_tizen_capi
This PR resolves the failing getWeight_01 test case in the unittest_tizen_capi.
In the ML API common data structure, the maximum rank in Tizen APIs has changed from 4 to 16 since Tizen 8.0.
However, the NNTrainer getWeight test uses a dimension with MAXDIM of 4.
This causes ml_tensors_info_get_tensor_dimension to overwrite unrelated memory, since it expects an array of length 16 while the test passes an array of length 4.
**Changes proposed in this PR:**
- Switch the order of defining variables to avoid memory overwrites.
This fixes:
[ RUN ] nntrainer_capi_nnmodel.getWeight_01
[ FAILED ] nntrainer_capi_nnmodel.getWeight_01 (10 ms)
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
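The hazard and the safe calling pattern can be sketched as follows. The constants follow the commit message, but the mock function and helper names are illustrative stand-ins, not the actual Tizen ML API implementation:

```cpp
#include <cstring>

// Hedged sketch of the getWeight_01 hazard: since Tizen 8.0 the ML API
// fills up to 16 dimension entries, while NNTrainer's MAXDIM is 4, so a
// 4-element caller buffer would be overrun.
constexpr unsigned int ML_TENSOR_RANK_LIMIT = 16; // Tizen 8.0+ max rank
constexpr unsigned int MAXDIM = 4;                // NNTrainer max rank

// Stand-in for ml_tensors_info_get_tensor_dimension: unconditionally
// writes ML_TENSOR_RANK_LIMIT entries into the caller's buffer.
void mock_get_tensor_dimension(unsigned int *dim) {
  for (unsigned int i = 0; i < ML_TENSOR_RANK_LIMIT; ++i)
    dim[i] = 1;
}

// Safe pattern: hand the API a full-rank buffer, then keep only the
// MAXDIM entries NNTrainer actually uses.
void get_weight_dims(unsigned int out[MAXDIM]) {
  unsigned int full[ML_TENSOR_RANK_LIMIT] = {0};
  mock_get_tensor_dimension(full);        // all 16 writes land in bounds
  std::memcpy(out, full, sizeof(unsigned int) * MAXDIM);
}
```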
skykongkong8 [Tue, 31 Oct 2023 05:45:08 +0000 (14:45 +0900)]
[blas/neon] Add copy function for fp32 and fp16
- The user interface for the fp32<->fp16 copy NEON function was missing.
- Add a BLAS interface to use these NEON functions
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
Seungbaek Hong [Wed, 18 Oct 2023 05:55:00 +0000 (14:55 +0900)]
[Application] LLaMA weights converter for mha model
I already added a weights converter for the mha model in PR #2287.
There were two converters, supporting the legacy and mha models.
The converter for the legacy model is now useless, so I deleted it.
Signed-off-by: Seungbaek Hong <sb92.hong@samsung.com>
hyeonseok lee [Tue, 31 Oct 2023 02:23:10 +0000 (11:23 +0900)]
[Application] fix deepq to make it run
- Make res directory and move DeepQ.ini file to res dir
- If the input batch size does not match the batch size property of the model graph,
  set the model graph batch size property to the input batch size in the forwarding function
- Allocate weight/tensor memory in train mode for the mainNet and targetNet networks
- Comment out the "return 1" statement so that it trains from scratch
Signed-off-by: hyeonseok lee <hs89.lee@samsung.com>
hs0207.kim [Tue, 24 Oct 2023 06:53:30 +0000 (15:53 +0900)]
Implementation of nndetector
Implementation of an application that runs object detection and learns personal objects on mobile
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: hs0207.kim <hs0207.kim@samsung.com>
Donghyeon Jeong [Mon, 30 Oct 2023 23:46:20 +0000 (08:46 +0900)]
[bugfix] Android ndk-build error fix
This PR resolves ndk-build issue in the tensor fp16 unit test.
**Changes proposed in this PR:**
- Move implementation to header file to avoid linker error
- Change ambiguous variable and function names
This fixes:
[arm64-v8a] Executable : unittest_nntrainer_tensor_fp16
ld: error: undefined symbol: nntrainer::Tensor::setScaleFactors16(std::__ndk1::vector<_Float16, std::__ndk1::allocator<_Float16> >)
>>> referenced by unittest_nntrainer_tensor_fp16.cpp:5805 (../unittest/unittest_nntrainer_tensor_fp16.cpp:5805)
**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
skykongkong8 [Tue, 24 Oct 2023 01:33:21 +0000 (10:33 +0900)]
[gtest] Fix gtest error assessing logic
- Float16 models tend to show unavoidably higher accuracy loss with: 1. huge tensors 2. tensors with huge values
- Reassess the value-by-value comparison to use relative error when the absolute error is large
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
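A comparison in the spirit of this fix can be sketched like so. The helper name and tolerance values are illustrative assumptions, not nntrainer's actual gtest helpers:

```cpp
#include <cmath>

// Hedged sketch of a mixed tolerance check: accept small absolute error
// outright, and fall back to relative error for large magnitudes, where
// fp16 rounding error grows with the exponent. Tolerances are
// illustrative, not the values used by nntrainer's tests.
bool values_match(float expected, float actual, float abs_tol = 1e-3f,
                  float rel_tol = 1e-2f) {
  float err = std::fabs(expected - actual);
  if (err <= abs_tol)
    return true; // small absolute error: pass
  float scale = std::fmax(std::fabs(expected), std::fabs(actual));
  return err <= rel_tol * scale; // large values: judge relatively
}
```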
skykongkong8 [Wed, 25 Oct 2023 04:36:33 +0000 (13:36 +0900)]
[bugfix] Fix Tensor save function when float16
- Previously, the wrong getData function was called when saving an fp16 Tensor
- Apply the proper template parameter: _FP16
Resolves:
```
...
23/41 unittest_nntrainer_tensor_fp16 FAIL 2.26 s (exit status 1)
...
[ FAILED ] nntrainer_Tensor.save_read_01_p
...
```
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
skykongkong8 [Fri, 20 Oct 2023 01:49:59 +0000 (10:49 +0900)]
[neon] Support scopy for multiple dataTypes in neon
- scopy_int4_to_fp32
- scopy_int8_to_fp32
- scopy_int8_to_fp16
- scopy_int8_or_int4 : since we use uint8_t for int4 tensors, code can be shared here
- vcvt_fp32_u32_bitwise : faster optimization is possible with bitwise operations rather than element-wise operations
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: skykongkong8 <ss.kong@samsung.com>
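Why the int4 and int8 paths can share code is easiest to see in scalar form: int4 tensors are stored as uint8_t with two 4-bit values per byte, so an int4 copy is an unpack step followed by the same widening conversion as int8. This is an illustrative sketch (the nibble order and unsigned interpretation are assumptions; the real routines are NEON-vectorized):

```cpp
#include <cstdint>
#include <vector>

// Scalar illustration of scopy_int4_to_fp32: unpack two 4-bit values
// from each byte, then widen to float. High-nibble-first ordering is an
// assumption for this sketch, not necessarily nntrainer's layout.
std::vector<float> copy_int4_to_fp32(const std::vector<uint8_t> &packed) {
  std::vector<float> out;
  out.reserve(packed.size() * 2);
  for (uint8_t byte : packed) {
    out.push_back(static_cast<float>(byte >> 4));   // high nibble
    out.push_back(static_cast<float>(byte & 0x0F)); // low nibble
  }
  return out;
}
```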