[sgemm/neon] Optimized noTrans scenario for SGEMM
author Debadri Samaddar <s.debadri@samsung.com>
Wed, 11 Oct 2023 09:43:47 +0000 (15:13 +0530)
committer Jijoong Moon <jijoong.moon@samsung.com>
Thu, 12 Oct 2023 00:46:16 +0000 (09:46 +0900)
commit 6a25e34193b6ad4c6f52dc2ea21bf2d0ac9da394
tree e76a1304927e84e48de9c36476dd691343d30f67
parent 20e0c12e3a9d2fbdbc95962eb2f29629f108f8b4

Used NEON SIMD to calculate prefixes.
Added vectorization to process 16 rows together.

**Self evaluation:**
1. Build test:  [X]Passed [ ]Failed [ ]Skipped
2. Run test:  [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Debadri Samaddar <s.debadri@samsung.com>
nntrainer/tensor/blas_neon.cpp