platform/upstream/openblas.git
8 years ago[z]ger: increase multithread threshold
Jerome Robert [Fri, 15 Jan 2016 17:40:13 +0000 (18:40 +0100)]
[z]ger: increase multithread threshold

The thresholds given in 3ae30cd were far too low because I
mixed up m and m*n in my measurements. Note that the new ones
are close to the [z]gemv ones, which is reassuring since it
suggests that both are right.
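
A minimal sketch of the idea, not the code from this commit: a rank-1 update touches every element of the m-by-n matrix, so the decision to use several threads is better made on the product m*n than on m alone. The name `ger_num_threads` and the cutoff `GER_MT_THRESHOLD` are hypothetical and chosen only for illustration; the real thresholds live in the OpenBLAS interface code.

```c
/* Hypothetical cutoff, counted in matrix elements (m * n).
   This is an illustrative value, not the one used by OpenBLAS. */
#define GER_MT_THRESHOLD 8192LL

/* Pick the number of threads for a rank-1 update of an m-by-n matrix.
   The amount of work grows with m * n, so the product is compared
   against the threshold rather than m on its own. */
static int ger_num_threads(long long m, long long n, int available_threads)
{
    if (m <= 0 || n <= 0 || available_threads < 1)
        return 1;
    if (m * n <= GER_MT_THRESHOLD)
        return 1;               /* small problem: threading overhead dominates */
    return available_threads;   /* large problem: use all threads */
}
```

With this sketch, `ger_num_threads(64, 64, 4)` stays single-threaded, while `ger_num_threads(100, 100, 4)` uses all four threads.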

8 years agoRef #740: updated nrm2_vfp.S
Werner Saar [Sat, 23 Jan 2016 16:47:58 +0000 (17:47 +0100)]
Ref #740: updated nrm2_vfp.S

8 years agoRef #740: updated asum_vfp.S and iamax_vfp.S
Werner Saar [Sat, 23 Jan 2016 13:44:34 +0000 (14:44 +0100)]
Ref #740: updated asum_vfp.S and iamax_vfp.S

8 years agoRef #750 and Ref #740 : bugfix for sdot, dsdot and ddot on arm
Werner Saar [Sat, 23 Jan 2016 10:59:51 +0000 (11:59 +0100)]
Ref #750 and Ref #740 : bugfix for sdot, dsdot and ddot on arm

8 years agoMerge pull request #747 from wernsaar/develop
wernsaar [Thu, 21 Jan 2016 13:21:59 +0000 (14:21 +0100)]
Merge pull request #747 from wernsaar/develop

Ref #730: added performance updates for syrk and syr2k

8 years agoadded updates for syrk and syr2k
Werner Saar [Thu, 21 Jan 2016 12:16:44 +0000 (13:16 +0100)]
added updates for syrk and syr2k

8 years agoMerge pull request #745 from jakirkham/minor_fix_scipy_prof
Zhang Xianyi [Wed, 20 Jan 2016 17:24:22 +0000 (11:24 -0600)]
Merge pull request #745 from jakirkham/minor_fix_scipy_prof

BENCH: Minor fixes in SciPy benchmarks

8 years agoMerge pull request #744 from jeromerobert/bug731
Zhang Xianyi [Wed, 20 Jan 2016 17:18:21 +0000 (11:18 -0600)]
Merge pull request #744 from jeromerobert/bug731

Bug731

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Overwrite will work on a Fortran array of the corre...
John Kirkham [Tue, 19 Jan 2016 20:32:28 +0000 (15:32 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Overwrite will work on a Fortran array of the correct type.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Overwrite will work on a Fortran array of the corre...
John Kirkham [Tue, 19 Jan 2016 20:31:37 +0000 (15:31 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Overwrite will work on a Fortran array of the correct type.

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Arrays should be Fortran order.
John Kirkham [Tue, 19 Jan 2016 20:29:43 +0000 (15:29 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Arrays should be Fortran order.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Arrays should be Fortran order.
John Kirkham [Tue, 19 Jan 2016 20:28:22 +0000 (15:28 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Arrays should be Fortran order.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Fix PEP8 issues.
John Kirkham [Tue, 19 Jan 2016 20:06:17 +0000 (15:06 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Fix PEP8 issues.

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Fix PEP8 issues.
John Kirkham [Tue, 19 Jan 2016 20:05:18 +0000 (15:05 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Fix PEP8 issues.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Write values into `C`.
John Kirkham [Tue, 19 Jan 2016 20:00:54 +0000 (15:00 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Write values into `C`.

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Write values into `C`.
John Kirkham [Tue, 19 Jan 2016 20:00:23 +0000 (15:00 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Write values into `C`.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Use the environment python.
John Kirkham [Tue, 19 Jan 2016 19:05:14 +0000 (14:05 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Use the environment python.

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Use the environment python.
John Kirkham [Tue, 19 Jan 2016 19:04:55 +0000 (14:04 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Use the environment python.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Drop unneeded semicolons.
John Kirkham [Tue, 19 Jan 2016 17:34:01 +0000 (12:34 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Drop unneeded semicolons.

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Drop unneeded semicolons.
John Kirkham [Tue, 19 Jan 2016 17:33:44 +0000 (12:33 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Drop unneeded semicolons.

8 years agobenchmark/scripts/SCIPY/ssyrk.py: Allocate `C` using zeros instead of randomly genera...
John Kirkham [Tue, 19 Jan 2016 17:32:26 +0000 (12:32 -0500)]
benchmark/scripts/SCIPY/ssyrk.py: Allocate `C` using zeros instead of randomly generating it.

8 years agobenchmark/scripts/SCIPY/dsyrk.py: Allocate `C` using zeros instead of randomly genera...
John Kirkham [Tue, 19 Jan 2016 17:32:14 +0000 (12:32 -0500)]
benchmark/scripts/SCIPY/dsyrk.py: Allocate `C` using zeros instead of randomly generating it.

8 years agoupdate CONTRIBUTORS.md
Jerome Robert [Tue, 19 Jan 2016 16:15:31 +0000 (17:15 +0100)]
update CONTRIBUTORS.md

8 years agoswap: disable multi-threading for small matrices
Jerome Robert [Mon, 18 Jan 2016 08:12:37 +0000 (09:12 +0100)]
swap: disable multi-threading for small matrices

Close #731

8 years agoDisable multi-threading for small matrices in [z]ger
Jerome Robert [Fri, 15 Jan 2016 16:12:04 +0000 (17:12 +0100)]
Disable multi-threading for small matrices in [z]ger

Ref #731

8 years agoRef #740: simple solution to clear floating point register on arm
Werner Saar [Sun, 17 Jan 2016 14:37:12 +0000 (15:37 +0100)]
Ref #740: simple solution to clear floating point register on arm

8 years agoFixed CMake bug for single core.
Zhang Xianyi [Thu, 14 Jan 2016 22:42:54 +0000 (06:42 +0800)]
Fixed CMake bug for single core.

8 years ago[av skip] Change test cmd on Travis.
Zhang Xianyi [Wed, 13 Jan 2016 02:44:49 +0000 (20:44 -0600)]
[av skip] Change test cmd on Travis.

8 years agoRefs #738. Fix previous commit bug. Run BLAS and CBLAS test on Travis.
Zhang Xianyi [Wed, 13 Jan 2016 02:01:49 +0000 (20:01 -0600)]
Refs #738. Fix previous commit bug. Run BLAS and CBLAS test on Travis.

8 years agoRefs #738. Run test on Travis.
Zhang Xianyi [Tue, 12 Jan 2016 22:52:47 +0000 (22:52 +0000)]
Refs #738. Run test on Travis.

8 years agoMerge branch 'develop' of github.com:xianyi/OpenBLAS into develop
Zhang Xianyi [Tue, 12 Jan 2016 22:25:36 +0000 (22:25 +0000)]
Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop

8 years agoMerge branch 'jeromerobert-bug736' into develop
Zhang Xianyi [Tue, 12 Jan 2016 22:25:08 +0000 (22:25 +0000)]
Merge branch 'jeromerobert-bug736' into develop

8 years ago#736 Revert #733 patch to fix bus error on ARM.
Zhang Xianyi [Tue, 12 Jan 2016 22:19:58 +0000 (22:19 +0000)]
#736 Revert #733 patch to fix bus error on ARM.

8 years agoMerge pull request #739 from sebastien-villemot/develop
Zhang Xianyi [Tue, 12 Jan 2016 20:47:34 +0000 (14:47 -0600)]
Merge pull request #739 from sebastien-villemot/develop

Fixes for old outstanding bugs in CBLAS test programs

8 years agoFix output descriptors of c_{s,d,c,z}blat3
Sébastien Villemot [Mon, 11 Jan 2016 10:22:17 +0000 (11:22 +0100)]
Fix output descriptors of c_{s,d,c,z}blat3

The NTRA argument can be equal to -1 if one does not want a snapshot file
(and this is the case with sample data {s,d,c,z}in3).
The routines {S,D,C,Z}PRCN3 will try to use their first argument as an output
unit number, so we avoid calling them when NTRA < 0.

Patch originally written by Camm Maguire.

8 years agoFix CBLAS double complex level 2 tests
Sébastien Villemot [Mon, 11 Jan 2016 10:15:33 +0000 (11:15 +0100)]
Fix CBLAS double complex level 2 tests

The SNAME variable contains names of C functions like "cblas_dgemv".
Apparently the code was not taking into account the 6-letter "cblas_"
prefix when determining the task to be done.

The issue does not affect c_{s,d,c}blat2.f, which use the correct
offsetting.
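
The offsetting problem can be illustrated in C (the actual test programs are Fortran, so this is only a sketch under that assumption). The helpers `blas_name` and `is_gemv` are hypothetical names introduced here; they show how the 6-letter "cblas_" prefix has to be skipped before the routine name is inspected.

```c
#include <string.h>

#define CBLAS_PREFIX     "cblas_"
#define CBLAS_PREFIX_LEN 6

/* Return the routine name without the "cblas_" prefix, if present. */
static const char *blas_name(const char *sname)
{
    if (strncmp(sname, CBLAS_PREFIX, CBLAS_PREFIX_LEN) == 0)
        return sname + CBLAS_PREFIX_LEN;
    return sname;
}

/* Return 1 if sname names a general matrix-vector product such as
   "cblas_zgemv": one type letter followed by "gemv". */
static int is_gemv(const char *sname)
{
    const char *name = blas_name(sname);
    return strlen(name) >= 5 && strcmp(name + 1, "gemv") == 0;
}
```

Without the prefix handling, a comparison at a fixed position would look at letters of "cblas_" instead of the routine name and select the wrong task.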

Patch originally written by Camm Maguire.

8 years agostack alloc: Fix stack smashing detection in 32bits
Jerome Robert [Sun, 10 Jan 2016 18:04:37 +0000 (19:04 +0100)]
stack alloc: Fix stack smashing detection in 32bits

* Fix commit 87a2ccc
* Close #736

8 years agoadded benchmark tests for ssyrk and dsyrk
Werner Saar [Sun, 10 Jan 2016 11:19:03 +0000 (12:19 +0100)]
added benchmark tests for ssyrk and dsyrk

8 years agoMerge pull request #734 from jeromerobert/common_stackalloc
Zhang Xianyi [Sat, 9 Jan 2016 04:13:37 +0000 (22:13 -0600)]
Merge pull request #734 from jeromerobert/common_stackalloc

Factorize MAX_STACK_ALLOC code to common_stackalloc.h

8 years agoFactorize MAX_STACK_ALLOC code to common_stackalloc.h
Jerome Robert [Sun, 3 Jan 2016 12:59:37 +0000 (13:59 +0100)]
Factorize MAX_STACK_ALLOC code to common_stackalloc.h

Ref #727

8 years agoMerge pull request #733 from yuyichao/arm-asm
Zhang Xianyi [Wed, 6 Jan 2016 01:35:12 +0000 (19:35 -0600)]
Merge pull request #733 from yuyichao/arm-asm

Do not use vsub to clear the register values

8 years agoDo not use vsub to clear the register values since it doesn't work with non-normal...
Yichao Yu [Tue, 5 Jan 2016 04:36:25 +0000 (23:36 -0500)]
Do not use vsub to clear the register values since it doesn't work with non-normal numbers.
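
The commit does not spell out which values are affected. In IEEE 754 arithmetic, the cases where `x - x` is not zero are NaN and infinity, which the following C sketch shows; the assembly kernels themselves are not reproduced here, and both function names are hypothetical.

```c
#include <math.h>

/* "Clear" a value by subtracting it from itself, as vsub would do with
   a register. The result is 0 only when x is finite. */
static float clear_by_subtract(float x)
{
    return x - x;
}

/* Clear by loading a constant zero; independent of the old contents. */
static float clear_by_constant(float x)
{
    (void)x;    /* previous contents are deliberately ignored */
    return 0.0f;
}
```

If the register happens to hold NaN or infinity, `clear_by_subtract` yields NaN, which then propagates through every later accumulation, whereas `clear_by_constant` always yields zero.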

8 years agoMerge pull request #732 from wernsaar/develop
wernsaar [Tue, 5 Jan 2016 14:34:08 +0000 (15:34 +0100)]
Merge pull request #732 from wernsaar/develop

added optimized trsm_kernels

8 years agoadded optimized trsm_kernels
Werner Saar [Tue, 5 Jan 2016 12:05:05 +0000 (13:05 +0100)]
added optimized trsm_kernels

8 years agoinclude sched.h if OS is Android
Werner Saar [Tue, 5 Jan 2016 11:36:49 +0000 (12:36 +0100)]
include sched.h if OS is Android

8 years agoMerge pull request #728 from jeromerobert/fix-no-stack-alloc
Zhang Xianyi [Mon, 4 Jan 2016 21:04:24 +0000 (15:04 -0600)]
Merge pull request #728 from jeromerobert/fix-no-stack-alloc

Fix make MAX_STACK_ALLOC=0

8 years agoFix compilation when MAX_STACK_ALLOC is not set
Jerome Robert [Thu, 31 Dec 2015 13:36:22 +0000 (13:36 +0000)]
Fix compilation when MAX_STACK_ALLOC is not set

Close #722

8 years agoLet make MAX_STACK_ALLOC=0 do what expected
Jerome Robert [Thu, 31 Dec 2015 13:32:53 +0000 (13:32 +0000)]
Let make MAX_STACK_ALLOC=0 do what expected

It's no longer required to modify Makefile.rule to disable
stack allocation. It's now possible to run:

make MAX_STACK_ALLOC=0

8 years agoMerge pull request #726 from jeromerobert/amd-e2-3200
Zhang Xianyi [Mon, 28 Dec 2015 18:53:11 +0000 (12:53 -0600)]
Merge pull request #726 from jeromerobert/amd-e2-3200

Fix detection of AMD E2-3200

8 years agoMerge pull request #725 from jeromerobert/make-nb-jobs
Zhang Xianyi [Mon, 28 Dec 2015 18:48:49 +0000 (12:48 -0600)]
Merge pull request #725 from jeromerobert/make-nb-jobs

Allow to force the number of parallel make job

8 years agoFix detection of AMD E2-3200
Jerome Robert [Mon, 28 Dec 2015 18:26:43 +0000 (19:26 +0100)]
Fix detection of AMD E2-3200

8 years agoAllow to force the number of parallel make job
Jerome Robert [Mon, 28 Dec 2015 18:28:42 +0000 (19:28 +0100)]
Allow to force the number of parallel make job

This is particularly useful when using distcc

8 years agoMerge branch 'develop' of github.com:xianyi/OpenBLAS into develop
Zhang Xianyi [Mon, 14 Dec 2015 16:07:10 +0000 (10:07 -0600)]
Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop

8 years agoFixed rotg bug on ARM.
Zhang Xianyi [Mon, 14 Dec 2015 16:07:01 +0000 (10:07 -0600)]
Fixed rotg bug on ARM.

8 years agoMerge pull request #713 from btracey/patch-2
Zhang Xianyi [Thu, 10 Dec 2015 16:13:49 +0000 (10:13 -0600)]
Merge pull request #713 from btracey/patch-2

Fix Dormbr to perform the correct size operations with RowMajor

8 years agoMerge pull request #711 from btracey/patch-1
Zhang Xianyi [Thu, 10 Dec 2015 16:13:12 +0000 (10:13 -0600)]
Merge pull request #711 from btracey/patch-1

Fix Dormlq to perform the correct size operations with RowMajor

8 years agoFix Dormbr to perform the correct size operations with RowMajor
Brendan Tracey [Wed, 9 Dec 2015 07:50:22 +0000 (00:50 -0700)]
Fix Dormbr to perform the correct size operations with RowMajor

Fixes issue #712

8 years agoFix Dormlq to perform the correct size operations with RowMajor
Brendan Tracey [Wed, 9 Dec 2015 05:34:21 +0000 (22:34 -0700)]
Fix Dormlq to perform the correct size operations with RowMajor

Fixes issue #615.

8 years agoMerge branch 'develop' of github.com:xianyi/OpenBLAS into develop
Zhang Xianyi [Fri, 4 Dec 2015 16:46:42 +0000 (00:46 +0800)]
Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop

8 years agoRefs #708. Modified config template for MSVC.
Zhang Xianyi [Fri, 4 Dec 2015 16:45:29 +0000 (00:45 +0800)]
Refs #708. Modified config template for MSVC.

8 years agoRefs #706. Fixed lapacke installation error.
Zhang Xianyi [Wed, 2 Dec 2015 17:32:39 +0000 (01:32 +0800)]
Refs #706. Fixed lapacke installation error.

8 years agoMerge pull request #704 from tkelman/patch-1
Zhang Xianyi [Tue, 1 Dec 2015 04:37:25 +0000 (22:37 -0600)]
Merge pull request #704 from tkelman/patch-1

fix makefile warning when renaming symbols

8 years agofix makefile warning when renaming symbols
Tony Kelman [Tue, 1 Dec 2015 04:16:33 +0000 (20:16 -0800)]
fix makefile warning when renaming symbols

use different names for `openblas*.renamed` between osx and other unices, fixes
```
Makefile:121: warning: overriding commands for target `../libopenblas64_p-r0.2.15.a.renamed'
Makefile:100: warning: ignoring old commands for target `../libopenblas64_p-r0.2.15.a.renamed'
```

also clean `*.renamed`

8 years agoRefs #697. Fixed gemv bug for Windows.
Zhang Xianyi [Mon, 30 Nov 2015 21:19:45 +0000 (15:19 -0600)]
Refs #697. Fixed gemv bug for Windows.

Thanks to matzeri for the patch.

8 years agoRefs #702. Delete redundant xerbla exporting
Zhang Xianyi [Mon, 30 Nov 2015 17:08:33 +0000 (11:08 -0600)]
Refs #702. Delete redundant xerbla exporting

8 years agoRefs #699. Split the obj list of LAPACKE 3.6.0.
Zhang Xianyi [Tue, 24 Nov 2015 19:15:28 +0000 (13:15 -0600)]
Refs #699. Split the obj list of LAPACKE 3.6.0.

8 years agoMerge pull request #690 from rayglover/msvc-fix
Zhang Xianyi [Mon, 23 Nov 2015 17:05:37 +0000 (11:05 -0600)]
Merge pull request #690 from rayglover/msvc-fix

(Visual Studio) Don't use C99 complex numbers when building C++ code.

8 years agoMerge pull request #696 from ashwinyes/develop_20151120_lapack_test_fixes
Zhang Xianyi [Mon, 23 Nov 2015 17:04:42 +0000 (11:04 -0600)]
Merge pull request #696 from ashwinyes/develop_20151120_lapack_test_fixes

Cortex A57 fixes and Lapack 3.6.0

8 years agofix for bad or outdated mingw compiler
Werner Saar [Mon, 23 Nov 2015 15:20:14 +0000 (16:20 +0100)]
fix for bad or outdated mingw compiler

8 years agolapack-test fixes in nrm2 kernels for Cortex A57
Ashwin Sekhar T K [Mon, 23 Nov 2015 06:38:56 +0000 (12:08 +0530)]
lapack-test fixes in nrm2 kernels for Cortex A57

8 years agolapack fixes for Windows
Werner Saar [Sat, 21 Nov 2015 13:33:27 +0000 (14:33 +0100)]
lapack fixes for Windows

8 years agofixes for cross compile
Werner Saar [Sat, 21 Nov 2015 09:48:37 +0000 (10:48 +0100)]
fixes for cross compile

8 years agobugfix for cross compiling
Werner Saar [Fri, 20 Nov 2015 12:47:22 +0000 (13:47 +0100)]
bugfix for cross compiling

8 years agoadded lapack-3.6.0
Werner Saar [Fri, 20 Nov 2015 08:45:46 +0000 (09:45 +0100)]
added lapack-3.6.0

8 years agoremoved lapack-3.5.0
Werner Saar [Fri, 20 Nov 2015 08:41:59 +0000 (09:41 +0100)]
removed lapack-3.5.0

8 years agoincrease the stack size limit in the constructor
Werner Saar [Fri, 20 Nov 2015 08:23:01 +0000 (09:23 +0100)]
increase the stack size limit in the constructor

8 years agoFix blas_lock for arm64
Ashwin Sekhar T K [Thu, 19 Nov 2015 20:15:35 +0000 (01:45 +0530)]
Fix blas_lock for arm64

8 years agolapack-test fixes for Cortex A57
Ashwin Sekhar T K [Thu, 19 Nov 2015 19:45:04 +0000 (01:15 +0530)]
lapack-test fixes for Cortex A57

8 years agoChange BUFFER_SIZE for Cortex A57 to 20 MB
Ashwin Sekhar T K [Thu, 19 Nov 2015 19:42:04 +0000 (01:12 +0530)]
Change BUFFER_SIZE for Cortex A57 to 20 MB

Change the GEMM_P, GEMM_Q, GEMM_R values for Cortex A57

8 years ago(Visual Studio) Don't use C99 complex numbers when building C++ code.
Ray Glover [Tue, 17 Nov 2015 17:29:30 +0000 (17:29 +0000)]
(Visual Studio) Don't use C99 complex numbers when building C++ code.

9 years agoFix #686. Merge branch 'ashwinyes-develop' into develop
Zhang Xianyi [Tue, 10 Nov 2015 20:30:26 +0000 (04:30 +0800)]
Fix #686. Merge branch 'ashwinyes-develop' into develop

9 years agoUse 40 MB buffer for ARM Cortex A57.
Zhang Xianyi [Tue, 10 Nov 2015 20:22:34 +0000 (04:22 +0800)]
Use 40 MB buffer for ARM Cortex A57.

9 years agoDelete vi swap file.
Zhang Xianyi [Tue, 10 Nov 2015 20:19:43 +0000 (04:19 +0800)]
Delete vi swap file.

9 years agoMerge branch 'develop' of https://github.com/ashwinyes/OpenBLAS into ashwinyes-develop
Zhang Xianyi [Tue, 10 Nov 2015 20:16:22 +0000 (04:16 +0800)]
Merge branch 'develop' of https://github.com/ashwinyes/OpenBLAS into ashwinyes-develop

9 years agoUpdate develop version.
Zhang Xianyi [Tue, 10 Nov 2015 20:14:58 +0000 (04:14 +0800)]
Update develop version.

9 years agoMerge pull request #684 from sebastien-villemot/develop
Zhang Xianyi [Mon, 9 Nov 2015 17:39:21 +0000 (11:39 -0600)]
Merge pull request #684 from sebastien-villemot/develop

Fix detection of POWER architecture in c_check.

9 years agoFix detection of POWER architecture in c_check.
Sébastien Villemot [Mon, 9 Nov 2015 17:36:04 +0000 (18:36 +0100)]
Fix detection of POWER architecture in c_check.

This is necessary to avoid the false detection of a cross-compiling
environment.

9 years agoFix bug in benchmark/gemm.c
Ashwin Sekhar T K [Fri, 6 Nov 2015 14:45:05 +0000 (20:15 +0530)]
Fix bug in benchmark/gemm.c

9 years agoOptimized trmm kernels for CORTEXA57
Ashwin Sekhar T K [Mon, 2 Nov 2015 14:00:28 +0000 (19:30 +0530)]
Optimized trmm kernels for CORTEXA57

9 years agoOptimized zgemm kernel for CORTEXA57
Ashwin Sekhar T K [Mon, 2 Nov 2015 13:28:28 +0000 (18:58 +0530)]
Optimized zgemm kernel for CORTEXA57

9 years agoOptimized cgemm kernel for CORTEXA57
Ashwin Sekhar T K [Mon, 2 Nov 2015 13:10:27 +0000 (18:40 +0530)]
Optimized cgemm kernel for CORTEXA57

Also, add a generic ztrmm 4x4 kernel

9 years agoOptimized dgemm kernel for CORTEXA57
Ashwin Sekhar T K [Mon, 2 Nov 2015 12:23:28 +0000 (17:53 +0530)]
Optimized dgemm kernel for CORTEXA57

9 years agoImprove the sgemm kernel for CORTEXA57
Ashwin Sekhar T K [Mon, 2 Nov 2015 12:15:24 +0000 (17:45 +0530)]
Improve the sgemm kernel for CORTEXA57

9 years agoOptimized gemv kernels for CORTEXA57
Ashwin Sekhar T K [Mon, 2 Nov 2015 11:47:47 +0000 (17:17 +0530)]
Optimized gemv kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

9 years agoOptimized swap kernels for CORTEXA57
Ashwin Sekhar T K [Tue, 6 Oct 2015 09:09:02 +0000 (14:39 +0530)]
Optimized swap kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

9 years agoOptimized scal kernels for CORTEXA57
Ashwin Sekhar T K [Tue, 6 Oct 2015 09:06:31 +0000 (14:36 +0530)]
Optimized scal kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

9 years agoOptimized rot kernels for CORTEXA57
Ashwin Sekhar T K [Tue, 6 Oct 2015 09:03:00 +0000 (14:33 +0530)]
Optimized rot kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

9 years agoOptimized nrm2 kernels for CORTEXA57
Ashwin Sekhar T K [Tue, 6 Oct 2015 08:59:27 +0000 (14:29 +0530)]
Optimized nrm2 kernels for CORTEXA57

9 years agoOptimized dot kernels for CORTEXA57
Ashwin Sekhar T K [Tue, 6 Oct 2015 08:46:04 +0000 (14:16 +0530)]
Optimized dot kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

9 years agoOptimized copy kernels for CORTEXA57
Ashwin Sekhar T K [Tue, 6 Oct 2015 06:49:05 +0000 (12:19 +0530)]
Optimized copy kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>