platform/upstream/openblas.git
Zhang Xianyi [Wed, 11 Nov 2015 19:25:07 +0000 (19:25 +0000)]
Merge branch 'develop' of github.com:xianyi/OpenBLAS into arm_soft_fp_abi

Zhang Xianyi [Tue, 10 Nov 2015 20:30:26 +0000 (04:30 +0800)]
Fix #686. Merge branch 'ashwinyes-develop' into develop

Zhang Xianyi [Tue, 10 Nov 2015 20:22:34 +0000 (04:22 +0800)]
Use 40 MB buffer for ARM Cortex A57.

Zhang Xianyi [Tue, 10 Nov 2015 20:19:43 +0000 (04:19 +0800)]
Delete vi swap file.

Zhang Xianyi [Tue, 10 Nov 2015 20:16:22 +0000 (04:16 +0800)]
Merge branch 'develop' of https://github.com/ashwinyes/OpenBLAS into ashwinyes-develop

Zhang Xianyi [Tue, 10 Nov 2015 20:14:58 +0000 (04:14 +0800)]
Update develop version.

Zhang Xianyi [Mon, 9 Nov 2015 17:39:21 +0000 (11:39 -0600)]
Merge pull request #684 from sebastien-villemot/develop

Fix detection of POWER architecture in c_check.

Sébastien Villemot [Mon, 9 Nov 2015 17:36:04 +0000 (18:36 +0100)]
Fix detection of POWER architecture in c_check.

This is necessary to avoid the false detection of a cross-compiling
environment.

Ashwin Sekhar T K [Fri, 6 Nov 2015 14:45:05 +0000 (20:15 +0530)]
Fix bug in benchmark/gemm.c

Ashwin Sekhar T K [Mon, 2 Nov 2015 14:00:28 +0000 (19:30 +0530)]
Optimized trmm kernels for CORTEXA57

Ashwin Sekhar T K [Mon, 2 Nov 2015 13:28:28 +0000 (18:58 +0530)]
Optimized zgemm kernel for CORTEXA57

Ashwin Sekhar T K [Mon, 2 Nov 2015 13:10:27 +0000 (18:40 +0530)]
Optimized cgemm kernel for CORTEXA57

Also, add a generic ztrmm 4x4 kernel

Ashwin Sekhar T K [Mon, 2 Nov 2015 12:23:28 +0000 (17:53 +0530)]
Optimized dgemm kernel for CORTEXA57

Ashwin Sekhar T K [Mon, 2 Nov 2015 12:15:24 +0000 (17:45 +0530)]
Improve the sgemm kernel for CORTEXA57

Ashwin Sekhar T K [Mon, 2 Nov 2015 11:47:47 +0000 (17:17 +0530)]
Optimized gemv kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 09:09:02 +0000 (14:39 +0530)]
Optimized swap kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 09:06:31 +0000 (14:36 +0530)]
Optimized scal kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 09:03:00 +0000 (14:33 +0530)]
Optimized rot kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 08:59:27 +0000 (14:29 +0530)]
Optimized nrm2 kernels for CORTEXA57

Ashwin Sekhar T K [Tue, 6 Oct 2015 08:46:04 +0000 (14:16 +0530)]
Optimized dot kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 06:49:05 +0000 (12:19 +0530)]
Optimized copy kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 06:42:08 +0000 (12:12 +0530)]
Optimized axpy kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 06:22:15 +0000 (11:52 +0530)]
Optimized asum kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Tue, 6 Oct 2015 06:11:15 +0000 (11:41 +0530)]
Optimized iamax kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Mon, 5 Oct 2015 14:19:44 +0000 (19:49 +0530)]
Optimized amax kernels for CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ashwin Sekhar T K [Mon, 5 Oct 2015 12:16:11 +0000 (17:46 +0530)]
Fix compiler errors in common.h

Ashwin Sekhar T K [Fri, 4 Sep 2015 07:56:52 +0000 (13:26 +0530)]
Adding arm64 target CORTEXA57

Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>

Ralph Campbell [Thu, 3 Sep 2015 12:30:12 +0000 (18:00 +0530)]
Minor C code fixes in interface/

Ralph Campbell [Thu, 3 Sep 2015 12:27:06 +0000 (17:57 +0530)]
Minor C code fixes in driver/

Ralph Campbell [Fri, 6 Nov 2015 14:34:56 +0000 (20:04 +0530)]
Minor C code fixes in kernel/arm

Ralph Campbell [Fri, 6 Nov 2015 14:34:17 +0000 (20:04 +0530)]
Remove duplicate -D args in kernel/Makefile.L1

Zhang Xianyi [Sat, 7 Nov 2015 05:46:20 +0000 (23:46 -0600)]
Refs #682. Enable LAPACK_COMPLEX_STRUCTURE when __ANDROID_API__ < 21.

Zhang Xianyi [Wed, 28 Oct 2015 18:53:29 +0000 (02:53 +0800)]
Detect AMD Trinity and Richland.

Zhang Xianyi [Wed, 28 Oct 2015 14:44:25 +0000 (09:44 -0500)]
Merge pull request #677 from j-bo/develop

Refs #676. Fixed ONLY_CBLAS=1 compiling bug on Windows.

j-bo [Wed, 28 Oct 2015 14:10:42 +0000 (15:10 +0100)]
Refs #676. Fixed ONLY_CBLAS=1 compiling bug on Windows.

Zhang Xianyi [Wed, 28 Oct 2015 12:12:31 +0000 (12:12 +0000)]
Merge branch 'develop' into arm_soft_fp_abi

Zhang Xianyi [Tue, 27 Oct 2015 20:44:50 +0000 (15:44 -0500)]
Merge branch 'develop' (v0.2.15)

Zhang Xianyi [Tue, 27 Oct 2015 20:44:35 +0000 (15:44 -0500)]
Update doc for OpenBLAS 0.2.15 version. [CI skipped]

Zhang Xianyi [Tue, 27 Oct 2015 15:49:22 +0000 (10:49 -0500)]
Merge pull request #674 from j-bo/develop

Fix #673

Zhang Xianyi [Tue, 27 Oct 2015 15:47:55 +0000 (10:47 -0500)]
Only include complex.h on Android 5.0 and later.
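
The guard this commit title describes might look roughly like the sketch below. It assumes the check is made on `__ANDROID_API__` (API level 21 corresponds to Android 5.0); the macro `SKETCH_HAVE_COMPLEX_H` and the type `sketch_zcomplex` are hypothetical names for illustration, not identifiers taken from OpenBLAS.

```c
/* Hypothetical feature macro: 0 on Android API levels below 21, else 1. */
#if defined(__ANDROID__) && defined(__ANDROID_API__) && __ANDROID_API__ < 21
#define SKETCH_HAVE_COMPLEX_H 0
#else
#define SKETCH_HAVE_COMPLEX_H 1
#include <complex.h>
#endif

#if SKETCH_HAVE_COMPLEX_H
/* Native C99 complex type when the header is usable. */
typedef double _Complex sketch_zcomplex;
#else
/* Structure fallback with the same two-double layout. */
typedef struct { double real, imag; } sketch_zcomplex;
#endif
```

Either branch yields a type made of two doubles, so code that only passes pointers to complex data can stay unchanged.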

j-bo [Tue, 27 Oct 2015 12:56:45 +0000 (13:56 +0100)]
Merge pull request #1 from j-bo/j-bo-patch-673

Fix #673

j-bo [Tue, 27 Oct 2015 12:55:24 +0000 (13:55 +0100)]
Fix #673

Add missing header declarations when compiling for Android ARMv7

Zhang Xianyi [Tue, 27 Oct 2015 00:02:51 +0000 (19:02 -0500)]
Refs #668. Raise the signal when pthread_create fails.

Thanks to James K. Lowden for the patch.
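
A minimal sketch of the behaviour named in the title, assuming a wrapper around `pthread_create`. The names `sketch_start_worker` and `sketch_noop` are hypothetical, and the choice of `SIGINT` is an assumption; the log does not say which signal the patch raises.

```c
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body used only to exercise the wrapper. */
static void *sketch_noop(void *arg) { return arg; }

/* Hypothetical wrapper: start a thread and, if creation fails, report the
   reason and raise a signal instead of continuing silently. */
static int sketch_start_worker(pthread_t *t, void *(*fn)(void *), void *arg) {
    int rc = pthread_create(t, NULL, fn, arg);  /* returns an errno value */
    if (rc != 0) {
        fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
        raise(SIGINT);  /* signal choice is an assumption */
    }
    return rc;
}
```

A caller would invoke `sketch_start_worker(&t, sketch_noop, NULL)` and later `pthread_join(t, NULL)`; on some toolchains the program must be built with `-pthread`.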

Zhang Xianyi [Mon, 26 Oct 2015 23:14:41 +0000 (18:14 -0500)]
Add AppVeyor badge.

Zhang Xianyi [Mon, 26 Oct 2015 23:08:54 +0000 (18:08 -0500)]
[ci skip] Build Visual Studio 12 Win64 on Appveyor

Zhang Xianyi [Mon, 26 Oct 2015 21:11:07 +0000 (05:11 +0800)]
Only test x64 Windows CI.

Zhang Xianyi [Mon, 26 Oct 2015 21:10:40 +0000 (05:10 +0800)]
Fix DYNAMIC_ARCH=1 bug.

Zhang Xianyi [Mon, 26 Oct 2015 20:08:17 +0000 (15:08 -0500)]
Use AppVeyor Windows CI.

Zhang Xianyi [Mon, 26 Oct 2015 19:54:59 +0000 (14:54 -0500)]
Merge branch 'cmake' into develop

Zhang Xianyi [Mon, 26 Oct 2015 19:54:34 +0000 (14:54 -0500)]
Merge branch 'develop' into cmake

Conflicts:
driver/others/memory.c

Zhang Xianyi [Mon, 26 Oct 2015 19:52:13 +0000 (14:52 -0500)]
Fix cmake bug on MSVC 32-bit.

Zhang Xianyi [Mon, 26 Oct 2015 18:54:53 +0000 (02:54 +0800)]
Fix cmake bug on x86 32-bit.

e.g. Build 32-bit on 64-bit Linux.
cmake -DBINARY=32

Zhang Xianyi [Mon, 26 Oct 2015 15:42:21 +0000 (23:42 +0800)]
Add CBLAS test for CMAKE.

Zhang Xianyi [Fri, 23 Oct 2015 17:16:34 +0000 (01:16 +0800)]
Refs #671. The return value of i?max cannot be larger than N.

Zhang Xianyi [Thu, 22 Oct 2015 16:07:35 +0000 (11:07 -0500)]
Refs #669. Fixed the build bug with gcc on Mac OS X.

Zhang Xianyi [Tue, 20 Oct 2015 19:37:22 +0000 (14:37 -0500)]
Fixed cmake bug on Visual Studio.

Zhang Xianyi [Tue, 20 Oct 2015 18:24:54 +0000 (02:24 +0800)]
Fixed cmake bug on haswell.

Zhang Xianyi [Mon, 19 Oct 2015 20:30:55 +0000 (04:30 +0800)]
Fix cmake config bugs.

Zhang Xianyi [Mon, 19 Oct 2015 19:35:25 +0000 (03:35 +0800)]
Detect cmake test result.

Zhang Xianyi [Mon, 12 Oct 2015 20:46:08 +0000 (04:46 +0800)]
Merge branch 'develop' into cmake

Conflicts:
driver/others/memory.c

Zhang Xianyi [Thu, 8 Oct 2015 15:07:24 +0000 (15:07 +0000)]
Include time.h.

Zhang Xianyi [Tue, 6 Oct 2015 18:31:51 +0000 (02:31 +0800)]
Refs #615. Import bug fixes for LAPACKE dormlq.

Zhang Xianyi [Mon, 5 Oct 2015 19:14:32 +0000 (14:14 -0500)]
Fixed #654. Make sure the gotoblas_init function is run before all other static initializations.
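
One way to make an initializer run ahead of others with GCC or Clang is a constructor priority; whether this commit uses exactly that mechanism is not stated in the log, so the sketch below is an assumption. The function names and the `sketch_order` array are hypothetical.

```c
/* Records the order in which the two constructors ran. */
static int sketch_order[2];
static int sketch_count = 0;

/* Lower priority numbers run earlier; values up to 100 are reserved for the
   implementation, so 101 is the earliest a library can request. */
__attribute__((constructor(101)))
static void sketch_lib_init(void) { sketch_order[sketch_count++] = 1; }

/* A later constructor, standing in for any other static initialization. */
__attribute__((constructor(200)))
static void sketch_other_init(void) { sketch_order[sketch_count++] = 2; }
```

By the time `main` starts, both constructors have run, and `sketch_order` holds 1 followed by 2 regardless of the order in which the functions appear in the source.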

Zhang Xianyi [Mon, 5 Oct 2015 15:25:15 +0000 (10:25 -0500)]
Merge pull request #656 from stevengj/libname

default to lib$(SYMBOLPREFIX)openblas$(SYMBOLSUFFIX)

Zhang Xianyi [Mon, 5 Oct 2015 15:23:52 +0000 (10:23 -0500)]
Merge pull request #659 from Keno/patch-2

Fix cross compilation suffix detection

Keno Fischer [Mon, 5 Oct 2015 04:58:07 +0000 (00:58 -0400)]
Fix cross compilation suffix detection

If the path contains `-`, it would otherwise have been detected as a cross-compile suffix.
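
The idea can be sketched in C as follows, assuming the fix amounts to searching only the file name of the compiler path for `-`. The build scripts themselves are not written in C, and `sketch_cross_prefix` is a hypothetical helper used purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the toolchain prefix of a compiler path
   (e.g. "arm-linux-gnueabihf-" from ".../arm-linux-gnueabihf-gcc") into out.
   Only the file name is searched, so a '-' in a directory name is ignored.
   Writes an empty string if there is no prefix or out is too small. */
static void sketch_cross_prefix(const char *cc, char *out, size_t outlen) {
    if (outlen == 0) return;
    out[0] = '\0';
    const char *slash = strrchr(cc, '/');
    const char *base = slash ? slash + 1 : cc;  /* strip the directory part */
    const char *dash = strrchr(base, '-');
    if (dash == NULL) return;                   /* native compiler name */
    size_t n = (size_t)(dash - base) + 1;       /* keep the trailing '-' */
    if (n >= outlen) return;
    memcpy(out, base, n);
    out[n] = '\0';
}
```

With this approach `/opt/my-tools/bin/gcc` yields an empty prefix even though a directory name contains `-`. A versioned name such as `gcc-4.9` would still be misread, which this simplified sketch does not attempt to handle.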

Steven G. Johnson [Thu, 1 Oct 2015 19:07:04 +0000 (15:07 -0400)]
default to lib$(SYMBOLPREFIX)openblas$(SYMBOLSUFFIX), as discussed in #646: if you rename the symbols, it is best to rename the library

Zhang Xianyi [Sat, 26 Sep 2015 14:42:44 +0000 (14:42 +0000)]
Fixed make TARGET=CORTEXA9 and CORTEXA15 bug.

Zhang Xianyi [Sat, 26 Sep 2015 14:10:18 +0000 (14:10 +0000)]
ARM soft fp abi branch.

Zhang Xianyi [Tue, 22 Sep 2015 15:01:59 +0000 (10:01 -0500)]
Merge pull request #652 from larsmans/fixes

Tiny fixes

Lars Buitinck [Tue, 22 Sep 2015 10:01:09 +0000 (12:01 +0200)]
git ignore versioned .so files

Lars Buitinck [Tue, 22 Sep 2015 10:00:30 +0000 (12:00 +0200)]
actually remove cblas_noconst.h

This file hasn't been used since 212463dce961827421a9c54f109a430c1599732c.

Zhang Xianyi [Thu, 10 Sep 2015 15:36:57 +0000 (10:36 -0500)]
Merge pull request #640 from kortschak/dlansy-fix

Fix LAPACK_*lansy routines

Zhang Xianyi [Thu, 10 Sep 2015 15:32:07 +0000 (10:32 -0500)]
Refs #638. Fixed compiling bug with clang on Mac OS X.

kortschak [Thu, 10 Sep 2015 06:02:50 +0000 (15:32 +0930)]
Fix LAPACK_*lansy routines

Fixes #639.

Zhang Xianyi [Wed, 9 Sep 2015 15:48:15 +0000 (10:48 -0500)]
Merge branch 'yuyichao-skylake-id' into develop

Zhang Xianyi [Wed, 9 Sep 2015 15:47:17 +0000 (10:47 -0500)]
Detect other Intel Skylake cores.

http://users.atw.hu/instlatx64/

Yichao Yu [Wed, 9 Sep 2015 15:00:23 +0000 (11:00 -0400)]
Ref #632. Support Intel Skylake using Haswell kernels.

Zhang Xianyi [Wed, 9 Sep 2015 14:56:07 +0000 (09:56 -0500)]
Merge pull request #634 from kortschak/lantr-trans-prep

Fix lantr preparation for row major matrices

kortschak [Tue, 8 Sep 2015 23:55:48 +0000 (09:25 +0930)]
Fix lantr preparation for row major matrices

Zhang Xianyi [Tue, 8 Sep 2015 18:59:08 +0000 (13:59 -0500)]
Merge pull request #633 from grisuthedragon/tune_imatcopy

Improved Ximatcopy when lda==ldb.

Martin Koehler [Mon, 7 Sep 2015 12:33:26 +0000 (14:33 +0200)]
Improved Ximatcopy when lda==ldb.

The Ximatcopy functions create a copy of the input matrix
although they appear to work in place. The new routines
XIMATCOPY_K_YY perform the operations in place if the leading
dimension does not change.
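
The distinction can be illustrated with a simplified sketch for the column-major, no-transpose case only. The function `sketch_dimatcopy_cn` is hypothetical and is not one of the XIMATCOPY_K_YY routines; it assumes the caller's buffer is large enough for both leading dimensions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch: scale a column-major rows x cols matrix by alpha
   without transposing. Returns 0 on success, -1 on allocation failure. */
static int sketch_dimatcopy_cn(size_t rows, size_t cols, double alpha,
                               double *a, size_t lda, size_t ldb) {
    if (rows == 0 || cols == 0) return 0;
    if (lda == ldb) {
        /* Same leading dimension: every element keeps its address,
           so the matrix can be updated in place with no copy. */
        for (size_t j = 0; j < cols; j++)
            for (size_t i = 0; i < rows; i++)
                a[i + j * lda] *= alpha;
        return 0;
    }
    /* Different leading dimension: elements move, so stage through a copy. */
    double *tmp = malloc(rows * cols * sizeof *tmp);
    if (tmp == NULL) return -1;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            tmp[i + j * rows] = alpha * a[i + j * lda];
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            a[i + j * ldb] = tmp[i + j * rows];
    free(tmp);
    return 0;
}
```

When `lda == ldb` the padding rows between columns are left untouched and no temporary buffer is allocated, which is the saving the commit message describes.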

Zhang Xianyi [Fri, 4 Sep 2015 18:01:01 +0000 (13:01 -0500)]
Merge pull request #630 from buffer51/develop

Fixed error in common.h for Android compilation introduced by e12cf11

buffer51 [Fri, 4 Sep 2015 00:54:21 +0000 (20:54 -0400)]
Fixed error in common.h for Android compilation introduced by e12cf1123e8784ce6fe9d2ac14526331fbe2c555

Zhang Xianyi [Thu, 20 Aug 2015 03:50:25 +0000 (22:50 -0500)]
Add notification.

Zhang Xianyi [Thu, 20 Aug 2015 03:48:55 +0000 (22:48 -0500)]
Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop

Zhang Xianyi [Thu, 20 Aug 2015 03:26:20 +0000 (22:26 -0500)]
Merge pull request #619 from gitter-badger/gitter-badge

Add a Gitter chat badge to README.md

The Gitter Badger [Thu, 20 Aug 2015 03:21:09 +0000 (03:21 +0000)]
Added Gitter badge

Zhang Xianyi [Wed, 19 Aug 2015 13:07:47 +0000 (08:07 -0500)]
Use C kernels for s/dgemv on x86.

Zhang Xianyi [Wed, 19 Aug 2015 03:43:42 +0000 (22:43 -0500)]
Fixed cmake bug with NO_LAPACK=1

Zhang Xianyi [Mon, 17 Aug 2015 20:22:37 +0000 (15:22 -0500)]
Merge pull request #617 from notaz/arm_fixes

really fix ARM64 locking

Grazvydas Ignotas [Sun, 16 Aug 2015 23:27:45 +0000 (01:27 +0200)]
really fix ARM64 locking

Zhang Xianyi [Sun, 16 Aug 2015 22:16:18 +0000 (17:16 -0500)]
Merge pull request #616 from notaz/arm_fixes

ARM fixes

Grazvydas Ignotas [Sun, 16 Aug 2015 18:11:13 +0000 (20:11 +0200)]
correct a minor mistake

Grazvydas Ignotas [Sun, 16 Aug 2015 16:13:30 +0000 (18:13 +0200)]
use real armv5 support

there is no more requirement for ARMv6 instructions,
and VFP on ARMv5 is uncommon

Grazvydas Ignotas [Sun, 16 Aug 2015 16:10:34 +0000 (18:10 +0200)]
add fallback blas_lock implementation

to be used on armv5 and new platforms

Grazvydas Ignotas [Sun, 16 Aug 2015 16:08:45 +0000 (18:08 +0200)]
set ARMV7 for Cortex-A9 and Cortex-A15

otherwise some macros like YIELDING are not defined correctly

Grazvydas Ignotas [Sun, 16 Aug 2015 15:27:25 +0000 (17:27 +0200)]
add fallback rpcc implementation

- use on arm, arm64 and any new platform
- use faster integer math instead of double
- use similar scale as rdtsc so that timeouts work
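
A fallback along the lines of these three points could look like the sketch below. It assumes a POSIX monotonic clock is available and that nanoseconds are an acceptable stand-in for cycles (one tick per cycle at roughly 1 GHz); `sketch_rpcc` and `sketch_timed_out` are hypothetical names, not the functions added by the commit.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes clock_gettime under strict -std=c11 */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical fallback tick counter: nanoseconds from a monotonic clock,
   computed with integer math only (no conversion through double). */
static uint64_t sketch_rpcc(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Timeout check in the same unit; unsigned subtraction tolerates wraparound. */
static int sketch_timed_out(uint64_t start, uint64_t limit_ticks) {
    return sketch_rpcc() - start > limit_ticks;
}
```

Because the counter advances at about the same rate as a cycle counter on a gigahertz-class core, timeout constants tuned for `rdtsc` remain in a plausible range.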

Grazvydas Ignotas [Sun, 16 Aug 2015 13:37:02 +0000 (15:37 +0200)]
add missing barriers

should fix issue #597

Grazvydas Ignotas [Sun, 16 Aug 2015 13:18:42 +0000 (15:18 +0200)]
really fix ARM locking

- was writing 0 to lock variable, so was ineffective
- only exit loop if both lock was 0 and strex was successful
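
The two conditions can be expressed in portable C11 atomics instead of the ARM exclusive load and `strex` pair the commit refers to. This is a sketch under that substitution; `sketch_lock` and `sketch_unlock` are hypothetical names, not the `blas_lock` implementation.

```c
#include <stdatomic.h>

/* Hypothetical spinlock sketch; 0 means free, 1 means held. */
static void sketch_lock(atomic_int *lock) {
    for (;;) {
        int expected = 0;
        /* Leaves the loop only if the lock was 0 AND the store of 1 took
           effect, mirroring "lock was 0 and strex was successful". */
        if (atomic_compare_exchange_weak_explicit(lock, &expected, 1,
                memory_order_acquire, memory_order_relaxed))
            return;
    }
}

/* Release by storing 0; acquiring must store a nonzero value, otherwise the
   lock never appears held (the first bug listed in the commit message). */
static void sketch_unlock(atomic_int *lock) {
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

The weak compare-exchange may fail spuriously, much like a store-exclusive can, which is why the attempt sits inside a retry loop.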