platform/upstream/openblas.git
9 years ago Added Gitter badge
The Gitter Badger [Thu, 20 Aug 2015 03:21:09 +0000 (03:21 +0000)]
Added Gitter badge

9 years ago Merge pull request #617 from notaz/arm_fixes
Zhang Xianyi [Mon, 17 Aug 2015 20:22:37 +0000 (15:22 -0500)]
Merge pull request #617 from notaz/arm_fixes

really fix ARM64 locking

9 years ago really fix ARM64 locking
Grazvydas Ignotas [Sun, 16 Aug 2015 23:27:45 +0000 (01:27 +0200)]
really fix ARM64 locking

9 years ago Merge pull request #616 from notaz/arm_fixes
Zhang Xianyi [Sun, 16 Aug 2015 22:16:18 +0000 (17:16 -0500)]
Merge pull request #616 from notaz/arm_fixes

ARM fixes

9 years ago correct a minor mistake
Grazvydas Ignotas [Sun, 16 Aug 2015 18:11:13 +0000 (20:11 +0200)]
correct a minor mistake

9 years ago use real armv5 support
Grazvydas Ignotas [Sun, 16 Aug 2015 16:13:30 +0000 (18:13 +0200)]
use real armv5 support

there is no more requirement for ARMv6 instructions,
and VFP on ARMv5 is uncommon

9 years ago add fallback blas_lock implementation
Grazvydas Ignotas [Sun, 16 Aug 2015 16:10:34 +0000 (18:10 +0200)]
add fallback blas_lock implementation

to be used on armv5 and new platforms

9 years ago set ARMV7 for Cortex-A9 and Cortex-A15
Grazvydas Ignotas [Sun, 16 Aug 2015 16:08:45 +0000 (18:08 +0200)]
set ARMV7 for Cortex-A9 and Cortex-A15

otherwise some macros like YIELDING are not defined correctly

9 years ago add fallback rpcc implementation
Grazvydas Ignotas [Sun, 16 Aug 2015 15:27:25 +0000 (17:27 +0200)]
add fallback rpcc implementation

- use on arm, arm64 and any new platform
- use faster integer math instead of double
- use similar scale as rdtsc so that timeouts work
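The three points above can be illustrated with a minimal sketch. It assumes a POSIX system providing `clock_gettime` with `CLOCK_MONOTONIC`; the name `rpcc_fallback` and the choice of nanoseconds as the tick unit are hypothetical and are not taken from the commit itself.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical fallback tick counter (illustration only).
 * Integer arithmetic only, no double conversion. Returning nanoseconds
 * keeps the scale roughly comparable to a GHz-class cycle counter such
 * as rdtsc, so timeouts expressed in "ticks" stay in a similar range. */
static uint64_t rpcc_fallback(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Two successive calls should never go backwards, since the clock used here is monotonic.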

9 years ago add missing barriers
Grazvydas Ignotas [Sun, 16 Aug 2015 13:37:02 +0000 (15:37 +0200)]
add missing barriers

should fix issue #597

9 years ago really fix ARM locking
Grazvydas Ignotas [Sun, 16 Aug 2015 13:18:42 +0000 (15:18 +0200)]
really fix ARM locking

- was writing 0 to lock variable, so was ineffective
- only exit loop if both lock was 0 and strex was successful
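The two conditions above can be shown with a portable sketch. This is not the ARM assembly from the commit: it uses C11 atomics as a stand-in, where a weak compare-exchange plays the role of the `ldrex`/`strex` pair. The names `sketch_lock_t`, `sketch_lock` and `sketch_unlock` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef atomic_int sketch_lock_t;

static void sketch_lock(sketch_lock_t *lock)
{
    for (;;) {
        int expected = 0;
        /* Leave the loop only if the lock was observed as 0 AND the
         * store of 1 succeeded. A weak compare-exchange may fail
         * spuriously, much like a failed strex, so we simply retry.
         * Note that the value stored is 1, not 0. */
        if (atomic_compare_exchange_weak_explicit(
                lock, &expected, 1,
                memory_order_acquire, memory_order_relaxed))
            return;
    }
}

static void sketch_unlock(sketch_lock_t *lock)
{
    /* Release ordering: writes made while holding the lock become
     * visible before the lock is seen as free. */
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

Storing 0 in the acquire path would leave the lock looking free to every other thread, which is why such a lock protects nothing.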

9 years ago Merge pull request #614 from xantares/cmake_version
Zhang Xianyi [Thu, 6 Aug 2015 18:15:51 +0000 (13:15 -0500)]
Merge pull request #614 from xantares/cmake_version

install OpenBLASConfigVersion.cmake

9 years ago install OpenBLASConfigVersion.cmake
xantares [Thu, 6 Aug 2015 18:03:50 +0000 (20:03 +0200)]
install OpenBLASConfigVersion.cmake

9 years ago Merge pull request #613 from fabioperez/develop
Zhang Xianyi [Wed, 5 Aug 2015 14:19:17 +0000 (09:19 -0500)]
Merge pull request #613 from fabioperez/develop

Add POWER7/POWER8 as targets

9 years ago Add POWER7/POWER8 as targets
Fábio Perez [Wed, 5 Aug 2015 14:02:39 +0000 (11:02 -0300)]
Add POWER7/POWER8 as targets

9 years ago Merge pull request #612 from ibmsoe/ppc64le
Zhang Xianyi [Tue, 4 Aug 2015 21:58:24 +0000 (16:58 -0500)]
Merge pull request #612 from ibmsoe/ppc64le

ppc64le platform support (ELF ABI v2)

9 years ago Use pure C generic target on x86 and x86_64.
Zhang Xianyi [Tue, 4 Aug 2015 04:55:56 +0000 (23:55 -0500)]
Use pure C generic target on x86 and x86_64.

make TARGET=GENERIC

?gemm3m is unimplemented on generic target.

9 years ago ppc64le platform support (ELF ABI v2)
Matthew Brandyberry [Tue, 21 Jul 2015 17:45:12 +0000 (12:45 -0500)]
ppc64le platform support (ELF ABI v2)

9 years ago Fix blas lock bug on AArch64.
Zhang Xianyi [Fri, 26 Jun 2015 03:54:41 +0000 (11:54 +0800)]
Fix blas lock bug on AArch64.

9 years ago Merge pull request #595 from tanderson92/fixTests
Zhang Xianyi [Tue, 23 Jun 2015 02:54:51 +0000 (21:54 -0500)]
Merge pull request #595 from tanderson92/fixTests

Fix test execution when USE_OPENMP=0

9 years ago Merge pull request #596 from wernsaar/develop
wernsaar [Sat, 13 Jun 2015 14:44:48 +0000 (16:44 +0200)]
Merge pull request #596 from wernsaar/develop

optimizations for haswell

9 years ago added optimized dtrmm_kernel for haswell
Werner Saar [Sat, 13 Jun 2015 14:16:29 +0000 (16:16 +0200)]
added optimized dtrmm_kernel for haswell

9 years ago modified haswell parameter dgemm_unroll_n
Werner Saar [Sat, 13 Jun 2015 08:28:27 +0000 (10:28 +0200)]
modified haswell parameter dgemm_unroll_n

9 years ago Fix test execution when USE_OPENMP=0
Thomas Anderson [Sat, 13 Jun 2015 06:52:07 +0000 (23:52 -0700)]
Fix test execution when USE_OPENMP=0

The standard way to disable OpenMP support is to set USE_OPENMP=0,
as indicated by other checks to see if USE_OPENMP equals 1. The
problem is obviously then that `ifdef USE_OPENMP` is very much not
what we want to test for. This causes tests to fail when no OpenMP
library is installed.
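The distinction between "defined" and "equal to 1" can be shown with a small analogy. The commit concerns a Makefile `ifdef`; the sketch below uses the C preprocessor instead, purely as an illustration, because `#ifdef` has the same pitfall. The helper names are hypothetical.

```c
#include <assert.h>

/* OpenMP explicitly disabled, as with USE_OPENMP=0 on the make command line. */
#define USE_OPENMP 0

/* Tests only whether the name is defined: true even when the value is 0. */
static int openmp_enabled_ifdef(void)
{
#ifdef USE_OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Tests the value itself: false when the value is 0. */
static int openmp_enabled_value(void)
{
#if USE_OPENMP
    return 1;
#else
    return 0;
#endif
}
```

Here `openmp_enabled_ifdef()` returns 1 while `openmp_enabled_value()` returns 0, which mirrors why a definedness test takes the OpenMP path even though OpenMP was switched off.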

9 years ago Fix #593. Change MACOSX_DEPLOYMENT_TARGET to 10.6.
Zhang Xianyi [Mon, 8 Jun 2015 15:53:50 +0000 (10:53 -0500)]
Fix #593. Change MACOSX_DEPLOYMENT_TARGET to 10.6.

9 years ago Merge pull request #592 from wernsaar/develop
wernsaar [Mon, 8 Jun 2015 12:22:02 +0000 (14:22 +0200)]
Merge pull request #592 from wernsaar/develop

added benchmark scripts

9 years ago added benchmark scripts for numpy, octave and R
Werner Saar [Mon, 8 Jun 2015 12:06:38 +0000 (14:06 +0200)]
added benchmark scripts for numpy, octave and R

9 years ago updated geev benchmark
Werner Saar [Mon, 8 Jun 2015 10:58:38 +0000 (12:58 +0200)]
updated geev benchmark

9 years ago Merge pull request #589 from wernsaar/develop
wernsaar [Wed, 3 Jun 2015 10:14:09 +0000 (12:14 +0200)]
Merge pull request #589 from wernsaar/develop

small modification of gemm.c

9 years ago small modification of gemm.c
Werner Saar [Wed, 3 Jun 2015 07:11:51 +0000 (09:11 +0200)]
small modification of gemm.c

9 years ago Merge pull request #587 from wernsaar/develop
wernsaar [Tue, 2 Jun 2015 13:29:49 +0000 (15:29 +0200)]
Merge pull request #587 from wernsaar/develop

added gesv benchmark

9 years ago added gesv benchmark
Werner Saar [Tue, 2 Jun 2015 11:35:49 +0000 (13:35 +0200)]
added gesv benchmark

9 years ago Merge pull request #585 from wernsaar/develop
wernsaar [Sun, 31 May 2015 13:01:54 +0000 (15:01 +0200)]
Merge pull request #585 from wernsaar/develop

bugfix for benchmark Makefile on MAC

9 years ago bugfix for Makefile on mac
Werner Saar [Sun, 31 May 2015 12:16:51 +0000 (14:16 +0200)]
bugfix for Makefile on mac

9 years ago Merge pull request #584 from wernsaar/develop
wernsaar [Fri, 29 May 2015 11:27:20 +0000 (13:27 +0200)]
Merge pull request #584 from wernsaar/develop

bugfixes, to build benchmarks with mingw on Windows OS

9 years ago bugfixes, to build benchmarks with mingw on Windows OS
Werner Saar [Fri, 29 May 2015 10:56:22 +0000 (12:56 +0200)]
bugfixes, to build benchmarks with mingw on Windows OS

9 years ago Merge pull request #581 from wernsaar/develop
wernsaar [Sat, 23 May 2015 10:58:15 +0000 (12:58 +0200)]
Merge pull request #581 from wernsaar/develop

bugfix for arm locking

9 years ago bugfix for arm locking
Werner Saar [Sat, 23 May 2015 09:40:40 +0000 (11:40 +0200)]
bugfix for arm locking

9 years ago smp lock bugfix
Werner Saar [Sat, 23 May 2015 08:58:38 +0000 (10:58 +0200)]
smp lock bugfix

9 years ago Merge pull request #580 from wernsaar/develop
wernsaar [Sat, 23 May 2015 07:46:39 +0000 (09:46 +0200)]
Merge pull request #580 from wernsaar/develop

added blas level1 swap benchmark

9 years ago added blas level1 swap benchmark
Werner Saar [Thu, 21 May 2015 06:51:42 +0000 (08:51 +0200)]
added blas level1 swap benchmark

9 years ago Support Android NDK armeabi-v7a-hard ABI. (-mfloat-abi=hard)
Zhang Xianyi [Thu, 21 May 2015 02:57:27 +0000 (21:57 -0500)]
Support Android NDK armeabi-v7a-hard ABI. (-mfloat-abi=hard)

e.g.
make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7

In the Android NDK, this uses the armeabi-v7a-hard ABI.
TARGET_CFLAGS += -mhard-float -D_NDK_MATH_NO_SOFTFP=1
TARGET_LDFLAGS += -Wl,--no-warn-mismatch -lm_hard
For more information, see the hard-float example at
android_ndk/tests/device/hard-float/jni/.

9 years ago Merge pull request #578 from wernsaar/develop
wernsaar [Wed, 20 May 2015 09:56:02 +0000 (11:56 +0200)]
Merge pull request #578 from wernsaar/develop

added blas level1 copy benchmark

9 years ago added blas level1 copy benchmark
Werner Saar [Wed, 20 May 2015 09:05:00 +0000 (11:05 +0200)]
added blas level1 copy benchmark

9 years ago Fix f_check bug.
Zhang Xianyi [Tue, 19 May 2015 17:04:45 +0000 (12:04 -0500)]
Fix f_check bug.

9 years ago Merge pull request #577 from wernsaar/develop
wernsaar [Tue, 19 May 2015 08:59:24 +0000 (10:59 +0200)]
Merge pull request #577 from wernsaar/develop

Bugfix for armv6 memory barrier

9 years ago Ref #574: Bugfix for armv6 memory barrier
Werner Saar [Tue, 19 May 2015 08:43:12 +0000 (10:43 +0200)]
Ref #574: Bugfix for armv6 memory barrier

9 years ago 1) Refs #575. Remove g77 from compiler list.
Zhang Xianyi [Tue, 19 May 2015 05:01:04 +0000 (00:01 -0500)]
1) Refs #575. Remove g77 from compiler list.
2) If OpenBLAS cannot find a Fortran compiler, it will build only BLAS
(without LAPACK).

9 years ago Merge pull request #572 from wernsaar/develop
wernsaar [Mon, 18 May 2015 11:47:38 +0000 (13:47 +0200)]
Merge pull request #572 from wernsaar/develop

added optimized cscal and zscal functions for steamroller

9 years ago added optimized cscal and zscal kernels for steamroller
Werner Saar [Mon, 18 May 2015 10:40:07 +0000 (12:40 +0200)]
added optimized cscal and zscal kernels for steamroller

9 years ago added optimized cscal and zscal kernels for steamroller and piledriver
Werner Saar [Mon, 18 May 2015 08:50:57 +0000 (10:50 +0200)]
added optimized cscal and zscal kernels for steamroller and piledriver

9 years ago added optimized cscal kernel for sandybridge
Werner Saar [Mon, 18 May 2015 06:46:06 +0000 (08:46 +0200)]
added optimized cscal kernel for sandybridge

9 years ago added optimized cscal kernel for bulldozer
Werner Saar [Mon, 18 May 2015 05:33:52 +0000 (07:33 +0200)]
added optimized cscal kernel for bulldozer

9 years ago Merge pull request #571 from wernsaar/develop
wernsaar [Sun, 17 May 2015 12:09:14 +0000 (14:09 +0200)]
Merge pull request #571 from wernsaar/develop

added optimized cscal and zscal functions

9 years ago added optimized cscal kernel for haswell
Werner Saar [Sun, 17 May 2015 11:44:09 +0000 (13:44 +0200)]
added optimized cscal kernel for haswell

9 years ago added optimized zscal kernel for bulldozer
Werner Saar [Sun, 17 May 2015 09:45:19 +0000 (11:45 +0200)]
added optimized zscal kernel for bulldozer

9 years ago added optimized zscal kernel for haswell
Werner Saar [Sat, 16 May 2015 14:41:45 +0000 (16:41 +0200)]
added optimized zscal kernel for haswell

9 years ago Add AMD Excavator target.
Zhang Xianyi [Wed, 13 May 2015 21:16:30 +0000 (16:16 -0500)]
Add AMD Excavator target.

9 years ago Merge pull request #568 from wernsaar/develop
wernsaar [Wed, 13 May 2015 11:48:08 +0000 (13:48 +0200)]
Merge pull request #568 from wernsaar/develop

added optimized dscal kernel

9 years ago bugfix: added static to functions
Werner Saar [Wed, 13 May 2015 11:31:26 +0000 (13:31 +0200)]
bugfix: added static to functions

9 years ago added optimized dscal kernel for piledriver
Werner Saar [Wed, 13 May 2015 11:05:35 +0000 (13:05 +0200)]
added optimized dscal kernel for piledriver

9 years ago optimized dscal kernel for increment != 1
Werner Saar [Wed, 13 May 2015 10:14:39 +0000 (12:14 +0200)]
optimized dscal kernel for increment != 1

9 years ago added optimized dscal kernel for haswell
Werner Saar [Tue, 12 May 2015 15:19:58 +0000 (17:19 +0200)]
added optimized dscal kernel for haswell

9 years ago added optimized dscal kernel for sandybridge
Werner Saar [Tue, 12 May 2015 14:27:43 +0000 (16:27 +0200)]
added optimized dscal kernel for sandybridge

9 years ago added optimized dscal kernel for bulldozer
Werner Saar [Tue, 12 May 2015 10:28:44 +0000 (12:28 +0200)]
added optimized dscal kernel for bulldozer

9 years ago Merge pull request #566 from powderluv/develop
Zhang Xianyi [Tue, 12 May 2015 01:59:12 +0000 (20:59 -0500)]
Merge pull request #566 from powderluv/develop

Fix build with ALLOC_SHM=0 (Android NDK)

9 years ago Fix build with ALLOC_SHM=0 (Android NDK)
powderluv [Sun, 10 May 2015 07:10:26 +0000 (00:10 -0700)]
Fix build with ALLOC_SHM=0 (Android NDK)

Refactor such that you can build with ALLOC_SHM=0. HugeTLB
implicitly depends on ALLOC_SHM=1. This patch allows
building for Android NDK r10d.

9 years ago Refs #532. Improve gemv parallelism for the small m and large n case.
Zhang Xianyi [Thu, 7 May 2015 21:33:17 +0000 (05:33 +0800)]
Refs #532. Improve gemv parallelism for the small m and large n case.

Split the matrix and perform a reduction.
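The split-and-reduce idea can be sketched as follows. This is a minimal, single-threaded illustration, not the code from the commit: each loop iteration over `p` stands in for one worker thread, and the function name `gemv_n_split`, its signature and the column-major layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* y += A*x for a column-major m x n matrix A (hypothetical helper).
 * The n columns are split into `parts` chunks. Each chunk accumulates
 * into its own private buffer of length m, as a worker thread would,
 * and the private buffers are then summed into y (the reduction). */
static void gemv_n_split(int m, int n, const double *a, int lda,
                         const double *x, double *y, int parts)
{
    assert(m > 0 && n > 0 && parts > 0 && lda >= m);
    double *partial = calloc((size_t)parts * (size_t)m, sizeof *partial);
    assert(partial != NULL);

    for (int p = 0; p < parts; p++) {           /* one iteration per "thread" */
        int j0 = (int)((long long)n * p / parts);
        int j1 = (int)((long long)n * (p + 1) / parts);
        double *yp = partial + (size_t)p * (size_t)m;
        for (int j = j0; j < j1; j++)
            for (int i = 0; i < m; i++)
                yp[i] += a[(size_t)j * (size_t)lda + i] * x[j];
    }

    for (int p = 0; p < parts; p++)             /* reduction step */
        for (int i = 0; i < m; i++)
            y[i] += partial[(size_t)p * (size_t)m + i];

    free(partial);
}
```

Splitting along n gives every worker a useful amount of work even when m is tiny, at the cost of one extra length-m buffer per worker and a final summation.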

9 years ago Refs #565. Fix the bug in generating FEXTRALIB.
Zhang Xianyi [Thu, 7 May 2015 05:06:53 +0000 (13:06 +0800)]
Refs #565. Fix the bug in generating FEXTRALIB.

9 years ago Refs #565. Merge branch 'andreasnoack-anj/bench' into develop
Zhang Xianyi [Thu, 7 May 2015 04:52:14 +0000 (12:52 +0800)]
Refs #565. Merge branch 'andreasnoack-anj/bench' into develop

9 years ago Add vecLib benchmarks
Andreas Noack [Thu, 7 May 2015 01:52:34 +0000 (21:52 -0400)]
Add vecLib benchmarks

9 years ago Merge pull request #564 from wernsaar/develop
wernsaar [Wed, 6 May 2015 09:10:31 +0000 (11:10 +0200)]
Merge pull request #564 from wernsaar/develop

Use only 1 thread in trsm if m or n < 2*GEMM_MULTITHREAD_THRESHOLD

9 years ago use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
Werner Saar [Wed, 6 May 2015 08:41:53 +0000 (10:41 +0200)]
use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
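The rule in this commit title can be written out as a small decision function. This is a sketch under assumptions: the function name `trsm_num_threads` is hypothetical, and the value 4 for `GEMM_MULTITHREAD_THRESHOLD` is only a placeholder for whatever the build configures.

```c
#include <assert.h>

/* Placeholder value for illustration; the real value is a build setting. */
#define GEMM_MULTITHREAD_THRESHOLD 4

/* Hypothetical helper: fall back to a single thread when either
 * dimension is below twice the threshold, otherwise use all threads. */
static int trsm_num_threads(long m, long n, int max_threads)
{
    assert(max_threads >= 1);
    if (m < 2 * GEMM_MULTITHREAD_THRESHOLD ||
        n < 2 * GEMM_MULTITHREAD_THRESHOLD)
        return 1;
    return max_threads;
}
```

For small problems the cost of dispatching and synchronizing threads can exceed the work itself, which is the usual motivation for a cutoff of this kind.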

9 years ago added loops to trsm.c
Werner Saar [Wed, 6 May 2015 07:21:19 +0000 (09:21 +0200)]
added loops to trsm.c

9 years ago Merge pull request #563 from wernsaar/develop
wernsaar [Tue, 5 May 2015 10:13:35 +0000 (12:13 +0200)]
Merge pull request #563 from wernsaar/develop

Bugfix for gemm3m tests

9 years ago bugfix for gemm3m tests
Werner Saar [Tue, 5 May 2015 09:58:59 +0000 (11:58 +0200)]
bugfix for gemm3m tests

9 years ago removed gemm3m functions from normal checks
Werner Saar [Tue, 5 May 2015 09:39:43 +0000 (11:39 +0200)]
removed gemm3m functions from normal checks

9 years ago Merge pull request #561 from wernsaar/develop
wernsaar [Mon, 4 May 2015 09:11:13 +0000 (11:11 +0200)]
Merge pull request #561 from wernsaar/develop

updated dgemv_n sgemv_n kernels

9 years ago updated dgemv_n kernel for nehalem
Werner Saar [Thu, 30 Apr 2015 12:38:06 +0000 (14:38 +0200)]
updated dgemv_n kernel for nehalem

9 years ago optimized dgemv_n kernel for haswell
Werner Saar [Thu, 30 Apr 2015 10:11:39 +0000 (12:11 +0200)]
optimized dgemv_n kernel for haswell

9 years ago Merge pull request #560 from sebastien-villemot/develop
Zhang Xianyi [Wed, 29 Apr 2015 16:36:47 +0000 (11:36 -0500)]
Merge pull request #560 from sebastien-villemot/develop

Fix detection of ARM architectures in c_check.

9 years ago Fix detection of ARM architectures in c_check.
Sébastien Villemot [Wed, 29 Apr 2015 16:14:21 +0000 (18:14 +0200)]
Fix detection of ARM architectures in c_check.

This is necessary to avoid the false detection of a cross-compiling environment.

9 years ago Merge pull request #558 from wernsaar/develop
wernsaar [Tue, 28 Apr 2015 15:30:16 +0000 (17:30 +0200)]
Merge pull request #558 from wernsaar/develop

optimizations for sandybridge

9 years ago optimized dger kernel for sandybridge
Werner Saar [Tue, 28 Apr 2015 14:58:11 +0000 (16:58 +0200)]
optimized dger kernel for sandybridge

9 years ago added optimized sger kernel for sandybridge
Werner Saar [Tue, 28 Apr 2015 13:33:38 +0000 (15:33 +0200)]
added optimized sger kernel for sandybridge

9 years ago optimized saxpy and daxpy for sandybridge
Werner Saar [Tue, 28 Apr 2015 08:18:32 +0000 (10:18 +0200)]
optimized saxpy and daxpy for sandybridge

9 years ago Merge pull request #554 from wernsaar/develop
Zhang Xianyi [Sat, 25 Apr 2015 13:11:36 +0000 (08:11 -0500)]
Merge pull request #554 from wernsaar/develop

added benchmarks for zgeru and cgeru

9 years ago add benchmarks for zgeru and cgeru
Werner Saar [Sat, 25 Apr 2015 12:53:07 +0000 (14:53 +0200)]
add benchmarks for zgeru and cgeru

9 years ago Merge pull request #552 from jeromerobert/develop
Zhang Xianyi [Fri, 24 Apr 2015 19:12:12 +0000 (14:12 -0500)]
Merge pull request #552 from jeromerobert/develop

gemv: Ensure stack buffer is large enough to handle memory alignment

9 years ago bugfixes: replaced int with BLASLONG
Werner Saar [Fri, 24 Apr 2015 12:30:44 +0000 (14:30 +0200)]
bugfixes: replaced int with BLASLONG

9 years ago Merge pull request #553 from wernsaar/develop
wernsaar [Fri, 24 Apr 2015 11:57:48 +0000 (13:57 +0200)]
Merge pull request #553 from wernsaar/develop

optimized some blas level1 kernels for increments != 1

9 years ago optimized sdot.c for increments != 1
Werner Saar [Fri, 24 Apr 2015 11:13:20 +0000 (13:13 +0200)]
optimized sdot.c for increments != 1

9 years ago optimized saxpy.c for increments != 1
Werner Saar [Fri, 24 Apr 2015 09:52:59 +0000 (11:52 +0200)]
optimized saxpy.c for increments != 1

9 years ago optimized daxpy kernel for increments != 1
Werner Saar [Fri, 24 Apr 2015 09:39:17 +0000 (11:39 +0200)]
optimized daxpy kernel for increments != 1

9 years ago optimized ddot.c for increments != 1
Werner Saar [Fri, 24 Apr 2015 08:56:55 +0000 (10:56 +0200)]
optimized ddot.c for increments != 1

9 years ago gemv: Ensure stack buffer is large enough to handle memory alignment
Jerome Robert [Tue, 21 Apr 2015 08:12:01 +0000 (10:12 +0200)]
gemv: Ensure stack buffer is large enough to handle memory alignment

Ref #478
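The general issue named in this commit title can be illustrated with a short sketch. It is not the actual fix: the constants `SKETCH_ALIGN` and `SKETCH_LEN` and both helper functions are hypothetical. The point is only that when a pointer into a stack buffer is rounded up to an alignment boundary, the buffer must be over-allocated by that alignment so the aligned window still fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ALIGN 32u   /* hypothetical alignment in bytes */
#define SKETCH_LEN   64u   /* hypothetical number of doubles needed */

/* Round a pointer up to the next SKETCH_ALIGN-byte boundary. */
static double *align_up(double *p)
{
    uintptr_t addr = (uintptr_t)p;
    addr = (addr + (SKETCH_ALIGN - 1)) & ~(uintptr_t)(SKETCH_ALIGN - 1);
    return (double *)addr;
}

/* Returns 1 if an aligned window of SKETCH_LEN doubles fits in the
 * over-allocated stack buffer. */
static int aligned_window_fits(void)
{
    /* Rounding up can skip as many as SKETCH_ALIGN - 1 bytes, so the
     * raw buffer carries SKETCH_ALIGN extra bytes of slack. */
    double raw[SKETCH_LEN + SKETCH_ALIGN / sizeof(double)];
    double *raw_end = raw + sizeof raw / sizeof raw[0];
    double *buf = align_up(raw);

    buf[0] = 1.0;                 /* first element of the aligned window */
    buf[SKETCH_LEN - 1] = 2.0;    /* last element, still inside raw */

    return (uintptr_t)buf % SKETCH_ALIGN == 0 && buf + SKETCH_LEN <= raw_end;
}
```

Without the extra slack, the last elements of the aligned window could land past the end of the stack array.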

9 years ago Merge pull request #550 from wernsaar/develop
wernsaar [Thu, 23 Apr 2015 11:27:38 +0000 (13:27 +0200)]
Merge pull request #550 from wernsaar/develop

added optimized ssymv kernels for haswell and sandybridge

9 years ago added optimized ssymv kernels for sandybridge
Werner Saar [Thu, 23 Apr 2015 10:19:24 +0000 (12:19 +0200)]
added optimized ssymv kernels for sandybridge

9 years ago added optimized ssymv kernels for haswell
Werner Saar [Thu, 23 Apr 2015 08:23:13 +0000 (10:23 +0200)]
added optimized ssymv kernels for haswell

9 years ago Merge pull request #549 from wernsaar/develop
wernsaar [Wed, 22 Apr 2015 10:36:13 +0000 (12:36 +0200)]
Merge pull request #549 from wernsaar/develop

added optimized dsymv kernels for haswell and sandybridge