platform/upstream/openblas.git
wernsaar [Thu, 21 Nov 2013 19:48:57 +0000 (20:48 +0100)]
renamed some BLAS kernels that are compatible with ARMV6

wernsaar [Thu, 21 Nov 2013 19:18:51 +0000 (20:18 +0100)]
added CPU detection and target ARMV6, used in Raspberry Pi

wernsaar [Tue, 19 Nov 2013 14:07:20 +0000 (15:07 +0100)]
added gemv_n kernel for single and double precision

wernsaar [Tue, 19 Nov 2013 08:55:54 +0000 (09:55 +0100)]
added gemv_t kernel for single and double precision

wernsaar [Sat, 16 Nov 2013 15:17:17 +0000 (16:17 +0100)]
added nrm2 kernel for all precisions

wernsaar [Fri, 15 Nov 2013 13:08:57 +0000 (14:08 +0100)]
added rot kernel for all precisions

wernsaar [Fri, 15 Nov 2013 10:56:43 +0000 (11:56 +0100)]
added scal kernel for all precisions

wernsaar [Thu, 14 Nov 2013 18:06:19 +0000 (19:06 +0100)]
added swap-kernel for all precisions

wernsaar [Thu, 14 Nov 2013 12:52:47 +0000 (13:52 +0100)]
added max- and min-kernels for all precisions

wernsaar [Mon, 11 Nov 2013 14:47:56 +0000 (15:47 +0100)]
small optimizations on dot-kernels

wernsaar [Mon, 11 Nov 2013 13:20:59 +0000 (14:20 +0100)]
added asum_kernel for all precisions and complex

wernsaar [Fri, 8 Nov 2013 08:08:11 +0000 (09:08 +0100)]
added blas level1 dot kernels for complex and double complex

wernsaar [Thu, 7 Nov 2013 16:22:03 +0000 (17:22 +0100)]
added optimized blas level1 dot kernels for single and double precision

wernsaar [Thu, 7 Nov 2013 16:18:56 +0000 (17:18 +0100)]
added optimized blas level1 copy kernels

wernsaar [Thu, 7 Nov 2013 16:15:50 +0000 (17:15 +0100)]
added cgemm_tcopy_2_vfpv3.S and zgemm_tcopy_2_vfpv3.S

wernsaar [Wed, 6 Nov 2013 19:01:18 +0000 (20:01 +0100)]
added dgemm_tcopy_4_vfpv3.S and sgemm_tcopy_4_vfpv3.S

wernsaar [Tue, 5 Nov 2013 19:21:35 +0000 (20:21 +0100)]
added cgemm_ncopy_2_vfpv3.S and made assembler labels unique

wernsaar [Tue, 5 Nov 2013 18:31:22 +0000 (19:31 +0100)]
added zgemm_ncopy_2_vfpv3.S and made assembler labels unique

wernsaar [Sun, 3 Nov 2013 10:54:39 +0000 (11:54 +0100)]
added missing file kernel/arm/Makefile

wernsaar [Sun, 3 Nov 2013 10:19:32 +0000 (11:19 +0100)]
added missing file arm/Makefile in lapack/laswp

wernsaar [Sun, 3 Nov 2013 10:04:16 +0000 (11:04 +0100)]
added missing file cblas_noconst.h to the armv7 branch

wernsaar [Sun, 3 Nov 2013 09:34:04 +0000 (10:34 +0100)]
redefined functions for TIMING and YIELDING for ARMV7 processor

wernsaar [Sat, 2 Nov 2013 12:12:21 +0000 (13:12 +0100)]
deleted obsolete dgemm_kernel and dtrmm_kernel

wernsaar [Sat, 2 Nov 2013 12:06:11 +0000 (13:06 +0100)]
small optimizations on sgemm_kernel for ARMV7

wernsaar [Sat, 2 Nov 2013 08:43:53 +0000 (09:43 +0100)]
minor optimizations on zgemm_kernel for ARMV7

wernsaar [Fri, 1 Nov 2013 17:22:27 +0000 (18:22 +0100)]
added sgemm_ncopy routine and made some improvements on cgemm_kernel for ARMV7

wernsaar [Wed, 16 Oct 2013 17:04:42 +0000 (19:04 +0200)]
moved compiler flags from Makefile.rule to Makefile.arm

wernsaar [Wed, 16 Oct 2013 16:04:34 +0000 (18:04 +0200)]
optimized param.h

wernsaar [Wed, 16 Oct 2013 16:00:41 +0000 (18:00 +0200)]
added kernels for cgemm, ctrmm, zgemm and ztrmm

wernsaar [Mon, 14 Oct 2013 06:22:27 +0000 (08:22 +0200)]
added sgemm- and strmm_kernel

wernsaar [Sat, 12 Oct 2013 14:48:29 +0000 (16:48 +0200)]
added dgemm_ncopy_4_vfpv3.S

wernsaar [Sat, 12 Oct 2013 07:42:18 +0000 (09:42 +0200)]
minor optimizations on dgemm_kernel

wernsaar [Sat, 5 Oct 2013 10:59:44 +0000 (12:59 +0200)]
Changed kernels for dgemm and dtrmm

wernsaar [Mon, 30 Sep 2013 16:03:56 +0000 (18:03 +0200)]
changed some values for arm

wernsaar [Mon, 30 Sep 2013 15:31:23 +0000 (17:31 +0200)]
updated dgemm_kernel_8x2_vfpv3.S

wernsaar [Sun, 29 Sep 2013 17:42:33 +0000 (19:42 +0200)]
add modified c_check perl program

wernsaar [Sun, 29 Sep 2013 16:55:21 +0000 (18:55 +0200)]
added Makefile.arm

wernsaar [Sun, 29 Sep 2013 15:46:23 +0000 (17:46 +0200)]
changed dgemm_kernel to use fused multiply add

wernsaar [Sat, 28 Sep 2013 17:13:47 +0000 (19:13 +0200)]
modified Makefile.L3 for ARM

wernsaar [Sat, 28 Sep 2013 17:10:32 +0000 (19:10 +0200)]
common files modified for ARM

wernsaar [Sat, 28 Sep 2013 17:02:25 +0000 (19:02 +0200)]
initial checkin of kernel/arm

Zhang Xianyi [Thu, 5 Sep 2013 07:39:45 +0000 (15:39 +0800)]
Added backers.

Lars Buitinck [Wed, 28 Aug 2013 15:20:16 +0000 (17:20 +0200)]
Merge pull request #290 from larsmans/missing-threshold

check if GEMM_MULTITHREAD_THRESHOLD is defined in gemm.c
Set a fallback value.
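
The fallback described in this message can be sketched as a preprocessor guard. This is a minimal illustration only: the default value 4 and the helper `should_use_threads` are assumptions made for the example, not the value or code actually used in gemm.c.

```c
#include <stddef.h>

/* Fallback for builds where the build system did not define the macro.
   The value 4 is an illustrative assumption, not the one used in gemm.c. */
#ifndef GEMM_MULTITHREAD_THRESHOLD
#define GEMM_MULTITHREAD_THRESHOLD 4
#endif

/* Hypothetical helper: use threads only when the work estimate
   exceeds the threshold. */
static int should_use_threads(size_t work_estimate) {
    return work_estimate > (size_t)GEMM_MULTITHREAD_THRESHOLD;
}
```

With the guard in place, the file still compiles when the macro is not passed on the command line, and a definition supplied by the build system takes precedence.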

Zhang Xianyi [Wed, 28 Aug 2013 16:26:16 +0000 (09:26 -0700)]
Merge pull request #291 from larsmans/fix-makefile-prefix

fix default prefix handling in makefiles

Zhang Xianyi [Wed, 28 Aug 2013 16:25:23 +0000 (09:25 -0700)]
Merge pull request #289 from larsmans/no-noconst

get rid of the generated cblas_noconst.h file

Lars Buitinck [Wed, 28 Aug 2013 15:39:54 +0000 (17:39 +0200)]
fix default prefix handling in makefiles

PREFIX wasn't communicated to Makefile.install (where it matters)
by Makefile. As a result, the default PREFIX was empty and
OpenBLAS was being installed in /lib.

Lars Buitinck [Wed, 28 Aug 2013 14:52:24 +0000 (16:52 +0200)]
get rid of the generated cblas_noconst.h file

Zhang Xianyi [Wed, 28 Aug 2013 13:26:37 +0000 (06:26 -0700)]
Merge pull request #288 from sebastien-villemot/develop

Avoid failure on qemu guests declaring an Athlon CPU without 3dnow!

Sébastien Villemot [Wed, 28 Aug 2013 12:27:59 +0000 (14:27 +0200)]
Avoid failure on qemu guests declaring an Athlon CPU without 3dnow!

The present patch verifies that, on machines declaring an Athlon CPU model and
family, the 3dnow and 3dnowext feature flags are indeed present. If they are
not, it falls back on the most generic x86 kernel. This prevents crashes due to
illegal instruction on qemu guests with a weird configuration.

Closes #272
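
The check described above can be sketched as a test on feature bits. On x86, CPUID extended leaf 0x80000001 reports 3DNow! in EDX bit 31 and extended 3DNow! in EDX bit 30. The enum and function names below are hypothetical and are not taken from the patch; the function takes the EDX value as an argument so that the sketch stays portable.

```c
#include <stdint.h>

/* EDX feature bits of CPUID extended leaf 0x80000001. */
#define FEATURE_3DNOWEXT (UINT32_C(1) << 30)
#define FEATURE_3DNOW    (UINT32_C(1) << 31)

/* Hypothetical kernel identifiers, for illustration only. */
enum kernel_choice { KERNEL_GENERIC_X86, KERNEL_ATHLON };

/* For a CPU that reports an Athlon model and family, pick the Athlon
   kernel only if both 3DNow! flags are really present; otherwise fall
   back to the generic x86 kernel. */
static enum kernel_choice choose_athlon_kernel(uint32_t ext_edx) {
    uint32_t needed = FEATURE_3DNOW | FEATURE_3DNOWEXT;
    return (ext_edx & needed) == needed ? KERNEL_ATHLON : KERNEL_GENERIC_X86;
}
```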

Zhang Xianyi [Sat, 24 Aug 2013 14:46:18 +0000 (11:46 -0300)]
Merge branch 'bulldozer' into develop

Zhang Xianyi [Sat, 24 Aug 2013 05:09:49 +0000 (13:09 +0800)]
Refs #281. Detect __CYGWIN__ macro for Cygwin x86_64.

Signed-off-by: Zhang Xianyi <traits.zhang@gmail.com>

Zhang Xianyi [Fri, 23 Aug 2013 17:10:02 +0000 (01:10 +0800)]
Refs #281. Detect _WIN32 macro for Windows API.

http://www.mail-archive.com/bug-gnulib@gnu.org/msg05722.html

wernsaar [Sat, 17 Aug 2013 04:46:17 +0000 (06:46 +0200)]
removed unnecessary instructions from zgemm_kernel_2x2_bulldozer.S

wernsaar [Fri, 16 Aug 2013 18:23:34 +0000 (20:23 +0200)]
removed unnecessary instructions

Zhang Xianyi [Fri, 23 Aug 2013 08:27:17 +0000 (16:27 +0800)]
Refs #282. Fixed zgemv_n typo bug on Win64.

Zhang Xianyi [Wed, 21 Aug 2013 15:21:51 +0000 (08:21 -0700)]
Merge pull request #280 from ViralBShah/develop

Patch LAPACK XLASD4.f as discussed in JuliaLang/julia#2340

Viral B. Shah [Wed, 21 Aug 2013 13:44:07 +0000 (19:14 +0530)]
Patch LAPACK XLASD4.f as discussed in JuliaLang/julia#2340

Zhang Xianyi [Tue, 20 Aug 2013 16:03:25 +0000 (00:03 +0800)]
Refs #279. Provide ONLY_CBLAS flag. If you only need CBLAS without
a Fortran compiler, please try make ONLY_CBLAS=1.

This mode compiles only CBLAS, without the BLAS Fortran interface and LAPACK.

Zhang Xianyi [Mon, 12 Aug 2013 15:22:10 +0000 (23:22 +0800)]
Merge branch 'bulldozer' into develop

Zhang Xianyi [Fri, 9 Aug 2013 02:49:44 +0000 (10:49 +0800)]
Fixed #276. Merge branch 'wernsaar-develop' into bulldozer

Zhang Xianyi [Fri, 9 Aug 2013 02:48:46 +0000 (10:48 +0800)]
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

wernsaar [Thu, 8 Aug 2013 15:49:30 +0000 (17:49 +0200)]
modified KERNEL.BULLDOZER

wernsaar [Thu, 8 Aug 2013 05:14:08 +0000 (07:14 +0200)]
added dtrsm_kernel_RN_8x2_bulldozer.S

wernsaar [Mon, 5 Aug 2013 09:27:16 +0000 (11:27 +0200)]
dtrsm_kernel_LT_8x2_bulldozer.S performance optimization

Zhang Xianyi [Mon, 5 Aug 2013 08:17:15 +0000 (16:17 +0800)]
Refs #270 #268. Merge branch 'wernsaar-develop' into bulldozer

Zhang Xianyi [Mon, 5 Aug 2013 08:09:47 +0000 (16:09 +0800)]
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

Zhang Xianyi [Mon, 5 Aug 2013 08:07:54 +0000 (16:07 +0800)]
Enable bulldozer kernels.

Zhang Xianyi [Mon, 5 Aug 2013 07:51:53 +0000 (15:51 +0800)]
Merge branch 'develop' into bulldozer

wernsaar [Sun, 4 Aug 2013 10:16:12 +0000 (12:16 +0200)]
modified dtrsm_kernel_LT_8x2_bulldozer.S

wernsaar [Sun, 4 Aug 2013 08:15:33 +0000 (10:15 +0200)]
modified dtrsm_kernel_LT_8x2_bulldozer.S

wernsaar [Sun, 4 Aug 2013 07:54:40 +0000 (09:54 +0200)]
added dtrsm_kernel_LT_8x2_bulldozer.S

wernsaar [Sat, 3 Aug 2013 13:40:51 +0000 (15:40 +0200)]
removed dtrsm_kernel_LT_8x2_bulldozer.S

wernsaar [Sat, 3 Aug 2013 10:19:29 +0000 (12:19 +0200)]
fixed bug in dgemv_t_bulldozer.S

wernsaar [Sat, 3 Aug 2013 09:43:25 +0000 (11:43 +0200)]
repaired trmm bug in sgemm_kernel_16x2_bulldozer.S

wernsaar [Sat, 3 Aug 2013 08:32:51 +0000 (10:32 +0200)]
repaired trmm bug in cgemm_kernel_4x2_bulldozer.S

wernsaar [Sat, 3 Aug 2013 08:17:08 +0000 (10:17 +0200)]
repaired trmm bug in zgemm_kernel_2x2_bulldozer.S

wernsaar [Sat, 3 Aug 2013 07:35:39 +0000 (09:35 +0200)]
repaired trmm bug in dgemm_kernel_8x2_bulldozer.S

Zhang Xianyi [Thu, 1 Aug 2013 15:57:19 +0000 (23:57 +0800)]
Merge branch 'hotfix-v0.2.8' into develop

Zhang Xianyi [Thu, 1 Aug 2013 15:52:43 +0000 (23:52 +0800)]
Update the doc for 0.2.8 version.

Zhang Xianyi [Wed, 31 Jul 2013 06:49:16 +0000 (14:49 +0800)]
OpenBLAS 0.2.8 rc1.

Zhang Xianyi [Wed, 31 Jul 2013 06:46:56 +0000 (14:46 +0800)]
Merge branch 'hotfix-v0.2.8' into develop

Zhang Xianyi [Wed, 31 Jul 2013 06:41:39 +0000 (14:41 +0800)]
Refs #266. Fixed the compiling bug with Open64 5.0.

wernsaar [Tue, 30 Jul 2013 18:18:57 +0000 (20:18 +0200)]
added generic trmm kernels and modified Makefile.L3

Zhang Xianyi [Mon, 29 Jul 2013 15:21:10 +0000 (23:21 +0800)]
Fixed #264 the memory leak bug in dtrtri_U.

Zhang Xianyi [Sat, 27 Jul 2013 14:37:57 +0000 (22:37 +0800)]
Fixed the FMA3 detection bug.

Zhang Xianyi [Fri, 26 Jul 2013 15:43:54 +0000 (23:43 +0800)]
Fixed #261. Use strncmp instead of a comparing trick.

Zhang Xianyi [Mon, 29 Jul 2013 07:42:00 +0000 (15:42 +0800)]
Fixed typo in getarch_2nd.c.

wernsaar [Sun, 28 Jul 2013 14:47:58 +0000 (16:47 +0200)]
added dtrsm_kernel_LT_8x2_bulldozer.S

Zhang Xianyi [Sun, 28 Jul 2013 09:39:24 +0000 (17:39 +0800)]
Refs #263. Rollback bulldozer and piledriver kernels to barcelona kernels.

Zhang Xianyi [Sun, 28 Jul 2013 04:38:25 +0000 (06:38 +0200)]
Merge branch 'develop' into bulldozer

Conflicts:
kernel/x86_64/KERNEL.BULLDOZER

Zhang Xianyi [Sat, 27 Jul 2013 16:09:40 +0000 (00:09 +0800)]
Refs #262. Added executable stack markings.

Zhang Xianyi [Sat, 27 Jul 2013 15:03:07 +0000 (23:03 +0800)]
Merge branch 'sfabbro-ldflags' into develop

Zhang Xianyi [Sat, 27 Jul 2013 15:01:36 +0000 (23:01 +0800)]
Fixed #260. Fixed generating 32-bit shared library on previous commit.

Zhang Xianyi [Sat, 27 Jul 2013 14:37:57 +0000 (22:37 +0800)]
Fixed the FMA3 detection bug.

Zhang Xianyi [Sat, 27 Jul 2013 14:19:54 +0000 (22:19 +0800)]
Merge branch 'ldflags' of https://github.com/sfabbro/OpenBLAS into sfabbro-ldflags

Zhang Xianyi [Fri, 26 Jul 2013 15:43:54 +0000 (23:43 +0800)]
Fixed #261. Use strncmp instead of a comparing trick.

Sebastien Fabbro [Wed, 24 Jul 2013 16:37:16 +0000 (09:37 -0700)]
Respect user's LDFLAGS

Zhang Xianyi [Thu, 25 Jul 2013 17:34:45 +0000 (01:34 +0800)]
Merge branch 'develop' (tag: v0.2.7)

Zhang Xianyi [Thu, 25 Jul 2013 17:32:32 +0000 (01:32 +0800)]
Refs #259. Fixed missing LAPACK functions in shared library.

Zhang Xianyi [Tue, 23 Jul 2013 05:40:08 +0000 (13:40 +0800)]
Merge branch 'develop'