platform/upstream/openblas.git
11 years agoRefs #279. Provide ONLY_CBLAS flag. If you only need CBLAS without
Zhang Xianyi [Tue, 20 Aug 2013 16:03:25 +0000 (00:03 +0800)]
Refs #279. Provide ONLY_CBLAS flag. If you only need CBLAS without
a Fortran compiler, please try make ONLY_CBLAS=1.

This mode compiles only CBLAS, without the BLAS Fortran interface and LAPACK.

11 years agoMerge branch 'bulldozer' into develop
Zhang Xianyi [Mon, 12 Aug 2013 15:22:10 +0000 (23:22 +0800)]
Merge branch 'bulldozer' into develop

11 years agoFixed #276. Merge branch 'wernsaar-develop' into bulldozer
Zhang Xianyi [Fri, 9 Aug 2013 02:49:44 +0000 (10:49 +0800)]
Fixed #276. Merge branch 'wernsaar-develop' into bulldozer

11 years agoMerge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
Zhang Xianyi [Fri, 9 Aug 2013 02:48:46 +0000 (10:48 +0800)]
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

11 years agomodified KERNEL.BULLDOZER
wernsaar [Thu, 8 Aug 2013 15:49:30 +0000 (17:49 +0200)]
modified KERNEL.BULLDOZER

11 years agoadded dtrsm_kernel_RN_8x2_bulldozer.S
wernsaar [Thu, 8 Aug 2013 05:14:08 +0000 (07:14 +0200)]
added dtrsm_kernel_RN_8x2_bulldozer.S

11 years agodtrsm_kernel_LT_8x2_bulldozer.S performance optimization
wernsaar [Mon, 5 Aug 2013 09:27:16 +0000 (11:27 +0200)]
dtrsm_kernel_LT_8x2_bulldozer.S performance optimization

11 years agoRefs #270 #268. Merge branch 'wernsaar-develop' into bulldozer
Zhang Xianyi [Mon, 5 Aug 2013 08:17:15 +0000 (16:17 +0800)]
Refs #270 #268. Merge branch 'wernsaar-develop' into bulldozer

11 years agoMerge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
Zhang Xianyi [Mon, 5 Aug 2013 08:09:47 +0000 (16:09 +0800)]
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

11 years agoEnable bulldozer kernels.
Zhang Xianyi [Mon, 5 Aug 2013 08:07:54 +0000 (16:07 +0800)]
Enable bulldozer kernels.

11 years agoMerge branch 'develop' into bulldozer
Zhang Xianyi [Mon, 5 Aug 2013 07:51:53 +0000 (15:51 +0800)]
Merge branch 'develop' into bulldozer

11 years agomodified dtrsm_kernel_LT_8x2_bulldozer.S
wernsaar [Sun, 4 Aug 2013 10:16:12 +0000 (12:16 +0200)]
modified dtrsm_kernel_LT_8x2_bulldozer.S

11 years agomodified dtrsm_kernel_LT_8x2_bulldozer.S
wernsaar [Sun, 4 Aug 2013 08:15:33 +0000 (10:15 +0200)]
modified dtrsm_kernel_LT_8x2_bulldozer.S

11 years agoadded dtrsm_kernel_LT_8x2_bulldozer.S
wernsaar [Sun, 4 Aug 2013 07:54:40 +0000 (09:54 +0200)]
added dtrsm_kernel_LT_8x2_bulldozer.S

11 years agoremoved dtrsm_kernel_LT_8x2_bulldozer.S
wernsaar [Sat, 3 Aug 2013 13:40:51 +0000 (15:40 +0200)]
removed dtrsm_kernel_LT_8x2_bulldozer.S

11 years agofixed bug in dgemv_t_bulldozer.S
wernsaar [Sat, 3 Aug 2013 10:19:29 +0000 (12:19 +0200)]
fixed bug in dgemv_t_bulldozer.S

11 years agorepaired trmm bug in sgemm_kernel_16x2_bulldozer.S
wernsaar [Sat, 3 Aug 2013 09:43:25 +0000 (11:43 +0200)]
repaired trmm bug in sgemm_kernel_16x2_bulldozer.S

11 years agorepaired trmm bug in cgemm_kernel_4x2_bulldozer.S
wernsaar [Sat, 3 Aug 2013 08:32:51 +0000 (10:32 +0200)]
repaired trmm bug in cgemm_kernel_4x2_bulldozer.S

11 years agorepaired trmm bug in zgemm_kernel_2x2_bulldozer.S
wernsaar [Sat, 3 Aug 2013 08:17:08 +0000 (10:17 +0200)]
repaired trmm bug in zgemm_kernel_2x2_bulldozer.S

11 years agorepaired trmm bug in dgemm_kernel_8x2_bulldozer.S
wernsaar [Sat, 3 Aug 2013 07:35:39 +0000 (09:35 +0200)]
repaired trmm bug in dgemm_kernel_8x2_bulldozer.S

11 years agoMerge branch 'hotfix-v0.2.8' into develop
Zhang Xianyi [Thu, 1 Aug 2013 15:57:19 +0000 (23:57 +0800)]
Merge branch 'hotfix-v0.2.8' into develop

11 years agoUpdate the doc for 0.2.8 version.
Zhang Xianyi [Thu, 1 Aug 2013 15:52:43 +0000 (23:52 +0800)]
Update the doc for 0.2.8 version.

11 years agoOpenBLAS 0.2.8 rc1.
Zhang Xianyi [Wed, 31 Jul 2013 06:49:16 +0000 (14:49 +0800)]
OpenBLAS 0.2.8 rc1.

11 years agoMerge branch 'hotfix-v0.2.8' into develop
Zhang Xianyi [Wed, 31 Jul 2013 06:46:56 +0000 (14:46 +0800)]
Merge branch 'hotfix-v0.2.8' into develop

11 years agoRefs #266. Fixed the compiling bug with Open64 5.0.
Zhang Xianyi [Wed, 31 Jul 2013 06:41:39 +0000 (14:41 +0800)]
Refs #266. Fixed the compiling bug with Open64 5.0.

11 years agoadded generic trmm kernels and modified Makefile.L3
wernsaar [Tue, 30 Jul 2013 18:18:57 +0000 (20:18 +0200)]
added generic trmm kernels and modified Makefile.L3

11 years agoFixed #264 the memory leak bug in dtrtri_U.
Zhang Xianyi [Mon, 29 Jul 2013 15:21:10 +0000 (23:21 +0800)]
Fixed #264 the memory leak bug in dtrtri_U.

11 years agoFixed the FMA3 detection bug.
Zhang Xianyi [Sat, 27 Jul 2013 14:37:57 +0000 (22:37 +0800)]
Fixed the FMA3 detection bug.

11 years agoFixed #261. Use strncmp instead of a comparing trick.
Zhang Xianyi [Fri, 26 Jul 2013 15:43:54 +0000 (23:43 +0800)]
Fixed #261. Use strncmp instead of a comparing trick.

11 years agoFixed typo in getarch_2nd.c.
Zhang Xianyi [Mon, 29 Jul 2013 07:42:00 +0000 (15:42 +0800)]
Fixed typo in getarch_2nd.c.

11 years agoadded dtrsm_kernel_LT_8x2_bulldozer.S
wernsaar [Sun, 28 Jul 2013 14:47:58 +0000 (16:47 +0200)]
added dtrsm_kernel_LT_8x2_bulldozer.S

11 years agoRefs #263. Rollback bulldozer and piledriver kernels to barcelona kernels.
Zhang Xianyi [Sun, 28 Jul 2013 09:39:24 +0000 (17:39 +0800)]
Refs #263. Rollback bulldozer and piledriver kernels to barcelona kernels.

11 years agoMerge branch 'develop' into bulldozer
Zhang Xianyi [Sun, 28 Jul 2013 04:38:25 +0000 (06:38 +0200)]
Merge branch 'develop' into bulldozer

Conflicts:
kernel/x86_64/KERNEL.BULLDOZER

11 years agoRefs #262. Added executable stack markings.
Zhang Xianyi [Sat, 27 Jul 2013 16:09:40 +0000 (00:09 +0800)]
Refs #262. Added executable stack markings.

11 years agoMerge branch 'sfabbro-ldflags' into develop
Zhang Xianyi [Sat, 27 Jul 2013 15:03:07 +0000 (23:03 +0800)]
Merge branch 'sfabbro-ldflags' into develop

11 years agoFixed #260. Fixed generating 32-bit shared library on previous commit.
Zhang Xianyi [Sat, 27 Jul 2013 15:01:36 +0000 (23:01 +0800)]
Fixed #260. Fixed generating 32-bit shared library on previous commit.

11 years agoFixed the FMA3 detection bug.
Zhang Xianyi [Sat, 27 Jul 2013 14:37:57 +0000 (22:37 +0800)]
Fixed the FMA3 detection bug.

11 years agoMerge branch 'ldflags' of https://github.com/sfabbro/OpenBLAS into sfabbro-ldflags
Zhang Xianyi [Sat, 27 Jul 2013 14:19:54 +0000 (22:19 +0800)]
Merge branch 'ldflags' of https://github.com/sfabbro/OpenBLAS into sfabbro-ldflags

11 years agoFixed #261. Use strncmp instead of a comparing trick.
Zhang Xianyi [Fri, 26 Jul 2013 15:43:54 +0000 (23:43 +0800)]
Fixed #261. Use strncmp instead of a comparing trick.

11 years agoRespect user's LDFLAGS
Sebastien Fabbro [Wed, 24 Jul 2013 16:37:16 +0000 (09:37 -0700)]
Respect user's LDFLAGS

11 years agoMerge branch 'develop' v0.2.7
Zhang Xianyi [Thu, 25 Jul 2013 17:34:45 +0000 (01:34 +0800)]
Merge branch 'develop'

11 years agoRefs #259. Fixed missing LAPACK functions in shared library.
Zhang Xianyi [Thu, 25 Jul 2013 17:32:32 +0000 (01:32 +0800)]
Refs #259. Fixed missing LAPACK functions in shared library.

11 years agoMerge branch 'develop'
Zhang Xianyi [Tue, 23 Jul 2013 05:40:08 +0000 (13:40 +0800)]
Merge branch 'develop'

11 years agoMerge pull request #257 from staticfloat/develop
Zhang Xianyi [Tue, 23 Jul 2013 05:35:29 +0000 (22:35 -0700)]
Merge pull request #257 from staticfloat/develop

Add in return value for `interface/trtri.c`

11 years agoFix xianyi/OpenBLAS#256
Elliot Saba [Tue, 23 Jul 2013 00:02:06 +0000 (17:02 -0700)]
Fix xianyi/OpenBLAS#256

11 years agoRefs #255. Didn't use f77 compiler.
Zhang Xianyi [Mon, 22 Jul 2013 03:34:43 +0000 (11:34 +0800)]
Refs #255. Didn't use f77 compiler.

11 years agoUpdate CONTRIBUTORS.md
Zhang Xianyi [Sat, 20 Jul 2013 15:32:23 +0000 (23:32 +0800)]
Update CONTRIBUTORS.md

11 years agoMerge branch 'develop'
Zhang Xianyi [Sat, 20 Jul 2013 15:05:36 +0000 (23:05 +0800)]
Merge branch 'develop'

11 years agoFixed #253. Update doc for v0.2.7 version.
Zhang Xianyi [Sat, 20 Jul 2013 15:05:12 +0000 (23:05 +0800)]
Fixed #253. Update doc for v0.2.7 version.

11 years agoMerge branch 'loongson3b' into develop
Zhang Xianyi [Sat, 20 Jul 2013 14:33:35 +0000 (22:33 +0800)]
Merge branch 'loongson3b' into develop

11 years agoMerge branch 'loongson3a' into develop
Zhang Xianyi [Sat, 20 Jul 2013 14:32:38 +0000 (22:32 +0800)]
Merge branch 'loongson3a' into develop

Conflicts:
Makefile.system

11 years agoFixed #254. Added the date of changes in contributors file.
Zhang Xianyi [Sat, 20 Jul 2013 03:35:27 +0000 (11:35 +0800)]
Fixed #254. Added the date of changes in contributors file.

11 years agocreate contributor file.
Zhang Xianyi [Fri, 19 Jul 2013 00:38:03 +0000 (08:38 +0800)]
create contributor file.

11 years agoFixed a computational error in zgemm_kernel_4x4_sandy.S file.
wangqian [Thu, 18 Jul 2013 12:23:21 +0000 (20:23 +0800)]
Fixed a computational error in zgemm_kernel_4x4_sandy.S file.

11 years agoEnsure the correct stack alignment on Win32.
Zhang Xianyi [Wed, 17 Jul 2013 07:19:07 +0000 (15:19 +0800)]
Ensure the correct stack alignment on Win32.

11 years agoFixed typo in generating shared library on x86_64.
Zhang Xianyi [Tue, 16 Jul 2013 15:18:18 +0000 (23:18 +0800)]
Fixed typo in generating shared library on x86_64.

11 years agoModified Makefile to avoid redundant echo.
Zhang Xianyi [Tue, 16 Jul 2013 14:44:27 +0000 (22:44 +0800)]
Modified Makefile to avoid redundant echo.

11 years agoModified Makefile.install
Zhang Xianyi [Tue, 16 Jul 2013 09:45:00 +0000 (17:45 +0800)]
Modified Makefile.install

11 years agoRefs #225. Fixed a bug in GEMM OpenMP threading.
Zhang Xianyi [Mon, 15 Jul 2013 01:56:19 +0000 (09:56 +0800)]
Refs #225. Fixed a bug in GEMM OpenMP threading.

11 years agoRefs #191. A workaround for dtrtri_U single thread bug.
Zhang Xianyi [Sun, 14 Jul 2013 14:16:30 +0000 (22:16 +0800)]
Refs #191. A workaround for dtrtri_U single thread bug.

This function caused the failure of the ERKALE serial test.
I replaced it with the LAPACK source code.

11 years agoChanged makefile for lapack.
Zhang Xianyi [Sun, 14 Jul 2013 02:41:54 +0000 (10:41 +0800)]
Changed makefile for lapack.

11 years agoUpdated travis.
Zhang Xianyi [Fri, 12 Jul 2013 13:41:12 +0000 (21:41 +0800)]
Updated travis.

11 years agoUpdate build matrix for Travis CI.
Zhang Xianyi [Thu, 11 Jul 2013 15:49:29 +0000 (23:49 +0800)]
Update build matrix for Travis CI.

11 years agoFixed the typo.
Zhang Xianyi [Thu, 11 Jul 2013 15:47:07 +0000 (23:47 +0800)]
Fixed the typo.

11 years agoFixed generating dll bug in last commit.
Zhang Xianyi [Thu, 11 Jul 2013 14:24:50 +0000 (22:24 +0800)]
Fixed generating dll bug in last commit.

11 years agoFixed #251. Merge branch 'grisuthedragon-develop' into develop
Zhang Xianyi [Thu, 11 Jul 2013 13:41:44 +0000 (21:41 +0800)]
Fixed #251. Merge branch 'grisuthedragon-develop' into develop

11 years agocreate openblas_get_parallel to retrieve information which
grisuthedragon [Thu, 11 Jul 2013 11:39:27 +0000 (13:39 +0200)]
create openblas_get_parallel to retrieve information about which
parallelization model is used by OpenBLAS.

11 years agoRefs #214, #221, #246. Fixed the getrf overflow bug on Windows.
Zhang Xianyi [Wed, 10 Jul 2013 19:20:02 +0000 (03:20 +0800)]
Refs #214, #221, #246. Fixed the getrf overflow bug on Windows.

I used a smaller threshold since the stack size is 1MB on Windows.

11 years agoRefs #248. Support LAPACK and LAPACKE with lsbcc.
Zhang Xianyi [Wed, 10 Jul 2013 08:02:27 +0000 (16:02 +0800)]
Refs #248. Support LAPACK and LAPACKE with lsbcc.

For LAPACKE, use LAPACK_COMPLEX_STRUCTURE.
The reason is that lsbcc doesn't define complex I in complex.h.

11 years agoMerge pull request #249 from wernsaar/develop
Zhang Xianyi [Wed, 10 Jul 2013 08:01:03 +0000 (01:01 -0700)]
Merge pull request #249 from wernsaar/develop

replaced defined(DOUBLE) by !defined(XDOUBLE)

11 years agoreplaced defined(DOUBLE) by !defined(XDOUBLE)
wernsaar [Tue, 9 Jul 2013 16:17:50 +0000 (18:17 +0200)]
replaced defined(DOUBLE) by !defined(XDOUBLE)

11 years agoRefs #247. Included lapack source codes. Avoid downloading tar.gz from netlib.org
Zhang Xianyi [Tue, 9 Jul 2013 09:00:02 +0000 (17:00 +0800)]
Refs #247. Included lapack source codes. Avoid downloading tar.gz from netlib.org

Based on 3.4.2 version, apply patch.for_lapack-3.4.2.

11 years agoFixed the typo in getarch.c
Zhang Xianyi [Tue, 9 Jul 2013 08:26:59 +0000 (16:26 +0800)]
Fixed the typo in getarch.c

11 years agoRefs #248. Fixed the LSB compatibility issue for BLAS only.
Zhang Xianyi [Tue, 9 Jul 2013 07:38:03 +0000 (15:38 +0800)]
Refs #248. Fixed the LSB compatibility issue for BLAS only.
For example, make CC=lsbcc NO_LAPACK=1.

11 years agoRefs #221 #246. Fixed the stack overflow bug in multithreading BLAS3.
Zhang Xianyi [Sun, 7 Jul 2013 17:07:05 +0000 (01:07 +0800)]
Refs #221 #246. Fixed the stack overflow bug in multithreading BLAS3.

The overflow occurs when NUM_THREADS (MAX_CPU_NUMBER) is very large, e.g. 256.

typedef struct {
  volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

job_t          job[MAX_CPU_NUMBER];

The job array then occupies 8MB.

Thus, we use malloc instead of stack allocation.
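
The 8MB figure can be reproduced with a minimal sketch. The values CACHE_LINE_SIZE = 8 and DIVIDE_RATE = 2 and the 8-byte BLASLONG are assumptions chosen for illustration (the commit message states only MAX_CPU_NUMBER = 256 and the 8MB total), and the helper names job_array_alloc and job_array_free are hypothetical, not taken from the OpenBLAS source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed values for illustration only; the commit message gives
   MAX_CPU_NUMBER = 256 and the 8MB total, nothing else. */
#define MAX_CPU_NUMBER  256
#define CACHE_LINE_SIZE 8
#define DIVIDE_RATE     2

typedef long long BLASLONG;  /* assumption: 8-byte integer type */

typedef struct {
  volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

/* 256 rows * 16 entries * 8 bytes = 32KB per job_t; 256 of them = 8MB.
   That already matches or exceeds common default stack limits. */
static_assert(sizeof(job_t) * MAX_CPU_NUMBER == 8u * 1024u * 1024u,
              "job array should occupy 8MB under the assumed constants");

/* Hypothetical helper: put the job array on the heap, zero-initialized,
   instead of declaring `job_t job[MAX_CPU_NUMBER];` as a local variable. */
static job_t *job_array_alloc(void) {
  return calloc(MAX_CPU_NUMBER, sizeof(job_t));
}

/* Hypothetical helper: release the array once the threaded call finishes. */
static void job_array_free(job_t *job) {
  free(job);
}
```

A caller would check the result of job_array_alloc() for NULL before use and call job_array_free() on every exit path. The trade-off is one heap allocation per call in exchange for no longer depending on the platform's stack limit.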

11 years agoSupport AMD Piledriver by bulldozer kernels.
Zhang Xianyi [Sat, 6 Jul 2013 15:06:43 +0000 (12:06 -0300)]
Support AMD Piledriver by bulldozer kernels.

11 years agoAdded Travis CI status image.
Zhang Xianyi [Fri, 5 Jul 2013 07:28:41 +0000 (15:28 +0800)]
Added Travis CI status image.

11 years agoUse quiet make for Travis CI.
Zhang Xianyi [Fri, 5 Jul 2013 06:52:57 +0000 (14:52 +0800)]
Use quiet make for Travis CI.

11 years agoInstall gfortran in Travis CI.
Zhang Xianyi [Fri, 5 Jul 2013 03:11:18 +0000 (11:11 +0800)]
Install gfortran in Travis CI.

11 years agoAdded travis.yml file.
Zhang Xianyi [Thu, 4 Jul 2013 15:30:53 +0000 (23:30 +0800)]
Added travis.yml file.

11 years agoImproved make clean on Mac OS X.
Zhang Xianyi [Tue, 2 Jul 2013 06:37:30 +0000 (14:37 +0800)]
Improved make clean on Mac OS X.

11 years agoRefs #221. Set stack limit to 16MB to prevent a SEGFAULT bug on Mac OS X with DYNAMIC...
Zhang Xianyi [Tue, 2 Jul 2013 06:17:55 +0000 (14:17 +0800)]
Refs #221. Set stack limit to 16MB to prevent a SEGFAULT bug on Mac OS X with DYNAMIC_ARCH=1 & NUM_THREADS=256.

11 years agoUse ALIGN_5 instead of .align 32 in assembly kernel. Added ALIGN_5 for 32-bit OSX.
Zhang Xianyi [Mon, 1 Jul 2013 08:09:05 +0000 (16:09 +0800)]
Use ALIGN_5 instead of .align 32 in assembly kernel. Added ALIGN_5 for 32-bit OSX.

11 years agoMerge pull request #242 from danluu/readme.haswell
Zhang Xianyi [Sun, 30 Jun 2013 16:40:32 +0000 (09:40 -0700)]
Merge pull request #242 from danluu/readme.haswell

Update README to reflect Haswell support, etc.

11 years agoFix miscellaneous typos
Dan Luu [Sun, 30 Jun 2013 16:36:13 +0000 (11:36 -0500)]
Fix miscellaneous typos

11 years agoFixed #217 openblas_config.h bug on Windows 64.
Zhang Xianyi [Sun, 30 Jun 2013 16:35:14 +0000 (00:35 +0800)]
Fixed #217 openblas_config.h bug on Windows 64.

11 years agoAdd Haswell support
Dan Luu [Sun, 30 Jun 2013 16:35:00 +0000 (11:35 -0500)]
Add Haswell support

11 years agoRefs #241. Add Haswell support (using sandybridge optimizations)
Dan Luu [Sat, 29 Jun 2013 22:26:56 +0000 (17:26 -0500)]
Refs #241. Add Haswell support (using sandybridge optimizations)

11 years agoFixed #239 bug in param.h about BARCELONA and BULLDOZER.
Zhang Xianyi [Sat, 29 Jun 2013 02:36:01 +0000 (10:36 +0800)]
Fixed #239 bug in param.h about BARCELONA and BULLDOZER.

11 years agoFixed #238 bug in lsame on x86.
Zhang Xianyi [Fri, 28 Jun 2013 14:43:41 +0000 (22:43 +0800)]
Fixed #238 bug in lsame on x86.

11 years agoMerge pull request #235 from wernsaar/develop
Zhang Xianyi [Sat, 22 Jun 2013 00:59:26 +0000 (17:59 -0700)]
Merge pull request #235 from wernsaar/develop

Added ddot, daxpy, dcopy kernels for AMD bulldozer.

11 years agoadded dcopy_bulldozer.S
wernsaar [Fri, 21 Jun 2013 14:06:51 +0000 (16:06 +0200)]
added dcopy_bulldozer.S

11 years agoadded ddot_bulldozer.S
wernsaar [Thu, 20 Jun 2013 14:15:09 +0000 (16:15 +0200)]
added ddot_bulldozer.S

11 years agoadded daxpy_bulldozer.S
wernsaar [Thu, 20 Jun 2013 12:07:54 +0000 (14:07 +0200)]
added daxpy_bulldozer.S

11 years agocleanup of dgemm_ncopy_8_bulldozer.S
wernsaar [Wed, 19 Jun 2013 17:31:38 +0000 (19:31 +0200)]
cleanup of dgemm_ncopy_8_bulldozer.S

11 years agoadded dgemv_t_bulldozer.S
wernsaar [Wed, 19 Jun 2013 15:32:42 +0000 (17:32 +0200)]
added dgemv_t_bulldozer.S

11 years agoMerge pull request #233 from wernsaar/develop
Zhang Xianyi [Wed, 19 Jun 2013 03:02:36 +0000 (20:02 -0700)]
Merge pull request #233 from wernsaar/develop

added dgemv_n and some faster gemm_copy routines to BULLDOZER.

11 years agoadded dgemm_ncopy_8_bulldozer.S
wernsaar [Tue, 18 Jun 2013 11:29:23 +0000 (13:29 +0200)]
added dgemm_ncopy_8_bulldozer.S

11 years agoadded gemm_tcopy_2_bulldozer.S
wernsaar [Tue, 18 Jun 2013 09:01:33 +0000 (11:01 +0200)]
added gemm_tcopy_2_bulldozer.S

11 years agoadded dgemm_tcopy_8_bulldozer.S
wernsaar [Mon, 17 Jun 2013 12:19:09 +0000 (14:19 +0200)]
added dgemm_tcopy_8_bulldozer.S