platform/upstream/openblas.git
Zhang Xianyi [Sun, 29 Jun 2014 02:34:51 +0000 (10:34 +0800)]
Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bulldozer, and
barcelona on Windows.

Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

Conflicts:
kernel/Makefile.L1
kernel/x86_64/KERNEL
param.h

wernsaar [Sat, 28 Jun 2014 17:04:49 +0000 (19:04 +0200)]
fixed zgemv bug for older AMD Processors

Zhang Xianyi [Sat, 28 Jun 2014 12:52:07 +0000 (20:52 +0800)]
Merge branch 'TimothyGu-develop' into develop
Fixed #398. Remove all trailing whitespace except lapack-netlib.

Zhang Xianyi [Sat, 28 Jun 2014 12:51:31 +0000 (20:51 +0800)]
Merge branch 'develop' of https://github.com/TimothyGu/OpenBLAS into TimothyGu-develop

Conflicts:
driver/others/memory.c

Zhang Xianyi [Sat, 28 Jun 2014 12:40:23 +0000 (20:40 +0800)]
Merge pull request #399 from TimothyGu/upstr

Build import libs as .dll.a instead of .lib

Zhang Xianyi [Sat, 28 Jun 2014 12:38:48 +0000 (20:38 +0800)]
Merge pull request #397 from vtjnash/develop

fix #394

wernsaar [Sat, 28 Jun 2014 10:36:11 +0000 (12:36 +0200)]
bugfix for barcelona zgemv-kernel

wernsaar [Sat, 28 Jun 2014 10:16:20 +0000 (12:16 +0200)]
bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel

wernsaar [Sat, 28 Jun 2014 09:46:58 +0000 (11:46 +0200)]
bugfix for piledriver cgemm-, zgemm- and zgemv-kernel

wernsaar [Sat, 28 Jun 2014 08:22:40 +0000 (10:22 +0200)]
bugfix for haswell cgemm- and zgemm-kernel

wernsaar [Sat, 28 Jun 2014 08:01:56 +0000 (10:01 +0200)]
bugfix for cgemm_kernel_8x2_sandy.S

Timothy Gu [Fri, 27 Jun 2014 20:58:42 +0000 (13:58 -0700)]
.gitignore: add some more entries concerned with kernel

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Timothy Gu [Fri, 27 Jun 2014 18:58:14 +0000 (11:58 -0700)]
Build import libs as .dll.a instead of .lib

This is MinGW convention.

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Timothy Gu [Fri, 27 Jun 2014 19:05:18 +0000 (12:05 -0700)]
Remove all trailing whitespace except lapack-netlib

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Jameson Nash [Fri, 27 Jun 2014 16:10:04 +0000 (12:10 -0400)]
fix #394. this cleans up some handles after using them, and doesn't disable ALL process privileges upon success

wernsaar [Fri, 27 Jun 2014 11:40:29 +0000 (13:40 +0200)]
added optimized cgemm-kernel for SANDYBRIDGE

wernsaar [Fri, 27 Jun 2014 09:30:29 +0000 (11:30 +0200)]
added DSDOT definition and enabled optimized sdot kernel

wernsaar [Fri, 27 Jun 2014 08:12:19 +0000 (10:12 +0200)]
added blas-test from lapack

Zhang Xianyi [Fri, 27 Jun 2014 06:57:06 +0000 (14:57 +0800)]
Merge pull request #390 from wernsaar/develop

Ref #103: enhancement for small matrix dimensions. Fixed some bugs. Enable sgemm for SNB and dgemm for NEHALEM

wernsaar [Thu, 26 Jun 2014 19:42:08 +0000 (21:42 +0200)]
added new optimized sgemm kernel for SANDYBRIDGE

wernsaar [Thu, 26 Jun 2014 10:22:29 +0000 (12:22 +0200)]
enabled optimized dgemm kernel for NEHALEM

wernsaar [Wed, 25 Jun 2014 11:55:19 +0000 (13:55 +0200)]
Fortran flag -frecursive is disabled by default

wernsaar [Wed, 25 Jun 2014 11:50:57 +0000 (13:50 +0200)]
enabled optimized sgemv kernel for barcelona and piledriver

wernsaar [Wed, 25 Jun 2014 10:56:45 +0000 (12:56 +0200)]
enabled optimized sgemv kernel for HASWELL

wernsaar [Wed, 25 Jun 2014 10:38:14 +0000 (12:38 +0200)]
enabled optimized sgemv kernels for nehalem, sandybridge and bulldozer

wernsaar [Wed, 25 Jun 2014 09:32:44 +0000 (11:32 +0200)]
fixed compiler warnings

wernsaar [Wed, 25 Jun 2014 08:40:25 +0000 (10:40 +0200)]
added parameter for gemm3m kernels

wernsaar [Sun, 22 Jun 2014 15:27:11 +0000 (17:27 +0200)]
force fallback for x86 32bit

wernsaar [Sun, 22 Jun 2014 11:51:17 +0000 (13:51 +0200)]
Ref #391: force fallback for x86 32bit

wernsaar [Sun, 22 Jun 2014 10:01:24 +0000 (12:01 +0200)]
Ref #391: disabled SMP in ger.c and zger.c

wernsaar [Sun, 22 Jun 2014 07:49:20 +0000 (09:49 +0200)]
fixed bug for INTERFACE64

wernsaar [Sat, 21 Jun 2014 10:29:23 +0000 (12:29 +0200)]
Ref #393: fix for INTERFACE64=0 and ARCH_X86 in divtable

wernsaar [Thu, 19 Jun 2014 12:31:52 +0000 (14:31 +0200)]
Ref #380: lowered stack usage for haswell kernels

wernsaar [Thu, 19 Jun 2014 12:02:14 +0000 (14:02 +0200)]
Ref #380: lowered stack usage for piledriver and bulldozer kernels

wernsaar [Wed, 18 Jun 2014 13:04:11 +0000 (15:04 +0200)]
Ref #103: enhancement for small matrix dimensions

Zhang Xianyi [Tue, 17 Jun 2014 23:57:30 +0000 (07:57 +0800)]
Merge pull request #387 from davidanthoff/fixbuilderroronwin

Add -lgfortran flag to gcc call in a makefile

Zhang Xianyi [Tue, 17 Jun 2014 23:56:08 +0000 (07:56 +0800)]
Merge pull request #386 from wernsaar/develop

Some enhancements for dynamic_arch and some warning fixes

David Anthoff [Sat, 14 Jun 2014 04:10:27 +0000 (21:10 -0700)]
Add -lgfortran flag to gcc call in a makefile

Adding $(EXTRALIB) adds this flag when things are built with
msys2 on windows. Without this the build fails.

wernsaar [Thu, 12 Jun 2014 16:17:08 +0000 (18:17 +0200)]
Ref #385: fixed warnings in dynamic.c

wernsaar [Thu, 12 Jun 2014 13:52:14 +0000 (15:52 +0200)]
Ref #385: added missing return instruction

wernsaar [Thu, 12 Jun 2014 12:20:03 +0000 (14:20 +0200)]
Ref #380: enhancements for dynamic_arch

Zhang Xianyi [Wed, 11 Jun 2014 01:49:27 +0000 (09:49 +0800)]
Merge pull request #384 from wernsaar/develop

Blas extensions

wernsaar [Tue, 10 Jun 2014 14:14:34 +0000 (16:14 +0200)]
Ref #51: added blas extensions simatcopy, dimatcopy, cimatcopy, zimatcopy

Zhang Xianyi [Tue, 10 Jun 2014 13:55:19 +0000 (21:55 +0800)]
OpenBLAS 0.2.9 Version.

wernsaar [Tue, 10 Jun 2014 08:34:54 +0000 (10:34 +0200)]
Ref #51: added blas extensions zomatcopy and comatcopy

wernsaar [Mon, 9 Jun 2014 18:21:13 +0000 (20:21 +0200)]
Ref #51: added blas extension somatcopy

wernsaar [Mon, 9 Jun 2014 15:11:07 +0000 (17:11 +0200)]
Ref #51: added blas extension domatcopy as non-optimized reference

wernsaar [Sun, 8 Jun 2014 11:49:19 +0000 (13:49 +0200)]
Ref #375: added workaround for small sizes to scal.c and zscal.c

wernsaar [Sun, 8 Jun 2014 09:54:24 +0000 (11:54 +0200)]
Ref #285: added axpby kernels

Zhang Xianyi [Fri, 6 Jun 2014 08:13:08 +0000 (16:13 +0800)]
Fixed generating DLL bug.

Zhang Xianyi [Thu, 5 Jun 2014 09:01:12 +0000 (17:01 +0800)]
Fixed #374.

Merge branch 'TimothyGu-develop' into develop

Zhang Xianyi [Mon, 26 May 2014 09:46:06 +0000 (04:46 -0500)]
Merge pull request #376 from wernsaar/develop

Merged some Lapack optimized functions
https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List

wernsaar [Sun, 25 May 2014 07:15:22 +0000 (09:15 +0200)]
fixed function profile in zpotri.c

wernsaar [Sat, 24 May 2014 13:53:25 +0000 (15:53 +0200)]
added lapack and lapacke timing libs by default

wernsaar [Fri, 23 May 2014 15:26:50 +0000 (17:26 +0200)]
changed threshold value for sep.in from 50.0 to 60.0

wernsaar [Fri, 23 May 2014 10:14:30 +0000 (12:14 +0200)]
enabled and tested optimized potri lapack functions

wernsaar [Fri, 23 May 2014 08:55:39 +0000 (10:55 +0200)]
enabled and tested optimized trtri lapack functions

Timothy Gu [Fri, 23 May 2014 01:10:33 +0000 (18:10 -0700)]
Random "walk (a)round" --> "work-around" typo fixes

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Timothy Gu [Fri, 23 May 2014 01:05:19 +0000 (18:05 -0700)]
Add NO_STATIC variable which disables static lib installation

Static library is still built for shared lib generation.

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Timothy Gu [Sat, 17 May 2014 23:21:21 +0000 (16:21 -0700)]
Build import library for mingw

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

wernsaar [Wed, 21 May 2014 12:21:19 +0000 (14:21 +0200)]
removed lapack/getri because it was never used

wernsaar [Wed, 21 May 2014 09:02:07 +0000 (11:02 +0200)]
enabled optimized trti2 lapack functions again

wernsaar [Wed, 21 May 2014 08:35:28 +0000 (10:35 +0200)]
enabled optimized complex lauum lapack functions again

wernsaar [Wed, 21 May 2014 07:49:18 +0000 (09:49 +0200)]
enabled lauu2 and lauum lapack functions again

Zhang Xianyi [Wed, 21 May 2014 03:25:11 +0000 (11:25 +0800)]
Refs #372. Fixed a lot of bugs about LAPACK testing.
As a workaround, we rolled back some kernels.

Please check https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List

Merge branch 'wernsaar-develop' into develop

Zhang Xianyi [Wed, 21 May 2014 03:24:39 +0000 (11:24 +0800)]
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

Conflicts:
kernel/arm/KERNEL.ARMV6

wernsaar [Mon, 19 May 2014 13:57:18 +0000 (15:57 +0200)]
removed debug flag from Makefile.rule

wernsaar [Mon, 19 May 2014 12:44:53 +0000 (14:44 +0200)]
enabled and tested optimized gesv lapack functions

wernsaar [Mon, 19 May 2014 11:50:02 +0000 (13:50 +0200)]
marked trti2.c and ztrti2.c as bad

wernsaar [Mon, 19 May 2014 11:35:32 +0000 (13:35 +0200)]
enabled and tested optimized laswp lapack function

wernsaar [Mon, 19 May 2014 10:53:22 +0000 (12:53 +0200)]
marked zlauu2.c and zlauum.c as bad

wernsaar [Mon, 19 May 2014 10:42:52 +0000 (12:42 +0200)]
marked trtri.c and ztrtri.c as bad

wernsaar [Mon, 19 May 2014 10:29:29 +0000 (12:29 +0200)]
moved trtri.c and ztrtri.c to the directory lapack

wernsaar [Mon, 19 May 2014 10:00:16 +0000 (12:00 +0200)]
marked lauu2.c and lauum.c as bad

wernsaar [Mon, 19 May 2014 09:23:17 +0000 (11:23 +0200)]
marked larf.c as obsolete

Zhang Xianyi [Mon, 19 May 2014 02:37:20 +0000 (10:37 +0800)]
Merge branch 'TimothyGu-develop' into develop

Timothy Gu [Mon, 19 May 2014 02:09:26 +0000 (19:09 -0700)]
Remove code for downloading lapack tarball and the patches themselves

They are not used anymore since 3eb5af1.

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Timothy Gu [Mon, 19 May 2014 01:54:38 +0000 (18:54 -0700)]
Remove unused dll2 target

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

wernsaar [Sun, 18 May 2014 21:41:13 +0000 (23:41 +0200)]
marked potri functions as bad because of a lot of errors

wernsaar [Sun, 18 May 2014 20:41:43 +0000 (22:41 +0200)]
enabled and tested optimized potf2 lapack functions

wernsaar [Sun, 18 May 2014 20:21:16 +0000 (22:21 +0200)]
enabled and tested optimized getf2 lapack functions

wernsaar [Sun, 18 May 2014 19:42:37 +0000 (21:42 +0200)]
enabled and tested optimized potrf lapack functions

wernsaar [Sun, 18 May 2014 19:13:56 +0000 (21:13 +0200)]
enabled and tested optimized getrs lapack functions

wernsaar [Sun, 18 May 2014 18:32:27 +0000 (20:32 +0200)]
enabled and tested optimized cgetrf lapack function

wernsaar [Sun, 18 May 2014 18:01:23 +0000 (20:01 +0200)]
enabled and tested optimized sgetrf lapack function

wernsaar [Sun, 18 May 2014 17:36:32 +0000 (19:36 +0200)]
enabled and tested optimized zgetrf lapack function

wernsaar [Sun, 18 May 2014 17:07:51 +0000 (19:07 +0200)]
enabled and tested optimized dgetrf function

wernsaar [Sun, 18 May 2014 12:09:22 +0000 (14:09 +0200)]
added optimized lapack files from OpenBLAS

Timothy Gu [Sat, 17 May 2014 23:02:36 +0000 (16:02 -0700)]
Remove routines for generating exports/symbol.S

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

Timothy Gu [Sat, 17 May 2014 23:01:30 +0000 (16:01 -0700)]
Remove routines for making exports/linux.def

Signed-off-by: Timothy Gu <timothygu99@gmail.com>

wernsaar [Sat, 17 May 2014 11:00:36 +0000 (13:00 +0200)]
bugfix for ARMV6

wernsaar [Sat, 17 May 2014 09:18:26 +0000 (11:18 +0200)]
enable debug for lapack testing

wernsaar [Fri, 16 May 2014 18:37:41 +0000 (20:37 +0200)]
some modifications regarding lapack test

wernsaar [Fri, 16 May 2014 18:34:48 +0000 (20:34 +0200)]
changed threshold to 50.0

wernsaar [Fri, 16 May 2014 12:36:24 +0000 (14:36 +0200)]
changed default optimization flag from O3 to O2 for ARM

wernsaar [Fri, 16 May 2014 12:32:10 +0000 (14:32 +0200)]
changed threshold from 50.0 to 54.0 in svd.in

wernsaar [Thu, 15 May 2014 09:37:38 +0000 (11:37 +0200)]
changed YIELDING for BULLDOZER

wernsaar [Wed, 14 May 2014 13:16:21 +0000 (15:16 +0200)]
Modified lapack-test, using lapack_testing.py to run tests

wernsaar [Wed, 14 May 2014 13:01:03 +0000 (15:01 +0200)]
added FCOMMON_OPT for lapack

wernsaar [Wed, 14 May 2014 11:08:05 +0000 (13:08 +0200)]
changed label lapack-test