platform/upstream/openblas.git
Tony Kelman [Sat, 25 Oct 2014 05:27:00 +0000 (22:27 -0700)]
add SYMBOLPREFIX and SYMBOLSUFFIX makefile options

These options add a prefix or suffix to all exported symbol names in the shared
library. This is useful for avoiding conflicts with other BLAS libraries,
especially when using the 64-bit integer interface in OpenBLAS.

Note that since OSX does not have the objcopy utility, setting these options
to non-empty values on Mac requires the objconv tool, available (GPL license)
from http://www.agner.org/optimize/#objconv
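
As a rough illustration of what a prefix or suffix means for calling code, the sketch below builds a decorated symbol name with the C preprocessor. The macro names (`BLAS_SYMBOL_PREFIX`, `BLAS_SYMBOL_SUFFIX`, `BLAS_NAME`) and the suffix value `64_` are assumptions made for this example; they are not part of OpenBLAS, and a real build would have to use whatever values were given to SYMBOLPREFIX and SYMBOLSUFFIX.

```c
/* Hypothetical settings, standing in for a library built with an empty
   SYMBOLPREFIX and SYMBOLSUFFIX set to 64_ (both values are assumptions). */
#ifndef BLAS_SYMBOL_PREFIX
#define BLAS_SYMBOL_PREFIX
#endif
#ifndef BLAS_SYMBOL_SUFFIX
#define BLAS_SYMBOL_SUFFIX 64_
#endif

/* Two-level macros so the prefix and suffix are expanded before pasting. */
#define BLAS_PASTE3_(a, b, c) a##b##c
#define BLAS_PASTE3(a, b, c) BLAS_PASTE3_(a, b, c)
#define BLAS_NAME(name) BLAS_PASTE3(BLAS_SYMBOL_PREFIX, name, BLAS_SYMBOL_SUFFIX)

/* Stringify after expansion, only to make the resulting name visible. */
#define BLAS_STR_(x) #x
#define BLAS_STR(x) BLAS_STR_(x)

/* Returns the decorated name a caller would link against for dgemm_. */
static const char *decorated_dgemm_name(void)
{
    return BLAS_STR(BLAS_NAME(dgemm_));
}
```

With these assumed settings, `BLAS_NAME(dgemm_)` expands to `dgemm_64_`, so code written against `BLAS_NAME(...)` would not clash with an undecorated `dgemm_` from another BLAS library loaded into the same process.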

Zhang Xianyi [Mon, 13 Oct 2014 09:10:12 +0000 (17:10 +0800)]
Update dot to 0.2.12 version.

wernsaar [Tue, 23 Sep 2014 09:34:29 +0000 (11:34 +0200)]
Ref #454: fixed bug in common_param.h

Zhang Xianyi [Mon, 22 Sep 2014 08:47:54 +0000 (16:47 +0800)]
Merge pull request #453 from wernsaar/develop

Enabled GEMM3M functions

wernsaar [Sun, 21 Sep 2014 11:39:15 +0000 (13:39 +0200)]
updated cblas.h and cblas_noconst.h

wernsaar [Sun, 21 Sep 2014 10:00:41 +0000 (12:00 +0200)]
added benchmark for gemm3m functions

wernsaar [Sun, 21 Sep 2014 09:41:43 +0000 (11:41 +0200)]
bugfix for GEMM3M functions

wernsaar [Sun, 21 Sep 2014 08:55:08 +0000 (10:55 +0200)]
added GEMM3M tests

wernsaar [Sat, 20 Sep 2014 15:20:02 +0000 (17:20 +0200)]
enabled cblas gemm3m functions

wernsaar [Sat, 20 Sep 2014 13:27:40 +0000 (15:27 +0200)]
disabled SYMM3M and HEMM3M functions because of segmentation violations

wernsaar [Sat, 20 Sep 2014 12:53:30 +0000 (14:53 +0200)]
added test for CGEMM3M function

wernsaar [Sat, 20 Sep 2014 12:27:10 +0000 (14:27 +0200)]
enabled use of GEMM3M functions

wernsaar [Sat, 20 Sep 2014 12:21:42 +0000 (14:21 +0200)]
added test for GEMM3M functions

wernsaar [Wed, 17 Sep 2014 14:01:07 +0000 (16:01 +0200)]
updated README.md

Zhang Xianyi [Wed, 17 Sep 2014 06:29:21 +0000 (14:29 +0800)]
Update the doc for target list.

Zhang Xianyi [Wed, 17 Sep 2014 06:20:06 +0000 (14:20 +0800)]
Merge pull request #451 from eshelman/patch-1

Add HASWELL to TargetList.txt

Eliot Eshelman [Tue, 16 Sep 2014 22:26:45 +0000 (18:26 -0400)]
Add HASWELL to TargetList.txt

The Intel "Haswell" architecture is missing from the list of build targets.

Zhang Xianyi [Tue, 16 Sep 2014 06:33:48 +0000 (14:33 +0800)]
Merge pull request #449 from wernsaar/develop

optimized multithreading lower limits

wernsaar [Mon, 15 Sep 2014 09:38:25 +0000 (11:38 +0200)]
optimized multithreading lower limits

Zhang Xianyi [Mon, 15 Sep 2014 05:12:14 +0000 (13:12 +0800)]
Merge pull request #448 from wernsaar/develop

Optimized cgemv and zgemv kernels

wernsaar [Sun, 14 Sep 2014 09:00:53 +0000 (11:00 +0200)]
removed obsolete gemv kernel files

wernsaar [Sun, 14 Sep 2014 08:21:22 +0000 (10:21 +0200)]
optimized zgemv_n_microk_sandy-4.c

wernsaar [Sun, 14 Sep 2014 07:02:05 +0000 (09:02 +0200)]
added optimized zgemv_n kernel for sandybridge

wernsaar [Sat, 13 Sep 2014 14:26:53 +0000 (16:26 +0200)]
bugfix in KERNEL.PILEDRIVER

wernsaar [Sat, 13 Sep 2014 14:13:27 +0000 (16:13 +0200)]
optimized cgemv_t kernel for haswell

wernsaar [Sat, 13 Sep 2014 13:14:12 +0000 (15:14 +0200)]
added optimized cgemv_t kernel for haswell

wernsaar [Sat, 13 Sep 2014 10:23:27 +0000 (12:23 +0200)]
updated KERNEL.HASWELL

wernsaar [Sat, 13 Sep 2014 07:48:34 +0000 (09:48 +0200)]
updated zgemv_t_4.c

wernsaar [Sat, 13 Sep 2014 07:47:07 +0000 (09:47 +0200)]
added optimized zgemv_t kernel for haswell

wernsaar [Fri, 12 Sep 2014 17:18:23 +0000 (19:18 +0200)]
optimized interface/zgemv.c for multithreading

wernsaar [Fri, 12 Sep 2014 15:43:47 +0000 (17:43 +0200)]
enabled optimized zgemv_t kernel for bulldozer

wernsaar [Fri, 12 Sep 2014 15:42:25 +0000 (17:42 +0200)]
optimized zgemv_t for bulldozer

wernsaar [Fri, 12 Sep 2014 15:04:22 +0000 (17:04 +0200)]
added optimized zgemv_t kernel for bulldozer

wernsaar [Fri, 12 Sep 2014 12:12:24 +0000 (14:12 +0200)]
bugfix in cgemv_t_4.c

wernsaar [Fri, 12 Sep 2014 11:38:01 +0000 (13:38 +0200)]
added optimized cgemv_t kernel

wernsaar [Fri, 12 Sep 2014 10:35:20 +0000 (12:35 +0200)]
added optimized zgemv_t routine

wernsaar [Thu, 11 Sep 2014 11:44:55 +0000 (13:44 +0200)]
optimized zgemv_n_microk_haswell-4.c for small size

wernsaar [Thu, 11 Sep 2014 11:18:00 +0000 (13:18 +0200)]
bugfix in zgemv_n_4.c

wernsaar [Thu, 11 Sep 2014 10:34:57 +0000 (12:34 +0200)]
added optimized zgemv_n kernel

wernsaar [Thu, 11 Sep 2014 09:12:44 +0000 (11:12 +0200)]
bugfix in cgemv_n_microk_haswell-4.c

wernsaar [Thu, 11 Sep 2014 08:25:48 +0000 (10:25 +0200)]
more optimizations

wernsaar [Wed, 10 Sep 2014 17:26:14 +0000 (19:26 +0200)]
optimized cgemv_n_4.c

wernsaar [Wed, 10 Sep 2014 12:11:24 +0000 (14:11 +0200)]
added optimized cgemv_kernel for haswell

wernsaar [Wed, 10 Sep 2014 11:45:13 +0000 (13:45 +0200)]
added cgemv_n kernel, optimized for small sizes

Zhang Xianyi [Wed, 10 Sep 2014 08:31:31 +0000 (16:31 +0800)]
Merge pull request #446 from grisuthedragon/cblas_matcopy

Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines.

Zhang Xianyi [Wed, 10 Sep 2014 08:28:14 +0000 (16:28 +0800)]
Merge pull request #445 from wernsaar/develop

A lot of optimizations for gemv kernels

wernsaar [Tue, 9 Sep 2014 14:17:45 +0000 (16:17 +0200)]
added and tested optimized dgemv_n kernel for haswell

wernsaar [Tue, 9 Sep 2014 13:32:32 +0000 (15:32 +0200)]
added optimized dgemv_n kernel for haswell

wernsaar [Tue, 9 Sep 2014 12:38:08 +0000 (14:38 +0200)]
optimized dgemv_t kernel for haswell

wernsaar [Tue, 9 Sep 2014 12:04:44 +0000 (14:04 +0200)]
bugfix in KERNEL.HASWELL

wernsaar [Tue, 9 Sep 2014 11:54:55 +0000 (13:54 +0200)]
added optimized gemv kernels

wernsaar [Tue, 9 Sep 2014 11:34:22 +0000 (13:34 +0200)]
added optimized dgemv_t kernel for haswell

Martin Koehler [Tue, 9 Sep 2014 07:52:13 +0000 (09:52 +0200)]
add CBLAS interface for s/d/c/zimatcopy

wernsaar [Mon, 8 Sep 2014 17:15:31 +0000 (19:15 +0200)]
removed obsolete files

Martin Köhler [Mon, 8 Sep 2014 15:57:44 +0000 (17:57 +0200)]
Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them.

wernsaar [Mon, 8 Sep 2014 13:22:35 +0000 (15:22 +0200)]
optimized dgemv_n kernel for small sizes

wernsaar [Mon, 8 Sep 2014 10:27:32 +0000 (12:27 +0200)]
modified multithreading threshold

wernsaar [Mon, 8 Sep 2014 10:25:16 +0000 (12:25 +0200)]
added haswell optimized kernel

wernsaar [Mon, 8 Sep 2014 08:54:33 +0000 (10:54 +0200)]
bugfix in sgemv_n_microk_haswell-4.c

wernsaar [Mon, 8 Sep 2014 08:13:39 +0000 (10:13 +0200)]
added optimized sgemv_t kernel for haswell

wernsaar [Sun, 7 Sep 2014 19:48:42 +0000 (21:48 +0200)]
bugfix for windows

wernsaar [Sun, 7 Sep 2014 19:13:57 +0000 (21:13 +0200)]
enabled optimized sgemv kernels for piledriver

wernsaar [Sun, 7 Sep 2014 18:53:30 +0000 (20:53 +0200)]
optimized sgemv_n kernel for sandybridge

wernsaar [Sun, 7 Sep 2014 17:20:08 +0000 (19:20 +0200)]
optimized sgemv_n kernel for nehalem

wernsaar [Sun, 7 Sep 2014 16:23:48 +0000 (18:23 +0200)]
optimized sgemv_n for very small size of m

wernsaar [Sun, 7 Sep 2014 11:45:03 +0000 (13:45 +0200)]
optimizations for very small sizes

wernsaar [Sat, 6 Sep 2014 19:28:57 +0000 (21:28 +0200)]
better optimizations for sgemv_t kernel

wernsaar [Sat, 6 Sep 2014 17:41:57 +0000 (19:41 +0200)]
optimized sgemv_t_4 kernel for very small sizes

wernsaar [Sat, 6 Sep 2014 16:34:25 +0000 (18:34 +0200)]
optimized sgemv_t

wernsaar [Sat, 6 Sep 2014 11:17:56 +0000 (13:17 +0200)]
optimization for small size

wernsaar [Sat, 6 Sep 2014 10:08:48 +0000 (12:08 +0200)]
added optimized sgemv_n kernel for haswell

wernsaar [Sat, 6 Sep 2014 09:01:42 +0000 (11:01 +0200)]
undef WHEREAMI

wernsaar [Sat, 6 Sep 2014 06:41:53 +0000 (08:41 +0200)]
added optimized sgemv_n kernel for sandybridge

wernsaar [Fri, 5 Sep 2014 13:05:53 +0000 (15:05 +0200)]
experimentally removed expensive function calls

wernsaar [Fri, 5 Sep 2014 08:22:50 +0000 (10:22 +0200)]
optimized sgemv_t for sandybridge

wernsaar [Thu, 4 Sep 2014 16:55:52 +0000 (18:55 +0200)]
bugfix for sgemv_n_4.c

wernsaar [Thu, 4 Sep 2014 11:09:27 +0000 (13:09 +0200)]
optimized sgemv_n kernel for small sizes

wernsaar [Wed, 3 Sep 2014 13:34:30 +0000 (15:34 +0200)]
optimized sgemv_n_4.c

wernsaar [Wed, 3 Sep 2014 12:48:45 +0000 (14:48 +0200)]
optimized sgemv_n for small sizes

wernsaar [Wed, 3 Sep 2014 08:13:47 +0000 (10:13 +0200)]
bugfix for buffer overflow

wernsaar [Tue, 2 Sep 2014 15:36:07 +0000 (17:36 +0200)]
optimized interface/gemv.c for multithreading

wernsaar [Tue, 2 Sep 2014 14:30:04 +0000 (16:30 +0200)]
updated interface/gemv.c for multithreading

wernsaar [Tue, 2 Sep 2014 12:11:42 +0000 (14:11 +0200)]
added plot-header to compare multithreading

wernsaar [Tue, 2 Sep 2014 11:35:41 +0000 (13:35 +0200)]
removed obsolete instructions from sgemv_t_4.c

wernsaar [Tue, 2 Sep 2014 10:42:36 +0000 (12:42 +0200)]
optimized sgemv_t for bulldozer

wernsaar [Mon, 1 Sep 2014 13:11:37 +0000 (15:11 +0200)]
optimized sgemv_t_4.c for small sizes

wernsaar [Mon, 1 Sep 2014 13:07:36 +0000 (15:07 +0200)]
extended gemv.c benchmark

wernsaar [Sun, 31 Aug 2014 13:38:18 +0000 (15:38 +0200)]
modified benchmark/gemv.c

wernsaar [Sun, 31 Aug 2014 12:33:15 +0000 (14:33 +0200)]
optimized sgemv_t_4.c for uneven sizes

wernsaar [Sun, 31 Aug 2014 11:23:44 +0000 (13:23 +0200)]
optimized sgemv_t_4.c for small size

wernsaar [Sat, 30 Aug 2014 11:58:02 +0000 (13:58 +0200)]
changed 1 test value (bug in lapack-testing?)

wernsaar [Sat, 30 Aug 2014 11:36:27 +0000 (13:36 +0200)]
optimized sgemv_t kernel for small sizes

Zhang Xianyi [Fri, 29 Aug 2014 05:31:06 +0000 (13:31 +0800)]
Merge pull request #443 from idunham/fix

Workaround PIC limitations in cpuid.

Isaac Dunham [Thu, 28 Aug 2014 20:05:07 +0000 (13:05 -0700)]
Workaround PIC limitations in cpuid.

cpuid uses register ebx, but ebx is reserved in PIC.
So save ebx, swap ebx & edi, and return edi.

Copied from Igor Pavlov's equivalent fix for 7zip (in CpuArch.c),
which is public domain and thus OK license-wise.
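
The save-and-swap sequence described above can be sketched roughly as follows. This is a minimal illustration assuming GCC-style inline assembly; the function name `cpuid_pic_safe` and the `regs` array layout are hypothetical and are not taken from the OpenBLAS or 7zip sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the OpenBLAS function): runs CPUID for `leaf`
   with subleaf 0 and stores EAX, EBX, ECX, EDX in regs[0..3].
   Returns 1 on x86/x86-64 with GCC-style inline asm, 0 elsewhere. */
static int cpuid_pic_safe(uint32_t leaf, uint32_t regs[4])
{
    assert(regs != NULL);
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ __volatile__(
        "mov %%rbx, %%rdi\n\t"   /* save the caller's RBX in RDI */
        "cpuid\n\t"              /* CPUID overwrites EAX, EBX, ECX, EDX */
        "xchg %%rbx, %%rdi\n\t"  /* restore RBX; CPUID's EBX ends up in EDI */
        : "=a"(regs[0]), "=D"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
        : "0"(leaf), "2"(0u));
    return 1;
#elif defined(__GNUC__) && defined(__i386__)
    __asm__ __volatile__(
        "mov %%ebx, %%edi\n\t"   /* save EBX, which PIC code reserves */
        "cpuid\n\t"
        "xchg %%ebx, %%edi\n\t"  /* restore EBX; return its CPUID value in EDI */
        : "=a"(regs[0]), "=D"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
        : "0"(leaf), "2"(0u));
    return 1;
#else
    /* No CPUID on this target: report failure with zeroed registers. */
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}
```

Because the EBX result is handed back through EDI, the compiler never sees EBX as an output or clobber, which is what lets the code build as position-independent code on 32-bit x86. Calling `cpuid_pic_safe(0, regs)` on an x86 machine should leave the highest supported leaf in `regs[0]` and the vendor string split across `regs[1]`, `regs[3]`, and `regs[2]`.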

Zhang Xianyi [Thu, 28 Aug 2014 04:43:54 +0000 (12:43 +0800)]
Merge pull request #440 from wernsaar/develop

optimizations for level1 and level2 BLAS functions

wernsaar [Wed, 27 Aug 2014 07:00:20 +0000 (09:00 +0200)]
modification for clang compiler

wernsaar [Tue, 26 Aug 2014 16:29:40 +0000 (18:29 +0200)]
removed flag no-integrated-as, because not working on macosx

wernsaar [Tue, 26 Aug 2014 15:36:32 +0000 (17:36 +0200)]
EXPERIMENTAL: added the flag -no-integrated-as for clang compiler in Makefile.system

Zhang Xianyi [Tue, 26 Aug 2014 08:14:34 +0000 (16:14 +0800)]
Fixed the typo in Changelog.txt

wernsaar [Mon, 25 Aug 2014 13:52:35 +0000 (15:52 +0200)]
added optimized zaxpy bulldozer kernel