platform/upstream/openblas.git
Zhang Xianyi [Mon, 22 Sep 2014 08:47:54 +0000 (16:47 +0800)]
Merge pull request #453 from wernsaar/develop

Enabled GEMM3M functions

wernsaar [Sun, 21 Sep 2014 11:39:15 +0000 (13:39 +0200)]
updated cblas.h and cblas_noconst.h

wernsaar [Sun, 21 Sep 2014 10:00:41 +0000 (12:00 +0200)]
added benchmark for gemm3m functions

wernsaar [Sun, 21 Sep 2014 09:41:43 +0000 (11:41 +0200)]
bugfix for GEMM3M functions

wernsaar [Sun, 21 Sep 2014 08:55:08 +0000 (10:55 +0200)]
added GEMM3M tests

wernsaar [Sat, 20 Sep 2014 15:20:02 +0000 (17:20 +0200)]
enabled cblas gemm3m functions

wernsaar [Sat, 20 Sep 2014 13:27:40 +0000 (15:27 +0200)]
disabled SYMM3M and HEMM3M functions because of segmentation violations

wernsaar [Sat, 20 Sep 2014 12:53:30 +0000 (14:53 +0200)]
added test for CGEMM3M function

wernsaar [Sat, 20 Sep 2014 12:27:10 +0000 (14:27 +0200)]
enabled use of GEMM3M functions

wernsaar [Sat, 20 Sep 2014 12:21:42 +0000 (14:21 +0200)]
added test for GEMM3M functions

wernsaar [Wed, 17 Sep 2014 14:01:07 +0000 (16:01 +0200)]
updated README.md

Zhang Xianyi [Wed, 17 Sep 2014 06:29:21 +0000 (14:29 +0800)]
Update the doc for target list.

Zhang Xianyi [Wed, 17 Sep 2014 06:20:06 +0000 (14:20 +0800)]
Merge pull request #451 from eshelman/patch-1

Add HASWELL to TargetList.txt

Eliot Eshelman [Tue, 16 Sep 2014 22:26:45 +0000 (18:26 -0400)]
Add HASWELL to TargetList.txt

The Intel "Haswell" architecture is missing from the list of build targets.

Zhang Xianyi [Tue, 16 Sep 2014 06:33:48 +0000 (14:33 +0800)]
Merge pull request #449 from wernsaar/develop

optimized multithreading lower limits

wernsaar [Mon, 15 Sep 2014 09:38:25 +0000 (11:38 +0200)]
optimized multithreading lower limits

Zhang Xianyi [Mon, 15 Sep 2014 05:12:14 +0000 (13:12 +0800)]
Merge pull request #448 from wernsaar/develop

Optimized cgemv and zgemv kernels

wernsaar [Sun, 14 Sep 2014 09:00:53 +0000 (11:00 +0200)]
removed obsolete gemv kernel files

wernsaar [Sun, 14 Sep 2014 08:21:22 +0000 (10:21 +0200)]
optimized zgemv_n_microk_sandy-4.c

wernsaar [Sun, 14 Sep 2014 07:02:05 +0000 (09:02 +0200)]
added optimized zgemv_n kernel for sandybridge

wernsaar [Sat, 13 Sep 2014 14:26:53 +0000 (16:26 +0200)]
bugfix in KERNEL.PILEDRIVER

wernsaar [Sat, 13 Sep 2014 14:13:27 +0000 (16:13 +0200)]
optimized cgemv_t kernel for haswell

wernsaar [Sat, 13 Sep 2014 13:14:12 +0000 (15:14 +0200)]
added optimized cgemv_t kernel for haswell

wernsaar [Sat, 13 Sep 2014 10:23:27 +0000 (12:23 +0200)]
updated KERNEL.HASWELL

wernsaar [Sat, 13 Sep 2014 07:48:34 +0000 (09:48 +0200)]
updated zgemv_t_4.c

wernsaar [Sat, 13 Sep 2014 07:47:07 +0000 (09:47 +0200)]
added optimized zgemv_t kernel for haswell

wernsaar [Fri, 12 Sep 2014 17:18:23 +0000 (19:18 +0200)]
optimized interface/zgemv.c for multithreading

wernsaar [Fri, 12 Sep 2014 15:43:47 +0000 (17:43 +0200)]
enabled optimized zgemv_t kernel for bulldozer

wernsaar [Fri, 12 Sep 2014 15:42:25 +0000 (17:42 +0200)]
optimized zgemv_t for bulldozer

wernsaar [Fri, 12 Sep 2014 15:04:22 +0000 (17:04 +0200)]
added optimized zgemv_t kernel for bulldozer

wernsaar [Fri, 12 Sep 2014 12:12:24 +0000 (14:12 +0200)]
bugfix in cgemv_t_4.c

wernsaar [Fri, 12 Sep 2014 11:38:01 +0000 (13:38 +0200)]
added optimized cgemv_t kernel

wernsaar [Fri, 12 Sep 2014 10:35:20 +0000 (12:35 +0200)]
added optimized zgemv_t routine

wernsaar [Thu, 11 Sep 2014 11:44:55 +0000 (13:44 +0200)]
optimized zgemv_n_microk_haswell-4.c for small size

wernsaar [Thu, 11 Sep 2014 11:18:00 +0000 (13:18 +0200)]
bugfix in zgemv_n_4.c

wernsaar [Thu, 11 Sep 2014 10:34:57 +0000 (12:34 +0200)]
added optimized zgemv_n kernel

wernsaar [Thu, 11 Sep 2014 09:12:44 +0000 (11:12 +0200)]
bugfix in cgemv_n_microk_haswell-4.c

wernsaar [Thu, 11 Sep 2014 08:25:48 +0000 (10:25 +0200)]
more optimizations

wernsaar [Wed, 10 Sep 2014 17:26:14 +0000 (19:26 +0200)]
optimized cgemv_n_4.c

wernsaar [Wed, 10 Sep 2014 12:11:24 +0000 (14:11 +0200)]
added optimized cgemv_kernel for haswell

wernsaar [Wed, 10 Sep 2014 11:45:13 +0000 (13:45 +0200)]
added cgemv_n kernel, optimized for small sizes

Zhang Xianyi [Wed, 10 Sep 2014 08:31:31 +0000 (16:31 +0800)]
Merge pull request #446 from grisuthedragon/cblas_matcopy

Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines.

Zhang Xianyi [Wed, 10 Sep 2014 08:28:14 +0000 (16:28 +0800)]
Merge pull request #445 from wernsaar/develop

A lot of optimizations for gemv kernels

wernsaar [Tue, 9 Sep 2014 14:17:45 +0000 (16:17 +0200)]
added and tested optimized dgemv_n kernel for haswell

wernsaar [Tue, 9 Sep 2014 13:32:32 +0000 (15:32 +0200)]
added optimized dgemv_n kernel for haswell

wernsaar [Tue, 9 Sep 2014 12:38:08 +0000 (14:38 +0200)]
optimized dgemv_t kernel for haswell

wernsaar [Tue, 9 Sep 2014 12:04:44 +0000 (14:04 +0200)]
bugfix in KERNEL.HASWELL

wernsaar [Tue, 9 Sep 2014 11:54:55 +0000 (13:54 +0200)]
added optimized gemv kernels

wernsaar [Tue, 9 Sep 2014 11:34:22 +0000 (13:34 +0200)]
added optimized dgemv_t kernel for haswell

Martin Koehler [Tue, 9 Sep 2014 07:52:13 +0000 (09:52 +0200)]
add CBLAS interface for s/d/c/zimatcopy

wernsaar [Mon, 8 Sep 2014 17:15:31 +0000 (19:15 +0200)]
removed obsolete files

Martin Köhler [Mon, 8 Sep 2014 15:57:44 +0000 (17:57 +0200)]
Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them.

wernsaar [Mon, 8 Sep 2014 13:22:35 +0000 (15:22 +0200)]
optimized dgemv_n kernel for small sizes

wernsaar [Mon, 8 Sep 2014 10:27:32 +0000 (12:27 +0200)]
modified multithreading threshold

wernsaar [Mon, 8 Sep 2014 10:25:16 +0000 (12:25 +0200)]
added haswell optimized kernel

wernsaar [Mon, 8 Sep 2014 08:54:33 +0000 (10:54 +0200)]
bugfix in sgemv_n_microk_haswell-4.c

wernsaar [Mon, 8 Sep 2014 08:13:39 +0000 (10:13 +0200)]
added optimized sgemv_t kernel for haswell

wernsaar [Sun, 7 Sep 2014 19:48:42 +0000 (21:48 +0200)]
bugfix for windows

wernsaar [Sun, 7 Sep 2014 19:13:57 +0000 (21:13 +0200)]
enabled optimized sgemv kernels for piledriver

wernsaar [Sun, 7 Sep 2014 18:53:30 +0000 (20:53 +0200)]
optimized sgemv_n kernel for sandybridge

wernsaar [Sun, 7 Sep 2014 17:20:08 +0000 (19:20 +0200)]
optimized sgemv_n kernel for nehalem

wernsaar [Sun, 7 Sep 2014 16:23:48 +0000 (18:23 +0200)]
optimized sgemv_n for very small size of m

wernsaar [Sun, 7 Sep 2014 11:45:03 +0000 (13:45 +0200)]
optimizations for very small sizes

wernsaar [Sat, 6 Sep 2014 19:28:57 +0000 (21:28 +0200)]
better optimizations for sgemv_t kernel

wernsaar [Sat, 6 Sep 2014 17:41:57 +0000 (19:41 +0200)]
optimized sgemv_t_4 kernel for very small sizes

wernsaar [Sat, 6 Sep 2014 16:34:25 +0000 (18:34 +0200)]
optimized sgemv_t

wernsaar [Sat, 6 Sep 2014 11:17:56 +0000 (13:17 +0200)]
optimization for small size

wernsaar [Sat, 6 Sep 2014 10:08:48 +0000 (12:08 +0200)]
added optimized sgemv_n kernel for haswell

wernsaar [Sat, 6 Sep 2014 09:01:42 +0000 (11:01 +0200)]
undef WHEREAMI

wernsaar [Sat, 6 Sep 2014 06:41:53 +0000 (08:41 +0200)]
added optimized sgemv_n kernel for sandybridge

wernsaar [Fri, 5 Sep 2014 13:05:53 +0000 (15:05 +0200)]
experimentally removed expensive function calls

wernsaar [Fri, 5 Sep 2014 08:22:50 +0000 (10:22 +0200)]
optimized sgemv_t for sandybridge

wernsaar [Thu, 4 Sep 2014 16:55:52 +0000 (18:55 +0200)]
bugfix for sgemv_n_4.c

wernsaar [Thu, 4 Sep 2014 11:09:27 +0000 (13:09 +0200)]
optimized sgemv_n kernel for small sizes

wernsaar [Wed, 3 Sep 2014 13:34:30 +0000 (15:34 +0200)]
optimized sgemv_n_4.c

wernsaar [Wed, 3 Sep 2014 12:48:45 +0000 (14:48 +0200)]
optimized sgemv_n for small sizes

wernsaar [Wed, 3 Sep 2014 08:13:47 +0000 (10:13 +0200)]
bugfix for buffer overflow

wernsaar [Tue, 2 Sep 2014 15:36:07 +0000 (17:36 +0200)]
optimized interface/gemv.c for multithreading

wernsaar [Tue, 2 Sep 2014 14:30:04 +0000 (16:30 +0200)]
updated interface/gemv.c for multithreading

wernsaar [Tue, 2 Sep 2014 12:11:42 +0000 (14:11 +0200)]
added plot-header to compare multithreading

wernsaar [Tue, 2 Sep 2014 11:35:41 +0000 (13:35 +0200)]
removed obsolete instructions from sgemv_t_4.c

wernsaar [Tue, 2 Sep 2014 10:42:36 +0000 (12:42 +0200)]
optimized sgemv_t for bulldozer

wernsaar [Mon, 1 Sep 2014 13:11:37 +0000 (15:11 +0200)]
optimized sgemv_t_4.c for small sizes

wernsaar [Mon, 1 Sep 2014 13:07:36 +0000 (15:07 +0200)]
extended gemv.c benchmark

wernsaar [Sun, 31 Aug 2014 13:38:18 +0000 (15:38 +0200)]
modified benchmark/gemv.c

wernsaar [Sun, 31 Aug 2014 12:33:15 +0000 (14:33 +0200)]
optimized sgemv_t_4.c for uneven sizes

wernsaar [Sun, 31 Aug 2014 11:23:44 +0000 (13:23 +0200)]
optimized sgemv_t_4.c for small size

wernsaar [Sat, 30 Aug 2014 11:58:02 +0000 (13:58 +0200)]
changed 1 test value (bug in lapack-testing?)

wernsaar [Sat, 30 Aug 2014 11:36:27 +0000 (13:36 +0200)]
optimized sgemv_t kernel for small sizes

Zhang Xianyi [Fri, 29 Aug 2014 05:31:06 +0000 (13:31 +0800)]
Merge pull request #443 from idunham/fix

Workaround PIC limitations in cpuid.

Isaac Dunham [Thu, 28 Aug 2014 20:05:07 +0000 (13:05 -0700)]
Workaround PIC limitations in cpuid.

cpuid uses register ebx, but ebx is reserved in PIC.
So save ebx, swap ebx & edi, and return edi.

Copied from Igor Pavlov's equivalent fix for 7zip (in CpuArch.c),
which is public domain and thus OK license-wise.
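
The save-and-swap sequence described in this commit message can be sketched as follows. This is a minimal illustration assuming GCC or Clang extended inline assembly; the helper name `cpuid_pic_safe` and the `cpuid_regs` struct are hypothetical and are not taken from the OpenBLAS sources or from 7zip's CpuArch.c.

```c
#include <stdint.h>

/* Register values returned by CPUID; `supported` stays 0 on non-x86 builds.
   Hypothetical type, introduced only for this sketch. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
    int supported;
} cpuid_regs;

/* Hypothetical helper: run CPUID for `leaf` (sub-leaf 0) without
   declaring ebx as an output on 32-bit x86. */
cpuid_regs cpuid_pic_safe(uint32_t leaf)
{
    cpuid_regs r = {0, 0, 0, 0, 0};
#if defined(__GNUC__) && defined(__i386__)
    /* 32-bit x86: ebx may hold the GOT pointer in PIC code.
       Save ebx in edi, run cpuid, then swap the two registers so that
       ebx is restored and edi carries the value cpuid wrote to ebx. */
    __asm__ __volatile__(
        "mov %%ebx, %%edi\n\t"
        "cpuid\n\t"
        "xchg %%ebx, %%edi\n\t"
        : "=a"(r.eax), "=D"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
        : "0"(leaf), "2"(0u));
    r.supported = 1;
#elif defined(__GNUC__) && defined(__x86_64__)
    /* x86-64 PIC normally uses RIP-relative addressing, so ebx can be
       named directly as an output here. */
    __asm__ __volatile__(
        "cpuid"
        : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
        : "0"(leaf), "2"(0u));
    r.supported = 1;
#else
    (void)leaf; /* other architectures or compilers: report "not supported" */
#endif
    return r;
}
```

On an x86 machine, `cpuid_pic_safe(0)` returns the highest supported basic leaf in `eax` and the vendor string bytes in `ebx`, `edx`, `ecx`. On other targets the sketch returns all zeros with `supported == 0`.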

Zhang Xianyi [Thu, 28 Aug 2014 04:43:54 +0000 (12:43 +0800)]
Merge pull request #440 from wernsaar/develop

optimizations for level1 and level2 blas functions

wernsaar [Wed, 27 Aug 2014 07:00:20 +0000 (09:00 +0200)]
modification for clang compiler

wernsaar [Tue, 26 Aug 2014 16:29:40 +0000 (18:29 +0200)]
removed flag no-integrated-as, because not working on macosx

wernsaar [Tue, 26 Aug 2014 15:36:32 +0000 (17:36 +0200)]
EXPERIMENTAL: added the flag -no-integrated-as for clang compiler in Makefile.system

Zhang Xianyi [Tue, 26 Aug 2014 08:14:34 +0000 (16:14 +0800)]
Fixed the typo in Changelog.txt

wernsaar [Mon, 25 Aug 2014 13:52:35 +0000 (15:52 +0200)]
added optimized zaxpy bulldozer kernel

wernsaar [Mon, 25 Aug 2014 12:53:28 +0000 (14:53 +0200)]
added optimized caxpy kernel for bulldozer

wernsaar [Sun, 24 Aug 2014 08:57:12 +0000 (10:57 +0200)]
added optimized daxpy kernel for bulldozer

wernsaar [Sat, 23 Aug 2014 15:53:07 +0000 (17:53 +0200)]
added optimized daxpy kernel for nehalem