wernsaar [Tue, 9 Sep 2014 14:17:45 +0000 (16:17 +0200)]
added and tested optimized dgemv_n kernel for haswell
wernsaar [Tue, 9 Sep 2014 13:32:32 +0000 (15:32 +0200)]
added optimized dgemv_n kernel for haswell
wernsaar [Tue, 9 Sep 2014 12:38:08 +0000 (14:38 +0200)]
optimized dgemv_t kernel for haswell
wernsaar [Tue, 9 Sep 2014 12:04:44 +0000 (14:04 +0200)]
bugfix in KERNEL.HASWELL
wernsaar [Tue, 9 Sep 2014 11:54:55 +0000 (13:54 +0200)]
added optimized gemv kernels
wernsaar [Tue, 9 Sep 2014 11:34:22 +0000 (13:34 +0200)]
added optimized dgemv_t kernel for haswell
wernsaar [Mon, 8 Sep 2014 17:15:31 +0000 (19:15 +0200)]
removed obsolete files
wernsaar [Mon, 8 Sep 2014 13:22:35 +0000 (15:22 +0200)]
optimized dgemv_n kernel for small sizes
wernsaar [Mon, 8 Sep 2014 10:27:32 +0000 (12:27 +0200)]
modified multithreading threshold
wernsaar [Mon, 8 Sep 2014 10:25:16 +0000 (12:25 +0200)]
added haswell optimized kernel
wernsaar [Mon, 8 Sep 2014 08:54:33 +0000 (10:54 +0200)]
bugfix in sgemv_n_microk_haswell-4.c
wernsaar [Mon, 8 Sep 2014 08:13:39 +0000 (10:13 +0200)]
added optimized sgemv_t kernel for haswell
wernsaar [Sun, 7 Sep 2014 19:48:42 +0000 (21:48 +0200)]
bugfix for windows
wernsaar [Sun, 7 Sep 2014 19:13:57 +0000 (21:13 +0200)]
enabled optimized sgemv kernels for piledriver
wernsaar [Sun, 7 Sep 2014 18:53:30 +0000 (20:53 +0200)]
optimized sgemv_n kernel for sandybridge
wernsaar [Sun, 7 Sep 2014 17:20:08 +0000 (19:20 +0200)]
optimized sgemv_n kernel for nehalem
wernsaar [Sun, 7 Sep 2014 16:23:48 +0000 (18:23 +0200)]
optimized sgemv_n for very small size of m
wernsaar [Sun, 7 Sep 2014 11:45:03 +0000 (13:45 +0200)]
optimizations for very small sizes
wernsaar [Sat, 6 Sep 2014 19:28:57 +0000 (21:28 +0200)]
better optimizations for sgemv_t kernel
wernsaar [Sat, 6 Sep 2014 17:41:57 +0000 (19:41 +0200)]
optimized sgemv_t_4 kernel for very small sizes
wernsaar [Sat, 6 Sep 2014 16:34:25 +0000 (18:34 +0200)]
optimized sgemv_t
wernsaar [Sat, 6 Sep 2014 11:17:56 +0000 (13:17 +0200)]
optimization for small size
wernsaar [Sat, 6 Sep 2014 10:08:48 +0000 (12:08 +0200)]
added optimized sgemv_n kernel for haswell
wernsaar [Sat, 6 Sep 2014 09:01:42 +0000 (11:01 +0200)]
undef WHEREAMI
wernsaar [Sat, 6 Sep 2014 06:41:53 +0000 (08:41 +0200)]
added optimized sgemv_n kernel for sandybridge
wernsaar [Fri, 5 Sep 2014 13:05:53 +0000 (15:05 +0200)]
experimentally removed expensive function calls
wernsaar [Fri, 5 Sep 2014 08:22:50 +0000 (10:22 +0200)]
optimized sgemv_t for sandybridge
wernsaar [Thu, 4 Sep 2014 16:55:52 +0000 (18:55 +0200)]
bugfix for sgemv_n_4.c
wernsaar [Thu, 4 Sep 2014 11:09:27 +0000 (13:09 +0200)]
optimized sgemv_n kernel for small sizes
wernsaar [Wed, 3 Sep 2014 13:34:30 +0000 (15:34 +0200)]
optimized sgemv_n_4.c
wernsaar [Wed, 3 Sep 2014 12:48:45 +0000 (14:48 +0200)]
optimized sgemv_n for small sizes
wernsaar [Wed, 3 Sep 2014 08:13:47 +0000 (10:13 +0200)]
bugfix for buffer overflow
wernsaar [Tue, 2 Sep 2014 15:36:07 +0000 (17:36 +0200)]
optimized interface/gemv.c for multithreading
wernsaar [Tue, 2 Sep 2014 14:30:04 +0000 (16:30 +0200)]
updated interface/gemv.c for multithreading
wernsaar [Tue, 2 Sep 2014 12:11:42 +0000 (14:11 +0200)]
added plot-header to compare multithreading
wernsaar [Tue, 2 Sep 2014 11:35:41 +0000 (13:35 +0200)]
removed obsolete instructions from sgemv_t_4.c
wernsaar [Tue, 2 Sep 2014 10:42:36 +0000 (12:42 +0200)]
optimized sgemv_t for bulldozer
wernsaar [Mon, 1 Sep 2014 13:11:37 +0000 (15:11 +0200)]
optimized sgemv_t_4.c for small sizes
wernsaar [Mon, 1 Sep 2014 13:07:36 +0000 (15:07 +0200)]
extended gemv.c benchmark
wernsaar [Sun, 31 Aug 2014 13:38:18 +0000 (15:38 +0200)]
modified benchmark/gemv.c
wernsaar [Sun, 31 Aug 2014 12:33:15 +0000 (14:33 +0200)]
optimized sgemv_t_4.c for uneven sizes
wernsaar [Sun, 31 Aug 2014 11:23:44 +0000 (13:23 +0200)]
optimized sgemv_t_4.c for small size
wernsaar [Sat, 30 Aug 2014 11:58:02 +0000 (13:58 +0200)]
changed 1 test value (bug in lapack-testing?)
wernsaar [Sat, 30 Aug 2014 11:36:27 +0000 (13:36 +0200)]
optimized sgemv_t kernel for small sizes
Zhang Xianyi [Thu, 28 Aug 2014 04:43:54 +0000 (12:43 +0800)]
Merge pull request #440 from wernsaar/develop
optimizations for level1 and level2 blas functions
wernsaar [Wed, 27 Aug 2014 07:00:20 +0000 (09:00 +0200)]
modification for clang compiler
wernsaar [Tue, 26 Aug 2014 16:29:40 +0000 (18:29 +0200)]
removed flag no-integrated-as, because not working on macosx
wernsaar [Tue, 26 Aug 2014 15:36:32 +0000 (17:36 +0200)]
EXPERIMENTAL: added the flag -no-integrated-as for clang compiler in Makefile.system
Zhang Xianyi [Tue, 26 Aug 2014 08:14:34 +0000 (16:14 +0800)]
Fixed the typo in Changelog.txt
wernsaar [Mon, 25 Aug 2014 13:52:35 +0000 (15:52 +0200)]
added optimized zaxpy bulldozer kernel
wernsaar [Mon, 25 Aug 2014 12:53:28 +0000 (14:53 +0200)]
added optimized caxpy kernel for bulldozer
wernsaar [Sun, 24 Aug 2014 08:57:12 +0000 (10:57 +0200)]
added optimized daxpy kernel for bulldozer
wernsaar [Sat, 23 Aug 2014 15:53:07 +0000 (17:53 +0200)]
added optimized daxpy kernel for nehalem
wernsaar [Sat, 23 Aug 2014 15:28:01 +0000 (17:28 +0200)]
updated gemm.c
wernsaar [Sat, 23 Aug 2014 15:15:21 +0000 (17:15 +0200)]
added optimized saxpy kernel for nehalem
wernsaar [Sat, 23 Aug 2014 11:12:44 +0000 (13:12 +0200)]
added axpy benchmark-test
wernsaar [Sat, 23 Aug 2014 08:40:57 +0000 (10:40 +0200)]
added optimized dgemv_n kernel for nehalem
wernsaar [Fri, 22 Aug 2014 19:19:29 +0000 (21:19 +0200)]
added optimized ddot kernel for bulldozer
wernsaar [Fri, 22 Aug 2014 18:34:41 +0000 (20:34 +0200)]
added optimized ddot kernel for nehalem
wernsaar [Fri, 22 Aug 2014 15:02:55 +0000 (17:02 +0200)]
bugfix for Makefile
wernsaar [Fri, 22 Aug 2014 15:01:27 +0000 (17:01 +0200)]
update of KERNEL.BULLDOZER
wernsaar [Fri, 22 Aug 2014 15:00:26 +0000 (17:00 +0200)]
added optimized sdot kernel for nehalem
wernsaar [Fri, 22 Aug 2014 12:29:17 +0000 (14:29 +0200)]
added optimized sdot for bulldozer
wernsaar [Fri, 22 Aug 2014 09:51:30 +0000 (11:51 +0200)]
bugfix in Makefile
wernsaar [Fri, 22 Aug 2014 09:42:07 +0000 (11:42 +0200)]
added sdot and ddot benchmarks
wernsaar [Fri, 22 Aug 2014 08:00:09 +0000 (10:00 +0200)]
added hemv benchmark
wernsaar [Thu, 21 Aug 2014 17:33:57 +0000 (19:33 +0200)]
added benchmarks for csymv and zsymv
wernsaar [Thu, 21 Aug 2014 12:27:00 +0000 (14:27 +0200)]
added optimized symv_L kernels for nehalem
wernsaar [Thu, 21 Aug 2014 11:32:06 +0000 (13:32 +0200)]
added optimized ssymv_L kernel for bulldozer
wernsaar [Thu, 21 Aug 2014 11:02:53 +0000 (13:02 +0200)]
added optimized dsymv_L kernel for bulldozer
wernsaar [Wed, 20 Aug 2014 07:58:04 +0000 (09:58 +0200)]
added optimized dsymv_U kernel for nehalem
wernsaar [Wed, 20 Aug 2014 07:00:56 +0000 (09:00 +0200)]
updated optimized dsymv_U kernel for bulldozer
wernsaar [Tue, 19 Aug 2014 17:25:03 +0000 (19:25 +0200)]
updated optimized ssymv_U for bulldozer
wernsaar [Tue, 19 Aug 2014 15:09:45 +0000 (17:09 +0200)]
added optimized ssymv_U kernel for nehalem
wernsaar [Mon, 18 Aug 2014 11:52:24 +0000 (13:52 +0200)]
added optimized ssymv_U kernel for bulldozer
wernsaar [Mon, 18 Aug 2014 10:18:10 +0000 (12:18 +0200)]
added optimized dsymv_U kernel for bulldozer
Zhang Xianyi [Mon, 18 Aug 2014 03:15:42 +0000 (11:15 +0800)]
OpenBLAS 0.2.11 version.
wernsaar [Sat, 16 Aug 2014 11:52:50 +0000 (13:52 +0200)]
add reference in C for symv_U
wernsaar [Sat, 16 Aug 2014 09:36:48 +0000 (11:36 +0200)]
added reference in C for symv_L
wernsaar [Fri, 15 Aug 2014 10:40:10 +0000 (12:40 +0200)]
Ref #433: removed obsolete lapack entries from common_interface.h
Zhang Xianyi [Fri, 15 Aug 2014 00:07:27 +0000 (08:07 +0800)]
Merge pull request #434 from wernsaar/develop
A lot of performance enhancements
wernsaar [Thu, 14 Aug 2014 17:00:30 +0000 (19:00 +0200)]
added optimized cgemv_n for haswell
wernsaar [Thu, 14 Aug 2014 12:10:29 +0000 (14:10 +0200)]
added optimized cgemv_t kernel for haswell
wernsaar [Wed, 13 Aug 2014 14:10:03 +0000 (16:10 +0200)]
optimized zgemv_n kernel for sandybridge
wernsaar [Wed, 13 Aug 2014 12:54:50 +0000 (14:54 +0200)]
added additional test values
wernsaar [Wed, 13 Aug 2014 11:54:19 +0000 (13:54 +0200)]
added fast return, if m or n < 1
wernsaar [Wed, 13 Aug 2014 11:42:22 +0000 (13:42 +0200)]
optimized zgemv_t_microk_haswell-2.c
wernsaar [Wed, 13 Aug 2014 10:54:18 +0000 (12:54 +0200)]
bugfix for zgemv_n_microk_haswell-2.c
wernsaar [Wed, 13 Aug 2014 10:18:03 +0000 (12:18 +0200)]
bugfix in zgemv_n_microk_sandy-2.c
wernsaar [Tue, 12 Aug 2014 10:15:41 +0000 (12:15 +0200)]
added optimized cgemv_t c-kernel
wernsaar [Tue, 12 Aug 2014 08:02:25 +0000 (10:02 +0200)]
bugfix in zgemv_n_microk_haswell-2.c
wernsaar [Tue, 12 Aug 2014 06:35:42 +0000 (08:35 +0200)]
modified algorithm for better numerical stability
wernsaar [Mon, 11 Aug 2014 14:57:52 +0000 (16:57 +0200)]
added optimized zgemv_t kernel for haswell
wernsaar [Mon, 11 Aug 2014 12:19:25 +0000 (14:19 +0200)]
add optimized zgemv_t kernel for bulldozer
wernsaar [Mon, 11 Aug 2014 11:10:12 +0000 (13:10 +0200)]
added optimized zgemv_t for haswell
wernsaar [Mon, 11 Aug 2014 07:13:18 +0000 (09:13 +0200)]
added optimized zgemv_t c-kernel
wernsaar [Sun, 10 Aug 2014 09:57:24 +0000 (11:57 +0200)]
disabled optimized haswell zgemv_n kernel for windows (bad rounding)
wernsaar [Sun, 10 Aug 2014 06:39:17 +0000 (08:39 +0200)]
added optimized zgemv_n kernel for haswell
wernsaar [Thu, 7 Aug 2014 20:30:20 +0000 (22:30 +0200)]
added zgemv_n c-function
wernsaar [Thu, 7 Aug 2014 08:08:54 +0000 (10:08 +0200)]
added optimized dgemv_t kernel for haswell