Zhang Xianyi [Fri, 2 Dec 2016 02:28:57 +0000 (10:28 +0800)]
Merge pull request #1015 from ararslan/aa/freebsd
Include system headers for blas_server on FreeBSD
Zhang Xianyi [Fri, 2 Dec 2016 02:27:51 +0000 (10:27 +0800)]
Merge pull request #996 from grisuthedragon/lapack-3.6.1
Lapack 3.6.1
Zhang Xianyi [Tue, 22 Nov 2016 07:54:56 +0000 (15:54 +0800)]
Merge pull request #1016 from ksraste/develop
Add data prefetch in DOT and ASUM functions
kaustubh [Tue, 22 Nov 2016 05:51:03 +0000 (11:21 +0530)]
Add data prefetch in DOT and ASUM functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
Alex Arslan [Thu, 17 Nov 2016 05:58:20 +0000 (21:58 -0800)]
Include system headers on FreeBSD
Zhang Xianyi [Mon, 7 Nov 2016 02:41:20 +0000 (10:41 +0800)]
Merge pull request #1002 from brada4/limpio
Remove a few lines of dead code.
Zhang Xianyi [Mon, 7 Nov 2016 02:26:13 +0000 (10:26 +0800)]
Merge pull request #1010 from martin-frbg/cpuid
Add TARGETs for newer Intel CPUs - Kaby Lake, Knights Landing, Apollo Lake
Zhang Xianyi [Mon, 7 Nov 2016 02:25:51 +0000 (10:25 +0800)]
Merge pull request #1009 from martin-frbg/getarch-newline-fix
Getarch newline fix
Martin Kroeker [Sun, 6 Nov 2016 22:27:30 +0000 (23:27 +0100)]
Add files via upload
Martin Kroeker [Sun, 6 Nov 2016 22:26:39 +0000 (23:26 +0100)]
Add files via upload
Martin Kroeker [Sun, 6 Nov 2016 22:26:04 +0000 (23:26 +0100)]
Add files via upload
Martin Kroeker [Sun, 6 Nov 2016 16:38:20 +0000 (17:38 +0100)]
Add files via upload
Martin Kroeker [Sun, 6 Nov 2016 16:37:37 +0000 (17:37 +0100)]
Delete CMakeLists.txt
Martin Kroeker [Sun, 6 Nov 2016 16:29:33 +0000 (17:29 +0100)]
Fix spurious define in openblas_config.h
TARGET as specified with make is already return-terminated when getarch reads it. This led to an empty line written to config_last.h that awk in Makefile.install then expanded to a spurious "#define OPENBLAS_" in openblas_config.h (as noted by "kmb" on the mailing list)
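A stray line ending like the one described above can be guarded against by normalising the string before it is written out. The following is only a minimal sketch of that idea, not the actual getarch patch; the helper name `strip_line_ending` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper (not taken from getarch): cut the string at the
 * first carriage return or line feed, so that a value which already
 * carries a line ending does not produce an extra empty line when the
 * caller appends its own newline. */
static void strip_line_ending(char *s)
{
    /* strcspn returns the length of the prefix free of '\r' and '\n';
     * if neither occurs, that index is the existing terminator. */
    s[strcspn(s, "\r\n")] = '\0';
}
```

Calling it on a string without a line ending leaves the string unchanged, so it is safe to apply unconditionally.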
Martin Kroeker [Sat, 5 Nov 2016 12:38:57 +0000 (13:38 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 12:30:40 +0000 (13:30 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 12:26:01 +0000 (13:26 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 12:11:32 +0000 (13:11 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 12:05:05 +0000 (13:05 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 11:59:05 +0000 (12:59 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 11:47:15 +0000 (12:47 +0100)]
Update CMakeLists.txt
Martin Kroeker [Sat, 5 Nov 2016 10:55:45 +0000 (11:55 +0100)]
Consolidate debug options
Use BUILD_DEBUG option only if CMAKE_BUILD_TYPE is not set
Consolidate debug postfixes in install target
Andrew [Mon, 31 Oct 2016 11:46:56 +0000 (12:46 +0100)]
remove dead code
Andrew [Sat, 29 Oct 2016 21:44:02 +0000 (23:44 +0200)]
new branch
Martin Koehler [Wed, 26 Oct 2016 19:43:41 +0000 (21:43 +0200)]
Move remaining OpenBLAS related changes from 3.6.0 to 3.6.1
Martin Koehler [Wed, 26 Oct 2016 19:34:56 +0000 (21:34 +0200)]
Fix #971
Martin Koehler [Wed, 26 Oct 2016 19:17:12 +0000 (21:17 +0200)]
Fix threshold in nep.in
Martin Köhler [Wed, 26 Oct 2016 14:03:00 +0000 (16:03 +0200)]
Fix MingW build
Martin Köhler [Wed, 26 Oct 2016 13:19:40 +0000 (15:19 +0200)]
Update gitignore
Martin Köhler [Wed, 26 Oct 2016 13:14:13 +0000 (15:14 +0200)]
Import LAPACK: top directory
Martin Köhler [Wed, 26 Oct 2016 13:13:03 +0000 (15:13 +0200)]
Import LAPACK: TESTING directory
Martin Köhler [Wed, 26 Oct 2016 13:12:09 +0000 (15:12 +0200)]
Import LAPACK: SRC directory
Martin Köhler [Wed, 26 Oct 2016 13:06:08 +0000 (15:06 +0200)]
Import LAPACK: LAPACKE directory
Martin Köhler [Wed, 26 Oct 2016 13:04:39 +0000 (15:04 +0200)]
Import LAPACK: INSTALL directory
Martin Köhler [Wed, 26 Oct 2016 13:03:51 +0000 (15:03 +0200)]
Import LAPACK: DOCS directory
Martin Köhler [Wed, 26 Oct 2016 13:03:16 +0000 (15:03 +0200)]
Import LAPACK: CMAKE directory
Martin Köhler [Wed, 26 Oct 2016 13:02:41 +0000 (15:02 +0200)]
Import LAPACK: CBLAS directory
Martin Köhler [Wed, 26 Oct 2016 13:02:09 +0000 (15:02 +0200)]
Import LAPACK: BLAS directory
Martin Kroeker [Wed, 19 Oct 2016 13:27:22 +0000 (15:27 +0200)]
Add CMAKE install target
Add CMAKE install target (copied from a patch provided by PrimarchOfTheSpaceWolves in #957)
Zhang Xianyi [Tue, 18 Oct 2016 04:38:52 +0000 (12:38 +0800)]
Merge pull request #986 from ksraste/develop
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
kaustubh [Mon, 17 Oct 2016 12:59:38 +0000 (18:29 +0530)]
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
Zhang Xianyi [Mon, 17 Oct 2016 03:33:16 +0000 (11:33 +0800)]
Merge pull request #984 from ksraste/develop
STRSM, DTRSM functions data prefetch
Zhang Xianyi [Mon, 17 Oct 2016 03:32:57 +0000 (11:32 +0800)]
Merge pull request #981 from howard0su/develop
USE NPROCESSOR_CONF instead of NPROCESSOR_ONLN
Zhang Xianyi [Mon, 17 Oct 2016 03:32:20 +0000 (11:32 +0800)]
Merge pull request #982 from martin-frbg/develop
Change file comments to work around clang 3.9 assembler bug; add support for Bay Trail atom
Martin Kroeker [Sun, 16 Oct 2016 20:51:42 +0000 (22:51 +0200)]
Merge pull request #1 from martin-frbg/martin-frbg-patch-1
Add Intel "Bay Trail" atom cpu
Martin Kroeker [Sun, 16 Oct 2016 20:48:58 +0000 (22:48 +0200)]
Merge pull request #2 from martin-frbg/martin-frbg-patch-1-1
Update cpuid_x86.c
Martin Kroeker [Sun, 16 Oct 2016 20:45:44 +0000 (22:45 +0200)]
Update cpuid_x86.c
Add Bay Trail "Pentium N3520" atom cpu
Martin Kroeker [Sun, 16 Oct 2016 20:40:00 +0000 (22:40 +0200)]
Update dynamic.c
Add Bay Trail "Pentium N3520" atom
kaustubh [Fri, 14 Oct 2016 11:11:28 +0000 (16:41 +0530)]
STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
Martin Kroeker [Thu, 13 Oct 2016 14:51:08 +0000 (16:51 +0200)]
Change file comments to work around clang 3.9 assembler bug
Howard Su [Thu, 13 Oct 2016 12:37:50 +0000 (12:37 +0000)]
USE NPROCESSOR_CONF instead of NPROCESSOR_ONLN
to determine the number of CPUs. On ARM platforms, the number of
online CPUs increases as the workload grows, while the number of
configured CPUs is the maximum number of CPUs.
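The difference between the two counts can be seen by querying both through `sysconf`. This is a sketch under assumptions rather than the code from the commit: the helper names are hypothetical, and `_SC_NPROCESSORS_CONF` / `_SC_NPROCESSORS_ONLN` are widely available extensions (glibc, the BSDs, macOS) rather than guaranteed on every system.

```c
#include <unistd.h>

/* Hypothetical helper: number of CPUs the system is configured with,
 * including cores that are currently offline. Falls back to 1 if the
 * query fails. */
static long configured_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);
    return n > 0 ? n : 1;
}

/* Hypothetical helper: number of CPUs online at the moment of the
 * call. On systems that hot-plug cores under load this value can
 * change between calls. Falls back to 1 if the query fails. */
static long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```

Sizing a thread pool from `configured_cpus()` gives a value that stays fixed for the lifetime of the process, which is the motivation stated in the commit message.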
Zhang Xianyi [Thu, 13 Oct 2016 02:17:07 +0000 (10:17 +0800)]
Fixed #979. Patch for NetBSD.
Zhang Xianyi [Thu, 13 Oct 2016 02:13:56 +0000 (10:13 +0800)]
Merge pull request #970 from martin-frbg/develop
Remove implicit inclusions of complex.h in various zdot implementations
Zhang Xianyi [Thu, 13 Oct 2016 02:13:12 +0000 (10:13 +0800)]
Merge pull request #980 from kiwifb/utest_ldflags
make utest/Makefile respect LDFLAGS
Zhang Xianyi [Thu, 13 Oct 2016 02:12:35 +0000 (10:12 +0800)]
Merge pull request #973 from vladimir-ch/fix-lapacke-xlarfb
LAPACKE: fix wrong direction check in LAPACKE_?larfb_work
François Bissey [Wed, 12 Oct 2016 20:32:25 +0000 (09:32 +1300)]
make utest/Makefile respect LDFLAGS
Martin Kroeker [Wed, 5 Oct 2016 16:59:09 +0000 (18:59 +0200)]
Update zdot_msa.c
Martin Kroeker [Wed, 5 Oct 2016 16:58:03 +0000 (18:58 +0200)]
Update zdot.c
Martin Kroeker [Wed, 5 Oct 2016 16:57:14 +0000 (18:57 +0200)]
Update zdot.c
Vladimir Chalupecky [Fri, 30 Sep 2016 20:31:30 +0000 (05:31 +0900)]
LAPACKE: fix wrong direction check in LAPACKE_?larfb_work
Closes #971
Martin Kroeker [Thu, 29 Sep 2016 21:45:56 +0000 (23:45 +0200)]
Remove explicit include of complex.h
Martin Kroeker [Thu, 29 Sep 2016 21:43:28 +0000 (23:43 +0200)]
Remove explicit include of complex.h
Martin Kroeker [Thu, 29 Sep 2016 21:41:43 +0000 (23:41 +0200)]
Remove explicit include of complex.h
Martin Kroeker [Thu, 29 Sep 2016 21:40:36 +0000 (23:40 +0200)]
Remove explicit include of complex.h
Martin Kroeker [Thu, 29 Sep 2016 21:39:35 +0000 (23:39 +0200)]
Remove explicit include of complex.h
Zhang Xianyi [Thu, 22 Sep 2016 15:34:57 +0000 (11:34 -0400)]
Merge pull request #968 from buffer51/develop
Updated CROSS_SUFFIX regex to work with CC containing arguments
Zhang Xianyi [Thu, 22 Sep 2016 15:33:51 +0000 (11:33 -0400)]
Merge pull request #969 from sva-img/develop
DGEMM function split and data prefetch
Shivraj Patil [Thu, 22 Sep 2016 11:55:46 +0000 (17:25 +0530)]
DGEMM function split and data prefetch
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Paul MUSTIÈRE [Wed, 14 Sep 2016 18:42:22 +0000 (11:42 -0700)]
Updated CROSS_SUFFIX regex to work with CC containing arguments
Zhang Xianyi [Tue, 13 Sep 2016 20:15:37 +0000 (16:15 -0400)]
Merge pull request #958 from intelfx/remove-stabs
common_arm.h, common_mips.h: get rid of .func directives
Ivan Shapovalov [Fri, 9 Sep 2016 00:36:49 +0000 (03:36 +0300)]
common_arm.h, common_mips.h: get rid of .func directives
.func/.endfunc are gcc/gas-specific directives for generating stabs
debug information (and nothing more). They are near-useless now because
DWARF is commonly used, and they are not implemented in Clang. Hence building
OpenBLAS with Clang fails, and there is no sane way to detect GCC vs.
anything else with preprocessor definitions.
Hence, just remove these directives.
Zhang Xianyi [Thu, 1 Sep 2016 04:01:23 +0000 (00:01 -0400)]
Update develop for 0.2.20.dev.
Zhang Xianyi [Thu, 1 Sep 2016 03:58:29 +0000 (23:58 -0400)]
Update doc for 0.2.19.
Zhang Xianyi [Fri, 19 Aug 2016 01:59:43 +0000 (18:59 -0700)]
Refs #946. Use nrm2 reference implementation for Power8.
Zhang Xianyi [Thu, 18 Aug 2016 17:24:42 +0000 (10:24 -0700)]
Refs #929. Deal with zero and NaNs for scale.
Zhang Xianyi [Thu, 18 Aug 2016 13:31:31 +0000 (09:31 -0400)]
Merge pull request #941 from sva-img/develop
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Zhang Xianyi [Fri, 12 Aug 2016 21:20:59 +0000 (17:20 -0400)]
Merge pull request #943 from ibmsoe/IBMMASS_Support
Added support of IBM's MASS library that optimizes performance on Pow…
nishidha@us.ibm.com [Thu, 11 Aug 2016 09:13:26 +0000 (14:43 +0530)]
Added support of IBM's MASS library that optimizes performance on Power architectures
Shivraj Patil [Wed, 10 Aug 2016 12:14:22 +0000 (17:44 +0530)]
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Shivraj Patil [Mon, 8 Aug 2016 06:28:01 +0000 (11:58 +0530)]
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Zhang Xianyi [Tue, 26 Jul 2016 13:54:31 +0000 (09:54 -0400)]
Merge pull request #933 from ashwinyes/develop_aarch64_20160726_Dgemm_8x4_Opts
Cortex A57: Improvements to DGEMM 8x4 kernel
Ashwin Sekhar T K [Mon, 25 Jul 2016 09:03:25 +0000 (14:33 +0530)]
Cortex A57: Improvements to DGEMM 8x4 kernel
Zhang Xianyi [Fri, 22 Jul 2016 15:42:30 +0000 (11:42 -0400)]
Merge pull request #930 from sva-img/develop
P6600/I6400 Build fix.
Shivraj Patil [Fri, 22 Jul 2016 13:15:06 +0000 (18:45 +0530)]
P6600/I6400 Build fix. Reverted the changes that were made to support the MIPS n32 ABI
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Zhang Xianyi [Fri, 15 Jul 2016 15:17:30 +0000 (11:17 -0400)]
Merge pull request #927 from sva-img/develop
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Shivraj Patil [Fri, 15 Jul 2016 13:08:25 +0000 (18:38 +0530)]
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Zhang Xianyi [Thu, 14 Jul 2016 20:09:36 +0000 (13:09 -0700)]
Refs #917 Avoid detecting gfortran bug on IBM POWER + Ubuntu
Zhang Xianyi [Thu, 14 Jul 2016 19:49:33 +0000 (15:49 -0400)]
Merge pull request #926 from vriera/develop
Complete support for MIPS n32 ABI
Zhang Xianyi [Thu, 14 Jul 2016 19:48:58 +0000 (15:48 -0400)]
Merge pull request #925 from martin-frbg/develop
Update zgetrf2.f, cpuid_x86.c, dynamic.c
Zhang Xianyi [Thu, 14 Jul 2016 19:47:55 +0000 (15:47 -0400)]
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
Zhang Xianyi [Thu, 14 Jul 2016 19:45:39 +0000 (15:45 -0400)]
Merge pull request #918 from sva-img/develop
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM.
Vicente Olivert Riera [Thu, 14 Jul 2016 16:20:51 +0000 (17:20 +0100)]
Complete support for MIPS n32 ABI
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
Martin Kroeker [Thu, 14 Jul 2016 15:29:34 +0000 (17:29 +0200)]
Update cpuid_x86.c
Martin Kroeker [Thu, 14 Jul 2016 10:25:17 +0000 (12:25 +0200)]
Update cpuid_x86.c
Martin Kroeker [Thu, 14 Jul 2016 10:22:55 +0000 (12:22 +0200)]
Update dynamic.c
Add Braswell (extended model 4, model 12) N3150 as Nehalem
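The "extended model 4, model 12" pair refers to fields of the x86 CPUID signature (EAX of CPUID leaf 1). The sketch below shows how those fields are laid out and combined; it is an illustration only, the function names are hypothetical and are not the ones used in cpuid_x86.c or dynamic.c.

```c
#include <stdint.h>

/* Hypothetical helpers that decode a raw CPUID leaf 1 EAX value. */

/* Base family: bits 8..11. */
static unsigned cpuid_family(uint32_t eax)    { return (eax >> 8) & 0xf; }

/* Base model: bits 4..7. */
static unsigned cpuid_model(uint32_t eax)     { return (eax >> 4) & 0xf; }

/* Extended model: bits 16..19. */
static unsigned cpuid_ext_model(uint32_t eax) { return (eax >> 16) & 0xf; }

/* Combined ("display") model. For family 6 and family 15 the extended
 * model supplies the high nibble; otherwise the base model is used. */
static unsigned cpuid_display_model(uint32_t eax)
{
    unsigned family = cpuid_family(eax);
    unsigned model  = cpuid_model(eax);
    if (family == 6 || family == 15)
        model |= cpuid_ext_model(eax) << 4;
    return model;
}
```

For a family 6 signature with extended model 4 and model 12, the combined model is 0x4C.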
Martin Kroeker [Thu, 14 Jul 2016 09:41:57 +0000 (11:41 +0200)]
Update zgetrf2.f
Trivial typo correction (ZERBLA => XERBLA) to fix #910
Ashwin Sekhar T K [Thu, 14 Jul 2016 08:21:17 +0000 (13:51 +0530)]
Improvements to TRMM and GEMM kernels
Ashwin Sekhar T K [Thu, 14 Jul 2016 08:20:38 +0000 (13:50 +0530)]
Improvements to GEMV kernels
Ashwin Sekhar T K [Thu, 14 Jul 2016 08:19:15 +0000 (13:49 +0530)]
Improvements to COPY and IAMAX kernels
Ashwin Sekhar T K [Thu, 14 Jul 2016 08:18:13 +0000 (13:48 +0530)]
Add time prints in benchmark output