Isuru Fernando [Mon, 7 Aug 2017 18:37:25 +0000 (00:07 +0530)]
No strncasecmp with MSVC
Isuru Fernando [Mon, 7 Aug 2017 17:38:44 +0000 (23:08 +0530)]
Add commonobjs
Isuru Fernando [Sun, 6 Aug 2017 13:47:31 +0000 (19:17 +0530)]
Test DYNAMIC_ARCH on appveyor
Isuru Fernando [Sun, 6 Aug 2017 13:37:00 +0000 (19:07 +0530)]
Merge remote-tracking branch 'upstream/develop' into dyn
Martin Kroeker [Sun, 6 Aug 2017 12:11:44 +0000 (14:11 +0200)]
Merge pull request #1262 from martin-frbg/xmv_thread-splitting
Make sure that range limit of last thread never exceeds data size
Martin Kroeker [Sun, 6 Aug 2017 12:10:18 +0000 (14:10 +0200)]
Merge pull request #1256 from isuruf/develop
Support compiling with clang on windows
Isuru Fernando [Fri, 4 Aug 2017 02:34:16 +0000 (08:04 +0530)]
Build all branches so that appveyor works in forks
Isuru Fernando [Fri, 4 Aug 2017 02:27:20 +0000 (07:57 +0530)]
New utest for clang
Isuru Fernando [Fri, 4 Aug 2017 02:27:55 +0000 (07:57 +0530)]
Merge remote-tracking branch 'upstream/develop' into develop
Martin Kroeker [Thu, 3 Aug 2017 13:33:28 +0000 (15:33 +0200)]
Merge pull request #1266 from ashwinyes/develop_thunderx2t99_fix_clang_compilation
THUNDERX2T99: Fix clang compilation
Ashwin Sekhar T K [Wed, 2 Aug 2017 18:28:45 +0000 (11:28 -0700)]
THUNDERX2T99: Fix clang compilation
Isuru Fernando [Wed, 2 Aug 2017 13:09:04 +0000 (18:39 +0530)]
Add missing EXCAVATOR
Martin Kroeker [Wed, 2 Aug 2017 13:31:05 +0000 (15:31 +0200)]
Merge pull request #1259 from isuruf/cmake
CMake Improvements
Isuru Fernando [Wed, 2 Aug 2017 13:00:26 +0000 (18:30 +0530)]
Fix extra whitespace; the CMake parser macro fails on it
TODO: Fix the parser macro to strip trailing whitespace
Isuru Fernando [Wed, 2 Aug 2017 12:54:54 +0000 (18:24 +0530)]
Add hemm3m and symm3m objects
Isuru Fernando [Wed, 2 Aug 2017 10:44:34 +0000 (16:14 +0530)]
Fixes for dynamic_arch. almost there
Martin Kroeker [Wed, 2 Aug 2017 10:03:54 +0000 (12:03 +0200)]
Update trmv_thread.c
Martin Kroeker [Wed, 2 Aug 2017 09:59:17 +0000 (11:59 +0200)]
Merge pull request #1255 from jirutka/travis
Travis: Rewrite config, build and test also on Alpine Linux (musl libc)
Martin Kroeker [Tue, 1 Aug 2017 22:37:58 +0000 (00:37 +0200)]
Make sure that range_n of last thread never exceeds the actual data size when splitting the workload
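The clamping described in this entry can be illustrated with a minimal sketch. This is not the OpenBLAS code; the function name `split_ranges` and its parameters are hypothetical, and only the idea (rounding the per-thread width up, then capping each boundary at the data size) follows the commit message.

```python
def split_ranges(n, num_threads):
    """Return num_threads + 1 boundaries that partition [0, n).

    Hypothetical helper for illustration only, not OpenBLAS code.
    """
    # Rounding the width up is what lets the last boundary overshoot n.
    width = (n + num_threads - 1) // num_threads
    bounds = [0]
    for _ in range(num_threads):
        # Clamp every boundary so no thread's range exceeds the data size.
        bounds.append(min(bounds[-1] + width, n))
    return bounds


print(split_ranges(10, 4))  # [0, 3, 6, 9, 10]
```

Without the `min(...)` clamp, the last boundary for `n = 10` and four threads would be 12, two elements past the end of the data.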
Jakub Jirutka [Fri, 28 Jul 2017 16:08:44 +0000 (18:08 +0200)]
Travis: Allow job LINUX64_MUSL USE_OPENMP=1 to fail
See: https://github.com/xianyi/OpenBLAS/pull/1255#issuecomment-318692183
Jakub Jirutka [Fri, 28 Jul 2017 12:32:17 +0000 (14:32 +0200)]
Travis: Disable some gcc warnings to avoid exceeding Travis limit
See: https://github.com/xianyi/OpenBLAS/pull/1255#issuecomment-318628666
Jakub Jirutka [Fri, 28 Jul 2017 00:31:27 +0000 (02:31 +0200)]
Travis: Build and test also on Alpine Linux (musl libc)
Alpine jobs need sudo (for chroot), so they run on the VM infrastructure.
That's why they are much slower than the other jobs.
Jakub Jirutka [Fri, 28 Jul 2017 00:01:44 +0000 (02:01 +0200)]
Travis: Simplify configuration using Build Stages and APT addon
Using the APT addon has a nice side effect: sudo is no longer needed, so
the jobs can run on Travis's container-based infrastructure, which is much
faster than the VM infrastructure (used when sudo is needed).
The builds have still been running on Ubuntu Precise builders, but the new
default is Trusty. Thus I've explicitly set `dist: precise` to stay on
Precise and keep the build environment unchanged by this commit.
Martin Kroeker [Tue, 1 Aug 2017 18:07:32 +0000 (20:07 +0200)]
Merge pull request #1260 from xianyi/revert-1254-xbmv_range
Revert "Fix calculated range limit exceeding actual data size for last thread"
Isuru Fernando [Tue, 1 Aug 2017 17:53:55 +0000 (23:23 +0530)]
configure kernel_core.h
Martin Kroeker [Tue, 1 Aug 2017 17:28:08 +0000 (19:28 +0200)]
Revert "Fix calculated range limit exceeding actual data size for last thread"
Isuru Fernando [Tue, 1 Aug 2017 17:02:47 +0000 (22:32 +0530)]
configure setparam
Isuru Fernando [Tue, 1 Aug 2017 16:01:55 +0000 (21:31 +0530)]
Support DYNAMIC_ARCH with cmake
Isuru Fernando [Tue, 1 Aug 2017 10:17:14 +0000 (15:47 +0530)]
Fix lapacke copying
Isuru Fernando [Tue, 1 Aug 2017 09:57:19 +0000 (15:27 +0530)]
No need of a temp file for f77blas.h
Isuru Fernando [Tue, 1 Aug 2017 09:40:41 +0000 (15:10 +0530)]
Support out-of-source build
Isuru Fernando [Tue, 1 Aug 2017 09:28:49 +0000 (14:58 +0530)]
Fix installing cblas.h and fix tabs
Martin Kroeker [Tue, 1 Aug 2017 09:23:03 +0000 (11:23 +0200)]
Merge pull request #1257 from martin-frbg/cgroups-prereq
Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds
Isuru Fernando [Tue, 1 Aug 2017 05:32:00 +0000 (11:02 +0530)]
Don't change timestamps
Martin Kroeker [Mon, 31 Jul 2017 19:02:43 +0000 (21:02 +0200)]
Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds
Martin Kroeker [Mon, 31 Jul 2017 15:46:40 +0000 (17:46 +0200)]
Merge pull request #1254 from martin-frbg/xbmv_range
Fix calculated range limit exceeding actual data size for last thread
Isuru Fernando [Sat, 29 Jul 2017 18:30:37 +0000 (00:00 +0530)]
Remove unnecessary line in appveyor
Isuru Fernando [Sat, 29 Jul 2017 18:12:56 +0000 (23:42 +0530)]
Fix vcvarsall call in appveyor
Isuru Fernando [Sat, 29 Jul 2017 18:12:38 +0000 (23:42 +0530)]
Fix copying libopenblas.dll
Isuru Fernando [Sat, 29 Jul 2017 18:00:15 +0000 (23:30 +0530)]
Make ARCH variable a CACHE variable
Isuru Fernando [Sat, 29 Jul 2017 16:46:53 +0000 (22:16 +0530)]
Try adding RC to path
Isuru Fernando [Sat, 29 Jul 2017 16:28:53 +0000 (21:58 +0530)]
vcvarsall in appveyor
Isuru Fernando [Sat, 29 Jul 2017 16:24:32 +0000 (21:54 +0530)]
Fix CMAKE_C_COMPILER in appveyor
Isuru Fernando [Sat, 29 Jul 2017 16:18:49 +0000 (21:48 +0530)]
add --yes to conda in appveyor.yml
Isuru Fernando [Sat, 29 Jul 2017 16:17:15 +0000 (21:47 +0530)]
build clang-cl first
Isuru Fernando [Sat, 29 Jul 2017 16:07:48 +0000 (21:37 +0530)]
Fix appveyor.yml
Isuru Fernando [Sat, 29 Jul 2017 15:46:00 +0000 (21:16 +0530)]
Test clang in appveyor.yml
Isuru Fernando [Sat, 29 Jul 2017 15:30:32 +0000 (21:00 +0530)]
Ninja complains that file openblas.def does not exist
Isuru Fernando [Sat, 29 Jul 2017 15:29:17 +0000 (20:59 +0530)]
clang on windows needs FU=''
Isuru Fernando [Sat, 29 Jul 2017 15:08:16 +0000 (20:38 +0530)]
typedefs only for c
Isuru Fernando [Fri, 28 Jul 2017 06:20:29 +0000 (11:50 +0530)]
Fix complex support for MSVC headers
Isuru Fernando [Fri, 28 Jul 2017 06:19:39 +0000 (11:49 +0530)]
check compiler is msvc instead of msvc
Martin Kroeker [Thu, 27 Jul 2017 22:27:02 +0000 (00:27 +0200)]
Fix range limit exceeding actual data size in last step
Martin Kroeker [Thu, 27 Jul 2017 22:21:53 +0000 (00:21 +0200)]
Fix range limit exceeding data size in last step
Martin Kroeker [Thu, 27 Jul 2017 22:13:24 +0000 (00:13 +0200)]
Fix range exceeding actual data size in quick_divide
Martin Kroeker [Tue, 25 Jul 2017 21:31:57 +0000 (23:31 +0200)]
Merge pull request #1249 from martin-frbg/cgroup
Honor cgroup/cpuset limits when enumerating cpus
Martin Kroeker [Tue, 25 Jul 2017 20:47:34 +0000 (22:47 +0200)]
Honor cgroup/cpuset limits when enumerating cpus
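The log does not say how this is implemented. As a rough illustration of the distinction involved, the sketch below compares the CPUs present on the machine with the CPUs the current process is allowed to run on. It assumes a Linux host, where cpuset restrictions are normally reflected in the process affinity mask; the function name `usable_cpu_count` is hypothetical.

```python
import os


def usable_cpu_count():
    """Count the CPUs this process may actually run on (illustrative only)."""
    # os.sched_getaffinity exists only on some platforms (notably Linux).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: all logical CPUs reported by the system.
    return os.cpu_count() or 1


print("CPUs in system:", os.cpu_count())
print("CPUs usable by this process:", usable_cpu_count())
```

Sizing a thread pool from the second number rather than the first avoids oversubscribing a process that has been confined to a few cores.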
Martin Kroeker [Mon, 24 Jul 2017 14:17:50 +0000 (16:17 +0200)]
Revert "Honor cgroup/cpuset limits when enumerating cpus" (#1246)
Zhang Xianyi [Mon, 24 Jul 2017 04:07:00 +0000 (12:07 +0800)]
Merge pull request #1236 from martin-frbg/l1cache
Use cpuid 4 with subleafs to query L1 cache size on Intel processors
Zhang Xianyi [Mon, 24 Jul 2017 04:06:29 +0000 (12:06 +0800)]
Bump develop version for 0.3.0.
Zhang Xianyi [Mon, 24 Jul 2017 04:03:35 +0000 (12:03 +0800)]
Merge branch 'develop'
0.2.20 version
Zhang Xianyi [Mon, 24 Jul 2017 03:55:10 +0000 (11:55 +0800)]
Update doc for 0.2.20 version.
Zhang Xianyi [Mon, 24 Jul 2017 03:46:52 +0000 (11:46 +0800)]
Merge pull request #1239 from martin-frbg/cgroups
Honor cgroup/cpuset limits when enumerating cpus
Zhang Xianyi [Mon, 24 Jul 2017 03:45:27 +0000 (11:45 +0800)]
Merge pull request #1244 from martin-frbg/micmuc_cimatcopy
Fix complex imatcopy for Trans cases with non-square matrix
Martin Kroeker [Fri, 21 Jul 2017 09:20:15 +0000 (11:20 +0200)]
Use in-place transform shortcut only if matrix is square
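Why the shortcut is limited to square matrices can be seen in a small sketch. It is not the OpenBLAS kernel: it uses real values in a row-major flat list, and the name `transpose` and its signature are hypothetical. Swapping element pairs across the diagonal is only valid when rows equal columns; otherwise the row stride changes and a separate buffer is needed.

```python
def transpose(a, rows, cols):
    """Transpose a row-major flat list and return the result.

    Illustrative only; names and layout are assumptions.
    """
    if rows == cols:
        # Square case: swap pairs across the diagonal in place.
        for i in range(rows):
            for j in range(i + 1, cols):
                a[i * cols + j], a[j * cols + i] = a[j * cols + i], a[i * cols + j]
        return a
    # Non-square case: the output stride differs from the input stride,
    # so pairwise swaps are wrong; copy into a separate buffer instead.
    out = [None] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            out[j * rows + i] = a[i * cols + j]
    return out


print(transpose([1, 2, 3, 4], 2, 2))        # [1, 3, 2, 4]
print(transpose([1, 2, 3, 4, 5, 6], 2, 3))  # [1, 4, 2, 5, 3, 6]
```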
Martin Kroeker [Thu, 20 Jul 2017 18:51:06 +0000 (20:51 +0200)]
Add files via upload
Martin Kroeker [Sat, 15 Jul 2017 20:02:53 +0000 (22:02 +0200)]
Exchange rows and cols in final omatcopy with BlasTrans
This is MicMuc's patch from #899
Martin Kroeker [Sat, 15 Jul 2017 10:48:42 +0000 (12:48 +0200)]
More fixes for silly misedits
Martin Kroeker [Sat, 15 Jul 2017 09:53:28 +0000 (11:53 +0200)]
Fixup braces lost in previous edit
Martin Kroeker [Sat, 15 Jul 2017 08:40:42 +0000 (10:40 +0200)]
Merge branch 'develop' into cgroups
Martin Kroeker [Thu, 13 Jul 2017 20:01:47 +0000 (22:01 +0200)]
Disable ReLAPACK by default (#1238)
* Disable ReLAPACK by default; mention it in final build message if included
* Add files via upload
* Add files via upload
* Add files via upload
Zhang Xianyi [Thu, 13 Jul 2017 12:31:08 +0000 (20:31 +0800)]
Merge pull request #1214 from martin-frbg/relapack
Initial import of ReLAPACK
Zhang Xianyi [Thu, 13 Jul 2017 12:27:37 +0000 (20:27 +0800)]
Merge pull request #1234 from brada4/develop
Fix write past fixed size buffer
Martin Kroeker [Wed, 12 Jul 2017 19:56:23 +0000 (21:56 +0200)]
Add dummy implementation of cpuid_count for the CPUIDEMU case
Martin Kroeker [Wed, 12 Jul 2017 18:43:09 +0000 (20:43 +0200)]
Use cpuid 4 with subleafs to query L1 cache size on Intel processors
Martin Kroeker [Wed, 12 Jul 2017 07:37:55 +0000 (09:37 +0200)]
Merge pull request #1235 from xianyi/revert-1233-cpuid-fix
Revert "Fix unintentional fall-through cases in get_cacheinfo"
Martin Kroeker [Wed, 12 Jul 2017 07:35:11 +0000 (09:35 +0200)]
Revert "Fix unintentional fall-through cases in get_cacheinfo"
Andrew [Tue, 11 Jul 2017 22:59:30 +0000 (00:59 +0200)]
Fix write past fixed size buffer
Martin Kroeker [Tue, 11 Jul 2017 16:48:13 +0000 (18:48 +0200)]
Add files via upload
Martin Kroeker [Tue, 11 Jul 2017 16:42:39 +0000 (18:42 +0200)]
Add files via upload
Martin Kroeker [Tue, 11 Jul 2017 16:27:33 +0000 (18:27 +0200)]
Honor cgroup/cpuset constraints when enumerating cpus
Martin Kroeker [Tue, 11 Jul 2017 15:10:55 +0000 (17:10 +0200)]
Merge pull request #1233 from martin-frbg/cpuid-fix
Fix unintentional fall-through cases in get_cacheinfo
Martin Kroeker [Tue, 11 Jul 2017 13:39:15 +0000 (15:39 +0200)]
Fix unintentional fall-through cases in get_cacheinfo
These appear to be unintended side effects of PR #1091, probably causing #1232
Zhang Xianyi [Mon, 10 Jul 2017 12:04:42 +0000 (20:04 +0800)]
Merge pull request #1226 from ashwinyes/develop_arm_clang_ual_fix
arm: Fix clang compilation for ARMv7
Zhang Xianyi [Mon, 10 Jul 2017 12:03:57 +0000 (20:03 +0800)]
Merge pull request #1221 from ashwinyes/develop_arm_softfp
arm: add support for softfp in arm vfp assembly files
Zhang Xianyi [Mon, 10 Jul 2017 12:02:36 +0000 (20:02 +0800)]
Merge branch 'develop' into develop_arm_softfp
Martin Kroeker [Sun, 9 Jul 2017 11:16:16 +0000 (13:16 +0200)]
Merge pull request #1230 from martin-frbg/rhel5
Add sched_getcpu implementation for pre-2.6 glibc
Martin Kroeker [Sun, 9 Jul 2017 11:15:24 +0000 (13:15 +0200)]
Do not add -lpthread on Android builds (#1229)
* Do not add -lpthread on Android builds
* Do not add -lpthread on Android cmake builds
Martin Kroeker [Sun, 9 Jul 2017 07:45:38 +0000 (09:45 +0200)]
Add sched_getcpu implementation for pre-2.6 glibc
Fixes #1210, compilation on RHEL5 with affinity enabled
Zhang Xianyi [Fri, 7 Jul 2017 07:43:33 +0000 (15:43 +0800)]
Merge pull request #1225 from martin-frbg/stolen_from_wernsaar_fork
fixed syrk_thread.c taken from wernsaar
Ashwin Sekhar T K [Fri, 7 Jul 2017 07:00:42 +0000 (12:30 +0530)]
arm: Fix clang compilation for ARMv7
clang is not recognizing some pre-UAL VFP mnemonics like fnmacs, fnmacd,
fnmuls and fnmuld. Replaced them with equivalent UAL mnemonics which are
vmls.f32, vmls.f64, vnmul.f32 and vnmul.f64 respectively.
Martin Kroeker [Thu, 6 Jul 2017 15:30:12 +0000 (17:30 +0200)]
fixed syrk_thread.c taken from wernsaar
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
Martin Kroeker [Thu, 6 Jul 2017 08:12:00 +0000 (10:12 +0200)]
Handle different object extensions in Makefile
The optimized LAPACK functions from interface use OS-dependent suffixes .o/.obj for the object files, while netlib LAPACK uses .o throughout. ReLAPACK object names have to match in order for function replacement in the growing library file to work.
Zhang Xianyi [Wed, 5 Jul 2017 09:01:03 +0000 (17:01 +0800)]
Link -lm or -lm_hard for Android ARMv7.
Zhang Xianyi [Mon, 3 Jul 2017 05:48:29 +0000 (13:48 +0800)]
Merge pull request #1218 from m-brow/power9
Optimise loads on Power9 LE
Zhang Xianyi [Mon, 3 Jul 2017 05:43:48 +0000 (13:43 +0800)]
Merge pull request #1212 from neilsh-msft/develop
Add Microsoft Windows 10 UWP build support
Martin Kroeker [Sat, 1 Jul 2017 23:46:23 +0000 (01:46 +0200)]
Update Makefile
Martin Kroeker [Sat, 1 Jul 2017 22:50:14 +0000 (00:50 +0200)]
Add files via upload
Ashwin Sekhar T K [Sat, 1 Jul 2017 21:36:36 +0000 (03:06 +0530)]
arm: Remove unnecessary files/code
Since softfp code has been added to all required vfp kernels,
the code for auto-detection of the ABI is no longer required.
The option to force the softfp ABI on the make command line by giving
ARM_SOFTFP_ABI=1 is retained, but there is no need to give this option
anymore.
Also, the newly added C versions of the 4x4/4x2 gemm/trmm kernels are removed.
These are no longer required. Moreover, these kernels have bugs.
Ashwin Sekhar T K [Sat, 1 Jul 2017 21:24:32 +0000 (02:54 +0530)]
arm: add softfp support in zgemm/ztrmm vfp kernels