Zaheer Chothia [Mon, 21 May 2012 10:10:26 +0000 (12:10 +0200)]
Fix FreeBSD build (undefined reference to `pthread_create')
Zhang Xianyi [Mon, 21 May 2012 05:01:00 +0000 (13:01 +0800)]
Fixed #110. Merge branch 'patch-2' of https://github.com/nolta/OpenBLAS into develop
Mike Nolta [Mon, 21 May 2012 00:44:15 +0000 (21:44 -0300)]
FreeBSD: replace EXTRALIB -> FEXTRALIB
Zaheer Chothia [Sun, 20 May 2012 16:11:34 +0000 (18:11 +0200)]
Fix Fortran compiler detection
- Test with '-x' operator to ensure file is executable.
- 'break' is not a valid Perl keyword.
Zaheer Chothia [Sun, 20 May 2012 16:09:35 +0000 (18:09 +0200)]
Respect C compiler set on the command line or inherited from the environment
Zhang Xianyi [Sun, 20 May 2012 04:06:04 +0000 (12:06 +0800)]
Merge branch 'patch-1' of https://github.com/nolta/OpenBLAS into develop
Mike Nolta [Sun, 20 May 2012 03:49:38 +0000 (00:49 -0300)]
fix 'sched_yield' warnings on FreeBSD,NetBSD
Zaheer Chothia [Wed, 16 May 2012 09:24:24 +0000 (11:24 +0200)]
Symbol list: document how LAPACKE exports are derived and synchronize with lapack-3.4.1
This change adds the missing LAPACKE_[zc]syr routines but does not remove any exported functions.
Zaheer Chothia [Tue, 15 May 2012 21:58:22 +0000 (23:58 +0200)]
Fixed #107. Export missing LAPACK auxiliary routines (ALLAUX, SCLAUX, DZLAUX)
Added some documentation on how the symbol list is derived and synchronized with
lapack-3.4.1 to minimize the differences.
Zhang Xianyi [Sun, 13 May 2012 03:43:29 +0000 (11:43 +0800)]
Refs #106. Fixed wget and md5 bug on FreeBSD and NetBSD.
Xianyi Zhang [Thu, 10 May 2012 05:01:35 +0000 (13:01 +0800)]
Refs #105. Export missing LAPACK functions in shared library.
They are as follows:
slabad, dlabad,
slacpy, dlacpy,
slamch, dlamch,
slartg, slartgp, slartgs, dlartg, dlartgp, dlartgs,
slascl, dlascl,
slaset, dlaset.
Xianyi Zhang [Tue, 8 May 2012 15:50:46 +0000 (23:50 +0800)]
Refs #85 #104. Use patch instead of git to apply this segfaults.patch.
Xianyi Zhang [Mon, 7 May 2012 08:38:44 +0000 (16:38 +0800)]
Refs #85 #104. Disable my_bind to fix this segfault issue.
Xianyi Zhang [Thu, 3 May 2012 12:05:34 +0000 (20:05 +0800)]
refs #103 Increase GEMM_MULTITHREAD_THRESHOLD to 50.
Xianyi Zhang [Thu, 3 May 2012 12:00:40 +0000 (05:00 -0700)]
Merge pull request #104 from aeberspaecher/develop
Fixed #85. Add the patch for segfaults on kernel 2.6.32 and add documentation accordingly.
Alexander Eberspächer [Wed, 2 May 2012 10:03:07 +0000 (12:03 +0200)]
Add note on compiler warnings for the segfaults patch.
Alexander Eberspächer [Wed, 2 May 2012 09:33:06 +0000 (11:33 +0200)]
Add Xianyi's patch for segfaults on kernel 2.6.32 and add documentation
accordingly.
Xianyi Zhang [Mon, 30 Apr 2012 05:07:14 +0000 (13:07 +0800)]
Merge branch 'develop'
Xianyi Zhang [Mon, 30 Apr 2012 05:03:34 +0000 (13:03 +0800)]
Fixed #102. Export the missing LAPACK functions (slapy2,slapy3,dlapy2,dlapy3) in shared library.
Xianyi Zhang [Sun, 29 Apr 2012 10:47:26 +0000 (18:47 +0800)]
Merge branch 'release-0.1.1' into develop
Xianyi Zhang [Sun, 29 Apr 2012 10:41:21 +0000 (18:41 +0800)]
Merge branch 'release-0.1.1'
Xianyi Zhang [Sun, 29 Apr 2012 10:40:24 +0000 (18:40 +0800)]
Refs #91. Updated the doc for 0.1.1 version.
Xianyi Zhang [Sat, 28 Apr 2012 04:33:56 +0000 (12:33 +0800)]
Fixed #101. Install the missing lapacke header with LAPACK-3.4.1. Thanks to Zaheer for this patch.
Zhang Xianyi [Fri, 27 Apr 2012 03:15:24 +0000 (11:15 +0800)]
Fixed the bug where NO_CBLAS=1 disabled exporting LAPACKE functions in the shared library.
Zaheer Chothia [Thu, 26 Apr 2012 20:13:18 +0000 (21:13 +0100)]
Refs #99. c_check/f_check: strip quotes from detected flags
Zhang Xianyi [Fri, 27 Apr 2012 01:55:21 +0000 (09:55 +0800)]
Fixed #98 updated MD5 for new LAPACK 3.4.1 version on netlib.org.
Xianyi Zhang [Thu, 26 Apr 2012 08:50:57 +0000 (16:50 +0800)]
Fixed #96 a SEGFAULT bug in samax on x86.
Xianyi Zhang [Thu, 26 Apr 2012 08:40:44 +0000 (16:40 +0800)]
Added the missing test_amax.c file.
Xianyi Zhang [Thu, 26 Apr 2012 08:17:17 +0000 (16:17 +0800)]
Added the test case for samax.
Xianyi Zhang [Thu, 26 Apr 2012 07:54:15 +0000 (15:54 +0800)]
Fixed the utest bug for drotmg.
Xianyi Zhang [Thu, 26 Apr 2012 07:39:03 +0000 (15:39 +0800)]
Automatically download CUnit 2.1.2-2 version from SF.net.
Zhang Xianyi [Tue, 24 Apr 2012 04:03:08 +0000 (12:03 +0800)]
Fixed the LAPACKE building bug on Mac OSX.
Zaheer Chothia [Sun, 22 Apr 2012 19:16:03 +0000 (21:16 +0200)]
Refs #95 cblas: compatibility for compilers without C99 complex number support (e.g. Visual Studio)
Xianyi Zhang [Mon, 23 Apr 2012 09:38:54 +0000 (17:38 +0800)]
Refs #94. Auto-detecting Intel Xeon E7 Westmere-EX.
Zaheer Chothia [Sun, 22 Apr 2012 20:38:10 +0000 (22:38 +0200)]
Refs #93. Upgraded LAPACK to 3.4.1 version.
Xianyi Zhang [Mon, 23 Apr 2012 03:26:42 +0000 (11:26 +0800)]
Refs #90 auto detect Intel Sandy Bridge Core i7-3820
Xianyi Zhang [Fri, 13 Apr 2012 15:12:06 +0000 (23:12 +0800)]
Refs #88. Fixed the build bug in LAPACKE, the C interface to LAPACK.
Zaheer Chothia [Sat, 7 Apr 2012 08:40:46 +0000 (10:40 +0200)]
Fixed #88. Build LAPACKE: C Interface to LAPACK.
Zaheer Chothia [Sat, 7 Apr 2012 08:39:09 +0000 (10:39 +0200)]
Fixed #87. Export missing and new LAPACK 3.4.0 functions in shared library.
Xianyi Zhang [Thu, 5 Apr 2012 10:16:18 +0000 (18:16 +0800)]
Refs #86. Test alpha=NaN in x86/x86_64 dscale.
Xianyi Zhang [Thu, 5 Apr 2012 08:21:40 +0000 (16:21 +0800)]
Fixed #84 the MD5 command line bug on Mac OSX.
Xianyi Zhang [Tue, 27 Mar 2012 06:17:13 +0000 (14:17 +0800)]
Fixed a typo in license file.
Xianyi Zhang [Fri, 23 Mar 2012 10:53:51 +0000 (18:53 +0800)]
Merge branch 'release-0.1.0' into develop
Xianyi Zhang [Fri, 23 Mar 2012 10:52:40 +0000 (18:52 +0800)]
Merge branch 'release-0.1.0'
Xianyi Zhang [Fri, 23 Mar 2012 10:45:54 +0000 (18:45 +0800)]
Ref #70 Updated Changelog.txt.
Xianyi Zhang [Fri, 23 Mar 2012 10:17:12 +0000 (18:17 +0800)]
Ref #82. Disable outputting debug information in alloc_mmap.
Xianyi Zhang [Fri, 23 Mar 2012 07:15:05 +0000 (15:15 +0800)]
Ref #82 fixed the bug in my_mbind function.
Xianyi Zhang [Thu, 22 Mar 2012 17:29:05 +0000 (01:29 +0800)]
Updated the version to 0.1.0.
Xianyi Zhang [Thu, 22 Mar 2012 17:26:44 +0000 (01:26 +0800)]
Merge branch 'loongson3b' into release-0.1.0
Xianyi Zhang [Thu, 22 Mar 2012 17:26:27 +0000 (01:26 +0800)]
Merge branch 'loongson3a' into release-0.1.0
Xianyi Zhang [Thu, 22 Mar 2012 17:17:41 +0000 (01:17 +0800)]
Ref #79 Added the GEMM_MULTITHREAD_THRESHOLD flag to use a single thread in the gemm function for small matrices.
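The idea behind this flag can be sketched as follows. This is only an illustration, not the OpenBLAS code: the log does not state the exact size criterion, so the volume comparison below, the function name gemm_choose_threads, and the default of 50 (the value a later entry in this log mentions) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed default; the real build takes this value from the Makefile. */
#ifndef GEMM_MULTITHREAD_THRESHOLD
#define GEMM_MULTITHREAD_THRESHOLD 50
#endif

/*
 * Hypothetical helper: pick the thread count for an m x n x k GEMM.
 * Assumed rule: the problem counts as "small" when m*n*k is no larger
 * than the cube of the threshold, in which case one thread is used.
 */
static int gemm_choose_threads(size_t m, size_t n, size_t k, int max_threads)
{
    assert(max_threads >= 1);

    /* Use doubles so the product cannot overflow an integer type. */
    double volume = (double)m * (double)n * (double)k;
    double limit  = (double)GEMM_MULTITHREAD_THRESHOLD
                  * (double)GEMM_MULTITHREAD_THRESHOLD
                  * (double)GEMM_MULTITHREAD_THRESHOLD;

    return (volume <= limit) ? 1 : max_threads;
}
```

For small matrices the cost of waking and synchronizing worker threads can exceed the arithmetic itself, which is why a cutoff of this kind helps.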
Xianyi Zhang [Thu, 22 Mar 2012 16:06:29 +0000 (00:06 +0800)]
Merge branch 'fix-crash_on_P4' into develop
Xianyi Zhang [Thu, 22 Mar 2012 16:00:13 +0000 (00:00 +0800)]
Merge branch 'master' into develop
Xianyi Zhang [Wed, 21 Mar 2012 15:57:09 +0000 (23:57 +0800)]
Refs #81. Added LIBNAMESUFFIX flag in Makefile.rule. The user can use this flag to control the library name, e.g. libopenblas.a, libopenblas_ifort.a or libopenblas_omp.a.
unknown [Mon, 19 Mar 2012 09:56:22 +0000 (17:56 +0800)]
refs #80. Used GEMV SSE2 kernels on x86.
Xianyi Zhang [Fri, 16 Mar 2012 12:29:39 +0000 (20:29 +0800)]
ref #80. On a P4 CPU with 32-bit Windows XP, Octave crashed with OpenBLAS. Workaround: use the netlib reference gemv instead of our own functions.
For example, make USE_NETLIB_GEMV=1
Xianyi Zhang [Wed, 14 Mar 2012 17:07:34 +0000 (01:07 +0800)]
Set shared library soname in Linux.
Xianyi Zhang [Wed, 14 Mar 2012 09:08:21 +0000 (17:08 +0800)]
Export CBLAS functions in the Windows DLL.
Xianyi Zhang [Mon, 12 Mar 2012 10:20:37 +0000 (18:20 +0800)]
Refs #74. Added -lgfortran into generating shared library.
Xianyi Zhang [Wed, 7 Mar 2012 15:14:25 +0000 (23:14 +0800)]
Check new LAPACK version in generating shared library.
Xianyi Zhang [Mon, 20 Feb 2012 15:36:58 +0000 (23:36 +0800)]
Improved the makefile for Intel compiler.
Xianyi Zhang [Mon, 20 Feb 2012 01:06:43 +0000 (09:06 +0800)]
Updated the Changelog.
Xianyi Zhang [Mon, 20 Feb 2012 00:44:35 +0000 (16:44 -0800)]
Merge pull request #77 from nolta/master
fix #49 the sched_yield warnings bug on Mac OS X.
Mike Nolta [Sun, 19 Feb 2012 19:07:34 +0000 (14:07 -0500)]
fix #49
Xianyi Zhang [Sun, 19 Feb 2012 15:11:06 +0000 (23:11 +0800)]
Merge branch 'hotfix-0.1alpha2.5' into develop
Xianyi Zhang [Sun, 19 Feb 2012 14:56:06 +0000 (22:56 +0800)]
Merge branch 'hotfix-0.1alpha2.5'
Xianyi Zhang [Sun, 19 Feb 2012 14:55:31 +0000 (22:55 +0800)]
Released 0.1 alpha 2.5. Updated the documents.
Xianyi Zhang [Sun, 19 Feb 2012 14:31:09 +0000 (22:31 +0800)]
Merge branch 'develop' into hotfix-0.1alpha2.5
Xianyi Zhang [Mon, 13 Feb 2012 11:20:35 +0000 (19:20 +0800)]
refs #69. Auto-detect Intel Core i5/i7 (Sandy Bridge) CPU with Nehalem assembly kernels.
Xianyi Zhang [Fri, 20 Jan 2012 13:32:13 +0000 (21:32 +0800)]
Merge branch 'master' into develop
traz [Wed, 11 Jan 2012 16:05:39 +0000 (16:05 +0000)]
Modify P Q R size of Loongson3b.
Wang Qian [Tue, 10 Jan 2012 17:16:13 +0000 (17:16 +0000)]
Appending gemmkernel and trmmkernel C code in kernel/generic. This code can be used on a new platform which does not have optimized assembly kernels.
Xianyi Zhang [Sun, 1 Jan 2012 13:57:25 +0000 (05:57 -0800)]
Merge pull request #76 from StefanKarpinski/patch-1
Fix #68: don't require SystemStubs on OS X. SystemStubs does not exist on Lion.
Stefan Karpinski [Thu, 29 Dec 2011 04:53:20 +0000 (23:53 -0500)]
Fix #68: don't require SystemStubs on OS X.
traz [Tue, 6 Dec 2011 13:49:39 +0000 (13:49 +0000)]
Merge remote branch 'origin/loongson3a' into loongson3b
traz [Mon, 5 Dec 2011 14:54:25 +0000 (14:54 +0000)]
Adding detection of complex situations in symm.c, otherwise the buffer address of sb will overlap the end of sa.
Wang Qian [Thu, 1 Dec 2011 16:33:11 +0000 (16:33 +0000)]
Adding n32 multiple threads condition.
Xianyi Zhang [Mon, 28 Nov 2011 07:31:46 +0000 (15:31 +0800)]
Fixed a typo in Makefile.
Xianyi Zhang [Mon, 28 Nov 2011 07:28:54 +0000 (15:28 +0800)]
Merge branch 'lapack_3.4.0' into develop
Xianyi Zhang [Mon, 28 Nov 2011 07:28:22 +0000 (15:28 +0800)]
Refs #72. Upgraded LAPACK to 3.4.0 version.
Wang Qian [Fri, 25 Nov 2011 11:20:25 +0000 (11:20 +0000)]
BLAS3 used standard MIPS instructions without extensions on Loongson 3B.
Wang Qian [Wed, 23 Nov 2011 18:40:35 +0000 (18:40 +0000)]
Change the block size on Loongson 3B.
Xianyi Zhang [Wed, 23 Nov 2011 17:17:41 +0000 (17:17 +0000)]
Fixed mbind bug on Loongson 3B. Check the return value of my_mbind function.
Xianyi Zhang [Thu, 17 Nov 2011 16:46:26 +0000 (16:46 +0000)]
Disable using simple thread level3 to fix a bug on Loongson 3B.
Xianyi Zhang [Fri, 11 Nov 2011 17:49:41 +0000 (17:49 +0000)]
Enable thread affinity on Loongson 3B. Fixed the bug of reading cycle counter.
On Loongson 3A and 3B, the CPU core increments the counter every 2 cycles by default.
Xianyi Zhang [Fri, 11 Nov 2011 14:26:49 +0000 (14:26 +0000)]
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3b
traz [Thu, 10 Nov 2011 15:38:48 +0000 (15:38 +0000)]
Add conjugate condition to gemv.
Xianyi Zhang [Wed, 9 Nov 2011 19:28:22 +0000 (19:28 +0000)]
Support detecting ICT Loongson-3B CPU.
Xianyi Zhang [Wed, 9 Nov 2011 19:08:29 +0000 (19:08 +0000)]
Merge branch 'develop' of github.com:xianyi/OpenBLAS into loongson3b
traz [Fri, 4 Nov 2011 19:32:21 +0000 (19:32 +0000)]
Fix the compute error of gemv when incx and incy are negative numbers.
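For reference, the standard BLAS convention for negative increments can be sketched in plain C. This is not the Loongson kernel: ref_dgemv_n is a hypothetical, unoptimized routine for the non-transposed, column-major case, shown only to make the indexing rule concrete. With a negative increment, logical element 0 of the vector is stored at the far end of the array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reference routine: y := alpha * A * x + y, where A is an
 * m x n column-major matrix with leading dimension lda.
 * For a negative increment, BLAS places logical element 0 at offset
 * (len - 1) * |inc|, so the base pointer is shifted before indexing.
 */
static void ref_dgemv_n(int m, int n, double alpha,
                        const double *a, int lda,
                        const double *x, int incx,
                        double *y, int incy)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 0 ? m : 1));
    assert(incx != 0 && incy != 0);
    if (m == 0 || n == 0)
        return;

    /* Shift the base so that xp[j * incx] stays inside the array. */
    const double *xp = (incx < 0) ? x - (ptrdiff_t)(n - 1) * incx : x;
    double *yp       = (incy < 0) ? y - (ptrdiff_t)(m - 1) * incy : y;

    for (int j = 0; j < n; j++) {
        double t = alpha * xp[(ptrdiff_t)j * incx];
        for (int i = 0; i < m; i++)
            yp[(ptrdiff_t)i * incy] += t * a[i + (ptrdiff_t)j * lda];
    }
}
```

A kernel that walks x or y from the unshifted base pointer with a negative stride reads or writes outside the vector, which is the kind of error an entry like this one typically addresses.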
traz [Thu, 3 Nov 2011 13:53:48 +0000 (13:53 +0000)]
Add complete gemv function on Loongson3a platform.
traits [Tue, 18 Oct 2011 10:44:23 +0000 (18:44 +0800)]
Fixed #66 the bug in zgemv kernel with transpose matrix on 64-bit MinGW (Windows).
traits [Tue, 18 Oct 2011 02:23:17 +0000 (10:23 +0800)]
Ref #65. Fixed 64-bit Windows calling convention bug in cdot and zdot.
According to the 64-bit Windows calling convention, the return value of the cdot kernel is in %rax instead of %xmm0.
In zdot, the caller allocates memory for the return value and passes its address as the hidden first parameter. Thus, the callee (zdot) should write the result to this memory and return its address in %rax.
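The two return paths described above can be modeled in portable C. This is a sketch, not the assembly kernels: the type and function names (cfloat, zdouble, pack_cfloat, unpack_cfloat, zdotu_hidden) are hypothetical. The 8-byte single-precision result is modeled as a 64-bit integer, standing in for %rax, and the hidden first parameter of the double-precision case is written out explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { float re, im; } cfloat;    /* 8 bytes: returned in RAX    */
typedef struct { double re, im; } zdouble;  /* 16 bytes: returned via memory */

_Static_assert(sizeof(cfloat) == sizeof(uint64_t),
               "sketch assumes an 8-byte single-precision complex");

/* Model of the cdot case: the 8-byte result travels as one integer. */
static uint64_t pack_cfloat(cfloat v)
{
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

static cfloat unpack_cfloat(uint64_t bits)
{
    cfloat v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/*
 * Model of the zdot case (unconjugated dot product): the caller owns
 * the result buffer and passes its address first; the callee stores the
 * result there and returns that same address.
 */
static zdouble *zdotu_hidden(zdouble *ret, int n,
                             const zdouble *x, const zdouble *y)
{
    zdouble acc = {0.0, 0.0};
    for (int i = 0; i < n; i++) {
        acc.re += x[i].re * y[i].re - x[i].im * y[i].im;
        acc.im += x[i].re * y[i].im + x[i].im * y[i].re;
    }
    *ret = acc;
    return ret;
}
```

A C compiler targeting 64-bit Windows generates this plumbing automatically; hand-written assembly kernels have to follow it explicitly, which is where the bug arose.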
Xianyi Zhang [Sun, 16 Oct 2011 14:56:19 +0000 (22:56 +0800)]
Ref #62. In OpenMP implementation, check the return value of omp_get_max_threads().
This ensures the return value is the same as blas_cpu_numbers, an internal global variable that stores the number of threads in OpenBLAS.
traits [Sun, 9 Oct 2011 09:25:44 +0000 (17:25 +0800)]
Ref #63. Fixed the DLL generation bug on mingw-w64.
Xianyi Zhang [Sun, 9 Oct 2011 07:14:48 +0000 (15:14 +0800)]
ref #62. Added a user-friendly message for USE_OPENMP=1. Users should set OMP_NUM_THREADS.
When OpenBLAS is compiled with USE_OPENMP=1, it ignores the OPENBLAS_NUM_THREADS and GOTO_NUM_THREADS environment variables. Therefore, you should use OMP_NUM_THREADS.
Without OMP_NUM_THREADS set, each process uses the maximum number of threads on a computing node. Thus, if there are 2 processes on the node, their threads contend with each other for the CPU cores. As a result, the application will hang.
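The selection rule described in this entry can be sketched as a small pure function. This is an illustration under assumptions, not the library's code: resolve_num_threads and parse_positive are hypothetical names, and the precedence of OPENBLAS_NUM_THREADS over GOTO_NUM_THREADS in the non-OpenMP case is an assumption the entry does not state.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a strictly positive decimal integer; return 0 if unset or invalid. */
static int parse_positive(const char *s)
{
    if (s == NULL)
        return 0;
    char *end;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v < 1 || v > INT_MAX)
        return 0;
    return (int)v;
}

/*
 * Hypothetical helper: decide the thread count from environment values.
 * With OpenMP, only OMP_NUM_THREADS is honored. Without it, this sketch
 * tries OPENBLAS_NUM_THREADS, then GOTO_NUM_THREADS (assumed order).
 * If nothing usable is set, every core on the node is used.
 */
static int resolve_num_threads(int built_with_openmp,
                               const char *omp_env,
                               const char *openblas_env,
                               const char *goto_env,
                               int ncores)
{
    assert(ncores >= 1);

    int n;
    if (built_with_openmp) {
        n = parse_positive(omp_env);
    } else {
        n = parse_positive(openblas_env);
        if (n == 0)
            n = parse_positive(goto_env);
    }
    return (n != 0) ? n : ncores;
}
```

In a real program the three strings would come from getenv("OMP_NUM_THREADS"), getenv("OPENBLAS_NUM_THREADS") and getenv("GOTO_NUM_THREADS"). The fallback to ncores is what makes two unconfigured processes on one node oversubscribe the cores.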
traz [Mon, 26 Sep 2011 15:21:45 +0000 (15:21 +0000)]
Adding conditional compilation (#if defined(LOONGSON3A)) to avoid affecting the performance of other platforms.
traz [Fri, 23 Sep 2011 20:59:48 +0000 (20:59 +0000)]
Modify the aligned addresses of sa and sb to improve multi-threaded performance.
Xianyi [Sun, 18 Sep 2011 09:00:29 +0000 (17:00 +0800)]
Merge branch 'hotfix-0.1alpha2.4' into develop
Xianyi [Sun, 18 Sep 2011 08:57:28 +0000 (16:57 +0800)]
Merge branch 'hotfix-0.1alpha2.4'