platform/upstream/openblas.git
11 years agoRefs #205. Merge boegel's codes about downloading LAPACK.
Zhang Xianyi [Fri, 24 May 2013 07:29:10 +0000 (15:29 +0800)]
Refs #205. Merge boegel's codes about downloading LAPACK.

11 years agoFixed #199. Saved USE_THREAD switch for make install.
Zhang Xianyi [Fri, 24 May 2013 07:15:52 +0000 (15:15 +0800)]
Fixed #199. Saved USE_THREAD switch for make install.

11 years agoRefs #220. Support Power7 by old Power6 kernels.
Zhang Xianyi [Tue, 21 May 2013 14:59:45 +0000 (22:59 +0800)]
Refs #220. Support Power7 by old Power6 kernels.

11 years agoRefs #215. Fixed the compatibility between <complex.h> and <complex> in C++.
Zhang Xianyi [Fri, 17 May 2013 08:41:05 +0000 (16:41 +0800)]
Refs #215. Fixed the compatibility between <complex.h> and <complex> in C++.

11 years agoRefs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4.
Zhang Xianyi [Fri, 3 May 2013 01:08:54 +0000 (09:08 +0800)]
Refs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4.

11 years agoRefs #210. Disable checking /lib/libpthread.so*.
Zhang Xianyi [Sat, 27 Apr 2013 07:02:04 +0000 (15:02 +0800)]
Refs #210. Disable checking /lib/libpthread.so*.

11 years agoUpdated the mailing list for OpenBLAS.
Xianyi Zhang [Wed, 24 Apr 2013 16:45:42 +0000 (00:45 +0800)]
Updated the mailing list for OpenBLAS.

11 years agoUpdated the mailing list for OpenBLAS.
Xianyi Zhang [Wed, 24 Apr 2013 16:44:22 +0000 (00:44 +0800)]
Updated the mailing list for OpenBLAS.

11 years agoMerge pull request #213 from wernsaar/develop
Zhang Xianyi [Thu, 18 Apr 2013 06:56:09 +0000 (23:56 -0700)]
Merge pull request #213 from wernsaar/develop

Merged some improvements into dgemm_kernel_4x4_bulldozer.S.

11 years agoMerged some improvements into dgemm_kernel_4x4_bulldozer.S.
wernsaar [Tue, 16 Apr 2013 17:05:06 +0000 (19:05 +0200)]
Merged some improvements into dgemm_kernel_4x4_bulldozer.S.
Changed the copy functions to generic to solve prefetch conflicts

11 years agoAdded NO_PARALLEL_MAKE flag to disable parallel make.
Zhang Xianyi [Mon, 15 Apr 2013 13:37:30 +0000 (21:37 +0800)]
Added NO_PARALLEL_MAKE flag to disable parallel make.

11 years agoMerge pull request #211 from wernsaar/develop
Zhang Xianyi [Mon, 15 Apr 2013 07:20:55 +0000 (00:20 -0700)]
Merge pull request #211 from wernsaar/develop

New version of dgemm_kernel_4x4_bulldozer.S

11 years agoNew version of dgemm_kernel_4x4_bulldozer.S
wernsaar [Fri, 12 Apr 2013 15:55:51 +0000 (17:55 +0200)]
New version of dgemm_kernel_4x4_bulldozer.S
The peak performance with 8 cores is now 90 GFlops

11 years agoRefs #209. Export the missing cblas_cdotc_sub functions.
Zhang Xianyi [Mon, 8 Apr 2013 15:21:28 +0000 (23:21 +0800)]
Refs #209. Export the missing cblas_cdotc_sub functions.

11 years agoMerge pull request #206 from wlbksy/patch-1
Zhang Xianyi [Sat, 23 Mar 2013 16:57:41 +0000 (09:57 -0700)]
Merge pull request #206 from wlbksy/patch-1

Fix #204 wget in mingw/msys sometimes download file with trailing name,

11 years agoFix #204
wlbksy [Sat, 23 Mar 2013 06:41:26 +0000 (14:41 +0800)]
Fix #204

11 years agoadjusted Makefile to allow for provided required LAPACK source files rather than...
Kenneth Hoste [Fri, 22 Mar 2013 18:45:11 +0000 (19:45 +0100)]
adjusted Makefile to allow for provided required LAPACK source files rather than downloading them

11 years agoMerge pull request #201 from Explorer09/develop
Zhang Xianyi [Mon, 18 Mar 2013 14:31:30 +0000 (07:31 -0700)]
Merge pull request #201 from Explorer09/develop

11 years agogetarch.c: Minor re-ordering of architecture list
Explorer09 [Sun, 17 Mar 2013 15:09:23 +0000 (23:09 +0800)]
getarch.c: Minor re-ordering of architecture list

11 years agogetarch.c: Minor re-ordering of architecture list
Explorer09 [Sun, 17 Mar 2013 15:07:48 +0000 (23:07 +0800)]
getarch.c: Minor re-ordering of architecture list

11 years agoTargetList.txt: minor re-ordering
Explorer09 [Sun, 17 Mar 2013 15:03:05 +0000 (23:03 +0800)]
TargetList.txt: minor re-ordering

11 years agoTypo correction in README.md
Explorer09 [Sun, 17 Mar 2013 14:48:24 +0000 (22:48 +0800)]
Typo correction in README.md

11 years agoOverride CFLAGS in LAPACK make.in.
Zhang Xianyi [Sat, 9 Mar 2013 17:01:16 +0000 (01:01 +0800)]
Override CFLAGS in LAPACK make.in.

11 years agoFixed the Windows x86_64 ABI bug in s/daxpy kernels.
Zhang Xianyi [Fri, 8 Mar 2013 14:28:34 +0000 (22:28 +0800)]
Fixed the Windows x86_64 ABI bug in s/daxpy kernels.

11 years agoMerge pull request #198 from wernsaar/develop
Zhang Xianyi [Wed, 6 Mar 2013 21:39:53 +0000 (13:39 -0800)]
Merge pull request #198 from wernsaar/develop

new optimization of dgemm kernel for bulldozer: 10% performance increase

11 years agonew optimization of dgemm kernel for bulldozer: 10% performance increase
wernsaar [Wed, 6 Mar 2013 16:26:03 +0000 (17:26 +0100)]
new optimization of dgemm kernel for bulldozer: 10% performance increase

11 years agoMerge pull request #197 from wernsaar/develop
Zhang Xianyi [Wed, 6 Mar 2013 09:11:08 +0000 (01:11 -0800)]
Merge pull request #197 from wernsaar/develop

optimized again bulldozer dgemm kernel

11 years agooptimized again bulldozer dgemm kernel
wernsaar [Tue, 5 Mar 2013 18:51:37 +0000 (19:51 +0100)]
optimized again bulldozer dgemm kernel

11 years agoMerge pull request #195 from wernsaar/develop
Zhang Xianyi [Tue, 5 Mar 2013 13:35:42 +0000 (05:35 -0800)]
Merge pull request #195 from wernsaar/develop

Develop dgemm for bulldozer

11 years agonew dgemm_kernel for bulldozer
wernsaar [Mon, 4 Mar 2013 16:37:38 +0000 (17:37 +0100)]
new dgemm_kernel for bulldozer

11 years agoMerge branch 'develop' v0.2.6
Zhang Xianyi [Sat, 2 Mar 2013 06:42:06 +0000 (14:42 +0800)]
Merge branch 'develop'

11 years agoRefs #194. Export the missing LAPACK s/dlamc3 functions.
Zhang Xianyi [Sat, 2 Mar 2013 06:41:18 +0000 (14:41 +0800)]
Refs #194. Export the missing LAPACK s/dlamc3 functions.

11 years agoMerge branch 'develop'
Zhang Xianyi [Sat, 2 Mar 2013 06:24:23 +0000 (14:24 +0800)]
Merge branch 'develop'

11 years agoUpdated the doc for 0.2.6 version.
Zhang Xianyi [Sat, 2 Mar 2013 06:22:27 +0000 (14:22 +0800)]
Updated the doc for 0.2.6 version.

11 years agoImproved the message printed when the OS doesn't support AVX.
Zhang Xianyi [Sat, 2 Mar 2013 06:15:54 +0000 (14:15 +0800)]
Improved the message printed when the OS doesn't support AVX.

11 years agoIn OpenMP threading, preallocate the thread buffer instead of allocating the buffer...
Zhang Xianyi [Fri, 1 Mar 2013 06:36:47 +0000 (14:36 +0800)]
In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly.

11 years agoRefs #174. Return sb pointer when OpenMP or Windows.
Zhang Xianyi [Mon, 25 Feb 2013 16:48:21 +0000 (00:48 +0800)]
Refs #174. Return sb pointer when OpenMP or Windows.

11 years agoFixed the overflow bug in single-threaded Cholesky factorization.
Zhang Xianyi [Sat, 23 Feb 2013 04:51:13 +0000 (12:51 +0800)]
Fixed the overflow bug in single-threaded Cholesky factorization.

11 years agoRefs #174. Fixed the overflowing buffer bug of multithreading hbmv and sbmv.
Zhang Xianyi [Wed, 13 Feb 2013 08:05:58 +0000 (16:05 +0800)]
Refs #174. Fixed the overflowing buffer bug of multithreading hbmv and sbmv.

Instead of using thread 0 buffer, each thread uses its own sb buffer.
Thus, it can avoid overflowing thread 0 buffer.

11 years agoMerge branch 'bulldozer' into develop
Zhang Xianyi [Sat, 9 Feb 2013 17:19:42 +0000 (01:19 +0800)]
Merge branch 'bulldozer' into develop

11 years agoMissing line continuation -- follow-up to last commit (64ad8b9809).
Zaheer Chothia [Fri, 1 Feb 2013 08:34:12 +0000 (09:34 +0100)]
Missing line continuation -- follow-up to last commit (64ad8b9809).

11 years agoRefs #193. Don't use C99 complex numbers when building C++ code.
Zaheer Chothia [Fri, 1 Feb 2013 08:24:44 +0000 (09:24 +0100)]
Refs #193. Don't use C99 complex numbers when building C++ code.

11 years agoRefs #193. cblas: move #include out of extern "C" block.
Zaheer Chothia [Thu, 31 Jan 2013 07:48:27 +0000 (08:48 +0100)]
Refs #193. cblas: move #include out of extern "C" block.

Standard headers may contain C++ templates which are not permitted inside an
extern "C" block. This might be the case when we include <complex.h>.
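
The header layout this commit describes can be sketched as follows. This is a minimal illustration, not the actual cblas.h: `demo_dot` is a hypothetical stand-in for a cblas-style prototype. The point is only that standard headers are included at file scope, before the `extern "C"` block opens, so a C++ compiler never sees their templates inside it.

```c
/* Standard headers come first, at file scope, so that anything they
   declare (including C++ templates when compiled as C++) stays outside
   the extern "C" block. */
#include <assert.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical stand-in for a cblas-style prototype. */
double demo_dot(size_t n, const double *x, const double *y);

#ifdef __cplusplus
}
#endif

/* Plain dot product, included only so the sketch is a complete unit. */
double demo_dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

The same file builds as C and as C++, because only the prototype sits inside the linkage block.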

11 years agoRefs #189. Fixed the bug of s/cdot about invalid reading NAN on x86_64.
Zhang Xianyi [Fri, 25 Jan 2013 08:18:27 +0000 (16:18 +0800)]
Refs #189. Fixed the bug of s/cdot about invalid reading NAN on x86_64.

11 years agoRefs #187. Use perl to generate cblas_noconst.h instead of sed.
Zhang Xianyi [Mon, 21 Jan 2013 16:29:54 +0000 (00:29 +0800)]
Refs #187. Use perl to generate cblas_noconst.h instead of sed.

Thanks to Dan Povey for the patch. https://github.com/xianyi/OpenBLAS/issues/187

11 years agoRefs #187. Use binary code for xgetbv, which is compatible with old compiler.
Zhang Xianyi [Mon, 21 Jan 2013 16:18:21 +0000 (00:18 +0800)]
Refs #187. Use binary code for xgetbv, which is compatible with old compiler.

11 years agoRefs #185. Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!
Zaheer Chothia [Sun, 20 Jan 2013 20:53:52 +0000 (21:53 +0100)]
Refs #185. Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!

The 'const' modifications were done automatically using these scripts:
https://kaldi.svn.sourceforge.net/svnroot/kaldi/sandbox/dan/tools/for_openblas

11 years agoRefs #154. Fixed gemv_t bug about overflow 16MB buffer on x86.
Zhang Xianyi [Sun, 20 Jan 2013 13:22:12 +0000 (21:22 +0800)]
Refs #154. Fixed gemv_t bug about overflow 16MB buffer on x86.

11 years agocblas: typedef enums for improved compatibility with Intel MKL.
Zaheer Chothia [Mon, 25 Jun 2012 11:51:46 +0000 (13:51 +0200)]
cblas: typedef enums for improved compatibility with Intel MKL.

Netlib style:
    enum CBLAS_XYZ {X=1, Y=2, Z=3};

Intel MKL style:
    typedef enum {X=1, Y=2, Z=3} CBLAS_XYZ;

With this hybrid style, code written in the latter form won't need any
modifications to be built with OpenBLAS.  This change should not affect existing
code, although a warning may be emitted for C code which does the following
(does not occur with C++):
    typedef enum CBLAS_XYZ CBLAS_XYZ;
    warning: redefinition of typedef 'CBLAS_XYZ' [-pedantic]
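
A minimal sketch of what the hybrid style presumably looks like, reusing the placeholder names `CBLAS_XYZ`, `X`, `Y`, `Z` from the message above. The two helper functions are hypothetical and exist only to show that both spellings of the type are accepted.

```c
#include <assert.h>

/* Hybrid declaration: the enum keeps its tag (Netlib style) and also
   gets a typedef of the same name (Intel MKL style). */
typedef enum CBLAS_XYZ {X = 1, Y = 2, Z = 3} CBLAS_XYZ;

/* Netlib-style caller: names the type with the enum keyword. */
int netlib_style_value(enum CBLAS_XYZ v)
{
    assert(v >= X && v <= Z);
    return (int)v;
}

/* MKL-style caller: uses the bare typedef name. */
int mkl_style_value(CBLAS_XYZ v)
{
    assert(v >= X && v <= Z);
    return (int)v;
}
```

Both functions take the same underlying type, so code written against either convention compiles unchanged.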

11 years agoFixed #180. the typos in kernel/x86_64/sgemv_t.S
Zhang Xianyi [Sat, 12 Jan 2013 04:31:14 +0000 (12:31 +0800)]
Fixed #180. the typos in kernel/x86_64/sgemv_t.S

11 years agoRefs #177. Fixed sgemv_t compiling bug on Win64.
Zhang Xianyi [Sat, 5 Jan 2013 03:36:39 +0000 (11:36 +0800)]
Refs #177. Fixed sgemv_t compiling bug on Win64.

11 years agoRefs #176. Fixed make.inc overriding RANLIB bug when cross-compiling LAPACK.
Zhang Xianyi [Wed, 2 Jan 2013 17:47:31 +0000 (01:47 +0800)]
Refs #176. Fixed make.inc overriding RANLIB bug when cross-compiling LAPACK.

11 years agoRefs #173. Fixed overflow internal buffer bug of gemv_n on x86
Zhang Xianyi [Tue, 25 Dec 2012 01:27:49 +0000 (09:27 +0800)]
Refs #173. Fixed overflow internal buffer bug of gemv_n on x86

11 years agoRefs #173. Fixed overflow internal buffer bug of sgemv_t on x86
Zhang Xianyi [Tue, 25 Dec 2012 01:10:17 +0000 (09:10 +0800)]
Refs #173. Fixed overflow internal buffer bug of sgemv_t on x86

11 years agoRefs #171. Prevent loading the dirty number from the buffer in sgemv_t x86 kernel.
Zhang Xianyi [Sun, 23 Dec 2012 15:14:17 +0000 (23:14 +0800)]
Refs #171. Prevent loading the dirty number from the buffer in sgemv_t x86 kernel.

11 years agoRefs #173. Fixed overflow internal buffer bug of gemv_t on x86.
Zhang Xianyi [Sun, 23 Dec 2012 13:47:22 +0000 (21:47 +0800)]
Refs #173. Fixed overflow internal buffer bug of gemv_t on x86.

11 years agoFixed #172. Support Intel Xeon E7540.
Zhang Xianyi [Tue, 18 Dec 2012 00:57:46 +0000 (08:57 +0800)]
Fixed #172. Support Intel Xeon E7540.

11 years agoMerge branch 'master' into develop
Zhang Xianyi [Tue, 18 Dec 2012 00:51:30 +0000 (08:51 +0800)]
Merge branch 'master' into develop

11 years agoMerge pull request #170 from juliantaylor/athlon-defaults
Zhang Xianyi [Sat, 15 Dec 2012 23:50:02 +0000 (15:50 -0800)]
Merge pull request #170 from juliantaylor/athlon-defaults

set parameters for CORE_ATHLON

11 years agoset parameters for CORE_ATHLON
Julian Taylor [Sat, 15 Dec 2012 15:05:33 +0000 (16:05 +0100)]
set parameters for CORE_ATHLON

else dgemm_p is set to zero leading to a segfault in alloc_mmap due to
allocsize being zero

11 years agoMerge branch 'master' into develop
Zhang Xianyi [Sat, 15 Dec 2012 14:49:37 +0000 (22:49 +0800)]
Merge branch 'master' into develop

11 years agoMerge pull request #169 from juliantaylor/sanity-check-cpu
Zhang Xianyi [Sat, 15 Dec 2012 14:46:48 +0000 (06:46 -0800)]
Merge pull request #169 from juliantaylor/sanity-check-cpu

add a sanity check on the detected cpu type

11 years agoadd a sanity check on the detected cpu type
Julian Taylor [Sat, 15 Dec 2012 12:29:46 +0000 (13:29 +0100)]
add a sanity check on the detected cpu type

if we have 64 bit pointers we can't have a 32 bit cpu, so fall back to
the 64bit cpu fallback (prescott)
E.g. the cpu detection fails in amd qemu64 emulation (family 6 model 2)
causing it to use the uninitialized gotoblas_ATHLON
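
The check described above can be sketched roughly as below. The enum, the helper, and the idea of passing the pointer size as a parameter are all assumptions made for illustration; the real detection code distinguishes many more CPU types and would use `sizeof(void *)` directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU classes for the sketch. */
typedef enum {
    DEMO_CPU_ATHLON,    /* 32-bit only */
    DEMO_CPU_PRESCOTT,  /* 64-bit fallback named in the commit message */
    DEMO_CPU_NEHALEM
} demo_cpu;

/* Returns nonzero for CPU classes that have no 64-bit mode. */
int demo_is_32bit_only(demo_cpu cpu)
{
    return cpu == DEMO_CPU_ATHLON;
}

/* pointer_size is a parameter so that both the 32-bit and the 64-bit
   case can be exercised on one machine. */
demo_cpu demo_sanitize_cpu(demo_cpu detected, size_t pointer_size)
{
    assert(pointer_size == 4 || pointer_size == 8);
    /* 64-bit pointers rule out a 32-bit-only CPU, so the detection
       result must be wrong: use the 64-bit fallback instead. */
    if (pointer_size == 8 && demo_is_32bit_only(detected))
        return DEMO_CPU_PRESCOTT;
    return detected;
}
```

A real caller would pass `sizeof(void *)` as the second argument.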

11 years agoWrite FMA4 flag to the configure file.
Zhang Xianyi [Tue, 11 Dec 2012 09:55:10 +0000 (10:55 +0100)]
Write FMA4 flag to the configure file.

11 years agoRefs #163. Obtain the build configure on runtime.
Zhang Xianyi [Mon, 10 Dec 2012 07:49:01 +0000 (15:49 +0800)]
Refs #163. Obtain the build configure on runtime.
openblas_get_config function returns the configure string.
So far, it supports USE64BITINT, NO_CBLAS, NO_LAPACK, NO_LAPACKE,
DYNAMIC_ARCH, NO_AFFINITY.

Example:
#include <stdio.h>
extern char *openblas_get_config(void);
int main(void)
{
  printf("%s\n", openblas_get_config());
  return 0;
}
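
A caller may want to test for one option rather than print the whole string. The helper below is a sketch under the assumption that the returned string lists the options as space-separated tokens; the commit message does not specify the format, and `demo_config_has_flag` is a hypothetical name, not part of OpenBLAS.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if flag occurs in config as a whole space-separated token,
   0 otherwise. Assumes a space-separated format (not confirmed by the
   commit message). */
int demo_config_has_flag(const char *config, const char *flag)
{
    size_t flag_len;
    const char *p;

    assert(config != NULL && flag != NULL);
    flag_len = strlen(flag);
    if (flag_len == 0)
        return 0;

    p = config;
    while ((p = strstr(p, flag)) != NULL) {
        int starts_token = (p == config) || (p[-1] == ' ');
        char after = p[flag_len];
        int ends_token = (after == '\0') || (after == ' ');
        if (starts_token && ends_token)
            return 1;
        p += 1;  /* partial match, e.g. NO_LAPACK inside NO_LAPACKE */
    }
    return 0;
}
```

With the library linked, a call such as `demo_config_has_flag(openblas_get_config(), "DYNAMIC_ARCH")` would report whether that option was set at build time.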

11 years agoRefs #165. fall back of DTB_DEFAULT_ENTRIES for some virtual machines.
Zhang Xianyi [Mon, 10 Dec 2012 03:51:39 +0000 (11:51 +0800)]
Refs #165. fall back of DTB_DEFAULT_ENTRIES for some virtual machines.

11 years agoRefs #54. Added AMD Bulldozer x86_64 dgemm kernel developed by Werner Saar <wernsaar...
Zhang Xianyi [Thu, 6 Dec 2012 16:58:03 +0000 (00:58 +0800)]
Refs #54. Added AMD Bulldozer x86_64 dgemm kernel developed by Werner Saar <wernsaar at googlemail.com>

Based on the dgemm kernel for AMD Barcelona, he used AVX and FMA4 instructions.
Thanks, Werner Saar!

11 years agoAdded BULLDOZER target. So far it uses barcelona kernels.
Zhang Xianyi [Thu, 6 Dec 2012 16:53:31 +0000 (00:53 +0800)]
Added BULLDOZER target. So far it uses barcelona kernels.

11 years agoInit AMD Bulldozer codebase.
Zhang Xianyi [Thu, 6 Dec 2012 12:29:54 +0000 (07:29 -0500)]
Init AMD Bulldozer codebase.

11 years agoAdded -lgomp for generating DLL on Windows.
Zhang Xianyi [Wed, 28 Nov 2012 04:52:28 +0000 (12:52 +0800)]
Added -lgomp for generating DLL on Windows.

11 years agoMerge branch 'develop' v0.2.5
Zhang Xianyi [Mon, 26 Nov 2012 23:24:53 +0000 (07:24 +0800)]
Merge branch 'develop'

11 years agoRefs #154. Fixed the build bug of dgemv_t on MinGW64.
Zhang Xianyi [Mon, 26 Nov 2012 23:24:04 +0000 (07:24 +0800)]
Refs #154. Fixed the build bug of dgemv_t on MinGW64.

11 years agoMerge branch 'develop'
Zhang Xianyi [Mon, 26 Nov 2012 09:32:56 +0000 (17:32 +0800)]
Merge branch 'develop'

11 years agoUpdate the doc for 0.2.5 version.
Zhang Xianyi [Mon, 26 Nov 2012 09:32:25 +0000 (17:32 +0800)]
Update the doc for 0.2.5 version.

11 years agoRefs #154. Fixed a SEGFAULT bug of dgemv_t when m is very large.
Zhang Xianyi [Mon, 19 Nov 2012 14:32:27 +0000 (22:32 +0800)]
Refs #154. Fixed a SEGFAULT bug of dgemv_t when m is very large.

It overflowed the internal buffer. Thus, we split vector x into blocks when m is very large.

Thanks to @wangqian for this patch.
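
The blocking idea can be illustrated with a plain C sketch; the actual fix lives in the optimized dgemv_t kernel and its driver, which are not shown here. The function name, the column-major layout, and the deliberately tiny `DEMO_BUFFER_LEN` are assumptions chosen for illustration. Each pass copies at most one buffer's worth of x, so the fixed-size buffer cannot overflow however large m is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity of the internal buffer, kept tiny on purpose. */
#define DEMO_BUFFER_LEN 4

/* Computes y += A^T * x for a column-major m-by-n matrix A with leading
   dimension lda, processing x in blocks of at most DEMO_BUFFER_LEN. */
void demo_gemv_t_blocked(size_t m, size_t n, const double *a, size_t lda,
                         const double *x, double *y)
{
    double buffer[DEMO_BUFFER_LEN];

    assert(lda >= m);
    for (size_t start = 0; start < m; start += DEMO_BUFFER_LEN) {
        size_t len = m - start;
        if (len > DEMO_BUFFER_LEN)
            len = DEMO_BUFFER_LEN;

        /* Copy only the current block of x into the buffer. */
        for (size_t i = 0; i < len; i++)
            buffer[i] = x[start + i];

        /* Accumulate this block's contribution to every entry of y. */
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t i = 0; i < len; i++)
                sum += a[start + i + j * lda] * buffer[i];
            y[j] += sum;
        }
    }
}
```

Because the function accumulates into y, the caller is expected to initialize y (for example to zero) before the call.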

11 years agoFixed #160. Merge branch 'master' of https://github.com/sebastien-villemot/OpenBLAS...
Zhang Xianyi [Thu, 15 Nov 2012 10:20:07 +0000 (18:20 +0800)]
Fixed #160. Merge branch 'master' of https://github.com/sebastien-villemot/OpenBLAS into develop

11 years agoFix compilation with TARGET=GENERIC
Sébastien Villemot [Wed, 14 Nov 2012 20:04:05 +0000 (21:04 +0100)]
Fix compilation with TARGET=GENERIC

Patch applied to Debian package

11 years agoFixed #157. Only detect the number of physical CPU cores on Mac OSX.
Zhang Xianyi [Tue, 13 Nov 2012 07:48:57 +0000 (15:48 +0800)]
Fixed #157. Only detect the number of physical CPU cores on Mac OSX.

11 years agoCompile lapacke with ILP64 model when INTERFACE64=1
Zhang Xianyi [Mon, 12 Nov 2012 16:54:20 +0000 (00:54 +0800)]
Compile lapacke with ILP64 model when INTERFACE64=1

11 years agoAdded the patch for lapacke example.
Zhang Xianyi [Mon, 12 Nov 2012 16:53:26 +0000 (00:53 +0800)]
Added the patch for lapacke example.

11 years agoMerge branch 'master' of https://github.com/alnsn/OpenBLAS into develop
Zhang Xianyi [Mon, 12 Nov 2012 03:17:04 +0000 (11:17 +0800)]
Merge branch 'master' of https://github.com/alnsn/OpenBLAS into develop

11 years agoFix NetBSD build.
Alexander Nasonov [Sat, 10 Nov 2012 23:20:44 +0000 (23:20 +0000)]
Fix NetBSD build.

11 years agoImproved Makefile.rule for cross compiler.
Zhang Xianyi [Thu, 8 Nov 2012 14:15:04 +0000 (22:15 +0800)]
Improved Makefile.rule for cross compiler.

11 years agoAdded NO_SHARED flag to disable generating the shared library.
Zhang Xianyi [Thu, 8 Nov 2012 14:08:01 +0000 (22:08 +0800)]
Added NO_SHARED flag to disable generating the shared library.

11 years agoRefs #153. Restore the original CPU affinity when calling openblas_set_num_threads(1).
Zhang Xianyi [Tue, 6 Nov 2012 10:21:46 +0000 (18:21 +0800)]
Refs #153. Restore the original CPU affinity when calling openblas_set_num_threads(1).

Please read the issue on github.com for details.

11 years agoAlternative approach to avoid command-line length while archiving lapacke -- Thanks...
Zaheer Chothia [Mon, 15 Oct 2012 20:26:18 +0000 (22:26 +0200)]
Alternative approach to avoid command-line length while archiving lapacke -- Thanks Michel!

11 years agoFix installation step on Windows (regression from e8306f623a)
Zaheer Chothia [Mon, 15 Oct 2012 20:13:37 +0000 (22:13 +0200)]
Fix installation step on Windows (regression from e8306f623a)

Since the DLL now has a fixed name there is no need to install a versioned alias too.

11 years agoFixed #147: LAPACK symbols were not being exported for version 3.4.2
Zaheer Chothia [Fri, 12 Oct 2012 21:44:23 +0000 (23:44 +0200)]
Fixed #147: LAPACK symbols were not being exported for version 3.4.2

11 years agoMerge branch 'develop' v0.2.4
Zhang Xianyi [Tue, 9 Oct 2012 12:08:28 +0000 (20:08 +0800)]
Merge branch 'develop'

11 years agoDon't use xgetbv instruction when NO_AVX=1
Zhang Xianyi [Tue, 9 Oct 2012 06:52:35 +0000 (14:52 +0800)]
Don't use xgetbv instruction when NO_AVX=1

11 years agoMerge branch 'develop'
Zhang Xianyi [Mon, 8 Oct 2012 05:38:03 +0000 (13:38 +0800)]
Merge branch 'develop'

11 years agoUpdated the doc for 0.2.4 version.
Zhang Xianyi [Mon, 8 Oct 2012 05:37:44 +0000 (13:37 +0800)]
Updated the doc for 0.2.4 version.

11 years agoFixed #141. make f77blas.h compatible with compilers which lack C99 complex number.
Zhang Xianyi [Mon, 8 Oct 2012 04:48:20 +0000 (12:48 +0800)]
Fixed #141. make f77blas.h compatible with compilers which lack C99 complex number.
Apply the patch from Tony @tonyhill. Thank you.

11 years agoRefs #145. Update LAPACK to 3.4.2 version.
Zhang Xianyi [Sat, 29 Sep 2012 15:14:39 +0000 (23:14 +0800)]
Refs #145. Update LAPACK to 3.4.2 version.

11 years agorefs #140. Fixed zdot incompatibility ABI issue with GCC 4.7 on Win 32.
Zhang Xianyi [Mon, 24 Sep 2012 12:34:33 +0000 (20:34 +0800)]
refs #140. Fixed zdot incompatibility ABI issue with GCC 4.7 on Win 32.

GCC 4.7 uses MSVC ABI on Win 32. This means the caller pops the hidden pointer for returning
aggregate structures larger than 8 bytes.

11 years agoFixed generating shared library bug on MIPS.
Zhang Xianyi [Fri, 21 Sep 2012 11:49:07 +0000 (11:49 +0000)]
Fixed generating shared library bug on MIPS.

11 years agoFixed the detection bug on Loongson 3A server.
Zhang Xianyi [Fri, 21 Sep 2012 10:14:07 +0000 (10:14 +0000)]
Fixed the detection bug on Loongson 3A server.

11 years agoRefs #139. Check OS supporting AVX on runtime.
Zhang Xianyi [Tue, 18 Sep 2012 07:46:20 +0000 (15:46 +0800)]
Refs #139. Check OS supporting AVX on runtime.

11 years agoRefs #139. Added NO_AVX flag to use old Nehalem kernels on Sandy Bridge.
Zhang Xianyi [Mon, 17 Sep 2012 15:24:04 +0000 (23:24 +0800)]
Refs #139. Added NO_AVX flag to use old Nehalem kernels on Sandy Bridge.

For example, make NO_AVX=1 or make DYNAMIC_ARCH=1 NO_AVX=1

11 years agoFixed #142. Added the gesvd and potrs function families to common_interface.h.
Zhang Xianyi [Fri, 14 Sep 2012 07:15:08 +0000 (15:15 +0800)]
Fixed #142. Added the gesvd and potrs function families to common_interface.h.