platform/upstream/openblas.git
traz [Fri, 24 Jun 2011 09:28:12 +0000 (09:28 +0000)]
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a

traz [Fri, 24 Jun 2011 09:27:41 +0000 (09:27 +0000)]
Fix compute error in ztrmm.

traz [Thu, 23 Jun 2011 21:11:00 +0000 (21:11 +0000)]
Add ztrmm and ztrsm part on loongson3a. The average performance is 2.2G.

traz [Thu, 23 Jun 2011 10:46:58 +0000 (10:46 +0000)]
Change prefetch length of A and B, the performance is 2.1G now.

Xianyi Zhang [Thu, 23 Jun 2011 08:08:23 +0000 (16:08 +0800)]
Merge branch 'release-v0.1alpha2' into loongson3a

traits [Thu, 23 Jun 2011 07:16:24 +0000 (15:16 +0800)]
Fixed #38. Released v0.1 alpha2.

traits [Thu, 23 Jun 2011 07:09:34 +0000 (15:09 +0800)]
Refs #37. Updated README about the compatibility issue with EKOPath compiler.

Xianyi Zhang [Wed, 22 Jun 2011 05:19:39 +0000 (13:19 +0800)]
Refs #39. Moved the shared lib (dll) to top directory in MingW64 compiler environment.

traz [Tue, 21 Jun 2011 22:16:23 +0000 (22:16 +0000)]
Improve zgemm performance from 1G to 1.8G, change block size in param.h.

Xianyi Zhang [Tue, 21 Jun 2011 17:52:20 +0000 (01:52 +0800)]
Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c.

Xianyi Zhang [Tue, 21 Jun 2011 10:06:13 +0000 (18:06 +0800)]
Refs #38. Prepare the docs with v0.1alpha2.

Xianyi Zhang [Tue, 21 Jun 2011 09:50:00 +0000 (17:50 +0800)]
Merge branch 'loongson3a' into release-v0.1alpha2

Xianyi Zhang [Tue, 21 Jun 2011 09:40:16 +0000 (17:40 +0800)]
Merge branch 'add_install_target' into develop

Xianyi Zhang [Tue, 21 Jun 2011 09:39:08 +0000 (17:39 +0800)]
Refs #20. Fixed the installation bug with DYNAMIC_ARCH=1.

Xianyi Zhang [Mon, 20 Jun 2011 10:40:05 +0000 (18:40 +0800)]
Merge branch 'add_install_target' into develop

Conflicts:
Changelog.txt

Xianyi Zhang [Mon, 20 Jun 2011 10:36:29 +0000 (18:36 +0800)]
Refs #20. Updated the docs.

Xianyi Zhang [Mon, 20 Jun 2011 10:35:35 +0000 (18:35 +0800)]
Fixed #20. Added install target in makefile. You can use "make install PREFIX=your_installation_directory".

Xianyi Zhang [Sun, 19 Jun 2011 04:07:31 +0000 (12:07 +0800)]
Updated gitignore file.

Xianyi Zhang [Sun, 19 Jun 2011 03:59:38 +0000 (11:59 +0800)]
Merge branch 'master' of github.com:xianyi/OpenBLAS into develop

Xianyi Zhang [Sun, 19 Jun 2011 03:55:29 +0000 (11:55 +0800)]
Fixed #27. Temporarily work around axpy's low performance issue with small input size & multithreading.

Xianyi Zhang [Sat, 11 Jun 2011 12:59:00 +0000 (05:59 -0700)]
Merge pull request #36 from pipping/master

Fixed the bug about USE_OPENMP=0 enabling OpenMP

Elias Pipping [Sat, 11 Jun 2011 12:36:16 +0000 (14:36 +0200)]
Make USE_OPENMP=0 disable openmp

Xianyi Zhang [Thu, 9 Jun 2011 14:59:49 +0000 (22:59 +0800)]
Fixed #35 a build bug with NO_LAPACK=1 DYNAMIC_ARCH=1 FC=gfortran. I forgot to test it with gfortran in the last bug-fix commit.

Xianyi Zhang [Thu, 9 Jun 2011 03:38:59 +0000 (11:38 +0800)]
Fixed #35 a build bug with NO_LAPACK=1 & DYNAMIC_ARCH=1.

Xianyi Zhang [Thu, 9 Jun 2011 02:40:15 +0000 (10:40 +0800)]
Print the wall time (cycles) when FUNCTION_PROFILE is enabled.

Wang Qian [Tue, 7 Jun 2011 04:53:25 +0000 (12:53 +0800)]
Fixed #33 ztrmm bug on Nehalem.

Xianyi [Fri, 3 Jun 2011 05:19:54 +0000 (13:19 +0800)]
Fixed #32 a SEGFAULT bug with gcc-4.6. According to the i386 calling convention, the called function should remove the hidden return value address from the stack.

Xianyi Zhang [Mon, 30 May 2011 04:42:17 +0000 (12:42 +0800)]
Fixed #31 Shared library placement on Mac. Thanks to Mr. Viral B. Shah for this patch.

traz [Sat, 28 May 2011 09:48:34 +0000 (09:48 +0000)]
Fixed #30 strmm computational error on Loongson3A.

Xianyi Zhang [Fri, 27 May 2011 13:15:30 +0000 (21:15 +0800)]
Fixed the makefile bug about openblas_set_num_threads.

Xianyi Zhang [Fri, 27 May 2011 10:16:19 +0000 (18:16 +0800)]
Fixed a bug about detecting underscore prefix in c_check.

Xianyi Zhang [Fri, 27 May 2011 10:12:45 +0000 (18:12 +0800)]
Ignore *.obj files in git.

traz [Fri, 27 May 2011 09:47:17 +0000 (09:47 +0000)]
Modify single precision compiler conditions, increasing single precision kernel code on Loongson3a.

traz [Wed, 18 May 2011 10:54:51 +0000 (10:54 +0000)]
Remove the useless code, modify code comments and format.

Xianyi Zhang [Tue, 17 May 2011 21:24:00 +0000 (21:24 +0000)]
Fixed #28. Convert the result to double precision in MIPS64 dsdot_k kernel.

traz [Sat, 14 May 2011 22:00:57 +0000 (22:00 +0000)]
Fixed #25 dtrmm and dtrsm computational error on Loongson3A.

Xianyi Zhang [Thu, 12 May 2011 18:41:39 +0000 (02:41 +0800)]
Added missing test code for dsdot.

Xianyi Zhang [Thu, 12 May 2011 18:34:30 +0000 (02:34 +0800)]
Fixed #28. Convert the result to double precision in the end of dsdot kernel.
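
For context, reference BLAS defines dsdot as a dot product of two single-precision vectors that is accumulated and returned in double precision. The following is a minimal C sketch of that semantics only. The name dsdot_sketch and its parameters are hypothetical, non-negative element strides are assumed, and it is not the kernel code this commit changed.

```c
#include <stddef.h>

/* Hypothetical reference-style helper, not the OpenBLAS kernel.
 * The inputs are float, but every product is widened to double and
 * accumulated in double, so the returned value keeps double precision.
 * Assumes incx and incy are element strides >= 1. */
double dsdot_sketch(size_t n, const float *x, size_t incx,
                    const float *y, size_t incy)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Widen each operand before the multiply-add. */
        acc += (double)x[i * incx] * (double)y[i * incy];
    }
    return acc;
}
```

Accumulating in float and converting only the final sum would return a double that still carries single-precision rounding error, which is the kind of mismatch a dsdot test case can catch.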

Xianyi Zhang [Thu, 12 May 2011 18:19:55 +0000 (02:19 +0800)]
Added the unit testcase for dsdot.

Xianyi Zhang [Thu, 12 May 2011 17:21:39 +0000 (01:21 +0800)]
Added the unit test for drotmg.

Xianyi Zhang [Thu, 12 May 2011 11:06:31 +0000 (19:06 +0800)]
Merge branch 'hotfix-readme_about_branches' into develop

Xianyi Zhang [Thu, 12 May 2011 11:06:02 +0000 (19:06 +0800)]
Merge branch 'hotfix-readme_about_branches'

Xianyi Zhang [Thu, 12 May 2011 11:05:20 +0000 (19:05 +0800)]
Added the spec of git branches about this project.

traz [Wed, 11 May 2011 10:44:23 +0000 (10:44 +0000)]
Finish dtrsm_kernel_Rx.S on Loongson3A.

Xianyi Zhang [Tue, 10 May 2011 17:12:32 +0000 (01:12 +0800)]
Fixed #26 the wrong result of rotmg. Used fabs() instead of abs().
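
The class of bug named in this message can be illustrated with a short C sketch. It is not the rotmg source; the function names below are hypothetical. In C, abs() is declared for int, so a double argument is converted to int before the call and its fractional part is lost, while fabs() works on double directly.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical illustration of the bug class, not the rotmg source.
 * abs() takes an int, so the double is truncated toward zero first. */
double magnitude_with_abs(double x)
{
    return (double)abs((int)x);  /* cast written out to show the conversion */
}

/* fabs() operates on double and keeps the fractional part. */
double magnitude_with_fabs(double x)
{
    return fabs(x);
}
```

For x = -0.75 the abs-based version yields 0, while the fabs-based version yields 0.75.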

traz [Tue, 10 May 2011 12:48:43 +0000 (12:48 +0000)]
Finish dtrsm_kernel_Lx.S on Loongson3A.

traz [Mon, 9 May 2011 17:31:58 +0000 (17:31 +0000)]
Modify dtrsm compiler options

traz [Mon, 9 May 2011 17:28:20 +0000 (17:28 +0000)]
Fixed #24 drmm error on Loongson3A

Xianyi Zhang [Fri, 6 May 2011 09:03:35 +0000 (17:03 +0800)]
Added openblas_set_num_threads for Fortran.

Xianyi Zhang [Wed, 4 May 2011 05:03:10 +0000 (13:03 +0800)]
Fixed #23. Fixed a bug of f_check script about generating link flags.

Xianyi Zhang [Tue, 3 May 2011 09:19:36 +0000 (17:19 +0800)]
Fixed a bug when detecting Intel CPU.

traits [Tue, 3 May 2011 06:42:11 +0000 (14:42 +0800)]
Fixed a build bug with NO_LAPACK=1 and SANITY_CHECK=1.

Xianyi Zhang [Fri, 22 Apr 2011 14:14:06 +0000 (22:14 +0800)]
Fixed #16. Print the user-friendly message when detecting CPU failed.

Xianyi Zhang [Fri, 22 Apr 2011 14:07:46 +0000 (22:07 +0800)]
Added docs for make TARGET=your_cpu_target.

Xianyi Zhang [Fri, 22 Apr 2011 12:21:42 +0000 (20:21 +0800)]
Fixed #19. Provided an error msg when the arch is not supported.

Xianyi Zhang [Wed, 20 Apr 2011 05:41:38 +0000 (13:41 +0800)]
Fixed #21. Added extern "C" to support C++. Thanks to Tasio for the patch.
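
The usual way to make a C header usable from C++ is to wrap its declarations in a conditional extern "C" block. The sketch below shows that general pattern under the assumption that the patch followed it; example_dot is an illustrative name, not an OpenBLAS symbol, and the actual header contents are not reproduced here.

```c
/* Hypothetical header fragment; example_dot is an illustrative name. */
#ifdef __cplusplus
extern "C" {   /* give the declarations C linkage when compiled as C++ */
#endif

double example_dot(int n, const double *x, const double *y);

#ifdef __cplusplus
}
#endif

/* Plain C definition matching the declaration above. */
double example_dot(int n, const double *x, const double *y)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++) {
        acc += x[i] * y[i];
    }
    return acc;
}
```

A C compiler never defines __cplusplus, so it sees only the plain declaration; a C++ compiler sees the extern "C" wrapper and looks for the unmangled symbol at link time.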

traz [Sun, 17 Apr 2011 20:26:49 +0000 (20:26 +0000)]
Completed the dtrmm function.

traz [Fri, 15 Apr 2011 21:56:25 +0000 (21:56 +0000)]
Added handling for the trmm part, without edge handling. Test sizes (M and N) must be multiples of 4.

traz [Mon, 11 Apr 2011 22:46:36 +0000 (22:46 +0000)]
Modify prefetching C.

traz [Mon, 11 Apr 2011 22:17:57 +0000 (22:17 +0000)]
Adjust kc size from 112 to 116.

Xianyi Zhang [Mon, 11 Apr 2011 21:46:48 +0000 (21:46 +0000)]
Changed default page size to 16KB on Loongson 3A.

Xianyi Zhang [Thu, 7 Apr 2011 06:52:35 +0000 (14:52 +0800)]
Supported goto_set_num_threads & openblas_set_num_threads functions when USE_OPENMP=1.

Xianyi Zhang [Mon, 28 Mar 2011 02:58:39 +0000 (10:58 +0800)]
Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64.

Xianyi Zhang [Thu, 24 Mar 2011 17:16:12 +0000 (01:16 +0800)]
Fixed #13. Fixed blasint undefined bug in <cblas.h> file.

Xianyi Zhang [Sun, 20 Mar 2011 15:35:31 +0000 (23:35 +0800)]
Updated the developing version to v0.1 alpha2.

Xianyi Zhang [Sun, 20 Mar 2011 15:30:09 +0000 (23:30 +0800)]
Init Changelog file for next release version(v0.1alpha2).

traz [Wed, 6 Apr 2011 10:39:31 +0000 (10:39 +0000)]
Change BLOCK SIZE of LOONGSON3A TARGET.

traz [Wed, 6 Apr 2011 10:38:34 +0000 (10:38 +0000)]
Add dgemm compiler Options in KERNEL.LOONGSON3A.

traz [Wed, 6 Apr 2011 10:36:44 +0000 (10:36 +0000)]
New kernel in LOONGSON3A.

Xianyi Zhang [Mon, 28 Mar 2011 02:58:39 +0000 (10:58 +0800)]
Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64.

Xianyi Zhang [Thu, 24 Mar 2011 17:16:12 +0000 (01:16 +0800)]
Fixed #13. Fixed blasint undefined bug in <cblas.h> file.

Xianyi Zhang [Tue, 22 Mar 2011 06:16:18 +0000 (14:16 +0800)]
Merge branch 'master' of github.com:xianyi/OpenBLAS into x86

Xianyi Zhang [Tue, 22 Mar 2011 06:09:47 +0000 (14:09 +0800)]
Fixed the detection bug on Intel Core i5. Thanks to ggl329 for the patch. (v0.1alpha1)

Xianyi Zhang [Sun, 20 Mar 2011 15:35:31 +0000 (23:35 +0800)]
Updated the developing version to v0.1 alpha2.

Xianyi Zhang [Sun, 20 Mar 2011 15:30:09 +0000 (23:30 +0800)]
Init Changelog file for next release version(v0.1alpha2).

Xianyi Zhang [Sun, 20 Mar 2011 14:44:57 +0000 (22:44 +0800)]
OpenBLAS 0.1 alpha version 1.

Xianyi Zhang [Sun, 20 Mar 2011 13:57:58 +0000 (21:57 +0800)]
Merge remote branch 'origin/loongson3a' into x86

Xianyi Zhang [Sun, 20 Mar 2011 13:57:09 +0000 (21:57 +0800)]
Merge remote branch 'origin/loongson3a' into x86

Xianyi Zhang [Sun, 20 Mar 2011 13:56:40 +0000 (21:56 +0800)]
Detect Intel Core Clarkdale & Arrandale

Xianyi Zhang [Fri, 18 Mar 2011 23:05:56 +0000 (23:05 +0000)]
Fixed the bug about Loongson3A gsLQC1 & gsSQC1 instructions in daxpy kernel. Now daxpy is correct.

Xianyi Zhang [Fri, 18 Mar 2011 01:20:15 +0000 (01:20 +0000)]
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a

Xianyi Zhang [Fri, 18 Mar 2011 01:10:58 +0000 (01:10 +0000)]
Supported detecting new kernel(2.6.36) & new Loongson3A03 CPU.

Wang Qian [Mon, 7 Mar 2011 11:22:32 +0000 (11:22 +0000)]
Modified the default kernel makefile in MIPS64 arch.

Xianyi Zhang [Sat, 5 Mar 2011 02:17:10 +0000 (10:17 +0800)]
Support unaligned addresses in daxpy on Loongson3A SIMD.

Xianyi Zhang [Fri, 4 Mar 2011 09:50:17 +0000 (17:50 +0800)]
Unroll to 16 in daxpy on loongson3a.

Xianyi Zhang [Fri, 4 Mar 2011 14:11:52 +0000 (14:11 +0000)]
Merge commit 'origin/x86' into loongson3a

Xianyi Zhang [Fri, 4 Mar 2011 03:53:04 +0000 (11:53 +0800)]
Merge branch 'x86' of github.com:xianyi/OpenBLAS into x86

Xianyi Zhang [Fri, 4 Mar 2011 03:51:32 +0000 (11:51 +0800)]
Support NO_LAPACK=1 to build the lib without LAPACK functions.

Xianyi [Wed, 2 Mar 2011 16:46:39 +0000 (00:46 +0800)]
Changed the movlps macro name to capitals in the x86/zdot_sse2.S file.

Xianyi [Wed, 2 Mar 2011 10:45:30 +0000 (18:45 +0800)]
On 32-bit x86, gcc 4.4.3 generated wrong code (movsd) from movlps in zdot_sse2.S line 191.
This would cause zdotu & zdotc failures. Instead, use movlpd to work around it. Fixed #8. Fixed #9.

Xianyi Zhang [Wed, 2 Mar 2011 10:03:40 +0000 (18:03 +0800)]
Added zdotu with x & y offset=1 test case.

Xianyi Zhang [Wed, 2 Mar 2011 05:52:05 +0000 (13:52 +0800)]
Merge remote branch 'origin/x86' into loongson3a

Xianyi Zhang [Wed, 2 Mar 2011 05:40:55 +0000 (13:40 +0800)]
updated the changelog.

Xianyi Zhang [Wed, 2 Mar 2011 05:38:32 +0000 (13:38 +0800)]
Fixed random SEGFAULT when nodemask==NULL on Linux kernels above 2.6.34. Fixed #12. Thanks to Mr. Ei-ji Nakama for providing this patch.

Xianyi Zhang [Sat, 26 Feb 2011 04:27:56 +0000 (12:27 +0800)]
Added Changelog. Fixed #11.

Xianyi Zhang [Sat, 26 Feb 2011 03:51:39 +0000 (11:51 +0800)]
Enable Debug flags in memory alloc and init functions.

Xianyi Zhang [Sat, 26 Feb 2011 03:19:54 +0000 (11:19 +0800)]
Added DEBUG option in Makefile.rule. Fixed DEBUG typo mistakes.

Xianyi Zhang [Thu, 24 Feb 2011 09:02:52 +0000 (17:02 +0800)]
Merge branch 'x86' of github.com:xianyi/OpenBLAS into x86

Xianyi Zhang [Thu, 24 Feb 2011 07:16:21 +0000 (15:16 +0800)]
Fixed #10. Supported GOTO_NUM_THREADS & GOTO_THREADS_TIMEOUT environment variables.

Xianyi [Wed, 23 Feb 2011 12:08:34 +0000 (20:08 +0800)]
Fixed #7. Modified axpy kernel codes to avoid unloop with incx==0 or incy==0 in x86 32bits arch.