platform/upstream/openblas.git
wernsaar [Sun, 20 Jul 2014 08:21:08 +0000 (10:21 +0200)]
added sgemv_t microkernel for sandybridge

wernsaar [Sat, 19 Jul 2014 13:48:07 +0000 (15:48 +0200)]
added optimized sgemv_t for bulldozer and piledriver

wernsaar [Sat, 19 Jul 2014 05:15:34 +0000 (07:15 +0200)]
don't use this sgemv_n on Windows

wernsaar [Fri, 18 Jul 2014 09:25:21 +0000 (11:25 +0200)]
performance optimizations for sgemv_n

wernsaar [Thu, 17 Jul 2014 21:15:07 +0000 (23:15 +0200)]
added blocked sgemv_n and microkernel for bulldozer and piledriver
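As an illustration of what a blocked, non-transposed GEMV (`y := alpha*A*x + y`) with a separate microkernel can look like, here is a minimal Python sketch. It is not the code from this commit: the function names, the list-of-rows layout, and the block size of 4 are assumptions made for the example, and the real kernels are architecture-specific.

```python
def gemv_n_microkernel(a, x, tmp, col_start, col_end):
    """Accumulate tmp += A[:, col_start:col_end] * x[col_start:col_end].

    Hypothetical stand-in for an optimized microkernel that processes
    one fixed-width column block at a time.
    """
    for i, row in enumerate(a):
        acc = 0.0
        for j in range(col_start, col_end):
            acc += row[j] * x[j]
        tmp[i] += acc


def gemv_n_blocked(alpha, a, x, y, block=4):
    """Compute y := alpha * A * x + y in place, one column block at a time.

    `block` is an illustrative value, not a tuned parameter.
    """
    n = len(x)
    tmp = [0.0] * len(a)
    # Walk over the columns of A in blocks and hand each block
    # to the microkernel.
    for start in range(0, n, block):
        gemv_n_microkernel(a, x, tmp, start, min(start + block, n))
    # Apply alpha once, after all blocks have been accumulated.
    for i in range(len(y)):
        y[i] += alpha * tmp[i]
    return y
```

Splitting the driver from the microkernel keeps the blocking logic generic while the inner routine can be swapped per architecture.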

wernsaar [Wed, 16 Jul 2014 15:08:43 +0000 (17:08 +0200)]
changed string GFORTRAN to lowercase

wernsaar [Tue, 15 Jul 2014 14:27:02 +0000 (16:27 +0200)]
adjust number of threads for small size in cgemv and zgemv

wernsaar [Tue, 15 Jul 2014 14:04:46 +0000 (16:04 +0200)]
adjust number of threads for sgemv and dgemv

wernsaar [Tue, 15 Jul 2014 12:41:35 +0000 (14:41 +0200)]
adjusted number of threads for small size

wernsaar [Tue, 15 Jul 2014 11:35:36 +0000 (13:35 +0200)]
added benchmark for gemv

wernsaar [Sun, 13 Jul 2014 16:26:38 +0000 (18:26 +0200)]
added additional test value

wernsaar [Sun, 13 Jul 2014 08:46:14 +0000 (10:46 +0200)]
segment violation in sgemv kernels

wernsaar [Sat, 12 Jul 2014 14:20:29 +0000 (16:20 +0200)]
modified paths to atlas, mkl and acml

wernsaar [Sat, 12 Jul 2014 09:54:39 +0000 (11:54 +0200)]
added conf option for number of loops

wernsaar [Fri, 11 Jul 2014 14:31:05 +0000 (16:31 +0200)]
added her2k benchmark

wernsaar [Fri, 11 Jul 2014 14:16:48 +0000 (16:16 +0200)]
added herk benchmark

wernsaar [Fri, 11 Jul 2014 13:26:34 +0000 (15:26 +0200)]
add hemm benchmark

wernsaar [Fri, 11 Jul 2014 12:48:25 +0000 (14:48 +0200)]
added syr2k benchmark

wernsaar [Fri, 11 Jul 2014 12:21:25 +0000 (14:21 +0200)]
added syrk benchmark

wernsaar [Fri, 11 Jul 2014 11:51:08 +0000 (13:51 +0200)]
added trsm benchmark

wernsaar [Fri, 11 Jul 2014 11:20:42 +0000 (13:20 +0200)]
added trmm benchmark

wernsaar [Fri, 11 Jul 2014 10:47:48 +0000 (12:47 +0200)]
added benchmark for symm

wernsaar [Fri, 11 Jul 2014 09:09:47 +0000 (11:09 +0200)]
added gemm benchmark and modified Makefile for benchmark

Zhang Xianyi [Thu, 10 Jul 2014 14:38:15 +0000 (22:38 +0800)]
Merge pull request #411 from wernsaar/develop

Lapack-test on x86 32bit now runs without errors.

wernsaar [Thu, 10 Jul 2014 11:42:32 +0000 (13:42 +0200)]
Ref #410: disabled optimized potri functions (single threading bug)

wernsaar [Thu, 10 Jul 2014 09:01:47 +0000 (11:01 +0200)]
Lapack-test Windows 32bit now error free

wernsaar [Wed, 9 Jul 2014 14:08:19 +0000 (16:08 +0200)]
Lapack-test: cleanup of x86 32bit KERNEL file

Zhang Xianyi [Wed, 9 Jul 2014 13:11:00 +0000 (21:11 +0800)]
Merge pull request #409 from wernsaar/develop

some fixes for Lapack and ARM platform

wernsaar [Wed, 9 Jul 2014 10:21:39 +0000 (12:21 +0200)]
bugfixes for lapack on ARM Platform

Zhang Xianyi [Wed, 9 Jul 2014 00:47:36 +0000 (08:47 +0800)]
OpenBLAS 0.2.10 rc2 version.

wernsaar [Tue, 8 Jul 2014 10:55:18 +0000 (12:55 +0200)]
added cross compiler examples for 32bit and 64bit ARM

Zhang Xianyi [Tue, 8 Jul 2014 09:26:49 +0000 (17:26 +0800)]
Refs #406. Fixed utest building bug.

wernsaar [Tue, 8 Jul 2014 08:21:10 +0000 (10:21 +0200)]
Lapack bug114: replaced cgesvd.f and zgesvd.f

wernsaar [Tue, 8 Jul 2014 08:08:34 +0000 (10:08 +0200)]
Lapack bug117: replaced zstemr.f

wernsaar [Tue, 8 Jul 2014 07:57:40 +0000 (09:57 +0200)]
Lapack bug118: replaced clanhf.f and zlanhf.f

Zhang Xianyi [Tue, 8 Jul 2014 04:48:08 +0000 (12:48 +0800)]
Fixed #407. Support outputting the CPU corename at runtime.
The user can use char * openblas_get_config() or char * openblas_get_corename().
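A minimal sketch of querying these two functions from Python through `ctypes`, assuming a shared OpenBLAS library is installed and exports the symbols under their plain names. The helper name `openblas_runtime_info` and its `library_path` parameter are made up for this example; only `openblas_get_config` and `openblas_get_corename` come from the commit message.

```python
import ctypes
import ctypes.util


def openblas_runtime_info(library_path=None):
    """Return (config, corename) as strings, or None if unavailable.

    `library_path` is an optional explicit path to the shared library;
    by default the system search for a library named "openblas" is used.
    """
    path = library_path or ctypes.util.find_library("openblas")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        get_config = lib.openblas_get_config
        get_corename = lib.openblas_get_corename
    except (OSError, AttributeError):
        # Library could not be loaded, or it does not export the symbols
        # (for example an older build or one with renamed symbols).
        return None
    # Both functions take no arguments and return a C string (char *).
    get_config.argtypes = []
    get_corename.argtypes = []
    get_config.restype = ctypes.c_char_p
    get_corename.restype = ctypes.c_char_p
    return get_config().decode(), get_corename().decode()


if __name__ == "__main__":
    print(openblas_runtime_info())
```

Returning `None` instead of raising keeps the helper usable on machines without OpenBLAS.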

Zhang Xianyi [Sun, 6 Jul 2014 16:39:33 +0000 (00:39 +0800)]
Merge pull request #404 from wernsaar/develop

A lot of fixes for v0.2.10-rc2

wernsaar [Sun, 6 Jul 2014 14:39:32 +0000 (16:39 +0200)]
removed reference to daxpy_bulldozer kernel (Windows bug in lapack-test)

wernsaar [Sun, 6 Jul 2014 11:33:42 +0000 (13:33 +0200)]
bugfix for fortran compiler

wernsaar [Sun, 6 Jul 2014 10:08:27 +0000 (12:08 +0200)]
added definitions for PILEDRIVER and HASWELL

wernsaar [Sun, 6 Jul 2014 09:47:28 +0000 (11:47 +0200)]
bugfix for CORE2

wernsaar [Sun, 6 Jul 2014 09:05:28 +0000 (11:05 +0200)]
fallback to zgemm_kernel_4x2_sse.S

wernsaar [Sun, 6 Jul 2014 08:17:07 +0000 (10:17 +0200)]
added missing definition for DUNNINGTON

wernsaar [Sat, 5 Jul 2014 14:13:17 +0000 (16:13 +0200)]
removed reference to zgemm_kernel_4x2_sse3.S (bug in lapack-test)

wernsaar [Wed, 2 Jul 2014 12:11:53 +0000 (14:11 +0200)]
enabled compiling of *3M functions

wernsaar [Wed, 2 Jul 2014 08:39:33 +0000 (10:39 +0200)]
fixed my bug in ger.c

wernsaar [Tue, 1 Jul 2014 14:18:05 +0000 (16:18 +0200)]
disabled *3M functions for x86_64 platforms

wernsaar [Mon, 30 Jun 2014 12:46:38 +0000 (14:46 +0200)]
added optimized sdot- and dsdot-kernel, written in C
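For readers unfamiliar with the two routines: in BLAS, `sdot` returns a single-precision dot product of two single-precision vectors, while `dsdot` takes the same inputs but accumulates and returns the result in double precision. The sketch below is a plain Python reference for that difference, not the optimized C kernel from this commit. The helper names are invented here, and rounding after every step in `sdot_ref` is an assumption made to keep the example simple; real kernels may accumulate in a different order.

```python
import struct


def to_f32(value):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def sdot_ref(x, y):
    """Dot product with every intermediate result rounded to float32."""
    acc = 0.0
    for a, b in zip(x, y):
        product = to_f32(to_f32(a) * to_f32(b))
        acc = to_f32(acc + product)
    return acc


def dsdot_ref(x, y):
    """Dot product of float32 inputs accumulated in double precision."""
    acc = 0.0
    for a, b in zip(x, y):
        # The product of two float32 values is exact in binary64.
        acc += to_f32(a) * to_f32(b)
    return acc
```

With `x = [1e8, 1.0, -1e8]` and `y = [1.0, 1.0, 1.0]`, `sdot_ref` returns `0.0` because the `1.0` is lost when added to `1e8` in single precision, while `dsdot_ref` returns `1.0`.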

wernsaar [Sun, 29 Jun 2014 19:18:38 +0000 (21:18 +0200)]
disabled SMP for sbmv and zsbmv again

wernsaar [Sun, 29 Jun 2014 18:35:56 +0000 (20:35 +0200)]
enabled SMP for sbmv and zsbmv, but only for 64bit binaries

wernsaar [Sun, 29 Jun 2014 14:43:04 +0000 (16:43 +0200)]
enabled smp for ger.c and zger.c, but only for 64bit binaries

wernsaar [Sun, 29 Jun 2014 08:15:29 +0000 (10:15 +0200)]
modification to run blas-test on Windows

Zhang Xianyi [Sun, 29 Jun 2014 02:45:50 +0000 (10:45 +0800)]
OpenBLAS 0.2.10 rc1 version.

Zhang Xianyi [Sun, 29 Jun 2014 02:40:54 +0000 (10:40 +0800)]
Merge branch 'wernsaar-develop' into develop

Zhang Xianyi [Sun, 29 Jun 2014 02:34:51 +0000 (10:34 +0800)]
Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bulldozer, and
barcelona on Windows.

Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

Conflicts:
kernel/Makefile.L1
kernel/x86_64/KERNEL
param.h

wernsaar [Sat, 28 Jun 2014 17:04:49 +0000 (19:04 +0200)]
fixed zgemv bug for older AMD Processors

Zhang Xianyi [Sat, 28 Jun 2014 12:52:07 +0000 (20:52 +0800)]
Merge branch 'TimothyGu-develop' into develop
Fixed #398. Remove all trailing whitespace except lapack-netlib.

Zhang Xianyi [Sat, 28 Jun 2014 12:51:31 +0000 (20:51 +0800)]
Merge branch 'develop' of https://github.com/TimothyGu/OpenBLAS into TimothyGu-develop

Conflicts:
driver/others/memory.c

Zhang Xianyi [Sat, 28 Jun 2014 12:40:23 +0000 (20:40 +0800)]
Merge pull request #399 from TimothyGu/upstr

Build import libs as .dll.a instead of .lib

Zhang Xianyi [Sat, 28 Jun 2014 12:38:48 +0000 (20:38 +0800)]
Merge pull request #397 from vtjnash/develop

fix #394

wernsaar [Sat, 28 Jun 2014 10:36:11 +0000 (12:36 +0200)]
bugfix for barcelona zgemv-kernel

wernsaar [Sat, 28 Jun 2014 10:16:20 +0000 (12:16 +0200)]
bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel

wernsaar [Sat, 28 Jun 2014 09:46:58 +0000 (11:46 +0200)]
bugfix for piledriver cgemm-, zgemm- and zgemv-kernel

wernsaar [Sat, 28 Jun 2014 08:22:40 +0000 (10:22 +0200)]
bugfix for haswell cgemm- and zgemm-kernel

wernsaar [Sat, 28 Jun 2014 08:01:56 +0000 (10:01 +0200)]
bugfix for cgemm_kernel_8x2_sandy.S

Timothy Gu [Fri, 27 Jun 2014 20:58:42 +0000 (13:58 -0700)]
.gitignore: add some more entries concerned with kernel

Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Timothy Gu [Fri, 27 Jun 2014 18:58:14 +0000 (11:58 -0700)]
Build import libs as .dll.a instead of .lib

This is MinGW convention.

Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Timothy Gu [Fri, 27 Jun 2014 19:05:18 +0000 (12:05 -0700)]
Remove all trailing whitespace except lapack-netlib

Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Jameson Nash [Fri, 27 Jun 2014 16:10:04 +0000 (12:10 -0400)]
fix #394. this cleans up some handles after using them, and doesn't disable ALL process privileges upon success

wernsaar [Fri, 27 Jun 2014 11:40:29 +0000 (13:40 +0200)]
added optimized cgemm-kernel for SANDYBRIDGE

wernsaar [Fri, 27 Jun 2014 09:30:29 +0000 (11:30 +0200)]
added DSDOT definition and enabled optimized sdot kernel

wernsaar [Fri, 27 Jun 2014 08:12:19 +0000 (10:12 +0200)]
added blas-test from lapack

Zhang Xianyi [Fri, 27 Jun 2014 06:57:06 +0000 (14:57 +0800)]
Merge pull request #390 from wernsaar/develop

Ref #103: enhancement for small matrix dimensions. Fixed some bugs. Enable sgemm for SNB and dgemm for NEHALEM

wernsaar [Thu, 26 Jun 2014 19:42:08 +0000 (21:42 +0200)]
added new optimized sgemm kernel for SANDYBRIDGE

wernsaar [Thu, 26 Jun 2014 10:22:29 +0000 (12:22 +0200)]
enabled optimized dgemm kernel for NEHALEM

wernsaar [Wed, 25 Jun 2014 11:55:19 +0000 (13:55 +0200)]
Fortran flag -frecursive is disabled by default

wernsaar [Wed, 25 Jun 2014 11:50:57 +0000 (13:50 +0200)]
enabled optimized sgemv kernel for barcelona and piledriver

wernsaar [Wed, 25 Jun 2014 10:56:45 +0000 (12:56 +0200)]
enabled optimized sgemv kernel for HASWELL

wernsaar [Wed, 25 Jun 2014 10:38:14 +0000 (12:38 +0200)]
enabled optimized sgemv kernels for nehalem, sandybridge and bulldozer

wernsaar [Wed, 25 Jun 2014 09:32:44 +0000 (11:32 +0200)]
fixed compiler warnings

wernsaar [Wed, 25 Jun 2014 08:40:25 +0000 (10:40 +0200)]
added parameter for gemm3m kernels
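Assuming "gemm3m" here refers to the cgemm3m/zgemm3m family, the underlying idea is the 3M scheme: a complex matrix product is formed from three real matrix products instead of four. The following pure-Python sketch illustrates that identity only; the helper names and the list-of-rows layout are assumptions for the example and say nothing about the kernel parameter added in this commit.

```python
def matmul(a, b):
    """Plain real matrix product of two matrices stored as lists of rows."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Elementwise difference of two equally shaped matrices."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def gemm3m_ref(ar, ai, br, bi):
    """Return (Cr, Ci) for C = (Ar + i*Ai) * (Br + i*Bi).

    Uses three real products:
      T1 = Ar*Br, T2 = Ai*Bi, T3 = (Ar + Ai)*(Br + Bi)
      Cr = T1 - T2, Ci = T3 - T1 - T2
    """
    t1 = matmul(ar, br)
    t2 = matmul(ai, bi)
    t3 = matmul(add(ar, ai), add(br, bi))
    return sub(t1, t2), sub(sub(t3, t1), t2)
```

The saved multiplication comes at the cost of extra additions and slightly different rounding behaviour than the conventional four-product form.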

wernsaar [Sun, 22 Jun 2014 15:27:11 +0000 (17:27 +0200)]
force fallback for x86 32bit

wernsaar [Sun, 22 Jun 2014 11:51:17 +0000 (13:51 +0200)]
Ref #391: force fallback for x86 32bit

wernsaar [Sun, 22 Jun 2014 10:01:24 +0000 (12:01 +0200)]
Ref #391: disabled SMP in ger.c and zger.c

wernsaar [Sun, 22 Jun 2014 07:49:20 +0000 (09:49 +0200)]
fixed bug for INTERFACE64

wernsaar [Sat, 21 Jun 2014 10:29:23 +0000 (12:29 +0200)]
Ref #393: fix for INTERFACE64=0 and ARCH_X86 in divtable

wernsaar [Thu, 19 Jun 2014 12:31:52 +0000 (14:31 +0200)]
Ref #380: lowered stack usage for haswell kernels

wernsaar [Thu, 19 Jun 2014 12:02:14 +0000 (14:02 +0200)]
Ref #380: lowered stack usage for piledriver and bulldozer kernels

wernsaar [Wed, 18 Jun 2014 13:04:11 +0000 (15:04 +0200)]
Ref #103: enhancement for small matrix dimensions

Zhang Xianyi [Tue, 17 Jun 2014 23:57:30 +0000 (07:57 +0800)]
Merge pull request #387 from davidanthoff/fixbuilderroronwin

Add -lgfortran flag to gcc call in a makefile

Zhang Xianyi [Tue, 17 Jun 2014 23:56:08 +0000 (07:56 +0800)]
Merge pull request #386 from wernsaar/develop

Some enhancements for dynamic_arch and some warning fixes

David Anthoff [Sat, 14 Jun 2014 04:10:27 +0000 (21:10 -0700)]
Add -lgfortran flag to gcc call in a makefile

Adding $(EXTRALIB) adds this flag when things are built with
msys2 on windows. Without this the build fails.

wernsaar [Thu, 12 Jun 2014 16:17:08 +0000 (18:17 +0200)]
Ref #385: fixed warnings in dynamic.c

wernsaar [Thu, 12 Jun 2014 13:52:14 +0000 (15:52 +0200)]
Ref #385: added missing return instruction

wernsaar [Thu, 12 Jun 2014 12:20:03 +0000 (14:20 +0200)]
Ref #380: enhancements for dynamic_arch

Zhang Xianyi [Wed, 11 Jun 2014 01:49:27 +0000 (09:49 +0800)]
Merge pull request #384 from wernsaar/develop

Blas extensions

wernsaar [Tue, 10 Jun 2014 14:14:34 +0000 (16:14 +0200)]
Ref #51: added blas extensions simatcopy, dimatcopy, cimatcopy, zimatcopy

Zhang Xianyi [Tue, 10 Jun 2014 13:55:19 +0000 (21:55 +0800)]
OpenBLAS 0.2.9 Version.

wernsaar [Tue, 10 Jun 2014 08:34:54 +0000 (10:34 +0200)]
Ref #51: added blas extensions zomatcopy and comatcopy

wernsaar [Mon, 9 Jun 2014 18:21:13 +0000 (20:21 +0200)]
Ref #51: added blas extension somatcopy
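The `omatcopy` family performs an out-of-place scaled copy or transposition, `B := alpha * op(A)`. As a rough reference for that semantics, here is a small Python sketch. The function name `omatcopy_ref`, the `trans` flag values, and the list-of-rows layout are assumptions for illustration; the real routines take storage order, dimensions and leading dimensions as arguments, and the complex variants also support conjugation, none of which is modelled here.

```python
def omatcopy_ref(trans, alpha, a):
    """Return alpha * op(A) as a new matrix; A is a list of rows.

    trans == "N": op(A) = A (scaled copy)
    trans == "T": op(A) = A transposed
    """
    rows, cols = len(a), len(a[0])
    if trans == "N":
        return [[alpha * a[i][j] for j in range(cols)] for i in range(rows)]
    if trans == "T":
        # Output has `cols` rows and `rows` columns.
        return [[alpha * a[i][j] for i in range(rows)] for j in range(cols)]
    raise ValueError("trans must be 'N' or 'T'")
```

Because the result is written to a separate matrix, the input is left untouched, which is the distinction from the in-place `imatcopy` variants.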