X-Git-Url: http://review.tizen.org/git/?a=blobdiff_plain;f=README.md;h=14815ff00b56fb17cce612e45fbc3c9309efc7d4;hb=refs%2Fheads%2Faccepted%2Ftizen_6.0_unified;hp=1c3255fe5f1695ce3ae2d22e03ce3ff74db8da61;hpb=e8d0e66982c3d3eadacf9a435681622291a7c98e;p=platform%2Fupstream%2Fopenblas.git diff --git a/README.md b/README.md index 1c3255f..14815ff 100644 --- a/README.md +++ b/README.md @@ -2,175 +2,227 @@ [![Join the chat at https://gitter.im/xianyi/OpenBLAS](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/xianyi/OpenBLAS?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) -Travis CI: [![Build Status](https://travis-ci.org/xianyi/OpenBLAS.png?branch=develop)](https://travis-ci.org/xianyi/OpenBLAS) +Travis CI: [![Build Status](https://travis-ci.org/xianyi/OpenBLAS.svg?branch=develop)](https://travis-ci.org/xianyi/OpenBLAS) AppVeyor: [![Build status](https://ci.appveyor.com/api/projects/status/09sohd35n8nkkx64/branch/develop?svg=true)](https://ci.appveyor.com/project/xianyi/openblas/branch/develop) + +[![Build Status](https://dev.azure.com/xianyi/OpenBLAS/_apis/build/status/xianyi.OpenBLAS?branchName=develop)](https://dev.azure.com/xianyi/OpenBLAS/_build/latest?definitionId=1&branchName=develop) + ## Introduction + OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version. -Please read the documents on OpenBLAS wiki pages . +Please read the documentation on the OpenBLAS wiki pages: . ## Binary Packages -We provide binary packages for the following platform. + +We provide official binary packages for the following platform: * Windows x86/x86_64 You can download them from [file hosting on sourceforge.net](https://sourceforge.net/projects/openblas/files/). ## Installation from Source -Download from project homepage. http://xianyi.github.com/OpenBLAS/ -Or, check out codes from git://github.com/xianyi/OpenBLAS.git -### Normal compile - * type "make" to detect the CPU automatically. - or - * type "make TARGET=xxx" to set target CPU, e.g. "make TARGET=NEHALEM". The full target list is in file TargetList.txt. +Download from project homepage, https://xianyi.github.com/OpenBLAS/, or check out the code +using Git from https://github.com/xianyi/OpenBLAS.git. -### Cross compile -Please set CC and FC with the cross toolchains. Then, set HOSTCC with your host C compiler. At last, set TARGET explicitly. +### Dependencies -Examples: +Building OpenBLAS requires the following to be installed: -On X86 box, compile this library for loongson3a CPU. +* GNU Make +* A C compiler, e.g. GCC or Clang +* A Fortran compiler (optional, for LAPACK) +* IBM MASS (optional, see below) - make BINARY=64 CC=mips64el-unknown-linux-gnu-gcc FC=mips64el-unknown-linux-gnu-gfortran HOSTCC=gcc TARGET=LOONGSON3A +### Normal compile -On X86 box, compile this library for loongson3a CPU with loongcc (based on Open64) compiler. +Simply invoking `make` (or `gmake` on BSD) will detect the CPU automatically. +To set a specific target CPU, use `make TARGET=xxx`, e.g. `make TARGET=NEHALEM`. +The full target list is in the file `TargetList.txt`. - make CC=loongcc FC=loongf95 HOSTCC=gcc TARGET=LOONGSON3A CROSS=1 CROSS_SUFFIX=mips64el-st-linux-gnu- NO_LAPACKE=1 NO_SHARED=1 BINARY=32 +### Cross compile -### Debug version +Set `CC` and `FC` to point to the cross toolchains, and set `HOSTCC` to your host C compiler. +The target must be specified explicitly when cross compiling. - make DEBUG=1 +Examples: + +* On an x86 box, compile this library for a loongson3a CPU: + ```sh + make BINARY=64 CC=mips64el-unknown-linux-gnu-gcc FC=mips64el-unknown-linux-gnu-gfortran HOSTCC=gcc TARGET=LOONGSON3A + ``` -### Compile with MASS Support on Power CPU (Optional dependency) +* On an x86 box, compile this library for a loongson3a CPU with loongcc (based on Open64) compiler: + ```sh + make CC=loongcc FC=loongf95 HOSTCC=gcc TARGET=LOONGSON3A CROSS=1 CROSS_SUFFIX=mips64el-st-linux-gnu- NO_LAPACKE=1 NO_SHARED=1 BINARY=32 + ``` -[IBM MASS](http://www-01.ibm.com/software/awdtools/mass/linux/mass-linux.html) library consists of a set of mathematical functions for C, C++, and -Fortran-language applications that are tuned for optimum performance on POWER architectures. OpenBLAS with MASS requires 64-bit, little-endian OS on POWER. -The library can be installed as below - +### Debug version - * On Ubuntu: +A debug version can be built using `make DEBUG=1`. - wget -q http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/public.gpg -O- | sudo apt-key add - - echo "deb http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/ trusty main" | sudo tee /etc/apt/sources.list.d/ibm-xl-compiler-eval.list - sudo apt-get update - sudo apt-get install libxlmass-devel.8.1.3 +### Compile with MASS support on Power CPU (optional) - * On RHEL/CentOS: +The [IBM MASS](https://www.ibm.com/support/home/product/W511326D80541V01/other_software/mathematical_acceleration_subsystem) library consists of a set of mathematical functions for C, C++, and Fortran applications that are tuned for optimum performance on POWER architectures. +OpenBLAS with MASS requires a 64-bit, little-endian OS on POWER. +The library can be installed as shown: - wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/rhel7/repodata/repomd.xml.key - sudo rpm --import repomd.xml.key - wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/rhel7/ibm-xl-compiler-eval.repo - sudo cp ibm-xl-compiler-eval.repo /etc/yum.repos.d/ - sudo yum install libxlmass-devel.8.1.3 +* On Ubuntu: + ```sh + wget -q http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/public.gpg -O- | sudo apt-key add - + echo "deb http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/ trusty main" | sudo tee /etc/apt/sources.list.d/ibm-xl-compiler-eval.list + sudo apt-get update + sudo apt-get install libxlmass-devel.8.1.5 + ``` -After installing MASS library, compile openblas with USE_MASS=1. +* On RHEL/CentOS: + ```sh + wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/rhel7/repodata/repomd.xml.key + sudo rpm --import repomd.xml.key + wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/rhel7/ibm-xl-compiler-eval.repo + sudo cp ibm-xl-compiler-eval.repo /etc/yum.repos.d/ + sudo yum install libxlmass-devel.8.1.5 + ``` -Example: +After installing the MASS library, compile OpenBLAS with `USE_MASS=1`. +For example, to compile on Power8 with MASS support: `make USE_MASS=1 TARGET=POWER8`. -Compiling on Power8 with MASS support - +### Install to a specific directory (optional) - make USE_MASS=1 TARGET=POWER8 +Use `PREFIX=` when invoking `make`, for example -### Install to the directory (optional) +```sh +make install PREFIX=your_installation_directory +``` -Example: +The default installation directory is `/opt/OpenBLAS`. - make install PREFIX=your_installation_directory +## Supported CPUs and Operating Systems -The default directory is /opt/OpenBLAS +Please read `GotoBLAS_01Readme.txt`. -## Support CPU & OS -Please read GotoBLAS_01Readme.txt +### Additional supported CPUs -### Additional support CPU: +#### x86/x86-64 -#### x86/x86-64: - **Intel Xeon 56xx (Westmere)**: Used GotoBLAS2 Nehalem codes. - **Intel Sandy Bridge**: Optimized Level-3 and Level-2 BLAS with AVX on x86-64. - **Intel Haswell**: Optimized Level-3 and Level-2 BLAS with AVX2 and FMA on x86-64. +- **Intel Skylake**: Optimized Level-3 and Level-2 BLAS with AVX512 and FMA on x86-64. - **AMD Bobcat**: Used GotoBLAS2 Barcelona codes. -- **AMD Bulldozer**: x86-64 ?GEMM FMA4 kernels. (Thank Werner Saar) +- **AMD Bulldozer**: x86-64 ?GEMM FMA4 kernels. (Thanks to Werner Saar) - **AMD PILEDRIVER**: Uses Bulldozer codes with some optimizations. - **AMD STEAMROLLER**: Uses Bulldozer codes with some optimizations. +- **AMD ZEN**: Uses Haswell codes with some optimizations. + +#### MIPS64 -#### MIPS64: - **ICT Loongson 3A**: Optimized Level-3 BLAS and the part of Level-1,2. - **ICT Loongson 3B**: Experimental -#### ARM: -- **ARMV6**: Optimized BLAS for vfpv2 and vfpv3-d16 ( e.g. BCM2835, Cortex M0+ ) -- **ARMV7**: Optimized BLAS for vfpv3-d32 ( e.g. Cortex A8, A9 and A15 ) +#### ARM + +- **ARMv6**: Optimized BLAS for vfpv2 and vfpv3-d16 (e.g. BCM2835, Cortex M0+) +- **ARMv7**: Optimized BLAS for vfpv3-d32 (e.g. Cortex A8, A9 and A15) + +#### ARM64 -#### ARM64: -- **ARMV8**: Experimental +- **ARMv8**: Experimental - **ARM Cortex-A57**: Experimental -#### IBM zEnterprise System: -- **Z13**: blas3 for double - +#### PPC/PPC64 -### Support OS: -- **GNU/Linux** -- **MingWin or Visual Studio(CMake)/Windows**: Please read . -- **Darwin/Mac OS X**: Experimental. Although GotoBLAS2 supports Darwin, we are the beginner on Mac OS X. -- **FreeBSD**: Supported by community. We didn't test the library on this OS. -- **Android**: Supported by community. Please read . +- **POWER8**: Optimized BLAS, only for PPC64LE (Little Endian), only with `USE_OPENMP=1` +- **POWER9**: Optimized Level-3 BLAS (real) and some Level-1,2. PPC64LE with OpenMP only. -## Usages -Link with libopenblas.a or -lopenblas for shared library. +#### IBM zEnterprise System -### Set the number of threads with environment variables. +- **Z13**: Optimized Level-3 BLAS and Level-1,2 (double precision) +- **Z14**: Optimized Level-3 BLAS and Level-1,2 (single precision) -Examples: +### Supported OS + +- **GNU/Linux** +- **MinGW or Visual Studio (CMake)/Windows**: Please read . +- **Darwin/macOS**: Experimental. Although GotoBLAS2 supports Darwin, we are not macOS experts. +- **FreeBSD**: Supported by the community. We don't actively test the library on this OS. +- **OpenBSD**: Supported by the community. We don't actively test the library on this OS. +- **DragonFly BSD**: Supported by the community. We don't actively test the library on this OS. +- **Android**: Supported by the community. Please read . - export OPENBLAS_NUM_THREADS=4 +## Usage - or +Statically link with `libopenblas.a` or dynamically link with `-lopenblas` if OpenBLAS was +compiled as a shared library. - export GOTO_NUM_THREADS=4 +### Setting the number of threads using environment variables - or +Environment variables are used to specify a maximum number of threads. +For example, - export OMP_NUM_THREADS=4 +```sh +export OPENBLAS_NUM_THREADS=4 +export GOTO_NUM_THREADS=4 +export OMP_NUM_THREADS=4 +``` -The priorities are OPENBLAS_NUM_THREADS > GOTO_NUM_THREADS > OMP_NUM_THREADS. +The priorities are `OPENBLAS_NUM_THREADS` > `GOTO_NUM_THREADS` > `OMP_NUM_THREADS`. -If you compile this lib with USE_OPENMP=1, you should set OMP_NUM_THREADS environment variable. OpenBLAS ignores OPENBLAS_NUM_THREADS and GOTO_NUM_THREADS with USE_OPENMP=1. +If you compile this library with `USE_OPENMP=1`, you should set the `OMP_NUM_THREADS` +environment variable; OpenBLAS ignores `OPENBLAS_NUM_THREADS` and `GOTO_NUM_THREADS` when +compiled with `USE_OPENMP=1`. -### Set the number of threads on runtime. +### Setting the number of threads at runtime -We provided the below functions to control the number of threads on runtime. +We provide the following functions to control the number of threads at runtime: - void goto_set_num_threads(int num_threads); +```c +void goto_set_num_threads(int num_threads); +void openblas_set_num_threads(int num_threads); +``` - void openblas_set_num_threads(int num_threads); +If you compile this library with `USE_OPENMP=1`, you should use the above functions too. -If you compile this lib with USE_OPENMP=1, you should use the above functions, too. +## Reporting bugs -## Report Bugs -Please add a issue in https://github.com/xianyi/OpenBLAS/issues +Please submit an issue in https://github.com/xianyi/OpenBLAS/issues. ## Contact + * OpenBLAS users mailing list: https://groups.google.com/forum/#!forum/openblas-users * OpenBLAS developers mailing list: https://groups.google.com/forum/#!forum/openblas-dev -## ChangeLog -Please see Changelog.txt to obtain the differences between GotoBLAS2 1.13 BSD version. +## Change log + +Please see Changelog.txt to view the differences between OpenBLAS and GotoBLAS2 1.13 BSD version. ## Troubleshooting -* Please read [Faq](https://github.com/xianyi/OpenBLAS/wiki/Faq) at first. -* Please use gcc version 4.6 and above to compile Sandy Bridge AVX kernels on Linux/MingW/BSD. -* Please use Clang version 3.1 and above to compile the library on Sandy Bridge microarchitecture. The Clang 3.0 will generate the wrong AVX binary code. -* The number of CPUs/Cores should less than or equal to 256. On Linux x86_64(amd64), there is experimental support for up to 1024 CPUs/Cores and 128 numa nodes if you build the library with BIGNUMA=1. -* OpenBLAS does not set processor affinity by default. On Linux, you can enable processor affinity by commenting the line NO_AFFINITY=1 in Makefile.rule. But this may cause [the conflict with R parallel](https://stat.ethz.ch/pipermail/r-sig-hpc/2012-April/001348.html). -* On Loongson 3A. make test would be failed because of pthread_create error. The error code is EAGAIN. However, it will be OK when you run the same testcase on shell. + +* Please read the [FAQ](https://github.com/xianyi/OpenBLAS/wiki/Faq) first. +* Please use GCC version 4.6 and above to compile Sandy Bridge AVX kernels on Linux/MinGW/BSD. +* Please use Clang version 3.1 and above to compile the library on Sandy Bridge microarchitecture. + Clang 3.0 will generate the wrong AVX binary code. +* Please use GCC version 6 or LLVM version 6 and above to compile Skylake AVX512 kernels. +* The number of CPUs/cores should less than or equal to 256. On Linux `x86_64` (`amd64`), + there is experimental support for up to 1024 CPUs/cores and 128 numa nodes if you build + the library with `BIGNUMA=1`. +* OpenBLAS does not set processor affinity by default. + On Linux, you can enable processor affinity by commenting out the line `NO_AFFINITY=1` in + Makefile.rule. However, note that this may cause + [a conflict with R parallel](https://stat.ethz.ch/pipermail/r-sig-hpc/2012-April/001348.html). +* On Loongson 3A, `make test` may fail with a `pthread_create` error (`EAGAIN`). + However, it will be okay when you run the same test case on the shell. ## Contributing -1. [Check for open issues](https://github.com/xianyi/OpenBLAS/issues) or open a fresh issue to start a discussion around a feature idea or a bug. -1. Fork the [OpenBLAS](https://github.com/xianyi/OpenBLAS) repository to start making your changes. -1. Write a test which shows that the bug was fixed or that the feature works as expected. -1. Send a pull request. Make sure to add yourself to `CONTRIBUTORS.md`. + +1. [Check for open issues](https://github.com/xianyi/OpenBLAS/issues) or open a fresh issue + to start a discussion around a feature idea or a bug. +2. Fork the [OpenBLAS](https://github.com/xianyi/OpenBLAS) repository to start making your changes. +3. Write a test which shows that the bug was fixed or that the feature works as expected. +4. Send a pull request. Make sure to add yourself to `CONTRIBUTORS.md`. ## Donation + Please read [this wiki page](https://github.com/xianyi/OpenBLAS/wiki/Donation).