Imported Upstream version 1.23.0

author DongHun Kwak <dh0128.kwak@samsung.com>

Fri, 15 Jul 2022 02:14:57 +0000 (11:14 +0900)

committer DongHun Kwak <dh0128.kwak@samsung.com>

Fri, 15 Jul 2022 02:14:57 +0000 (11:14 +0900)
diff --git a/INSTALL.rst.txt b/INSTALL.rst.txt
index 1bc97c4b5f8641a09fdef74f40505e14f2aa6791..130306d06c073bbf995d3e214dccaf0229ab8de5 100644
--- a/INSTALL.rst.txt
+++ b/INSTALL.rst.txt
@@ -14,15 +14,15 @@ Prerequisites
  
  Building NumPy requires the following installed software:
  
-1) Python__ 3.7.x or newer.
+1) Python__ 3.8.x or newer.
  
     Please note that the Python development headers also need to be installed,
     e.g., on Debian/Ubuntu one needs to install both `python3` and
     `python3-dev`. On Windows and macOS this is normally not an issue.
  
-2) Cython >= 0.29.21
+2) Cython >= 0.29.30 but < 3.0
  
-3) pytest__ (optional) 1.15 or later
+3) pytest__ (optional)
  
     This is required for testing NumPy, but not for using it.
  
diff --git a/PKG-INFO b/PKG-INFO
index a9930fbc6b0be46b1ffd0bee6837f7c66b8e8191..7e71bd59156bb3253a7a339597f617fe7665880f 100644
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,6 +1,6 @@
-Metadata-Version: 2.1
+Metadata-Version: 1.2
  Name: numpy
-Version: 1.22.4
+Version: 1.23.0
  Summary:  NumPy is the fundamental package for array computing with Python.
  Home-page: https://www.numpy.org
  Author: Travis E. Oliphant et al.
@@ -9,8 +9,29 @@ Maintainer-email: numpy-discussion@python.org
  License: BSD
  Download-URL: https://pypi.python.org/pypi/numpy
  Project-URL: Bug Tracker, https://github.com/numpy/numpy/issues
-Project-URL: Documentation, https://numpy.org/doc/1.22
+Project-URL: Documentation, https://numpy.org/doc/1.23
  Project-URL: Source Code, https://github.com/numpy/numpy
+Description: It provides:
+        
+        - a powerful N-dimensional array object
+        - sophisticated (broadcasting) functions
+        - tools for integrating C/C++ and Fortran code
+        - useful linear algebra, Fourier transform, and random number capabilities
+        - and much more
+        
+        Besides its obvious scientific uses, NumPy can also be used as an efficient
+        multi-dimensional container of generic data. Arbitrary data-types can be
+        defined. This allows NumPy to seamlessly and speedily integrate with a wide
+        variety of databases.
+        
+        All NumPy wheels distributed on PyPI are BSD licensed.
+        
+        NumPy requires ``pytest`` and ``hypothesis``.  Tests can then be run after
+        installation with::
+        
+            python -c 'import numpy; numpy.test()'
+        
+        
  Platform: Windows
  Platform: Linux
  Platform: Solaris
@@ -36,23 +57,3 @@ Classifier: Operating System :: POSIX
  Classifier: Operating System :: Unix
  Classifier: Operating System :: MacOS
  Requires-Python: >=3.8
-License-File: LICENSE.txt
-License-File: LICENSES_bundled.txt
-
-It provides:
-
-- a powerful N-dimensional array object
-- sophisticated (broadcasting) functions
-- tools for integrating C/C++ and Fortran code
-- useful linear algebra, Fourier transform, and random number capabilities
-- and much more
-
-Besides its obvious scientific uses, NumPy can also be used as an efficient
-multi-dimensional container of generic data. Arbitrary data-types can be
-defined. This allows NumPy to seamlessly and speedily integrate with a wide
-variety of databases.
-
-All NumPy wheels distributed on PyPI are BSD licensed.
-
-
-
diff --git a/README.md b/README.md
index 04825dc5d9417ed7f992eecbb2b1e4f91ab5c85c..3367f26193b8a31f8b4669d0db00a482fa1c55cd 100644
--- a/README.md
+++ b/README.md
@@ -1,19 +1,11 @@
-# <a href="https://numpy.org/"><img alt="NumPy" src="/branding/logo/primary/numpylogo.svg" height="60"></a>
-
-<!--[![Azure Pipelines](https://dev.azure.com/numpy/numpy/_apis/build/status/numpy.numpy?branchName=main)](-->
-<!--https://dev.azure.com/numpy/numpy/_build/latest?definitionId=1?branchName=main)-->
-<!--[![Actions build_test](https://github.com/numpy/numpy/actions/workflows/build_test.yml/badge.svg)](-->
-<!--https://github.com/numpy/numpy/actions/workflows/build_test.yml)-->
-<!--[![TravisCI](https://app.travis-ci.com/numpy/numpy.svg?branch=main)](-->
-<!--https://app.travis-ci.com/numpy/numpy)-->
-<!--[![CircleCI](https://img.shields.io/circleci/project/github/numpy/numpy/main.svg?label=CircleCI)](-->
-<!--https://circleci.com/gh/numpy/numpy)-->
-<!--[![Codecov](https://codecov.io/gh/numpy/numpy/branch/main/graph/badge.svg)](-->
-<!--https://codecov.io/gh/numpy/numpy)-->
+<h1 align="center">
+<img src="/branding/logo/primary/numpylogo.svg" width="300">
+</h1><br>
+
  
  [![Powered by NumFOCUS](https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](
  https://numfocus.org)
-[![Pypi Downloads](https://img.shields.io/pypi/dm/numpy.svg?label=Pypi%20downloads)](
+[![PyPI Downloads](https://img.shields.io/pypi/dm/numpy.svg?label=PyPI%20downloads)](
  https://pypi.org/project/numpy/)
  [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/numpy.svg?label=Conda%20downloads)](
  https://anaconda.org/conda-forge/numpy)
@@ -22,7 +14,7 @@ https://stackoverflow.com/questions/tagged/numpy)
  [![Nature Paper](https://img.shields.io/badge/DOI-10.1038%2Fs41592--019--0686--2-blue)](
  https://doi.org/10.1038/s41586-020-2649-2)
  
-NumPy is the fundamental package needed for scientific computing with Python.
+NumPy is the fundamental package for scientific computing with Python.
  
  - **Website:** https://www.numpy.org
  - **Documentation:** https://numpy.org/doc
@@ -41,25 +33,31 @@ It provides:
  
  Testing:
  
-NumPy requires `pytest`.  Tests can then be run after installation with:
+NumPy requires `pytest` and `hypothesis`.  Tests can then be run after installation with:
  
      python -c 'import numpy; numpy.test()'
  
+Code of Conduct
+----------------------
+
+NumPy is a community-driven open source project developed by a diverse group of
+[contributors](https://numpy.org/teams/). The NumPy leadership has made a strong
+commitment to creating an open, inclusive, and positive community. Please read the
+[NumPy Code of Conduct](https://numpy.org/code-of-conduct/) for guidance on how to interact
+with others in a way that makes our community thrive.
  
  Call for Contributions
  ----------------------
  
  The NumPy project welcomes your expertise and enthusiasm!
  
-Small improvements or fixes are always appreciated; issues labeled as ["good
-first issue"](https://github.com/numpy/numpy/labels/good%20first%20issue)
-may be a good starting point. If you are considering larger contributions
+Small improvements or fixes are always appreciated. If you are considering larger contributions
  to the source code, please contact us through the [mailing
  list](https://mail.python.org/mailman/listinfo/numpy-discussion) first.
  
-Writing code isn’t the only way to contribute to NumPy. You can also: 
+Writing code isn’t the only way to contribute to NumPy. You can also:
  - review pull requests
-- triage issues
+- help us stay on top of new and old issues
  - develop tutorials, presentations, and other educational materials
  - maintain and improve [our website](https://github.com/numpy/numpy.org)
  - develop graphic design for our brand assets and promotional materials
@@ -67,6 +65,7 @@ Writing code isn’t the only way to contribute to NumPy. You can also:
  - help with outreach and onboard new contributors
  - write grant proposals and help with other fundraising efforts
  
+For more information about the ways you can contribute to NumPy, visit [our website](https://numpy.org/contribute/). 
  If you’re unsure where to start or how your skills fit in, reach out! You can
  ask on the mailing list or here, on GitHub, by opening a new issue or leaving a
  comment on a relevant issue that is already open.
@@ -77,7 +76,7 @@ numpy-team@googlegroups.com or on Slack (write numpy-team@googlegroups.com for
  an invitation).
  
  We also have a biweekly community call, details of which are announced on the
-mailing list. You are very welcome to join. 
+mailing list. You are very welcome to join.
  
  If you are new to contributing to open source, [this
  guide](https://opensource.guide/how-to-contribute/) helps explain why, what,
diff --git a/benchmarks/asv.conf.json b/benchmarks/asv.conf.json
index 029adb5898db08a882aafc81059000a68d987de8..b60135524c109d59769767b8d72c5e9bf8aec901 100644
--- a/benchmarks/asv.conf.json
+++ b/benchmarks/asv.conf.json
@@ -43,6 +43,7 @@
      // version.
      "matrix": {
          "Cython": [],
+        "setuptools": ["59.2.0"]
      },
  
      // The directory (relative to the current directory) that benchmarks are
diff --git a/benchmarks/asv_compare.conf.json.tpl b/benchmarks/asv_compare.conf.json.tpl
index 93d12d4a0b777639bad5ecebf21be1cd0a556374..01f4e41de708a9b1599a042edacb2da920fbaddb 100644
--- a/benchmarks/asv_compare.conf.json.tpl
+++ b/benchmarks/asv_compare.conf.json.tpl
@@ -47,6 +47,7 @@
      // version.
      "matrix": {
          "Cython": [],
+        "setuptools": ["59.2.0"]
      },
  
      // The directory (relative to the current directory) that benchmarks are
diff --git a/benchmarks/benchmarks/bench_core.py b/benchmarks/benchmarks/bench_core.py
index 30647f4b850fa5a7b162ab98f460d8eac47aadff..4fcd7ace509aec95baf8778b62391ab381b0546b 100644
--- a/benchmarks/benchmarks/bench_core.py
+++ b/benchmarks/benchmarks/bench_core.py
@@ -207,7 +207,7 @@ class Indices(Benchmark):
          np.indices((1000, 500))
  
  class VarComplex(Benchmark):
-    params = [10**n for n in range(1, 9)]
+    params = [10**n for n in range(0, 9)]
      def setup(self, n):
          self.arr = np.random.randn(n) + 1j * np.random.randn(n)
  
diff --git a/benchmarks/benchmarks/bench_function_base.py b/benchmarks/benchmarks/bench_function_base.py
index 062843d10cc0f7edd346eff487108d10e6bd13a4..3e35f54f286b89159803d45b4392c16f3cbb7dfe 100644
--- a/benchmarks/benchmarks/bench_function_base.py
+++ b/benchmarks/benchmarks/bench_function_base.py
@@ -43,6 +43,20 @@ class Bincount(Benchmark):
          np.bincount(self.d, weights=self.e)
  
  
+class Mean(Benchmark):
+    param_names = ['size']
+    params = [[1, 10, 100_000]]
+
+    def setup(self, size):
+        self.array = np.arange(2*size).reshape(2, size)
+
+    def time_mean(self, size):
+        np.mean(self.array)
+
+    def time_mean_axis(self, size):
+        np.mean(self.array, axis=1)
+
+
  class Median(Benchmark):
      def setup(self):
          self.e = np.arange(10000, dtype=np.float32)
@@ -78,7 +92,7 @@ class Median(Benchmark):
  class Percentile(Benchmark):
      def setup(self):
          self.e = np.arange(10000, dtype=np.float32)
-        self.o = np.arange(10001, dtype=np.float32)
+        self.o = np.arange(21, dtype=np.float32)
  
      def time_quartile(self):
          np.percentile(self.e, [25, 75])
@@ -86,6 +100,9 @@ class Percentile(Benchmark):
      def time_percentile(self):
          np.percentile(self.e, [25, 35, 55, 65, 75])
  
+    def time_percentile_small(self):
+        np.percentile(self.o, [25, 75])
+
  
  class Select(Benchmark):
      def setup(self):
@@ -222,7 +239,7 @@ class Sort(Benchmark):
          # In NumPy 1.17 and newer, 'merge' can be one of several
          # stable sorts, it isn't necessarily merge sort.
          ['quick', 'merge', 'heap'],
-        ['float64', 'int64', 'int16'],
+        ['float64', 'int64', 'float32', 'uint32', 'int32', 'int16'],
          [
              ('random',),
              ('ordered',),
@@ -284,6 +301,21 @@ class Where(Benchmark):
          self.d = np.arange(20000)
          self.e = self.d.copy()
          self.cond = (self.d > 5000)
+        size = 1024 * 1024 // 8
+        rnd_array = np.random.rand(size)
+        self.rand_cond_01 = rnd_array > 0.01
+        self.rand_cond_20 = rnd_array > 0.20
+        self.rand_cond_30 = rnd_array > 0.30
+        self.rand_cond_40 = rnd_array > 0.40
+        self.rand_cond_50 = rnd_array > 0.50
+        self.all_zeros = np.zeros(size, dtype=bool)
+        self.all_ones = np.ones(size, dtype=bool)
+        self.rep_zeros_2 = np.arange(size) % 2 == 0
+        self.rep_zeros_4 = np.arange(size) % 4 == 0
+        self.rep_zeros_8 = np.arange(size) % 8 == 0
+        self.rep_ones_2 = np.arange(size) % 2 > 0
+        self.rep_ones_4 = np.arange(size) % 4 > 0
+        self.rep_ones_8 = np.arange(size) % 8 > 0
  
      def time_1(self):
          np.where(self.cond)
@@ -293,3 +325,43 @@ class Where(Benchmark):
  
      def time_2_broadcast(self):
          np.where(self.cond, self.d, 0)
+
+    def time_all_zeros(self):
+        np.where(self.all_zeros)
+
+    def time_random_01_percent(self):
+        np.where(self.rand_cond_01)
+
+    def time_random_20_percent(self):
+        np.where(self.rand_cond_20)
+
+    def time_random_30_percent(self):
+        np.where(self.rand_cond_30)
+
+    def time_random_40_percent(self):
+        np.where(self.rand_cond_40)
+
+    def time_random_50_percent(self):
+        np.where(self.rand_cond_50)
+
+    def time_all_ones(self):
+        np.where(self.all_ones)
+
+    def time_interleaved_zeros_x2(self):
+        np.where(self.rep_zeros_2)
+
+    def time_interleaved_zeros_x4(self):
+        np.where(self.rep_zeros_4)
+
+    def time_interleaved_zeros_x8(self):
+        np.where(self.rep_zeros_8)
+
+    def time_interleaved_ones_x2(self):
+        np.where(self.rep_ones_2)
+
+    def time_interleaved_ones_x4(self):
+        np.where(self.rep_ones_4)
+
+    def time_interleaved_ones_x8(self):
+        np.where(self.rep_ones_8)
+
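Reviewer's note on the `Where` additions above: the `rand_cond_*` masks are built as `rnd_array > threshold`, so the threshold is the expected share of False entries, not of True ones (`rand_cond_01` is almost entirely True). The sketch below measures the share of True values for a few of these masks. The seeded `default_rng` generator and the helper name `true_fraction` are assumptions made for this illustration and do not appear in the patch.

```python
import numpy as np


def true_fraction(mask):
    """Return the share of True entries in a boolean mask."""
    return np.count_nonzero(mask) / mask.size


size = 1024 * 1024 // 8          # same element count as the benchmark
rng = np.random.default_rng(0)   # seeded only to make this sketch repeatable
rnd_array = rng.random(size)

fractions = {
    # Uniform samples in [0, 1): the mask is True for about (1 - threshold).
    "rand_cond_01": true_fraction(rnd_array > 0.01),
    "rand_cond_50": true_fraction(rnd_array > 0.50),
    # Periodic masks: one True in eight, and seven True in eight.
    "rep_zeros_8": true_fraction(np.arange(size) % 8 == 0),
    "rep_ones_8": true_fraction(np.arange(size) % 8 > 0),
}
```

Read this way, the `time_random_*_percent` names appear to describe the share of zeros in the condition, which matches the `all_zeros` / `all_ones` naming of the neighbouring benchmarks.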
diff --git a/benchmarks/benchmarks/bench_linalg.py b/benchmarks/benchmarks/bench_linalg.py
index 5ed5b6eecd6d566f05532a36412311693288b7d9..a94ba1139168a84bbdd949dc9c9db8373f9c63c6 100644
--- a/benchmarks/benchmarks/bench_linalg.py
+++ b/benchmarks/benchmarks/bench_linalg.py
@@ -98,6 +98,18 @@ class Linalg(Benchmark):
          self.func(self.a)
  
  
+class LinalgSmallArrays(Benchmark):
+    """ Test overhead of linalg methods for small arrays """
+    def setup(self):
+        self.array_5 = np.arange(5.)
+        self.array_5_5 = np.arange(5.)
+
+    def time_norm_small_array(self):
+        np.linalg.norm(self.array_5)
+
+    def time_det_small_array(self):
+        np.linalg.det(self.array_5_5)
+        
  class Lstsq(Benchmark):
      def setup(self):
          self.a = get_squares_()['float64']
@@ -117,11 +129,11 @@ class Einsum(Benchmark):
          self.two_dim = np.arange(240000, dtype=dtype).reshape(400, 600)
          self.three_dim_small = np.arange(10000, dtype=dtype).reshape(10,100,10)
          self.three_dim = np.arange(24000, dtype=dtype).reshape(20, 30, 40)
-        # non_contigous arrays
-        self.non_contigous_dim1_small = np.arange(1, 80, 2, dtype=dtype)
-        self.non_contigous_dim1 = np.arange(1, 4000, 2, dtype=dtype)
-        self.non_contigous_dim2 = np.arange(1, 2400, 2, dtype=dtype).reshape(30, 40)
-        self.non_contigous_dim3 = np.arange(1, 48000, 2, dtype=dtype).reshape(20, 30, 40)
+        # non_contiguous arrays
+        self.non_contiguous_dim1_small = np.arange(1, 80, 2, dtype=dtype)
+        self.non_contiguous_dim1 = np.arange(1, 4000, 2, dtype=dtype)
+        self.non_contiguous_dim2 = np.arange(1, 2400, 2, dtype=dtype).reshape(30, 40)
+        self.non_contiguous_dim3 = np.arange(1, 48000, 2, dtype=dtype).reshape(20, 30, 40)
  
      # outer(a,b): trigger sum_of_products_contig_stride0_outcontig_two
      def time_einsum_outer(self, dtype):
@@ -130,7 +142,7 @@ class Einsum(Benchmark):
      # multiply(a, b):trigger sum_of_products_contig_two
      def time_einsum_multiply(self, dtype):
          np.einsum("..., ...", self.two_dim_small, self.three_dim , optimize=True)
-    
+
      # sum and multiply:trigger sum_of_products_contig_stride0_outstride0_two
      def time_einsum_sum_mul(self, dtype):
          np.einsum(",i...->", 300, self.three_dim_small, optimize=True)
@@ -138,11 +150,11 @@ class Einsum(Benchmark):
      # sum and multiply:trigger sum_of_products_stride0_contig_outstride0_two
      def time_einsum_sum_mul2(self, dtype):
          np.einsum("i...,->", self.three_dim_small, 300, optimize=True)
-    
+
      # scalar mul: trigger sum_of_products_stride0_contig_outcontig_two
      def time_einsum_mul(self, dtype):
          np.einsum("i,->i", self.one_dim_big, 300, optimize=True)
-    
+
      # trigger contig_contig_outstride0_two
      def time_einsum_contig_contig(self, dtype):
          np.einsum("ji,i->", self.two_dim, self.one_dim_small, optimize=True)
@@ -151,30 +163,30 @@ class Einsum(Benchmark):
      def time_einsum_contig_outstride0(self, dtype):
          np.einsum("i->", self.one_dim_big, optimize=True)
  
-    # outer(a,b): non_contigous arrays
+    # outer(a,b): non_contiguous arrays
      def time_einsum_noncon_outer(self, dtype):
-        np.einsum("i,j", self.non_contigous_dim1, self.non_contigous_dim1, optimize=True)
+        np.einsum("i,j", self.non_contiguous_dim1, self.non_contiguous_dim1, optimize=True)
  
-    # multiply(a, b):non_contigous arrays
+    # multiply(a, b):non_contiguous arrays
      def time_einsum_noncon_multiply(self, dtype):
-        np.einsum("..., ...", self.non_contigous_dim2, self.non_contigous_dim3 , optimize=True)
-    
-    # sum and multiply:non_contigous arrays
+        np.einsum("..., ...", self.non_contiguous_dim2, self.non_contiguous_dim3, optimize=True)
+
+    # sum and multiply:non_contiguous arrays
      def time_einsum_noncon_sum_mul(self, dtype):
-        np.einsum(",i...->", 300, self.non_contigous_dim3, optimize=True)
+        np.einsum(",i...->", 300, self.non_contiguous_dim3, optimize=True)
  
-    # sum and multiply:non_contigous arrays
+    # sum and multiply:non_contiguous arrays
      def time_einsum_noncon_sum_mul2(self, dtype):
-        np.einsum("i...,->", self.non_contigous_dim3, 300, optimize=True)
-    
-    # scalar mul: non_contigous arrays
+        np.einsum("i...,->", self.non_contiguous_dim3, 300, optimize=True)
+
+    # scalar mul: non_contiguous arrays
      def time_einsum_noncon_mul(self, dtype):
-        np.einsum("i,->i", self.non_contigous_dim1, 300, optimize=True)
-    
-    # contig_contig_outstride0_two: non_contigous arrays
+        np.einsum("i,->i", self.non_contiguous_dim1, 300, optimize=True)
+
+    # contig_contig_outstride0_two: non_contiguous arrays
      def time_einsum_noncon_contig_contig(self, dtype):
-        np.einsum("ji,i->", self.non_contigous_dim2, self.non_contigous_dim1_small, optimize=True)
+        np.einsum("ji,i->", self.non_contiguous_dim2, self.non_contiguous_dim1_small, optimize=True)
  
-    # sum_of_products_contig_outstride0_one：non_contigous arrays
+    # sum_of_products_contig_outstride0_one：non_contiguous arrays
      def time_einsum_noncon_contig_outstride0(self, dtype):
-        np.einsum("i->", self.non_contigous_dim1, optimize=True)
+        np.einsum("i->", self.non_contiguous_dim1, optimize=True)
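Reviewer's note on `LinalgSmallArrays` above: `array_5_5` is created with the same `np.arange(5.)` call as `array_5`, so it is one-dimensional, while `np.linalg.det` only accepts input with at least two dimensions. The sketch below assumes a 5x5 matrix was intended; the reshape is a guess at the intent and is not part of the patch.

```python
import numpy as np

array_5 = np.arange(5.)                    # 1-D, as written in the patch
array_5_5 = np.arange(25.).reshape(5, 5)   # assumed intent: a 5x5 matrix

try:
    np.linalg.det(array_5)                 # 1-D input is rejected
    det_1d_error = None
except np.linalg.LinAlgError as exc:
    det_1d_error = exc

norm_5 = np.linalg.norm(array_5)           # sqrt(0 + 1 + 4 + 9 + 16)
det_5_5 = np.linalg.det(array_5_5)         # square 2-D input is accepted
```

If the benchmark is run as committed, `time_det_small_array` would presumably fail with this error instead of producing a timing.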
diff --git a/benchmarks/benchmarks/bench_reduce.py b/benchmarks/benchmarks/bench_reduce.py
index 7b05f4fcce317a3cef2aef57841f79b614d42563..ca07bd180c0ea0c3f22282e10ba0ddde6cbba65a 100644
--- a/benchmarks/benchmarks/bench_reduce.py
+++ b/benchmarks/benchmarks/bench_reduce.py
@@ -46,7 +46,8 @@ class AnyAll(Benchmark):
  
  
  class MinMax(Benchmark):
-    params = [np.float32, np.float64, np.intp]
+    params = [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.uint32,
+              np.int64, np.uint64, np.float32, np.float64, np.intp]
      param_names = ['dtype']
  
      def setup(self, dtype):
@@ -58,8 +59,22 @@ class MinMax(Benchmark):
      def time_max(self, dtype):
          np.max(self.d)
  
+class FMinMax(Benchmark):
+    params = [np.float32, np.float64]
+    param_names = ['dtype']
+
+    def setup(self, dtype):
+        self.d = np.ones(20000, dtype=dtype)
+
+    def time_min(self, dtype):
+        np.fmin.reduce(self.d)
+
+    def time_max(self, dtype):
+        np.fmax.reduce(self.d)
+
  class ArgMax(Benchmark):
-    params = [np.float32, bool]
+    params = [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.uint32,
+              np.int64, np.uint64, np.float32, np.float64, bool]
      param_names = ['dtype']
  
      def setup(self, dtype):
@@ -68,6 +83,17 @@ class ArgMax(Benchmark):
      def time_argmax(self, dtype):
          np.argmax(self.d)
  
+class ArgMin(Benchmark):
+    params = [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.uint32,
+              np.int64, np.uint64, np.float32, np.float64, bool]
+    param_names = ['dtype']
+
+    def setup(self, dtype):
+        self.d = np.ones(200000, dtype=dtype)
+
+    def time_argmin(self, dtype):
+        np.argmin(self.d)
+
  class SmallReduction(Benchmark):
      def setup(self):
          self.d = np.ones(100, dtype=np.float32)
diff --git a/benchmarks/benchmarks/bench_scalar.py b/benchmarks/benchmarks/bench_scalar.py
index 219e48bede94fcfd0d34337998cf6fec1ec7de76..650daa89de42cc8b03818c1aa967b58736781721 100644
--- a/benchmarks/benchmarks/bench_scalar.py
+++ b/benchmarks/benchmarks/bench_scalar.py
@@ -10,6 +10,8 @@ class ScalarMath(Benchmark):
      param_names = ["type"]
      def setup(self, typename):
          self.num = np.dtype(typename).type(2)
+        self.int32 = np.int32(2)
+        self.int32arr = np.array(2, dtype=np.int32)
  
      def time_addition(self, typename):
          n = self.num
@@ -31,3 +33,35 @@ class ScalarMath(Benchmark):
          n = self.num
          res = abs(abs(abs(abs(abs(abs(abs(abs(abs(abs(n))))))))))
  
+    def time_add_int32_other(self, typename):
+        # Some mixed cases are fast, some are slow, this documents these
+        # differences.  (When writing, it was fast if the type of the result
+        # is one of the inputs.)
+        int32 = self.int32
+        other = self.num
+        int32 + other
+        int32 + other
+        int32 + other
+        int32 + other
+        int32 + other
+
+    def time_add_int32arr_and_other(self, typename):
+        # `arr + scalar` hits the normal ufunc (array) paths.
+        int32 = self.int32arr
+        other = self.num
+        int32 + other
+        int32 + other
+        int32 + other
+        int32 + other
+        int32 + other
+
+    def time_add_other_and_int32arr(self, typename):
+        # `scalar + arr` at some point hit scalar paths in some cases, and
+        # these paths could be optimized more easily
+        int32 = self.int32arr
+        other = self.num
+        other + int32
+        other + int32
+        other + int32
+        other + int32
+        other + int32
diff --git a/benchmarks/benchmarks/bench_shape_base.py b/benchmarks/benchmarks/bench_shape_base.py
index 0c7dc4e728eca11bec37fa09c44d44bf03a3e2cd..375c43dcfe56466b6a3888ab0121d0d448976a32 100644
--- a/benchmarks/benchmarks/bench_shape_base.py
+++ b/benchmarks/benchmarks/bench_shape_base.py
@@ -134,3 +134,21 @@ class Block3D(Benchmark):
  
      # Retain old benchmark name for backward compat
      time_3d.benchmark_name = "bench_shape_base.Block.time_3d"
+
+
+class Kron(Benchmark):
+    """Benchmarks for Kronecker product of two arrays"""
+
+    def setup(self):
+        self.large_arr = np.random.random((10,) * 4)
+        self.large_mat = np.mat(np.random.random((100, 100)))
+        self.scalar = 7
+
+    def time_arr_kron(self):
+        np.kron(self.large_arr, self.large_arr)
+
+    def time_scalar_kron(self):
+        np.kron(self.large_arr, self.scalar)
+
+    def time_mat_kron(self):
+        np.kron(self.large_mat, self.large_mat)
diff --git a/benchmarks/benchmarks/bench_ufunc.py b/benchmarks/benchmarks/bench_ufunc.py
index b036581e1aae392cb73132a974beefed922b68e2..cfa29017d23919f9992b36d900f0cdb2ad4f6c40 100644
--- a/benchmarks/benchmarks/bench_ufunc.py
+++ b/benchmarks/benchmarks/bench_ufunc.py
@@ -57,10 +57,47 @@ class UFunc(Benchmark):
      def time_ufunc_types(self, ufuncname):
          [self.f(*arg) for arg in self.args]
  
+class UFuncSmall(Benchmark):
+    """  Benchmark for a selection of ufuncs on a small arrays and scalars 
+
+    Since the arrays and scalars are small, we are benchmarking the overhead 
+    of the numpy ufunc functionality
+    """
+    params = ['abs', 'sqrt', 'cos']
+    param_names = ['ufunc']
+    timeout = 10
+
+    def setup(self, ufuncname):
+        np.seterr(all='ignore')
+        try:
+            self.f = getattr(np, ufuncname)
+        except AttributeError:
+            raise NotImplementedError()
+        self.array_5 = np.array([1., 2., 10., 3., 4.])
+        self.array_int_3 = np.array([1, 2, 3])
+        self.float64 = np.float64(1.1)
+        self.python_float = 1.1
+        
+    def time_ufunc_small_array(self, ufuncname):
+        self.f(self.array_5)
+
+    def time_ufunc_small_array_inplace(self, ufuncname):
+        self.f(self.array_5, out = self.array_5)
+
+    def time_ufunc_small_int_array(self, ufuncname):
+        self.f(self.array_int_3)
+
+    def time_ufunc_numpy_scalar(self, ufuncname):
+        self.f(self.float64)
+
+    def time_ufunc_python_float(self, ufuncname):
+        self.f(self.python_float)
+        
  
  class Custom(Benchmark):
      def setup(self):
          self.b = np.ones(20000, dtype=bool)
+        self.b_small = np.ones(3, dtype=bool)
  
      def time_nonzero(self):
          np.nonzero(self.b)
@@ -74,6 +111,9 @@ class Custom(Benchmark):
      def time_or_bool(self):
          (self.b | self.b)
  
+    def time_and_bool_small(self):
+        (self.b_small & self.b_small)
+
  
  class CustomInplace(Benchmark):
      def setup(self):
@@ -150,6 +190,19 @@ class CustomScalarFloorDivideInt(Benchmark):
      def time_floor_divide_int(self, dtype, divisor):
          self.x // divisor
  
+class CustomArrayFloorDivideInt(Benchmark):
+    params = (np.sctypes['int'] + np.sctypes['uint'], [100, 10000, 1000000])
+    param_names = ['dtype', 'size']
+
+    def setup(self, dtype, size):
+        iinfo = np.iinfo(dtype)
+        self.x = np.random.randint(
+                    iinfo.min, iinfo.max, size=size, dtype=dtype)
+        self.y = np.random.randint(2, 32, size=size, dtype=dtype)
+
+    def time_floor_divide_int(self, dtype, size):
+        self.x // self.y
+
  
  class Scalar(Benchmark):
      def setup(self):
diff --git a/benchmarks/benchmarks/bench_ufunc_strides.py b/benchmarks/benchmarks/bench_ufunc_strides.py
index 75aa510a6b81f124ed55212dc67fdba5eddf44ca..b751e4804238f904686d104794045932e3507c3b 100644
--- a/benchmarks/benchmarks/bench_ufunc_strides.py
+++ b/benchmarks/benchmarks/bench_ufunc_strides.py
@@ -44,27 +44,40 @@ class AVX_UFunc_log(Benchmark):
      def time_log(self, stride, dtype):
          np.log(self.arr[::stride])
  
-avx_bfuncs = ['maximum',
-              'minimum']
  
-class AVX_BFunc(Benchmark):
+binary_ufuncs = [
+    'maximum', 'minimum', 'fmax', 'fmin'
+]
+binary_dtype = ['f', 'd']
  
-    params = [avx_bfuncs, dtype, stride]
-    param_names = ['avx_based_bfunc', 'dtype', 'stride']
+class Binary(Benchmark):
+    param_names = ['ufunc', 'stride_in0', 'stride_in1', 'stride_out', 'dtype']
+    params = [binary_ufuncs, stride, stride, stride_out, binary_dtype]
      timeout = 10
  
-    def setup(self, ufuncname, dtype, stride):
+    def setup(self, ufuncname, stride_in0, stride_in1, stride_out, dtype):
          np.seterr(all='ignore')
          try:
              self.f = getattr(np, ufuncname)
          except AttributeError:
              raise NotImplementedError(f"No ufunc {ufuncname} found") from None
-        N = 10000
-        self.arr1 = np.array(np.random.rand(stride*N), dtype=dtype)
-        self.arr2 = np.array(np.random.rand(stride*N), dtype=dtype)
+        N = 100000
+        self.arr1 = np.array(np.random.rand(stride_in0*N), dtype=dtype)
+        self.arr2 = np.array(np.random.rand(stride_in1*N), dtype=dtype)
+        self.arr_out = np.empty(stride_out*N, dtype)
  
-    def time_ufunc(self, ufuncname, dtype, stride):
-        self.f(self.arr1[::stride], self.arr2[::stride])
+    def time_ufunc(self, ufuncname, stride_in0, stride_in1, stride_out, dtype):
+        self.f(self.arr1[::stride_in0], self.arr2[::stride_in1],
+               self.arr_out[::stride_out])
+
+
+binary_int_ufuncs = ['maximum', 'minimum']
+binary_int_dtype = ['b', 'B', 'h', 'H', 'i', 'I', 'l', 'L', 'q', 'Q']
+
+class BinaryInt(Binary):
+
+    param_names = ['ufunc', 'stride_in0', 'stride_in1', 'stride_out', 'dtype']
+    params = [binary_int_ufuncs, stride, stride, stride_out, binary_int_dtype]
  
  class AVX_ldexp(Benchmark):
  
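Reviewer's note on the `Binary` benchmark above: each buffer is allocated with `stride * N` elements, so slicing it with `[::stride]` always yields `N` items and the two inputs and the output line up for any combination of strides. A minimal sketch of one such call follows. The stride values 2, 4 and 2, the small `N`, and the seeded generator are arbitrary choices for illustration and are not taken from the patch.

```python
import numpy as np

N = 8                                          # small on purpose; the patch uses 100000
stride_in0, stride_in1, stride_out = 2, 4, 2   # arbitrary illustration values

rng = np.random.default_rng(0)
arr1 = rng.random(stride_in0 * N).astype('f')
arr2 = rng.random(stride_in1 * N).astype('f')
arr_out = np.empty(stride_out * N, 'f')

# Each strided view has exactly N elements, so the shapes match.
in0 = arr1[::stride_in0]
in1 = arr2[::stride_in1]
out = arr_out[::stride_out]

# The third positional argument of a binary ufunc is the output array;
# the result is written into the non-contiguous view in place.
result = np.maximum(in0, in1, out)
```

Because all three operands can be non-contiguous at once, a single parametrized benchmark covers the contiguous case (all strides 1) as well as the mixed ones.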
diff --git a/doc/C_STYLE_GUIDE.rst.txt b/doc/C_STYLE_GUIDE.rst.txt
index 4e2f27fbb1b10380c1ac0f852446c2d3b6b3a6ac..60d2d73835101388cbe7a2c020a9db3ed1d2f78e 100644
--- a/doc/C_STYLE_GUIDE.rst.txt
+++ b/doc/C_STYLE_GUIDE.rst.txt
@@ -1,3 +1,3 @@
  
-The "NumPy C Style Guide" at this page has been supserseded by
+The "NumPy C Style Guide" at this page has been superseded by
  "NEP 45 — C Style Guide" at https://numpy.org/neps/nep-0045-c_style_guide.html
diff --git a/doc/EXAMPLE_DOCSTRING.rst.txt b/doc/EXAMPLE_DOCSTRING.rst.txt
index 55294f6568c46a8bb5c828dc66de332e0fdece69..1de0588ec5f3bb9718124e2dd7ae3616b7ff7e22 100644
--- a/doc/EXAMPLE_DOCSTRING.rst.txt
+++ b/doc/EXAMPLE_DOCSTRING.rst.txt
@@ -9,7 +9,7 @@ multivariate_normal(mean, cov[, shape])
  Draw samples from a multivariate normal distribution.
  
  The multivariate normal, multinormal or Gaussian distribution is a
-generalisation of the one-dimensional normal distribution to higher
+generalization of the one-dimensional normal distribution to higher
  dimensions.
  
  Such a distribution is specified by its mean and covariance matrix,
diff --git a/doc/HOWTO_RELEASE.rst.txt b/doc/HOWTO_RELEASE.rst.txt
index 37e047f9fbf0498303f0d63c3b2bcc173d161688..6bf1b226872122ba149cc3853238bcc93f9e4318 100644
--- a/doc/HOWTO_RELEASE.rst.txt
+++ b/doc/HOWTO_RELEASE.rst.txt
@@ -153,7 +153,7 @@ What is released
  
  Wheels
  ------
-We currently support Python 3.6-3.8 on Windows, OSX, and Linux
+We currently support Python 3.8-3.10 on Windows, OSX, and Linux
  
  * Windows: 32-bit and 64-bit wheels built using Appveyor;
  * OSX: x64_86 OSX wheels built using travis-ci;
diff --git a/doc/Makefile b/doc/Makefile
index 16fc3229d4c90ba015ff0898efaad648f4561c77..ad4414d38b3aa65f4d9b5283ef06923ca41a53cb 100644
--- a/doc/Makefile
+++ b/doc/Makefile
@@ -88,6 +88,7 @@ ifeq "$(GITVER)" "Unknown"
         # @echo sdist build with unlabeled sources
  else ifeq ("", "$(NUMPYVER)")
         @echo numpy not found, cannot build documentation without successful \"import numpy\"
+       @echo Try overriding the makefile PYTHON variable with '"make PYTHON=python $(MAKECMDGOALS)"'
         @exit 1
  else ifneq ($(NUMPYVER),$(GITVER))
         @echo installed numpy $(NUMPYVER) != current repo git version \'$(GITVER)\'
@@ -263,4 +264,3 @@ info:
  
  show:
         @python -c "import webbrowser; webbrowser.open_new_tab('file://$(PWD)/build/html/index.html')"
-
diff --git a/doc/RELEASE_WALKTHROUGH.rst.txt b/doc/RELEASE_WALKTHROUGH.rst.txt
index 9324ce971e2962c4f3bc3973b750c235f6884397..eb9d513e30b209f2d565df105d20666e4bf170e9 100644
--- a/doc/RELEASE_WALKTHROUGH.rst.txt
+++ b/doc/RELEASE_WALKTHROUGH.rst.txt
@@ -12,7 +12,7 @@ ensure that you have the needed software. Most software can be installed with
  pip, but some will require apt-get, dnf, or whatever your system uses for
  software. Note that at this time the documentation cannot be built with Python
  3.10, for that use 3.8-3.9 instead. You will also need a GitHub personal access
-token (PAT) to push the documention. There are a few ways to streamline things.
+token (PAT) to push the documentation. There are a few ways to streamline things.
  
  - Git can be set up to use a keyring to store your GitHub personal access token.
    Search online for the details.
@@ -126,8 +126,8 @@ source releases in the latter. ::
      $ paver sdist  # sdist will do a git clean -xdfq, so we omit that
  
  
-Build wheels
-------------
+Build wheels via MacPython/numpy-wheels
+---------------------------------------
  
  Trigger the wheels build by pointing the numpy-wheels repository at this
  commit. This can take up to an hour. The numpy-wheels repository is cloned from
@@ -163,6 +163,23 @@ Note that sometimes builds, like tests, fail for unrelated reasons and you will
  need to rerun them. You will need to be logged in under 'numpy' to do this
  on azure.
  
+Build wheels via cibuildwheel
+-----------------------------
+Tagging the build at the beginning of this process will trigger a wheel build
+via cibuildwheel and upload wheels and an sdist to the staging area. The CI run
+on GitHub Actions (for all x86-based and macOS arm64 wheels) takes about 1 1/4
+hours. The CI run on Travis CI (for aarch64) takes less time.
+
+If you wish to manually trigger a wheel build, you can do so:
+
+- On GitHub Actions -> `Wheel builder`_ there is a "Run workflow" button; click
+  it and choose the tag to build.
+- On travis_ there is a "More Options" button; click it and choose a branch
+  to build. There does not appear to be an option to build a tag.
+
+.. _`Wheel builder`: https://github.com/numpy/numpy/actions/workflows/wheels.yml
+.. _travis : https://app.travis-ci.com/github/numpy/numpy
+
  Download wheels
  ---------------
  
@@ -265,6 +282,11 @@ If the release series is a new one, you will need to add a new section to the
  
      $ gvim index.html +/'insert here'
  
+Further, update the version-switcher JSON file to add the new release and
+update the version marked `(stable)`::
+
+    $ gvim _static/versions.json
+
  Otherwise, only the ``zip`` and ``pdf`` links should be updated with the
  new tag name::
  
diff --git a/doc/changelog/1.12.0-changelog.rst b/doc/changelog/1.12.0-changelog.rst

index 2e91f510f529d57beaa183ae95cf47161ba968ab..052714374445854b7ff027c2264548a69227c366 100644
--- a/doc/changelog/1.12.0-changelog.rst
+++ b/doc/changelog/1.12.0-changelog.rst
@@ -283,7 +283,7 @@ A total of 418 pull requests were merged for this release.
  * `#7373 <https://github.com/numpy/numpy/pull/7373>`__: ENH: Add bitwise_and identity
  * `#7378 <https://github.com/numpy/numpy/pull/7378>`__: added NumPy logo and separator
  * `#7382 <https://github.com/numpy/numpy/pull/7382>`__: MAINT: cleanup np.average
-* `#7385 <https://github.com/numpy/numpy/pull/7385>`__: DOC: note about wheels / windows wheels for pypi
+* `#7385 <https://github.com/numpy/numpy/pull/7385>`__: DOC: note about wheels / windows wheels for PyPI
  * `#7386 <https://github.com/numpy/numpy/pull/7386>`__: Added label icon to Travis status
  * `#7397 <https://github.com/numpy/numpy/pull/7397>`__: BUG: incorrect type for objects whose __len__ fails
  * `#7398 <https://github.com/numpy/numpy/pull/7398>`__: DOC: fix typo
diff --git a/doc/changelog/1.20.0-changelog.rst b/doc/changelog/1.20.0-changelog.rst

index f06bd8a8d22d32f0277c63d6b871d9628925358f..f2af4a7de01c3dc1e25ffe365c68570346b3594c 100644
--- a/doc/changelog/1.20.0-changelog.rst
+++ b/doc/changelog/1.20.0-changelog.rst
@@ -714,7 +714,7 @@ A total of 716 pull requests were merged for this release.
  * `#17440 <https://github.com/numpy/numpy/pull/17440>`__: DOC: Cleaner template for PRs
  * `#17442 <https://github.com/numpy/numpy/pull/17442>`__: MAINT: fix exception chaining in format.py
  * `#17443 <https://github.com/numpy/numpy/pull/17443>`__: ENH: Warn on unsupported Python 3.10+
-* `#17444 <https://github.com/numpy/numpy/pull/17444>`__: ENH: Add ``Typing :: Typed`` to the PyPi classifier
+* `#17444 <https://github.com/numpy/numpy/pull/17444>`__: ENH: Add ``Typing :: Typed`` to the PyPI classifier
  * `#17445 <https://github.com/numpy/numpy/pull/17445>`__: DOC: Fix the references for macros
  * `#17447 <https://github.com/numpy/numpy/pull/17447>`__: NEP: update NEP 42 with discussion of type hinting applications
  * `#17448 <https://github.com/numpy/numpy/pull/17448>`__: DOC: Remove CoC pages from Sphinx
diff --git a/doc/changelog/1.21.5-changelog.rst b/doc/changelog/1.21.5-changelog.rst

new file mode 100644

index 0000000..acd3599
--- /dev/null
+++ b/doc/changelog/1.21.5-changelog.rst
@@ -0,0 +1,31 @@
+
+Contributors
+============
+
+A total of 7 people contributed to this release.  People with a "+" by their
+names contributed a patch for the first time.
+
+* Bas van Beek
+* Charles Harris
+* Matti Picus
+* Rohit Goswami +
+* Ross Barnowski
+* Sayed Adel
+* Sebastian Berg
+
+Pull requests merged
+====================
+
+A total of 11 pull requests were merged for this release.
+
+* `#20357 <https://github.com/numpy/numpy/pull/20357>`__: MAINT: Do not forward `__(deep)copy__` calls of `_GenericAlias`...
+* `#20462 <https://github.com/numpy/numpy/pull/20462>`__: BUG: Fix float16 einsum fastpaths using wrong tempvar
+* `#20463 <https://github.com/numpy/numpy/pull/20463>`__: BUG, DIST: Print os error message when the executable not exist
+* `#20464 <https://github.com/numpy/numpy/pull/20464>`__: BLD: Verify the ability to compile C++ sources before initiating...
+* `#20465 <https://github.com/numpy/numpy/pull/20465>`__: BUG: Force ``npymath`` to respect ``npy_longdouble``
+* `#20466 <https://github.com/numpy/numpy/pull/20466>`__: BUG: Fix failure to create aligned, empty structured dtype
+* `#20467 <https://github.com/numpy/numpy/pull/20467>`__: ENH: provide a convenience function to replace npy_load_module
+* `#20495 <https://github.com/numpy/numpy/pull/20495>`__: MAINT: update wheel to version that supports python3.10
+* `#20497 <https://github.com/numpy/numpy/pull/20497>`__: BUG: Clear errors correctly in F2PY conversions
+* `#20613 <https://github.com/numpy/numpy/pull/20613>`__: DEV: add a warningfilter to fix pytest workflow.
+* `#20618 <https://github.com/numpy/numpy/pull/20618>`__: MAINT: Help boost::python libraries at least not crash
diff --git a/doc/changelog/1.21.6-changelog.rst b/doc/changelog/1.21.6-changelog.rst

new file mode 100644

index 0000000..5d869ee
--- /dev/null
+++ b/doc/changelog/1.21.6-changelog.rst
@@ -0,0 +1,15 @@
+
+Contributors
+============
+
+A total of 1 person contributed to this release.  People with a "+" by their
+names contributed a patch for the first time.
+
+* Charles Harris
+
+Pull requests merged
+====================
+
+A total of 1 pull request was merged for this release.
+
+* `#21318 <https://github.com/numpy/numpy/pull/21318>`__: REV: Revert pull request #20464 from charris/backport-20354
diff --git a/doc/changelog/1.22.3-changelog.rst b/doc/changelog/1.22.3-changelog.rst

index b0ecedc8aca06cf44ca18ee01de1077c3906755c..051e5665b231132db9ee2b456d858b90dd46f984 100644
--- a/doc/changelog/1.22.3-changelog.rst
+++ b/doc/changelog/1.22.3-changelog.rst
@@ -25,7 +25,7 @@ A total of 10 pull requests were merged for this release.
  * `#21137 <https://github.com/numpy/numpy/pull/21137>`__: BLD,DOC: skip broken ipython 8.1.0
  * `#21138 <https://github.com/numpy/numpy/pull/21138>`__: BUG, ENH: np._from_dlpack: export correct device information
  * `#21139 <https://github.com/numpy/numpy/pull/21139>`__: BUG: Fix numba DUFuncs added loops getting picked up
-* `#21140 <https://github.com/numpy/numpy/pull/21140>`__: BUG: Fix unpickling an empty ndarray with a none-zero dimension...
+* `#21140 <https://github.com/numpy/numpy/pull/21140>`__: BUG: Fix unpickling an empty ndarray with a non-zero dimension...
  * `#21141 <https://github.com/numpy/numpy/pull/21141>`__: BUG: use ThreadPoolExecutor instead of ThreadPool
  * `#21142 <https://github.com/numpy/numpy/pull/21142>`__: API: Disallow strings in logical ufuncs
  * `#21143 <https://github.com/numpy/numpy/pull/21143>`__: MAINT, DOC: Fix SciPy intersphinx link
diff --git a/doc/changelog/1.23.0-changelog.rst b/doc/changelog/1.23.0-changelog.rst

new file mode 100644

index 0000000..3ff0dcf
--- /dev/null
+++ b/doc/changelog/1.23.0-changelog.rst
@@ -0,0 +1,660 @@
+
+Contributors
+============
+
+A total of 151 people contributed to this release.  People with a "+" by their
+names contributed a patch for the first time.
+
+* @DWesl
+* @GalaxySnail +
+* @code-review-doctor +
+* @h-vetinari
+* Aaron Meurer
+* Alexander Shadchin
+* Alexandre de Siqueira
+* Allan Haldane
+* Amrit Krishnan
+* Andrei Batomunkuev
+* Andrew J. Hesford +
+* Andrew Murray +
+* Andrey Andreyevich Bienkowski +
+* André Elimelek de Weber +
+* Andy Wharton +
+* Arryan Singh
+* Arushi Sharma
+* Bas van Beek
+* Bharat Raghunathan
+* Bhavuk Kalra +
+* Brigitta Sipőcz
+* Brénainn Woodsend +
+* Burlen Loring +
+* Caio Agiani +
+* Charles Harris
+* Chiara Marmo
+* Cornelius Roemer +
+* Dahyun Kim +
+* Damien Caliste
+* David Prosin +
+* Denis Laxalde
+* Developer-Ecosystem-Engineering
+* Devin Shanahan +
+* Diego Wang +
+* Dimitri Papadopoulos Orfanos
+* Ding Liu +
+* Diwakar Gupta +
+* Don Kirkby +
+* Emma Simon +
+* Eric Wieser
+* Evan Miller +
+* Evgeni Burovski
+* Evgeny Posenitskiy +
+* Ewout ter Hoeven +
+* Felix Divo
+* Francesco Andreuzzi +
+* Ganesh Kathiresan
+* Gaëtan de Menten
+* Geoffrey Gunter +
+* Hans Meine
+* Harsh Mishra +
+* Henry Schreiner
+* Hood Chatham +
+* I-Shen Leong
+* Ilhan Polat
+* Inessa Pawson
+* Isuru Fernando
+* Ivan Gonzalez +
+* Ivan Meleshko +
+* Ivan Yashchuk +
+* Janus Heide +
+* Jarrod Millman
+* Jason Thai +
+* Jeremy Volkman +
+* Jesús Carrete Montaña +
+* Jhong-Ken Chen (陳仲肯) +
+* John Kirkham
+* John-Mark Gurney +
+* Jonathan Deng +
+* Joseph Fox-Rabinovitz
+* Jouke Witteveen +
+* Junyan Ou +
+* Jérôme Richard +
+* Kassian Sun +
+* Kazuki Sakamoto +
+* Kenichi Maehashi
+* Kevin Sheppard
+* Kilian Lieret +
+* Kushal Beniwal +
+* Leo Singer
+* Logan Thomas +
+* Lorenzo Mammana +
+* Margret Pax
+* Mariusz Felisiak +
+* Markus Mohrhard +
+* Mars Lee
+* Marten van Kerkwijk
+* Masamichi Hosoda +
+* Matthew Barber
+* Matthew Brett
+* Matthias Bussonnier
+* Matthieu Darbois
+* Matti Picus
+* Melissa Weber Mendonça
+* Michael Burkhart +
+* Morteza Mirzai +
+* Motahhar Mokf +
+* Muataz Attaia +
+* Muhammad Motawe +
+* Mukulika Pahari
+* Márton Gunyhó +
+* Namami Shanker +
+* Nihaal Sangha +
+* Niyas Sait
+* Omid Rajaei +
+* Oscar Gustafsson +
+* Ovee Jawdekar +
+* P. L. Lim +
+* Pamphile Roy +
+* Pantelis Antonoudiou +
+* Pearu Peterson
+* Peter Andreas Entschev
+* Peter Hawkins
+* Pierre de Buyl
+* Pieter Eendebak +
+* Pradipta Ghosh +
+* Rafael Cardoso Fernandes Sousa +
+* Raghuveer Devulapalli
+* Ralf Gommers
+* Raphael Kruse
+* Raúl Montón Pinillos
+* Robert Kern
+* Rohit Goswami
+* Ross Barnowski
+* Ruben Garcia +
+* Sadie Louise Bartholomew +
+* Saswat Das +
+* Sayed Adel
+* Sebastian Berg
+* Serge Guelton
+* Simon Surland Andersen +
+* Siyabend Ürün +
+* Somasree Majumder +
+* Soumya +
+* Stefan van der Walt
+* Stefano Miccoli +
+* Stephan Hoyer
+* Stephen Worsley +
+* Tania Allard
+* Thomas Duvernay +
+* Thomas Green +
+* Thomas J. Fan
+* Thomas Li +
+* Tim Hoffmann
+* Ting Sun +
+* Tirth Patel
+* Toshiki Kataoka
+* Tyler Reddy
+* Warren Weckesser
+* Yang Hau
+* Yoon, Jee Seok +
+
+Pull requests merged
+====================
+
+A total of 494 pull requests were merged for this release.
+
+* `#15006 <https://github.com/numpy/numpy/pull/15006>`__: ENH: add support for operator() in crackfortran.
+* `#15844 <https://github.com/numpy/numpy/pull/15844>`__: ENH: add inline definition of access rights for Fortran types
+* `#16810 <https://github.com/numpy/numpy/pull/16810>`__: MAINT: Remove subclass paths from scalar_value
+* `#16830 <https://github.com/numpy/numpy/pull/16830>`__: MAINT: more python <3.6 cleanup
+* `#16895 <https://github.com/numpy/numpy/pull/16895>`__: MAINT: extend delete single value optimization
+* `#17709 <https://github.com/numpy/numpy/pull/17709>`__: BUG: Fix norm type promotion
+* `#18343 <https://github.com/numpy/numpy/pull/18343>`__: DOC: document how to skip CI jobs
+* `#18846 <https://github.com/numpy/numpy/pull/18846>`__: DOC: Improve documentation of default int type on windows
+* `#19226 <https://github.com/numpy/numpy/pull/19226>`__: API: Fix structured dtype cast-safety, promotion, and comparison
+* `#19345 <https://github.com/numpy/numpy/pull/19345>`__: ENH: Move ``ensure_dtype_nbo`` onto the DType as ``ensure_canonical``
+* `#19346 <https://github.com/numpy/numpy/pull/19346>`__: API: Fix ``np.result_type(structured_dtype)`` to "canonicalize"
+* `#19581 <https://github.com/numpy/numpy/pull/19581>`__: DOC: Improve SIMD documentation(1/4)
+* `#19756 <https://github.com/numpy/numpy/pull/19756>`__: DOC: Update front page of documentation with Sphinx-Panels
+* `#19898 <https://github.com/numpy/numpy/pull/19898>`__: DOC: Fixed Refguide errors
+* `#20020 <https://github.com/numpy/numpy/pull/20020>`__: ENH: add ndenumerate specialization for masked arrays
+* `#20093 <https://github.com/numpy/numpy/pull/20093>`__: DOC: Created an indexing how-to
+* `#20131 <https://github.com/numpy/numpy/pull/20131>`__: BUG: min/max is slow, re-implement using NEON (#17989)
+* `#20133 <https://github.com/numpy/numpy/pull/20133>`__: ENH: Vectorize quicksort for 32-bit dtype using AVX-512
+* `#20140 <https://github.com/numpy/numpy/pull/20140>`__: DOC: Fix some target not found sphinx warnings.
+* `#20147 <https://github.com/numpy/numpy/pull/20147>`__: DOC: updated docstring for binary file object
+* `#20175 <https://github.com/numpy/numpy/pull/20175>`__: ENH: Optimize ``np.empty`` for scalar arguments
+* `#20176 <https://github.com/numpy/numpy/pull/20176>`__: MAINT: Use intp output param viewable casts/methods
+* `#20185 <https://github.com/numpy/numpy/pull/20185>`__: DOC: Added explanation document on interoperability
+* `#20244 <https://github.com/numpy/numpy/pull/20244>`__: DOC: Clarify behavior of ``np.lib.scimath.sqrt`` apropos -0.0
+* `#20246 <https://github.com/numpy/numpy/pull/20246>`__: DOC: Merge doc strings of divide and true_divide.
+* `#20285 <https://github.com/numpy/numpy/pull/20285>`__: ENH, SIMD: add new universal intrinsics for floor/rint
+* `#20288 <https://github.com/numpy/numpy/pull/20288>`__: DOC: make some doctests in user,reference pass pytest
+* `#20311 <https://github.com/numpy/numpy/pull/20311>`__: DOC: Windows and F2PY
+* `#20363 <https://github.com/numpy/numpy/pull/20363>`__: SIMD: Replace SVML/ASM of tanh(f32, f64) with universal intrinsics
+* `#20368 <https://github.com/numpy/numpy/pull/20368>`__: MAINT: Fix METH_NOARGS function signatures
+* `#20380 <https://github.com/numpy/numpy/pull/20380>`__: DOC: random: Fix a comment and example in the multivariate_normal...
+* `#20383 <https://github.com/numpy/numpy/pull/20383>`__: BLD: Try making 64-bit Windows wheels
+* `#20387 <https://github.com/numpy/numpy/pull/20387>`__: REL: Prepare main for NumPy 1.23.0 development
+* `#20388 <https://github.com/numpy/numpy/pull/20388>`__: Update ARM cpu_asimdfhm.c check
+* `#20389 <https://github.com/numpy/numpy/pull/20389>`__: MAINT: Raise different type of errors
+* `#20393 <https://github.com/numpy/numpy/pull/20393>`__: CI: CircleCI: Install numpy after processing doc_requirements.txt
+* `#20394 <https://github.com/numpy/numpy/pull/20394>`__: DEP: remove allocation_tracking, deprecate PyDataMem_SetEventHook
+* `#20395 <https://github.com/numpy/numpy/pull/20395>`__: ENH: provide a convenience function to replace npy_load_module
+* `#20396 <https://github.com/numpy/numpy/pull/20396>`__: DOC: np.fromfunction documentation not clear
+* `#20397 <https://github.com/numpy/numpy/pull/20397>`__: SIMD: replace raw AVX512 of floor/trunc/rint with universal intrinsics
+* `#20398 <https://github.com/numpy/numpy/pull/20398>`__: BLD: Fix Macos Builds [wheel build]
+* `#20399 <https://github.com/numpy/numpy/pull/20399>`__: DOC: Docstring improvements in the context of np.shape
+* `#20403 <https://github.com/numpy/numpy/pull/20403>`__: CI: CircleCI: Install numpy after processing doc_requirements.txt
+* `#20404 <https://github.com/numpy/numpy/pull/20404>`__: BUG: Clear errors correctly in F2PY conversions
+* `#20405 <https://github.com/numpy/numpy/pull/20405>`__: BUG, SIMD: Fix ``exp`` FP stack overflow when ``AVX512_SKX`` is enabled
+* `#20407 <https://github.com/numpy/numpy/pull/20407>`__: DOC: Update axis parameter for np.ma.{min,max}
+* `#20409 <https://github.com/numpy/numpy/pull/20409>`__: MAINT: import setuptools before distutils in one ``np.random``...
+* `#20412 <https://github.com/numpy/numpy/pull/20412>`__: MAINT: Fix METH_NOARGS function signatures (#20368)
+* `#20413 <https://github.com/numpy/numpy/pull/20413>`__: DOC: correct the versionadded number for ``f2py.get_include``
+* `#20414 <https://github.com/numpy/numpy/pull/20414>`__: DEP: remove deprecated ``alen`` and ``asscalar`` functions
+* `#20416 <https://github.com/numpy/numpy/pull/20416>`__: ENH: Add ARM Compiler with ARM Performance Library support
+* `#20417 <https://github.com/numpy/numpy/pull/20417>`__: BLD: Add macOS arm64 wheels [wheel build]
+* `#20422 <https://github.com/numpy/numpy/pull/20422>`__: BUG: Restore support for i386 and PowerPC (OS X)
+* `#20427 <https://github.com/numpy/numpy/pull/20427>`__: MAINT: Fix longdouble precision check in test_umath.py
+* `#20432 <https://github.com/numpy/numpy/pull/20432>`__: ENH: Add annotations for ``np.emath``
+* `#20433 <https://github.com/numpy/numpy/pull/20433>`__: BUG: Fix an incorrect protocol used in ``np.lib.shape_base``
+* `#20435 <https://github.com/numpy/numpy/pull/20435>`__: DOC: nicer CMake example in the f2py docs
+* `#20437 <https://github.com/numpy/numpy/pull/20437>`__: DOC: Fix a typo in docstring of MT19937
+* `#20443 <https://github.com/numpy/numpy/pull/20443>`__: DOC: get scikit-build example working
+* `#20446 <https://github.com/numpy/numpy/pull/20446>`__: BUG: Fixed output variable overriding in numpy.info()
+* `#20447 <https://github.com/numpy/numpy/pull/20447>`__: DOC: use FindPython instead of FindPython3
+* `#20452 <https://github.com/numpy/numpy/pull/20452>`__: MAINT: Update the required setuptools version.
+* `#20457 <https://github.com/numpy/numpy/pull/20457>`__: TST: remove obsolete TestF77Mismatch
+* `#20468 <https://github.com/numpy/numpy/pull/20468>`__: BUG: Fix two overload-related problems
+* `#20470 <https://github.com/numpy/numpy/pull/20470>`__: ENH: Add dtype-typing support to ``np.core.function_base``
+* `#20471 <https://github.com/numpy/numpy/pull/20471>`__: Rename _operand_flag_tests.c.src into numpy/core/src/umath/_operand_f…
+* `#20478 <https://github.com/numpy/numpy/pull/20478>`__: TST,MAINT: F2PY test typo
+* `#20479 <https://github.com/numpy/numpy/pull/20479>`__: TST,STY: Clean up F2PY tests for pathlib.Path
+* `#20482 <https://github.com/numpy/numpy/pull/20482>`__: BUG: Fix tensorsolve for 0-sized input
+* `#20484 <https://github.com/numpy/numpy/pull/20484>`__: BUG: Fix reduce promotion with out argument
+* `#20486 <https://github.com/numpy/numpy/pull/20486>`__: MAINT: update wheel to version that supports python3.10
+* `#20489 <https://github.com/numpy/numpy/pull/20489>`__: MAINT: Translate binsearch.c.src to C++ using templates.
+* `#20490 <https://github.com/numpy/numpy/pull/20490>`__: BUG: Protect divide by 0 in multinomial distribution.
+* `#20491 <https://github.com/numpy/numpy/pull/20491>`__: TEST: use pypy3.8-v7.3.7 final versions
+* `#20499 <https://github.com/numpy/numpy/pull/20499>`__: BUG: Fix the .T attribute in the array_api namespace
+* `#20500 <https://github.com/numpy/numpy/pull/20500>`__: ENH: add ndmin to ``genfromtxt`` behaving the same as ``loadtxt``
+* `#20505 <https://github.com/numpy/numpy/pull/20505>`__: BUG: fix ``ma.average`` not working well with ``nan`` weights
+* `#20509 <https://github.com/numpy/numpy/pull/20509>`__: DOC: Adds valgrind to the test command
+* `#20515 <https://github.com/numpy/numpy/pull/20515>`__: ENH: Generate the docstrings of umath into a separate C header
+* `#20516 <https://github.com/numpy/numpy/pull/20516>`__: DOC: Add more details on F2PY output conditions
+* `#20517 <https://github.com/numpy/numpy/pull/20517>`__: MAINT,TST: Refactor F2PY testsuite
+* `#20518 <https://github.com/numpy/numpy/pull/20518>`__: PERF: Fix performance bug in ufunc dispatching cache
+* `#20521 <https://github.com/numpy/numpy/pull/20521>`__: MAINT: Pin OS versions when building wheels [wheel build]
+* `#20524 <https://github.com/numpy/numpy/pull/20524>`__: CI: make sure CI stays on VS2019 unless changed explicitly
+* `#20527 <https://github.com/numpy/numpy/pull/20527>`__: ENH: Add __array__ to the array_api Array object
+* `#20528 <https://github.com/numpy/numpy/pull/20528>`__: BLD: Add PyPy wheels [wheel build]
+* `#20533 <https://github.com/numpy/numpy/pull/20533>`__: BUG: Fix handling of the dtype parameter to numpy.array_api.prod()
+* `#20547 <https://github.com/numpy/numpy/pull/20547>`__: REV: Revert adding a default ufunc promoter
+* `#20552 <https://github.com/numpy/numpy/pull/20552>`__: ENH: Extending CPU feature detection framework to support IBM...
+* `#20553 <https://github.com/numpy/numpy/pull/20553>`__: BLD: Use the new hypotl on Cygwin, rather than defaulting to...
+* `#20556 <https://github.com/numpy/numpy/pull/20556>`__: DOC: Update links to mailing list on python.org
+* `#20558 <https://github.com/numpy/numpy/pull/20558>`__: TST: move get_glibc_version to np.testing; skip 2 more tests...
+* `#20559 <https://github.com/numpy/numpy/pull/20559>`__: DOC: Refactoring f2py user guide
+* `#20563 <https://github.com/numpy/numpy/pull/20563>`__: BUG: Fix small issues found using valgrind
+* `#20565 <https://github.com/numpy/numpy/pull/20565>`__: REF: Clean up wheels workflow [wheel build]
+* `#20569 <https://github.com/numpy/numpy/pull/20569>`__: BUG: Fix sorting of int8/int16
+* `#20571 <https://github.com/numpy/numpy/pull/20571>`__: DOC: fix typo
+* `#20572 <https://github.com/numpy/numpy/pull/20572>`__: DOC: Adds link to NEP 43 from NEP 41
+* `#20580 <https://github.com/numpy/numpy/pull/20580>`__: ENH: Move ``loadtxt`` to C for much better speed
+* `#20583 <https://github.com/numpy/numpy/pull/20583>`__: BUG: Fix issues (mainly) found using pytest-leaks
+* `#20587 <https://github.com/numpy/numpy/pull/20587>`__: MAINT: Fix two minor typing-related problems
+* `#20588 <https://github.com/numpy/numpy/pull/20588>`__: BUG, DIST: fix normalize IBMZ features flags
+* `#20589 <https://github.com/numpy/numpy/pull/20589>`__: DEP: remove NPY_ARRAY_UPDATEIFCOPY, deprecated in 1.14
+* `#20590 <https://github.com/numpy/numpy/pull/20590>`__: BUG: Fix leaks found using pytest-leaks
+* `#20591 <https://github.com/numpy/numpy/pull/20591>`__: removed two redundant '\\' typos
+* `#20592 <https://github.com/numpy/numpy/pull/20592>`__: BUG: Reject buffers with suboffsets
+* `#20593 <https://github.com/numpy/numpy/pull/20593>`__: MAINT: Check for buffer interface support rather than try/except
+* `#20594 <https://github.com/numpy/numpy/pull/20594>`__: BUG: Fix setstate logic for empty arrays
+* `#20595 <https://github.com/numpy/numpy/pull/20595>`__: BUG: Fix PyInit__umath_linalg type
+* `#20604 <https://github.com/numpy/numpy/pull/20604>`__: DEV: add a warningfilter to fix pytest workflow.
+* `#20607 <https://github.com/numpy/numpy/pull/20607>`__: BUG: Protect kahan_sum from empty arrays
+* `#20611 <https://github.com/numpy/numpy/pull/20611>`__: TST: Bump mypy: 0.910 -> 0.920
+* `#20616 <https://github.com/numpy/numpy/pull/20616>`__: MAINT: Help boost::python libraries at least not crash
+* `#20621 <https://github.com/numpy/numpy/pull/20621>`__: BUG: random: Check 'writeable' flag in 'shuffle' and 'permuted'.
+* `#20622 <https://github.com/numpy/numpy/pull/20622>`__: BLD: Add Windows 32-bit wheels
+* `#20624 <https://github.com/numpy/numpy/pull/20624>`__: BUILD: pin to cython 0.29.24 to hide PyPy3.8 bug
+* `#20628 <https://github.com/numpy/numpy/pull/20628>`__: REL: Update main after 1.21.5 release.
+* `#20629 <https://github.com/numpy/numpy/pull/20629>`__: DOC: Refer to NumPy, not pandas, in main page
+* `#20630 <https://github.com/numpy/numpy/pull/20630>`__: BUG: f2py: Simplify creation of an exception message.
+* `#20640 <https://github.com/numpy/numpy/pull/20640>`__: BUG: Support env argument in CCompiler.spawn
+* `#20641 <https://github.com/numpy/numpy/pull/20641>`__: PERF: Speed up check_constraint checks
+* `#20643 <https://github.com/numpy/numpy/pull/20643>`__: PERF: Optimize array check for bounded 0,1 values
+* `#20646 <https://github.com/numpy/numpy/pull/20646>`__: DOC: add np.iterable to reference guide
+* `#20647 <https://github.com/numpy/numpy/pull/20647>`__: DOC: Add PyArray_FailUnlessWriteable to the online C-API docs.
+* `#20648 <https://github.com/numpy/numpy/pull/20648>`__: DOC: Modify SVGs to be visible on Chrome
+* `#20652 <https://github.com/numpy/numpy/pull/20652>`__: STY: Use PEP 585 and 604 syntaxes throughout the .pyi stub files
+* `#20653 <https://github.com/numpy/numpy/pull/20653>`__: DEV: Add ``TYP``, a standard acronym for static typing
+* `#20654 <https://github.com/numpy/numpy/pull/20654>`__: CI: Find cygwin test failures
+* `#20660 <https://github.com/numpy/numpy/pull/20660>`__: MAINT: update OpenBLAS to 0.3.19
+* `#20663 <https://github.com/numpy/numpy/pull/20663>`__: TYP,TST: Bump mypy to 0.930
+* `#20666 <https://github.com/numpy/numpy/pull/20666>`__: DOC: Add help string for F2PY
+* `#20668 <https://github.com/numpy/numpy/pull/20668>`__: TST: Initialize f2py2e tests of the F2PY CLI
+* `#20669 <https://github.com/numpy/numpy/pull/20669>`__: CI, TST: Run Cygwin CI with Netlib reference BLAS and re-enable...
+* `#20672 <https://github.com/numpy/numpy/pull/20672>`__: DOC: add hypothesis test dependency in README and PyPI long-description
+* `#20674 <https://github.com/numpy/numpy/pull/20674>`__: BUG: array interface PyCapsule reference
+* `#20678 <https://github.com/numpy/numpy/pull/20678>`__: BUG: Remove trailing dec point in dragon4positional
+* `#20683 <https://github.com/numpy/numpy/pull/20683>`__: DOC: Updated pointer spacing for consistency.
+* `#20689 <https://github.com/numpy/numpy/pull/20689>`__: BUG: Added check for NULL data in ufuncs
+* `#20691 <https://github.com/numpy/numpy/pull/20691>`__: DOC, ENH: Added pngs for svgs for pdf build
+* `#20693 <https://github.com/numpy/numpy/pull/20693>`__: DOC: Replaced svgs with pngs in the Broadcasting doc
+* `#20695 <https://github.com/numpy/numpy/pull/20695>`__: BLD: Add NPY_DISABLE_SVML env var to opt out of SVML
+* `#20697 <https://github.com/numpy/numpy/pull/20697>`__: REL: Update main after 1.22.0 release.
+* `#20698 <https://github.com/numpy/numpy/pull/20698>`__: DOC: Fixed the link on user-guide landing page
+* `#20701 <https://github.com/numpy/numpy/pull/20701>`__: MAINT, DOC: Post 1.22.0 release fixes.
+* `#20708 <https://github.com/numpy/numpy/pull/20708>`__: DOC: fix broken documentation references in mtrand.pyx
+* `#20710 <https://github.com/numpy/numpy/pull/20710>`__: TYP: Allow ``ndindex`` to accept integer tuples
+* `#20712 <https://github.com/numpy/numpy/pull/20712>`__: BUG: Restore vc141 support
+* `#20713 <https://github.com/numpy/numpy/pull/20713>`__: DOC: Add Code of Conduct to README.md
+* `#20719 <https://github.com/numpy/numpy/pull/20719>`__: TYP: change type annotation for ``__array_namespace__`` to ModuleType
+* `#20720 <https://github.com/numpy/numpy/pull/20720>`__: TYP: add a few type annotations to ``numpy.array_api.Array``
+* `#20721 <https://github.com/numpy/numpy/pull/20721>`__: BUG: Fix array dimensions solver for multidimensional arguments...
+* `#20722 <https://github.com/numpy/numpy/pull/20722>`__: ENH: Removed requirement for C-contiguity when changing to dtype...
+* `#20727 <https://github.com/numpy/numpy/pull/20727>`__: DOC: Update README.md mainly to include link to website
+* `#20729 <https://github.com/numpy/numpy/pull/20729>`__: BUG: Relax dtype identity check in reductions
+* `#20730 <https://github.com/numpy/numpy/pull/20730>`__: DOC: Document that dtype, strides, shape attributes should not...
+* `#20731 <https://github.com/numpy/numpy/pull/20731>`__: DOC: fix OpenBLAS version in release note
+* `#20732 <https://github.com/numpy/numpy/pull/20732>`__: MAINT: Translate timsort.c.src to C++ using templates.
+* `#20738 <https://github.com/numpy/numpy/pull/20738>`__: ENH: fix a typo in the example trigger for wheels
+* `#20740 <https://github.com/numpy/numpy/pull/20740>`__: Update teams URL
+* `#20741 <https://github.com/numpy/numpy/pull/20741>`__: DOC: add instructions for cross compilation
+* `#20745 <https://github.com/numpy/numpy/pull/20745>`__: ENH: add hook and test for PyInstaller.
+* `#20747 <https://github.com/numpy/numpy/pull/20747>`__: BLD: Upload wheel artifacts separately [wheel build]
+* `#20750 <https://github.com/numpy/numpy/pull/20750>`__: TYP: Allow time manipulation functions to accept ``date`` and ``timedelta``...
+* `#20754 <https://github.com/numpy/numpy/pull/20754>`__: MAINT: Relax asserts to match relaxed reducelike resolution behaviour
+* `#20758 <https://github.com/numpy/numpy/pull/20758>`__: DOC: Capitalization and missing word in docs
+* `#20759 <https://github.com/numpy/numpy/pull/20759>`__: MAINT: Raise RuntimeError if setuptools version is too recent.
+* `#20762 <https://github.com/numpy/numpy/pull/20762>`__: BUG: Allow integer inputs for pow-related functions in ``array_api``
+* `#20766 <https://github.com/numpy/numpy/pull/20766>`__: ENH: Make ndarray.__array_finalize__ a callable no-op
+* `#20773 <https://github.com/numpy/numpy/pull/20773>`__: BUG: method without self argument should be static
+* `#20774 <https://github.com/numpy/numpy/pull/20774>`__: DOC: explicitly define numpy.datetime64 semantics
+* `#20776 <https://github.com/numpy/numpy/pull/20776>`__: DOC: fix remaining "easy" doctests errors
+* `#20779 <https://github.com/numpy/numpy/pull/20779>`__: MAINT: removed duplicate 'int' type in ScalarType
+* `#20783 <https://github.com/numpy/numpy/pull/20783>`__: DOC: Update Copyright to 2022 [License]
+* `#20784 <https://github.com/numpy/numpy/pull/20784>`__: MAINT, DOC: fix new typos detected by codespell
+* `#20786 <https://github.com/numpy/numpy/pull/20786>`__: BUG, DOC: Fixes SciPy docs build warnings
+* `#20788 <https://github.com/numpy/numpy/pull/20788>`__: BUG: ``array_api.argsort(descending=True)`` respects relative...
+* `#20789 <https://github.com/numpy/numpy/pull/20789>`__: DOC: git:// protocol deprecated by github.
+* `#20791 <https://github.com/numpy/numpy/pull/20791>`__: BUG: Return correctly shaped inverse indices in ``array_api`` set...
+* `#20792 <https://github.com/numpy/numpy/pull/20792>`__: TST: Bump mypy to 0.931
+* `#20793 <https://github.com/numpy/numpy/pull/20793>`__: BUG: Fix that reduce-likes honor out always (and live in the...
+* `#20794 <https://github.com/numpy/numpy/pull/20794>`__: TYP: Type the NEP 35 ``like`` parameter via a ``__array_function__``...
+* `#20810 <https://github.com/numpy/numpy/pull/20810>`__: DOC: Restore MaskedArray.hardmask documentation
+* `#20811 <https://github.com/numpy/numpy/pull/20811>`__: MAINT, DOC: discard repeated words
+* `#20813 <https://github.com/numpy/numpy/pull/20813>`__: MAINT: fix typo
+* `#20816 <https://github.com/numpy/numpy/pull/20816>`__: DOC: discard repeated words in NEPs
+* `#20818 <https://github.com/numpy/numpy/pull/20818>`__: BUG: Fix build of third-party extensions with Py_LIMITED_API
+* `#20821 <https://github.com/numpy/numpy/pull/20821>`__: ENH: Add CPU feature detection for POWER10 (VSX4)
+* `#20823 <https://github.com/numpy/numpy/pull/20823>`__: REL: Update main after 1.22.1 release.
+* `#20827 <https://github.com/numpy/numpy/pull/20827>`__: TYP: Fix pyright being unable to infer the ``real`` and ``imag``...
+* `#20828 <https://github.com/numpy/numpy/pull/20828>`__: MAINT: Translate heapsort.c.src to C++ using templates
+* `#20829 <https://github.com/numpy/numpy/pull/20829>`__: MAINT: Translate mergesort.c.src to C++ using templates.
+* `#20831 <https://github.com/numpy/numpy/pull/20831>`__: BUG: Avoid importing numpy.distutils on import numpy.testing
+* `#20833 <https://github.com/numpy/numpy/pull/20833>`__: BUG: Fix comparator function signatures
+* `#20834 <https://github.com/numpy/numpy/pull/20834>`__: DOC: Update ndarray.argmax + argmin documentation with keepdims...
+* `#20835 <https://github.com/numpy/numpy/pull/20835>`__: DEP: Removed deprecated error clearing
+* `#20840 <https://github.com/numpy/numpy/pull/20840>`__: MAINT: Translate selection.c.src to C++ using templates.
+* `#20846 <https://github.com/numpy/numpy/pull/20846>`__: ENH, SIMD: improve argmax/argmin performance
+* `#20847 <https://github.com/numpy/numpy/pull/20847>`__: MAINT: remove outdated mingw32 fseek support
+* `#20851 <https://github.com/numpy/numpy/pull/20851>`__: DOC: Fix typo in meshgrid example
+* `#20852 <https://github.com/numpy/numpy/pull/20852>`__: MAINT: Fix a typo in numpy/f2py/capi_maps.py
+* `#20854 <https://github.com/numpy/numpy/pull/20854>`__: DEV: Update dependencies and Docker image
+* `#20857 <https://github.com/numpy/numpy/pull/20857>`__: BUG: Fix pre-builds in Gitpod
+* `#20858 <https://github.com/numpy/numpy/pull/20858>`__: TYP: Relax the return-type of ``np.vectorize``
+* `#20861 <https://github.com/numpy/numpy/pull/20861>`__: DOC: fix formatting of mean example
+* `#20862 <https://github.com/numpy/numpy/pull/20862>`__: Fix typo in numpy/lib/polynomial.py
+* `#20865 <https://github.com/numpy/numpy/pull/20865>`__: MAINT: Fix inconsistent PyPI casing
+* `#20866 <https://github.com/numpy/numpy/pull/20866>`__: ENH: Add changes that allow NumPy to compile with clang-cl
+* `#20867 <https://github.com/numpy/numpy/pull/20867>`__: DOC: Cosmetic docstring fix for numpydoc.
+* `#20868 <https://github.com/numpy/numpy/pull/20868>`__: BUG: Gitpod Remove lock file --unshallow
+* `#20869 <https://github.com/numpy/numpy/pull/20869>`__: DOC: random: Fix spelling of 'precision'.
+* `#20872 <https://github.com/numpy/numpy/pull/20872>`__: BUG: Loss of precision in longdouble min
+* `#20874 <https://github.com/numpy/numpy/pull/20874>`__: BUG: mtrand cannot be imported on Cygwin
+* `#20875 <https://github.com/numpy/numpy/pull/20875>`__: DEP: deprecate ``numpy.distutils``, and add a migration guide
+* `#20876 <https://github.com/numpy/numpy/pull/20876>`__: MAINT, DOC: Fixes minor formatting issue related to nested inline...
+* `#20878 <https://github.com/numpy/numpy/pull/20878>`__: DOC,TST: Fix Pandas code example
+* `#20881 <https://github.com/numpy/numpy/pull/20881>`__: BUG: fix f2py's define for threading when building with Mingw
+* `#20883 <https://github.com/numpy/numpy/pull/20883>`__: BUG: Fix ``np.array_api.can_cast()`` by not relying on ``np.can_cast()``
+* `#20884 <https://github.com/numpy/numpy/pull/20884>`__: MAINT: Minor cleanup to F2PY
+* `#20885 <https://github.com/numpy/numpy/pull/20885>`__: TYP,ENH: Improve typing with the help of ``ParamSpec``
+* `#20886 <https://github.com/numpy/numpy/pull/20886>`__: BUG: distutils: fix building mixed C/Fortran extensions
+* `#20887 <https://github.com/numpy/numpy/pull/20887>`__: TYP,MAINT: Add aliases for commonly used unions
+* `#20890 <https://github.com/numpy/numpy/pull/20890>`__: BUILD: Upload wheels to anaconda.org
+* `#20897 <https://github.com/numpy/numpy/pull/20897>`__: MAINT: Translate quicksort.c.src to C++ using templates.
+* `#20900 <https://github.com/numpy/numpy/pull/20900>`__: TYP,ENH: Add annotations for ``np.lib.mixins``
+* `#20902 <https://github.com/numpy/numpy/pull/20902>`__: TYP,ENH: Add dtype-typing support to ``np.core.fromnumeric`` (part...
+* `#20904 <https://github.com/numpy/numpy/pull/20904>`__: ENH,BUG: Expand the experimental DType API and fix small exposed...
+* `#20911 <https://github.com/numpy/numpy/pull/20911>`__: BUG: Fix the return type of random_float_fill
+* `#20916 <https://github.com/numpy/numpy/pull/20916>`__: TYP, MAINT: Add annotations for ``flatiter.__setitem__``
+* `#20917 <https://github.com/numpy/numpy/pull/20917>`__: DOC: fix np.ma.flatnotmasked_contiguous docstring
+* `#20918 <https://github.com/numpy/numpy/pull/20918>`__: MAINT, TYP: Added missing where typehints in fromnumeric.pyi
+* `#20920 <https://github.com/numpy/numpy/pull/20920>`__: DEP: Deprecate use of ``axis=MAXDIMS`` instead of ``axis=None``
+* `#20927 <https://github.com/numpy/numpy/pull/20927>`__: DOC: lib/io.py was renamed to lib/npyio.py
+* `#20931 <https://github.com/numpy/numpy/pull/20931>`__: BUG: Fix missing intrinsics for windows/arm64 target
+* `#20934 <https://github.com/numpy/numpy/pull/20934>`__: BUG: Fix build_ext interaction with non-numpy extensions
+* `#20940 <https://github.com/numpy/numpy/pull/20940>`__: MAINT: f2py: don't generate code that triggers ``-Wsometimes-uninitialized``
+* `#20944 <https://github.com/numpy/numpy/pull/20944>`__: DOC: improper doc syntax (markdown and imbalanced ticks).
+* `#20946 <https://github.com/numpy/numpy/pull/20946>`__: MAINT: Fix typo in setup.py
+* `#20948 <https://github.com/numpy/numpy/pull/20948>`__: MAINT, DOC: NEP link update
+* `#20950 <https://github.com/numpy/numpy/pull/20950>`__: Fix broken link in nep-0046-sponsorship-guidelines.rst
+* `#20955 <https://github.com/numpy/numpy/pull/20955>`__: BUG: Fix incorrect return type in reduce without initial value
+* `#20956 <https://github.com/numpy/numpy/pull/20956>`__: DOC: Improve NEP page layout with nested toctrees
+* `#20960 <https://github.com/numpy/numpy/pull/20960>`__: ENH: review return values for PyArray_DescrNew
+* `#20963 <https://github.com/numpy/numpy/pull/20963>`__: MAINT: be more tolerant of setuptools>=60
+* `#20966 <https://github.com/numpy/numpy/pull/20966>`__: DOC: update python minimal version to build from source
+* `#20967 <https://github.com/numpy/numpy/pull/20967>`__: MAINT: Update to numpydoc v1.2
+* `#20968 <https://github.com/numpy/numpy/pull/20968>`__: MAINT: Translate npy_partition.h.src to C++ using templates.
+* `#20972 <https://github.com/numpy/numpy/pull/20972>`__: DOC: Add warning about differences between range and arange
+* `#20973 <https://github.com/numpy/numpy/pull/20973>`__: DOC: switch Python intersphinx link from dev to stable.
+* `#20974 <https://github.com/numpy/numpy/pull/20974>`__: DOC: Include special case in ``hsplit`` doc
+* `#20975 <https://github.com/numpy/numpy/pull/20975>`__: MAINT: refactor NonNull in API functions
+* `#20976 <https://github.com/numpy/numpy/pull/20976>`__: ENH,BENCH: Optimize floor_divide for VSX4/Power10
+* `#20987 <https://github.com/numpy/numpy/pull/20987>`__: BLD: Try adding aarch64 wheels [wheel build]
+* `#20990 <https://github.com/numpy/numpy/pull/20990>`__: MAINT: Further small return value validation fixes
+* `#20991 <https://github.com/numpy/numpy/pull/20991>`__: ENH: Use SVML for f64 exp and log
+* `#20993 <https://github.com/numpy/numpy/pull/20993>`__: ENH: Allow object and subarray dtypes in fromiter
+* `#20994 <https://github.com/numpy/numpy/pull/20994>`__: REL: Update main after 1.22.2 release.
+* `#20996 <https://github.com/numpy/numpy/pull/20996>`__: MAINT: use brackets in github action syntax
+* `#20999 <https://github.com/numpy/numpy/pull/20999>`__: DOC: Remove mention of deleted subpackages in numpy docstring
+* `#21000 <https://github.com/numpy/numpy/pull/21000>`__: MAINT: Replace LooseVersion by _pep440.
+* `#21001 <https://github.com/numpy/numpy/pull/21001>`__: ENH: help compilers to auto-vectorize reduction operators
+* `#21003 <https://github.com/numpy/numpy/pull/21003>`__: ENH: Suppress over-/underflow RuntimeWarning in assert_array_equal
+* `#21005 <https://github.com/numpy/numpy/pull/21005>`__: BUG: Add parameter check to negative_binomial
+* `#21010 <https://github.com/numpy/numpy/pull/21010>`__: MAINT: Fix warning message for deprecated keyword
+* `#21015 <https://github.com/numpy/numpy/pull/21015>`__: DOC: Added note about possible arange signatures
+* `#21016 <https://github.com/numpy/numpy/pull/21016>`__: MAINT, DOC: Fix SciPy intersphinx link
+* `#21020 <https://github.com/numpy/numpy/pull/21020>`__: DOC: imbalanced backticks
+* `#21021 <https://github.com/numpy/numpy/pull/21021>`__: TYP,ENH: Add dtype-typing support to ``fromnumeric`` (part 2)
+* `#21024 <https://github.com/numpy/numpy/pull/21024>`__: API: Disallow strings in logical ufuncs
+* `#21025 <https://github.com/numpy/numpy/pull/21025>`__: MAINT: Use C++ for tokenizer unicode-kind templating
+* `#21027 <https://github.com/numpy/numpy/pull/21027>`__: BUG: use ``concurrent.futures.ThreadPoolExecutor`` in distutils...
+* `#21029 <https://github.com/numpy/numpy/pull/21029>`__: DEP: Remove support for non-tuple nd-indices.
+* `#21030 <https://github.com/numpy/numpy/pull/21030>`__: DOC: change fill_value of full_like from scalar to array_like
+* `#21031 <https://github.com/numpy/numpy/pull/21031>`__: MAINT, STY: Style fixes to quicksort.cpp
+* `#21032 <https://github.com/numpy/numpy/pull/21032>`__: DOC: fix sphinx errors due to np.emath references
+* `#21035 <https://github.com/numpy/numpy/pull/21035>`__: BUILD: remove condition on upload step
+* `#21037 <https://github.com/numpy/numpy/pull/21037>`__: DOC: Consistency of :: syntax.
+* `#21039 <https://github.com/numpy/numpy/pull/21039>`__: MAINT: Remove the RELAXED_STRIDES_CHECKING env variable
+* `#21040 <https://github.com/numpy/numpy/pull/21040>`__: DOC: "See Also" should not have backticks.
+* `#21042 <https://github.com/numpy/numpy/pull/21042>`__: MAINT, STY: Style fixups.
+* `#21043 <https://github.com/numpy/numpy/pull/21043>`__: BUILD: simplify upload step
+* `#21045 <https://github.com/numpy/numpy/pull/21045>`__: BUILD: change syntax to use env variable
+* `#21046 <https://github.com/numpy/numpy/pull/21046>`__: MAINT: Use "3.10" instead of "3.10-dev" on travis.
+* `#21049 <https://github.com/numpy/numpy/pull/21049>`__: use repo secrets for uploading
+* `#21050 <https://github.com/numpy/numpy/pull/21050>`__: BUILD: tweak upload to use python3, less verbose
+* `#21053 <https://github.com/numpy/numpy/pull/21053>`__: BUILD: make sure a python3 is on the path
+* `#21054 <https://github.com/numpy/numpy/pull/21054>`__: BUG: (loadtxt) Ignore last empty field when ``delimiter=None``
+* `#21060 <https://github.com/numpy/numpy/pull/21060>`__: TYP: Add dtype-typing support to ``fromnumeric`` part 3
+* `#21061 <https://github.com/numpy/numpy/pull/21061>`__: BLD,ENH: Add vsx3 and vsx4 as targets when building cos/sin and...
+* `#21064 <https://github.com/numpy/numpy/pull/21064>`__: DOC: Update arctan2 docstring based on doctest output
+* `#21067 <https://github.com/numpy/numpy/pull/21067>`__: BUG: Fix unpickling an empty ndarray with a non-zero dimension
+* `#21068 <https://github.com/numpy/numpy/pull/21068>`__: DOC: Fix spelling and grammar in documentation for quantile().
+* `#21071 <https://github.com/numpy/numpy/pull/21071>`__: BUG: Ensure equality/identity comparison with ``__array_function__``
+* `#21074 <https://github.com/numpy/numpy/pull/21074>`__: BUG: Replace ``ssize_t`` with ``size_t`` in tokenize.cpp
+* `#21077 <https://github.com/numpy/numpy/pull/21077>`__: TYP,MAINT: Remove inconsistencies between ``fromnumeric`` functions...
+* `#21082 <https://github.com/numpy/numpy/pull/21082>`__: DOC: clarify the return value of linalg.cholesky
+* `#21085 <https://github.com/numpy/numpy/pull/21085>`__: MAINT: point to html docs on distutils migration in deprecation...
+* `#21086 <https://github.com/numpy/numpy/pull/21086>`__: DEV: fix ``python runtests.py --bench-compare``
+* `#21087 <https://github.com/numpy/numpy/pull/21087>`__: DOC: update docs in site.cfg.example
+* `#21088 <https://github.com/numpy/numpy/pull/21088>`__: DEV: add distutils deprecation warning filter to pytest conf
+* `#21089 <https://github.com/numpy/numpy/pull/21089>`__: BUILD: if travis build is triggered manually, then upload wheels
+* `#21090 <https://github.com/numpy/numpy/pull/21090>`__: BUILD: change cibuildwheel output directory on travis [ci skip]
+* `#21095 <https://github.com/numpy/numpy/pull/21095>`__: BLD: Make a sdist [wheel build]
+* `#21097 <https://github.com/numpy/numpy/pull/21097>`__: MAINT: update cython, pypy for cython0.29.28 and pypy v7.3.8...
+* `#21099 <https://github.com/numpy/numpy/pull/21099>`__: MAINT: Translate x86-qsort.dispatch.c.src to C++ using templates.
+* `#21100 <https://github.com/numpy/numpy/pull/21100>`__: BLD: Comment out broken macOS PyPy build [wheel build]
+* `#21102 <https://github.com/numpy/numpy/pull/21102>`__: TYP,MAINT: Explicitly allow sequences of array-likes in ``np.concatenate``
+* `#21107 <https://github.com/numpy/numpy/pull/21107>`__: BLD: Run wheel builders on labeled pull requests
+* `#21108 <https://github.com/numpy/numpy/pull/21108>`__: TYP, ENH: Mark non-subclassable classes as ``final``
+* `#21109 <https://github.com/numpy/numpy/pull/21109>`__: MAINT: Fix incorrect signature in readtext header file
+* `#21110 <https://github.com/numpy/numpy/pull/21110>`__: CI: Improve concurrency to cancel running jobs on PR update
+* `#21111 <https://github.com/numpy/numpy/pull/21111>`__: TYP, MAINT: Relax the ``obj`` type in ``__array_finalize__``
+* `#21113 <https://github.com/numpy/numpy/pull/21113>`__: BUG: Fix numba DUFuncs added loops getting picked up
+* `#21118 <https://github.com/numpy/numpy/pull/21118>`__: DOC: improve documentation of singular value decomposition
+* `#21119 <https://github.com/numpy/numpy/pull/21119>`__: BUG, ENH: np._from_dlpack: export correct device information
+* `#21121 <https://github.com/numpy/numpy/pull/21121>`__: MAINT,TST: np._from_dlpack: add more test + small memory optimization
+* `#21124 <https://github.com/numpy/numpy/pull/21124>`__: ENH,SIMD: Vectorize modulo/divide using the universal intrinsics...
+* `#21125 <https://github.com/numpy/numpy/pull/21125>`__: BLD: bump cibuildwheel 2.3.0 → 2.3.1 on GHA [wheel build]
+* `#21127 <https://github.com/numpy/numpy/pull/21127>`__: BLD,DOC: skip broken ipython 8.1.0
+* `#21128 <https://github.com/numpy/numpy/pull/21128>`__: BLD: move cibuildwheel configuration to ``pyproject.toml``
+* `#21130 <https://github.com/numpy/numpy/pull/21130>`__: ENH: improve the speed of numpy.where using a branchless code
+* `#21132 <https://github.com/numpy/numpy/pull/21132>`__: BUG,ENH: np._from_dlpack: export arrays with any-strided size-1...
+* `#21133 <https://github.com/numpy/numpy/pull/21133>`__: DOC: Note interop from "subclassing" docs and explain when to...
+* `#21144 <https://github.com/numpy/numpy/pull/21144>`__: DOC: Change recommendation away from pinning numpy+3
+* `#21145 <https://github.com/numpy/numpy/pull/21145>`__: MAINT, DOC: make np._from_dlpack public
+* `#21146 <https://github.com/numpy/numpy/pull/21146>`__: BUG: assign all tuple items before using it for PyPy
+* `#21149 <https://github.com/numpy/numpy/pull/21149>`__: DOC: Update linalg.qr docstring with numerically stable example
+* `#21150 <https://github.com/numpy/numpy/pull/21150>`__: DOC: Fix syntax highlighting for numpy.flatnonzero
+* `#21151 <https://github.com/numpy/numpy/pull/21151>`__: ENH: Add 'ulong' to sctypeDict
+* `#21154 <https://github.com/numpy/numpy/pull/21154>`__: ENH, BLD: Fix math feature detection for wasm
+* `#21155 <https://github.com/numpy/numpy/pull/21155>`__: DOC: document uploads to anaconda.org
+* `#21157 <https://github.com/numpy/numpy/pull/21157>`__: DOC: fix documentation for typedescr argument of PyArray_AsCArray
+* `#21167 <https://github.com/numpy/numpy/pull/21167>`__: DOC: Add "pip install -r test_requirements.txt"
+* `#21170 <https://github.com/numpy/numpy/pull/21170>`__: REL: Update main after 1.22.3 release.
+* `#21178 <https://github.com/numpy/numpy/pull/21178>`__: MAINT: Move can-cast table to a custom header file
+* `#21180 <https://github.com/numpy/numpy/pull/21180>`__: TST: Bump mypy from 0.931 to 0.940
+* `#21185 <https://github.com/numpy/numpy/pull/21185>`__: TYP, BUG: Fix ``np.lib.stride_tricks`` re-exported under wrong...
+* `#21186 <https://github.com/numpy/numpy/pull/21186>`__: MAINT: update NEP 29
+* `#21187 <https://github.com/numpy/numpy/pull/21187>`__: ENH: F2PY build output determinism
+* `#21188 <https://github.com/numpy/numpy/pull/21188>`__: MAINT,ENH: Rewrite scalar math logic
+* `#21189 <https://github.com/numpy/numpy/pull/21189>`__: DEV: Remove deprecated "python.pythonPath"
+* `#21193 <https://github.com/numpy/numpy/pull/21193>`__: DOC: Remove the confusing "unless not" in numpy/core/fromnumeric.py
+* `#21201 <https://github.com/numpy/numpy/pull/21201>`__: DOC: typo corrected in numpy.argpartition
+* `#21202 <https://github.com/numpy/numpy/pull/21202>`__: DOC: fix outdated description of unicode
+* `#21205 <https://github.com/numpy/numpy/pull/21205>`__: BUG: f2py cannot read in customised f2cmap file; fix #21204
+* `#21206 <https://github.com/numpy/numpy/pull/21206>`__: MAINT: fix typo in NEP 29
+* `#21207 <https://github.com/numpy/numpy/pull/21207>`__: MAINT: remove maint from triggering wheel build, add env to sdist
+* `#21216 <https://github.com/numpy/numpy/pull/21216>`__: MAINT: Split ``numpy.typing`` into a public and private component
+* `#21218 <https://github.com/numpy/numpy/pull/21218>`__: BUG: Use -0. as initial value for summation (internal only)
+* `#21226 <https://github.com/numpy/numpy/pull/21226>`__: DOC: misc fixes
+* `#21227 <https://github.com/numpy/numpy/pull/21227>`__: MAINT: Translate numpy/linalg/umath_linalg.c.src to C++ using...
+* `#21231 <https://github.com/numpy/numpy/pull/21231>`__: BUG: Catch error if array-priority is not float compatible
+* `#21232 <https://github.com/numpy/numpy/pull/21232>`__: BUG: Fixes ``ValueError`` in ``np.kron``
+* `#21238 <https://github.com/numpy/numpy/pull/21238>`__: BLD: Fix upload script
+* `#21241 <https://github.com/numpy/numpy/pull/21241>`__: MAINT: use doc_requirements.txt in azure build
+* `#21244 <https://github.com/numpy/numpy/pull/21244>`__: TST: Bump mypy from 0.940 to 0.942
+* `#21247 <https://github.com/numpy/numpy/pull/21247>`__: DOC: directive fix (single instead of double backticks).
+* `#21250 <https://github.com/numpy/numpy/pull/21250>`__: DEV: Fixed Un-responsive live-preview in gitpod.
+* `#21251 <https://github.com/numpy/numpy/pull/21251>`__: DOC: fix data type of parameter shape
+* `#21253 <https://github.com/numpy/numpy/pull/21253>`__: DOC: fix code sample for leg2poly
+* `#21254 <https://github.com/numpy/numpy/pull/21254>`__: DOC: document automatic wheel building for a release
+* `#21255 <https://github.com/numpy/numpy/pull/21255>`__: DOC: mention Gitpod as alternative to build numpy
+* `#21256 <https://github.com/numpy/numpy/pull/21256>`__: BUG,ENH: Fix negative bounds for F2PY
+* `#21260 <https://github.com/numpy/numpy/pull/21260>`__: DOC: Enumerate the differences between numpy and numpy.array_api
+* `#21262 <https://github.com/numpy/numpy/pull/21262>`__: ENH: Masked Array support for ``np.kron``
+* `#21269 <https://github.com/numpy/numpy/pull/21269>`__: DOC: Improve "random.generator.shuffle" docs page
+* `#21272 <https://github.com/numpy/numpy/pull/21272>`__: BUG: Fix typos
+* `#21285 <https://github.com/numpy/numpy/pull/21285>`__: BLD: Bump cibuildwheel and enable more PyPy
+* `#21286 <https://github.com/numpy/numpy/pull/21286>`__: DOC: double backticks and links
+* `#21287 <https://github.com/numpy/numpy/pull/21287>`__: MAINT: Use C++ inline and include files in C++ files.
+* `#21290 <https://github.com/numpy/numpy/pull/21290>`__: DOC: Improve documentation formatting
+* `#21291 <https://github.com/numpy/numpy/pull/21291>`__: DOC: Add space after argument name
+* `#21295 <https://github.com/numpy/numpy/pull/21295>`__: MAINT: Clean-up includes of auto-generated umath code
+* `#21297 <https://github.com/numpy/numpy/pull/21297>`__: MAINT: Rename source files that were not using any template-preprocessing
+* `#21303 <https://github.com/numpy/numpy/pull/21303>`__: MAINT: Edit logo size and logo position in README.md
+* `#21306 <https://github.com/numpy/numpy/pull/21306>`__: ENH: Introduce numpy.core.setup_common.NPY_CXX_FLAGS
+* `#21307 <https://github.com/numpy/numpy/pull/21307>`__: MAINT: bump versions in Github actions configuration to v3.
+* `#21314 <https://github.com/numpy/numpy/pull/21314>`__: DOC: various spell checks and typo fixes
+* `#21315 <https://github.com/numpy/numpy/pull/21315>`__: DOC: minor typo fix in numpy.random API docs
+* `#21321 <https://github.com/numpy/numpy/pull/21321>`__: BUG: Stop using PyBytesObject.ob_shash deprecated in Python 3.11.
+* `#21324 <https://github.com/numpy/numpy/pull/21324>`__: BUG: Make mmap handling safer in frombuffer
+* `#21327 <https://github.com/numpy/numpy/pull/21327>`__: Small updates to the array_api docs
+* `#21330 <https://github.com/numpy/numpy/pull/21330>`__: DOC: Add F2PY tests documentation
+* `#21331 <https://github.com/numpy/numpy/pull/21331>`__: REL: Update main after 1.21.6 release.
+* `#21345 <https://github.com/numpy/numpy/pull/21345>`__: TYP: Let ``ndarray`` fancy indexing always return an ``ndarray``
+* `#21347 <https://github.com/numpy/numpy/pull/21347>`__: MAINT: Fix failing simd and cygwin tests.
+* `#21348 <https://github.com/numpy/numpy/pull/21348>`__: DEV: reverted misplaced install of "esbonio".
+* `#21349 <https://github.com/numpy/numpy/pull/21349>`__: MAINT: Update setup-cygwin to v3 again.
+* `#21352 <https://github.com/numpy/numpy/pull/21352>`__: Doc: Philox.jumped correct the formula
+* `#21354 <https://github.com/numpy/numpy/pull/21354>`__: ENH: Improve ``np.kron`` performance
+* `#21355 <https://github.com/numpy/numpy/pull/21355>`__: MAINT: Remove the reference to the “good first issue” label
+* `#21356 <https://github.com/numpy/numpy/pull/21356>`__: DOC: Fix a typo in docstring of MT19937
+* `#21360 <https://github.com/numpy/numpy/pull/21360>`__: MAINT: Add compile flag to disable voltbl on MSVC 142
+* `#21366 <https://github.com/numpy/numpy/pull/21366>`__: BUG: fix compilation error for VS 141 and earlier
+* `#21367 <https://github.com/numpy/numpy/pull/21367>`__: MAINT: Translate ieee754.c.src to C++ using templates.
+* `#21368 <https://github.com/numpy/numpy/pull/21368>`__: MAINT: Fix failing Python 3.8 32-bit Windows test.
+* `#21372 <https://github.com/numpy/numpy/pull/21372>`__: BUG: Allow legacy dtypes to cast to datetime again
+* `#21377 <https://github.com/numpy/numpy/pull/21377>`__: API: Allow newaxis indexing for ``array_api`` arrays
+* `#21381 <https://github.com/numpy/numpy/pull/21381>`__: DOC: Typesetting of math for np.correlate and np.convolve
+* `#21382 <https://github.com/numpy/numpy/pull/21382>`__: DOC: non-orphan page, and casing.
+* `#21384 <https://github.com/numpy/numpy/pull/21384>`__: BUG: Missing ``f`` prefix on f-strings fix
+* `#21388 <https://github.com/numpy/numpy/pull/21388>`__: MAINT: be sure to match base and docker images
+* `#21392 <https://github.com/numpy/numpy/pull/21392>`__: BUG: add linux guard per #21386
+* `#21394 <https://github.com/numpy/numpy/pull/21394>`__: PERF: Reduce overhead of np.linalg.norm for small arrays
+* `#21400 <https://github.com/numpy/numpy/pull/21400>`__: DOC: Add missing entries in ``numpy.testing`` documentation
+* `#21407 <https://github.com/numpy/numpy/pull/21407>`__: MAINT: Reduce f2py verbiage for valid parameters
+* `#21410 <https://github.com/numpy/numpy/pull/21410>`__: DOC: Update set of allowed f2cmap types
+* `#21411 <https://github.com/numpy/numpy/pull/21411>`__: MAINT: Remove ``f2py.f2py_testing`` without replacement
+* `#21413 <https://github.com/numpy/numpy/pull/21413>`__: DOC: Secure PR template URLs [ci skip]
+* `#21415 <https://github.com/numpy/numpy/pull/21415>`__: BUG: Fix handling of skip-empty-wrappers
+* `#21417 <https://github.com/numpy/numpy/pull/21417>`__: MAINT: Update doc requirements
+* `#21421 <https://github.com/numpy/numpy/pull/21421>`__: MAINT: Remove FPE helper code that is unnecessary on C99/C++11
+* `#21423 <https://github.com/numpy/numpy/pull/21423>`__: PERF: Improve performance of special attribute lookups
+* `#21425 <https://github.com/numpy/numpy/pull/21425>`__: TEST: on PyPy, skip hanging slow test [wheel build]
+* `#21426 <https://github.com/numpy/numpy/pull/21426>`__: DOC: Add version switcher to the documentation
+* `#21430 <https://github.com/numpy/numpy/pull/21430>`__: TYP: Bump mypy to 0.950
+* `#21436 <https://github.com/numpy/numpy/pull/21436>`__: BUG: Fix segmentation fault
+* `#21442 <https://github.com/numpy/numpy/pull/21442>`__: BUG: Ensure compile errors are raised correctly
+* `#21450 <https://github.com/numpy/numpy/pull/21450>`__: PERF: Statically allocate unicode strings of memhandler
+* `#21451 <https://github.com/numpy/numpy/pull/21451>`__: DOC: Style version switcher button
+* `#21452 <https://github.com/numpy/numpy/pull/21452>`__: TST: Remove most prints from the test suite run
+* `#21453 <https://github.com/numpy/numpy/pull/21453>`__: [road-to-cxx] npy_cpu_features moved to pure C
+* `#21456 <https://github.com/numpy/numpy/pull/21456>`__: DOC: style main page card
+* `#21463 <https://github.com/numpy/numpy/pull/21463>`__: BENCH: Add benchmarks targeted at small arrays
+* `#21464 <https://github.com/numpy/numpy/pull/21464>`__: PERF: Fast check on equivalent arrays in PyArray_EQUIVALENTLY_ITERABLE_OVERLAP_OK
+* `#21465 <https://github.com/numpy/numpy/pull/21465>`__: PERF: Use python integer on _count_reduce_items
+* `#21466 <https://github.com/numpy/numpy/pull/21466>`__: DEV: Pin setuptools in the asv config
+* `#21467 <https://github.com/numpy/numpy/pull/21467>`__: MAINT: Mark ``npy_memchr`` with ``no_sanitize("alignment")`` on clang
+* `#21470 <https://github.com/numpy/numpy/pull/21470>`__: PERF: Skip probing ``__array_ufunc__`` for NumPy builtin scalars
+* `#21477 <https://github.com/numpy/numpy/pull/21477>`__: MAINT: Reduce allocation size of empty (0 size) arrays to 1 byte
+* `#21479 <https://github.com/numpy/numpy/pull/21479>`__: TYP,ENH: Add annotations for new numpy 1.23 features
+* `#21485 <https://github.com/numpy/numpy/pull/21485>`__: ENH: Add 'keepdims' to 'average()' and 'ma.average()'.
+* `#21490 <https://github.com/numpy/numpy/pull/21490>`__: TYP: Add typing for the keepdims param. of 'average' and 'ma.average'
+* `#21491 <https://github.com/numpy/numpy/pull/21491>`__: DOC: Proposal - make the doc landing page cards more similar...
+* `#21492 <https://github.com/numpy/numpy/pull/21492>`__: BUG: lib: Allow type uint64 for eye() arguments.
+* `#21498 <https://github.com/numpy/numpy/pull/21498>`__: ENH: Add ``_get_madvise_hugepage`` function
+* `#21499 <https://github.com/numpy/numpy/pull/21499>`__: ENH: avoid looping when dimensions[0] == 0 or array.size == 0
+* `#21500 <https://github.com/numpy/numpy/pull/21500>`__: TST: Fix uninitialized value in masked ndenumerate test
+* `#21502 <https://github.com/numpy/numpy/pull/21502>`__: DEV: Fix Warnings/Errors on Gitpod
+* `#21503 <https://github.com/numpy/numpy/pull/21503>`__: TYP: Add basic ``np.number`` overloads for ``ndarray`` dunders
+* `#21514 <https://github.com/numpy/numpy/pull/21514>`__: MAINT: Update to Cython 0.29.29.
+* `#21517 <https://github.com/numpy/numpy/pull/21517>`__: MAINT: Update .mailmap
+* `#21518 <https://github.com/numpy/numpy/pull/21518>`__: BUG: Fix complex+longdouble and broken subclass handling
+* `#21530 <https://github.com/numpy/numpy/pull/21530>`__: MAINT: Update to Cython 0.29.30.
+* `#21534 <https://github.com/numpy/numpy/pull/21534>`__: BUG: Fix GCC error during build configuration
+* `#21540 <https://github.com/numpy/numpy/pull/21540>`__: BUILD: update OpenBLAS to v0.3.20
+* `#21542 <https://github.com/numpy/numpy/pull/21542>`__: DOC: improve the docstring of numpy.sinc to explain behavior...
+* `#21543 <https://github.com/numpy/numpy/pull/21543>`__: TST,TYP: Fix a python 3.11 failure for the ``GenericAlias`` tests
+* `#21545 <https://github.com/numpy/numpy/pull/21545>`__: Tests/Docs: Update tests to Cython 0.29.30, mention in docs
+* `#21552 <https://github.com/numpy/numpy/pull/21552>`__: BLD: Sort svml objects to keep builds reproducible
+* `#21553 <https://github.com/numpy/numpy/pull/21553>`__: PERF: Faster MyPyFloat_AsDouble
+* `#21558 <https://github.com/numpy/numpy/pull/21558>`__: MAINT: Python <3.8 related cleanups
+* `#21562 <https://github.com/numpy/numpy/pull/21562>`__: REL: Update main after 1.22.4 release.
+* `#21565 <https://github.com/numpy/numpy/pull/21565>`__: DOC: add explanation to makefile error
+* `#21566 <https://github.com/numpy/numpy/pull/21566>`__: DOC: Fix docstring and examples for rfn.get_names*
+* `#21568 <https://github.com/numpy/numpy/pull/21568>`__: DOC:linalg: Remove ref to scipy.linalg.pinv2
+* `#21569 <https://github.com/numpy/numpy/pull/21569>`__: MAINT: loosen Cython pin in environment.yml
+* `#21570 <https://github.com/numpy/numpy/pull/21570>`__: CI: fix Gitpod image build
+* `#21574 <https://github.com/numpy/numpy/pull/21574>`__: BUG: refguide-check: respect the verbosity
+* `#21577 <https://github.com/numpy/numpy/pull/21577>`__: MAINT: update PyPy to 7.3.9 and remove unused script
+* `#21580 <https://github.com/numpy/numpy/pull/21580>`__: MAINT: Update the cversion hash.
+* `#21589 <https://github.com/numpy/numpy/pull/21589>`__: REL: Prepare for the NumPy 1.23.0rc1 release.
+* `#21604 <https://github.com/numpy/numpy/pull/21604>`__: BUILD: fix tag name for travis: it is v1.23.0rc1
+* `#21606 <https://github.com/numpy/numpy/pull/21606>`__: DOC: add missing links for two NEPs
+* `#21607 <https://github.com/numpy/numpy/pull/21607>`__: TYP, MAINT: Allow unsigned integer inplace-ops to accept signed...
+* `#21610 <https://github.com/numpy/numpy/pull/21610>`__: REL: Prepare for 1.23.0rc1 release, second version.
+* `#21619 <https://github.com/numpy/numpy/pull/21619>`__: MAINT, STY: Make download-wheels download source files.
+* `#21634 <https://github.com/numpy/numpy/pull/21634>`__: MAINT: back out conversion of npymath component to c++
+* `#21635 <https://github.com/numpy/numpy/pull/21635>`__: TST: Skip F2PY tests without Fortran compilers
+* `#21636 <https://github.com/numpy/numpy/pull/21636>`__: API: Retain ``arr.base`` more strictly in ``np.frombuffer``
+* `#21637 <https://github.com/numpy/numpy/pull/21637>`__: REL: Prepare for the NumPy 1.23.0rc2 release.
+* `#21646 <https://github.com/numpy/numpy/pull/21646>`__: ENH: Add equal_nan kwarg to np.unique
+* `#21649 <https://github.com/numpy/numpy/pull/21649>`__: MAINT: Start testing with Python 3.11.
+* `#21656 <https://github.com/numpy/numpy/pull/21656>`__: TYP, ENH: Add annotations for the ``equal_nan`` keyword to ``np.unique``
+* `#21660 <https://github.com/numpy/numpy/pull/21660>`__: MAINT: Adapt npt._GenericAlias to Python 3.11 types.GenericAlias
+* `#21684 <https://github.com/numpy/numpy/pull/21684>`__: MAINT: Point documentation version switcher at the docs homepage
+* `#21688 <https://github.com/numpy/numpy/pull/21688>`__: DEP: Deprecate (rather than remove) the int-via-float parsing...
+* `#21697 <https://github.com/numpy/numpy/pull/21697>`__: BUG: Fix a refactor leftover bug
+* `#21698 <https://github.com/numpy/numpy/pull/21698>`__: BUG: Prevent attempted broadcasting of 0-D output operands in...
+* `#21710 <https://github.com/numpy/numpy/pull/21710>`__: TST: Fixup loadtxt int-via-float tests when in release mode
+* `#21716 <https://github.com/numpy/numpy/pull/21716>`__: ENH: Implement string comparison ufuncs (or almost)
+* `#21718 <https://github.com/numpy/numpy/pull/21718>`__: BUG: use explicit einsum_path whenever it is given
+* `#21719 <https://github.com/numpy/numpy/pull/21719>`__: BUG: Small fixups found using valgrind
+* `#21720 <https://github.com/numpy/numpy/pull/21720>`__: BUG: Enable fortran preprocessing for ifort on Windows
+* `#21721 <https://github.com/numpy/numpy/pull/21721>`__: BLD, SIMD: Fix detect armhf and hardened the Neon/ASIMD compile-time...
+* `#21722 <https://github.com/numpy/numpy/pull/21722>`__: BUG: .f2py_f2cmap doesn't map long_long and other options
+* `#21729 <https://github.com/numpy/numpy/pull/21729>`__: REL: Prepare for the NumPy 1.23.0rc3 release.
+* `#21754 <https://github.com/numpy/numpy/pull/21754>`__: BUG, SIMD: Fix detecting NEON/ASIMD on aarch64
+* `#21757 <https://github.com/numpy/numpy/pull/21757>`__: BUG: Do not skip value-based promotion path for large Python...
+* `#21761 <https://github.com/numpy/numpy/pull/21761>`__: BUG: Fix small reference leaks found with pytest-leaks
+* `#21777 <https://github.com/numpy/numpy/pull/21777>`__: REV: Revert "ENH: Implement string comparison ufuncs (or almost)...
+* `#21809 <https://github.com/numpy/numpy/pull/21809>`__: MAINT: Add a check of the return value of PyMem_Calloc().
+* `#21810 <https://github.com/numpy/numpy/pull/21810>`__: BUG: lib: A loadtxt error message had two values reversed.
+* `#21811 <https://github.com/numpy/numpy/pull/21811>`__: REL: Prepare for the NumPy 1.23.0 release
+* `#21824 <https://github.com/numpy/numpy/pull/21824>`__: MAINT: Try fixing broken Anaconda uploads
+
+
diff --git a/doc/conftest.py b/doc/conftest.py

new file mode 100644 (file)

index 0000000..5e00b1e
--- /dev/null
+++ b/doc/conftest.py
@@ -0,0 +1,32 @@
+"""
+Pytest configuration and fixtures for the Numpy test suite.
+"""
+import pytest
+import numpy
+import matplotlib
+import doctest
+
+matplotlib.use('agg', force=True)
+
+# Ignore matplotlib output such as `<matplotlib.image.AxesImage at
+# 0x7f956908c280>`. doctest monkeypatching inspired by
+# https://github.com/wooyek/pytest-doctest-ellipsis-markers (MIT license)
+OutputChecker = doctest.OutputChecker
+
+empty_line_markers = ['<matplotlib.', '<mpl_toolkits.mplot3d.']
+class SkipMatplotlibOutputChecker(doctest.OutputChecker):
+    def check_output(self, want, got, optionflags):
+        for marker in empty_line_markers:
+            if marker in got:
+                got = ''
+                break
+        return OutputChecker.check_output(self, want, got, optionflags)
+
+
+doctest.OutputChecker = SkipMatplotlibOutputChecker
+
+@pytest.fixture(autouse=True)
+def add_np(doctest_namespace):
+    numpy.random.seed(1)
+    doctest_namespace['np'] = numpy
+
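Note on doc/conftest.py: the checker added above blanks the actual doctest output whenever it contains a matplotlib repr, so such a line compares equal to an empty expected output. The following is a minimal stand-alone sketch of that behaviour using only the standard library. The class name `SkipMarkedOutputChecker` and the sample strings are illustrative assumptions, not part of the commit, and the sketch does not patch `doctest.OutputChecker` globally as the real file does.

```python
import doctest

# Markers copied from the conftest.py hunk above.
EMPTY_LINE_MARKERS = ['<matplotlib.', '<mpl_toolkits.mplot3d.']


class SkipMarkedOutputChecker(doctest.OutputChecker):
    """Hypothetical stand-alone copy of the checker, for illustration only."""

    def check_output(self, want, got, optionflags):
        # Treat the actual output as empty if it contains any marker.
        if any(marker in got for marker in EMPTY_LINE_MARKERS):
            got = ''
        return super().check_output(want, got, optionflags)


checker = SkipMarkedOutputChecker()
repr_line = "<matplotlib.image.AxesImage at 0x7f956908c280>\n"

# A matplotlib repr is accepted where the docstring expects no output.
ignored = checker.check_output('', repr_line, 0)

# Ordinary mismatches are still reported as failures.
still_checked = checker.check_output('2\n', '3\n', 0)

# The unmodified checker would reject the same repr line.
plain_result = doctest.OutputChecker().check_output('', repr_line, 0)
```

Because only `got` is rewritten, a docstring that expects some other output on a line where matplotlib prints a repr would still fail, which seems to be the intended trade-off.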
diff --git a/doc/neps/.gitignore b/doc/neps/.gitignore

index 04163f7079c83523aaaebad6895ef184aebbeb7b..e5d89d1b2eabd3533ada369353a58da4b5e2e385 100644 (file)
--- a/doc/neps/.gitignore
+++ b/doc/neps/.gitignore
@@ -1 +1,7 @@
-index.rst
+accepted.rst
+deferred.rst
+finished.rst
+meta.rst
+open.rst
+provisional.rst
+rejected.rst
diff --git a/doc/neps/accepted.rst.tmpl b/doc/neps/accepted.rst.tmpl

new file mode 100644 (file)

index 0000000..8e1ce33
--- /dev/null
+++ b/doc/neps/accepted.rst.tmpl
@@ -0,0 +1,9 @@
+Accepted NEPs (implementation in progress)
+------------------------------------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] == 'Accepted' %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
diff --git a/doc/neps/conf.py b/doc/neps/conf.py

index 68805e50faf04f8280cb406e4119e1199f438268..a0ab286bdcbbb158fb51b9c0f7c6ba6934f7d83c 100644 (file)
--- a/doc/neps/conf.py
+++ b/doc/neps/conf.py
@@ -86,6 +86,8 @@ html_theme = 'pydata_sphinx_theme'
  
  html_logo = '../source/_static/numpylogo.svg'
  
+html_favicon = '../source/_static/favicon/favicon.ico'
+
  html_theme_options = {
    "github_url": "https://github.com/numpy/numpy",
    "twitter_url": "https://twitter.com/numpy_team",
@@ -106,8 +108,6 @@ html_copy_source = False
  html_domain_indices = False
  html_file_suffix = '.html'
  
-htmlhelp_basename = 'numpy'
-
  if 'sphinx.ext.pngmath' in extensions:
      pngmath_use_preview = True
      pngmath_dvipng_args = ['-gamma', '1.5', '-D', '96', '-bg', 'Transparent']
diff --git a/doc/neps/deferred.rst.tmpl b/doc/neps/deferred.rst.tmpl

new file mode 100644 (file)

index 0000000..55074bf
--- /dev/null
+++ b/doc/neps/deferred.rst.tmpl
@@ -0,0 +1,9 @@
+Deferred and Superseded NEPs
+----------------------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] in ('Deferred', 'Superseded') %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
diff --git a/doc/neps/finished.rst.tmpl b/doc/neps/finished.rst.tmpl

new file mode 100644 (file)

index 0000000..0b9ba8a
--- /dev/null
+++ b/doc/neps/finished.rst.tmpl
@@ -0,0 +1,9 @@
+Finished NEPs
+-------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] == 'Final' %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
diff --git a/doc/neps/index.rst b/doc/neps/index.rst

new file mode 100644 (file)

index 0000000..0530308
--- /dev/null
+++ b/doc/neps/index.rst
@@ -0,0 +1,32 @@
+=====================================
+Roadmap & NumPy Enhancement Proposals
+=====================================
+
+This page provides an overview of development priorities for NumPy.
+Specifically, it contains a roadmap with a higher-level overview, as
+well as NumPy Enhancement Proposals (NEPs)—suggested changes
+to the library—in various stages of discussion or completion (see `NEP
+0 <nep-0000>`__).
+
+Roadmap
+-------
+.. toctree::
+   :maxdepth: 1
+
+   The Scope of NumPy <scope>
+   Current roadmap <roadmap>
+   Wish list <https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%2223+-+Wish+List%22>
+
+NumPy Enhancement Proposals (NEPs)
+----------------------------------
+
+.. toctree::
+   :maxdepth: 2
+
+   meta
+   provisional
+   accepted
+   open
+   finished
+   deferred
+   rejected
diff --git a/doc/neps/index.rst.tmpl b/doc/neps/index.rst.tmpl

deleted file mode 100644 (file)

index 0299f86..0000000
--- a/doc/neps/index.rst.tmpl
+++ /dev/null
@@ -1,100 +0,0 @@
-=====================================
-Roadmap & NumPy Enhancement Proposals
-=====================================
-
-This page provides an overview of development priorities for NumPy.
-Specifically, it contains a roadmap with a higher-level overview, as
-well as NumPy Enhancement Proposals (NEPs)—suggested changes
-to the library—in various stages of discussion or completion (see `NEP
-0 <nep-0000>`__).
-
-Roadmap
--------
-.. toctree::
-   :maxdepth: 1
-
-   The Scope of NumPy <scope>
-   Current roadmap <roadmap>
-   Wish list <https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%2223+-+Wish+List%22>
-
-Meta-NEPs (NEPs about NEPs or Processes)
-----------------------------------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] == 'Active' %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
-   nep-template
-
-
-{% if has_provisional %}
-
-Provisional NEPs (provisionally accepted; interface may change)
----------------------------------------------------------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] == 'Provisional' %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
-{% endif %}
-
-
-Accepted NEPs (implementation in progress)
-------------------------------------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] == 'Accepted' %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
-
-Open NEPs (under consideration)
--------------------------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] == 'Draft' %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
-
-
-Finished NEPs
-----------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] == 'Final' %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
-Deferred and Superseded NEPs
-----------------------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] in ('Deferred', 'Superseded') %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
-Rejected and Withdrawn NEPs
----------------------------
-
-.. toctree::
-   :maxdepth: 1
-
-{% for nep, tags in neps.items() if tags['Status'] in ('Rejected', 'Withdrawn') %}
-   {{ tags['Title'] }} <{{ tags['Filename'] }}>
-{% endfor %}
-
diff --git a/doc/neps/meta.rst.tmpl b/doc/neps/meta.rst.tmpl

new file mode 100644 (file)

index 0000000..a74311e
--- /dev/null
+++ b/doc/neps/meta.rst.tmpl
@@ -0,0 +1,11 @@
+Meta-NEPs (NEPs about NEPs or Processes)
+----------------------------------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] == 'Active' %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
+
+   nep-template
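Note on the template split: each new `*.rst.tmpl` file keeps one of the status filters from the deleted `index.rst.tmpl`. The sketch below mimics those filters in plain Python to show how a `neps` mapping would be partitioned across the pages. The NEP entries are hypothetical placeholders, and the page names for `open`, `provisional` and `rejected` are inferred from the section filters in the deleted template and the new `index.rst` toctree, since their templates are not shown in this part of the diff.

```python
# Status filters taken from the templates in this diff.
PAGE_STATUSES = {
    'meta': ('Active',),
    'provisional': ('Provisional',),
    'accepted': ('Accepted',),
    'open': ('Draft',),
    'finished': ('Final',),
    'deferred': ('Deferred', 'Superseded'),
    'rejected': ('Rejected', 'Withdrawn'),
}

# Hypothetical metadata, shaped like the `neps` mapping the templates iterate over.
neps = {
    1: {'Title': 'Example A', 'Filename': 'nep-0001-example-a', 'Status': 'Active'},
    2: {'Title': 'Example B', 'Filename': 'nep-0002-example-b', 'Status': 'Final'},
    3: {'Title': 'Example C', 'Filename': 'nep-0003-example-c', 'Status': 'Superseded'},
    4: {'Title': 'Example D', 'Filename': 'nep-0004-example-d', 'Status': 'Withdrawn'},
}


def toctree_entries(neps, statuses):
    """Build the indented toctree lines one template's for-loop would emit."""
    return [
        f"   {tags['Title']} <{tags['Filename']}>"
        for nep, tags in neps.items()
        if tags['Status'] in statuses
    ]


pages = {page: toctree_entries(neps, statuses)
         for page, statuses in PAGE_STATUSES.items()}
```

Every status appears in exactly one filter, so each NEP ends up on exactly one page and the static `index.rst` only needs to list the seven page names.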
diff --git a/doc/neps/nep-0002-warnfix.rst b/doc/neps/nep-0002-warnfix.rst

index a1138b2f1b833e470bc030ec6255c93c5de284ca..1608998a652df073ce4a7b718d184302a4fd4000 100644 (file)
--- a/doc/neps/nep-0002-warnfix.rst
+++ b/doc/neps/nep-0002-warnfix.rst
@@ -76,7 +76,7 @@ expanded to::
     int foo(int * __NPY_UNUSED_TAGGEDdummy __COMP_NPY_UNUSED)
  
  Thus avoiding any accidental use of the variable. The mangling is pure C, and
-thuse portable. The per-variable warning disabling is compiler specific.
+thus portable. The per-variable warning disabling is compiler specific.
  
  signed/unsigned comparison
  --------------------------
diff --git a/doc/neps/nep-0005-generalized-ufuncs.rst b/doc/neps/nep-0005-generalized-ufuncs.rst

index 43459a555a58aec5467b4960e68adf4cc0361e9b..8ef6f345368b37bde162a386ae3a1c57e9901f3b 100644 (file)
--- a/doc/neps/nep-0005-generalized-ufuncs.rst
+++ b/doc/neps/nep-0005-generalized-ufuncs.rst
@@ -45,7 +45,7 @@ determines how the dimensions of each input/output object are split
  into core and loop dimensions:
  
  #. While an input array has a smaller dimensionality than the corresponding
-   number of core dimensions, 1's are pre-pended to its shape.
+   number of core dimensions, 1's are prepended to its shape.
  #. The core dimensions are removed from all inputs and the remaining
     dimensions are broadcasted; defining the loop dimensions.
  #. The output is given by the loop dimensions plus the output core dimensions.
diff --git a/doc/neps/nep-0009-structured_array_extensions.rst b/doc/neps/nep-0009-structured_array_extensions.rst

index cd6c3f6c380c06f51c1037222eb4b91cc0fc2755..5912b268ba0416197e057753913deb6041d0f4e0 100644 (file)
--- a/doc/neps/nep-0009-structured_array_extensions.rst
+++ b/doc/neps/nep-0009-structured_array_extensions.rst
@@ -9,7 +9,7 @@ NEP 9 — Structured array extensions
  1.  Create with-style context that makes "named-columns" available as names in the namespace.
  
     with np.columns(array):
-        price = unit * quantityt
+        price = unit * quantity
  
  
  2. Allow structured arrays to be sliced by their column  (i.e. one additional indexing option for structured arrays) so that a[:4, 'foo':'bar']  would be allowed.
diff --git a/doc/neps/nep-0012-missing-data.rst b/doc/neps/nep-0012-missing-data.rst

index 4775ea18bc1e5a64564bd3e9e1d3a28ae0a4461c..c896c6b6a238884975fce6b804bf7810756c0a68 100644 (file)
--- a/doc/neps/nep-0012-missing-data.rst
+++ b/doc/neps/nep-0012-missing-data.rst
@@ -428,7 +428,7 @@ New functions added to the ndarray are::
      arr.copy(..., replacena=np.NA)
          Modification to the copy function which replaces NA values,
          either masked or with the NA bitpattern, with the 'replacena='
-        parameter suppled. When 'replacena' isn't NA, the copied
+        parameter supplied. When 'replacena' isn't NA, the copied
          array is unmasked and has the 'NA' part stripped from the
          parameterized dtype ('NA[f8]' becomes just 'f8').
  
diff --git a/doc/neps/nep-0017-split-out-maskedarray.rst b/doc/neps/nep-0017-split-out-maskedarray.rst

index 5cb1c0c399e544089c2998cdef8f9e0e128c2e9d..fac05e256d8630e91e81eae2e52483dc72804c83 100644 (file)
--- a/doc/neps/nep-0017-split-out-maskedarray.rst
+++ b/doc/neps/nep-0017-split-out-maskedarray.rst
@@ -69,7 +69,7 @@ how to modify code to use `maskedarray`.
  After two releases, `np.ma` will be removed entirely. In order to obtain
  `np.ma`, a user will install it via `pip install` or via their package
  manager. Subsequently, `importing maskedarray` on a version of NumPy that
-includes it intgrally will raise an `ImportError`.
+includes it integrally will raise an `ImportError`.
  
  Documentation
  `````````````
@@ -123,7 +123,7 @@ References and Footnotes
  
  .. [1] Subclassing ndarray,
         https://docs.scipy.org/doc/numpy/user/basics.subclassing.html
-.. [2] PyPi: maskedarray, https://pypi.org/project/maskedarray/
+.. [2] PyPI: maskedarray, https://pypi.org/project/maskedarray/
  
  Copyright
  ---------
diff --git a/doc/neps/nep-0022-ndarray-duck-typing-overview.rst b/doc/neps/nep-0022-ndarray-duck-typing-overview.rst

index 47b81d9e76ec2275cfed080609d56317c8641c35..8f3e09995e109a6f7fddde17efe677084f88d048 100644 (file)
--- a/doc/neps/nep-0022-ndarray-duck-typing-overview.rst
+++ b/doc/neps/nep-0022-ndarray-duck-typing-overview.rst
@@ -97,7 +97,7 @@ partial duck arrays. We've been guilty of this ourself.
  
  At this point though, we think the best general strategy is to focus
  our efforts primarily on supporting full duck arrays, and only worry
-about partial duck arrays as much as we need to to make sure we don't
+about partial duck arrays as much as we need to make sure we don't
  accidentally rule them out for no reason.
  
  Why focus on full duck arrays? Several reasons:
diff --git a/doc/neps/nep-0023-backwards-compatibility.rst b/doc/neps/nep-0023-backwards-compatibility.rst

index 8b6f4cd1186a81229ada058474723858cdb2b24a..a056e7074716c981306cec88e11eeba4cd25b585 100644 (file)
--- a/doc/neps/nep-0023-backwards-compatibility.rst
+++ b/doc/neps/nep-0023-backwards-compatibility.rst
@@ -327,7 +327,7 @@ Discussion
  
  - `Mailing list discussion on the first version of this NEP in 2018 <https://mail.python.org/pipermail/numpy-discussion/2018-July/078432.html>`__
  - `Mailing list discussion on the Dec 2020 update of this NEP <https://mail.python.org/pipermail/numpy-discussion/2020-December/081358.html>`__
-- `PR with review comments on the the Dec 2020 update of this NEP <https://github.com/numpy/numpy/pull/18097>`__
+- `PR with review comments on the Dec 2020 update of this NEP <https://github.com/numpy/numpy/pull/18097>`__
  
  
  References and Footnotes
diff --git a/doc/neps/nep-0024-missing-data-2.rst b/doc/neps/nep-0024-missing-data-2.rst

index c0e2d2ce777148d7c87cb3bacb1689a13cbb5244..ef6e628b5f8f866499f0d3425a6fd38f5ce8695e 100644 (file)
--- a/doc/neps/nep-0024-missing-data-2.rst
+++ b/doc/neps/nep-0024-missing-data-2.rst
@@ -193,7 +193,7 @@ is obvious in the NA case::
     >>> na_arr
     array([1., 2., NA], dtype='NA[<f8]')
  
-Direct assignnent in the masked case is magic and confusing, and so happens only
+Direct assignment in the masked case is magic and confusing, and so happens only
  via the mask::
  
     >>> masked_array = np.array([1.0, 2.0, 7.0], masked=True)
diff --git a/doc/neps/nep-0027-zero-rank-arrarys.rst b/doc/neps/nep-0027-zero-rank-arrarys.rst

index eef4bcacc4cd5f40aacff009ae8a8e467dc7ed1f..ed51f3a13cafd4a7e350c61734f7e8fa82766476 100644 (file)
--- a/doc/neps/nep-0027-zero-rank-arrarys.rst
+++ b/doc/neps/nep-0027-zero-rank-arrarys.rst
@@ -105,7 +105,7 @@ arrays to scalars were summarized as follows:
  
    - This results in a special-case checking that is not
      pleasant.  Fundamentally it lets the user believe that
-    somehow multidimensional homoegeneous arrays
+    somehow multidimensional homogeneous arrays
      are something like Python lists (which except for
      Object arrays they are not).
  
@@ -166,7 +166,7 @@ Alexander started a `Jan 2006 discussion`_ on scipy-dev
  with the following proposal:
  
      ... it may be reasonable to allow ``a[...]``.  This way
-    ellipsis can be interpereted as any number of  ``:`` s including zero.
+    ellipsis can be interpreted as any number of  ``:`` s including zero.
      Another subscript operation that makes sense for scalars would be
      ``a[...,newaxis]`` or even ``a[{newaxis, }* ..., {newaxis,}*]``, where
      ``{newaxis,}*`` stands for any number of comma-separated newaxis tokens.
diff --git a/doc/neps/nep-0029-deprecation_policy.rst b/doc/neps/nep-0029-deprecation_policy.rst

index a50afcb98f9da653951dc7a1334a6374c1fecbb2..36f12815942c0564d67cd0eeae61960307394437 100644 (file)
--- a/doc/neps/nep-0029-deprecation_policy.rst
+++ b/doc/neps/nep-0029-deprecation_policy.rst
@@ -114,7 +114,12 @@ Jul 26, 2021 3.7+   1.18+
  Dec 22, 2021 3.7+   1.19+
  Dec 26, 2021 3.8+   1.19+
  Jun 21, 2022 3.8+   1.20+
-Apr 14, 2023 3.9+   1.20+
+Jan 31, 2023 3.8+   1.21+
+Apr 14, 2023 3.9+   1.21+
+Jun 23, 2023 3.9+   1.22+
+Jan 01, 2024 3.9+   1.23+
+Apr 05, 2024 3.10+  1.23+
+Apr 04, 2025 3.11+  1.23+
  ============ ====== =====
  
  
@@ -132,7 +137,12 @@ Drop Schedule
    On Dec 22, 2021 drop support for NumPy 1.18 (initially released on Dec 22, 2019)
    On Dec 26, 2021 drop support for Python 3.7 (initially released on Jun 27, 2018)
    On Jun 21, 2022 drop support for NumPy 1.19 (initially released on Jun 20, 2020)
+  On Jan 31, 2023 drop support for NumPy 1.20 (initially released on Jan 31, 2021)
    On Apr 14, 2023 drop support for Python 3.8 (initially released on Oct 14, 2019)
+  On Jun 23, 2023 drop support for NumPy 1.21 (initially released on Jun 22, 2021)
+  On Jan 01, 2024 drop support for NumPy 1.22 (initially released on Dec 31, 2021)
+  On Apr 05, 2024 drop support for Python 3.9 (initially released on Oct 05, 2020)
+  On Apr 04, 2025 drop support for Python 3.10 (initially released on Oct 04, 2021)
  
  
  Implementation
@@ -261,6 +271,11 @@ Code to generate support and drop schedule tables ::
    Oct 14, 2019: Python 3.8
    Dec 22, 2019: NumPy 1.18
    Jun 20, 2020: NumPy 1.19
+  Oct 05, 2020: Python 3.9
+  Jan 30, 2021: NumPy 1.20
+  Jun 22, 2021: NumPy 1.21
+  Oct 04, 2021: Python 3.10
+  Dec 31, 2021: NumPy 1.22
    """
  
    releases = []
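Note on the NEP 29 tables: the drop dates added in the hunks above follow from the release dates by simple date arithmetic. The sketch below reproduces that arithmetic for the newly added releases. The window lengths are an assumption (support windows of roughly 42 months for Python and 24 months for NumPy, expressed in whole days); the generator code itself is not visible in this hunk, so the names and constants here are illustrative.

```python
from datetime import date, timedelta

# Assumption: support windows expressed as whole days, approximating
# 42 months (Python) and 24 months (NumPy).
PYTHON_WINDOW = timedelta(days=int(365 * 3.5 + 1))  # 1278 days
NUMPY_WINDOW = timedelta(days=int(365 * 2 + 1))     # 731 days

# Release dates copied from the data block in the hunk above.
RELEASES = {
    'Python 3.9': date(2020, 10, 5),
    'NumPy 1.20': date(2021, 1, 30),
    'NumPy 1.21': date(2021, 6, 22),
    'Python 3.10': date(2021, 10, 4),
    'NumPy 1.22': date(2021, 12, 31),
}


def drop_date(name, released):
    """Return the date on which support for a release would be dropped."""
    window = PYTHON_WINDOW if name.startswith('Python') else NUMPY_WINDOW
    return released + window


schedule = {name: drop_date(name, released)
            for name, released in RELEASES.items()}
```

Under these assumptions the computed dates line up with the rows added to the drop schedule, for example Apr 05, 2024 for Python 3.9 and Jan 01, 2024 for NumPy 1.22.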
diff --git a/doc/neps/nep-0031-uarray.rst b/doc/neps/nep-0031-uarray.rst

index b4ec94077f802b07b01bdad5c9315740c175003b..bda35d426b31dd33b654ac0d26a74b806dc8a0e2 100644 (file)
--- a/doc/neps/nep-0031-uarray.rst
+++ b/doc/neps/nep-0031-uarray.rst
@@ -102,7 +102,7 @@ Usage and Impact
  This NEP allows for global and context-local overrides, as well as
  automatic overrides a-la ``__array_function__``.
  
-Here are some use-cases this NEP would enable, besides the 
+Here are some use-cases this NEP would enable, besides the
  first one stated in the motivation section:
  
  The first is allowing alternate dtypes to return their
@@ -114,7 +114,7 @@ respective arrays.
      x = unp.ones((5, 5), dtype=xnd_dtype) # Or torch dtype
  
  The second is allowing overrides for parts of the API.
-This is to allow alternate and/or optimised implementations
+This is to allow alternate and/or optimized implementations
  for ``np.linalg``, BLAS, and ``np.random``.
  
  .. code:: python
@@ -126,7 +126,7 @@ for ``np.linalg``, BLAS, and ``np.random``.
      np.set_global_backend(pyfftw)
  
      # Uses pyfftw without monkeypatching
-    np.fft.fft(numpy_array)    
+    np.fft.fft(numpy_array)
  
      with np.set_backend(pyfftw) # Or mkl_fft, or numpy
          # Uses the backend you specified
@@ -200,10 +200,10 @@ GitHub workflow. There are a few reasons for this:
    The reason for this is that there may exist functions in the in these
    submodules that need backends, even for ``numpy.ndarray`` inputs.
  
-Advantanges of ``unumpy`` over other solutions
+Advantages of ``unumpy`` over other solutions
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
-``unumpy`` offers a number of advantanges over the approach of defining a new
+``unumpy`` offers a number of advantages over the approach of defining a new
  protocol for every problem encountered: Whenever there is something requiring
  an override, ``unumpy`` will be able to offer a unified API with very minor
  changes. For example:
@@ -302,7 +302,7 @@ This is different from monkeypatching in a few different ways:
    so there is at least the loose sense of an API contract. Monkeypatching
    does not provide this ability.
  * There is the ability of locally switching the backend.
-* It has been `suggested <http://numpy-discussion.10968.n7.nabble.com/NEP-31-Context-local-and-global-overrides-of-the-NumPy-API-tp47452p47472.html>`_
+* It has been `suggested <https://mail.python.org/archives/list/numpy-discussion@python.org/message/PS7EN3CRT6XERNTCN56MAYOXFFFEC55G/>`_
    that the reason that 1.17 hasn't landed in the Anaconda defaults channel is
    due to the incompatibility between monkeypatching and ``__array_function__``,
    as monkeypatching would bypass the protocol completely.
@@ -313,7 +313,7 @@ This is different from monkeypatching in a few different ways:
  All this isn't possible at all with ``__array_function__`` or
  ``__array_ufunc__``.
  
-It has been formally realised (at least in part) that a backend system is
+It has been formally realized (at least in part) that a backend system is
  needed for this, in the `NumPy roadmap <https://numpy.org/neps/roadmap.html#other-functionality>`_.
  
  For ``numpy.random``, it's still necessary to make the C-API fit the one
@@ -347,7 +347,7 @@ dispatchable.
  The need for an opt-in module
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  
-The need for an opt-in module is realised because of a few reasons:
+The need for an opt-in module is realized because of a few reasons:
  
  * There are parts of the API (like `numpy.asarray`) that simply cannot be
    overridden due to incompatibility concerns with C/Cython extensions, however,
@@ -356,7 +356,7 @@ The need for an opt-in module is realised because of a few reasons:
    as those mentioned above.
  
  NEP 18 notes that this may require maintenance of two separate APIs. However,
-this burden may be lessened by, for example, parametrizing all tests over
+this burden may be lessened by, for example, parameterizing all tests over
  ``numpy.overridable`` separately via a fixture. This also has the side-effect
  of thoroughly testing it, unlike ``__array_function__``. We also feel that it
  provides an opportunity to separate the NumPy API contract properly from the
@@ -640,9 +640,9 @@ References and Footnotes
  
  .. [4] NEP 13 — A Mechanism for Overriding Ufuncs: https://numpy.org/neps/nep-0013-ufunc-overrides.html
  
-.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: http://numpy-discussion.10968.n7.nabble.com/Adding-to-the-non-dispatched-implementation-of-NumPy-methods-tp46816p46874.html
+.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/5GUDMALWDIRHITG5YUOCV343J66QSX3U/#5GUDMALWDIRHITG5YUOCV343J66QSX3U
  
-.. [6] Custom Dtype/Units discussion: http://numpy-discussion.10968.n7.nabble.com/Custom-Dtype-Units-discussion-td43262.html
+.. [6] Custom Dtype/Units discussion: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/RZYCVT6C3F7UDV6NA6FEV4MC5FKS6RDA/#RZYCVT6C3F7UDV6NA6FEV4MC5FKS6RDA
  
  .. [7] The epic dtype cleanup plan: https://github.com/numpy/numpy/issues/2899
  
diff --git a/doc/neps/nep-0036-fair-play.rst b/doc/neps/nep-0036-fair-play.rst

index 2acdcc70459a16f90cb1474954e5d4440d08d473..96dfa61d8d85a50e14d5d9133047613b12832642 100644 (file)
--- a/doc/neps/nep-0036-fair-play.rst
+++ b/doc/neps/nep-0036-fair-play.rst
@@ -1,3 +1,5 @@
+.. _NEP36:
+
  ==================
  NEP 36 — Fair play
  ==================
diff --git a/doc/neps/nep-0038-SIMD-optimizations.rst b/doc/neps/nep-0038-SIMD-optimizations.rst

index 9272284474a6175a350b967ed605d301cabc8afe..c7d9ce9d91976dfd530f799136e2d4237b70c833 100644 (file)
--- a/doc/neps/nep-0038-SIMD-optimizations.rst
+++ b/doc/neps/nep-0038-SIMD-optimizations.rst
@@ -8,7 +8,7 @@ NEP 38 — Using SIMD optimization instructions for performance
  :Status: Accepted
  :Type: Standards
  :Created: 2019-11-25
-:Resolution: http://numpy-discussion.10968.n7.nabble.com/NEP-38-Universal-SIMD-intrinsics-td47854.html
+:Resolution: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/PVWJ74UVBRZ5ZWF6MDU7EUSJXVNILAQB/#PVWJ74UVBRZ5ZWF6MDU7EUSJXVNILAQB
  
  
  Abstract
@@ -26,9 +26,9 @@ function that matches the run-time CPU info `is chosen`_ from the candidates.Thi
  NEP proposes a mechanism to build on that for many more features and
  architectures.  The steps proposed are to:
  
-- Establish a set of well-defined, architecture-agnostic, universal intrisics
+- Establish a set of well-defined, architecture-agnostic, universal intrinsics
    which capture features available across architectures.
-- Capture these universal intrisics in a set of C macros and use the macros
+- Capture these universal intrinsics in a set of C macros and use the macros
    to build code paths for sets of features from the baseline up to the maximum
    set of features available on that architecture. Offer these as a limited
    number of compiled alternative code paths.
@@ -43,7 +43,7 @@ Traditionally NumPy has depended on compilers to generate optimal code
  specifically for the target architecture.
  However few users today compile NumPy locally for their machines. Most use the
  binary packages which must provide run-time support for the lowest-common
-denominator CPU architecture. Thus NumPy cannot take advantage of 
+denominator CPU architecture. Thus NumPy cannot take advantage of
  more advanced features of their CPU processors, since they may not be available
  on all users' systems.
  
@@ -124,7 +124,7 @@ Therefore, such code should only be added if it yields a significant
  performance benefit. Assessing this performance benefit can be nontrivial.
  To aid with this, the implementation for this NEP will add a way to select
  which instruction sets can be used at *runtime* via environment variables.
-(name TBD). This ablility is critical for CI code verification.
+(name TBD). This ability is critical for CI code verification.
  
  
  Diagnostics
@@ -153,7 +153,7 @@ SIMD loops for many ufuncs. These would likely be the first candidates
  to be ported to universal intrinsics. The expectation is that the new
  implementation may cause a regression in benchmarks, but not increase the
  size of the binary. If the regression is not minimal, we may choose to keep
-the X86-specific code for that platform and use the universal intrisic code
+the X86-specific code for that platform and use the universal intrinsic code
  for other platforms.
  
  Any new PRs to implement ufuncs using intrinsics will be expected to use the
@@ -208,12 +208,12 @@ There should be no impact on backwards compatibility.
  Detailed description
  --------------------
  
-The CPU-specific are mapped to unversal intrinsics which are
+The CPU-specific are mapped to universal intrinsics which are
  similar for all x86 SIMD variants, ARM SIMD variants etc. For example, the
  NumPy universal intrinsic ``npyv_load_u32`` maps to:
  
  *  ``vld1q_u32`` for ARM based NEON
-* ``_mm256_loadu_si256`` for x86 based AVX2 
+* ``_mm256_loadu_si256`` for x86 based AVX2
  * ``_mm512_loadu_si512`` for x86 based AVX-512
  
  Anyone writing a SIMD loop will use the ``npyv_load_u32`` macro instead of the
@@ -271,7 +271,7 @@ Current PRs:
  
  The compile-time and runtime code infrastructure are supplied by the first PR.
  The second adds a demonstration of use of the infrastructure for a loop. Once
-the NEP is approved, more work is needed to write loops using the machnisms
+the NEP is approved, more work is needed to write loops using the mechanisms
  provided by the NEP.
  
  Alternatives
diff --git a/doc/neps/nep-0040-legacy-datatype-impl.rst b/doc/neps/nep-0040-legacy-datatype-impl.rst

index a6e74d7a0c1f62f3b06e9be377cf35ad92167d53..a8d20ac4181a2cddb7ddd850c13d8e0a56168f90 100644 (file)
--- a/doc/neps/nep-0040-legacy-datatype-impl.rst
+++ b/doc/neps/nep-0040-legacy-datatype-impl.rst
@@ -21,7 +21,7 @@ NEP 40 — Legacy datatype implementation in NumPy
  
      - :ref:`NEP 42 <NEP42>` describes the new design's datatype-related APIs.
  
-    - NEP 43 describes the new design's API for universal functions.
+    - :ref:`NEP 43 <NEP43>` describes the new design's API for universal functions.
  
  
  
diff --git a/doc/neps/nep-0041-improved-dtype-support.rst b/doc/neps/nep-0041-improved-dtype-support.rst

index 2fb907073b7f3056bf8d227b849c02a5b7165fec..fcde39780d3c2c51f1fe6ee63e6e8882bf2651c4 100644 (file)
--- a/doc/neps/nep-0041-improved-dtype-support.rst
+++ b/doc/neps/nep-0041-improved-dtype-support.rst
@@ -23,7 +23,7 @@ NEP 41 — First step towards a new datatype system
  
      - :ref:`NEP 42 <NEP42>` describes the new design's datatype-related APIs.
  
-    - NEP 43 describes the new design's API for universal functions.
+    - :ref:`NEP 43 <NEP43>` describes the new design's API for universal functions.
  
  
  Abstract
@@ -74,7 +74,7 @@ cannot describe casting for such parametric datatypes implemented outside of Num
  This additional functionality for supporting parametric datatypes introduces
  increased complexity within NumPy itself,
  and furthermore is not available to external user-defined datatypes.
-In general the concerns of different datatypes are not well well-encapsulated.
+In general the concerns of different datatypes are not well-encapsulated.
  This burden is exacerbated by the exposure of internal C structures,
  limiting the addition of new fields
  (for example to support new sorting methods [new_sort]_).
diff --git a/doc/neps/nep-0042-new-dtypes.rst b/doc/neps/nep-0042-new-dtypes.rst

index c29172a281c568120f80843c0b5363faa241b170..3d30d54f4b14d604c132d478d0c59d1849288658 100644 (file)
--- a/doc/neps/nep-0042-new-dtypes.rst
+++ b/doc/neps/nep-0042-new-dtypes.rst
@@ -17,13 +17,13 @@ NEP 42 — New and extensible DTypes
  
      This NEP is third in a series:
  
-    - :ref:`NEP40` explains the shortcomings of NumPy's dtype implementation.
+    - :ref:`NEP 40 <NEP40>` explains the shortcomings of NumPy's dtype implementation.
  
-    - :ref:`NEP41` gives an overview of our proposed replacement.
+    - :ref:`NEP 41 <NEP41>` gives an overview of our proposed replacement.
  
      - NEP 42 (this document) describes the new design's datatype-related APIs.
  
-    - :ref:`NEP43` describes the new design's API for universal functions.
+    - :ref:`NEP 43 <NEP43>` describes the new design's API for universal functions.
  
  
  ******************************************************************************
@@ -302,7 +302,7 @@ user-defined DType::
      class UserDtype(dtype): ...
  
  one can do ``np.ndarray[UserDtype]``, keeping annotations concise in
-that case without introducing boilerplate in NumPy itself. For a user
+that case without introducing boilerplate in NumPy itself. For a
  user-defined scalar type::
  
      class UserScalar(generic): ...
diff --git a/doc/neps/nep-0043-extensible-ufuncs.rst b/doc/neps/nep-0043-extensible-ufuncs.rst

index 3312eb12cc89970b52c99f705d983d3fc3b15fe1..abc19ccf3315e8d48d22ac6af0cf1cc4e4e18a2d 100644 (file)
--- a/doc/neps/nep-0043-extensible-ufuncs.rst
+++ b/doc/neps/nep-0043-extensible-ufuncs.rst
@@ -804,7 +804,7 @@ the inner-loop operates on.
  This is necessary information for parametric dtypes since for example comparing
  two strings requires knowing the length of both strings.
  The ``Context`` can also hold potentially useful information such as the
-the original ``ufunc``, which can be helpful when reporting errors.
+original ``ufunc``, which can be helpful when reporting errors.
  
  In principle passing in Context is not necessary, as all information could be
  included in ``innerloop_data`` and set up in the ``get_loop`` function.
@@ -948,7 +948,7 @@ This wrapped ``ArrayMethod`` will have two additional methods:
    convert this to ``float64 + float64``.
  
  * ``wrap_outputs(Tuple[DType]: input_descr) -> Tuple[DType]`` replacing the
-  resolved descriptors with with the desired actual loop descriptors.
+  resolved descriptors with the desired actual loop descriptors.
    The original ``resolve_descriptors`` function will be called between these
    two calls, so that the output descriptors may not be set in the first call.
    In the above example it will use the ``float64`` as returned (which might
@@ -987,8 +987,8 @@ A different use-case is that of a ``Unit(float64, "m")`` DType, where
  the numerical type is part of the DType parameter.
  This approach is possible, but will require a custom ``ArrayMethod``
  which wraps existing loops.
-It must also always require require two steps of dispatching
-(one to the ``Unit`` DType and a second one for the numerical type).
+It must also always require two steps of dispatching (one to the ``Unit``
+DType and a second one for the numerical type).
  
  Furthermore, the efficient implementation will require the ability to
  fetch and reuse the inner-loop function from another ``ArrayMethod``.
@@ -1296,7 +1296,7 @@ of the current ufunc machinery (as well as casting).
  
  The implementation unfortunately will require large maintenance of the
  UFunc machinery, since both the actual UFunc loop calls, as well as the
-the initial dispatching steps have to be modified.
+initial dispatching steps have to be modified.
  
  In general, the correct ``ArrayMethod``, also those returned by a promoter,
  will be cached (or stored) inside a hashtable for efficient lookup.
diff --git a/doc/neps/nep-0046-sponsorship-guidelines.rst b/doc/neps/nep-0046-sponsorship-guidelines.rst

index 8535cb554703be93251062af3eca3677b992240d..ed54c21d55d0b8e2fb570db054b693cc9bc20638 100644 (file)
--- a/doc/neps/nep-0046-sponsorship-guidelines.rst
+++ b/doc/neps/nep-0046-sponsorship-guidelines.rst
@@ -219,8 +219,9 @@ https://scikit-learn.org/stable/about.html#funding. Plus a separate section
  https://jupyter.org/about. Some subprojects have separate approaches, for
  example sponsors are listed (by using the `all-contributors
  <https://github.com/all-contributors/all-contributors>`__ bot) in the README for
-`jupyterlab-git <https://github.com/jupyterlab/jupyterlab-git>`__. For a recent
-discussion on that, see `here <jupyterlab-git acknowledgements discussion>`_.
+`jupyterlab-git <https://github.com/jupyterlab/jupyterlab-git>`__.
+For a discussion from Jan 2020 on that, see
+`here <https://discourse.jupyter.org/t/ideas-for-recognizing-developer-contributions-by-companies-institutes/3178>`_.
  
  *NumFOCUS* has a large banner with sponsor logos on its front page at
  https://numfocus.org, and a full page with sponsors at different sponsorship
diff --git a/doc/neps/nep-0047-array-api-standard.rst b/doc/neps/nep-0047-array-api-standard.rst

index 53b8e35b001fc186a1bf3bc07b1e7cf903d1550e..a94b3b42345bc087e22d3f03016d10fd43c0c10f 100644 (file)
--- a/doc/neps/nep-0047-array-api-standard.rst
+++ b/doc/neps/nep-0047-array-api-standard.rst
@@ -340,7 +340,7 @@ Adding support for DLPack to NumPy entails:
  
  - Adding a ``ndarray.__dlpack__()`` method which returns a ``dlpack`` C
    structure wrapped in a ``PyCapsule``.
-- Adding a ``np._from_dlpack(obj)`` function, where ``obj`` supports
+- Adding a ``np.from_dlpack(obj)`` function, where ``obj`` supports
    ``__dlpack__()``, and returns an ``ndarray``.
  
  DLPack is currently a ~200 LoC header, and is meant to be included directly, so
diff --git a/doc/neps/nep-0049.rst b/doc/neps/nep-0049.rst

index 3bd1d102c62d4f0293ded6b45bd21d40aacd544c..8bc88a68b1e09bd606e8e440212aa605cd8dd5df 100644 (file)
--- a/doc/neps/nep-0049.rst
+++ b/doc/neps/nep-0049.rst
@@ -1,3 +1,5 @@
+.. _NEP49:
+
  ===================================
  NEP 49 — Data allocation strategies
  ===================================
@@ -55,8 +57,8 @@ is to create a flexible enough interface without burdening normative users.
  .. _`issue 5312`: https://github.com/numpy/numpy/issues/5312
  .. _`from 2017`: https://github.com/numpy/numpy/issues/5312#issuecomment-315234656
  .. _`in 2005`: https://numpy-discussion.scipy.narkive.com/MvmMkJcK/numpy-arrays-data-allocation-and-simd-alignement
-.. _`here`: http://numpy-discussion.10968.n7.nabble.com/Aligned-configurable-memory-allocation-td39712.html
-.. _`and here`: http://numpy-discussion.10968.n7.nabble.com/Numpy-s-policy-for-releasing-memory-td1533.html
+.. _`here`: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/YPC5BGPUMKT2MLBP6O3FMPC35LFM2CCH/#YPC5BGPUMKT2MLBP6O3FMPC35LFM2CCH
+.. _`and here`: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/IQK3EPIIRE3V4BPNAMJ2ZST3NUG2MK2A/#IQK3EPIIRE3V4BPNAMJ2ZST3NUG2MK2A
  .. _`issue 14177`: https://github.com/numpy/numpy/issues/14177
  .. _`filprofiler`: https://github.com/pythonspeed/filprofiler/blob/master/design/allocator-overrides.md
  .. _`electric fence`: https://github.com/boundarydevices/efence
diff --git a/doc/neps/open.rst.tmpl b/doc/neps/open.rst.tmpl

new file mode 100644 (file)

index 0000000..78cdafb
--- /dev/null
+++ b/doc/neps/open.rst.tmpl
@@ -0,0 +1,9 @@
+Open NEPs (under consideration)
+-------------------------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] == 'Draft' %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
diff --git a/doc/neps/provisional.rst.tmpl b/doc/neps/provisional.rst.tmpl

new file mode 100644 (file)

index 0000000..4c289db
--- /dev/null
+++ b/doc/neps/provisional.rst.tmpl
@@ -0,0 +1,9 @@
+Provisional NEPs (provisionally accepted; interface may change)
+---------------------------------------------------------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] == 'Provisional' %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
diff --git a/doc/neps/rejected.rst.tmpl b/doc/neps/rejected.rst.tmpl

new file mode 100644 (file)

index 0000000..5898a8c
--- /dev/null
+++ b/doc/neps/rejected.rst.tmpl
@@ -0,0 +1,9 @@
+Rejected and Withdrawn NEPs
+---------------------------
+
+.. toctree::
+   :maxdepth: 1
+
+{% for nep, tags in neps.items() if tags['Status'] in ('Rejected', 'Withdrawn') %}
+   {{ tags['Title'] }} <{{ tags['Filename'] }}>
+{% endfor %}
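Reviewer note: the three new templates above differ only in the ``Status`` filter applied inside the Jinja ``for`` loop. The selection they perform can be sketched in plain Python as follows. This is a minimal sketch, not code from the commit: the ``select_neps`` helper, the ``STATUS_FILTERS`` table, and the sample metadata are hypothetical, and only the three filters visible in the templates above are modelled.

```python
# Hypothetical helper mirroring the templates' filters; not part of NumPy.
STATUS_FILTERS = {
    "open": ("Draft",),
    "provisional": ("Provisional",),
    "rejected": ("Rejected", "Withdrawn"),
}


def select_neps(neps, category):
    """Return 'Title <Filename>' toctree entries for one category."""
    wanted = STATUS_FILTERS[category]
    return [
        f"{tags['Title']} <{tags['Filename']}>"
        for _nep, tags in neps.items()
        if tags["Status"] in wanted
    ]


# Made-up sample metadata, shaped like the dict the templates iterate over.
sample = {
    1: {"Title": "Example A", "Filename": "nep-0001", "Status": "Draft"},
    2: {"Title": "Example B", "Filename": "nep-0002", "Status": "Withdrawn"},
    3: {"Title": "Example C", "Filename": "nep-0003", "Status": "Provisional"},
}

for category in STATUS_FILTERS:
    print(category, select_neps(sample, category))
```

Keeping the filter inside each template means the build script can pass the same metadata dict to every category file.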
diff --git a/doc/neps/tools/build_index.py b/doc/neps/tools/build_index.py

index 51227a6f127372a5c64dc699043918c9d3dc0990..bcf414ddc872cf6d80b45499d342687cd6143b0b 100644 (file)
--- a/doc/neps/tools/build_index.py
+++ b/doc/neps/tools/build_index.py
@@ -1,6 +1,7 @@
  """
  Scan the directory of nep files and extract their metadata.  The
-metadata is passed to Jinja for filling out `index.rst.tmpl`.
+metadata is passed to Jinja for filling out the toctrees for various NEP
+categories.
  """
  
  import os
@@ -50,7 +51,7 @@ def nep_metadata():
          if not tags['Title'].startswith(f'NEP {nr} — '):
              raise RuntimeError(
                  f'Title for NEP {nr} does not start with "NEP {nr} — " '
-                '(note that — here is a special, enlongated dash). Got: '
+                '(note that — here is a special, elongated dash). Got: '
                  f'    {tags["Title"]!r}')
  
          if tags['Status'] in ('Accepted', 'Rejected', 'Withdrawn'):
@@ -100,14 +101,16 @@ def nep_metadata():
  
      return {'neps': neps, 'has_provisional': has_provisional}
  
-
-infile = 'index.rst.tmpl'
-outfile = 'index.rst'
-
  meta = nep_metadata()
  
-print(f'Compiling {infile} -> {outfile}')
-index = render(infile, meta)
-
-with open(outfile, 'w') as f:
-    f.write(index)
+for nepcat in (
+    "provisional", "accepted", "deferred", "finished", "meta",
+    "open", "rejected",
+):
+    infile = f"{nepcat}.rst.tmpl"
+    outfile = f"{nepcat}.rst"
+
+    print(f'Compiling {infile} -> {outfile}')
+    genf = render(infile, meta)
+    with open(outfile, 'w') as f:
+        f.write(genf)
diff --git a/doc/release/upcoming_changes/21154.improvement.rst b/doc/release/upcoming_changes/21154.improvement.rst

deleted file mode 100644 (file)

index 38630b4..0000000
--- a/doc/release/upcoming_changes/21154.improvement.rst
+++ /dev/null
@@ -1,7 +0,0 @@
-Math C library feature detection now uses correct signatures
-------------------------------------------------------------
-Compiling is preceded by a detection phase to determine whether the
-underlying libc supports certain math operations. Previously this code
-did not respect the proper signatures. Fixing this enables compilation
-for the ``wasm-ld`` backend (compilation for web assembly) and reduces
-the number of warnings.
diff --git a/doc/source/_static/numpy.css b/doc/source/_static/numpy.css

index e0ccaadaff316f5cf7ac9770afdc8a351878014f..b68bb378021a0caf51a0dd6274adb847bb1a4116 100644 (file)
--- a/doc/source/_static/numpy.css
+++ b/doc/source/_static/numpy.css
@@ -22,7 +22,6 @@ h1 {
    color: #013243; /* warm black */
  }
  
-
  h2 {
    color: #4d77cf; /* han blue */
    letter-spacing: -.03em;
@@ -33,6 +32,36 @@ h3 {
    letter-spacing: -.03em;
  }
  
+/* Style the active version button.
+
+- dev: orange
+- stable: green
+- old, PR: red
+
+Colors from:
+
+Wong, B. Points of view: Color blindness.
+Nat Methods 8, 441 (2011). https://doi.org/10.1038/nmeth.1618
+*/
+
+/* If the active version has the name "dev", style it orange */
+#version_switcher_button[data-active-version-name*="dev"] {
+  background-color: #E69F00;
+  border-color: #E69F00;
+}
+
+/* green for `stable` */
+#version_switcher_button[data-active-version-name*="stable"] {
+  background-color: #009E73;
+  border-color: #009E73;
+}
+
+/* red for `old` */
+#version_switcher_button:not([data-active-version-name*="stable"], [data-active-version-name*="dev"], [data-active-version-name=""]) {
+  background-color: #980F0F;
+  border-color: #980F0F;
+}
+
  /* Main page overview cards */
  
  .intro-card {
diff --git a/doc/source/_static/versions.json b/doc/source/_static/versions.json

new file mode 100644 (file)

index 0000000..104a30d
--- /dev/null
+++ b/doc/source/_static/versions.json
@@ -0,0 +1,62 @@
+[
+    {
+        "name": "dev",
+        "version": "devdocs",
+        "url": "https://numpy.org/devdocs/"
+    },
+    {
+        "name": "1.22 (stable)",
+        "version": "stable",
+        "url": "https://numpy.org/doc/stable/"
+    },
+    {
+        "name": "1.22",
+        "version": "1.22",
+        "url": "https://numpy.org/doc/1.22/"
+    },
+    {
+        "name": "1.21",
+        "version": "1.21",
+        "url": "https://numpy.org/doc/1.21/"
+    },
+    {
+        "name": "1.20",
+        "version": "1.20",
+        "url": "https://numpy.org/doc/1.20/"
+    },
+    {
+        "name": "1.19",
+        "version": "1.19",
+        "url": "https://numpy.org/doc/1.19/"
+    },
+    {
+        "name": "1.18",
+        "version": "1.18",
+        "url": "https://numpy.org/doc/1.18/"
+    },
+    {
+        "name": "1.17",
+        "version": "1.17",
+        "url": "https://numpy.org/doc/1.17/"
+    },
+    {
+        "name": "1.16",
+        "version": "1.16",
+        "url": "https://numpy.org/doc/1.16/"
+    },
+    {
+        "name": "1.15",
+        "version": "1.15",
+        "url": "https://numpy.org/doc/1.15/"
+    },
+    {
+        "name": "1.14",
+        "version": "1.14",
+        "url": "https://numpy.org/doc/1.14/"
+    },
+    {
+        "name": "1.13",
+        "version": "1.13",
+        "url": "https://numpy.org/doc/1.13/"
+    }
+]
diff --git a/doc/source/conf.py b/doc/source/conf.py

index c73d6d7bc1b60a8ddf7b919a8bad1a4cd79d38e4..ef31cfaac0faa20c5f7b37436ceb1fd98b982aa8 100644 (file)
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -171,6 +171,16 @@ html_logo = '_static/numpylogo.svg'
  
  html_favicon = '_static/favicon/favicon.ico'
  
+# Set up the version switcher.  The versions.json is stored in the doc repo.
+if os.environ.get('CIRCLE_JOB', False) and \
+        os.environ.get('CIRCLE_BRANCH', '') != 'main':
+    # For PR, name is set to its ref
+    switcher_version = os.environ['CIRCLE_BRANCH']
+elif ".dev" in version:
+    switcher_version = "devdocs"
+else:
+    switcher_version = f"doc/{version}"
+
  html_theme_options = {
    "logo_link": "index",
    "github_url": "https://github.com/numpy/numpy",
@@ -179,6 +189,12 @@ html_theme_options = {
    "external_links": [
        {"name": "Learn", "url": "https://numpy.org/numpy-tutorials/"}
        ],
+  # Add light/dark mode and documentation version switcher:
+  "navbar_end": ["version-switcher", "navbar-icon-links"],
+  "switcher": {
+      "version_match": switcher_version,
+      "json_url": "https://numpy.org/doc/_static/versions.json",
+  },
  }
  
  html_title = "%s v%s Manual" % (project, version)
@@ -312,6 +328,7 @@ intersphinx_mapping = {
      'pytest': ('https://docs.pytest.org/en/stable', None),
      'numpy-tutorials': ('https://numpy.org/numpy-tutorials', None),
      'numpydoc': ('https://numpydoc.readthedocs.io/en/latest', None),
+    'dlpack': ('https://dmlc.github.io/dlpack/latest', None)
  }
  
  
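Reviewer note: the branch logic that picks ``switcher_version`` in the hunk above can be exercised in isolation. The sketch below restates it as a function so the three outcomes are easy to check. The ``pick_switcher_version`` name and the ``environ`` parameter are assumptions introduced for illustration; ``conf.py`` itself reads ``os.environ`` directly and assigns a module-level variable.

```python
import os


def pick_switcher_version(version, environ=None):
    """Mirror the switcher_version selection added to conf.py (sketch only)."""
    env = os.environ if environ is None else environ
    if env.get("CIRCLE_JOB", False) and env.get("CIRCLE_BRANCH", "") != "main":
        # CircleCI build that is not on main: use the branch ref as the name.
        return env["CIRCLE_BRANCH"]
    if ".dev" in version:
        # Development builds map to the devdocs entry.
        return "devdocs"
    # Release builds map to doc/<version>.
    return f"doc/{version}"


# Illustrative inputs only; the version strings and branch name are made up.
print(pick_switcher_version("1.23.0", {}))
print(pick_switcher_version("1.24.0.dev0", {}))
print(pick_switcher_version(
    "1.23.0", {"CIRCLE_JOB": "build", "CIRCLE_BRANCH": "pull/123"}))
```

Passing the environment as a mapping keeps the sketch testable without modifying the real process environment.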
diff --git a/doc/source/dev/development_advanced_debugging.rst b/doc/source/dev/development_advanced_debugging.rst

index 18a7f6ae9ad18f864a8ba71a1755851fd97f6c53..2dbd6ac228192b3c6167a1ef3537dd30a6fd36ba 100644 (file)
--- a/doc/source/dev/development_advanced_debugging.rst
+++ b/doc/source/dev/development_advanced_debugging.rst
@@ -106,7 +106,7 @@ Valgrind is a powerful tool to find certain memory access problems and should
  be run on complicated C code.
  Basic use of ``valgrind`` usually requires no more than::
  
-    PYTHONMALLOC=malloc python runtests.py
+    PYTHONMALLOC=malloc valgrind python runtests.py
  
  where ``PYTHONMALLOC=malloc`` is necessary to avoid false positives from python
  itself.
diff --git a/doc/source/dev/development_environment.rst b/doc/source/dev/development_environment.rst

index 37cf6f7afb50bef01d096c969536c406a757d9ff..4772366d26966b3cdb5042c6d51a1176cd18bf50 100644 (file)
--- a/doc/source/dev/development_environment.rst
+++ b/doc/source/dev/development_environment.rst
@@ -18,6 +18,10 @@ sources needs some additional steps, which are explained below.  For the rest
  of this chapter we assume that you have set up your git repo as described in
  :ref:`using-git`.
  
+.. note:: If you are having trouble building NumPy from source or setting up
+   your local development environment, you can try
+   to :ref:`build NumPy with Gitpod <development-gitpod>`.
+
  .. _testing-builds:
  
  Testing builds
@@ -190,9 +194,9 @@ That also takes extra arguments, like ``--pdb`` which drops you into the Python
  debugger when a test fails or an exception is raised.
  
  Running tests with `tox`_ is also supported.  For example, to build NumPy and
-run the test suite with Python 3.7, use::
+run the test suite with Python 3.9, use::
  
-    $ tox -e py37
+    $ tox -e py39
  
  For more extensive information, see :ref:`testing-guidelines`
  
diff --git a/doc/source/dev/development_workflow.rst b/doc/source/dev/development_workflow.rst

index 8c56f6fb2cbc57470d99f82fd9fb2e2adf0d4dc3..502c8993941effd5142965bfe16b689853404e7c 100644 (file)
--- a/doc/source/dev/development_workflow.rst
+++ b/doc/source/dev/development_workflow.rst
@@ -185,8 +185,60 @@ Standard acronyms to start the commit message with are::
     REV: revert an earlier commit
     STY: style fix (whitespace, PEP8)
     TST: addition or modification of tests
+   TYP: static typing
     REL: related to releasing numpy
  
+Commands to skip continuous integration
+```````````````````````````````````````
+
+By default a lot of continuous integration (CI) jobs are run for every PR,
+from running the test suite on different operating systems and hardware
+platforms to building the docs. In some cases you already know that CI isn't
+needed (or not all of it), for example if you work on CI config files, text in
+the README, or other files that aren't involved in regular build, test or docs
+sequences. In such cases you may explicitly skip CI by including one of these
+fragments in your commit message::
+
+   ``[ci skip]``: skip as much CI as possible (not all jobs can be skipped)
+   ``[skip github]``: skip GitHub Actions "build numpy and run tests" jobs
+   ``[skip travis]``: skip TravisCI jobs
+   ``[skip azurepipelines]``: skip Azure jobs
+
+*Note*: unfortunately not all CI systems implement this feature well, or at all.
+CircleCI supports ``ci skip`` but has no command to skip only CircleCI.
+Azure chooses to still run jobs with skip commands on PRs; the jobs only get
+skipped on merging to master.
+
+Test building wheels
+```````````````````````````````````````
+
+NumPy currently uses `cibuildwheel <https://cibuildwheel.readthedocs.io/en/stable/>`_
+in order to build wheels through continuous integration services. To save resources, the
+cibuildwheel wheel builders are not run by default on every single PR or commit to main.
+
+If you would like to test that your pull request does not break the wheel builders,
+you may either append ``[wheel build]`` to the end of the commit message
+or add one of the following labels to the pull request (if you have the permissions to do so):
+
+- ``36 - Build``: for pull requests changing build processes/configurations
+- ``03 - Maintenance``: for pull requests upgrading dependencies
+- ``14 - Release``: for pull requests preparing for a release
+
+The wheels built via github actions (including 64-bit linux, macOS, and
+windows, arm64 macOS, and 32-bit windows) will be uploaded as artifacts in zip
+files. You can access them from the Summary page of the "Wheel builder"
+Action_. The aarch64 wheels built via travis_ CI are not available as artifacts.
+Additionally, the wheels will be uploaded to
+https://anaconda.org/scipy-wheels-nightly/ on the following conditions:
+
+- by a weekly cron job or
+- if the github action or travis build has been manually triggered, which requires appropriate permissions
+
+The wheels will be uploaded to https://anaconda.org/multibuild-wheels-staging/
+if the build was triggered by a tag to the repo that begins with ``v``.
+
+.. _Action: https://github.com/numpy/numpy/actions
+.. _travis: https://app.travis-ci.com/github/numpy/numpy/builds
  
  .. _workflow_mailing_list:
  
diff --git a/doc/source/dev/gitwash/development_setup.rst b/doc/source/dev/gitwash/development_setup.rst

index 2be7125da032a5283049a271345f664e013bf6d8..a2fc61d2ed8c5ef1590e456934118babe2e7ac0c 100644 (file)
--- a/doc/source/dev/gitwash/development_setup.rst
+++ b/doc/source/dev/gitwash/development_setup.rst
@@ -112,7 +112,7 @@ Look it over
     - the ``main`` branch you just cloned on your own machine
     - the ``main`` branch from your fork on GitHub, which git named
       ``origin`` by default
-   - the ``main`` branch on the the main NumPy repo, which you named
+   - the ``main`` branch on the main NumPy repo, which you named
       ``upstream``.
  
     ::
diff --git a/doc/source/dev/gitwash/following_latest.rst b/doc/source/dev/gitwash/following_latest.rst

index 0e98b4ec41d60261a87a1b3711d7727ff35b9a2d..ffe753180e1c8185f4d9d8feb981b6af03444308 100644 (file)
--- a/doc/source/dev/gitwash/following_latest.rst
+++ b/doc/source/dev/gitwash/following_latest.rst
@@ -16,7 +16,7 @@ Get the local copy of the code
  
  From the command line::
  
-   git clone git://github.com/numpy/numpy.git
+   git clone https://github.com/numpy/numpy.git
  
  You now have a copy of the code tree in the new ``numpy`` directory.
  If this doesn't work you can try the alternative read-only url::
diff --git a/doc/source/dev/howto-docs.rst b/doc/source/dev/howto-docs.rst

index 93fec509c2370cf0233bd2ff9121a6c23741aa18..ff4a9f6d550a1276594f66f3f791495c4e2b05a1 100644 (file)
--- a/doc/source/dev/howto-docs.rst
+++ b/doc/source/dev/howto-docs.rst
@@ -315,7 +315,7 @@ Sub-config files can accept any of Doxygen_ `configuration options <https://www.
  but do not override or re-initialize any configuration option,
  rather only use the concatenation operator "+=". For example::
  
-   # to specfiy certain headers
+   # to specify certain headers
     INPUT += @CUR_DIR/header1.h \
              @CUR_DIR/header2.h
     # to add all headers in certain path
diff --git a/doc/source/dev/index.rst b/doc/source/dev/index.rst

index a8c9692679b335d46ecd09074c5decae56ce603a..46ff3a5b29c904faf052d63220ddf74945a16d9b 100644 (file)
--- a/doc/source/dev/index.rst
+++ b/doc/source/dev/index.rst
@@ -27,8 +27,10 @@ the `numpy-discussion mailing list <https://mail.python.org/mailman/listinfo/num
  or on `GitHub <https://github.com/numpy/numpy>`__ (open an issue or comment on a
  relevant issue). These are our preferred communication channels (open source is open
  by nature!), however if you prefer to discuss in private first, please reach out to
-our community coordinators at `numpy-team@googlegroups.com` or `numpy-team.slack.com`
-(send an email to `numpy-team@googlegroups.com` for an invite the first time).
+our community coordinators at `numpy-team@googlegroups.com
+<mailto:numpy-team@googlegroups.com>`_ or `numpy-team.slack.com
+<https://numpy-team.slack.com>`__ (send an email to `numpy-team@googlegroups.com`_ for an
+invite the first time).
  
  Development process - summary
  =============================
@@ -53,7 +55,7 @@ Here's the short summary, complete TOC links are below:
  
        git remote add upstream https://github.com/numpy/numpy.git
  
-   * Now, `git remote -v` will show two remote repositories named:
+   * Now, ``git remote -v`` will show two remote repositories named:
  
       - ``upstream``, which refers to the ``numpy`` repository
       - ``origin``, which refers to your personal fork
diff --git a/doc/source/f2py/advanced.rst b/doc/source/f2py/advanced.rst

index c8efbaadb426a2a1bf79f17bdd8f84f9a7285f2e..cf9984380123219935f0b423068856330b54965b 100644 (file)
--- a/doc/source/f2py/advanced.rst
+++ b/doc/source/f2py/advanced.rst
@@ -79,14 +79,18 @@ that defines mapping between Fortran type::
  
  and the corresponding <C type>. The <C type> can be one of the following::
  
+    double
+    float
+    long_double
      char
      signed_char
+    unsigned_char
      short
+    unsigned_short
      int
+    long
      long_long
-    float
-    double
-    long_double
+    unsigned
      complex_float
      complex_double
      complex_long_double
diff --git a/doc/source/f2py/buildtools/cmake.rst b/doc/source/f2py/buildtools/cmake.rst

index 3ed5a2beea142c0a4bf04f062c06b37c03f7ea8f..8c654c73e83efc6c4afd03c40eac63afa44d0e32 100644 (file)
--- a/doc/source/f2py/buildtools/cmake.rst
+++ b/doc/source/f2py/buildtools/cmake.rst
@@ -48,9 +48,9 @@ with the ``cython`` information.
  
      ls .
      # CMakeLists.txt fib1.f
-    mkdir build && cd build
-    cmake ..
-    make
+    cmake -S . -B build
+    cmake --build build
+    cd build
      python -c "import numpy as np; import fibby; a = np.zeros(9); fibby.fib(a); print (a)"
      # [ 0.  1.  1.  2.  3.  5.  8. 13. 21.]
  
diff --git a/doc/source/f2py/buildtools/index.rst b/doc/source/f2py/buildtools/index.rst

index aa41fd37f01a0c59a9e22d90e1841c237c867f3e..48ff927df0507c417d8756c81d2a2d3d90ddfef7 100644 (file)
--- a/doc/source/f2py/buildtools/index.rst
+++ b/doc/source/f2py/buildtools/index.rst
@@ -27,6 +27,7 @@ Building an extension module which includes Python and Fortran consists of:
  
    + A ``C`` wrapper file is always created
    + Code with modules require an additional ``.f90`` wrapper
+  + Code with functions generate an additional ``.f`` wrapper
  
  - ``fortranobject.{c,h}``
  
@@ -46,7 +47,7 @@ Fortran 77 programs
     - Generates
  
       + ``blahmodule.c``
-     + ``f2pywrappers.f``
+     + ``blah-f2pywrappers.f``
  
     When no ``COMMON`` blocks are present only a ``C`` wrapper file is generated.
     Wrappers are also generated to rewrite assumed shape arrays as automatic
@@ -57,10 +58,12 @@ Fortran 90 programs
     - Generates:
  
       + ``blahmodule.c``
+     + ``blah-f2pywrappers.f``
       + ``blah-f2pywrappers2.f90``
  
-   The secondary wrapper is used to handle code which is subdivided into
-   modules. It rewrites assumed shape arrays as automatic arrays.
+   The ``f90`` wrapper is used to handle code which is subdivided into
+   modules. The ``f`` wrapper makes ``subroutines`` for ``functions``. It
+   rewrites assumed shape arrays as automatic arrays.
  
  Signature files
     - Input file ``blah.pyf``
@@ -68,7 +71,7 @@ Signature files
  
       + ``blahmodule.c``
       + ``blah-f2pywrappers2.f90`` (occasionally)
-     + ``f2pywrappers.f`` (occasionally)
+     + ``blah-f2pywrappers.f`` (occasionally)
  
     Signature files ``.pyf`` do not signal their language standard via the file
     extension, they may generate the F90 and F77 specific wrappers depending on
@@ -77,7 +80,10 @@ Signature files
  
  .. note::
  
-   The signature file output situation is being reconsidered in `issue 20385`_ .
+   From NumPy ``1.22.4`` onwards, ``f2py`` will deterministically generate
+   wrapper files based on the input file Fortran standard (F77 or greater).
+   ``--skip-empty-wrappers`` can be passed to ``f2py`` to restore the previous
+   behaviour of only generating wrappers when needed by the input.
  
  
  In theory keeping the above requirements in hand, any build system can be
diff --git a/doc/source/f2py/buildtools/meson.rst b/doc/source/f2py/buildtools/meson.rst

index d98752e65f800c4afc6c769be373467b920a89bb..502d3e21159a6b6dd26976f621cf4f145866e356 100644 (file)
--- a/doc/source/f2py/buildtools/meson.rst
+++ b/doc/source/f2py/buildtools/meson.rst
@@ -83,6 +83,13 @@ A major pain point in the workflow defined above, is the manual tracking of
  inputs. Although it would require more effort to figure out the actual outputs
  for reasons discussed in :ref:`f2py-bldsys`.
  
+.. note::
+
+   From NumPy ``1.22.4`` onwards, ``f2py`` will deterministically generate
+   wrapper files based on the input file Fortran standard (F77 or greater).
+   ``--skip-empty-wrappers`` can be passed to ``f2py`` to restore the previous
+   behaviour of only generating wrappers when needed by the input.
+
  However, we can augment our workflow in a straightforward to take into account
  files for which the outputs are known when the build system is set up.
  
diff --git a/doc/source/f2py/buildtools/skbuild.rst b/doc/source/f2py/buildtools/skbuild.rst

index af18ea43bfd0f8e95cf34b4627a6b5044d14f86a..f1a0bf65e7cc467d17699b7ea32c4d2b3d298667 100644 (file)
--- a/doc/source/f2py/buildtools/skbuild.rst
+++ b/doc/source/f2py/buildtools/skbuild.rst
@@ -44,9 +44,9 @@ The resulting extension can be built and loaded in the standard workflow.
  
      ls .
      # CMakeLists.txt fib1.f
-    mkdir build && cd build
-    cmake ..
-    make
+    cmake -S . -B build
+    cmake --build build
+    cd build
      python -c "import numpy as np; import fibby; a = np.zeros(9); fibby.fib(a); print (a)"
      # [ 0.  1.  1.  2.  3.  5.  8. 13. 21.]
  
diff --git a/doc/source/f2py/code/CMakeLists.txt b/doc/source/f2py/code/CMakeLists.txt

index 62ff193bbb2d82106aa2d8abf7d4a05ce2efd77d..6f5170ad534821fd9e8edf57e890a3f2fd6d78fd 100644 (file)
--- a/doc/source/f2py/code/CMakeLists.txt
+++ b/doc/source/f2py/code/CMakeLists.txt
@@ -1,12 +1,10 @@
-### setup project ###
-cmake_minimum_required(VERSION 3.17.3) # 3.17 > for Python3_SOABI
-set(CMAKE_CXX_STANDARD_REQUIRED ON)
+cmake_minimum_required(VERSION 3.18) # Needed to avoid requiring embedded Python libs too
  
  project(fibby
    VERSION 1.0
    DESCRIPTION "FIB module"
    LANGUAGES C Fortran
-  )
+)
  
  # Safety net
  if(PROJECT_SOURCE_DIR STREQUAL PROJECT_BINARY_DIR)
@@ -16,65 +14,52 @@ if(PROJECT_SOURCE_DIR STREQUAL PROJECT_BINARY_DIR)
    )
  endif()
  
-# Grab Python
-find_package(Python3 3.9 REQUIRED
-  COMPONENTS Interpreter Development NumPy)
+# Grab Python, 3.8 or newer
+find_package(Python 3.8 REQUIRED
+  COMPONENTS Interpreter Development.Module NumPy)
  
  # Grab the variables from a local Python installation
  # F2PY headers
  execute_process(
-  COMMAND "${Python3_EXECUTABLE}"
+  COMMAND "${Python_EXECUTABLE}"
    -c "import numpy.f2py; print(numpy.f2py.get_include())"
    OUTPUT_VARIABLE F2PY_INCLUDE_DIR
    OUTPUT_STRIP_TRAILING_WHITESPACE
  )
  
-# Project scope; consider using target_include_directories instead
-include_directories(
-  BEFORE
-  ${Python3_INCLUDE_DIRS}
-  ${Python3_NumPy_INCLUDE_DIRS}
-  ${F2PY_INCLUDE_DIR}
-  )
-
-message(STATUS ${Python3_INCLUDE_DIRS})
-message(STATUS ${F2PY_INCLUDE_DIR})
-message(STATUS ${Python3_NumPy_INCLUDE_DIRS})
+# Print out the discovered paths
+include(CMakePrintHelpers)
+cmake_print_variables(Python_INCLUDE_DIRS)
+cmake_print_variables(F2PY_INCLUDE_DIR)
+cmake_print_variables(Python_NumPy_INCLUDE_DIRS)
  
-# Vars
+# Common variables
  set(f2py_module_name "fibby")
  set(fortran_src_file "${CMAKE_SOURCE_DIR}/fib1.f")
  set(f2py_module_c "${f2py_module_name}module.c")
-set(generated_module_file "${f2py_module_name}${Python3_SOABI}")
  
  # Generate sources
  add_custom_target(
    genpyf
    DEPENDS "${CMAKE_CURRENT_BINARY_DIR}/${f2py_module_c}"
-  )
+)
  add_custom_command(
    OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${f2py_module_c}"
-  COMMAND ${Python3_EXECUTABLE}  -m "numpy.f2py"
+  COMMAND ${Python_EXECUTABLE}  -m "numpy.f2py"
                     "${fortran_src_file}"
                     -m "fibby"
                     --lower # Important
    DEPENDS fib1.f # Fortran source
-  )
+)
  
  # Set up target
-add_library(${CMAKE_PROJECT_NAME} SHARED
+Python_add_library(${CMAKE_PROJECT_NAME} MODULE WITH_SOABI
    "${CMAKE_CURRENT_BINARY_DIR}/${f2py_module_c}" # Generated
    "${F2PY_INCLUDE_DIR}/fortranobject.c" # From NumPy
    "${fortran_src_file}" # Fortran source(s)
-  )
+)
  
  # Depend on sources
+target_link_libraries(${CMAKE_PROJECT_NAME} PRIVATE Python::NumPy)
  add_dependencies(${CMAKE_PROJECT_NAME} genpyf)
-
-set_target_properties(
-     ${CMAKE_PROJECT_NAME}
-    PROPERTIES
-        PREFIX ""
-        OUTPUT_NAME "${CMAKE_PROJECT_NAME}"
-        LINKER_LANGUAGE C
-    )
+target_include_directories(${CMAKE_PROJECT_NAME} PRIVATE "${F2PY_INCLUDE_DIR}")
diff --git a/doc/source/f2py/code/CMakeLists_skbuild.txt b/doc/source/f2py/code/CMakeLists_skbuild.txt

index 97bc5c744d41cffc8400908f91390408c872a745..f2d6b69c1ba8c05cda8d9eece0a2b9b7a51a56b5 100644 (file)
--- a/doc/source/f2py/code/CMakeLists_skbuild.txt
+++ b/doc/source/f2py/code/CMakeLists_skbuild.txt
@@ -1,6 +1,5 @@
  ### setup project ###
-cmake_minimum_required(VERSION 3.17.3)
-set(CMAKE_CXX_STANDARD_REQUIRED ON)
+cmake_minimum_required(VERSION 3.9)
  
  project(fibby
    VERSION 1.0
@@ -16,74 +15,81 @@ if(PROJECT_SOURCE_DIR STREQUAL PROJECT_BINARY_DIR)
    )
  endif()
  
-# Grab Python
-find_package(Python3 3.9 REQUIRED
-  COMPONENTS Interpreter Development)
-
  # Ensure scikit-build modules
  if (NOT SKBUILD)
-  # Kanged -->https://github.com/Kitware/torch_liberator/blob/master/CMakeLists.txt
+  find_package(PythonInterp 3.8 REQUIRED)
+  # Kanged --> https://github.com/Kitware/torch_liberator/blob/master/CMakeLists.txt
    # If skbuild is not the driver; include its utilities in CMAKE_MODULE_PATH
    execute_process(
-  COMMAND "${Python3_EXECUTABLE}"
-  -c "import os, skbuild; print(os.path.dirname(skbuild.__file__))"
-  OUTPUT_VARIABLE SKBLD_DIR
-  OUTPUT_STRIP_TRAILING_WHITESPACE
+    COMMAND "${PYTHON_EXECUTABLE}"
+    -c "import os, skbuild; print(os.path.dirname(skbuild.__file__))"
+    OUTPUT_VARIABLE SKBLD_DIR
+    OUTPUT_STRIP_TRAILING_WHITESPACE
    )
-  set(SKBLD_CMAKE_DIR "${SKBLD_DIR}/resources/cmake")
-  list(APPEND CMAKE_MODULE_PATH ${SKBLD_CMAKE_DIR})
+  list(APPEND CMAKE_MODULE_PATH "${SKBLD_DIR}/resources/cmake")
+  message(STATUS "Looking in ${SKBLD_DIR}/resources/cmake for CMake modules")
  endif()
  
  # scikit-build style includes
  find_package(PythonExtensions REQUIRED) # for ${PYTHON_EXTENSION_MODULE_SUFFIX}
-find_package(NumPy REQUIRED) # for ${NumPy_INCLUDE_DIRS}
-find_package(F2PY REQUIRED) # for ${F2PY_INCLUDE_DIR}
+
+# Grab the variables from a local Python installation
+# NumPy headers
+execute_process(
+  COMMAND "${PYTHON_EXECUTABLE}"
+  -c "import numpy; print(numpy.get_include())"
+  OUTPUT_VARIABLE NumPy_INCLUDE_DIRS
+  OUTPUT_STRIP_TRAILING_WHITESPACE
+)
+# F2PY headers
+execute_process(
+  COMMAND "${PYTHON_EXECUTABLE}"
+  -c "import numpy.f2py; print(numpy.f2py.get_include())"
+  OUTPUT_VARIABLE F2PY_INCLUDE_DIR
+  OUTPUT_STRIP_TRAILING_WHITESPACE
+)
  
  # Prepping the module
  set(f2py_module_name "fibby")
  set(fortran_src_file "${CMAKE_SOURCE_DIR}/fib1.f")
-set(generated_module_file ${f2py_module_name}${PYTHON_EXTENSION_MODULE_SUFFIX})
+set(f2py_module_c "${f2py_module_name}module.c")
  
  # Target for enforcing dependencies
-add_custom_target(${f2py_module_name} ALL
+add_custom_target(genpyf
    DEPENDS "${fortran_src_file}"
-  )
-
-# Custom command for generating .c
+)
  add_custom_command(
-  OUTPUT "${f2py_module_name}module.c"
-  COMMAND ${F2PY_EXECUTABLE}
-    -m ${f2py_module_name}
-    ${fortran_src_file}
-    --lower
-  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
-  DEPENDS ${fortran_src_file}
-  )
+  OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${f2py_module_c}"
+  COMMAND ${PYTHON_EXECUTABLE}  -m "numpy.f2py"
+                   "${fortran_src_file}"
+                   -m "fibby"
+                   --lower # Important
+  DEPENDS fib1.f # Fortran source
+)
  
-add_library(${generated_module_file} MODULE
+add_library(${CMAKE_PROJECT_NAME} MODULE
              "${f2py_module_name}module.c"
              "${F2PY_INCLUDE_DIR}/fortranobject.c"
              "${fortran_src_file}")
  
-target_include_directories(${generated_module_file} PUBLIC
-                           ${F2PY_INCLUDE_DIRS}
+target_include_directories(${CMAKE_PROJECT_NAME} PUBLIC
+                           ${F2PY_INCLUDE_DIR}
+                           ${NumPy_INCLUDE_DIRS}
                             ${PYTHON_INCLUDE_DIRS})
-set_target_properties(${generated_module_file} PROPERTIES SUFFIX "")
-set_target_properties(${generated_module_file} PROPERTIES PREFIX "")
+set_target_properties(${CMAKE_PROJECT_NAME} PROPERTIES SUFFIX "${PYTHON_EXTENSION_MODULE_SUFFIX}")
+set_target_properties(${CMAKE_PROJECT_NAME} PROPERTIES PREFIX "")
  
  # Linker fixes
  if (UNIX)
    if (APPLE)
-    set_target_properties(${generated_module_file} PROPERTIES
+    set_target_properties(${CMAKE_PROJECT_NAME} PROPERTIES
      LINK_FLAGS  '-Wl,-dylib,-undefined,dynamic_lookup')
    else()
-    set_target_properties(${generated_module_file} PROPERTIES
+    set_target_properties(${CMAKE_PROJECT_NAME} PROPERTIES
    LINK_FLAGS  '-Wl,--allow-shlib-undefined')
    endif()
  endif()
  
-if (SKBUILD)
-  install(TARGETS ${generated_module_file} DESTINATION fibby)
-else()
-  install(TARGETS ${generated_module_file} DESTINATION ${CMAKE_SOURCE_DIR}/fibby)
-endif()
+add_dependencies(${CMAKE_PROJECT_NAME} genpyf)
+
+install(TARGETS ${CMAKE_PROJECT_NAME} DESTINATION fibby)
diff --git a/doc/source/f2py/code/add-edited.pyf b/doc/source/f2py/code/add-edited.pyf

new file mode 100644 (file)

index 0000000..4d20472
--- /dev/null
+++ b/doc/source/f2py/code/add-edited.pyf
@@ -0,0 +1,6 @@
+subroutine zadd(a,b,c,n) ! in :add:add.f
+   double complex dimension(n) :: a
+   double complex dimension(n) :: b
+   double complex intent(out),dimension(n) :: c
+   integer intent(hide),depend(a) :: n=len(a)
+end subroutine zadd
+\ No newline at end of file
diff --git a/doc/source/f2py/code/add-improved.f b/doc/source/f2py/code/add-improved.f

new file mode 100644 (file)

index 0000000..65f70c9
--- /dev/null
+++ b/doc/source/f2py/code/add-improved.f
@@ -0,0 +1,16 @@
+C
+      SUBROUTINE ZADD(A,B,C,N)
+C
+CF2PY INTENT(OUT) :: C
+CF2PY INTENT(HIDE) :: N
+CF2PY DOUBLE COMPLEX :: A(N)
+CF2PY DOUBLE COMPLEX :: B(N)
+CF2PY DOUBLE COMPLEX :: C(N)
+      DOUBLE COMPLEX A(*)
+      DOUBLE COMPLEX B(*)
+      DOUBLE COMPLEX C(*)
+      INTEGER N
+      DO 20 J = 1, N
+         C(J) = A(J) + B(J)
+ 20   CONTINUE
+      END
+\ No newline at end of file
diff --git a/doc/source/f2py/code/add-test.f b/doc/source/f2py/code/add-test.f

new file mode 100644 (file)

index 0000000..1d52e47
--- /dev/null
+++ b/doc/source/f2py/code/add-test.f
@@ -0,0 +1,10 @@
+        subroutine addb(k)
+          real(8), intent(inout) :: k(:)
+          k=k+1
+        endsubroutine
+
+        subroutine addc(w,k)
+          real(8), intent(in) :: w(:)
+          real(8), intent(out) :: k(size(w))
+          k=w+1
+        endsubroutine
+\ No newline at end of file
diff --git a/doc/source/f2py/code/add.f b/doc/source/f2py/code/add.f

new file mode 100644 (file)

index 0000000..5e7556b
--- /dev/null
+++ b/doc/source/f2py/code/add.f
@@ -0,0 +1,11 @@
+C
+      SUBROUTINE ZADD(A,B,C,N)
+C
+      DOUBLE COMPLEX A(*)
+      DOUBLE COMPLEX B(*)
+      DOUBLE COMPLEX C(*)
+      INTEGER N
+      DO 20 J = 1, N
+         C(J) = A(J)+B(J)
+ 20   CONTINUE
+      END
diff --git a/doc/source/f2py/code/add.pyf b/doc/source/f2py/code/add.pyf

new file mode 100644 (file)

index 0000000..d2583e3
--- /dev/null
+++ b/doc/source/f2py/code/add.pyf
@@ -0,0 +1,6 @@
+subroutine zadd(a,b,c,n) ! in :add:add.f
+   double complex dimension(*) :: a
+   double complex dimension(*) :: b
+   double complex dimension(*) :: c
+   integer :: n
+end subroutine zadd
+\ No newline at end of file
diff --git a/doc/source/f2py/code/filter.f b/doc/source/f2py/code/filter.f

new file mode 100644 (file)

index 0000000..fb44343
--- /dev/null
+++ b/doc/source/f2py/code/filter.f
@@ -0,0 +1,19 @@
+C
+      SUBROUTINE DFILTER2D(A,B,M,N)
+C
+      DOUBLE PRECISION A(M,N)
+      DOUBLE PRECISION B(M,N)
+      INTEGER N, M
+CF2PY INTENT(OUT) :: B
+CF2PY INTENT(HIDE) :: N
+CF2PY INTENT(HIDE) :: M
+      DO 20 I = 2,M-1
+         DO 40 J = 2,N-1
+            B(I,J) = A(I,J) +
+     &           (A(I-1,J)+A(I+1,J) +
+     &           A(I,J-1)+A(I,J+1) )*0.5D0 +
+     &           (A(I-1,J-1) + A(I-1,J+1) +
+     &           A(I+1,J-1) + A(I+1,J+1))*0.25D0
+ 40      CONTINUE
+ 20   CONTINUE
+      END
diff --git a/doc/source/f2py/code/meson.build b/doc/source/f2py/code/meson.build

index b756abf8f59aedcc223af28bb14011f2043309bc..b84bf52a994a6577368e7162b71134f2284debf3 100644 (file)
--- a/doc/source/f2py/code/meson.build
+++ b/doc/source/f2py/code/meson.build
@@ -21,10 +21,10 @@ incdir_f2py = run_command(py3,
  ).stdout().strip()
  
  fibby_source = custom_target('fibbymodule.c',
-                            input : ['fib1.f'],
-                            output : ['fibbymodule.c'],
+                            input : ['fib1.f'], # .f so no F90 wrappers
+                            output : ['fibbymodule.c', 'fibby-f2pywrappers.f'],
                              command : [ py3, '-m', 'numpy.f2py', '@INPUT@',
-                            '-m', 'fibby', '--lower' ]
+                            '-m', 'fibby', '--lower']
                              )
  
  inc_np = include_directories(incdir_numpy, incdir_f2py)
diff --git a/doc/source/f2py/code/meson_upd.build b/doc/source/f2py/code/meson_upd.build

index 97bd8d175c7c564fc8b3dee53d64b8f60c2ae6dd..44d69d18222a22958ef90aac33e6215804079268 100644 (file)
--- a/doc/source/f2py/code/meson_upd.build
+++ b/doc/source/f2py/code/meson_upd.build
@@ -21,10 +21,10 @@ incdir_f2py = run_command(py3,
  ).stdout().strip()
  
  fibby_source = custom_target('fibbymodule.c',
-                            input : ['fib1.f'],
-                            output : ['fibbymodule.c'],
+                            input : ['fib1.f'], # .f so no F90 wrappers
+                            output : ['fibbymodule.c', 'fibby-f2pywrappers.f'],
                              command : [ py3, '-m', 'numpy.f2py', '@INPUT@',
-                            '-m', 'fibby', '--lower' ])
+                            '-m', 'fibby', '--lower'])
  
  inc_np = include_directories(incdir_numpy, incdir_f2py)
  
diff --git a/doc/source/f2py/code/myroutine-edited.pyf b/doc/source/f2py/code/myroutine-edited.pyf

new file mode 100644 (file)

index 0000000..14e4554
--- /dev/null
+++ b/doc/source/f2py/code/myroutine-edited.pyf
@@ -0,0 +1,17 @@
+!    -*- f90 -*-
+! Note: the context of this file is case sensitive.
+
+python module myroutine ! in 
+    interface  ! in :myroutine
+        subroutine s(n,m,c,x) ! in :myroutine:myroutine.f90
+            integer intent(in) :: n
+            integer intent(in) :: m
+            real(kind=8) dimension(:),intent(in) :: c
+            real(kind=8) dimension(n,m),intent(out) :: x
+        end subroutine s
+    end interface 
+end python module myroutine
+
+! This file was auto-generated with f2py (version:1.23.0.dev0+120.g4da01f42d).
+! See:
+! https://web.archive.org/web/20140822061353/http://cens.ioc.ee/projects/f2py2e
+\ No newline at end of file
diff --git a/doc/source/f2py/code/myroutine.f90 b/doc/source/f2py/code/myroutine.f90

new file mode 100644 (file)

index 0000000..592796a
--- /dev/null
+++ b/doc/source/f2py/code/myroutine.f90
@@ -0,0 +1,10 @@
+subroutine s(n, m, c, x)
+       implicit none
+       integer, intent(in) :: n, m
+       real(kind=8), intent(out), dimension(n,m) :: x
+       real(kind=8), intent(in) :: c(:)
+
+       x = 0.0d0
+       x(1, 1) = c(1)
+
+end subroutine s
+\ No newline at end of file
diff --git a/doc/source/f2py/code/myroutine.pyf b/doc/source/f2py/code/myroutine.pyf

new file mode 100644 (file)

index 0000000..ef8f167
--- /dev/null
+++ b/doc/source/f2py/code/myroutine.pyf
@@ -0,0 +1,17 @@
+!    -*- f90 -*-
+! Note: the context of this file is case sensitive.
+
+python module myroutine ! in 
+    interface  ! in :myroutine
+        subroutine s(n,m,c,x) ! in :myroutine:myroutine.f90
+            integer intent(in) :: n
+            integer intent(in) :: m
+            real(kind=8) dimension(:),intent(in) :: c
+            real(kind=8) dimension(n,m),intent(out),depend(m,n) :: x
+        end subroutine s
+    end interface 
+end python module myroutine
+
+! This file was auto-generated with f2py (version:1.23.0.dev0+120.g4da01f42d).
+! See:
+! https://web.archive.org/web/20140822061353/http://cens.ioc.ee/projects/f2py2e
+\ No newline at end of file
diff --git a/doc/source/f2py/code/pyproj_skbuild.toml b/doc/source/f2py/code/pyproj_skbuild.toml

index 6686d173601597be1991201f7b2c2f6abaf220a9..bcd6ae99cbfb8c4ff60fa7b2d5ad879e9c1a9eb7 100644 (file)
--- a/doc/source/f2py/code/pyproj_skbuild.toml
+++ b/doc/source/f2py/code/pyproj_skbuild.toml
@@ -1,5 +1,3 @@
-[project]
-requires-python = ">=3.7"
-
  [build-system]
-requires = ["setuptools>=42", "wheel", "scikit-build", "cmake>=3.18", "numpy>=1.21"]
+requires = ["setuptools>=42", "wheel", "scikit-build", "cmake>=3.9", "numpy>=1.21"]
+build-backend = "setuptools.build_meta"
diff --git a/doc/source/f2py/code/setup_skbuild.py b/doc/source/f2py/code/setup_skbuild.py

index 4dfc6af8b76d4cf290a2dc33f6dc08584f6cf8af..28dcdcb1f8849535fe6e7dbcaf36bb35f2c914cf 100644 (file)
--- a/doc/source/f2py/code/setup_skbuild.py
+++ b/doc/source/f2py/code/setup_skbuild.py
@@ -6,5 +6,5 @@ setup(
      description="a minimal example package (fortran version)",
      license="MIT",
      packages=['fibby'],
-    cmake_args=['-DSKBUILD=ON']
+    python_requires=">=3.7",
  )
diff --git a/doc/source/f2py/f2py-examples.rst b/doc/source/f2py/f2py-examples.rst

new file mode 100644 (file)

index 0000000..8dcdec0
--- /dev/null
+++ b/doc/source/f2py/f2py-examples.rst
@@ -0,0 +1,245 @@
+.. _f2py-examples:
+
+F2PY examples
+=============
+
+Below are some examples of F2PY usage. This list is not comprehensive, but can
+be used as a starting point when wrapping your own code.
+
+F2PY walkthrough: a basic extension module
+------------------------------------------
+
+Creating source for a basic extension module
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the following subroutine, contained in a file named :file:`add.f`
+
+.. literalinclude:: ./code/add.f
+    :language: fortran
+
+This routine simply adds the elements in two contiguous arrays and places the
+result in a third. The memory for all three arrays must be provided by the
+calling routine. A very basic interface to this routine can be automatically
+generated by f2py::
+
+    python -m numpy.f2py -m add add.f
+
+This command will produce an extension module named :file:`addmodule.c` in the
+current directory. This extension module can now be compiled and used from
+Python just like any other extension module.
+
+Creating a compiled extension module
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. note::
+
+    This usage depends heavily on ``numpy.distutils``, see :ref:`f2py-bldsys`
+    for more details.
+
+You can also get f2py to compile :file:`add.f` together with the generated
+extension module, leaving only a shared-library extension file that can
+be imported from Python::
+
+    python -m numpy.f2py -c -m add add.f
+
+This command produces a Python extension module compatible with your platform.
+This module may then be imported from Python. It will contain a method for each
+subroutine in ``add``. The docstring of each method contains information about
+how the module method may be called:
+
+.. code-block:: python
+
+    >>> import add
+    >>> print(add.zadd.__doc__)
+    zadd(a,b,c,n)
+
+    Wrapper for ``zadd``.
+
+    Parameters
+    ----------
+    a : input rank-1 array('D') with bounds (*)
+    b : input rank-1 array('D') with bounds (*)
+    c : input rank-1 array('D') with bounds (*)
+    n : input int
+
+Improving the basic interface
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The default interface is a very literal translation of the Fortran code into
+Python. The Fortran array arguments are converted to NumPy arrays and the
+integer argument should be mapped to a ``C`` integer. The interface will attempt
+to convert all arguments to their required types (and shapes) and issue an error
+if unsuccessful. However, because ``f2py`` knows nothing about the semantics of
+the arguments (for example, that ``C`` is an output and that ``n`` should match
+the array sizes), it is possible to abuse this function in ways that can cause
+Python to crash. For example:
+
+.. code-block:: python
+
+    >>> add.zadd([1, 2, 3], [1, 2], [3, 4], 1000)
+
+will cause a program crash on most systems. Under the hood, the lists are
+converted to arrays, but the underlying ``zadd`` routine is then told to loop
+far beyond the bounds of the allocated memory.
+
+In order to improve the interface, ``f2py`` supports directives. This is
+accomplished by constructing a signature file. It is usually best to start from
+the interfaces that ``f2py`` produces in that file, which correspond to the
+default behavior. To get ``f2py`` to generate the interface file use the ``-h``
+option::
+
+    python -m numpy.f2py -h add.pyf -m add add.f
+
+This command creates the ``add.pyf`` file in the current directory. The section
+of this file corresponding to ``zadd`` is:
+
+.. literalinclude:: ./code/add.pyf
+    :language: fortran
+
+By placing intent directives and checking code, the interface can be cleaned up
+quite a bit so the Python module method is both easier to use and more robust to
+malformed inputs.
+
+.. literalinclude:: ./code/add-edited.pyf
+    :language: fortran
+
+The ``intent(out)`` directive tells f2py that ``c`` is an output
+variable and should be created by the interface before being
+passed to the underlying code. The ``intent(hide)`` directive tells f2py
+not to allow the user to specify the variable ``n``, but instead to
+get it from the size of ``a``. The ``depend(a)`` directive is
+necessary to tell f2py that the value of ``n`` depends on the input ``a``
+(so that it won't try to create the variable ``n`` until the variable
+``a`` is created).
+
+After modifying ``add.pyf``, the new Python module file can be generated
+by compiling both ``add.f`` and ``add.pyf``::
+
+    python -m numpy.f2py -c add.pyf add.f
+
+The new interface's docstring is:
+
+.. code-block:: python
+
+    >>> import add
+    >>> print(add.zadd.__doc__)
+    c = zadd(a,b)
+
+    Wrapper for ``zadd``.
+
+    Parameters
+    ----------
+    a : input rank-1 array('D') with bounds (n)
+    b : input rank-1 array('D') with bounds (n)
+
+    Returns
+    -------
+    c : rank-1 array('D') with bounds (n)
+
+Now, the function can be called in a much more robust way:
+
+.. code-block:: python
+
+    >>> add.zadd([1, 2, 3], [4, 5, 6])
+    array([5.+0.j, 7.+0.j, 9.+0.j])
+
+Notice the automatic conversion to the correct format that occurred.
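+
+As a rough point of comparison, the type conversion applied to the list
+arguments resembles what the following pure-NumPy sketch does. It does not use
+f2py at all; ``zadd_reference`` is a hypothetical helper name introduced here
+only for illustration, and its shape check merely stands in for the checks the
+generated wrapper performs:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def zadd_reference(a, b):
+        # 'D' in the docstring above denotes double-precision complex.
+        a = np.asarray(a, dtype=np.complex128)
+        b = np.asarray(b, dtype=np.complex128)
+        if a.shape != b.shape:
+            raise ValueError("a and b must have the same shape")
+        return a + b
+
+    print(zadd_reference([1, 2, 3], [4, 5, 6]))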
+
+Inserting directives in Fortran source
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The robust interface of the previous section can also be generated automatically
+by placing the variable directives as special comments in the original Fortran
+code. 
+
+.. note::
+
+    For projects where the Fortran code is being actively developed, this may be
+    preferred.
+
+Thus, if the source code is modified to contain:
+
+.. literalinclude:: ./code/add-improved.f
+    :language: fortran
+
+Then, one can compile the extension module using::
+
+    python -m numpy.f2py -c -m add add.f
+
+The resulting signature for the function ``add.zadd`` is exactly the same
+one that was created previously. If the original source code had
+contained ``A(N)`` instead of ``A(*)`` and so forth with ``B`` and ``C``,
+then nearly the same interface can be obtained by placing the
+``INTENT(OUT) :: C`` comment line in the source code. The only difference
+is that ``N`` would be an optional input that would default to the length
+of ``A``.
+
+A filtering example
+-------------------
+
+This example shows a function that filters a two-dimensional array of double
+precision floating-point numbers using a fixed averaging filter. The advantage
+of using Fortran to index into multi-dimensional arrays should be clear from
+this example.
+
+.. literalinclude:: ./code/filter.f
+    :language: fortran
+
+This code can be compiled and linked into an extension module named
+``filter`` using::
+
+    python -m numpy.f2py -c -m filter filter.f
+
+This will produce an extension module in the current directory with a method
+named ``dfilter2d`` that returns a filtered version of the input.
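+
+To make the stencil easier to read, the interior update can be mirrored in pure
+NumPy. This is only a sketch for cross-checking results, not part of the f2py
+example: ``dfilter2d_reference`` is a hypothetical name, and the border of the
+output is set to zero here as an assumption, whereas the Fortran routine leaves
+the border elements of ``B`` unassigned:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def dfilter2d_reference(a):
+        a = np.asarray(a, dtype=np.float64)
+        b = np.zeros_like(a)  # assumption: zero border
+        b[1:-1, 1:-1] = (
+            a[1:-1, 1:-1]
+            # direct neighbours (up, down, left, right), weight 0.5
+            + 0.5 * (a[:-2, 1:-1] + a[2:, 1:-1] + a[1:-1, :-2] + a[1:-1, 2:])
+            # diagonal neighbours, weight 0.25
+            + 0.25 * (a[:-2, :-2] + a[:-2, 2:] + a[2:, :-2] + a[2:, 2:])
+        )
+        return b
+
+    print(dfilter2d_reference(np.ones((3, 3))))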
+
+
+``depend`` keyword example
+--------------------------
+
+Consider the following code, saved in the file ``myroutine.f90``:
+
+.. literalinclude:: ./code/myroutine.f90
+    :language: fortran
+
+Wrapping this with ``python -m numpy.f2py -c myroutine.f90 -m myroutine``, we
+can do the following in Python::
+
+       >>> import numpy as np
+       >>> import myroutine
+       >>> x = myroutine.s(2, 3, np.array([5, 6, 7]))
+       >>> x
+       array([[5., 0., 0.],
+              [0., 0., 0.]])
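+
+This result matches what the Fortran source does: ``x`` is an ``n`` by ``m``
+array of zeros whose first element is set to ``c(1)``. A pure-NumPy sketch of
+the same computation, useful for checking the wrapped routine, might look as
+follows (``s_reference`` is a hypothetical name used only for illustration):
+
+.. code-block:: python
+
+    import numpy as np
+
+    def s_reference(n, m, c):
+        # Mirrors the Fortran body: x = 0.0d0, then x(1, 1) = c(1).
+        c = np.asarray(c, dtype=np.float64)
+        x = np.zeros((n, m), dtype=np.float64, order="F")
+        x[0, 0] = c[0]
+        return x
+
+    print(s_reference(2, 3, [5, 6, 7]))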
+
+Now, instead of generating the extension module directly, we will create a
+signature file for this subroutine first. This is a common pattern for
+multi-step extension module generation. In this case, after running
+
+.. code-block:: sh
+
+       python -m numpy.f2py myroutine.f90 -h myroutine.pyf
+
+the following signature file is generated:
+
+.. literalinclude:: ./code/myroutine.pyf
+    :language: fortran
+
+Now, if we run ``python -m numpy.f2py -c myroutine.pyf myroutine.f90`` we see an
+error; note that the signature file included a ``depend(m,n)`` statement for
+``x`` which is not necessary. Indeed, editing the file above to read
+
+.. literalinclude:: ./code/myroutine-edited.pyf
+    :language: fortran
+
+and running ``f2py -c myroutine.pyf myroutine.f90`` yields correct results.
+
+
+Read more
+---------
+
+* `Wrapping C codes using f2py <https://scipy.github.io/old-wiki/pages/Cookbook/f2py_and_NumPy.html>`_
+* `F2py section on the SciPy Cookbook <https://scipy-cookbook.readthedocs.io/items/F2Py.html>`_
+* `F2py example: Interactive System for Ice sheet Simulation <http://websrv.cs.umt.edu/isis/index.php/F2py_example>`_
+* `"Interfacing With Other Languages" section on the SciPy Cookbook.
+  <https://scipy-cookbook.readthedocs.io/items/idx_interfacing_with_other_languages.html>`_
diff --git a/doc/source/f2py/f2py-reference.rst b/doc/source/f2py/f2py-reference.rst

new file mode 100644 (file)

index 0000000..70912bd
--- /dev/null
+++ b/doc/source/f2py/f2py-reference.rst
@@ -0,0 +1,13 @@
+.. _f2py-reference:
+
+F2PY reference manual
+=====================
+
+.. toctree::
+   :maxdepth: 2
+
+   signature-file
+   python-usage
+   buildtools/index
+   advanced
+   f2py-testing
diff --git a/doc/source/f2py/f2py-testing.rst b/doc/source/f2py/f2py-testing.rst

new file mode 100644 (file)

index 0000000..88fc26c
--- /dev/null
+++ b/doc/source/f2py/f2py-testing.rst
@@ -0,0 +1,80 @@
+.. _f2py-testing:
+
+===============
+F2PY test suite
+===============
+
+F2PY's test suite is present in the directory ``numpy/f2py/tests``. Its aim
+is to ensure that Fortran language features are correctly translated to Python.
+For example, the user can specify starting and ending indices of arrays in
+Fortran. This behaviour is translated to the generated CPython library, where
+the arrays strictly start from index 0.
+
+The directory of the test suite looks like the following::
+
+       ./tests/
+       ├── __init__.py
+       ├── src
+       │   ├── abstract_interface
+       │   ├── array_from_pyobj
+       │   ├── // ... several test folders
+       │   └── string
+       ├── test_abstract_interface.py
+       ├── test_array_from_pyobj.py
+       ├── // ... several test files
+       ├── test_symbolic.py
+       └── util.py
+
+Files starting with ``test_`` contain tests for various aspects of f2py, from parsing
+Fortran files to checking modules' documentation. The ``src`` directory contains the
+Fortran source files upon which we do the testing. ``util.py`` contains utility
+functions for building and importing Fortran modules during test time using a
+temporary location.
+
+Adding a test
+==============
+
+F2PY's current test suite predates ``pytest`` and therefore does not use fixtures.
+Instead, the test files contain test classes that inherit from the ``F2PyTest``
+class present in ``util.py``.
+
+.. literalinclude:: ../../../numpy/f2py/tests/util.py
+   :language: python
+   :lines:  327-336
+   :linenos:
+
+This class contains many helper functions for parsing and compiling test source files.
+Its child classes can override its ``sources`` data member to provide their own source
+files. The superclass then compiles the added source files upon object creation, and
+their functions are attached to the ``self.module`` data member. Thus, the child classes
+can access the Fortran functions specified in the source files by calling
+``self.module.[fortran_function_name]``.
+
+Example
+~~~~~~~
+
+Consider the following subroutines, contained in a file named :file:`add-test.f`
+
+.. literalinclude:: ./code/add-test.f
+   :language: fortran
+
+The first routine, ``addb``, simply takes an array and increases its elements by 1.
+The second subroutine, ``addc``, assigns a new array ``k`` whose elements are greater
+than the elements of the input array ``w`` by 1.
+
+A test can be implemented as follows::
+
+       class TestAdd(util.F2PyTest):
+           sources = [util.getpath("add-test.f")]
+
+           def test_module(self):
+               k = np.array([1, 2, 3], dtype=np.float64)
+               w = np.array([1, 2, 3], dtype=np.float64)
+               self.module.addb(k)  # modifies k in place (intent(inout))
+               assert np.allclose(k, w + 1)
+               k = self.module.addc(w)  # k is returned (intent(out))
+               assert np.allclose(k, w + 1)
+
+We override the ``sources`` data member to provide the source file. The source files
+are compiled and the subroutines are attached to the ``module`` data member when the class object
+is created. The ``test_module`` function calls the subroutines and tests their results.
+\ No newline at end of file
diff --git a/doc/source/f2py/f2py-user.rst b/doc/source/f2py/f2py-user.rst

new file mode 100644 (file)

index 0000000..70d4aed
--- /dev/null
+++ b/doc/source/f2py/f2py-user.rst
@@ -0,0 +1,11 @@
+.. _f2py-user:
+
+F2PY user guide
+===============
+
+.. toctree::
+   :maxdepth: 2
+
+   f2py.getting-started
+   usage
+   f2py-examples
+\ No newline at end of file
diff --git a/doc/source/f2py/f2py.getting-started.rst b/doc/source/f2py/f2py.getting-started.rst

index c1a006f6f2eb9206f0ee8d8d9ba9fae88f2b5e85..da88b46f550d3bf510d2fb18e60aed9b1da66cb7 100644 (file)
--- a/doc/source/f2py/f2py.getting-started.rst
+++ b/doc/source/f2py/f2py.getting-started.rst
@@ -7,15 +7,14 @@
  Wrapping Fortran or C functions to Python using F2PY consists of the
  following steps:
  
-* Creating the so-called signature file that contains descriptions of
-  wrappers to Fortran or C functions, also called the signatures of the
-  functions. For Fortran routines, F2PY can create an initial
-  signature file by scanning Fortran source codes and
-  tracking all relevant information needed to create wrapper
-  functions.
+* Creating the so-called :doc:`signature file <signature-file>` that contains
+  descriptions of wrappers to Fortran or C functions, also called the signatures
+  of the functions. For Fortran routines, F2PY can create an initial signature
+  file by scanning Fortran source codes and tracking all relevant information
+  needed to create wrapper functions.
  
-  * Optionally, F2PY created signature files can be edited to optimize
-    wrapper functions, to make them "smarter" and more "Pythonic".
+  * Optionally, F2PY-created signature files can be edited to optimize wrapper
+    functions, which can make them "smarter" and more "Pythonic".
  
  * F2PY reads a signature file and writes a Python C/API module containing
    Fortran/C/Python bindings.
@@ -24,7 +23,8 @@ following steps:
  
    * In building the extension modules, F2PY uses ``numpy_distutils`` which
      supports a number of Fortran 77/90/95 compilers, including Gnu, Intel, Sun
-    Fortran, SGI MIPSpro, Absoft, NAG, Compaq etc.
+    Fortran, SGI MIPSpro, Absoft, NAG, Compaq etc. For different build systems,
+    see :ref:`f2py-bldsys`.
  
  Depending on the situation, these steps can be carried out in a single composite
  command or step-by-step; in which case some steps can be omitted or combined
@@ -40,6 +40,13 @@ illustration, save it as ``fib1.f``:
  .. literalinclude:: ./code/fib1.f
     :language: fortran
  
+.. note::
+
+  F2PY parses Fortran/C signatures to build wrapper functions to be used with
+  Python. However, it is not a compiler, and does not check for additional
+  errors in source code, nor does it implement the entire language standards.
+  Some errors may pass silently (or as warnings) and need to be verified by the
+  user.
  
  The quick way
  ==============
@@ -51,6 +58,18 @@ run
  
    python -m numpy.f2py -c fib1.f -m fib1
  
+or, alternatively, if the ``f2py`` command-line tool is available,
+
+::
+
+  f2py -c fib1.f -m fib1
+
+.. note::
+
+  Because the ``f2py`` command might not be available on all systems, notably on
+  Windows, we will use the ``python -m numpy.f2py`` command throughout this
+  guide.
+
  This command compiles and wraps ``fib1.f`` (``-c``) to create the extension
  module ``fib1.so`` (``-m``) in the current directory. A list of command line
  options can be seen by executing ``python -m numpy.f2py``.  Now, in Python the
@@ -103,15 +122,15 @@ Fortran subroutine ``FIB`` is accessible via ``fib1.fib``::
      F2PY implements basic compatibility checks between related
      arguments in order to avoid unexpected crashes.
  
-  * When a NumPy array, that is Fortran contiguous and has a ``dtype``
-    corresponding to a presumed Fortran type, is used as an input array
-    argument, then its C pointer is directly passed to Fortran.
+  * When a NumPy array that is :term:`Fortran <Fortran order>`
+    :term:`contiguous` and has a ``dtype`` corresponding to a presumed Fortran
+    type is used as an input array argument, then its C pointer is directly
+    passed to Fortran.
  
-    Otherwise F2PY makes a contiguous copy (with the proper ``dtype``) of
-    the input array and passes a C pointer of the copy to the Fortran
-    subroutine. As a result, any possible changes to the (copy of)
-    input array have no effect to the original argument, as
-    demonstrated below::
+    Otherwise, F2PY makes a contiguous copy (with the proper ``dtype``) of the
+    input array and passes a C pointer of the copy to the Fortran subroutine. As
+    a result, any possible changes to the (copy of) input array have no effect
+    on the original argument, as demonstrated below::
  
        >>> a = np.ones(8, 'i')
        >>> fib1.fib(a)
@@ -121,11 +140,11 @@ Fortran subroutine ``FIB`` is accessible via ``fib1.fib``::
      Clearly, this is unexpected, as Fortran typically passes by reference. That
      the above example worked with ``dtype=float`` is considered accidental.
  
-    F2PY provides an ``intent(inplace)`` attribute that modifies
-    the attributes of an input array so that any changes made by
-    Fortran routine will be reflected in the input argument. For example,
-    if one specifies the ``intent(inplace) a`` directive (see subsequent
-    sections on how), then the example above would read::
+    F2PY provides an ``intent(inplace)`` attribute that modifies the attributes
+    of an input array so that any changes made by the Fortran routine will be
+    reflected in the input argument. For example, if one specifies the
+    ``intent(inplace) a`` directive (see :ref:`f2py-attributes` for details),
+    then the example above would read::
  
        >>> a = np.ones(8, 'i')
        >>> fib1.fib(a)
@@ -133,8 +152,8 @@ Fortran subroutine ``FIB`` is accessible via ``fib1.fib``::
        [  0.   1.   1.   2.   3.   5.   8.  13.]
  
      However, the recommended way to have changes made by Fortran subroutine
-    propagate to Python is to use the ``intent(out)`` attribute. That approach is
-    more efficient and also cleaner.
+    propagate to Python is to use the ``intent(out)`` attribute. That approach
+    is more efficient and also cleaner.
  
    * The usage of ``fib1.fib`` in Python is very similar to using ``FIB`` in
      Fortran. However, using *in situ* output arguments in Python is poor style,
@@ -145,23 +164,23 @@ Fortran subroutine ``FIB`` is accessible via ``fib1.fib``::
      may lead to difficult to find bugs, not to mention the fact that the
      codes will be less readable when all required type checks are implemented.
  
-  Though the approach to wrapping Fortran routines for Python discussed so far is
-  very straightforward, it has several drawbacks (see the comments above).
+  Though the approach to wrapping Fortran routines for Python discussed so far
+  is very straightforward, it has several drawbacks (see the comments above).
    The drawbacks are due to the fact that there is no way for F2PY to determine
-  the actual intention of the arguments; that is there is ambiguity in
+  the actual intention of the arguments; that is, there is ambiguity in
    distinguishing between input and output arguments. Consequently, F2PY assumes
    that all arguments are input arguments by default.
  
-  However, there are ways (see below) to remove this ambiguity by "teaching"
-  F2PY about the true intentions of function arguments, and F2PY is then able to
-  generate more explicit, easier to use, and less error prone wrappers for
-  Fortran functions.
+  There are ways (see below) to remove this ambiguity by "teaching" F2PY about
+  the true intentions of function arguments, and F2PY is then able to generate
+  more explicit, easier to use, and less error prone wrappers for Fortran
+  functions.
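+
+The memory layout condition mentioned in the notes above can be checked from
+Python before calling a wrapper. The following is a minimal sketch that uses
+only NumPy (no f2py-generated module is involved); layout is one of the two
+conditions for avoiding a copy, the other being a matching ``dtype``:
+
+.. code-block:: python
+
+    import numpy as np
+
+    c_arr = np.ones((2, 3))             # C order by default
+    f_arr = np.asfortranarray(c_arr)    # Fortran-ordered copy
+
+    print(c_arr.flags.f_contiguous)     # False
+    print(f_arr.flags.f_contiguous)     # True
+    # The Fortran-ordered array owns separate memory, so changes made
+    # through it would not be visible in c_arr.
+    print(np.shares_memory(c_arr, f_arr))  # False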
  
  The smart way
  ==============
  
-Let us apply the steps for wrapping Fortran functions to Python one by
-one.
+If we want to have more control over how F2PY will treat the interface to our
+Fortran code, we can apply the wrapping steps one by one.
  
  * First, we create a signature file from ``fib1.f`` by running:
  
@@ -169,21 +188,21 @@ one.
  
      python -m numpy.f2py fib1.f -m fib2 -h fib1.pyf
  
-  The signature file is saved to ``fib1.pyf`` (see the ``-h`` flag) and
-  its contents are shown below.
+  The signature file is saved to ``fib1.pyf`` (see the ``-h`` flag) and its
+  contents are shown below.
  
    .. literalinclude:: ./code/fib1.pyf
       :language: fortran
  
  * Next, we'll teach F2PY that the argument ``n`` is an input argument (using the
    ``intent(in)`` attribute) and that the result, i.e., the contents of ``a``
-  after calling the Fortran function ``FIB``, should be returned to Python (using
-  the ``intent(out)`` attribute). In addition, an array ``a`` should be created
-  dynamically using the size determined by the input argument ``n`` (using the
-  ``depend(n)`` attribute to indicate this dependence relation).
+  after calling the Fortran function ``FIB``, should be returned to Python
+  (using the ``intent(out)`` attribute). In addition, an array ``a`` should be
+  created dynamically using the size determined by the input argument ``n``
+  (using the ``depend(n)`` attribute to indicate this dependence relation).
  
    The contents of a suitably modified version of ``fib1.pyf`` (saved as
-  ``fib2.pyf``) is as follows:
+  ``fib2.pyf``) are as follows:
  
    .. literalinclude:: ./code/fib2.pyf
       :language: fortran
@@ -215,15 +234,18 @@ In Python::
  
  .. note::
  
-  * The signature of ``fib2.fib`` now more closely corresponds to the
-    intention of Fortran subroutine ``FIB``: given the number ``n``,
-    ``fib2.fib`` returns the first ``n`` Fibonacci numbers as a NumPy array.
-    The new Python signature ``fib2.fib`` also rules out the unexpected behaviour in ``fib1.fib``.
+  * The signature of ``fib2.fib`` now more closely corresponds to the intention
+    of the Fortran subroutine ``FIB``: given the number ``n``, ``fib2.fib``
+    returns the first ``n`` Fibonacci numbers as a NumPy array. The new Python
+    signature ``fib2.fib`` also rules out the unexpected behaviour in
+    ``fib1.fib``.
  
    * Note that by default, using a single ``intent(out)`` also implies
      ``intent(hide)``. Arguments that have the ``intent(hide)`` attribute
      specified will not be listed in the argument list of a wrapper function.
  
+  For more details, see :doc:`signature-file`.
+
  The quick and smart way
  ========================
  
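Illustration for the ``fib2.pyf`` passage above (not part of the patch): a minimal pure-Python sketch of the call shape that ``intent(in) n``, ``intent(out) a`` and ``depend(n)`` are described as producing. The caller passes only ``n``, and the output array is created inside the wrapper and returned. The function name ``fib`` and the starting values 0 and 1 are assumptions made for this sketch. The real wrapper is generated C code and returns a NumPy array rather than a list.

```python
def fib(n):
    """Pure-Python model of the call shape of the wrapped routine."""
    # Plays the role of the hidden intent(out) array ``a``; its size
    # depends on the input ``n``, as expressed by depend(n).
    a = [0.0] * n
    for i in range(n):
        if i == 0:
            a[i] = 0.0          # assumed first value
        elif i == 1:
            a[i] = 1.0          # assumed second value
        else:
            a[i] = a[i - 1] + a[i - 2]
    # The output array is returned instead of being passed in by the caller.
    return a


print(fib(8))
```

With the original ``fib1`` style interface, the caller would instead have to allocate ``a`` and pass it in, which is the ambiguity the signature file removes.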
diff --git a/doc/source/f2py/index.rst b/doc/source/f2py/index.rst

index 56df31b4e752e53d580713c96596e9bccd0453ce..dedfe3f644dd3403a1ccdac5eda964fd1475d815 100644 (file)
--- a/doc/source/f2py/index.rst
+++ b/doc/source/f2py/index.rst
@@ -5,10 +5,10 @@ F2PY user guide and reference manual
  =====================================
  
  The purpose of the ``F2PY`` --*Fortran to Python interface generator*-- utility
-is to provide a connection between Python and Fortran
-languages.  F2PY is a part of NumPy_ (``numpy.f2py``) and also available as a
-standalone command line tool ``f2py`` when ``numpy`` is installed that
-facilitates creating/building Python C/API extension modules that make it
+is to provide a connection between Python and Fortran. F2PY is a part of NumPy_
+(``numpy.f2py``) and also available as a standalone command line tool.
+
+F2PY facilitates creating/building Python C/API extension modules that make it
  possible
  
  * to call Fortran 77/90/95 external subroutines and Fortran 90/95
@@ -18,15 +18,31 @@ possible
  
  from Python.
  
+F2PY can be used either as a command line tool ``f2py`` or as a Python
+module ``numpy.f2py``. While we try to provide the command line tool as part
+of the numpy setup, some platforms like Windows make it difficult to
+reliably put the executables on the ``PATH``. If the ``f2py`` command is not
+available on your system, you may have to run it as a module::
+
+   python -m numpy.f2py
+
+If you run ``f2py`` with no arguments, and the line ``numpy Version`` at the
+end matches the NumPy version printed from ``python -m numpy.f2py``, then you
+can use the shorter command. If not, or if you cannot run ``f2py``, you should
+replace all calls to ``f2py`` mentioned in this guide with the longer form.
+
  .. toctree::
-   :maxdepth: 2
+   :maxdepth: 3
  
-   usage
     f2py.getting-started
+   f2py-user
+   f2py-reference
+   usage
     python-usage
     signature-file
     buildtools/index
     advanced
+   windows/index
  
  .. _Python: https://www.python.org/
  .. _NumPy: https://www.numpy.org/
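Illustration for the ``index.rst`` text above (not part of the patch): a minimal sketch of choosing between the ``f2py`` executable and the ``python -m numpy.f2py`` module form from a script. The helper name ``f2py_command`` is hypothetical. The sketch only checks whether an ``f2py`` executable is on the ``PATH``; it does not perform the version comparison the text recommends, so an ``f2py`` found this way may still belong to a different NumPy installation.

```python
import shutil
import sys


def f2py_command():
    """Return an argv prefix for invoking F2PY (hypothetical helper)."""
    # Prefer the short command when an executable is found on the PATH.
    if shutil.which("f2py") is not None:
        return ["f2py"]
    # Otherwise fall back to running F2PY as a module of this interpreter.
    return [sys.executable, "-m", "numpy.f2py"]


print(f2py_command())
```

Returning a list rather than a string keeps the result directly usable with ``subprocess.run``, for example ``subprocess.run(f2py_command() + ["-c", "fib1.f", "-m", "fib1"])``.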
diff --git a/doc/source/f2py/python-usage.rst b/doc/source/f2py/python-usage.rst

index ef8ccd7dd657a562b1e2c4f1d32972c097fe3465..c3379f6c5ad42b4400ab8ca065e0227b910c2785 100644 (file)
--- a/doc/source/f2py/python-usage.rst
+++ b/doc/source/f2py/python-usage.rst
@@ -2,41 +2,48 @@
  Using F2PY bindings in Python
  ==================================
  
-All wrappers for Fortran/C routines, common blocks, or for Fortran
-90 module data generated by F2PY are exposed to Python as ``fortran``
-type objects. Routine wrappers are callable ``fortran`` type objects
-while wrappers to Fortran data have attributes referring to data
-objects.
+On this page, you can find a full description and a few examples of common usage
+patterns for F2PY with Python and different argument types. For more examples
+and use cases, see :ref:`f2py-examples`.
+
+Fortran type objects
+====================
+
+All wrappers for Fortran/C routines, common blocks, or for Fortran 90 module
+data generated by F2PY are exposed to Python as ``fortran`` type objects.
+Routine wrappers are callable ``fortran`` type objects while wrappers to Fortran
+data have attributes referring to data objects.
  
  All ``fortran`` type objects have an attribute ``_cpointer`` that contains a
  ``CObject`` referring to the C pointer of the corresponding Fortran/C function
-or variable at the C level. Such ``CObjects`` can be used as a callback argument
+or variable at the C level. Such ``CObjects`` can be used as callback arguments
  for F2PY generated functions to bypass the Python C/API layer for calling Python
-functions from Fortran or C when the computational aspects of such functions are
-implemented in C or Fortran and wrapped with F2PY (or any other tool capable of
-providing the ``CObject`` of a function).
+functions from Fortran or C. This can be useful when the computational aspects
+of such functions are implemented in C or Fortran and wrapped with F2PY (or any
+other tool capable of providing the ``CObject`` of a function).
  
  Consider a Fortran 77 file ```ftype.f``:
  
-  .. literalinclude:: ./code/ftype.f
-     :language: fortran
+.. literalinclude:: ./code/ftype.f
+  :language: fortran
  
  and a wrapper built using ``f2py -c ftype.f -m ftype``.
  
-In Python:
+In Python, you can observe the types of ``foo`` and ``data``, and how to access
+individual objects of the wrapped Fortran code.
  
-  .. literalinclude:: ./code/results/ftype_session.dat
-     :language: python
+.. literalinclude:: ./code/results/ftype_session.dat
+  :language: python
  
  
  Scalar arguments
  =================
  
-In general, a scalar argument for a F2PY generated wrapper function can
-be an ordinary Python scalar (integer, float, complex number) as well as
-an arbitrary sequence object (list, tuple, array, string) of
-scalars. In the latter case, the first element of the sequence object
-is passed to Fortran routine as a scalar argument.
+In general, a scalar argument for a F2PY generated wrapper function can be an
+ordinary Python scalar (integer, float, complex number) as well as an arbitrary
+sequence object (list, tuple, array, string) of scalars. In the latter case, the
+first element of the sequence object is passed to the Fortran routine as a
+scalar argument.
  
  .. note::
  
@@ -44,52 +51,57 @@ is passed to Fortran routine as a scalar argument.
       narrowing e.g. when type-casting float to integer or complex to float, F2PY
       *does not* raise an exception.
  
-     * For complex to real type-casting only the real part of a complex number is used.
+     * For complex to real type-casting only the real part of a complex number
+       is used.
  
     * ``intent(inout)`` scalar arguments are assumed to be array objects in
       order to have *in situ* changes be effective. It is recommended to use
-     arrays with proper type but also other types work.
+     arrays with the proper type, but other types also work. :ref:`Read more
+     about the intent attribute <f2py-attributes>`.
  
  Consider the following Fortran 77 code:
  
-  .. literalinclude:: ./code/scalar.f
-     :language: fortran
+.. literalinclude:: ./code/scalar.f
+  :language: fortran
  
  and wrap it using ``f2py -c -m scalar scalar.f``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/scalar_session.dat
-     :language: python
+.. literalinclude:: ./code/results/scalar_session.dat
+  :language: python
  
  
  String arguments
  =================
  
-F2PY generated wrapper functions accept almost any Python object as
-a string argument, since ``str`` is applied for non-string objects.
-Exceptions are NumPy arrays that must have type code ``'c'`` or
-``'1'`` when used as string arguments.
+F2PY generated wrapper functions accept almost any Python object as a string
+argument, since ``str`` is applied for non-string objects. Exceptions are NumPy
+arrays that must have type code ``'S1'`` or ``'b'`` (corresponding to the
+outdated ``'c'`` or ``'1'`` typecodes, respectively) when used as string
+arguments. See :ref:`arrays.scalars` for more information on these typecodes.
  
-A string can have an arbitrary length when used as a string argument
-for an F2PY generated wrapper function. If the length is greater than
-expected, the string is truncated silently. If the length is smaller than
-expected, additional memory is allocated and filled with ``\0``.
+A string can have an arbitrary length when used as a string argument for an F2PY
+generated wrapper function. If the length is greater than expected, the string
+is truncated silently. If the length is smaller than expected, additional memory
+is allocated and filled with ``\0``.
  
-Because Python strings are immutable, an ``intent(inout)`` argument
-expects an array version of a string in order to have *in situ* changes be effective.
+.. TODO: review this section once https://github.com/numpy/numpy/pull/19388 is merged.
+
+Because Python strings are immutable, an ``intent(inout)`` argument expects an
+array version of a string in order to have *in situ* changes be effective.
  
  Consider the following Fortran 77 code:
  
-  .. literalinclude:: ./code/string.f
-     :language: fortran
+.. literalinclude:: ./code/string.f
+  :language: fortran
  
  and wrap it using ``f2py -c -m mystring string.f``.
  
  Python session:
  
-  .. literalinclude:: ./code/results/string_session.dat
-     :language: python
+.. literalinclude:: ./code/results/string_session.dat
+  :language: python
  
  
  Array arguments
@@ -99,40 +111,17 @@ In general, array arguments for F2PY generated wrapper functions accept
  arbitrary sequences that can be transformed to NumPy array objects. There are
  two notable exceptions:
  
-* ``intent(inout)`` array arguments must always be proper-contiguous (defined below) and have a
-  compatible ``dtype``, otherwise an exception is raised.
+* ``intent(inout)`` array arguments must always be
+  :term:`proper-contiguous <contiguous>` and have a compatible ``dtype``,
+  otherwise an exception is raised.
  * ``intent(inplace)`` array arguments  will be changed *in situ* if the argument
-  has a different type than expected (see the ``intent(inplace)`` attribute for
-  more information).
-
-In general, if a NumPy array is proper-contiguous and has a proper type then it
-is directly passed to the wrapped Fortran/C function. Otherwise, an element-wise
-copy of the input array is made and the copy, being proper-contiguous and with
-proper type, is used as the array argument.
-
-There are two types of proper-contiguous NumPy arrays:
-
-* Fortran-contiguous arrays refer to data that is stored columnwise,
-  i.e. the indexing of data as stored in memory starts from the lowest
-  dimension;
-* C-contiguous, or simply contiguous arrays, refer to data that is stored
-  rowwise, i.e. the indexing of data as stored in memory starts from the highest
-  dimension.
-
-For one-dimensional arrays these notions coincide.
-
-For example, a 2x2 array ``A`` is Fortran-contiguous if its elements
-are stored in memory in the following order::
-
-  A[0,0] A[1,0] A[0,1] A[1,1]
-
-and C-contiguous if the order is as follows::
-
-  A[0,0] A[0,1] A[1,0] A[1,1]
+  has a different type than expected (see the ``intent(inplace)``
+  :ref:`attribute <f2py-attributes>` for more information).
  
-To test whether an array is C-contiguous, use the ``.flags.c_contiguous``
-attribute of NumPy arrays.  To test for Fortran contiguity, use the
-``.flags.f_contiguous`` attribute.
+In general, if a NumPy array is :term:`proper-contiguous <contiguous>` and has
+a proper type then it is directly passed to the wrapped Fortran/C function.
+Otherwise, an element-wise copy of the input array is made and the copy, being
+proper-contiguous and with proper type, is used as the array argument.
  
  Usually there is no need to worry about how the arrays are stored in memory and
  whether the wrapped functions, being either Fortran or C functions, assume one
@@ -144,19 +133,19 @@ physical memory in your computer, then care must be taken to ensure the usage of
  proper-contiguous and proper type arguments.
  
  To transform input arrays to column major storage order before passing
-them to Fortran routines, use the function ``numpy.asfortranarray(<array>)``.
+them to Fortran routines, use the function `numpy.asfortranarray`.
  
  Consider the following Fortran 77 code:
  
-  .. literalinclude:: ./code/array.f
-     :language: fortran
+.. literalinclude:: ./code/array.f
+  :language: fortran
  
  and wrap it using ``f2py -c -m arr array.f -DF2PY_REPORT_ON_ARRAY_COPY=1``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/array_session.dat
-     :language: python
+.. literalinclude:: ./code/results/array_session.dat
+  :language: python
  
  .. _Call-back arguments:
  
@@ -167,15 +156,15 @@ F2PY supports calling Python functions from Fortran or C codes.
  
  Consider the following Fortran 77 code:
  
-  .. literalinclude:: ./code/callback.f
-     :language: fortran
+.. literalinclude:: ./code/callback.f
+  :language: fortran
  
  and wrap it using ``f2py -c -m callback callback.f``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/callback_session.dat
-     :language: python
+.. literalinclude:: ./code/results/callback_session.dat
+  :language: python
  
  In the above example F2PY was able to guess accurately the signature
  of the call-back function. However, sometimes F2PY cannot establish the
@@ -183,13 +172,13 @@ appropriate signature; in these cases the signature of the call-back
  function must be explicitly defined in the signature file.
  
  To facilitate this, signature files may contain special modules (the names of
-these modules contain the special ``__user__`` sub-string) that defines the
+these modules contain the special ``__user__`` sub-string) that define the
  various signatures for call-back functions.  Callback arguments in routine
  signatures have the ``external`` attribute (see also the ``intent(callback)``
-attribute). To relate a callback argument with its signature in a ``__user__``
-module block, a ``use`` statement can be utilized as illustrated below. The same
-signature for a callback argument can be referred to in different routine
-signatures.
+:ref:`attribute <f2py-attributes>`). To relate a callback argument with its
+signature in a ``__user__`` module block, a ``use`` statement can be utilized as
+illustrated below. The same signature for a callback argument can be referred to
+in different routine signatures.
  
  We use the same Fortran 77 code as in the previous example but now
  we will pretend that F2PY was not able to guess the signatures of
@@ -200,69 +189,67 @@ file ``callback2.pyf`` using F2PY::
  
  Then modify it as follows
  
-  .. include:: ./code/callback2.pyf
-     :literal:
+.. include:: ./code/callback2.pyf
+  :literal:
  
-Finally, we build the extension module using ``f2py -c callback2.pyf callback.f``.
+Finally, we build the extension module using
+``f2py -c callback2.pyf callback.f``.
  
  An example Python session for this snippet would be identical to the previous
  example except that the argument names would differ.
  
-Sometimes a Fortran package may require that users provide routines
-that the package will use. F2PY can construct an interface to such
-routines so that Python functions can be called from Fortran.
+Sometimes a Fortran package may require that users provide routines that the
+package will use. F2PY can construct an interface to such routines so that
+Python functions can be called from Fortran.
  
  Consider the following Fortran 77 subroutine that takes an array as its input
  and applies a function ``func`` to its elements.
  
-  .. literalinclude:: ./code/calculate.f
-     :language: fortran
+.. literalinclude:: ./code/calculate.f
+  :language: fortran
  
  The Fortran code expects that the function ``func`` has been defined externally.
  In order to use a Python function for ``func``, it must have an attribute
-``intent(callback)`` and, it must be specified before the ``external`` statement.
+``intent(callback)`` and it must be specified before the ``external`` statement.
  
  Finally, build an extension module using ``f2py -c -m foo calculate.f``
  
  In Python:
  
-  .. literalinclude:: ./code/results/calculate_session.dat
-     :language: python
+.. literalinclude:: ./code/results/calculate_session.dat
+  :language: python
  
  The function is included as an argument to the python function call to the
  Fortran subroutine even though it was *not* in the Fortran subroutine argument
  list. The "external" keyword refers to the C function generated by f2py, not the
-python function itself. The python function is essentially being supplied to the
+Python function itself. The Python function is essentially being supplied to the
  C function.
  
-The callback function may also be explicitly set in the module.
-Then it is not necessary to pass the function in the argument list to
-the Fortran function. This may be desired if the Fortran function calling
-the python callback function is itself called by another Fortran function.
+The callback function may also be explicitly set in the module. Then it is not
+necessary to pass the function in the argument list to the Fortran function.
+This may be desired if the Fortran function calling the Python callback function
+is itself called by another Fortran function.
  
  Consider the following Fortran 77 subroutine:
  
-  .. literalinclude:: ./code/extcallback.f
-     :language: fortran
+.. literalinclude:: ./code/extcallback.f
+  :language: fortran
  
  and wrap it using ``f2py -c -m pfromf extcallback.f``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/extcallback_session.dat
-     :language: python
+.. literalinclude:: ./code/results/extcallback_session.dat
+  :language: python
  
  Resolving arguments to call-back functions
-===========================================
+------------------------------------------
  
-F2PY generated interfaces are very flexible with respect to call-back
-arguments.  For each call-back argument an additional optional
-argument ``<name>_extra_args`` is introduced by F2PY. This argument
-can be used to pass extra arguments to user provided call-back
-functions.
+F2PY generated interfaces are very flexible with respect to call-back arguments.
+For each call-back argument, F2PY adds an optional ``<name>_extra_args``
+argument for passing extra arguments to user provided call-back functions.
  
-If a F2PY generated wrapper function expects the following call-back
-argument::
+If a F2PY generated wrapper function expects the following call-back argument::
  
    def fun(a_1,...,a_n):
       ...
@@ -282,20 +269,20 @@ is provided by a user, and in addition,
  
    fun_extra_args = (e_1,...,e_p)
  
-is used, then the following rules are applied when a Fortran or C
-function evaluates the call-back argument ``gun``:
+is used, then the following rules are applied when a Fortran or C function
+evaluates the call-back argument ``gun``:
  
  * If ``p == 0`` then ``gun(a_1, ..., a_q)`` is called, here
    ``q = min(m, n)``.
  * If ``n + p <= m`` then ``gun(a_1, ..., a_n, e_1, ..., e_p)`` is called.
-* If ``p <= m < n + p`` then ``gun(a_1, ..., a_q, e_1, ..., e_p)`` is called, here
-  ``q=m-p``.
+* If ``p <= m < n + p`` then ``gun(a_1, ..., a_q, e_1, ..., e_p)`` is called,
+  where ``q = m - p``.
  * If ``p > m`` then ``gun(e_1, ..., e_m)`` is called.
-* If ``n + p`` is less than the number of required arguments to ``gun``
-  then an exception is raised.
+* If ``n + p`` is less than the number of required arguments to ``gun`` then an
+  exception is raised.
  
-If the function ``gun`` may return any number of objects as a tuple; then
-the following rules are applied:
+The function ``gun`` may return any number of objects as a tuple; the following
+rules are then applied:
  
  * If ``k < l``, then ``y_{k + 1}, ..., y_l`` are ignored.
  * If ``k > l``, then only ``x_1, ..., x_l`` are set.
@@ -304,48 +291,47 @@ the following rules are applied:
  Common blocks
  ==============
  
-F2PY generates wrappers to ``common`` blocks defined in a routine
-signature block. Common blocks are visible to all Fortran codes linked
-to the current extension module, but not to other extension modules
-(this restriction is due to the way Python imports shared libraries).  In
-Python, the F2PY wrappers to ``common`` blocks are ``fortran`` type
-objects that have (dynamic) attributes related to the data members of
-the common blocks. When accessed, these attributes return as NumPy array
-objects (multidimensional arrays are Fortran-contiguous) which
-directly link to data members in common blocks. Data members can be
-changed by direct assignment or by in-place changes to the
+F2PY generates wrappers to ``common`` blocks defined in a routine signature
+block. Common blocks are visible to all Fortran codes linked to the current
+extension module, but not to other extension modules (this restriction is due to
+the way Python imports shared libraries). In Python, the F2PY wrappers to
+``common`` blocks are ``fortran`` type objects that have (dynamic) attributes
+related to the data members of the common blocks. When accessed, these
+attributes are returned as NumPy array objects (multidimensional arrays are
+Fortran-contiguous) which directly link to data members in common blocks. Data
+members can be changed by direct assignment or by in-place changes to the
  corresponding array objects.
  
  Consider the following Fortran 77 code:
  
-  .. literalinclude:: ./code/common.f
-     :language: fortran
+.. literalinclude:: ./code/common.f
+  :language: fortran
  
  and wrap it using ``f2py -c -m common common.f``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/common_session.dat
-     :language: python
+.. literalinclude:: ./code/results/common_session.dat
+  :language: python
  
  
  Fortran 90 module data
  =======================
  
-The F2PY interface to Fortran 90 module data is similar to the handling of Fortran 77
-common blocks.
+The F2PY interface to Fortran 90 module data is similar to the handling of
+Fortran 77 common blocks.
  
  Consider the following Fortran 90 code:
  
-  .. literalinclude:: ./code/moddata.f90
-     :language: fortran
+.. literalinclude:: ./code/moddata.f90
+  :language: fortran
  
  and wrap it using ``f2py -c -m moddata moddata.f90``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/moddata_session.dat
-     :language: python
+.. literalinclude:: ./code/results/moddata_session.dat
+  :language: python
  
  
  Allocatable arrays
@@ -355,12 +341,12 @@ F2PY has basic support for Fortran 90 module allocatable arrays.
  
  Consider the following Fortran 90 code:
  
-  .. literalinclude:: ./code/allocarr.f90
-     :language: fortran
+.. literalinclude:: ./code/allocarr.f90
+  :language: fortran
  
  and wrap it using ``f2py -c -m allocarr allocarr.f90``.
  
  In Python:
  
-  .. literalinclude:: ./code/results/allocarr_session.dat
-     :language: python
+.. literalinclude:: ./code/results/allocarr_session.dat
+  :language: python
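Illustration for the "Resolving arguments to call-back functions" rules in ``python-usage.rst`` above (not part of the patch): a minimal pure-Python sketch of how the argument list for ``gun`` is chosen. Here ``a`` stands for the ``n`` arguments supplied by the Fortran/C caller, ``e`` for the ``p`` entries of ``fun_extra_args``, and ``m`` for the maximum number of arguments ``gun`` accepts. The function name ``resolve_callback_args`` and the ``required`` parameter are hypothetical; this models the documented rules and is not F2PY's actual implementation.

```python
def resolve_callback_args(a, e, m, required=0):
    """Return the tuple of arguments that would be passed to ``gun``."""
    a, e = tuple(a), tuple(e)
    n, p = len(a), len(e)
    # Too few arguments available for the required parameters of ``gun``.
    if n + p < required:
        raise TypeError("not enough arguments for the call-back function")
    if p == 0:
        # No extra arguments: pass the first q = min(m, n) of a_1..a_n.
        return a[:min(m, n)]
    if n + p <= m:
        # Everything fits: a_1..a_n followed by e_1..e_p.
        return a + e
    if p <= m:
        # p <= m < n + p: keep all extras, trim a_* to q = m - p.
        return a[:m - p] + e
    # p > m: only the first m extra arguments are passed.
    return e[:m]


print(resolve_callback_args(("a1", "a2", "a3"), ("e1",), m=2))
```

In the printed case ``p = 1``, ``m = 2`` and ``n + p = 4``, so the third rule applies with ``q = 1``: the extra argument is kept and the Fortran-supplied arguments are trimmed.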
diff --git a/doc/source/f2py/signature-file.rst b/doc/source/f2py/signature-file.rst

index b80b31509661e0a2664d85fda80dea0a32ab133b..cea3682c2f3413f87706d298e53b7e1bc568caf4 100644 (file)
--- a/doc/source/f2py/signature-file.rst
+++ b/doc/source/f2py/signature-file.rst
@@ -2,28 +2,39 @@
   Signature file
  ==================
  
-The syntax specification for signature files (.pyf files) is modeled on the
-Fortran 90/95 language specification. Almost all Fortran 90/95 standard
-constructs are understood, both in free and fixed format (recall that Fortran 77
-is a subset of Fortran 90/95). F2PY introduces some extensions to the Fortran
-90/95 language specification that help in the design of the Fortran to Python
-interface, making it more "Pythonic".
+The interface definition file (``.pyf``) is how you can fine-tune the interface
+between Python and Fortran. The syntax specification for signature files
+(``.pyf`` files) is modeled on the Fortran 90/95 language specification. Almost
+all Fortran 90/95 standard constructs are understood, both in free and fixed
+format (recall that Fortran 77 is a subset of Fortran 90/95). F2PY introduces
+some extensions to the Fortran 90/95 language specification that help in the
+design of the Fortran to Python interface, making it more "Pythonic".
  
  Signature files may contain arbitrary Fortran code so that any Fortran 90/95
-codes can be treated as signature files. F2PY silently ignores
-Fortran constructs that are irrelevant for creating the interface.
-However, this also means that syntax errors are not caught by F2PY and will only
-be caught when the library is built.
+codes can be treated as signature files. F2PY silently ignores Fortran
+constructs that are irrelevant for creating the interface. However, this also
+means that syntax errors are not caught by F2PY and will only be caught when the
+library is built.
+
+.. note::
+
+  Currently, F2PY may fail with valid Fortran constructs, such as intrinsic
+  modules. If this happens, you can check the
+  `NumPy GitHub issue tracker <https://github.com/numpy/numpy/issues>`_ for
+  possible workarounds or work-in-progress ideas.
  
  In general, the contents of the signature files are case-sensitive. When
-scanning Fortran codes to generate a signature file, F2PY lowers all
-cases automatically except in multi-line blocks or when the ``--no-lower``
-option is used.
+scanning Fortran codes to generate a signature file, F2PY lowers all cases
+automatically except in multi-line blocks or when the ``--no-lower`` option is
+used.
  
  The syntax of signature files is presented below.
  
+Signature files syntax
+======================
+
  Python module block
-=====================
+-------------------
  
  A signature file may contain one (recommended) or more ``python
  module`` blocks. The ``python module`` block describes the contents of
@@ -63,7 +74,7 @@ previous section.
  
  
  Fortran/C routine signatures
-=============================
+----------------------------
  
  The signature of a Fortran routine has the following structure::
  
@@ -93,8 +104,10 @@ The signature of a Fortran block data has the following structure::
      [<include statements>]
    end [ block data [<block data name>] ]
  
+.. _type-declarations:
+
  Type declarations
-=================
+-----------------
  
  The definition of the ``<argument/variable type declaration>`` part
  is
@@ -128,27 +141,27 @@ and
  
  * ``<arrayspec>`` is a comma separated list of dimension bounds;
  
-* ``<init_expr>`` is a `C expression`__;
+* ``<init_expr>`` is a :ref:`C expression <c-expressions>`;
  
  * ``<intlen>`` may be negative integer for ``integer`` type
    specifications. In such cases ``integer*<negintlen>`` represents
    unsigned C integers;
  
-__ `C expressions`_
-
  If an argument has no ``<argument type declaration>``, its type is
  determined by applying ``implicit`` rules to its name.
  
  Statements
-==========
+----------
  
  Attribute statements
  ^^^^^^^^^^^^^^^^^^^^^
  
-* The ``<argument/variable attribute statement>`` is
-  ``<argument/variable type declaration>`` without ``<typespec>``.
-* In addition, in an attribute statement one cannot use other
-  attributes, also ``<entitydecl>`` can be only a list of names.
+The ``<argument/variable attribute statement>`` is similar to the
+``<argument/variable type declaration>``, but without ``<typespec>``.
+
+An attribute statement cannot contain other attributes, and ``<entitydecl>`` can
+be only a list of names. See :ref:`f2py-attributes` for more details on the
+attributes that can be used by F2PY.
  
  Use statements
  ^^^^^^^^^^^^^^^
@@ -165,9 +178,8 @@ Use statements
  
       <rename_list> := <local_name> => <use_name> [ , <rename_list> ]
  
-* Currently F2PY uses ``use`` statement only for linking call-back
-  modules and ``external`` arguments (call-back functions), see
-  :ref:`Call-back arguments`.
+* Currently F2PY uses ``use`` statements only for linking call-back modules and
+  ``external`` arguments (call-back functions). See :ref:`Call-back arguments`.
  
  Common block statements
  ^^^^^^^^^^^^^^^^^^^^^^^
@@ -199,9 +211,7 @@ Other statements
    except the following:
  
    + ``call`` statements and function calls of ``external`` arguments
-    (`more details`__?);
-
-    __ external_
+    (see :ref:`more details on external arguments <external>`);
  
    + ``include`` statements
        ::
@@ -256,7 +266,7 @@ Other statements
  F2PY statements
  ^^^^^^^^^^^^^^^^
  
-  In addition, F2PY introduces the following statements:
+In addition, F2PY introduces the following statements:
  
  ``threadsafe``
    Uses a ``Py_BEGIN_ALLOW_THREADS .. Py_END_ALLOW_THREADS`` block
@@ -271,10 +281,9 @@ F2PY statements
    block>``.
  
  ``callprotoargument <C-typespecs>``
-  When the ``callstatement`` statement is used then F2PY may not
-  generate proper prototypes for Fortran/C functions (because
-  ``<C-expr>`` may contain any function calls and F2PY has no way
-  to determine what should be the proper prototype).
+  When the ``callstatement`` statement is used, F2PY may not generate proper
+  prototypes for Fortran/C functions (because ``<C-expr>`` may contain function
+  calls, and F2PY has no way to determine what should be the proper prototype).
  
    With this statement you can explicitly specify the arguments of the
    corresponding prototype::
@@ -321,61 +330,64 @@ F2PY statements
  
    __ https://docs.python.org/extending/index.html
  
+.. _f2py-attributes:
+
  Attributes
-============
+----------
  
-The following attributes are used by F2PY:
+The following attributes can be used by F2PY.
  
  ``optional``
-  The corresponding argument is moved to the end of ``<optional
-  arguments>`` list. A default value for an optional argument can be
-  specified via ``<init_expr>``, see the ``entitydecl`` definition.
-
+  The corresponding argument is moved to the end of the ``<optional arguments>``
+  list. A default value for an optional argument can be specified via
+  ``<init_expr>`` (see the ``entitydecl`` :ref:`definition <type-declarations>`).
  
    .. note::
  
     * The default value must be given as a valid C expression.
-   * Whenever ``<init_expr>`` is used, ``optional`` attribute
-     is set automatically by F2PY.
+   * Whenever ``<init_expr>`` is used, the ``optional`` attribute is set
+     automatically by F2PY.
     * For an optional array argument, all its dimensions must be bounded.
  
  ``required``
-  The corresponding argument with this attribute considered mandatory. This is
-  the default. ``required`` should only be specified if there is a need to
+  The corresponding argument with this attribute is considered mandatory. This
+  is the default. ``required`` should only be specified if there is a need to
    disable the automatic ``optional`` setting when ``<init_expr>`` is used.
  
-  If a Python ``None`` object is used as a required argument, the
-  argument is treated as optional. That is, in the case of array
-  argument, the memory is allocated. If ``<init_expr>`` is given, then the
-  corresponding initialization is carried out.
+  If a Python ``None`` object is used as a required argument, the argument is
+  treated as optional. That is, in the case of array arguments, the memory is
+  allocated. If ``<init_expr>`` is given, then the corresponding initialization
+  is carried out.
  
  ``dimension(<arrayspec>)``
    The corresponding variable is considered as an array with dimensions given in
    ``<arrayspec>``.
  
  ``intent(<intentspec>)``
-  This specifies the "intention" of the corresponding
-  argument. ``<intentspec>`` is a comma separated list of the
-  following keys:
+  This specifies the "intention" of the corresponding argument. ``<intentspec>``
+  is a comma separated list of the following keys:
  
    * ``in``
-      The corresponding argument is considered to be input-only. This means that the value of
-      the argument is passed to a Fortran/C function and that the function is
-      expected to not change the value of this argument.
+      The corresponding argument is considered to be input-only. This means that
+      the value of the argument is passed to a Fortran/C function and that the
+      function is expected to not change the value of this argument.
  
    * ``inout``
-      The corresponding argument is marked for input/output or as an *in situ* output
-      argument. ``intent(inout)`` arguments can be only "contiguous" NumPy
-      arrays with proper type and size. Here "contiguous" can be either in the
-      Fortran or C sense. The latter  coincides with the default contiguous
+      The corresponding argument is marked for input/output or as an *in situ*
+      output argument. ``intent(inout)`` arguments can be only
+      :term:`contiguous` NumPy arrays (in either the Fortran or C sense) with
+      proper type and size. The latter coincides with the default contiguous
        concept used in NumPy and is effective only if ``intent(c)`` is used. F2PY
        assumes Fortran contiguous arguments by default.
  
        .. note::
  
-         Using ``intent(inout)`` is generally not recommended, use ``intent(in,out)`` instead.
+         Using ``intent(inout)`` is generally not recommended, as it can cause
+         unexpected results. For example, scalar arguments using
+         ``intent(inout)`` are assumed to be array objects in order to have
+         *in situ* changes be effective. Use ``intent(in,out)`` instead.
  
-     See also the ``intent(inplace)`` attribute.
+      See also the ``intent(inplace)`` attribute.
  
    * ``inplace``
        The corresponding argument is considered to be an input/output or *in situ* output
@@ -586,15 +598,15 @@ The following attributes are used by F2PY:
    values.
  
  Extensions
-============
+----------
  
  F2PY directives
  ^^^^^^^^^^^^^^^^
  
-The F2PY directives allow using F2PY signature file constructs in
-Fortran 77/90 source codes. With this feature one  can (almost) completely skip
-the intermediate signature file generation and apply F2PY directly to Fortran
-source codes.
+The F2PY directives allow using F2PY signature file constructs in Fortran 77/90
+source codes. With this feature one can (almost) completely skip the
+intermediate signature file generation and apply F2PY directly to Fortran source
+codes.
  
  F2PY directives have the following form::
  
@@ -613,6 +625,8 @@ For fixed format Fortran codes, ``<comment char>`` must be at the
  first column of a file, of course. For free format Fortran codes,
  the F2PY directives can appear anywhere in a file.
  
+.. _c-expressions:
+
  C expressions
  ^^^^^^^^^^^^^^
  
diff --git a/doc/source/f2py/usage.rst b/doc/source/f2py/usage.rst

index 596148799ba9839772631e3b60838c3e3ecde301..dbd33e36e22336e6f5f5eaf19e4433c78ab53cc3 100644 (file)
--- a/doc/source/f2py/usage.rst
+++ b/doc/source/f2py/usage.rst
@@ -2,246 +2,294 @@
  Using F2PY
  ===========
  
-F2PY can be used either as a command line tool ``f2py`` or as a Python
-module ``numpy.f2py``. While we try to provide the command line tool as part
-of the numpy setup, some platforms like Windows make it difficult to
-reliably put the executables on the ``PATH``. We will refer to ``f2py``
-in this document but you may have to run it as a module::
+This page contains a reference to all command-line options for the ``f2py``
+command, as well as a reference to internal functions of the ``numpy.f2py``
+module.
  
-   python -m numpy.f2py
+Using ``f2py`` as a command-line tool
+=====================================
  
-If you run ``f2py`` with no arguments, and the line ``numpy Version`` at the
-end matches the NumPy version printed from ``python -m numpy.f2py``, then you
-can use the shorter version. If not, or if you cannot run ``f2py``, you should
-replace all calls to ``f2py`` here with the longer version.
+When used as a command-line tool, ``f2py`` has three major modes, distinguished
+by the usage of ``-c`` and ``-h`` switches.
  
-Command ``f2py``
-=================
+1. Signature file generation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
-When used as a command line tool, ``f2py`` has three major modes,
-distinguished by the usage of ``-c`` and ``-h`` switches:
+To scan Fortran sources and generate a signature file, use
  
-Signature file generation
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: sh
  
-1. To scan Fortran sources and generate a signature file, use
+  f2py -h <filename.pyf> <options> <fortran files>   \
+    [[ only: <fortran functions>  : ]                \
+      [ skip: <fortran functions>  : ]]...           \
+    [<fortran files> ...]
  
-   .. code-block:: sh
+.. note::
  
-     f2py -h <filename.pyf> <options> <fortran files>   \
-       [[ only: <fortran functions>  : ]                \
-        [ skip: <fortran functions>  : ]]...            \
-       [<fortran files> ...]
+  A Fortran source file can contain many routines, and it is often not
+  necessary to allow all routines to be usable from Python. In such cases,
+  either specify which routines should be wrapped (in the ``only: .. :`` part)
+  or which routines F2PY should ignore (in the ``skip: .. :`` part).
  
-   .. note::
+If ``<filename.pyf>`` is specified as ``stdout``, then signatures are written to
+standard output instead of a file.
  
-    A Fortran source file can contain many routines, and it is often
-    not necessary to allow all routines be usable from Python. In such cases,
-    either specify which routines should be wrapped (in the ``only: .. :`` part)
-    or which routines F2PY should ignored (in the ``skip: .. :`` part).
+Among other options (see below), the following can be used in this mode:
  
-   If ``<filename.pyf>`` is specified as ``stdout`` then signatures
-   are written to standard output instead of a file.
+  ``--overwrite-signature``
+    Overwrites an existing signature file.
  
-   Among other options (see below), the following can be used
-   in this mode:
+2. Extension module construction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
-   ``--overwrite-signature``
-     Overwrites an existing signature file.
+To construct an extension module, use
  
-Extension module construction
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. code-block:: sh
  
-2. To construct an extension module, use
+  f2py -m <modulename> <options> <fortran files>   \
+    [[ only: <fortran functions>  : ]              \
+      [ skip: <fortran functions>  : ]]...          \
+    [<fortran files> ...]
  
-   .. code-block:: sh
+The constructed extension module is saved as ``<modulename>module.c`` to the
+current directory.
  
-     f2py -m <modulename> <options> <fortran files>   \
-       [[ only: <fortran functions>  : ]              \
-        [ skip: <fortran functions>  : ]]...          \
-       [<fortran files> ...]
+Here ``<fortran files>`` may also contain signature files. Among other options
+(see below), the following options can be used in this mode:
  
-   The constructed extension module is saved as
-   ``<modulename>module.c`` to the current directory.
+  ``--debug-capi``
+    Adds debugging hooks to the extension module. When using this extension
+    module, various diagnostic information about the wrapper is written to the
+    standard output, for example, the values of variables, the steps taken, etc.
  
-   Here ``<fortran files>`` may also contain signature files.
-   Among other options (see below), the following options can be used
-   in this mode:
+  ``-include'<includefile>'``
+    Add a CPP ``#include`` statement to the extension module source.
+    ``<includefile>`` should be given in one of the following forms
  
-   ``--debug-capi``
-     Adds debugging hooks to the extension module. When using this extension
-     module, various diagnostic information about the wrapper is written to
-     the standard output, for example, the values of variables, the steps taken,
-     etc.
+    .. code-block:: cpp
  
-   ``-include'<includefile>'``
-     Add a CPP ``#include`` statement to the extension module source.
-     ``<includefile>`` should be given in one of the following forms
+      "filename.ext"
+      <filename.ext>
  
-       .. code-block:: cpp
+    The include statement is inserted just before the wrapper functions. This
+    feature enables using arbitrary C functions (defined in ``<includefile>``)
+    in F2PY generated wrappers.
  
-        "filename.ext"
-        <filename.ext>
+    .. note:: This option is deprecated. Use the ``usercode`` statement to
+      specify C code snippets directly in signature files.
  
-     The include statement is inserted just before the wrapper
-     functions. This feature enables using arbitrary C functions
-     (defined in ``<includefile>``) in F2PY generated wrappers.
+  ``--[no-]wrap-functions``
+    Create Fortran subroutine wrappers to Fortran functions.
+    ``--wrap-functions`` is default because it ensures maximum portability and
+    compiler independence.
  
-     .. note:: This option is deprecated. Use ``usercode`` statement to specify C code snippets directly in signature files.
+  ``--include-paths <path1>:<path2>:..``
+    Search include files from given directories.
  
-   ``--[no-]wrap-functions``
-     Create Fortran subroutine wrappers to Fortran functions.
-     ``--wrap-functions`` is default because it ensures maximum
-     portability and compiler independence.
+  ``--help-link [<list of resources names>]``
+    List system resources found by ``numpy_distutils/system_info.py``. For
+    example, try ``f2py --help-link lapack_opt``.
  
-   ``--include-paths <path1>:<path2>:..``
-     Search include files from given directories.
+3. Building a module
+^^^^^^^^^^^^^^^^^^^^
  
-   ``--help-link [<list of resources names>]``
-     List system resources found by ``numpy_distutils/system_info.py``.
-     For example, try ``f2py --help-link lapack_opt``.
+To build an extension module, use
  
-Building a module
-^^^^^^^^^^^^^^^^^
+.. code-block:: sh
  
-3. To build an extension module, use
-
-   .. code-block:: sh
-
-     f2py -c <options> <fortran files>       \
-       [[ only: <fortran functions>  : ]     \
-        [ skip: <fortran functions>  : ]]... \
-       [ <fortran/c source files> ] [ <.o, .a, .so files> ]
+  f2py -c <options> <fortran files>       \
+    [[ only: <fortran functions>  : ]     \
+      [ skip: <fortran functions>  : ]]... \
+    [ <fortran/c source files> ] [ <.o, .a, .so files> ]
   
-   If ``<fortran files>`` contains a signature file, then the source for
-   an extension module is constructed, all Fortran and C sources are
-   compiled, and finally all object and library files are linked to the
-   extension module ``<modulename>.so`` which is saved into the current
-   directory.
-
-   If ``<fortran files>`` does not contain a signature file, then an
-   extension module is constructed by scanning all Fortran source codes
-   for routine signatures, before proceeding to build the extension module.
+If ``<fortran files>`` contains a signature file, then the source for an
+extension module is constructed, all Fortran and C sources are compiled, and
+finally all object and library files are linked to the extension module
+``<modulename>.so`` which is saved into the current directory.
+
+If ``<fortran files>`` does not contain a signature file, then an extension
+module is constructed by scanning all Fortran source codes for routine
+signatures, before proceeding to build the extension module.
   
-   Among other options (see below) and options described for previous
-   modes, the following options can be used in this mode:
+Among other options (see below) and options described for previous modes, the
+following options can be used in this mode:
   
-   ``--help-fcompiler``
-     List the available Fortran compilers.
-   ``--help-compiler`` **[depreciated]**
-     List the available Fortran compilers.
-   ``--fcompiler=<Vendor>``
-     Specify a Fortran compiler type by vendor.
-   ``--f77exec=<path>``
-     Specify the path to a F77 compiler
-   ``--fcompiler-exec=<path>`` **[depreciated]**
-     Specify the path to a F77 compiler
-   ``--f90exec=<path>``
-     Specify the path to a F90 compiler
-   ``--f90compiler-exec=<path>`` **[depreciated]**
-     Specify the path to a F90 compiler
-   ``--f77flags=<string>``
-     Specify F77 compiler flags
-   ``--f90flags=<string>``
-     Specify F90 compiler flags
-   ``--opt=<string>``
-     Specify optimization flags
-   ``--arch=<string>``
-     Specify architecture specific optimization flags
-   ``--noopt``
-     Compile without optimization flags
-   ``--noarch``
-     Compile without arch-dependent optimization flags
-   ``--debug``
-     Compile with debugging information
-   ``-l<libname>``
-     Use the library ``<libname>`` when linking.
-   ``-D<macro>[=<defn=1>]``
-     Define macro ``<macro>`` as ``<defn>``.
-   ``-U<macro>``
-     Define macro ``<macro>``
-   ``-I<dir>``
-     Append directory ``<dir>`` to the list of directories searched for
-     include files.
-   ``-L<dir>``
-     Add directory ``<dir>`` to the list of directories to  be  searched
-     for ``-l``.
-   ``link-<resource>``
-     Link the extension module with <resource> as defined by
-     ``numpy_distutils/system_info.py``. E.g. to link with optimized
-     LAPACK libraries (vecLib on MacOSX, ATLAS elsewhere), use
-     ``--link-lapack_opt``. See also ``--help-link`` switch.
-
-   .. note:: The ``f2py -c`` option must be applied either to an existing ``.pyf`` file (plus the source/object/library files) or one must specify the ``-m <modulename>`` option (plus the sources/object/library files). Use one of the following options:
-
-   .. code-block:: sh
-
-         f2py -c -m fib1 fib1.f
-
-   or
-
-   .. code-block:: sh
-
-         f2py -m fib1 fib1.f -h fib1.pyf
-         f2py -c fib1.pyf fib1.f
-
-   For more information, see the `Building C and C++ Extensions`__ Python documentation for details.
+  ``--help-fcompiler``
+    List the available Fortran compilers.
+  ``--help-compiler`` **[deprecated]**
+    List the available Fortran compilers.
+  ``--fcompiler=<Vendor>``
+    Specify a Fortran compiler type by vendor.
+  ``--f77exec=<path>``
+    Specify the path to an F77 compiler
+  ``--fcompiler-exec=<path>`` **[deprecated]**
+    Specify the path to an F77 compiler
+  ``--f90exec=<path>``
+    Specify the path to an F90 compiler
+  ``--f90compiler-exec=<path>`` **[deprecated]**
+    Specify the path to an F90 compiler
+  ``--f77flags=<string>``
+    Specify F77 compiler flags
+  ``--f90flags=<string>``
+    Specify F90 compiler flags
+  ``--opt=<string>``
+    Specify optimization flags
+  ``--arch=<string>``
+    Specify architecture specific optimization flags
+  ``--noopt``
+    Compile without optimization flags
+  ``--noarch``
+    Compile without arch-dependent optimization flags
+  ``--debug``
+    Compile with debugging information
+  ``-l<libname>``
+    Use the library ``<libname>`` when linking.
+  ``-D<macro>[=<defn=1>]``
+    Define macro ``<macro>`` as ``<defn>``.
+  ``-U<macro>``
+    Undefine macro ``<macro>``.
+  ``-I<dir>``
+    Append directory ``<dir>`` to the list of directories searched for include
+    files.
+  ``-L<dir>``
+    Add directory ``<dir>`` to the list of directories to be searched for
+    ``-l``.
+  ``--link-<resource>``
+    Link the extension module with ``<resource>`` as defined by
+    ``numpy_distutils/system_info.py``. E.g. to link with optimized LAPACK
+    libraries (vecLib on MacOSX, ATLAS elsewhere), use ``--link-lapack_opt``.
+    See also the ``--help-link`` switch.
+
+.. note:: 
+  
+  The ``f2py -c`` option must be applied either to an existing ``.pyf`` file
+  (plus the source/object/library files) or one must specify the
+  ``-m <modulename>`` option (plus the sources/object/library files). Use one of
+  the following options:
+
+  .. code-block:: sh
+    
+    f2py -c -m fib1 fib1.f
+
+  or
+
+  .. code-block:: sh
+
+    f2py -m fib1 fib1.f -h fib1.pyf
+    f2py -c fib1.pyf fib1.f
+
+  For more information, see the `Building C and C++ Extensions`__ Python
+  documentation.
  
     __ https://docs.python.org/3/extending/building.html
  
  
-   When building an extension module, a combination of the following
-   macros may be required for non-gcc Fortran compilers:
+When building an extension module, a combination of the following macros may be
+required for non-gcc Fortran compilers:
  
-   .. code-block:: sh
+.. code-block:: sh
  
-     -DPREPEND_FORTRAN
-     -DNO_APPEND_FORTRAN
-     -DUPPERCASE_FORTRAN
+  -DPREPEND_FORTRAN
+  -DNO_APPEND_FORTRAN
+  -DUPPERCASE_FORTRAN
   
-   To test the performance of F2PY generated interfaces, use
-   ``-DF2PY_REPORT_ATEXIT``. Then a report of various timings is
-   printed out at the exit of Python. This feature may not work on
-   all platforms, currently only Linux platform is supported.
+To test the performance of F2PY generated interfaces, use
+``-DF2PY_REPORT_ATEXIT``. Then a report of various timings is printed out at the
+exit of Python. This feature may not work on all platforms, and currently only
+Linux is supported.
   
-   To see whether F2PY generated interface performs copies of array
-   arguments, use ``-DF2PY_REPORT_ON_ARRAY_COPY=<int>``. When the size
-   of an array argument is larger than ``<int>``, a message about
-   the coping is sent to ``stderr``.
+To see whether an F2PY generated interface performs copies of array arguments,
+use ``-DF2PY_REPORT_ON_ARRAY_COPY=<int>``. When the size of an array argument is
+larger than ``<int>``, a message about the copying is sent to ``stderr``.
  
  Other options
  ^^^^^^^^^^^^^
  
-``-m <modulename>``
-  Name of an extension module. Default is ``untitled``.
-
-  .. warning:: Don't use this option if a signature file (\*.pyf) is used.
-``--[no-]lower``
-  Do [not] lower the cases in ``<fortran files>``.  By default,
-  ``--lower`` is assumed with ``-h`` switch, and ``--no-lower``
-  without the ``-h`` switch.
-``--build-dir <dirname>``
-  All F2PY generated files are created in ``<dirname>``.  Default is
-  ``tempfile.mkdtemp()``.
-``--quiet``
-  Run quietly.
-``--verbose``
-  Run with extra verbosity.
-``-v``
-  Print the F2PY version and exit.
-
-Execute ``f2py`` without any options to get an up-to-date list of
-available options.
+  ``-m <modulename>``
+    Name of an extension module. Default is ``untitled``.
+
+  .. warning:: Don't use this option if a signature file (``*.pyf``) is used.
+
+  ``--[no-]lower``
+    Do [not] lower the cases in ``<fortran files>``. By default, ``--lower`` is
+    assumed with ``-h`` switch, and ``--no-lower`` without the ``-h`` switch.
+  ``-include<header>``
+    Writes additional headers in the C wrapper. Can be passed multiple times and
+    generates ``#include <header>`` each time. Note that this is meant to be passed
+    in single quotes and without spaces, for example ``'-include<stdbool.h>'``.
+  ``--build-dir <dirname>``
+    All F2PY generated files are created in ``<dirname>``. Default is
+    ``tempfile.mkdtemp()``.
+  ``--quiet``
+    Run quietly.
+  ``--verbose``
+    Run with extra verbosity.
+  ``--skip-empty-wrappers``
+    Do not generate wrapper files unless required by the inputs.
+    This is a backwards compatibility flag to restore pre 1.22.4 behavior.
+  ``-v``
+    Print the F2PY version and exit.
+
+Execute ``f2py`` without any options to get an up-to-date list of available
+options.
  
  Python module ``numpy.f2py``
  ============================
  
-.. warning::
+The f2py program is written in Python and can be run from inside your code
+to compile Fortran code at runtime, as follows:
+
+.. code-block:: python
  
-  The current Python interface to the ``f2py`` module is not mature and
-  may change in the future.
+    from numpy import f2py
+    with open("add.f") as sourcefile:
+        sourcecode = sourcefile.read()
+    f2py.compile(sourcecode, modulename='add')
+    import add
+
+The source string can be any valid Fortran code. If you want to save
+the extension-module source code then a suitable file-name can be
+provided by the ``source_fn`` keyword to the compile function.
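+
+As a hedged sketch of the ``source_fn`` keyword, the helper below compiles an
+in-memory Fortran 77 string and keeps the written source file. The helper name
+``build_add_module``, the file name ``add_generated.f`` and the Fortran routine
+are assumptions made for illustration, and a working Fortran compiler is
+required for the call to succeed:
+
+.. code-block:: python
+
+    # Fixed-format Fortran 77; the ``Cf2py`` comment is an F2PY directive.
+    FORTRAN_SOURCE = """
+          subroutine add(a, b, c)
+          real*8 a, b, c
+    Cf2py intent(out) c
+          c = a + b
+          end
+    """.replace("\n    ", "\n")  # strip the documentation indent
+
+
+    def build_add_module(source_fn="add_generated.f"):
+        """Compile FORTRAN_SOURCE into a module named ``add``."""
+        # Imported lazily so that defining this helper needs no compiler.
+        from numpy import f2py
+        # ``source_fn`` names the file the source is written to and kept in.
+        return f2py.compile(FORTRAN_SOURCE, modulename="add",
+                            source_fn=source_fn, verbose=False)
+
+A return value of ``0`` from ``build_add_module()`` indicates that the build
+succeeded, after which ``import add`` is expected to work.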
+
+When using ``numpy.f2py`` as a module, the following functions can be invoked.
+
+.. warning::
  
+  The current Python interface to the ``f2py`` module is not mature and may
+  change in the future.
  
  .. automodule:: numpy.f2py
      :members:
  
+Automatic extension module generation
+=====================================
+
+If you want to distribute your f2py extension module, then you only
+need to include the .pyf file and the Fortran code. The distutils
+extensions in NumPy allow you to define an extension module entirely
+in terms of this interface file. A valid ``setup.py`` file allowing
+distribution of the ``add.f`` module (as part of the package
+``f2py_examples`` so that it would be loaded as ``f2py_examples.add``) is:
+
+.. code-block:: python
+
+    def configuration(parent_package='', top_path=None):
+        from numpy.distutils.misc_util import Configuration
+        config = Configuration('f2py_examples', parent_package, top_path)
+        config.add_extension('add', sources=['add.pyf','add.f'])
+        return config
+
+    if __name__ == '__main__':
+        from numpy.distutils.core import setup
+        setup(**configuration(top_path='').todict())
+
+Installation of the new package is easy using::
+
+    pip install .
+
+assuming you have the proper permissions to write to the main
+site-packages directory for the version of Python you are using. For the
+resulting package to work, you need to create a file named ``__init__.py``
+(in the same directory as ``add.pyf``). Notice the extension module is
+defined entirely in terms of the ``add.pyf`` and ``add.f`` files. The
+conversion of the .pyf file to a .c file is handled by `numpy.distutils`.
diff --git a/doc/source/f2py/windows/conda.rst b/doc/source/f2py/windows/conda.rst

new file mode 100644 (file)

index 0000000..b16402b
--- /dev/null
+++ b/doc/source/f2py/windows/conda.rst
@@ -0,0 +1,34 @@
+.. _f2py-win-conda:
+
+=========================
+F2PY and Conda on Windows
+=========================
+
+As a convenience measure, we will additionally assume the
+existence of ``scoop``, which can be used to install tools without
+administrative access.
+
+.. code-block:: powershell
+
+  Invoke-Expression (New-Object System.Net.WebClient).DownloadString('https://get.scoop.sh')
+
+Now we will set up a ``conda`` environment.
+
+.. code-block:: powershell
+
+       scoop install miniconda3
+       # For conda activate / deactivate in powershell
+       conda install -n root -c pscondaenvs pscondaenvs
+       Powershell -c Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
+       conda init powershell
+       # Open a new shell for the rest
+
+``conda`` pulls packages from ``msys2``; however, the UX is sufficiently different to warrant a separate discussion.
+
+.. warning::
+
+       As of 30-01-2022, the `MSYS2 binaries`_ shipped with ``conda`` are **outdated** and this approach is **not preferred**.
+
+
+
+.. _MSYS2 binaries: https://github.com/conda-forge/conda-forge.github.io/issues/1044
diff --git a/doc/source/f2py/windows/index.rst b/doc/source/f2py/windows/index.rst

new file mode 100644 (file)

index 0000000..aee96cb
--- /dev/null
+++ b/doc/source/f2py/windows/index.rst
@@ -0,0 +1,208 @@
+.. _f2py-windows:
+
+=================
+F2PY and Windows
+=================
+
+.. warning::
+
+       F2PY support for Windows is not on par with Linux support, and
+       OS-specific flags can be seen via ``python -m numpy.f2py``.
+
+Broadly speaking, there are two issues working with F2PY on Windows:
+
+- the lack of actively developed FOSS Fortran compilers, and
+- the linking issues related to the C runtime library for building Python-C extensions.
+
+The focus of this section is to establish a guideline for developing and
+extending Fortran modules for Python natively, via F2PY on Windows. 
+
+Overview
+========
+From a user perspective, the most UNIX compatible Windows
+development environment is through emulation, either via the Windows Subsystem
+for Linux, or facilitated by Docker. In a similar vein, traditional
+virtualization methods like VirtualBox are also reasonable methods to develop
+UNIX tools on Windows.
+
+Native Windows support is typically stunted beyond the usage of commercial compilers.
+However, as of 2022, most commercial compilers have free plans which are sufficient for
+general use. Additionally, because ``f2py`` supports only a subset of Fortran
+(partial coverage of Fortran 2003), newer toolchains are often not
+required. Briefly, then, for an end user, in order of use:
+
+Classic Intel Compilers (commercial)
+   These are maintained actively, though licensing restrictions may apply as
+   further detailed in :ref:`f2py-win-intel`.
+
+   Suitable for general use for those building native Windows programs by
+   building off of MSVC.
+
+MSYS2 (FOSS)
+   In conjunction with the ``mingw-w64`` project, ``gfortran`` and ``gcc``
+   toolchains can be used to natively build Windows programs.
+
+Windows Subsystem for Linux
+   Assuming the usage of ``gfortran``, this can be used for cross-compiling
+   Windows applications, but is significantly more complicated.
+
+Conda
+   Windows support for compilers in ``conda`` is facilitated by pulling MSYS2
+   binaries, however these `are outdated`_, and therefore not recommended (as of 30-01-2022).
+
+PGI Compilers (commercial)
+   Unmaintained but sufficient if an existing license is present. Works
+   natively, but has been superseded by the Nvidia HPC SDK, with no `native
+   Windows support`_.
+
+Cygwin (FOSS)
+   Can also be used for ``gfortran``. However, the POSIX API compatibility layer provided by
+   Cygwin is meant to compile UNIX software on Windows, instead of building
+   native Windows programs. This means cross compilation is required.
+
+The compilation suites described so far are compatible with the `now
+deprecated`_ ``np.distutils`` build backend which is exposed by the F2PY CLI.
+Additional build system usage (``meson``, ``cmake``) as described in
+:ref:`f2py-bldsys` allows for a more flexible set of compiler
+backends including:
+
+Intel oneAPI
+   The newer Intel compilers (``ifx``, ``icx``) are based on LLVM and can be
+   used for native compilation. Licensing requirements can be onerous.
+
+Classic Flang (FOSS)
+   The backbone of the PGI compilers was cannibalized to form the "classic" or
+   `legacy version of Flang`_. This may be compiled from source and used
+   natively. `LLVM Flang`_ does not support Windows yet (30-01-2022).
+   
+LFortran (FOSS)
+   One of two LLVM based compilers. Not all of F2PY supported Fortran can be
+   compiled yet (30-01-2022) but uses MSVC for native linking.
+
+
+Baseline
+========
+
+For this document we will assume the following basic tools:
+
+- The IDE being considered is the community supported `Microsoft Visual Studio Code`_
+- The terminal being used is the `Windows Terminal`_
+- The shell environment is assumed to be `Powershell 7.x`_
+- Python 3.10 from `the Microsoft Store`_ and this can be tested with
+   ``Get-Command python.exe`` resolving to
+   ``C:\Users\$USERNAME\AppData\Local\Microsoft\WindowsApps\python.exe``
+- The Microsoft Visual C++ (MSVC) toolset
+
+With this baseline configuration, we will further consider a configuration
+matrix as follows:
+
+.. _table-f2py-winsup-mat:
+
+.. table:: Support matrix, exe implies a Windows installer 
+
+  +----------------------+--------------------+-------------------+
+  | **Fortran Compiler** | **C/C++ Compiler** | **Source**        |
+  +======================+====================+===================+
+  | Intel Fortran        | MSVC / ICC         | exe               |
+  +----------------------+--------------------+-------------------+
+  | GFortran             | MSVC               | MSYS2/exe         |
+  +----------------------+--------------------+-------------------+
+  | GFortran             | GCC                | WSL               |
+  +----------------------+--------------------+-------------------+
+  | Classic Flang        | MSVC               | Source / Conda    |
+  +----------------------+--------------------+-------------------+
+  | Anaconda GFortran    | Anaconda GCC       | exe               |
+  +----------------------+--------------------+-------------------+
+
+For an understanding of the key issues motivating the need for such a matrix,
+`Pauli Virtanen's in-depth post on wheels with Fortran for Windows`_ is an
+excellent resource. An entertaining explanation of an application binary
+interface (ABI) can be found in this post by `JeanHeyd Meneide`_. 
+
+Powershell and MSVC
+====================
+
+MSVC is installed either via the Visual Studio Bundle or the lighter (preferred)
+`Build Tools for Visual Studio`_ with the ``Desktop development with C++``
+setting.
+
+.. note::
+   
+  This can take a significant amount of time as it includes a download of around
+  2GB and requires a restart.
+
+It is possible to use the resulting environment from a `standard command
+prompt`_. However, it is more pleasant to use a `developer powershell`_,
+with a `profile in Windows Terminal`_. This can be achieved by adding the
+following block to the ``profiles->list`` section of the JSON file used to 
+configure Windows Terminal (see ``Settings->Open JSON file``):
+
+.. code-block:: json
+
+  {
+  "name": "Developer PowerShell for VS 2019",
+  "commandline": "powershell.exe -noe -c \"$vsPath = (Join-Path ${env:ProgramFiles(x86)} -ChildPath 'Microsoft Visual Studio\\2019\\BuildTools'); Import-Module (Join-Path $vsPath 'Common7\\Tools\\Microsoft.VisualStudio.DevShell.dll'); Enter-VsDevShell -VsInstallPath $vsPath -SkipAutomaticLocation\"",
+  "icon": "ms-appx:///ProfileIcons/{61c54bbd-c2c6-5271-96e7-009a87ff44bf}.png"
+  }
+
+Now, testing the compiler toolchain could look like:
+
+.. code-block:: powershell
+
+   # New Windows Developer Powershell instance / tab
+   # or
+   $vsPath = (Join-Path ${env:ProgramFiles(x86)} -ChildPath 'Microsoft Visual Studio\\2019\\BuildTools'); 
+   Import-Module (Join-Path $vsPath 'Common7\\Tools\\Microsoft.VisualStudio.DevShell.dll');
+   Enter-VsDevShell -VsInstallPath $vsPath -SkipAutomaticLocation
+   **********************************************************************
+   ** Visual Studio 2019 Developer PowerShell v16.11.9
+   ** Copyright (c) 2021 Microsoft Corporation
+   **********************************************************************
+   cd $HOME
+   echo "#include<stdio.h>" > blah.cpp; echo 'int main(){printf("Hi");return 0;}' >> blah.cpp
+   cl blah.cpp
+   .\blah.exe
+   # Hi
+   rm blah.cpp
+
+It is also possible to check that the environment has been updated correctly
+with ``$ENV:PATH``.
+
+
+Windows Store Python Paths
+==========================
+
+The Microsoft Store version of Python discussed here installs to a
+non-deterministic path that includes a hash. Its scripts directory needs to be
+added to the ``PATH`` variable:
+
+.. code-block:: powershell
+
+   $Env:Path += ";$env:LOCALAPPDATA\packages\pythonsoftwarefoundation.python.3.10_qbz5n2kfra8p0\localcache\local-packages\python310\scripts"
+
+.. toctree::
+   :maxdepth: 2
+
+   intel
+   msys2
+   conda
+   pgi
+
+
+.. _the Microsoft Store: https://www.microsoft.com/en-us/p/python-310/9pjpw5ldxlz5
+.. _Microsoft Visual Studio Code: https://code.visualstudio.com/Download
+.. _more complete POSIX environment: https://www.cygwin.com/
+.. _This MSYS2 document: https://www.msys2.org/wiki/How-does-MSYS2-differ-from-Cygwin/
+.. _Build Tools for Visual Studio: https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2019
+.. _Windows Terminal: https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701?activetab=pivot:overviewtab
+.. _Powershell 7.x: https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-windows?view=powershell-7.1
+.. _standard command prompt: https://docs.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-160#developer_command_file_locations
+.. _developer powershell: https://docs.microsoft.com/en-us/visualstudio/ide/reference/command-prompt-powershell?view=vs-2019
+.. _profile in Windows Terminal: https://techcommunity.microsoft.com/t5/microsoft-365-pnp-blog/add-developer-powershell-and-developer-command-prompt-for-visual/ba-p/2243078
+.. _Pauli Virtanen's in-depth post on wheels with Fortran for Windows: https://pav.iki.fi/blog/2017-10-08/pywingfortran.html#building-python-wheels-with-fortran-for-windows
+.. _Nvidia HPC SDK: https://www.pgroup.com/index.html
+.. _JeanHeyd Meneide: https://thephd.dev/binary-banshees-digital-demons-abi-c-c++-help-me-god-please
+.. _legacy version of Flang: https://github.com/flang-compiler/flang
+.. _native Windows support: https://developer.nvidia.com/nvidia-hpc-sdk-downloads#collapseFour
+.. _are outdated: https://github.com/conda-forge/conda-forge.github.io/issues/1044
+.. _now deprecated: https://github.com/numpy/numpy/pull/20875
+.. _LLVM Flang: https://releases.llvm.org/11.0.0/tools/flang/docs/ReleaseNotes.html
diff --git a/doc/source/f2py/windows/intel.rst b/doc/source/f2py/windows/intel.rst

new file mode 100644 (file)

index 0000000..ab0cea2
--- /dev/null
+++ b/doc/source/f2py/windows/intel.rst
@@ -0,0 +1,57 @@
+.. _f2py-win-intel:
+
+==============================
+F2PY and Windows Intel Fortran
+==============================
+
+As of NumPy 1.23, only the classic Intel compilers (``ifort``) are supported.
+
+.. note::
+
+       The licensing restrictions for beta software `have been relaxed`_ during
+       the transition to the LLVM backed ``ifx/icc`` family of compilers.
+       However, this document does not endorse the usage of Intel in downstream
+       projects due to the issues pertaining to `disassembly of components and
+       liability`_.
+
+       Neither the Python Intel installation nor the `Classic Intel C/C++
+       Compiler`_ is required.
+
+- The `Intel Fortran Compilers`_ come in a combined installer providing both
+  Classic and Beta versions; these also take around a gigabyte and a half or so.
+
+We will consider the classic example of the generation of Fibonacci numbers,
+``fib1.f``, given by:
+
+.. literalinclude:: ../code/fib1.f
+   :language: fortran
+
+For ``cmd.exe`` fans, using the Intel oneAPI command prompt is the easiest approach, as
+it loads the required environment for both ``ifort`` and ``msvc``. Helper batch
+scripts are also provided.
+
+.. code-block:: bat
+
+   REM cmd.exe
+   "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
+   python -m numpy.f2py -c fib1.f -m fib1
+   python -c "import fib1; import numpy as np; a=np.zeros(8); fib1.fib(a); print(a)"
+
+PowerShell usage is a little less pleasant; this configuration works with MSVC as follows:
+
+.. code-block:: powershell
+
+   # Powershell
+   python -m numpy.f2py -c fib1.f -m fib1 --f77exec='C:\Program Files (x86)\Intel\oneAPI\compiler\latest\windows\bin\intel64\ifort.exe' --f90exec='C:\Program Files (x86)\Intel\oneAPI\compiler\latest\windows\bin\intel64\ifort.exe' -L'C:\Program Files (x86)\Intel\oneAPI\compiler\latest\windows\compiler\lib\ia32'
+   python -c "import fib1; import numpy as np; a=np.zeros(8); fib1.fib(a); print(a)"
+   # Alternatively, set environment and reload Powershell in one line
+   cmd.exe /k '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'
+   python -m numpy.f2py -c fib1.f -m fib1
+   python -c "import fib1; import numpy as np; a=np.zeros(8); fib1.fib(a); print(a)"
+
+Note that the actual path to your local installation of ``ifort`` may vary, and the commands above will need to be updated accordingly.
+
+.. _have been relaxed: https://www.intel.com/content/www/us/en/developer/articles/release-notes/oneapi-fortran-compiler-release-notes.html
+.. _disassembly of components and liability: https://software.sintel.com/content/www/us/en/develop/articles/end-user-license-agreement.html
+.. _Intel Fortran Compilers: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#inpage-nav-6-1
+.. _Classic Intel C/C++ Compiler: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#inpage-nav-6-undefined
+\ No newline at end of file
diff --git a/doc/source/f2py/windows/msys2.rst b/doc/source/f2py/windows/msys2.rst

new file mode 100644 (file)

index 0000000..68c4353
--- /dev/null
+++ b/doc/source/f2py/windows/msys2.rst
@@ -0,0 +1,19 @@
+.. _f2py-win-msys2:
+
+===========================
+F2PY and Windows with MSYS2
+===========================
+
+Follow the standard `installation instructions`_. Then, to grab the requisite Fortran compiler with ``MSVC``:
+
+.. code-block:: bash
+
+   # Assuming a fresh install
+   pacman -Syu # Restart the terminal
+   pacman -Su  # Update packages
+   # Get the toolchains
+   pacman -S --needed base-devel gcc-fortran
+   pacman -S mingw-w64-x86_64-toolchain
+
+
+.. _`installation instructions`: https://www.msys2.org/
diff --git a/doc/source/f2py/windows/pgi.rst b/doc/source/f2py/windows/pgi.rst

new file mode 100644 (file)

index 0000000..3139d9c
--- /dev/null
+++ b/doc/source/f2py/windows/pgi.rst
@@ -0,0 +1,28 @@
+.. _f2py-win-pgi:
+
+===============================
+F2PY and PGI Fortran on Windows
+===============================
+
+A variant of these compilers is part of the so-called "classic" Flang; however,
+classic Flang requires a custom LLVM and compilation from source.
+
+.. warning::
+
+       Since the proprietary compilers are no longer available for
+       usage they are not recommended and will not be ported to the
+       new ``f2py`` CLI. 
+       
+
+
+.. note::
+
+       **As of November 2021**
+
+       As of 29-01-2022, `PGI compiler toolchains`_ have been superseded by the Nvidia
+       HPC SDK, with no `native Windows support`_.
+
+However, 
+
+.. _PGI compiler toolchains: https://www.pgroup.com/index.html
+.. _native Windows support: https://developer.nvidia.com/nvidia-hpc-sdk-downloads#collapseFour
+\ No newline at end of file
diff --git a/doc/source/glossary.rst b/doc/source/glossary.rst

index aa2dc13dff1f68ccdeeba04117b58a575cd2cf51..eebebd9dc555389ad06967d44928713f1827cd64 100644 (file)
--- a/doc/source/glossary.rst
+++ b/doc/source/glossary.rst
@@ -166,10 +166,10 @@ Glossary
  
  
     array scalar
-       An :doc:`array scalar <reference/arrays.scalars>` is an instance of the types/classes float32, float64, 
-       etc.. For uniformity in handling operands, NumPy treats a scalar as 
-       an array of zero dimension. In contrast, a 0-dimensional array is an :doc:`ndarray <reference/arrays.ndarray>` instance 
-       containing precisely one value. 
+       An :doc:`array scalar <reference/arrays.scalars>` is an instance of the types/classes float32, float64,
+       etc. For uniformity in handling operands, NumPy treats a scalar as
+       an array of zero dimension. In contrast, a 0-dimensional array is an :doc:`ndarray <reference/arrays.ndarray>` instance
+       containing precisely one value.
  
  
     axis
@@ -280,10 +280,36 @@ Glossary
  
  
     contiguous
-       An array is contiguous if
-           * it occupies an unbroken block of memory, and
-           * array elements with higher indexes occupy higher addresses (that
-             is, no :term:`stride` is negative).
+
+       An array is contiguous if:
+
+       - it occupies an unbroken block of memory, and
+       - array elements with higher indexes occupy higher addresses (that
+         is, no :term:`stride` is negative).
+
+       There are two types of proper-contiguous NumPy arrays:
+
+       - Fortran-contiguous arrays refer to data that is stored column-wise,
+         i.e. the indexing of data as stored in memory starts from the
+         lowest dimension;
+       - C-contiguous, or simply contiguous arrays, refer to data that is
+         stored row-wise, i.e. the indexing of data as stored in memory
+         starts from the highest dimension.
+
+       For one-dimensional arrays these notions coincide.
+
+       For example, a 2x2 array ``A`` is Fortran-contiguous if its elements are
+       stored in memory in the following order::
+
+           A[0,0] A[1,0] A[0,1] A[1,1]
+
+       and C-contiguous if the order is as follows::
+
+           A[0,0] A[0,1] A[1,0] A[1,1]
+
+       To test whether an array is C-contiguous, use the ``.flags.c_contiguous``
+       attribute of NumPy arrays.  To test for Fortran contiguity, use the
+       ``.flags.f_contiguous`` attribute.
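+
+       As a minimal sketch of these checks (the names ``c_arr`` and ``f_arr``
+       are illustrative assumptions, not part of NumPy):
+
+       .. code-block:: python
+
+           import numpy as np
+
+           # np.arange(...).reshape(...) yields a C-contiguous (row-wise) array.
+           c_arr = np.arange(4).reshape(2, 2)
+           # np.asfortranarray copies the data into column-wise memory order.
+           f_arr = np.asfortranarray(c_arr)
+
+           print(c_arr.flags.c_contiguous, c_arr.flags.f_contiguous)
+           print(f_arr.flags.c_contiguous, f_arr.flags.f_contiguous)
+
+           # order="K" reads the elements in the order they occur in memory,
+           # so the two layouts become visible despite equal array contents.
+           print(c_arr.ravel(order="K").tolist())
+           print(f_arr.ravel(order="K").tolist())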
  
  
     copy
@@ -395,7 +421,7 @@ Glossary
         both flatten an ndarray. ``ravel`` will return a view if possible;
         ``flatten`` always returns a copy.
  
-       Flattening collapses a multimdimensional array to a single dimension;
+       Flattening collapses a multidimensional array to a single dimension;
         details of how this is done (for instance, whether ``a[n+1]`` should be
         the next row or next column) are parameters.
  
diff --git a/doc/source/reference/array_api.rst b/doc/source/reference/array_api.rst

new file mode 100644 (file)

index 0000000..a6a8619
--- /dev/null
+++ b/doc/source/reference/array_api.rst
@@ -0,0 +1,803 @@
+.. _array_api:
+
+********************************
+Array API Standard Compatibility
+********************************
+
+.. note::
+
+   The ``numpy.array_api`` module is still experimental. See `NEP 47
+   <https://numpy.org/neps/nep-0047-array-api-standard.html>`__.
+
+NumPy includes a reference implementation of the `array API standard
+<https://data-apis.org/array-api/latest/>`__ in ``numpy.array_api``. `NEP 47
+<https://numpy.org/neps/nep-0047-array-api-standard.html>`__ describes the
+motivation and scope for implementing the array API standard in NumPy.
+
+The ``numpy.array_api`` module serves as a minimal, reference implementation
+of the array API standard. In being minimal, the module only implements those
+things that are explicitly required by the specification. Certain things are
+allowed by the specification but are explicitly disallowed in
+``numpy.array_api``. This is so that the module can serve as a reference
+implementation for users of the array API standard. Any consumer of the array
+API can test their code against ``numpy.array_api`` and be sure that they
+aren't using any features that aren't guaranteed by the spec, and which may
+not be present in other conforming libraries.
+
+The ``numpy.array_api`` module is not documented here. For a listing of the
+functions present in the array API specification, refer to the `array API
+standard <https://data-apis.org/array-api/latest/>`__. The ``numpy.array_api``
+implementation is functionally complete, so all functionality described in the
+standard is implemented.
+
+.. _array_api-differences:
+
+Table of Differences between ``numpy.array_api`` and ``numpy``
+==============================================================
+
+This table outlines the primary differences between ``numpy.array_api`` and
+the main ``numpy`` namespace. There are three types of differences:
+
+1. **Strictness**. Things that are only done so that ``numpy.array_api`` is a
+   strict, minimal implementation. They aren't actually required by the spec,
+   and other conforming libraries may not follow them. In most cases, the spec
+   does not specify or require any behavior outside of the given domain. The
+   main ``numpy`` namespace would not need to change in any way to be
+   spec-compatible for these.
+
+2. **Compatible**. Things that could be added to the main ``numpy`` namespace
+   without breaking backwards compatibility.
+
+3. **Breaking**. Things that would break backwards compatibility if
+   implemented in the main ``numpy`` namespace.
+
+Name Differences
+----------------
+
+Many functions have been renamed in the spec from NumPy. These are otherwise
+identical in behavior, and are thus all **compatible** changes, unless
+otherwise noted.
+
+.. _array_api-name-changes:
+
+Function Name Changes
+~~~~~~~~~~~~~~~~~~~~~
+
+The following functions are named differently in the array API:
+
+.. list-table::
+   :header-rows: 1
+
+   * - Array API name
+     - NumPy namespace name
+     - Notes
+   * - ``acos``
+     - ``arccos``
+     -
+   * - ``acosh``
+     - ``arccosh``
+     -
+   * - ``asin``
+     - ``arcsin``
+     -
+   * - ``asinh``
+     - ``arcsinh``
+     -
+   * - ``atan``
+     - ``arctan``
+     -
+   * - ``atan2``
+     - ``arctan2``
+     -
+   * - ``atanh``
+     - ``arctanh``
+     -
+   * - ``bitwise_left_shift``
+     - ``left_shift``
+     -
+   * - ``bitwise_invert``
+     - ``invert``
+     -
+   * - ``bitwise_right_shift``
+     - ``right_shift``
+     -
+   * - ``bool``
+     - ``bool_``
+     - This is **breaking** because ``np.bool`` is currently a deprecated
+       alias for the built-in ``bool``.
+   * - ``concat``
+     - ``concatenate``
+     -
+   * - ``matrix_norm`` and ``vector_norm``
+     - ``norm``
+     - ``matrix_norm`` and ``vector_norm`` each do a limited subset of what
+       ``np.linalg.norm`` does.
+   * - ``permute_dims``
+     - ``transpose``
+     - Unlike ``np.transpose``, the ``axes`` keyword-argument to
+       ``permute_dims`` is required.
+   * - ``pow``
+     - ``power``
+     -
+   * - ``unique_all``, ``unique_counts``, ``unique_inverse``, and
+       ``unique_values``
+     - ``unique``
+     - Each is equivalent to ``np.unique`` with certain flags set.
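+
+As a rough illustration of a few of the renames above, using only the main
+``numpy`` namespace (the ``RENAMES`` mapping and the ``resolve`` helper are
+hypothetical names for this sketch and cover only some rows of the table):
+
+.. code-block:: python
+
+   import numpy as np
+
+   # Array API name -> main numpy namespace name (subset of the table).
+   RENAMES = {
+       "acos": "arccos",
+       "bitwise_invert": "invert",
+       "concat": "concatenate",
+       "pow": "power",
+   }
+
+   def resolve(api_name):
+       """Look up the main-namespace function for an array API name."""
+       return getattr(np, RENAMES.get(api_name, api_name))
+
+   joined = resolve("concat")([np.array([1, 2]), np.array([3])])
+   squares = resolve("pow")(np.array([1, 2, 3]), 2)
+
+   # ``unique_counts`` corresponds to ``np.unique`` with ``return_counts=True``.
+   values, counts = np.unique(np.array([3, 1, 3]), return_counts=True)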
+
+
+Function instead of method
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- ``astype`` is a function in the array API, whereas it is a method on
+  ``ndarray`` in ``numpy``.
+
+
+``linalg`` Namespace Differences
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These functions are in the ``linalg`` sub-namespace in the array API, but are
+only in the top-level namespace in NumPy:
+
+- ``cross``
+- ``diagonal``
+- ``matmul`` (*)
+- ``outer``
+- ``tensordot`` (*)
+- ``trace``
+
+(*): These functions are also in the top-level namespace in the array API.
+
+Keyword Argument Renames
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following functions have keyword arguments that have been renamed. The
+functionality of the keyword argument is identical unless otherwise stated.
+Each new keyword argument is not already present on the given function in
+``numpy``, so the changes are **compatible**.
+
+Note, this page does not list function keyword arguments that are in the main
+``numpy`` namespace but not in the array API. Such keyword arguments are
+omitted from ``numpy.array_api`` for **strictness**, as the spec allows
+functions to include additional keyword arguments beyond those required.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Function
+     - Array API keyword name
+     - NumPy keyword name
+     - Notes
+   * - ``argsort`` and ``sort``
+     - ``stable``
+     - ``kind``
+     - The definitions of ``stable`` and ``kind`` differ, as do the default
+       values. The change of the default value makes this **breaking**. See
+       :ref:`array_api-set-functions-differences`.
+   * - ``matrix_rank``
+     - ``rtol``
+     - ``tol``
+     - The definitions of ``rtol`` and ``tol`` differ, as do the default
+       values. The change of the default value makes this **breaking**. See
+       :ref:`array_api-linear-algebra-differences`.
+   * - ``pinv``
+     - ``rtol``
+     - ``rcond``
+     - The definitions of ``rtol`` and ``rcond`` are the same, but their
+       default values differ, making this **breaking**. See
+       :ref:`array_api-linear-algebra-differences`.
+   * - ``std`` and ``var``
+     - ``correction``
+     - ``ddof``
+     -
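+
+To make the NumPy side of this table concrete, the following sketch uses only
+the main ``numpy`` namespace with its existing ``ddof`` and ``kind`` keywords.
+The sample data is an assumption chosen for illustration:
+
+.. code-block:: python
+
+   import numpy as np
+
+   x = np.array([1.0, 2.0, 3.0, 4.0])
+
+   # ddof=0 (the default) divides by N; ddof=1 divides by N - 1.
+   population_var = np.var(x)
+   sample_var = np.var(x, ddof=1)
+
+   # kind="stable" keeps equal elements in their original relative order.
+   order = np.argsort(np.array([2, 1, 2, 1]), kind="stable")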
+
+
+.. _array_api-type-promotion-differences:
+
+Type Promotion Differences
+--------------------------
+
+Type promotion is the biggest area where NumPy deviates from the spec. The
+most notable difference is that NumPy does value-based casting in many cases.
+The spec explicitly disallows value-based casting. In the array API, the
+result type of any operation is always determined entirely by the input types,
+independently of values or shapes.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - Limited set of dtypes.
+     - **Strictness**
+     - ``numpy.array_api`` only implements those `dtypes that are required by
+       the spec
+       <https://data-apis.org/array-api/latest/API_specification/data_types.html>`__.
+   * - Operators (like ``+``) with Python scalars only accept matching
+       scalar types.
+     - **Strictness**
+     - For example, ``<int32 array> + 1.0`` is not allowed. See `the spec
+       rules for mixing arrays and Python scalars
+       <https://data-apis.org/array-api/latest/API_specification/type_promotion.html#mixing-arrays-with-python-scalars>`__.
+   * - Operators (like ``+``) with Python scalars always return the same dtype
+       as the array.
+     - **Breaking**
+     - For example, ``numpy.array_api.asarray(0., dtype=float32) + 1e64`` is a
+       ``float32`` array.
+   * - In-place operators are disallowed when the left-hand side would be
+       promoted.
+     - **Breaking**
+     - Example: ``a = np.array(1, dtype=np.int8); a += np.array(1, dtype=np.int16)``. The spec explicitly disallows this.
+   * - ``int`` promotion for operators is only specified for integers within
+       the bounds of the dtype.
+     - **Strictness**
+     - ``numpy.array_api`` falls back to ``np.ndarray`` behavior (either
+       cast or raise ``OverflowError``).
+   * - ``__pow__`` and ``__rpow__`` do not do value-based casting for 0-D
+       arrays.
+     - **Breaking**
+     - For example, ``np.array(0., dtype=float32)**np.array(0.,
+       dtype=float64)`` is ``float32``. Note that this is value-based casting
+       on 0-D arrays, not scalars.
+   * - No cross-kind casting.
+     - **Strictness**
+     - Namely, boolean, integer, and floating-point data types do not cast to
+       each other, except explicitly with ``astype`` (this is separate from
+       the behavior with Python scalars).
+   * - No casting unsigned integer dtypes to floating dtypes (e.g., ``int64 +
+       uint64 -> float64``).
+     - **Strictness**
+     -
+   * - ``can_cast`` and ``result_type`` are restricted.
+     - **Strictness**
+     - The ``numpy.array_api`` implementations disallow cross-kind casting.
+   * - ``sum`` and ``prod`` always upcast ``float32`` to ``float64`` when
+       ``dtype=None``.
+     - **Breaking**
+     -
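+
+Some of the main-namespace behaviors referenced in this table can be observed
+directly. This is a sketch using only ``numpy`` itself, not
+``numpy.array_api``, and the variable names are illustrative:
+
+.. code-block:: python
+
+   import numpy as np
+
+   # Promotion determined by the dtypes alone.
+   same_kind = np.result_type(np.int8, np.int16)
+
+   # Main numpy promotes int64 with uint64 to a floating dtype, which the
+   # strict ``numpy.array_api`` namespace does not allow.
+   mixed = np.result_type(np.int64, np.uint64)
+
+   # Main numpy accepts this in-place operation and keeps the dtype of the
+   # left-hand side; the spec disallows it.
+   a = np.array(1, dtype=np.int8)
+   a += np.array(1, dtype=np.int16)
+   print(same_kind, mixed, a.dtype)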
+
+Indexing Differences
+--------------------
+
+The spec requires only a subset of indexing, but all indexing rules in the
+spec are compatible with NumPy's broader indexing rules.
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - No implicit ellipses (``...``).
+     - **Strictness**
+     - If an index does not include an ellipsis, all axes must be indexed.
+   * - The start and stop of a slice may not be out of bounds.
+     - **Strictness**
+     - For a slice ``i:j:k``, only the following are allowed:
+
+       - ``i`` or ``j`` omitted (``None``).
+       - ``-n <= i <= max(0, n - 1)``.
+       - For ``k > 0`` or ``k`` omitted (``None``), ``-n <= j <= n``.
+       - For ``k < 0``, ``-n - 1 <= j <= max(0, n - 1)``.
+   * - Boolean array indices are only allowed as the sole index.
+     - **Strictness**
+     -
+   * - Integer array indices are not allowed at all.
+     - **Strictness**
+     - With the exception of 0-D arrays, which are treated like integers.
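+
+The slice bounds listed in the table can be written as a small predicate. This
+is a sketch derived only from the rules above, not the actual
+``numpy.array_api`` implementation, and ``slice_in_bounds`` is a hypothetical
+helper name:
+
+.. code-block:: python
+
+   def slice_in_bounds(i, j, k, n):
+       """Check slice ``i:j:k`` against the table's bounds for an axis of size n."""
+       # An omitted start (None) is always allowed.
+       start_ok = i is None or -n <= i <= max(0, n - 1)
+       if j is None:
+           # An omitted stop (None) is always allowed.
+           stop_ok = True
+       elif k is None or k > 0:
+           stop_ok = -n <= j <= n
+       else:
+           stop_ok = -n - 1 <= j <= max(0, n - 1)
+       return start_ok and stop_ok
+
+   print(slice_in_bounds(0, 5, None, 5))   # stop == n is allowed
+   print(slice_in_bounds(0, 6, None, 5))   # stop > n is out of bounds
+   print(slice_in_bounds(4, -6, -1, 5))    # stop == -n - 1 is allowed for k < 0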
+
+.. _array_api-type-strictness:
+
+Type Strictness
+---------------
+
+Functions in ``numpy.array_api`` restrict their inputs to only those dtypes
+that are explicitly required by the spec, even when the wrapped corresponding
+NumPy function would allow a broader set. Here, we list each function and the
+dtypes that are allowed in ``numpy.array_api``. These are **strictness**
+differences because the spec does not require that other dtypes result in an
+error. The categories here are defined as follows:
+
+- **Floating-point**: ``float32`` or ``float64``.
+- **Integer**: Any signed or unsigned integer dtype (``int8``, ``int16``,
+  ``int32``, ``int64``, ``uint8``, ``uint16``, ``uint32``, or ``uint64``).
+- **Boolean**: ``bool``.
+- **Integer or boolean**: Any signed or unsigned integer dtype, or ``bool``.
+  For two-argument functions, both arguments must be integer or both must be
+  ``bool``.
+- **Numeric**: Any integer or floating-point dtype. For two-argument
+  functions, both arguments must be integer or both must be
+  floating-point.
+- **All**: Any of the above dtype categories. For two-argument functions, both
+  arguments must be the same kind (integer, floating-point, or boolean).
+
+In all cases, the return dtype is chosen according to `the rules outlined in
+the spec
+<https://data-apis.org/array-api/latest/API_specification/type_promotion.html>`__,
+and does not differ from NumPy's return dtype for any of the allowed input
+dtypes, except in the cases mentioned specifically in the subsections below.
+
+Elementwise Functions
+~~~~~~~~~~~~~~~~~~~~~
+
+.. list-table::
+   :header-rows: 1
+
+   * - Function Name
+     - Dtypes
+   * - ``abs``
+     - Numeric
+   * - ``acos`` (*)
+     - Floating-point
+   * - ``acosh`` (*)
+     - Floating-point
+   * - ``add``
+     - Numeric
+   * - ``asin`` (*)
+     - Floating-point
+   * - ``asinh`` (*)
+     - Floating-point
+   * - ``atan`` (*)
+     - Floating-point
+   * - ``atan2`` (*)
+     - Floating-point
+   * - ``atanh`` (*)
+     - Floating-point
+   * - ``bitwise_and``
+     - Integer or boolean
+   * - ``bitwise_invert`` (*)
+     - Integer or boolean
+   * - ``bitwise_left_shift`` (*)
+     - Integer
+   * - ``bitwise_or``
+     - Integer or boolean
+   * - ``bitwise_right_shift`` (*)
+     - Integer
+   * - ``bitwise_xor``
+     - Integer or boolean
+   * - ``ceil``
+     - Numeric
+   * - ``cos``
+     - Floating-point
+   * - ``cosh``
+     - Floating-point
+   * - ``divide``
+     - Floating-point
+   * - ``equal``
+     - All
+   * - ``exp``
+     - Floating-point
+   * - ``expm1``
+     - Floating-point
+   * - ``floor``
+     - Numeric
+   * - ``floor_divide``
+     - Numeric
+   * - ``greater``
+     - Numeric
+   * - ``greater_equal``
+     - Numeric
+   * - ``isfinite``
+     - Numeric
+   * - ``isinf``
+     - Numeric
+   * - ``isnan``
+     - Numeric
+   * - ``less``
+     - Numeric
+   * - ``less_equal``
+     - Numeric
+   * - ``log``
+     - Floating-point
+   * - ``logaddexp``
+     - Floating-point
+   * - ``log10``
+     - Floating-point
+   * - ``log1p``
+     - Floating-point
+   * - ``log2``
+     - Floating-point
+   * - ``logical_and``
+     - Boolean
+   * - ``logical_not``
+     - Boolean
+   * - ``logical_or``
+     - Boolean
+   * - ``logical_xor``
+     - Boolean
+   * - ``multiply``
+     - Numeric
+   * - ``negative``
+     - Numeric
+   * - ``not_equal``
+     - All
+   * - ``positive``
+     - Numeric
+   * - ``pow`` (*)
+     - Numeric
+   * - ``remainder``
+     - Numeric
+   * - ``round``
+     - Numeric
+   * - ``sign``
+     - Numeric
+   * - ``sin``
+     - Floating-point
+   * - ``sinh``
+     - Floating-point
+   * - ``sqrt``
+     - Floating-point
+   * - ``square``
+     - Numeric
+   * - ``subtract``
+     - Numeric
+   * - ``tan``
+     - Floating-point
+   * - ``tanh``
+     - Floating-point
+   * - ``trunc``
+     - Numeric
+
+(*) These functions have different names from the main ``numpy`` namespace.
+See :ref:`array_api-name-changes`.
+
+Creation Functions
+~~~~~~~~~~~~~~~~~~
+
+.. list-table::
+   :header-rows: 1
+
+   * - Function Name
+     - Dtypes
+   * - ``meshgrid``
+     - Any (all input dtypes must be the same)
+
+
+Linear Algebra Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. list-table::
+   :header-rows: 1
+
+   * - Function Name
+     - Dtypes
+   * - ``cholesky``
+     - Floating-point
+   * - ``cross``
+     - Numeric
+   * - ``det``
+     - Floating-point
+   * - ``diagonal``
+     - Any
+   * - ``eigh``
+     - Floating-point
+   * - ``eigvalsh``
+     - Floating-point
+   * - ``inv``
+     - Floating-point
+   * - ``matmul``
+     - Numeric
+   * - ``matrix_norm`` (*)
+     - Floating-point
+   * - ``matrix_power``
+     - Floating-point
+   * - ``matrix_rank``
+     - Floating-point
+   * - ``matrix_transpose`` (**)
+     - Any
+   * - ``outer``
+     - Numeric
+   * - ``pinv``
+     - Floating-point
+   * - ``qr``
+     - Floating-point
+   * - ``slogdet``
+     - Floating-point
+   * - ``solve``
+     - Floating-point
+   * - ``svd``
+     - Floating-point
+   * - ``svdvals`` (**)
+     - Floating-point
+   * - ``tensordot``
+     - Numeric
+   * - ``trace``
+     - Numeric
+   * - ``vecdot`` (**)
+     - Numeric
+   * - ``vector_norm`` (*)
+     - Floating-point
+
+(*) These functions are split from ``norm`` from the main ``numpy`` namespace.
+See :ref:`array_api-name-changes`.
+
+(**) These functions are new in the array API and are not in the main
+``numpy`` namespace.
+
+Array Object
+~~~~~~~~~~~~
+
+All the special ``__operator__`` methods on the array object behave
+identically to their corresponding functions (see `the spec
+<https://data-apis.org/array-api/latest/API_specification/array_object.html#methods>`__
+for a list of which methods correspond to which functions). The exception is
+that operators explicitly allow Python scalars according to the `rules
+outlined in the spec
+<https://data-apis.org/array-api/latest/API_specification/type_promotion.html#mixing-arrays-with-python-scalars>`__
+(see :ref:`array_api-type-promotion-differences`).
+
+
+Array Object Differences
+------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - No array scalars
+     - **Strictness**
+     - The spec does not have array scalars, only 0-D arrays. However, other
+       than the promotion differences outlined in
+       :ref:`array_api-type-promotion-differences`, scalars duck type as 0-D
+       arrays for the purposes of the spec. They are immutable, but the spec
+       `does not require mutability
+       <https://data-apis.org/array-api/latest/design_topics/copies_views_and_mutation.html>`__.
+   * - ``bool()``, ``int()``, and ``float()`` only work on 0-D arrays.
+     - **Strictness**
+     - See https://github.com/numpy/numpy/issues/10404.
+   * - ``__imatmul__``
+     - **Compatible**
+     - ``np.ndarray`` does not currently implement ``__imatmul__``. Note that
+       ``a @= b`` should only be defined when it does not change the shape of
+       ``a``.
+   * - The ``mT`` attribute for matrix transpose.
+     - **Compatible**
+     - See `the spec definition
+       <https://data-apis.org/array-api/latest/API_specification/generated/signatures.array_object.array.mT.html>`__
+       for ``mT``.
+   * - The ``T`` attribute should error if the input is not 2-dimensional.
+     - **Breaking**
+     - See `the note in the spec
+       <https://data-apis.org/array-api/latest/API_specification/generated/signatures.array_object.array.T.html>`__.
+   * - New method ``to_device`` and attribute ``device``
+     - **Compatible**
+     - The methods would effectively not do anything since NumPy is CPU only.
+
+Creation Functions Differences
+------------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - ``copy`` keyword argument to ``asarray``
+     - **Compatible**
+     -
+   * - New ``device`` keyword argument to all array creation functions
+       (``asarray``, ``arange``, ``empty``, ``empty_like``, ``eye``, ``full``,
+       ``full_like``, ``linspace``, ``ones``, ``ones_like``, ``zeros``, and
+       ``zeros_like``).
+     - **Compatible**
+     - ``device`` would effectively do nothing, since NumPy is CPU only.
+
+Elementwise Functions Differences
+---------------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - Various functions have been renamed.
+     - **Compatible**
+     - See :ref:`array_api-name-changes`.
+   * - Elementwise functions are only defined for given input type
+       combinations.
+     - **Strictness**
+     - See :ref:`array_api-type-strictness`.
+   * - ``bitwise_left_shift`` and ``bitwise_right_shift`` are only defined for
+       ``x2`` nonnegative.
+     - **Strictness**
+     -
+   * - ``ceil``, ``floor``, and ``trunc`` return an integer with integer
+       input.
+     - **Breaking**
+     - ``np.ceil``, ``np.floor``, and ``np.trunc`` return a floating-point
+       dtype on integer dtype input.
+
+.. _array_api-linear-algebra-differences:
+
+Linear Algebra Differences
+--------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - ``cholesky`` includes an ``upper`` keyword argument.
+     - **Compatible**
+     -
+   * - ``cross`` does not allow size 2 vectors (only size 3).
+     - **Breaking**
+     -
+   * - ``diagonal`` operates on the last two axes.
+     - **Breaking**
+     - Strictly speaking this can be **compatible** because ``diagonal`` is
+       moved to the ``linalg`` namespace.
+   * - ``eigh``, ``qr``, ``slogdet`` and ``svd`` return a named tuple.
+     - **Compatible**
+     - The corresponding ``numpy`` functions return a ``tuple``, with the
+       resulting arrays in the same order.
+   * - New functions ``matrix_norm`` and ``vector_norm``.
+     - **Compatible**
+     - The ``norm`` function has been omitted from the array API and split
+       into ``matrix_norm`` for matrix norms and ``vector_norm`` for vector
+       norms. Note that ``vector_norm`` supports any number of axes, whereas
+       ``np.linalg.norm`` only supports a single axis for vector norms.
+   * - ``matrix_rank`` has an ``rtol`` keyword argument instead of ``tol``.
+     - **Breaking**
+     - In the array API, ``rtol`` filters singular values smaller than
+       ``rtol * largest_singular_value``. In ``np.linalg.matrix_rank``,
+       ``tol`` filters singular values smaller than ``tol``. Furthermore, the
+       default value for ``rtol`` is ``max(M, N) * eps``, whereas the default
+       value of ``tol`` in ``np.linalg.matrix_rank`` is ``S.max() *
+       max(M, N) * eps``, where ``S`` is the singular values of the input. The
+       new flag name is compatible but the default change is breaking.
+   * - ``matrix_rank`` does not support 1-dimensional arrays.
+     - **Breaking**
+     -
+   * - New function ``matrix_transpose``.
+     - **Compatible**
+     - Unlike ``np.transpose``, ``matrix_transpose`` only transposes the last
+       two axes. See `the spec definition
+       <https://data-apis.org/array-api/latest/API_specification/generated/signatures.linear_algebra_functions.matrix_transpose.html#signatures.linear_algebra_functions.matrix_transpose>`__
+   * - ``outer`` only supports 1-dimensional arrays.
+     - **Breaking**
+     - The spec currently only specifies behavior on 1-D arrays but future
+       behavior will likely be to broadcast, rather than flatten, which is
+       what ``np.outer`` does.
+   * - ``pinv`` has an ``rtol`` keyword argument instead of ``rcond``
+     - **Breaking**
+     - The meaning of ``rtol`` and ``rcond`` is the same, but the default
+       value for ``rtol`` is ``max(M, N) * eps``, whereas the default value
+       for ``rcond`` is ``1e-15``. The new flag name is compatible but the
+       default change is breaking.
+   * - ``solve`` only accepts ``x2`` as a vector when it is exactly
+       1-dimensional.
+     - **Breaking**
+     - The ``np.linalg.solve`` behavior is ambiguous. See `this numpy issue
+       <https://github.com/numpy/numpy/issues/15349>`__ and `this array API
+       specification issue
+       <https://github.com/data-apis/array-api/issues/285>`__ for more
+       details.
+   * - New function ``svdvals``.
+     - **Compatible**
+     - Equivalent to ``np.linalg.svd(compute_uv=False)``.
+   * - The ``axes`` keyword to ``tensordot`` must be a tuple.
+     - **Compatible**
+     - In ``np.tensordot``, it can also be an array or array-like.
+   * - ``trace`` operates on the last two axes.
+     - **Breaking**
+     - ``np.trace`` operates on the first two axes by default. Note that the
+       array API ``trace`` does not allow specifying which axes to operate on.
+
+Manipulation Functions Differences
+----------------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - Various functions have been renamed
+     - **Compatible**
+     - See :ref:`array_api-name-changes`.
+   * - ``concat`` has different default casting rules from ``np.concatenate``
+     - **Strictness**
+     - No cross-kind casting. No value-based casting on scalars (when ``axis=None``).
+   * - ``stack`` has different default casting rules from ``np.stack``
+     - **Strictness**
+     - No cross-kind casting.
+   * - New function ``permute_dims``.
+     - **Compatible**
+     - Unlike ``np.transpose``, the ``axes`` keyword argument to
+       ``permute_dims`` is required.
+   * - ``reshape`` function has a ``copy`` keyword argument
+     - **Compatible**
+     - See https://github.com/numpy/numpy/issues/9818.
+
+Set Functions Differences
+-------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - New functions ``unique_all``, ``unique_counts``, ``unique_inverse``,
+       and ``unique_values``.
+     - **Compatible**
+     - See :ref:`array_api-name-changes`.
+   * - The four ``unique_*`` functions return a named tuple.
+     - **Compatible**
+     -
+   * - ``unique_all`` and ``unique_inverse`` return inverse indices with the
+       same shape as ``x``.
+     - **Compatible**
+     - See https://github.com/numpy/numpy/issues/20638.
+
+.. _array_api-set-functions-differences:
+
+Sorting Functions Differences
+-----------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - ``argsort`` and ``sort`` have a ``stable`` keyword argument instead of
+       ``kind``.
+     - **Breaking**
+     - ``stable`` is a boolean keyword argument, defaulting to ``True``.
+       ``kind`` takes a string, defaulting to ``"quicksort"``. ``stable=True``
+       is equivalent to ``kind="stable"`` and ``stable=False`` is equivalent to
+       ``kind="quicksort"``, although any sorting algorithm is allowed by the
+       spec when ``stable=False``. The new flag name is compatible but the
+       default change is breaking.
+   * - ``argsort`` and ``sort`` have a ``descending`` keyword argument.
+     - **Compatible**
+     -
+
+Statistical Functions Differences
+---------------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - ``sum`` and ``prod`` always upcast ``float32`` to ``float64`` when
+       ``dtype=None``.
+     - **Breaking**
+     -
+   * - The ``std`` and ``var`` functions have a ``correction`` keyword
+       argument instead of ``ddof``.
+     - **Compatible**
+     -
+
+Other Differences
+-----------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Feature
+     - Type
+     - Notes
+   * - Dtypes can only be spelled as dtype objects.
+     - **Strictness**
+     - For example, ``numpy.array_api.asarray([0], dtype='int32')`` is not
+       allowed.
+   * - ``asarray`` is not implicitly called in any function.
+     - **Strictness**
+     - The exception is Python operators, which accept Python scalars in
+       certain cases (see :ref:`array_api-type-promotion-differences`).
+   * - ``tril`` and ``triu`` require the input to be at least 2-D.
+     - **Strictness**
+     -
+   * - ``finfo()`` return type uses ``float`` for the various attributes.
+     - **Strictness**
+     - The spec allows duck typing, so ``finfo`` returning dtype
+       scalars is considered type compatible with ``float``.
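The ``matrix_rank`` entry in the linear algebra table above contrasts an absolute cutoff (``tol``) with a relative one (``rtol``). A minimal sketch of that difference using plain NumPy follows. It is not the ``numpy.array_api`` implementation; ``rank_with_rtol`` is a hypothetical helper written here only to mirror the rule "discard singular values smaller than ``rtol * largest_singular_value``".

```python
import numpy as np


def rank_with_rtol(x, rtol):
    """Hypothetical helper: count singular values above rtol * largest one."""
    s = np.linalg.svd(x, compute_uv=False)
    return int(np.count_nonzero(s > rtol * s.max()))


# Singular values of this diagonal matrix are 100, 0.5 and 0.001.
x = np.diag([100.0, 0.5, 0.001])

# Absolute cutoff: singular values <= 0.01 are treated as zero.
rank_abs = int(np.linalg.matrix_rank(x, tol=0.01))

# Relative cutoff: the threshold becomes 0.01 * 100 = 1.0.
rank_rel = rank_with_rtol(x, rtol=0.01)

print(rank_abs, rank_rel)
```

The same numeric value passed as ``tol`` and as ``rtol`` gives different ranks here, which is why the table marks the change as breaking even though the keyword rename alone would be compatible.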
diff --git a/doc/source/reference/arrays.classes.rst b/doc/source/reference/arrays.classes.rst

index 92c271f6b964df0d1309bdbb4bfe655d6cc61292..2f40423eeea5ad65467314a8b20474d2263f50f9 100644
--- a/doc/source/reference/arrays.classes.rst
+++ b/doc/source/reference/arrays.classes.rst
@@ -7,7 +7,6 @@ Standard array subclasses
  .. currentmodule:: numpy
  
  .. for doctests
-   >>> import numpy as np
     >>> np.random.seed(1)
  
  .. note::
@@ -42,6 +41,7 @@ however, of why your subroutine may not be able to handle an arbitrary
  subclass of an array is that matrices redefine the "*" operator to be
  matrix-multiplication, rather than element-by-element multiplication.
  
+.. _special-attributes-and-methods:
  
  Special attributes and methods
  ==============================
diff --git a/doc/source/reference/arrays.datetime.rst b/doc/source/reference/arrays.datetime.rst

index 63c93821b6b4d6bcd83e6c06bd72dde1470674a3..76539c24ed55d90cc36e9733c92a4809529c4b28 100644
--- a/doc/source/reference/arrays.datetime.rst
+++ b/doc/source/reference/arrays.datetime.rst
@@ -9,23 +9,51 @@ Datetimes and Timedeltas
  .. versionadded:: 1.7.0
  
  Starting in NumPy 1.7, there are core array data types which natively
-support datetime functionality. The data type is called "datetime64",
-so named because "datetime" is already taken by the datetime library
-included in Python.
+support datetime functionality. The data type is called :class:`datetime64`,
+so named because :class:`~datetime.datetime` is already taken by the Python standard library.
+
+Datetime64 Conventions and Assumptions
+======================================
+
+Similar to the Python `~datetime.date` class, dates are expressed in the current
+Gregorian Calendar, indefinitely extended both in the future and in the past.
+[#]_ Unlike Python `~datetime.date`, which supports only years in the range 1 AD
+to 9999 AD, `datetime64` also allows dates BC; years BC follow the `Astronomical
+year numbering <https://en.wikipedia.org/wiki/Astronomical_year_numbering>`_
+convention, i.e. year 2 BC is numbered −1, year 1 BC is numbered 0, and year 1 AD
+is numbered 1.
+
+Time instants, say 16:23:32.234, are represented counting hours, minutes,
+seconds and fractions from midnight: i.e. 00:00:00.000 is midnight, 12:00:00.000
+is noon, etc. Each calendar day has exactly 86400 seconds. This is a "naive"
+time, with no explicit notion of timezones or specific time scales (UT1, UTC, TAI,
+etc.). [#]_
+
+.. [#] The calendar obtained by extending the Gregorian calendar before its
+       official adoption on Oct. 15, 1582 is called the `Proleptic Gregorian
+       Calendar <https://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar>`_.
+
+.. [#] The assumption of 86400 seconds per calendar day is not valid for UTC,
+       the present-day civil time scale. Because of
+       `leap seconds <https://en.wikipedia.org/wiki/Leap_second>`_, a UTC day
+       may on rare occasions be 86401 or 86399 seconds long. By contrast, the
+       86400-second day assumption holds for the TAI timescale. Explicit support
+       for TAI, and for TAI to UTC conversion accounting for leap seconds, is
+       proposed but not yet implemented. See also the `shortcomings`_ section below.
  
  
  Basic Datetimes
  ===============
  
-The most basic way to create datetimes is from strings in ISO 8601 date 
-or datetime format. It is also possible to create datetimes from an integer by 
+The most basic way to create datetimes is from strings in ISO 8601 date
+or datetime format. It is also possible to create datetimes from an integer by
  offset relative to the Unix epoch (00:00:00 UTC on 1 January 1970).
-The unit for internal storage is automatically selected from the 
+The unit for internal storage is automatically selected from the
  form of the string, and can be either a :ref:`date unit <arrays.dtypes.dateunits>` or a
  :ref:`time unit <arrays.dtypes.timeunits>`. The date units are years ('Y'),
  months ('M'), weeks ('W'), and days ('D'), while the time units are
  hours ('h'), minutes ('m'), seconds ('s'), milliseconds ('ms'), and
-some additional SI-prefix seconds-based units. The datetime64 data type
+some additional SI-prefix seconds-based units. The `datetime64` data type
  also accepts the string "NAT", in any combination of lowercase/uppercase
  letters, for a "Not A Time" value.
  
@@ -35,11 +63,11 @@ letters, for a "Not A Time" value.
  
      >>> np.datetime64('2005-02-25')
      numpy.datetime64('2005-02-25')
-    
+
      From an integer and a date unit, 1 year since the UNIX epoch:
  
      >>> np.datetime64(1, 'Y')
-    numpy.datetime64('1971')   
+    numpy.datetime64('1971')
  
      Using months for the unit:
  
@@ -122,19 +150,19 @@ because the moment of time is still being represented exactly.
  
    NumPy does not store timezone information. For backwards compatibility, datetime64
    still parses timezone offsets, which it handles by converting to
-  UTC. This behaviour is deprecated and will raise an error in the
+  UTC±00:00 (Zulu time). This behaviour is deprecated and will raise an error in the
    future.
  
  
  Datetime and Timedelta Arithmetic
  =================================
  
-NumPy allows the subtraction of two Datetime values, an operation which
+NumPy allows the subtraction of two datetime values, an operation which
  produces a number with a time unit. Because NumPy doesn't have a physical
-quantities system in its core, the timedelta64 data type was created
-to complement datetime64. The arguments for timedelta64 are a number,
+quantities system in its core, the `timedelta64` data type was created
+to complement `datetime64`. The arguments for `timedelta64` are a number,
  to represent the number of units, and a date/time unit, such as
-(D)ay, (M)onth, (Y)ear, (h)ours, (m)inutes, or (s)econds. The timedelta64
+(D)ay, (M)onth, (Y)ear, (h)ours, (m)inutes, or (s)econds. The `timedelta64`
  data type also accepts the string "NAT" in place of the number for a "Not A Time" value.
  
  .. admonition:: Example
@@ -199,9 +227,8 @@ The Datetime and Timedelta data types support a large number of time
  units, as well as generic units which can be coerced into any of the
  other units based on input data.
  
-Datetimes are always stored based on POSIX time (though having a TAI
-mode which allows for accounting of leap-seconds is proposed), with
-an epoch of 1970-01-01T00:00Z. This means the supported dates are
+Datetimes are always stored with
+an epoch of 1970-01-01T00:00. This means the supported dates are
  always a symmetric interval around the epoch, called "time span" in the
  table below.
  
@@ -328,7 +355,7 @@ in an optimized form.
  
  np.is_busday():
  ```````````````
-To test a datetime64 value to see if it is a valid day, use :func:`is_busday`.
+To test a `datetime64` value to see if it is a valid day, use :func:`is_busday`.
  
  .. admonition:: Example
  
@@ -384,3 +411,69 @@ Some examples::
      weekmask = "Mon Tue Wed Thu Fri"
      # any amount of whitespace is allowed; abbreviations are case-sensitive.
      weekmask = "MonTue Wed  Thu\tFri"
+
+
+.. _shortcomings:
+
+Datetime64 shortcomings
+=======================
+
+The assumption that all days are exactly 86400 seconds long makes `datetime64`
+largely compatible with Python `datetime` and "POSIX time" semantics; therefore
+they all share the same well-known shortcomings with respect to the UTC
+timescale and historical time determination. A brief, non-exhaustive summary is
+given below.
+
+- It is impossible to parse valid UTC timestamps occurring during a positive
+  leap second.
+
+  .. admonition:: Example
+
+    "2016-12-31 23:59:60 UTC" was a leap second, therefore "2016-12-31
+    23:59:60.450 UTC" is a valid timestamp which is not parseable by
+    `datetime64`:
+
+      >>> np.datetime64("2016-12-31 23:59:60.450")
+      Traceback (most recent call last):
+        File "<stdin>", line 1, in <module>
+      ValueError: Seconds out of range in datetime string "2016-12-31 23:59:60.450"
+
+- Timedelta64 computations between two UTC dates can be wrong by an integer
+  number of SI seconds.
+
+  .. admonition:: Example
+
+    Compute the number of SI seconds between "2021-01-01 12:56:23.423 UTC" and
+    "2001-01-01 00:00:00.000 UTC":
+
+      >>> (
+      ...   np.datetime64("2021-01-01 12:56:23.423")
+      ...   - np.datetime64("2001-01-01")
+      ... ) / np.timedelta64(1, "s")
+      631198583.423
+
+    However, the correct answer is `631198588.423` SI seconds, because there were 5
+    leap seconds between 2001 and 2021.
+
+- Timedelta64 computations for dates in the past do not return SI seconds, as
+  one would expect.
+
+  .. admonition:: Example
+
+     Compute the number of seconds between "0000-01-01 UT" and "1600-01-01 UT",
+     where UT is `universal time
+     <https://en.wikipedia.org/wiki/Universal_Time>`_:
+
+      >>> a = np.datetime64("0000-01-01", "us")
+      >>> b = np.datetime64("1600-01-01", "us")
+      >>> b - a
+      numpy.timedelta64(50491123200000000,'us')
+
+     The computed result, `50491123200` seconds, is obtained as the elapsed
+     number of days (`584388`) times `86400` seconds; this is the number of
+     seconds of a clock in sync with the Earth's rotation. The exact value in SI
+     seconds can only be estimated, e.g. using data published in `Measurement of
+     the Earth's rotation: 720 BC to AD 2015, 2016, Royal Society's Proceedings
+     A 472, by Stephenson et al. <https://doi.org/10.1098/rspa.2016.0404>`_. A
+     sensible estimate is `50491112870 ± 90` seconds, with a difference of 10330
+     seconds.
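The conventions and shortcomings described in the datetime hunks above (fixed 86400-second days, a year numbered 0000, no leap-second timestamps) can be checked with a short sketch. It assumes only the public ``np.datetime64`` and ``np.timedelta64`` constructors; the variable names are illustrative and do not come from the NumPy documentation.

```python
import numpy as np

one_second = np.timedelta64(1, "s")
one_day = np.timedelta64(1, "D")

# 2016-12-31 ended with a leap second in UTC, yet datetime64 still counts
# a fixed number of seconds for that calendar day.
day_length = (
    np.datetime64("2017-01-01T00:00:00") - np.datetime64("2016-12-31T00:00:00")
) / one_second

# Year 0000 is 1 BC in astronomical year numbering; the span up to
# 1600-01-01 is counted in whole proleptic Gregorian days.
elapsed_days = (
    np.datetime64("1600-01-01", "D") - np.datetime64("0000-01-01", "D")
) // one_day

# A timestamp inside the leap second itself is rejected by the parser.
try:
    np.datetime64("2016-12-31 23:59:60.450")
    leap_second_parsed = True
except ValueError:
    leap_second_parsed = False

print(day_length, elapsed_days, leap_second_parsed)
```

The elapsed-day count matches the ``584388`` days quoted in the last example above, and multiplying it by 86400 reproduces the ``50491123200`` seconds reported there.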
diff --git a/doc/source/reference/arrays.indexing.rst b/doc/source/reference/arrays.indexing.rst

index 100d22e029d01133e3825085a4cb43e679249568..1e413469115c496532f4db3d91c509bb18f79908 100644
--- a/doc/source/reference/arrays.indexing.rst
+++ b/doc/source/reference/arrays.indexing.rst
@@ -68,3 +68,4 @@ Iterating over arrays
     nested_iters
     flatiter
     lib.Arrayterator
+   iterable
diff --git a/doc/source/reference/arrays.interface.rst b/doc/source/reference/arrays.interface.rst

index 6a8c5f9c4d098f1f15fc04571f8f4818a8ef8038..e10710719ecd156df7b132ff470e29a48c7f7b36 100644
--- a/doc/source/reference/arrays.interface.rst
+++ b/doc/source/reference/arrays.interface.rst
@@ -4,18 +4,18 @@
  
  .. _arrays.interface:
  
-*******************
-The Array Interface
-*******************
+****************************
+The array interface protocol
+****************************
  
  .. note::
  
-   This page describes the numpy-specific API for accessing the contents of
-   a numpy array from other C extensions. :pep:`3118` --
+   This page describes the NumPy-specific API for accessing the contents of
+   a NumPy array from other C extensions. :pep:`3118` --
     :c:func:`The Revised Buffer Protocol <PyObject_GetBuffer>` introduces
     similar, standardized API to Python 2.6 and 3.0 for any extension
     module to use. Cython__'s buffer array support
-   uses the :pep:`3118` API; see the `Cython numpy
+   uses the :pep:`3118` API; see the `Cython NumPy
     tutorial`__. Cython provides a way to write code that supports the buffer
     protocol with Python versions older than 2.6 because it has a
     backward-compatible implementation utilizing the array interface
@@ -81,7 +81,8 @@ This approach to the interface consists of the object having an
         =====  ================================================================
         ``t``  Bit field (following integer gives the number of
                bits in the bit field).
-       ``b``  Boolean (integer type where all values are only True or False)
+       ``b``  Boolean (integer type where all values are only ``True`` or
+              ``False``)
         ``i``  Integer
         ``u``  Unsigned integer
         ``f``  Floating point
@@ -90,7 +91,7 @@ This approach to the interface consists of the object having an
         ``M``  Datetime
         ``O``  Object (i.e. the memory contains a pointer to :c:type:`PyObject`)
         ``S``  String (fixed-length sequence of char)
-       ``U``  Unicode (fixed-length sequence of :c:type:`Py_UNICODE`)
+       ``U``  Unicode (fixed-length sequence of :c:type:`Py_UCS4`)
         ``V``  Other (void \* -- each item is a fixed-size chunk of memory)
         =====  ================================================================
  
@@ -141,11 +142,11 @@ This approach to the interface consists of the object having an
         must be stored by the new object if the memory area is to be
         secured.
  
-       **Default**: None
+       **Default**: ``None``
  
     **strides** (optional)
         Either ``None`` to indicate a C-style contiguous array or
-       a Tuple of strides which provides the number of bytes needed
+       a tuple of strides which provides the number of bytes needed
         to jump to the next array element in the corresponding
         dimension. Each entry must be an integer (a Python
         :py:class:`int`). As with shape, the values may
@@ -156,26 +157,26 @@ This approach to the interface consists of the object having an
         memory buffer. In this model, the last dimension of the array
         varies the fastest.  For example, the default strides tuple
         for an object whose array entries are 8 bytes long and whose
-       shape is ``(10, 20, 30)`` would be ``(4800, 240, 8)``
+       shape is ``(10, 20, 30)`` would be ``(4800, 240, 8)``.
  
         **Default**: ``None`` (C-style contiguous)
  
     **mask** (optional)
-       None or an object exposing the array interface.  All
+       ``None`` or an object exposing the array interface.  All
         elements of the mask array should be interpreted only as true
         or not true indicating which elements of this array are valid.
         The shape of this object should be `"broadcastable"
         <arrays.broadcasting.broadcastable>` to the shape of the
         original array.
  
-       **Default**: None (All array values are valid)
+       **Default**: ``None`` (All array values are valid)
  
     **offset** (optional)
         An integer offset into the array data region. This can only be
         used when data is ``None`` or returns a :class:`buffer`
         object.
  
-       **Default**: 0.
+       **Default**: ``0``.
  
     **version** (required)
         An integer showing the version of the interface (i.e. 3 for
@@ -243,6 +244,12 @@ flag is present.
     returning the :c:type:`PyCapsule`, and configure a destructor to decref this
     reference.
  
+.. note::
+
+    :obj:`__array_struct__` is considered legacy and should not be used for new
+    code. Use the :py:doc:`buffer protocol <c-api/buffer>` or the DLPack protocol
+    `numpy.from_dlpack` instead.
+
  
  Type description examples
  =========================
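The **strides** entry in the hunks above gives ``(4800, 240, 8)`` as the default for an 8-byte item type with shape ``(10, 20, 30)``. The sketch below recomputes that tuple and compares it with what NumPy reports. ``c_contiguous_strides`` is a hypothetical helper written for this illustration, not part of NumPy.

```python
import numpy as np


def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered array: the last axis varies fastest."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


a = np.zeros((10, 20, 30), dtype=np.float64)
expected = c_contiguous_strides(a.shape, a.itemsize)

# For a C-contiguous array the interface dict reports strides as None,
# which the protocol defines as "C-style contiguous".
iface = a.__array_interface__

# A transposed view is no longer C-contiguous, so explicit strides appear.
transposed_strides = a.T.__array_interface__["strides"]

print(expected, a.strides, iface["strides"], transposed_strides)
```

Consumers of the protocol therefore need to handle both cases: ``None`` means the strides must be derived from ``shape`` and the item size, as the helper does.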
diff --git a/doc/source/reference/arrays.ndarray.rst b/doc/source/reference/arrays.ndarray.rst

index 0f703b4754de44b1cbc34b2658da40b075681e02..985a11c8818e363e82386ffa059be9e6b4597deb 100644
--- a/doc/source/reference/arrays.ndarray.rst
+++ b/doc/source/reference/arrays.ndarray.rst
@@ -54,13 +54,13 @@ objects implementing the :class:`buffer` or :ref:`array
  
     >>> y = x[:,1]
     >>> y
-   array([2, 5])
+   array([2, 5], dtype=int32)
     >>> y[0] = 9 # this also changes the corresponding element in x
     >>> y
-   array([9, 5])
+   array([9, 5], dtype=int32)
     >>> x
     array([[1, 9, 3],
-          [4, 5, 6]])
+          [4, 5, 6]], dtype=int32)
  
  
  Constructing arrays
@@ -161,26 +161,15 @@ An array is considered aligned if the memory offsets for all elements and the
  base offset itself is a multiple of `self.itemsize`. Understanding
  `memory-alignment` leads to better performance on most hardware.
  
-.. note::
-
-    Points (1) and (2) can currently be disabled by the compile time
-    environmental variable ``NPY_RELAXED_STRIDES_CHECKING=0``,
-    which was the default before NumPy 1.10.
-    No users should have to do this. ``NPY_RELAXED_STRIDES_DEBUG=1``
-    can be used to help find errors when incorrectly relying on the strides
-    in C-extension code (see below warning).
-
-    You can check whether this option was enabled when your NumPy was
-    built by looking at the value of ``np.ones((10,1),
-    order='C').flags.f_contiguous``. If this is ``True``, then your
-    NumPy has relaxed strides checking enabled.
-
  .. warning::
  
      It does *not* generally hold that ``self.strides[-1] == self.itemsize``
      for C-style contiguous arrays or ``self.strides[0] == self.itemsize`` for
      Fortran-style contiguous arrays is true.
  
+    ``NPY_RELAXED_STRIDES_DEBUG=1`` can be used to help find errors when
+    incorrectly relying on the strides in C-extension code.
+
  Data in new :class:`ndarrays <ndarray>` is in the :term:`row-major`
  (C) order, unless otherwise specified, but, for example, :ref:`basic
  array slicing <arrays.indexing>` often produces :term:`views <view>`
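The view behaviour shown in the first hunk of this file (writing to ``y`` changes ``x``) can be reproduced with the sketch below. It assumes ``x`` was created with ``dtype=np.int32``, which is what the updated doctest output suggests but is not shown in the hunk itself.

```python
import numpy as np

# Assumed construction of x; the hunk only shows its repr with dtype=int32.
x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

y = x[:, 1]  # basic slicing returns a view onto the second column
y[0] = 9     # the write goes through to x

shares = np.shares_memory(x, y)

# The view steps over a whole row of x per element, so its stride is
# larger than its item size and it is not C-contiguous.
print(x[0, 1], shares, y.base is x)
print(y.strides, y.itemsize, y.flags.c_contiguous)
```

Code that needs an independent array can call ``y.copy()``, at the cost of an extra allocation.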
diff --git a/doc/source/reference/arrays.nditer.cython.rst b/doc/source/reference/arrays.nditer.cython.rst

index 43aad99275c7da757d327d452e11d7653b324888..66485fc8a449e098f4cba54c335b449897f47a90 100644
--- a/doc/source/reference/arrays.nditer.cython.rst
+++ b/doc/source/reference/arrays.nditer.cython.rst
@@ -49,7 +49,7 @@ Here's how this looks.
      ...
      >>> a = np.arange(6).reshape(2,3)
      >>> sum_squares_py(a)
-    array(55.0)
+    array(55.)
      >>> sum_squares_py(a, axis=-1)
      array([  5.,  50.])
  
@@ -117,11 +117,11 @@ as our native Python/NumPy code did.
  
  .. admonition:: Example
  
-    >>> from sum_squares import sum_squares_cy
+    >>> from sum_squares import sum_squares_cy #doctest: +SKIP
      >>> a = np.arange(6).reshape(2,3)
-    >>> sum_squares_cy(a)
+    >>> sum_squares_cy(a) #doctest: +SKIP
      array(55.0)
-    >>> sum_squares_cy(a, axis=-1)
+    >>> sum_squares_cy(a, axis=-1) #doctest: +SKIP
      array([  5.,  50.])
  
  Doing a little timing in IPython shows that the reduced overhead and
diff --git a/doc/source/reference/arrays.nditer.rst b/doc/source/reference/arrays.nditer.rst

index 72a04f73e8d1ac4b6192bfeabe623fd6a47527ad..8cabc1a061a64de42f723ee958a5bbfdc6c7a5df 100644
--- a/doc/source/reference/arrays.nditer.rst
+++ b/doc/source/reference/arrays.nditer.rst
@@ -1,9 +1,5 @@
  .. currentmodule:: numpy
  
-.. for doctests
-   The last section on Cython is 'included' at the end of this file. The tests
-   for that section are disabled.
-
  .. _arrays.nditer:
  
  *********************
@@ -489,9 +485,9 @@ reasons.
  
      >>> b = np.zeros((3,))
      >>> square([1,2,3], out=b)
-    array([ 1.,  4.,  9.])
+    array([1.,  4.,  9.])
      >>> b
-    array([ 1.,  4.,  9.])
+    array([1.,  4.,  9.])
  
      >>> square(np.arange(6).reshape(2,3), out=b)
      Traceback (most recent call last):
diff --git a/doc/source/reference/c-api/array.rst b/doc/source/reference/c-api/array.rst

index bb440582548cd938675c85d0e1be7e61c1e27097..f22b41a8528196580da8295c3c5ea9523611b400 100644
--- a/doc/source/reference/c-api/array.rst
+++ b/doc/source/reference/c-api/array.rst
@@ -127,8 +127,7 @@ and its sub-types).
      your own memory, you should use the function :c:func:`PyArray_SetBaseObject`
      to set the base to an object which owns the memory.
  
-    If the (deprecated) :c:data:`NPY_ARRAY_UPDATEIFCOPY` or the
-    :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flags are set, it has a different
+    If the :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flag is set, it has a different
      meaning, namely base is the array into which the current array will
      be copied upon copy resolution. This overloading of the base property
      for two functions is likely to change in a future version of NumPy.
@@ -237,8 +236,7 @@ From scratch
      If *data* is not ``NULL``, then it is assumed to point to the memory
      to be used for the array and the *flags* argument is used as the
      new flags for the array (except the state of :c:data:`NPY_ARRAY_OWNDATA`,
-    :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` and :c:data:`NPY_ARRAY_UPDATEIFCOPY`
-    flags of the new array will be reset).
+    :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flag of the new array will be reset).
  
      In addition, if *data* is non-NULL, then *strides* can
      also be provided. If *strides* is ``NULL``, then the array strides
@@ -487,13 +485,6 @@ From other objects
          will be made writeable again. If *op* is not writeable to begin
          with, or if it is not already an array, then an error is raised.
  
-    .. c:macro:: NPY_ARRAY_UPDATEIFCOPY
-
-        Deprecated. Use :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`, which is similar.
-        This flag "automatically" copies the data back when the returned
-        array is deallocated, which is not supported in all python
-        implementations.
-
      .. c:macro:: NPY_ARRAY_BEHAVED
  
          :c:data:`NPY_ARRAY_ALIGNED` \| :c:data:`NPY_ARRAY_WRITEABLE`
@@ -550,14 +541,12 @@ From other objects
  .. c:macro:: NPY_ARRAY_INOUT_ARRAY
  
      :c:data:`NPY_ARRAY_C_CONTIGUOUS` \| :c:data:`NPY_ARRAY_WRITEABLE` \|
-    :c:data:`NPY_ARRAY_ALIGNED` \| :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` \|
-    :c:data:`NPY_ARRAY_UPDATEIFCOPY`
+    :c:data:`NPY_ARRAY_ALIGNED` \| :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`
  
      .. c:macro:: NPY_ARRAY_INOUT_FARRAY
  
          :c:data:`NPY_ARRAY_F_CONTIGUOUS` \| :c:data:`NPY_ARRAY_WRITEABLE` \|
-        :c:data:`NPY_ARRAY_ALIGNED` \| :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` \|
-        :c:data:`NPY_ARRAY_UPDATEIFCOPY`
+        :c:data:`NPY_ARRAY_ALIGNED` \| :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`
  
  .. c:function:: int PyArray_GetArrayParamsFromObject( \
          PyObject* op, PyArray_Descr* requested_dtype, npy_bool writeable, \
@@ -773,8 +762,7 @@ From other objects
      :c:data:`NPY_ARRAY_C_CONTIGUOUS`, :c:data:`NPY_ARRAY_F_CONTIGUOUS`,
      :c:data:`NPY_ARRAY_ALIGNED`, :c:data:`NPY_ARRAY_WRITEABLE`,
      :c:data:`NPY_ARRAY_NOTSWAPPED`, :c:data:`NPY_ARRAY_ENSURECOPY`,
-    :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`, :c:data:`NPY_ARRAY_UPDATEIFCOPY`,
-    :c:data:`NPY_ARRAY_FORCECAST`, and
+    :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`, :c:data:`NPY_ARRAY_FORCECAST`, and
      :c:data:`NPY_ARRAY_ENSUREARRAY`. Standard combinations of flags can also
      be used:
  
@@ -1375,15 +1363,6 @@ Special functions for NPY_OBJECT
      decrement all the items in the object array prior to calling this
      function.
  
-.. c:function:: int PyArray_SetUpdateIfCopyBase(PyArrayObject* arr, PyArrayObject* base)
-
-    Precondition: ``arr`` is a copy of ``base`` (though possibly with different
-    strides, ordering, etc.) Set the UPDATEIFCOPY flag and ``arr->base`` so
-    that when ``arr`` is destructed, it will copy any changes back to ``base``.
-    DEPRECATED, use :c:func:`PyArray_SetWritebackIfCopyBase`.
-
-    Returns 0 for success, -1 for failure.
-
  .. c:function:: int PyArray_SetWritebackIfCopyBase(PyArrayObject* arr, PyArrayObject* base)
  
      Precondition: ``arr`` is a copy of ``base`` (though possibly with different
@@ -1496,14 +1475,6 @@ of the constant names is deprecated in 1.7.
      would have returned an error because :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`
      would not have been possible.
  
-.. c:macro:: NPY_ARRAY_UPDATEIFCOPY
-
-    A deprecated version of :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` which
-    depends upon ``dealloc`` to trigger the writeback. For backwards
-    compatibility, :c:func:`PyArray_ResolveWritebackIfCopy` is called at
-    ``dealloc`` but relying
-    on that behavior is deprecated and not supported in PyPy.
-
  :c:func:`PyArray_UpdateFlags` (obj, flags) will update the ``obj->flags``
  for ``flags`` which can be any of :c:data:`NPY_ARRAY_C_CONTIGUOUS`,
  :c:data:`NPY_ARRAY_F_CONTIGUOUS`, :c:data:`NPY_ARRAY_ALIGNED`, or
@@ -1575,8 +1546,7 @@ For all of these macros *arr* must be an instance of a (subclass of)
      combinations of the possible flags an array can have:
      :c:data:`NPY_ARRAY_C_CONTIGUOUS`, :c:data:`NPY_ARRAY_F_CONTIGUOUS`,
      :c:data:`NPY_ARRAY_OWNDATA`, :c:data:`NPY_ARRAY_ALIGNED`,
-    :c:data:`NPY_ARRAY_WRITEABLE`, :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`,
-    :c:data:`NPY_ARRAY_UPDATEIFCOPY`.
+    :c:data:`NPY_ARRAY_WRITEABLE`, :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`.
  
  .. c:function:: int PyArray_IS_C_CONTIGUOUS(PyObject *arr)
  
@@ -1653,6 +1623,17 @@ For all of these macros *arr* must be an instance of a (subclass of)
      calculations in NumPy that rely on the state of these flags do not
      repeat the calculation to update them.
  
+.. c:function:: int PyArray_FailUnlessWriteable(PyArrayObject *obj, const char *name)
+
+    This function does nothing and returns 0 if *obj* is writeable.
+    It raises an exception and returns -1 if *obj* is not writeable.
+    It may also do other house-keeping, such as issuing warnings on
+    arrays which are transitioning to become views. Always call this
+    function at some point before writing to an array.
+
+    *name* is a name for the array, used to give better error messages.
+    It can be something like "assignment destination", "output array",
+    or even just "array".
  
  Array method alternative API
  ----------------------------
@@ -2195,8 +2176,8 @@ Array Functions
  ^^^^^^^^^^^^^^^
  
  .. c:function:: int PyArray_AsCArray( \
-        PyObject** op, void* ptr, npy_intp* dims, int nd, int typenum, \
-        int itemsize)
+        PyObject** op, void* ptr, npy_intp* dims, int nd, \
+        PyArray_Descr* typedescr)
  
      Sometimes it is useful to access a multidimensional array as a
      C-style multi-dimensional array so that algorithms can be
@@ -2226,14 +2207,11 @@ Array Functions
  
          The dimensionality of the array (1, 2, or 3).
  
-    :param typenum:
-
-        The expected data type of the array.
-
-    :param itemsize:
+    :param typedescr:
  
-        This argument is only needed when *typenum* represents a
-        flexible array. Otherwise it should be 0.
+        A :c:type:`PyArray_Descr` structure indicating the desired data-type
+        (including required byteorder). The call will steal a reference to
+        the parameter.
  
  .. note::
  
@@ -2765,7 +2743,7 @@ Array mapping is the machinery behind advanced indexing.
      has memory overlap with any of the arrays in ``index`` and with
      ``extra_op``, and make copies as appropriate to avoid problems if the
      input is modified during the iteration. ``iter->array`` may contain a
-    copied array (UPDATEIFCOPY/WRITEBACKIFCOPY set).
+    copied array (WRITEBACKIFCOPY set).
  
  Array Scalars
  -------------
@@ -3377,8 +3355,8 @@ Memory management
  
  .. c:function:: int PyArray_ResolveWritebackIfCopy(PyArrayObject* obj)
  
-    If ``obj.flags`` has :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` or (deprecated)
-    :c:data:`NPY_ARRAY_UPDATEIFCOPY`, this function clears the flags, `DECREF` s
+    If ``obj.flags`` has :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`, this function
+    clears the flags, `DECREF` s
      `obj->base` and makes it writeable, and sets ``obj->base`` to NULL. It then
      copies ``obj->data`` to `obj->base->data`, and returns the error state of
      the copy operation. This is the opposite of
@@ -3609,8 +3587,8 @@ Miscellaneous Macros
  
  .. c:function:: void PyArray_DiscardWritebackIfCopy(PyObject* obj)
  
-    If ``obj.flags`` has :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` or (deprecated)
-    :c:data:`NPY_ARRAY_UPDATEIFCOPY`, this function clears the flags, `DECREF` s
+    If ``obj.flags`` has :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`, this function
+    clears the flags, `DECREF` s
      `obj->base` and makes it writeable, and sets ``obj->base`` to NULL. In
      contrast to :c:func:`PyArray_ResolveWritebackIfCopy` it makes no attempt
      to copy the data from `obj->base`. This undoes
@@ -3623,8 +3601,8 @@ Miscellaneous Macros
      Deprecated in 1.14, use :c:func:`PyArray_DiscardWritebackIfCopy`
      followed by ``Py_XDECREF``
  
-    DECREF's an array object which may have the (deprecated)
-    :c:data:`NPY_ARRAY_UPDATEIFCOPY` or :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`
+    DECREF's an array object which may have the
+    :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`
      flag set without causing the contents to be copied back into the
      original array. Resets the :c:data:`NPY_ARRAY_WRITEABLE` flag on the base
      object. This is useful for recovering from an error condition when
diff --git a/doc/source/reference/c-api/data_memory.rst b/doc/source/reference/c-api/data_memory.rst

index b779026b45518517e3abb8c5e079fee85ed4905e..2084ab5d083f80d0d711189caeea89a74e0dd4ba 100644 (file)
--- a/doc/source/reference/c-api/data_memory.rst
+++ b/doc/source/reference/c-api/data_memory.rst
@@ -20,8 +20,8 @@ Historical overview
  Since version 1.7.0, NumPy has exposed a set of ``PyDataMem_*`` functions
  (:c:func:`PyDataMem_NEW`, :c:func:`PyDataMem_FREE`, :c:func:`PyDataMem_RENEW`)
  which are backed by `alloc`, `free`, `realloc` respectively. In that version
-NumPy also exposed the `PyDataMem_EventHook` function described below, which
-wrap the OS-level calls.
+NumPy also exposed the `PyDataMem_EventHook` function (now deprecated)
+described below, which wraps the OS-level calls.
  
  Since those early days, Python also improved its memory management
  capabilities, and began providing
@@ -50,10 +50,10 @@ management routines can use :c:func:`PyDataMem_SetHandler`, which uses a
  :c:type:`PyDataMem_Handler` structure to hold pointers to functions used to
  manage the data memory. The calls are still wrapped by internal routines to
  call :c:func:`PyTraceMalloc_Track`, :c:func:`PyTraceMalloc_Untrack`, and will
-use the :c:func:`PyDataMem_EventHookFunc` mechanism. Since the functions may
-change during the lifetime of the process, each ``ndarray`` carries with it the
-functions used at the time of its instantiation, and these will be used to
-reallocate or free the data memory of the instance.
+use the deprecated :c:func:`PyDataMem_EventHookFunc` mechanism. Since the
+functions may change during the lifetime of the process, each ``ndarray``
+carries with it the functions used at the time of its instantiation, and these
+will be used to reallocate or free the data memory of the instance.
  
  .. c:type:: PyDataMem_Handler
  
@@ -119,7 +119,9 @@ For an example of setting up and using the PyDataMem_Handler, see the test in
      thread.  The hook should be written to be reentrant, if it performs
      operations that might cause new allocation events (such as the
      creation/destruction numpy objects, or creating/destroying Python
-    objects which might cause a gc)
+    objects which might cause a gc).
+
+    Deprecated in v1.23
  
  What happens when deallocating if there is no policy set
  --------------------------------------------------------
diff --git a/doc/source/reference/c-api/iterator.rst b/doc/source/reference/c-api/iterator.rst

index 83644d8b240b1513de804cacfddf393a7febb4d3..b4adaef9b9c059de124cd3bc1176793d35356191 100644 (file)
--- a/doc/source/reference/c-api/iterator.rst
+++ b/doc/source/reference/c-api/iterator.rst
@@ -653,7 +653,7 @@ Construction and Destruction
      may not be repeated.  The following example is how normal broadcasting
      applies to a 3-D array, a 2-D array, a 1-D array and a scalar.
  
-    **Note**: Before NumPy 1.8 ``oa_ndim == 0` was used for signalling that
+    **Note**: Before NumPy 1.8 ``oa_ndim == 0`` was used for signalling
      that ``op_axes`` and ``itershape`` are unused. This is deprecated and
      should be replaced with -1. Better backward compatibility may be
      achieved by using :c:func:`NpyIter_MultiNew` for this case.
diff --git a/doc/source/reference/c-api/types-and-structures.rst b/doc/source/reference/c-api/types-and-structures.rst

index 605a4ae718fb897e3bd5c9e36ad18590406d2b17..34437bd303bbb893abce33fb8120e846e7418e1b 100644 (file)
--- a/doc/source/reference/c-api/types-and-structures.rst
+++ b/doc/source/reference/c-api/types-and-structures.rst
@@ -144,9 +144,8 @@ PyArray_Type and PyArrayObject
  
         - If this array does not own its own memory, then base points to the
           Python object that owns it (perhaps another array object)
-       - If this array has the (deprecated) :c:data:`NPY_ARRAY_UPDATEIFCOPY` or
-         :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flag set, then this array is a working
-         copy of a "misbehaved" array.
+       - If this array has the :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flag set,
+         then this array is a working copy of a "misbehaved" array.
  
         When ``PyArray_ResolveWritebackIfCopy`` is called, the array pointed to
         by base will be updated with the contents of this array.
@@ -169,7 +168,7 @@ PyArray_Type and PyArrayObject
         interpreted. Possible flags are :c:data:`NPY_ARRAY_C_CONTIGUOUS`,
         :c:data:`NPY_ARRAY_F_CONTIGUOUS`, :c:data:`NPY_ARRAY_OWNDATA`,
         :c:data:`NPY_ARRAY_ALIGNED`, :c:data:`NPY_ARRAY_WRITEABLE`,
-       :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`, and :c:data:`NPY_ARRAY_UPDATEIFCOPY`.
+       :c:data:`NPY_ARRAY_WRITEBACKIFCOPY`.
  
     .. c:member:: PyObject *weakreflist
  
@@ -286,6 +285,11 @@ PyArrayDescr_Type and PyArray_Descr
         array like behavior. Each bit in this member is a flag which are named
         as:
  
+   .. c:member:: int alignment
+
+       A number providing alignment information for this data type.
+
+
  ..
    dedented to allow internal linking, pending a refactoring
  
diff --git a/doc/source/reference/c-api/ufunc.rst b/doc/source/reference/c-api/ufunc.rst

index 95dc47839e4be24ba5f0f4a52dd07c81c450910f..39447ae2403fa561aef7993ae2a2e203e7249948 100644 (file)
--- a/doc/source/reference/c-api/ufunc.rst
+++ b/doc/source/reference/c-api/ufunc.rst
@@ -79,7 +79,7 @@ Types
  
  .. c:type:: PyUFuncGenericFunction
  
-    pointers to functions that actually implement the underlying
+    Pointers to functions that actually implement the underlying
      (element-by-element) function :math:`N` times with the following
      signature:
  
@@ -107,7 +107,10 @@ Types
  
              Arbitrary data (extra arguments, function names, *etc.* )
              that can be stored with the ufunc and will be passed in
-            when it is called.
+            when it is called. May be ``NULL``.
+
+            .. versionchanged:: 1.23.0
+               Accepts ``NULL`` `data` in addition to an array of ``NULL`` values.
  
          This is an example of a func specialized for addition of doubles
          returning doubles.
@@ -154,20 +157,21 @@ Functions
         ufunc object is alive.
  
      :param func:
-        Must to an array of length *ntypes* containing
-        :c:type:`PyUFuncGenericFunction` items.
+        Must point to an array containing *ntypes*
+        :c:type:`PyUFuncGenericFunction` elements.
  
      :param data:
-        Should be ``NULL`` or a pointer to an array of size *ntypes*
-        . This array may contain arbitrary extra-data to be passed to
-        the corresponding loop function in the func array.
+        Should be ``NULL`` or a pointer to an array of size *ntypes*.
+        This array may contain arbitrary extra-data to be passed to
+        the corresponding loop function in the func array, including
+        ``NULL``.
  
      :param types:
         Length ``(nin + nout) * ntypes`` array of ``char`` encoding the
         `numpy.dtype.num` (built-in only) that the corresponding
         function in the ``func`` array accepts. For instance, for a comparison
         ufunc with three ``ntypes``, two ``nin`` and one ``nout``, where the
-       first function accepts `numpy.int32` and the the second
+       first function accepts `numpy.int32` and the second
         `numpy.int64`, with both returning `numpy.bool_`, ``types`` would
         be ``(char[]) {5, 5, 0, 7, 7, 0}`` since ``NPY_INT32`` is 5,
         ``NPY_INT64`` is 7, and ``NPY_BOOL`` is 0.
diff --git a/doc/source/reference/distutils.rst b/doc/source/reference/distutils.rst

index f201ba66865b3fcb964a39e6422ea5f316502612..ff1ba3b0d1dfd219868ceeb21f08938dd20ab667 100644 (file)
--- a/doc/source/reference/distutils.rst
+++ b/doc/source/reference/distutils.rst
@@ -4,6 +4,11 @@ Packaging (:mod:`numpy.distutils`)
  
  .. module:: numpy.distutils
  
+.. warning::
+
+   ``numpy.distutils`` is deprecated, and will be removed for
+   Python >= 3.12. For more details, see :ref:`distutils-status-migration`
+
  NumPy provides enhanced distutils functionality to make it easier to
  build and install sub-packages, auto-generate code, and extension
  modules that use Fortran-compiled libraries. To use features of NumPy
@@ -188,6 +193,8 @@ Info are easily retrieved from the `get_info` function in
  
    >>> info = np.distutils.misc_util.get_info('npymath')
    >>> config.add_extension('foo', sources=['foo.c'], extra_info=info)
+  <numpy.distutils.extension.Extension('foo') at 0x...>
+
  
  An additional list of paths to look for .ini files can be given to `get_info`.
  
diff --git a/doc/source/reference/distutils_guide.rst b/doc/source/reference/distutils_guide.rst

index 081719d164284e8f3b04d41cd2ef2c8cb115e191..5bb4c2878e49c7763dc0ea8991c722628f7b5f1a 100644 (file)
--- a/doc/source/reference/distutils_guide.rst
+++ b/doc/source/reference/distutils_guide.rst
@@ -3,5 +3,11 @@
  NumPy Distutils - Users Guide
  =============================
  
+.. warning::
+
+   ``numpy.distutils`` is deprecated, and will be removed for
+   Python >= 3.12. For more details, see :ref:`distutils-status-migration`
+
+
  .. include:: ../../DISTUTILS.rst.txt
     :start-line: 6
diff --git a/doc/source/reference/distutils_status_migration.rst b/doc/source/reference/distutils_status_migration.rst

new file mode 100644 (file)

index 0000000..9ef5f72
--- /dev/null
+++ b/doc/source/reference/distutils_status_migration.rst
@@ -0,0 +1,138 @@
+.. _distutils-status-migration:
+
+Status of ``numpy.distutils`` and migration advice
+==================================================
+
+`numpy.distutils` has been deprecated in NumPy ``1.23.0``. It will be removed
+for Python 3.12; for Python <= 3.11 it will not be removed until 2 years after
+the Python 3.12 release (Oct 2025).
+
+
+.. warning::
+
+   ``numpy.distutils`` is only tested with ``setuptools < 60.0``; newer
+   versions may break. See :ref:`numpy-setuptools-interaction` for details.
+
+
+Migration advice
+----------------
+
+It is **not necessary** to migrate immediately - the release date for Python 3.12
+is October 2023. It may be beneficial to postpone migrating until there are
+examples from other projects to follow (see below).
+
+There are several build systems which are good options to migrate to. Assuming
+you have compiled code in your package (if not, we recommend using Flit_) and
+you want to be using a well-designed, modern and reliable build system, we
+recommend:
+
+1. Meson_
+2. CMake_ (or scikit-build_ as an interface to CMake)
+
+If you have modest needs (only simple Cython/C extensions, and perhaps nested
+``setup.py`` files) and have been happy with ``numpy.distutils`` so far, you
+can also consider switching to ``setuptools``. Note that most functionality of
+``numpy.distutils`` is unlikely to be ported to ``setuptools``.
+
+
+Moving to Meson
+```````````````
+
+SciPy is moving to Meson for its 1.9.0 release, planned for July 2022. During
+this process, any remaining issues with Meson's Python support and achieving
+feature parity with ``numpy.distutils`` will be resolved. *Note: parity means a
+large superset, but right now some BLAS/LAPACK support is missing and there are
+a few open issues related to Cython.* SciPy uses almost all functionality that
+``numpy.distutils`` offers, so if SciPy has successfully made a release with
+Meson as the build system, there should be no blockers left to migrate, and
+SciPy will be a good reference for other packages that are migrating.
+For more details about the SciPy migration, see:
+
+- `RFC: switch to Meson as a build system <https://github.com/scipy/scipy/issues/13615>`__
+- `Tracking issue for Meson support <https://github.com/rgommers/scipy/issues/22>`__
+
+NumPy itself will very likely migrate to Meson as well, once the SciPy
+migration is done.
+
+
+Moving to CMake / scikit-build
+``````````````````````````````
+
+See the `scikit-build documentation <https://scikit-build.readthedocs.io/en/latest/>`__
+for how to use scikit-build. Please note that as of Feb 2022, scikit-build
+still relies on setuptools, so it's probably not quite ready yet for a
+post-distutils world. How quickly this changes depends on funding; the current
+(Feb 2022) estimate is that if funding arrives then a viable ``numpy.distutils``
+replacement will be ready at the end of 2022, and a very polished replacement
+mid-2023.  For more details on this, see
+`this blog post by Henry Schreiner <https://iscinumpy.gitlab.io/post/scikit-build-proposal/>`__.
+
+
+Moving to ``setuptools``
+````````````````````````
+
+For projects that only use ``numpy.distutils`` for historical reasons, and do
+not actually use features beyond those that ``setuptools`` also supports,
+moving to ``setuptools`` is likely the solution which costs the least effort.
+To assess that, these are the ``numpy.distutils`` features that are *not*
+present in ``setuptools``:
+
+- Nested ``setup.py`` files
+- Fortran build support
+- BLAS/LAPACK library support (OpenBLAS, MKL, ATLAS, Netlib LAPACK/BLAS, BLIS, 64-bit ILP interface, etc.)
+- Support for a few other scientific libraries, like FFTW and UMFPACK
+- Better MinGW support
+- Per-compiler build flag customization (e.g. ``-O3`` and ``SSE2`` flags are default)
+- A simple user build config system, see `site.cfg.example <https://github.com/numpy/numpy/blob/master/site.cfg.example>`__
+- SIMD intrinsics support
+
+The most widely used feature is nested ``setup.py`` files. This feature will
+likely be ported to ``setuptools`` (see
+`gh-18588 <https://github.com/numpy/numpy/issues/18588>`__ for status).
+Projects only using that feature could move to ``setuptools`` after that is
+done. In case a project uses only a couple of ``setup.py`` files, it also could
+make sense to simply aggregate all the content of those files into a single
+``setup.py`` file and then move to ``setuptools``. This involves dropping all
+``Configuration`` instances, and using ``Extension`` instead. E.g.,::
+
+    from setuptools import setup
+    from setuptools import Extension
+    setup(name='foobar',
+          version='1.0',
+          ext_modules=[
+              Extension('foopkg.foo', ['foo.c']),
+              Extension('barpkg.bar', ['bar.c']),
+              ],
+          )
+
+For more details, see the
+`setuptools documentation <https://setuptools.pypa.io/en/latest/setuptools.html>`__.
+
+
+.. _numpy-setuptools-interaction:
+
+Interaction of ``numpy.distutils`` with ``setuptools``
+------------------------------------------------------
+
+It is recommended to use ``setuptools < 60.0``. Newer versions may work, but
+are not guaranteed to. The reason for this is that ``setuptools`` 60.0 enabled
+a vendored copy of ``distutils``, including backwards incompatible changes that
+affect some functionality in ``numpy.distutils``.
+
+If you are using only simple Cython or C extensions with minimal use of
+``numpy.distutils`` functionality beyond nested ``setup.py`` files (its most
+popular feature, see :class:`Configuration <numpy.distutils.misc_util.Configuration>`),
+then the latest ``setuptools`` is likely to continue working. In case of problems,
+you can also try ``SETUPTOOLS_USE_DISTUTILS=stdlib`` to avoid the backwards
+incompatible changes in ``setuptools``.
+
+Whatever you do, it is recommended to put an upper bound on your ``setuptools``
+build requirement in ``pyproject.toml`` to avoid future breakage - see
+:ref:`for-downstream-package-authors`.
+
+
+.. _Flit: https://flit.readthedocs.io
+.. _CMake: https://cmake.org/
+.. _Meson: https://mesonbuild.com/
+.. _scikit-build: https://scikit-build.readthedocs.io/
+
diff --git a/doc/source/reference/global_state.rst b/doc/source/reference/global_state.rst

index 20874ceaae4954e664c96f8ac073cb68746ad667..81685ec7d83652e5ff9adf000af274dff50837ca 100644 (file)
--- a/doc/source/reference/global_state.rst
+++ b/doc/source/reference/global_state.rst
@@ -70,19 +70,15 @@ Debugging-Related Options
  Relaxed Strides Checking
  ------------------------
  
-The *compile-time* environment variables::
+The *compile-time* environment variable::
  
      NPY_RELAXED_STRIDES_DEBUG=0
-    NPY_RELAXED_STRIDES_CHECKING=1
-
-control how NumPy reports contiguity for arrays.
-The default that it is enabled and the debug mode is disabled.
-This setting should always be enabled. Setting the
-debug option can be interesting for testing code written
-in C which iterates through arrays that may or may not be
-contiguous in memory.
-Most users will have no reason to change these; for details
-see the :ref:`memory layout <memory-layout>` documentation.
+
+can be set to help debug code written in C which iterates through arrays
+manually.  When an array is contiguous and iterated in a contiguous manner,
+its ``strides`` should not be queried.  This option can help find errors where
+the ``strides`` are incorrectly used.
+For details see the :ref:`memory layout <memory-layout>` documentation.
  
  
  Warn if no memory allocation policy when deallocating data
diff --git a/doc/source/reference/index.rst b/doc/source/reference/index.rst

index 0ee51f6127895e1c197be5e398271c4a4d1dc4b7..1c483907b4c374ac859f49ae18bb727898028997 100644 (file)
--- a/doc/source/reference/index.rst
+++ b/doc/source/reference/index.rst
@@ -18,6 +18,7 @@ For learning how to use NumPy, see the :ref:`complete documentation <numpy_docs_
     :maxdepth: 2
  
     arrays
+   array_api
     constants
     ufuncs
     routines
@@ -25,8 +26,9 @@ For learning how to use NumPy, see the :ref:`complete documentation <numpy_docs_
     global_state
     distutils
     distutils_guide
+   distutils_status_migration
     c-api/index
-   simd/simd-optimizations
+   simd/index
     swig
  
  
diff --git a/doc/source/reference/maskedarray.baseclass.rst b/doc/source/reference/maskedarray.baseclass.rst

index 5a0f99651c3f02be21b2d910e945611a69d33d14..44792a0d6a60a825b9d64ed059abd96e2db2407a 100644 (file)
--- a/doc/source/reference/maskedarray.baseclass.rst
+++ b/doc/source/reference/maskedarray.baseclass.rst
@@ -1,7 +1,6 @@
  .. currentmodule:: numpy.ma
  
  .. for doctests
-   >>> import numpy as np
     >>> from numpy import ma
  
  .. _numpy.ma.constants:
diff --git a/doc/source/reference/maskedarray.generic.rst b/doc/source/reference/maskedarray.generic.rst

index d3849c50deec9b4f5e33c14a4cfdaa321c58cad5..29fc2fe07452d76909f20efac8b053fe79d84c86 100644 (file)
--- a/doc/source/reference/maskedarray.generic.rst
+++ b/doc/source/reference/maskedarray.generic.rst
@@ -467,7 +467,7 @@ Suppose now that we wish to print that same data, but with the missing values
  replaced by the average value.
  
     >>> print(mx.filled(mx.mean()))
-   [ 0.  1.  2.  3.  4.]
+   [0.  1.  2.  3.  4.]
  
  
  Numerical operations
diff --git a/doc/source/reference/random/generator.rst b/doc/source/reference/random/generator.rst

index 7934be98a6d4355a1828ff4681c9652b6b49a354..9bee4d756553d466311919b93e0f76a66ba8a047 100644 (file)
--- a/doc/source/reference/random/generator.rst
+++ b/doc/source/reference/random/generator.rst
@@ -73,12 +73,12 @@ the value of the ``out`` parameter.  For example,
  
      >>> rng = np.random.default_rng()
      >>> x = np.arange(0, 15).reshape(3, 5)
-    >>> x
+    >>> x #doctest: +SKIP
      array([[ 0,  1,  2,  3,  4],
             [ 5,  6,  7,  8,  9],
             [10, 11, 12, 13, 14]])
      >>> y = rng.permuted(x, axis=1, out=x)
-    >>> x
+    >>> x #doctest: +SKIP
      array([[ 1,  0,  2,  4,  3],  # random
             [ 6,  7,  8,  9,  5],
             [10, 14, 11, 13, 12]])
@@ -88,6 +88,8 @@ Note that when ``out`` is given, the return value is ``out``:
      >>> y is x
      True
  
+.. _generator-handling-axis-parameter:
+
  Handling the ``axis`` parameter
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  An important distinction for these methods is how they handle the ``axis``
@@ -103,7 +105,7 @@ array, and  ``axis=1`` will rearrange the columns.  For example
      array([[ 0,  1,  2,  3,  4],
             [ 5,  6,  7,  8,  9],
             [10, 11, 12, 13, 14]])
-    >>> rng.permutation(x, axis=1)
+    >>> rng.permutation(x, axis=1) #doctest: +SKIP
      array([[ 1,  3,  2,  0,  4],  # random
             [ 6,  8,  7,  5,  9],
             [11, 13, 12, 10, 14]])
@@ -116,7 +118,7 @@ how `numpy.sort` treats it.  Each slice along the given axis is shuffled
  independently of the others.  Compare the following example of the use of
  `Generator.permuted` to the above example of `Generator.permutation`:
  
-    >>> rng.permuted(x, axis=1)
+    >>> rng.permuted(x, axis=1) #doctest: +SKIP
      array([[ 1,  0,  2,  4,  3],  # random
             [ 5,  7,  6,  9,  8],
             [10, 14, 12, 13, 11]])
@@ -134,7 +136,7 @@ For example,
      >>> rng = np.random.default_rng()
      >>> a = ['A', 'B', 'C', 'D', 'E']
      >>> rng.shuffle(a)  # shuffle the list in-place
-    >>> a
+    >>> a #doctest: +SKIP
      ['B', 'D', 'A', 'E', 'C']  # random
  
  Distributions
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst

index aaabc9b39278b4fda8f76dca23b46c675bf3c575..674799d475e70ebc1296b4b30fb09e235e3db2db 100644 (file)
--- a/doc/source/reference/random/index.rst
+++ b/doc/source/reference/random/index.rst
@@ -185,7 +185,7 @@ What's New or Different
    methods which are 2-10 times faster than NumPy's Box-Muller or inverse CDF
    implementations.
  * Optional ``dtype`` argument that accepts ``np.float32`` or ``np.float64``
-  to produce either single or double prevision uniform random variables for
+  to produce either single or double precision uniform random variables for
    select distributions
  * Optional ``out`` argument that allows existing arrays to be filled for
    select distributions
diff --git a/doc/source/reference/random/new-or-different.rst b/doc/source/reference/random/new-or-different.rst

index a815439267fc6e2c3d9a82418dbdf258a4238ba3..8f4a70540b65e66f79a3353efd4de6beb1837e7b 100644 (file)
--- a/doc/source/reference/random/new-or-different.rst
+++ b/doc/source/reference/random/new-or-different.rst
@@ -40,7 +40,7 @@ Feature            Older Equivalent     Notes
                                          supported.
  ------------------ -------------------- -------------
  ``integers``       ``randint``,         Use the ``endpoint`` kwarg to adjust
-                   ``random_integers``  the inclusion or exclution of the
+                   ``random_integers``  the inclusion or exclusion of the
                                          ``high`` interval endpoint
  ================== ==================== =============
  
@@ -84,7 +84,7 @@ And in more detail:
  * The bit generators can be used in downstream projects via
    Cython.
  * Optional ``dtype`` argument that accepts ``np.float32`` or ``np.float64``
-  to produce either single or double prevision uniform random variables for
+  to produce either single or double precision uniform random variables for
    select distributions
  
    * Uniforms (`~.Generator.random` and `~.Generator.integers`)
diff --git a/doc/source/reference/random/parallel.rst b/doc/source/reference/random/parallel.rst

index 7f0207bdebb2c78ed68e6fce18c866807e4a7fdb..bff955948f77371c271c206f53abb4782d494541 100644 (file)
--- a/doc/source/reference/random/parallel.rst
+++ b/doc/source/reference/random/parallel.rst
@@ -28,8 +28,8 @@ streams.
  
  `~SeedSequence` avoids these problems by using successions of integer hashes
  with good `avalanche properties`_ to ensure that flipping any bit in the input
-input has about a 50% chance of flipping any bit in the output. Two input seeds
-that are very close to each other will produce initial states that are very far
+has about a 50% chance of flipping any bit in the output. Two input seeds that
+are very close to each other will produce initial states that are very far
  from each other (with very high probability). It is also constructed in such
  a way that you can provide arbitrary-sized integers or lists of integers.
  `~SeedSequence` will take all of the bits that you provide and mix them
diff --git a/doc/source/reference/routines.array-creation.rst b/doc/source/reference/routines.array-creation.rst

index 30780c286c4100fbf5e54bfa4c6bbeffd0da5784..9d2954f2c35bec614e341cc563883ab348e7beb8 100644 (file)
--- a/doc/source/reference/routines.array-creation.rst
+++ b/doc/source/reference/routines.array-creation.rst
@@ -35,6 +35,7 @@ From existing data
     asmatrix
     copy
     frombuffer
+   from_dlpack
     fromfile
     fromfunction
     fromiter
diff --git a/doc/source/reference/routines.array-manipulation.rst b/doc/source/reference/routines.array-manipulation.rst

index 1c96495d96f7d81eb92c48e3bd87e487865148fd..95fc398763773fe57ef148062d14fb89f7f3e252 100644 (file)
--- a/doc/source/reference/routines.array-manipulation.rst
+++ b/doc/source/reference/routines.array-manipulation.rst
@@ -59,7 +59,6 @@ Changing kind of array
     asfortranarray
     ascontiguousarray
     asarray_chkfinite
-   asscalar
     require
  
  Joining arrays
diff --git a/doc/source/reference/routines.emath.rst b/doc/source/reference/routines.emath.rst

index c0c5b61fc894f865e78c43fda99158a65d719843..8ff712cb4536813d6776cee53a88b4f19d1df42f 100644 (file)
--- a/doc/source/reference/routines.emath.rst
+++ b/doc/source/reference/routines.emath.rst
@@ -1,9 +1,9 @@
-Mathematical functions with automatic domain (:mod:`numpy.emath`)
-***********************************************************************
+Mathematical functions with automatic domain
+********************************************
  
  .. currentmodule:: numpy
  
-.. note:: :mod:`numpy.emath` is a preferred alias for :mod:`numpy.lib.scimath`,
+.. note:: :mod:`numpy.emath` is a preferred alias for ``numpy.lib.scimath``,
            available after :mod:`numpy` is imported.
  
-.. automodule:: numpy.lib.scimath
+.. automodule:: numpy.emath
diff --git a/doc/source/reference/routines.ma.rst b/doc/source/reference/routines.ma.rst

index 5404c43d8feecf35a1769945d6be6c90b5b0e8ef..1de5c1c029d378aed69c4522cb24eb3d4b8c8f0e 100644 (file)
--- a/doc/source/reference/routines.ma.rst
+++ b/doc/source/reference/routines.ma.rst
@@ -190,6 +190,7 @@ Finding masked data
  .. autosummary::
     :toctree: generated/
  
+   ma.ndenumerate
     ma.flatnotmasked_contiguous
     ma.flatnotmasked_edges
     ma.notmasked_contiguous
diff --git a/doc/source/reference/routines.polynomials.classes.rst b/doc/source/reference/routines.polynomials.classes.rst

index 5f575bed13d4317425da00bb66b0c8e8ae98904d..2ce29d9d0c8e1693945f690c45a889613803579d 100644 (file)
--- a/doc/source/reference/routines.polynomials.classes.rst
+++ b/doc/source/reference/routines.polynomials.classes.rst
@@ -59,11 +59,11 @@ first is the coefficients, the second is the domain, and the third is the
  window::
  
     >>> p.coef
-   array([ 1.,  2.,  3.])
+   array([1., 2., 3.])
     >>> p.domain
-   array([-1.,  1.])
+   array([-1,  1])
     >>> p.window
-   array([-1.,  1.])
+   array([-1,  1])
  
  Printing a polynomial yields the polynomial expression in a more familiar
  format::
@@ -77,7 +77,7 @@ representation is also available (default on Windows). The polynomial string
  format can be toggled at the package-level with the 
  `~numpy.polynomial.set_default_printstyle` function::
  
-   >>> numpy.polynomial.set_default_printstyle('ascii')
+   >>> np.polynomial.set_default_printstyle('ascii')
     >>> print(p)
     1.0 + 2.0 x**1 + 3.0 x**2
  
@@ -137,9 +137,9 @@ Evaluation::
     array([  1.,   6.,  17.,  34.,  57.])
     >>> x = np.arange(6).reshape(3,2)
     >>> p(x)
-   array([[  1.,   6.],
-          [ 17.,  34.],
-          [ 57.,  86.]])
+   array([[ 1.,   6.],
+          [17.,  34.],
+          [57.,  86.]])
  
  Substitution:
  
@@ -294,7 +294,6 @@ polynomials up to degree 5 are plotted below.
      ...     ax = plt.plot(x, T.basis(i)(x), lw=2, label=f"$T_{i}$")
      ...
      >>> plt.legend(loc="upper left")
-    <matplotlib.legend.Legend object at 0x3b3ee10>
      >>> plt.show()
  
  In the range -1 <= `x` <= 1 they are nice, equiripple functions lying between +/- 1.
@@ -309,7 +308,6 @@ The same plots over the range -2 <= `x` <= 2 look very different:
      ...     ax = plt.plot(x, T.basis(i)(x), lw=2, label=f"$T_{i}$")
      ...
      >>> plt.legend(loc="lower right")
-    <matplotlib.legend.Legend object at 0x3b3ee10>
      >>> plt.show()
  
  As can be seen, the "good" parts have shrunk to insignificance. In using
@@ -335,12 +333,10 @@ illustrated below for a fit to a noisy sine curve.
      >>> y = np.sin(x) + np.random.normal(scale=.1, size=x.shape)
      >>> p = T.fit(x, y, 5)
      >>> plt.plot(x, y, 'o')
-    [<matplotlib.lines.Line2D object at 0x2136c10>]
      >>> xx, yy = p.linspace()
      >>> plt.plot(xx, yy, lw=2)
-    [<matplotlib.lines.Line2D object at 0x1cf2890>]
      >>> p.domain
-    array([ 0.        ,  6.28318531])
+    array([0.        ,  6.28318531])
      >>> p.window
      array([-1.,  1.])
      >>> plt.show()
diff --git a/doc/source/reference/routines.testing.rst b/doc/source/reference/routines.testing.rst

index d9e98e94188d6f95005e25577668d2bb472bfa21..16d53bb4e5ff5eb66d552547aa0b3f6ef8d9118c 100644 (file)
--- a/doc/source/reference/routines.testing.rst
+++ b/doc/source/reference/routines.testing.rst
@@ -27,6 +27,8 @@ Asserts
     assert_raises
     assert_raises_regex
     assert_warns
+   assert_no_warnings
+   assert_no_gc_cycles
     assert_string_equal
  
  Asserts (not recommended)
@@ -38,9 +40,11 @@ functions for more consistent floating point comparisons.
  .. autosummary::
     :toctree: generated/
  
+   assert_
     assert_almost_equal
     assert_approx_equal
     assert_array_almost_equal
+   print_assert_equal
  
  Decorators
  ----------
@@ -60,6 +64,8 @@ Test Running
     :toctree: generated/
  
     Tester
+   clear_and_catch_warnings
+   measure
     run_module_suite
     rundocs
     suppress_warnings
diff --git a/doc/source/reference/simd/build-options.rst b/doc/source/reference/simd/build-options.rst

new file mode 100644 (file)

index 0000000..0994f15
--- /dev/null
+++ b/doc/source/reference/simd/build-options.rst
@@ -0,0 +1,376 @@
+*****************
+CPU build options
+*****************
+
+Description
+-----------
+
+The following options are mainly used to change the default behavior of optimizations
+that target certain CPU features:
+
+- ``--cpu-baseline``: minimal set of required CPU features.
+   Default value is ``min`` which provides the minimum CPU features that can
+   safely run on a wide range of platforms within the processor family.
+
+   .. note::
+
+     At runtime, NumPy modules will fail to load if any of the specified features
+     are not supported by the target CPU (a Python runtime error is raised).
+
+- ``--cpu-dispatch``: dispatched set of additional CPU features.
+   Default value is ``max -xop -fma4`` which enables all CPU
+   features, except for AMD legacy features (in case of X86).
+
+   .. note::
+
+      At runtime, NumPy modules will skip any specified features
+      that are not available on the target CPU.
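The runtime difference between the two options can be sketched in a few lines of Python. This is an illustration only, not NumPy's loader: `select_kernels` and its arguments are hypothetical names, and in practice the feature sets would come from the build configuration and from the CPU itself.

```python
def select_kernels(baseline, dispatch, cpu_features):
    """Return the dispatched features usable on this CPU.

    Hypothetical helper: it mirrors the documented behaviour, not NumPy's code.
    """
    baseline, dispatch, cpu = set(baseline), set(dispatch), set(cpu_features)

    # Baseline features are compiled in unconditionally, so a CPU lacking
    # any of them cannot load the module at all.
    missing = baseline - cpu
    if missing:
        raise RuntimeError(
            "CPU lacks required baseline features: " + " ".join(sorted(missing))
        )

    # Dispatched features are optional: unavailable ones are simply skipped.
    return dispatch & cpu


usable = select_kernels(
    baseline={"SSE", "SSE2", "SSE3"},
    dispatch={"AVX2", "AVX512F"},
    cpu_features={"SSE", "SSE2", "SSE3", "AVX2"},
)
print(sorted(usable))  # ['AVX2']
```

A higher baseline therefore trades portability for unconditional use of a feature, while a dispatched feature costs nothing on CPUs that lack it.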
+
+These options are accessible through :py:mod:`distutils` commands
+`distutils.command.build`, `distutils.command.build_clib` and
+`distutils.command.build_ext`.
+They accept a set of :ref:`CPU features <opt-supported-features>`,
+groups of features that gather several features, or
+:ref:`special options <opt-special-options>` that
+perform a series of procedures.
+
+.. note::
+
+    If ``build_clib`` or ``build_ext`` are not specified by the user,
+    the arguments of ``build`` will be used instead, which also holds the default values.
+
+To customize both ``build_ext`` and ``build_clib``::
+
+    cd /path/to/numpy
+    python setup.py build --cpu-baseline="avx2 fma3" install --user
+
+To customize only ``build_ext``::
+
+    cd /path/to/numpy
+    python setup.py build_ext --cpu-baseline="avx2 fma3" install --user
+
+To customize only ``build_clib``::
+
+    cd /path/to/numpy
+    python setup.py build_clib --cpu-baseline="avx2 fma3" install --user
+
+You can also customize the CPU build options through ``pip``::
+
+    pip install --no-use-pep517 --global-option=build \
+    --global-option="--cpu-baseline=avx2 fma3" \
+    --global-option="--cpu-dispatch=max" ./
+
+Quick Start
+-----------
+
+In general, the default settings tend to not impose certain CPU features that
+may not be available on some older processors. Raising the ceiling of the
+baseline features will often improve performance and may also reduce
+binary size.
+
+
+The following are the most common scenarios that may require changing
+the default settings:
+
+
+I am building NumPy for my local use
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+And I do not intend to export the build to other users or target a
+different CPU than what the host has.
+
+Set `native` for the baseline, or manually specify the CPU features if the option
+`native` isn't supported by your platform::
+
+    python setup.py build --cpu-baseline="native" bdist
+
+Building NumPy with extra CPU features isn't necessary for this case,
+since all supported features are already defined within the baseline features::
+
+    python setup.py build --cpu-baseline=native --cpu-dispatch=none bdist
+
+.. note::
+
+    A fatal error will be raised if `native` isn't supported by the host platform.
+
+I do not want to support the old processors of the `x86` architecture
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Since most CPUs nowadays support at least the `AVX` and `F16C` features, you can use::
+
+    python setup.py build --cpu-baseline="avx f16c" bdist
+
+.. note::
+
+    ``--cpu-baseline`` always combines all implied features, so there's no need
+    to add SSE features.
+
+
+I'm facing the same case above but with `ppc64` architecture
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Then raise the ceiling of the baseline features to Power8::
+
+    python setup.py build --cpu-baseline="vsx2" bdist
+
+Having issues with `AVX512` features?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You may have some reservations about including `AVX512` or
+any other CPU feature and want to exclude it from the dispatched features::
+
+    python setup.py build --cpu-dispatch="max -avx512f -avx512cd \
+    -avx512_knl -avx512_knm -avx512_skx -avx512_clx -avx512_cnl -avx512_icl" \
+    bdist
+
+.. _opt-supported-features:
+
+Supported Features
+------------------
+
+The names of the features can express one feature or a group of features,
+as shown in the following tables:
+
+.. note::
+
+    The following features may not be supported by all compilers.
+    Some compilers may also produce a different set of implied features
+    when it comes to features like ``AVX512``, ``AVX2``, and ``FMA3``.
+    See :ref:`opt-platform-differences` for more details.
+
+.. include:: generated_tables/cpu_features.inc
+
+.. _opt-special-options:
+
+Special Options
+---------------
+
+- ``NONE``: enable no features.
+
+- ``NATIVE``: Enables all CPU features that are supported by the host CPU;
+  this operation is based on the compiler flags (``-march=native``, ``-xHost``, ``/QxHost``).
+
+- ``MIN``: Enables the minimum CPU features that can safely run on a wide range of platforms:
+
+  .. table::
+      :align: left
+
+      ======================================  =======================================
+       For Arch                               Implies
+      ======================================  =======================================
+       x86 (32-bit mode)                      ``SSE`` ``SSE2``
+       x86_64                                 ``SSE`` ``SSE2`` ``SSE3``
+       IBM/POWER (big-endian mode)            ``NONE``
+       IBM/POWER (little-endian mode)         ``VSX`` ``VSX2``
+       ARMHF                                  ``NONE``
+       ARM64 A.K.A. AARCH64                   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``
+                                              ``ASIMD``
+       IBM/ZSYSTEM(S390X)                     ``NONE``
+      ======================================  =======================================
+
+- ``MAX``: Enables all CPU features supported by the compiler and platform.
+
+- Operators ``-``/``+``: remove or add features; useful with the options ``MAX``, ``MIN`` and ``NATIVE``.
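How the ``-``/``+`` operators combine with a special option such as ``MAX`` can be sketched as below. This is a simplified illustration under stated assumptions, not the actual option parser: `resolve` is a hypothetical name, the list of supported features is passed in by hand, and only ``MAX``, ``NONE`` and whitespace-separated tokens are handled.

```python
def resolve(option, all_features):
    """Resolve a string such as "max -xop -fma4" to a set of feature names.

    Hypothetical sketch: only MAX, NONE and the -/+ operators are handled.
    """
    enabled = set()
    for token in option.upper().split():
        if token == "MAX":
            # MAX stands for every feature the compiler and platform support.
            enabled |= set(all_features)
        elif token == "NONE":
            # NONE enables nothing, so it contributes no features.
            continue
        elif token.startswith("-"):
            enabled.discard(token[1:])
        else:
            enabled.add(token.lstrip("+"))
    return enabled


# Assumed, shortened feature list, used only for this example.
features = ["SSE", "SSE2", "AVX", "XOP", "FMA4", "AVX2"]
print(sorted(resolve("max -xop -fma4", features)))  # ['AVX', 'AVX2', 'SSE', 'SSE2']
```

Tokens are applied from left to right, which is why the removals are written after ``max`` in the default ``--cpu-dispatch`` value.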
+
+Behaviors
+---------
+
+- CPU features and other options are case-insensitive, for example::
+
+    python setup.py build --cpu-dispatch="SSE41 avx2 FMA3"
+
+- The order of the requested optimizations doesn't matter::
+
+    python setup.py build --cpu-dispatch="SSE41 AVX2 FMA3"
+    # equivalent to
+    python setup.py build --cpu-dispatch="FMA3 AVX2 SSE41"
+
+- Commas, spaces, or '+' can be used as separators,
+  for example::
+
+    python setup.py build --cpu-dispatch="avx2 avx512f"
+    # or
+    python setup.py build --cpu-dispatch=avx2,avx512f
+    # or
+    python setup.py build --cpu-dispatch="avx2+avx512f"
+
+  All of these work, but arguments must be enclosed in quotes or escaped
+  with a backslash if any spaces are used.
+
+- ``--cpu-baseline`` combines all implied CPU features, for example::
+
+    python setup.py build --cpu-baseline=sse42
+    # equivalent to
+    python setup.py build --cpu-baseline="sse sse2 sse3 ssse3 sse41 popcnt sse42"
+
+- ``--cpu-baseline`` will be treated as "native" if a compiler native flag
+  (``-march=native``, ``-xHost`` or ``/QxHost``) is enabled through the environment variable
+  `CFLAGS`::
+
+    export CFLAGS="-march=native"
+    python setup.py install --user
+    # is equivalent to
+    python setup.py build --cpu-baseline=native install --user
+
+- ``--cpu-baseline`` skips any specified features that aren't supported
+  by the target platform or compiler rather than raising fatal errors.
+
+   .. note::
+
+        Since ``--cpu-baseline`` combines all implied features, the largest
+        supported subset of the implied features will be enabled rather than skipping all of them.
+        For example::
+
+           # Requesting `AVX2,FMA3` when the compiler only supports **SSE** features
+           python setup.py build --cpu-baseline="avx2 fma3"
+           # is equivalent to
+           python setup.py build --cpu-baseline="sse sse2 sse3 ssse3 sse41 popcnt sse42"
+
+- ``--cpu-dispatch`` does not combine any of the implied CPU features,
+  so you must add them unless you want to disable one or all of them::
+
+    # Only dispatches AVX2 and FMA3
+    python setup.py build --cpu-dispatch=avx2,fma3
+    # Dispatches AVX and SSE features
+    python setup.py build --cpu-dispatch=ssse3,sse41,sse42,avx,avx2,fma3
+
+- ``--cpu-dispatch`` skips any specified baseline features and also skips
+  any features not supported by the target platform or compiler without raising
+  fatal errors.
+
+Finally, you should always check the final report in the build log
+to verify the enabled features. See :ref:`opt-build-report` for more details.
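The parsing rules above (case-insensitive names, order independence, interchangeable separators, and implied features for ``--cpu-baseline``) can be illustrated with a short Python sketch. It is not the real implementation: `parse`, `expand_baseline` and `X86_CHAIN` are hypothetical names, and only the SSE part of the x86 table is modelled.

```python
import re

# Part of the x86 chain from the "On x86" table, in order: each feature
# implies every feature before it (SSE and SSE2 also imply each other).
X86_CHAIN = ["SSE", "SSE2", "SSE3", "SSSE3", "SSE41", "POPCNT", "SSE42"]


def parse(option):
    """Split on spaces, commas or '+', ignoring case and order."""
    return {name.upper() for name in re.split(r"[\s,+]+", option) if name}


def expand_baseline(option):
    """Add the implied features, as --cpu-baseline is documented to do."""
    requested = parse(option)
    expanded = set(requested)
    for name in requested:
        if name in X86_CHAIN:
            # Everything earlier in the chain is implied by this feature.
            expanded.update(X86_CHAIN[: X86_CHAIN.index(name)])
    if "SSE" in expanded:
        expanded.add("SSE2")
    return expanded


# Case, order and separator do not change the requested set.
assert parse("SSE41 avx2 FMA3") == parse("fma3,avx2+sse41")
print(sorted(expand_baseline("sse42")))
```

Under these assumptions, `expand_baseline("sse42")` yields the same seven names as the longer ``--cpu-baseline`` command shown earlier, whereas a dispatch-style request would use `parse` alone and keep only the names given.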
+
+.. _opt-platform-differences:
+
+Platform differences
+--------------------
+
+Some exceptional conditions force us to link some features together when it comes to
+certain compilers or architectures, making it impossible to build them separately.
+
+These conditions can be divided into two parts, as follows:
+
+**Architectural compatibility**
+
+Certain CPU features that are guaranteed to be supported by successive
+generations of the same architecture need to be aligned. Some cases:
+
+- On ppc64le ``VSX(ISA 2.06)`` and ``VSX2(ISA 2.07)`` imply one another since the
+  first generation that supports little-endian mode is Power8 (ISA 2.07).
+- On AArch64 ``NEON NEON_FP16 NEON_VFPV4 ASIMD`` imply one another since they are part of the
+  hardware baseline.
+
+For example::
+
+    # On ARMv8/A64, specifying NEON enables Advanced SIMD
+    # and all predecessor extensions
+    python setup.py build --cpu-baseline=neon
+    # which is equivalent to
+    python setup.py build --cpu-baseline="neon neon_fp16 neon_vfpv4 asimd"
+
+.. note::
+
+    Please take a close look at :ref:`opt-supported-features`
+    to determine the features that imply one another.
+
+**Compilation compatibility**
+
+Some compilers don't provide independent support for all CPU features. For instance,
+**Intel**'s compiler doesn't provide separate flags for ``AVX2`` and ``FMA3``.
+This makes sense since all Intel CPUs that come with ``AVX2`` also support ``FMA3``,
+but this approach is incompatible with other **x86** CPUs from **AMD** or **VIA**.
+
+For example::
+
+    # Specifying AVX2 forcibly enables FMA3 on Intel compilers
+    python setup.py build --cpu-baseline=avx2
+    # which is equivalent to
+    python setup.py build --cpu-baseline="avx2 fma3"
+
+
+The following tables only show the differences imposed by some compilers from the
+general context that has been shown in the :ref:`opt-supported-features` tables:
+
+.. note::
+
+    Feature names shown with strikethrough represent unsupported CPU features.
+
+.. raw:: html
+
+    <style>
+        .enabled-feature {color:green; font-weight:bold;}
+        .disabled-feature {color:red; text-decoration: line-through;}
+    </style>
+
+.. role:: enabled
+    :class: enabled-feature
+
+.. role:: disabled
+    :class: disabled-feature
+
+.. include:: generated_tables/compilers-diff.inc
+
+.. _opt-build-report:
+
+Build report
+------------
+
+In most cases, the CPU build options do not produce any fatal errors that halt the build.
+Most of the errors that may appear in the build log serve as strong warnings about
+expected CPU features that the compiler lacks.
+
+So we strongly recommend checking the final report log, to be aware of which CPU features
+are enabled and which are not.
+
+You can find the final report of CPU optimizations at the end of the build log,
+and here is how it looks on x86_64/gcc:
+
+.. raw:: html
+
+    <style>#build-report .highlight-bash pre{max-height:450px; overflow-y: scroll;}</style>
+
+.. literalinclude:: log_example.txt
+   :language: bash
+
+As you see, there is a separate report for each of ``build_ext`` and ``build_clib``
+that includes several sections, and each section has several values, representing the following:
+
+**Platform**:
+
+- :enabled:`Architecture`: The architecture name of target CPU. It should be one of
+  ``x86``, ``x64``, ``ppc64``, ``ppc64le``, ``armhf``, ``aarch64``, ``s390x`` or ``unknown``.
+
+- :enabled:`Compiler`: The compiler name. It should be one of
+  gcc, clang, msvc, icc, iccw or unix-like.
+
+**CPU baseline**:
+
+- :enabled:`Requested`: The features and options passed to ``--cpu-baseline``, as-is.
+- :enabled:`Enabled`: The final set of enabled CPU features.
+- :enabled:`Flags`: The compiler flags that were used for all NumPy `C/C++` sources
+  during the compilation except for temporary sources that have been used for generating
+  the binary objects of dispatched features.
+- :enabled:`Extra checks`: list of internal checks that activate certain functionality
+  or intrinsics related to the enabled features, useful for debugging when it comes
+  to developing SIMD kernels.
+
+**CPU dispatch**:
+
+- :enabled:`Requested`: The features and options passed to ``--cpu-dispatch``, as-is.
+- :enabled:`Enabled`: The final set of enabled CPU features.
+- :enabled:`Generated`: At the beginning of the next row of this property,
+  the features for which optimizations have been generated are shown in the
+  form of several sections with similar properties explained as follows:
+
+  - :enabled:`One or more dispatched features`: The implied CPU features.
+  - :enabled:`Flags`: The compiler flags that have been used for these features.
+  - :enabled:`Extra checks`: Similar to the baseline but for these dispatched features.
+  - :enabled:`Detect`: Set of CPU features that need to be detected at runtime in order to
+    execute the generated optimizations.
+  - The lines that come after the above property and end with a ':' on a separate line
+    represent the paths of C/C++ sources that define the generated optimizations.
+
+Runtime Trace
+-------------
+To be completed.
diff --git a/doc/source/reference/simd/gen_features.py b/doc/source/reference/simd/gen_features.py

new file mode 100644 (file)

index 0000000..9a38ef5
--- /dev/null
+++ b/doc/source/reference/simd/gen_features.py
@@ -0,0 +1,196 @@
+"""
+Generate CPU features tables from CCompilerOpt
+"""
+from os import path
+from numpy.distutils.ccompiler_opt import CCompilerOpt
+
+class FakeCCompilerOpt(CCompilerOpt):
+    # disable caching no need for it
+    conf_nocache = True
+
+    def __init__(self, arch, cc, *args, **kwargs):
+        self.fake_info = (arch, cc, '')
+        CCompilerOpt.__init__(self, None, **kwargs)
+
+    def dist_compile(self, sources, flags, **kwargs):
+        return sources
+
+    def dist_info(self):
+        return self.fake_info
+
+    @staticmethod
+    def dist_log(*args, stderr=False):
+        # avoid printing
+        pass
+
+    def feature_test(self, name, force_flags=None, macros=[]):
+        # To speed up
+        return True
+
+class Features:
+    def __init__(self, arch, cc):
+        self.copt = FakeCCompilerOpt(arch, cc, cpu_baseline="max")
+
+    def names(self):
+        return self.copt.cpu_baseline_names()
+
+    def serialize(self, features_names):
+        result = []
+        for f in self.copt.feature_sorted(features_names):
+            gather = self.copt.feature_supported.get(f, {}).get("group", [])
+            implies = self.copt.feature_sorted(self.copt.feature_implies(f))
+            result.append((f, implies, gather))
+        return result
+
+    def table(self, **kwargs):
+        return self.gen_table(self.serialize(self.names()), **kwargs)
+
+    def table_diff(self, vs, **kwargs):
+        fnames = set(self.names())
+        fnames_vs = set(vs.names())
+        common = fnames.intersection(fnames_vs)
+        extra = fnames.difference(fnames_vs)
+        notavl = fnames_vs.difference(fnames)
+        iextra = {}
+        inotavl = {}
+        idiff = set()
+        for f in common:
+            implies = self.copt.feature_implies(f)
+            implies_vs = vs.copt.feature_implies(f)
+            e = implies.difference(implies_vs)
+            i = implies_vs.difference(implies)
+            if not i and not e:
+                continue
+            if e:
+                iextra[f] = e
+            if i:
+                inotavl[f] = i
+            idiff.add(f)
+
+        def fbold(f):
+            if f in extra:
+                return f':enabled:`{f}`'
+            if f in notavl:
+                return f':disabled:`{f}`'
+            return f
+
+        def fbold_implies(f, i):
+            if i in iextra.get(f, {}):
+                return f':enabled:`{i}`'
+            if f in notavl or i in inotavl.get(f, {}):
+                return f':disabled:`{i}`'
+            return i
+
+        diff_all = self.serialize(idiff.union(extra))
+        diff_all += vs.serialize(notavl)
+        content = self.gen_table(
+            diff_all, fstyle=fbold, fstyle_implies=fbold_implies, **kwargs
+        )
+        return content
+
+    def gen_table(self, serialized_features, fstyle=None, fstyle_implies=None,
+                  **kwargs):
+
+        if fstyle is None:
+            fstyle = lambda ft: f'``{ft}``'
+        if fstyle_implies is None:
+            fstyle_implies = lambda origin, ft: fstyle(ft)
+
+        rows = []
+        have_gather = False
+        for f, implies, gather in serialized_features:
+            if gather:
+                have_gather = True
+            name = fstyle(f)
+            implies = ' '.join([fstyle_implies(f, i) for i in implies])
+            gather = ' '.join([fstyle_implies(f, i) for i in gather])
+            rows.append((name, implies, gather))
+        if not rows:
+            return ''
+        fields = ["Name", "Implies", "Gathers"]
+        if not have_gather:
+            del fields[2]
+            rows = [(name, implies) for name, implies, _ in rows]
+        return self.gen_rst_table(fields, rows, **kwargs)
+
+    def gen_rst_table(self, field_names, rows, tab_size=4):
+        assert(not rows or len(field_names) == len(rows[0]))
+        rows.append(field_names)
+        fld_len = len(field_names)
+        cls_len = [max(len(c[i]) for c in rows) for i in range(fld_len)]
+        del rows[-1]
+        cformat = ' '.join('{:<%d}' % i for i in cls_len)
+        border = cformat.format(*['='*i for i in cls_len])
+
+        rows = [cformat.format(*row) for row in rows]
+        # header
+        rows = [border, cformat.format(*field_names), border] + rows
+        # footer
+        rows += [border]
+        # add left margin
+        rows = [(' ' * tab_size) + r for r in rows]
+        return '\n'.join(rows)
+
+def wrapper_section(title, content, tab_size=4):
+    tab = ' '*tab_size
+    if content:
+        return (
+            f"{title}\n{'~'*len(title)}"
+            f"\n.. table::\n{tab}:align: left\n\n"
+            f"{content}\n\n"
+        )
+    return ''
+
+def wrapper_tab(title, table, tab_size=4):
+    tab = ' '*tab_size
+    if table:
+        return ('\n' + tab).join((
+            '.. tab:: ' + title,
+            tab + '.. table::',
+            tab + 'align: left',
+            table + '\n\n'
+        ))
+    return ''
+
+
+if __name__ == '__main__':
+
+    pretty_names = {
+        "PPC64": "IBM/POWER big-endian",
+        "PPC64LE": "IBM/POWER little-endian",
+        "S390X": "IBM/ZSYSTEM(S390X)",
+        "ARMHF": "ARMv7/A32",
+        "AARCH64": "ARMv8/A64",
+        "ICC": "Intel Compiler",
+        # "ICCW": "Intel Compiler msvc-like",
+        "MSVC": "Microsoft Visual C/C++"
+    }
+    gen_path = path.join(
+        path.dirname(path.realpath(__file__)), "generated_tables"
+    )
+    with open(path.join(gen_path, 'cpu_features.inc'), 'wt') as fd:
+        fd.write(f'.. generated via {__file__}\n\n')
+        for arch in (
+            ("x86", "PPC64", "PPC64LE", "ARMHF", "AARCH64", "S390X")
+        ):
+            title = "On " + pretty_names.get(arch, arch)
+            table = Features(arch, 'gcc').table()
+            fd.write(wrapper_section(title, table))
+
+    with open(path.join(gen_path, 'compilers-diff.inc'), 'wt') as fd:
+        fd.write(f'.. generated via {__file__}\n\n')
+        for arch, cc_names in (
+            ("x86", ("clang", "ICC", "MSVC")),
+            ("PPC64", ("clang",)),
+            ("PPC64LE", ("clang",)),
+            ("ARMHF", ("clang",)),
+            ("AARCH64", ("clang",)),
+            ("S390X", ("clang",))
+        ):
+            arch_pname = pretty_names.get(arch, arch)
+            for cc in cc_names:
+                title = f"On {arch_pname}::{pretty_names.get(cc, cc)}"
+                table = Features(arch, cc).table_diff(Features(arch, "gcc"))
+                fd.write(wrapper_section(title, table))
+
+
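The column-width technique used by ``gen_rst_table`` above can be tried in isolation with the standalone sketch below. It is a simplified re-statement rather than the script's own code: `rst_simple_table` and its `indent` parameter are hypothetical names, and unlike the method in the script it does not modify the `rows` list passed in.

```python
def rst_simple_table(field_names, rows, indent=4):
    """Render rows as a reStructuredText simple table.

    Standalone sketch of the column-width technique; the name and
    signature are hypothetical.
    """
    table = [tuple(field_names)] + [tuple(r) for r in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in table) for i in range(len(field_names))]
    border = " ".join("=" * w for w in widths)

    def fmt(row):
        return " ".join(cell.ljust(w) for cell, w in zip(row, widths))

    lines = [border, fmt(table[0]), border]
    lines += [fmt(row) for row in table[1:]]
    lines.append(border)
    # Left margin so the table can sit under a ".. table::" directive.
    return "\n".join(" " * indent + line for line in lines)


print(rst_simple_table(
    ["Name", "Implies"],
    [("``VSX``", ""), ("``VSX2``", "``VSX``")],
))
```

The output has the same shape as the generated ``.inc`` tables: a border row of ``=`` characters, the header, another border, the data rows, and a closing border.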
diff --git a/doc/source/reference/simd/generated_tables/compilers-diff.inc b/doc/source/reference/simd/generated_tables/compilers-diff.inc

new file mode 100644 (file)

index 0000000..4b9009a
--- /dev/null
+++ b/doc/source/reference/simd/generated_tables/compilers-diff.inc
@@ -0,0 +1,33 @@
+.. generated via /home/seiko/work/repos/numpy/doc/source/reference/simd/./gen_features.py
+
+On x86::Intel Compiler
+~~~~~~~~~~~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ================ ==========================================================================================================================================
+    Name             Implies                                                                                                                                   
+    ================ ==========================================================================================================================================
+    FMA3             SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C :enabled:`AVX2`                                                                           
+    AVX2             SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C :enabled:`FMA3`                                                                           
+    AVX512F          SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 :enabled:`AVX512CD`                                                             
+    :disabled:`XOP`  :disabled:`SSE` :disabled:`SSE2` :disabled:`SSE3` :disabled:`SSSE3` :disabled:`SSE41` :disabled:`POPCNT` :disabled:`SSE42` :disabled:`AVX`
+    :disabled:`FMA4` :disabled:`SSE` :disabled:`SSE2` :disabled:`SSE3` :disabled:`SSSE3` :disabled:`SSE41` :disabled:`POPCNT` :disabled:`SSE42` :disabled:`AVX`
+    ================ ==========================================================================================================================================
+
+On x86::Microsoft Visual C/C++
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ====================== ============================================================================================================================================================================================================================================================= =============================================================================
+    Name                   Implies                                                                                                                                                                                                                                                       Gathers                                                                      
+    ====================== ============================================================================================================================================================================================================================================================= =============================================================================
+    FMA3                   SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C :enabled:`AVX2`                                                                                                                                                                                                                                                                            
+    AVX2                   SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C :enabled:`FMA3`                                                                                                                                                                                                                                                                            
+    AVX512F                SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 :enabled:`AVX512CD` :enabled:`AVX512_SKX`                                                                                                                                                                                                                                        
+    AVX512CD               SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F :enabled:`AVX512_SKX`                                                                                                                                                                                                                                                    
+    :disabled:`AVX512_KNL` :disabled:`SSE` :disabled:`SSE2` :disabled:`SSE3` :disabled:`SSSE3` :disabled:`SSE41` :disabled:`POPCNT` :disabled:`SSE42` :disabled:`AVX` :disabled:`F16C` :disabled:`FMA3` :disabled:`AVX2` :disabled:`AVX512F` :disabled:`AVX512CD`                        :disabled:`AVX512ER` :disabled:`AVX512PF`                                    
+    :disabled:`AVX512_KNM` :disabled:`SSE` :disabled:`SSE2` :disabled:`SSE3` :disabled:`SSSE3` :disabled:`SSE41` :disabled:`POPCNT` :disabled:`SSE42` :disabled:`AVX` :disabled:`F16C` :disabled:`FMA3` :disabled:`AVX2` :disabled:`AVX512F` :disabled:`AVX512CD` :disabled:`AVX512_KNL` :disabled:`AVX5124FMAPS` :disabled:`AVX5124VNNIW` :disabled:`AVX512VPOPCNTDQ`
+    ====================== ============================================================================================================================================================================================================================================================= =============================================================================
+
diff --git a/doc/source/reference/simd/generated_tables/cpu_features.inc b/doc/source/reference/simd/generated_tables/cpu_features.inc

new file mode 100644 (file)

index 0000000..7782172
--- /dev/null
+++ b/doc/source/reference/simd/generated_tables/cpu_features.inc
@@ -0,0 +1,108 @@
+.. generated via /home/seiko/work/repos/review/numpy/doc/source/reference/simd/gen_features.py
+
+On x86
+~~~~~~
+.. table::
+    :align: left
+
+    ============== =========================================================================================================================================================================== =====================================================
+    Name           Implies                                                                                                                                                                     Gathers                                              
+    ============== =========================================================================================================================================================================== =====================================================
+    ``SSE``        ``SSE2``                                                                                                                                                                                                                         
+    ``SSE2``       ``SSE``                                                                                                                                                                                                                          
+    ``SSE3``       ``SSE`` ``SSE2``                                                                                                                                                                                                                 
+    ``SSSE3``      ``SSE`` ``SSE2`` ``SSE3``                                                                                                                                                                                                        
+    ``SSE41``      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3``                                                                                                                                                                                              
+    ``POPCNT``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41``                                                                                                                                                                                    
+    ``SSE42``      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT``                                                                                                                                                                         
+    ``AVX``        ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42``                                                                                                                                                               
+    ``XOP``        ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``                                                                                                                                                       
+    ``FMA4``       ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``                                                                                                                                                       
+    ``F16C``       ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``                                                                                                                                                       
+    ``FMA3``       ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``                                                                                                                                              
+    ``AVX2``       ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``                                                                                                                                              
+    ``AVX512F``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2``                                                                                                                            
+    ``AVX512CD``   ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F``                                                                                                                
+    ``AVX512_KNL`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``                                              ``AVX512ER`` ``AVX512PF``                            
+    ``AVX512_KNM`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_KNL``                               ``AVX5124FMAPS`` ``AVX5124VNNIW`` ``AVX512VPOPCNTDQ``
+    ``AVX512_SKX`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``                                              ``AVX512VL`` ``AVX512BW`` ``AVX512DQ``               
+    ``AVX512_CLX`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``                               ``AVX512VNNI``                                       
+    ``AVX512_CNL`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``                               ``AVX512IFMA`` ``AVX512VBMI``                        
+    ``AVX512_ICL`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX`` ``AVX512_CLX`` ``AVX512_CNL`` ``AVX512VBMI2`` ``AVX512BITALG`` ``AVX512VPOPCNTDQ`` 
+    ============== =========================================================================================================================================================================== =====================================================
+
+On IBM/POWER big-endian
+~~~~~~~~~~~~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ======== =========================
+    Name     Implies                  
+    ======== =========================
+    ``VSX``                           
+    ``VSX2`` ``VSX``                  
+    ``VSX3`` ``VSX`` ``VSX2``         
+    ``VSX4`` ``VSX`` ``VSX2`` ``VSX3``
+    ======== =========================
+
+On IBM/POWER little-endian
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ======== =========================
+    Name     Implies                  
+    ======== =========================
+    ``VSX``  ``VSX2``                 
+    ``VSX2`` ``VSX``                  
+    ``VSX3`` ``VSX`` ``VSX2``         
+    ``VSX4`` ``VSX`` ``VSX2`` ``VSX3``
+    ======== =========================
+
+On ARMv7/A32
+~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ============== ===========================================================
+    Name           Implies                                                    
+    ============== ===========================================================
+    ``NEON``                                                                  
+    ``NEON_FP16``  ``NEON``                                                   
+    ``NEON_VFPV4`` ``NEON`` ``NEON_FP16``                                     
+    ``ASIMD``      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``                      
+    ``ASIMDHP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
+    ``ASIMDDP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
+    ``ASIMDFHM``   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
+    ============== ===========================================================
+
+On ARMv8/A64
+~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ============== ===========================================================
+    Name           Implies                                                    
+    ============== ===========================================================
+    ``NEON``       ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``                     
+    ``NEON_FP16``  ``NEON`` ``NEON_VFPV4`` ``ASIMD``                          
+    ``NEON_VFPV4`` ``NEON`` ``NEON_FP16`` ``ASIMD``                           
+    ``ASIMD``      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``                      
+    ``ASIMDHP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
+    ``ASIMDDP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
+    ``ASIMDFHM``   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
+    ============== ===========================================================
+
+On IBM/ZSYSTEM(S390X)
+~~~~~~~~~~~~~~~~~~~~~
+.. table::
+    :align: left
+
+    ======== ==============
+    Name     Implies       
+    ======== ==============
+    ``VX``                 
+    ``VXE``  ``VX``        
+    ``VXE2`` ``VX`` ``VXE``
+    ======== ==============
+
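Reading aid for the implication tables above: each feature name enables itself plus every feature in its Implies column. The following is a minimal sketch in C of that lookup, covering the IBM Z table only. The ``FEAT_*`` flags and the functions ``expand_feature`` and ``expand_set`` are hypothetical names chosen for illustration. They are not part of NumPy, and the build system may resolve implications differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit flags for the IBM Z features listed in the table. */
enum {
    FEAT_VX   = 1u << 0,
    FEAT_VXE  = 1u << 1,
    FEAT_VXE2 = 1u << 2,
    FEAT_ALL  = FEAT_VX | FEAT_VXE | FEAT_VXE2
};

/* Returns a single feature together with everything it implies. */
static uint32_t expand_feature(uint32_t feature)
{
    switch (feature) {
    case FEAT_VX:   return FEAT_VX;
    case FEAT_VXE:  return FEAT_VXE | FEAT_VX;
    case FEAT_VXE2: return FEAT_VXE2 | FEAT_VXE | FEAT_VX;
    default:        return 0; /* unknown feature: nothing is enabled */
    }
}

/* Expands every feature bit set in 'requested'. */
static uint32_t expand_set(uint32_t requested)
{
    uint32_t enabled = 0;
    /* Only the three flags above are valid input. */
    assert((requested & ~(uint32_t)FEAT_ALL) == 0);
    for (uint32_t bit = FEAT_VX; bit <= (uint32_t)FEAT_VXE2; bit <<= 1) {
        if (requested & bit) {
            enabled |= expand_feature(bit);
        }
    }
    return enabled;
}
```

Under these assumptions, requesting only ``FEAT_VXE2`` yields all three flags, which mirrors the last row of the table.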
diff --git a/doc/source/reference/simd/how-it-works.rst b/doc/source/reference/simd/how-it-works.rst

new file mode 100644 (file)

index 0000000..19b3dba
--- /dev/null
+++ b/doc/source/reference/simd/how-it-works.rst
@@ -0,0 +1,349 @@
+**********************************
+How does the CPU dispatcher work?
+**********************************
+
+The NumPy dispatcher is based on multi-source compiling, which means taking
+a given source and compiling it multiple times with different compiler
+flags and with different **C** definitions that affect the code
+paths. This enables certain instruction-sets for each compiled object,
+depending on the required optimizations, and ends with linking the
+resulting objects together.
+
+.. figure:: ../figures/opt-infra.png
+
+This mechanism should support all compilers and it doesn't require any
+compiler-specific extension, but at the same time it adds a few steps to
+normal compilation that are explained as follows.
+
+1- Configuration
+~~~~~~~~~~~~~~~~
+
+Before building the source files, the user configures the required
+optimizations via the two command arguments explained above:
+
+-  ``--cpu-baseline``: minimal set of required optimizations.
+
+-  ``--cpu-dispatch``: dispatched set of additional optimizations.
+
+
+2- Discovering the environment
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In this part, we check the compiler and platform architecture
+and cache some of the intermediary results to speed up rebuilding.
+
+3- Validating the requested optimizations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The requested optimizations are tested against the compiler to determine
+which of them the compiler actually supports.
+
+4- Generating the main configuration header
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The generated header ``_cpu_dispatch.h`` contains all the definitions and
+headers of instruction-sets for the required optimizations that have been
+validated during the previous step.
+
+It also contains extra C definitions that are used for defining NumPy's
+Python-level module attributes ``__cpu_baseline__`` and ``__cpu_dispatch__``.
+
+**What is in this header?**
+
+The following example header was generated dynamically with gcc on an x86
+machine. The build was configured with ``--cpu-baseline="sse sse2 sse3"`` and
+``--cpu-dispatch="ssse3 sse41"``, all of which the compiler supports. The
+result is below.
+
+.. code:: c
+
+   // The header should be located at numpy/numpy/core/src/common/_cpu_dispatch.h
+   /**NOTE
+    ** C definitions prefixed with "NPY_HAVE_" represent
+    ** the required optimizations.
+    **
+    ** C definitions prefixed with 'NPY__CPU_TARGET_' are protected and
+    ** shouldn't be used by any NumPy C sources.
+    */
+   /******* baseline features *******/
+   /** SSE **/
+   #define NPY_HAVE_SSE 1
+   #include <xmmintrin.h>
+   /** SSE2 **/
+   #define NPY_HAVE_SSE2 1
+   #include <emmintrin.h>
+   /** SSE3 **/
+   #define NPY_HAVE_SSE3 1
+   #include <pmmintrin.h>
+
+   /******* dispatch-able features *******/
+   #ifdef NPY__CPU_TARGET_SSSE3
+     /** SSSE3 **/
+     #define NPY_HAVE_SSSE3 1
+     #include <tmmintrin.h>
+   #endif
+   #ifdef NPY__CPU_TARGET_SSE41
+     /** SSE41 **/
+     #define NPY_HAVE_SSE41 1
+     #include <smmintrin.h>
+   #endif
+
+**Baseline features** are the minimal set of required optimizations configured
+via ``--cpu-baseline``. They have no preprocessor guards and they're
+always on, which means they can be used in any source.
+
+Does this mean NumPy's infrastructure passes the compiler flags of the
+baseline features to all sources?
+
+Yes. But the :ref:`dispatch-able sources <dispatchable-sources>` are
+treated differently.
+
+What if the user specifies certain **baseline features** during the
+build, but at runtime the machine doesn't support even these
+features? Could the compiled code then be reached through one of these
+definitions, or through code that the compiler itself auto-vectorized
+based on the provided command line compiler flags?
+
+During the loading of the NumPy module, there's a validation step
+which detects this situation. It raises a Python runtime error to inform the
+user. This prevents the CPU from hitting an illegal instruction, which would
+otherwise crash the process.
+
+**Dispatch-able features** are our dispatched set of additional optimizations
+that were configured via ``--cpu-dispatch``. They are not activated by
+default and are always guarded by other C definitions prefixed with
+``NPY__CPU_TARGET_``. The ``NPY__CPU_TARGET_`` definitions are only
+enabled within **dispatch-able sources**.
+
+.. _dispatchable-sources:
+
+5- Dispatch-able sources and configuration statements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Dispatch-able sources are special **C** files that can be compiled multiple
+times with different compiler flags and with different **C**
+definitions. These affect code paths to enable certain
+instruction-sets for each compiled object according to "**the
+configuration statements**", which must be declared inside a **C**
+comment\ ``(/**/)`` that starts with the special mark **@targets** at the
+top of each dispatch-able source. Dispatch-able
+sources are treated as normal **C** sources if optimization was
+disabled by the command argument ``--disable-optimization``.
+
+**What are configuration statements?**
+
+Configuration statements are keywords combined together to
+determine the required optimizations for the dispatch-able source.
+
+Example:
+
+.. code:: c
+
+   /*@targets avx2 avx512f vsx2 vsx3 asimd asimdhp */
+   // C code
+
+The keywords mainly represent the additional optimizations configured
+through ``--cpu-dispatch``, but they can also represent other options such as:
+
+- Target groups: pre-configured configuration statements used for
+  managing the required optimizations from outside the dispatch-able source.
+
+- Policies: collections of options used for changing the default
+  behaviors or forcing the compilers to perform certain things.
+
+- "baseline": a unique keyword that represents the minimal optimizations
+  configured through ``--cpu-baseline``.
+
+**NumPy's infrastructure handles dispatch-able sources in four steps**:
+
+- **(A) Recognition**: Just like source templates and F2PY,
+  dispatch-able sources require a special extension: ``*.dispatch.c``
+  marks C dispatch-able source files, and ``*.dispatch.cpp`` or
+  ``*.dispatch.cxx`` marks C++ ones.
+  **NOTE**: C++ is not supported yet.
+
+- **(B) Parsing and validating**: In this step, the configuration
+  statements of each dispatch-able source that passed the previous step
+  are parsed and validated, one source at a time, in order to determine
+  the required optimizations.
+
+- **(C) Wrapping**: This is the approach taken by NumPy's
+  infrastructure, which has proved to be sufficiently flexible
+  to compile a single source multiple times with different **C**
+  definitions and flags that affect the code paths. It works
+  by creating a temporary **C** source for each required
+  additional optimization. The temporary source
+  contains the declarations of the **C** definitions and includes the
+  involved source via the **C** directive **#include**. For example,
+  take a look at the following code for AVX512F:
+
+  .. code:: c
+
+      /*
+       * this definition is used by NumPy utilities as suffixes for the
+       * exported symbols
+       */
+      #define NPY__CPU_TARGET_CURRENT AVX512F
+      /*
+       * The following definitions enable
+       * definitions of the dispatch-able features that are defined within the main
+       * configuration header. These are definitions for the implied features.
+       */
+      #define NPY__CPU_TARGET_SSE
+      #define NPY__CPU_TARGET_SSE2
+      #define NPY__CPU_TARGET_SSE3
+      #define NPY__CPU_TARGET_SSSE3
+      #define NPY__CPU_TARGET_SSE41
+      #define NPY__CPU_TARGET_POPCNT
+      #define NPY__CPU_TARGET_SSE42
+      #define NPY__CPU_TARGET_AVX
+      #define NPY__CPU_TARGET_F16C
+      #define NPY__CPU_TARGET_FMA3
+      #define NPY__CPU_TARGET_AVX2
+      #define NPY__CPU_TARGET_AVX512F
+      // our dispatch-able source
+      #include "/the/absolute/path/of/hello.dispatch.c"
+
+- **(D) Dispatch-able configuration header**: The infrastructure
+  generates a config header for each dispatch-able source. This header
+  mainly contains two abstract **C** macros that identify the
+  generated objects, so that any **C** source can use them to dispatch
+  certain symbols from the generated objects at runtime. It is
+  also used for forward declarations.
+
+  The generated header takes the name of the dispatch-able source, with
+  the extension replaced by ``.h``. For example,
+  assume we have a dispatch-able source called ``hello.dispatch.c`` that
+  contains the following:
+
+  .. code:: c
+
+      // hello.dispatch.c
+      /*@targets baseline sse41 avx512f */
+      #include <stdio.h>
+      #include "numpy/utils.h" // NPY_CAT, NPY_TOSTR
+
+      #ifndef NPY__CPU_TARGET_CURRENT
+        // Wrapping the dispatch-able source only happens for the additional optimizations,
+        // but if the keyword 'baseline' is provided within the configuration statements,
+        // the infrastructure adds an extra compilation of the dispatch-able source by
+        // passing it as-is to the compiler without any changes.
+        #define CURRENT_TARGET(X) X
+        #define NPY__CPU_TARGET_CURRENT baseline // for printing only
+      #else
+        // Reaching this point means we're dealing with one of
+        // the additional optimizations listed in the configuration statements.
+        #define CURRENT_TARGET(X) NPY_CAT(NPY_CAT(X, _), NPY__CPU_TARGET_CURRENT)
+      #endif
+      // The macro 'CURRENT_TARGET' adds the current target as a suffix to the exported symbols
+      // to avoid duplicate symbols at link time. NumPy already has a similar macro called
+      // 'NPY_CPU_DISPATCH_CURFX', located at
+      // numpy/numpy/core/src/common/npy_cpu_dispatch.h
+      // NOTE: we tend not to add suffixes to the baseline exported symbols
+      void CURRENT_TARGET(simd_whoami)(const char *extra_info)
+      {
+          printf("I'm " NPY_TOSTR(NPY__CPU_TARGET_CURRENT) ", %s\n", extra_info);
+      }
+
+  Now assume you attached **hello.dispatch.c** to the source tree. The
+  infrastructure should then generate a temporary config header called
+  **hello.dispatch.h** that can be reached by any source in the source
+  tree, and it should contain the following code:
+
+  .. code:: c
+
+      #ifndef NPY__CPU_DISPATCH_EXPAND_
+        // To expand the macro calls in this header
+        #define NPY__CPU_DISPATCH_EXPAND_(X) X
+      #endif
+      // The following macros are undefined first because config headers may be included
+      // multiple times within the same source, and each config header represents
+      // different required optimizations according to the configuration
+      // statements of the dispatch-able source it was derived from.
+      #undef NPY__CPU_DISPATCH_BASELINE_CALL
+      #undef NPY__CPU_DISPATCH_CALL
+      // Nothing strange here, just a normal preprocessor callback,
+      // enabled only if 'baseline' is specified within the configuration statements.
+      #define NPY__CPU_DISPATCH_BASELINE_CALL(CB, ...) \
+        NPY__CPU_DISPATCH_EXPAND_(CB(__VA_ARGS__))
+      // 'NPY__CPU_DISPATCH_CALL' is an abstract macro used for dispatching
+      // the required optimizations specified within the configuration statements.
+      //
+      // @param CHK, a macro that can be used to detect CPU features
+      // at runtime. It takes a CPU feature name without string quotes and
+      // returns the test result as a boolean value.
+      // NumPy already has a macro called "NPY_CPU_HAVE", which fits this requirement.
+      //
+      // @param CB, a callback macro that is expected to be called multiple times depending
+      // on the required optimizations. The callback should receive the following arguments:
+      //  1- The pending calls of @param CHK filled up with the required CPU features,
+      //     which need to be tested at runtime before executing the call that belongs to
+      //     the compiled object.
+      //  2- The required optimization name, same as in 'NPY__CPU_TARGET_CURRENT'
+      //  3- Extra arguments passed to the macro itself
+      //
+      // By default the callback calls are sorted by highest interest first,
+      // unless the policy "$keep_sort" is in place within the configuration statements.
+      // See "Dive into the CPU dispatcher" for more clarification.
+      #define NPY__CPU_DISPATCH_CALL(CHK, CB, ...) \
+        NPY__CPU_DISPATCH_EXPAND_(CB((CHK(AVX512F)), AVX512F, __VA_ARGS__)) \
+        NPY__CPU_DISPATCH_EXPAND_(CB((CHK(SSE)&&CHK(SSE2)&&CHK(SSE3)&&CHK(SSSE3)&&CHK(SSE41)), SSE41, __VA_ARGS__))
+
+  An example of using the config header in light of the above:
+
+  .. code:: c
+
+      // NOTE: The following macros are defined for demonstration purposes only.
+      // NumPy already has a collection of macros located at
+      // numpy/numpy/core/src/common/npy_cpu_dispatch.h that covers all dispatching
+      // and declaration scenarios.
+
+      #include "numpy/npy_cpu_features.h" // NPY_CPU_HAVE
+      #include "numpy/utils.h" // NPY_CAT, NPY_EXPAND
+
+      // An example for setting a macro that calls all the exported symbols at once
+      // after checking if they're supported by the running machine.
+      #define DISPATCH_CALL_ALL(FN, ARGS) \
+          NPY__CPU_DISPATCH_CALL(NPY_CPU_HAVE, DISPATCH_CALL_ALL_CB, FN, ARGS) \
+          NPY__CPU_DISPATCH_BASELINE_CALL(DISPATCH_CALL_BASELINE_ALL_CB, FN, ARGS)
+      // The preprocessor callbacks.
+      // They use the same suffixes as defined in the dispatch-able source.
+      #define DISPATCH_CALL_ALL_CB(CHECK, TARGET_NAME, FN, ARGS) \
+        if (CHECK) { NPY_CAT(NPY_CAT(FN, _), TARGET_NAME) ARGS; }
+      #define DISPATCH_CALL_BASELINE_ALL_CB(FN, ARGS) \
+        FN NPY_EXPAND(ARGS);
+
+      // An example of a macro that calls only the exported symbol of the highest
+      // interest optimization that is supported by the running machine.
+      #define DISPATCH_CALL_HIGH(FN, ARGS) \
+        if (0) {} \
+          NPY__CPU_DISPATCH_CALL(NPY_CPU_HAVE, DISPATCH_CALL_HIGH_CB, FN, ARGS) \
+          NPY__CPU_DISPATCH_BASELINE_CALL(DISPATCH_CALL_BASELINE_HIGH_CB, FN, ARGS)
+      // The preprocessor callbacks.
+      // They use the same suffixes as defined in the dispatch-able source.
+      #define DISPATCH_CALL_HIGH_CB(CHECK, TARGET_NAME, FN, ARGS) \
+        else if (CHECK) { NPY_CAT(NPY_CAT(FN, _), TARGET_NAME) ARGS; }
+      #define DISPATCH_CALL_BASELINE_HIGH_CB(FN, ARGS) \
+        else { FN NPY_EXPAND(ARGS); }
+
+      // NumPy has a macro called 'NPY_CPU_DISPATCH_DECLARE' that can be used
+      // for forward declarations of any kind of prototype based on
+      // 'NPY__CPU_DISPATCH_CALL' and 'NPY__CPU_DISPATCH_BASELINE_CALL'.
+      // However, in this example, we just handle it manually.
+      void simd_whoami(const char *extra_info);
+      void simd_whoami_AVX512F(const char *extra_info);
+      void simd_whoami_SSE41(const char *extra_info);
+
+      void trigger_me(void)
+      {
+          // Bring in the auto-generated config header,
+          // which contains the config macros 'NPY__CPU_DISPATCH_CALL' and
+          // 'NPY__CPU_DISPATCH_BASELINE_CALL'.
+          // It is highly recommended to include the config header right before executing
+          // the dispatching macros, in case there's another header in scope.
+          #include "hello.dispatch.h"
+          DISPATCH_CALL_ALL(simd_whoami, ("all"))
+          DISPATCH_CALL_HIGH(simd_whoami, ("the highest interest"))
+          // An example of including multiple config headers in the same source
+          // #include "hello2.dispatch.h"
+          // DISPATCH_CALL_HIGH(another_function, ("the highest interest"))
+      }
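The "highest interest" dispatching described in how-it-works.rst can also be pictured without macros. The following is a minimal sketch in C, assuming a hypothetical function-pointer table instead of NumPy's generated ``NPY__CPU_DISPATCH_CALL`` machinery. The ``CPU_*`` flags, ``struct candidate`` and ``select_kernel`` are illustrative names, not NumPy APIs, and the feature mask is passed in by the caller rather than probed from the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runtime feature flags (stand-ins for a real CPU probe). */
enum { CPU_SSE41 = 1u << 0, CPU_AVX512F = 1u << 1 };

typedef const char *(*whoami_fn)(void);

/* One kernel per compiled object, suffixed with its target name. */
static const char *whoami_baseline(void) { return "baseline"; }
static const char *whoami_SSE41(void)    { return "SSE41"; }
static const char *whoami_AVX512F(void)  { return "AVX512F"; }

struct candidate {
    uint32_t  required; /* features that must all be present */
    whoami_fn fn;       /* kernel to call when they are */
};

/* Sorted from highest to lowest interest; baseline requires nothing. */
static const struct candidate candidates[] = {
    { CPU_AVX512F, whoami_AVX512F  },
    { CPU_SSE41,   whoami_SSE41    },
    { 0,           whoami_baseline },
};

/* Returns the first kernel whose required features are all available. */
static whoami_fn select_kernel(uint32_t cpu_features)
{
    size_t n = sizeof candidates / sizeof candidates[0];
    for (size_t i = 0; i < n; ++i) {
        uint32_t need = candidates[i].required;
        if ((cpu_features & need) == need) {
            return candidates[i].fn;
        }
    }
    return whoami_baseline; /* not reached: the last entry always matches */
}
```

Calling ``select_kernel(CPU_SSE41)()`` returns the SSE41 kernel's string in this sketch, and a mask of ``0`` falls through to the baseline kernel. The table order plays the role that the default sort order plays in the generated header.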
diff --git a/doc/source/reference/simd/index.rst b/doc/source/reference/simd/index.rst

new file mode 100644 (file)

index 0000000..230e2dc
--- /dev/null
+++ b/doc/source/reference/simd/index.rst
@@ -0,0 +1,43 @@
+.. _numpysimd:
+.. currentmodule:: numpysimd
+
+***********************
+CPU/SIMD Optimizations
+***********************
+
+NumPy comes with a flexible working mechanism that allows it to harness the SIMD
+features that CPUs own, in order to provide faster and more stable performance
+on all popular platforms. Currently, NumPy supports the X86, IBM/Power, ARMv7 and ARMv8
+architectures.
+
+The optimization process in NumPy is carried out in three layers:
+
+- Code is *written* using the universal intrinsics, a set of types, macros and
+  functions that are mapped to each supported instruction set by using guards that
+  enable their use only when the compiler recognizes them.
+  This allows us to generate multiple kernels for the same functionality,
+  in which each generated kernel represents a set of instructions related to one
+  or more CPU features. The first kernel represents the minimum (baseline)
+  CPU features, and the other kernels represent the additional (dispatched) CPU features.
+
+- At *compile* time, CPU build options are used to define the minimum and
+  additional features to support, based on user choice and compiler support. The
+  appropriate intrinsics are overlaid with the platform / architecture intrinsics,
+  and multiple kernels are compiled.
+
+- At *runtime import*, the CPU is probed for the set of supported CPU
+  features. A mechanism is used to grab the pointer to the most appropriate
+  kernel, and this will be the one called for the function.
+
+.. note::
+
+   The NumPy community had a deep discussion before implementing this work;
+   please check `NEP-38`_ for more clarification.
+
+.. toctree::
+
+    build-options
+    how-it-works
+
+.. _`NEP-38`: https://numpy.org/neps/nep-0038-SIMD-optimizations.html
+
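The first layer described in index.rst, code written once against a guarded abstraction, can be illustrated with a small sketch in C. It assumes a hypothetical four-lane float type ``vec4f`` with ``vec4f_load``, ``vec4f_store`` and ``vec4f_add``. These names are made up for this note and are not NumPy's universal intrinsics. The guard maps them to SSE intrinsics when the compiler defines ``__SSE2__`` and to plain scalar code otherwise.

```c
#include <assert.h>

#if defined(__SSE2__)
  /* Guarded mapping: only used when the compiler enables SSE2. */
  #include <emmintrin.h>
  typedef __m128 vec4f;
  static vec4f vec4f_load(const float *p)     { return _mm_loadu_ps(p); }
  static void  vec4f_store(float *p, vec4f v) { _mm_storeu_ps(p, v); }
  static vec4f vec4f_add(vec4f a, vec4f b)    { return _mm_add_ps(a, b); }
#else
  /* Portable fallback with the same interface. */
  typedef struct { float lane[4]; } vec4f;
  static vec4f vec4f_load(const float *p)
  {
      vec4f v;
      for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
      return v;
  }
  static void vec4f_store(float *p, vec4f v)
  {
      for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
  }
  static vec4f vec4f_add(vec4f a, vec4f b)
  {
      vec4f r;
      for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
      return r;
  }
#endif

/* Kernel written once against the abstraction: out[i] = a[i] + b[i], i in 0..3. */
static void add4(const float *a, const float *b, float *out)
{
    assert(a && b && out);
    vec4f_store(out, vec4f_add(vec4f_load(a), vec4f_load(b)));
}
```

The kernel ``add4`` is identical on both paths. Only the guarded block changes, which is what lets the same source be compiled into several kernels.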
diff --git a/doc/source/reference/simd/log_example.txt b/doc/source/reference/simd/log_example.txt

new file mode 100644 (file)

index 0000000..b0c7324
--- /dev/null
+++ b/doc/source/reference/simd/log_example.txt
@@ -0,0 +1,79 @@
+########### EXT COMPILER OPTIMIZATION ###########
+Platform      :
+  Architecture: x64
+  Compiler    : gcc
+
+CPU baseline  :
+  Requested   : 'min'
+  Enabled     : SSE SSE2 SSE3
+  Flags       : -msse -msse2 -msse3
+  Extra checks: none
+
+CPU dispatch  :
+  Requested   : 'max -xop -fma4'
+  Enabled     : SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD AVX512_KNL AVX512_KNM AVX512_SKX AVX512_CLX AVX512_CNL AVX512_ICL
+  Generated   :
+              :
+  SSE41       : SSE SSE2 SSE3 SSSE3
+  Flags       : -msse -msse2 -msse3 -mssse3 -msse4.1
+  Extra checks: none
+  Detect      : SSE SSE2 SSE3 SSSE3 SSE41
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_arithmetic.dispatch.c
+              : numpy/core/src/umath/_umath_tests.dispatch.c
+              :
+  SSE42       : SSE SSE2 SSE3 SSSE3 SSE41 POPCNT
+  Flags       : -msse -msse2 -msse3 -mssse3 -msse4.1 -mpopcnt -msse4.2
+  Extra checks: none
+  Detect      : SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42
+              : build/src.linux-x86_64-3.9/numpy/core/src/_simd/_simd.dispatch.c
+              :
+  AVX2        : SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C
+  Flags       : -msse -msse2 -msse3 -mssse3 -msse4.1 -mpopcnt -msse4.2 -mavx -mf16c -mavx2
+  Extra checks: none
+  Detect      : AVX F16C AVX2
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_arithm_fp.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_arithmetic.dispatch.c
+              : numpy/core/src/umath/_umath_tests.dispatch.c
+              :
+  (FMA3 AVX2) : SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C
+  Flags       : -msse -msse2 -msse3 -mssse3 -msse4.1 -mpopcnt -msse4.2 -mavx -mf16c -mfma -mavx2
+  Extra checks: none
+  Detect      : AVX F16C FMA3 AVX2
+              : build/src.linux-x86_64-3.9/numpy/core/src/_simd/_simd.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_exponent_log.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_trigonometric.dispatch.c
+              :
+  AVX512F     : SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2
+  Flags       : -msse -msse2 -msse3 -mssse3 -msse4.1 -mpopcnt -msse4.2 -mavx -mf16c -mfma -mavx2 -mavx512f
+  Extra checks: AVX512F_REDUCE
+  Detect      : AVX512F
+              : build/src.linux-x86_64-3.9/numpy/core/src/_simd/_simd.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_arithm_fp.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_arithmetic.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_exponent_log.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_trigonometric.dispatch.c
+              :
+  AVX512_SKX  : SSE SSE2 SSE3 SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD
+  Flags       : -msse -msse2 -msse3 -mssse3 -msse4.1 -mpopcnt -msse4.2 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq
+  Extra checks: AVX512BW_MASK AVX512DQ_MASK
+  Detect      : AVX512_SKX
+              : build/src.linux-x86_64-3.9/numpy/core/src/_simd/_simd.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_arithmetic.dispatch.c
+              : build/src.linux-x86_64-3.9/numpy/core/src/umath/loops_exponent_log.dispatch.c
+CCompilerOpt.cache_flush[804] : write cache to path -> /home/seiko/work/repos/numpy/build/temp.linux-x86_64-3.9/ccompiler_opt_cache_ext.py
+
+########### CLIB COMPILER OPTIMIZATION ###########
+Platform      :
+  Architecture: x64
+  Compiler    : gcc
+
+CPU baseline  :
+  Requested   : 'min'
+  Enabled     : SSE SSE2 SSE3
+  Flags       : -msse -msse2 -msse3
+  Extra checks: none
+
+CPU dispatch  :
+  Requested   : 'max -xop -fma4'
+  Enabled     : SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD AVX512_KNL AVX512_KNM AVX512_SKX AVX512_CLX AVX512_CNL AVX512_ICL
+  Generated   : none
diff --git a/doc/source/reference/simd/simd-optimizations-tables-diff.inc b/doc/source/reference/simd/simd-optimizations-tables-diff.inc

deleted file mode 100644 (file)

index 41fa967..0000000
--- a/doc/source/reference/simd/simd-optimizations-tables-diff.inc
+++ /dev/null
@@ -1,37 +0,0 @@
-.. generated via source/reference/simd/simd-optimizations.py
-
-x86::Intel Compiler - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    =========== ==================================================================================================================
-    Name        Implies                                                                                                           
-    =========== ==================================================================================================================
-    ``FMA3``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` **AVX2**                      
-    ``AVX2``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` **FMA3**                      
-    ``AVX512F`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` **AVX512CD**
-    =========== ==================================================================================================================
-
-.. note::
-  The following features aren't supported by x86::Intel Compiler:
-  **XOP FMA4**
-
-x86::Microsoft Visual C/C++ - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ============ =================================================================================================================================
-    Name         Implies                                                                                                                          
-    ============ =================================================================================================================================
-    ``FMA3``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` **AVX2**                                     
-    ``AVX2``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` **FMA3**                                     
-    ``AVX512F``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` **AVX512CD** **AVX512_SKX**
-    ``AVX512CD`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` **AVX512_SKX** 
-    ============ =================================================================================================================================
-
-.. note::
-  The following features aren't supported by x86::Microsoft Visual C/C++:
-  **AVX512_KNL AVX512_KNM**
-
diff --git a/doc/source/reference/simd/simd-optimizations-tables.inc b/doc/source/reference/simd/simd-optimizations-tables.inc

deleted file mode 100644 (file)

index f038a91..0000000
--- a/doc/source/reference/simd/simd-optimizations-tables.inc
+++ /dev/null
@@ -1,103 +0,0 @@
-.. generated via source/reference/simd/simd-optimizations.py
-
-x86 - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ============ =================================================================================================================
-    Name         Implies                                                                                                          
-    ============ =================================================================================================================
-    ``SSE``      ``SSE2``                                                                                                         
-    ``SSE2``     ``SSE``                                                                                                          
-    ``SSE3``     ``SSE`` ``SSE2``                                                                                                 
-    ``SSSE3``    ``SSE`` ``SSE2`` ``SSE3``                                                                                        
-    ``SSE41``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3``                                                                              
-    ``POPCNT``   ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41``                                                                    
-    ``SSE42``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT``                                                         
-    ``AVX``      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42``                                               
-    ``XOP``      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``                                       
-    ``FMA4``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``                                       
-    ``F16C``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``                                       
-    ``FMA3``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``                              
-    ``AVX2``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``                              
-    ``AVX512F``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2``            
-    ``AVX512CD`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F``
-    ============ =================================================================================================================
-
-x86 - Group names
-~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ============== ===================================================== ===========================================================================================================================================================================
-    Name           Gather                                                Implies                                                                                                                                                                    
-    ============== ===================================================== ===========================================================================================================================================================================
-    ``AVX512_KNL`` ``AVX512ER`` ``AVX512PF``                             ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``                                             
-    ``AVX512_KNM`` ``AVX5124FMAPS`` ``AVX5124VNNIW`` ``AVX512VPOPCNTDQ`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_KNL``                              
-    ``AVX512_SKX`` ``AVX512VL`` ``AVX512BW`` ``AVX512DQ``                ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``                                             
-    ``AVX512_CLX`` ``AVX512VNNI``                                        ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``                              
-    ``AVX512_CNL`` ``AVX512IFMA`` ``AVX512VBMI``                         ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``                              
-    ``AVX512_ICL`` ``AVX512VBMI2`` ``AVX512BITALG`` ``AVX512VPOPCNTDQ``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX`` ``AVX512_CLX`` ``AVX512_CNL``
-    ============== ===================================================== ===========================================================================================================================================================================
-
-IBM/POWER big-endian - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ======== ================
-    Name     Implies         
-    ======== ================
-    ``VSX``                  
-    ``VSX2`` ``VSX``         
-    ``VSX3`` ``VSX`` ``VSX2``
-    ======== ================
-
-IBM/POWER little-endian - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ======== ================
-    Name     Implies         
-    ======== ================
-    ``VSX``  ``VSX2``        
-    ``VSX2`` ``VSX``         
-    ``VSX3`` ``VSX`` ``VSX2``
-    ======== ================
-
-ARMv7/A32 - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ============== ===========================================================
-    Name           Implies                                                    
-    ============== ===========================================================
-    ``NEON``                                                                  
-    ``NEON_FP16``  ``NEON``                                                   
-    ``NEON_VFPV4`` ``NEON`` ``NEON_FP16``                                     
-    ``ASIMD``      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``                      
-    ``ASIMDHP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
-    ``ASIMDDP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
-    ``ASIMDFHM``   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
-    ============== ===========================================================
-
-ARMv8/A64 - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. table::
-    :align: left
-
-    ============== ===========================================================
-    Name           Implies                                                    
-    ============== ===========================================================
-    ``NEON``       ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``                     
-    ``NEON_FP16``  ``NEON`` ``NEON_VFPV4`` ``ASIMD``                          
-    ``NEON_VFPV4`` ``NEON`` ``NEON_FP16`` ``ASIMD``                           
-    ``ASIMD``      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``                      
-    ``ASIMDHP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
-    ``ASIMDDP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``            
-    ``ASIMDFHM``   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
-    ============== ===========================================================
-
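Reviewer note: each "Implies" column in the tables removed above reads as the transitive closure of a feature's direct implications. The sketch below illustrates that idea only. `DIRECT_IMPLIES`, `ORDER`, and `implied` are hypothetical names, and the map is hand-copied from a few of the x86 rows; it is not the data structure that `CCompilerOpt` actually uses.

```python
# Hypothetical direct-implication map, hand-copied from a few x86 rows above.
# This is an illustration only, not NumPy's internal feature configuration.
DIRECT_IMPLIES = {
    "SSSE3": ["SSE", "SSE2", "SSE3"],
    "SSE41": ["SSSE3"],
    "POPCNT": ["SSE41"],
    "SSE42": ["POPCNT"],
    "AVX": ["SSE42"],
}

# Assumed display order (lowest to highest interest), also taken from the rows.
ORDER = ["SSE", "SSE2", "SSE3", "SSSE3", "SSE41", "POPCNT", "SSE42", "AVX"]


def implied(feature, direct=DIRECT_IMPLIES):
    """Return every feature reachable from `feature`, excluding itself."""
    seen = set()
    stack = list(direct.get(feature, []))
    while stack:
        name = stack.pop()
        # Skip repeats and the feature itself, so that mutually implied
        # features (such as VSX and VSX2 on ppc64le) cannot loop forever.
        if name in seen or name == feature:
            continue
        seen.add(name)
        stack.extend(direct.get(name, []))
    return sorted(seen, key=ORDER.index)


if __name__ == "__main__":
    # Render one table cell in the same ``literal`` style as the tables.
    print(" ".join(f"``{name}``" for name in implied("AVX")))
```

With this toy map, the closure of `AVX` reproduces the `AVX` row of the removed x86 table.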
diff --git a/doc/source/reference/simd/simd-optimizations.py b/doc/source/reference/simd/simd-optimizations.py

deleted file mode 100644 (file)

index a78302d..0000000
--- a/doc/source/reference/simd/simd-optimizations.py
+++ /dev/null
@@ -1,190 +0,0 @@
-"""
-Generate CPU features tables from CCompilerOpt
-"""
-from os import sys, path
-gen_path = path.dirname(path.realpath(__file__))
-#sys.path.append(path.abspath(path.join(gen_path, *([".."]*4), "numpy", "distutils")))
-#from ccompiler_opt import CCompilerOpt
-from numpy.distutils.ccompiler_opt import CCompilerOpt
-
-class FakeCCompilerOpt(CCompilerOpt):
-    fake_info = ("arch", "compiler", "extra_args")
-    # disable caching no need for it
-    conf_nocache = True
-    def __init__(self, *args, **kwargs):
-        no_cc = None
-        CCompilerOpt.__init__(self, no_cc, **kwargs)
-    def dist_compile(self, sources, flags, **kwargs):
-        return sources
-    def dist_info(self):
-        return FakeCCompilerOpt.fake_info
-    @staticmethod
-    def dist_log(*args, stderr=False):
-        # avoid printing
-        pass
-    def feature_test(self, name, force_flags=None):
-        # To speed up
-        return True
-
-    def gen_features_table(self, features, ignore_groups=True,
-                           field_names=["Name", "Implies"],
-                           fstyle=None, fstyle_implies=None, **kwargs):
-        rows = []
-        if fstyle is None:
-            fstyle = lambda ft: f'``{ft}``'
-        if fstyle_implies is None:
-            fstyle_implies = lambda origin, ft: fstyle(ft)
-        for f in self.feature_sorted(features):
-            is_group = "group" in self.feature_supported.get(f, {})
-            if ignore_groups and is_group:
-                continue
-            implies = self.feature_sorted(self.feature_implies(f))
-            implies = ' '.join([fstyle_implies(f, i) for i in implies])
-            rows.append([fstyle(f), implies])
-        if rows:
-           return self.gen_rst_table(field_names, rows, **kwargs)
-
-    def gen_gfeatures_table(self, features,
-                            field_names=["Name", "Gather", "Implies"],
-                            fstyle=None, fstyle_implies=None, **kwargs):
-        rows = []
-        if fstyle is None:
-            fstyle = lambda ft: f'``{ft}``'
-        if fstyle_implies is None:
-            fstyle_implies = lambda origin, ft: fstyle(ft)
-        for f in self.feature_sorted(features):
-            gather = self.feature_supported.get(f, {}).get("group", None)
-            if not gather:
-                continue
-            implies = self.feature_sorted(self.feature_implies(f))
-            implies = ' '.join([fstyle_implies(f, i) for i in implies])
-            gather = ' '.join([fstyle_implies(f, i) for i in gather])
-            rows.append([fstyle(f), gather, implies])
-        if rows:
-            return self.gen_rst_table(field_names, rows, **kwargs)
-
-    def gen_rst_table(self, field_names, rows, tab_size=4):
-        assert(not rows or len(field_names) == len(rows[0]))
-        rows.append(field_names)
-        fld_len = len(field_names)
-        cls_len = [max(len(c[i]) for c in rows) for i in range(fld_len)]
-        del rows[-1]
-        cformat = ' '.join('{:<%d}' % i for i in cls_len)
-        border  = cformat.format(*['='*i for i in cls_len])
-
-        rows = [cformat.format(*row) for row in rows]
-        # header
-        rows = [border, cformat.format(*field_names), border] + rows
-        # footer
-        rows += [border]
-        # add left margin
-        rows = [(' ' * tab_size) + r for r in rows]
-        return '\n'.join(rows)
-
-def features_table_sections(name, ftable=None, gtable=None, tab_size=4):
-    tab = ' '*tab_size
-    content = ''
-    if ftable:
-        title = f"{name} - CPU feature names"
-        content = (
-            f"{title}\n{'~'*len(title)}"
-            f"\n.. table::\n{tab}:align: left\n\n"
-            f"{ftable}\n\n"
-        )
-    if gtable:
-        title = f"{name} - Group names"
-        content += (
-            f"{title}\n{'~'*len(title)}"
-            f"\n.. table::\n{tab}:align: left\n\n"
-            f"{gtable}\n\n"
-        )
-    return content
-
-def features_table(arch, cc="gcc", pretty_name=None, **kwargs):
-    FakeCCompilerOpt.fake_info = (arch, cc, '')
-    ccopt = FakeCCompilerOpt(cpu_baseline="max")
-    features = ccopt.cpu_baseline_names()
-    ftable = ccopt.gen_features_table(features, **kwargs)
-    gtable = ccopt.gen_gfeatures_table(features, **kwargs)
-
-    if not pretty_name:
-        pretty_name = arch + '/' + cc
-    return features_table_sections(pretty_name, ftable, gtable, **kwargs)
-
-def features_table_diff(arch, cc, cc_vs="gcc", pretty_name=None, **kwargs):
-    FakeCCompilerOpt.fake_info = (arch, cc, '')
-    ccopt = FakeCCompilerOpt(cpu_baseline="max")
-    fnames = ccopt.cpu_baseline_names()
-    features = {f:ccopt.feature_implies(f) for f in fnames}
-
-    FakeCCompilerOpt.fake_info = (arch, cc_vs, '')
-    ccopt_vs = FakeCCompilerOpt(cpu_baseline="max")
-    fnames_vs = ccopt_vs.cpu_baseline_names()
-    features_vs = {f:ccopt_vs.feature_implies(f) for f in fnames_vs}
-
-    common  = set(fnames).intersection(fnames_vs)
-    extra_avl = set(fnames).difference(fnames_vs)
-    not_avl = set(fnames_vs).difference(fnames)
-    diff_impl_f = {f:features[f].difference(features_vs[f]) for f in common}
-    diff_impl = {k for k, v in diff_impl_f.items() if v}
-
-    fbold = lambda ft: f'**{ft}**' if ft in extra_avl else f'``{ft}``'
-    fbold_implies = lambda origin, ft: (
-        f'**{ft}**' if ft in diff_impl_f.get(origin, {}) else f'``{ft}``'
-    )
-    diff_all = diff_impl.union(extra_avl)
-    ftable = ccopt.gen_features_table(
-        diff_all, fstyle=fbold, fstyle_implies=fbold_implies, **kwargs
-    )
-    gtable = ccopt.gen_gfeatures_table(
-        diff_all, fstyle=fbold, fstyle_implies=fbold_implies, **kwargs
-    )
-    if not pretty_name:
-        pretty_name = arch + '/' + cc
-    content = features_table_sections(pretty_name, ftable, gtable, **kwargs)
-
-    if not_avl:
-        not_avl = ccopt_vs.feature_sorted(not_avl)
-        not_avl = ' '.join(not_avl)
-        content += (
-            ".. note::\n"
-            f"  The following features aren't supported by {pretty_name}:\n"
-            f"  **{not_avl}**\n\n"
-        )
-    return content
-
-if __name__ == '__main__':
-    pretty_names = {
-        "PPC64": "IBM/POWER big-endian",
-        "PPC64LE": "IBM/POWER little-endian",
-        "ARMHF": "ARMv7/A32",
-        "AARCH64": "ARMv8/A64",
-        "ICC": "Intel Compiler",
-        # "ICCW": "Intel Compiler msvc-like",
-        "MSVC": "Microsoft Visual C/C++"
-    }
-    with open(path.join(gen_path, 'simd-optimizations-tables.inc'), 'wt') as fd:
-        fd.write(f'.. generated via {__file__}\n\n')
-        for arch in (
-            ("x86", "PPC64", "PPC64LE", "ARMHF", "AARCH64")
-        ):
-            pretty_name = pretty_names.get(arch, arch)
-            table = features_table(arch=arch, pretty_name=pretty_name)
-            assert(table)
-            fd.write(table)
-
-    with open(path.join(gen_path, 'simd-optimizations-tables-diff.inc'), 'wt') as fd:
-        fd.write(f'.. generated via {__file__}\n\n')
-        for arch, cc_names in (
-            ("x86", ("clang", "ICC", "MSVC")),
-            ("PPC64", ("clang",)),
-            ("PPC64LE", ("clang",)),
-            ("ARMHF", ("clang",)),
-            ("AARCH64", ("clang",))
-        ):
-            arch_pname = pretty_names.get(arch, arch)
-            for cc in cc_names:
-                pretty_name = f"{arch_pname}::{pretty_names.get(cc, cc)}"
-                table = features_table_diff(arch=arch, cc=cc, pretty_name=pretty_name)
-                if table:
-                    fd.write(table)
diff --git a/doc/source/reference/simd/simd-optimizations.rst b/doc/source/reference/simd/simd-optimizations.rst

index 9de6d1734079a3520ef1d6654b2ecacf81166518..a181082661e8d47940fde1718ef8e01c972fa495 100644 (file)
--- a/doc/source/reference/simd/simd-optimizations.rst
+++ b/doc/source/reference/simd/simd-optimizations.rst
@@ -1,527 +1,12 @@
-******************
-SIMD Optimizations
-******************
+:orphan:
  
-NumPy provides a set of macros that define `Universal Intrinsics`_ to
-abstract out typical platform-specific intrinsics so SIMD code needs to be
-written only once. There are three layers:
+.. raw:: html
  
-- Code is *written* using the universal intrinsic macros, with guards that
-  will enable use of the macros only when the compiler recognizes them.
-  In NumPy, these are used to construct multiple ufunc loops. Current policy is
-  to create three loops: One loop is the default and uses no intrinsics. One
-  uses the minimum intrinsics required on the architecture. And the third is
-  written using the maximum set of intrinsics possible.
-- At *compile* time, a distutils command is used to define the minimum and
-  maximum features to support, based on user choice and compiler support. The
-  appropriate macros are overlaid with the platform / architecture intrinsics,
-  and the three loops are compiled.
-- At *runtime import*, the CPU is probed for the set of supported intrinsic
-  features. A mechanism is used to grab the pointer to the most appropriate
-  function, and this will be the one called for the function.
+    <html>
+        <head>
+            <meta http-equiv="refresh" content="0; url=index.html"/>
+        </head>
+    </html>
  
-
-Build options for compilation
-=============================
-
-- ``--cpu-baseline``: minimal set of required optimizations. Default
-  value is ``min`` which provides the minimum CPU features that can
-  safely run on a wide range of platforms within the processor family.
-
-- ``--cpu-dispatch``: dispatched set of additional optimizations.
-  The default value is ``max -xop -fma4`` which enables all CPU
-  features, except for AMD legacy features (in the case of x86).
-
-The command arguments are available in ``build``, ``build_clib``, and
-``build_ext``.
-If ``build_clib`` or ``build_ext`` are not specified by the user, the arguments of
-``build`` will be used instead, which also holds the default values.
-
-Optimization names can be CPU features, groups that gather several
-features, or :ref:`special options <special-options>` that perform a series of procedures.
-
-
-The following tables show the currently supported optimizations, sorted from the lowest to the highest interest.
-
-.. include:: simd-optimizations-tables.inc
-
-----
-
-.. _tables-diff:
-
-While the above tables are based on the GCC compiler, the following tables show the differences in
-other compilers:
-
-.. include:: simd-optimizations-tables-diff.inc
-
-.. _special-options:
-
-Special options
-~~~~~~~~~~~~~~~
-
-- ``NONE``: enable no features
-
-- ``NATIVE``: Enables all CPU features that are supported by the current
-   machine. This operation is based on the compiler flags (``-march=native``, ``-xHost``, ``/QxHost``).
-
-- ``MIN``: Enables the minimum CPU features that can safely run on a wide range of platforms:
-
-  .. table::
-      :align: left
-
-      ======================================  =======================================
-       For Arch                               Returns
-      ======================================  =======================================
-       ``x86``                                ``SSE`` ``SSE2``
-       ``x86`` ``64-bit mode``                ``SSE`` ``SSE2`` ``SSE3``
-       ``IBM/POWER`` ``big-endian mode``      ``NONE``
-       ``IBM/POWER`` ``little-endian mode``   ``VSX`` ``VSX2``
-       ``ARMHF``                              ``NONE``
-       ``ARM64`` ``AARCH64``                  ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``
-                                              ``ASIMD``
-      ======================================  =======================================
-
-- ``MAX``: Enables all CPU features supported by the compiler and platform.
-
-- ``Operators-/+``: remove or add features, useful with options ``MAX``, ``MIN`` and ``NATIVE``.
-
-NOTES
-~~~~~~~~~~~~~
-- CPU features and other options are case-insensitive.
-
-- The order of the requested optimizations doesn't matter.
-
-- Either commas or spaces can be used as a separator, e.g. ``--cpu-dispatch``\ =
-  "avx2 avx512f" or ``--cpu-dispatch``\ = "avx2, avx512f" both work, but the
-  arguments must be enclosed in quotes.
-
-- The ``+`` operator is optional and is accepted only for readability. For example,
-  ``--cpu-baseline="min avx2"`` is equivalent to ``--cpu-baseline="min + avx2"``, and
-  ``--cpu-baseline="min,avx2"`` is equivalent to ``--cpu-baseline="min,+avx2"``.
-
-- If the CPU feature is not supported by the user platform or
-  compiler, it will be skipped rather than raising a fatal error.
-
-- Any CPU feature passed to ``--cpu-dispatch`` will be skipped if
-  it's already part of the CPU baseline features.
-
-- The ``--cpu-baseline`` argument force-enables implied features,
-  e.g. ``--cpu-baseline``\ ="sse42" is equivalent to
-  ``--cpu-baseline``\ ="sse sse2 sse3 ssse3 sse41 popcnt sse42"
-
-- The value of ``--cpu-baseline`` will be treated as "native" if the
-  compiler's native flag (``-march=native``, ``-xHost``, or ``/QxHost``) is
-  enabled through the environment variable ``CFLAGS``.
-
-- The validation process for the requested optimizations when it comes to
-  ``--cpu-baseline`` isn't strict. For example, if the user requests
-  ``AVX2`` but the compiler doesn't support it, then it is skipped and the build
-  falls back to the maximum optimization that the compiler can handle among the
-  implied features of ``AVX2``, for instance ``AVX``.
-
-- The user should always check the final report through the build log
-  to verify the enabled features.
-
-Special cases
-~~~~~~~~~~~~~
-
-**Interrelated CPU features**: Some exceptional conditions force us to link some features together when it comes to certain compilers or architectures, making it impossible to build them separately.
-These conditions can be divided into two parts, as follows:
-
-- **Architectural compatibility**: The need to align certain CPU features that are assured
-  to be supported by successive generations of the same architecture, for example:
-
-  - On ppc64le, `VSX(ISA 2.06)` and `VSX2(ISA 2.07)` imply one another, since the
-    first generation that supports little-endian mode is Power-8 `(ISA 2.07)`.
-  - On AArch64, `NEON`, `FP16`, `VFPV4`, and `ASIMD` imply each other, since they are part of the
-    hardware baseline.
-
-- **Compilation compatibility**: Not all **C/C++** compilers provide independent support for all CPU
-  features. For example, **Intel**'s compiler doesn't provide separate flags for `AVX2` and `FMA3`.
-  This makes sense, since all Intel CPUs that come with `AVX2` also support `FMA3` and vice versa,
-  but this approach is incompatible with other **x86** CPUs from **AMD** or **VIA**.
-  Therefore, there are differences in the depiction of CPU features between the C/C++ compilers,
-  as shown in the :ref:`tables above <tables-diff>`.
-
-
-Behaviors and Errors
-~~~~~~~~~~~~~~~~~~~~
-
-
-
-Usage and Examples
-~~~~~~~~~~~~~~~~~~
-
-Report and Trace
-~~~~~~~~~~~~~~~~
-
-Understanding CPU Dispatching, How the NumPy dispatcher works?
-==============================================================
-
-NumPy dispatcher is based on multi-source compiling, which means taking
-a certain source and compiling it multiple times with different compiler
-flags and also with different **C** definitions that affect the code
-paths to enable certain instruction-sets for each compiled object
-depending on the required optimizations, then combining the returned
-objects together.
-
-.. figure:: ../figures/opt-infra.png
-
-This mechanism should support all compilers and it doesn't require any
-compiler-specific extension, but at the same time it adds a few steps to
-normal compilation that are explained as follows:
-
-1- Configuration
-~~~~~~~~~~~~~~~~
-
-The user configures the required optimizations before starting to build the
-source files via the two command arguments explained above:
-
--  ``--cpu-baseline``: minimal set of required optimizations.
-
--  ``--cpu-dispatch``: dispatched set of additional optimizations.
-
-
-2- Discovering the environment
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-In this part, we check the compiler and platform architecture
-and cache some of the intermediary results to speed up rebuilding.
-
-3- Validating the requested optimizations
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The requested optimizations are tested against the compiler to see what the
-compiler can actually support.
-
-4- Generating the main configuration header
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The generated header ``_cpu_dispatch.h`` contains all the definitions and
-headers of instruction-sets for the required optimizations that have been
-validated during the previous step.
-
-It also contains extra C definitions that are used for defining NumPy's
-Python-level module attributes ``__cpu_baseline__`` and ``__cpu_dispatch__``.
-
-**What is in this header?**
-
-The example header was dynamically generated by gcc on an X86 machine.
-The compiler supports ``--cpu-baseline="sse sse2 sse3"`` and
-``--cpu-dispatch="ssse3 sse41"``, and the result is below.
-
-.. code:: c
-
-   // The header should be located at numpy/numpy/core/src/common/_cpu_dispatch.h
-   /**NOTE
-    ** C definitions prefixed with "NPY_HAVE_" represent
-    ** the required optimizations.
-    **
-    ** C definitions prefixed with 'NPY__CPU_TARGET_' are protected and
-    ** shouldn't be used by any NumPy C sources.
-    */
-   /******* baseline features *******/
-   /** SSE **/
-   #define NPY_HAVE_SSE 1
-   #include <xmmintrin.h>
-   /** SSE2 **/
-   #define NPY_HAVE_SSE2 1
-   #include <emmintrin.h>
-   /** SSE3 **/
-   #define NPY_HAVE_SSE3 1
-   #include <pmmintrin.h>
-
-   /******* dispatch-able features *******/
-   #ifdef NPY__CPU_TARGET_SSSE3
-     /** SSSE3 **/
-     #define NPY_HAVE_SSSE3 1
-     #include <tmmintrin.h>
-   #endif
-   #ifdef NPY__CPU_TARGET_SSE41
-     /** SSE41 **/
-     #define NPY_HAVE_SSE41 1
-     #include <smmintrin.h>
-   #endif
-
-**Baseline features** are the minimal set of required optimizations configured
-via ``--cpu-baseline``. They have no preprocessor guards and they're
-always on, which means they can be used in any source.
-
-Does this mean NumPy's infrastructure passes the compiler's flags of
-baseline features to all sources?
-
-Definitely, yes. But the :ref:`dispatch-able sources <dispatchable-sources>` are
-treated differently.
-
-What if the user specifies certain **baseline features** during the
-build but at runtime the machine doesn't support even these
-features? Will the compiled code be called via one of these definitions, or
-maybe the compiler itself auto-generated/vectorized certain pieces of code
-based on the provided command line compiler flags?
-
-During the loading of the NumPy module, there's a validation step
-which detects this behavior. It will raise a Python runtime error to inform the
-user. This is to prevent the CPU reaching an illegal instruction error causing
-a segfault.
-
-**Dispatch-able features** are our dispatched set of additional optimizations
-that were configured via ``--cpu-dispatch``. They are not activated by
-default and are always guarded by other C definitions prefixed with
-``NPY__CPU_TARGET_``. C definitions ``NPY__CPU_TARGET_`` are only
-enabled within **dispatch-able sources**.
-
-.. _dispatchable-sources:
-
-5- Dispatch-able sources and configuration statements
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Dispatch-able sources are special **C** files that can be compiled multiple
-times with different compiler flags and also with different **C**
-definitions. These affect code paths to enable certain
-instruction-sets for each compiled object according to "**the
-configuration statements**" that must be declared between a **C**
-comment\ ``(/**/)`` and start with a special mark **@targets** at the
-top of each dispatch-able source. At the same time, dispatch-able
-sources will be treated as normal **C** sources if the optimization was
-disabled by the command argument ``--disable-optimization``.
-
-**What are configuration statements?**
-
-Configuration statements are a set of keywords combined together to
-determine the required optimizations for the dispatch-able source.
-
-Example:
-
-.. code:: c
-
-   /*@targets avx2 avx512f vsx2 vsx3 asimd asimdhp */
-   // C code
-
-The keywords mainly represent the additional optimizations configured
-through ``--cpu-dispatch``, but they can also represent other options such as:
-
-- Target groups: pre-configured configuration statements used for
-  managing the required optimizations from outside the dispatch-able source.
-
-- Policies: collections of options used for changing the default
-  behaviors or forcing the compilers to perform certain things.
-
-- "baseline": a unique keyword that represents the minimal optimizations
-  configured through ``--cpu-baseline``.
-
-**NumPy's infrastructure handles dispatch-able sources in four steps**:
-
-- **(A) Recognition**: Just like source templates and F2PY, the
-  dispatch-able sources require a special extension, ``*.dispatch.c``,
-  to mark C dispatch-able source files, and for C++
-  ``*.dispatch.cpp`` or ``*.dispatch.cxx``.
-  **NOTE**: C++ is not supported yet.
-
-- **(B) Parsing and validating**: In this step, the
-  dispatch-able sources that were filtered by the previous step
-  are parsed one by one, and the configuration statements of each
-  are validated in order to determine the required optimizations.
-
-- **(C) Wrapping**: This is the approach taken by NumPy's
-  infrastructure, which has proved to be sufficiently flexible in order
-  to compile a single source multiple times with different **C**
-  definitions and flags that affect the code paths. The process is
-  achieved by creating a temporary **C** source for each required
-  optimization that is related to the additional optimizations, which
-  contains the declarations of the **C** definitions and includes the
-  involved source via the **C** directive **#include**. For
-  clarification, take a look at the following code for AVX512F:
-
-  .. code:: c
-
-      /*
-       * this definition is used by NumPy utilities as suffixes for the
-       * exported symbols
-       */
-      #define NPY__CPU_TARGET_CURRENT AVX512F
-      /*
-       * The following definitions enable
-       * definitions of the dispatch-able features that are defined within the main
-       * configuration header. These are definitions for the implied features.
-       */
-      #define NPY__CPU_TARGET_SSE
-      #define NPY__CPU_TARGET_SSE2
-      #define NPY__CPU_TARGET_SSE3
-      #define NPY__CPU_TARGET_SSSE3
-      #define NPY__CPU_TARGET_SSE41
-      #define NPY__CPU_TARGET_POPCNT
-      #define NPY__CPU_TARGET_SSE42
-      #define NPY__CPU_TARGET_AVX
-      #define NPY__CPU_TARGET_F16C
-      #define NPY__CPU_TARGET_FMA3
-      #define NPY__CPU_TARGET_AVX2
-      #define NPY__CPU_TARGET_AVX512F
-      // our dispatch-able source
-      #include "/the/absolute/path/of/hello.dispatch.c"
-
-- **(D) Dispatch-able configuration header**: The infrastructure
-  generates a config header for each dispatch-able source. This header
-  mainly contains two abstract **C** macros used for identifying the
-  generated objects, so they can be used by any **C** source to dispatch
-  certain symbols from the generated objects at runtime. It is
-  also used for forward declarations.
-
-  The generated header takes the name of the dispatch-able source
-  with its extension replaced by '**.h**'. For example,
-  assume we have a dispatch-able source called **hello.dispatch.c** that
-  contains the following:
-
-  .. code:: c
-
-      // hello.dispatch.c
-      /*@targets baseline sse42 avx512f */
-      #include <stdio.h>
-      #include "numpy/utils.h" // NPY_CAT, NPY_TOSTR
-
-      #ifndef NPY__CPU_TARGET_CURRENT
-        // Wrapping the dispatch-able source only happens for the additional optimizations,
-        // but if the keyword 'baseline' is provided within the configuration statements,
-        // the infrastructure adds an extra compilation of the dispatch-able source by
-        // passing it as-is to the compiler without any changes.
-        #define CURRENT_TARGET(X) X
-        #define NPY__CPU_TARGET_CURRENT baseline // for printing only
-      #else
-        // Reaching this point means we're dealing with
-        // the additional optimizations, so it could be SSE42 or AVX512F
-        #define CURRENT_TARGET(X) NPY_CAT(NPY_CAT(X, _), NPY__CPU_TARGET_CURRENT)
-      #endif
-      // Macro 'CURRENT_TARGET' adds the current target as a suffix to the exported symbols
-      // to avoid duplicate symbols at link time. NumPy already has a similar macro called
-      // 'NPY_CPU_DISPATCH_CURFX', located at
-      // numpy/numpy/core/src/common/npy_cpu_dispatch.h
-      // NOTE: we tend not to add suffixes to the baseline exported symbols
-      void CURRENT_TARGET(simd_whoami)(const char *extra_info)
-      {
-          printf("I'm " NPY_TOSTR(NPY__CPU_TARGET_CURRENT) ", %s\n", extra_info);
-      }
-
-  Now assume you attached **hello.dispatch.c** to the source tree, then
-  the infrastructure should generate a temporary config header called
-  **hello.dispatch.h** that can be reached by any source in the source
-  tree, and it should contain the following code:
-
-  .. code:: c
-
-      #ifndef NPY__CPU_DISPATCH_EXPAND_
-        // To expand the macro calls in this header
-          #define NPY__CPU_DISPATCH_EXPAND_(X) X
-      #endif
-      // Undefining the following macros, due to the possibility of including config headers
-      // multiple times within the same source and since each config header represents
-      // different required optimizations according to the specified configuration
-      // statements in the dispatch-able source that derived from it.
-      #undef NPY__CPU_DISPATCH_BASELINE_CALL
-      #undef NPY__CPU_DISPATCH_CALL
-      // Nothing strange here, just a normal preprocessor callback,
-      // enabled only if 'baseline' is specified within the configuration statements.
-      #define NPY__CPU_DISPATCH_BASELINE_CALL(CB, ...) \
-        NPY__CPU_DISPATCH_EXPAND_(CB(__VA_ARGS__))
-      // 'NPY__CPU_DISPATCH_CALL' is an abstract macro used for dispatching
-      // the required optimizations that are specified within the configuration statements.
-      //
-      // @param CHK, a macro that can be used to detect CPU features
-      // at runtime; it takes a CPU feature name without string quotes and
-      // returns the test result as a boolean value.
-      // NumPy already has a macro called "NPY_CPU_HAVE", which fits this requirement.
-      //
-      // @param CB, a callback macro that is expected to be called multiple times depending
-      // on the required optimizations. The callback should receive the following arguments:
-      //  1- The pending calls of @param CHK filled up with the required CPU features,
-      //     which need to be tested first at runtime before executing the call that belongs to
-      //     the compiled object.
-      //  2- The required optimization name, same as in 'NPY__CPU_TARGET_CURRENT'
-      //  3- Extra arguments in the macro itself
-      //
-      // By default the callback calls are sorted depending on the highest interest
-      // unless the policy "$keep_sort" was in place within the configuration statements
-      // see "Dive into the CPU dispatcher" for more clarification.
-      #define NPY__CPU_DISPATCH_CALL(CHK, CB, ...) \
-        NPY__CPU_DISPATCH_EXPAND_(CB((CHK(AVX512F)), AVX512F, __VA_ARGS__)) \
-        NPY__CPU_DISPATCH_EXPAND_(CB((CHK(SSE)&&CHK(SSE2)&&CHK(SSE3)&&CHK(SSSE3)&&CHK(SSE41)), SSE41, __VA_ARGS__))
-
-  An example of using the config header in light of the above:
-
-  .. code:: c
-
-      // NOTE: The following macros are only defined for demonstration purposes only.
-      // NumPy already has a collections of macros located at
-      // numpy/numpy/core/src/common/npy_cpu_dispatch.h, that covers all dispatching
-      // and declarations scenarios.
-
-      #include "numpy/npy_cpu_features.h" // NPY_CPU_HAVE
-      #include "numpy/utils.h" // NPY_CAT, NPY_EXPAND
-
-      // An example for setting a macro that calls all the exported symbols at once
-      // after checking if they're supported by the running machine.
-      #define DISPATCH_CALL_ALL(FN, ARGS) \
-          NPY__CPU_DISPATCH_CALL(NPY_CPU_HAVE, DISPATCH_CALL_ALL_CB, FN, ARGS) \
-          NPY__CPU_DISPATCH_BASELINE_CALL(DISPATCH_CALL_BASELINE_ALL_CB, FN, ARGS)
-      // The preprocessor callbacks.
-      // The same suffixes as we define it in the dispatch-able source.
-      #define DISPATCH_CALL_ALL_CB(CHECK, TARGET_NAME, FN, ARGS) \
-        if (CHECK) { NPY_CAT(NPY_CAT(FN, _), TARGET_NAME) ARGS; }
-      #define DISPATCH_CALL_BASELINE_ALL_CB(FN, ARGS) \
-        FN NPY_EXPAND(ARGS);
-
-      // An example for setting a macro that calls the exported symbols of highest
-      // interest optimization, after checking if they're supported by the running machine.
-      #define DISPATCH_CALL_HIGH(FN, ARGS) \
-        if (0) {} \
-          NPY__CPU_DISPATCH_CALL(NPY_CPU_HAVE, DISPATCH_CALL_HIGH_CB, FN, ARGS) \
-          NPY__CPU_DISPATCH_BASELINE_CALL(DISPATCH_CALL_BASELINE_HIGH_CB, FN, ARGS)
-      // The preprocessor callbacks
-      // The same suffixes as we define it in the dispatch-able source.
-      #define DISPATCH_CALL_HIGH_CB(CHECK, TARGET_NAME, FN, ARGS) \
-        else if (CHECK) { NPY_CAT(NPY_CAT(FN, _), TARGET_NAME) ARGS; }
-      #define DISPATCH_CALL_BASELINE_HIGH_CB(FN, ARGS) \
-        else { FN NPY_EXPAND(ARGS); }
-
-      // NumPy has a macro called 'NPY_CPU_DISPATCH_DECLARE' can be used
-      // for forward declrations any kind of prototypes based on
-      // 'NPY__CPU_DISPATCH_CALL' and 'NPY__CPU_DISPATCH_BASELINE_CALL'.
-      // However in this example, we just handle it manually.
-      void simd_whoami(const char *extra_info);
-      void simd_whoami_AVX512F(const char *extra_info);
-      void simd_whoami_SSE41(const char *extra_info);
-
-      void trigger_me(void)
-      {
-          // bring the auto-gernreated config header
-          // which contains config macros 'NPY__CPU_DISPATCH_CALL' and
-          // 'NPY__CPU_DISPATCH_BASELINE_CALL'.
-          // it highely recomaned to include the config header before exectuing
-        // the dispatching macros in case if there's another header in the scope.
-          #include "hello.dispatch.h"
-          DISPATCH_CALL_ALL(simd_whoami, ("all"))
-          DISPATCH_CALL_HIGH(simd_whoami, ("the highest interest"))
-          // An example of including multiple config headers in the same source
-          // #include "hello2.dispatch.h"
-          // DISPATCH_CALL_HIGH(another_function, ("the highest interest"))
-      }
-
-
-Dive into the CPU dispatcher
-============================
-
-The baseline
-~~~~~~~~~~~~
-
-Dispatcher
-~~~~~~~~~~
-
-Groups and Policies
-~~~~~~~~~~~~~~~~~~~
-
-Examples
-~~~~~~~~
-
-Report and Trace
-~~~~~~~~~~~~~~~~
-
-
-.. _`Universal Intrinsics`: https://numpy.org/neps/nep-0038-SIMD-optimizations.html
+The location of this document has changed. If you are not
+redirected in a few seconds, `click here <index.html>`_.
diff --git a/doc/source/reference/swig.interface-file.rst b/doc/source/reference/swig.interface-file.rst

index 6dd74f4ecb211f9ba35d8f1f2f8d9a07cce13a79..a22b98d394a1ee1a07566584b2ebfdfabe18dbed 100644 (file)
--- a/doc/source/reference/swig.interface-file.rst
+++ b/doc/source/reference/swig.interface-file.rst
@@ -904,7 +904,7 @@ Routines
  
      * ``PyArrayObject* ary``, a NumPy array.
  
-    Require the given ``PyArrayObject`` to to be Fortran ordered.  If
+    Require the given ``PyArrayObject`` to be Fortran ordered.  If
      the ``PyArrayObject`` is already Fortran ordered, do nothing.
      Else, set the Fortran ordering flag and recompute the strides.
  
diff --git a/doc/source/release.rst b/doc/source/release.rst

index 3474e0a48b0116ebd361d43506ab65c984408667..f21b5610d943227e4a98394fa54612a1873a6323 100644 (file)
--- a/doc/source/release.rst
+++ b/doc/source/release.rst
@@ -5,11 +5,14 @@ Release notes
  .. toctree::
      :maxdepth: 3
  
+    1.23.0 <release/1.23.0-notes>
      1.22.4 <release/1.22.4-notes>
      1.22.3 <release/1.22.3-notes>
      1.22.2 <release/1.22.2-notes>
      1.22.1 <release/1.22.1-notes>
      1.22.0 <release/1.22.0-notes>
+    1.21.6 <release/1.21.6-notes>
+    1.21.5 <release/1.21.5-notes>
      1.21.4 <release/1.21.4-notes>
      1.21.3 <release/1.21.3-notes>
      1.21.2 <release/1.21.2-notes>
diff --git a/doc/source/release/1.10.3-notes.rst b/doc/source/release/1.10.3-notes.rst

index 0d4df4ce6a183321c323b361fc8ed5df656ec777..9172f76635232221c43cc771f37611a4522774bc 100644 (file)
--- a/doc/source/release/1.10.3-notes.rst
+++ b/doc/source/release/1.10.3-notes.rst
@@ -2,4 +2,4 @@
  NumPy 1.10.3 Release Notes
  ==========================
  
-N/A this release did not happen due to various screwups involving PyPi.
+N/A this release did not happen due to various screwups involving PyPI.
diff --git a/doc/source/release/1.11.1-notes.rst b/doc/source/release/1.11.1-notes.rst

index 6303c32f0e07f4c91cb0b437d91b7051c998ecd4..a196502cf7462c85f7bcf0847b85baa4f240269c 100644 (file)
--- a/doc/source/release/1.11.1-notes.rst
+++ b/doc/source/release/1.11.1-notes.rst
@@ -4,7 +4,7 @@ NumPy 1.11.1 Release Notes
  
  Numpy 1.11.1 supports Python 2.6 - 2.7 and 3.2 - 3.5. It fixes bugs and
  regressions found in Numpy 1.11.0 and includes several build related
-improvements. Wheels for Linux, Windows, and OSX can be found on pypi.
+improvements. Wheels for Linux, Windows, and OSX can be found on PyPI.
  
  Fixes Merged
  ============
diff --git a/doc/source/release/1.12.1-notes.rst b/doc/source/release/1.12.1-notes.rst

index f67dab1085d57e14d161233c7fd54963d7fb8790..09a2e67381c97a61087b05f3a12f33be61e55220 100644 (file)
--- a/doc/source/release/1.12.1-notes.rst
+++ b/doc/source/release/1.12.1-notes.rst
@@ -4,7 +4,7 @@ NumPy 1.12.1 Release Notes
  
  NumPy 1.12.1 supports Python 2.7 and 3.4 - 3.6 and fixes bugs and regressions
  found in NumPy 1.12.0. In particular, the regression in f2py constant parsing
-is fixed. Wheels for Linux, Windows, and OSX can be found on pypi,
+is fixed. Wheels for Linux, Windows, and OSX can be found on PyPI,
  
  Bugs Fixed
  ==========
diff --git a/doc/source/release/1.19.0-notes.rst b/doc/source/release/1.19.0-notes.rst

index 410890697b5c27cefa55716cc942b1fd2750dac0..4a09920e45cd1b891d5e46cae822012af7e1cb59 100644 (file)
--- a/doc/source/release/1.19.0-notes.rst
+++ b/doc/source/release/1.19.0-notes.rst
@@ -317,9 +317,9 @@ New Features
  
  ``numpy.frompyfunc`` now accepts an identity argument
  -----------------------------------------------------
-This allows the :attr:``numpy.ufunc.identity`` attribute to be set on the
+This allows the :attr:`numpy.ufunc.identity` attribute to be set on the
  resulting ufunc, meaning it can be used for empty and multi-dimensional
-calls to :meth:``numpy.ufunc.reduce``.
+calls to :meth:`numpy.ufunc.reduce`.
  
  (`gh-8255 <https://github.com/numpy/numpy/pull/8255>`__)
  
diff --git a/doc/source/release/1.21.5-notes.rst b/doc/source/release/1.21.5-notes.rst

new file mode 100644 (file)

index 0000000..c69d267
--- /dev/null
+++ b/doc/source/release/1.21.5-notes.rst
@@ -0,0 +1,42 @@
+.. currentmodule:: numpy
+
+==========================
+NumPy 1.21.5 Release Notes
+==========================
+
+NumPy 1.21.5 is a maintenance release that fixes a few bugs discovered after
+the 1.21.4 release and does some maintenance to extend the 1.21.x lifetime.
+The Python versions supported in this release are 3.7-3.10. If you want to
+compile your own version using gcc-11, you will need to use gcc-11.2+ to avoid
+problems.
+
+Contributors
+============
+
+A total of 7 people contributed to this release.  People with a "+" by their
+names contributed a patch for the first time.
+
+* Bas van Beek
+* Charles Harris
+* Matti Picus
+* Rohit Goswami
+* Ross Barnowski
+* Sayed Adel
+* Sebastian Berg
+
+Pull requests merged
+====================
+
+A total of 11 pull requests were merged for this release.
+
+* `#20357 <https://github.com/numpy/numpy/pull/20357>`__: MAINT: Do not forward ``__(deep)copy__`` calls of ``_GenericAlias``...
+* `#20462 <https://github.com/numpy/numpy/pull/20462>`__: BUG: Fix float16 einsum fastpaths using wrong tempvar
+* `#20463 <https://github.com/numpy/numpy/pull/20463>`__: BUG, DIST: Print os error message when the executable not exist
+* `#20464 <https://github.com/numpy/numpy/pull/20464>`__: BLD: Verify the ability to compile C++ sources before initiating...
+* `#20465 <https://github.com/numpy/numpy/pull/20465>`__: BUG: Force ``npymath`` to respect ``npy_longdouble``
+* `#20466 <https://github.com/numpy/numpy/pull/20466>`__: BUG: Fix failure to create aligned, empty structured dtype
+* `#20467 <https://github.com/numpy/numpy/pull/20467>`__: ENH: provide a convenience function to replace npy_load_module
+* `#20495 <https://github.com/numpy/numpy/pull/20495>`__: MAINT: update wheel to version that supports python3.10
+* `#20497 <https://github.com/numpy/numpy/pull/20497>`__: BUG: Clear errors correctly in F2PY conversions
+* `#20613 <https://github.com/numpy/numpy/pull/20613>`__: DEV: add a warningfilter to fix pytest workflow.
+* `#20618 <https://github.com/numpy/numpy/pull/20618>`__: MAINT: Help boost::python libraries at least not crash
diff --git a/doc/source/release/1.21.6-notes.rst b/doc/source/release/1.21.6-notes.rst

new file mode 100644 (file)

index 0000000..6683969
--- /dev/null
+++ b/doc/source/release/1.21.6-notes.rst
@@ -0,0 +1,13 @@
+.. currentmodule:: numpy
+
+==========================
+NumPy 1.21.6 Release Notes
+==========================
+
+NumPy 1.21.6 is a very small release that achieves two things:
+
+- Backs out the mistaken backport of C++ code into 1.21.5.
+- Provides a 32 bit Windows wheel for Python 3.10.
+
+The provision of the 32 bit wheel is intended to make life easier
+for oldest-supported-numpy.
diff --git a/doc/source/release/1.22.0-notes.rst b/doc/source/release/1.22.0-notes.rst

index 4b93e872e6bdf8833b2c4e4e51222626729046d9..c5477fb163d0614a7732e8970a2276681d1b34e6 100644 (file)
--- a/doc/source/release/1.22.0-notes.rst
+++ b/doc/source/release/1.22.0-notes.rst
@@ -3,7 +3,7 @@
  ==========================
  NumPy 1.22.0 Release Notes
  ==========================
-NumPy 1.22.0 is a big release featuring the work of 153 contributers spread
+NumPy 1.22.0 is a big release featuring the work of 153 contributors spread
  over 609 pull requests. There have been many improvements, highlights are:
  
  * Annotations of the main namespace are essentially complete. Upstream is a
@@ -18,22 +18,22 @@ over 609 pull requests. There have been many improvements, highlights are:
  * New methods for ``quantile``, ``percentile``, and related functions. The new
    methods provide a complete set of the methods commonly found in the
    literature.
-* A new configurable allocator for use by downstream projects.
  * The universal functions have been refactored to implement most of
    :ref:`NEP 43 <NEP43>`.  This also unlocks the ability to experiment with the
    future DType API.
+* A new configurable allocator for use by downstream projects.
  
  These are in addition to the ongoing work to provide SIMD support for commonly
  used functions, improvements to F2PY, and better documentation.
  
  The Python versions supported in this release are 3.8-3.10, Python 3.7 has been
-dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on
-Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other
-Linux distributions dropping 32 bit support. The Mac wheels are now based on
-OS X 10.14 rather than 10.6 that was used in previous NumPy release cycles.
-10.14 is the oldest release supported by Apple. All 64 bit wheels are also
-linked with 64 bit integer OpenBLAS, which should fix the occasional problems
-encountered by folks using truly huge arrays.
+dropped. Note that the Mac wheels are now based on OS X 10.14 rather than 10.9
+that was used in previous NumPy release cycles. 10.14 is the oldest release
+supported by Apple. Also note that 32 bit wheels are only provided for Python
+3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu,
+Fedora, and other Linux distributions dropping 32 bit support. All 64 bit
+wheels are also linked with 64 bit integer OpenBLAS, which should fix the
+occasional problems encountered by folks using truly huge arrays.
  
  
  Expired deprecations
diff --git a/doc/source/release/1.22.1-notes.rst b/doc/source/release/1.22.1-notes.rst

index e494bdef45ce0b8831f06015230b2b2edad1ef4b..0012f199eaf73be9d053971a9546a48cd1f430af 100644 (file)
--- a/doc/source/release/1.22.1-notes.rst
+++ b/doc/source/release/1.22.1-notes.rst
@@ -4,7 +4,7 @@
  NumPy 1.22.1 Release Notes
  ==========================
  
-The NumPy 1.22.1 is maintenance release that fixes bugs discovered after the
+NumPy 1.22.1 is a maintenance release that fixes bugs discovered after the
  1.22.0 release. Notable fixes are:
  
  - Fix f2PY docstring problems (SciPy)
diff --git a/doc/source/release/1.22.3-notes.rst b/doc/source/release/1.22.3-notes.rst

index 2b7508bbd37aa7d874dfcd1cff0157fbeeaaa169..6c94e2d28baabec46895026d50fe12014b4f132e 100644 (file)
--- a/doc/source/release/1.22.3-notes.rst
+++ b/doc/source/release/1.22.3-notes.rst
@@ -4,7 +4,7 @@
  NumPy 1.22.3 Release Notes
  ==========================
  
-The NumPy 1.22.3 is maintenance release that fixes bugs discovered after the
+NumPy 1.22.3 is a maintenance release that fixes bugs discovered after the
  1.22.2 release. The most noticeable fixes may be those for DLPack. One that may
  cause some problems is disallowing strings as inputs to logical ufuncs. It is
  still undecided how strings should be treated in those functions and it was
@@ -12,7 +12,7 @@ thought best to simply disallow them until a decision was reached. That should
  not cause problems with older code.
  
  The Python versions supported for this release are 3.8-3.10. Note that the Mac
-wheels are now based on OS X 10.14 rather than 10.6 that was used in previous
+wheels are now based on OS X 10.14 rather than 10.9 that was used in previous
  NumPy release cycles. 10.14 is the oldest release supported by Apple.
  
  Contributors
@@ -41,7 +41,7 @@ A total of 10 pull requests were merged for this release.
  * `#21137 <https://github.com/numpy/numpy/pull/21137>`__: BLD,DOC: skip broken ipython 8.1.0
  * `#21138 <https://github.com/numpy/numpy/pull/21138>`__: BUG, ENH: np._from_dlpack: export correct device information
  * `#21139 <https://github.com/numpy/numpy/pull/21139>`__: BUG: Fix numba DUFuncs added loops getting picked up
-* `#21140 <https://github.com/numpy/numpy/pull/21140>`__: BUG: Fix unpickling an empty ndarray with a none-zero dimension...
+* `#21140 <https://github.com/numpy/numpy/pull/21140>`__: BUG: Fix unpickling an empty ndarray with a non-zero dimension...
  * `#21141 <https://github.com/numpy/numpy/pull/21141>`__: BUG: use ThreadPoolExecutor instead of ThreadPool
  * `#21142 <https://github.com/numpy/numpy/pull/21142>`__: API: Disallow strings in logical ufuncs
  * `#21143 <https://github.com/numpy/numpy/pull/21143>`__: MAINT, DOC: Fix SciPy intersphinx link
diff --git a/doc/source/release/1.22.4-notes.rst b/doc/source/release/1.22.4-notes.rst

index a13d0a86221a4dde5bfc1c18d9102d233a5beac8..1f418ca62adff37a7ed341af683fc74cbc7209dd 100644 (file)
--- a/doc/source/release/1.22.4-notes.rst
+++ b/doc/source/release/1.22.4-notes.rst
@@ -10,8 +10,8 @@ recently released Cython 0.29.30, which should fix the reported problems with
  `debugging <https://github.com/numpy/numpy/issues/21008>`_.
  
  The Python versions supported for this release are 3.8-3.10. Note that the Mac
-wheels are now based on OS X 10.14 rather than 10.6 that was used in previous
-NumPy release cycles. 10.14 is the oldest release supported by Apple.
+wheels are based on OS X 10.15 rather than 10.9 that was used in previous
+NumPy release cycles.
  
  Contributors
  ============
diff --git a/doc/source/release/1.23.0-notes.rst b/doc/source/release/1.23.0-notes.rst

new file mode 100644 (file)

index 0000000..c5c2362
--- /dev/null
+++ b/doc/source/release/1.23.0-notes.rst
@@ -0,0 +1,412 @@
+.. currentmodule:: numpy
+
+==========================
+NumPy 1.23.0 Release Notes
+==========================
+
+The NumPy 1.23.0 release continues the ongoing work to improve the handling and
+promotion of dtypes, increase the execution speed, clarify the documentation,
+and expire old deprecations. The highlights are:
+
+* Implementation of ``loadtxt`` in C, greatly improving its performance.
+* Exposing DLPack at the Python level for easy data exchange.
+* Changes to the promotion and comparisons of structured dtypes.
+* Improvements to f2py.
+
+See below for the details.
+
+
+New functions
+=============
+
+* A masked array specialization of ``ndenumerate`` is now available as
+  ``numpy.ma.ndenumerate``. It provides an alternative to ``numpy.ndenumerate``
+  and skips masked values by default.
+
+  (`gh-20020 <https://github.com/numpy/numpy/pull/20020>`__)
+
+* ``numpy.from_dlpack`` has been added to allow easy exchange of data using the
+  DLPack protocol.  It accepts Python objects that implement the ``__dlpack__``
+  and ``__dlpack_device__`` methods and returns a ndarray object which is
+  generally the view of the data of the input object.
+
+  (`gh-21145 <https://github.com/numpy/numpy/pull/21145>`__)
+
+
+Deprecations
+============
+
+* Setting ``__array_finalize__`` to ``None`` is deprecated.  It must now be
+  a method, which may wish to call ``super().__array_finalize__(obj)`` after
+  checking for ``None`` or if the NumPy version is sufficiently new.
+
+  (`gh-20766 <https://github.com/numpy/numpy/pull/20766>`__)
+
+* Using ``axis=32`` (``axis=np.MAXDIMS``) in many cases had the
+  same meaning as ``axis=None``.  This is deprecated and ``axis=None``
+  must be used instead.
+
+  (`gh-20920 <https://github.com/numpy/numpy/pull/20920>`__)
+
+* The hook function ``PyDataMem_SetEventHook`` has been deprecated and the
+  demonstration of its use in tool/allocation_tracking has been removed.  The
+  ability to track allocations is now built into Python via ``tracemalloc``.
+
+  (`gh-20394 <https://github.com/numpy/numpy/pull/20394>`__)
+
+* ``numpy.distutils`` has been deprecated, as a result of ``distutils`` itself
+  being deprecated. It will not be present in NumPy for Python >= 3.12, and
+  will be removed completely 2 years after the release of Python 3.12. For more
+  details, see :ref:`distutils-status-migration`.
+
+  (`gh-20875 <https://github.com/numpy/numpy/pull/20875>`__)
+
+* ``numpy.loadtxt`` will now give a ``DeprecationWarning`` when an integer
+  ``dtype`` is requested but the value is formatted as a floating point number.
+
+  (`gh-21663 <https://github.com/numpy/numpy/pull/21663>`__)
+
+
+Expired deprecations
+====================
+
+* The ``NpzFile.iteritems()`` and ``NpzFile.iterkeys()`` methods have been
+  removed as part of the continued removal of Python 2 compatibility. This
+  concludes the deprecation from 1.15.
+
+  (`gh-16830 <https://github.com/numpy/numpy/pull/16830>`__)
+
+* The ``alen`` and ``asscalar`` functions have been removed.
+
+  (`gh-20414 <https://github.com/numpy/numpy/pull/20414>`__)
+
+* The ``UPDATEIFCOPY`` array flag has been removed together with the enum
+  ``NPY_ARRAY_UPDATEIFCOPY``. The associated (and deprecated)
+  ``PyArray_XDECREF_ERR`` was also removed. These were all deprecated in 1.14. They
+  are replaced by ``WRITEBACKIFCOPY``, which requires calling
+  ``PyArray_ResolveWritebackIfCopy`` before the array is deallocated.
+
+  (`gh-20589 <https://github.com/numpy/numpy/pull/20589>`__)
+
+* Exceptions will be raised during array-like creation.  When an object raised
+  an exception during access of the special attributes ``__array__`` or
+  ``__array_interface__``, this exception was usually ignored.  This behaviour
+  was deprecated in 1.21, and the exception will now be raised.
+
+  (`gh-20835 <https://github.com/numpy/numpy/pull/20835>`__)
+
+* Multidimensional indexing with non-tuple values is not allowed.  Previously,
+  code such as ``arr[ind]`` where ``ind = [[0, 1], [0, 1]]`` produced a
+  ``FutureWarning`` and was interpreted as a multidimensional index (i.e.,
+  ``arr[tuple(ind)]``). Now this example is treated like an array index over a
+  single dimension (``arr[array(ind)]``).  Multidimensional indexing with
+  anything but a tuple was deprecated in NumPy 1.15.
+
+  (`gh-21029 <https://github.com/numpy/numpy/pull/21029>`__)
+
+* Changing to a dtype of different size in F-contiguous arrays is no longer
+  permitted. Deprecated since NumPy 1.11.0. See below for an extended
+  explanation of the effects of this change.
+
+  (`gh-20722 <https://github.com/numpy/numpy/pull/20722>`__)
+
+
+New Features
+============
+
+crackfortran has support for operator and assignment overloading
+----------------------------------------------------------------
+The ``crackfortran`` parser now understands operator and assignment
+definitions in a module. They are added in the ``body`` list of the
+module which contains a new key ``implementedby`` listing the names
+of the subroutines or functions implementing the operator or
+assignment.
+
+(`gh-15006 <https://github.com/numpy/numpy/pull/15006>`__)
+
+f2py supports reading access type attributes from derived type statements
+-------------------------------------------------------------------------
+As a result, one does not need to use ``public`` or ``private`` statements to
+specify derived type access properties.
+
+(`gh-15844 <https://github.com/numpy/numpy/pull/15844>`__)
+
+New parameter ``ndmin`` added to ``genfromtxt``
+-------------------------------------------------------------------------
+This parameter behaves the same as ``ndmin`` from ``numpy.loadtxt``.
+
+(`gh-20500 <https://github.com/numpy/numpy/pull/20500>`__)
+
+``np.loadtxt`` now supports quote character and single converter function
+-------------------------------------------------------------------------
+``numpy.loadtxt`` now supports an additional ``quotechar`` keyword argument
+which is not set by default.  Using ``quotechar='"'`` will read quoted fields
+as used by the Excel CSV dialect.
+
+Further, it is now possible to pass a single callable rather than a dictionary
+for the ``converters`` argument.
+
+(`gh-20580 <https://github.com/numpy/numpy/pull/20580>`__)
+
+Changing to dtype of a different size now requires contiguity of only the last axis
+-----------------------------------------------------------------------------------
+Previously, viewing an array with a dtype of a different item size required that
+the entire array be C-contiguous. This limitation would unnecessarily force the
+user to make contiguous copies of non-contiguous arrays before being able to
+change the dtype.
+
+This change affects not only ``ndarray.view``, but other construction
+mechanisms, including the discouraged direct assignment to ``ndarray.dtype``.
+
+This change expires the deprecation regarding the viewing of F-contiguous
+arrays, described elsewhere in the release notes.
+
+(`gh-20722 <https://github.com/numpy/numpy/pull/20722>`__)
+
+Deterministic output files for F2PY
+-----------------------------------
+For F77 inputs, ``f2py`` will generate ``modname-f2pywrappers.f``
+unconditionally, though these may be empty.  For free-form inputs,
+``modname-f2pywrappers.f``, ``modname-f2pywrappers2.f90`` will both be generated
+unconditionally, and may be empty. This allows writing generic output rules in
+``cmake`` or ``meson`` and other build systems. Older behavior can be restored
+by passing ``--skip-empty-wrappers`` to ``f2py``. :ref:`f2py-meson` details usage.
+
+(`gh-21187 <https://github.com/numpy/numpy/pull/21187>`__)
+
+``keepdims`` parameter for ``average``
+--------------------------------------
+The parameter ``keepdims`` was added to the functions ``numpy.average``
+and ``numpy.ma.average``.  The parameter has the same meaning as it
+does in reduction functions such as ``numpy.sum`` or ``numpy.mean``.
+
+(`gh-21485 <https://github.com/numpy/numpy/pull/21485>`__)
+
+New parameter ``equal_nan`` added to ``np.unique``
+--------------------------------------------------
+``np.unique`` was changed in 1.21 to treat all ``NaN`` values as equal and return
+a single ``NaN``. Setting ``equal_nan=False`` will restore pre-1.21 behavior
+to treat ``NaNs`` as unique. Defaults to ``True``.
+
+(`gh-21623 <https://github.com/numpy/numpy/pull/21623>`__)
+
+
+Compatibility notes
+===================
+
+1D ``np.linalg.norm`` preserves float input types, even for scalar results
+--------------------------------------------------------------------------
+Previously, this would promote to ``float64`` when the ``ord`` argument was
+not one of the explicitly listed values, e.g. ``ord=3``::
+
+    >>> f32 = np.float32([1, 2])
+    >>> np.linalg.norm(f32, 2).dtype
+    dtype('float32')
+    >>> np.linalg.norm(f32, 3).dtype
+    dtype('float64')  # numpy 1.22
+    dtype('float32')  # numpy 1.23
+
+This change affects only ``float32`` and ``float16`` vectors with ``ord``
+other than ``-Inf``, ``0``, ``1``, ``2``, and ``Inf``.
+
+(`gh-17709 <https://github.com/numpy/numpy/pull/17709>`__)
+
+Changes to structured (void) dtype promotion and comparisons
+------------------------------------------------------------
+In general, NumPy now defines correct, but slightly limited, promotion for
+structured dtypes by promoting the subtypes of each field instead of raising
+an exception::
+
+    >>> np.result_type(np.dtype("i,i"), np.dtype("i,d"))
+    dtype([('f0', '<i4'), ('f1', '<f8')])
+
+For promotion, matching field names, order, and titles are enforced; however,
+padding is ignored.
+Promotion involving structured dtypes now always ensures native byte-order for
+all fields (which may change the result of ``np.concatenate``)
+and ensures that the result will be "packed", i.e. all fields are ordered
+contiguously and padding is removed.
+See :ref:`structured_dtype_comparison_and_promotion` for further details.
+
+The ``repr`` of aligned structures will now never print the long form including
+``offsets`` and ``itemsize`` unless the structure includes padding not
+guaranteed by ``align=True``.
+
+In alignment with the above changes to the promotion logic, the
+casting safety has been updated:
+
+* ``"equiv"`` enforces matching names and titles. The itemsize
+  is allowed to differ due to padding.
+* ``"safe"`` allows mismatching field names and titles.
+* The cast safety is limited by the cast safety of each included
+  field.
+* The order of fields is used to decide cast safety of each
+  individual field.  Previously, the field names were used and
+  only unsafe casts were possible when names mismatched.
+
+The most important change here is that name mismatches are now
+considered "safe" casts.
+
+(`gh-19226 <https://github.com/numpy/numpy/pull/19226>`__)
+
+``NPY_RELAXED_STRIDES_CHECKING`` has been removed
+-------------------------------------------------
+NumPy cannot be compiled with ``NPY_RELAXED_STRIDES_CHECKING=0``
+anymore.  Relaxed strides have been the default for many years and
+the option was initially introduced to allow a smoother transition.
+
+(`gh-20220 <https://github.com/numpy/numpy/pull/20220>`__)
+
+``np.loadtxt`` has received several changes
+-------------------------------------------
+
+The row counting of ``numpy.loadtxt`` was fixed.  ``loadtxt`` ignores fully
+empty lines in the file, but counted them towards ``max_rows``.
+When ``max_rows`` is used and the file contains empty lines, these will now
+not be counted.  Previously, it was possible that the result contained fewer
+than ``max_rows`` rows even though more data was available to be read.
+If the old behaviour is required, ``itertools.islice`` may be used::
+
+    import itertools
+    lines = itertools.islice(open("file"), 0, max_rows)
+    result = np.loadtxt(lines, ...)
+
+While generally much faster and improved, ``numpy.loadtxt`` may now fail to
+convert certain strings to numbers that were previously successfully read.
+The most important cases for this are:
+
+* Parsing floating point values such as ``1.0`` into integers is now deprecated.
+* Parsing hexadecimal floats such as ``0x3p3`` will fail.
+* An ``_`` was previously accepted as a thousands delimiter, e.g. ``100_000``.
+  This will now result in an error.
+
+If you experience these limitations, they can all be worked around by passing
+appropriate ``converters=``.  NumPy now supports passing a single converter
+to be used for all columns to make this more convenient.
+For example, ``converters=float.fromhex`` can read hexadecimal float numbers
+and ``converters=int`` will be able to read ``100_000``.
+
+Further, the error messages have been generally improved.  However, this means
+that error types may differ.  In particular, a ``ValueError`` is now always
+raised when parsing of a single entry fails.
+
+(`gh-20580 <https://github.com/numpy/numpy/pull/20580>`__)
+
+
+Improvements
+============
+
+``ndarray.__array_finalize__`` is now callable
+----------------------------------------------
+This means subclasses can now use ``super().__array_finalize__(obj)``
+without worrying whether ``ndarray`` is their superclass or not.
+The actual call remains a no-op.
+
+(`gh-20766 <https://github.com/numpy/numpy/pull/20766>`__)
+
+Add support for VSX4/Power10
+----------------------------------------------
+With VSX4/Power10 enablement, the new instructions available in
+Power ISA 3.1 can be used to accelerate some NumPy operations,
+e.g., floor_divide, modulo, etc.
+
+(`gh-20821 <https://github.com/numpy/numpy/pull/20821>`__)
+
+``np.fromiter`` now accepts objects and subarrays
+-------------------------------------------------
+The ``numpy.fromiter`` function now supports object and
+subarray dtypes. Please see the function documentation for
+examples.
+
+(`gh-20993 <https://github.com/numpy/numpy/pull/20993>`__)
+
+Math C library feature detection now uses correct signatures
+------------------------------------------------------------
+Compiling is preceded by a detection phase to determine whether the
+underlying libc supports certain math operations. Previously this code
+did not respect the proper signatures. Fixing this enables compilation
+for the ``wasm-ld`` backend (compilation for web assembly) and reduces
+the number of warnings.
+
+(`gh-21154 <https://github.com/numpy/numpy/pull/21154>`__)
+
+``np.kron`` now maintains subclass information
+----------------------------------------------
+``np.kron`` now preserves subclass information, such as masked arrays,
+while computing the Kronecker product of the inputs.
+
+.. code-block:: python
+
+    >>> x = ma.array([[1, 2], [3, 4]], mask=[[0, 1], [1, 0]])
+    >>> np.kron(x,x)
+    masked_array(
+      data=[[1, --, --, --],
+            [--, 4, --, --],
+            [--, --, 4, --],
+            [--, --, --, 16]],
+      mask=[[False,  True,  True,  True],
+            [ True, False,  True,  True],
+            [ True,  True, False,  True],
+            [ True,  True,  True, False]],
+      fill_value=999999)
+
+.. warning::
+    ``np.kron`` output now follows ``ufunc`` ordering (``multiply``)
+    to determine the output class type
+
+    .. code-block:: python
+
+        >>> class myarr(np.ndarray):
+        ...     __array_priority__ = -1
+        >>> a = np.ones([2, 2])
+        >>> ma = myarr(a.shape, a.dtype, a.data)
+        >>> type(np.kron(a, ma)) == np.ndarray
+        False # Before it was True
+        >>> type(np.kron(a, ma)) == myarr
+        True
+
+(`gh-21262 <https://github.com/numpy/numpy/pull/21262>`__)
+
+
+Performance improvements and changes
+====================================
+
+Faster ``np.loadtxt``
+---------------------
+``numpy.loadtxt`` is now generally much faster than previously as most of it
+is now implemented in C.
+
+(`gh-20580 <https://github.com/numpy/numpy/pull/20580>`__)
+
+Faster reduction operators
+--------------------------
+Reduction operations like ``numpy.sum``, ``numpy.prod``, ``numpy.add.reduce``,
+``numpy.logical_and.reduce`` on contiguous integer-based arrays are now
+much faster.
+
+(`gh-21001 <https://github.com/numpy/numpy/pull/21001>`__)
+
+Faster ``np.where``
+-------------------
+``numpy.where`` is now much faster than previously on unpredictable/random
+input data.
+
+(`gh-21130 <https://github.com/numpy/numpy/pull/21130>`__)
+
+Faster operations on NumPy scalars
+----------------------------------
+Many operations on NumPy scalars are now significantly faster, although
+rare operations (e.g. with 0-D arrays rather than scalars) may be slower
+in some cases.
+However, even with these improvements, users who want the best performance
+for their scalars may want to convert a known NumPy scalar into a Python
+one using ``scalar.item()``.
+
+(`gh-21188 <https://github.com/numpy/numpy/pull/21188>`__)
+
+Faster ``np.kron``
+------------------
+``numpy.kron`` is about 80% faster as the product is now computed
+using broadcasting.
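
To illustrate what "computed using broadcasting" can mean, here is a sketch for the 2-D case only. It is an annotation, not the code from the pull request, and the helper name ``kron_2d`` is hypothetical.

```python
import numpy as np


def kron_2d(a, b):
    """Kronecker product of two 2-D arrays via broadcasting (illustrative)."""
    a = np.asarray(a)
    b = np.asarray(b)
    # (ra, 1, ca, 1) * (1, rb, 1, cb) broadcasts to (ra, rb, ca, cb),
    # where blocks[i, k, j, l] == a[i, j] * b[k, l].
    blocks = a[:, np.newaxis, :, np.newaxis] * b[np.newaxis, :, np.newaxis, :]
    # Merging the axis pairs gives result[i * rb + k, j * cb + l].
    return blocks.reshape(a.shape[0] * b.shape[0], a.shape[1] * b.shape[1])


a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])
result = kron_2d(a, b)
print(result)
```

The result can be compared against ``np.kron(a, b)`` to check the index arithmetic.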
+
+(`gh-21354 <https://github.com/numpy/numpy/pull/21354>`__)
diff --git a/doc/source/release/1.7.0-notes.rst b/doc/source/release/1.7.0-notes.rst

index f111f80dc97a7c26291229c7cfc8c4328ed1b40e..40a6f550b7751e910876a75eeaadb3398a04f21c 100644 (file)
--- a/doc/source/release/1.7.0-notes.rst
+++ b/doc/source/release/1.7.0-notes.rst
@@ -162,7 +162,7 @@ Added experimental support for the AArch64 architecture.
  C API
  -----
  
-New function ``PyArray_RequireWriteable`` provides a consistent interface
+New function ``PyArray_FailUnlessWriteable`` provides a consistent interface
  for checking array writeability -- any C code which works with arrays whose
  WRITEABLE flag is not known to be True a priori, should make sure to call
  this function before writing.
diff --git a/doc/source/user/absolute_beginners.rst b/doc/source/user/absolute_beginners.rst

index 90012da1c5107b0832468dc843f6adef9dd728b1..a4a82afb61c7d1143dccb059c13f10caa12b0025 100644 (file)
--- a/doc/source/user/absolute_beginners.rst
+++ b/doc/source/user/absolute_beginners.rst
@@ -229,8 +229,8 @@ content is random and depends on the state of the memory. The reason to use
  fill every element afterwards! ::
  
    >>> # Create an empty array with 2 elements
-  >>> np.empty(2)
-  array([ 3.14, 42.  ])  # may vary
+  >>> np.empty(2) #doctest: +SKIP
+  array([3.14, 42.  ])  # may vary
  
  You can create an array with a range of elements::
  
@@ -669,18 +669,18 @@ If you wanted to split this array into three equally shaped arrays, you would
  run::
  
    >>> np.hsplit(x, 3)
-  [array([[1,  2,  3,  4],
-          [13, 14, 15, 16]]), array([[ 5,  6,  7,  8],
-          [17, 18, 19, 20]]), array([[ 9, 10, 11, 12],
-          [21, 22, 23, 24]])]
+    [array([[ 1,  2,  3,  4],
+           [13, 14, 15, 16]]), array([[ 5,  6,  7,  8],
+           [17, 18, 19, 20]]), array([[ 9, 10, 11, 12],
+           [21, 22, 23, 24]])]
  
  If you wanted to split your array after the third and fourth column, you'd run::
  
    >>> np.hsplit(x, (3, 4))
-  [array([[1, 2, 3],
-          [13, 14, 15]]), array([[ 4],
-          [16]]), array([[ 5, 6, 7, 8, 9, 10, 11, 12],
-          [17, 18, 19, 20, 21, 22, 23, 24]])]
+    [array([[ 1,  2,  3],
+           [13, 14, 15]]), array([[ 4],
+           [16]]), array([[ 5,  6,  7,  8,  9, 10, 11, 12],
+           [17, 18, 19, 20, 21, 22, 23, 24]])]
  
  :ref:`Learn more about stacking and splitting arrays here <quickstart.stacking-arrays>`.
  
@@ -967,9 +967,8 @@ All you need to do is pass in the number of elements you want it to generate::
    array([1., 1., 1.])
    >>> np.zeros(3)
    array([0., 0., 0.])
-  # the simplest way to generate random numbers
-  >>> rng = np.random.default_rng(0)
-  >>> rng.random(3)
+  >>> rng = np.random.default_rng()  # the simplest way to generate random numbers
+  >>> rng.random(3) #doctest: +SKIP
    array([0.63696169, 0.26978671, 0.04097352])
  
  .. image:: images/np_ones_zeros_random.png
@@ -985,7 +984,7 @@ a 2D array if you give them a tuple describing the dimensions of the matrix::
    array([[0., 0.],
           [0., 0.],
           [0., 0.]])
-  >>> rng.random((3, 2))
+  >>> rng.random((3, 2)) #doctest: +SKIP
    array([[0.01652764, 0.81327024],
           [0.91275558, 0.60663578],
           [0.72949656, 0.54362499]])  # may vary
@@ -1011,7 +1010,7 @@ that this is inclusive with NumPy) to high (exclusive). You can set
  
  You can generate a 2 x 4 array of random integers between 0 and 4 with::
  
-  >>> rng.integers(5, size=(2, 4))
+  >>> rng.integers(5, size=(2, 4)) #doctest: +SKIP
    array([[2, 1, 1, 0],
           [0, 0, 0, 4]])  # may vary
  
@@ -1345,7 +1344,7 @@ followed by the docstring of ``ndarray`` of which ``a`` is an instance):
    Type:            ndarray
    String form:     [1 2 3 4 5 6]
    Length:          6
-  File:            ~/anaconda3/lib/python3.7/site-packages/numpy/__init__.py
+  File:            ~/anaconda3/lib/python3.9/site-packages/numpy/__init__.py
    Docstring:       <no docstring>
    Class docstring:
    ndarray(shape, dtype=float, buffer=None, offset=0,
diff --git a/doc/source/user/basics.broadcasting.rst b/doc/source/user/basics.broadcasting.rst

index 5bea8e7677ed956215dc3e81e735a714df5d571b..8e8add41e2676ff6b5852aec871184cca8317d9c 100644 (file)
--- a/doc/source/user/basics.broadcasting.rst
+++ b/doc/source/user/basics.broadcasting.rst
@@ -6,7 +6,7 @@ Broadcasting
  ************
  
  .. seealso::
-    :class:`numpy.broadcast`   
+    :class:`numpy.broadcast`
  
  
  The term broadcasting describes how NumPy treats arrays with different
@@ -26,7 +26,7 @@ have exactly the same shape, as in the following example:
    >>> a = np.array([1.0, 2.0, 3.0])
    >>> b = np.array([2.0, 2.0, 2.0])
    >>> a * b
-  array([ 2.,  4.,  6.])
+  array([2.,  4.,  6.])
  
  NumPy's broadcasting rule relaxes this constraint when the arrays'
  shapes meet certain constraints. The simplest broadcasting example occurs
@@ -35,7 +35,7 @@ when an array and a scalar value are combined in an operation:
  >>> a = np.array([1.0, 2.0, 3.0])
  >>> b = 2.0
  >>> a * b
-array([ 2.,  4.,  6.])
+array([2.,  4.,  6.])
  
  The result is equivalent to the previous example where ``b`` was an array.
  We can think of the scalar ``b`` being *stretched* during the arithmetic
@@ -73,8 +73,8 @@ way left.  Two dimensions are compatible when
  2) one of them is 1
  
  If these conditions are not met, a
-``ValueError: operands could not be broadcast together`` exception is 
-thrown, indicating that the arrays have incompatible shapes. The size of 
+``ValueError: operands could not be broadcast together`` exception is
+thrown, indicating that the arrays have incompatible shapes. The size of
  the resulting array is the size that is not 1 along each axis of the inputs.
  
  Arrays do not need to have the same *number* of dimensions.  For example,
@@ -158,18 +158,18 @@ Here are examples of shapes that do not broadcast::
  
  An example of broadcasting when a 1-d array is added to a 2-d array::
  
-  >>> a = array([[ 0.0,  0.0,  0.0],
-  ...            [10.0, 10.0, 10.0],
-  ...            [20.0, 20.0, 20.0],
-  ...            [30.0, 30.0, 30.0]])
-  >>> b = array([1.0, 2.0, 3.0])
+  >>> a = np.array([[ 0.0,  0.0,  0.0],
+  ...               [10.0, 10.0, 10.0],
+  ...               [20.0, 20.0, 20.0],
+  ...               [30.0, 30.0, 30.0]])
+  >>> b = np.array([1.0, 2.0, 3.0])
    >>> a + b
    array([[  1.,   2.,   3.],
-          [ 11.,  12.,  13.],
-          [ 21.,  22.,  23.],
-          [ 31.,  32.,  33.]])
-  >>> b = array([1.0, 2.0, 3.0, 4.0])
-  >>> a + b 
+          [11.,  12.,  13.],
+          [21.,  22.,  23.],
+          [31.,  32.,  33.]])
+  >>> b = np.array([1.0, 2.0, 3.0, 4.0])
+  >>> a + b
    Traceback (most recent call last):
    ValueError: operands could not be broadcast together with shapes (4,3) (4,)
  
@@ -178,7 +178,7 @@ In :ref:`broadcasting.figure-3`, an exception is raised because of the
  incompatible shapes.
  
  .. figure:: broadcasting_2.png
-    :alt: A 1-d array with shape (3) is strectched to match the 2-d array of
+    :alt: A 1-d array with shape (3) is stretched to match the 2-d array of
            shape (4, 3) it is being added to, and the result is a 2-d array of shape
            (4, 3).
      :name: broadcasting.figure-2
@@ -208,10 +208,10 @@ outer addition operation of two 1-d arrays::
    >>> a = np.array([0.0, 10.0, 20.0, 30.0])
    >>> b = np.array([1.0, 2.0, 3.0])
    >>> a[:, np.newaxis] + b
-  array([[  1.,   2.,   3.],
-         [ 11.,  12.,  13.],
-         [ 21.,  22.,  23.],
-         [ 31.,  32.,  33.]])
+  array([[ 1.,   2.,   3.],
+         [11.,  12.,  13.],
+         [21.,  22.,  23.],
+         [31.,  32.,  33.]])
  
  .. figure:: broadcasting_4.png
      :alt: A 2-d array of shape (4, 1) and a 1-d array of shape (3) are
@@ -266,7 +266,7 @@ the shape of the ``codes`` array::
            gymnast, marathon runner, basketball player, football
            lineman and the athlete to be classified. Shortest distance
            is found between the basketball player and the athlete
-          to be classified. 
+          to be classified.
      :name: broadcasting.figure-5
  
      *Figure 5*
@@ -281,7 +281,7 @@ are compared to a set of ``codes``. Consider this scenario::
  
    Observation      (2d array):      10 x 3
    Codes            (2d array):       5 x 3
-  Diff             (3d array):  5 x 10 x 3 
+  Diff             (3d array):  5 x 10 x 3
  
  The three-dimensional array, ``diff``, is a consequence of broadcasting, not a
  necessity for the calculation. Large data sets will generate a large
diff --git a/doc/source/user/basics.byteswapping.rst b/doc/source/user/basics.byteswapping.rst

index fecdb9ee85437c886b02d1cf6a49e6db78347396..d0a6623903bc170fca263501784f2fb94a911bc2 100644 (file)
--- a/doc/source/user/basics.byteswapping.rst
+++ b/doc/source/user/basics.byteswapping.rst
@@ -31,7 +31,7 @@ The bytes I have loaded from the file would have these contents:
  
  >>> big_end_buffer = bytearray([0,1,3,2])
  >>> big_end_buffer
-bytearray(b'\\x00\\x01\\x03\\x02')
+bytearray(b'\x00\x01\x03\x02')
  
  We might want to use an ``ndarray`` to access these integers.  In that
  case, we can create an array around this memory, and tell numpy that
diff --git a/doc/source/user/basics.copies.rst b/doc/source/user/basics.copies.rst

index 583a59b9563a97d7cfdd3ae9b2a70548ebaa66ed..482cbc189ec82dad6611a5feadceeb3403d12ba8 100644 (file)
--- a/doc/source/user/basics.copies.rst
+++ b/doc/source/user/basics.copies.rst
@@ -39,6 +39,8 @@ do not reflect on the original array. Making a copy is slower and
  memory-consuming but sometimes necessary. A copy can be forced by using
  :meth:`.ndarray.copy`.
  
+.. _indexing-operations:
+
  Indexing operations
  ===================
  
@@ -149,4 +151,4 @@ the original array while it returns ``None`` for a copy.
  
  Note that the ``base`` attribute should not be used to determine
  if an ndarray object is *new*; only if it is a view or a copy
-of another ndarray.
-\ No newline at end of file
+of another ndarray.
diff --git a/doc/source/user/basics.creation.rst b/doc/source/user/basics.creation.rst

index 84ff1c30e1f0e90121195f86eca4f320bd9e4688..c0a4fd7cfcbc3ddba669215abd4c1368bcbc696c 100644 (file)
--- a/doc/source/user/basics.creation.rst
+++ b/doc/source/user/basics.creation.rst
@@ -74,10 +74,11 @@ assign a new type that satisfies all of the array elements involved in
  the computation, here ``uint32`` and ``int32`` can both be represented
  as ``int64``.
  
-The default NumPy behavior is to create arrays in either 64-bit signed
-integers or double precision floating point numbers, ``int64`` and
-``float``, respectively. If you expect your arrays to be a certain type,
-then you need to specify the ``dtype`` while you create the array. 
+The default NumPy behavior is to create arrays in either 32 or 64-bit signed
+integers (platform dependent and matches C ``long`` size) or double precision
+floating point numbers, ``int32``/``int64`` and ``float``, respectively. If you
+expect your integer arrays to be a specific type, then you need to specify the
+``dtype`` while you create the array.
  
  2) Intrinsic NumPy array creation functions
  ===========================================
@@ -108,9 +109,9 @@ examples are shown::
   >>> np.arange(10)
   array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
   >>> np.arange(2, 10, dtype=float)
- array([ 2., 3., 4., 5., 6., 7., 8., 9.])
+ array([2., 3., 4., 5., 6., 7., 8., 9.])
   >>> np.arange(2, 3, 0.1)
- array([ 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
+ array([2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
  
  Note: best practice for :func:`numpy.arange` is to use integer start, end, and
  step values. There are some subtleties regarding ``dtype``. In the second
@@ -123,7 +124,7 @@ spaced equally between the specified beginning and end values. For
  example: ::
  
   >>> np.linspace(1., 4., 6)
- array([ 1. ,  1.6,  2.2,  2.8,  3.4,  4. ])
+ array([1. ,  1.6,  2.2,  2.8,  3.4,  4. ])
  
  The advantage of this creation function is that you guarantee the
  number of elements and the starting and end point. The previous
@@ -216,8 +217,8 @@ specified shape. The default dtype is ``float64``::
  ``zeros`` in all other respects as such::
  
   >>> np.ones((2, 3))
- array([[ 1., 1., 1.], 
-        [ 1., 1., 1.]])
+ array([[1., 1., 1.], 
+        [1., 1., 1.]])
   >>> np.ones((2, 3, 2))
   array([[[1., 1.],
           [1., 1.],
@@ -299,10 +300,10 @@ arrays into a 4-by-4 array using ``block``::
   >>> C = np.zeros((2, 2))
   >>> D = np.diag((-3, -4))
   >>> np.block([[A, B], [C, D]])
- array([[ 1.,  1.,  1.,  0. ],
-        [ 1.,  1.,  0.,  1. ],
-        [ 0.,  0., -3.,  0. ],
-        [ 0.,  0.,  0., -4. ]])
+ array([[ 1.,  1.,  1.,  0.],
+        [ 1.,  1.,  0.,  1.],
+        [ 0.,  0., -3.,  0.],
+        [ 0.,  0.,  0., -4.]])
  
  Other routines use similar syntax to join ndarrays. Check the
  routine's documentation for further examples and syntax. 
diff --git a/doc/source/user/basics.dispatch.rst b/doc/source/user/basics.dispatch.rst

index 089a7df170636f47a62cd5b28004a54faac2c44f..35c73dde4ad875b8e84686223fc255937a0f5292 100644 (file)
--- a/doc/source/user/basics.dispatch.rst
+++ b/doc/source/user/basics.dispatch.rst
@@ -57,7 +57,7 @@ array([[2., 0., 0., 0., 0.],
  Notice that the return type is a standard ``numpy.ndarray``.
  
  >>> type(np.multiply(arr, 2))
-numpy.ndarray
+<class 'numpy.ndarray'>
  
  How can we pass our custom array type through this function? Numpy allows a
  class to indicate that it would like to handle computations in a custom-defined
@@ -119,7 +119,9 @@ DiagonalArray(N=5, value=0.8414709848078965)
  At this point ``arr + 3`` does not work.
  
  >>> arr + 3
-TypeError: unsupported operand type(s) for *: 'DiagonalArray' and 'int'
+Traceback (most recent call last):
+...
+TypeError: unsupported operand type(s) for +: 'DiagonalArray' and 'int'
  
  To support it, we need to define the Python interfaces ``__add__``, ``__lt__``,
  and so on to dispatch to the corresponding ufunc. We can achieve this
@@ -193,14 +195,14 @@ functions to our custom variants.
  ...             return self.__class__(N, ufunc(*scalars, **kwargs))
  ...         else:
  ...             return NotImplemented
-...    def __array_function__(self, func, types, args, kwargs):
-...        if func not in HANDLED_FUNCTIONS:
-...            return NotImplemented
-...        # Note: this allows subclasses that don't override
-...        # __array_function__ to handle DiagonalArray objects.
-...        if not all(issubclass(t, self.__class__) for t in types):
-...            return NotImplemented
-...        return HANDLED_FUNCTIONS[func](*args, **kwargs)
+...     def __array_function__(self, func, types, args, kwargs):
+...         if func not in HANDLED_FUNCTIONS:
+...             return NotImplemented
+...         # Note: this allows subclasses that don't override
+...         # __array_function__ to handle DiagonalArray objects.
+...         if not all(issubclass(t, self.__class__) for t in types):
+...             return NotImplemented
+...         return HANDLED_FUNCTIONS[func](*args, **kwargs)
  ...
  
  A convenient pattern is to define a decorator ``implements`` that can be used
@@ -241,14 +243,19 @@ this operation is not supported. For example, concatenating two
  supported.
  
  >>> np.concatenate([arr, arr])
+Traceback (most recent call last):
+...
  TypeError: no implementation found for 'numpy.concatenate' on types that implement __array_function__: [<class '__main__.DiagonalArray'>]
  
  Additionally, our implementations of ``sum`` and ``mean`` do not accept the
  optional arguments that numpy's implementation does.
  
  >>> np.sum(arr, axis=0)
+Traceback (most recent call last):
+...
  TypeError: sum() got an unexpected keyword argument 'axis'
  
+
  The user always has the option of converting to a normal ``numpy.ndarray`` with
  :func:`numpy.asarray` and using standard numpy from there.
  
diff --git a/doc/source/user/basics.indexing.rst b/doc/source/user/basics.indexing.rst

index 264c3d721f4f471977be763b6fa503420a1f83d3..334047f9c7c820b31f1128777cf6583994958230 100644 (file)
--- a/doc/source/user/basics.indexing.rst
+++ b/doc/source/user/basics.indexing.rst
@@ -1,3 +1,6 @@
+.. for doctest:
+    >>> import numpy as np
+  
  .. _basics.indexing:
  
  ****************************************
@@ -28,6 +31,7 @@ Note that in Python, ``x[(exp1, exp2, ..., expN)]`` is equivalent to
  ``x[exp1, exp2, ..., expN]``; the latter is just syntactic sugar
  for the former.
  
+.. _basic-indexing:
  
  Basic indexing
  --------------
@@ -88,6 +92,7 @@ that is subsequently indexed by 2.
      rapidly changing location in memory. This difference represents a
      great potential for confusion.
  
+.. _slicing-and-striding:
  
  Slicing and striding
  ^^^^^^^^^^^^^^^^^^^^
@@ -99,14 +104,6 @@ integer, or a tuple of slice objects and integers. :py:data:`Ellipsis`
  and :const:`newaxis` objects can be interspersed with these as
  well.
  
-.. deprecated:: 1.15.0
-
-  In order to remain backward compatible with a common usage in
-  Numeric, basic slicing is also initiated if the selection object is
-  any non-ndarray and non-tuple sequence (such as a :class:`list`) containing
-  :class:`slice` objects, the :py:data:`Ellipsis` object, or the :const:`newaxis`
-  object, but not for integer arrays or other embedded sequences.
-
  .. index::
     triple: ndarray; special methods; getitem
     triple: ndarray; special methods; setitem
@@ -226,6 +223,7 @@ concepts to remember include:
  .. index::
     pair: ndarray; view
  
+.. _dimensional-indexing-tools:
  
  Dimensional indexing tools
  ^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -325,6 +323,8 @@ If the index values are out of bounds then an ``IndexError`` is thrown::
      array([[3, 4],
            [5, 6]])
      >>> x[np.array([3, 4])]
+    Traceback (most recent call last):
+      ...
      IndexError: index 3 is out of bounds for axis 0 with size 3
  
  When the index consists of as many integer arrays as dimensions of the array
@@ -368,6 +368,8 @@ broadcast them to the same shape. If they cannot be broadcast to the same
  shape, an exception is raised::
  
      >>> y[np.array([0, 2, 4]), np.array([0, 1])]
+    Traceback (most recent call last):
+      ...
      IndexError: shape mismatch: indexing arrays could not be broadcast
      together with shapes (3,) (2,)
  
@@ -470,6 +472,7 @@ such an array with an image with shape (ny, nx) with dtype=np.uint8
  lookup table) will result in an array of shape (ny, nx, 3) where a
  triple of RGB values is associated with each pixel location.
  
+.. _boolean-indexing:
  
  Boolean array indexing
  ^^^^^^^^^^^^^^^^^^^^^^
@@ -502,7 +505,7 @@ Or wish to add a constant to all negative elements::
      >>> x = np.array([1., -1., -2., 3])
      >>> x[x < 0] += 20
      >>> x
-    array([1., 19., 18., 3.])
+    array([ 1., 19., 18., 3.])
  
  In general if an index includes a Boolean array, the result will be
  identical to inserting ``obj.nonzero()`` into the same position
@@ -786,6 +789,8 @@ exceptions (assigning complex to floats or ints): ::
   >>> x[1]
   1
   >>> x[1] = 1.2j
+ Traceback (most recent call last):
+   ...
   TypeError: can't convert complex to int
  
  
@@ -851,7 +856,7 @@ For this reason, it is possible to use the output from the
  :meth:`np.nonzero() <ndarray.nonzero>` function directly as an index since
  it always returns a tuple of index arrays.
  
-Because the special treatment of tuples, they are not automatically
+Because of the special treatment of tuples, they are not automatically
  converted to an array as a list would be. As an example: ::
  
   >>> z[[1, 1, 1, 1]]  # produces a large array
diff --git a/doc/source/user/basics.interoperability.rst b/doc/source/user/basics.interoperability.rst

new file mode 100644 (file)

index 0000000..11fe18a
--- /dev/null
+++ b/doc/source/user/basics.interoperability.rst
@@ -0,0 +1,524 @@
+
+.. _basics.interoperability:
+
+***************************
+Interoperability with NumPy
+***************************
+
+NumPy's ndarray objects provide both a high-level API for operations on
+array-structured data and a concrete implementation of the API based on
+:ref:`strided in-RAM storage <arrays>`. While this API is powerful and fairly
+general, its concrete implementation has limitations. As datasets grow and NumPy
+becomes used in a variety of new environments and architectures, there are cases
+where the strided in-RAM storage strategy is inappropriate, which has caused
+different libraries to reimplement this API for their own uses. This includes
+GPU arrays (CuPy_), sparse arrays (`scipy.sparse`, `PyData/Sparse <Sparse_>`_)
+and parallel arrays (Dask_ arrays) as well as various NumPy-like implementations
+in deep learning frameworks, like TensorFlow_ and PyTorch_. Similarly, many
+projects build on top of the NumPy API to add functionality such as labeled and
+indexed arrays (XArray_), automatic differentiation (JAX_), masked arrays
+(`numpy.ma`), and physical units (astropy.units_, pint_, unyt_), among others.
+
+Yet, users still want to work with these arrays using the familiar NumPy API and
+re-use existing code with minimal (ideally zero) porting overhead. With this
+goal in mind, various protocols are defined for implementations of
+multi-dimensional arrays with high-level APIs matching NumPy.
+
+Broadly speaking, there are three groups of features used for interoperability
+with NumPy:
+
+1. Methods of turning a foreign object into an ndarray;
+2. Methods of deferring execution from a NumPy function to another array
+   library;
+3. Methods that use NumPy functions and return an instance of a foreign object.
+
+We describe these features below.
+
+
+1. Using arbitrary objects in NumPy
+-----------------------------------
+
+The first set of interoperability features from the NumPy API allows foreign
+objects to be treated as NumPy arrays whenever possible. When NumPy functions
+encounter a foreign object, they will try (in order):
+
+1. The buffer protocol, described :py:doc:`in the Python C-API documentation
+   <c-api/buffer>`.
+2. The ``__array_interface__`` protocol, described
+   :ref:`in this page <arrays.interface>`. A precursor to Python's buffer
+   protocol, it defines a way to access the contents of a NumPy array from other
+   C extensions.
+3. The ``__array__()`` method, which asks an arbitrary object to convert
+   itself into an array.
+
+For both the buffer and the ``__array_interface__`` protocols, the object
+describes its memory layout and NumPy does everything else (zero-copy if
+possible). If that's not possible, the object itself is responsible for
+returning an ``ndarray`` from ``__array__()``.
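
The last fallback, ``__array__()``, can be sketched as below. This is an annotation rather than part of the upstream patch. The ``Samples`` class is hypothetical, and the ``copy`` parameter is included only on the assumption that newer NumPy releases may pass it.

```python
import numpy as np


class Samples:
    """Hypothetical container that is not an ndarray subclass."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # Build a fresh ndarray on request; this always copies the list data.
        return np.array(self._values, dtype=dtype)


samples = Samples([1.0, 2.0, 3.0])
as_array = np.asarray(samples)  # conversion goes through __array__()
total = np.sum(samples)         # NumPy functions convert implicitly
print(as_array, total)
```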
+
+:doc:`DLPack <dlpack:index>` is yet another protocol to convert foreign objects
+to NumPy arrays in a language and device agnostic manner. NumPy doesn't implicitly
+convert objects to ndarrays using DLPack. It provides the function
+`numpy.from_dlpack` that accepts any object implementing the ``__dlpack__`` method
+and outputs a NumPy ndarray (which is generally a view of the input object's data
+buffer). The :ref:`dlpack:python-spec` page explains the ``__dlpack__`` protocol
+in detail.
+
+The array interface protocol
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The :ref:`array interface protocol <arrays.interface>` defines a way for
+array-like objects to re-use each other's data buffers. Its implementation
+relies on the existence of the following attributes or methods:
+
+-  ``__array_interface__``: a Python dictionary containing the shape, the
+   element type, and optionally, the data buffer address and the strides of an
+   array-like object;
+-  ``__array__()``: a method returning the NumPy ndarray view of an array-like
+   object;
+
+The ``__array_interface__`` attribute can be inspected directly:
+
+ >>> import numpy as np
+ >>> x = np.array([1, 2, 5.0, 8])
+ >>> x.__array_interface__
+ {'data': (94708397920832, False), 'strides': None, 'descr': [('', '<f8')], 'typestr': '<f8', 'shape': (4,), 'version': 3}
+
+The ``__array_interface__`` attribute can also be used to manipulate the object
+data in place:
+
+ >>> class wrapper():
+ ...     pass
+ ...
+ >>> arr = np.array([1, 2, 3, 4])
+ >>> buf = arr.__array_interface__
+ >>> buf
+ {'data': (140497590272032, False), 'strides': None, 'descr': [('', '<i8')], 'typestr': '<i8', 'shape': (4,), 'version': 3}
+ >>> buf['shape'] = (2, 2)
+ >>> w = wrapper()
+ >>> w.__array_interface__ = buf
+ >>> new_arr = np.array(w, copy=False)
+ >>> new_arr
+ array([[1, 2],
+        [3, 4]])
+
+We can check that ``arr`` and ``new_arr`` share the same data buffer:
+
+ >>> new_arr[0, 0] = 1000
+ >>> new_arr
+ array([[1000,    2],
+        [   3,    4]])
+ >>> arr
+ array([1000,    2,    3,    4])
+
+
+The ``__array__()`` method
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``__array__()`` method ensures that any NumPy-like object (an array, any
+object exposing the array interface, an object whose ``__array__()`` method
+returns an array or any nested sequence) that implements it can be used as a
+NumPy array. If possible, this will mean using ``__array__()`` to create a NumPy
+ndarray view of the array-like object. Otherwise, this copies the data into a
+new ndarray object. This is not optimal, as coercing arrays into ndarrays may
+cause performance problems or create the need for copies and loss of metadata,
+because the original object and any attributes or behavior it had are lost.
+
+To see an example of a custom array implementation including the use of
+``__array__()``, see :ref:`basics.dispatch`.
+
+The DLPack Protocol
+~~~~~~~~~~~~~~~~~~~
+
+The :doc:`DLPack <dlpack:index>` protocol defines a memory-layout of
+strided n-dimensional array objects. It offers the following syntax
+for data exchange:
+
+1. A `numpy.from_dlpack` function, which accepts (array) objects with a
+   ``__dlpack__`` method and uses that method to construct a new array
+   containing the data from the input object.
+2. ``__dlpack__(self, stream=None)`` and ``__dlpack_device__`` methods on the
+   array object, which will be called from within ``from_dlpack``, to query
+   what device the array is on (may be needed to pass in the correct
+   stream, e.g. in the case of multiple GPUs) and to access the data.
+
+Unlike the buffer protocol, DLPack allows exchanging arrays containing data on
+devices other than the CPU (e.g. Vulkan or GPU). Since NumPy only supports CPU,
+it can only convert objects whose data exists on the CPU. But other libraries,
+like PyTorch_ and CuPy_, may exchange data on GPU using this protocol.
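
A CPU-only sketch of the exchange, added as an annotation rather than as part of the upstream patch. It uses a NumPy array as the producer because ``ndarray`` itself implements ``__dlpack__``. Any other CPU object with that method should work the same way, although that is not shown here.

```python
import numpy as np

source = np.arange(4)

# from_dlpack() calls source.__dlpack__() and wraps the exported buffer.
imported = np.from_dlpack(source)

# The result is a view of the same memory, not a copy.
shares = np.shares_memory(source, imported)
source[0] = 10
print(shares, imported)
```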
+
+
+2. Operating on foreign objects without converting
+--------------------------------------------------
+
+A second set of methods defined by the NumPy API allows us to defer the
+execution from a NumPy function to another array library.
+
+Consider the following function.
+
+ >>> import numpy as np
+ >>> def f(x):
+ ...     return np.mean(np.exp(x))
+
+Note that `np.exp <numpy.exp>` is a :ref:`ufunc <ufuncs-basics>`, which means
+that it operates on ndarrays in an element-by-element fashion. On the other
+hand, `np.mean <numpy.mean>` operates along one of the array's axes.
+
+We can apply ``f`` to a NumPy ndarray object directly:
+
+ >>> x = np.array([1, 2, 3, 4])
+ >>> f(x)
+ 21.1977562209304
+
+We would like this function to work equally well with any NumPy-like array
+object.
+
+NumPy allows a class to indicate that it would like to handle computations in a
+custom-defined way through the following interfaces:
+
+-  ``__array_ufunc__``: allows third-party objects to support and override
+   :ref:`ufuncs <ufuncs-basics>`.
+-  ``__array_function__``: a catch-all for NumPy functionality that is not
+   covered by the ``__array_ufunc__`` protocol for universal functions.
+
+As long as foreign objects implement the ``__array_ufunc__`` or
+``__array_function__`` protocols, it is possible to operate on them without the
+need for explicit conversion.
+
+The ``__array_ufunc__`` protocol
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A :ref:`universal function (or ufunc for short) <ufuncs-basics>` is a
+“vectorized” wrapper for a function that takes a fixed number of specific inputs
+and produces a fixed number of specific outputs. The output of the ufunc (and
+its methods) is not necessarily an ndarray if not all input arguments are
+ndarrays. Indeed, if any input defines an ``__array_ufunc__`` method, control
+will be passed completely to that method, i.e., the ufunc is overridden. The
+``__array_ufunc__`` method defined on that (non-ndarray) object has access to
+the NumPy ufunc. Because ufuncs have a well-defined structure, the foreign
+``__array_ufunc__`` method may rely on ufunc methods like ``.at()``,
+``.reduce()``, and others.
+
+A subclass can override what happens when executing NumPy ufuncs on it by
+overriding the default ``ndarray.__array_ufunc__`` method. This method is
+executed instead of the ufunc and should return either the result of the
+operation, or ``NotImplemented`` if the operation requested is not implemented.
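
A minimal sketch of such an override on a non-ndarray class, added as an annotation rather than as part of the upstream patch. The ``Wrapped`` class is hypothetical and handles only plain ufunc calls. A fuller implementation would also deal with ``out=`` and with methods such as ``reduce``.

```python
import numpy as np


class Wrapped:
    """Hypothetical container that keeps its type through ufunc calls."""

    def __init__(self, values):
        self.values = np.asarray(values)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            # Defer for ufunc.reduce, ufunc.at, and similar methods.
            return NotImplemented
        # Unwrap our own instances, run the real ufunc, then re-wrap.
        unwrapped = [x.values if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*unwrapped, **kwargs))


w = Wrapped([0.0, 1.0])
exp_result = np.exp(w)     # dispatched to Wrapped.__array_ufunc__
sum_result = np.add(w, 1)  # mixed inputs are dispatched as well
print(type(exp_result).__name__, exp_result.values, sum_result.values)
```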
+
+The ``__array_function__`` protocol
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To achieve enough coverage of the NumPy API to support downstream projects,
+there is a need to go beyond ``__array_ufunc__`` and implement a protocol that
+allows arguments of a NumPy function to take control and divert execution to
+another function (for example, a GPU or parallel implementation) in a way that
+is safe and consistent across projects.
+
+The semantics of ``__array_function__`` are very similar to ``__array_ufunc__``,
+except the operation is specified by an arbitrary callable object rather than a
+ufunc instance and method. For more details, see :ref:`NEP18`.
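+
+A minimal sketch of this protocol, loosely following the pattern shown in
+:ref:`NEP18` (the ``Tagged`` class, the ``HANDLED`` registry, and
+``tagged_sum`` are hypothetical names chosen for illustration):
+
+ >>> import numpy as np
+ >>> HANDLED = {}  # maps NumPy functions to our own implementations
+ >>> class Tagged:
+ ...     def __init__(self, data):
+ ...         self.data = np.asarray(data)
+ ...     def __array_function__(self, func, types, args, kwargs):
+ ...         if func not in HANDLED:
+ ...             return NotImplemented
+ ...         return HANDLED[func](*args, **kwargs)
+ >>> def tagged_sum(arr, *args, **kwargs):
+ ...     # Delegate to NumPy on the wrapped data.
+ ...     return np.sum(arr.data, *args, **kwargs)
+ >>> HANDLED[np.sum] = tagged_sum
+ >>> int(np.sum(Tagged([1, 2, 3])))
+ 6
+
+Functions that are not registered in ``HANDLED`` return ``NotImplemented``,
+which NumPy turns into a ``TypeError`` when no other argument handles the call.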
+
+
+3. Returning foreign objects
+----------------------------
+
+A third type of feature set is meant to use the NumPy function implementation
+and then convert the return value back into an instance of the foreign object.
+The ``__array_finalize__`` and ``__array_wrap__`` methods act behind the scenes
+to ensure that the return type of a NumPy function can be specified as needed.
+
+The ``__array_finalize__`` method is the mechanism that NumPy provides to allow
+subclasses to handle the various ways that new instances get created. This
+method is called whenever the system internally allocates a new array from an
+object which is a subclass (subtype) of the ndarray. It can be used to change
+attributes after construction, or to update meta-information from the “parent.”
+
+The ``__array_wrap__`` method “wraps up the action” in the sense of allowing any
+object (such as user-defined functions) to set the type of its return value and
+update attributes and metadata. This can be seen as the opposite of the
+``__array__`` method. For objects that implement ``__array_wrap__``, this
+method is called at the end of every ufunc on the input object with the highest
+*array priority*, or the output object if one was specified. The
+``__array_priority__`` attribute is used to determine what type of object to
+return in situations where there is more than one possibility for the Python
+type of the returned object. For example, subclasses may opt to use this method
+to transform the output array into an instance of the subclass and update
+metadata before returning the array to the user.
+
+For more information on these methods, see :ref:`basics.subclassing` and
+:ref:`specific-array-subtyping`.
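+
+A minimal sketch of ``__array_finalize__`` on a hypothetical ``InfoArray``
+subclass (not part of NumPy), showing an attribute being carried over to a
+slice and a ufunc result coming back as the subclass:
+
+ >>> import numpy as np
+ >>> class InfoArray(np.ndarray):
+ ...     def __array_finalize__(self, obj):
+ ...         # ``obj`` is the array this instance was created from, or None.
+ ...         self.info = getattr(obj, "info", None)
+ >>> arr = np.arange(3).view(InfoArray)
+ >>> arr.info = "metres"
+ >>> arr[1:].info
+ 'metres'
+ >>> type(np.sqrt(arr)).__name__
+ 'InfoArray'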
+
+
+Interoperability examples
+-------------------------
+
+Example: Pandas ``Series`` objects
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the following:
+
+ >>> import pandas as pd
+ >>> ser = pd.Series([1, 2, 3, 4])
+ >>> type(ser)
+ pandas.core.series.Series
+
+Now, ``ser`` is **not** an ndarray, but because it
+`implements the __array_ufunc__ protocol
+<https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe-interoperability-with-numpy-functions>`__,
+we can apply ufuncs to it as if it were an ndarray:
+
+ >>> np.exp(ser)
+    0     2.718282
+    1     7.389056
+    2    20.085537
+    3    54.598150
+    dtype: float64
+ >>> np.sin(ser)
+    0    0.841471
+    1    0.909297
+    2    0.141120
+    3   -0.756802
+    dtype: float64
+
+We can even do operations with other ndarrays:
+
+ >>> np.add(ser, np.array([5, 6, 7, 8]))
+    0     6
+    1     8
+    2    10
+    3    12
+    dtype: int64
+ >>> f(ser)
+ 21.1977562209304
+ >>> result = ser.__array__()
+ >>> type(result)
+ numpy.ndarray
+
+
+Example: PyTorch tensors
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+`PyTorch <https://pytorch.org/>`__ is an optimized tensor library for deep
+learning using GPUs and CPUs. PyTorch arrays are commonly called *tensors*.
+Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or
+other hardware accelerators. In fact, tensors and NumPy arrays can often share
+the same underlying memory, eliminating the need to copy data.
+
+ >>> import torch
+ >>> data = [[1, 2],[3, 4]]
+ >>> x_np = np.array(data)
+ >>> x_tensor = torch.tensor(data)
+
+Note that ``x_np`` and ``x_tensor`` are different kinds of objects:
+
+ >>> x_np
+ array([[1, 2],
+        [3, 4]])
+ >>> x_tensor
+ tensor([[1, 2],
+         [3, 4]])
+
+However, we can treat PyTorch tensors as NumPy arrays without the need for
+explicit conversion:
+
+ >>> np.exp(x_tensor)
+ tensor([[ 2.7183,  7.3891],
+         [20.0855, 54.5982]], dtype=torch.float64)
+
+Also, note that the return type of this function is compatible with the initial
+data type.
+
+.. admonition:: Warning
+
+   While this mixing of ndarrays and tensors may be convenient, it is not
+   recommended. It will not work for non-CPU tensors, and will have unexpected
+   behavior in corner cases. Users should prefer explicitly converting the
+   ndarray to a tensor.
+
+.. note::
+
+   PyTorch does not implement ``__array_function__`` or ``__array_ufunc__``.
+   Under the hood, the ``Tensor.__array__()`` method returns a NumPy ndarray as
+   a view of the tensor data buffer. See `this issue
+   <https://github.com/pytorch/pytorch/issues/24015>`__ and the
+   `__torch_function__ implementation
+   <https://github.com/pytorch/pytorch/blob/master/torch/overrides.py>`__
+   for details.
+
+Note also that we can see ``__array_wrap__`` in action here, even though
+``torch.Tensor`` is not a subclass of ndarray::
+
+   >>> import torch
+   >>> t = torch.arange(4)
+   >>> np.abs(t)
+   tensor([0, 1, 2, 3])
+
+PyTorch implements ``__array_wrap__`` to be able to get tensors back from NumPy
+functions, and we can modify it directly to control which type of objects are
+returned from these functions.
+
+Example: CuPy arrays
+~~~~~~~~~~~~~~~~~~~~
+
+CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing
+with Python. CuPy implements a subset of the NumPy interface by implementing
+``cupy.ndarray``, `a counterpart to NumPy ndarrays
+<https://docs.cupy.dev/en/stable/reference/ndarray.html>`__.
+
+ >>> import cupy as cp
+ >>> x_gpu = cp.array([1, 2, 3, 4])
+
+The ``cupy.ndarray`` object implements the ``__array_ufunc__`` interface. This
+enables NumPy ufuncs to be applied to CuPy arrays (this will defer operation to
+the matching CuPy CUDA/ROCm implementation of the ufunc):
+
+ >>> np.mean(np.exp(x_gpu))
+ array(21.19775622)
+
+Note that the return type of these operations is still consistent with the
+initial type:
+
+ >>> arr = cp.random.randn(1, 2, 3, 4).astype(cp.float32)
+ >>> result = np.sum(arr)
+ >>> print(type(result))
+ <class 'cupy._core.core.ndarray'>
+
+See `this page in the CuPy documentation for details
+<https://docs.cupy.dev/en/stable/reference/ufunc.html>`__.
+
+``cupy.ndarray`` also implements the ``__array_function__`` interface, meaning
+it is possible to do operations such as
+
+ >>> a = np.random.randn(100, 100)
+ >>> a_gpu = cp.asarray(a)
+ >>> qr_gpu = np.linalg.qr(a_gpu)
+
+CuPy implements many NumPy functions on ``cupy.ndarray`` objects, but not all.
+See `the CuPy documentation
+<https://docs.cupy.dev/en/stable/user_guide/difference.html>`__
+for details.
+
+Example: Dask arrays
+~~~~~~~~~~~~~~~~~~~~
+
+Dask is a flexible library for parallel computing in Python. Dask Array
+implements a subset of the NumPy ndarray interface using blocked algorithms,
+cutting up the large array into many small arrays. This allows computations on
+larger-than-memory arrays using multiple cores.
+
+Dask supports ``__array__()`` and ``__array_ufunc__``.
+
+ >>> import dask.array as da
+ >>> x = da.random.normal(1, 0.1, size=(20, 20), chunks=(10, 10))
+ >>> np.mean(np.exp(x))
+ dask.array<mean_agg-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>
+ >>> np.mean(np.exp(x)).compute()
+ 5.090097550553843
+
+.. note::
+
+   Dask is lazily evaluated, and the result from a computation isn't computed
+   until you ask for it by invoking ``compute()``.
+
+See `the Dask array documentation
+<https://docs.dask.org/en/stable/array.html>`__
+and the `scope of Dask arrays interoperability with NumPy arrays
+<https://docs.dask.org/en/stable/array.html#scope>`__ for details.
+
+Example: DLPack
+~~~~~~~~~~~~~~~
+
+Several Python data science libraries implement the ``__dlpack__`` protocol.
+Among them are PyTorch_ and CuPy_. A full list of libraries that implement
+this protocol can be found on
+:doc:`this page of DLPack documentation <dlpack:index>`.
+
+Convert a PyTorch CPU tensor to NumPy array:
+
+ >>> import torch
+ >>> x_torch = torch.arange(5)
+ >>> x_torch
+ tensor([0, 1, 2, 3, 4])
+ >>> x_np = np.from_dlpack(x_torch)
+ >>> x_np
+ array([0, 1, 2, 3, 4])
+ >>> # note that x_np is a view of x_torch
+ >>> x_torch[1] = 100
+ >>> x_torch
+ tensor([  0, 100,   2,   3,   4])
+ >>> x_np
+ array([  0, 100,   2,   3,   4])
+
+The imported arrays are read-only so writing or operating in-place will fail:
+
+ >>> x_np.flags.writeable
+ False
+ >>> x_np[1] = 1
+ Traceback (most recent call last):
+   File "<stdin>", line 1, in <module>
+ ValueError: assignment destination is read-only
+
+A copy must be created in order to operate on the imported arrays in-place, but
+copying duplicates the memory, so avoid it for very large arrays:
+
+ >>> x_np_copy = x_np.copy()
+ >>> x_np_copy.sort()  # works
+
+.. note::
+
+  GPU tensors can't be converted to NumPy arrays since NumPy doesn't support
+  GPU devices:
+
+   >>> x_torch = torch.arange(5, device='cuda')
+   >>> np.from_dlpack(x_torch)
+   Traceback (most recent call last):
+     File "<stdin>", line 1, in <module>
+   RuntimeError: Unsupported device in DLTensor.
+
+  But, if both libraries support the device the data buffer is on, it is
+  possible to use the ``__dlpack__`` protocol (e.g. PyTorch_ and CuPy_):
+
+   >>> x_torch = torch.arange(5, device='cuda')
+   >>> x_cupy = cp.from_dlpack(x_torch)
+
+Similarly, a NumPy array can be converted to a PyTorch tensor:
+
+ >>> x_np = np.arange(5)
+ >>> x_torch = torch.from_dlpack(x_np)
+
+Read-only arrays cannot be exported:
+
+ >>> x_np = np.arange(5)
+ >>> x_np.flags.writeable = False
+ >>> torch.from_dlpack(x_np)  # doctest: +ELLIPSIS
+ Traceback (most recent call last):
+   File "<stdin>", line 1, in <module>
+   File ".../site-packages/torch/utils/dlpack.py", line 63, in from_dlpack
+     dlpack = ext_tensor.__dlpack__()
+ TypeError: NumPy currently only supports dlpack for writeable arrays
+
+Further reading
+---------------
+
+-  :ref:`arrays.interface`
+-  :ref:`basics.dispatch`
+-  :ref:`special-attributes-and-methods` (details on the ``__array_ufunc__`` and
+   ``__array_function__`` protocols)
+-  :ref:`basics.subclassing` (details on the ``__array_wrap__`` and
+   ``__array_finalize__`` methods)
+-  :ref:`specific-array-subtyping` (more details on the implementation of
+   ``__array_finalize__``, ``__array_wrap__`` and ``__array_priority__``)
+-  :doc:`NumPy roadmap: interoperability <neps:roadmap>`
+-  `PyTorch documentation on the Bridge with NumPy
+   <https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#bridge-to-np-label>`__
+
+.. _CuPy: https://cupy.dev/
+.. _Sparse: https://sparse.pydata.org/
+.. _Dask: https://docs.dask.org/
+.. _TensorFlow: https://www.tensorflow.org/
+.. _PyTorch: https://pytorch.org/
+.. _XArray: http://xarray.pydata.org/
+.. _JAX: https://jax.readthedocs.io/
+.. _astropy.units: https://docs.astropy.org/en/stable/units/
+.. _pint: https://pint.readthedocs.io/
+.. _unyt: https://unyt.readthedocs.io/
diff --git a/doc/source/user/basics.io.genfromtxt.rst b/doc/source/user/basics.io.genfromtxt.rst

index 8fe7565aa730231be770e5bf9c8218c254a55e72..a9c521fa378ddecdbaf43408af7074ff1108eb09 100644 (file)
--- a/doc/source/user/basics.io.genfromtxt.rst
+++ b/doc/source/user/basics.io.genfromtxt.rst
@@ -60,8 +60,8 @@ example, comma-separated files (CSV) use a comma (``,``) or a semicolon
  
     >>> data = u"1, 2, 3\n4, 5, 6"
     >>> np.genfromtxt(StringIO(data), delimiter=",")
-   array([[ 1.,  2.,  3.],
-          [ 4.,  5.,  6.]])
+   array([[1.,  2.,  3.],
+          [4.,  5.,  6.]])
  
  Another common separator is ``"\t"``, the tabulation character.  However,
  we are not limited to a single character, any string will do.  By default,
@@ -76,14 +76,14 @@ size) or to a sequence of integers (if columns can have different sizes)::
  
     >>> data = u"  1  2  3\n  4  5 67\n890123  4"
     >>> np.genfromtxt(StringIO(data), delimiter=3)
-   array([[   1.,    2.,    3.],
-          [   4.,    5.,   67.],
-          [ 890.,  123.,    4.]])
+   array([[  1.,    2.,    3.],
+          [  4.,    5.,   67.],
+          [890.,  123.,    4.]])
     >>> data = u"123456789\n   4  7 9\n   4567 9"
     >>> np.genfromtxt(StringIO(data), delimiter=(4, 3, 2))
-   array([[ 1234.,   567.,    89.],
-          [    4.,     7.,     9.],
-          [    4.,   567.,     9.]])
+   array([[1234.,   567.,    89.],
+          [   4.,     7.,     9.],
+          [   4.,   567.,     9.]])
  
  
  The ``autostrip`` argument
@@ -156,10 +156,10 @@ using the ``skip_footer`` attribute and giving it a value of ``n``::
  
     >>> data = u"\n".join(str(i) for i in range(10))
     >>> np.genfromtxt(StringIO(data),)
-   array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
+   array([0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
     >>> np.genfromtxt(StringIO(data),
     ...               skip_header=3, skip_footer=5)
-   array([ 3.,  4.])
+   array([3.,  4.])
  
  By default, ``skip_header=0`` and ``skip_footer=0``, meaning that no lines
  are skipped.
@@ -180,8 +180,8 @@ can use ``usecols=(0, -1)``::
  
     >>> data = u"1 2 3\n4 5 6"
     >>> np.genfromtxt(StringIO(data), usecols=(0, -1))
-   array([[ 1.,  3.],
-          [ 4.,  6.]])
+   array([[1.,  3.],
+          [4.,  6.]])
  
  If the columns have names, we can also select which columns to import by
  giving their name to the ``usecols`` argument, either as a sequence
@@ -190,12 +190,10 @@ of strings or a comma-separated string::
     >>> data = u"1 2 3\n4 5 6"
     >>> np.genfromtxt(StringIO(data),
     ...               names="a, b, c", usecols=("a", "c"))
-   array([(1.0, 3.0), (4.0, 6.0)],
-         dtype=[('a', '<f8'), ('c', '<f8')])
+   array([(1., 3.), (4., 6.)], dtype=[('a', '<f8'), ('c', '<f8')])
     >>> np.genfromtxt(StringIO(data),
     ...               names="a, b, c", usecols=("a, c"))
-       array([(1.0, 3.0), (4.0, 6.0)],
-             dtype=[('a', '<f8'), ('c', '<f8')])
+       array([(1., 3.), (4., 6.)], dtype=[('a', '<f8'), ('c', '<f8')])
  
  
  
@@ -231,9 +229,7 @@ When ``dtype=None``, the type of each column is determined iteratively from
  its data.  We start by checking whether a string can be converted to a
  boolean (that is, if the string matches ``true`` or ``false`` in lower
  cases); then whether it can be converted to an integer, then to a float,
-then to a complex and eventually to a string.  This behavior may be changed
-by modifying the default mapper of the
-:class:`~numpy.lib._iotools.StringConverter` class.
+then to a complex and eventually to a string.
  
  The option ``dtype=None`` is provided for convenience.  However, it is
  significantly slower than setting the dtype explicitly.
@@ -260,7 +256,7 @@ sequence of strings or a comma-separated string::
  
     >>> data = StringIO("1 2 3\n 4 5 6")
     >>> np.genfromtxt(data, names="A, B, C")
-   array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
+   array([(1., 2., 3.), (4., 5., 6.)],
           dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
  
  In the example above, we used the fact that by default, ``dtype=float``.
@@ -274,7 +270,7 @@ that case, we must use the ``names`` keyword with a value of
  
     >>> data = StringIO("So it goes\n#a b c\n1 2 3\n 4 5 6")
     >>> np.genfromtxt(data, skip_header=1, names=True)
-   array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
+   array([(1., 2., 3.), (4., 5., 6.)],
           dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
  
  The default value of ``names`` is ``None``.  If we give any other
@@ -285,7 +281,7 @@ have defined with the dtype::
     >>> ndtype=[('a',int), ('b', float), ('c', int)]
     >>> names = ["A", "B", "C"]
     >>> np.genfromtxt(data, names=names, dtype=ndtype)
-   array([(1, 2.0, 3), (4, 5.0, 6)],
+   array([(1, 2., 3), (4, 5., 6)],
           dtype=[('A', '<i8'), ('B', '<f8'), ('C', '<i8')])
  
  
@@ -298,7 +294,7 @@ with the standard NumPy default of ``"f%i"``, yielding names like ``f0``,
  
     >>> data = StringIO("1 2 3\n 4 5 6")
     >>> np.genfromtxt(data, dtype=(int, float, int))
-   array([(1, 2.0, 3), (4, 5.0, 6)],
+   array([(1, 2., 3), (4, 5., 6)],
           dtype=[('f0', '<i8'), ('f1', '<f8'), ('f2', '<i8')])
  
  In the same way, if we don't give enough names to match the length of the
@@ -306,7 +302,7 @@ dtype, the missing names will be defined with this default template::
  
     >>> data = StringIO("1 2 3\n 4 5 6")
     >>> np.genfromtxt(data, dtype=(int, float, int), names="a")
-   array([(1, 2.0, 3), (4, 5.0, 6)],
+   array([(1, 2., 3), (4, 5., 6)],
           dtype=[('a', '<i8'), ('f0', '<f8'), ('f1', '<i8')])
  
  We can overwrite this default with the ``defaultfmt`` argument, that
@@ -314,7 +310,7 @@ takes any format string::
  
     >>> data = StringIO("1 2 3\n 4 5 6")
     >>> np.genfromtxt(data, dtype=(int, float, int), defaultfmt="var_%02i")
-   array([(1, 2.0, 3), (4, 5.0, 6)],
+   array([(1, 2., 3), (4, 5., 6)],
           dtype=[('var_00', '<i8'), ('var_01', '<f8'), ('var_02', '<i8')])
  
  .. note::
@@ -390,7 +386,7 @@ and ``' 78.9%'`` cannot be converted to float and we end up having
     >>> # Converted case ...
     >>> np.genfromtxt(StringIO(data), delimiter=",", names=names,
     ...               converters={1: convertfunc})
-   array([(1.0, 0.023, 45.0), (6.0, 0.78900000000000003, 0.0)],
+   array([(1., 0.023, 45.), (6., 0.789, 0.)],
           dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
  
  The same results can be obtained by using the name of the second column
@@ -399,7 +395,7 @@ The same results can be obtained by using the name of the second column
     >>> # Using a name for the converter ...
     >>> np.genfromtxt(StringIO(data), delimiter=",", names=names,
     ...               converters={"p": convertfunc})
-   array([(1.0, 0.023, 45.0), (6.0, 0.78900000000000003, 0.0)],
+   array([(1., 0.023, 45.), (6., 0.789, 0.)],
           dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
  
  
@@ -514,15 +510,15 @@ output array will then be a :class:`~numpy.ma.MaskedArray`.
  Shortcut functions
  ==================
  
-In addition to :func:`~numpy.genfromtxt`, the :mod:`numpy.lib.npyio` module
+In addition to :func:`~numpy.genfromtxt`, the ``numpy.lib.npyio`` module
  provides several convenience functions derived from
  :func:`~numpy.genfromtxt`.  These functions work the same way as the
  original, but they have different default values.
  
-:func:`~numpy.npyio.recfromtxt`
+``numpy.lib.npyio.recfromtxt``
     Returns a standard :class:`numpy.recarray` (if ``usemask=False``) or a
-   :class:`~numpy.ma.mrecords.MaskedRecords` array (if ``usemaske=True``).  The
+   ``numpy.ma.mrecords.MaskedRecords`` array (if ``usemask=True``).  The
     default dtype is ``dtype=None``, meaning that the types of each column
     will be automatically determined.
-:func:`~numpy.npyio.recfromcsv`
-   Like :func:`~numpy.npyio.recfromtxt`, but with a default ``delimiter=","``.
+``numpy.lib.npyio.recfromcsv``
+   Like ``numpy.lib.npyio.recfromtxt``, but with a default ``delimiter=","``.
diff --git a/doc/source/user/basics.rec.rst b/doc/source/user/basics.rec.rst

index 1e6f30506c7a3e09ec6756243fcf35b099c9fe5f..b3c7f9e4a4606135f7dc2ff5ccd618b3735106dd 100644 (file)
--- a/doc/source/user/basics.rec.rst
+++ b/doc/source/user/basics.rec.rst
@@ -1,7 +1,7 @@
  .. _structured_arrays:
  
  *****************
-Structured arrays 
+Structured arrays
  *****************
  
  Introduction
@@ -15,7 +15,7 @@ datatypes organized as a sequence of named :term:`fields <field>`. For example,
   ...              dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
   >>> x
   array([('Rex', 9, 81.), ('Fido', 3, 27.)],
-       dtype=[('name', 'U10'), ('age', '<i4'), ('weight', '<f4')])
+       dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])
  
  Here ``x`` is a one-dimensional array of length two whose datatype is a
  structure with three fields: 1. A string of length 10 or less named 'name', 2.
@@ -24,7 +24,7 @@ a 32-bit integer named 'age', and 3. a 32-bit float named 'weight'.
  If you index ``x`` at position 1 you get a structure::
  
   >>> x[1]
- ('Fido', 3, 27.0)
+ ('Fido', 3, 27.)
  
  You can access and modify individual fields of a structured array by indexing
  with the field name::
@@ -34,7 +34,7 @@ with the field name::
   >>> x['age'] = 5
   >>> x
   array([('Rex', 5, 81.), ('Fido', 5, 27.)],
-       dtype=[('name', 'U10'), ('age', '<i4'), ('weight', '<f4')])
+       dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])
  
  Structured datatypes are designed to be able to mimic 'structs' in the C
  language, and share a similar memory layout. They are meant for interfacing with
@@ -146,16 +146,14 @@ summary they are:
  
  4.   A dictionary of field names
  
-     The use of this form of specification is discouraged, but documented here
-     because older numpy code may use it. The keys of the dictionary are the
-     field names and the values are tuples specifying type and offset::
+     The keys of the dictionary are the field names and the values are tuples
+     specifying type and offset::
  
        >>> np.dtype({'col1': ('i1', 0), 'col2': ('f4', 1)})
        dtype([('col1', 'i1'), ('col2', '<f4')])
  
-     This form is discouraged because Python dictionaries do not preserve order
-     in Python versions before Python 3.6, and the order of the fields in a
-     structured dtype has meaning. :ref:`Field Titles <titles>` may be
+     This form was discouraged because Python dictionaries did not preserve order
+     in Python versions before Python 3.6. :ref:`Field Titles <titles>` may be
       specified by using a 3-tuple, see below.
  
  Manipulating and Displaying Structured Datatypes
@@ -425,7 +423,7 @@ array, as follows::
   >>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])
   >>> a[['a', 'c']]
   array([(0, 0.), (0, 0.), (0, 0.)],
-      dtype={'names':['a','c'], 'formats':['<i4','<f4'], 'offsets':[0,8], 'itemsize':12})
+      dtype={'names': ['a', 'c'], 'formats': ['<i4', '<f4'], 'offsets': [0, 8], 'itemsize': 12})
  
  Assignment to the view modifies the original array. The view's fields will be
  in the order they were indexed. Note that unlike for single-field indexing, the
@@ -472,7 +470,7 @@ missing.
      Furthermore, numpy now provides a new function
      :func:`numpy.lib.recfunctions.structured_to_unstructured` which is a safer
      and more efficient alternative for users who wish to convert structured
-    arrays to unstructured arrays, as the view above is often indeded to do.
+    arrays to unstructured arrays, as the view above is often intended to do.
      This function allows safe conversion to an unstructured type taking into
      account padding, often avoids a copy, and also casts the datatypes
      as needed, unlike the view. Code such as:
@@ -485,7 +483,9 @@ missing.
  
       >>> from numpy.lib.recfunctions import structured_to_unstructured
       >>> structured_to_unstructured(b[['x', 'z']])
-     array([0, 0, 0])
+     array([[0., 0.],
+            [0., 0.],
+            [0., 0.]], dtype=float32)
  
  
  Assignment to an array with a multi-field index modifies the original array::
@@ -548,29 +548,79 @@ In order to prevent clobbering object pointers in fields of
  :class:`object` type, numpy currently does not allow views of structured
  arrays containing objects.
  
-Structure Comparison
---------------------
+.. _structured_dtype_comparison_and_promotion:
+
+Structure Comparison and Promotion
+----------------------------------
  
  If the dtypes of two void structured arrays are equal, testing the equality of
  the arrays will result in a boolean array with the dimensions of the original
  arrays, with elements set to ``True`` where all fields of the corresponding
-structures are equal. Structured dtypes are equal if the field names,
-dtypes and titles are the same, ignoring endianness, and the fields are in
-the same order::
+structures are equal::
  
- >>> a = np.zeros(2, dtype=[('a', 'i4'), ('b', 'i4')])
- >>> b = np.ones(2, dtype=[('a', 'i4'), ('b', 'i4')])
+ >>> a = np.array([(1, 1), (2, 2)], dtype=[('a', 'i4'), ('b', 'i4')])
+ >>> b = np.array([(1, 1), (2, 3)], dtype=[('a', 'i4'), ('b', 'i4')])
   >>> a == b
- array([False, False])
+ array([True, False])
  
-Currently, if the dtypes of two void structured arrays are not equivalent the
-comparison fails, returning the scalar value ``False``. This behavior is
-deprecated as of numpy 1.10 and will raise an error or perform elementwise
-comparison in the future.
+NumPy will promote individual field datatypes to perform the comparison.
+So the following is also valid (note the ``'f4'`` dtype for the ``'a'`` field):
+
+ >>> b = np.array([(1.0, 1), (2.5, 2)], dtype=[("a", "f4"), ("b", "i4")])
+ >>> a == b
+ array([True, False])
+
+To compare two structured arrays, it must be possible to promote them to a
+common dtype as returned by `numpy.result_type` and `np.promote_types`.
+This enforces that the number of fields, the field names, and the field titles
+must match precisely.
+When promotion is not possible, for example due to mismatching field names,
+NumPy will raise an error.
+Promotion between two structured dtypes results in a canonical dtype that
+ensures native byte-order for all fields::
+
+    >>> np.result_type(np.dtype("i,>i"))
+    dtype([('f0', '<i4'), ('f1', '<i4')])
+    >>> np.result_type(np.dtype("i,>i"), np.dtype("i,i"))
+    dtype([('f0', '<i4'), ('f1', '<i4')])
+
+The resulting dtype from promotion is also guaranteed to be packed, meaning
+that all fields are ordered contiguously and any unnecessary padding is
+removed::
+
+    >>> dt = np.dtype("i1,V3,i4,V1")[["f0", "f2"]]
+    >>> dt
+    dtype({'names':['f0','f2'], 'formats':['i1','<i4'], 'offsets':[0,4], 'itemsize':9})
+    >>> np.result_type(dt)
+    dtype([('f0', 'i1'), ('f2', '<i4')])
+
+Note that the result prints without ``offsets`` or ``itemsize`` indicating no
+additional padding.
+If a structured dtype is created with ``align=True`` ensuring that
+``dtype.isalignedstruct`` is true, this property is preserved::
+
+    >>> dt = np.dtype("i1,V3,i4,V1", align=True)[["f0", "f2"]]
+    >>> dt
+    dtype({'names':['f0','f2'], 'formats':['i1','<i4'], 'offsets':[0,4], 'itemsize':12}, align=True)
+    >>> np.result_type(dt)
+    dtype([('f0', 'i1'), ('f2', '<i4')], align=True)
+    >>> np.result_type(dt).isalignedstruct
+    True
+
+When promoting multiple dtypes, the result is aligned if any of the inputs is::
+
+    >>> np.result_type(np.dtype("i,i"), np.dtype("i,i", align=True))
+    dtype([('f0', '<i4'), ('f1', '<i4')], align=True)
  
  The ``<`` and ``>`` operators always return ``False`` when comparing void
  structured arrays, and arithmetic and bitwise operations are not supported.
  
+.. versionchanged:: 1.23
+    Before NumPy 1.23, a warning was given and ``False`` returned when
+    promotion to a common dtype failed.
+    Further, promotion was much more restrictive: It would reject the mixed
+    float/integer comparison example above.
+
  Record Arrays
  =============
  
@@ -579,17 +629,18 @@ As an optional convenience numpy provides an ndarray subclass,
  attribute instead of only by index.
  Record arrays use a special datatype, :class:`numpy.record`, that allows
  field access by attribute on the structured scalars obtained from the array.
-The :mod:`numpy.rec` module provides functions for creating recarrays from
+The ``numpy.rec`` module provides functions for creating recarrays from
  various objects.
  Additional helper functions for creating and manipulating structured arrays
  can be found in :mod:`numpy.lib.recfunctions`.
  
-The simplest way to create a record array is with ``numpy.rec.array``::
+The simplest way to create a record array is with
+:func:`numpy.rec.array <numpy.core.records.array>`::
  
   >>> recordarr = np.rec.array([(1, 2., 'Hello'), (2, 3., "World")],
   ...                    dtype=[('foo', 'i4'),('bar', 'f4'), ('baz', 'S10')])
   >>> recordarr.bar
- array([ 2.,  3.], dtype=float32)
+ array([2., 3.], dtype=float32)
   >>> recordarr[1:2]
   rec.array([(2, 3., b'World')],
         dtype=[('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')])
@@ -600,14 +651,14 @@ The simplest way to create a record array is with ``numpy.rec.array``::
   >>> recordarr[1].baz
   b'World'
  
-:func:`numpy.rec.array` can convert a wide variety of arguments into record
-arrays, including structured arrays::
+:func:`numpy.rec.array <numpy.core.records.array>` can convert a wide variety
+of arguments into record arrays, including structured arrays::
  
   >>> arr = np.array([(1, 2., 'Hello'), (2, 3., "World")],
   ...             dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])
   >>> recordarr = np.rec.array(arr)
  
-The :mod:`numpy.rec` module provides a number of other convenience functions for
+The ``numpy.rec`` module provides a number of other convenience functions for
  creating record arrays, see :ref:`record array creation routines
  <routines.array-creation.rec>`.
  
diff --git a/doc/source/user/basics.rst b/doc/source/user/basics.rst

index affb85db2f0c547a64d32086280b91ce3fafd75b..c004d8978a4622f943986ef723b4e52b9aa6d49a 100644 (file)
--- a/doc/source/user/basics.rst
+++ b/doc/source/user/basics.rst
@@ -20,3 +20,4 @@ fundamental NumPy ideas and philosophy.
     basics.subclassing
     basics.ufuncs
     basics.copies
+   basics.interoperability
diff --git a/doc/source/user/basics.subclassing.rst b/doc/source/user/basics.subclassing.rst

index 1b78809865aa0fa289cf1978b81084d68c08b2de..7b97abab792430d6a80f05b1482d92ddcfe24aac 100644 (file)
--- a/doc/source/user/basics.subclassing.rst
+++ b/doc/source/user/basics.subclassing.rst
@@ -31,6 +31,49 @@ things like array slicing.  The complications of subclassing ndarray are
  due to the mechanisms numpy has to support these latter two routes of
  instance creation.
  
+When to use subclassing
+=======================
+
+Besides the additional complexities of subclassing a NumPy array, subclasses
+can run into unexpected behavior because some functions may convert the
+subclass to a baseclass and "forget" any additional information
+associated with the subclass.
+This can result in surprising behavior if you use NumPy methods or
+functions you have not explicitly tested.
+
+On the other hand, compared to other interoperability approaches,
+subclassing can be useful because many things will "just work".
+
+This means that subclassing can be a convenient approach and for a long time
+it was also often the only available approach.
+However, NumPy now provides additional interoperability protocols described
+in ":ref:`Interoperability with NumPy <basics.interoperability>`".
+For many use-cases these interoperability protocols may now be a better fit
+or supplement the use of subclassing.
+
+Subclassing can be a good fit if:
+
+* you are less worried about maintainability or users other than yourself:
+  Subclassing will be faster to implement and additional interoperability
+  can be added "as-needed".  And with few users, possible surprises are not
+  an issue.
+* you do not think it is problematic if the subclass information is
+  ignored or lost silently.  An example is ``np.memmap`` where "forgetting"
+  about data being memory mapped cannot lead to a wrong result.
+  An example of a subclass that sometimes confuses users is NumPy's masked
+  arrays.  When they were introduced, subclassing was the only approach for
+  implementation.  However, today we would possibly try to avoid subclassing
+  and rely only on interoperability protocols.
+
+Note that subclass authors may also wish to study
+:ref:`Interoperability with NumPy <basics.interoperability>`
+to support more complex use-cases or work around the surprising behavior.
+
+``astropy.units.Quantity`` and ``xarray`` are examples of array-like objects
+that interoperate well with NumPy.  Astropy's ``Quantity`` is an example
+which uses a dual approach of both subclassing and interoperability protocols.
+
+
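As a hedged illustration of the "forgetting" described above, the following sketch uses a hypothetical ``Tagged`` subclass (not part of NumPy) that stores an extra ``tag`` attribute and deliberately omits ``__array_finalize__``. It only shows two ways the extra information can be lost silently; it is not a recommended subclass design.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical subclass that carries an extra ``tag`` attribute."""

    def __new__(cls, data, tag=None):
        # View-cast the input data to the subclass, then attach the tag.
        obj = np.asarray(data).view(cls)
        obj.tag = tag
        return obj


a = Tagged([1, 2, 3], tag="metres")

# np.asarray converts to the base class, so the tag is gone.
base = np.asarray(a)

# Slicing keeps the subclass type, but without __array_finalize__
# the new instance never receives the tag.
part = a[1:]

print(type(base).__name__, hasattr(base, "tag"))
print(type(part).__name__, hasattr(part, "tag"))
```

Implementing ``__array_finalize__`` would address the slicing case; the conversion done by ``np.asarray`` is the kind of loss that the interoperability protocols are meant to help with.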
  .. _view-casting:
  
  View casting
@@ -48,7 +91,7 @@ ndarray of any subclass, and return a view of the array as another
  >>> # take a view of it, as our useless subclass
  >>> c_arr = arr.view(C)
  >>> type(c_arr)
-<class 'C'>
+<class '__main__.C'>
  
  .. _new-from-template:
  
@@ -63,7 +106,7 @@ For example:
  
  >>> v = c_arr[1:]
  >>> type(v) # the view is of type 'C'
-<class 'C'>
+<class '__main__.C'>
  >>> v is c_arr # but it's a new instance
  False
  
@@ -114,18 +157,15 @@ __new__ documentation
  
  For example, consider the following Python code:
  
-.. testcode::
-
-  class C:
-      def __new__(cls, *args):
-          print('Cls in __new__:', cls)
-          print('Args in __new__:', args)
-          # The `object` type __new__ method takes a single argument.
-          return object.__new__(cls)
-
-      def __init__(self, *args):
-          print('type(self) in __init__:', type(self))
-          print('Args in __init__:', args)
+>>> class C:
+...     def __new__(cls, *args):
+...         print('Cls in __new__:', cls)
+...         print('Args in __new__:', args)
+...         # The `object` type __new__ method takes a single argument.
+...         return object.__new__(cls)
+...     def __init__(self, *args):
+...         print('type(self) in __init__:', type(self))
+...         print('Args in __init__:', args)
  
  meaning that we get:
  
@@ -526,7 +566,7 @@ which inputs and outputs it converted. Hence, e.g.,
  >>> a.info
  {'inputs': [0, 1], 'outputs': [0]}
  
-Note that another approach would be to to use ``getattr(ufunc,
+Note that another approach would be to use ``getattr(ufunc,
  methods)(*inputs, **kwargs)`` instead of the ``super`` call. For this example,
  the result would be identical, but there is a difference if another operand
  also defines ``__array_ufunc__``. E.g., lets assume that we evalulate
diff --git a/doc/source/user/basics.types.rst b/doc/source/user/basics.types.rst

index 354f003fbb28d55162bb689fae92b092d2952385..3d5b380dbc8c11c24e3d2ce358aa89f6587483ac 100644
--- a/doc/source/user/basics.types.rst
+++ b/doc/source/user/basics.types.rst
@@ -104,11 +104,7 @@ aliases are provided (See :ref:`sized-aliases`).
  
  NumPy numerical types are instances of ``dtype`` (data-type) objects, each
  having unique characteristics.  Once you have imported NumPy using
-
-  ::
-
-    >>> import numpy as np
-
+``>>> import numpy as np``
  the dtypes are available as ``np.bool_``, ``np.float32``, etc.
  
  Advanced types, not listed above, are explored in
@@ -127,7 +123,6 @@ Data-types can be used as functions to convert python numbers to array scalars
  to arrays of that type, or as arguments to the dtype keyword that many numpy
  functions or methods accept. Some examples::
  
-    >>> import numpy as np
      >>> x = np.float32(1.0)
      >>> x
      1.0
@@ -143,7 +138,7 @@ backward compatibility with older packages such as Numeric.  Some
  documentation may still refer to these, for example::
  
    >>> np.array([1, 2, 3], dtype='f')
-  array([ 1.,  2.,  3.], dtype=float32)
+  array([1.,  2.,  3.], dtype=float32)
  
  We recommend using dtype objects instead.
  
@@ -151,7 +146,7 @@ To convert the type of an array, use the .astype() method (preferred) or
  the type itself as a function. For example: ::
  
      >>> z.astype(float)                 #doctest: +NORMALIZE_WHITESPACE
-    array([  0.,  1.,  2.])
+    array([0.,  1.,  2.])
      >>> np.int8(z)
      array([0, 1, 2], dtype=int8)
  
@@ -170,7 +165,7 @@ and its byte-order.  The data type can also be used indirectly to query
  properties of the type, such as whether it is an integer::
  
      >>> d = np.dtype(int)
-    >>> d
+    >>> d #doctest: +SKIP
      dtype('int32')
  
      >>> np.issubdtype(d, np.integer)
@@ -258,8 +253,8 @@ compiler's ``long double`` available as ``np.longdouble`` (and
  numpy provides with ``np.finfo(np.longdouble)``.
  
  NumPy does not provide a dtype with more precision than C's
-``long double``\\; in particular, the 128-bit IEEE quad precision
-data type (FORTRAN's ``REAL*16``\\) is not available.
+``long double``; in particular, the 128-bit IEEE quad precision
+data type (FORTRAN's ``REAL*16``) is not available.
  
  For efficient memory alignment, ``np.longdouble`` is usually stored
  padded with zero bits, either to 96 or 128 bits. Which is more efficient
diff --git a/doc/source/user/basics.ufuncs.rst b/doc/source/user/basics.ufuncs.rst

index 083e31f702b5b3956f4b6d07a9fbe9f0968296a1..5e83621aa6191aad46b9e43dcd15c716352774b8 100644
--- a/doc/source/user/basics.ufuncs.rst
+++ b/doc/source/user/basics.ufuncs.rst
@@ -89,7 +89,7 @@ Considering ``x`` from the previous example::
     >>> y
     array([0, 0, 0])
     >>> np.multiply.reduce(x, dtype=float, out=y)
-   array([ 0, 28, 80])     # dtype argument is ignored
+   array([ 0, 28, 80])
  
  Ufuncs also have a fifth method, :func:`numpy.ufunc.at`, that allows in place
  operations to be performed using advanced indexing. No
diff --git a/doc/source/user/building.rst b/doc/source/user/building.rst

index 01ec65d3b6b3ed9c6443793253f176d449004e16..4bd0b7183ea0ea1af08e2a3be39cfafbfe8d9f64 100644
--- a/doc/source/user/building.rst
+++ b/doc/source/user/building.rst
@@ -8,7 +8,7 @@ source. Your choice depends on your operating system and familiarity with the
  command line.
  
  Gitpod
-------------
+------
  
  Gitpod is an open-source platform that automatically creates
  the correct development environment right in your browser, reducing the need to
@@ -21,7 +21,7 @@ in-depth instructions for building NumPy with `building NumPy with Gitpod`_.
  .. _building NumPy with Gitpod: https://numpy.org/devdocs/dev/development_gitpod.html
  
  Building locally
-------------------
+----------------
  
  Building locally on your machine gives you
  more granular control. If you are a MacOS or Linux user familiar with using the
@@ -37,7 +37,7 @@ Prerequisites
  
  Building NumPy requires the following software installed:
  
-1) Python 3.6.x or newer
+1) Python 3.8.x or newer
  
     Please note that the Python development headers also need to be installed,
     e.g., on Debian/Ubuntu one needs to install both `python3` and
@@ -97,7 +97,14 @@ Testing
  -------
  
  Make sure to test your builds. To ensure everything stays in shape, see if
-all tests pass::
+all tests pass.
+
+The test suite requires additional dependencies, which can easily be
+installed with::
+
+    $ python -m pip install -r test_requirements.txt
+
+Run tests::
  
      $ python runtests.py -v -m full
  
@@ -291,3 +298,101 @@ Additional compiler flags can be supplied by setting the ``OPT``,
  When providing options that should improve the performance of the code
  ensure that you also set ``-DNDEBUG`` so that debugging code is not
  executed.
+
+Cross compilation
+-----------------
+
+Although ``numpy.distutils`` and ``setuptools`` do not directly support cross
+compilation, it is possible to build NumPy on one system for different
+architectures with minor modifications to the build environment. This may be
+desirable, for example, to use the power of a high-performance desktop to
+create a NumPy package for a low-power, single-board computer. Because the
+``setup.py`` scripts are unaware of cross-compilation environments and tend to
+make decisions based on the environment detected on the build system, it is
+best to compile for the same type of operating system that runs on the builder.
+Attempting to compile a Mac version of NumPy on Windows, for example, is likely
+to be met with challenges not considered here.
+
+For the purpose of this discussion, the nomenclature adopted by `meson`_ will
+be used: the "build" system is that which will be running the NumPy build
+process, while the "host" is the platform on which the compiled package will be
+run. A native Python interpreter, the setuptools and Cython packages and the
+desired cross compiler must be available for the build system. In addition, a
+Python interpreter and its development headers as well as any external linear
+algebra libraries must be available for the host platform. For convenience, it
+is assumed that all host software is available under a separate prefix
+directory, here called ``$CROSS_PREFIX``.
+
+.. _meson: https://mesonbuild.com/Cross-compilation.html#cross-compilation
+
+When building and installing NumPy for a host system, the ``CC`` environment
+variable must provide the path to the cross compiler that will be used to build
+NumPy C extensions. It may also be necessary to set the ``LDSHARED``
+environment variable to the path to the linker that can link compiled objects
+for the host system. The compiler must be told where it can find Python
+libraries and development headers. On Unix-like systems, this generally
+requires adding, *e.g.*, the following parameters to the ``CFLAGS`` environment
+variable::
+
+    -I${CROSS_PREFIX}/usr/include
+    -I${CROSS_PREFIX}/usr/include/python3.y
+
+for Python version 3.y. (Replace the "y" in this path with the actual minor
+number of the installed Python runtime.) Likewise, the linker should be told
+where to find host libraries by adding a parameter to the ``LDFLAGS``
+environment variable::
+
+    -L${CROSS_PREFIX}/usr/lib
+
+To make sure Python-specific system configuration options are provided for the
+intended host and not the build system, set::
+
+    _PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_${ARCH_TRIPLET}
+
+where ``${ARCH_TRIPLET}`` is an architecture-dependent suffix appropriate for
+the host architecture. (This should be the name of a ``_sysconfigdata`` file,
+without the ``.py`` extension, found in the host Python library directory.)
+
+When using external linear algebra libraries, include and library directories
+should be provided for the desired libraries in ``site.cfg`` as described
+above and in the comments of the ``site.cfg.example`` file included in the
+NumPy repository or sdist. In this example, set::
+
+    include_dirs = ${CROSS_PREFIX}/usr/include
+    library_dirs = ${CROSS_PREFIX}/usr/lib
+
+under appropriate sections of the file to allow ``numpy.distutils`` to find the
+libraries.
+
+As of NumPy 1.22.0, a vendored copy of SVML will be built on ``x86_64`` Linux
+hosts to provide AVX-512 acceleration of floating-point operations. When using
+an ``x86_64`` Linux build system to cross compile NumPy for hosts other than
+``x86_64`` Linux, set the environment variable ``NPY_DISABLE_SVML`` to prevent
+the NumPy build script from incorrectly attempting to cross-compile this
+platform-specific library::
+
+    NPY_DISABLE_SVML=1
+
+With the environment configured, NumPy may be built as it is natively::
+
+    python setup.py build
+
+When the ``wheel`` package is available, the cross-compiled package may be
+packed into a wheel for installation on the host with::
+
+    python setup.py bdist_wheel
+
+It may be possible to use ``pip`` to build a wheel, but ``pip`` configures its
+own environment; adapting the ``pip`` environment to cross-compilation is
+beyond the scope of this guide.
+
+The cross-compiled package may also be installed into the host prefix for
+cross-compilation of other packages using, *e.g.*, the command::
+
+    python setup.py install --prefix=${CROSS_PREFIX}
+
+When cross compiling other packages that depend on NumPy, the host
+npy-pkg-config file must be made available. For further discussion, refer to
+`numpy distutils documentation`_.
+
+.. _numpy distutils documentation: https://numpy.org/devdocs/reference/distutils.html#numpy.distutils.misc_util.Configuration.add_npy_pkg_config
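The environment described in the cross-compilation section above can be collected in one place. The sketch below is only an illustration: the helper name ``cross_build_env``, the prefix ``/opt/cross``, the compiler name ``aarch64-linux-gnu-gcc`` and the suffix ``_linux_aarch64-linux-gnu`` are all assumptions chosen for the example, and ``NPY_DISABLE_SVML`` is only needed when building on ``x86_64`` Linux for a different host.

```python
import os


def cross_build_env(cross_prefix, py_minor, arch_triplet, cc,
                    disable_svml=True, base_env=None):
    """Return a copy of ``base_env`` with the cross-compilation variables set.

    Hypothetical helper; the variable names come from the text above,
    everything else is an assumption made for this example.
    """
    env = dict(os.environ if base_env is None else base_env)

    # Headers for the host platform and the host Python 3.y.
    cflags = (f"-I{cross_prefix}/usr/include "
              f"-I{cross_prefix}/usr/include/python3.{py_minor}")
    # Libraries for the host platform.
    ldflags = f"-L{cross_prefix}/usr/lib"

    env["CC"] = cc
    # Append to any flags that are already present.
    env["CFLAGS"] = " ".join(filter(None, [env.get("CFLAGS", ""), cflags]))
    env["LDFLAGS"] = " ".join(filter(None, [env.get("LDFLAGS", ""), ldflags]))
    # Make Python report the host configuration, not the build system's.
    env["_PYTHON_SYSCONFIGDATA_NAME"] = f"_sysconfigdata_{arch_triplet}"
    if disable_svml:
        env["NPY_DISABLE_SVML"] = "1"
    return env


env = cross_build_env("/opt/cross", 10, "_linux_aarch64-linux-gnu",
                      "aarch64-linux-gnu-gcc", base_env={})
for name in sorted(env):
    print(f"{name}={env[name]}")
```

The resulting mapping could then be passed as the ``env`` argument of ``subprocess.run`` when invoking ``python setup.py build``. The linker variable ``LDSHARED`` is left out here because its value depends on the toolchain.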
diff --git a/doc/source/user/c-info.beyond-basics.rst b/doc/source/user/c-info.beyond-basics.rst

index 7dd22afbf629c3da0ab89ec827f7c09aa99b0d17..04ca834897e16b32b62b5a4af1d3b3729c75b512 100644
--- a/doc/source/user/c-info.beyond-basics.rst
+++ b/doc/source/user/c-info.beyond-basics.rst
@@ -450,6 +450,7 @@ type(s). In particular, to create a sub-type in C follow these steps:
  More information on creating sub-types in C can be learned by reading
  PEP 253 (available at https://www.python.org/dev/peps/pep-0253).
  
+.. _specific-array-subtyping:
  
  Specific features of ndarray sub-typing
  ---------------------------------------
diff --git a/doc/source/user/c-info.how-to-extend.rst b/doc/source/user/c-info.how-to-extend.rst

index 96727a177136929ffbd0c99bb8cea7fa6da3e831..ffa141b95c8318c6d53e7ce05eee0a3c9105956b 100644
--- a/doc/source/user/c-info.how-to-extend.rst
+++ b/doc/source/user/c-info.how-to-extend.rst
@@ -111,7 +111,7 @@ Defining functions
  ==================
  
  The second argument passed in to the Py_InitModule function is a
-structure that makes it easy to to define functions in the module. In
+structure that makes it easy to define functions in the module. In
  the example given above, the mymethods structure would have been
  defined earlier in the file (usually right before the init{name}
  subroutine) to:
@@ -459,9 +459,8 @@ writeable). The syntax is
              must be called before :c:func:`Py_DECREF` at
              the end of the interface routine to write back the temporary data
              into the original array passed in. Use
-            of the :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` or
-            :c:data:`NPY_ARRAY_UPDATEIFCOPY` flags requires that the input
-            object is already an array (because other objects cannot
+            of the :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flag requires that the
+            input object is already an array (because other objects cannot
              be automatically updated in this fashion). If an error
              occurs use :c:func:`PyArray_DiscardWritebackIfCopy` (obj) on an
              array with these flags set. This will set the underlying base array
diff --git a/doc/source/user/c-info.python-as-glue.rst b/doc/source/user/c-info.python-as-glue.rst

index 6d514f146c4fedd0bccceecd9d710294764d215e..4db7898566c896d746cf5b6d10bfcf37a314a517 100644
--- a/doc/source/user/c-info.python-as-glue.rst
+++ b/doc/source/user/c-info.python-as-glue.rst
@@ -116,303 +116,7 @@ signatures for the subroutines it encounters, or you can guide how the
  subroutine interfaces with Python by constructing an interface-definition-file
  (or modifying the f2py-produced one).
  
-.. index::
-   single: f2py
-
-Creating source for a basic extension module
---------------------------------------------
-
-Probably the easiest way to introduce f2py is to offer a simple
-example. Here is one of the subroutines contained in a file named
-:file:`add.f`
-
-.. code-block:: fortran
-
-    C
-          SUBROUTINE ZADD(A,B,C,N)
-    C
-          DOUBLE COMPLEX A(*)
-          DOUBLE COMPLEX B(*)
-          DOUBLE COMPLEX C(*)
-          INTEGER N
-          DO 20 J = 1, N
-             C(J) = A(J)+B(J)
-     20   CONTINUE
-          END
-
-This routine simply adds the elements in two contiguous arrays and
-places the result in a third. The memory for all three arrays must be
-provided by the calling routine. A very basic interface to this
-routine can be automatically generated by f2py::
-
-    f2py -m add add.f
-
-You should be able to run this command assuming your search-path is
-set-up properly. This command will produce an extension module named
-:file:`addmodule.c` in the current directory. This extension module can now be
-compiled and used from Python just like any other extension module.
-
-
-Creating a compiled extension module
-------------------------------------
-
-You can also get f2py to both compile :file:`add.f` along with the produced
-extension module leaving only a shared-library extension file that can
-be imported from Python::
-
-    f2py -c -m add add.f
-
-This command leaves a file named add.{ext} in the current directory
-(where {ext} is the appropriate extension for a Python extension
-module on your platform --- so, pyd, *etc.* ). This module may then be
-imported from Python. It will contain a method for each subroutine in
-add (zadd, cadd, dadd, sadd). The docstring of each method contains
-information about how the module method may be called::
-
-    >>> import add
-    >>> print(add.zadd.__doc__)
-    zadd(a,b,c,n)
-
-    Wrapper for ``zadd``.
-
-    Parameters
-    ----------
-    a : input rank-1 array('D') with bounds (*)
-    b : input rank-1 array('D') with bounds (*)
-    c : input rank-1 array('D') with bounds (*)
-    n : input int
-
-Improving the basic interface
------------------------------
-
-The default interface is a very literal translation of the Fortran
-code into Python. The Fortran array arguments must now be NumPy arrays
-and the integer argument should be an integer. The interface will
-attempt to convert all arguments to their required types (and shapes)
-and issue an error if unsuccessful. However, because it knows nothing
-about the semantics of the arguments (such that C is an output and n
-should really match the array sizes), it is possible to abuse this
-function in ways that can cause Python to crash. For example::
-
-    >>> add.zadd([1, 2, 3], [1, 2], [3, 4], 1000)
-
-will cause a program crash on most systems. Under the covers, the
-lists are being converted to proper arrays but then the underlying add
-loop is told to cycle way beyond the borders of the allocated memory.
-
-In order to improve the interface, directives should be provided. This
-is accomplished by constructing an interface definition file. It is
-usually best to start from the interface file that f2py can produce
-(where it gets its default behavior from). To get f2py to generate the
-interface file use the -h option::
-
-    f2py -h add.pyf -m add add.f
-
-This command leaves the file add.pyf in the current directory. The
-section of this file corresponding to zadd is:
-
-.. code-block:: fortran
-
-    subroutine zadd(a,b,c,n) ! in :add:add.f
-       double complex dimension(*) :: a
-       double complex dimension(*) :: b
-       double complex dimension(*) :: c
-       integer :: n
-    end subroutine zadd
-
-By placing intent directives and checking code, the interface can be
-cleaned up quite a bit until the Python module method is both easier
-to use and more robust.
-
-.. code-block:: fortran
-
-    subroutine zadd(a,b,c,n) ! in :add:add.f
-       double complex dimension(n) :: a
-       double complex dimension(n) :: b
-       double complex intent(out),dimension(n) :: c
-       integer intent(hide),depend(a) :: n=len(a)
-    end subroutine zadd
-
-The intent directive, intent(out) is used to tell f2py that ``c`` is
-an output variable and should be created by the interface before being
-passed to the underlying code. The intent(hide) directive tells f2py
-to not allow the user to specify the variable, ``n``, but instead to
-get it from the size of ``a``. The depend( ``a`` ) directive is
-necessary to tell f2py that the value of n depends on the input a (so
-that it won't try to create the variable n until the variable a is
-created).
-
-After modifying ``add.pyf``, the new Python module file can be generated
-by compiling both ``add.f`` and ``add.pyf``::
-
-    f2py -c add.pyf add.f
-
-The new interface has docstring::
-
-    >>> import add
-    >>> print(add.zadd.__doc__)
-    c = zadd(a,b)
-
-    Wrapper for ``zadd``.
-
-    Parameters
-    ----------
-    a : input rank-1 array('D') with bounds (n)
-    b : input rank-1 array('D') with bounds (n)
-
-    Returns
-    -------
-    c : rank-1 array('D') with bounds (n)
-
-Now, the function can be called in a much more robust way::
-
-    >>> add.zadd([1, 2, 3], [4, 5, 6])
-    array([5.+0.j, 7.+0.j, 9.+0.j])
-
-Notice the automatic conversion to the correct format that occurred.
-
-
-Inserting directives in Fortran source
---------------------------------------
-
-The nice interface can also be generated automatically by placing the
-variable directives as special comments in the original Fortran code.
-Thus, if the source code is modified to contain:
-
-.. code-block:: fortran
-
-    C
-          SUBROUTINE ZADD(A,B,C,N)
-    C
-    CF2PY INTENT(OUT) :: C
-    CF2PY INTENT(HIDE) :: N
-    CF2PY DOUBLE COMPLEX :: A(N)
-    CF2PY DOUBLE COMPLEX :: B(N)
-    CF2PY DOUBLE COMPLEX :: C(N)
-          DOUBLE COMPLEX A(*)
-          DOUBLE COMPLEX B(*)
-          DOUBLE COMPLEX C(*)
-          INTEGER N
-          DO 20 J = 1, N
-             C(J) = A(J) + B(J)
-     20   CONTINUE
-          END
-
-Then, one can compile the extension module using::
-
-    f2py -c -m add add.f
-
-The resulting signature for the function add.zadd is exactly the same
-one that was created previously. If the original source code had
-contained ``A(N)`` instead of ``A(*)`` and so forth with ``B`` and ``C``,
-then nearly the same interface can be obtained by placing the
-``INTENT(OUT) :: C`` comment line in the source code. The only difference
-is that ``N`` would be an optional input that would default to the length
-of ``A``.
-
-
-A filtering example
--------------------
-
-For comparison with the other methods to be discussed. Here is another
-example of a function that filters a two-dimensional array of double
-precision floating-point numbers using a fixed averaging filter. The
-advantage of using Fortran to index into multi-dimensional arrays
-should be clear from this example.
-
-.. code-block::
-
-          SUBROUTINE DFILTER2D(A,B,M,N)
-    C
-          DOUBLE PRECISION A(M,N)
-          DOUBLE PRECISION B(M,N)
-          INTEGER N, M
-    CF2PY INTENT(OUT) :: B
-    CF2PY INTENT(HIDE) :: N
-    CF2PY INTENT(HIDE) :: M
-          DO 20 I = 2,M-1
-             DO 40 J=2,N-1
-                B(I,J) = A(I,J) +
-         $           (A(I-1,J)+A(I+1,J) +
-         $            A(I,J-1)+A(I,J+1) )*0.5D0 +
-         $           (A(I-1,J-1) + A(I-1,J+1) +
-         $            A(I+1,J-1) + A(I+1,J+1))*0.25D0
-     40      CONTINUE
-     20   CONTINUE
-          END
-
-This code can be compiled and linked into an extension module named
-filter using::
-
-    f2py -c -m filter filter.f
-
-This will produce an extension module named filter.so in the current
-directory with a method named dfilter2d that returns a filtered
-version of the input.
-
-
-Calling f2py from Python
-------------------------
-
-The f2py program is written in Python and can be run from inside your code
-to compile Fortran code at runtime, as follows:
-
-.. code-block:: python
-
-    from numpy import f2py
-    with open("add.f") as sourcefile:
-        sourcecode = sourcefile.read()
-    f2py.compile(sourcecode, modulename='add')
-    import add
-
-The source string can be any valid Fortran code. If you want to save
-the extension-module source code then a suitable file-name can be
-provided by the ``source_fn`` keyword to the compile function.
-
-
-Automatic extension module generation
--------------------------------------
-
-If you want to distribute your f2py extension module, then you only
-need to include the .pyf file and the Fortran code. The distutils
-extensions in NumPy allow you to define an extension module entirely
-in terms of this interface file. A valid ``setup.py`` file allowing
-distribution of the ``add.f`` module (as part of the package
-``f2py_examples`` so that it would be loaded as ``f2py_examples.add``) is:
-
-.. code-block:: python
-
-    def configuration(parent_package='', top_path=None)
-        from numpy.distutils.misc_util import Configuration
-        config = Configuration('f2py_examples',parent_package, top_path)
-        config.add_extension('add', sources=['add.pyf','add.f'])
-        return config
-
-    if __name__ == '__main__':
-        from numpy.distutils.core import setup
-        setup(**configuration(top_path='').todict())
-
-Installation of the new package is easy using::
-
-    pip install .
-
-assuming you have the proper permissions to write to the main site-
-packages directory for the version of Python you are using. For the
-resulting package to work, you need to create a file named ``__init__.py``
-(in the same directory as ``add.pyf``). Notice the extension module is
-defined entirely in terms of the ``add.pyf`` and ``add.f`` files. The
-conversion of the .pyf file to a .c file is handled by `numpy.disutils`.
-
-
-Conclusion
-----------
-
-The interface definition file (.pyf) is how you can fine-tune the interface
-between Python and Fortran. There is decent documentation for f2py at
-:ref:`f2py`. There is also more information on using f2py (including how to use
-it to wrap C codes) at the `"Interfacing With Other Languages" heading of the
-SciPy Cookbook.
-<https://scipy-cookbook.readthedocs.io/items/idx_interfacing_with_other_languages.html>`_
+See the :ref:`F2PY documentation <f2py>` for more information and examples.
  
  The f2py method of linking compiled code is currently the most
  sophisticated and integrated approach. It allows clean separation of
diff --git a/doc/source/user/c-info.ufunc-tutorial.rst b/doc/source/user/c-info.ufunc-tutorial.rst

index 9bd01b9639e7cd3f0d507b7ab129aafe9f83a426..3a406479b34463455dfdd64800de6d578b7d8f4e 100644
--- a/doc/source/user/c-info.ufunc-tutorial.rst
+++ b/doc/source/user/c-info.ufunc-tutorial.rst
@@ -71,11 +71,11 @@ Example Non-ufunc extension
     pair: ufunc; adding new
  
  For comparison and general edification of the reader we provide
-a simple implementation of a C extension of logit that uses no
+a simple implementation of a C extension of ``logit`` that uses no
  numpy.
  
  To do this we need two files. The first is the C file which contains
-the actual code, and the second is the setup.py file used to create
+the actual code, and the second is the ``setup.py`` file used to create
  the module.
  
      .. code-block:: c
@@ -99,8 +99,7 @@ the module.
  
  
          /* This declares the logit function */
-        static PyObject* spam_logit(PyObject *self, PyObject *args);
-
+        static PyObject *spam_logit(PyObject *self, PyObject *args);
  
          /*
           * This tells Python what methods this module has.
@@ -113,13 +112,12 @@ the module.
              {NULL, NULL, 0, NULL}
          };
  
-
          /*
           * This actually defines the logit function for
           * input args from Python.
           */
  
-        static PyObject* spam_logit(PyObject *self, PyObject *args)
+        static PyObject *spam_logit(PyObject *self, PyObject *args)
          {
              double p;
  
@@ -136,7 +134,6 @@ the module.
              return Py_BuildValue("d", p);
          }
  
-
          /* This initiates the module using the above definitions. */
          static struct PyModuleDef moduledef = {
              PyModuleDef_HEAD_INIT,
@@ -160,10 +157,10 @@ the module.
              return m;
          }
  
-To use the setup.py file, place setup.py and spammodule.c in the same
-folder. Then python setup.py build will build the module to import,
-or setup.py install will install the module to your site-packages
-directory.
+To use the ``setup.py`` file, place ``setup.py`` and ``spammodule.c``
+in the same folder. Then ``python setup.py build`` will build the module to
+import, or ``python setup.py install`` will install the module to your
+site-packages directory.
  
      .. code-block:: python
  
@@ -203,9 +200,9 @@ directory.
  
  
  Once the spam module is imported into python, you can call logit
-via spam.logit. Note that the function used above cannot be applied
-as-is to numpy arrays. To do so we must call numpy.vectorize on it.
-For example, if a python interpreter is opened in the file containing
+via ``spam.logit``. Note that the function used above cannot be applied
+as-is to numpy arrays. To do so we must call :py:func:`numpy.vectorize`
+on it. For example, if a python interpreter is opened in the file containing
  the spam library or spam has been installed, one can perform the
  following commands:
  
@@ -225,10 +222,10 @@ TypeError: only length-1 arrays can be converted to Python scalars
  array([       -inf, -2.07944154, -1.25276297, -0.69314718, -0.22314355,
      0.22314355,  0.69314718,  1.25276297,  2.07944154,         inf])
  
-THE RESULTING LOGIT FUNCTION IS NOT FAST! numpy.vectorize simply
-loops over spam.logit. The loop is done at the C level, but the numpy
+THE RESULTING LOGIT FUNCTION IS NOT FAST! ``numpy.vectorize`` simply
+loops over ``spam.logit``. The loop is done at the C level, but the numpy
  array is constantly being parsed and build back up. This is expensive.
-When the author compared numpy.vectorize(spam.logit) against the
+When the author compared ``numpy.vectorize(spam.logit)`` against the
  logit ufuncs constructed below, the logit ufuncs were almost exactly
  4 times faster. Larger or smaller speedups are, of course, possible
  depending on the nature of the function.
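The behavior described above can be reproduced without compiling anything. In the sketch below, a pure-Python ``logit`` stands in for ``spam.logit`` (an assumption made so the example is self-contained). It shows that the scalar function rejects arrays, that ``numpy.vectorize`` makes it accept them by looping over the elements, and that an expression built from existing ufuncs gives the same values. No timings are claimed here.

```python
import math

import numpy as np


def logit(p):
    """Scalar logit in pure Python; a stand-in for ``spam.logit``."""
    return math.log(p / (1 - p))


p = np.linspace(0.1, 0.9, 9)

# The scalar function cannot be applied to an array as-is.
try:
    logit(p)
except TypeError as exc:
    print("TypeError:", exc)

# numpy.vectorize wraps the scalar function in an element-wise loop.
vlogit = np.vectorize(logit)
slow = vlogit(p)

# The same computation expressed with existing ufuncs.
fast = np.log(p / (1 - p))

print(np.allclose(slow, fast))
```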
@@ -242,13 +239,14 @@ Example NumPy ufunc for one dtype
  .. index::
     pair: ufunc; adding new
  
-For simplicity we give a ufunc for a single dtype, the 'f8' double.
-As in the previous section, we first give the .c file and then the
-setup.py file used to create the module containing the ufunc.
+For simplicity we give a ufunc for a single dtype, the ``'f8'``
+``double``. As in the previous section, we first give the ``.c`` file
+and then the ``setup.py`` file used to create the module containing the
+ufunc.
  
  The place in the code corresponding to the actual computations for
-the ufunc are marked with /\*BEGIN main ufunc computation\*/ and
-/\*END main ufunc computation\*/. The code in between those lines is
+the ufunc are marked with ``/* BEGIN main ufunc computation */`` and
+``/* END main ufunc computation */``. The code in between those lines is
  the primary thing that must be changed to create your own ufunc.
  
      .. code-block:: c
@@ -277,13 +275,13 @@ the primary thing that must be changed to create your own ufunc.
           */
  
          static PyMethodDef LogitMethods[] = {
-                {NULL, NULL, 0, NULL}
+            {NULL, NULL, 0, NULL}
          };
  
          /* The loop definition must precede the PyMODINIT_FUNC. */
  
-        static void double_logit(char **args, npy_intp *dimensions,
-                                    npy_intp* steps, void* data)
+        static void double_logit(char **args, const npy_intp *dimensions,
+                                 const npy_intp *steps, void *data)
          {
              npy_intp i;
              npy_intp n = dimensions[0];
@@ -293,25 +291,23 @@ the primary thing that must be changed to create your own ufunc.
              double tmp;
  
              for (i = 0; i < n; i++) {
-                /*BEGIN main ufunc computation*/
+                /* BEGIN main ufunc computation */
                  tmp = *(double *)in;
-                tmp /= 1-tmp;
+                tmp /= 1 - tmp;
                  *((double *)out) = log(tmp);
-                /*END main ufunc computation*/
+                /* END main ufunc computation */
  
                  in += in_step;
                  out += out_step;
              }
          }
  
+        /* This is a pointer to the above function */
+        /* This a pointer to the above function */
          PyUFuncGenericFunction funcs[1] = {&double_logit};
  
          /* These are the input and return dtypes of logit.*/
          static char types[2] = {NPY_DOUBLE, NPY_DOUBLE};
  
-        static void *data[1] = {NULL};
-
          static struct PyModuleDef moduledef = {
              PyModuleDef_HEAD_INIT,
              "npufunc",
@@ -335,7 +331,7 @@ the primary thing that must be changed to create your own ufunc.
              import_array();
              import_umath();
  
-            logit = PyUFunc_FromFuncAndData(funcs, data, types, 1, 1, 1,
+            logit = PyUFunc_FromFuncAndData(funcs, NULL, types, 1, 1, 1,
                                              PyUFunc_None, "logit",
                                              "logit_docstring", 0);
  
@@ -347,21 +343,23 @@ the primary thing that must be changed to create your own ufunc.
              return m;
          }
  
-This is a setup.py file for the above code. As before, the module
-can be build via calling python setup.py build at the command prompt,
-or installed to site-packages via python setup.py install.
+This is a ``setup.py`` file for the above code. As before, the module
+can be built by calling ``python setup.py build`` at the command prompt,
+or installed to site-packages via ``python setup.py install``. The module
+can also be placed into a local folder, e.g. ``npufunc_directory`` below,
+using ``python setup.py build_ext --inplace``.
  
      .. code-block:: python
  
          '''
-            setup.py file for logit.c
+            setup.py file for single_type_logit.c
              Note that since this is a numpy extension
              we use numpy.distutils instead of
              distutils from the python standard library.
  
              Calling
              $python setup.py build_ext --inplace
-            will build the extension library in the current file.
+            will build the extension library in the npufunc_directory.
  
              Calling
              $python setup.py build
@@ -382,7 +380,6 @@ or installed to site-packages via python setup.py install.
  
  
          def configuration(parent_package='', top_path=None):
-            import numpy
              from numpy.distutils.misc_util import Configuration
  
              config = Configuration('npufunc_directory',
@@ -418,13 +415,13 @@ Example NumPy ufunc with multiple dtypes
  
  We finally give an example of a full ufunc, with inner loops for
  half-floats, floats, doubles, and long doubles. As in the previous
-sections we first give the .c file and then the corresponding
-setup.py file.
+sections we first give the ``.c`` file and then the corresponding
+``setup.py`` file.
  
  The places in the code corresponding to the actual computations for
-the ufunc are marked with /\*BEGIN main ufunc computation\*/ and
-/\*END main ufunc computation\*/. The code in between those lines is
-the primary thing that must be changed to create your own ufunc.
+the ufunc are marked with ``/* BEGIN main ufunc computation */`` and
+``/* END main ufunc computation */``. The code in between those lines
+is the primary thing that must be changed to create your own ufunc.
  
  
      .. code-block:: c
@@ -455,37 +452,36 @@ the primary thing that must be changed to create your own ufunc.
           *
           */
  
-
          static PyMethodDef LogitMethods[] = {
-                {NULL, NULL, 0, NULL}
+            {NULL, NULL, 0, NULL}
          };
  
          /* The loop definitions must precede the PyMODINIT_FUNC. */
  
-        static void long_double_logit(char **args, npy_intp *dimensions,
-                                      npy_intp* steps, void* data)
+        static void long_double_logit(char **args, const npy_intp *dimensions,
+                                      const npy_intp *steps, void *data)
          {
              npy_intp i;
              npy_intp n = dimensions[0];
-            char *in = args[0], *out=args[1];
+            char *in = args[0], *out = args[1];
              npy_intp in_step = steps[0], out_step = steps[1];
  
              long double tmp;
  
              for (i = 0; i < n; i++) {
-                /*BEGIN main ufunc computation*/
+                /* BEGIN main ufunc computation */
                  tmp = *(long double *)in;
-                tmp /= 1-tmp;
+                tmp /= 1 - tmp;
                  *((long double *)out) = logl(tmp);
-                /*END main ufunc computation*/
+                /* END main ufunc computation */
  
                  in += in_step;
                  out += out_step;
              }
          }
  
-        static void double_logit(char **args, npy_intp *dimensions,
-                                 npy_intp* steps, void* data)
+        static void double_logit(char **args, const npy_intp *dimensions,
+                                 const npy_intp *steps, void *data)
          {
              npy_intp i;
              npy_intp n = dimensions[0];
@@ -495,33 +491,33 @@ the primary thing that must be changed to create your own ufunc.
              double tmp;
  
              for (i = 0; i < n; i++) {
-                /*BEGIN main ufunc computation*/
+                /* BEGIN main ufunc computation */
                  tmp = *(double *)in;
-                tmp /= 1-tmp;
+                tmp /= 1 - tmp;
                  *((double *)out) = log(tmp);
-                /*END main ufunc computation*/
+                /* END main ufunc computation */
  
                  in += in_step;
                  out += out_step;
              }
          }
  
-        static void float_logit(char **args, npy_intp *dimensions,
-                                npy_intp* steps, void* data)
+        static void float_logit(char **args, const npy_intp *dimensions,
+                               const npy_intp *steps, void *data)
          {
              npy_intp i;
              npy_intp n = dimensions[0];
-            char *in=args[0], *out = args[1];
+            char *in = args[0], *out = args[1];
              npy_intp in_step = steps[0], out_step = steps[1];
  
              float tmp;
  
              for (i = 0; i < n; i++) {
-                /*BEGIN main ufunc computation*/
+                /* BEGIN main ufunc computation */
                  tmp = *(float *)in;
-                tmp /= 1-tmp;
+                tmp /= 1 - tmp;
                  *((float *)out) = logf(tmp);
-                /*END main ufunc computation*/
+                /* END main ufunc computation */
  
                  in += in_step;
                  out += out_step;
@@ -529,8 +525,8 @@ the primary thing that must be changed to create your own ufunc.
          }
  
  
-        static void half_float_logit(char **args, npy_intp *dimensions,
-                                     npy_intp* steps, void* data)
+        static void half_float_logit(char **args, const npy_intp *dimensions,
+                                    const npy_intp *steps, void *data)
          {
              npy_intp i;
              npy_intp n = dimensions[0];
@@ -541,13 +537,12 @@ the primary thing that must be changed to create your own ufunc.
  
              for (i = 0; i < n; i++) {
  
-                /*BEGIN main ufunc computation*/
-                tmp = *(npy_half *)in;
-                tmp = npy_half_to_float(tmp);
-                tmp /= 1-tmp;
+                /* BEGIN main ufunc computation */
+                tmp = npy_half_to_float(*(npy_half *)in);
+                tmp /= 1 - tmp;
                  tmp = logf(tmp);
                  *((npy_half *)out) = npy_float_to_half(tmp);
-                /*END main ufunc computation*/
+                /* END main ufunc computation */
  
                  in += in_step;
                  out += out_step;
@@ -562,10 +557,9 @@ the primary thing that must be changed to create your own ufunc.
                                             &long_double_logit};
  
          static char types[8] = {NPY_HALF, NPY_HALF,
-                        NPY_FLOAT, NPY_FLOAT,
-                        NPY_DOUBLE,NPY_DOUBLE,
-                        NPY_LONGDOUBLE, NPY_LONGDOUBLE};
-        static void *data[4] = {NULL, NULL, NULL, NULL};
+                                NPY_FLOAT, NPY_FLOAT,
+                                NPY_DOUBLE, NPY_DOUBLE,
+                                NPY_LONGDOUBLE, NPY_LONGDOUBLE};
  
          static struct PyModuleDef moduledef = {
              PyModuleDef_HEAD_INIT,
@@ -590,7 +584,7 @@ the primary thing that must be changed to create your own ufunc.
              import_array();
              import_umath();
  
-            logit = PyUFunc_FromFuncAndData(funcs, data, types, 4, 1, 1,
+            logit = PyUFunc_FromFuncAndData(funcs, NULL, types, 4, 1, 1,
                                              PyUFunc_None, "logit",
                                              "logit_docstring", 0);
  
@@ -602,14 +596,14 @@ the primary thing that must be changed to create your own ufunc.
              return m;
          }
  
-This is a setup.py file for the above code. As before, the module
-can be build via calling python setup.py build at the command prompt,
-or installed to site-packages via python setup.py install.
+This is a ``setup.py`` file for the above code. As before, the module
+can be built by calling ``python setup.py build`` at the command prompt,
+or installed to site-packages via ``python setup.py install``.
  
      .. code-block:: python
  
          '''
-            setup.py file for logit.c
+            setup.py file for multi_type_logit.c
              Note that since this is a numpy extension
              we use numpy.distutils instead of
              distutils from the python standard library.
@@ -637,9 +631,7 @@ or installed to site-packages via python setup.py install.
  
  
          def configuration(parent_package='', top_path=None):
-            import numpy
-            from numpy.distutils.misc_util import Configuration
-            from numpy.distutils.misc_util import get_info
+            from numpy.distutils.misc_util import Configuration, get_info
  
              #Necessary for the half-float d-type.
              info = get_info('npymath')
@@ -676,10 +668,10 @@ Example NumPy ufunc with multiple arguments/return values
  
  Our final example is a ufunc with multiple arguments. It is a modification
  of the code for a logit ufunc for data with a single dtype. We
-compute (A*B, logit(A*B)).
+compute ``(A * B, logit(A * B))``.
  
  We only give the C code as the setup.py file is exactly the same as
-the setup.py file in `Example NumPy ufunc for one dtype`_, except that
+the ``setup.py`` file in `Example NumPy ufunc for one dtype`_, except that
  the line
  
      .. code-block:: python
@@ -692,9 +684,9 @@ is replaced with
  
          config.add_extension('npufunc', ['multi_arg_logit.c'])
  
-The C file is given below. The ufunc generated takes two arguments A
-and B. It returns a tuple whose first element is A*B and whose second
-element is logit(A*B). Note that it automatically supports broadcasting,
+The C file is given below. The ufunc generated takes two arguments ``A``
+and ``B``. It returns a tuple whose first element is ``A * B`` and whose second
+element is ``logit(A * B)``. Note that it automatically supports broadcasting,
  as well as all other properties of a ufunc.
  
      .. code-block:: c
@@ -716,19 +708,17 @@ as well as all other properties of a ufunc.
           *
           * Details explaining the Python-C API can be found under
           * 'Extending and Embedding' and 'Python/C API' at
-         * docs.python.org .
-         *
+         * docs.python.org.
           */
  
-
          static PyMethodDef LogitMethods[] = {
-                {NULL, NULL, 0, NULL}
+            {NULL, NULL, 0, NULL}
          };
  
          /* The loop definition must precede the PyMODINIT_FUNC. */
  
-        static void double_logitprod(char **args, npy_intp *dimensions,
-                                    npy_intp* steps, void* data)
+        static void double_logitprod(char **args, const npy_intp *dimensions,
+                                     const npy_intp *steps, void *data)
          {
              npy_intp i;
              npy_intp n = dimensions[0];
@@ -740,12 +730,12 @@ as well as all other properties of a ufunc.
              double tmp;
  
              for (i = 0; i < n; i++) {
-                /*BEGIN main ufunc computation*/
+                /* BEGIN main ufunc computation */
                  tmp = *(double *)in1;
                  tmp *= *(double *)in2;
                  *((double *)out1) = tmp;
-                *((double *)out2) = log(tmp/(1-tmp));
-                /*END main ufunc computation*/
+                *((double *)out2) = log(tmp / (1 - tmp));
+                /* END main ufunc computation */
  
                  in1 += in1_step;
                  in2 += in2_step;
@@ -754,7 +744,6 @@ as well as all other properties of a ufunc.
              }
          }
  
-
          /*This a pointer to the above function*/
          PyUFuncGenericFunction funcs[1] = {&double_logitprod};
  
@@ -763,9 +752,6 @@ as well as all other properties of a ufunc.
          static char types[4] = {NPY_DOUBLE, NPY_DOUBLE,
                                  NPY_DOUBLE, NPY_DOUBLE};
  
-
-        static void *data[1] = {NULL};
-
          static struct PyModuleDef moduledef = {
              PyModuleDef_HEAD_INIT,
              "npufunc",
@@ -789,7 +775,7 @@ as well as all other properties of a ufunc.
              import_array();
              import_umath();
  
-            logit = PyUFunc_FromFuncAndData(funcs, data, types, 1, 2, 2,
+            logit = PyUFunc_FromFuncAndData(funcs, NULL, types, 1, 2, 2,
                                              PyUFunc_None, "logit",
                                              "logit_docstring", 0);
  
@@ -809,13 +795,13 @@ Example NumPy ufunc with structured array dtype arguments
  
  This example shows how to create a ufunc for a structured array dtype.
  For the example we show a trivial ufunc for adding two arrays with dtype
-'u8,u8,u8'. The process is a bit different from the other examples since
+``'u8,u8,u8'``. The process is a bit different from the other examples since
  a call to :c:func:`PyUFunc_FromFuncAndData` doesn't fully register ufuncs for
  custom dtypes and structured array dtypes. We need to also call
  :c:func:`PyUFunc_RegisterLoopForDescr` to finish setting up the ufunc.
  
-We only give the C code as the setup.py file is exactly the same as
-the setup.py file in `Example NumPy ufunc for one dtype`_, except that
+We only give the C code as the ``setup.py`` file is exactly the same as
+the ``setup.py`` file in `Example NumPy ufunc for one dtype`_, except that
  the line
  
      .. code-block:: python
@@ -839,7 +825,6 @@ The C file is given below.
          #include "numpy/npy_3kcompat.h"
          #include <math.h>
  
-
          /*
           * add_triplet.c
           * This is the C code for creating your own
@@ -847,7 +832,7 @@ The C file is given below.
           *
           * Details explaining the Python-C API can be found under
           * 'Extending and Embedding' and 'Python/C API' at
-         * docs.python.org .
+         * docs.python.org.
           */
  
          static PyMethodDef StructUfuncTestMethods[] = {
@@ -856,25 +841,25 @@ The C file is given below.
  
          /* The loop definition must precede the PyMODINIT_FUNC. */
  
-        static void add_uint64_triplet(char **args, npy_intp *dimensions,
-                                    npy_intp* steps, void* data)
+        static void add_uint64_triplet(char **args, const npy_intp *dimensions,
+                                       const npy_intp *steps, void *data)
          {
              npy_intp i;
-            npy_intp is1=steps[0];
-            npy_intp is2=steps[1];
-            npy_intp os=steps[2];
-            npy_intp n=dimensions[0];
+            npy_intp is1 = steps[0];
+            npy_intp is2 = steps[1];
+            npy_intp os = steps[2];
+            npy_intp n = dimensions[0];
              uint64_t *x, *y, *z;
  
-            char *i1=args[0];
-            char *i2=args[1];
-            char *op=args[2];
+            char *i1 = args[0];
+            char *i2 = args[1];
+            char *op = args[2];
  
              for (i = 0; i < n; i++) {
  
-                x = (uint64_t*)i1;
-                y = (uint64_t*)i2;
-                z = (uint64_t*)op;
+                x = (uint64_t *)i1;
+                y = (uint64_t *)i2;
+                z = (uint64_t *)op;
  
                  z[0] = x[0] + y[0];
                  z[1] = x[1] + y[1];
@@ -892,8 +877,6 @@ The C file is given below.
          /* These are the input and return dtypes of add_uint64_triplet. */
          static char types[3] = {NPY_UINT64, NPY_UINT64, NPY_UINT64};
  
-        static void *data[1] = {NULL};
-
          static struct PyModuleDef moduledef = {
              PyModuleDef_HEAD_INIT,
              "struct_ufunc_test",
@@ -924,11 +907,11 @@ The C file is given below.
  
              /* Create a new ufunc object */
              add_triplet = PyUFunc_FromFuncAndData(NULL, NULL, NULL, 0, 2, 1,
-                                            PyUFunc_None, "add_triplet",
-                                            "add_triplet_docstring", 0);
+                                                  PyUFunc_None, "add_triplet",
+                                                  "add_triplet_docstring", 0);
  
              dtype_dict = Py_BuildValue("[(s, s), (s, s), (s, s)]",
-                "f0", "u8", "f1", "u8", "f2", "u8");
+                                       "f0", "u8", "f1", "u8", "f2", "u8");
              PyArray_DescrConverter(dtype_dict, &dtype);
              Py_DECREF(dtype_dict);
  
@@ -938,10 +921,10 @@ The C file is given below.
  
              /* Register ufunc for structured dtype */
              PyUFunc_RegisterLoopForDescr(add_triplet,
-                                        dtype,
-                                        &add_uint64_triplet,
-                                        dtypes,
-                                        NULL);
+                                         dtype,
+                                         &add_uint64_triplet,
+                                         dtypes,
+                                         NULL);
  
              d = PyModule_GetDict(m);
  
@@ -963,9 +946,9 @@ adapted from the umath module
          static PyUFuncGenericFunction atan2_functions[] = {
                                PyUFunc_ff_f, PyUFunc_dd_d,
                                PyUFunc_gg_g, PyUFunc_OO_O_method};
-        static void* atan2_data[] = {
-                              (void *)atan2f,(void *) atan2,
-                              (void *)atan2l,(void *)"arctan2"};
+        static void *atan2_data[] = {
+                              (void *)atan2f, (void *)atan2,
+                              (void *)atan2l, (void *)"arctan2"};
          static char atan2_signatures[] = {
                        NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
                        NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
diff --git a/doc/source/user/depending_on_numpy.rst b/doc/source/user/depending_on_numpy.rst

index d8e97ef1f967aa12a5ae7f58ea36312366b4c9b4..c61b1d7fed8d303da67fe123d715d1b056d1ce7c 100644 (file)
--- a/doc/source/user/depending_on_numpy.rst
+++ b/doc/source/user/depending_on_numpy.rst
@@ -114,26 +114,30 @@ for dropping support for old Python and NumPy versions: :ref:`NEP29`. We
  recommend all packages depending on NumPy to follow the recommendations in NEP
  29.
  
-For *run-time dependencies*, you specify the range of versions in
+For *run-time dependencies*, specify version bounds using
  ``install_requires`` in ``setup.py`` (assuming you use ``numpy.distutils`` or
-``setuptools`` to build). Getting the upper bound right for NumPy is slightly
-tricky. If we don't set any bound, a too-new version will be pulled in a few
-years down the line, and NumPy may have deprecated and removed some API that
-your package depended on by then. On the other hand if you set the upper bound
-to the newest already-released version, then as soon as a new NumPy version is
-released there will be no matching version of your package that works with it.
-
-What to do here depends on your release frequency. Given that NumPy releases
-come in a 6-monthly cadence and that features that get deprecated in NumPy
-should stay around for another two releases, a good upper bound is
-``<1.(xx+3).0`` - where ``xx`` is the minor version of the latest
-already-released NumPy. This is safe to do if you release at least once a year.
-If your own releases are much less frequent, you may set the upper bound a
-little further into the future - this is a trade-off between a future NumPy
-version _maybe_ removing something you rely on, and the upper bound being
-exceeded which _may_ lead to your package being hard to install in combination
-with other packages relying on the latest NumPy.
-
+``setuptools`` to build).
+
+Most libraries that rely on NumPy will not need to set an upper
+version bound: NumPy is careful to preserve backward-compatibility.
+
+That said, if your project (a) is guaranteed to release frequently,
+(b) uses a large part of NumPy's API surface, and (c) is likely to be
+affected by changes in NumPy, you can set an
+upper bound of ``<MAJOR.MINOR + N`` with N no less than 3, and
+``MAJOR.MINOR`` being the current release of NumPy [*]_. If you use the NumPy
+C API (directly or via Cython), you can also pin the current major
+version to prevent ABI breakage. Note that setting an upper bound on
+NumPy may `affect the ability of your library to be installed
+alongside other, newer packages
+<https://iscinumpy.dev/post/bound-version-constraints/>`__.
+
+.. [*] The reason for setting ``N=3`` is that NumPy will, on the
+       rare occasion where it makes breaking changes, raise warnings
+       for at least two releases. (NumPy releases about once every six
+       months, so this translates to a window of at least a year;
+       hence the subsequent requirement that your project releases at
+       least on that cadence.)
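
A minimal sketch of how such a bound could be derived from the current release string. The helper name ``numpy_upper_bound`` is hypothetical and not part of NumPy or setuptools; it only illustrates the ``<MAJOR.MINOR + N`` rule described above.

```python
def numpy_upper_bound(current_release: str, n: int = 3) -> str:
    """Return a requirement string capping NumPy below MAJOR.(MINOR + n).

    Hypothetical helper for illustration only.
    """
    if n < 3:
        raise ValueError("n should be no less than 3")
    # Only the major and minor components matter for the bound.
    major, minor = (int(part) for part in current_release.split(".")[:2])
    return f"numpy<{major}.{minor + n}"


# Possible use in setup.py: install_requires=[numpy_upper_bound("1.23.0")]
print(numpy_upper_bound("1.23.0"))
```

With ``1.23.0`` as the current release and the default ``N=3``, the resulting requirement is ``numpy<1.26``.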
  
  .. note::
  
diff --git a/doc/source/user/how-to-index.rst b/doc/source/user/how-to-index.rst

new file mode 100644 (file)

index 0000000..41061d5
--- /dev/null
+++ b/doc/source/user/how-to-index.rst
@@ -0,0 +1,351 @@
+.. currentmodule:: numpy
+
+.. _how-to-index.rst:
+
+*****************************************
+How to index :class:`ndarrays <.ndarray>`
+*****************************************
+
+.. seealso:: :ref:`basics.indexing`
+
+This page tackles common examples. For an in-depth look into indexing, refer
+to :ref:`basics.indexing`.
+
+Access specific/arbitrary rows and columns
+==========================================
+
+Use :ref:`basic-indexing` features like :ref:`slicing-and-striding`, and
+:ref:`dimensional-indexing-tools`.
+
+    >>> a = np.arange(30).reshape(2, 3, 5)
+    >>> a
+    array([[[ 0,  1,  2,  3,  4],
+            [ 5,  6,  7,  8,  9],
+            [10, 11, 12, 13, 14]],
+    <BLANKLINE>
+           [[15, 16, 17, 18, 19],
+            [20, 21, 22, 23, 24],
+            [25, 26, 27, 28, 29]]])
+    >>> a[0, 2, :]
+    array([10, 11, 12, 13, 14])
+    >>> a[0, :, 3]
+    array([ 3,  8, 13])
+    
+Note that the output from indexing operations can have different shape from the
+original object. To preserve the original dimensions after indexing, you can
+use :func:`newaxis`. To use other such tools, refer to
+:ref:`dimensional-indexing-tools`.
+
+    >>> a[0, :, 3].shape
+    (3,)
+    >>> a[0, :, 3, np.newaxis].shape
+    (3, 1)
+    >>> a[0, :, 3, np.newaxis, np.newaxis].shape
+    (3, 1, 1)
+
+Variables can also be used to index::
+
+    >>> y = 0
+    >>> a[y, :, y+3]
+    array([ 3,  8, 13])
+
+Refer to :ref:`dealing-with-variable-indices` to see how to use
+:term:`python:slice` and :py:data:`Ellipsis` in your index variables.
+
+Index columns
+-------------
+
+To index columns, you have to index the last axis. Use
+:ref:`dimensional-indexing-tools` to get the desired number of dimensions::
+
+    >>> a = np.arange(24).reshape(2, 3, 4)
+    >>> a
+    array([[[ 0,  1,  2,  3],
+            [ 4,  5,  6,  7],
+            [ 8,  9, 10, 11]],
+    <BLANKLINE>
+           [[12, 13, 14, 15],
+            [16, 17, 18, 19],
+            [20, 21, 22, 23]]])
+    >>> a[..., 3]
+    array([[ 3,  7, 11],
+           [15, 19, 23]])
+
+To index specific elements in each column, make use of :ref:`advanced-indexing`
+as below::
+
+    >>> arr = np.arange(3*4).reshape(3, 4)
+    >>> arr
+    array([[ 0,  1,  2,  3],
+           [ 4,  5,  6,  7],
+           [ 8,  9, 10, 11]])
+    >>> column_indices = [[1, 3], [0, 2], [2, 2]]
+    >>> np.arange(arr.shape[0])
+    array([0, 1, 2])
+    >>> row_indices = np.arange(arr.shape[0])[:, np.newaxis]
+    >>> row_indices
+    array([[0],
+           [1],
+           [2]])
+
+Use the ``row_indices`` and ``column_indices`` for advanced
+indexing::
+
+    >>> arr[row_indices, column_indices]
+    array([[ 1,  3],
+           [ 4,  6],
+           [10, 10]])
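
As a sketch of an alternative, ``np.take_along_axis`` (mentioned in the next section) appears to give the same selection without building ``row_indices`` by hand. The variable names ``picked`` and ``picked_along_axis`` are illustrative only.

```python
import numpy as np

arr = np.arange(3 * 4).reshape(3, 4)
column_indices = np.array([[1, 3], [0, 2], [2, 2]])

# Advanced indexing with an explicit column vector of row numbers.
row_indices = np.arange(arr.shape[0])[:, np.newaxis]
picked = arr[row_indices, column_indices]

# take_along_axis pairs row i of the index array with row i of arr,
# so no separate row_indices array is needed.
picked_along_axis = np.take_along_axis(arr, column_indices, axis=1)

print(np.array_equal(picked, picked_along_axis))
```

Note that ``take_along_axis`` expects the indices as an integer ndarray with the same number of dimensions as ``arr``, which is why the nested list is converted with ``np.array`` first.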
+
+Index along a specific axis
+---------------------------
+
+Use :meth:`take`. See also :meth:`take_along_axis` and
+:meth:`put_along_axis`.
+
+    >>> a = np.arange(30).reshape(2, 3, 5)
+    >>> a
+    array([[[ 0,  1,  2,  3,  4],
+            [ 5,  6,  7,  8,  9],
+            [10, 11, 12, 13, 14]],
+    <BLANKLINE>
+           [[15, 16, 17, 18, 19],
+            [20, 21, 22, 23, 24],
+            [25, 26, 27, 28, 29]]])
+    >>> np.take(a, [2, 3], axis=2)
+    array([[[ 2,  3],
+            [ 7,  8],
+            [12, 13]],
+    <BLANKLINE>
+           [[17, 18],
+            [22, 23],
+            [27, 28]]])
+    >>> np.take(a, [2], axis=1)
+    array([[[10, 11, 12, 13, 14]],
+    <BLANKLINE>
+           [[25, 26, 27, 28, 29]]])
+
+Create subsets of larger matrices
+=================================
+
+Use :ref:`slicing-and-striding` to access chunks of a large array::
+
+    >>> a = np.arange(100).reshape(10, 10)
+    >>> a
+    array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
+           [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
+           [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
+           [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
+           [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
+           [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
+           [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
+           [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
+           [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
+           [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
+    >>> a[2:5, 2:5]
+    array([[22, 23, 24],
+           [32, 33, 34],
+           [42, 43, 44]])
+    >>> a[2:5, 1:3]
+    array([[21, 22],
+           [31, 32],
+           [41, 42]])
+    >>> a[:5, :5]
+    array([[ 0,  1,  2,  3,  4],
+           [10, 11, 12, 13, 14],
+           [20, 21, 22, 23, 24],
+           [30, 31, 32, 33, 34],
+           [40, 41, 42, 43, 44]])
+
+The same thing can be done with advanced indexing in a slightly more complex
+way. Remember that
+:ref:`advanced indexing creates a copy <indexing-operations>`::
+
+    >>> a[np.arange(5)[:, None], np.arange(5)[None, :]]
+    array([[ 0,  1,  2,  3,  4],
+           [10, 11, 12, 13, 14],
+           [20, 21, 22, 23, 24],
+           [30, 31, 32, 33, 34],
+           [40, 41, 42, 43, 44]])
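
The broadcasting index arrays above can also be built with ``np.ix_``. The following is a small sketch (the names ``manual`` and ``with_ix`` are illustrative) checking that both spellings select the same block as the slice ``a[:5, :5]``.

```python
import numpy as np

a = np.arange(100).reshape(10, 10)
rows = np.arange(5)
cols = np.arange(5)

# Hand-written open mesh: a (5, 1) column against a (1, 5) row.
manual = a[rows[:, None], cols[None, :]]

# np.ix_ builds the same pair of broadcastable index arrays.
with_ix = a[np.ix_(rows, cols)]

print(np.array_equal(manual, with_ix))
print(np.array_equal(with_ix, a[:5, :5]))
```

``np.ix_`` is mostly a readability aid here; the result is still produced by advanced indexing.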
+
+You can also use :meth:`mgrid` to generate indices::
+
+    >>> indices = np.mgrid[0:6:2]
+    >>> indices
+    array([0, 2, 4])
+    >>> a[:, indices]
+    array([[ 0,  2,  4],
+           [10, 12, 14],
+           [20, 22, 24],
+           [30, 32, 34],
+           [40, 42, 44],
+           [50, 52, 54],
+           [60, 62, 64],
+           [70, 72, 74],
+           [80, 82, 84],
+           [90, 92, 94]])
+
+Filter values
+=============
+
+Non-zero elements
+-----------------
+
+Use :meth:`nonzero` to get a tuple of index arrays, one per dimension,
+that locate the non-zero elements::
+
+       >>> z = np.array([[1, 2, 3, 0], [0, 0, 5, 3], [4, 6, 0, 0]])
+       >>> z
+       array([[1, 2, 3, 0],
+              [0, 0, 5, 3],
+              [4, 6, 0, 0]])
+       >>> np.nonzero(z)
+       (array([0, 0, 0, 1, 1, 2, 2]), array([0, 1, 2, 2, 3, 0, 1]))
+
+Use :meth:`flatnonzero` to fetch indices of elements that are non-zero in
+the flattened version of the ndarray::
+
+       >>> np.flatnonzero(z)
+       array([0, 1, 2, 6, 7, 8, 9])
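
A short sketch of how the two results relate, assuming the same ``z`` as above: the tuple from ``np.nonzero`` can index ``z`` directly to fetch the values, and ``np.unravel_index`` converts the flat indices back into per-dimension indices.

```python
import numpy as np

z = np.array([[1, 2, 3, 0], [0, 0, 5, 3], [4, 6, 0, 0]])

# Index with the tuple of index arrays to fetch the non-zero values.
rows, cols = np.nonzero(z)
values = z[rows, cols]

# Convert flat indices back to (row, column) pairs.
flat = np.flatnonzero(z)
rows_from_flat, cols_from_flat = np.unravel_index(flat, z.shape)

print(values)
print(np.array_equal(rows, rows_from_flat), np.array_equal(cols, cols_from_flat))
```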
+
+Arbitrary conditions
+--------------------
+
+Use :meth:`where` to generate indices based on conditions and then
+use :ref:`advanced-indexing`.
+
+    >>> a = np.arange(30).reshape(2, 3, 5)
+    >>> indices = np.where(a % 2 == 0)
+    >>> indices
+    (array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]),
+     array([0, 0, 0, 1, 1, 2, 2, 2, 0, 0, 1, 1, 1, 2, 2]),
+     array([0, 2, 4, 1, 3, 0, 2, 4, 1, 3, 0, 2, 4, 1, 3]))
+    >>> a[indices]
+    array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
+
+Or, use :ref:`boolean-indexing`::
+
+    >>> a > 14
+    array([[[False, False, False, False, False],
+            [False, False, False, False, False],
+            [False, False, False, False, False]],
+    <BLANKLINE>
+           [[ True,  True,  True,  True,  True],
+            [ True,  True,  True,  True,  True],
+            [ True,  True,  True,  True,  True]]])
+    >>> a[a > 14]
+    array([15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
+
+Replace values after filtering
+------------------------------
+
+Use assignment with filtering to replace desired values::
+
+    >>> p = np.arange(-10, 10).reshape(2, 2, 5)
+    >>> p
+    array([[[-10,  -9,  -8,  -7,  -6],
+            [ -5,  -4,  -3,  -2,  -1]],
+    <BLANKLINE>
+           [[  0,   1,   2,   3,   4],
+            [  5,   6,   7,   8,   9]]])
+    >>> q = p < 0
+    >>> q
+    array([[[ True,  True,  True,  True,  True],
+            [ True,  True,  True,  True,  True]],
+    <BLANKLINE>
+           [[False, False, False, False, False],
+            [False, False, False, False, False]]])
+    >>> p[q] = 0
+    >>> p
+    array([[[0, 0, 0, 0, 0],
+            [0, 0, 0, 0, 0]],
+    <BLANKLINE>
+           [[0, 1, 2, 3, 4],
+            [5, 6, 7, 8, 9]]])
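
If the original array should stay untouched, one option is the three-argument form of ``np.where``, which returns a new array instead of assigning in place. This is a sketch of that alternative, not part of the example above; the name ``replaced`` is illustrative.

```python
import numpy as np

p = np.arange(-10, 10).reshape(2, 2, 5)

# Where the condition holds take 0, elsewhere keep the value from p.
replaced = np.where(p < 0, 0, p)

print(replaced.min(), p.min())
```

Here ``p`` keeps its negative entries, while ``replaced`` holds the filtered result.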
+
+Fetch indices of max/min values
+===============================
+
+Use :meth:`argmax` and :meth:`argmin`::
+
+    >>> a = np.arange(30).reshape(2, 3, 5)
+    >>> np.argmax(a)
+    29
+    >>> np.argmin(a)
+    0
+
+Use the ``axis`` keyword to get the indices of maximum and minimum
+values along a specific axis::
+
+    >>> np.argmax(a, axis=0)
+    array([[1, 1, 1, 1, 1],
+           [1, 1, 1, 1, 1],
+           [1, 1, 1, 1, 1]])
+    >>> np.argmax(a, axis=1)
+    array([[2, 2, 2, 2, 2],
+           [2, 2, 2, 2, 2]])
+    >>> np.argmax(a, axis=2)
+    array([[4, 4, 4],
+           [4, 4, 4]])
+    >>> np.argmin(a, axis=1)
+    array([[0, 0, 0, 0, 0],
+           [0, 0, 0, 0, 0]])
+    >>> np.argmin(a, axis=2)
+    array([[0, 0, 0],
+           [0, 0, 0]])
+
+Set ``keepdims`` to ``True`` to keep the axes which are reduced in the
+result as dimensions with size one::
+
+    >>> np.argmin(a, axis=2, keepdims=True)
+    array([[[0],
+            [0],
+            [0]],
+    <BLANKLINE>
+           [[0],
+            [0],
+            [0]]])
+    >>> np.argmax(a, axis=1, keepdims=True)
+    array([[[2, 2, 2, 2, 2]],
+    <BLANKLINE>
+           [[2, 2, 2, 2, 2]]])
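
Without ``axis``, ``argmax`` and ``argmin`` return an index into the flattened array. A minimal sketch of turning that flat index into a per-dimension index with ``np.unravel_index`` (the name ``position`` is illustrative):

```python
import numpy as np

a = np.arange(30).reshape(2, 3, 5)

flat_index = np.argmax(a)

# Convert the flat index into one index per dimension of a.
position = np.unravel_index(flat_index, a.shape)

print(position, a[position])
```

The resulting tuple can be used directly as an index, so ``a[position]`` is the maximum value.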
+
+Index the same ndarray multiple times efficiently
+=================================================
+
+Keep in mind that basic indexing produces :term:`views <view>`,
+whereas advanced indexing produces :term:`copies <copy>`, which cost
+extra time and memory. Prefer basic indexing over advanced indexing
+wherever possible.
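
The difference can be observed directly. The sketch below uses a small one-dimensional array (the names ``view`` and ``copy`` are illustrative) and ``np.shares_memory`` to check whether a result still refers to the original buffer.

```python
import numpy as np

a = np.arange(10)

view = a[2:5]          # basic indexing (slice)
copy = a[[2, 3, 4]]    # advanced indexing (integer list)

print(np.shares_memory(a, view), np.shares_memory(a, copy))

# Writing through the view changes a; writing to the copy does not.
view[0] = 99
copy[1] = -1
print(a[2], a[3])
```

Because the view shares memory with ``a``, no data is duplicated when it is created, which is where the efficiency gain comes from.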
+
+Further reading
+===============
+
+Nicolas Rougier's `100 NumPy exercises <https://github.com/rougier/numpy-100>`_
+provide good insight into how indexing is combined with other operations.
+Exercises `6`_, `8`_, `10`_, `15`_, `16`_, `19`_, `20`_, `45`_, `59`_,
+`64`_, `65`_, `70`_, `71`_, `72`_, `76`_, `80`_, `81`_, `84`_, `87`_, `90`_,
+`93`_, `94`_ are especially focused on indexing.
+
+.. _6: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#6-create-a-null-vector-of-size-10-but-the-fifth-value-which-is-1-
+.. _8: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#8-reverse-a-vector-first-element-becomes-last-
+.. _10: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#10-find-indices-of-non-zero-elements-from-120040-
+.. _15: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#15-create-a-2d-array-with-1-on-the-border-and-0-inside-
+.. _16: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#16-how-to-add-a-border-filled-with-0s-around-an-existing-array-
+.. _19: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#19-create-a-8x8-matrix-and-fill-it-with-a-checkerboard-pattern-
+.. _20: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#20-consider-a-678-shape-array-what-is-the-index-xyz-of-the-100th-element-
+.. _45: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#45-create-random-vector-of-size-10-and-replace-the-maximum-value-by-0-
+.. _59: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#59-how-to-sort-an-array-by-the-nth-column-
+.. _64: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#64-consider-a-given-vector-how-to-add-1-to-each-element-indexed-by-a-second-vector-be-careful-with-repeated-indices-
+.. _65: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#65-how-to-accumulate-elements-of-a-vector-x-to-an-array-f-based-on-an-index-list-i-
+.. _70: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#70-consider-the-vector-1-2-3-4-5-how-to-build-a-new-vector-with-3-consecutive-zeros-interleaved-between-each-value-
+.. _71: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#71-consider-an-array-of-dimension-553-how-to-mulitply-it-by-an-array-with-dimensions-55-
+.. _72: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#72-how-to-swap-two-rows-of-an-array-
+.. _76: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#76-consider-a-one-dimensional-array-z-build-a-two-dimensional-array-whose-first-row-is-z0z1z2-and-each-subsequent-row-is--shifted-by-1-last-row-should-be-z-3z-2z-1-
+.. _80: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#80-consider-an-arbitrary-array-write-a-function-that-extract-a-subpart-with-a-fixed-shape-and-centered-on-a-given-element-pad-with-a-fill-value-when-necessary-
+.. _81: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#81-consider-an-array-z--1234567891011121314-how-to-generate-an-array-r--1234-2345-3456--11121314-
+.. _84: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#84-extract-all-the-contiguous-3x3-blocks-from-a-random-10x10-matrix-
+.. _87: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#87-consider-a-16x16-array-how-to-get-the-block-sum-block-size-is-4x4-
+.. _90: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#90-given-an-arbitrary-number-of-vectors-build-the-cartesian-product-every-combinations-of-every-item-
+.. _93: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#93-consider-two-arrays-a-and-b-of-shape-83-and-22-how-to-find-rows-of-a-that-contain-elements-of-each-row-of-b-regardless-of-the-order-of-the-elements-in-b-
+.. _94: https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md#94-considering-a-10x3-matrix-extract-rows-with-unequal-values-eg-223-
+\ No newline at end of file
diff --git a/doc/source/user/howtos_index.rst b/doc/source/user/howtos_index.rst
index 89a6f54e791cb98e2d1ae3c4bf694c2fbc77a87f..2d66d06381d97343c6ceac4b897a761138815b5a 100644
--- a/doc/source/user/howtos_index.rst
+++ b/doc/source/user/howtos_index.rst
@@ -13,3 +13,4 @@ the package, see the :ref:`API reference <reference>`.
  
     how-to-how-to
     how-to-io
+   how-to-index
diff --git a/doc/source/user/index.rst b/doc/source/user/index.rst
index e5c51351e073d64a85d7de356564c87630c9f9f4..3ce6bd804f5bd8186d737e58e9666d0cff0a2e42 100644
--- a/doc/source/user/index.rst
+++ b/doc/source/user/index.rst
@@ -1,5 +1,3 @@
-:orphan:
-
  .. _user:
  
  ################
diff --git a/doc/source/user/misc.rst b/doc/source/user/misc.rst
index 31647315146dd454d6460c70b931e4c8ca54c4be..9b6aa65e2a4458959018e12bbbb10312082729f2 100644
--- a/doc/source/user/misc.rst
+++ b/doc/source/user/misc.rst
@@ -19,10 +19,10 @@ Note: cannot use equality to test NaNs. E.g.: ::
   False
   >>> myarr[myarr == np.nan] = 0. # doesn't work
   >>> myarr
- array([  1.,   0.,  NaN,   3.])
+ array([  1.,   0.,  nan,   3.])
   >>> myarr[np.isnan(myarr)] = 0. # use this instead find
   >>> myarr
- array([ 1.,  0.,  0.,  3.])
+ array([1.,  0.,  0.,  3.])
  
  Other related special value functions: ::
  
@@ -79,20 +79,24 @@ Examples
  
   >>> oldsettings = np.seterr(all='warn')
   >>> np.zeros(5,dtype=np.float32)/0.
- invalid value encountered in divide
+ Traceback (most recent call last):
+ ...
+ RuntimeWarning: invalid value encountered in divide
   >>> j = np.seterr(under='ignore')
   >>> np.array([1.e-100])**10
+ array([0.])
   >>> j = np.seterr(invalid='raise')
   >>> np.sqrt(np.array([-1.]))
+ Traceback (most recent call last):
+ ...
   FloatingPointError: invalid value encountered in sqrt
   >>> def errorhandler(errstr, errflag):
   ...      print("saw stupid error!")
   >>> np.seterrcall(errorhandler)
- <function err_handler at 0x...>
   >>> j = np.seterr(all='call')
   >>> np.zeros(5, dtype=np.int32)/0
- FloatingPointError: invalid value encountered in divide
   saw stupid error!
+ array([nan, nan, nan, nan, nan])
   >>> j = np.seterr(**oldsettings) # restore previous
   ...                              # error-handling settings
  
@@ -119,8 +123,6 @@ Only a survey of the choices. Little detail on how each works.
  
       - getting it wrong leads to memory leaks, and worse, segfaults
  
-   - API will change for Python 3.0!
-
  2) Cython
  
   - Plusses:
@@ -179,21 +181,7 @@ Only a survey of the choices. Little detail on how each works.
     - doesn't necessarily avoid reference counting issues or needing to know
       API's
  
-5) scipy.weave
-
- - Plusses:
-
-   - can turn many numpy expressions into C code
-   - dynamic compiling and loading of generated C code
-   - can embed pure C code in Python module and have weave extract, generate
-     interfaces and compile, etc.
-
- - Minuses:
-
-   - Future very uncertain: it's the only part of Scipy not ported to Python 3
-     and is effectively deprecated in favor of Cython.
-
-6) Psyco
+5) Psyco
  
   - Plusses:
  
@@ -222,5 +210,3 @@ Interfacing to C++:
   3) Boost.python
   4) SWIG
   5) SIP (used mainly in PyQT)
-
-
diff --git a/doc/source/user/quickstart.rst b/doc/source/user/quickstart.rst
index a9cfeca31553c763b81cbac6354920cb2e532c2d..8e0e3b6ba3a10e359e7531aefa9c2a4d983458f6 100644
--- a/doc/source/user/quickstart.rst
+++ b/doc/source/user/quickstart.rst
@@ -193,7 +193,7 @@ state of the memory. By default, the dtype of the created array is
             [[1, 1, 1, 1],
              [1, 1, 1, 1],
              [1, 1, 1, 1]]], dtype=int16)
-    >>> np.empty((2, 3))
+    >>> np.empty((2, 3)) #doctest: +SKIP
      array([[3.73603959e-262, 6.02658058e-154, 6.55490914e-260],  # may vary
             [5.30498948e-313, 3.14673309e-307, 1.00000000e+000]])
  
@@ -868,9 +868,9 @@ copy.
      >>> def f(x):
      ...     print(id(x))
      ...
-    >>> id(a)  # id is a unique identifier of an object
+    >>> id(a)  # id is a unique identifier of an object #doctest: +SKIP
      148293216  # may vary
-    >>> f(a)
+    >>> f(a)   #doctest: +SKIP
      148293216  # may vary
  
  View or Shallow Copy
@@ -1272,6 +1272,7 @@ set <https://en.wikipedia.org/wiki/Mandelbrot_set>`__:
      ...         z[diverge] = r                          # avoid diverging too much
      ...
      ...     return divtime
+    >>> plt.clf()
      >>> plt.imshow(mandelbrot(400, 400))
  
  The second way of indexing with booleans is more similar to integer
@@ -1468,9 +1469,10 @@ that ``pylab.hist`` plots the histogram automatically, while
      >>> v = rg.normal(mu, sigma, 10000)
      >>> # Plot a normalized histogram with 50 bins
      >>> plt.hist(v, bins=50, density=True)       # matplotlib version (plot)
+    (array...)
      >>> # Compute the histogram with numpy and then plot it
      >>> (n, bins) = np.histogram(v, bins=50, density=True)  # NumPy version (no plot)
-    >>> plt.plot(.5 * (bins[1:] + bins[:-1]), n)
+    >>> plt.plot(.5 * (bins[1:] + bins[:-1]), n) #doctest: +SKIP
  
  With Matplotlib >=3.4 you can also use ``plt.stairs(n, bins)``.
  
diff --git a/doc/source/user/whatisnumpy.rst b/doc/source/user/whatisnumpy.rst
index e152a4ae2ee54c9dd345493504ef95a6374bb248..892ac53470902c78cd907688662341326ba44123 100644
--- a/doc/source/user/whatisnumpy.rst
+++ b/doc/source/user/whatisnumpy.rst
@@ -60,7 +60,7 @@ and initializations, memory allocation, etc.)
  
  ::
  
-  for (i = 0; i < rows; i++): {
+  for (i = 0; i < rows; i++) {
      c[i] = a[i]*b[i];
    }
  
@@ -72,8 +72,8 @@ array, for example, the C code (abridged as before) expands to
  
  ::
  
-  for (i = 0; i < rows; i++): {
-    for (j = 0; j < columns; j++): {
+  for (i = 0; i < rows; i++) {
+    for (j = 0; j < columns; j++) {
        c[i][j] = a[i][j]*b[i][j];
      }
    }
diff --git a/environment.yml b/environment.yml
index e63fe87617e34e8ef89ee9a664340f5e8e47e7d8..24bf7383961cb5a93a2f961c7a947f6160f201eb 100644
--- a/environment.yml
+++ b/environment.yml
@@ -8,7 +8,7 @@ channels:
    - conda-forge
  dependencies:
    - python=3.9 #need to pin to avoid issues with builds
-  - cython=0.29.30
+  - cython>=0.29.30
    - compilers
    - openblas
    - nomkl
@@ -19,7 +19,8 @@ dependencies:
    - pytest-xdist
    - hypothesis
    # For type annotations
-  - mypy=0.940
+  - mypy=0.950
+  - typing_extensions>=4.2.0
    # For building docs
    - sphinx=4.5.0
    - sphinx-panels
@@ -29,6 +30,7 @@ dependencies:
    - pandas
    - matplotlib
    - pydata-sphinx-theme=0.8.1
+  - doxygen
    # NOTE: breathe 4.33.0 collides with sphinx.ext.graphviz
    - breathe!=4.33.0
    # For linting
diff --git a/numpy/__init__.cython-30.pxd b/numpy/__init__.cython-30.pxd
index 42a46d0b832b8e3c0a176512c59a2463ae416124..5fd6086e0701d9fa400333dcc4088152e1ccd86d 100644
--- a/numpy/__init__.cython-30.pxd
+++ b/numpy/__init__.cython-30.pxd
@@ -133,7 +133,6 @@ cdef extern from "numpy/arrayobject.h":
          NPY_ALIGNED
          NPY_NOTSWAPPED
          NPY_WRITEABLE
-        NPY_UPDATEIFCOPY
          NPY_ARR_HAS_DESCR
  
          NPY_BEHAVED
@@ -165,7 +164,7 @@ cdef extern from "numpy/arrayobject.h":
          NPY_ARRAY_ALIGNED
          NPY_ARRAY_NOTSWAPPED
          NPY_ARRAY_WRITEABLE
-        NPY_ARRAY_UPDATEIFCOPY
+        NPY_ARRAY_WRITEBACKIFCOPY
  
          NPY_ARRAY_BEHAVED
          NPY_ARRAY_BEHAVED_NS
diff --git a/numpy/__init__.pxd b/numpy/__init__.pxd
index 97f3da2e5673c33f9e5c3c14bee2a6e1b0a82411..03db9a0c12fa44a059916442404535bec6a543eb 100644
--- a/numpy/__init__.pxd
+++ b/numpy/__init__.pxd
@@ -130,7 +130,6 @@ cdef extern from "numpy/arrayobject.h":
          NPY_ALIGNED
          NPY_NOTSWAPPED
          NPY_WRITEABLE
-        NPY_UPDATEIFCOPY
          NPY_ARR_HAS_DESCR
  
          NPY_BEHAVED
@@ -162,7 +161,7 @@ cdef extern from "numpy/arrayobject.h":
          NPY_ARRAY_ALIGNED
          NPY_ARRAY_NOTSWAPPED
          NPY_ARRAY_WRITEABLE
-        NPY_ARRAY_UPDATEIFCOPY
+        NPY_ARRAY_WRITEBACKIFCOPY
  
          NPY_ARRAY_BEHAVED
          NPY_ARRAY_BEHAVED_NS
diff --git a/numpy/__init__.py b/numpy/__init__.py
index e8d1820a1406bd76a0d40218577fb98da1963cdf..45b0cf23c9145b7d2a62e7aba233b5bd9a1bc843 100644
--- a/numpy/__init__.py
+++ b/numpy/__init__.py
@@ -11,7 +11,7 @@ How to use the documentation
  ----------------------------
  Documentation is available in two forms: docstrings provided
  with the code, and a loose standing reference guide, available from
-`the NumPy homepage <https://www.scipy.org>`_.
+`the NumPy homepage <https://numpy.org>`_.
  
  We recommend exploring the docstrings using
  `IPython <https://ipython.org>`_, an advanced Python shell with
@@ -52,8 +52,6 @@ of numpy are available under the ``doc`` sub-module::
  
  Available subpackages
  ---------------------
-doc
-    Topical documentation on broadcasting, indexing, etc.
  lib
      Basic functions used by several sub-packages.
  random
@@ -66,8 +64,6 @@ polynomial
      Polynomial tools
  testing
      NumPy testing tools
-f2py
-    Fortran to Python Interface Generator.
  distutils
      Enhancements to distutils with support for
      Fortran compilers support and more.
@@ -413,6 +409,11 @@ else:
      # it is tidier organized.
      core.multiarray._multiarray_umath._reload_guard()
  
+    # Tell PyInstaller where to find hook-numpy.py
+    def _pyinstaller_hooks_dir():
+        from pathlib import Path
+        return [str(Path(__file__).with_name("_pyinstaller").resolve())]
+
  
  # get the version using versioneer
  from .version import __version__, git_revision as __git_version__
diff --git a/numpy/__init__.pyi b/numpy/__init__.pyi
index 55a13057e424b39a71b1c6eb4ec47a05553529f8..2eb4a0634e0f91436912f299f1982bc390280e87 100644
--- a/numpy/__init__.pyi
+++ b/numpy/__init__.pyi
@@ -16,7 +16,7 @@ if sys.version_info >= (3, 9):
  from numpy._pytesttester import PytestTester
  from numpy.core._internal import _ctypes
  
-from numpy.typing import (
+from numpy._typing import (
      # Arrays
      ArrayLike,
      NDArray,
@@ -38,6 +38,7 @@ from numpy.typing import (
  
      # DTypes
      DTypeLike,
+    _DTypeLike,
      _SupportsDType,
      _VoidDTypeLike,
  
@@ -125,7 +126,7 @@ from numpy.typing import (
      _GUFunc_Nin2_Nout1,
  )
  
-from numpy.typing._callable import (
+from numpy._typing._callable import (
      _BoolOp,
      _BoolBitOp,
      _BoolSub,
@@ -152,7 +153,7 @@ from numpy.typing._callable import (
  
  # NOTE: Numpy's mypy plugin is used for removing the types unavailable
  # to the specific platform
-from numpy.typing._extended_precision import (
+from numpy._typing._extended_precision import (
      uint128 as uint128,
      uint256 as uint256,
      int128 as int128,
@@ -167,31 +168,25 @@ from numpy.typing._extended_precision import (
      complex512 as complex512,
  )
  
-from typing import (
-    Literal as L,
-    Any,
-    ByteString,
+from collections.abc import (
      Callable,
      Container,
-    Callable,
-    Dict,
-    Generic,
-    IO,
      Iterable,
      Iterator,
-    List,
      Mapping,
-    NoReturn,
-    Optional,
-    overload,
      Sequence,
      Sized,
+)
+from typing import (
+    Literal as L,
+    Any,
+    Generic,
+    IO,
+    NoReturn,
+    overload,
      SupportsComplex,
      SupportsFloat,
      SupportsInt,
-    Text,
-    Tuple,
-    Type,
      TypeVar,
      Union,
      Protocol,
@@ -199,7 +194,6 @@ from typing import (
      Final,
      final,
      ClassVar,
-    Set,
  )
  
  # Ensures that the stubs are picked up
@@ -655,8 +649,8 @@ class _MemMapIOProtocol(Protocol):
  class _SupportsWrite(Protocol[_AnyStr_contra]):
      def write(self, s: _AnyStr_contra, /) -> object: ...
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  __version__: str
  __git_version__: str
  test: PytestTester
@@ -682,13 +676,14 @@ _NdArraySubClass = TypeVar("_NdArraySubClass", bound=ndarray)
  _DTypeScalar_co = TypeVar("_DTypeScalar_co", covariant=True, bound=generic)
  _ByteOrder = L["S", "<", ">", "=", "|", "L", "B", "N", "I"]
  
+@final
  class dtype(Generic[_DTypeScalar_co]):
-    names: None | Tuple[builtins.str, ...]
+    names: None | tuple[builtins.str, ...]
      # Overload for subclass of generic
      @overload
      def __new__(
          cls,
-        dtype: Type[_DTypeScalar_co],
+        dtype: type[_DTypeScalar_co],
          align: bool = ...,
          copy: bool = ...,
      ) -> dtype[_DTypeScalar_co]: ...
@@ -702,64 +697,64 @@ class dtype(Generic[_DTypeScalar_co]):
      # first.
      # Builtin types
      @overload
-    def __new__(cls, dtype: Type[bool], align: bool = ..., copy: bool = ...) -> dtype[bool_]: ...
+    def __new__(cls, dtype: type[bool], align: bool = ..., copy: bool = ...) -> dtype[bool_]: ...
      @overload
-    def __new__(cls, dtype: Type[int], align: bool = ..., copy: bool = ...) -> dtype[int_]: ...
+    def __new__(cls, dtype: type[int], align: bool = ..., copy: bool = ...) -> dtype[int_]: ...
      @overload
-    def __new__(cls, dtype: None | Type[float], align: bool = ..., copy: bool = ...) -> dtype[float_]: ...
+    def __new__(cls, dtype: None | type[float], align: bool = ..., copy: bool = ...) -> dtype[float_]: ...
      @overload
-    def __new__(cls, dtype: Type[complex], align: bool = ..., copy: bool = ...) -> dtype[complex_]: ...
+    def __new__(cls, dtype: type[complex], align: bool = ..., copy: bool = ...) -> dtype[complex_]: ...
      @overload
-    def __new__(cls, dtype: Type[builtins.str], align: bool = ..., copy: bool = ...) -> dtype[str_]: ...
+    def __new__(cls, dtype: type[builtins.str], align: bool = ..., copy: bool = ...) -> dtype[str_]: ...
      @overload
-    def __new__(cls, dtype: Type[bytes], align: bool = ..., copy: bool = ...) -> dtype[bytes_]: ...
+    def __new__(cls, dtype: type[bytes], align: bool = ..., copy: bool = ...) -> dtype[bytes_]: ...
  
      # `unsignedinteger` string-based representations and ctypes
      @overload
-    def __new__(cls, dtype: _UInt8Codes | Type[ct.c_uint8], align: bool = ..., copy: bool = ...) -> dtype[uint8]: ...
+    def __new__(cls, dtype: _UInt8Codes | type[ct.c_uint8], align: bool = ..., copy: bool = ...) -> dtype[uint8]: ...
      @overload
-    def __new__(cls, dtype: _UInt16Codes | Type[ct.c_uint16], align: bool = ..., copy: bool = ...) -> dtype[uint16]: ...
+    def __new__(cls, dtype: _UInt16Codes | type[ct.c_uint16], align: bool = ..., copy: bool = ...) -> dtype[uint16]: ...
      @overload
-    def __new__(cls, dtype: _UInt32Codes | Type[ct.c_uint32], align: bool = ..., copy: bool = ...) -> dtype[uint32]: ...
+    def __new__(cls, dtype: _UInt32Codes | type[ct.c_uint32], align: bool = ..., copy: bool = ...) -> dtype[uint32]: ...
      @overload
-    def __new__(cls, dtype: _UInt64Codes | Type[ct.c_uint64], align: bool = ..., copy: bool = ...) -> dtype[uint64]: ...
+    def __new__(cls, dtype: _UInt64Codes | type[ct.c_uint64], align: bool = ..., copy: bool = ...) -> dtype[uint64]: ...
      @overload
-    def __new__(cls, dtype: _UByteCodes | Type[ct.c_ubyte], align: bool = ..., copy: bool = ...) -> dtype[ubyte]: ...
+    def __new__(cls, dtype: _UByteCodes | type[ct.c_ubyte], align: bool = ..., copy: bool = ...) -> dtype[ubyte]: ...
      @overload
-    def __new__(cls, dtype: _UShortCodes | Type[ct.c_ushort], align: bool = ..., copy: bool = ...) -> dtype[ushort]: ...
+    def __new__(cls, dtype: _UShortCodes | type[ct.c_ushort], align: bool = ..., copy: bool = ...) -> dtype[ushort]: ...
      @overload
-    def __new__(cls, dtype: _UIntCCodes | Type[ct.c_uint], align: bool = ..., copy: bool = ...) -> dtype[uintc]: ...
+    def __new__(cls, dtype: _UIntCCodes | type[ct.c_uint], align: bool = ..., copy: bool = ...) -> dtype[uintc]: ...
  
      # NOTE: We're assuming here that `uint_ptr_t == size_t`,
      # an assumption that does not hold in rare cases (same for `ssize_t`)
      @overload
-    def __new__(cls, dtype: _UIntPCodes | Type[ct.c_void_p] | Type[ct.c_size_t], align: bool = ..., copy: bool = ...) -> dtype[uintp]: ...
+    def __new__(cls, dtype: _UIntPCodes | type[ct.c_void_p] | type[ct.c_size_t], align: bool = ..., copy: bool = ...) -> dtype[uintp]: ...
      @overload
-    def __new__(cls, dtype: _UIntCodes | Type[ct.c_ulong], align: bool = ..., copy: bool = ...) -> dtype[uint]: ...
+    def __new__(cls, dtype: _UIntCodes | type[ct.c_ulong], align: bool = ..., copy: bool = ...) -> dtype[uint]: ...
      @overload
-    def __new__(cls, dtype: _ULongLongCodes | Type[ct.c_ulonglong], align: bool = ..., copy: bool = ...) -> dtype[ulonglong]: ...
+    def __new__(cls, dtype: _ULongLongCodes | type[ct.c_ulonglong], align: bool = ..., copy: bool = ...) -> dtype[ulonglong]: ...
  
      # `signedinteger` string-based representations and ctypes
      @overload
-    def __new__(cls, dtype: _Int8Codes | Type[ct.c_int8], align: bool = ..., copy: bool = ...) -> dtype[int8]: ...
+    def __new__(cls, dtype: _Int8Codes | type[ct.c_int8], align: bool = ..., copy: bool = ...) -> dtype[int8]: ...
      @overload
-    def __new__(cls, dtype: _Int16Codes | Type[ct.c_int16], align: bool = ..., copy: bool = ...) -> dtype[int16]: ...
+    def __new__(cls, dtype: _Int16Codes | type[ct.c_int16], align: bool = ..., copy: bool = ...) -> dtype[int16]: ...
      @overload
-    def __new__(cls, dtype: _Int32Codes | Type[ct.c_int32], align: bool = ..., copy: bool = ...) -> dtype[int32]: ...
+    def __new__(cls, dtype: _Int32Codes | type[ct.c_int32], align: bool = ..., copy: bool = ...) -> dtype[int32]: ...
      @overload
-    def __new__(cls, dtype: _Int64Codes | Type[ct.c_int64], align: bool = ..., copy: bool = ...) -> dtype[int64]: ...
+    def __new__(cls, dtype: _Int64Codes | type[ct.c_int64], align: bool = ..., copy: bool = ...) -> dtype[int64]: ...
      @overload
-    def __new__(cls, dtype: _ByteCodes | Type[ct.c_byte], align: bool = ..., copy: bool = ...) -> dtype[byte]: ...
+    def __new__(cls, dtype: _ByteCodes | type[ct.c_byte], align: bool = ..., copy: bool = ...) -> dtype[byte]: ...
      @overload
-    def __new__(cls, dtype: _ShortCodes | Type[ct.c_short], align: bool = ..., copy: bool = ...) -> dtype[short]: ...
+    def __new__(cls, dtype: _ShortCodes | type[ct.c_short], align: bool = ..., copy: bool = ...) -> dtype[short]: ...
      @overload
-    def __new__(cls, dtype: _IntCCodes | Type[ct.c_int], align: bool = ..., copy: bool = ...) -> dtype[intc]: ...
+    def __new__(cls, dtype: _IntCCodes | type[ct.c_int], align: bool = ..., copy: bool = ...) -> dtype[intc]: ...
      @overload
-    def __new__(cls, dtype: _IntPCodes | Type[ct.c_ssize_t], align: bool = ..., copy: bool = ...) -> dtype[intp]: ...
+    def __new__(cls, dtype: _IntPCodes | type[ct.c_ssize_t], align: bool = ..., copy: bool = ...) -> dtype[intp]: ...
      @overload
-    def __new__(cls, dtype: _IntCodes | Type[ct.c_long], align: bool = ..., copy: bool = ...) -> dtype[int_]: ...
+    def __new__(cls, dtype: _IntCodes | type[ct.c_long], align: bool = ..., copy: bool = ...) -> dtype[int_]: ...
      @overload
-    def __new__(cls, dtype: _LongLongCodes | Type[ct.c_longlong], align: bool = ..., copy: bool = ...) -> dtype[longlong]: ...
+    def __new__(cls, dtype: _LongLongCodes | type[ct.c_longlong], align: bool = ..., copy: bool = ...) -> dtype[longlong]: ...
  
      # `floating` string-based representations and ctypes
      @overload
@@ -771,11 +766,11 @@ class dtype(Generic[_DTypeScalar_co]):
      @overload
      def __new__(cls, dtype: _HalfCodes, align: bool = ..., copy: bool = ...) -> dtype[half]: ...
      @overload
-    def __new__(cls, dtype: _SingleCodes | Type[ct.c_float], align: bool = ..., copy: bool = ...) -> dtype[single]: ...
+    def __new__(cls, dtype: _SingleCodes | type[ct.c_float], align: bool = ..., copy: bool = ...) -> dtype[single]: ...
      @overload
-    def __new__(cls, dtype: _DoubleCodes | Type[ct.c_double], align: bool = ..., copy: bool = ...) -> dtype[double]: ...
+    def __new__(cls, dtype: _DoubleCodes | type[ct.c_double], align: bool = ..., copy: bool = ...) -> dtype[double]: ...
      @overload
-    def __new__(cls, dtype: _LongDoubleCodes | Type[ct.c_longdouble], align: bool = ..., copy: bool = ...) -> dtype[longdouble]: ...
+    def __new__(cls, dtype: _LongDoubleCodes | type[ct.c_longdouble], align: bool = ..., copy: bool = ...) -> dtype[longdouble]: ...
  
      # `complexfloating` string-based representations
      @overload
@@ -791,7 +786,7 @@ class dtype(Generic[_DTypeScalar_co]):
  
      # Miscellaneous string-based representations and ctypes
      @overload
-    def __new__(cls, dtype: _BoolCodes | Type[ct.c_bool], align: bool = ..., copy: bool = ...) -> dtype[bool_]: ...
+    def __new__(cls, dtype: _BoolCodes | type[ct.c_bool], align: bool = ..., copy: bool = ...) -> dtype[bool_]: ...
      @overload
      def __new__(cls, dtype: _TD64Codes, align: bool = ..., copy: bool = ...) -> dtype[timedelta64]: ...
      @overload
@@ -799,11 +794,11 @@ class dtype(Generic[_DTypeScalar_co]):
      @overload
      def __new__(cls, dtype: _StrCodes, align: bool = ..., copy: bool = ...) -> dtype[str_]: ...
      @overload
-    def __new__(cls, dtype: _BytesCodes | Type[ct.c_char], align: bool = ..., copy: bool = ...) -> dtype[bytes_]: ...
+    def __new__(cls, dtype: _BytesCodes | type[ct.c_char], align: bool = ..., copy: bool = ...) -> dtype[bytes_]: ...
      @overload
      def __new__(cls, dtype: _VoidCodes, align: bool = ..., copy: bool = ...) -> dtype[void]: ...
      @overload
-    def __new__(cls, dtype: _ObjectCodes | Type[ct.py_object], align: bool = ..., copy: bool = ...) -> dtype[object_]: ...
+    def __new__(cls, dtype: _ObjectCodes | type[ct.py_object], align: bool = ..., copy: bool = ...) -> dtype[object_]: ...
  
      # dtype of a dtype is the same dtype
      @overload
@@ -840,7 +835,7 @@ class dtype(Generic[_DTypeScalar_co]):
      @overload
      def __new__(
          cls,
-        dtype: Type[object],
+        dtype: type[object],
          align: bool = ...,
          copy: bool = ...,
      ) -> dtype[object_]: ...
@@ -849,7 +844,7 @@ class dtype(Generic[_DTypeScalar_co]):
          def __class_getitem__(self, item: Any) -> GenericAlias: ...
  
      @overload
-    def __getitem__(self: dtype[void], key: List[builtins.str]) -> dtype[void]: ...
+    def __getitem__(self: dtype[void], key: list[builtins.str]) -> dtype[void]: ...
      @overload
      def __getitem__(self: dtype[void], key: builtins.str | SupportsIndex) -> dtype[Any]: ...
  
@@ -889,11 +884,11 @@ class dtype(Generic[_DTypeScalar_co]):
      @property
      def char(self) -> builtins.str: ...
      @property
-    def descr(self) -> List[Tuple[builtins.str, builtins.str] | Tuple[builtins.str, builtins.str, _Shape]]: ...
+    def descr(self) -> list[tuple[builtins.str, builtins.str] | tuple[builtins.str, builtins.str, _Shape]]: ...
      @property
      def fields(
          self,
-    ) -> None | MappingProxyType[builtins.str, Tuple[dtype[Any], int] | Tuple[dtype[Any], int, Any]]: ...
+    ) -> None | MappingProxyType[builtins.str, tuple[dtype[Any], int] | tuple[dtype[Any], int, Any]]: ...
      @property
      def flags(self) -> int: ...
      @property
@@ -919,12 +914,12 @@ class dtype(Generic[_DTypeScalar_co]):
      @property
      def ndim(self) -> int: ...
      @property
-    def subdtype(self) -> None | Tuple[dtype[Any], _Shape]: ...
+    def subdtype(self) -> None | tuple[dtype[Any], _Shape]: ...
      def newbyteorder(self: _DType, __new_order: _ByteOrder = ...) -> _DType: ...
      @property
      def str(self) -> builtins.str: ...
      @property
-    def type(self) -> Type[_DTypeScalar_co]: ...
+    def type(self) -> type[_DTypeScalar_co]: ...
  
  _ArrayLikeInt = Union[
      int,
@@ -936,6 +931,7 @@ _ArrayLikeInt = Union[
  
  _FlatIterSelf = TypeVar("_FlatIterSelf", bound=flatiter)
  
+@final
  class flatiter(Generic[_NdArraySubClass]):
      @property
      def base(self) -> _NdArraySubClass: ...
@@ -971,9 +967,9 @@ class flatiter(Generic[_NdArraySubClass]):
      @overload
      def __array__(self, dtype: _DType, /) -> ndarray[Any, _DType]: ...
  
-_OrderKACF = Optional[L["K", "A", "C", "F"]]
-_OrderACF = Optional[L["A", "C", "F"]]
-_OrderCF = Optional[L["C", "F"]]
+_OrderKACF = L[None, "K", "A", "C", "F"]
+_OrderACF = L[None, "A", "C", "F"]
+_OrderCF = L[None, "C", "F"]
  
  _ModeKind = L["raise", "wrap", "clip"]
  _PartitionKind = L["introselect"]
@@ -998,7 +994,7 @@ class _ArrayOrScalarCommon:
      def __str__(self) -> str: ...
      def __repr__(self) -> str: ...
      def __copy__(self: _ArraySelf) -> _ArraySelf: ...
-    def __deepcopy__(self: _ArraySelf, memo: None | Dict[int, Any], /) -> _ArraySelf: ...
+    def __deepcopy__(self: _ArraySelf, memo: None | dict[int, Any], /) -> _ArraySelf: ...
  
      # TODO: How to deal with the non-commutative nature of `==` and `!=`?
      # xref numpy/numpy#17368
@@ -1020,17 +1016,17 @@ class _ArrayOrScalarCommon:
      def tolist(self) -> Any: ...
  
      @property
-    def __array_interface__(self) -> Dict[str, Any]: ...
+    def __array_interface__(self) -> dict[str, Any]: ...
      @property
      def __array_priority__(self) -> float: ...
      @property
      def __array_struct__(self) -> Any: ...  # builtins.PyCapsule
-    def __setstate__(self, state: Tuple[
+    def __setstate__(self, state: tuple[
          SupportsIndex,  # version
          _ShapeLike,  # Shape
          _DType_co,  # DType
          bool,  # F-continuous
-        bytes | List[Any],  # Data
+        bytes | list[Any],  # Data
      ], /) -> None: ...
      # a `bool_` is returned when `keepdims=True` and `self` is a 0d array
  
@@ -1046,7 +1042,7 @@ class _ArrayOrScalarCommon:
      @overload
      def all(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
          *,
@@ -1055,7 +1051,7 @@ class _ArrayOrScalarCommon:
      @overload
      def all(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
          *,
@@ -1074,7 +1070,7 @@ class _ArrayOrScalarCommon:
      @overload
      def any(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
          *,
@@ -1083,7 +1079,7 @@ class _ArrayOrScalarCommon:
      @overload
      def any(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
          *,
@@ -1101,7 +1097,7 @@ class _ArrayOrScalarCommon:
      @overload
      def argmax(
          self,
-        axis: _ShapeLike = ...,
+        axis: SupportsIndex = ...,
          out: None = ...,
          *,
          keepdims: bool = ...,
@@ -1109,7 +1105,7 @@ class _ArrayOrScalarCommon:
      @overload
      def argmax(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | SupportsIndex = ...,
          out: _NdArraySubClass = ...,
          *,
          keepdims: bool = ...,
@@ -1126,7 +1122,7 @@ class _ArrayOrScalarCommon:
      @overload
      def argmin(
          self,
-        axis: _ShapeLike = ...,
+        axis: SupportsIndex = ...,
          out: None = ...,
          *,
          keepdims: bool = ...,
@@ -1134,7 +1130,7 @@ class _ArrayOrScalarCommon:
      @overload
      def argmin(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | SupportsIndex = ...,
          out: _NdArraySubClass = ...,
          *,
          keepdims: bool = ...,
@@ -1142,9 +1138,9 @@ class _ArrayOrScalarCommon:
  
      def argsort(
          self,
-        axis: Optional[SupportsIndex] = ...,
-        kind: Optional[_SortKind] = ...,
-        order: Union[None, str, Sequence[str]] = ...,
+        axis: None | SupportsIndex = ...,
+        kind: None | _SortKind = ...,
+        order: None | str | Sequence[str] = ...,
      ) -> ndarray: ...
  
      @overload
@@ -1166,7 +1162,7 @@ class _ArrayOrScalarCommon:
      def clip(
          self,
          min: ArrayLike = ...,
-        max: Optional[ArrayLike] = ...,
+        max: None | ArrayLike = ...,
          out: None = ...,
          **kwargs: Any,
      ) -> ndarray: ...
@@ -1182,7 +1178,7 @@ class _ArrayOrScalarCommon:
      def clip(
          self,
          min: ArrayLike = ...,
-        max: Optional[ArrayLike] = ...,
+        max: None | ArrayLike = ...,
          out: _NdArraySubClass = ...,
          **kwargs: Any,
      ) -> _NdArraySubClass: ...
@@ -1199,14 +1195,14 @@ class _ArrayOrScalarCommon:
      def compress(
          self,
          a: ArrayLike,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: None = ...,
      ) -> ndarray: ...
      @overload
      def compress(
          self,
          a: ArrayLike,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: _NdArraySubClass = ...,
      ) -> _NdArraySubClass: ...
  
@@ -1217,14 +1213,14 @@ class _ArrayOrScalarCommon:
      @overload
      def cumprod(
          self,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
      ) -> ndarray: ...
      @overload
      def cumprod(
          self,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
      ) -> _NdArraySubClass: ...
@@ -1232,14 +1228,14 @@ class _ArrayOrScalarCommon:
      @overload
      def cumsum(
          self,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
      ) -> ndarray: ...
      @overload
      def cumsum(
          self,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
      ) -> _NdArraySubClass: ...
@@ -1247,7 +1243,7 @@ class _ArrayOrScalarCommon:
      @overload
      def max(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
          initial: _NumberLike_co = ...,
@@ -1256,7 +1252,7 @@ class _ArrayOrScalarCommon:
      @overload
      def max(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
          initial: _NumberLike_co = ...,
@@ -1266,7 +1262,7 @@ class _ArrayOrScalarCommon:
      @overload
      def mean(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
@@ -1276,7 +1272,7 @@ class _ArrayOrScalarCommon:
      @overload
      def mean(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
@@ -1287,7 +1283,7 @@ class _ArrayOrScalarCommon:
      @overload
      def min(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
          initial: _NumberLike_co = ...,
@@ -1296,7 +1292,7 @@ class _ArrayOrScalarCommon:
      @overload
      def min(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
          initial: _NumberLike_co = ...,
@@ -1311,7 +1307,7 @@ class _ArrayOrScalarCommon:
      @overload
      def prod(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
@@ -1321,7 +1317,7 @@ class _ArrayOrScalarCommon:
      @overload
      def prod(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
@@ -1332,14 +1328,14 @@ class _ArrayOrScalarCommon:
      @overload
      def ptp(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
      ) -> Any: ...
      @overload
      def ptp(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
      ) -> _NdArraySubClass: ...
@@ -1360,10 +1356,10 @@ class _ArrayOrScalarCommon:
      @overload
      def std(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
-        ddof: int = ...,
+        ddof: float = ...,
          keepdims: bool = ...,
          *,
          where: _ArrayLikeBool_co = ...,
@@ -1371,10 +1367,10 @@ class _ArrayOrScalarCommon:
      @overload
      def std(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
-        ddof: int = ...,
+        ddof: float = ...,
          keepdims: bool = ...,
          *,
          where: _ArrayLikeBool_co = ...,
@@ -1383,7 +1379,7 @@ class _ArrayOrScalarCommon:
      @overload
      def sum(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
          keepdims: bool = ...,
@@ -1393,7 +1389,7 @@ class _ArrayOrScalarCommon:
      @overload
      def sum(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
          keepdims: bool = ...,
@@ -1404,10 +1400,10 @@ class _ArrayOrScalarCommon:
      @overload
      def var(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: None = ...,
-        ddof: int = ...,
+        ddof: float = ...,
          keepdims: bool = ...,
          *,
          where: _ArrayLikeBool_co = ...,
@@ -1415,10 +1411,10 @@ class _ArrayOrScalarCommon:
      @overload
      def var(
          self,
-        axis: Optional[_ShapeLike] = ...,
+        axis: None | _ShapeLike = ...,
          dtype: DTypeLike = ...,
          out: _NdArraySubClass = ...,
-        ddof: int = ...,
+        ddof: float = ...,
          keepdims: bool = ...,
          *,
          where: _ArrayLikeBool_co = ...,
@@ -1449,15 +1445,9 @@ _SupportsBuffer = Union[
  _T = TypeVar("_T")
  _T_co = TypeVar("_T_co", covariant=True)
  _T_contra = TypeVar("_T_contra", contravariant=True)
-_2Tuple = Tuple[_T, _T]
+_2Tuple = tuple[_T, _T]
  _CastingKind = L["no", "equiv", "safe", "same_kind", "unsafe"]
  
-_DTypeLike = Union[
-    dtype[_ScalarType],
-    Type[_ScalarType],
-    _SupportsDType[dtype[_ScalarType]],
-]
-
  _ArrayUInt_co = NDArray[Union[bool_, unsignedinteger[Any]]]
  _ArrayInt_co = NDArray[Union[bool_, integer[Any]]]
  _ArrayFloat_co = NDArray[Union[bool_, integer[Any], floating[Any]]]
@@ -1485,7 +1475,7 @@ class _SupportsImag(Protocol[_T_co]):
  
  class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @property
-    def base(self) -> Optional[ndarray]: ...
+    def base(self) -> None | ndarray: ...
      @property
      def ndim(self) -> int: ...
      @property
@@ -1503,7 +1493,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @imag.setter
      def imag(self, value: ArrayLike) -> None: ...
      def __new__(
-        cls: Type[_ArraySelf],
+        cls: type[_ArraySelf],
          shape: _ShapeLike,
          dtype: DTypeLike = ...,
          buffer: None | _SupportsBuffer = ...,
@@ -1536,37 +1526,36 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
          kwargs: Mapping[str, Any],
      ) -> Any: ...
  
-    __array_finalize__: Any
+    # NOTE: In practice any object is accepted by `obj`, but as `__array_finalize__`
+    # is a pseudo-abstract method the type has been narrowed down in order to
+    # grant subclasses a bit more flexibility
+    def __array_finalize__(self, obj: None | NDArray[Any], /) -> None: ...
  
      def __array_wrap__(
          self,
          array: ndarray[_ShapeType2, _DType],
-        context: None | Tuple[ufunc, Tuple[Any, ...], int] = ...,
+        context: None | tuple[ufunc, tuple[Any, ...], int] = ...,
          /,
      ) -> ndarray[_ShapeType2, _DType]: ...
  
      def __array_prepare__(
          self,
          array: ndarray[_ShapeType2, _DType],
-        context: None | Tuple[ufunc, Tuple[Any, ...], int] = ...,
+        context: None | tuple[ufunc, tuple[Any, ...], int] = ...,
          /,
      ) -> ndarray[_ShapeType2, _DType]: ...
  
      @overload
-    def __getitem__(self, key: Union[
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[SupportsIndex | _ArrayLikeInt_co, ...],
-    ]) -> Any: ...
+    def __getitem__(self, key: SupportsIndex | tuple[SupportsIndex, ...]) -> Any: ...
      @overload
-    def __getitem__(self, key: Union[
-        None,
-        slice,
-        ellipsis,
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...],
-    ]) -> ndarray[Any, _DType_co]: ...
+    def __getitem__(self, key: (
+        None
+        | slice
+        | ellipsis
+        | SupportsIndex
+        | _ArrayLikeInt_co
+        | tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...]
+    )) -> ndarray[Any, _DType_co]: ...
      @overload
      def __getitem__(self: NDArray[void], key: str) -> NDArray[Any]: ...
      @overload
@@ -1596,7 +1585,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def item(
          self: ndarray[Any, _dtype[_SupportsItem[_T]]],  # type: ignore[type-var]
-        args: Tuple[SupportsIndex, ...],
+        args: tuple[SupportsIndex, ...],
          /,
      ) -> _T: ...
  
@@ -1616,7 +1605,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
  
      def squeeze(
          self,
-        axis: Union[SupportsIndex, Tuple[SupportsIndex, ...]] = ...,
+        axis: SupportsIndex | tuple[SupportsIndex, ...] = ...,
      ) -> ndarray[Any, _DType_co]: ...
  
      def swapaxes(
@@ -1633,9 +1622,9 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def argpartition(
          self,
          kth: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          kind: _PartitionKind = ...,
-        order: Union[None, str, Sequence[str]] = ...,
+        order: None | str | Sequence[str] = ...,
      ) -> ndarray[Any, _dtype[intp]]: ...
  
      def diagonal(
@@ -1655,14 +1644,14 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def dot(self, b: ArrayLike, out: _NdArraySubClass) -> _NdArraySubClass: ...
  
      # `nonzero()` is deprecated for 0d arrays/generics
-    def nonzero(self) -> Tuple[ndarray[Any, _dtype[intp]], ...]: ...
+    def nonzero(self) -> tuple[ndarray[Any, _dtype[intp]], ...]: ...
  
      def partition(
          self,
          kth: _ArrayLikeInt_co,
          axis: SupportsIndex = ...,
          kind: _PartitionKind = ...,
-        order: Union[None, str, Sequence[str]] = ...,
+        order: None | str | Sequence[str] = ...,
      ) -> None: ...
  
      # `put` is technically available to `generic`,
@@ -1679,14 +1668,14 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
          self,  # >= 1D array
          v: _ScalarLike_co,  # 0D array-like
          side: _SortSide = ...,
-        sorter: Optional[_ArrayLikeInt_co] = ...,
+        sorter: None | _ArrayLikeInt_co = ...,
      ) -> intp: ...
      @overload
      def searchsorted(
          self,  # >= 1D array
          v: ArrayLike,
          side: _SortSide = ...,
-        sorter: Optional[_ArrayLikeInt_co] = ...,
+        sorter: None | _ArrayLikeInt_co = ...,
      ) -> ndarray[Any, _dtype[intp]]: ...
  
      def setfield(
@@ -1699,8 +1688,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def sort(
          self,
          axis: SupportsIndex = ...,
-        kind: Optional[_SortKind] = ...,
-        order: Union[None, str, Sequence[str]] = ...,
+        kind: None | _SortKind = ...,
+        order: None | str | Sequence[str] = ...,
      ) -> None: ...
  
      @overload
@@ -1726,7 +1715,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def take(  # type: ignore[misc]
          self: ndarray[Any, _dtype[_ScalarType]],
          indices: _IntLike_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: None = ...,
          mode: _ModeKind = ...,
      ) -> _ScalarType: ...
@@ -1734,7 +1723,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def take(  # type: ignore[misc]
          self,
          indices: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: None = ...,
          mode: _ModeKind = ...,
      ) -> ndarray[Any, _DType_co]: ...
@@ -1742,7 +1731,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def take(
          self,
          indices: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: _NdArraySubClass = ...,
          mode: _ModeKind = ...,
      ) -> _NdArraySubClass: ...
@@ -1750,7 +1739,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def repeat(
          self,
          repeats: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
      ) -> ndarray[Any, _DType_co]: ...
  
      def flatten(
@@ -1794,7 +1783,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def view(self: _ArraySelf) -> _ArraySelf: ...
      @overload
-    def view(self, type: Type[_NdArraySubClass]) -> _NdArraySubClass: ...
+    def view(self, type: type[_NdArraySubClass]) -> _NdArraySubClass: ...
      @overload
      def view(self, dtype: _DTypeLike[_ScalarType]) -> NDArray[_ScalarType]: ...
      @overload
@@ -1803,7 +1792,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def view(
          self,
          dtype: DTypeLike,
-        type: Type[_NdArraySubClass],
+        type: type[_NdArraySubClass],
      ) -> _NdArraySubClass: ...
  
      @overload
@@ -1937,6 +1926,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __matmul__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
      @overload
+    def __matmul__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __matmul__(self: NDArray[object_], other: Any) -> Any: ...
      @overload
      def __matmul__(self: NDArray[Any], other: _ArrayLikeObject_co) -> Any: ...
@@ -1952,6 +1943,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __rmatmul__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
      @overload
+    def __rmatmul__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __rmatmul__(self: NDArray[object_], other: Any) -> Any: ...
      @overload
      def __rmatmul__(self: NDArray[Any], other: _ArrayLikeObject_co) -> Any: ...
@@ -1995,7 +1988,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __divmod__(self: _ArrayFloat_co, other: _ArrayLikeFloat_co) -> _2Tuple[NDArray[floating[Any]]]: ...  # type: ignore[misc]
      @overload
-    def __divmod__(self: _ArrayTD64_co, other: _SupportsArray[_dtype[timedelta64]] | _NestedSequence[_SupportsArray[_dtype[timedelta64]]]) -> Tuple[NDArray[int64], NDArray[timedelta64]]: ...
+    def __divmod__(self: _ArrayTD64_co, other: _SupportsArray[_dtype[timedelta64]] | _NestedSequence[_SupportsArray[_dtype[timedelta64]]]) -> tuple[NDArray[int64], NDArray[timedelta64]]: ...
  
      @overload
      def __rdivmod__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> _2Tuple[NDArray[int8]]: ...  # type: ignore[misc]
@@ -2006,7 +1999,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __rdivmod__(self: _ArrayFloat_co, other: _ArrayLikeFloat_co) -> _2Tuple[NDArray[floating[Any]]]: ...  # type: ignore[misc]
      @overload
-    def __rdivmod__(self: _ArrayTD64_co, other: _SupportsArray[_dtype[timedelta64]] | _NestedSequence[_SupportsArray[_dtype[timedelta64]]]) -> Tuple[NDArray[int64], NDArray[timedelta64]]: ...
+    def __rdivmod__(self: _ArrayTD64_co, other: _SupportsArray[_dtype[timedelta64]] | _NestedSequence[_SupportsArray[_dtype[timedelta64]]]) -> tuple[NDArray[int64], NDArray[timedelta64]]: ...
  
      @overload
      def __add__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> NDArray[bool_]: ...  # type: ignore[misc]
@@ -2019,6 +2012,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __add__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __add__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __add__(self: _ArrayTD64_co, other: _ArrayLikeTD64_co) -> NDArray[timedelta64]: ...  # type: ignore[misc]
      @overload
      def __add__(self: _ArrayTD64_co, other: _ArrayLikeDT64_co) -> NDArray[datetime64]: ...
@@ -2040,6 +2035,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __radd__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __radd__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __radd__(self: _ArrayTD64_co, other: _ArrayLikeTD64_co) -> NDArray[timedelta64]: ...  # type: ignore[misc]
      @overload
      def __radd__(self: _ArrayTD64_co, other: _ArrayLikeDT64_co) -> NDArray[datetime64]: ...
@@ -2061,6 +2058,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __sub__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __sub__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __sub__(self: _ArrayTD64_co, other: _ArrayLikeTD64_co) -> NDArray[timedelta64]: ...  # type: ignore[misc]
      @overload
      def __sub__(self: NDArray[datetime64], other: _ArrayLikeTD64_co) -> NDArray[datetime64]: ...
@@ -2082,6 +2081,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __rsub__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __rsub__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __rsub__(self: _ArrayTD64_co, other: _ArrayLikeTD64_co) -> NDArray[timedelta64]: ...  # type: ignore[misc]
      @overload
      def __rsub__(self: _ArrayTD64_co, other: _ArrayLikeDT64_co) -> NDArray[datetime64]: ...  # type: ignore[misc]
@@ -2103,6 +2104,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __mul__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __mul__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __mul__(self: _ArrayTD64_co, other: _ArrayLikeFloat_co) -> NDArray[timedelta64]: ...
      @overload
      def __mul__(self: _ArrayFloat_co, other: _ArrayLikeTD64_co) -> NDArray[timedelta64]: ...
@@ -2122,6 +2125,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __rmul__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __rmul__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __rmul__(self: _ArrayTD64_co, other: _ArrayLikeFloat_co) -> NDArray[timedelta64]: ...
      @overload
      def __rmul__(self: _ArrayFloat_co, other: _ArrayLikeTD64_co) -> NDArray[timedelta64]: ...
@@ -2179,6 +2184,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __pow__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
      @overload
+    def __pow__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __pow__(self: NDArray[object_], other: Any) -> Any: ...
      @overload
      def __pow__(self: NDArray[Any], other: _ArrayLikeObject_co) -> Any: ...
@@ -2194,6 +2201,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __rpow__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
      @overload
+    def __rpow__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __rpow__(self: NDArray[object_], other: Any) -> Any: ...
      @overload
      def __rpow__(self: NDArray[Any], other: _ArrayLikeObject_co) -> Any: ...
@@ -2205,6 +2214,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __truediv__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __truediv__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __truediv__(self: NDArray[timedelta64], other: _SupportsArray[_dtype[timedelta64]] | _NestedSequence[_SupportsArray[_dtype[timedelta64]]]) -> NDArray[float64]: ...
      @overload
      def __truediv__(self: NDArray[timedelta64], other: _ArrayLikeBool_co) -> NoReturn: ...
@@ -2222,6 +2233,8 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __rtruediv__(self: _ArrayComplex_co, other: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...  # type: ignore[misc]
      @overload
+    def __rtruediv__(self: NDArray[number[Any]], other: _ArrayLikeNumber_co) -> NDArray[number[Any]]: ...
+    @overload
      def __rtruediv__(self: NDArray[timedelta64], other: _SupportsArray[_dtype[timedelta64]] | _NestedSequence[_SupportsArray[_dtype[timedelta64]]]) -> NDArray[float64]: ...
      @overload
      def __rtruediv__(self: NDArray[bool_], other: _ArrayLikeTD64_co) -> NoReturn: ...
@@ -2343,10 +2356,15 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def __ror__(self: NDArray[Any], other: _ArrayLikeObject_co) -> Any: ...
  
      # `np.generic` does not support inplace operations
+
+    # NOTE: Inplace ops generally use "same_kind" casting w.r.t. the left
+    # operand. An exception to this rule are unsigned integers though, which
+    # also accept a signed integer for the right operand as long as it is a 0D
+    # object and its value is >= 0
      @overload
      def __iadd__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> NDArray[bool_]: ...
      @overload
-    def __iadd__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __iadd__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __iadd__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2361,7 +2379,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def __iadd__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      @overload
-    def __isub__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __isub__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __isub__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2378,7 +2396,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __imul__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> NDArray[bool_]: ...
      @overload
-    def __imul__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __imul__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __imul__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2402,7 +2420,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def __itruediv__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      @overload
-    def __ifloordiv__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __ifloordiv__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __ifloordiv__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2417,7 +2435,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def __ifloordiv__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      @overload
-    def __ipow__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __ipow__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __ipow__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2428,7 +2446,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def __ipow__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      @overload
-    def __imod__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __imod__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __imod__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2439,14 +2457,14 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      def __imod__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      @overload
-    def __ilshift__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __ilshift__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __ilshift__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
      def __ilshift__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      @overload
-    def __irshift__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __irshift__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __irshift__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2455,7 +2473,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __iand__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> NDArray[bool_]: ...
      @overload
-    def __iand__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __iand__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __iand__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2464,7 +2482,7 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __ixor__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> NDArray[bool_]: ...
      @overload
-    def __ixor__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __ixor__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __ixor__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
@@ -2473,14 +2491,14 @@ class ndarray(_ArrayOrScalarCommon, Generic[_ShapeType, _DType_co]):
      @overload
      def __ior__(self: NDArray[bool_], other: _ArrayLikeBool_co) -> NDArray[bool_]: ...
      @overload
-    def __ior__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co) -> NDArray[unsignedinteger[_NBit1]]: ...
+    def __ior__(self: NDArray[unsignedinteger[_NBit1]], other: _ArrayLikeUInt_co | _IntLike_co) -> NDArray[unsignedinteger[_NBit1]]: ...
      @overload
      def __ior__(self: NDArray[signedinteger[_NBit1]], other: _ArrayLikeInt_co) -> NDArray[signedinteger[_NBit1]]: ...
      @overload
      def __ior__(self: NDArray[object_], other: Any) -> NDArray[object_]: ...
  
      def __dlpack__(self: NDArray[number[Any]], *, stream: None = ...) -> _PyCapsule: ...
-    def __dlpack_device__(self) -> Tuple[int, L[0]]: ...
+    def __dlpack_device__(self) -> tuple[int, L[0]]: ...
  
      # Keep `dtype` at the bottom to avoid name conflicts with `np.dtype`
      @property
@@ -2512,9 +2530,9 @@ class generic(_ArrayOrScalarCommon):
      @property
      def size(self) -> L[1]: ...
      @property
-    def shape(self) -> Tuple[()]: ...
+    def shape(self) -> tuple[()]: ...
      @property
-    def strides(self) -> Tuple[()]: ...
+    def strides(self) -> tuple[()]: ...
      def byteswap(self: _ScalarType, inplace: L[False] = ...) -> _ScalarType: ...
      @property
      def flat(self: _ScalarType) -> flatiter[ndarray[Any, _dtype[_ScalarType]]]: ...
@@ -2543,19 +2561,19 @@ class generic(_ArrayOrScalarCommon):
      @overload
      def view(
          self: _ScalarType,
-        type: Type[ndarray[Any, Any]] = ...,
+        type: type[ndarray[Any, Any]] = ...,
      ) -> _ScalarType: ...
      @overload
      def view(
          self,
          dtype: _DTypeLike[_ScalarType],
-        type: Type[ndarray[Any, Any]] = ...,
+        type: type[ndarray[Any, Any]] = ...,
      ) -> _ScalarType: ...
      @overload
      def view(
          self,
          dtype: DTypeLike,
-        type: Type[ndarray[Any, Any]] = ...,
+        type: type[ndarray[Any, Any]] = ...,
      ) -> Any: ...
  
      @overload
@@ -2572,14 +2590,14 @@ class generic(_ArrayOrScalarCommon):
      ) -> Any: ...
  
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ..., /,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ..., /,
      ) -> Any: ...
  
      @overload
      def take(  # type: ignore[misc]
          self: _ScalarType,
          indices: _IntLike_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: None = ...,
          mode: _ModeKind = ...,
      ) -> _ScalarType: ...
@@ -2587,7 +2605,7 @@ class generic(_ArrayOrScalarCommon):
      def take(  # type: ignore[misc]
          self: _ScalarType,
          indices: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: None = ...,
          mode: _ModeKind = ...,
      ) -> ndarray[Any, _dtype[_ScalarType]]: ...
@@ -2595,7 +2613,7 @@ class generic(_ArrayOrScalarCommon):
      def take(
          self,
          indices: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
          out: _NdArraySubClass = ...,
          mode: _ModeKind = ...,
      ) -> _NdArraySubClass: ...
@@ -2603,7 +2621,7 @@ class generic(_ArrayOrScalarCommon):
      def repeat(
          self: _ScalarType,
          repeats: _ArrayLikeInt_co,
-        axis: Optional[SupportsIndex] = ...,
+        axis: None | SupportsIndex = ...,
      ) -> ndarray[Any, _dtype[_ScalarType]]: ...
  
      def flatten(
@@ -2626,9 +2644,9 @@ class generic(_ArrayOrScalarCommon):
      ) -> ndarray[Any, _dtype[_ScalarType]]: ...
  
      def squeeze(
-        self: _ScalarType, axis: Union[L[0], Tuple[()]] = ...
+        self: _ScalarType, axis: L[0] | tuple[()] = ...
      ) -> _ScalarType: ...
-    def transpose(self: _ScalarType, axes: Tuple[()] = ..., /) -> _ScalarType: ...
+    def transpose(self: _ScalarType, axes: tuple[()] = ..., /) -> _ScalarType: ...
      # Keep `dtype` at the bottom to avoid name conflicts with `np.dtype`
      @property
      def dtype(self: _ScalarType) -> _dtype[_ScalarType]: ...
@@ -2667,7 +2685,7 @@ class number(generic, Generic[_NBit1]):  # type: ignore
  class bool_(generic):
      def __init__(self, value: object = ..., /) -> None: ...
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ..., /,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ..., /,
      ) -> bool: ...
      def tolist(self) -> bool: ...
      @property
@@ -2743,14 +2761,14 @@ class datetime64(generic):
      def __init__(
          self,
          value: None | datetime64 | _CharLike_co | _DatetimeScalar = ...,
-        format: _CharLike_co | Tuple[_CharLike_co, _IntLike_co] = ...,
+        format: _CharLike_co | tuple[_CharLike_co, _IntLike_co] = ...,
          /,
      ) -> None: ...
      @overload
      def __init__(
          self,
          value: int,
-        format: _CharLike_co | Tuple[_CharLike_co, _IntLike_co],
+        format: _CharLike_co | tuple[_CharLike_co, _IntLike_co],
          /,
      ) -> None: ...
      def __add__(self, other: _TD64Like_co) -> datetime64: ...
@@ -2789,7 +2807,7 @@ class integer(number[_NBit1]):  # type: ignore
      # NOTE: `__index__` is technically defined in the bottom-most
      # sub-classes (`int64`, `uint32`, etc)
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ..., /,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ..., /,
      ) -> int: ...
      def tolist(self) -> int: ...
      def is_integer(self) -> L[True]: ...
@@ -2858,7 +2876,7 @@ class timedelta64(generic):
      def __init__(
          self,
          value: None | int | _CharLike_co | dt.timedelta | timedelta64 = ...,
-        format: _CharLike_co | Tuple[_CharLike_co, _IntLike_co] = ...,
+        format: _CharLike_co | tuple[_CharLike_co, _IntLike_co] = ...,
          /,
      ) -> None: ...
      @property
@@ -2886,8 +2904,8 @@ class timedelta64(generic):
      def __rfloordiv__(self, other: timedelta64) -> int64: ...
      def __mod__(self, other: timedelta64) -> timedelta64: ...
      def __rmod__(self, other: timedelta64) -> timedelta64: ...
-    def __divmod__(self, other: timedelta64) -> Tuple[int64, timedelta64]: ...
-    def __rdivmod__(self, other: timedelta64) -> Tuple[int64, timedelta64]: ...
+    def __divmod__(self, other: timedelta64) -> tuple[int64, timedelta64]: ...
+    def __rdivmod__(self, other: timedelta64) -> tuple[int64, timedelta64]: ...
      __lt__: _ComparisonOp[_TD64Like_co, _ArrayLikeTD64_co]
      __le__: _ComparisonOp[_TD64Like_co, _ArrayLikeTD64_co]
      __gt__: _ComparisonOp[_TD64Like_co, _ArrayLikeTD64_co]
@@ -2935,7 +2953,7 @@ uint = unsignedinteger[_NBitInt]
  ulonglong = unsignedinteger[_NBitLongLong]
  
  class inexact(number[_NBit1]):  # type: ignore
-    def __getnewargs__(self: inexact[_64Bit]) -> Tuple[float, ...]: ...
+    def __getnewargs__(self: inexact[_64Bit]) -> tuple[float, ...]: ...
  
  _IntType = TypeVar("_IntType", bound=integer)
  _FloatType = TypeVar('_FloatType', bound=floating)
@@ -2943,20 +2961,20 @@ _FloatType = TypeVar('_FloatType', bound=floating)
  class floating(inexact[_NBit1]):
      def __init__(self, value: _FloatValue = ..., /) -> None: ...
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ...,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ...,
          /,
      ) -> float: ...
      def tolist(self) -> float: ...
      def is_integer(self) -> bool: ...
      def hex(self: float64) -> str: ...
      @classmethod
-    def fromhex(cls: Type[float64], string: str, /) -> float64: ...
-    def as_integer_ratio(self) -> Tuple[int, int]: ...
+    def fromhex(cls: type[float64], string: str, /) -> float64: ...
+    def as_integer_ratio(self) -> tuple[int, int]: ...
      if sys.version_info >= (3, 9):
          def __ceil__(self: float64) -> int: ...
          def __floor__(self: float64) -> int: ...
      def __trunc__(self: float64) -> int: ...
-    def __getnewargs__(self: float64) -> Tuple[float]: ...
+    def __getnewargs__(self: float64) -> tuple[float]: ...
      def __getformat__(self: float64, typestr: L["double", "float"], /) -> str: ...
      @overload
      def __round__(self, ndigits: None = ...) -> int: ...
@@ -2997,7 +3015,7 @@ longfloat = floating[_NBitLongDouble]
  class complexfloating(inexact[_NBit1], Generic[_NBit1, _NBit2]):
      def __init__(self, value: _ComplexValue = ..., /) -> None: ...
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ..., /,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ..., /,
      ) -> complex: ...
      def tolist(self) -> complex: ...
      @property
@@ -3005,7 +3023,7 @@ class complexfloating(inexact[_NBit1], Generic[_NBit1, _NBit2]):
      @property
      def imag(self) -> floating[_NBit2]: ...  # type: ignore[override]
      def __abs__(self) -> floating[_NBit1]: ...  # type: ignore[override]
-    def __getnewargs__(self: complex128) -> Tuple[float, float]: ...
+    def __getnewargs__(self: complex128) -> tuple[float, float]: ...
      # NOTE: Deprecated
      # def __round__(self, ndigits=...): ...
      __add__: _ComplexOp[_NBit1]
@@ -3051,7 +3069,7 @@ class void(flexible):
      def __getitem__(self, key: list[str]) -> void: ...
      def __setitem__(
          self,
-        key: str | List[str] | SupportsIndex,
+        key: str | list[str] | SupportsIndex,
          value: ArrayLike,
      ) -> None: ...
  
@@ -3072,7 +3090,7 @@ class bytes_(character, bytes):
          self, value: str, /, encoding: str = ..., errors: str = ...
      ) -> None: ...
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ..., /,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ..., /,
      ) -> bytes: ...
      def tolist(self) -> bytes: ...
  
@@ -3087,7 +3105,7 @@ class str_(character, str):
          self, value: bytes, /, encoding: str = ..., errors: str = ...
      ) -> None: ...
      def item(
-        self, args: L[0] | Tuple[()] | Tuple[L[0]] = ..., /,
+        self, args: L[0] | tuple[()] | tuple[L[0]] = ..., /,
      ) -> str: ...
      def tolist(self) -> str: ...
  
@@ -3146,7 +3164,8 @@ UFUNC_PYVALS_NAME: L["UFUNC_PYVALS"]
  
  newaxis: None
  
-# See `npt._ufunc` for more concrete nin-/nout-specific stubs
+# See `numpy._typing._ufunc` for more concrete nin-/nout-specific stubs
+@final
  class ufunc:
      @property
      def __name__(self) -> str: ...
@@ -3162,7 +3181,7 @@ class ufunc:
      @property
      def ntypes(self) -> int: ...
      @property
-    def types(self) -> List[str]: ...
+    def types(self) -> list[str]: ...
      # Broad return type because it has to encompass things like
      #
      # >>> np.logical_and.identity is True
@@ -3177,7 +3196,7 @@ class ufunc:
      def identity(self) -> Any: ...
      # This is None for ufuncs and a string for gufuncs.
      @property
-    def signature(self) -> Optional[str]: ...
+    def signature(self) -> None | str: ...
      # The next four methods will always exist, but they will just
      # raise a ValueError ufuncs with that don't accept two input
      # arguments and return one output argument. Because of that we
@@ -3306,7 +3325,7 @@ class AxisError(ValueError, IndexError):
      @overload
      def __init__(self, axis: int, ndim: int, msg_prefix: None | str = ...) -> None: ...
  
-_CallType = TypeVar("_CallType", bound=Union[_ErrFunc, _SupportsWrite[str]])
+_CallType = TypeVar("_CallType", bound=_ErrFunc | _SupportsWrite[str])
  
  class errstate(Generic[_CallType], ContextDecorator):
      call: _CallType
@@ -3317,18 +3336,18 @@ class errstate(Generic[_CallType], ContextDecorator):
          self,
          *,
          call: _CallType = ...,
-        all: Optional[_ErrKind] = ...,
-        divide: Optional[_ErrKind] = ...,
-        over: Optional[_ErrKind] = ...,
-        under: Optional[_ErrKind] = ...,
-        invalid: Optional[_ErrKind] = ...,
+        all: None | _ErrKind = ...,
+        divide: None | _ErrKind = ...,
+        over: None | _ErrKind = ...,
+        under: None | _ErrKind = ...,
+        invalid: None | _ErrKind = ...,
      ) -> None: ...
      def __enter__(self) -> None: ...
      def __exit__(
          self,
-        exc_type: Optional[Type[BaseException]],
-        exc_value: Optional[BaseException],
-        traceback: Optional[TracebackType],
+        exc_type: None | type[BaseException],
+        exc_value: None | BaseException,
+        traceback: None | TracebackType,
          /,
      ) -> None: ...
  
@@ -3350,7 +3369,7 @@ class ndenumerate(Generic[_ScalarType]):
      def __new__(cls, arr: float | _NestedSequence[float]) -> ndenumerate[float_]: ...
      @overload
      def __new__(cls, arr: complex | _NestedSequence[complex]) -> ndenumerate[complex_]: ...
-    def __next__(self: ndenumerate[_ScalarType]) -> Tuple[_Shape, _ScalarType]: ...
+    def __next__(self: ndenumerate[_ScalarType]) -> tuple[_Shape, _ScalarType]: ...
      def __iter__(self: _T) -> _T: ...
  
  class ndindex:
@@ -3364,7 +3383,7 @@ class ndindex:
  class DataSource:
      def __init__(
          self,
-        destpath: Union[None, str, os.PathLike[str]] = ...,
+        destpath: None | str | os.PathLike[str] = ...,
      ) -> None: ...
      def __del__(self) -> None: ...
      def abspath(self, path: str) -> str: ...
@@ -3376,19 +3395,20 @@ class DataSource:
          self,
          path: str,
          mode: str = ...,
-        encoding: Optional[str] = ...,
-        newline: Optional[str] = ...,
+        encoding: None | str = ...,
+        newline: None | str = ...,
      ) -> IO[Any]: ...
  
  # TODO: The type of each `__next__` and `iters` return-type depends
  # on the length and dtype of `args`; we can't describe this behavior yet
  # as we lack variadics (PEP 646).
+@final
  class broadcast:
      def __new__(cls, *args: ArrayLike) -> broadcast: ...
      @property
      def index(self) -> int: ...
      @property
-    def iters(self) -> Tuple[flatiter[Any], ...]: ...
+    def iters(self) -> tuple[flatiter[Any], ...]: ...
      @property
      def nd(self) -> int: ...
      @property
@@ -3399,10 +3419,11 @@ class broadcast:
      def shape(self) -> _Shape: ...
      @property
      def size(self) -> int: ...
-    def __next__(self) -> Tuple[Any, ...]: ...
+    def __next__(self) -> tuple[Any, ...]: ...
      def __iter__(self: _T) -> _T: ...
      def reset(self) -> None: ...
  
+@final
  class busdaycalendar:
      def __new__(
          cls,
@@ -3441,7 +3462,7 @@ class finfo(Generic[_FloatType]):
      ) -> finfo[floating[_NBit1]]: ...
      @overload
      def __new__(
-        cls, dtype: complex | float | Type[complex] | Type[float]
+        cls, dtype: complex | float | type[complex] | type[float]
      ) -> finfo[float_]: ...
      @overload
      def __new__(
@@ -3461,7 +3482,7 @@ class iinfo(Generic[_IntType]):
      @overload
      def __new__(cls, dtype: _IntType | _DTypeLike[_IntType]) -> iinfo[_IntType]: ...
      @overload
-    def __new__(cls, dtype: int | Type[int]) -> iinfo[int_]: ...
+    def __new__(cls, dtype: int | type[int]) -> iinfo[int_]: ...
      @overload
      def __new__(cls, dtype: str) -> iinfo[Any]: ...
  
@@ -3515,29 +3536,29 @@ class recarray(ndarray[_ShapeType, _DType_co]):
      def __getattribute__(self, attr: str) -> Any: ...
      def __setattr__(self, attr: str, val: ArrayLike) -> None: ...
      @overload
-    def __getitem__(self, indx: Union[
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[SupportsIndex | _ArrayLikeInt_co, ...],
-    ]) -> Any: ...
-    @overload
-    def __getitem__(self: recarray[Any, dtype[void]], indx: Union[
-        None,
-        slice,
-        ellipsis,
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...],
-    ]) -> recarray[Any, _DType_co]: ...
-    @overload
-    def __getitem__(self, indx: Union[
-        None,
-        slice,
-        ellipsis,
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...],
-    ]) -> ndarray[Any, _DType_co]: ...
+    def __getitem__(self, indx: (
+        SupportsIndex
+        | _ArrayLikeInt_co
+        | tuple[SupportsIndex | _ArrayLikeInt_co, ...]
+    )) -> Any: ...
+    @overload
+    def __getitem__(self: recarray[Any, dtype[void]], indx: (
+        None
+        | slice
+        | ellipsis
+        | SupportsIndex
+        | _ArrayLikeInt_co
+        | tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...]
+    )) -> recarray[Any, _DType_co]: ...
+    @overload
+    def __getitem__(self, indx: (
+        None
+        | slice
+        | ellipsis
+        | SupportsIndex
+        | _ArrayLikeInt_co
+        | tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...]
+    )) -> ndarray[Any, _DType_co]: ...
      @overload
      def __getitem__(self, indx: str) -> NDArray[Any]: ...
      @overload
@@ -3607,18 +3628,18 @@ class nditer:
      def __enter__(self) -> nditer: ...
      def __exit__(
          self,
-        exc_type: None | Type[BaseException],
+        exc_type: None | type[BaseException],
          exc_value: None | BaseException,
          traceback: None | TracebackType,
      ) -> None: ...
      def __iter__(self) -> nditer: ...
-    def __next__(self) -> Tuple[NDArray[Any], ...]: ...
+    def __next__(self) -> tuple[NDArray[Any], ...]: ...
      def __len__(self) -> int: ...
      def __copy__(self) -> nditer: ...
      @overload
      def __getitem__(self, index: SupportsIndex) -> NDArray[Any]: ...
      @overload
-    def __getitem__(self, index: slice) -> Tuple[NDArray[Any], ...]: ...
+    def __getitem__(self, index: slice) -> tuple[NDArray[Any], ...]: ...
      def __setitem__(self, index: slice | SupportsIndex, value: ArrayLike) -> None: ...
      def close(self) -> None: ...
      def copy(self) -> nditer: ...
@@ -3629,7 +3650,7 @@ class nditer:
      def remove_multi_index(self) -> None: ...
      def reset(self) -> None: ...
      @property
-    def dtypes(self) -> Tuple[dtype[Any], ...]: ...
+    def dtypes(self) -> tuple[dtype[Any], ...]: ...
      @property
      def finished(self) -> bool: ...
      @property
@@ -3645,23 +3666,23 @@ class nditer:
      @property
      def iterindex(self) -> int: ...
      @property
-    def iterrange(self) -> Tuple[int, ...]: ...
+    def iterrange(self) -> tuple[int, ...]: ...
      @property
      def itersize(self) -> int: ...
      @property
-    def itviews(self) -> Tuple[NDArray[Any], ...]: ...
+    def itviews(self) -> tuple[NDArray[Any], ...]: ...
      @property
-    def multi_index(self) -> Tuple[int, ...]: ...
+    def multi_index(self) -> tuple[int, ...]: ...
      @property
      def ndim(self) -> int: ...
      @property
      def nop(self) -> int: ...
      @property
-    def operands(self) -> Tuple[NDArray[Any], ...]: ...
+    def operands(self) -> tuple[NDArray[Any], ...]: ...
      @property
-    def shape(self) -> Tuple[int, ...]: ...
+    def shape(self) -> tuple[int, ...]: ...
      @property
-    def value(self) -> Tuple[NDArray[Any], ...]: ...
+    def value(self) -> tuple[NDArray[Any], ...]: ...
  
  _MemMapModeKind = L[
      "readonly", "r",
@@ -3679,10 +3700,10 @@ class memmap(ndarray[_ShapeType, _DType_co]):
      def __new__(
          subtype,
          filename: str | bytes | os.PathLike[str] | os.PathLike[bytes] | _MemMapIOProtocol,
-        dtype: Type[uint8] = ...,
+        dtype: type[uint8] = ...,
          mode: _MemMapModeKind = ...,
          offset: int = ...,
-        shape: None | int | Tuple[int, ...] = ...,
+        shape: None | int | tuple[int, ...] = ...,
          order: _OrderKACF = ...,
      ) -> memmap[Any, dtype[uint8]]: ...
      @overload
@@ -3692,7 +3713,7 @@ class memmap(ndarray[_ShapeType, _DType_co]):
          dtype: _DTypeLike[_ScalarType],
          mode: _MemMapModeKind = ...,
          offset: int = ...,
-        shape: None | int | Tuple[int, ...] = ...,
+        shape: None | int | tuple[int, ...] = ...,
          order: _OrderKACF = ...,
      ) -> memmap[Any, dtype[_ScalarType]]: ...
      @overload
@@ -3702,14 +3723,14 @@ class memmap(ndarray[_ShapeType, _DType_co]):
          dtype: DTypeLike,
          mode: _MemMapModeKind = ...,
          offset: int = ...,
-        shape: None | int | Tuple[int, ...] = ...,
+        shape: None | int | tuple[int, ...] = ...,
          order: _OrderKACF = ...,
      ) -> memmap[Any, dtype[Any]]: ...
-    def __array_finalize__(self, obj: memmap[Any, Any]) -> None: ...
+    def __array_finalize__(self, obj: object) -> None: ...
      def __array_wrap__(
          self,
          array: memmap[_ShapeType, _DType_co],
-        context: None | Tuple[ufunc, Tuple[Any, ...], int] = ...,
+        context: None | tuple[ufunc, tuple[Any, ...], int] = ...,
      ) -> Any: ...
      def flush(self) -> None: ...
  
@@ -3720,7 +3741,7 @@ class vectorize:
      cache: bool
      signature: None | str
      otypes: None | str
-    excluded: Set[int | str]
+    excluded: set[int | str]
      __doc__: None | str
      def __init__(
          self,
@@ -3817,23 +3838,23 @@ class matrix(ndarray[_ShapeType, _DType_co]):
          dtype: DTypeLike = ...,
          copy: bool = ...,
      ) -> matrix[Any, Any]: ...
-    def __array_finalize__(self, obj: NDArray[Any]) -> None: ...
+    def __array_finalize__(self, obj: object) -> None: ...
  
      @overload
-    def __getitem__(self, key: Union[
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[SupportsIndex | _ArrayLikeInt_co, ...],
-    ]) -> Any: ...
+    def __getitem__(self, key: (
+        SupportsIndex
+        | _ArrayLikeInt_co
+        | tuple[SupportsIndex | _ArrayLikeInt_co, ...]
+    )) -> Any: ...
      @overload
-    def __getitem__(self, key: Union[
-        None,
-        slice,
-        ellipsis,
-        SupportsIndex,
-        _ArrayLikeInt_co,
-        Tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...],
-    ]) -> matrix[Any, _DType_co]: ...
+    def __getitem__(self, key: (
+        None
+        | slice
+        | ellipsis
+        | SupportsIndex
+        | _ArrayLikeInt_co
+        | tuple[None | slice | ellipsis | _ArrayLikeInt_co | SupportsIndex, ...]
+    )) -> matrix[Any, _DType_co]: ...
      @overload
      def __getitem__(self: NDArray[void], key: str) -> matrix[Any, dtype[Any]]: ...
      @overload
@@ -3930,7 +3951,7 @@ class matrix(ndarray[_ShapeType, _DType_co]):
      def ptp(self, axis: None | _ShapeLike = ..., out: _NdArraySubClass = ...) -> _NdArraySubClass: ...
  
      def squeeze(self, axis: None | _ShapeLike = ...) -> matrix[Any, _DType_co]: ...
-    def tolist(self: matrix[Any, dtype[_SupportsItem[_T]]]) -> List[List[_T]]: ...  # type: ignore[typevar]
+    def tolist(self: matrix[Any, dtype[_SupportsItem[_T]]]) -> list[list[_T]]: ...  # type: ignore[typevar]
      def ravel(self, order: _OrderKACF = ...) -> matrix[Any, _DType_co]: ...
      def flatten(self, order: _OrderKACF = ...) -> matrix[Any, _DType_co]: ...
  
@@ -3978,7 +3999,7 @@ class chararray(ndarray[_ShapeType, _CharDType]):
          order: _OrderKACF = ...,
      ) -> chararray[Any, dtype[str_]]: ...
  
-    def __array_finalize__(self, obj: NDArray[str_ | bytes_]) -> None: ...
+    def __array_finalize__(self, obj: object) -> None: ...
      def __mul__(self, other: _ArrayLikeInt_co) -> chararray[Any, _CharDType]: ...
      def __rmul__(self, other: _ArrayLikeInt_co) -> chararray[Any, _CharDType]: ...
      def __mod__(self, i: Any) -> chararray[Any, _CharDType]: ...
@@ -4376,4 +4397,4 @@ class chararray(ndarray[_ShapeType, _CharDType]):
  class _SupportsDLPack(Protocol[_T_contra]):
      def __dlpack__(self, *, stream: None | _T_contra = ...) -> _PyCapsule: ...
  
-def _from_dlpack(__obj: _SupportsDLPack[None]) -> NDArray[Any]: ...
+def from_dlpack(obj: _SupportsDLPack[None], /) -> NDArray[Any]: ...
diff --git a/numpy/_globals.py b/numpy/_globals.py

index c888747258c77008fde5c6cef261702890ab8cfe..593d90e428c2260b3b76bcc1b7cc31193d949c05 100644 (file)
--- a/numpy/_globals.py
+++ b/numpy/_globals.py
@@ -116,7 +116,7 @@ class _CopyMode(enum.Enum):
      NEVER = 2
  
      def __bool__(self):
-        # For backwards compatiblity
+        # For backwards compatibility
          if self == _CopyMode.ALWAYS:
              return True
  
diff --git a/numpy/_pyinstaller/__init__.py b/numpy/_pyinstaller/__init__.py

new file mode 100644 (file)

index 0000000..e69de29
diff --git a/numpy/_pyinstaller/hook-numpy.py b/numpy/_pyinstaller/hook-numpy.py

new file mode 100644 (file)

index 0000000..a08b7c9
--- /dev/null
+++ b/numpy/_pyinstaller/hook-numpy.py
@@ -0,0 +1,40 @@
+"""This hook should collect all binary files and any hidden modules that numpy
+needs.
+
+Our (somewhat inadequate) docs for writing PyInstaller hooks are kept here:
+https://pyinstaller.readthedocs.io/en/stable/hooks.html
+
+"""
+from PyInstaller.compat import is_conda, is_pure_conda
+from PyInstaller.utils.hooks import collect_dynamic_libs, is_module_satisfies
+
+# Collect all DLLs inside numpy's installation folder, dump them into built
+# app's root.
+binaries = collect_dynamic_libs("numpy", ".")
+
+# If using Conda without any non-conda virtual environment manager:
+if is_pure_conda:
+    # Assume NumPy is installed from Conda-forge and collect its DLLs from the
+    # communal Conda bin directory. DLLs from NumPy's dependencies must also be
+    # collected to capture MKL, OpenBLAS, OpenMP, etc.
+    from PyInstaller.utils.hooks import conda_support
+    datas = conda_support.collect_dynamic_libs("numpy", dependencies=True)
+
+# Submodules PyInstaller cannot detect (probably because they are only imported
+# by extension modules, which PyInstaller cannot read).
+hiddenimports = ['numpy.core._dtype_ctypes']
+if is_conda:
+    hiddenimports.append("six")
+
+# Remove testing and building code and packages that are referenced throughout
+# NumPy but are not really dependencies.
+excludedimports = [
+    "scipy",
+    "pytest",
+    "nose",
+    "f2py",
+    "setuptools",
+    "numpy.f2py",
+    "distutils",
+    "numpy.distutils",
+]
diff --git a/numpy/_pyinstaller/pyinstaller-smoke.py b/numpy/_pyinstaller/pyinstaller-smoke.py

new file mode 100644 (file)

index 0000000..1c9f78a
--- /dev/null
+++ b/numpy/_pyinstaller/pyinstaller-smoke.py
@@ -0,0 +1,32 @@
+"""A crude *bit of everything* smoke test to verify PyInstaller compatibility.
+
+PyInstaller typically goes wrong by forgetting to package modules, extension
+modules or shared libraries. This script should aim to touch as many of those
+as possible in an attempt to trip a ModuleNotFoundError or a DLL load failure
+due to an uncollected resource. Missing resources are unlikely to lead to
+arithmetic errors so there's generally no need to verify any calculation's
+output - merely that it made it to the end OK. This script should not
+explicitly import any of numpy's submodules as that gives PyInstaller undue
+hints that those submodules exist and should be collected (accessing implicitly
+loaded submodules is OK).
+
+"""
+import numpy as np
+
+a = np.arange(1., 10.).reshape((3, 3)) % 5
+np.linalg.det(a)
+a @ a
+a @ a.T
+np.linalg.inv(a)
+np.sin(np.exp(a))
+np.linalg.svd(a)
+np.linalg.eigh(a)
+
+np.unique(np.random.randint(0, 10, 100))
+np.sort(np.random.uniform(0, 10, 100))
+
+np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
+np.ma.masked_array(np.arange(10), np.random.rand(10) < .5).sum()
+np.polynomial.Legendre([7, 8, 9]).roots()
+
+print("I made it!")
diff --git a/numpy/_pyinstaller/test_pyinstaller.py b/numpy/_pyinstaller/test_pyinstaller.py

new file mode 100644 (file)

index 0000000..a9061da
--- /dev/null
+++ b/numpy/_pyinstaller/test_pyinstaller.py
@@ -0,0 +1,35 @@
+import subprocess
+from pathlib import Path
+
+import pytest
+
+
+# PyInstaller has been very unproactive about replacing 'imp' with 'importlib'.
+@pytest.mark.filterwarnings('ignore::DeprecationWarning')
+# It also leaks io.BytesIO()s.
+@pytest.mark.filterwarnings('ignore::ResourceWarning')
+@pytest.mark.parametrize("mode", ["--onedir", "--onefile"])
+@pytest.mark.slow
+def test_pyinstaller(mode, tmp_path):
+    """Compile and run pyinstaller-smoke.py using PyInstaller."""
+
+    pyinstaller_cli = pytest.importorskip("PyInstaller.__main__").run
+
+    source = Path(__file__).with_name("pyinstaller-smoke.py").resolve()
+    args = [
+        # Place all generated files in ``tmp_path``.
+        '--workpath', str(tmp_path / "build"),
+        '--distpath', str(tmp_path / "dist"),
+        '--specpath', str(tmp_path),
+        mode,
+        str(source),
+    ]
+    pyinstaller_cli(args)
+
+    if mode == "--onefile":
+        exe = tmp_path / "dist" / source.stem
+    else:
+        exe = tmp_path / "dist" / source.stem / source.stem
+
+    p = subprocess.run([str(exe)], check=True, stdout=subprocess.PIPE)
+    assert p.stdout.strip() == b"I made it!"
diff --git a/numpy/_pytesttester.pyi b/numpy/_pytesttester.pyi

index 0be64b3f7488e0830e434d6171ffd98dfa670f35..67ac87b33de164c710a25110d45545e24a06d42e 100644 (file)
--- a/numpy/_pytesttester.pyi
+++ b/numpy/_pytesttester.pyi
@@ -1,6 +1,7 @@
-from typing import List, Iterable, Literal as L
+from collections.abc import Iterable
+from typing import Literal as L
  
-__all__: List[str]
+__all__: list[str]
  
  class PytestTester:
      module_name: str
diff --git a/numpy/_typing/__init__.py b/numpy/_typing/__init__.py

new file mode 100644 (file)

index 0000000..37ed068
--- /dev/null
+++ b/numpy/_typing/__init__.py
@@ -0,0 +1,223 @@
+"""Private counterpart of ``numpy.typing``."""
+
+from __future__ import annotations
+
+from numpy import ufunc
+from numpy.core.overrides import set_module
+from typing import TYPE_CHECKING, final
+
+
+@final  # Disallow the creation of arbitrary `NBitBase` subclasses
+@set_module("numpy.typing")
+class NBitBase:
+    """
+    A type representing `numpy.number` precision during static type checking.
+
+    Used exclusively for the purpose of static type checking, `NBitBase`
+    represents the base of a hierarchical set of subclasses.
+    Each subsequent subclass is herein used for representing a lower level
+    of precision, *e.g.* ``64Bit > 32Bit > 16Bit``.
+
+    .. versionadded:: 1.20
+
+    Examples
+    --------
+    Below is a typical usage example: `NBitBase` is herein used for annotating
+    a function that takes a float and integer of arbitrary precision
+    as arguments and returns a new float of whichever precision is largest
+    (*e.g.* ``np.float16 + np.int64 -> np.float64``).
+
+    .. code-block:: python
+
+        >>> from __future__ import annotations
+        >>> from typing import TypeVar, TYPE_CHECKING
+        >>> import numpy as np
+        >>> import numpy.typing as npt
+
+        >>> T1 = TypeVar("T1", bound=npt.NBitBase)
+        >>> T2 = TypeVar("T2", bound=npt.NBitBase)
+
+        >>> def add(a: np.floating[T1], b: np.integer[T2]) -> np.floating[T1 | T2]:
+        ...     return a + b
+
+        >>> a = np.float16()
+        >>> b = np.int64()
+        >>> out = add(a, b)
+
+        >>> if TYPE_CHECKING:
+        ...     reveal_locals()
+        ...     # note: Revealed local types are:
+        ...     # note:     a: numpy.floating[numpy.typing._16Bit*]
+        ...     # note:     b: numpy.signedinteger[numpy.typing._64Bit*]
+        ...     # note:     out: numpy.floating[numpy.typing._64Bit*]
+
+    """
+
+    def __init_subclass__(cls) -> None:
+        allowed_names = {
+            "NBitBase", "_256Bit", "_128Bit", "_96Bit", "_80Bit",
+            "_64Bit", "_32Bit", "_16Bit", "_8Bit",
+        }
+        if cls.__name__ not in allowed_names:
+            raise TypeError('cannot inherit from final class "NBitBase"')
+        super().__init_subclass__()
+
+
+# Silence errors about subclassing a `@final`-decorated class
+class _256Bit(NBitBase):  # type: ignore[misc]
+    pass
+
+class _128Bit(_256Bit):  # type: ignore[misc]
+    pass
+
+class _96Bit(_128Bit):  # type: ignore[misc]
+    pass
+
+class _80Bit(_96Bit):  # type: ignore[misc]
+    pass
+
+class _64Bit(_80Bit):  # type: ignore[misc]
+    pass
+
+class _32Bit(_64Bit):  # type: ignore[misc]
+    pass
+
+class _16Bit(_32Bit):  # type: ignore[misc]
+    pass
+
+class _8Bit(_16Bit):  # type: ignore[misc]
+    pass
+
+
+from ._nested_sequence import (
+    _NestedSequence as _NestedSequence,
+)
+from ._nbit import (
+    _NBitByte as _NBitByte,
+    _NBitShort as _NBitShort,
+    _NBitIntC as _NBitIntC,
+    _NBitIntP as _NBitIntP,
+    _NBitInt as _NBitInt,
+    _NBitLongLong as _NBitLongLong,
+    _NBitHalf as _NBitHalf,
+    _NBitSingle as _NBitSingle,
+    _NBitDouble as _NBitDouble,
+    _NBitLongDouble as _NBitLongDouble,
+)
+from ._char_codes import (
+    _BoolCodes as _BoolCodes,
+    _UInt8Codes as _UInt8Codes,
+    _UInt16Codes as _UInt16Codes,
+    _UInt32Codes as _UInt32Codes,
+    _UInt64Codes as _UInt64Codes,
+    _Int8Codes as _Int8Codes,
+    _Int16Codes as _Int16Codes,
+    _Int32Codes as _Int32Codes,
+    _Int64Codes as _Int64Codes,
+    _Float16Codes as _Float16Codes,
+    _Float32Codes as _Float32Codes,
+    _Float64Codes as _Float64Codes,
+    _Complex64Codes as _Complex64Codes,
+    _Complex128Codes as _Complex128Codes,
+    _ByteCodes as _ByteCodes,
+    _ShortCodes as _ShortCodes,
+    _IntCCodes as _IntCCodes,
+    _IntPCodes as _IntPCodes,
+    _IntCodes as _IntCodes,
+    _LongLongCodes as _LongLongCodes,
+    _UByteCodes as _UByteCodes,
+    _UShortCodes as _UShortCodes,
+    _UIntCCodes as _UIntCCodes,
+    _UIntPCodes as _UIntPCodes,
+    _UIntCodes as _UIntCodes,
+    _ULongLongCodes as _ULongLongCodes,
+    _HalfCodes as _HalfCodes,
+    _SingleCodes as _SingleCodes,
+    _DoubleCodes as _DoubleCodes,
+    _LongDoubleCodes as _LongDoubleCodes,
+    _CSingleCodes as _CSingleCodes,
+    _CDoubleCodes as _CDoubleCodes,
+    _CLongDoubleCodes as _CLongDoubleCodes,
+    _DT64Codes as _DT64Codes,
+    _TD64Codes as _TD64Codes,
+    _StrCodes as _StrCodes,
+    _BytesCodes as _BytesCodes,
+    _VoidCodes as _VoidCodes,
+    _ObjectCodes as _ObjectCodes,
+)
+from ._scalars import (
+    _CharLike_co as _CharLike_co,
+    _BoolLike_co as _BoolLike_co,
+    _UIntLike_co as _UIntLike_co,
+    _IntLike_co as _IntLike_co,
+    _FloatLike_co as _FloatLike_co,
+    _ComplexLike_co as _ComplexLike_co,
+    _TD64Like_co as _TD64Like_co,
+    _NumberLike_co as _NumberLike_co,
+    _ScalarLike_co as _ScalarLike_co,
+    _VoidLike_co as _VoidLike_co,
+)
+from ._shape import (
+    _Shape as _Shape,
+    _ShapeLike as _ShapeLike,
+)
+from ._dtype_like import (
+    DTypeLike as DTypeLike,
+    _DTypeLike as _DTypeLike,
+    _SupportsDType as _SupportsDType,
+    _VoidDTypeLike as _VoidDTypeLike,
+    _DTypeLikeBool as _DTypeLikeBool,
+    _DTypeLikeUInt as _DTypeLikeUInt,
+    _DTypeLikeInt as _DTypeLikeInt,
+    _DTypeLikeFloat as _DTypeLikeFloat,
+    _DTypeLikeComplex as _DTypeLikeComplex,
+    _DTypeLikeTD64 as _DTypeLikeTD64,
+    _DTypeLikeDT64 as _DTypeLikeDT64,
+    _DTypeLikeObject as _DTypeLikeObject,
+    _DTypeLikeVoid as _DTypeLikeVoid,
+    _DTypeLikeStr as _DTypeLikeStr,
+    _DTypeLikeBytes as _DTypeLikeBytes,
+    _DTypeLikeComplex_co as _DTypeLikeComplex_co,
+)
+from ._array_like import (
+    ArrayLike as ArrayLike,
+    _ArrayLike as _ArrayLike,
+    _FiniteNestedSequence as _FiniteNestedSequence,
+    _SupportsArray as _SupportsArray,
+    _SupportsArrayFunc as _SupportsArrayFunc,
+    _ArrayLikeInt as _ArrayLikeInt,
+    _ArrayLikeBool_co as _ArrayLikeBool_co,
+    _ArrayLikeUInt_co as _ArrayLikeUInt_co,
+    _ArrayLikeInt_co as _ArrayLikeInt_co,
+    _ArrayLikeFloat_co as _ArrayLikeFloat_co,
+    _ArrayLikeComplex_co as _ArrayLikeComplex_co,
+    _ArrayLikeNumber_co as _ArrayLikeNumber_co,
+    _ArrayLikeTD64_co as _ArrayLikeTD64_co,
+    _ArrayLikeDT64_co as _ArrayLikeDT64_co,
+    _ArrayLikeObject_co as _ArrayLikeObject_co,
+    _ArrayLikeVoid_co as _ArrayLikeVoid_co,
+    _ArrayLikeStr_co as _ArrayLikeStr_co,
+    _ArrayLikeBytes_co as _ArrayLikeBytes_co,
+)
+from ._generic_alias import (
+    NDArray as NDArray,
+    _DType as _DType,
+    _GenericAlias as _GenericAlias,
+)
+
+if TYPE_CHECKING:
+    from ._ufunc import (
+        _UFunc_Nin1_Nout1 as _UFunc_Nin1_Nout1,
+        _UFunc_Nin2_Nout1 as _UFunc_Nin2_Nout1,
+        _UFunc_Nin1_Nout2 as _UFunc_Nin1_Nout2,
+        _UFunc_Nin2_Nout2 as _UFunc_Nin2_Nout2,
+        _GUFunc_Nin2_Nout1 as _GUFunc_Nin2_Nout1,
+    )
+else:
+    # Declare the (type-check-only) ufunc subclasses as ufunc aliases during
+    # runtime; this helps autocompletion tools such as Jedi (numpy/numpy#19834)
+    _UFunc_Nin1_Nout1 = ufunc
+    _UFunc_Nin2_Nout1 = ufunc
+    _UFunc_Nin1_Nout2 = ufunc
+    _UFunc_Nin2_Nout2 = ufunc
+    _GUFunc_Nin2_Nout1 = ufunc
diff --git a/numpy/_typing/_add_docstring.py b/numpy/_typing/_add_docstring.py
new file mode 100644
index 0000000..10d77f5
--- /dev/null
+++ b/numpy/_typing/_add_docstring.py
@@ -0,0 +1,152 @@
+"""A module for creating docstrings for sphinx ``data`` domains."""
+
+import re
+import textwrap
+
+from ._generic_alias import NDArray
+
+_docstrings_list = []
+
+
+def add_newdoc(name: str, value: str, doc: str) -> None:
+    """Append ``_docstrings_list`` with a docstring for `name`.
+
+    Parameters
+    ----------
+    name : str
+        The name of the object.
+    value : str
+        A string-representation of the object.
+    doc : str
+        The docstring of the object.
+
+    """
+    _docstrings_list.append((name, value, doc))
+
+
+def _parse_docstrings() -> str:
+    """Convert all docstrings in ``_docstrings_list`` into a single
+    sphinx-legible text block.
+
+    """
+    type_list_ret = []
+    for name, value, doc in _docstrings_list:
+        s = textwrap.dedent(doc).replace("\n", "\n    ")
+
+        # Replace sections by rubrics
+        lines = s.split("\n")
+        new_lines = []
+        indent = ""
+        for line in lines:
+            m = re.match(r'^(\s+)[-=]+\s*$', line)
+            if m and new_lines:
+                prev = textwrap.dedent(new_lines.pop())
+                if prev == "Examples":
+                    indent = ""
+                    new_lines.append(f'{m.group(1)}.. rubric:: {prev}')
+                else:
+                    indent = 4 * " "
+                    new_lines.append(f'{m.group(1)}.. admonition:: {prev}')
+                new_lines.append("")
+            else:
+                new_lines.append(f"{indent}{line}")
+
+        s = "\n".join(new_lines)
+        s_block = f""".. data:: {name}\n    :value: {value}\n    {s}"""
+        type_list_ret.append(s_block)
+    return "\n".join(type_list_ret)
+
+
+add_newdoc('ArrayLike', 'typing.Union[...]',
+    """
+    A `~typing.Union` representing objects that can be coerced
+    into an `~numpy.ndarray`.
+
+    Among others this includes the likes of:
+
+    * Scalars.
+    * (Nested) sequences.
+    * Objects implementing the `~class.__array__` protocol.
+
+    .. versionadded:: 1.20
+
+    See Also
+    --------
+    :term:`array_like`:
+        Any scalar or sequence that can be interpreted as an ndarray.
+
+    Examples
+    --------
+    .. code-block:: python
+
+        >>> import numpy as np
+        >>> import numpy.typing as npt
+
+        >>> def as_array(a: npt.ArrayLike) -> np.ndarray:
+        ...     return np.array(a)
+
+    """)
+
+add_newdoc('DTypeLike', 'typing.Union[...]',
+    """
+    A `~typing.Union` representing objects that can be coerced
+    into a `~numpy.dtype`.
+
+    Among others this includes the likes of:
+
+    * :class:`type` objects.
+    * Character codes or the names of :class:`type` objects.
+    * Objects with the ``.dtype`` attribute.
+
+    .. versionadded:: 1.20
+
+    See Also
+    --------
+    :ref:`Specifying and constructing data types <arrays.dtypes.constructing>`
+        A comprehensive overview of all objects that can be coerced
+        into data types.
+
+    Examples
+    --------
+    .. code-block:: python
+
+        >>> import numpy as np
+        >>> import numpy.typing as npt
+
+        >>> def as_dtype(d: npt.DTypeLike) -> np.dtype:
+        ...     return np.dtype(d)
+
+    """)
+
+add_newdoc('NDArray', repr(NDArray),
+    """
+    A :term:`generic <generic type>` version of
+    `np.ndarray[Any, np.dtype[+ScalarType]] <numpy.ndarray>`.
+
+    Can be used during runtime for typing arrays with a given dtype
+    and unspecified shape.
+
+    .. versionadded:: 1.21
+
+    Examples
+    --------
+    .. code-block:: python
+
+        >>> import numpy as np
+        >>> import numpy.typing as npt
+
+        >>> print(npt.NDArray)
+        numpy.ndarray[typing.Any, numpy.dtype[+ScalarType]]
+
+        >>> print(npt.NDArray[np.float64])
+        numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]
+
+        >>> NDArrayInt = npt.NDArray[np.int_]
+        >>> a: NDArrayInt = np.arange(10)
+
+        >>> def func(a: npt.ArrayLike) -> npt.NDArray[Any]:
+        ...     return np.array(a)
+
+    """)
+
+_docstrings = _parse_docstrings()
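Note on ``_parse_docstrings`` above: the loop looks for numpydoc-style underlines (``----`` or ``====``), pops the heading on the preceding line, and re-emits it as a Sphinx ``rubric`` (for ``Examples``) or an ``admonition`` (for everything else) whose body is indented four more spaces. The following is a simplified sketch of that step only. The function name ``sections_to_directives`` and the sample text are assumptions made for illustration; the ``.. data::`` wrapper is omitted.

```python
import re
import textwrap


def sections_to_directives(doc: str) -> str:
    """Turn numpydoc-style underlined headings into Sphinx directives."""
    body = textwrap.dedent(doc).replace("\n", "\n    ")
    out = []
    indent = ""
    for line in body.split("\n"):
        m = re.match(r"^(\s+)[-=]+\s*$", line)
        if m and out:
            # The previous line held the heading text; replace it.
            heading = textwrap.dedent(out.pop())
            if heading == "Examples":
                indent = ""
                out.append(f"{m.group(1)}.. rubric:: {heading}")
            else:
                indent = 4 * " "
                out.append(f"{m.group(1)}.. admonition:: {heading}")
            out.append("")
        else:
            out.append(f"{indent}{line}")
    return "\n".join(out)


sample = """
    Summary line.

    See Also
    --------
    other : A related object.
    """
result = sections_to_directives(sample)
```

In this sketch the ``See Also`` heading becomes an admonition, so the line that follows it is nested one level deeper than the summary.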
diff --git a/numpy/_typing/_array_like.py b/numpy/_typing/_array_like.py
new file mode 100644
index 0000000..02f2642
--- /dev/null
+++ b/numpy/_typing/_array_like.py
@@ -0,0 +1,143 @@
+from __future__ import annotations
+
+# NOTE: Import `Sequence` from `typing` as it is needed for a type-alias,
+# not an annotation
+from collections.abc import Collection, Callable
+from typing import Any, Sequence, Protocol, Union, TypeVar
+from numpy import (
+    ndarray,
+    dtype,
+    generic,
+    bool_,
+    unsignedinteger,
+    integer,
+    floating,
+    complexfloating,
+    number,
+    timedelta64,
+    datetime64,
+    object_,
+    void,
+    str_,
+    bytes_,
+)
+from ._nested_sequence import _NestedSequence
+
+_T = TypeVar("_T")
+_ScalarType = TypeVar("_ScalarType", bound=generic)
+_DType = TypeVar("_DType", bound="dtype[Any]")
+_DType_co = TypeVar("_DType_co", covariant=True, bound="dtype[Any]")
+
+# The `_SupportsArray` protocol only cares about the default dtype
+# (i.e. `dtype=None` or no `dtype` parameter at all) of the to-be returned
+# array.
+# Concrete implementations of the protocol are responsible for adding
+# any and all remaining overloads
+class _SupportsArray(Protocol[_DType_co]):
+    def __array__(self) -> ndarray[Any, _DType_co]: ...
+
+
+class _SupportsArrayFunc(Protocol):
+    """A protocol class representing `~class.__array_function__`."""
+    def __array_function__(
+        self,
+        func: Callable[..., Any],
+        types: Collection[type[Any]],
+        args: tuple[Any, ...],
+        kwargs: dict[str, Any],
+    ) -> object: ...
+
+
+# TODO: Wait until mypy supports recursive objects in combination with typevars
+_FiniteNestedSequence = Union[
+    _T,
+    Sequence[_T],
+    Sequence[Sequence[_T]],
+    Sequence[Sequence[Sequence[_T]]],
+    Sequence[Sequence[Sequence[Sequence[_T]]]],
+]
+
+# A subset of `npt.ArrayLike` that can be parametrized w.r.t. `np.generic`
+_ArrayLike = Union[
+    _SupportsArray["dtype[_ScalarType]"],
+    _NestedSequence[_SupportsArray["dtype[_ScalarType]"]],
+]
+
+# A union representing array-like objects; consists of two typevars:
+# One representing types that can be parametrized w.r.t. `np.dtype`
+# and another one for the rest
+_DualArrayLike = Union[
+    _SupportsArray[_DType],
+    _NestedSequence[_SupportsArray[_DType]],
+    _T,
+    _NestedSequence[_T],
+]
+
+# TODO: support buffer protocols once
+#
+# https://bugs.python.org/issue27501
+#
+# is resolved. See also the mypy issue:
+#
+# https://github.com/python/typing/issues/593
+ArrayLike = _DualArrayLike[
+    dtype,
+    Union[bool, int, float, complex, str, bytes],
+]
+
+# `ArrayLike<X>_co`: array-like objects that can be coerced into `X`
+# given the casting rules `same_kind`
+_ArrayLikeBool_co = _DualArrayLike[
+    "dtype[bool_]",
+    bool,
+]
+_ArrayLikeUInt_co = _DualArrayLike[
+    "dtype[Union[bool_, unsignedinteger[Any]]]",
+    bool,
+]
+_ArrayLikeInt_co = _DualArrayLike[
+    "dtype[Union[bool_, integer[Any]]]",
+    Union[bool, int],
+]
+_ArrayLikeFloat_co = _DualArrayLike[
+    "dtype[Union[bool_, integer[Any], floating[Any]]]",
+    Union[bool, int, float],
+]
+_ArrayLikeComplex_co = _DualArrayLike[
+    "dtype[Union[bool_, integer[Any], floating[Any], complexfloating[Any, Any]]]",
+    Union[bool, int, float, complex],
+]
+_ArrayLikeNumber_co = _DualArrayLike[
+    "dtype[Union[bool_, number[Any]]]",
+    Union[bool, int, float, complex],
+]
+_ArrayLikeTD64_co = _DualArrayLike[
+    "dtype[Union[bool_, integer[Any], timedelta64]]",
+    Union[bool, int],
+]
+_ArrayLikeDT64_co = Union[
+    _SupportsArray["dtype[datetime64]"],
+    _NestedSequence[_SupportsArray["dtype[datetime64]"]],
+]
+_ArrayLikeObject_co = Union[
+    _SupportsArray["dtype[object_]"],
+    _NestedSequence[_SupportsArray["dtype[object_]"]],
+]
+
+_ArrayLikeVoid_co = Union[
+    _SupportsArray["dtype[void]"],
+    _NestedSequence[_SupportsArray["dtype[void]"]],
+]
+_ArrayLikeStr_co = _DualArrayLike[
+    "dtype[str_]",
+    str,
+]
+_ArrayLikeBytes_co = _DualArrayLike[
+    "dtype[bytes_]",
+    bytes,
+]
+
+_ArrayLikeInt = _DualArrayLike[
+    "dtype[integer[Any]]",
+    int,
+]
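Note on ``_FiniteNestedSequence`` in ``_array_like.py`` above: since the alias cannot be recursive, the nesting is unrolled to a fixed depth, and subscripting the alias substitutes the type variable at every level. Here is a stdlib-only sketch with a hypothetical two-level alias named ``Nested2`` (not part of NumPy).

```python
from typing import Sequence, TypeVar, Union, get_args

_T = TypeVar("_T")

# Hypothetical two-level version of the unrolled alias in the diff.
Nested2 = Union[_T, Sequence[_T], Sequence[Sequence[_T]]]

# Subscripting the alias replaces `_T` at every nesting depth.
IntNested = Nested2[int]
members = get_args(IntNested)
```

``members`` then holds one entry per unrolled depth: the bare element type, a sequence of it, and a sequence of sequences.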
diff --git a/numpy/_typing/_callable.pyi b/numpy/_typing/_callable.pyi
new file mode 100644
index 0000000..6d71365
--- /dev/null
+++ b/numpy/_typing/_callable.pyi
@@ -0,0 +1,325 @@
+"""
+A module with various ``typing.Protocol`` subclasses that implement
+the ``__call__`` magic method.
+
+See the `Mypy documentation`_ on protocols for more details.
+
+.. _`Mypy documentation`: https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols
+
+"""
+
+from __future__ import annotations
+
+from typing import (
+    TypeVar,
+    overload,
+    Any,
+    NoReturn,
+    Protocol,
+)
+
+from numpy import (
+    ndarray,
+    dtype,
+    generic,
+    bool_,
+    timedelta64,
+    number,
+    integer,
+    unsignedinteger,
+    signedinteger,
+    int8,
+    int_,
+    floating,
+    float64,
+    complexfloating,
+    complex128,
+)
+from ._nbit import _NBitInt, _NBitDouble
+from ._scalars import (
+    _BoolLike_co,
+    _IntLike_co,
+    _FloatLike_co,
+    _NumberLike_co,
+)
+from . import NBitBase
+from ._generic_alias import NDArray
+
+_T1 = TypeVar("_T1")
+_T2 = TypeVar("_T2")
+_T1_contra = TypeVar("_T1_contra", contravariant=True)
+_T2_contra = TypeVar("_T2_contra", contravariant=True)
+_2Tuple = tuple[_T1, _T1]
+
+_NBit1 = TypeVar("_NBit1", bound=NBitBase)
+_NBit2 = TypeVar("_NBit2", bound=NBitBase)
+
+_IntType = TypeVar("_IntType", bound=integer)
+_FloatType = TypeVar("_FloatType", bound=floating)
+_NumberType = TypeVar("_NumberType", bound=number)
+_NumberType_co = TypeVar("_NumberType_co", covariant=True, bound=number)
+_GenericType_co = TypeVar("_GenericType_co", covariant=True, bound=generic)
+
+class _BoolOp(Protocol[_GenericType_co]):
+    @overload
+    def __call__(self, other: _BoolLike_co, /) -> _GenericType_co: ...
+    @overload  # platform dependent
+    def __call__(self, other: int, /) -> int_: ...
+    @overload
+    def __call__(self, other: float, /) -> float64: ...
+    @overload
+    def __call__(self, other: complex, /) -> complex128: ...
+    @overload
+    def __call__(self, other: _NumberType, /) -> _NumberType: ...
+
+class _BoolBitOp(Protocol[_GenericType_co]):
+    @overload
+    def __call__(self, other: _BoolLike_co, /) -> _GenericType_co: ...
+    @overload  # platform dependent
+    def __call__(self, other: int, /) -> int_: ...
+    @overload
+    def __call__(self, other: _IntType, /) -> _IntType: ...
+
+class _BoolSub(Protocol):
+    # Note that `other: bool_` is absent here
+    @overload
+    def __call__(self, other: bool, /) -> NoReturn: ...
+    @overload  # platform dependent
+    def __call__(self, other: int, /) -> int_: ...
+    @overload
+    def __call__(self, other: float, /) -> float64: ...
+    @overload
+    def __call__(self, other: complex, /) -> complex128: ...
+    @overload
+    def __call__(self, other: _NumberType, /) -> _NumberType: ...
+
+class _BoolTrueDiv(Protocol):
+    @overload
+    def __call__(self, other: float | _IntLike_co, /) -> float64: ...
+    @overload
+    def __call__(self, other: complex, /) -> complex128: ...
+    @overload
+    def __call__(self, other: _NumberType, /) -> _NumberType: ...
+
+class _BoolMod(Protocol):
+    @overload
+    def __call__(self, other: _BoolLike_co, /) -> int8: ...
+    @overload  # platform dependent
+    def __call__(self, other: int, /) -> int_: ...
+    @overload
+    def __call__(self, other: float, /) -> float64: ...
+    @overload
+    def __call__(self, other: _IntType, /) -> _IntType: ...
+    @overload
+    def __call__(self, other: _FloatType, /) -> _FloatType: ...
+
+class _BoolDivMod(Protocol):
+    @overload
+    def __call__(self, other: _BoolLike_co, /) -> _2Tuple[int8]: ...
+    @overload  # platform dependent
+    def __call__(self, other: int, /) -> _2Tuple[int_]: ...
+    @overload
+    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
+    @overload
+    def __call__(self, other: _IntType, /) -> _2Tuple[_IntType]: ...
+    @overload
+    def __call__(self, other: _FloatType, /) -> _2Tuple[_FloatType]: ...
+
+class _TD64Div(Protocol[_NumberType_co]):
+    @overload
+    def __call__(self, other: timedelta64, /) -> _NumberType_co: ...
+    @overload
+    def __call__(self, other: _BoolLike_co, /) -> NoReturn: ...
+    @overload
+    def __call__(self, other: _FloatLike_co, /) -> timedelta64: ...
+
+class _IntTrueDiv(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> floating[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> floating[_NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: complex, /,
+    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(self, other: integer[_NBit2], /) -> floating[_NBit1 | _NBit2]: ...
+
+class _UnsignedIntOp(Protocol[_NBit1]):
+    # NOTE: `uint64 + signedinteger -> float64`
+    @overload
+    def __call__(self, other: bool, /) -> unsignedinteger[_NBit1]: ...
+    @overload
+    def __call__(
+        self, other: int | signedinteger[Any], /
+    ) -> Any: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: complex, /,
+    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: unsignedinteger[_NBit2], /
+    ) -> unsignedinteger[_NBit1 | _NBit2]: ...
+
+class _UnsignedIntBitOp(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> unsignedinteger[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> signedinteger[Any]: ...
+    @overload
+    def __call__(self, other: signedinteger[Any], /) -> signedinteger[Any]: ...
+    @overload
+    def __call__(
+        self, other: unsignedinteger[_NBit2], /
+    ) -> unsignedinteger[_NBit1 | _NBit2]: ...
+
+class _UnsignedIntMod(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> unsignedinteger[_NBit1]: ...
+    @overload
+    def __call__(
+        self, other: int | signedinteger[Any], /
+    ) -> Any: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: unsignedinteger[_NBit2], /
+    ) -> unsignedinteger[_NBit1 | _NBit2]: ...
+
+class _UnsignedIntDivMod(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> _2Tuple[signedinteger[_NBit1]]: ...
+    @overload
+    def __call__(
+        self, other: int | signedinteger[Any], /
+    ) -> _2Tuple[Any]: ...
+    @overload
+    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
+    @overload
+    def __call__(
+        self, other: unsignedinteger[_NBit2], /
+    ) -> _2Tuple[unsignedinteger[_NBit1 | _NBit2]]: ...
+
+class _SignedIntOp(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> signedinteger[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> signedinteger[_NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: complex, /,
+    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: signedinteger[_NBit2], /,
+    ) -> signedinteger[_NBit1 | _NBit2]: ...
+
+class _SignedIntBitOp(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> signedinteger[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> signedinteger[_NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(
+        self, other: signedinteger[_NBit2], /,
+    ) -> signedinteger[_NBit1 | _NBit2]: ...
+
+class _SignedIntMod(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> signedinteger[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> signedinteger[_NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: signedinteger[_NBit2], /,
+    ) -> signedinteger[_NBit1 | _NBit2]: ...
+
+class _SignedIntDivMod(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> _2Tuple[signedinteger[_NBit1]]: ...
+    @overload
+    def __call__(self, other: int, /) -> _2Tuple[signedinteger[_NBit1 | _NBitInt]]: ...
+    @overload
+    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
+    @overload
+    def __call__(
+        self, other: signedinteger[_NBit2], /,
+    ) -> _2Tuple[signedinteger[_NBit1 | _NBit2]]: ...
+
+class _FloatOp(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> floating[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> floating[_NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: complex, /,
+    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: integer[_NBit2] | floating[_NBit2], /
+    ) -> floating[_NBit1 | _NBit2]: ...
+
+class _FloatMod(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> floating[_NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> floating[_NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self, other: integer[_NBit2] | floating[_NBit2], /
+    ) -> floating[_NBit1 | _NBit2]: ...
+
+class _FloatDivMod(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> _2Tuple[floating[_NBit1]]: ...
+    @overload
+    def __call__(self, other: int, /) -> _2Tuple[floating[_NBit1 | _NBitInt]]: ...
+    @overload
+    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
+    @overload
+    def __call__(
+        self, other: integer[_NBit2] | floating[_NBit2], /
+    ) -> _2Tuple[floating[_NBit1 | _NBit2]]: ...
+
+class _ComplexOp(Protocol[_NBit1]):
+    @overload
+    def __call__(self, other: bool, /) -> complexfloating[_NBit1, _NBit1]: ...
+    @overload
+    def __call__(self, other: int, /) -> complexfloating[_NBit1 | _NBitInt, _NBit1 | _NBitInt]: ...
+    @overload
+    def __call__(
+        self, other: complex, /,
+    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
+    @overload
+    def __call__(
+        self,
+        other: (
+            integer[_NBit2]
+            | floating[_NBit2]
+            | complexfloating[_NBit2, _NBit2]
+        ), /,
+    ) -> complexfloating[_NBit1 | _NBit2, _NBit1 | _NBit2]: ...
+
+class _NumberOp(Protocol):
+    def __call__(self, other: _NumberLike_co, /) -> Any: ...
+
+class _ComparisonOp(Protocol[_T1_contra, _T2_contra]):
+    @overload
+    def __call__(self, other: _T1_contra, /) -> bool_: ...
+    @overload
+    def __call__(self, other: _T2_contra, /) -> NDArray[bool_]: ...
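Note on the protocols in ``_callable.pyi`` above: each one is a callback protocol whose overloaded ``__call__`` maps the operand type to a result type. A reduced stdlib-only sketch of the idea follows. ``_ScaleOp`` and ``Doubler`` are hypothetical names used for illustration and do not appear in NumPy.

```python
from __future__ import annotations

from typing import Protocol, overload


class _ScaleOp(Protocol):
    """Hypothetical callback protocol: the result type follows the operand."""

    @overload
    def __call__(self, other: int, /) -> int: ...
    @overload
    def __call__(self, other: float, /) -> float: ...


class Doubler:
    # Structurally satisfies `_ScaleOp`; no inheritance is required.
    def __call__(self, other, /):
        return other * 2


op: _ScaleOp = Doubler()
int_result = op(3)
float_result = op(1.5)
```

The protocol itself is never called; it only describes, for a static checker, what calling ``op`` returns for each operand type.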
diff --git a/numpy/_typing/_char_codes.py b/numpy/_typing/_char_codes.py
new file mode 100644
index 0000000..f840d17
--- /dev/null
+++ b/numpy/_typing/_char_codes.py
@@ -0,0 +1,111 @@
+from typing import Literal
+
+_BoolCodes = Literal["?", "=?", "<?", ">?", "bool", "bool_", "bool8"]
+
+_UInt8Codes = Literal["uint8", "u1", "=u1", "<u1", ">u1"]
+_UInt16Codes = Literal["uint16", "u2", "=u2", "<u2", ">u2"]
+_UInt32Codes = Literal["uint32", "u4", "=u4", "<u4", ">u4"]
+_UInt64Codes = Literal["uint64", "u8", "=u8", "<u8", ">u8"]
+
+_Int8Codes = Literal["int8", "i1", "=i1", "<i1", ">i1"]
+_Int16Codes = Literal["int16", "i2", "=i2", "<i2", ">i2"]
+_Int32Codes = Literal["int32", "i4", "=i4", "<i4", ">i4"]
+_Int64Codes = Literal["int64", "i8", "=i8", "<i8", ">i8"]
+
+_Float16Codes = Literal["float16", "f2", "=f2", "<f2", ">f2"]
+_Float32Codes = Literal["float32", "f4", "=f4", "<f4", ">f4"]
+_Float64Codes = Literal["float64", "f8", "=f8", "<f8", ">f8"]
+
+_Complex64Codes = Literal["complex64", "c8", "=c8", "<c8", ">c8"]
+_Complex128Codes = Literal["complex128", "c16", "=c16", "<c16", ">c16"]
+
+_ByteCodes = Literal["byte", "b", "=b", "<b", ">b"]
+_ShortCodes = Literal["short", "h", "=h", "<h", ">h"]
+_IntCCodes = Literal["intc", "i", "=i", "<i", ">i"]
+_IntPCodes = Literal["intp", "int0", "p", "=p", "<p", ">p"]
+_IntCodes = Literal["long", "int", "int_", "l", "=l", "<l", ">l"]
+_LongLongCodes = Literal["longlong", "q", "=q", "<q", ">q"]
+
+_UByteCodes = Literal["ubyte", "B", "=B", "<B", ">B"]
+_UShortCodes = Literal["ushort", "H", "=H", "<H", ">H"]
+_UIntCCodes = Literal["uintc", "I", "=I", "<I", ">I"]
+_UIntPCodes = Literal["uintp", "uint0", "P", "=P", "<P", ">P"]
+_UIntCodes = Literal["ulong", "uint", "L", "=L", "<L", ">L"]
+_ULongLongCodes = Literal["ulonglong", "Q", "=Q", "<Q", ">Q"]
+
+_HalfCodes = Literal["half", "e", "=e", "<e", ">e"]
+_SingleCodes = Literal["single", "f", "=f", "<f", ">f"]
+_DoubleCodes = Literal["double", "float", "float_", "d", "=d", "<d", ">d"]
+_LongDoubleCodes = Literal["longdouble", "longfloat", "g", "=g", "<g", ">g"]
+
+_CSingleCodes = Literal["csingle", "singlecomplex", "F", "=F", "<F", ">F"]
+_CDoubleCodes = Literal["cdouble", "complex", "complex_", "cfloat", "D", "=D", "<D", ">D"]
+_CLongDoubleCodes = Literal["clongdouble", "clongfloat", "longcomplex", "G", "=G", "<G", ">G"]
+
+_StrCodes = Literal["str", "str_", "str0", "unicode", "unicode_", "U", "=U", "<U", ">U"]
+_BytesCodes = Literal["bytes", "bytes_", "bytes0", "S", "=S", "<S", ">S"]
+_VoidCodes = Literal["void", "void0", "V", "=V", "<V", ">V"]
+_ObjectCodes = Literal["object", "object_", "O", "=O", "<O", ">O"]
+
+_DT64Codes = Literal[
+    "datetime64", "=datetime64", "<datetime64", ">datetime64",
+    "datetime64[Y]", "=datetime64[Y]", "<datetime64[Y]", ">datetime64[Y]",
+    "datetime64[M]", "=datetime64[M]", "<datetime64[M]", ">datetime64[M]",
+    "datetime64[W]", "=datetime64[W]", "<datetime64[W]", ">datetime64[W]",
+    "datetime64[D]", "=datetime64[D]", "<datetime64[D]", ">datetime64[D]",
+    "datetime64[h]", "=datetime64[h]", "<datetime64[h]", ">datetime64[h]",
+    "datetime64[m]", "=datetime64[m]", "<datetime64[m]", ">datetime64[m]",
+    "datetime64[s]", "=datetime64[s]", "<datetime64[s]", ">datetime64[s]",
+    "datetime64[ms]", "=datetime64[ms]", "<datetime64[ms]", ">datetime64[ms]",
+    "datetime64[us]", "=datetime64[us]", "<datetime64[us]", ">datetime64[us]",
+    "datetime64[ns]", "=datetime64[ns]", "<datetime64[ns]", ">datetime64[ns]",
+    "datetime64[ps]", "=datetime64[ps]", "<datetime64[ps]", ">datetime64[ps]",
+    "datetime64[fs]", "=datetime64[fs]", "<datetime64[fs]", ">datetime64[fs]",
+    "datetime64[as]", "=datetime64[as]", "<datetime64[as]", ">datetime64[as]",
+    "M", "=M", "<M", ">M",
+    "M8", "=M8", "<M8", ">M8",
+    "M8[Y]", "=M8[Y]", "<M8[Y]", ">M8[Y]",
+    "M8[M]", "=M8[M]", "<M8[M]", ">M8[M]",
+    "M8[W]", "=M8[W]", "<M8[W]", ">M8[W]",
+    "M8[D]", "=M8[D]", "<M8[D]", ">M8[D]",
+    "M8[h]", "=M8[h]", "<M8[h]", ">M8[h]",
+    "M8[m]", "=M8[m]", "<M8[m]", ">M8[m]",
+    "M8[s]", "=M8[s]", "<M8[s]", ">M8[s]",
+    "M8[ms]", "=M8[ms]", "<M8[ms]", ">M8[ms]",
+    "M8[us]", "=M8[us]", "<M8[us]", ">M8[us]",
+    "M8[ns]", "=M8[ns]", "<M8[ns]", ">M8[ns]",
+    "M8[ps]", "=M8[ps]", "<M8[ps]", ">M8[ps]",
+    "M8[fs]", "=M8[fs]", "<M8[fs]", ">M8[fs]",
+    "M8[as]", "=M8[as]", "<M8[as]", ">M8[as]",
+]
+_TD64Codes = Literal[
+    "timedelta64", "=timedelta64", "<timedelta64", ">timedelta64",
+    "timedelta64[Y]", "=timedelta64[Y]", "<timedelta64[Y]", ">timedelta64[Y]",
+    "timedelta64[M]", "=timedelta64[M]", "<timedelta64[M]", ">timedelta64[M]",
+    "timedelta64[W]", "=timedelta64[W]", "<timedelta64[W]", ">timedelta64[W]",
+    "timedelta64[D]", "=timedelta64[D]", "<timedelta64[D]", ">timedelta64[D]",
+    "timedelta64[h]", "=timedelta64[h]", "<timedelta64[h]", ">timedelta64[h]",
+    "timedelta64[m]", "=timedelta64[m]", "<timedelta64[m]", ">timedelta64[m]",
+    "timedelta64[s]", "=timedelta64[s]", "<timedelta64[s]", ">timedelta64[s]",
+    "timedelta64[ms]", "=timedelta64[ms]", "<timedelta64[ms]", ">timedelta64[ms]",
+    "timedelta64[us]", "=timedelta64[us]", "<timedelta64[us]", ">timedelta64[us]",
+    "timedelta64[ns]", "=timedelta64[ns]", "<timedelta64[ns]", ">timedelta64[ns]",
+    "timedelta64[ps]", "=timedelta64[ps]", "<timedelta64[ps]", ">timedelta64[ps]",
+    "timedelta64[fs]", "=timedelta64[fs]", "<timedelta64[fs]", ">timedelta64[fs]",
+    "timedelta64[as]", "=timedelta64[as]", "<timedelta64[as]", ">timedelta64[as]",
+    "m", "=m", "<m", ">m",
+    "m8", "=m8", "<m8", ">m8",
+    "m8[Y]", "=m8[Y]", "<m8[Y]", ">m8[Y]",
+    "m8[M]", "=m8[M]", "<m8[M]", ">m8[M]",
+    "m8[W]", "=m8[W]", "<m8[W]", ">m8[W]",
+    "m8[D]", "=m8[D]", "<m8[D]", ">m8[D]",
+    "m8[h]", "=m8[h]", "<m8[h]", ">m8[h]",
+    "m8[m]", "=m8[m]", "<m8[m]", ">m8[m]",
+    "m8[s]", "=m8[s]", "<m8[s]", ">m8[s]",
+    "m8[ms]", "=m8[ms]", "<m8[ms]", ">m8[ms]",
+    "m8[us]", "=m8[us]", "<m8[us]", ">m8[us]",
+    "m8[ns]", "=m8[ns]", "<m8[ns]", ">m8[ns]",
+    "m8[ps]", "=m8[ps]", "<m8[ps]", ">m8[ps]",
+    "m8[fs]", "=m8[fs]", "<m8[fs]", ">m8[fs]",
+    "m8[as]", "=m8[as]", "<m8[as]", ">m8[as]",
+]
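Note on the ``Literal`` aliases in ``_char_codes.py`` above: they exist for static checking, but the accepted strings can also be recovered at runtime with ``typing.get_args``. A small stdlib-only sketch follows. ``Float64Codes`` mirrors the members of ``_Float64Codes`` from the diff, while the helper ``is_float64_code`` is a hypothetical addition for illustration.

```python
from typing import Literal, get_args

# Same members as `_Float64Codes` in the diff, under a hypothetical name.
Float64Codes = Literal["float64", "f8", "=f8", "<f8", ">f8"]


def is_float64_code(code: str) -> bool:
    """Return True if `code` is one of the literal members."""
    return code in get_args(Float64Codes)
```

A checker rejects strings outside the alias at analysis time; the helper performs the equivalent membership test on values that only arrive at runtime.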
diff --git a/numpy/_typing/_dtype_like.py b/numpy/_typing/_dtype_like.py
new file mode 100644
index 0000000..b705d82
--- /dev/null
+++ b/numpy/_typing/_dtype_like.py
@@ -0,0 +1,247 @@
+from typing import (
+    Any,
+    List,
+    Sequence,
+    Tuple,
+    Union,
+    Type,
+    TypeVar,
+    Protocol,
+    TypedDict,
+)
+
+import numpy as np
+
+from ._shape import _ShapeLike
+from ._generic_alias import _DType as DType
+
+from ._char_codes import (
+    _BoolCodes,
+    _UInt8Codes,
+    _UInt16Codes,
+    _UInt32Codes,
+    _UInt64Codes,
+    _Int8Codes,
+    _Int16Codes,
+    _Int32Codes,
+    _Int64Codes,
+    _Float16Codes,
+    _Float32Codes,
+    _Float64Codes,
+    _Complex64Codes,
+    _Complex128Codes,
+    _ByteCodes,
+    _ShortCodes,
+    _IntCCodes,
+    _IntPCodes,
+    _IntCodes,
+    _LongLongCodes,
+    _UByteCodes,
+    _UShortCodes,
+    _UIntCCodes,
+    _UIntPCodes,
+    _UIntCodes,
+    _ULongLongCodes,
+    _HalfCodes,
+    _SingleCodes,
+    _DoubleCodes,
+    _LongDoubleCodes,
+    _CSingleCodes,
+    _CDoubleCodes,
+    _CLongDoubleCodes,
+    _DT64Codes,
+    _TD64Codes,
+    _StrCodes,
+    _BytesCodes,
+    _VoidCodes,
+    _ObjectCodes,
+)
+
+_SCT = TypeVar("_SCT", bound=np.generic)
+_DType_co = TypeVar("_DType_co", covariant=True, bound=DType[Any])
+
+_DTypeLikeNested = Any  # TODO: wait for support for recursive types
+
+
+# Mandatory keys
+class _DTypeDictBase(TypedDict):
+    names: Sequence[str]
+    formats: Sequence[_DTypeLikeNested]
+
+
+# Mandatory + optional keys
+class _DTypeDict(_DTypeDictBase, total=False):
+    # Only `str` elements are usable as indexing aliases,
+    # but `titles` can in principle accept any object
+    offsets: Sequence[int]
+    titles: Sequence[Any]
+    itemsize: int
+    aligned: bool
+
+
+# A protocol for anything with the dtype attribute
+class _SupportsDType(Protocol[_DType_co]):
+    @property
+    def dtype(self) -> _DType_co: ...
+
+
+# A subset of `npt.DTypeLike` that can be parametrized w.r.t. `np.generic`
+_DTypeLike = Union[
+    "np.dtype[_SCT]",
+    Type[_SCT],
+    _SupportsDType["np.dtype[_SCT]"],
+]
+
+
+# Would create a dtype[np.void]
+_VoidDTypeLike = Union[
+    # (flexible_dtype, itemsize)
+    Tuple[_DTypeLikeNested, int],
+    # (fixed_dtype, shape)
+    Tuple[_DTypeLikeNested, _ShapeLike],
+    # [(field_name, field_dtype, field_shape), ...]
+    #
+    # The type here is quite broad because NumPy accepts quite a wide
+    # range of inputs inside the list; see the tests for some
+    # examples.
+    List[Any],
+    # {'names': ..., 'formats': ..., 'offsets': ..., 'titles': ...,
+    #  'itemsize': ...}
+    _DTypeDict,
+    # (base_dtype, new_dtype)
+    Tuple[_DTypeLikeNested, _DTypeLikeNested],
+]
+
+# Anything that can be coerced into numpy.dtype.
+# Reference: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
+DTypeLike = Union[
+    DType[Any],
+    # default data type (float64)
+    None,
+    # array-scalar types and generic types
+    Type[Any],  # NOTE: We're stuck with `Type[Any]` due to object dtypes
+    # anything with a dtype attribute
+    _SupportsDType[DType[Any]],
+    # character codes, type strings or comma-separated fields, e.g., 'float64'
+    str,
+    _VoidDTypeLike,
+]
+
+# NOTE: while it is possible to provide the dtype as a dict of
+# dtype-like objects (e.g. `{'field1': ..., 'field2': ..., ...}`),
+# this syntax is officially discouraged and
+# therefore not included in the Union defining `DTypeLike`.
+#
+# See https://github.com/numpy/numpy/issues/16891 for more details.
+
+# Aliases for commonly used dtype-like objects.
+# Note that the precision of `np.number` subclasses is ignored herein.
+_DTypeLikeBool = Union[
+    Type[bool],
+    Type[np.bool_],
+    DType[np.bool_],
+    _SupportsDType[DType[np.bool_]],
+    _BoolCodes,
+]
+_DTypeLikeUInt = Union[
+    Type[np.unsignedinteger],
+    DType[np.unsignedinteger],
+    _SupportsDType[DType[np.unsignedinteger]],
+    _UInt8Codes,
+    _UInt16Codes,
+    _UInt32Codes,
+    _UInt64Codes,
+    _UByteCodes,
+    _UShortCodes,
+    _UIntCCodes,
+    _UIntPCodes,
+    _UIntCodes,
+    _ULongLongCodes,
+]
+_DTypeLikeInt = Union[
+    Type[int],
+    Type[np.signedinteger],
+    DType[np.signedinteger],
+    _SupportsDType[DType[np.signedinteger]],
+    _Int8Codes,
+    _Int16Codes,
+    _Int32Codes,
+    _Int64Codes,
+    _ByteCodes,
+    _ShortCodes,
+    _IntCCodes,
+    _IntPCodes,
+    _IntCodes,
+    _LongLongCodes,
+]
+_DTypeLikeFloat = Union[
+    Type[float],
+    Type[np.floating],
+    DType[np.floating],
+    _SupportsDType[DType[np.floating]],
+    _Float16Codes,
+    _Float32Codes,
+    _Float64Codes,
+    _HalfCodes,
+    _SingleCodes,
+    _DoubleCodes,
+    _LongDoubleCodes,
+]
+_DTypeLikeComplex = Union[
+    Type[complex],
+    Type[np.complexfloating],
+    DType[np.complexfloating],
+    _SupportsDType[DType[np.complexfloating]],
+    _Complex64Codes,
+    _Complex128Codes,
+    _CSingleCodes,
+    _CDoubleCodes,
+    _CLongDoubleCodes,
+]
+_DTypeLikeTD64 = Union[
+    Type[np.timedelta64],
+    DType[np.timedelta64],
+    _SupportsDType[DType[np.timedelta64]],
+    _TD64Codes,
+]
+_DTypeLikeDT64 = Union[
+    Type[np.datetime64],
+    DType[np.datetime64],
+    _SupportsDType[DType[np.datetime64]],
+    _DT64Codes,
+]
+_DTypeLikeStr = Union[
+    Type[str],
+    Type[np.str_],
+    DType[np.str_],
+    _SupportsDType[DType[np.str_]],
+    _StrCodes,
+]
+_DTypeLikeBytes = Union[
+    Type[bytes],
+    Type[np.bytes_],
+    DType[np.bytes_],
+    _SupportsDType[DType[np.bytes_]],
+    _BytesCodes,
+]
+_DTypeLikeVoid = Union[
+    Type[np.void],
+    DType[np.void],
+    _SupportsDType[DType[np.void]],
+    _VoidCodes,
+    _VoidDTypeLike,
+]
+_DTypeLikeObject = Union[
+    type,
+    DType[np.object_],
+    _SupportsDType[DType[np.object_]],
+    _ObjectCodes,
+]
+
+_DTypeLikeComplex_co = Union[
+    _DTypeLikeBool,
+    _DTypeLikeUInt,
+    _DTypeLikeInt,
+    _DTypeLikeFloat,
+    _DTypeLikeComplex,
+]
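The aliases above enumerate the spellings that `np.dtype` accepts. As a rough illustration (a sketch that uses only the public `np.dtype` constructor; the field names `x` and `y` are invented for the example), each call below corresponds to one arm of `DTypeLike`:

```python
import numpy as np

# Type string, builtin type, and array-scalar type all name the same dtype.
from_str = np.dtype("float64")
from_builtin = np.dtype(float)
from_scalar = np.dtype(np.float64)

# (fixed_dtype, shape): a sub-array dtype.
subarray = np.dtype(("i4", (2, 3)))

# [(field_name, field_dtype), ...]: a structured (void) dtype.
from_list = np.dtype([("x", "f8"), ("y", "i4")])

# {'names': ..., 'formats': ...}: the `_DTypeDict` form with mandatory keys only.
from_dict = np.dtype({"names": ["x", "y"], "formats": ["f8", "i4"]})
```

The last three forms all produce a dtype whose scalar type is `np.void`, which is why they are grouped under `_VoidDTypeLike`.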
diff --git a/numpy/_typing/_extended_precision.py b/numpy/_typing/_extended_precision.py

new file mode 100644 (file)

index 0000000..edc1778
--- /dev/null
+++ b/numpy/_typing/_extended_precision.py
@@ -0,0 +1,43 @@
+"""A module with platform-specific extended precision
+`numpy.number` subclasses.
+
+The subclasses are defined here (instead of ``__init__.pyi``) such
+that they can be imported conditionally via numpy's mypy plugin.
+"""
+
+from typing import TYPE_CHECKING, Any
+
+import numpy as np
+from . import (
+    _80Bit,
+    _96Bit,
+    _128Bit,
+    _256Bit,
+)
+
+if TYPE_CHECKING:
+    uint128 = np.unsignedinteger[_128Bit]
+    uint256 = np.unsignedinteger[_256Bit]
+    int128 = np.signedinteger[_128Bit]
+    int256 = np.signedinteger[_256Bit]
+    float80 = np.floating[_80Bit]
+    float96 = np.floating[_96Bit]
+    float128 = np.floating[_128Bit]
+    float256 = np.floating[_256Bit]
+    complex160 = np.complexfloating[_80Bit, _80Bit]
+    complex192 = np.complexfloating[_96Bit, _96Bit]
+    complex256 = np.complexfloating[_128Bit, _128Bit]
+    complex512 = np.complexfloating[_256Bit, _256Bit]
+else:
+    uint128 = Any
+    uint256 = Any
+    int128 = Any
+    int256 = Any
+    float80 = Any
+    float96 = Any
+    float128 = Any
+    float256 = Any
+    complex160 = Any
+    complex192 = Any
+    complex256 = Any
+    complex512 = Any
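The module above relies on a `TYPE_CHECKING` switch: a precise alias for static analysis, and `Any` as a runtime placeholder. A minimal standard-library sketch of the same pattern follows; the names `Ratio` and `halve` are hypothetical and not part of NumPy:

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Only static type checkers evaluate this branch.
    from fractions import Fraction
    Ratio = Fraction
else:
    # At runtime the alias degrades to `Any`, so `Any` must be importable here.
    Ratio = Any


def halve(value: Ratio) -> Ratio:
    """Return half of `value`; checkers see `Fraction`, the runtime sees `Any`."""
    return value / 2
```

Because the `else` branch is the one that actually executes, every name it uses has to be imported unconditionally.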
diff --git a/numpy/_typing/_generic_alias.py b/numpy/_typing/_generic_alias.py

new file mode 100644 (file)

index 0000000..d32814a
--- /dev/null
+++ b/numpy/_typing/_generic_alias.py
@@ -0,0 +1,244 @@
+from __future__ import annotations
+
+import sys
+import types
+from collections.abc import Generator, Iterable, Iterator
+from typing import (
+    Any,
+    ClassVar,
+    NoReturn,
+    TypeVar,
+    TYPE_CHECKING,
+)
+
+import numpy as np
+
+__all__ = ["_GenericAlias", "NDArray"]
+
+_T = TypeVar("_T", bound="_GenericAlias")
+
+
+def _to_str(obj: object) -> str:
+    """Helper function for `_GenericAlias.__repr__`."""
+    if obj is Ellipsis:
+        return '...'
+    elif isinstance(obj, type) and not isinstance(obj, _GENERIC_ALIAS_TYPE):
+        if obj.__module__ == 'builtins':
+            return obj.__qualname__
+        else:
+            return f'{obj.__module__}.{obj.__qualname__}'
+    else:
+        return repr(obj)
+
+
+def _parse_parameters(args: Iterable[Any]) -> Generator[TypeVar, None, None]:
+    """Search for all typevars and typevar-containing objects in `args`.
+
+    Helper function for `_GenericAlias.__init__`.
+
+    """
+    for i in args:
+        if hasattr(i, "__parameters__"):
+            yield from i.__parameters__
+        elif isinstance(i, TypeVar):
+            yield i
+
+
+def _reconstruct_alias(alias: _T, parameters: Iterator[TypeVar]) -> _T:
+    """Recursively replace all typevars with those from `parameters`.
+
+    Helper function for `_GenericAlias.__getitem__`.
+
+    """
+    args = []
+    for i in alias.__args__:
+        if isinstance(i, TypeVar):
+            value: Any = next(parameters)
+        elif isinstance(i, _GenericAlias):
+            value = _reconstruct_alias(i, parameters)
+        elif hasattr(i, "__parameters__"):
+            prm_tup = tuple(next(parameters) for _ in i.__parameters__)
+            value = i[prm_tup]
+        else:
+            value = i
+        args.append(value)
+
+    cls = type(alias)
+    return cls(alias.__origin__, tuple(args), alias.__unpacked__)
+
+
+class _GenericAlias:
+    """A python-based backport of the `types.GenericAlias` class.
+
+    E.g. for ``t = list[int]``, ``t.__origin__`` is ``list`` and
+    ``t.__args__`` is ``(int,)``.
+
+    See Also
+    --------
+    :pep:`585`
+        The PEP responsible for introducing `types.GenericAlias`.
+
+    """
+
+    __slots__ = (
+        "__weakref__",
+        "_origin",
+        "_args",
+        "_parameters",
+        "_hash",
+        "_starred",
+    )
+
+    @property
+    def __origin__(self) -> type:
+        return super().__getattribute__("_origin")
+
+    @property
+    def __args__(self) -> tuple[object, ...]:
+        return super().__getattribute__("_args")
+
+    @property
+    def __parameters__(self) -> tuple[TypeVar, ...]:
+        """Type variables in the ``GenericAlias``."""
+        return super().__getattribute__("_parameters")
+
+    @property
+    def __unpacked__(self) -> bool:
+        return super().__getattribute__("_starred")
+
+    @property
+    def __typing_unpacked_tuple_args__(self) -> tuple[object, ...] | None:
+        # NOTE: This should return `__args__` if `__origin__` is a tuple,
+        # which should never be the case with how `_GenericAlias` is used
+        # within numpy
+        return None
+
+    def __init__(
+        self,
+        origin: type,
+        args: object | tuple[object, ...],
+        starred: bool = False,
+    ) -> None:
+        self._origin = origin
+        self._args = args if isinstance(args, tuple) else (args,)
+        self._parameters = tuple(_parse_parameters(self.__args__))
+        self._starred = starred
+
+    @property
+    def __call__(self) -> type[Any]:
+        return self.__origin__
+
+    def __reduce__(self: _T) -> tuple[
+        type[_T],
+        tuple[type[Any], tuple[object, ...], bool],
+    ]:
+        cls = type(self)
+        return cls, (self.__origin__, self.__args__, self.__unpacked__)
+
+    def __mro_entries__(self, bases: Iterable[object]) -> tuple[type[Any]]:
+        return (self.__origin__,)
+
+    def __dir__(self) -> list[str]:
+        """Implement ``dir(self)``."""
+        cls = type(self)
+        dir_origin = set(dir(self.__origin__))
+        return sorted(cls._ATTR_EXCEPTIONS | dir_origin)
+
+    def __hash__(self) -> int:
+        """Return ``hash(self)``."""
+        # Attempt to use the cached hash
+        try:
+            return super().__getattribute__("_hash")
+        except AttributeError:
+            self._hash: int = (
+                hash(self.__origin__) ^
+                hash(self.__args__) ^
+                hash(self.__unpacked__)
+            )
+            return super().__getattribute__("_hash")
+
+    def __instancecheck__(self, obj: object) -> NoReturn:
+        """Check if an `obj` is an instance."""
+        raise TypeError("isinstance() argument 2 cannot be a "
+                        "parameterized generic")
+
+    def __subclasscheck__(self, cls: type) -> NoReturn:
+        """Check if a `cls` is a subclass."""
+        raise TypeError("issubclass() argument 2 cannot be a "
+                        "parameterized generic")
+
+    def __repr__(self) -> str:
+        """Return ``repr(self)``."""
+        args = ", ".join(_to_str(i) for i in self.__args__)
+        origin = _to_str(self.__origin__)
+        prefix = "*" if self.__unpacked__ else ""
+        return f"{prefix}{origin}[{args}]"
+
+    def __getitem__(self: _T, key: object | tuple[object, ...]) -> _T:
+        """Return ``self[key]``."""
+        key_tup = key if isinstance(key, tuple) else (key,)
+
+        if len(self.__parameters__) == 0:
+            raise TypeError(f"There are no type variables left in {self}")
+        elif len(key_tup) > len(self.__parameters__):
+            raise TypeError(f"Too many arguments for {self}")
+        elif len(key_tup) < len(self.__parameters__):
+            raise TypeError(f"Too few arguments for {self}")
+
+        key_iter = iter(key_tup)
+        return _reconstruct_alias(self, key_iter)
+
+    def __eq__(self, value: object) -> bool:
+        """Return ``self == value``."""
+        if not isinstance(value, _GENERIC_ALIAS_TYPE):
+            return NotImplemented
+        return (
+            self.__origin__ == value.__origin__ and
+            self.__args__ == value.__args__ and
+            self.__unpacked__ == getattr(
+                value, "__unpacked__", self.__unpacked__
+            )
+        )
+
+    def __iter__(self: _T) -> Generator[_T, None, None]:
+        """Return ``iter(self)``."""
+        cls = type(self)
+        yield cls(self.__origin__, self.__args__, True)
+
+    _ATTR_EXCEPTIONS: ClassVar[frozenset[str]] = frozenset({
+        "__origin__",
+        "__args__",
+        "__parameters__",
+        "__mro_entries__",
+        "__reduce__",
+        "__reduce_ex__",
+        "__copy__",
+        "__deepcopy__",
+        "__unpacked__",
+        "__typing_unpacked_tuple_args__",
+    })
+
+    def __getattribute__(self, name: str) -> Any:
+        """Return ``getattr(self, name)``."""
+        # Pull the attribute from `__origin__` unless its
+        # name is in `_ATTR_EXCEPTIONS`
+        cls = type(self)
+        if name in cls._ATTR_EXCEPTIONS:
+            return super().__getattribute__(name)
+        return getattr(self.__origin__, name)
+
+
+# See `_GenericAlias.__eq__`
+if sys.version_info >= (3, 9):
+    _GENERIC_ALIAS_TYPE = (_GenericAlias, types.GenericAlias)
+else:
+    _GENERIC_ALIAS_TYPE = (_GenericAlias,)
+
+ScalarType = TypeVar("ScalarType", bound=np.generic, covariant=True)
+
+if TYPE_CHECKING or sys.version_info >= (3, 9):
+    _DType = np.dtype[ScalarType]
+    NDArray = np.ndarray[Any, np.dtype[ScalarType]]
+else:
+    _DType = _GenericAlias(np.dtype, (ScalarType,))
+    NDArray = _GenericAlias(np.ndarray, (Any, _DType))
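For orientation, the behaviour that `_GenericAlias` backports can be observed on the standard `types.GenericAlias` (Python 3.9 or newer assumed). This is a sketch of the reference behaviour, not of NumPy's class itself; `T` and the helper `isinstance_is_rejected` are names chosen for the example:

```python
import types
from typing import TypeVar

T = TypeVar("T")

alias = list[int]
origin = alias.__origin__            # the unparametrized class
args = alias.__args__                # the type arguments

partial = dict[str, T]               # one type variable is still free
parameters = partial.__parameters__
concrete = partial[float]            # substitute the remaining type variable


def isinstance_is_rejected() -> bool:
    """Return True if `isinstance` refuses a parametrized generic."""
    try:
        isinstance([], alias)
    except TypeError:
        return True
    return False
```

Calling the alias forwards to the origin, so `alias()` builds a plain `list`; the backport gets the same effect by exposing `__origin__` through its `__call__` property.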
diff --git a/numpy/_typing/_nbit.py b/numpy/_typing/_nbit.py

new file mode 100644 (file)

index 0000000..b8d35db
--- /dev/null
+++ b/numpy/_typing/_nbit.py
@@ -0,0 +1,16 @@
+"""A module with the precisions of platform-specific `~numpy.number`s."""
+
+from typing import Any
+
+# To be replaced with an `npt.NBitBase` subclass by numpy's mypy plugin
+_NBitByte = Any
+_NBitShort = Any
+_NBitIntC = Any
+_NBitIntP = Any
+_NBitInt = Any
+_NBitLongLong = Any
+
+_NBitHalf = Any
+_NBitSingle = Any
+_NBitDouble = Any
+_NBitLongDouble = Any
diff --git a/numpy/_typing/_nested_sequence.py b/numpy/_typing/_nested_sequence.py

new file mode 100644 (file)

index 0000000..7c12c4a
--- /dev/null
+++ b/numpy/_typing/_nested_sequence.py
@@ -0,0 +1,90 @@
+"""A module containing the `_NestedSequence` protocol."""
+
+from __future__ import annotations
+
+from typing import (
+    Any,
+    Iterator,
+    overload,
+    TypeVar,
+    Protocol,
+)
+
+__all__ = ["_NestedSequence"]
+
+_T_co = TypeVar("_T_co", covariant=True)
+
+
+class _NestedSequence(Protocol[_T_co]):
+    """A protocol for representing nested sequences.
+
+    Warning
+    -------
+    `_NestedSequence` currently does not work in combination with typevars,
+    *e.g.* ``def func(a: _NestedSequence[T]) -> T: ...``.
+
+    See Also
+    --------
+    collections.abc.Sequence
+        ABCs for read-only and mutable :term:`sequences`.
+
+    Examples
+    --------
+    .. code-block:: python
+
+        >>> from __future__ import annotations
+
+        >>> from typing import TYPE_CHECKING
+        >>> import numpy as np
+        >>> from numpy._typing import _NestedSequence
+
+        >>> def get_dtype(seq: _NestedSequence[float]) -> np.dtype[np.float64]:
+        ...     return np.asarray(seq).dtype
+
+        >>> a = get_dtype([1.0])
+        >>> b = get_dtype([[1.0]])
+        >>> c = get_dtype([[[1.0]]])
+        >>> d = get_dtype([[[[1.0]]]])
+
+        >>> if TYPE_CHECKING:
+        ...     reveal_locals()
+        ...     # note: Revealed local types are:
+        ...     # note:     a: numpy.dtype[numpy.floating[numpy._typing._64Bit]]
+        ...     # note:     b: numpy.dtype[numpy.floating[numpy._typing._64Bit]]
+        ...     # note:     c: numpy.dtype[numpy.floating[numpy._typing._64Bit]]
+        ...     # note:     d: numpy.dtype[numpy.floating[numpy._typing._64Bit]]
+
+    """
+
+    def __len__(self, /) -> int:
+        """Implement ``len(self)``."""
+        raise NotImplementedError
+
+    @overload
+    def __getitem__(self, index: int, /) -> _T_co | _NestedSequence[_T_co]: ...
+    @overload
+    def __getitem__(self, index: slice, /) -> _NestedSequence[_T_co]: ...
+
+    def __getitem__(self, index, /):
+        """Implement ``self[x]``."""
+        raise NotImplementedError
+
+    def __contains__(self, x: object, /) -> bool:
+        """Implement ``x in self``."""
+        raise NotImplementedError
+
+    def __iter__(self, /) -> Iterator[_T_co | _NestedSequence[_T_co]]:
+        """Implement ``iter(self)``."""
+        raise NotImplementedError
+
+    def __reversed__(self, /) -> Iterator[_T_co | _NestedSequence[_T_co]]:
+        """Implement ``reversed(self)``."""
+        raise NotImplementedError
+
+    def count(self, value: Any, /) -> int:
+        """Return the number of occurrences of `value`."""
+        raise NotImplementedError
+
+    def index(self, value: Any, /) -> int:
+        """Return the first index of `value`."""
+        raise NotImplementedError
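The recursive return types in the protocol above are what let one annotation cover any nesting depth. A small standard-library sketch of the idea follows, assuming a cut-down protocol named `NestedSeq` and a helper `flatten` (both hypothetical, not NumPy API):

```python
from __future__ import annotations

from typing import Iterator, Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)


class NestedSeq(Protocol[T_co]):
    """A reduced protocol: items are either leaves or further nested sequences."""

    def __len__(self) -> int: ...
    def __getitem__(self, index: int) -> T_co | NestedSeq[T_co]: ...
    def __iter__(self) -> Iterator[T_co | NestedSeq[T_co]]: ...


def flatten(seq: NestedSeq[float]) -> list[float]:
    """Collect the leaves of an arbitrarily nested list/tuple, depth first."""
    out: list[float] = []
    for item in seq:
        if isinstance(item, (list, tuple)):
            out.extend(flatten(item))   # recurse into the nested level
        else:
            out.append(item)            # a leaf value
    return out
```

The `from __future__ import annotations` line keeps the self-referential annotations from being evaluated at runtime, which is the same reason the NumPy module imports it.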
diff --git a/numpy/_typing/_scalars.py b/numpy/_typing/_scalars.py

new file mode 100644 (file)

index 0000000..516b996
--- /dev/null
+++ b/numpy/_typing/_scalars.py
@@ -0,0 +1,30 @@
+from typing import Union, Tuple, Any
+
+import numpy as np
+
+# NOTE: `_StrLike_co` and `_BytesLike_co` are pointless, as `np.str_` and
+# `np.bytes_` are already subclasses of their builtin counterpart
+
+_CharLike_co = Union[str, bytes]
+
+# The 6 `<X>Like_co` type-aliases below represent all scalars that can be
+# coerced into `<X>` (with the casting rule `same_kind`)
+_BoolLike_co = Union[bool, np.bool_]
+_UIntLike_co = Union[_BoolLike_co, np.unsignedinteger]
+_IntLike_co = Union[_BoolLike_co, int, np.integer]
+_FloatLike_co = Union[_IntLike_co, float, np.floating]
+_ComplexLike_co = Union[_FloatLike_co, complex, np.complexfloating]
+_TD64Like_co = Union[_IntLike_co, np.timedelta64]
+
+_NumberLike_co = Union[int, float, complex, np.number, np.bool_]
+_ScalarLike_co = Union[
+    int,
+    float,
+    complex,
+    str,
+    bytes,
+    np.generic,
+]
+
+# `_VoidLike_co` is technically not a scalar, but it's close enough
+_VoidLike_co = Union[Tuple[Any, ...], np.void]
diff --git a/numpy/_typing/_shape.py b/numpy/_typing/_shape.py

new file mode 100644 (file)

index 0000000..c28859b
--- /dev/null
+++ b/numpy/_typing/_shape.py
@@ -0,0 +1,6 @@
+from typing import Sequence, Tuple, Union, SupportsIndex
+
+_Shape = Tuple[int, ...]
+
+# Anything that can be coerced to a shape tuple
+_ShapeLike = Union[SupportsIndex, Sequence[SupportsIndex]]
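As an illustration of what "coerced to a shape tuple" can mean in practice, here is a sketch built on `operator.index`. The helper `normalize_shape` is hypothetical and only approximates NumPy's own coercion:

```python
import operator
from typing import Sequence, SupportsIndex, Tuple, Union

ShapeLike = Union[SupportsIndex, Sequence[SupportsIndex]]


def normalize_shape(shape: ShapeLike) -> Tuple[int, ...]:
    """Turn a single index-like value or a sequence of them into a tuple of ints."""
    try:
        # A lone integer-like object becomes a 1-tuple.
        return (operator.index(shape),)
    except TypeError:
        # Otherwise treat the input as a sequence of integer-like objects.
        return tuple(operator.index(dim) for dim in shape)
```

Using `operator.index` rather than `int` rejects floats such as `2.0`, which matches the intent of annotating with `SupportsIndex`.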
diff --git a/numpy/_typing/_ufunc.pyi b/numpy/_typing/_ufunc.pyi

new file mode 100644 (file)

index 0000000..ee0317c
--- /dev/null
+++ b/numpy/_typing/_ufunc.pyi
@@ -0,0 +1,403 @@
+"""A module with private type-check-only `numpy.ufunc` subclasses.
+
+The signatures of the ufuncs are too varied to reasonably type
+with a single class. So instead, `ufunc` has been expanded into
+four private subclasses, one for each combination of
+`~ufunc.nin` and `~ufunc.nout`.
+
+"""
+
+from typing import (
+    Any,
+    Generic,
+    overload,
+    TypeVar,
+    Literal,
+    SupportsIndex,
+)
+
+from numpy import ufunc, _CastingKind, _OrderKACF
+from numpy.typing import NDArray
+
+from ._shape import _ShapeLike
+from ._scalars import _ScalarLike_co
+from ._array_like import ArrayLike, _ArrayLikeBool_co, _ArrayLikeInt_co
+from ._dtype_like import DTypeLike
+
+_T = TypeVar("_T")
+_2Tuple = tuple[_T, _T]
+_3Tuple = tuple[_T, _T, _T]
+_4Tuple = tuple[_T, _T, _T, _T]
+
+_NTypes = TypeVar("_NTypes", bound=int)
+_IDType = TypeVar("_IDType", bound=Any)
+_NameType = TypeVar("_NameType", bound=str)
+
+# NOTE: In reality `extobj` should be a list of length 3 containing an
+# int, an int, and a callable, but there's no way to properly express
+# non-homogeneous lists.
+# Use `Any` over `Union` to avoid issues related to list invariance.
+
+# NOTE: `reduce`, `accumulate`, `reduceat` and `outer` raise a ValueError for
+# ufuncs that don't accept two input arguments and return one output argument.
+# In such cases the respective methods are simply typed as `None`.
+
+# NOTE: Similarly, `at` won't be defined for ufuncs that return
+# multiple outputs; in such cases `at` is typed as `None`
+
+# NOTE: If 2 output types are returned then `out` must be a
+# 2-tuple of arrays. Otherwise `None` or a plain array are also acceptable
+
+class _UFunc_Nin1_Nout1(ufunc, Generic[_NameType, _NTypes, _IDType]):  # type: ignore[misc]
+    @property
+    def __name__(self) -> _NameType: ...
+    @property
+    def ntypes(self) -> _NTypes: ...
+    @property
+    def identity(self) -> _IDType: ...
+    @property
+    def nin(self) -> Literal[1]: ...
+    @property
+    def nout(self) -> Literal[1]: ...
+    @property
+    def nargs(self) -> Literal[2]: ...
+    @property
+    def signature(self) -> None: ...
+    @property
+    def reduce(self) -> None: ...
+    @property
+    def accumulate(self) -> None: ...
+    @property
+    def reduceat(self) -> None: ...
+    @property
+    def outer(self) -> None: ...
+
+    @overload
+    def __call__(
+        self,
+        __x1: _ScalarLike_co,
+        out: None = ...,
+        *,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _2Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> Any: ...
+    @overload
+    def __call__(
+        self,
+        __x1: ArrayLike,
+        out: None | NDArray[Any] | tuple[NDArray[Any]] = ...,
+        *,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _2Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> NDArray[Any]: ...
+
+    def at(
+        self,
+        a: NDArray[Any],
+        indices: _ArrayLikeInt_co,
+        /,
+    ) -> None: ...
+
+class _UFunc_Nin2_Nout1(ufunc, Generic[_NameType, _NTypes, _IDType]):  # type: ignore[misc]
+    @property
+    def __name__(self) -> _NameType: ...
+    @property
+    def ntypes(self) -> _NTypes: ...
+    @property
+    def identity(self) -> _IDType: ...
+    @property
+    def nin(self) -> Literal[2]: ...
+    @property
+    def nout(self) -> Literal[1]: ...
+    @property
+    def nargs(self) -> Literal[3]: ...
+    @property
+    def signature(self) -> None: ...
+
+    @overload
+    def __call__(
+        self,
+        __x1: _ScalarLike_co,
+        __x2: _ScalarLike_co,
+        out: None = ...,
+        *,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> Any: ...
+    @overload
+    def __call__(
+        self,
+        __x1: ArrayLike,
+        __x2: ArrayLike,
+        out: None | NDArray[Any] | tuple[NDArray[Any]] = ...,
+        *,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> NDArray[Any]: ...
+
+    def at(
+        self,
+        a: NDArray[Any],
+        indices: _ArrayLikeInt_co,
+        b: ArrayLike,
+        /,
+    ) -> None: ...
+
+    def reduce(
+        self,
+        array: ArrayLike,
+        axis: None | _ShapeLike = ...,
+        dtype: DTypeLike = ...,
+        out: None | NDArray[Any] = ...,
+        keepdims: bool = ...,
+        initial: Any = ...,
+        where: _ArrayLikeBool_co = ...,
+    ) -> Any: ...
+
+    def accumulate(
+        self,
+        array: ArrayLike,
+        axis: SupportsIndex = ...,
+        dtype: DTypeLike = ...,
+        out: None | NDArray[Any] = ...,
+    ) -> NDArray[Any]: ...
+
+    def reduceat(
+        self,
+        array: ArrayLike,
+        indices: _ArrayLikeInt_co,
+        axis: SupportsIndex = ...,
+        dtype: DTypeLike = ...,
+        out: None | NDArray[Any] = ...,
+    ) -> NDArray[Any]: ...
+
+    # Expand `**kwargs` into explicit keyword-only arguments
+    @overload
+    def outer(
+        self,
+        A: _ScalarLike_co,
+        B: _ScalarLike_co,
+        /, *,
+        out: None = ...,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> Any: ...
+    @overload
+    def outer(  # type: ignore[misc]
+        self,
+        A: ArrayLike,
+        B: ArrayLike,
+        /, *,
+        out: None | NDArray[Any] | tuple[NDArray[Any]] = ...,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> NDArray[Any]: ...
+
+class _UFunc_Nin1_Nout2(ufunc, Generic[_NameType, _NTypes, _IDType]):  # type: ignore[misc]
+    @property
+    def __name__(self) -> _NameType: ...
+    @property
+    def ntypes(self) -> _NTypes: ...
+    @property
+    def identity(self) -> _IDType: ...
+    @property
+    def nin(self) -> Literal[1]: ...
+    @property
+    def nout(self) -> Literal[2]: ...
+    @property
+    def nargs(self) -> Literal[3]: ...
+    @property
+    def signature(self) -> None: ...
+    @property
+    def at(self) -> None: ...
+    @property
+    def reduce(self) -> None: ...
+    @property
+    def accumulate(self) -> None: ...
+    @property
+    def reduceat(self) -> None: ...
+    @property
+    def outer(self) -> None: ...
+
+    @overload
+    def __call__(
+        self,
+        __x1: _ScalarLike_co,
+        __out1: None = ...,
+        __out2: None = ...,
+        *,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> _2Tuple[Any]: ...
+    @overload
+    def __call__(
+        self,
+        __x1: ArrayLike,
+        __out1: None | NDArray[Any] = ...,
+        __out2: None | NDArray[Any] = ...,
+        *,
+        out: _2Tuple[NDArray[Any]] = ...,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> _2Tuple[NDArray[Any]]: ...
+
+class _UFunc_Nin2_Nout2(ufunc, Generic[_NameType, _NTypes, _IDType]):  # type: ignore[misc]
+    @property
+    def __name__(self) -> _NameType: ...
+    @property
+    def ntypes(self) -> _NTypes: ...
+    @property
+    def identity(self) -> _IDType: ...
+    @property
+    def nin(self) -> Literal[2]: ...
+    @property
+    def nout(self) -> Literal[2]: ...
+    @property
+    def nargs(self) -> Literal[4]: ...
+    @property
+    def signature(self) -> None: ...
+    @property
+    def at(self) -> None: ...
+    @property
+    def reduce(self) -> None: ...
+    @property
+    def accumulate(self) -> None: ...
+    @property
+    def reduceat(self) -> None: ...
+    @property
+    def outer(self) -> None: ...
+
+    @overload
+    def __call__(
+        self,
+        __x1: _ScalarLike_co,
+        __x2: _ScalarLike_co,
+        __out1: None = ...,
+        __out2: None = ...,
+        *,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _4Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> _2Tuple[Any]: ...
+    @overload
+    def __call__(
+        self,
+        __x1: ArrayLike,
+        __x2: ArrayLike,
+        __out1: None | NDArray[Any] = ...,
+        __out2: None | NDArray[Any] = ...,
+        *,
+        out: _2Tuple[NDArray[Any]] = ...,
+        where: None | _ArrayLikeBool_co = ...,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _4Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+    ) -> _2Tuple[NDArray[Any]]: ...
+
+class _GUFunc_Nin2_Nout1(ufunc, Generic[_NameType, _NTypes, _IDType]):  # type: ignore[misc]
+    @property
+    def __name__(self) -> _NameType: ...
+    @property
+    def ntypes(self) -> _NTypes: ...
+    @property
+    def identity(self) -> _IDType: ...
+    @property
+    def nin(self) -> Literal[2]: ...
+    @property
+    def nout(self) -> Literal[1]: ...
+    @property
+    def nargs(self) -> Literal[3]: ...
+
+    # NOTE: In practice the only gufunc in the main namespace is `matmul`,
+    # so we can use its signature here
+    @property
+    def signature(self) -> Literal["(n?,k),(k,m?)->(n?,m?)"]: ...
+    @property
+    def reduce(self) -> None: ...
+    @property
+    def accumulate(self) -> None: ...
+    @property
+    def reduceat(self) -> None: ...
+    @property
+    def outer(self) -> None: ...
+    @property
+    def at(self) -> None: ...
+
+    # Scalar for 1D array-likes; ndarray otherwise
+    @overload
+    def __call__(
+        self,
+        __x1: ArrayLike,
+        __x2: ArrayLike,
+        out: None = ...,
+        *,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+        axes: list[_2Tuple[SupportsIndex]] = ...,
+    ) -> Any: ...
+    @overload
+    def __call__(
+        self,
+        __x1: ArrayLike,
+        __x2: ArrayLike,
+        out: NDArray[Any] | tuple[NDArray[Any]],
+        *,
+        casting: _CastingKind = ...,
+        order: _OrderKACF = ...,
+        dtype: DTypeLike = ...,
+        subok: bool = ...,
+        signature: str | _3Tuple[None | str] = ...,
+        extobj: list[Any] = ...,
+        axes: list[_2Tuple[SupportsIndex]] = ...,
+    ) -> NDArray[Any]: ...
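The split into per-arity subclasses mirrors attributes that can be inspected at runtime. The sketch below uses only public ufuncs from the main namespace; which private stub class each one is assigned to is not shown here and should be treated as an assumption:

```python
import numpy as np

# (nin, nout) pairs, one representative ufunc per combination.
nin_nout = {
    "negative": (np.negative.nin, np.negative.nout),
    "add": (np.add.nin, np.add.nout),
    "frexp": (np.frexp.nin, np.frexp.nout),
    "divmod": (np.divmod.nin, np.divmod.nout),
    "matmul": (np.matmul.nin, np.matmul.nout),
}

# Only generalized ufuncs carry a core-dimension signature.
add_signature = np.add.signature
matmul_signature = np.matmul.signature

# `reduce` is only supported for ufuncs with two inputs and one output.
total = np.add.reduce([1, 2, 3])
try:
    np.negative.reduce([1, 2, 3])
    unary_reduce_error = None
except ValueError as exc:
    unary_reduce_error = exc
```

Typing `reduce` as `None` on the unary stubs lets a type checker flag the failing call before it raises at runtime.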
diff --git a/numpy/_typing/setup.py b/numpy/_typing/setup.py

new file mode 100644 (file)

index 0000000..24022fd
--- /dev/null
+++ b/numpy/_typing/setup.py
@@ -0,0 +1,10 @@
+def configuration(parent_package='', top_path=None):
+    from numpy.distutils.misc_util import Configuration
+    config = Configuration('_typing', parent_package, top_path)
+    config.add_data_files('*.pyi')
+    return config
+
+
+if __name__ == '__main__':
+    from numpy.distutils.core import setup
+    setup(configuration=configuration)
diff --git a/numpy/_version.py b/numpy/_version.py

index 92f62350bddd630b9ad5d21c6d8b1021491287bb..38f459ae6bfa0ac6022551feaec7d5ce5afe4744 100644 (file)
--- a/numpy/_version.py
+++ b/numpy/_version.py
@@ -8,11 +8,11 @@ import json
  
  version_json = '''
  {
- "date": "2022-05-20T10:57:06-0600",
+ "date": "2022-06-22T13:57:27-0600",
   "dirty": false,
   "error": null,
- "full-revisionid": "08772f91455db66810995db5e9d0671f91e027ed",
- "version": "1.22.4"
+ "full-revisionid": "54c52f13713f3d21795926ca4dbb27e16fada171",
+ "version": "1.23.0"
  }
  '''  # END VERSION_JSON
  
diff --git a/numpy/array_api/_array_object.py b/numpy/array_api/_array_object.py

index 00ffe7162158962ed17f89f06ebe1e9e35227ead..c4746fad96e7c0ef7d64803ab7652cd99feae4af 100644 (file)
--- a/numpy/array_api/_array_object.py
+++ b/numpy/array_api/_array_object.py
@@ -29,7 +29,7 @@ from ._dtypes import (
      _dtype_categories,
  )
  
-from typing import TYPE_CHECKING, Optional, Tuple, Union, Any
+from typing import TYPE_CHECKING, Optional, Tuple, Union, Any, SupportsIndex
  import types
  
  if TYPE_CHECKING:
@@ -56,6 +56,7 @@ class Array:
      functions, such as asarray().
  
      """
+    _array: np.ndarray
  
      # Use a custom constructor instead of __init__, as manually initializing
      # this class is not supported API.
@@ -125,7 +126,7 @@ class Array:
      # spec in places where it either deviates from or is more strict than
      # NumPy behavior
  
-    def _check_allowed_dtypes(self, other, dtype_category, op):
+    def _check_allowed_dtypes(self, other: bool | int | float | Array, dtype_category: str, op: str) -> Array:
          """
          Helper function for operators to only allow specific input dtypes
  
@@ -176,6 +177,8 @@ class Array:
          integer that is too large to fit in a NumPy integer dtype, or
          TypeError when the scalar type is incompatible with the dtype of self.
          """
+        # Note: Only Python scalar types that match the array dtype are
+        # allowed.
          if isinstance(scalar, bool):
              if self.dtype not in _boolean_dtypes:
                  raise TypeError(
@@ -194,6 +197,9 @@ class Array:
          else:
              raise TypeError("'scalar' must be a Python scalar")
  
+        # Note: scalars are unconditionally cast to the same dtype as the
+        # array.
+
          # Note: the spec only specifies integer-dtype/int promotion
          # behavior for integers within the bounds of the integer dtype.
          # Outside of those bounds we use the default NumPy behavior (either
@@ -201,7 +207,7 @@ class Array:
          return Array._new(np.array(scalar, self.dtype))
  
      @staticmethod
-    def _normalize_two_args(x1, x2):
+    def _normalize_two_args(x1, x2) -> Tuple[Array, Array]:
          """
          Normalize inputs to two arg functions to fix type promotion rules
  
@@ -237,8 +243,7 @@ class Array:
  
      # Note: A large fraction of allowed indices are disallowed here (see the
      # docstring below)
-    @staticmethod
-    def _validate_index(key, shape):
+    def _validate_index(self, key):
          """
          Validate an index according to the array API.
  
@@ -251,8 +256,7 @@ class Array:
          https://data-apis.org/array-api/latest/API_specification/indexing.html
          for the full list of required indexing behavior
  
-        This function either raises IndexError if the index ``key`` is
-        invalid, or a new key to be used in place of ``key`` in indexing. It
+        This function raises IndexError if the index ``key`` is invalid. It
          only raises ``IndexError`` on indices that are not already rejected by
          NumPy, as NumPy will already raise the appropriate error on such
          indices. ``shape`` may be None, in which case, only cases that are
@@ -263,7 +267,7 @@ class Array:
  
          - Indices to not include an implicit ellipsis at the end. That is,
            every axis of an array must be explicitly indexed or an ellipsis
-          included.
+          included. This behaviour is sometimes referred to as flat indexing.
  
          - The start and stop of a slice may not be out of bounds. In
            particular, for a slice ``i:j:k`` on an axis of size ``n``, only the
@@ -286,100 +290,122 @@ class Array:
          ``Array._new`` constructor, not this function.
  
          """
-        if isinstance(key, slice):
-            if shape is None:
-                return key
-            if shape == ():
-                return key
-            if len(shape) > 1:
+        _key = key if isinstance(key, tuple) else (key,)
+        for i in _key:
+            if isinstance(i, bool) or not (
+                isinstance(i, SupportsIndex)  # i.e. ints
+                or isinstance(i, slice)
+                or i == Ellipsis
+                or i is None
+                or isinstance(i, Array)
+                or isinstance(i, np.ndarray)
+            ):
                  raise IndexError(
-                    "Multidimensional arrays must include an index for every axis or use an ellipsis"
+                    f"Single-axes index {i} has {type(i)=}, but only "
+                    "integers, slices (:), ellipsis (...), newaxis (None), "
+                    "zero-dimensional integer arrays and boolean arrays "
+                    "are specified in the Array API."
                  )
-            size = shape[0]
-            # Ensure invalid slice entries are passed through.
-            if key.start is not None:
-                try:
-                    operator.index(key.start)
-                except TypeError:
-                    return key
-                if not (-size <= key.start <= size):
-                    raise IndexError(
-                        "Slices with out-of-bounds start are not allowed in the array API namespace"
-                    )
-            if key.stop is not None:
-                try:
-                    operator.index(key.stop)
-                except TypeError:
-                    return key
-                step = 1 if key.step is None else key.step
-                if (step > 0 and not (-size <= key.stop <= size)
-                    or step < 0 and not (-size - 1 <= key.stop <= max(0, size - 1))):
-                    raise IndexError("Slices with out-of-bounds stop are not allowed in the array API namespace")
-            return key
-
-        elif isinstance(key, tuple):
-            key = tuple(Array._validate_index(idx, None) for idx in key)
-
-            for idx in key:
-                if (
-                    isinstance(idx, np.ndarray)
-                    and idx.dtype in _boolean_dtypes
-                    or isinstance(idx, (bool, np.bool_))
-                ):
-                    if len(key) == 1:
-                        return key
-                    raise IndexError(
-                        "Boolean array indices combined with other indices are not allowed in the array API namespace"
-                    )
-                if isinstance(idx, tuple):
-                    raise IndexError(
-                        "Nested tuple indices are not allowed in the array API namespace"
-                    )
  
-            if shape is None:
-                return key
-            n_ellipsis = key.count(...)
-            if n_ellipsis > 1:
-                return key
-            ellipsis_i = key.index(...) if n_ellipsis else len(key)
-
-            for idx, size in list(zip(key[:ellipsis_i], shape)) + list(
-                zip(key[:ellipsis_i:-1], shape[:ellipsis_i:-1])
-            ):
-                Array._validate_index(idx, (size,))
-            if n_ellipsis == 0 and len(key) < len(shape):
+        nonexpanding_key = []
+        single_axes = []
+        n_ellipsis = 0
+        key_has_mask = False
+        for i in _key:
+            if i is not None:
+                nonexpanding_key.append(i)
+                if isinstance(i, Array) or isinstance(i, np.ndarray):
+                    if i.dtype in _boolean_dtypes:
+                        key_has_mask = True
+                    single_axes.append(i)
+                else:
+                    # i must not be an array here, to avoid elementwise equals
+                    if i == Ellipsis:
+                        n_ellipsis += 1
+                    else:
+                        single_axes.append(i)
+
+        n_single_axes = len(single_axes)
+        if n_ellipsis > 1:
+            return  # handled by ndarray
+        elif n_ellipsis == 0:
+            # Note boolean masks must be the sole index, which we check for
+            # later on.
+            if not key_has_mask and n_single_axes < self.ndim:
                  raise IndexError(
-                    "Multidimensional arrays must include an index for every axis or use an ellipsis"
+                    f"{self.ndim=}, but the multi-axes index only specifies "
+                    f"{n_single_axes} dimensions. If this was intentional, "
+                    "add a trailing ellipsis (...) which expands into as many "
+                    "slices (:) as necessary - this is what np.ndarray arrays "
+                    "implicitly do, but such flat indexing behaviour is not "
+                    "specified in the Array API."
                  )
-            return key
-        elif isinstance(key, bool):
-            return key
-        elif isinstance(key, Array):
-            if key.dtype in _integer_dtypes:
-                if key.ndim != 0:
+
+        if n_ellipsis == 0:
+            indexed_shape = self.shape
+        else:
+            ellipsis_start = None
+            for pos, i in enumerate(nonexpanding_key):
+                if not (isinstance(i, Array) or isinstance(i, np.ndarray)):
+                    if i == Ellipsis:
+                        ellipsis_start = pos
+                        break
+            assert ellipsis_start is not None  # sanity check
+            ellipsis_end = self.ndim - (n_single_axes - ellipsis_start)
+            indexed_shape = (
+                self.shape[:ellipsis_start] + self.shape[ellipsis_end:]
+            )
+        for i, side in zip(single_axes, indexed_shape):
+            if isinstance(i, slice):
+                if side == 0:
+                    f_range = "0 (or None)"
+                else:
+                    f_range = f"between -{side} and {side - 1} (or None)"
+                if i.start is not None:
+                    try:
+                        start = operator.index(i.start)
+                    except TypeError:
+                        pass  # handled by ndarray
+                    else:
+                        if not (-side <= start <= side):
+                            raise IndexError(
+                                f"Slice {i} contains {start=}, but should be "
+                                f"{f_range} for an axis of size {side} "
+                                "(out-of-bounds starts are not specified in "
+                                "the Array API)"
+                            )
+                if i.stop is not None:
+                    try:
+                        stop = operator.index(i.stop)
+                    except TypeError:
+                        pass  # handled by ndarray
+                    else:
+                        if not (-side <= stop <= side):
+                            raise IndexError(
+                                f"Slice {i} contains {stop=}, but should be "
+                                f"{f_range} for an axis of size {side} "
+                                "(out-of-bounds stops are not specified in "
+                                "the Array API)"
+                            )
+            elif isinstance(i, Array):
+                if i.dtype in _boolean_dtypes and len(_key) != 1:
+                    assert isinstance(key, tuple)  # sanity check
                      raise IndexError(
-                        "Non-zero dimensional integer array indices are not allowed in the array API namespace"
+                        f"Single-axes index {i} is a boolean array and "
+                        f"{len(key)=}, but masking is only specified in the "
+                        "Array API when the array is the sole index."
                      )
-            return key._array
-        elif key is Ellipsis:
-            return key
-        elif key is None:
-            raise IndexError(
-                "newaxis indices are not allowed in the array API namespace"
-            )
-        try:
-            key = operator.index(key)
-            if shape is not None and len(shape) > 1:
+                elif i.dtype in _integer_dtypes and i.ndim != 0:
+                    raise IndexError(
+                        f"Single-axes index {i} is a non-zero-dimensional "
+                        "integer array, but advanced integer indexing is not "
+                        "specified in the Array API."
+                    )
+            elif isinstance(i, tuple):
                  raise IndexError(
-                    "Multidimensional arrays must include an index for every axis or use an ellipsis"
+                    f"Single-axes index {i} is a tuple, but nested tuple "
+                    "indices are not specified in the Array API."
                  )
-            return key
-        except TypeError:
-            # Note: This also omits boolean arrays that are not already in
-            # Array() form, like a list of booleans.
-            raise IndexError(
-                "Only integers, slices (`:`), ellipsis (`...`), and boolean arrays are valid indices in the array API namespace"
-            )
  
      # Everything below this line is required by the spec.
  
@@ -505,7 +531,10 @@ class Array:
          """
          # Note: Only indices required by the spec are allowed. See the
          # docstring of _validate_index
-        key = self._validate_index(key, self.shape)
+        self._validate_index(key)
+        if isinstance(key, Array):
+            # Indexing self._array with array_api arrays can be erroneous
+            key = key._array
          res = self._array.__getitem__(key)
          return self._new(res)
  
@@ -692,7 +721,10 @@ class Array:
          """
          # Note: Only indices required by the spec are allowed. See the
          # docstring of _validate_index
-        key = self._validate_index(key, self.shape)
+        self._validate_index(key)
+        if isinstance(key, Array):
+            # Indexing self._array with array_api arrays can be erroneous
+            key = key._array
          self._array.__setitem__(key, asarray(value)._array)
  
      def __sub__(self: Array, other: Union[int, float, Array], /) -> Array:
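Note on the slice checks in the _validate_index hunk above: the per-axis bounds test can be read in isolation. The following is a minimal standalone sketch, not the NumPy implementation. The helper name check_slice_bounds is hypothetical, and it assumes a single axis of size `side`. Like the hunk, it skips starts and stops that are not integers and leaves them for the wrapped array to reject.

```python
import operator


def check_slice_bounds(s: slice, side: int) -> None:
    """Raise IndexError if an integer start/stop lies outside [-side, side]."""
    for name in ("start", "stop"):
        bound = getattr(s, name)
        if bound is None:
            continue  # an omitted bound is always acceptable
        try:
            bound = operator.index(bound)
        except TypeError:
            continue  # not an integer: left for the wrapped array to reject
        if not (-side <= bound <= side):
            raise IndexError(
                f"Slice {s} contains {name}={bound}, which is out of bounds "
                f"for an axis of size {side}"
            )


check_slice_bounds(slice(0, 3), 3)      # passes: stop == side is allowed
check_slice_bounds(slice(-3, None), 3)  # passes: start == -side is allowed
try:
    check_slice_bounds(slice(5), 3)     # stop == 5 exceeds side == 3
except IndexError as exc:
    print(exc)
```

Both bounds are accepted anywhere in the closed range [-side, side], which is the same condition the new code applies to start and stop.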
diff --git a/numpy/array_api/_creation_functions.py b/numpy/array_api/_creation_functions.py

index 741498ff610f2ca2b8f55374aea56ad1b3f84caf..3b014d37b2d64adc70cc76892bbfc96189b6eedf 100644 (file)
--- a/numpy/array_api/_creation_functions.py
+++ b/numpy/array_api/_creation_functions.py
@@ -154,7 +154,7 @@ def eye(
  def from_dlpack(x: object, /) -> Array:
      from ._array_object import Array
  
-    return Array._new(np._from_dlpack(x))
+    return Array._new(np.from_dlpack(x))
  
  
  def full(
diff --git a/numpy/array_api/_data_type_functions.py b/numpy/array_api/_data_type_functions.py

index e4d6db61bb8464feab0cbdc4d1b7754b26a5f989..7026bd489563e5ae8e6a88906217fca8cf305c9f 100644 (file)
--- a/numpy/array_api/_data_type_functions.py
+++ b/numpy/array_api/_data_type_functions.py
@@ -50,11 +50,23 @@ def can_cast(from_: Union[Dtype, Array], to: Dtype, /) -> bool:
  
      See its docstring for more information.
      """
-    from ._array_object import Array
-
      if isinstance(from_, Array):
-        from_ = from_._array
-    return np.can_cast(from_, to)
+        from_ = from_.dtype
+    elif from_ not in _all_dtypes:
+        raise TypeError(f"{from_=}, but should be an array_api array or dtype")
+    if to not in _all_dtypes:
+        raise TypeError(f"{to=}, but should be a dtype")
+    # Note: We avoid np.can_cast() as it has discrepancies with the array API,
+    # since NumPy allows cross-kind casting (e.g., NumPy allows bool -> int8).
+    # See https://github.com/numpy/numpy/issues/20870
+    try:
+        # We promote `from_` and `to` together. We then check if the promoted
+        # dtype is `to`, which indicates if `from_` can (up)cast to `to`.
+        dtype = _result_type(from_, to)
+        return to == dtype
+    except TypeError:
+        # _result_type() raises if the dtypes don't promote together
+        return False
  
  
  # These are internal objects for the return types of finfo and iinfo, since
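Note on the can_cast hunk above: the new logic answers "can from_ be cast to to?" by promoting the two dtypes together and checking whether the result is `to`. The sketch below illustrates that idea on a toy scale. Everything in it is an assumption made for illustration: dtypes are plain strings, _PROMOTION is a hypothetical and heavily reduced table, and result_type here is a stand-in for the module's private _result_type helper.

```python
# Hypothetical, reduced promotion table (illustration only). Pairs that are
# missing, such as ("bool", "int8"), are treated as "do not promote together".
_PROMOTION = {
    ("bool", "bool"): "bool",
    ("int8", "int8"): "int8",
    ("int8", "int16"): "int16",
    ("int16", "int8"): "int16",
    ("int16", "int16"): "int16",
    ("uint8", "uint8"): "uint8",
    ("uint8", "int8"): "int16",
    ("int8", "uint8"): "int16",
}


def result_type(a: str, b: str) -> str:
    """Return the promoted dtype, or raise TypeError if none is defined."""
    try:
        return _PROMOTION[(a, b)]
    except KeyError:
        raise TypeError(f"{a} and {b} cannot be type promoted together") from None


def can_cast(from_: str, to: str) -> bool:
    """True only when promoting from_ with to yields exactly `to`."""
    try:
        return result_type(from_, to) == to
    except TypeError:
        return False


print(can_cast("int8", "int16"))  # True: int8 promotes up to int16
print(can_cast("int16", "int8"))  # False: the promoted dtype is int16
print(can_cast("bool", "int8"))   # False: no cross-kind promotion
print(can_cast("uint8", "int8"))  # False: the promoted dtype is int16
```

The four calls correspond to the cases parametrized in test_data_type_functions.py.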
diff --git a/numpy/array_api/_sorting_functions.py b/numpy/array_api/_sorting_functions.py

index b2a11872fa327937a3c4aaf7e42ff1897ff26b65..afbb412f7f5edf3e79c46490f9d3be5177c768b4 100644 (file)
--- a/numpy/array_api/_sorting_functions.py
+++ b/numpy/array_api/_sorting_functions.py
@@ -5,6 +5,7 @@ from ._array_object import Array
  import numpy as np
  
  
+# Note: the descending keyword argument is new in this function
  def argsort(
      x: Array, /, *, axis: int = -1, descending: bool = False, stable: bool = True
  ) -> Array:
@@ -31,7 +32,7 @@ def argsort(
          res = max_i - res
      return Array._new(res)
  
-
+# Note: the descending keyword argument is new in this function
  def sort(
      x: Array, /, *, axis: int = -1, descending: bool = False, stable: bool = True
  ) -> Array:
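Note on the argsort hunk above: the context line `res = max_i - res` belongs to the descending branch. As a rough illustration of why an index flip of that kind keeps a descending sort stable, here is a pure-Python sketch on a flat list. The function name is hypothetical, and the reverse/sort/flip sequence is an assumption about the technique rather than a copy of the NumPy code.

```python
def argsort_descending_stable(values):
    """Indices that sort `values` in descending order, ties in original order."""
    n = len(values)
    reversed_values = values[::-1]
    # Stable ascending argsort of the reversed input (sorted() is stable).
    order = sorted(range(n), key=lambda i: reversed_values[i])
    # Flip the result so the largest values come first.
    order = order[::-1]
    # Map positions in the reversed input back to the original input.
    max_i = n - 1
    return [max_i - i for i in order]


print(argsort_descending_stable([2, 1, 2, 3]))  # [3, 0, 2, 1]
```

Simply reversing an ascending stable argsort would put equal elements in reverse order; sorting the reversed input first and then remapping the indices avoids that.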
diff --git a/numpy/array_api/linalg.py b/numpy/array_api/linalg.py

index 8d7ba659ea9dfa738c5da5bc857bdbf3d21e38da..f422e1c2767f0d75fdc08da9f4433bcf98db9bb3 100644 (file)
--- a/numpy/array_api/linalg.py
+++ b/numpy/array_api/linalg.py
@@ -89,7 +89,6 @@ def diagonal(x: Array, /, *, offset: int = 0) -> Array:
      return Array._new(np.diagonal(x._array, offset=offset, axis1=-2, axis2=-1))
  
  
-# Note: the keyword argument name upper is different from np.linalg.eigh
  def eigh(x: Array, /) -> EighResult:
      """
      Array API compatible wrapper for :py:func:`np.linalg.eigh <numpy.linalg.eigh>`.
@@ -106,7 +105,6 @@ def eigh(x: Array, /) -> EighResult:
      return EighResult(*map(Array._new, np.linalg.eigh(x._array)))
  
  
-# Note: the keyword argument name upper is different from np.linalg.eigvalsh
  def eigvalsh(x: Array, /) -> Array:
      """
      Array API compatible wrapper for :py:func:`np.linalg.eigvalsh <numpy.linalg.eigvalsh>`.
@@ -346,6 +344,8 @@ def svd(x: Array, /, *, full_matrices: bool = True) -> SVDResult:
  # Note: svdvals is not in NumPy (but it is in SciPy). It is equivalent to
  # np.linalg.svd(compute_uv=False).
  def svdvals(x: Array, /) -> Union[Array, Tuple[Array, ...]]:
+    if x.dtype not in _floating_dtypes:
+        raise TypeError('Only floating-point dtypes are allowed in svdvals')
      return Array._new(np.linalg.svd(x._array, compute_uv=False))
  
  # Note: tensordot is the numpy top-level namespace but not in np.linalg
@@ -366,12 +366,16 @@ def trace(x: Array, /, *, offset: int = 0) -> Array:
  
      See its docstring for more information.
      """
+    if x.dtype not in _numeric_dtypes:
+        raise TypeError('Only numeric dtypes are allowed in trace')
      # Note: trace always operates on the last two axes, whereas np.trace
      # operates on the first two axes by default
      return Array._new(np.asarray(np.trace(x._array, offset=offset, axis1=-2, axis2=-1)))
  
  # Note: vecdot is not in NumPy
  def vecdot(x1: Array, x2: Array, /, *, axis: int = -1) -> Array:
+    if x1.dtype not in _numeric_dtypes or x2.dtype not in _numeric_dtypes:
+        raise TypeError('Only numeric dtypes are allowed in vecdot')
      return tensordot(x1, x2, axes=((axis,), (axis,)))
  
  
@@ -380,7 +384,7 @@ def vecdot(x1: Array, x2: Array, /, *, axis: int = -1) -> Array:
  
  # The type for ord should be Optional[Union[int, float, Literal[np.inf,
  # -np.inf]]] but Literal does not support floating-point literals.
-def vector_norm(x: Array, /, *, axis: Optional[Union[int, Tuple[int, int]]] = None, keepdims: bool = False, ord: Optional[Union[int, float]] = 2) -> Array:
+def vector_norm(x: Array, /, *, axis: Optional[Union[int, Tuple[int, ...]]] = None, keepdims: bool = False, ord: Optional[Union[int, float]] = 2) -> Array:
      """
      Array API compatible wrapper for :py:func:`np.linalg.norm <numpy.linalg.norm>`.
  
diff --git a/numpy/array_api/tests/test_array_object.py b/numpy/array_api/tests/test_array_object.py

index 1fe1dfddf80cd753287845571b63afa9dadf4f9d..ba9223532be50a4d0d0d829f1d253ccd09e1d6f5 100644 (file)
--- a/numpy/array_api/tests/test_array_object.py
+++ b/numpy/array_api/tests/test_array_object.py
@@ -2,8 +2,9 @@ import operator
  
  from numpy.testing import assert_raises
  import numpy as np
+import pytest
  
-from .. import ones, asarray, result_type, all, equal
+from .. import ones, asarray, reshape, result_type, all, equal
  from .._array_object import Array
  from .._dtypes import (
      _all_dtypes,
@@ -17,6 +18,7 @@ from .._dtypes import (
      int32,
      int64,
      uint64,
+    bool as bool_,
  )
  
  
@@ -70,11 +72,6 @@ def test_validate_index():
      assert_raises(IndexError, lambda: a[[0, 1]])
      assert_raises(IndexError, lambda: a[np.array([[0, 1]])])
  
-    # np.newaxis is not allowed
-    assert_raises(IndexError, lambda: a[None])
-    assert_raises(IndexError, lambda: a[None, ...])
-    assert_raises(IndexError, lambda: a[..., None])
-
      # Multiaxis indices must contain exactly as many indices as dimensions
      assert_raises(IndexError, lambda: a[()])
      assert_raises(IndexError, lambda: a[0,])
@@ -322,3 +319,57 @@ def test___array__():
      b = np.asarray(a, dtype=np.float64)
      assert np.all(np.equal(b, np.ones((2, 3), dtype=np.float64)))
      assert b.dtype == np.float64
+
+def test_allow_newaxis():
+    a = ones(5)
+    indexed_a = a[None, :]
+    assert indexed_a.shape == (1, 5)
+
+def test_disallow_flat_indexing_with_newaxis():
+    a = ones((3, 3, 3))
+    with pytest.raises(IndexError):
+        a[None, 0, 0]
+
+def test_disallow_mask_with_newaxis():
+    a = ones((3, 3, 3))
+    with pytest.raises(IndexError):
+        a[None, asarray(True)]
+
+@pytest.mark.parametrize("shape", [(), (5,), (3, 3, 3)])
+@pytest.mark.parametrize("index", ["string", False, True])
+def test_error_on_invalid_index(shape, index):
+    a = ones(shape)
+    with pytest.raises(IndexError):
+        a[index]
+
+def test_mask_0d_array_without_errors():
+    a = ones(())
+    a[asarray(True)]
+
+@pytest.mark.parametrize(
+    "i", [slice(5), slice(5, 0), asarray(True), asarray([0, 1])]
+)
+def test_error_on_invalid_index_with_ellipsis(i):
+    a = ones((3, 3, 3))
+    with pytest.raises(IndexError):
+        a[..., i]
+    with pytest.raises(IndexError):
+        a[i, ...]
+
+def test_array_keys_use_private_array():
+    """
+    Indexing operations convert array keys before indexing the internal array
+
+    Fails when array_api array keys are not converted into NumPy-proper arrays
+    in __getitem__(). This is achieved by passing array_api arrays with 0-sized
+    dimensions, which NumPy-proper treats erroneously - not sure why!
+
+    TODO: Find and use appropriate __setitem__() case.
+    """
+    a = ones((0, 0), dtype=bool_)
+    assert a[a].shape == (0,)
+
+    a = ones((0,), dtype=bool_)
+    key = ones((0, 0), dtype=bool_)
+    with pytest.raises(IndexError):
+        a[key]
diff --git a/numpy/array_api/tests/test_data_type_functions.py b/numpy/array_api/tests/test_data_type_functions.py

new file mode 100644 (file)

index 0000000..efe3d0a
--- /dev/null
+++ b/numpy/array_api/tests/test_data_type_functions.py
@@ -0,0 +1,19 @@
+import pytest
+
+from numpy import array_api as xp
+
+
+@pytest.mark.parametrize(
+    "from_, to, expected",
+    [
+        (xp.int8, xp.int16, True),
+        (xp.int16, xp.int8, False),
+        (xp.bool, xp.int8, False),
+        (xp.asarray(0, dtype=xp.uint8), xp.int8, False),
+    ],
+)
+def test_can_cast(from_, to, expected):
+    """
+    can_cast() returns correct result
+    """
+    assert xp.can_cast(from_, to) == expected
diff --git a/numpy/array_api/tests/test_validation.py b/numpy/array_api/tests/test_validation.py

new file mode 100644 (file)

index 0000000..0dd100d
--- /dev/null
+++ b/numpy/array_api/tests/test_validation.py
@@ -0,0 +1,27 @@
+from typing import Callable
+
+import pytest
+
+from numpy import array_api as xp
+
+
+def p(func: Callable, *args, **kwargs):
+    f_sig = ", ".join(
+        [str(a) for a in args] + [f"{k}={v}" for k, v in kwargs.items()]
+    )
+    id_ = f"{func.__name__}({f_sig})"
+    return pytest.param(func, args, kwargs, id=id_)
+
+
+@pytest.mark.parametrize(
+    "func, args, kwargs",
+    [
+        p(xp.can_cast, 42, xp.int8),
+        p(xp.can_cast, xp.int8, 42),
+        p(xp.result_type, 42),
+    ],
+)
+def test_raises_on_invalid_types(func, args, kwargs):
+    """Function raises TypeError when passed invalidly-typed inputs"""
+    with pytest.raises(TypeError):
+        func(*args, **kwargs)
diff --git a/numpy/compat/__init__.py b/numpy/compat/__init__.py

index afee621b87264f13bf2f70cd5d115e7adc9d397a..ff04f725af9890c00a43fe26ca311c80b2e5556f 100644 (file)
--- a/numpy/compat/__init__.py
+++ b/numpy/compat/__init__.py
@@ -9,6 +9,7 @@ extensions, which may be included for the following reasons:
  
  """
  from . import _inspect
+from . import _pep440
  from . import py3k
  from ._inspect import getargspec, formatargspec
  from .py3k import *
diff --git a/numpy/compat/_pep440.py b/numpy/compat/_pep440.py

new file mode 100644 (file)

index 0000000..73d0afb
--- /dev/null
+++ b/numpy/compat/_pep440.py
@@ -0,0 +1,487 @@
+"""Utility to compare pep440 compatible version strings.
+
+The LooseVersion and StrictVersion classes that distutils provides don't
+work; they don't recognize anything like alpha/beta/rc/dev versions.
+"""
+
+# Copyright (c) Donald Stufft and individual contributors.
+# All rights reserved.
+
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+
+#     1. Redistributions of source code must retain the above copyright notice,
+#        this list of conditions and the following disclaimer.
+
+#     2. Redistributions in binary form must reproduce the above copyright
+#        notice, this list of conditions and the following disclaimer in the
+#        documentation and/or other materials provided with the distribution.
+
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+# POSSIBILITY OF SUCH DAMAGE.
+
+import collections
+import itertools
+import re
+
+
+__all__ = [
+    "parse", "Version", "LegacyVersion", "InvalidVersion", "VERSION_PATTERN",
+]
+
+
+# BEGIN packaging/_structures.py
+
+
+class Infinity:
+    def __repr__(self):
+        return "Infinity"
+
+    def __hash__(self):
+        return hash(repr(self))
+
+    def __lt__(self, other):
+        return False
+
+    def __le__(self, other):
+        return False
+
+    def __eq__(self, other):
+        return isinstance(other, self.__class__)
+
+    def __ne__(self, other):
+        return not isinstance(other, self.__class__)
+
+    def __gt__(self, other):
+        return True
+
+    def __ge__(self, other):
+        return True
+
+    def __neg__(self):
+        return NegativeInfinity
+
+
+Infinity = Infinity()
+
+
+class NegativeInfinity:
+    def __repr__(self):
+        return "-Infinity"
+
+    def __hash__(self):
+        return hash(repr(self))
+
+    def __lt__(self, other):
+        return True
+
+    def __le__(self, other):
+        return True
+
+    def __eq__(self, other):
+        return isinstance(other, self.__class__)
+
+    def __ne__(self, other):
+        return not isinstance(other, self.__class__)
+
+    def __gt__(self, other):
+        return False
+
+    def __ge__(self, other):
+        return False
+
+    def __neg__(self):
+        return Infinity
+
+
+# BEGIN packaging/version.py
+
+
+NegativeInfinity = NegativeInfinity()
+
+_Version = collections.namedtuple(
+    "_Version",
+    ["epoch", "release", "dev", "pre", "post", "local"],
+)
+
+
+def parse(version):
+    """
+    Parse the given version string and return either a :class:`Version` object
+    or a :class:`LegacyVersion` object, depending on whether the given version
+    is a valid PEP 440 version or a legacy version.
+    """
+    try:
+        return Version(version)
+    except InvalidVersion:
+        return LegacyVersion(version)
+
+
+class InvalidVersion(ValueError):
+    """
+    An invalid version was found; users should refer to PEP 440.
+    """
+
+
+class _BaseVersion:
+
+    def __hash__(self):
+        return hash(self._key)
+
+    def __lt__(self, other):
+        return self._compare(other, lambda s, o: s < o)
+
+    def __le__(self, other):
+        return self._compare(other, lambda s, o: s <= o)
+
+    def __eq__(self, other):
+        return self._compare(other, lambda s, o: s == o)
+
+    def __ge__(self, other):
+        return self._compare(other, lambda s, o: s >= o)
+
+    def __gt__(self, other):
+        return self._compare(other, lambda s, o: s > o)
+
+    def __ne__(self, other):
+        return self._compare(other, lambda s, o: s != o)
+
+    def _compare(self, other, method):
+        if not isinstance(other, _BaseVersion):
+            return NotImplemented
+
+        return method(self._key, other._key)
+
+
+class LegacyVersion(_BaseVersion):
+
+    def __init__(self, version):
+        self._version = str(version)
+        self._key = _legacy_cmpkey(self._version)
+
+    def __str__(self):
+        return self._version
+
+    def __repr__(self):
+        return "<LegacyVersion({0})>".format(repr(str(self)))
+
+    @property
+    def public(self):
+        return self._version
+
+    @property
+    def base_version(self):
+        return self._version
+
+    @property
+    def local(self):
+        return None
+
+    @property
+    def is_prerelease(self):
+        return False
+
+    @property
+    def is_postrelease(self):
+        return False
+
+
+_legacy_version_component_re = re.compile(
+    r"(\d+ | [a-z]+ | \.| -)", re.VERBOSE,
+)
+
+_legacy_version_replacement_map = {
+    "pre": "c", "preview": "c", "-": "final-", "rc": "c", "dev": "@",
+}
+
+
+def _parse_version_parts(s):
+    for part in _legacy_version_component_re.split(s):
+        part = _legacy_version_replacement_map.get(part, part)
+
+        if not part or part == ".":
+            continue
+
+        if part[:1] in "0123456789":
+            # pad for numeric comparison
+            yield part.zfill(8)
+        else:
+            yield "*" + part
+
+    # ensure that alpha/beta/candidate are before final
+    yield "*final"
+
+
+def _legacy_cmpkey(version):
+    # We hardcode an epoch of -1 here. A PEP 440 version can only have an epoch
+    # greater than or equal to 0. This effectively sorts a LegacyVersion,
+    # which uses the de facto standard originally implemented by setuptools,
+    # before all PEP 440 versions.
+    epoch = -1
+
+    # This scheme is taken from pkg_resources.parse_version in setuptools,
+    # prior to its adoption of the packaging library.
+    parts = []
+    for part in _parse_version_parts(version.lower()):
+        if part.startswith("*"):
+            # remove "-" before a prerelease tag
+            if part < "*final":
+                while parts and parts[-1] == "*final-":
+                    parts.pop()
+
+            # remove trailing zeros from each series of numeric parts
+            while parts and parts[-1] == "00000000":
+                parts.pop()
+
+        parts.append(part)
+    parts = tuple(parts)
+
+    return epoch, parts
+
+
+# Deliberately not anchored to the start and end of the string, to make it
+# easier for 3rd party code to reuse
+VERSION_PATTERN = r"""
+    v?
+    (?:
+        (?:(?P<epoch>[0-9]+)!)?                           # epoch
+        (?P<release>[0-9]+(?:\.[0-9]+)*)                  # release segment
+        (?P<pre>                                          # pre-release
+            [-_\.]?
+            (?P<pre_l>(a|b|c|rc|alpha|beta|pre|preview))
+            [-_\.]?
+            (?P<pre_n>[0-9]+)?
+        )?
+        (?P<post>                                         # post release
+            (?:-(?P<post_n1>[0-9]+))
+            |
+            (?:
+                [-_\.]?
+                (?P<post_l>post|rev|r)
+                [-_\.]?
+                (?P<post_n2>[0-9]+)?
+            )
+        )?
+        (?P<dev>                                          # dev release
+            [-_\.]?
+            (?P<dev_l>dev)
+            [-_\.]?
+            (?P<dev_n>[0-9]+)?
+        )?
+    )
+    (?:\+(?P<local>[a-z0-9]+(?:[-_\.][a-z0-9]+)*))?       # local version
+"""
+
+
+class Version(_BaseVersion):
+
+    _regex = re.compile(
+        r"^\s*" + VERSION_PATTERN + r"\s*$",
+        re.VERBOSE | re.IGNORECASE,
+    )
+
+    def __init__(self, version):
+        # Validate the version and parse it into pieces
+        match = self._regex.search(version)
+        if not match:
+            raise InvalidVersion("Invalid version: '{0}'".format(version))
+
+        # Store the parsed out pieces of the version
+        self._version = _Version(
+            epoch=int(match.group("epoch")) if match.group("epoch") else 0,
+            release=tuple(int(i) for i in match.group("release").split(".")),
+            pre=_parse_letter_version(
+                match.group("pre_l"),
+                match.group("pre_n"),
+            ),
+            post=_parse_letter_version(
+                match.group("post_l"),
+                match.group("post_n1") or match.group("post_n2"),
+            ),
+            dev=_parse_letter_version(
+                match.group("dev_l"),
+                match.group("dev_n"),
+            ),
+            local=_parse_local_version(match.group("local")),
+        )
+
+        # Generate a key which will be used for sorting
+        self._key = _cmpkey(
+            self._version.epoch,
+            self._version.release,
+            self._version.pre,
+            self._version.post,
+            self._version.dev,
+            self._version.local,
+        )
+
+    def __repr__(self):
+        return "<Version({0})>".format(repr(str(self)))
+
+    def __str__(self):
+        parts = []
+
+        # Epoch
+        if self._version.epoch != 0:
+            parts.append("{0}!".format(self._version.epoch))
+
+        # Release segment
+        parts.append(".".join(str(x) for x in self._version.release))
+
+        # Pre-release
+        if self._version.pre is not None:
+            parts.append("".join(str(x) for x in self._version.pre))
+
+        # Post-release
+        if self._version.post is not None:
+            parts.append(".post{0}".format(self._version.post[1]))
+
+        # Development release
+        if self._version.dev is not None:
+            parts.append(".dev{0}".format(self._version.dev[1]))
+
+        # Local version segment
+        if self._version.local is not None:
+            parts.append(
+                "+{0}".format(".".join(str(x) for x in self._version.local))
+            )
+
+        return "".join(parts)
+
+    @property
+    def public(self):
+        return str(self).split("+", 1)[0]
+
+    @property
+    def base_version(self):
+        parts = []
+
+        # Epoch
+        if self._version.epoch != 0:
+            parts.append("{0}!".format(self._version.epoch))
+
+        # Release segment
+        parts.append(".".join(str(x) for x in self._version.release))
+
+        return "".join(parts)
+
+    @property
+    def local(self):
+        version_string = str(self)
+        if "+" in version_string:
+            return version_string.split("+", 1)[1]
+
+    @property
+    def is_prerelease(self):
+        return bool(self._version.dev or self._version.pre)
+
+    @property
+    def is_postrelease(self):
+        return bool(self._version.post)
+
+
+def _parse_letter_version(letter, number):
+    if letter:
+        # We assume there is an implicit 0 in a pre-release if there is
+        # no numeral associated with it.
+        if number is None:
+            number = 0
+
+        # We normalize any letters to their lower-case form
+        letter = letter.lower()
+
+        # We consider some words to be alternate spellings of other words and
+        # in those cases we want to normalize the spellings to our preferred
+        # spelling.
+        if letter == "alpha":
+            letter = "a"
+        elif letter == "beta":
+            letter = "b"
+        elif letter in ["c", "pre", "preview"]:
+            letter = "rc"
+        elif letter in ["rev", "r"]:
+            letter = "post"
+
+        return letter, int(number)
+    if not letter and number:
+        # We assume that if we are given a number but not given a letter,
+        # then this is using the implicit post release syntax (e.g., 1.0-1)
+        letter = "post"
+
+        return letter, int(number)
+
+
+_local_version_seperators = re.compile(r"[\._-]")
+
+
+def _parse_local_version(local):
+    """
+    Takes a string like abc.1.twelve and turns it into ("abc", 1, "twelve").
+    """
+    if local is not None:
+        return tuple(
+            part.lower() if not part.isdigit() else int(part)
+            for part in _local_version_seperators.split(local)
+        )
+
+
+def _cmpkey(epoch, release, pre, post, dev, local):
+    # When we compare a release version, we want to compare it with all of the
+    # trailing zeros removed. So we'll reverse the list, drop all the now
+    # leading zeros until we come to something non-zero, then take the rest,
+    # re-reverse it back into the correct order, and make it a tuple and use
+    # that for our sorting key.
+    release = tuple(
+        reversed(list(
+            itertools.dropwhile(
+                lambda x: x == 0,
+                reversed(release),
+            )
+        ))
+    )
+
+    # We need to "trick" the sorting algorithm to put 1.0.dev0 before 1.0a0.
+    # We'll do this by abusing the pre-segment, but we _only_ want to do this
+    # if there is no pre- or post-segment. If we have one of those, then
+    # the normal sorting rules will handle this case correctly.
+    if pre is None and post is None and dev is not None:
+        pre = -Infinity
+    # Versions without a pre-release (except as noted above) should sort after
+    # those with one.
+    elif pre is None:
+        pre = Infinity
+
+    # Versions without a post-segment should sort before those with one.
+    if post is None:
+        post = -Infinity
+
+    # Versions without a development segment should sort after those with one.
+    if dev is None:
+        dev = Infinity
+
+    if local is None:
+        # Versions without a local segment should sort before those with one.
+        local = -Infinity
+    else:
+        # Versions with a local segment need that segment parsed to implement
+        # the sorting rules in PEP440.
+        # - Alphanumeric segments sort before numeric segments
+        # - Alphanumeric segments sort lexicographically
+        # - Numeric segments sort numerically
+        # - Shorter versions sort before longer versions when the prefixes
+        #   match exactly
+        local = tuple(
+            (i, "") if isinstance(i, int) else (-Infinity, i)
+            for i in local
+        )
+
+    return epoch, release, pre, post, dev, local
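Note on the `_pep440.py` additions above: the local-version parsing and the trailing-zero handling in `_cmpkey` can be tried in isolation. The following is a minimal standalone sketch, not the module itself; the helper names `parse_local` and `strip_trailing_zeros` are hypothetical and only restate the logic of `_parse_local_version` and the first step of `_cmpkey`.

```python
import itertools
import re

# Same separator set the module uses for local versions: ".", "_" or "-".
_SEPARATORS = re.compile(r"[\._-]")


def parse_local(local):
    """Split a local version label into lower-cased strings and ints."""
    if local is None:
        return None
    return tuple(
        int(part) if part.isdigit() else part.lower()
        for part in _SEPARATORS.split(local)
    )


def strip_trailing_zeros(release):
    """Drop trailing zeros so that (1, 0, 0) and (1,) yield the same key."""
    return tuple(
        reversed(list(itertools.dropwhile(lambda x: x == 0, reversed(release))))
    )


print(parse_local("abc.1.twelve"))         # ('abc', 1, 'twelve')
print(strip_trailing_zeros((1, 0, 0)))     # (1,)
print(strip_trailing_zeros((1, 0, 2, 0)))  # (1, 0, 2)
```

Only trailing zeros are removed; a zero in the middle of the release tuple is significant and is kept.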
diff --git a/numpy/core/_add_newdocs.py b/numpy/core/_add_newdocs.py

index af9682c03f77590bf34dd36f701ffcf2d027ce6e..fb9c30d93080c11b46e942ceb7f6bbc15e27a779 100644 (file)
--- a/numpy/core/_add_newdocs.py
+++ b/numpy/core/_add_newdocs.py
@@ -384,7 +384,7 @@ add_newdoc('numpy.core', 'nditer',
      >>> luf(lambda i,j:i*i + j/2, a, b)
      array([  0.5,   1.5,   4.5,   9.5,  16.5])
  
-    If operand flags `"writeonly"` or `"readwrite"` are used the
+    If operand flags ``"writeonly"`` or ``"readwrite"`` are used the
      operands may be views into the original data with the
      `WRITEBACKIFCOPY` flag. In this case `nditer` must be used as a
      context manager or the `nditer.close` method must be called before
@@ -833,7 +833,7 @@ add_newdoc('numpy.core.multiarray', 'array',
          the returned array will be forced to be a base-class array (default).
      ndmin : int, optional
          Specifies the minimum number of dimensions that the resulting
-        array should have.  Ones will be pre-pended to the shape as
+        array should have.  Ones will be prepended to the shape as
          needed to meet this requirement.
      ${ARRAY_FUNCTION_LIKE}
  
@@ -1318,6 +1318,7 @@ add_newdoc('numpy.core.multiarray', 'fromstring',
              text, the binary mode of `fromstring` will first encode it into
              bytes using either utf-8 (python 3) or the default encoding
              (python 2), neither of which produce sane results.
+
      ${ARRAY_FUNCTION_LIKE}
  
          .. versionadded:: 1.20.0
@@ -1398,6 +1399,11 @@ add_newdoc('numpy.core.multiarray', 'fromiter',
          An iterable object providing data for the array.
      dtype : data-type
          The data-type of the returned array.
+
+        .. versionchanged:: 1.23
+            Object and subarray dtypes are now supported (note that the final
+            result is not 1-D for a subarray dtype).
+
      count : int, optional
          The number of items to read from *iterable*.  The default is -1,
          which means all data is read.
@@ -1421,6 +1427,18 @@ add_newdoc('numpy.core.multiarray', 'fromiter',
      >>> np.fromiter(iterable, float)
      array([  0.,   1.,   4.,   9.,  16.])
  
+    A carefully constructed subarray dtype will lead to higher dimensional
+    results:
+
+    >>> iterable = ((x+1, x+2) for x in range(5))
+    >>> np.fromiter(iterable, dtype=np.dtype((int, 2)))
+    array([[1, 2],
+           [2, 3],
+           [3, 4],
+           [4, 5],
+           [5, 6]])
+
+
      """.replace(
          "${ARRAY_FUNCTION_LIKE}",
          array_function_like_doc,
@@ -1577,17 +1595,38 @@ add_newdoc('numpy.core.multiarray', 'frombuffer',
          array_function_like_doc,
      ))
  
-add_newdoc('numpy.core.multiarray', '_from_dlpack',
+add_newdoc('numpy.core.multiarray', 'from_dlpack',
      """
-    _from_dlpack(x, /)
+    from_dlpack(x, /)
  
      Create a NumPy array from an object implementing the ``__dlpack__``
-    protocol.
+    protocol. Generally, the returned NumPy array is a read-only view
+    of the input object. See [1]_ and [2]_ for more details.
  
-    See Also
+    Parameters
+    ----------
+    x : object
+        A Python object that implements the ``__dlpack__`` and
+        ``__dlpack_device__`` methods.
+
+    Returns
+    -------
+    out : ndarray
+
+    References
+    ----------
+    .. [1] Array API documentation,
+       https://data-apis.org/array-api/latest/design_topics/data_interchange.html#syntax-for-data-interchange-with-dlpack
+
+    .. [2] Python specification for DLPack,
+       https://dmlc.github.io/dlpack/latest/python_spec.html
+
+    Examples
      --------
-    `Array API documentation
-    <https://data-apis.org/array-api/latest/design_topics/data_interchange.html#syntax-for-data-interchange-with-dlpack>`_
+    >>> import torch
+    >>> x = torch.arange(10)
+    >>> # create a view of the torch tensor "x" in NumPy
+    >>> y = np.from_dlpack(x)
      """)
  
  add_newdoc('numpy.core', 'fastCopyAndTranspose',
@@ -1602,13 +1641,25 @@ add_newdoc('numpy.core.multiarray', 'arange',
  
      Return evenly spaced values within a given interval.
  
-    Values are generated within the half-open interval ``[start, stop)``
-    (in other words, the interval including `start` but excluding `stop`).
-    For integer arguments the function is equivalent to the Python built-in
-    `range` function, but returns an ndarray rather than a list.
+    ``arange`` can be called with a varying number of positional arguments:
+
+    * ``arange(stop)``: Values are generated within the half-open interval
+      ``[0, stop)`` (in other words, the interval including `start` but
+      excluding `stop`).
+    * ``arange(start, stop)``: Values are generated within the half-open
+      interval ``[start, stop)``.
+    * ``arange(start, stop, step)``: Values are generated within the half-open
+      interval ``[start, stop)``, with spacing between values given by
+      ``step``.
+
+    For integer arguments the function is roughly equivalent to the Python
+    built-in :py:class:`range`, but returns an ndarray rather than a ``range``
+    instance.
  
      When using a non-integer step, such as 0.1, it is often better to use
-    `numpy.linspace`. See the warnings section below for more information.
+    `numpy.linspace`.
+
+    See the Warning sections below for more information.
  
      Parameters
      ----------
@@ -1624,7 +1675,7 @@ add_newdoc('numpy.core.multiarray', 'arange',
          between two adjacent values, ``out[i+1] - out[i]``.  The default
          step size is 1.  If `step` is specified as a position argument,
          `start` must also be given.
-    dtype : dtype
+    dtype : dtype, optional
          The type of the output array.  If `dtype` is not given, infer the data
          type from the other input arguments.
      ${ARRAY_FUNCTION_LIKE}
@@ -1660,6 +1711,20 @@ add_newdoc('numpy.core.multiarray', 'arange',
  
      In such cases, the use of `numpy.linspace` should be preferred.
  
+    The built-in :py:class:`range` generates :std:doc:`Python built-in integers
+    that have arbitrary size <c-api/long>`, while `numpy.arange` produces
+    `numpy.int32` or `numpy.int64` numbers. This may result in incorrect
+    results for large integer values::
+
+      >>> power = 40
+      >>> modulo = 10000
+      >>> x1 = [(n ** power) % modulo for n in range(8)]
+      >>> x2 = [(n ** power) % modulo for n in np.arange(8)]
+      >>> print(x1)
+      [0, 1, 7776, 8801, 6176, 625, 6576, 4001]  # correct
+      >>> print(x2)
+      [0, 1, 7776, 7185, 0, 5969, 4816, 3361]  # incorrect
+
      See Also
      --------
      numpy.linspace : Evenly spaced numbers with careful handling of endpoints.
@@ -1731,7 +1796,7 @@ add_newdoc('numpy.core.multiarray', 'set_numeric_ops',
  
      Notes
      -----
-    .. WARNING::
+    .. warning::
         Use with care!  Incorrect usage may lead to memory errors.
  
      A function replacing an operator cannot make use of that operator.
@@ -1761,7 +1826,8 @@ add_newdoc('numpy.core.multiarray', 'promote_types',
  
      Returns the data type with the smallest size and smallest scalar
      kind to which both ``type1`` and ``type2`` may be safely cast.
-    The returned data type is always in native byte order.
+    The returned data type is always considered "canonical"; this mainly
+    means that the promoted dtype will always be in native byte order.
  
      This function is symmetric, but rarely associative.
  
@@ -1779,6 +1845,8 @@ add_newdoc('numpy.core.multiarray', 'promote_types',
  
      Notes
      -----
+    Please see `numpy.result_type` for additional information about promotion.
+
      .. versionadded:: 1.6.0
  
      Starting in NumPy 1.9, promote_types function now returns a valid string
@@ -1787,6 +1855,12 @@ add_newdoc('numpy.core.multiarray', 'promote_types',
      dtype, even if it wasn't long enough to store the max integer/float value
      converted to a string.
  
+    .. versionchanged:: 1.23.0
+
+    NumPy now supports promotion for more structured dtypes.  It will now
+    remove unnecessary padding from a structure dtype and promote included
+    fields individually.
+
      See Also
      --------
      result_type, dtype, can_cast
@@ -2269,10 +2343,6 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('__array_interface__',
      """Array protocol: Python side."""))
  
  
-add_newdoc('numpy.core.multiarray', 'ndarray', ('__array_finalize__',
-    """None."""))
-
-
  add_newdoc('numpy.core.multiarray', 'ndarray', ('__array_priority__',
      """Array priority."""))
  
@@ -2282,12 +2352,12 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('__array_struct__',
  
  add_newdoc('numpy.core.multiarray', 'ndarray', ('__dlpack__',
      """a.__dlpack__(*, stream=None)
-    
+
      DLPack Protocol: Part of the Array API."""))
  
  add_newdoc('numpy.core.multiarray', 'ndarray', ('__dlpack_device__',
      """a.__dlpack_device__()
-    
+
      DLPack Protocol: Part of the Array API."""))
  
  add_newdoc('numpy.core.multiarray', 'ndarray', ('base',
@@ -2396,6 +2466,12 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('dtype',
      """
      Data-type of the array's elements.
  
+    .. warning::
+
+        Setting ``arr.dtype`` is discouraged and may be deprecated in the
+        future.  Setting will replace the ``dtype`` without modifying the
+        memory (see also `ndarray.view` and `ndarray.astype`).
+
      Parameters
      ----------
      None
@@ -2406,6 +2482,8 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('dtype',
  
      See Also
      --------
+    ndarray.astype : Cast the values contained in the array to a new data-type.
+    ndarray.view : Create a view of the same data but a different data-type.
      numpy.dtype
  
      Examples
@@ -2481,11 +2559,6 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('flags',
          This array is a copy of some other array. The C-API function
          PyArray_ResolveWritebackIfCopy must be called before deallocating
          to the base array will be updated with the contents of this array.
-    UPDATEIFCOPY (U)
-        (Deprecated, use WRITEBACKIFCOPY) This array is a copy of some other array.
-        When this array is
-        deallocated, the base array will be updated with the contents of
-        this array.
      FNC
          F_CONTIGUOUS and not C_CONTIGUOUS.
      FORC
@@ -2503,13 +2576,12 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('flags',
      or by using lowercased attribute names (as in ``a.flags.writeable``). Short flag
      names are only supported in dictionary access.
  
-    Only the WRITEBACKIFCOPY, UPDATEIFCOPY, WRITEABLE, and ALIGNED flags can be
+    Only the WRITEBACKIFCOPY, WRITEABLE, and ALIGNED flags can be
      changed by the user, via direct assignment to the attribute or dictionary
      entry, or by calling `ndarray.setflags`.
  
      The array flags cannot be set arbitrarily:
  
-    - UPDATEIFCOPY can only be set ``False``.
      - WRITEBACKIFCOPY can only be set ``False``.
      - ALIGNED can only be set ``True`` if the data is truly aligned.
      - WRITEABLE can only be set ``True`` if the array owns its own memory
@@ -2637,6 +2709,11 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('shape',
      the array and the remaining dimensions. Reshaping an array in-place will
      fail if a copy is required.
  
+    .. warning::
+
+        Setting ``arr.shape`` is discouraged and may be deprecated in the
+        future.  Using `ndarray.reshape` is the preferred approach.
+
      Examples
      --------
      >>> x = np.array([1, 2, 3, 4])
@@ -2662,8 +2739,9 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('shape',
  
      See Also
      --------
-    numpy.reshape : similar function
-    ndarray.reshape : similar method
+    numpy.shape : Equivalent getter function.
+    numpy.reshape : Function similar to setting ``shape``.
+    ndarray.reshape : Method similar to setting ``shape``.
  
      """))
  
@@ -2706,6 +2784,12 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('strides',
      A more detailed explanation of strides can be found in the
      "ndarray.rst" file in the NumPy reference guide.
  
+    .. warning::
+
+        Setting ``arr.strides`` is discouraged and may be deprecated in the
+        future.  `numpy.lib.stride_tricks.as_strided` should be preferred
+        to create a new view of the same data in a safer way.
+
      Notes
      -----
      Imagine an array of 32-bit integers (each 4 bytes)::
@@ -2801,6 +2885,14 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('__array__',
      """))
  
  
+add_newdoc('numpy.core.multiarray', 'ndarray', ('__array_finalize__',
+    """a.__array_finalize__(obj, /)
+
+    Present so subclasses can call super. Does nothing.
+
+    """))
+
+
  add_newdoc('numpy.core.multiarray', 'ndarray', ('__array_prepare__',
      """a.__array_prepare__(array[, context], /)
  
@@ -2929,7 +3021,7 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('any',
  
  add_newdoc('numpy.core.multiarray', 'ndarray', ('argmax',
      """
-    a.argmax(axis=None, out=None)
+    a.argmax(axis=None, out=None, *, keepdims=False)
  
      Return indices of the maximum values along the given axis.
  
@@ -2944,7 +3036,7 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('argmax',
  
  add_newdoc('numpy.core.multiarray', 'ndarray', ('argmin',
      """
-    a.argmin(axis=None, out=None)
+    a.argmin(axis=None, out=None, *, keepdims=False)
  
      Return indices of the minimum values along the given axis.
  
@@ -3910,13 +4002,13 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('setflags',
      """
      a.setflags(write=None, align=None, uic=None)
  
-    Set array flags WRITEABLE, ALIGNED, (WRITEBACKIFCOPY and UPDATEIFCOPY),
+    Set array flags WRITEABLE, ALIGNED, WRITEBACKIFCOPY,
      respectively.
  
      These Boolean-valued flags affect how numpy interprets the memory
      area used by `a` (see Notes below). The ALIGNED flag can only
      be set to True if the data is actually aligned according to the type.
-    The WRITEBACKIFCOPY and (deprecated) UPDATEIFCOPY flags can never be set
+    The WRITEBACKIFCOPY flag can never be set
      to True. The flag WRITEABLE can only be set to True if the array owns its
      own memory, or the ultimate owner of the memory exposes a writeable buffer
      interface, or is a string. (The exception for string is made so that
@@ -3936,15 +4028,13 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('setflags',
      Array flags provide information about how the memory area used
      for the array is to be interpreted. There are 7 Boolean flags
      in use, only four of which can be changed by the user:
-    WRITEBACKIFCOPY, UPDATEIFCOPY, WRITEABLE, and ALIGNED.
+    WRITEBACKIFCOPY, WRITEABLE, and ALIGNED.
  
      WRITEABLE (W) the data area can be written to;
  
      ALIGNED (A) the data and strides are aligned appropriately for the hardware
      (as determined by the compiler);
  
-    UPDATEIFCOPY (U) (deprecated), replaced by WRITEBACKIFCOPY;
-
      WRITEBACKIFCOPY (X) this array is a copy of some other array (referenced
      by .base). When the C-API function PyArray_ResolveWritebackIfCopy is
      called, the base array will be updated with the contents of this array.
@@ -3968,7 +4058,6 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('setflags',
        WRITEABLE : True
        ALIGNED : True
        WRITEBACKIFCOPY : False
-      UPDATEIFCOPY : False
      >>> y.setflags(write=0, align=0)
      >>> y.flags
        C_CONTIGUOUS : True
@@ -3977,7 +4066,6 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('setflags',
        WRITEABLE : False
        ALIGNED : False
        WRITEBACKIFCOPY : False
-      UPDATEIFCOPY : False
      >>> y.setflags(uic=1)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
@@ -4087,7 +4175,7 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('partition',
  
      See Also
      --------
-    numpy.partition : Return a parititioned copy of an array.
+    numpy.partition : Return a partitioned copy of an array.
      argpartition : Indirect partition.
      sort : Full sort.
  
@@ -4463,14 +4551,13 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('view',
      memory.
  
      For ``a.view(some_dtype)``, if ``some_dtype`` has a different number of
-    bytes per entry than the previous dtype (for example, converting a
-    regular array to a structured array), then the behavior of the view
-    cannot be predicted just from the superficial appearance of ``a`` (shown
-    by ``print(a)``). It also depends on exactly how ``a`` is stored in
-    memory. Therefore if ``a`` is C-ordered versus fortran-ordered, versus
-    defined as a slice or transpose, etc., the view may give different
-    results.
+    bytes per entry than the previous dtype (for example, converting a regular
+    array to a structured array), then the last axis of ``a`` must be
+    contiguous. This axis will be resized in the result.
  
+    .. versionchanged:: 1.23.0
+       Only the last axis needs to be contiguous. Previously, the entire array
+       had to be C-contiguous.
  
      Examples
      --------
@@ -4515,19 +4602,34 @@ add_newdoc('numpy.core.multiarray', 'ndarray', ('view',
      Views that change the dtype size (bytes per entry) should normally be
      avoided on arrays defined by slices, transposes, fortran-ordering, etc.:
  
-    >>> x = np.array([[1,2,3],[4,5,6]], dtype=np.int16)
-    >>> y = x[:, 0:2]
+    >>> x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)
+    >>> y = x[:, ::2]
      >>> y
-    array([[1, 2],
-           [4, 5]], dtype=int16)
+    array([[1, 3],
+           [4, 6]], dtype=int16)
      >>> y.view(dtype=[('width', np.int16), ('length', np.int16)])
      Traceback (most recent call last):
          ...
-    ValueError: To change to a dtype of a different size, the array must be C-contiguous
+    ValueError: To change to a dtype of a different size, the last axis must be contiguous
      >>> z = y.copy()
      >>> z.view(dtype=[('width', np.int16), ('length', np.int16)])
-    array([[(1, 2)],
-           [(4, 5)]], dtype=[('width', '<i2'), ('length', '<i2')])
+    array([[(1, 3)],
+           [(4, 6)]], dtype=[('width', '<i2'), ('length', '<i2')])
+
+    However, views that change dtype are totally fine for arrays with a
+    contiguous last axis, even if the rest of the axes are not C-contiguous:
+
+    >>> x = np.arange(2 * 3 * 4, dtype=np.int8).reshape(2, 3, 4)
+    >>> x.transpose(1, 0, 2).view(np.int16)
+    array([[[ 256,  770],
+            [3340, 3854]],
+    <BLANKLINE>
+           [[1284, 1798],
+            [4368, 4882]],
+    <BLANKLINE>
+           [[2312, 2826],
+            [5396, 5910]]], dtype=int16)
+
      """))
  
  
@@ -4773,6 +4875,15 @@ add_newdoc('numpy.core.multiarray', 'get_handler_version',
      its memory, in which case you can traverse ``a.base`` for a memory handler.
      """)
  
+add_newdoc('numpy.core.multiarray', '_get_madvise_hugepage',
+    """
+    _get_madvise_hugepage() -> bool
+
+    Get whether ``madvise (2)`` MADV_HUGEPAGE is used when
+    allocating the array data. Returns the currently set value.
+    See `global_state` for more information.
+    """)
+
  add_newdoc('numpy.core.multiarray', '_set_madvise_hugepage',
      """
      _set_madvise_hugepage(enabled: bool) -> bool
@@ -5229,7 +5340,7 @@ add_newdoc('numpy.core', 'ufunc', ('accumulate',
      dtype : data-type code, optional
          The data-type used to represent the intermediate results. Defaults
          to the data-type of the output array if such is provided, or the
-        the data-type of the input array if no output array is provided.
+        data-type of the input array if no output array is provided.
      out : ndarray, None, or tuple of ndarray and None, optional
          A location into which the result is stored. If not provided or None,
          a freshly-allocated array is returned. For consistency with
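Note on the `arange` warning added in the `_add_newdocs.py` diff above: the "incorrect" values in that docstring are consistent with signed 64-bit wraparound. The sketch below reproduces the first five entries in pure Python, without NumPy. It assumes the array elements are `int64`; the helper `wrap_int64` is hypothetical and only models two's-complement overflow, so results on a platform whose default integer is `int32` would differ.

```python
def wrap_int64(value):
    """Reduce an arbitrary Python int to the signed 64-bit range."""
    value &= (1 << 64) - 1   # keep only the low 64 bits
    if value >= 1 << 63:     # reinterpret the top bit as the sign bit
        value -= 1 << 64
    return value


power = 40
modulo = 10000

# Python ints have arbitrary size, so these are exact.
exact = [(n ** power) % modulo for n in range(5)]
# Modelling int64 overflow before taking the remainder.
wrapped = [wrap_int64(n ** power) % modulo for n in range(5)]

print(exact)    # [0, 1, 7776, 8801, 6176]
print(wrapped)  # [0, 1, 7776, 7185, 0]
```

The two lists diverge at `n = 3`, the first value for which `n ** 40` no longer fits in 63 bits.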
diff --git a/numpy/core/_asarray.py b/numpy/core/_asarray.py

index ecb4e7c39d0cebb33fc7677592d0f50c80cf9b97..89d422e99106a24704dae6143bbb466d0786a269 100644 (file)
--- a/numpy/core/_asarray.py
+++ b/numpy/core/_asarray.py
@@ -78,7 +78,6 @@ def require(a, dtype=None, requirements=None, *, like=None):
        WRITEABLE : True
        ALIGNED : True
        WRITEBACKIFCOPY : False
-      UPDATEIFCOPY : False
  
      >>> y = np.require(x, dtype=np.float32, requirements=['A', 'O', 'W', 'F'])
      >>> y.flags
@@ -88,7 +87,6 @@ def require(a, dtype=None, requirements=None, *, like=None):
        WRITEABLE : True
        ALIGNED : True
        WRITEBACKIFCOPY : False
-      UPDATEIFCOPY : False
  
      """
      if like is not None:
diff --git a/numpy/core/_asarray.pyi b/numpy/core/_asarray.pyi

index fee9b7b6e0e000c3635dd12a4bafbfcb89db0a72..473bc037cefc89da55d860769a1b4ad955aefa4b 100644 (file)
--- a/numpy/core/_asarray.pyi
+++ b/numpy/core/_asarray.pyi
@@ -1,7 +1,8 @@
-from typing import TypeVar, Union, Iterable, overload, Literal
+from collections.abc import Iterable
+from typing import TypeVar, Union, overload, Literal
  
  from numpy import ndarray
-from numpy.typing import ArrayLike, DTypeLike
+from numpy._typing import DTypeLike, _SupportsArrayFunc
  
  _ArrayType = TypeVar("_ArrayType", bound=ndarray)
  
@@ -19,23 +20,23 @@ _RequirementsWithE = Union[_Requirements, _E]
  def require(
      a: _ArrayType,
      dtype: None = ...,
-    requirements: Union[None, _Requirements, Iterable[_Requirements]] = ...,
+    requirements: None | _Requirements | Iterable[_Requirements] = ...,
      *,
-    like: ArrayLike = ...
+    like: _SupportsArrayFunc = ...
  ) -> _ArrayType: ...
  @overload
  def require(
      a: object,
      dtype: DTypeLike = ...,
-    requirements: Union[_E, Iterable[_RequirementsWithE]] = ...,
+    requirements: _E | Iterable[_RequirementsWithE] = ...,
      *,
-    like: ArrayLike = ...
+    like: _SupportsArrayFunc = ...
  ) -> ndarray: ...
  @overload
  def require(
      a: object,
      dtype: DTypeLike = ...,
-    requirements: Union[None, _Requirements, Iterable[_Requirements]] = ...,
+    requirements: None | _Requirements | Iterable[_Requirements] = ...,
      *,
-    like: ArrayLike = ...
+    like: _SupportsArrayFunc = ...
  ) -> ndarray: ...
diff --git a/numpy/core/_dtype.py b/numpy/core/_dtype.py

index c3a22b1c6bb0eee60916fb334996ca5ab4901e4f..3db80c17eebe13ad2409d978b5418e13ac61b73b 100644 (file)
--- a/numpy/core/_dtype.py
+++ b/numpy/core/_dtype.py
@@ -237,6 +237,11 @@ def _struct_dict_str(dtype, includealignedflag):
      return ret
  
  
+def _aligned_offset(offset, alignment):
+    # round up offset:
+    return - (-offset // alignment) * alignment
+
+
  def _is_packed(dtype):
      """
      Checks whether the structured data type in 'dtype'
@@ -249,12 +254,23 @@ def _is_packed(dtype):
  
      Duplicates the C `is_dtype_struct_simple_unaligned_layout` function.
      """
+    align = dtype.isalignedstruct
+    max_alignment = 1
      total_offset = 0
      for name in dtype.names:
          fld_dtype, fld_offset, title = _unpack_field(*dtype.fields[name])
+
+        if align:
+            total_offset = _aligned_offset(total_offset, fld_dtype.alignment)
+            max_alignment = max(max_alignment, fld_dtype.alignment)
+
          if fld_offset != total_offset:
              return False
          total_offset += fld_dtype.itemsize
+
+    if align:
+        total_offset = _aligned_offset(total_offset, max_alignment)
+
      if total_offset != dtype.itemsize:
          return False
      return True
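Note on the `_dtype.py` change above: `_aligned_offset` rounds an offset up to the next multiple of the alignment by negating, floor-dividing, and negating again. A minimal pure-Python sketch of that rounding and of the layout check it feeds follows. The function `aligned_layout` and the `(itemsize, alignment)` tuples are hypothetical stand-ins for real dtype fields, used only to illustrate the arithmetic.

```python
def aligned_offset(offset, alignment):
    # Ceiling division via floor division on the negated offset.
    return -(-offset // alignment) * alignment


def aligned_layout(fields):
    """Compute field offsets and total size for an aligned struct.

    `fields` is a list of (itemsize, alignment) pairs.
    """
    offsets = []
    total = 0
    max_alignment = 1
    for itemsize, alignment in fields:
        # Each field starts at the next multiple of its own alignment.
        total = aligned_offset(total, alignment)
        max_alignment = max(max_alignment, alignment)
        offsets.append(total)
        total += itemsize
    # The struct size is padded up to the largest field alignment.
    return offsets, aligned_offset(total, max_alignment)


print(aligned_offset(5, 4))                       # 8
print(aligned_layout([(1, 1), (4, 4), (2, 2)]))   # ([0, 4, 8], 12)
```

In the patched `_is_packed`, a structured dtype with `align=True` counts as packed only when its stored offsets and itemsize match this kind of computed layout.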
diff --git a/numpy/core/_internal.py b/numpy/core/_internal.py

index 8942955f60c63ec5689304c41f5878f8807847d4..9a1787dde0423c69180f620e99bab0f121b76715 100644 (file)
--- a/numpy/core/_internal.py
+++ b/numpy/core/_internal.py
@@ -10,7 +10,7 @@ import sys
  import platform
  import warnings
  
-from .multiarray import dtype, array, ndarray
+from .multiarray import dtype, array, ndarray, promote_types
  try:
      import ctypes
  except ImportError:
@@ -433,6 +433,61 @@ def _copy_fields(ary):
                    'formats': [dt.fields[name][0] for name in dt.names]}
      return array(ary, dtype=copy_dtype, copy=True)
  
+def _promote_fields(dt1, dt2):
+    """ Perform type promotion for two structured dtypes.
+
+    Parameters
+    ----------
+    dt1 : structured dtype
+        First dtype.
+    dt2 : structured dtype
+        Second dtype.
+
+    Returns
+    -------
+    out : dtype
+        The promoted dtype
+
+    Notes
+    -----
+    If one of the inputs is aligned, the result will be.  The titles of
+    both descriptors must match (point to the same field).
+    """
+    # Both must be structured and have the same names in the same order
+    if (dt1.names is None or dt2.names is None) or dt1.names != dt2.names:
+        raise TypeError("invalid type promotion")
+
+    # if both are identical, we can (maybe!) just return the same dtype.
+    identical = dt1 is dt2
+    new_fields = []
+    for name in dt1.names:
+        field1 = dt1.fields[name]
+        field2 = dt2.fields[name]
+        new_descr = promote_types(field1[0], field2[0])
+        identical = identical and new_descr is field1[0]
+
+        # Check that the titles match (if given):
+        if field1[2:] != field2[2:]:
+            raise TypeError("invalid type promotion")
+        if len(field1) == 2:
+            new_fields.append((name, new_descr))
+        else:
+            new_fields.append(((field1[2], name), new_descr))
+
+    res = dtype(new_fields, align=dt1.isalignedstruct or dt2.isalignedstruct)
+
+    # Might as well preserve identity (and metadata) if the dtype is identical
+    # and the itemsize, offsets are also unmodified.  This could probably be
+    # sped up, but also probably just be removed entirely.
+    if identical and res.itemsize == dt1.itemsize:
+        for name in dt1.names:
+            if dt1.fields[name][1] != res.fields[name][1]:
+                return res  # the dtype changed.
+        return dt1
+
+    return res
+
+
  def _getfield_is_safe(oldtype, newtype, offset):
      """ Checks safety of getfield for object arrays.
  
diff --git a/numpy/core/_internal.pyi b/numpy/core/_internal.pyi
index f4bfd770f0f2e03cf0f6e04c8d2fe6592a696e73..8a25ef2cba41d33557e1a09ef630a7bb0a5d0c4c 100644
--- a/numpy/core/_internal.pyi
+++ b/numpy/core/_internal.pyi
@@ -1,4 +1,4 @@
-from typing import Any, TypeVar, Type, overload, Optional, Generic
+from typing import Any, TypeVar, overload, Generic
  import ctypes as ct
  
  from numpy import ndarray
@@ -6,7 +6,7 @@ from numpy.ctypeslib import c_intp
  
  _CastT = TypeVar("_CastT", bound=ct._CanCastTo)  # Copied from `ctypes.cast`
  _CT = TypeVar("_CT", bound=ct._CData)
-_PT = TypeVar("_PT", bound=Optional[int])
+_PT = TypeVar("_PT", bound=None | int)
  
  # TODO: Let the likes of `shape_as` and `strides_as` return `None`
  # for 0D arrays once we've got shape-support
@@ -25,6 +25,6 @@ class _ctypes(Generic[_PT]):
      @property
      def _as_parameter_(self) -> ct.c_void_p: ...
  
-    def data_as(self, obj: Type[_CastT]) -> _CastT: ...
-    def shape_as(self, obj: Type[_CT]) -> ct.Array[_CT]: ...
-    def strides_as(self, obj: Type[_CT]) -> ct.Array[_CT]: ...
+    def data_as(self, obj: type[_CastT]) -> _CastT: ...
+    def shape_as(self, obj: type[_CT]) -> ct.Array[_CT]: ...
+    def strides_as(self, obj: type[_CT]) -> ct.Array[_CT]: ...
diff --git a/numpy/core/_methods.py b/numpy/core/_methods.py
index a239e2c87eb7658a542a1bfaec57496453c0aec1..eda00147d053090dd572fea83cb1bb320aeb4537 100644
--- a/numpy/core/_methods.py
+++ b/numpy/core/_methods.py
@@ -71,9 +71,10 @@ def _count_reduce_items(arr, axis, keepdims=False, where=True):
              axis = tuple(range(arr.ndim))
          elif not isinstance(axis, tuple):
              axis = (axis,)
-        items = nt.intp(1)
+        items = 1
          for ax in axis:
              items *= arr.shape[mu.normalize_axis_index(ax, arr.ndim)]
+        items = nt.intp(items)
      else:
          # TODO: Optimize case when `where` is broadcast along a non-reduction
          # axis and full sum is more excessive than needed.
diff --git a/numpy/core/_type_aliases.py b/numpy/core/_type_aliases.py
index 3765a0d34e18d7faee5dbb682c7fdc49aa7d021e..9b1dcb8cff479f0a089da6e2ec167810438bac5f 100644
--- a/numpy/core/_type_aliases.py
+++ b/numpy/core/_type_aliases.py
@@ -172,7 +172,7 @@ def _set_up_aliases():
          allTypes[alias] = allTypes[t]
          sctypeDict[alias] = sctypeDict[t]
      # Remove aliases overriding python types and modules
-    to_remove = ['ulong', 'object', 'int', 'float',
+    to_remove = ['object', 'int', 'float',
                   'complex', 'bool', 'string', 'datetime', 'timedelta',
                   'bytes', 'str']
  
@@ -182,6 +182,15 @@ def _set_up_aliases():
              del sctypeDict[t]
          except KeyError:
              pass
+
+    # Additional aliases in sctypeDict that should not be exposed as attributes
+    attrs_to_remove = ['ulong']
+
+    for t in attrs_to_remove:
+        try:
+            del allTypes[t]
+        except KeyError:
+            pass
  _set_up_aliases()
  
  
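The `_type_aliases.py` hunks above split alias cleanup into two passes: names that shadow Python builtins are dropped from both registries, while `ulong` is dropped only from `allTypes`. A minimal sketch of that effect follows. It uses toy dictionaries rather than NumPy's real `allTypes` and `sctypeDict`; the string values and the `drop` helper are hypothetical stand-ins, and the real first pass performs both deletions inside a single `try` block.

```python
# Toy stand-ins for the two registries; the real ones map names to scalar types.
all_types = {"ulong": "uint", "int": "int_", "float64": "float64"}
sctype_dict = dict(all_types)


def drop(mapping, names):
    """Delete each name from mapping, ignoring names that are absent."""
    for name in names:
        try:
            del mapping[name]
        except KeyError:
            pass


# Pass 1: names that shadow Python builtins leave both registries.
to_remove = ["object", "int", "float"]
drop(all_types, to_remove)
drop(sctype_dict, to_remove)

# Pass 2: 'ulong' is no longer exposed via all_types but stays a dict key.
attrs_to_remove = ["ulong"]
drop(all_types, attrs_to_remove)

print(sorted(all_types))    # ['float64']
print(sorted(sctype_dict))  # ['float64', 'ulong']
```

Keeping the second list separate makes it explicit which aliases remain valid dictionary keys even though they are hidden as attributes.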
diff --git a/numpy/core/_type_aliases.pyi b/numpy/core/_type_aliases.pyi
index c10d072f9f0b2cf4b94757bf776787d5084ec314..bbead0cb5b9315c43d13a7d9a46dc324646102c2 100644
--- a/numpy/core/_type_aliases.pyi
+++ b/numpy/core/_type_aliases.pyi
@@ -1,13 +1,13 @@
-from typing import Dict, Union, Type, List, TypedDict
+from typing import TypedDict
  
  from numpy import generic, signedinteger, unsignedinteger, floating, complexfloating
  
  class _SCTypes(TypedDict):
-    int: List[Type[signedinteger]]
-    uint: List[Type[unsignedinteger]]
-    float: List[Type[floating]]
-    complex: List[Type[complexfloating]]
-    others: List[type]
+    int: list[type[signedinteger]]
+    uint: list[type[unsignedinteger]]
+    float: list[type[floating]]
+    complex: list[type[complexfloating]]
+    others: list[type]
  
-sctypeDict: Dict[Union[int, str], Type[generic]]
+sctypeDict: dict[int | str, type[generic]]
  sctypes: _SCTypes
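The stub changes above replace `typing.List`, `typing.Dict` and `typing.Type` with the builtin generics and write unions with `|`. A small sketch, assuming Python 3.9 or newer at runtime, suggests that the old and new spellings describe the same containers. It keeps `Union[...]` inside the aliases because evaluating `int | str` at runtime needs Python 3.10, whereas stub files are never executed; the variable names are illustrative only.

```python
from typing import Dict, List, Optional, Type, Union, get_args, get_origin

# Old spellings, as in the removed lines.
old_list = List[Type[int]]
old_dict = Dict[Union[int, str], Type[object]]

# New spellings, as in the added lines (builtin generics, Python 3.9+).
new_list = list[type[int]]
new_dict = dict[Union[int, str], type[object]]

# Both spellings resolve to the same runtime container type.
same_list_origin = get_origin(old_list) is get_origin(new_list) is list
same_dict_origin = get_origin(old_dict) is get_origin(new_dict) is dict

# Optional[X] is a union with None, which the stubs now write as `None | X`.
optional_args = get_args(Optional[int])

print(same_list_origin, same_dict_origin)  # True True
print(optional_args == (int, type(None)))  # True
```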
diff --git a/numpy/core/_ufunc_config.py b/numpy/core/_ufunc_config.py
index b40e7445ec5b21693a9c41cd4cc123c10623658b..a731f6bf7cf05be0a8a40fdd91411118eb945e57 100644
--- a/numpy/core/_ufunc_config.py
+++ b/numpy/core/_ufunc_config.py
@@ -290,7 +290,7 @@ def seterrcall(func):
      >>> save_err = np.seterr(all='log')
  
      >>> np.array([1, 2, 3]) / 0.0
-    LOG: Warning: divide by zero encountered in true_divide
+    LOG: Warning: divide by zero encountered in divide
      array([inf, inf, inf])
  
      >>> np.seterrcall(saved_handler)
diff --git a/numpy/core/_ufunc_config.pyi b/numpy/core/_ufunc_config.pyi
index cd7129bcb140fb56bf98a873caf816206142d7b4..b7c2ebefcc9de05a22651f3e739a56e72cb6298c 100644
--- a/numpy/core/_ufunc_config.pyi
+++ b/numpy/core/_ufunc_config.pyi
@@ -1,4 +1,5 @@
-from typing import Optional, Union, Callable, Any, Literal, TypedDict
+from collections.abc import Callable
+from typing import Any, Literal, TypedDict
  
  from numpy import _SupportsWrite
  
@@ -12,25 +13,25 @@ class _ErrDict(TypedDict):
      invalid: _ErrKind
  
  class _ErrDictOptional(TypedDict, total=False):
-    all: Optional[_ErrKind]
-    divide: Optional[_ErrKind]
-    over: Optional[_ErrKind]
-    under: Optional[_ErrKind]
-    invalid: Optional[_ErrKind]
+    all: None | _ErrKind
+    divide: None | _ErrKind
+    over: None | _ErrKind
+    under: None | _ErrKind
+    invalid: None | _ErrKind
  
  def seterr(
-    all: Optional[_ErrKind] = ...,
-    divide: Optional[_ErrKind] = ...,
-    over: Optional[_ErrKind] = ...,
-    under: Optional[_ErrKind] = ...,
-    invalid: Optional[_ErrKind] = ...,
+    all: None | _ErrKind = ...,
+    divide: None | _ErrKind = ...,
+    over: None | _ErrKind = ...,
+    under: None | _ErrKind = ...,
+    invalid: None | _ErrKind = ...,
  ) -> _ErrDict: ...
  def geterr() -> _ErrDict: ...
  def setbufsize(size: int) -> int: ...
  def getbufsize() -> int: ...
  def seterrcall(
-    func: Union[None, _ErrFunc, _SupportsWrite[str]]
-) -> Union[None, _ErrFunc, _SupportsWrite[str]]: ...
-def geterrcall() -> Union[None, _ErrFunc, _SupportsWrite[str]]: ...
+    func: None | _ErrFunc | _SupportsWrite[str]
+) -> None | _ErrFunc | _SupportsWrite[str]: ...
+def geterrcall() -> None | _ErrFunc | _SupportsWrite[str]: ...
  
  # See `numpy/__init__.pyi` for the `errstate` class
diff --git a/numpy/core/arrayprint.pyi b/numpy/core/arrayprint.pyi
index 0d338206f604831ccbd171e11c93b7ef2901503c..d8255387a3a52635b30312d2800d02220b7ce6f2 100644
--- a/numpy/core/arrayprint.pyi
+++ b/numpy/core/arrayprint.pyi
@@ -1,5 +1,6 @@
  from types import TracebackType
-from typing import Any, Optional, Callable, Union, Type, Literal, TypedDict, SupportsIndex
+from collections.abc import Callable
+from typing import Any, Literal, TypedDict, SupportsIndex
  
  # Using a private class is by no means ideal, but it is simply a consequence
  # of a `contextlib.context` returning an instance of aforementioned class
@@ -20,7 +21,7 @@ from numpy import (
      longdouble,
      clongdouble,
  )
-from numpy.typing import ArrayLike, _CharLike_co, _FloatLike_co
+from numpy._typing import ArrayLike, _CharLike_co, _FloatLike_co
  
  _FloatMode = Literal["fixed", "unique", "maxprec", "maxprec_equal"]
  
@@ -50,92 +51,92 @@ class _FormatOptions(TypedDict):
      suppress: bool
      nanstr: str
      infstr: str
-    formatter: Optional[_FormatDict]
+    formatter: None | _FormatDict
      sign: Literal["-", "+", " "]
      floatmode: _FloatMode
      legacy: Literal[False, "1.13", "1.21"]
  
  def set_printoptions(
-    precision: Optional[SupportsIndex] = ...,
-    threshold: Optional[int] = ...,
-    edgeitems: Optional[int] = ...,
-    linewidth: Optional[int] = ...,
-    suppress: Optional[bool] = ...,
-    nanstr: Optional[str] = ...,
-    infstr: Optional[str] = ...,
-    formatter: Optional[_FormatDict] = ...,
-    sign: Optional[Literal["-", "+", " "]] = ...,
-    floatmode: Optional[_FloatMode] = ...,
+    precision: None | SupportsIndex = ...,
+    threshold: None | int = ...,
+    edgeitems: None | int = ...,
+    linewidth: None | int = ...,
+    suppress: None | bool = ...,
+    nanstr: None | str = ...,
+    infstr: None | str = ...,
+    formatter: None | _FormatDict = ...,
+    sign: Literal[None, "-", "+", " "] = ...,
+    floatmode: None | _FloatMode = ...,
      *,
-    legacy: Optional[Literal[False, "1.13", "1.21"]] = ...
+    legacy: Literal[None, False, "1.13", "1.21"] = ...
  ) -> None: ...
  def get_printoptions() -> _FormatOptions: ...
  def array2string(
      a: ndarray[Any, Any],
-    max_line_width: Optional[int] = ...,
-    precision: Optional[SupportsIndex] = ...,
-    suppress_small: Optional[bool] = ...,
+    max_line_width: None | int = ...,
+    precision: None | SupportsIndex = ...,
+    suppress_small: None | bool = ...,
      separator: str = ...,
      prefix: str = ...,
      # NOTE: With the `style` argument being deprecated,
      # all arguments between `formatter` and `suffix` are de facto
      # keyword-only arguments
      *,
-    formatter: Optional[_FormatDict] = ...,
-    threshold: Optional[int] = ...,
-    edgeitems: Optional[int] = ...,
-    sign: Optional[Literal["-", "+", " "]] = ...,
-    floatmode: Optional[_FloatMode] = ...,
+    formatter: None | _FormatDict = ...,
+    threshold: None | int = ...,
+    edgeitems: None | int = ...,
+    sign: Literal[None, "-", "+", " "] = ...,
+    floatmode: None | _FloatMode = ...,
      suffix: str = ...,
-    legacy: Optional[Literal[False, "1.13", "1.21"]] = ...,
+    legacy: Literal[None, False, "1.13", "1.21"] = ...,
  ) -> str: ...
  def format_float_scientific(
      x: _FloatLike_co,
-    precision: Optional[int] = ...,
+    precision: None | int = ...,
      unique: bool = ...,
      trim: Literal["k", ".", "0", "-"] = ...,
      sign: bool = ...,
-    pad_left: Optional[int] = ...,
-    exp_digits: Optional[int] = ...,
-    min_digits: Optional[int] = ...,
+    pad_left: None | int = ...,
+    exp_digits: None | int = ...,
+    min_digits: None | int = ...,
  ) -> str: ...
  def format_float_positional(
      x: _FloatLike_co,
-    precision: Optional[int] = ...,
+    precision: None | int = ...,
      unique: bool = ...,
      fractional: bool = ...,
      trim: Literal["k", ".", "0", "-"] = ...,
      sign: bool = ...,
-    pad_left: Optional[int] = ...,
-    pad_right: Optional[int] = ...,
-    min_digits: Optional[int] = ...,
+    pad_left: None | int = ...,
+    pad_right: None | int = ...,
+    min_digits: None | int = ...,
  ) -> str: ...
  def array_repr(
      arr: ndarray[Any, Any],
-    max_line_width: Optional[int] = ...,
-    precision: Optional[SupportsIndex] = ...,
-    suppress_small: Optional[bool] = ...,
+    max_line_width: None | int = ...,
+    precision: None | SupportsIndex = ...,
+    suppress_small: None | bool = ...,
  ) -> str: ...
  def array_str(
      a: ndarray[Any, Any],
-    max_line_width: Optional[int] = ...,
-    precision: Optional[SupportsIndex] = ...,
-    suppress_small: Optional[bool] = ...,
+    max_line_width: None | int = ...,
+    precision: None | SupportsIndex = ...,
+    suppress_small: None | bool = ...,
  ) -> str: ...
  def set_string_function(
-    f: Optional[Callable[[ndarray[Any, Any]], str]], repr: bool = ...
+    f: None | Callable[[ndarray[Any, Any]], str], repr: bool = ...
  ) -> None: ...
  def printoptions(
-    precision: Optional[SupportsIndex] = ...,
-    threshold: Optional[int] = ...,
-    edgeitems: Optional[int] = ...,
-    linewidth: Optional[int] = ...,
-    suppress: Optional[bool] = ...,
-    nanstr: Optional[str] = ...,
-    infstr: Optional[str] = ...,
-    formatter: Optional[_FormatDict] = ...,
-    sign: Optional[Literal["-", "+", " "]] = ...,
-    floatmode: Optional[_FloatMode] = ...,
+    precision: None | SupportsIndex = ...,
+    threshold: None | int = ...,
+    edgeitems: None | int = ...,
+    linewidth: None | int = ...,
+    suppress: None | bool = ...,
+    nanstr: None | str = ...,
+    infstr: None | str = ...,
+    formatter: None | _FormatDict = ...,
+    sign: Literal[None, "-", "+", " "] = ...,
+    floatmode: None | _FloatMode = ...,
      *,
-    legacy: Optional[Literal[False, "1.13", "1.21"]] = ...
+    legacy: Literal[None, False, "1.13", "1.21"] = ...
  ) -> _GeneratorContextManager[_FormatOptions]: ...
diff --git a/numpy/core/code_generators/cversions.txt b/numpy/core/code_generators/cversions.txt
index e7b3ef697edc5996dac79a13803838ed3a8120d3..e6dc00dbb3dc1f41be97151092bb81e59b9cf2d8 100644
--- a/numpy/core/code_generators/cversions.txt
+++ b/numpy/core/code_generators/cversions.txt
@@ -60,5 +60,10 @@
  # Version 14 (NumPy 1.21) No change.
  0x0000000e = 17a0f366e55ec05e5c5c149123478452
  
-# Version 15 (NumPy 1.22) Configurable memory allocations
+# Version 15 (NumPy 1.22)
+# Configurable memory allocations
  0x0000000f = b8783365b873681cd204be50cdfb448d
+
+# Version 16 (NumPy 1.23)
+# NonNull attributes removed from numpy_api.py
+0x00000010 = 04a7bf1e65350926a0e528798da263c0
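Each entry in `cversions.txt` pairs a hexadecimal C API version with a 32-digit hex value, which is taken here to be a checksum of the API definition. The sketch below shows one way such lines could be read. `parse_cversions` is a hypothetical helper written for illustration and is not NumPy's own parser.

```python
def parse_cversions(text):
    """Map each C API version number to the hex value recorded for it."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        key, _, checksum = line.partition("=")
        # int(..., 16) accepts the leading "0x" prefix.
        versions[int(key.strip(), 16)] = checksum.strip()
    return versions


sample = """\
# Version 15 (NumPy 1.22)
# Configurable memory allocations
0x0000000f = b8783365b873681cd204be50cdfb448d

# Version 16 (NumPy 1.23)
# NonNull attributes removed from numpy_api.py
0x00000010 = 04a7bf1e65350926a0e528798da263c0
"""

versions = parse_cversions(sample)
print(max(versions))  # 16
```

Moving the description onto its own comment line, as the hunk does, keeps the version header in a uniform shape without affecting a parser like this one.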
diff --git a/numpy/core/code_generators/genapi.py b/numpy/core/code_generators/genapi.py
index b401ee6a581e905b9321460a2ab5f23fed2b73ba..68ae30d5bef8430a070577afa6add2c9424a6117 100644
--- a/numpy/core/code_generators/genapi.py
+++ b/numpy/core/code_generators/genapi.py
@@ -92,17 +92,6 @@ class StealRef:
              return 'NPY_STEALS_REF_TO_ARG(%d)' % self.arg
  
  
-class NonNull:
-    def __init__(self, arg):
-        self.arg = arg # counting from 1
-
-    def __str__(self):
-        try:
-            return ' '.join('NPY_GCC_NONNULL(%d)' % x for x in self.arg)
-        except TypeError:
-            return 'NPY_GCC_NONNULL(%d)' % self.arg
-
-
  class Function:
      def __init__(self, name, return_type, args, doc=''):
          self.name = name
diff --git a/numpy/core/code_generators/generate_umath.py b/numpy/core/code_generators/generate_umath.py
index 292d9e0d37e2690ac7acfc2e6f694786f2fe357a..266fccefb2c619a0be22560c8647010e65995d34 100644
--- a/numpy/core/code_generators/generate_umath.py
+++ b/numpy/core/code_generators/generate_umath.py
@@ -4,10 +4,6 @@ import struct
  import sys
  import textwrap
  
-sys.path.insert(0, os.path.dirname(__file__))
-import ufunc_docstrings as docstrings
-sys.path.pop(0)
-
  Zero = "PyLong_FromLong(0)"
  One = "PyLong_FromLong(1)"
  True_ = "(Py_INCREF(Py_True), Py_True)"
@@ -17,6 +13,16 @@ AllOnes = "PyLong_FromLong(-1)"
  MinusInfinity = 'PyFloat_FromDouble(-NPY_INFINITY)'
  ReorderableNone = "(Py_INCREF(Py_None), Py_None)"
  
+class docstrings:
+    @staticmethod
+    def get(place):
+        """
+        Return the name of the C ``#define`` that holds the docstring
+        for the given ufunc place. The definitions are generated by
+        generate_umath_doc.py in a separate C header.
+        """
+        return 'DOC_' + place.upper().replace('.', '_')
+
  # Sentinel value to specify using the full type description in the
  # function name
  class FullTypeDescr:
@@ -48,10 +54,10 @@ class TypeDescription:
      cfunc_alias : str or none, optional
          Appended to inner loop C function name, e.g., FLOAT_{cfunc_alias}. See make_arrays.
          NOTE: it doesn't support 'astype'
-    simd: list
+    simd : list
          Available SIMD ufunc loops, dispatched at runtime in specified order
          Currently only supported for simple types (see make_arrays)
-    dispatch: str or None, optional
+    dispatch : str or None, optional
          Dispatch-able source name without its extension '.dispatch.c' that
          contains the definition of ufunc, dispatched at runtime depending on the
          specified targets of the dispatch-able source.
@@ -322,7 +328,7 @@ defdict = {
            ],
            TD(O, f='PyNumber_Multiply'),
            ),
-#'divide' : aliased to true_divide in umathmodule.c:initumath
+#'true_divide' : aliased to divide in umathmodule.c:initumath
  'floor_divide':
      Ufunc(2, 1, None, # One is only a unit to the right, not the left
            docstrings.get('numpy.core.umath.floor_divide'),
@@ -336,9 +342,9 @@ defdict = {
            ],
            TD(O, f='PyNumber_FloorDivide'),
            ),
-'true_divide':
+'divide':
      Ufunc(2, 1, None, # One is only a unit to the right, not the left
-          docstrings.get('numpy.core.umath.true_divide'),
+          docstrings.get('numpy.core.umath.divide'),
            'PyUFunc_TrueDivisionTypeResolver',
            TD(flts+cmplx, cfunc_alias='divide', dispatch=[('loops_arithm_fp', 'fd')]),
            [TypeDescription('m', FullTypeDescr, 'mq', 'm', cfunc_alias='divide'),
@@ -358,7 +364,7 @@ defdict = {
      Ufunc(2, 1, None,
            docstrings.get('numpy.core.umath.fmod'),
            None,
-          TD(ints),
+          TD(ints, dispatch=[('loops_modulo', ints)]),
            TD(flts, f='fmod', astype={'e': 'f'}),
            TD(P, f='fmod'),
            ),
@@ -516,14 +522,14 @@ defdict = {
      Ufunc(2, 1, ReorderableNone,
            docstrings.get('numpy.core.umath.maximum'),
            'PyUFunc_SimpleUniformOperationTypeResolver',
-          TD(noobj, simd=[('avx512f', 'fd')]),
+          TD(noobj, dispatch=[('loops_minmax', ints+'fdg')]),
            TD(O, f='npy_ObjectMax')
            ),
  'minimum':
      Ufunc(2, 1, ReorderableNone,
            docstrings.get('numpy.core.umath.minimum'),
            'PyUFunc_SimpleUniformOperationTypeResolver',
-          TD(noobj, simd=[('avx512f', 'fd')]),
+          TD(noobj, dispatch=[('loops_minmax', ints+'fdg')]),
            TD(O, f='npy_ObjectMin')
            ),
  'clip':
@@ -537,6 +543,7 @@ defdict = {
      Ufunc(2, 1, ReorderableNone,
            docstrings.get('numpy.core.umath.fmax'),
            'PyUFunc_SimpleUniformOperationTypeResolver',
+          TD('fdg', dispatch=[('loops_minmax', 'fdg')]),
            TD(noobj),
            TD(O, f='npy_ObjectMax')
            ),
@@ -544,6 +551,7 @@ defdict = {
      Ufunc(2, 1, ReorderableNone,
            docstrings.get('numpy.core.umath.fmin'),
            'PyUFunc_SimpleUniformOperationTypeResolver',
+          TD('fdg', dispatch=[('loops_minmax', 'fdg')]),
            TD(noobj),
            TD(O, f='npy_ObjectMin')
            ),
@@ -737,7 +745,7 @@ defdict = {
            docstrings.get('numpy.core.umath.tanh'),
            None,
            TD('e', f='tanh', astype={'e': 'f'}),
-          TD('fd', dispatch=[('loops_umath_fp', 'fd')]),
+          TD('fd', dispatch=[('loops_hyperbolic', 'fd')]),
            TD(inexact, f='tanh', astype={'e': 'f'}),
            TD(P, f='tanh'),
            ),
@@ -836,7 +844,7 @@ defdict = {
            docstrings.get('numpy.core.umath.trunc'),
            None,
            TD('e', f='trunc', astype={'e': 'f'}),
-          TD(inexactvec, simd=[('fma', 'fd'), ('avx512f', 'fd')]),
+          TD(inexactvec, dispatch=[('loops_unary_fp', 'fd')]),
            TD('fdg', f='trunc'),
            TD(O, f='npy_ObjectTrunc'),
            ),
@@ -852,7 +860,7 @@ defdict = {
            docstrings.get('numpy.core.umath.floor'),
            None,
            TD('e', f='floor', astype={'e': 'f'}),
-          TD(inexactvec, simd=[('fma', 'fd'), ('avx512f', 'fd')]),
+          TD(inexactvec, dispatch=[('loops_unary_fp', 'fd')]),
            TD('fdg', f='floor'),
            TD(O, f='npy_ObjectFloor'),
            ),
@@ -861,7 +869,7 @@ defdict = {
            docstrings.get('numpy.core.umath.rint'),
            None,
            TD('e', f='rint', astype={'e': 'f'}),
-          TD(inexactvec, simd=[('fma', 'fd'), ('avx512f', 'fd')]),
+          TD(inexactvec, dispatch=[('loops_unary_fp', 'fd')]),
            TD('fdg' + cmplx, f='rint'),
            TD(P, f='rint'),
            ),
@@ -876,7 +884,8 @@ defdict = {
      Ufunc(2, 1, None,
            docstrings.get('numpy.core.umath.remainder'),
            'PyUFunc_RemainderTypeResolver',
-          TD(intflt),
+          TD(ints, dispatch=[('loops_modulo', ints)]),
+          TD(flts),
            [TypeDescription('m', FullTypeDescr, 'mm', 'm')],
            TD(O, f='PyNumber_Remainder'),
            ),
@@ -884,7 +893,8 @@ defdict = {
      Ufunc(2, 2, None,
            docstrings.get('numpy.core.umath.divmod'),
            'PyUFunc_DivmodTypeResolver',
-          TD(intflt),
+          TD(ints, dispatch=[('loops_modulo', ints)]),
+          TD(flts),
            [TypeDescription('m', FullTypeDescr, 'mm', 'qm')],
            # TD(O, f='PyNumber_Divmod'),  # gh-9730
            ),
@@ -1151,14 +1161,6 @@ def make_ufuncs(funcdict):
      for name in names:
          uf = funcdict[name]
          mlist = []
-        docstring = textwrap.dedent(uf.docstring).strip()
-        docstring = docstring.encode('unicode-escape').decode('ascii')
-        docstring = docstring.replace(r'"', r'\"')
-        docstring = docstring.replace(r"'", r"\'")
-        # Split the docstring because some compilers (like MS) do not like big
-        # string literal in C code. We split at endlines because textwrap.wrap
-        # do not play well with \n
-        docstring = '\\n\"\"'.join(docstring.split(r"\n"))
          if uf.signature is None:
              sig = "NULL"
          else:
@@ -1171,7 +1173,7 @@ def make_ufuncs(funcdict):
              f = PyUFunc_FromFuncAndDataAndSignatureAndIdentity(
                  {name}_functions, {name}_data, {name}_signatures, {nloops},
                  {nin}, {nout}, {identity}, "{name}",
-                "{doc}", 0, {sig}, identity
+                {doc}, 0, {sig}, identity
              );
              if ({has_identity}) {{
                  Py_DECREF(identity);
@@ -1186,7 +1188,7 @@ def make_ufuncs(funcdict):
              has_identity='0' if uf.identity is None_ else '1',
              identity='PyUFunc_IdentityValue',
              identity_expr=uf.identity,
-            doc=docstring,
+            doc=uf.docstring,
              sig=sig,
          )
  
@@ -1222,6 +1224,7 @@ def make_code(funcdict, filename):
      #include "loops.h"
      #include "matmul.h"
      #include "clip.h"
+    #include "_umath_doc_generated.h"
      %s
  
      static int
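In the `generate_umath.py` hunks above, `docstrings.get` no longer returns docstring text; it returns the name of a C macro, and the template emits `{doc}` without quotes so the preprocessor can expand it. The sketch below reproduces that mapping in isolation. The class body repeats the expression from the diff, while the `c_argument` formatting line is a simplified, hypothetical stand-in for the real template.

```python
class docstrings:
    @staticmethod
    def get(place):
        # Same expression as in the hunk above.
        return 'DOC_' + place.upper().replace('.', '_')


macro = docstrings.get('numpy.core.umath.floor_divide')
print(macro)  # DOC_NUMPY_CORE_UMATH_FLOOR_DIVIDE

# Emitted unquoted, the macro name is left for the C preprocessor to expand.
c_argument = '{doc}, 0, {sig}'.format(doc=macro, sig='NULL')
print(c_argument)  # DOC_NUMPY_CORE_UMATH_FLOOR_DIVIDE, 0, NULL
```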
diff --git a/numpy/core/code_generators/generate_umath_doc.py b/numpy/core/code_generators/generate_umath_doc.py
new file mode 100644
index 0000000..9888730
--- /dev/null
+++ b/numpy/core/code_generators/generate_umath_doc.py
@@ -0,0 +1,30 @@
+import sys
+import os
+import textwrap
+
+sys.path.insert(0, os.path.dirname(__file__))
+import ufunc_docstrings as docstrings
+sys.path.pop(0)
+
+def normalize_doc(docstring):
+    docstring = textwrap.dedent(docstring).strip()
+    docstring = docstring.encode('unicode-escape').decode('ascii')
+    docstring = docstring.replace(r'"', r'\"')
+    docstring = docstring.replace(r"'", r"\'")
+    # Split the docstring because some compilers (like MS) do not like big
+    # string literals in C code. We split at line endings because
+    # textwrap.wrap does not play well with \n
+    docstring = '\\n\"\"'.join(docstring.split(r"\n"))
+    return docstring
+
+def write_code(target):
+    with open(target, 'w') as fid:
+        fid.write(
+            "#ifndef NUMPY_CORE_INCLUDE__UMATH_DOC_GENERATED_H_\n"
+            "#define NUMPY_CORE_INCLUDE__UMATH_DOC_GENERATED_H_\n"
+        )
+        for place, string in docstrings.docdict.items():
+            cdef_name = f"DOC_{place.upper().replace('.', '_')}"
+            cdef_str = normalize_doc(string)
+            fid.write(f"#define {cdef_name} \"{cdef_str}\"\n")
+        fid.write("#endif //NUMPY_CORE_INCLUDE__UMATH_DOC_GENERATED_H_\n")
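To see what `normalize_doc` in the new file produces, the sketch below runs the same steps on a small made-up docstring and builds one `#define` line the way `write_code` does. The function body is copied from the diff with comments omitted; the sample text and the `DOC_EXAMPLE` macro name are hypothetical.

```python
import textwrap


def normalize_doc(docstring):
    # Copied from the new file above, comments omitted.
    docstring = textwrap.dedent(docstring).strip()
    docstring = docstring.encode('unicode-escape').decode('ascii')
    docstring = docstring.replace(r'"', r'\"')
    docstring = docstring.replace(r"'", r"\'")
    docstring = '\\n\"\"'.join(docstring.split(r"\n"))
    return docstring


sample = """
    Add arguments.

    Say "hi".
    """

body = normalize_doc(sample)
line = f'#define DOC_EXAMPLE "{body}"'
print(line)
# #define DOC_EXAMPLE "Add arguments.\n""\n""Say \"hi\"."
```

Each source newline becomes an escaped `\n` followed by `""`, so the C compiler sees a run of short adjacent string literals that it concatenates, rather than one very long literal.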
diff --git a/numpy/core/code_generators/numpy_api.py b/numpy/core/code_generators/numpy_api.py
index d12d62d8fe9aec0408f804a97687abef81ba60e8..fa28f92b7ca608e3b1f876c5babfab8cc50d9c84 100644
--- a/numpy/core/code_generators/numpy_api.py
+++ b/numpy/core/code_generators/numpy_api.py
@@ -13,7 +13,7 @@ When adding a function, make sure to use the next integer not used as an index
  exception, so it should hopefully not get unnoticed).
  
  """
-from code_generators.genapi import StealRef, NonNull
+from code_generators.genapi import StealRef
  
  # index, type
  multiarray_global_vars = {
@@ -92,7 +92,7 @@ multiarray_funcs_api = {
      'PyArray_TypeObjectFromType':           (46,),
      'PyArray_Zero':                         (47,),
      'PyArray_One':                          (48,),
-    'PyArray_CastToType':                   (49, StealRef(2), NonNull(2)),
+    'PyArray_CastToType':                   (49, StealRef(2)),
      'PyArray_CastTo':                       (50,),
      'PyArray_CastAnyTo':                    (51,),
      'PyArray_CanCastSafely':                (52,),
@@ -120,15 +120,15 @@ multiarray_funcs_api = {
      'PyArray_FromBuffer':                   (74,),
      'PyArray_FromIter':                     (75, StealRef(2)),
      'PyArray_Return':                       (76, StealRef(1)),
-    'PyArray_GetField':                     (77, StealRef(2), NonNull(2)),
-    'PyArray_SetField':                     (78, StealRef(2), NonNull(2)),
+    'PyArray_GetField':                     (77, StealRef(2)),
+    'PyArray_SetField':                     (78, StealRef(2)),
      'PyArray_Byteswap':                     (79,),
      'PyArray_Resize':                       (80,),
      'PyArray_MoveInto':                     (81,),
      'PyArray_CopyInto':                     (82,),
      'PyArray_CopyAnyInto':                  (83,),
      'PyArray_CopyObject':                   (84,),
-    'PyArray_NewCopy':                      (85, NonNull(1)),
+    'PyArray_NewCopy':                      (85,),
      'PyArray_ToList':                       (86,),
      'PyArray_ToString':                     (87,),
      'PyArray_ToFile':                       (88,),
@@ -136,8 +136,8 @@ multiarray_funcs_api = {
      'PyArray_Dumps':                        (90,),
      'PyArray_ValidType':                    (91,),
      'PyArray_UpdateFlags':                  (92,),
-    'PyArray_New':                          (93, NonNull(1)),
-    'PyArray_NewFromDescr':                 (94, StealRef(2), NonNull([1, 2])),
+    'PyArray_New':                          (93,),
+    'PyArray_NewFromDescr':                 (94, StealRef(2)),
      'PyArray_DescrNew':                     (95,),
      'PyArray_DescrNewFromType':             (96,),
      'PyArray_GetPriority':                  (97,),
@@ -318,7 +318,7 @@ multiarray_funcs_api = {
      'PyArray_CanCastArrayTo':               (274,),
      'PyArray_CanCastTypeTo':                (275,),
      'PyArray_EinsteinSum':                  (276,),
-    'PyArray_NewLikeArray':                 (277, StealRef(3), NonNull(1)),
+    'PyArray_NewLikeArray':                 (277, StealRef(3)),
      'PyArray_GetArrayParamsFromObject':     (278,),
      'PyArray_ConvertClipmodeSequence':      (279,),
      'PyArray_MatrixProduct2':               (280,),
@@ -344,7 +344,7 @@ multiarray_funcs_api = {
      'PyDataMem_NEW_ZEROED':                 (299,),
      # End 1.8 API
      # End 1.9 API
-    'PyArray_CheckAnyScalarExact':          (300, NonNull(1)),
+    'PyArray_CheckAnyScalarExact':          (300,),
      # End 1.10 API
      'PyArray_MapIterArrayCopyIfOverlap':    (301,),
      # End 1.13 API
@@ -353,7 +353,7 @@ multiarray_funcs_api = {
      # End 1.14 API
      'PyDataMem_SetHandler':                 (304,),
      'PyDataMem_GetHandler':                 (305,),
-    # End 1.21 API
+    # End 1.22 API
  }
  
  ufunc_types_api = {
diff --git a/numpy/core/code_generators/ufunc_docstrings.py b/numpy/core/code_generators/ufunc_docstrings.py
index c9be945693dca5846d2ac40db930142b312e991a..24b707a1216c1bd4ff5a9a7b2d7167dbb884bf39 100644
--- a/numpy/core/code_generators/ufunc_docstrings.py
+++ b/numpy/core/code_generators/ufunc_docstrings.py
@@ -4,18 +4,15 @@ Docstrings for generated ufuncs
  The syntax is designed to look like the function add_newdoc is being
  called from numpy.lib, but in this file  add_newdoc puts the docstrings
  in a dictionary. This dictionary is used in
-numpy/core/code_generators/generate_umath.py to generate the docstrings
-for the ufuncs in numpy.core at the C level when the ufuncs are created
-at compile time.
+numpy/core/code_generators/generate_umath_doc.py to generate the docstrings
+as C #definitions for the ufuncs in numpy.core at the C level when the
+ufuncs are created at compile time.
  
  """
  import textwrap
  
  docdict = {}
  
-def get(name):
-    return docdict.get(name)
-
  # common parameter text to all ufuncs
  subst = {
      'PARAMS': textwrap.dedent("""
@@ -515,7 +512,7 @@ add_newdoc('numpy.core.umath', 'arctan2',
      >>> np.arctan2([1., -1.], [0., 0.])
      array([ 1.57079633, -1.57079633])
      >>> np.arctan2([0., 0., np.inf], [+0., -0., np.inf])
-    array([ 0.        ,  3.14159265,  0.78539816])
+    array([0.        , 3.14159265, 0.78539816])
  
      """)
  
@@ -1089,9 +1086,8 @@ add_newdoc('numpy.core.umath', 'divide',
      -----
      Equivalent to ``x1`` / ``x2`` in terms of array-broadcasting.
  
-    Behavior on division by zero can be changed using ``seterr``.
-
-    Behaves like ``true_divide``.
+    The ``true_divide(x1, x2)`` function is an alias for
+    ``divide(x1, x2)``.
  
      Examples
      --------
@@ -1100,13 +1096,9 @@ add_newdoc('numpy.core.umath', 'divide',
      >>> x1 = np.arange(9.0).reshape((3, 3))
      >>> x2 = np.arange(3.0)
      >>> np.divide(x1, x2)
-    array([[ NaN,  1. ,  1. ],
-           [ Inf,  4. ,  2.5],
-           [ Inf,  7. ,  4. ]])
-
-    >>> ignored_states = np.seterr(**old_err_state)
-    >>> np.divide(1, 0)
-    0
+    array([[nan, 1. , 1. ],
+           [inf, 4. , 2.5],
+           [inf, 7. , 4. ]])
  
      The ``/`` operator can be used as a shorthand for ``np.divide`` on
      ndarrays.
@@ -3825,8 +3817,9 @@ add_newdoc('numpy.core.umath', 'sqrt',
  
      See Also
      --------
-    lib.scimath.sqrt
+    emath.sqrt
          A version which returns complex numbers when given negative reals.
+        Note: 0.0 and -0.0 are handled differently for complex inputs.
  
      Notes
      -----
@@ -4051,59 +4044,11 @@ add_newdoc('numpy.core.umath', 'tanh',
  
      """)
  
-add_newdoc('numpy.core.umath', 'true_divide',
-    """
-    Returns a true division of the inputs, element-wise.
-
-    Unlike 'floor division', true division adjusts the output type
-    to present the best answer, regardless of input types.
-
-    Parameters
-    ----------
-    x1 : array_like
-        Dividend array.
-    x2 : array_like
-        Divisor array.
-        $BROADCASTABLE_2
-    $PARAMS
-
-    Returns
-    -------
-    out : ndarray or scalar
-        $OUT_SCALAR_2
-
-    Notes
-    -----
-    In Python, ``//`` is the floor division operator and ``/`` the
-    true division operator.  The ``true_divide(x1, x2)`` function is
-    equivalent to true division in Python.
-
-    Examples
-    --------
-    >>> x = np.arange(5)
-    >>> np.true_divide(x, 4)
-    array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])
-
-    >>> x/4
-    array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])
-
-    >>> x//4
-    array([0, 0, 0, 0, 1])
-
-    The ``/`` operator can be used as a shorthand for ``np.true_divide`` on
-    ndarrays.
-
-    >>> x = np.arange(5)
-    >>> x / 4
-    array([0.  , 0.25, 0.5 , 0.75, 1.  ])
-
-    """)
-
  add_newdoc('numpy.core.umath', 'frexp',
      """
      Decompose the elements of x into mantissa and twos exponent.
  
-    Returns (`mantissa`, `exponent`), where `x = mantissa * 2**exponent``.
+    Returns (`mantissa`, `exponent`), where ``x = mantissa * 2**exponent``.
      The mantissa lies in the open interval (-1, 1), while the twos
      exponent is a signed integer.
  
diff --git a/numpy/core/defchararray.pyi b/numpy/core/defchararray.pyi
index 28d247b056e4064ea2760aadb74cd1eac3d8bc9c..73d90bb2fc531a1c38dce4feb0c8ac97c0e17e24 100644
--- a/numpy/core/defchararray.pyi
+++ b/numpy/core/defchararray.pyi
@@ -3,7 +3,6 @@ from typing import (
      overload,
      TypeVar,
      Any,
-    List,
  )
  
  from numpy import (
@@ -17,7 +16,7 @@ from numpy import (
      _OrderKACF,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      _ArrayLikeStr_co as U_co,
      _ArrayLikeBytes_co as S_co,
@@ -30,7 +29,7 @@ from numpy.core.multiarray import compare_chararrays as compare_chararrays
  _SCT = TypeVar("_SCT", str_, bytes_)
  _CharArray = chararray[Any, dtype[_SCT]]
  
-__all__: List[str]
+__all__: list[str]
  
  # Comparison
  @overload
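The stub change above switches the internal import from `numpy.typing` to the private `numpy._typing` module. For code outside NumPy, a reasonable reading is that public aliases such as `NDArray` and `ArrayLike` are still imported from `numpy.typing`. The sketch below assumes that; the function name `scale` is hypothetical and not part of this patch.

```python
import numpy as np
import numpy.typing as npt  # public typing namespace; numpy._typing is private


def scale(x: npt.ArrayLike, factor: float) -> npt.NDArray[np.float64]:
    """Hypothetical helper: convert any array-like to float64 and scale it."""
    return np.asarray(x, dtype=np.float64) * factor


scaled = scale([1, 2, 3], 2.0)
```

Underscore-prefixed aliases such as `_ArrayLikeStr_co` are implementation details of the stubs and are best not relied on in user annotations.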
diff --git a/numpy/core/einsumfunc.py b/numpy/core/einsumfunc.py

index c78d3db23abcbf2a2dae92c8ca5ce8a836f3e878..d6c5885b810a6df86df5c321d9d4793fc7e58496 100644
--- a/numpy/core/einsumfunc.py
+++ b/numpy/core/einsumfunc.py
@@ -821,6 +821,7 @@ def einsum_path(*operands, optimize='greedy', einsum_call=False):
      if path_type is None:
          path_type = False
  
+    explicit_einsum_path = False
      memory_limit = None
  
      # No optimization or a named path algorithm
@@ -829,7 +830,7 @@ def einsum_path(*operands, optimize='greedy', einsum_call=False):
  
      # Given an explicit path
      elif len(path_type) and (path_type[0] == 'einsum_path'):
-        pass
+        explicit_einsum_path = True
  
      # Path tuple with memory limit
      elif ((len(path_type) == 2) and isinstance(path_type[0], str) and
@@ -898,15 +899,19 @@ def einsum_path(*operands, optimize='greedy', einsum_call=False):
      naive_cost = _flop_count(indices, inner_product, len(input_list), dimension_dict)
  
      # Compute the path
-    if (path_type is False) or (len(input_list) in [1, 2]) or (indices == output_set):
+    if explicit_einsum_path:
+        path = path_type[1:]
+    elif (
+        (path_type is False)
+        or (len(input_list) in [1, 2])
+        or (indices == output_set)
+    ):
          # Nothing to be optimized, leave it to einsum
          path = [tuple(range(len(input_list)))]
      elif path_type == "greedy":
          path = _greedy_path(input_sets, output_set, dimension_dict, memory_arg)
      elif path_type == "optimal":
          path = _optimal_path(input_sets, output_set, dimension_dict, memory_arg)
-    elif path_type[0] == 'einsum_path':
-        path = path_type[1:]
      else:
          raise KeyError("Path name %s not found", path_type)
  
@@ -955,6 +960,13 @@ def einsum_path(*operands, optimize='greedy', einsum_call=False):
  
      opt_cost = sum(cost_list) + 1
  
+    if len(input_list) != 1:
+        # Explicit "einsum_path" is usually trusted, but we detect this kind of
+        # mistake in order to prevent from returning an intermediate value.
+        raise RuntimeError(
+            "Invalid einsum_path is specified: {} more operands has to be "
+            "contracted.".format(len(input_list) - 1))
+
      if einsum_call_arg:
          return (operands, contraction_list)
  
diff --git a/numpy/core/einsumfunc.pyi b/numpy/core/einsumfunc.pyi

index aabb04c478b9f3d9e7eb83ee486d16fe59effe43..e614254cacd3f6b78916a39569495d8e1efc874f 100644
--- a/numpy/core/einsumfunc.pyi
+++ b/numpy/core/einsumfunc.pyi
@@ -1,4 +1,5 @@
-from typing import List, TypeVar, Optional, Any, overload, Union, Tuple, Sequence, Literal
+from collections.abc import Sequence
+from typing import TypeVar, Any, overload, Union, Literal
  
  from numpy import (
      ndarray,
@@ -11,7 +12,7 @@ from numpy import (
      number,
      _OrderKACF,
  )
-from numpy.typing import (
+from numpy._typing import (
      _ArrayLikeBool_co,
      _ArrayLikeUInt_co,
      _ArrayLikeInt_co,
@@ -30,13 +31,11 @@ _ArrayType = TypeVar(
      bound=ndarray[Any, dtype[Union[bool_, number[Any]]]],
  )
  
-_OptimizeKind = Union[
-    None, bool, Literal["greedy", "optimal"], Sequence[Any]
-]
+_OptimizeKind = None | bool | Literal["greedy", "optimal"] | Sequence[Any]
  _CastingSafe = Literal["no", "equiv", "safe", "same_kind"]
  _CastingUnsafe = Literal["unsafe"]
  
-__all__: List[str]
+__all__: list[str]
  
  # TODO: Properly handle the `casting`-based combinatorics
  # TODO: We need to evaluate the content `__subscripts` in order
@@ -50,7 +49,7 @@ def einsum(
      /,
      *operands: _ArrayLikeBool_co,
      out: None = ...,
-    dtype: Optional[_DTypeLikeBool] = ...,
+    dtype: None | _DTypeLikeBool = ...,
      order: _OrderKACF = ...,
      casting: _CastingSafe = ...,
      optimize: _OptimizeKind = ...,
@@ -61,7 +60,7 @@ def einsum(
      /,
      *operands: _ArrayLikeUInt_co,
      out: None = ...,
-    dtype: Optional[_DTypeLikeUInt] = ...,
+    dtype: None | _DTypeLikeUInt = ...,
      order: _OrderKACF = ...,
      casting: _CastingSafe = ...,
      optimize: _OptimizeKind = ...,
@@ -72,7 +71,7 @@ def einsum(
      /,
      *operands: _ArrayLikeInt_co,
      out: None = ...,
-    dtype: Optional[_DTypeLikeInt] = ...,
+    dtype: None | _DTypeLikeInt = ...,
      order: _OrderKACF = ...,
      casting: _CastingSafe = ...,
      optimize: _OptimizeKind = ...,
@@ -83,7 +82,7 @@ def einsum(
      /,
      *operands: _ArrayLikeFloat_co,
      out: None = ...,
-    dtype: Optional[_DTypeLikeFloat] = ...,
+    dtype: None | _DTypeLikeFloat = ...,
      order: _OrderKACF = ...,
      casting: _CastingSafe = ...,
      optimize: _OptimizeKind = ...,
@@ -94,7 +93,7 @@ def einsum(
      /,
      *operands: _ArrayLikeComplex_co,
      out: None = ...,
-    dtype: Optional[_DTypeLikeComplex] = ...,
+    dtype: None | _DTypeLikeComplex = ...,
      order: _OrderKACF = ...,
      casting: _CastingSafe = ...,
      optimize: _OptimizeKind = ...,
@@ -105,7 +104,7 @@ def einsum(
      /,
      *operands: Any,
      casting: _CastingUnsafe,
-    dtype: Optional[_DTypeLikeComplex_co] = ...,
+    dtype: None | _DTypeLikeComplex_co = ...,
      out: None = ...,
      order: _OrderKACF = ...,
      optimize: _OptimizeKind = ...,
@@ -116,7 +115,7 @@ def einsum(
      /,
      *operands: _ArrayLikeComplex_co,
      out: _ArrayType,
-    dtype: Optional[_DTypeLikeComplex_co] = ...,
+    dtype: None | _DTypeLikeComplex_co = ...,
      order: _OrderKACF = ...,
      casting: _CastingSafe = ...,
      optimize: _OptimizeKind = ...,
@@ -128,7 +127,7 @@ def einsum(
      *operands: Any,
      out: _ArrayType,
      casting: _CastingUnsafe,
-    dtype: Optional[_DTypeLikeComplex_co] = ...,
+    dtype: None | _DTypeLikeComplex_co = ...,
      order: _OrderKACF = ...,
      optimize: _OptimizeKind = ...,
  ) -> _ArrayType: ...
@@ -142,4 +141,4 @@ def einsum_path(
      /,
      *operands: _ArrayLikeComplex_co,
      optimize: _OptimizeKind = ...,
-) -> Tuple[List[Any], str]: ...
+) -> tuple[list[Any], str]: ...
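The stub edits above replace `Optional[X]` with `None | X` and `Tuple`/`List` with the builtin generics. These are spelling changes rather than changes in meaning, which the following sketch tries to show at runtime. It assumes Python 3.9 or newer for `tuple[...]` and `list[...]`, and it uses `Union[None, int]` as a stand-in for `None | int` so that it does not require Python 3.10.

```python
from typing import Any, List, Optional, Tuple, Union, get_args

# Optional[X] is shorthand for Union[X, None]; member order does not matter.
old_dtype_hint = Optional[int]
new_dtype_hint = Union[None, int]  # what the stub spelling `None | int` denotes
same_hint = old_dtype_hint == new_dtype_hint

# typing.Tuple/List and the builtin generics carry the same type arguments.
old_return = Tuple[List[Any], str]
new_return = tuple[list[Any], str]  # builtin generics, Python 3.9+
same_second_arg = get_args(old_return)[1] is get_args(new_return)[1]
```

Stub files are never executed, so the `None | X` form is usable there regardless of the Python version that runs the library.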
diff --git a/numpy/core/fromnumeric.py b/numpy/core/fromnumeric.py

index 3242124acf50826580b228ea61d5a372e5a6ba59..9e58d9beada72185292217d0a069a48e9aa5a030 100644
--- a/numpy/core/fromnumeric.py
+++ b/numpy/core/fromnumeric.py
@@ -17,7 +17,7 @@ _dt_ = nt.sctype2char
  
  # functions that are methods
  __all__ = [
-    'alen', 'all', 'alltrue', 'amax', 'amin', 'any', 'argmax',
+    'all', 'alltrue', 'amax', 'amin', 'any', 'argmax',
      'argmin', 'argpartition', 'argsort', 'around', 'choose', 'clip',
      'compress', 'cumprod', 'cumproduct', 'cumsum', 'diagonal', 'mean',
      'ndim', 'nonzero', 'partition', 'prod', 'product', 'ptp', 'put',
@@ -804,8 +804,8 @@ def argpartition(a, kth, axis=-1, kind='introselect', order=None):
      index_array : ndarray, int
          Array of indices that partition `a` along the specified axis.
          If `a` is one-dimensional, ``a[index_array]`` yields a partitioned `a`.
-        More generally, ``np.take_along_axis(a, index_array, axis=a)`` always
-        yields the partitioned `a`, irrespective of dimensionality.
+        More generally, ``np.take_along_axis(a, index_array, axis)``
+        always yields the partitioned `a`, irrespective of dimensionality.
  
      See Also
      --------
@@ -1980,25 +1980,27 @@ def shape(a):
  
      See Also
      --------
-    len
+    len : ``len(a)`` is equivalent to ``np.shape(a)[0]`` for N-D arrays with
+          ``N>=1``.
      ndarray.shape : Equivalent array method.
  
      Examples
      --------
      >>> np.shape(np.eye(3))
      (3, 3)
-    >>> np.shape([[1, 2]])
+    >>> np.shape([[1, 3]])
      (1, 2)
      >>> np.shape([0])
      (1,)
      >>> np.shape(0)
      ()
  
-    >>> a = np.array([(1, 2), (3, 4)], dtype=[('x', 'i4'), ('y', 'i4')])
+    >>> a = np.array([(1, 2), (3, 4), (5, 6)],
+    ...              dtype=[('x', 'i4'), ('y', 'i4')])
      >>> np.shape(a)
-    (2,)
+    (3,)
      >>> a.shape
-    (2,)
+    (3,)
  
      """
      try:
@@ -2307,7 +2309,7 @@ def any(a, axis=None, out=None, keepdims=np._NoValue, *, where=np._NoValue):
      """
      Test whether any array element along a given axis evaluates to True.
  
-    Returns single boolean unless `axis` is not ``None``
+    Returns single boolean if `axis` is ``None``
  
      Parameters
      ----------
@@ -2917,51 +2919,6 @@ def amin(a, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue,
                            keepdims=keepdims, initial=initial, where=where)
  
  
-def _alen_dispathcer(a):
-    return (a,)
-
-
-@array_function_dispatch(_alen_dispathcer)
-def alen(a):
-    """
-    Return the length of the first dimension of the input array.
-
-    .. deprecated:: 1.18
-       `numpy.alen` is deprecated, use `len` instead.
-
-    Parameters
-    ----------
-    a : array_like
-       Input array.
-
-    Returns
-    -------
-    alen : int
-       Length of the first dimension of `a`.
-
-    See Also
-    --------
-    shape, size
-
-    Examples
-    --------
-    >>> a = np.zeros((7,4,5))
-    >>> a.shape[0]
-    7
-    >>> np.alen(a)
-    7
-
-    """
-    # NumPy 1.18.0, 2019-08-02
-    warnings.warn(
-        "`np.alen` is deprecated, use `len` instead",
-        DeprecationWarning, stacklevel=2)
-    try:
-        return len(a)
-    except TypeError:
-        return len(array(a, ndmin=1))
-
-
  def _prod_dispatcher(a, axis=None, dtype=None, out=None, keepdims=None,
                       initial=None, where=None):
      return (a, out)
@@ -3451,6 +3408,7 @@ def mean(a, axis=None, dtype=None, out=None, keepdims=np._NoValue, *,
      0.55000000074505806 # may vary
  
      Specifying a where argument:
+
      >>> a = np.array([[5, 9, 13], [14, 10, 12], [11, 15, 19]])
      >>> np.mean(a)
      12.0
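Two points from the `fromnumeric.py` hunks above can be checked directly: the corrected `argpartition` docstring (pass `axis`, not `axis=a`, to `np.take_along_axis`), and the removal of `np.alen` in favour of `len`. The sketch below is illustrative only; `alen_compat` is a hypothetical helper that mirrors the fallback of the removed function and is not part of NumPy.

```python
import numpy as np

a = np.array([[7, 1, 5, 3],
              [2, 8, 4, 6]])

# argpartition returns indices; take_along_axis applies them along the same axis.
idx = np.argpartition(a, 1, axis=1)
partitioned = np.take_along_axis(a, idx, axis=1)

# len(a) equals np.shape(a)[0] for arrays with at least one dimension.
first_dim = len(a)
same_as_shape = first_dim == np.shape(a)[0]


def alen_compat(x):
    """Hypothetical stand-in for the removed np.alen, including 0-d input."""
    try:
        return len(x)
    except TypeError:
        # Scalars have no len(); promote to 1-D as the removed function did.
        return len(np.array(x, ndmin=1))
```

After partitioning with `kth=1`, the element at index 1 of each row is the one that a full sort would place there; the order of the other elements is unspecified.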
diff --git a/numpy/core/fromnumeric.pyi b/numpy/core/fromnumeric.pyi

index 4a5e50503fe0d608a57ca78b9238caf031c0aad4..17b17819d70476c7f18437ea2787aafed7ff23e5 100644
--- a/numpy/core/fromnumeric.pyi
+++ b/numpy/core/fromnumeric.pyi
@@ -1,12 +1,19 @@
  import datetime as dt
-from typing import Optional, Union, Sequence, Tuple, Any, overload, TypeVar, Literal
+from collections.abc import Sequence
+from typing import Union, Any, overload, TypeVar, Literal, SupportsIndex
  
  from numpy import (
      ndarray,
      number,
-    integer,
+    uint64,
+    int_,
+    int64,
      intp,
+    float16,
      bool_,
+    floating,
+    complexfloating,
+    object_,
      generic,
      _OrderKACF,
      _OrderACF,
@@ -15,230 +22,479 @@ from numpy import (
      _SortKind,
      _SortSide,
  )
-from numpy.typing import (
+from numpy._typing import (
      DTypeLike,
+    _DTypeLike,
      ArrayLike,
+    _ArrayLike,
+    NDArray,
      _ShapeLike,
      _Shape,
      _ArrayLikeBool_co,
+    _ArrayLikeUInt_co,
      _ArrayLikeInt_co,
+    _ArrayLikeFloat_co,
+    _ArrayLikeComplex_co,
+    _ArrayLikeObject_co,
+    _IntLike_co,
+    _BoolLike_co,
+    _ComplexLike_co,
      _NumberLike_co,
+    _ScalarLike_co,
  )
  
-# Various annotations for scalars
+_SCT = TypeVar("_SCT", bound=generic)
+_SCT_uifcO = TypeVar("_SCT_uifcO", bound=number[Any] | object_)
+_ArrayType = TypeVar("_ArrayType", bound=NDArray[Any])
  
-# While dt.datetime and dt.timedelta are not technically part of NumPy,
-# they are one of the rare few builtin scalars which serve as valid return types.
-# See https://github.com/numpy/numpy-stubs/pull/67#discussion_r412604113.
-_ScalarNumpy = Union[generic, dt.datetime, dt.timedelta]
-_ScalarBuiltin = Union[str, bytes, dt.date, dt.timedelta, bool, int, float, complex]
-_Scalar = Union[_ScalarBuiltin, _ScalarNumpy]
+__all__: list[str]
  
-# Integers and booleans can generally be used interchangeably
-_ScalarGeneric = TypeVar("_ScalarGeneric", bound=generic)
-
-_Number = TypeVar("_Number", bound=number)
-
-# The signature of take() follows a common theme with its overloads:
-# 1. A generic comes in; the same generic comes out
-# 2. A scalar comes in; a generic comes out
-# 3. An array-like object comes in; some keyword ensures that a generic comes out
-# 4. An array-like object comes in; an ndarray or generic comes out
+@overload
+def take(
+    a: _ArrayLike[_SCT],
+    indices: _IntLike_co,
+    axis: None = ...,
+    out: None = ...,
+    mode: _ModeKind = ...,
+) -> _SCT: ...
+@overload
  def take(
      a: ArrayLike,
-    indices: _ArrayLikeInt_co,
-    axis: Optional[int] = ...,
-    out: Optional[ndarray] = ...,
+    indices: _IntLike_co,
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
      mode: _ModeKind = ...,
  ) -> Any: ...
+@overload
+def take(
+    a: _ArrayLike[_SCT],
+    indices: _ArrayLikeInt_co,
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
+    mode: _ModeKind = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def take(
+    a: ArrayLike,
+    indices: _ArrayLikeInt_co,
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
+    mode: _ModeKind = ...,
+) -> NDArray[Any]: ...
+@overload
+def take(
+    a: ArrayLike,
+    indices: _ArrayLikeInt_co,
+    axis: None | SupportsIndex = ...,
+    out: _ArrayType = ...,
+    mode: _ModeKind = ...,
+) -> _ArrayType: ...
  
+@overload
+def reshape(
+    a: _ArrayLike[_SCT],
+    newshape: _ShapeLike,
+    order: _OrderACF = ...,
+) -> NDArray[_SCT]: ...
+@overload
  def reshape(
      a: ArrayLike,
      newshape: _ShapeLike,
      order: _OrderACF = ...,
-) -> ndarray: ...
+) -> NDArray[Any]: ...
  
+@overload
  def choose(
-    a: _ArrayLikeInt_co,
+    a: _IntLike_co,
      choices: ArrayLike,
-    out: Optional[ndarray] = ...,
+    out: None = ...,
      mode: _ModeKind = ...,
  ) -> Any: ...
+@overload
+def choose(
+    a: _ArrayLikeInt_co,
+    choices: _ArrayLike[_SCT],
+    out: None = ...,
+    mode: _ModeKind = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def choose(
+    a: _ArrayLikeInt_co,
+    choices: ArrayLike,
+    out: None = ...,
+    mode: _ModeKind = ...,
+) -> NDArray[Any]: ...
+@overload
+def choose(
+    a: _ArrayLikeInt_co,
+    choices: ArrayLike,
+    out: _ArrayType = ...,
+    mode: _ModeKind = ...,
+) -> _ArrayType: ...
  
+@overload
+def repeat(
+    a: _ArrayLike[_SCT],
+    repeats: _ArrayLikeInt_co,
+    axis: None | SupportsIndex = ...,
+) -> NDArray[_SCT]: ...
+@overload
  def repeat(
      a: ArrayLike,
      repeats: _ArrayLikeInt_co,
-    axis: Optional[int] = ...,
-) -> ndarray: ...
+    axis: None | SupportsIndex = ...,
+) -> NDArray[Any]: ...
  
  def put(
-    a: ndarray,
+    a: NDArray[Any],
      ind: _ArrayLikeInt_co,
      v: ArrayLike,
      mode: _ModeKind = ...,
  ) -> None: ...
  
+@overload
+def swapaxes(
+    a: _ArrayLike[_SCT],
+    axis1: SupportsIndex,
+    axis2: SupportsIndex,
+) -> NDArray[_SCT]: ...
+@overload
  def swapaxes(
      a: ArrayLike,
-    axis1: int,
-    axis2: int,
-) -> ndarray: ...
+    axis1: SupportsIndex,
+    axis2: SupportsIndex,
+) -> NDArray[Any]: ...
  
+@overload
+def transpose(
+    a: _ArrayLike[_SCT],
+    axes: None | _ShapeLike = ...
+) -> NDArray[_SCT]: ...
+@overload
  def transpose(
      a: ArrayLike,
-    axes: Union[None, Sequence[int], ndarray] = ...
-) -> ndarray: ...
+    axes: None | _ShapeLike = ...
+) -> NDArray[Any]: ...
  
+@overload
+def partition(
+    a: _ArrayLike[_SCT],
+    kth: _ArrayLikeInt_co,
+    axis: None | SupportsIndex = ...,
+    kind: _PartitionKind = ...,
+    order: None | str | Sequence[str] = ...,
+) -> NDArray[_SCT]: ...
+@overload
  def partition(
      a: ArrayLike,
      kth: _ArrayLikeInt_co,
-    axis: Optional[int] = ...,
+    axis: None | SupportsIndex = ...,
      kind: _PartitionKind = ...,
-    order: Union[None, str, Sequence[str]] = ...,
-) -> ndarray: ...
+    order: None | str | Sequence[str] = ...,
+) -> NDArray[Any]: ...
  
  def argpartition(
      a: ArrayLike,
      kth: _ArrayLikeInt_co,
-    axis: Optional[int] = ...,
+    axis: None | SupportsIndex = ...,
      kind: _PartitionKind = ...,
-    order: Union[None, str, Sequence[str]] = ...,
-) -> Any: ...
+    order: None | str | Sequence[str] = ...,
+) -> NDArray[intp]: ...
  
+@overload
+def sort(
+    a: _ArrayLike[_SCT],
+    axis: None | SupportsIndex = ...,
+    kind: None | _SortKind = ...,
+    order: None | str | Sequence[str] = ...,
+) -> NDArray[_SCT]: ...
+@overload
  def sort(
      a: ArrayLike,
-    axis: Optional[int] = ...,
-    kind: Optional[_SortKind] = ...,
-    order: Union[None, str, Sequence[str]] = ...,
-) -> ndarray: ...
+    axis: None | SupportsIndex = ...,
+    kind: None | _SortKind = ...,
+    order: None | str | Sequence[str] = ...,
+) -> NDArray[Any]: ...
  
  def argsort(
      a: ArrayLike,
-    axis: Optional[int] = ...,
-    kind: Optional[_SortKind] = ...,
-    order: Union[None, str, Sequence[str]] = ...,
-) -> ndarray: ...
+    axis: None | SupportsIndex = ...,
+    kind: None | _SortKind = ...,
+    order: None | str | Sequence[str] = ...,
+) -> NDArray[intp]: ...
  
  @overload
  def argmax(
      a: ArrayLike,
      axis: None = ...,
-    out: Optional[ndarray] = ...,
+    out: None = ...,
      *,
      keepdims: Literal[False] = ...,
  ) -> intp: ...
  @overload
  def argmax(
      a: ArrayLike,
-    axis: Optional[int] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
      *,
      keepdims: bool = ...,
  ) -> Any: ...
+@overload
+def argmax(
+    a: ArrayLike,
+    axis: None | SupportsIndex = ...,
+    out: _ArrayType = ...,
+    *,
+    keepdims: bool = ...,
+) -> _ArrayType: ...
  
  @overload
  def argmin(
      a: ArrayLike,
      axis: None = ...,
-    out: Optional[ndarray] = ...,
+    out: None = ...,
      *,
      keepdims: Literal[False] = ...,
  ) -> intp: ...
  @overload
  def argmin(
      a: ArrayLike,
-    axis: Optional[int] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
      *,
      keepdims: bool = ...,
  ) -> Any: ...
+@overload
+def argmin(
+    a: ArrayLike,
+    axis: None | SupportsIndex = ...,
+    out: _ArrayType = ...,
+    *,
+    keepdims: bool = ...,
+) -> _ArrayType: ...
  
  @overload
  def searchsorted(
      a: ArrayLike,
-    v: _Scalar,
+    v: _ScalarLike_co,
      side: _SortSide = ...,
-    sorter: Optional[_ArrayLikeInt_co] = ...,  # 1D int array
+    sorter: None | _ArrayLikeInt_co = ...,  # 1D int array
  ) -> intp: ...
  @overload
  def searchsorted(
      a: ArrayLike,
      v: ArrayLike,
      side: _SortSide = ...,
-    sorter: Optional[_ArrayLikeInt_co] = ...,  # 1D int array
-) -> ndarray: ...
+    sorter: None | _ArrayLikeInt_co = ...,  # 1D int array
+) -> NDArray[intp]: ...
  
+@overload
+def resize(
+    a: _ArrayLike[_SCT],
+    new_shape: _ShapeLike,
+) -> NDArray[_SCT]: ...
+@overload
  def resize(
      a: ArrayLike,
      new_shape: _ShapeLike,
-) -> ndarray: ...
+) -> NDArray[Any]: ...
  
  @overload
  def squeeze(
-    a: _ScalarGeneric,
-    axis: Optional[_ShapeLike] = ...,
-) -> _ScalarGeneric: ...
+    a: _SCT,
+    axis: None | _ShapeLike = ...,
+) -> _SCT: ...
+@overload
+def squeeze(
+    a: _ArrayLike[_SCT],
+    axis: None | _ShapeLike = ...,
+) -> NDArray[_SCT]: ...
  @overload
  def squeeze(
      a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-) -> ndarray: ...
+    axis: None | _ShapeLike = ...,
+) -> NDArray[Any]: ...
  
+@overload
+def diagonal(
+    a: _ArrayLike[_SCT],
+    offset: SupportsIndex = ...,
+    axis1: SupportsIndex = ...,
+    axis2: SupportsIndex = ...,  # >= 2D array
+) -> NDArray[_SCT]: ...
+@overload
  def diagonal(
      a: ArrayLike,
-    offset: int = ...,
-    axis1: int = ...,
-    axis2: int = ...,  # >= 2D array
-) -> ndarray: ...
+    offset: SupportsIndex = ...,
+    axis1: SupportsIndex = ...,
+    axis2: SupportsIndex = ...,  # >= 2D array
+) -> NDArray[Any]: ...
  
+@overload
  def trace(
      a: ArrayLike,  # >= 2D array
-    offset: int = ...,
-    axis1: int = ...,
-    axis2: int = ...,
+    offset: SupportsIndex = ...,
+    axis1: SupportsIndex = ...,
+    axis2: SupportsIndex = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
+    out: None = ...,
  ) -> Any: ...
+@overload
+def trace(
+    a: ArrayLike,  # >= 2D array
+    offset: SupportsIndex = ...,
+    axis1: SupportsIndex = ...,
+    axis2: SupportsIndex = ...,
+    dtype: DTypeLike = ...,
+    out: _ArrayType = ...,
+) -> _ArrayType: ...
  
-def ravel(a: ArrayLike, order: _OrderKACF = ...) -> ndarray: ...
+@overload
+def ravel(a: _ArrayLike[_SCT], order: _OrderKACF = ...) -> NDArray[_SCT]: ...
+@overload
+def ravel(a: ArrayLike, order: _OrderKACF = ...) -> NDArray[Any]: ...
  
-def nonzero(a: ArrayLike) -> Tuple[ndarray, ...]: ...
+def nonzero(a: ArrayLike) -> tuple[NDArray[intp], ...]: ...
  
  def shape(a: ArrayLike) -> _Shape: ...
  
+@overload
+def compress(
+    condition: _ArrayLikeBool_co,  # 1D bool array
+    a: _ArrayLike[_SCT],
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def compress(
+    condition: _ArrayLikeBool_co,  # 1D bool array
+    a: ArrayLike,
+    axis: None | SupportsIndex = ...,
+    out: None = ...,
+) -> NDArray[Any]: ...
+@overload
  def compress(
-    condition: ArrayLike,  # 1D bool array
+    condition: _ArrayLikeBool_co,  # 1D bool array
      a: ArrayLike,
-    axis: Optional[int] = ...,
-    out: Optional[ndarray] = ...,
-) -> ndarray: ...
+    axis: None | SupportsIndex = ...,
+    out: _ArrayType = ...,
+) -> _ArrayType: ...
  
  @overload
  def clip(
-    a: ArrayLike,
-    a_min: ArrayLike,
-    a_max: Optional[ArrayLike],
-    out: Optional[ndarray] = ...,
-    **kwargs: Any,
+    a: _SCT,
+    a_min: None | ArrayLike,
+    a_max: None | ArrayLike,
+    out: None = ...,
+    *,
+    dtype: None = ...,
+    where: None | _ArrayLikeBool_co = ...,
+    order: _OrderKACF = ...,
+    subok: bool = ...,
+    signature: str | tuple[None | str, ...] = ...,
+    extobj: list[Any] = ...,
+) -> _SCT: ...
+@overload
+def clip(
+    a: _ScalarLike_co,
+    a_min: None | ArrayLike,
+    a_max: None | ArrayLike,
+    out: None = ...,
+    *,
+    dtype: None = ...,
+    where: None | _ArrayLikeBool_co = ...,
+    order: _OrderKACF = ...,
+    subok: bool = ...,
+    signature: str | tuple[None | str, ...] = ...,
+    extobj: list[Any] = ...,
  ) -> Any: ...
  @overload
+def clip(
+    a: _ArrayLike[_SCT],
+    a_min: None | ArrayLike,
+    a_max: None | ArrayLike,
+    out: None = ...,
+    *,
+    dtype: None = ...,
+    where: None | _ArrayLikeBool_co = ...,
+    order: _OrderKACF = ...,
+    subok: bool = ...,
+    signature: str | tuple[None | str, ...] = ...,
+    extobj: list[Any] = ...,
+) -> NDArray[_SCT]: ...
+@overload
  def clip(
      a: ArrayLike,
-    a_min: None,
-    a_max: ArrayLike,
-    out: Optional[ndarray] = ...,
-    **kwargs: Any,
+    a_min: None | ArrayLike,
+    a_max: None | ArrayLike,
+    out: None = ...,
+    *,
+    dtype: None = ...,
+    where: None | _ArrayLikeBool_co = ...,
+    order: _OrderKACF = ...,
+    subok: bool = ...,
+    signature: str | tuple[None | str, ...] = ...,
+    extobj: list[Any] = ...,
+) -> NDArray[Any]: ...
+@overload
+def clip(
+    a: ArrayLike,
+    a_min: None | ArrayLike,
+    a_max: None | ArrayLike,
+    out: _ArrayType = ...,
+    *,
+    dtype: DTypeLike,
+    where: None | _ArrayLikeBool_co = ...,
+    order: _OrderKACF = ...,
+    subok: bool = ...,
+    signature: str | tuple[None | str, ...] = ...,
+    extobj: list[Any] = ...,
  ) -> Any: ...
+@overload
+def clip(
+    a: ArrayLike,
+    a_min: None | ArrayLike,
+    a_max: None | ArrayLike,
+    out: _ArrayType,
+    *,
+    dtype: DTypeLike = ...,
+    where: None | _ArrayLikeBool_co = ...,
+    order: _OrderKACF = ...,
+    subok: bool = ...,
+    signature: str | tuple[None | str, ...] = ...,
+    extobj: list[Any] = ...,
+) -> _ArrayType: ...
  
+@overload
+def sum(
+    a: _ArrayLike[_SCT],
+    axis: None = ...,
+    dtype: None = ...,
+    out: None  = ...,
+    keepdims: bool = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
  def sum(
      a: ArrayLike,
-    axis: _ShapeLike = ...,
+    axis: None | _ShapeLike = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
+    out: None  = ...,
      keepdims: bool = ...,
      initial: _NumberLike_co = ...,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def sum(
+    a: ArrayLike,
+    axis: None | _ShapeLike = ...,
+    dtype: DTypeLike = ...,
+    out: _ArrayType  = ...,
+    keepdims: bool = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
  @overload
  def all(
@@ -252,12 +508,21 @@ def all(
  @overload
  def all(
      a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | _ShapeLike = ...,
+    out: None = ...,
      keepdims: bool = ...,
      *,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def all(
+    a: ArrayLike,
+    axis: None | _ShapeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
  @overload
  def any(
@@ -271,44 +536,135 @@ def any(
  @overload
  def any(
      a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | _ShapeLike = ...,
+    out: None = ...,
      keepdims: bool = ...,
      *,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def any(
+    a: ArrayLike,
+    axis: None | _ShapeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
+@overload
+def cumsum(
+    a: _ArrayLike[_SCT],
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def cumsum(
+    a: ArrayLike,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[Any]: ...
+@overload
+def cumsum(
+    a: ArrayLike,
+    axis: None | SupportsIndex = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    out: None = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def cumsum(
+    a: ArrayLike,
+    axis: None | SupportsIndex = ...,
+    dtype: DTypeLike = ...,
+    out: None = ...,
+) -> NDArray[Any]: ...
+@overload
  def cumsum(
      a: ArrayLike,
-    axis: Optional[int] = ...,
+    axis: None | SupportsIndex = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
-) -> ndarray: ...
+    out: _ArrayType = ...,
+) -> _ArrayType: ...
  
+@overload
+def ptp(
+    a: _ArrayLike[_SCT],
+    axis: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+) -> _SCT: ...
+@overload
  def ptp(
      a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | _ShapeLike = ...,
+    out: None = ...,
      keepdims: bool = ...,
  ) -> Any: ...
+@overload
+def ptp(
+    a: ArrayLike,
+    axis: None | _ShapeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+) -> _ArrayType: ...
  
+@overload
+def amax(
+    a: _ArrayLike[_SCT],
+    axis: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
  def amax(
      a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | _ShapeLike = ...,
+    out: None = ...,
      keepdims: bool = ...,
      initial: _NumberLike_co = ...,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def amax(
+    a: ArrayLike,
+    axis: None | _ShapeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
+@overload
+def amin(
+    a: _ArrayLike[_SCT],
+    axis: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
  def amin(
      a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-    out: Optional[ndarray] = ...,
+    axis: None | _ShapeLike = ...,
+    out: None = ...,
      keepdims: bool = ...,
      initial: _NumberLike_co = ...,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def amin(
+    a: ArrayLike,
+    axis: None | _ShapeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
  # TODO: `np.prod()``: For object arrays `initial` does not necessarily
  # have to be a numerical scalar.
@@ -317,61 +673,377 @@ def amin(
  
  # Note that the same situation holds for all wrappers around
  # `np.ufunc.reduce`, e.g. `np.sum()` (`.__add__()`).
+@overload
  def prod(
-    a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
-    dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
+    a: _ArrayLikeBool_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> int_: ...
+@overload
+def prod(
+    a: _ArrayLikeUInt_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> uint64: ...
+@overload
+def prod(
+    a: _ArrayLikeInt_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> int64: ...
+@overload
+def prod(
+    a: _ArrayLikeFloat_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> floating[Any]: ...
+@overload
+def prod(
+    a: _ArrayLikeComplex_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> complexfloating[Any, Any]: ...
+@overload
+def prod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: None = ...,
+    out: None = ...,
      keepdims: bool = ...,
      initial: _NumberLike_co = ...,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def prod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
+def prod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: None | DTypeLike = ...,
+    out: None = ...,
+    keepdims: bool = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> Any: ...
+@overload
+def prod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: None | DTypeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+    initial: _NumberLike_co = ...,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
+@overload
  def cumprod(
-    a: ArrayLike,
-    axis: Optional[int] = ...,
+    a: _ArrayLikeBool_co,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[int_]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeUInt_co,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[uint64]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeInt_co,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[int64]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeFloat_co,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[floating[Any]]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeComplex_co,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[complexfloating[Any, Any]]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeObject_co,
+    axis: None | SupportsIndex = ...,
+    dtype: None = ...,
+    out: None = ...,
+) -> NDArray[object_]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | SupportsIndex = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    out: None = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | SupportsIndex = ...,
+    dtype: DTypeLike = ...,
+    out: None = ...,
+) -> NDArray[Any]: ...
+@overload
+def cumprod(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | SupportsIndex = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
-) -> ndarray: ...
+    out: _ArrayType = ...,
+) -> _ArrayType: ...
  
  def ndim(a: ArrayLike) -> int: ...
  
-def size(a: ArrayLike, axis: Optional[int] = ...) -> int: ...
+def size(a: ArrayLike, axis: None | int = ...) -> int: ...
  
+@overload
  def around(
-    a: ArrayLike,
-    decimals: int = ...,
-    out: Optional[ndarray] = ...,
+    a: _BoolLike_co,
+    decimals: SupportsIndex = ...,
+    out: None = ...,
+) -> float16: ...
+@overload
+def around(
+    a: _SCT_uifcO,
+    decimals: SupportsIndex = ...,
+    out: None = ...,
+) -> _SCT_uifcO: ...
+@overload
+def around(
+    a: _ComplexLike_co | object_,
+    decimals: SupportsIndex = ...,
+    out: None = ...,
  ) -> Any: ...
+@overload
+def around(
+    a: _ArrayLikeBool_co,
+    decimals: SupportsIndex = ...,
+    out: None = ...,
+) -> NDArray[float16]: ...
+@overload
+def around(
+    a: _ArrayLike[_SCT_uifcO],
+    decimals: SupportsIndex = ...,
+    out: None = ...,
+) -> NDArray[_SCT_uifcO]: ...
+@overload
+def around(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    decimals: SupportsIndex = ...,
+    out: None = ...,
+) -> NDArray[Any]: ...
+@overload
+def around(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    decimals: SupportsIndex = ...,
+    out: _ArrayType = ...,
+) -> _ArrayType: ...
  
+@overload
  def mean(
-    a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
+    a: _ArrayLikeFloat_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> floating[Any]: ...
+@overload
+def mean(
+    a: _ArrayLikeComplex_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> complexfloating[Any, Any]: ...
+@overload
+def mean(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: None = ...,
+    out: None = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> Any: ...
+@overload
+def mean(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    out: None = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
+def mean(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
+    out: None = ...,
      keepdims: bool = ...,
      *,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def mean(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: DTypeLike = ...,
+    out: _ArrayType = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
+@overload
  def std(
-    a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
+    a: _ArrayLikeComplex_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    ddof: float = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> floating[Any]: ...
+@overload
+def std(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: None = ...,
+    out: None = ...,
+    ddof: float = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> Any: ...
+@overload
+def std(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    out: None = ...,
+    ddof: float = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
+def std(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
-    ddof: int = ...,
+    out: None = ...,
+    ddof: float = ...,
      keepdims: bool = ...,
      *,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def std(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: DTypeLike = ...,
+    out: _ArrayType = ...,
+    ddof: float = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
  
+@overload
  def var(
-    a: ArrayLike,
-    axis: Optional[_ShapeLike] = ...,
+    a: _ArrayLikeComplex_co,
+    axis: None = ...,
+    dtype: None = ...,
+    out: None = ...,
+    ddof: float = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> floating[Any]: ...
+@overload
+def var(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: None = ...,
+    out: None = ...,
+    ddof: float = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> Any: ...
+@overload
+def var(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    out: None = ...,
+    ddof: float = ...,
+    keepdims: Literal[False] = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _SCT: ...
+@overload
+def var(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
      dtype: DTypeLike = ...,
-    out: Optional[ndarray] = ...,
-    ddof: int = ...,
+    out: None = ...,
+    ddof: float = ...,
      keepdims: bool = ...,
      *,
      where: _ArrayLikeBool_co = ...,
  ) -> Any: ...
+@overload
+def var(
+    a: _ArrayLikeComplex_co | _ArrayLikeObject_co,
+    axis: None | _ShapeLike = ...,
+    dtype: DTypeLike = ...,
+    out: _ArrayType = ...,
+    ddof: float = ...,
+    keepdims: bool = ...,
+    *,
+    where: _ArrayLikeBool_co = ...,
+) -> _ArrayType: ...
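The new `mean`, `std`, and `var` stubs above select a return type from the `dtype` argument: `dtype=None` gives a generic floating result, while a concrete scalar type is carried through the `_SCT` type variable. A minimal stdlib-only sketch of that pattern follows. The names `toy_mean` and `_T` are hypothetical and not part of NumPy, and the runtime body is only a stand-in so the overloads have something to describe.

```python
from __future__ import annotations

from typing import Any, Sequence, TypeVar, overload

# Hypothetical stand-in for the `_SCT` scalar type variable in the stubs.
_T = TypeVar("_T", int, float, complex)


@overload
def toy_mean(a: Sequence[float], dtype: None = ...) -> float: ...
@overload
def toy_mean(a: Sequence[complex], dtype: type[_T]) -> _T: ...
def toy_mean(a: Sequence[Any], dtype: Any = None) -> Any:
    # Runtime behaviour is deliberately simple: average, then convert
    # only when the caller asked for a specific result type.
    result = sum(a) / len(a)
    return result if dtype is None else dtype(result)
```

A type checker reading these overloads would infer `float` for `toy_mean([1.0, 2.0])` and `int` for `toy_mean([1, 2], int)`, which is the same narrowing the stubs aim for with NumPy scalar types.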
diff --git a/numpy/core/function_base.pyi b/numpy/core/function_base.pyi
index 68d3b3a98f5766c27cef8c8052d71abda16da9c3..2c2a277b1b1b0d180bc13473ce11e637fe946fdb 100644
--- a/numpy/core/function_base.pyi
+++ b/numpy/core/function_base.pyi
@@ -1,60 +1,187 @@
-from typing import overload, Tuple, Union, Sequence, Any, SupportsIndex, Literal, List
+from typing import (
+    Literal as L,
+    overload,
+    Any,
+    SupportsIndex,
+    TypeVar,
+)
  
-from numpy import ndarray
-from numpy.typing import ArrayLike, DTypeLike, _SupportsArray, _NumberLike_co
+from numpy import floating, complexfloating, generic
+from numpy._typing import (
+    NDArray,
+    DTypeLike,
+    _DTypeLike,
+    _ArrayLikeFloat_co,
+    _ArrayLikeComplex_co,
+)
  
-# TODO: wait for support for recursive types
-_ArrayLikeNested = Sequence[Sequence[Any]]
-_ArrayLikeNumber = Union[
-    _NumberLike_co, Sequence[_NumberLike_co], ndarray, _SupportsArray, _ArrayLikeNested
-]
+_SCT = TypeVar("_SCT", bound=generic)
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def linspace(
-    start: _ArrayLikeNumber,
-    stop: _ArrayLikeNumber,
+    start: _ArrayLikeFloat_co,
+    stop: _ArrayLikeFloat_co,
      num: SupportsIndex = ...,
      endpoint: bool = ...,
-    retstep: Literal[False] = ...,
+    retstep: L[False] = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[floating[Any]]: ...
+@overload
+def linspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    retstep: L[False] = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[complexfloating[Any, Any]]: ...
+@overload
+def linspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    retstep: L[False] = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def linspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    retstep: L[False] = ...,
      dtype: DTypeLike = ...,
      axis: SupportsIndex = ...,
-) -> ndarray: ...
+) -> NDArray[Any]: ...
  @overload
  def linspace(
-    start: _ArrayLikeNumber,
-    stop: _ArrayLikeNumber,
+    start: _ArrayLikeFloat_co,
+    stop: _ArrayLikeFloat_co,
      num: SupportsIndex = ...,
      endpoint: bool = ...,
-    retstep: Literal[True] = ...,
+    retstep: L[True] = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> tuple[NDArray[floating[Any]], floating[Any]]: ...
+@overload
+def linspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    retstep: L[True] = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> tuple[NDArray[complexfloating[Any, Any]], complexfloating[Any, Any]]: ...
+@overload
+def linspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    retstep: L[True] = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    axis: SupportsIndex = ...,
+) -> tuple[NDArray[_SCT], _SCT]: ...
+@overload
+def linspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    retstep: L[True] = ...,
      dtype: DTypeLike = ...,
      axis: SupportsIndex = ...,
-) -> Tuple[ndarray, Any]: ...
+) -> tuple[NDArray[Any], Any]: ...
  
+@overload
  def logspace(
-    start: _ArrayLikeNumber,
-    stop: _ArrayLikeNumber,
+    start: _ArrayLikeFloat_co,
+    stop: _ArrayLikeFloat_co,
      num: SupportsIndex = ...,
      endpoint: bool = ...,
-    base: _ArrayLikeNumber = ...,
+    base: _ArrayLikeFloat_co = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[floating[Any]]: ...
+@overload
+def logspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    base: _ArrayLikeComplex_co = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[complexfloating[Any, Any]]: ...
+@overload
+def logspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    base: _ArrayLikeComplex_co = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[_SCT]: ...
+@overload
+def logspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    base: _ArrayLikeComplex_co = ...,
      dtype: DTypeLike = ...,
      axis: SupportsIndex = ...,
-) -> ndarray: ...
+) -> NDArray[Any]: ...
  
+@overload
+def geomspace(
+    start: _ArrayLikeFloat_co,
+    stop: _ArrayLikeFloat_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[floating[Any]]: ...
+@overload
+def geomspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    dtype: None = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[complexfloating[Any, Any]]: ...
+@overload
+def geomspace(
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
+    num: SupportsIndex = ...,
+    endpoint: bool = ...,
+    dtype: _DTypeLike[_SCT] = ...,
+    axis: SupportsIndex = ...,
+) -> NDArray[_SCT]: ...
+@overload
  def geomspace(
-    start: _ArrayLikeNumber,
-    stop: _ArrayLikeNumber,
+    start: _ArrayLikeComplex_co,
+    stop: _ArrayLikeComplex_co,
      num: SupportsIndex = ...,
      endpoint: bool = ...,
      dtype: DTypeLike = ...,
      axis: SupportsIndex = ...,
-) -> ndarray: ...
+) -> NDArray[Any]: ...
  
  # Re-exported to `np.lib.function_base`
  def add_newdoc(
      place: str,
      obj: str,
-    doc: str | Tuple[str, str] | List[Tuple[str, str]],
+    doc: str | tuple[str, str] | list[tuple[str, str]],
      warn_on_python: bool = ...,
  ) -> None: ...
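The `linspace` overloads above use `Literal[False]` and `Literal[True]` on `retstep` so that a type checker can tell whether the call returns an array or an `(array, step)` tuple. The sketch below shows the same idea with plain Python lists. `toy_linspace` is a hypothetical helper, not NumPy's implementation, and it does not reproduce NumPy's handling of edge cases such as `num=1` or `endpoint=False`.

```python
from __future__ import annotations

from typing import Literal, overload


@overload
def toy_linspace(
    start: float, stop: float, num: int = ..., retstep: Literal[False] = ...
) -> list[float]: ...
@overload
def toy_linspace(
    start: float, stop: float, num: int = ..., *, retstep: Literal[True]
) -> tuple[list[float], float]: ...
def toy_linspace(start, stop, num=50, retstep=False):
    # Evenly spaced samples over the closed interval [start, stop].
    step = (stop - start) / (num - 1) if num > 1 else 0.0
    samples = [start + i * step for i in range(num)]
    # The flag decides the shape of the return value, which is exactly
    # what the two overloads describe to a type checker.
    return (samples, step) if retstep else samples
```

Making `retstep` keyword-only and required in the second overload keeps the two signatures from overlapping, so each call matches exactly one of them.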
diff --git a/numpy/core/getlimits.pyi b/numpy/core/getlimits.pyi
index 66d0629954d2cdefa06af22f60f7665848062533..da5e3c23ea724bfeca0d83ff2550febe1aade2f0 100644
--- a/numpy/core/getlimits.pyi
+++ b/numpy/core/getlimits.pyi
@@ -1,8 +1,6 @@
-from typing import List
-
  from numpy import (
      finfo as finfo,
      iinfo as iinfo,
  )
  
-__all__: List[str]
+__all__: list[str]
diff --git a/numpy/core/include/numpy/experimental_dtype_api.h b/numpy/core/include/numpy/experimental_dtype_api.h
index effa66baf4c2f7c330e7631e7f35fa9cf90ff2da..1dd6215e62213fdb4542f06286d2a7e1142209ef 100644
--- a/numpy/core/include/numpy/experimental_dtype_api.h
+++ b/numpy/core/include/numpy/experimental_dtype_api.h
@@ -24,6 +24,12 @@
   *     Register a new loop for a ufunc.  This uses the `PyArrayMethod_Spec`
   *     which must be filled in (see in-line comments).
   *
+ * - PyUFunc_AddWrappingLoop:
+ *
+ *     Register a new loop which reuses an existing one, but modifies the
+ *     result dtypes.  Please search the internal NumPy docs for more info
+ *     at this point.  (Used for physical units dtype.)
+ *
   * - PyUFunc_AddPromoter:
   *
   *     Register a new promoter for a ufunc.  A promoter is a function stored
@@ -58,6 +64,16 @@
   *     also promote C; where "promotes" means implements the promotion.
   *     (There are some exceptions for abstract DTypes)
   *
+ * - PyArray_GetDefaultDescr:
+ *
+ *     Given a DType class, returns the default instance (descriptor).
+ *     This is an inline function checking for `singleton` first and only
+ *     calls the `default_descr` function if necessary.
+ *
+ * - PyArray_DoubleDType, etc.:
+ *
+ *     Aliases to the DType classes for the builtin NumPy DTypes.
+ *
   * WARNING
   * =======
   *
@@ -82,6 +98,15 @@
   * The new DType API is designed in a way to make it potentially useful for
   * alternative "array-like" implementations.  This will require careful
   * exposure of details and functions and is not part of this experimental API.
+ *
+ * Brief (incompatibility) changelog
+ * =================================
+ *
+ * 2. None (only additions).
+ * 3. New `npy_intp *view_offset` argument for `resolve_descriptors`.
+ *    This replaces the `NPY_CAST_IS_VIEW` flag.  It can be set to 0 if the
+ *    operation is a view, and is pre-initialized to `NPY_MIN_INTP` indicating
+ *    that the operation is not a view.
   */
  
  #ifndef NUMPY_CORE_INCLUDE_NUMPY_EXPERIMENTAL_DTYPE_API_H_
@@ -92,20 +117,41 @@
  
  
  /*
- * Just a hack so I don't forget importing as much myself, I spend way too
- * much time noticing it the first time around :).
+ * There must be a better way?! -- Oh well, this is experimental
+ * (my issue with it, is that I cannot undef those helpers).
   */
-static void
-__not_imported(void)
-{
-    printf("*****\nCritical error, dtype API not imported\n*****\n");
-}
-static void *__uninitialized_table[] = {
-        &__not_imported, &__not_imported, &__not_imported, &__not_imported,
-        &__not_imported, &__not_imported, &__not_imported, &__not_imported};
+#if defined(PY_ARRAY_UNIQUE_SYMBOL)
+    #define NPY_EXP_DTYPE_API_CONCAT_HELPER2(x, y) x ## y
+    #define NPY_EXP_DTYPE_API_CONCAT_HELPER(arg) NPY_EXP_DTYPE_API_CONCAT_HELPER2(arg, __experimental_dtype_api_table)
+    #define __experimental_dtype_api_table NPY_EXP_DTYPE_API_CONCAT_HELPER(PY_ARRAY_UNIQUE_SYMBOL)
+#else
+    #define __experimental_dtype_api_table __experimental_dtype_api_table
+#endif
+
+/* Support for correct multi-file projects: */
+#if defined(NO_IMPORT) || defined(NO_IMPORT_ARRAY)
+    extern void **__experimental_dtype_api_table;
+#else
+    /*
+     * Just a hack so I don't forget importing as much myself, I spend way too
+     * much time noticing it the first time around :).
+     */
+    static void
+    __not_imported(void)
+    {
+        printf("*****\nCritical error, dtype API not imported\n*****\n");
+    }
  
+    static void *__uninitialized_table[] = {
+            &__not_imported, &__not_imported, &__not_imported, &__not_imported,
+            &__not_imported, &__not_imported, &__not_imported, &__not_imported};
  
-static void **__experimental_dtype_api_table = __uninitialized_table;
+    #if defined(PY_ARRAY_UNIQUE_SYMBOL)
+        void **__experimental_dtype_api_table = __uninitialized_table;
+    #else
+        static void **__experimental_dtype_api_table = __uninitialized_table;
+    #endif
+#endif
  
  
  /*
@@ -132,7 +178,7 @@ typedef struct {
   * NOTE: Expected changes:
   *       * invert logic of floating point error flag
   *       * probably split runtime and general flags into two
- *       * should possibly not use an enum for typdef for more stable ABI?
+ *       * should possibly not use an enum for typedef for more stable ABI?
   */
  typedef enum {
      /* Flag for whether the GIL is required */
@@ -163,7 +209,7 @@ typedef struct {
      int nin, nout;
      NPY_CASTING casting;
      NPY_ARRAYMETHOD_FLAGS flags;
-    PyObject **dtypes;  /* array of DType class objects */
+    PyArray_DTypeMeta **dtypes;
      PyType_Slot *slots;
  } PyArrayMethod_Spec;
  
@@ -178,6 +224,21 @@ typedef PyObject *_ufunc_addloop_fromspec_func(
      (*(_ufunc_addloop_fromspec_func *)(__experimental_dtype_api_table[0]))
  
  
+/* Please see the NumPy definitions in `array_method.h` for details on these */
+typedef int translate_given_descrs_func(int nin, int nout,
+        PyArray_DTypeMeta *wrapped_dtypes[],
+        PyArray_Descr *given_descrs[], PyArray_Descr *new_descrs[]);
+typedef int translate_loop_descrs_func(int nin, int nout,
+        PyArray_DTypeMeta *new_dtypes[], PyArray_Descr *given_descrs[],
+        PyArray_Descr *original_descrs[], PyArray_Descr *loop_descrs[]);
+
+typedef int _ufunc_wrapping_loop_func(PyObject *ufunc_obj,
+        PyArray_DTypeMeta *new_dtypes[], PyArray_DTypeMeta *wrapped_dtypes[],
+        translate_given_descrs_func *translate_given_descrs,
+        translate_loop_descrs_func *translate_loop_descrs);
+#define PyUFunc_AddWrappingLoop \
+    (*(_ufunc_wrapping_loop_func *)(__experimental_dtype_api_table[7]))
+
  /*
   * Type of the C promoter function, which must be wrapped into a
   * PyCapsule with name "numpy._ufunc_promoter".
@@ -206,16 +267,6 @@ typedef int _ufunc_addpromoter_func(
  #define PyUFunc_AddPromoter \
      (*(_ufunc_addpromoter_func *)(__experimental_dtype_api_table[1]))
  
-/*
- * In addition to the normal casting levels, NPY_CAST_IS_VIEW indicates
- * that no cast operation is necessary at all (although a copy usually will be)
- *
- * NOTE: The most likely modification here is to add an additional
- *       `view_offset` output to resolve_descriptors.  If set, it would
- *       indicate both that it is a view and what offset to use.  This means that
- *       e.g. `arr.imag` could be implemented by an ArrayMethod.
- */
-#define NPY_CAST_IS_VIEW _NPY_CAST_IS_VIEW
  
  /*
   * The resolve descriptors function, must be able to handle NULL values for
@@ -236,7 +287,8 @@ typedef NPY_CASTING (resolve_descriptors_function)(
          /* Input descriptors (instances).  Outputs may be NULL. */
          PyArray_Descr **given_descrs,
          /* Exact loop descriptors to use, must not hold references on error */
-        PyArray_Descr **loop_descrs);
+        PyArray_Descr **loop_descrs,
+        npy_intp *view_offset);
  
  /* NOT public yet: Signature needs adapting as external API. */
  #define _NPY_METH_get_loop 2
@@ -334,6 +386,65 @@ typedef PyArray_DTypeMeta *__promote_dtype_sequence(
      ((__promote_dtype_sequence *)(__experimental_dtype_api_table[5]))
  
  
+typedef PyArray_Descr *__get_default_descr(
+        PyArray_DTypeMeta *DType);
+#define _PyArray_GetDefaultDescr \
+    ((__get_default_descr *)(__experimental_dtype_api_table[6]))
+
+static NPY_INLINE PyArray_Descr *
+PyArray_GetDefaultDescr(PyArray_DTypeMeta *DType)
+{
+    if (DType->singleton != NULL) {
+        Py_INCREF(DType->singleton);
+        return DType->singleton;
+    }
+    return _PyArray_GetDefaultDescr(DType);
+}
+
+
+/*
+ * NumPy's builtin DTypes:
+ */
+#define PyArray_BoolDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[10])
+/* Integers */
+#define PyArray_ByteDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[11])
+#define PyArray_UByteDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[12])
+#define PyArray_ShortDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[13])
+#define PyArray_UShortDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[14])
+#define PyArray_IntDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[15])
+#define PyArray_UIntDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[16])
+#define PyArray_LongDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[17])
+#define PyArray_ULongDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[18])
+#define PyArray_LongLongDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[19])
+#define PyArray_ULongLongDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[20])
+/* Integer aliases */
+#define PyArray_Int8Type (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[21])
+#define PyArray_UInt8DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[22])
+#define PyArray_Int16DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[23])
+#define PyArray_UInt16DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[24])
+#define PyArray_Int32DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[25])
+#define PyArray_UInt32DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[26])
+#define PyArray_Int64DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[27])
+#define PyArray_UInt64DType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[28])
+#define PyArray_IntpDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[29])
+#define PyArray_UIntpDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[30])
+/* Floats */
+#define PyArray_HalfType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[31])
+#define PyArray_FloatDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[32])
+#define PyArray_DoubleDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[33])
+#define PyArray_LongDoubleDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[34])
+/* Complex */
+#define PyArray_CFloatDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[35])
+#define PyArray_CDoubleDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[36])
+#define PyArray_CLongDoubleDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[37])
+/* String/Bytes */
+#define PyArray_StringDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[38])
+#define PyArray_UnicodeDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[39])
+/* Datetime/Timedelta */
+#define PyArray_DatetimeDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[40])
+#define PyArray_TimedeltaDType (*(PyArray_DTypeMeta *)__experimental_dtype_api_table[41])
+
+
  /*
   * ********************************
   *         Initialization
@@ -344,7 +455,9 @@ typedef PyArray_DTypeMeta *__promote_dtype_sequence(
   * runtime-check this.
   * You must call this function to use the symbols defined in this file.
   */
-#define __EXPERIMENTAL_DTYPE_VERSION 2
+#if !defined(NO_IMPORT) && !defined(NO_IMPORT_ARRAY)
+
+#define __EXPERIMENTAL_DTYPE_VERSION 4
  
  static int
  import_experimental_dtype_api(int version)
@@ -372,7 +485,7 @@ import_experimental_dtype_api(int version)
      if (api == NULL) {
          return -1;
      }
-    __experimental_dtype_api_table = PyCapsule_GetPointer(api,
+    __experimental_dtype_api_table = (void **)PyCapsule_GetPointer(api,
              "experimental_dtype_api_table");
      Py_DECREF(api);
  
@@ -383,4 +496,6 @@ import_experimental_dtype_api(int version)
      return 0;
  }
  
+#endif  /* !defined(NO_IMPORT) && !defined(NO_IMPORT_ARRAY) */
+
  #endif  /* NUMPY_CORE_INCLUDE_NUMPY_EXPERIMENTAL_DTYPE_API_H_ */
diff --git a/numpy/core/include/numpy/ndarrayobject.h b/numpy/core/include/numpy/ndarrayobject.h
index 2eb951486e82fc578efdb459f4614224a5e14dd1..aaaefd7defb3925a9b1aad916e05e6063f1fe57b 100644
--- a/numpy/core/include/numpy/ndarrayobject.h
+++ b/numpy/core/include/numpy/ndarrayobject.h
@@ -152,19 +152,16 @@ extern "C" {
                                              (k)*PyArray_STRIDES(obj)[2] + \
                                              (l)*PyArray_STRIDES(obj)[3]))
  
-/* Move to arrayobject.c once PyArray_XDECREF_ERR is removed */
  static NPY_INLINE void
  PyArray_DiscardWritebackIfCopy(PyArrayObject *arr)
  {
      PyArrayObject_fields *fa = (PyArrayObject_fields *)arr;
      if (fa && fa->base) {
-        if ((fa->flags & NPY_ARRAY_UPDATEIFCOPY) ||
-                (fa->flags & NPY_ARRAY_WRITEBACKIFCOPY)) {
+        if (fa->flags & NPY_ARRAY_WRITEBACKIFCOPY) {
              PyArray_ENABLEFLAGS((PyArrayObject*)fa->base, NPY_ARRAY_WRITEABLE);
              Py_DECREF(fa->base);
              fa->base = NULL;
              PyArray_CLEARFLAGS(arr, NPY_ARRAY_WRITEBACKIFCOPY);
-            PyArray_CLEARFLAGS(arr, NPY_ARRAY_UPDATEIFCOPY);
          }
      }
  }
@@ -246,20 +243,6 @@ NPY_TITLE_KEY_check(PyObject *key, PyObject *value)
  #define DEPRECATE(msg) PyErr_WarnEx(PyExc_DeprecationWarning,msg,1)
  #define DEPRECATE_FUTUREWARNING(msg) PyErr_WarnEx(PyExc_FutureWarning,msg,1)
  
-#if !defined(NPY_NO_DEPRECATED_API) || \
-    (NPY_NO_DEPRECATED_API < NPY_1_14_API_VERSION)
-static NPY_INLINE void
-PyArray_XDECREF_ERR(PyArrayObject *arr)
-{
-    /* 2017-Nov-10 1.14 */
-    DEPRECATE("PyArray_XDECREF_ERR is deprecated, call "
-        "PyArray_DiscardWritebackIfCopy then Py_XDECREF instead");
-    PyArray_DiscardWritebackIfCopy(arr);
-    Py_XDECREF(arr);
-}
-#endif
-
-
  #ifdef __cplusplus
  }
  #endif
diff --git a/numpy/core/include/numpy/ndarraytypes.h b/numpy/core/include/numpy/ndarraytypes.h
index 6240adc0c7f17ba310eb5961c91bf9ea236410b5..c295f34bb51a303fe0f86cbca1f3a76a67790c75 100644
--- a/numpy/core/include/numpy/ndarraytypes.h
+++ b/numpy/core/include/numpy/ndarraytypes.h
@@ -87,7 +87,7 @@ enum NPY_TYPES {    NPY_BOOL=0,
                      /* The number of types not including the new 1.6 types */
                      NPY_NTYPES_ABI_COMPATIBLE=21
  };
-#ifdef _MSC_VER
+#if defined(_MSC_VER) && !defined(__clang__)
  #pragma deprecated(NPY_CHAR)
  #endif
  
@@ -221,13 +221,6 @@ typedef enum {
          NPY_SAME_KIND_CASTING=3,
          /* Allow any casts */
          NPY_UNSAFE_CASTING=4,
-        /*
-         * Flag to allow signalling that a cast is a view, this flag is not
-         * valid when requesting a cast of specific safety.
-         * _NPY_CAST_IS_VIEW|NPY_EQUIV_CASTING means the same as NPY_NO_CASTING.
-         */
-        // TODO-DTYPES: Needs to be documented.
-        _NPY_CAST_IS_VIEW = 1 << 16,
  } NPY_CASTING;
  
  typedef enum {
@@ -841,11 +834,9 @@ typedef int (PyArray_FinalizeFunc)(PyArrayObject *, PyObject *);
   * 1-d array is C_CONTIGUOUS it is also F_CONTIGUOUS. Arrays with
   * more then one dimension can be C_CONTIGUOUS and F_CONTIGUOUS
   * at the same time if they have either zero or one element.
- * If NPY_RELAXED_STRIDES_CHECKING is set, a higher dimensional
- * array is always C_CONTIGUOUS and F_CONTIGUOUS if it has zero elements
- * and the array is contiguous if ndarray.squeeze() is contiguous.
- * I.e. dimensions for which `ndarray.shape[dimension] == 1` are
- * ignored.
+ * A higher dimensional array always has the same contiguity flags as
+ * `array.squeeze()`; dimensions with `array.shape[dimension] == 1` are
+ * effectively ignored when checking for contiguity.
   */
  
  /*
@@ -934,7 +925,6 @@ typedef int (PyArray_FinalizeFunc)(PyArrayObject *, PyObject *);
   * This flag may be requested in constructor functions.
   * This flag may be tested for in PyArray_FLAGS(arr).
   */
-#define NPY_ARRAY_UPDATEIFCOPY    0x1000 /* Deprecated in 1.14 */
  #define NPY_ARRAY_WRITEBACKIFCOPY 0x2000
  
  /*
@@ -965,14 +955,12 @@ typedef int (PyArray_FinalizeFunc)(PyArrayObject *, PyObject *);
  #define NPY_ARRAY_DEFAULT      (NPY_ARRAY_CARRAY)
  #define NPY_ARRAY_IN_ARRAY     (NPY_ARRAY_CARRAY_RO)
  #define NPY_ARRAY_OUT_ARRAY    (NPY_ARRAY_CARRAY)
-#define NPY_ARRAY_INOUT_ARRAY  (NPY_ARRAY_CARRAY | \
-                                NPY_ARRAY_UPDATEIFCOPY)
+#define NPY_ARRAY_INOUT_ARRAY  (NPY_ARRAY_CARRAY)
  #define NPY_ARRAY_INOUT_ARRAY2 (NPY_ARRAY_CARRAY | \
                                  NPY_ARRAY_WRITEBACKIFCOPY)
  #define NPY_ARRAY_IN_FARRAY    (NPY_ARRAY_FARRAY_RO)
  #define NPY_ARRAY_OUT_FARRAY   (NPY_ARRAY_FARRAY)
-#define NPY_ARRAY_INOUT_FARRAY (NPY_ARRAY_FARRAY | \
-                                NPY_ARRAY_UPDATEIFCOPY)
+#define NPY_ARRAY_INOUT_FARRAY (NPY_ARRAY_FARRAY)
  #define NPY_ARRAY_INOUT_FARRAY2 (NPY_ARRAY_FARRAY | \
                                  NPY_ARRAY_WRITEBACKIFCOPY)
  
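The rewritten contiguity comment in `ndarraytypes.h` says that an array has the same contiguity flags as `array.squeeze()`, so dimensions of length 1 are effectively ignored. The following is a rough pure-Python model of that rule for the C-contiguous case only. `is_c_contiguous` is a hypothetical helper written for illustration and is not the routine NumPy uses internally; strides are given in bytes.

```python
def is_c_contiguous(shape, strides, itemsize):
    """Model of a C-contiguity check that skips length-1 dimensions."""
    if any(dim == 0 for dim in shape):
        return True  # arrays with no elements count as contiguous
    expected = itemsize
    # Walk from the last (fastest-varying) axis to the first.
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim == 1:
            continue  # the stride of a length-1 axis is never used
        if stride != expected:
            return False
        expected *= dim
    return True
```

With 8-byte items, shape `(3, 1, 4)` and strides `(32, 999, 8)` pass the check because the middle stride is never consulted, matching the squeezed shape `(3, 4)` with strides `(32, 8)`.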
diff --git a/numpy/core/include/numpy/noprefix.h b/numpy/core/include/numpy/noprefix.h
index 2c0ce1420e2c895819659c02036d08032c469aa5..cea5b0d4678346249233435061e094297dea1979 100644
--- a/numpy/core/include/numpy/noprefix.h
+++ b/numpy/core/include/numpy/noprefix.h
@@ -165,7 +165,6 @@
  #define ALIGNED            NPY_ALIGNED
  #define NOTSWAPPED         NPY_NOTSWAPPED
  #define WRITEABLE          NPY_WRITEABLE
-#define UPDATEIFCOPY       NPY_UPDATEIFCOPY
  #define WRITEBACKIFCOPY    NPY_ARRAY_WRITEBACKIFCOPY
  #define ARR_HAS_DESCR      NPY_ARR_HAS_DESCR
  #define BEHAVED            NPY_BEHAVED
diff --git a/numpy/core/include/numpy/npy_1_7_deprecated_api.h b/numpy/core/include/numpy/npy_1_7_deprecated_api.h
index 4fd4015a991a8bfe1c853561087d56435e86cabc..6455d40d223b8a13c9903c95e7282b9621311414 100644
--- a/numpy/core/include/numpy/npy_1_7_deprecated_api.h
+++ b/numpy/core/include/numpy/npy_1_7_deprecated_api.h
@@ -48,7 +48,6 @@
  #define NPY_ALIGNED        NPY_ARRAY_ALIGNED
  #define NPY_NOTSWAPPED     NPY_ARRAY_NOTSWAPPED
  #define NPY_WRITEABLE      NPY_ARRAY_WRITEABLE
-#define NPY_UPDATEIFCOPY   NPY_ARRAY_UPDATEIFCOPY
  #define NPY_BEHAVED        NPY_ARRAY_BEHAVED
  #define NPY_BEHAVED_NS     NPY_ARRAY_BEHAVED_NS
  #define NPY_CARRAY         NPY_ARRAY_CARRAY
diff --git a/numpy/core/include/numpy/npy_3kcompat.h b/numpy/core/include/numpy/npy_3kcompat.h
index 22c103e93da95844b795bcadf8b7cc071f78bebe..11cc477655a4a766d02b512e0fd67370a063ec86 100644
--- a/numpy/core/include/numpy/npy_3kcompat.h
+++ b/numpy/core/include/numpy/npy_3kcompat.h
@@ -1,6 +1,8 @@
  /*
   * This is a convenience header file providing compatibility utilities
- * for supporting Python 2 and Python 3 in the same code base.
+ * for supporting different minor versions of Python 3.
+ * It was originally used to support the transition from Python 2,
+ * hence the "3k" naming.
   *
   * If you want to use this for your own projects, it's recommended to make a
   * copy of it. Although the stuff below is unlikely to change, we don't provide
diff --git a/numpy/core/include/numpy/npy_common.h b/numpy/core/include/numpy/npy_common.h
index 1d6234e20e0f19eeaf457a81fc4b9185d76e2118..2bcc45e4f67712fb10ce9ea802818ea017c96c1f 100644
--- a/numpy/core/include/numpy/npy_common.h
+++ b/numpy/core/include/numpy/npy_common.h
@@ -131,9 +131,10 @@
  #endif
  #endif
  
-#if defined(_MSC_VER)
-        #define NPY_INLINE __inline
-#elif defined(__GNUC__)
+#if defined(_MSC_VER) && !defined(__clang__)
+    #define NPY_INLINE __inline
+/* clang included here to handle clang-cl on Windows */
+#elif defined(__GNUC__) || defined(__clang__)
      #if defined(__STRICT_ANSI__)
           #define NPY_INLINE __inline__
      #else
diff --git a/numpy/core/include/numpy/npy_os.h b/numpy/core/include/numpy/npy_os.h
index efa0e4012f91b85dbfdc7818a9a2818fc8747962..6d335f75159b461256738f2508d6c50aeb2ae186 100644
--- a/numpy/core/include/numpy/npy_os.h
+++ b/numpy/core/include/numpy/npy_os.h
@@ -21,6 +21,10 @@
      #define NPY_OS_CYGWIN
  #elif defined(_WIN32) || defined(__WIN32__) || defined(WIN32)
      #define NPY_OS_WIN32
+#elif defined(_WIN64) || defined(__WIN64__) || defined(WIN64)
+    #define NPY_OS_WIN64
+#elif defined(__MINGW32__) || defined(__MINGW64__)
+    #define NPY_OS_MINGW
  #elif defined(__APPLE__)
      #define NPY_OS_DARWIN
  #else
diff --git a/numpy/core/include/numpy/numpyconfig.h b/numpy/core/include/numpy/numpyconfig.h
index e4c17f7e19a5c25e4987319c145d3f1fd4f63500..e0064382bc04fe8b4383a038a73ea718f4113b26 100644
--- a/numpy/core/include/numpy/numpyconfig.h
+++ b/numpy/core/include/numpy/numpyconfig.h
@@ -63,5 +63,6 @@
  #define NPY_1_20_API_VERSION 0x0000000e
  #define NPY_1_21_API_VERSION 0x0000000e
  #define NPY_1_22_API_VERSION 0x0000000f
+#define NPY_1_23_API_VERSION 0x00000010
  
  #endif  /* NUMPY_CORE_INCLUDE_NUMPY_NPY_NUMPYCONFIG_H_ */
diff --git a/numpy/core/include/numpy/random/distributions.h b/numpy/core/include/numpy/random/distributions.h
index dacf7782909f87e0cf877d06d9c5842148caaf97..78bd06ff52505dec1003b999a68877d928c61aca 100644
--- a/numpy/core/include/numpy/random/distributions.h
+++ b/numpy/core/include/numpy/random/distributions.h
@@ -28,7 +28,7 @@ extern "C" {
  #define RAND_INT_MAX INT64_MAX
  #endif
  
-#if defined(_MSC_VER) || defined(__CYGWIN__)
+#ifdef _MSC_VER
  #define DECLDIR __declspec(dllexport)
  #else
  #define DECLDIR extern
diff --git a/numpy/core/include/numpy/utils.h b/numpy/core/include/numpy/utils.h
index e2b57f9e508d47171c8361f44b53c5a50a5377d2..97f06092e54050baf3c2fc4372429cbd110429e8 100644
--- a/numpy/core/include/numpy/utils.h
+++ b/numpy/core/include/numpy/utils.h
@@ -24,7 +24,7 @@
  /* Use this to tag a variable as not used. It will remove unused variable
   * warning on support platforms (see __COM_NPY_UNUSED) and mangle the variable
   * to avoid accidental use */
-#define NPY_UNUSED(x) (__NPY_UNUSED_TAGGED ## x) __COMP_NPY_UNUSED
+#define NPY_UNUSED(x) __NPY_UNUSED_TAGGED ## x __COMP_NPY_UNUSED
  #define NPY_EXPAND(x) x
  
  #define NPY_STRINGIFY(x) #x
diff --git a/numpy/core/memmap.pyi b/numpy/core/memmap.pyi
index ba595bf1ef64c4bfc1c2fe4a67ac972206de28cb..03c6b772dcd52c87bb958329f5acecd0ed8c1092 100644
--- a/numpy/core/memmap.pyi
+++ b/numpy/core/memmap.pyi
@@ -1,5 +1,3 @@
-from typing import List
-
  from numpy import memmap as memmap
  
-__all__: List[str]
+__all__: list[str]
diff --git a/numpy/core/multiarray.py b/numpy/core/multiarray.py
index f88d75978697516973fb2d91392f5a92295833b6..ee88ce30b61889c322227c367a4fb3327070ddbf 100644
--- a/numpy/core/multiarray.py
+++ b/numpy/core/multiarray.py
@@ -14,9 +14,9 @@ from ._multiarray_umath import *  # noqa: F403
  # do not change them. issue gh-15518
  # _get_ndarray_c_version is semi-public, on purpose not added to __all__
  from ._multiarray_umath import (
-    _fastCopyAndTranspose, _flagdict, _from_dlpack, _insert, _reconstruct,
+    _fastCopyAndTranspose, _flagdict, from_dlpack, _insert, _reconstruct,
      _vec_string, _ARRAY_API, _monotonicity, _get_ndarray_c_version,
-    _set_madvise_hugepage,
+    _get_madvise_hugepage, _set_madvise_hugepage,
      )
  
  __all__ = [
@@ -24,7 +24,7 @@ __all__ = [
      'ITEM_HASOBJECT', 'ITEM_IS_POINTER', 'LIST_PICKLE', 'MAXDIMS',
      'MAY_SHARE_BOUNDS', 'MAY_SHARE_EXACT', 'NEEDS_INIT', 'NEEDS_PYAPI',
      'RAISE', 'USE_GETITEM', 'USE_SETITEM', 'WRAP', '_fastCopyAndTranspose',
-    '_flagdict', '_from_dlpack', '_insert', '_reconstruct', '_vec_string',
+    '_flagdict', 'from_dlpack', '_insert', '_reconstruct', '_vec_string',
      '_monotonicity', 'add_docstring', 'arange', 'array', 'asarray',
      'asanyarray', 'ascontiguousarray', 'asfortranarray', 'bincount',
      'broadcast', 'busday_count', 'busday_offset', 'busdaycalendar', 'can_cast',
@@ -47,7 +47,7 @@ _reconstruct.__module__ = 'numpy.core.multiarray'
  scalar.__module__ = 'numpy.core.multiarray'
  
  
-_from_dlpack.__module__ = 'numpy'
+from_dlpack.__module__ = 'numpy'
  arange.__module__ = 'numpy'
  array.__module__ = 'numpy'
  asarray.__module__ = 'numpy'
diff --git a/numpy/core/multiarray.pyi b/numpy/core/multiarray.pyi
index 5a8999582219dd04234541fdf56581c795db3666..1be58235788b27ec9008d81ead779eeea426a6a1 100644
--- a/numpy/core/multiarray.pyi
+++ b/numpy/core/multiarray.pyi
@@ -2,19 +2,12 @@
  
  import os
  import datetime as dt
+from collections.abc import Sequence, Callable, Iterable
  from typing import (
      Literal as L,
      Any,
-    Callable,
-    Iterable,
-    Optional,
      overload,
      TypeVar,
-    List,
-    Type,
-    Union,
-    Sequence,
-    Tuple,
      SupportsIndex,
      final,
      Final,
@@ -55,20 +48,20 @@ from numpy import (
      _NDIterOpFlagsKind,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      # Shapes
      _ShapeLike,
  
      # DTypes
      DTypeLike,
-    _SupportsDType,
+    _DTypeLike,
  
      # Arrays
      NDArray,
      ArrayLike,
-    _SupportsArray,
+    _ArrayLike,
+    _SupportsArrayFunc,
      _NestedSequence,
-    _FiniteNestedSequence,
      _ArrayLikeBool_co,
      _ArrayLikeUInt_co,
      _ArrayLikeInt_co,
@@ -90,14 +83,6 @@ _T_contra = TypeVar("_T_contra", contravariant=True)
  _SCT = TypeVar("_SCT", bound=generic)
  _ArrayType = TypeVar("_ArrayType", bound=NDArray[Any])
  
-# Subscriptable subsets of `npt.DTypeLike` and `npt.ArrayLike`
-_DTypeLike = Union[
-    dtype[_SCT],
-    Type[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
  # Valid time units
  _UnitKind = L[
      "Y",
@@ -125,9 +110,9 @@ _RollKind = L[  # `raise` is deliberately excluded
  
  class _SupportsLenAndGetItem(Protocol[_T_contra, _T_co]):
      def __len__(self) -> int: ...
-    def __getitem__(self, __key: _T_contra) -> _T_co: ...
+    def __getitem__(self, key: _T_contra, /) -> _T_co: ...
  
-__all__: List[str]
+__all__: list[str]
  
  ALLOW_THREADS: Final[int]  # 0 or 1 (system-specific)
  BUFSIZE: L[8192]
@@ -145,7 +130,7 @@ def empty_like(
      dtype: None = ...,
      order: _OrderKACF = ...,
      subok: bool = ...,
-    shape: Optional[_ShapeLike] = ...,
+    shape: None | _ShapeLike = ...,
  ) -> _ArrayType: ...
  @overload
  def empty_like(
@@ -153,7 +138,7 @@ def empty_like(
      dtype: None = ...,
      order: _OrderKACF = ...,
      subok: bool = ...,
-    shape: Optional[_ShapeLike] = ...,
+    shape: None | _ShapeLike = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def empty_like(
@@ -161,7 +146,7 @@ def empty_like(
      dtype: None = ...,
      order: _OrderKACF = ...,
      subok: bool = ...,
-    shape: Optional[_ShapeLike] = ...,
+    shape: None | _ShapeLike = ...,
  ) -> NDArray[Any]: ...
  @overload
  def empty_like(
@@ -169,7 +154,7 @@ def empty_like(
      dtype: _DTypeLike[_SCT],
      order: _OrderKACF = ...,
      subok: bool = ...,
-    shape: Optional[_ShapeLike] = ...,
+    shape: None | _ShapeLike = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def empty_like(
@@ -177,7 +162,7 @@ def empty_like(
      dtype: DTypeLike,
      order: _OrderKACF = ...,
      subok: bool = ...,
-    shape: Optional[_ShapeLike] = ...,
+    shape: None | _ShapeLike = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -189,7 +174,7 @@ def array(
      order: _OrderKACF = ...,
      subok: L[True],
      ndmin: int = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> _ArrayType: ...
  @overload
  def array(
@@ -200,7 +185,7 @@ def array(
      order: _OrderKACF = ...,
      subok: bool = ...,
      ndmin: int = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def array(
@@ -211,7 +196,7 @@ def array(
      order: _OrderKACF = ...,
      subok: bool = ...,
      ndmin: int = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def array(
@@ -222,7 +207,7 @@ def array(
      order: _OrderKACF = ...,
      subok: bool = ...,
      ndmin: int = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def array(
@@ -233,7 +218,7 @@ def array(
      order: _OrderKACF = ...,
      subok: bool = ...,
      ndmin: int = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -242,7 +227,7 @@ def zeros(
      dtype: None = ...,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def zeros(
@@ -250,7 +235,7 @@ def zeros(
      dtype: _DTypeLike[_SCT],
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def zeros(
@@ -258,7 +243,7 @@ def zeros(
      dtype: DTypeLike,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -267,7 +252,7 @@ def empty(
      dtype: None = ...,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def empty(
@@ -275,7 +260,7 @@ def empty(
      dtype: _DTypeLike[_SCT],
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def empty(
@@ -283,7 +268,7 @@ def empty(
      dtype: DTypeLike,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -291,26 +276,26 @@ def unravel_index(  # type: ignore[misc]
      indices: _IntLike_co,
      shape: _ShapeLike,
      order: _OrderCF = ...,
-) -> Tuple[intp, ...]: ...
+) -> tuple[intp, ...]: ...
  @overload
  def unravel_index(
      indices: _ArrayLikeInt_co,
      shape: _ShapeLike,
      order: _OrderCF = ...,
-) -> Tuple[NDArray[intp], ...]: ...
+) -> tuple[NDArray[intp], ...]: ...
  
  @overload
  def ravel_multi_index(  # type: ignore[misc]
      multi_index: Sequence[_IntLike_co],
      dims: Sequence[SupportsIndex],
-    mode: Union[_ModeKind, Tuple[_ModeKind, ...]] = ...,
+    mode: _ModeKind | tuple[_ModeKind, ...] = ...,
      order: _OrderCF = ...,
  ) -> intp: ...
  @overload
  def ravel_multi_index(
      multi_index: Sequence[_ArrayLikeInt_co],
      dims: Sequence[SupportsIndex],
-    mode: Union[_ModeKind, Tuple[_ModeKind, ...]] = ...,
+    mode: _ModeKind | tuple[_ModeKind, ...] = ...,
      order: _OrderCF = ...,
  ) -> NDArray[intp]: ...
  
@@ -319,51 +304,51 @@ def ravel_multi_index(
  def concatenate(  # type: ignore[misc]
      arrays: _ArrayLike[_SCT],
      /,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
      out: None = ...,
      *,
      dtype: None = ...,
-    casting: Optional[_CastingKind] = ...
+    casting: None | _CastingKind = ...
  ) -> NDArray[_SCT]: ...
  @overload
  def concatenate(  # type: ignore[misc]
      arrays: _SupportsLenAndGetItem[int, ArrayLike],
      /,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
      out: None = ...,
      *,
      dtype: None = ...,
-    casting: Optional[_CastingKind] = ...
+    casting: None | _CastingKind = ...
  ) -> NDArray[Any]: ...
  @overload
  def concatenate(  # type: ignore[misc]
      arrays: _SupportsLenAndGetItem[int, ArrayLike],
      /,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
      out: None = ...,
      *,
      dtype: _DTypeLike[_SCT],
-    casting: Optional[_CastingKind] = ...
+    casting: None | _CastingKind = ...
  ) -> NDArray[_SCT]: ...
  @overload
  def concatenate(  # type: ignore[misc]
      arrays: _SupportsLenAndGetItem[int, ArrayLike],
      /,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
      out: None = ...,
      *,
      dtype: DTypeLike,
-    casting: Optional[_CastingKind] = ...
+    casting: None | _CastingKind = ...
  ) -> NDArray[Any]: ...
  @overload
  def concatenate(
      arrays: _SupportsLenAndGetItem[int, ArrayLike],
      /,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
      out: _ArrayType = ...,
      *,
      dtype: DTypeLike = ...,
-    casting: Optional[_CastingKind] = ...
+    casting: None | _CastingKind = ...
  ) -> _ArrayType: ...
  
  def inner(
@@ -376,7 +361,7 @@ def inner(
  def where(
      condition: ArrayLike,
      /,
-) -> Tuple[NDArray[intp], ...]: ...
+) -> tuple[NDArray[intp], ...]: ...
  @overload
  def where(
      condition: ArrayLike,
@@ -387,13 +372,13 @@ def where(
  
  def lexsort(
      keys: ArrayLike,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
  ) -> Any: ...
  
  def can_cast(
-    from_: Union[ArrayLike, DTypeLike],
+    from_: ArrayLike | DTypeLike,
      to: DTypeLike,
-    casting: Optional[_CastingKind] = ...,
+    casting: None | _CastingKind = ...,
  ) -> bool: ...
  
  def min_scalar_type(
@@ -401,7 +386,7 @@ def min_scalar_type(
  ) -> dtype[Any]: ...
  
  def result_type(
-    *arrays_and_dtypes: Union[ArrayLike, DTypeLike],
+    *arrays_and_dtypes: ArrayLike | DTypeLike,
  ) -> dtype[Any]: ...
  
  @overload
@@ -429,15 +414,15 @@ def vdot(a: Any, b: _ArrayLikeObject_co, /) -> Any: ...
  def bincount(
      x: ArrayLike,
      /,
-    weights: Optional[ArrayLike] = ...,
+    weights: None | ArrayLike = ...,
      minlength: SupportsIndex = ...,
  ) -> NDArray[intp]: ...
  
  def copyto(
      dst: NDArray[Any],
      src: ArrayLike,
-    casting: Optional[_CastingKind] = ...,
-    where: Optional[_ArrayLikeBool_co] = ...,
+    casting: None | _CastingKind = ...,
+    where: None | _ArrayLikeBool_co = ...,
  ) -> None: ...
  
  def putmask(
@@ -449,15 +434,15 @@ def putmask(
  def packbits(
      a: _ArrayLikeInt_co,
      /,
-    axis: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
      bitorder: L["big", "little"] = ...,
  ) -> NDArray[uint8]: ...
  
  def unpackbits(
      a: _ArrayLike[uint8],
      /,
-    axis: Optional[SupportsIndex] = ...,
-    count: Optional[SupportsIndex] = ...,
+    axis: None | SupportsIndex = ...,
+    count: None | SupportsIndex = ...,
      bitorder: L["big", "little"] = ...,
  ) -> NDArray[uint8]: ...
  
@@ -465,14 +450,14 @@ def shares_memory(
      a: object,
      b: object,
      /,
-    max_work: Optional[int] = ...,
+    max_work: None | int = ...,
  ) -> bool: ...
  
  def may_share_memory(
      a: object,
      b: object,
      /,
-    max_work: Optional[int] = ...,
+    max_work: None | int = ...,
  ) -> bool: ...
  
  @overload
@@ -481,7 +466,7 @@ def asarray(
      dtype: None = ...,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def asarray(
@@ -489,7 +474,7 @@ def asarray(
      dtype: None = ...,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def asarray(
@@ -497,7 +482,7 @@ def asarray(
      dtype: _DTypeLike[_SCT],
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def asarray(
@@ -505,7 +490,7 @@ def asarray(
      dtype: DTypeLike,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -514,7 +499,7 @@ def asanyarray(
      dtype: None = ...,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> _ArrayType: ...
  @overload
  def asanyarray(
@@ -522,7 +507,7 @@ def asanyarray(
      dtype: None = ...,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def asanyarray(
@@ -530,7 +515,7 @@ def asanyarray(
      dtype: None = ...,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def asanyarray(
@@ -538,7 +523,7 @@ def asanyarray(
      dtype: _DTypeLike[_SCT],
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def asanyarray(
@@ -546,7 +531,7 @@ def asanyarray(
      dtype: DTypeLike,
      order: _OrderKACF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -554,28 +539,28 @@ def ascontiguousarray(
      a: _ArrayLike[_SCT],
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def ascontiguousarray(
      a: object,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def ascontiguousarray(
      a: Any,
      dtype: _DTypeLike[_SCT],
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def ascontiguousarray(
      a: Any,
      dtype: DTypeLike,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -583,34 +568,34 @@ def asfortranarray(
      a: _ArrayLike[_SCT],
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def asfortranarray(
      a: object,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def asfortranarray(
      a: Any,
      dtype: _DTypeLike[_SCT],
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def asfortranarray(
      a: Any,
      dtype: DTypeLike,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
-# In practice `List[Any]` is list with an int, int and a valid
+# In practice `list[Any]` is list with an int, int and a valid
  # `np.seterrcall()` object
-def geterrobj() -> List[Any]: ...
-def seterrobj(errobj: List[Any], /) -> None: ...
+def geterrobj() -> list[Any]: ...
+def seterrobj(errobj: list[Any], /) -> None: ...
  
  def promote_types(__type1: DTypeLike, __type2: DTypeLike) -> dtype[Any]: ...
  
@@ -622,7 +607,7 @@ def fromstring(
      count: SupportsIndex = ...,
      *,
      sep: str,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def fromstring(
@@ -631,7 +616,7 @@ def fromstring(
      count: SupportsIndex = ...,
      *,
      sep: str,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def fromstring(
@@ -640,7 +625,7 @@ def fromstring(
      count: SupportsIndex = ...,
      *,
      sep: str,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  def frompyfunc(
@@ -659,7 +644,7 @@ def fromfile(
      sep: str = ...,
      offset: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def fromfile(
@@ -669,7 +654,7 @@ def fromfile(
      sep: str = ...,
      offset: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def fromfile(
@@ -679,7 +664,7 @@ def fromfile(
      sep: str = ...,
      offset: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -688,7 +673,7 @@ def fromiter(
      dtype: _DTypeLike[_SCT],
      count: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def fromiter(
@@ -696,7 +681,7 @@ def fromiter(
      dtype: DTypeLike,
      count: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -706,7 +691,7 @@ def frombuffer(
      count: SupportsIndex = ...,
      offset: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def frombuffer(
@@ -715,7 +700,7 @@ def frombuffer(
      count: SupportsIndex = ...,
      offset: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def frombuffer(
@@ -724,7 +709,7 @@ def frombuffer(
      count: SupportsIndex = ...,
      offset: SupportsIndex = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -732,7 +717,7 @@ def arange(  # type: ignore[misc]
      stop: _IntLike_co,
      /, *,
      dtype: None = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[signedinteger[Any]]: ...
  @overload
  def arange(  # type: ignore[misc]
@@ -741,14 +726,14 @@ def arange(  # type: ignore[misc]
      step: _IntLike_co = ...,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[signedinteger[Any]]: ...
  @overload
  def arange(  # type: ignore[misc]
      stop: _FloatLike_co,
      /, *,
      dtype: None = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[floating[Any]]: ...
  @overload
  def arange(  # type: ignore[misc]
@@ -757,14 +742,14 @@ def arange(  # type: ignore[misc]
      step: _FloatLike_co = ...,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[floating[Any]]: ...
  @overload
  def arange(
      stop: _TD64Like_co,
      /, *,
      dtype: None = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[timedelta64]: ...
  @overload
  def arange(
@@ -773,7 +758,7 @@ def arange(
      step: _TD64Like_co = ...,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[timedelta64]: ...
  @overload
  def arange(  # both start and stop must always be specified for datetime64
@@ -782,14 +767,14 @@ def arange(  # both start and stop must always be specified for datetime64
      step: datetime64 = ...,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[datetime64]: ...
  @overload
  def arange(
      stop: Any,
      /, *,
      dtype: _DTypeLike[_SCT],
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def arange(
@@ -798,14 +783,14 @@ def arange(
      step: Any = ...,
      dtype: _DTypeLike[_SCT] = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def arange(
      stop: Any, /,
      *,
      dtype: DTypeLike,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def arange(
@@ -814,12 +799,12 @@ def arange(
      step: Any = ...,
      dtype: DTypeLike = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  def datetime_data(
      dtype: str | _DTypeLike[datetime64] | _DTypeLike[timedelta64], /,
-) -> Tuple[str, int]: ...
+) -> tuple[str, int]: ...
  
  # The datetime functions perform unsafe casts to `datetime64[D]`,
  # so a lot of different argument types are allowed here
@@ -1032,4 +1017,4 @@ def nested_iters(
      order: _OrderKACF = ...,
      casting: _CastingKind = ...,
      buffersize: SupportsIndex = ...,
-) -> Tuple[nditer, ...]: ...
+) -> tuple[nditer, ...]: ...
diff --git a/numpy/core/numeric.py b/numpy/core/numeric.py
index 344d40d934cfb50224c97a58c201d0531562093d..bb3cbf054b857f499c47768f4c91c957282aa458 100644
--- a/numpy/core/numeric.py
+++ b/numpy/core/numeric.py
@@ -13,7 +13,7 @@ from .multiarray import (
      WRAP, arange, array, asarray, asanyarray, ascontiguousarray,
      asfortranarray, broadcast, can_cast, compare_chararrays,
      concatenate, copyto, dot, dtype, empty,
-    empty_like, flatiter, frombuffer, _from_dlpack, fromfile, fromiter,
+    empty_like, flatiter, frombuffer, from_dlpack, fromfile, fromiter,
      fromstring, inner, lexsort, matmul, may_share_memory,
      min_scalar_type, ndarray, nditer, nested_iters, promote_types,
      putmask, result_type, set_numeric_ops, shares_memory, vdot, where,
@@ -41,7 +41,7 @@ __all__ = [
      'newaxis', 'ndarray', 'flatiter', 'nditer', 'nested_iters', 'ufunc',
      'arange', 'array', 'asarray', 'asanyarray', 'ascontiguousarray',
      'asfortranarray', 'zeros', 'count_nonzero', 'empty', 'broadcast', 'dtype',
-    'fromstring', 'fromfile', 'frombuffer', '_from_dlpack', 'where',
+    'fromstring', 'fromfile', 'frombuffer', 'from_dlpack', 'where',
      'argwhere', 'copyto', 'concatenate', 'fastCopyAndTranspose', 'lexsort',
      'set_numeric_ops', 'can_cast', 'promote_types', 'min_scalar_type',
      'result_type', 'isfortran', 'empty_like', 'zeros_like', 'ones_like',
@@ -136,7 +136,7 @@ def zeros_like(a, dtype=None, order='K', subok=True, shape=None):
  
      """
      res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape)
-    # needed instead of a 0 to get same result as zeros for for string dtypes
+    # needed instead of a 0 to get same result as zeros for string dtypes
      z = zeros(1, dtype=res.dtype)
      multiarray.copyto(res, z, casting='unsafe')
      return res
@@ -364,7 +364,7 @@ def full_like(a, fill_value, dtype=None, order='K', subok=True, shape=None):
      a : array_like
          The shape and data-type of `a` define these same attributes of
          the returned array.
-    fill_value : scalar
+    fill_value : array_like
          Fill value.
      dtype : data-type, optional
          Overrides the data type of the result.
@@ -412,6 +412,12 @@ def full_like(a, fill_value, dtype=None, order='K', subok=True, shape=None):
      >>> np.full_like(y, 0.1)
      array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
  
+    >>> y = np.zeros([2, 2, 3], dtype=int)
+    >>> np.full_like(y, [0, 0, 255])
+    array([[[  0,   0, 255],
+            [  0,   0, 255]],
+           [[  0,   0, 255],
+            [  0,   0, 255]]])
      """
      res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape)
      multiarray.copyto(res, fill_value, casting='unsafe')
@@ -627,7 +633,7 @@ def flatnonzero(a):
      """
      Return indices that are non-zero in the flattened version of a.
  
-    This is equivalent to np.nonzero(np.ravel(a))[0].
+    This is equivalent to ``np.nonzero(np.ravel(a))[0]``.
  
      Parameters
      ----------
@@ -637,7 +643,7 @@ def flatnonzero(a):
      Returns
      -------
      res : ndarray
-        Output array, containing the indices of the elements of `a.ravel()`
+        Output array, containing the indices of the elements of ``a.ravel()``
          that are non-zero.
  
      See Also
@@ -669,16 +675,16 @@ def _correlate_dispatcher(a, v, mode=None):
  
  @array_function_dispatch(_correlate_dispatcher)
  def correlate(a, v, mode='valid'):
-    """
+    r"""
      Cross-correlation of two 1-dimensional sequences.
  
      This function computes the correlation as generally defined in signal
-    processing texts::
+    processing texts:
  
-        c_{av}[k] = sum_n a[n+k] * conj(v[n])
+    .. math:: c_k = \sum_n a_{n+k} \cdot \overline{v_n}
  
-    with a and v sequences being zero-padded where necessary and conj being
-    the conjugate.
+    with a and v sequences being zero-padded where necessary and
+    :math:`\overline x` denoting complex conjugation.
  
      Parameters
      ----------
@@ -705,11 +711,11 @@ def correlate(a, v, mode='valid'):
      Notes
      -----
      The definition of correlation above is not unique and sometimes correlation
-    may be defined differently. Another common definition is::
+    may be defined differently. Another common definition is:
  
-        c'_{av}[k] = sum_n a[n] conj(v[n+k])
+    .. math:: c'_k = \sum_n a_{n} \cdot \overline{v_{n+k}}
  
-    which is related to ``c_{av}[k]`` by ``c'_{av}[k] = c_{av}[-k]``.
+    which is related to :math:`c_k` by :math:`c'_k = c_{-k}`.
  
      `numpy.correlate` may perform slowly in large arrays (i.e. n = 1e5) because it does
      not use the FFT to compute the convolution; in that case, `scipy.signal.correlate` might
@@ -731,8 +737,8 @@ def correlate(a, v, mode='valid'):
      array([ 0.5-0.5j,  1.0+0.j ,  1.5-1.5j,  3.0-1.j ,  0.0+0.j ])
  
      Note that you get the time reversed, complex conjugated result
-    when the two input sequences change places, i.e.,
-    ``c_{va}[k] = c^{*}_{av}[-k]``:
+    (:math:`\overline{c_{-k}}`) when the two input sequences a and v change 
+    places:
  
      >>> np.correlate([0, 1, 0.5j], [1+1j, 2, 3-1j], 'full')
      array([ 0.0+0.j ,  3.0+1.j ,  1.5+1.5j,  1.0+0.j ,  0.5+0.5j])
@@ -798,7 +804,7 @@ def convolve(a, v, mode='full'):
      -----
      The discrete convolution operation is defined as
  
-    .. math:: (a * v)[n] = \\sum_{m = -\\infty}^{\\infty} a[m] v[n - m]
+    .. math:: (a * v)_n = \\sum_{m = -\\infty}^{\\infty} a_m v_{n - m}
  
      It can be shown that a convolution :math:`x(t) * y(t)` in time/space
      is equivalent to the multiplication :math:`X(f) Y(f)` in the Fourier
@@ -1561,7 +1567,7 @@ def cross(a, b, axisa=-1, axisb=-1, axisc=-1, axis=None):
      array(-3)
  
      Multiple vector cross-products. Note that the direction of the cross
-    product vector is defined by the `right-hand rule`.
+    product vector is defined by the *right-hand rule*.
  
      >>> x = np.array([[1,2,3], [4,5,6]])
      >>> y = np.array([[4,5,6], [1,2,3]])
@@ -1829,6 +1835,14 @@ def fromfunction(function, shape, *, dtype=float, like=None, **kwargs):
  
      Examples
      --------
+    >>> np.fromfunction(lambda i, j: i, (2, 2), dtype=float)
+    array([[0., 0.],
+           [1., 1.]])
+           
+    >>> np.fromfunction(lambda i, j: j, (2, 2), dtype=float)    
+    array([[0., 1.],
+           [0., 1.]])
+           
      >>> np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int)
      array([[ True, False, False],
             [False,  True, False],
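
Review note on the `correlate` docstring hunks above: the reworded sentence says that swapping the inputs gives the time-reversed, complex-conjugated result. A minimal pure-Python sketch can check this, assuming the definition c_k = sum_n a_{n+k} * conj(v_n) in 'full' mode. `correlate_full` is a hypothetical helper written for this note; it is not NumPy's implementation and makes no attempt at speed.

```python
def correlate_full(a, v):
    """Full cross-correlation c_k = sum_n a[n + k] * conj(v[n])."""
    a = [complex(x) for x in a]
    v = [complex(x) for x in v]
    out = []
    # k runs over every lag at which the two sequences overlap.
    for k in range(-(len(v) - 1), len(a)):
        total = 0j
        for n, v_n in enumerate(v):
            if 0 <= n + k < len(a):
                total += a[n + k] * v_n.conjugate()
        out.append(total)
    return out


# Same inputs as the docstring example.
a = [1 + 1j, 2, 3 - 1j]
v = [0, 1, 0.5j]
c_av = correlate_full(a, v)
c_va = correlate_full(v, a)

# Swapping the inputs should give the reversed, conjugated sequence.
swapped_matches = all(
    abs(x - y.conjugate()) < 1e-12 for x, y in zip(c_va, reversed(c_av))
)
```

With these inputs `c_av` should agree with the first array in the docstring and `c_va` with the second, so `swapped_matches` is expected to be true.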
diff --git a/numpy/core/numeric.pyi b/numpy/core/numeric.pyi

index d7ec303518a20cd3c0dd304b3a00b28c27528540..d09144f90fe1a11facb81752560db3306007ccc3 100644
--- a/numpy/core/numeric.pyi
+++ b/numpy/core/numeric.pyi
@@ -1,14 +1,9 @@
+from collections.abc import Callable, Sequence
  from typing import (
      Any,
-    Union,
-    Sequence,
-    Tuple,
-    Callable,
-    List,
      overload,
      TypeVar,
      Literal,
-    Type,
      SupportsAbs,
      SupportsIndex,
      NoReturn,
@@ -17,7 +12,6 @@ from typing_extensions import TypeGuard
  
  from numpy import (
      ComplexWarning as ComplexWarning,
-    dtype,
      generic,
      unsignedinteger,
      signedinteger,
@@ -33,14 +27,14 @@ from numpy import (
      _OrderCF,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      NDArray,
      DTypeLike,
      _ShapeLike,
-    _SupportsDType,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _DTypeLike,
+    _ArrayLike,
+    _SupportsArrayFunc,
      _ScalarLike_co,
      _ArrayLikeBool_co,
      _ArrayLikeUInt_co,
@@ -55,15 +49,9 @@ _T = TypeVar("_T")
  _SCT = TypeVar("_SCT", bound=generic)
  _ArrayType = TypeVar("_ArrayType", bound=NDArray[Any])
  
-_DTypeLike = Union[
-    dtype[_SCT],
-    Type[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
  _CorrelateMode = Literal["valid", "same", "full"]
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def zeros_like(
@@ -112,7 +100,7 @@ def ones(
      dtype: None = ...,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def ones(
@@ -120,7 +108,7 @@ def ones(
      dtype: _DTypeLike[_SCT],
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def ones(
@@ -128,7 +116,7 @@ def ones(
      dtype: DTypeLike,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -179,7 +167,7 @@ def full(
      dtype: None = ...,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  @overload
  def full(
@@ -188,7 +176,7 @@ def full(
      dtype: _DTypeLike[_SCT],
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def full(
@@ -197,7 +185,7 @@ def full(
      dtype: DTypeLike,
      order: _OrderCF = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -406,43 +394,43 @@ def outer(
  def tensordot(
      a: _ArrayLikeBool_co,
      b: _ArrayLikeBool_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[bool_]: ...
  @overload
  def tensordot(
      a: _ArrayLikeUInt_co,
      b: _ArrayLikeUInt_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[unsignedinteger[Any]]: ...
  @overload
  def tensordot(
      a: _ArrayLikeInt_co,
      b: _ArrayLikeInt_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[signedinteger[Any]]: ...
  @overload
  def tensordot(
      a: _ArrayLikeFloat_co,
      b: _ArrayLikeFloat_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[floating[Any]]: ...
  @overload
  def tensordot(
      a: _ArrayLikeComplex_co,
      b: _ArrayLikeComplex_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[complexfloating[Any, Any]]: ...
  @overload
  def tensordot(
      a: _ArrayLikeTD64_co,
      b: _ArrayLikeTD64_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[timedelta64]: ...
  @overload
  def tensordot(
      a: _ArrayLikeObject_co,
      b: _ArrayLikeObject_co,
-    axes: int | Tuple[_ShapeLike, _ShapeLike] = ...,
+    axes: int | tuple[_ShapeLike, _ShapeLike] = ...,
  ) -> NDArray[object_]: ...
  
  @overload
@@ -528,15 +516,15 @@ def cross(
  @overload
  def indices(
      dimensions: Sequence[int],
-    dtype: Type[int] = ...,
+    dtype: type[int] = ...,
      sparse: Literal[False] = ...,
  ) -> NDArray[int_]: ...
  @overload
  def indices(
      dimensions: Sequence[int],
-    dtype: Type[int] = ...,
+    dtype: type[int] = ...,
      sparse: Literal[True] = ...,
-) -> Tuple[NDArray[int_], ...]: ...
+) -> tuple[NDArray[int_], ...]: ...
  @overload
  def indices(
      dimensions: Sequence[int],
@@ -548,7 +536,7 @@ def indices(
      dimensions: Sequence[int],
      dtype: _DTypeLike[_SCT],
      sparse: Literal[True],
-) -> Tuple[NDArray[_SCT], ...]: ...
+) -> tuple[NDArray[_SCT], ...]: ...
  @overload
  def indices(
      dimensions: Sequence[int],
@@ -560,14 +548,14 @@ def indices(
      dimensions: Sequence[int],
      dtype: DTypeLike,
      sparse: Literal[True],
-) -> Tuple[NDArray[Any], ...]: ...
+) -> tuple[NDArray[Any], ...]: ...
  
  def fromfunction(
      function: Callable[..., _T],
      shape: Sequence[int],
      *,
      dtype: DTypeLike = ...,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
      **kwargs: Any,
  ) -> _T: ...
  
@@ -588,21 +576,21 @@ def identity(
      n: int,
      dtype: None = ...,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def identity(
      n: int,
      dtype: _DTypeLike[_SCT],
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def identity(
      n: int,
      dtype: DTypeLike,
      *,
-    like: ArrayLike = ...,
+    like: _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  def allclose(
diff --git a/numpy/core/numerictypes.py b/numpy/core/numerictypes.py

index 8e5de852bcff88cad299af61a3b9e6d85f787d9f..3d1cb6fd19c0cbddb8476800b3fc4d0aa3d2c257 100644
--- a/numpy/core/numerictypes.py
+++ b/numpy/core/numerictypes.py
@@ -516,7 +516,7 @@ def _scalar_type_key(typ):
      return (dt.kind.lower(), dt.itemsize)
  
  
-ScalarType = [int, float, complex, int, bool, bytes, str, memoryview]
+ScalarType = [int, float, complex, bool, bytes, str, memoryview]
  ScalarType += sorted(_concrete_types, key=_scalar_type_key)
  ScalarType = tuple(ScalarType)
  
diff --git a/numpy/core/numerictypes.pyi b/numpy/core/numerictypes.pyi

index 1d3ff773bbd71166a8d37566113684f37c94efc1..d10e4822ad09014333f107a7add59792195d8afe 100644
--- a/numpy/core/numerictypes.pyi
+++ b/numpy/core/numerictypes.pyi
@@ -1,16 +1,12 @@
  import sys
  import types
+from collections.abc import Iterable
  from typing import (
      Literal as L,
-    Type,
      Union,
-    Tuple,
      overload,
      Any,
      TypeVar,
-    Dict,
-    List,
-    Iterable,
      Protocol,
      TypedDict,
  )
@@ -50,18 +46,11 @@ from numpy.core._type_aliases import (
      sctypes as sctypes,
  )
  
-from numpy.typing import DTypeLike, ArrayLike, _SupportsDType
+from numpy._typing import DTypeLike, ArrayLike, _DTypeLike
  
  _T = TypeVar("_T")
  _SCT = TypeVar("_SCT", bound=generic)
  
-# A paramtrizable subset of `npt.DTypeLike`
-_DTypeLike = Union[
-    Type[_SCT],
-    dtype[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
-
  class _CastFunc(Protocol):
      def __call__(
          self, x: ArrayLike, k: DTypeLike = ...
@@ -78,48 +67,48 @@ class _TypeCodes(TypedDict):
      Datetime: L['Mm']
      All: L['?bhilqpBHILQPefdgFDGSUVOMm']
  
-class _typedict(Dict[Type[generic], _T]):
+class _typedict(dict[type[generic], _T]):
      def __getitem__(self, key: DTypeLike) -> _T: ...
  
  if sys.version_info >= (3, 10):
      _TypeTuple = Union[
-        Type[Any],
+        type[Any],
          types.UnionType,
-        Tuple[Union[Type[Any], types.UnionType, Tuple[Any, ...]], ...],
+        tuple[Union[type[Any], types.UnionType, tuple[Any, ...]], ...],
      ]
  else:
      _TypeTuple = Union[
-        Type[Any],
-        Tuple[Union[Type[Any], Tuple[Any, ...]], ...],
+        type[Any],
+        tuple[Union[type[Any], tuple[Any, ...]], ...],
      ]
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
-def maximum_sctype(t: _DTypeLike[_SCT]) -> Type[_SCT]: ...
+def maximum_sctype(t: _DTypeLike[_SCT]) -> type[_SCT]: ...
  @overload
-def maximum_sctype(t: DTypeLike) -> Type[Any]: ...
+def maximum_sctype(t: DTypeLike) -> type[Any]: ...
  
  @overload
-def issctype(rep: dtype[Any] | Type[Any]) -> bool: ...
+def issctype(rep: dtype[Any] | type[Any]) -> bool: ...
  @overload
  def issctype(rep: object) -> L[False]: ...
  
  @overload
-def obj2sctype(rep: _DTypeLike[_SCT], default: None = ...) -> None | Type[_SCT]: ...
+def obj2sctype(rep: _DTypeLike[_SCT], default: None = ...) -> None | type[_SCT]: ...
  @overload
-def obj2sctype(rep: _DTypeLike[_SCT], default: _T) -> _T | Type[_SCT]: ...
+def obj2sctype(rep: _DTypeLike[_SCT], default: _T) -> _T | type[_SCT]: ...
  @overload
-def obj2sctype(rep: DTypeLike, default: None = ...) -> None | Type[Any]: ...
+def obj2sctype(rep: DTypeLike, default: None = ...) -> None | type[Any]: ...
  @overload
-def obj2sctype(rep: DTypeLike, default: _T) -> _T | Type[Any]: ...
+def obj2sctype(rep: DTypeLike, default: _T) -> _T | type[Any]: ...
  @overload
  def obj2sctype(rep: object, default: None = ...) -> None: ...
  @overload
  def obj2sctype(rep: object, default: _T) -> _T: ...
  
  @overload
-def issubclass_(arg1: Type[Any], arg2: _TypeTuple) -> bool: ...
+def issubclass_(arg1: type[Any], arg2: _TypeTuple) -> bool: ...
  @overload
  def issubclass_(arg1: object, arg2: object) -> L[False]: ...
  
@@ -137,37 +126,36 @@ def find_common_type(
  cast: _typedict[_CastFunc]
  nbytes: _typedict[int]
  typecodes: _TypeCodes
-ScalarType: Tuple[
-    Type[int],
-    Type[float],
-    Type[complex],
-    Type[int],
-    Type[bool],
-    Type[bytes],
-    Type[str],
-    Type[memoryview],
-    Type[bool_],
-    Type[csingle],
-    Type[cdouble],
-    Type[clongdouble],
-    Type[half],
-    Type[single],
-    Type[double],
-    Type[longdouble],
-    Type[byte],
-    Type[short],
-    Type[intc],
-    Type[int_],
-    Type[longlong],
-    Type[timedelta64],
-    Type[datetime64],
-    Type[object_],
-    Type[bytes_],
-    Type[str_],
-    Type[ubyte],
-    Type[ushort],
-    Type[uintc],
-    Type[uint],
-    Type[ulonglong],
-    Type[void],
+ScalarType: tuple[
+    type[int],
+    type[float],
+    type[complex],
+    type[bool],
+    type[bytes],
+    type[str],
+    type[memoryview],
+    type[bool_],
+    type[csingle],
+    type[cdouble],
+    type[clongdouble],
+    type[half],
+    type[single],
+    type[double],
+    type[longdouble],
+    type[byte],
+    type[short],
+    type[intc],
+    type[int_],
+    type[longlong],
+    type[timedelta64],
+    type[datetime64],
+    type[object_],
+    type[bytes_],
+    type[str_],
+    type[ubyte],
+    type[ushort],
+    type[uintc],
+    type[uint],
+    type[ulonglong],
+    type[void],
  ]
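
Review note on the stub changes above: most of this hunk replaces `typing.Tuple`, `typing.List`, `typing.Dict` and `typing.Type` with the builtin generics `tuple`, `list`, `dict` and `type`, and imports `Iterable` from `collections.abc`. The sketch below shows the same spelling in ordinary runtime code, assuming Python 3.9 or newer (stub files are not executed, so they are not bound by that limit). `split_head` and `scalar_types` are hypothetical names used only for illustration.

```python
from collections.abc import Iterable, Sequence


def split_head(items: Sequence[int]) -> tuple[int, list[int]]:
    """Return the first element and the remaining elements as a list."""
    return items[0], list(items[1:])


def scalar_types(values: Iterable[object]) -> tuple[type[object], ...]:
    """Return the type of each value, in order."""
    return tuple(type(v) for v in values)


head, rest = split_head([3, 4, 5])
kinds = scalar_types([1, 2.0, "x"])
```

The annotations carry the same meaning as the older `Tuple[int, List[int]]` and `Tuple[Type[object], ...]` forms; only the spelling changes.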
diff --git a/numpy/core/overrides.py b/numpy/core/overrides.py

index 840cf38c9ccbb55701eb6c97de0078d623062279..cb550152ebede227848a8b503bedc26d3fec331f 100644
--- a/numpy/core/overrides.py
+++ b/numpy/core/overrides.py
@@ -12,7 +12,7 @@ ARRAY_FUNCTION_ENABLED = bool(
      int(os.environ.get('NUMPY_EXPERIMENTAL_ARRAY_FUNCTION', 1)))
  
  array_function_like_doc = (
-    """like : array_like
+    """like : array_like, optional
          Reference object to allow the creation of arrays which are not
          NumPy arrays. If an array-like passed in as ``like`` supports
          the ``__array_function__`` protocol, the result will be defined
diff --git a/numpy/core/records.pyi b/numpy/core/records.pyi

index 172bab3eeea7ad2f31b5592184de7d07f3716a5d..d3bbe0e70eefc7d8c1a6643fca240c64c33b77d5 100644
--- a/numpy/core/records.pyi
+++ b/numpy/core/records.pyi
@@ -1,12 +1,9 @@
  import os
+from collections.abc import Sequence, Iterable
  from typing import (
-    List,
-    Sequence,
      Any,
      TypeVar,
-    Iterable,
      overload,
-    Tuple,
      Protocol,
  )
  
@@ -21,7 +18,7 @@ from numpy import (
      _SupportsBuffer,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      DTypeLike,
      NDArray,
@@ -39,7 +36,7 @@ class _SupportsReadInto(Protocol):
      def tell(self, /) -> int: ...
      def readinto(self, buffer: memoryview, /) -> int: ...
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def fromarrays(
@@ -67,7 +64,7 @@ def fromarrays(
  
  @overload
  def fromrecords(
-    recList: _ArrayLikeVoid_co | Tuple[Any, ...] | _NestedSequence[Tuple[Any, ...]],
+    recList: _ArrayLikeVoid_co | tuple[Any, ...] | _NestedSequence[tuple[Any, ...]],
      dtype: DTypeLike = ...,
      shape: None | _ShapeLike = ...,
      formats: None = ...,
@@ -78,7 +75,7 @@ def fromrecords(
  ) -> _RecArray[record]: ...
  @overload
  def fromrecords(
-    recList: _ArrayLikeVoid_co | Tuple[Any, ...] | _NestedSequence[Tuple[Any, ...]],
+    recList: _ArrayLikeVoid_co | tuple[Any, ...] | _NestedSequence[tuple[Any, ...]],
      dtype: None = ...,
      shape: None | _ShapeLike = ...,
      *,
diff --git a/numpy/core/setup.py b/numpy/core/setup.py

index 157c825d2a1e289f22356774249f4577399dfaea..6d1d8db4b9391b5df05147581d8b0d61a7fdcf5a 100644
--- a/numpy/core/setup.py
+++ b/numpy/core/setup.py
@@ -17,6 +17,10 @@ from setup_common import *  # noqa: F403
  # Set to True to enable relaxed strides checking. This (mostly) means
  # that `strides[dim]` is ignored if `shape[dim] == 1` when setting flags.
  NPY_RELAXED_STRIDES_CHECKING = (os.environ.get('NPY_RELAXED_STRIDES_CHECKING', "1") != "0")
+if not NPY_RELAXED_STRIDES_CHECKING:
+    raise SystemError(
+        "Support for NPY_RELAXED_STRIDES_CHECKING=0 has been removed as of "
+        "NumPy 1.23.  This error will eventually be removed entirely.")
  
  # Put NPY_RELAXED_STRIDES_DEBUG=1 in the environment if you want numpy to use a
  # bogus value for affected strides in order to help smoke out bad stride usage
@@ -526,13 +530,9 @@ def configuration(parent_package='',top_path=None):
              if can_link_svml():
                  moredefs.append(('NPY_CAN_LINK_SVML', 1))
  
-            # Use relaxed stride checking
-            if NPY_RELAXED_STRIDES_CHECKING:
-                moredefs.append(('NPY_RELAXED_STRIDES_CHECKING', 1))
-            else:
-                moredefs.append(('NPY_RELAXED_STRIDES_CHECKING', 0))
-
-            # Use bogus stride debug aid when relaxed strides are enabled
+            # Use bogus stride debug aid to flush out bugs where users use
+            # strides of dimensions with length 1 to index a full contiguous
+            # array.
              if NPY_RELAXED_STRIDES_DEBUG:
                  moredefs.append(('NPY_RELAXED_STRIDES_DEBUG', 1))
              else:
@@ -628,9 +628,6 @@ def configuration(parent_package='',top_path=None):
              moredefs.extend(cocache.check_ieee_macros(config_cmd)[1])
              moredefs.extend(cocache.check_complex(config_cmd, mathlibs)[1])
  
-            if NPY_RELAXED_STRIDES_CHECKING:
-                moredefs.append(('NPY_RELAXED_STRIDES_CHECKING', 1))
-
              if NPY_RELAXED_STRIDES_DEBUG:
                  moredefs.append(('NPY_RELAXED_STRIDES_DEBUG', 1))
  
@@ -772,29 +769,35 @@ def configuration(parent_package='',top_path=None):
  
      npymath_sources = [join('src', 'npymath', 'npy_math_internal.h.src'),
                         join('src', 'npymath', 'npy_math.c'),
+                       # join('src', 'npymath', 'ieee754.cpp'),
                         join('src', 'npymath', 'ieee754.c.src'),
                         join('src', 'npymath', 'npy_math_complex.c.src'),
                         join('src', 'npymath', 'halffloat.c')
                         ]
  
-    def gl_if_msvc(build_cmd):
-        """ Add flag if we are using MSVC compiler
+    def opts_if_msvc(build_cmd):
+        """ Add flags if we are using MSVC compiler
  
-        We can't see this in our scope, because we have not initialized the
-        distutils build command, so use this deferred calculation to run when
-        we are building the library.
+        We can't see `build_cmd` in our scope, because we have not initialized
+        the distutils build command, so use this deferred calculation to run
+        when we are building the library.
          """
-        if build_cmd.compiler.compiler_type == 'msvc':
-            # explicitly disable whole-program optimization
-            return ['/GL-']
-        return []
+        if build_cmd.compiler.compiler_type != 'msvc':
+            return []
+        # Explicitly disable whole-program optimization.
+        flags = ['/GL-']
+        # Disable voltbl section for vc142 to allow link using mingw-w64; see:
+        # https://github.com/matthew-brett/dll_investigation/issues/1#issuecomment-1100468171
+        if build_cmd.compiler_opt.cc_test_flags(['-d2VolatileMetadata-']):
+            flags.append('-d2VolatileMetadata-')
+        return flags
  
      config.add_installed_library('npymath',
              sources=npymath_sources + [get_mathlib_info],
              install_dir='lib',
              build_info={
                  'include_dirs' : [],  # empty list required for creating npy_math_internal.h
-                'extra_compiler_args': [gl_if_msvc],
+                'extra_compiler_args': [opts_if_msvc],
              })
      config.add_npy_pkg_config("npymath.ini.in", "lib/npy-pkg-config",
              subst_dict)
@@ -856,7 +859,7 @@ def configuration(parent_package='',top_path=None):
              join('src', 'common', 'ucsnarrow.c'),
              join('src', 'common', 'ufunc_override.c'),
              join('src', 'common', 'numpyos.c'),
-            join('src', 'common', 'npy_cpu_features.c.src'),
+            join('src', 'common', 'npy_cpu_features.c'),
              ]
  
      if os.environ.get('NPY_USE_BLAS_ILP64', "0") != "0":
@@ -883,7 +886,7 @@ def configuration(parent_package='',top_path=None):
      multiarray_deps = [
              join('src', 'multiarray', 'abstractdtypes.h'),
              join('src', 'multiarray', 'arrayobject.h'),
-            join('src', 'multiarray', 'arraytypes.h'),
+            join('src', 'multiarray', 'arraytypes.h.src'),
              join('src', 'multiarray', 'arrayfunction_override.h'),
              join('src', 'multiarray', 'array_coercion.h'),
              join('src', 'multiarray', 'array_method.h'),
@@ -919,6 +922,7 @@ def configuration(parent_package='',top_path=None):
              join('src', 'multiarray', 'typeinfo.h'),
              join('src', 'multiarray', 'usertypes.h'),
              join('src', 'multiarray', 'vdot.h'),
+            join('src', 'multiarray', 'textreading', 'readtext.h'),
              join('include', 'numpy', 'arrayobject.h'),
              join('include', 'numpy', '_neighborhood_iterator_imp.h'),
              join('include', 'numpy', 'npy_endian.h'),
@@ -944,7 +948,9 @@ def configuration(parent_package='',top_path=None):
              join('src', 'multiarray', 'abstractdtypes.c'),
              join('src', 'multiarray', 'alloc.c'),
              join('src', 'multiarray', 'arrayobject.c'),
+            join('src', 'multiarray', 'arraytypes.h.src'),
              join('src', 'multiarray', 'arraytypes.c.src'),
+            join('src', 'multiarray', 'argfunc.dispatch.c.src'),
              join('src', 'multiarray', 'array_coercion.c'),
              join('src', 'multiarray', 'array_method.c'),
              join('src', 'multiarray', 'array_assign_scalar.c'),
@@ -997,15 +1003,24 @@ def configuration(parent_package='',top_path=None):
              join('src', 'multiarray', 'usertypes.c'),
              join('src', 'multiarray', 'vdot.c'),
              join('src', 'common', 'npy_sort.h.src'),
-            join('src', 'npysort', 'quicksort.c.src'),
-            join('src', 'npysort', 'mergesort.c.src'),
-            join('src', 'npysort', 'timsort.c.src'),
-            join('src', 'npysort', 'heapsort.c.src'),
+            join('src', 'npysort', 'x86-qsort.dispatch.cpp'),
+            join('src', 'npysort', 'quicksort.cpp'),
+            join('src', 'npysort', 'mergesort.cpp'),
+            join('src', 'npysort', 'timsort.cpp'),
+            join('src', 'npysort', 'heapsort.cpp'),
              join('src', 'npysort', 'radixsort.cpp'),
-            join('src', 'common', 'npy_partition.h.src'),
-            join('src', 'npysort', 'selection.c.src'),
-            join('src', 'common', 'npy_binsearch.h.src'),
-            join('src', 'npysort', 'binsearch.c.src'),
+            join('src', 'common', 'npy_partition.h'),
+            join('src', 'npysort', 'selection.cpp'),
+            join('src', 'common', 'npy_binsearch.h'),
+            join('src', 'npysort', 'binsearch.cpp'),
+            join('src', 'multiarray', 'textreading', 'conversions.c'),
+            join('src', 'multiarray', 'textreading', 'field_types.c'),
+            join('src', 'multiarray', 'textreading', 'growth.c'),
+            join('src', 'multiarray', 'textreading', 'readtext.c'),
+            join('src', 'multiarray', 'textreading', 'rows.c'),
+            join('src', 'multiarray', 'textreading', 'stream_pyobject.c'),
+            join('src', 'multiarray', 'textreading', 'str_to_int.c'),
+            join('src', 'multiarray', 'textreading', 'tokenize.cpp'),
              ]
  
      #######################################################################
@@ -1024,6 +1039,21 @@ def configuration(parent_package='',top_path=None):
                                                   generate_umath.__file__))
          return []
  
+    def generate_umath_doc_header(ext, build_dir):
+        from numpy.distutils.misc_util import exec_mod_from_location
+
+        target = join(build_dir, header_dir, '_umath_doc_generated.h')
+        dir = os.path.dirname(target)
+        if not os.path.exists(dir):
+            os.makedirs(dir)
+
+        generate_umath_doc_py = join(codegen_dir, 'generate_umath_doc.py')
+        if newer(generate_umath_doc_py, target):
+            n = dot_join(config.name, 'generate_umath_doc')
+            generate_umath_doc = exec_mod_from_location(
+                '_'.join(n.split('.')), generate_umath_doc_py)
+            generate_umath_doc.write_code(target)
+
      umath_src = [
              join('src', 'umath', 'umathmodule.c'),
              join('src', 'umath', 'reduction.c'),
@@ -1035,15 +1065,19 @@ def configuration(parent_package='',top_path=None):
              join('src', 'umath', 'loops_unary_fp.dispatch.c.src'),
              join('src', 'umath', 'loops_arithm_fp.dispatch.c.src'),
              join('src', 'umath', 'loops_arithmetic.dispatch.c.src'),
+            join('src', 'umath', 'loops_minmax.dispatch.c.src'),
              join('src', 'umath', 'loops_trigonometric.dispatch.c.src'),
              join('src', 'umath', 'loops_umath_fp.dispatch.c.src'),
              join('src', 'umath', 'loops_exponent_log.dispatch.c.src'),
+            join('src', 'umath', 'loops_hyperbolic.dispatch.c.src'),
+            join('src', 'umath', 'loops_modulo.dispatch.c.src'),
              join('src', 'umath', 'matmul.h.src'),
              join('src', 'umath', 'matmul.c.src'),
              join('src', 'umath', 'clip.h'),
              join('src', 'umath', 'clip.cpp'),
              join('src', 'umath', 'dispatching.c'),
              join('src', 'umath', 'legacy_array_method.c'),
+            join('src', 'umath', 'wrapping_array_method.c'),
              join('src', 'umath', 'ufunc_object.c'),
              join('src', 'umath', 'extobj.c'),
              join('src', 'umath', 'scalarmath.c.src'),
@@ -1063,12 +1097,26 @@ def configuration(parent_package='',top_path=None):
              join('src', 'umath', 'simd.inc.src'),
              join('src', 'umath', 'override.h'),
              join(codegen_dir, 'generate_ufunc_api.py'),
+            join(codegen_dir, 'ufunc_docstrings.py'),
              ]
  
      svml_path = join('numpy', 'core', 'src', 'umath', 'svml')
      svml_objs = []
+    # The kernels listed below have been converted to universal intrinsics,
+    # which brings the performance benefits to all platforms rather than
+    # only AVX512 on Linux. The conversion causes no performance or accuracy
+    # regression; if anything, performance improves and the code is easier
+    # to maintain.
+    svml_filter = (
+        'svml_z0_tanh_d_la.s', 'svml_z0_tanh_s_la.s'
+    )
      if can_link_svml() and check_svml_submodule(svml_path):
          svml_objs = glob.glob(svml_path + '/**/*.s', recursive=True)
+        svml_objs = [o for o in svml_objs if not o.endswith(svml_filter)]
+
+        # The ordering of names returned by glob is undefined, so we sort
+        # to make builds reproducible.
+        svml_objs.sort()
  
      config.add_extension('_multiarray_umath',
                           # Forcing C language even though we have C++ sources.
@@ -1082,6 +1130,7 @@ def configuration(parent_package='',top_path=None):
                                    join(codegen_dir, 'generate_numpy_api.py'),
                                    join('*.py'),
                                    generate_umath_c,
+                                  generate_umath_doc_header,
                                    generate_ufunc_api,
                                   ],
                           depends=deps + multiarray_deps + umath_deps +
@@ -1098,7 +1147,7 @@ def configuration(parent_package='',top_path=None):
      config.add_extension('_umath_tests', sources=[
          join('src', 'umath', '_umath_tests.c.src'),
          join('src', 'umath', '_umath_tests.dispatch.c'),
-        join('src', 'common', 'npy_cpu_features.c.src'),
+        join('src', 'common', 'npy_cpu_features.c'),
      ])
  
      #######################################################################
@@ -1106,14 +1155,14 @@ def configuration(parent_package='',top_path=None):
      #######################################################################
  
      config.add_extension('_rational_tests',
-                    sources=[join('src', 'umath', '_rational_tests.c.src')])
+                    sources=[join('src', 'umath', '_rational_tests.c')])
  
      #######################################################################
      #                        struct_ufunc_test module                     #
      #######################################################################
  
      config.add_extension('_struct_ufunc_tests',
-                    sources=[join('src', 'umath', '_struct_ufunc_tests.c.src')])
+                    sources=[join('src', 'umath', '_struct_ufunc_tests.c')])
  
  
      #######################################################################
@@ -1121,14 +1170,14 @@ def configuration(parent_package='',top_path=None):
      #######################################################################
  
      config.add_extension('_operand_flag_tests',
-                    sources=[join('src', 'umath', '_operand_flag_tests.c.src')])
+                    sources=[join('src', 'umath', '_operand_flag_tests.c')])
  
      #######################################################################
      #                        SIMD module                                  #
      #######################################################################
  
      config.add_extension('_simd', sources=[
-        join('src', 'common', 'npy_cpu_features.c.src'),
+        join('src', 'common', 'npy_cpu_features.c'),
          join('src', '_simd', '_simd.c'),
          join('src', '_simd', '_simd_inc.h.src'),
          join('src', '_simd', '_simd_data.inc.src'),
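
Review note on the SVML hunk in `numpy/core/setup.py` above: the new code drops two assembly files by suffix and then sorts the remaining paths so that the build does not depend on the order in which `glob` happens to return them. A minimal sketch of that filter-then-sort step follows. The `found` list is a hypothetical stand-in for the `glob.glob` result; apart from the two filtered names, the file names in it are made up.

```python
# Hypothetical file list standing in for the unordered result of glob.glob().
found = [
    "svml/linux/avx512/svml_z0_tanh_s_la.s",
    "svml/linux/avx512/svml_z0_exp_d_la.s",
    "svml/linux/avx512/svml_z0_tanh_d_la.s",
    "svml/linux/avx512/svml_z0_cos_s_la.s",
]

# Same tuple as in the patch: str.endswith accepts a tuple of suffixes.
svml_filter = ("svml_z0_tanh_d_la.s", "svml_z0_tanh_s_la.s")

# Keep every path that does not end with one of the filtered names.
kept = [o for o in found if not o.endswith(svml_filter)]

# Sort in place so the result is the same on every machine.
kept.sort()
```

Passing the tuple directly to `endswith` avoids a loop over the suffixes, and sorting after filtering keeps the object list stable across file systems.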
diff --git a/numpy/core/setup_common.py b/numpy/core/setup_common.py

index 181c58fb1219a73979b906af43816508a0c0dd16..4b802de82846536de2468def015deb7c7512080d 100644
--- a/numpy/core/setup_common.py
+++ b/numpy/core/setup_common.py
@@ -45,22 +45,12 @@ C_ABI_VERSION = 0x01000009
  # 0x0000000e - 1.20.x
  # 0x0000000e - 1.21.x
  # 0x0000000f - 1.22.x
-C_API_VERSION = 0x0000000f
+# 0x00000010 - 1.23.x
+C_API_VERSION = 0x00000010
  
  class MismatchCAPIWarning(Warning):
      pass
  
-def is_released(config):
-    """Return True if a released version of numpy is detected."""
-    from distutils.version import LooseVersion
-
-    v = config.get_version('../_version.py')
-    if v is None:
-        raise ValueError("Could not get version")
-    pv = LooseVersion(vstring=v).version
-    if len(pv) > 3:
-        return False
-    return True
  
  def get_api_versions(apiversion, codegen_dir):
      """
@@ -193,6 +183,8 @@ OPTIONAL_FUNCTION_ATTRIBUTES = [('__attribute__((optimize("unroll-loops")))',
                                  'attribute_optimize_unroll_loops'),
                                  ('__attribute__((optimize("O3")))',
                                   'attribute_optimize_opt_3'),
+                                ('__attribute__((optimize("O2")))',
+                                 'attribute_optimize_opt_2'),
                                  ('__attribute__((nonnull (1)))',
                                   'attribute_nonnull'),
                                  ('__attribute__((target ("avx")))',
diff --git a/numpy/core/shape_base.py b/numpy/core/shape_base.py

index a81a04f7ff0ecca310d4542a4c64970892d58567..1a4198c5f8e97926bd91493cbdb8ec2eca58ee27 100644
--- a/numpy/core/shape_base.py
+++ b/numpy/core/shape_base.py
@@ -543,25 +543,23 @@ def _concatenate_shapes(shapes, axis):
      Returns
      -------
      shape: tuple of int
-        This tuple satisfies:
-        ```
-        shape, _ = _concatenate_shapes([arr.shape for shape in arrs], axis)
-        shape == concatenate(arrs, axis).shape
-        ```
+        This tuple satisfies::
+
+            shape, _ = _concatenate_shapes([arr.shape for arr in arrs], axis)
+            shape == concatenate(arrs, axis).shape
  
      slice_prefixes: tuple of (slice(start, end), )
          For a list of arrays being concatenated, this returns the slice
          in the larger array at axis that needs to be sliced into.
  
-        For example, the following holds:
-        ```
-        ret = concatenate([a, b, c], axis)
-        _, (sl_a, sl_b, sl_c) = concatenate_slices([a, b, c], axis)
+        For example, the following holds::
+
+            ret = concatenate([a, b, c], axis)
+            _, (sl_a, sl_b, sl_c) = concatenate_slices([a, b, c], axis)
  
-        ret[(slice(None),) * axis + sl_a] == a
-        ret[(slice(None),) * axis + sl_b] == b
-        ret[(slice(None),) * axis + sl_c] == c
-        ```
+            ret[(slice(None),) * axis + sl_a] == a
+            ret[(slice(None),) * axis + sl_b] == b
+            ret[(slice(None),) * axis + sl_c] == c
  
          These are called slice prefixes since they are used in the recursive
          blocking algorithm to compute the left-most slices during the
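
Review note on the `_concatenate_shapes` docstring above: the "slice prefix" idea is that each input occupies a contiguous range along the concatenation axis, and those ranges follow from the running sum of the input lengths. The sketch below shows this for the one-dimensional case only, with plain lists standing in for arrays. `slice_prefixes_1d` is a hypothetical helper for illustration and is not the private NumPy function.

```python
from itertools import accumulate


def slice_prefixes_1d(lengths):
    """Return one slice(start, end) per block for blocks laid end to end."""
    ends = list(accumulate(lengths))   # running total gives each block's end
    starts = [0] + ends[:-1]           # each block starts where the last ended
    return tuple(slice(s, e) for s, e in zip(starts, ends))


a, b, c = [1, 2], [3], [4, 5, 6]
ret = a + b + c  # list concatenation stands in for concatenate(..., axis=0)
sl_a, sl_b, sl_c = slice_prefixes_1d([len(a), len(b), len(c)])
```

Indexing `ret` with `sl_a`, `sl_b` and `sl_c` recovers `a`, `b` and `c`, which is the one-dimensional form of the property the docstring states.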
diff --git a/numpy/core/shape_base.pyi b/numpy/core/shape_base.pyi

index 159ad2781c05f920327ccc755982d43ecb4221e0..cea355d443c0c003b1ffc924563758f1840e12e5 100644
--- a/numpy/core/shape_base.pyi
+++ b/numpy/core/shape_base.pyi
@@ -1,35 +1,34 @@
-from typing import TypeVar, overload, List, Sequence, Any, SupportsIndex
+from collections.abc import Sequence
+from typing import TypeVar, overload, Any, SupportsIndex
  
-from numpy import generic, dtype
-from numpy.typing import ArrayLike, NDArray, _FiniteNestedSequence, _SupportsArray
+from numpy import generic
+from numpy._typing import ArrayLike, NDArray, _ArrayLike
  
  _SCT = TypeVar("_SCT", bound=generic)
  _ArrayType = TypeVar("_ArrayType", bound=NDArray[Any])
  
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def atleast_1d(arys: _ArrayLike[_SCT], /) -> NDArray[_SCT]: ...
  @overload
  def atleast_1d(arys: ArrayLike, /) -> NDArray[Any]: ...
  @overload
-def atleast_1d(*arys: ArrayLike) -> List[NDArray[Any]]: ...
+def atleast_1d(*arys: ArrayLike) -> list[NDArray[Any]]: ...
  
  @overload
  def atleast_2d(arys: _ArrayLike[_SCT], /) -> NDArray[_SCT]: ...
  @overload
  def atleast_2d(arys: ArrayLike, /) -> NDArray[Any]: ...
  @overload
-def atleast_2d(*arys: ArrayLike) -> List[NDArray[Any]]: ...
+def atleast_2d(*arys: ArrayLike) -> list[NDArray[Any]]: ...
  
  @overload
  def atleast_3d(arys: _ArrayLike[_SCT], /) -> NDArray[_SCT]: ...
  @overload
  def atleast_3d(arys: ArrayLike, /) -> NDArray[Any]: ...
  @overload
-def atleast_3d(*arys: ArrayLike) -> List[NDArray[Any]]: ...
+def atleast_3d(*arys: ArrayLike) -> list[NDArray[Any]]: ...
  
  @overload
  def vstack(tup: Sequence[_ArrayLike[_SCT]]) -> NDArray[_SCT]: ...
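
The overload pattern in this stub (one positional argument returns a single array, several arguments return a `list[...]`) can be illustrated without NumPy. The sketch below is only an analogy: `atleast_1d_sketch` is a hypothetical function that works on plain Python lists, and it assumes Python 3.9 or newer so that the builtin `list[...]` generic is usable at runtime.

```python
from collections.abc import Sequence
from typing import Any, overload


@overload
def atleast_1d_sketch(ary: Any, /) -> list[Any]: ...
@overload
def atleast_1d_sketch(*arys: Any) -> list[list[Any]]: ...


def atleast_1d_sketch(*arys: Any) -> Any:
    """Wrap scalars in a list; return one result or a list of results."""
    def promote(x: Any) -> list[Any]:
        # Strings and bytes are treated as scalars in this sketch.
        is_seq = isinstance(x, Sequence) and not isinstance(x, (str, bytes))
        return list(x) if is_seq else [x]

    results = [promote(a) for a in arys]
    # A single argument yields the bare result, mirroring the first overload.
    return results[0] if len(results) == 1 else results
```

The `@overload` declarations exist only for type checkers; at runtime the final, undecorated definition handles every call.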
diff --git a/numpy/core/src/_simd/_simd.dispatch.c.src b/numpy/core/src/_simd/_simd.dispatch.c.src

index 84de9a059fc8b8023a3246746269149b8edfb71e..ab48db5b108d7173557aad818cea58d2e956c829 100644 (file)
--- a/numpy/core/src/_simd/_simd.dispatch.c.src
+++ b/numpy/core/src/_simd/_simd.dispatch.c.src
@@ -15,7 +15,8 @@
  /**begin repeat
   * #sfx       = u8, s8, u16, s16, u32, s32, u64, s64, f32, f64#
   * #bsfx      = b8, b8, b16, b16, b32, b32, b64, b64, b32, b64#
- * #esfx      = u16, s8, u32,s16, u32, s32, u64, s64, f32, f64#
+ * #esfx      = u16,s8, u32, s16, u32, s32, u64, s64, f32, f64#
+ * #size      = 8,  8,  16,  16,  32,  32,  64,  64,  32,  64#
   * #expand_sup= 1,  0,  1,   0,   0,   0,   0,   0,   0,   0#
   * #simd_sup  = 1,  1,  1,   1,   1,   1,   1,   1,   1,   NPY_SIMD_F64#
   * #fp_only   = 0,  0,  0,   0,   0,   0,   0,   0,   1,   1#
@@ -232,6 +233,15 @@ err:
  /**end repeat1**/
  #endif // @ncont_sup@
  
+/****************************
+ * Lookup tables
+ ****************************/
+#if @size@ == 32
+SIMD_IMPL_INTRIN_2(lut32_@sfx@, v@sfx@, q@sfx@, vu@size@)
+#endif
+#if @size@ == 64
+SIMD_IMPL_INTRIN_2(lut16_@sfx@, v@sfx@, q@sfx@, vu@size@)
+#endif
  /***************************
   * Misc
   ***************************/
@@ -381,7 +391,7 @@ SIMD_IMPL_INTRIN_1(sumup_@sfx@, @esfx@, v@sfx@)
   ***************************/
  #if @fp_only@
  /**begin repeat1
- * #intrin = sqrt, recip, abs, square, ceil, trunc#
+ * #intrin = sqrt, recip, abs, square, rint, ceil, trunc, floor#
   */
  SIMD_IMPL_INTRIN_1(@intrin@_@sfx@, v@sfx@, v@sfx@)
  /**end repeat1**/
@@ -470,8 +480,9 @@ static PyMethodDef simd__intrinsics_methods[] = {
  /**begin repeat
   * #sfx       = u8, s8, u16, s16, u32, s32, u64, s64, f32, f64#
   * #bsfx      = b8, b8, b16, b16, b32, b32, b64, b64, b32, b64#
- * #esfx      = u16, s8, u32,s16, u32, s32, u64, s64, f32, f64#
- * #expand_sup =1,  0,  1,   0,   0,   0,   0,   0,   0,   0#
+ * #esfx      = u16,s8, u32, s16, u32, s32, u64, s64, f32, f64#
+ * #size      = 8,  8,  16,  16,  32,  32,  64,  64,  32,  64#
+ * #expand_sup= 1,  0,  1,   0,   0,   0,   0,   0,   0,   0#
   * #simd_sup  = 1,  1,  1,   1,   1,   1,   1,   1,   1,   NPY_SIMD_F64#
   * #fp_only   = 0,  0,  0,   0,   0,   0,   0,   0,   1,   1#
   * #sat_sup   = 1,  1,  1,   1,   0,   0,   0,   0,   0,   0#
@@ -509,6 +520,15 @@ SIMD_INTRIN_DEF(@intrin@_@sfx@)
  /**end repeat1**/
  #endif // ncont_sup
  
+/****************************
+ * Lookup tables
+ ****************************/
+#if @size@ == 32
+SIMD_INTRIN_DEF(lut32_@sfx@)
+#endif
+#if @size@ == 64
+SIMD_INTRIN_DEF(lut16_@sfx@)
+#endif
  /***************************
   * Misc
   ***************************/
@@ -615,7 +635,7 @@ SIMD_INTRIN_DEF(sumup_@sfx@)
   ***************************/
  #if @fp_only@
  /**begin repeat1
- * #intrin = sqrt, recip, abs, square, ceil, trunc#
+ * #intrin = sqrt, recip, abs, square, rint, ceil, trunc, floor#
   */
  SIMD_INTRIN_DEF(@intrin@_@sfx@)
  /**end repeat1**/
diff --git a/numpy/core/src/common/array_assign.c b/numpy/core/src/common/array_assign.c

index b7495fc0993057564ff9af520578d2d3d5aba2fe..956e55d3067d1d852462953e5d563a2158ce2e26 100644 (file)
--- a/numpy/core/src/common/array_assign.c
+++ b/numpy/core/src/common/array_assign.c
@@ -110,7 +110,6 @@ raw_array_is_aligned(int ndim, npy_intp const *shape,
          int i;
  
          for (i = 0; i < ndim; i++) {
-#if NPY_RELAXED_STRIDES_CHECKING
              /* skip dim == 1 as it is not required to have stride 0 */
              if (shape[i] > 1) {
                  /* if shape[i] == 1, the stride is never used */
@@ -120,9 +119,6 @@ raw_array_is_aligned(int ndim, npy_intp const *shape,
                  /* an array with zero elements is always aligned */
                  return 1;
              }
-#else /* not NPY_RELAXED_STRIDES_CHECKING */
-            align_check |= (npy_uintp)strides[i];
-#endif /* not NPY_RELAXED_STRIDES_CHECKING */
          }
  
          return npy_is_aligned((void *)align_check, alignment);
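
With the `NPY_RELAXED_STRIDES_CHECKING` branch gone, the loop always ignores strides of length-1 dimensions. A rough Python model of that loop follows. It is a sketch under assumptions: `raw_array_is_aligned_sketch` is a hypothetical name, the alignment is taken to be a power of two greater than 1, and the final test is assumed to be a low-bit mask check.

```python
def raw_array_is_aligned_sketch(shape, data_addr, strides, alignment):
    """Model of the per-dimension loop for power-of-two alignments > 1."""
    align_check = data_addr
    for dim, stride in zip(shape, strides):
        if dim > 1:
            # Only strides that are actually stepped over matter.
            align_check |= stride
        elif dim == 0:
            # An array with zero elements is always aligned.
            return True
        # dim == 1: the stride is never used, so it is skipped.
    # Assumed form of the final check: no low bits set below the alignment.
    return (align_check & (alignment - 1)) == 0
```

For example, shape `(4, 1)` with strides `(8, 3)` counts as 8-byte aligned because the odd stride belongs to a dimension of length 1.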
diff --git a/numpy/core/src/common/binop_override.h b/numpy/core/src/common/binop_override.h

index 61bc05ef37197864d04f0b19c56aa5e8ea8b8f35..ec3d046796ab62099c9c5077f4c1274f16f506cb 100644 (file)
--- a/numpy/core/src/common/binop_override.h
+++ b/numpy/core/src/common/binop_override.h
@@ -128,7 +128,7 @@ binop_should_defer(PyObject *self, PyObject *other, int inplace)
       * Classes with __array_ufunc__ are living in the future, and only need to
       * check whether __array_ufunc__ equals None.
       */
-    attr = PyArray_LookupSpecial(other, "__array_ufunc__");
+    attr = PyArray_LookupSpecial(other, npy_um_str_array_ufunc);
      if (attr != NULL) {
          defer = !inplace && (attr == Py_None);
          Py_DECREF(attr);
diff --git a/numpy/core/src/common/get_attr_string.h b/numpy/core/src/common/get_attr_string.h

index 3b23b2e6619bb59c4565d36ed76d0d6935fb5f7b..90eca5ee65c976949ab30e6da9c6355c95fe7c3d 100644 (file)
--- a/numpy/core/src/common/get_attr_string.h
+++ b/numpy/core/src/common/get_attr_string.h
@@ -1,6 +1,9 @@
  #ifndef NUMPY_CORE_SRC_COMMON_GET_ATTR_STRING_H_
  #define NUMPY_CORE_SRC_COMMON_GET_ATTR_STRING_H_
  
+#include <Python.h>
+#include "ufunc_object.h"
+
  static NPY_INLINE npy_bool
  _is_basic_python_type(PyTypeObject *tp)
  {
@@ -33,55 +36,18 @@ _is_basic_python_type(PyTypeObject *tp)
      );
  }
  
-/*
- * Stripped down version of PyObject_GetAttrString(obj, name) that does not
- * raise PyExc_AttributeError.
- *
- * This allows it to avoid creating then discarding exception objects when
- * performing lookups on objects without any attributes.
- *
- * Returns attribute value on success, NULL without an exception set if
- * there is no such attribute, and NULL with an exception on failure.
- */
-static NPY_INLINE PyObject *
-maybe_get_attr(PyObject *obj, char const *name)
-{
-    PyTypeObject *tp = Py_TYPE(obj);
-    PyObject *res = (PyObject *)NULL;
-
-    /* Attribute referenced by (char *)name */
-    if (tp->tp_getattr != NULL) {
-        res = (*tp->tp_getattr)(obj, (char *)name);
-        if (res == NULL && PyErr_ExceptionMatches(PyExc_AttributeError)) {
-            PyErr_Clear();
-        }
-    }
-    /* Attribute referenced by (PyObject *)name */
-    else if (tp->tp_getattro != NULL) {
-        PyObject *w = PyUnicode_InternFromString(name);
-        if (w == NULL) {
-            return (PyObject *)NULL;
-        }
-        res = (*tp->tp_getattro)(obj, w);
-        Py_DECREF(w);
-        if (res == NULL && PyErr_ExceptionMatches(PyExc_AttributeError)) {
-            PyErr_Clear();
-        }
-    }
-    return res;
-}
  
  /*
   * Lookup a special method, following the python approach of looking up
   * on the type object, rather than on the instance itself.
   *
   * Assumes that the special method is a numpy-specific one, so does not look
- * at builtin types, nor does it look at a base ndarray.
+ * at builtin types. It does check base ndarray and numpy scalar types.
   *
   * In future, could be made more like _Py_LookupSpecial
   */
  static NPY_INLINE PyObject *
-PyArray_LookupSpecial(PyObject *obj, char const *name)
+PyArray_LookupSpecial(PyObject *obj, PyObject *name_unicode)
  {
      PyTypeObject *tp = Py_TYPE(obj);
  
@@ -89,9 +55,16 @@ PyArray_LookupSpecial(PyObject *obj, char const *name)
      if (_is_basic_python_type(tp)) {
          return NULL;
      }
-    return maybe_get_attr((PyObject *)tp, name);
+    PyObject *res = PyObject_GetAttr((PyObject *)tp, name_unicode);
+
+    if (res == NULL && PyErr_ExceptionMatches(PyExc_AttributeError)) {
+        PyErr_Clear();
+    }
+
+    return res;
  }
  
+
  /*
   * PyArray_LookupSpecial_OnInstance:
   *
@@ -101,7 +74,7 @@ PyArray_LookupSpecial(PyObject *obj, char const *name)
   * Kept for backwards compatibility. In future, we should deprecate this.
   */
  static NPY_INLINE PyObject *
-PyArray_LookupSpecial_OnInstance(PyObject *obj, char const *name)
+PyArray_LookupSpecial_OnInstance(PyObject *obj, PyObject *name_unicode)
  {
      PyTypeObject *tp = Py_TYPE(obj);
  
@@ -110,7 +83,13 @@ PyArray_LookupSpecial_OnInstance(PyObject *obj, char const *name)
          return NULL;
      }
  
-    return maybe_get_attr(obj, name);
+    PyObject *res = PyObject_GetAttr(obj, name_unicode);
+
+    if (res == NULL && PyErr_ExceptionMatches(PyExc_AttributeError)) {
+        PyErr_Clear();
+    }
+
+    return res;
  }
  
  #endif  /* NUMPY_CORE_SRC_COMMON_GET_ATTR_STRING_H_ */
diff --git a/numpy/core/src/common/lowlevel_strided_loops.h b/numpy/core/src/common/lowlevel_strided_loops.h

index ad86c04895a9e6c29f3fa80673878fed1e5a9304..118ce9cb1e0b00e01069936b4a8ef6e41c84fcbf 100644 (file)
--- a/numpy/core/src/common/lowlevel_strided_loops.h
+++ b/numpy/core/src/common/lowlevel_strided_loops.h
@@ -692,6 +692,19 @@ PyArray_EQUIVALENTLY_ITERABLE_OVERLAP_OK(PyArrayObject *arr1, PyArrayObject *arr
          return 1;
      }
  
+    size1 = PyArray_SIZE(arr1);
+    stride1 = PyArray_TRIVIAL_PAIR_ITERATION_STRIDE(size1, arr1);
+
+    /*
+     * arr1 == arr2 is common for in-place operations, so we fast-path it here.
+     * TODO: The stride1 != 0 check rejects broadcast arrays.  This may affect
+     *       self-overlapping arrays, but seems only necessary due to
+     *       `try_trivial_single_output_loop` not rejecting broadcast outputs.
+     */
+    if (arr1 == arr2 && stride1 != 0) {
+        return 1;
+    }
+
      if (solve_may_share_memory(arr1, arr2, 1) == 0) {
          return 1;
      }
@@ -701,10 +714,7 @@ PyArray_EQUIVALENTLY_ITERABLE_OVERLAP_OK(PyArrayObject *arr1, PyArrayObject *arr
       * arrays stride ahead faster than output arrays.
       */
  
-    size1 = PyArray_SIZE(arr1);
      size2 = PyArray_SIZE(arr2);
-
-    stride1 = PyArray_TRIVIAL_PAIR_ITERATION_STRIDE(size1, arr1);
      stride2 = PyArray_TRIVIAL_PAIR_ITERATION_STRIDE(size2, arr2);
  
      /*
diff --git a/numpy/core/src/common/npy_binsearch.h b/numpy/core/src/common/npy_binsearch.h

new file mode 100644 (file)

index 0000000..8d2f071
--- /dev/null
+++ b/numpy/core/src/common/npy_binsearch.h
@@ -0,0 +1,31 @@
+#ifndef __NPY_BINSEARCH_H__
+#define __NPY_BINSEARCH_H__
+
+#include "npy_sort.h"
+#include <numpy/npy_common.h>
+#include <numpy/ndarraytypes.h>
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef void (PyArray_BinSearchFunc)(const char*, const char*, char*,
+                                     npy_intp, npy_intp,
+                                     npy_intp, npy_intp, npy_intp,
+                                     PyArrayObject*);
+
+typedef int (PyArray_ArgBinSearchFunc)(const char*, const char*,
+                                       const char*, char*,
+                                       npy_intp, npy_intp, npy_intp,
+                                       npy_intp, npy_intp, npy_intp,
+                                       PyArrayObject*);
+
+NPY_NO_EXPORT PyArray_BinSearchFunc* get_binsearch_func(PyArray_Descr *dtype, NPY_SEARCHSIDE side);
+NPY_NO_EXPORT PyArray_ArgBinSearchFunc* get_argbinsearch_func(PyArray_Descr *dtype, NPY_SEARCHSIDE side);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/numpy/core/src/common/npy_binsearch.h.src b/numpy/core/src/common/npy_binsearch.h.src

deleted file mode 100644 (file)

index 052c444..0000000
--- a/numpy/core/src/common/npy_binsearch.h.src
+++ /dev/null
@@ -1,144 +0,0 @@
-#ifndef __NPY_BINSEARCH_H__
-#define __NPY_BINSEARCH_H__
-
-#include "npy_sort.h"
-#include <numpy/npy_common.h>
-#include <numpy/ndarraytypes.h>
-
-#define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))
-
-typedef void (PyArray_BinSearchFunc)(const char*, const char*, char*,
-                                     npy_intp, npy_intp,
-                                     npy_intp, npy_intp, npy_intp,
-                                     PyArrayObject*);
-
-typedef int (PyArray_ArgBinSearchFunc)(const char*, const char*,
-                                       const char*, char*,
-                                       npy_intp, npy_intp, npy_intp,
-                                       npy_intp, npy_intp, npy_intp,
-                                       PyArrayObject*);
-
-typedef struct {
-    int typenum;
-    PyArray_BinSearchFunc *binsearch[NPY_NSEARCHSIDES];
-} binsearch_map;
-
-typedef struct {
-    int typenum;
-    PyArray_ArgBinSearchFunc *argbinsearch[NPY_NSEARCHSIDES];
-} argbinsearch_map;
-
-/**begin repeat
- *
- * #side = left, right#
- */
-
-/**begin repeat1
- *
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble, datetime, timedelta#
- */
-
-NPY_NO_EXPORT void
-binsearch_@side@_@suff@(const char *arr, const char *key, char *ret,
-                        npy_intp arr_len, npy_intp key_len,
-                        npy_intp arr_str, npy_intp key_str, npy_intp ret_str,
-                        PyArrayObject *unused);
-NPY_NO_EXPORT int
-argbinsearch_@side@_@suff@(const char *arr, const char *key,
-                           const char *sort, char *ret,
-                           npy_intp arr_len, npy_intp key_len,
-                           npy_intp arr_str, npy_intp key_str,
-                           npy_intp sort_str, npy_intp ret_str,
-                           PyArrayObject *unused);
-/**end repeat1**/
-
-NPY_NO_EXPORT void
-npy_binsearch_@side@(const char *arr, const char *key, char *ret,
-                     npy_intp arr_len, npy_intp key_len,
-                     npy_intp arr_str, npy_intp key_str,
-                     npy_intp ret_str, PyArrayObject *cmp);
-NPY_NO_EXPORT int
-npy_argbinsearch_@side@(const char *arr, const char *key,
-                        const char *sort, char *ret,
-                        npy_intp arr_len, npy_intp key_len,
-                        npy_intp arr_str, npy_intp key_str,
-                        npy_intp sort_str, npy_intp ret_str,
-                        PyArrayObject *cmp);
-/**end repeat**/
-
-/**begin repeat
- *
- * #arg = , arg#
- * #Arg = , Arg#
- */
-
-static @arg@binsearch_map _@arg@binsearch_map[] = {
-    /* If adding new types, make sure to keep them ordered by type num */
-    /**begin repeat1
-     *
-     * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
-     *         LONGLONG, ULONGLONG, FLOAT, DOUBLE, LONGDOUBLE,
-     *         CFLOAT, CDOUBLE, CLONGDOUBLE, DATETIME, TIMEDELTA, HALF#
-     * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
-     *         longlong, ulonglong, float, double, longdouble,
-     *         cfloat, cdouble, clongdouble, datetime, timedelta, half#
-     */
-    {NPY_@TYPE@,
-        {
-            &@arg@binsearch_left_@suff@,
-            &@arg@binsearch_right_@suff@,
-        },
-    },
-    /**end repeat1**/
-};
-
-static PyArray_@Arg@BinSearchFunc *gen@arg@binsearch_map[] = {
-    &npy_@arg@binsearch_left,
-    &npy_@arg@binsearch_right,
-};
-
-static NPY_INLINE PyArray_@Arg@BinSearchFunc*
-get_@arg@binsearch_func(PyArray_Descr *dtype, NPY_SEARCHSIDE side)
-{
-    npy_intp nfuncs = ARRAY_SIZE(_@arg@binsearch_map);
-    npy_intp min_idx = 0;
-    npy_intp max_idx = nfuncs;
-    int type = dtype->type_num;
-
-    if (side >= NPY_NSEARCHSIDES) {
-        return NULL;
-    }
-
-    /*
-     * It seems only fair that a binary search function be searched for
-     * using a binary search...
-     */
-    while (min_idx < max_idx) {
-        npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
-
-        if (_@arg@binsearch_map[mid_idx].typenum < type) {
-            min_idx = mid_idx + 1;
-        }
-        else {
-            max_idx = mid_idx;
-        }
-    }
-
-    if (min_idx < nfuncs &&
-            _@arg@binsearch_map[min_idx].typenum == type) {
-        return _@arg@binsearch_map[min_idx].@arg@binsearch[side];
-    }
-
-    if (dtype->f->compare) {
-        return gen@arg@binsearch_map[side];
-    }
-
-    return NULL;
-}
-/**end repeat**/
-
-#undef ARRAY_SIZE
-
-#endif
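
The getter removed above found the type-specific function by binary-searching a table sorted by type number, then fell back to a generic comparison-based function. A minimal Python sketch of that dispatch is shown below. The type numbers, the string placeholders for function pointers, and the `has_compare` flag are all assumptions made for illustration.

```python
from bisect import bisect_left

LEFT, RIGHT = 0, 1

# (typenum, (left_func, right_func)), kept sorted by typenum.
# Numbers and names are hypothetical placeholders.
_TABLE = [
    (0, ("binsearch_left_bool", "binsearch_right_bool")),
    (5, ("binsearch_left_int", "binsearch_right_int")),
    (12, ("binsearch_left_double", "binsearch_right_double")),
]
_TYPENUMS = [typenum for typenum, _ in _TABLE]
_GENERIC = ("npy_binsearch_left", "npy_binsearch_right")


def get_binsearch_func_sketch(typenum, side, has_compare=True):
    """Return the specialised function, the generic one, or None."""
    if side not in (LEFT, RIGHT):
        return None
    # Binary search for the first entry whose typenum is >= the target.
    idx = bisect_left(_TYPENUMS, typenum)
    if idx < len(_TABLE) and _TABLE[idx][0] == typenum:
        return _TABLE[idx][1][side]
    # No specialised entry: use the generic version if the dtype can compare.
    if has_compare:
        return _GENERIC[side]
    return None
```

The table must stay ordered by type number for the search to be valid, which is what the removed comment "make sure to keep them ordered by type num" was guarding.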
diff --git a/numpy/core/src/common/npy_config.h b/numpy/core/src/common/npy_config.h

index fd0f1855c8d310dde7672414dbeb092d6ffeecc5..b01eca5abc15c6bfab8d1411941def2afaa0a1d1 100644 (file)
--- a/numpy/core/src/common/npy_config.h
+++ b/numpy/core/src/common/npy_config.h
@@ -136,11 +136,23 @@
  #undef HAVE_CPOWL
  #undef HAVE_CEXPL
  
+#include <cygwin/version.h>
+#if CYGWIN_VERSION_DLL_MAJOR < 3003
+/* https://cygwin.com/pipermail/cygwin-announce/2021-October/010268.html */
  /* Builtin abs reports overflow */
  #undef HAVE_CABSL
  #undef HAVE_HYPOTL
  #endif
  
+#if CYGWIN_VERSION_DLL_MAJOR < 3002
+/* https://cygwin.com/pipermail/cygwin-announce/2021-March/009987.html */
+/* Segfault */
+#undef HAVE_MODFL
+/* sqrt(-inf) returns -inf instead of -nan */
+#undef HAVE_SQRTL
+#endif
+#endif
+
  /* Disable broken gnu trig functions */
  #if defined(HAVE_FEATURES_H)
  #include <features.h>
diff --git a/numpy/core/src/common/npy_cpu_features.c b/numpy/core/src/common/npy_cpu_features.c

new file mode 100644 (file)

index 0000000..773f4af
--- /dev/null
+++ b/numpy/core/src/common/npy_cpu_features.c
@@ -0,0 +1,773 @@
+#include "npy_cpu_features.h"
+#include "npy_cpu_dispatch.h" // To guarantee the CPU baseline definitions are in scope.
+#include "numpy/npy_common.h" // for NPY_INLINE
+#include "numpy/npy_cpu.h" // To guarantee the CPU definitions are in scope.
+
+/******************** Private Definitions *********************/
+
+// Hold all CPU features boolean values
+static unsigned char npy__cpu_have[NPY_CPU_FEATURE_MAX];
+
+/******************** Private Declarations *********************/
+
+// Detect almost all CPU features at runtime
+static void
+npy__cpu_init_features(void);
+/*
+ * Disable CPU dispatched features at runtime if environment variable
+ * 'NPY_DISABLE_CPU_FEATURES' is defined.
+ * Multiple features can be present, and separated by space, comma, or tab.
+ * Raises an error if parsing fails or if the feature was not enabled
+*/
+static int
+npy__cpu_try_disable_env(void);
+
+/* Ensure the build's CPU baseline features are supported at runtime */
+static int
+npy__cpu_validate_baseline(void);
+
+/******************** Public Definitions *********************/
+
+NPY_VISIBILITY_HIDDEN int
+npy_cpu_have(int feature_id)
+{
+    if (feature_id <= NPY_CPU_FEATURE_NONE || feature_id >= NPY_CPU_FEATURE_MAX)
+        return 0;
+    return npy__cpu_have[feature_id];
+}
+
+NPY_VISIBILITY_HIDDEN int
+npy_cpu_init(void)
+{
+    npy__cpu_init_features();
+    if (npy__cpu_validate_baseline() < 0) {
+        return -1;
+    }
+    if (npy__cpu_try_disable_env() < 0) {
+        return -1;
+    }
+    return 0;
+}
+
+static struct {
+  enum npy_cpu_features feature;
+  char const *string;
+} features[] = {{NPY_CPU_FEATURE_MMX, "MMX"},
+                {NPY_CPU_FEATURE_SSE, "SSE"},
+                {NPY_CPU_FEATURE_SSE2, "SSE2"},
+                {NPY_CPU_FEATURE_SSE3, "SSE3"},
+                {NPY_CPU_FEATURE_SSSE3, "SSSE3"},
+                {NPY_CPU_FEATURE_SSE41, "SSE41"},
+                {NPY_CPU_FEATURE_POPCNT, "POPCNT"},
+                {NPY_CPU_FEATURE_SSE42, "SSE42"},
+                {NPY_CPU_FEATURE_AVX, "AVX"},
+                {NPY_CPU_FEATURE_F16C, "F16C"},
+                {NPY_CPU_FEATURE_XOP, "XOP"},
+                {NPY_CPU_FEATURE_FMA4, "FMA4"},
+                {NPY_CPU_FEATURE_FMA3, "FMA3"},
+                {NPY_CPU_FEATURE_AVX2, "AVX2"},
+                {NPY_CPU_FEATURE_AVX512F, "AVX512F"},
+                {NPY_CPU_FEATURE_AVX512CD, "AVX512CD"},
+                {NPY_CPU_FEATURE_AVX512ER, "AVX512ER"},
+                {NPY_CPU_FEATURE_AVX512PF, "AVX512PF"},
+                {NPY_CPU_FEATURE_AVX5124FMAPS, "AVX5124FMAPS"},
+                {NPY_CPU_FEATURE_AVX5124VNNIW, "AVX5124VNNIW"},
+                {NPY_CPU_FEATURE_AVX512VPOPCNTDQ, "AVX512VPOPCNTDQ"},
+                {NPY_CPU_FEATURE_AVX512VL, "AVX512VL"},
+                {NPY_CPU_FEATURE_AVX512BW, "AVX512BW"},
+                {NPY_CPU_FEATURE_AVX512DQ, "AVX512DQ"},
+                {NPY_CPU_FEATURE_AVX512VNNI, "AVX512VNNI"},
+                {NPY_CPU_FEATURE_AVX512IFMA, "AVX512IFMA"},
+                {NPY_CPU_FEATURE_AVX512VBMI, "AVX512VBMI"},
+                {NPY_CPU_FEATURE_AVX512VBMI2, "AVX512VBMI2"},
+                {NPY_CPU_FEATURE_AVX512BITALG, "AVX512BITALG"},
+                {NPY_CPU_FEATURE_AVX512_KNL, "AVX512_KNL"},
+                {NPY_CPU_FEATURE_AVX512_KNM, "AVX512_KNM"},
+                {NPY_CPU_FEATURE_AVX512_SKX, "AVX512_SKX"},
+                {NPY_CPU_FEATURE_AVX512_CLX, "AVX512_CLX"},
+                {NPY_CPU_FEATURE_AVX512_CNL, "AVX512_CNL"},
+                {NPY_CPU_FEATURE_AVX512_ICL, "AVX512_ICL"},
+                {NPY_CPU_FEATURE_VSX, "VSX"},
+                {NPY_CPU_FEATURE_VSX2, "VSX2"},
+                {NPY_CPU_FEATURE_VSX3, "VSX3"},
+                {NPY_CPU_FEATURE_VSX4, "VSX4"},
+                {NPY_CPU_FEATURE_VX, "VX"},
+                {NPY_CPU_FEATURE_VXE, "VXE"},
+                {NPY_CPU_FEATURE_VXE2, "VXE2"},
+                {NPY_CPU_FEATURE_NEON, "NEON"},
+                {NPY_CPU_FEATURE_NEON_FP16, "NEON_FP16"},
+                {NPY_CPU_FEATURE_NEON_VFPV4, "NEON_VFPV4"},
+                {NPY_CPU_FEATURE_ASIMD, "ASIMD"},
+                {NPY_CPU_FEATURE_FPHP, "FPHP"},
+                {NPY_CPU_FEATURE_ASIMDHP, "ASIMDHP"},
+                {NPY_CPU_FEATURE_ASIMDDP, "ASIMDDP"},
+                {NPY_CPU_FEATURE_ASIMDFHM, "ASIMDFHM"}};
+
+
+NPY_VISIBILITY_HIDDEN PyObject *
+npy_cpu_features_dict(void)
+{
+    PyObject *dict = PyDict_New();
+    if (dict) {
+        for(unsigned i = 0; i < sizeof(features)/sizeof(features[0]); ++i)
+            if (PyDict_SetItemString(dict, features[i].string,
+                npy__cpu_have[features[i].feature] ? Py_True : Py_False) < 0) {
+                Py_DECREF(dict);
+                return NULL;
+            }
+    }
+    return dict;
+}
+
+#define NPY__CPU_PYLIST_APPEND_CB(FEATURE, LIST) \
+    item = PyUnicode_FromString(NPY_TOSTRING(FEATURE)); \
+    if (item == NULL) { \
+        Py_DECREF(LIST); \
+        return NULL; \
+    } \
+    PyList_SET_ITEM(LIST, index++, item);
+
+NPY_VISIBILITY_HIDDEN PyObject *
+npy_cpu_baseline_list(void)
+{
+#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_BASELINE_N > 0
+    PyObject *list = PyList_New(NPY_WITH_CPU_BASELINE_N), *item;
+    int index = 0;
+    if (list != NULL) {
+        NPY_WITH_CPU_BASELINE_CALL(NPY__CPU_PYLIST_APPEND_CB, list)
+    }
+    return list;
+#else
+    return PyList_New(0);
+#endif
+}
+
+NPY_VISIBILITY_HIDDEN PyObject *
+npy_cpu_dispatch_list(void)
+{
+#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_DISPATCH_N > 0
+    PyObject *list = PyList_New(NPY_WITH_CPU_DISPATCH_N), *item;
+    int index = 0;
+    if (list != NULL) {
+        NPY_WITH_CPU_DISPATCH_CALL(NPY__CPU_PYLIST_APPEND_CB, list)
+    }
+    return list;
+#else
+    return PyList_New(0);
+#endif
+}
+
+/******************** Private Definitions *********************/
+#define NPY__CPU_FEATURE_ID_CB(FEATURE, WITH_FEATURE)     \
+    if (strcmp(NPY_TOSTRING(FEATURE), WITH_FEATURE) == 0) \
+        return NPY_CAT(NPY_CPU_FEATURE_, FEATURE);
+/**
+ * Returns the CPU feature's ID if 'feature' is part of the baseline
+ * features that were configured via --cpu-baseline;
+ * otherwise returns 0.
+*/
+static NPY_INLINE int
+npy__cpu_baseline_fid(const char *feature)
+{
+#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_BASELINE_N > 0
+    NPY_WITH_CPU_BASELINE_CALL(NPY__CPU_FEATURE_ID_CB, feature)
+#endif
+    return 0;
+}
+/**
+ * Returns the CPU feature's ID if 'feature' is part of the dispatched
+ * features that were configured via --cpu-dispatch;
+ * otherwise returns 0.
+*/
+static NPY_INLINE int
+npy__cpu_dispatch_fid(const char *feature)
+{
+#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_DISPATCH_N > 0
+    NPY_WITH_CPU_DISPATCH_CALL(NPY__CPU_FEATURE_ID_CB, feature)
+#endif
+    return 0;
+}
+
+static int
+npy__cpu_validate_baseline(void)
+{
+#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_BASELINE_N > 0
+    char baseline_failure[sizeof(NPY_WITH_CPU_BASELINE) + 1];
+    char *fptr = &baseline_failure[0];
+
+    #define NPY__CPU_VALIDATE_CB(FEATURE, DUMMY)                  \
+        if (!npy__cpu_have[NPY_CAT(NPY_CPU_FEATURE_, FEATURE)]) { \
+            const int size = sizeof(NPY_TOSTRING(FEATURE)) - 1;   \
+            memcpy(fptr, NPY_TOSTRING(FEATURE), size);            \
+            fptr[size] = ' '; fptr += size + 1;                   \
+        }
+    NPY_WITH_CPU_BASELINE_CALL(NPY__CPU_VALIDATE_CB, DUMMY) // extra arg for msvc
+    *fptr = '\0';
+
+    if (baseline_failure[0] != '\0') {
+        *(fptr-1) = '\0'; // trim the last space
+        PyErr_Format(PyExc_RuntimeError,
+            "NumPy was built with baseline optimizations: \n"
+            "(" NPY_WITH_CPU_BASELINE ") but your machine doesn't support:\n(%s).",
+            baseline_failure
+        );
+        return -1;
+    }
+#endif
+    return 0;
+}
+
+static int
+npy__cpu_try_disable_env(void)
+{
+    char *disenv = getenv("NPY_DISABLE_CPU_FEATURES");
+    if (disenv == NULL || disenv[0] == 0) {
+        return 0;
+    }
+    #define NPY__CPU_ENV_ERR_HEAD \
+        "During parsing environment variable 'NPY_DISABLE_CPU_FEATURES':\n"
+
+#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_DISPATCH_N > 0
+    #define NPY__MAX_VAR_LEN 1024 // More than enough for this era
+    size_t var_len = strlen(disenv) + 1;
+    if (var_len > NPY__MAX_VAR_LEN) {
+        PyErr_Format(PyExc_RuntimeError,
+            "Length of environment variable 'NPY_DISABLE_CPU_FEATURES' is %zu, only %d accepted",
+            var_len, NPY__MAX_VAR_LEN - 1
+        );
+        return -1;
+    }
+    char disable_features[NPY__MAX_VAR_LEN];
+    memcpy(disable_features, disenv, var_len);
+
+    char nexist[NPY__MAX_VAR_LEN];
+    char *nexist_cur = &nexist[0];
+
+    char notsupp[sizeof(NPY_WITH_CPU_DISPATCH) + 1];
+    char *notsupp_cur = &notsupp[0];
+
+    // comma and whitespace (space, htab, vtab, CR, LF, FF)
+    const char *delim = ", \t\v\r\n\f";
+    char *feature = strtok(disable_features, delim);
+    while (feature) {
+        if (npy__cpu_baseline_fid(feature) > 0) {
+            PyErr_Format(PyExc_RuntimeError,
+                NPY__CPU_ENV_ERR_HEAD
+                "You cannot disable CPU feature '%s', since it is part of "
+                "the baseline optimizations:\n"
+                "(" NPY_WITH_CPU_BASELINE ").",
+                feature
+            );
+            return -1;
+        }
+        // check if the feature is part of dispatched features
+        int feature_id = npy__cpu_dispatch_fid(feature);
+        if (feature_id == 0) {
+            int flen = strlen(feature);
+            memcpy(nexist_cur, feature, flen);
+            nexist_cur[flen] = ' '; nexist_cur += flen + 1;
+            goto next;
+        }
+        // check if the feature is supported by the running machine
+        if (!npy__cpu_have[feature_id]) {
+            int flen = strlen(feature);
+            memcpy(notsupp_cur, feature, flen);
+            notsupp_cur[flen] = ' '; notsupp_cur += flen + 1;
+            goto next;
+        }
+        // Finally we can disable it
+        npy__cpu_have[feature_id] = 0;
+    next:
+        feature = strtok(NULL, delim);
+    }
+
+    *nexist_cur = '\0';
+    if (nexist[0] != '\0') {
+        *(nexist_cur-1) = '\0'; // trim the last space
+        if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1,
+                NPY__CPU_ENV_ERR_HEAD
+                "You cannot disable CPU features (%s), since "
+                "they are not part of the dispatched optimizations\n"
+                "(" NPY_WITH_CPU_DISPATCH ").",
+                nexist
+        ) < 0) {
+            return -1;
+        }
+    }
+
+    *notsupp_cur = '\0';
+    if (notsupp[0] != '\0') {
+        *(notsupp_cur-1) = '\0'; // trim the last space
+        if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1,
+                NPY__CPU_ENV_ERR_HEAD
+                "You cannot disable CPU features (%s), since "
+                "they are not supported by your machine.",
+                notsupp
+        ) < 0) {
+            return -1;
+        }
+    }
+#else
+    if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1,
+            NPY__CPU_ENV_ERR_HEAD
+            "You cannot use environment variable 'NPY_DISABLE_CPU_FEATURES', since "
+        #ifdef NPY_DISABLE_OPTIMIZATION
+            "the NumPy library was compiled with optimization disabled."
+        #else
+            "the NumPy library was compiled without any dispatched optimizations."
+        #endif
+    ) < 0) {
+        return -1;
+    }
+#endif
+    return 0;
+}
+
+/****************************************************************
+ * This section is reserved for defining @npy__cpu_init_features
+ * for each CPU architecture. Please try to keep it clean. Thanks.
+ ****************************************************************/
+
+/***************** X86 ******************/
+
+#if defined(NPY_CPU_AMD64) || defined(NPY_CPU_X86)
+
+#ifdef _MSC_VER
+    #include <intrin.h>
+#elif defined(__INTEL_COMPILER)
+    #include <immintrin.h>
+#endif
+
+static int
+npy__cpu_getxcr0(void)
+{
+#if defined(_MSC_VER) || defined (__INTEL_COMPILER)
+    return _xgetbv(0);
+#elif defined(__GNUC__) || defined(__clang__)
+    /* named form of xgetbv not supported on OSX, so must use byte form, see:
+     * https://github.com/asmjit/asmjit/issues/78
+    */
+    unsigned int eax, edx;
+    __asm(".byte 0x0F, 0x01, 0xd0" : "=a"(eax), "=d"(edx) : "c"(0));
+    return eax;
+#else
+    return 0;
+#endif
+}
+
+static void
+npy__cpu_cpuid(int reg[4], int func_id)
+{
+#if defined(_MSC_VER)
+    __cpuidex(reg, func_id, 0);
+#elif defined(__INTEL_COMPILER)
+    __cpuid(reg, func_id);
+#elif defined(__GNUC__) || defined(__clang__)
+    #if defined(NPY_CPU_X86) && defined(__PIC__)
+        // %ebx may be the PIC register
+        __asm__("xchg{l}\t{%%}ebx, %1\n\t"
+                "cpuid\n\t"
+                "xchg{l}\t{%%}ebx, %1\n\t"
+                : "=a" (reg[0]), "=r" (reg[1]), "=c" (reg[2]),
+                  "=d" (reg[3])
+                : "a" (func_id), "c" (0)
+        );
+    #else
+        __asm__("cpuid\n\t"
+                : "=a" (reg[0]), "=b" (reg[1]), "=c" (reg[2]),
+                  "=d" (reg[3])
+                : "a" (func_id), "c" (0)
+        );
+    #endif
+#else
+    reg[0] = 0;
+#endif
+}
+
+static void
+npy__cpu_init_features(void)
+{
+    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
+
+    // validate platform support
+    int reg[] = {0, 0, 0, 0};
+    npy__cpu_cpuid(reg, 0);
+    if (reg[0] == 0) {
+       npy__cpu_have[NPY_CPU_FEATURE_MMX]  = 1;
+       npy__cpu_have[NPY_CPU_FEATURE_SSE]  = 1;
+       npy__cpu_have[NPY_CPU_FEATURE_SSE2] = 1;
+       #ifdef NPY_CPU_AMD64
+           npy__cpu_have[NPY_CPU_FEATURE_SSE3] = 1;
+       #endif
+       return;
+    }
+
+    npy__cpu_cpuid(reg, 1);
+    npy__cpu_have[NPY_CPU_FEATURE_MMX]    = (reg[3] & (1 << 23)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_SSE]    = (reg[3] & (1 << 25)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_SSE2]   = (reg[3] & (1 << 26)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_SSE3]   = (reg[2] & (1 << 0))  != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_SSSE3]  = (reg[2] & (1 << 9))  != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_SSE41]  = (reg[2] & (1 << 19)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_POPCNT] = (reg[2] & (1 << 23)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_SSE42]  = (reg[2] & (1 << 20)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_F16C]   = (reg[2] & (1 << 29)) != 0;
+
+    // check OSXSAVE
+    if ((reg[2] & (1 << 27)) == 0)
+        return;
+    // check AVX OS support
+    int xcr = npy__cpu_getxcr0();
+    if ((xcr & 6) != 6)
+        return;
+    npy__cpu_have[NPY_CPU_FEATURE_AVX]    = (reg[2] & (1 << 28)) != 0;
+    if (!npy__cpu_have[NPY_CPU_FEATURE_AVX])
+        return;
+    npy__cpu_have[NPY_CPU_FEATURE_FMA3]   = (reg[2] & (1 << 12)) != 0;
+
+    // second call to the cpuid to get extended AMD feature bits
+    npy__cpu_cpuid(reg, 0x80000001);
+    npy__cpu_have[NPY_CPU_FEATURE_XOP]    = (reg[2] & (1 << 11)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_FMA4]   = (reg[2] & (1 << 16)) != 0;
+
+    // third call to the cpuid to get extended AVX2 & AVX512 feature bits
+    npy__cpu_cpuid(reg, 7);
+    npy__cpu_have[NPY_CPU_FEATURE_AVX2]   = (reg[1] & (1 << 5))  != 0;
+    if (!npy__cpu_have[NPY_CPU_FEATURE_AVX2])
+        return;
+    // detect AVX2 & FMA3
+    npy__cpu_have[NPY_CPU_FEATURE_FMA]    = npy__cpu_have[NPY_CPU_FEATURE_FMA3];
+
+    // check AVX512 OS support
+    int avx512_os = (xcr & 0xe6) == 0xe6;
+#if defined(__APPLE__) && defined(__x86_64__)
+    /**
+     * On Darwin machines with AVX512 support, threads are by default created
+     * with AVX512 masked off in XCR0, and an AVX-sized save area is used.
+     * However, AVX512 capabilities are advertised in the commpage and via sysctl.
+     * For more information, see:
+     *  - https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/osfmk/i386/fpu.c#L175-L201
+     *  - https://github.com/golang/go/issues/43089
+     *  - https://github.com/numpy/numpy/issues/19319
+     */
+    if (!avx512_os) {
+        npy_uintp commpage64_addr = 0x00007fffffe00000ULL;
+        npy_uint16 commpage64_ver = *((npy_uint16*)(commpage64_addr + 0x01E));
+        // cpu_capabilities64 undefined in versions < 13
+        if (commpage64_ver > 12) {
+            npy_uint64 commpage64_cap = *((npy_uint64*)(commpage64_addr + 0x010));
+            avx512_os = (commpage64_cap & 0x0000004000000000ULL) != 0;
+        }
+    }
+#endif
+    if (!avx512_os) {
+        return;
+    }
+    npy__cpu_have[NPY_CPU_FEATURE_AVX512F]  = (reg[1] & (1 << 16)) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_AVX512CD] = (reg[1] & (1 << 28)) != 0;
+    if (npy__cpu_have[NPY_CPU_FEATURE_AVX512F] && npy__cpu_have[NPY_CPU_FEATURE_AVX512CD]) {
+        // Knights Landing
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512PF]        = (reg[1] & (1 << 26)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512ER]        = (reg[1] & (1 << 27)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512_KNL]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512ER] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512PF];
+        // Knights Mill
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512VPOPCNTDQ] = (reg[2] & (1 << 14)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX5124VNNIW]    = (reg[3] & (1 << 2))  != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX5124FMAPS]    = (reg[3] & (1 << 3))  != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512_KNM]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_KNL] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX5124FMAPS] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX5124VNNIW] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VPOPCNTDQ];
+
+        // Skylake-X
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512DQ]        = (reg[1] & (1 << 17)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512BW]        = (reg[1] & (1 << 30)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512VL]        = (reg[1] & (1 << 31)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512_SKX]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512BW] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512DQ] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VL];
+        // Cascade Lake
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512VNNI]      = (reg[2] & (1 << 11)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512_CLX]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_SKX] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VNNI];
+
+        // Cannon Lake
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512IFMA]      = (reg[1] & (1 << 21)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI]      = (reg[2] & (1 << 1))  != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512_CNL]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_SKX] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512IFMA] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI];
+        // Ice Lake
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI2]     = (reg[2] & (1 << 6))  != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512BITALG]    = (reg[2] & (1 << 12)) != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_AVX512_ICL]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_CLX] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512_CNL] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI2] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512BITALG] &&
+                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VPOPCNTDQ];
+    }
+}
+
+/***************** POWER ******************/
+
+#elif defined(NPY_CPU_PPC64) || defined(NPY_CPU_PPC64LE)
+
+#ifdef __linux__
+    #include <sys/auxv.h>
+    #ifndef AT_HWCAP2
+        #define AT_HWCAP2 26
+    #endif
+    #ifndef PPC_FEATURE2_ARCH_2_07
+        #define PPC_FEATURE2_ARCH_2_07 0x80000000
+    #endif
+    #ifndef PPC_FEATURE2_ARCH_3_00
+        #define PPC_FEATURE2_ARCH_3_00 0x00800000
+    #endif
+    #ifndef PPC_FEATURE2_ARCH_3_1
+        #define PPC_FEATURE2_ARCH_3_1  0x00040000
+    #endif
+#endif
+
+static void
+npy__cpu_init_features(void)
+{
+    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
+#ifdef __linux__
+    unsigned int hwcap = getauxval(AT_HWCAP);
+    if ((hwcap & PPC_FEATURE_HAS_VSX) == 0)
+        return;
+
+    hwcap = getauxval(AT_HWCAP2);
+    if (hwcap & PPC_FEATURE2_ARCH_3_1)
+    {
+        npy__cpu_have[NPY_CPU_FEATURE_VSX]  =
+        npy__cpu_have[NPY_CPU_FEATURE_VSX2] =
+        npy__cpu_have[NPY_CPU_FEATURE_VSX3] =
+        npy__cpu_have[NPY_CPU_FEATURE_VSX4] = 1;
+        return;
+    }
+    npy__cpu_have[NPY_CPU_FEATURE_VSX]  = 1;
+    npy__cpu_have[NPY_CPU_FEATURE_VSX2] = (hwcap & PPC_FEATURE2_ARCH_2_07) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_VSX3] = (hwcap & PPC_FEATURE2_ARCH_3_00) != 0;
+    npy__cpu_have[NPY_CPU_FEATURE_VSX4] = (hwcap & PPC_FEATURE2_ARCH_3_1) != 0;
+// TODO: AIX, FreeBSD
+#else
+    npy__cpu_have[NPY_CPU_FEATURE_VSX]  = 1;
+    #if defined(NPY_CPU_PPC64LE) || defined(NPY_HAVE_VSX2)
+    npy__cpu_have[NPY_CPU_FEATURE_VSX2] = 1;
+    #endif
+    #ifdef NPY_HAVE_VSX3
+    npy__cpu_have[NPY_CPU_FEATURE_VSX3] = 1;
+    #endif
+    #ifdef NPY_HAVE_VSX4
+    npy__cpu_have[NPY_CPU_FEATURE_VSX4] = 1;
+    #endif
+#endif
+}
+
+/***************** ZARCH ******************/
+
+#elif defined(__s390x__)
+
+#include <sys/auxv.h>
+#ifndef HWCAP_S390_VXE
+    #define HWCAP_S390_VXE 8192
+#endif
+
+#ifndef HWCAP_S390_VXRS_EXT2
+    #define HWCAP_S390_VXRS_EXT2 32768
+#endif
+
+static void
+npy__cpu_init_features(void)
+{
+    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
+    
+    unsigned int hwcap = getauxval(AT_HWCAP);
+    if ((hwcap & HWCAP_S390_VX) == 0) {
+        return;
+    }
+
+    if (hwcap & HWCAP_S390_VXRS_EXT2) {
+       npy__cpu_have[NPY_CPU_FEATURE_VX]  =
+       npy__cpu_have[NPY_CPU_FEATURE_VXE] =
+       npy__cpu_have[NPY_CPU_FEATURE_VXE2] = 1;
+       return;
+    }
+    
+    npy__cpu_have[NPY_CPU_FEATURE_VXE] = (hwcap & HWCAP_S390_VXE) != 0;
+
+    npy__cpu_have[NPY_CPU_FEATURE_VX]  = 1;
+}
+
+
+/***************** ARM ******************/
+
+#elif defined(__arm__) || defined(__aarch64__)
+
+static NPY_INLINE void
+npy__cpu_init_features_arm8(void)
+{
+    npy__cpu_have[NPY_CPU_FEATURE_NEON]       =
+    npy__cpu_have[NPY_CPU_FEATURE_NEON_FP16]  =
+    npy__cpu_have[NPY_CPU_FEATURE_NEON_VFPV4] =
+    npy__cpu_have[NPY_CPU_FEATURE_ASIMD]      = 1;
+}
+
+#if defined(__linux__) || defined(__FreeBSD__)
+/*
+ * we aren't sure what kind of kernel or libc we are dealing with,
+ * so we play it safe
+*/
+#include <stdio.h>
+#include "npy_cpuinfo_parser.h"
+
+#if defined(__linux__)
+__attribute__((weak)) unsigned long getauxval(unsigned long); // linker should handle it
+#endif
+#ifdef __FreeBSD__
+__attribute__((weak)) int elf_aux_info(int, void *, int); // linker should handle it
+
+static unsigned long getauxval(unsigned long k)
+{
+    unsigned long val = 0ul;
+    if (elf_aux_info == 0 || elf_aux_info((int)k, (void *)&val, (int)sizeof(val)) != 0) {
+       return 0ul;
+    }
+    return val;
+}
+#endif
+static int
+npy__cpu_init_features_linux(void)
+{
+    unsigned long hwcap = 0, hwcap2 = 0;
+    #ifdef __linux__
+    if (getauxval != 0) {
+        hwcap = getauxval(NPY__HWCAP);
+    #ifdef __arm__
+        hwcap2 = getauxval(NPY__HWCAP2);
+    #endif
+    } else {
+        unsigned long auxv[2];
+        int fd = open("/proc/self/auxv", O_RDONLY);
+        if (fd >= 0) {
+            while (read(fd, &auxv, sizeof(auxv)) == sizeof(auxv)) {
+                if (auxv[0] == NPY__HWCAP) {
+                    hwcap = auxv[1];
+                }
+            #ifdef __arm__
+                else if (auxv[0] == NPY__HWCAP2) {
+                    hwcap2 = auxv[1];
+                }
+            #endif
+                // detect the end
+                else if (auxv[0] == 0 && auxv[1] == 0) {
+                    break;
+                }
+            }
+            close(fd);
+        }
+    }
+    #else
+    hwcap = getauxval(NPY__HWCAP);
+    #ifdef __arm__
+    hwcap2 = getauxval(NPY__HWCAP2);
+    #endif
+    #endif
+    if (hwcap == 0 && hwcap2 == 0) {
+    #ifdef __linux__
+        /*
+         * try parsing /proc/cpuinfo; if that fails (e.g. when sandboxed),
+         * fall back to compiler definitions
+        */
+        if(!get_feature_from_proc_cpuinfo(&hwcap, &hwcap2)) {
+            return 0;
+        }
+    #else
+       return 0;
+    #endif
+    }
+#ifdef __arm__
+    // Detect Arm8 (aarch32 state)
+    if ((hwcap2 & NPY__HWCAP2_AES)  || (hwcap2 & NPY__HWCAP2_SHA1)  ||
+        (hwcap2 & NPY__HWCAP2_SHA2) || (hwcap2 & NPY__HWCAP2_PMULL) ||
+        (hwcap2 & NPY__HWCAP2_CRC32))
+    {
+        hwcap = hwcap2;
+#else
+    if (1)
+    {
+        if (!(hwcap & (NPY__HWCAP_FP | NPY__HWCAP_ASIMD))) {
+            // Could this happen? Maybe if disabled by the kernel.
+            // BTW, this would break the baseline of AARCH64
+            return 1;
+        }
+#endif
+        npy__cpu_have[NPY_CPU_FEATURE_FPHP]       = (hwcap & NPY__HWCAP_FPHP)     != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_ASIMDHP]    = (hwcap & NPY__HWCAP_ASIMDHP)  != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_ASIMDDP]    = (hwcap & NPY__HWCAP_ASIMDDP)  != 0;
+        npy__cpu_have[NPY_CPU_FEATURE_ASIMDFHM]   = (hwcap & NPY__HWCAP_ASIMDFHM) != 0;
+        npy__cpu_init_features_arm8();
+    } else {
+        npy__cpu_have[NPY_CPU_FEATURE_NEON]       = (hwcap & NPY__HWCAP_NEON)   != 0;
+        if (npy__cpu_have[NPY_CPU_FEATURE_NEON]) {
+            npy__cpu_have[NPY_CPU_FEATURE_NEON_FP16]  = (hwcap & NPY__HWCAP_HALF) != 0;
+            npy__cpu_have[NPY_CPU_FEATURE_NEON_VFPV4] = (hwcap & NPY__HWCAP_VFPv4) != 0;
+        }
+    }
+    return 1;
+}
+#endif
+
+static void
+npy__cpu_init_features(void)
+{
+    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
+#ifdef __linux__
+    if (npy__cpu_init_features_linux())
+        return;
+#endif
+    // We have nothing else to do
+#if defined(NPY_HAVE_ASIMD) || defined(__aarch64__) || (defined(__ARM_ARCH) && __ARM_ARCH >= 8)
+    #if defined(NPY_HAVE_FPHP) || defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+    npy__cpu_have[NPY_CPU_FEATURE_FPHP] = 1;
+    #endif
+    #if defined(NPY_HAVE_ASIMDHP) || defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
+    npy__cpu_have[NPY_CPU_FEATURE_ASIMDHP] = 1;
+    #endif
+    #if defined(NPY_HAVE_ASIMDDP) || defined(__ARM_FEATURE_DOTPROD)
+    npy__cpu_have[NPY_CPU_FEATURE_ASIMDDP] = 1;
+    #endif
+    #if defined(NPY_HAVE_ASIMDFHM) || defined(__ARM_FEATURE_FP16FML)
+    npy__cpu_have[NPY_CPU_FEATURE_ASIMDFHM] = 1;
+    #endif
+    npy__cpu_init_features_arm8();
+#else
+    #if defined(NPY_HAVE_NEON) || defined(__ARM_NEON__)
+        npy__cpu_have[NPY_CPU_FEATURE_NEON] = 1;
+    #endif
+    #if defined(NPY_HAVE_NEON_FP16) || defined(__ARM_FP16_FORMAT_IEEE) || (defined(__ARM_FP) && (__ARM_FP & 2))
+        npy__cpu_have[NPY_CPU_FEATURE_NEON_FP16] = npy__cpu_have[NPY_CPU_FEATURE_NEON];
+    #endif
+    #if defined(NPY_HAVE_NEON_VFPV4) || defined(__ARM_FEATURE_FMA)
+        npy__cpu_have[NPY_CPU_FEATURE_NEON_VFPV4] = npy__cpu_have[NPY_CPU_FEATURE_NEON];
+    #endif
+#endif
+}
+
+/*********** Unsupported ARCH ***********/
+#else
+static void
+npy__cpu_init_features(void)
+{
+    /*
+     * just in case the compiler doesn't respect ANSI; on known platforms
+     * it is still necessary, because @npy__cpu_init_features may be called
+     * multiple times and we need to clear the features disabled by the
+     * environment variable (or, in the future, by other methods such as
+     * global variables); see @npy__cpu_try_disable_env for details
+     */
+    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
+}
+#endif
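
Note on the x86 branch of the file above: the two magic constants tested against XCR0 (`6` for AVX and `0xe6` for AVX512) are bit masks over the XSAVE state components that the OS has enabled. The sketch below only illustrates how those masks decompose. The macro and function names are hypothetical and are not part of NumPy, and the bit assignments are taken from the Intel SDM description of XCR0 rather than from this patch.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only (not NumPy identifiers).
 * XCR0 state-component bits as described in the Intel SDM. */
#define XCR0_SSE        (1u << 1)  /* XMM registers */
#define XCR0_AVX        (1u << 2)  /* upper halves of YMM registers */
#define XCR0_OPMASK     (1u << 5)  /* AVX512 mask registers k0-k7 */
#define XCR0_ZMM_HI256  (1u << 6)  /* upper halves of ZMM0-ZMM15 */
#define XCR0_HI16_ZMM   (1u << 7)  /* ZMM16-ZMM31 */

/* 0x06: the mask used for the "AVX OS support" check in the patch. */
#define XCR0_MASK_AVX    (XCR0_SSE | XCR0_AVX)
/* 0xe6: the mask used for the "AVX512 OS support" check in the patch. */
#define XCR0_MASK_AVX512 (XCR0_MASK_AVX | XCR0_OPMASK | \
                          XCR0_ZMM_HI256 | XCR0_HI16_ZMM)

/* Returns 1 when every state component needed for AVX is enabled. */
static int
os_saves_avx(uint32_t xcr0)
{
    return (xcr0 & XCR0_MASK_AVX) == XCR0_MASK_AVX;
}

/* Returns 1 when every state component needed for AVX512 is enabled. */
static int
os_saves_avx512(uint32_t xcr0)
{
    return (xcr0 & XCR0_MASK_AVX512) == XCR0_MASK_AVX512;
}
```

Both helpers require all bits of the mask to be set, not just any of them, which is why the patch compares the masked value against the mask itself instead of testing it for non-zero.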
diff --git a/numpy/core/src/common/npy_cpu_features.c.src b/numpy/core/src/common/npy_cpu_features.c.src
deleted file mode 100644
index 0bdc046..0000000
--- a/numpy/core/src/common/npy_cpu_features.c.src
+++ /dev/null
@@ -1,680 +0,0 @@
-#include "npy_cpu_features.h"
-#include "npy_cpu_dispatch.h" // To guarantee the CPU baseline definitions are in scope.
-#include "numpy/npy_common.h" // for NPY_INLINE
-#include "numpy/npy_cpu.h" // To guarantee the CPU definitions are in scope.
-
-/******************** Private Definitions *********************/
-
-// Hold all CPU features boolean values
-static unsigned char npy__cpu_have[NPY_CPU_FEATURE_MAX];
-
-/******************** Private Declarations *********************/
-
-// Almost detect all CPU features in runtime
-static void
-npy__cpu_init_features(void);
-/*
- * Disable CPU dispatched features at runtime if environment variable
- * 'NPY_DISABLE_CPU_FEATURES' is defined.
- * Multiple features can be present, and separated by space, comma, or tab.
- * Raises an error if parsing fails or if the feature was not enabled
-*/
-static int
-npy__cpu_try_disable_env(void);
-
-/* Ensure the build's CPU baseline features are supported at runtime */
-static int
-npy__cpu_validate_baseline(void);
-
-/******************** Public Definitions *********************/
-
-NPY_VISIBILITY_HIDDEN int
-npy_cpu_have(int feature_id)
-{
-    if (feature_id <= NPY_CPU_FEATURE_NONE || feature_id >= NPY_CPU_FEATURE_MAX)
-        return 0;
-    return npy__cpu_have[feature_id];
-}
-
-NPY_VISIBILITY_HIDDEN int
-npy_cpu_init(void)
-{
-    npy__cpu_init_features();
-    if (npy__cpu_validate_baseline() < 0) {
-        return -1;
-    }
-    if (npy__cpu_try_disable_env() < 0) {
-        return -1;
-    }
-    return 0;
-}
-
-NPY_VISIBILITY_HIDDEN PyObject *
-npy_cpu_features_dict(void)
-{
-    PyObject *dict = PyDict_New();
-    if (dict) {
-    /**begin repeat
-     * #feature = MMX, SSE, SSE2, SSE3, SSSE3, SSE41, POPCNT, SSE42,
-     *            AVX, F16C, XOP, FMA4, FMA3, AVX2, AVX512F,
-     *            AVX512CD, AVX512ER, AVX512PF, AVX5124FMAPS, AVX5124VNNIW,
-     *            AVX512VPOPCNTDQ, AVX512VL, AVX512BW, AVX512DQ, AVX512VNNI,
-     *            AVX512IFMA, AVX512VBMI, AVX512VBMI2, AVX512BITALG,
-     *            AVX512_KNL, AVX512_KNM, AVX512_SKX, AVX512_CLX, AVX512_CNL, AVX512_ICL,
-     *            VSX, VSX2, VSX3,
-     *            NEON, NEON_FP16, NEON_VFPV4, ASIMD, FPHP, ASIMDHP, ASIMDDP, ASIMDFHM#
-    */
-        if (PyDict_SetItemString(dict, "@feature@",
-            npy__cpu_have[NPY_CPU_FEATURE_@feature@] ? Py_True : Py_False) < 0) {
-            Py_DECREF(dict);
-            return NULL;
-        }
-    /**end repeat**/
-    }
-    return dict;
-}
-
-#define NPY__CPU_PYLIST_APPEND_CB(FEATURE, LIST) \
-    item = PyUnicode_FromString(NPY_TOSTRING(FEATURE)); \
-    if (item == NULL) { \
-        Py_DECREF(LIST); \
-        return NULL; \
-    } \
-    PyList_SET_ITEM(LIST, index++, item);
-
-NPY_VISIBILITY_HIDDEN PyObject *
-npy_cpu_baseline_list(void)
-{
-#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_BASELINE_N > 0
-    PyObject *list = PyList_New(NPY_WITH_CPU_BASELINE_N), *item;
-    int index = 0;
-    if (list != NULL) {
-        NPY_WITH_CPU_BASELINE_CALL(NPY__CPU_PYLIST_APPEND_CB, list)
-    }
-    return list;
-#else
-    return PyList_New(0);
-#endif
-}
-
-NPY_VISIBILITY_HIDDEN PyObject *
-npy_cpu_dispatch_list(void)
-{
-#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_DISPATCH_N > 0
-    PyObject *list = PyList_New(NPY_WITH_CPU_DISPATCH_N), *item;
-    int index = 0;
-    if (list != NULL) {
-        NPY_WITH_CPU_DISPATCH_CALL(NPY__CPU_PYLIST_APPEND_CB, list)
-    }
-    return list;
-#else
-    return PyList_New(0);
-#endif
-}
-
-/******************** Private Definitions *********************/
-#define NPY__CPU_FEATURE_ID_CB(FEATURE, WITH_FEATURE)     \
-    if (strcmp(NPY_TOSTRING(FEATURE), WITH_FEATURE) == 0) \
-        return NPY_CAT(NPY_CPU_FEATURE_, FEATURE);
-/**
- * Returns CPU feature's ID, if the 'feature' was part of baseline
- * features that had been configured via --cpu-baseline
- * otherwise it returns 0
-*/
-static NPY_INLINE int
-npy__cpu_baseline_fid(const char *feature)
-{
-#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_BASELINE_N > 0
-    NPY_WITH_CPU_BASELINE_CALL(NPY__CPU_FEATURE_ID_CB, feature)
-#endif
-    return 0;
-}
-/**
- * Returns CPU feature's ID, if the 'feature' was part of dispatched
- * features that had been configured via --cpu-dispatch
- * otherwise it returns 0
-*/
-static NPY_INLINE int
-npy__cpu_dispatch_fid(const char *feature)
-{
-#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_DISPATCH_N > 0
-    NPY_WITH_CPU_DISPATCH_CALL(NPY__CPU_FEATURE_ID_CB, feature)
-#endif
-    return 0;
-}
-
-static int
-npy__cpu_validate_baseline(void)
-{
-#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_BASELINE_N > 0
-    char baseline_failure[sizeof(NPY_WITH_CPU_BASELINE) + 1];
-    char *fptr = &baseline_failure[0];
-
-    #define NPY__CPU_VALIDATE_CB(FEATURE, DUMMY)                  \
-        if (!npy__cpu_have[NPY_CAT(NPY_CPU_FEATURE_, FEATURE)]) { \
-            const int size = sizeof(NPY_TOSTRING(FEATURE));       \
-            memcpy(fptr, NPY_TOSTRING(FEATURE), size);            \
-            fptr[size] = ' '; fptr += size + 1;                   \
-        }
-    NPY_WITH_CPU_BASELINE_CALL(NPY__CPU_VALIDATE_CB, DUMMY) // extra arg for msvc
-    *fptr = '\0';
-
-    if (baseline_failure[0] != '\0') {
-        *(fptr-1) = '\0'; // trim the last space
-        PyErr_Format(PyExc_RuntimeError,
-            "NumPy was built with baseline optimizations: \n"
-            "(" NPY_WITH_CPU_BASELINE ") but your machine doesn't support:\n(%s).",
-            baseline_failure
-        );
-        return -1;
-    }
-#endif
-    return 0;
-}
-
-static int
-npy__cpu_try_disable_env(void)
-{
-    char *disenv = getenv("NPY_DISABLE_CPU_FEATURES");
-    if (disenv == NULL || disenv[0] == 0) {
-        return 0;
-    }
-    #define NPY__CPU_ENV_ERR_HEAD \
-        "During parsing environment variable 'NPY_DISABLE_CPU_FEATURES':\n"
-
-#if !defined(NPY_DISABLE_OPTIMIZATION) && NPY_WITH_CPU_DISPATCH_N > 0
-    #define NPY__MAX_VAR_LEN 1024 // More than enough for this era
-    size_t var_len = strlen(disenv) + 1;
-    if (var_len > NPY__MAX_VAR_LEN) {
-        PyErr_Format(PyExc_RuntimeError,
-            "Length of environment variable 'NPY_DISABLE_CPU_FEATURES' is %d, only %d accepted",
-            var_len, NPY__MAX_VAR_LEN - 1
-        );
-        return -1;
-    }
-    char disable_features[NPY__MAX_VAR_LEN];
-    memcpy(disable_features, disenv, var_len);
-
-    char nexist[NPY__MAX_VAR_LEN];
-    char *nexist_cur = &nexist[0];
-
-    char notsupp[sizeof(NPY_WITH_CPU_DISPATCH) + 1];
-    char *notsupp_cur = &notsupp[0];
-
-    //comma and space including (htab, vtab, CR, LF, FF)
-    const char *delim = ", \t\v\r\n\f";
-    char *feature = strtok(disable_features, delim);
-    while (feature) {
-        if (npy__cpu_baseline_fid(feature) > 0) {
-            PyErr_Format(PyExc_RuntimeError,
-                NPY__CPU_ENV_ERR_HEAD
-                "You cannot disable CPU feature '%s', since it is part of "
-                "the baseline optimizations:\n"
-                "(" NPY_WITH_CPU_BASELINE ").",
-                feature
-            );
-            return -1;
-        }
-        // check if the feature is part of dispatched features
-        int feature_id = npy__cpu_dispatch_fid(feature);
-        if (feature_id == 0) {
-            int flen = strlen(feature);
-            memcpy(nexist_cur, feature, flen);
-            nexist_cur[flen] = ' '; nexist_cur += flen + 1;
-            goto next;
-        }
-        // check if the feature supported by the running machine
-        if (!npy__cpu_have[feature_id]) {
-            int flen = strlen(feature);
-            memcpy(notsupp_cur, feature, flen);
-            notsupp_cur[flen] = ' '; notsupp_cur += flen + 1;
-            goto next;
-        }
-        // Finally we can disable it
-        npy__cpu_have[feature_id] = 0;
-    next:
-        feature = strtok(NULL, delim);
-    }
-
-    *nexist_cur = '\0';
-    if (nexist[0] != '\0') {
-        *(nexist_cur-1) = '\0'; // trim the last space
-        if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1,
-                NPY__CPU_ENV_ERR_HEAD
-                "You cannot disable CPU features (%s), since "
-                "they are not part of the dispatched optimizations\n"
-                "(" NPY_WITH_CPU_DISPATCH ").",
-                nexist
-        ) < 0) {
-            return -1;
-        }
-    }
-
-    *notsupp_cur = '\0';
-    if (notsupp[0] != '\0') {
-        *(notsupp_cur-1) = '\0'; // trim the last space
-        if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1,
-                NPY__CPU_ENV_ERR_HEAD
-                "You cannot disable CPU features (%s), since "
-                "they are not supported by your machine.",
-                notsupp
-        ) < 0) {
-            return -1;
-        }
-    }
-#else
-    if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1,
-            NPY__CPU_ENV_ERR_HEAD
-            "You cannot use environment variable 'NPY_DISABLE_CPU_FEATURES', since "
-        #ifdef NPY_DISABLE_OPTIMIZATION
-            "the NumPy library was compiled with optimization disabled."
-        #else
-            "the NumPy library was compiled without any dispatched optimizations."
-        #endif
-    ) < 0) {
-        return -1;
-    }
-#endif
-    return 0;
-}
-
-/****************************************************************
- * This section is reserved to defining @npy__cpu_init_features
- * for each CPU architecture, please try to keep it clean. Ty
- ****************************************************************/
-
-/***************** X86 ******************/
-
-#if defined(NPY_CPU_AMD64) || defined(NPY_CPU_X86)
-
-#ifdef _MSC_VER
-    #include <intrin.h>
-#elif defined(__INTEL_COMPILER)
-    #include <immintrin.h>
-#endif
-
-static int
-npy__cpu_getxcr0(void)
-{
-#if defined(_MSC_VER) || defined (__INTEL_COMPILER)
-    return _xgetbv(0);
-#elif defined(__GNUC__) || defined(__clang__)
-    /* named form of xgetbv not supported on OSX, so must use byte form, see:
-     * https://github.com/asmjit/asmjit/issues/78
-    */
-    unsigned int eax, edx;
-    __asm(".byte 0x0F, 0x01, 0xd0" : "=a"(eax), "=d"(edx) : "c"(0));
-    return eax;
-#else
-    return 0;
-#endif
-}
-
-static void
-npy__cpu_cpuid(int reg[4], int func_id)
-{
-#if defined(_MSC_VER)
-    __cpuidex(reg, func_id, 0);
-#elif defined(__INTEL_COMPILER)
-    __cpuid(reg, func_id);
-#elif defined(__GNUC__) || defined(__clang__)
-    #if defined(NPY_CPU_X86) && defined(__PIC__)
-        // %ebx may be the PIC register
-        __asm__("xchg{l}\t{%%}ebx, %1\n\t"
-                "cpuid\n\t"
-                "xchg{l}\t{%%}ebx, %1\n\t"
-                : "=a" (reg[0]), "=r" (reg[1]), "=c" (reg[2]),
-                  "=d" (reg[3])
-                : "a" (func_id), "c" (0)
-        );
-    #else
-        __asm__("cpuid\n\t"
-                : "=a" (reg[0]), "=b" (reg[1]), "=c" (reg[2]),
-                  "=d" (reg[3])
-                : "a" (func_id), "c" (0)
-        );
-    #endif
-#else
-    reg[0] = 0;
-#endif
-}
-
-static void
-npy__cpu_init_features(void)
-{
-    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
-
-    // validate platform support
-    int reg[] = {0, 0, 0, 0};
-    npy__cpu_cpuid(reg, 0);
-    if (reg[0] == 0) {
-       npy__cpu_have[NPY_CPU_FEATURE_MMX]  = 1;
-       npy__cpu_have[NPY_CPU_FEATURE_SSE]  = 1;
-       npy__cpu_have[NPY_CPU_FEATURE_SSE2] = 1;
-       #ifdef NPY_CPU_AMD64
-           npy__cpu_have[NPY_CPU_FEATURE_SSE3] = 1;
-       #endif
-       return;
-    }
-
-    npy__cpu_cpuid(reg, 1);
-    npy__cpu_have[NPY_CPU_FEATURE_MMX]    = (reg[3] & (1 << 23)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_SSE]    = (reg[3] & (1 << 25)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_SSE2]   = (reg[3] & (1 << 26)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_SSE3]   = (reg[2] & (1 << 0))  != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_SSSE3]  = (reg[2] & (1 << 9))  != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_SSE41]  = (reg[2] & (1 << 19)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_POPCNT] = (reg[2] & (1 << 23)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_SSE42]  = (reg[2] & (1 << 20)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_F16C]   = (reg[2] & (1 << 29)) != 0;
-
-    // check OSXSAVE
-    if ((reg[2] & (1 << 27)) == 0)
-        return;
-    // check AVX OS support
-    int xcr = npy__cpu_getxcr0();
-    if ((xcr & 6) != 6)
-        return;
-    npy__cpu_have[NPY_CPU_FEATURE_AVX]    = (reg[2] & (1 << 28)) != 0;
-    if (!npy__cpu_have[NPY_CPU_FEATURE_AVX])
-        return;
-    npy__cpu_have[NPY_CPU_FEATURE_FMA3]   = (reg[2] & (1 << 12)) != 0;
-
-    // second call to the cpuid to get extended AMD feature bits
-    npy__cpu_cpuid(reg, 0x80000001);
-    npy__cpu_have[NPY_CPU_FEATURE_XOP]    = (reg[2] & (1 << 11)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_FMA4]   = (reg[2] & (1 << 16)) != 0;
-
-    // third call to the cpuid to get extended AVX2 & AVX512 feature bits
-    npy__cpu_cpuid(reg, 7);
-    npy__cpu_have[NPY_CPU_FEATURE_AVX2]   = (reg[1] & (1 << 5))  != 0;
-    if (!npy__cpu_have[NPY_CPU_FEATURE_AVX2])
-        return;
-    // detect AVX2 & FMA3
-    npy__cpu_have[NPY_CPU_FEATURE_FMA]    = npy__cpu_have[NPY_CPU_FEATURE_FMA3];
-
-    // check AVX512 OS support
-    int avx512_os = (xcr & 0xe6) == 0xe6;
-#if defined(__APPLE__) && defined(__x86_64__)
-    /**
-     * On darwin, machines with AVX512 support, by default, threads are created with
-     * AVX512 masked off in XCR0 and an AVX-sized savearea is used.
-     * However, AVX512 capabilities are advertised in the commpage and via sysctl.
-     * for more information, check:
-     *  - https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/osfmk/i386/fpu.c#L175-L201
-     *  - https://github.com/golang/go/issues/43089
-     *  - https://github.com/numpy/numpy/issues/19319
-     */
-    if (!avx512_os) {
-        npy_uintp commpage64_addr = 0x00007fffffe00000ULL;
-        npy_uint16 commpage64_ver = *((npy_uint16*)(commpage64_addr + 0x01E));
-        // cpu_capabilities64 undefined in versions < 13
-        if (commpage64_ver > 12) {
-            npy_uint64 commpage64_cap = *((npy_uint64*)(commpage64_addr + 0x010));
-            avx512_os = (commpage64_cap & 0x0000004000000000ULL) != 0;
-        }
-    }
-#endif
-    if (!avx512_os) {
-        return;
-    }
-    npy__cpu_have[NPY_CPU_FEATURE_AVX512F]  = (reg[1] & (1 << 16)) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_AVX512CD] = (reg[1] & (1 << 28)) != 0;
-    if (npy__cpu_have[NPY_CPU_FEATURE_AVX512F] && npy__cpu_have[NPY_CPU_FEATURE_AVX512CD]) {
-        // Knights Landing
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512PF]        = (reg[1] & (1 << 26)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512ER]        = (reg[1] & (1 << 27)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512_KNL]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512ER] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512PF];
-        // Knights Mill
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512VPOPCNTDQ] = (reg[2] & (1 << 14)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX5124VNNIW]    = (reg[3] & (1 << 2))  != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX5124FMAPS]    = (reg[3] & (1 << 3))  != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512_KNM]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_KNL] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX5124FMAPS] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX5124VNNIW] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VPOPCNTDQ];
-
-        // Skylake-X
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512DQ]        = (reg[1] & (1 << 17)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512BW]        = (reg[1] & (1 << 30)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512VL]        = (reg[1] & (1 << 31)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512_SKX]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512BW] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512DQ] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VL];
-        // Cascade Lake
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512VNNI]      = (reg[2] & (1 << 11)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512_CLX]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_SKX] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VNNI];
-
-        // Cannon Lake
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512IFMA]      = (reg[1] & (1 << 21)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI]      = (reg[2] & (1 << 1))  != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512_CNL]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_SKX] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512IFMA] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI];
-        // Ice Lake
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI2]     = (reg[2] & (1 << 6))  != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512BITALG]    = (reg[2] & (1 << 12)) != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_AVX512_ICL]      = npy__cpu_have[NPY_CPU_FEATURE_AVX512_CLX] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512_CNL] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VBMI2] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512BITALG] &&
-                                                         npy__cpu_have[NPY_CPU_FEATURE_AVX512VPOPCNTDQ];
-    }
-}
-
-/***************** POWER ******************/
-
-#elif defined(NPY_CPU_PPC64) || defined(NPY_CPU_PPC64LE)
-
-#ifdef __linux__
-    #include <sys/auxv.h>
-    #ifndef AT_HWCAP2
-        #define AT_HWCAP2 26
-    #endif
-    #ifndef PPC_FEATURE2_ARCH_3_00
-        #define PPC_FEATURE2_ARCH_3_00 0x00800000
-    #endif
-#endif
-
-static void
-npy__cpu_init_features(void)
-{
-    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
-#ifdef __linux__
-    unsigned int hwcap = getauxval(AT_HWCAP);
-    if ((hwcap & PPC_FEATURE_HAS_VSX) == 0)
-        return;
-
-    hwcap = getauxval(AT_HWCAP2);
-    if (hwcap & PPC_FEATURE2_ARCH_3_00)
-    {
-        npy__cpu_have[NPY_CPU_FEATURE_VSX]  =
-        npy__cpu_have[NPY_CPU_FEATURE_VSX2] =
-        npy__cpu_have[NPY_CPU_FEATURE_VSX3] = 1;
-        return;
-    }
-    npy__cpu_have[NPY_CPU_FEATURE_VSX2] = (hwcap & PPC_FEATURE2_ARCH_2_07) != 0;
-    npy__cpu_have[NPY_CPU_FEATURE_VSX]  = 1;
-// TODO: AIX, FreeBSD
-#else
-    npy__cpu_have[NPY_CPU_FEATURE_VSX]  = 1;
-    #if defined(NPY_CPU_PPC64LE) || defined(NPY_HAVE_VSX2)
-    npy__cpu_have[NPY_CPU_FEATURE_VSX2] = 1;
-    #endif
-    #ifdef NPY_HAVE_VSX3
-    npy__cpu_have[NPY_CPU_FEATURE_VSX3] = 1;
-    #endif
-#endif
-}
-
-/***************** ARM ******************/
-
-#elif defined(__arm__) || defined(__aarch64__)
-
-static NPY_INLINE void
-npy__cpu_init_features_arm8(void)
-{
-    npy__cpu_have[NPY_CPU_FEATURE_NEON]       =
-    npy__cpu_have[NPY_CPU_FEATURE_NEON_FP16]  =
-    npy__cpu_have[NPY_CPU_FEATURE_NEON_VFPV4] =
-    npy__cpu_have[NPY_CPU_FEATURE_ASIMD]      = 1;
-}
-
-#if defined(__linux__) || defined(__FreeBSD__)
-/*
- * we aren't sure of what kind kernel or clib we deal with
- * so we play it safe
-*/
-#include <stdio.h>
-#include "npy_cpuinfo_parser.h"
-
-#if defined(__linux__)
-__attribute__((weak)) unsigned long getauxval(unsigned long); // linker should handle it
-#endif
-#ifdef __FreeBSD__
-__attribute__((weak)) int elf_aux_info(int, void *, int); // linker should handle it
-
-static unsigned long getauxval(unsigned long k)
-{
-    unsigned long val = 0ul;
-    if (elf_aux_info == 0 || elf_aux_info((int)k, (void *)&val, (int)sizeof(val)) != 0) {
-       return 0ul;
-    }
-    return val;
-}
-#endif
-static int
-npy__cpu_init_features_linux(void)
-{
-    unsigned long hwcap = 0, hwcap2 = 0;
-    #ifdef __linux__
-    if (getauxval != 0) {
-        hwcap = getauxval(NPY__HWCAP);
-    #ifdef __arm__
-        hwcap2 = getauxval(NPY__HWCAP2);
-    #endif
-    } else {
-        unsigned long auxv[2];
-        int fd = open("/proc/self/auxv", O_RDONLY);
-        if (fd >= 0) {
-            while (read(fd, &auxv, sizeof(auxv)) == sizeof(auxv)) {
-                if (auxv[0] == NPY__HWCAP) {
-                    hwcap = auxv[1];
-                }
-            #ifdef __arm__
-                else if (auxv[0] == NPY__HWCAP2) {
-                    hwcap2 = auxv[1];
-                }
-            #endif
-                // detect the end
-                else if (auxv[0] == 0 && auxv[1] == 0) {
-                    break;
-                }
-            }
-            close(fd);
-        }
-    }
-    #else
-    hwcap = getauxval(NPY__HWCAP);
-    #ifdef __arm__
-    hwcap2 = getauxval(NPY__HWCAP2);
-    #endif
-    #endif
-    if (hwcap == 0 && hwcap2 == 0) {
-    #ifdef __linux__
-        /*
-         * try parsing with /proc/cpuinfo, if sandboxed
-         * failback to compiler definitions
-        */
-        if(!get_feature_from_proc_cpuinfo(&hwcap, &hwcap2)) {
-            return 0;
-        }
-    #else
-       return 0;
-    #endif
-    }
-#ifdef __arm__
-    // Detect Arm8 (aarch32 state)
-    if ((hwcap2 & NPY__HWCAP2_AES)  || (hwcap2 & NPY__HWCAP2_SHA1)  ||
-        (hwcap2 & NPY__HWCAP2_SHA2) || (hwcap2 & NPY__HWCAP2_PMULL) ||
-        (hwcap2 & NPY__HWCAP2_CRC32))
-    {
-        hwcap = hwcap2;
-#else
-    if (1)
-    {
-        if (!(hwcap & (NPY__HWCAP_FP | NPY__HWCAP_ASIMD))) {
-            // Is this could happen? maybe disabled by kernel
-            // BTW this will break the baseline of AARCH64
-            return 1;
-        }
-#endif
-        npy__cpu_have[NPY_CPU_FEATURE_FPHP]       = (hwcap & NPY__HWCAP_FPHP)     != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_ASIMDHP]    = (hwcap & NPY__HWCAP_ASIMDHP)  != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_ASIMDDP]    = (hwcap & NPY__HWCAP_ASIMDDP)  != 0;
-        npy__cpu_have[NPY_CPU_FEATURE_ASIMDFHM]   = (hwcap & NPY__HWCAP_ASIMDFHM) != 0;
-        npy__cpu_init_features_arm8();
-    } else {
-        npy__cpu_have[NPY_CPU_FEATURE_NEON]       = (hwcap & NPY__HWCAP_NEON)   != 0;
-        if (npy__cpu_have[NPY_CPU_FEATURE_NEON]) {
-            npy__cpu_have[NPY_CPU_FEATURE_NEON_FP16]  = (hwcap & NPY__HWCAP_HALF) != 0;
-            npy__cpu_have[NPY_CPU_FEATURE_NEON_VFPV4] = (hwcap & NPY__HWCAP_VFPv4) != 0;
-        }
-    }
-    return 1;
-}
-#endif
-
-static void
-npy__cpu_init_features(void)
-{
-    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
-#ifdef __linux__
-    if (npy__cpu_init_features_linux())
-        return;
-#endif
-    // We have nothing else todo
-#if defined(NPY_HAVE_ASIMD) || defined(__aarch64__) || (defined(__ARM_ARCH) && __ARM_ARCH >= 8)
-    #if defined(NPY_HAVE_FPHP) || defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
-    npy__cpu_have[NPY_CPU_FEATURE_FPHP] = 1;
-    #endif
-    #if defined(NPY_HAVE_ASIMDHP) || defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
-    npy__cpu_have[NPY_CPU_FEATURE_ASIMDHP] = 1;
-    #endif
-    #if defined(NPY_HAVE_ASIMDDP) || defined(__ARM_FEATURE_DOTPROD)
-    npy__cpu_have[NPY_CPU_FEATURE_ASIMDDP] = 1;
-    #endif
-    #if defined(NPY_HAVE_ASIMDFHM) || defined(__ARM_FEATURE_FP16FML)
-    npy__cpu_have[NPY_CPU_FEATURE_ASIMDFHM] = 1;
-    #endif
-    npy__cpu_init_features_arm8();
-#else
-    #if defined(NPY_HAVE_NEON) || defined(__ARM_NEON__)
-        npy__cpu_have[NPY_CPU_FEATURE_NEON] = 1;
-    #endif
-    #if defined(NPY_HAVE_NEON_FP16) || defined(__ARM_FP16_FORMAT_IEEE) || (defined(__ARM_FP) && (__ARM_FP & 2))
-        npy__cpu_have[NPY_CPU_FEATURE_NEON_FP16] = npy__cpu_have[NPY_CPU_FEATURE_NEON];
-    #endif
-    #if defined(NPY_HAVE_NEON_VFPV4) || defined(__ARM_FEATURE_FMA)
-        npy__cpu_have[NPY_CPU_FEATURE_NEON_VFPV4] = npy__cpu_have[NPY_CPU_FEATURE_NEON];
-    #endif
-#endif
-}
-
-/*********** Unsupported ARCH ***********/
-#else
-static void
-npy__cpu_init_features(void)
-{
-    /*
-     * just in case if the compiler doesn't respect ANSI
-     * but for knowing paltforms it still nessecery, because @npy__cpu_init_features
-     * may called multiple of times and we need to clear the disabled features by
-     * ENV Var or maybe in the future we can support other methods like
-     * global variables, go back to @npy__cpu_try_disable_env for more understanding
-     */
-    memset(npy__cpu_have, 0, sizeof(npy__cpu_have[0]) * NPY_CPU_FEATURE_MAX);
-}
-#endif
diff --git a/numpy/core/src/common/npy_cpu_features.h b/numpy/core/src/common/npy_cpu_features.h

index ce1fc822ac038eb261555cd4ce44586d8b6a2a82..3d5f2e75cb12432eeb9f85c003e9c9d0acd7629d 100644 (file)
--- a/numpy/core/src/common/npy_cpu_features.h
+++ b/numpy/core/src/common/npy_cpu_features.h
@@ -65,6 +65,8 @@ enum npy_cpu_features
      NPY_CPU_FEATURE_VSX2              = 201,
      // POWER9
      NPY_CPU_FEATURE_VSX3              = 202,
+    // POWER10
+    NPY_CPU_FEATURE_VSX4              = 203,
  
      // ARM
      NPY_CPU_FEATURE_NEON              = 300,
@@ -82,6 +84,15 @@ enum npy_cpu_features
      // ARMv8.2 single&half-precision multiply
      NPY_CPU_FEATURE_ASIMDFHM          = 307,
  
+    // IBM/ZARCH
+    NPY_CPU_FEATURE_VX                = 350,
+ 
+    // Vector-Enhancements Facility 1
+    NPY_CPU_FEATURE_VXE               = 351,
+
+    // Vector-Enhancements Facility 2
+    NPY_CPU_FEATURE_VXE2              = 352,
+
      NPY_CPU_FEATURE_MAX
  };
  
@@ -138,6 +149,7 @@ npy_cpu_features_dict(void);
   * On aarch64: ['NEON', 'NEON_FP16', 'NEON_VPFV4', 'ASIMD']
   * On ppc64: []
   * On ppc64le: ['VSX', 'VSX2']
+ * On s390x: []
   * On any other arch or if the optimization is disabled: []
   */
  NPY_VISIBILITY_HIDDEN PyObject *
@@ -157,8 +169,9 @@ npy_cpu_baseline_list(void);
   * On x64: ['SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX', 'F16C', 'FMA3', 'AVX2', 'AVX512F', ...]
   * On armhf: ['NEON', 'NEON_FP16', 'NEON_VPFV4', 'ASIMD', 'ASIMDHP', 'ASIMDDP', 'ASIMDFHM']
   * On aarch64: ['ASIMDHP', 'ASIMDDP', 'ASIMDFHM']
- * On ppc64:  ['VSX', 'VSX2', 'VSX3']
- * On ppc64le: ['VSX3']
+ * On ppc64:  ['VSX', 'VSX2', 'VSX3', 'VSX4']
+ * On ppc64le: ['VSX3', 'VSX4']
+ * On s390x: ['VX', 'VXE', 'VXE2']
   * On any other arch or if the optimization is disabled: []
   */
  NPY_VISIBILITY_HIDDEN PyObject *
diff --git a/numpy/core/src/common/npy_dlpack.h b/numpy/core/src/common/npy_dlpack.h

index 14ca352c01a7d31e1615a9ab63dd26c1e715ddda..cb926a26271d7f7be2d94aaade5e7f10235c2d0f 100644 (file)
--- a/numpy/core/src/common/npy_dlpack.h
+++ b/numpy/core/src/common/npy_dlpack.h
@@ -23,6 +23,6 @@ array_dlpack_device(PyArrayObject *self, PyObject *NPY_UNUSED(args));
  
  
  NPY_NO_EXPORT PyObject *
-_from_dlpack(PyObject *NPY_UNUSED(self), PyObject *obj);
+from_dlpack(PyObject *NPY_UNUSED(self), PyObject *obj);
  
  #endif
diff --git a/numpy/core/src/common/npy_partition.h b/numpy/core/src/common/npy_partition.h

new file mode 100644 (file)

index 0000000..85a0727
--- /dev/null
+++ b/numpy/core/src/common/npy_partition.h
@@ -0,0 +1,27 @@
+#ifndef NUMPY_CORE_SRC_COMMON_PARTITION_H_
+#define NUMPY_CORE_SRC_COMMON_PARTITION_H_
+
+#include "npy_sort.h"
+
+/* Python include is for future object sorts */
+#include <Python.h>
+
+#include <numpy/ndarraytypes.h>
+#include <numpy/npy_common.h>
+
+#define NPY_MAX_PIVOT_STACK 50
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+NPY_NO_EXPORT PyArray_PartitionFunc *
+get_partition_func(int type, NPY_SELECTKIND which);
+NPY_NO_EXPORT PyArray_ArgPartitionFunc *
+get_argpartition_func(int type, NPY_SELECTKIND which);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/numpy/core/src/common/npy_partition.h.src b/numpy/core/src/common/npy_partition.h.src

deleted file mode 100644 (file)

index 72c2095..0000000
--- a/numpy/core/src/common/npy_partition.h.src
+++ /dev/null
@@ -1,126 +0,0 @@
-/*
- *****************************************************************************
- **               IMPORTANT NOTE for npy_partition.h.src -> npy_partition.h **
- *****************************************************************************
- *  The template file loops.h.src is not automatically converted into
- *  loops.h by the build system.  If you edit this file, you must manually
- *  do the conversion using numpy/distutils/conv_template.py from the
- *  command line as follows:
- *
- *  $ cd <NumPy source root directory>
- *  $ python  numpy/distutils/conv_template.py numpy/core/src/private/npy_partition.h.src
- *  $
- */
-
-
-#ifndef __NPY_PARTITION_H__
-#define __NPY_PARTITION_H__
-
-
-#include "npy_sort.h"
-
-/* Python include is for future object sorts */
-#include <Python.h>
-#include <numpy/npy_common.h>
-#include <numpy/ndarraytypes.h>
-
-#define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))
-
-#define NPY_MAX_PIVOT_STACK 50
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble#
- */
-
-NPY_NO_EXPORT int introselect_@suff@(@type@ *v, npy_intp num,
-                                             npy_intp kth,
-                                             npy_intp * pivots,
-                                             npy_intp * npiv,
-                                             void *NOT_USED);
-NPY_NO_EXPORT int aintroselect_@suff@(@type@ *v, npy_intp* tosort, npy_intp num,
-                                              npy_intp kth,
-                                              npy_intp * pivots,
-                                              npy_intp * npiv,
-                                              void *NOT_USED);
-
-
-/**end repeat**/
-
-typedef struct {
-    int typenum;
-    PyArray_PartitionFunc * part[NPY_NSELECTS];
-    PyArray_ArgPartitionFunc * argpart[NPY_NSELECTS];
-} part_map;
-
-static part_map _part_map[] = {
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble#
- */
-    {
-        NPY_@TYPE@,
-        {
-            (PyArray_PartitionFunc *)&introselect_@suff@,
-        },
-        {
-            (PyArray_ArgPartitionFunc *)&aintroselect_@suff@,
-        }
-    },
-/**end repeat**/
-};
-
-
-static NPY_INLINE PyArray_PartitionFunc *
-get_partition_func(int type, NPY_SELECTKIND which)
-{
-    npy_intp i;
-    npy_intp ntypes = ARRAY_SIZE(_part_map);
-
-    if (which >= NPY_NSELECTS) {
-        return NULL;
-    }
-    for (i = 0; i < ntypes; i++) {
-        if (type == _part_map[i].typenum) {
-            return _part_map[i].part[which];
-        }
-    }
-    return NULL;
-}
-
-
-static NPY_INLINE PyArray_ArgPartitionFunc *
-get_argpartition_func(int type, NPY_SELECTKIND which)
-{
-    npy_intp i;
-    npy_intp ntypes = ARRAY_SIZE(_part_map);
-
-    for (i = 0; i < ntypes; i++) {
-        if (type == _part_map[i].typenum) {
-            return _part_map[i].argpart[which];
-        }
-    }
-    return NULL;
-}
-
-#undef ARRAY_SIZE
-
-#endif
diff --git a/numpy/core/src/common/npy_sort.h.src b/numpy/core/src/common/npy_sort.h.src

index b4a1e9b0cad97943ebcfe2e084397a37a760c96e..a3f556f56a9ce0728df2d465cdcf9c0b53de75f4 100644 (file)
--- a/numpy/core/src/common/npy_sort.h.src
+++ b/numpy/core/src/common/npy_sort.h.src
@@ -18,6 +18,11 @@ static NPY_INLINE int npy_get_msb(npy_uintp unum)
      return depth_limit;
  }
  
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+
  
  /*
   *****************************************************************************
@@ -102,4 +107,8 @@ NPY_NO_EXPORT int npy_aheapsort(void *vec, npy_intp *ind, npy_intp cnt, void *ar
  NPY_NO_EXPORT int npy_amergesort(void *vec, npy_intp *ind, npy_intp cnt, void *arr);
  NPY_NO_EXPORT int npy_atimsort(void *vec, npy_intp *ind, npy_intp cnt, void *arr);
  
+#ifdef __cplusplus
+}
+#endif
+
  #endif
diff --git a/numpy/core/src/common/npy_svml.h b/numpy/core/src/common/npy_svml.h

index 4292f7090333c6d9a87a6bfda3af00b1cb519f17..1111025d736f94ce89aba1d5034d332b1a000def 100644 (file)
--- a/numpy/core/src/common/npy_svml.h
+++ b/numpy/core/src/common/npy_svml.h
@@ -1,5 +1,7 @@
  #if NPY_SIMD && defined(NPY_HAVE_AVX512_SKX) && defined(NPY_CAN_LINK_SVML)
+extern __m512 __svml_expf16(__m512 x);
  extern __m512 __svml_exp2f16(__m512 x);
+extern __m512 __svml_logf16(__m512 x);
  extern __m512 __svml_log2f16(__m512 x);
  extern __m512 __svml_log10f16(__m512 x);
  extern __m512 __svml_expm1f16(__m512 x);
@@ -19,7 +21,9 @@ extern __m512 __svml_asinhf16(__m512 x);
  extern __m512 __svml_acoshf16(__m512 x);
  extern __m512 __svml_atanhf16(__m512 x);
  
+extern __m512d __svml_exp8(__m512d x);
  extern __m512d __svml_exp28(__m512d x);
+extern __m512d __svml_log8(__m512d x);
  extern __m512d __svml_log28(__m512d x);
  extern __m512d __svml_log108(__m512d x);
  extern __m512d __svml_expm18(__m512d x);
diff --git a/numpy/core/src/common/numpy_tag.h b/numpy/core/src/common/numpy_tag.h

index dc8d5286b07d2500005ab3b28db37725ea2d66f1..ee0c36cacd73d55fa74344ff056631d500d93e63 100644 (file)
--- a/numpy/core/src/common/numpy_tag.h
+++ b/numpy/core/src/common/numpy_tag.h
@@ -1,8 +1,15 @@
  #ifndef _NPY_COMMON_TAG_H_
  #define _NPY_COMMON_TAG_H_
  
+#include "../npysort/npysort_common.h"
+
  namespace npy {
  
+template<typename... tags>
+struct taglist {
+  static constexpr unsigned size = sizeof...(tags);
+};
+
  struct integral_tag {
  };
  struct floating_point_tag {
@@ -14,63 +21,237 @@ struct date_tag {
  
  struct bool_tag : integral_tag {
      using type = npy_bool;
+    static constexpr NPY_TYPES type_value = NPY_BOOL;
+    static int less(type const& a, type const& b) {
+      return BOOL_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct byte_tag : integral_tag {
      using type = npy_byte;
+    static constexpr NPY_TYPES type_value = NPY_BYTE;
+    static int less(type const& a, type const& b) {
+      return BYTE_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct ubyte_tag : integral_tag {
      using type = npy_ubyte;
+    static constexpr NPY_TYPES type_value = NPY_UBYTE;
+    static int less(type const& a, type const& b) {
+      return UBYTE_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct short_tag : integral_tag {
      using type = npy_short;
+    static constexpr NPY_TYPES type_value = NPY_SHORT;
+    static int less(type const& a, type const& b) {
+      return SHORT_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct ushort_tag : integral_tag {
      using type = npy_ushort;
+    static constexpr NPY_TYPES type_value = NPY_USHORT;
+    static int less(type const& a, type const& b) {
+      return USHORT_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct int_tag : integral_tag {
      using type = npy_int;
+    static constexpr NPY_TYPES type_value = NPY_INT;
+    static int less(type const& a, type const& b) {
+      return INT_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct uint_tag : integral_tag {
      using type = npy_uint;
+    static constexpr NPY_TYPES type_value = NPY_UINT;
+    static int less(type const& a, type const& b) {
+      return UINT_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct long_tag : integral_tag {
      using type = npy_long;
+    static constexpr NPY_TYPES type_value = NPY_LONG;
+    static int less(type const& a, type const& b) {
+      return LONG_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct ulong_tag : integral_tag {
      using type = npy_ulong;
+    static constexpr NPY_TYPES type_value = NPY_ULONG;
+    static int less(type const& a, type const& b) {
+      return ULONG_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct longlong_tag : integral_tag {
      using type = npy_longlong;
+    static constexpr NPY_TYPES type_value = NPY_LONGLONG;
+    static int less(type const& a, type const& b) {
+      return LONGLONG_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct ulonglong_tag : integral_tag {
      using type = npy_ulonglong;
+    static constexpr NPY_TYPES type_value = NPY_ULONGLONG;
+    static int less(type const& a, type const& b) {
+      return ULONGLONG_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct half_tag {
      using type = npy_half;
+    static constexpr NPY_TYPES type_value = NPY_HALF;
+    static int less(type const& a, type const& b) {
+      return HALF_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct float_tag : floating_point_tag {
      using type = npy_float;
+    static constexpr NPY_TYPES type_value = NPY_FLOAT;
+    static int less(type const& a, type const& b) {
+      return FLOAT_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct double_tag : floating_point_tag {
      using type = npy_double;
+    static constexpr NPY_TYPES type_value = NPY_DOUBLE;
+    static int less(type const& a, type const& b) {
+      return DOUBLE_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct longdouble_tag : floating_point_tag {
      using type = npy_longdouble;
+    static constexpr NPY_TYPES type_value = NPY_LONGDOUBLE;
+    static int less(type const& a, type const& b) {
+      return LONGDOUBLE_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct cfloat_tag : complex_tag {
      using type = npy_cfloat;
+    static constexpr NPY_TYPES type_value = NPY_CFLOAT;
+    static int less(type const& a, type const& b) {
+      return CFLOAT_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct cdouble_tag : complex_tag {
      using type = npy_cdouble;
+    static constexpr NPY_TYPES type_value = NPY_CDOUBLE;
+    static int less(type const& a, type const& b) {
+      return CDOUBLE_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct clongdouble_tag : complex_tag {
      using type = npy_clongdouble;
+    static constexpr NPY_TYPES type_value = NPY_CLONGDOUBLE;
+    static int less(type const& a, type const& b) {
+      return CLONGDOUBLE_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct datetime_tag : date_tag {
      using type = npy_datetime;
+    static constexpr NPY_TYPES type_value = NPY_DATETIME;
+    static int less(type const& a, type const& b) {
+      return DATETIME_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
  };
  struct timedelta_tag : date_tag {
      using type = npy_timedelta;
+    static constexpr NPY_TYPES type_value = NPY_TIMEDELTA;
+    static int less(type const& a, type const& b) {
+      return TIMEDELTA_LT(a, b);
+    }
+    static int less_equal(type const& a, type const& b) {
+      return !less(b, a);
+    }
+};
+
+struct string_tag {
+    using type = npy_char;
+    static constexpr NPY_TYPES type_value = NPY_STRING;
+    static int less(type const* a, type const* b, size_t len) {
+      return STRING_LT(a, b, len);
+    }
+    static int less_equal(type const* a, type const* b, size_t len) {
+      return !less(b, a, len);
+    }
+    static void swap(type* a, type* b, size_t len) {
+      STRING_SWAP(a, b, len);
+    }
+    static void copy(type * a, type const* b, size_t len) {
+      STRING_COPY(a, b, len);
+    }
+};
+
+struct unicode_tag {
+    using type = npy_ucs4;
+    static constexpr NPY_TYPES type_value = NPY_UNICODE;
+    static int less(type const* a, type const* b, size_t len) {
+      return UNICODE_LT(a, b, len);
+    }
+    static int less_equal(type const* a, type const* b, size_t len) {
+      return !less(b, a, len);
+    }
+    static void swap(type* a, type* b, size_t len) {
+      UNICODE_SWAP(a, b, len);
+    }
+    static void copy(type * a, type const* b, size_t len) {
+      UNICODE_COPY(a, b, len);
+    }
  };
  
  }  // namespace npy
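
Note on the tag additions above: each tag now carries a static `less`/`less_equal` pair, so a generic routine can be parameterised on the tag instead of on the element type. A minimal sketch of that pattern follows. `demo_int_tag` and `insertion_sort` are hypothetical names for illustration only; they are not part of NumPy, and the real tags delegate to the `*_LT` macros from `npysort_common.h`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tag such as npy::int_tag (not NumPy's definition).
struct demo_int_tag {
    using type = int;
    static int less(type const& a, type const& b) { return a < b; }
    // Same derivation as in the diff: a <= b  <=>  !(b < a).
    static int less_equal(type const& a, type const& b) { return !less(b, a); }
};

// Generic insertion sort that takes its element type and ordering from the tag.
template <typename Tag>
void insertion_sort(typename Tag::type* v, std::size_t n)
{
    assert(v != nullptr || n == 0);
    for (std::size_t i = 1; i < n; ++i) {
        typename Tag::type key = v[i];
        std::size_t j = i;
        // Shift larger elements one slot right until key's position is found.
        while (j > 0 && Tag::less(key, v[j - 1])) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}
```

A caller would instantiate it as `insertion_sort<demo_int_tag>(data, count)`. Swapping the tag changes both the element type and the comparison without touching the algorithm.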
diff --git a/numpy/core/src/common/simd/avx2/math.h b/numpy/core/src/common/simd/avx2/math.h

index ec15e50e1fdb9e7a94aaa21f8dd8918193ef3383..deaf4ad115e8b3a52785d35c4bd4a9d7570f0705 100644 (file)
--- a/numpy/core/src/common/simd/avx2/math.h
+++ b/numpy/core/src/common/simd/avx2/math.h
@@ -42,7 +42,7 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_max_f64 _mm256_max_pd
  // Maximum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  NPY_FINLINE npyv_f32 npyv_maxp_f32(npyv_f32 a, npyv_f32 b)
  {
      __m256 nn  = _mm256_cmp_ps(b, b, _CMP_ORD_Q);
@@ -76,7 +76,7 @@ NPY_FINLINE npyv_s64 npyv_max_s64(npyv_s64 a, npyv_s64 b)
  #define npyv_min_f64 _mm256_min_pd
  // Minimum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  NPY_FINLINE npyv_f32 npyv_minp_f32(npyv_f32 a, npyv_f32 b)
  {
      __m256 nn  = _mm256_cmp_ps(b, b, _CMP_ORD_Q);
@@ -105,6 +105,10 @@ NPY_FINLINE npyv_s64 npyv_min_s64(npyv_s64 a, npyv_s64 b)
      return _mm256_blendv_epi8(a, b, _mm256_cmpgt_epi64(a, b));
  }
  
+// round to nearest integer, ties to even
+#define npyv_rint_f32(A) _mm256_round_ps(A, _MM_FROUND_TO_NEAREST_INT)
+#define npyv_rint_f64(A) _mm256_round_pd(A, _MM_FROUND_TO_NEAREST_INT)
+
  // ceil
  #define npyv_ceil_f32 _mm256_ceil_ps
  #define npyv_ceil_f64 _mm256_ceil_pd
@@ -113,4 +117,8 @@ NPY_FINLINE npyv_s64 npyv_min_s64(npyv_s64 a, npyv_s64 b)
  #define npyv_trunc_f32(A) _mm256_round_ps(A, _MM_FROUND_TO_ZERO)
  #define npyv_trunc_f64(A) _mm256_round_pd(A, _MM_FROUND_TO_ZERO)
  
+// floor
+#define npyv_floor_f32 _mm256_floor_ps
+#define npyv_floor_f64 _mm256_floor_pd
+
  #endif // _NPY_SIMD_AVX2_MATH_H
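
Note on `npyv_rint_f32`/`npyv_rint_f64` above: `_MM_FROUND_TO_NEAREST_INT` rounds to the nearest integer and resolves exact ties toward the even neighbour. A scalar reference for that behaviour can be sketched with the standard library, assuming the floating-point environment is in its default `FE_TONEAREST` mode. `rint_even` is a hypothetical helper name, not a NumPy function.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Scalar reference for round-to-nearest with ties-to-even.
// Assumes the current rounding mode is the default FE_TONEAREST.
double rint_even(double x)
{
    assert(std::fegetround() == FE_TONEAREST);
    return std::nearbyint(x);
}
```

Under that assumption, halfway cases such as 2.5 and 3.5 round to 2.0 and 4.0 respectively, which differs from the round-half-away-from-zero behaviour of `std::round`.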
diff --git a/numpy/core/src/common/simd/avx2/memory.h b/numpy/core/src/common/simd/avx2/memory.h

index 5891a270aa182188a02f2fcfdd5ab0701b08a458..410c35dc8873f2f9e97f2e5dfd333ecd6181fbb1 100644 (file)
--- a/numpy/core/src/common/simd/avx2/memory.h
+++ b/numpy/core/src/common/simd/avx2/memory.h
@@ -353,4 +353,25 @@ NPYV_IMPL_AVX2_REST_PARTIAL_TYPES(f32, s32)
  NPYV_IMPL_AVX2_REST_PARTIAL_TYPES(u64, s64)
  NPYV_IMPL_AVX2_REST_PARTIAL_TYPES(f64, s64)
  
+/*********************************
+ * Lookup tables
+ *********************************/
+// uses vector as indexes into a table
+// that contains 32 elements of float32.
+NPY_FINLINE npyv_f32 npyv_lut32_f32(const float *table, npyv_u32 idx)
+{ return _mm256_i32gather_ps(table, idx, 4); }
+NPY_FINLINE npyv_u32 npyv_lut32_u32(const npy_uint32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_u32_f32(npyv_lut32_f32((const float*)table, idx)); }
+NPY_FINLINE npyv_s32 npyv_lut32_s32(const npy_int32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_s32_f32(npyv_lut32_f32((const float*)table, idx)); }
+
+// uses vector as indexes into a table
+// that contains 16 elements of float64.
+NPY_FINLINE npyv_f64 npyv_lut16_f64(const double *table, npyv_u64 idx)
+{ return _mm256_i64gather_pd(table, idx, 8); }
+NPY_FINLINE npyv_u64 npyv_lut16_u64(const npy_uint64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_u64_f64(npyv_lut16_f64((const double*)table, idx)); }
+NPY_FINLINE npyv_s64 npyv_lut16_s64(const npy_int64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_s64_f64(npyv_lut16_f64((const double*)table, idx)); }
+
  #endif // _NPY_SIMD_AVX2_MEMORY_H
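
Note on the lookup-table helpers above: the gather with a scale of 4 bytes reads `table[idx[i]]` for every lane. A scalar model of the 32-entry float variant is sketched below. `lut32_f32_scalar` and `kLanes` are hypothetical names, and the bounds check is an addition for illustration; the intrinsic itself performs no such check.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight lanes mirror one 256-bit vector of 32-bit elements.
constexpr std::size_t kLanes = 8;

// Scalar model of a per-lane table lookup: out[i] = table[idx[i]].
std::array<float, kLanes> lut32_f32_scalar(
        const float* table, const std::array<std::uint32_t, kLanes>& idx)
{
    std::array<float, kLanes> out{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        assert(idx[i] < 32);  // the table is assumed to hold exactly 32 entries
        out[i] = table[idx[i]];
    }
    return out;
}
```

The `u32` and `s32` variants in the diff reuse the float gather and reinterpret the bits, which works because the lookup only moves 32-bit values and never interprets them.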
diff --git a/numpy/core/src/common/simd/avx512/arithmetic.h b/numpy/core/src/common/simd/avx512/arithmetic.h

index f8632e7017908dcd82acd9cc504673434c8b32c7..93e9d9d45197150264536bda5cdd033f493b9b22 100644 (file)
--- a/numpy/core/src/common/simd/avx512/arithmetic.h
+++ b/numpy/core/src/common/simd/avx512/arithmetic.h
@@ -371,7 +371,79 @@ NPY_FINLINE npyv_s64 npyv_divc_s64(npyv_s64 a, const npyv_s64x3 divisor)
      #define npyv_sum_u64 _mm512_reduce_add_epi64
      #define npyv_sum_f32 _mm512_reduce_add_ps
      #define npyv_sum_f64 _mm512_reduce_add_pd
+    #define npyv_reducemin_u32 _mm512_reduce_min_epu32
+    #define npyv_reducemin_s32 _mm512_reduce_min_epi32
+    #define npyv_reducemin_f32 _mm512_reduce_min_ps
+    #define npyv_reducemax_u32 _mm512_reduce_max_epu32
+    #define npyv_reducemax_s32 _mm512_reduce_max_epi32
+    #define npyv_reducemax_f32 _mm512_reduce_max_ps
  #else
+    NPY_FINLINE npy_uint32 npyv_reducemax_u32(npyv_u32 a)
+    {
+        const npyv_u32 idx1 = _mm512_set_epi32(7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8);
+        const npyv_u32 idx2 = _mm512_set_epi32(3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
+        npyv_u32 a1 = _mm512_max_epu32(a, _mm512_permutex2var_epi32(a, idx1, a));
+        npyv_u32 a2 = _mm512_max_epu32(a1, _mm512_permutex2var_epi32(a1, idx2, a1));
+        npyv_u32 a3 = _mm512_max_epu32(a2, _mm512_shuffle_epi32(a2, (1<<6 | 0<<4 | 3<<2 | 2)));
+        npyv_u32 a4 = _mm512_max_epu32(a3, _mm512_shuffle_epi32(a3, (2<<6 | 3<<4 | 0<<2 | 1)));
+        return _mm_cvtsi128_si32(_mm512_extracti32x4_epi32(a4, 0x00));
+    }
+
+    NPY_FINLINE npy_int32 npyv_reducemax_s32(npyv_s32 a)
+    {
+        const npyv_u32 idx1 = _mm512_set_epi32(7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8);
+        const npyv_u32 idx2 = _mm512_set_epi32(3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
+        npyv_s32 a1 = _mm512_max_epi32(a, _mm512_permutex2var_epi32(a, idx1, a));
+        npyv_s32 a2 = _mm512_max_epi32(a1, _mm512_permutex2var_epi32(a1, idx2, a1));
+        npyv_s32 a3 = _mm512_max_epi32(a2, _mm512_shuffle_epi32(a2, (1<<6 | 0<<4 | 3<<2 | 2)));
+        npyv_s32 a4 = _mm512_max_epi32(a3, _mm512_shuffle_epi32(a3, (2<<6 | 3<<4 | 0<<2 | 1)));
+        return _mm_cvtsi128_si32(_mm512_extracti32x4_epi32(a4, 0x00));
+    }
+
+    NPY_FINLINE npy_float npyv_reducemax_f32(npyv_f32 a)
+    {
+        const npyv_u32 idx1 = _mm512_set_epi32(7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8);
+        const npyv_u32 idx2 = _mm512_set_epi32(3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
+        npyv_f32 a1 = _mm512_max_ps(a, _mm512_permutex2var_ps(a, idx1, a));
+        npyv_f32 a2 = _mm512_max_ps(a1, _mm512_permutex2var_ps(a1, idx2, a1));
+        npyv_f32 a3 = _mm512_max_ps(a2, _mm512_shuffle_ps(a2, a2, (1<<6 | 0<<4 | 3<<2 | 2)));
+        npyv_f32 a4 = _mm512_max_ps(a3, _mm512_shuffle_ps(a3, a3, (2<<6 | 3<<4 | 0<<2 | 1)));
+        return _mm_cvtss_f32(_mm512_extractf32x4_ps(a4, 0x00));
+    }
+
+    NPY_FINLINE npy_uint32 npyv_reducemin_u32(npyv_u32 a)
+    {
+        const npyv_u32 idx1 = _mm512_set_epi32(7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8);
+        const npyv_u32 idx2 = _mm512_set_epi32(3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
+        npyv_u32 a1 = _mm512_min_epu32(a, _mm512_permutex2var_epi32(a, idx1, a));
+        npyv_u32 a2 = _mm512_min_epu32(a1, _mm512_permutex2var_epi32(a1, idx2, a1));
+        npyv_u32 a3 = _mm512_min_epu32(a2, _mm512_shuffle_epi32(a2, (1<<6 | 0<<4 | 3<<2 | 2)));
+        npyv_u32 a4 = _mm512_min_epu32(a3, _mm512_shuffle_epi32(a3, (2<<6 | 3<<4 | 0<<2 | 1)));
+        return _mm_cvtsi128_si32(_mm512_extracti32x4_epi32(a4, 0x00));
+    }
+
+    NPY_FINLINE npy_int32 npyv_reducemin_s32(npyv_s32 a)
+    {
+        const npyv_u32 idx1 = _mm512_set_epi32(7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8);
+        const npyv_u32 idx2 = _mm512_set_epi32(3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
+        npyv_s32 a1 = _mm512_min_epi32(a, _mm512_permutex2var_epi32(a, idx1, a));
+        npyv_s32 a2 = _mm512_min_epi32(a1, _mm512_permutex2var_epi32(a1, idx2, a1));
+        npyv_s32 a3 = _mm512_min_epi32(a2, _mm512_shuffle_epi32(a2, (1<<6 | 0<<4 | 3<<2 | 2)));
+        npyv_s32 a4 = _mm512_min_epi32(a3, _mm512_shuffle_epi32(a3, (2<<6 | 3<<4 | 0<<2 | 1)));
+        return _mm_cvtsi128_si32(_mm512_extracti32x4_epi32(a4, 0x00));
+    }
+
+    NPY_FINLINE npy_float npyv_reducemin_f32(npyv_f32 a)
+    {
+        const npyv_u32 idx1 = _mm512_set_epi32(7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8);
+        const npyv_u32 idx2 = _mm512_set_epi32(3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12);
+        npyv_f32 a1 = _mm512_min_ps(a, _mm512_permutex2var_ps(a, idx1, a));
+        npyv_f32 a2 = _mm512_min_ps(a1, _mm512_permutex2var_ps(a1, idx2, a1));
+        npyv_f32 a3 = _mm512_min_ps(a2, _mm512_shuffle_ps(a2, a2, (1<<6 | 0<<4 | 3<<2 | 2)));
+        npyv_f32 a4 = _mm512_min_ps(a3, _mm512_shuffle_ps(a3, a3, (2<<6 | 3<<4 | 0<<2 | 1)));
+        return _mm_cvtss_f32(_mm512_extractf32x4_ps(a4, 0x00));
+    }
+
      NPY_FINLINE npy_uint32 npyv_sum_u32(npyv_u32 a)
      {
          __m256i half = _mm256_add_epi32(npyv512_lower_si256(a), npyv512_higher_si256(a));
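The npyv_reducemax_* / npyv_reducemin_* fallbacks above all follow the same butterfly pattern: four permute-and-compare steps, after which every lane holds the reduction result and lane 0 is extracted. A minimal scalar sketch of that idea follows. The function name, the REDUCE_LANES constant and the XOR partner masks (8, 12, 2, 1, as read from the index and shuffle constants above) are illustrative assumptions, not part of NumPy.

```c
#include <assert.h>
#include <stdint.h>

enum { REDUCE_LANES = 16 };  /* assumed: 32-bit lanes in one 512-bit vector */

/* Hypothetical scalar model of a butterfly max-reduction over 16 lanes. */
static uint32_t reducemax_u32_model(const uint32_t lanes[REDUCE_LANES])
{
    /* Each step pairs lane i with lane (i ^ mask). */
    static const unsigned masks[4] = {8, 12, 2, 1};
    uint32_t cur[REDUCE_LANES], next[REDUCE_LANES];

    for (int i = 0; i < REDUCE_LANES; i++) {
        cur[i] = lanes[i];
    }
    for (int s = 0; s < 4; s++) {
        for (int i = 0; i < REDUCE_LANES; i++) {
            uint32_t partner = cur[i ^ masks[s]];
            next[i] = cur[i] > partner ? cur[i] : partner;
        }
        for (int i = 0; i < REDUCE_LANES; i++) {
            cur[i] = next[i];
        }
    }
    /* After the last step every lane holds the same value. */
    assert(cur[0] == cur[REDUCE_LANES - 1]);
    return cur[0];
}
```

Swapping the comparison for `<` gives the corresponding minimum reduction.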
diff --git a/numpy/core/src/common/simd/avx512/math.h b/numpy/core/src/common/simd/avx512/math.h

index f30e50ad05dfeb5037ba921751023af5e0b85ab4..5a6cb6dcd4425f0008461bd733b7a189b121d49b 100644 (file)
--- a/numpy/core/src/common/simd/avx512/math.h
+++ b/numpy/core/src/common/simd/avx512/math.h
@@ -51,7 +51,7 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_max_f64 _mm512_max_pd
  // Maximum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  NPY_FINLINE npyv_f32 npyv_maxp_f32(npyv_f32 a, npyv_f32 b)
  {
      __mmask16 nn = _mm512_cmp_ps_mask(b, b, _CMP_ORD_Q);
@@ -84,7 +84,7 @@ NPY_FINLINE npyv_f64 npyv_maxp_f64(npyv_f64 a, npyv_f64 b)
  #define npyv_min_f64 _mm512_min_pd
  // Minimum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  NPY_FINLINE npyv_f32 npyv_minp_f32(npyv_f32 a, npyv_f32 b)
  {
      __mmask16 nn = _mm512_cmp_ps_mask(b, b, _CMP_ORD_Q);
@@ -112,6 +112,10 @@ NPY_FINLINE npyv_f64 npyv_minp_f64(npyv_f64 a, npyv_f64 b)
  #define npyv_min_u64 _mm512_min_epu64
  #define npyv_min_s64 _mm512_min_epi64
  
+// round to nearest integer even
+#define npyv_rint_f32(A) _mm512_roundscale_ps(A, _MM_FROUND_TO_NEAREST_INT)
+#define npyv_rint_f64(A) _mm512_roundscale_pd(A, _MM_FROUND_TO_NEAREST_INT)
+
  // ceil
  #define npyv_ceil_f32(A) _mm512_roundscale_ps(A, _MM_FROUND_TO_POS_INF)
  #define npyv_ceil_f64(A) _mm512_roundscale_pd(A, _MM_FROUND_TO_POS_INF)
@@ -120,4 +124,8 @@ NPY_FINLINE npyv_f64 npyv_minp_f64(npyv_f64 a, npyv_f64 b)
  #define npyv_trunc_f32(A) _mm512_roundscale_ps(A, _MM_FROUND_TO_ZERO)
  #define npyv_trunc_f64(A) _mm512_roundscale_pd(A, _MM_FROUND_TO_ZERO)
  
+// floor
+#define npyv_floor_f32(A) _mm512_roundscale_ps(A, _MM_FROUND_TO_NEG_INF)
+#define npyv_floor_f64(A) _mm512_roundscale_pd(A, _MM_FROUND_TO_NEG_INF)
+
  #endif // _NPY_SIMD_AVX512_MATH_H
diff --git a/numpy/core/src/common/simd/avx512/memory.h b/numpy/core/src/common/simd/avx512/memory.h

index 47095bf72aa1de55203f129a0511e41c2f92d3aa..03fcb4630cd61fa2873580d68294f7f4230cd959 100644 (file)
--- a/numpy/core/src/common/simd/avx512/memory.h
+++ b/numpy/core/src/common/simd/avx512/memory.h
@@ -276,7 +276,8 @@ NPY_FINLINE void npyv_storen_till_s64(npy_int64 *ptr, npy_intp stride, npy_uintp
          union {                                                                             \
              npyv_lanetype_##F_SFX from_##F_SFX;                                             \
              npyv_lanetype_##T_SFX to_##T_SFX;                                               \
-        } pun = {.from_##F_SFX = fill};                                                     \
+        } pun;                                                                              \
+        pun.from_##F_SFX = fill;                                                            \
          return npyv_reinterpret_##F_SFX##_##T_SFX(npyv_load_till_##T_SFX(                   \
              (const npyv_lanetype_##T_SFX *)ptr, nlane, pun.to_##T_SFX                       \
          ));                                                                                 \
@@ -288,7 +289,8 @@ NPY_FINLINE void npyv_storen_till_s64(npy_int64 *ptr, npy_intp stride, npy_uintp
          union {                                                                             \
              npyv_lanetype_##F_SFX from_##F_SFX;                                             \
              npyv_lanetype_##T_SFX to_##T_SFX;                                               \
-        } pun = {.from_##F_SFX = fill};                                                     \
+        } pun;                                                                              \
+        pun.from_##F_SFX = fill;                                                            \
          return npyv_reinterpret_##F_SFX##_##T_SFX(npyv_loadn_till_##T_SFX(                  \
              (const npyv_lanetype_##T_SFX *)ptr, stride, nlane, pun.to_##T_SFX               \
          ));                                                                                 \
@@ -329,4 +331,33 @@ NPYV_IMPL_AVX512_REST_PARTIAL_TYPES(f32, s32)
  NPYV_IMPL_AVX512_REST_PARTIAL_TYPES(u64, s64)
  NPYV_IMPL_AVX512_REST_PARTIAL_TYPES(f64, s64)
  
+/**************************************************
+ * Lookup table
+ *************************************************/
+// uses vector as indexes into a table
+// that contains 32 elements of float32.
+NPY_FINLINE npyv_f32 npyv_lut32_f32(const float *table, npyv_u32 idx)
+{
+    const npyv_f32 table0 = npyv_load_f32(table);
+    const npyv_f32 table1 = npyv_load_f32(table + 16);
+    return _mm512_permutex2var_ps(table0, idx, table1);
+}
+NPY_FINLINE npyv_u32 npyv_lut32_u32(const npy_uint32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_u32_f32(npyv_lut32_f32((const float*)table, idx)); }
+NPY_FINLINE npyv_s32 npyv_lut32_s32(const npy_int32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_s32_f32(npyv_lut32_f32((const float*)table, idx)); }
+
+// uses vector as indexes into a table
+// that contains 16 elements of float64.
+NPY_FINLINE npyv_f64 npyv_lut16_f64(const double *table, npyv_u64 idx)
+{
+    const npyv_f64 table0 = npyv_load_f64(table);
+    const npyv_f64 table1 = npyv_load_f64(table + 8);
+    return _mm512_permutex2var_pd(table0, idx, table1);
+}
+NPY_FINLINE npyv_u64 npyv_lut16_u64(const npy_uint64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_u64_f64(npyv_lut16_f64((const double*)table, idx)); }
+NPY_FINLINE npyv_s64 npyv_lut16_s64(const npy_int64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_s64_f64(npyv_lut16_f64((const double*)table, idx)); }
+
  #endif // _NPY_SIMD_AVX512_MEMORY_H
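The npyv_lut32_* helpers above treat each lane of idx as an index into a 32-entry table, so the AVX512 version can serve all sixteen lanes with one two-source permute. A scalar sketch of the intended semantics follows. The names lut32_f32_model, LUT_SIZE and LUT_LANES are hypothetical, and the sketch assumes every index is already in the range 0..31.

```c
#include <assert.h>
#include <stdint.h>

enum { LUT_SIZE = 32, LUT_LANES = 16 };  /* assumed sizes for the f32 case */

/* Hypothetical scalar model: out[i] = table[idx[i]] for every lane. */
static void lut32_f32_model(const float table[LUT_SIZE],
                            const uint32_t idx[LUT_LANES],
                            float out[LUT_LANES])
{
    for (int i = 0; i < LUT_LANES; i++) {
        assert(idx[i] < LUT_SIZE);  /* caller must keep indices in range */
        out[i] = table[idx[i]];
    }
}
```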
diff --git a/numpy/core/src/common/simd/avx512/reorder.h b/numpy/core/src/common/simd/avx512/reorder.h

index f043004ecc45ea79f5912cb9f1bfc54828c3aec6..c0b2477f38b85d50f06558ba7aaa0e19723ccf49 100644 (file)
--- a/numpy/core/src/common/simd/avx512/reorder.h
+++ b/numpy/core/src/common/simd/avx512/reorder.h
@@ -214,13 +214,13 @@ NPY_FINLINE npyv_u16 npyv_rev64_u16(npyv_u16 a)
  
  NPY_FINLINE npyv_u32 npyv_rev64_u32(npyv_u32 a)
  {
-    return _mm512_shuffle_epi32(a, _MM_SHUFFLE(2, 3, 0, 1));
+    return _mm512_shuffle_epi32(a, (_MM_PERM_ENUM)_MM_SHUFFLE(2, 3, 0, 1));
  }
  #define npyv_rev64_s32 npyv_rev64_u32
  
  NPY_FINLINE npyv_f32 npyv_rev64_f32(npyv_f32 a)
  {
-    return _mm512_shuffle_ps(a, a, _MM_SHUFFLE(2, 3, 0, 1));
+    return _mm512_shuffle_ps(a, a, (_MM_PERM_ENUM)_MM_SHUFFLE(2, 3, 0, 1));
  }
  
  #endif // _NPY_SIMD_AVX512_REORDER_H
diff --git a/numpy/core/src/common/simd/intdiv.h b/numpy/core/src/common/simd/intdiv.h

index a7a461721dba8d5eb3f42a7003bc11f3b4896a39..8b65b3a76e721452702e1dfd6ac09d6fac4d5b2e 100644 (file)
--- a/numpy/core/src/common/simd/intdiv.h
+++ b/numpy/core/src/common/simd/intdiv.h
@@ -46,9 +46,6 @@
   *  - For 64-bit division on Aarch64 and IBM/Power, we fall-back to the scalar division
   *    since emulating multiply-high is expensive and both architectures have very fast dividers.
   *
- ** TODO:
- *   - Add support for Power10(VSX4)
- *
   ***************************************************************
   ** Figure 4.1: Unsigned division by run–time invariant divisor
   ***************************************************************
@@ -136,7 +133,7 @@ NPY_FINLINE npy_uint64 npyv__divh128_u64(npy_uint64 high, npy_uint64 divisor)
  {
      assert(divisor > 1);
      npy_uint64 quotient;
-#if defined(_M_X64) && defined(_MSC_VER) && _MSC_VER >= 1920
+#if defined(_M_X64) && defined(_MSC_VER) && _MSC_VER >= 1920 && !defined(__clang__)
      npy_uint64 remainder;
      quotient = _udiv128(high, 0, divisor, &remainder);
      (void)remainder;
diff --git a/numpy/core/src/common/simd/neon/math.h b/numpy/core/src/common/simd/neon/math.h

index 19e5cd846f7d7a352ea80134d406903fae164f73..4607d6f27576244fa59a275abf63d85cef79b2e3 100644 (file)
--- a/numpy/core/src/common/simd/neon/math.h
+++ b/numpy/core/src/common/simd/neon/math.h
@@ -153,6 +153,33 @@ NPY_FINLINE npyv_s64 npyv_min_s64(npyv_s64 a, npyv_s64 b)
      return vbslq_s64(npyv_cmplt_s64(a, b), a, b);
  }
  
+// round to nearest integer even
+NPY_FINLINE npyv_f32 npyv_rint_f32(npyv_f32 a)
+{
+#ifdef NPY_HAVE_ASIMD
+    return vrndnq_f32(a);
+#else
+    // ARMv7 NEON only supports fp to int truncate conversion.
+    // a magic trick of adding 1.5 * 2**23 is used for rounding
+    // to nearest even and then subtract this magic number to get
+    // the integer.
+    const npyv_s32 szero = vreinterpretq_s32_f32(vdupq_n_f32(-0.0f));
+    const npyv_f32 magic = vdupq_n_f32(12582912.0f); // 1.5 * 2**23
+    npyv_f32 round = vsubq_f32(vaddq_f32(a, magic), magic);
+    npyv_b32 overflow = vcleq_f32(vabsq_f32(a), vreinterpretq_f32_u32(vdupq_n_u32(0x4b000000)));
+    round = vbslq_f32(overflow, round, a);
+    // signed zero
+    round = vreinterpretq_f32_s32(vorrq_s32(
+         vreinterpretq_s32_f32(round),
+         vandq_s32(vreinterpretq_s32_f32(a), szero)
+    ));
+    return round;
+#endif
+}
+#if NPY_SIMD_F64
+    #define npyv_rint_f64 vrndnq_f64
+#endif // NPY_SIMD_F64
+
  // ceil
  #ifdef NPY_HAVE_ASIMD
      #define npyv_ceil_f32 vrndpq_f32
@@ -223,4 +250,36 @@ NPY_FINLINE npyv_s64 npyv_min_s64(npyv_s64 a, npyv_s64 b)
      #define npyv_trunc_f64 vrndq_f64
  #endif // NPY_SIMD_F64
  
+// floor
+#ifdef NPY_HAVE_ASIMD
+    #define npyv_floor_f32 vrndmq_f32
+#else
+   NPY_FINLINE npyv_f32 npyv_floor_f32(npyv_f32 a)
+   {
+        const npyv_s32 szero = vreinterpretq_s32_f32(vdupq_n_f32(-0.0f));
+        const npyv_u32 one = vreinterpretq_u32_f32(vdupq_n_f32(1.0f));
+        const npyv_s32 max_int = vdupq_n_s32(0x7fffffff);
+
+        npyv_s32 roundi = vcvtq_s32_f32(a);
+        npyv_f32 round = vcvtq_f32_s32(roundi);
+        npyv_f32 floor = vsubq_f32(round, vreinterpretq_f32_u32(
+            vandq_u32(vcgtq_f32(round, a), one)
+        ));
+        // respect signed zero
+        npyv_f32 rzero = vreinterpretq_f32_s32(vorrq_s32(
+            vreinterpretq_s32_f32(floor),
+            vandq_s32(vreinterpretq_s32_f32(a), szero)
+        ));
+        npyv_u32 nnan = npyv_notnan_f32(a);
+        npyv_u32 overflow = vorrq_u32(
+            vceqq_s32(roundi, szero), vceqq_s32(roundi, max_int)
+        );
+
+       return vbslq_f32(vbicq_u32(nnan, overflow), rzero, a);
+   }
+#endif // NPY_HAVE_ASIMD
+#if NPY_SIMD_F64
+    #define npyv_floor_f64 vrndmq_f64
+#endif // NPY_SIMD_F64
+
  #endif // _NPY_SIMD_NEON_MATH_H
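The non-ASIMD npyv_floor_f32 fallback above builds floor out of a truncating float-to-int conversion: convert, convert back, then subtract one wherever the round trip landed above the input. NaN and out-of-range inputs are returned unchanged. A scalar sketch of the same steps follows. The name floor_f32_model is hypothetical, and the range check is done before the conversion here because an out-of-range cast is undefined in C.

```c
#include <assert.h>

/* Hypothetical scalar model of floor via truncation plus correction. */
static float floor_f32_model(float a)
{
    /* NaN and magnitudes that do not fit in int32 pass through;
       finite float32 values that large are already integral. */
    if (a != a || a >= 2147483648.0f || a <= -2147483648.0f) {
        return a;
    }
    if (a == 0.0f) {
        return a;  /* keeps the sign of -0.0f */
    }
    int truncated = (int)a;          /* rounds toward zero */
    float round = (float)truncated;  /* exact: truncation of a float32 */
    assert(round - a < 1.0f && a - round < 1.0f);
    /* Truncation rounds negative non-integers up, so step down by one. */
    return round > a ? round - 1.0f : round;
}
```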
diff --git a/numpy/core/src/common/simd/neon/memory.h b/numpy/core/src/common/simd/neon/memory.h

index 1e258f1bcbef79b43206741904b718b52915d9d6..7060ea628305cefaff29f764e4b4f84dc941636c 100644 (file)
--- a/numpy/core/src/common/simd/neon/memory.h
+++ b/numpy/core/src/common/simd/neon/memory.h
@@ -332,5 +332,45 @@ NPYV_IMPL_NEON_REST_PARTIAL_TYPES(u64, s64)
  #if NPY_SIMD_F64
  NPYV_IMPL_NEON_REST_PARTIAL_TYPES(f64, s64)
  #endif
+/*********************************
+ * Lookup table
+ *********************************/
+// uses vector as indexes into a table
+// that contains 32 elements of uint32.
+NPY_FINLINE npyv_u32 npyv_lut32_u32(const npy_uint32 *table, npyv_u32 idx)
+{
+    const unsigned i0 = vgetq_lane_u32(idx, 0);
+    const unsigned i1 = vgetq_lane_u32(idx, 1);
+    const unsigned i2 = vgetq_lane_u32(idx, 2);
+    const unsigned i3 = vgetq_lane_u32(idx, 3);
+
+    uint32x2_t low = vcreate_u32(table[i0]);
+               low = vld1_lane_u32((const uint32_t*)table + i1, low, 1);
+    uint32x2_t high = vcreate_u32(table[i2]);
+               high = vld1_lane_u32((const uint32_t*)table + i3, high, 1);
+    return vcombine_u32(low, high);
+}
+NPY_FINLINE npyv_s32 npyv_lut32_s32(const npy_int32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_s32_u32(npyv_lut32_u32((const npy_uint32*)table, idx)); }
+NPY_FINLINE npyv_f32 npyv_lut32_f32(const float *table, npyv_u32 idx)
+{ return npyv_reinterpret_f32_u32(npyv_lut32_u32((const npy_uint32*)table, idx)); }
+
+// uses vector as indexes into a table
+// that contains 16 elements of uint64.
+NPY_FINLINE npyv_u64 npyv_lut16_u64(const npy_uint64 *table, npyv_u64 idx)
+{
+    const unsigned i0 = vgetq_lane_u32(vreinterpretq_u32_u64(idx), 0);
+    const unsigned i1 = vgetq_lane_u32(vreinterpretq_u32_u64(idx), 2);
+    return vcombine_u64(
+        vld1_u64((const uint64_t*)table + i0),
+        vld1_u64((const uint64_t*)table + i1)
+    );
+}
+NPY_FINLINE npyv_s64 npyv_lut16_s64(const npy_int64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_s64_u64(npyv_lut16_u64((const npy_uint64*)table, idx)); }
+#if NPY_SIMD_F64
+NPY_FINLINE npyv_f64 npyv_lut16_f64(const double *table, npyv_u64 idx)
+{ return npyv_reinterpret_f64_u64(npyv_lut16_u64((const npy_uint64*)table, idx)); }
+#endif
  
  #endif // _NPY_SIMD_NEON_MEMORY_H
diff --git a/numpy/core/src/common/simd/sse/math.h b/numpy/core/src/common/simd/sse/math.h

index 5daf7711e416da7dfcca9bbe5e81a28ea44e6471..e4b77b671f1a0067081d23c81e49bf626468b7d4 100644 (file)
--- a/numpy/core/src/common/simd/sse/math.h
+++ b/numpy/core/src/common/simd/sse/math.h
@@ -42,7 +42,7 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_max_f64 _mm_max_pd
  // Maximum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  NPY_FINLINE npyv_f32 npyv_maxp_f32(npyv_f32 a, npyv_f32 b)
  {
      __m128 nn  = _mm_cmpord_ps(b, b);
@@ -95,7 +95,7 @@ NPY_FINLINE npyv_s64 npyv_max_s64(npyv_s64 a, npyv_s64 b)
  #define npyv_min_f64 _mm_min_pd
  // Minimum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  NPY_FINLINE npyv_f32 npyv_minp_f32(npyv_f32 a, npyv_f32 b)
  {
      __m128 nn  = _mm_cmpord_ps(b, b);
@@ -143,6 +143,38 @@ NPY_FINLINE npyv_s64 npyv_min_s64(npyv_s64 a, npyv_s64 b)
      return npyv_select_s64(npyv_cmplt_s64(a, b), a, b);
  }
  
+// round to nearest integer even
+NPY_FINLINE npyv_f32 npyv_rint_f32(npyv_f32 a)
+{
+#ifdef NPY_HAVE_SSE41
+    return _mm_round_ps(a, _MM_FROUND_TO_NEAREST_INT);
+#else
+    const npyv_f32 szero = _mm_set1_ps(-0.0f);
+    __m128i roundi = _mm_cvtps_epi32(a);
+    __m128i overflow = _mm_cmpeq_epi32(roundi, _mm_castps_si128(szero));
+    __m128 r = _mm_cvtepi32_ps(roundi);
+    // respect sign of zero
+    r = _mm_or_ps(r, _mm_and_ps(a, szero));
+    return npyv_select_f32(overflow, a, r);
+#endif
+}
+
+// round to nearest integer even
+NPY_FINLINE npyv_f64 npyv_rint_f64(npyv_f64 a)
+{
+#ifdef NPY_HAVE_SSE41
+    return _mm_round_pd(a, _MM_FROUND_TO_NEAREST_INT);
+#else
+    const npyv_f64 szero = _mm_set1_pd(-0.0);
+    const npyv_f64 two_power_52 = _mm_set1_pd(0x10000000000000);
+    npyv_f64 sign_two52 = _mm_or_pd(two_power_52, _mm_and_pd(a, szero));
+    // round by adding the magic number 2^52
+    npyv_f64 round = _mm_sub_pd(_mm_add_pd(a, sign_two52), sign_two52);
+    // respect signed zero, e.g. -0.5 -> -0.0
+    return _mm_or_pd(round, _mm_and_pd(a, szero));
+#endif
+}
+
  // ceil
  #ifdef NPY_HAVE_SSE41
      #define npyv_ceil_f32 _mm_ceil_ps
@@ -202,4 +234,23 @@ NPY_FINLINE npyv_s64 npyv_min_s64(npyv_s64 a, npyv_s64 b)
      }
  #endif
  
+// floor
+#ifdef NPY_HAVE_SSE41
+    #define npyv_floor_f32 _mm_floor_ps
+    #define npyv_floor_f64 _mm_floor_pd
+#else
+    NPY_FINLINE npyv_f32 npyv_floor_f32(npyv_f32 a)
+    {
+        const npyv_f32 one = _mm_set1_ps(1.0f);
+        npyv_f32 round = npyv_rint_f32(a);
+        return _mm_sub_ps(round, _mm_and_ps(_mm_cmpgt_ps(round, a), one));
+    }
+    NPY_FINLINE npyv_f64 npyv_floor_f64(npyv_f64 a)
+    {
+        const npyv_f64 one = _mm_set1_pd(1.0);
+        npyv_f64 round = npyv_rint_f64(a);
+        return _mm_sub_pd(round, _mm_and_pd(_mm_cmpgt_pd(round, a), one));
+    }
+#endif // NPY_HAVE_SSE41
+
  #endif // _NPY_SIMD_SSE_MATH_H
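The SSE2 fallback for npyv_rint_f64 above rounds by adding and then subtracting 2^52 carrying the sign of the input. At that magnitude a double has no fractional bits left, so the addition itself performs the rounding. A scalar sketch follows, assuming the default round-to-nearest-even mode and no fast-math reassociation. The name rint_f64_model and the explicit guard for |a| >= 2^52 are additions for this sketch and do not appear in the vector code shown.

```c
#include <assert.h>

/* Hypothetical scalar model of rounding via the 2^52 magic number.
   Assumes the FPU is in round-to-nearest-even mode. */
static double rint_f64_model(double a)
{
    const double two52 = 4503599627370496.0;  /* 2^52 */

    /* Guard added for this sketch: NaN and |a| >= 2^52 are returned as-is
       (such finite values are already integral). */
    if (a != a || a >= two52 || a <= -two52) {
        return a;
    }
    const double magic = a < 0.0 ? -two52 : two52;
    /* volatile keeps the compiler from folding (a + magic) - magic. */
    volatile double shifted = a + magic;
    double r = shifted - magic;
    assert(r - a <= 0.5 && a - r <= 0.5);
    /* A zero result takes the sign of the input, e.g. -0.3 -> -0.0. */
    return r == 0.0 ? a * 0.0 : r;
}
```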
diff --git a/numpy/core/src/common/simd/sse/memory.h b/numpy/core/src/common/simd/sse/memory.h

index 1074c3b02efef39df52a830ca71dab98b4278211..3ff64848d281629876e0a488273ffa6a21c48b8e 100644 (file)
--- a/numpy/core/src/common/simd/sse/memory.h
+++ b/numpy/core/src/common/simd/sse/memory.h
@@ -495,4 +495,45 @@ NPYV_IMPL_SSE_REST_PARTIAL_TYPES(f32, s32)
  NPYV_IMPL_SSE_REST_PARTIAL_TYPES(u64, s64)
  NPYV_IMPL_SSE_REST_PARTIAL_TYPES(f64, s64)
  
+/*********************************
+ * Lookup table
+ *********************************/
+// uses vector as indexes into a table
+// that contains 32 elements of float32.
+NPY_FINLINE npyv_f32 npyv_lut32_f32(const float *table, npyv_u32 idx)
+{
+    const int i0 = _mm_cvtsi128_si32(idx);
+#ifdef NPY_HAVE_SSE41
+    const int i1 = _mm_extract_epi32(idx, 1);
+    const int i2 = _mm_extract_epi32(idx, 2);
+    const int i3 = _mm_extract_epi32(idx, 3);
+#else
+    const int i1 = _mm_extract_epi16(idx, 2);
+    const int i2 = _mm_extract_epi16(idx, 4);
+    const int i3 = _mm_extract_epi16(idx, 6);
+#endif
+    return npyv_set_f32(table[i0], table[i1], table[i2], table[i3]);
+}
+NPY_FINLINE npyv_u32 npyv_lut32_u32(const npy_uint32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_u32_f32(npyv_lut32_f32((const float*)table, idx)); }
+NPY_FINLINE npyv_s32 npyv_lut32_s32(const npy_int32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_s32_f32(npyv_lut32_f32((const float*)table, idx)); }
+
+// uses vector as indexes into a table
+// that contains 16 elements of float64.
+NPY_FINLINE npyv_f64 npyv_lut16_f64(const double *table, npyv_u64 idx)
+{
+    const int i0 = _mm_cvtsi128_si32(idx);
+#ifdef NPY_HAVE_SSE41
+    const int i1 = _mm_extract_epi32(idx, 2);
+#else
+    const int i1 = _mm_extract_epi16(idx, 4);
+#endif
+    return npyv_set_f64(table[i0], table[i1]);
+}
+NPY_FINLINE npyv_u64 npyv_lut16_u64(const npy_uint64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_u64_f64(npyv_lut16_f64((const double*)table, idx)); }
+NPY_FINLINE npyv_s64 npyv_lut16_s64(const npy_int64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_s64_f64(npyv_lut16_f64((const double*)table, idx)); }
+
  #endif // _NPY_SIMD_SSE_MEMORY_H
diff --git a/numpy/core/src/common/simd/vsx/arithmetic.h b/numpy/core/src/common/simd/vsx/arithmetic.h

index eaca536201fb21dbab08bd44086d6674462d4636..01dbf5480e9f972c64226c657f7c747fe8f4613b 100644 (file)
--- a/numpy/core/src/common/simd/vsx/arithmetic.h
+++ b/numpy/core/src/common/simd/vsx/arithmetic.h
@@ -97,9 +97,6 @@
  /***************************
   * Integer Division
   ***************************/
-/***
- * TODO: Add support for VSX4(Power10)
- */
  // See simd/intdiv.h for more clarification
  // divide each unsigned 8-bit element by a precomputed divisor
  NPY_FINLINE npyv_u8 npyv_divc_u8(npyv_u8 a, const npyv_u8x3 divisor)
@@ -172,6 +169,10 @@ NPY_FINLINE npyv_s16 npyv_divc_s16(npyv_s16 a, const npyv_s16x3 divisor)
  // divide each unsigned 32-bit element by a precomputed divisor
  NPY_FINLINE npyv_u32 npyv_divc_u32(npyv_u32 a, const npyv_u32x3 divisor)
  {
+#if defined(NPY_HAVE_VSX4)
+    // high part of unsigned multiplication
+    npyv_u32 mulhi    = vec_mulh(a, divisor.val[0]);
+#else
  #if defined(__GNUC__) && __GNUC__ < 8
      // Doubleword integer wide multiplication supported by GCC 8+
      npyv_u64 mul_even, mul_odd;
@@ -184,6 +185,7 @@ NPY_FINLINE npyv_u32 npyv_divc_u32(npyv_u32 a, const npyv_u32x3 divisor)
  #endif
      // high part of unsigned multiplication
      npyv_u32 mulhi    = vec_mergeo((npyv_u32)mul_even, (npyv_u32)mul_odd);
+#endif
      // floor(x/d)     = (((a-mulhi) >> sh1) + mulhi) >> sh2
      npyv_u32 q        = vec_sub(a, mulhi);
               q        = vec_sr(q, divisor.val[1]);
@@ -194,6 +196,10 @@ NPY_FINLINE npyv_u32 npyv_divc_u32(npyv_u32 a, const npyv_u32x3 divisor)
  // divide each signed 32-bit element by a precomputed divisor (round towards zero)
  NPY_FINLINE npyv_s32 npyv_divc_s32(npyv_s32 a, const npyv_s32x3 divisor)
  {
+#if defined(NPY_HAVE_VSX4)
+    // high part of signed multiplication
+    npyv_s32 mulhi    = vec_mulh(a, divisor.val[0]);
+#else
  #if defined(__GNUC__) && __GNUC__ < 8
      // Doubleword integer wide multiplication supported by GCC8+
      npyv_s64 mul_even, mul_odd;
@@ -206,6 +212,7 @@ NPY_FINLINE npyv_s32 npyv_divc_s32(npyv_s32 a, const npyv_s32x3 divisor)
  #endif
      // high part of signed multiplication
      npyv_s32 mulhi    = vec_mergeo((npyv_s32)mul_even, (npyv_s32)mul_odd);
+#endif
      // q              = ((a + mulhi) >> sh1) - XSIGN(a)
      // trunc(a/d)     = (q ^ dsign) - dsign
      npyv_s32 q        = vec_sra(vec_add(a, mulhi), (npyv_u32)divisor.val[1]);
@@ -216,8 +223,12 @@ NPY_FINLINE npyv_s32 npyv_divc_s32(npyv_s32 a, const npyv_s32x3 divisor)
  // divide each unsigned 64-bit element by a precomputed divisor
  NPY_FINLINE npyv_u64 npyv_divc_u64(npyv_u64 a, const npyv_u64x3 divisor)
  {
+#if defined(NPY_HAVE_VSX4)
+    return vec_div(a, divisor.val[0]);
+#else
      const npy_uint64 d = vec_extract(divisor.val[0], 0);
      return npyv_set_u64(vec_extract(a, 0) / d, vec_extract(a, 1) / d);
+#endif
  }
  // divide each signed 64-bit element by a precomputed divisor (round towards zero)
  NPY_FINLINE npyv_s64 npyv_divc_s64(npyv_s64 a, const npyv_s64x3 divisor)
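The unsigned 32-bit path above evaluates floor(x/d) = (((a-mulhi) >> sh1) + mulhi) >> sh2, where mulhi is the high half of a times a precomputed multiplier. A scalar sketch of that scheme follows. The multiplier and shift precomputation uses the usual formulation for division by a run-time invariant divisor and is an assumption here, since the hunk does not show it. The names divisor_u32, make_divisor_u32 and div_u32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical precomputed divisor: multiplier plus two shift counts. */
typedef struct { uint32_t m; unsigned sh1, sh2; } divisor_u32;

static divisor_u32 make_divisor_u32(uint32_t d)
{
    assert(d != 0);
    unsigned l = 0;  /* l = ceil(log2(d)) */
    while (l < 32 && ((uint64_t)1 << l) < d) {
        l++;
    }
    uint64_t two_l = (uint64_t)1 << l;
    divisor_u32 r;
    /* m = floor(2^32 * (2^l - d) / d) + 1, which always fits in 32 bits */
    r.m = (uint32_t)((((two_l - d) << 32) / d) + 1);
    r.sh1 = l < 1 ? l : 1;      /* min(l, 1) */
    r.sh2 = l > 1 ? l - 1 : 0;  /* max(l - 1, 0) */
    return r;
}

static uint32_t div_u32(uint32_t a, divisor_u32 d)
{
    /* high 32 bits of the 64-bit product */
    uint32_t mulhi = (uint32_t)(((uint64_t)a * d.m) >> 32);
    uint32_t q = (a - mulhi) >> d.sh1;
    return (q + mulhi) >> d.sh2;
}
```

The divisor is prepared once and reused for every element, which is where the saving over a hardware divide per element comes from.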
diff --git a/numpy/core/src/common/simd/vsx/math.h b/numpy/core/src/common/simd/vsx/math.h

index d138cae8a24d87ed1871764448fda01bba6c6e7d..444bc9e544b6cc10887baf2e0f846f364e7fe238 100644 (file)
--- a/numpy/core/src/common/simd/vsx/math.h
+++ b/numpy/core/src/common/simd/vsx/math.h
@@ -38,7 +38,7 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_max_f64 vec_max
  // Maximum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  #define npyv_maxp_f32 vec_max
  #define npyv_maxp_f64 vec_max
  // Maximum, integer operations
@@ -56,7 +56,7 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_min_f64 vec_min
  // Minimum, supports IEEE floating-point arithmetic (IEC 60559),
  // - If one of the two vectors contains NaN, the equivalent element of the other vector is set
-// - Only if both corresponded elements are NaN, NaN is set. 
+// - Only if both corresponded elements are NaN, NaN is set.
  #define npyv_minp_f32 vec_min
  #define npyv_minp_f64 vec_min
  // Minimum, integer operations
@@ -69,6 +69,10 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_min_u64 vec_min
  #define npyv_min_s64 vec_min
  
+// round to nearest int even
+#define npyv_rint_f32 vec_rint
+#define npyv_rint_f64 vec_rint
+
  // ceil
  #define npyv_ceil_f32 vec_ceil
  #define npyv_ceil_f64 vec_ceil
@@ -77,4 +81,8 @@ NPY_FINLINE npyv_f64 npyv_square_f64(npyv_f64 a)
  #define npyv_trunc_f32 vec_trunc
  #define npyv_trunc_f64 vec_trunc
  
+// floor
+#define npyv_floor_f32 vec_floor
+#define npyv_floor_f64 vec_floor
+
  #endif // _NPY_SIMD_VSX_MATH_H
diff --git a/numpy/core/src/common/simd/vsx/memory.h b/numpy/core/src/common/simd/vsx/memory.h

index 08a0a9276cc63330427276001c20bb5e4ac43a1e..3007584ef97b42405e0abbbba3ec5b622883db73 100644 (file)
--- a/numpy/core/src/common/simd/vsx/memory.h
+++ b/numpy/core/src/common/simd/vsx/memory.h
@@ -343,4 +343,41 @@ NPYV_IMPL_VSX_REST_PARTIAL_TYPES(f32, s32)
  NPYV_IMPL_VSX_REST_PARTIAL_TYPES(u64, s64)
  NPYV_IMPL_VSX_REST_PARTIAL_TYPES(f64, s64)
  
+/*********************************
+ * Lookup table
+ *********************************/
+// uses vector as indexes into a table
+// that contains 32 elements of float32.
+NPY_FINLINE npyv_f32 npyv_lut32_f32(const float *table, npyv_u32 idx)
+{
+    const unsigned i0 = vec_extract(idx, 0);
+    const unsigned i1 = vec_extract(idx, 1);
+    const unsigned i2 = vec_extract(idx, 2);
+    const unsigned i3 = vec_extract(idx, 3);
+    npyv_f32 r = vec_promote(table[i0], 0);
+             r = vec_insert(table[i1], r, 1);
+             r = vec_insert(table[i2], r, 2);
+             r = vec_insert(table[i3], r, 3);
+    return r;
+}
+NPY_FINLINE npyv_u32 npyv_lut32_u32(const npy_uint32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_u32_f32(npyv_lut32_f32((const float*)table, idx)); }
+NPY_FINLINE npyv_s32 npyv_lut32_s32(const npy_int32 *table, npyv_u32 idx)
+{ return npyv_reinterpret_s32_f32(npyv_lut32_f32((const float*)table, idx)); }
+
+// uses vector as indexes into a table
+// that contains 16 elements of float64.
+NPY_FINLINE npyv_f64 npyv_lut16_f64(const double *table, npyv_u64 idx)
+{
+    const unsigned i0 = vec_extract((npyv_u32)idx, 0);
+    const unsigned i1 = vec_extract((npyv_u32)idx, 2);
+    npyv_f64 r = vec_promote(table[i0], 0);
+             r = vec_insert(table[i1], r, 1);
+    return r;
+}
+NPY_FINLINE npyv_u64 npyv_lut16_u64(const npy_uint64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_u64_f64(npyv_lut16_f64((const double*)table, idx)); }
+NPY_FINLINE npyv_s64 npyv_lut16_s64(const npy_int64 *table, npyv_u64 idx)
+{ return npyv_reinterpret_s64_f64(npyv_lut16_f64((const double*)table, idx)); }
+
  #endif // _NPY_SIMD_VSX_MEMORY_H
diff --git a/numpy/core/src/common/simd/vsx/vsx.h b/numpy/core/src/common/simd/vsx/vsx.h

index 66b76208f042ab33e2e04ce654fb69a4b6b4b872..b4d8172a271a20f76feb1638afb40f1a02c3c9ee 100644 (file)
--- a/numpy/core/src/common/simd/vsx/vsx.h
+++ b/numpy/core/src/common/simd/vsx/vsx.h
@@ -61,7 +61,7 @@ typedef struct { npyv_f64 val[3]; } npyv_f64x3;
  #define npyv_nlanes_f32 4
  #define npyv_nlanes_f64 2
  
-// using __bool with typdef cause ambiguous errors
+// using __bool with typedef cause ambiguous errors
  #define npyv_b8  __vector __bool char
  #define npyv_b16 __vector __bool short
  #define npyv_b32 __vector __bool int
diff --git a/numpy/core/src/common/ufunc_override.c b/numpy/core/src/common/ufunc_override.c

index d510f185acf3b30c9c0897bc85ae3e2c973fbdde..4fb4d4b3edda627eddd80e84a653dd4f83b696e4 100644 (file)
--- a/numpy/core/src/common/ufunc_override.c
+++ b/numpy/core/src/common/ufunc_override.c
@@ -5,6 +5,7 @@
  #include "get_attr_string.h"
  #include "npy_import.h"
  #include "ufunc_override.h"
+#include "scalartypes.h"
  
  /*
   * Check whether an object has __array_ufunc__ defined on its class and it
@@ -30,11 +31,16 @@ PyUFuncOverride_GetNonDefaultArrayUfunc(PyObject *obj)
      if (PyArray_CheckExact(obj)) {
          return NULL;
      }
+   /* Fast return for numpy scalar types */
+    if (is_anyscalar_exact(obj)) {
+        return NULL;
+    }
+
      /*
       * Does the class define __array_ufunc__? (Note that LookupSpecial has fast
       * return for basic python types, so no need to worry about those here)
       */
-    cls_array_ufunc = PyArray_LookupSpecial(obj, "__array_ufunc__");
+    cls_array_ufunc = PyArray_LookupSpecial(obj, npy_um_str_array_ufunc);
      if (cls_array_ufunc == NULL) {
          if (PyErr_Occurred()) {
              PyErr_Clear(); /* TODO[gh-14801]: propagate crashes during attribute access? */
diff --git a/numpy/core/src/common/umathmodule.h b/numpy/core/src/common/umathmodule.h

index 6d4169ad5f8a9420a0834cf890afc75c9a2f4a0b..fe44fe403783334c9fcecd4612ca6504b9dc9ea9 100644 (file)
--- a/numpy/core/src/common/umathmodule.h
+++ b/numpy/core/src/common/umathmodule.h
@@ -1,8 +1,8 @@
  #ifndef NUMPY_CORE_SRC_COMMON_UMATHMODULE_H_
  #define NUMPY_CORE_SRC_COMMON_UMATHMODULE_H_
  
-#include "__umath_generated.c"
-#include "__ufunc_api.c"
+#include "ufunc_object.h"
+#include "ufunc_type_resolution.h"
  
  NPY_NO_EXPORT PyObject *
  get_sfloat_dtype(PyObject *NPY_UNUSED(mod), PyObject *NPY_UNUSED(args));
diff --git a/numpy/core/src/multiarray/_multiarray_tests.c.src b/numpy/core/src/multiarray/_multiarray_tests.c.src

index b7a8b08495f934c521ba4790d1354c070cc44c5b..0fcebedc791f7441332850f162c8765f3acf5ce6 100644 (file)
--- a/numpy/core/src/multiarray/_multiarray_tests.c.src
+++ b/numpy/core/src/multiarray/_multiarray_tests.c.src
@@ -795,25 +795,6 @@ npy_char_deprecation(PyObject* NPY_UNUSED(self), PyObject* NPY_UNUSED(args))
      return (PyObject *)descr;
  }
  
-/* used to test UPDATEIFCOPY usage emits deprecation warning */
-static PyObject*
-npy_updateifcopy_deprecation(PyObject* NPY_UNUSED(self), PyObject* args)
-{
-    int flags;
-    PyObject* array;
-    if (!PyArray_Check(args)) {
-        PyErr_SetString(PyExc_TypeError, "test needs ndarray input");
-        return NULL;
-    }
-    flags = NPY_ARRAY_CARRAY | NPY_ARRAY_UPDATEIFCOPY;
-    array = PyArray_FromArray((PyArrayObject*)args, NULL, flags);
-    if (array == NULL)
-        return NULL;
-    PyArray_ResolveWritebackIfCopy((PyArrayObject*)array);
-    Py_DECREF(array);
-    Py_RETURN_NONE;
-}
-
  /* used to test PyArray_As1D usage emits not implemented error */
  static PyObject*
  npy_pyarrayas1d_deprecation(PyObject* NPY_UNUSED(self), PyObject* NPY_UNUSED(args))
@@ -1084,20 +1065,18 @@ get_all_cast_information(PyObject *NPY_UNUSED(mod), PyObject *NPY_UNUSED(args))
              PyArrayMethodObject *cast = (PyArrayMethodObject *)cast_obj;
  
              /* Pass some information about this cast out! */
-            PyObject *cast_info = Py_BuildValue("{sOsOsisisisisisssi}",
+            PyObject *cast_info = Py_BuildValue("{sOsOsisisisisiss}",
                      "from", from_dtype,
                      "to", to_dtype,
                      "legacy", (cast->name != NULL &&
                                 strncmp(cast->name, "legacy_", 7) == 0),
-                    "casting", cast->casting & ~_NPY_CAST_IS_VIEW,
+                    "casting", cast->casting,
                      "requires_pyapi", cast->flags & NPY_METH_REQUIRES_PYAPI,
                      "supports_unaligned",
                          cast->flags & NPY_METH_SUPPORTS_UNALIGNED,
                      "no_floatingpoint_errors",
                          cast->flags & NPY_METH_NO_FLOATINGPOINT_ERRORS,
-                    "name", cast->name,
-                    "cast_is_view",
-                        cast->casting & _NPY_CAST_IS_VIEW);
+                    "name", cast->name);
              if (cast_info == NULL) {
                  goto fail;
              }
@@ -2372,6 +2351,32 @@ npy_ensurenocopy(PyObject* NPY_UNUSED(self), PyObject* args)
      Py_RETURN_NONE;
  }
  
+static PyObject *
+run_scalar_intp_converter(PyObject *NPY_UNUSED(self), PyObject *obj)
+{
+    PyArray_Dims dims;
+    if (!PyArray_IntpConverter(obj, &dims)) {
+        return NULL;
+    }
+    else {
+        PyObject *result = PyArray_IntTupleFromIntp(dims.len, dims.ptr);
+        PyDimMem_FREE(dims.ptr);
+        return result;
+    }
+}
+
+static PyObject *
+run_scalar_intp_from_sequence(PyObject *NPY_UNUSED(self), PyObject *obj)
+{
+    npy_intp vals[1];
+
+    int output = PyArray_IntpFromSequence(obj, vals, 1);
+    if (output == -1) {
+        return NULL;
+    }
+    return PyArray_IntTupleFromIntp(1, vals);
+}
+
  static PyMethodDef Multiarray_TestsMethods[] = {
      {"argparse_example_function",
           (PyCFunction)argparse_example_function,
@@ -2412,9 +2417,6 @@ static PyMethodDef Multiarray_TestsMethods[] = {
      {"npy_char_deprecation",
          npy_char_deprecation,
          METH_NOARGS, NULL},
-    {"npy_updateifcopy_deprecation",
-        npy_updateifcopy_deprecation,
-        METH_O, NULL},
      {"npy_pyarrayas1d_deprecation",
          npy_pyarrayas1d_deprecation,
          METH_NOARGS, NULL},
@@ -2565,6 +2567,12 @@ static PyMethodDef Multiarray_TestsMethods[] = {
      {"run_casting_converter",
          run_casting_converter,
          METH_VARARGS, NULL},
+    {"run_scalar_intp_converter",
+        run_scalar_intp_converter,
+        METH_O, NULL},
+    {"run_scalar_intp_from_sequence",
+        run_scalar_intp_from_sequence,
+        METH_O, NULL},
      {"run_intp_converter",
          run_intp_converter,
          METH_VARARGS, NULL},
diff --git a/numpy/core/src/multiarray/abstractdtypes.c b/numpy/core/src/multiarray/abstractdtypes.c

index cc1d7fad823387c099ff64f562484bc7c6ccc4e0..b0345c46b614f13fd3f4084d4abaf2204c917718 100644 (file)
--- a/numpy/core/src/multiarray/abstractdtypes.c
+++ b/numpy/core/src/multiarray/abstractdtypes.c
@@ -259,6 +259,7 @@ NPY_NO_EXPORT PyArray_DTypeMeta PyArray_PyIntAbstractDType = {{{
          .tp_name = "numpy._IntegerAbstractDType",
      },},
      .flags = NPY_DT_ABSTRACT,
+    .type_num = -1,
      .dt_slots = &pyintabstractdtype_slots,
  };
  
@@ -276,6 +277,7 @@ NPY_NO_EXPORT PyArray_DTypeMeta PyArray_PyFloatAbstractDType = {{{
          .tp_name = "numpy._FloatAbstractDType",
      },},
      .flags = NPY_DT_ABSTRACT,
+    .type_num = -1,
      .dt_slots = &pyfloatabstractdtype_slots,
  };
  
@@ -293,5 +295,6 @@ NPY_NO_EXPORT PyArray_DTypeMeta PyArray_PyComplexAbstractDType = {{{
          .tp_name = "numpy._ComplexAbstractDType",
      },},
      .flags = NPY_DT_ABSTRACT,
+    .type_num = -1,
      .dt_slots = &pycomplexabstractdtype_slots,
  };
diff --git a/numpy/core/src/multiarray/alloc.c b/numpy/core/src/multiarray/alloc.c

index 0a694cf62662dda114549067bc2d4f9e2e3d80dc..6f18054ff5b860d6feb109431bd5d66e74894b7b 100644 (file)
--- a/numpy/core/src/multiarray/alloc.c
+++ b/numpy/core/src/multiarray/alloc.c
@@ -38,6 +38,25 @@ static cache_bucket dimcache[NBUCKETS_DIM];
  static int _madvise_hugepage = 1;
  
  
+/*
+ * This function tells whether NumPy attempts to call `madvise` with
+ * `MADV_HUGEPAGE`.  `madvise` is only ever used on Linux, so the value
+ * of `_madvise_hugepage` may be ignored.
+ *
+ * It is exposed to Python as `np.core.multiarray._get_madvise_hugepage`.
+ */
+NPY_NO_EXPORT PyObject *
+_get_madvise_hugepage(PyObject *NPY_UNUSED(self), PyObject *NPY_UNUSED(args))
+{
+#ifdef NPY_OS_LINUX
+    if (_madvise_hugepage) {
+        Py_RETURN_TRUE;
+    }
+#endif
+    Py_RETURN_FALSE;
+}
+
+
  /*
   * This function enables or disables the use of `MADV_HUGEPAGE` on Linux
   * by modifying the global static `_madvise_hugepage`.
@@ -186,6 +205,24 @@ npy_free_cache_dim(void * p, npy_uintp sz)
                      &PyArray_free);
  }
  
+/* Similar to array_dealloc in arrayobject.c */
+static NPY_INLINE void
+WARN_NO_RETURN(PyObject* warning, const char * msg) {
+    if (PyErr_WarnEx(warning, msg, 1) < 0) {
+        PyObject * s;
+
+        s = PyUnicode_FromString("PyDataMem_UserFREE");
+        if (s) {
+            PyErr_WriteUnraisable(s);
+            Py_DECREF(s);
+        }
+        else {
+            PyErr_WriteUnraisable(Py_None);
+        }
+    }
+}
+
+
  
  /* malloc/free/realloc hook */
  NPY_NO_EXPORT PyDataMem_EventHookFunc *_PyDataMem_eventhook = NULL;
@@ -210,6 +247,8 @@ NPY_NO_EXPORT void *_PyDataMem_eventhook_user_data = NULL;
   * operations that might cause new allocation events (such as the
   * creation/destruction numpy objects, or creating/destroying Python
   * objects which might cause a gc)
+ *
+ * Deprecated in 1.23
   */
  NPY_NO_EXPORT PyDataMem_EventHookFunc *
  PyDataMem_SetEventHook(PyDataMem_EventHookFunc *newhook,
@@ -218,6 +257,10 @@ PyDataMem_SetEventHook(PyDataMem_EventHookFunc *newhook,
      PyDataMem_EventHookFunc *temp;
      NPY_ALLOW_C_API_DEF
      NPY_ALLOW_C_API
+    /* 2021-11-18, 1.23 */
+    WARN_NO_RETURN(PyExc_DeprecationWarning,
+                     "PyDataMem_SetEventHook is deprecated, use tracemalloc "
+                     "and the 'np.lib.tracemalloc_domain' domain");
      temp = _PyDataMem_eventhook;
      _PyDataMem_eventhook = newhook;
      if (old_data != NULL) {
@@ -435,33 +478,14 @@ PyDataMem_UserNEW_ZEROED(size_t nmemb, size_t size, PyObject *mem_handler)
      return result;
  }
  
-/* Similar to array_dealloc in arrayobject.c */
-static NPY_INLINE void
-WARN_IN_FREE(PyObject* warning, const char * msg) {
-    if (PyErr_WarnEx(warning, msg, 1) < 0) {
-        PyObject * s;
-
-        s = PyUnicode_FromString("PyDataMem_UserFREE");
-        if (s) {
-            PyErr_WriteUnraisable(s);
-            Py_DECREF(s);
-        }
-        else {
-            PyErr_WriteUnraisable(Py_None);
-        }
-    }
-}
-
-
  
  NPY_NO_EXPORT void
  PyDataMem_UserFREE(void *ptr, size_t size, PyObject *mem_handler)
  {
      PyDataMem_Handler *handler = (PyDataMem_Handler *) PyCapsule_GetPointer(mem_handler, "mem_handler");
      if (handler == NULL) {
-        WARN_IN_FREE(PyExc_RuntimeWarning,
+        WARN_NO_RETURN(PyExc_RuntimeWarning,
                       "Could not get pointer to 'mem_handler' from PyCapsule");
-        PyErr_Clear();
          return;
      }
      PyTraceMalloc_Untrack(NPY_TRACE_DOMAIN, (npy_uintp)ptr);
@@ -571,7 +595,7 @@ PyDataMem_GetHandler()
      if (p == NULL) {
          return NULL;
      }
-    handler = PyDict_GetItemString(p, "current_allocator");
+    handler = PyDict_GetItem(p, npy_ma_str_current_allocator);
      if (handler == NULL) {
          handler = PyCapsule_New(&default_handler, "mem_handler", NULL);
          if (handler == NULL) {
diff --git a/numpy/core/src/multiarray/alloc.h b/numpy/core/src/multiarray/alloc.h

index 13c82845813dc26292671718d40a9baf2d6bab3b..e82f2d947c691cdaceef258ee94685fb61cf5424 100644 (file)
--- a/numpy/core/src/multiarray/alloc.h
+++ b/numpy/core/src/multiarray/alloc.h
@@ -7,6 +7,9 @@
  
  #define NPY_TRACE_DOMAIN 389047
  
+NPY_NO_EXPORT PyObject *
+_get_madvise_hugepage(PyObject *NPY_UNUSED(self), PyObject *NPY_UNUSED(args));
+
  NPY_NO_EXPORT PyObject *
  _set_madvise_hugepage(PyObject *NPY_UNUSED(self), PyObject *enabled_obj);
  
diff --git a/numpy/core/src/multiarray/argfunc.dispatch.c.src b/numpy/core/src/multiarray/argfunc.dispatch.c.src

new file mode 100644 (file)

index 0000000..cbfaebd
--- /dev/null
+++ b/numpy/core/src/multiarray/argfunc.dispatch.c.src
@@ -0,0 +1,394 @@
+/* -*- c -*- */
+/*@targets
+ ** $maxopt baseline
+ ** sse2 sse42 xop avx2 avx512_skx
+ ** vsx2
+ ** neon asimd
+ **/
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "simd/simd.h"
+#include "numpy/npy_math.h"
+
+#include "arraytypes.h"
+
+#define MIN(a,b) (((a)<(b))?(a):(b))
+
+#if NPY_SIMD
+#if NPY_SIMD > 512 || NPY_SIMD < 0
+    #error "the following 8/16-bit argmax kernel isn't applicable for larger SIMD"
+    // TODO: add special loop for large SIMD width.
+    // i.e avoid unroll by x4 should be numerically safe till 2048-bit SIMD width
+    // or maybe expand the indices to 32|64-bit vectors(slower).
+#endif
+/**begin repeat
+ * #sfx = u8, s8, u16, s16#
+ * #usfx = u8, u8, u16, u16#
+ * #bsfx = b8, b8, b16, b16#
+ * #idx_max = NPY_MAX_UINT8*2, NPY_MAX_UINT16*2#
+ */
+/**begin repeat1
+ * #intrin = cmpgt, cmplt#
+ * #func = argmax, argmin#
+ * #op = >, <#
+ */
+static inline npy_intp
+simd_@func@_@sfx@(npyv_lanetype_@sfx@ *ip, npy_intp len)
+{
+    npyv_lanetype_@sfx@ s_acc = *ip;
+    npy_intp ret_idx = 0, i = 0;
+
+    const int vstep = npyv_nlanes_@sfx@;
+    const int wstep = vstep*4;
+    npyv_lanetype_@usfx@ d_vindices[npyv_nlanes_@sfx@*4];
+    for (int vi = 0; vi < wstep; ++vi) {
+        d_vindices[vi] = vi;
+    }
+    const npyv_@usfx@ vindices_0 = npyv_load_@usfx@(d_vindices);
+    const npyv_@usfx@ vindices_1 = npyv_load_@usfx@(d_vindices + vstep);
+    const npyv_@usfx@ vindices_2 = npyv_load_@usfx@(d_vindices + vstep*2);
+    const npyv_@usfx@ vindices_3 = npyv_load_@usfx@(d_vindices + vstep*3);
+
+    const npy_intp max_block = @idx_max@*wstep & -wstep;
+    npy_intp len0 = len & -wstep;
+    while (i < len0) {
+        npyv_@sfx@ acc = npyv_setall_@sfx@(s_acc);
+        npyv_@usfx@ acc_indices = npyv_zero_@usfx@();
+        npyv_@usfx@ acc_indices_scale = npyv_zero_@usfx@();
+
+        npy_intp n = i + MIN(len0 - i, max_block);
+        npy_intp ik = i, i2 = 0;
+        for (; i < n; i += wstep, ++i2) {
+            npyv_@usfx@ vi = npyv_setall_@usfx@((npyv_lanetype_@usfx@)i2);
+            npyv_@sfx@ a = npyv_load_@sfx@(ip + i);
+            npyv_@sfx@ b = npyv_load_@sfx@(ip + i + vstep);
+            npyv_@sfx@ c = npyv_load_@sfx@(ip + i + vstep*2);
+            npyv_@sfx@ d = npyv_load_@sfx@(ip + i + vstep*3);
+
+            // reverse to put lowest index first in case of matched values
+            npyv_@bsfx@ m_ba = npyv_@intrin@_@sfx@(b, a);
+            npyv_@bsfx@ m_dc = npyv_@intrin@_@sfx@(d, c);
+            npyv_@sfx@  x_ba = npyv_select_@sfx@(m_ba, b, a);
+            npyv_@sfx@  x_dc = npyv_select_@sfx@(m_dc, d, c);
+            npyv_@bsfx@ m_dcba = npyv_@intrin@_@sfx@(x_dc, x_ba);
+            npyv_@sfx@  x_dcba = npyv_select_@sfx@(m_dcba, x_dc, x_ba);
+
+            npyv_@usfx@ idx_ba = npyv_select_@usfx@(m_ba, vindices_1, vindices_0);
+            npyv_@usfx@ idx_dc = npyv_select_@usfx@(m_dc, vindices_3, vindices_2);
+            npyv_@usfx@ idx_dcba = npyv_select_@usfx@(m_dcba, idx_dc, idx_ba);
+            npyv_@bsfx@ m_acc = npyv_@intrin@_@sfx@(x_dcba, acc);
+            acc = npyv_select_@sfx@(m_acc, x_dcba, acc);
+            acc_indices = npyv_select_@usfx@(m_acc, idx_dcba, acc_indices);
+            acc_indices_scale = npyv_select_@usfx@(m_acc, vi, acc_indices_scale);
+        }
+        // reduce
+        npyv_lanetype_@sfx@ dacc[npyv_nlanes_@sfx@];
+        npyv_lanetype_@usfx@ dacc_i[npyv_nlanes_@sfx@];
+        npyv_lanetype_@usfx@ dacc_s[npyv_nlanes_@sfx@];
+        npyv_store_@sfx@(dacc, acc);
+        npyv_store_@usfx@(dacc_i, acc_indices);
+        npyv_store_@usfx@(dacc_s, acc_indices_scale);
+
+        for (int vi = 0; vi < vstep; ++vi) {
+            if (dacc[vi] @op@ s_acc) {
+                s_acc = dacc[vi];
+                ret_idx = ik + (npy_intp)dacc_s[vi]*wstep + dacc_i[vi];
+            }
+        }
+        // get the lowest index in case of matched values
+        for (int vi = 0; vi < vstep; ++vi) {
+            npy_intp idx = ik + (npy_intp)dacc_s[vi]*wstep + dacc_i[vi];
+            if (s_acc == dacc[vi] && ret_idx > idx) {
+                ret_idx = idx;
+            }
+        }
+    }
+    for (; i < len; ++i) {
+        npyv_lanetype_@sfx@ a = ip[i];
+        if (a @op@ s_acc) {
+            s_acc = a;
+            ret_idx = i;
+        }
+    }
+    return ret_idx;
+}
+/**end repeat1**/
+/**end repeat**/
+#endif
+
+/**begin repeat
+ * #sfx = u32, s32, u64, s64, f32, f64#
+ * #usfx = u32, u32, u64, u64, u32, u64#
+ * #bsfx = b32, b32, b64, b64, b32, b64#
+ * #is_fp = 0*4, 1*2#
+ * #is_idx32 = 1*2, 0*2, 1, 0#
+ * #chk_simd = NPY_SIMD*5, NPY_SIMD_F64#
+ */
+#if @chk_simd@
+/**begin repeat1
+ * #intrin = cmpgt, cmplt#
+ * #func = argmax, argmin#
+ * #op = >, <#
+ * #iop = <, >#
+ */
+static inline npy_intp
+simd_@func@_@sfx@(npyv_lanetype_@sfx@ *ip, npy_intp len)
+{
+    npyv_lanetype_@sfx@ s_acc = *ip;
+    npy_intp ret_idx = 0, i = 0;
+    const int vstep = npyv_nlanes_@sfx@;
+    const int wstep = vstep*4;
+    // loop by a scalar will perform better for small arrays
+    if (len < wstep) {
+        goto scalar_loop;
+    }
+    npy_intp len0 = len;
+    // guard against wraparound vector addition for 32-bit indices
+    // in case the array is larger than 16 GB
+#if @is_idx32@
+    if (len0 > NPY_MAX_UINT32) {
+        len0 = NPY_MAX_UINT32;
+    }
+#endif
+    // create index for vector indices
+    npyv_lanetype_@usfx@ d_vindices[npyv_nlanes_@sfx@*4];
+    for (int vi = 0; vi < wstep; ++vi) {
+        d_vindices[vi] = vi;
+    }
+    const npyv_@usfx@ vindices_0 = npyv_load_@usfx@(d_vindices);
+    const npyv_@usfx@ vindices_1 = npyv_load_@usfx@(d_vindices + vstep);
+    const npyv_@usfx@ vindices_2 = npyv_load_@usfx@(d_vindices + vstep*2);
+    const npyv_@usfx@ vindices_3 = npyv_load_@usfx@(d_vindices + vstep*3);
+    // initialize vector accumulator for highest values and its indexes
+    npyv_@usfx@ acc_indices = npyv_zero_@usfx@();
+    npyv_@sfx@ acc = npyv_setall_@sfx@(s_acc);
+    for (npy_intp n = len0 & -wstep; i < n; i += wstep) {
+        npyv_@usfx@ vi = npyv_setall_@usfx@((npyv_lanetype_@usfx@)i);
+        npyv_@sfx@ a = npyv_load_@sfx@(ip + i);
+        npyv_@sfx@ b = npyv_load_@sfx@(ip + i + vstep);
+        npyv_@sfx@ c = npyv_load_@sfx@(ip + i + vstep*2);
+        npyv_@sfx@ d = npyv_load_@sfx@(ip + i + vstep*3);
+
+        // reverse to put lowest index first in case of matched values
+        npyv_@bsfx@ m_ba = npyv_@intrin@_@sfx@(b, a);
+        npyv_@bsfx@ m_dc = npyv_@intrin@_@sfx@(d, c);
+        npyv_@sfx@  x_ba = npyv_select_@sfx@(m_ba, b, a);
+        npyv_@sfx@  x_dc = npyv_select_@sfx@(m_dc, d, c);
+        npyv_@bsfx@ m_dcba = npyv_@intrin@_@sfx@(x_dc, x_ba);
+        npyv_@sfx@  x_dcba = npyv_select_@sfx@(m_dcba, x_dc, x_ba);
+
+        npyv_@usfx@ idx_ba = npyv_select_@usfx@(m_ba, vindices_1, vindices_0);
+        npyv_@usfx@ idx_dc = npyv_select_@usfx@(m_dc, vindices_3, vindices_2);
+        npyv_@usfx@ idx_dcba = npyv_select_@usfx@(m_dcba, idx_dc, idx_ba);
+        npyv_@bsfx@ m_acc = npyv_@intrin@_@sfx@(x_dcba, acc);
+        acc = npyv_select_@sfx@(m_acc, x_dcba, acc);
+        acc_indices = npyv_select_@usfx@(m_acc, npyv_add_@usfx@(vi, idx_dcba), acc_indices);
+
+    #if @is_fp@
+        npyv_@bsfx@ nnan_a = npyv_notnan_@sfx@(a);
+        npyv_@bsfx@ nnan_b = npyv_notnan_@sfx@(b);
+        npyv_@bsfx@ nnan_c = npyv_notnan_@sfx@(c);
+        npyv_@bsfx@ nnan_d = npyv_notnan_@sfx@(d);
+        npyv_@bsfx@ nnan_ab = npyv_and_@bsfx@(nnan_a, nnan_b);
+        npyv_@bsfx@ nnan_cd = npyv_and_@bsfx@(nnan_c, nnan_d);
+        npy_uint64 nnan = npyv_tobits_@bsfx@(npyv_and_@bsfx@(nnan_ab, nnan_cd));
+        if (nnan != ((1LL << vstep) - 1)) {
+            npy_uint64 nnan_4[4];
+            nnan_4[0] = npyv_tobits_@bsfx@(nnan_a);
+            nnan_4[1] = npyv_tobits_@bsfx@(nnan_b);
+            nnan_4[2] = npyv_tobits_@bsfx@(nnan_c);
+            nnan_4[3] = npyv_tobits_@bsfx@(nnan_d);
+            for (int ni = 0; ni < 4; ++ni) {
+                for (int vi = 0; vi < vstep; ++vi) {
+                    if (!((nnan_4[ni] >> vi) & 1)) {
+                        return i + ni*vstep + vi;
+                    }
+                }
+            }
+        }
+    #endif
+    }
+    for (npy_intp n = len0 & -vstep; i < n; i += vstep) {
+        npyv_@usfx@ vi = npyv_setall_@usfx@((npyv_lanetype_@usfx@)i);
+        npyv_@sfx@ a = npyv_load_@sfx@(ip + i);
+        npyv_@bsfx@ m_acc = npyv_@intrin@_@sfx@(a, acc);
+        acc = npyv_select_@sfx@(m_acc, a, acc);
+        acc_indices = npyv_select_@usfx@(m_acc, npyv_add_@usfx@(vi, vindices_0), acc_indices);
+    #if @is_fp@
+        npyv_@bsfx@ nnan_a = npyv_notnan_@sfx@(a);
+        npy_uint64 nnan = npyv_tobits_@bsfx@(nnan_a);
+        if (nnan != ((1LL << vstep) - 1)) {
+            for (int vi = 0; vi < vstep; ++vi) {
+                if (!((nnan >> vi) & 1)) {
+                    return i + vi;
+                }
+            }
+        }
+    #endif
+    }
+
+    // reduce
+    npyv_lanetype_@sfx@ dacc[npyv_nlanes_@sfx@];
+    npyv_lanetype_@usfx@ dacc_i[npyv_nlanes_@sfx@];
+    npyv_store_@usfx@(dacc_i, acc_indices);
+    npyv_store_@sfx@(dacc, acc);
+
+    s_acc = dacc[0];
+    ret_idx = dacc_i[0];
+    for (int vi = 1; vi < vstep; ++vi) {
+        if (dacc[vi] @op@ s_acc) {
+            s_acc = dacc[vi];
+            ret_idx = (npy_intp)dacc_i[vi];
+        }
+    }
+    // get the lowest index in case of matched values
+    for (int vi = 0; vi < vstep; ++vi) {
+        if (s_acc == dacc[vi] && ret_idx > (npy_intp)dacc_i[vi]) {
+            ret_idx = dacc_i[vi];
+        }
+    }
+scalar_loop:
+    for (; i < len; ++i) {
+        npyv_lanetype_@sfx@ a = ip[i];
+    #if @is_fp@
+        if (!(a @iop@= s_acc)) {  // negated, for correct nan handling
+    #else
+        if (a @op@ s_acc) {
+    #endif
+            s_acc = a;
+            ret_idx = i;
+        #if @is_fp@
+            if (npy_isnan(s_acc)) {
+                // nan encountered, it's maximal
+                return ret_idx;
+            }
+        #endif
+        }
+    }
+    return ret_idx;
+}
+/**end repeat1**/
+#endif // chk_simd
+/**end repeat**/
+
+/**begin repeat
+ * #TYPE = UBYTE, USHORT, UINT, ULONG, ULONGLONG,
+ *         BYTE, SHORT, INT, LONG, LONGLONG,
+ *         FLOAT, DOUBLE, LONGDOUBLE#
+ *
+ * #BTYPE = BYTE, SHORT, INT, LONG, LONGLONG,
+ *          BYTE, SHORT, INT, LONG, LONGLONG,
+ *          FLOAT, DOUBLE, LONGDOUBLE#
+ * #type = npy_ubyte, npy_ushort, npy_uint, npy_ulong, npy_ulonglong,
+ *         npy_byte, npy_short, npy_int, npy_long, npy_longlong,
+ *         npy_float, npy_double, npy_longdouble#
+ *
+ * #is_fp = 0*10, 1*3#
+ * #is_unsigned = 1*5, 0*5, 0*3#
+ */
+#undef TO_SIMD_SFX
+#if 0
+/**begin repeat1
+ * #len = 8, 16, 32, 64#
+ */
+#elif NPY_SIMD && NPY_BITSOF_@BTYPE@ == @len@
+    #if @is_fp@
+        #define TO_SIMD_SFX(X) X##_f@len@
+        #if NPY_BITSOF_@BTYPE@ == 64 && !NPY_SIMD_F64
+            #undef TO_SIMD_SFX
+        #endif
+    #elif @is_unsigned@
+        #define TO_SIMD_SFX(X) X##_u@len@
+    #else
+        #define TO_SIMD_SFX(X) X##_s@len@
+    #endif
+/**end repeat1**/
+#endif
+
+/**begin repeat1
+ * #func = argmax, argmin#
+ * #op = >, <#
+ * #iop = <, >#
+ */
+NPY_NO_EXPORT int NPY_CPU_DISPATCH_CURFX(@TYPE@_@func@)
+(@type@ *ip, npy_intp n, npy_intp *mindx, PyArrayObject *NPY_UNUSED(aip))
+{
+#if @is_fp@
+    if (npy_isnan(*ip)) {
+        // nan encountered; it's maximal|minimal
+        *mindx = 0;
+        return 0;
+    }
+#endif
+#ifdef TO_SIMD_SFX
+    *mindx = TO_SIMD_SFX(simd_@func@)((TO_SIMD_SFX(npyv_lanetype)*)ip, n);
+    npyv_cleanup();
+#else
+    @type@ mp = *ip;
+    *mindx = 0;
+    npy_intp i = 1;
+
+    for (; i < n; ++i) {
+        @type@ a = ip[i];
+    #if @is_fp@
+        if (!(a @iop@= mp)) {  // negated, for correct nan handling
+    #else
+        if (a @op@ mp) {
+    #endif
+            mp = a;
+            *mindx = i;
+        #if @is_fp@
+            if (npy_isnan(mp)) {
+                // nan encountered, it's maximal|minimal
+                break;
+            }
+        #endif
+        }
+    }
+#endif // TO_SIMD_SFX
+    return 0;
+}
+/**end repeat1**/
+/**end repeat**/
+
+NPY_NO_EXPORT int NPY_CPU_DISPATCH_CURFX(BOOL_argmax)
+(npy_bool *ip, npy_intp len, npy_intp *mindx, PyArrayObject *NPY_UNUSED(aip))
+
+{
+    npy_intp i = 0;
+#if NPY_SIMD
+    const npyv_u8 zero = npyv_zero_u8();
+    const int vstep = npyv_nlanes_u8;
+    const int wstep = vstep * 4;
+    for (npy_intp n = len & -wstep; i < n; i += wstep) {
+        npyv_u8 a = npyv_load_u8(ip + i + vstep*0);
+        npyv_u8 b = npyv_load_u8(ip + i + vstep*1);
+        npyv_u8 c = npyv_load_u8(ip + i + vstep*2);
+        npyv_u8 d = npyv_load_u8(ip + i + vstep*3);
+        npyv_b8 m_a = npyv_cmpeq_u8(a, zero);
+        npyv_b8 m_b = npyv_cmpeq_u8(b, zero);
+        npyv_b8 m_c = npyv_cmpeq_u8(c, zero);
+        npyv_b8 m_d = npyv_cmpeq_u8(d, zero);
+        npyv_b8 m_ab = npyv_and_b8(m_a, m_b);
+        npyv_b8 m_cd = npyv_and_b8(m_c, m_d);
+        npy_uint64 m = npyv_tobits_b8(npyv_and_b8(m_ab, m_cd));
+    #if NPY_SIMD == 512
+        if (m != NPY_MAX_UINT64) {
+    #else
+        if ((npy_int64)m != ((1LL << vstep) - 1)) {
+    #endif
+            break;
+        }
+    }
+    npyv_cleanup();
+#endif // NPY_SIMD
+    for (; i < len; ++i) {
+        if (ip[i]) {
+            *mindx = i;
+            return 0;
+        }
+    }
+    *mindx = 0;
+    return 0;
+}
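Note on the NaN handling in the scalar fallback above: the negated comparison `!(a <= best)` is true both when `a > best` and when `a` is NaN, so the first NaN is selected as the maximum and the scan can stop there. The following is a minimal standalone sketch of that idea only. The name `sketch_argmax_f64` is hypothetical and not part of NumPy, and the sketch ignores the SIMD paths and the dispatch machinery entirely.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a NumPy function): index of the first maximum
 * of a non-empty double array, treating the first NaN as maximal.
 */
static ptrdiff_t
sketch_argmax_f64(const double *ip, ptrdiff_t n)
{
    assert(n > 0);
    double best = ip[0];
    ptrdiff_t best_idx = 0;

    if (isnan(best)) {
        /* A leading NaN is already the answer. */
        return 0;
    }
    for (ptrdiff_t i = 1; i < n; ++i) {
        double a = ip[i];
        /* Negated on purpose: true when a > best OR a is NaN. */
        if (!(a <= best)) {
            best = a;
            best_idx = i;
            if (isnan(best)) {
                /* NaN encountered: nothing later can replace it. */
                break;
            }
        }
    }
    return best_idx;
}
```

Because the comparison is strict for ordinary values, ties keep the lowest index, which is the same property the vector code restores with its second "lowest index" pass after the reduction.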
diff --git a/numpy/core/src/multiarray/array_coercion.c b/numpy/core/src/multiarray/array_coercion.c

index 2598e4bde6ea1720b06f8a54e745a54b84a689a9..562e4f0086e057b948b3bc701b95551875985e25 100644 (file)
--- a/numpy/core/src/multiarray/array_coercion.c
+++ b/numpy/core/src/multiarray/array_coercion.c
@@ -67,8 +67,8 @@
   *
   * The code here avoid multiple conversion of array-like objects (including
   * sequences). These objects are cached after conversion, which will require
- * additional memory, but can drastically speed up coercion from from array
- * like objects.
+ * additional memory, but can drastically speed up coercion from array like
+ * objects.
   */
  
  
@@ -230,6 +230,16 @@ npy_discover_dtype_from_pytype(PyTypeObject *pytype)
      return (PyArray_DTypeMeta *)DType;
  }
  
+/*
+ * Note: This function never fails, but will return `NULL` for unknown scalars
+ *       and `None` for known array-likes (e.g. tuple, list, ndarray).
+ */
+NPY_NO_EXPORT PyObject *
+PyArray_DiscoverDTypeFromScalarType(PyTypeObject *pytype)
+{
+    return (PyObject *)npy_discover_dtype_from_pytype(pytype);
+}
+
  
  /**
   * Find the correct DType class for the given python type. If flags is NULL
diff --git a/numpy/core/src/multiarray/array_coercion.h b/numpy/core/src/multiarray/array_coercion.h

index f2482cecc0058248ac1f9985b4f1d69ecd2cec7f..63d543cf7508896840665edc1a17f51073596ab3 100644 (file)
--- a/numpy/core/src/multiarray/array_coercion.h
+++ b/numpy/core/src/multiarray/array_coercion.h
@@ -19,6 +19,9 @@ NPY_NO_EXPORT int
  _PyArray_MapPyTypeToDType(
          PyArray_DTypeMeta *DType, PyTypeObject *pytype, npy_bool userdef);
  
+NPY_NO_EXPORT PyObject *
+PyArray_DiscoverDTypeFromScalarType(PyTypeObject *pytype);
+
  NPY_NO_EXPORT int
  PyArray_Pack(PyArray_Descr *descr, char *item, PyObject *value);
  
diff --git a/numpy/core/src/multiarray/array_method.c b/numpy/core/src/multiarray/array_method.c

index d93dac5069494354e7b1973afdc786f79e9a2c30..3450273b106d2224165e02707675a990a27a15ec 100644 (file)
--- a/numpy/core/src/multiarray/array_method.c
+++ b/numpy/core/src/multiarray/array_method.c
@@ -48,13 +48,19 @@
   *
   * We could allow setting the output descriptors specifically to simplify
   * this step.
+ *
+ * Note that the default version will indicate that the cast can be done
+ * using `arr.view(new_dtype)` if the default cast-safety is
+ * set to "no-cast".  This default function cannot be used if a view may
+ * be sufficient for casting but the cast is not always "no-cast".
   */
  static NPY_CASTING
  default_resolve_descriptors(
          PyArrayMethodObject *method,
          PyArray_DTypeMeta **dtypes,
          PyArray_Descr **input_descrs,
-        PyArray_Descr **output_descrs)
+        PyArray_Descr **output_descrs,
+        npy_intp *view_offset)
  {
      int nin = method->nin;
      int nout = method->nout;
@@ -62,7 +68,7 @@ default_resolve_descriptors(
      for (int i = 0; i < nin + nout; i++) {
          PyArray_DTypeMeta *dtype = dtypes[i];
          if (input_descrs[i] != NULL) {
-            output_descrs[i] = ensure_dtype_nbo(input_descrs[i]);
+            output_descrs[i] = NPY_DT_CALL_ensure_canonical(input_descrs[i]);
          }
          else {
              output_descrs[i] = NPY_DT_CALL_default_descr(dtype);
@@ -76,6 +82,13 @@ default_resolve_descriptors(
       * abstract ones or unspecified outputs).  We can use the common-dtype
       * operation to provide a default here.
       */
+    if (method->casting == NPY_NO_CASTING) {
+        /*
+         * By (current) definition no-casting should imply viewable.  This
+         * is currently indicated for example for object to object cast.
+         */
+        *view_offset = 0;
+    }
      return method->casting;
  
    fail:
@@ -102,9 +115,10 @@ is_contiguous(
  /**
   * The default method to fetch the correct loop for a cast or ufunc
   * (at the time of writing only casts).
- * The default version can return loops explicitly registered during method
- * creation. It does specialize contiguous loops, although has to check
- * all descriptors itemsizes for this.
+ * Note that the default function provided here will only indicate that a cast
+ * can be done as a view (i.e., arr.view(new_dtype)) when this is trivially
+ * true, i.e., for cast safety "no-cast". It will not recognize view as an
+ * option for other casts (e.g., viewing '>i8' as '>i4' with an offset of 4).
   *
   * @param context
   * @param aligned
@@ -119,7 +133,7 @@ is_contiguous(
  NPY_NO_EXPORT int
  npy_default_get_strided_loop(
          PyArrayMethod_Context *context,
-        int aligned, int NPY_UNUSED(move_references), npy_intp *strides,
+        int aligned, int NPY_UNUSED(move_references), const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop, NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
  {
@@ -166,7 +180,7 @@ validate_spec(PyArrayMethod_Spec *spec)
                  "not exceed %d. (method: %s)", NPY_MAXARGS, spec->name);
          return -1;
      }
-    switch (spec->casting & ~_NPY_CAST_IS_VIEW) {
+    switch (spec->casting) {
          case NPY_NO_CASTING:
          case NPY_EQUIV_CASTING:
          case NPY_SAFE_CASTING:
@@ -247,12 +261,17 @@ fill_arraymethod_from_slots(
                  meth->resolve_descriptors = slot->pfunc;
                  continue;
              case NPY_METH_get_loop:
-                if (private) {
-                    /* Only allow override for private functions initially */
-                    meth->get_strided_loop = slot->pfunc;
-                    continue;
-                }
-                break;
+                /*
+                 * NOTE: get_loop is considered "unstable" in the public API,
+                 *       I do not like the signature, and the `move_references`
+                 *       parameter must NOT be used.
+                 *       (as in: we should not worry about changing it, but of
+                 *       course that would not break it immediately.)
+                 */
+                /* Only allow override for private functions initially */
+                meth->get_strided_loop = slot->pfunc;
+                continue;
+            /* "Typical" loops, used by the default `get_loop` */
              case NPY_METH_strided_loop:
                  meth->strided_loop = slot->pfunc;
                  continue;
@@ -446,6 +465,15 @@ arraymethod_dealloc(PyObject *self)
  
      PyMem_Free(meth->name);
  
+    if (meth->wrapped_meth != NULL) {
+        /* Cleanup for wrapping array method (defined in umath) */
+        Py_DECREF(meth->wrapped_meth);
+        for (int i = 0; i < meth->nin + meth->nout; i++) {
+            Py_XDECREF(meth->wrapped_dtypes[i]);
+        }
+        PyMem_Free(meth->wrapped_dtypes);
+    }
+
      Py_TYPE(self)->tp_free(self);
  }
  
@@ -495,8 +523,9 @@ boundarraymethod_dealloc(PyObject *self)
  
  
  /*
- * Calls resolve_descriptors() and returns the casting level and the resolved
- * descriptors as a tuple. If the operation is impossible returns (-1, None).
+ * Calls resolve_descriptors() and returns the casting level, the resolved
+ * descriptors as a tuple, and a possible view-offset (integer or None).
+ * If the operation is impossible returns (-1, None, None).
   * May raise an error, but usually should not.
   * The function validates the casting attribute compared to the returned
   * casting level.
@@ -551,14 +580,15 @@ boundarraymethod__resolve_descripors(
          }
      }
  
+    npy_intp view_offset = NPY_MIN_INTP;
      NPY_CASTING casting = self->method->resolve_descriptors(
-            self->method, self->dtypes, given_descrs, loop_descrs);
+            self->method, self->dtypes, given_descrs, loop_descrs, &view_offset);
  
      if (casting < 0 && PyErr_Occurred()) {
          return NULL;
      }
      else if (casting < 0) {
-        return Py_BuildValue("iO", casting, Py_None);
+        return Py_BuildValue("iOO", casting, Py_None, Py_None);
      }
  
      PyObject *result_tuple = PyTuple_New(nin + nout);
@@ -570,9 +600,22 @@ boundarraymethod__resolve_descripors(
          PyTuple_SET_ITEM(result_tuple, i, (PyObject *)loop_descrs[i]);
      }
  
+    PyObject *view_offset_obj;
+    if (view_offset == NPY_MIN_INTP) {
+        Py_INCREF(Py_None);
+        view_offset_obj = Py_None;
+    }
+    else {
+        view_offset_obj = PyLong_FromSsize_t(view_offset);
+        if (view_offset_obj == NULL) {
+            Py_DECREF(result_tuple);
+            return NULL;
+        }
+    }
+
      /*
-     * The casting flags should be the most generic casting level (except the
-     * cast-is-view flag.  If no input is parametric, it must match exactly.
+     * The casting flags should be the most generic casting level.
+     * If no input is parametric, it must match exactly.
       *
       * (Note that these checks are only debugging checks.)
       */
@@ -584,7 +627,7 @@ boundarraymethod__resolve_descripors(
          }
      }
      if (self->method->casting != -1) {
-        NPY_CASTING cast = casting & ~_NPY_CAST_IS_VIEW;
+        NPY_CASTING cast = casting;
          if (self->method->casting !=
                  PyArray_MinCastSafety(cast, self->method->casting)) {
              PyErr_Format(PyExc_RuntimeError,
@@ -592,6 +635,7 @@ boundarraymethod__resolve_descripors(
                      "(set level is %d, got %d for method %s)",
                      self->method->casting, cast, self->method->name);
              Py_DECREF(result_tuple);
+            Py_DECREF(view_offset_obj);
              return NULL;
          }
          if (!parametric) {
@@ -608,12 +652,13 @@ boundarraymethod__resolve_descripors(
                          "(set level is %d, got %d for method %s)",
                          self->method->casting, cast, self->method->name);
                  Py_DECREF(result_tuple);
+                Py_DECREF(view_offset_obj);
                  return NULL;
              }
          }
      }
  
-    return Py_BuildValue("iN", casting, result_tuple);
+    return Py_BuildValue("iNN", casting, result_tuple, view_offset_obj);
  }
  
  
@@ -694,8 +739,9 @@ boundarraymethod__simple_strided_call(
          return NULL;
      }
  
+    npy_intp view_offset = NPY_MIN_INTP;
      NPY_CASTING casting = self->method->resolve_descriptors(
-            self->method, self->dtypes, descrs, out_descrs);
+            self->method, self->dtypes, descrs, out_descrs, &view_offset);
  
      if (casting < 0) {
          PyObject *err_type = NULL, *err_value = NULL, *err_traceback = NULL;
@@ -817,15 +863,17 @@ generic_masked_strided_loop(PyArrayMethod_Context *context,
  
          /* Process unmasked values */
          mask = npy_memchr(mask, 0, mask_stride, N, &subloopsize, 0);
-        int res = strided_loop(context,
-                dataptrs, &subloopsize, strides, strided_loop_auxdata);
-        if (res != 0) {
-            return res;
-        }
-        for (int i = 0; i < nargs; i++) {
-            dataptrs[i] += subloopsize * strides[i];
+        if (subloopsize > 0) {
+            int res = strided_loop(context,
+                    dataptrs, &subloopsize, strides, strided_loop_auxdata);
+            if (res != 0) {
+                return res;
+            }
+            for (int i = 0; i < nargs; i++) {
+                dataptrs[i] += subloopsize * strides[i];
+            }
+            N -= subloopsize;
          }
-        N -= subloopsize;
      } while (N > 0);
  
      return 0;
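The masked-loop hunk above now calls the inner loop only when the run of unmasked values is non-empty. A minimal standalone sketch of that control flow follows, assuming a contiguous byte mask and an in-place integer operation. The names `double_ints`, `run_length`, and `masked_double` are hypothetical and exist only for illustration; they are not NumPy functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for an inner strided loop: doubles `count` ints. */
static int
double_ints(char *data, ptrdiff_t stride, ptrdiff_t count)
{
    for (ptrdiff_t i = 0; i < count; i++) {
        int *value = (int *)(data + i * stride);
        *value *= 2;
    }
    return 0;
}

/* Length of the leading run of mask bytes whose truth value equals `want`. */
static ptrdiff_t
run_length(const unsigned char *mask, ptrdiff_t n, int want)
{
    ptrdiff_t len = 0;
    while (len < n && (mask[len] != 0) == (want != 0)) {
        len++;
    }
    return len;
}

/* Apply `double_ints` only to the elements whose mask byte is nonzero. */
static int
masked_double(char *data, ptrdiff_t stride,
              const unsigned char *mask, ptrdiff_t n)
{
    while (n > 0) {
        /* Skip the run of masked-out values. */
        ptrdiff_t skip = run_length(mask, n, 0);
        data += skip * stride;
        mask += skip;
        n -= skip;

        /* Process the following run of unmasked values, if there is one. */
        ptrdiff_t run = run_length(mask, n, 1);
        if (run > 0) {
            int res = double_ints(data, stride, run);
            if (res != 0) {
                return res;
            }
            data += run * stride;
            mask += run;
            n -= run;
        }
    }
    return 0;
}
```

Guarding the call with `run > 0` means an inner loop never sees a zero-length request, which is the same precaution the patched `generic_masked_strided_loop` takes with `subloopsize`.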
diff --git a/numpy/core/src/multiarray/array_method.h b/numpy/core/src/multiarray/array_method.h
index 7b7372bd0b59834b21d039e6ef2e6f0f504d1ebe..30dd94a80b7d9e66a26b23ed557543f4fffbf2d4 100644
--- a/numpy/core/src/multiarray/array_method.h
+++ b/numpy/core/src/multiarray/array_method.h
@@ -70,18 +70,71 @@ typedef NPY_CASTING (resolve_descriptors_function)(
          struct PyArrayMethodObject_tag *method,
          PyArray_DTypeMeta **dtypes,
          PyArray_Descr **given_descrs,
-        PyArray_Descr **loop_descrs);
+        PyArray_Descr **loop_descrs,
+        npy_intp *view_offset);
  
  
  typedef int (get_loop_function)(
          PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *strides,
+        const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags);
  
  
+/*
+ * The following functions are only used by the wrapping array method defined
+ * in umath/wrapping_array_method.c
+ */
+
+/*
+ * The function converts the given descriptors (passed in to
+ * `resolve_descriptors`) and translates them for the wrapped loop.
+ * The new descriptors MUST be viewable with the old ones, `NULL` must be
+ * supported (for outputs) and should normally be forwarded.
+ *
+ * The function must clean up on error.
+ *
+ * NOTE: We currently assume that this translation gives "viewable" results.
+ *       I.e. there is no additional casting related to the wrapping process.
+ *       In principle that could be supported, but not sure it is useful.
+ *       This currently also means that e.g. alignment must apply identically
+ *       to the new dtypes.
+ *
+ * TODO: Due to the fact that `resolve_descriptors` is also used for `can_cast`
+ *       there is no way to "pass out" the result of this function.  This means
+ *       it will be called twice for every ufunc call.
+ *       (I am considering including `auxdata` as an "optional" parameter to
+ *       `resolve_descriptors`, so that it can be filled there if not NULL.)
+ */
+typedef int translate_given_descrs_func(int nin, int nout,
+        PyArray_DTypeMeta *wrapped_dtypes[],
+        PyArray_Descr *given_descrs[], PyArray_Descr *new_descrs[]);
+
+/**
+ * The function to convert the actual loop descriptors (as returned by the
+ * original `resolve_descriptors` function) to the ones the output array
+ * should use.
+ * This function must return "viewable" types, it must not mutate them in any
+ * form that would break the inner-loop logic.  Does not need to support NULL.
+ *
+ * The function must clean up on error.
+ *
+ * @param nin,nout Number of input and output arguments
+ * @param new_dtypes The DTypes of the output (usually probably not needed)
+ * @param given_descrs Original given_descrs to the resolver, necessary to
+ *        fetch any information related to the new dtypes from the original.
+ * @param original_descrs The `loop_descrs` returned by the wrapped loop.
+ * @param loop_descrs The output descriptors, compatible with `original_descrs`.
+ *
+ * @returns 0 on success, -1 on failure.
+ */
+typedef int translate_loop_descrs_func(int nin, int nout,
+        PyArray_DTypeMeta *new_dtypes[], PyArray_Descr *given_descrs[],
+        PyArray_Descr *original_descrs[], PyArray_Descr *loop_descrs[]);
+
+
  /*
   * This struct will be public and necessary for creating a new ArrayMethod
   * object (casting and ufuncs).
@@ -124,6 +177,11 @@ typedef struct PyArrayMethodObject_tag {
      PyArrayMethod_StridedLoop *contiguous_loop;
      PyArrayMethod_StridedLoop *unaligned_strided_loop;
      PyArrayMethod_StridedLoop *unaligned_contiguous_loop;
+    /* Chunk only used for wrapping array method defined in umath */
+    struct PyArrayMethodObject_tag *wrapped_meth;
+    PyArray_DTypeMeta **wrapped_dtypes;
+    translate_given_descrs_func *translate_given_descrs;
+    translate_loop_descrs_func *translate_loop_descrs;
  } PyArrayMethodObject;
  
  
@@ -166,7 +224,7 @@ extern NPY_NO_EXPORT PyTypeObject PyBoundArrayMethod_Type;
  NPY_NO_EXPORT int
  npy_default_get_strided_loop(
          PyArrayMethod_Context *context,
-        int aligned, int NPY_UNUSED(move_references), npy_intp *strides,
+        int aligned, int NPY_UNUSED(move_references), const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop, NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags);
  
diff --git a/numpy/core/src/multiarray/arrayfunction_override.c b/numpy/core/src/multiarray/arrayfunction_override.c
index 463a2d4d8724c994c0e4e81efe297f95ee972820..af53d78219e3dff6dd8c695d03612482e0add91b 100644
--- a/numpy/core/src/multiarray/arrayfunction_override.c
+++ b/numpy/core/src/multiarray/arrayfunction_override.c
@@ -37,7 +37,7 @@ get_array_function(PyObject *obj)
          return ndarray_array_function;
      }
  
-    PyObject *array_function = PyArray_LookupSpecial(obj, "__array_function__");
+    PyObject *array_function = PyArray_LookupSpecial(obj, npy_ma_str_array_function);
      if (array_function == NULL && PyErr_Occurred()) {
          PyErr_Clear(); /* TODO[gh-14801]: propagate crashes during attribute access? */
      }
@@ -333,7 +333,7 @@ NPY_NO_EXPORT PyObject *
  array_implement_array_function(
      PyObject *NPY_UNUSED(dummy), PyObject *positional_args)
  {
-    PyObject *implementation, *public_api, *relevant_args, *args, *kwargs;
+    PyObject *res, *implementation, *public_api, *relevant_args, *args, *kwargs;
  
      if (!PyArg_UnpackTuple(
              positional_args, "implement_array_function", 5, 5,
@@ -357,10 +357,20 @@ array_implement_array_function(
              }
              Py_DECREF(tmp_has_override);
              PyDict_DelItem(kwargs, npy_ma_str_like);
+
+            /*
+             * If `like=` kwarg was removed, `implementation` points to the NumPy
+             * public API, as `public_api` is in that case the wrapper dispatcher
+             * function. For example, in the `np.full` case, `implementation` is
+             * `np.full`, whereas `public_api` is `_full_with_like`. This is done
+             * to ensure `__array_function__` implementations can do
+             * equality/identity comparisons when `like=` is present.
+             */
+            public_api = implementation;
          }
      }
  
-    PyObject *res = array_implement_array_function_internal(
+    res = array_implement_array_function_internal(
          public_api, relevant_args, args, kwargs);
  
      if (res == Py_NotImplemented) {
diff --git a/numpy/core/src/multiarray/arrayobject.c b/numpy/core/src/multiarray/arrayobject.c
index 292523bbc26df8971f110f1d7809d5976dfd229c..a1f0e2d5bb7a5c73ba8ec06a203b3dbde6c03822 100644
--- a/numpy/core/src/multiarray/arrayobject.c
+++ b/numpy/core/src/multiarray/arrayobject.c
@@ -75,36 +75,19 @@ PyArray_Size(PyObject *op)
      }
  }
  
-/*NUMPY_API
- *
- * Precondition: 'arr' is a copy of 'base' (though possibly with different
- * strides, ordering, etc.). This function sets the UPDATEIFCOPY flag and the
- * ->base pointer on 'arr', so that when 'arr' is destructed, it will copy any
- * changes back to 'base'. DEPRECATED, use PyArray_SetWritebackIfCopyBase
- *
- * Steals a reference to 'base'.
- *
- * Returns 0 on success, -1 on failure.
- */
+/*NUMPY_API */
  NPY_NO_EXPORT int
  PyArray_SetUpdateIfCopyBase(PyArrayObject *arr, PyArrayObject *base)
  {
-    int ret;
-    /* 2017-Nov  -10 1.14 (for PyPy only) */
-    /* 2018-April-21 1.15 (all Python implementations) */
-    if (DEPRECATE("PyArray_SetUpdateIfCopyBase is deprecated, use "
-              "PyArray_SetWritebackIfCopyBase instead, and be sure to call "
-              "PyArray_ResolveWritebackIfCopy before the array is deallocated, "
-              "i.e. before the last call to Py_DECREF. If cleaning up from an "
-              "error, PyArray_DiscardWritebackIfCopy may be called instead to "
-              "throw away the scratch buffer.") < 0)
-        return -1;
-    ret = PyArray_SetWritebackIfCopyBase(arr, base);
-    if (ret >=0) {
-        PyArray_ENABLEFLAGS(arr, NPY_ARRAY_UPDATEIFCOPY);
-        PyArray_CLEARFLAGS(arr, NPY_ARRAY_WRITEBACKIFCOPY);
-    }
-    return ret;
+    /* 2021-Dec-15 1.23 */
+    PyErr_SetString(PyExc_RuntimeError,
+        "PyArray_SetUpdateIfCopyBase is disabled, use "
+        "PyArray_SetWritebackIfCopyBase instead, and be sure to call "
+        "PyArray_ResolveWritebackIfCopy before the array is deallocated, "
+        "i.e. before the last call to Py_DECREF. If cleaning up from an "
+        "error, PyArray_DiscardWritebackIfCopy may be called instead to "
+        "throw away the scratch buffer.");
+    return -1;
  }
  
  /*NUMPY_API
@@ -377,9 +360,9 @@ PyArray_ResolveWritebackIfCopy(PyArrayObject * self)
  {
      PyArrayObject_fields *fa = (PyArrayObject_fields *)self;
      if (fa && fa->base) {
-        if ((fa->flags & NPY_ARRAY_UPDATEIFCOPY) || (fa->flags & NPY_ARRAY_WRITEBACKIFCOPY)) {
+        if (fa->flags & NPY_ARRAY_WRITEBACKIFCOPY) {
              /*
-             * UPDATEIFCOPY or WRITEBACKIFCOPY means that fa->base's data
+             * WRITEBACKIFCOPY means that fa->base's data
               * should be updated with the contents
               * of self.
               * fa->base->flags is not WRITEABLE to protect the relationship
@@ -388,7 +371,6 @@ PyArray_ResolveWritebackIfCopy(PyArrayObject * self)
              int retval = 0;
              PyArray_ENABLEFLAGS(((PyArrayObject *)fa->base),
                                                      NPY_ARRAY_WRITEABLE);
-            PyArray_CLEARFLAGS(self, NPY_ARRAY_UPDATEIFCOPY);
              PyArray_CLEARFLAGS(self, NPY_ARRAY_WRITEBACKIFCOPY);
              retval = PyArray_CopyAnyInto((PyArrayObject *)fa->base, self);
              Py_DECREF(fa->base);
@@ -462,25 +444,6 @@ array_dealloc(PyArrayObject *self)
                  PyErr_Clear();
              }
          }
-        if (PyArray_FLAGS(self) & NPY_ARRAY_UPDATEIFCOPY) {
-            /* DEPRECATED, remove once the flag is removed */
-            char const * msg = "UPDATEIFCOPY detected in array_dealloc. "
-                " Required call to PyArray_ResolveWritebackIfCopy or "
-                "PyArray_DiscardWritebackIfCopy is missing";
-            /*
-             * prevent reaching 0 twice and thus recursing into dealloc.
-             * Increasing sys.gettotalrefcount, but path should not be taken.
-             */
-            Py_INCREF(self);
-            /* 2017-Nov-10 1.14 */
-            WARN_IN_DEALLOC(PyExc_DeprecationWarning, msg);
-            retval = PyArray_ResolveWritebackIfCopy(self);
-            if (retval < 0)
-            {
-                PyErr_Print();
-                PyErr_Clear();
-            }
-        }
          /*
           * If fa->base is non-NULL, it is something
           * to DECREF -- either a view or a buffer object
@@ -505,10 +468,6 @@ array_dealloc(PyArrayObject *self)
              free(fa->data);
          }
          else {
-            /*
-             * In theory `PyArray_NBYTES_ALLOCATED`, but differs somewhere?
-             * So instead just use the knowledge that 0 is impossible.
-             */
              size_t nbytes = PyArray_NBYTES(self);
              if (nbytes == 0) {
                  nbytes = 1;
@@ -572,8 +531,6 @@ PyArray_DebugPrint(PyArrayObject *obj)
          printf(" NPY_ALIGNED");
      if (fobj->flags & NPY_ARRAY_WRITEABLE)
          printf(" NPY_WRITEABLE");
-    if (fobj->flags & NPY_ARRAY_UPDATEIFCOPY)
-        printf(" NPY_UPDATEIFCOPY");
      if (fobj->flags & NPY_ARRAY_WRITEBACKIFCOPY)
          printf(" NPY_WRITEBACKIFCOPY");
      printf("\n");
@@ -661,15 +618,15 @@ array_might_be_written(PyArrayObject *obj)
  
  /*NUMPY_API
   *
- * This function does nothing if obj is writeable, and raises an exception
- * (and returns -1) if obj is not writeable. It may also do other
- * house-keeping, such as issuing warnings on arrays which are transitioning
- * to become views. Always call this function at some point before writing to
- * an array.
+ *  This function does nothing and returns 0 if *obj* is writeable.
+ *  It raises an exception and returns -1 if *obj* is not writeable.
+ *  It may also do other house-keeping, such as issuing warnings on
+ *  arrays which are transitioning to become views. Always call this
+ *  function at some point before writing to an array.
   *
- * 'name' is a name for the array, used to give better error
- * messages. Something like "assignment destination", "output array", or even
- * just "array".
+ *  *name* is a name for the array, used to give better error messages.
+ *  It can be something like "assignment destination", "output array",
+ *  or even just "array".
   */
  NPY_NO_EXPORT int
  PyArray_FailUnlessWriteable(PyArrayObject *obj, const char *name)
@@ -1068,35 +1025,85 @@ static PyObject *
  _void_compare(PyArrayObject *self, PyArrayObject *other, int cmp_op)
  {
      if (!(cmp_op == Py_EQ || cmp_op == Py_NE)) {
-        PyErr_SetString(PyExc_ValueError,
+        PyErr_SetString(PyExc_TypeError,
                  "Void-arrays can only be compared for equality.");
          return NULL;
      }
-    if (PyArray_HASFIELDS(self)) {
-        PyObject *res = NULL, *temp, *a, *b;
-        PyObject *key, *value, *temp2;
-        PyObject *op;
-        Py_ssize_t pos = 0;
+    if (PyArray_TYPE(other) != NPY_VOID) {
+        PyErr_SetString(PyExc_TypeError,
+                "Cannot compare structured or void to non-void arrays.");
+        return NULL;
+    }
+    if (PyArray_HASFIELDS(self) && PyArray_HASFIELDS(other)) {
+        PyArray_Descr *self_descr = PyArray_DESCR(self);
+        PyArray_Descr *other_descr = PyArray_DESCR(other);
+
+        /* Use promotion to decide whether the comparison is valid */
+        PyArray_Descr *promoted = PyArray_PromoteTypes(self_descr, other_descr);
+        if (promoted == NULL) {
+            PyErr_SetString(PyExc_TypeError,
+                    "Cannot compare structured arrays unless they have a "
+                    "common dtype.  I.e. `np.result_type(arr1, arr2)` must "
+                    "be defined.");
+            return NULL;
+        }
+        Py_DECREF(promoted);
+
          npy_intp result_ndim = PyArray_NDIM(self) > PyArray_NDIM(other) ?
                              PyArray_NDIM(self) : PyArray_NDIM(other);
  
-        op = (cmp_op == Py_EQ ? n_ops.logical_and : n_ops.logical_or);
-        while (PyDict_Next(PyArray_DESCR(self)->fields, &pos, &key, &value)) {
-            if (NPY_TITLE_KEY(key, value)) {
-                continue;
-            }
-            a = array_subscript_asarray(self, key);
+        int field_count = PyTuple_GET_SIZE(self_descr->names);
+        if (field_count != PyTuple_GET_SIZE(other_descr->names)) {
+            PyErr_SetString(PyExc_TypeError,
+                    "Cannot compare structured dtypes with different number of "
+                    "fields.  (unreachable error please report to NumPy devs)");
+            return NULL;
+        }
+
+        PyObject *op = (cmp_op == Py_EQ ? n_ops.logical_and : n_ops.logical_or);
+        PyObject *res = NULL;
+        for (int i = 0; i < field_count; ++i) {
+            PyObject *fieldname, *temp, *temp2;
+
+            fieldname = PyTuple_GET_ITEM(self_descr->names, i);
+            PyArrayObject *a = (PyArrayObject *)array_subscript_asarray(
+                    self, fieldname);
              if (a == NULL) {
                  Py_XDECREF(res);
                  return NULL;
              }
-            b = array_subscript_asarray(other, key);
+            fieldname = PyTuple_GET_ITEM(other_descr->names, i);
+            PyArrayObject *b = (PyArrayObject *)array_subscript_asarray(
+                    other, fieldname);
              if (b == NULL) {
                  Py_XDECREF(res);
                  Py_DECREF(a);
                  return NULL;
              }
-            temp = array_richcompare((PyArrayObject *)a,b,cmp_op);
+            /*
+             * If the fields were subarrays, the dimensions may have changed.
+             * In that case, the new shape (subarray part) must match exactly.
+             * (If this is 0, there is no subarray.)
+             */
+            int field_dims_a = PyArray_NDIM(a) - PyArray_NDIM(self);
+            int field_dims_b = PyArray_NDIM(b) - PyArray_NDIM(other);
+            if (field_dims_a != field_dims_b || (
+                    field_dims_a != 0 &&  /* if 0, neither is a subarray */
+                    /* Compare only the added (subarray) dimensions: */
+                    !PyArray_CompareLists(
+                            PyArray_DIMS(a) + PyArray_NDIM(self),
+                            PyArray_DIMS(b) + PyArray_NDIM(other),
+                            field_dims_a))) {
+                PyErr_SetString(PyExc_TypeError,
+                        "Cannot compare subarrays with different shapes. "
+                        "(unreachable error, please report to NumPy devs.)");
+                Py_DECREF(a);
+                Py_DECREF(b);
+                Py_XDECREF(res);
+                return NULL;
+            }
+
+            temp = array_richcompare(a, (PyObject *)b, cmp_op);
              Py_DECREF(a);
              Py_DECREF(b);
              if (temp == NULL) {
@@ -1181,7 +1188,24 @@ _void_compare(PyArrayObject *self, PyArrayObject *other, int cmp_op)
          }
          return res;
      }
+    else if (PyArray_HASFIELDS(self) || PyArray_HASFIELDS(other)) {
+        PyErr_SetString(PyExc_TypeError,
+                "Cannot compare structured with unstructured void arrays. "
+                "(unreachable error, please report to NumPy devs.)");
+        return NULL;
+    }
      else {
+        /*
+         * Since arrays absorb subarray descriptors, this path can only be
+         * reached when both arrays have unstructured voids "V<len>" dtypes.
+         */
+        if (PyArray_ITEMSIZE(self) != PyArray_ITEMSIZE(other)) {
+            PyErr_SetString(PyExc_TypeError,
+                    "cannot compare unstructured voids of different length. "
+                    "Use bytes to compare. "
+                    "(This may return array of False in the future.)");
+            return NULL;
+        }
          /* compare as a string. Assumes self and other have same descr->type */
          return _strings_richcompare(self, other, cmp_op, 0);
      }
@@ -1366,8 +1390,6 @@ array_richcompare(PyArrayObject *self, PyObject *other, int cmp_op)
           */
  
          if (PyArray_TYPE(self) == NPY_VOID) {
-            int _res;
-
              array_other = (PyArrayObject *)PyArray_FROM_O(other);
              /*
               * If not successful, indicate that the items cannot be compared
@@ -1384,28 +1406,7 @@ array_richcompare(PyArrayObject *self, PyObject *other, int cmp_op)
                  return Py_NotImplemented;
              }
  
-            _res = PyArray_CheckCastSafety(
-                    NPY_EQUIV_CASTING,
-                    PyArray_DESCR(self), PyArray_DESCR(array_other), NULL);
-            if (_res < 0) {
-                PyErr_Clear();
-                _res = 0;
-            }
-            if (_res == 0) {
-                /* 2015-05-07, 1.10 */
-                Py_DECREF(array_other);
-                if (DEPRECATE_FUTUREWARNING(
-                        "elementwise == comparison failed and returning scalar "
-                        "instead; this will raise an error or perform "
-                        "elementwise comparison in the future.") < 0) {
-                    return NULL;
-                }
-                Py_INCREF(Py_False);
-                return Py_False;
-            }
-            else {
-                result = _void_compare(self, array_other, cmp_op);
-            }
+            result = _void_compare(self, array_other, cmp_op);
              Py_DECREF(array_other);
              return result;
          }
@@ -1421,8 +1422,6 @@ array_richcompare(PyArrayObject *self, PyObject *other, int cmp_op)
           */
  
          if (PyArray_TYPE(self) == NPY_VOID) {
-            int _res;
-
              array_other = (PyArrayObject *)PyArray_FROM_O(other);
              /*
               * If not successful, indicate that the items cannot be compared
@@ -1439,29 +1438,8 @@ array_richcompare(PyArrayObject *self, PyObject *other, int cmp_op)
                  return Py_NotImplemented;
              }
  
-            _res = PyArray_CheckCastSafety(
-                    NPY_EQUIV_CASTING,
-                    PyArray_DESCR(self), PyArray_DESCR(array_other), NULL);
-            if (_res < 0) {
-                PyErr_Clear();
-                _res = 0;
-            }
-            if (_res == 0) {
-                /* 2015-05-07, 1.10 */
-                Py_DECREF(array_other);
-                if (DEPRECATE_FUTUREWARNING(
-                        "elementwise != comparison failed and returning scalar "
-                        "instead; this will raise an error or perform "
-                        "elementwise comparison in the future.") < 0) {
-                    return NULL;
-                }
-                Py_INCREF(Py_True);
-                return Py_True;
-            }
-            else {
-                result = _void_compare(self, array_other, cmp_op);
-                Py_DECREF(array_other);
-            }
+            result = _void_compare(self, array_other, cmp_op);
+            Py_DECREF(array_other);
              return result;
          }
  
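The reworked `_void_compare` above pairs the fields of the two structured arrays by position and folds the per-field results together with `logical_and` for `==` and `logical_or` for `!=`. The following sketch shows that folding rule on a single record, assuming every field is a plain `int`. The names `cmp_op` and `record_compare` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { CMP_EQ, CMP_NE } cmp_op;

/*
 * Compare two records given as parallel arrays of `nfields` int fields.
 * Field i of `a` is always paired with field i of `b`.
 */
static bool
record_compare(const int *a, const int *b, size_t nfields, cmp_op op)
{
    /* "==" starts from true and ANDs; "!=" starts from false and ORs. */
    bool result = (op == CMP_EQ);
    for (size_t i = 0; i < nfields; i++) {
        if (op == CMP_EQ) {
            result = result && (a[i] == b[i]);
        }
        else {
            result = result || (a[i] != b[i]);
        }
    }
    return result;
}
```

With this rule two records are equal only if every field matches, and unequal as soon as any single field differs.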
diff --git a/numpy/core/src/multiarray/arraytypes.c.src b/numpy/core/src/multiarray/arraytypes.c.src
index 71401c60e8d05b5c7fecf2d78a73c7d232ef146e..ee4f5f312bf04e2626f2ec094f17210ea03c126f 100644
--- a/numpy/core/src/multiarray/arraytypes.c.src
+++ b/numpy/core/src/multiarray/arraytypes.c.src
@@ -27,12 +27,6 @@
  #include "arrayobject.h"
  #include "alloc.h"
  #include "typeinfo.h"
-#if defined(__ARM_NEON__) || defined (__ARM_NEON)
-#include <arm_neon.h>
-#endif
-#ifdef NPY_HAVE_SSE2_INTRINSICS
-#include <emmintrin.h>
-#endif
  
  #include "npy_longdouble.h"
  #include "numpyos.h"
@@ -42,7 +36,7 @@
  #include "npy_cblas.h"
  #include "npy_buffer.h"
  
-
+#include "arraytypes.h"
  /*
   * Define a stack allocated dummy array with only the minimum information set:
   *   1. The descr, the main field interesting here.
@@ -97,7 +91,7 @@ MyPyFloat_AsDouble(PyObject *obj)
      if (num == NULL) {
          return NPY_NAN;
      }
-    ret = PyFloat_AsDouble(num);
+    ret = PyFloat_AS_DOUBLE(num);
      Py_DECREF(num);
      return ret;
  }
@@ -3176,77 +3170,21 @@ finish:
   **                                 ARGFUNC                                 **
   *****************************************************************************
   */
-#if defined(__ARM_NEON__) || defined (__ARM_NEON)
-    int32_t _mm_movemask_epi8_neon(uint8x16_t input)
-    {
-        int8x8_t m0 = vcreate_s8(0x0706050403020100ULL);
-        uint8x16_t v0 = vshlq_u8(vshrq_n_u8(input, 7), vcombine_s8(m0, m0));
-        uint64x2_t v1 = vpaddlq_u32(vpaddlq_u16(vpaddlq_u8(v0)));
-        return (int)vgetq_lane_u64(v1, 0) + ((int)vgetq_lane_u64(v1, 1) << 8);
-    }
-#endif
-#define _LESS_THAN_OR_EQUAL(a,b) ((a) <= (b))
  
-static int
-BOOL_argmax(npy_bool *ip, npy_intp n, npy_intp *max_ind,
-            PyArrayObject *NPY_UNUSED(aip))
-
-{
-    npy_intp i = 0;
-    /* memcmp like logical_and on i386 is maybe slower for small arrays */
-#ifdef NPY_HAVE_SSE2_INTRINSICS
-    const __m128i zero = _mm_setzero_si128();
-    for (; i < n - (n % 32); i+=32) {
-        __m128i d1 = _mm_loadu_si128((__m128i*)&ip[i]);
-        __m128i d2 = _mm_loadu_si128((__m128i*)&ip[i + 16]);
-        d1 = _mm_cmpeq_epi8(d1, zero);
-        d2 = _mm_cmpeq_epi8(d2, zero);
-        if (_mm_movemask_epi8(_mm_min_epu8(d1, d2)) != 0xFFFF) {
-            break;
-        }
-    }
-#else
-    #if defined(__ARM_NEON__) || defined (__ARM_NEON)
-        uint8x16_t zero = vdupq_n_u8(0);
-        for(; i < n - (n % 32); i+=32) {
-            uint8x16_t d1 = vld1q_u8((uint8_t *)&ip[i]);
-            uint8x16_t d2 = vld1q_u8((uint8_t *)&ip[i + 16]);
-            d1 = vceqq_u8(d1, zero);
-            d2 = vceqq_u8(d2, zero);
-            if(_mm_movemask_epi8_neon(vminq_u8(d1, d2)) != 0xFFFF) {
-                break;
-            }
-        }
-    #endif
-#endif
-    for (; i < n; i++) {
-        if (ip[i]) {
-            *max_ind = i;
-            return 0;
-        }
-    }
-    *max_ind = 0;
-    return 0;
-}
+#define _LESS_THAN_OR_EQUAL(a,b) ((a) <= (b))
  
  /**begin repeat
   *
- * #fname = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
- *          LONG, ULONG, LONGLONG, ULONGLONG,
- *          HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *          CFLOAT, CDOUBLE, CLONGDOUBLE,
+ * #fname = HALF, CFLOAT, CDOUBLE, CLONGDOUBLE,
   *          DATETIME, TIMEDELTA#
- * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
- *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_half, npy_float, npy_double, npy_longdouble,
- *         npy_float, npy_double, npy_longdouble,
+ * #type = npy_half, npy_float, npy_double, npy_longdouble,
   *         npy_datetime, npy_timedelta#
- * #isfloat = 0*10, 1*7, 0*2#
- * #isnan = nop*10, npy_half_isnan, npy_isnan*6, nop*2#
- * #le = _LESS_THAN_OR_EQUAL*10, npy_half_le, _LESS_THAN_OR_EQUAL*8#
- * #iscomplex = 0*14, 1*3, 0*2#
- * #incr = ip++*14, ip+=2*3, ip++*2#
- * #isdatetime = 0*17, 1*2#
+ * #isfloat = 1*4, 0*2#
+ * #isnan = npy_half_isnan, npy_isnan*3, nop*2#
+ * #le = npy_half_le, _LESS_THAN_OR_EQUAL*5#
+ * #iscomplex = 0, 1*3, 0*2#
+ * #incr = ip++, ip+=2*3, ip++*2#
+ * #isdatetime = 0*4, 1*2#
   */
  static int
  @fname@_argmax(@type@ *ip, npy_intp n, npy_intp *max_ind,
@@ -3337,22 +3275,16 @@ BOOL_argmin(npy_bool *ip, npy_intp n, npy_intp *min_ind,
  
  /**begin repeat
   *
- * #fname = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
- *          LONG, ULONG, LONGLONG, ULONGLONG,
- *          HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *          CFLOAT, CDOUBLE, CLONGDOUBLE,
+ * #fname = HALF, CFLOAT, CDOUBLE, CLONGDOUBLE,
   *          DATETIME, TIMEDELTA#
- * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
- *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_half, npy_float, npy_double, npy_longdouble,
- *         npy_float, npy_double, npy_longdouble,
+ * #type = npy_half, npy_float, npy_double, npy_longdouble,
   *         npy_datetime, npy_timedelta#
- * #isfloat = 0*10, 1*7, 0*2#
- * #isnan = nop*10, npy_half_isnan, npy_isnan*6, nop*2#
- * #le = _LESS_THAN_OR_EQUAL*10, npy_half_le, _LESS_THAN_OR_EQUAL*8#
- * #iscomplex = 0*14, 1*3, 0*2#
- * #incr = ip++*14, ip+=2*3, ip++*2#
- * #isdatetime = 0*17, 1*2#
+ * #isfloat = 1*4, 0*2#
+ * #isnan = npy_half_isnan, npy_isnan*3, nop*2#
+ * #le = npy_half_le, _LESS_THAN_OR_EQUAL*5#
+ * #iscomplex = 0, 1*3, 0*2#
+ * #incr = ip++, ip+=2*3, ip++*2#
+ * #isdatetime = 0*4, 1*2#
   */
  static int
  @fname@_argmin(@type@ *ip, npy_intp n, npy_intp *min_ind,
@@ -3409,7 +3341,7 @@ static int
              *min_ind = i;
              break;
          }
-#endif 
+#endif
          if (!@le@(mp, *ip)) {  /* negated, for correct nan handling */
              mp = *ip;
              *min_ind = i;
@@ -4494,6 +4426,27 @@ set_typeinfo(PyObject *dict)
      PyArray_Descr *dtype;
      PyObject *cobj, *key;
  
+    // SIMD runtime dispatching
+    #ifndef NPY_DISABLE_OPTIMIZATION
+        #include "argfunc.dispatch.h"
+    #endif
+    /**begin repeat
+     * #FROM = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
+     *         LONG, ULONG, LONGLONG, ULONGLONG,
+     *         FLOAT, DOUBLE, LONGDOUBLE#
+     *
+     * #NAME = Byte, UByte, Short, UShort, Int, UInt,
+     *         Long, ULong, LongLong, ULongLong,
+     *         Float, Double, LongDouble#
+     */
+    /**begin repeat1
+     * #func = argmax, argmin#
+     */
+    NPY_CPU_DISPATCH_CALL_XB(_Py@NAME@_ArrFuncs.@func@ = (PyArray_ArgFunc*)@FROM@_@func@);
+    /**end repeat1**/
+    /**end repeat**/
+    NPY_CPU_DISPATCH_CALL_XB(_PyBool_ArrFuncs.argmax = (PyArray_ArgFunc*)BOOL_argmax);
+
      /*
       * Override the base class for all types, eventually all of this logic
       * should be defined on the class and inherited to the scalar.
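The templated `@fname@_argmin` kept in this file uses a negated `<=` test so that a NaN, once seen, is selected and ends the scan. Below is a scalar sketch of that rule for `double` input, assuming `n >= 1`. The names `is_nan` and `double_argmin` are hypothetical, and this is not the SIMD-dispatched implementation that the patch registers in `set_typeinfo`.

```c
#include <stddef.h>

/* NaN is the only value that compares unequal to itself. */
static int
is_nan(double x)
{
    return x != x;
}

/* Index of the first minimum; the first NaN wins if one is present. */
static int
double_argmin(const double *ip, ptrdiff_t n, ptrdiff_t *min_ind)
{
    double mp = ip[0];
    *min_ind = 0;
    if (is_nan(mp)) {
        /* NaN encountered; it counts as the minimum. */
        return 0;
    }
    for (ptrdiff_t i = 1; i < n; i++) {
        /* Negated, so that a NaN in ip[i] also takes this branch. */
        if (!(mp <= ip[i])) {
            mp = ip[i];
            *min_ind = i;
            if (is_nan(mp)) {
                break;
            }
        }
    }
    return 0;
}
```

Writing the test as `!(mp <= ip[i])` rather than `ip[i] < mp` matters because every ordered comparison with NaN is false, so only the negated form lets a NaN replace the running minimum.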
diff --git a/numpy/core/src/multiarray/arraytypes.h b/numpy/core/src/multiarray/arraytypes.h
deleted file mode 100644
index b3a13b2..0000000
--- a/numpy/core/src/multiarray/arraytypes.h
+++ /dev/null
@@ -1,31 +0,0 @@
-#ifndef NUMPY_CORE_SRC_MULTIARRAY_ARRAYTYPES_H_
-#define NUMPY_CORE_SRC_MULTIARRAY_ARRAYTYPES_H_
-
-#include "common.h"
-
-NPY_NO_EXPORT int
-set_typeinfo(PyObject *dict);
-
-/* needed for blasfuncs */
-NPY_NO_EXPORT void
-FLOAT_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
-
-NPY_NO_EXPORT void
-CFLOAT_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
-
-NPY_NO_EXPORT void
-DOUBLE_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
-
-NPY_NO_EXPORT void
-CDOUBLE_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
-
-
-/* for _pyarray_correlate */
-NPY_NO_EXPORT int
-small_correlate(const char * d_, npy_intp dstride,
-                npy_intp nd, enum NPY_TYPES dtype,
-                const char * k_, npy_intp kstride,
-                npy_intp nk, enum NPY_TYPES ktype,
-                char * out_, npy_intp ostride);
-
-#endif  /* NUMPY_CORE_SRC_MULTIARRAY_ARRAYTYPES_H_ */
diff --git a/numpy/core/src/multiarray/arraytypes.h.src b/numpy/core/src/multiarray/arraytypes.h.src
new file mode 100644
index 0000000..4c74871
--- /dev/null
+++ b/numpy/core/src/multiarray/arraytypes.h.src
@@ -0,0 +1,52 @@
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_ARRAYTYPES_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_ARRAYTYPES_H_
+
+#include "common.h"
+
+NPY_NO_EXPORT int
+set_typeinfo(PyObject *dict);
+
+/* needed for blasfuncs */
+NPY_NO_EXPORT void
+FLOAT_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
+
+NPY_NO_EXPORT void
+CFLOAT_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
+
+NPY_NO_EXPORT void
+DOUBLE_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
+
+NPY_NO_EXPORT void
+CDOUBLE_dot(char *, npy_intp, char *, npy_intp, char *, npy_intp, void *);
+
+
+/* for _pyarray_correlate */
+NPY_NO_EXPORT int
+small_correlate(const char * d_, npy_intp dstride,
+                npy_intp nd, enum NPY_TYPES dtype,
+                const char * k_, npy_intp kstride,
+                npy_intp nk, enum NPY_TYPES ktype,
+                char * out_, npy_intp ostride);
+
+#ifndef NPY_DISABLE_OPTIMIZATION
+    #include "argfunc.dispatch.h"
+#endif
+/**begin repeat
+ * #TYPE = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
+ *         LONG, ULONG, LONGLONG, ULONGLONG,
+ *         FLOAT, DOUBLE, LONGDOUBLE#
+ * #type = byte, ubyte, short, ushort, int, uint,
+ *         long, ulong, longlong, ulonglong,
+ *         float, double, longdouble#
+ */
+/**begin repeat1
+ * #func = argmax, argmin#
+ */
+NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT int @TYPE@_@func@,
+    (npy_@type@ *ip, npy_intp n, npy_intp *max_ind, PyArrayObject *aip))
+/**end repeat1**/
+/**end repeat**/
+NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT int BOOL_argmax,
+    (npy_bool *ip, npy_intp n, npy_intp *max_ind, PyArrayObject *aip))
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_ARRAYTYPES_H_ */
diff --git a/numpy/core/src/multiarray/buffer.c b/numpy/core/src/multiarray/buffer.c

index 13d7038d344837c6012d8454cf73c0c8d86f2742..0307d41a8b31e66870accf851f068c97aa4a660a 100644 (file)
--- a/numpy/core/src/multiarray/buffer.c
+++ b/numpy/core/src/multiarray/buffer.c
@@ -498,14 +498,11 @@ _buffer_info_new(PyObject *obj, int flags)
              assert((size_t)info->shape % sizeof(npy_intp) == 0);
              info->strides = info->shape + PyArray_NDIM(arr);
  
-#if NPY_RELAXED_STRIDES_CHECKING
              /*
-             * When NPY_RELAXED_STRIDES_CHECKING is used, some buffer users
-             * may expect a contiguous buffer to have well formatted strides
-             * also when a dimension is 1, but we do not guarantee this
-             * internally. Thus, recalculate strides for contiguous arrays.
-             * (This is unnecessary, but has no effect in the case where
-             * NPY_RELAXED_STRIDES CHECKING is disabled.)
+             * Some buffer users may expect a contiguous buffer to have well
+             * formatted strides also when a dimension is 1, but we do not
+             * guarantee this internally. Thus, recalculate strides for
+             * contiguous arrays.
               */
              int f_contiguous = (flags & PyBUF_F_CONTIGUOUS) == PyBUF_F_CONTIGUOUS;
              if (PyArray_IS_C_CONTIGUOUS(arr) && !(
@@ -526,11 +523,6 @@ _buffer_info_new(PyObject *obj, int flags)
                  }
              }
              else {
-#else  /* NPY_RELAXED_STRIDES_CHECKING */
-            /* We can always use the arrays strides directly */
-            {
-#endif
-
                  for (k = 0; k < PyArray_NDIM(arr); ++k) {
                      info->shape[k] = PyArray_DIMS(arr)[k];
                      info->strides[k] = PyArray_STRIDES(arr)[k];
@@ -708,8 +700,8 @@ _buffer_get_info(void **buffer_info_cache_ptr, PyObject *obj, int flags)
           if (info->ndim > 1 && next_info != NULL) {
               /*
                * Some arrays are C- and F-contiguous and if they have more
-              * than one dimension, the buffer-info may differ between
-              * the two due to RELAXED_STRIDES_CHECKING.
+              * than one dimension, the buffer-info may differ between the
+              * two because strides for length-1 dimensions may be adjusted.
                * If we export both buffers, the first stored one may be
                * the one for the other contiguity, so check both.
                * This is generally very unlikely in all other cases, since
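Note on the buffer.c hunks above: the updated comment in `_buffer_info_new` says strides are recalculated for contiguous arrays because a length-1 dimension may carry an arbitrary stride internally. The following is a minimal sketch of that recalculation for the C-contiguous case, written in plain C outside NumPy. The helper name `fill_c_contiguous_strides` and its parameters are hypothetical and are not part of the NumPy API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not NumPy API): fill `strides` with the canonical
 * C-order byte strides for `shape`.  The last axis gets `itemsize`, and each
 * earlier axis gets the product of the item size and all later extents, so a
 * length-1 dimension also receives a well-formed stride.
 */
static void
fill_c_contiguous_strides(const ptrdiff_t *shape, int ndim,
                          ptrdiff_t itemsize, ptrdiff_t *strides)
{
    assert(ndim >= 0 && itemsize >= 0);
    ptrdiff_t stride = itemsize;
    for (int k = ndim - 1; k >= 0; --k) {
        strides[k] = stride;
        stride *= shape[k];
    }
}
```

For a shape of (2, 1, 3) with 8-byte items this yields strides of (24, 24, 8), whatever stride the length-1 axis had before.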
diff --git a/numpy/core/src/multiarray/calculation.c b/numpy/core/src/multiarray/calculation.c

index 327f685d4ffcb8b7cdd9eb9bac800acebaeff382..a985a2308bfb58b47e11abd6e48a4a3cfa99391a 100644 (file)
--- a/numpy/core/src/multiarray/calculation.c
+++ b/numpy/core/src/multiarray/calculation.c
@@ -175,7 +175,7 @@ _PyArray_ArgMinMaxCommon(PyArrayObject *op,
      NPY_END_THREADS_DESCR(PyArray_DESCR(ap));
  
      Py_DECREF(ap);
-    /* Trigger the UPDATEIFCOPY/WRITEBACKIFCOPY if necessary */
+    /* Trigger the WRITEBACKIFCOPY if necessary */
      if (out != NULL && out != rp) {
          PyArray_ResolveWritebackIfCopy(rp);
          Py_DECREF(rp);
diff --git a/numpy/core/src/multiarray/can_cast_table.h b/numpy/core/src/multiarray/can_cast_table.h

new file mode 100644 (file)

index 0000000..bd9c4c4
--- /dev/null
+++ b/numpy/core/src/multiarray/can_cast_table.h
@@ -0,0 +1,124 @@
+/*
+ * This file defines a compile time constant casting table for use in
+ * a few situations:
+ * 1. As a fast-path in can-cast (untested how much it helps).
+ * 2. To define the actual cast safety stored on the CastingImpl/ArrayMethod
+ * 3. For scalar math, since it also needs cast safety information.
+ *
+ * It is useful to have this constant to allow writing compile time generic
+ * code based on cast safety in the scalar math code.
+ */
+
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_CAN_CAST_TABLE_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_CAN_CAST_TABLE_H_
+
+#include "numpy/ndarraytypes.h"
+
+
+/* The from type fits into to (it has a smaller or equal number of bits) */
+#define FITS(FROM, TO) (NPY_SIZEOF_##FROM <= NPY_SIZEOF_##TO)
+/* Unsigned "from" fits a signed integer if it is truly smaller */
+#define UFITS(FROM, TO) (NPY_SIZEOF_##FROM < NPY_SIZEOF_##TO)
+/* Integer "from" fits a float if strictly smaller, or same size and >= double */
+#define IFITS(FROM, TO) (  \
+    NPY_SIZEOF_##FROM < NPY_SIZEOF_##TO || (  \
+            NPY_SIZEOF_##FROM == NPY_SIZEOF_##TO  \
+            && NPY_SIZEOF_##FROM >= NPY_SIZEOF_DOUBLE))
+
+/*
+ * NOTE: The order is bool, integers in (signed, unsigned) pairs, float, cfloat,
+ *       then 6 fixed ones (object, string, unicode, void, datetime, timedelta),
+ *       and finally half.
+ *       Note that in the future we may only need the numeric casts here, but
+ *       currently it fills in the others as well.
+ */
+#define CASTS_SAFELY_FROM_UINT(FROM)  \
+    {0,  \
+     UFITS(FROM, BYTE), FITS(FROM, BYTE), UFITS(FROM, SHORT), FITS(FROM, SHORT),  \
+     UFITS(FROM, INT), FITS(FROM, INT), UFITS(FROM, LONG), FITS(FROM, LONG),  \
+     UFITS(FROM, LONGLONG), FITS(FROM, LONGLONG),  \
+     IFITS(FROM, FLOAT), IFITS(FROM, DOUBLE), IFITS(FROM, LONGDOUBLE),  \
+     IFITS(FROM, FLOAT), IFITS(FROM, DOUBLE), IFITS(FROM, LONGDOUBLE),  \
+     1, 1, 1, 1, 0, NPY_SIZEOF_##FROM < NPY_SIZEOF_TIMEDELTA, IFITS(FROM, HALF)}
+
+#define CASTS_SAFELY_FROM_INT(FROM)  \
+    {0,  \
+     FITS(FROM, BYTE), 0, FITS(FROM, SHORT), 0,  \
+     FITS(FROM, INT), 0, FITS(FROM, LONG), 0,  \
+     FITS(FROM, LONGLONG), 0,  \
+     IFITS(FROM, FLOAT), IFITS(FROM, DOUBLE), IFITS(FROM, LONGDOUBLE),  \
+     IFITS(FROM, FLOAT), IFITS(FROM, DOUBLE), IFITS(FROM, LONGDOUBLE),  \
+     1, 1, 1, 1, 0, NPY_SIZEOF_##FROM <= NPY_SIZEOF_TIMEDELTA, IFITS(FROM, HALF)}
+
+/* Floats are similar to ints, but cap at double */
+#define CASTS_SAFELY_FROM_FLOAT(FROM)  \
+    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  \
+     FITS(FROM, FLOAT), FITS(FROM, DOUBLE), FITS(FROM, LONGDOUBLE),  \
+     FITS(FROM, FLOAT), FITS(FROM, DOUBLE), FITS(FROM, LONGDOUBLE),  \
+     1, 1, 1, 1, 0, 0, FITS(FROM, HALF)}
+
+#define CASTS_SAFELY_FROM_CFLOAT(FROM)  \
+    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  \
+     0, 0, 0,  \
+     FITS(FROM, FLOAT), FITS(FROM, DOUBLE), FITS(FROM, LONGDOUBLE),  \
+     1, 1, 1, 1, 0, 0, 0}
+
+static const npy_bool _npy_can_cast_safely_table[NPY_NTYPES][NPY_NTYPES] = {
+        /* Bool safely casts to anything except datetime (has no zero) */
+        {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+         1, 1, 1, 1, 1, 1,
+         1, 1, 1, 1, 0, 1, 1},
+        /* Integers in pairs of signed, unsigned */
+        CASTS_SAFELY_FROM_INT(BYTE), CASTS_SAFELY_FROM_UINT(BYTE),
+        CASTS_SAFELY_FROM_INT(SHORT), CASTS_SAFELY_FROM_UINT(SHORT),
+        CASTS_SAFELY_FROM_INT(INT), CASTS_SAFELY_FROM_UINT(INT),
+        CASTS_SAFELY_FROM_INT(LONG), CASTS_SAFELY_FROM_UINT(LONG),
+        CASTS_SAFELY_FROM_INT(LONGLONG), CASTS_SAFELY_FROM_UINT(LONGLONG),
+        /* Floats and complex */
+        CASTS_SAFELY_FROM_FLOAT(FLOAT),
+        CASTS_SAFELY_FROM_FLOAT(DOUBLE),
+        CASTS_SAFELY_FROM_FLOAT(LONGDOUBLE),
+        CASTS_SAFELY_FROM_CFLOAT(FLOAT),
+        CASTS_SAFELY_FROM_CFLOAT(DOUBLE),
+        CASTS_SAFELY_FROM_CFLOAT(LONGDOUBLE),
+        /*
+         * Following the main numeric types are:
+         * object, string, unicode, void, datetime, timedelta (and half)
+         */
+        /* object casts safely only to itself */
+        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  /* bool + ints */
+         0, 0, 0, 0, 0, 0,  /* floats (without half) */
+         1, 0, 0, 0, 0, 0, 0},
+        /* String casts safely to object, unicode and void */
+        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  /* bool + ints */
+         0, 0, 0, 0, 0, 0,  /* floats (without half) */
+         1, 1, 1, 1, 0, 0, 0},
+        /* Unicode casts safely to object and void */
+        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  /* bool + ints */
+         0, 0, 0, 0, 0, 0,  /* floats (without half) */
+         1, 0, 1, 1, 0, 0, 0},
+        /* Void casts safely to object */
+        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  /* bool + ints */
+         0, 0, 0, 0, 0, 0,  /* floats (without half) */
+         1, 0, 0, 1, 0, 0, 0},
+        /* datetime casts safely to object, string, unicode, void */
+        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  /* bool + ints */
+         0, 0, 0, 0, 0, 0,  /* floats (without half) */
+         1, 1, 1, 1, 1, 0, 0},
+        /* timedelta casts safely to object, string, unicode, void */
+        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,  /* bool + ints */
+         0, 0, 0, 0, 0, 0,  /* floats (without half) */
+         1, 1, 1, 1, 0, 1, 0},
+        /* half */
+        CASTS_SAFELY_FROM_FLOAT(HALF),
+};
+
+#undef FITS
+#undef UFITS
+#undef IFITS
+#undef CASTS_SAFELY_FROM_UINT
+#undef CASTS_SAFELY_FROM_INT
+#undef CASTS_SAFELY_FROM_FLOAT
+#undef CASTS_SAFELY_FROM_CFLOAT
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_CAN_CAST_TABLE_H_ */
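Note on can_cast_table.h above: the table entries are built from the three size-comparison macros `FITS`, `UFITS` and `IFITS`. The sketch below shows how those macros evaluate at compile time. It assumes stand-in sizes for a typical LP64 platform; the `SKETCH_SIZEOF_*` constants and the enum names are hypothetical and replace the real `NPY_SIZEOF_*` values, which NumPy determines at build time.

```c
#include <assert.h>

/* Assumed sizes in bytes (illustration only, not NumPy's build-time values). */
#define SKETCH_SIZEOF_BYTE 1
#define SKETCH_SIZEOF_SHORT 2
#define SKETCH_SIZEOF_INT 4
#define SKETCH_SIZEOF_LONGLONG 8
#define SKETCH_SIZEOF_FLOAT 4
#define SKETCH_SIZEOF_DOUBLE 8

/* Same shape as the macros in the header, but using the stand-in sizes. */
#define FITS(FROM, TO) (SKETCH_SIZEOF_##FROM <= SKETCH_SIZEOF_##TO)
#define UFITS(FROM, TO) (SKETCH_SIZEOF_##FROM < SKETCH_SIZEOF_##TO)
#define IFITS(FROM, TO) (  \
    SKETCH_SIZEOF_##FROM < SKETCH_SIZEOF_##TO || (  \
            SKETCH_SIZEOF_##FROM == SKETCH_SIZEOF_##TO  \
            && SKETCH_SIZEOF_##FROM >= SKETCH_SIZEOF_DOUBLE))

/* Every value is an integer constant expression, as in the real table. */
enum {
    UBYTE_TO_BYTE = UFITS(BYTE, BYTE),            /* unsigned needs a strictly larger signed type */
    UBYTE_TO_SHORT = UFITS(BYTE, SHORT),
    INT_TO_FLOAT = IFITS(INT, FLOAT),             /* same size, but smaller than double */
    INT_TO_DOUBLE = IFITS(INT, DOUBLE),
    LONGLONG_TO_DOUBLE = IFITS(LONGLONG, DOUBLE), /* same size and at least double-sized */
    FLOAT_TO_DOUBLE = FITS(FLOAT, DOUBLE),
    DOUBLE_TO_FLOAT = FITS(DOUBLE, FLOAT)
};

/* The equal-size rule is what marks a 64-bit integer to double cast as safe. */
static_assert(LONGLONG_TO_DOUBLE == 1, "equal-size rule applies at double width");
```

Because the results are plain constants, code such as the scalar math implementation can branch on cast safety at compile time instead of looking it up at run time.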
diff --git a/numpy/core/src/multiarray/common.c b/numpy/core/src/multiarray/common.c

index aa95d285a8ca93a7d09d77435725539ebdaad3a4..aa612146ce5ccc61383a6b069155d611db56113a 100644 (file)
--- a/numpy/core/src/multiarray/common.c
+++ b/numpy/core/src/multiarray/common.c
@@ -108,8 +108,8 @@ PyArray_DTypeFromObjectStringDiscovery(
  
  /*
   * This function is now identical to the new PyArray_DiscoverDTypeAndShape
- * but only returns the the dtype. It should in most cases be slowly phased
- * out. (Which may need some refactoring to PyArray_FromAny to make it simpler)
+ * but only returns the dtype. It should in most cases be slowly phased out.
+ * (Which may need some refactoring to PyArray_FromAny to make it simpler)
   */
  NPY_NO_EXPORT int
  PyArray_DTypeFromObject(PyObject *obj, int maxdims, PyArray_Descr **out_dtype)
@@ -127,23 +127,6 @@ PyArray_DTypeFromObject(PyObject *obj, int maxdims, PyArray_Descr **out_dtype)
      return 0;
  }
  
-NPY_NO_EXPORT char *
-index2ptr(PyArrayObject *mp, npy_intp i)
-{
-    npy_intp dim0;
-
-    if (PyArray_NDIM(mp) == 0) {
-        PyErr_SetString(PyExc_IndexError, "0-d arrays can't be indexed");
-        return NULL;
-    }
-    dim0 = PyArray_DIMS(mp)[0];
-    if (check_and_adjust_index(&i, dim0, 0, NULL) < 0)
-        return NULL;
-    if (i == 0) {
-        return PyArray_DATA(mp);
-    }
-    return PyArray_BYTES(mp)+i*PyArray_STRIDES(mp)[0];
-}
  
  NPY_NO_EXPORT int
  _zerofill(PyArrayObject *ret)
diff --git a/numpy/core/src/multiarray/common.h b/numpy/core/src/multiarray/common.h

index ed022e4f8370290a652d3d030469818766c3b76d..a6c117745b2fae8163d1ca64912664cb5383b5bb 100644 (file)
--- a/numpy/core/src/multiarray/common.h
+++ b/numpy/core/src/multiarray/common.h
@@ -43,9 +43,6 @@ NPY_NO_EXPORT int
  PyArray_DTypeFromObject(PyObject *obj, int maxdims,
                          PyArray_Descr **out_dtype);
  
-NPY_NO_EXPORT int
-PyArray_DTypeFromObjectHelper(PyObject *obj, int maxdims,
-                              PyArray_Descr **out_dtype, int string_status);
  
  /*
   * Returns NULL without setting an exception if no scalar is matched, a
@@ -54,12 +51,6 @@ PyArray_DTypeFromObjectHelper(PyObject *obj, int maxdims,
  NPY_NO_EXPORT PyArray_Descr *
  _array_find_python_scalar_type(PyObject *op);
  
-NPY_NO_EXPORT PyArray_Descr *
-_array_typedescr_fromstr(char const *str);
-
-NPY_NO_EXPORT char *
-index2ptr(PyArrayObject *mp, npy_intp i);
-
  NPY_NO_EXPORT int
  _zerofill(PyArrayObject *ret);
  
@@ -301,35 +292,6 @@ npy_memchr(char * haystack, char needle,
      return p;
  }
  
-/*
- * Helper to work around issues with the allocation strategy currently
- * allocating not 1 byte for empty arrays, but enough for an array where
- * all 0 dimensions are replaced with size 1 (if the itemsize is not 0).
- *
- * This means that we can fill in nice (nonzero) strides and still handle
- * slicing direct math without being in danger of leaving the allocated byte
- * bounds.
- * In practice, that probably does not matter, but in principle this would be
- * undefined behaviour in C.  Another solution may be to force the strides
- * to 0 in these cases.  See also gh-15788.
- *
- * Unlike the code in `PyArray_NewFromDescr` does no overflow checks.
- */
-static NPY_INLINE npy_intp
-PyArray_NBYTES_ALLOCATED(PyArrayObject *arr)
-{
-    if (PyArray_ITEMSIZE(arr) == 0) {
-        return 1;
-    }
-    npy_intp nbytes = PyArray_ITEMSIZE(arr);
-    for (int i = 0; i < PyArray_NDIM(arr); i++) {
-        if (PyArray_DIMS(arr)[i] != 0) {
-            nbytes *= PyArray_DIMS(arr)[i];
-        }
-    }
-    return nbytes;
-}
-
  
  /*
   * Simple helper to create a tuple from an array of items. The `make_null_none`
diff --git a/numpy/core/src/multiarray/common_dtype.c b/numpy/core/src/multiarray/common_dtype.c

index ca80b1ed7002830c9efa511333d98c8a536d5e66..3561a905aa7a6fc5cf194dec1cee0094c02f4b3d 100644 (file)
--- a/numpy/core/src/multiarray/common_dtype.c
+++ b/numpy/core/src/multiarray/common_dtype.c
@@ -41,7 +41,7 @@
   * @param dtype2 Second DType class.
   * @return The common DType or NULL with an error set
   */
-NPY_NO_EXPORT NPY_INLINE PyArray_DTypeMeta *
+NPY_NO_EXPORT PyArray_DTypeMeta *
  PyArray_CommonDType(PyArray_DTypeMeta *dtype1, PyArray_DTypeMeta *dtype2)
  {
      if (dtype1 == dtype2) {
diff --git a/numpy/core/src/multiarray/conversion_utils.c b/numpy/core/src/multiarray/conversion_utils.c

index a1de580d953725df199a40a4e3f3185d717bd1a9..96afb6c00ce25175e94b7ab2047d14e8de2fe79d 100644 (file)
--- a/numpy/core/src/multiarray/conversion_utils.c
+++ b/numpy/core/src/multiarray/conversion_utils.c
@@ -78,6 +78,27 @@ PyArray_OutputConverter(PyObject *object, PyArrayObject **address)
      }
  }
  
+
+/*
+ * Convert the given value to an integer. Unlike `PyArray_PyIntAsIntp`, an
+ * OverflowError is replaced with a ValueError.  Exists mainly to retain the
+ * old behaviour of `PyArray_IntpConverter` and `PyArray_IntpFromSequence`.
+ */
+static NPY_INLINE npy_intp
+dimension_from_scalar(PyObject *ob)
+{
+    npy_intp value = PyArray_PyIntAsIntp(ob);
+
+    if (error_converting(value)) {
+        if (PyErr_ExceptionMatches(PyExc_OverflowError)) {
+            PyErr_SetString(PyExc_ValueError,
+                    "Maximum allowed dimension exceeded");
+        }
+        return -1;
+    }
+    return value;
+}
+
  /*NUMPY_API
   * Get intp chunk from sequence
   *
@@ -90,9 +111,6 @@ PyArray_OutputConverter(PyObject *object, PyArrayObject **address)
  NPY_NO_EXPORT int
  PyArray_IntpConverter(PyObject *obj, PyArray_Dims *seq)
  {
-    Py_ssize_t len;
-    int nd;
-
      seq->ptr = NULL;
      seq->len = 0;
  
@@ -110,42 +128,85 @@ PyArray_IntpConverter(PyObject *obj, PyArray_Dims *seq)
          return NPY_SUCCEED;
      }
  
-    len = PySequence_Size(obj);
-    if (len == -1) {
-        /* Check to see if it is an integer number */
-        if (PyNumber_Check(obj)) {
-            /*
-             * After the deprecation the PyNumber_Check could be replaced
-             * by PyIndex_Check.
-             * FIXME 1.9 ?
-             */
-            len = 1;
+    PyObject *seq_obj = NULL;
+
+    /*
+     * If obj is a scalar we skip all the useless computations and jump to
+     * dimension_from_scalar as soon as possible.
+     */
+    if (!PyLong_CheckExact(obj) && PySequence_Check(obj)) {
+        seq_obj = PySequence_Fast(obj,
+               "expected a sequence of integers or a single integer.");
+        if (seq_obj == NULL) {
+            /* continue attempting to parse as a single integer. */
+            PyErr_Clear();
          }
      }
-    if (len < 0) {
-        PyErr_SetString(PyExc_TypeError,
-                "expected sequence object with len >= 0 or a single integer");
-        return NPY_FAIL;
-    }
-    if (len > NPY_MAXDIMS) {
-        PyErr_Format(PyExc_ValueError, "maximum supported dimension for an ndarray is %d"
-                     ", found %d", NPY_MAXDIMS, len);
-        return NPY_FAIL;
-    }
-    if (len > 0) {
-        seq->ptr = npy_alloc_cache_dim(len);
+
+    if (seq_obj == NULL) {
+        /*
+         * obj *might* be a scalar (if dimension_from_scalar does not fail, at
+         * the moment no checks have been performed to verify this hypothesis).
+         */
+        seq->ptr = npy_alloc_cache_dim(1);
          if (seq->ptr == NULL) {
              PyErr_NoMemory();
              return NPY_FAIL;
          }
+        else {
+            seq->len = 1;
+
+            seq->ptr[0] = dimension_from_scalar(obj);
+            if (error_converting(seq->ptr[0])) {
+                /*
+                 * If the error is a type error (cannot convert the
+                 * value to an integer) communicate that we expected a sequence
+                 * or an integer from the user.
+                 */
+                if (PyErr_ExceptionMatches(PyExc_TypeError)) {
+                    PyErr_Format(PyExc_TypeError,
+                            "expected a sequence of integers or a single "
+                            "integer, got '%.100R'", obj);
+                }
+                npy_free_cache_dim_obj(*seq);
+                seq->ptr = NULL;
+                return NPY_FAIL;
+            }
+        }
      }
-    seq->len = len;
-    nd = PyArray_IntpFromIndexSequence(obj, (npy_intp *)seq->ptr, len);
-    if (nd == -1 || nd != len) {
-        npy_free_cache_dim_obj(*seq);
-        seq->ptr = NULL;
-        return NPY_FAIL;
+    else {
+        /*
+         * `obj` is a sequence converted to the `PySequence_Fast` in `seq_obj`
+         */
+        Py_ssize_t len = PySequence_Fast_GET_SIZE(seq_obj);
+        if (len > NPY_MAXDIMS) {
+            PyErr_Format(PyExc_ValueError,
+                    "maximum supported dimension for an ndarray "
+                    "is %d, found %d", NPY_MAXDIMS, len);
+            Py_DECREF(seq_obj);
+            return NPY_FAIL;
+        }
+        if (len > 0) {
+            seq->ptr = npy_alloc_cache_dim(len);
+            if (seq->ptr == NULL) {
+                PyErr_NoMemory();
+                Py_DECREF(seq_obj);
+                return NPY_FAIL;
+            }
+        }
+
+        seq->len = len;
+        int nd = PyArray_IntpFromIndexSequence(seq_obj,
+                (npy_intp *)seq->ptr, len);
+        Py_DECREF(seq_obj);
+
+        if (nd == -1 || nd != len) {
+            npy_free_cache_dim_obj(*seq);
+            seq->ptr = NULL;
+            return NPY_FAIL;
+        }
      }
+
      return NPY_SUCCEED;
  }
  
@@ -271,6 +332,15 @@ PyArray_AxisConverter(PyObject *obj, int *axis)
          if (error_converting(*axis)) {
              return NPY_FAIL;
          }
+        if (*axis == NPY_MAXDIMS){
+            /* NumPy 1.23, 2022-05-19 */
+            if (DEPRECATE("Using `axis=32` (MAXDIMS) is deprecated. "
+                          "32/MAXDIMS had the same meaning as `axis=None` which "
+                          "should be used instead.  "
+                          "(Deprecated NumPy 1.23)") < 0) {
+                return NPY_FAIL;
+            }
+        }
      }
      return NPY_SUCCEED;
  }
@@ -993,64 +1063,46 @@ PyArray_PyIntAsIntp(PyObject *o)
  }
  
  
-/*
- * PyArray_IntpFromIndexSequence
- * Returns the number of dimensions or -1 if an error occurred.
- * vals must be large enough to hold maxvals.
- * Opposed to PyArray_IntpFromSequence it uses and returns npy_intp
- * for the number of values.
+NPY_NO_EXPORT int
+PyArray_IntpFromPyIntConverter(PyObject *o, npy_intp *val)
+{
+    *val = PyArray_PyIntAsIntp(o);
+    if (error_converting(*val)) {
+        return NPY_FAIL;
+    }
+    return NPY_SUCCEED;
+}
+
+
+/**
+ * Reads values from a sequence of integers and stores them into an array.
+ *
+ * @param  seq      A sequence created using `PySequence_Fast`.
+ * @param  vals     Array used to store dimensions (must be large enough to
+ *                      hold `maxvals` values).
+ * @param  maxvals  Maximum number of dimensions that can be written into `vals`.
+ * @return          Number of dimensions or -1 if an error occurred.
+ *
+ * .. note::
+ *
+ *   Opposed to PyArray_IntpFromSequence it uses and returns `npy_intp`
+ *      for the number of values.
   */
  NPY_NO_EXPORT npy_intp
  PyArray_IntpFromIndexSequence(PyObject *seq, npy_intp *vals, npy_intp maxvals)
  {
-    Py_ssize_t nd;
-    npy_intp i;
-    PyObject *op, *err;
-
      /*
-     * Check to see if sequence is a single integer first.
-     * or, can be made into one
+     * `seq` must already be a `PySequence_Fast` result; convert at most
+     * `maxvals` of its items into `vals`.
       */
-    nd = PySequence_Length(seq);
-    if (nd == -1) {
-        if (PyErr_Occurred()) {
-            PyErr_Clear();
-        }
+    Py_ssize_t nd = PySequence_Fast_GET_SIZE(seq);
+    PyObject *op;
+    for (Py_ssize_t i = 0; i < PyArray_MIN(nd, maxvals); i++) {
+        op = PySequence_Fast_GET_ITEM(seq, i);
  
-        vals[0] = PyArray_PyIntAsIntp(seq);
-        if(vals[0] == -1) {
-            err = PyErr_Occurred();
-            if (err &&
-                    PyErr_GivenExceptionMatches(err, PyExc_OverflowError)) {
-                PyErr_SetString(PyExc_ValueError,
-                        "Maximum allowed dimension exceeded");
-            }
-            if(err != NULL) {
-                return -1;
-            }
-        }
-        nd = 1;
-    }
-    else {
-        for (i = 0; i < PyArray_MIN(nd,maxvals); i++) {
-            op = PySequence_GetItem(seq, i);
-            if (op == NULL) {
-                return -1;
-            }
-
-            vals[i] = PyArray_PyIntAsIntp(op);
-            Py_DECREF(op);
-            if(vals[i] == -1) {
-                err = PyErr_Occurred();
-                if (err &&
-                        PyErr_GivenExceptionMatches(err, PyExc_OverflowError)) {
-                    PyErr_SetString(PyExc_ValueError,
-                            "Maximum allowed dimension exceeded");
-                }
-                if(err != NULL) {
-                    return -1;
-                }
-            }
+        vals[i] = dimension_from_scalar(op);
+        if (error_converting(vals[i])) {
+            return -1;
          }
      }
      return nd;
@@ -1064,7 +1116,34 @@ PyArray_IntpFromIndexSequence(PyObject *seq, npy_intp *vals, npy_intp maxvals)
  NPY_NO_EXPORT int
  PyArray_IntpFromSequence(PyObject *seq, npy_intp *vals, int maxvals)
  {
-    return PyArray_IntpFromIndexSequence(seq, vals, (npy_intp)maxvals);
+    PyObject *seq_obj = NULL;
+    if (!PyLong_CheckExact(seq) && PySequence_Check(seq)) {
+        seq_obj = PySequence_Fast(seq,
+            "expected a sequence of integers or a single integer");
+        if (seq_obj == NULL) {
+            /* continue attempting to parse as a single integer. */
+            PyErr_Clear();
+        }
+    }
+
+    if (seq_obj == NULL) {
+        vals[0] = dimension_from_scalar(seq);
+        if (error_converting(vals[0])) {
+            if (PyErr_ExceptionMatches(PyExc_TypeError)) {
+                PyErr_Format(PyExc_TypeError,
+                        "expected a sequence of integers or a single "
+                        "integer, got '%.100R'", seq);
+            }
+            return -1;
+        }
+        return 1;
+    }
+    else {
+        int res;
+        res = PyArray_IntpFromIndexSequence(seq_obj, vals, (npy_intp)maxvals);
+        Py_DECREF(seq_obj);
+        return res;
+    }
  }
  
  
diff --git a/numpy/core/src/multiarray/conversion_utils.h b/numpy/core/src/multiarray/conversion_utils.h

index 4072841ee1c74aba898081bf7b1630846fe5313c..4d0fbb8941ba57559263a5eea2e97a2d57b3fb1e 100644 (file)
--- a/numpy/core/src/multiarray/conversion_utils.h
+++ b/numpy/core/src/multiarray/conversion_utils.h
@@ -6,6 +6,9 @@
  NPY_NO_EXPORT int
  PyArray_IntpConverter(PyObject *obj, PyArray_Dims *seq);
  
+NPY_NO_EXPORT int
+PyArray_IntpFromPyIntConverter(PyObject *o, npy_intp *val);
+
  NPY_NO_EXPORT int
  PyArray_OptionalIntpConverter(PyObject *obj, PyArray_Dims *seq);
  
diff --git a/numpy/core/src/multiarray/convert.c b/numpy/core/src/multiarray/convert.c

index 2f68db07c9886a47b3e90388871408c1d35e8bae..630253e38b648b8df3d7e9cbbe53fd4a6a2e42cb 100644 (file)
--- a/numpy/core/src/multiarray/convert.c
+++ b/numpy/core/src/multiarray/convert.c
@@ -544,6 +544,12 @@ PyArray_NewCopy(PyArrayObject *obj, NPY_ORDER order)
  {
      PyArrayObject *ret;
  
+    if (obj == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "obj is NULL in PyArray_NewCopy");
+        return NULL;
+    }
+
      ret = (PyArrayObject *)PyArray_NewLikeArray(obj, order, NULL, 1);
      if (ret == NULL) {
          return NULL;
diff --git a/numpy/core/src/multiarray/convert_datatype.c b/numpy/core/src/multiarray/convert_datatype.c

index 5d215a647f8a0787116eb3414b66527db6e079f0..139136fd3b55eb1f6fe4fa5de3e1cf088631df46 100644 (file)
--- a/numpy/core/src/multiarray/convert_datatype.c
+++ b/numpy/core/src/multiarray/convert_datatype.c
@@ -15,6 +15,7 @@
  #include "numpy/npy_math.h"
  
  #include "array_coercion.h"
+#include "can_cast_table.h"
  #include "common.h"
  #include "ctors.h"
  #include "dtypemeta.h"
@@ -220,14 +221,11 @@ PyArray_MinCastSafety(NPY_CASTING casting1, NPY_CASTING casting2)
      if (casting1 < 0 || casting2 < 0) {
          return -1;
      }
-    NPY_CASTING view = casting1 & casting2 & _NPY_CAST_IS_VIEW;
-    casting1 = casting1 & ~_NPY_CAST_IS_VIEW;
-    casting2 = casting2 & ~_NPY_CAST_IS_VIEW;
      /* larger casting values are less safe */
      if (casting1 > casting2) {
-        return casting1 | view;
+        return casting1;
      }
-    return casting2 | view;
+    return casting2;
  }
  
  
@@ -245,6 +243,12 @@ PyArray_CastToType(PyArrayObject *arr, PyArray_Descr *dtype, int is_f_order)
  {
      PyObject *out;
  
+    if (dtype == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "dtype is NULL in PyArray_CastToType");
+        return NULL;
+    }
+
      Py_SETREF(dtype, PyArray_AdaptDescriptorToArray(arr, (PyObject *)dtype));
      if (dtype == NULL) {
          return NULL;
@@ -360,29 +364,41 @@ PyArray_CastAnyTo(PyArrayObject *out, PyArrayObject *mp)
  
  static NPY_CASTING
  _get_cast_safety_from_castingimpl(PyArrayMethodObject *castingimpl,
-        PyArray_DTypeMeta *dtypes[2], PyArray_Descr *from, PyArray_Descr *to)
+        PyArray_DTypeMeta *dtypes[2], PyArray_Descr *from, PyArray_Descr *to,
+        npy_intp *view_offset)
  {
      PyArray_Descr *descrs[2] = {from, to};
      PyArray_Descr *out_descrs[2];
  
+    *view_offset = NPY_MIN_INTP;
      NPY_CASTING casting = castingimpl->resolve_descriptors(
-            castingimpl, dtypes, descrs, out_descrs);
+            castingimpl, dtypes, descrs, out_descrs, view_offset);
      if (casting < 0) {
          return -1;
      }
      /* The returned descriptors may not match, requiring a second check */
      if (out_descrs[0] != descrs[0]) {
-        NPY_CASTING from_casting = PyArray_GetCastSafety(
-                descrs[0], out_descrs[0], NULL);
+        npy_intp from_offset = NPY_MIN_INTP;
+        NPY_CASTING from_casting = PyArray_GetCastInfo(
+                descrs[0], out_descrs[0], NULL, &from_offset);
          casting = PyArray_MinCastSafety(casting, from_casting);
+        if (from_offset != *view_offset) {
+            /* `view_offset` differs: The multi-step cast cannot be a view. */
+            *view_offset = NPY_MIN_INTP;
+        }
          if (casting < 0) {
              goto finish;
          }
      }
      if (descrs[1] != NULL && out_descrs[1] != descrs[1]) {
-        NPY_CASTING from_casting = PyArray_GetCastSafety(
-                descrs[1], out_descrs[1], NULL);
+        npy_intp from_offset = NPY_MIN_INTP;
+        NPY_CASTING from_casting = PyArray_GetCastInfo(
+                descrs[1], out_descrs[1], NULL, &from_offset);
          casting = PyArray_MinCastSafety(casting, from_casting);
+        if (from_offset != *view_offset) {
+            /* `view_offset` differs: The multi-step cast cannot be a view. */
+            *view_offset = NPY_MIN_INTP;
+        }
          if (casting < 0) {
              goto finish;
          }
@@ -393,15 +409,21 @@ _get_cast_safety_from_castingimpl(PyArrayMethodObject *castingimpl,
      Py_DECREF(out_descrs[1]);
      /*
       * Check for less harmful non-standard returns.  The following two returns
-     * should never happen. They would be roughly equivalent, but less precise,
-     * versions of `(NPY_NO_CASTING|_NPY_CAST_IS_VIEW)`.
-     * 1. No-casting must imply cast-is-view.
-     * 2. Equivalent-casting + cast-is-view is (currently) the definition
-     *    of a "no" cast (there may be reasons to relax this).
-     * Note that e.g. `(NPY_UNSAFE_CASTING|_NPY_CAST_IS_VIEW)` is valid.
+     * should never happen:
+     * 1. No-casting must imply a view offset of 0.
+     * 2. Equivalent-casting + 0 view offset is (usually) the definition
+     *    of a "no" cast.  However, changing the order of fields can also
+     *    create descriptors that are not equivalent but views.
+     * Note that unsafe casts can have a view offset.  For example, in
+     * principle, casting `<i8` to `<i4` is a cast with 0 offset.
       */
-    assert(casting != NPY_NO_CASTING);
-    assert(casting != (NPY_EQUIV_CASTING|_NPY_CAST_IS_VIEW));
+    if (*view_offset != 0) {
+        assert(casting != NPY_NO_CASTING);
+    }
+    else {
+        assert(casting != NPY_EQUIV_CASTING
+               || (PyDataType_HASFIELDS(from) && PyDataType_HASFIELDS(to)));
+    }
      return casting;
  }
  
@@ -417,11 +439,13 @@ _get_cast_safety_from_castingimpl(PyArrayMethodObject *castingimpl,
   * @param to The descriptor to cast to (may be NULL)
   * @param to_dtype If `to` is NULL, must pass the to_dtype (otherwise this
   *        is ignored).
+ * @param[out] view_offset
   * @return NPY_CASTING or -1 on error or if the cast is not possible.
   */
  NPY_NO_EXPORT NPY_CASTING
-PyArray_GetCastSafety(
-        PyArray_Descr *from, PyArray_Descr *to, PyArray_DTypeMeta *to_dtype)
+PyArray_GetCastInfo(
+        PyArray_Descr *from, PyArray_Descr *to, PyArray_DTypeMeta *to_dtype,
+        npy_intp *view_offset)
  {
      if (to != NULL) {
          to_dtype = NPY_DTYPE(to);
@@ -438,7 +462,7 @@ PyArray_GetCastSafety(
      PyArrayMethodObject *castingimpl = (PyArrayMethodObject *)meth;
      PyArray_DTypeMeta *dtypes[2] = {NPY_DTYPE(from), to_dtype};
      NPY_CASTING casting = _get_cast_safety_from_castingimpl(castingimpl,
-            dtypes, from, to);
+            dtypes, from, to, view_offset);
      Py_DECREF(meth);
  
      return casting;
@@ -446,8 +470,8 @@ PyArray_GetCastSafety(
  
  
  /**
- * Check whether a cast is safe, see also `PyArray_GetCastSafety` for
- * a similar function.  Unlike GetCastSafety, this function checks the
+ * Check whether a cast is safe, see also `PyArray_GetCastInfo` for
+ * a similar function.  Unlike GetCastInfo, this function checks the
   * `castingimpl->casting` when available.  This allows for two things:
   *
   * 1. It avoids  calling `resolve_descriptors` in some cases.
@@ -490,8 +514,9 @@ PyArray_CheckCastSafety(NPY_CASTING casting,
      }
  
      PyArray_DTypeMeta *dtypes[2] = {NPY_DTYPE(from), to_dtype};
+    npy_intp view_offset;
      NPY_CASTING safety = _get_cast_safety_from_castingimpl(castingimpl,
-            dtypes, from, to);
+            dtypes, from, to, &view_offset);
      Py_DECREF(meth);
      /* If casting is the smaller (or equal) safety we match */
      if (safety < 0) {
@@ -906,22 +931,6 @@ promote_types(PyArray_Descr *type1, PyArray_Descr *type2,
  
  }
  
-/*
- * Returns a new reference to type if it is already NBO, otherwise
- * returns a copy converted to NBO.
- */
-NPY_NO_EXPORT PyArray_Descr *
-ensure_dtype_nbo(PyArray_Descr *type)
-{
-    if (PyArray_ISNBO(type->byteorder)) {
-        Py_INCREF(type);
-        return type;
-    }
-    else {
-        return PyArray_DescrNewByteorder(type, NPY_NATIVE);
-    }
-}
-
  
  /**
   * This function should possibly become public API eventually.  At this
@@ -968,8 +977,9 @@ PyArray_CastDescrToDType(PyArray_Descr *descr, PyArray_DTypeMeta *given_DType)
      PyArray_Descr *loop_descrs[2];
  
      PyArrayMethodObject *meth = (PyArrayMethodObject *)tmp;
+    npy_intp view_offset = NPY_MIN_INTP;
      NPY_CASTING casting = meth->resolve_descriptors(
-            meth, dtypes, given_descrs, loop_descrs);
+            meth, dtypes, given_descrs, loop_descrs, &view_offset);
      Py_DECREF(tmp);
      if (casting < 0) {
          goto error;
@@ -1003,7 +1013,7 @@ PyArray_FindConcatenationDescriptor(
          npy_intp n, PyArrayObject **arrays, PyObject *requested_dtype)
  {
      if (requested_dtype == NULL) {
-        return PyArray_LegacyResultType(n, arrays, 0, NULL);
+        return PyArray_ResultType(n, arrays, 0, NULL);
      }
  
      PyArray_DTypeMeta *common_dtype;
@@ -1064,7 +1074,13 @@ PyArray_PromoteTypes(PyArray_Descr *type1, PyArray_Descr *type2)
      PyArray_Descr *res;
  
      /* Fast path for identical inputs (NOTE: This path preserves metadata!) */
-    if (type1 == type2 && PyArray_ISNBO(type1->byteorder)) {
+    if (type1 == type2
+            /*
+             * Short-cut for legacy/builtin dtypes except void, since void has
+             * no reliable byteorder.  Note: This path preserves metadata!
+             */
+            && NPY_DT_is_legacy(NPY_DTYPE(type1))
+            && PyArray_ISNBO(type1->byteorder) && type1->type_num != NPY_VOID) {
          Py_INCREF(type1);
          return type1;
      }
@@ -1614,7 +1630,7 @@ PyArray_ResultType(
                      "no arrays or types available to calculate result type");
              return NULL;
          }
-        return ensure_dtype_nbo(result);
+        return NPY_DT_CALL_ensure_canonical(result);
      }
  
      void **info_on_heap = NULL;
@@ -1675,8 +1691,12 @@ PyArray_ResultType(
              all_DTypes[i_all] = &PyArray_PyComplexAbstractDType;
          }
          else {
-            /* N.B.: Could even be an object dtype here for large ints */
+            /* This could even be an object dtype here for large ints */
              all_DTypes[i_all] = &PyArray_PyIntAbstractDType;
+            if (PyArray_TYPE(arrs[i]) != NPY_LONG) {
+                /* Not a "normal" scalar, so we cannot avoid the legacy path */
+                all_pyscalar = 0;
+            }
          }
          Py_INCREF(all_DTypes[i_all]);
          /*
@@ -2286,13 +2306,14 @@ legacy_same_dtype_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[2]),
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      Py_INCREF(given_descrs[0]);
      loop_descrs[0] = given_descrs[0];
  
      if (given_descrs[1] == NULL) {
-        loop_descrs[1] = ensure_dtype_nbo(loop_descrs[0]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(loop_descrs[0]);
          if (loop_descrs[1] == NULL) {
              Py_DECREF(loop_descrs[0]);
              return -1;
@@ -2312,7 +2333,8 @@ legacy_same_dtype_resolve_descriptors(
       */
      if (PyDataType_ISNOTSWAPPED(loop_descrs[0]) ==
                  PyDataType_ISNOTSWAPPED(loop_descrs[1])) {
-        return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+        *view_offset = 0;
+        return NPY_NO_CASTING;
      }
      return NPY_EQUIV_CASTING;
  }
@@ -2351,16 +2373,17 @@ simple_cast_resolve_descriptors(
          PyArrayMethodObject *self,
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      assert(NPY_DT_is_legacy(dtypes[0]) && NPY_DT_is_legacy(dtypes[1]));
  
-    loop_descrs[0] = ensure_dtype_nbo(given_descrs[0]);
+    loop_descrs[0] = NPY_DT_CALL_ensure_canonical(given_descrs[0]);
      if (loop_descrs[0] == NULL) {
          return -1;
      }
      if (given_descrs[1] != NULL) {
-        loop_descrs[1] = ensure_dtype_nbo(given_descrs[1]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(given_descrs[1]);
          if (loop_descrs[1] == NULL) {
              Py_DECREF(loop_descrs[0]);
              return -1;
@@ -2375,7 +2398,8 @@ simple_cast_resolve_descriptors(
      }
      if (PyDataType_ISNOTSWAPPED(loop_descrs[0]) ==
              PyDataType_ISNOTSWAPPED(loop_descrs[1])) {
-        return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+        *view_offset = 0;
+        return NPY_NO_CASTING;
      }
      return NPY_EQUIV_CASTING;
  }
@@ -2426,7 +2450,7 @@ get_byteswap_loop(
  NPY_NO_EXPORT int
  complex_to_noncomplex_get_loop(
          PyArrayMethod_Context *context,
-        int aligned, int move_references, npy_intp *strides,
+        int aligned, int move_references, const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop, NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
  {
@@ -2569,7 +2593,8 @@ cast_to_string_resolve_descriptors(
          PyArrayMethodObject *self,
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      /*
       * NOTE: The following code used to be part of PyArray_AdaptFlexibleDType
@@ -2645,14 +2670,14 @@ cast_to_string_resolve_descriptors(
      }
      else {
          /* The legacy loop can handle mismatching itemsizes */
-        loop_descrs[1] = ensure_dtype_nbo(given_descrs[1]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(given_descrs[1]);
          if (loop_descrs[1] == NULL) {
              return -1;
          }
      }
  
      /* Set the input one as well (late for easier error management) */
-    loop_descrs[0] = ensure_dtype_nbo(given_descrs[0]);
+    loop_descrs[0] = NPY_DT_CALL_ensure_canonical(given_descrs[0]);
      if (loop_descrs[0] == NULL) {
          return -1;
      }
@@ -2720,13 +2745,14 @@ string_to_string_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[2]),
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      Py_INCREF(given_descrs[0]);
      loop_descrs[0] = given_descrs[0];
  
      if (given_descrs[1] == NULL) {
-        loop_descrs[1] = ensure_dtype_nbo(loop_descrs[0]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(loop_descrs[0]);
          if (loop_descrs[1] == NULL) {
              return -1;
          }
@@ -2736,26 +2762,36 @@ string_to_string_resolve_descriptors(
          loop_descrs[1] = given_descrs[1];
      }
  
-    if (loop_descrs[0]->elsize == loop_descrs[1]->elsize) {
-        if (PyDataType_ISNOTSWAPPED(loop_descrs[0]) ==
-                PyDataType_ISNOTSWAPPED(loop_descrs[1])) {
-            return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+    if (loop_descrs[0]->elsize < loop_descrs[1]->elsize) {
+        /* New string is longer: safe but cannot be a view */
+        return NPY_SAFE_CASTING;
+    }
+    else {
+        /* New string fits into old: can be a view if the byte order matches */
+        int not_swapped = (PyDataType_ISNOTSWAPPED(loop_descrs[0])
+                           == PyDataType_ISNOTSWAPPED(loop_descrs[1]));
+        if (not_swapped) {
+            *view_offset = 0;
+        }
+
+        if (loop_descrs[0]->elsize > loop_descrs[1]->elsize) {
+            return NPY_SAME_KIND_CASTING;
+        }
+        /* The strings have the same length: */
+        if (not_swapped) {
+            return NPY_NO_CASTING;
          }
          else {
              return NPY_EQUIV_CASTING;
          }
      }
-    else if (loop_descrs[0]->elsize <= loop_descrs[1]->elsize) {
-        return NPY_SAFE_CASTING;
-    }
-    return NPY_SAME_KIND_CASTING;
  }
  
  
  NPY_NO_EXPORT int
  string_to_string_get_loop(
          PyArrayMethod_Context *context,
-        int aligned, int NPY_UNUSED(move_references), npy_intp *strides,
+        int aligned, int NPY_UNUSED(move_references), const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop, NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
  {
@@ -2863,7 +2899,8 @@ PyArray_InitializeStringCasts(void)
   */
  static NPY_CASTING
  cast_to_void_dtype_class(
-        PyArray_Descr **given_descrs, PyArray_Descr **loop_descrs)
+        PyArray_Descr **given_descrs, PyArray_Descr **loop_descrs,
+        npy_intp *view_offset)
  {
      /* `dtype="V"` means unstructured currently (compare final path) */
      loop_descrs[1] = PyArray_DescrNewFromType(NPY_VOID);
@@ -2873,11 +2910,13 @@ cast_to_void_dtype_class(
      loop_descrs[1]->elsize = given_descrs[0]->elsize;
      Py_INCREF(given_descrs[0]);
      loop_descrs[0] = given_descrs[0];
+
+    *view_offset = 0;
      if (loop_descrs[0]->type_num == NPY_VOID &&
              loop_descrs[0]->subarray == NULL && loop_descrs[1]->names == NULL) {
-        return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+        return NPY_NO_CASTING;
      }
-    return NPY_SAFE_CASTING | _NPY_CAST_IS_VIEW;
+    return NPY_SAFE_CASTING;
  }
  
  
@@ -2886,12 +2925,13 @@ nonstructured_to_structured_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[2]),
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      NPY_CASTING casting;
  
      if (given_descrs[1] == NULL) {
-        return cast_to_void_dtype_class(given_descrs, loop_descrs);
+        return cast_to_void_dtype_class(given_descrs, loop_descrs, view_offset);
      }
  
      if (given_descrs[1]->subarray != NULL) {
@@ -2900,12 +2940,18 @@ nonstructured_to_structured_resolve_descriptors(
           * possible to allow a view if the field has exactly one element.
           */
          casting = NPY_SAFE_CASTING;
+        npy_intp sub_view_offset = NPY_MIN_INTP;
          /* Subarray dtype */
-        NPY_CASTING base_casting = PyArray_GetCastSafety(
-                given_descrs[0], given_descrs[1]->subarray->base, NULL);
+        NPY_CASTING base_casting = PyArray_GetCastInfo(
+                given_descrs[0], given_descrs[1]->subarray->base, NULL,
+                &sub_view_offset);
          if (base_casting < 0) {
              return -1;
          }
+        if (given_descrs[1]->elsize == given_descrs[1]->subarray->base->elsize) {
+            /* A single field, view is OK if sub-view is */
+            *view_offset = sub_view_offset;
+        }
          casting = PyArray_MinCastSafety(casting, base_casting);
      }
      else if (given_descrs[1]->names != NULL) {
@@ -2917,21 +2963,32 @@ nonstructured_to_structured_resolve_descriptors(
          else {
              /* Considered at most unsafe casting (but this could be changed) */
              casting = NPY_UNSAFE_CASTING;
-            if (PyTuple_Size(given_descrs[1]->names) == 1) {
-                /* A view may be acceptable */
-                casting |= _NPY_CAST_IS_VIEW;
-            }
  
              Py_ssize_t pos = 0;
              PyObject *key, *tuple;
              while (PyDict_Next(given_descrs[1]->fields, &pos, &key, &tuple)) {
                  PyArray_Descr *field_descr = (PyArray_Descr *)PyTuple_GET_ITEM(tuple, 0);
-                NPY_CASTING field_casting = PyArray_GetCastSafety(
-                        given_descrs[0], field_descr, NULL);
+                npy_intp field_view_off = NPY_MIN_INTP;
+                NPY_CASTING field_casting = PyArray_GetCastInfo(
+                        given_descrs[0], field_descr, NULL, &field_view_off);
                  casting = PyArray_MinCastSafety(casting, field_casting);
                  if (casting < 0) {
                      return -1;
                  }
+                if (field_view_off != NPY_MIN_INTP) {
+                    npy_intp to_off = PyLong_AsSsize_t(PyTuple_GET_ITEM(tuple, 1));
+                    if (error_converting(to_off)) {
+                        return -1;
+                    }
+                    *view_offset = field_view_off - to_off;
+                }
+            }
+            if (PyTuple_Size(given_descrs[1]->names) != 1 || *view_offset < 0) {
+                /*
+                 * Assume that a view is impossible when there is more than one
+                 * field.  (Fields could overlap, but that seems weird...)
+                 */
+                *view_offset = NPY_MIN_INTP;
              }
          }
      }
@@ -2941,15 +2998,20 @@ nonstructured_to_structured_resolve_descriptors(
                  !PyDataType_REFCHK(given_descrs[0])) {
              /*
               * A simple view, at the moment considered "safe" (the refcheck is
-             * probably not necessary, but more future proof
+             * probably not necessary, but more future proof)
               */
-            casting = NPY_SAFE_CASTING | _NPY_CAST_IS_VIEW;
+            *view_offset = 0;
+            casting = NPY_SAFE_CASTING;
          }
          else if (given_descrs[0]->elsize <= given_descrs[1]->elsize) {
              casting = NPY_SAFE_CASTING;
          }
          else {
              casting = NPY_UNSAFE_CASTING;
+            /* new elsize is smaller so a view is OK (reject refs for now) */
+            if (!PyDataType_REFCHK(given_descrs[0])) {
+                *view_offset = 0;
+            }
          }
      }
  
@@ -2978,7 +3040,7 @@ static int
  nonstructured_to_structured_get_loop(
          PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *strides,
+        const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
@@ -3045,6 +3107,8 @@ PyArray_GetGenericToVoidCastingImpl(void)
      method->casting = -1;
      method->resolve_descriptors = &nonstructured_to_structured_resolve_descriptors;
      method->get_strided_loop = &nonstructured_to_structured_get_loop;
+    method->nin = 1;
+    method->nout = 1;
  
      return (PyObject *)method;
  }
@@ -3055,12 +3119,19 @@ structured_to_nonstructured_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      PyArray_Descr *base_descr;
+    /* The structured part may allow a view (and have its own offset): */
+    npy_intp struct_view_offset = NPY_MIN_INTP;
  
      if (given_descrs[0]->subarray != NULL) {
          base_descr = given_descrs[0]->subarray->base;
+        /* A view is possible if the subarray has exactly one element: */
+        if (given_descrs[0]->elsize == given_descrs[0]->subarray->base->elsize) {
+            struct_view_offset = 0;
+        }
      }
      else if (given_descrs[0]->names != NULL) {
          if (PyTuple_Size(given_descrs[0]->names) != 1) {
@@ -3070,6 +3141,10 @@ structured_to_nonstructured_resolve_descriptors(
          PyObject *key = PyTuple_GetItem(given_descrs[0]->names, 0);
          PyObject *base_tup = PyDict_GetItem(given_descrs[0]->fields, key);
          base_descr = (PyArray_Descr *)PyTuple_GET_ITEM(base_tup, 0);
+        struct_view_offset = PyLong_AsSsize_t(PyTuple_GET_ITEM(base_tup, 1));
+        if (error_converting(struct_view_offset)) {
+            return -1;
+        }
      }
      else {
          /*
@@ -3077,20 +3152,29 @@ structured_to_nonstructured_resolve_descriptors(
           * at this time they go back to legacy behaviour using getitem/setitem.
           */
          base_descr = NULL;
+        struct_view_offset = 0;
      }
  
      /*
-     * The cast is always considered unsafe, so the PyArray_GetCastSafety
-     * result currently does not matter.
+     * The cast is always considered unsafe, so the PyArray_GetCastInfo
+     * result currently only matters for the view_offset.
       */
-    if (base_descr != NULL && PyArray_GetCastSafety(
-            base_descr, given_descrs[1], dtypes[1]) < 0) {
+    npy_intp base_view_offset = NPY_MIN_INTP;
+    if (base_descr != NULL && PyArray_GetCastInfo(
+            base_descr, given_descrs[1], dtypes[1], &base_view_offset) < 0) {
          return -1;
      }
+    if (base_view_offset != NPY_MIN_INTP
+            && struct_view_offset != NPY_MIN_INTP) {
+        *view_offset = base_view_offset + struct_view_offset;
+    }
  
      /* Void dtypes always do the full cast. */
      if (given_descrs[1] == NULL) {
          loop_descrs[1] = NPY_DT_CALL_default_descr(dtypes[1]);
+        if (loop_descrs[1] == NULL) {
+            return -1;
+        }
          /*
           * Special case strings here, it should be useless (and only actually
           * work for empty arrays).  Possibly this should simply raise for
@@ -3118,7 +3202,7 @@ static int
  structured_to_nonstructured_get_loop(
          PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *strides,
+        const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
@@ -3184,6 +3268,8 @@ PyArray_GetVoidToGenericCastingImpl(void)
      method->casting = -1;
      method->resolve_descriptors = &structured_to_nonstructured_resolve_descriptors;
      method->get_strided_loop = &structured_to_nonstructured_get_loop;
+    method->nin = 1;
+    method->nout = 1;
  
      return (PyObject *)method;
  }
@@ -3198,73 +3284,106 @@ PyArray_GetVoidToGenericCastingImpl(void)
   *       implementations on the dtype, to avoid duplicate work.
   */
  static NPY_CASTING
-can_cast_fields_safety(PyArray_Descr *from, PyArray_Descr *to)
+can_cast_fields_safety(
+        PyArray_Descr *from, PyArray_Descr *to, npy_intp *view_offset)
  {
-    NPY_CASTING casting = NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
-
      Py_ssize_t field_count = PyTuple_Size(from->names);
      if (field_count != PyTuple_Size(to->names)) {
-        /* TODO: This should be rejected! */
-        return NPY_UNSAFE_CASTING;
+        return -1;
      }
+
+    NPY_CASTING casting = NPY_NO_CASTING;
+    *view_offset = 0;  /* if there are no fields, a view is OK. */
      for (Py_ssize_t i = 0; i < field_count; i++) {
+        npy_intp field_view_off = NPY_MIN_INTP;
          PyObject *from_key = PyTuple_GET_ITEM(from->names, i);
          PyObject *from_tup = PyDict_GetItemWithError(from->fields, from_key);
          if (from_tup == NULL) {
              return give_bad_field_error(from_key);
          }
-        PyArray_Descr *from_base = (PyArray_Descr*)PyTuple_GET_ITEM(from_tup, 0);
+        PyArray_Descr *from_base = (PyArray_Descr *) PyTuple_GET_ITEM(from_tup, 0);
  
-        /*
-         * TODO: This should use to_key (order), compare gh-15509 by
-         *       by Allan Haldane.  And raise an error on failure.
-         *       (Fixing that may also requires fixing/changing promotion.)
-         */
-        PyObject *to_tup = PyDict_GetItem(to->fields, from_key);
+        /* Check whether the field names match */
+        PyObject *to_key = PyTuple_GET_ITEM(to->names, i);
+        PyObject *to_tup = PyDict_GetItem(to->fields, to_key);
          if (to_tup == NULL) {
-            return NPY_UNSAFE_CASTING;
+            return give_bad_field_error(from_key);
+        }
+        PyArray_Descr *to_base = (PyArray_Descr *) PyTuple_GET_ITEM(to_tup, 0);
+
+        int cmp = PyUnicode_Compare(from_key, to_key);
+        if (error_converting(cmp)) {
+            return -1;
+        }
+        if (cmp != 0) {
+            /* Field name mismatch, consider this at most SAFE. */
+            casting = PyArray_MinCastSafety(casting, NPY_SAFE_CASTING);
          }
-        PyArray_Descr *to_base = (PyArray_Descr*)PyTuple_GET_ITEM(to_tup, 0);
  
-        NPY_CASTING field_casting = PyArray_GetCastSafety(from_base, to_base, NULL);
+        /* Also check the title (denote mismatch as SAFE only) */
+        PyObject *from_title = from_key;
+        PyObject *to_title = to_key;
+        if (PyTuple_GET_SIZE(from_tup) > 2) {
+            from_title = PyTuple_GET_ITEM(from_tup, 2);
+        }
+        if (PyTuple_GET_SIZE(to_tup) > 2) {
+            to_title = PyTuple_GET_ITEM(to_tup, 2);
+        }
+        cmp = PyObject_RichCompareBool(from_title, to_title, Py_EQ);
+        if (error_converting(cmp)) {
+            return -1;
+        }
+        if (!cmp) {
+            casting = PyArray_MinCastSafety(casting, NPY_SAFE_CASTING);
+        }
+
+        NPY_CASTING field_casting = PyArray_GetCastInfo(
+                from_base, to_base, NULL, &field_view_off);
          if (field_casting < 0) {
              return -1;
          }
          casting = PyArray_MinCastSafety(casting, field_casting);
-    }
-    if (!(casting & _NPY_CAST_IS_VIEW)) {
-        assert((casting & ~_NPY_CAST_IS_VIEW) != NPY_NO_CASTING);
-        return casting;
-    }
  
-    /*
-     * If the itemsize (includes padding at the end), fields, or names
-     * do not match, this cannot be a view and also not a "no" cast
-     * (identical dtypes).
-     * It may be possible that this can be relaxed in some cases.
-     */
-    if (from->elsize != to->elsize) {
+        /* Adjust the "view offset" by the field offsets: */
+        if (field_view_off != NPY_MIN_INTP) {
+            npy_intp to_off = PyLong_AsSsize_t(PyTuple_GET_ITEM(to_tup, 1));
+            if (error_converting(to_off)) {
+                return -1;
+            }
+            npy_intp from_off = PyLong_AsSsize_t(PyTuple_GET_ITEM(from_tup, 1));
+            if (error_converting(from_off)) {
+                return -1;
+            }
+            field_view_off = field_view_off - to_off + from_off;
+        }
+
          /*
-         * The itemsize may mismatch even if all fields and formats match
-         * (due to additional padding).
+         * If there is one field, use its field offset.  After that propagate
+         * the view offset if they match and set to "invalid" if not.
           */
-        return PyArray_MinCastSafety(casting, NPY_EQUIV_CASTING);
+        if (i == 0) {
+            *view_offset = field_view_off;
+        }
+        else if (*view_offset != field_view_off) {
+            *view_offset = NPY_MIN_INTP;
+        }
      }
  
-    int cmp = PyObject_RichCompareBool(from->fields, to->fields, Py_EQ);
-    if (cmp != 1) {
-        if (cmp == -1) {
-            PyErr_Clear();
-        }
-        return PyArray_MinCastSafety(casting, NPY_EQUIV_CASTING);
+    if (*view_offset != 0 || from->elsize != to->elsize) {
+        /* Can never be considered "no" casting. */
+        casting = PyArray_MinCastSafety(casting, NPY_EQUIV_CASTING);
      }
-    cmp = PyObject_RichCompareBool(from->names, to->names, Py_EQ);
-    if (cmp != 1) {
-        if (cmp == -1) {
-            PyErr_Clear();
-        }
-        return PyArray_MinCastSafety(casting, NPY_EQUIV_CASTING);
+
+    /* The new dtype may have access outside the old one due to padding: */
+    if (*view_offset < 0) {
+        /* negative offsets would give indirect access before original dtype */
+        *view_offset = NPY_MIN_INTP;
+    }
+    if (from->elsize < to->elsize + *view_offset) {
+        /* new dtype has indirect access outside of the original dtype */
+        *view_offset = NPY_MIN_INTP;
      }
+
      return casting;
  }
  
@@ -3274,38 +3393,45 @@ void_to_void_resolve_descriptors(
          PyArrayMethodObject *self,
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      NPY_CASTING casting;
  
      if (given_descrs[1] == NULL) {
          /* This is weird, since it doesn't return the original descr, but... */
-        return cast_to_void_dtype_class(given_descrs, loop_descrs);
+        return cast_to_void_dtype_class(given_descrs, loop_descrs, view_offset);
      }
  
      if (given_descrs[0]->names != NULL && given_descrs[1]->names != NULL) {
          /* From structured to structured, need to check fields */
-        casting = can_cast_fields_safety(given_descrs[0], given_descrs[1]);
+        casting = can_cast_fields_safety(
+                given_descrs[0], given_descrs[1], view_offset);
+        if (casting < 0) {
+            return -1;
+        }
      }
      else if (given_descrs[0]->names != NULL) {
          return structured_to_nonstructured_resolve_descriptors(
-                self, dtypes, given_descrs, loop_descrs);
+                self, dtypes, given_descrs, loop_descrs, view_offset);
      }
      else if (given_descrs[1]->names != NULL) {
          return nonstructured_to_structured_resolve_descriptors(
-                self, dtypes, given_descrs, loop_descrs);
+                self, dtypes, given_descrs, loop_descrs, view_offset);
      }
      else if (given_descrs[0]->subarray == NULL &&
                  given_descrs[1]->subarray == NULL) {
          /* Both are plain void dtypes */
          if (given_descrs[0]->elsize == given_descrs[1]->elsize) {
-            casting = NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+            casting = NPY_NO_CASTING;
+            *view_offset = 0;
          }
          else if (given_descrs[0]->elsize < given_descrs[1]->elsize) {
              casting = NPY_SAFE_CASTING;
          }
          else {
              casting = NPY_SAME_KIND_CASTING;
+            *view_offset = 0;
          }
      }
      else {
@@ -3319,20 +3445,51 @@ void_to_void_resolve_descriptors(
  
          /* If the shapes do not match, this is at most an unsafe cast */
          casting = NPY_UNSAFE_CASTING;
+        /*
+         * We can use a view in two cases:
+         * 1. The shapes and elsizes match, so any view offset applies to
+         *    each element of the subarray identically.
+         *    (in practice this probably implies the `view_offset` will be 0)
+         * 2. There is exactly one element and the subarray has no effect
+         *    (can be tested by checking if the itemsizes of the base matches)
+         */
+        npy_bool subarray_layout_supports_view = NPY_FALSE;
          if (from_sub && to_sub) {
              int res = PyObject_RichCompareBool(from_sub->shape, to_sub->shape, Py_EQ);
              if (res < 0) {
                  return -1;
              }
              else if (res) {
-                /* Both are subarrays and the shape matches */
-                casting = NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+                /* Both are subarrays and the shape matches, could be no cast */
+                casting = NPY_NO_CASTING;
+                /* May be a view if there is one element or elsizes match */
+                if (from_sub->base->elsize == to_sub->base->elsize
+                        || given_descrs[0]->elsize == from_sub->base->elsize) {
+                    subarray_layout_supports_view = NPY_TRUE;
+                }
+            }
+        }
+        else if (from_sub) {
+            /* May use a view if "from" has only a single element: */
+            if (given_descrs[0]->elsize == from_sub->base->elsize) {
+                subarray_layout_supports_view = NPY_TRUE;
+            }
+        }
+        else {
+            /* May use a view if "to" has only a single element: */
+            if (given_descrs[1]->elsize == to_sub->base->elsize) {
+                subarray_layout_supports_view = NPY_TRUE;
              }
          }
  
          PyArray_Descr *from_base = (from_sub == NULL) ? given_descrs[0] : from_sub->base;
          PyArray_Descr *to_base = (to_sub == NULL) ? given_descrs[1] : to_sub->base;
-        NPY_CASTING field_casting = PyArray_GetCastSafety(from_base, to_base, NULL);
+        /* An offset for  */
+        NPY_CASTING field_casting = PyArray_GetCastInfo(
+                from_base, to_base, NULL, view_offset);
+        if (!subarray_layout_supports_view) {
+            *view_offset = NPY_MIN_INTP;
+        }
          if (field_casting < 0) {
              return -1;
          }
@@ -3353,7 +3510,7 @@ NPY_NO_EXPORT int
  void_to_void_get_loop(
          PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *strides,
+        const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
@@ -3440,7 +3597,8 @@ object_to_any_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      if (given_descrs[1] == NULL) {
          /*
@@ -3504,13 +3662,14 @@ PyArray_GetObjectToGenericCastingImpl(void)
  
  
  
-/* Any object object is simple (could even use the default) */
+/* Any object is simple (could even use the default) */
  static NPY_CASTING
  any_to_object_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      if (given_descrs[1] == NULL) {
          loop_descrs[1] = NPY_DT_CALL_default_descr(dtypes[1]);
@@ -3567,7 +3726,7 @@ static int
  object_to_object_get_loop(
          PyArrayMethod_Context *NPY_UNUSED(context),
          int NPY_UNUSED(aligned), int move_references,
-        npy_intp *NPY_UNUSED(strides),
+        const npy_intp *NPY_UNUSED(strides),
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
@@ -3588,10 +3747,6 @@ object_to_object_get_loop(
  static int
  PyArray_InitializeObjectToObjectCast(void)
  {
-    /*
-     * The object dtype does not support byte order changes, so its cast
-     * is always a direct view.
-     */
      PyArray_DTypeMeta *Object = PyArray_DTypeFromTypeNum(NPY_OBJECT);
      PyArray_DTypeMeta *dtypes[2] = {Object, Object};
      PyType_Slot slots[] = {
@@ -3599,7 +3754,7 @@ PyArray_InitializeObjectToObjectCast(void)
              {0, NULL}};
      PyArrayMethod_Spec spec = {
              .name = "object_to_object_cast",
-            .casting = NPY_NO_CASTING | _NPY_CAST_IS_VIEW,
+            .casting = NPY_NO_CASTING,
              .nin = 1,
              .nout = 1,
              .flags = NPY_METH_REQUIRES_PYAPI | NPY_METH_SUPPORTS_UNALIGNED,
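Note on the convert_datatype.c changes above: the `_NPY_CAST_IS_VIEW` flag bit is replaced by a separate `view_offset` out-parameter, with `NPY_MIN_INTP` meaning "no view possible". The following is a minimal standalone sketch of the new string-to-string decision, not the NumPy code itself. The `sketch_*` names and `SKETCH_NO_VIEW` are hypothetical stand-ins, and resetting the offset inside the function is an assumption made for self-containment (in the diff the caller owns the initial value).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the NPY_CASTING levels; not the NumPy enum. */
typedef enum {
    SKETCH_NO_CASTING = 0,
    SKETCH_EQUIV_CASTING = 1,
    SKETCH_SAFE_CASTING = 2,
    SKETCH_SAME_KIND_CASTING = 3
} sketch_casting;

/* Hypothetical sentinel playing the role of NPY_MIN_INTP ("no view"). */
#define SKETCH_NO_VIEW PTRDIFF_MIN

/*
 * Mirrors the branch structure of the new string_to_string logic:
 * the casting level and the view offset are reported independently.
 */
static sketch_casting
sketch_string_cast(size_t from_size, size_t to_size, int same_byteorder,
                   ptrdiff_t *view_offset)
{
    assert(view_offset != NULL);
    /* Assumption: reset here; in the diff the caller initializes it. */
    *view_offset = SKETCH_NO_VIEW;

    if (from_size < to_size) {
        /* New string is longer: safe, but it cannot be a view. */
        return SKETCH_SAFE_CASTING;
    }
    /* New string fits into the old one: a view needs matching byte order. */
    if (same_byteorder) {
        *view_offset = 0;
    }
    if (from_size > to_size) {
        /* Truncation: only same-kind, yet possibly still a view. */
        return SKETCH_SAME_KIND_CASTING;
    }
    /* Same length: identical if the byte order matches, else equivalent. */
    return same_byteorder ? SKETCH_NO_CASTING : SKETCH_EQUIV_CASTING;
}
```

The point of the sketch is that a truncating cast can report a same-kind casting level and a valid view offset of 0 at the same time, a combination the old flag-based code did not return for strings.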
diff --git a/numpy/core/src/multiarray/convert_datatype.h b/numpy/core/src/multiarray/convert_datatype.h
index 5e0682f2267d05aec2a2fffb034e669facac4bf1..d1865d1c247e14b15828291922d127006276ffd4 100644
--- a/numpy/core/src/multiarray/convert_datatype.h
+++ b/numpy/core/src/multiarray/convert_datatype.h
@@ -36,9 +36,6 @@ NPY_NO_EXPORT npy_bool
  can_cast_scalar_to(PyArray_Descr *scal_type, char *scal_data,
                      PyArray_Descr *to, NPY_CASTING casting);
  
-NPY_NO_EXPORT PyArray_Descr *
-ensure_dtype_nbo(PyArray_Descr *type);
-
  NPY_NO_EXPORT int
  should_use_min_scalar(npy_intp narrs, PyArrayObject **arr,
                        npy_intp ndtypes, PyArray_Descr **dtypes);
@@ -68,8 +65,9 @@ NPY_NO_EXPORT NPY_CASTING
  PyArray_MinCastSafety(NPY_CASTING casting1, NPY_CASTING casting2);
  
  NPY_NO_EXPORT NPY_CASTING
-PyArray_GetCastSafety(
-        PyArray_Descr *from, PyArray_Descr *to, PyArray_DTypeMeta *to_dtype);
+PyArray_GetCastInfo(
+        PyArray_Descr *from, PyArray_Descr *to, PyArray_DTypeMeta *to_dtype,
+        npy_intp *view_offset);
  
  NPY_NO_EXPORT int
  PyArray_CheckCastSafety(NPY_CASTING casting,
@@ -80,7 +78,8 @@ legacy_same_dtype_resolve_descriptors(
          PyArrayMethodObject *self,
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2]);
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset);
  
  NPY_NO_EXPORT int
  legacy_cast_get_strided_loop(
@@ -94,7 +93,8 @@ simple_cast_resolve_descriptors(
          PyArrayMethodObject *self,
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *input_descrs[2],
-        PyArray_Descr *loop_descrs[2]);
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset);
  
  NPY_NO_EXPORT int
  PyArray_InitializeCasts(void);
diff --git a/numpy/core/src/multiarray/ctors.c b/numpy/core/src/multiarray/ctors.c
index c5a7ebf7dec97c9b916308632a85ba8ca1e550df..c780f4b2bb126d12b4f63ac9a8089966cec84b1f 100644
--- a/numpy/core/src/multiarray/ctors.c
+++ b/numpy/core/src/multiarray/ctors.c
@@ -743,7 +743,6 @@ PyArray_NewFromDescr_int(
      }
      else {
          fa->flags = (flags & ~NPY_ARRAY_WRITEBACKIFCOPY);
-        fa->flags &= ~NPY_ARRAY_UPDATEIFCOPY;
      }
      fa->descr = descr;
      fa->base = (PyObject *)NULL;
@@ -759,19 +758,17 @@ PyArray_NewFromDescr_int(
  
          /*
           * Copy dimensions, check them, and find total array size `nbytes`
-         *
-         * Note that we ignore 0-length dimensions, to match this in the `free`
-         * calls, `PyArray_NBYTES_ALLOCATED` is a private helper matching this
-         * behaviour, but without overflow checking.
           */
+        int is_zero = 0;
          for (int i = 0; i < nd; i++) {
              fa->dimensions[i] = dims[i];
  
              if (fa->dimensions[i] == 0) {
                  /*
-                 * Compare to PyArray_OverflowMultiplyList that
-                 * returns 0 in this case. See also `PyArray_NBYTES_ALLOCATED`.
+                 * Continue calculating the max size "as if" this were 1
+                 * to get the proper overflow error
                   */
+                is_zero = 1;
                  continue;
              }
  
@@ -792,6 +789,9 @@ PyArray_NewFromDescr_int(
                  goto fail;
              }
          }
+        if (is_zero) {
+            nbytes = 0;
+        }
  
          /* Fill the strides (or copy them if they were passed in) */
          if (strides == NULL) {
@@ -826,11 +826,13 @@ PyArray_NewFromDescr_int(
           * Allocate something even for zero-space arrays
           * e.g. shape=(0,) -- otherwise buffer exposure
           * (a.data) doesn't work as it should.
-         * Could probably just allocate a few bytes here. -- Chuck
-         * Note: always sync this with calls to PyDataMem_UserFREE
           */
          if (nbytes == 0) {
-            nbytes = descr->elsize ? descr->elsize : 1;
+            nbytes = 1;
+            /* Make sure all the strides are 0 */
+            for (int i = 0; i < nd; i++) {
+                fa->strides[i] = 0;
+            }
          }
          /*
           * It is bad to have uninitialized OBJECT pointers
@@ -879,16 +881,39 @@ PyArray_NewFromDescr_int(
      /*
       * call the __array_finalize__ method if a subtype was requested.
       * If obj is NULL use Py_None for the Python callback.
+     * For speed, we skip if __array_finalize__ is inherited from ndarray
+     * (since that function does nothing), or, for backward compatibility,
+     * if it is None.
       */
      if (subtype != &PyArray_Type) {
          PyObject *res, *func;
-
-        func = PyObject_GetAttr((PyObject *)fa, npy_ma_str_array_finalize);
+        static PyObject *ndarray_array_finalize = NULL;
+        /* First time, cache ndarray's __array_finalize__ */
+        if (ndarray_array_finalize == NULL) {
+            ndarray_array_finalize = PyObject_GetAttr(
+                (PyObject *)&PyArray_Type, npy_ma_str_array_finalize);
+        }
+        func = PyObject_GetAttr((PyObject *)subtype, npy_ma_str_array_finalize);
          if (func == NULL) {
              goto fail;
          }
+        else if (func == ndarray_array_finalize) {
+            Py_DECREF(func);
+        }
          else if (func == Py_None) {
              Py_DECREF(func);
+            /*
+             * 2022-01-08, NumPy 1.23; when deprecation period is over, remove this
+             * whole stanza so one gets a "NoneType object is not callable" TypeError.
+             */
+            if (DEPRECATE(
+                    "Setting __array_finalize__ = None to indicate no finalization "
+                    "should be done is deprecated.  Instead, just inherit from "
+                    "ndarray or, if that is not possible, explicitly set to "
+                    "ndarray.__array_finalize__; this will raise a TypeError "
+                    "in the future. (Deprecated since NumPy 1.23)") < 0) {
+                goto fail;
+            }
          }
          else {
              if (PyCapsule_CheckExact(func)) {
@@ -907,7 +932,7 @@ PyArray_NewFromDescr_int(
                  if (obj == NULL) {
                      obj = Py_None;
                  }
-                res = PyObject_CallFunctionObjArgs(func, obj, NULL);
+                res = PyObject_CallFunctionObjArgs(func, (PyObject *)fa, obj, NULL);
                  Py_DECREF(func);
                  if (res == NULL) {
                      goto fail;
@@ -939,6 +964,18 @@ PyArray_NewFromDescr(
          int nd, npy_intp const *dims, npy_intp const *strides, void *data,
          int flags, PyObject *obj)
  {
+    if (subtype == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "subtype is NULL in PyArray_NewFromDescr");
+        return NULL;
+    }
+
+    if (descr == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "descr is NULL in PyArray_NewFromDescr");
+        return NULL;
+    }
+
      return PyArray_NewFromDescrAndBase(
              subtype, descr,
              nd, dims, strides, data,
@@ -1091,6 +1128,11 @@ NPY_NO_EXPORT PyObject *
  PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order,
                       PyArray_Descr *dtype, int subok)
  {
+    if (prototype == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "prototype is NULL in PyArray_NewLikeArray");
+        return NULL;
+    }
      return PyArray_NewLikeArrayWithShape(prototype, order, dtype, -1, NULL, subok);
  }
  
@@ -1106,6 +1148,12 @@ PyArray_New(
      PyArray_Descr *descr;
      PyObject *new;
  
+    if (subtype == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "subtype is NULL in PyArray_New");
+        return NULL;
+    }
+
      descr = PyArray_DescrFromType(type_num);
      if (descr == NULL) {
          return NULL;
@@ -1167,6 +1215,16 @@ _array_from_buffer_3118(PyObject *memoryview)
      npy_intp shape[NPY_MAXDIMS], strides[NPY_MAXDIMS];
  
      view = PyMemoryView_GET_BUFFER(memoryview);
+
+    if (view->suboffsets != NULL) {
+        PyErr_SetString(PyExc_BufferError,
+                "NumPy currently does not support importing buffers which "
+                "include suboffsets as they are not compatible with the NumPy "
+                "memory layout without a copy.  Consider copying the original "
+                "before trying to convert it to a NumPy array.");
+        return NULL;
+    }
+
      nd = view->ndim;
      descr = _dtype_from_buffer_3118(memoryview);
  
@@ -1614,8 +1672,8 @@ PyArray_FromAny(PyObject *op, PyArray_Descr *newtype, int min_depth,
           * Thus, we check if there is an array included, in that case we
           * give a FutureWarning.
           * When the warning is removed, PyArray_Pack will have to ensure
-         * that that it does not append the dimensions when creating the
-         * subarrays to assign `arr[0] = obj[0]`.
+         * that it does not append the dimensions when creating the subarrays
+         * to assign `arr[0] = obj[0]`.
           */
          int includes_array = 0;
          if (cache != NULL) {
@@ -1758,8 +1816,7 @@ PyArray_FromAny(PyObject *op, PyArray_Descr *newtype, int min_depth,
      }
  
      /* There was no array (or array-like) passed in directly. */
-    if ((flags & NPY_ARRAY_WRITEBACKIFCOPY) ||
-            (flags & NPY_ARRAY_UPDATEIFCOPY)) {
+    if (flags & NPY_ARRAY_WRITEBACKIFCOPY) {
          PyErr_SetString(PyExc_TypeError,
                          "WRITEBACKIFCOPY used for non-array input.");
          Py_DECREF(dtype);
@@ -1828,7 +1885,6 @@ PyArray_FromAny(PyObject *op, PyArray_Descr *newtype, int min_depth,
   * NPY_ARRAY_WRITEABLE,
   * NPY_ARRAY_NOTSWAPPED,
   * NPY_ARRAY_ENSURECOPY,
- * NPY_ARRAY_UPDATEIFCOPY,
   * NPY_ARRAY_WRITEBACKIFCOPY,
   * NPY_ARRAY_FORCECAST,
   * NPY_ARRAY_ENSUREARRAY,
@@ -1855,9 +1911,6 @@ PyArray_FromAny(PyObject *op, PyArray_Descr *newtype, int min_depth,
   * Fortran arrays are always behaved (aligned,
   * notswapped, and writeable) and not (C) CONTIGUOUS (if > 1d).
   *
- * NPY_ARRAY_UPDATEIFCOPY is deprecated in favor of
- * NPY_ARRAY_WRITEBACKIFCOPY in 1.14
-
   * NPY_ARRAY_WRITEBACKIFCOPY flag sets this flag in the returned
   * array if a copy is made and the base argument points to the (possibly)
   * misbehaved array. Before returning to python, PyArray_ResolveWritebackIfCopy
@@ -2012,31 +2065,8 @@ PyArray_FromArray(PyArrayObject *arr, PyArray_Descr *newtype, int flags)
              return NULL;
          }
  
-        if (flags & NPY_ARRAY_UPDATEIFCOPY) {
-            /* This is the ONLY place the NPY_ARRAY_UPDATEIFCOPY flag
-             * is still used.
-             * Can be deleted once the flag itself is removed
-             */
  
-            /* 2017-Nov-10 1.14 */
-            if (DEPRECATE(
-                    "NPY_ARRAY_UPDATEIFCOPY, NPY_ARRAY_INOUT_ARRAY, and "
-                    "NPY_ARRAY_INOUT_FARRAY are deprecated, use NPY_WRITEBACKIFCOPY, "
-                    "NPY_ARRAY_INOUT_ARRAY2, or NPY_ARRAY_INOUT_FARRAY2 respectively "
-                    "instead, and call PyArray_ResolveWritebackIfCopy before the "
-                    "array is deallocated, i.e. before the last call to Py_DECREF.") < 0) {
-                Py_DECREF(ret);
-                return NULL;
-            }
-            Py_INCREF(arr);
-            if (PyArray_SetWritebackIfCopyBase(ret, arr) < 0) {
-                Py_DECREF(ret);
-                return NULL;
-            }
-            PyArray_ENABLEFLAGS(ret, NPY_ARRAY_UPDATEIFCOPY);
-            PyArray_CLEARFLAGS(ret, NPY_ARRAY_WRITEBACKIFCOPY);
-        }
-        else if (flags & NPY_ARRAY_WRITEBACKIFCOPY) {
+        if (flags & NPY_ARRAY_WRITEBACKIFCOPY) {
              Py_INCREF(arr);
              if (PyArray_SetWritebackIfCopyBase(ret, arr) < 0) {
                  Py_DECREF(ret);
@@ -2082,7 +2112,7 @@ PyArray_FromStructInterface(PyObject *input)
      PyObject *attr;
      char endian = NPY_NATBYTE;
  
-    attr = PyArray_LookupSpecial_OnInstance(input, "__array_struct__");
+    attr = PyArray_LookupSpecial_OnInstance(input, npy_ma_str_array_struct);
      if (attr == NULL) {
          if (PyErr_Occurred()) {
              return NULL;
@@ -2136,11 +2166,31 @@ PyArray_FromStructInterface(PyObject *input)
          }
      }
  
+    /* a tuple to hold references */
+    PyObject *refs = PyTuple_New(2);
+    if (!refs) {
+        Py_DECREF(attr);
+        return NULL;
+    }
+
+    /* add a reference to the object sharing the data */
+    Py_INCREF(input);
+    PyTuple_SET_ITEM(refs, 0, input);
+
+    /* take a reference to the PyCapsule containing the PyArrayInterface
+     * structure. When the PyCapsule reference is released the PyCapsule
+     * destructor will free any resources that need to persist while numpy has
+     * access to the data. */
+    PyTuple_SET_ITEM(refs, 1,  attr);
+
+    /* create the numpy array, this call adds a reference to refs */
      PyObject *ret = PyArray_NewFromDescrAndBase(
              &PyArray_Type, thetype,
              inter->nd, inter->shape, inter->strides, inter->data,
-            inter->flags, NULL, input);
-    Py_DECREF(attr);
+            inter->flags, NULL, refs);
+
+    Py_DECREF(refs);
+
      return ret;
  
   fail:
@@ -2171,38 +2221,6 @@ _is_default_descr(PyObject *descr, PyObject *typestr) {
  }
  
  
-/*
- * A helper function to transition away from ignoring errors during
- * special attribute lookups during array coercion.
- */
-static NPY_INLINE int
-deprecated_lookup_error_clearing(PyTypeObject *type, char *attribute)
-{
-    PyObject *exc_type, *exc_value, *traceback;
-    PyErr_Fetch(&exc_type, &exc_value, &traceback);
-
-    /* DEPRECATED 2021-05-12, NumPy 1.21. */
-    int res = PyErr_WarnFormat(PyExc_DeprecationWarning, 1,
-            "An exception was ignored while fetching the attribute `%s` from "
-            "an object of type '%s'.  With the exception of `AttributeError` "
-            "NumPy will always raise this exception in the future.  Raise this "
-            "deprecation warning to see the original exception. "
-            "(Warning added NumPy 1.21)", attribute, type->tp_name);
-
-    if (res < 0) {
-        npy_PyErr_ChainExceptionsCause(exc_type, exc_value, traceback);
-        return -1;
-    }
-    else {
-        /* `PyErr_Fetch` cleared the original error, delete the references */
-        Py_DECREF(exc_type);
-        Py_XDECREF(exc_value);
-        Py_XDECREF(traceback);
-        return 0;
-    }
-}
-
-
  /*NUMPY_API*/
  NPY_NO_EXPORT PyObject *
  PyArray_FromInterface(PyObject *origin)
@@ -2218,19 +2236,11 @@ PyArray_FromInterface(PyObject *origin)
      npy_intp dims[NPY_MAXDIMS], strides[NPY_MAXDIMS];
      int dataflags = NPY_ARRAY_BEHAVED;
  
-    iface = PyArray_LookupSpecial_OnInstance(origin, "__array_interface__");
+    iface = PyArray_LookupSpecial_OnInstance(origin, npy_ma_str_array_interface);
  
      if (iface == NULL) {
          if (PyErr_Occurred()) {
-            if (PyErr_ExceptionMatches(PyExc_RecursionError) ||
-                    PyErr_ExceptionMatches(PyExc_MemoryError)) {
-                /* RecursionError and MemoryError are considered fatal */
-                return NULL;
-            }
-            if (deprecated_lookup_error_clearing(
-                    Py_TYPE(origin), "__array_interface__") < 0) {
-                return NULL;
-            }
+            return NULL;
          }
          return Py_NotImplemented;
      }
@@ -2507,18 +2517,10 @@ PyArray_FromArrayAttr_int(
      PyObject *new;
      PyObject *array_meth;
  
-    array_meth = PyArray_LookupSpecial_OnInstance(op, "__array__");
+    array_meth = PyArray_LookupSpecial_OnInstance(op, npy_ma_str_array);
      if (array_meth == NULL) {
          if (PyErr_Occurred()) {
-            if (PyErr_ExceptionMatches(PyExc_RecursionError) ||
-                PyErr_ExceptionMatches(PyExc_MemoryError)) {
-                /* RecursionError and MemoryError are considered fatal */
-                return NULL;
-            }
-            if (deprecated_lookup_error_clearing(
-                    Py_TYPE(op), "__array__") < 0) {
-                return NULL;
-            }
+            return NULL;
          }
          return Py_NotImplemented;
      }
@@ -3685,15 +3687,16 @@ PyArray_FromBuffer(PyObject *buf, PyArray_Descr *type,
      }
  
      /*
-     * The array check is probably unnecessary.  It preserves the base for
-     * arrays.  This is the "old" buffer protocol, which had no release logic.
-     * (It was assumed that the result is always a view.)
-     *
-     * NOTE: We could also check if `bf_releasebuffer` is defined which should
-     *       be the most precise and safe thing to do.  But that should only be
-     *       necessary if unexpected backcompat issues are found downstream.
+     * If the object supports `releasebuffer`, the new buffer protocol allows
+     * tying the memory's lifetime to the `Py_buffer view`.
+     * NumPy cannot hold on to the view itself (it is not an object) so it
+     * has to wrap the original object in a Python `memoryview` which deals
+     * with the lifetime management for us.
+     * For backwards compatibility of `arr.base` we try to avoid this when
+     * possible.  (For example, NumPy arrays will never get wrapped here!)
       */
-    if (!PyArray_Check(buf)) {
+    if (Py_TYPE(buf)->tp_as_buffer
+            && Py_TYPE(buf)->tp_as_buffer->bf_releasebuffer) {
          buf = PyMemoryView_FromObject(buf);
          if (buf == NULL) {
              return NULL;
@@ -3895,11 +3898,9 @@ PyArray_FromString(char *data, npy_intp slen, PyArray_Descr *dtype,
  NPY_NO_EXPORT PyObject *
  PyArray_FromIter(PyObject *obj, PyArray_Descr *dtype, npy_intp count)
  {
-    PyObject *value;
      PyObject *iter = NULL;
      PyArrayObject *ret = NULL;
      npy_intp i, elsize, elcount;
-    char *item, *new_data;
  
      if (dtype == NULL) {
          return NULL;
@@ -3911,6 +3912,7 @@ PyArray_FromIter(PyObject *obj, PyArray_Descr *dtype, npy_intp count)
      }
  
      if (PyDataType_ISUNSIZED(dtype)) {
+        /* If this error is removed, the `ret` allocation may need fixing */
          PyErr_SetString(PyExc_ValueError,
                  "Must specify length when using variable-size data-type.");
          goto done;
@@ -3928,38 +3930,43 @@ PyArray_FromIter(PyObject *obj, PyArray_Descr *dtype, npy_intp count)
      elsize = dtype->elsize;
  
      /*
-     * We would need to alter the memory RENEW code to decrement any
-     * reference counts before throwing away any memory.
+     * Note that PyArray_DESCR(ret) may not match dtype.  There are exactly
+     * two cases where this can happen: empty strings/bytes/void (rejected
+     * above) and subarray dtypes (supported by sticking with `dtype`).
       */
-    if (PyDataType_REFCHK(dtype)) {
-        PyErr_SetString(PyExc_ValueError,
-                "cannot create object arrays from iterator");
-        goto done;
-    }
-
+    Py_INCREF(dtype);
      ret = (PyArrayObject *)PyArray_NewFromDescr(&PyArray_Type, dtype, 1,
                                                  &elcount, NULL,NULL, 0, NULL);
-    dtype = NULL;
      if (ret == NULL) {
          goto done;
      }
-    for (i = 0; (i < count || count == -1) &&
-             (value = PyIter_Next(iter)); i++) {
-        if (i >= elcount && elsize != 0) {
+
+    char *item = PyArray_BYTES(ret);
+    for (i = 0; i < count || count == -1; i++, item += elsize) {
+        PyObject *value = PyIter_Next(iter);
+        if (value == NULL) {
+            if (PyErr_Occurred()) {
+                /* Fetching next item failed rather than exhausting iterator */
+                goto done;
+            }
+            break;
+        }
+
+        if (NPY_UNLIKELY(i >= elcount) && elsize != 0) {
+            char *new_data = NULL;
              npy_intp nbytes;
              /*
                Grow PyArray_DATA(ret):
                this is similar for the strategy for PyListObject, but we use
                50% overallocation => 0, 4, 8, 14, 23, 36, 56, 86 ...
+              TODO: The loadtxt code now uses a `growth` helper that would
+                    be suitable to reuse here.
              */
              elcount = (i >> 1) + (i < 4 ? 4 : 2) + i;
              if (!npy_mul_with_overflow_intp(&nbytes, elcount, elsize)) {
                  /* The handler is always valid */
-                new_data = PyDataMem_UserRENEW(PyArray_DATA(ret), nbytes,
-                                  PyArray_HANDLER(ret));
-            }
-            else {
-                new_data = NULL;
+                new_data = PyDataMem_UserRENEW(
+                        PyArray_BYTES(ret), nbytes, PyArray_HANDLER(ret));
              }
              if (new_data == NULL) {
                  PyErr_SetString(PyExc_MemoryError,
@@ -3968,44 +3975,66 @@ PyArray_FromIter(PyObject *obj, PyArray_Descr *dtype, npy_intp count)
                  goto done;
              }
              ((PyArrayObject_fields *)ret)->data = new_data;
+            /* resize array for cleanup: */
+            PyArray_DIMS(ret)[0] = elcount;
+            /* Reset `item` pointer to point into realloc'd chunk */
+            item = new_data + i * elsize;
+            if (PyDataType_FLAGCHK(dtype, NPY_NEEDS_INIT)) {
+                /* Initialize new chunk: */
+                memset(item, 0, nbytes - i * elsize);
+            }
          }
-        PyArray_DIMS(ret)[0] = i + 1;
  
-        if (((item = index2ptr(ret, i)) == NULL) ||
-                PyArray_SETITEM(ret, item, value) == -1) {
+        if (PyArray_Pack(dtype, item, value) < 0) {
              Py_DECREF(value);
              goto done;
          }
          Py_DECREF(value);
      }
  
-
-    if (PyErr_Occurred()) {
-        goto done;
-    }
      if (i < count) {
-        PyErr_SetString(PyExc_ValueError,
-                "iterator too short");
+        PyErr_Format(PyExc_ValueError,
+                "iterator too short: Expected %zd but iterator had only %zd "
+                "items.", (Py_ssize_t)count, (Py_ssize_t)i);
          goto done;
      }
  
      /*
-     * Realloc the data so that don't keep extra memory tied up
-     * (assuming realloc is reasonably good about reusing space...)
+     * Realloc the data so that we don't keep extra memory tied up and fix
+     * the array's first dimension (there could be more than one).
       */
      if (i == 0 || elsize == 0) {
          /* The size cannot be zero for realloc. */
-        goto done;
      }
-    /* The handler is always valid */
-    new_data = PyDataMem_UserRENEW(PyArray_DATA(ret), i * elsize,
-                                   PyArray_HANDLER(ret));
-    if (new_data == NULL) {
-        PyErr_SetString(PyExc_MemoryError,
-                "cannot allocate array memory");
-        goto done;
+    else {
+        /* Resize array to actual final size (it may be too large) */
+        /* The handler is always valid */
+        char *new_data = PyDataMem_UserRENEW(
+                PyArray_DATA(ret), i * elsize, PyArray_HANDLER(ret));
+
+        if (new_data == NULL) {
+            PyErr_SetString(PyExc_MemoryError,
+                    "cannot allocate array memory");
+            goto done;
+        }
+        ((PyArrayObject_fields *)ret)->data = new_data;
+
+        if (count < 0 || NPY_RELAXED_STRIDES_DEBUG) {
+            /*
+             * If the count was smaller than zero or NPY_RELAXED_STRIDES_DEBUG
+             * was active, the strides may be all 0 or intentionally mangled
+             * (even in the later dimensions for `count < 0`!)
+             * Thus, fix all strides here again for C-contiguity.
+             */
+            int oflags;
+            _array_fill_strides(
+                    PyArray_STRIDES(ret), PyArray_DIMS(ret), PyArray_NDIM(ret),
+                    PyArray_ITEMSIZE(ret), NPY_ARRAY_C_CONTIGUOUS, &oflags);
+            PyArray_STRIDES(ret)[0] = elsize;
+            assert(oflags & NPY_ARRAY_C_CONTIGUOUS);
+        }
      }
-    ((PyArrayObject_fields *)ret)->data = new_data;
+    PyArray_DIMS(ret)[0] = i;
  
   done:
      Py_XDECREF(iter);
@@ -4041,7 +4070,6 @@ _array_fill_strides(npy_intp *strides, npy_intp const *dims, int nd, size_t item
                      int inflag, int *objflags)
  {
      int i;
-#if NPY_RELAXED_STRIDES_CHECKING
      npy_bool not_cf_contig = 0;
      npy_bool nod = 0; /* A dim != 1 was found */
  
@@ -4055,7 +4083,6 @@ _array_fill_strides(npy_intp *strides, npy_intp const *dims, int nd, size_t item
              nod = 1;
          }
      }
-#endif /* NPY_RELAXED_STRIDES_CHECKING */
  
      /* Only make Fortran strides if not contiguous as well */
      if ((inflag & (NPY_ARRAY_F_CONTIGUOUS|NPY_ARRAY_C_CONTIGUOUS)) ==
@@ -4065,7 +4092,6 @@ _array_fill_strides(npy_intp *strides, npy_intp const *dims, int nd, size_t item
              if (dims[i]) {
                  itemsize *= dims[i];
              }
-#if NPY_RELAXED_STRIDES_CHECKING
              else {
                  not_cf_contig = 0;
              }
@@ -4075,13 +4101,8 @@ _array_fill_strides(npy_intp *strides, npy_intp const *dims, int nd, size_t item
                  strides[i] = NPY_MAX_INTP;
              }
  #endif /* NPY_RELAXED_STRIDES_DEBUG */
-#endif /* NPY_RELAXED_STRIDES_CHECKING */
          }
-#if NPY_RELAXED_STRIDES_CHECKING
          if (not_cf_contig) {
-#else /* not NPY_RELAXED_STRIDES_CHECKING */
-        if ((nd > 1) && ((strides[0] != strides[nd-1]) || (dims[nd-1] > 1))) {
-#endif /* not NPY_RELAXED_STRIDES_CHECKING */
              *objflags = ((*objflags)|NPY_ARRAY_F_CONTIGUOUS) &
                                              ~NPY_ARRAY_C_CONTIGUOUS;
          }
@@ -4095,7 +4116,6 @@ _array_fill_strides(npy_intp *strides, npy_intp const *dims, int nd, size_t item
              if (dims[i]) {
                  itemsize *= dims[i];
              }
-#if NPY_RELAXED_STRIDES_CHECKING
              else {
                  not_cf_contig = 0;
              }
@@ -4105,13 +4125,8 @@ _array_fill_strides(npy_intp *strides, npy_intp const *dims, int nd, size_t item
                  strides[i] = NPY_MAX_INTP;
              }
  #endif /* NPY_RELAXED_STRIDES_DEBUG */
-#endif /* NPY_RELAXED_STRIDES_CHECKING */
          }
-#if NPY_RELAXED_STRIDES_CHECKING
          if (not_cf_contig) {
-#else /* not NPY_RELAXED_STRIDES_CHECKING */
-        if ((nd > 1) && ((strides[0] != strides[nd-1]) || (dims[0] > 1))) {
-#endif /* not NPY_RELAXED_STRIDES_CHECKING */
              *objflags = ((*objflags)|NPY_ARRAY_C_CONTIGUOUS) &
                                              ~NPY_ARRAY_F_CONTIGUOUS;
          }
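Note on the PyArray_FromIter hunk above: when the iterator yields more items than the current allocation holds, the buffer is regrown with `elcount = (i >> 1) + (i < 4 ? 4 : 2) + i`, which the in-code comment describes as 50% over-allocation. The standalone sketch below reproduces only that arithmetic, so the sequence 4, 8, 14, 23, 36, 56, 86 from the comment can be checked by hand. The names `next_capacity` and `count_reallocations` are hypothetical and exist only for this illustration; they are not NumPy functions, and the sketch assumes growth starts from an empty buffer (the `count == -1` case).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Capacity to grow to once `used` items have filled the buffer.
 * Same expression as in the hunk above: about 50% extra plus a small
 * constant (4 for tiny buffers, 2 otherwise).
 */
static ptrdiff_t
next_capacity(ptrdiff_t used)
{
    assert(used >= 0);
    return (used >> 1) + (used < 4 ? 4 : 2) + used;
}

/*
 * Hypothetical helper: count how many reallocations are needed to append
 * `n` items one at a time, starting from an empty buffer.
 */
static int
count_reallocations(ptrdiff_t n)
{
    ptrdiff_t capacity = 0;
    int reallocs = 0;

    for (ptrdiff_t i = 0; i < n; i++) {
        if (i >= capacity) {
            /* Buffer is full: grow it before storing item `i`. */
            capacity = next_capacity(i);
            reallocs++;
        }
    }
    return reallocs;
}
```

Under these assumptions, appending 86 items one at a time costs 7 reallocations rather than 86, which is the reason for growing geometrically instead of by one element per item.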
diff --git a/numpy/core/src/multiarray/datetime.c b/numpy/core/src/multiarray/datetime.c
index e0064c017361bce1d29c6331f7457b1f53d4bb18..99096be5640470446e116cdfc0977f2f2ef56fa5 100644
--- a/numpy/core/src/multiarray/datetime.c
+++ b/numpy/core/src/multiarray/datetime.c
@@ -3746,8 +3746,6 @@ find_object_datetime_type(PyObject *obj, int type_num)
  }
  
  
-
-
  /*
   * Describes casting within datetimes or timedelta
   */
@@ -3756,13 +3754,14 @@ time_to_time_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[2]),
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      /* This is a within-dtype cast, which currently must handle byteswapping */
      Py_INCREF(given_descrs[0]);
      loop_descrs[0] = given_descrs[0];
      if (given_descrs[1] == NULL) {
-        loop_descrs[1] = ensure_dtype_nbo(given_descrs[0]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(given_descrs[0]);
      }
      else {
          Py_INCREF(given_descrs[1]);
@@ -3772,14 +3771,14 @@ time_to_time_resolve_descriptors(
      int is_timedelta = given_descrs[0]->type_num == NPY_TIMEDELTA;
  
      if (given_descrs[0] == given_descrs[1]) {
-        return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+        *view_offset = 0;
+        return NPY_NO_CASTING;
      }
  
-    NPY_CASTING byteorder_may_allow_view = 0;
-    if (PyDataType_ISNOTSWAPPED(loop_descrs[0]) ==
-            PyDataType_ISNOTSWAPPED(loop_descrs[1])) {
-        byteorder_may_allow_view = _NPY_CAST_IS_VIEW;
-    }
+    npy_bool byteorder_may_allow_view = (
+            PyDataType_ISNOTSWAPPED(loop_descrs[0])
+            == PyDataType_ISNOTSWAPPED(loop_descrs[1]));
+
      PyArray_DatetimeMetaData *meta1, *meta2;
      meta1 = get_datetime_metadata_from_dtype(loop_descrs[0]);
      assert(meta1 != NULL);
@@ -3798,12 +3797,16 @@ time_to_time_resolve_descriptors(
              ((meta2->base >= 7) && (meta1->base - meta2->base == 3)
                && ((meta1->num / meta2->num) == 1000000000))) {
          if (byteorder_may_allow_view) {
-            return NPY_NO_CASTING | byteorder_may_allow_view;
+            *view_offset = 0;
+            return NPY_NO_CASTING;
          }
          return NPY_EQUIV_CASTING;
      }
      else if (meta1->base == NPY_FR_GENERIC) {
-        return NPY_SAFE_CASTING | byteorder_may_allow_view;
+        if (byteorder_may_allow_view) {
+            *view_offset = 0;
+        }
+        return NPY_SAFE_CASTING;
      }
      else if (meta2->base == NPY_FR_GENERIC) {
          /* TODO: This is actually an invalid cast (casting will error) */
@@ -3877,8 +3880,8 @@ time_to_time_get_loop(
          return 0;
      }
  
-    PyArray_Descr *src_wrapped_dtype = ensure_dtype_nbo(descrs[0]);
-    PyArray_Descr *dst_wrapped_dtype = ensure_dtype_nbo(descrs[1]);
+    PyArray_Descr *src_wrapped_dtype = NPY_DT_CALL_ensure_canonical(descrs[0]);
+    PyArray_Descr *dst_wrapped_dtype = NPY_DT_CALL_ensure_canonical(descrs[1]);
  
      int needs_api = 0;
      int res = wrap_aligned_transferfunction(
@@ -3903,7 +3906,7 @@ datetime_to_timedelta_resolve_descriptors(
          PyArray_Descr *given_descrs[2],
          PyArray_Descr *loop_descrs[2])
  {
-    loop_descrs[0] = ensure_dtype_nbo(given_descrs[0]);
+    loop_descrs[0] = NPY_DT_CALL_ensure_canonical(given_descrs[0]);
      if (loop_descrs[0] == NULL) {
          return -1;
      }
@@ -3913,7 +3916,7 @@ datetime_to_timedelta_resolve_descriptors(
          loop_descrs[1] = create_datetime_dtype(dtypes[1]->type_num, meta);
      }
      else {
-        loop_descrs[1] = ensure_dtype_nbo(given_descrs[1]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(given_descrs[1]);
      }
      if (loop_descrs[1] == NULL) {
          Py_DECREF(loop_descrs[0]);
@@ -3931,10 +3934,11 @@ datetime_to_timedelta_resolve_descriptors(
  /* In the current setup both strings and unicode casts support all outputs */
  static NPY_CASTING
  time_to_string_resolve_descriptors(
-        PyArrayMethodObject *self,
+        PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr **given_descrs,
-        PyArray_Descr **loop_descrs)
+        PyArray_Descr **loop_descrs,
+        npy_intp *NPY_UNUSED(view_offset))
  {
      if (given_descrs[1] != NULL && dtypes[0]->type_num == NPY_DATETIME) {
          /*
@@ -3969,7 +3973,7 @@ time_to_string_resolve_descriptors(
          loop_descrs[1]->elsize = size;
      }
  
-    loop_descrs[0] = ensure_dtype_nbo(given_descrs[0]);
+    loop_descrs[0] = NPY_DT_CALL_ensure_canonical(given_descrs[0]);
      if (loop_descrs[0] == NULL) {
          Py_DECREF(loop_descrs[1]);
          return -1;
@@ -4013,7 +4017,8 @@ string_to_datetime_cast_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      if (given_descrs[1] == NULL) {
          /* NOTE: This doesn't actually work, and will error during the cast */
@@ -4023,7 +4028,7 @@ string_to_datetime_cast_resolve_descriptors(
          }
      }
      else {
-        loop_descrs[1] = ensure_dtype_nbo(given_descrs[1]);
+        loop_descrs[1] = NPY_DT_CALL_ensure_canonical(given_descrs[1]);
          if (loop_descrs[1] == NULL) {
              return -1;
          }
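Note on the time_to_time_resolve_descriptors hunks above: the `_NPY_CAST_IS_VIEW` bit that used to be OR-ed into the returned casting level is replaced by a separate `view_offset` out-parameter, which is set to 0 when the cast can be done as a view. The sketch below mimics that pattern in plain C. Everything in it is a stand-in: the enum, the function name `resolve_cast`, its boolean parameters, and in particular the `NO_VIEW_OFFSET` sentinel are assumptions made for illustration, since the diff does not show how callers initialise `view_offset`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for NumPy's casting levels (illustration only). */
typedef enum {
    NO_CASTING = 0,
    EQUIV_CASTING = 1,
    SAFE_CASTING = 2
} casting_level;

/* Assumed sentinel meaning "this cast cannot be done as a view". */
#define NO_VIEW_OFFSET INTPTR_MIN

/*
 * Sketch of a within-dtype resolver.  The return value carries only the
 * casting level; whether the result may be a view is reported through
 * `view_offset`.  The function leaves `*view_offset` untouched when no view
 * is possible, so the caller is expected to initialise it to the sentinel.
 */
static casting_level
resolve_cast(int same_unit, int same_byteorder, intptr_t *view_offset)
{
    if (same_unit) {
        if (same_byteorder) {
            *view_offset = 0;      /* identical layout: a view at offset 0 */
            return NO_CASTING;
        }
        return EQUIV_CASTING;      /* byte swap required, so not a view */
    }
    /* Different unit (for example a generic source unit in the diff). */
    if (same_byteorder) {
        *view_offset = 0;
    }
    return SAFE_CASTING;
}
```

Keeping the offset out of the return value means the casting level can be compared directly, without masking off a flag bit first.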
diff --git a/numpy/core/src/multiarray/descriptor.c b/numpy/core/src/multiarray/descriptor.c
index 07abc755fab3b9828bdf01b47e5158c39836123f..a23ee6d2c6805196fbacd54817b45bf77e505568 100644
--- a/numpy/core/src/multiarray/descriptor.c
+++ b/numpy/core/src/multiarray/descriptor.c
@@ -22,18 +22,6 @@
  #include "npy_buffer.h"
  #include "dtypemeta.h"
  
-/*
- * offset:    A starting offset.
- * alignment: A power-of-two alignment.
- *
- * This macro returns the smallest value >= 'offset'
- * that is divisible by 'alignment'. Because 'alignment'
- * is a power of two and integers are twos-complement,
- * it is possible to use some simple bit-fiddling to do this.
- */
-#define NPY_NEXT_ALIGNED_OFFSET(offset, alignment) \
-                (((offset) + (alignment) - 1) & (-(alignment)))
-
  #ifndef PyDictProxy_Check
  #define PyDictProxy_Check(obj) (Py_TYPE(obj) == &PyDictProxy_Type)
  #endif
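The comment on `NPY_NEXT_ALIGNED_OFFSET` (moved from this file into `descriptor.h`) describes rounding an offset up to a power-of-two alignment with a bit mask. A minimal sketch of the same idea as a plain function follows. The name `next_aligned_offset` is hypothetical and not part of NumPy. It uses `~(alignment - 1)` as the mask, which for unsigned values equals the `-(alignment)` used by the macro.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of NumPy): return the smallest value
 * >= `offset` that is divisible by `alignment`.
 * `alignment` must be a power of two.
 */
static size_t
next_aligned_offset(size_t offset, size_t alignment)
{
    /*
     * `alignment - 1` has every bit below the alignment bit set.
     * Adding it and then clearing those bits rounds `offset` up.
     */
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, an offset of 5 with an alignment of 4 rounds up to 8, while an offset that is already a multiple of the alignment is returned unchanged.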
diff --git a/numpy/core/src/multiarray/descriptor.h b/numpy/core/src/multiarray/descriptor.h

index f832958dae902f1b449ed3230a4c34bb3cb76b43..7e6f212f287990a7792f1478ac29785e496360db 100644 (file)
--- a/numpy/core/src/multiarray/descriptor.h
+++ b/numpy/core/src/multiarray/descriptor.h
@@ -6,6 +6,18 @@ NPY_NO_EXPORT PyObject *arraydescr_protocol_typestr_get(
  NPY_NO_EXPORT PyObject *arraydescr_protocol_descr_get(
          PyArray_Descr *self, void *);
  
+/*
+ * offset:    A starting offset.
+ * alignment: A power-of-two alignment.
+ *
+ * This macro returns the smallest value >= 'offset'
+ * that is divisible by 'alignment'. Because 'alignment'
+ * is a power of two and integers are twos-complement,
+ * it is possible to use some simple bit-fiddling to do this.
+ */
+#define NPY_NEXT_ALIGNED_OFFSET(offset, alignment) \
+                (((offset) + (alignment) - 1) & (-(alignment)))
+
  NPY_NO_EXPORT PyObject *
  array_set_typeDict(PyObject *NPY_UNUSED(ignored), PyObject *args);
  
diff --git a/numpy/core/src/multiarray/dlpack.c b/numpy/core/src/multiarray/dlpack.c

index 37b0b6dcfda83f2f26f96fdad714460525e12a3e..d5b1af101bab7dc1925a90595d89a7143a7b4dfb 100644 (file)
--- a/numpy/core/src/multiarray/dlpack.c
+++ b/numpy/core/src/multiarray/dlpack.c
@@ -15,8 +15,7 @@ static void
  array_dlpack_deleter(DLManagedTensor *self)
  {
      PyArrayObject *array = (PyArrayObject *)self->manager_ctx;
-    // This will also free the strides as it's one allocation.
-    PyMem_Free(self->dl_tensor.shape);
+    // This will also free the shape and strides as it's one allocation.
      PyMem_Free(self);
      Py_XDECREF(array);
  }
@@ -197,12 +196,17 @@ array_dlpack(PyArrayObject *self,
          return NULL;
      }
  
-    DLManagedTensor *managed = PyMem_Malloc(sizeof(DLManagedTensor));
-    if (managed == NULL) {
+    // ensure alignment
+    int offset = sizeof(DLManagedTensor) % sizeof(void *);
+    void *ptr = PyMem_Malloc(sizeof(DLManagedTensor) + offset +
+        (sizeof(int64_t) * ndim * 2));
+    if (ptr == NULL) {
          PyErr_NoMemory();
          return NULL;
      }
  
+    DLManagedTensor *managed = ptr;
+
      /*
       * Note: the `dlpack.h` header suggests/standardizes that `data` must be
       * 256-byte aligned.  We ignore this intentionally, because `__dlpack__`
@@ -221,12 +225,8 @@ array_dlpack(PyArrayObject *self,
      managed->dl_tensor.device = device;
      managed->dl_tensor.dtype = managed_dtype;
  
-    int64_t *managed_shape_strides = PyMem_Malloc(sizeof(int64_t) * ndim * 2);
-    if (managed_shape_strides == NULL) {
-        PyErr_NoMemory();
-        PyMem_Free(managed);
-        return NULL;
-    }
+    int64_t *managed_shape_strides = (int64_t *)((char *)ptr +
+        sizeof(DLManagedTensor) + offset);
  
      int64_t *managed_shape = managed_shape_strides;
      int64_t *managed_strides = managed_shape_strides + ndim;
@@ -249,8 +249,7 @@ array_dlpack(PyArrayObject *self,
      PyObject *capsule = PyCapsule_New(managed, NPY_DLPACK_CAPSULE_NAME,
              dlpack_capsule_deleter);
      if (capsule == NULL) {
-        PyMem_Free(managed);
-        PyMem_Free(managed_shape_strides);
+        PyMem_Free(ptr);
          return NULL;
      }
  
@@ -270,7 +269,7 @@ array_dlpack_device(PyArrayObject *self, PyObject *NPY_UNUSED(args))
  }
  
  NPY_NO_EXPORT PyObject *
-_from_dlpack(PyObject *NPY_UNUSED(self), PyObject *obj) {
+from_dlpack(PyObject *NPY_UNUSED(self), PyObject *obj) {
      PyObject *capsule = PyObject_CallMethod((PyObject *)obj->ob_type,
              "__dlpack__", "O", obj);
      if (capsule == NULL) {
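The `array_dlpack` hunks above replace two allocations (the managed tensor and its shape/strides buffer) with one, so that a single `PyMem_Free` releases everything. The general pattern can be sketched as below. This is an illustration under assumptions, not a copy of the NumPy code: `tensor_header` and `tensor_header_new` are hypothetical names, plain `malloc`/`free` stand in for `PyMem_Malloc`/`PyMem_Free`, and the padding is computed by rounding the header size up to a multiple of `sizeof(int64_t)`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tensor header; not the real DLManagedTensor. */
typedef struct {
    int ndim;
    int64_t *shape;
    int64_t *strides;
} tensor_header;

/*
 * Allocate the header and its shape/strides arrays as one block.
 * Returns NULL on allocation failure.  Release with a single free().
 */
static tensor_header *
tensor_header_new(int ndim)
{
    /* Round the header size up so the int64_t arrays start aligned. */
    size_t align = sizeof(int64_t);
    size_t header = (sizeof(tensor_header) + align - 1) / align * align;

    char *block = malloc(header + sizeof(int64_t) * (size_t)ndim * 2);
    if (block == NULL) {
        return NULL;
    }

    tensor_header *t = (tensor_header *)block;
    t->ndim = ndim;
    /* Shape lives right after the padded header, strides right after shape. */
    t->shape = (int64_t *)(block + header);
    t->strides = t->shape + ndim;
    return t;
}
```

Because the arrays live inside the same block as the header, every error path and the deleter need only one free call, which is the simplification visible in the hunks above.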
diff --git a/numpy/core/src/multiarray/dragon4.c b/numpy/core/src/multiarray/dragon4.c

index ce02936152284ee1f7c8555568a86d796b3a7388..5d245b106f91b5cf409ac9e5e21486c83cbe0786 100644 (file)
--- a/numpy/core/src/multiarray/dragon4.c
+++ b/numpy/core/src/multiarray/dragon4.c
@@ -1809,9 +1809,16 @@ FormatPositional(char *buffer, npy_uint32 bufferSize, BigInt *mantissa,
              pos--;
              numFractionDigits--;
          }
-        if (trim_mode == TrimMode_LeaveOneZero && buffer[pos-1] == '.') {
-            buffer[pos++] = '0';
-            numFractionDigits++;
+        if (buffer[pos-1] == '.') {
+            /* in TrimMode_LeaveOneZero, add trailing 0 back */
+            if (trim_mode == TrimMode_LeaveOneZero){
+                buffer[pos++] = '0';
+                numFractionDigits++;
+            }
+            /* in TrimMode_DptZeros, remove trailing decimal point */
+            else if (trim_mode == TrimMode_DptZeros) {
+                    pos--;
+            }
          }
      }
  
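The `FormatPositional` hunk above changes what happens when trimming trailing zeros leaves a bare decimal point: one mode puts a single `0` back, the other drops the point as well. A simplified sketch of that decision on a NUL-terminated string follows. The enum, its constants, and `trim_trailing_zeros` are hypothetical names chosen for illustration, and the real function works on a position inside a larger buffer rather than on a whole string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical trim modes, loosely mirroring the two cases in the hunk. */
typedef enum {
    TRIM_LEAVE_ONE_ZERO,  /* "2.000" becomes "2.0" */
    TRIM_DPT_ZEROS        /* "2.000" becomes "2"   */
} trim_mode;

/*
 * Trim trailing zeros of a decimal string in place and return the new
 * length.  Strings without a decimal point are left untouched.
 */
static size_t
trim_trailing_zeros(char *buf, trim_mode mode)
{
    size_t pos = strlen(buf);
    if (strchr(buf, '.') == NULL) {
        return pos;
    }
    while (pos > 0 && buf[pos - 1] == '0') {
        pos--;
    }
    if (pos > 0 && buf[pos - 1] == '.') {
        if (mode == TRIM_LEAVE_ONE_ZERO) {
            /* Add one trailing zero back (the slot previously held a '0'). */
            buf[pos++] = '0';
        }
        else if (mode == TRIM_DPT_ZEROS) {
            /* Remove the dangling decimal point as well. */
            pos--;
        }
    }
    buf[pos] = '\0';
    return pos;
}
```

Values with a non-zero fraction, such as `1.500`, end in a digit after trimming, so neither branch applies and both modes give `1.5`.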
diff --git a/numpy/core/src/multiarray/dtype_transfer.c b/numpy/core/src/multiarray/dtype_transfer.c

index 78704f6eda0a125e95a8df05ec486d40f7078b3a..18de5d1321d2c58f5c099eb7123cabc79280d9c9 100644 (file)
--- a/numpy/core/src/multiarray/dtype_transfer.c
+++ b/numpy/core/src/multiarray/dtype_transfer.c
@@ -237,7 +237,7 @@ NPY_NO_EXPORT int
  any_to_object_get_loop(
          PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *strides,
+        const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
@@ -343,7 +343,7 @@ NPY_NO_EXPORT int
  object_to_any_get_loop(
          PyArrayMethod_Context *context,
          int NPY_UNUSED(aligned), int move_references,
-        npy_intp *NPY_UNUSED(strides),
+        const npy_intp *NPY_UNUSED(strides),
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
@@ -2991,7 +2991,8 @@ _strided_to_strided_multistep_cast(
   * transferfunction and transferdata.
   */
  static NPY_INLINE int
-init_cast_info(NPY_cast_info *cast_info, NPY_CASTING *casting,
+init_cast_info(
+        NPY_cast_info *cast_info, NPY_CASTING *casting, npy_intp *view_offset,
          PyArray_Descr *src_dtype, PyArray_Descr *dst_dtype, int main_step)
  {
      PyObject *meth = PyArray_GetCastingImpl(
@@ -3016,7 +3017,8 @@ init_cast_info(NPY_cast_info *cast_info, NPY_CASTING *casting,
      PyArray_Descr *in_descr[2] = {src_dtype, dst_dtype};
  
      *casting = cast_info->context.method->resolve_descriptors(
-            cast_info->context.method, dtypes, in_descr, cast_info->descriptors);
+            cast_info->context.method, dtypes,
+            in_descr, cast_info->descriptors, view_offset);
      if (NPY_UNLIKELY(*casting < 0)) {
          if (!PyErr_Occurred()) {
              PyErr_Format(PyExc_TypeError,
@@ -3071,6 +3073,9 @@ _clear_cast_info_after_get_loop_failure(NPY_cast_info *cast_info)
   * transfer function from each casting implementation (ArrayMethod).
   * May set the transfer function to NULL when the cast can be achieved using
   * a view.
+ * TODO: Expand the view functionality for general offsets, not just 0:
+ *       Partial casts could be skipped also for `view_offset != 0`.
+ *
   * The `out_needs_api` flag must be initialized.
   *
   * NOTE: In theory casting errors here could be slightly misleading in case
@@ -3101,9 +3106,12 @@ define_cast_for_descrs(
      castdata.main.func = NULL;
      castdata.to.func = NULL;
      castdata.from.func = NULL;
+    /* `view_offset` passed to `init_cast_info` but unused for the main cast */
+    npy_intp view_offset = NPY_MIN_INTP;
      NPY_CASTING casting = -1;
  
-    if (init_cast_info(cast_info, &casting, src_dtype, dst_dtype, 1) < 0) {
+    if (init_cast_info(
+            cast_info, &casting, &view_offset, src_dtype, dst_dtype, 1) < 0) {
          return -1;
      }
  
@@ -3123,17 +3131,18 @@ define_cast_for_descrs(
       */
      if (NPY_UNLIKELY(src_dtype != cast_info->descriptors[0] || must_wrap)) {
          NPY_CASTING from_casting = -1;
+        npy_intp from_view_offset = NPY_MIN_INTP;
          /* Cast function may not support the input, wrap if necessary */
          if (init_cast_info(
-                &castdata.from, &from_casting,
+                &castdata.from, &from_casting, &from_view_offset,
                  src_dtype, cast_info->descriptors[0], 0) < 0) {
              goto fail;
          }
          casting = PyArray_MinCastSafety(casting, from_casting);
  
          /* Prepare the actual cast (if necessary): */
-        if (from_casting & _NPY_CAST_IS_VIEW && !must_wrap) {
-            /* This step is not necessary and can be skipped. */
+        if (from_view_offset == 0 && !must_wrap) {
+            /* This step is not necessary and can be skipped */
              castdata.from.func = &_dec_src_ref_nop;  /* avoid NULL */
              NPY_cast_info_xfree(&castdata.from);
          }
@@ -3161,16 +3170,17 @@ define_cast_for_descrs(
       */
      if (NPY_UNLIKELY(dst_dtype != cast_info->descriptors[1] || must_wrap)) {
          NPY_CASTING to_casting = -1;
+        npy_intp to_view_offset = NPY_MIN_INTP;
          /* Cast function may not support the output, wrap if necessary */
          if (init_cast_info(
-                &castdata.to, &to_casting,
+                &castdata.to, &to_casting, &to_view_offset,
                  cast_info->descriptors[1], dst_dtype,  0) < 0) {
              goto fail;
          }
          casting = PyArray_MinCastSafety(casting, to_casting);
  
          /* Prepare the actual cast (if necessary): */
-        if (to_casting & _NPY_CAST_IS_VIEW && !must_wrap) {
+        if (to_view_offset == 0 && !must_wrap) {
              /* This step is not necessary and can be skipped. */
              castdata.to.func = &_dec_src_ref_nop;  /* avoid NULL */
              NPY_cast_info_xfree(&castdata.to);
@@ -3383,8 +3393,8 @@ wrap_aligned_transferfunction(
   * For casts between two dtypes with the same type (within DType casts)
   * it also wraps the `copyswapn` function.
   *
- * This function is called called from `ArrayMethod.get_loop()` when a
- * specialized cast function is missing.
+ * This function is called from `ArrayMethod.get_loop()` when a specialized
+ * cast function is missing.
   *
   * In general, the legacy cast functions do not support unaligned access,
   * so an ArrayMethod using this must signal that.  In a few places we do
@@ -3447,11 +3457,11 @@ get_wrapped_legacy_cast_function(int aligned,
       * If we are here, use the legacy code to wrap the above cast (which
       * does not support unaligned data) into copyswapn.
       */
-    PyArray_Descr *src_wrapped_dtype = ensure_dtype_nbo(src_dtype);
+    PyArray_Descr *src_wrapped_dtype = NPY_DT_CALL_ensure_canonical(src_dtype);
      if (src_wrapped_dtype == NULL) {
          goto fail;
      }
-    PyArray_Descr *dst_wrapped_dtype = ensure_dtype_nbo(dst_dtype);
+    PyArray_Descr *dst_wrapped_dtype = NPY_DT_CALL_ensure_canonical(dst_dtype);
      if (dst_wrapped_dtype == NULL) {
          goto fail;
      }
diff --git a/numpy/core/src/multiarray/dtype_transfer.h b/numpy/core/src/multiarray/dtype_transfer.h

index c7e0a029f990ed8a35d0de67ed4b5f1c5bf7131f..9ae332e385b1763fc1bd355c71adaed588492c3f 100644 (file)
--- a/numpy/core/src/multiarray/dtype_transfer.h
+++ b/numpy/core/src/multiarray/dtype_transfer.h
@@ -132,7 +132,7 @@ NPY_NO_EXPORT int
  any_to_object_get_loop(
          PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *strides,
+        const npy_intp *strides,
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags);
@@ -141,7 +141,7 @@ NPY_NO_EXPORT int
  object_to_any_get_loop(
          PyArrayMethod_Context *context,
          int NPY_UNUSED(aligned), int move_references,
-        npy_intp *NPY_UNUSED(strides),
+        const npy_intp *NPY_UNUSED(strides),
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags);
diff --git a/numpy/core/src/multiarray/dtypemeta.c b/numpy/core/src/multiarray/dtypemeta.c

index 53f38e8e8b1aeaa3be3d29dd2158765193fbb8f6..577478d2a1a03a32ca0bf216b16856ecffb0ffd9 100644 (file)
--- a/numpy/core/src/multiarray/dtypemeta.c
+++ b/numpy/core/src/multiarray/dtypemeta.c
@@ -12,6 +12,7 @@
  
  #include "common.h"
  #include "dtypemeta.h"
+#include "descriptor.h"
  #include "_datetime.h"
  #include "array_coercion.h"
  #include "scalartypes.h"
@@ -222,6 +223,23 @@ nonparametric_default_descr(PyArray_DTypeMeta *cls)
  }
  
  
+/*
+ * For most builtin (and legacy) dtypes, the canonical property means to
+ * ensure native byte-order.  (We do not care about metadata here.)
+ */
+static PyArray_Descr *
+ensure_native_byteorder(PyArray_Descr *descr)
+{
+    if (PyArray_ISNBO(descr->byteorder)) {
+        Py_INCREF(descr);
+        return descr;
+    }
+    else {
+        return PyArray_DescrNewByteorder(descr, NPY_NATIVE);
+    }
+}
+
+
  /* Ensure a copy of the singleton (just in case we do adapt it somewhere) */
  static PyArray_Descr *
  datetime_and_timedelta_default_descr(PyArray_DTypeMeta *cls)
@@ -265,10 +283,117 @@ static PyArray_Descr *
  string_unicode_common_instance(PyArray_Descr *descr1, PyArray_Descr *descr2)
  {
      if (descr1->elsize >= descr2->elsize) {
-        return ensure_dtype_nbo(descr1);
+        return NPY_DT_CALL_ensure_canonical(descr1);
      }
      else {
-        return ensure_dtype_nbo(descr2);
+        return NPY_DT_CALL_ensure_canonical(descr2);
+    }
+}
+
+
+static PyArray_Descr *
+void_ensure_canonical(PyArray_Descr *self)
+{
+    if (self->subarray != NULL) {
+        PyArray_Descr *new_base = NPY_DT_CALL_ensure_canonical(
+                self->subarray->base);
+        if (new_base == NULL) {
+            return NULL;
+        }
+        if (new_base == self->subarray->base) {
+            /* just return self, no need to modify */
+            Py_DECREF(new_base);
+            Py_INCREF(self);
+            return self;
+        }
+        PyArray_Descr *new = PyArray_DescrNew(self);
+        if (new == NULL) {
+            return NULL;
+        }
+        Py_SETREF(new->subarray->base, new_base);
+        return new;
+    }
+    else if (self->names != NULL) {
+        /*
+         * This branch is fairly complex, since it needs to build a new
+         * descriptor that is in canonical form.  This means that the new
+         * descriptor should be an aligned struct if the old one was, and
+         * otherwise it should be an unaligned struct.
+         * Any unnecessary empty space is stripped from the struct.
+         *
+         * TODO: In principle we could/should try to provide the identity when
+         *       no change is necessary. (Simple if we add a flag.)
+         */
+        Py_ssize_t field_num = PyTuple_GET_SIZE(self->names);
+
+        PyArray_Descr *new = PyArray_DescrNew(self);
+        if (new == NULL) {
+            return NULL;
+        }
+        Py_SETREF(new->fields, PyDict_New());
+        if (new->fields == NULL) {
+            Py_DECREF(new);
+            return NULL;
+        }
+        int aligned = PyDataType_FLAGCHK(new, NPY_ALIGNED_STRUCT);
+        new->flags = new->flags & ~NPY_FROM_FIELDS;
+        new->flags |= NPY_NEEDS_PYAPI;  /* always needed for field access */
+        int totalsize = 0;
+        int maxalign = 1;
+        for (Py_ssize_t i = 0; i < field_num; i++) {
+            PyObject *name = PyTuple_GET_ITEM(self->names, i);
+            PyObject *tuple = PyDict_GetItem(self->fields, name);
+            PyObject *new_tuple = PyTuple_New(PyTuple_GET_SIZE(tuple));
+            PyArray_Descr *field_descr = NPY_DT_CALL_ensure_canonical(
+                    (PyArray_Descr *)PyTuple_GET_ITEM(tuple, 0));
+            if (field_descr == NULL) {
+                Py_DECREF(new_tuple);
+                Py_DECREF(new);
+                return NULL;
+            }
+            new->flags |= field_descr->flags & NPY_FROM_FIELDS;
+            PyTuple_SET_ITEM(new_tuple, 0, (PyObject *)field_descr);
+
+            if (aligned) {
+                totalsize = NPY_NEXT_ALIGNED_OFFSET(
+                        totalsize, field_descr->alignment);
+                maxalign = PyArray_MAX(maxalign, field_descr->alignment);
+            }
+            PyObject *offset_obj = PyLong_FromLong(totalsize);
+            if (offset_obj == NULL) {
+                Py_DECREF(new_tuple);
+                Py_DECREF(new);
+                return NULL;
+            }
+            PyTuple_SET_ITEM(new_tuple, 1, (PyObject *)offset_obj);
+            if (PyTuple_GET_SIZE(tuple) == 3) {
+                /* Be sure to set all items in the tuple before using it */
+                PyObject *title = PyTuple_GET_ITEM(tuple, 2);
+                Py_INCREF(title);
+                PyTuple_SET_ITEM(new_tuple, 2, title);
+                if (PyDict_SetItem(new->fields, title, new_tuple) < 0) {
+                    Py_DECREF(new_tuple);
+                    Py_DECREF(new);
+                    return NULL;
+                }
+            }
+            if (PyDict_SetItem(new->fields, name, new_tuple) < 0) {
+                Py_DECREF(new_tuple);
+                Py_DECREF(new);
+                return NULL;
+            }
+            Py_DECREF(new_tuple);  /* Reference now owned by new->fields */
+            totalsize += field_descr->elsize;
+        }
+        totalsize = NPY_NEXT_ALIGNED_OFFSET(totalsize, maxalign);
+        new->elsize = totalsize;
+        new->alignment = maxalign;
+        return new;
+    }
+    else {
+        /* unstructured voids are always canonical. */
+        Py_INCREF(self);
+        return self;
      }
  }
  
@@ -276,26 +401,81 @@ string_unicode_common_instance(PyArray_Descr *descr1, PyArray_Descr *descr2)
  static PyArray_Descr *
  void_common_instance(PyArray_Descr *descr1, PyArray_Descr *descr2)
  {
-    /*
-     * We currently do not support promotion of void types unless they
-     * are equivalent.
-     */
-    if (!PyArray_CanCastTypeTo(descr1, descr2, NPY_EQUIV_CASTING)) {
-        if (descr1->subarray == NULL && descr1->names == NULL &&
-                descr2->subarray == NULL && descr2->names == NULL) {
+    if (descr1->subarray == NULL && descr1->names == NULL &&
+            descr2->subarray == NULL && descr2->names == NULL) {
+        if (descr1->elsize != descr2->elsize) {
              PyErr_SetString(PyExc_TypeError,
                      "Invalid type promotion with void datatypes of different "
                      "lengths. Use the `np.bytes_` datatype instead to pad the "
                      "shorter value with trailing zero bytes.");
+            return NULL;
          }
-        else {
+        Py_INCREF(descr1);
+        return descr1;
+    }
+
+    if (descr1->names != NULL && descr2->names != NULL) {
+        /* If both have fields, promoting individual fields may be possible */
+        static PyObject *promote_fields_func = NULL;
+        npy_cache_import("numpy.core._internal", "_promote_fields",
+                &promote_fields_func);
+        if (promote_fields_func == NULL) {
+            return NULL;
+        }
+        PyObject *result = PyObject_CallFunctionObjArgs(promote_fields_func,
+                descr1, descr2, NULL);
+        if (result == NULL) {
+            return NULL;
+        }
+        if (!PyObject_TypeCheck(result, Py_TYPE(descr1))) {
+            PyErr_SetString(PyExc_RuntimeError,
+                    "Internal NumPy error: `_promote_fields` did not return "
+                    "a valid descriptor object.");
+            Py_DECREF(result);
+            return NULL;
+        }
+        return (PyArray_Descr *)result;
+    }
+    else if (descr1->subarray != NULL && descr2->subarray != NULL) {
+        int cmp = PyObject_RichCompareBool(
+                descr1->subarray->shape, descr2->subarray->shape, Py_EQ);
+        if (error_converting(cmp)) {
+            return NULL;
+        }
+        if (!cmp) {
              PyErr_SetString(PyExc_TypeError,
-                    "invalid type promotion with structured datatype(s).");
+                    "invalid type promotion with subarray datatypes "
+                    "(shape mismatch).");
+            return NULL;
          }
-        return NULL;
+        PyArray_Descr *new_base = PyArray_PromoteTypes(
+                descr1->subarray->base, descr2->subarray->base);
+        if (new_base == NULL) {
+            return NULL;
+        }
+        /*
+         * If it is the same dtype and the container did not change, we might
+         * as well preserve identity and metadata.  This could probably be
+         * changed.
+         */
+        if (descr1 == descr2 && new_base == descr1->subarray->base) {
+            Py_DECREF(new_base);
+            Py_INCREF(descr1);
+            return descr1;
+        }
+
+        PyArray_Descr *new_descr = PyArray_DescrNew(descr1);
+        if (new_descr == NULL) {
+            Py_DECREF(new_base);
+            return NULL;
+        }
+        Py_SETREF(new_descr->subarray->base, new_base);
+        return new_descr;
      }
-    Py_INCREF(descr1);
-    return descr1;
+
+    PyErr_SetString(PyExc_TypeError,
+            "invalid type promotion with structured datatype(s).");
+    return NULL;
  }
  
  NPY_NO_EXPORT int
@@ -621,6 +801,7 @@ dtypemeta_wrap_legacy_descriptor(PyArray_Descr *descr)
      dt_slots->is_known_scalar_type = python_builtins_are_known_scalar_types;
      dt_slots->common_dtype = default_builtin_common_dtype;
      dt_slots->common_instance = NULL;
+    dt_slots->ensure_canonical = ensure_native_byteorder;
  
      if (PyTypeNum_ISSIGNED(dtype_class->type_num)) {
          /* Convert our scalars (raise on too large unsigned and NaN, etc.) */
@@ -652,6 +833,7 @@ dtypemeta_wrap_legacy_descriptor(PyArray_Descr *descr)
              dt_slots->discover_descr_from_pyobject = (
                      void_discover_descr_from_pyobject);
              dt_slots->common_instance = void_common_instance;
+            dt_slots->ensure_canonical = void_ensure_canonical;
          }
          else {
              dt_slots->default_descr = string_and_unicode_default_descr;
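The structured branch of `void_ensure_canonical` above recomputes field offsets: for an aligned struct each field is placed at the next offset matching its alignment and the total size is padded to the largest alignment, otherwise fields are packed back to back. The offset arithmetic alone can be sketched as follows. `canonical_layout` is a hypothetical helper, not NumPy API, and it leaves out the descriptor, dictionary, and reference-count handling that the real function performs.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not NumPy API): compute canonical field offsets.
 * `sizes[i]` and `aligns[i]` describe field i; every alignment must be a
 * power of two.  Offsets are written to `offsets`; the total struct size
 * is returned.
 */
static size_t
canonical_layout(const size_t *sizes, const size_t *aligns, size_t n,
                 int aligned, size_t *offsets)
{
    size_t total = 0;
    size_t maxalign = 1;

    for (size_t i = 0; i < n; i++) {
        if (aligned) {
            /* Round the running size up to this field's alignment. */
            total = (total + aligns[i] - 1) & ~(aligns[i] - 1);
            if (aligns[i] > maxalign) {
                maxalign = aligns[i];
            }
        }
        offsets[i] = total;
        total += sizes[i];
    }
    /* Pad the end; a no-op for packed structs, where maxalign stays 1. */
    return (total + maxalign - 1) & ~(maxalign - 1);
}
```

With fields of size 1, 4, and 2 whose alignments equal their sizes, the aligned layout places them at offsets 0, 4, and 8 for a total size of 12, while the packed layout uses offsets 0, 1, and 5 for a total size of 7.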
diff --git a/numpy/core/src/multiarray/dtypemeta.h b/numpy/core/src/multiarray/dtypemeta.h

index 2a61fe39de37fb981cb905b01d141aa1e5f8fac5..e7d5505d851e0d967ad14eeeaf09814834817d3c 100644 (file)
--- a/numpy/core/src/multiarray/dtypemeta.h
+++ b/numpy/core/src/multiarray/dtypemeta.h
@@ -25,6 +25,7 @@ typedef PyArray_DTypeMeta *(common_dtype_function)(
          PyArray_DTypeMeta *dtype1, PyArray_DTypeMeta *dtype2);
  typedef PyArray_Descr *(common_instance_function)(
          PyArray_Descr *dtype1, PyArray_Descr *dtype2);
+typedef PyArray_Descr *(ensure_canonical_function)(PyArray_Descr *dtype);
  
  /*
   * TODO: These two functions are currently only used for experimental DType
@@ -44,6 +45,7 @@ typedef struct {
      default_descr_function *default_descr;
      common_dtype_function *common_dtype;
      common_instance_function *common_instance;
+    ensure_canonical_function *ensure_canonical;
      /*
       * Currently only used for experimental user DTypes.
       * Typing as `void *` until NumPy itself uses these (directly).
@@ -93,6 +95,8 @@ typedef struct {
      NPY_DT_SLOTS(dtype)->default_descr(dtype)
  #define NPY_DT_CALL_common_dtype(dtype, other)  \
      NPY_DT_SLOTS(dtype)->common_dtype(dtype, other)
+#define NPY_DT_CALL_ensure_canonical(descr)  \
+    NPY_DT_SLOTS(NPY_DTYPE(descr))->ensure_canonical(descr)
  #define NPY_DT_CALL_getitem(descr, data_ptr)  \
      NPY_DT_SLOTS(NPY_DTYPE(descr))->getitem(descr, data_ptr)
  #define NPY_DT_CALL_setitem(descr, value, data_ptr)  \
diff --git a/numpy/core/src/multiarray/experimental_public_dtype_api.c b/numpy/core/src/multiarray/experimental_public_dtype_api.c

index 4b9c7199b167495cafcb0b4417207553a1f45e12..cf5f152abe73e5beeddc2f6f2342c8ab0d1d94cd 100644 (file)
--- a/numpy/core/src/multiarray/experimental_public_dtype_api.c
+++ b/numpy/core/src/multiarray/experimental_public_dtype_api.c
@@ -16,7 +16,7 @@
  #include "common_dtype.h"
  
  
-#define EXPERIMENTAL_DTYPE_API_VERSION 2
+#define EXPERIMENTAL_DTYPE_API_VERSION 4
  
  
  typedef struct{
@@ -332,6 +332,16 @@ PyUFunc_AddLoopFromSpec(PyObject *ufunc, PyArrayMethod_Spec *spec)
      return PyUFunc_AddLoop((PyUFuncObject *)ufunc, info, 0);
  }
  
+/*
+ * Function is defined in umath/wrapping_array_method.c
+ * (same/one compilation unit)
+ */
+NPY_NO_EXPORT int
+PyUFunc_AddWrappingLoop(PyObject *ufunc_obj,
+        PyArray_DTypeMeta *new_dtypes[], PyArray_DTypeMeta *wrapped_dtypes[],
+        translate_given_descrs_func *translate_given_descrs,
+        translate_loop_descrs_func *translate_loop_descrs);
+
  
  static int
  PyUFunc_AddPromoter(
@@ -358,18 +368,75 @@ PyUFunc_AddPromoter(
  }
  
  
+/*
+ * Lightweight function to fetch a default instance of a DType class.
+ * Note that this version is named `_PyArray_GetDefaultDescr` with an
+ * underscore.  The `singleton` slot is public, so an inline version is
+ * provided that checks `singleton != NULL` first.
+ */
+static PyArray_Descr *
+_PyArray_GetDefaultDescr(PyArray_DTypeMeta *DType)
+{
+    return NPY_DT_CALL_default_descr(DType);
+}
+
+
  NPY_NO_EXPORT PyObject *
  _get_experimental_dtype_api(PyObject *NPY_UNUSED(mod), PyObject *arg)
  {
-    static void *experimental_api_table[] = {
+    static void *experimental_api_table[42] = {
              &PyUFunc_AddLoopFromSpec,
              &PyUFunc_AddPromoter,
              &PyArrayDTypeMeta_Type,
              &PyArrayInitDTypeMeta_FromSpec,
              &PyArray_CommonDType,
              &PyArray_PromoteDTypeSequence,
+            &_PyArray_GetDefaultDescr,
+            &PyUFunc_AddWrappingLoop,
+            NULL,
              NULL,
+            /* NumPy's builtin DTypes (starting at offset 10 going to 41) */
      };
+    if (experimental_api_table[10] == NULL) {
+        experimental_api_table[10] = PyArray_DTypeFromTypeNum(NPY_BOOL);
+        /* Integers */
+        experimental_api_table[11] = PyArray_DTypeFromTypeNum(NPY_BYTE);
+        experimental_api_table[12] = PyArray_DTypeFromTypeNum(NPY_UBYTE);
+        experimental_api_table[13] = PyArray_DTypeFromTypeNum(NPY_SHORT);
+        experimental_api_table[14] = PyArray_DTypeFromTypeNum(NPY_USHORT);
+        experimental_api_table[15] = PyArray_DTypeFromTypeNum(NPY_INT);
+        experimental_api_table[16] = PyArray_DTypeFromTypeNum(NPY_UINT);
+        experimental_api_table[17] = PyArray_DTypeFromTypeNum(NPY_LONG);
+        experimental_api_table[18] = PyArray_DTypeFromTypeNum(NPY_ULONG);
+        experimental_api_table[19] = PyArray_DTypeFromTypeNum(NPY_LONGLONG);
+        experimental_api_table[20] = PyArray_DTypeFromTypeNum(NPY_ULONGLONG);
+        /* Integer aliases */
+        experimental_api_table[21] = PyArray_DTypeFromTypeNum(NPY_INT8);
+        experimental_api_table[22] = PyArray_DTypeFromTypeNum(NPY_UINT8);
+        experimental_api_table[23] = PyArray_DTypeFromTypeNum(NPY_INT16);
+        experimental_api_table[24] = PyArray_DTypeFromTypeNum(NPY_UINT16);
+        experimental_api_table[25] = PyArray_DTypeFromTypeNum(NPY_INT32);
+        experimental_api_table[26] = PyArray_DTypeFromTypeNum(NPY_UINT32);
+        experimental_api_table[27] = PyArray_DTypeFromTypeNum(NPY_INT64);
+        experimental_api_table[28] = PyArray_DTypeFromTypeNum(NPY_UINT64);
+        experimental_api_table[29] = PyArray_DTypeFromTypeNum(NPY_INTP);
+        experimental_api_table[30] = PyArray_DTypeFromTypeNum(NPY_UINTP);
+        /* Floats */
+        experimental_api_table[31] = PyArray_DTypeFromTypeNum(NPY_HALF);
+        experimental_api_table[32] = PyArray_DTypeFromTypeNum(NPY_FLOAT);
+        experimental_api_table[33] = PyArray_DTypeFromTypeNum(NPY_DOUBLE);
+        experimental_api_table[34] = PyArray_DTypeFromTypeNum(NPY_LONGDOUBLE);
+        /* Complex */
+        experimental_api_table[35] = PyArray_DTypeFromTypeNum(NPY_CFLOAT);
+        experimental_api_table[36] = PyArray_DTypeFromTypeNum(NPY_CDOUBLE);
+        experimental_api_table[37] = PyArray_DTypeFromTypeNum(NPY_CLONGDOUBLE);
+        /* String/Bytes */
+        experimental_api_table[38] = PyArray_DTypeFromTypeNum(NPY_STRING);
+        experimental_api_table[39] = PyArray_DTypeFromTypeNum(NPY_UNICODE);
+        /* Datetime/Timedelta */
+        experimental_api_table[40] = PyArray_DTypeFromTypeNum(NPY_DATETIME);
+        experimental_api_table[41] = PyArray_DTypeFromTypeNum(NPY_TIMEDELTA);
+    }
  
      char *env = getenv("NUMPY_EXPERIMENTAL_DTYPE_API");
      if (env == NULL || strcmp(env, "1") != 0) {
diff --git a/numpy/core/src/multiarray/flagsobject.c b/numpy/core/src/multiarray/flagsobject.c

index 3b1b4f406194474ca2b19a7f0583adec70cdf90c..adbfb22e7f5d9357b7514d5efe94eb1328a7e346 100644 (file)
--- a/numpy/core/src/multiarray/flagsobject.c
+++ b/numpy/core/src/multiarray/flagsobject.c
@@ -105,20 +105,11 @@ PyArray_UpdateFlags(PyArrayObject *ret, int flagmask)
   *
   * According to these rules, a 0- or 1-dimensional array is either both
   * C- and F-contiguous, or neither; and an array with 2+ dimensions
- * can be C- or F- contiguous, or neither, but not both. Though there
- * there are exceptions for arrays with zero or one item, in the first
- * case the check is relaxed up to and including the first dimension
- * with shape[i] == 0. In the second case `strides == itemsize` will
- * can be true for all dimensions and both flags are set.
- *
- * When NPY_RELAXED_STRIDES_CHECKING is set, we use a more accurate
- * definition of C- and F-contiguity, in which all 0-sized arrays are
- * contiguous (regardless of dimensionality), and if shape[i] == 1
- * then we ignore strides[i] (since it has no affect on memory layout).
- * With these new rules, it is possible for e.g. a 10x1 array to be both
- * C- and F-contiguous -- but, they break downstream code which assumes
- * that for contiguous arrays strides[-1] (resp. strides[0]) always
- * contains the itemsize.
+ * can be C- or F- contiguous, or neither, but not both (unless it has only
+ * a single element).
+ * We correct this, however.  When a dimension has length 1, its stride is
+ * never used and thus has no effect on the memory layout.
+ * The above rules thus only apply when ignoring all size 1 dimensions.
   */
  static void
  _UpdateContiguousFlags(PyArrayObject *ap)
@@ -131,7 +122,6 @@ _UpdateContiguousFlags(PyArrayObject *ap)
      sd = PyArray_ITEMSIZE(ap);
      for (i = PyArray_NDIM(ap) - 1; i >= 0; --i) {
          dim = PyArray_DIMS(ap)[i];
-#if NPY_RELAXED_STRIDES_CHECKING
          /* contiguous by definition */
          if (dim == 0) {
              PyArray_ENABLEFLAGS(ap, NPY_ARRAY_C_CONTIGUOUS);
@@ -144,17 +134,6 @@ _UpdateContiguousFlags(PyArrayObject *ap)
              }
              sd *= dim;
          }
-#else /* not NPY_RELAXED_STRIDES_CHECKING */
-        if (PyArray_STRIDES(ap)[i] != sd) {
-            is_c_contig = 0;
-            break;
-        }
-        /* contiguous, if it got this far */
-        if (dim == 0) {
-            break;
-        }
-        sd *= dim;
-#endif /* not NPY_RELAXED_STRIDES_CHECKING */
      }
      if (is_c_contig) {
          PyArray_ENABLEFLAGS(ap, NPY_ARRAY_C_CONTIGUOUS);
@@ -167,7 +146,6 @@ _UpdateContiguousFlags(PyArrayObject *ap)
      sd = PyArray_ITEMSIZE(ap);
      for (i = 0; i < PyArray_NDIM(ap); ++i) {
          dim = PyArray_DIMS(ap)[i];
-#if NPY_RELAXED_STRIDES_CHECKING
          if (dim != 1) {
              if (PyArray_STRIDES(ap)[i] != sd) {
                  PyArray_CLEARFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
@@ -175,16 +153,6 @@ _UpdateContiguousFlags(PyArrayObject *ap)
              }
              sd *= dim;
          }
-#else /* not NPY_RELAXED_STRIDES_CHECKING */
-        if (PyArray_STRIDES(ap)[i] != sd) {
-            PyArray_CLEARFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
-            return;
-        }
-        if (dim == 0) {
-            break;
-        }
-        sd *= dim;
-#endif /* not NPY_RELAXED_STRIDES_CHECKING */
      }
      PyArray_ENABLEFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
      return;
@@ -237,25 +205,6 @@ _define_get_warn(NPY_ARRAY_ALIGNED|
              NPY_ARRAY_WRITEABLE|
              NPY_ARRAY_C_CONTIGUOUS, carray)
  
-static PyObject *
-arrayflags_updateifcopy_get(PyArrayFlagsObject *self, void *NPY_UNUSED(ignored))
-{
-    PyObject *item;
-    /* 2017-Nov-10 1.14 */
-    if(DEPRECATE("UPDATEIFCOPY deprecated, use WRITEBACKIFCOPY instead") < 0) {
-        return NULL;
-    }
-    if ((self->flags & (NPY_ARRAY_UPDATEIFCOPY)) == (NPY_ARRAY_UPDATEIFCOPY)) {
-        item = Py_True;
-    }
-    else {
-        item = Py_False;
-    }
-    Py_INCREF(item);
-    return item;
-}
-
-
  static PyObject *
  arrayflags_forc_get(PyArrayFlagsObject *self, void *NPY_UNUSED(ignored))
  {
@@ -312,36 +261,6 @@ arrayflags_num_get(PyArrayFlagsObject *self, void *NPY_UNUSED(ignored))
      return PyLong_FromLong(self->flags);
  }
  
-/* relies on setflags order being write, align, uic */
-static int
-arrayflags_updateifcopy_set(
-        PyArrayFlagsObject *self, PyObject *obj, void *NPY_UNUSED(ignored))
-{
-    PyObject *res;
-
-    if (obj == NULL) {
-        PyErr_SetString(PyExc_AttributeError,
-                "Cannot delete flags updateifcopy attribute");
-        return -1;
-    }
-    if (self->arr == NULL) {
-        PyErr_SetString(PyExc_ValueError,
-                "Cannot set flags on array scalars.");
-        return -1;
-    }
-    /* 2017-Nov-10 1.14 */
-    if(DEPRECATE("UPDATEIFCOPY deprecated, use WRITEBACKIFCOPY instead") < 0) {
-        return -1;
-    }
-    res = PyObject_CallMethod(self->arr, "setflags", "OOO", Py_None, Py_None,
-                              (PyObject_IsTrue(obj) ? Py_True : Py_False));
-    if (res == NULL) {
-        return -1;
-    }
-    Py_DECREF(res);
-    return 0;
-}
-
  /* relies on setflags order being write, align, uic */
  static int
  arrayflags_writebackifcopy_set(
@@ -473,10 +392,6 @@ static PyGetSetDef arrayflags_getsets[] = {
          (getter)arrayflags_fortran_get,
          NULL,
          NULL, NULL},
-    {"updateifcopy",
-        (getter)arrayflags_updateifcopy_get,
-        (setter)arrayflags_updateifcopy_set,
-        NULL, NULL},
      {"writebackifcopy",
          (getter)arrayflags_writebackifcopy_get,
          (setter)arrayflags_writebackifcopy_set,
@@ -574,8 +489,6 @@ arrayflags_getitem(PyArrayFlagsObject *self, PyObject *ind)
              return arrayflags_aligned_get(self, NULL);
          case 'X':
              return arrayflags_writebackifcopy_get(self, NULL);
-        case 'U':
-            return arrayflags_updateifcopy_get(self, NULL);
          default:
              goto fail;
          }
@@ -631,9 +544,6 @@ arrayflags_getitem(PyArrayFlagsObject *self, PyObject *ind)
          }
          break;
      case 12:
-        if (strncmp(key, "UPDATEIFCOPY", n) == 0) {
-            return arrayflags_updateifcopy_get(self, NULL);
-        }
          if (strncmp(key, "C_CONTIGUOUS", n) == 0) {
              return arrayflags_contiguous_get(self, NULL);
          }
@@ -684,10 +594,6 @@ arrayflags_setitem(PyArrayFlagsObject *self, PyObject *ind, PyObject *item)
               ((n==1) && (strncmp(key, "A", n) == 0))) {
          return arrayflags_aligned_set(self, item, NULL);
      }
-    else if (((n==12) && (strncmp(key, "UPDATEIFCOPY", n) == 0)) ||
-             ((n==1) && (strncmp(key, "U", n) == 0))) {
-        return arrayflags_updateifcopy_set(self, item, NULL);
-    }
      else if (((n==15) && (strncmp(key, "WRITEBACKIFCOPY", n) == 0)) ||
               ((n==1) && (strncmp(key, "X", n) == 0))) {
          return arrayflags_writebackifcopy_set(self, item, NULL);
@@ -721,16 +627,14 @@ arrayflags_print(PyArrayFlagsObject *self)
      return PyUnicode_FromFormat(
                          "  %s : %s\n  %s : %s\n"
                          "  %s : %s\n  %s : %s%s\n"
-                        "  %s : %s\n  %s : %s\n"
-                        "  %s : %s\n",
+                        "  %s : %s\n  %s : %s\n",
                          "C_CONTIGUOUS",    _torf_(fl, NPY_ARRAY_C_CONTIGUOUS),
                          "F_CONTIGUOUS",    _torf_(fl, NPY_ARRAY_F_CONTIGUOUS),
                          "OWNDATA",         _torf_(fl, NPY_ARRAY_OWNDATA),
                          "WRITEABLE",       _torf_(fl, NPY_ARRAY_WRITEABLE),
                          _warn_on_write,
                          "ALIGNED",         _torf_(fl, NPY_ARRAY_ALIGNED),
-                        "WRITEBACKIFCOPY", _torf_(fl, NPY_ARRAY_WRITEBACKIFCOPY),
-                        "UPDATEIFCOPY",    _torf_(fl, NPY_ARRAY_UPDATEIFCOPY)
+                        "WRITEBACKIFCOPY", _torf_(fl, NPY_ARRAY_WRITEBACKIFCOPY)
      );
  }
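
Note on the flagsobject.c hunks above: with the NPY_RELAXED_STRIDES_CHECKING branches removed, the remaining rules are that any 0-sized array is both C- and F-contiguous, and that strides of length-1 dimensions are ignored. The following is a minimal standalone sketch of those rules, not the NumPy implementation. The names `sketch_contiguity_flags`, `SKETCH_C_CONTIGUOUS` and `SKETCH_F_CONTIGUOUS` are hypothetical, and plain `ptrdiff_t` arrays stand in for the `PyArrayObject` accessors.

```c
#include <stddef.h>

/* Hypothetical flag bits for this sketch only (not the NPY_ARRAY_* values). */
enum { SKETCH_C_CONTIGUOUS = 1, SKETCH_F_CONTIGUOUS = 2 };

/*
 * Compute contiguity flags from shape and strides (strides in bytes),
 * ignoring the strides of all dimensions of length 1.
 */
static int
sketch_contiguity_flags(int ndim, const ptrdiff_t *dims,
                        const ptrdiff_t *strides, ptrdiff_t itemsize)
{
    int flags = 0;
    int is_c_contig = 1;
    ptrdiff_t sd = itemsize;
    int i;

    /* C order: walk from the last axis to the first. */
    for (i = ndim - 1; i >= 0; --i) {
        if (dims[i] == 0) {
            /* 0-sized arrays are contiguous by definition */
            return SKETCH_C_CONTIGUOUS | SKETCH_F_CONTIGUOUS;
        }
        if (dims[i] != 1) {
            if (strides[i] != sd) {
                is_c_contig = 0;
            }
            sd *= dims[i];
        }
    }
    if (is_c_contig) {
        flags |= SKETCH_C_CONTIGUOUS;
    }

    /* Fortran order: walk from the first axis to the last. */
    sd = itemsize;
    for (i = 0; i < ndim; ++i) {
        if (dims[i] != 1) {
            if (strides[i] != sd) {
                return flags;
            }
            sd *= dims[i];
        }
    }
    return flags | SKETCH_F_CONTIGUOUS;
}
```

Under these rules a 10x1 array of 8-byte items reports both flags whatever stride is stored for its second axis, which is the case the removed comment described as breaking code that assumed `strides[-1]` always equals the itemsize.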
  
diff --git a/numpy/core/src/multiarray/getset.c b/numpy/core/src/multiarray/getset.c

index fc43bb3fe026caa88cbaa74b08143e668b688ff9..eb55e5e6110d58fbdaaf03213b56974a2665959f 100644 (file)
--- a/numpy/core/src/multiarray/getset.c
+++ b/numpy/core/src/multiarray/getset.c
@@ -384,7 +384,10 @@ array_data_set(PyArrayObject *self, PyObject *op, void *NPY_UNUSED(ignored))
      }
      if (PyArray_FLAGS(self) & NPY_ARRAY_OWNDATA) {
          PyArray_XDECREF(self);
-        size_t nbytes = PyArray_NBYTES_ALLOCATED(self);
+        size_t nbytes = PyArray_NBYTES(self);
+        if (nbytes == 0) {
+            nbytes = 1;
+        }
          PyObject *handler = PyArray_HANDLER(self);
          if (handler == NULL) {
              /* This can happen if someone arbitrarily sets NPY_ARRAY_OWNDATA */
@@ -396,12 +399,10 @@ array_data_set(PyArrayObject *self, PyObject *op, void *NPY_UNUSED(ignored))
          Py_CLEAR(((PyArrayObject_fields *)self)->mem_handler);
      }
      if (PyArray_BASE(self)) {
-        if ((PyArray_FLAGS(self) & NPY_ARRAY_WRITEBACKIFCOPY) ||
-            (PyArray_FLAGS(self) & NPY_ARRAY_UPDATEIFCOPY)) {
+        if (PyArray_FLAGS(self) & NPY_ARRAY_WRITEBACKIFCOPY) {
              PyArray_ENABLEFLAGS((PyArrayObject *)PyArray_BASE(self),
                                                  NPY_ARRAY_WRITEABLE);
              PyArray_CLEARFLAGS(self, NPY_ARRAY_WRITEBACKIFCOPY);
-            PyArray_CLEARFLAGS(self, NPY_ARRAY_UPDATEIFCOPY);
          }
          Py_DECREF(PyArray_BASE(self));
          ((PyArrayObject_fields *)self)->base = NULL;
@@ -498,9 +499,6 @@ array_descr_set(PyArrayObject *self, PyObject *arg, void *NPY_UNUSED(ignored))
  
      /* Changing the size of the dtype results in a shape change */
      if (newtype->elsize != PyArray_DESCR(self)->elsize) {
-        int axis;
-        npy_intp newdim;
-
          /* forbidden cases */
          if (PyArray_NDIM(self) == 0) {
              PyErr_SetString(PyExc_ValueError,
@@ -515,31 +513,21 @@ array_descr_set(PyArrayObject *self, PyObject *arg, void *NPY_UNUSED(ignored))
              goto fail;
          }
  
-        /* determine which axis to resize */
-        if (PyArray_IS_C_CONTIGUOUS(self)) {
-            axis = PyArray_NDIM(self) - 1;
-        }
-        else if (PyArray_IS_F_CONTIGUOUS(self)) {
-            /* 2015-11-27 1.11.0, gh-6747 */
-            if (DEPRECATE(
-                        "Changing the shape of an F-contiguous array by "
-                        "descriptor assignment is deprecated. To maintain the "
-                        "Fortran contiguity of a multidimensional Fortran "
-                        "array, use 'a.T.view(...).T' instead") < 0) {
-                goto fail;
-            }
-            axis = 0;
-        }
-        else {
-            /* Don't mention the deprecated F-contiguous support */
+        /* resize on last axis only */
+        int axis = PyArray_NDIM(self) - 1;
+        if (PyArray_DIMS(self)[axis] != 1 &&
+                PyArray_SIZE(self) != 0 &&
+                PyArray_STRIDES(self)[axis] != PyArray_DESCR(self)->elsize) {
              PyErr_SetString(PyExc_ValueError,
-                    "To change to a dtype of a different size, the array must "
-                    "be C-contiguous");
+                    "To change to a dtype of a different size, the last axis "
+                    "must be contiguous");
              goto fail;
          }
  
+        npy_intp newdim;
+
          if (newtype->elsize < PyArray_DESCR(self)->elsize) {
-            /* if it is compatible, increase the size of the relevant axis */
+            /* if it is compatible, increase the size of the last axis */
              if (newtype->elsize == 0 ||
                      PyArray_DESCR(self)->elsize % newtype->elsize != 0) {
                  PyErr_SetString(PyExc_ValueError,
@@ -551,7 +539,7 @@ array_descr_set(PyArrayObject *self, PyObject *arg, void *NPY_UNUSED(ignored))
              PyArray_DIMS(self)[axis] *= newdim;
              PyArray_STRIDES(self)[axis] = newtype->elsize;
          }
-        else if (newtype->elsize > PyArray_DESCR(self)->elsize) {
+        else /* newtype->elsize > PyArray_DESCR(self)->elsize */ {
              /* if it is compatible, decrease the size of the relevant axis */
              newdim = PyArray_DIMS(self)[axis] * PyArray_DESCR(self)->elsize;
              if ((newdim % newtype->elsize) != 0) {
@@ -626,7 +614,7 @@ array_struct_get(PyArrayObject *self, void *NPY_UNUSED(ignored))
          inter->flags = inter->flags & ~NPY_ARRAY_WRITEABLE;
      }
      /* reset unused flags */
-    inter->flags &= ~(NPY_ARRAY_WRITEBACKIFCOPY | NPY_ARRAY_UPDATEIFCOPY |NPY_ARRAY_OWNDATA);
+    inter->flags &= ~(NPY_ARRAY_WRITEBACKIFCOPY | NPY_ARRAY_OWNDATA);
      if (PyArray_ISNOTSWAPPED(self)) inter->flags |= NPY_ARRAY_NOTSWAPPED;
      /*
       * Copy shape and strides over since these can be reset
@@ -945,15 +933,6 @@ array_transpose_get(PyArrayObject *self, void *NPY_UNUSED(ignored))
      return PyArray_Transpose(self, NULL);
  }
  
-/* If this is None, no function call is made
-   --- default sub-class behavior
-*/
-static PyObject *
-array_finalize_get(PyArrayObject *NPY_UNUSED(self), void *NPY_UNUSED(ignored))
-{
-    Py_RETURN_NONE;
-}
-
  NPY_NO_EXPORT PyGetSetDef array_getsetlist[] = {
      {"ndim",
          (getter)array_ndim_get,
@@ -1027,10 +1006,6 @@ NPY_NO_EXPORT PyGetSetDef array_getsetlist[] = {
          (getter)array_priority_get,
          NULL,
          NULL, NULL},
-    {"__array_finalize__",
-        (getter)array_finalize_get,
-        NULL,
-        NULL, NULL},
      {NULL, NULL, NULL, NULL, NULL},  /* Sentinel */
  };
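
Note on the `array_descr_set` hunks above: after this change only the last axis is resized when a dtype of a different size is assigned, and that axis must be contiguous unless it has length 1 or the array is empty. The arithmetic can be sketched in isolation as below. `sketch_resize_last_axis` is a hypothetical helper, not a NumPy function, and the final division in the "larger itemsize" branch is assumed, since that line lies outside the hunks shown.

```c
#include <stddef.h>

/*
 * Adjust the length and stride of the last axis for a change of item size.
 * Returns 0 on success and -1 where the real code would raise ValueError.
 */
static int
sketch_resize_last_axis(ptrdiff_t *dim, ptrdiff_t *stride,
                        ptrdiff_t total_size,
                        ptrdiff_t old_elsize, ptrdiff_t new_elsize)
{
    if (new_elsize == old_elsize) {
        return 0;  /* same item size: the shape does not change */
    }
    /* The last axis must be contiguous, unless it is trivial or empty. */
    if (*dim != 1 && total_size != 0 && *stride != old_elsize) {
        return -1;
    }
    if (new_elsize < old_elsize) {
        /* Smaller items: the old size must be a whole multiple. */
        if (new_elsize == 0 || old_elsize % new_elsize != 0) {
            return -1;
        }
        *dim *= old_elsize / new_elsize;
    }
    else {
        /* Larger items: the bytes along the last axis must divide evenly. */
        ptrdiff_t nbytes = *dim * old_elsize;
        if (nbytes % new_elsize != 0) {
            return -1;
        }
        *dim = nbytes / new_elsize;  /* assumed; not visible in the hunk */
    }
    *stride = new_elsize;
    return 0;
}
```

For example, viewing four contiguous 8-byte items as 4-byte items doubles the last axis to 8, while viewing three 4-byte items as 8-byte items fails because 12 bytes is not a multiple of 8.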
  
diff --git a/numpy/core/src/multiarray/item_selection.c b/numpy/core/src/multiarray/item_selection.c

index 086b674c809eb327138db0c40245abd6d8bf032a..9fad153a3d4c111c9c50fb01457775ee4b1ea977 100644 (file)
--- a/numpy/core/src/multiarray/item_selection.c
+++ b/numpy/core/src/multiarray/item_selection.c
@@ -2641,13 +2641,33 @@ PyArray_Nonzero(PyArrayObject *self)
                      *multi_index++ = j++;
                  }
              }
+            /*
+             * Fall back to a branchless strategy to avoid branch misprediction
+             * stalls that are very expensive on most modern processors.
+             */
              else {
-                npy_intp j;
-                for (j = 0; j < count; ++j) {
-                    if (*data != 0) {
-                        *multi_index++ = j;
-                    }
+                npy_intp *multi_index_end = multi_index + nonzero_count;
+                npy_intp j = 0;
+
+                /* Manually unroll for GCC and maybe other compilers */
+                while (multi_index + 4 < multi_index_end) {
+                    *multi_index = j;
+                    multi_index += data[0] != 0;
+                    *multi_index = j + 1;
+                    multi_index += data[stride] != 0;
+                    *multi_index = j + 2;
+                    multi_index += data[stride * 2] != 0;
+                    *multi_index = j + 3;
+                    multi_index += data[stride * 3] != 0;
+                    data += stride * 4;
+                    j += 4;
+                }
+
+                while (multi_index < multi_index_end) {
+                    *multi_index = j;
+                    multi_index += *data != 0;
                      data += stride;
+                    ++j;
                  }
              }
          }
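
Note on the `PyArray_Nonzero` hunk above: the branchless loop always stores the current index and then advances the output pointer by the result of the comparison, so no conditional jump depends on the data. The sketch below shows the same idea on a plain `int` array, without the manual unrolling or the byte strides of the real code. Both function names are hypothetical. It relies on the number of non-zero entries being known in advance, as it is in the hunk through `nonzero_count`.

```c
#include <stddef.h>

/* Count non-zero entries so the output buffer can be sized exactly. */
static size_t
sketch_count_nonzero(const int *data, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        count += (data[i] != 0);
    }
    return count;
}

/*
 * Write the indices of the non-zero entries of `data` into `out`.
 * `out` must have room for exactly `nonzero_count` entries, and `data`
 * must contain exactly that many non-zero values.
 */
static void
sketch_nonzero_indices(const int *data, size_t nonzero_count, size_t *out)
{
    size_t *out_end = out + nonzero_count;
    size_t j = 0;

    while (out < out_end) {
        *out = j;                 /* speculative store, kept only ...   */
        out += (data[j] != 0);    /* ... if the pointer advances here   */
        ++j;
    }
}
```

The speculative store is safe because the loop only runs while `out` is still inside the buffer, and it stops as soon as the last non-zero index has been committed, so trailing zeros in `data` are never read.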
diff --git a/numpy/core/src/multiarray/legacy_dtype_implementation.c b/numpy/core/src/multiarray/legacy_dtype_implementation.c

index 72a52d7a87c0b661b469f575e898eebc8376313a..73c70c393a0cd882c51839d7435700f5909306b5 100644 (file)
--- a/numpy/core/src/multiarray/legacy_dtype_implementation.c
+++ b/numpy/core/src/multiarray/legacy_dtype_implementation.c
@@ -13,6 +13,7 @@
  #include "scalartypes.h"
  #include "_datetime.h"
  #include "datetime_strings.h"
+#include "can_cast_table.h"
  #include "convert_datatype.h"
  
  #include "legacy_dtype_implementation.h"
diff --git a/numpy/core/src/multiarray/mapping.c b/numpy/core/src/multiarray/mapping.c

index 014a863d547127db907ca1fd4d44a7eb43dedc7f..1a2ade11b93b83101d4139cf1711defc585860e3 100644 (file)
--- a/numpy/core/src/multiarray/mapping.c
+++ b/numpy/core/src/multiarray/mapping.c
@@ -197,21 +197,8 @@ unpack_scalar(PyObject *index, PyObject **result, npy_intp NPY_UNUSED(result_n))
  /**
   * Turn an index argument into a c-array of `PyObject *`s, one for each index.
   *
- * When a scalar is passed, this is written directly to the buffer. When a
- * tuple is passed, the tuple elements are unpacked into the buffer.
- *
- * When some other sequence is passed, this implements the following section
- * from the advanced indexing docs to decide whether to unpack or just write
- * one element:
- *
- * > In order to remain backward compatible with a common usage in Numeric,
- * > basic slicing is also initiated if the selection object is any non-ndarray
- * > sequence (such as a list) containing slice objects, the Ellipsis object,
- * > or the newaxis object, but not for integer arrays or other embedded
- * > sequences.
- *
- * It might be worth deprecating this behaviour (gh-4434), in which case the
- * entire function should become a simple check of PyTuple_Check.
+ * When a tuple is passed, the tuple elements are unpacked into the buffer.
+ * Anything else is handled by unpack_scalar().
   *
   * @param  index     The index object, which may or may not be a tuple. This is
   *                   a borrowed reference.
@@ -228,129 +215,32 @@ unpack_scalar(PyObject *index, PyObject **result, npy_intp NPY_UNUSED(result_n))
  NPY_NO_EXPORT npy_intp
  unpack_indices(PyObject *index, PyObject **result, npy_intp result_n)
  {
-    npy_intp n, i;
-    npy_bool commit_to_unpack;
+    /* It is likely that the logic here can be simplified. See the discussion
+     * on https://github.com/numpy/numpy/pull/21029
+     */
  
      /* Fast route for passing a tuple */
      if (PyTuple_CheckExact(index)) {
          return unpack_tuple((PyTupleObject *)index, result, result_n);
      }
  
-    /* Obvious single-entry cases */
-    if (0  /* to aid macros below */
-            || PyLong_CheckExact(index)
-            || index == Py_None
-            || PySlice_Check(index)
-            || PyArray_Check(index)
-            || !PySequence_Check(index)
-            || PyUnicode_Check(index)) {
-
-        return unpack_scalar(index, result, result_n);
-    }
-
      /*
       * Passing a tuple subclass - coerce to the base type. This incurs an
-     * allocation, but doesn't need to be a fast path anyway
+     * allocation, but doesn't need to be a fast path anyway. Note that by
+     * calling `PySequence_Tuple`, we ensure that the subclass `__iter__` is
+     * called.
       */
      if (PyTuple_Check(index)) {
          PyTupleObject *tup = (PyTupleObject *) PySequence_Tuple(index);
          if (tup == NULL) {
              return -1;
          }
-        n = unpack_tuple(tup, result, result_n);
+        npy_intp n = unpack_tuple(tup, result, result_n);
          Py_DECREF(tup);
          return n;
      }
  
-    /*
-     * At this point, we're left with a non-tuple, non-array, sequence:
-     * typically, a list. We use some somewhat-arbitrary heuristics from here
-     * onwards to decided whether to treat that list as a single index, or a
-     * list of indices.
-     */
-
-    /* if len fails, treat like a scalar */
-    n = PySequence_Size(index);
-    if (n < 0) {
-        PyErr_Clear();
-        return unpack_scalar(index, result, result_n);
-    }
-
-    /*
-     * Backwards compatibility only takes effect for short sequences - otherwise
-     * we treat it like any other scalar.
-     *
-     * Sequences < NPY_MAXDIMS with any slice objects
-     * or newaxis, Ellipsis or other arrays or sequences
-     * embedded, are considered equivalent to an indexing
-     * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
-     */
-    if (n >= NPY_MAXDIMS) {
-        return unpack_scalar(index, result, result_n);
-    }
-
-    /* In case we change result_n elsewhere */
-    assert(n <= result_n);
-
-    /*
-     * Some other type of short sequence - assume we should unpack it like a
-     * tuple, and then decide whether that was actually necessary.
-     */
-    commit_to_unpack = 0;
-    for (i = 0; i < n; i++) {
-        PyObject *tmp_obj = result[i] = PySequence_GetItem(index, i);
-
-        if (commit_to_unpack) {
-            /* propagate errors */
-            if (tmp_obj == NULL) {
-                goto fail;
-            }
-        }
-        else {
-            /*
-             * if getitem fails (unusual) before we've committed, then stop
-             * unpacking
-             */
-            if (tmp_obj == NULL) {
-                PyErr_Clear();
-                break;
-            }
-
-            /* decide if we should treat this sequence like a tuple */
-            if (PyArray_Check(tmp_obj)
-                    || PySequence_Check(tmp_obj)
-                    || PySlice_Check(tmp_obj)
-                    || tmp_obj == Py_Ellipsis
-                    || tmp_obj == Py_None) {
-                if (DEPRECATE_FUTUREWARNING(
-                        "Using a non-tuple sequence for multidimensional "
-                        "indexing is deprecated; use `arr[tuple(seq)]` "
-                        "instead of `arr[seq]`. In the future this will be "
-                        "interpreted as an array index, `arr[np.array(seq)]`, "
-                        "which will result either in an error or a different "
-                        "result.") < 0) {
-                    i++;  /* since loop update doesn't run */
-                    goto fail;
-                }
-                commit_to_unpack = 1;
-            }
-        }
-    }
-
-    /* unpacking was the right thing to do, and we already did it */
-    if (commit_to_unpack) {
-        return n;
-    }
-    /* got to the end, never found an indication that we should have unpacked */
-    else {
-        /* we partially filled result, so empty it first */
-        multi_DECREF(result, i);
-        return unpack_scalar(index, result, result_n);
-    }
-
-fail:
-    multi_DECREF(result, i);
-    return -1;
+    return unpack_scalar(index, result, result_n);
  }
  
  /**
@@ -3254,7 +3144,7 @@ PyArray_MapIterNew(npy_index_info *indices , int index_num, int index_type,
   * If copy_if_overlap != 0, check if `a` has memory overlap with any of the
   * arrays in `index` and with `extra_op`. If yes, make copies as appropriate
   * to avoid problems if `a` is modified during the iteration.
- * `iter->array` may contain a copied array (UPDATEIFCOPY/WRITEBACKIFCOPY set).
+ * `iter->array` may contain a copied array (WRITEBACKIFCOPY set).
   */
  NPY_NO_EXPORT PyObject *
  PyArray_MapIterArrayCopyIfOverlap(PyArrayObject * a, PyObject * index,
diff --git a/numpy/core/src/multiarray/methods.c b/numpy/core/src/multiarray/methods.c

index 184d73f4942667f9d24c80de18c5417dbec05b66..b738c1d44e76c0d0d6386417cbf1c55abf49e2bc 100644 (file)
--- a/numpy/core/src/multiarray/methods.c
+++ b/numpy/core/src/multiarray/methods.c
@@ -379,6 +379,18 @@ PyArray_GetField(PyArrayObject *self, PyArray_Descr *typed, int offset)
      static PyObject *checkfunc = NULL;
      int self_elsize, typed_elsize;
  
+    if (self == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "self is NULL in PyArray_GetField");
+        return NULL;
+    }
+
+    if (typed == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "typed is NULL in PyArray_GetField");
+        return NULL;
+    }
+
      /* check that we are not reinterpreting memory containing Objects. */
      if (_may_have_objects(PyArray_DESCR(self)) || _may_have_objects(typed)) {
          npy_cache_import("numpy.core._internal", "_getfield_is_safe",
@@ -457,6 +469,18 @@ PyArray_SetField(PyArrayObject *self, PyArray_Descr *dtype,
      PyObject *ret = NULL;
      int retval = 0;
  
+    if (self == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "self is NULL in PyArray_SetField");
+        return -1;
+    }
+
+    if (dtype == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "dtype is NULL in PyArray_SetField");
+        return -1;
+    }
+
      if (PyArray_FailUnlessWriteable(self, "assignment destination") < 0) {
          Py_DECREF(dtype);
          return -1;
@@ -859,7 +883,7 @@ array_astype(PyArrayObject *self,
       * and it's not a subtype if subok is False, then we
       * can skip the copy.
       */
-    if (forcecopy != NPY_COPY_ALWAYS && 
+    if (forcecopy != NPY_COPY_ALWAYS &&
                      (order == NPY_KEEPORDER ||
                      (order == NPY_ANYORDER &&
                          (PyArray_IS_C_CONTIGUOUS(self) ||
@@ -881,7 +905,7 @@ array_astype(PyArrayObject *self,
          Py_DECREF(dtype);
          return NULL;
      }
-    
+
      if (!PyArray_CanCastArrayTo(self, dtype, casting)) {
          PyErr_Clear();
          npy_set_invalid_cast_error(
@@ -925,6 +949,13 @@ array_astype(PyArrayObject *self,
  /* default sub-type implementation */
  
  
+static PyObject *
+array_finalizearray(PyArrayObject *self, PyObject *obj)
+{
+    Py_RETURN_NONE;
+}
+
+
  static PyObject *
  array_wraparray(PyArrayObject *self, PyObject *args)
  {
@@ -1992,7 +2023,10 @@ array_setstate(PyArrayObject *self, PyObject *args)
       * since fa could be a 0-d or scalar, and then
       * PyDataMem_UserFREE will be confused
       */
-    size_t n_tofree = PyArray_NBYTES_ALLOCATED(self);
+    size_t n_tofree = PyArray_NBYTES(self);
+    if (n_tofree == 0) {
+        n_tofree = 1;
+    }
      Py_XDECREF(PyArray_DESCR(self));
      fa->descr = typecode;
      Py_INCREF(typecode);
@@ -2098,7 +2132,6 @@ array_setstate(PyArrayObject *self, PyObject *args)
      fa->base = NULL;
  
      PyArray_CLEARFLAGS(self, NPY_ARRAY_WRITEBACKIFCOPY);
-    PyArray_CLEARFLAGS(self, NPY_ARRAY_UPDATEIFCOPY);
  
      if (PyArray_DIMS(self) != NULL) {
          npy_free_cache_dim_array(self);
@@ -2130,7 +2163,10 @@ array_setstate(PyArrayObject *self, PyObject *args)
          /* Bytes should always be considered immutable, but we just grab the
           * pointer if they are large, to save memory. */
          if (!IsAligned(self) || swap || (len <= 1000)) {
-            npy_intp num = PyArray_NBYTES_ALLOCATED(self);
+            npy_intp num = PyArray_NBYTES(self);
+            if (num == 0) {
+                num = 1;
+            }
              /* Store the handler in case the default is modified */
              Py_XDECREF(fa->mem_handler);
              fa->mem_handler = PyDataMem_GetHandler();
@@ -2193,7 +2229,10 @@ array_setstate(PyArrayObject *self, PyObject *args)
          }
      }
      else {
-        npy_intp num = PyArray_NBYTES_ALLOCATED(self);
+        npy_intp num = PyArray_NBYTES(self);
+        if (num == 0) {
+            num = 1;
+        }
  
          /* Store the functions in case the default handler is modified */
          Py_XDECREF(fa->mem_handler);
@@ -2652,7 +2691,6 @@ array_setflags(PyArrayObject *self, PyObject *args, PyObject *kwds)
          }
          else {
              PyArray_CLEARFLAGS(self, NPY_ARRAY_WRITEBACKIFCOPY);
-            PyArray_CLEARFLAGS(self, NPY_ARRAY_UPDATEIFCOPY);
              Py_XDECREF(fa->base);
              fa->base = NULL;
          }
@@ -2800,6 +2838,9 @@ NPY_NO_EXPORT PyMethodDef array_methods[] = {
      {"__array_prepare__",
          (PyCFunction)array_preparearray,
          METH_VARARGS, NULL},
+    {"__array_finalize__",
+        (PyCFunction)array_finalizearray,
+        METH_O, NULL},
      {"__array_wrap__",
          (PyCFunction)array_wraparray,
          METH_VARARGS, NULL},
diff --git a/numpy/core/src/multiarray/multiarraymodule.c b/numpy/core/src/multiarray/multiarraymodule.c

index 576c39f5d9ec6d20cdde375a142c7f9244129eab..597feb6bbbc2a0abf9c5e4341e87744c3464c6c7 100644 (file)
--- a/numpy/core/src/multiarray/multiarraymodule.c
+++ b/numpy/core/src/multiarray/multiarraymodule.c
@@ -69,16 +69,19 @@ NPY_NO_EXPORT int NPY_NUMUSERTYPES = 0;
  
  #include "get_attr_string.h"
  #include "experimental_public_dtype_api.h"  /* _get_experimental_dtype_api */
+#include "textreading/readtext.h"  /* _readtext_from_file_object */
  
  #include "npy_dlpack.h"
  
+#include "umathmodule.h"
+
  /*
   *****************************************************************************
   **                    INCLUDE GENERATED CODE                               **
   *****************************************************************************
   */
-#include "funcs.inc"
-#include "umathmodule.h"
+/* __ufunc_api.c defines the PyUFunc_API table: */
+#include "__ufunc_api.c"
  
  NPY_NO_EXPORT int initscalarmath(PyObject *);
  NPY_NO_EXPORT int set_matmul_flags(PyObject *d); /* in ufunc_object.c */
@@ -125,16 +128,22 @@ PyArray_GetPriority(PyObject *obj, double default_)
          return NPY_SCALAR_PRIORITY;
      }
  
-    ret = PyArray_LookupSpecial_OnInstance(obj, "__array_priority__");
+    ret = PyArray_LookupSpecial_OnInstance(obj, npy_ma_str_array_priority);
      if (ret == NULL) {
          if (PyErr_Occurred()) {
-            PyErr_Clear(); /* TODO[gh-14801]: propagate crashes during attribute access? */
+            /* TODO[gh-14801]: propagate crashes during attribute access? */
+            PyErr_Clear();
          }
          return default_;
      }
  
      priority = PyFloat_AsDouble(ret);
      Py_DECREF(ret);
+    if (error_converting(priority)) {
+        /* TODO[gh-14801]: propagate crashes for bad priority? */
+        PyErr_Clear();
+        return default_;
+    }
      return priority;
  }
  
@@ -1500,7 +1509,8 @@ PyArray_EquivTypes(PyArray_Descr *type1, PyArray_Descr *type2)
       * Do not use PyArray_CanCastTypeTo because it supports legacy flexible
       * dtypes as input.
       */
-    NPY_CASTING safety = PyArray_GetCastSafety(type1, type2, NULL);
+    npy_intp view_offset;
+    NPY_CASTING safety = PyArray_GetCastInfo(type1, type2, NULL, &view_offset);
      if (safety < 0) {
          PyErr_Clear();
          return 0;
@@ -1531,8 +1541,9 @@ PyArray_EquivTypenums(int typenum1, int typenum2)
  
  /*** END C-API FUNCTIONS **/
  /*
- * NPY_RELAXED_STRIDES_CHECKING: If the strides logic is changed, the
- * order specific stride setting is not necessary.
+ * NOTE: The order specific stride setting is not necessary to preserve
+ *       contiguity and could be removed.  However, this way the resulting
+ *       strides look better for Fortran-order inputs.
   */
  static NPY_STEALS_REF_TO_ARG(1) PyObject *
  _prepend_ones(PyArrayObject *arr, int nd, int ndmin, NPY_ORDER order)
@@ -4455,6 +4466,8 @@ static struct PyMethodDef array_module_methods[] = {
          METH_VARARGS | METH_KEYWORDS, NULL},
      {"_get_experimental_dtype_api", (PyCFunction)_get_experimental_dtype_api,
          METH_O, NULL},
+    {"_load_from_filelike", (PyCFunction)_load_from_filelike,
+        METH_FASTCALL | METH_KEYWORDS, NULL},
      /* from umath */
      {"frompyfunc",
          (PyCFunction) ufunc_frompyfunc,
@@ -4475,12 +4488,14 @@ static struct PyMethodDef array_module_methods[] = {
          METH_VARARGS, NULL},
      {"_get_sfloat_dtype",
          get_sfloat_dtype, METH_NOARGS, NULL},
+    {"_get_madvise_hugepage", (PyCFunction)_get_madvise_hugepage,
+        METH_NOARGS, NULL},
      {"_set_madvise_hugepage", (PyCFunction)_set_madvise_hugepage,
          METH_O, NULL},
      {"_reload_guard", (PyCFunction)_reload_guard,
          METH_NOARGS,
          "Give a warning on reload and big warning in sub-interpreters."},
-    {"_from_dlpack", (PyCFunction)_from_dlpack,
+    {"from_dlpack", (PyCFunction)from_dlpack,
          METH_O, NULL},
      {NULL, NULL, 0, NULL}                /* sentinel */
  };
@@ -4639,7 +4654,6 @@ set_flaginfo(PyObject *d)
      _addnew(FORTRAN, NPY_ARRAY_F_CONTIGUOUS, F);
      _addnew(CONTIGUOUS, NPY_ARRAY_C_CONTIGUOUS, C);
      _addnew(ALIGNED, NPY_ARRAY_ALIGNED, A);
-    _addnew(UPDATEIFCOPY, NPY_ARRAY_UPDATEIFCOPY, U);
      _addnew(WRITEBACKIFCOPY, NPY_ARRAY_WRITEBACKIFCOPY, X);
      _addnew(WRITEABLE, NPY_ARRAY_WRITEABLE, W);
      _addone(C_CONTIGUOUS, NPY_ARRAY_C_CONTIGUOUS);
@@ -4653,6 +4667,12 @@ set_flaginfo(PyObject *d)
      return;
  }
  
+NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_current_allocator = NULL;
+NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array = NULL;
+NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array_function = NULL;
+NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array_struct = NULL;
+NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array_interface = NULL;
+NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array_priority = NULL;
  NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array_wrap = NULL;
  NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_array_finalize = NULL;
  NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_implementation = NULL;
@@ -4664,6 +4684,30 @@ NPY_VISIBILITY_HIDDEN PyObject * npy_ma_str_numpy = NULL;
  static int
  intern_strings(void)
  {
+    npy_ma_str_current_allocator = PyUnicode_InternFromString("current_allocator");
+    if (npy_ma_str_current_allocator == NULL) {
+        return -1;
+    }
+    npy_ma_str_array = PyUnicode_InternFromString("__array__");
+    if (npy_ma_str_array == NULL) {
+        return -1;
+    }
+    npy_ma_str_array_function = PyUnicode_InternFromString("__array_function__");
+    if (npy_ma_str_array_function == NULL) {
+        return -1;
+    }
+    npy_ma_str_array_struct = PyUnicode_InternFromString("__array_struct__");
+    if (npy_ma_str_array_struct == NULL) {
+        return -1;
+    }
+    npy_ma_str_array_priority = PyUnicode_InternFromString("__array_priority__");
+    if (npy_ma_str_array_priority == NULL) {
+        return -1;
+    }
+    npy_ma_str_array_interface = PyUnicode_InternFromString("__array_interface__");
+    if (npy_ma_str_array_interface == NULL) {
+        return -1;
+    }
      npy_ma_str_array_wrap = PyUnicode_InternFromString("__array_wrap__");
      if (npy_ma_str_array_wrap == NULL) {
          return -1;
@@ -4938,8 +4982,7 @@ PyMODINIT_FUNC PyInit__multiarray_umath(void) {
          goto err;
      }
  
-    /* Load the ufunc operators into the array module's namespace */
-    if (InitOperators(d) < 0) {
+    if (initumath(m) != 0) {
          goto err;
      }
  
@@ -4947,9 +4990,6 @@ PyMODINIT_FUNC PyInit__multiarray_umath(void) {
          goto err;
      }
  
-    if (initumath(m) != 0) {
-        goto err;
-    }
      /*
       * Initialize the default PyDataMem_Handler capsule singleton.
       */
diff --git a/numpy/core/src/multiarray/multiarraymodule.h b/numpy/core/src/multiarray/multiarraymodule.h

index 640940d2a9787bad69f6a23848ae6ed25be8669d..809736cd2df01fce8fc1c894280a46290e75be62 100644 (file)
--- a/numpy/core/src/multiarray/multiarraymodule.h
+++ b/numpy/core/src/multiarray/multiarraymodule.h
@@ -1,6 +1,12 @@
  #ifndef NUMPY_CORE_SRC_MULTIARRAY_MULTIARRAYMODULE_H_
  #define NUMPY_CORE_SRC_MULTIARRAY_MULTIARRAYMODULE_H_
  
+NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_current_allocator;
+NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array;
+NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array_function;
+NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array_struct;
+NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array_priority;
+NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array_interface;
  NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array_wrap;
  NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_array_finalize;
  NPY_VISIBILITY_HIDDEN extern PyObject * npy_ma_str_implementation;
diff --git a/numpy/core/src/multiarray/nditer_constr.c b/numpy/core/src/multiarray/nditer_constr.c

index 2812aaf3cb51bc3098f603cd300c5cf249a0cfbd..f82a9624eeb07ce2b164b95169963ab03a4b88aa 100644 (file)
--- a/numpy/core/src/multiarray/nditer_constr.c
+++ b/numpy/core/src/multiarray/nditer_constr.c
@@ -992,7 +992,7 @@ npyiter_check_per_op_flags(npy_uint32 op_flags, npyiter_opitflags *op_itflags)
  }
  
  /*
- * Prepares a a constructor operand.  Assumes a reference to 'op'
+ * Prepares a constructor operand.  Assumes a reference to 'op'
   * is owned, and that 'op' may be replaced.  Fills in 'op_dataptr',
   * 'op_dtype', and may modify 'op_itflags'.
   *
diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/scalarapi.c

index 4d4b4bccb0e49861ae4d762e5236004e6729e44e..40f1da2c4a9688b20134be5bd68e1ad39d3b8328 100644 (file)
--- a/numpy/core/src/multiarray/scalarapi.c
+++ b/numpy/core/src/multiarray/scalarapi.c
@@ -88,83 +88,12 @@ scalar_value(PyObject *scalar, PyArray_Descr *descr)
      }
  
      /*
-     * Must be a user-defined type --- check to see which
-     * scalar it inherits from.
+     * Must be a user defined type with an associated (registered) dtype.
+     * Thus, it cannot be flexible (user dtypes cannot be), so we can (and
+     * pretty much have no choice but to) assume the below logic always works.
+     * I.e. this assumes that the logic would also work for most of our types.
       */
  
-#define _CHK(cls) PyObject_IsInstance(scalar, \
-            (PyObject *)&Py##cls##ArrType_Type)
-#define _IFCASE(cls) if (_CHK(cls)) return &PyArrayScalar_VAL(scalar, cls)
-
-    if (_CHK(Number)) {
-        if (_CHK(Integer)) {
-            if (_CHK(SignedInteger)) {
-                _IFCASE(Byte);
-                _IFCASE(Short);
-                _IFCASE(Int);
-                _IFCASE(Long);
-                _IFCASE(LongLong);
-                _IFCASE(Timedelta);
-            }
-            else {
-                /* Unsigned Integer */
-                _IFCASE(UByte);
-                _IFCASE(UShort);
-                _IFCASE(UInt);
-                _IFCASE(ULong);
-                _IFCASE(ULongLong);
-            }
-        }
-        else {
-            /* Inexact */
-            if (_CHK(Floating)) {
-                _IFCASE(Half);
-                _IFCASE(Float);
-                _IFCASE(Double);
-                _IFCASE(LongDouble);
-            }
-            else {
-                /*ComplexFloating */
-                _IFCASE(CFloat);
-                _IFCASE(CDouble);
-                _IFCASE(CLongDouble);
-            }
-        }
-    }
-    else if (_CHK(Bool)) {
-        return &PyArrayScalar_VAL(scalar, Bool);
-    }
-    else if (_CHK(Datetime)) {
-        return &PyArrayScalar_VAL(scalar, Datetime);
-    }
-    else if (_CHK(Flexible)) {
-        if (_CHK(String)) {
-            return (void *)PyBytes_AS_STRING(scalar);
-        }
-        if (_CHK(Unicode)) {
-            /* Treat this the same as the NPY_UNICODE base class */
-
-            /* lazy initialization, to reduce the memory used by string scalars */
-            if (PyArrayScalar_VAL(scalar, Unicode) == NULL) {
-                Py_UCS4 *raw_data = PyUnicode_AsUCS4Copy(scalar);
-                if (raw_data == NULL) {
-                    return NULL;
-                }
-                PyArrayScalar_VAL(scalar, Unicode) = raw_data;
-                return (void *)raw_data;
-            }
-            return PyArrayScalar_VAL(scalar, Unicode);
-        }
-        if (_CHK(Void)) {
-            /* Note: no & needed here, so can't use _IFCASE */
-            return PyArrayScalar_VAL(scalar, Void);
-        }
-    }
-    else {
-        _IFCASE(Object);
-    }
-
-
      /*
       * Use the alignment flag to figure out where the data begins
       * after a PyObject_HEAD
@@ -177,16 +106,20 @@ scalar_value(PyObject *scalar, PyArray_Descr *descr)
          memloc = ((memloc + align - 1)/align)*align;
      }
      return (void *)memloc;
-#undef _IFCASE
-#undef _CHK
  }
  
  /*NUMPY_API
- * return true an object is exactly a numpy scalar
+ * return 1 if an object is exactly a numpy scalar
   */
  NPY_NO_EXPORT int
  PyArray_CheckAnyScalarExact(PyObject * obj)
  {
+    if (obj == NULL) {
+        PyErr_SetString(PyExc_ValueError,
+            "obj is NULL in PyArray_CheckAnyScalarExact");
+        return 0;
+    }
+
      return is_anyscalar_exact(obj);
  }
  
@@ -379,6 +312,14 @@ PyArray_FromScalar(PyObject *scalar, PyArray_Descr *outcode)
  NPY_NO_EXPORT PyObject *
  PyArray_ScalarFromObject(PyObject *object)
  {
+    if (DEPRECATE(
+            "PyArray_ScalarFromObject() is deprecated and scheduled for "
+            "removal. If you are using this (undocumented) function, "
+            "please notify the NumPy developers to look for solutions."
+            "(Deprecated in NumPy 1.23)") < 0) {
+        return NULL;
+    }
+
      PyObject *ret = NULL;
  
      if (PyArray_IsZeroDim(object)) {
diff --git a/numpy/core/src/multiarray/scalartypes.c.src b/numpy/core/src/multiarray/scalartypes.c.src

index 2249fa22bc5a3a348ea051580294ad3369b97cc7..459e5b222f2c912a6bc091d58e57b19509d1bdad 100644 (file)
--- a/numpy/core/src/multiarray/scalartypes.c.src
+++ b/numpy/core/src/multiarray/scalartypes.c.src
@@ -20,6 +20,7 @@
  #include "ctors.h"
  #include "usertypes.h"
  #include "numpyos.h"
+#include "can_cast_table.h"
  #include "common.h"
  #include "scalartypes.h"
  #include "_datetime.h"
@@ -1203,8 +1204,7 @@ gentype_struct_get(PyObject *self, void *NPY_UNUSED(ignored))
      inter->two = 2;
      inter->nd = 0;
      inter->flags = PyArray_FLAGS(arr);
-    inter->flags &= ~(NPY_ARRAY_UPDATEIFCOPY | NPY_ARRAY_WRITEBACKIFCOPY |
-                      NPY_ARRAY_OWNDATA);
+    inter->flags &= ~(NPY_ARRAY_WRITEBACKIFCOPY | NPY_ARRAY_OWNDATA);
      inter->flags |= NPY_ARRAY_NOTSWAPPED;
      inter->typekind = PyArray_DESCR(arr)->kind;
      inter->itemsize = PyArray_DESCR(arr)->elsize;
@@ -3712,13 +3712,6 @@ _npy_smallest_type_of_kind_table[NPY_NSCALARKINDS];
  NPY_NO_EXPORT signed char
  _npy_next_larger_type_table[NPY_NTYPES];
  
-/*
- * This table describes safe casting for small type numbers,
- * and is used by PyArray_CanCastSafely.
- */
-NPY_NO_EXPORT unsigned char
-_npy_can_cast_safely_table[NPY_NTYPES][NPY_NTYPES];
-
  /*
   * This table gives the smallest-size and smallest-kind type to which
   * the input types may be safely cast, according to _npy_can_cast_safely.
@@ -3769,161 +3762,6 @@ initialize_casting_tables(void)
  
      /**end repeat**/
  
-    memset(_npy_can_cast_safely_table, 0, sizeof(_npy_can_cast_safely_table));
-
-    for (i = 0; i < NPY_NTYPES; ++i) {
-        /* Identity */
-        _npy_can_cast_safely_table[i][i] = 1;
-        if (i != NPY_DATETIME) {
-            /*
-             * Bool -> <Anything> except datetime (since
-             *                    it conceptually has no zero)
-             */
-            _npy_can_cast_safely_table[NPY_BOOL][i] = 1;
-        }
-        /* <Anything> -> Object */
-        _npy_can_cast_safely_table[i][NPY_OBJECT] = 1;
-        /* <Anything> -> Void */
-        _npy_can_cast_safely_table[i][NPY_VOID] = 1;
-    }
-
-    _npy_can_cast_safely_table[NPY_STRING][NPY_UNICODE] = 1;
-
-#ifndef NPY_SIZEOF_BYTE
-#define NPY_SIZEOF_BYTE 1
-#endif
-
-    /* Compile-time loop of casting rules */
-
-    /**begin repeat
-     * #FROM_NAME = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
-     *              LONG, ULONG, LONGLONG, ULONGLONG,
-     *              HALF, FLOAT, DOUBLE, LONGDOUBLE,
-     *              CFLOAT, CDOUBLE, CLONGDOUBLE#
-     * #FROM_BASENAME = BYTE, BYTE, SHORT, SHORT, INT, INT,
-     *                  LONG, LONG, LONGLONG, LONGLONG,
-     *                  HALF, FLOAT, DOUBLE, LONGDOUBLE,
-     *                  FLOAT, DOUBLE, LONGDOUBLE#
-     * #from_isint = 1, 0, 1, 0, 1, 0, 1, 0,
-     *               1, 0, 0, 0, 0, 0,
-     *               0, 0, 0#
-     * #from_isuint = 0, 1, 0, 1, 0, 1, 0, 1,
-     *                0, 1, 0, 0, 0, 0,
-     *                0, 0, 0#
-     * #from_isfloat = 0, 0, 0, 0, 0, 0, 0, 0,
-     *                 0, 0, 1, 1, 1, 1,
-     *                 0, 0, 0#
-     * #from_iscomplex = 0, 0, 0, 0, 0, 0, 0, 0,
-     *                   0, 0, 0, 0, 0, 0,
-     *                   1, 1, 1#
-     */
-
-#define _FROM_BSIZE NPY_SIZEOF_@FROM_BASENAME@
-#define _FROM_NUM   (NPY_@FROM_NAME@)
-
-    _npy_can_cast_safely_table[_FROM_NUM][NPY_STRING] = 1;
-    _npy_can_cast_safely_table[_FROM_NUM][NPY_UNICODE] = 1;
-
-#if @from_isint@ && NPY_SIZEOF_TIMEDELTA >= _FROM_BSIZE
-    /* Allow casts from smaller or equal signed integers to the TIMEDELTA type */
-    _npy_can_cast_safely_table[_FROM_NUM][NPY_TIMEDELTA] = 1;
-#elif @from_isuint@ && NPY_SIZEOF_TIMEDELTA > _FROM_BSIZE
-    /* Allow casts from smaller unsigned integers to the TIMEDELTA type */
-    _npy_can_cast_safely_table[_FROM_NUM][NPY_TIMEDELTA] = 1;
-#endif
-
-    /**begin repeat1
-     * #TO_NAME = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
-     *            LONG, ULONG, LONGLONG, ULONGLONG,
-     *            HALF, FLOAT, DOUBLE, LONGDOUBLE,
-     *            CFLOAT, CDOUBLE, CLONGDOUBLE#
-     * #TO_BASENAME = BYTE, BYTE, SHORT, SHORT, INT, INT,
-     *                LONG, LONG, LONGLONG, LONGLONG,
-     *                HALF, FLOAT, DOUBLE, LONGDOUBLE,
-     *                FLOAT, DOUBLE, LONGDOUBLE#
-     * #to_isint = 1, 0, 1, 0, 1, 0, 1, 0,
-     *             1, 0, 0, 0, 0, 0,
-     *             0, 0, 0#
-     * #to_isuint = 0, 1, 0, 1, 0, 1, 0, 1,
-     *              0, 1, 0, 0, 0, 0,
-     *              0, 0, 0#
-     * #to_isfloat = 0, 0, 0, 0, 0, 0, 0, 0,
-     *               0, 0, 1, 1, 1, 1,
-     *               0, 0, 0#
-     * #to_iscomplex = 0, 0, 0, 0, 0, 0, 0, 0,
-     *                 0, 0, 0, 0, 0, 0,
-     *                 1, 1, 1#
-     */
-#define _TO_BSIZE NPY_SIZEOF_@TO_BASENAME@
-#define _TO_NUM   (NPY_@TO_NAME@)
-
-    /*
-     * NOTE: _FROM_BSIZE and _TO_BSIZE are the sizes of the "base type"
-     *       which is the same as the size of the type except for
-     *       complex, where it is the size of the real type.
-     */
-
-#if @from_isint@
-
-#  if @to_isint@ && (_TO_BSIZE >= _FROM_BSIZE)
-    /* int -> int */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_isfloat@ && (_FROM_BSIZE < 8) && (_TO_BSIZE > _FROM_BSIZE)
-    /* int -> float */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_isfloat@ && (_FROM_BSIZE >= 8) && (_TO_BSIZE >= _FROM_BSIZE)
-    /* int -> float */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_iscomplex@ && (_FROM_BSIZE < 8) && (_TO_BSIZE > _FROM_BSIZE)
-    /* int -> complex */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_iscomplex@ && (_FROM_BSIZE >= 8) && (_TO_BSIZE >= _FROM_BSIZE)
-    /* int -> complex */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  endif
-
-#elif @from_isuint@
-
-#  if @to_isint@ && (_TO_BSIZE > _FROM_BSIZE)
-    /* uint -> int */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_isuint@ && (_TO_BSIZE >= _FROM_BSIZE)
-    /* uint -> uint */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_isfloat@ && (_FROM_BSIZE < 8) && (_TO_BSIZE > _FROM_BSIZE)
-    /* uint -> float */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_isfloat@ && (_FROM_BSIZE >= 8) && (_TO_BSIZE >= _FROM_BSIZE)
-    /* uint -> float */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_iscomplex@ && (_FROM_BSIZE < 8) && (_TO_BSIZE > _FROM_BSIZE)
-    /* uint -> complex */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_iscomplex@ && (_FROM_BSIZE >= 8) && (_TO_BSIZE >= _FROM_BSIZE)
-    /* uint -> complex */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  endif
-
-
-#elif @from_isfloat@
-
-#  if @to_isfloat@ && (_TO_BSIZE >= _FROM_BSIZE)
-    /* float -> float */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  elif @to_iscomplex@ && (_TO_BSIZE >= _FROM_BSIZE)
-    /* float -> complex */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  endif
-
-#elif @from_iscomplex@
-
-#  if @to_iscomplex@ && (_TO_BSIZE >= _FROM_BSIZE)
-    /* complex -> complex */
-    _npy_can_cast_safely_table[_FROM_NUM][_TO_NUM] = 1;
-#  endif
-
-#endif
-
  #undef _TO_NUM
  #undef _TO_BSIZE
  
diff --git a/numpy/core/src/multiarray/scalartypes.h b/numpy/core/src/multiarray/scalartypes.h

index 95a2f66c6fbcc6edd41692d284f157a8778f7799..4d6eda2a1b7b8d6038fd002301b351e4b263c7c3 100644 (file)
--- a/numpy/core/src/multiarray/scalartypes.h
+++ b/numpy/core/src/multiarray/scalartypes.h
@@ -1,9 +1,10 @@
  #ifndef NUMPY_CORE_SRC_MULTIARRAY_SCALARTYPES_H_
  #define NUMPY_CORE_SRC_MULTIARRAY_SCALARTYPES_H_
  
-/* Internal look-up tables */
-extern NPY_NO_EXPORT unsigned char
-_npy_can_cast_safely_table[NPY_NTYPES][NPY_NTYPES];
+/*
+ * Internal look-up tables; casting safety is defined in convert_datatype.h.
+ * Most of these should be phased out eventually, but some are still used.
+ */
  extern NPY_NO_EXPORT signed char
  _npy_scalar_kinds_table[NPY_NTYPES];
  extern NPY_NO_EXPORT signed char
diff --git a/numpy/core/src/multiarray/shape.c b/numpy/core/src/multiarray/shape.c

index 162abd6a49c8e01cef3c516c8a2c44bbe5c1a0b8..98f65415b79f855883970b009cd94a6bb666d277 100644 (file)
--- a/numpy/core/src/multiarray/shape.c
+++ b/numpy/core/src/multiarray/shape.c
@@ -244,11 +244,9 @@ PyArray_Newshape(PyArrayObject *self, PyArray_Dims *newdims,
       * in order to get the right orientation and
       * because we can't just re-use the buffer with the
       * data in the order it is in.
-     * NPY_RELAXED_STRIDES_CHECKING: size check is unnecessary when set.
       */
      Py_INCREF(self);
-    if ((PyArray_SIZE(self) > 1) &&
-        ((order == NPY_CORDER && !PyArray_IS_C_CONTIGUOUS(self)) ||
+    if (((order == NPY_CORDER && !PyArray_IS_C_CONTIGUOUS(self)) ||
           (order == NPY_FORTRANORDER && !PyArray_IS_F_CONTIGUOUS(self)))) {
          int success = 0;
          success = _attempt_nocopy_reshape(self, ndim, dimensions,
@@ -1000,7 +998,6 @@ PyArray_Flatten(PyArrayObject *a, NPY_ORDER order)
   *          If an axis flagged for removal has a shape larger than one,
   *          the aligned flag (and in the future the contiguous flags),
   *          may need explicit update.
- *          (check also NPY_RELAXED_STRIDES_CHECKING)
   *
   * For example, this can be used to remove the reduction axes
   * from a reduction result once its computation is complete.
@@ -1024,6 +1021,6 @@ PyArray_RemoveAxesInPlace(PyArrayObject *arr, const npy_bool *flags)
      /* The final number of dimensions */
      fa->nd = idim_out;
  
-    /* May not be necessary for NPY_RELAXED_STRIDES_CHECKING (see comment) */
+    /* NOTE: This is only necessary if a dimension with size != 1 was removed */
      PyArray_UpdateFlags(arr, NPY_ARRAY_C_CONTIGUOUS | NPY_ARRAY_F_CONTIGUOUS);
  }
diff --git a/numpy/core/src/multiarray/temp_elide.c b/numpy/core/src/multiarray/temp_elide.c

index f615aa3360e69e561f8d36b9190dde23575845b6..34248076c98ee30608d4fabe7d3d4852a5fc0a83 100644 (file)
--- a/numpy/core/src/multiarray/temp_elide.c
+++ b/numpy/core/src/multiarray/temp_elide.c
@@ -286,7 +286,6 @@ can_elide_temp(PyObject *olhs, PyObject *orhs, int *cannot)
              !PyArray_ISNUMBER(alhs) ||
              !PyArray_CHKFLAGS(alhs, NPY_ARRAY_OWNDATA) ||
              !PyArray_ISWRITEABLE(alhs) ||
-            PyArray_CHKFLAGS(alhs, NPY_ARRAY_UPDATEIFCOPY) ||
              PyArray_CHKFLAGS(alhs, NPY_ARRAY_WRITEBACKIFCOPY) ||
              PyArray_NBYTES(alhs) < NPY_MIN_ELIDE_BYTES) {
          return 0;
@@ -365,7 +364,6 @@ can_elide_temp_unary(PyArrayObject * m1)
              !PyArray_ISNUMBER(m1) ||
              !PyArray_CHKFLAGS(m1, NPY_ARRAY_OWNDATA) ||
              !PyArray_ISWRITEABLE(m1) ||
-            PyArray_CHKFLAGS(m1, NPY_ARRAY_UPDATEIFCOPY) ||
              PyArray_NBYTES(m1) < NPY_MIN_ELIDE_BYTES) {
          return 0;
      }
diff --git a/numpy/core/src/multiarray/textreading/conversions.c b/numpy/core/src/multiarray/textreading/conversions.c

new file mode 100644 (file)

index 0000000..11f4210
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/conversions.c
@@ -0,0 +1,395 @@
+
+#include <Python.h>
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdbool.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "lowlevel_strided_loops.h"
+
+#include "conversions.h"
+#include "str_to_int.h"
+
+#include "array_coercion.h"
+
+
+/*
+ * Coercion to boolean is done via integer right now.
+ */
+NPY_NO_EXPORT int
+to_bool(PyArray_Descr *NPY_UNUSED(descr),
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *NPY_UNUSED(pconfig))
+{
+    int64_t res;
+    if (str_to_int64(str, end, INT64_MIN, INT64_MAX, &res) < 0) {
+        return -1;
+    }
+    *dataptr = (char)(res != 0);
+    return 0;
+}
+
+
+/*
+ * In order to not pack a whole copy of a floating point parser, we copy the
+ * result into ascii and call the Python one.  Float parsing isn't super quick
+ * so this is not terrible, but avoiding it would speed up things.
+ *
+ * Also note that parsing the first float of a complex will copy the whole
+ * string to ascii rather than just the first part.
+ * TODO: A tweak of the break might be a simple mitigation there.
+ *
+ * @param str The UCS4 string to parse
+ * @param end Pointer to the end of the string
+ * @param strip_whitespace If false, does not skip leading or trailing
+ *        whitespace (used by the complex parser).
+ * @param result Output stored as double value.
+ */
+static NPY_INLINE int
+double_from_ucs4(
+        const Py_UCS4 *str, const Py_UCS4 *end,
+        bool strip_whitespace, double *result, const Py_UCS4 **p_end)
+{
+    /* skip leading whitespace */
+    if (strip_whitespace) {
+        while (Py_UNICODE_ISSPACE(*str)) {
+            str++;
+        }
+    }
+    if (str == end) {
+        return -1;  /* empty or only whitespace: not a floating point number */
+    }
+
+    /* We convert to ASCII for the Python parser, use stack if small: */
+    char stack_buf[128];
+    char *heap_buf = NULL;
+    char *ascii = stack_buf;
+
+    size_t str_len = end - str + 1;
+    if (str_len > 128) {
+        heap_buf = PyMem_MALLOC(str_len);
+        if (heap_buf == NULL) {
+            PyErr_NoMemory();
+            return -1;
+        }
+        ascii = heap_buf;
+    }
+    char *c = ascii;
+    for (; str < end; str++, c++) {
+        if (NPY_UNLIKELY(*str >= 128)) {
+            /* Character cannot be used, ignore for end calculation and stop */
+            end = str;
+            break;
+        }
+        *c = (char)(*str);
+    }
+    *c = '\0';
+
+    char *end_parsed;
+    *result = PyOS_string_to_double(ascii, &end_parsed, NULL);
+    /* Rewind `end` to the first UCS4 character not parsed: */
+    end = end - (c - end_parsed);
+
+    PyMem_FREE(heap_buf);
+
+    if (*result == -1. && PyErr_Occurred()) {
+        return -1;
+    }
+
+    if (strip_whitespace) {
+        /* and then skip any remaining whitespace: */
+        while (Py_UNICODE_ISSPACE(*end)) {
+            end++;
+        }
+    }
+    *p_end = end;
+    return 0;
+}
+
+
+NPY_NO_EXPORT int
+to_float(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *NPY_UNUSED(pconfig))
+{
+    double double_val;
+    const Py_UCS4 *p_end;
+    if (double_from_ucs4(str, end, true, &double_val, &p_end) < 0) {
+        return -1;
+    }
+    if (p_end != end) {
+        return -1;
+    }
+
+    float val = (float)double_val;
+    memcpy(dataptr, &val, sizeof(float));
+    if (!PyArray_ISNBO(descr->byteorder)) {
+        npy_bswap4_unaligned(dataptr);
+    }
+    return 0;
+}
+
+
+NPY_NO_EXPORT int
+to_double(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *NPY_UNUSED(pconfig))
+{
+    double val;
+    const Py_UCS4 *p_end;
+    if (double_from_ucs4(str, end, true, &val, &p_end) < 0) {
+        return -1;
+    }
+    if (p_end != end) {
+        return -1;
+    }
+
+    memcpy(dataptr, &val, sizeof(double));
+    if (!PyArray_ISNBO(descr->byteorder)) {
+        npy_bswap8_unaligned(dataptr);
+    }
+    return 0;
+}
+
+
+static bool
+to_complex_int(
+        const Py_UCS4 *item, const Py_UCS4 *token_end,
+        double *p_real, double *p_imag,
+        Py_UCS4 imaginary_unit, bool allow_parens)
+{
+    const Py_UCS4 *p_end;
+    bool unmatched_opening_paren = false;
+
+    /* Remove whitespace before the possibly leading '(' */
+    while (Py_UNICODE_ISSPACE(*item)) {
+        ++item;
+    }
+    if (allow_parens && (*item == '(')) {
+        unmatched_opening_paren = true;
+        ++item;
+        /* Allow whitespace within the parentheses: "( 1j)" */
+        while (Py_UNICODE_ISSPACE(*item)) {
+            ++item;
+        }
+    }
+    if (double_from_ucs4(item, token_end, false, p_real, &p_end) < 0) {
+        return false;
+    }
+    if (p_end == token_end) {
+        /* No imaginary part in the string (e.g. "3.5") */
+        *p_imag = 0.0;
+        return !unmatched_opening_paren;
+    }
+    if (*p_end == imaginary_unit) {
+        /* Only an imaginary part (e.g. "1.5j") */
+        *p_imag = *p_real;
+        *p_real = 0.0;
+        ++p_end;
+    }
+    else if (*p_end == '+' || *p_end == '-') {
+        /* Imaginary part still to parse */
+        if (*p_end == '+') {
+            ++p_end;  /* Advance to support +- (and ++) */
+        }
+        if (double_from_ucs4(p_end, token_end, false, p_imag, &p_end) < 0) {
+            return false;
+        }
+        if (*p_end != imaginary_unit) {
+            return false;
+        }
+        ++p_end;
+    }
+    else {
+        *p_imag = 0;
+    }
+
+    if (unmatched_opening_paren) {
+        /* Allow whitespace inside brackets as in "(1+2j )" or "( 1j )" */
+        while (Py_UNICODE_ISSPACE(*p_end)) {
+            ++p_end;
+        }
+        if (*p_end == ')') {
+            ++p_end;
+        }
+        else {
+            /* parenthesis was not closed */
+            return false;
+        }
+    }
+
+    while (Py_UNICODE_ISSPACE(*p_end)) {
+        ++p_end;
+    }
+    return p_end == token_end;
+}
+
+
+NPY_NO_EXPORT int
+to_cfloat(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig)
+{
+    double real;
+    double imag;
+
+    bool success = to_complex_int(
+            str, end, &real, &imag,
+            pconfig->imaginary_unit, true);
+
+    if (!success) {
+        return -1;
+    }
+    npy_complex64 val = {(float)real, (float)imag};
+    memcpy(dataptr, &val, sizeof(npy_complex64));
+    if (!PyArray_ISNBO(descr->byteorder)) {
+        npy_bswap4_unaligned(dataptr);
+        npy_bswap4_unaligned(dataptr + 4);
+    }
+    return 0;
+}
+
+
+NPY_NO_EXPORT int
+to_cdouble(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig)
+{
+    double real;
+    double imag;
+
+    bool success = to_complex_int(
+            str, end, &real, &imag, pconfig->imaginary_unit, true);
+
+    if (!success) {
+        return -1;
+    }
+    npy_complex128 val = {real, imag};
+    memcpy(dataptr, &val, sizeof(npy_complex128));
+    if (!PyArray_ISNBO(descr->byteorder)) {
+        npy_bswap8_unaligned(dataptr);
+        npy_bswap8_unaligned(dataptr + 8);
+    }
+    return 0;
+}
+
+
+/*
+ * String and unicode conversion functions.
+ */
+NPY_NO_EXPORT int
+to_string(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *NPY_UNUSED(unused))
+{
+    const Py_UCS4* c = str;
+    size_t length = descr->elsize;
+
+    for (size_t i = 0; i < length; i++) {
+        if (c < end) {
+            /*
+             * loadtxt assumed latin1, which is compatible with UCS1 (first
+             * 256 unicode characters).
+             */
+            if (NPY_UNLIKELY(*c > 255)) {
+                /* TODO: Was UnicodeDecodeError, is unspecific error good? */
+                return -1;
+            }
+            dataptr[i] = (Py_UCS1)(*c);
+            c++;
+        }
+        else {
+            dataptr[i] = '\0';
+        }
+    }
+    return 0;
+}
+
+
+NPY_NO_EXPORT int
+to_unicode(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *NPY_UNUSED(unused))
+{
+    int length = descr->elsize / 4;
+
+    if (length <= end - str) {
+        memcpy(dataptr, str, length * 4);
+    }
+    else {
+        size_t given_len = end - str;
+        memcpy(dataptr, str, given_len * 4);
+        memset(dataptr + given_len * 4, '\0', (length - given_len) * 4);
+    }
+
+    if (!PyArray_ISNBO(descr->byteorder)) {
+        for (int i = 0; i < length; i++) {
+            npy_bswap4_unaligned(dataptr);
+            dataptr += 4;
+        }
+    }
+    return 0;
+}
+
+
+
+/*
+ * Helper that calls the converter function for the generic converter.
+ */
+static PyObject *
+call_converter_function(
+        PyObject *func, const Py_UCS4 *str, size_t length, bool byte_converters)
+{
+    PyObject *s = PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND, str, length);
+    if (s == NULL) {
+        return s;
+    }
+    if (byte_converters) {
+        Py_SETREF(s, PyUnicode_AsEncodedString(s, "latin1", NULL));
+        if (s == NULL) {
+            return NULL;
+        }
+    }
+    if (func == NULL) {
+        return s;
+    }
+    PyObject *result = PyObject_CallFunctionObjArgs(func, s, NULL);
+    Py_DECREF(s);
+    return result;
+}
+
+
+NPY_NO_EXPORT int
+to_generic_with_converter(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *config, PyObject *func)
+{
+    bool use_byte_converter;
+    if (func == NULL) {
+        use_byte_converter = config->c_byte_converters;
+    }
+    else {
+        use_byte_converter = config->python_byte_converters;
+    }
+    /* Converts to unicode and calls custom converter (if set) */
+    PyObject *converted = call_converter_function(
+            func, str, (size_t)(end - str), use_byte_converter);
+    if (converted == NULL) {
+        return -1;
+    }
+
+    int res = PyArray_Pack(descr, dataptr, converted);
+    Py_DECREF(converted);
+    return res;
+}
+
+
+NPY_NO_EXPORT int
+to_generic(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *config)
+{
+    return to_generic_with_converter(descr, str, end, dataptr, config, NULL);
+}
diff --git a/numpy/core/src/multiarray/textreading/conversions.h b/numpy/core/src/multiarray/textreading/conversions.h

new file mode 100644 (file)

index 0000000..222eea4
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/conversions.h
@@ -0,0 +1,57 @@
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_CONVERSIONS_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_CONVERSIONS_H_
+
+#include <stdbool.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/arrayobject.h"
+
+#include "textreading/parser_config.h"
+
+NPY_NO_EXPORT int
+to_bool(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig);
+
+NPY_NO_EXPORT int
+to_float(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig);
+
+NPY_NO_EXPORT int
+to_double(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig);
+
+NPY_NO_EXPORT int
+to_cfloat(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig);
+
+NPY_NO_EXPORT int
+to_cdouble(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig);
+
+NPY_NO_EXPORT int
+to_string(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *unused);
+
+NPY_NO_EXPORT int
+to_unicode(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *unused);
+
+NPY_NO_EXPORT int
+to_generic_with_converter(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *config, PyObject *func);
+
+NPY_NO_EXPORT int
+to_generic(PyArray_Descr *descr,
+        const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,
+        parser_config *pconfig);
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_CONVERSIONS_H_ */
diff --git a/numpy/core/src/multiarray/textreading/field_types.c b/numpy/core/src/multiarray/textreading/field_types.c

new file mode 100644 (file)

index 0000000..0722efd
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/field_types.c
@@ -0,0 +1,201 @@
+#include "field_types.h"
+#include "conversions.h"
+#include "str_to_int.h"
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/ndarraytypes.h"
+#include "alloc.h"
+
+#include "textreading/growth.h"
+
+
+NPY_NO_EXPORT void
+field_types_xclear(int num_field_types, field_type *ft) {
+    assert(num_field_types >= 0);
+    if (ft == NULL) {
+        return;
+    }
+    for (int i = 0; i < num_field_types; i++) {
+        Py_XDECREF(ft[i].descr);
+        ft[i].descr = NULL;
+    }
+    PyMem_Free(ft);
+}
+
+
+/*
+ * Fetch custom converters for the builtin NumPy DTypes (or the generic one).
+ * Structured DTypes get unpacked and `object` uses the generic method.
+ *
+ * TODO: This should probably be moved on the DType object in some form,
+ *       to allow user DTypes to define their own converters.
+ */
+static set_from_ucs4_function *
+get_from_ucs4_function(PyArray_Descr *descr)
+{
+    if (descr->type_num == NPY_BOOL) {
+        return &to_bool;
+    }
+    else if (PyDataType_ISSIGNED(descr)) {
+        switch (descr->elsize) {
+            case 1:
+                return &to_int8;
+            case 2:
+                return &to_int16;
+            case 4:
+                return &to_int32;
+            case 8:
+                return &to_int64;
+            default:
+                assert(0);
+        }
+    }
+    else if (PyDataType_ISUNSIGNED(descr)) {
+        switch (descr->elsize) {
+            case 1:
+                return &to_uint8;
+            case 2:
+                return &to_uint16;
+            case 4:
+                return &to_uint32;
+            case 8:
+                return &to_uint64;
+            default:
+                assert(0);
+        }
+    }
+    else if (descr->type_num == NPY_FLOAT) {
+        return &to_float;
+    }
+    else if (descr->type_num == NPY_DOUBLE) {
+        return &to_double;
+    }
+    else if (descr->type_num == NPY_CFLOAT) {
+        return &to_cfloat;
+    }
+    else if (descr->type_num == NPY_CDOUBLE) {
+        return &to_cdouble;
+    }
+    else if (descr->type_num == NPY_STRING) {
+        return &to_string;
+    }
+    else if (descr->type_num == NPY_UNICODE) {
+        return &to_unicode;
+    }
+    return &to_generic;
+}
+
+
+/*
+ * Note that the function cleans up `ft` on error.  If `num_field_types < 0`
+ * cleanup has already happened in the internal call.
+ */
+static npy_intp
+field_type_grow_recursive(PyArray_Descr *descr,
+        npy_intp num_field_types, field_type **ft, npy_intp *ft_size,
+        npy_intp field_offset)
+{
+    if (PyDataType_HASSUBARRAY(descr)) {
+        PyArray_Dims shape = {NULL, -1};
+
+        if (!(PyArray_IntpConverter(descr->subarray->shape, &shape))) {
+            PyErr_SetString(PyExc_ValueError, "invalid subarray shape");
+            field_types_xclear(num_field_types, *ft);
+            return -1;
+        }
+        npy_intp size = PyArray_MultiplyList(shape.ptr, shape.len);
+        npy_free_cache_dim_obj(shape);
+        for (npy_intp i = 0; i < size; i++) {
+            num_field_types = field_type_grow_recursive(descr->subarray->base,
+                    num_field_types, ft, ft_size, field_offset);
+            field_offset += descr->subarray->base->elsize;
+            if (num_field_types < 0) {
+                return -1;
+            }
+        }
+        return num_field_types;
+    }
+    else if (PyDataType_HASFIELDS(descr)) {
+        npy_int num_descr_fields = PyTuple_Size(descr->names);
+        if (num_descr_fields < 0) {
+            field_types_xclear(num_field_types, *ft);
+            return -1;
+        }
+        for (npy_intp i = 0; i < num_descr_fields; i++) {
+            PyObject *key = PyTuple_GET_ITEM(descr->names, i);
+            PyObject *tup = PyObject_GetItem(descr->fields, key);
+            if (tup == NULL) {
+                field_types_xclear(num_field_types, *ft);
+                return -1;
+            }
+            PyArray_Descr *field_descr;
+            PyObject *title;
+            int offset;
+            if (!PyArg_ParseTuple(tup, "Oi|O", &field_descr, &offset, &title)) {
+                Py_DECREF(tup);
+                field_types_xclear(num_field_types, *ft);
+                return -1;
+            }
+            Py_DECREF(tup);
+            num_field_types = field_type_grow_recursive(
+                    field_descr, num_field_types, ft, ft_size,
+                    field_offset + offset);
+            if (num_field_types < 0) {
+                return -1;
+            }
+        }
+        return num_field_types;
+    }
+
+    if (*ft_size <= num_field_types) {
+        npy_intp alloc_size = grow_size_and_multiply(
+                ft_size, 4, sizeof(field_type));
+        if (alloc_size < 0) {
+            field_types_xclear(num_field_types, *ft);
+            return -1;
+        }
+        field_type *new_ft = PyMem_Realloc(*ft, alloc_size);
+        if (new_ft == NULL) {
+            field_types_xclear(num_field_types, *ft);
+            return -1;
+        }
+        *ft = new_ft;
+    }
+
+    Py_INCREF(descr);
+    (*ft)[num_field_types].descr = descr;
+    (*ft)[num_field_types].set_from_ucs4 = get_from_ucs4_function(descr);
+    (*ft)[num_field_types].structured_offset = field_offset;
+
+    return num_field_types + 1;
+}
+
+
+/*
+ * Prepare the "field_types" for the given dtypes/descriptors.  Currently,
+ * we copy the itemsize, but the main thing is that we check for custom
+ * converters.
+ */
+NPY_NO_EXPORT npy_intp
+field_types_create(PyArray_Descr *descr, field_type **ft)
+{
+    if (descr->subarray != NULL) {
+        /*
+         * This could probably be allowed, but NumPy absorbs the dimensions
+         * so it is an awkward corner case that probably never really worked.
+         */
+        PyErr_SetString(PyExc_TypeError,
+                "file reader does not support subarray dtypes.  You can "
+                "put the dtype into a structured one using "
+                "`np.dtype([('name', dtype)])` to avoid this limitation.");
+        return -1;
+    }
+
+    npy_intp ft_size = 4;
+    *ft = PyMem_Malloc(ft_size * sizeof(field_type));
+    if (*ft == NULL) {
+        return -1;
+    }
+    return field_type_grow_recursive(descr, 0, ft, &ft_size, 0);
+}
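The recursion in `field_type_grow_recursive` above can be sketched in isolation. This is a minimal illustration under simplifying assumptions, not the NumPy implementation: `toy_dtype`, `toy_field`, and `flatten` are hypothetical names, the output buffer has a fixed capacity instead of growing, and there is no reference counting. It shows how subarrays and nested fields collapse into one entry per scalar, each carrying its accumulated byte offset.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical miniature "dtype": a scalar, a struct of fields, or a
 * fixed-length subarray of a base type.  None of these names are NumPy API.
 */
typedef struct toy_dtype {
    size_t elsize;                          /* total size in bytes */
    size_t count;                           /* > 0: subarray of `count` x `base` */
    const struct toy_dtype *base;
    size_t nfields;                         /* > 0: struct with `nfields` members */
    const struct toy_dtype *const *fields;
    const size_t *offsets;                  /* byte offset of each member */
} toy_dtype;

typedef struct {
    const toy_dtype *descr;                 /* the scalar leaf type */
    size_t offset;                          /* byte offset within one row */
} toy_field;

/*
 * Append the scalar leaves of `dt` to `out` (capacity `cap`), starting at
 * index `n`.  Returns the new number of entries, or -1 if `out` is too small.
 */
static int
flatten(const toy_dtype *dt, int n, toy_field *out, int cap, size_t offset)
{
    if (dt->count > 0) {
        /* subarray: visit the base type once per element */
        for (size_t i = 0; i < dt->count; i++) {
            n = flatten(dt->base, n, out, cap, offset + i * dt->base->elsize);
            if (n < 0) {
                return -1;
            }
        }
        return n;
    }
    if (dt->nfields > 0) {
        /* structured type: visit each member at its own offset */
        for (size_t i = 0; i < dt->nfields; i++) {
            n = flatten(dt->fields[i], n, out, cap, offset + dt->offsets[i]);
            if (n < 0) {
                return -1;
            }
        }
        return n;
    }
    /* scalar leaf: record it */
    if (n >= cap) {
        return -1;
    }
    out[n].descr = dt;
    out[n].offset = offset;
    return n + 1;
}
```

For a record holding a 4-byte integer followed by three 8-byte floats, this yields four entries at offsets 0, 4, 12, and 20, which mirrors the one-entry-per-column layout the text reader needs.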
diff --git a/numpy/core/src/multiarray/textreading/field_types.h b/numpy/core/src/multiarray/textreading/field_types.h
new file mode 100644
index 0000000..f26e00a
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/field_types.h
@@ -0,0 +1,67 @@
+
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_FIELD_TYPES_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_FIELD_TYPES_H_
+
+#include <stdint.h>
+#include <stdbool.h>
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/ndarraytypes.h"
+
+#include "textreading/parser_config.h"
+
+/**
+ * Function defining the conversion for each value.
+ *
+ * This function must support unaligned memory access.  As of now, there is
+ * no special error handling (in whatever form):  We assume that it is always
+ * reasonable to raise a `ValueError` noting the string that failed to be
+ * converted.
+ *
+ * NOTE: An earlier version of the code had unused default values (pandas
+ *       does this) when columns are missing.  We could define this either
+ *       by passing `NULL` in, or by adding a default explicitly somewhere.
+ *       (I think users should probably have to define the default, at which
+ *       point it doesn't matter here.)
+ *
+ * NOTE: We are currently passing the parser config, this could be made public
+ *       or could be set up to be dtype specific/private.  Always passing
+ *       pconfig fully seems easier right now even if it may change.
+ *       (A future use-case may for example be user-specified strings that are
+ *       considered boolean True or False).
+ *
+ * TODO: Aside from nailing down the above notes, it may be nice to expose
+ *       these functions publicly.  This could allow user DTypes to provide
+ *       a converter or custom converters written in C rather than Python.
+ *
+ * @param descr The NumPy descriptor of the field (may be byte-swapped, etc.)
+ * @param str Pointer to the beginning of the UCS4 string to be parsed.
+ * @param end Pointer to the end of the UCS4 string.  This value is currently
+ *            guaranteed to be `\0`, ensuring that parsers can rely on
+ *            nul-termination.
+ * @param dataptr The pointer where to store the parsed value
+ * @param pconfig Additional configuration for the parser.
+ * @returns 0 on success and -1 on failure.  If the return value is -1 an
+ *          error may or may not be set.  If an error is set, it is chained
+ *          behind the generic ValueError.
+ */
+typedef int (set_from_ucs4_function)(
+        PyArray_Descr *descr, const Py_UCS4 *str, const Py_UCS4 *end,
+        char *dataptr, parser_config *pconfig);
+
+typedef struct _field_type {
+    set_from_ucs4_function *set_from_ucs4;
+    /* The original NumPy descriptor */
+    PyArray_Descr *descr;
+    /* Offset to this entry within row. */
+    npy_intp structured_offset;
+} field_type;
+
+
+NPY_NO_EXPORT void
+field_types_xclear(int num_field_types, field_type *ft);
+
+NPY_NO_EXPORT npy_intp
+field_types_create(PyArray_Descr *descr, field_type **ft);
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_FIELD_TYPES_H_ */
diff --git a/numpy/core/src/multiarray/textreading/growth.c b/numpy/core/src/multiarray/textreading/growth.c
new file mode 100644
index 0000000..49a09d5
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/growth.c
@@ -0,0 +1,47 @@
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/ndarraytypes.h"
+
+#include "templ_common.h"
+
+/*
+ * Helper function taking the size input and growing it (based on min_grow).
+ * The current scheme is a minimum growth and a general growth by 25%
+ * overallocation.  This is then capped at 2**20 elements, as that propels us
+ * in the range of large page sizes (so it is presumably more than enough).
+ *
+ * It further multiplies it with `itemsize` and ensures that all results fit
+ * into an `npy_intp`.
+ * Returns -1 if any overflow occurred or the result would not fit.
+ * The caller has to ensure that `*size` is not negative.
+ */
+NPY_NO_EXPORT npy_intp
+grow_size_and_multiply(npy_intp *size, npy_intp min_grow, npy_intp itemsize) {
+    /* min_grow must be a power of two: */
+    assert((min_grow & (min_grow - 1)) == 0);
+    npy_uintp new_size = (npy_uintp)*size;
+    npy_intp growth = *size >> 2;
+    if (growth <= min_grow) {
+        /* can never lead to overflow if we are using min_grow */
+        new_size += min_grow;
+    }
+    else {
+        if (growth > 1 << 20) {
+            /* limit growth to order of MiB (even hugepages are not larger) */
+            growth = 1 << 20;
+        }
+        new_size += growth + min_grow - 1;
+        new_size &= ~(npy_uintp)(min_grow - 1);
+
+        if (new_size > NPY_MAX_INTP) {
+            return -1;
+        }
+    }
+    *size = (npy_intp)new_size;
+    npy_intp alloc_size;
+    if (npy_mul_with_overflow_intp(&alloc_size, (npy_intp)new_size, itemsize)) {
+        return -1;
+    }
+    return alloc_size;
+}
+
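The growth policy described in the `grow_size_and_multiply` comment can be tried out without NumPy. The sketch below is a stand-alone approximation: `grow_and_multiply` is a hypothetical name, `ptrdiff_t` stands in for `npy_intp`, and rounding to a multiple of `min_grow` is an assumption about the intent of the power-of-two requirement. It grows by at least `min_grow`, otherwise by 25% capped at 2**20 elements, and reports overflow with -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the growth policy (not NumPy API).
 * Updates *size to the new element count and returns the new byte count,
 * or -1 if the result would not fit into a ptrdiff_t.
 */
static ptrdiff_t
grow_and_multiply(ptrdiff_t *size, ptrdiff_t min_grow, ptrdiff_t itemsize)
{
    assert(*size >= 0 && itemsize > 0);
    /* min_grow must be a positive power of two */
    assert(min_grow > 0 && (min_grow & (min_grow - 1)) == 0);

    size_t new_size = (size_t)*size;
    ptrdiff_t growth = *size >> 2;          /* 25% of the current size */
    if (growth <= min_grow) {
        new_size += (size_t)min_grow;
    }
    else {
        if (growth > ((ptrdiff_t)1 << 20)) {
            growth = (ptrdiff_t)1 << 20;    /* cap the 25% growth */
        }
        /* assumed intent: keep the size a multiple of min_grow */
        new_size += (size_t)growth + (size_t)min_grow - 1;
        new_size &= ~((size_t)min_grow - 1);
    }
    if (new_size > (size_t)PTRDIFF_MAX) {
        return -1;
    }
    if (new_size > (size_t)PTRDIFF_MAX / (size_t)itemsize) {
        return -1;                          /* byte count would overflow */
    }
    *size = (ptrdiff_t)new_size;
    return (ptrdiff_t)(new_size * (size_t)itemsize);
}
```

Unlike the function in the diff, this sketch leaves `*size` untouched when the multiplication overflows; that is a design choice of the example only.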
diff --git a/numpy/core/src/multiarray/textreading/growth.h b/numpy/core/src/multiarray/textreading/growth.h
new file mode 100644
index 0000000..c7ebe36
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/growth.h
@@ -0,0 +1,15 @@
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_GROWTH_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_GROWTH_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+NPY_NO_EXPORT npy_intp
+grow_size_and_multiply(npy_intp *size, npy_intp min_grow, npy_intp itemsize);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_GROWTH_H_ */
diff --git a/numpy/core/src/multiarray/textreading/parser_config.h b/numpy/core/src/multiarray/textreading/parser_config.h
new file mode 100644
index 0000000..022ba95
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/parser_config.h
@@ -0,0 +1,74 @@
+
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_PARSER_CONFIG_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_PARSER_CONFIG_H_
+
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef struct {
+    /*
+     *  Field delimiter character.
+     *  Typically ',', ' ', '\t', ignored if `delimiter_is_whitespace` is true.
+     */
+    Py_UCS4 delimiter;
+
+    /*
+     *  Character used to quote fields.
+     *  Typically '"' or "'".  To disable quoting we set this to UINT_MAX
+     *  (which is not a valid unicode character and thus cannot occur in the
+     *  file; the same is used for all other characters if necessary).
+     */
+    Py_UCS4 quote;
+
+    /*
+     *  Character(s) that indicates the start of a comment.
+     *  Typically '#', '%' or ';'.
+     *  When encountered in a line and not inside quotes, all characters
+     *  from the comment character(s) to the end of the line are ignored.
+     */
+    Py_UCS4 comment;
+
+    /*
+     *  Ignore whitespace at the beginning of a field (outside/before quotes).
+     *  Is (and must be) set if `delimiter_is_whitespace`.
+     */
+    bool ignore_leading_whitespace;
+
+    /*
+     * If true, the delimiter is ignored and any unicode whitespace is used
+     * for splitting (same as `string.split()` in Python). In that case
+     * `ignore_leading_whitespace` should also be set.
+     */
+    bool delimiter_is_whitespace;
+
+    /*
+     *  The imaginary unit character. Default is `j`.
+     */
+    Py_UCS4 imaginary_unit;
+
+    /*
+     * Data should be encoded as `latin1` when using python converter
+     * (implementing `loadtxt` default Python 2 compatibility mode).
+     * The c byte converter is used when the user requested `dtype="S"`.
+     * In this case we go via `dtype=object`, however, loadtxt allows latin1
+     * while normal object to string casts only accept ASCII, so it ensures
+     * that the object array already contains bytes and not strings.
+     */
+    bool python_byte_converters;
+    bool c_byte_converters;
+    /*
+     * Flag to store whether a warning was already given for an integer being
+     * parsed by first converting to a float.
+     */
+    bool gave_int_via_float_warning;
+} parser_config;
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_PARSER_CONFIG_H_ */
diff --git a/numpy/core/src/multiarray/textreading/readtext.c b/numpy/core/src/multiarray/textreading/readtext.c
new file mode 100644
index 0000000..a5db1cb
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/readtext.c
@@ -0,0 +1,317 @@
+#include <stdio.h>
+#include <stdbool.h>
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/arrayobject.h"
+#include "npy_argparse.h"
+#include "common.h"
+#include "conversion_utils.h"
+
+#include "textreading/parser_config.h"
+#include "textreading/stream_pyobject.h"
+#include "textreading/field_types.h"
+#include "textreading/rows.h"
+#include "textreading/str_to_int.h"
+
+
+//
+// `usecols` must point to a Python object that is Py_None or a 1-d contiguous
+// numpy array with data type int32.
+//
+// `dtype` must point to a Python object that is Py_None or a numpy dtype
+// instance.  If the latter, code and sizes must be arrays of length
+// num_dtype_fields, holding the flattened data field type codes and byte
+// sizes. (num_dtype_fields, codes, and sizes can be inferred from dtype,
+// but we do that in Python code.)
+//
+// If both `usecols` and `dtype` are not None, and the data type is compound,
+// then len(usecols) must equal num_dtype_fields.
+//
+// If `dtype` is given and it is compound, and `usecols` is None, then the
+// number of columns in the file must match the number of fields in `dtype`.
+//
+static PyObject *
+_readtext_from_stream(stream *s,
+        parser_config *pc, Py_ssize_t num_usecols, Py_ssize_t usecols[],
+        Py_ssize_t skiplines, Py_ssize_t max_rows,
+        PyObject *converters, PyObject *dtype)
+{
+    PyArrayObject *arr = NULL;
+    PyArray_Descr *out_dtype = NULL;
+    field_type *ft = NULL;
+
+    /*
+     * If `ft[0].descr` is `dtype`, the input was not structured: the result
+     * is considered "homogeneous" and we have to discover the number of
+     * columns.
+     */
+    out_dtype = (PyArray_Descr *)dtype;
+    Py_INCREF(out_dtype);
+
+    Py_ssize_t num_fields = field_types_create(out_dtype, &ft);
+    if (num_fields < 0) {
+        goto finish;
+    }
+    bool homogeneous = num_fields == 1 && ft[0].descr == out_dtype;
+
+    if (!homogeneous && usecols != NULL && num_usecols != num_fields) {
+        PyErr_Format(PyExc_TypeError,
+                "If a structured dtype is used, the number of columns in "
+                "`usecols` must match the effective number of fields. "
+                "But %zd usecols were given and the number of fields is %zd.",
+                num_usecols, num_fields);
+        goto finish;
+    }
+
+    arr = read_rows(
+            s, max_rows, num_fields, ft, pc,
+            num_usecols, usecols, skiplines, converters,
+            NULL, out_dtype, homogeneous);
+    if (arr == NULL) {
+        goto finish;
+    }
+
+  finish:
+    Py_XDECREF(out_dtype);
+    field_types_xclear(num_fields, ft);
+    return (PyObject *)arr;
+}
+
+
+static int
+parse_control_character(PyObject *obj, Py_UCS4 *character)
+{
+    if (obj == Py_None) {
+        *character = (Py_UCS4)-1;  /* character beyond unicode range */
+        return 1;
+    }
+    if (!PyUnicode_Check(obj) || PyUnicode_GetLength(obj) != 1) {
+        PyErr_Format(PyExc_TypeError,
+                "Text reading control character must be a single unicode "
+                "character or None; but got: %.100R", obj);
+        return 0;
+    }
+    *character = PyUnicode_READ_CHAR(obj, 0);
+    return 1;
+}
+
+
+/*
+ * A (somewhat verbose) check that none of the control characters match or are
+ * newline.  Most of these combinations are completely fine, just weird or
+ * surprising.
+ * (I.e. there is an implicit priority for control characters, so if a comment
+ * matches a delimiter, it would just be a comment.)
+ * In theory some `delimiter=None` paths could have a "meaning", but let us
+ * assume that users are better off setting one of the control chars to `None`
+ * for clarity.
+ *
+ * This also checks that the control characters cannot be newlines.
+ */
+static int
+error_if_matching_control_characters(
+        Py_UCS4 delimiter, Py_UCS4 quote, Py_UCS4 comment)
+{
+    char *control_char1;
+    char *control_char2 = NULL;
+    if (comment != (Py_UCS4)-1) {
+        control_char1 = "comment";
+        if (comment == '\r' || comment == '\n') {
+            goto error;
+        }
+        else if (comment == quote) {
+            control_char2 = "quotechar";
+            goto error;
+        }
+        else if (comment == delimiter) {
+            control_char2 = "delimiter";
+            goto error;
+        }
+    }
+    if (quote != (Py_UCS4)-1) {
+        control_char1 = "quotechar";
+        if (quote == '\r' || quote == '\n') {
+            goto error;
+        }
+        else if (quote == delimiter) {
+            control_char2 = "delimiter";
+            goto error;
+        }
+    }
+    if (delimiter != (Py_UCS4)-1) {
+        control_char1 = "delimiter";
+        if (delimiter == '\r' || delimiter == '\n') {
+            goto error;
+        }
+    }
+    /* The above doesn't work with delimiter=None, which means "whitespace" */
+    if (delimiter == (Py_UCS4)-1) {
+        control_char1 = "delimiter";
+        if (Py_UNICODE_ISSPACE(comment)) {
+            control_char2 = "comment";
+            goto error;
+        }
+        else if (Py_UNICODE_ISSPACE(quote)) {
+            control_char2 = "quotechar";
+            goto error;
+        }
+    }
+    return 0;
+
+  error:
+    if (control_char2 != NULL) {
+        PyErr_Format(PyExc_TypeError,
+                "The values for control characters '%s' and '%s' are "
+                "incompatible",
+                control_char1, control_char2);
+    }
+    else {
+        PyErr_Format(PyExc_TypeError,
+                "control character '%s' cannot be a newline (`\\r` or `\\n`).",
+                control_char1);
+    }
+    return -1;
+}
+
+
+NPY_NO_EXPORT PyObject *
+_load_from_filelike(PyObject *NPY_UNUSED(mod),
+        PyObject *const *args, Py_ssize_t len_args, PyObject *kwnames)
+{
+    PyObject *file;
+    Py_ssize_t skiplines = 0;
+    Py_ssize_t max_rows = -1;
+    PyObject *usecols_obj = Py_None;
+    PyObject *converters = Py_None;
+
+    PyObject *dtype = Py_None;
+    PyObject *encoding_obj = Py_None;
+    const char *encoding = NULL;
+
+    parser_config pc = {
+        .delimiter = ',',
+        .comment = '#',
+        .quote = '"',
+        .imaginary_unit = 'j',
+        .delimiter_is_whitespace = false,
+        .ignore_leading_whitespace = false,
+        .python_byte_converters = false,
+        .c_byte_converters = false,
+        .gave_int_via_float_warning = false,
+    };
+    bool filelike = true;
+
+    PyObject *arr = NULL;
+
+    NPY_PREPARE_ARGPARSER;
+    if (npy_parse_arguments("_load_from_filelike", args, len_args, kwnames,
+            "file", NULL, &file,
+            "|delimiter", &parse_control_character, &pc.delimiter,
+            "|comment", &parse_control_character, &pc.comment,
+            "|quote", &parse_control_character, &pc.quote,
+            "|imaginary_unit", &parse_control_character, &pc.imaginary_unit,
+            "|usecols", NULL, &usecols_obj,
+            "|skiplines", &PyArray_IntpFromPyIntConverter, &skiplines,
+            "|max_rows", &PyArray_IntpFromPyIntConverter, &max_rows,
+            "|converters", NULL, &converters,
+            "|dtype", NULL, &dtype,
+            "|encoding", NULL, &encoding_obj,
+            "|filelike", &PyArray_BoolConverter, &filelike,
+            "|byte_converters", &PyArray_BoolConverter, &pc.python_byte_converters,
+            "|c_byte_converters", PyArray_BoolConverter, &pc.c_byte_converters,
+            NULL, NULL, NULL) < 0) {
+        return NULL;
+    }
+
+    /* Reject matching control characters; they rarely make sense anyway */
+    if (error_if_matching_control_characters(
+            pc.delimiter, pc.quote, pc.comment) < 0) {
+        return NULL;
+    }
+
+    if (pc.delimiter == (Py_UCS4)-1) {
+        pc.delimiter_is_whitespace = true;
+        /* Ignore leading whitespace to match `string.split(None)` */
+        pc.ignore_leading_whitespace = true;
+    }
+
+    if (!PyArray_DescrCheck(dtype) ) {
+        PyErr_SetString(PyExc_TypeError,
+                "internal error: dtype must be provided and be a NumPy dtype");
+        return NULL;
+    }
+
+    if (encoding_obj != Py_None) {
+        if (!PyUnicode_Check(encoding_obj)) {
+            PyErr_SetString(PyExc_TypeError,
+                    "encoding must be a unicode string.");
+            return NULL;
+        }
+        encoding = PyUnicode_AsUTF8(encoding_obj);
+        if (encoding == NULL) {
+            return NULL;
+        }
+    }
+
+    /*
+     * Parse usecols, the rest of NumPy has no clear helper for this, so do
+     * it here manually.
+     */
+    Py_ssize_t num_usecols = -1;
+    Py_ssize_t *usecols = NULL;
+    if (usecols_obj != Py_None) {
+        num_usecols = PySequence_Length(usecols_obj);
+        if (num_usecols < 0) {
+            return NULL;
+        }
+        /* Calloc just to not worry about overflow */
+        usecols = PyMem_Calloc(num_usecols, sizeof(Py_ssize_t));
+        if (usecols == NULL) {
+            PyErr_NoMemory();
+            return NULL;
+        }
+        for (Py_ssize_t i = 0; i < num_usecols; i++) {
+            PyObject *tmp = PySequence_GetItem(usecols_obj, i);
+            if (tmp == NULL) {
+                PyMem_FREE(usecols);
+                return NULL;
+            }
+            usecols[i] = PyNumber_AsSsize_t(tmp, PyExc_OverflowError);
+            if (error_converting(usecols[i])) {
+                if (PyErr_ExceptionMatches(PyExc_TypeError)) {
+                    PyErr_Format(PyExc_TypeError,
+                            "usecols must be an int or a sequence of ints but "
+                            "it contains at least one element of type '%s'",
+                            Py_TYPE(tmp)->tp_name);
+                }
+                Py_DECREF(tmp);
+                PyMem_FREE(usecols);
+                return NULL;
+            }
+            Py_DECREF(tmp);
+        }
+    }
+
+    stream *s;
+    if (filelike) {
+        s = stream_python_file(file, encoding);
+    }
+    else {
+        s = stream_python_iterable(file, encoding);
+    }
+    if (s == NULL) {
+        PyMem_FREE(usecols);
+        return NULL;
+    }
+
+    arr = _readtext_from_stream(
+            s, &pc, num_usecols, usecols, skiplines, max_rows, converters, dtype);
+    stream_close(s);
+    PyMem_FREE(usecols);
+    return arr;
+}
+
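The rules enforced by `error_if_matching_control_characters` above can be condensed into a small stand-alone check. This is a sketch under stated assumptions rather than the code from the diff: `control_chars_conflict`, `CTRL_NONE`, `is_newline`, and `is_space` are hypothetical names, `uint32_t` stands in for `Py_UCS4`, the whitespace test is ASCII-only where the source uses `Py_UNICODE_ISSPACE`, and the function returns only a status code instead of setting a Python exception.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for `(Py_UCS4)-1`, i.e. "this control character is disabled". */
#define CTRL_NONE UINT32_MAX

static int
is_newline(uint32_t c)
{
    return c == '\r' || c == '\n';
}

/* ASCII-only whitespace test (an assumption of this sketch). */
static int
is_space(uint32_t c)
{
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/*
 * Returns 0 if the three control characters are compatible and -1 if any
 * enabled character is a newline, two enabled characters are equal, or the
 * delimiter is "any whitespace" while comment or quote is whitespace.
 */
static int
control_chars_conflict(uint32_t delimiter, uint32_t quote, uint32_t comment)
{
    /* same priority order as the source: comment, quote, delimiter */
    uint32_t chars[3] = {comment, quote, delimiter};
    for (int i = 0; i < 3; i++) {
        if (chars[i] == CTRL_NONE) {
            continue;
        }
        if (is_newline(chars[i])) {
            return -1;
        }
        for (int j = i + 1; j < 3; j++) {
            if (chars[i] == chars[j]) {
                return -1;
            }
        }
    }
    if (delimiter == CTRL_NONE) {
        /* delimiter=None means "split on whitespace" */
        if (is_space(comment) || is_space(quote)) {
            return -1;
        }
    }
    return 0;
}
```

With the defaults from `_load_from_filelike` (`','`, `'"'`, `'#'`) the check passes, while reusing `','` as both delimiter and quote is rejected.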
diff --git a/numpy/core/src/multiarray/textreading/readtext.h b/numpy/core/src/multiarray/textreading/readtext.h
new file mode 100644
index 0000000..133c788
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/readtext.h
@@ -0,0 +1,8 @@
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_READTEXT_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_READTEXT_H_
+
+NPY_NO_EXPORT PyObject *
+_load_from_filelike(PyObject *mod,
+        PyObject *const *args, Py_ssize_t len_args, PyObject *kwnames);
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_READTEXT_H_ */
diff --git a/numpy/core/src/multiarray/textreading/rows.c b/numpy/core/src/multiarray/textreading/rows.c
new file mode 100644
index 0000000..a72fb79
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/rows.c
@@ -0,0 +1,491 @@
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/arrayobject.h"
+#include "numpy/npy_3kcompat.h"
+#include "alloc.h"
+
+#include <string.h>
+#include <stdbool.h>
+
+#include "textreading/stream.h"
+#include "textreading/tokenize.h"
+#include "textreading/conversions.h"
+#include "textreading/field_types.h"
+#include "textreading/rows.h"
+#include "textreading/growth.h"
+
+/*
+ * Minimum size to grow the allocation by (or 25%). The 8KiB means the actual
+ * growth is within `8 KiB <= size < 16 KiB` (depending on the row size).
+ */
+#define MIN_BLOCK_SIZE (1 << 13)
+
+
+
+/*
+ *  Create the array of converter functions from the Python converters.
+ */
+static PyObject **
+create_conv_funcs(
+        PyObject *converters, Py_ssize_t num_fields, const Py_ssize_t *usecols)
+{
+    assert(converters != Py_None);
+
+    PyObject **conv_funcs = PyMem_Calloc(num_fields, sizeof(PyObject *));
+    if (conv_funcs == NULL) {
+        PyErr_NoMemory();
+        return NULL;
+    }
+
+    if (PyCallable_Check(converters)) {
+        /* a single converter used for all columns individually */
+        for (Py_ssize_t i = 0; i < num_fields; i++) {
+            Py_INCREF(converters);
+            conv_funcs[i] = converters;
+        }
+        return conv_funcs;
+    }
+    else if (!PyDict_Check(converters)) {
+        PyErr_SetString(PyExc_TypeError,
+                "converters must be a dictionary mapping columns to converter "
+                "functions or a single callable.");
+        goto error;
+    }
+
+    PyObject *key, *value;
+    Py_ssize_t pos = 0;
+    while (PyDict_Next(converters, &pos, &key, &value)) {
+        Py_ssize_t column = PyNumber_AsSsize_t(key, PyExc_IndexError);
+        if (column == -1 && PyErr_Occurred()) {
+            PyErr_Format(PyExc_TypeError,
+                    "keys of the converters dictionary must be integers; "
+                    "got %.100R", key);
+            goto error;
+        }
+        if (usecols != NULL) {
+            /*
+             * This code searches for the corresponding usecol.  It is
+             * identical to the legacy usecols code, which has two weaknesses:
+             * 1. It fails for duplicated usecols only setting converter for
+             *    the first one.
+             * 2. It fails e.g. if usecols uses negative indexing and
+             *    converters does not.  (This is a feature, since it allows
+             *    us to correctly normalize converters to result column here.)
+             */
+            Py_ssize_t i = 0;
+            for (; i < num_fields; i++) {
+                if (column == usecols[i]) {
+                    column = i;
+                    break;
+                }
+            }
+            if (i == num_fields) {
+                continue;  /* ignore unused converter */
+            }
+        }
+        else {
+            if (column < -num_fields || column >= num_fields) {
+                PyErr_Format(PyExc_ValueError,
+                        "converter specified for column %zd, which is invalid "
+                        "for the number of fields %zd.", column, num_fields);
+                goto error;
+            }
+            if (column < 0) {
+                column += num_fields;
+            }
+        }
+        if (!PyCallable_Check(value)) {
+            PyErr_Format(PyExc_TypeError,
+                    "values of the converters dictionary must be callable, "
+                    "but the value associated with key %R is not", key);
+            goto error;
+        }
+        Py_INCREF(value);
+        conv_funcs[column] = value;
+    }
+    return conv_funcs;
+
+  error:
+    for (Py_ssize_t i = 0; i < num_fields; i++) {
+        Py_XDECREF(conv_funcs[i]);
+    }
+    PyMem_FREE(conv_funcs);
+    return NULL;
+}
+
+/**
+ * Read a file into the provided array, or create (and possibly grow) an
+ * array to read into.
+ *
+ * @param s The stream object/struct providing reading capabilities used by
+ *        the tokenizer.
+ * @param max_rows The number of rows to read, or -1.  If negative
+ *        all rows are read.
+ * @param num_field_types The number of field types stored in `field_types`.
+ * @param field_types Information about the dtype for each column (or one if
+ *        `homogeneous`).
+ * @param pconfig Pointer to the parser config object used by both the
+ *        tokenizer and the conversion functions.
+ * @param num_usecols The number of columns in `usecols`.
+ * @param usecols An array of length `num_usecols` or NULL.  If given indicates
+ *        which column is read for each individual row (negative columns are
+ *        accepted).
+ * @param skiplines The number of lines to skip, these lines are ignored.
+ * @param converters Python dictionary of converters.  Finalizing converters
+ *        is difficult without information about the number of columns.
+ * @param data_array An array to be filled or NULL.  In either case a new
+ *        reference is returned (the reference to `data_array` is not stolen).
+ * @param out_descr The dtype used for allocating a new array.  This is not
+ *        used if `data_array` is provided.  Note that the actual dtype of the
+ *        returned array can differ for strings.
+ * @param num_cols Pointer in which the actual (discovered) number of columns
+ *        is returned.  This is only relevant if `homogeneous` is true.
+ * @param homogeneous Whether the datatype of the array is homogeneous,
+ *        i.e. not structured.  In this case the number of columns has to be
+ *        discovered and the returned array will be 2-dimensional rather than
+ *        1-dimensional.
+ *
+ * @returns Returns the result as an array object or NULL on error.  The result
+ *          is always a new reference (even when `data_array` was passed in).
+ */
+NPY_NO_EXPORT PyArrayObject *
+read_rows(stream *s,
+        npy_intp max_rows, Py_ssize_t num_field_types, field_type *field_types,
+        parser_config *pconfig, Py_ssize_t num_usecols, Py_ssize_t *usecols,
+        Py_ssize_t skiplines, PyObject *converters,
+        PyArrayObject *data_array, PyArray_Descr *out_descr,
+        bool homogeneous)
+{
+    char *data_ptr = NULL;
+    Py_ssize_t current_num_fields;
+    npy_intp row_size = out_descr->elsize;
+    PyObject **conv_funcs = NULL;
+
+    bool needs_init = PyDataType_FLAGCHK(out_descr, NPY_NEEDS_INIT);
+
+    int ndim = homogeneous ? 2 : 1;
+    npy_intp result_shape[2] = {0, 1};
+
+    bool data_array_allocated = data_array == NULL;
+    /* Make sure we own `data_array` for the purpose of error handling */
+    Py_XINCREF(data_array);
+    size_t rows_per_block = 1;  /* will be increased depending on row size */
+    npy_intp data_allocated_rows = 0;
+
+    /* We give a warning if max_rows is used and an empty line is encountered */
+    bool give_empty_row_warning = max_rows >= 0;
+
+    int ts_result = 0;
+    tokenizer_state ts;
+    if (tokenizer_init(&ts, pconfig) < 0) {
+        goto error;
+    }
+
+    /* Set the actual number of fields if it is already known, otherwise -1 */
+    Py_ssize_t actual_num_fields = -1;
+    if (usecols != NULL) {
+        assert(homogeneous || num_field_types == num_usecols);
+        actual_num_fields = num_usecols;
+    }
+    else if (!homogeneous) {
+        assert(usecols == NULL || num_field_types == num_usecols);
+        actual_num_fields = num_field_types;
+    }
+
+    for (Py_ssize_t i = 0; i < skiplines; i++) {
+        ts.state = TOKENIZE_GOTO_LINE_END;
+        ts_result = tokenize(s, &ts, pconfig);
+        if (ts_result < 0) {
+            goto error;
+        }
+        else if (ts_result != 0) {
+            /* Fewer lines than skiplines is acceptable */
+            break;
+        }
+    }
+
+    Py_ssize_t row_count = 0;  /* number of rows actually processed */
+    while ((max_rows < 0 || row_count < max_rows) && ts_result == 0) {
+        ts_result = tokenize(s, &ts, pconfig);
+        if (ts_result < 0) {
+            goto error;
+        }
+        current_num_fields = ts.num_fields;
+        field_info *fields = ts.fields;
+        if (NPY_UNLIKELY(ts.num_fields == 0)) {
+            /*
+             * Deprecated NumPy 1.23, 2021-01-13 (not really a deprecation,
+             * but similar policy should apply to removing the warning again)
+             */
+            /* Tokenizer may give a final "empty line" even if there is none */
+            if (give_empty_row_warning && ts_result == 0) {
+                give_empty_row_warning = false;
+                if (PyErr_WarnFormat(PyExc_UserWarning, 3,
+                        "Input line %zd contained no data and will not be "
+                        "counted towards `max_rows=%zd`.  This differs from "
+                        "the behaviour in NumPy <=1.22 which counted lines "
+                        "rather than rows.  If desired, the previous behaviour "
+                        "can be achieved by using `itertools.islice`.\n"
+                        "Please see the 1.23 release notes for an example on "
+                        "how to do this.  If you wish to ignore this warning, "
+                        "use `warnings.filterwarnings`.  This warning is "
+                        "expected to be removed in the future and is given "
+                        "only once per `loadtxt` call.",
+                        row_count + skiplines + 1, max_rows) < 0) {
+                    goto error;
+                }
+            }
+            continue;  /* Ignore empty line */
+        }
+
+        if (NPY_UNLIKELY(data_ptr == NULL)) {
+            // We've deferred some of the initialization tasks to here,
+            // because we've now read the first line, and we definitively
+            // know how many fields (i.e. columns) we will be processing.
+            if (actual_num_fields == -1) {
+                actual_num_fields = current_num_fields;
+            }
+
+            if (converters != Py_None) {
+                conv_funcs = create_conv_funcs(
+                        converters, actual_num_fields, usecols);
+                if (conv_funcs == NULL) {
+                    goto error;
+                }
+            }
+
+            /* Note that result_shape[1] is only used if homogeneous is true */
+            result_shape[1] = actual_num_fields;
+            if (homogeneous) {
+                row_size *= actual_num_fields;
+            }
+
+            if (data_array == NULL) {
+                if (max_rows < 0) {
+                    /*
+                     * A negative max_rows means reading the whole file.  We
+                     * approach this by allocating ever larger blocks.
+                     * Adds a number of rows based on `MIN_BLOCK_SIZE`.
+                     * Note: later code grows assuming this is a power of two.
+                     */
+                    if (row_size == 0) {
+                        /* actual rows_per_block should not matter here */
+                        rows_per_block = 512;
+                    }
+                    else {
+                        /* safe on overflow since min_rows will be 0 or 1 */
+                        size_t min_rows = (
+                                (MIN_BLOCK_SIZE + row_size - 1) / row_size);
+                        while (rows_per_block < min_rows) {
+                            rows_per_block *= 2;
+                        }
+                    }
+                    data_allocated_rows = rows_per_block;
+                }
+                else {
+                    data_allocated_rows = max_rows;
+                }
+                result_shape[0] = data_allocated_rows;
+                Py_INCREF(out_descr);
+                /*
+                 * We do not use Empty, as it would fill with None
+                 * and require decref'ing if we shrink again.
+                 */
+                data_array = (PyArrayObject *)PyArray_SimpleNewFromDescr(
+                        ndim, result_shape, out_descr);
+                if (data_array == NULL) {
+                    goto error;
+                }
+#ifdef NPY_RELAXED_STRIDES_DEBUG
+                /* Incompatible with NPY_RELAXED_STRIDES_DEBUG due to growing */
+                if (result_shape[0] == 1) {
+                    PyArray_STRIDES(data_array)[0] = row_size;
+                }
+#endif /* NPY_RELAXED_STRIDES_DEBUG */
+                if (needs_init) {
+                    memset(PyArray_BYTES(data_array), 0, PyArray_NBYTES(data_array));
+                }
+            }
+            else {
+                assert(max_rows >=0);
+                data_allocated_rows = max_rows;
+            }
+            data_ptr = PyArray_BYTES(data_array);
+        }
+
+        if (!usecols && (actual_num_fields != current_num_fields)) {
+            PyErr_Format(PyExc_ValueError,
+                    "the number of columns changed from %zd to %zd at row %zd; "
+                    "use `usecols` to select a subset and avoid this error",
+                    actual_num_fields, current_num_fields, row_count+1);
+            goto error;
+        }
+
+        if (NPY_UNLIKELY(data_allocated_rows == row_count)) {
+            /*
+             * Grow by ~25% and rounded up to the next rows_per_block
+             * NOTE: This is based on very crude timings and could be refined!
+             */
+            npy_intp new_rows = data_allocated_rows;
+            npy_intp alloc_size = grow_size_and_multiply(
+                    &new_rows, rows_per_block, row_size);
+            if (alloc_size < 0) {
+                /* should normally error much earlier, but make sure */
+                PyErr_SetString(PyExc_ValueError,
+                        "array is too big. Cannot read file as a single array; "
+                        "providing a maximum number of rows to read may help.");
+                goto error;
+            }
+
+            char *new_data = PyDataMem_UserRENEW(
+                    PyArray_BYTES(data_array), alloc_size ? alloc_size : 1,
+                    PyArray_HANDLER(data_array));
+            if (new_data == NULL) {
+                PyErr_NoMemory();
+                goto error;
+            }
+            /* Replace the array's data since it may have changed */
+            ((PyArrayObject_fields *)data_array)->data = new_data;
+            ((PyArrayObject_fields *)data_array)->dimensions[0] = new_rows;
+            data_ptr = new_data + row_count * row_size;
+            data_allocated_rows = new_rows;
+            if (needs_init) {
+                memset(data_ptr, '\0', (new_rows - row_count) * row_size);
+            }
+        }
+
+        for (Py_ssize_t i = 0; i < actual_num_fields; ++i) {
+            Py_ssize_t f;  /* The field, either 0 (if homogeneous) or i. */
+            Py_ssize_t col;  /* The column as read, remapped by usecols */
+            char *item_ptr;
+            if (homogeneous) {
+                f = 0;
+                item_ptr = data_ptr + i * field_types[0].descr->elsize;
+            }
+            else {
+                f = i;
+                item_ptr = data_ptr + field_types[f].structured_offset;
+            }
+
+            if (usecols == NULL) {
+                col = i;
+            }
+            else {
+                col = usecols[i];
+                if (col < 0) {
+                    // Python-like column indexing: k = -1 means the last column.
+                    col += current_num_fields;
+                }
+                if (NPY_UNLIKELY((col < 0) || (col >= current_num_fields))) {
+                    PyErr_Format(PyExc_ValueError,
+                            "invalid column index %zd at row %zd with %zd "
+                            "columns",
+                            usecols[i], row_count+1, current_num_fields);
+                    goto error;
+                }
+            }
+
+            /*
+             * The following function calls represent the main "conversion"
+             * step, i.e. parsing the unicode string for each field and storing
+             * the result in the array.
+             */
+            int parser_res;
+            Py_UCS4 *str = ts.field_buffer + fields[col].offset;
+            Py_UCS4 *end = ts.field_buffer + fields[col + 1].offset - 1;
+            if (conv_funcs == NULL || conv_funcs[i] == NULL) {
+                parser_res = field_types[f].set_from_ucs4(field_types[f].descr,
+                        str, end, item_ptr, pconfig);
+            }
+            else {
+                parser_res = to_generic_with_converter(field_types[f].descr,
+                        str, end, item_ptr, pconfig, conv_funcs[i]);
+            }
+
+            if (NPY_UNLIKELY(parser_res < 0)) {
+                PyObject *exc, *val, *tb;
+                PyErr_Fetch(&exc, &val, &tb);
+
+                size_t length = end - str;
+                PyObject *string = PyUnicode_FromKindAndData(
+                        PyUnicode_4BYTE_KIND, str, length);
+                if (string == NULL) {
+                    npy_PyErr_ChainExceptions(exc, val, tb);
+                    goto error;
+                }
+                PyErr_Format(PyExc_ValueError,
+                        "could not convert string %.100R to %S at "
+                        "row %zd, column %zd.",
+                        string, field_types[f].descr, row_count, col+1);
+                Py_DECREF(string);
+                npy_PyErr_ChainExceptionsCause(exc, val, tb);
+                goto error;
+            }
+        }
+
+        ++row_count;
+        data_ptr += row_size;
+    }
+
+    tokenizer_clear(&ts);
+    if (conv_funcs != NULL) {
+        for (Py_ssize_t i = 0; i < actual_num_fields; i++) {
+            Py_XDECREF(conv_funcs[i]);
+        }
+        PyMem_FREE(conv_funcs);
+    }
+
+    if (data_array == NULL) {
+        assert(row_count == 0 && result_shape[0] == 0);
+        if (actual_num_fields == -1) {
+            /*
+             * We found no rows, so the number of elements cannot be
+             * discovered; we have no choice but to guess 1.
+             * NOTE: It may make sense to move this outside of here to refine
+             *       the behaviour where necessary.
+             */
+            result_shape[1] = 1;
+        }
+        else {
+            result_shape[1] = actual_num_fields;
+        }
+        Py_INCREF(out_descr);
+        data_array = (PyArrayObject *)PyArray_Empty(
+                ndim, result_shape, out_descr, 0);
+    }
+
+    /*
+     * Note that if there is no data, `data_array` may still be NULL and
+     * row_count is 0.  In that case, always realloc just in case.
+     */
+    if (data_array_allocated && data_allocated_rows != row_count) {
+        size_t size = row_count * row_size;
+        char *new_data = PyDataMem_UserRENEW(
+                PyArray_BYTES(data_array), size ? size : 1,
+                PyArray_HANDLER(data_array));
+        if (new_data == NULL) {
+            Py_DECREF(data_array);
+            PyErr_NoMemory();
+            return NULL;
+        }
+        ((PyArrayObject_fields *)data_array)->data = new_data;
+        ((PyArrayObject_fields *)data_array)->dimensions[0] = row_count;
+    }
+
+    return data_array;
+
+  error:
+    if (conv_funcs != NULL) {
+        for (Py_ssize_t i = 0; i < actual_num_fields; i++) {
+            Py_XDECREF(conv_funcs[i]);
+        }
+        PyMem_FREE(conv_funcs);
+    }
+    tokenizer_clear(&ts);
+    Py_XDECREF(data_array);
+    return NULL;
+}
diff --git a/numpy/core/src/multiarray/textreading/rows.h b/numpy/core/src/multiarray/textreading/rows.h

new file mode 100644 (file)

index 0000000..20eb9e1
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/rows.h
@@ -0,0 +1,22 @@
+
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_ROWS_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_ROWS_H_
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include <stdio.h>
+
+#include "textreading/stream.h"
+#include "textreading/field_types.h"
+#include "textreading/parser_config.h"
+
+
+NPY_NO_EXPORT PyArrayObject *
+read_rows(stream *s,
+        npy_intp max_rows, Py_ssize_t num_field_types, field_type *field_types,
+        parser_config *pconfig, Py_ssize_t num_usecols, Py_ssize_t *usecols,
+        Py_ssize_t skiplines, PyObject *converters,
+        PyArrayObject *data_array, PyArray_Descr *out_descr,
+        bool homogeneous);
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_ROWS_H_ */
diff --git a/numpy/core/src/multiarray/textreading/str_to_int.c b/numpy/core/src/multiarray/textreading/str_to_int.c

new file mode 100644 (file)

index 0000000..0dd6c0b
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/str_to_int.c
@@ -0,0 +1,110 @@
+
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "lowlevel_strided_loops.h"
+
+#include <string.h>
+#include "textreading/str_to_int.h"
+#include "textreading/parser_config.h"
+#include "conversions.h"  /* For the deprecated parse-via-float path */
+
+
+const char *deprecation_msg = (
+        "loadtxt(): Parsing an integer via a float is deprecated.  To avoid "
+        "this warning, you can:\n"
+        "    * make sure the original data is stored as integers.\n"
+        "    * use the `converters=` keyword argument.  If you only use\n"
+        "      NumPy 1.23 or later, `converters=float` will normally work.\n"
+        "    * Use `np.loadtxt(...).astype(np.int64)` parsing the file as\n"
+        "      floating point and then convert it.  (On all NumPy versions.)\n"
+        "  (Deprecated NumPy 1.23)");
+
+#define DECLARE_TO_INT(intw, INT_MIN, INT_MAX, byteswap_unaligned)          \
+    NPY_NO_EXPORT int                                                       \
+    to_##intw(PyArray_Descr *descr,                                         \
+            const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,          \
+            parser_config *pconfig)                                         \
+    {                                                                       \
+        int64_t parsed;                                                     \
+        intw##_t x;                                                         \
+                                                                            \
+        if (NPY_UNLIKELY(                                                   \
+                str_to_int64(str, end, INT_MIN, INT_MAX, &parsed) < 0)) {   \
+            /* DEPRECATED 2022-07-03, NumPy 1.23 */                         \
+            double fval;                                                    \
+            PyArray_Descr *d_descr = PyArray_DescrFromType(NPY_DOUBLE);     \
+            Py_DECREF(d_descr);  /* borrowed */                             \
+            if (to_double(d_descr, str, end, (char *)&fval, pconfig) < 0) { \
+                return -1;                                                  \
+            }                                                               \
+            if (!pconfig->gave_int_via_float_warning) {                     \
+                pconfig->gave_int_via_float_warning = true;                 \
+                if (PyErr_WarnEx(PyExc_DeprecationWarning,                  \
+                        deprecation_msg, 3) < 0) {                          \
+                    return -1;                                              \
+                }                                                           \
+            }                                                               \
+            pconfig->gave_int_via_float_warning = true;                     \
+            x = (intw##_t)fval;                                             \
+        }                                                                   \
+        else {                                                              \
+            x = (intw##_t)parsed;                                           \
+        }                                                                   \
+        memcpy(dataptr, &x, sizeof(x));                                     \
+        if (!PyArray_ISNBO(descr->byteorder)) {                             \
+            byteswap_unaligned(dataptr);                                    \
+        }                                                                   \
+        return 0;                                                           \
+    }
+
+#define DECLARE_TO_UINT(uintw, UINT_MAX, byteswap_unaligned)                \
+    NPY_NO_EXPORT int                                                       \
+    to_##uintw(PyArray_Descr *descr,                                        \
+            const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,          \
+            parser_config *pconfig)                                         \
+    {                                                                       \
+        uint64_t parsed;                                                    \
+        uintw##_t x;                                                        \
+                                                                            \
+        if (NPY_UNLIKELY(                                                   \
+                str_to_uint64(str, end, UINT_MAX, &parsed) < 0)) {          \
+            /* DEPRECATED 2022-07-03, NumPy 1.23 */                         \
+            double fval;                                                    \
+            PyArray_Descr *d_descr = PyArray_DescrFromType(NPY_DOUBLE);     \
+            Py_DECREF(d_descr);  /* borrowed */                             \
+            if (to_double(d_descr, str, end, (char *)&fval, pconfig) < 0) { \
+                return -1;                                                  \
+            }                                                               \
+            if (!pconfig->gave_int_via_float_warning) {                     \
+                pconfig->gave_int_via_float_warning = true;                 \
+                if (PyErr_WarnEx(PyExc_DeprecationWarning,                  \
+                        deprecation_msg, 3) < 0) {                          \
+                    return -1;                                              \
+                }                                                           \
+            }                                                               \
+            pconfig->gave_int_via_float_warning = true;                     \
+            x = (uintw##_t)fval;                                            \
+        }                                                                   \
+        else {                                                              \
+            x = (uintw##_t)parsed;                                          \
+        }                                                                   \
+        memcpy(dataptr, &x, sizeof(x));                                     \
+        if (!PyArray_ISNBO(descr->byteorder)) {                             \
+            byteswap_unaligned(dataptr);                                    \
+        }                                                                   \
+        return 0;                                                           \
+    }
+
+#define byteswap_nothing(ptr)
+
+DECLARE_TO_INT(int8, INT8_MIN, INT8_MAX, byteswap_nothing)
+DECLARE_TO_INT(int16, INT16_MIN, INT16_MAX, npy_bswap2_unaligned)
+DECLARE_TO_INT(int32, INT32_MIN, INT32_MAX, npy_bswap4_unaligned)
+DECLARE_TO_INT(int64, INT64_MIN, INT64_MAX, npy_bswap8_unaligned)
+
+DECLARE_TO_UINT(uint8, UINT8_MAX, byteswap_nothing)
+DECLARE_TO_UINT(uint16, UINT16_MAX, npy_bswap2_unaligned)
+DECLARE_TO_UINT(uint32, UINT32_MAX, npy_bswap4_unaligned)
+DECLARE_TO_UINT(uint64, UINT64_MAX, npy_bswap8_unaligned)
diff --git a/numpy/core/src/multiarray/textreading/str_to_int.h b/numpy/core/src/multiarray/textreading/str_to_int.h

new file mode 100644 (file)

index 0000000..a0a89a0
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/str_to_int.h
@@ -0,0 +1,174 @@
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STR_TO_INT_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STR_TO_INT_H_
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/ndarraytypes.h"
+
+#include "textreading/parser_config.h"
+
+
+/*
+ * The following two string conversion functions are largely equivalent
+ * to those in pandas.  They are defined in this header file to ensure they
+ * can easily be inlined into the functions that use them.
+ * Unlike pandas, pass in end-pointer (do not rely on \0) and return 0 or -1.
+ *
+ * The actual functions are defined using macro templating below.
+ */
+NPY_FINLINE int
+str_to_int64(
+        const Py_UCS4 *p_item, const Py_UCS4 *p_end,
+        int64_t int_min, int64_t int_max, int64_t *result)
+{
+    const Py_UCS4 *p = (const Py_UCS4 *)p_item;
+    bool isneg = false;
+    int64_t number = 0;
+
+    // Skip leading spaces.
+    while (Py_UNICODE_ISSPACE(*p)) {
+        ++p;
+    }
+
+    // Handle sign.
+    if (*p == '-') {
+        isneg = true;
+        ++p;
+    }
+    else if (*p == '+') {
+        p++;
+    }
+
+    // Check that there is a first digit.
+    if (!isdigit(*p)) {
+        return -1;
+    }
+
+    if (isneg) {
+        // If number is greater than pre_min, at least one more digit
+        // can be processed without overflowing.
+        int dig_pre_min = -(int_min % 10);
+        int64_t pre_min = int_min / 10;
+
+        // Process the digits.
+        int d = *p;
+        while (isdigit(d)) {
+            if ((number > pre_min) || ((number == pre_min) && (d - '0' <= dig_pre_min))) {
+                number = number * 10 - (d - '0');
+                d = *++p;
+            }
+            else {
+                return -1;
+            }
+        }
+    }
+    else {
+        // If number is less than pre_max, at least one more digit
+        // can be processed without overflowing.
+        int64_t pre_max = int_max / 10;
+        int dig_pre_max = int_max % 10;
+
+        // Process the digits.
+        int d = *p;
+        while (isdigit(d)) {
+            if ((number < pre_max) || ((number == pre_max) && (d - '0' <= dig_pre_max))) {
+                number = number * 10 + (d - '0');
+                d = *++p;
+            }
+            else {
+                return -1;
+            }
+        }
+    }
+
+    // Skip trailing spaces.
+    while (Py_UNICODE_ISSPACE(*p)) {
+        ++p;
+    }
+
+    // Did we use up all the characters?
+    if (p != p_end) {
+        return -1;
+    }
+
+    *result = number;
+    return 0;
+}
+
+
+NPY_FINLINE int
+str_to_uint64(
+        const Py_UCS4 *p_item, const Py_UCS4 *p_end,
+        uint64_t uint_max, uint64_t *result)
+{
+    const Py_UCS4 *p = (const Py_UCS4 *)p_item;
+    uint64_t number = 0;
+    int d;
+
+    // Skip leading spaces.
+    while (Py_UNICODE_ISSPACE(*p)) {
+        ++p;
+    }
+
+    // Handle sign.
+    if (*p == '-') {
+        return -1;
+    }
+    if (*p == '+') {
+        p++;
+    }
+
+    // Check that there is a first digit.
+    if (!isdigit(*p)) {
+        return -1;
+    }
+
+    // If number is less than pre_max, at least one more digit
+    // can be processed without overflowing.
+    uint64_t pre_max = uint_max / 10;
+    int dig_pre_max = uint_max % 10;
+
+    // Process the digits.
+    d = *p;
+    while (isdigit(d)) {
+        if ((number < pre_max) || ((number == pre_max) && (d - '0' <= dig_pre_max))) {
+            number = number * 10 + (d - '0');
+            d = *++p;
+        }
+        else {
+            return -1;
+        }
+    }
+
+    // Skip trailing spaces.
+    while (Py_UNICODE_ISSPACE(*p)) {
+        ++p;
+    }
+
+    // Did we use up all the characters?
+    if (p != p_end) {
+        return -1;
+    }
+
+    *result = number;
+    return 0;
+}
+
+
+#define DECLARE_TO_INT_PROTOTYPE(intw)                                  \
+    NPY_NO_EXPORT int                                                   \
+    to_##intw(PyArray_Descr *descr,                                     \
+            const Py_UCS4 *str, const Py_UCS4 *end, char *dataptr,      \
+            parser_config *pconfig);
+
+DECLARE_TO_INT_PROTOTYPE(int8)
+DECLARE_TO_INT_PROTOTYPE(int16)
+DECLARE_TO_INT_PROTOTYPE(int32)
+DECLARE_TO_INT_PROTOTYPE(int64)
+
+DECLARE_TO_INT_PROTOTYPE(uint8)
+DECLARE_TO_INT_PROTOTYPE(uint16)
+DECLARE_TO_INT_PROTOTYPE(uint32)
+DECLARE_TO_INT_PROTOTYPE(uint64)
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STR_TO_INT_H_ */
diff --git a/numpy/core/src/multiarray/textreading/stream.h b/numpy/core/src/multiarray/textreading/stream.h

new file mode 100644 (file)

index 0000000..42ca654
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/stream.h
@@ -0,0 +1,49 @@
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STREAM_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STREAM_H_
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * When getting the next line, we hope that the buffer provider can already
+ * give some information about the newlines, because for Python iterables
+ * we definitely expect to get line-by-line buffers.
+ *
+ * BUFFER_IS_FILEEND must be returned when the end of the file is reached and
+ * must NOT be returned together with a valid (non-empty) buffer.
+ */
+#define BUFFER_MAY_CONTAIN_NEWLINE 0
+#define BUFFER_IS_LINEND 1
+#define BUFFER_IS_FILEEND 2
+
+/*
+ * Base struct for streams.  We currently have two: a chunked reader for
+ * filelikes and a line-by-line reader for any iterable.
+ * At the time of writing, the chunked reader is only used for filelikes that
+ * were not already opened.  This preserves exactly how much of an already
+ * open file has been read if an error occurs.  If we drop this, we could use
+ * the chunked reader more often (but not when `max_rows` is used).
+ *
+ * The "streams" can extend this struct to store their own data (so it is
+ * a very lightweight "object").
+ */
+typedef struct _stream {
+    int (*stream_nextbuf)(void *sdata, char **start, char **end, int *kind);
+    // Note that the first argument to stream_close is the stream pointer
+    // itself, not the stream_data pointer.
+    int (*stream_close)(struct _stream *strm);
+} stream;
+
+
+#define stream_nextbuf(s, start, end, kind)  \
+        ((s)->stream_nextbuf((s), start, end, kind))
+#define stream_close(s)    ((s)->stream_close((s)))
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STREAM_H_ */
diff --git a/numpy/core/src/multiarray/textreading/stream_pyobject.c b/numpy/core/src/multiarray/textreading/stream_pyobject.c

new file mode 100644 (file)

index 0000000..6f84ff0
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/stream_pyobject.c
@@ -0,0 +1,239 @@
+/*
+ * C side structures to provide capabilities to read Python file like objects
+ * in chunks, or iterate through iterables with each result representing a
+ * single line of a file.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/arrayobject.h"
+
+#include "textreading/stream.h"
+
+#define READ_CHUNKSIZE (1 << 14)
+
+
+typedef struct {
+    stream stream;
+    /* The Python file object being read. */
+    PyObject *file;
+
+    /* The `read` attribute of the file object. */
+    PyObject *read;
+    /* Amount to read each time we call `obj.read()` */
+    PyObject *chunksize;
+
+    /* Python str object holding the chunk most recently read from the file. */
+    PyObject *chunk;
+
+    /* Encoding compatible with Python's `PyUnicode_Encode` (may be NULL) */
+    const char *encoding;
+} python_chunks_from_file;
+
+
+/*
+ * Helper function to support byte objects as well as unicode strings.
+ *
+ * NOTE: Steals a reference to `str` (although usually returns it unmodified).
+ */
+static NPY_INLINE PyObject *
+process_stringlike(PyObject *str, const char *encoding)
+{
+    if (PyBytes_Check(str)) {
+        PyObject *ustr;
+        ustr = PyUnicode_FromEncodedObject(str, encoding, NULL);
+        Py_DECREF(str);  /* the reference is stolen, also on error */
+        if (ustr == NULL) {
+            return NULL;
+        }
+        return ustr;
+    }
+    else if (!PyUnicode_Check(str)) {
+        PyErr_SetString(PyExc_TypeError,
+                "non-string returned while reading data");
+        Py_DECREF(str);
+        return NULL;
+    }
+    return str;
+}
+
+
+static NPY_INLINE void
+buffer_info_from_unicode(PyObject *str, char **start, char **end, int *kind)
+{
+    Py_ssize_t length = PyUnicode_GET_LENGTH(str);
+    *kind = PyUnicode_KIND(str);
+
+    if (*kind == PyUnicode_1BYTE_KIND) {
+        *start = (char *)PyUnicode_1BYTE_DATA(str);
+    }
+    else if (*kind == PyUnicode_2BYTE_KIND) {
+        *start = (char *)PyUnicode_2BYTE_DATA(str);
+        length *= sizeof(Py_UCS2);
+    }
+    else if (*kind == PyUnicode_4BYTE_KIND) {
+        *start = (char *)PyUnicode_4BYTE_DATA(str);
+        length *= sizeof(Py_UCS4);
+    }
+    *end = *start + length;
+}
+
+
+static int
+fb_nextbuf(python_chunks_from_file *fb, char **start, char **end, int *kind)
+{
+    Py_XDECREF(fb->chunk);
+    fb->chunk = NULL;
+
+    PyObject *chunk = PyObject_CallFunctionObjArgs(fb->read, fb->chunksize, NULL);
+    if (chunk == NULL) {
+        return -1;
+    }
+    fb->chunk = process_stringlike(chunk, fb->encoding);
+    if (fb->chunk == NULL) {
+        return -1;
+    }
+    buffer_info_from_unicode(fb->chunk, start, end, kind);
+    if (*start == *end) {
+        return BUFFER_IS_FILEEND;
+    }
+    return BUFFER_MAY_CONTAIN_NEWLINE;
+}
+
+
+static int
+fb_del(stream *strm)
+{
+    python_chunks_from_file *fb = (python_chunks_from_file *)strm;
+
+    Py_XDECREF(fb->file);
+    Py_XDECREF(fb->read);
+    Py_XDECREF(fb->chunksize);
+    Py_XDECREF(fb->chunk);
+
+    PyMem_FREE(strm);
+
+    return 0;
+}
+
+
+NPY_NO_EXPORT stream *
+stream_python_file(PyObject *obj, const char *encoding)
+{
+    python_chunks_from_file *fb;
+
+    fb = (python_chunks_from_file *)PyMem_Calloc(1, sizeof(python_chunks_from_file));
+    if (fb == NULL) {
+        PyErr_NoMemory();
+        return NULL;
+    }
+
+    fb->stream.stream_nextbuf = (void *)&fb_nextbuf;
+    fb->stream.stream_close = &fb_del;
+
+    fb->encoding = encoding;
+    Py_INCREF(obj);
+    fb->file = obj;
+
+    fb->read = PyObject_GetAttrString(obj, "read");
+    if (fb->read == NULL) {
+        goto fail;
+    }
+    fb->chunksize = PyLong_FromLong(READ_CHUNKSIZE);
+    if (fb->chunksize == NULL) {
+        goto fail;
+    }
+
+    return (stream *)fb;
+
+fail:
+    fb_del((stream *)fb);
+    return NULL;
+}
+
+
+/*
+ * Stream from a Python iterable by interpreting each item as a line in a file
+ */
+typedef struct {
+    stream stream;
+    /* The Python iterator being read. */
+    PyObject *iterator;
+
+    /* Python str object holding the line most recently fetched */
+    PyObject *line;
+
+    /* Encoding compatible with Python's `PyUnicode_Encode` (may be NULL) */
+    const char *encoding;
+} python_lines_from_iterator;
+
+
+static int
+it_del(stream *strm)
+{
+    python_lines_from_iterator *it = (python_lines_from_iterator *)strm;
+
+    Py_XDECREF(it->iterator);
+    Py_XDECREF(it->line);
+
+    PyMem_FREE(strm);
+    return 0;
+}
+
+
+static int
+it_nextbuf(python_lines_from_iterator *it, char **start, char **end, int *kind)
+{
+    Py_XDECREF(it->line);
+    it->line = NULL;
+
+    PyObject *line = PyIter_Next(it->iterator);
+    if (line == NULL) {
+        if (PyErr_Occurred()) {
+            return -1;
+        }
+        *start = NULL;
+        *end = NULL;
+        return BUFFER_IS_FILEEND;
+    }
+    it->line = process_stringlike(line, it->encoding);
+    if (it->line == NULL) {
+        return -1;
+    }
+
+    buffer_info_from_unicode(it->line, start, end, kind);
+    return BUFFER_IS_LINEND;
+}
+
+
+NPY_NO_EXPORT stream *
+stream_python_iterable(PyObject *obj, const char *encoding)
+{
+    python_lines_from_iterator *it;
+
+    if (!PyIter_Check(obj)) {
+        PyErr_SetString(PyExc_TypeError,
+                "error reading from object, expected an iterable.");
+        return NULL;
+    }
+
+    it = (python_lines_from_iterator *)PyMem_Calloc(1, sizeof(*it));
+    if (it == NULL) {
+        PyErr_NoMemory();
+        return NULL;
+    }
+
+    it->stream.stream_nextbuf = (void *)&it_nextbuf;
+    it->stream.stream_close = &it_del;
+
+    it->encoding = encoding;
+    Py_INCREF(obj);
+    it->iterator = obj;
+
+    return (stream *)it;
+}
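The iterable-backed stream above hands the tokenizer one item per call and reports it as a complete line. A minimal standalone sketch of that contract follows. It assumes a plain vector of strings in place of the Python iterator; `line_stream`, `line_stream_nextbuf`, `count_lines` and the `SKETCH_*` status values are hypothetical names for illustration, not NumPy's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/* Hypothetical status codes; the real ones are defined in textreading/stream.h. */
enum sketch_buffer_state {
    SKETCH_BUFFER_IS_LINEND = 1,
    SKETCH_BUFFER_IS_FILEEND = 2,
};

/* Stand-in for python_lines_from_iterator: the "iterator" is a vector. */
struct line_stream {
    std::vector<std::string> lines;
    std::size_t next_index = 0;
};

/* Hand out exactly one line per call through the start/end pointers. */
static int
line_stream_nextbuf(line_stream *s, const char **start, const char **end)
{
    if (s->next_index >= s->lines.size()) {
        /* Exhausted: clear the pointers and signal the end of input. */
        *start = nullptr;
        *end = nullptr;
        return SKETCH_BUFFER_IS_FILEEND;
    }
    const std::string &line = s->lines[s->next_index++];
    *start = line.data();
    *end = line.data() + line.size();
    return SKETCH_BUFFER_IS_LINEND;
}

/* Drain the stream and return how many lines it still produced. */
static std::size_t
count_lines(line_stream *s)
{
    const char *start;
    const char *end;
    std::size_t n = 0;
    while (line_stream_nextbuf(s, &start, &end) == SKETCH_BUFFER_IS_LINEND) {
        n++;
    }
    return n;
}
```

An empty item still counts as a line here (start equals end), which is the case the tokenizer treats as an empty line.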
diff --git a/numpy/core/src/multiarray/textreading/stream_pyobject.h b/numpy/core/src/multiarray/textreading/stream_pyobject.h
new file mode 100644
index 0000000..45c11dd
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/stream_pyobject.h
@@ -0,0 +1,16 @@
+
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STREAM_PYOBJECT_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STREAM_PYOBJECT_H_
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#include "textreading/stream.h"
+
+NPY_NO_EXPORT stream *
+stream_python_file(PyObject *obj, const char *encoding);
+
+NPY_NO_EXPORT stream *
+stream_python_iterable(PyObject *obj, const char *encoding);
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_STREAM_PYOBJECT_H_ */
diff --git a/numpy/core/src/multiarray/textreading/tokenize.cpp b/numpy/core/src/multiarray/textreading/tokenize.cpp
new file mode 100644
index 0000000..d0d9cf8
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/tokenize.cpp
@@ -0,0 +1,454 @@
+
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#include "numpy/ndarraytypes.h"
+
+#include "textreading/stream.h"
+#include "textreading/tokenize.h"
+#include "textreading/parser_config.h"
+#include "textreading/growth.h"
+
+/*
+    How parsing quoted fields works:
+
+    For quoting to be activated, the first character of the field
+    must be the quote character (after taking into account
+    ignore_leading_whitespace).  While quoting is active, delimiters
+    are treated as regular characters, not delimiters.  Quoting is
+    deactivated by the second occurrence of the quote character.  An
+    exception is the occurrence of two consecutive quote characters,
+    which is treated as a literal occurrence of a single quote character.
+    E.g. (with delimiter=',' and quote='"'):
+        12.3,"New York, NY","3'2"""
+    The second and third fields are `New York, NY` and `3'2"`.
+
+    If a non-delimiter occurs after the closing quote, the quote is
+    ignored and parsing continues with quoting deactivated.  Quotes
+    that occur while quoting is not activated are not handled specially;
+    they become part of the data.
+    E.g:
+        12.3,"ABC"DEF,XY"Z
+    The second and third fields are `ABCDEF` and `XY"Z`.
+
+    Note that the second field of
+        12.3,"ABC"   ,4.5
+    is `ABC   `.  Currently there is no option to ignore whitespace
+    at the end of a field.
+*/
+
+
+template <typename UCS>
+static inline int
+copy_to_field_buffer(tokenizer_state *ts,
+        const UCS *chunk_start, const UCS *chunk_end)
+{
+    npy_intp chunk_length = chunk_end - chunk_start;
+    npy_intp size = chunk_length + ts->field_buffer_pos + 2;
+
+    if (NPY_UNLIKELY(ts->field_buffer_length < size)) {
+        npy_intp alloc_size = grow_size_and_multiply(&size, 32, sizeof(Py_UCS4));
+        if (alloc_size < 0) {
+            PyErr_Format(PyExc_ValueError,
+                    "line too long to handle while reading file.");
+            return -1;
+        }
+        Py_UCS4 *grown = (Py_UCS4 *)PyMem_Realloc(ts->field_buffer, alloc_size);
+        if (grown == nullptr) {
+            PyErr_NoMemory();
+            return -1;
+        }
+        ts->field_buffer_length = size;
+        ts->field_buffer = grown;
+    }
+
+    Py_UCS4 *write_pos = ts->field_buffer + ts->field_buffer_pos;
+    for (; chunk_start < chunk_end; chunk_start++, write_pos++) {
+        *write_pos = (Py_UCS4)*chunk_start;
+    }
+    *write_pos = '\0';  /* always ensure we end with NUL */
+    ts->field_buffer_pos += chunk_length;
+    return 0;
+}
+
+
+static inline int
+add_field(tokenizer_state *ts)
+{
+    /* The previous field is done, advance to keep a NUL byte at the end */
+    ts->field_buffer_pos += 1;
+
+    if (NPY_UNLIKELY(ts->num_fields + 1 > ts->fields_size)) {
+        npy_intp size = ts->num_fields;
+
+        npy_intp alloc_size = grow_size_and_multiply(
+                &size, 4, sizeof(field_info));
+        if (alloc_size < 0) {
+            /* Check for a size overflow, path should be almost impossible. */
+            PyErr_Format(PyExc_ValueError,
+                    "too many columns found; cannot read file.");
+            return -1;
+        }
+        field_info *fields = (field_info *)PyMem_Realloc(ts->fields, alloc_size);
+        if (fields == nullptr) {
+            PyErr_NoMemory();
+            return -1;
+        }
+        ts->fields = fields;
+        ts->fields_size = size;
+    }
+
+    ts->fields[ts->num_fields].offset = ts->field_buffer_pos;
+    ts->fields[ts->num_fields].quoted = false;
+    ts->num_fields += 1;
+    /* Ensure this (currently empty) word is NUL terminated. */
+    ts->field_buffer[ts->field_buffer_pos] = '\0';
+    return 0;
+}
+
+
+template <typename UCS>
+static inline int
+tokenizer_core(tokenizer_state *ts, parser_config *const config)
+{
+    UCS *pos = (UCS *)ts->pos;
+    UCS *stop = (UCS *)ts->end;
+    UCS *chunk_start;
+
+    if (ts->state == TOKENIZE_CHECK_QUOTED) {
+        /* before we can check for quotes, strip leading whitespace */
+        if (config->ignore_leading_whitespace) {
+            while (pos < stop && Py_UNICODE_ISSPACE(*pos) &&
+                        *pos != '\r' && *pos != '\n') {
+                pos++;
+            }
+            if (pos == stop) {
+                ts->pos = (char *)pos;
+                return 0;
+            }
+        }
+
+        /* Setting chunk effectively starts the field */
+        if (*pos == config->quote) {
+            ts->fields[ts->num_fields - 1].quoted = true;
+            ts->state = TOKENIZE_QUOTED;
+            pos++;  /* TOKENIZE_QUOTED is OK with pos == stop */
+        }
+        else {
+            /* Set to TOKENIZE_UNQUOTED or TOKENIZE_UNQUOTED_WHITESPACE */
+            ts->state = ts->unquoted_state;
+        }
+    }
+
+    switch (ts->state) {
+        case TOKENIZE_UNQUOTED:
+            chunk_start = pos;
+            for (; pos < stop; pos++) {
+                if (*pos == '\r') {
+                    ts->state = TOKENIZE_EAT_CRLF;
+                    break;
+                }
+                else if (*pos == '\n') {
+                    ts->state = TOKENIZE_LINE_END;
+                    break;
+                }
+                else if (*pos == config->delimiter) {
+                    ts->state = TOKENIZE_INIT;
+                    break;
+                }
+                else if (*pos == config->comment) {
+                    ts->state = TOKENIZE_GOTO_LINE_END;
+                    break;
+                }
+            }
+            if (copy_to_field_buffer(ts, chunk_start, pos) < 0) {
+                return -1;
+            }
+            pos++;
+            break;
+
+        case TOKENIZE_UNQUOTED_WHITESPACE:
+            /* Note, this branch is largely identical to `TOKENIZE_UNQUOTED` */
+            chunk_start = pos;
+            for (; pos < stop; pos++) {
+                if (*pos == '\r') {
+                    ts->state = TOKENIZE_EAT_CRLF;
+                    break;
+                }
+                else if (*pos == '\n') {
+                    ts->state = TOKENIZE_LINE_END;
+                    break;
+                }
+                else if (Py_UNICODE_ISSPACE(*pos)) {
+                    ts->state = TOKENIZE_INIT;
+                    break;
+                }
+                else if (*pos == config->comment) {
+                    ts->state = TOKENIZE_GOTO_LINE_END;
+                    break;
+                }
+            }
+            if (copy_to_field_buffer(ts, chunk_start, pos) < 0) {
+                return -1;
+            }
+            pos++;
+            break;
+
+        case TOKENIZE_QUOTED:
+            chunk_start = pos;
+            for (; pos < stop; pos++) {
+                if (*pos == config->quote) {
+                    ts->state = TOKENIZE_QUOTED_CHECK_DOUBLE_QUOTE;
+                    break;
+                }
+            }
+            if (copy_to_field_buffer(ts, chunk_start, pos) < 0) {
+                return -1;
+            }
+            pos++;
+            break;
+
+        case TOKENIZE_QUOTED_CHECK_DOUBLE_QUOTE:
+            if (*pos == config->quote) {
+                /* Copy the quote character directly from the config: */
+                if (copy_to_field_buffer(ts,
+                        &config->quote, &config->quote+1) < 0) {
+                    return -1;
+                }
+                ts->state = TOKENIZE_QUOTED;
+                pos++;
+            }
+            else {
+                /* continue parsing as if unquoted */
+                ts->state = TOKENIZE_UNQUOTED;
+            }
+            break;
+
+        case TOKENIZE_GOTO_LINE_END:
+            if (ts->buf_state != BUFFER_MAY_CONTAIN_NEWLINE) {
+                pos = stop;  /* advance to next buffer */
+                ts->state = TOKENIZE_LINE_END;
+                break;
+            }
+            for (; pos < stop; pos++) {
+                if (*pos == '\r') {
+                    ts->state = TOKENIZE_EAT_CRLF;
+                    break;
+                }
+                else if (*pos == '\n') {
+                    ts->state = TOKENIZE_LINE_END;
+                    break;
+                }
+            }
+            pos++;
+            break;
+
+        case TOKENIZE_EAT_CRLF:
+            /* "Universal newline" support: remove \n in \r\n. */
+            if (*pos == '\n') {
+                pos++;
+            }
+            ts->state = TOKENIZE_LINE_END;
+            break;
+
+        default:
+            assert(0);
+    }
+
+    ts->pos = (char *)pos;
+    return 0;
+}
+
+
+/*
+ * This tokenizer always copies the full "row" (all tokens).  This makes
+ * two things easier:
+ * 1. It means that every word is guaranteed to be followed by a NUL character
+ *    (although it can include one as well).
+ * 2. If usecols are used we can sniff the first row more easily by parsing
+ *    it fully.  Further, usecols can be negative so we may not know which
+ *    columns we need up-front.
+ *
+ * The tokenizer could grow the ability to skip fields and check the maximum
+ * number of fields when known, but it is unclear that this is worthwhile.
+ *
+ * Unlike some tokenizers, this one tries to work in chunks and copies
+ * data in chunks as well.  The hope is that this makes multiple light-weight
+ * loops rather than a single heavy one, to allow e.g. quickly scanning for the
+ * end of a field.  Copying chunks also means we usually only check once per
+ * field whether the buffer is large enough.
+ * Different choices are possible, but this one seems to work well.
+ *
+ * The core (main part) of the tokenizer is specialized for the three Python
+ * unicode flavors UCS1, UCS2, and UCS4 as a worthwhile optimization.
+ */
+NPY_NO_EXPORT int
+tokenize(stream *s, tokenizer_state *ts, parser_config *const config)
+{
+    assert(ts->fields_size >= 2);
+    assert(ts->field_buffer_length >= 2*sizeof(Py_UCS4));
+
+    int finished_reading_file = 0;
+
+    /* Reset to start of buffer */
+    ts->field_buffer_pos = 0;
+    ts->num_fields = 0;
+
+    while (true) {
+        /*
+         * This loop adds new fields to the result (to make up a full row)
+         * until the row ends (typically a line end or the file end)
+         */
+        if (ts->state == TOKENIZE_INIT) {
+            /* Start a new field */
+            if (add_field(ts) < 0) {
+                return -1;
+            }
+            ts->state = TOKENIZE_CHECK_QUOTED;
+        }
+
+        if (NPY_UNLIKELY(ts->pos >= ts->end)) {
+            if (ts->buf_state == BUFFER_IS_LINEND &&
+                    ts->state != TOKENIZE_QUOTED) {
+                /*
+                 * Finished line, do not read anymore (also do not eat \n).
+                 * If we are in a quoted field and the "line" does not end with
+                 * a newline, the quoted field will not have it either.
+                 * I.e. `np.loadtxt(['"a', 'b"'], dtype="S2", quotechar='"')`
+                 * reads "ab". This matches `next(csv.reader(['"a', 'b"']))`.
+                 */
+                break;
+            }
+            /* fetch new data */
+            ts->buf_state = stream_nextbuf(s,
+                    &ts->pos, &ts->end, &ts->unicode_kind);
+            if (ts->buf_state < 0) {
+                return -1;
+            }
+            if (ts->buf_state == BUFFER_IS_FILEEND) {
+                finished_reading_file = 1;
+                ts->pos = ts->end;  /* stream should ensure this. */
+                break;
+            }
+            else if (ts->pos == ts->end) {
+                /* This must be an empty line (and it must be indicated!). */
+                assert(ts->buf_state == BUFFER_IS_LINEND);
+                break;
+            }
+        }
+        int status;
+        if (ts->unicode_kind == PyUnicode_1BYTE_KIND) {
+            status = tokenizer_core<Py_UCS1>(ts, config);
+        }
+        else if (ts->unicode_kind == PyUnicode_2BYTE_KIND) {
+            status = tokenizer_core<Py_UCS2>(ts, config);
+        }
+        else {
+            assert(ts->unicode_kind == PyUnicode_4BYTE_KIND);
+            status = tokenizer_core<Py_UCS4>(ts, config);
+        }
+        if (status < 0) {
+            return -1;
+        }
+
+        if (ts->state == TOKENIZE_LINE_END) {
+            break;
+        }
+    }
+
+    /*
+     * We have finished tokenizing a full row into fields, finalize result
+     */
+    if (ts->buf_state == BUFFER_IS_LINEND) {
+        /* This line is "finished", make sure we don't touch it again: */
+        ts->buf_state = BUFFER_MAY_CONTAIN_NEWLINE;
+        if (NPY_UNLIKELY(ts->pos < ts->end)) {
+            PyErr_SetString(PyExc_ValueError,
+                    "Found an unquoted embedded newline within a single line of "
+                    "input.  This is currently not supported.");
+            return -1;
+        }
+    }
+
+    /* Finish the last field (we "append" one to store the last one's length) */
+    if (add_field(ts) < 0) {
+        return -1;
+    }
+    ts->num_fields -= 1;
+
+    /*
+     * We always start a new field (at the very beginning and whenever a
+     * delimiter was found).
+     * This gives us two scenarios where we need to ignore the last field
+     * if it is empty:
+     * 1. If there is exactly one empty (unquoted) field, the whole line is
+     *    empty.
+     * 2. If we are splitting on whitespace we always ignore a last empty
+     *    field to match Python's splitting: `" 1 ".split()`.
+     *    (Zero fields are possible when we are only skipping lines)
+     */
+    if (ts->num_fields == 1 || (ts->num_fields > 0
+                && ts->unquoted_state == TOKENIZE_UNQUOTED_WHITESPACE)) {
+        size_t offset_last = ts->fields[ts->num_fields-1].offset;
+        size_t end_last = ts->fields[ts->num_fields].offset;
+        if (!ts->fields->quoted && end_last - offset_last == 1) {
+            ts->num_fields--;
+        }
+    }
+    ts->state = TOKENIZE_INIT;
+    return finished_reading_file;
+}
+
+
+NPY_NO_EXPORT void
+tokenizer_clear(tokenizer_state *ts)
+{
+    PyMem_FREE(ts->field_buffer);
+    ts->field_buffer = nullptr;
+    ts->field_buffer_length = 0;
+
+    PyMem_FREE(ts->fields);
+    ts->fields = nullptr;
+    ts->fields_size = 0;
+}
+
+
+/*
+ * Initialize the tokenizer.  We may want to copy all important config
+ * variables into the tokenizer.  This would improve the cache locality during
+ * tokenizing.
+ */
+NPY_NO_EXPORT int
+tokenizer_init(tokenizer_state *ts, parser_config *config)
+{
+    /* State and buf_state could be moved into tokenize if we go by row */
+    ts->buf_state = BUFFER_MAY_CONTAIN_NEWLINE;
+    ts->state = TOKENIZE_INIT;
+    if (config->delimiter_is_whitespace) {
+        ts->unquoted_state = TOKENIZE_UNQUOTED_WHITESPACE;
+    }
+    else {
+        ts->unquoted_state = TOKENIZE_UNQUOTED;
+    }
+    ts->num_fields = 0;
+
+    ts->buf_state = 0;
+    ts->pos = nullptr;
+    ts->end = nullptr;
+
+    ts->field_buffer = (Py_UCS4 *)PyMem_Malloc(32 * sizeof(Py_UCS4));
+    if (ts->field_buffer == nullptr) {
+        PyErr_NoMemory();
+        return -1;
+    }
+    ts->field_buffer_length = 32;
+
+    ts->fields = (field_info *)PyMem_Malloc(4 * sizeof(*ts->fields));
+    if (ts->fields == nullptr) {
+        PyErr_NoMemory();
+        return -1;
+    }
+    ts->fields_size = 4;
+    return 0;
+}
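The quoting rules described in the comment at the top of tokenize.cpp can be sketched on a single, already complete line. This is a simplified illustration and not the tokenizer itself: it assumes `char` data, ignores comments, line endings and leading whitespace, and always returns at least one field. `split_quoted_line` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/* Split one line into fields, honouring the quoting rules sketched above. */
static std::vector<std::string>
split_quoted_line(const std::string &line, char delimiter, char quote)
{
    std::vector<std::string> fields;
    const std::size_t n = line.size();
    std::size_t pos = 0;

    while (true) {
        std::string field;
        /* Quoting is only activated by a quote at the start of the field. */
        bool quoted = (pos < n && line[pos] == quote);
        if (quoted) {
            pos++;  /* skip the opening quote */
        }
        while (pos < n) {
            char c = line[pos];
            if (quoted) {
                if (c != quote) {
                    field += c;  /* delimiters are plain data here */
                    pos++;
                }
                else if (pos + 1 < n && line[pos + 1] == quote) {
                    field += quote;  /* doubled quote: one literal quote */
                    pos += 2;
                }
                else {
                    quoted = false;  /* closing quote, continue unquoted */
                    pos++;
                }
            }
            else if (c == delimiter) {
                break;
            }
            else {
                field += c;  /* quotes seen while unquoted are plain data */
                pos++;
            }
        }
        fields.push_back(field);
        if (pos >= n) {
            break;
        }
        pos++;  /* skip the delimiter and start the next field */
    }
    return fields;
}
```

Applied to the three example lines from that comment with delimiter `,` and quote `"`, the sketch yields the fields the comment lists, including the trailing spaces in `ABC   `.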
diff --git a/numpy/core/src/multiarray/textreading/tokenize.h b/numpy/core/src/multiarray/textreading/tokenize.h
new file mode 100644
index 0000000..a78c6d9
--- /dev/null
+++ b/numpy/core/src/multiarray/textreading/tokenize.h
@@ -0,0 +1,86 @@
+
+#ifndef NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_TOKENIZE_H_
+#define NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_TOKENIZE_H_
+
+#include <Python.h>
+#include "numpy/ndarraytypes.h"
+
+#include "textreading/stream.h"
+#include "textreading/parser_config.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+
+typedef enum {
+    /* Initialization of fields */
+    TOKENIZE_INIT,
+    TOKENIZE_CHECK_QUOTED,
+    /* Main field parsing states */
+    TOKENIZE_UNQUOTED,
+    TOKENIZE_UNQUOTED_WHITESPACE,
+    TOKENIZE_QUOTED,
+    /* Handling of two character control sequences (except "\r\n") */
+    TOKENIZE_QUOTED_CHECK_DOUBLE_QUOTE,
+    /* Line end handling */
+    TOKENIZE_LINE_END,
+    TOKENIZE_EAT_CRLF,  /* "\r\n" support (carriage return, line feed) */
+    TOKENIZE_GOTO_LINE_END,
+} tokenizer_parsing_state;
+
+
+typedef struct {
+    size_t offset;
+    bool quoted;
+} field_info;
+
+
+typedef struct {
+    tokenizer_parsing_state state;
+    /* Either TOKENIZE_UNQUOTED or TOKENIZE_UNQUOTED_WHITESPACE: */
+    tokenizer_parsing_state unquoted_state;
+    int unicode_kind;
+    int buf_state;
+    /* the buffer we are currently working on */
+    char *pos;
+    char *end;
+    /*
+     * Space to copy words into.  The buffer must always be at least two NUL
+     * entries longer (8 bytes) than the actual word (including initially).
+     * The first entry beyond the current word is always NUL'ed on write, the
+     * second entry is there to allow easy appending of an additional empty
+     * word at the end (this word is also NUL terminated).
+     */
+    npy_intp field_buffer_length;
+    npy_intp field_buffer_pos;
+    Py_UCS4 *field_buffer;
+
+    /*
+     * Fields, including information about the field being quoted.  This
+     * always includes one "additional" empty field.  The length of a field
+     * is equal to `fields[i+1].offset - fields[i].offset - 1`.
+     *
+     * The tokenizer assumes at least one field is allocated.
+     */
+    npy_intp num_fields;
+    npy_intp fields_size;
+    field_info *fields;
+} tokenizer_state;
+
+
+NPY_NO_EXPORT void
+tokenizer_clear(tokenizer_state *ts);
+
+
+NPY_NO_EXPORT int
+tokenizer_init(tokenizer_state *ts, parser_config *config);
+
+NPY_NO_EXPORT int
+tokenize(stream *s, tokenizer_state *ts, parser_config *const config);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* NUMPY_CORE_SRC_MULTIARRAY_TEXTREADING_TOKENIZE_H_ */
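The field layout documented in `tokenizer_state` (every word followed by a NUL entry, plus one extra trailing field so that lengths fall out of neighbouring offsets) can be illustrated with a small standalone sketch. It assumes `char32_t` as a stand-in for `Py_UCS4` and starts the first word at offset 0 for simplicity; `packed_row`, `pack_row` and `field_length` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/* All words of a row in one buffer, each followed by a NUL entry. */
struct packed_row {
    std::vector<char32_t> buffer;
    /* One entry more than the number of words (the extra empty field). */
    std::vector<std::size_t> offsets;
};

static packed_row
pack_row(const std::vector<std::u32string> &words)
{
    packed_row row;
    for (const std::u32string &word : words) {
        row.offsets.push_back(row.buffer.size());
        row.buffer.insert(row.buffer.end(), word.begin(), word.end());
        row.buffer.push_back(U'\0');  /* terminate this word */
    }
    /* Append the extra empty field; it is NUL terminated as well. */
    row.offsets.push_back(row.buffer.size());
    row.buffer.push_back(U'\0');
    return row;
}

/* Length of word i, as given by the formula in the struct comment. */
static std::size_t
field_length(const packed_row &row, std::size_t i)
{
    return row.offsets[i + 1] - row.offsets[i] - 1;
}
```

The extra field is what lets the last real word use the same length formula as every other word.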
diff --git a/numpy/core/src/npymath/ieee754.cpp b/numpy/core/src/npymath/ieee754.cpp
new file mode 100644
index 0000000..2244004
--- /dev/null
+++ b/numpy/core/src/npymath/ieee754.cpp
@@ -0,0 +1,710 @@
+/* -*- c -*- */
+/*
+ * vim:syntax=c
+ *
+ * Low-level routines related to IEEE-754 format
+ */
+#include "numpy/utils.h"
+
+#include "npy_math_common.h"
+#include "npy_math_private.h"
+
+#ifndef HAVE_COPYSIGN
+double
+npy_copysign(double x, double y)
+{
+    npy_uint32 hx, hy;
+    GET_HIGH_WORD(hx, x);
+    GET_HIGH_WORD(hy, y);
+    SET_HIGH_WORD(x, (hx & 0x7fffffff) | (hy & 0x80000000));
+    return x;
+}
+#endif
+
+/*
+ The below code is provided for compilers which do not yet provide C11
+ compatibility (gcc 4.5 and older)
+ */
+#ifndef LDBL_TRUE_MIN
+#define LDBL_TRUE_MIN __LDBL_DENORM_MIN__
+#endif
+
+#if !defined(HAVE_DECL_SIGNBIT)
+#include "_signbit.c"
+
+int
+_npy_signbit_f(float x)
+{
+    return _npy_signbit_d((double)x);
+}
+
+int
+_npy_signbit_ld(long double x)
+{
+    return _npy_signbit_d((double)x);
+}
+#endif
+
+/*
+ * FIXME: There is a lot of redundancy between _next* and npy_nextafter*.
+ * refactor this at some point
+ *
+ * p >= 0, returns x + nulp
+ * p < 0, returns x - nulp
+ */
+static double
+_next(double x, int p)
+{
+    volatile double t;
+    npy_int32 hx, hy, ix;
+    npy_uint32 lx;
+
+    EXTRACT_WORDS(hx, lx, x);
+    ix = hx & 0x7fffffff; /* |x| */
+
+    if (((ix >= 0x7ff00000) && ((ix - 0x7ff00000) | lx) != 0)) /* x is nan */
+        return x;
+    if ((ix | lx) == 0) { /* x == 0 */
+        if (p >= 0) {
+            INSERT_WORDS(x, 0x0, 1); /* return +minsubnormal */
+        }
+        else {
+            INSERT_WORDS(x, 0x80000000, 1); /* return -minsubnormal */
+        }
+        t = x * x;
+        if (t == x)
+            return t;
+        else
+            return x; /* raise underflow flag */
+    }
+    if (p < 0) { /* x -= ulp */
+        if (lx == 0)
+            hx -= 1;
+        lx -= 1;
+    }
+    else { /* x += ulp */
+        lx += 1;
+        if (lx == 0)
+            hx += 1;
+    }
+    hy = hx & 0x7ff00000;
+    if (hy >= 0x7ff00000)
+        return x + x;      /* overflow  */
+    if (hy < 0x00100000) { /* underflow */
+        t = x * x;
+        if (t != x) { /* raise underflow flag */
+            INSERT_WORDS(x, hx, lx);
+            return x;
+        }
+    }
+    INSERT_WORDS(x, hx, lx);
+    return x;
+}
+
+static float
+_next(float x, int p)
+{
+    volatile float t;
+    npy_int32 hx, hy, ix;
+
+    GET_FLOAT_WORD(hx, x);
+    ix = hx & 0x7fffffff; /* |x| */
+
+    if ((ix > 0x7f800000)) /* x is nan */
+        return x;
+    if (ix == 0) { /* x == 0 */
+        if (p >= 0) {
+            SET_FLOAT_WORD(x, 0x0 | 1); /* return +minsubnormal */
+        }
+        else {
+            SET_FLOAT_WORD(x, 0x80000000 | 1); /* return -minsubnormal */
+        }
+        t = x * x;
+        if (t == x)
+            return t;
+        else
+            return x; /* raise underflow flag */
+    }
+    if (p < 0) { /* x -= ulp */
+        hx -= 1;
+    }
+    else { /* x += ulp */
+        hx += 1;
+    }
+    hy = hx & 0x7f800000;
+    if (hy >= 0x7f800000)
+        return x + x;      /* overflow  */
+    if (hy < 0x00800000) { /* underflow */
+        t = x * x;
+        if (t != x) { /* raise underflow flag */
+            SET_FLOAT_WORD(x, hx);
+            return x;
+        }
+    }
+    SET_FLOAT_WORD(x, hx);
+    return x;
+}
+
+#if defined(HAVE_LDOUBLE_DOUBLE_DOUBLE_BE) || \
+        defined(HAVE_LDOUBLE_DOUBLE_DOUBLE_LE)
+
+/*
+ * FIXME: this is ugly and untested. The asm part only works with gcc, and we
+ * should consolidate the GET_LDOUBLE* / SET_LDOUBLE macros
+ */
+#define math_opt_barrier(x)    \
+    ({                         \
+        __typeof(x) __x = x;   \
+        __asm("" : "+m"(__x)); \
+        __x;                   \
+    })
+#define math_force_eval(x) __asm __volatile("" : : "m"(x))
+
+/* only works for big endian */
+typedef union {
+    npy_longdouble value;
+    struct {
+        npy_uint64 msw;
+        npy_uint64 lsw;
+    } parts64;
+    struct {
+        npy_uint32 w0, w1, w2, w3;
+    } parts32;
+} ieee854_long_double_shape_type;
+
+/* Get two 64 bit ints from a long double.  */
+
+#define GET_LDOUBLE_WORDS64(ix0, ix1, d)     \
+    do {                                     \
+        ieee854_long_double_shape_type qw_u; \
+        qw_u.value = (d);                    \
+        (ix0) = qw_u.parts64.msw;            \
+        (ix1) = qw_u.parts64.lsw;            \
+    } while (0)
+
+/* Set a long double from two 64 bit ints.  */
+
+#define SET_LDOUBLE_WORDS64(d, ix0, ix1)     \
+    do {                                     \
+        ieee854_long_double_shape_type qw_u; \
+        qw_u.parts64.msw = (ix0);            \
+        qw_u.parts64.lsw = (ix1);            \
+        (d) = qw_u.value;                    \
+    } while (0)
+
+static long double
+_next(long double x, int p)
+{
+    npy_int64 hx, ihx, ilx;
+    npy_uint64 lx;
+    npy_longdouble u;
+    const npy_longdouble eps = exp2l(-105.); // 0x1.0000000000000p-105L
+
+    GET_LDOUBLE_WORDS64(hx, lx, x);
+    ihx = hx & 0x7fffffffffffffffLL; /* |hx| */
+    ilx = lx & 0x7fffffffffffffffLL; /* |lx| */
+
+    if (((ihx & 0x7ff0000000000000LL) == 0x7ff0000000000000LL) &&
+        ((ihx & 0x000fffffffffffffLL) != 0)) {
+        return x; /* signal the nan */
+    }
+    if (ihx == 0 && ilx == 0) {          /* x == 0 */
+        SET_LDOUBLE_WORDS64(x, p, 0ULL); /* return +-minsubnormal */
+        u = x * x;
+        if (u == x) {
+            return u;
+        }
+        else {
+            return x; /* raise underflow flag */
+        }
+    }
+
+    if (p < 0) { /* p < 0, x -= ulp */
+        if ((hx == 0xffefffffffffffffLL) && (lx == 0xfc8ffffffffffffeLL))
+            return x + x; /* overflow, return -inf */
+        if (hx >= 0x7ff0000000000000LL) {
+            SET_LDOUBLE_WORDS64(u, 0x7fefffffffffffffLL, 0x7c8ffffffffffffeLL);
+            return u;
+        }
+        if (ihx <= 0x0360000000000000LL) { /* x <= LDBL_MIN */
+            u = math_opt_barrier(x);
+            x -= LDBL_TRUE_MIN;
+            if (ihx < 0x0360000000000000LL || (hx > 0 && (npy_int64)lx <= 0) ||
+                (hx < 0 && (npy_int64)lx > 1)) {
+                u = u * u;
+                math_force_eval(u); /* raise underflow flag */
+            }
+            return x;
+        }
+        if (ihx < 0x06a0000000000000LL) { /* ulp will be denormal */
+            SET_LDOUBLE_WORDS64(u, (hx & 0x7ff0000000000000LL), 0ULL);
+            u *= eps;
+        }
+        else
+            SET_LDOUBLE_WORDS64(
+                    u, (hx & 0x7ff0000000000000LL) - 0x0690000000000000LL,
+                    0ULL);
+        return x - u;
+    }
+    else { /* p >= 0, x += ulp */
+        if ((hx == 0x7fefffffffffffffLL) && (lx == 0x7c8ffffffffffffeLL))
+            return x + x; /* overflow, return +inf */
+        if ((npy_uint64)hx >= 0xfff0000000000000ULL) {
+            SET_LDOUBLE_WORDS64(u, 0xffefffffffffffffLL, 0xfc8ffffffffffffeLL);
+            return u;
+        }
+        if (ihx <= 0x0360000000000000LL) { /* x <= LDBL_MIN */
+            u = math_opt_barrier(x);
+            x += LDBL_TRUE_MIN;
+            if (ihx < 0x0360000000000000LL ||
+                (hx > 0 && (npy_int64)lx < 0 && lx != 0x8000000000000001LL) ||
+                (hx < 0 && (npy_int64)lx >= 0)) {
+                u = u * u;
+                math_force_eval(u); /* raise underflow flag */
+            }
+            if (x == 0.0L) /* handle negative LDBL_TRUE_MIN case */
+                x = -0.0L;
+            return x;
+        }
+        if (ihx < 0x06a0000000000000LL) { /* ulp will be denormal */
+            SET_LDOUBLE_WORDS64(u, (hx & 0x7ff0000000000000LL), 0ULL);
+            u *= eps;
+        }
+        else
+            SET_LDOUBLE_WORDS64(
+                    u, (hx & 0x7ff0000000000000LL) - 0x0690000000000000LL,
+                    0ULL);
+        return x + u;
+    }
+}
+#else
+static long double
+_next(long double x, int p)
+{
+    volatile npy_longdouble t;
+    union IEEEl2bitsrep ux;
+
+    ux.e = x;
+
+    if ((GET_LDOUBLE_EXP(ux) == 0x7fff &&
+         ((GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT) | GET_LDOUBLE_MANL(ux)) != 0)) {
+        return ux.e; /* x is nan */
+    }
+    if (ux.e == 0.0) {
+        SET_LDOUBLE_MANH(ux, 0); /* return +-minsubnormal */
+        SET_LDOUBLE_MANL(ux, 1);
+        if (p >= 0) {
+            SET_LDOUBLE_SIGN(ux, 0);
+        }
+        else {
+            SET_LDOUBLE_SIGN(ux, 1);
+        }
+        t = ux.e * ux.e;
+        if (t == ux.e) {
+            return t;
+        }
+        else {
+            return ux.e; /* raise underflow flag */
+        }
+    }
+    if (p < 0) { /* x -= ulp */
+        if (GET_LDOUBLE_MANL(ux) == 0) {
+            if ((GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT) == 0) {
+                SET_LDOUBLE_EXP(ux, GET_LDOUBLE_EXP(ux) - 1);
+            }
+            SET_LDOUBLE_MANH(ux, (GET_LDOUBLE_MANH(ux) - 1) |
+                                         (GET_LDOUBLE_MANH(ux) & LDBL_NBIT));
+        }
+        SET_LDOUBLE_MANL(ux, GET_LDOUBLE_MANL(ux) - 1);
+    }
+    else { /* x += ulp */
+        SET_LDOUBLE_MANL(ux, GET_LDOUBLE_MANL(ux) + 1);
+        if (GET_LDOUBLE_MANL(ux) == 0) {
+            SET_LDOUBLE_MANH(ux, (GET_LDOUBLE_MANH(ux) + 1) |
+                                         (GET_LDOUBLE_MANH(ux) & LDBL_NBIT));
+            if ((GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT) == 0) {
+                SET_LDOUBLE_EXP(ux, GET_LDOUBLE_EXP(ux) + 1);
+            }
+        }
+    }
+    if (GET_LDOUBLE_EXP(ux) == 0x7fff) {
+        return ux.e + ux.e; /* overflow  */
+    }
+    if (GET_LDOUBLE_EXP(ux) == 0) { /* underflow */
+        if (LDBL_NBIT) {
+            SET_LDOUBLE_MANH(ux, GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT);
+        }
+        t = ux.e * ux.e;
+        if (t != ux.e) { /* raise underflow flag */
+            return ux.e;
+        }
+    }
+
+    return ux.e;
+}
+#endif
+
+/*
+ * nextafter code taken from BSD math lib, the code contains the following
+ * notice:
+ *
+ * ====================================================
+ * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
+ *
+ * Developed at SunPro, a Sun Microsystems, Inc. business.
+ * Permission to use, copy, modify, and distribute this
+ * software is freely granted, provided that this notice
+ * is preserved.
+ * ====================================================
+ */
+
+#ifndef HAVE_NEXTAFTER
+double
+npy_nextafter(double x, double y)
+{
+    volatile double t;
+    npy_int32 hx, hy, ix, iy;
+    npy_uint32 lx, ly;
+
+    EXTRACT_WORDS(hx, lx, x);
+    EXTRACT_WORDS(hy, ly, y);
+    ix = hx & 0x7fffffff; /* |x| */
+    iy = hy & 0x7fffffff; /* |y| */
+
+    if (((ix >= 0x7ff00000) && ((ix - 0x7ff00000) | lx) != 0) || /* x is nan */
+        ((iy >= 0x7ff00000) && ((iy - 0x7ff00000) | ly) != 0))   /* y is nan */
+        return x + y;
+    if (x == y)
+        return y;                            /* x=y, return y */
+    if ((ix | lx) == 0) {                    /* x == 0 */
+        INSERT_WORDS(x, hy & 0x80000000, 1); /* return +-minsubnormal */
+        t = x * x;
+        if (t == x)
+            return t;
+        else
+            return x; /* raise underflow flag */
+    }
+    if (hx >= 0) {                                  /* x > 0 */
+        if (hx > hy || ((hx == hy) && (lx > ly))) { /* x > y, x -= ulp */
+            if (lx == 0)
+                hx -= 1;
+            lx -= 1;
+        }
+        else { /* x < y, x += ulp */
+            lx += 1;
+            if (lx == 0)
+                hx += 1;
+        }
+    }
+    else { /* x < 0 */
+        if (hy >= 0 || hx > hy ||
+            ((hx == hy) && (lx > ly))) { /* x < y, x -= ulp */
+            if (lx == 0)
+                hx -= 1;
+            lx -= 1;
+        }
+        else { /* x > y, x += ulp */
+            lx += 1;
+            if (lx == 0)
+                hx += 1;
+        }
+    }
+    hy = hx & 0x7ff00000;
+    if (hy >= 0x7ff00000)
+        return x + x;      /* overflow  */
+    if (hy < 0x00100000) { /* underflow */
+        t = x * x;
+        if (t != x) { /* raise underflow flag */
+            INSERT_WORDS(y, hx, lx);
+            return y;
+        }
+    }
+    INSERT_WORDS(x, hx, lx);
+    return x;
+}
+#endif
+
+#ifndef HAVE_NEXTAFTERF
+float
+npy_nextafterf(float x, float y)
+{
+    volatile float t;
+    npy_int32 hx, hy, ix, iy;
+
+    GET_FLOAT_WORD(hx, x);
+    GET_FLOAT_WORD(hy, y);
+    ix = hx & 0x7fffffff; /* |x| */
+    iy = hy & 0x7fffffff; /* |y| */
+
+    if ((ix > 0x7f800000) || /* x is nan */
+        (iy > 0x7f800000))   /* y is nan */
+        return x + y;
+    if (x == y)
+        return y;                                 /* x=y, return y */
+    if (ix == 0) {                                /* x == 0 */
+        SET_FLOAT_WORD(x, (hy & 0x80000000) | 1); /* return +-minsubnormal */
+        t = x * x;
+        if (t == x)
+            return t;
+        else
+            return x; /* raise underflow flag */
+    }
+    if (hx >= 0) {     /* x > 0 */
+        if (hx > hy) { /* x > y, x -= ulp */
+            hx -= 1;
+        }
+        else { /* x < y, x += ulp */
+            hx += 1;
+        }
+    }
+    else {                        /* x < 0 */
+        if (hy >= 0 || hx > hy) { /* x < y, x -= ulp */
+            hx -= 1;
+        }
+        else { /* x > y, x += ulp */
+            hx += 1;
+        }
+    }
+    hy = hx & 0x7f800000;
+    if (hy >= 0x7f800000)
+        return x + x;      /* overflow  */
+    if (hy < 0x00800000) { /* underflow */
+        t = x * x;
+        if (t != x) { /* raise underflow flag */
+            SET_FLOAT_WORD(y, hx);
+            return y;
+        }
+    }
+    SET_FLOAT_WORD(x, hx);
+    return x;
+}
+#endif
+
+#ifndef HAVE_NEXTAFTERL
+npy_longdouble
+npy_nextafterl(npy_longdouble x, npy_longdouble y)
+{
+    volatile npy_longdouble t;
+    union IEEEl2bitsrep ux;
+    union IEEEl2bitsrep uy;
+
+    ux.e = x;
+    uy.e = y;
+
+    if ((GET_LDOUBLE_EXP(ux) == 0x7fff &&
+         ((GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT) | GET_LDOUBLE_MANL(ux)) != 0) ||
+        (GET_LDOUBLE_EXP(uy) == 0x7fff &&
+         ((GET_LDOUBLE_MANH(uy) & ~LDBL_NBIT) | GET_LDOUBLE_MANL(uy)) != 0)) {
+        return ux.e + uy.e; /* x or y is nan */
+    }
+    if (ux.e == uy.e) {
+        return uy.e; /* x=y, return y */
+    }
+    if (ux.e == 0.0) {
+        SET_LDOUBLE_MANH(ux, 0); /* return +-minsubnormal */
+        SET_LDOUBLE_MANL(ux, 1);
+        SET_LDOUBLE_SIGN(ux, GET_LDOUBLE_SIGN(uy));
+        t = ux.e * ux.e;
+        if (t == ux.e) {
+            return t;
+        }
+        else {
+            return ux.e; /* raise underflow flag */
+        }
+    }
+    if ((ux.e > 0.0) ^ (ux.e < uy.e)) { /* x -= ulp */
+        if (GET_LDOUBLE_MANL(ux) == 0) {
+            if ((GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT) == 0) {
+                SET_LDOUBLE_EXP(ux, GET_LDOUBLE_EXP(ux) - 1);
+            }
+            SET_LDOUBLE_MANH(ux, (GET_LDOUBLE_MANH(ux) - 1) |
+                                         (GET_LDOUBLE_MANH(ux) & LDBL_NBIT));
+        }
+        SET_LDOUBLE_MANL(ux, GET_LDOUBLE_MANL(ux) - 1);
+    }
+    else { /* x += ulp */
+        SET_LDOUBLE_MANL(ux, GET_LDOUBLE_MANL(ux) + 1);
+        if (GET_LDOUBLE_MANL(ux) == 0) {
+            SET_LDOUBLE_MANH(ux, (GET_LDOUBLE_MANH(ux) + 1) |
+                                         (GET_LDOUBLE_MANH(ux) & LDBL_NBIT));
+            if ((GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT) == 0) {
+                SET_LDOUBLE_EXP(ux, GET_LDOUBLE_EXP(ux) + 1);
+            }
+        }
+    }
+    if (GET_LDOUBLE_EXP(ux) == 0x7fff) {
+        return ux.e + ux.e; /* overflow  */
+    }
+    if (GET_LDOUBLE_EXP(ux) == 0) { /* underflow */
+        if (LDBL_NBIT) {
+            SET_LDOUBLE_MANH(ux, GET_LDOUBLE_MANH(ux) & ~LDBL_NBIT);
+        }
+        t = ux.e * ux.e;
+        if (t != ux.e) { /* raise underflow flag */
+            return ux.e;
+        }
+    }
+
+    return ux.e;
+}
+#endif
+
+namespace {
+template <typename T>
+struct numeric_limits;
+
+template <>
+struct numeric_limits<float> {
+    static const npy_float nan;
+};
+const npy_float numeric_limits<float>::nan = NPY_NANF;
+
+template <>
+struct numeric_limits<double> {
+    static const npy_double nan;
+};
+const npy_double numeric_limits<double>::nan = NPY_NAN;
+
+template <>
+struct numeric_limits<long double> {
+    static const npy_longdouble nan;
+};
+const npy_longdouble numeric_limits<long double>::nan = NPY_NANL;
+}  // namespace
+
+template <typename type>
+static type
+_npy_spacing(type x)
+{
+    /* XXX: npy isnan/isinf may be optimized by bit twiddling */
+    if (npy_isinf(x)) {
+        return numeric_limits<type>::nan;
+    }
+
+    return _next(x, 1) - x;
+}
+
+/*
+ * Instantiation of C interface
+ */
+extern "C" {
+npy_float
+npy_spacingf(npy_float x)
+{
+    return _npy_spacing(x);
+}
+npy_double
+npy_spacing(npy_double x)
+{
+    return _npy_spacing(x);
+}
+npy_longdouble
+npy_spacingl(npy_longdouble x)
+{
+    return _npy_spacing(x);
+}
+}
+
+/*
+ * Decorate all the math functions which are available on the current platform
+ */
+
+#ifdef HAVE_NEXTAFTERF
+extern "C" float
+npy_nextafterf(float x, float y)
+{
+    return nextafterf(x, y);
+}
+#endif
+
+#ifdef HAVE_NEXTAFTER
+extern "C" double
+npy_nextafter(double x, double y)
+{
+    return nextafter(x, y);
+}
+#endif
+
+#ifdef HAVE_NEXTAFTERL
+extern "C" npy_longdouble
+npy_nextafterl(npy_longdouble x, npy_longdouble y)
+{
+    return nextafterl(x, y);
+}
+#endif
+
+extern "C" int
+npy_clear_floatstatus()
+{
+    char x = 0;
+    return npy_clear_floatstatus_barrier(&x);
+}
+extern "C" int
+npy_get_floatstatus()
+{
+    char x = 0;
+    return npy_get_floatstatus_barrier(&x);
+}
+
+
+/*
+ * General C99 code for floating point error handling.  These functions mainly
+ * exist because `fenv.h` was not standardized in C89, so they gave better
+ * portability.  This should be unnecessary with C99/C++11, and further
+ * functionality can be used from `fenv.h` directly.
+ */
+#include <fenv.h>
+
+extern "C" int
+npy_get_floatstatus_barrier(char *param)
+{
+    int fpstatus = fetestexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW |
+                                FE_INVALID);
+    /*
+     * By using a volatile, the compiler cannot reorder this call
+     */
+    if (param != NULL) {
+        volatile char NPY_UNUSED(c) = *(char *)param;
+    }
+
+    return ((FE_DIVBYZERO & fpstatus) ? NPY_FPE_DIVIDEBYZERO : 0) |
+           ((FE_OVERFLOW & fpstatus) ? NPY_FPE_OVERFLOW : 0) |
+           ((FE_UNDERFLOW & fpstatus) ? NPY_FPE_UNDERFLOW : 0) |
+           ((FE_INVALID & fpstatus) ? NPY_FPE_INVALID : 0);
+}
+
+extern "C" int
+npy_clear_floatstatus_barrier(char *param)
+{
+    /* testing float status is 50-100 times faster than clearing on x86 */
+    int fpstatus = npy_get_floatstatus_barrier(param);
+    if (fpstatus != 0) {
+        feclearexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW | FE_INVALID);
+    }
+
+    return fpstatus;
+}
+
+extern "C" void
+npy_set_floatstatus_divbyzero(void)
+{
+    feraiseexcept(FE_DIVBYZERO);
+}
+
+extern "C" void
+npy_set_floatstatus_overflow(void)
+{
+    feraiseexcept(FE_OVERFLOW);
+}
+
+extern "C" void
+npy_set_floatstatus_underflow(void)
+{
+    feraiseexcept(FE_UNDERFLOW);
+}
+
+extern "C" void
+npy_set_floatstatus_invalid(void)
+{
+    feraiseexcept(FE_INVALID);
+}
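
Note (illustration only, not part of the upstream patch): the status helpers added above reduce to testing, clearing, and raising the standard `<cfenv>` exception flags. The sketch below shows that pattern in isolation. The `sketch_*` names and the numeric flag values are assumptions made for this example; they stand in for NumPy's `NPY_FPE_*` constants, whose definitions are not shown in this diff.

```cpp
#include <cfenv>

// Placeholder flag values for illustration; NumPy defines its own NPY_FPE_*.
enum sketch_fpe : int {
    SKETCH_FPE_DIVIDEBYZERO = 1,
    SKETCH_FPE_OVERFLOW = 2,
    SKETCH_FPE_UNDERFLOW = 4,
    SKETCH_FPE_INVALID = 8
};

// Translate the platform's FE_* bits into the portable flags above.
inline int
sketch_get_status()
{
    const int fpstatus = std::fetestexcept(FE_DIVBYZERO | FE_OVERFLOW |
                                           FE_UNDERFLOW | FE_INVALID);
    return ((fpstatus & FE_DIVBYZERO) ? SKETCH_FPE_DIVIDEBYZERO : 0) |
           ((fpstatus & FE_OVERFLOW) ? SKETCH_FPE_OVERFLOW : 0) |
           ((fpstatus & FE_UNDERFLOW) ? SKETCH_FPE_UNDERFLOW : 0) |
           ((fpstatus & FE_INVALID) ? SKETCH_FPE_INVALID : 0);
}

// Test first and clear only when a flag is pending, then report what was set.
inline int
sketch_clear_status()
{
    const int status = sketch_get_status();
    if (status != 0) {
        std::feclearexcept(FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW |
                           FE_INVALID);
    }
    return status;
}
```

A caller would typically invoke `sketch_clear_status()` before a computation and `sketch_get_status()` after it; testing before clearing follows the patch's own comment that testing the status is much cheaper than clearing it on x86.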
diff --git a/numpy/core/src/npymath/npy_math_complex.c.src b/numpy/core/src/npymath/npy_math_complex.c.src

index 8c432e483982e986eadab82b64b4970052b41c2e..ce2772273fb30fc0f8d6a7dc51b23b02769436b6 100644
--- a/numpy/core/src/npymath/npy_math_complex.c.src
+++ b/numpy/core/src/npymath/npy_math_complex.c.src
@@ -1696,7 +1696,7 @@ npy_catanh@c@(@ctype@ z)
      if (ax < SQRT_3_EPSILON / 2 && ay < SQRT_3_EPSILON / 2) {
          /*
           * z = 0 was filtered out above.  All other cases must raise
-         * inexact, but this is the only only that needs to do it
+         * inexact, but this is the only one that needs to do it
           * explicitly.
           */
          raise_inexact();
diff --git a/numpy/core/src/npysort/binsearch.c.src b/numpy/core/src/npysort/binsearch.c.src

deleted file mode 100644
index 4116589..0000000
--- a/numpy/core/src/npysort/binsearch.c.src
+++ /dev/null
@@ -1,250 +0,0 @@
-/* -*- c -*- */
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "npy_sort.h"
-#include "npysort_common.h"
-#include "npy_binsearch.h"
-
-#define NOT_USED NPY_UNUSED(unused)
-
-/*
- *****************************************************************************
- **                            NUMERIC SEARCHES                             **
- *****************************************************************************
- */
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE, DATETIME, TIMEDELTA#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble, datetime, timedelta#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble, npy_datetime, npy_timedelta#
- */
-
-#define @TYPE@_LTE(a, b) (!@TYPE@_LT((b), (a)))
-
-/**begin repeat1
- *
- * #side = left, right#
- * #CMP  = LT, LTE#
- */
-
-NPY_NO_EXPORT void
-binsearch_@side@_@suff@(const char *arr, const char *key, char *ret,
-                        npy_intp arr_len, npy_intp key_len,
-                        npy_intp arr_str, npy_intp key_str, npy_intp ret_str,
-                        PyArrayObject *NOT_USED)
-{
-    npy_intp min_idx = 0;
-    npy_intp max_idx = arr_len;
-    @type@ last_key_val;
-
-    if (key_len == 0) {
-        return;
-    }
-    last_key_val = *(const @type@ *)key;
-
-    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
-        const @type@ key_val = *(const @type@ *)key;
-        /*
-         * Updating only one of the indices based on the previous key
-         * gives the search a big boost when keys are sorted, but slightly
-         * slows down things for purely random ones.
-         */
-        if (@TYPE@_LT(last_key_val, key_val)) {
-            max_idx = arr_len;
-        }
-        else {
-            min_idx = 0;
-            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
-        }
-
-        last_key_val = key_val;
-
-        while (min_idx < max_idx) {
-            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
-            const @type@ mid_val = *(const @type@ *)(arr + mid_idx*arr_str);
-            if (@TYPE@_@CMP@(mid_val, key_val)) {
-                min_idx = mid_idx + 1;
-            }
-            else {
-                max_idx = mid_idx;
-            }
-        }
-        *(npy_intp *)ret = min_idx;
-    }
-}
-
-NPY_NO_EXPORT int
-argbinsearch_@side@_@suff@(const char *arr, const char *key,
-                           const char *sort, char *ret,
-                           npy_intp arr_len, npy_intp key_len,
-                           npy_intp arr_str, npy_intp key_str,
-                           npy_intp sort_str, npy_intp ret_str,
-                           PyArrayObject *NOT_USED)
-{
-    npy_intp min_idx = 0;
-    npy_intp max_idx = arr_len;
-    @type@ last_key_val;
-
-    if (key_len == 0) {
-        return 0;
-    }
-    last_key_val = *(const @type@ *)key;
-
-    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
-        const @type@ key_val = *(const @type@ *)key;
-        /*
-         * Updating only one of the indices based on the previous key
-         * gives the search a big boost when keys are sorted, but slightly
-         * slows down things for purely random ones.
-         */
-        if (@TYPE@_LT(last_key_val, key_val)) {
-            max_idx = arr_len;
-        }
-        else {
-            min_idx = 0;
-            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
-        }
-
-        last_key_val = key_val;
-
-        while (min_idx < max_idx) {
-            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
-            const npy_intp sort_idx = *(npy_intp *)(sort + mid_idx*sort_str);
-            @type@ mid_val;
-
-            if (sort_idx < 0 || sort_idx >= arr_len) {
-                return -1;
-            }
-
-            mid_val = *(const @type@ *)(arr + sort_idx*arr_str);
-
-            if (@TYPE@_@CMP@(mid_val, key_val)) {
-                min_idx = mid_idx + 1;
-            }
-            else {
-                max_idx = mid_idx;
-            }
-        }
-        *(npy_intp *)ret = min_idx;
-    }
-    return 0;
-}
-
-/**end repeat1**/
-/**end repeat**/
-
-/*
- *****************************************************************************
- **                             GENERIC SEARCH                              **
- *****************************************************************************
- */
-
- /**begin repeat
- *
- * #side = left, right#
- * #CMP  = <, <=#
- */
-
-NPY_NO_EXPORT void
-npy_binsearch_@side@(const char *arr, const char *key, char *ret,
-                     npy_intp arr_len, npy_intp key_len,
-                     npy_intp arr_str, npy_intp key_str, npy_intp ret_str,
-                     PyArrayObject *cmp)
-{
-    PyArray_CompareFunc *compare = PyArray_DESCR(cmp)->f->compare;
-    npy_intp min_idx = 0;
-    npy_intp max_idx = arr_len;
-    const char *last_key = key;
-
-    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
-        /*
-         * Updating only one of the indices based on the previous key
-         * gives the search a big boost when keys are sorted, but slightly
-         * slows down things for purely random ones.
-         */
-        if (compare(last_key, key, cmp) @CMP@ 0) {
-            max_idx = arr_len;
-        }
-        else {
-            min_idx = 0;
-            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
-        }
-
-        last_key = key;
-
-        while (min_idx < max_idx) {
-            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
-            const char *arr_ptr = arr + mid_idx*arr_str;
-
-            if (compare(arr_ptr, key, cmp) @CMP@ 0) {
-                min_idx = mid_idx + 1;
-            }
-            else {
-                max_idx = mid_idx;
-            }
-        }
-        *(npy_intp *)ret = min_idx;
-    }
-}
-
-NPY_NO_EXPORT int
-npy_argbinsearch_@side@(const char *arr, const char *key,
-                        const char *sort, char *ret,
-                        npy_intp arr_len, npy_intp key_len,
-                        npy_intp arr_str, npy_intp key_str,
-                        npy_intp sort_str, npy_intp ret_str,
-                        PyArrayObject *cmp)
-{
-    PyArray_CompareFunc *compare = PyArray_DESCR(cmp)->f->compare;
-    npy_intp min_idx = 0;
-    npy_intp max_idx = arr_len;
-    const char *last_key = key;
-
-    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
-        /*
-         * Updating only one of the indices based on the previous key
-         * gives the search a big boost when keys are sorted, but slightly
-         * slows down things for purely random ones.
-         */
-        if (compare(last_key, key, cmp) @CMP@ 0) {
-            max_idx = arr_len;
-        }
-        else {
-            min_idx = 0;
-            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
-        }
-
-        last_key = key;
-
-        while (min_idx < max_idx) {
-            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
-            const npy_intp sort_idx = *(npy_intp *)(sort + mid_idx*sort_str);
-            const char *arr_ptr;
-
-            if (sort_idx < 0 || sort_idx >= arr_len) {
-                return -1;
-            }
-
-            arr_ptr = arr + sort_idx*arr_str;
-
-            if (compare(arr_ptr, key, cmp) @CMP@ 0) {
-                min_idx = mid_idx + 1;
-            }
-            else {
-                max_idx = mid_idx;
-            }
-        }
-        *(npy_intp *)ret = min_idx;
-    }
-    return 0;
-}
-
-/**end repeat**/
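
Note (illustration only, not part of the upstream patch): the comment repeated in each search loop above says that updating only one index from the previous key speeds up sorted key sequences. The sketch below isolates that idea for the left-side case over `int` values. The name `sketch_search_left_many` and the use of `std::vector` are assumptions made for this example; the real functions work on strided `char *` buffers.

```cpp
#include <cstddef>
#include <vector>

// Left-side insertion points for many keys, reusing the previous result as a
// bound. Hypothetical helper for illustration; not a NumPy function.
inline std::vector<std::ptrdiff_t>
sketch_search_left_many(const std::vector<int> &arr,
                        const std::vector<int> &keys)
{
    std::vector<std::ptrdiff_t> ret;
    if (keys.empty()) {
        return ret;
    }
    const std::ptrdiff_t arr_len = static_cast<std::ptrdiff_t>(arr.size());
    std::ptrdiff_t min_idx = 0;
    std::ptrdiff_t max_idx = arr_len;
    int last_key = keys[0];

    for (int key : keys) {
        if (last_key < key) {
            // Key grew: the previous result is still a valid lower bound,
            // so only the upper bound is reset.
            max_idx = arr_len;
        }
        else {
            // Key did not grow: the previous result (plus one) bounds the
            // answer from above, so only the lower bound is reset.
            min_idx = 0;
            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
        }
        last_key = key;

        while (min_idx < max_idx) {
            const std::ptrdiff_t mid_idx = min_idx + ((max_idx - min_idx) >> 1);
            if (arr[mid_idx] < key) {
                min_idx = mid_idx + 1;
            }
            else {
                max_idx = mid_idx;
            }
        }
        ret.push_back(min_idx);
    }
    return ret;
}
```

With ascending keys each search starts from the previous insertion point instead of index 0, which is where the gain comes from; for unordered keys the extra comparison per key is pure overhead, as the original comment notes.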
diff --git a/numpy/core/src/npysort/binsearch.cpp b/numpy/core/src/npysort/binsearch.cpp

new file mode 100644
index 0000000..98d3059
--- /dev/null
+++ b/numpy/core/src/npysort/binsearch.cpp
@@ -0,0 +1,403 @@
+/* -*- c -*- */
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "numpy/ndarraytypes.h"
+#include "numpy/npy_common.h"
+
+#include "npy_binsearch.h"
+#include "npy_sort.h"
+#include "numpy_tag.h"
+
+#include <array>
+#include <functional>  // for std::less and std::less_equal
+
+// Enumerators for the variant of binsearch
+enum arg_t
+{
+    noarg,
+    arg
+};
+enum side_t
+{
+    left,
+    right
+};
+
+// Mapping from enumerators to comparators
+template <class Tag, side_t side>
+struct side_to_cmp;
+
+template <class Tag>
+struct side_to_cmp<Tag, left> {
+    static constexpr auto value = Tag::less;
+};
+
+template <class Tag>
+struct side_to_cmp<Tag, right> {
+    static constexpr auto value = Tag::less_equal;
+};
+
+template <side_t side>
+struct side_to_generic_cmp;
+
+template <>
+struct side_to_generic_cmp<left> {
+    using type = std::less<int>;
+};
+
+template <>
+struct side_to_generic_cmp<right> {
+    using type = std::less_equal<int>;
+};
+
+/*
+ *****************************************************************************
+ **                            NUMERIC SEARCHES                             **
+ *****************************************************************************
+ */
+template <class Tag, side_t side>
+static void
+binsearch(const char *arr, const char *key, char *ret, npy_intp arr_len,
+          npy_intp key_len, npy_intp arr_str, npy_intp key_str,
+          npy_intp ret_str, PyArrayObject *)
+{
+    using T = typename Tag::type;
+    auto cmp = side_to_cmp<Tag, side>::value;
+    npy_intp min_idx = 0;
+    npy_intp max_idx = arr_len;
+    T last_key_val;
+
+    if (key_len == 0) {
+        return;
+    }
+    last_key_val = *(const T *)key;
+
+    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
+        const T key_val = *(const T *)key;
+        /*
+         * Updating only one of the indices based on the previous key
+         * gives the search a big boost when keys are sorted, but slightly
+         * slows down things for purely random ones.
+         */
+        if (cmp(last_key_val, key_val)) {
+            max_idx = arr_len;
+        }
+        else {
+            min_idx = 0;
+            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
+        }
+
+        last_key_val = key_val;
+
+        while (min_idx < max_idx) {
+            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+            const T mid_val = *(const T *)(arr + mid_idx * arr_str);
+            if (cmp(mid_val, key_val)) {
+                min_idx = mid_idx + 1;
+            }
+            else {
+                max_idx = mid_idx;
+            }
+        }
+        *(npy_intp *)ret = min_idx;
+    }
+}
+
+template <class Tag, side_t side>
+static int
+argbinsearch(const char *arr, const char *key, const char *sort, char *ret,
+             npy_intp arr_len, npy_intp key_len, npy_intp arr_str,
+             npy_intp key_str, npy_intp sort_str, npy_intp ret_str,
+             PyArrayObject *)
+{
+    using T = typename Tag::type;
+    auto cmp = side_to_cmp<Tag, side>::value;
+    npy_intp min_idx = 0;
+    npy_intp max_idx = arr_len;
+    T last_key_val;
+
+    if (key_len == 0) {
+        return 0;
+    }
+    last_key_val = *(const T *)key;
+
+    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
+        const T key_val = *(const T *)key;
+        /*
+         * Updating only one of the indices based on the previous key
+         * gives the search a big boost when keys are sorted, but slightly
+         * slows down things for purely random ones.
+         */
+        if (cmp(last_key_val, key_val)) {
+            max_idx = arr_len;
+        }
+        else {
+            min_idx = 0;
+            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
+        }
+
+        last_key_val = key_val;
+
+        while (min_idx < max_idx) {
+            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+            const npy_intp sort_idx = *(npy_intp *)(sort + mid_idx * sort_str);
+            T mid_val;
+
+            if (sort_idx < 0 || sort_idx >= arr_len) {
+                return -1;
+            }
+
+            mid_val = *(const T *)(arr + sort_idx * arr_str);
+
+            if (cmp(mid_val, key_val)) {
+                min_idx = mid_idx + 1;
+            }
+            else {
+                max_idx = mid_idx;
+            }
+        }
+        *(npy_intp *)ret = min_idx;
+    }
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             GENERIC SEARCH                              **
+ *****************************************************************************
+ */
+
+template <side_t side>
+static void
+npy_binsearch(const char *arr, const char *key, char *ret, npy_intp arr_len,
+              npy_intp key_len, npy_intp arr_str, npy_intp key_str,
+              npy_intp ret_str, PyArrayObject *cmp)
+{
+    using Cmp = typename side_to_generic_cmp<side>::type;
+    PyArray_CompareFunc *compare = PyArray_DESCR(cmp)->f->compare;
+    npy_intp min_idx = 0;
+    npy_intp max_idx = arr_len;
+    const char *last_key = key;
+
+    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
+        /*
+         * Updating only one of the indices based on the previous key
+         * gives the search a big boost when keys are sorted, but slightly
+         * slows down things for purely random ones.
+         */
+        if (Cmp{}(compare(last_key, key, cmp), 0)) {
+            max_idx = arr_len;
+        }
+        else {
+            min_idx = 0;
+            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
+        }
+
+        last_key = key;
+
+        while (min_idx < max_idx) {
+            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+            const char *arr_ptr = arr + mid_idx * arr_str;
+
+            if (Cmp{}(compare(arr_ptr, key, cmp), 0)) {
+                min_idx = mid_idx + 1;
+            }
+            else {
+                max_idx = mid_idx;
+            }
+        }
+        *(npy_intp *)ret = min_idx;
+    }
+}
+
+template <side_t side>
+static int
+npy_argbinsearch(const char *arr, const char *key, const char *sort, char *ret,
+                 npy_intp arr_len, npy_intp key_len, npy_intp arr_str,
+                 npy_intp key_str, npy_intp sort_str, npy_intp ret_str,
+                 PyArrayObject *cmp)
+{
+    using Cmp = typename side_to_generic_cmp<side>::type;
+    PyArray_CompareFunc *compare = PyArray_DESCR(cmp)->f->compare;
+    npy_intp min_idx = 0;
+    npy_intp max_idx = arr_len;
+    const char *last_key = key;
+
+    for (; key_len > 0; key_len--, key += key_str, ret += ret_str) {
+        /*
+         * Updating only one of the indices based on the previous key
+         * gives the search a big boost when keys are sorted, but slightly
+         * slows down things for purely random ones.
+         */
+        if (Cmp{}(compare(last_key, key, cmp), 0)) {
+            max_idx = arr_len;
+        }
+        else {
+            min_idx = 0;
+            max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
+        }
+
+        last_key = key;
+
+        while (min_idx < max_idx) {
+            const npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+            const npy_intp sort_idx = *(npy_intp *)(sort + mid_idx * sort_str);
+            const char *arr_ptr;
+
+            if (sort_idx < 0 || sort_idx >= arr_len) {
+                return -1;
+            }
+
+            arr_ptr = arr + sort_idx * arr_str;
+
+            if (Cmp{}(compare(arr_ptr, key, cmp), 0)) {
+                min_idx = mid_idx + 1;
+            }
+            else {
+                max_idx = mid_idx;
+            }
+        }
+        *(npy_intp *)ret = min_idx;
+    }
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             GENERATOR                                   **
+ *****************************************************************************
+ */
+
+template <arg_t arg>
+struct binsearch_base;
+
+template <>
+struct binsearch_base<arg> {
+    using function_type = PyArray_ArgBinSearchFunc *;
+    struct value_type {
+        int typenum;
+        function_type binsearch[NPY_NSEARCHSIDES];
+    };
+    template <class... Tags>
+    static constexpr std::array<value_type, sizeof...(Tags)>
+    make_binsearch_map(npy::taglist<Tags...>)
+    {
+        return std::array<value_type, sizeof...(Tags)>{
+                value_type{Tags::type_value,
+                           {(function_type)&argbinsearch<Tags, left>,
+                            (function_type)&argbinsearch<Tags, right>}}...};
+    }
+    static constexpr std::array<function_type, 2> npy_map = {
+            (function_type)&npy_argbinsearch<left>,
+            (function_type)&npy_argbinsearch<right>};
+};
+constexpr std::array<binsearch_base<arg>::function_type, 2>
+        binsearch_base<arg>::npy_map;
+
+template <>
+struct binsearch_base<noarg> {
+    using function_type = PyArray_BinSearchFunc *;
+    struct value_type {
+        int typenum;
+        function_type binsearch[NPY_NSEARCHSIDES];
+    };
+    template <class... Tags>
+    static constexpr std::array<value_type, sizeof...(Tags)>
+    make_binsearch_map(npy::taglist<Tags...>)
+    {
+        return std::array<value_type, sizeof...(Tags)>{
+                value_type{Tags::type_value,
+                           {(function_type)&binsearch<Tags, left>,
+                            (function_type)&binsearch<Tags, right>}}...};
+    }
+    static constexpr std::array<function_type, 2> npy_map = {
+            (function_type)&npy_binsearch<left>,
+            (function_type)&npy_binsearch<right>};
+};
+constexpr std::array<binsearch_base<noarg>::function_type, 2>
+        binsearch_base<noarg>::npy_map;
+
+// Handle generation of all binsearch variants
+template <arg_t arg>
+struct binsearch_t : binsearch_base<arg> {
+    using binsearch_base<arg>::make_binsearch_map;
+    using value_type = typename binsearch_base<arg>::value_type;
+
+    using taglist = npy::taglist<
+            /* If adding new types, make sure to keep them ordered by type num
+             */
+            npy::bool_tag, npy::byte_tag, npy::ubyte_tag, npy::short_tag,
+            npy::ushort_tag, npy::int_tag, npy::uint_tag, npy::long_tag,
+            npy::ulong_tag, npy::longlong_tag, npy::ulonglong_tag,
+            npy::half_tag, npy::float_tag, npy::double_tag,
+            npy::longdouble_tag, npy::cfloat_tag, npy::cdouble_tag,
+            npy::clongdouble_tag, npy::datetime_tag, npy::timedelta_tag>;
+
+    static constexpr std::array<value_type, taglist::size> map =
+            make_binsearch_map(taglist());
+};
+
+template <arg_t arg>
+constexpr std::array<typename binsearch_t<arg>::value_type,
+                     binsearch_t<arg>::taglist::size>
+        binsearch_t<arg>::map;
+
+template <arg_t arg>
+static inline typename binsearch_t<arg>::function_type
+_get_binsearch_func(PyArray_Descr *dtype, NPY_SEARCHSIDE side)
+{
+    using binsearch = binsearch_t<arg>;
+    npy_intp nfuncs = binsearch::map.size();
+    npy_intp min_idx = 0;
+    npy_intp max_idx = nfuncs;
+    int type = dtype->type_num;
+
+    if ((int)side >= (int)NPY_NSEARCHSIDES) {
+        return NULL;
+    }
+
+    /*
+     * It seems only fair that a binary search function be searched for
+     * using a binary search...
+     */
+    while (min_idx < max_idx) {
+        npy_intp mid_idx = min_idx + ((max_idx - min_idx) >> 1);
+
+        if (binsearch::map[mid_idx].typenum < type) {
+            min_idx = mid_idx + 1;
+        }
+        else {
+            max_idx = mid_idx;
+        }
+    }
+
+    if (min_idx < nfuncs && binsearch::map[min_idx].typenum == type) {
+        return binsearch::map[min_idx].binsearch[side];
+    }
+
+    if (dtype->f->compare) {
+        return binsearch::npy_map[side];
+    }
+
+    return NULL;
+}
+
+/*
+ *****************************************************************************
+ **                            C INTERFACE                                  **
+ *****************************************************************************
+ */
+extern "C" {
+NPY_NO_EXPORT PyArray_BinSearchFunc *
+get_binsearch_func(PyArray_Descr *dtype, NPY_SEARCHSIDE side)
+{
+    return _get_binsearch_func<noarg>(dtype, side);
+}
+
+NPY_NO_EXPORT PyArray_ArgBinSearchFunc *
+get_argbinsearch_func(PyArray_Descr *dtype, NPY_SEARCHSIDE side)
+{
+    return _get_binsearch_func<arg>(dtype, side);
+}
+}
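
Note (illustration only, not part of the upstream patch): the new file replaces the `#side = left, right# / #CMP = LT, LTE#` template repeats with an enumerator that is mapped to a comparator at compile time. The sketch below shows that mapping on its own, using `std::less` and `std::less_equal` in place of the `Tag::less` and `Tag::less_equal` members, which come from `numpy_tag.h` and are not shown in this diff. All `sketch_*` names are assumptions made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

enum sketch_side { sketch_left, sketch_right };

// Compile-time mapping from search side to comparator type.
template <class T, sketch_side side>
struct sketch_side_to_cmp;

template <class T>
struct sketch_side_to_cmp<T, sketch_left> {
    using type = std::less<T>;  // first index i with arr[i] >= key
};

template <class T>
struct sketch_side_to_cmp<T, sketch_right> {
    using type = std::less_equal<T>;  // first index i with arr[i] > key
};

// Single-key binary search over a sorted vector; the side only changes which
// comparator decides whether to move the lower bound.
template <sketch_side side, class T>
std::size_t
sketch_searchsorted(const std::vector<T> &arr, const T &key)
{
    typename sketch_side_to_cmp<T, side>::type cmp;
    std::size_t min_idx = 0;
    std::size_t max_idx = arr.size();

    while (min_idx < max_idx) {
        const std::size_t mid_idx = min_idx + ((max_idx - min_idx) >> 1);
        if (cmp(arr[mid_idx], key)) {
            min_idx = mid_idx + 1;
        }
        else {
            max_idx = mid_idx;
        }
    }
    return min_idx;
}
```

For a sorted array containing a run of equal elements, `sketch_searchsorted<sketch_left>` returns the index of the first element of the run and `sketch_searchsorted<sketch_right>` returns the index one past its last element, so the two results bracket the run.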
diff --git a/numpy/core/src/npysort/heapsort.c.src b/numpy/core/src/npysort/heapsort.c.src

deleted file mode 100644
index 4bfea13..0000000
--- a/numpy/core/src/npysort/heapsort.c.src
+++ /dev/null
@@ -1,402 +0,0 @@
-/* -*- c -*- */
-
-/*
- * The purpose of this module is to add faster sort functions
- * that are type-specific.  This is done by altering the
- * function table for the builtin descriptors.
- *
- * These sorting functions are copied almost directly from numarray
- * with a few modifications (complex comparisons compare the imaginary
- * part if the real parts are equal, for example), and the names
- * are changed.
- *
- * The original sorting code is due to Charles R. Harris who wrote
- * it for numarray.
- */
-
-/*
- * Quick sort is usually the fastest, but the worst case scenario can
- * be slower than the merge and heap sorts.  The merge sort requires
- * extra memory and so for large arrays may not be useful.
- *
- * The merge sort is *stable*, meaning that equal components
- * are unmoved from their entry versions, so it can be used to
- * implement lexigraphic sorting on multiple keys.
- *
- * The heap sort is included for completeness.
- */
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "npy_sort.h"
-#include "npysort_common.h"
-#include <stdlib.h>
-
-#define NOT_USED NPY_UNUSED(unused)
-#define PYA_QS_STACK 100
-#define SMALL_QUICKSORT 15
-#define SMALL_MERGESORT 20
-#define SMALL_STRING 16
-
-
-/*
- *****************************************************************************
- **                            NUMERIC SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE, DATETIME, TIMEDELTA#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble, datetime, timedelta#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble, npy_datetime, npy_timedelta#
- */
-
-NPY_NO_EXPORT int
-heapsort_@suff@(void *start, npy_intp n, void *NOT_USED)
-{
-    @type@ tmp, *a;
-    npy_intp i,j,l;
-
-    /* The array needs to be offset by one for heapsort indexing */
-    a = (@type@ *)start - 1;
-
-    for (l = n>>1; l > 0; --l) {
-        tmp = a[l];
-        for (i = l, j = l<<1; j <= n;) {
-            if (j < n && @TYPE@_LT(a[j], a[j+1])) {
-                j += 1;
-            }
-            if (@TYPE@_LT(tmp, a[j])) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    for (; n > 1;) {
-        tmp = a[n];
-        a[n] = a[1];
-        n -= 1;
-        for (i = 1, j = 2; j <= n;) {
-            if (j < n && @TYPE@_LT(a[j], a[j+1])) {
-                j++;
-            }
-            if (@TYPE@_LT(tmp, a[j])) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-aheapsort_@suff@(void *vv, npy_intp *tosort, npy_intp n, void *NOT_USED)
-{
-    @type@ *v = vv;
-    npy_intp *a, i,j,l, tmp;
-    /* The arrays need to be offset by one for heapsort indexing */
-    a = tosort - 1;
-
-    for (l = n>>1; l > 0; --l) {
-        tmp = a[l];
-        for (i = l, j = l<<1; j <= n;) {
-            if (j < n && @TYPE@_LT(v[a[j]], v[a[j+1]])) {
-                j += 1;
-            }
-            if (@TYPE@_LT(v[tmp], v[a[j]])) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    for (; n > 1;) {
-        tmp = a[n];
-        a[n] = a[1];
-        n -= 1;
-        for (i = 1, j = 2; j <= n;) {
-            if (j < n && @TYPE@_LT(v[a[j]], v[a[j+1]])) {
-                j++;
-            }
-            if (@TYPE@_LT(v[tmp], v[a[j]])) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                             STRING SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = STRING, UNICODE#
- * #suff = string, unicode#
- * #type = npy_char, npy_ucs4#
- */
-
-NPY_NO_EXPORT int
-heapsort_@suff@(void *start, npy_intp n, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t len = PyArray_ITEMSIZE(arr)/sizeof(@type@);
-    @type@ *tmp = malloc(PyArray_ITEMSIZE(arr));
-    @type@ *a = (@type@ *)start - len;
-    npy_intp i, j, l;
-
-    if (tmp == NULL) {
-        return -NPY_ENOMEM;
-    }
-
-    for (l = n>>1; l > 0; --l) {
-        @TYPE@_COPY(tmp, a + l*len, len);
-        for (i = l, j = l<<1; j <= n;) {
-            if (j < n && @TYPE@_LT(a + j*len, a + (j+1)*len, len))
-                j += 1;
-            if (@TYPE@_LT(tmp, a + j*len, len)) {
-                @TYPE@_COPY(a + i*len, a + j*len, len);
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        @TYPE@_COPY(a + i*len, tmp, len);
-    }
-
-    for (; n > 1;) {
-        @TYPE@_COPY(tmp, a + n*len, len);
-        @TYPE@_COPY(a + n*len, a + len, len);
-        n -= 1;
-        for (i = 1, j = 2; j <= n;) {
-            if (j < n && @TYPE@_LT(a + j*len, a + (j+1)*len, len))
-                j++;
-            if (@TYPE@_LT(tmp, a + j*len, len)) {
-                @TYPE@_COPY(a + i*len, a + j*len, len);
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        @TYPE@_COPY(a + i*len, tmp, len);
-    }
-
-    free(tmp);
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-aheapsort_@suff@(void *vv, npy_intp *tosort, npy_intp n, void *varr)
-{
-    @type@ *v = vv;
-    PyArrayObject *arr = varr;
-    size_t len = PyArray_ITEMSIZE(arr)/sizeof(@type@);
-    npy_intp *a, i,j,l, tmp;
-
-    /* The array needs to be offset by one for heapsort indexing */
-    a = tosort - 1;
-
-    for (l = n>>1; l > 0; --l) {
-        tmp = a[l];
-        for (i = l, j = l<<1; j <= n;) {
-            if (j < n && @TYPE@_LT(v + a[j]*len, v + a[j+1]*len, len))
-                j += 1;
-            if (@TYPE@_LT(v + tmp*len, v + a[j]*len, len)) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    for (; n > 1;) {
-        tmp = a[n];
-        a[n] = a[1];
-        n -= 1;
-        for (i = 1, j = 2; j <= n;) {
-            if (j < n && @TYPE@_LT(v + a[j]*len, v + a[j+1]*len, len))
-                j++;
-            if (@TYPE@_LT(v + tmp*len, v + a[j]*len, len)) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                             GENERIC SORT                                **
- *****************************************************************************
- */
-
-
-NPY_NO_EXPORT int
-npy_heapsort(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    npy_intp elsize = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    char *tmp = malloc(elsize);
-    char *a = (char *)start - elsize;
-    npy_intp i, j, l;
-
-    if (tmp == NULL) {
-        return -NPY_ENOMEM;
-    }
-
-    for (l = num >> 1; l > 0; --l) {
-        GENERIC_COPY(tmp, a + l*elsize, elsize);
-        for (i = l, j = l << 1; j <= num;) {
-            if (j < num && cmp(a + j*elsize, a + (j+1)*elsize, arr) < 0) {
-                ++j;
-            }
-            if (cmp(tmp, a + j*elsize, arr) < 0) {
-                GENERIC_COPY(a + i*elsize, a + j*elsize, elsize);
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        GENERIC_COPY(a + i*elsize, tmp, elsize);
-    }
-
-    for (; num > 1;) {
-        GENERIC_COPY(tmp, a + num*elsize, elsize);
-        GENERIC_COPY(a + num*elsize, a + elsize, elsize);
-        num -= 1;
-        for (i = 1, j = 2; j <= num;) {
-            if (j < num && cmp(a + j*elsize, a + (j+1)*elsize, arr) < 0) {
-                ++j;
-            }
-            if (cmp(tmp, a + j*elsize, arr) < 0) {
-                GENERIC_COPY(a + i*elsize, a + j*elsize, elsize);
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        GENERIC_COPY(a + i*elsize, tmp, elsize);
-    }
-
-    free(tmp);
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-npy_aheapsort(void *vv, npy_intp *tosort, npy_intp n, void *varr)
-{
-    char *v = vv;
-    PyArrayObject *arr = varr;
-    npy_intp elsize = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    npy_intp *a, i, j, l, tmp;
-
-    /* The array needs to be offset by one for heapsort indexing */
-    a = tosort - 1;
-
-    for (l = n >> 1; l > 0; --l) {
-        tmp = a[l];
-        for (i = l, j = l<<1; j <= n;) {
-            if (j < n && cmp(v + a[j]*elsize, v + a[j+1]*elsize, arr) < 0) {
-                ++j;
-            }
-            if (cmp(v + tmp*elsize, v + a[j]*elsize, arr) < 0) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    for (; n > 1;) {
-        tmp = a[n];
-        a[n] = a[1];
-        n -= 1;
-        for (i = 1, j = 2; j <= n;) {
-            if (j < n && cmp(v + a[j]*elsize, v + a[j+1]*elsize, arr) < 0) {
-                ++j;
-            }
-            if (cmp(v + tmp*elsize, v + a[j]*elsize, arr) < 0) {
-                a[i] = a[j];
-                i = j;
-                j += j;
-            }
-            else {
-                break;
-            }
-        }
-        a[i] = tmp;
-    }
-
-    return 0;
-}
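
The aheapsort_* routines in the file removed above sort indirectly: the data in `v` is never moved, and only the index array `tosort` is permuted so that it lists the elements in ascending order. A minimal sketch of that idea follows, assuming plain `double` keys compared with `<`. The name `aheapsort_sketch` is hypothetical and not part of NumPy, `std::ptrdiff_t` stands in for `npy_intp`, and the sketch indexes with an explicit `k - 1` instead of the offset-pointer trick used in the source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration, not NumPy code: indirect (arg) heapsort.
// v is read-only; tosort is permuted so that v[tosort[0]] <= v[tosort[1]] ...
int aheapsort_sketch(const double *v, std::ptrdiff_t *tosort, std::ptrdiff_t n)
{
    assert(n >= 0);
    // 1-based view of the index array: the children of node k are 2k and 2k+1.
    auto at = [tosort](std::ptrdiff_t k) -> std::ptrdiff_t & {
        return tosort[k - 1];
    };
    // Sink index tmp from node i until neither child refers to a larger key.
    auto sift_down = [&](std::ptrdiff_t i, std::ptrdiff_t len,
                         std::ptrdiff_t tmp) {
        for (std::ptrdiff_t j = 2 * i; j <= len; j = 2 * j) {
            if (j < len && v[at(j)] < v[at(j + 1)]) {
                ++j;  // choose the child with the larger key
            }
            if (!(v[tmp] < v[at(j)])) {
                break;
            }
            at(i) = at(j);
            i = j;
        }
        at(i) = tmp;
    };

    // Phase 1: build a max-heap over the indices.
    for (std::ptrdiff_t l = n / 2; l > 0; --l) {
        sift_down(l, n, at(l));
    }
    // Phase 2: move the current maximum to the end and shrink the heap.
    for (; n > 1;) {
        std::ptrdiff_t tmp = at(n);
        at(n) = at(1);
        n -= 1;
        sift_down(1, n, tmp);
    }
    return 0;
}
```

The caller is expected to fill `tosort` with 0, 1, ..., n-1 beforehand. Because heap sort is not stable, the relative order of indices that refer to equal keys is not guaranteed.
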
diff --git a/numpy/core/src/npysort/heapsort.cpp b/numpy/core/src/npysort/heapsort.cpp
new file mode 100644
index 0000000..1f31ed2
--- /dev/null
+++ b/numpy/core/src/npysort/heapsort.cpp
@@ -0,0 +1,756 @@
+/* -*- c++ -*- */
+
+/*
+ * The purpose of this module is to add faster sort functions
+ * that are type-specific.  This is done by altering the
+ * function table for the builtin descriptors.
+ *
+ * These sorting functions are copied almost directly from numarray
+ * with a few modifications (complex comparisons compare the imaginary
+ * part if the real parts are equal, for example), and the names
+ * are changed.
+ *
+ * The original sorting code is due to Charles R. Harris who wrote
+ * it for numarray.
+ */
+
+/*
+ * Quick sort is usually the fastest, but the worst case scenario can
+ * be slower than the merge and heap sorts.  The merge sort requires
+ * extra memory and so for large arrays may not be useful.
+ *
+ * The merge sort is *stable*, meaning that equal elements keep
+ * their original relative order, so it can be used to
+ * implement lexicographic sorting on multiple keys.
+ *
+ * The heap sort is included for completeness.
+ */
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "npy_sort.h"
+#include "npysort_common.h"
+#include "numpy_tag.h"
+
+#include <cstdlib>
+
+#define NOT_USED NPY_UNUSED(unused)
+#define PYA_QS_STACK 100
+#define SMALL_QUICKSORT 15
+#define SMALL_MERGESORT 20
+#define SMALL_STRING 16
+
+/*
+ *****************************************************************************
+ **                            NUMERIC SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+heapsort_(type *start, npy_intp n)
+{
+    type tmp, *a;
+    npy_intp i, j, l;
+
+    /* The array needs to be offset by one for heapsort indexing */
+    a = start - 1;
+
+    for (l = n >> 1; l > 0; --l) {
+        tmp = a[l];
+        for (i = l, j = l << 1; j <= n;) {
+            if (j < n && Tag::less(a[j], a[j + 1])) {
+                j += 1;
+            }
+            if (Tag::less(tmp, a[j])) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    for (; n > 1;) {
+        tmp = a[n];
+        a[n] = a[1];
+        n -= 1;
+        for (i = 1, j = 2; j <= n;) {
+            if (j < n && Tag::less(a[j], a[j + 1])) {
+                j++;
+            }
+            if (Tag::less(tmp, a[j])) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    return 0;
+}
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+aheapsort_(type *vv, npy_intp *tosort, npy_intp n)
+{
+    type *v = vv;
+    npy_intp *a, i, j, l, tmp;
+    /* The arrays need to be offset by one for heapsort indexing */
+    a = tosort - 1;
+
+    for (l = n >> 1; l > 0; --l) {
+        tmp = a[l];
+        for (i = l, j = l << 1; j <= n;) {
+            if (j < n && Tag::less(v[a[j]], v[a[j + 1]])) {
+                j += 1;
+            }
+            if (Tag::less(v[tmp], v[a[j]])) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    for (; n > 1;) {
+        tmp = a[n];
+        a[n] = a[1];
+        n -= 1;
+        for (i = 1, j = 2; j <= n;) {
+            if (j < n && Tag::less(v[a[j]], v[a[j + 1]])) {
+                j++;
+            }
+            if (Tag::less(v[tmp], v[a[j]])) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             STRING SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+string_heapsort_(type *start, npy_intp n, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    size_t len = PyArray_ITEMSIZE(arr) / sizeof(type);
+    type *tmp = (type *)malloc(PyArray_ITEMSIZE(arr));
+    type *a = (type *)start - len;
+    npy_intp i, j, l;
+
+    if (tmp == NULL) {
+        return -NPY_ENOMEM;
+    }
+
+    for (l = n >> 1; l > 0; --l) {
+        Tag::copy(tmp, a + l * len, len);
+        for (i = l, j = l << 1; j <= n;) {
+            if (j < n && Tag::less(a + j * len, a + (j + 1) * len, len))
+                j += 1;
+            if (Tag::less(tmp, a + j * len, len)) {
+                Tag::copy(a + i * len, a + j * len, len);
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        Tag::copy(a + i * len, tmp, len);
+    }
+
+    for (; n > 1;) {
+        Tag::copy(tmp, a + n * len, len);
+        Tag::copy(a + n * len, a + len, len);
+        n -= 1;
+        for (i = 1, j = 2; j <= n;) {
+            if (j < n && Tag::less(a + j * len, a + (j + 1) * len, len))
+                j++;
+            if (Tag::less(tmp, a + j * len, len)) {
+                Tag::copy(a + i * len, a + j * len, len);
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        Tag::copy(a + i * len, tmp, len);
+    }
+
+    free(tmp);
+    return 0;
+}
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+string_aheapsort_(type *vv, npy_intp *tosort, npy_intp n, void *varr)
+{
+    type *v = vv;
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    size_t len = PyArray_ITEMSIZE(arr) / sizeof(type);
+    npy_intp *a, i, j, l, tmp;
+
+    /* The array needs to be offset by one for heapsort indexing */
+    a = tosort - 1;
+
+    for (l = n >> 1; l > 0; --l) {
+        tmp = a[l];
+        for (i = l, j = l << 1; j <= n;) {
+            if (j < n && Tag::less(v + a[j] * len, v + a[j + 1] * len, len))
+                j += 1;
+            if (Tag::less(v + tmp * len, v + a[j] * len, len)) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    for (; n > 1;) {
+        tmp = a[n];
+        a[n] = a[1];
+        n -= 1;
+        for (i = 1, j = 2; j <= n;) {
+            if (j < n && Tag::less(v + a[j] * len, v + a[j + 1] * len, len))
+                j++;
+            if (Tag::less(v + tmp * len, v + a[j] * len, len)) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    return 0;
+}
+
+/**end repeat**/
+
+/*
+ *****************************************************************************
+ **                             GENERIC SORT                                **
+ *****************************************************************************
+ */
+
+NPY_NO_EXPORT int
+npy_heapsort(void *start, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    npy_intp elsize = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    char *tmp = (char *)malloc(elsize);
+    char *a = (char *)start - elsize;
+    npy_intp i, j, l;
+
+    if (tmp == NULL) {
+        return -NPY_ENOMEM;
+    }
+
+    for (l = num >> 1; l > 0; --l) {
+        GENERIC_COPY(tmp, a + l * elsize, elsize);
+        for (i = l, j = l << 1; j <= num;) {
+            if (j < num &&
+                cmp(a + j * elsize, a + (j + 1) * elsize, arr) < 0) {
+                ++j;
+            }
+            if (cmp(tmp, a + j * elsize, arr) < 0) {
+                GENERIC_COPY(a + i * elsize, a + j * elsize, elsize);
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        GENERIC_COPY(a + i * elsize, tmp, elsize);
+    }
+
+    for (; num > 1;) {
+        GENERIC_COPY(tmp, a + num * elsize, elsize);
+        GENERIC_COPY(a + num * elsize, a + elsize, elsize);
+        num -= 1;
+        for (i = 1, j = 2; j <= num;) {
+            if (j < num &&
+                cmp(a + j * elsize, a + (j + 1) * elsize, arr) < 0) {
+                ++j;
+            }
+            if (cmp(tmp, a + j * elsize, arr) < 0) {
+                GENERIC_COPY(a + i * elsize, a + j * elsize, elsize);
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        GENERIC_COPY(a + i * elsize, tmp, elsize);
+    }
+
+    free(tmp);
+    return 0;
+}
+
+NPY_NO_EXPORT int
+npy_aheapsort(void *vv, npy_intp *tosort, npy_intp n, void *varr)
+{
+    char *v = (char *)vv;
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    npy_intp elsize = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    npy_intp *a, i, j, l, tmp;
+
+    /* The array needs to be offset by one for heapsort indexing */
+    a = tosort - 1;
+
+    for (l = n >> 1; l > 0; --l) {
+        tmp = a[l];
+        for (i = l, j = l << 1; j <= n;) {
+            if (j < n &&
+                cmp(v + a[j] * elsize, v + a[j + 1] * elsize, arr) < 0) {
+                ++j;
+            }
+            if (cmp(v + tmp * elsize, v + a[j] * elsize, arr) < 0) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    for (; n > 1;) {
+        tmp = a[n];
+        a[n] = a[1];
+        n -= 1;
+        for (i = 1, j = 2; j <= n;) {
+            if (j < n &&
+                cmp(v + a[j] * elsize, v + a[j + 1] * elsize, arr) < 0) {
+                ++j;
+            }
+            if (cmp(v + tmp * elsize, v + a[j] * elsize, arr) < 0) {
+                a[i] = a[j];
+                i = j;
+                j += j;
+            }
+            else {
+                break;
+            }
+        }
+        a[i] = tmp;
+    }
+
+    return 0;
+}
+
+/***************************************
+ * C > C++ dispatch
+ ***************************************/
+template NPY_NO_EXPORT int
+heapsort_<npy::bool_tag, npy_bool>(npy_bool *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_bool(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::bool_tag>((npy_bool *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::byte_tag, npy_byte>(npy_byte *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_byte(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::byte_tag>((npy_byte *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::ubyte_tag, npy_ubyte>(npy_ubyte *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_ubyte(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::ubyte_tag>((npy_ubyte *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::short_tag, npy_short>(npy_short *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_short(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::short_tag>((npy_short *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::ushort_tag, npy_ushort>(npy_ushort *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_ushort(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::ushort_tag>((npy_ushort *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::int_tag, npy_int>(npy_int *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_int(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::int_tag>((npy_int *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::uint_tag, npy_uint>(npy_uint *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_uint(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::uint_tag>((npy_uint *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::long_tag, npy_long>(npy_long *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_long(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::long_tag>((npy_long *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::ulong_tag, npy_ulong>(npy_ulong *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_ulong(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::ulong_tag>((npy_ulong *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::longlong_tag, npy_longlong>(npy_longlong *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_longlong(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::longlong_tag>((npy_longlong *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::ulonglong_tag, npy_ulonglong>(npy_ulonglong *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_ulonglong(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::ulonglong_tag>((npy_ulonglong *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::half_tag, npy_half>(npy_half *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_half(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::half_tag>((npy_half *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::float_tag, npy_float>(npy_float *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_float(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::float_tag>((npy_float *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::double_tag, npy_double>(npy_double *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_double(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::double_tag>((npy_double *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::longdouble_tag, npy_longdouble>(npy_longdouble *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_longdouble(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::longdouble_tag>((npy_longdouble *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::cfloat_tag, npy_cfloat>(npy_cfloat *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_cfloat(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::cfloat_tag>((npy_cfloat *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::cdouble_tag, npy_cdouble>(npy_cdouble *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_cdouble(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::cdouble_tag>((npy_cdouble *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::clongdouble_tag, npy_clongdouble>(npy_clongdouble *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_clongdouble(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::clongdouble_tag>((npy_clongdouble *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::datetime_tag, npy_datetime>(npy_datetime *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_datetime(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::datetime_tag>((npy_datetime *)start, n);
+}
+
+template NPY_NO_EXPORT int
+heapsort_<npy::timedelta_tag, npy_timedelta>(npy_timedelta *, npy_intp);
+NPY_NO_EXPORT int
+heapsort_timedelta(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return heapsort_<npy::timedelta_tag>((npy_timedelta *)start, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::bool_tag, npy_bool>(npy_bool *vv, npy_intp *tosort,
+                                    npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_bool(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::bool_tag>((npy_bool *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::byte_tag, npy_byte>(npy_byte *vv, npy_intp *tosort,
+                                    npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_byte(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::byte_tag>((npy_byte *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::ubyte_tag, npy_ubyte>(npy_ubyte *vv, npy_intp *tosort,
+                                      npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_ubyte(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::ubyte_tag>((npy_ubyte *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::short_tag, npy_short>(npy_short *vv, npy_intp *tosort,
+                                      npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_short(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::short_tag>((npy_short *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::ushort_tag, npy_ushort>(npy_ushort *vv, npy_intp *tosort,
+                                        npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_ushort(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::ushort_tag>((npy_ushort *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::int_tag, npy_int>(npy_int *vv, npy_intp *tosort, npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_int(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::int_tag>((npy_int *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::uint_tag, npy_uint>(npy_uint *vv, npy_intp *tosort,
+                                    npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_uint(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::uint_tag>((npy_uint *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::long_tag, npy_long>(npy_long *vv, npy_intp *tosort,
+                                    npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_long(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::long_tag>((npy_long *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::ulong_tag, npy_ulong>(npy_ulong *vv, npy_intp *tosort,
+                                      npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_ulong(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::ulong_tag>((npy_ulong *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::longlong_tag, npy_longlong>(npy_longlong *vv, npy_intp *tosort,
+                                            npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_longlong(void *vv, npy_intp *tosort, npy_intp n,
+                   void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::longlong_tag>((npy_longlong *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::ulonglong_tag, npy_ulonglong>(npy_ulonglong *vv,
+                                              npy_intp *tosort, npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_ulonglong(void *vv, npy_intp *tosort, npy_intp n,
+                    void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::ulonglong_tag>((npy_ulonglong *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::half_tag, npy_half>(npy_half *vv, npy_intp *tosort,
+                                    npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_half(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::half_tag>((npy_half *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::float_tag, npy_float>(npy_float *vv, npy_intp *tosort,
+                                      npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_float(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::float_tag>((npy_float *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::double_tag, npy_double>(npy_double *vv, npy_intp *tosort,
+                                        npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_double(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::double_tag>((npy_double *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::longdouble_tag, npy_longdouble>(npy_longdouble *vv,
+                                                npy_intp *tosort, npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_longdouble(void *vv, npy_intp *tosort, npy_intp n,
+                     void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::longdouble_tag>((npy_longdouble *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::cfloat_tag, npy_cfloat>(npy_cfloat *vv, npy_intp *tosort,
+                                        npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_cfloat(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::cfloat_tag>((npy_cfloat *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::cdouble_tag, npy_cdouble>(npy_cdouble *vv, npy_intp *tosort,
+                                          npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_cdouble(void *vv, npy_intp *tosort, npy_intp n,
+                  void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::cdouble_tag>((npy_cdouble *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::clongdouble_tag, npy_clongdouble>(npy_clongdouble *vv,
+                                                  npy_intp *tosort,
+                                                  npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_clongdouble(void *vv, npy_intp *tosort, npy_intp n,
+                      void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::clongdouble_tag>((npy_clongdouble *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::datetime_tag, npy_datetime>(npy_datetime *vv, npy_intp *tosort,
+                                            npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_datetime(void *vv, npy_intp *tosort, npy_intp n,
+                   void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::datetime_tag>((npy_datetime *)vv, tosort, n);
+}
+
+template NPY_NO_EXPORT int
+aheapsort_<npy::timedelta_tag, npy_timedelta>(npy_timedelta *vv,
+                                              npy_intp *tosort, npy_intp n);
+NPY_NO_EXPORT int
+aheapsort_timedelta(void *vv, npy_intp *tosort, npy_intp n,
+                    void *NPY_UNUSED(varr))
+{
+    return aheapsort_<npy::timedelta_tag>((npy_timedelta *)vv, tosort, n);
+}
+
+NPY_NO_EXPORT int
+heapsort_string(void *start, npy_intp n, void *varr)
+{
+    return string_heapsort_<npy::string_tag>((npy_char *)start, n, varr);
+}
+NPY_NO_EXPORT int
+heapsort_unicode(void *start, npy_intp n, void *varr)
+{
+    return string_heapsort_<npy::unicode_tag>((npy_ucs4 *)start, n, varr);
+}
+
+NPY_NO_EXPORT int
+aheapsort_string(void *vv, npy_intp *tosort, npy_intp n, void *varr)
+{
+    return string_aheapsort_<npy::string_tag>((npy_char *)vv, tosort, n, varr);
+}
+NPY_NO_EXPORT int
+aheapsort_unicode(void *vv, npy_intp *tosort, npy_intp n, void *varr)
+{
+    return string_aheapsort_<npy::unicode_tag>((npy_ucs4 *)vv, tosort, n,
+                                               varr);
+}
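
The new heapsort.cpp replaces the `@TYPE@` template expansion with C++ function templates parameterised on a tag type that supplies `less`, plus thin wrappers that keep the `void *` C signatures. The following is a rough, self-contained sketch of that pattern, not the NumPy implementation: `int_tag`, `heapsort_sketch`, and `heapsort_int_sketch` are hypothetical stand-ins (the real tags come from numpy_tag.h, which is not shown here), and `std::ptrdiff_t` stands in for `npy_intp`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tag from numpy_tag.h: it names the element
// type and supplies the ordering used by the sort.
struct int_tag {
    using type = int;
    static bool less(int a, int b) { return a < b; }
};

// Generic in-place heapsort; every comparison goes through Tag::less.
template <typename Tag, typename type>
int heapsort_sketch(type *start, std::ptrdiff_t n)
{
    assert(n >= 0);
    // 1-based view of the array: the children of node k are 2k and 2k+1.
    auto at = [start](std::ptrdiff_t k) -> type & { return start[k - 1]; };
    // Sink tmp from node i until neither child is larger.
    auto sift_down = [&](std::ptrdiff_t i, std::ptrdiff_t len, type tmp) {
        for (std::ptrdiff_t j = 2 * i; j <= len; j = 2 * j) {
            if (j < len && Tag::less(at(j), at(j + 1))) {
                ++j;  // choose the larger child
            }
            if (!Tag::less(tmp, at(j))) {
                break;
            }
            at(i) = at(j);
            i = j;
        }
        at(i) = tmp;
    };

    // Phase 1: build a max-heap.
    for (std::ptrdiff_t l = n / 2; l > 0; --l) {
        sift_down(l, n, at(l));
    }
    // Phase 2: move the current maximum to the end and shrink the heap.
    for (; n > 1;) {
        type tmp = at(n);
        at(n) = at(1);
        n -= 1;
        sift_down(1, n, tmp);
    }
    return 0;
}

// C-style entry point: recovers the element type from void * and forwards
// to the template, in the spirit of the heapsort_int wrapper in the diff.
int heapsort_int_sketch(void *start, std::ptrdiff_t n, void * /*unused*/)
{
    return heapsort_sketch<int_tag>(static_cast<int *>(start), n);
}
```

Only `Tag` is spelled out at the call site; the element type is deduced from the pointer argument, which is why the wrappers in the diff cast `start` before forwarding.
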
diff --git a/numpy/core/src/npysort/mergesort.c.src b/numpy/core/src/npysort/mergesort.c.src
deleted file mode 100644
index f83fbf7..0000000
--- a/numpy/core/src/npysort/mergesort.c.src
+++ /dev/null
@@ -1,511 +0,0 @@
-/* -*- c -*- */
-
-/*
- * The purpose of this module is to add faster sort functions
- * that are type-specific.  This is done by altering the
- * function table for the builtin descriptors.
- *
- * These sorting functions are copied almost directly from numarray
- * with a few modifications (complex comparisons compare the imaginary
- * part if the real parts are equal, for example), and the names
- * are changed.
- *
- * The original sorting code is due to Charles R. Harris who wrote
- * it for numarray.
- */
-
-/*
- * Quick sort is usually the fastest, but the worst case scenario can
- * be slower than the merge and heap sorts.  The merge sort requires
- * extra memory and so for large arrays may not be useful.
- *
- * The merge sort is *stable*, meaning that equal components
- * are unmoved from their entry versions, so it can be used to
- * implement lexigraphic sorting on multiple keys.
- *
- * The heap sort is included for completeness.
- */
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "npy_sort.h"
-#include "npysort_common.h"
-#include <stdlib.h>
-
-#define NOT_USED NPY_UNUSED(unused)
-#define PYA_QS_STACK 100
-#define SMALL_QUICKSORT 15
-#define SMALL_MERGESORT 20
-#define SMALL_STRING 16
-
-
-/*
- *****************************************************************************
- **                            NUMERIC SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE, DATETIME, TIMEDELTA#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble, datetime, timedelta#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble, npy_datetime, npy_timedelta#
- */
-
-static void
-mergesort0_@suff@(@type@ *pl, @type@ *pr, @type@ *pw)
-{
-    @type@ vp, *pi, *pj, *pk, *pm;
-
-    if (pr - pl > SMALL_MERGESORT) {
-        /* merge sort */
-        pm = pl + ((pr - pl) >> 1);
-        mergesort0_@suff@(pl, pm, pw);
-        mergesort0_@suff@(pm, pr, pw);
-        for (pi = pw, pj = pl; pj < pm;) {
-            *pi++ = *pj++;
-        }
-        pi = pw + (pm - pl);
-        pj = pw;
-        pk = pl;
-        while (pj < pi && pm < pr) {
-            if (@TYPE@_LT(*pm, *pj)) {
-                *pk++ = *pm++;
-            }
-            else {
-                *pk++ = *pj++;
-            }
-        }
-        while(pj < pi) {
-            *pk++ = *pj++;
-        }
-    }
-    else {
-        /* insertion sort */
-        for (pi = pl + 1; pi < pr; ++pi) {
-            vp = *pi;
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && @TYPE@_LT(vp, *pk)) {
-                *pj-- = *pk--;
-            }
-            *pj = vp;
-        }
-    }
-}
-
-
-NPY_NO_EXPORT int
-mergesort_@suff@(void *start, npy_intp num, void *NOT_USED)
-{
-    @type@ *pl, *pr, *pw;
-
-    pl = start;
-    pr = pl + num;
-    pw = malloc((num/2) * sizeof(@type@));
-    if (pw == NULL) {
-        return -NPY_ENOMEM;
-    }
-    mergesort0_@suff@(pl, pr, pw);
-
-    free(pw);
-    return 0;
-}
-
-
-static void
-amergesort0_@suff@(npy_intp *pl, npy_intp *pr, @type@ *v, npy_intp *pw)
-{
-    @type@ vp;
-    npy_intp vi, *pi, *pj, *pk, *pm;
-
-    if (pr - pl > SMALL_MERGESORT) {
-        /* merge sort */
-        pm = pl + ((pr - pl) >> 1);
-        amergesort0_@suff@(pl, pm, v, pw);
-        amergesort0_@suff@(pm, pr, v, pw);
-        for (pi = pw, pj = pl; pj < pm;) {
-            *pi++ = *pj++;
-        }
-        pi = pw + (pm - pl);
-        pj = pw;
-        pk = pl;
-        while (pj < pi && pm < pr) {
-            if (@TYPE@_LT(v[*pm], v[*pj])) {
-                *pk++ = *pm++;
-            }
-            else {
-                *pk++ = *pj++;
-            }
-        }
-        while(pj < pi) {
-            *pk++ = *pj++;
-        }
-    }
-    else {
-        /* insertion sort */
-        for (pi = pl + 1; pi < pr; ++pi) {
-            vi = *pi;
-            vp = v[vi];
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && @TYPE@_LT(vp, v[*pk])) {
-                *pj-- = *pk--;
-            }
-            *pj = vi;
-        }
-    }
-}
-
-
-NPY_NO_EXPORT int
-amergesort_@suff@(void *v, npy_intp *tosort, npy_intp num, void *NOT_USED)
-{
-    npy_intp *pl, *pr, *pw;
-
-    pl = tosort;
-    pr = pl + num;
-    pw = malloc((num/2) * sizeof(npy_intp));
-    if (pw == NULL) {
-        return -NPY_ENOMEM;
-    }
-    amergesort0_@suff@(pl, pr, v, pw);
-    free(pw);
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                             STRING SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = STRING, UNICODE#
- * #suff = string, unicode#
- * #type = npy_char, npy_ucs4#
- */
-
-static void
-mergesort0_@suff@(@type@ *pl, @type@ *pr, @type@ *pw, @type@ *vp, size_t len)
-{
-    @type@ *pi, *pj, *pk, *pm;
-
-    if ((size_t)(pr - pl) > SMALL_MERGESORT*len) {
-        /* merge sort */
-        pm = pl + (((pr - pl)/len) >> 1)*len;
-        mergesort0_@suff@(pl, pm, pw, vp, len);
-        mergesort0_@suff@(pm, pr, pw, vp, len);
-        @TYPE@_COPY(pw, pl, pm - pl);
-        pi = pw + (pm - pl);
-        pj = pw;
-        pk = pl;
-        while (pj < pi && pm < pr) {
-            if (@TYPE@_LT(pm, pj, len)) {
-                @TYPE@_COPY(pk, pm, len);
-                pm += len;
-                pk += len;
-            }
-            else {
-                @TYPE@_COPY(pk, pj, len);
-                pj += len;
-                pk += len;
-            }
-        }
-        @TYPE@_COPY(pk, pj, pi - pj);
-    }
-    else {
-        /* insertion sort */
-        for (pi = pl + len; pi < pr; pi += len) {
-            @TYPE@_COPY(vp, pi, len);
-            pj = pi;
-            pk = pi - len;
-            while (pj > pl && @TYPE@_LT(vp, pk, len)) {
-                @TYPE@_COPY(pj, pk, len);
-                pj -= len;
-                pk -= len;
-            }
-            @TYPE@_COPY(pj, vp, len);
-        }
-    }
-}
-
-
-NPY_NO_EXPORT int
-mergesort_@suff@(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t elsize = PyArray_ITEMSIZE(arr);
-    size_t len = elsize / sizeof(@type@);
-    @type@ *pl, *pr, *pw, *vp;
-    int err = 0;
-
-    /* Items that have zero size don't make sense to sort */
-    if (elsize == 0) {
-        return 0;
-    }
-
-    pl = start;
-    pr = pl + num*len;
-    pw = malloc((num/2) * elsize);
-    if (pw == NULL) {
-        err = -NPY_ENOMEM;
-        goto fail_0;
-    }
-    vp = malloc(elsize);
-    if (vp == NULL) {
-        err = -NPY_ENOMEM;
-        goto fail_1;
-    }
-    mergesort0_@suff@(pl, pr, pw, vp, len);
-
-    free(vp);
-fail_1:
-    free(pw);
-fail_0:
-    return err;
-}
-
-
-static void
-amergesort0_@suff@(npy_intp *pl, npy_intp *pr, @type@ *v, npy_intp *pw, size_t len)
-{
-    @type@ *vp;
-    npy_intp vi, *pi, *pj, *pk, *pm;
-
-    if (pr - pl > SMALL_MERGESORT) {
-        /* merge sort */
-        pm = pl + ((pr - pl) >> 1);
-        amergesort0_@suff@(pl, pm, v, pw, len);
-        amergesort0_@suff@(pm, pr, v, pw, len);
-        for (pi = pw, pj = pl; pj < pm;) {
-            *pi++ = *pj++;
-        }
-        pi = pw + (pm - pl);
-        pj = pw;
-        pk = pl;
-        while (pj < pi && pm < pr) {
-            if (@TYPE@_LT(v + (*pm)*len, v + (*pj)*len, len)) {
-                *pk++ = *pm++;
-            }
-            else {
-                *pk++ = *pj++;
-            }
-        }
-        while (pj < pi) {
-            *pk++ = *pj++;
-        }
-    }
-    else {
-        /* insertion sort */
-        for (pi = pl + 1; pi < pr; ++pi) {
-            vi = *pi;
-            vp = v + vi*len;
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && @TYPE@_LT(vp, v + (*pk)*len, len)) {
-                *pj-- = *pk--;
-            }
-            *pj = vi;
-        }
-    }
-}
-
-
-NPY_NO_EXPORT int
-amergesort_@suff@(void *v, npy_intp *tosort, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t elsize = PyArray_ITEMSIZE(arr);
-    size_t len = elsize / sizeof(@type@);
-    npy_intp *pl, *pr, *pw;
-
-    /* Items that have zero size don't make sense to sort */
-    if (elsize == 0) {
-        return 0;
-    }
-
-    pl = tosort;
-    pr = pl + num;
-    pw = malloc((num/2) * sizeof(npy_intp));
-    if (pw == NULL) {
-        return -NPY_ENOMEM;
-    }
-    amergesort0_@suff@(pl, pr, v, pw, len);
-    free(pw);
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                             GENERIC SORT                                **
- *****************************************************************************
- */
-
-
-static void
-npy_mergesort0(char *pl, char *pr, char *pw, char *vp, npy_intp elsize,
-               PyArray_CompareFunc *cmp, PyArrayObject *arr)
-{
-    char *pi, *pj, *pk, *pm;
-
-    if (pr - pl > SMALL_MERGESORT*elsize) {
-        /* merge sort */
-        pm = pl + (((pr - pl)/elsize) >> 1)*elsize;
-        npy_mergesort0(pl, pm, pw, vp, elsize, cmp, arr);
-        npy_mergesort0(pm, pr, pw, vp, elsize, cmp, arr);
-        GENERIC_COPY(pw, pl, pm - pl);
-        pi = pw + (pm - pl);
-        pj = pw;
-        pk = pl;
-        while (pj < pi && pm < pr) {
-            if (cmp(pm, pj, arr) < 0) {
-                GENERIC_COPY(pk, pm, elsize);
-                pm += elsize;
-                pk += elsize;
-            }
-            else {
-                GENERIC_COPY(pk, pj, elsize);
-                pj += elsize;
-                pk += elsize;
-            }
-        }
-        GENERIC_COPY(pk, pj, pi - pj);
-    }
-    else {
-        /* insertion sort */
-        for (pi = pl + elsize; pi < pr; pi += elsize) {
-            GENERIC_COPY(vp, pi, elsize);
-            pj = pi;
-            pk = pi - elsize;
-            while (pj > pl && cmp(vp, pk, arr) < 0) {
-                GENERIC_COPY(pj, pk, elsize);
-                pj -= elsize;
-                pk -= elsize;
-            }
-            GENERIC_COPY(pj, vp, elsize);
-        }
-    }
-}
-
-
-NPY_NO_EXPORT int
-npy_mergesort(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    npy_intp elsize = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    char *pl = start;
-    char *pr = pl + num*elsize;
-    char *pw;
-    char *vp;
-    int err = -NPY_ENOMEM;
-
-    /* Items that have zero size don't make sense to sort */
-    if (elsize == 0) {
-        return 0;
-    }
-
-    pw = malloc((num >> 1) *elsize);
-    vp = malloc(elsize);
-
-    if (pw != NULL && vp != NULL) {
-        npy_mergesort0(pl, pr, pw, vp, elsize, cmp, arr);
-        err = 0;
-    }
-
-    free(vp);
-    free(pw);
-
-    return err;
-}
-
-
-static void
-npy_amergesort0(npy_intp *pl, npy_intp *pr, char *v, npy_intp *pw,
-                npy_intp elsize, PyArray_CompareFunc *cmp, PyArrayObject *arr)
-{
-    char *vp;
-    npy_intp vi, *pi, *pj, *pk, *pm;
-
-    if (pr - pl > SMALL_MERGESORT) {
-        /* merge sort */
-        pm = pl + ((pr - pl) >> 1);
-        npy_amergesort0(pl, pm, v, pw, elsize, cmp, arr);
-        npy_amergesort0(pm, pr, v, pw, elsize, cmp, arr);
-        for (pi = pw, pj = pl; pj < pm;) {
-            *pi++ = *pj++;
-        }
-        pi = pw + (pm - pl);
-        pj = pw;
-        pk = pl;
-        while (pj < pi && pm < pr) {
-            if (cmp(v + (*pm)*elsize, v + (*pj)*elsize, arr) < 0) {
-                *pk++ = *pm++;
-            }
-            else {
-                *pk++ = *pj++;
-            }
-        }
-        while (pj < pi) {
-            *pk++ = *pj++;
-        }
-    }
-    else {
-        /* insertion sort */
-        for (pi = pl + 1; pi < pr; ++pi) {
-            vi = *pi;
-            vp = v + vi*elsize;
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && cmp(vp, v + (*pk)*elsize, arr) < 0) {
-                *pj-- = *pk--;
-            }
-            *pj = vi;
-        }
-    }
-}
-
-
-NPY_NO_EXPORT int
-npy_amergesort(void *v, npy_intp *tosort, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    npy_intp elsize = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    npy_intp *pl, *pr, *pw;
-
-    /* Items that have zero size don't make sense to sort */
-    if (elsize == 0) {
-        return 0;
-    }
-
-    pl = tosort;
-    pr = pl + num;
-    pw = malloc((num >> 1) * sizeof(npy_intp));
-    if (pw == NULL) {
-        return -NPY_ENOMEM;
-    }
-    npy_amergesort0(pl, pr, v, pw, elsize, cmp, arr);
-    free(pw);
-
-    return 0;
-}
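Note on the conversion: the deleted `.c.src` file generated one copy of each routine per type through `@TYPE@_LT`, while the replacement file expresses the same algorithm once as a C++ template that calls `Tag::less`. The sketch below illustrates the numeric merge sort in that style and why it is stable. It is an illustration under stated assumptions, not the NumPy code: `Record`, `key_tag`, `kSmall`, `merge_sort_sketch0` and `merge_sort_sketch` are hypothetical names, and the cutoff of 4 is chosen only so that a small input exercises the merge path (the diff uses `SMALL_MERGESORT`, 20).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element and tag: the tag compares keys only, so records with
// equal keys are distinguishable by id and stability can be observed.
struct Record {
    int key;
    int id;
};
struct key_tag {
    static bool less(const Record &a, const Record &b) { return a.key < b.key; }
};

constexpr std::ptrdiff_t kSmall = 4;  // illustrative cutoff, not NumPy's value

// Sorts [pl, pr) using pw as scratch space for the left half.
template <typename Tag, typename T>
static void
merge_sort_sketch0(T *pl, T *pr, T *pw)
{
    if (pr - pl > kSmall) {
        T *pm = pl + ((pr - pl) >> 1);
        merge_sort_sketch0<Tag>(pl, pm, pw);
        merge_sort_sketch0<Tag>(pm, pr, pw);
        // Copy the sorted left half into scratch; pi ends one past the copy.
        T *pi = pw;
        for (T *pj = pl; pj < pm;) {
            *pi++ = *pj++;
        }
        T *pj = pw;
        T *pk = pl;
        // Take from the right half only when it is strictly smaller, so
        // equal elements from the left half keep their earlier position.
        while (pj < pi && pm < pr) {
            if (Tag::less(*pm, *pj)) {
                *pk++ = *pm++;
            }
            else {
                *pk++ = *pj++;
            }
        }
        while (pj < pi) {
            *pk++ = *pj++;
        }
    }
    else {
        // Insertion sort for short runs; shifts only on strict less.
        for (T *pi = pl + 1; pi < pr; ++pi) {
            T vp = *pi;
            T *pj = pi;
            while (pj > pl && Tag::less(vp, *(pj - 1))) {
                *pj = *(pj - 1);
                --pj;
            }
            *pj = vp;
        }
    }
}

template <typename Tag, typename T>
void
merge_sort_sketch(std::vector<T> &v)
{
    if (v.size() < 2) {
        return;
    }
    // Scratch holds only the left half, hence size / 2 elements.
    std::vector<T> scratch(v.size() / 2);
    merge_sort_sketch0<Tag>(v.data(), v.data() + v.size(), scratch.data());
}
```

Because the left half never has more than `n / 2` elements, a scratch buffer of that size is sufficient, which matches the `num / 2` allocation in the diff.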
diff --git a/numpy/core/src/npysort/mergesort.cpp b/numpy/core/src/npysort/mergesort.cpp
new file mode 100644
index 0000000..f892dd1
--- /dev/null
+++ b/numpy/core/src/npysort/mergesort.cpp
@@ -0,0 +1,733 @@
+/* -*- c -*- */
+
+/*
+ * The purpose of this module is to add faster sort functions
+ * that are type-specific.  This is done by altering the
+ * function table for the builtin descriptors.
+ *
+ * These sorting functions are copied almost directly from numarray
+ * with a few modifications (complex comparisons compare the imaginary
+ * part if the real parts are equal, for example), and the names
+ * are changed.
+ *
+ * The original sorting code is due to Charles R. Harris who wrote
+ * it for numarray.
+ */
+
+/*
+ * Quick sort is usually the fastest, but the worst case scenario can
+ * be slower than the merge and heap sorts.  The merge sort requires
+ * extra memory and so for large arrays may not be useful.
+ *
+ * The merge sort is *stable*, meaning that equal elements
+ * keep their original relative order, so it can be used to
+ * implement lexicographic sorting on multiple keys.
+ *
+ * The heap sort is included for completeness.
+ */
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "npy_sort.h"
+#include "npysort_common.h"
+#include "numpy_tag.h"
+
+#include <cstdlib>
+
+#define NOT_USED NPY_UNUSED(unused)
+#define PYA_QS_STACK 100
+#define SMALL_QUICKSORT 15
+#define SMALL_MERGESORT 20
+#define SMALL_STRING 16
+
+/*
+ *****************************************************************************
+ **                            NUMERIC SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag, typename type>
+static void
+mergesort0_(type *pl, type *pr, type *pw)
+{
+    type vp, *pi, *pj, *pk, *pm;
+
+    if (pr - pl > SMALL_MERGESORT) {
+        /* merge sort */
+        pm = pl + ((pr - pl) >> 1);
+        mergesort0_<Tag>(pl, pm, pw);
+        mergesort0_<Tag>(pm, pr, pw);
+        for (pi = pw, pj = pl; pj < pm;) {
+            *pi++ = *pj++;
+        }
+        pi = pw + (pm - pl);
+        pj = pw;
+        pk = pl;
+        while (pj < pi && pm < pr) {
+            if (Tag::less(*pm, *pj)) {
+                *pk++ = *pm++;
+            }
+            else {
+                *pk++ = *pj++;
+            }
+        }
+        while (pj < pi) {
+            *pk++ = *pj++;
+        }
+    }
+    else {
+        /* insertion sort */
+        for (pi = pl + 1; pi < pr; ++pi) {
+            vp = *pi;
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && Tag::less(vp, *pk)) {
+                *pj-- = *pk--;
+            }
+            *pj = vp;
+        }
+    }
+}
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+mergesort_(type *start, npy_intp num)
+{
+    type *pl, *pr, *pw;
+
+    pl = start;
+    pr = pl + num;
+    pw = (type *)malloc((num / 2) * sizeof(type));
+    if (pw == NULL) {
+        return -NPY_ENOMEM;
+    }
+    mergesort0_<Tag>(pl, pr, pw);
+
+    free(pw);
+    return 0;
+}
+
+template <typename Tag, typename type>
+static void
+amergesort0_(npy_intp *pl, npy_intp *pr, type *v, npy_intp *pw)
+{
+    type vp;
+    npy_intp vi, *pi, *pj, *pk, *pm;
+
+    if (pr - pl > SMALL_MERGESORT) {
+        /* merge sort */
+        pm = pl + ((pr - pl) >> 1);
+        amergesort0_<Tag>(pl, pm, v, pw);
+        amergesort0_<Tag>(pm, pr, v, pw);
+        for (pi = pw, pj = pl; pj < pm;) {
+            *pi++ = *pj++;
+        }
+        pi = pw + (pm - pl);
+        pj = pw;
+        pk = pl;
+        while (pj < pi && pm < pr) {
+            if (Tag::less(v[*pm], v[*pj])) {
+                *pk++ = *pm++;
+            }
+            else {
+                *pk++ = *pj++;
+            }
+        }
+        while (pj < pi) {
+            *pk++ = *pj++;
+        }
+    }
+    else {
+        /* insertion sort */
+        for (pi = pl + 1; pi < pr; ++pi) {
+            vi = *pi;
+            vp = v[vi];
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && Tag::less(vp, v[*pk])) {
+                *pj-- = *pk--;
+            }
+            *pj = vi;
+        }
+    }
+}
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+amergesort_(type *v, npy_intp *tosort, npy_intp num)
+{
+    npy_intp *pl, *pr, *pw;
+
+    pl = tosort;
+    pr = pl + num;
+    pw = (npy_intp *)malloc((num / 2) * sizeof(npy_intp));
+    if (pw == NULL) {
+        return -NPY_ENOMEM;
+    }
+    amergesort0_<Tag>(pl, pr, v, pw);
+    free(pw);
+
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             STRING SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag, typename type>
+static void
+mergesort0_(type *pl, type *pr, type *pw, type *vp, size_t len)
+{
+    type *pi, *pj, *pk, *pm;
+
+    if ((size_t)(pr - pl) > SMALL_MERGESORT * len) {
+        /* merge sort */
+        pm = pl + (((pr - pl) / len) >> 1) * len;
+        mergesort0_<Tag>(pl, pm, pw, vp, len);
+        mergesort0_<Tag>(pm, pr, pw, vp, len);
+        Tag::copy(pw, pl, pm - pl);
+        pi = pw + (pm - pl);
+        pj = pw;
+        pk = pl;
+        while (pj < pi && pm < pr) {
+            if (Tag::less(pm, pj, len)) {
+                Tag::copy(pk, pm, len);
+                pm += len;
+                pk += len;
+            }
+            else {
+                Tag::copy(pk, pj, len);
+                pj += len;
+                pk += len;
+            }
+        }
+        Tag::copy(pk, pj, pi - pj);
+    }
+    else {
+        /* insertion sort */
+        for (pi = pl + len; pi < pr; pi += len) {
+            Tag::copy(vp, pi, len);
+            pj = pi;
+            pk = pi - len;
+            while (pj > pl && Tag::less(vp, pk, len)) {
+                Tag::copy(pj, pk, len);
+                pj -= len;
+                pk -= len;
+            }
+            Tag::copy(pj, vp, len);
+        }
+    }
+}
+
+template <typename Tag, typename type>
+static int
+string_mergesort_(type *start, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    size_t elsize = PyArray_ITEMSIZE(arr);
+    size_t len = elsize / sizeof(type);
+    type *pl, *pr, *pw, *vp;
+    int err = 0;
+
+    /* Items that have zero size don't make sense to sort */
+    if (elsize == 0) {
+        return 0;
+    }
+
+    pl = start;
+    pr = pl + num * len;
+    pw = (type *)malloc((num / 2) * elsize);
+    if (pw == NULL) {
+        err = -NPY_ENOMEM;
+        goto fail_0;
+    }
+    vp = (type *)malloc(elsize);
+    if (vp == NULL) {
+        err = -NPY_ENOMEM;
+        goto fail_1;
+    }
+    mergesort0_<Tag>(pl, pr, pw, vp, len);
+
+    free(vp);
+fail_1:
+    free(pw);
+fail_0:
+    return err;
+}
+
+template <typename Tag, typename type>
+static void
+amergesort0_(npy_intp *pl, npy_intp *pr, type *v, npy_intp *pw, size_t len)
+{
+    type *vp;
+    npy_intp vi, *pi, *pj, *pk, *pm;
+
+    if (pr - pl > SMALL_MERGESORT) {
+        /* merge sort */
+        pm = pl + ((pr - pl) >> 1);
+        amergesort0_<Tag>(pl, pm, v, pw, len);
+        amergesort0_<Tag>(pm, pr, v, pw, len);
+        for (pi = pw, pj = pl; pj < pm;) {
+            *pi++ = *pj++;
+        }
+        pi = pw + (pm - pl);
+        pj = pw;
+        pk = pl;
+        while (pj < pi && pm < pr) {
+            if (Tag::less(v + (*pm) * len, v + (*pj) * len, len)) {
+                *pk++ = *pm++;
+            }
+            else {
+                *pk++ = *pj++;
+            }
+        }
+        while (pj < pi) {
+            *pk++ = *pj++;
+        }
+    }
+    else {
+        /* insertion sort */
+        for (pi = pl + 1; pi < pr; ++pi) {
+            vi = *pi;
+            vp = v + vi * len;
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && Tag::less(vp, v + (*pk) * len, len)) {
+                *pj-- = *pk--;
+            }
+            *pj = vi;
+        }
+    }
+}
+
+template <typename Tag, typename type>
+static int
+string_amergesort_(type *v, npy_intp *tosort, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    size_t elsize = PyArray_ITEMSIZE(arr);
+    size_t len = elsize / sizeof(type);
+    npy_intp *pl, *pr, *pw;
+
+    /* Items that have zero size don't make sense to sort */
+    if (elsize == 0) {
+        return 0;
+    }
+
+    pl = tosort;
+    pr = pl + num;
+    pw = (npy_intp *)malloc((num / 2) * sizeof(npy_intp));
+    if (pw == NULL) {
+        return -NPY_ENOMEM;
+    }
+    amergesort0_<Tag>(pl, pr, v, pw, len);
+    free(pw);
+
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             GENERIC SORT                                **
+ *****************************************************************************
+ */
+
+static void
+npy_mergesort0(char *pl, char *pr, char *pw, char *vp, npy_intp elsize,
+               PyArray_CompareFunc *cmp, PyArrayObject *arr)
+{
+    char *pi, *pj, *pk, *pm;
+
+    if (pr - pl > SMALL_MERGESORT * elsize) {
+        /* merge sort */
+        pm = pl + (((pr - pl) / elsize) >> 1) * elsize;
+        npy_mergesort0(pl, pm, pw, vp, elsize, cmp, arr);
+        npy_mergesort0(pm, pr, pw, vp, elsize, cmp, arr);
+        GENERIC_COPY(pw, pl, pm - pl);
+        pi = pw + (pm - pl);
+        pj = pw;
+        pk = pl;
+        while (pj < pi && pm < pr) {
+            if (cmp(pm, pj, arr) < 0) {
+                GENERIC_COPY(pk, pm, elsize);
+                pm += elsize;
+                pk += elsize;
+            }
+            else {
+                GENERIC_COPY(pk, pj, elsize);
+                pj += elsize;
+                pk += elsize;
+            }
+        }
+        GENERIC_COPY(pk, pj, pi - pj);
+    }
+    else {
+        /* insertion sort */
+        for (pi = pl + elsize; pi < pr; pi += elsize) {
+            GENERIC_COPY(vp, pi, elsize);
+            pj = pi;
+            pk = pi - elsize;
+            while (pj > pl && cmp(vp, pk, arr) < 0) {
+                GENERIC_COPY(pj, pk, elsize);
+                pj -= elsize;
+                pk -= elsize;
+            }
+            GENERIC_COPY(pj, vp, elsize);
+        }
+    }
+}
+
+NPY_NO_EXPORT int
+npy_mergesort(void *start, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    npy_intp elsize = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    char *pl = (char *)start;
+    char *pr = pl + num * elsize;
+    char *pw;
+    char *vp;
+    int err = -NPY_ENOMEM;
+
+    /* Items that have zero size don't make sense to sort */
+    if (elsize == 0) {
+        return 0;
+    }
+
+    pw = (char *)malloc((num >> 1) * elsize);
+    vp = (char *)malloc(elsize);
+
+    if (pw != NULL && vp != NULL) {
+        npy_mergesort0(pl, pr, pw, vp, elsize, cmp, arr);
+        err = 0;
+    }
+
+    free(vp);
+    free(pw);
+
+    return err;
+}
+
+static void
+npy_amergesort0(npy_intp *pl, npy_intp *pr, char *v, npy_intp *pw,
+                npy_intp elsize, PyArray_CompareFunc *cmp, PyArrayObject *arr)
+{
+    char *vp;
+    npy_intp vi, *pi, *pj, *pk, *pm;
+
+    if (pr - pl > SMALL_MERGESORT) {
+        /* merge sort */
+        pm = pl + ((pr - pl) >> 1);
+        npy_amergesort0(pl, pm, v, pw, elsize, cmp, arr);
+        npy_amergesort0(pm, pr, v, pw, elsize, cmp, arr);
+        for (pi = pw, pj = pl; pj < pm;) {
+            *pi++ = *pj++;
+        }
+        pi = pw + (pm - pl);
+        pj = pw;
+        pk = pl;
+        while (pj < pi && pm < pr) {
+            if (cmp(v + (*pm) * elsize, v + (*pj) * elsize, arr) < 0) {
+                *pk++ = *pm++;
+            }
+            else {
+                *pk++ = *pj++;
+            }
+        }
+        while (pj < pi) {
+            *pk++ = *pj++;
+        }
+    }
+    else {
+        /* insertion sort */
+        for (pi = pl + 1; pi < pr; ++pi) {
+            vi = *pi;
+            vp = v + vi * elsize;
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && cmp(vp, v + (*pk) * elsize, arr) < 0) {
+                *pj-- = *pk--;
+            }
+            *pj = vi;
+        }
+    }
+}
+
+NPY_NO_EXPORT int
+npy_amergesort(void *v, npy_intp *tosort, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    npy_intp elsize = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    npy_intp *pl, *pr, *pw;
+
+    /* Items that have zero size don't make sense to sort */
+    if (elsize == 0) {
+        return 0;
+    }
+
+    pl = tosort;
+    pr = pl + num;
+    pw = (npy_intp *)malloc((num >> 1) * sizeof(npy_intp));
+    if (pw == NULL) {
+        return -NPY_ENOMEM;
+    }
+    npy_amergesort0(pl, pr, (char *)v, pw, elsize, cmp, arr);
+    free(pw);
+
+    return 0;
+}
+
+/***************************************
+ * C > C++ dispatch
+ ***************************************/
+NPY_NO_EXPORT int
+mergesort_bool(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::bool_tag>((npy_bool *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_byte(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::byte_tag>((npy_byte *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_ubyte(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::ubyte_tag>((npy_ubyte *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_short(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::short_tag>((npy_short *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_ushort(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::ushort_tag>((npy_ushort *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_int(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::int_tag>((npy_int *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_uint(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::uint_tag>((npy_uint *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_long(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::long_tag>((npy_long *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_ulong(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::ulong_tag>((npy_ulong *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_longlong(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::longlong_tag>((npy_longlong *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_ulonglong(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::ulonglong_tag>((npy_ulonglong *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_half(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::half_tag>((npy_half *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_float(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::float_tag>((npy_float *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_double(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::double_tag>((npy_double *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_longdouble(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::longdouble_tag>((npy_longdouble *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_cfloat(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::cfloat_tag>((npy_cfloat *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_cdouble(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::cdouble_tag>((npy_cdouble *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_clongdouble(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::clongdouble_tag>((npy_clongdouble *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_datetime(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::datetime_tag>((npy_datetime *)start, num);
+}
+NPY_NO_EXPORT int
+mergesort_timedelta(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return mergesort_<npy::timedelta_tag>((npy_timedelta *)start, num);
+}
+
+NPY_NO_EXPORT int
+amergesort_bool(void *start, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::bool_tag>((npy_bool *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_byte(void *start, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::byte_tag>((npy_byte *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_ubyte(void *start, npy_intp *tosort, npy_intp num,
+                 void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::ubyte_tag>((npy_ubyte *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_short(void *start, npy_intp *tosort, npy_intp num,
+                 void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::short_tag>((npy_short *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_ushort(void *start, npy_intp *tosort, npy_intp num,
+                  void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::ushort_tag>((npy_ushort *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_int(void *start, npy_intp *tosort, npy_intp num,
+               void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::int_tag>((npy_int *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_uint(void *start, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::uint_tag>((npy_uint *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_long(void *start, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::long_tag>((npy_long *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_ulong(void *start, npy_intp *tosort, npy_intp num,
+                 void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::ulong_tag>((npy_ulong *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_longlong(void *start, npy_intp *tosort, npy_intp num,
+                    void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::longlong_tag>((npy_longlong *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_ulonglong(void *start, npy_intp *tosort, npy_intp num,
+                     void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::ulonglong_tag>((npy_ulonglong *)start, tosort,
+                                           num);
+}
+NPY_NO_EXPORT int
+amergesort_half(void *start, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::half_tag>((npy_half *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_float(void *start, npy_intp *tosort, npy_intp num,
+                 void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::float_tag>((npy_float *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_double(void *start, npy_intp *tosort, npy_intp num,
+                  void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::double_tag>((npy_double *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_longdouble(void *start, npy_intp *tosort, npy_intp num,
+                      void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::longdouble_tag>((npy_longdouble *)start, tosort,
+                                            num);
+}
+NPY_NO_EXPORT int
+amergesort_cfloat(void *start, npy_intp *tosort, npy_intp num,
+                  void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::cfloat_tag>((npy_cfloat *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_cdouble(void *start, npy_intp *tosort, npy_intp num,
+                   void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::cdouble_tag>((npy_cdouble *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_clongdouble(void *start, npy_intp *tosort, npy_intp num,
+                       void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::clongdouble_tag>((npy_clongdouble *)start, tosort,
+                                             num);
+}
+NPY_NO_EXPORT int
+amergesort_datetime(void *start, npy_intp *tosort, npy_intp num,
+                    void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::datetime_tag>((npy_datetime *)start, tosort, num);
+}
+NPY_NO_EXPORT int
+amergesort_timedelta(void *start, npy_intp *tosort, npy_intp num,
+                     void *NPY_UNUSED(varr))
+{
+    return amergesort_<npy::timedelta_tag>((npy_timedelta *)start, tosort,
+                                           num);
+}
+
+NPY_NO_EXPORT int
+mergesort_string(void *start, npy_intp num, void *varr)
+{
+    return string_mergesort_<npy::string_tag>((npy_char *)start, num, varr);
+}
+NPY_NO_EXPORT int
+mergesort_unicode(void *start, npy_intp num, void *varr)
+{
+    return string_mergesort_<npy::unicode_tag>((npy_ucs4 *)start, num, varr);
+}
+NPY_NO_EXPORT int
+amergesort_string(void *v, npy_intp *tosort, npy_intp num, void *varr)
+{
+    return string_amergesort_<npy::string_tag>((npy_char *)v, tosort, num,
+                                               varr);
+}
+NPY_NO_EXPORT int
+amergesort_unicode(void *v, npy_intp *tosort, npy_intp num, void *varr)
+{
+    return string_amergesort_<npy::unicode_tag>((npy_ucs4 *)v, tosort, num,
+                                                varr);
+}
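Reviewer note: the wrappers above all follow one pattern. A function with a `void *` signature casts the buffer and forwards to a template that is parameterized on a tag type supplying the comparison. The sketch below illustrates that pattern under stated assumptions. The namespace `sketch`, the tags, `ainsertion_sort_`, and `asort_int`/`asort_double` are hypothetical names, `std::ptrdiff_t` stands in for `npy_intp`, and a plain insertion sort replaces the merge sort so the example stays short.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical tag: carries the element type and its ordering.
struct int_tag {
    using type = int;
    static bool less(int a, int b) { return a < b; }
};

// Hypothetical tag for doubles that orders NaN after every other value.
struct double_tag {
    using type = double;
    static bool less(double a, double b)
    {
        return a < b || (b != b && a == a);
    }
};

// Indirect (arg) insertion sort: permutes tosort, never moves the data in v.
template <typename Tag, typename type>
int
ainsertion_sort_(type *v, std::ptrdiff_t *tosort, std::ptrdiff_t num)
{
    for (std::ptrdiff_t i = 1; i < num; ++i) {
        std::ptrdiff_t vi = tosort[i];
        std::ptrdiff_t j = i;
        while (j > 0 && Tag::less(v[vi], v[tosort[j - 1]])) {
            tosort[j] = tosort[j - 1];
            --j;
        }
        tosort[j] = vi;
    }
    return 0;
}

}  // namespace sketch

// Thin untyped entry points, one per element type, as in the patch.
int
asort_int(void *start, std::ptrdiff_t *tosort, std::ptrdiff_t num)
{
    return sketch::ainsertion_sort_<sketch::int_tag>((int *)start, tosort, num);
}

int
asort_double(void *start, std::ptrdiff_t *tosort, std::ptrdiff_t num)
{
    return sketch::ainsertion_sort_<sketch::double_tag>((double *)start,
                                                        tosort, num);
}
```

Only the tag is named at the call site; the element type is deduced from the cast pointer, which is why each wrapper in the patch is a single `return` statement.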
diff --git a/numpy/core/src/npysort/npysort_common.h b/numpy/core/src/npysort/npysort_common.h

index 2a6e4d421234bc580d89ff83b146992ac9e5f6ec..ab9f456b65d1aac5057d60ceeca8a59bf476d177 100644
--- a/numpy/core/src/npysort/npysort_common.h
+++ b/numpy/core/src/npysort/npysort_common.h
@@ -4,6 +4,10 @@
  #include <stdlib.h>
  #include <numpy/ndarraytypes.h>
  
+#ifdef __cplusplus
+extern "C" {
+#endif
+
  /*
   *****************************************************************************
   **                        SWAP MACROS                                      **
@@ -139,14 +143,14 @@ LONGDOUBLE_LT(npy_longdouble a, npy_longdouble b)
  
  
  NPY_INLINE static int
-npy_half_isnan(npy_half h)
+_npy_half_isnan(npy_half h)
  {
      return ((h&0x7c00u) == 0x7c00u) && ((h&0x03ffu) != 0x0000u);
  }
  
  
  NPY_INLINE static int
-npy_half_lt_nonan(npy_half h1, npy_half h2)
+_npy_half_lt_nonan(npy_half h1, npy_half h2)
  {
      if (h1&0x8000u) {
          if (h2&0x8000u) {
@@ -173,11 +177,11 @@ HALF_LT(npy_half a, npy_half b)
  {
      int ret;
  
-    if (npy_half_isnan(b)) {
-        ret = !npy_half_isnan(a);
+    if (_npy_half_isnan(b)) {
+        ret = !_npy_half_isnan(a);
      }
      else {
-        ret = !npy_half_isnan(a) && npy_half_lt_nonan(a, b);
+        ret = !_npy_half_isnan(a) && _npy_half_lt_nonan(a, b);
      }
  
      return ret;
@@ -255,7 +259,7 @@ CLONGDOUBLE_LT(npy_clongdouble a, npy_clongdouble b)
  
  
  NPY_INLINE static void
-STRING_COPY(char *s1, char *s2, size_t len)
+STRING_COPY(char *s1, char const*s2, size_t len)
  {
      memcpy(s1, s2, len);
  }
@@ -291,7 +295,7 @@ STRING_LT(const char *s1, const char *s2, size_t len)
  
  
  NPY_INLINE static void
-UNICODE_COPY(npy_ucs4 *s1, npy_ucs4 *s2, size_t len)
+UNICODE_COPY(npy_ucs4 *s1, npy_ucs4 const *s2, size_t len)
  {
      while(len--) {
          *s1++ = *s2++;
@@ -373,4 +377,8 @@ GENERIC_SWAP(char *a, char *b, size_t len)
      }
  }
  
+#ifdef __cplusplus
+}
+#endif
+
  #endif
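Reviewer note: the `HALF_LT` hunk above orders half-precision values by their raw 16-bit patterns and places NaN after every other value. The sketch below shows that ordering in isolation. `half_bits`, `half_isnan`, `half_lt_nonan`, and `half_lt` are hypothetical names. The body of `half_lt_nonan` is only partly visible in the hunk, so the sign-handling branches here are an assumption based on the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits).

```cpp
#include <cassert>
#include <cstdint>

using half_bits = std::uint16_t;  // hypothetical stand-in for npy_half

// NaN: exponent bits all set and a non-zero mantissa.
inline bool
half_isnan(half_bits h)
{
    return (h & 0x7c00u) == 0x7c00u && (h & 0x03ffu) != 0x0000u;
}

// Strict less-than for two non-NaN values (assumed reconstruction).
inline bool
half_lt_nonan(half_bits a, half_bits b)
{
    if (a & 0x8000u) {
        if (b & 0x8000u) {
            // Both negative: the larger magnitude is the smaller value.
            return (a & 0x7fffu) > (b & 0x7fffu);
        }
        // Negative vs. non-negative: true unless comparing -0 with +0.
        return (a != 0x8000u) || (b != 0x0000u);
    }
    if (b & 0x8000u) {
        return false;
    }
    return (a & 0x7fffu) < (b & 0x7fffu);
}

// Total-order style comparison: NaN sorts after everything else.
inline bool
half_lt(half_bits a, half_bits b)
{
    if (half_isnan(b)) {
        return !half_isnan(a);
    }
    return !half_isnan(a) && half_lt_nonan(a, b);
}
```

With this ordering a sort never sees an inconsistent comparison result for NaN inputs: every non-NaN value compares less than NaN, and NaN compares less than nothing.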
diff --git a/numpy/core/src/npysort/quicksort.c.src b/numpy/core/src/npysort/quicksort.c.src

deleted file mode 100644

index 933f758..0000000
--- a/numpy/core/src/npysort/quicksort.c.src
+++ /dev/null
@@ -1,634 +0,0 @@
-/* -*- c -*- */
-
-/*
- * The purpose of this module is to add faster sort functions
- * that are type-specific.  This is done by altering the
- * function table for the builtin descriptors.
- *
- * These sorting functions are copied almost directly from numarray
- * with a few modifications (complex comparisons compare the imaginary
- * part if the real parts are equal, for example), and the names
- * are changed.
- *
- * The original sorting code is due to Charles R. Harris who wrote
- * it for numarray.
- */
-
-/*
- * Quick sort is usually the fastest, but the worst case scenario is O(N^2) so
- * the code switches to the O(NlogN) worst case heapsort if not enough progress
- * is made on the large side of the two quicksort partitions. This improves the
- * worst case while still retaining the speed of quicksort for the common case.
- * This is variant known as introsort.
- *
- *
- * def introsort(lower, higher, recursion_limit=log2(higher - lower + 1) * 2):
- *   # sort remainder with heapsort if we are not making enough progress
- *   # we arbitrarily choose 2 * log(n) as the cutoff point
- *   if recursion_limit < 0:
- *       heapsort(lower, higher)
- *       return
- *
- *   if lower < higher:
- *      pivot_pos = partition(lower, higher)
- *      # recurse into smaller first and leave larger on stack
- *      # this limits the required stack space
- *      if (pivot_pos - lower > higher - pivot_pos):
- *          quicksort(pivot_pos + 1, higher, recursion_limit - 1)
- *          quicksort(lower, pivot_pos, recursion_limit - 1)
- *      else:
- *          quicksort(lower, pivot_pos, recursion_limit - 1)
- *          quicksort(pivot_pos + 1, higher, recursion_limit - 1)
- *
- *
- * the below code implements this converted to an iteration and as an
- * additional minor optimization skips the recursion depth checking on the
- * smaller partition as it is always less than half of the remaining data and
- * will thus terminate fast enough
- */
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "npy_sort.h"
-#include "npysort_common.h"
-#include <stdlib.h>
-
-#define NOT_USED NPY_UNUSED(unused)
-/*
- * pushing largest partition has upper bound of log2(n) space
- * we store two pointers each time
- */
-#define PYA_QS_STACK (NPY_BITSOF_INTP * 2)
-#define SMALL_QUICKSORT 15
-#define SMALL_MERGESORT 20
-#define SMALL_STRING 16
-
-
-/*
- *****************************************************************************
- **                            NUMERIC SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE, DATETIME, TIMEDELTA#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble, datetime, timedelta#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble, npy_datetime, npy_timedelta#
- */
-
-NPY_NO_EXPORT int
-quicksort_@suff@(void *start, npy_intp num, void *NOT_USED)
-{
-    @type@ vp;
-    @type@ *pl = start;
-    @type@ *pr = pl + num - 1;
-    @type@ *stack[PYA_QS_STACK];
-    @type@ **sptr = stack;
-    @type@ *pm, *pi, *pj, *pk;
-    int depth[PYA_QS_STACK];
-    int * psdepth = depth;
-    int cdepth = npy_get_msb(num) * 2;
-
-    for (;;) {
-        if (NPY_UNLIKELY(cdepth < 0)) {
-            heapsort_@suff@(pl, pr - pl + 1, NULL);
-            goto stack_pop;
-        }
-        while ((pr - pl) > SMALL_QUICKSORT) {
-            /* quicksort partition */
-            pm = pl + ((pr - pl) >> 1);
-            if (@TYPE@_LT(*pm, *pl)) @TYPE@_SWAP(*pm, *pl);
-            if (@TYPE@_LT(*pr, *pm)) @TYPE@_SWAP(*pr, *pm);
-            if (@TYPE@_LT(*pm, *pl)) @TYPE@_SWAP(*pm, *pl);
-            vp = *pm;
-            pi = pl;
-            pj = pr - 1;
-            @TYPE@_SWAP(*pm, *pj);
-            for (;;) {
-                do ++pi; while (@TYPE@_LT(*pi, vp));
-                do --pj; while (@TYPE@_LT(vp, *pj));
-                if (pi >= pj) {
-                    break;
-                }
-                @TYPE@_SWAP(*pi,*pj);
-            }
-            pk = pr - 1;
-            @TYPE@_SWAP(*pi, *pk);
-            /* push largest partition on stack */
-            if (pi - pl < pr - pi) {
-                *sptr++ = pi + 1;
-                *sptr++ = pr;
-                pr = pi - 1;
-            }
-            else {
-                *sptr++ = pl;
-                *sptr++ = pi - 1;
-                pl = pi + 1;
-            }
-            *psdepth++ = --cdepth;
-        }
-
-        /* insertion sort */
-        for (pi = pl + 1; pi <= pr; ++pi) {
-            vp = *pi;
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && @TYPE@_LT(vp, *pk)) {
-                *pj-- = *pk--;
-            }
-            *pj = vp;
-        }
-stack_pop:
-        if (sptr == stack) {
-            break;
-        }
-        pr = *(--sptr);
-        pl = *(--sptr);
-        cdepth = *(--psdepth);
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-aquicksort_@suff@(void *vv, npy_intp* tosort, npy_intp num, void *NOT_USED)
-{
-    @type@ *v = vv;
-    @type@ vp;
-    npy_intp *pl = tosort;
-    npy_intp *pr = tosort + num - 1;
-    npy_intp *stack[PYA_QS_STACK];
-    npy_intp **sptr = stack;
-    npy_intp *pm, *pi, *pj, *pk, vi;
-    int depth[PYA_QS_STACK];
-    int * psdepth = depth;
-    int cdepth = npy_get_msb(num) * 2;
-
-    for (;;) {
-        if (NPY_UNLIKELY(cdepth < 0)) {
-            aheapsort_@suff@(vv, pl, pr - pl + 1, NULL);
-            goto stack_pop;
-        }
-        while ((pr - pl) > SMALL_QUICKSORT) {
-            /* quicksort partition */
-            pm = pl + ((pr - pl) >> 1);
-            if (@TYPE@_LT(v[*pm],v[*pl])) INTP_SWAP(*pm, *pl);
-            if (@TYPE@_LT(v[*pr],v[*pm])) INTP_SWAP(*pr, *pm);
-            if (@TYPE@_LT(v[*pm],v[*pl])) INTP_SWAP(*pm, *pl);
-            vp = v[*pm];
-            pi = pl;
-            pj = pr - 1;
-            INTP_SWAP(*pm, *pj);
-            for (;;) {
-                do ++pi; while (@TYPE@_LT(v[*pi], vp));
-                do --pj; while (@TYPE@_LT(vp, v[*pj]));
-                if (pi >= pj) {
-                    break;
-                }
-                INTP_SWAP(*pi, *pj);
-            }
-            pk = pr - 1;
-            INTP_SWAP(*pi,*pk);
-            /* push largest partition on stack */
-            if (pi - pl < pr - pi) {
-                *sptr++ = pi + 1;
-                *sptr++ = pr;
-                pr = pi - 1;
-            }
-            else {
-                *sptr++ = pl;
-                *sptr++ = pi - 1;
-                pl = pi + 1;
-            }
-            *psdepth++ = --cdepth;
-        }
-
-        /* insertion sort */
-        for (pi = pl + 1; pi <= pr; ++pi) {
-            vi = *pi;
-            vp = v[vi];
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && @TYPE@_LT(vp, v[*pk])) {
-                *pj-- = *pk--;
-            }
-            *pj = vi;
-        }
-stack_pop:
-        if (sptr == stack) {
-            break;
-        }
-        pr = *(--sptr);
-        pl = *(--sptr);
-        cdepth = *(--psdepth);
-    }
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                             STRING SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = STRING, UNICODE#
- * #suff = string, unicode#
- * #type = npy_char, npy_ucs4#
- */
-
-NPY_NO_EXPORT int
-quicksort_@suff@(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    const size_t len = PyArray_ITEMSIZE(arr)/sizeof(@type@);
-    @type@ *vp;
-    @type@ *pl = start;
-    @type@ *pr = pl + (num - 1)*len;
-    @type@ *stack[PYA_QS_STACK], **sptr = stack, *pm, *pi, *pj, *pk;
-    int depth[PYA_QS_STACK];
-    int * psdepth = depth;
-    int cdepth = npy_get_msb(num) * 2;
-
-    /* Items that have zero size don't make sense to sort */
-    if (len == 0) {
-        return 0;
-    }
-
-    vp = malloc(PyArray_ITEMSIZE(arr));
-    if (vp == NULL) {
-        return -NPY_ENOMEM;
-    }
-
-    for (;;) {
-        if (NPY_UNLIKELY(cdepth < 0)) {
-            heapsort_@suff@(pl, (pr - pl) / len + 1, varr);
-            goto stack_pop;
-        }
-        while ((size_t)(pr - pl) > SMALL_QUICKSORT*len) {
-            /* quicksort partition */
-            pm = pl + (((pr - pl)/len) >> 1)*len;
-            if (@TYPE@_LT(pm, pl, len)) @TYPE@_SWAP(pm, pl, len);
-            if (@TYPE@_LT(pr, pm, len)) @TYPE@_SWAP(pr, pm, len);
-            if (@TYPE@_LT(pm, pl, len)) @TYPE@_SWAP(pm, pl, len);
-            @TYPE@_COPY(vp, pm, len);
-            pi = pl;
-            pj = pr - len;
-            @TYPE@_SWAP(pm, pj, len);
-            for (;;) {
-                do pi += len; while (@TYPE@_LT(pi, vp, len));
-                do pj -= len; while (@TYPE@_LT(vp, pj, len));
-                if (pi >= pj) {
-                    break;
-                }
-                @TYPE@_SWAP(pi, pj, len);
-            }
-            pk = pr - len;
-            @TYPE@_SWAP(pi, pk, len);
-            /* push largest partition on stack */
-            if (pi - pl < pr - pi) {
-                *sptr++ = pi + len;
-                *sptr++ = pr;
-                pr = pi - len;
-            }
-            else {
-                *sptr++ = pl;
-                *sptr++ = pi - len;
-                pl = pi + len;
-            }
-            *psdepth++ = --cdepth;
-        }
-
-        /* insertion sort */
-        for (pi = pl + len; pi <= pr; pi += len) {
-            @TYPE@_COPY(vp, pi, len);
-            pj = pi;
-            pk = pi - len;
-            while (pj > pl && @TYPE@_LT(vp, pk, len)) {
-                @TYPE@_COPY(pj, pk, len);
-                pj -= len;
-                pk -= len;
-            }
-            @TYPE@_COPY(pj, vp, len);
-        }
-stack_pop:
-        if (sptr == stack) {
-            break;
-        }
-        pr = *(--sptr);
-        pl = *(--sptr);
-        cdepth = *(--psdepth);
-    }
-
-    free(vp);
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-aquicksort_@suff@(void *vv, npy_intp* tosort, npy_intp num, void *varr)
-{
-    @type@ *v = vv;
-    PyArrayObject *arr = varr;
-    size_t len = PyArray_ITEMSIZE(arr)/sizeof(@type@);
-    @type@ *vp;
-    npy_intp *pl = tosort;
-    npy_intp *pr = tosort + num - 1;
-    npy_intp *stack[PYA_QS_STACK];
-    npy_intp **sptr=stack;
-    npy_intp *pm, *pi, *pj, *pk, vi;
-    int depth[PYA_QS_STACK];
-    int * psdepth = depth;
-    int cdepth = npy_get_msb(num) * 2;
-
-    /* Items that have zero size don't make sense to sort */
-    if (len == 0) {
-        return 0;
-    }
-
-    for (;;) {
-        if (NPY_UNLIKELY(cdepth < 0)) {
-            aheapsort_@suff@(vv, pl, pr - pl + 1, varr);
-            goto stack_pop;
-        }
-        while ((pr - pl) > SMALL_QUICKSORT) {
-            /* quicksort partition */
-            pm = pl + ((pr - pl) >> 1);
-            if (@TYPE@_LT(v + (*pm)*len, v + (*pl)*len, len)) INTP_SWAP(*pm, *pl);
-            if (@TYPE@_LT(v + (*pr)*len, v + (*pm)*len, len)) INTP_SWAP(*pr, *pm);
-            if (@TYPE@_LT(v + (*pm)*len, v + (*pl)*len, len)) INTP_SWAP(*pm, *pl);
-            vp = v + (*pm)*len;
-            pi = pl;
-            pj = pr - 1;
-            INTP_SWAP(*pm,*pj);
-            for (;;) {
-                do ++pi; while (@TYPE@_LT(v + (*pi)*len, vp, len));
-                do --pj; while (@TYPE@_LT(vp, v + (*pj)*len, len));
-                if (pi >= pj) {
-                    break;
-                }
-                INTP_SWAP(*pi,*pj);
-            }
-            pk = pr - 1;
-            INTP_SWAP(*pi,*pk);
-            /* push largest partition on stack */
-            if (pi - pl < pr - pi) {
-                *sptr++ = pi + 1;
-                *sptr++ = pr;
-                pr = pi - 1;
-            }
-            else {
-                *sptr++ = pl;
-                *sptr++ = pi - 1;
-                pl = pi + 1;
-            }
-            *psdepth++ = --cdepth;
-        }
-
-        /* insertion sort */
-        for (pi = pl + 1; pi <= pr; ++pi) {
-            vi = *pi;
-            vp = v + vi*len;
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && @TYPE@_LT(vp, v + (*pk)*len, len)) {
-                *pj-- = *pk--;
-            }
-            *pj = vi;
-        }
-stack_pop:
-        if (sptr == stack) {
-            break;
-        }
-        pr = *(--sptr);
-        pl = *(--sptr);
-        cdepth = *(--psdepth);
-    }
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                             GENERIC SORT                                **
- *****************************************************************************
- */
-
-
-NPY_NO_EXPORT int
-npy_quicksort(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    npy_intp elsize = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    char *vp;
-    char *pl = start;
-    char *pr = pl + (num - 1)*elsize;
-    char *stack[PYA_QS_STACK];
-    char **sptr = stack;
-    char *pm, *pi, *pj, *pk;
-    int depth[PYA_QS_STACK];
-    int * psdepth = depth;
-    int cdepth = npy_get_msb(num) * 2;
-
-    /* Items that have zero size don't make sense to sort */
-    if (elsize == 0) {
-        return 0;
-    }
-
-    vp = malloc(elsize);
-    if (vp == NULL) {
-        return -NPY_ENOMEM;
-    }
-
-    for (;;) {
-        if (NPY_UNLIKELY(cdepth < 0)) {
-            npy_heapsort(pl, (pr - pl) / elsize + 1, varr);
-            goto stack_pop;
-        }
-        while(pr - pl > SMALL_QUICKSORT*elsize) {
-            /* quicksort partition */
-            pm = pl + (((pr - pl) / elsize) >> 1) * elsize;
-            if (cmp(pm, pl, arr) < 0) {
-                GENERIC_SWAP(pm, pl, elsize);
-            }
-            if (cmp(pr, pm, arr) < 0) {
-                GENERIC_SWAP(pr, pm, elsize);
-            }
-            if (cmp(pm, pl, arr) < 0) {
-                GENERIC_SWAP(pm, pl, elsize);
-            }
-            GENERIC_COPY(vp, pm, elsize);
-            pi = pl;
-            pj = pr - elsize;
-            GENERIC_SWAP(pm, pj, elsize);
-            /*
-             * Generic comparisons may be buggy, so don't rely on the sentinels
-             * to keep the pointers from going out of bounds.
-             */
-            for (;;) {
-                do {
-                    pi += elsize;
-                } while (cmp(pi, vp, arr) < 0 && pi < pj);
-                do {
-                    pj -= elsize;
-                } while (cmp(vp, pj, arr) < 0 && pi < pj);
-                if (pi >= pj) {
-                    break;
-                }
-                GENERIC_SWAP(pi, pj, elsize);
-            }
-            pk = pr - elsize;
-            GENERIC_SWAP(pi, pk, elsize);
-            /* push largest partition on stack */
-            if (pi - pl < pr - pi) {
-                *sptr++ = pi + elsize;
-                *sptr++ = pr;
-                pr = pi - elsize;
-            }
-            else {
-                *sptr++ = pl;
-                *sptr++ = pi - elsize;
-                pl = pi + elsize;
-            }
-            *psdepth++ = --cdepth;
-        }
-
-        /* insertion sort */
-        for (pi = pl + elsize; pi <= pr; pi += elsize) {
-            GENERIC_COPY(vp, pi, elsize);
-            pj = pi;
-            pk = pi - elsize;
-            while (pj > pl && cmp(vp, pk, arr) < 0) {
-                GENERIC_COPY(pj, pk, elsize);
-                pj -= elsize;
-                pk -= elsize;
-            }
-            GENERIC_COPY(pj, vp, elsize);
-        }
-stack_pop:
-        if (sptr == stack) {
-            break;
-        }
-        pr = *(--sptr);
-        pl = *(--sptr);
-        cdepth = *(--psdepth);
-    }
-
-    free(vp);
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-npy_aquicksort(void *vv, npy_intp* tosort, npy_intp num, void *varr)
-{
-    char *v = vv;
-    PyArrayObject *arr = varr;
-    npy_intp elsize = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    char *vp;
-    npy_intp *pl = tosort;
-    npy_intp *pr = tosort + num - 1;
-    npy_intp *stack[PYA_QS_STACK];
-    npy_intp **sptr = stack;
-    npy_intp *pm, *pi, *pj, *pk, vi;
-    int depth[PYA_QS_STACK];
-    int * psdepth = depth;
-    int cdepth = npy_get_msb(num) * 2;
-
-    /* Items that have zero size don't make sense to sort */
-    if (elsize == 0) {
-        return 0;
-    }
-
-    for (;;) {
-        if (NPY_UNLIKELY(cdepth < 0)) {
-            npy_aheapsort(vv, pl, pr - pl + 1, varr);
-            goto stack_pop;
-        }
-        while ((pr - pl) > SMALL_QUICKSORT) {
-            /* quicksort partition */
-            pm = pl + ((pr - pl) >> 1);
-            if (cmp(v + (*pm)*elsize, v + (*pl)*elsize, arr) < 0) {
-                INTP_SWAP(*pm, *pl);
-            }
-            if (cmp(v + (*pr)*elsize, v + (*pm)*elsize, arr) < 0) {
-                INTP_SWAP(*pr, *pm);
-            }
-            if (cmp(v + (*pm)*elsize, v + (*pl)*elsize, arr) < 0) {
-                INTP_SWAP(*pm, *pl);
-            }
-            vp = v + (*pm)*elsize;
-            pi = pl;
-            pj = pr - 1;
-            INTP_SWAP(*pm,*pj);
-            for (;;) {
-                do {
-                    ++pi;
-                } while (cmp(v + (*pi)*elsize, vp, arr) < 0 && pi < pj);
-                do {
-                    --pj;
-                } while (cmp(vp, v + (*pj)*elsize, arr) < 0 && pi < pj);
-                if (pi >= pj) {
-                    break;
-                }
-                INTP_SWAP(*pi,*pj);
-            }
-            pk = pr - 1;
-            INTP_SWAP(*pi,*pk);
-            /* push largest partition on stack */
-            if (pi - pl < pr - pi) {
-                *sptr++ = pi + 1;
-                *sptr++ = pr;
-                pr = pi - 1;
-            }
-            else {
-                *sptr++ = pl;
-                *sptr++ = pi - 1;
-                pl = pi + 1;
-            }
-            *psdepth++ = --cdepth;
-        }
-
-        /* insertion sort */
-        for (pi = pl + 1; pi <= pr; ++pi) {
-            vi = *pi;
-            vp = v + vi*elsize;
-            pj = pi;
-            pk = pi - 1;
-            while (pj > pl && cmp(vp, v + (*pk)*elsize, arr) < 0) {
-                *pj-- = *pk--;
-            }
-            *pj = vi;
-        }
-stack_pop:
-        if (sptr == stack) {
-            break;
-        }
-        pr = *(--sptr);
-        pl = *(--sptr);
-        cdepth = *(--psdepth);
-    }
-
-    return 0;
-}
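Reviewer note: the header comment of the removed file describes introsort: quicksort with a median-of-three pivot, a depth budget of roughly 2 * log2(n) after which the range is finished with heapsort, and insertion sort for small ranges. The sketch below is a minimal illustration of that description, not the code in this patch. All names are hypothetical, `std::make_heap`/`std::sort_heap` stand in for the project's heapsort, and recursion on the smaller side replaces the explicit pointer stack.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::ptrdiff_t kSmall = 15;  // mirrors SMALL_QUICKSORT

// floor(log2(n)) for n >= 1.
inline int
floor_log2(std::size_t n)
{
    int depth = 0;
    while (n >>= 1) {
        ++depth;
    }
    return depth;
}

// Sorts the inclusive range [lo, hi].
template <typename T>
void
insertion_sort(T *lo, T *hi)
{
    for (T *i = lo + 1; i <= hi; ++i) {
        T v = *i;
        T *j = i;
        while (j > lo && v < *(j - 1)) {
            *j = *(j - 1);
            --j;
        }
        *j = v;
    }
}

// Sorts the inclusive range [lo, hi] with a partitioning budget.
template <typename T>
void
introsort_range(T *lo, T *hi, int depth_limit)
{
    while (hi - lo > kSmall) {
        if (depth_limit < 0) {
            // Not enough progress: finish this range with heapsort.
            std::make_heap(lo, hi + 1);
            std::sort_heap(lo, hi + 1);
            return;
        }
        --depth_limit;
        // Median of three leaves *lo <= pivot <= *hi as sentinels.
        T *mid = lo + (hi - lo) / 2;
        if (*mid < *lo) std::swap(*mid, *lo);
        if (*hi < *mid) std::swap(*hi, *mid);
        if (*mid < *lo) std::swap(*mid, *lo);
        T pivot = *mid;
        T *i = lo;
        T *j = hi - 1;
        std::swap(*mid, *j);  // park the pivot at hi - 1
        for (;;) {
            do { ++i; } while (*i < pivot);
            do { --j; } while (pivot < *j);
            if (i >= j) break;
            std::swap(*i, *j);
        }
        std::swap(*i, *(hi - 1));  // move the pivot to its final slot
        // Recurse into the smaller side, keep looping on the larger one,
        // so the recursion depth stays logarithmic.
        if (i - lo < hi - i) {
            introsort_range(lo, i - 1, depth_limit);
            lo = i + 1;
        }
        else {
            introsort_range(i + 1, hi, depth_limit);
            hi = i - 1;
        }
    }
    insertion_sort(lo, hi);
}

template <typename T>
void
introsort(std::vector<T> &v)
{
    if (v.size() < 2) {
        return;
    }
    introsort_range(v.data(), v.data() + v.size() - 1,
                    2 * floor_log2(v.size()));
}
```

The patch keeps the larger partition on an explicit stack instead of recursing, which bounds the extra space at `PYA_QS_STACK` pointers; the recursive form here trades that for brevity.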
diff --git a/numpy/core/src/npysort/quicksort.cpp b/numpy/core/src/npysort/quicksort.cpp

new file mode 100644

index 0000000..149ba32
--- /dev/null
+++ b/numpy/core/src/npysort/quicksort.cpp
@@ -0,0 +1,964 @@
+/* -*- c++ -*- */
+
+/*
+ * The purpose of this module is to add faster sort functions
+ * that are type-specific.  This is done by altering the
+ * function table for the builtin descriptors.
+ *
+ * These sorting functions are copied almost directly from numarray
+ * with a few modifications (complex comparisons compare the imaginary
+ * part if the real parts are equal, for example), and the names
+ * are changed.
+ *
+ * The original sorting code is due to Charles R. Harris who wrote
+ * it for numarray.
+ */
+
+/*
+ * Quick sort is usually the fastest, but the worst case scenario is O(N^2) so
+ * the code switches to the O(NlogN) worst case heapsort if not enough progress
+ * is made on the large side of the two quicksort partitions. This improves the
+ * worst case while still retaining the speed of quicksort for the common case.
+ * This variant is known as introsort.
+ *
+ *
+ * def introsort(lower, higher, recursion_limit=log2(higher - lower + 1) * 2):
+ *   # sort remainder with heapsort if we are not making enough progress
+ *   # we arbitrarily choose 2 * log(n) as the cutoff point
+ *   if recursion_limit < 0:
+ *       heapsort(lower, higher)
+ *       return
+ *
+ *   if lower < higher:
+ *      pivot_pos = partition(lower, higher)
+ *      # recurse into smaller first and leave larger on stack
+ *      # this limits the required stack space
+ *      if (pivot_pos - lower > higher - pivot_pos):
+ *          quicksort(pivot_pos + 1, higher, recursion_limit - 1)
+ *          quicksort(lower, pivot_pos, recursion_limit - 1)
+ *      else:
+ *          quicksort(lower, pivot_pos, recursion_limit - 1)
+ *          quicksort(pivot_pos + 1, higher, recursion_limit - 1)
+ *
+ *
+ * the below code implements this converted to an iteration and as an
+ * additional minor optimization skips the recursion depth checking on the
+ * smaller partition as it is always less than half of the remaining data and
+ * will thus terminate fast enough
+ */
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "npy_cpu_features.h"
+#include "npy_sort.h"
+#include "npysort_common.h"
+#include "numpy_tag.h"
+
+#include "x86-qsort.h"
+#include <cstdlib>
+#include <utility>
+
+#ifndef NPY_DISABLE_OPTIMIZATION
+#include "x86-qsort.dispatch.h"
+#endif
+
+#define NOT_USED NPY_UNUSED(unused)
+/*
+ * pushing largest partition has upper bound of log2(n) space
+ * we store two pointers each time
+ */
+#define PYA_QS_STACK (NPY_BITSOF_INTP * 2)
+#define SMALL_QUICKSORT 15
+#define SMALL_MERGESORT 20
+#define SMALL_STRING 16
+
+/*
+ *****************************************************************************
+ **                            NUMERIC SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+heapsort_(type *start, npy_intp n);
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+aheapsort_(type *vv, npy_intp *tosort, npy_intp n);
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+string_heapsort_(type *start, npy_intp n, void *varr);
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+string_aheapsort_(type *vv, npy_intp *tosort, npy_intp n, void *varr);
+
+namespace {
+
+template <typename Tag>
+struct x86_dispatch {
+    static bool quicksort(typename Tag::type *, npy_intp) { return false; }
+};
+
+template <>
+struct x86_dispatch<npy::int_tag> {
+    static bool quicksort(npy_int *start, npy_intp num)
+    {
+        void (*dispfunc)(void *, npy_intp) = nullptr;
+        NPY_CPU_DISPATCH_CALL_XB(dispfunc = x86_quicksort_int);
+        if (dispfunc) {
+            (*dispfunc)(start, num);
+            return true;
+        }
+        return false;
+    }
+};
+
+template <>
+struct x86_dispatch<npy::uint_tag> {
+    static bool quicksort(npy_uint *start, npy_intp num)
+    {
+        void (*dispfunc)(void *, npy_intp) = nullptr;
+        NPY_CPU_DISPATCH_CALL_XB(dispfunc = x86_quicksort_uint);
+        if (dispfunc) {
+            (*dispfunc)(start, num);
+            return true;
+        }
+        return false;
+    }
+};
+
+template <>
+struct x86_dispatch<npy::float_tag> {
+    static bool quicksort(npy_float *start, npy_intp num)
+    {
+        void (*dispfunc)(void *, npy_intp) = nullptr;
+        NPY_CPU_DISPATCH_CALL_XB(dispfunc = x86_quicksort_float);
+        if (dispfunc) {
+            (*dispfunc)(start, num);
+            return true;
+        }
+        return false;
+    }
+};
+
+}  // namespace
+
+template <typename Tag, typename type>
+static int
+quicksort_(type *start, npy_intp num)
+{
+    if (x86_dispatch<Tag>::quicksort(start, num))
+        return 0;
+
+    type vp;
+    type *pl = start;
+    type *pr = pl + num - 1;
+    type *stack[PYA_QS_STACK];
+    type **sptr = stack;
+    type *pm, *pi, *pj, *pk;
+    int depth[PYA_QS_STACK];
+    int *psdepth = depth;
+    int cdepth = npy_get_msb(num) * 2;
+
+    for (;;) {
+        if (NPY_UNLIKELY(cdepth < 0)) {
+            heapsort_<Tag>(pl, pr - pl + 1);
+            goto stack_pop;
+        }
+        while ((pr - pl) > SMALL_QUICKSORT) {
+            /* quicksort partition */
+            pm = pl + ((pr - pl) >> 1);
+            if (Tag::less(*pm, *pl)) {
+                std::swap(*pm, *pl);
+            }
+            if (Tag::less(*pr, *pm)) {
+                std::swap(*pr, *pm);
+            }
+            if (Tag::less(*pm, *pl)) {
+                std::swap(*pm, *pl);
+            }
+            vp = *pm;
+            pi = pl;
+            pj = pr - 1;
+            std::swap(*pm, *pj);
+            for (;;) {
+                do {
+                    ++pi;
+                } while (Tag::less(*pi, vp));
+                do {
+                    --pj;
+                } while (Tag::less(vp, *pj));
+                if (pi >= pj) {
+                    break;
+                }
+                std::swap(*pi, *pj);
+            }
+            pk = pr - 1;
+            std::swap(*pi, *pk);
+            /* push largest partition on stack */
+            if (pi - pl < pr - pi) {
+                *sptr++ = pi + 1;
+                *sptr++ = pr;
+                pr = pi - 1;
+            }
+            else {
+                *sptr++ = pl;
+                *sptr++ = pi - 1;
+                pl = pi + 1;
+            }
+            *psdepth++ = --cdepth;
+        }
+
+        /* insertion sort */
+        for (pi = pl + 1; pi <= pr; ++pi) {
+            vp = *pi;
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && Tag::less(vp, *pk)) {
+                *pj-- = *pk--;
+            }
+            *pj = vp;
+        }
+    stack_pop:
+        if (sptr == stack) {
+            break;
+        }
+        pr = *(--sptr);
+        pl = *(--sptr);
+        cdepth = *(--psdepth);
+    }
+
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+aquicksort_(type *vv, npy_intp *tosort, npy_intp num)
+{
+    type *v = vv;
+    type vp;
+    npy_intp *pl = tosort;
+    npy_intp *pr = tosort + num - 1;
+    npy_intp *stack[PYA_QS_STACK];
+    npy_intp **sptr = stack;
+    npy_intp *pm, *pi, *pj, *pk, vi;
+    int depth[PYA_QS_STACK];
+    int *psdepth = depth;
+    int cdepth = npy_get_msb(num) * 2;
+
+    for (;;) {
+        if (NPY_UNLIKELY(cdepth < 0)) {
+            aheapsort_<Tag>(vv, pl, pr - pl + 1);
+            goto stack_pop;
+        }
+        while ((pr - pl) > SMALL_QUICKSORT) {
+            /* quicksort partition */
+            pm = pl + ((pr - pl) >> 1);
+            if (Tag::less(v[*pm], v[*pl])) {
+                std::swap(*pm, *pl);
+            }
+            if (Tag::less(v[*pr], v[*pm])) {
+                std::swap(*pr, *pm);
+            }
+            if (Tag::less(v[*pm], v[*pl])) {
+                std::swap(*pm, *pl);
+            }
+            vp = v[*pm];
+            pi = pl;
+            pj = pr - 1;
+            std::swap(*pm, *pj);
+            for (;;) {
+                do {
+                    ++pi;
+                } while (Tag::less(v[*pi], vp));
+                do {
+                    --pj;
+                } while (Tag::less(vp, v[*pj]));
+                if (pi >= pj) {
+                    break;
+                }
+                std::swap(*pi, *pj);
+            }
+            pk = pr - 1;
+            std::swap(*pi, *pk);
+            /* push larger partition on stack */
+            if (pi - pl < pr - pi) {
+                *sptr++ = pi + 1;
+                *sptr++ = pr;
+                pr = pi - 1;
+            }
+            else {
+                *sptr++ = pl;
+                *sptr++ = pi - 1;
+                pl = pi + 1;
+            }
+            *psdepth++ = --cdepth;
+        }
+
+        /* insertion sort */
+        for (pi = pl + 1; pi <= pr; ++pi) {
+            vi = *pi;
+            vp = v[vi];
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && Tag::less(vp, v[*pk])) {
+                *pj-- = *pk--;
+            }
+            *pj = vi;
+        }
+    stack_pop:
+        if (sptr == stack) {
+            break;
+        }
+        pr = *(--sptr);
+        pl = *(--sptr);
+        cdepth = *(--psdepth);
+    }
+
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             STRING SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag, typename type>
+static int
+string_quicksort_(type *start, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    const size_t len = PyArray_ITEMSIZE(arr) / sizeof(type);
+    type *vp;
+    type *pl = start;
+    type *pr = pl + (num - 1) * len;
+    type *stack[PYA_QS_STACK], **sptr = stack, *pm, *pi, *pj, *pk;
+    int depth[PYA_QS_STACK];
+    int *psdepth = depth;
+    int cdepth = npy_get_msb(num) * 2;
+
+    /* Items that have zero size don't make sense to sort */
+    if (len == 0) {
+        return 0;
+    }
+
+    vp = (type *)malloc(PyArray_ITEMSIZE(arr));
+    if (vp == NULL) {
+        return -NPY_ENOMEM;
+    }
+
+    for (;;) {
+        if (NPY_UNLIKELY(cdepth < 0)) {
+            string_heapsort_<Tag>(pl, (pr - pl) / len + 1, varr);
+            goto stack_pop;
+        }
+        while ((size_t)(pr - pl) > SMALL_QUICKSORT * len) {
+            /* quicksort partition */
+            pm = pl + (((pr - pl) / len) >> 1) * len;
+            if (Tag::less(pm, pl, len)) {
+                Tag::swap(pm, pl, len);
+            }
+            if (Tag::less(pr, pm, len)) {
+                Tag::swap(pr, pm, len);
+            }
+            if (Tag::less(pm, pl, len)) {
+                Tag::swap(pm, pl, len);
+            }
+            Tag::copy(vp, pm, len);
+            pi = pl;
+            pj = pr - len;
+            Tag::swap(pm, pj, len);
+            for (;;) {
+                do {
+                    pi += len;
+                } while (Tag::less(pi, vp, len));
+                do {
+                    pj -= len;
+                } while (Tag::less(vp, pj, len));
+                if (pi >= pj) {
+                    break;
+                }
+                Tag::swap(pi, pj, len);
+            }
+            pk = pr - len;
+            Tag::swap(pi, pk, len);
+            /* push larger partition on stack */
+            if (pi - pl < pr - pi) {
+                *sptr++ = pi + len;
+                *sptr++ = pr;
+                pr = pi - len;
+            }
+            else {
+                *sptr++ = pl;
+                *sptr++ = pi - len;
+                pl = pi + len;
+            }
+            *psdepth++ = --cdepth;
+        }
+
+        /* insertion sort */
+        for (pi = pl + len; pi <= pr; pi += len) {
+            Tag::copy(vp, pi, len);
+            pj = pi;
+            pk = pi - len;
+            while (pj > pl && Tag::less(vp, pk, len)) {
+                Tag::copy(pj, pk, len);
+                pj -= len;
+                pk -= len;
+            }
+            Tag::copy(pj, vp, len);
+        }
+    stack_pop:
+        if (sptr == stack) {
+            break;
+        }
+        pr = *(--sptr);
+        pl = *(--sptr);
+        cdepth = *(--psdepth);
+    }
+
+    free(vp);
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+string_aquicksort_(type *vv, npy_intp *tosort, npy_intp num, void *varr)
+{
+    type *v = vv;
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    size_t len = PyArray_ITEMSIZE(arr) / sizeof(type);
+    type *vp;
+    npy_intp *pl = tosort;
+    npy_intp *pr = tosort + num - 1;
+    npy_intp *stack[PYA_QS_STACK];
+    npy_intp **sptr = stack;
+    npy_intp *pm, *pi, *pj, *pk, vi;
+    int depth[PYA_QS_STACK];
+    int *psdepth = depth;
+    int cdepth = npy_get_msb(num) * 2;
+
+    /* Items that have zero size don't make sense to sort */
+    if (len == 0) {
+        return 0;
+    }
+
+    for (;;) {
+        if (NPY_UNLIKELY(cdepth < 0)) {
+            string_aheapsort_<Tag>(vv, pl, pr - pl + 1, varr);
+            goto stack_pop;
+        }
+        while ((pr - pl) > SMALL_QUICKSORT) {
+            /* quicksort partition */
+            pm = pl + ((pr - pl) >> 1);
+            if (Tag::less(v + (*pm) * len, v + (*pl) * len, len)) {
+                std::swap(*pm, *pl);
+            }
+            if (Tag::less(v + (*pr) * len, v + (*pm) * len, len)) {
+                std::swap(*pr, *pm);
+            }
+            if (Tag::less(v + (*pm) * len, v + (*pl) * len, len)) {
+                std::swap(*pm, *pl);
+            }
+            vp = v + (*pm) * len;
+            pi = pl;
+            pj = pr - 1;
+            std::swap(*pm, *pj);
+            for (;;) {
+                do {
+                    ++pi;
+                } while (Tag::less(v + (*pi) * len, vp, len));
+                do {
+                    --pj;
+                } while (Tag::less(vp, v + (*pj) * len, len));
+                if (pi >= pj) {
+                    break;
+                }
+                std::swap(*pi, *pj);
+            }
+            pk = pr - 1;
+            std::swap(*pi, *pk);
+            /* push larger partition on stack */
+            if (pi - pl < pr - pi) {
+                *sptr++ = pi + 1;
+                *sptr++ = pr;
+                pr = pi - 1;
+            }
+            else {
+                *sptr++ = pl;
+                *sptr++ = pi - 1;
+                pl = pi + 1;
+            }
+            *psdepth++ = --cdepth;
+        }
+
+        /* insertion sort */
+        for (pi = pl + 1; pi <= pr; ++pi) {
+            vi = *pi;
+            vp = v + vi * len;
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && Tag::less(vp, v + (*pk) * len, len)) {
+                *pj-- = *pk--;
+            }
+            *pj = vi;
+        }
+    stack_pop:
+        if (sptr == stack) {
+            break;
+        }
+        pr = *(--sptr);
+        pl = *(--sptr);
+        cdepth = *(--psdepth);
+    }
+
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             GENERIC SORT                                **
+ *****************************************************************************
+ */
+
+NPY_NO_EXPORT int
+npy_quicksort(void *start, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    npy_intp elsize = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    char *vp;
+    char *pl = (char *)start;
+    char *pr = pl + (num - 1) * elsize;
+    char *stack[PYA_QS_STACK];
+    char **sptr = stack;
+    char *pm, *pi, *pj, *pk;
+    int depth[PYA_QS_STACK];
+    int *psdepth = depth;
+    int cdepth = npy_get_msb(num) * 2;
+
+    /* Items that have zero size don't make sense to sort */
+    if (elsize == 0) {
+        return 0;
+    }
+
+    vp = (char *)malloc(elsize);
+    if (vp == NULL) {
+        return -NPY_ENOMEM;
+    }
+
+    for (;;) {
+        if (NPY_UNLIKELY(cdepth < 0)) {
+            npy_heapsort(pl, (pr - pl) / elsize + 1, varr);
+            goto stack_pop;
+        }
+        while (pr - pl > SMALL_QUICKSORT * elsize) {
+            /* quicksort partition */
+            pm = pl + (((pr - pl) / elsize) >> 1) * elsize;
+            if (cmp(pm, pl, arr) < 0) {
+                GENERIC_SWAP(pm, pl, elsize);
+            }
+            if (cmp(pr, pm, arr) < 0) {
+                GENERIC_SWAP(pr, pm, elsize);
+            }
+            if (cmp(pm, pl, arr) < 0) {
+                GENERIC_SWAP(pm, pl, elsize);
+            }
+            GENERIC_COPY(vp, pm, elsize);
+            pi = pl;
+            pj = pr - elsize;
+            GENERIC_SWAP(pm, pj, elsize);
+            /*
+             * Generic comparisons may be buggy, so don't rely on the sentinels
+             * to keep the pointers from going out of bounds.
+             */
+            for (;;) {
+                do {
+                    pi += elsize;
+                } while (cmp(pi, vp, arr) < 0 && pi < pj);
+                do {
+                    pj -= elsize;
+                } while (cmp(vp, pj, arr) < 0 && pi < pj);
+                if (pi >= pj) {
+                    break;
+                }
+                GENERIC_SWAP(pi, pj, elsize);
+            }
+            pk = pr - elsize;
+            GENERIC_SWAP(pi, pk, elsize);
+            /* push larger partition on stack */
+            if (pi - pl < pr - pi) {
+                *sptr++ = pi + elsize;
+                *sptr++ = pr;
+                pr = pi - elsize;
+            }
+            else {
+                *sptr++ = pl;
+                *sptr++ = pi - elsize;
+                pl = pi + elsize;
+            }
+            *psdepth++ = --cdepth;
+        }
+
+        /* insertion sort */
+        for (pi = pl + elsize; pi <= pr; pi += elsize) {
+            GENERIC_COPY(vp, pi, elsize);
+            pj = pi;
+            pk = pi - elsize;
+            while (pj > pl && cmp(vp, pk, arr) < 0) {
+                GENERIC_COPY(pj, pk, elsize);
+                pj -= elsize;
+                pk -= elsize;
+            }
+            GENERIC_COPY(pj, vp, elsize);
+        }
+    stack_pop:
+        if (sptr == stack) {
+            break;
+        }
+        pr = *(--sptr);
+        pl = *(--sptr);
+        cdepth = *(--psdepth);
+    }
+
+    free(vp);
+    return 0;
+}
+
+NPY_NO_EXPORT int
+npy_aquicksort(void *vv, npy_intp *tosort, npy_intp num, void *varr)
+{
+    char *v = (char *)vv;
+    PyArrayObject *arr = (PyArrayObject *)varr;
+    npy_intp elsize = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    char *vp;
+    npy_intp *pl = tosort;
+    npy_intp *pr = tosort + num - 1;
+    npy_intp *stack[PYA_QS_STACK];
+    npy_intp **sptr = stack;
+    npy_intp *pm, *pi, *pj, *pk, vi;
+    int depth[PYA_QS_STACK];
+    int *psdepth = depth;
+    int cdepth = npy_get_msb(num) * 2;
+
+    /* Items that have zero size don't make sense to sort */
+    if (elsize == 0) {
+        return 0;
+    }
+
+    for (;;) {
+        if (NPY_UNLIKELY(cdepth < 0)) {
+            npy_aheapsort(vv, pl, pr - pl + 1, varr);
+            goto stack_pop;
+        }
+        while ((pr - pl) > SMALL_QUICKSORT) {
+            /* quicksort partition */
+            pm = pl + ((pr - pl) >> 1);
+            if (cmp(v + (*pm) * elsize, v + (*pl) * elsize, arr) < 0) {
+                INTP_SWAP(*pm, *pl);
+            }
+            if (cmp(v + (*pr) * elsize, v + (*pm) * elsize, arr) < 0) {
+                INTP_SWAP(*pr, *pm);
+            }
+            if (cmp(v + (*pm) * elsize, v + (*pl) * elsize, arr) < 0) {
+                INTP_SWAP(*pm, *pl);
+            }
+            vp = v + (*pm) * elsize;
+            pi = pl;
+            pj = pr - 1;
+            INTP_SWAP(*pm, *pj);
+            for (;;) {
+                do {
+                    ++pi;
+                } while (cmp(v + (*pi) * elsize, vp, arr) < 0 && pi < pj);
+                do {
+                    --pj;
+                } while (cmp(vp, v + (*pj) * elsize, arr) < 0 && pi < pj);
+                if (pi >= pj) {
+                    break;
+                }
+                INTP_SWAP(*pi, *pj);
+            }
+            pk = pr - 1;
+            INTP_SWAP(*pi, *pk);
+            /* push larger partition on stack */
+            if (pi - pl < pr - pi) {
+                *sptr++ = pi + 1;
+                *sptr++ = pr;
+                pr = pi - 1;
+            }
+            else {
+                *sptr++ = pl;
+                *sptr++ = pi - 1;
+                pl = pi + 1;
+            }
+            *psdepth++ = --cdepth;
+        }
+
+        /* insertion sort */
+        for (pi = pl + 1; pi <= pr; ++pi) {
+            vi = *pi;
+            vp = v + vi * elsize;
+            pj = pi;
+            pk = pi - 1;
+            while (pj > pl && cmp(vp, v + (*pk) * elsize, arr) < 0) {
+                *pj-- = *pk--;
+            }
+            *pj = vi;
+        }
+    stack_pop:
+        if (sptr == stack) {
+            break;
+        }
+        pr = *(--sptr);
+        pl = *(--sptr);
+        cdepth = *(--psdepth);
+    }
+
+    return 0;
+}
+
+/***************************************
+ * C > C++ dispatch
+ ***************************************/
+
+NPY_NO_EXPORT int
+quicksort_bool(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::bool_tag>((npy_bool *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_byte(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::byte_tag>((npy_byte *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_ubyte(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::ubyte_tag>((npy_ubyte *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_short(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::short_tag>((npy_short *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_ushort(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::ushort_tag>((npy_ushort *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_int(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::int_tag>((npy_int *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_uint(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::uint_tag>((npy_uint *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_long(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::long_tag>((npy_long *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_ulong(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::ulong_tag>((npy_ulong *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_longlong(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::longlong_tag>((npy_longlong *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_ulonglong(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::ulonglong_tag>((npy_ulonglong *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_half(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::half_tag>((npy_half *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_float(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::float_tag>((npy_float *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_double(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::double_tag>((npy_double *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_longdouble(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::longdouble_tag>((npy_longdouble *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_cfloat(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::cfloat_tag>((npy_cfloat *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_cdouble(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::cdouble_tag>((npy_cdouble *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_clongdouble(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::clongdouble_tag>((npy_clongdouble *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_datetime(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::datetime_tag>((npy_datetime *)start, n);
+}
+NPY_NO_EXPORT int
+quicksort_timedelta(void *start, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return quicksort_<npy::timedelta_tag>((npy_timedelta *)start, n);
+}
+
+NPY_NO_EXPORT int
+aquicksort_bool(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::bool_tag>((npy_bool *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_byte(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::byte_tag>((npy_byte *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_ubyte(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::ubyte_tag>((npy_ubyte *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_short(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::short_tag>((npy_short *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_ushort(void *vv, npy_intp *tosort, npy_intp n,
+                  void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::ushort_tag>((npy_ushort *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_int(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::int_tag>((npy_int *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_uint(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::uint_tag>((npy_uint *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_long(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::long_tag>((npy_long *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_ulong(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::ulong_tag>((npy_ulong *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_longlong(void *vv, npy_intp *tosort, npy_intp n,
+                    void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::longlong_tag>((npy_longlong *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_ulonglong(void *vv, npy_intp *tosort, npy_intp n,
+                     void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::ulonglong_tag>((npy_ulonglong *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_half(void *vv, npy_intp *tosort, npy_intp n, void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::half_tag>((npy_half *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_float(void *vv, npy_intp *tosort, npy_intp n,
+                 void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::float_tag>((npy_float *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_double(void *vv, npy_intp *tosort, npy_intp n,
+                  void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::double_tag>((npy_double *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_longdouble(void *vv, npy_intp *tosort, npy_intp n,
+                      void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::longdouble_tag>((npy_longdouble *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_cfloat(void *vv, npy_intp *tosort, npy_intp n,
+                  void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::cfloat_tag>((npy_cfloat *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_cdouble(void *vv, npy_intp *tosort, npy_intp n,
+                   void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::cdouble_tag>((npy_cdouble *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_clongdouble(void *vv, npy_intp *tosort, npy_intp n,
+                       void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::clongdouble_tag>((npy_clongdouble *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_datetime(void *vv, npy_intp *tosort, npy_intp n,
+                    void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::datetime_tag>((npy_datetime *)vv, tosort, n);
+}
+NPY_NO_EXPORT int
+aquicksort_timedelta(void *vv, npy_intp *tosort, npy_intp n,
+                     void *NPY_UNUSED(varr))
+{
+    return aquicksort_<npy::timedelta_tag>((npy_timedelta *)vv, tosort, n);
+}
+
+NPY_NO_EXPORT int
+quicksort_string(void *start, npy_intp n, void *varr)
+{
+    return string_quicksort_<npy::string_tag>((npy_char *)start, n, varr);
+}
+NPY_NO_EXPORT int
+quicksort_unicode(void *start, npy_intp n, void *varr)
+{
+    return string_quicksort_<npy::unicode_tag>((npy_ucs4 *)start, n, varr);
+}
+
+NPY_NO_EXPORT int
+aquicksort_string(void *vv, npy_intp *tosort, npy_intp n, void *varr)
+{
+    return string_aquicksort_<npy::string_tag>((npy_char *)vv, tosort, n,
+                                               varr);
+}
+NPY_NO_EXPORT int
+aquicksort_unicode(void *vv, npy_intp *tosort, npy_intp n, void *varr)
+{
+    return string_aquicksort_<npy::unicode_tag>((npy_ucs4 *)vv, tosort, n,
+                                                varr);
+}
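
The quicksort variants added above all follow the same introsort shape: median-of-3 partitioning, a depth budget of twice the most significant bit of the element count, a heapsort fallback when the budget runs out, and insertion sort for small ranges. A minimal standalone sketch of that shape follows. It is not the code from this patch: it recurses instead of keeping an explicit stack, compares with plain `operator<` rather than `Tag::less`, and the names `kSmallCutoff`, `floor_log2`, `insertion_sort`, `introsort_range` and `introsort` are hypothetical stand-ins chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical cutoff; it plays the role SMALL_QUICKSORT has in the patch.
constexpr std::ptrdiff_t kSmallCutoff = 16;

// floor(log2(n)) for n >= 1; a stand-in for npy_get_msb.
static int floor_log2(std::size_t n)
{
    int bits = 0;
    while (n >>= 1) {
        ++bits;
    }
    return bits;
}

// Sorts the inclusive range [lo, hi] by insertion.
template <typename T>
static void insertion_sort(T *lo, T *hi)
{
    for (T *pi = lo + 1; pi <= hi; ++pi) {
        T value = *pi;
        T *pj = pi;
        while (pj > lo && value < *(pj - 1)) {
            *pj = *(pj - 1);
            --pj;
        }
        *pj = value;
    }
}

// Sorts the inclusive range [lo, hi]; depth_budget bounds the number of
// partitioning steps before the heapsort fallback is taken.
template <typename T>
static void introsort_range(T *lo, T *hi, int depth_budget)
{
    assert(lo <= hi);
    while (hi - lo > kSmallCutoff) {
        if (depth_budget-- <= 0) {
            // Budget exhausted: heapsort the remaining range.
            std::make_heap(lo, hi + 1);
            std::sort_heap(lo, hi + 1);
            return;
        }
        // Median of 3: afterwards *lo <= *mid <= *hi, so both ends act
        // as sentinels for the unguarded scans below.
        T *mid = lo + (hi - lo) / 2;
        if (*mid < *lo) std::swap(*mid, *lo);
        if (*hi < *mid) std::swap(*hi, *mid);
        if (*mid < *lo) std::swap(*mid, *lo);
        T pivot = *mid;
        std::swap(*mid, *(hi - 1));
        T *pi = lo;
        T *pj = hi - 1;
        for (;;) {
            do { ++pi; } while (*pi < pivot);
            do { --pj; } while (pivot < *pj);
            if (pi >= pj) break;
            std::swap(*pi, *pj);
        }
        std::swap(*pi, *(hi - 1));  // move the pivot to its final slot
        // Recurse into the smaller side and keep looping on the larger one.
        if (pi - lo < hi - pi) {
            introsort_range(lo, pi - 1, depth_budget);
            lo = pi + 1;
        }
        else {
            introsort_range(pi + 1, hi, depth_budget);
            hi = pi - 1;
        }
    }
    insertion_sort(lo, hi);
}

template <typename T>
void introsort(std::vector<T> &values)
{
    if (values.size() < 2) {
        return;
    }
    T *lo = values.data();
    introsort_range(lo, lo + values.size() - 1,
                    2 * floor_log2(values.size()));
}
```

Calling `introsort(v)` on a `std::vector<int>` sorts it in place in ascending order. The sketch recurses on the smaller side to stay short, whereas the patch pushes the larger partition onto a fixed-size stack and continues with the smaller one; both keep the pending work bounded by the logarithm of the input size.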
diff --git a/numpy/core/src/npysort/radixsort.cpp b/numpy/core/src/npysort/radixsort.cpp

index 5393869eef44f0b35d2e3bfd946c2c8b6f3f2683..0e1a41c69cbe667ae0ae2a1efe273e68db7b11f4 100644 (file)
--- a/numpy/core/src/npysort/radixsort.cpp
+++ b/numpy/core/src/npysort/radixsort.cpp
@@ -4,7 +4,7 @@
  #include "npysort_common.h"
  
  #include "../common/numpy_tag.h"
-#include <stdlib.h>
+#include <cstdlib>
  #include <type_traits>
  
  /*
diff --git a/numpy/core/src/npysort/selection.c.src b/numpy/core/src/npysort/selection.c.src

deleted file mode 100644 (file)

index 0e285b3..0000000
--- a/numpy/core/src/npysort/selection.c.src
+++ /dev/null
@@ -1,419 +0,0 @@
-/* -*- c -*- */
-
-/*
- *
- * The code is loosely based on the quickselect from
- * Nicolas Devillard - 1998 public domain
- * http://ndevilla.free.fr/median/median/
- *
- * Quickselect with a median-of-3 pivot is usually the fastest,
- * but its worst case has quadratic complexity,
- * e.g. np.roll(np.arange(x), x / 2).
- * To avoid this, if it recurses too much it falls back to the
- * median-of-medians (groups of 5) pivot strategy, which is linear in the worst case.
- */
-
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "npy_sort.h"
-#include "npysort_common.h"
-#include "numpy/npy_math.h"
-#include "npy_partition.h"
-#include <stdlib.h>
-
-#define NOT_USED NPY_UNUSED(unused)
-
-
-/*
- *****************************************************************************
- **                            NUMERIC SORTS                                **
- *****************************************************************************
- */
-
-
-static NPY_INLINE void store_pivot(npy_intp pivot, npy_intp kth,
-                                   npy_intp * pivots, npy_intp * npiv)
-{
-    if (pivots == NULL) {
-        return;
-    }
-
-    /*
-     * If pivot is the requested kth, store it, overwriting other pivots if
-     * required. This must be done so iterative partition can work without
-     * manually shifting the lower data offset by kth each time.
-     */
-    if (pivot == kth && *npiv == NPY_MAX_PIVOT_STACK) {
-        pivots[*npiv - 1] = pivot;
-    }
-    /*
-     * we only need pivots larger than the current kth; smaller pivots are not
-     * useful as partitions on smaller kth would reorder the stored pivots
-     */
-    else if (pivot >= kth && *npiv < NPY_MAX_PIVOT_STACK) {
-        pivots[*npiv] = pivot;
-        (*npiv) += 1;
-    }
-}
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble#
- * #inexact = 0*11, 1*7#
- */
-
-static npy_intp
-amedian_of_median5_@suff@(@type@ *v, npy_intp* tosort, const npy_intp num,
-                         npy_intp * pivots,
-                         npy_intp * npiv);
-
-static npy_intp
-median_of_median5_@suff@(@type@ *v, const npy_intp num,
-                         npy_intp * pivots,
-                         npy_intp * npiv);
-
-/**begin repeat1
- * #name = , a#
- * #idx = , tosort#
- * #arg = 0, 1#
- */
-#if @arg@
-/* helper macros to avoid duplication of direct/indirect selection */
-#define IDX(x) tosort[x]
-#define SORTEE(x) tosort[x]
-#define SWAP INTP_SWAP
-#define MEDIAN3_SWAP(v, tosort, low, mid, high) \
-    amedian3_swap_@suff@(v, tosort, low, mid, high)
-#define MEDIAN5(v, tosort, subleft) \
-        amedian5_@suff@(v, tosort + subleft)
-#define UNGUARDED_PARTITION(v, tosort, pivot, ll, hh) \
-        aunguarded_partition_@suff@(v, tosort, pivot, ll, hh)
-#define INTROSELECT(v, tosort, num, kth, pivots, npiv) \
-        aintroselect_@suff@(v, tosort, nmed, nmed / 2, pivots, npiv, NULL)
-#define DUMBSELECT(v, tosort, left, num, kth) \
-        adumb_select_@suff@(v, tosort + left, num, kth)
-#else
-#define IDX(x) (x)
-#define SORTEE(x) v[x]
-#define SWAP @TYPE@_SWAP
-#define MEDIAN3_SWAP(v, tosort, low, mid, high) \
-    median3_swap_@suff@(v, low, mid, high)
-#define MEDIAN5(v, tosort, subleft) \
-        median5_@suff@(v + subleft)
-#define UNGUARDED_PARTITION(v, tosort, pivot, ll, hh) \
-        unguarded_partition_@suff@(v, pivot, ll, hh)
-#define INTROSELECT(v, tosort, num, kth, pivots, npiv) \
-        introselect_@suff@(v, nmed, nmed / 2, pivots, npiv, NULL)
-#define DUMBSELECT(v, tosort, left, num, kth) \
-        dumb_select_@suff@(v + left, num, kth)
-#endif
-
-
-/*
- * median of 3 pivot strategy
- * gets min and median and moves median to low and min to low + 1
- * for efficient partitioning, see unguarded_partition
- */
-static NPY_INLINE void
-@name@median3_swap_@suff@(@type@ * v,
-#if @arg@
-                          npy_intp * tosort,
-#endif
-                          npy_intp low, npy_intp mid, npy_intp high)
-{
-    if (@TYPE@_LT(v[IDX(high)], v[IDX(mid)]))
-        SWAP(SORTEE(high), SORTEE(mid));
-    if (@TYPE@_LT(v[IDX(high)], v[IDX(low)]))
-        SWAP(SORTEE(high), SORTEE(low));
-    /* move pivot to low */
-    if (@TYPE@_LT(v[IDX(low)], v[IDX(mid)]))
-        SWAP(SORTEE(low), SORTEE(mid));
-    /* move 3-lowest element to low + 1 */
-    SWAP(SORTEE(mid), SORTEE(low + 1));
-}
-
-
-/* select index of median of five elements */
-static npy_intp @name@median5_@suff@(
-#if @arg@
-                                    const @type@ * v, npy_intp * tosort
-#else
-                                    @type@ * v
-#endif
-                                    )
-{
-    /* could be optimized as we only need the index (no swaps) */
-    if (@TYPE@_LT(v[IDX(1)], v[IDX(0)])) {
-        SWAP(SORTEE(1), SORTEE(0));
-    }
-    if (@TYPE@_LT(v[IDX(4)], v[IDX(3)])) {
-        SWAP(SORTEE(4), SORTEE(3));
-    }
-    if (@TYPE@_LT(v[IDX(3)], v[IDX(0)])) {
-        SWAP(SORTEE(3), SORTEE(0));
-    }
-    if (@TYPE@_LT(v[IDX(4)], v[IDX(1)])) {
-        SWAP(SORTEE(4), SORTEE(1));
-    }
-    if (@TYPE@_LT(v[IDX(2)], v[IDX(1)])) {
-        SWAP(SORTEE(2), SORTEE(1));
-    }
-    if (@TYPE@_LT(v[IDX(3)], v[IDX(2)])) {
-        if (@TYPE@_LT(v[IDX(3)], v[IDX(1)])) {
-            return 1;
-        }
-        else {
-            return 3;
-        }
-    }
-    else {
-        /* v[1] and v[2] swapped into order above */
-        return 2;
-    }
-}
-
-
-/*
- * partition and return the index where the pivot belongs
- * the data must have the following property to avoid bound checks:
- *                  ll ... hh
- * lower-than-pivot [x x x x] larger-than-pivot
- */
-static NPY_INLINE void
-@name@unguarded_partition_@suff@(@type@ * v,
-#if @arg@
-                                 npy_intp * tosort,
-#endif
-                                 const @type@ pivot,
-                                 npy_intp * ll, npy_intp * hh)
-{
-    for (;;) {
-        do (*ll)++; while (@TYPE@_LT(v[IDX(*ll)], pivot));
-        do (*hh)--; while (@TYPE@_LT(pivot, v[IDX(*hh)]));
-
-        if (*hh < *ll)
-            break;
-
-        SWAP(SORTEE(*ll), SORTEE(*hh));
-    }
-}
-
-
-/*
- * select median of median of blocks of 5
- * if used as partition pivot it splits the range into at least 30%/70%
- * allowing linear-time worst-case quickselect
- */
-static npy_intp
-@name@median_of_median5_@suff@(@type@ *v,
-#if @arg@
-                               npy_intp* tosort,
-#endif
-                               const npy_intp num,
-                               npy_intp * pivots,
-                               npy_intp * npiv)
-{
-    npy_intp i, subleft;
-    npy_intp right = num - 1;
-    npy_intp nmed = (right + 1) / 5;
-    for (i = 0, subleft = 0; i < nmed; i++, subleft += 5) {
-        npy_intp m = MEDIAN5(v, tosort, subleft);
-        SWAP(SORTEE(subleft + m), SORTEE(i));
-    }
-
-    if (nmed > 2)
-        INTROSELECT(v, tosort, nmed, nmed / 2, pivots, npiv);
-    return nmed / 2;
-}
-
-
-/*
- * N^2 selection, fast only for very small kth
- * useful for close multiple partitions
- * (e.g. even element median, interpolating percentile)
- */
-static int
-@name@dumb_select_@suff@(@type@ *v,
-#if @arg@
-                         npy_intp * tosort,
-#endif
-                         npy_intp num, npy_intp kth)
-{
-    npy_intp i;
-    for (i = 0; i <= kth; i++) {
-        npy_intp minidx = i;
-        @type@ minval = v[IDX(i)];
-        npy_intp k;
-        for (k = i + 1; k < num; k++) {
-            if (@TYPE@_LT(v[IDX(k)], minval)) {
-                minidx = k;
-                minval = v[IDX(k)];
-            }
-        }
-        SWAP(SORTEE(i), SORTEE(minidx));
-    }
-
-    return 0;
-}
-
-
-/*
- * iterative median-of-3 quickselect with cutoff to median-of-medians-of-5
- * receives stack of already computed pivots in v to minimize the
- * partition size where kth is searched in
- *
- * area that needs partitioning in [...]
- * kth 0:  [8  7  6  5  4  3  2  1  0] -> med3 partitions elements [4, 2, 0]
- *          0  1  2  3  4  8  7  5  6  -> pop requested kth -> stack [4, 2]
- * kth 3:   0  1  2 [3] 4  8  7  5  6  -> stack [4]
- * kth 5:   0  1  2  3  4 [8  7  5  6] -> stack [6]
- * kth 8:   0  1  2  3  4  5  6 [8  7] -> stack []
- *
- */
-NPY_NO_EXPORT int
-@name@introselect_@suff@(@type@ *v,
-#if @arg@
-                         npy_intp* tosort,
-#endif
-                         npy_intp num, npy_intp kth,
-                         npy_intp * pivots,
-                         npy_intp * npiv,
-                         void *NOT_USED)
-{
-    npy_intp low  = 0;
-    npy_intp high = num - 1;
-    int depth_limit;
-
-    if (npiv == NULL)
-        pivots = NULL;
-
-    while (pivots != NULL && *npiv > 0) {
-        if (pivots[*npiv - 1] > kth) {
-            /* pivot larger than kth set it as upper bound */
-            high = pivots[*npiv - 1] - 1;
-            break;
-        }
-        else if (pivots[*npiv - 1] == kth) {
-            /* kth was already found in a previous iteration -> done */
-            return 0;
-        }
-
-        low = pivots[*npiv - 1] + 1;
-
-        /* pop from stack */
-        *npiv -= 1;
-    }
-
-    /*
-     * use a faster O(n*kth) algorithm for very small kth
-     * e.g. for interpolating percentile
-     */
-    if (kth - low < 3) {
-        DUMBSELECT(v, tosort, low, high - low + 1, kth - low);
-        store_pivot(kth, kth, pivots, npiv);
-        return 0;
-    }
-    // Parenthesis around @inexact@ tells clang dead code as intentional
-    else if ((@inexact@) && kth == num - 1) {
-        /* useful to check if NaN present via partition(d, (x, -1)) */
-        npy_intp k;
-        npy_intp maxidx = low;
-        @type@ maxval = v[IDX(low)];
-        for (k = low + 1; k < num; k++) {
-            if (!@TYPE@_LT(v[IDX(k)], maxval)) {
-                maxidx = k;
-                maxval = v[IDX(k)];
-            }
-        }
-        SWAP(SORTEE(kth), SORTEE(maxidx));
-        return 0;
-    }
-
-    depth_limit = npy_get_msb(num) * 2;
-
-    /* guarantee three elements */
-    for (;low + 1 < high;) {
-        npy_intp       ll = low + 1;
-        npy_intp       hh = high;
-
-        /*
-         * if we aren't making sufficient progress with median of 3
-         * fall back to median-of-median5 pivot for linear worst case
-         * med3 for small sizes is required to do unguarded partition
-         */
-        if (depth_limit > 0 || hh - ll < 5) {
-            const npy_intp mid = low + (high - low) / 2;
-            /* median of 3 pivot strategy,
-             * swapping for efficient partition */
-            MEDIAN3_SWAP(v, tosort, low, mid, high);
-        }
-        else {
-            npy_intp mid;
-            /* FIXME: always use pivots to optimize this iterative partition */
-#if @arg@
-            mid = ll + amedian_of_median5_@suff@(v, tosort + ll, hh - ll, NULL, NULL);
-#else
-            mid = ll + median_of_median5_@suff@(v + ll, hh - ll, NULL, NULL);
-#endif
-            SWAP(SORTEE(mid), SORTEE(low));
-            /* adapt for the larger partition than med3 pivot */
-            ll--;
-            hh++;
-        }
-
-        depth_limit--;
-
-        /*
-         * find place to put pivot (in low):
-         * previous swapping removes need for bound checks
-         * pivot 3-lowest [x x x] 3-highest
-         */
-        UNGUARDED_PARTITION(v, tosort, v[IDX(low)], &ll, &hh);
-
-        /* move pivot into position */
-        SWAP(SORTEE(low), SORTEE(hh));
-
-        /* kth pivot stored later */
-        if (hh != kth) {
-            store_pivot(hh, kth, pivots, npiv);
-        }
-
-        if (hh >= kth)
-            high = hh - 1;
-        if (hh <= kth)
-            low = ll;
-    }
-
-    /* two elements */
-    if (high == low + 1) {
-        if (@TYPE@_LT(v[IDX(high)], v[IDX(low)])) {
-            SWAP(SORTEE(high), SORTEE(low))
-        }
-    }
-    store_pivot(kth, kth, pivots, npiv);
-
-    return 0;
-}
-
-
-#undef IDX
-#undef SWAP
-#undef SORTEE
-#undef MEDIAN3_SWAP
-#undef MEDIAN5
-#undef UNGUARDED_PARTITION
-#undef INTROSELECT
-#undef DUMBSELECT
-/**end repeat1**/
-
-/**end repeat**/
diff --git a/numpy/core/src/npysort/selection.cpp b/numpy/core/src/npysort/selection.cpp

new file mode 100644 (file)

index 0000000..ebe188b
--- /dev/null
+++ b/numpy/core/src/npysort/selection.cpp
@@ -0,0 +1,492 @@
+/* -*- c -*- */
+
+/*
+ *
+ * The code is loosely based on the quickselect from
+ * Nicolas Devillard - 1998 public domain
+ * http://ndevilla.free.fr/median/median/
+ *
+ * Quick select with median of 3 pivot is usually the fastest,
+ * but the worst case scenario can be quadratic complexity,
+ * e.g. np.roll(np.arange(x), x / 2)
+ * To avoid this if it recurses too much it falls back to the
+ * worst case linear median of median of group 5 pivot strategy.
+ */
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "numpy/npy_math.h"
+
+#include "npy_partition.h"
+#include "npy_sort.h"
+#include "npysort_common.h"
+#include "numpy_tag.h"
+
+#include <array>
+#include <cstdlib>
+#include <utility>
+
+#define NOT_USED NPY_UNUSED(unused)
+
+template <typename Tag, bool arg, typename type>
+NPY_NO_EXPORT int
+introselect_(type *v, npy_intp *tosort, npy_intp num, npy_intp kth,
+             npy_intp *pivots, npy_intp *npiv);
+
+/*
+ *****************************************************************************
+ **                            NUMERIC SORTS                                **
+ *****************************************************************************
+ */
+
+static inline void
+store_pivot(npy_intp pivot, npy_intp kth, npy_intp *pivots, npy_intp *npiv)
+{
+    if (pivots == NULL) {
+        return;
+    }
+
+    /*
+     * If pivot is the requested kth store it, overwriting other pivots if
+     * required. This must be done so iterative partition can work without
+     * manually shifting lower data offset by kth each time
+     */
+    if (pivot == kth && *npiv == NPY_MAX_PIVOT_STACK) {
+        pivots[*npiv - 1] = pivot;
+    }
+    /*
+     * we only need pivots larger than the current kth; smaller pivots are
+     * not useful as partitions on smaller kth would reorder the stored pivots
+     */
+    else if (pivot >= kth && *npiv < NPY_MAX_PIVOT_STACK) {
+        pivots[*npiv] = pivot;
+        (*npiv) += 1;
+    }
+}
+
+template <typename type, bool arg>
+struct Sortee {
+    type *v;
+    Sortee(type *v, npy_intp *) : v(v) {}
+    type &operator()(npy_intp i) const { return v[i]; }
+};
+
+template <bool arg>
+struct Idx {
+    Idx(npy_intp *) {}
+    npy_intp operator()(npy_intp i) const { return i; }
+};
+
+template <typename type>
+struct Sortee<type, true> {
+    npy_intp *tosort;
+    Sortee(type *, npy_intp *tosort) : tosort(tosort) {}
+    npy_intp &operator()(npy_intp i) const { return tosort[i]; }
+};
+
+template <>
+struct Idx<true> {
+    npy_intp *tosort;
+    Idx(npy_intp *tosort) : tosort(tosort) {}
+    npy_intp operator()(npy_intp i) const { return tosort[i]; }
+};
+
+template <class T>
+static constexpr bool
+inexact()
+{
+    return !std::is_integral<T>::value;
+}
+
+/*
+ * median of 3 pivot strategy
+ * gets min and median and moves median to low and min to low + 1
+ * for efficient partitioning, see unguarded_partition
+ */
+template <typename Tag, bool arg, typename type>
+static inline void
+median3_swap_(type *v, npy_intp *tosort, npy_intp low, npy_intp mid,
+              npy_intp high)
+{
+    Idx<arg> idx(tosort);
+    Sortee<type, arg> sortee(v, tosort);
+
+    if (Tag::less(v[idx(high)], v[idx(mid)])) {
+        std::swap(sortee(high), sortee(mid));
+    }
+    if (Tag::less(v[idx(high)], v[idx(low)])) {
+        std::swap(sortee(high), sortee(low));
+    }
+    /* move pivot to low */
+    if (Tag::less(v[idx(low)], v[idx(mid)])) {
+        std::swap(sortee(low), sortee(mid));
+    }
+    /* move 3-lowest element to low + 1 */
+    std::swap(sortee(mid), sortee(low + 1));
+}
+
+/* select index of median of five elements */
+template <typename Tag, bool arg, typename type>
+static npy_intp
+median5_(type *v, npy_intp *tosort)
+{
+    Idx<arg> idx(tosort);
+    Sortee<type, arg> sortee(v, tosort);
+
+    /* could be optimized as we only need the index (no swaps) */
+    if (Tag::less(v[idx(1)], v[idx(0)])) {
+        std::swap(sortee(1), sortee(0));
+    }
+    if (Tag::less(v[idx(4)], v[idx(3)])) {
+        std::swap(sortee(4), sortee(3));
+    }
+    if (Tag::less(v[idx(3)], v[idx(0)])) {
+        std::swap(sortee(3), sortee(0));
+    }
+    if (Tag::less(v[idx(4)], v[idx(1)])) {
+        std::swap(sortee(4), sortee(1));
+    }
+    if (Tag::less(v[idx(2)], v[idx(1)])) {
+        std::swap(sortee(2), sortee(1));
+    }
+    if (Tag::less(v[idx(3)], v[idx(2)])) {
+        if (Tag::less(v[idx(3)], v[idx(1)])) {
+            return 1;
+        }
+        else {
+            return 3;
+        }
+    }
+    else {
+        /* v[1] and v[2] swapped into order above */
+        return 2;
+    }
+}
+
+/*
+ * partition and return the index where the pivot belongs
+ * the data must have following property to avoid bound checks:
+ *                  ll ... hh
+ * lower-than-pivot [x x x x] larger-than-pivot
+ */
+template <typename Tag, bool arg, typename type>
+static inline void
+unguarded_partition_(type *v, npy_intp *tosort, const type pivot, npy_intp *ll,
+                     npy_intp *hh)
+{
+    Idx<arg> idx(tosort);
+    Sortee<type, arg> sortee(v, tosort);
+
+    for (;;) {
+        do {
+            (*ll)++;
+        } while (Tag::less(v[idx(*ll)], pivot));
+        do {
+            (*hh)--;
+        } while (Tag::less(pivot, v[idx(*hh)]));
+
+        if (*hh < *ll) {
+            break;
+        }
+
+        std::swap(sortee(*ll), sortee(*hh));
+    }
+}
+
+/*
+ * select median of median of blocks of 5
+ * if used as partition pivot it splits the range into at least 30%/70%
+ * allowing linear-time worst-case quickselect
+ */
+template <typename Tag, bool arg, typename type>
+static npy_intp
+median_of_median5_(type *v, npy_intp *tosort, const npy_intp num,
+                   npy_intp *pivots, npy_intp *npiv)
+{
+    Idx<arg> idx(tosort);
+    Sortee<type, arg> sortee(v, tosort);
+
+    npy_intp i, subleft;
+    npy_intp right = num - 1;
+    npy_intp nmed = (right + 1) / 5;
+    for (i = 0, subleft = 0; i < nmed; i++, subleft += 5) {
+        npy_intp m = median5_<Tag, arg>(v + (arg ? 0 : subleft),
+                                        tosort + (arg ? subleft : 0));
+        std::swap(sortee(subleft + m), sortee(i));
+    }
+
+    if (nmed > 2) {
+        introselect_<Tag, arg>(v, tosort, nmed, nmed / 2, pivots, npiv);
+    }
+    return nmed / 2;
+}
+
+/*
+ * N^2 selection, fast only for very small kth
+ * useful for close multiple partitions
+ * (e.g. even element median, interpolating percentile)
+ */
+template <typename Tag, bool arg, typename type>
+static int
+dumb_select_(type *v, npy_intp *tosort, npy_intp num, npy_intp kth)
+{
+    Idx<arg> idx(tosort);
+    Sortee<type, arg> sortee(v, tosort);
+
+    npy_intp i;
+    for (i = 0; i <= kth; i++) {
+        npy_intp minidx = i;
+        type minval = v[idx(i)];
+        npy_intp k;
+        for (k = i + 1; k < num; k++) {
+            if (Tag::less(v[idx(k)], minval)) {
+                minidx = k;
+                minval = v[idx(k)];
+            }
+        }
+        std::swap(sortee(i), sortee(minidx));
+    }
+
+    return 0;
+}
+
+/*
+ * iterative median of 3 quickselect with cutoff to median-of-medians-of5
+ * receives stack of already computed pivots in v to minimize the
+ * partition size in which kth is searched
+ *
+ * area that needs partitioning in [...]
+ * kth 0:  [8  7  6  5  4  3  2  1  0] -> med3 partitions elements [4, 2, 0]
+ *          0  1  2  3  4  8  7  5  6  -> pop requested kth -> stack [4, 2]
+ * kth 3:   0  1  2 [3] 4  8  7  5  6  -> stack [4]
+ * kth 5:   0  1  2  3  4 [8  7  5  6] -> stack [6]
+ * kth 8:   0  1  2  3  4  5  6 [8  7] -> stack []
+ *
+ */
+template <typename Tag, bool arg, typename type>
+NPY_NO_EXPORT int
+introselect_(type *v, npy_intp *tosort, npy_intp num, npy_intp kth,
+             npy_intp *pivots, npy_intp *npiv)
+{
+    Idx<arg> idx(tosort);
+    Sortee<type, arg> sortee(v, tosort);
+
+    npy_intp low = 0;
+    npy_intp high = num - 1;
+    int depth_limit;
+
+    if (npiv == NULL) {
+        pivots = NULL;
+    }
+
+    while (pivots != NULL && *npiv > 0) {
+        if (pivots[*npiv - 1] > kth) {
+            /* pivot larger than kth: set it as upper bound */
+            high = pivots[*npiv - 1] - 1;
+            break;
+        }
+        else if (pivots[*npiv - 1] == kth) {
+            /* kth was already found in a previous iteration -> done */
+            return 0;
+        }
+
+        low = pivots[*npiv - 1] + 1;
+
+        /* pop from stack */
+        *npiv -= 1;
+    }
+
+    /*
+     * use a faster O(n*kth) algorithm for very small kth
+     * e.g. for interpolating percentile
+     */
+    if (kth - low < 3) {
+        dumb_select_<Tag, arg>(v + (arg ? 0 : low), tosort + (arg ? low : 0),
+                               high - low + 1, kth - low);
+        store_pivot(kth, kth, pivots, npiv);
+        return 0;
+    }
+
+    else if (inexact<type>() && kth == num - 1) {
+        /* useful to check if NaN present via partition(d, (x, -1)) */
+        npy_intp k;
+        npy_intp maxidx = low;
+        type maxval = v[idx(low)];
+        for (k = low + 1; k < num; k++) {
+            if (!Tag::less(v[idx(k)], maxval)) {
+                maxidx = k;
+                maxval = v[idx(k)];
+            }
+        }
+        std::swap(sortee(kth), sortee(maxidx));
+        return 0;
+    }
+
+    depth_limit = npy_get_msb(num) * 2;
+
+    /* guarantee three elements */
+    for (; low + 1 < high;) {
+        npy_intp ll = low + 1;
+        npy_intp hh = high;
+
+        /*
+         * if we aren't making sufficient progress with median of 3
+         * fall back to median-of-median5 pivot for linear worst case
+         * med3 for small sizes is required to do unguarded partition
+         */
+        if (depth_limit > 0 || hh - ll < 5) {
+            const npy_intp mid = low + (high - low) / 2;
+            /* median of 3 pivot strategy,
+             * swapping for efficient partition */
+            median3_swap_<Tag, arg>(v, tosort, low, mid, high);
+        }
+        else {
+            npy_intp mid;
+            /* FIXME: always use pivots to optimize this iterative partition */
+            mid = ll + median_of_median5_<Tag, arg>(v + (arg ? 0 : ll),
+                                                    tosort + (arg ? ll : 0),
+                                                    hh - ll, NULL, NULL);
+            std::swap(sortee(mid), sortee(low));
+            /* adapt for the larger partition than med3 pivot */
+            ll--;
+            hh++;
+        }
+
+        depth_limit--;
+
+        /*
+         * find place to put pivot (in low):
+         * previous swapping removes need for bound checks
+         * pivot 3-lowest [x x x] 3-highest
+         */
+        unguarded_partition_<Tag, arg>(v, tosort, v[idx(low)], &ll, &hh);
+
+        /* move pivot into position */
+        std::swap(sortee(low), sortee(hh));
+
+        /* kth pivot stored later */
+        if (hh != kth) {
+            store_pivot(hh, kth, pivots, npiv);
+        }
+
+        if (hh >= kth) {
+            high = hh - 1;
+        }
+        if (hh <= kth) {
+            low = ll;
+        }
+    }
+
+    /* two elements */
+    if (high == low + 1) {
+        if (Tag::less(v[idx(high)], v[idx(low)])) {
+            std::swap(sortee(high), sortee(low));
+        }
+    }
+    store_pivot(kth, kth, pivots, npiv);
+
+    return 0;
+}
+
+/*
+ *****************************************************************************
+ **                             GENERATOR                                   **
+ *****************************************************************************
+ */
+
+template <typename Tag>
+static int
+introselect_noarg(void *v, npy_intp num, npy_intp kth, npy_intp *pivots,
+                  npy_intp *npiv, void *)
+{
+    return introselect_<Tag, false>((typename Tag::type *)v, nullptr, num, kth,
+                                    pivots, npiv);
+}
+
+template <typename Tag>
+static int
+introselect_arg(void *v, npy_intp *tosort, npy_intp num, npy_intp kth,
+                npy_intp *pivots, npy_intp *npiv, void *)
+{
+    return introselect_<Tag, true>((typename Tag::type *)v, tosort, num, kth,
+                                   pivots, npiv);
+}
+
+struct arg_map {
+    int typenum;
+    PyArray_PartitionFunc *part[NPY_NSELECTS];
+    PyArray_ArgPartitionFunc *argpart[NPY_NSELECTS];
+};
+
+template <class... Tags>
+static constexpr std::array<arg_map, sizeof...(Tags)>
+make_partition_map(npy::taglist<Tags...>)
+{
+    return std::array<arg_map, sizeof...(Tags)>{
+            arg_map{Tags::type_value, &introselect_noarg<Tags>,
+                    &introselect_arg<Tags>}...};
+}
+
+struct partition_t {
+    using taglist =
+            npy::taglist<npy::bool_tag, npy::byte_tag, npy::ubyte_tag,
+                         npy::short_tag, npy::ushort_tag, npy::int_tag,
+                         npy::uint_tag, npy::long_tag, npy::ulong_tag,
+                         npy::longlong_tag, npy::ulonglong_tag, npy::half_tag,
+                         npy::float_tag, npy::double_tag, npy::longdouble_tag,
+                         npy::cfloat_tag, npy::cdouble_tag,
+                         npy::clongdouble_tag>;
+
+    static constexpr std::array<arg_map, taglist::size> map =
+            make_partition_map(taglist());
+};
+constexpr std::array<arg_map, partition_t::taglist::size> partition_t::map;
+
+static inline PyArray_PartitionFunc *
+_get_partition_func(int type, NPY_SELECTKIND which)
+{
+    npy_intp i;
+    npy_intp ntypes = partition_t::map.size();
+
+    if (which >= NPY_NSELECTS) {
+        return NULL;
+    }
+    for (i = 0; i < ntypes; i++) {
+        if (type == partition_t::map[i].typenum) {
+            return partition_t::map[i].part[which];
+        }
+    }
+    return NULL;
+}
+
+static inline PyArray_ArgPartitionFunc *
+_get_argpartition_func(int type, NPY_SELECTKIND which)
+{
+    npy_intp i;
+    npy_intp ntypes = partition_t::map.size();
+
+    for (i = 0; i < ntypes; i++) {
+        if (type == partition_t::map[i].typenum) {
+            return partition_t::map[i].argpart[which];
+        }
+    }
+    return NULL;
+}
+
+/*
+ *****************************************************************************
+ **                            C INTERFACE                                  **
+ *****************************************************************************
+ */
+extern "C" {
+NPY_NO_EXPORT PyArray_PartitionFunc *
+get_partition_func(int type, NPY_SELECTKIND which)
+{
+    return _get_partition_func(type, which);
+}
+NPY_NO_EXPORT PyArray_ArgPartitionFunc *
+get_argpartition_func(int type, NPY_SELECTKIND which)
+{
+    return _get_argpartition_func(type, which);
+}
+}
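
Note on the selection strategy in selection.cpp above: the file comment describes a median-of-3 quickselect that falls back to a median-of-medians-of-5 pivot once a depth budget of roughly 2*log2(n) is used up. The standalone sketch below illustrates that control flow only. It is not the NumPy implementation: the names `kth_smallest`, `introselect_range` and `mom5_pivot` are hypothetical, it works on a `std::vector` instead of raw pointers and tags, and it uses a plain Lomuto partition instead of the unguarded partition and pivot stack in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Forward declaration: the fallback pivot helper recurses into the selector.
template <typename T>
static void introselect_range(std::vector<T> &v, std::size_t lo,
                              std::size_t hi, std::size_t kth);

// Hypothetical helper: index of a median-of-medians-of-5 pivot in [lo, hi].
// The median of every complete group of 5 is moved to the front of the
// range, then the median of those medians is selected recursively.
template <typename T>
static std::size_t mom5_pivot(std::vector<T> &v, std::size_t lo,
                              std::size_t hi)
{
    std::size_t nmed = 0;
    for (std::size_t i = lo; i + 4 <= hi; i += 5) {
        std::sort(v.begin() + i, v.begin() + i + 5);
        std::swap(v[lo + nmed], v[i + 2]);
        ++nmed;
    }
    if (nmed == 0) {
        return lo + (hi - lo) / 2;
    }
    introselect_range(v, lo, lo + nmed - 1, lo + nmed / 2);
    return lo + nmed / 2;
}

// Places the kth smallest element of v[lo..hi] (inclusive) at index kth.
template <typename T>
static void introselect_range(std::vector<T> &v, std::size_t lo,
                              std::size_t hi, std::size_t kth)
{
    // Depth budget of 2 * floor(log2(n)), as in the patch.
    int depth_limit = 0;
    for (std::size_t n = hi - lo + 1; n > 1; n >>= 1) {
        depth_limit += 2;
    }

    while (lo < hi) {
        std::size_t p;
        if (depth_limit > 0 || hi - lo < 5) {
            // Median of 3: order v[lo] <= v[mid] <= v[hi], pivot is v[mid].
            const std::size_t mid = lo + (hi - lo) / 2;
            if (v[mid] < v[lo]) std::swap(v[mid], v[lo]);
            if (v[hi] < v[lo]) std::swap(v[hi], v[lo]);
            if (v[hi] < v[mid]) std::swap(v[hi], v[mid]);
            p = mid;
        }
        else {
            // Too little progress: switch to the slower but safer pivot.
            p = mom5_pivot(v, lo, hi);
        }
        depth_limit -= 1;

        // Lomuto partition around v[p] (simplification, see note below).
        std::swap(v[p], v[hi]);
        std::size_t store = lo;
        for (std::size_t i = lo; i < hi; ++i) {
            if (v[i] < v[hi]) {
                std::swap(v[i], v[store]);
                ++store;
            }
        }
        std::swap(v[store], v[hi]);

        if (store == kth) {
            return;
        }
        if (kth < store) {
            hi = store - 1;
        }
        else {
            lo = store + 1;
        }
    }
}

// Returns the kth smallest element (0-based); works on a copy of the input.
template <typename T>
T kth_smallest(std::vector<T> v, std::size_t kth)
{
    assert(kth < v.size());
    introselect_range(v, 0, v.size() - 1, kth);
    return v[kth];
}
```

For example, `kth_smallest(std::vector<int>{8, 7, 6, 5, 4, 3, 2, 1, 0}, 3)` selects the element that would sit at index 3 after sorting. The Lomuto partition keeps the sketch short but makes little progress on runs of equal elements, so the linear worst-case bound claimed for the real code should not be assumed for this simplified version.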
diff --git a/numpy/core/src/npysort/timsort.c.src b/numpy/core/src/npysort/timsort.c.src

deleted file mode 100644 (file)

index 5298f5a..0000000
--- a/numpy/core/src/npysort/timsort.c.src
+++ /dev/null
@@ -1,2572 +0,0 @@
-/* -*- c -*- */
-
-/*
- * The purpose of this module is to add faster sort functions
- * that are type-specific.  This is done by altering the
- * function table for the builtin descriptors.
- *
- * These sorting functions are copied almost directly from numarray
- * with a few modifications (complex comparisons compare the imaginary
- * part if the real parts are equal, for example), and the names
- * are changed.
- *
- * The original sorting code is due to Charles R. Harris who wrote
- * it for numarray.
- */
-
-/*
- * Quick sort is usually the fastest, but the worst case scenario can
- * be slower than the merge and heap sorts.  The merge sort requires
- * extra memory and so for large arrays may not be useful.
- *
- * The merge sort is *stable*, meaning that equal components
- * are unmoved from their entry versions, so it can be used to
- * implement lexigraphic sorting on multiple keys.
- *
- * The heap sort is included for completeness.
- */
-
-
-/* For details of Timsort, refer to
- * https://github.com/python/cpython/blob/3.7/Objects/listsort.txt
- */
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "npy_sort.h"
-#include "npysort_common.h"
-#include <stdlib.h>
-
-/* enough for 32 * 1.618 ** 128 elements */
-#define TIMSORT_STACK_SIZE 128
-
-
-
-static npy_intp compute_min_run(npy_intp num)
-{
-    npy_intp r = 0;
-
-    while (64 < num) {
-        r |= num & 1;
-        num >>= 1;
-    }
-
-    return num + r;
-}
-
-typedef struct {
-    npy_intp s; /* start pointer */
-    npy_intp l; /* length */
-} run;
-
-
-/* buffer for argsort. Declared here to avoid multiple declarations. */
-typedef struct {
-    npy_intp *pw;
-    npy_intp size;
-} buffer_intp;
-
-
-/* buffer method */
-static NPY_INLINE int
-resize_buffer_intp(buffer_intp *buffer, npy_intp new_size)
-{
-    if (new_size <= buffer->size) {
-        return 0;
-    }
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        buffer->pw = malloc(new_size * sizeof(npy_intp));
-    } else {
-        buffer->pw = realloc(buffer->pw, new_size * sizeof(npy_intp));
-    }
-
-    buffer->size = new_size;
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        return -NPY_ENOMEM;
-    } else {
-        return 0;
-    }
-}
-
-/*
- *****************************************************************************
- **                            NUMERIC SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
- *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
- *         CFLOAT, CDOUBLE, CLONGDOUBLE, DATETIME, TIMEDELTA#
- * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, half, float, double, longdouble,
- *         cfloat, cdouble, clongdouble, datetime, timedelta#
- * #type = npy_bool, npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
- *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_ushort, npy_float, npy_double, npy_longdouble, npy_cfloat,
- *         npy_cdouble, npy_clongdouble, npy_datetime, npy_timedelta#
- */
-
-
-typedef struct {
-    @type@ * pw;
-    npy_intp size;
-} buffer_@suff@;
-
-
-static NPY_INLINE int
-resize_buffer_@suff@(buffer_@suff@ *buffer, npy_intp new_size)
-{
-    if (new_size <= buffer->size) {
-        return 0;
-    }
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        buffer->pw = malloc(new_size * sizeof(@type@));
-    } else {
-        buffer->pw = realloc(buffer->pw, new_size * sizeof(@type@));
-    }
-
-    buffer->size = new_size;
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        return -NPY_ENOMEM;
-    } else {
-        return 0;
-    }
-}
-
-
-static npy_intp
-count_run_@suff@(@type@ *arr, npy_intp l, npy_intp num, npy_intp minrun)
-{
-    npy_intp sz;
-    @type@ vc, *pl, *pi, *pj, *pr;
-
-    if (NPY_UNLIKELY(num - l == 1)) {
-        return 1;
-    }
-
-    pl = arr + l;
-
-    /* (not strictly) ascending sequence */
-    if (!@TYPE@_LT(*(pl + 1), *pl)) {
-        for (pi = pl + 1; pi < arr + num - 1 && !@TYPE@_LT(*(pi + 1), *pi); ++pi) {
-        }
-    } else {  /* (strictly) descending sequence */
-        for (pi = pl + 1; pi < arr + num - 1 && @TYPE@_LT(*(pi + 1), *pi); ++pi) {
-        }
-
-        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
-            @TYPE@_SWAP(*pj, *pr);
-        }
-    }
-
-    ++pi;
-    sz = pi - pl;
-
-    if (sz < minrun) {
-        if (l + minrun < num) {
-            sz = minrun;
-        } else {
-            sz = num - l;
-        }
-
-        pr = pl + sz;
-
-        /* insertion sort */
-        for (; pi < pr; ++pi) {
-            vc = *pi;
-            pj = pi;
-
-            while (pl < pj && @TYPE@_LT(vc, *(pj - 1))) {
-                *pj = *(pj - 1);
-                --pj;
-            }
-
-            *pj = vc;
-        }
-    }
-
-    return sz;
-}
-
-
-/* when the left part of the array (p1) is smaller, copy p1 to buffer
- * and merge from left to right
- */
-static void
-merge_left_@suff@(@type@ *p1, npy_intp l1, @type@ *p2, npy_intp l2,
-                  @type@ *p3)
-{
-    @type@ *end = p2 + l2;
-    memcpy(p3, p1, sizeof(@type@) * l1);
-    /* first element must be in p2 otherwise skipped in the caller */
-    *p1++ = *p2++;
-
-    while (p1 < p2 && p2 < end) {
-        if (@TYPE@_LT(*p2, *p3)) {
-            *p1++ = *p2++;
-        } else {
-            *p1++ = *p3++;
-        }
-    }
-
-    if (p1 != p2) {
-        memcpy(p1, p3, sizeof(@type@) * (p2 - p1));
-    }
-}
-
-
-/* when the right part of the array (p2) is smaller, copy p2 to buffer
- * and merge from right to left
- */
-static void
-merge_right_@suff@(@type@ *p1, npy_intp l1, @type@ *p2, npy_intp l2,
-                   @type@ *p3)
-{
-    npy_intp ofs;
-    @type@ *start = p1 - 1;
-    memcpy(p3, p2, sizeof(@type@) * l2);
-    p1 += l1 - 1;
-    p2 += l2 - 1;
-    p3 += l2 - 1;
-    /* first element must be in p1 otherwise skipped in the caller */
-    *p2-- = *p1--;
-
-    while (p1 < p2 && start < p1) {
-        if (@TYPE@_LT(*p3, *p1)) {
-            *p2-- = *p1--;
-        } else {
-            *p2-- = *p3--;
-        }
-    }
-
-    if (p1 != p2) {
-        ofs = p2 - start;
-        memcpy(start + 1, p3 - ofs + 1, sizeof(@type@) * ofs);
-    }
-}
-
-
-/* Note: the naming convention of gallop functions are different from that of
- * CPython. For example, here gallop_right means gallop from left toward right,
- * whereas in CPython gallop_right means gallop
- * and find the right most element among equal elements
- */
-static npy_intp
-gallop_right_@suff@(const @type@ *arr, const npy_intp size, const @type@ key)
-{
-    npy_intp last_ofs, ofs, m;
-
-    if (@TYPE@_LT(key, arr[0])) {
-        return 0;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size; /* arr[ofs] is never accessed */
-            break;
-        }
-
-        if (@TYPE@_LT(key, arr[ofs])) {
-            break;
-        } else {
-            last_ofs = ofs;
-            /* ofs = 1, 3, 7, 15... */
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[last_ofs] <= key < arr[ofs] */
-    while (last_ofs + 1 < ofs) {
-        m = last_ofs + ((ofs - last_ofs) >> 1);
-
-        if (@TYPE@_LT(key, arr[m])) {
-            ofs = m;
-        } else {
-            last_ofs = m;
-        }
-    }
-
-    /* now that arr[ofs-1] <= key < arr[ofs] */
-    return ofs;
-}
-
-
-static npy_intp
-gallop_left_@suff@(const @type@ *arr, const npy_intp size, const @type@ key)
-{
-    npy_intp last_ofs, ofs, l, m, r;
-
-    if (@TYPE@_LT(arr[size - 1], key)) {
-        return size;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size;
-            break;
-        }
-
-        if (@TYPE@_LT(arr[size - ofs - 1], key)) {
-            break;
-        } else {
-            last_ofs = ofs;
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[size-ofs-1] < key <= arr[size-last_ofs-1] */
-    l = size - ofs - 1;
-    r = size - last_ofs - 1;
-
-    while (l + 1 < r) {
-        m = l + ((r - l) >> 1);
-
-        if (@TYPE@_LT(arr[m], key)) {
-            l = m;
-        } else {
-            r = m;
-        }
-    }
-
-    /* now that arr[r-1] < key <= arr[r] */
-    return r;
-}
-
-
-static int
-merge_at_@suff@(@type@ *arr, const run *stack, const npy_intp at,
-                buffer_@suff@ *buffer)
-{
-    int ret;
-    npy_intp s1, l1, s2, l2, k;
-    @type@ *p1, *p2;
-    s1 = stack[at].s;
-    l1 = stack[at].l;
-    s2 = stack[at + 1].s;
-    l2 = stack[at + 1].l;
-    /* arr[s2] belongs to arr[s1+k].
-     * if try to comment this out for debugging purpose, remember
-     * in the merging process the first element is skipped
-     */
-    k = gallop_right_@suff@(arr + s1, l1, arr[s2]);
-
-    if (l1 == k) {
-        /* already sorted */
-        return 0;
-    }
-
-    p1 = arr + s1 + k;
-    l1 -= k;
-    p2 = arr + s2;
-    /* arr[s2-1] belongs to arr[s2+l2] */
-    l2 = gallop_left_@suff@(arr + s2, l2, arr[s2 - 1]);
-
-    if (l2 < l1) {
-        ret = resize_buffer_@suff@(buffer, l2);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        merge_right_@suff@(p1, l1, p2, l2, buffer->pw);
-    } else {
-        ret = resize_buffer_@suff@(buffer, l1);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        merge_left_@suff@(p1, l1, p2, l2, buffer->pw);
-    }
-
-    return 0;
-}
-
-
-static int
-try_collapse_@suff@(@type@ *arr, run *stack, npy_intp *stack_ptr,
-                    buffer_@suff@ *buffer)
-{
-    int ret;
-    npy_intp A, B, C, top;
-    top = *stack_ptr;
-
-    while (1 < top) {
-        B = stack[top - 2].l;
-        C = stack[top - 1].l;
-
-        if ((2 < top && stack[top - 3].l <= B + C) ||
-                (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
-            A = stack[top - 3].l;
-
-            if (A <= C) {
-                ret = merge_at_@suff@(arr, stack, top - 3, buffer);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 3].l += B;
-                stack[top - 2] = stack[top - 1];
-                --top;
-            } else {
-                ret = merge_at_@suff@(arr, stack, top - 2, buffer);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 2].l += C;
-                --top;
-            }
-        } else if (1 < top && B <= C) {
-            ret = merge_at_@suff@(arr, stack, top - 2, buffer);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += C;
-            --top;
-        } else {
-            break;
-        }
-    }
-
-    *stack_ptr = top;
-    return 0;
-}
-
-static int
-force_collapse_@suff@(@type@ *arr, run *stack, npy_intp *stack_ptr,
-                      buffer_@suff@ *buffer)
-{
-    int ret;
-    npy_intp top = *stack_ptr;
-
-    while (2 < top) {
-        if (stack[top - 3].l <= stack[top - 1].l) {
-            ret = merge_at_@suff@(arr, stack, top - 3, buffer);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 3].l += stack[top - 2].l;
-            stack[top - 2] = stack[top - 1];
-            --top;
-        } else {
-            ret = merge_at_@suff@(arr, stack, top - 2, buffer);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += stack[top - 1].l;
-            --top;
-        }
-    }
-
-    if (1 < top) {
-        ret = merge_at_@suff@(arr, stack, top - 2, buffer);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-timsort_@suff@(void *start, npy_intp num, void *NPY_UNUSED(varr))
-{
-    int ret;
-    npy_intp l, n, stack_ptr, minrun;
-    buffer_@suff@ buffer;
-    run stack[TIMSORT_STACK_SIZE];
-    buffer.pw = NULL;
-    buffer.size = 0;
-    stack_ptr = 0;
-    minrun = compute_min_run(num);
-
-    for (l = 0; l < num;) {
-        n = count_run_@suff@(start, l, num, minrun);
-        stack[stack_ptr].s = l;
-        stack[stack_ptr].l = n;
-        ++stack_ptr;
-        ret = try_collapse_@suff@(start, stack, &stack_ptr, &buffer);
-
-        if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-        l += n;
-    }
-
-    ret = force_collapse_@suff@(start, stack, &stack_ptr, &buffer);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    ret = 0;
-cleanup:
-
-    free(buffer.pw);
-
-    return ret;
-}
-
-
-/* argsort */
-
-
-static npy_intp
-acount_run_@suff@(@type@ *arr, npy_intp *tosort, npy_intp l, npy_intp num,
-                  npy_intp minrun)
-{
-    npy_intp sz;
-    @type@ vc;
-    npy_intp vi;
-    npy_intp *pl, *pi, *pj, *pr;
-
-    if (NPY_UNLIKELY(num - l == 1)) {
-        return 1;
-    }
-
-    pl = tosort + l;
-
-    /* (not strictly) ascending sequence */
-    if (!@TYPE@_LT(arr[*(pl + 1)], arr[*pl])) {
-        for (pi = pl + 1; pi < tosort + num - 1
-                && !@TYPE@_LT(arr[*(pi + 1)], arr[*pi]); ++pi) {
-        }
-    } else {  /* (strictly) descending sequence */
-        for (pi = pl + 1; pi < tosort + num - 1
-                && @TYPE@_LT(arr[*(pi + 1)], arr[*pi]); ++pi) {
-        }
-
-        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
-            INTP_SWAP(*pj, *pr);
-        }
-    }
-
-    ++pi;
-    sz = pi - pl;
-
-    if (sz < minrun) {
-        if (l + minrun < num) {
-            sz = minrun;
-        } else {
-            sz = num - l;
-        }
-
-        pr = pl + sz;
-
-        /* insertion sort */
-        for (; pi < pr; ++pi) {
-            vi = *pi;
-            vc = arr[*pi];
-            pj = pi;
-
-            while (pl < pj && @TYPE@_LT(vc, arr[*(pj - 1)])) {
-                *pj = *(pj - 1);
-                --pj;
-            }
-
-            *pj = vi;
-        }
-    }
-
-    return sz;
-}
-
-
-static npy_intp
-agallop_right_@suff@(const @type@ *arr, const npy_intp *tosort,
-                     const npy_intp size, const @type@ key)
-{
-    npy_intp last_ofs, ofs, m;
-
-    if (@TYPE@_LT(key, arr[tosort[0]])) {
-        return 0;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size; /* arr[ofs] is never accessed */
-            break;
-        }
-
-        if (@TYPE@_LT(key, arr[tosort[ofs]])) {
-            break;
-        } else {
-            last_ofs = ofs;
-            /* ofs = 1, 3, 7, 15... */
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[tosort[last_ofs]] <= key < arr[tosort[ofs]] */
-    while (last_ofs + 1 < ofs) {
-        m = last_ofs + ((ofs - last_ofs) >> 1);
-
-        if (@TYPE@_LT(key, arr[tosort[m]])) {
-            ofs = m;
-        } else {
-            last_ofs = m;
-        }
-    }
-
-    /* now that arr[tosort[ofs-1]] <= key < arr[tosort[ofs]] */
-    return ofs;
-}
-
-
-
-static npy_intp
-agallop_left_@suff@(const @type@ *arr, const npy_intp *tosort,
-                    const npy_intp size, const @type@ key)
-{
-    npy_intp last_ofs, ofs, l, m, r;
-
-    if (@TYPE@_LT(arr[tosort[size - 1]], key)) {
-        return size;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size;
-            break;
-        }
-
-        if (@TYPE@_LT(arr[tosort[size - ofs - 1]], key)) {
-            break;
-        } else {
-            last_ofs = ofs;
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[tosort[size-ofs-1]] < key <= arr[tosort[size-last_ofs-1]] */
-    l = size - ofs - 1;
-    r = size - last_ofs - 1;
-
-    while (l + 1 < r) {
-        m = l + ((r - l) >> 1);
-
-        if (@TYPE@_LT(arr[tosort[m]], key)) {
-            l = m;
-        } else {
-            r = m;
-        }
-    }
-
-    /* now that arr[tosort[r-1]] < key <= arr[tosort[r]] */
-    return r;
-}
-
-
-static void
-amerge_left_@suff@(@type@ *arr, npy_intp *p1, npy_intp l1, npy_intp *p2,
-                   npy_intp l2,
-                   npy_intp *p3)
-{
-    npy_intp *end = p2 + l2;
-    memcpy(p3, p1, sizeof(npy_intp) * l1);
-    /* first element must be in p2 otherwise skipped in the caller */
-    *p1++ = *p2++;
-
-    while (p1 < p2 && p2 < end) {
-        if (@TYPE@_LT(arr[*p2], arr[*p3])) {
-            *p1++ = *p2++;
-        } else {
-            *p1++ = *p3++;
-        }
-    }
-
-    if (p1 != p2) {
-        memcpy(p1, p3, sizeof(npy_intp) * (p2 - p1));
-    }
-}
-
-
-static void
-amerge_right_@suff@(@type@ *arr, npy_intp* p1, npy_intp l1, npy_intp *p2,
-                    npy_intp l2,
-                    npy_intp *p3)
-{
-    npy_intp ofs;
-    npy_intp *start = p1 - 1;
-    memcpy(p3, p2, sizeof(npy_intp) * l2);
-    p1 += l1 - 1;
-    p2 += l2 - 1;
-    p3 += l2 - 1;
-    /* first element must be in p1 otherwise skipped in the caller */
-    *p2-- = *p1--;
-
-    while (p1 < p2 && start < p1) {
-        if (@TYPE@_LT(arr[*p3], arr[*p1])) {
-            *p2-- = *p1--;
-        } else {
-            *p2-- = *p3--;
-        }
-    }
-
-    if (p1 != p2) {
-        ofs = p2 - start;
-        memcpy(start + 1, p3 - ofs + 1, sizeof(npy_intp) * ofs);
-    }
-}
-
-
-static int
-amerge_at_@suff@(@type@ *arr, npy_intp *tosort, const run *stack,
-                 const npy_intp at,
-                 buffer_intp *buffer)
-{
-    int ret;
-    npy_intp s1, l1, s2, l2, k;
-    npy_intp *p1, *p2;
-    s1 = stack[at].s;
-    l1 = stack[at].l;
-    s2 = stack[at + 1].s;
-    l2 = stack[at + 1].l;
-    /* tosort[s2] belongs to tosort[s1+k] */
-    k = agallop_right_@suff@(arr, tosort + s1, l1, arr[tosort[s2]]);
-
-    if (l1 == k) {
-        /* already sorted */
-        return 0;
-    }
-
-    p1 = tosort + s1 + k;
-    l1 -= k;
-    p2 = tosort + s2;
-    /* tosort[s2-1] belongs to tosort[s2+l2] */
-    l2 = agallop_left_@suff@(arr, tosort + s2, l2, arr[tosort[s2 - 1]]);
-
-    if (l2 < l1) {
-        ret = resize_buffer_intp(buffer, l2);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        amerge_right_@suff@(arr, p1, l1, p2, l2, buffer->pw);
-    } else {
-        ret = resize_buffer_intp(buffer, l1);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        amerge_left_@suff@(arr, p1, l1, p2, l2, buffer->pw);
-    }
-
-    return 0;
-}
-
-
-static int
-atry_collapse_@suff@(@type@ *arr, npy_intp *tosort, run *stack,
-                     npy_intp *stack_ptr,
-                     buffer_intp *buffer)
-{
-    int ret;
-    npy_intp A, B, C, top;
-    top = *stack_ptr;
-
-    while (1 < top) {
-        B = stack[top - 2].l;
-        C = stack[top - 1].l;
-
-        if ((2 < top && stack[top - 3].l <= B + C) ||
-                (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
-            A = stack[top - 3].l;
-
-            if (A <= C) {
-                ret = amerge_at_@suff@(arr, tosort, stack, top - 3, buffer);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 3].l += B;
-                stack[top - 2] = stack[top - 1];
-                --top;
-            } else {
-                ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 2].l += C;
-                --top;
-            }
-        } else if (1 < top && B <= C) {
-            ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += C;
-            --top;
-        } else {
-            break;
-        }
-    }
-
-    *stack_ptr = top;
-    return 0;
-}
-
-
-static int
-aforce_collapse_@suff@(@type@ *arr, npy_intp *tosort, run *stack,
-                       npy_intp *stack_ptr,
-                       buffer_intp *buffer)
-{
-    int ret;
-    npy_intp top = *stack_ptr;
-
-    while (2 < top) {
-        if (stack[top - 3].l <= stack[top - 1].l) {
-            ret = amerge_at_@suff@(arr, tosort, stack, top - 3, buffer);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 3].l += stack[top - 2].l;
-            stack[top - 2] = stack[top - 1];
-            --top;
-        } else {
-            ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += stack[top - 1].l;
-            --top;
-        }
-    }
-
-    if (1 < top) {
-        ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-atimsort_@suff@(void *v, npy_intp *tosort, npy_intp num,
-                void *NPY_UNUSED(varr))
-{
-    int ret;
-    npy_intp l, n, stack_ptr, minrun;
-    buffer_intp buffer;
-    run stack[TIMSORT_STACK_SIZE];
-    buffer.pw = NULL;
-    buffer.size = 0;
-    stack_ptr = 0;
-    minrun = compute_min_run(num);
-
-    for (l = 0; l < num;) {
-        n = acount_run_@suff@(v, tosort, l, num, minrun);
-        stack[stack_ptr].s = l;
-        stack[stack_ptr].l = n;
-        ++stack_ptr;
-        ret = atry_collapse_@suff@(v, tosort, stack, &stack_ptr, &buffer);
-
-        if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-        l += n;
-    }
-
-    ret = aforce_collapse_@suff@(v, tosort, stack, &stack_ptr, &buffer);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    ret = 0;
-cleanup:
-
-    if (buffer.pw != NULL) {
-        free(buffer.pw);
-    }
-
-    return ret;
-}
-
-/**end repeat**/
-
-
-
-/* For string sorts and generic sort, element comparisons are very expensive,
- * and the cost of insertion sort (O(N**2) comparisons) clearly hurts.
- * Implementing binary insertion sort, and possibly gallop mode during merging,
- * would likely improve performance. As a temporary workaround we use a shorter
- * minimum run length to reduce the cost of insertion sort.
- */
-
-static npy_intp compute_min_run_short(npy_intp num)
-{
-    npy_intp r = 0;
-
-    while (16 < num) {
-        r |= num & 1;
-        num >>= 1;
-    }
-
-    return num + r;
-}
-
-/*
- *****************************************************************************
- **                             STRING SORTS                                **
- *****************************************************************************
- */
-
-
-/**begin repeat
- *
- * #TYPE = STRING, UNICODE#
- * #suff = string, unicode#
- * #type = npy_char, npy_ucs4#
- */
-
-
-typedef struct {
-    @type@ * pw;
-    npy_intp size;
-    size_t len;
-} buffer_@suff@;
-
-
-static NPY_INLINE int
-resize_buffer_@suff@(buffer_@suff@ *buffer, npy_intp new_size)
-{
-    if (new_size <= buffer->size) {
-        return 0;
-    }
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        buffer->pw = malloc(sizeof(@type@) * new_size * buffer->len);
-    } else {
-        buffer->pw = realloc(buffer->pw,  sizeof(@type@) * new_size * buffer->len);
-    }
-
-    buffer->size = new_size;
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        return -NPY_ENOMEM;
-    } else {
-        return 0;
-    }
-}
-
-
-static npy_intp
-count_run_@suff@(@type@ *arr, npy_intp l, npy_intp num, npy_intp minrun,
-                 @type@ *vp, size_t len)
-{
-    npy_intp sz;
-    @type@ *pl, *pi, *pj, *pr;
-
-    if (NPY_UNLIKELY(num - l == 1)) {
-        return 1;
-    }
-
-    pl = arr + l * len;
-
-    /* (not strictly) ascending sequence */
-    if (!@TYPE@_LT(pl + len, pl, len)) {
-        for (pi = pl + len; pi < arr + (num - 1) * len
-                && !@TYPE@_LT(pi + len, pi, len); pi += len) {
-        }
-    } else {  /* (strictly) descending sequence */
-        for (pi = pl + len; pi < arr + (num - 1) * len
-                && @TYPE@_LT(pi + len, pi, len); pi += len) {
-        }
-
-        for (pj = pl, pr = pi; pj < pr; pj += len, pr -= len) {
-            @TYPE@_SWAP(pj, pr, len);
-        }
-    }
-
-    pi += len;
-    sz = (pi - pl) / len;
-
-    if (sz < minrun) {
-        if (l + minrun < num) {
-            sz = minrun;
-        } else {
-            sz = num - l;
-        }
-
-        pr = pl + sz * len;
-
-        /* insertion sort */
-        for (; pi < pr; pi += len) {
-            @TYPE@_COPY(vp, pi, len);
-            pj = pi;
-
-            while (pl < pj && @TYPE@_LT(vp, pj - len, len)) {
-                @TYPE@_COPY(pj, pj - len, len);
-                pj -= len;
-            }
-
-            @TYPE@_COPY(pj, vp, len);
-        }
-    }
-
-    return sz;
-}
-
-
-static npy_intp
-gallop_right_@suff@(const @type@ *arr, const npy_intp size,
-                    const @type@ *key, size_t len)
-{
-    npy_intp last_ofs, ofs, m;
-
-    if (@TYPE@_LT(key, arr, len)) {
-        return 0;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size; /* arr[ofs] is never accessed */
-            break;
-        }
-
-        if (@TYPE@_LT(key, arr + ofs * len, len)) {
-            break;
-        } else {
-            last_ofs = ofs;
-            /* ofs = 1, 3, 7, 15... */
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[last_ofs*len] <= key < arr[ofs*len] */
-    while (last_ofs + 1 < ofs) {
-        m = last_ofs + ((ofs - last_ofs) >> 1);
-
-        if (@TYPE@_LT(key, arr + m * len, len)) {
-            ofs = m;
-        } else {
-            last_ofs = m;
-        }
-    }
-
-    /* now that arr[(ofs-1)*len] <= key < arr[ofs*len] */
-    return ofs;
-}
-
-
-
-static npy_intp
-gallop_left_@suff@(const @type@ *arr, const npy_intp size, const @type@ *key,
-                   size_t len)
-{
-    npy_intp last_ofs, ofs, l, m, r;
-
-    if (@TYPE@_LT(arr + (size - 1) * len, key, len)) {
-        return size;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size;
-            break;
-        }
-
-        if (@TYPE@_LT(arr + (size - ofs - 1) * len, key, len)) {
-            break;
-        } else {
-            last_ofs = ofs;
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[(size-ofs-1)*len] < key <= arr[(size-last_ofs-1)*len] */
-    l = size - ofs - 1;
-    r = size - last_ofs - 1;
-
-    while (l + 1 < r) {
-        m = l + ((r - l) >> 1);
-
-        if (@TYPE@_LT(arr + m * len, key, len)) {
-            l = m;
-        } else {
-            r = m;
-        }
-    }
-
-    /* now that arr[(r-1)*len] < key <= arr[r*len] */
-    return r;
-}
-
-
-static void
-merge_left_@suff@(@type@ *p1, npy_intp l1, @type@ *p2, npy_intp l2,
-                  @type@ *p3, size_t len)
-{
-    @type@ *end = p2 + l2 * len;
-    memcpy(p3, p1, sizeof(@type@) * l1 * len);
-    /* first element must be in p2 otherwise skipped in the caller */
-    @TYPE@_COPY(p1, p2, len);
-    p1 += len;
-    p2 += len;
-
-    while (p1 < p2 && p2 < end) {
-        if (@TYPE@_LT(p2, p3, len)) {
-            @TYPE@_COPY(p1, p2, len);
-            p1 += len;
-            p2 += len;
-        } else {
-            @TYPE@_COPY(p1, p3, len);
-            p1 += len;
-            p3 += len;
-        }
-    }
-
-    if (p1 != p2) {
-        memcpy(p1, p3, sizeof(@type@) * (p2 - p1));
-    }
-}
-
-
-static void
-merge_right_@suff@(@type@ *p1, npy_intp l1, @type@ *p2, npy_intp l2,
-                   @type@ *p3, size_t len)
-{
-    npy_intp ofs;
-    @type@ *start = p1 - len;
-    memcpy(p3, p2, sizeof(@type@) * l2 * len);
-    p1 += (l1 - 1) * len;
-    p2 += (l2 - 1) * len;
-    p3 += (l2 - 1) * len;
-    /* first element must be in p1 otherwise skipped in the caller */
-    @TYPE@_COPY(p2, p1, len);
-    p2 -= len;
-    p1 -= len;
-
-    while (p1 < p2 && start < p1) {
-        if (@TYPE@_LT(p3, p1, len)) {
-            @TYPE@_COPY(p2, p1, len);
-            p2 -= len;
-            p1 -= len;
-        } else {
-            @TYPE@_COPY(p2, p3, len);
-            p2 -= len;
-            p3 -= len;
-        }
-    }
-
-    if (p1 != p2) {
-        ofs = p2 - start;
-        memcpy(start + len, p3 - ofs + len, sizeof(@type@) * ofs);
-    }
-}
-
-
-static int
-merge_at_@suff@(@type@ *arr, const run *stack, const npy_intp at,
-                buffer_@suff@ *buffer, size_t len)
-{
-    int ret;
-    npy_intp s1, l1, s2, l2, k;
-    @type@ *p1, *p2;
-    s1 = stack[at].s;
-    l1 = stack[at].l;
-    s2 = stack[at + 1].s;
-    l2 = stack[at + 1].l;
-    /* arr[s2] belongs to arr[s1+k] */
-    @TYPE@_COPY(buffer->pw, arr + s2 * len, len);
-    k = gallop_right_@suff@(arr + s1 * len, l1, buffer->pw, len);
-
-    if (l1 == k) {
-        /* already sorted */
-        return 0;
-    }
-
-    p1 = arr + (s1 + k) * len;
-    l1 -= k;
-    p2 = arr + s2 * len;
-    /* arr[s2-1] belongs to arr[s2+l2] */
-    @TYPE@_COPY(buffer->pw, arr + (s2 - 1) * len, len);
-    l2 = gallop_left_@suff@(arr + s2 * len, l2, buffer->pw, len);
-
-    if (l2 < l1) {
-        ret = resize_buffer_@suff@(buffer, l2);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        merge_right_@suff@(p1, l1, p2, l2, buffer->pw, len);
-    } else {
-        ret = resize_buffer_@suff@(buffer, l1);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        merge_left_@suff@(p1, l1, p2, l2, buffer->pw, len);
-    }
-
-    return 0;
-}
-
-
-static int
-try_collapse_@suff@(@type@ *arr, run *stack, npy_intp *stack_ptr,
-                    buffer_@suff@ *buffer, size_t len)
-{
-    int ret;
-    npy_intp A, B, C, top;
-    top = *stack_ptr;
-
-    while (1 < top) {
-        B = stack[top - 2].l;
-        C = stack[top - 1].l;
-
-        if ((2 < top && stack[top - 3].l <= B + C) ||
-                (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
-            A = stack[top - 3].l;
-
-            if (A <= C) {
-                ret = merge_at_@suff@(arr, stack, top - 3, buffer, len);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 3].l += B;
-                stack[top - 2] = stack[top - 1];
-                --top;
-            } else {
-                ret = merge_at_@suff@(arr, stack, top - 2, buffer, len);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 2].l += C;
-                --top;
-            }
-        } else if (1 < top && B <= C) {
-            ret = merge_at_@suff@(arr, stack, top - 2, buffer, len);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += C;
-            --top;
-        } else {
-            break;
-        }
-    }
-
-    *stack_ptr = top;
-    return 0;
-}
-
-
-static int
-force_collapse_@suff@(@type@ *arr, run *stack, npy_intp *stack_ptr,
-                      buffer_@suff@ *buffer, size_t len)
-{
-    int ret;
-    npy_intp top = *stack_ptr;
-
-    while (2 < top) {
-        if (stack[top - 3].l <= stack[top - 1].l) {
-            ret = merge_at_@suff@(arr, stack, top - 3, buffer, len);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 3].l += stack[top - 2].l;
-            stack[top - 2] = stack[top - 1];
-            --top;
-        } else {
-            ret = merge_at_@suff@(arr, stack, top - 2, buffer, len);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += stack[top - 1].l;
-            --top;
-        }
-    }
-
-    if (1 < top) {
-        ret = merge_at_@suff@(arr, stack, top - 2, buffer, len);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-timsort_@suff@(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t elsize = PyArray_ITEMSIZE(arr);
-    size_t len = elsize / sizeof(@type@);
-    int ret;
-    npy_intp l, n, stack_ptr, minrun;
-    run stack[TIMSORT_STACK_SIZE];
-    buffer_@suff@ buffer;
-
-    /* Items that have zero size don't make sense to sort */
-    if (len == 0) {
-        return 0;
-    }
-
-    buffer.pw = NULL;
-    buffer.size = 0;
-    buffer.len = len;
-    stack_ptr = 0;
-    minrun = compute_min_run_short(num);
-    /* used for insertion sort and gallop key */
-    ret = resize_buffer_@suff@(&buffer, 1);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    for (l = 0; l < num;) {
-        n = count_run_@suff@(start, l, num, minrun, buffer.pw, len);
-        /* s and l are element counts; they are scaled by len where used */
-        stack[stack_ptr].s = l;
-        stack[stack_ptr].l = n;
-        ++stack_ptr;
-        ret = try_collapse_@suff@(start, stack, &stack_ptr, &buffer, len);
-
-        if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-        l += n;
-    }
-
-    ret = force_collapse_@suff@(start, stack, &stack_ptr, &buffer, len);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    ret = 0;
-
-cleanup:
-    if (buffer.pw != NULL) {
-        free(buffer.pw);
-    }
-    return ret;
-}
-
-
-/* argsort */
-
-
-static npy_intp
-acount_run_@suff@(@type@ *arr, npy_intp *tosort, npy_intp l, npy_intp num,
-                  npy_intp minrun, size_t len)
-{
-    npy_intp sz;
-    npy_intp vi;
-    npy_intp *pl, *pi, *pj, *pr;
-
-    if (NPY_UNLIKELY(num - l == 1)) {
-        return 1;
-    }
-
-    pl = tosort + l;
-
-    /* (not strictly) ascending sequence */
-    if (!@TYPE@_LT(arr + (*(pl + 1)) * len, arr + (*pl) * len, len)) {
-        for (pi = pl + 1; pi < tosort + num - 1
-                && !@TYPE@_LT(arr + (*(pi + 1)) * len, arr + (*pi) * len, len); ++pi) {
-        }
-    } else {  /* (strictly) descending sequence */
-        for (pi = pl + 1; pi < tosort + num - 1
-                && @TYPE@_LT(arr + (*(pi + 1)) * len, arr + (*pi) * len, len); ++pi) {
-        }
-
-        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
-            INTP_SWAP(*pj, *pr);
-        }
-    }
-
-    ++pi;
-    sz = pi - pl;
-
-    if (sz < minrun) {
-        if (l + minrun < num) {
-            sz = minrun;
-        } else {
-            sz = num - l;
-        }
-
-        pr = pl + sz;
-
-        /* insertion sort */
-        for (; pi < pr; ++pi) {
-            vi = *pi;
-            pj = pi;
-
-            while (pl < pj && @TYPE@_LT(arr + vi * len, arr + (*(pj - 1)) * len, len)) {
-                *pj = *(pj - 1);
-                --pj;
-            }
-
-            *pj = vi;
-        }
-    }
-
-    return sz;
-}
-
-
-static npy_intp
-agallop_left_@suff@(const @type@ *arr, const npy_intp *tosort,
-                    const npy_intp size, const @type@ *key, size_t len)
-{
-    npy_intp last_ofs, ofs, l, m, r;
-
-    if (@TYPE@_LT(arr + tosort[size - 1] * len, key, len)) {
-        return size;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size;
-            break;
-        }
-
-        if (@TYPE@_LT(arr + tosort[size - ofs - 1] * len, key, len)) {
-            break;
-        } else {
-            last_ofs = ofs;
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[tosort[size-ofs-1]*len] < key <= arr[tosort[size-last_ofs-1]*len] */
-    l = size - ofs - 1;
-    r = size - last_ofs - 1;
-
-    while (l + 1 < r) {
-        m = l + ((r - l) >> 1);
-
-        if (@TYPE@_LT(arr + tosort[m] * len, key, len)) {
-            l = m;
-        } else {
-            r = m;
-        }
-    }
-
-    /* now that arr[tosort[r-1]*len] < key <= arr[tosort[r]*len] */
-    return r;
-}
-
-
-static npy_intp
-agallop_right_@suff@(const @type@ *arr, const npy_intp *tosort,
-                     const npy_intp size, const @type@ *key, size_t len)
-{
-    npy_intp last_ofs, ofs, m;
-
-    if (@TYPE@_LT(key, arr + tosort[0] * len, len)) {
-        return 0;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size; /* arr[ofs] is never accessed */
-            break;
-        }
-
-        if (@TYPE@_LT(key, arr + tosort[ofs] * len, len)) {
-            break;
-        } else {
-            last_ofs = ofs;
-            /* ofs = 1, 3, 7, 15... */
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[tosort[last_ofs]*len] <= key < arr[tosort[ofs]*len] */
-    while (last_ofs + 1 < ofs) {
-        m = last_ofs + ((ofs - last_ofs) >> 1);
-
-        if (@TYPE@_LT(key, arr + tosort[m] * len, len)) {
-            ofs = m;
-        } else {
-            last_ofs = m;
-        }
-    }
-
-    /* now that arr[tosort[ofs-1]*len] <= key < arr[tosort[ofs]*len] */
-    return ofs;
-}
-
-
-
-static void
-amerge_left_@suff@(@type@ *arr, npy_intp *p1, npy_intp l1, npy_intp *p2,
-                   npy_intp l2, npy_intp *p3, size_t len)
-{
-    npy_intp *end = p2 + l2;
-    memcpy(p3, p1, sizeof(npy_intp) * l1);
-    /* first element must be in p2 otherwise skipped in the caller */
-    *p1++ = *p2++;
-
-    while (p1 < p2 && p2 < end) {
-        if (@TYPE@_LT(arr + (*p2) * len, arr + (*p3) * len, len)) {
-            *p1++ = *p2++;
-        } else {
-            *p1++ = *p3++;
-        }
-    }
-
-    if (p1 != p2) {
-        memcpy(p1, p3, sizeof(npy_intp) * (p2 - p1));
-    }
-}
-
-
-static void
-amerge_right_@suff@(@type@ *arr, npy_intp* p1, npy_intp l1, npy_intp *p2,
-                    npy_intp l2, npy_intp *p3, size_t len)
-{
-    npy_intp ofs;
-    npy_intp *start = p1 - 1;
-    memcpy(p3, p2, sizeof(npy_intp) * l2);
-    p1 += l1 - 1;
-    p2 += l2 - 1;
-    p3 += l2 - 1;
-    /* first element must be in p1 otherwise skipped in the caller */
-    *p2-- = *p1--;
-
-    while (p1 < p2 && start < p1) {
-        if (@TYPE@_LT(arr + (*p3) * len, arr + (*p1) * len, len)) {
-            *p2-- = *p1--;
-        } else {
-            *p2-- = *p3--;
-        }
-    }
-
-    if (p1 != p2) {
-        ofs = p2 - start;
-        memcpy(start + 1, p3 - ofs + 1, sizeof(npy_intp) * ofs);
-    }
-}
-
-
-
-static int
-amerge_at_@suff@(@type@ *arr, npy_intp *tosort, const run *stack,
-                 const npy_intp at, buffer_intp *buffer, size_t len)
-{
-    int ret;
-    npy_intp s1, l1, s2, l2, k;
-    npy_intp *p1, *p2;
-    s1 = stack[at].s;
-    l1 = stack[at].l;
-    s2 = stack[at + 1].s;
-    l2 = stack[at + 1].l;
-    /* tosort[s2] belongs to tosort[s1+k] */
-    k = agallop_right_@suff@(arr, tosort + s1, l1, arr + tosort[s2] * len, len);
-
-    if (l1 == k) {
-        /* already sorted */
-        return 0;
-    }
-
-    p1 = tosort + s1 + k;
-    l1 -= k;
-    p2 = tosort + s2;
-    /* tosort[s2-1] belongs to tosort[s2+l2] */
-    l2 = agallop_left_@suff@(arr, tosort + s2, l2, arr + tosort[s2 - 1] * len,
-                             len);
-
-    if (l2 < l1) {
-        ret = resize_buffer_intp(buffer, l2);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        amerge_right_@suff@(arr, p1, l1, p2, l2, buffer->pw, len);
-    } else {
-        ret = resize_buffer_intp(buffer, l1);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        amerge_left_@suff@(arr, p1, l1, p2, l2, buffer->pw, len);
-    }
-
-    return 0;
-}
-
-
-static int
-atry_collapse_@suff@(@type@ *arr, npy_intp *tosort, run *stack,
-                     npy_intp *stack_ptr, buffer_intp *buffer, size_t len)
-{
-    int ret;
-    npy_intp A, B, C, top;
-    top = *stack_ptr;
-
-    while (1 < top) {
-        B = stack[top - 2].l;
-        C = stack[top - 1].l;
-
-        if ((2 < top && stack[top - 3].l <= B + C) ||
-                (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
-            A = stack[top - 3].l;
-
-            if (A <= C) {
-                ret = amerge_at_@suff@(arr, tosort, stack, top - 3, buffer, len);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 3].l += B;
-                stack[top - 2] = stack[top - 1];
-                --top;
-            } else {
-                ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer, len);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 2].l += C;
-                --top;
-            }
-        } else if (1 < top && B <= C) {
-            ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer, len);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += C;
-            --top;
-        } else {
-            break;
-        }
-    }
-
-    *stack_ptr = top;
-    return 0;
-}
-
-
-
-static int
-aforce_collapse_@suff@(@type@ *arr, npy_intp *tosort, run *stack,
-                       npy_intp *stack_ptr, buffer_intp *buffer, size_t len)
-{
-    int ret;
-    npy_intp top = *stack_ptr;
-
-    while (2 < top) {
-        if (stack[top - 3].l <= stack[top - 1].l) {
-            ret = amerge_at_@suff@(arr, tosort, stack, top - 3, buffer, len);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 3].l += stack[top - 2].l;
-            stack[top - 2] = stack[top - 1];
-            --top;
-        } else {
-            ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer, len);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += stack[top - 1].l;
-            --top;
-        }
-    }
-
-    if (1 < top) {
-        ret = amerge_at_@suff@(arr, tosort, stack, top - 2, buffer, len);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-atimsort_@suff@(void *start, npy_intp *tosort, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t elsize = PyArray_ITEMSIZE(arr);
-    size_t len = elsize / sizeof(@type@);
-    int ret;
-    npy_intp l, n, stack_ptr, minrun;
-    run stack[TIMSORT_STACK_SIZE];
-    buffer_intp buffer;
-
-    /* Items that have zero size don't make sense to sort */
-    if (len == 0) {
-        return 0;
-    }
-
-    buffer.pw = NULL;
-    buffer.size = 0;
-    stack_ptr = 0;
-    minrun = compute_min_run_short(num);
-
-    for (l = 0; l < num;) {
-        n = acount_run_@suff@(start, tosort, l, num, minrun, len);
-        /* s and l index tosort in elements; only arr offsets are scaled by len */
-        stack[stack_ptr].s = l;
-        stack[stack_ptr].l = n;
-        ++stack_ptr;
-        ret = atry_collapse_@suff@(start, tosort, stack, &stack_ptr, &buffer, len);
-
-        if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-        l += n;
-    }
-
-    ret = aforce_collapse_@suff@(start, tosort, stack, &stack_ptr, &buffer, len);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    ret = 0;
-
-cleanup:
-    if (buffer.pw != NULL) {
-        free(buffer.pw);
-    }
-    return ret;
-}
-
-
-/**end repeat**/
-
-
-
-/*
- *****************************************************************************
- **                             GENERIC SORT                                **
- *****************************************************************************
- */
-
-
-typedef struct {
-    char *pw;
-    npy_intp size;
-    size_t len;
-} buffer_char;
-
-
-static NPY_INLINE int
-resize_buffer_char(buffer_char *buffer, npy_intp new_size)
-{
-    if (new_size <= buffer->size) {
-        return 0;
-    }
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        buffer->pw = malloc(sizeof(char) * new_size * buffer->len);
-    } else {
-        buffer->pw = realloc(buffer->pw,  sizeof(char) * new_size * buffer->len);
-    }
-
-    buffer->size = new_size;
-
-    if (NPY_UNLIKELY(buffer->pw == NULL)) {
-        return -NPY_ENOMEM;
-    } else {
-        return 0;
-    }
-}
-
-
-static npy_intp
-npy_count_run(char *arr, npy_intp l, npy_intp num, npy_intp minrun,
-              char *vp, size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp sz;
-    char *pl, *pi, *pj, *pr;
-
-    if (NPY_UNLIKELY(num - l == 1)) {
-        return 1;
-    }
-
-    pl = arr + l * len;
-
-    /* (not strictly) ascending sequence */
-    if (cmp(pl, pl + len, py_arr) <= 0) {
-        for (pi = pl + len; pi < arr + (num - 1) * len
-                && cmp(pi, pi + len, py_arr) <= 0; pi += len) {
-        }
-    } else {  /* (strictly) descending sequence */
-        for (pi = pl + len; pi < arr + (num - 1) * len
-                && cmp(pi + len, pi, py_arr) < 0; pi += len) {
-        }
-
-        for (pj = pl, pr = pi; pj < pr; pj += len, pr -= len) {
-            GENERIC_SWAP(pj, pr, len);
-        }
-    }
-
-    pi += len;
-    sz = (pi - pl) / len;
-
-    if (sz < minrun) {
-        if (l + minrun < num) {
-            sz = minrun;
-        } else {
-            sz = num - l;
-        }
-
-        pr = pl + sz * len;
-
-        /* insertion sort */
-        for (; pi < pr; pi += len) {
-            GENERIC_COPY(vp, pi, len);
-            pj = pi;
-
-            while (pl < pj && cmp(vp, pj - len, py_arr) < 0) {
-                GENERIC_COPY(pj, pj - len, len);
-                pj -= len;
-            }
-
-            GENERIC_COPY(pj, vp, len);
-        }
-    }
-
-    return sz;
-}
-
-
-static npy_intp
-npy_gallop_right(const char *arr, const npy_intp size, const char *key,
-                 size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp last_ofs, ofs, m;
-
-    if (cmp(key, arr, py_arr) < 0) {
-        return 0;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size; /* arr[ofs] is never accessed */
-            break;
-        }
-
-        if (cmp(key, arr + ofs * len, py_arr) < 0) {
-            break;
-        } else {
-            last_ofs = ofs;
-            /* ofs = 1, 3, 7, 15... */
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[last_ofs*len] <= key < arr[ofs*len] */
-    while (last_ofs + 1 < ofs) {
-        m = last_ofs + ((ofs - last_ofs) >> 1);
-
-        if (cmp(key, arr + m * len, py_arr) < 0) {
-            ofs = m;
-        } else {
-            last_ofs = m;
-        }
-    }
-
-    /* now that arr[(ofs-1)*len] <= key < arr[ofs*len] */
-    return ofs;
-}
-
-
-
-static npy_intp
-npy_gallop_left(const char *arr, const npy_intp size, const char *key,
-                size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp last_ofs, ofs, l, m, r;
-
-    if (cmp(arr + (size - 1) * len, key, py_arr) < 0) {
-        return size;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size;
-            break;
-        }
-
-        if (cmp(arr + (size - ofs - 1) * len, key, py_arr) < 0) {
-            break;
-        } else {
-            last_ofs = ofs;
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[(size-ofs-1)*len] < key <= arr[(size-last_ofs-1)*len] */
-    l = size - ofs - 1;
-    r = size - last_ofs - 1;
-
-    while (l + 1 < r) {
-        m = l + ((r - l) >> 1);
-
-        if (cmp(arr + m * len, key, py_arr) < 0) {
-            l = m;
-        } else {
-            r = m;
-        }
-    }
-
-    /* now that arr[(r-1)*len] < key <= arr[r*len] */
-    return r;
-}
-
-
-static void
-npy_merge_left(char *p1, npy_intp l1, char *p2, npy_intp l2,
-               char *p3, size_t len,
-               PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    char *end = p2 + l2 * len;
-    memcpy(p3, p1, sizeof(char) * l1 * len);
-    /* first element must be in p2 otherwise skipped in the caller */
-    GENERIC_COPY(p1, p2, len);
-    p1 += len;
-    p2 += len;
-
-    while (p1 < p2 && p2 < end) {
-        if (cmp(p2, p3, py_arr) < 0) {
-            GENERIC_COPY(p1, p2, len);
-            p1 += len;
-            p2 += len;
-        } else {
-            GENERIC_COPY(p1, p3, len);
-            p1 += len;
-            p3 += len;
-        }
-    }
-
-    if (p1 != p2) {
-        memcpy(p1, p3, sizeof(char) * (p2 - p1));
-    }
-}
-
-
-static void
-npy_merge_right(char *p1, npy_intp l1, char *p2, npy_intp l2,
-                char *p3, size_t len,
-                PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp ofs;
-    char *start = p1 - len;
-    memcpy(p3, p2, sizeof(char) * l2 * len);
-    p1 += (l1 - 1) * len;
-    p2 += (l2 - 1) * len;
-    p3 += (l2 - 1) * len;
-    /* first element must be in p1 otherwise skipped in the caller */
-    GENERIC_COPY(p2, p1, len);
-    p2 -= len;
-    p1 -= len;
-
-    while (p1 < p2 && start < p1) {
-        if (cmp(p3, p1, py_arr) < 0) {
-            GENERIC_COPY(p2, p1, len);
-            p2 -= len;
-            p1 -= len;
-        } else {
-            GENERIC_COPY(p2, p3, len);
-            p2 -= len;
-            p3 -= len;
-        }
-    }
-
-    if (p1 != p2) {
-        ofs = p2 - start;
-        memcpy(start + len, p3 - ofs + len, sizeof(char) * ofs);
-    }
-}
-
-
-
-static int
-npy_merge_at(char *arr, const run *stack, const npy_intp at,
-             buffer_char *buffer, size_t len,
-             PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    int ret;
-    npy_intp s1, l1, s2, l2, k;
-    char *p1, *p2;
-    s1 = stack[at].s;
-    l1 = stack[at].l;
-    s2 = stack[at + 1].s;
-    l2 = stack[at + 1].l;
-    /* arr[s2] belongs to arr[s1+k] */
-    GENERIC_COPY(buffer->pw, arr + s2 * len, len);
-    k = npy_gallop_right(arr + s1 * len, l1, buffer->pw, len, cmp, py_arr);
-
-    if (l1 == k) {
-        /* already sorted */
-        return 0;
-    }
-
-    p1 = arr + (s1 + k) * len;
-    l1 -= k;
-    p2 = arr + s2 * len;
-    /* arr[s2-1] belongs to arr[s2+l2] */
-    GENERIC_COPY(buffer->pw, arr + (s2 - 1) * len, len);
-    l2 = npy_gallop_left(arr + s2 * len, l2, buffer->pw, len, cmp, py_arr);
-
-    if (l2 < l1) {
-        ret = resize_buffer_char(buffer, l2);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        npy_merge_right(p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
-    } else {
-        ret = resize_buffer_char(buffer, l1);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        npy_merge_left(p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
-    }
-
-    return 0;
-}
-
-
-static int
-npy_try_collapse(char *arr, run *stack, npy_intp *stack_ptr,
-                 buffer_char *buffer, size_t len,
-                 PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    int ret;
-    npy_intp A, B, C, top;
-    top = *stack_ptr;
-
-    while (1 < top) {
-        B = stack[top - 2].l;
-        C = stack[top - 1].l;
-
-        if ((2 < top && stack[top - 3].l <= B + C) ||
-                (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
-            A = stack[top - 3].l;
-
-            if (A <= C) {
-                ret = npy_merge_at(arr, stack, top - 3, buffer, len, cmp, py_arr);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 3].l += B;
-                stack[top - 2] = stack[top - 1];
-                --top;
-            } else {
-                ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 2].l += C;
-                --top;
-            }
-        } else if (1 < top && B <= C) {
-            ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += C;
-            --top;
-        } else {
-            break;
-        }
-    }
-
-    *stack_ptr = top;
-    return 0;
-}
-
-
-static int
-npy_force_collapse(char *arr, run *stack, npy_intp *stack_ptr,
-                   buffer_char *buffer, size_t len,
-                   PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    int ret;
-    npy_intp top = *stack_ptr;
-
-    while (2 < top) {
-        if (stack[top - 3].l <= stack[top - 1].l) {
-            ret = npy_merge_at(arr, stack, top - 3, buffer, len, cmp, py_arr);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 3].l += stack[top - 2].l;
-            stack[top - 2] = stack[top - 1];
-            --top;
-        } else {
-            ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += stack[top - 1].l;
-            --top;
-        }
-    }
-
-    if (1 < top) {
-        ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-npy_timsort(void *start, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t len = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    int ret;
-    npy_intp l, n, stack_ptr, minrun;
-    run stack[TIMSORT_STACK_SIZE];
-    buffer_char buffer;
-
-    /* Items that have zero size don't make sense to sort */
-    if (len == 0) {
-        return 0;
-    }
-
-    buffer.pw = NULL;
-    buffer.size = 0;
-    buffer.len = len;
-    stack_ptr = 0;
-    minrun = compute_min_run_short(num);
-
-    /* used for insertion sort and gallop key */
-    ret = resize_buffer_char(&buffer, len);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    for (l = 0; l < num;) {
-        n = npy_count_run(start, l, num, minrun, buffer.pw, len, cmp, arr);
-
-        /* both s and l are scaled by len */
-        stack[stack_ptr].s = l;
-        stack[stack_ptr].l = n;
-        ++stack_ptr;
-        ret = npy_try_collapse(start, stack, &stack_ptr, &buffer, len, cmp, arr);
-
-        if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-        l += n;
-    }
-
-    ret = npy_force_collapse(start, stack, &stack_ptr, &buffer, len, cmp, arr);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    ret = 0;
-
-cleanup:
-    if (buffer.pw != NULL) {
-        free(buffer.pw);
-    }
-    return ret;
-}
-
-
-/* argsort */
-
-static npy_intp
-npy_acount_run(char *arr, npy_intp *tosort, npy_intp l, npy_intp num,
-               npy_intp minrun, size_t len,
-               PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp sz;
-    npy_intp vi;
-    npy_intp *pl, *pi, *pj, *pr;
-
-    if (NPY_UNLIKELY(num - l == 1)) {
-        return 1;
-    }
-
-    pl = tosort + l;
-
-    /* (not strictly) ascending sequence */
-    if (cmp(arr + (*pl) * len, arr + (*(pl + 1)) * len, py_arr) <= 0) {
-        for (pi = pl + 1; pi < tosort + num - 1
-                && cmp(arr + (*pi) * len, arr + (*(pi + 1)) * len, py_arr) <= 0; ++pi) {
-        }
-    } else {  /* (strictly) descending sequence */
-        for (pi = pl + 1; pi < tosort + num - 1
-                && cmp(arr + (*(pi + 1)) * len, arr + (*pi) * len, py_arr) < 0; ++pi) {
-        }
-
-        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
-            INTP_SWAP(*pj, *pr);
-        }
-    }
-
-    ++pi;
-    sz = pi - pl;
-
-    if (sz < minrun) {
-        if (l + minrun < num) {
-            sz = minrun;
-        } else {
-            sz = num - l;
-        }
-
-        pr = pl + sz;
-
-        /* insertion sort */
-        for (; pi < pr; ++pi) {
-            vi = *pi;
-            pj = pi;
-
-            while (pl < pj && cmp(arr + vi * len, arr + (*(pj - 1)) * len, py_arr) < 0) {
-                *pj = *(pj - 1);
-                --pj;
-            }
-
-            *pj = vi;
-        }
-    }
-
-    return sz;
-}
-
-
-static npy_intp
-npy_agallop_left(const char *arr, const npy_intp *tosort,
-                 const npy_intp size, const char *key, size_t len,
-                 PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp last_ofs, ofs, l, m, r;
-
-    if (cmp(arr + tosort[size - 1] * len, key, py_arr) < 0) {
-        return size;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size;
-            break;
-        }
-
-        if (cmp(arr + tosort[size - ofs - 1] * len, key, py_arr) < 0) {
-            break;
-        } else {
-            last_ofs = ofs;
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[tosort[size-ofs-1]*len] < key <= arr[tosort[size-last_ofs-1]*len] */
-    l = size - ofs - 1;
-    r = size - last_ofs - 1;
-
-    while (l + 1 < r) {
-        m = l + ((r - l) >> 1);
-
-        if (cmp(arr + tosort[m] * len, key, py_arr) < 0) {
-            l = m;
-        } else {
-            r = m;
-        }
-    }
-
-    /* now that arr[tosort[r-1]*len] < key <= arr[tosort[r]*len] */
-    return r;
-}
-
-
-static npy_intp
-npy_agallop_right(const char *arr, const npy_intp *tosort,
-                  const npy_intp size, const char *key, size_t len,
-                  PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp last_ofs, ofs, m;
-
-    if (cmp(key, arr + tosort[0] * len, py_arr) < 0) {
-        return 0;
-    }
-
-    last_ofs = 0;
-    ofs = 1;
-
-    for (;;) {
-        if (size <= ofs || ofs < 0) {
-            ofs = size; /* arr[ofs] is never accessed */
-            break;
-        }
-
-        if (cmp(key, arr + tosort[ofs] * len, py_arr) < 0) {
-            break;
-        } else {
-            last_ofs = ofs;
-            /* ofs = 1, 3, 7, 15... */
-            ofs = (ofs << 1) + 1;
-        }
-    }
-
-    /* now that arr[tosort[last_ofs]*len] <= key < arr[tosort[ofs]*len] */
-    while (last_ofs + 1 < ofs) {
-        m = last_ofs + ((ofs - last_ofs) >> 1);
-
-        if (cmp(key, arr + tosort[m] * len, py_arr) < 0) {
-            ofs = m;
-        } else {
-            last_ofs = m;
-        }
-    }
-
-    /* now that arr[tosort[ofs-1]*len] <= key < arr[tosort[ofs]*len] */
-    return ofs;
-}
-
-
-static void
-npy_amerge_left(char *arr, npy_intp *p1, npy_intp l1, npy_intp *p2,
-                npy_intp l2, npy_intp *p3, size_t len,
-                PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp *end = p2 + l2;
-    memcpy(p3, p1, sizeof(npy_intp) * l1);
-    /* first element must be in p2 otherwise skipped in the caller */
-    *p1++ = *p2++;
-
-    while (p1 < p2 && p2 < end) {
-        if (cmp(arr + (*p2) * len, arr + (*p3) * len, py_arr) < 0) {
-            *p1++ = *p2++;
-        } else {
-            *p1++ = *p3++;
-        }
-    }
-
-    if (p1 != p2) {
-        memcpy(p1, p3, sizeof(npy_intp) * (p2 - p1));
-    }
-}
-
-
-static void
-npy_amerge_right(char *arr, npy_intp* p1, npy_intp l1, npy_intp *p2,
-                 npy_intp l2, npy_intp *p3, size_t len,
-                 PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    npy_intp ofs;
-    npy_intp *start = p1 - 1;
-    memcpy(p3, p2, sizeof(npy_intp) * l2);
-    p1 += l1 - 1;
-    p2 += l2 - 1;
-    p3 += l2 - 1;
-    /* first element must be in p1 otherwise skipped in the caller */
-    *p2-- = *p1--;
-
-    while (p1 < p2 && start < p1) {
-        if (cmp(arr + (*p3) * len, arr + (*p1) * len, py_arr) < 0) {
-            *p2-- = *p1--;
-        } else {
-            *p2-- = *p3--;
-        }
-    }
-
-    if (p1 != p2) {
-        ofs = p2 - start;
-        memcpy(start + 1, p3 - ofs + 1, sizeof(npy_intp) * ofs);
-    }
-}
-
-
-
-static int
-npy_amerge_at(char *arr, npy_intp *tosort, const run *stack,
-              const npy_intp at, buffer_intp *buffer, size_t len,
-              PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    int ret;
-    npy_intp s1, l1, s2, l2, k;
-    npy_intp *p1, *p2;
-    s1 = stack[at].s;
-    l1 = stack[at].l;
-    s2 = stack[at + 1].s;
-    l2 = stack[at + 1].l;
-    /* tosort[s2] belongs to tosort[s1+k] */
-    k = npy_agallop_right(arr, tosort + s1, l1, arr + tosort[s2] * len, len, cmp,
-                          py_arr);
-
-    if (l1 == k) {
-        /* already sorted */
-        return 0;
-    }
-
-    p1 = tosort + s1 + k;
-    l1 -= k;
-    p2 = tosort + s2;
-    /* tosort[s2-1] belongs to tosort[s2+l2] */
-    l2 = npy_agallop_left(arr, tosort + s2, l2, arr + tosort[s2 - 1] * len,
-                          len, cmp, py_arr);
-
-    if (l2 < l1) {
-        ret = resize_buffer_intp(buffer, l2);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        npy_amerge_right(arr, p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
-    } else {
-        ret = resize_buffer_intp(buffer, l1);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-        npy_amerge_left(arr, p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
-    }
-
-    return 0;
-}
-
-
-static int
-npy_atry_collapse(char *arr, npy_intp *tosort, run *stack,
-                  npy_intp *stack_ptr, buffer_intp *buffer, size_t len,
-                  PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    int ret;
-    npy_intp A, B, C, top;
-    top = *stack_ptr;
-
-    while (1 < top) {
-        B = stack[top - 2].l;
-        C = stack[top - 1].l;
-
-        if ((2 < top && stack[top - 3].l <= B + C) ||
-                (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
-            A = stack[top - 3].l;
-
-            if (A <= C) {
-                ret = npy_amerge_at(arr, tosort, stack, top - 3, buffer, len, cmp, py_arr);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 3].l += B;
-                stack[top - 2] = stack[top - 1];
-                --top;
-            } else {
-                ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp, py_arr);
-
-                if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-                stack[top - 2].l += C;
-                --top;
-            }
-        } else if (1 < top && B <= C) {
-            ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp, py_arr);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += C;
-            --top;
-        } else {
-            break;
-        }
-    }
-
-    *stack_ptr = top;
-    return 0;
-}
-
-
-static int
-npy_aforce_collapse(char *arr, npy_intp *tosort, run *stack,
-                    npy_intp *stack_ptr, buffer_intp *buffer, size_t len,
-                    PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
-{
-    int ret;
-    npy_intp top = *stack_ptr;
-
-    while (2 < top) {
-        if (stack[top - 3].l <= stack[top - 1].l) {
-            ret = npy_amerge_at(arr, tosort, stack, top - 3, buffer, len, cmp, py_arr);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 3].l += stack[top - 2].l;
-            stack[top - 2] = stack[top - 1];
-            --top;
-        } else {
-            ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp, py_arr);
-
-            if (NPY_UNLIKELY(ret < 0)) { return ret; }
-
-            stack[top - 2].l += stack[top - 1].l;
-            --top;
-        }
-    }
-
-    if (1 < top) {
-        ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp, py_arr);
-
-        if (NPY_UNLIKELY(ret < 0)) { return ret; }
-    }
-
-    return 0;
-}
-
-
-NPY_NO_EXPORT int
-npy_atimsort(void *start, npy_intp *tosort, npy_intp num, void *varr)
-{
-    PyArrayObject *arr = varr;
-    size_t len = PyArray_ITEMSIZE(arr);
-    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
-    int ret;
-    npy_intp l, n, stack_ptr, minrun;
-    run stack[TIMSORT_STACK_SIZE];
-    buffer_intp buffer;
-
-    /* Items that have zero size don't make sense to sort */
-    if (len == 0) {
-        return 0;
-    }
-
-    buffer.pw = NULL;
-    buffer.size = 0;
-    stack_ptr = 0;
-    minrun = compute_min_run_short(num);
-
-    for (l = 0; l < num;) {
-        n = npy_acount_run(start, tosort, l, num, minrun, len, cmp, arr);
-        /* both s and l are scaled by len */
-        stack[stack_ptr].s = l;
-        stack[stack_ptr].l = n;
-        ++stack_ptr;
-        ret = npy_atry_collapse(start, tosort, stack, &stack_ptr, &buffer, len, cmp,
-                                arr);
-
-        if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-        l += n;
-    }
-
-    ret = npy_aforce_collapse(start, tosort, stack, &stack_ptr, &buffer, len,
-                              cmp, arr);
-
-    if (NPY_UNLIKELY(ret < 0)) { goto cleanup; }
-
-    ret = 0;
-
-cleanup:
-    if (buffer.pw != NULL) {
-        free(buffer.pw);
-    }
-    return ret;
-}
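Reading aid, not part of the upstream patch: the templated helpers in the timsort.cpp added below take a `Tag` type parameter and compare elements through `Tag::less`, where the removed helpers above call a `cmp` function pointer. The tag definitions themselves (numpy_tag.h, npysort_common.h) are not shown in this diff, so the sketch below assumes a hypothetical `int_tag`, and uses `std::ptrdiff_t` in place of `npy_intp`, to show how a galloping search in the style of `gallop_right_` behaves. It omits the overflow guard of the original, so it is only meant for small inputs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tag, for illustration only. The real tags come from
// numpy_tag.h and are not part of this diff.
struct int_tag {
    using type = int;
    static bool less(int a, int b) { return a < b; }
};

// Sketch in the style of gallop_right_: for a sorted array, return the
// index ofs such that every element before ofs is <= key and every element
// from ofs on is > key. Probes offsets 1, 3, 7, 15, ... and then finishes
// with a binary search between the last two probes.
template <typename Tag>
static std::ptrdiff_t
gallop_right_sketch(const typename Tag::type *arr, std::ptrdiff_t size,
                    typename Tag::type key)
{
    assert(size > 0);

    if (Tag::less(key, arr[0])) {
        return 0;
    }

    std::ptrdiff_t last_ofs = 0;
    std::ptrdiff_t ofs = 1;

    for (;;) {
        if (size <= ofs) {
            ofs = size; // arr[ofs] is never read in this case
            break;
        }
        if (Tag::less(key, arr[ofs])) {
            break;
        }
        last_ofs = ofs;
        ofs = (ofs << 1) + 1;
    }

    // Binary search in the half-open window (last_ofs, ofs].
    while (last_ofs + 1 < ofs) {
        std::ptrdiff_t m = last_ofs + ((ofs - last_ofs) >> 1);
        if (Tag::less(key, arr[m])) {
            ofs = m;
        }
        else {
            last_ofs = m;
        }
    }

    return ofs;
}
```

For the sorted array {1, 2, 2, 4, 7}, calling `gallop_right_sketch<int_tag>` with key 2 lands just past the last 2, which is the position a stable merge needs when the key comes from the right-hand run.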
diff --git a/numpy/core/src/npysort/timsort.cpp b/numpy/core/src/npysort/timsort.cpp
new file mode 100644
index 0000000..27294af
--- /dev/null
+++ b/numpy/core/src/npysort/timsort.cpp
@@ -0,0 +1,2994 @@
+/* -*- c -*- */
+
+/*
+ * The purpose of this module is to add faster sort functions
+ * that are type-specific.  This is done by altering the
+ * function table for the builtin descriptors.
+ *
+ * These sorting functions are copied almost directly from numarray
+ * with a few modifications (complex comparisons compare the imaginary
+ * part if the real parts are equal, for example), and the names
+ * are changed.
+ *
+ * The original sorting code is due to Charles R. Harris who wrote
+ * it for numarray.
+ */
+
+/*
+ * Quick sort is usually the fastest, but the worst case scenario can
+ * be slower than the merge and heap sorts.  The merge sort requires
+ * extra memory and so for large arrays may not be useful.
+ *
+ * The merge sort is *stable*, meaning that equal elements
+ * keep their original relative order, so it can be used to
+ * implement lexicographic sorting on multiple keys.
+ *
+ * The heap sort is included for completeness.
+ */
+
+/* For details of Timsort, refer to
+ * https://github.com/python/cpython/blob/3.7/Objects/listsort.txt
+ */
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "npy_sort.h"
+#include "npysort_common.h"
+#include "numpy_tag.h"
+
+#include <cstdlib>
+#include <utility>
+
+/* enough for 32 * 1.618 ** 128 elements */
+#define TIMSORT_STACK_SIZE 128
+
+static npy_intp
+compute_min_run(npy_intp num)
+{
+    npy_intp r = 0;
+
+    while (64 < num) {
+        r |= num & 1;
+        num >>= 1;
+    }
+
+    return num + r;
+}
+
+typedef struct {
+    npy_intp s; /* start index */
+    npy_intp l; /* length */
+} run;
+
+/* buffer for argsort. Declared here to avoid multiple declarations. */
+typedef struct {
+    npy_intp *pw;
+    npy_intp size;
+} buffer_intp;
+
+/* buffer method */
+static inline int
+resize_buffer_intp(buffer_intp *buffer, npy_intp new_size)
+{
+    if (new_size <= buffer->size) {
+        return 0;
+    }
+
+    if (NPY_UNLIKELY(buffer->pw == NULL)) {
+        buffer->pw = (npy_intp *)malloc(new_size * sizeof(npy_intp));
+    }
+    else {
+        buffer->pw =
+                (npy_intp *)realloc(buffer->pw, new_size * sizeof(npy_intp));
+    }
+
+    buffer->size = new_size;
+
+    if (NPY_UNLIKELY(buffer->pw == NULL)) {
+        return -NPY_ENOMEM;
+    }
+    else {
+        return 0;
+    }
+}
+
+/*
+ *****************************************************************************
+ **                            NUMERIC SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag>
+struct buffer_ {
+    typename Tag::type *pw;
+    npy_intp size;
+};
+
+template <typename Tag>
+static inline int
+resize_buffer_(buffer_<Tag> *buffer, npy_intp new_size)
+{
+    using type = typename Tag::type;
+    if (new_size <= buffer->size) {
+        return 0;
+    }
+
+    if (NPY_UNLIKELY(buffer->pw == NULL)) {
+        buffer->pw = (type *)malloc(new_size * sizeof(type));
+    }
+    else {
+        buffer->pw = (type *)realloc(buffer->pw, new_size * sizeof(type));
+    }
+
+    buffer->size = new_size;
+
+    if (NPY_UNLIKELY(buffer->pw == NULL)) {
+        return -NPY_ENOMEM;
+    }
+    else {
+        return 0;
+    }
+}
+
+template <typename Tag, typename type>
+static npy_intp
+count_run_(type *arr, npy_intp l, npy_intp num, npy_intp minrun)
+{
+    npy_intp sz;
+    type vc, *pl, *pi, *pj, *pr;
+
+    if (NPY_UNLIKELY(num - l == 1)) {
+        return 1;
+    }
+
+    pl = arr + l;
+
+    /* (not strictly) ascending sequence */
+    if (!Tag::less(*(pl + 1), *pl)) {
+        for (pi = pl + 1; pi < arr + num - 1 && !Tag::less(*(pi + 1), *pi);
+             ++pi) {
+        }
+    }
+    else { /* (strictly) descending sequence */
+        for (pi = pl + 1; pi < arr + num - 1 && Tag::less(*(pi + 1), *pi);
+             ++pi) {
+        }
+
+        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
+            std::swap(*pj, *pr);
+        }
+    }
+
+    ++pi;
+    sz = pi - pl;
+
+    if (sz < minrun) {
+        if (l + minrun < num) {
+            sz = minrun;
+        }
+        else {
+            sz = num - l;
+        }
+
+        pr = pl + sz;
+
+        /* insertion sort */
+        for (; pi < pr; ++pi) {
+            vc = *pi;
+            pj = pi;
+
+            while (pl < pj && Tag::less(vc, *(pj - 1))) {
+                *pj = *(pj - 1);
+                --pj;
+            }
+
+            *pj = vc;
+        }
+    }
+
+    return sz;
+}
+
+/* when the left part of the array (p1) is smaller, copy p1 to buffer
+ * and merge from left to right
+ */
+template <typename Tag, typename type>
+static void
+merge_left_(type *p1, npy_intp l1, type *p2, npy_intp l2, type *p3)
+{
+    type *end = p2 + l2;
+    memcpy(p3, p1, sizeof(type) * l1);
+    /* first element must be in p2 otherwise skipped in the caller */
+    *p1++ = *p2++;
+
+    while (p1 < p2 && p2 < end) {
+        if (Tag::less(*p2, *p3)) {
+            *p1++ = *p2++;
+        }
+        else {
+            *p1++ = *p3++;
+        }
+    }
+
+    if (p1 != p2) {
+        memcpy(p1, p3, sizeof(type) * (p2 - p1));
+    }
+}
+
+/* when the right part of the array (p2) is smaller, copy p2 to buffer
+ * and merge from right to left
+ */
+template <typename Tag, typename type>
+static void
+merge_right_(type *p1, npy_intp l1, type *p2, npy_intp l2, type *p3)
+{
+    npy_intp ofs;
+    type *start = p1 - 1;
+    memcpy(p3, p2, sizeof(type) * l2);
+    p1 += l1 - 1;
+    p2 += l2 - 1;
+    p3 += l2 - 1;
+    /* first element must be in p1 otherwise skipped in the caller */
+    *p2-- = *p1--;
+
+    while (p1 < p2 && start < p1) {
+        if (Tag::less(*p3, *p1)) {
+            *p2-- = *p1--;
+        }
+        else {
+            *p2-- = *p3--;
+        }
+    }
+
+    if (p1 != p2) {
+        ofs = p2 - start;
+        memcpy(start + 1, p3 - ofs + 1, sizeof(type) * ofs);
+    }
+}
+
+/* Note: the naming convention of the gallop functions differs from that of
+ * CPython. Here, gallop_right means galloping from left toward right,
+ * whereas in CPython gallop_right means galloping to find
+ * the rightmost element among equal elements.
+ */
+template <typename Tag, typename type>
+static npy_intp
+gallop_right_(const type *arr, const npy_intp size, const type key)
+{
+    npy_intp last_ofs, ofs, m;
+
+    if (Tag::less(key, arr[0])) {
+        return 0;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size; /* arr[ofs] is never accessed */
+            break;
+        }
+
+        if (Tag::less(key, arr[ofs])) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            /* ofs = 1, 3, 7, 15... */
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* at this point arr[last_ofs] <= key < arr[ofs] */
+    while (last_ofs + 1 < ofs) {
+        m = last_ofs + ((ofs - last_ofs) >> 1);
+
+        if (Tag::less(key, arr[m])) {
+            ofs = m;
+        }
+        else {
+            last_ofs = m;
+        }
+    }
+
+    /* at this point arr[ofs-1] <= key < arr[ofs] */
+    return ofs;
+}
+
+template <typename Tag, typename type>
+static npy_intp
+gallop_left_(const type *arr, const npy_intp size, const type key)
+{
+    npy_intp last_ofs, ofs, l, m, r;
+
+    if (Tag::less(arr[size - 1], key)) {
+        return size;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size;
+            break;
+        }
+
+        if (Tag::less(arr[size - ofs - 1], key)) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* at this point arr[size-ofs-1] < key <= arr[size-last_ofs-1] */
+    l = size - ofs - 1;
+    r = size - last_ofs - 1;
+
+    while (l + 1 < r) {
+        m = l + ((r - l) >> 1);
+
+        if (Tag::less(arr[m], key)) {
+            l = m;
+        }
+        else {
+            r = m;
+        }
+    }
+
+    /* at this point arr[r-1] < key <= arr[r] */
+    return r;
+}
+
+template <typename Tag, typename type>
+static int
+merge_at_(type *arr, const run *stack, const npy_intp at, buffer_<Tag> *buffer)
+{
+    int ret;
+    npy_intp s1, l1, s2, l2, k;
+    type *p1, *p2;
+    s1 = stack[at].s;
+    l1 = stack[at].l;
+    s2 = stack[at + 1].s;
+    l2 = stack[at + 1].l;
+    /* arr[s2] belongs to arr[s1+k].
+     * If you comment this out for debugging purposes, remember that
+     * the merging process skips the first element.
+     */
+    k = gallop_right_<Tag>(arr + s1, l1, arr[s2]);
+
+    if (l1 == k) {
+        /* already sorted */
+        return 0;
+    }
+
+    p1 = arr + s1 + k;
+    l1 -= k;
+    p2 = arr + s2;
+    /* arr[s2-1] belongs to arr[s2+l2] */
+    l2 = gallop_left_<Tag>(arr + s2, l2, arr[s2 - 1]);
+
+    if (l2 < l1) {
+        ret = resize_buffer_<Tag>(buffer, l2);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        merge_right_<Tag>(p1, l1, p2, l2, buffer->pw);
+    }
+    else {
+        ret = resize_buffer_<Tag>(buffer, l1);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        merge_left_<Tag>(p1, l1, p2, l2, buffer->pw);
+    }
+
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+try_collapse_(type *arr, run *stack, npy_intp *stack_ptr, buffer_<Tag> *buffer)
+{
+    int ret;
+    npy_intp A, B, C, top;
+    top = *stack_ptr;
+
+    while (1 < top) {
+        B = stack[top - 2].l;
+        C = stack[top - 1].l;
+
+        if ((2 < top && stack[top - 3].l <= B + C) ||
+            (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
+            A = stack[top - 3].l;
+
+            if (A <= C) {
+                ret = merge_at_<Tag>(arr, stack, top - 3, buffer);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 3].l += B;
+                stack[top - 2] = stack[top - 1];
+                --top;
+            }
+            else {
+                ret = merge_at_<Tag>(arr, stack, top - 2, buffer);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 2].l += C;
+                --top;
+            }
+        }
+        else if (1 < top && B <= C) {
+            ret = merge_at_<Tag>(arr, stack, top - 2, buffer);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += C;
+            --top;
+        }
+        else {
+            break;
+        }
+    }
+
+    *stack_ptr = top;
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+force_collapse_(type *arr, run *stack, npy_intp *stack_ptr,
+                buffer_<Tag> *buffer)
+{
+    int ret;
+    npy_intp top = *stack_ptr;
+
+    while (2 < top) {
+        if (stack[top - 3].l <= stack[top - 1].l) {
+            ret = merge_at_<Tag>(arr, stack, top - 3, buffer);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 3].l += stack[top - 2].l;
+            stack[top - 2] = stack[top - 1];
+            --top;
+        }
+        else {
+            ret = merge_at_<Tag>(arr, stack, top - 2, buffer);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += stack[top - 1].l;
+            --top;
+        }
+    }
+
+    if (1 < top) {
+        ret = merge_at_<Tag>(arr, stack, top - 2, buffer);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+template <typename Tag>
+static int
+timsort_(void *start, npy_intp num)
+{
+    using type = typename Tag::type;
+    int ret;
+    npy_intp l, n, stack_ptr, minrun;
+    buffer_<Tag> buffer;
+    run stack[TIMSORT_STACK_SIZE];
+    buffer.pw = NULL;
+    buffer.size = 0;
+    stack_ptr = 0;
+    minrun = compute_min_run(num);
+
+    for (l = 0; l < num;) {
+        n = count_run_<Tag>((type *)start, l, num, minrun);
+        stack[stack_ptr].s = l;
+        stack[stack_ptr].l = n;
+        ++stack_ptr;
+        ret = try_collapse_<Tag>((type *)start, stack, &stack_ptr, &buffer);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            goto cleanup;
+        }
+
+        l += n;
+    }
+
+    ret = force_collapse_<Tag>((type *)start, stack, &stack_ptr, &buffer);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    ret = 0;
+cleanup:
+
+    free(buffer.pw);
+
+    return ret;
+}
+
+/* argsort */
+
+template <typename Tag, typename type>
+static npy_intp
+acount_run_(type *arr, npy_intp *tosort, npy_intp l, npy_intp num,
+            npy_intp minrun)
+{
+    npy_intp sz;
+    type vc;
+    npy_intp vi;
+    npy_intp *pl, *pi, *pj, *pr;
+
+    if (NPY_UNLIKELY(num - l == 1)) {
+        return 1;
+    }
+
+    pl = tosort + l;
+
+    /* (not strictly) ascending sequence */
+    if (!Tag::less(arr[*(pl + 1)], arr[*pl])) {
+        for (pi = pl + 1;
+             pi < tosort + num - 1 && !Tag::less(arr[*(pi + 1)], arr[*pi]);
+             ++pi) {
+        }
+    }
+    else { /* (strictly) descending sequence */
+        for (pi = pl + 1;
+             pi < tosort + num - 1 && Tag::less(arr[*(pi + 1)], arr[*pi]);
+             ++pi) {
+        }
+
+        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
+            std::swap(*pj, *pr);
+        }
+    }
+
+    ++pi;
+    sz = pi - pl;
+
+    if (sz < minrun) {
+        if (l + minrun < num) {
+            sz = minrun;
+        }
+        else {
+            sz = num - l;
+        }
+
+        pr = pl + sz;
+
+        /* insertion sort */
+        for (; pi < pr; ++pi) {
+            vi = *pi;
+            vc = arr[*pi];
+            pj = pi;
+
+            while (pl < pj && Tag::less(vc, arr[*(pj - 1)])) {
+                *pj = *(pj - 1);
+                --pj;
+            }
+
+            *pj = vi;
+        }
+    }
+
+    return sz;
+}
+
+template <typename Tag, typename type>
+static npy_intp
+agallop_right_(const type *arr, const npy_intp *tosort, const npy_intp size,
+               const type key)
+{
+    npy_intp last_ofs, ofs, m;
+
+    if (Tag::less(key, arr[tosort[0]])) {
+        return 0;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size; /* arr[ofs] is never accessed */
+            break;
+        }
+
+        if (Tag::less(key, arr[tosort[ofs]])) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            /* ofs = 1, 3, 7, 15... */
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[tosort[last_ofs]] <= key < arr[tosort[ofs]] */
+    while (last_ofs + 1 < ofs) {
+        m = last_ofs + ((ofs - last_ofs) >> 1);
+
+        if (Tag::less(key, arr[tosort[m]])) {
+            ofs = m;
+        }
+        else {
+            last_ofs = m;
+        }
+    }
+
+    /* now that arr[tosort[ofs-1]] <= key < arr[tosort[ofs]] */
+    return ofs;
+}
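The galloping search above can be hard to follow through the `tosort` indirection. Below is a minimal standalone sketch of the same two-phase idea (exponential probe, then binary search) over a plain sorted `std::vector<int>`. It is not part of the patch: `gallop_right_sketch` is a hypothetical name, `operator<` stands in for `Tag::less`, and the `ofs < 0` overflow guard is left out on the assumption that sizes stay far below that limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns how many leading elements of the sorted vector `a` are <= key,
// i.e. the insertion point that places key after any equal elements.
static std::ptrdiff_t
gallop_right_sketch(const std::vector<int> &a, int key)
{
    assert(std::is_sorted(a.begin(), a.end()));  // debug-only precondition
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(a.size());

    if (size == 0 || key < a[0]) {
        return 0;
    }

    std::ptrdiff_t last_ofs = 0;
    std::ptrdiff_t ofs = 1;

    // Phase 1: probe offsets 1, 3, 7, 15, ... until key < a[ofs] or we run out.
    while (ofs < size && !(key < a[ofs])) {
        last_ofs = ofs;
        ofs = (ofs << 1) + 1;
    }
    if (size < ofs) {
        ofs = size;  // a[ofs] is never read in this case
    }

    // Phase 2: binary search in (last_ofs, ofs), keeping a[last_ofs] <= key.
    while (last_ofs + 1 < ofs) {
        const std::ptrdiff_t m = last_ofs + ((ofs - last_ofs) >> 1);
        if (key < a[m]) {
            ofs = m;
        }
        else {
            last_ofs = m;
        }
    }
    return ofs;
}
```

For the sorted input `{1, 2, 2, 2, 5, 8, 9}` and key `2`, the probe visits offsets 1 and 3, stops at 7 (clamped to the size), and the binary search narrows the result to 4, the position just past the last `2`.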
+
+template <typename Tag, typename type>
+static npy_intp
+agallop_left_(const type *arr, const npy_intp *tosort, const npy_intp size,
+              const type key)
+{
+    npy_intp last_ofs, ofs, l, m, r;
+
+    if (Tag::less(arr[tosort[size - 1]], key)) {
+        return size;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size;
+            break;
+        }
+
+        if (Tag::less(arr[tosort[size - ofs - 1]], key)) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[tosort[size-ofs-1]] < key <= arr[tosort[size-last_ofs-1]]
+     */
+    l = size - ofs - 1;
+    r = size - last_ofs - 1;
+
+    while (l + 1 < r) {
+        m = l + ((r - l) >> 1);
+
+        if (Tag::less(arr[tosort[m]], key)) {
+            l = m;
+        }
+        else {
+            r = m;
+        }
+    }
+
+    /* now that arr[tosort[r-1]] < key <= arr[tosort[r]] */
+    return r;
+}
+
+template <typename Tag, typename type>
+static void
+amerge_left_(type *arr, npy_intp *p1, npy_intp l1, npy_intp *p2, npy_intp l2,
+             npy_intp *p3)
+{
+    npy_intp *end = p2 + l2;
+    memcpy(p3, p1, sizeof(npy_intp) * l1);
+    /* first element must be from p2, else the caller would have skipped it */
+    *p1++ = *p2++;
+
+    while (p1 < p2 && p2 < end) {
+        if (Tag::less(arr[*p2], arr[*p3])) {
+            *p1++ = *p2++;
+        }
+        else {
+            *p1++ = *p3++;
+        }
+    }
+
+    if (p1 != p2) {
+        memcpy(p1, p3, sizeof(npy_intp) * (p2 - p1));
+    }
+}
+
+template <typename Tag, typename type>
+static void
+amerge_right_(type *arr, npy_intp *p1, npy_intp l1, npy_intp *p2, npy_intp l2,
+              npy_intp *p3)
+{
+    npy_intp ofs;
+    npy_intp *start = p1 - 1;
+    memcpy(p3, p2, sizeof(npy_intp) * l2);
+    p1 += l1 - 1;
+    p2 += l2 - 1;
+    p3 += l2 - 1;
+    /* first element must be from p1, else the caller would have skipped it */
+    *p2-- = *p1--;
+
+    while (p1 < p2 && start < p1) {
+        if (Tag::less(arr[*p3], arr[*p1])) {
+            *p2-- = *p1--;
+        }
+        else {
+            *p2-- = *p3--;
+        }
+    }
+
+    if (p1 != p2) {
+        ofs = p2 - start;
+        memcpy(start + 1, p3 - ofs + 1, sizeof(npy_intp) * ofs);
+    }
+}
+
+template <typename Tag, typename type>
+static int
+amerge_at_(type *arr, npy_intp *tosort, const run *stack, const npy_intp at,
+           buffer_intp *buffer)
+{
+    int ret;
+    npy_intp s1, l1, s2, l2, k;
+    npy_intp *p1, *p2;
+    s1 = stack[at].s;
+    l1 = stack[at].l;
+    s2 = stack[at + 1].s;
+    l2 = stack[at + 1].l;
+    /* tosort[s2] belongs to tosort[s1+k] */
+    k = agallop_right_<Tag>(arr, tosort + s1, l1, arr[tosort[s2]]);
+
+    if (l1 == k) {
+        /* already sorted */
+        return 0;
+    }
+
+    p1 = tosort + s1 + k;
+    l1 -= k;
+    p2 = tosort + s2;
+    /* tosort[s2-1] belongs to tosort[s2+l2] */
+    l2 = agallop_left_<Tag>(arr, tosort + s2, l2, arr[tosort[s2 - 1]]);
+
+    if (l2 < l1) {
+        ret = resize_buffer_intp(buffer, l2);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        amerge_right_<Tag>(arr, p1, l1, p2, l2, buffer->pw);
+    }
+    else {
+        ret = resize_buffer_intp(buffer, l1);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        amerge_left_<Tag>(arr, p1, l1, p2, l2, buffer->pw);
+    }
+
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+atry_collapse_(type *arr, npy_intp *tosort, run *stack, npy_intp *stack_ptr,
+               buffer_intp *buffer)
+{
+    int ret;
+    npy_intp A, B, C, top;
+    top = *stack_ptr;
+
+    while (1 < top) {
+        B = stack[top - 2].l;
+        C = stack[top - 1].l;
+
+        if ((2 < top && stack[top - 3].l <= B + C) ||
+            (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
+            A = stack[top - 3].l;
+
+            if (A <= C) {
+                ret = amerge_at_<Tag>(arr, tosort, stack, top - 3, buffer);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 3].l += B;
+                stack[top - 2] = stack[top - 1];
+                --top;
+            }
+            else {
+                ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 2].l += C;
+                --top;
+            }
+        }
+        else if (1 < top && B <= C) {
+            ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += C;
+            --top;
+        }
+        else {
+            break;
+        }
+    }
+
+    *stack_ptr = top;
+    return 0;
+}
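The collapse rule above is easier to see when only the run lengths are tracked. The following is a small sketch, not part of the patch, that applies the same comparisons to a vector of lengths and merges by adding them. `push_and_collapse_sketch` is a hypothetical name, and `std::ptrdiff_t` stands in for `npy_intp` as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Push one run length, then merge neighbouring runs (lengths only) using the
// same conditions as the collapse loop above. runs.back() is the newest run.
static void
push_and_collapse_sketch(std::vector<std::ptrdiff_t> &runs, std::ptrdiff_t len)
{
    assert(len > 0);
    runs.push_back(len);

    while (runs.size() > 1) {
        const std::size_t top = runs.size();
        const std::ptrdiff_t B = runs[top - 2];
        const std::ptrdiff_t C = runs[top - 1];

        if ((top > 2 && runs[top - 3] <= B + C) ||
            (top > 3 && runs[top - 4] <= runs[top - 3] + B)) {
            const std::ptrdiff_t A = runs[top - 3];
            if (A <= C) {
                // Merge A with B and keep C on top.
                runs[top - 3] = A + B;
                runs[top - 2] = C;
            }
            else {
                // Merge B with C.
                runs[top - 2] = B + C;
            }
            runs.pop_back();
        }
        else if (B <= C) {
            runs[top - 2] = B + C;
            runs.pop_back();
        }
        else {
            break;  // run lengths decrease fast enough; nothing to merge
        }
    }
}
```

Pushing 30, 20, 10 in that order leaves a single run of 60: the third push finds 30 <= 20 + 10, merges the two newest runs into 30, and the remaining pair 30, 30 is then merged as well. Pushing 100, 40, 10, 20 leaves 100, 40, 30, because only the last two runs violate the ordering.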
+
+template <typename Tag, typename type>
+static int
+aforce_collapse_(type *arr, npy_intp *tosort, run *stack, npy_intp *stack_ptr,
+                 buffer_intp *buffer)
+{
+    int ret;
+    npy_intp top = *stack_ptr;
+
+    while (2 < top) {
+        if (stack[top - 3].l <= stack[top - 1].l) {
+            ret = amerge_at_<Tag>(arr, tosort, stack, top - 3, buffer);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 3].l += stack[top - 2].l;
+            stack[top - 2] = stack[top - 1];
+            --top;
+        }
+        else {
+            ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += stack[top - 1].l;
+            --top;
+        }
+    }
+
+    if (1 < top) {
+        ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+template <typename Tag>
+static int
+atimsort_(void *v, npy_intp *tosort, npy_intp num)
+{
+    using type = typename Tag::type;
+    int ret;
+    npy_intp l, n, stack_ptr, minrun;
+    buffer_intp buffer;
+    run stack[TIMSORT_STACK_SIZE];
+    buffer.pw = NULL;
+    buffer.size = 0;
+    stack_ptr = 0;
+    minrun = compute_min_run(num);
+
+    for (l = 0; l < num;) {
+        n = acount_run_<Tag>((type *)v, tosort, l, num, minrun);
+        stack[stack_ptr].s = l;
+        stack[stack_ptr].l = n;
+        ++stack_ptr;
+        ret = atry_collapse_<Tag>((type *)v, tosort, stack, &stack_ptr,
+                                  &buffer);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            goto cleanup;
+        }
+
+        l += n;
+    }
+
+    ret = aforce_collapse_<Tag>((type *)v, tosort, stack, &stack_ptr, &buffer);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    ret = 0;
+cleanup:
+
+    if (buffer.pw != NULL) {
+        free(buffer.pw);
+    }
+
+    return ret;
+}
+
+/* For string sorts and the generic sort, element comparisons are very
+ * expensive, so the cost of insertion sort (O(N**2) comparisons) clearly
+ * hurts. Implementing binary insertion sort, and possibly gallop mode during
+ * merging, could improve performance. As a temporary workaround, we use a
+ * shorter minimum run length to reduce the cost of insertion sort.
+ */
+
+static npy_intp
+compute_min_run_short(npy_intp num)
+{
+    npy_intp r = 0;
+
+    while (16 < num) {
+        r |= num & 1;
+        num >>= 1;
+    }
+
+    return num + r;
+}
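A few concrete inputs make the bit manipulation above easier to check by hand. The following is a minimal standalone sketch, not part of the patch: `min_run_short_sketch` and `intp_sketch` are hypothetical names, and `std::ptrdiff_t` is assumed as a stand-in for `npy_intp`.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for npy_intp (assumption: a signed, pointer-sized integer).
using intp_sketch = std::ptrdiff_t;

// Same bit logic as compute_min_run_short: keep halving num while it exceeds
// 16, and record in r whether any 1 bit was shifted out along the way.
static intp_sketch
min_run_short_sketch(intp_sketch num)
{
    intp_sketch r = 0;

    while (16 < num) {
        r |= num & 1;
        num >>= 1;
    }

    return num + r;
}

// Worked examples, checked at run time.
inline void
check_min_run_short_sketch()
{
    // Inputs of at most 16 elements are returned unchanged (one run).
    assert(min_run_short_sketch(10) == 10);
    assert(min_run_short_sketch(16) == 16);
    // 17 -> 8 with a 1 bit shifted out, so the result is 8 + 1.
    assert(min_run_short_sketch(17) == 9);
    // 100 -> 50 -> 25 -> 12; the low bit of 25 is shifted out: 12 + 1.
    assert(min_run_short_sketch(100) == 13);
    // Powers of two above 16 reduce to exactly 16.
    assert(min_run_short_sketch(1024) == 16);
}
```

The rounding-up term `r` matters for inputs such as 33, which halves to 16 with a bit shifted out and therefore yields 17 rather than 16.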
+
+/*
+ *****************************************************************************
+ **                             STRING SORTS                                **
+ *****************************************************************************
+ */
+
+template <typename Tag>
+struct string_buffer_ {
+    typename Tag::type *pw;
+    npy_intp size;
+    size_t len;
+};
+
+template <typename Tag>
+static inline int
+resize_buffer_(string_buffer_<Tag> *buffer, npy_intp new_size)
+{
+    using type = typename Tag::type;
+    if (new_size <= buffer->size) {
+        return 0;
+    }
+
+    if (NPY_UNLIKELY(buffer->pw == NULL)) {
+        buffer->pw = (type *)malloc(sizeof(type) * new_size * buffer->len);
+    }
+    else {
+        buffer->pw = (type *)realloc(buffer->pw,
+                                     sizeof(type) * new_size * buffer->len);
+    }
+
+    buffer->size = new_size;
+
+    if (NPY_UNLIKELY(buffer->pw == NULL)) {
+        return -NPY_ENOMEM;
+    }
+    else {
+        return 0;
+    }
+}
+
+template <typename Tag, typename type>
+static npy_intp
+count_run_(type *arr, npy_intp l, npy_intp num, npy_intp minrun, type *vp,
+           size_t len)
+{
+    npy_intp sz;
+    type *pl, *pi, *pj, *pr;
+
+    if (NPY_UNLIKELY(num - l == 1)) {
+        return 1;
+    }
+
+    pl = arr + l * len;
+
+    /* (not strictly) ascending sequence */
+    if (!Tag::less(pl + len, pl, len)) {
+        for (pi = pl + len;
+             pi < arr + (num - 1) * len && !Tag::less(pi + len, pi, len);
+             pi += len) {
+        }
+    }
+    else { /* (strictly) descending sequence */
+        for (pi = pl + len;
+             pi < arr + (num - 1) * len && Tag::less(pi + len, pi, len);
+             pi += len) {
+        }
+
+        for (pj = pl, pr = pi; pj < pr; pj += len, pr -= len) {
+            Tag::swap(pj, pr, len);
+        }
+    }
+
+    pi += len;
+    sz = (pi - pl) / len;
+
+    if (sz < minrun) {
+        if (l + minrun < num) {
+            sz = minrun;
+        }
+        else {
+            sz = num - l;
+        }
+
+        pr = pl + sz * len;
+
+        /* insertion sort */
+        for (; pi < pr; pi += len) {
+            Tag::copy(vp, pi, len);
+            pj = pi;
+
+            while (pl < pj && Tag::less(vp, pj - len, len)) {
+                Tag::copy(pj, pj - len, len);
+                pj -= len;
+            }
+
+            Tag::copy(pj, vp, len);
+        }
+    }
+
+    return sz;
+}
+
+template <typename Tag>
+static npy_intp
+gallop_right_(const typename Tag::type *arr, const npy_intp size,
+              const typename Tag::type *key, size_t len)
+{
+    npy_intp last_ofs, ofs, m;
+
+    if (Tag::less(key, arr, len)) {
+        return 0;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size; /* arr[ofs] is never accessed */
+            break;
+        }
+
+        if (Tag::less(key, arr + ofs * len, len)) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            /* ofs = 1, 3, 7, 15... */
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[last_ofs*len] <= key < arr[ofs*len] */
+    while (last_ofs + 1 < ofs) {
+        m = last_ofs + ((ofs - last_ofs) >> 1);
+
+        if (Tag::less(key, arr + m * len, len)) {
+            ofs = m;
+        }
+        else {
+            last_ofs = m;
+        }
+    }
+
+    /* now that arr[(ofs-1)*len] <= key < arr[ofs*len] */
+    return ofs;
+}
+
+template <typename Tag>
+static npy_intp
+gallop_left_(const typename Tag::type *arr, const npy_intp size,
+             const typename Tag::type *key, size_t len)
+{
+    npy_intp last_ofs, ofs, l, m, r;
+
+    if (Tag::less(arr + (size - 1) * len, key, len)) {
+        return size;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size;
+            break;
+        }
+
+        if (Tag::less(arr + (size - ofs - 1) * len, key, len)) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[(size-ofs-1)*len] < key <= arr[(size-last_ofs-1)*len] */
+    l = size - ofs - 1;
+    r = size - last_ofs - 1;
+
+    while (l + 1 < r) {
+        m = l + ((r - l) >> 1);
+
+        if (Tag::less(arr + m * len, key, len)) {
+            l = m;
+        }
+        else {
+            r = m;
+        }
+    }
+
+    /* now that arr[(r-1)*len] < key <= arr[r*len] */
+    return r;
+}
+
+template <typename Tag>
+static void
+merge_left_(typename Tag::type *p1, npy_intp l1, typename Tag::type *p2,
+            npy_intp l2, typename Tag::type *p3, size_t len)
+{
+    using type = typename Tag::type;
+    type *end = p2 + l2 * len;
+    memcpy(p3, p1, sizeof(type) * l1 * len);
+    /* first element must be from p2, else the caller would have skipped it */
+    Tag::copy(p1, p2, len);
+    p1 += len;
+    p2 += len;
+
+    while (p1 < p2 && p2 < end) {
+        if (Tag::less(p2, p3, len)) {
+            Tag::copy(p1, p2, len);
+            p1 += len;
+            p2 += len;
+        }
+        else {
+            Tag::copy(p1, p3, len);
+            p1 += len;
+            p3 += len;
+        }
+    }
+
+    if (p1 != p2) {
+        memcpy(p1, p3, sizeof(type) * (p2 - p1));
+    }
+}
+
+template <typename Tag, typename type>
+static void
+merge_right_(type *p1, npy_intp l1, type *p2, npy_intp l2, type *p3,
+             size_t len)
+{
+    npy_intp ofs;
+    type *start = p1 - len;
+    memcpy(p3, p2, sizeof(type) * l2 * len);
+    p1 += (l1 - 1) * len;
+    p2 += (l2 - 1) * len;
+    p3 += (l2 - 1) * len;
+    /* first element must be from p1, else the caller would have skipped it */
+    Tag::copy(p2, p1, len);
+    p2 -= len;
+    p1 -= len;
+
+    while (p1 < p2 && start < p1) {
+        if (Tag::less(p3, p1, len)) {
+            Tag::copy(p2, p1, len);
+            p2 -= len;
+            p1 -= len;
+        }
+        else {
+            Tag::copy(p2, p3, len);
+            p2 -= len;
+            p3 -= len;
+        }
+    }
+
+    if (p1 != p2) {
+        ofs = p2 - start;
+        memcpy(start + len, p3 - ofs + len, sizeof(type) * ofs);
+    }
+}
+
+template <typename Tag, typename type>
+static int
+merge_at_(type *arr, const run *stack, const npy_intp at,
+          string_buffer_<Tag> *buffer, size_t len)
+{
+    int ret;
+    npy_intp s1, l1, s2, l2, k;
+    type *p1, *p2;
+    s1 = stack[at].s;
+    l1 = stack[at].l;
+    s2 = stack[at + 1].s;
+    l2 = stack[at + 1].l;
+    /* arr[s2] belongs to arr[s1+k] */
+    Tag::copy(buffer->pw, arr + s2 * len, len);
+    k = gallop_right_<Tag>(arr + s1 * len, l1, buffer->pw, len);
+
+    if (l1 == k) {
+        /* already sorted */
+        return 0;
+    }
+
+    p1 = arr + (s1 + k) * len;
+    l1 -= k;
+    p2 = arr + s2 * len;
+    /* arr[s2-1] belongs to arr[s2+l2] */
+    Tag::copy(buffer->pw, arr + (s2 - 1) * len, len);
+    l2 = gallop_left_<Tag>(arr + s2 * len, l2, buffer->pw, len);
+
+    if (l2 < l1) {
+        ret = resize_buffer_<Tag>(buffer, l2);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        merge_right_<Tag>(p1, l1, p2, l2, buffer->pw, len);
+    }
+    else {
+        ret = resize_buffer_<Tag>(buffer, l1);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        merge_left_<Tag>(p1, l1, p2, l2, buffer->pw, len);
+    }
+
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+try_collapse_(type *arr, run *stack, npy_intp *stack_ptr,
+              string_buffer_<Tag> *buffer, size_t len)
+{
+    int ret;
+    npy_intp A, B, C, top;
+    top = *stack_ptr;
+
+    while (1 < top) {
+        B = stack[top - 2].l;
+        C = stack[top - 1].l;
+
+        if ((2 < top && stack[top - 3].l <= B + C) ||
+            (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
+            A = stack[top - 3].l;
+
+            if (A <= C) {
+                ret = merge_at_<Tag>(arr, stack, top - 3, buffer, len);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 3].l += B;
+                stack[top - 2] = stack[top - 1];
+                --top;
+            }
+            else {
+                ret = merge_at_<Tag>(arr, stack, top - 2, buffer, len);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 2].l += C;
+                --top;
+            }
+        }
+        else if (1 < top && B <= C) {
+            ret = merge_at_<Tag>(arr, stack, top - 2, buffer, len);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += C;
+            --top;
+        }
+        else {
+            break;
+        }
+    }
+
+    *stack_ptr = top;
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+force_collapse_(type *arr, run *stack, npy_intp *stack_ptr,
+                string_buffer_<Tag> *buffer, size_t len)
+{
+    int ret;
+    npy_intp top = *stack_ptr;
+
+    while (2 < top) {
+        if (stack[top - 3].l <= stack[top - 1].l) {
+            ret = merge_at_<Tag>(arr, stack, top - 3, buffer, len);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 3].l += stack[top - 2].l;
+            stack[top - 2] = stack[top - 1];
+            --top;
+        }
+        else {
+            ret = merge_at_<Tag>(arr, stack, top - 2, buffer, len);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += stack[top - 1].l;
+            --top;
+        }
+    }
+
+    if (1 < top) {
+        ret = merge_at_<Tag>(arr, stack, top - 2, buffer, len);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+template <typename Tag>
+NPY_NO_EXPORT int
+string_timsort_(void *start, npy_intp num, void *varr)
+{
+    using type = typename Tag::type;
+    PyArrayObject *arr = reinterpret_cast<PyArrayObject *>(varr);
+    size_t elsize = PyArray_ITEMSIZE(arr);
+    size_t len = elsize / sizeof(type);
+    int ret;
+    npy_intp l, n, stack_ptr, minrun;
+    run stack[TIMSORT_STACK_SIZE];
+    string_buffer_<Tag> buffer;
+
+    /* Items that have zero size don't make sense to sort */
+    if (len == 0) {
+        return 0;
+    }
+
+    buffer.pw = NULL;
+    buffer.size = 0;
+    buffer.len = len;
+    stack_ptr = 0;
+    minrun = compute_min_run_short(num);
+    /* used for insertion sort and gallop key */
+    ret = resize_buffer_<Tag>(&buffer, 1);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    for (l = 0; l < num;) {
+        n = count_run_<Tag>((type *)start, l, num, minrun, buffer.pw, len);
+        /* s and l count elements; merge_at_ scales them by len */
+        stack[stack_ptr].s = l;
+        stack[stack_ptr].l = n;
+        ++stack_ptr;
+        ret = try_collapse_<Tag>((type *)start, stack, &stack_ptr, &buffer,
+                                 len);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            goto cleanup;
+        }
+
+        l += n;
+    }
+
+    ret = force_collapse_<Tag>((type *)start, stack, &stack_ptr, &buffer, len);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    if (buffer.pw != NULL) {
+        free(buffer.pw);
+    }
+    return ret;
+}
+
+/* argsort */
+
+template <typename Tag, typename type>
+static npy_intp
+acount_run_(type *arr, npy_intp *tosort, npy_intp l, npy_intp num,
+            npy_intp minrun, size_t len)
+{
+    npy_intp sz;
+    npy_intp vi;
+    npy_intp *pl, *pi, *pj, *pr;
+
+    if (NPY_UNLIKELY(num - l == 1)) {
+        return 1;
+    }
+
+    pl = tosort + l;
+
+    /* (not strictly) ascending sequence */
+    if (!Tag::less(arr + (*(pl + 1)) * len, arr + (*pl) * len, len)) {
+        for (pi = pl + 1;
+             pi < tosort + num - 1 &&
+             !Tag::less(arr + (*(pi + 1)) * len, arr + (*pi) * len, len);
+             ++pi) {
+        }
+    }
+    else { /* (strictly) descending sequence */
+        for (pi = pl + 1;
+             pi < tosort + num - 1 &&
+             Tag::less(arr + (*(pi + 1)) * len, arr + (*pi) * len, len);
+             ++pi) {
+        }
+
+        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
+            std::swap(*pj, *pr);
+        }
+    }
+
+    ++pi;
+    sz = pi - pl;
+
+    if (sz < minrun) {
+        if (l + minrun < num) {
+            sz = minrun;
+        }
+        else {
+            sz = num - l;
+        }
+
+        pr = pl + sz;
+
+        /* insertion sort */
+        for (; pi < pr; ++pi) {
+            vi = *pi;
+            pj = pi;
+
+            while (pl < pj &&
+                   Tag::less(arr + vi * len, arr + (*(pj - 1)) * len, len)) {
+                *pj = *(pj - 1);
+                --pj;
+            }
+
+            *pj = vi;
+        }
+    }
+
+    return sz;
+}
+
+template <typename Tag, typename type>
+static npy_intp
+agallop_left_(const type *arr, const npy_intp *tosort, const npy_intp size,
+              const type *key, size_t len)
+{
+    npy_intp last_ofs, ofs, l, m, r;
+
+    if (Tag::less(arr + tosort[size - 1] * len, key, len)) {
+        return size;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size;
+            break;
+        }
+
+        if (Tag::less(arr + tosort[size - ofs - 1] * len, key, len)) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[tosort[size-ofs-1]*len] < key <=
+     * arr[tosort[size-last_ofs-1]*len] */
+    l = size - ofs - 1;
+    r = size - last_ofs - 1;
+
+    while (l + 1 < r) {
+        m = l + ((r - l) >> 1);
+
+        if (Tag::less(arr + tosort[m] * len, key, len)) {
+            l = m;
+        }
+        else {
+            r = m;
+        }
+    }
+
+    /* now that arr[tosort[r-1]*len] < key <= arr[tosort[r]*len] */
+    return r;
+}
+
+template <typename Tag, typename type>
+static npy_intp
+agallop_right_(const type *arr, const npy_intp *tosort, const npy_intp size,
+               const type *key, size_t len)
+{
+    npy_intp last_ofs, ofs, m;
+
+    if (Tag::less(key, arr + tosort[0] * len, len)) {
+        return 0;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size; /* arr[ofs] is never accessed */
+            break;
+        }
+
+        if (Tag::less(key, arr + tosort[ofs] * len, len)) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            /* ofs = 1, 3, 7, 15... */
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[tosort[last_ofs]*len] <= key < arr[tosort[ofs]*len] */
+    while (last_ofs + 1 < ofs) {
+        m = last_ofs + ((ofs - last_ofs) >> 1);
+
+        if (Tag::less(key, arr + tosort[m] * len, len)) {
+            ofs = m;
+        }
+        else {
+            last_ofs = m;
+        }
+    }
+
+    /* now that arr[tosort[ofs-1]*len] <= key < arr[tosort[ofs]*len] */
+    return ofs;
+}
+
+template <typename Tag, typename type>
+static void
+amerge_left_(type *arr, npy_intp *p1, npy_intp l1, npy_intp *p2, npy_intp l2,
+             npy_intp *p3, size_t len)
+{
+    npy_intp *end = p2 + l2;
+    memcpy(p3, p1, sizeof(npy_intp) * l1);
+    /* first element must be from p2, else the caller would have skipped it */
+    *p1++ = *p2++;
+
+    while (p1 < p2 && p2 < end) {
+        if (Tag::less(arr + (*p2) * len, arr + (*p3) * len, len)) {
+            *p1++ = *p2++;
+        }
+        else {
+            *p1++ = *p3++;
+        }
+    }
+
+    if (p1 != p2) {
+        memcpy(p1, p3, sizeof(npy_intp) * (p2 - p1));
+    }
+}
+
+template <typename Tag, typename type>
+static void
+amerge_right_(type *arr, npy_intp *p1, npy_intp l1, npy_intp *p2, npy_intp l2,
+              npy_intp *p3, size_t len)
+{
+    npy_intp ofs;
+    npy_intp *start = p1 - 1;
+    memcpy(p3, p2, sizeof(npy_intp) * l2);
+    p1 += l1 - 1;
+    p2 += l2 - 1;
+    p3 += l2 - 1;
+    /* first element must be from p1, else the caller would have skipped it */
+    *p2-- = *p1--;
+
+    while (p1 < p2 && start < p1) {
+        if (Tag::less(arr + (*p3) * len, arr + (*p1) * len, len)) {
+            *p2-- = *p1--;
+        }
+        else {
+            *p2-- = *p3--;
+        }
+    }
+
+    if (p1 != p2) {
+        ofs = p2 - start;
+        memcpy(start + 1, p3 - ofs + 1, sizeof(npy_intp) * ofs);
+    }
+}
+
+template <typename Tag, typename type>
+static int
+amerge_at_(type *arr, npy_intp *tosort, const run *stack, const npy_intp at,
+           buffer_intp *buffer, size_t len)
+{
+    int ret;
+    npy_intp s1, l1, s2, l2, k;
+    npy_intp *p1, *p2;
+    s1 = stack[at].s;
+    l1 = stack[at].l;
+    s2 = stack[at + 1].s;
+    l2 = stack[at + 1].l;
+    /* tosort[s2] belongs to tosort[s1+k] */
+    k = agallop_right_<Tag>(arr, tosort + s1, l1, arr + tosort[s2] * len, len);
+
+    if (l1 == k) {
+        /* already sorted */
+        return 0;
+    }
+
+    p1 = tosort + s1 + k;
+    l1 -= k;
+    p2 = tosort + s2;
+    /* tosort[s2-1] belongs to tosort[s2+l2] */
+    l2 = agallop_left_<Tag>(arr, tosort + s2, l2, arr + tosort[s2 - 1] * len,
+                            len);
+
+    if (l2 < l1) {
+        ret = resize_buffer_intp(buffer, l2);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        amerge_right_<Tag>(arr, p1, l1, p2, l2, buffer->pw, len);
+    }
+    else {
+        ret = resize_buffer_intp(buffer, l1);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        amerge_left_<Tag>(arr, p1, l1, p2, l2, buffer->pw, len);
+    }
+
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+atry_collapse_(type *arr, npy_intp *tosort, run *stack, npy_intp *stack_ptr,
+               buffer_intp *buffer, size_t len)
+{
+    int ret;
+    npy_intp A, B, C, top;
+    top = *stack_ptr;
+
+    while (1 < top) {
+        B = stack[top - 2].l;
+        C = stack[top - 1].l;
+
+        if ((2 < top && stack[top - 3].l <= B + C) ||
+            (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
+            A = stack[top - 3].l;
+
+            if (A <= C) {
+                ret = amerge_at_<Tag>(arr, tosort, stack, top - 3, buffer,
+                                      len);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 3].l += B;
+                stack[top - 2] = stack[top - 1];
+                --top;
+            }
+            else {
+                ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer,
+                                      len);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 2].l += C;
+                --top;
+            }
+        }
+        else if (1 < top && B <= C) {
+            ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer, len);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += C;
+            --top;
+        }
+        else {
+            break;
+        }
+    }
+
+    *stack_ptr = top;
+    return 0;
+}
+
+template <typename Tag, typename type>
+static int
+aforce_collapse_(type *arr, npy_intp *tosort, run *stack, npy_intp *stack_ptr,
+                 buffer_intp *buffer, size_t len)
+{
+    int ret;
+    npy_intp top = *stack_ptr;
+
+    while (2 < top) {
+        if (stack[top - 3].l <= stack[top - 1].l) {
+            ret = amerge_at_<Tag>(arr, tosort, stack, top - 3, buffer, len);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 3].l += stack[top - 2].l;
+            stack[top - 2] = stack[top - 1];
+            --top;
+        }
+        else {
+            ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer, len);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += stack[top - 1].l;
+            --top;
+        }
+    }
+
+    if (1 < top) {
+        ret = amerge_at_<Tag>(arr, tosort, stack, top - 2, buffer, len);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+template <typename Tag>
+NPY_NO_EXPORT int
+string_atimsort_(void *start, npy_intp *tosort, npy_intp num, void *varr)
+{
+    using type = typename Tag::type;
+    PyArrayObject *arr = reinterpret_cast<PyArrayObject *>(varr);
+    size_t elsize = PyArray_ITEMSIZE(arr);
+    size_t len = elsize / sizeof(type);
+    int ret;
+    npy_intp l, n, stack_ptr, minrun;
+    run stack[TIMSORT_STACK_SIZE];
+    buffer_intp buffer;
+
+    /* Items that have zero size don't make sense to sort */
+    if (len == 0) {
+        return 0;
+    }
+
+    buffer.pw = NULL;
+    buffer.size = 0;
+    stack_ptr = 0;
+    minrun = compute_min_run_short(num);
+
+    for (l = 0; l < num;) {
+        n = acount_run_<Tag>((type *)start, tosort, l, num, minrun, len);
+        /* s and l index into tosort and are not scaled by len */
+        stack[stack_ptr].s = l;
+        stack[stack_ptr].l = n;
+        ++stack_ptr;
+        ret = atry_collapse_<Tag>((type *)start, tosort, stack, &stack_ptr,
+                                  &buffer, len);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            goto cleanup;
+        }
+
+        l += n;
+    }
+
+    ret = aforce_collapse_<Tag>((type *)start, tosort, stack, &stack_ptr,
+                                &buffer, len);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    if (buffer.pw != NULL) {
+        free(buffer.pw);
+    }
+    return ret;
+}
+
+/*
+ *****************************************************************************
+ **                             GENERIC SORT                                **
+ *****************************************************************************
+ */
+
+typedef struct {
+    char *pw;
+    npy_intp size;
+    size_t len;
+} buffer_char;
+
+static inline int
+resize_buffer_char(buffer_char *buffer, npy_intp new_size)
+{
+    char *new_pw;
+
+    if (new_size <= buffer->size) {
+        return 0;
+    }
+
+    /*
+     * realloc(NULL, n) behaves like malloc(n). Assign through a temporary
+     * so that the old block is not leaked when the allocation fails; the
+     * caller's cleanup path still frees buffer->pw.
+     */
+    new_pw = (char *)realloc(buffer->pw,
+                             sizeof(char) * new_size * buffer->len);
+    if (NPY_UNLIKELY(new_pw == NULL)) {
+        return -NPY_ENOMEM;
+    }
+
+    buffer->pw = new_pw;
+    buffer->size = new_size;
+    return 0;
+}
+
+static npy_intp
+npy_count_run(char *arr, npy_intp l, npy_intp num, npy_intp minrun, char *vp,
+              size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    npy_intp sz;
+    char *pl, *pi, *pj, *pr;
+
+    if (NPY_UNLIKELY(num - l == 1)) {
+        return 1;
+    }
+
+    pl = arr + l * len;
+
+    /* (not strictly) ascending sequence */
+    if (cmp(pl, pl + len, py_arr) <= 0) {
+        for (pi = pl + len;
+             pi < arr + (num - 1) * len && cmp(pi, pi + len, py_arr) <= 0;
+             pi += len) {
+        }
+    }
+    else { /* (strictly) descending sequence */
+        for (pi = pl + len;
+             pi < arr + (num - 1) * len && cmp(pi + len, pi, py_arr) < 0;
+             pi += len) {
+        }
+
+        for (pj = pl, pr = pi; pj < pr; pj += len, pr -= len) {
+            GENERIC_SWAP(pj, pr, len);
+        }
+    }
+
+    pi += len;
+    sz = (pi - pl) / len;
+
+    if (sz < minrun) {
+        if (l + minrun < num) {
+            sz = minrun;
+        }
+        else {
+            sz = num - l;
+        }
+
+        pr = pl + sz * len;
+
+        /* insertion sort */
+        for (; pi < pr; pi += len) {
+            GENERIC_COPY(vp, pi, len);
+            pj = pi;
+
+            while (pl < pj && cmp(vp, pj - len, py_arr) < 0) {
+                GENERIC_COPY(pj, pj - len, len);
+                pj -= len;
+            }
+
+            GENERIC_COPY(pj, vp, len);
+        }
+    }
+
+    return sz;
+}
+
+static npy_intp
+npy_gallop_right(const char *arr, const npy_intp size, const char *key,
+                 size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    npy_intp last_ofs, ofs, m;
+
+    if (cmp(key, arr, py_arr) < 0) {
+        return 0;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size; /* arr[ofs] is never accessed */
+            break;
+        }
+
+        if (cmp(key, arr + ofs * len, py_arr) < 0) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            /* ofs = 1, 3, 7, 15... */
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[last_ofs*len] <= key < arr[ofs*len] */
+    while (last_ofs + 1 < ofs) {
+        m = last_ofs + ((ofs - last_ofs) >> 1);
+
+        if (cmp(key, arr + m * len, py_arr) < 0) {
+            ofs = m;
+        }
+        else {
+            last_ofs = m;
+        }
+    }
+
+    /* now that arr[(ofs-1)*len] <= key < arr[ofs*len] */
+    return ofs;
+}
+
+static npy_intp
+npy_gallop_left(const char *arr, const npy_intp size, const char *key,
+                size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    npy_intp last_ofs, ofs, l, m, r;
+
+    if (cmp(arr + (size - 1) * len, key, py_arr) < 0) {
+        return size;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size;
+            break;
+        }
+
+        if (cmp(arr + (size - ofs - 1) * len, key, py_arr) < 0) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[(size-ofs-1)*len] < key <= arr[(size-last_ofs-1)*len] */
+    l = size - ofs - 1;
+    r = size - last_ofs - 1;
+
+    while (l + 1 < r) {
+        m = l + ((r - l) >> 1);
+
+        if (cmp(arr + m * len, key, py_arr) < 0) {
+            l = m;
+        }
+        else {
+            r = m;
+        }
+    }
+
+    /* now that arr[(r-1)*len] < key <= arr[r*len] */
+    return r;
+}
+
+static void
+npy_merge_left(char *p1, npy_intp l1, char *p2, npy_intp l2, char *p3,
+               size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    char *end = p2 + l2 * len;
+    memcpy(p3, p1, sizeof(char) * l1 * len);
+    /* first element must be in p2 otherwise skipped in the caller */
+    GENERIC_COPY(p1, p2, len);
+    p1 += len;
+    p2 += len;
+
+    while (p1 < p2 && p2 < end) {
+        if (cmp(p2, p3, py_arr) < 0) {
+            GENERIC_COPY(p1, p2, len);
+            p1 += len;
+            p2 += len;
+        }
+        else {
+            GENERIC_COPY(p1, p3, len);
+            p1 += len;
+            p3 += len;
+        }
+    }
+
+    if (p1 != p2) {
+        memcpy(p1, p3, sizeof(char) * (p2 - p1));
+    }
+}
+
+static void
+npy_merge_right(char *p1, npy_intp l1, char *p2, npy_intp l2, char *p3,
+                size_t len, PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    npy_intp ofs;
+    char *start = p1 - len;
+    memcpy(p3, p2, sizeof(char) * l2 * len);
+    p1 += (l1 - 1) * len;
+    p2 += (l2 - 1) * len;
+    p3 += (l2 - 1) * len;
+    /* first element must be in p1 otherwise skipped in the caller */
+    GENERIC_COPY(p2, p1, len);
+    p2 -= len;
+    p1 -= len;
+
+    while (p1 < p2 && start < p1) {
+        if (cmp(p3, p1, py_arr) < 0) {
+            GENERIC_COPY(p2, p1, len);
+            p2 -= len;
+            p1 -= len;
+        }
+        else {
+            GENERIC_COPY(p2, p3, len);
+            p2 -= len;
+            p3 -= len;
+        }
+    }
+
+    if (p1 != p2) {
+        ofs = p2 - start;
+        memcpy(start + len, p3 - ofs + len, sizeof(char) * ofs);
+    }
+}
+
+static int
+npy_merge_at(char *arr, const run *stack, const npy_intp at,
+             buffer_char *buffer, size_t len, PyArray_CompareFunc *cmp,
+             PyArrayObject *py_arr)
+{
+    int ret;
+    npy_intp s1, l1, s2, l2, k;
+    char *p1, *p2;
+    s1 = stack[at].s;
+    l1 = stack[at].l;
+    s2 = stack[at + 1].s;
+    l2 = stack[at + 1].l;
+    /* arr[s2] belongs to arr[s1+k] */
+    GENERIC_COPY(buffer->pw, arr + s2 * len, len);
+    k = npy_gallop_right(arr + s1 * len, l1, buffer->pw, len, cmp, py_arr);
+
+    if (l1 == k) {
+        /* already sorted */
+        return 0;
+    }
+
+    p1 = arr + (s1 + k) * len;
+    l1 -= k;
+    p2 = arr + s2 * len;
+    /* arr[s2-1] belongs to arr[s2+l2] */
+    GENERIC_COPY(buffer->pw, arr + (s2 - 1) * len, len);
+    l2 = npy_gallop_left(arr + s2 * len, l2, buffer->pw, len, cmp, py_arr);
+
+    if (l2 < l1) {
+        ret = resize_buffer_char(buffer, l2);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        npy_merge_right(p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
+    }
+    else {
+        ret = resize_buffer_char(buffer, l1);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        npy_merge_left(p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
+    }
+
+    return 0;
+}
+
+static int
+npy_try_collapse(char *arr, run *stack, npy_intp *stack_ptr,
+                 buffer_char *buffer, size_t len, PyArray_CompareFunc *cmp,
+                 PyArrayObject *py_arr)
+{
+    int ret;
+    npy_intp A, B, C, top;
+    top = *stack_ptr;
+
+    while (1 < top) {
+        B = stack[top - 2].l;
+        C = stack[top - 1].l;
+
+        if ((2 < top && stack[top - 3].l <= B + C) ||
+            (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
+            A = stack[top - 3].l;
+
+            if (A <= C) {
+                ret = npy_merge_at(arr, stack, top - 3, buffer, len, cmp,
+                                   py_arr);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 3].l += B;
+                stack[top - 2] = stack[top - 1];
+                --top;
+            }
+            else {
+                ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp,
+                                   py_arr);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 2].l += C;
+                --top;
+            }
+        }
+        else if (1 < top && B <= C) {
+            ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += C;
+            --top;
+        }
+        else {
+            break;
+        }
+    }
+
+    *stack_ptr = top;
+    return 0;
+}
+
+static int
+npy_force_collapse(char *arr, run *stack, npy_intp *stack_ptr,
+                   buffer_char *buffer, size_t len, PyArray_CompareFunc *cmp,
+                   PyArrayObject *py_arr)
+{
+    int ret;
+    npy_intp top = *stack_ptr;
+
+    while (2 < top) {
+        if (stack[top - 3].l <= stack[top - 1].l) {
+            ret = npy_merge_at(arr, stack, top - 3, buffer, len, cmp, py_arr);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 3].l += stack[top - 2].l;
+            stack[top - 2] = stack[top - 1];
+            --top;
+        }
+        else {
+            ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += stack[top - 1].l;
+            --top;
+        }
+    }
+
+    if (1 < top) {
+        ret = npy_merge_at(arr, stack, top - 2, buffer, len, cmp, py_arr);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+NPY_NO_EXPORT int
+npy_timsort(void *start, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = reinterpret_cast<PyArrayObject *>(varr);
+    size_t len = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    int ret;
+    npy_intp l, n, stack_ptr, minrun;
+    run stack[TIMSORT_STACK_SIZE];
+    buffer_char buffer;
+
+    /* Items that have zero size don't make sense to sort */
+    if (len == 0) {
+        return 0;
+    }
+
+    buffer.pw = NULL;
+    buffer.size = 0;
+    buffer.len = len;
+    stack_ptr = 0;
+    minrun = compute_min_run_short(num);
+
+    /* used for insertion sort and gallop key */
+    ret = resize_buffer_char(&buffer, len);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    for (l = 0; l < num;) {
+        n = npy_count_run((char *)start, l, num, minrun, buffer.pw, len, cmp,
+                          arr);
+
+        /* s and l are in elements; npy_merge_at scales them by len */
+        stack[stack_ptr].s = l;
+        stack[stack_ptr].l = n;
+        ++stack_ptr;
+        ret = npy_try_collapse((char *)start, stack, &stack_ptr, &buffer, len,
+                               cmp, arr);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            goto cleanup;
+        }
+
+        l += n;
+    }
+
+    ret = npy_force_collapse((char *)start, stack, &stack_ptr, &buffer, len,
+                             cmp, arr);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    if (buffer.pw != NULL) {
+        free(buffer.pw);
+    }
+    return ret;
+}
+
+/* argsort */
+
+static npy_intp
+npy_acount_run(char *arr, npy_intp *tosort, npy_intp l, npy_intp num,
+               npy_intp minrun, size_t len, PyArray_CompareFunc *cmp,
+               PyArrayObject *py_arr)
+{
+    npy_intp sz;
+    npy_intp vi;
+    npy_intp *pl, *pi, *pj, *pr;
+
+    if (NPY_UNLIKELY(num - l == 1)) {
+        return 1;
+    }
+
+    pl = tosort + l;
+
+    /* (not strictly) ascending sequence */
+    if (cmp(arr + (*pl) * len, arr + (*(pl + 1)) * len, py_arr) <= 0) {
+        for (pi = pl + 1;
+             pi < tosort + num - 1 &&
+             cmp(arr + (*pi) * len, arr + (*(pi + 1)) * len, py_arr) <= 0;
+             ++pi) {
+        }
+    }
+    else { /* (strictly) descending sequence */
+        for (pi = pl + 1;
+             pi < tosort + num - 1 &&
+             cmp(arr + (*(pi + 1)) * len, arr + (*pi) * len, py_arr) < 0;
+             ++pi) {
+        }
+
+        for (pj = pl, pr = pi; pj < pr; ++pj, --pr) {
+            std::swap(*pj, *pr);
+        }
+    }
+
+    ++pi;
+    sz = pi - pl;
+
+    if (sz < minrun) {
+        if (l + minrun < num) {
+            sz = minrun;
+        }
+        else {
+            sz = num - l;
+        }
+
+        pr = pl + sz;
+
+        /* insertion sort */
+        for (; pi < pr; ++pi) {
+            vi = *pi;
+            pj = pi;
+
+            while (pl < pj &&
+                   cmp(arr + vi * len, arr + (*(pj - 1)) * len, py_arr) < 0) {
+                *pj = *(pj - 1);
+                --pj;
+            }
+
+            *pj = vi;
+        }
+    }
+
+    return sz;
+}
+
+static npy_intp
+npy_agallop_left(const char *arr, const npy_intp *tosort, const npy_intp size,
+                 const char *key, size_t len, PyArray_CompareFunc *cmp,
+                 PyArrayObject *py_arr)
+{
+    npy_intp last_ofs, ofs, l, m, r;
+
+    if (cmp(arr + tosort[size - 1] * len, key, py_arr) < 0) {
+        return size;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size;
+            break;
+        }
+
+        if (cmp(arr + tosort[size - ofs - 1] * len, key, py_arr) < 0) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[tosort[size-ofs-1]*len] < key <=
+     * arr[tosort[size-last_ofs-1]*len] */
+    l = size - ofs - 1;
+    r = size - last_ofs - 1;
+
+    while (l + 1 < r) {
+        m = l + ((r - l) >> 1);
+
+        if (cmp(arr + tosort[m] * len, key, py_arr) < 0) {
+            l = m;
+        }
+        else {
+            r = m;
+        }
+    }
+
+    /* now that arr[tosort[r-1]*len] < key <= arr[tosort[r]*len] */
+    return r;
+}
+
+static npy_intp
+npy_agallop_right(const char *arr, const npy_intp *tosort, const npy_intp size,
+                  const char *key, size_t len, PyArray_CompareFunc *cmp,
+                  PyArrayObject *py_arr)
+{
+    npy_intp last_ofs, ofs, m;
+
+    if (cmp(key, arr + tosort[0] * len, py_arr) < 0) {
+        return 0;
+    }
+
+    last_ofs = 0;
+    ofs = 1;
+
+    for (;;) {
+        if (size <= ofs || ofs < 0) {
+            ofs = size; /* arr[ofs] is never accessed */
+            break;
+        }
+
+        if (cmp(key, arr + tosort[ofs] * len, py_arr) < 0) {
+            break;
+        }
+        else {
+            last_ofs = ofs;
+            /* ofs = 1, 3, 7, 15... */
+            ofs = (ofs << 1) + 1;
+        }
+    }
+
+    /* now that arr[tosort[last_ofs]*len] <= key < arr[tosort[ofs]*len] */
+    while (last_ofs + 1 < ofs) {
+        m = last_ofs + ((ofs - last_ofs) >> 1);
+
+        if (cmp(key, arr + tosort[m] * len, py_arr) < 0) {
+            ofs = m;
+        }
+        else {
+            last_ofs = m;
+        }
+    }
+
+    /* now that arr[tosort[ofs-1]*len] <= key < arr[tosort[ofs]*len] */
+    return ofs;
+}
+
+static void
+npy_amerge_left(char *arr, npy_intp *p1, npy_intp l1, npy_intp *p2,
+                npy_intp l2, npy_intp *p3, size_t len,
+                PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    npy_intp *end = p2 + l2;
+    memcpy(p3, p1, sizeof(npy_intp) * l1);
+    /* first element must be in p2 otherwise skipped in the caller */
+    *p1++ = *p2++;
+
+    while (p1 < p2 && p2 < end) {
+        if (cmp(arr + (*p2) * len, arr + (*p3) * len, py_arr) < 0) {
+            *p1++ = *p2++;
+        }
+        else {
+            *p1++ = *p3++;
+        }
+    }
+
+    if (p1 != p2) {
+        memcpy(p1, p3, sizeof(npy_intp) * (p2 - p1));
+    }
+}
+
+static void
+npy_amerge_right(char *arr, npy_intp *p1, npy_intp l1, npy_intp *p2,
+                 npy_intp l2, npy_intp *p3, size_t len,
+                 PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    npy_intp ofs;
+    npy_intp *start = p1 - 1;
+    memcpy(p3, p2, sizeof(npy_intp) * l2);
+    p1 += l1 - 1;
+    p2 += l2 - 1;
+    p3 += l2 - 1;
+    /* first element must be in p1 otherwise skipped in the caller */
+    *p2-- = *p1--;
+
+    while (p1 < p2 && start < p1) {
+        if (cmp(arr + (*p3) * len, arr + (*p1) * len, py_arr) < 0) {
+            *p2-- = *p1--;
+        }
+        else {
+            *p2-- = *p3--;
+        }
+    }
+
+    if (p1 != p2) {
+        ofs = p2 - start;
+        memcpy(start + 1, p3 - ofs + 1, sizeof(npy_intp) * ofs);
+    }
+}
+
+static int
+npy_amerge_at(char *arr, npy_intp *tosort, const run *stack, const npy_intp at,
+              buffer_intp *buffer, size_t len, PyArray_CompareFunc *cmp,
+              PyArrayObject *py_arr)
+{
+    int ret;
+    npy_intp s1, l1, s2, l2, k;
+    npy_intp *p1, *p2;
+    s1 = stack[at].s;
+    l1 = stack[at].l;
+    s2 = stack[at + 1].s;
+    l2 = stack[at + 1].l;
+    /* tosort[s2] belongs to tosort[s1+k] */
+    k = npy_agallop_right(arr, tosort + s1, l1, arr + tosort[s2] * len, len,
+                          cmp, py_arr);
+
+    if (l1 == k) {
+        /* already sorted */
+        return 0;
+    }
+
+    p1 = tosort + s1 + k;
+    l1 -= k;
+    p2 = tosort + s2;
+    /* tosort[s2-1] belongs to tosort[s2+l2] */
+    l2 = npy_agallop_left(arr, tosort + s2, l2, arr + tosort[s2 - 1] * len,
+                          len, cmp, py_arr);
+
+    if (l2 < l1) {
+        ret = resize_buffer_intp(buffer, l2);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        npy_amerge_right(arr, p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
+    }
+    else {
+        ret = resize_buffer_intp(buffer, l1);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+
+        npy_amerge_left(arr, p1, l1, p2, l2, buffer->pw, len, cmp, py_arr);
+    }
+
+    return 0;
+}
+
+static int
+npy_atry_collapse(char *arr, npy_intp *tosort, run *stack, npy_intp *stack_ptr,
+                  buffer_intp *buffer, size_t len, PyArray_CompareFunc *cmp,
+                  PyArrayObject *py_arr)
+{
+    int ret;
+    npy_intp A, B, C, top;
+    top = *stack_ptr;
+
+    while (1 < top) {
+        B = stack[top - 2].l;
+        C = stack[top - 1].l;
+
+        if ((2 < top && stack[top - 3].l <= B + C) ||
+            (3 < top && stack[top - 4].l <= stack[top - 3].l + B)) {
+            A = stack[top - 3].l;
+
+            if (A <= C) {
+                ret = npy_amerge_at(arr, tosort, stack, top - 3, buffer, len,
+                                    cmp, py_arr);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 3].l += B;
+                stack[top - 2] = stack[top - 1];
+                --top;
+            }
+            else {
+                ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len,
+                                    cmp, py_arr);
+
+                if (NPY_UNLIKELY(ret < 0)) {
+                    return ret;
+                }
+
+                stack[top - 2].l += C;
+                --top;
+            }
+        }
+        else if (1 < top && B <= C) {
+            ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp,
+                                py_arr);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += C;
+            --top;
+        }
+        else {
+            break;
+        }
+    }
+
+    *stack_ptr = top;
+    return 0;
+}
+
+static int
+npy_aforce_collapse(char *arr, npy_intp *tosort, run *stack,
+                    npy_intp *stack_ptr, buffer_intp *buffer, size_t len,
+                    PyArray_CompareFunc *cmp, PyArrayObject *py_arr)
+{
+    int ret;
+    npy_intp top = *stack_ptr;
+
+    while (2 < top) {
+        if (stack[top - 3].l <= stack[top - 1].l) {
+            ret = npy_amerge_at(arr, tosort, stack, top - 3, buffer, len, cmp,
+                                py_arr);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 3].l += stack[top - 2].l;
+            stack[top - 2] = stack[top - 1];
+            --top;
+        }
+        else {
+            ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp,
+                                py_arr);
+
+            if (NPY_UNLIKELY(ret < 0)) {
+                return ret;
+            }
+
+            stack[top - 2].l += stack[top - 1].l;
+            --top;
+        }
+    }
+
+    if (1 < top) {
+        ret = npy_amerge_at(arr, tosort, stack, top - 2, buffer, len, cmp,
+                            py_arr);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+NPY_NO_EXPORT int
+npy_atimsort(void *start, npy_intp *tosort, npy_intp num, void *varr)
+{
+    PyArrayObject *arr = reinterpret_cast<PyArrayObject *>(varr);
+    size_t len = PyArray_ITEMSIZE(arr);
+    PyArray_CompareFunc *cmp = PyArray_DESCR(arr)->f->compare;
+    int ret;
+    npy_intp l, n, stack_ptr, minrun;
+    run stack[TIMSORT_STACK_SIZE];
+    buffer_intp buffer;
+
+    /* Items that have zero size don't make sense to sort */
+    if (len == 0) {
+        return 0;
+    }
+
+    buffer.pw = NULL;
+    buffer.size = 0;
+    stack_ptr = 0;
+    minrun = compute_min_run_short(num);
+
+    for (l = 0; l < num;) {
+        n = npy_acount_run((char *)start, tosort, l, num, minrun, len, cmp,
+                           arr);
+        /* s and l are indices into tosort, in elements (not scaled by len) */
+        stack[stack_ptr].s = l;
+        stack[stack_ptr].l = n;
+        ++stack_ptr;
+        ret = npy_atry_collapse((char *)start, tosort, stack, &stack_ptr,
+                                &buffer, len, cmp, arr);
+
+        if (NPY_UNLIKELY(ret < 0)) {
+            goto cleanup;
+        }
+
+        l += n;
+    }
+
+    ret = npy_aforce_collapse((char *)start, tosort, stack, &stack_ptr,
+                              &buffer, len, cmp, arr);
+
+    if (NPY_UNLIKELY(ret < 0)) {
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    if (buffer.pw != NULL) {
+        free(buffer.pw);
+    }
+    return ret;
+}
+
+/***************************************
+ * C > C++ dispatch
+ ***************************************/
+
+NPY_NO_EXPORT int
+timsort_bool(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::bool_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_byte(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::byte_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_ubyte(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::ubyte_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_short(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::short_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_ushort(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::ushort_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_int(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::int_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_uint(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::uint_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_long(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::long_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_ulong(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::ulong_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_longlong(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::longlong_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_ulonglong(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::ulonglong_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_half(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::half_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_float(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::float_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_double(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::double_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_longdouble(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::longdouble_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_cfloat(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::cfloat_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_cdouble(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::cdouble_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_clongdouble(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::clongdouble_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_datetime(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::datetime_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_timedelta(void *start, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return timsort_<npy::timedelta_tag>(start, num);
+}
+NPY_NO_EXPORT int
+timsort_string(void *start, npy_intp num, void *varr)
+{
+    return string_timsort_<npy::string_tag>(start, num, varr);
+}
+NPY_NO_EXPORT int
+timsort_unicode(void *start, npy_intp num, void *varr)
+{
+    return string_timsort_<npy::unicode_tag>(start, num, varr);
+}
+
+NPY_NO_EXPORT int
+atimsort_bool(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::bool_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_byte(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::byte_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_ubyte(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::ubyte_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_short(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::short_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_ushort(void *v, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::ushort_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_int(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::int_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_uint(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::uint_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_long(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::long_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_ulong(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::ulong_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_longlong(void *v, npy_intp *tosort, npy_intp num,
+                  void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::longlong_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_ulonglong(void *v, npy_intp *tosort, npy_intp num,
+                   void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::ulonglong_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_half(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::half_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_float(void *v, npy_intp *tosort, npy_intp num, void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::float_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_double(void *v, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::double_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_longdouble(void *v, npy_intp *tosort, npy_intp num,
+                    void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::longdouble_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_cfloat(void *v, npy_intp *tosort, npy_intp num,
+                void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::cfloat_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_cdouble(void *v, npy_intp *tosort, npy_intp num,
+                 void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::cdouble_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_clongdouble(void *v, npy_intp *tosort, npy_intp num,
+                     void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::clongdouble_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_datetime(void *v, npy_intp *tosort, npy_intp num,
+                  void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::datetime_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_timedelta(void *v, npy_intp *tosort, npy_intp num,
+                   void *NPY_UNUSED(varr))
+{
+    return atimsort_<npy::timedelta_tag>(v, tosort, num);
+}
+NPY_NO_EXPORT int
+atimsort_string(void *v, npy_intp *tosort, npy_intp num, void *varr)
+{
+    return string_atimsort_<npy::string_tag>(v, tosort, num, varr);
+}
+NPY_NO_EXPORT int
+atimsort_unicode(void *v, npy_intp *tosort, npy_intp num, void *varr)
+{
+    return string_atimsort_<npy::unicode_tag>(v, tosort, num, varr);
+}
diff --git a/numpy/core/src/npysort/x86-qsort.dispatch.cpp b/numpy/core/src/npysort/x86-qsort.dispatch.cpp
new file mode 100644
index 0000000..3906722
--- /dev/null
+++ b/numpy/core/src/npysort/x86-qsort.dispatch.cpp
@@ -0,0 +1,839 @@
+/*@targets
+ * $maxopt $keep_baseline avx512_skx
+ */
+// The $keep_baseline policy prevents the build from skipping avx512_skx
+// when it is part of the baseline features (--cpu-baseline), since the
+// 'baseline' option isn't specified within targets.
+
+#include "x86-qsort.h"
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#ifdef NPY_HAVE_AVX512_SKX
+#include "numpy/npy_math.h"
+
+#include "npy_sort.h"
+#include "numpy_tag.h"
+
+#include "simd/simd.h"
+#include <immintrin.h>
+
+template <typename Tag, typename type>
+NPY_NO_EXPORT int
+heapsort_(type *start, npy_intp n);
+
+/*
+ * Quicksort using AVX-512 for int, uint32 and float. The ideas and code are
+ * based on these two research papers:
+ * (1) Fast and Robust Vectorized In-Place Sorting of Primitive Types
+ *     https://drops.dagstuhl.de/opus/volltexte/2021/13775/
+ * (2) A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel
+ * Skylake https://arxiv.org/pdf/1704.08579.pdf
+ *
+ * High level idea: Vectorize the quicksort partitioning using AVX-512
+ * compressstore instructions. The pivot is the median of 72 elements picked
+ * at random. Arrays with fewer than 128 elements are sorted with a bitonic
+ * sorting network. Good resource for bitonic sorting networks:
+ * http://mitp-content-server.mit.edu:18180/books/content/sectbyfn?collid=books_pres_0&fn=Chapter%2027.pdf&id=8030
+ *
+ * Refer to https://github.com/numpy/numpy/pull/20133#issuecomment-958110340
+ * for potential problems when converting this code to the universal
+ * intrinsics framework.
+ */
+
+/*
+ * Constants used in sorting 16 elements in a ZMM register. Based on Bitonic
+ * sorting network (see
+ * https://en.wikipedia.org/wiki/Bitonic_sorter#/media/File:BitonicSort.svg)
+ */
+#define NETWORK1 14, 15, 12, 13, 10, 11, 8, 9, 6, 7, 4, 5, 2, 3, 0, 1
+#define NETWORK2 12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3
+#define NETWORK3 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7
+#define NETWORK4 13, 12, 15, 14, 9, 8, 11, 10, 5, 4, 7, 6, 1, 0, 3, 2
+#define NETWORK5 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
+#define NETWORK6 11, 10, 9, 8, 15, 14, 13, 12, 3, 2, 1, 0, 7, 6, 5, 4
+#define NETWORK7 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8
+#define ZMM_MAX_FLOAT _mm512_set1_ps(NPY_INFINITYF)
+#define ZMM_MAX_UINT _mm512_set1_epi32(NPY_MAX_UINT32)
+#define ZMM_MAX_INT _mm512_set1_epi32(NPY_MAX_INT32)
+#define SHUFFLE_MASK(a, b, c, d) (((a) << 6) | ((b) << 4) | ((c) << 2) | (d))
+#define SHUFFLE_ps(ZMM, MASK) _mm512_shuffle_ps((ZMM), (ZMM), (MASK))
+#define SHUFFLE_epi32(ZMM, MASK) _mm512_shuffle_epi32((ZMM), (MASK))
+
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
+/*
+ * Vectorized random number generator xoroshiro128+. Broken into 2 parts:
+ * (1) vnext generates 4 64-bit random integers
+ * (2) rnd_epu32 converts this to 8 32-bit random integers and bounds it to
+ *     the length of the array
+ */
+#define VROTL(x, k) /* rotate each uint64_t value in vector */ \
+    _mm256_or_si256(_mm256_slli_epi64((x), (k)),               \
+                    _mm256_srli_epi64((x), 64 - (k)))
+
+static inline __m256i
+vnext(__m256i *s0, __m256i *s1)
+{
+    *s1 = _mm256_xor_si256(*s0, *s1); /* modify vectors s1 and s0 */
+    *s0 = _mm256_xor_si256(_mm256_xor_si256(VROTL(*s0, 24), *s1),
+                           _mm256_slli_epi64(*s1, 16));
+    *s1 = VROTL(*s1, 37);
+    return _mm256_add_epi64(*s0, *s1); /* return random vector */
+}
+
+/* transform random numbers to the range between 0 and bound - 1 */
+static inline __m256i
+rnd_epu32(__m256i rnd_vec, __m256i bound)
+{
+    __m256i even = _mm256_srli_epi64(_mm256_mul_epu32(rnd_vec, bound), 32);
+    __m256i odd = _mm256_mul_epu32(_mm256_srli_epi64(rnd_vec, 32), bound);
+    return _mm256_blend_epi32(odd, even, 0b01010101);
+}
+
+template <typename type>
+struct vector;
+
+template <>
+struct vector<npy_int> {
+    using tag = npy::int_tag;
+    using type_t = npy_int;
+    using zmm_t = __m512i;
+    using ymm_t = __m256i;
+
+    static type_t type_max() { return NPY_MAX_INT32; }
+    static type_t type_min() { return NPY_MIN_INT32; }
+    static zmm_t zmm_max() { return _mm512_set1_epi32(type_max()); }
+
+    static __mmask16 ge(zmm_t x, zmm_t y)
+    {
+        return _mm512_cmp_epi32_mask(x, y, _MM_CMPINT_NLT);
+    }
+    template <int scale>
+    static ymm_t i64gather(__m512i index, void const *base)
+    {
+        return _mm512_i64gather_epi32(index, base, scale);
+    }
+    static zmm_t loadu(void const *mem) { return _mm512_loadu_si512(mem); }
+    static zmm_t max(zmm_t x, zmm_t y) { return _mm512_max_epi32(x, y); }
+    static void mask_compressstoreu(void *mem, __mmask16 mask, zmm_t x)
+    {
+        return _mm512_mask_compressstoreu_epi32(mem, mask, x);
+    }
+    static zmm_t mask_loadu(zmm_t x, __mmask16 mask, void const *mem)
+    {
+        return _mm512_mask_loadu_epi32(x, mask, mem);
+    }
+    static zmm_t mask_mov(zmm_t x, __mmask16 mask, zmm_t y)
+    {
+        return _mm512_mask_mov_epi32(x, mask, y);
+    }
+    static void mask_storeu(void *mem, __mmask16 mask, zmm_t x)
+    {
+        return _mm512_mask_storeu_epi32(mem, mask, x);
+    }
+    static zmm_t min(zmm_t x, zmm_t y) { return _mm512_min_epi32(x, y); }
+    static zmm_t permutexvar(__m512i idx, zmm_t zmm)
+    {
+        return _mm512_permutexvar_epi32(idx, zmm);
+    }
+    static type_t reducemax(zmm_t v) { return npyv_reducemax_s32(v); }
+    static type_t reducemin(zmm_t v) { return npyv_reducemin_s32(v); }
+    static zmm_t set1(type_t v) { return _mm512_set1_epi32(v); }
+    template<__mmask16 mask>
+    static zmm_t shuffle(zmm_t zmm)
+    {
+        return _mm512_shuffle_epi32(zmm, (_MM_PERM_ENUM)mask);
+    }
+    static void storeu(void *mem, zmm_t x)
+    {
+        return _mm512_storeu_si512(mem, x);
+    }
+
+    static ymm_t max(ymm_t x, ymm_t y) { return _mm256_max_epi32(x, y); }
+    static ymm_t min(ymm_t x, ymm_t y) { return _mm256_min_epi32(x, y); }
+};
+template <>
+struct vector<npy_uint> {
+    using tag = npy::uint_tag;
+    using type_t = npy_uint;
+    using zmm_t = __m512i;
+    using ymm_t = __m256i;
+
+    static type_t type_max() { return NPY_MAX_UINT32; }
+    static type_t type_min() { return 0; }
+    static zmm_t zmm_max() { return _mm512_set1_epi32(type_max()); }
+
+    template<int scale>
+    static ymm_t i64gather(__m512i index, void const *base)
+    {
+        return _mm512_i64gather_epi32(index, base, scale);
+    }
+    static __mmask16 ge(zmm_t x, zmm_t y)
+    {
+        return _mm512_cmp_epu32_mask(x, y, _MM_CMPINT_NLT);
+    }
+    static zmm_t loadu(void const *mem) { return _mm512_loadu_si512(mem); }
+    static zmm_t max(zmm_t x, zmm_t y) { return _mm512_max_epu32(x, y); }
+    static void mask_compressstoreu(void *mem, __mmask16 mask, zmm_t x)
+    {
+        return _mm512_mask_compressstoreu_epi32(mem, mask, x);
+    }
+    static zmm_t mask_loadu(zmm_t x, __mmask16 mask, void const *mem)
+    {
+        return _mm512_mask_loadu_epi32(x, mask, mem);
+    }
+    static zmm_t mask_mov(zmm_t x, __mmask16 mask, zmm_t y)
+    {
+        return _mm512_mask_mov_epi32(x, mask, y);
+    }
+    static void mask_storeu(void *mem, __mmask16 mask, zmm_t x)
+    {
+        return _mm512_mask_storeu_epi32(mem, mask, x);
+    }
+    static zmm_t min(zmm_t x, zmm_t y) { return _mm512_min_epu32(x, y); }
+    static zmm_t permutexvar(__m512i idx, zmm_t zmm)
+    {
+        return _mm512_permutexvar_epi32(idx, zmm);
+    }
+    static type_t reducemax(zmm_t v) { return npyv_reducemax_u32(v); }
+    static type_t reducemin(zmm_t v) { return npyv_reducemin_u32(v); }
+    static zmm_t set1(type_t v) { return _mm512_set1_epi32(v); }
+    template<__mmask16 mask>
+    static zmm_t shuffle(zmm_t zmm)
+    {
+        return _mm512_shuffle_epi32(zmm, (_MM_PERM_ENUM)mask);
+    }
+    static void storeu(void *mem, zmm_t x)
+    {
+        return _mm512_storeu_si512(mem, x);
+    }
+
+    static ymm_t max(ymm_t x, ymm_t y) { return _mm256_max_epu32(x, y); }
+    static ymm_t min(ymm_t x, ymm_t y) { return _mm256_min_epu32(x, y); }
+};
+template <>
+struct vector<npy_float> {
+    using tag = npy::float_tag;
+    using type_t = npy_float;
+    using zmm_t = __m512;
+    using ymm_t = __m256;
+
+    static type_t type_max() { return NPY_INFINITYF; }
+    static type_t type_min() { return -NPY_INFINITYF; }
+    static zmm_t zmm_max() { return _mm512_set1_ps(type_max()); }
+
+    static __mmask16 ge(zmm_t x, zmm_t y)
+    {
+        return _mm512_cmp_ps_mask(x, y, _CMP_GE_OQ);
+    }
+    template<int scale>
+    static ymm_t i64gather(__m512i index, void const *base)
+    {
+        return _mm512_i64gather_ps(index, base, scale);
+    }
+    static zmm_t loadu(void const *mem) { return _mm512_loadu_ps(mem); }
+    static zmm_t max(zmm_t x, zmm_t y) { return _mm512_max_ps(x, y); }
+    static void mask_compressstoreu(void *mem, __mmask16 mask, zmm_t x)
+    {
+        return _mm512_mask_compressstoreu_ps(mem, mask, x);
+    }
+    static zmm_t mask_loadu(zmm_t x, __mmask16 mask, void const *mem)
+    {
+        return _mm512_mask_loadu_ps(x, mask, mem);
+    }
+    static zmm_t mask_mov(zmm_t x, __mmask16 mask, zmm_t y)
+    {
+        return _mm512_mask_mov_ps(x, mask, y);
+    }
+    static void mask_storeu(void *mem, __mmask16 mask, zmm_t x)
+    {
+        return _mm512_mask_storeu_ps(mem, mask, x);
+    }
+    static zmm_t min(zmm_t x, zmm_t y) { return _mm512_min_ps(x, y); }
+    static zmm_t permutexvar(__m512i idx, zmm_t zmm)
+    {
+        return _mm512_permutexvar_ps(idx, zmm);
+    }
+    static type_t reducemax(zmm_t v) { return npyv_reducemax_f32(v); }
+    static type_t reducemin(zmm_t v) { return npyv_reducemin_f32(v); }
+    static zmm_t set1(type_t v) { return _mm512_set1_ps(v); }
+    template<__mmask16 mask>
+    static zmm_t shuffle(zmm_t zmm)
+    {
+        return _mm512_shuffle_ps(zmm, zmm, (_MM_PERM_ENUM)mask);
+    }
+    static void storeu(void *mem, zmm_t x) { return _mm512_storeu_ps(mem, x); }
+
+    static ymm_t max(ymm_t x, ymm_t y) { return _mm256_max_ps(x, y); }
+    static ymm_t min(ymm_t x, ymm_t y) { return _mm256_min_ps(x, y); }
+};
+
+/*
+ * COEX == Compare and Exchange two registers by swapping min and max values
+ */
+template <typename vtype, typename mm_t>
+void
+COEX(mm_t &a, mm_t &b)
+{
+    mm_t temp = a;
+    a = vtype::min(a, b);
+    b = vtype::max(temp, b);
+}
+
+template <typename vtype, typename zmm_t = typename vtype::zmm_t>
+static inline zmm_t
+cmp_merge(zmm_t in1, zmm_t in2, __mmask16 mask)
+{
+    zmm_t min = vtype::min(in2, in1);
+    zmm_t max = vtype::max(in2, in1);
+    return vtype::mask_mov(min, mask, max);  // 0 -> min, 1 -> max
+}
+
+/*
+ * Assumes zmm is random and performs a full sorting network defined in
+ * https://en.wikipedia.org/wiki/Bitonic_sorter#/media/File:BitonicSort.svg
+ */
+template <typename vtype, typename zmm_t = typename vtype::zmm_t>
+static inline zmm_t
+sort_zmm(zmm_t zmm)
+{
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(2, 3, 0, 1)>(zmm),
+                           0xAAAA);
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(0, 1, 2, 3)>(zmm),
+                           0xCCCC);
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(2, 3, 0, 1)>(zmm),
+                           0xAAAA);
+    zmm = cmp_merge<vtype>(
+            zmm, vtype::permutexvar(_mm512_set_epi32(NETWORK3), zmm), 0xF0F0);
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(1, 0, 3, 2)>(zmm),
+                           0xCCCC);
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(2, 3, 0, 1)>(zmm),
+                           0xAAAA);
+    zmm = cmp_merge<vtype>(
+            zmm, vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm), 0xFF00);
+    zmm = cmp_merge<vtype>(
+            zmm, vtype::permutexvar(_mm512_set_epi32(NETWORK6), zmm), 0xF0F0);
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(1, 0, 3, 2)>(zmm),
+                           0xCCCC);
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(2, 3, 0, 1)>(zmm),
+                           0xAAAA);
+    return zmm;
+}
+
+// Assumes zmm is bitonic and performs a recursive half cleaner
+template <typename vtype, typename zmm_t = typename vtype::zmm_t>
+static inline zmm_t
+bitonic_merge_zmm(zmm_t zmm)
+{
+    // 1) half_cleaner[16]: compare 1-9, 2-10, 3-11 etc ..
+    zmm = cmp_merge<vtype>(
+            zmm, vtype::permutexvar(_mm512_set_epi32(NETWORK7), zmm), 0xFF00);
+    // 2) half_cleaner[8]: compare 1-5, 2-6, 3-7 etc ..
+    zmm = cmp_merge<vtype>(
+            zmm, vtype::permutexvar(_mm512_set_epi32(NETWORK6), zmm), 0xF0F0);
+    // 3) half_cleaner[4]
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(1, 0, 3, 2)>(zmm),
+                           0xCCCC);
+    // 4) half_cleaner[2]
+    zmm = cmp_merge<vtype>(zmm, vtype::template shuffle<SHUFFLE_MASK(2, 3, 0, 1)>(zmm),
+                           0xAAAA);
+    return zmm;
+}
+
+// Assumes zmm1 and zmm2 are sorted and performs a recursive half cleaner
+template <typename vtype, typename zmm_t = typename vtype::zmm_t>
+static inline void
+bitonic_merge_two_zmm(zmm_t *zmm1, zmm_t *zmm2)
+{
+    // 1) First step of a merging network: coex of zmm1 and zmm2 reversed
+    *zmm2 = vtype::permutexvar(_mm512_set_epi32(NETWORK5), *zmm2);
+    zmm_t zmm3 = vtype::min(*zmm1, *zmm2);
+    zmm_t zmm4 = vtype::max(*zmm1, *zmm2);
+    // 2) Recursive half cleaner for each
+    *zmm1 = bitonic_merge_zmm<vtype>(zmm3);
+    *zmm2 = bitonic_merge_zmm<vtype>(zmm4);
+}
+
+// Assumes [zmm0, zmm1] and [zmm2, zmm3] are sorted and performs a recursive
+// half cleaner
+template <typename vtype, typename zmm_t = typename vtype::zmm_t>
+static inline void
+bitonic_merge_four_zmm(zmm_t *zmm)
+{
+    zmm_t zmm2r = vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm[2]);
+    zmm_t zmm3r = vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm[3]);
+    zmm_t zmm_t1 = vtype::min(zmm[0], zmm3r);
+    zmm_t zmm_t2 = vtype::min(zmm[1], zmm2r);
+    zmm_t zmm_t3 = vtype::permutexvar(_mm512_set_epi32(NETWORK5),
+                                      vtype::max(zmm[1], zmm2r));
+    zmm_t zmm_t4 = vtype::permutexvar(_mm512_set_epi32(NETWORK5),
+                                      vtype::max(zmm[0], zmm3r));
+    zmm_t zmm0 = vtype::min(zmm_t1, zmm_t2);
+    zmm_t zmm1 = vtype::max(zmm_t1, zmm_t2);
+    zmm_t zmm2 = vtype::min(zmm_t3, zmm_t4);
+    zmm_t zmm3 = vtype::max(zmm_t3, zmm_t4);
+    zmm[0] = bitonic_merge_zmm<vtype>(zmm0);
+    zmm[1] = bitonic_merge_zmm<vtype>(zmm1);
+    zmm[2] = bitonic_merge_zmm<vtype>(zmm2);
+    zmm[3] = bitonic_merge_zmm<vtype>(zmm3);
+}
+
+template <typename vtype, typename zmm_t = typename vtype::zmm_t>
+static inline void
+bitonic_merge_eight_zmm(zmm_t *zmm)
+{
+    zmm_t zmm4r = vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm[4]);
+    zmm_t zmm5r = vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm[5]);
+    zmm_t zmm6r = vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm[6]);
+    zmm_t zmm7r = vtype::permutexvar(_mm512_set_epi32(NETWORK5), zmm[7]);
+    zmm_t zmm_t1 = vtype::min(zmm[0], zmm7r);
+    zmm_t zmm_t2 = vtype::min(zmm[1], zmm6r);
+    zmm_t zmm_t3 = vtype::min(zmm[2], zmm5r);
+    zmm_t zmm_t4 = vtype::min(zmm[3], zmm4r);
+    zmm_t zmm_t5 = vtype::permutexvar(_mm512_set_epi32(NETWORK5),
+                                      vtype::max(zmm[3], zmm4r));
+    zmm_t zmm_t6 = vtype::permutexvar(_mm512_set_epi32(NETWORK5),
+                                      vtype::max(zmm[2], zmm5r));
+    zmm_t zmm_t7 = vtype::permutexvar(_mm512_set_epi32(NETWORK5),
+                                      vtype::max(zmm[1], zmm6r));
+    zmm_t zmm_t8 = vtype::permutexvar(_mm512_set_epi32(NETWORK5),
+                                      vtype::max(zmm[0], zmm7r));
+    COEX<vtype>(zmm_t1, zmm_t3);
+    COEX<vtype>(zmm_t2, zmm_t4);
+    COEX<vtype>(zmm_t5, zmm_t7);
+    COEX<vtype>(zmm_t6, zmm_t8);
+    COEX<vtype>(zmm_t1, zmm_t2);
+    COEX<vtype>(zmm_t3, zmm_t4);
+    COEX<vtype>(zmm_t5, zmm_t6);
+    COEX<vtype>(zmm_t7, zmm_t8);
+    zmm[0] = bitonic_merge_zmm<vtype>(zmm_t1);
+    zmm[1] = bitonic_merge_zmm<vtype>(zmm_t2);
+    zmm[2] = bitonic_merge_zmm<vtype>(zmm_t3);
+    zmm[3] = bitonic_merge_zmm<vtype>(zmm_t4);
+    zmm[4] = bitonic_merge_zmm<vtype>(zmm_t5);
+    zmm[5] = bitonic_merge_zmm<vtype>(zmm_t6);
+    zmm[6] = bitonic_merge_zmm<vtype>(zmm_t7);
+    zmm[7] = bitonic_merge_zmm<vtype>(zmm_t8);
+}
+
+template <typename vtype, typename type_t>
+static inline void
+sort_16(type_t *arr, npy_int N)
+{
+    __mmask16 load_mask = (0x0001 << N) - 0x0001;
+    typename vtype::zmm_t zmm =
+            vtype::mask_loadu(vtype::zmm_max(), load_mask, arr);
+    vtype::mask_storeu(arr, load_mask, sort_zmm<vtype>(zmm));
+}
+
+template <typename vtype, typename type_t>
+static inline void
+sort_32(type_t *arr, npy_int N)
+{
+    if (N <= 16) {
+        sort_16<vtype>(arr, N);
+        return;
+    }
+    using zmm_t = typename vtype::zmm_t;
+    zmm_t zmm1 = vtype::loadu(arr);
+    __mmask16 load_mask = (0x0001 << (N - 16)) - 0x0001;
+    zmm_t zmm2 = vtype::mask_loadu(vtype::zmm_max(), load_mask, arr + 16);
+    zmm1 = sort_zmm<vtype>(zmm1);
+    zmm2 = sort_zmm<vtype>(zmm2);
+    bitonic_merge_two_zmm<vtype>(&zmm1, &zmm2);
+    vtype::storeu(arr, zmm1);
+    vtype::mask_storeu(arr + 16, load_mask, zmm2);
+}
+
+template <typename vtype, typename type_t>
+static inline void
+sort_64(type_t *arr, npy_int N)
+{
+    if (N <= 32) {
+        sort_32<vtype>(arr, N);
+        return;
+    }
+    using zmm_t = typename vtype::zmm_t;
+    zmm_t zmm[4];
+    zmm[0] = vtype::loadu(arr);
+    zmm[1] = vtype::loadu(arr + 16);
+    __mmask16 load_mask1 = 0xFFFF, load_mask2 = 0xFFFF;
+    if (N < 48) {
+        load_mask1 = (0x0001 << (N - 32)) - 0x0001;
+        load_mask2 = 0x0000;
+    }
+    else if (N < 64) {
+        load_mask2 = (0x0001 << (N - 48)) - 0x0001;
+    }
+    zmm[2] = vtype::mask_loadu(vtype::zmm_max(), load_mask1, arr + 32);
+    zmm[3] = vtype::mask_loadu(vtype::zmm_max(), load_mask2, arr + 48);
+    zmm[0] = sort_zmm<vtype>(zmm[0]);
+    zmm[1] = sort_zmm<vtype>(zmm[1]);
+    zmm[2] = sort_zmm<vtype>(zmm[2]);
+    zmm[3] = sort_zmm<vtype>(zmm[3]);
+    bitonic_merge_two_zmm<vtype>(&zmm[0], &zmm[1]);
+    bitonic_merge_two_zmm<vtype>(&zmm[2], &zmm[3]);
+    bitonic_merge_four_zmm<vtype>(zmm);
+    vtype::storeu(arr, zmm[0]);
+    vtype::storeu(arr + 16, zmm[1]);
+    vtype::mask_storeu(arr + 32, load_mask1, zmm[2]);
+    vtype::mask_storeu(arr + 48, load_mask2, zmm[3]);
+}
+
+template <typename vtype, typename type_t>
+static inline void
+sort_128(type_t *arr, npy_int N)
+{
+    if (N <= 64) {
+        sort_64<vtype>(arr, N);
+        return;
+    }
+    using zmm_t = typename vtype::zmm_t;
+    zmm_t zmm[8];
+    zmm[0] = vtype::loadu(arr);
+    zmm[1] = vtype::loadu(arr + 16);
+    zmm[2] = vtype::loadu(arr + 32);
+    zmm[3] = vtype::loadu(arr + 48);
+    zmm[0] = sort_zmm<vtype>(zmm[0]);
+    zmm[1] = sort_zmm<vtype>(zmm[1]);
+    zmm[2] = sort_zmm<vtype>(zmm[2]);
+    zmm[3] = sort_zmm<vtype>(zmm[3]);
+    __mmask16 load_mask1 = 0xFFFF, load_mask2 = 0xFFFF;
+    __mmask16 load_mask3 = 0xFFFF, load_mask4 = 0xFFFF;
+    if (N < 80) {
+        load_mask1 = (0x0001 << (N - 64)) - 0x0001;
+        load_mask2 = 0x0000;
+        load_mask3 = 0x0000;
+        load_mask4 = 0x0000;
+    }
+    else if (N < 96) {
+        load_mask2 = (0x0001 << (N - 80)) - 0x0001;
+        load_mask3 = 0x0000;
+        load_mask4 = 0x0000;
+    }
+    else if (N < 112) {
+        load_mask3 = (0x0001 << (N - 96)) - 0x0001;
+        load_mask4 = 0x0000;
+    }
+    else {
+        load_mask4 = (0x0001 << (N - 112)) - 0x0001;
+    }
+    zmm[4] = vtype::mask_loadu(vtype::zmm_max(), load_mask1, arr + 64);
+    zmm[5] = vtype::mask_loadu(vtype::zmm_max(), load_mask2, arr + 80);
+    zmm[6] = vtype::mask_loadu(vtype::zmm_max(), load_mask3, arr + 96);
+    zmm[7] = vtype::mask_loadu(vtype::zmm_max(), load_mask4, arr + 112);
+    zmm[4] = sort_zmm<vtype>(zmm[4]);
+    zmm[5] = sort_zmm<vtype>(zmm[5]);
+    zmm[6] = sort_zmm<vtype>(zmm[6]);
+    zmm[7] = sort_zmm<vtype>(zmm[7]);
+    bitonic_merge_two_zmm<vtype>(&zmm[0], &zmm[1]);
+    bitonic_merge_two_zmm<vtype>(&zmm[2], &zmm[3]);
+    bitonic_merge_two_zmm<vtype>(&zmm[4], &zmm[5]);
+    bitonic_merge_two_zmm<vtype>(&zmm[6], &zmm[7]);
+    bitonic_merge_four_zmm<vtype>(zmm);
+    bitonic_merge_four_zmm<vtype>(zmm + 4);
+    bitonic_merge_eight_zmm<vtype>(zmm);
+    vtype::storeu(arr, zmm[0]);
+    vtype::storeu(arr + 16, zmm[1]);
+    vtype::storeu(arr + 32, zmm[2]);
+    vtype::storeu(arr + 48, zmm[3]);
+    vtype::mask_storeu(arr + 64, load_mask1, zmm[4]);
+    vtype::mask_storeu(arr + 80, load_mask2, zmm[5]);
+    vtype::mask_storeu(arr + 96, load_mask3, zmm[6]);
+    vtype::mask_storeu(arr + 112, load_mask4, zmm[7]);
+}
+
+template <typename type_t>
+static inline void
+swap(type_t *arr, npy_intp ii, npy_intp jj)
+{
+    type_t temp = arr[ii];
+    arr[ii] = arr[jj];
+    arr[jj] = temp;
+}
+
+// Median of 3 strategy
+// template<typename type_t>
+// static inline
+// npy_intp get_pivot_index(type_t *arr, const npy_intp left, const npy_intp
+// right) {
+//    return (rand() % (right + 1 - left)) + left;
+//    //npy_intp middle = ((right-left)/2) + left;
+//    //type_t a = arr[left], b = arr[middle], c = arr[right];
+//    //if ((b >= a && b <= c) || (b <= a && b >= c))
+//    //    return middle;
+//    //if ((a >= b && a <= c) || (a <= b && a >= c))
+//    //    return left;
+//    //else
+//    //    return right;
+//}
+
+/*
+ * Picking the pivot: Median of 72 array elements chosen at random.
+ */
+
+template <typename vtype, typename type_t>
+static inline type_t
+get_pivot(type_t *arr, const npy_intp left, const npy_intp right)
+{
+    /* seeds for vectorized random number generator */
+    __m256i s0 = _mm256_setr_epi64x(8265987198341093849, 3762817312854612374,
+                                    1324281658759788278, 6214952190349879213);
+    __m256i s1 = _mm256_setr_epi64x(2874178529384792648, 1257248936691237653,
+                                    7874578921548791257, 1998265912745817298);
+    s0 = _mm256_add_epi64(s0, _mm256_set1_epi64x(left));
+    s1 = _mm256_sub_epi64(s1, _mm256_set1_epi64x(right));
+
+    npy_intp arrsize = right - left + 1;
+    __m256i bound =
+            _mm256_set1_epi32(arrsize > INT32_MAX ? INT32_MAX : arrsize);
+    __m512i left_vec = _mm512_set1_epi64(left);
+    __m512i right_vec = _mm512_set1_epi64(right);
+    using ymm_t = typename vtype::ymm_t;
+    ymm_t v[9];
+    /* fill 9 vectors with random numbers */
+    for (npy_int i = 0; i < 9; ++i) {
+        __m256i rand_64 = vnext(&s0, &s1); /* vector with 4 random uint64_t */
+        __m512i rand_32 = _mm512_cvtepi32_epi64(rnd_epu32(
+                rand_64, bound)); /* random numbers between 0 and bound - 1 */
+        __m512i indices;
+        if (i < 5)
+            indices =
+                    _mm512_add_epi64(left_vec, rand_32); /* indices for arr */
+        else
+            indices =
+                    _mm512_sub_epi64(right_vec, rand_32); /* indices for arr */
+
+        v[i] = vtype::template i64gather<sizeof(type_t)>(indices, arr);
+    }
+
+    /* median network for 9 elements */
+    COEX<vtype>(v[0], v[1]);
+    COEX<vtype>(v[2], v[3]);
+    COEX<vtype>(v[4], v[5]);
+    COEX<vtype>(v[6], v[7]);
+    COEX<vtype>(v[0], v[2]);
+    COEX<vtype>(v[1], v[3]);
+    COEX<vtype>(v[4], v[6]);
+    COEX<vtype>(v[5], v[7]);
+    COEX<vtype>(v[0], v[4]);
+    COEX<vtype>(v[1], v[2]);
+    COEX<vtype>(v[5], v[6]);
+    COEX<vtype>(v[3], v[7]);
+    COEX<vtype>(v[1], v[5]);
+    COEX<vtype>(v[2], v[6]);
+    COEX<vtype>(v[3], v[5]);
+    COEX<vtype>(v[2], v[4]);
+    COEX<vtype>(v[3], v[4]);
+    COEX<vtype>(v[3], v[8]);
+    COEX<vtype>(v[4], v[8]);
+
+    // technically v[4] needs to be sorted before we pick the correct median;
+    // picking the element at index 4 works just as well for performance
+    type_t *temp = (type_t *)&v[4];
+
+    return temp[4];
+}
+
+/*
+ * Partition one ZMM register based on the pivot and return the number of
+ * elements that are greater than or equal to the pivot.
+ */
+template <typename vtype, typename type_t, typename zmm_t>
+static inline npy_int
+partition_vec(type_t *arr, npy_intp left, npy_intp right, const zmm_t curr_vec,
+              const zmm_t pivot_vec, zmm_t *smallest_vec, zmm_t *biggest_vec)
+{
+    /* which elements are greater than or equal to the pivot */
+    __mmask16 gt_mask = vtype::ge(curr_vec, pivot_vec);
+    npy_int amount_gt_pivot = _mm_popcnt_u32((npy_int)gt_mask);
+#if defined(_MSC_VER) && _MSC_VER < 1922
+    vtype::mask_compressstoreu(arr + left, ~gt_mask, curr_vec);
+#else
+    vtype::mask_compressstoreu(arr + left, _knot_mask16(gt_mask), curr_vec);
+#endif
+    vtype::mask_compressstoreu(arr + right - amount_gt_pivot, gt_mask,
+                               curr_vec);
+    *smallest_vec = vtype::min(curr_vec, *smallest_vec);
+    *biggest_vec = vtype::max(curr_vec, *biggest_vec);
+    return amount_gt_pivot;
+}
+
+/*
+ * Partition an array based on the pivot and return the index of the first
+ * element of the right-hand partition (elements not less than the pivot).
+ */
+template <typename vtype, typename type_t>
+static inline npy_intp
+partition_avx512(type_t *arr, npy_intp left, npy_intp right, type_t pivot,
+                 type_t *smallest, type_t *biggest)
+{
+    /* make array length divisible by 16, shortening the array */
+    for (npy_int i = (right - left) % 16; i > 0; --i) {
+        *smallest = MIN(*smallest, arr[left]);
+        *biggest = MAX(*biggest, arr[left]);
+        if (arr[left] > pivot) {
+            swap(arr, left, --right);
+        }
+        else {
+            ++left;
+        }
+    }
+
+    if (left == right)
+        return left; /* fewer than 16 elements in the array */
+
+    using zmm_t = typename vtype::zmm_t;
+    zmm_t pivot_vec = vtype::set1(pivot);
+    zmm_t min_vec = vtype::set1(*smallest);
+    zmm_t max_vec = vtype::set1(*biggest);
+
+    if (right - left == 16) {
+        zmm_t vec = vtype::loadu(arr + left);
+        npy_int amount_gt_pivot = partition_vec<vtype>(
+                arr, left, left + 16, vec, pivot_vec, &min_vec, &max_vec);
+        *smallest = vtype::reducemin(min_vec);
+        *biggest = vtype::reducemax(max_vec);
+        return left + (16 - amount_gt_pivot);
+    }
+
+    // first and last 16 values are partitioned at the end
+    zmm_t vec_left = vtype::loadu(arr + left);
+    zmm_t vec_right = vtype::loadu(arr + (right - 16));
+    // store points of the vectors
+    npy_intp r_store = right - 16;
+    npy_intp l_store = left;
+    // indices for loading the elements
+    left += 16;
+    right -= 16;
+    while (right - left != 0) {
+        zmm_t curr_vec;
+        /*
+         * if fewer elements are stored on the right side of the array,
+         * then next elements are loaded from the right side,
+         * otherwise from the left side
+         */
+        if ((r_store + 16) - right < left - l_store) {
+            right -= 16;
+            curr_vec = vtype::loadu(arr + right);
+        }
+        else {
+            curr_vec = vtype::loadu(arr + left);
+            left += 16;
+        }
+        // partition the current vector and save it on both sides of the array
+        npy_int amount_gt_pivot =
+                partition_vec<vtype>(arr, l_store, r_store + 16, curr_vec,
+                                     pivot_vec, &min_vec, &max_vec);
+        ;
+        r_store -= amount_gt_pivot;
+        l_store += (16 - amount_gt_pivot);
+    }
+
+    /* partition and save vec_left and vec_right */
+    npy_int amount_gt_pivot =
+            partition_vec<vtype>(arr, l_store, r_store + 16, vec_left,
+                                 pivot_vec, &min_vec, &max_vec);
+    l_store += (16 - amount_gt_pivot);
+    amount_gt_pivot =
+            partition_vec<vtype>(arr, l_store, l_store + 16, vec_right,
+                                 pivot_vec, &min_vec, &max_vec);
+    l_store += (16 - amount_gt_pivot);
+    *smallest = vtype::reducemin(min_vec);
+    *biggest = vtype::reducemax(max_vec);
+    return l_store;
+}
+
+template <typename vtype, typename type_t>
+static inline void
+qsort_(type_t *arr, npy_intp left, npy_intp right, npy_int max_iters)
+{
+    /*
+     * Resort to heapsort if quicksort isn't making any progress
+     */
+    if (max_iters <= 0) {
+        heapsort_<typename vtype::tag>(arr + left, right + 1 - left);
+        return;
+    }
+    /*
+     * Base case: use bitonic networks to sort arrays <= 128
+     */
+    if (right + 1 - left <= 128) {
+        sort_128<vtype>(arr + left, (npy_int)(right + 1 - left));
+        return;
+    }
+
+    type_t pivot = get_pivot<vtype>(arr, left, right);
+    type_t smallest = vtype::type_max();
+    type_t biggest = vtype::type_min();
+    npy_intp pivot_index = partition_avx512<vtype>(arr, left, right + 1, pivot,
+                                                   &smallest, &biggest);
+    if (pivot != smallest)
+        qsort_<vtype>(arr, left, pivot_index - 1, max_iters - 1);
+    if (pivot != biggest)
+        qsort_<vtype>(arr, pivot_index, right, max_iters - 1);
+}
+
+static inline npy_intp
+replace_nan_with_inf(npy_float *arr, npy_intp arrsize)
+{
+    npy_intp nan_count = 0;
+    __mmask16 loadmask = 0xFFFF;
+    while (arrsize > 0) {
+        if (arrsize < 16) {
+            loadmask = (0x0001 << arrsize) - 0x0001;
+        }
+        __m512 in_zmm = _mm512_maskz_loadu_ps(loadmask, arr);
+        __mmask16 nanmask = _mm512_cmp_ps_mask(in_zmm, in_zmm, _CMP_NEQ_UQ);
+        nan_count += _mm_popcnt_u32((npy_int)nanmask);
+        _mm512_mask_storeu_ps(arr, nanmask, ZMM_MAX_FLOAT);
+        arr += 16;
+        arrsize -= 16;
+    }
+    return nan_count;
+}
+
+static inline void
+replace_inf_with_nan(npy_float *arr, npy_intp arrsize, npy_intp nan_count)
+{
+    for (npy_intp ii = arrsize - 1; nan_count > 0; --ii) {
+        arr[ii] = NPY_NANF;
+        nan_count -= 1;
+    }
+}
+
+/***************************************
+ * C > C++ dispatch
+ ***************************************/
+
+NPY_NO_EXPORT void
+NPY_CPU_DISPATCH_CURFX(x86_quicksort_int)(void *arr, npy_intp arrsize)
+{
+    if (arrsize > 1) {
+        qsort_<vector<npy_int>, npy_int>((npy_int *)arr, 0, arrsize - 1,
+                                         2 * (npy_int)log2(arrsize));
+    }
+}
+
+NPY_NO_EXPORT void
+NPY_CPU_DISPATCH_CURFX(x86_quicksort_uint)(void *arr, npy_intp arrsize)
+{
+    if (arrsize > 1) {
+        qsort_<vector<npy_uint>, npy_uint>((npy_uint *)arr, 0, arrsize - 1,
+                                           2 * (npy_int)log2(arrsize));
+    }
+}
+
+NPY_NO_EXPORT void
+NPY_CPU_DISPATCH_CURFX(x86_quicksort_float)(void *arr, npy_intp arrsize)
+{
+    if (arrsize > 1) {
+        npy_intp nan_count = replace_nan_with_inf((npy_float *)arr, arrsize);
+        qsort_<vector<npy_float>, npy_float>((npy_float *)arr, 0, arrsize - 1,
+                                             2 * (npy_int)log2(arrsize));
+        replace_inf_with_nan((npy_float *)arr, arrsize, nan_count);
+    }
+}
+
+#endif  // NPY_HAVE_AVX512_SKX
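Note on the partition step in x86-qsort.dispatch.cpp: partition_vec writes each 16-lane register back with two compress-stores, one packing the lanes below the pivot at the left store point and one packing the remaining lanes so that they end just before the right store point. The scalar model below is only a sketch of that idea. The name partition_block_scalar and the use of std::vector are assumptions made for illustration and are not part of the patch, and no AVX-512 intrinsics are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar stand-in for one partition_vec call. `block` plays the
// role of a register that was loaded from the array earlier, so overwriting
// arr[left, right) cannot lose any of its values.
static int
partition_block_scalar(std::vector<int> &arr, std::size_t left,
                       std::size_t right, const std::vector<int> &block,
                       int pivot)
{
    // The destination window must be able to hold the whole block.
    assert(left + block.size() <= right && right <= arr.size());

    // Equivalent of popcount(ge_mask): lanes that are >= pivot.
    int amount_ge_pivot = 0;
    for (int x : block) {
        if (x >= pivot) {
            ++amount_ge_pivot;
        }
    }

    // First "compress-store": lanes < pivot, packed starting at arr[left].
    // Second "compress-store": lanes >= pivot, packed starting at
    // arr[right - amount_ge_pivot], i.e. ending just before arr[right].
    std::size_t lo = left;
    std::size_t hi = right - amount_ge_pivot;
    for (int x : block) {
        if (x >= pivot) {
            arr[hi++] = x;
        }
        else {
            arr[lo++] = x;
        }
    }
    return amount_ge_pivot;
}
```

Within each side the lanes keep their original relative order, which is also what a compress-store does. The caller then advances its left store point by the block size minus the returned count and moves its right store point down by the returned count, mirroring the l_store and r_store updates in partition_avx512.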
diff --git a/numpy/core/src/npysort/x86-qsort.h b/numpy/core/src/npysort/x86-qsort.h

new file mode 100644 (file)

index 0000000..6340e2b
--- /dev/null
+++ b/numpy/core/src/npysort/x86-qsort.h
@@ -0,0 +1,28 @@
+#include "numpy/npy_common.h"
+
+#include "npy_cpu_dispatch.h"
+
+#ifndef NPY_NO_EXPORT
+#define NPY_NO_EXPORT NPY_VISIBILITY_HIDDEN
+#endif
+
+#ifndef NPY_DISABLE_OPTIMIZATION
+#include "x86-qsort.dispatch.h"
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void x86_quicksort_int,
+                         (void *start, npy_intp num))
+
+NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void x86_quicksort_uint,
+                         (void *start, npy_intp num))
+
+NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void x86_quicksort_float,
+                         (void *start, npy_intp num))
+
+#ifdef __cplusplus
+}
+#endif
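Note for callers of x86_quicksort_float declared above: the implementation in x86-qsort.dispatch.cpp replaces every NaN with +infinity before sorting, counts the replacements, and afterwards overwrites the last nan_count elements with NaN, so NaNs end up at the tail of the array. The scalar model below is a sketch of that contract only. The function name sort_floats_nan_last is hypothetical, and std::sort stands in for the AVX-512 quicksort.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical scalar model of the NaN handling around the float sort.
static void
sort_floats_nan_last(std::vector<float> &a)
{
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();

    // Step 1: replace NaN with +inf and remember how many were replaced.
    std::size_t nan_count = 0;
    for (float &x : a) {
        if (std::isnan(x)) {
            x = inf;
            ++nan_count;
        }
    }

    // Step 2: sort. With no NaN left, every comparison is well defined.
    std::sort(a.begin(), a.end());

    // Step 3: the replaced values sorted to the tail together with any
    // genuine +inf, so turning the last nan_count slots back into NaN
    // restores the NaNs without touching real infinities.
    for (std::size_t i = 0; i < nan_count; ++i) {
        a[a.size() - 1 - i] = nan;
    }
}
```

Because genuine +infinity values and replaced NaNs compare equal during the sort, only their count matters when restoring, which is why the patch can restore NaNs with a simple backwards loop.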
diff --git a/numpy/core/src/umath/_operand_flag_tests.c b/numpy/core/src/umath/_operand_flag_tests.c

new file mode 100644 (file)

index 0000000..c59e13b
--- /dev/null
+++ b/numpy/core/src/umath/_operand_flag_tests.c
@@ -0,0 +1,92 @@
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#include <numpy/arrayobject.h>
+#include <numpy/ufuncobject.h>
+#include "numpy/npy_3kcompat.h"
+#include <math.h>
+#include <structmember.h>
+
+
+static PyMethodDef TestMethods[] = {
+        {NULL, NULL, 0, NULL}
+};
+
+
+static void
+inplace_add(char **args, npy_intp const *dimensions, npy_intp const *steps, void *data)
+{
+    npy_intp i;
+    npy_intp n = dimensions[0];
+    char *in1 = args[0];
+    char *in2 = args[1];
+    npy_intp in1_step = steps[0];
+    npy_intp in2_step = steps[1];
+
+    for (i = 0; i < n; i++) {
+        (*(long *)in1) = *(long*)in1 + *(long*)in2;
+        in1 += in1_step;
+        in2 += in2_step;
+    }
+}
+
+
+/* This is a pointer to the above function */
+PyUFuncGenericFunction funcs[1] = {&inplace_add};
+
+/* These are the input and return dtypes of inplace_add. */
+static char types[2] = {NPY_LONG, NPY_LONG};
+
+static void *data[1] = {NULL};
+
+static struct PyModuleDef moduledef = {
+    PyModuleDef_HEAD_INIT,
+    "_operand_flag_tests",
+    NULL,
+    -1,
+    TestMethods,
+    NULL,
+    NULL,
+    NULL,
+    NULL
+};
+
+PyMODINIT_FUNC PyInit__operand_flag_tests(void)
+{
+    PyObject *m = NULL;
+    PyObject *ufunc;
+
+    m = PyModule_Create(&moduledef);
+    if (m == NULL) {
+        goto fail;
+    }
+
+    import_array();
+    import_umath();
+
+    ufunc = PyUFunc_FromFuncAndData(funcs, data, types, 1, 2, 0,
+                                    PyUFunc_None, "inplace_add",
+                                    "inplace_add_docstring", 0);
+
+    /*
+     * Set flags to turn off buffering for first input operand,
+     * so that result can be written back to input operand.
+     */
+    ((PyUFuncObject*)ufunc)->op_flags[0] = NPY_ITER_READWRITE;
+    ((PyUFuncObject*)ufunc)->iter_flags = NPY_ITER_REDUCE_OK;
+    PyModule_AddObject(m, "inplace_add", (PyObject*)ufunc);
+
+    return m;
+
+fail:
+    if (!PyErr_Occurred()) {
+        PyErr_SetString(PyExc_RuntimeError,
+                        "cannot load _operand_flag_tests module.");
+    }
+    if (m) {
+        Py_DECREF(m);
+        m = NULL;
+    }
+    return m;
+}
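The inner loop of inplace_add above walks two operands with independent byte strides and writes each sum back into the first operand. The following is a minimal standalone sketch of that loop, not the module's code: `ptrdiff_t` stands in for `npy_intp` so it compiles without the NumPy headers, and `inplace_add_sketch` and `inplace_add_demo` are hypothetical names introduced here for illustration.

```c
#include <stddef.h>

/*
 * Sketch of the strided in-place loop (assumption: ptrdiff_t replaces
 * npy_intp; no NumPy headers are needed to build this).
 */
static void
inplace_add_sketch(char **args, const ptrdiff_t *dimensions,
                   const ptrdiff_t *steps)
{
    ptrdiff_t n = dimensions[0];
    char *in1 = args[0];   /* read-write operand: receives the sums */
    char *in2 = args[1];   /* read-only operand */
    ptrdiff_t i;

    for (i = 0; i < n; i++) {
        *(long *)in1 += *(long *)in2;
        in1 += steps[0];   /* strides are in bytes, not elements */
        in2 += steps[1];
    }
}

/*
 * Hypothetical driver: adds b into a using the given byte stride for the
 * first operand and returns a[0] afterwards.
 */
static long
inplace_add_demo(ptrdiff_t step0)
{
    long a[3] = {1, 2, 3};
    long b[3] = {10, 20, 30};
    char *args[2] = {(char *)a, (char *)b};
    ptrdiff_t dimensions[1] = {3};
    ptrdiff_t steps[2];

    steps[0] = step0;
    steps[1] = (ptrdiff_t)sizeof(long);
    inplace_add_sketch(args, dimensions, steps);
    return a[0];
}
```

With a stride of `sizeof(long)` each element of `a` receives the matching element of `b`. With a stride of 0 every sum lands in `a[0]`, which is the kind of zero-stride writeable operand that `NPY_ITER_REDUCE_OK` is meant to permit.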
diff --git a/numpy/core/src/umath/_operand_flag_tests.c.src b/numpy/core/src/umath/_operand_flag_tests.c.src
deleted file mode 100644
index c59e13b..0000000
--- a/numpy/core/src/umath/_operand_flag_tests.c.src
+++ /dev/null
@@ -1,92 +0,0 @@
-#define PY_SSIZE_T_CLEAN
-#include <Python.h>
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-#include <numpy/arrayobject.h>
-#include <numpy/ufuncobject.h>
-#include "numpy/npy_3kcompat.h"
-#include <math.h>
-#include <structmember.h>
-
-
-static PyMethodDef TestMethods[] = {
-        {NULL, NULL, 0, NULL}
-};
-
-
-static void
-inplace_add(char **args, npy_intp const *dimensions, npy_intp const *steps, void *data)
-{
-    npy_intp i;
-    npy_intp n = dimensions[0];
-    char *in1 = args[0];
-    char *in2 = args[1];
-    npy_intp in1_step = steps[0];
-    npy_intp in2_step = steps[1];
-
-    for (i = 0; i < n; i++) {
-        (*(long *)in1) = *(long*)in1 + *(long*)in2;
-        in1 += in1_step;
-        in2 += in2_step;
-    }
-}
-
-
-/*This a pointer to the above function*/
-PyUFuncGenericFunction funcs[1] = {&inplace_add};
-
-/* These are the input and return dtypes of logit.*/
-static char types[2] = {NPY_LONG, NPY_LONG};
-
-static void *data[1] = {NULL};
-
-static struct PyModuleDef moduledef = {
-    PyModuleDef_HEAD_INIT,
-    "_operand_flag_tests",
-    NULL,
-    -1,
-    TestMethods,
-    NULL,
-    NULL,
-    NULL,
-    NULL
-};
-
-PyMODINIT_FUNC PyInit__operand_flag_tests(void)
-{
-    PyObject *m = NULL;
-    PyObject *ufunc;
-
-    m = PyModule_Create(&moduledef);
-    if (m == NULL) {
-        goto fail;
-    }
-
-    import_array();
-    import_umath();
-
-    ufunc = PyUFunc_FromFuncAndData(funcs, data, types, 1, 2, 0,
-                                    PyUFunc_None, "inplace_add",
-                                    "inplace_add_docstring", 0);
-
-    /*
-     * Set flags to turn off buffering for first input operand,
-     * so that result can be written back to input operand.
-     */
-    ((PyUFuncObject*)ufunc)->op_flags[0] = NPY_ITER_READWRITE;
-    ((PyUFuncObject*)ufunc)->iter_flags = NPY_ITER_REDUCE_OK;
-    PyModule_AddObject(m, "inplace_add", (PyObject*)ufunc);
-
-    return m;
-
-fail:
-    if (!PyErr_Occurred()) {
-        PyErr_SetString(PyExc_RuntimeError,
-                        "cannot load _operand_flag_tests module.");
-    }
-    if (m) {
-        Py_DECREF(m);
-        m = NULL;
-    }
-    return m;
-}
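The _rational_tests.c file added below stores each rational as a numerator plus "denominator minus one", so that memory zeroed by memset decodes as 0/1, and it keeps fractions in lowest terms by dividing through by the gcd. The following is a simplified sketch of that representation under stated assumptions: fixed-width `<stdint.h>` types replace the `npy_` typedefs, the overflow and zero-divide reporting of the real file is omitted, and every name ending in `_sketch` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    int32_t n;    /* numerator */
    int32_t dmm;  /* denominator minus one, so all-zero bytes mean 0/1 */
} rational_sketch;

/* Euclid's algorithm on absolute values (no overflow reporting here). */
static int64_t
gcd_sketch(int64_t x, int64_t y)
{
    if (x < 0) x = -x;
    if (y < 0) y = -y;
    while (y) {
        int64_t t = x % y;
        x = y;
        y = t;
    }
    return x;
}

/* Assumes d > 0 and inputs small enough to fit in 32 bits once reduced. */
static rational_sketch
make_rational_sketch(int64_t n, int64_t d)
{
    int64_t g = gcd_sketch(n, d);
    rational_sketch r;
    r.n = (int32_t)(n / g);
    r.dmm = (int32_t)(d / g - 1);
    return r;
}

/* Recover the real denominator from the stored field. */
static int32_t
denom_sketch(rational_sketch r)
{
    return r.dmm + 1;
}

/* What a zero-filled buffer element looks like when read back. */
static rational_sketch
zeroed_rational_sketch(void)
{
    rational_sketch r;
    memset(&r, 0, sizeof r);
    return r;
}
```

Because fractions are reduced and the denominator is kept positive, two values are equal exactly when both fields match, which is why the equality test in the real file can compare fields directly.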
diff --git a/numpy/core/src/umath/_rational_tests.c b/numpy/core/src/umath/_rational_tests.c
new file mode 100644
index 0000000..bf50a22
--- /dev/null
+++ b/numpy/core/src/umath/_rational_tests.c
@@ -0,0 +1,1369 @@
+/* Fixed size rational numbers exposed to Python */
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include <structmember.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#include "numpy/arrayobject.h"
+#include "numpy/ufuncobject.h"
+#include "numpy/npy_3kcompat.h"
+#include "common.h"  /* for error_converting */
+
+#include <math.h>
+
+
+/* Relevant arithmetic exceptions */
+
+/* Uncomment the following line to work around a bug in numpy */
+/* #define ACQUIRE_GIL */
+
+static void
+set_overflow(void) {
+#ifdef ACQUIRE_GIL
+    /* Need to grab the GIL to dodge a bug in numpy */
+    PyGILState_STATE state = PyGILState_Ensure();
+#endif
+    if (!PyErr_Occurred()) {
+        PyErr_SetString(PyExc_OverflowError,
+                "overflow in rational arithmetic");
+    }
+#ifdef ACQUIRE_GIL
+    PyGILState_Release(state);
+#endif
+}
+
+static void
+set_zero_divide(void) {
+#ifdef ACQUIRE_GIL
+    /* Need to grab the GIL to dodge a bug in numpy */
+    PyGILState_STATE state = PyGILState_Ensure();
+#endif
+    if (!PyErr_Occurred()) {
+        PyErr_SetString(PyExc_ZeroDivisionError,
+                "zero divide in rational arithmetic");
+    }
+#ifdef ACQUIRE_GIL
+    PyGILState_Release(state);
+#endif
+}
+
+/* Integer arithmetic utilities */
+
+static NPY_INLINE npy_int32
+safe_neg(npy_int32 x) {
+    if (x==(npy_int32)1<<31) {
+        set_overflow();
+    }
+    return -x;
+}
+
+static NPY_INLINE npy_int32
+safe_abs32(npy_int32 x) {
+    npy_int32 nx;
+    if (x>=0) {
+        return x;
+    }
+    nx = -x;
+    if (nx<0) {
+        set_overflow();
+    }
+    return nx;
+}
+
+static NPY_INLINE npy_int64
+safe_abs64(npy_int64 x) {
+    npy_int64 nx;
+    if (x>=0) {
+        return x;
+    }
+    nx = -x;
+    if (nx<0) {
+        set_overflow();
+    }
+    return nx;
+}
+
+static NPY_INLINE npy_int64
+gcd(npy_int64 x, npy_int64 y) {
+    x = safe_abs64(x);
+    y = safe_abs64(y);
+    if (x < y) {
+        npy_int64 t = x;
+        x = y;
+        y = t;
+    }
+    while (y) {
+        npy_int64 t;
+        x = x%y;
+        t = x;
+        x = y;
+        y = t;
+    }
+    return x;
+}
+
+static NPY_INLINE npy_int64
+lcm(npy_int64 x, npy_int64 y) {
+    npy_int64 lcm;
+    if (!x || !y) {
+        return 0;
+    }
+    x /= gcd(x,y);
+    lcm = x*y;
+    if (lcm/y!=x) {
+        set_overflow();
+    }
+    return safe_abs64(lcm);
+}
+
+/* Fixed precision rational numbers */
+
+typedef struct {
+    /* numerator */
+    npy_int32 n;
+    /*
+     * denominator minus one: numpy.zeros() uses memset(0) for non-object
+     * types, so need to ensure that rational(0) has all zero bytes
+     */
+    npy_int32 dmm;
+} rational;
+
+static NPY_INLINE rational
+make_rational_int(npy_int64 n) {
+    rational r = {(npy_int32)n,0};
+    if (r.n != n) {
+        set_overflow();
+    }
+    return r;
+}
+
+static rational
+make_rational_slow(npy_int64 n_, npy_int64 d_) {
+    rational r = {0};
+    if (!d_) {
+        set_zero_divide();
+    }
+    else {
+        npy_int64 g = gcd(n_,d_);
+        npy_int32 d;
+        n_ /= g;
+        d_ /= g;
+        r.n = (npy_int32)n_;
+        d = (npy_int32)d_;
+        if (r.n!=n_ || d!=d_) {
+            set_overflow();
+        }
+        else {
+            if (d <= 0) {
+                d = -d;
+                r.n = safe_neg(r.n);
+            }
+            r.dmm = d-1;
+        }
+    }
+    return r;
+}
+
+static NPY_INLINE npy_int32
+d(rational r) {
+    return r.dmm+1;
+}
+
+/* Assumes d_ > 0 */
+static rational
+make_rational_fast(npy_int64 n_, npy_int64 d_) {
+    npy_int64 g = gcd(n_,d_);
+    rational r;
+    n_ /= g;
+    d_ /= g;
+    r.n = (npy_int32)n_;
+    r.dmm = (npy_int32)(d_-1);
+    if (r.n!=n_ || r.dmm+1!=d_) {
+        set_overflow();
+    }
+    return r;
+}
+
+static NPY_INLINE rational
+rational_negative(rational r) {
+    rational x;
+    x.n = safe_neg(r.n);
+    x.dmm = r.dmm;
+    return x;
+}
+
+static NPY_INLINE rational
+rational_add(rational x, rational y) {
+    /*
+     * Note that the numerator computation can never overflow npy_int64,
+     * since each term is strictly under 2**64/4 in magnitude (since d > 0).
+     */
+    return make_rational_fast((npy_int64)x.n*d(y)+(npy_int64)d(x)*y.n,
+        (npy_int64)d(x)*d(y));
+}
+
+static NPY_INLINE rational
+rational_subtract(rational x, rational y) {
+    /* We're safe from overflow as with + */
+    return make_rational_fast((npy_int64)x.n*d(y)-(npy_int64)d(x)*y.n,
+        (npy_int64)d(x)*d(y));
+}
+
+static NPY_INLINE rational
+rational_multiply(rational x, rational y) {
+    /* We're safe from overflow as with + */
+    return make_rational_fast((npy_int64)x.n*y.n,(npy_int64)d(x)*d(y));
+}
+
+static NPY_INLINE rational
+rational_divide(rational x, rational y) {
+    return make_rational_slow((npy_int64)x.n*d(y),(npy_int64)d(x)*y.n);
+}
+
+static NPY_INLINE npy_int64
+rational_floor(rational x) {
+    /* Always round down */
+    if (x.n>=0) {
+        return x.n/d(x);
+    }
+    /*
+     * This can be done without casting up to 64 bits, but it requires
+     * working out all the sign cases
+     */
+    return -((-(npy_int64)x.n+d(x)-1)/d(x));
+}
+
+static NPY_INLINE npy_int64
+rational_ceil(rational x) {
+    return -rational_floor(rational_negative(x));
+}
+
+static NPY_INLINE rational
+rational_remainder(rational x, rational y) {
+    return rational_subtract(x, rational_multiply(y,make_rational_int(
+                    rational_floor(rational_divide(x,y)))));
+}
+
+static NPY_INLINE rational
+rational_abs(rational x) {
+    rational y;
+    y.n = safe_abs32(x.n);
+    y.dmm = x.dmm;
+    return y;
+}
+
+static NPY_INLINE npy_int64
+rational_rint(rational x) {
+    /*
+     * Round towards nearest integer, moving exact half integers away
+     * from zero
+     */
+    npy_int32 d_ = d(x);
+    return (2*(npy_int64)x.n+(x.n<0?-d_:d_))/(2*(npy_int64)d_);
+}
+
+static NPY_INLINE int
+rational_sign(rational x) {
+    return x.n<0?-1:x.n==0?0:1;
+}
+
+static NPY_INLINE rational
+rational_inverse(rational x) {
+    rational y = {0};
+    if (!x.n) {
+        set_zero_divide();
+    }
+    else {
+        npy_int32 d_;
+        y.n = d(x);
+        d_ = x.n;
+        if (d_ <= 0) {
+            d_ = safe_neg(d_);
+            y.n = -y.n;
+        }
+        y.dmm = d_-1;
+    }
+    return y;
+}
+
+static NPY_INLINE int
+rational_eq(rational x, rational y) {
+    /*
+     * Since we enforce d > 0, and store fractions in reduced form,
+     * equality is easy.
+     */
+    return x.n==y.n && x.dmm==y.dmm;
+}
+
+static NPY_INLINE int
+rational_ne(rational x, rational y) {
+    return !rational_eq(x,y);
+}
+
+static NPY_INLINE int
+rational_lt(rational x, rational y) {
+    return (npy_int64)x.n*d(y) < (npy_int64)y.n*d(x);
+}
+
+static NPY_INLINE int
+rational_gt(rational x, rational y) {
+    return rational_lt(y,x);
+}
+
+static NPY_INLINE int
+rational_le(rational x, rational y) {
+    return !rational_lt(y,x);
+}
+
+static NPY_INLINE int
+rational_ge(rational x, rational y) {
+    return !rational_lt(x,y);
+}
+
+static NPY_INLINE npy_int32
+rational_int(rational x) {
+    return x.n/d(x);
+}
+
+static NPY_INLINE double
+rational_double(rational x) {
+    return (double)x.n/d(x);
+}
+
+static NPY_INLINE int
+rational_nonzero(rational x) {
+    return x.n!=0;
+}
+
+static int
+scan_rational(const char** s, rational* x) {
+    long n,d;
+    int offset;
+    const char* ss;
+    if (sscanf(*s,"%ld%n",&n,&offset)<=0) {
+        return 0;
+    }
+    ss = *s+offset;
+    if (*ss!='/') {
+        *s = ss;
+        *x = make_rational_int(n);
+        return 1;
+    }
+    ss++;
+    if (sscanf(ss,"%ld%n",&d,&offset)<=0 || d<=0) {
+        return 0;
+    }
+    *s = ss+offset;
+    *x = make_rational_slow(n,d);
+    return 1;
+}
+
+/* Expose rational to Python as a numpy scalar */
+
+typedef struct {
+    PyObject_HEAD
+    rational r;
+} PyRational;
+
+static PyTypeObject PyRational_Type;
+
+static NPY_INLINE int
+PyRational_Check(PyObject* object) {
+    return PyObject_IsInstance(object,(PyObject*)&PyRational_Type);
+}
+
+static PyObject*
+PyRational_FromRational(rational x) {
+    PyRational* p = (PyRational*)PyRational_Type.tp_alloc(&PyRational_Type,0);
+    if (p) {
+        p->r = x;
+    }
+    return (PyObject*)p;
+}
+
+static PyObject*
+pyrational_new(PyTypeObject* type, PyObject* args, PyObject* kwds) {
+    Py_ssize_t size;
+    PyObject* x[2];
+    long n[2]={0,1};
+    int i;
+    rational r;
+    if (kwds && PyDict_Size(kwds)) {
+        PyErr_SetString(PyExc_TypeError,
+                "constructor takes no keyword arguments");
+        return 0;
+    }
+    size = PyTuple_GET_SIZE(args);
+    if (size > 2) {
+        PyErr_SetString(PyExc_TypeError,
+                "expected rational or numerator and optional denominator");
+        return 0;
+    }
+
+    if (size == 1) {
+        x[0] = PyTuple_GET_ITEM(args, 0);
+        if (PyRational_Check(x[0])) {
+            Py_INCREF(x[0]);
+            return x[0];
+        }
+        // TODO: allow construction from unicode strings
+        else if (PyBytes_Check(x[0])) {
+            const char* s = PyBytes_AS_STRING(x[0]);
+            rational x;
+            if (scan_rational(&s,&x)) {
+                const char* p;
+                for (p = s; *p; p++) {
+                    if (!isspace(*p)) {
+                        goto bad;
+                    }
+                }
+                return PyRational_FromRational(x);
+            }
+            bad:
+            PyErr_Format(PyExc_ValueError,
+                    "invalid rational literal '%s'",s);
+            return 0;
+        }
+    }
+
+    for (i=0; i<size; i++) {
+        PyObject* y;
+        int eq;
+        x[i] = PyTuple_GET_ITEM(args, i);
+        n[i] = PyLong_AsLong(x[i]);
+        if (error_converting(n[i])) {
+            if (PyErr_ExceptionMatches(PyExc_TypeError)) {
+                PyErr_Format(PyExc_TypeError,
+                        "expected integer %s, got %s",
+                        (i ? "denominator" : "numerator"),
+                        x[i]->ob_type->tp_name);
+            }
+            return 0;
+        }
+        /* Check that we had an exact integer */
+        y = PyLong_FromLong(n[i]);
+        if (!y) {
+            return 0;
+        }
+        eq = PyObject_RichCompareBool(x[i],y,Py_EQ);
+        Py_DECREF(y);
+        if (eq<0) {
+            return 0;
+        }
+        if (!eq) {
+            PyErr_Format(PyExc_TypeError,
+                    "expected integer %s, got %s",
+                    (i ? "denominator" : "numerator"),
+                    x[i]->ob_type->tp_name);
+            return 0;
+        }
+    }
+    r = make_rational_slow(n[0],n[1]);
+    if (PyErr_Occurred()) {
+        return 0;
+    }
+    return PyRational_FromRational(r);
+}
+
+/*
+ * Returns Py_NotImplemented on most conversion failures, or raises an
+ * overflow error for too long ints
+ */
+#define AS_RATIONAL(dst,object) \
+    { \
+        dst.n = 0; \
+        if (PyRational_Check(object)) { \
+            dst = ((PyRational*)object)->r; \
+        } \
+        else { \
+            PyObject* y_; \
+            int eq_; \
+            long n_ = PyLong_AsLong(object); \
+            if (error_converting(n_)) { \
+                if (PyErr_ExceptionMatches(PyExc_TypeError)) { \
+                    PyErr_Clear(); \
+                    Py_INCREF(Py_NotImplemented); \
+                    return Py_NotImplemented; \
+                } \
+                return 0; \
+            } \
+            y_ = PyLong_FromLong(n_); \
+            if (!y_) { \
+                return 0; \
+            } \
+            eq_ = PyObject_RichCompareBool(object,y_,Py_EQ); \
+            Py_DECREF(y_); \
+            if (eq_<0) { \
+                return 0; \
+            } \
+            if (!eq_) { \
+                Py_INCREF(Py_NotImplemented); \
+                return Py_NotImplemented; \
+            } \
+            dst = make_rational_int(n_); \
+        } \
+    }
+
+static PyObject*
+pyrational_richcompare(PyObject* a, PyObject* b, int op) {
+    rational x, y;
+    int result = 0;
+    AS_RATIONAL(x,a);
+    AS_RATIONAL(y,b);
+    #define OP(py,op) case py: result = rational_##op(x,y); break;
+    switch (op) {
+        OP(Py_LT,lt)
+        OP(Py_LE,le)
+        OP(Py_EQ,eq)
+        OP(Py_NE,ne)
+        OP(Py_GT,gt)
+        OP(Py_GE,ge)
+    };
+    #undef OP
+    return PyBool_FromLong(result);
+}
+
+static PyObject*
+pyrational_repr(PyObject* self) {
+    rational x = ((PyRational*)self)->r;
+    if (d(x)!=1) {
+        return PyUnicode_FromFormat(
+                "rational(%ld,%ld)",(long)x.n,(long)d(x));
+    }
+    else {
+        return PyUnicode_FromFormat(
+                "rational(%ld)",(long)x.n);
+    }
+}
+
+static PyObject*
+pyrational_str(PyObject* self) {
+    rational x = ((PyRational*)self)->r;
+    if (d(x)!=1) {
+        return PyUnicode_FromFormat(
+                "%ld/%ld",(long)x.n,(long)d(x));
+    }
+    else {
+        return PyUnicode_FromFormat(
+                "%ld",(long)x.n);
+    }
+}
+
+static npy_hash_t
+pyrational_hash(PyObject* self) {
+    rational x = ((PyRational*)self)->r;
+    /* Use a fairly weak hash as Python expects */
+    long h = 131071*x.n+524287*x.dmm;
+    /* Never return the special error value -1 */
+    return h==-1?2:h;
+}
+
+#define RATIONAL_BINOP_2(name,exp) \
+    static PyObject* \
+    pyrational_##name(PyObject* a, PyObject* b) { \
+        rational x, y, z; \
+        AS_RATIONAL(x,a); \
+        AS_RATIONAL(y,b); \
+        z = exp; \
+        if (PyErr_Occurred()) { \
+            return 0; \
+        } \
+        return PyRational_FromRational(z); \
+    }
+#define RATIONAL_BINOP(name) RATIONAL_BINOP_2(name,rational_##name(x,y))
+RATIONAL_BINOP(add)
+RATIONAL_BINOP(subtract)
+RATIONAL_BINOP(multiply)
+RATIONAL_BINOP(divide)
+RATIONAL_BINOP(remainder)
+RATIONAL_BINOP_2(floor_divide,
+    make_rational_int(rational_floor(rational_divide(x,y))))
+
+#define RATIONAL_UNOP(name,type,exp,convert) \
+    static PyObject* \
+    pyrational_##name(PyObject* self) { \
+        rational x = ((PyRational*)self)->r; \
+        type y = exp; \
+        if (PyErr_Occurred()) { \
+            return 0; \
+        } \
+        return convert(y); \
+    }
+RATIONAL_UNOP(negative,rational,rational_negative(x),PyRational_FromRational)
+RATIONAL_UNOP(absolute,rational,rational_abs(x),PyRational_FromRational)
+RATIONAL_UNOP(int,long,rational_int(x),PyLong_FromLong)
+RATIONAL_UNOP(float,double,rational_double(x),PyFloat_FromDouble)
+
+static PyObject*
+pyrational_positive(PyObject* self) {
+    Py_INCREF(self);
+    return self;
+}
+
+static int
+pyrational_nonzero(PyObject* self) {
+    rational x = ((PyRational*)self)->r;
+    return rational_nonzero(x);
+}
+
+static PyNumberMethods pyrational_as_number = {
+    pyrational_add,          /* nb_add */
+    pyrational_subtract,     /* nb_subtract */
+    pyrational_multiply,     /* nb_multiply */
+    pyrational_remainder,    /* nb_remainder */
+    0,                       /* nb_divmod */
+    0,                       /* nb_power */
+    pyrational_negative,     /* nb_negative */
+    pyrational_positive,     /* nb_positive */
+    pyrational_absolute,     /* nb_absolute */
+    pyrational_nonzero,      /* nb_bool */
+    0,                       /* nb_invert */
+    0,                       /* nb_lshift */
+    0,                       /* nb_rshift */
+    0,                       /* nb_and */
+    0,                       /* nb_xor */
+    0,                       /* nb_or */
+    pyrational_int,          /* nb_int */
+    0,                       /* reserved */
+    pyrational_float,        /* nb_float */
+
+    0,                       /* nb_inplace_add */
+    0,                       /* nb_inplace_subtract */
+    0,                       /* nb_inplace_multiply */
+    0,                       /* nb_inplace_remainder */
+    0,                       /* nb_inplace_power */
+    0,                       /* nb_inplace_lshift */
+    0,                       /* nb_inplace_rshift */
+    0,                       /* nb_inplace_and */
+    0,                       /* nb_inplace_xor */
+    0,                       /* nb_inplace_or */
+
+    pyrational_floor_divide, /* nb_floor_divide */
+    pyrational_divide,       /* nb_true_divide */
+    0,                       /* nb_inplace_floor_divide */
+    0,                       /* nb_inplace_true_divide */
+    0,                       /* nb_index */
+};
+
+static PyObject*
+pyrational_n(PyObject* self, void* closure) {
+    return PyLong_FromLong(((PyRational*)self)->r.n);
+}
+
+static PyObject*
+pyrational_d(PyObject* self, void* closure) {
+    return PyLong_FromLong(d(((PyRational*)self)->r));
+}
+
+static PyGetSetDef pyrational_getset[] = {
+    {(char*)"n",pyrational_n,0,(char*)"numerator",0},
+    {(char*)"d",pyrational_d,0,(char*)"denominator",0},
+    {0} /* sentinel */
+};
+
+static PyTypeObject PyRational_Type = {
+    PyVarObject_HEAD_INIT(NULL, 0)
+    "numpy.core._rational_tests.rational",  /* tp_name */
+    sizeof(PyRational),                       /* tp_basicsize */
+    0,                                        /* tp_itemsize */
+    0,                                        /* tp_dealloc */
+    0,                                        /* tp_print */
+    0,                                        /* tp_getattr */
+    0,                                        /* tp_setattr */
+    0,                                        /* tp_reserved */
+    pyrational_repr,                          /* tp_repr */
+    &pyrational_as_number,                    /* tp_as_number */
+    0,                                        /* tp_as_sequence */
+    0,                                        /* tp_as_mapping */
+    pyrational_hash,                          /* tp_hash */
+    0,                                        /* tp_call */
+    pyrational_str,                           /* tp_str */
+    0,                                        /* tp_getattro */
+    0,                                        /* tp_setattro */
+    0,                                        /* tp_as_buffer */
+    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */
+    "Fixed precision rational numbers",       /* tp_doc */
+    0,                                        /* tp_traverse */
+    0,                                        /* tp_clear */
+    pyrational_richcompare,                   /* tp_richcompare */
+    0,                                        /* tp_weaklistoffset */
+    0,                                        /* tp_iter */
+    0,                                        /* tp_iternext */
+    0,                                        /* tp_methods */
+    0,                                        /* tp_members */
+    pyrational_getset,                        /* tp_getset */
+    0,                                        /* tp_base */
+    0,                                        /* tp_dict */
+    0,                                        /* tp_descr_get */
+    0,                                        /* tp_descr_set */
+    0,                                        /* tp_dictoffset */
+    0,                                        /* tp_init */
+    0,                                        /* tp_alloc */
+    pyrational_new,                           /* tp_new */
+    0,                                        /* tp_free */
+    0,                                        /* tp_is_gc */
+    0,                                        /* tp_bases */
+    0,                                        /* tp_mro */
+    0,                                        /* tp_cache */
+    0,                                        /* tp_subclasses */
+    0,                                        /* tp_weaklist */
+    0,                                        /* tp_del */
+    0,                                        /* tp_version_tag */
+};
+
+/* NumPy support */
+
+static PyObject*
+npyrational_getitem(void* data, void* arr) {
+    rational r;
+    memcpy(&r,data,sizeof(rational));
+    return PyRational_FromRational(r);
+}
+
+static int
+npyrational_setitem(PyObject* item, void* data, void* arr) {
+    rational r;
+    if (PyRational_Check(item)) {
+        r = ((PyRational*)item)->r;
+    }
+    else {
+        long long n = PyLong_AsLongLong(item);
+        PyObject* y;
+        int eq;
+        if (error_converting(n)) {
+            return -1;
+        }
+        y = PyLong_FromLongLong(n);
+        if (!y) {
+            return -1;
+        }
+        eq = PyObject_RichCompareBool(item, y, Py_EQ);
+        Py_DECREF(y);
+        if (eq<0) {
+            return -1;
+        }
+        if (!eq) {
+            PyErr_Format(PyExc_TypeError,
+                    "expected rational, got %s", item->ob_type->tp_name);
+            return -1;
+        }
+        r = make_rational_int(n);
+    }
+    memcpy(data, &r, sizeof(rational));
+    return 0;
+}
+
+static NPY_INLINE void
+byteswap(npy_int32* x) {
+    char* p = (char*)x;
+    size_t i;
+    for (i = 0; i < sizeof(*x)/2; i++) {
+        size_t j = sizeof(*x)-1-i;
+        char t = p[i];
+        p[i] = p[j];
+        p[j] = t;
+    }
+}
+
+static void
+npyrational_copyswapn(void* dst_, npy_intp dstride, void* src_,
+        npy_intp sstride, npy_intp n, int swap, void* arr) {
+    char *dst = (char*)dst_, *src = (char*)src_;
+    npy_intp i;
+    if (!src) {
+        return;
+    }
+    if (swap) {
+        for (i = 0; i < n; i++) {
+            rational* r = (rational*)(dst+dstride*i);
+            memcpy(r,src+sstride*i,sizeof(rational));
+            byteswap(&r->n);
+            byteswap(&r->dmm);
+        }
+    }
+    else if (dstride == sizeof(rational) && sstride == sizeof(rational)) {
+        memcpy(dst, src, n*sizeof(rational));
+    }
+    else {
+        for (i = 0; i < n; i++) {
+            memcpy(dst + dstride*i, src + sstride*i, sizeof(rational));
+        }
+    }
+}
+
+static void
+npyrational_copyswap(void* dst, void* src, int swap, void* arr) {
+    rational* r;
+    if (!src) {
+        return;
+    }
+    r = (rational*)dst;
+    memcpy(r,src,sizeof(rational));
+    if (swap) {
+        byteswap(&r->n);
+        byteswap(&r->dmm);
+    }
+}
+
+static int
+npyrational_compare(const void* d0, const void* d1, void* arr) {
+    rational x = *(rational*)d0,
+             y = *(rational*)d1;
+    return rational_lt(x,y)?-1:rational_eq(x,y)?0:1;
+}
+
+#define FIND_EXTREME(name,op) \
+    static int \
+    npyrational_##name(void* data_, npy_intp n, \
+            npy_intp* max_ind, void* arr) { \
+        const rational* data; \
+        npy_intp best_i; \
+        rational best_r; \
+        npy_intp i; \
+        if (!n) { \
+            return 0; \
+        } \
+        data = (rational*)data_; \
+        best_i = 0; \
+        best_r = data[0]; \
+        for (i = 1; i < n; i++) { \
+            if (rational_##op(data[i],best_r)) { \
+                best_i = i; \
+                best_r = data[i]; \
+            } \
+        } \
+        *max_ind = best_i; \
+        return 0; \
+    }
+FIND_EXTREME(argmin,lt)
+FIND_EXTREME(argmax,gt)
+
+static void
+npyrational_dot(void* ip0_, npy_intp is0, void* ip1_, npy_intp is1,
+        void* op, npy_intp n, void* arr) {
+    rational r = {0};
+    const char *ip0 = (char*)ip0_, *ip1 = (char*)ip1_;
+    npy_intp i;
+    for (i = 0; i < n; i++) {
+        r = rational_add(r,rational_multiply(*(rational*)ip0,*(rational*)ip1));
+        ip0 += is0;
+        ip1 += is1;
+    }
+    *(rational*)op = r;
+}
+
+static npy_bool
+npyrational_nonzero(void* data, void* arr) {
+    rational r;
+    memcpy(&r,data,sizeof(r));
+    return rational_nonzero(r)?NPY_TRUE:NPY_FALSE;
+}
+
+static int
+npyrational_fill(void* data_, npy_intp length, void* arr) {
+    rational* data = (rational*)data_;
+    rational delta = rational_subtract(data[1],data[0]);
+    rational r = data[1];
+    npy_intp i;
+    for (i = 2; i < length; i++) {
+        r = rational_add(r,delta);
+        data[i] = r;
+    }
+    return 0;
+}
+
+static int
+npyrational_fillwithscalar(void* buffer_, npy_intp length,
+        void* value, void* arr) {
+    rational r = *(rational*)value;
+    rational* buffer = (rational*)buffer_;
+    npy_intp i;
+    for (i = 0; i < length; i++) {
+        buffer[i] = r;
+    }
+    return 0;
+}
+
+static PyArray_ArrFuncs npyrational_arrfuncs;
+
+typedef struct { char c; rational r; } align_test;
+
+PyArray_Descr npyrational_descr = {
+    PyObject_HEAD_INIT(0)
+    &PyRational_Type,       /* typeobj */
+    'V',                    /* kind */
+    'r',                    /* type */
+    '=',                    /* byteorder */
+    /*
+     * For now, we need NPY_NEEDS_PYAPI in order to make numpy detect our
+     * exceptions.  This isn't technically necessary,
+     * since we're careful about thread safety, and hopefully future
+     * versions of numpy will recognize that.
+     */
+    NPY_NEEDS_PYAPI | NPY_USE_GETITEM | NPY_USE_SETITEM, /* hasobject */
+    0,                      /* type_num */
+    sizeof(rational),       /* elsize */
+    offsetof(align_test,r), /* alignment */
+    0,                      /* subarray */
+    0,                      /* fields */
+    0,                      /* names */
+    &npyrational_arrfuncs,  /* f */
+};
+
+#define DEFINE_CAST(From,To,statement) \
+    static void \
+    npycast_##From##_##To(void* from_, void* to_, npy_intp n, \
+                          void* fromarr, void* toarr) { \
+        const From* from = (From*)from_; \
+        To* to = (To*)to_; \
+        npy_intp i; \
+        for (i = 0; i < n; i++) { \
+            From x = from[i]; \
+            statement \
+            to[i] = y; \
+        } \
+    }
+#define DEFINE_INT_CAST(bits) \
+    DEFINE_CAST(npy_int##bits,rational,rational y = make_rational_int(x);) \
+    DEFINE_CAST(rational,npy_int##bits,npy_int32 z = rational_int(x); \
+                npy_int##bits y = z; if (y != z) set_overflow();)
+DEFINE_INT_CAST(8)
+DEFINE_INT_CAST(16)
+DEFINE_INT_CAST(32)
+DEFINE_INT_CAST(64)
+DEFINE_CAST(rational,float,double y = rational_double(x);)
+DEFINE_CAST(rational,double,double y = rational_double(x);)
+DEFINE_CAST(npy_bool,rational,rational y = make_rational_int(x);)
+DEFINE_CAST(rational,npy_bool,npy_bool y = rational_nonzero(x);)
+
+#define BINARY_UFUNC(name,intype0,intype1,outtype,exp) \
+    void name(char** args, npy_intp const *dimensions, \
+              npy_intp const *steps, void* data) { \
+        npy_intp is0 = steps[0], is1 = steps[1], \
+            os = steps[2], n = *dimensions; \
+        char *i0 = args[0], *i1 = args[1], *o = args[2]; \
+        npy_intp k; \
+        for (k = 0; k < n; k++) { \
+            intype0 x = *(intype0*)i0; \
+            intype1 y = *(intype1*)i1; \
+            *(outtype*)o = exp; \
+            i0 += is0; i1 += is1; o += os; \
+        } \
+    }
+#define RATIONAL_BINARY_UFUNC(name,type,exp) \
+    BINARY_UFUNC(rational_ufunc_##name,rational,rational,type,exp)
+RATIONAL_BINARY_UFUNC(add,rational,rational_add(x,y))
+RATIONAL_BINARY_UFUNC(subtract,rational,rational_subtract(x,y))
+RATIONAL_BINARY_UFUNC(multiply,rational,rational_multiply(x,y))
+RATIONAL_BINARY_UFUNC(divide,rational,rational_divide(x,y))
+RATIONAL_BINARY_UFUNC(remainder,rational,rational_remainder(x,y))
+RATIONAL_BINARY_UFUNC(floor_divide,rational,
+    make_rational_int(rational_floor(rational_divide(x,y))))
+PyUFuncGenericFunction rational_ufunc_true_divide = rational_ufunc_divide;
+RATIONAL_BINARY_UFUNC(minimum,rational,rational_lt(x,y)?x:y)
+RATIONAL_BINARY_UFUNC(maximum,rational,rational_lt(x,y)?y:x)
+RATIONAL_BINARY_UFUNC(equal,npy_bool,rational_eq(x,y))
+RATIONAL_BINARY_UFUNC(not_equal,npy_bool,rational_ne(x,y))
+RATIONAL_BINARY_UFUNC(less,npy_bool,rational_lt(x,y))
+RATIONAL_BINARY_UFUNC(greater,npy_bool,rational_gt(x,y))
+RATIONAL_BINARY_UFUNC(less_equal,npy_bool,rational_le(x,y))
+RATIONAL_BINARY_UFUNC(greater_equal,npy_bool,rational_ge(x,y))
+
+BINARY_UFUNC(gcd_ufunc,npy_int64,npy_int64,npy_int64,gcd(x,y))
+BINARY_UFUNC(lcm_ufunc,npy_int64,npy_int64,npy_int64,lcm(x,y))
+
+#define UNARY_UFUNC(name,type,exp) \
+    void rational_ufunc_##name(char** args, npy_intp const *dimensions, \
+                               npy_intp const *steps, void* data) { \
+        npy_intp is = steps[0], os = steps[1], n = *dimensions; \
+        char *i = args[0], *o = args[1]; \
+        npy_intp k; \
+        for (k = 0; k < n; k++) { \
+            rational x = *(rational*)i; \
+            *(type*)o = exp; \
+            i += is; o += os; \
+        } \
+    }
+UNARY_UFUNC(negative,rational,rational_negative(x))
+UNARY_UFUNC(absolute,rational,rational_abs(x))
+UNARY_UFUNC(floor,rational,make_rational_int(rational_floor(x)))
+UNARY_UFUNC(ceil,rational,make_rational_int(rational_ceil(x)))
+UNARY_UFUNC(trunc,rational,make_rational_int(x.n/d(x)))
+UNARY_UFUNC(square,rational,rational_multiply(x,x))
+UNARY_UFUNC(rint,rational,make_rational_int(rational_rint(x)))
+UNARY_UFUNC(sign,rational,make_rational_int(rational_sign(x)))
+UNARY_UFUNC(reciprocal,rational,rational_inverse(x))
+UNARY_UFUNC(numerator,npy_int64,x.n)
+UNARY_UFUNC(denominator,npy_int64,d(x))
+
+static NPY_INLINE void
+rational_matrix_multiply(char **args, npy_intp const *dimensions, npy_intp const *steps)
+{
+    /* pointers to data for input and output arrays */
+    char *ip1 = args[0];
+    char *ip2 = args[1];
+    char *op = args[2];
+
+    /* lengths of core dimensions */
+    npy_intp dm = dimensions[0];
+    npy_intp dn = dimensions[1];
+    npy_intp dp = dimensions[2];
+
+    /* striding over core dimensions */
+    npy_intp is1_m = steps[0];
+    npy_intp is1_n = steps[1];
+    npy_intp is2_n = steps[2];
+    npy_intp is2_p = steps[3];
+    npy_intp os_m = steps[4];
+    npy_intp os_p = steps[5];
+
+    /* core dimensions counters */
+    npy_intp m, p;
+
+    /* calculate dot product for each row/column vector pair */
+    for (m = 0; m < dm; m++) {
+        for (p = 0; p < dp; p++) {
+            npyrational_dot(ip1, is1_n, ip2, is2_n, op, dn, NULL);
+
+            /* advance to next column of 2nd input array and output array */
+            ip2 += is2_p;
+            op  +=  os_p;
+        }
+
+        /* reset to first column of 2nd input array and output array */
+        ip2 -= is2_p * p;
+        op -= os_p * p;
+
+        /* advance to next row of 1st input array and output array */
+        ip1 += is1_m;
+        op += os_m;
+    }
+}
+
+
+static void
+rational_gufunc_matrix_multiply(char **args, npy_intp const *dimensions,
+                                npy_intp const *steps, void *NPY_UNUSED(func))
+{
+    /* outer dimensions counter */
+    npy_intp N_;
+
+    /* length of flattened outer dimensions */
+    npy_intp dN = dimensions[0];
+
+    /* striding over flattened outer dimensions for input and output arrays */
+    npy_intp s0 = steps[0];
+    npy_intp s1 = steps[1];
+    npy_intp s2 = steps[2];
+
+    /*
+     * loop through outer dimensions, performing matrix multiply on
+     * core dimensions for each loop
+     */
+    for (N_ = 0; N_ < dN; N_++, args[0] += s0, args[1] += s1, args[2] += s2) {
+        rational_matrix_multiply(args, dimensions+1, steps+3);
+    }
+}
+
+
+static void
+rational_ufunc_test_add(char** args, npy_intp const *dimensions,
+                        npy_intp const *steps, void* data) {
+    npy_intp is0 = steps[0], is1 = steps[1], os = steps[2], n = *dimensions;
+    char *i0 = args[0], *i1 = args[1], *o = args[2];
+    int k;
+    for (k = 0; k < n; k++) {
+        npy_int64 x = *(npy_int64*)i0;
+        npy_int64 y = *(npy_int64*)i1;
+        *(rational*)o = rational_add(make_rational_fast(x, 1),
+                                     make_rational_fast(y, 1));
+        i0 += is0; i1 += is1; o += os;
+    }
+}
+
+
+static void
+rational_ufunc_test_add_rationals(char** args, npy_intp const *dimensions,
+                        npy_intp const *steps, void* data) {
+    npy_intp is0 = steps[0], is1 = steps[1], os = steps[2], n = *dimensions;
+    char *i0 = args[0], *i1 = args[1], *o = args[2];
+    int k;
+    for (k = 0; k < n; k++) {
+        rational x = *(rational*)i0;
+        rational y = *(rational*)i1;
+        *(rational*)o = rational_add(x, y);
+        i0 += is0; i1 += is1; o += os;
+    }
+}
+
+
+PyMethodDef module_methods[] = {
+    {0} /* sentinel */
+};
+
+static struct PyModuleDef moduledef = {
+    PyModuleDef_HEAD_INIT,
+    "_rational_tests",
+    NULL,
+    -1,
+    module_methods,
+    NULL,
+    NULL,
+    NULL,
+    NULL
+};
+
+PyMODINIT_FUNC PyInit__rational_tests(void) {
+    PyObject *m = NULL;
+    PyObject* numpy_str;
+    PyObject* numpy;
+    int npy_rational;
+
+    import_array();
+    if (PyErr_Occurred()) {
+        goto fail;
+    }
+    import_umath();
+    if (PyErr_Occurred()) {
+        goto fail;
+    }
+    numpy_str = PyUnicode_FromString("numpy");
+    if (!numpy_str) {
+        goto fail;
+    }
+    numpy = PyImport_Import(numpy_str);
+    Py_DECREF(numpy_str);
+    if (!numpy) {
+        goto fail;
+    }
+
+    /* Can't set this until we import numpy */
+    PyRational_Type.tp_base = &PyGenericArrType_Type;
+
+    /* Initialize rational type object */
+    if (PyType_Ready(&PyRational_Type) < 0) {
+        goto fail;
+    }
+
+    /* Initialize rational descriptor */
+    PyArray_InitArrFuncs(&npyrational_arrfuncs);
+    npyrational_arrfuncs.getitem = npyrational_getitem;
+    npyrational_arrfuncs.setitem = npyrational_setitem;
+    npyrational_arrfuncs.copyswapn = npyrational_copyswapn;
+    npyrational_arrfuncs.copyswap = npyrational_copyswap;
+    npyrational_arrfuncs.compare = npyrational_compare;
+    npyrational_arrfuncs.argmin = npyrational_argmin;
+    npyrational_arrfuncs.argmax = npyrational_argmax;
+    npyrational_arrfuncs.dotfunc = npyrational_dot;
+    npyrational_arrfuncs.nonzero = npyrational_nonzero;
+    npyrational_arrfuncs.fill = npyrational_fill;
+    npyrational_arrfuncs.fillwithscalar = npyrational_fillwithscalar;
+    /* Left undefined: scanfunc, fromstr, sort, argsort */
+    Py_SET_TYPE(&npyrational_descr, &PyArrayDescr_Type);
+    npy_rational = PyArray_RegisterDataType(&npyrational_descr);
+    if (npy_rational<0) {
+        goto fail;
+    }
+
+    /* Support dtype(rational) syntax */
+    if (PyDict_SetItemString(PyRational_Type.tp_dict, "dtype",
+                             (PyObject*)&npyrational_descr) < 0) {
+        goto fail;
+    }
+
+    /* Register casts to and from rational */
+    #define REGISTER_CAST(From,To,from_descr,to_typenum,safe) { \
+            PyArray_Descr* from_descr_##From##_##To = (from_descr); \
+            if (PyArray_RegisterCastFunc(from_descr_##From##_##To, \
+                                         (to_typenum), \
+                                         npycast_##From##_##To) < 0) { \
+                goto fail; \
+            } \
+            if (safe && PyArray_RegisterCanCast(from_descr_##From##_##To, \
+                                                (to_typenum), \
+                                                NPY_NOSCALAR) < 0) { \
+                goto fail; \
+            } \
+        }
+    #define REGISTER_INT_CASTS(bits) \
+        REGISTER_CAST(npy_int##bits, rational, \
+                      PyArray_DescrFromType(NPY_INT##bits), npy_rational, 1) \
+        REGISTER_CAST(rational, npy_int##bits, &npyrational_descr, \
+                      NPY_INT##bits, 0)
+    REGISTER_INT_CASTS(8)
+    REGISTER_INT_CASTS(16)
+    REGISTER_INT_CASTS(32)
+    REGISTER_INT_CASTS(64)
+    REGISTER_CAST(rational,float,&npyrational_descr,NPY_FLOAT,0)
+    REGISTER_CAST(rational,double,&npyrational_descr,NPY_DOUBLE,1)
+    REGISTER_CAST(npy_bool,rational, PyArray_DescrFromType(NPY_BOOL),
+                  npy_rational,1)
+    REGISTER_CAST(rational,npy_bool,&npyrational_descr,NPY_BOOL,0)
+
+    /* Register ufuncs */
+    #define REGISTER_UFUNC(name,...) { \
+        PyUFuncObject* ufunc = \
+            (PyUFuncObject*)PyObject_GetAttrString(numpy, #name); \
+        int _types[] = __VA_ARGS__; \
+        if (!ufunc) { \
+            goto fail; \
+        } \
+        if (sizeof(_types)/sizeof(int)!=ufunc->nargs) { \
+            PyErr_Format(PyExc_AssertionError, \
+                         "ufunc %s takes %d arguments, our loop takes %lu", \
+                         #name, ufunc->nargs, (unsigned long) \
+                         (sizeof(_types)/sizeof(int))); \
+            Py_DECREF(ufunc); \
+            goto fail; \
+        } \
+        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, npy_rational, \
+                rational_ufunc_##name, _types, 0) < 0) { \
+            Py_DECREF(ufunc); \
+            goto fail; \
+        } \
+        Py_DECREF(ufunc); \
+    }
+    #define REGISTER_UFUNC_BINARY_RATIONAL(name) \
+        REGISTER_UFUNC(name, {npy_rational, npy_rational, npy_rational})
+    #define REGISTER_UFUNC_BINARY_COMPARE(name) \
+        REGISTER_UFUNC(name, {npy_rational, npy_rational, NPY_BOOL})
+    #define REGISTER_UFUNC_UNARY(name) \
+        REGISTER_UFUNC(name, {npy_rational, npy_rational})
+    /* Binary */
+    REGISTER_UFUNC_BINARY_RATIONAL(add)
+    REGISTER_UFUNC_BINARY_RATIONAL(subtract)
+    REGISTER_UFUNC_BINARY_RATIONAL(multiply)
+    REGISTER_UFUNC_BINARY_RATIONAL(divide)
+    REGISTER_UFUNC_BINARY_RATIONAL(remainder)
+    REGISTER_UFUNC_BINARY_RATIONAL(true_divide)
+    REGISTER_UFUNC_BINARY_RATIONAL(floor_divide)
+    REGISTER_UFUNC_BINARY_RATIONAL(minimum)
+    REGISTER_UFUNC_BINARY_RATIONAL(maximum)
+    /* Comparisons */
+    REGISTER_UFUNC_BINARY_COMPARE(equal)
+    REGISTER_UFUNC_BINARY_COMPARE(not_equal)
+    REGISTER_UFUNC_BINARY_COMPARE(less)
+    REGISTER_UFUNC_BINARY_COMPARE(greater)
+    REGISTER_UFUNC_BINARY_COMPARE(less_equal)
+    REGISTER_UFUNC_BINARY_COMPARE(greater_equal)
+    /* Unary */
+    REGISTER_UFUNC_UNARY(negative)
+    REGISTER_UFUNC_UNARY(absolute)
+    REGISTER_UFUNC_UNARY(floor)
+    REGISTER_UFUNC_UNARY(ceil)
+    REGISTER_UFUNC_UNARY(trunc)
+    REGISTER_UFUNC_UNARY(rint)
+    REGISTER_UFUNC_UNARY(square)
+    REGISTER_UFUNC_UNARY(reciprocal)
+    REGISTER_UFUNC_UNARY(sign)
+
+    /* Create module */
+    m = PyModule_Create(&moduledef);
+
+    if (!m) {
+        goto fail;
+    }
+
+    /* Add rational type */
+    Py_INCREF(&PyRational_Type);
+    PyModule_AddObject(m,"rational",(PyObject*)&PyRational_Type);
+
+    /* Create matrix multiply generalized ufunc */
+    {
+        int types2[3] = {npy_rational,npy_rational,npy_rational};
+        PyObject* gufunc = PyUFunc_FromFuncAndDataAndSignature(0,0,0,0,2,1,
+            PyUFunc_None,(char*)"matrix_multiply",
+            (char*)"return result of multiplying two matrices of rationals",
+            0,"(m,n),(n,p)->(m,p)");
+        if (!gufunc) {
+            goto fail;
+        }
+        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)gufunc, npy_rational,
+                rational_gufunc_matrix_multiply, types2, 0) < 0) {
+            goto fail;
+        }
+        PyModule_AddObject(m,"matrix_multiply",(PyObject*)gufunc);
+    }
+
+    /* Create test ufunc with built in input types and rational output type */
+    {
+        int types3[3] = {NPY_INT64,NPY_INT64,npy_rational};
+
+        PyObject* ufunc = PyUFunc_FromFuncAndData(0,0,0,0,2,1,
+                PyUFunc_None,(char*)"test_add",
+                (char*)"add two matrices of int64 and return rational matrix",0);
+        if (!ufunc) {
+            goto fail;
+        }
+        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, npy_rational,
+                rational_ufunc_test_add, types3, 0) < 0) {
+            goto fail;
+        }
+        PyModule_AddObject(m,"test_add",(PyObject*)ufunc);
+    }
+
+    /* Create test ufunc with rational types using RegisterLoopForDescr */
+    {
+        PyObject* ufunc = PyUFunc_FromFuncAndData(0,0,0,0,2,1,
+                PyUFunc_None,(char*)"test_add_rationals",
+                (char*)"add two matrices of rationals and return rational matrix",0);
+        PyArray_Descr* types[3] = {&npyrational_descr,
+                                    &npyrational_descr,
+                                    &npyrational_descr};
+
+        if (!ufunc) {
+            goto fail;
+        }
+        if (PyUFunc_RegisterLoopForDescr((PyUFuncObject*)ufunc, &npyrational_descr,
+                rational_ufunc_test_add_rationals, types, 0) < 0) {
+            goto fail;
+        }
+        PyModule_AddObject(m,"test_add_rationals",(PyObject*)ufunc);
+    }
+
+    /* Create numerator and denominator ufuncs */
+    #define NEW_UNARY_UFUNC(name,type,doc) { \
+        int types[2] = {npy_rational,type}; \
+        PyObject* ufunc = PyUFunc_FromFuncAndData(0,0,0,0,1,1, \
+            PyUFunc_None,(char*)#name,(char*)doc,0); \
+        if (!ufunc) { \
+            goto fail; \
+        } \
+        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, \
+                npy_rational,rational_ufunc_##name,types,0)<0) { \
+            goto fail; \
+        } \
+        PyModule_AddObject(m,#name,(PyObject*)ufunc); \
+    }
+    NEW_UNARY_UFUNC(numerator,NPY_INT64,"rational number numerator");
+    NEW_UNARY_UFUNC(denominator,NPY_INT64,"rational number denominator");
+
+    /* Create gcd and lcm ufuncs */
+    #define GCD_LCM_UFUNC(name,type,doc) { \
+        static const PyUFuncGenericFunction func[1] = {name##_ufunc}; \
+        static const char types[3] = {type,type,type}; \
+        static void* data[1] = {0}; \
+        PyObject* ufunc = PyUFunc_FromFuncAndData( \
+            (PyUFuncGenericFunction*)func, data,(char*)types, \
+            1,2,1,PyUFunc_One,(char*)#name,(char*)doc,0); \
+        if (!ufunc) { \
+            goto fail; \
+        } \
+        PyModule_AddObject(m,#name,(PyObject*)ufunc); \
+    }
+    GCD_LCM_UFUNC(gcd,NPY_INT64,"greatest common divisor of two integers");
+    GCD_LCM_UFUNC(lcm,NPY_INT64,"least common multiple of two integers");
+
+    return m;
+
+fail:
+    if (!PyErr_Occurred()) {
+        PyErr_SetString(PyExc_RuntimeError,
+                        "cannot load _rational_tests module.");
+    }
+    if (m) {
+        Py_DECREF(m);
+        m = NULL;
+    }
+    return m;
+}
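Note on the inner loops in the file above: every loop generated by UNARY_UFUNC walks its input and output through char pointers and advances them by byte strides taken from `steps`, so the same function serves contiguous and non-contiguous arrays. The sketch below reproduces that pattern in plain C without the NumPy headers. The names `demo_rational`, `demo_denominator` and `demo_denominator_loop` are hypothetical, and `ptrdiff_t` stands in for `npy_intp`; it is an illustration under those assumptions, not the code registered by the module.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's `rational` struct: the numerator
 * plus "denominator minus one", so all-zero bytes represent 0/1. */
typedef struct {
    int32_t n;
    int32_t dmm;
} demo_rational;

/* Recover the real denominator from the stored "minus one" form. */
static int32_t
demo_denominator(demo_rational r)
{
    return r.dmm + 1;
}

/* Same shape as a UNARY_UFUNC expansion: read one element per input slot,
 * write one int64 per output slot, then advance both by their byte strides. */
static void
demo_denominator_loop(char **args, const ptrdiff_t *dimensions,
                      const ptrdiff_t *steps)
{
    ptrdiff_t is = steps[0], os = steps[1], n = dimensions[0];
    char *i = args[0], *o = args[1];
    ptrdiff_t k;
    for (k = 0; k < n; k++) {
        demo_rational x = *(demo_rational *)i;
        *(int64_t *)o = demo_denominator(x);
        i += is;
        o += os;
    }
}
```

To use it, pass the input and output buffers in `args`, the element count in `dimensions[0]`, and the per-element byte offsets in `steps`. An input stride of `2 * sizeof(demo_rational)` would visit every other element, which is how a sliced array reaches such a loop.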
diff --git a/numpy/core/src/umath/_rational_tests.c.src b/numpy/core/src/umath/_rational_tests.c.src
deleted file mode 100644
index bf50a22..0000000
--- a/numpy/core/src/umath/_rational_tests.c.src
+++ /dev/null
@@ -1,1369 +0,0 @@
-/* Fixed size rational numbers exposed to Python */
-#define PY_SSIZE_T_CLEAN
-#include <Python.h>
-#include <structmember.h>
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-#include "numpy/arrayobject.h"
-#include "numpy/ufuncobject.h"
-#include "numpy/npy_3kcompat.h"
-#include "common.h"  /* for error_converting */
-
-#include <math.h>
-
-
-/* Relevant arithmetic exceptions */
-
-/* Uncomment the following line to work around a bug in numpy */
-/* #define ACQUIRE_GIL */
-
-static void
-set_overflow(void) {
-#ifdef ACQUIRE_GIL
-    /* Need to grab the GIL to dodge a bug in numpy */
-    PyGILState_STATE state = PyGILState_Ensure();
-#endif
-    if (!PyErr_Occurred()) {
-        PyErr_SetString(PyExc_OverflowError,
-                "overflow in rational arithmetic");
-    }
-#ifdef ACQUIRE_GIL
-    PyGILState_Release(state);
-#endif
-}
-
-static void
-set_zero_divide(void) {
-#ifdef ACQUIRE_GIL
-    /* Need to grab the GIL to dodge a bug in numpy */
-    PyGILState_STATE state = PyGILState_Ensure();
-#endif
-    if (!PyErr_Occurred()) {
-        PyErr_SetString(PyExc_ZeroDivisionError,
-                "zero divide in rational arithmetic");
-    }
-#ifdef ACQUIRE_GIL
-    PyGILState_Release(state);
-#endif
-}
-
-/* Integer arithmetic utilities */
-
-static NPY_INLINE npy_int32
-safe_neg(npy_int32 x) {
-    if (x==(npy_int32)1<<31) {
-        set_overflow();
-    }
-    return -x;
-}
-
-static NPY_INLINE npy_int32
-safe_abs32(npy_int32 x) {
-    npy_int32 nx;
-    if (x>=0) {
-        return x;
-    }
-    nx = -x;
-    if (nx<0) {
-        set_overflow();
-    }
-    return nx;
-}
-
-static NPY_INLINE npy_int64
-safe_abs64(npy_int64 x) {
-    npy_int64 nx;
-    if (x>=0) {
-        return x;
-    }
-    nx = -x;
-    if (nx<0) {
-        set_overflow();
-    }
-    return nx;
-}
-
-static NPY_INLINE npy_int64
-gcd(npy_int64 x, npy_int64 y) {
-    x = safe_abs64(x);
-    y = safe_abs64(y);
-    if (x < y) {
-        npy_int64 t = x;
-        x = y;
-        y = t;
-    }
-    while (y) {
-        npy_int64 t;
-        x = x%y;
-        t = x;
-        x = y;
-        y = t;
-    }
-    return x;
-}
-
-static NPY_INLINE npy_int64
-lcm(npy_int64 x, npy_int64 y) {
-    npy_int64 lcm;
-    if (!x || !y) {
-        return 0;
-    }
-    x /= gcd(x,y);
-    lcm = x*y;
-    if (lcm/y!=x) {
-        set_overflow();
-    }
-    return safe_abs64(lcm);
-}
-
-/* Fixed precision rational numbers */
-
-typedef struct {
-    /* numerator */
-    npy_int32 n;
-    /*
-     * denominator minus one: numpy.zeros() uses memset(0) for non-object
-     * types, so need to ensure that rational(0) has all zero bytes
-     */
-    npy_int32 dmm;
-} rational;
-
-static NPY_INLINE rational
-make_rational_int(npy_int64 n) {
-    rational r = {(npy_int32)n,0};
-    if (r.n != n) {
-        set_overflow();
-    }
-    return r;
-}
-
-static rational
-make_rational_slow(npy_int64 n_, npy_int64 d_) {
-    rational r = {0};
-    if (!d_) {
-        set_zero_divide();
-    }
-    else {
-        npy_int64 g = gcd(n_,d_);
-        npy_int32 d;
-        n_ /= g;
-        d_ /= g;
-        r.n = (npy_int32)n_;
-        d = (npy_int32)d_;
-        if (r.n!=n_ || d!=d_) {
-            set_overflow();
-        }
-        else {
-            if (d <= 0) {
-                d = -d;
-                r.n = safe_neg(r.n);
-            }
-            r.dmm = d-1;
-        }
-    }
-    return r;
-}
-
-static NPY_INLINE npy_int32
-d(rational r) {
-    return r.dmm+1;
-}
-
-/* Assumes d_ > 0 */
-static rational
-make_rational_fast(npy_int64 n_, npy_int64 d_) {
-    npy_int64 g = gcd(n_,d_);
-    rational r;
-    n_ /= g;
-    d_ /= g;
-    r.n = (npy_int32)n_;
-    r.dmm = (npy_int32)(d_-1);
-    if (r.n!=n_ || r.dmm+1!=d_) {
-        set_overflow();
-    }
-    return r;
-}
-
-static NPY_INLINE rational
-rational_negative(rational r) {
-    rational x;
-    x.n = safe_neg(r.n);
-    x.dmm = r.dmm;
-    return x;
-}
-
-static NPY_INLINE rational
-rational_add(rational x, rational y) {
-    /*
-     * Note that the numerator computation can never overflow int128_t,
-     * since each term is strictly under 2**128/4 (since d > 0).
-     */
-    return make_rational_fast((npy_int64)x.n*d(y)+(npy_int64)d(x)*y.n,
-        (npy_int64)d(x)*d(y));
-}
-
-static NPY_INLINE rational
-rational_subtract(rational x, rational y) {
-    /* We're safe from overflow as with + */
-    return make_rational_fast((npy_int64)x.n*d(y)-(npy_int64)d(x)*y.n,
-        (npy_int64)d(x)*d(y));
-}
-
-static NPY_INLINE rational
-rational_multiply(rational x, rational y) {
-    /* We're safe from overflow as with + */
-    return make_rational_fast((npy_int64)x.n*y.n,(npy_int64)d(x)*d(y));
-}
-
-static NPY_INLINE rational
-rational_divide(rational x, rational y) {
-    return make_rational_slow((npy_int64)x.n*d(y),(npy_int64)d(x)*y.n);
-}
-
-static NPY_INLINE npy_int64
-rational_floor(rational x) {
-    /* Always round down */
-    if (x.n>=0) {
-        return x.n/d(x);
-    }
-    /*
-     * This can be done without casting up to 64 bits, but it requires
-     * working out all the sign cases
-     */
-    return -((-(npy_int64)x.n+d(x)-1)/d(x));
-}
-
-static NPY_INLINE npy_int64
-rational_ceil(rational x) {
-    return -rational_floor(rational_negative(x));
-}
-
-static NPY_INLINE rational
-rational_remainder(rational x, rational y) {
-    return rational_subtract(x, rational_multiply(y,make_rational_int(
-                    rational_floor(rational_divide(x,y)))));
-}
-
-static NPY_INLINE rational
-rational_abs(rational x) {
-    rational y;
-    y.n = safe_abs32(x.n);
-    y.dmm = x.dmm;
-    return y;
-}
-
-static NPY_INLINE npy_int64
-rational_rint(rational x) {
-    /*
-     * Round towards nearest integer, moving exact half integers towards
-     * zero
-     */
-    npy_int32 d_ = d(x);
-    return (2*(npy_int64)x.n+(x.n<0?-d_:d_))/(2*(npy_int64)d_);
-}
-
-static NPY_INLINE int
-rational_sign(rational x) {
-    return x.n<0?-1:x.n==0?0:1;
-}
-
-static NPY_INLINE rational
-rational_inverse(rational x) {
-    rational y = {0};
-    if (!x.n) {
-        set_zero_divide();
-    }
-    else {
-        npy_int32 d_;
-        y.n = d(x);
-        d_ = x.n;
-        if (d_ <= 0) {
-            d_ = safe_neg(d_);
-            y.n = -y.n;
-        }
-        y.dmm = d_-1;
-    }
-    return y;
-}
-
-static NPY_INLINE int
-rational_eq(rational x, rational y) {
-    /*
-     * Since we enforce d > 0, and store fractions in reduced form,
-     * equality is easy.
-     */
-    return x.n==y.n && x.dmm==y.dmm;
-}
-
-static NPY_INLINE int
-rational_ne(rational x, rational y) {
-    return !rational_eq(x,y);
-}
-
-static NPY_INLINE int
-rational_lt(rational x, rational y) {
-    return (npy_int64)x.n*d(y) < (npy_int64)y.n*d(x);
-}
-
-static NPY_INLINE int
-rational_gt(rational x, rational y) {
-    return rational_lt(y,x);
-}
-
-static NPY_INLINE int
-rational_le(rational x, rational y) {
-    return !rational_lt(y,x);
-}
-
-static NPY_INLINE int
-rational_ge(rational x, rational y) {
-    return !rational_lt(x,y);
-}
-
-static NPY_INLINE npy_int32
-rational_int(rational x) {
-    return x.n/d(x);
-}
-
-static NPY_INLINE double
-rational_double(rational x) {
-    return (double)x.n/d(x);
-}
-
-static NPY_INLINE int
-rational_nonzero(rational x) {
-    return x.n!=0;
-}
-
-static int
-scan_rational(const char** s, rational* x) {
-    long n,d;
-    int offset;
-    const char* ss;
-    if (sscanf(*s,"%ld%n",&n,&offset)<=0) {
-        return 0;
-    }
-    ss = *s+offset;
-    if (*ss!='/') {
-        *s = ss;
-        *x = make_rational_int(n);
-        return 1;
-    }
-    ss++;
-    if (sscanf(ss,"%ld%n",&d,&offset)<=0 || d<=0) {
-        return 0;
-    }
-    *s = ss+offset;
-    *x = make_rational_slow(n,d);
-    return 1;
-}
-
-/* Expose rational to Python as a numpy scalar */
-
-typedef struct {
-    PyObject_HEAD
-    rational r;
-} PyRational;
-
-static PyTypeObject PyRational_Type;
-
-static NPY_INLINE int
-PyRational_Check(PyObject* object) {
-    return PyObject_IsInstance(object,(PyObject*)&PyRational_Type);
-}
-
-static PyObject*
-PyRational_FromRational(rational x) {
-    PyRational* p = (PyRational*)PyRational_Type.tp_alloc(&PyRational_Type,0);
-    if (p) {
-        p->r = x;
-    }
-    return (PyObject*)p;
-}
-
-static PyObject*
-pyrational_new(PyTypeObject* type, PyObject* args, PyObject* kwds) {
-    Py_ssize_t size;
-    PyObject* x[2];
-    long n[2]={0,1};
-    int i;
-    rational r;
-    if (kwds && PyDict_Size(kwds)) {
-        PyErr_SetString(PyExc_TypeError,
-                "constructor takes no keyword arguments");
-        return 0;
-    }
-    size = PyTuple_GET_SIZE(args);
-    if (size > 2) {
-        PyErr_SetString(PyExc_TypeError,
-                "expected rational or numerator and optional denominator");
-        return 0;
-    }
-
-    if (size == 1) {
-        x[0] = PyTuple_GET_ITEM(args, 0);
-        if (PyRational_Check(x[0])) {
-            Py_INCREF(x[0]);
-            return x[0];
-        }
-        // TODO: allow construction from unicode strings
-        else if (PyBytes_Check(x[0])) {
-            const char* s = PyBytes_AS_STRING(x[0]);
-            rational x;
-            if (scan_rational(&s,&x)) {
-                const char* p;
-                for (p = s; *p; p++) {
-                    if (!isspace(*p)) {
-                        goto bad;
-                    }
-                }
-                return PyRational_FromRational(x);
-            }
-            bad:
-            PyErr_Format(PyExc_ValueError,
-                    "invalid rational literal '%s'",s);
-            return 0;
-        }
-    }
-
-    for (i=0; i<size; i++) {
-        PyObject* y;
-        int eq;
-        x[i] = PyTuple_GET_ITEM(args, i);
-        n[i] = PyLong_AsLong(x[i]);
-        if (error_converting(n[i])) {
-            if (PyErr_ExceptionMatches(PyExc_TypeError)) {
-                PyErr_Format(PyExc_TypeError,
-                        "expected integer %s, got %s",
-                        (i ? "denominator" : "numerator"),
-                        x[i]->ob_type->tp_name);
-            }
-            return 0;
-        }
-        /* Check that we had an exact integer */
-        y = PyLong_FromLong(n[i]);
-        if (!y) {
-            return 0;
-        }
-        eq = PyObject_RichCompareBool(x[i],y,Py_EQ);
-        Py_DECREF(y);
-        if (eq<0) {
-            return 0;
-        }
-        if (!eq) {
-            PyErr_Format(PyExc_TypeError,
-                    "expected integer %s, got %s",
-                    (i ? "denominator" : "numerator"),
-                    x[i]->ob_type->tp_name);
-            return 0;
-        }
-    }
-    r = make_rational_slow(n[0],n[1]);
-    if (PyErr_Occurred()) {
-        return 0;
-    }
-    return PyRational_FromRational(r);
-}
-
-/*
- * Returns Py_NotImplemented on most conversion failures, or raises an
- * overflow error for too long ints
- */
-#define AS_RATIONAL(dst,object) \
-    { \
-        dst.n = 0; \
-        if (PyRational_Check(object)) { \
-            dst = ((PyRational*)object)->r; \
-        } \
-        else { \
-            PyObject* y_; \
-            int eq_; \
-            long n_ = PyLong_AsLong(object); \
-            if (error_converting(n_)) { \
-                if (PyErr_ExceptionMatches(PyExc_TypeError)) { \
-                    PyErr_Clear(); \
-                    Py_INCREF(Py_NotImplemented); \
-                    return Py_NotImplemented; \
-                } \
-                return 0; \
-            } \
-            y_ = PyLong_FromLong(n_); \
-            if (!y_) { \
-                return 0; \
-            } \
-            eq_ = PyObject_RichCompareBool(object,y_,Py_EQ); \
-            Py_DECREF(y_); \
-            if (eq_<0) { \
-                return 0; \
-            } \
-            if (!eq_) { \
-                Py_INCREF(Py_NotImplemented); \
-                return Py_NotImplemented; \
-            } \
-            dst = make_rational_int(n_); \
-        } \
-    }
-
-static PyObject*
-pyrational_richcompare(PyObject* a, PyObject* b, int op) {
-    rational x, y;
-    int result = 0;
-    AS_RATIONAL(x,a);
-    AS_RATIONAL(y,b);
-    #define OP(py,op) case py: result = rational_##op(x,y); break;
-    switch (op) {
-        OP(Py_LT,lt)
-        OP(Py_LE,le)
-        OP(Py_EQ,eq)
-        OP(Py_NE,ne)
-        OP(Py_GT,gt)
-        OP(Py_GE,ge)
-    };
-    #undef OP
-    return PyBool_FromLong(result);
-}
-
-static PyObject*
-pyrational_repr(PyObject* self) {
-    rational x = ((PyRational*)self)->r;
-    if (d(x)!=1) {
-        return PyUnicode_FromFormat(
-                "rational(%ld,%ld)",(long)x.n,(long)d(x));
-    }
-    else {
-        return PyUnicode_FromFormat(
-                "rational(%ld)",(long)x.n);
-    }
-}
-
-static PyObject*
-pyrational_str(PyObject* self) {
-    rational x = ((PyRational*)self)->r;
-    if (d(x)!=1) {
-        return PyUnicode_FromFormat(
-                "%ld/%ld",(long)x.n,(long)d(x));
-    }
-    else {
-        return PyUnicode_FromFormat(
-                "%ld",(long)x.n);
-    }
-}
-
-static npy_hash_t
-pyrational_hash(PyObject* self) {
-    rational x = ((PyRational*)self)->r;
-    /* Use a fairly weak hash as Python expects */
-    long h = 131071*x.n+524287*x.dmm;
-    /* Never return the special error value -1 */
-    return h==-1?2:h;
-}
-
-#define RATIONAL_BINOP_2(name,exp) \
-    static PyObject* \
-    pyrational_##name(PyObject* a, PyObject* b) { \
-        rational x, y, z; \
-        AS_RATIONAL(x,a); \
-        AS_RATIONAL(y,b); \
-        z = exp; \
-        if (PyErr_Occurred()) { \
-            return 0; \
-        } \
-        return PyRational_FromRational(z); \
-    }
-#define RATIONAL_BINOP(name) RATIONAL_BINOP_2(name,rational_##name(x,y))
-RATIONAL_BINOP(add)
-RATIONAL_BINOP(subtract)
-RATIONAL_BINOP(multiply)
-RATIONAL_BINOP(divide)
-RATIONAL_BINOP(remainder)
-RATIONAL_BINOP_2(floor_divide,
-    make_rational_int(rational_floor(rational_divide(x,y))))
-
-#define RATIONAL_UNOP(name,type,exp,convert) \
-    static PyObject* \
-    pyrational_##name(PyObject* self) { \
-        rational x = ((PyRational*)self)->r; \
-        type y = exp; \
-        if (PyErr_Occurred()) { \
-            return 0; \
-        } \
-        return convert(y); \
-    }
-RATIONAL_UNOP(negative,rational,rational_negative(x),PyRational_FromRational)
-RATIONAL_UNOP(absolute,rational,rational_abs(x),PyRational_FromRational)
-RATIONAL_UNOP(int,long,rational_int(x),PyLong_FromLong)
-RATIONAL_UNOP(float,double,rational_double(x),PyFloat_FromDouble)
-
-static PyObject*
-pyrational_positive(PyObject* self) {
-    Py_INCREF(self);
-    return self;
-}
-
-static int
-pyrational_nonzero(PyObject* self) {
-    rational x = ((PyRational*)self)->r;
-    return rational_nonzero(x);
-}
-
-static PyNumberMethods pyrational_as_number = {
-    pyrational_add,          /* nb_add */
-    pyrational_subtract,     /* nb_subtract */
-    pyrational_multiply,     /* nb_multiply */
-    pyrational_remainder,    /* nb_remainder */
-    0,                       /* nb_divmod */
-    0,                       /* nb_power */
-    pyrational_negative,     /* nb_negative */
-    pyrational_positive,     /* nb_positive */
-    pyrational_absolute,     /* nb_absolute */
-    pyrational_nonzero,      /* nb_nonzero */
-    0,                       /* nb_invert */
-    0,                       /* nb_lshift */
-    0,                       /* nb_rshift */
-    0,                       /* nb_and */
-    0,                       /* nb_xor */
-    0,                       /* nb_or */
-    pyrational_int,          /* nb_int */
-    0,                       /* reserved */
-    pyrational_float,        /* nb_float */
-
-    0,                       /* nb_inplace_add */
-    0,                       /* nb_inplace_subtract */
-    0,                       /* nb_inplace_multiply */
-    0,                       /* nb_inplace_remainder */
-    0,                       /* nb_inplace_power */
-    0,                       /* nb_inplace_lshift */
-    0,                       /* nb_inplace_rshift */
-    0,                       /* nb_inplace_and */
-    0,                       /* nb_inplace_xor */
-    0,                       /* nb_inplace_or */
-
-    pyrational_floor_divide, /* nb_floor_divide */
-    pyrational_divide,       /* nb_true_divide */
-    0,                       /* nb_inplace_floor_divide */
-    0,                       /* nb_inplace_true_divide */
-    0,                       /* nb_index */
-};
-
-static PyObject*
-pyrational_n(PyObject* self, void* closure) {
-    return PyLong_FromLong(((PyRational*)self)->r.n);
-}
-
-static PyObject*
-pyrational_d(PyObject* self, void* closure) {
-    return PyLong_FromLong(d(((PyRational*)self)->r));
-}
-
-static PyGetSetDef pyrational_getset[] = {
-    {(char*)"n",pyrational_n,0,(char*)"numerator",0},
-    {(char*)"d",pyrational_d,0,(char*)"denominator",0},
-    {0} /* sentinel */
-};
-
-static PyTypeObject PyRational_Type = {
-    PyVarObject_HEAD_INIT(NULL, 0)
-    "numpy.core._rational_tests.rational",  /* tp_name */
-    sizeof(PyRational),                       /* tp_basicsize */
-    0,                                        /* tp_itemsize */
-    0,                                        /* tp_dealloc */
-    0,                                        /* tp_print */
-    0,                                        /* tp_getattr */
-    0,                                        /* tp_setattr */
-    0,                                        /* tp_reserved */
-    pyrational_repr,                          /* tp_repr */
-    &pyrational_as_number,                    /* tp_as_number */
-    0,                                        /* tp_as_sequence */
-    0,                                        /* tp_as_mapping */
-    pyrational_hash,                          /* tp_hash */
-    0,                                        /* tp_call */
-    pyrational_str,                           /* tp_str */
-    0,                                        /* tp_getattro */
-    0,                                        /* tp_setattro */
-    0,                                        /* tp_as_buffer */
-    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */
-    "Fixed precision rational numbers",       /* tp_doc */
-    0,                                        /* tp_traverse */
-    0,                                        /* tp_clear */
-    pyrational_richcompare,                   /* tp_richcompare */
-    0,                                        /* tp_weaklistoffset */
-    0,                                        /* tp_iter */
-    0,                                        /* tp_iternext */
-    0,                                        /* tp_methods */
-    0,                                        /* tp_members */
-    pyrational_getset,                        /* tp_getset */
-    0,                                        /* tp_base */
-    0,                                        /* tp_dict */
-    0,                                        /* tp_descr_get */
-    0,                                        /* tp_descr_set */
-    0,                                        /* tp_dictoffset */
-    0,                                        /* tp_init */
-    0,                                        /* tp_alloc */
-    pyrational_new,                           /* tp_new */
-    0,                                        /* tp_free */
-    0,                                        /* tp_is_gc */
-    0,                                        /* tp_bases */
-    0,                                        /* tp_mro */
-    0,                                        /* tp_cache */
-    0,                                        /* tp_subclasses */
-    0,                                        /* tp_weaklist */
-    0,                                        /* tp_del */
-    0,                                        /* tp_version_tag */
-};
-
-/* NumPy support */
-
-static PyObject*
-npyrational_getitem(void* data, void* arr) {
-    rational r;
-    memcpy(&r,data,sizeof(rational));
-    return PyRational_FromRational(r);
-}
-
-static int
-npyrational_setitem(PyObject* item, void* data, void* arr) {
-    rational r;
-    if (PyRational_Check(item)) {
-        r = ((PyRational*)item)->r;
-    }
-    else {
-        long long n = PyLong_AsLongLong(item);
-        PyObject* y;
-        int eq;
-        if (error_converting(n)) {
-            return -1;
-        }
-        y = PyLong_FromLongLong(n);
-        if (!y) {
-            return -1;
-        }
-        eq = PyObject_RichCompareBool(item, y, Py_EQ);
-        Py_DECREF(y);
-        if (eq<0) {
-            return -1;
-        }
-        if (!eq) {
-            PyErr_Format(PyExc_TypeError,
-                    "expected rational, got %s", item->ob_type->tp_name);
-            return -1;
-        }
-        r = make_rational_int(n);
-    }
-    memcpy(data, &r, sizeof(rational));
-    return 0;
-}
-
-static NPY_INLINE void
-byteswap(npy_int32* x) {
-    char* p = (char*)x;
-    size_t i;
-    for (i = 0; i < sizeof(*x)/2; i++) {
-        size_t j = sizeof(*x)-1-i;
-        char t = p[i];
-        p[i] = p[j];
-        p[j] = t;
-    }
-}
-
-static void
-npyrational_copyswapn(void* dst_, npy_intp dstride, void* src_,
-        npy_intp sstride, npy_intp n, int swap, void* arr) {
-    char *dst = (char*)dst_, *src = (char*)src_;
-    npy_intp i;
-    if (!src) {
-        return;
-    }
-    if (swap) {
-        for (i = 0; i < n; i++) {
-            rational* r = (rational*)(dst+dstride*i);
-            memcpy(r,src+sstride*i,sizeof(rational));
-            byteswap(&r->n);
-            byteswap(&r->dmm);
-        }
-    }
-    else if (dstride == sizeof(rational) && sstride == sizeof(rational)) {
-        memcpy(dst, src, n*sizeof(rational));
-    }
-    else {
-        for (i = 0; i < n; i++) {
-            memcpy(dst + dstride*i, src + sstride*i, sizeof(rational));
-        }
-    }
-}
-
-static void
-npyrational_copyswap(void* dst, void* src, int swap, void* arr) {
-    rational* r;
-    if (!src) {
-        return;
-    }
-    r = (rational*)dst;
-    memcpy(r,src,sizeof(rational));
-    if (swap) {
-        byteswap(&r->n);
-        byteswap(&r->dmm);
-    }
-}
-
-static int
-npyrational_compare(const void* d0, const void* d1, void* arr) {
-    rational x = *(rational*)d0,
-             y = *(rational*)d1;
-    return rational_lt(x,y)?-1:rational_eq(x,y)?0:1;
-}
-
-#define FIND_EXTREME(name,op) \
-    static int \
-    npyrational_##name(void* data_, npy_intp n, \
-            npy_intp* max_ind, void* arr) { \
-        const rational* data; \
-        npy_intp best_i; \
-        rational best_r; \
-        npy_intp i; \
-        if (!n) { \
-            return 0; \
-        } \
-        data = (rational*)data_; \
-        best_i = 0; \
-        best_r = data[0]; \
-        for (i = 1; i < n; i++) { \
-            if (rational_##op(data[i],best_r)) { \
-                best_i = i; \
-                best_r = data[i]; \
-            } \
-        } \
-        *max_ind = best_i; \
-        return 0; \
-    }
-FIND_EXTREME(argmin,lt)
-FIND_EXTREME(argmax,gt)
-
-static void
-npyrational_dot(void* ip0_, npy_intp is0, void* ip1_, npy_intp is1,
-        void* op, npy_intp n, void* arr) {
-    rational r = {0};
-    const char *ip0 = (char*)ip0_, *ip1 = (char*)ip1_;
-    npy_intp i;
-    for (i = 0; i < n; i++) {
-        r = rational_add(r,rational_multiply(*(rational*)ip0,*(rational*)ip1));
-        ip0 += is0;
-        ip1 += is1;
-    }
-    *(rational*)op = r;
-}
-
-static npy_bool
-npyrational_nonzero(void* data, void* arr) {
-    rational r;
-    memcpy(&r,data,sizeof(r));
-    return rational_nonzero(r)?NPY_TRUE:NPY_FALSE;
-}
-
-static int
-npyrational_fill(void* data_, npy_intp length, void* arr) {
-    rational* data = (rational*)data_;
-    rational delta = rational_subtract(data[1],data[0]);
-    rational r = data[1];
-    npy_intp i;
-    for (i = 2; i < length; i++) {
-        r = rational_add(r,delta);
-        data[i] = r;
-    }
-    return 0;
-}
-
-static int
-npyrational_fillwithscalar(void* buffer_, npy_intp length,
-        void* value, void* arr) {
-    rational r = *(rational*)value;
-    rational* buffer = (rational*)buffer_;
-    npy_intp i;
-    for (i = 0; i < length; i++) {
-        buffer[i] = r;
-    }
-    return 0;
-}
-
-static PyArray_ArrFuncs npyrational_arrfuncs;
-
-typedef struct { char c; rational r; } align_test;
-
-PyArray_Descr npyrational_descr = {
-    PyObject_HEAD_INIT(0)
-    &PyRational_Type,       /* typeobj */
-    'V',                    /* kind */
-    'r',                    /* type */
-    '=',                    /* byteorder */
-    /*
-     * For now, we need NPY_NEEDS_PYAPI in order to make numpy detect our
-     * exceptions.  This isn't technically necessary,
-     * since we're careful about thread safety, and hopefully future
-     * versions of numpy will recognize that.
-     */
-    NPY_NEEDS_PYAPI | NPY_USE_GETITEM | NPY_USE_SETITEM, /* hasobject */
-    0,                      /* type_num */
-    sizeof(rational),       /* elsize */
-    offsetof(align_test,r), /* alignment */
-    0,                      /* subarray */
-    0,                      /* fields */
-    0,                      /* names */
-    &npyrational_arrfuncs,  /* f */
-};
-
-#define DEFINE_CAST(From,To,statement) \
-    static void \
-    npycast_##From##_##To(void* from_, void* to_, npy_intp n, \
-                          void* fromarr, void* toarr) { \
-        const From* from = (From*)from_; \
-        To* to = (To*)to_; \
-        npy_intp i; \
-        for (i = 0; i < n; i++) { \
-            From x = from[i]; \
-            statement \
-            to[i] = y; \
-        } \
-    }
-#define DEFINE_INT_CAST(bits) \
-    DEFINE_CAST(npy_int##bits,rational,rational y = make_rational_int(x);) \
-    DEFINE_CAST(rational,npy_int##bits,npy_int32 z = rational_int(x); \
-                npy_int##bits y = z; if (y != z) set_overflow();)
-DEFINE_INT_CAST(8)
-DEFINE_INT_CAST(16)
-DEFINE_INT_CAST(32)
-DEFINE_INT_CAST(64)
-DEFINE_CAST(rational,float,double y = rational_double(x);)
-DEFINE_CAST(rational,double,double y = rational_double(x);)
-DEFINE_CAST(npy_bool,rational,rational y = make_rational_int(x);)
-DEFINE_CAST(rational,npy_bool,npy_bool y = rational_nonzero(x);)
-
-#define BINARY_UFUNC(name,intype0,intype1,outtype,exp) \
-    void name(char** args, npy_intp const *dimensions, \
-              npy_intp const *steps, void* data) { \
-        npy_intp is0 = steps[0], is1 = steps[1], \
-            os = steps[2], n = *dimensions; \
-        char *i0 = args[0], *i1 = args[1], *o = args[2]; \
-        int k; \
-        for (k = 0; k < n; k++) { \
-            intype0 x = *(intype0*)i0; \
-            intype1 y = *(intype1*)i1; \
-            *(outtype*)o = exp; \
-            i0 += is0; i1 += is1; o += os; \
-        } \
-    }
-#define RATIONAL_BINARY_UFUNC(name,type,exp) \
-    BINARY_UFUNC(rational_ufunc_##name,rational,rational,type,exp)
-RATIONAL_BINARY_UFUNC(add,rational,rational_add(x,y))
-RATIONAL_BINARY_UFUNC(subtract,rational,rational_subtract(x,y))
-RATIONAL_BINARY_UFUNC(multiply,rational,rational_multiply(x,y))
-RATIONAL_BINARY_UFUNC(divide,rational,rational_divide(x,y))
-RATIONAL_BINARY_UFUNC(remainder,rational,rational_remainder(x,y))
-RATIONAL_BINARY_UFUNC(floor_divide,rational,
-    make_rational_int(rational_floor(rational_divide(x,y))))
-PyUFuncGenericFunction rational_ufunc_true_divide = rational_ufunc_divide;
-RATIONAL_BINARY_UFUNC(minimum,rational,rational_lt(x,y)?x:y)
-RATIONAL_BINARY_UFUNC(maximum,rational,rational_lt(x,y)?y:x)
-RATIONAL_BINARY_UFUNC(equal,npy_bool,rational_eq(x,y))
-RATIONAL_BINARY_UFUNC(not_equal,npy_bool,rational_ne(x,y))
-RATIONAL_BINARY_UFUNC(less,npy_bool,rational_lt(x,y))
-RATIONAL_BINARY_UFUNC(greater,npy_bool,rational_gt(x,y))
-RATIONAL_BINARY_UFUNC(less_equal,npy_bool,rational_le(x,y))
-RATIONAL_BINARY_UFUNC(greater_equal,npy_bool,rational_ge(x,y))
-
-BINARY_UFUNC(gcd_ufunc,npy_int64,npy_int64,npy_int64,gcd(x,y))
-BINARY_UFUNC(lcm_ufunc,npy_int64,npy_int64,npy_int64,lcm(x,y))
-
-#define UNARY_UFUNC(name,type,exp) \
-    void rational_ufunc_##name(char** args, npy_intp const *dimensions, \
-                               npy_intp const *steps, void* data) { \
-        npy_intp is = steps[0], os = steps[1], n = *dimensions; \
-        char *i = args[0], *o = args[1]; \
-        int k; \
-        for (k = 0; k < n; k++) { \
-            rational x = *(rational*)i; \
-            *(type*)o = exp; \
-            i += is; o += os; \
-        } \
-    }
-UNARY_UFUNC(negative,rational,rational_negative(x))
-UNARY_UFUNC(absolute,rational,rational_abs(x))
-UNARY_UFUNC(floor,rational,make_rational_int(rational_floor(x)))
-UNARY_UFUNC(ceil,rational,make_rational_int(rational_ceil(x)))
-UNARY_UFUNC(trunc,rational,make_rational_int(x.n/d(x)))
-UNARY_UFUNC(square,rational,rational_multiply(x,x))
-UNARY_UFUNC(rint,rational,make_rational_int(rational_rint(x)))
-UNARY_UFUNC(sign,rational,make_rational_int(rational_sign(x)))
-UNARY_UFUNC(reciprocal,rational,rational_inverse(x))
-UNARY_UFUNC(numerator,npy_int64,x.n)
-UNARY_UFUNC(denominator,npy_int64,d(x))
-
-static NPY_INLINE void
-rational_matrix_multiply(char **args, npy_intp const *dimensions, npy_intp const *steps)
-{
-    /* pointers to data for input and output arrays */
-    char *ip1 = args[0];
-    char *ip2 = args[1];
-    char *op = args[2];
-
-    /* lengths of core dimensions */
-    npy_intp dm = dimensions[0];
-    npy_intp dn = dimensions[1];
-    npy_intp dp = dimensions[2];
-
-    /* striding over core dimensions */
-    npy_intp is1_m = steps[0];
-    npy_intp is1_n = steps[1];
-    npy_intp is2_n = steps[2];
-    npy_intp is2_p = steps[3];
-    npy_intp os_m = steps[4];
-    npy_intp os_p = steps[5];
-
-    /* core dimensions counters */
-    npy_intp m, p;
-
-    /* calculate dot product for each row/column vector pair */
-    for (m = 0; m < dm; m++) {
-        for (p = 0; p < dp; p++) {
-            npyrational_dot(ip1, is1_n, ip2, is2_n, op, dn, NULL);
-
-            /* advance to next column of 2nd input array and output array */
-            ip2 += is2_p;
-            op  +=  os_p;
-        }
-
-        /* reset to first column of 2nd input array and output array */
-        ip2 -= is2_p * p;
-        op -= os_p * p;
-
-        /* advance to next row of 1st input array and output array */
-        ip1 += is1_m;
-        op += os_m;
-    }
-}
-
-
-static void
-rational_gufunc_matrix_multiply(char **args, npy_intp const *dimensions,
-                                npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    /* outer dimensions counter */
-    npy_intp N_;
-
-    /* length of flattened outer dimensions */
-    npy_intp dN = dimensions[0];
-
-    /* striding over flattened outer dimensions for input and output arrays */
-    npy_intp s0 = steps[0];
-    npy_intp s1 = steps[1];
-    npy_intp s2 = steps[2];
-
-    /*
-     * loop through outer dimensions, performing matrix multiply on
-     * core dimensions for each loop
-     */
-    for (N_ = 0; N_ < dN; N_++, args[0] += s0, args[1] += s1, args[2] += s2) {
-        rational_matrix_multiply(args, dimensions+1, steps+3);
-    }
-}
-
-
-static void
-rational_ufunc_test_add(char** args, npy_intp const *dimensions,
-                        npy_intp const *steps, void* data) {
-    npy_intp is0 = steps[0], is1 = steps[1], os = steps[2], n = *dimensions;
-    char *i0 = args[0], *i1 = args[1], *o = args[2];
-    int k;
-    for (k = 0; k < n; k++) {
-        npy_int64 x = *(npy_int64*)i0;
-        npy_int64 y = *(npy_int64*)i1;
-        *(rational*)o = rational_add(make_rational_fast(x, 1),
-                                     make_rational_fast(y, 1));
-        i0 += is0; i1 += is1; o += os;
-    }
-}
-
-
-static void
-rational_ufunc_test_add_rationals(char** args, npy_intp const *dimensions,
-                        npy_intp const *steps, void* data) {
-    npy_intp is0 = steps[0], is1 = steps[1], os = steps[2], n = *dimensions;
-    char *i0 = args[0], *i1 = args[1], *o = args[2];
-    int k;
-    for (k = 0; k < n; k++) {
-        rational x = *(rational*)i0;
-        rational y = *(rational*)i1;
-        *(rational*)o = rational_add(x, y);
-        i0 += is0; i1 += is1; o += os;
-    }
-}
-
-
-PyMethodDef module_methods[] = {
-    {0} /* sentinel */
-};
-
-static struct PyModuleDef moduledef = {
-    PyModuleDef_HEAD_INIT,
-    "_rational_tests",
-    NULL,
-    -1,
-    module_methods,
-    NULL,
-    NULL,
-    NULL,
-    NULL
-};
-
-PyMODINIT_FUNC PyInit__rational_tests(void) {
-    PyObject *m = NULL;
-    PyObject* numpy_str;
-    PyObject* numpy;
-    int npy_rational;
-
-    import_array();
-    if (PyErr_Occurred()) {
-        goto fail;
-    }
-    import_umath();
-    if (PyErr_Occurred()) {
-        goto fail;
-    }
-    numpy_str = PyUnicode_FromString("numpy");
-    if (!numpy_str) {
-        goto fail;
-    }
-    numpy = PyImport_Import(numpy_str);
-    Py_DECREF(numpy_str);
-    if (!numpy) {
-        goto fail;
-    }
-
-    /* Can't set this until we import numpy */
-    PyRational_Type.tp_base = &PyGenericArrType_Type;
-
-    /* Initialize rational type object */
-    if (PyType_Ready(&PyRational_Type) < 0) {
-        goto fail;
-    }
-
-    /* Initialize rational descriptor */
-    PyArray_InitArrFuncs(&npyrational_arrfuncs);
-    npyrational_arrfuncs.getitem = npyrational_getitem;
-    npyrational_arrfuncs.setitem = npyrational_setitem;
-    npyrational_arrfuncs.copyswapn = npyrational_copyswapn;
-    npyrational_arrfuncs.copyswap = npyrational_copyswap;
-    npyrational_arrfuncs.compare = npyrational_compare;
-    npyrational_arrfuncs.argmin = npyrational_argmin;
-    npyrational_arrfuncs.argmax = npyrational_argmax;
-    npyrational_arrfuncs.dotfunc = npyrational_dot;
-    npyrational_arrfuncs.nonzero = npyrational_nonzero;
-    npyrational_arrfuncs.fill = npyrational_fill;
-    npyrational_arrfuncs.fillwithscalar = npyrational_fillwithscalar;
-    /* Left undefined: scanfunc, fromstr, sort, argsort */
-    Py_SET_TYPE(&npyrational_descr, &PyArrayDescr_Type);
-    npy_rational = PyArray_RegisterDataType(&npyrational_descr);
-    if (npy_rational<0) {
-        goto fail;
-    }
-
-    /* Support dtype(rational) syntax */
-    if (PyDict_SetItemString(PyRational_Type.tp_dict, "dtype",
-                             (PyObject*)&npyrational_descr) < 0) {
-        goto fail;
-    }
-
-    /* Register casts to and from rational */
-    #define REGISTER_CAST(From,To,from_descr,to_typenum,safe) { \
-            PyArray_Descr* from_descr_##From##_##To = (from_descr); \
-            if (PyArray_RegisterCastFunc(from_descr_##From##_##To, \
-                                         (to_typenum), \
-                                         npycast_##From##_##To) < 0) { \
-                goto fail; \
-            } \
-            if (safe && PyArray_RegisterCanCast(from_descr_##From##_##To, \
-                                                (to_typenum), \
-                                                NPY_NOSCALAR) < 0) { \
-                goto fail; \
-            } \
-        }
-    #define REGISTER_INT_CASTS(bits) \
-        REGISTER_CAST(npy_int##bits, rational, \
-                      PyArray_DescrFromType(NPY_INT##bits), npy_rational, 1) \
-        REGISTER_CAST(rational, npy_int##bits, &npyrational_descr, \
-                      NPY_INT##bits, 0)
-    REGISTER_INT_CASTS(8)
-    REGISTER_INT_CASTS(16)
-    REGISTER_INT_CASTS(32)
-    REGISTER_INT_CASTS(64)
-    REGISTER_CAST(rational,float,&npyrational_descr,NPY_FLOAT,0)
-    REGISTER_CAST(rational,double,&npyrational_descr,NPY_DOUBLE,1)
-    REGISTER_CAST(npy_bool,rational, PyArray_DescrFromType(NPY_BOOL),
-                  npy_rational,1)
-    REGISTER_CAST(rational,npy_bool,&npyrational_descr,NPY_BOOL,0)
-
-    /* Register ufuncs */
-    #define REGISTER_UFUNC(name,...) { \
-        PyUFuncObject* ufunc = \
-            (PyUFuncObject*)PyObject_GetAttrString(numpy, #name); \
-        int _types[] = __VA_ARGS__; \
-        if (!ufunc) { \
-            goto fail; \
-        } \
-        if (sizeof(_types)/sizeof(int)!=ufunc->nargs) { \
-            PyErr_Format(PyExc_AssertionError, \
-                         "ufunc %s takes %d arguments, our loop takes %lu", \
-                         #name, ufunc->nargs, (unsigned long) \
-                         (sizeof(_types)/sizeof(int))); \
-            Py_DECREF(ufunc); \
-            goto fail; \
-        } \
-        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, npy_rational, \
-                rational_ufunc_##name, _types, 0) < 0) { \
-            Py_DECREF(ufunc); \
-            goto fail; \
-        } \
-        Py_DECREF(ufunc); \
-    }
-    #define REGISTER_UFUNC_BINARY_RATIONAL(name) \
-        REGISTER_UFUNC(name, {npy_rational, npy_rational, npy_rational})
-    #define REGISTER_UFUNC_BINARY_COMPARE(name) \
-        REGISTER_UFUNC(name, {npy_rational, npy_rational, NPY_BOOL})
-    #define REGISTER_UFUNC_UNARY(name) \
-        REGISTER_UFUNC(name, {npy_rational, npy_rational})
-    /* Binary */
-    REGISTER_UFUNC_BINARY_RATIONAL(add)
-    REGISTER_UFUNC_BINARY_RATIONAL(subtract)
-    REGISTER_UFUNC_BINARY_RATIONAL(multiply)
-    REGISTER_UFUNC_BINARY_RATIONAL(divide)
-    REGISTER_UFUNC_BINARY_RATIONAL(remainder)
-    REGISTER_UFUNC_BINARY_RATIONAL(true_divide)
-    REGISTER_UFUNC_BINARY_RATIONAL(floor_divide)
-    REGISTER_UFUNC_BINARY_RATIONAL(minimum)
-    REGISTER_UFUNC_BINARY_RATIONAL(maximum)
-    /* Comparisons */
-    REGISTER_UFUNC_BINARY_COMPARE(equal)
-    REGISTER_UFUNC_BINARY_COMPARE(not_equal)
-    REGISTER_UFUNC_BINARY_COMPARE(less)
-    REGISTER_UFUNC_BINARY_COMPARE(greater)
-    REGISTER_UFUNC_BINARY_COMPARE(less_equal)
-    REGISTER_UFUNC_BINARY_COMPARE(greater_equal)
-    /* Unary */
-    REGISTER_UFUNC_UNARY(negative)
-    REGISTER_UFUNC_UNARY(absolute)
-    REGISTER_UFUNC_UNARY(floor)
-    REGISTER_UFUNC_UNARY(ceil)
-    REGISTER_UFUNC_UNARY(trunc)
-    REGISTER_UFUNC_UNARY(rint)
-    REGISTER_UFUNC_UNARY(square)
-    REGISTER_UFUNC_UNARY(reciprocal)
-    REGISTER_UFUNC_UNARY(sign)
-
-    /* Create module */
-    m = PyModule_Create(&moduledef);
-
-    if (!m) {
-        goto fail;
-    }
-
-    /* Add rational type */
-    Py_INCREF(&PyRational_Type);
-    PyModule_AddObject(m,"rational",(PyObject*)&PyRational_Type);
-
-    /* Create matrix multiply generalized ufunc */
-    {
-        int types2[3] = {npy_rational,npy_rational,npy_rational};
-        PyObject* gufunc = PyUFunc_FromFuncAndDataAndSignature(0,0,0,0,2,1,
-            PyUFunc_None,(char*)"matrix_multiply",
-            (char*)"return result of multiplying two matrices of rationals",
-            0,"(m,n),(n,p)->(m,p)");
-        if (!gufunc) {
-            goto fail;
-        }
-        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)gufunc, npy_rational,
-                rational_gufunc_matrix_multiply, types2, 0) < 0) {
-            goto fail;
-        }
-        PyModule_AddObject(m,"matrix_multiply",(PyObject*)gufunc);
-    }
-
-    /* Create test ufunc with built in input types and rational output type */
-    {
-        int types3[3] = {NPY_INT64,NPY_INT64,npy_rational};
-
-        PyObject* ufunc = PyUFunc_FromFuncAndData(0,0,0,0,2,1,
-                PyUFunc_None,(char*)"test_add",
-                (char*)"add two matrices of int64 and return rational matrix",0);
-        if (!ufunc) {
-            goto fail;
-        }
-        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, npy_rational,
-                rational_ufunc_test_add, types3, 0) < 0) {
-            goto fail;
-        }
-        PyModule_AddObject(m,"test_add",(PyObject*)ufunc);
-    }
-
-    /* Create test ufunc with rational types using RegisterLoopForDescr */
-    {
-        PyObject* ufunc = PyUFunc_FromFuncAndData(0,0,0,0,2,1,
-                PyUFunc_None,(char*)"test_add_rationals",
-                (char*)"add two matrices of rationals and return rational matrix",0);
-        PyArray_Descr* types[3] = {&npyrational_descr,
-                                    &npyrational_descr,
-                                    &npyrational_descr};
-
-        if (!ufunc) {
-            goto fail;
-        }
-        if (PyUFunc_RegisterLoopForDescr((PyUFuncObject*)ufunc, &npyrational_descr,
-                rational_ufunc_test_add_rationals, types, 0) < 0) {
-            goto fail;
-        }
-        PyModule_AddObject(m,"test_add_rationals",(PyObject*)ufunc);
-    }
-
-    /* Create numerator and denominator ufuncs */
-    #define NEW_UNARY_UFUNC(name,type,doc) { \
-        int types[2] = {npy_rational,type}; \
-        PyObject* ufunc = PyUFunc_FromFuncAndData(0,0,0,0,1,1, \
-            PyUFunc_None,(char*)#name,(char*)doc,0); \
-        if (!ufunc) { \
-            goto fail; \
-        } \
-        if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc, \
-                npy_rational,rational_ufunc_##name,types,0)<0) { \
-            goto fail; \
-        } \
-        PyModule_AddObject(m,#name,(PyObject*)ufunc); \
-    }
-    NEW_UNARY_UFUNC(numerator,NPY_INT64,"rational number numerator");
-    NEW_UNARY_UFUNC(denominator,NPY_INT64,"rational number denominator");
-
-    /* Create gcd and lcm ufuncs */
-    #define GCD_LCM_UFUNC(name,type,doc) { \
-        static const PyUFuncGenericFunction func[1] = {name##_ufunc}; \
-        static const char types[3] = {type,type,type}; \
-        static void* data[1] = {0}; \
-        PyObject* ufunc = PyUFunc_FromFuncAndData( \
-            (PyUFuncGenericFunction*)func, data,(char*)types, \
-            1,2,1,PyUFunc_One,(char*)#name,(char*)doc,0); \
-        if (!ufunc) { \
-            goto fail; \
-        } \
-        PyModule_AddObject(m,#name,(PyObject*)ufunc); \
-    }
-    GCD_LCM_UFUNC(gcd,NPY_INT64,"greatest common denominator of two integers");
-    GCD_LCM_UFUNC(lcm,NPY_INT64,"least common multiple of two integers");
-
-    return m;
-
-fail:
-    if (!PyErr_Occurred()) {
-        PyErr_SetString(PyExc_RuntimeError,
-                        "cannot load _rational_tests module.");
-    }
-    if (m) {
-        Py_DECREF(m);
-        m = NULL;
-    }
-    return m;
-}
diff --git a/numpy/core/src/umath/_scaled_float_dtype.c b/numpy/core/src/umath/_scaled_float_dtype.c
index b6c19362a5b4e610867914178458fd6726c1134c..a214b32aad76256be27505a60c2585437a31cd9d 100644
--- a/numpy/core/src/umath/_scaled_float_dtype.c
+++ b/numpy/core/src/umath/_scaled_float_dtype.c
@@ -325,7 +325,8 @@ sfloat_to_sfloat_resolve_descriptors(
              PyArrayMethodObject *NPY_UNUSED(self),
              PyArray_DTypeMeta *NPY_UNUSED(dtypes[2]),
              PyArray_Descr *given_descrs[2],
-            PyArray_Descr *loop_descrs[2])
+            PyArray_Descr *loop_descrs[2],
+            npy_intp *view_offset)
  {
      loop_descrs[0] = given_descrs[0];
      Py_INCREF(loop_descrs[0]);
@@ -341,7 +342,8 @@ sfloat_to_sfloat_resolve_descriptors(
      if (((PyArray_SFloatDescr *)loop_descrs[0])->scaling
              == ((PyArray_SFloatDescr *)loop_descrs[1])->scaling) {
          /* same scaling is just a view */
-        return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+        *view_offset = 0;
+        return NPY_NO_CASTING;
      }
      else if (-((PyArray_SFloatDescr *)loop_descrs[0])->scaling
               == ((PyArray_SFloatDescr *)loop_descrs[1])->scaling) {
@@ -384,7 +386,8 @@ float_to_from_sfloat_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *dtypes[2],
          PyArray_Descr *NPY_UNUSED(given_descrs[2]),
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *view_offset)
  {
      loop_descrs[0] = NPY_DT_CALL_default_descr(dtypes[0]);
      if (loop_descrs[0] == NULL) {
@@ -394,7 +397,8 @@ float_to_from_sfloat_resolve_descriptors(
      if (loop_descrs[1] == NULL) {
          return -1;
      }
-    return NPY_NO_CASTING | _NPY_CAST_IS_VIEW;
+    *view_offset = 0;
+    return NPY_NO_CASTING;
  }
  
  
@@ -422,7 +426,8 @@ sfloat_to_bool_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[2]),
          PyArray_Descr *given_descrs[2],
-        PyArray_Descr *loop_descrs[2])
+        PyArray_Descr *loop_descrs[2],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      Py_INCREF(given_descrs[0]);
      loop_descrs[0] = given_descrs[0];
@@ -541,7 +546,8 @@ multiply_sfloats_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[3]),
          PyArray_Descr *given_descrs[3],
-        PyArray_Descr *loop_descrs[3])
+        PyArray_Descr *loop_descrs[3],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      /*
       * Multiply the scaling for the result.  If the result was passed in we
@@ -602,7 +608,8 @@ add_sfloats_resolve_descriptors(
          PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[3]),
          PyArray_Descr *given_descrs[3],
-        PyArray_Descr *loop_descrs[3])
+        PyArray_Descr *loop_descrs[3],
+        npy_intp *NPY_UNUSED(view_offset))
  {
      /*
       * Here we accept an output descriptor (the inner loop can deal with it),
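Reviewer note on the _scaled_float_dtype.c hunks above: the resolvers stop OR-ing `_NPY_CAST_IS_VIEW` into the returned casting level and instead report "this cast is a view" through the new `view_offset` out-parameter. The following is a minimal, standalone sketch of that convention, not NumPy code. The names `intp_t`, `casting_t`, `sfloat_descr`, `NO_VIEW`, `resolve` and `is_view` are hypothetical stand-ins, and the "not a view" sentinel is an assumption (NumPy is believed to pre-set the offset to its minimum `npy_intp` value before calling a resolver).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for npy_intp, NPY_CASTING and PyArray_SFloatDescr. */
typedef ptrdiff_t intp_t;
typedef enum { CAST_ERROR = -1, CAST_NO, CAST_STRICTER } casting_t;
typedef struct { double scaling; } sfloat_descr;

/* Assumed sentinel meaning "the result cannot be expressed as a view". */
#define NO_VIEW PTRDIFF_MIN

/*
 * New-style resolver: the casting level is the return value, and the view
 * information travels separately through *view_offset.  The resolver only
 * writes to *view_offset when the cast really is a view.
 */
static casting_t
resolve(const sfloat_descr *from, const sfloat_descr *to, intp_t *view_offset)
{
    if (from->scaling == to->scaling) {
        *view_offset = 0;      /* same bytes, starting at offset 0 */
        return CAST_NO;
    }
    /* Different scaling: data must be rewritten, so leave the sentinel. */
    return CAST_STRICTER;
}

/* Caller side: reset the sentinel, call the resolver, then inspect both. */
static int
is_view(const sfloat_descr *from, const sfloat_descr *to)
{
    intp_t view_offset = NO_VIEW;
    casting_t casting = resolve(from, to, &view_offset);
    return casting == CAST_NO && view_offset == 0;
}
```

Under these assumptions, `is_view` returns 1 for two descriptors with equal scaling and 0 otherwise. Resolvers that never produce a view can ignore the parameter entirely, which matches the `NPY_UNUSED(view_offset)` hunks above.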
diff --git a/numpy/core/src/umath/_struct_ufunc_tests.c b/numpy/core/src/umath/_struct_ufunc_tests.c
new file mode 100644
index 0000000..ee71c46
--- /dev/null
+++ b/numpy/core/src/umath/_struct_ufunc_tests.c
@@ -0,0 +1,160 @@
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#include "numpy/ndarraytypes.h"
+#include "numpy/ufuncobject.h"
+#include "numpy/npy_3kcompat.h"
+
+#include <math.h>
+
+
+/*
+ * struct_ufunc_test.c
+ * This is the C code for creating your own
+ * NumPy ufunc for a structured array dtype.
+ *
+ * Details explaining the Python-C API can be found under
+ * 'Extending and Embedding' and 'Python/C API' at
+ * docs.python.org .
+ */
+
+static void add_uint64_triplet(char **args,
+                               npy_intp const *dimensions,
+                               npy_intp const* steps,
+                               void* data)
+{
+    npy_intp i;
+    npy_intp is1=steps[0];
+    npy_intp is2=steps[1];
+    npy_intp os=steps[2];
+    npy_intp n=dimensions[0];
+    npy_uint64 *x, *y, *z;
+
+    char *i1=args[0];
+    char *i2=args[1];
+    char *op=args[2];
+
+    for (i = 0; i < n; i++) {
+
+        x = (npy_uint64*)i1;
+        y = (npy_uint64*)i2;
+        z = (npy_uint64*)op;
+
+        z[0] = x[0] + y[0];
+        z[1] = x[1] + y[1];
+        z[2] = x[2] + y[2];
+
+        i1 += is1;
+        i2 += is2;
+        op += os;
+    }
+}
+
+static PyObject*
+register_fail(PyObject* NPY_UNUSED(self), PyObject* NPY_UNUSED(args))
+{
+    PyObject *add_triplet;
+    PyObject *dtype_dict;
+    PyArray_Descr *dtype;
+    PyArray_Descr *dtypes[3];
+    int retval;
+
+    add_triplet = PyUFunc_FromFuncAndData(NULL, NULL, NULL, 0, 2, 1,
+                                    PyUFunc_None, "add_triplet",
+                                    "add_triplet_docstring", 0);
+
+    dtype_dict = Py_BuildValue("[(s, s), (s, s), (s, s)]",
+                               "f0", "u8", "f1", "u8", "f2", "u8");
+    PyArray_DescrConverter(dtype_dict, &dtype);
+    Py_DECREF(dtype_dict);
+
+    dtypes[0] = dtype;
+    dtypes[1] = dtype;
+    dtypes[2] = dtype;
+
+    retval = PyUFunc_RegisterLoopForDescr((PyUFuncObject *)add_triplet,
+                                dtype,
+                                &add_uint64_triplet,
+                                dtypes,
+                                NULL);
+
+    if (retval < 0) {
+        Py_DECREF(add_triplet);
+        Py_DECREF(dtype);
+        return NULL;
+    }
+    retval = PyUFunc_RegisterLoopForDescr((PyUFuncObject *)add_triplet,
+                                dtype,
+                                &add_uint64_triplet,
+                                dtypes,
+                                NULL);
+    Py_DECREF(add_triplet);
+    Py_DECREF(dtype);
+    if (retval < 0) {
+        return NULL;
+    }
+    Py_RETURN_NONE;
+}
+
+static PyMethodDef StructUfuncTestMethods[] = {
+    {"register_fail",
+        register_fail,
+        METH_NOARGS, NULL},
+    {NULL, NULL, 0, NULL}
+};
+
+static struct PyModuleDef moduledef = {
+    PyModuleDef_HEAD_INIT,
+    "_struct_ufunc_tests",
+    NULL,
+    -1,
+    StructUfuncTestMethods,
+    NULL,
+    NULL,
+    NULL,
+    NULL
+};
+
+PyMODINIT_FUNC PyInit__struct_ufunc_tests(void)
+{
+    PyObject *m, *add_triplet, *d;
+    PyObject *dtype_dict;
+    PyArray_Descr *dtype;
+    PyArray_Descr *dtypes[3];
+
+    m = PyModule_Create(&moduledef);
+
+    if (m == NULL) {
+        return NULL;
+    }
+
+    import_array();
+    import_umath();
+
+    add_triplet = PyUFunc_FromFuncAndData(NULL, NULL, NULL, 0, 2, 1,
+                                    PyUFunc_None, "add_triplet",
+                                    "add_triplet_docstring", 0);
+
+    dtype_dict = Py_BuildValue("[(s, s), (s, s), (s, s)]",
+                               "f0", "u8", "f1", "u8", "f2", "u8");
+    PyArray_DescrConverter(dtype_dict, &dtype);
+    Py_DECREF(dtype_dict);
+
+    dtypes[0] = dtype;
+    dtypes[1] = dtype;
+    dtypes[2] = dtype;
+
+    PyUFunc_RegisterLoopForDescr((PyUFuncObject *)add_triplet,
+                                dtype,
+                                &add_uint64_triplet,
+                                dtypes,
+                                NULL);
+
+    Py_DECREF(dtype);
+    d = PyModule_GetDict(m);
+
+    PyDict_SetItemString(d, "add_triplet", add_triplet);
+    Py_DECREF(add_triplet);
+    return m;
+}
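Reviewer note on `add_uint64_triplet` in the file above: the inner loop receives raw byte pointers plus per-operand byte strides, and treats each element as three consecutive 64-bit unsigned integers. The sketch below reproduces that strided-loop pattern without any NumPy or Python headers so it can be compiled on its own. The `triplet` struct and the function name `add_triplets` are hypothetical, and `ptrdiff_t` stands in for `npy_intp`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element layout matching the dtype [f0: u8, f1: u8, f2: u8]. */
typedef struct { uint64_t f0, f1, f2; } triplet;

/*
 * Strided element-wise addition of triplets.
 *   args[0], args[1]: input buffers, args[2]: output buffer
 *   dimensions[0]:    number of elements to process
 *   steps[k]:         distance in BYTES between consecutive elements of
 *                     operand k (not necessarily sizeof(triplet))
 */
static void
add_triplets(char **args, const ptrdiff_t *dimensions, const ptrdiff_t *steps)
{
    char *in1 = args[0], *in2 = args[1], *out = args[2];
    ptrdiff_t n = dimensions[0];

    for (ptrdiff_t i = 0; i < n; i++) {
        const uint64_t *x = (const uint64_t *)in1;
        const uint64_t *y = (const uint64_t *)in2;
        uint64_t *z = (uint64_t *)out;

        z[0] = x[0] + y[0];
        z[1] = x[1] + y[1];
        z[2] = x[2] + y[2];

        /* Advance by the caller-supplied byte strides. */
        in1 += steps[0];
        in2 += steps[1];
        out += steps[2];
    }
}
```

Because the strides are expressed in bytes, the same loop handles non-contiguous input: passing `2 * sizeof(triplet)` as `steps[0]` makes the first operand skip every other element, which is how a sliced array would be presented to a loop of this kind.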
diff --git a/numpy/core/src/umath/_struct_ufunc_tests.c.src b/numpy/core/src/umath/_struct_ufunc_tests.c.src
deleted file mode 100644
index ee71c46..0000000
--- a/numpy/core/src/umath/_struct_ufunc_tests.c.src
+++ /dev/null
@@ -1,160 +0,0 @@
-#define PY_SSIZE_T_CLEAN
-#include <Python.h>
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-#include "numpy/ndarraytypes.h"
-#include "numpy/ufuncobject.h"
-#include "numpy/npy_3kcompat.h"
-
-#include <math.h>
-
-
-/*
- * struct_ufunc_test.c
- * This is the C code for creating your own
- * NumPy ufunc for a structured array dtype.
- *
- * Details explaining the Python-C API can be found under
- * 'Extending and Embedding' and 'Python/C API' at
- * docs.python.org .
- */
-
-static void add_uint64_triplet(char **args,
-                               npy_intp const *dimensions,
-                               npy_intp const* steps,
-                               void* data)
-{
-    npy_intp i;
-    npy_intp is1=steps[0];
-    npy_intp is2=steps[1];
-    npy_intp os=steps[2];
-    npy_intp n=dimensions[0];
-    npy_uint64 *x, *y, *z;
-
-    char *i1=args[0];
-    char *i2=args[1];
-    char *op=args[2];
-
-    for (i = 0; i < n; i++) {
-
-        x = (npy_uint64*)i1;
-        y = (npy_uint64*)i2;
-        z = (npy_uint64*)op;
-
-        z[0] = x[0] + y[0];
-        z[1] = x[1] + y[1];
-        z[2] = x[2] + y[2];
-
-        i1 += is1;
-        i2 += is2;
-        op += os;
-    }
-}
-
-static PyObject*
-register_fail(PyObject* NPY_UNUSED(self), PyObject* NPY_UNUSED(args))
-{
-    PyObject *add_triplet;
-    PyObject *dtype_dict;
-    PyArray_Descr *dtype;
-    PyArray_Descr *dtypes[3];
-    int retval;
-
-    add_triplet = PyUFunc_FromFuncAndData(NULL, NULL, NULL, 0, 2, 1,
-                                    PyUFunc_None, "add_triplet",
-                                    "add_triplet_docstring", 0);
-
-    dtype_dict = Py_BuildValue("[(s, s), (s, s), (s, s)]",
-                               "f0", "u8", "f1", "u8", "f2", "u8");
-    PyArray_DescrConverter(dtype_dict, &dtype);
-    Py_DECREF(dtype_dict);
-
-    dtypes[0] = dtype;
-    dtypes[1] = dtype;
-    dtypes[2] = dtype;
-
-    retval = PyUFunc_RegisterLoopForDescr((PyUFuncObject *)add_triplet,
-                                dtype,
-                                &add_uint64_triplet,
-                                dtypes,
-                                NULL);
-
-    if (retval < 0) {
-        Py_DECREF(add_triplet);
-        Py_DECREF(dtype);
-        return NULL;
-    }
-    retval = PyUFunc_RegisterLoopForDescr((PyUFuncObject *)add_triplet,
-                                dtype,
-                                &add_uint64_triplet,
-                                dtypes,
-                                NULL);
-    Py_DECREF(add_triplet);
-    Py_DECREF(dtype);
-    if (retval < 0) {
-        return NULL;
-    }
-    Py_RETURN_NONE;
-}
-
-static PyMethodDef StructUfuncTestMethods[] = {
-    {"register_fail",
-        register_fail,
-        METH_NOARGS, NULL},
-    {NULL, NULL, 0, NULL}
-};
-
-static struct PyModuleDef moduledef = {
-    PyModuleDef_HEAD_INIT,
-    "_struct_ufunc_tests",
-    NULL,
-    -1,
-    StructUfuncTestMethods,
-    NULL,
-    NULL,
-    NULL,
-    NULL
-};
-
-PyMODINIT_FUNC PyInit__struct_ufunc_tests(void)
-{
-    PyObject *m, *add_triplet, *d;
-    PyObject *dtype_dict;
-    PyArray_Descr *dtype;
-    PyArray_Descr *dtypes[3];
-
-    m = PyModule_Create(&moduledef);
-
-    if (m == NULL) {
-        return NULL;
-    }
-
-    import_array();
-    import_umath();
-
-    add_triplet = PyUFunc_FromFuncAndData(NULL, NULL, NULL, 0, 2, 1,
-                                    PyUFunc_None, "add_triplet",
-                                    "add_triplet_docstring", 0);
-
-    dtype_dict = Py_BuildValue("[(s, s), (s, s), (s, s)]",
-                               "f0", "u8", "f1", "u8", "f2", "u8");
-    PyArray_DescrConverter(dtype_dict, &dtype);
-    Py_DECREF(dtype_dict);
-
-    dtypes[0] = dtype;
-    dtypes[1] = dtype;
-    dtypes[2] = dtype;
-
-    PyUFunc_RegisterLoopForDescr((PyUFuncObject *)add_triplet,
-                                dtype,
-                                &add_uint64_triplet,
-                                dtypes,
-                                NULL);
-
-    Py_DECREF(dtype);
-    d = PyModule_GetDict(m);
-
-    PyDict_SetItemString(d, "add_triplet", add_triplet);
-    Py_DECREF(add_triplet);
-    return m;
-}
diff --git a/numpy/core/src/umath/dispatching.c b/numpy/core/src/umath/dispatching.c
index 8e2f0fe1377a5505b64764d626bb36fbf251ef51..b8f102b3dff2468384218d59e759c2266c8c9e97 100644
--- a/numpy/core/src/umath/dispatching.c
+++ b/numpy/core/src/umath/dispatching.c
@@ -78,7 +78,7 @@ NPY_NO_EXPORT int
  PyUFunc_AddLoop(PyUFuncObject *ufunc, PyObject *info, int ignore_duplicate)
  {
      /*
-     * Validate the info object, this should likely move to to a different
+     * Validate the info object, this should likely move to a different
       * entry-point in the future (and is mostly unnecessary currently).
       */
      if (!PyTuple_CheckExact(info) || PyTuple_GET_SIZE(info) != 2) {
diff --git a/numpy/core/src/umath/fast_loop_macros.h b/numpy/core/src/umath/fast_loop_macros.h
index 4a36c97218793bdb0875679930a3e7f7a94c9f91..cbd1f04aaacda9484cd43dd39903bedb9f96ada4 100644
--- a/numpy/core/src/umath/fast_loop_macros.h
+++ b/numpy/core/src/umath/fast_loop_macros.h
@@ -10,6 +10,8 @@
  #ifndef _NPY_UMATH_FAST_LOOP_MACROS_H_
  #define _NPY_UMATH_FAST_LOOP_MACROS_H_
  
+#include <assert.h>
+
  /*
   * MAX_STEP_SIZE is used to determine if we need to use SIMD version of the ufunc.
   * Very large step size can be as slow as processing it using scalar. The
@@ -99,10 +101,20 @@ abs_ptrdiff(char *a, char *b)
  
  #define IS_OUTPUT_CONT(tout) (steps[1] == sizeof(tout))
  
-#define IS_BINARY_REDUCE ((args[0] == args[2])\
+/*
+ * Make sure dimensions is non-zero with an assert, to allow subsequent code
+ * to ignore problems of accessing invalid memory
+ */
+
+#define IS_BINARY_REDUCE (assert(dimensions[0] != 0), \
+        (args[0] == args[2])\
          && (steps[0] == steps[2])\
          && (steps[0] == 0))
  
+/* input contiguous (for binary reduces only) */
+#define IS_BINARY_REDUCE_INPUT_CONT(tin) (assert(dimensions[0] != 0), \
+         steps[1] == sizeof(tin))
+
  /* binary loop input and output contiguous */
  #define IS_BINARY_CONT(tin, tout) (steps[0] == sizeof(tin) && \
                                     steps[1] == sizeof(tin) && \
@@ -252,6 +264,34 @@ abs_ptrdiff(char *a, char *b)
      TYPE io1 = *(TYPE *)iop1; \
      BINARY_REDUCE_LOOP_INNER
  
+/*
+ * op should be the code working on `TYPE in2` and
+ * reading/storing the result in `TYPE *io1`
+ */
+#define BASE_BINARY_REDUCE_LOOP(TYPE, op) \
+    BINARY_REDUCE_LOOP_INNER { \
+        const TYPE in2 = *(TYPE *)ip2; \
+        op; \
+    }
+
+#define BINARY_REDUCE_LOOP_FAST_INNER(TYPE, op)\
+    /* condition allows compiler to optimize the generic macro */ \
+    if(IS_BINARY_REDUCE_INPUT_CONT(TYPE)) { \
+        BASE_BINARY_REDUCE_LOOP(TYPE, op) \
+    } \
+    else { \
+        BASE_BINARY_REDUCE_LOOP(TYPE, op) \
+    }
+
+#define BINARY_REDUCE_LOOP_FAST(TYPE, op)\
+    do { \
+        char *iop1 = args[0]; \
+        TYPE io1 = *(TYPE *)iop1; \
+        BINARY_REDUCE_LOOP_FAST_INNER(TYPE, op); \
+        *((TYPE *)iop1) = io1; \
+    } \
+    while (0)
+
  #define IS_BINARY_STRIDE_ONE(esize, vsize) \
      ((steps[0] == esize) && \
       (steps[1] == esize) && \
diff --git a/numpy/core/src/umath/legacy_array_method.c b/numpy/core/src/umath/legacy_array_method.c
index ef24edff1c98bc3d94474e25746e0707300a0d10..c3d421d9b8aa799bb733e7246d2d711c1aec7518 100644
--- a/numpy/core/src/umath/legacy_array_method.c
+++ b/numpy/core/src/umath/legacy_array_method.c
@@ -103,7 +103,8 @@ NPY_NO_EXPORT NPY_CASTING
  wrapped_legacy_resolve_descriptors(PyArrayMethodObject *NPY_UNUSED(self),
          PyArray_DTypeMeta *NPY_UNUSED(dtypes[]),
          PyArray_Descr *NPY_UNUSED(given_descrs[]),
-        PyArray_Descr *NPY_UNUSED(loop_descrs[]))
+        PyArray_Descr *NPY_UNUSED(loop_descrs[]),
+        npy_intp *NPY_UNUSED(view_offset))
  {
      PyErr_SetString(PyExc_RuntimeError,
              "cannot use legacy wrapping ArrayMethod without calling the ufunc "
@@ -121,7 +122,8 @@ simple_legacy_resolve_descriptors(
          PyArrayMethodObject *method,
          PyArray_DTypeMeta **dtypes,
          PyArray_Descr **given_descrs,
-        PyArray_Descr **output_descrs)
+        PyArray_Descr **output_descrs,
+        npy_intp *NPY_UNUSED(view_offset))
  {
      int i = 0;
      int nin = method->nin;
@@ -134,7 +136,7 @@ simple_legacy_resolve_descriptors(
           * (identity) at least currently. This is because `op[0] is op[2]`.
           * (If the output descriptor is not passed, the below works.)
           */
-        output_descrs[2] = ensure_dtype_nbo(given_descrs[2]);
+        output_descrs[2] = NPY_DT_CALL_ensure_canonical(given_descrs[2]);
          if (output_descrs[2] == NULL) {
              Py_CLEAR(output_descrs[2]);
              return -1;
@@ -147,7 +149,7 @@ simple_legacy_resolve_descriptors(
              output_descrs[1] = output_descrs[2];
          }
          else {
-            output_descrs[1] = ensure_dtype_nbo(given_descrs[1]);
+            output_descrs[1] = NPY_DT_CALL_ensure_canonical(given_descrs[1]);
              if (output_descrs[1] == NULL) {
                  i = 2;
                  goto fail;
@@ -158,7 +160,7 @@ simple_legacy_resolve_descriptors(
  
      for (; i < nin + nout; i++) {
          if (given_descrs[i] != NULL) {
-            output_descrs[i] = ensure_dtype_nbo(given_descrs[i]);
+            output_descrs[i] = NPY_DT_CALL_ensure_canonical(given_descrs[i]);
          }
          else if (dtypes[i] == dtypes[0] && i > 0) {
              /* Preserve metadata from the first operand if same dtype */
@@ -190,7 +192,7 @@ simple_legacy_resolve_descriptors(
  NPY_NO_EXPORT int
  get_wrapped_legacy_ufunc_loop(PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *NPY_UNUSED(strides),
+        const npy_intp *NPY_UNUSED(strides),
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags)
diff --git a/numpy/core/src/umath/legacy_array_method.h b/numpy/core/src/umath/legacy_array_method.h
index 0dec1fb3a4857a337f33899772c91031b750dbf1..498fb1aa27c29fb37af17d5c49f1cb06cc5c4aa7 100644
--- a/numpy/core/src/umath/legacy_array_method.h
+++ b/numpy/core/src/umath/legacy_array_method.h
@@ -20,14 +20,14 @@ PyArray_NewLegacyWrappingArrayMethod(PyUFuncObject *ufunc,
  NPY_NO_EXPORT int
  get_wrapped_legacy_ufunc_loop(PyArrayMethod_Context *context,
          int aligned, int move_references,
-        npy_intp *NPY_UNUSED(strides),
+        const npy_intp *NPY_UNUSED(strides),
          PyArrayMethod_StridedLoop **out_loop,
          NpyAuxData **out_transferdata,
          NPY_ARRAYMETHOD_FLAGS *flags);
  
  NPY_NO_EXPORT NPY_CASTING
  wrapped_legacy_resolve_descriptors(PyArrayMethodObject *,
-        PyArray_DTypeMeta **, PyArray_Descr **, PyArray_Descr **);
+        PyArray_DTypeMeta **, PyArray_Descr **, PyArray_Descr **, npy_intp *);
  
  
  #endif  /*_NPY_LEGACY_ARRAY_METHOD_H */
diff --git a/numpy/core/src/umath/loops.c.src b/numpy/core/src/umath/loops.c.src
index aaa694f34dbba408fd2193369e41903743564942..3a8a549131a2aed30476be9172faa69a41949996 100644
--- a/numpy/core/src/umath/loops.c.src
+++ b/numpy/core/src/umath/loops.c.src
@@ -636,10 +636,7 @@ NPY_NO_EXPORT NPY_GCC_OPT_3 @ATTR@ void
  @TYPE@_@kind@@isa@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
  {
      if (IS_BINARY_REDUCE) {
-        BINARY_REDUCE_LOOP(@type@) {
-            io1 @OP@= *(@type@ *)ip2;
-        }
-        *((@type@ *)iop1) = io1;
+        BINARY_REDUCE_LOOP_FAST(@type@, io1 @OP@= in2);
      }
      else {
          BINARY_LOOP_FAST(@type@, @type@, *out = in1 @OP@ in2);
@@ -724,32 +721,6 @@ NPY_NO_EXPORT NPY_GCC_OPT_3 @ATTR@ void
  
  /**end repeat1**/
  
-/**begin repeat1
- * #kind = maximum, minimum#
- * #OP =  >, <#
- **/
-
-NPY_NO_EXPORT void
-@TYPE@_@kind@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    if (IS_BINARY_REDUCE) {
-        BINARY_REDUCE_LOOP(@type@) {
-            const @type@ in2 = *(@type@ *)ip2;
-            io1 = (io1 @OP@ in2) ? io1 : in2;
-        }
-        *((@type@ *)iop1) = io1;
-    }
-    else {
-        BINARY_LOOP {
-            const @type@ in1 = *(@type@ *)ip1;
-            const @type@ in2 = *(@type@ *)ip2;
-            *((@type@ *)op1) = (in1 @OP@ in2) ? in1 : in2;
-        }
-    }
-}
-
-/**end repeat1**/
-
  NPY_NO_EXPORT void
  @TYPE@_power(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
  {
@@ -790,23 +761,6 @@ NPY_NO_EXPORT void
      }
  }
  
-NPY_NO_EXPORT void
-@TYPE@_fmod(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    BINARY_LOOP {
-        const @type@ in1 = *(@type@ *)ip1;
-        const @type@ in2 = *(@type@ *)ip2;
-        if (in2 == 0) {
-            npy_set_floatstatus_divbyzero();
-            *((@type@ *)op1) = 0;
-        }
-        else {
-            *((@type@ *)op1)= in1 % in2;
-        }
-
-    }
-}
-
  /**begin repeat1
   * #kind = isnan, isinf, isfinite#
   * #func = npy_isnan, npy_isinf, npy_isfinite#
@@ -843,57 +797,6 @@ NPY_NO_EXPORT NPY_GCC_OPT_3 void
      UNARY_LOOP_FAST(@type@, @type@, *out = in > 0 ? 1 : (in < 0 ? -1 : 0));
  }
  
-NPY_NO_EXPORT void
-@TYPE@_remainder(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    BINARY_LOOP {
-        const @type@ in1 = *(@type@ *)ip1;
-        const @type@ in2 = *(@type@ *)ip2;
-        if (in2 == 0) {
-            npy_set_floatstatus_divbyzero();
-            *((@type@ *)op1) = 0;
-        }
-        else {
-            /* handle mixed case the way Python does */
-            const @type@ rem = in1 % in2;
-            if ((in1 > 0) == (in2 > 0) || rem == 0) {
-                *((@type@ *)op1) = rem;
-            }
-            else {
-                *((@type@ *)op1) = rem + in2;
-            }
-        }
-    }
-}
-
-NPY_NO_EXPORT void
-@TYPE@_divmod(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    BINARY_LOOP_TWO_OUT {
-        const @type@ in1 = *(@type@ *)ip1;
-        const @type@ in2 = *(@type@ *)ip2;
-        /* see FIXME note for divide above */
-        if (in2 == 0 || (in1 == NPY_MIN_@TYPE@ && in2 == -1)) {
-            npy_set_floatstatus_divbyzero();
-            *((@type@ *)op1) = 0;
-            *((@type@ *)op2) = 0;
-        }
-        else {
-            /* handle mixed case the way Python does */
-            const @type@ quo = in1 / in2;
-            const @type@ rem = in1 % in2;
-            if ((in1 > 0) == (in2 > 0) || rem == 0) {
-                *((@type@ *)op1) = quo;
-                *((@type@ *)op2) = rem;
-            }
-            else {
-                *((@type@ *)op1) = quo - 1;
-                *((@type@ *)op2) = rem + in2;
-            }
-        }
-    }
-}
-
  /**begin repeat1
   * #kind = gcd, lcm#
   **/
@@ -928,40 +831,6 @@ NPY_NO_EXPORT NPY_GCC_OPT_3 void
      UNARY_LOOP_FAST(@type@, @type@, *out = in > 0 ? 1 : 0);
  }
  
-NPY_NO_EXPORT void
-@TYPE@_remainder(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    BINARY_LOOP {
-        const @type@ in1 = *(@type@ *)ip1;
-        const @type@ in2 = *(@type@ *)ip2;
-        if (in2 == 0) {
-            npy_set_floatstatus_divbyzero();
-            *((@type@ *)op1) = 0;
-        }
-        else {
-            *((@type@ *)op1) = in1 % in2;
-        }
-    }
-}
-
-NPY_NO_EXPORT void
-@TYPE@_divmod(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    BINARY_LOOP_TWO_OUT {
-        const @type@ in1 = *(@type@ *)ip1;
-        const @type@ in2 = *(@type@ *)ip2;
-        if (in2 == 0) {
-            npy_set_floatstatus_divbyzero();
-            *((@type@ *)op1) = 0;
-            *((@type@ *)op2) = 0;
-        }
-        else {
-            *((@type@ *)op1)= in1/in2;
-            *((@type@ *)op2) = in1 % in2;
-        }
-    }
-}
-
  /**begin repeat1
   * #kind = gcd, lcm#
   **/
@@ -1531,62 +1400,6 @@ TIMEDELTA_mm_qm_divmod(char **args, npy_intp const *dimensions, npy_intp const *
   *****************************************************************************
   */
  
-/**begin repeat
- *  #func = rint, floor, trunc#
- *  #scalarf = npy_rint, npy_floor, npy_trunc#
- */
-
-/**begin repeat1
-*  #TYPE = FLOAT, DOUBLE#
-*  #type = npy_float, npy_double#
-*  #typesub = f, #
-*/
-
-NPY_NO_EXPORT NPY_GCC_OPT_3 void
-@TYPE@_@func@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data))
-{
-    UNARY_LOOP {
-        const @type@ in1 = *(@type@ *)ip1;
-        *(@type@ *)op1 = @scalarf@@typesub@(in1);
-    }
-}
-
-
-/**end repeat1**/
-/**end repeat**/
-
-/**begin repeat
- * #isa = avx512f, fma#
- * #ISA = AVX512F, FMA#
- * #CHK = HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS, HAVE_ATTRIBUTE_TARGET_AVX2_WITH_INTRINSICS#
- */
-
-/**begin repeat1
- *  #TYPE = FLOAT, DOUBLE#
- *  #type = npy_float, npy_double#
- *  #typesub = f, #
- */
-
-/**begin repeat2
- *  #func = rint, floor, trunc#
- *  #scalarf = npy_rint, npy_floor, npy_trunc#
- */
-
-NPY_NO_EXPORT NPY_GCC_OPT_3 void
-@TYPE@_@func@_@isa@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data))
-{
-    if (!run_unary_@isa@_@func@_@TYPE@(args, dimensions, steps)) {
-        UNARY_LOOP {
-            const @type@ in1 = *(@type@ *)ip1;
-            *(@type@ *)op1 = @scalarf@@typesub@(in1);
-        }
-    }
-}
-
-/**end repeat2**/
-/**end repeat1**/
-/**end repeat**/
-
  /**begin repeat
   * Float types
   *  #type = npy_float, npy_double, npy_longdouble#
@@ -1684,93 +1497,6 @@ NPY_NO_EXPORT void
      }
  }
  
-/**begin repeat1
- * #kind = maximum, minimum#
- * #OP =  >=, <=#
- **/
-NPY_NO_EXPORT void
-@TYPE@_@kind@_avx512f(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    /*  */
-    if (IS_BINARY_REDUCE) {
-        if (!run_unary_reduce_simd_@kind@_@TYPE@(args, dimensions, steps)) {
-            BINARY_REDUCE_LOOP(@type@) {
-                const @type@ in2 = *(@type@ *)ip2;
-                /* Order of operations important for MSVC 2015 */
-                io1 = (io1 @OP@ in2 || npy_isnan(io1)) ? io1 : in2;
-            }
-            *((@type@ *)iop1) = io1;
-        }
-    }
-    else {
-        if (!run_binary_avx512f_@kind@_@TYPE@(args, dimensions, steps)) {
-            BINARY_LOOP {
-                @type@ in1 = *(@type@ *)ip1;
-                const @type@ in2 = *(@type@ *)ip2;
-                /* Order of operations important for MSVC 2015 */
-                in1 = (in1 @OP@ in2 || npy_isnan(in1)) ? in1 : in2;
-                *((@type@ *)op1) = in1;
-            }
-        }
-    }
-    npy_clear_floatstatus_barrier((char*)dimensions);
-}
-
-NPY_NO_EXPORT void
-@TYPE@_@kind@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    /*  */
-    if (IS_BINARY_REDUCE) {
-        if (!run_unary_reduce_simd_@kind@_@TYPE@(args, dimensions, steps)) {
-            BINARY_REDUCE_LOOP(@type@) {
-                const @type@ in2 = *(@type@ *)ip2;
-                /* Order of operations important for MSVC 2015 */
-                io1 = (io1 @OP@ in2 || npy_isnan(io1)) ? io1 : in2;
-            }
-            *((@type@ *)iop1) = io1;
-        }
-    }
-    else {
-        BINARY_LOOP {
-            @type@ in1 = *(@type@ *)ip1;
-            const @type@ in2 = *(@type@ *)ip2;
-            /* Order of operations important for MSVC 2015 */
-            in1 = (in1 @OP@ in2 || npy_isnan(in1)) ? in1 : in2;
-            *((@type@ *)op1) = in1;
-        }
-    }
-    npy_clear_floatstatus_barrier((char*)dimensions);
-}
-/**end repeat1**/
-
-/**begin repeat1
- * #kind = fmax, fmin#
- * #OP =  >=, <=#
- **/
-NPY_NO_EXPORT void
-@TYPE@_@kind@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
-{
-    /*  */
-    if (IS_BINARY_REDUCE) {
-        BINARY_REDUCE_LOOP(@type@) {
-            const @type@ in2 = *(@type@ *)ip2;
-            /* Order of operations important for MSVC 2015 */
-            io1 = (io1 @OP@ in2 || npy_isnan(in2)) ? io1 : in2;
-        }
-        *((@type@ *)iop1) = io1;
-    }
-    else {
-        BINARY_LOOP {
-            const @type@ in1 = *(@type@ *)ip1;
-            const @type@ in2 = *(@type@ *)ip2;
-            /* Order of operations important for MSVC 2015 */
-            *((@type@ *)op1) = (in1 @OP@ in2 || npy_isnan(in2)) ? in1 : in2;
-        }
-    }
-    npy_clear_floatstatus_barrier((char*)dimensions);
-}
-/**end repeat1**/
-
  NPY_NO_EXPORT void
  @TYPE@_floor_divide(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
  {
diff --git a/numpy/core/src/umath/loops.h.src b/numpy/core/src/umath/loops.h.src
index 081ca99571a101684ca57280f96ee56cd253e309..694518ae0e2003afd38552dbf80f86f00c2ea747 100644
--- a/numpy/core/src/umath/loops.h.src
+++ b/numpy/core/src/umath/loops.h.src
@@ -22,7 +22,6 @@
  #define BOOL_fmax BOOL_maximum
  #define BOOL_fmin BOOL_minimum
  
-
  /*
   *****************************************************************************
   **                             BOOLEAN LOOPS                               **
@@ -65,6 +64,24 @@ BOOL_@kind@(char **args, npy_intp const *dimensions, npy_intp const *steps, void
       (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)))
  /**end repeat**/
  
+#ifndef NPY_DISABLE_OPTIMIZATION
+    #include "loops_modulo.dispatch.h"
+#endif
+
+/**begin repeat
+ * #TYPE = UBYTE, USHORT, UINT, ULONG, ULONGLONG,
+           BYTE,  SHORT,  INT,  LONG,  LONGLONG#
+ */
+ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_divmod,
+     (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)))
+
+ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_fmod,
+     (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)))
+
+ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_remainder,
+     (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)))
+/**end repeat**/
+
  /**begin repeat
   * #TYPE = BYTE, SHORT, INT, LONG, LONGLONG#
   */
@@ -143,21 +160,12 @@ NPY_NO_EXPORT void
  NPY_NO_EXPORT void
  @S@@TYPE@_power(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
  
-NPY_NO_EXPORT void
-@S@@TYPE@_fmod(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
-
  NPY_NO_EXPORT void
  @S@@TYPE@_absolute(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
  
  NPY_NO_EXPORT void
  @S@@TYPE@_sign(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
  
-NPY_NO_EXPORT void
-@S@@TYPE@_remainder(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
-
-NPY_NO_EXPORT void
-@S@@TYPE@_divmod(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
-
  NPY_NO_EXPORT void
  @S@@TYPE@_gcd(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func));
  
@@ -187,7 +195,7 @@ NPY_NO_EXPORT void
   *  #TYPE = FLOAT, DOUBLE#
   */
  /**begin repeat1
- * #kind = ceil, sqrt, absolute, square, reciprocal#
+ * #kind = rint, floor, trunc, ceil, sqrt, absolute, square, reciprocal#
   */
  NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_@kind@,
     (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data)))
@@ -210,6 +218,24 @@ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_@kind@,
  /**end repeat1**/
  /**end repeat**/
  
+#ifndef NPY_DISABLE_OPTIMIZATION
+    #include "loops_hyperbolic.dispatch.h"
+#endif
+/**begin repeat
+ *  #TYPE = FLOAT, DOUBLE#
+ */
+/**begin repeat1
+ * #func = tanh#
+ */
+NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_@func@,
+    (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)))
+/**end repeat1**/
+/**end repeat**/
+
+/**end repeat1**/
+/**end repeat**/
+
+// SVML
  #ifndef NPY_DISABLE_OPTIMIZATION
      #include "loops_umath_fp.dispatch.h"
  #endif
@@ -274,26 +300,6 @@ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_@kind@, (
  /**end repeat1**/
  /**end repeat**/
  
-/**begin repeat
- *  #func = rint, floor, trunc#
- */
-
-/**begin repeat1
-*  #TYPE = FLOAT, DOUBLE#
-*/
-
-NPY_NO_EXPORT NPY_GCC_OPT_3 void
-@TYPE@_@func@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data));
-
-/**begin repeat2
- * #isa = avx512f, fma#
- */
-NPY_NO_EXPORT NPY_GCC_OPT_3 void
-@TYPE@_@func@_@isa@(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data));
-/**end repeat2**/
-/**end repeat1**/
-/**end repeat**/
-
  /**begin repeat
   * Float types
   *  #TYPE = HALF, FLOAT, DOUBLE, LONGDOUBLE#
@@ -658,6 +664,43 @@ OBJECT_sign(char **args, npy_intp const *dimensions, npy_intp const *steps, void
  NPY_NO_EXPORT void
  PyUFunc_OOO_O(char **args, npy_intp const *dimensions, npy_intp const *steps, void *func);
  
+/*
+ *****************************************************************************
+ **                            MIN/MAX LOOPS                                **
+ *****************************************************************************
+ */
+
+#ifndef NPY_DISABLE_OPTIMIZATION
+    #include "loops_minmax.dispatch.h"
+#endif
+
+//---------- Integers ----------
+
+/**begin repeat
+ * #TYPE = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
+ *         LONG, ULONG, LONGLONG, ULONGLONG#
+ */
+/**begin repeat1
+ * #kind = maximum, minimum#
+ */
+ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_@kind@,
+   (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data)))
+/**end repeat1**/
+/**end repeat**/
+
+//---------- Float ----------
+
+ /**begin repeat
+  * #TYPE = FLOAT, DOUBLE, LONGDOUBLE#
+  */
+/**begin repeat1
+ * #kind = maximum, minimum, fmax, fmin#
+ */
+ NPY_CPU_DISPATCH_DECLARE(NPY_NO_EXPORT void @TYPE@_@kind@,
+   (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data)))
+/**end repeat1**/
+/**end repeat**/
+
  /*
   *****************************************************************************
   **                              END LOOPS                                  **
diff --git a/numpy/core/src/umath/loops_arithmetic.dispatch.c.src b/numpy/core/src/umath/loops_arithmetic.dispatch.c.src
index 1ddf7c3b1a6fe1525663bc1a61ec978ddc4a620e..16a9eac2e781373f1f10ebbbac29aa81c2809c1d 100644
--- a/numpy/core/src/umath/loops_arithmetic.dispatch.c.src
+++ b/numpy/core/src/umath/loops_arithmetic.dispatch.c.src
@@ -1,7 +1,7 @@
  /*@targets
   ** $maxopt baseline
   ** sse2 sse41 avx2 avx512f avx512_skx
- ** vsx2
+ ** vsx2 vsx4
   ** neon
   **/
  #define _UMATHMODULE
@@ -128,7 +128,184 @@ simd_divide_by_scalar_contig_@sfx@(char **args, npy_intp len)
      npyv_cleanup();
  }
  /**end repeat**/
+
+#if defined(NPY_HAVE_VSX4)
+
+/**begin repeat
+ * #t = u, s#
+ * #signed = 0, 1#
+ */
+/*
+ * Computes division of 2 8-bit signed/unsigned integer vectors
+ *
+ * As Power10 only supports integer vector division for data of 32 bits or
+ * greater, we have to convert npyv_u8 into 4x npyv_u32, execute the integer
+ * vector division instruction, and then, convert the result back to npyv_u8.
+ */
+NPY_FINLINE npyv_@t@8
+vsx4_div_@t@8(npyv_@t@8 a, npyv_@t@8 b)
+{
+#if @signed@
+    npyv_s16x2 ta, tb;
+    npyv_s32x2 ahi, alo, bhi, blo;
+    ta.val[0] = vec_unpackh(a);
+    ta.val[1] = vec_unpackl(a);
+    tb.val[0] = vec_unpackh(b);
+    tb.val[1] = vec_unpackl(b);
+    ahi.val[0] = vec_unpackh(ta.val[0]);
+    ahi.val[1] = vec_unpackl(ta.val[0]);
+    alo.val[0] = vec_unpackh(ta.val[1]);
+    alo.val[1] = vec_unpackl(ta.val[1]);
+    bhi.val[0] = vec_unpackh(tb.val[0]);
+    bhi.val[1] = vec_unpackl(tb.val[0]);
+    blo.val[0] = vec_unpackh(tb.val[1]);
+    blo.val[1] = vec_unpackl(tb.val[1]);
+#else
+    npyv_u16x2 a_expand = npyv_expand_u16_u8(a);
+    npyv_u16x2 b_expand = npyv_expand_u16_u8(b);
+    npyv_u32x2 ahi = npyv_expand_u32_u16(a_expand.val[0]);
+    npyv_u32x2 alo = npyv_expand_u32_u16(a_expand.val[1]);
+    npyv_u32x2 bhi = npyv_expand_u32_u16(b_expand.val[0]);
+    npyv_u32x2 blo = npyv_expand_u32_u16(b_expand.val[1]);
  #endif
+    npyv_@t@32 v1 = vec_div(ahi.val[0], bhi.val[0]);
+    npyv_@t@32 v2 = vec_div(ahi.val[1], bhi.val[1]);
+    npyv_@t@32 v3 = vec_div(alo.val[0], blo.val[0]);
+    npyv_@t@32 v4 = vec_div(alo.val[1], blo.val[1]);
+    npyv_@t@16 hi = vec_pack(v1, v2);
+    npyv_@t@16 lo = vec_pack(v3, v4);
+    return vec_pack(hi, lo);
+}
+
+NPY_FINLINE npyv_@t@16
+vsx4_div_@t@16(npyv_@t@16 a, npyv_@t@16 b)
+{
+#if @signed@
+    npyv_s32x2 a_expand;
+    npyv_s32x2 b_expand;
+    a_expand.val[0] = vec_unpackh(a);
+    a_expand.val[1] = vec_unpackl(a);
+    b_expand.val[0] = vec_unpackh(b);
+    b_expand.val[1] = vec_unpackl(b);
+#else
+    npyv_u32x2 a_expand = npyv_expand_@t@32_@t@16(a);
+    npyv_u32x2 b_expand = npyv_expand_@t@32_@t@16(b);
+#endif
+    npyv_@t@32 v1 = vec_div(a_expand.val[0], b_expand.val[0]);
+    npyv_@t@32 v2 = vec_div(a_expand.val[1], b_expand.val[1]);
+    return vec_pack(v1, v2);
+}
+
+#define vsx4_div_@t@32 vec_div
+#define vsx4_div_@t@64 vec_div
+/**end repeat**/
+
+/**begin repeat
+ * Unsigned types
+ * #sfx = u8, u16, u32, u64#
+ * #len = 8,  16,  32,  64#
+ */
+static NPY_INLINE void
+vsx4_simd_divide_contig_@sfx@(char **args, npy_intp len)
+{
+    npyv_lanetype_@sfx@ *src1 = (npyv_lanetype_@sfx@ *) args[0];
+    npyv_lanetype_@sfx@ *src2 = (npyv_lanetype_@sfx@ *) args[1];
+    npyv_lanetype_@sfx@ *dst1 = (npyv_lanetype_@sfx@ *) args[2];
+    const npyv_@sfx@ vzero    = npyv_zero_@sfx@();
+    const int vstep           = npyv_nlanes_@sfx@;
+
+    for (; len >= vstep; len -= vstep, src1 += vstep, src2 += vstep,
+         dst1 += vstep) {
+        npyv_@sfx@ a = npyv_load_@sfx@(src1);
+        npyv_@sfx@ b = npyv_load_@sfx@(src2);
+        npyv_@sfx@ c = vsx4_div_@sfx@(a, b);
+        npyv_store_@sfx@(dst1, c);
+        if (NPY_UNLIKELY(vec_any_eq(b, vzero))) {
+            npy_set_floatstatus_divbyzero();
+        }
+    }
+
+    for (; len > 0; --len, ++src1, ++src2, ++dst1) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        const npyv_lanetype_@sfx@ b = *src2;
+        if (NPY_UNLIKELY(b == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+        } else{
+            *dst1 = a / b;
+        }
+    }
+    npyv_cleanup();
+}
+/**end repeat**/
+
+/**begin repeat
+ * Signed types
+ * #sfx = s8, s16, s32, s64#
+ * #len = 8,  16,  32,  64#
+ */
+static NPY_INLINE void
+vsx4_simd_divide_contig_@sfx@(char **args, npy_intp len)
+{
+    npyv_lanetype_@sfx@ *src1 = (npyv_lanetype_@sfx@ *) args[0];
+    npyv_lanetype_@sfx@ *src2 = (npyv_lanetype_@sfx@ *) args[1];
+    npyv_lanetype_@sfx@ *dst1 = (npyv_lanetype_@sfx@ *) args[2];
+    const npyv_@sfx@ vneg_one = npyv_setall_@sfx@(-1);
+    const npyv_@sfx@ vzero    = npyv_zero_@sfx@();
+    const npyv_@sfx@ vmin     = npyv_setall_@sfx@(NPY_MIN_INT@len@);
+    npyv_b@len@ warn          = npyv_cvt_b@len@_@sfx@(npyv_zero_@sfx@());
+    const int vstep           = npyv_nlanes_@sfx@;
+
+    for (; len >= vstep; len -= vstep, src1 += vstep, src2 += vstep,
+         dst1 += vstep) {
+        npyv_@sfx@ a   = npyv_load_@sfx@(src1);
+        npyv_@sfx@ b   = npyv_load_@sfx@(src2);
+        npyv_@sfx@ quo = vsx4_div_@sfx@(a, b);
+        npyv_@sfx@ rem = npyv_sub_@sfx@(a, vec_mul(b, quo));
+        // (b == 0 || (a == NPY_MIN_INT@len@ && b == -1))
+        npyv_b@len@ bzero    = npyv_cmpeq_@sfx@(b, vzero);
+        npyv_b@len@ amin     = npyv_cmpeq_@sfx@(a, vmin);
+        npyv_b@len@ bneg_one = npyv_cmpeq_@sfx@(b, vneg_one);
+        npyv_b@len@ overflow = npyv_and_@sfx@(bneg_one, amin);
+        npyv_b@len@ error    = npyv_or_@sfx@(bzero, overflow);
+        // in case of overflow or b = 0, 'cvtozero' forces quo/rem to be 0
+        npyv_@sfx@ cvtozero  = npyv_select_@sfx@(error, vzero, vneg_one);
+                        warn = npyv_or_@sfx@(error, warn);
+        // handle mixed case the way Python does
+        // ((a > 0) == (b > 0) || rem == 0)
+        npyv_b@len@ a_gt_zero  = npyv_cmpgt_@sfx@(a, vzero);
+        npyv_b@len@ b_gt_zero  = npyv_cmpgt_@sfx@(b, vzero);
+        npyv_b@len@ ab_eq_cond = npyv_cmpeq_@sfx@(a_gt_zero, b_gt_zero);
+        npyv_b@len@ rem_zero   = npyv_cmpeq_@sfx@(rem, vzero);
+        npyv_b@len@ or         = npyv_or_@sfx@(ab_eq_cond, rem_zero);
+        npyv_@sfx@ to_sub = npyv_select_@sfx@(or, vzero, vneg_one);
+                      quo = npyv_add_@sfx@(quo, to_sub);
+        npyv_store_@sfx@(dst1, npyv_and_@sfx@(cvtozero, quo));
+    }
+
+    if (!vec_all_eq(warn, vzero)) {
+        npy_set_floatstatus_divbyzero();
+    }
+
+    for (; len > 0; --len, ++src1, ++src2, ++dst1) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        const npyv_lanetype_@sfx@ b = *src2;
+        if (b == 0 || (a == NPY_MIN_INT@len@ && b == -1)) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+        }
+        else {
+            *dst1 = a / b;
+            if (((a > 0) != (b > 0)) && ((*dst1 * b) != a)) {
+                *dst1 -= 1;
+            }
+        }
+    }
+    npyv_cleanup();
+}
+/**end repeat**/
+#endif // NPY_HAVE_VSX4
+#endif // NPY_SIMD
  
  /********************************************************************************
   ** Defining ufunc inner functions
@@ -184,6 +361,12 @@ NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_divide)
          *((@type@ *)iop1) = io1;
      }
  #if NPY_SIMD && defined(TO_SIMD_SFX)
+#if defined(NPY_HAVE_VSX4)
+    // both arguments are arrays of the same size
+    else if (IS_BLOCKABLE_BINARY(sizeof(@type@), NPY_SIMD_WIDTH)) {
+        TO_SIMD_SFX(vsx4_simd_divide_contig)(args, dimensions[0]);
+    }
+#endif
      // for contiguous block of memory, divisor is a scalar and not 0
      else if (IS_BLOCKABLE_BINARY_SCALAR2(sizeof(@type@), NPY_SIMD_WIDTH) &&
               (*(@type@ *)args[1]) != 0) {
@@ -218,8 +401,7 @@ NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_divide)
   * because emulating multiply-high on these architectures is going to be expensive comparing
   * to the native scalar dividers.
   * Therefore it's better to disable NPYV in this special case to avoid any unnecessary shuffles.
- * Power10(VSX4) is an exception here since it has native support for integer vector division,
- * note neither infrastructure nor NPYV has supported VSX4 yet.
+ * Power10(VSX4) is an exception here since it has native support for integer vector division.
   */
  #if NPY_BITSOF_@STYPE@ == 64 && !defined(NPY_HAVE_VSX4) && (defined(NPY_HAVE_VSX) || defined(NPY_HAVE_NEON))
      #undef TO_SIMD_SFX
@@ -240,6 +422,12 @@ NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_divide)
          *((@type@ *)iop1) = io1;
      }
  #if NPY_SIMD && defined(TO_SIMD_SFX)
+#if defined(NPY_HAVE_VSX4)
+    // both arguments are arrays of the same size
+    else if (IS_BLOCKABLE_BINARY(sizeof(@type@), NPY_SIMD_WIDTH)) {
+        TO_SIMD_SFX(vsx4_simd_divide_contig)(args, dimensions[0]);
+    }
+#endif
      // for contiguous block of memory, divisor is a scalar and not 0
      else if (IS_BLOCKABLE_BINARY_SCALAR2(sizeof(@type@), NPY_SIMD_WIDTH) &&
               (*(@type@ *)args[1]) != 0) {
diff --git a/numpy/core/src/umath/loops_exponent_log.dispatch.c.src b/numpy/core/src/umath/loops_exponent_log.dispatch.c.src

index 2dd43fb85362e1d753ccff1a33f951478dd279e2..8f123a48b2f4eb8371da2c1b0acdf5a370b16fd1 100644 (file)
--- a/numpy/core/src/umath/loops_exponent_log.dispatch.c.src
+++ b/numpy/core/src/umath/loops_exponent_log.dispatch.c.src
@@ -11,6 +11,7 @@
  
  #include "numpy/npy_math.h"
  #include "simd/simd.h"
+#include "npy_svml.h"
  #include "loops_utils.h"
  #include "loops.h"
  #include "lowlevel_strided_loops.h"
@@ -459,11 +460,12 @@ simd_exp_FLOAT(npy_float * op,
      @vtype@ cvt_magic = _mm@vsize@_set1_ps(NPY_RINT_CVT_MAGICf);
      @vtype@ log2e = _mm@vsize@_set1_ps(NPY_LOG2Ef);
      @vtype@ inf = _mm@vsize@_set1_ps(NPY_INFINITYF);
+    @vtype@ ninf = _mm@vsize@_set1_ps(-1*NPY_INFINITYF);
      @vtype@ zeros_f = _mm@vsize@_set1_ps(0.0f);
      @vtype@ poly, num_poly, denom_poly, quadrant;
      @vtype@i vindex = _mm@vsize@_loadu_si@vsize@((@vtype@i*)&indexarr[0]);
  
-    @mask@ xmax_mask, xmin_mask, nan_mask, inf_mask;
+    @mask@ xmax_mask, xmin_mask, nan_mask, inf_mask, ninf_mask;
      @mask@ overflow_mask = @isa@_get_partial_load_mask_ps(0, num_lanes);
      @mask@ underflow_mask = @isa@_get_partial_load_mask_ps(0, num_lanes);
      @mask@ load_mask = @isa@_get_full_load_mask_ps();
@@ -490,9 +492,11 @@ simd_exp_FLOAT(npy_float * op,
          xmax_mask = _mm@vsize@_cmp_ps@vsub@(x, _mm@vsize@_set1_ps(xmax), _CMP_GE_OQ);
          xmin_mask = _mm@vsize@_cmp_ps@vsub@(x, _mm@vsize@_set1_ps(xmin), _CMP_LE_OQ);
          inf_mask = _mm@vsize@_cmp_ps@vsub@(x, inf, _CMP_EQ_OQ);
+        ninf_mask = _mm@vsize@_cmp_ps@vsub@(x, ninf, _CMP_EQ_OQ);
          overflow_mask = @or_masks@(overflow_mask,
                                      @xor_masks@(xmax_mask, inf_mask));
-        underflow_mask = @or_masks@(underflow_mask, xmin_mask);
+        underflow_mask = @or_masks@(underflow_mask,
+                                    @xor_masks@(xmin_mask, ninf_mask));
  
          x = @isa@_set_masked_lanes_ps(x, zeros_f, @or_masks@(
                                      @or_masks@(nan_mask, xmin_mask), xmax_mask));
@@ -688,6 +692,43 @@ simd_log_FLOAT(npy_float * op,
  #endif // @CHK@
  /**end repeat**/
  
+#if NPY_SIMD && defined(NPY_HAVE_AVX512_SKX) && defined(NPY_CAN_LINK_SVML)
+/**begin repeat
+ * #func = exp, log#
+ * #default_val = 0, 1#
+ */
+static void
+simd_@func@_f64(const npyv_lanetype_f64 *src, npy_intp ssrc,
+                      npyv_lanetype_f64 *dst, npy_intp sdst, npy_intp len)
+{
+    const int vstep = npyv_nlanes_f64;
+    for (; len > 0; len -= vstep, src += ssrc*vstep, dst += sdst*vstep) {
+        npyv_f64 x;
+#if @default_val@
+        if (ssrc == 1) {
+            x = npyv_load_till_f64(src, len, @default_val@);
+        } else {
+            x = npyv_loadn_till_f64(src, ssrc, len, @default_val@);
+        }
+#else
+        if (ssrc == 1) {
+            x = npyv_load_tillz_f64(src, len);
+        } else {
+            x = npyv_loadn_tillz_f64(src, ssrc, len);
+        }
+#endif
+        npyv_f64 out = __svml_@func@8(x);
+        if (sdst == 1) {
+            npyv_store_till_f64(dst, len, out);
+        } else {
+            npyv_storen_till_f64(dst, sdst, len, out);
+        }
+    }
+    npyv_cleanup();
+}
+/**end repeat**/
+
+#else
  #ifdef SIMD_AVX512F_NOCLANG_BUG
  /*
   * Vectorized implementation of exp double using AVX512
@@ -732,6 +773,7 @@ AVX512F_exp_DOUBLE(npy_double * op,
      __m512d mTH_max = _mm512_set1_pd(0x1.62e42fefa39efp+9);
      __m512d mTH_min = _mm512_set1_pd(-0x1.74910d52d3053p+9);
      __m512d mTH_inf = _mm512_set1_pd(NPY_INFINITY);
+    __m512d mTH_ninf = _mm512_set1_pd(-NPY_INFINITY);
      __m512d zeros_d = _mm512_set1_pd(0.0f);
      __m512d ones_d = _mm512_set1_pd(1.0f);
      __m256i vindex = _mm256_loadu_si256((__m256i*)&indexarr[0]);
@@ -748,7 +790,7 @@ AVX512F_exp_DOUBLE(npy_double * op,
      __mmask8 overflow_mask = avx512_get_partial_load_mask_pd(0, num_lanes);
      __mmask8 underflow_mask = avx512_get_partial_load_mask_pd(0, num_lanes);
      __mmask8 load_mask = avx512_get_full_load_mask_pd();
-    __mmask8 xmin_mask, xmax_mask, inf_mask, nan_mask, nearzero_mask;
+    __mmask8 xmin_mask, xmax_mask, inf_mask, ninf_mask, nan_mask, nearzero_mask;
  
      while (num_remaining_elements > 0) {
          if (num_remaining_elements < num_lanes) {
@@ -769,6 +811,7 @@ AVX512F_exp_DOUBLE(npy_double * op,
          xmax_mask = _mm512_cmp_pd_mask(x, mTH_max, _CMP_GT_OQ);
          xmin_mask = _mm512_cmp_pd_mask(x, mTH_min, _CMP_LT_OQ);
          inf_mask = _mm512_cmp_pd_mask(x, mTH_inf, _CMP_EQ_OQ);
+        ninf_mask = _mm512_cmp_pd_mask(x, mTH_ninf, _CMP_EQ_OQ);
          __m512i x_abs = _mm512_and_epi64(_mm512_castpd_si512(x),
                                  _mm512_set1_epi64(0x7FFFFFFFFFFFFFFF));
          nearzero_mask = _mm512_cmp_pd_mask(_mm512_castsi512_pd(x_abs),
@@ -776,7 +819,8 @@ AVX512F_exp_DOUBLE(npy_double * op,
          nearzero_mask = _mm512_kxor(nearzero_mask, nan_mask);
          overflow_mask = _mm512_kor(overflow_mask,
                                  _mm512_kxor(xmax_mask, inf_mask));
-        underflow_mask = _mm512_kor(underflow_mask, xmin_mask);
+        underflow_mask = _mm512_kor(underflow_mask,
+                                _mm512_kxor(xmin_mask, ninf_mask));
          x = avx512_set_masked_lanes_pd(x, zeros_d,
                          _mm512_kor(_mm512_kor(nan_mask, xmin_mask),
                              _mm512_kor(xmax_mask, nearzero_mask)));
@@ -1080,7 +1124,8 @@ AVX512F_log_DOUBLE(npy_double * op,
  
  #undef WORKAROUND_LLVM__mm512_mask_mul_pd
  
-#endif // AVX512F_NOCLANG_BUG
+#endif // SIMD_AVX512F_NOCLANG_BUG
+#endif // NPY_CAN_LINK_SVML
  
  #ifdef SIMD_AVX512_SKX
  /**begin repeat
@@ -1293,17 +1338,34 @@ NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(FLOAT_@func@)
  NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(DOUBLE_@func@)
  (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data))
  {
+#if NPY_SIMD && defined(NPY_HAVE_AVX512_SKX) && defined(NPY_CAN_LINK_SVML)
+    const npy_double *src = (npy_double*)args[0];
+    npy_double *dst = (npy_double*)args[1];
+    const int lsize = sizeof(src[0]);
+    const npy_intp ssrc = steps[0] / lsize;
+    const npy_intp sdst = steps[1] / lsize;
+    const npy_intp len = dimensions[0];
+    assert(steps[0] % lsize == 0 && steps[1] % lsize == 0);
+    if (!is_mem_overlap(src, steps[0], dst, steps[1], len) &&
+            npyv_loadable_stride_f64(ssrc) &&
+            npyv_storable_stride_f64(sdst)) {
+        simd_@func@_f64(src, ssrc, dst, sdst, len);
+        return;
+    }
+#else
  #ifdef SIMD_AVX512F_NOCLANG_BUG
      if (IS_OUTPUT_BLOCKABLE_UNARY(sizeof(npy_double), sizeof(npy_double), 64)) {
          AVX512F_@func@_DOUBLE((npy_double*)args[1], (npy_double*)args[0], dimensions[0], steps[0]);
          return;
      }
-#endif
+#endif // SIMD_AVX512F_NOCLANG_BUG
+#endif // NPY_CAN_LINK_SVML
      UNARY_LOOP {
          const npy_double in1 = *(npy_double *)ip1;
          *(npy_double *)op1 = @scalar@(in1);
      }
  }
+
  /**end repeat**/
  
  /**begin repeat
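
Note on the `ninf_mask` change in the exp kernels above: the underflow flag is now raised only for finite inputs below the lower threshold, so that `exp(-inf)` returns 0 without a warning, in the same way that `exp(+inf)` was already excluded from the overflow flag. The following is a minimal scalar sketch of the per-lane mask logic. The function names and the threshold parameters are hypothetical, and the XOR in the kernels is written here as `!=` on booleans.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical scalar model of one SIMD lane (not NumPy API).
 * xmax/xmin are the thresholds beyond which exp() saturates. */
static bool exp_flags_overflow(double x, double xmax)
{
    assert(!isnan(xmax));
    bool above_max = x > xmax;
    bool is_pinf = (x == INFINITY);
    /* +inf always satisfies above_max, so XOR removes exactly that case. */
    return above_max != is_pinf;
}

static bool exp_flags_underflow(double x, double xmin)
{
    assert(!isnan(xmin));
    bool below_min = x < xmin;
    bool is_ninf = (x == -INFINITY);
    /* -inf always satisfies below_min, so XOR removes exactly that case. */
    return below_min != is_ninf;
}
```

NaN inputs fail every ordered comparison, so neither flag is raised for them in this model.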
diff --git a/numpy/core/src/umath/loops_hyperbolic.dispatch.c.src b/numpy/core/src/umath/loops_hyperbolic.dispatch.c.src

new file mode 100644 (file)

index 0000000..8cccc18
--- /dev/null
+++ b/numpy/core/src/umath/loops_hyperbolic.dispatch.c.src
@@ -0,0 +1,384 @@
+/*@targets
+ ** $maxopt baseline
+ ** (avx2 fma3) AVX512_SKX
+ ** vsx2 vsx4
+ ** neon_vfpv4
+ **/
+#include "numpy/npy_math.h"
+#include "simd/simd.h"
+#include "loops_utils.h"
+#include "loops.h"
+
+#if NPY_SIMD_FMA3 // native support
+/*
+ * NOTE: The following implementations of tanh(f32, f64) have been converted from
+ * Intel SVML to universal intrinsics, and the original code can be found in:
+ *
+ * - https://github.com/numpy/SVML/blob/main/linux/avx512/svml_z0_tanh_d_la.s
+ * - https://github.com/numpy/SVML/blob/main/linux/avx512/svml_z0_tanh_s_la.s
+ *
+ * ALGORITHM DESCRIPTION:
+ *
+ *   NOTE: Since the hyperbolic tangent function is odd
+ *         (tanh(x) = -tanh(-x)), below algorithm deals with the absolute
+ *         value of the argument |x|: tanh(x) = sign(x) * tanh(|x|)
+ *
+ *   We use a table lookup method to compute tanh(|x|).
+ *   The basic idea is to split the input range into a number of subintervals
+ *   and to approximate tanh(.) with a polynomial on each of them.
+ *
+ *   IEEE SPECIAL CONDITIONS:
+ *   x = [+,-]0, r = [+,-]0
+ *   x = +Inf,   r = +1
+ *   x = -Inf,   r = -1
+ *   x = QNaN,   r = QNaN
+ *   x = SNaN,   r = QNaN
+ *
+ *
+ *  ALGORITHM DETAILS
+ *
+ *  SVML handles |x| > HUGE_THRESHOLD, INF and NaNs by scalar callout as follows:
+ *  1. check special cases
+ *  2. return `+-1` for `|x| > HUGE_THRESHOLD`  otherwise return `x`
+ *
+ *  It wasn't clear to us why SVML uses a callout instead of using
+ *  AVX512 directly for single-precision.
+ *  However, we found it better to use SIMD instead of following SVML.
+ *
+ *  Main path computations are organized as follows:
+ *  We split the interval [0, SATURATION_THRESHOLD)
+ *  into a number of subintervals.  On each subinterval we approximate tanh(.)
+ *  with a minimax polynomial of pre-defined degree. Polynomial coefficients
+ *  are computed beforehand and stored in a table. We also use
+ *
+ *       y := |x| - B,
+ *
+ *  where B depends on the subinterval and is used to bring the argument
+ *  closer to zero (the code below subtracts the table value B from |x|).
+ *  We also add a large fake interval [SATURATION_THRESHOLD, HUGE_THRESHOLD],
+ *  where the coefficients 1.0 + 0.0*y + 0.0*y^2 ... are stored, just to
+ *  preserve the main path computation logic but return 1.0 for all arguments.
+ *
+ *   Hence reconstruction looks as follows:
+ *   we extract proper polynomial and range reduction coefficients
+ *        (Pj and B), corresponding to subinterval, to which |x| belongs,
+ *        and return
+ *
+ *       r := sign(x) * (P0 + P1 * y + ... + Pn * y^n)
+ *
+ *   NOTE: we use a multiprecision technique to multiply and sum the first
+ *         K terms of the polynomial. So Pj, j = 0..K are stored in the
+ *         table each as a pair of target precision numbers (Pj and PLj) to
+ *         achieve wider than target precision.
+ *
+ */
+#if NPY_SIMD_F64
+static void
+simd_tanh_f64(const double *src, npy_intp ssrc, double *dst, npy_intp sdst, npy_intp len)
+{
+    static const npy_uint64 NPY_DECL_ALIGNED(NPY_SIMD_WIDTH) lut16x18[] = {
+        // 0
+        0x0ull,                0x3fcc000000000000ull, 0x3fd4000000000000ull, 0x3fdc000000000000ull,
+        0x3fe4000000000000ull, 0x3fec000000000000ull, 0x3ff4000000000000ull, 0x3ffc000000000000ull,
+        0x4004000000000000ull, 0x400c000000000000ull, 0x4014000000000000ull, 0x401c000000000000ull,
+        0x4024000000000000ull, 0x402c000000000000ull, 0x4034000000000000ull, 0x0ull,
+        // 1
+        0x0ull,                0x3fcb8fd0416a7c92ull, 0x3fd35f98a0ea650eull, 0x3fda5729ee488037ull,
+        0x3fe1bf47eabb8f95ull, 0x3fe686650b8c2015ull, 0x3feb2523bb6b2deeull, 0x3fee1fbf97e33527ull,
+        0x3fef9258260a71c2ull, 0x3feff112c63a9077ull, 0x3fefff419668df11ull, 0x3feffffc832750f2ull,
+        0x3feffffffdc96f35ull, 0x3fefffffffffcf58ull, 0x3ff0000000000000ull, 0x3ff0000000000000ull,
+        // 2
+        0x3ff0000000000000ull, 0x3fee842ca3f08532ull, 0x3fed11574af58f1bull, 0x3fea945b9c24e4f9ull,
+        0x3fe6284c3374f815ull, 0x3fe02500a09f8d6eull, 0x3fd1f25131e3a8c0ull, 0x3fbd22ca1c24a139ull,
+        0x3f9b3afe1fba5c76ull, 0x3f6dd37d19b22b21ull, 0x3f27ccec13a9ef96ull, 0x3ecbe6c3f33250aeull,
+        0x3e41b4865394f75full, 0x3d8853f01bda5f28ull, 0x3c73953c0197ef58ull, 0x0ull,
+        // 3
+        0xbbf0b3ea3fdfaa19ull, 0xbfca48aaeb53bc21ull, 0xbfd19921f4329916ull, 0xbfd5e0f09bef8011ull,
+        0xbfd893b59c35c882ull, 0xbfd6ba7cb7576538ull, 0xbfce7291743d7555ull, 0xbfbb6d85a01efb80ull,
+        0xbf9addae58c7141aull, 0xbf6dc59376c7aa19ull, 0xbf27cc5e74677410ull, 0xbecbe6c0e8b4cc87ull,
+        0xbe41b486526b0565ull, 0xbd8853f01bef63a4ull, 0xbc73955be519be31ull, 0x0ull,
+        // 4
+        0xbfd5555555555555ull, 0xbfd183afc292ba11ull, 0xbfcc1a4b039c9bfaull, 0xbfc16e1e6d8d0be6ull,
+        0xbf92426c751e48a2ull, 0x3fb4f152b2bad124ull, 0x3fbbba40cbef72beull, 0x3fb01ba038be6a3dull,
+        0x3f916df44871efc8ull, 0x3f63c6869dfc8870ull, 0x3f1fb9aef915d828ull, 0x3ec299d1e27c6e11ull,
+        0x3e379b5ddcca334cull, 0x3d8037f57bc62c9aull, 0x3c6a2d4b50a2cff7ull, 0x0ull,
+        // 5
+        0xbce6863ee44ed636ull, 0x3fc04dcd0476c75eull, 0x3fc43d3449a80f08ull, 0x3fc5c26f3699b7e7ull,
+        0x3fc1a686f6ab2533ull, 0x3faf203c316ce730ull, 0xbf89c7a02788557cull, 0xbf98157e26e0d541ull,
+        0xbf807b55c1c7d278ull, 0xbf53a18d5843190full, 0xbf0fb6bbc89b1a5bull, 0xbeb299c9c684a963ull,
+        0xbe279b5dd4fb3d01ull, 0xbd7037f57ae72aa6ull, 0xbc5a2ca2bba78e86ull, 0x0ull,
+        // 6
+        0x3fc1111111112ab5ull, 0x3fb5c19efdfc08adull, 0x3fa74c98dc34fbacull, 0xbf790d6a8eff0a77ull,
+        0xbfac3c021789a786ull, 0xbfae2196b7326859ull, 0xbf93a7a011ff8c2aull, 0x3f6e4709c7e8430eull,
+        0x3f67682afa611151ull, 0x3f3ef2ee77717cbfull, 0x3ef95a4482f180b7ull, 0x3e9dc2c27da3b603ull,
+        0x3e12e2afd9f7433eull, 0x3d59f320348679baull, 0x3c44b61d9bbcc940ull, 0x0ull,
+        // 7
+        0xbda1ea19ddddb3b4ull, 0xbfb0b8df995ce4dfull, 0xbfb2955cf41e8164ull, 0xbfaf9d05c309f7c6ull,
+        0xbf987d27ccff4291ull, 0x3f8b2ca62572b098ull, 0x3f8f1cf6c7f5b00aull, 0x3f60379811e43dd5ull,
+        0xbf4793826f78537eull, 0xbf2405695e36240full, 0xbee0e08de39ce756ull, 0xbe83d709ba5f714eull,
+        0xbdf92e3fc5ee63e0ull, 0xbd414cc030f2110eull, 0xbc2ba022e8d82a87ull, 0x0ull,
+        // 8
+        0xbfaba1ba1990520bull, 0xbf96e37bba52f6fcull, 0x3ecff7df18455399ull, 0x3f97362834d33a4eull,
+        0x3f9e7f8380184b45ull, 0x3f869543e7c420d4ull, 0xbf7326bd4914222aull, 0xbf5fc15b0a9d98faull,
+        0x3f14cffcfa69fbb6ull, 0x3f057e48e5b79d10ull, 0x3ec33b66d7d77264ull, 0x3e66ac4e578b9b10ull,
+        0x3ddcc74b8d3d5c42ull, 0x3d23c589137f92b4ull, 0x3c107f8e2c8707a1ull, 0x0ull,
+        // 9
+        0xbe351ca7f096011full, 0x3f9eaaf3320c3851ull, 0x3f9cf823fe761fc1ull, 0x3f9022271754ff1full,
+        0xbf731fe77c9c60afull, 0xbf84a6046865ec7dull, 0xbf4ca3f1f2b9192bull, 0x3f4c77dee0afd227ull,
+        0x3f04055bce68597aull, 0xbee2bf0cb4a71647ull, 0xbea31eaafe73efd5ull, 0xbe46abb02c4368edull,
+        0xbdbcc749ca8079ddull, 0xbd03c5883836b9d2ull, 0xbbf07a5416264aecull, 0x0ull,
+        // 10
+        0x3f9664f94e6ac14eull, 0xbf94d3343bae39ddull, 0xbf7bc748e60df843ull, 0xbf8c89372b43ba85ull,
+        0xbf8129a092de747aull, 0x3f60c85b4d538746ull, 0x3f5be9392199ec18ull, 0xbf2a0c68a4489f10ull,
+        0xbf00462601dc2faaull, 0x3eb7b6a219dea9f4ull, 0x3e80cbcc8d4c5c8aull, 0x3e2425bb231a5e29ull,
+        0x3d9992a4beac8662ull, 0x3ce191ba5ed3fb67ull, 0x3bc892450bad44c4ull, 0x0ull,
+        // 11
+        0xbea8c4c1fd7852feull, 0xbfccce16b1046f13ull, 0xbf81a16f224bb7b6ull, 0xbf62cbf00406bc09ull,
+        0x3f75b29bb02cf69bull, 0x3f607df0f9f90c17ull, 0xbf4b852a6e0758d5ull, 0xbf0078c63d1b8445ull,
+        0x3eec12eadd55be7aull, 0xbe6fa600f593181bull, 0xbe5a3c935dce3f7dull, 0xbe001c6d95e3ae96ull,
+        0xbd74755a00ea1fd3ull, 0xbcbc1c6c063bb7acull, 0xbba3be9a4460fe00ull, 0x0ull,
+        // 12
+        0xbf822404577aa9ddull, 0x403d8b07f7a82aa3ull, 0xbf9f44ab92fbab0aull, 0x3fb2eac604473d6aull,
+        0x3f45f87d903aaac8ull, 0xbf5e104671036300ull, 0x3f19bc98ddf0f340ull, 0x3f0d4304bc9246e8ull,
+        0xbed13c415f7b9d41ull, 0xbe722b8d9720cdb0ull, 0x3e322666d739bec0ull, 0x3dd76a553d7e7918ull,
+        0x3d4de0fa59416a39ull, 0x3c948716cf3681b4ull, 0x3b873f9f2d2fda99ull, 0x0ull,
+        // 13
+        0xbefdd99a221ed573ull, 0x4070593a3735bab4ull, 0xbfccab654e44835eull, 0x3fd13ed80037dbacull,
+        0xbf6045b9076cc487ull, 0x3f2085ee7e8ac170ull, 0x3f23524622610430ull, 0xbeff12a6626911b4ull,
+        0x3eab9008bca408afull, 0x3e634df71865f620ull, 0xbe05bb1bcf83ca73ull, 0xbdaf2ac143fb6762ull,
+        0xbd23eae52a3dbf57ull, 0xbc6b5e3e9ca0955eull, 0xbb5eca68e2c1ba2eull, 0x0ull,
+        // 14
+        0x3f6e3be689423841ull, 0xc0d263511f5baac1ull, 0x40169f73b15ebe5cull, 0xc025c1dd41cd6cb5ull,
+        0xbf58fd89fe05e0d1ull, 0x3f73f7af01d5af7aull, 0xbf1e40bdead17e6bull, 0x3ee224cd6c4513e5ull,
+        0xbe24b645e68eeaa3ull, 0xbe4abfebfb72bc83ull, 0x3dd51c38f8695ed3ull, 0x3d8313ac38c6832bull,
+        0x3cf7787935626685ull, 0x3c401ffc49c6bc29ull, 0xbabf0b21acfa52abull, 0x0ull,
+        // 15
+        0xbf2a1306713a4f3aull, 0xc1045e509116b066ull, 0x4041fab9250984ceull, 0xc0458d090ec3de95ull,
+        0xbf74949d60113d63ull, 0x3f7c9fd6200d0adeull, 0x3f02cd40e0ad0a9full, 0xbe858ab8e019f311ull,
+        0xbe792fa6323b7cf8ull, 0x3e2df04d67876402ull, 0xbd95c72be95e4d2cull, 0xbd55a89c30203106ull,
+        0xbccad6b3bb9eff65ull, 0xbc12705ccd3dd884ull, 0xba8e0a4c47ae75f5ull, 0x0ull,
+        // 16
+        0xbf55d7e76dc56871ull, 0x41528c38809c90c7ull, 0xc076d57fb5190b02ull, 0x4085f09f888f8adaull,
+        0x3fa246332a2fcba5ull, 0xbfb29d851a896fcdull, 0x3ed9065ae369b212ull, 0xbeb8e1ba4c98a030ull,
+        0x3e6ffd0766ad4016ull, 0xbe0c63c29f505f5bull, 0xbd7fab216b9e0e49ull, 0x3d2826b62056aa27ull,
+        0x3ca313e31762f523ull, 0x3bea37aa21895319ull, 0x3ae5c7f1fd871496ull, 0x0ull,
+        // 17
+        0x3f35e67ab76a26e7ull, 0x41848ee0627d8206ull, 0xc0a216d618b489ecull, 0x40a5b89107c8af4full,
+        0x3fb69d8374520edaull, 0xbfbded519f981716ull, 0xbef02d288b5b3371ull, 0x3eb290981209c1a6ull,
+        0xbe567e924bf5ff6eull, 0x3de3f7f7de6b0eb6ull, 0x3d69ed18bae3ebbcull, 0xbcf7534c4f3dfa71ull,
+        0xbc730b73f1eaff20ull, 0xbbba2cff8135d462ull, 0xbab5a71b5f7d9035ull, 0x0ull
+    };
+    const int nlanes = npyv_nlanes_f64;
+    const npyv_f64 qnan = npyv_setall_f64(NPY_NAN);
+    for (; len > 0; len -= nlanes, src += ssrc*nlanes, dst += sdst*nlanes) {
+        npyv_f64 x;
+        if (ssrc == 1) {
+            x = npyv_load_tillz_f64(src, len);
+        } else {
+            x = npyv_loadn_tillz_f64(src, ssrc, len);
+        }
+        npyv_s64 ndnan = npyv_and_s64(npyv_reinterpret_s64_f64(x), npyv_setall_s64(0x7ff8000000000000ll));
+        // |x| > HUGE_THRESHOLD, INF and NaNs.
+        npyv_b64 special_m = npyv_cmple_s64(ndnan, npyv_setall_s64(0x7fe0000000000000ll));
+        npyv_b64 nnan_m = npyv_notnan_f64(x);
+        npyv_s64 idxs = npyv_sub_s64(ndnan, npyv_setall_s64(0x3fc0000000000000ll));
+        // there is no native 64-bit max/min, and it's fine to use 32-bit max/min
+        // since we're not crossing the 32-bit edge
+        npyv_s32 idxl = npyv_max_s32(npyv_reinterpret_s32_s64(idxs), npyv_zero_s32());
+                 idxl = npyv_min_s32(idxl, npyv_setall_s32(0x780000));
+        npyv_u64 idx  = npyv_shri_u64(npyv_reinterpret_u64_s32(idxl), 51);
+
+        npyv_f64 b = npyv_lut16_f64((const double*)lut16x18 + 16*0, idx);
+        npyv_f64 c0 = npyv_lut16_f64((const double*)lut16x18 + 1*16, idx);
+        npyv_f64 c1 = npyv_lut16_f64((const double*)lut16x18 + 2*16, idx);
+        npyv_f64 c2 = npyv_lut16_f64((const double*)lut16x18 + 3*16, idx);
+        npyv_f64 c3 = npyv_lut16_f64((const double*)lut16x18 + 4*16, idx);
+        npyv_f64 c4 = npyv_lut16_f64((const double*)lut16x18 + 5*16, idx);
+        npyv_f64 c5 = npyv_lut16_f64((const double*)lut16x18 + 6*16, idx);
+        npyv_f64 c6 = npyv_lut16_f64((const double*)lut16x18 + 7*16, idx);
+        npyv_f64 c7 = npyv_lut16_f64((const double*)lut16x18 + 8*16, idx);
+        npyv_f64 c8 = npyv_lut16_f64((const double*)lut16x18 + 9*16, idx);
+        npyv_f64 c9 = npyv_lut16_f64((const double*)lut16x18 + 10*16, idx);
+        npyv_f64 c10 = npyv_lut16_f64((const double*)lut16x18 + 11*16, idx);
+        npyv_f64 c11 = npyv_lut16_f64((const double*)lut16x18 + 12*16, idx);
+        npyv_f64 c12 = npyv_lut16_f64((const double*)lut16x18 + 13*16, idx);
+        npyv_f64 c13 = npyv_lut16_f64((const double*)lut16x18 + 14*16, idx);
+        npyv_f64 c14 = npyv_lut16_f64((const double*)lut16x18 + 15*16, idx);
+        npyv_f64 c15 = npyv_lut16_f64((const double*)lut16x18 + 16*16, idx);
+        npyv_f64 c16 = npyv_lut16_f64((const double*)lut16x18 + 17*16, idx);
+
+        // no need to zero out NaNs or avoid FP exceptions by NO_EXC like SVML does
+        // since we're clearing the FP status anyway.
+        npyv_f64 sign = npyv_and_f64(x, npyv_reinterpret_f64_s64(npyv_setall_s64(0x8000000000000000ull)));
+        npyv_f64 y = npyv_sub_f64(npyv_abs_f64(x), b);
+        npyv_f64 r = npyv_muladd_f64(c16, y, c15);
+        r = npyv_muladd_f64(r, y, c14);
+        r = npyv_muladd_f64(r, y, c13);
+        r = npyv_muladd_f64(r, y, c12);
+        r = npyv_muladd_f64(r, y, c11);
+        r = npyv_muladd_f64(r, y, c10);
+        r = npyv_muladd_f64(r, y, c9);
+        r = npyv_muladd_f64(r, y, c8);
+        r = npyv_muladd_f64(r, y, c7);
+        r = npyv_muladd_f64(r, y, c6);
+        r = npyv_muladd_f64(r, y, c5);
+        r = npyv_muladd_f64(r, y, c4);
+        r = npyv_muladd_f64(r, y, c3);
+        r = npyv_muladd_f64(r, y, c2);
+        r = npyv_muladd_f64(r, y, c1);
+        r = npyv_muladd_f64(r, y, c0);
+        // 1.0 if |x| > HUGE_THRESHOLD || INF
+        r = npyv_select_f64(special_m, r, npyv_setall_f64(1.0));
+        r = npyv_or_f64(r, sign);
+        // qnan if nan
+        r = npyv_select_f64(nnan_m, r, qnan);
+        if (sdst == 1) {
+            npyv_store_till_f64(dst, len, r);
+        } else {
+            npyv_storen_till_f64(dst, sdst, len, r);
+        }
+    }
+}
+#endif // NPY_SIMD_F64
+static void
+simd_tanh_f32(const float *src, npy_intp ssrc, float *dst, npy_intp sdst, npy_intp len)
+{
+    static const npy_uint32 NPY_DECL_ALIGNED(NPY_SIMD_WIDTH) lut32x8[] = {
+        // 0
+        0x0,        0x3d700000, 0x3d900000, 0x3db00000, 0x3dd00000, 0x3df00000, 0x3e100000, 0x3e300000,
+        0x3e500000, 0x3e700000, 0x3e900000, 0x3eb00000, 0x3ed00000, 0x3ef00000, 0x3f100000, 0x3f300000,
+        0x3f500000, 0x3f700000, 0x3f900000, 0x3fb00000, 0x3fd00000, 0x3ff00000, 0x40100000, 0x40300000,
+        0x40500000, 0x40700000, 0x40900000, 0x40b00000, 0x40d00000, 0x40f00000, 0x41100000, 0x0,
+        // 1
+        0x0,        0x3d6fb9c9, 0x3d8fc35f, 0x3daf9169, 0x3dcf49ab, 0x3deee849, 0x3e0f0ee8, 0x3e2e4984,
+        0x3e4d2f8e, 0x3e6bb32e, 0x3e8c51cd, 0x3ea96163, 0x3ec543f1, 0x3edfd735, 0x3f028438, 0x3f18abf0,
+        0x3f2bc480, 0x3f3bec1c, 0x3f4f2e5b, 0x3f613c53, 0x3f6ce37d, 0x3f743c4f, 0x3f7a5feb, 0x3f7dea85,
+        0x3f7f3b3d, 0x3f7fb78c, 0x3f7fefd4, 0x3f7ffdd0, 0x3f7fffb4, 0x3f7ffff6, 0x3f7fffff, 0x3f800000,
+        // 2
+        0x3f800000, 0x3f7f1f84, 0x3f7ebd11, 0x3f7e1e5f, 0x3f7d609f, 0x3f7c842d, 0x3f7b00e5, 0x3f789580,
+        0x3f75b8ad, 0x3f726fd9, 0x3f6cc59b, 0x3f63fb92, 0x3f59ff97, 0x3f4f11d7, 0x3f3d7573, 0x3f24f360,
+        0x3f0cbfe7, 0x3eec1a69, 0x3eb0a801, 0x3e6753a2, 0x3e132f1a, 0x3db7e7d3, 0x3d320845, 0x3c84d3d4,
+        0x3bc477b7, 0x3b10d3da, 0x3a01601e, 0x388c1a3b, 0x3717b0da, 0x35a43bce, 0x338306c6, 0x0,
+        // 3
+        0xb0343c7b, 0xbd6ee69d, 0xbd8f0da7, 0xbdae477d, 0xbdcd2a1f, 0xbdeba80d, 0xbe0c443b, 0xbe293cf3,
+        0xbe44f282, 0xbe5f3651, 0xbe81c7c0, 0xbe96d7ca, 0xbea7fb8e, 0xbeb50e9e, 0xbec12efe, 0xbec4be92,
+        0xbebce070, 0xbead510e, 0xbe8ef7d6, 0xbe4b8704, 0xbe083237, 0xbdaf7449, 0xbd2e1ec4, 0xbc83bf06,
+        0xbbc3e0b5, 0xbb10aadc, 0xba0157db, 0xb88c18f2, 0xb717b096, 0xb5a43bae, 0xb383012c, 0x0,
+        // 4
+        0xbeaaaaa5, 0xbeab0612, 0xbea7f01f, 0xbea4e120, 0xbea387b7, 0xbea15962, 0xbe9d57f7, 0xbe976b5a,
+        0xbe90230d, 0xbe880dff, 0xbe7479b3, 0xbe4c3d88, 0xbe212482, 0xbdeb8cba, 0xbd5e78ad, 0x3c6b5e6e,
+        0x3d839143, 0x3dc21ee1, 0x3de347af, 0x3dcbec96, 0x3d99ef2d, 0x3d542ea1, 0x3cdde701, 0x3c2cca67,
+        0x3b81cb27, 0x3ac073a1, 0x39ac3032, 0x383a94d9, 0x36ca081d, 0x355abd4c, 0x332b3cb6, 0x0,
+        // 5
+        0xb76dd6b9, 0xbe1c276d, 0x3c1dcf2f, 0x3dc1a78d, 0x3d96f985, 0x3da2b61b, 0x3dc13397, 0x3dd2f670,
+        0x3df48a0a, 0x3e06c5a8, 0x3e1a3aba, 0x3e27c405, 0x3e2e78d0, 0x3e2c3e44, 0x3e1d3097, 0x3df4a8f4,
+        0x3da38508, 0x3d31416a, 0x3b562657, 0xbcaeeac9, 0xbcce9419, 0xbcaaeac4, 0xbc49e7d0, 0xbba71ddd,
+        0xbb003b0e, 0xba3f9a05, 0xb92c08a7, 0xb7ba9232, 0xb64a0b0f, 0xb4dac169, 0xb2ab78ac, 0x0,
+        // 6
+        0x3e0910e9, 0x43761143, 0x4165ecdc, 0xc190f756, 0xc08c097d, 0xc02ba813, 0xbf7f6bda, 0x3f2b1dc0,
+        0x3ece105d, 0x3f426a94, 0xbadb0dc4, 0x3da43b17, 0xbd51ab88, 0xbcaea23d, 0xbd3b6d8d, 0xbd6caaad,
+        0xbd795bed, 0xbd5fddda, 0xbd038f3b, 0xbc1cad63, 0x3abb4766, 0x3b95f10b, 0x3b825873, 0x3afaea66,
+        0x3a49f878, 0x39996bf3, 0x388f3e6c, 0x371bb0e3, 0x35a8a5e6, 0x34369b17, 0x322487b0, 0x0,
+        // 7
+        0xbc0e2f66, 0x460bda12, 0x43d638ef, 0xc3e11c3e, 0xc2baa4e9, 0xc249da2d, 0xc1859b82, 0x40dd5b57,
+        0x40494640, 0x40c730a8, 0xbf0f160e, 0x3e30e76f, 0xbea81387, 0xbdb26a1c, 0xbd351e57, 0xbb4c01a0,
+        0x3c1d7bfb, 0x3c722cd1, 0x3c973f1c, 0x3c33a31b, 0x3b862ef4, 0x3a27b3d0, 0xba3b5907, 0xba0efc22,
+        0xb97f9f0f, 0xb8c8af50, 0xb7bdddfb, 0xb64f2950, 0xb4e085b1, 0xb3731dfa, 0xb15a1f04, 0x0
+    };
+
+    const int nlanes = npyv_nlanes_f32;
+    const npyv_f32 qnan = npyv_setall_f32(NPY_NANF);
+    for (; len > 0; len -= nlanes, src += ssrc*nlanes, dst += sdst*nlanes) {
+        npyv_f32 x;
+        if (ssrc == 1) {
+            x = npyv_load_tillz_f32(src, len);
+        } else {
+            x = npyv_loadn_tillz_f32(src, ssrc, len);
+        }
+        npyv_s32 ndnan = npyv_and_s32(npyv_reinterpret_s32_f32(x), npyv_setall_s32(0x7fe00000));
+        // check |x| > HUGE_THRESHOLD, INF and NaNs.
+        npyv_b32 special_m = npyv_cmple_s32(ndnan, npyv_setall_s32(0x7f000000));
+        npyv_b32 nnan_m = npyv_notnan_f32(x);
+        npyv_s32 idxs = npyv_sub_s32(ndnan, npyv_setall_s32(0x3d400000));
+                 idxs = npyv_max_s32(idxs, npyv_zero_s32());
+                 idxs = npyv_min_s32(idxs, npyv_setall_s32(0x3e00000));
+        npyv_u32 idx  = npyv_shri_u32(npyv_reinterpret_u32_s32(idxs), 21);
+
+        npyv_f32 b  = npyv_lut32_f32((const float*)lut32x8 + 32*0, idx);
+        npyv_f32 c0 = npyv_lut32_f32((const float*)lut32x8 + 32*1, idx);
+        npyv_f32 c1 = npyv_lut32_f32((const float*)lut32x8 + 32*2, idx);
+        npyv_f32 c2 = npyv_lut32_f32((const float*)lut32x8 + 32*3, idx);
+        npyv_f32 c3 = npyv_lut32_f32((const float*)lut32x8 + 32*4, idx);
+        npyv_f32 c4 = npyv_lut32_f32((const float*)lut32x8 + 32*5, idx);
+        npyv_f32 c5 = npyv_lut32_f32((const float*)lut32x8 + 32*6, idx);
+        npyv_f32 c6 = npyv_lut32_f32((const float*)lut32x8 + 32*7, idx);
+
+        // no need to zero out NaNs or avoid FP exceptions by NO_EXC like SVML does
+        // since we're clearing the FP status anyway.
+        npyv_f32 sign = npyv_and_f32(x, npyv_reinterpret_f32_u32(npyv_setall_u32(0x80000000)));
+        npyv_f32 y = npyv_sub_f32(npyv_abs_f32(x), b);
+        npyv_f32 r = npyv_muladd_f32(c6, y, c5);
+        r = npyv_muladd_f32(r, y, c4);
+        r = npyv_muladd_f32(r, y, c3);
+        r = npyv_muladd_f32(r, y, c2);
+        r = npyv_muladd_f32(r, y, c1);
+        r = npyv_muladd_f32(r, y, c0);
+        // 1.0 if |x| > HUGE_THRESHOLD || INF
+        r = npyv_select_f32(special_m, r, npyv_setall_f32(1.0f));
+        r = npyv_or_f32(r, sign);
+        // qnan if nan
+        r = npyv_select_f32(nnan_m, r, qnan);
+        if (sdst == 1) {
+            npyv_store_till_f32(dst, len, r);
+        } else {
+            npyv_storen_till_f32(dst, sdst, len, r);
+        }
+    }
+}
+#endif // NPY_SIMD_FMA3
+
+/**begin repeat
+ * #TYPE = FLOAT, DOUBLE#
+ * #type = float, double#
+ * #sfx  = f32,   f64#
+ * #ssfx = f,     #
+ * #simd = NPY_SIMD_FMA3, NPY_SIMD_FMA3 && NPY_SIMD_F64#
+ */
+/**begin repeat1
+ *  #func = tanh#
+ *  #simd_req_clear = 1#
+ */
+NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_@func@)
+(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data))
+{
+    const @type@ *src = (@type@*)args[0];
+          @type@ *dst = (@type@*)args[1];
+
+    const int lsize = sizeof(src[0]);
+    const npy_intp ssrc = steps[0] / lsize;
+    const npy_intp sdst = steps[1] / lsize;
+    npy_intp len = dimensions[0];
+    assert(len <= 1 || (steps[0] % lsize == 0 && steps[1] % lsize == 0));
+#if @simd@
+    if (is_mem_overlap(src, steps[0], dst, steps[1], len) ||
+        !npyv_loadable_stride_@sfx@(ssrc) || !npyv_storable_stride_@sfx@(sdst)
+    ) {
+        for (; len > 0; --len, src += ssrc, dst += sdst) {
+            simd_@func@_@sfx@(src, 1, dst, 1, 1);
+        }
+    } else {
+        simd_@func@_@sfx@(src, ssrc, dst, sdst, len);
+    }
+    npyv_cleanup();
+    #if @simd_req_clear@
+        npy_clear_floatstatus_barrier((char*)dimensions);
+    #endif
+#else
+    for (; len > 0; --len, src += ssrc, dst += sdst) {
+        const @type@ src0 = *src;
+        *dst = npy_@func@@ssfx@(src0);
+    }
+#endif
+}
+/**end repeat1**/
+/**end repeat**/
diff --git a/numpy/core/src/umath/loops_minmax.dispatch.c.src b/numpy/core/src/umath/loops_minmax.dispatch.c.src

new file mode 100644

index 0000000..ba2288f
--- /dev/null
+++ b/numpy/core/src/umath/loops_minmax.dispatch.c.src
@@ -0,0 +1,553 @@
+/*@targets
+ ** $maxopt baseline
+ ** neon asimd
+ ** sse2 avx2 avx512_skx
+ ** vsx2
+ **/
+#define _UMATHMODULE
+#define _MULTIARRAYMODULE
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "simd/simd.h"
+#include "loops_utils.h"
+#include "loops.h"
+#include "lowlevel_strided_loops.h"
+// Provides the various *_LOOP macros
+#include "fast_loop_macros.h"
+
+/*******************************************************************************
+ ** Scalar intrinsics
+ ******************************************************************************/
+// signed/unsigned int
+#define scalar_max_i(A, B) ((A > B) ? A : B)
+#define scalar_min_i(A, B) ((A < B) ? A : B)
+// fp, propagates NaNs
+#define scalar_max(A, B) ((A >= B || npy_isnan(A)) ? A : B)
+#define scalar_max_f scalar_max
+#define scalar_max_d scalar_max
+#define scalar_max_l scalar_max
+#define scalar_min(A, B) ((A <= B || npy_isnan(A)) ? A : B)
+#define scalar_min_f scalar_min
+#define scalar_min_d scalar_min
+#define scalar_min_l scalar_min
+// fp, ignores NaNs
+#define scalar_maxp_f fmaxf
+#define scalar_maxp_d fmax
+#define scalar_maxp_l fmaxl
+#define scalar_minp_f fminf
+#define scalar_minp_d fmin
+#define scalar_minp_l fminl
+
+// special optimization for fp scalars that propagate NaNs,
+// since C99 provides no NaN-propagating min/max
+#ifndef NPY_DISABLE_OPTIMIZATION
+/**begin repeat
+ * #type = npy_float, npy_double#
+ * #sfx = f32, f64#
+ * #c_sfx = f, d#
+ * #isa_sfx = s, d#
+ * #sse_type = __m128, __m128d#
+ */
+/**begin repeat1
+ * #op = max, min#
+ * #neon_instr = fmax, fmin#
+ */
+#ifdef NPY_HAVE_SSE2
+#undef scalar_@op@_@c_sfx@
+NPY_FINLINE @type@ scalar_@op@_@c_sfx@(@type@ a, @type@ b) {
+    @sse_type@ va = _mm_set_s@isa_sfx@(a);
+    @sse_type@ vb = _mm_set_s@isa_sfx@(b);
+    @sse_type@ rv = _mm_@op@_s@isa_sfx@(va, vb);
+    // x86 returns the second operand on NaN, so only check the first
+    @sse_type@ nn = _mm_cmpord_s@isa_sfx@(va, va);
+    #ifdef NPY_HAVE_SSE41
+    rv = _mm_blendv_p@isa_sfx@(va, rv, nn);
+    #else
+    rv = _mm_xor_p@isa_sfx@(va, _mm_and_p@isa_sfx@(_mm_xor_p@isa_sfx@(va, rv), nn));
+    #endif
+    return _mm_cvts@isa_sfx@_@sfx@(rv);
+}
+#endif // SSE2
+#ifdef __aarch64__
+#undef scalar_@op@_@c_sfx@
+NPY_FINLINE @type@ scalar_@op@_@c_sfx@(@type@ a, @type@ b) {
+    @type@ result = 0;
+    __asm(
+        "@neon_instr@ %@isa_sfx@[result], %@isa_sfx@[a], %@isa_sfx@[b]"
+        : [result] "=w" (result)
+        : [a] "w" (a), [b] "w" (b)
+    );
+    return result;
+}
+#endif // __aarch64__
+/**end repeat1**/
+/**end repeat**/
+#endif // NPY_DISABLE_OPTIMIZATION
+// map long double to double when both have the same size
+#if NPY_BITSOF_DOUBLE == NPY_BITSOF_LONGDOUBLE
+/**begin repeat
+ * #op = max, min, maxp, minp#
+ */
+    #undef scalar_@op@_l
+    #define scalar_@op@_l scalar_@op@_d
+/**end repeat**/
+#endif
+
+/*******************************************************************************
+ ** extra SIMD intrinsics
+ ******************************************************************************/
+
+#if NPY_SIMD
+/**begin repeat
+ * #sfx = s8, u8, s16, u16, s32, u32, s64, u64#
+ * #is_64 = 0*6, 1*2#
+ */
+#if defined(NPY_HAVE_ASIMD) && defined(__aarch64__)
+    #if !@is_64@
+        #define npyv_reduce_min_@sfx@ vminvq_@sfx@
+        #define npyv_reduce_max_@sfx@ vmaxvq_@sfx@
+    #else
+        NPY_FINLINE npyv_lanetype_@sfx@ npyv_reduce_min_@sfx@(npyv_@sfx@ v)
+        {
+            npyv_lanetype_@sfx@ a = vgetq_lane_@sfx@(v, 0);
+            npyv_lanetype_@sfx@ b = vgetq_lane_@sfx@(v, 1);
+            npyv_lanetype_@sfx@ result = (a < b) ? a : b;
+            return result;
+        }
+        NPY_FINLINE npyv_lanetype_@sfx@ npyv_reduce_max_@sfx@(npyv_@sfx@ v)
+        {
+            npyv_lanetype_@sfx@ a = vgetq_lane_@sfx@(v, 0);
+            npyv_lanetype_@sfx@ b = vgetq_lane_@sfx@(v, 1);
+            npyv_lanetype_@sfx@ result = (a > b) ? a : b;
+            return result;
+        }
+    #endif // !@is_64@
+#else
+    /**begin repeat1
+     * #intrin = min, max#
+     */
+    NPY_FINLINE npyv_lanetype_@sfx@ npyv_reduce_@intrin@_@sfx@(npyv_@sfx@ v)
+    {
+        npyv_lanetype_@sfx@ NPY_DECL_ALIGNED(NPY_SIMD_WIDTH) s[npyv_nlanes_@sfx@];
+        npyv_storea_@sfx@(s, v);
+        npyv_lanetype_@sfx@ result = s[0];
+        for(int i=1; i<npyv_nlanes_@sfx@; ++i){
+            result = scalar_@intrin@_i(result, s[i]);
+        }
+        return result;
+    }
+    /**end repeat1**/
+#endif
+/**end repeat**/
+#endif // NPY_SIMD
+
+/**begin repeat
+ * #sfx = f32, f64#
+ * #bsfx = b32, b64#
+ * #simd_chk = NPY_SIMD, NPY_SIMD_F64#
+ * #scalar_sfx = f, d#
+ */
+#if @simd_chk@
+#if defined(NPY_HAVE_ASIMD) && defined(__aarch64__)
+    #define npyv_minn_@sfx@ vminq_@sfx@
+    #define npyv_maxn_@sfx@ vmaxq_@sfx@
+    #define npyv_reduce_minn_@sfx@ vminvq_@sfx@
+    #define npyv_reduce_maxn_@sfx@ vmaxvq_@sfx@
+    #define npyv_reduce_minp_@sfx@ vminnmvq_@sfx@
+    #define npyv_reduce_maxp_@sfx@ vmaxnmvq_@sfx@
+#else
+    /**begin repeat1
+     * #intrin = min, max#
+     */
+    // propagates NaNs
+    NPY_FINLINE npyv_@sfx@ npyv_@intrin@n_@sfx@(npyv_@sfx@ a, npyv_@sfx@ b)
+    {
+        npyv_@sfx@ result = npyv_@intrin@_@sfx@(a, b);
+        // x86 min/max already return the second operand (b) on NaN,
+        // so the explicit check on b is only needed on other targets
+    #ifndef NPY_HAVE_SSE2
+        result = npyv_select_@sfx@(npyv_notnan_@sfx@(b), result, b);
+    #endif
+        result = npyv_select_@sfx@(npyv_notnan_@sfx@(a), result, a);
+        return result;
+    }
+    /**end repeat1**/
+    /**begin repeat1
+     * #intrin = minn, maxn, minp, maxp#
+     * #scalar_intrin = min, max, minp, maxp#
+     */
+    NPY_FINLINE npyv_lanetype_@sfx@ npyv_reduce_@intrin@_@sfx@(npyv_@sfx@ v)
+    {
+        npyv_lanetype_@sfx@ NPY_DECL_ALIGNED(NPY_SIMD_WIDTH) s[npyv_nlanes_@sfx@];
+        npyv_storea_@sfx@(s, v);
+        npyv_lanetype_@sfx@ result = s[0];
+        for(int i=1; i<npyv_nlanes_@sfx@; ++i){
+            result = scalar_@scalar_intrin@_@scalar_sfx@(result, s[i]);
+        }
+        return result;
+    }
+    /**end repeat1**/
+#endif
+#endif // simd_chk
+/**end repeat**/
+
+/*******************************************************************************
+ ** Defining the SIMD kernels
+ ******************************************************************************/
+/**begin repeat
+ * #sfx = s8, u8, s16, u16, s32, u32, s64, u64, f32, f64#
+ * #simd_chk = NPY_SIMD*9, NPY_SIMD_F64#
+ * #is_fp = 0*8, 1, 1#
+ * #scalar_sfx = i*8, f, d#
+ */
+/**begin repeat1
+ * # intrin = max, min, maxp, minp#
+ * # fp_only = 0, 0, 1, 1#
+ */
+#define SCALAR_OP scalar_@intrin@_@scalar_sfx@
+#if @simd_chk@ && (!@fp_only@ || (@is_fp@ && @fp_only@))
+
+#if @is_fp@ && !@fp_only@
+    #define V_INTRIN npyv_@intrin@n_@sfx@ // propagates NaNs
+    #define V_REDUCE_INTRIN npyv_reduce_@intrin@n_@sfx@
+#else
+    #define V_INTRIN npyv_@intrin@_@sfx@
+    #define V_REDUCE_INTRIN npyv_reduce_@intrin@_@sfx@
+#endif
+
+// contiguous input.
+static inline void
+simd_reduce_c_@intrin@_@sfx@(const npyv_lanetype_@sfx@ *ip, npyv_lanetype_@sfx@ *op1, npy_intp len)
+{
+    if (len < 1) {
+        return;
+    }
+    const int vstep = npyv_nlanes_@sfx@;
+    const int wstep = vstep*8;
+    npyv_@sfx@ acc = npyv_setall_@sfx@(op1[0]);
+    for (; len >= wstep; len -= wstep, ip += wstep) {
+    #ifdef NPY_HAVE_SSE2
+        NPY_PREFETCH(ip + wstep, 0, 3);
+    #endif
+        npyv_@sfx@ v0 = npyv_load_@sfx@(ip + vstep * 0);
+        npyv_@sfx@ v1 = npyv_load_@sfx@(ip + vstep * 1);
+        npyv_@sfx@ v2 = npyv_load_@sfx@(ip + vstep * 2);
+        npyv_@sfx@ v3 = npyv_load_@sfx@(ip + vstep * 3);
+
+        npyv_@sfx@ v4 = npyv_load_@sfx@(ip + vstep * 4);
+        npyv_@sfx@ v5 = npyv_load_@sfx@(ip + vstep * 5);
+        npyv_@sfx@ v6 = npyv_load_@sfx@(ip + vstep * 6);
+        npyv_@sfx@ v7 = npyv_load_@sfx@(ip + vstep * 7);
+
+        npyv_@sfx@ r01 = V_INTRIN(v0, v1);
+        npyv_@sfx@ r23 = V_INTRIN(v2, v3);
+        npyv_@sfx@ r45 = V_INTRIN(v4, v5);
+        npyv_@sfx@ r67 = V_INTRIN(v6, v7);
+        acc = V_INTRIN(acc, V_INTRIN(V_INTRIN(r01, r23), V_INTRIN(r45, r67)));
+    }
+    for (; len >= vstep; len -= vstep, ip += vstep) {
+        acc = V_INTRIN(acc, npyv_load_@sfx@(ip));
+    }
+    npyv_lanetype_@sfx@ r = V_REDUCE_INTRIN(acc);
+    // Scalar - finish up any remaining iterations
+    for (; len > 0; --len, ++ip) {
+        const npyv_lanetype_@sfx@ in2 = *ip;
+        r = SCALAR_OP(r, in2);
+    }
+    op1[0] = r;
+}
+
+// contiguous inputs and output.
+static inline void
+simd_binary_ccc_@intrin@_@sfx@(const npyv_lanetype_@sfx@ *ip1, const npyv_lanetype_@sfx@ *ip2,
+                                     npyv_lanetype_@sfx@ *op1, npy_intp len)
+{
+#if NPY_SIMD_WIDTH == 128
+    // Note, 6x unroll was chosen for best results on Apple M1
+    const int vectorsPerLoop = 6;
+#else
+    // To avoid memory bandwidth bottleneck
+    const int vectorsPerLoop = 2;
+#endif
+    const int elemPerVector = npyv_nlanes_@sfx@;
+    int elemPerLoop = vectorsPerLoop * elemPerVector;
+
+    npy_intp i = 0;
+
+    for (; (i+elemPerLoop) <= len; i += elemPerLoop) {
+        npyv_@sfx@ v0 = npyv_load_@sfx@(&ip1[i + 0 * elemPerVector]);
+        npyv_@sfx@ v1 = npyv_load_@sfx@(&ip1[i + 1 * elemPerVector]);
+    #if NPY_SIMD_WIDTH == 128
+        npyv_@sfx@ v2 = npyv_load_@sfx@(&ip1[i + 2 * elemPerVector]);
+        npyv_@sfx@ v3 = npyv_load_@sfx@(&ip1[i + 3 * elemPerVector]);
+        npyv_@sfx@ v4 = npyv_load_@sfx@(&ip1[i + 4 * elemPerVector]);
+        npyv_@sfx@ v5 = npyv_load_@sfx@(&ip1[i + 5 * elemPerVector]);
+    #endif
+        npyv_@sfx@ u0 = npyv_load_@sfx@(&ip2[i + 0 * elemPerVector]);
+        npyv_@sfx@ u1 = npyv_load_@sfx@(&ip2[i + 1 * elemPerVector]);
+    #if NPY_SIMD_WIDTH == 128
+        npyv_@sfx@ u2 = npyv_load_@sfx@(&ip2[i + 2 * elemPerVector]);
+        npyv_@sfx@ u3 = npyv_load_@sfx@(&ip2[i + 3 * elemPerVector]);
+        npyv_@sfx@ u4 = npyv_load_@sfx@(&ip2[i + 4 * elemPerVector]);
+        npyv_@sfx@ u5 = npyv_load_@sfx@(&ip2[i + 5 * elemPerVector]);
+    #endif
+        npyv_@sfx@ m0 = V_INTRIN(v0, u0);
+        npyv_@sfx@ m1 = V_INTRIN(v1, u1);
+    #if NPY_SIMD_WIDTH == 128
+        npyv_@sfx@ m2 = V_INTRIN(v2, u2);
+        npyv_@sfx@ m3 = V_INTRIN(v3, u3);
+        npyv_@sfx@ m4 = V_INTRIN(v4, u4);
+        npyv_@sfx@ m5 = V_INTRIN(v5, u5);
+    #endif
+        npyv_store_@sfx@(&op1[i + 0 * elemPerVector], m0);
+        npyv_store_@sfx@(&op1[i + 1 * elemPerVector], m1);
+    #if NPY_SIMD_WIDTH == 128
+        npyv_store_@sfx@(&op1[i + 2 * elemPerVector], m2);
+        npyv_store_@sfx@(&op1[i + 3 * elemPerVector], m3);
+        npyv_store_@sfx@(&op1[i + 4 * elemPerVector], m4);
+        npyv_store_@sfx@(&op1[i + 5 * elemPerVector], m5);
+    #endif
+    }
+    for (; (i+elemPerVector) <= len; i += elemPerVector) {
+        npyv_@sfx@ v0 = npyv_load_@sfx@(ip1 + i);
+        npyv_@sfx@ u0 = npyv_load_@sfx@(ip2 + i);
+        npyv_@sfx@ m0 = V_INTRIN(v0, u0);
+        npyv_store_@sfx@(op1 + i, m0);
+    }
+    // Scalar - finish up any remaining iterations
+    for (; i < len; ++i) {
+        const npyv_lanetype_@sfx@ in1 = ip1[i];
+        const npyv_lanetype_@sfx@ in2 = ip2[i];
+        op1[i] = SCALAR_OP(in1, in2);
+    }
+}
+// non-contiguous memory access, for 32/64-bit floats only
+#if @is_fp@
+static inline void
+simd_binary_@intrin@_@sfx@(const npyv_lanetype_@sfx@ *ip1, npy_intp sip1,
+                           const npyv_lanetype_@sfx@ *ip2, npy_intp sip2,
+                                 npyv_lanetype_@sfx@ *op1, npy_intp sop1,
+                                 npy_intp len)
+{
+    const int vstep = npyv_nlanes_@sfx@;
+    for (; len >= vstep; len -= vstep, ip1 += sip1*vstep,
+                         ip2 += sip2*vstep, op1 += sop1*vstep
+    ) {
+        npyv_@sfx@ a, b;
+        if (sip1 == 1) {
+            a = npyv_load_@sfx@(ip1);
+        } else {
+            a = npyv_loadn_@sfx@(ip1, sip1);
+        }
+        if (sip2 == 1) {
+            b = npyv_load_@sfx@(ip2);
+        } else {
+            b = npyv_loadn_@sfx@(ip2, sip2);
+        }
+        npyv_@sfx@ r = V_INTRIN(a, b);
+        if (sop1 == 1) {
+            npyv_store_@sfx@(op1, r);
+        } else {
+            npyv_storen_@sfx@(op1, sop1, r);
+        }
+    }
+    for (; len > 0; --len, ip1 += sip1, ip2 += sip2, op1 += sop1) {
+        const npyv_lanetype_@sfx@ a = *ip1;
+        const npyv_lanetype_@sfx@ b = *ip2;
+        *op1 = SCALAR_OP(a, b);
+    }
+}
+#endif
+
+#undef V_INTRIN
+#undef V_REDUCE_INTRIN
+
+#endif // simd_chk && (!fp_only || (is_fp && fp_only))
+
+#undef SCALAR_OP
+/**end repeat1**/
+/**end repeat**/
+
+/*******************************************************************************
+ ** Defining ufunc inner functions
+ ******************************************************************************/
+/**begin repeat
+ * #TYPE = UBYTE, USHORT, UINT, ULONG, ULONGLONG,
+ *         BYTE, SHORT, INT, LONG, LONGLONG,
+ *         FLOAT, DOUBLE, LONGDOUBLE#
+ *
+ * #BTYPE = BYTE, SHORT, INT,  LONG, LONGLONG,
+ *          BYTE, SHORT, INT, LONG, LONGLONG,
+ *          FLOAT, DOUBLE, LONGDOUBLE#
+ * #type = npy_ubyte, npy_ushort, npy_uint, npy_ulong, npy_ulonglong,
+ *         npy_byte, npy_short, npy_int, npy_long, npy_longlong,
+ *         npy_float, npy_double, npy_longdouble#
+ *
+ * #is_fp = 0*10, 1*3#
+ * #is_unsigned = 1*5, 0*5, 0*3#
+ * #scalar_sfx = i*10, f, d, l#
+ */
+#undef TO_SIMD_SFX
+#if 0
+/**begin repeat1
+ * #len = 8, 16, 32, 64#
+ */
+#elif NPY_SIMD && NPY_BITSOF_@BTYPE@ == @len@
+    #if @is_fp@
+        #define TO_SIMD_SFX(X) X##_f@len@
+        #if NPY_BITSOF_@BTYPE@ == 64 && !NPY_SIMD_F64
+            #undef TO_SIMD_SFX
+        #endif
+    #elif @is_unsigned@
+        #define TO_SIMD_SFX(X) X##_u@len@
+    #else
+        #define TO_SIMD_SFX(X) X##_s@len@
+    #endif
+/**end repeat1**/
+#endif
+
+/**begin repeat1
+ * # kind = maximum, minimum, fmax, fmin#
+ * # intrin = max, min, maxp, minp#
+ * # fp_only = 0, 0, 1, 1#
+ */
+#if !@fp_only@ || (@is_fp@ && @fp_only@)
+#define SCALAR_OP scalar_@intrin@_@scalar_sfx@
+
+NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_@kind@)
+(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
+{
+    char *ip1 = args[0], *ip2 = args[1], *op1 = args[2];
+    npy_intp is1 = steps[0], is2 = steps[1], os1 = steps[2],
+             len = dimensions[0];
+    npy_intp i = 0;
+#ifdef TO_SIMD_SFX
+    #undef STYPE
+    #define STYPE TO_SIMD_SFX(npyv_lanetype)
+    if (IS_BINARY_REDUCE) {
+        // reduce and contiguous
+        if (is2 == sizeof(@type@)) {
+            TO_SIMD_SFX(simd_reduce_c_@intrin@)(
+                (STYPE*)ip2, (STYPE*)op1, len
+            );
+            goto clear_fp;
+        }
+    }
+    else if (!is_mem_overlap(ip1, is1, op1, os1, len) &&
+        !is_mem_overlap(ip2, is2, op1, os1, len)
+    ) {
+        // no overlap and operands are binary contiguous
+        if (IS_BINARY_CONT(@type@, @type@)) {
+            TO_SIMD_SFX(simd_binary_ccc_@intrin@)(
+                (STYPE*)ip1, (STYPE*)ip2, (STYPE*)op1, len
+            );
+            goto clear_fp;
+        }
+    // on Arm, unrolled scalars are faster than non-contiguous vector load/store
+    #if !defined(NPY_HAVE_NEON) && @is_fp@
+        if (TO_SIMD_SFX(npyv_loadable_stride)(is1/sizeof(STYPE)) &&
+            TO_SIMD_SFX(npyv_loadable_stride)(is2/sizeof(STYPE)) &&
+            TO_SIMD_SFX(npyv_storable_stride)(os1/sizeof(STYPE))
+        ) {
+            TO_SIMD_SFX(simd_binary_@intrin@)(
+                (STYPE*)ip1, is1/sizeof(STYPE),
+                (STYPE*)ip2, is2/sizeof(STYPE),
+                (STYPE*)op1, os1/sizeof(STYPE), len
+            );
+            goto clear_fp;
+        }
+    #endif
+    }
+#endif // TO_SIMD_SFX
+#ifndef NPY_DISABLE_OPTIMIZATION
+    // scalar unrolls
+    if (IS_BINARY_REDUCE) {
+        // Note, 8x unroll was chosen for best results on Apple M1
+        npy_intp elemPerLoop = 8;
+        if((i+elemPerLoop) <= len){
+            @type@ m0 = *((@type@ *)(ip2 + (i + 0) * is2));
+            @type@ m1 = *((@type@ *)(ip2 + (i + 1) * is2));
+            @type@ m2 = *((@type@ *)(ip2 + (i + 2) * is2));
+            @type@ m3 = *((@type@ *)(ip2 + (i + 3) * is2));
+            @type@ m4 = *((@type@ *)(ip2 + (i + 4) * is2));
+            @type@ m5 = *((@type@ *)(ip2 + (i + 5) * is2));
+            @type@ m6 = *((@type@ *)(ip2 + (i + 6) * is2));
+            @type@ m7 = *((@type@ *)(ip2 + (i + 7) * is2));
+
+            i += elemPerLoop;
+            for(; (i+elemPerLoop)<=len; i+=elemPerLoop){
+                @type@ v0 = *((@type@ *)(ip2 + (i + 0) * is2));
+                @type@ v1 = *((@type@ *)(ip2 + (i + 1) * is2));
+                @type@ v2 = *((@type@ *)(ip2 + (i + 2) * is2));
+                @type@ v3 = *((@type@ *)(ip2 + (i + 3) * is2));
+                @type@ v4 = *((@type@ *)(ip2 + (i + 4) * is2));
+                @type@ v5 = *((@type@ *)(ip2 + (i + 5) * is2));
+                @type@ v6 = *((@type@ *)(ip2 + (i + 6) * is2));
+                @type@ v7 = *((@type@ *)(ip2 + (i + 7) * is2));
+
+                m0 = SCALAR_OP(m0, v0);
+                m1 = SCALAR_OP(m1, v1);
+                m2 = SCALAR_OP(m2, v2);
+                m3 = SCALAR_OP(m3, v3);
+                m4 = SCALAR_OP(m4, v4);
+                m5 = SCALAR_OP(m5, v5);
+                m6 = SCALAR_OP(m6, v6);
+                m7 = SCALAR_OP(m7, v7);
+            }
+
+            m0 = SCALAR_OP(m0, m1);
+            m2 = SCALAR_OP(m2, m3);
+            m4 = SCALAR_OP(m4, m5);
+            m6 = SCALAR_OP(m6, m7);
+
+            m0 = SCALAR_OP(m0, m2);
+            m4 = SCALAR_OP(m4, m6);
+
+            m0 = SCALAR_OP(m0, m4);
+
+             *((@type@ *)op1) = SCALAR_OP(*((@type@ *)op1), m0);
+        }
+    } else{
+        // Note, 4x unroll was chosen for best results on Apple M1
+        npy_intp elemPerLoop = 4;
+        for(; (i+elemPerLoop)<=len; i+=elemPerLoop){
+            /* Note, we can't just load all, do all ops, then store all here.
+             * Sometimes ufuncs are called with `accumulate`, which makes the
+             * assumption that previous iterations have finished before next
+             * iteration.  For example, the output of iteration 2 depends on the
+             * result of iteration 1.
+             */
+
+            /**begin repeat2
+             * #unroll = 0, 1, 2, 3#
+             */
+            @type@ v@unroll@ = *((@type@ *)(ip1 + (i + @unroll@) * is1));
+            @type@ u@unroll@ = *((@type@ *)(ip2 + (i + @unroll@) * is2));
+            *((@type@ *)(op1 + (i + @unroll@) * os1)) = SCALAR_OP(v@unroll@, u@unroll@);
+            /**end repeat2**/
+        }
+    }
+#endif // NPY_DISABLE_OPTIMIZATION
+    ip1 += is1 * i;
+    ip2 += is2 * i;
+    op1 += os1 * i;
+    for (; i < len; ++i, ip1 += is1, ip2 += is2, op1 += os1) {
+        const @type@ in1 = *(@type@ *)ip1;
+        const @type@ in2 = *(@type@ *)ip2;
+        *((@type@ *)op1) = SCALAR_OP(in1, in2);
+    }
+#ifdef TO_SIMD_SFX
+clear_fp:
+    npyv_cleanup();
+#endif
+#if @is_fp@
+    npy_clear_floatstatus_barrier((char*)dimensions);
+#endif
+}
+
+#undef SCALAR_OP
+
+#endif // !fp_only || (is_fp && fp_only)
+/**end repeat1**/
+/**end repeat**/
+
diff --git a/numpy/core/src/umath/loops_modulo.dispatch.c.src b/numpy/core/src/umath/loops_modulo.dispatch.c.src

new file mode 100644

index 0000000..d0ecc01
--- /dev/null
+++ b/numpy/core/src/umath/loops_modulo.dispatch.c.src
@@ -0,0 +1,631 @@
+/*@targets
+ ** baseline vsx4
+ **/
+#define _UMATHMODULE
+#define _MULTIARRAYMODULE
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "simd/simd.h"
+#include "loops_utils.h"
+#include "loops.h"
+#include "lowlevel_strided_loops.h"
+// Provides the various *_LOOP macros
+#include "fast_loop_macros.h"
+
+#if NPY_SIMD && defined(NPY_HAVE_VSX4)
+typedef struct {
+    npyv_u32x2 hi;
+    npyv_u32x2 lo;
+} vsx4_u32x4;
+
+typedef struct {
+    npyv_s32x2 hi;
+    npyv_s32x2 lo;
+} vsx4_s32x4;
+
+// Converts 1 8-bit vector into 2 16-bit vectors
+NPY_FINLINE npyv_s16x2
+vsx4_expand_s16_s8(npyv_s8 data)
+{
+    npyv_s16x2 r;
+    r.val[0] = vec_unpackh(data);
+    r.val[1] = vec_unpackl(data);
+    return r;
+}
+
+// Converts 1 16-bit vector into 2 32-bit vectors
+NPY_FINLINE npyv_s32x2
+vsx4_expand_s32_s16(npyv_s16 data)
+{
+    npyv_s32x2 r;
+    r.val[0] = vec_unpackh(data);
+    r.val[1] = vec_unpackl(data);
+    return r;
+}
+
+/**begin repeat
+ * #t = u, s#
+ * #expand = npyv_expand, vsx4_expand#
+ */
+// Converts 1 8-bit vector into 4 32-bit vectors
+NPY_FINLINE vsx4_@t@32x4
+vsx4_expand_@t@32_@t@8(npyv_@t@8 data)
+{
+    vsx4_@t@32x4 r;
+    npyv_@t@16x2 expand = @expand@_@t@16_@t@8(data);
+    r.hi = @expand@_@t@32_@t@16(expand.val[0]);
+    r.lo = @expand@_@t@32_@t@16(expand.val[1]);
+    return r;
+}
+
+/**begin repeat1
+ * #simd = div, mod#
+ */
+/*
+ * Computes division/modulo of 2 8-bit signed/unsigned integer vectors
+ *
+ * As Power10 only supports integer vector division/modulo for lanes of 32 bits
+ * or wider, each 8-bit vector is widened into four 32-bit vectors, the integer
+ * vector division/modulo instruction is applied, and the results are packed
+ * back into a single 8-bit vector.
+ */
+NPY_FINLINE npyv_@t@8
+vsx4_@simd@_@t@8(npyv_@t@8 a, npyv_@t@8 b)
+{
+    vsx4_@t@32x4 a_expand = vsx4_expand_@t@32_@t@8(a);
+    vsx4_@t@32x4 b_expand = vsx4_expand_@t@32_@t@8(b);
+    npyv_@t@32 v1 = vec_@simd@(a_expand.hi.val[0], b_expand.hi.val[0]);
+    npyv_@t@32 v2 = vec_@simd@(a_expand.hi.val[1], b_expand.hi.val[1]);
+    npyv_@t@32 v3 = vec_@simd@(a_expand.lo.val[0], b_expand.lo.val[0]);
+    npyv_@t@32 v4 = vec_@simd@(a_expand.lo.val[1], b_expand.lo.val[1]);
+    npyv_@t@16 hi = vec_pack(v1, v2);
+    npyv_@t@16 lo = vec_pack(v3, v4);
+    return vec_pack(hi, lo);
+}
+
+NPY_FINLINE npyv_@t@8
+vsx4_@simd@_scalar_@t@8(npyv_@t@8 a, const vsx4_@t@32x4 b_expand)
+{
+    vsx4_@t@32x4 a_expand = vsx4_expand_@t@32_@t@8(a);
+    npyv_@t@32 v1 = vec_@simd@(a_expand.hi.val[0], b_expand.hi.val[0]);
+    npyv_@t@32 v2 = vec_@simd@(a_expand.hi.val[1], b_expand.hi.val[1]);
+    npyv_@t@32 v3 = vec_@simd@(a_expand.lo.val[0], b_expand.lo.val[0]);
+    npyv_@t@32 v4 = vec_@simd@(a_expand.lo.val[1], b_expand.lo.val[1]);
+    npyv_@t@16 hi = vec_pack(v1, v2);
+    npyv_@t@16 lo = vec_pack(v3, v4);
+    return vec_pack(hi, lo);
+}
+
+NPY_FINLINE npyv_@t@16
+vsx4_@simd@_@t@16(npyv_@t@16 a, npyv_@t@16 b)
+{
+    npyv_@t@32x2 a_expand = @expand@_@t@32_@t@16(a);
+    npyv_@t@32x2 b_expand = @expand@_@t@32_@t@16(b);
+    npyv_@t@32 v1 = vec_@simd@(a_expand.val[0], b_expand.val[0]);
+    npyv_@t@32 v2 = vec_@simd@(a_expand.val[1], b_expand.val[1]);
+    return vec_pack(v1, v2);
+}
+
+NPY_FINLINE npyv_@t@16
+vsx4_@simd@_scalar_@t@16(npyv_@t@16 a, const npyv_@t@32x2 b_expand)
+{
+    npyv_@t@32x2 a_expand = @expand@_@t@32_@t@16(a);
+    npyv_@t@32 v1 = vec_@simd@(a_expand.val[0], b_expand.val[0]);
+    npyv_@t@32 v2 = vec_@simd@(a_expand.val[1], b_expand.val[1]);
+    return vec_pack(v1, v2);
+}
+
+#define vsx4_@simd@_@t@32 vec_@simd@
+#define vsx4_@simd@_@t@64 vec_@simd@
+#define vsx4_@simd@_scalar_@t@32 vec_@simd@
+#define vsx4_@simd@_scalar_@t@64 vec_@simd@
+/**end repeat1**/
+/**end repeat**/
+
+/**begin repeat
+ * #sfx  = u8,  u16, s8,  s16#
+ * #osfx = u32, u32, s32, s32#
+ * #otype  = vsx4_u32x4,  npyv_u32x2,  vsx4_s32x4,  npyv_s32x2#
+ * #expand = vsx4_expand, npyv_expand, vsx4_expand, vsx4_expand#
+ */
+// Generates the divisor for the division/modulo operations
+NPY_FINLINE @otype@
+vsx4_divisor_@sfx@(const npyv_@sfx@ vscalar)
+{
+    return @expand@_@osfx@_@sfx@(vscalar);
+}
+/**end repeat**/
+
+/**begin repeat
+ * #sfx = u32, u64, s32, s64#
+ */
+NPY_FINLINE npyv_@sfx@
+vsx4_divisor_@sfx@(const npyv_@sfx@ vscalar)
+{
+    return vscalar;
+}
+/**end repeat**/
+
+/**begin repeat
+ * Unsigned types
+ * #sfx = u8, u16, u32, u64#
+ * #len = 8,  16,  32,  64#
+ * #divtype = vsx4_u32x4, npyv_u32x2,  npyv_u32,  npyv_u64#
+ */
+/**begin repeat1
+ * #func = fmod, remainder, divmod#
+ * #id = 0, 1, 2#
+ */
+static NPY_INLINE void
+vsx4_simd_@func@_contig_@sfx@(char **args, npy_intp len)
+{
+    npyv_lanetype_@sfx@ *src1 = (npyv_lanetype_@sfx@ *) args[0];
+    npyv_lanetype_@sfx@ *src2 = (npyv_lanetype_@sfx@ *) args[1];
+    npyv_lanetype_@sfx@ *dst1 = (npyv_lanetype_@sfx@ *) args[2];
+    const npyv_@sfx@ vzero    = npyv_zero_@sfx@();
+    const int vstep           = npyv_nlanes_@sfx@;
+#if @id@ == 2 /* divmod */
+    npyv_lanetype_@sfx@ *dst2 = (npyv_lanetype_@sfx@ *) args[3];
+    const npyv_@sfx@ vneg_one = npyv_setall_@sfx@(-1);
+    npyv_b@len@ warn          = npyv_cvt_b@len@_@sfx@(npyv_zero_@sfx@());
+
+    for (; len >= vstep; len -= vstep, src1 += vstep, src2 += vstep,
+         dst1 += vstep, dst2 += vstep) {
+        npyv_@sfx@ a        = npyv_load_@sfx@(src1);
+        npyv_@sfx@ b        = npyv_load_@sfx@(src2);
+        npyv_@sfx@ quo      = vsx4_div_@sfx@(a, b);
+        npyv_@sfx@ rem      = npyv_sub_@sfx@(a, vec_mul(b, quo));
+        npyv_b@len@ bzero   = npyv_cmpeq_@sfx@(b, vzero);
+        // when b is 0, 'cvtozero' forces the modulo to be 0 too
+        npyv_@sfx@ cvtozero = npyv_select_@sfx@(bzero, vzero, vneg_one);
+                       warn = npyv_or_@sfx@(bzero, warn);
+        npyv_store_@sfx@(dst1, quo);
+        npyv_store_@sfx@(dst2, npyv_and_@sfx@(cvtozero, rem));
+    }
+
+    if (!vec_all_eq(warn, vzero)) {
+        npy_set_floatstatus_divbyzero();
+    }
+
+    for (; len > 0; --len, ++src1, ++src2, ++dst1, ++dst2) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        const npyv_lanetype_@sfx@ b = *src2;
+        if (NPY_UNLIKELY(b == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+            *dst2 = 0;
+        } else{
+            *dst1 = a / b;
+            *dst2 = a % b;
+        }
+    }
+#else /* fmod and remainder */
+    for (; len >= vstep; len -= vstep, src1 += vstep, src2 += vstep,
+         dst1 += vstep) {
+        npyv_@sfx@ a = npyv_load_@sfx@(src1);
+        npyv_@sfx@ b = npyv_load_@sfx@(src2);
+        npyv_@sfx@ c = vsx4_mod_@sfx@(a, b);
+        npyv_store_@sfx@(dst1, c);
+        if (NPY_UNLIKELY(vec_any_eq(b, vzero))) {
+            npy_set_floatstatus_divbyzero();
+        }
+    }
+
+    for (; len > 0; --len, ++src1, ++src2, ++dst1) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        const npyv_lanetype_@sfx@ b = *src2;
+        if (NPY_UNLIKELY(b == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+        } else{
+            *dst1 = a % b;
+        }
+    }
+#endif
+    npyv_cleanup();
+}
+
+static NPY_INLINE void
+vsx4_simd_@func@_by_scalar_contig_@sfx@(char **args, npy_intp len)
+{
+    npyv_lanetype_@sfx@ *src1  = (npyv_lanetype_@sfx@ *) args[0];
+    npyv_lanetype_@sfx@ scalar = *(npyv_lanetype_@sfx@ *) args[1];
+    npyv_lanetype_@sfx@ *dst1  = (npyv_lanetype_@sfx@ *) args[2];
+    const int vstep            = npyv_nlanes_@sfx@;
+    const npyv_@sfx@ vscalar   = npyv_setall_@sfx@(scalar);
+    const @divtype@ divisor    = vsx4_divisor_@sfx@(vscalar);
+#if @id@ == 2 /* divmod */
+    npyv_lanetype_@sfx@ *dst2 = (npyv_lanetype_@sfx@ *) args[3];
+
+    for (; len >= vstep; len -= vstep, src1 += vstep, dst1 += vstep,
+         dst2 += vstep) {
+        npyv_@sfx@ a   = npyv_load_@sfx@(src1);
+        npyv_@sfx@ quo = vsx4_div_scalar_@sfx@(a, divisor);
+        npyv_@sfx@ rem = npyv_sub_@sfx@(a, vec_mul(vscalar, quo));
+        npyv_store_@sfx@(dst1, quo);
+        npyv_store_@sfx@(dst2, rem);
+    }
+
+    for (; len > 0; --len, ++src1, ++dst1, ++dst2) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        *dst1 = a / scalar;
+        *dst2 = a % scalar;
+    }
+#else /* fmod and remainder */
+    for (; len >= vstep; len -= vstep, src1 += vstep, dst1 += vstep) {
+        npyv_@sfx@ a = npyv_load_@sfx@(src1);
+        npyv_@sfx@ c = vsx4_mod_scalar_@sfx@(a, divisor);
+        npyv_store_@sfx@(dst1, c);
+    }
+
+    for (; len > 0; --len, ++src1, ++dst1) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        *dst1 = a % scalar;
+    }
+#endif
+    npyv_cleanup();
+}
+/**end repeat1**/
+/**end repeat**/
+
+/**begin repeat
+ * Signed types
+ * #sfx = s8, s16, s32, s64#
+ * #len = 8,  16,  32,  64#
+ * #divtype = vsx4_s32x4, npyv_s32x2,  npyv_s32,  npyv_s64#
+ */
+/**begin repeat1
+ * #func = fmod, remainder, divmod#
+ * #id = 0, 1, 2#
+ */
+static NPY_INLINE void
+vsx4_simd_@func@_contig_@sfx@(char **args, npy_intp len)
+{
+    npyv_lanetype_@sfx@ *src1 = (npyv_lanetype_@sfx@ *) args[0];
+    npyv_lanetype_@sfx@ *src2 = (npyv_lanetype_@sfx@ *) args[1];
+    npyv_lanetype_@sfx@ *dst1 = (npyv_lanetype_@sfx@ *) args[2];
+    const npyv_@sfx@ vzero    = npyv_zero_@sfx@();
+    const int vstep           = npyv_nlanes_@sfx@;
+#if @id@ == 2 /* divmod */
+    npyv_lanetype_@sfx@ *dst2 = (npyv_lanetype_@sfx@ *) args[3];
+    const npyv_@sfx@ vneg_one = npyv_setall_@sfx@(-1);
+    const npyv_@sfx@ vmin     = npyv_setall_@sfx@(NPY_MIN_INT@len@);
+    npyv_b@len@ warn          = npyv_cvt_b@len@_@sfx@(npyv_zero_@sfx@());
+
+    for (; len >= vstep; len -= vstep, src1 += vstep, src2 += vstep,
+         dst1 += vstep, dst2 += vstep) {
+#else /* fmod and remainder */
+    for (; len >= vstep; len -= vstep, src1 += vstep, src2 += vstep,
+         dst1 += vstep) {
+#endif
+        npyv_@sfx@ a = npyv_load_@sfx@(src1);
+        npyv_@sfx@ b = npyv_load_@sfx@(src2);
+#if @id@ <= 1 /* fmod and remainder */
+        npyv_@sfx@ rem       = vsx4_mod_@sfx@(a, b);
+#else /* divmod */
+        npyv_@sfx@ quo       = vsx4_div_@sfx@(a, b);
+        npyv_@sfx@ rem       = npyv_sub_@sfx@(a, vec_mul(b, quo));
+        // (b == 0 || (a == NPY_MIN_INT@len@ && b == -1))
+        npyv_b@len@ bzero    = npyv_cmpeq_@sfx@(b, vzero);
+        npyv_b@len@ amin     = npyv_cmpeq_@sfx@(a, vmin);
+        npyv_b@len@ bneg_one = npyv_cmpeq_@sfx@(b, vneg_one);
+        npyv_b@len@ overflow = npyv_and_@sfx@(bneg_one, amin);
+        npyv_b@len@ error    = npyv_or_@sfx@(bzero, overflow);
+        // in case of overflow or b = 0, 'cvtozero' forces quo/rem to be 0
+        npyv_@sfx@ cvtozero  = npyv_select_@sfx@(error, vzero, vneg_one);
+                        warn = npyv_or_@sfx@(error, warn);
+#endif
+#if @id@ >= 1 /* remainder and divmod */
+        // handle mixed-sign operands the way Python does (floored modulo)
+        // ((a > 0) == (b > 0) || rem == 0)
+        npyv_b@len@ a_gt_zero  = npyv_cmpgt_@sfx@(a, vzero);
+        npyv_b@len@ b_gt_zero  = npyv_cmpgt_@sfx@(b, vzero);
+        npyv_b@len@ ab_eq_cond = npyv_cmpeq_@sfx@(a_gt_zero, b_gt_zero);
+        npyv_b@len@ rem_zero   = npyv_cmpeq_@sfx@(rem, vzero);
+        npyv_b@len@ or         = npyv_or_@sfx@(ab_eq_cond, rem_zero);
+        npyv_@sfx@ to_add      = npyv_select_@sfx@(or, vzero, b);
+                           rem = npyv_add_@sfx@(rem, to_add);
+#endif
+#if @id@ == 2 /* divmod */
+        npyv_@sfx@ to_sub = npyv_select_@sfx@(or, vzero, vneg_one);
+                      quo = npyv_add_@sfx@(quo, to_sub);
+        npyv_store_@sfx@(dst1, npyv_and_@sfx@(cvtozero, quo));
+        npyv_store_@sfx@(dst2, npyv_and_@sfx@(cvtozero, rem));
+#else /* fmod and remainder */
+        npyv_store_@sfx@(dst1, rem);
+        if (NPY_UNLIKELY(vec_any_eq(b, vzero))) {
+            npy_set_floatstatus_divbyzero();
+        }
+#endif
+    }
+
+#if @id@ == 2 /* divmod */
+    if (!vec_all_eq(warn, vzero)) {
+        npy_set_floatstatus_divbyzero();
+    }
+
+    for (; len > 0; --len, ++src1, ++src2, ++dst1, ++dst2) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        const npyv_lanetype_@sfx@ b = *src2;
+        if (b == 0 || (a == NPY_MIN_INT@len@ && b == -1)) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+            *dst2 = 0;
+        }
+        else {
+            *dst1 = a / b;
+            *dst2 = a % b;
+            if (!((a > 0) == (b > 0) || *dst2 == 0)) {
+                *dst1 -= 1;
+                *dst2 += b;
+            }
+        }
+    }
+#else /* fmod and remainder */
+    for (; len > 0; --len, ++src1, ++src2, ++dst1) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        const npyv_lanetype_@sfx@ b = *src2;
+        if (NPY_UNLIKELY(b == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+        } else{
+            *dst1 = a % b;
+#if @id@ == 1 /* remainder */
+            if (!((a > 0) == (b > 0) || *dst1 == 0)) {
+                *dst1 += b;
+            }
+#endif
+        }
+    }
+#endif
+    npyv_cleanup();
+}
+
+static NPY_INLINE void
+vsx4_simd_@func@_by_scalar_contig_@sfx@(char **args, npy_intp len)
+{
+    npyv_lanetype_@sfx@ *src1  = (npyv_lanetype_@sfx@ *) args[0];
+    npyv_lanetype_@sfx@ scalar = *(npyv_lanetype_@sfx@ *) args[1];
+    npyv_lanetype_@sfx@ *dst1  = (npyv_lanetype_@sfx@ *) args[2];
+    const npyv_@sfx@ vscalar   = npyv_setall_@sfx@(scalar);
+    const @divtype@ divisor    = vsx4_divisor_@sfx@(vscalar);
+    const int vstep            = npyv_nlanes_@sfx@;
+#if @id@ >= 1 /* remainder and divmod */
+    const npyv_@sfx@ vzero     = npyv_zero_@sfx@();
+    npyv_b@len@ b_gt_zero      = npyv_cmpgt_@sfx@(vscalar, vzero);
+#endif
+#if @id@ == 2 /* divmod */
+    npyv_b@len@ warn          = npyv_cvt_b@len@_@sfx@(npyv_zero_@sfx@());
+    const npyv_@sfx@ vmin     = npyv_setall_@sfx@(NPY_MIN_INT@len@);
+    const npyv_@sfx@ vneg_one = npyv_setall_@sfx@(-1);
+    npyv_b@len@ bneg_one      = npyv_cmpeq_@sfx@(vscalar, vneg_one);
+    npyv_lanetype_@sfx@ *dst2 = (npyv_lanetype_@sfx@ *) args[3];
+
+    for (; len >= vstep; len -= vstep, src1 += vstep, dst1 += vstep,
+         dst2 += vstep) {
+#else /* fmod and remainder */
+    for (; len >= vstep; len -= vstep, src1 += vstep, dst1 += vstep) {
+#endif
+        npyv_@sfx@ a = npyv_load_@sfx@(src1);
+#if @id@ <= 1 /* fmod and remainder */
+        npyv_@sfx@ rem       = vsx4_mod_scalar_@sfx@(a, divisor);
+#else /* divmod */
+        npyv_@sfx@ quo       = vsx4_div_scalar_@sfx@(a, divisor);
+        npyv_@sfx@ rem       = npyv_sub_@sfx@(a, vec_mul(vscalar, quo));
+        // (a == NPY_MIN_INT@len@ && b == -1)
+        npyv_b@len@ amin     = npyv_cmpeq_@sfx@(a, vmin);
+        npyv_b@len@ overflow = npyv_and_@sfx@(bneg_one, amin);
+        // in case of overflow, 'cvtozero' forces quo/rem to be 0
+        npyv_@sfx@ cvtozero  = npyv_select_@sfx@(overflow, vzero, vneg_one);
+                        warn = npyv_or_@sfx@(overflow, warn);
+#endif
+#if @id@ >= 1 /* remainder and divmod */
+        // handle mixed case the way Python does
+        // ((a > 0) == (b > 0) || rem == 0)
+        npyv_b@len@ a_gt_zero  = npyv_cmpgt_@sfx@(a, vzero);
+        npyv_b@len@ ab_eq_cond = npyv_cmpeq_@sfx@(a_gt_zero, b_gt_zero);
+        npyv_b@len@ rem_zero   = npyv_cmpeq_@sfx@(rem, vzero);
+        npyv_b@len@ or         = npyv_or_@sfx@(ab_eq_cond, rem_zero);
+        npyv_@sfx@ to_add      = npyv_select_@sfx@(or, vzero, vscalar);
+                           rem = npyv_add_@sfx@(rem, to_add);
+#endif
+#if @id@ == 2 /* divmod */
+        npyv_@sfx@ to_sub = npyv_select_@sfx@(or, vzero, vneg_one);
+        quo               = npyv_add_@sfx@(quo, to_sub);
+        npyv_store_@sfx@(dst1, npyv_and_@sfx@(cvtozero, quo));
+        npyv_store_@sfx@(dst2, npyv_and_@sfx@(cvtozero, rem));
+#else /* fmod and remainder */
+        npyv_store_@sfx@(dst1, rem);
+#endif
+    }
+
+#if @id@ == 2 /* divmod */
+    if (!vec_all_eq(warn, vzero)) {
+        npy_set_floatstatus_divbyzero();
+    }
+
+    for (; len > 0; --len, ++src1, ++dst1, ++dst2) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        if (a == NPY_MIN_INT@len@ && scalar == -1) {
+            npy_set_floatstatus_divbyzero();
+            *dst1 = 0;
+            *dst2 = 0;
+        }
+        else {
+            *dst1 = a / scalar;
+            *dst2 = a % scalar;
+            if (!((a > 0) == (scalar > 0) || *dst2 == 0)) {
+                *dst1 -= 1;
+                *dst2 += scalar;
+            }
+        }
+    }
+#else /* fmod and remainder */
+    for (; len > 0; --len, ++src1, ++dst1) {
+        const npyv_lanetype_@sfx@ a = *src1;
+        *dst1 = a % scalar;
+#if @id@ == 1 /* remainder */
+        if (!((a > 0) == (scalar > 0) || *dst1 == 0)) {
+            *dst1 += scalar;
+        }
+#endif
+    }
+#endif
+    npyv_cleanup();
+}
+/**end repeat1**/
+/**end repeat**/
+#endif // NPY_SIMD && defined(NPY_HAVE_VSX4)
+
+/*****************************************************************************
+ ** Defining ufunc inner functions
+ *****************************************************************************/
+
+/**begin repeat
+ * Signed and Unsigned types
+ *  #type  = npy_ubyte, npy_ushort, npy_uint, npy_ulong, npy_ulonglong,
+ *           npy_byte,  npy_short,  npy_int,  npy_long,  npy_longlong#
+ *  #TYPE  = UBYTE,     USHORT,     UINT,     ULONG,     ULONGLONG,
+ *           BYTE,      SHORT,      INT,      LONG,      LONGLONG#
+ *  #STYPE = BYTE,      SHORT,      INT,      LONG,      LONGLONG,
+ *           BYTE,      SHORT,      INT,      LONG,      LONGLONG#
+ *  #signed = 0, 0, 0, 0, 0, 1, 1, 1, 1, 1#
+ */
+#undef TO_SIMD_SFX
+#if 0
+/**begin repeat1
+ * #len = 8, 16, 32, 64#
+ */
+#elif NPY_BITSOF_@STYPE@ == @len@
+    #if @signed@
+        #define TO_SIMD_SFX(X) X##_s@len@
+    #else
+        #define TO_SIMD_SFX(X) X##_u@len@
+    #endif
+/**end repeat1**/
+#endif
+
+NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_fmod)
+(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
+{
+#if defined(NPY_HAVE_VSX4) && NPY_SIMD && defined(TO_SIMD_SFX)
+    // both arguments are arrays of the same size
+    if (IS_BLOCKABLE_BINARY(sizeof(@type@), NPY_SIMD_WIDTH)) {
+        TO_SIMD_SFX(vsx4_simd_fmod_contig)(args, dimensions[0]);
+        return;
+    }
+    // for contiguous block of memory, divisor is a scalar and not 0
+    else if (IS_BLOCKABLE_BINARY_SCALAR2(sizeof(@type@), NPY_SIMD_WIDTH) &&
+             (*(@type@ *)args[1]) != 0) {
+        TO_SIMD_SFX(vsx4_simd_fmod_by_scalar_contig)(args, dimensions[0]);
+        return ;
+    }
+#endif
+    BINARY_LOOP {
+        const @type@ in1 = *(@type@ *)ip1;
+        const @type@ in2 = *(@type@ *)ip2;
+        if (NPY_UNLIKELY(in2 == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *((@type@ *)op1) = 0;
+        } else{
+            *((@type@ *)op1)= in1 % in2;
+        }
+    }
+}
+
+NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_remainder)
+(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
+{
+#if defined(NPY_HAVE_VSX4) && NPY_SIMD && defined(TO_SIMD_SFX)
+    // both arguments are arrays of the same size
+    if (IS_BLOCKABLE_BINARY(sizeof(@type@), NPY_SIMD_WIDTH)) {
+        TO_SIMD_SFX(vsx4_simd_remainder_contig)(args, dimensions[0]);
+        return;
+    }
+    // for contiguous block of memory, divisor is a scalar and not 0
+    else if (IS_BLOCKABLE_BINARY_SCALAR2(sizeof(@type@), NPY_SIMD_WIDTH) &&
+             (*(@type@ *)args[1]) != 0) {
+        TO_SIMD_SFX(vsx4_simd_remainder_by_scalar_contig)(args, dimensions[0]);
+        return ;
+    }
+#endif
+    BINARY_LOOP {
+        const @type@ in1 = *(@type@ *)ip1;
+        const @type@ in2 = *(@type@ *)ip2;
+        if (NPY_UNLIKELY(in2 == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *((@type@ *)op1) = 0;
+        } else{
+#if @signed@
+            /* handle mixed case the way Python does */
+            const @type@ rem = in1 % in2;
+            if ((in1 > 0) == (in2 > 0) || rem == 0) {
+                *((@type@ *)op1) = rem;
+            }
+            else {
+                *((@type@ *)op1) = rem + in2;
+            }
+#else
+            *((@type@ *)op1)= in1 % in2;
+#endif
+        }
+    }
+}
+
+NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_divmod)
+(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
+{
+#if defined(NPY_HAVE_VSX4) && NPY_SIMD && defined(TO_SIMD_SFX)
+    // both arguments are arrays of the same size
+    if (IS_BLOCKABLE_BINARY(sizeof(@type@), NPY_SIMD_WIDTH)) {
+        TO_SIMD_SFX(vsx4_simd_divmod_contig)(args, dimensions[0]);
+        return;
+    }
+    // for contiguous block of memory, divisor is a scalar and not 0
+    else if (IS_BLOCKABLE_BINARY_SCALAR2(sizeof(@type@), NPY_SIMD_WIDTH) &&
+             (*(@type@ *)args[1]) != 0) {
+        TO_SIMD_SFX(vsx4_simd_divmod_by_scalar_contig)(args, dimensions[0]);
+        return ;
+    }
+#endif
+#if @signed@
+    BINARY_LOOP_TWO_OUT {
+        const @type@ in1 = *(@type@ *)ip1;
+        const @type@ in2 = *(@type@ *)ip2;
+        /* see FIXME note for divide above */
+        if (NPY_UNLIKELY(in2 == 0 || (in1 == NPY_MIN_@TYPE@ && in2 == -1))) {
+            npy_set_floatstatus_divbyzero();
+            *((@type@ *)op1) = 0;
+            *((@type@ *)op2) = 0;
+        }
+        else {
+            /* handle mixed case the way Python does */
+            const @type@ quo = in1 / in2;
+            const @type@ rem = in1 % in2;
+            if ((in1 > 0) == (in2 > 0) || rem == 0) {
+                *((@type@ *)op1) = quo;
+                *((@type@ *)op2) = rem;
+            }
+            else {
+                *((@type@ *)op1) = quo - 1;
+                *((@type@ *)op2) = rem + in2;
+            }
+        }
+    }
+#else
+    BINARY_LOOP_TWO_OUT {
+        const @type@ in1 = *(@type@ *)ip1;
+        const @type@ in2 = *(@type@ *)ip2;
+        if (NPY_UNLIKELY(in2 == 0)) {
+            npy_set_floatstatus_divbyzero();
+            *((@type@ *)op1) = 0;
+            *((@type@ *)op2) = 0;
+        }
+        else {
+            *((@type@ *)op1)= in1/in2;
+            *((@type@ *)op2) = in1 % in2;
+        }
+    }
+#endif
+}
+/**end repeat**/
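
Note on the signed divmod/remainder fallback above: C's `/` truncates toward zero and `%` takes the sign of the dividend, so the loops apply a correction when the operands have mixed signs and the remainder is nonzero. The following is a minimal standalone sketch of that rule for plain `int`. The helper name `floored_divmod_int` and the boolean return value are assumptions made for illustration; the loops above report the error through `npy_set_floatstatus_divbyzero()` instead.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of NumPy): floored quotient and remainder
 * for int, following the scalar fallback in the divmod loop.
 * Returns true for the error cases (b == 0, or INT_MIN / -1), in which
 * case both outputs are forced to 0.
 */
static bool
floored_divmod_int(int a, int b, int *quo, int *rem)
{
    if (b == 0 || (a == INT_MIN && b == -1)) {
        *quo = 0;
        *rem = 0;
        return true;
    }
    *quo = a / b;   /* C division truncates toward zero */
    *rem = a % b;   /* remainder takes the sign of the dividend */
    /* mixed signs and a nonzero remainder: move to the floored convention */
    if (!((a > 0) == (b > 0) || *rem == 0)) {
        *quo -= 1;
        *rem += b;
    }
    return false;
}
```

For example, `floored_divmod_int(-7, 2, &q, &r)` leaves `q == -4` and `r == 1`, where plain C division would give -3 and -1.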
diff --git a/numpy/core/src/umath/loops_trigonometric.dispatch.c.src b/numpy/core/src/umath/loops_trigonometric.dispatch.c.src

index cd9b2ed547ffffdbda1d3409cfced4a04e1e4fd0..44c47d14fa34126b879f236c4af9ddf1ff0ccaf1 100644
--- a/numpy/core/src/umath/loops_trigonometric.dispatch.c.src
+++ b/numpy/core/src/umath/loops_trigonometric.dispatch.c.src
@@ -1,7 +1,7 @@
  /*@targets
   ** $maxopt baseline
   ** (avx2 fma3) avx512f
- ** vsx2
+ ** vsx2 vsx3 vsx4
   ** neon_vfpv4
   **/
  #include "numpy/npy_math.h"
diff --git a/numpy/core/src/umath/loops_umath_fp.dispatch.c.src b/numpy/core/src/umath/loops_umath_fp.dispatch.c.src

index a8289fc51092f321fb467ecfc73afb79b3532e00..281b4231de3c43768b9e46c7f29932fe9a88d449 100644
--- a/numpy/core/src/umath/loops_umath_fp.dispatch.c.src
+++ b/numpy/core/src/umath/loops_umath_fp.dispatch.c.src
@@ -14,8 +14,8 @@
   * #func_suffix = f16, 8#
   */
  /**begin repeat1
- * #func = tanh, exp2, log2, log10, expm1, log1p, cbrt, tan, asin, acos, atan, sinh, cosh, asinh, acosh, atanh#
- * #default_val = 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0#
+ * #func = exp2, log2, log10, expm1, log1p, cbrt, tan, asin, acos, atan, sinh, cosh, asinh, acosh, atanh#
+ * #default_val = 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0#
   */
  static void
  simd_@func@_@sfx@(const npyv_lanetype_@sfx@ *src, npy_intp ssrc,
@@ -83,8 +83,8 @@ simd_@func@_f64(const double *src, npy_intp ssrc,
   *  #sfx  = f64, f32#
   */
  /**begin repeat1
- *  #func = tanh, exp2, log2, log10, expm1, log1p, cbrt, tan, arcsin, arccos, arctan, sinh, cosh, arcsinh, arccosh, arctanh#
- *  #intrin = tanh, exp2, log2, log10, expm1, log1p, cbrt, tan, asin, acos, atan, sinh, cosh, asinh, acosh, atanh#
+ *  #func = exp2, log2, log10, expm1, log1p, cbrt, tan, arcsin, arccos, arctan, sinh, cosh, arcsinh, arccosh, arctanh#
+ *  #intrin = exp2, log2, log10, expm1, log1p, cbrt, tan, asin, acos, atan, sinh, cosh, asinh, acosh, atanh#
   */
  NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_@func@)
  (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(data))
diff --git a/numpy/core/src/umath/loops_unary_fp.dispatch.c.src b/numpy/core/src/umath/loops_unary_fp.dispatch.c.src

index 93761b98c04ee900923a92c3333678b7b1386719..78e231965bcaa78411a2dfb25806164c8a0ba340 100644
--- a/numpy/core/src/umath/loops_unary_fp.dispatch.c.src
+++ b/numpy/core/src/umath/loops_unary_fp.dispatch.c.src
@@ -70,6 +70,15 @@ NPY_FINLINE double c_square_f64(double a)
  #define c_ceil_f32 npy_ceilf
  #define c_ceil_f64 npy_ceil
  
+#define c_trunc_f32 npy_truncf
+#define c_trunc_f64 npy_trunc
+
+#define c_floor_f32 npy_floorf
+#define c_floor_f64 npy_floor
+
+#define c_rint_f32 npy_rintf
+#define c_rint_f64 npy_rint
+
  /********************************************************************************
   ** Defining the SIMD kernels
   ********************************************************************************/
@@ -119,6 +128,9 @@ NPY_FINLINE double c_square_f64(double a)
          #if __clang_major__ < 10
          // Clang before v10
          #define WORKAROUND_CLANG_RECIPROCAL_BUG 1
+        #elif defined(_MSC_VER)
+        // clang-cl has the same bug
+        #define WORKAROUND_CLANG_RECIPROCAL_BUG 1
          #elif defined(NPY_CPU_X86) || defined(NPY_CPU_AMD64)
          // Clang v10+, targeting i386 or x86_64
          #define WORKAROUND_CLANG_RECIPROCAL_BUG 0
@@ -139,10 +151,10 @@ NPY_FINLINE double c_square_f64(double a)
   */
  #if @VCHK@
  /**begin repeat1
- * #kind     = ceil, sqrt, absolute, square, reciprocal#
- * #intr     = ceil, sqrt, abs,      square, recip#
- * #repl_0w1 = 0,    0,    0,        0,      1#
- * #RECIP_WORKAROUND = 0, 0, 0, 0, WORKAROUND_CLANG_RECIPROCAL_BUG#
+ * #kind     = rint,  floor, ceil, trunc, sqrt, absolute, square, reciprocal#
+ * #intr     = rint,  floor, ceil, trunc, sqrt, abs,      square, recip#
+ * #repl_0w1 = 0*7, 1#
+ * #RECIP_WORKAROUND = 0*7, WORKAROUND_CLANG_RECIPROCAL_BUG#
   */
  /**begin repeat2
   * #STYPE  = CONTIG, NCONTIG, CONTIG,  NCONTIG#
@@ -250,9 +262,9 @@ static void simd_@TYPE@_@kind@_@STYPE@_@DTYPE@
   * #VCHK = NPY_SIMD, NPY_SIMD_F64#
   */
  /**begin repeat1
- * #kind  = ceil, sqrt, absolute, square, reciprocal#
- * #intr  = ceil, sqrt, abs,      square, recip#
- * #clear = 0,    0,    1,        0,      0#
+ * #kind  = rint, floor, ceil, trunc, sqrt, absolute, square, reciprocal#
+ * #intr  = rint, floor, ceil, trunc, sqrt, abs,      square, recip#
+ * #clear = 0,    0,     0,    0,     0,    1,        0,      0#
   */
  NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_@kind@)
  (char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
diff --git a/numpy/core/src/umath/loops_utils.h.src b/numpy/core/src/umath/loops_utils.h.src

index 762e9ee59bede5a6cc705023f6e2856aa46c1cea..df92bc315c5e44dfd6992e4d03e38a4dc7e7121f 100644
--- a/numpy/core/src/umath/loops_utils.h.src
+++ b/numpy/core/src/umath/loops_utils.h.src
@@ -79,7 +79,11 @@ static NPY_INLINE @type@
  {
      if (n < 8) {
          npy_intp i;
-        @type@ res = 0.;
+        /*
+         * Start with -0 to preserve -0 values.  The reason is that summing
+         * only -0 should return -0, but `0 + -0 == 0` while `-0 + -0 == -0`.
+         */
+        @type@ res = -0.0;
  
          for (i = 0; i < n; i++) {
              res += @trf@(*((@dtype@*)(a + i * stride)));
@@ -156,8 +160,8 @@ static NPY_INLINE void
      if (n < 8) {
          npy_intp i;
  
-        *rr = 0.;
-        *ri = 0.;
+        *rr = -0.0;
+        *ri = -0.0;
          for (i = 0; i < n; i += 2) {
              *rr += *((@ftype@ *)(a + i * stride + 0));
              *ri += *((@ftype@ *)(a + i * stride + sizeof(@ftype@)));
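
Note on the `-0.0` initial value above: under IEEE 754 round-to-nearest, `0 + -0` is `+0` while `-0 + -0` is `-0`, so an accumulator that starts at `+0` can never return `-0`. The following is a minimal standalone sketch of the effect. The helper names `sum_from_negzero` and `is_negative_zero` are assumptions made for illustration and do not appear in NumPy, and the sketch assumes the compiler keeps signed-zero semantics (for example, no `-ffast-math`).

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: nonzero only when x is exactly negative zero. */
static int
is_negative_zero(double x)
{
    return x == 0.0 && signbit(x);
}

/*
 * Hypothetical helper: plain left-to-right sum that starts from -0.0,
 * as the small-n branch of the pairwise sum does after this change.
 */
static double
sum_from_negzero(const double *a, size_t n)
{
    double res = -0.0;
    for (size_t i = 0; i < n; i++) {
        res += a[i];
    }
    return res;
}
```

With inputs `{-0.0, -0.0}` the result keeps its sign bit; as soon as any element is `+0.0` or nonzero, the initial `-0.0` no longer affects the result.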
diff --git a/numpy/core/src/umath/npy_simd_data.h b/numpy/core/src/umath/npy_simd_data.h

index 62438d7a3fa85181a73a2ef8a46b76b62018278c..43640a2d6324adce0cd6856e49070235652fe08c 100644
--- a/numpy/core/src/umath/npy_simd_data.h
+++ b/numpy/core/src/umath/npy_simd_data.h
@@ -15,6 +15,7 @@
  #define NPY_TANG_A4 0x1.11115b7aa905ep-7
  #define NPY_TANG_A5 0x1.6c1728d739765p-10
  
+#if !defined NPY_HAVE_AVX512_SKX || !defined NPY_CAN_LINK_SVML
  /* Lookup table for 2^(j/32) */
  static npy_uint64 EXP_Table_top[32] = {
      0x3FF0000000000000,
@@ -85,6 +86,7 @@ static npy_uint64 EXP_Table_tail[32] = {
      0x3CF9858F73A18F5E,
      0x3C99D3E12DD8A18B,
  };
+#endif //#if !defined NPY_HAVE_AVX512_SKX || !defined NPY_CAN_LINK_SVML
  #endif
  #endif
  
@@ -128,6 +130,7 @@ static npy_uint64 EXP_Table_tail[32] = {
   */
  #if defined NPY_HAVE_AVX512F
  #if !(defined(__clang__) && (__clang_major__ < 10 || (__clang_major__ == 10 && __clang_minor__ < 1)))
+#if !defined NPY_HAVE_AVX512_SKX || !defined NPY_CAN_LINK_SVML
  static npy_uint64 LOG_TABLE_TOP[64] = {
      0x0000000000000000,
      0x3F8FC0A8B1000000,
@@ -261,6 +264,7 @@ static npy_uint64 LOG_TABLE_TAIL[64] = {
      0x3D6F2CFB29AAA5F0,
      0x3D66757006095FD2,
  };
+#endif //#if !defined NPY_HAVE_AVX512_SKX || !defined NPY_CAN_LINK_SVML
  
  #define NPY_TANG_LOG_A1 0x1.55555555554e6p-4
  #define NPY_TANG_LOG_A2 0x1.9999999bac6d4p-7
diff --git a/numpy/core/src/umath/reduction.c b/numpy/core/src/umath/reduction.c

index 06709b4f36fd5cd5a4765ed281eac2dfe258d054..817f99a04fabc4be964c5a234ad40e4cebe2454e 100644
--- a/numpy/core/src/umath/reduction.c
+++ b/numpy/core/src/umath/reduction.c
@@ -337,7 +337,7 @@ PyUFunc_ReduceWrapper(PyArrayMethod_Context *context,
  
      /*
       * Note that we need to ensure that the iterator is reset before getting
-     * the fixed strides.  (The buffer information is unitialized before.)
+     * the fixed strides.  (The buffer information is uninitialized before.)
       */
      npy_intp fixed_strides[3];
      NpyIter_GetInnerFixedStrideArray(iter, fixed_strides);
diff --git a/numpy/core/src/umath/scalarmath.c.src b/numpy/core/src/umath/scalarmath.c.src

index 402e6b561717bfda611d87e900f3147233b37cd5..8fb219b63c2fd92959698ac89363e5082db88e5a 100644
--- a/numpy/core/src/umath/scalarmath.c.src
+++ b/numpy/core/src/umath/scalarmath.c.src
@@ -26,6 +26,13 @@
  #include "binop_override.h"
  #include "npy_longdouble.h"
  
+#include "array_coercion.h"
+#include "common.h"
+#include "can_cast_table.h"
+
+/* TODO: Used for some functions, should possibly move these to npy_math.h */
+#include "loops.h"
+
  /* Basic operations:
   *
   *  BINARY:
@@ -45,23 +52,22 @@
   *  #name = byte, short, int, long, longlong#
   *  #type = npy_byte, npy_short, npy_int, npy_long, npy_longlong#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_add(@type@ a, @type@ b, @type@ *out) {
      *out = a + b;
      if ((*out^a) >= 0 || (*out^b) >= 0) {
-        return;
+        return 0;
      }
-    npy_set_floatstatus_overflow();
-    return;
+    return NPY_FPE_OVERFLOW;
  }
-static void
+
+static NPY_INLINE int
  @name@_ctype_subtract(@type@ a, @type@ b, @type@ *out) {
      *out = a - b;
      if ((*out^a) >= 0 || (*out^~b) >= 0) {
-        return;
+        return 0;
      }
-    npy_set_floatstatus_overflow();
-    return;
+    return NPY_FPE_OVERFLOW;
  }
  /**end repeat**/
  
@@ -69,23 +75,22 @@ static void
   *  #name = ubyte, ushort, uint, ulong, ulonglong#
   *  #type = npy_ubyte, npy_ushort, npy_uint, npy_ulong, npy_ulonglong#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_add(@type@ a, @type@ b, @type@ *out) {
      *out = a + b;
      if (*out >= a && *out >= b) {
-        return;
+        return 0;
      }
-    npy_set_floatstatus_overflow();
-    return;
+    return NPY_FPE_OVERFLOW;
  }
-static void
+
+static NPY_INLINE int
  @name@_ctype_subtract(@type@ a, @type@ b, @type@ *out) {
      *out = a - b;
      if (a >= b) {
-        return;
+        return 0;
      }
-    npy_set_floatstatus_overflow();
-    return;
+    return NPY_FPE_OVERFLOW;
  }
  /**end repeat**/
  
@@ -108,18 +113,19 @@ static void
   * #neg = (1,0)*4#
   */
  #if NPY_SIZEOF_@SIZE@ > NPY_SIZEOF_@SIZENAME@
-static void
+static NPY_INLINE int
  @name@_ctype_multiply(@type@ a, @type@ b, @type@ *out) {
      @big@ temp;
      temp = ((@big@) a) * ((@big@) b);
      *out = (@type@) temp;
  #if @neg@
-    if (temp > NPY_MAX_@NAME@ || temp < NPY_MIN_@NAME@)
+    if (temp > NPY_MAX_@NAME@ || temp < NPY_MIN_@NAME@) {
  #else
-        if (temp > NPY_MAX_@NAME@)
+    if (temp > NPY_MAX_@NAME@) {
  #endif
-            npy_set_floatstatus_overflow();
-    return;
+        return NPY_FPE_OVERFLOW;
+    }
+    return 0;
  }
  #endif
  /**end repeat**/
@@ -133,12 +139,12 @@ static void
   * #SIZE = INT*2, LONG*2, LONGLONG*2#
   */
  #if NPY_SIZEOF_LONGLONG == NPY_SIZEOF_@SIZE@
-static void
+static NPY_INLINE int
  @name@_ctype_multiply(@type@ a, @type@ b, @type@ *out) {
      if (npy_mul_with_overflow_@name@(out, a, b)) {
-        npy_set_floatstatus_overflow();
+        return NPY_FPE_OVERFLOW;
      }
-    return;
+    return 0;
  }
  #endif
  /**end repeat**/
@@ -151,16 +157,16 @@ static void
   *         npy_long, npy_ulong, npy_longlong, npy_ulonglong#
   * #neg = (1,0)*5#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_divide(@type@ a, @type@ b, @type@ *out) {
      if (b == 0) {
-        npy_set_floatstatus_divbyzero();
          *out = 0;
+        return NPY_FPE_DIVIDEBYZERO;
      }
  #if @neg@
      else if (b == -1 && a < 0 && a == -a) {
-        npy_set_floatstatus_overflow();
          *out = a / b;
+        return NPY_FPE_OVERFLOW;
      }
  #endif
      else {
@@ -174,17 +180,20 @@ static void
  #else
          *out = a / b;
  #endif
+        return 0;
      }
  }
  
  #define @name@_ctype_floor_divide @name@_ctype_divide
  
-static void
+static NPY_INLINE int
  @name@_ctype_remainder(@type@ a, @type@ b, @type@ *out) {
      if (a == 0 || b == 0) {
-        if (b == 0) npy_set_floatstatus_divbyzero();
          *out = 0;
-        return;
+        if (b == 0) {
+            return NPY_FPE_DIVIDEBYZERO;
+        }
+        return 0;
      }
  #if @neg@
      else if ((a > 0) == (b > 0)) {
@@ -198,6 +207,7 @@ static void
  #else
      *out = a % b;
  #endif
+    return 0;
  }
  /**end repeat**/
  
@@ -205,10 +215,15 @@ static void
   *
   * #name = byte, ubyte, short, ushort, int, uint, long,
   *         ulong, longlong, ulonglong#
- * #otyp = npy_float*4, npy_double*6#
   */
-#define @name@_ctype_true_divide(a, b, out)     \
-    *(out) = ((@otyp@) (a)) / ((@otyp@) (b));
+
+static NPY_INLINE int
+@name@_ctype_true_divide(npy_@name@ a, npy_@name@ b, npy_double *out)
+{
+    *out = (npy_double)a / (npy_double)b;
+    return 0;
+}
+
  /**end repeat**/
  
  /* b will always be positive in this call */
@@ -221,17 +236,17 @@ static void
   * #upc = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
   *        LONG, ULONG, LONGLONG, ULONGLONG#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_power(@type@ a, @type@ b, @type@ *out) {
      @type@ tmp;
  
      if (b == 0) {
          *out = 1;
-        return;
+        return 0;
      }
      if (a == 1) {
          *out = 1;
-        return;
+        return 0;
      }
  
      tmp = b & 1 ? a : 1;
@@ -244,6 +259,7 @@ static void
          b >>= 1;
      }
      *out = tmp;
+    return 0;
  }
  /**end repeat**/
  
@@ -261,12 +277,28 @@ static void
   * #op = &, ^, |#
   */
  
-#define @name@_ctype_@oper@(arg1, arg2, out) *(out) = (arg1) @op@ (arg2)
+static NPY_INLINE int
+@name@_ctype_@oper@(@type@ arg1, @type@ arg2, @type@ *out)
+{
+    *out = arg1 @op@ arg2;
+    return 0;
+}
  
  /**end repeat1**/
  
-#define @name@_ctype_lshift(arg1, arg2, out) *(out) = npy_lshift@suffix@(arg1, arg2)
-#define @name@_ctype_rshift(arg1, arg2, out) *(out) = npy_rshift@suffix@(arg1, arg2)
+static NPY_INLINE int
+@name@_ctype_lshift(@type@ arg1, @type@ arg2, @type@ *out)
+{
+    *out = npy_lshift@suffix@(arg1, arg2);
+    return 0;
+}
+
+static NPY_INLINE int
+@name@_ctype_rshift(@type@ arg1, @type@ arg2, @type@ *out)
+{
+    *out = npy_rshift@suffix@(arg1, arg2);
+    return 0;
+}
  
  /**end repeat**/
  
@@ -275,135 +307,162 @@ static void
   * #type = npy_float, npy_double, npy_longdouble#
   * #c = f, , l#
   */
-#define @name@_ctype_add(a, b, outp) *(outp) = (a) + (b)
-#define @name@_ctype_subtract(a, b, outp) *(outp) = (a) - (b)
-#define @name@_ctype_multiply(a, b, outp) *(outp) = (a) * (b)
-#define @name@_ctype_divide(a, b, outp) *(outp) = (a) / (b)
+
+/**begin repeat1
+ * #OP = +, -, *, /#
+ * #oper = add, subtract, multiply, divide#
+ */
+
+static NPY_INLINE int
+@name@_ctype_@oper@(@type@ a, @type@ b, @type@ *out)
+{
+    *out = a @OP@ b;
+    return 0;
+}
+
+/**end repeat1**/
+
  #define @name@_ctype_true_divide @name@_ctype_divide
  
  
-static void
+static NPY_INLINE int
  @name@_ctype_floor_divide(@type@ a, @type@ b, @type@ *out) {
      *out = npy_floor_divide@c@(a, b);
+    return 0;
  }
  
  
-static void
+static NPY_INLINE int
  @name@_ctype_remainder(@type@ a, @type@ b, @type@ *out) {
      *out = npy_remainder@c@(a, b);
+    return 0;
  }
  
  
-static void
+static NPY_INLINE int
  @name@_ctype_divmod(@type@ a, @type@ b, @type@ *out1, @type@ *out2) {
      *out1 = npy_divmod@c@(a, b, out2);
+    return 0;
  }
  
  
  /**end repeat**/
  
-#define half_ctype_add(a, b, outp) *(outp) = \
-        npy_float_to_half(npy_half_to_float(a) + npy_half_to_float(b))
-#define half_ctype_subtract(a, b, outp) *(outp) = \
-        npy_float_to_half(npy_half_to_float(a) - npy_half_to_float(b))
-#define half_ctype_multiply(a, b, outp) *(outp) = \
-        npy_float_to_half(npy_half_to_float(a) * npy_half_to_float(b))
-#define half_ctype_divide(a, b, outp) *(outp) = \
-        npy_float_to_half(npy_half_to_float(a) / npy_half_to_float(b))
+/**begin repeat
+ * #OP = +, -, *, /#
+ * #oper = add, subtract, multiply, divide#
+ */
+
+static NPY_INLINE int
+half_ctype_@oper@(npy_half a, npy_half b, npy_half *out)
+{
+    float res = npy_half_to_float(a) @OP@ npy_half_to_float(b);
+    *out = npy_float_to_half(res);
+    return 0;
+}
+
+/**end repeat**/
  #define half_ctype_true_divide half_ctype_divide
  
  
-static void
-half_ctype_floor_divide(npy_half a, npy_half b, npy_half *out) {
+static NPY_INLINE int
+half_ctype_floor_divide(npy_half a, npy_half b, npy_half *out)
+{
      npy_half mod;
  
      if (!b) {
-        *out = a / b;
-    } else {
+        float res = npy_half_to_float(a) / npy_half_to_float(b);
+        *out = npy_float_to_half(res);
+    }
+    else {
          *out = npy_half_divmod(a, b, &mod);
      }
+    return 0;
  }
  
  
-static void
-half_ctype_remainder(npy_half a, npy_half b, npy_half *out) {
+static NPY_INLINE int
+half_ctype_remainder(npy_half a, npy_half b, npy_half *out)
+{
      npy_half_divmod(a, b, out);
+    return 0;
  }
  
  
-static void
-half_ctype_divmod(npy_half a, npy_half b, npy_half *out1, npy_half *out2) {
+static NPY_INLINE int
+half_ctype_divmod(npy_half a, npy_half b, npy_half *out1, npy_half *out2)
+{
      *out1 = npy_half_divmod(a, b, out2);
+    return 0;
  }
  
  /**begin repeat
   * #name = cfloat, cdouble, clongdouble#
+ * #type = npy_cfloat, npy_cdouble, npy_clongdouble#
+ * #TYPE = CFLOAT, CDOUBLE, CLONGDOUBLE#
   * #rname = float, double, longdouble#
   * #rtype = npy_float, npy_double, npy_longdouble#
   * #c = f,,l#
   */
-#define @name@_ctype_add(a, b, outp) do{        \
-    (outp)->real = (a).real + (b).real;         \
-    (outp)->imag = (a).imag + (b).imag;         \
-    } while(0)
-#define @name@_ctype_subtract(a, b, outp) do{   \
-    (outp)->real = (a).real - (b).real;         \
-    (outp)->imag = (a).imag - (b).imag;         \
-    } while(0)
-#define @name@_ctype_multiply(a, b, outp) do{                   \
-    (outp)->real = (a).real * (b).real - (a).imag * (b).imag;   \
-    (outp)->imag = (a).real * (b).imag + (a).imag * (b).real;   \
-    } while(0)
-/* Algorithm identical to that in loops.c.src, for consistency */
-#define @name@_ctype_divide(a, b, outp) do{                         \
-    @rtype@ in1r = (a).real;                                        \
-    @rtype@ in1i = (a).imag;                                        \
-    @rtype@ in2r = (b).real;                                        \
-    @rtype@ in2i = (b).imag;                                        \
-    @rtype@ in2r_abs = npy_fabs@c@(in2r);                           \
-    @rtype@ in2i_abs = npy_fabs@c@(in2i);                           \
-    if (in2r_abs >= in2i_abs) {                                     \
-        if (in2r_abs == 0 && in2i_abs == 0) {                       \
-            /* divide by zero should yield a complex inf or nan */  \
-            (outp)->real = in1r/in2r_abs;                           \
-            (outp)->imag = in1i/in2i_abs;                           \
-        }                                                           \
-        else {                                                      \
-            @rtype@ rat = in2i/in2r;                                \
-            @rtype@ scl = 1.0@c@/(in2r + in2i*rat);                 \
-            (outp)->real = (in1r + in1i*rat)*scl;                   \
-            (outp)->imag = (in1i - in1r*rat)*scl;                   \
-        }                                                           \
-    }                                                               \
-    else {                                                          \
-        @rtype@ rat = in2r/in2i;                                    \
-        @rtype@ scl = 1.0@c@/(in2i + in2r*rat);                     \
-        (outp)->real = (in1r*rat + in1i)*scl;                       \
-        (outp)->imag = (in1i*rat - in1r)*scl;                       \
-    }                                                               \
-    } while(0)
+static NPY_INLINE int
+@name@_ctype_add(@type@ a, @type@ b, @type@ *out)
+{
+    out->real = a.real + b.real;
+    out->imag = a.imag + b.imag;
+    return 0;
+}
+
+static NPY_INLINE int
+@name@_ctype_subtract(@type@ a, @type@ b, @type@ *out)
+{
+    out->real = a.real - b.real;
+    out->imag = a.imag - b.imag;
+    return 0;
+}
+
+
+/*
+ * TODO: Mark as  to work around FPEs not being issues on clang 12.
+ *       This should be removed when possible.
+ */
+static NPY_INLINE int
+@name@_ctype_multiply( @type@ a, @type@ b, @type@ *out)
+{
+    out->real = a.real * b.real - a.imag * b.imag;
+    out->imag = a.real * b.imag + a.imag * b.real;
+    return 0;
+}
+
+/* Use the ufunc loop directly to avoid duplicating the complicated logic */
+static NPY_INLINE int
+@name@_ctype_divide(@type@ a, @type@ b, @type@ *out)
+{
+    char *args[3] = {(char *)&a, (char *)&b, (char *)out};
+    npy_intp steps[3];
+    npy_intp size = 1;
+    @TYPE@_divide(args, &size, steps, NULL);
+    return 0;
+}
  
  #define @name@_ctype_true_divide @name@_ctype_divide
  
-#define @name@_ctype_floor_divide(a, b, outp) do {      \
-    @rname@_ctype_floor_divide(                         \
-        ((a).real*(b).real + (a).imag*(b).imag),        \
-        ((b).real*(b).real + (b).imag*(b).imag),        \
-        &((outp)->real));                               \
-    (outp)->imag = 0;                                   \
-    } while(0)
  /**end repeat**/
  
  
  
  /**begin repeat
   * #name = byte, ubyte, short, ushort, int, uint, long, ulong,
- *         longlong, ulonglong, cfloat, cdouble, clongdouble#
+ *         longlong, ulonglong#
   */
-#define @name@_ctype_divmod(a, b, out, out2) {  \
-    @name@_ctype_floor_divide(a, b, out);       \
-    @name@_ctype_remainder(a, b, out2);         \
-    }
+
+static NPY_INLINE int
+@name@_ctype_divmod(npy_@name@ a, npy_@name@ b, npy_@name@ *out, npy_@name@ *out2)
+{
+    int res = @name@_ctype_floor_divide(a, b, out);
+    res |= @name@_ctype_remainder(a, b, out2);
+    return res;
+}
+
  /**end repeat**/
  
  
@@ -413,20 +472,22 @@ half_ctype_divmod(npy_half a, npy_half b, npy_half *out1, npy_half *out2) {
   * #c = f,,l#
   */
  
-static void
+static NPY_INLINE int
  @name@_ctype_power(@type@ a, @type@ b, @type@ *out)
  {
      *out = npy_pow@c@(a, b);
+    return 0;
  }
  
  /**end repeat**/
-static void
+static NPY_INLINE int
  half_ctype_power(npy_half a, npy_half b, npy_half *out)
  {
      const npy_float af = npy_half_to_float(a);
      const npy_float bf = npy_half_to_float(b);
      const npy_float outf = npy_powf(af,bf);
      *out = npy_float_to_half(outf);
+    return 0;
  }
  
  /**begin repeat
@@ -438,20 +499,23 @@ half_ctype_power(npy_half a, npy_half b, npy_half *out)
   *         npy_float, npy_double, npy_longdouble#
   * #uns = (0,1)*5,0*3#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_negative(@type@ a, @type@ *out)
  {
+    *out = -a;
  #if @uns@
-    npy_set_floatstatus_overflow();
+    return NPY_FPE_OVERFLOW;
+#else
+    return 0;
  #endif
-    *out = -a;
  }
  /**end repeat**/
  
-static void
+static NPY_INLINE int
  half_ctype_negative(npy_half a, npy_half *out)
  {
      *out = a^0x8000u;
+    return 0;
  }
  
  
@@ -459,11 +523,12 @@ half_ctype_negative(npy_half a, npy_half *out)
   * #name = cfloat, cdouble, clongdouble#
   * #type = npy_cfloat, npy_cdouble, npy_clongdouble#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_negative(@type@ a, @type@ *out)
  {
      out->real = -a.real;
      out->imag = -a.imag;
+    return 0;
  }
  /**end repeat**/
  
@@ -475,10 +540,11 @@ static void
   *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
   *         npy_half, npy_float, npy_double, npy_longdouble#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_positive(@type@ a, @type@ *out)
  {
      *out = a;
+    return 0;
  }
  /**end repeat**/
  
@@ -487,17 +553,19 @@ static void
   * #type = npy_cfloat, npy_cdouble, npy_clongdouble#
   * #c = f,,l#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_positive(@type@ a, @type@ *out)
  {
      out->real = a.real;
      out->imag = a.imag;
+    return 0;
  }
  
-static void
+static NPY_INLINE int
  @name@_ctype_power(@type@ a, @type@ b, @type@ *out)
  {
      *out = npy_cpow@c@(a, b);
+    return 0;
  }
  /**end repeat**/
  
@@ -515,10 +583,11 @@ static void
   * #name = byte, short, int, long, longlong#
   * #type = npy_byte, npy_short, npy_int, npy_long, npy_longlong#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_absolute(@type@ a, @type@ *out)
  {
      *out = (a < 0 ? -a : a);
+    return 0;
  }
  /**end repeat**/
  
@@ -527,17 +596,19 @@ static void
   * #type = npy_float, npy_double, npy_longdouble#
   * #c = f,,l#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_absolute(@type@ a, @type@ *out)
  {
      *out = npy_fabs@c@(a);
+    return 0;
  }
  /**end repeat**/
  
-static void
+static NPY_INLINE int
  half_ctype_absolute(npy_half a, npy_half *out)
  {
      *out = a&0x7fffu;
+    return 0;
  }
  
  /**begin repeat
@@ -546,10 +617,11 @@ half_ctype_absolute(npy_half a, npy_half *out)
   * #rtype = npy_float, npy_double, npy_longdouble#
   * #c = f,,l#
   */
-static void
+static NPY_INLINE int
  @name@_ctype_absolute(@type@ a, @rtype@ *out)
  {
      *out = npy_cabs@c@(a);
+    return 0;
  }
  /**end repeat**/
  
@@ -558,196 +630,476 @@ static void
   *         ulong, longlong, ulonglong#
   */
  
-#define @name@_ctype_invert(a, out) *(out) = ~a;
+static NPY_INLINE int
+@name@_ctype_invert(npy_@name@ a, npy_@name@ *out)
+{
+    *out = ~a;
+    return 0;
+}
  
  /**end repeat**/
  
  /*** END OF BASIC CODE **/
  
  
-/* The general strategy for commutative binary operators is to
+/*
+ * How binary operators work
+ * -------------------------
+ *
+ * All binary (numeric) operators use the larger of the two types, with the
+ * exception of unsigned int and signed int mixed cases which must promote
+ * to a larger type.
+ *
+ * The strategy employed for all binary operations is that we coerce the other
+ * scalar if it is safe to do so.  E.g. for `float64 + float32`, the `float64`
+ * can convert the `float32` and do the operation as `float64 + float64`.
+ * OTOH, for `float32 + float64` the coercion is not safe, so we defer to `float64`.
+ *
+ * So we have multiple possible paths:
+ * - The other scalar is a subclass.  In principle *both* inputs could be
+ *   different subclasses.  In this case it would make sense to defer, but
+ *   Python's `int` does not try this either, so we do not here:
+ *
+ *      class A(int): pass
+ *      class B(int):
+ *          def __add__(self, other): return "b"
+ *          __radd__ = __add__
+ *
+ *      A(1) + B(1)  # return 2
+ *      B(1) + A(1)  # return "b"
+ *
+ * - The other scalar can be converted:  All is good, we do the operation
+ * - The other scalar cannot be converted, there are two possibilities:
+ *   - The reverse should work, so we return NotImplemented to defer.
+ *     (If self is a subclass, this will end up in the "unknown" path.)
+ *   - Neither works (e.g. `uint8 + int8`):  We currently use the array path.
+ * - The other object is unknown.  It could be either a scalar, an array,
+ *   or an array-like (including a list!).  Because NumPy scalars pretend to be
+ *   arrays we fall into the array fallback path here _normally_ (through
+ *   the generic scalar path).
+ *   First we check if we should defer, though.
+ *
+ * The last possibility is awkward and leads to very confusing situations.
+ * The problem is that usually we should defer (return NotImplemented)
+ * in that path.
+ * If the other object is a NumPy array (or array-like) it will know what to
+ * do.  If NumPy knows that it is a scalar (not generic `object`), then it
+ * would make sense to try and use the "array path" (i.e. deal with it
+ * using the ufunc machinery).
+ *
+ * But this overlooks two things that currently work:
+ *
+ * 1. `np.float64(3) * [1, 2, 3]`  happily returns an array result.
+ * 2. `np.int32(3) * decimal.Decimal(3)` works!  (see below)
+ *
+ * The first must work, because scalars pretend to be arrays.  Which means
+ * they inherit the greedy "convert the other object to an array" logic.
+ * This may be a questionable choice, but is fine.
+ * (As of now, it is not negotiable, since NumPy often converts 0-D arrays
+ * to scalars.)
+ *
+ * The second one is more confusing.  This works also by using the ufunc
+ * machinery (array path), but it works because:
   *
- * 1) Convert the types to the common type if both are scalars (0 return)
- * 2) If both are not scalars use ufunc machinery (-2 return)
- * 3) If both are scalars but cannot be cast to the right type
- * return NotImplemented (-1 return)
+ *     np.add(np.int32(3), decimal.Decimal(3))
   *
- * 4) Perform the function on the C-type.
- * 5) If an error condition occurred, check to see
- * what the current error-handling is and handle the error.
+ * Will convert the `int32` to an int32 array, and the decimal to an object
+ * array.  It then *casts* the `int32` array to an object array.
+ * The casting step CONVERTS the integer to a Python integer.  The ufunc object
+ * loop will then call back into Python scalar logic.
   *
- * 6) Construct and return the output scalar.
+ * The above would be recursive, if it was not for the conversion of the int32
+ * to a Python integer!
+ * This leads us to the EXCEEDINGLY IMPORTANT special case:
+ *
+ * WARNING: longdouble and clongdouble do NOT convert to a Python scalar
+ *          when cast to object.  Thus they MUST NEVER take the array-path.
+ *          However, they STILL should defer at least for
+ *          `np.longdouble(3) + array`.
+ *
+ *
+ * As a general note, in the above we defer exactly when we know that deferring
+ * will work.  `longdouble` uses the "simple" logic of generally deferring
+ * though, because it would otherwise easily run into an infinite recursion.
+ *
+ *
+ * The future?!
+ * ------------
+ *
+ * This is very tricky and it would be nice to formalize away that "recursive"
+ * path we currently use.  I (seberg) have currently no great idea on this,
+ * this is more brainstorming!
+ *
+ * If both are scalars (known to NumPy), they have a DType and we may be able
+ * to do the ufunc promotion to make sure there is no risk of recursion.
+ *
+ * In principle always deferring would probably be clean.  But we likely cannot
+ * do that?  There is also an issue that it is nice that we allow adding a
+ * DType for an existing Python scalar (which will not know about NumPy
+ * scalars).
+ * The DType/ufunc machinery teaches NumPy how arrays will work with that
+ * Python scalar, but the DType may need to help us decide whether we should
+ * defer (return NotImplemented) or try using the ufunc machinery (or a
+ * simplified ufunc-like machinery limited to scalars).
+ */
+
+
+/*
+ * Enum used to describe the space of possibilities when converting the second
+ * argument to a binary operation.
+ * Any of these flags may be combined with the return flag of
+ * `may_need_deferring` indicating that the other is any type of object which
+ * may e.g. define an `__array_priority__`.
   */
+typedef enum {
+    /* An error occurred (should not really happen/be possible) */
+    CONVERSION_ERROR = -1,
+    /* A known NumPy scalar, but of higher precision: we defer */
+    DEFER_TO_OTHER_KNOWN_SCALAR,
+    /*
+     * Conversion was successful (known scalar of less precision).  Note that
+     * the other value may still be a subclass of such a scalar so even here
+     * we may have to check for deferring.
+     * More specialized subclass handling, which defers based on whether the
+     * subclass has an implementation, is plausible but complicated.
+     * We do not do it, as even CPython does not do it for the builtin `int`.
+     */
+    CONVERSION_SUCCESS,
+    /*
+     * Other object is an unknown scalar or array-like, we (typically) use
+     * the generic path, which normally ends up in the ufunc machinery.
+     */
+    OTHER_IS_UNKNOWN_OBJECT,
+    /*
+     * Promotion necessary
+     */
+    PROMOTION_REQUIRED,
+} conversion_result;
  
  /**begin repeat
   * #name = byte, ubyte, short, ushort, int, uint,
   *         long, ulong, longlong, ulonglong,
- *         half, float, longdouble,
+ *         half, float, double, longdouble,
   *         cfloat, cdouble, clongdouble#
- * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
- *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_half, npy_float, npy_longdouble,
- *         npy_cfloat, npy_cdouble, npy_clongdouble#
   * #Name = Byte, UByte, Short, UShort, Int, UInt,
   *         Long, ULong, LongLong, ULongLong,
- *         Half, Float, LongDouble,
+ *         Half, Float, Double, LongDouble,
   *         CFloat, CDouble, CLongDouble#
- * #TYPE = NPY_BYTE, NPY_UBYTE, NPY_SHORT, NPY_USHORT, NPY_INT, NPY_UINT,
- *         NPY_LONG, NPY_ULONG, NPY_LONGLONG, NPY_ULONGLONG,
- *         NPY_HALF, NPY_FLOAT, NPY_LONGDOUBLE,
- *         NPY_CFLOAT, NPY_CDOUBLE, NPY_CLONGDOUBLE#
+ * #TYPE = BYTE, UBYTE, SHORT, USHORT, INT, UINT,
+ *         LONG, ULONG, LONGLONG, ULONGLONG,
+ *         HALF, FLOAT, DOUBLE, LONGDOUBLE,
+ *         CFLOAT, CDOUBLE, CLONGDOUBLE#
+ * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
+ *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
+ *         npy_half, npy_float, npy_double, npy_longdouble,
+ *         npy_cfloat, npy_cdouble, npy_clongdouble#
   */
  
-static int
-_@name@_convert_to_ctype(PyObject *a, @type@ *arg1)
-{
-    PyObject *temp;
+#define IS_@TYPE@ 1
  
-    if (PyArray_IsScalar(a, @Name@)) {
-        *arg1 = PyArrayScalar_VAL(a, @Name@);
-        return 0;
-    }
-    else if (PyArray_IsScalar(a, Generic)) {
-        PyArray_Descr *descr1;
+#define IS_SAFE(FROM, TO) _npy_can_cast_safely_table[FROM][TO]
  
-        if (!PyArray_IsScalar(a, Number)) {
-            return -1;
-        }
-        descr1 = PyArray_DescrFromTypeObject((PyObject *)Py_TYPE(a));
-        if (PyArray_CanCastSafely(descr1->type_num, @TYPE@)) {
-            PyArray_CastScalarDirect(a, descr1, arg1, @TYPE@);
-            Py_DECREF(descr1);
-            return 0;
-        }
-        else {
-            Py_DECREF(descr1);
-            return -1;
-        }
-    }
-    else if (PyArray_GetPriority(a, NPY_PRIORITY) > NPY_PRIORITY) {
-        return -2;
-    }
-    else if ((temp = PyArray_ScalarFromObject(a)) != NULL) {
-        int retval = _@name@_convert_to_ctype(temp, arg1);
-
-        Py_DECREF(temp);
-        return retval;
-    }
-    return -2;
-}
+/*
+ * TODO: This whole thing is awkward, and we should create a helper header to
+ *       define inline functions that convert single elements for all numeric
+ *       types.  That could then also be used to define all cast loops.
+ *       (Even if that may get more complex for SIMD at some point.)
+ *       For now, half casts could be optimized because of that.
+ */
  
-/**end repeat**/
+#if defined(IS_HALF)
+    #define CONVERT_TO_RESULT(value)  \
+        *result = npy_float_to_half((float)(value))
+#elif defined(IS_CFLOAT) || defined(IS_CDOUBLE) || defined(IS_CLONGDOUBLE)
+    #define CONVERT_TO_RESULT(value)  \
+        result->real = value;  \
+        result->imag = 0
+#else
+    #define CONVERT_TO_RESULT(value) *result = value
+#endif
  
  
-/* Same as above but added exact checks against known python types for speed */
+#define GET_VALUE_OR_DEFER(OTHER, Other, value)  \
+    case NPY_##OTHER:  \
+        if (IS_SAFE(NPY_##OTHER, NPY_@TYPE@)) {  \
+            CONVERT_TO_RESULT(PyArrayScalar_VAL(value, Other));  \
+            ret = CONVERSION_SUCCESS;  \
+        }  \
+        else if (IS_SAFE(NPY_@TYPE@, NPY_##OTHER)) {  \
+            /*
+             * If self can cast safely to other, this is clear:
+             * we should definitely defer.
+             */  \
+             ret = DEFER_TO_OTHER_KNOWN_SCALAR;  \
+        }  \
+        else {  \
+            /* Otherwise, we must promote */  \
+            ret = PROMOTION_REQUIRED;  \
+        }  \
+        break;
  
-/**begin repeat
- * #name = double#
- * #type = npy_double#
- * #Name = Double#
- * #TYPE = NPY_DOUBLE#
- * #PYCHECKEXACT = PyFloat_CheckExact#
- * #PYEXTRACTCTYPE = PyFloat_AS_DOUBLE#
+/*
+ * Complex to complex (and rejecting complex to real) is a bit different:
   */
  
-static int
-_@name@_convert_to_ctype(PyObject *a, @type@ *arg1)
+#if defined(IS_CFLOAT) || defined(IS_CDOUBLE) || defined(IS_CLONGDOUBLE)
+
+#define GET_CVALUE_OR_DEFER(OTHER, Other, value)  \
+    case NPY_##OTHER:  \
+        if (IS_SAFE(NPY_##OTHER, NPY_@TYPE@)) {  \
+            assert(Py_TYPE(value) == &Py##Other##ArrType_Type);  \
+            result->real = PyArrayScalar_VAL(value, Other).real;  \
+            result->imag = PyArrayScalar_VAL(value, Other).imag;  \
+            ret = CONVERSION_SUCCESS;  \
+        }  \
+        else if (IS_SAFE(NPY_@TYPE@, NPY_##OTHER)) {  \
+             ret = DEFER_TO_OTHER_KNOWN_SCALAR;  \
+        }  \
+        else {  \
+            ret = PROMOTION_REQUIRED;  \
+        }  \
+        break;
+
+#else
+
+/* Getting a complex value to real is never safe: */
+#define GET_CVALUE_OR_DEFER(OTHER, Other, value)  \
+    case NPY_##OTHER:  \
+        if (IS_SAFE(NPY_@TYPE@, NPY_##OTHER)) {  \
+            ret = DEFER_TO_OTHER_KNOWN_SCALAR;  \
+        }  \
+        else {  \
+            ret = PROMOTION_REQUIRED;  \
+        }  \
+        break;
+
+#endif
+
+
+/**
+ * Convert the value to the own type and store the result.
+ *
+ * @param value The value to convert (if compatible)
+ * @param result The result value (output)
+ * @param may_need_deferring Set to `NPY_TRUE` when the caller must check
+ *        `BINOP_GIVE_UP_IF_NEEDED` (or similar) due to possible implementation
+ *        of `__array_priority__` (or similar).
+ *        This is set for unknown objects and all subclasses even when they
+ *        can be handled.
+ * @result The result value indicating what we did with `value` or what type
+ *         of object it is (see `conversion_result`).
+ */
+static NPY_INLINE conversion_result
+convert_to_@name@(PyObject *value, @type@ *result, npy_bool *may_need_deferring)
  {
-    PyObject *temp;
+    PyArray_Descr *descr;
+    *may_need_deferring = NPY_FALSE;
  
-    if (@PYCHECKEXACT@(a)){
-        *arg1 = @PYEXTRACTCTYPE@(a);
-        return 0;
+    if (Py_TYPE(value) == &Py@Name@ArrType_Type) {
+        *result = PyArrayScalar_VAL(value, @Name@);
+        return CONVERSION_SUCCESS;
+    }
+    /* Optimize the identical scalar specifically. */
+    if (PyArray_IsScalar(value, @Name@)) {
+        *result = PyArrayScalar_VAL(value, @Name@);
+        /*
+         * In principle special, asymmetric, handling could be possible for
+         * explicit subclasses.
+         * In practice, we just check the normal deferring logic.
+         */
+        *may_need_deferring = NPY_TRUE;
+        return CONVERSION_SUCCESS;
      }
  
-    if (PyArray_IsScalar(a, @Name@)) {
-        *arg1 = PyArrayScalar_VAL(a, @Name@);
-        return 0;
+    /*
+     * Then we check for the basic Python types float, int, and complex.
+     * (this is a bit tedious to do right for complex).
+     */
+    if (PyBool_Check(value)) {
+        CONVERT_TO_RESULT(value == Py_True);
+        return CONVERSION_SUCCESS;
      }
-    else if (PyArray_IsScalar(a, Generic)) {
-        PyArray_Descr *descr1;
  
-        if (!PyArray_IsScalar(a, Number)) {
-            return -1;
-        }
-        descr1 = PyArray_DescrFromTypeObject((PyObject *)Py_TYPE(a));
-        if (PyArray_CanCastSafely(descr1->type_num, @TYPE@)) {
-            PyArray_CastScalarDirect(a, descr1, arg1, @TYPE@);
-            Py_DECREF(descr1);
-            return 0;
+    if (PyFloat_Check(value)) {
+        if (!PyFloat_CheckExact(value)) {
+            /* A NumPy double is a float subclass, but special. */
+            if (PyArray_IsScalar(value, Double)) {
+                descr = PyArray_DescrFromType(NPY_DOUBLE);
+                goto numpy_scalar;
+            }
+            *may_need_deferring = NPY_TRUE;
          }
-        else {
-            Py_DECREF(descr1);
-            return -1;
+        if (!IS_SAFE(NPY_DOUBLE, NPY_@TYPE@)) {
+            return PROMOTION_REQUIRED;
          }
+        CONVERT_TO_RESULT(PyFloat_AS_DOUBLE(value));
+        return CONVERSION_SUCCESS;
      }
-    else if (PyArray_GetPriority(a, NPY_PRIORITY) > NPY_PRIORITY) {
-        return -2;
-    }
-    else if ((temp = PyArray_ScalarFromObject(a)) != NULL) {
-        int retval = _@name@_convert_to_ctype(temp, arg1);
  
-        Py_DECREF(temp);
-        return retval;
+    if (PyLong_Check(value)) {
+        if (!PyLong_CheckExact(value)) {
+            *may_need_deferring = NPY_TRUE;
+        }
+        if (!IS_SAFE(NPY_LONG, NPY_@TYPE@)) {
+            /*
+             * long -> (c)longdouble is safe, so `OTHER_IS_UNKNOWN_OBJECT` will
+             * be returned below for huge integers.
+             */
+            return PROMOTION_REQUIRED;
+        }
+        int overflow;
+        long val = PyLong_AsLongAndOverflow(value, &overflow);
+        if (overflow) {
+            return OTHER_IS_UNKNOWN_OBJECT;  /* handle as if arbitrary object */
+        }
+        if (error_converting(val)) {
+            return CONVERSION_ERROR;  /* should not be possible */
+        }
+        CONVERT_TO_RESULT(val);
+        return CONVERSION_SUCCESS;
      }
-    return -2;
-}
-
-/**end repeat**/
-
  
-/**begin repeat
- * #name = byte, ubyte, short, ushort, int, uint,
- *         long, ulong, longlong, ulonglong,
- *         half, float, double, cfloat, cdouble#
- * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
- *         npy_long, npy_ulong, npy_longlong, npy_ulonglong,
- *         npy_half, npy_float, npy_double, npy_cfloat, npy_cdouble#
- */
-static int
-_@name@_convert2_to_ctypes(PyObject *a, @type@ *arg1,
-                           PyObject *b, @type@ *arg2)
-{
-    int ret;
-    ret = _@name@_convert_to_ctype(a, arg1);
-    if (ret < 0) {
-        return ret;
-    }
-    ret = _@name@_convert_to_ctype(b, arg2);
-    if (ret < 0) {
-        return ret;
+    if (PyComplex_Check(value)) {
+        if (!PyComplex_CheckExact(value)) {
+            /* A NumPy complex double is a complex subclass, but special. */
+            if (PyArray_IsScalar(value, CDouble)) {
+                descr = PyArray_DescrFromType(NPY_CDOUBLE);
+                goto numpy_scalar;
+            }
+            *may_need_deferring = NPY_TRUE;
+        }
+        if (!IS_SAFE(NPY_CDOUBLE, NPY_@TYPE@)) {
+            return PROMOTION_REQUIRED;
+        }
+#if defined(IS_CFLOAT) || defined(IS_CDOUBLE) || defined(IS_CLONGDOUBLE)
+        Py_complex val = PyComplex_AsCComplex(value);
+        if (error_converting(val.real)) {
+            return CONVERSION_ERROR;  /* should not be possible */
+        }
+        result->real = val.real;
+        result->imag = val.imag;
+        return CONVERSION_SUCCESS;
+#else
+        /* unreachable, always unsafe cast above; return to avoid warning */
+        assert(0);
+        return OTHER_IS_UNKNOWN_OBJECT;
+#endif  /* defined(IS_CFLOAT) || ... */
      }
-    return 0;
-}
-/**end repeat**/
  
-/**begin repeat
- * #name = longdouble, clongdouble#
- * #type = npy_longdouble, npy_clongdouble#
- */
-
-static int
-_@name@_convert2_to_ctypes(PyObject *a, @type@ *arg1,
-                           PyObject *b, @type@ *arg2)
-{
-    int ret;
-    ret = _@name@_convert_to_ctype(a, arg1);
-    if (ret == -2) {
-        ret = -3;
+    /*
+     * (seberg) It would be nice to use `PyArray_DiscoverDTypeFromScalarType`
+     * from array coercion here.  OTOH, the array coercion code also falls
+     * back to this code.  The issue is around how subclasses should work...
+     *
+     * It would be nice to try to fully align the paths again (they effectively
+     * are equivalent).  Proper support for subclasses is in general tricky,
+     * and it would make more sense to just _refuse_ to support them.
+     * However, it is unclear that this is a viable option...
+     */
+    if (!PyArray_IsScalar(value, Generic)) {
+        /*
+         * The input is an unknown python object.  This should probably defer
+         * but only does so for float128.
+         * For all other cases, we defer to the array logic.  If the object
+         * is indeed not an array-like, this will end up converting the NumPy
+         * scalar to a Python scalar and then try again.
+         * The logic is that the ufunc casts the input to object, which does
+         * the conversion.
+         * If the object is an array, deferring will always kick in.
+         */
+        *may_need_deferring = NPY_TRUE;
+        return OTHER_IS_UNKNOWN_OBJECT;
      }
-    if (ret < 0) {
-        return ret;
+
+    descr = PyArray_DescrFromScalar(value);
+    if (descr == NULL) {
+        if (PyErr_Occurred()) {
+            return CONVERSION_ERROR;
+        }
+        /* Should not happen, but may be possible with bad user subclasses */
+        *may_need_deferring = NPY_TRUE;
+        return OTHER_IS_UNKNOWN_OBJECT;
      }
-    ret = _@name@_convert_to_ctype(b, arg2);
-    if (ret == -2) {
-        ret = -3;
+
+  numpy_scalar:
+    if (descr->typeobj != Py_TYPE(value)) {
+        /*
+         * This is a subclass of a builtin type, we may continue normally,
+         * but should check whether we need to defer.
+         */
+        *may_need_deferring = NPY_TRUE;
      }
-    if (ret < 0) {
-        return ret;
+
+    /*
+     * Otherwise, we have a clear NumPy scalar, find if it is a compatible
+     * builtin scalar.
+     * Each `GET_VALUE_OR_DEFER` represents a case clause for its type number,
+     * extracting the value if it is safe and otherwise deferring.
+     * (Safety is known at compile time, so the switch statement should be
+     * simplified by the compiler accordingly.)
+     * If we have a scalar that is not listed or not safe, we defer to it.
+     *
+     * We should probably defer more aggressively, but that is too big a change,
+     * since it would disable `np.float64(1.) * [1, 2, 3, 4]`.
+     */
+    int ret;  /* set by the GET_VALUE_OR_DEFER macro */
+    switch (descr->type_num) {
+        GET_VALUE_OR_DEFER(BOOL, Bool, value);
+        /* UInts */
+        GET_VALUE_OR_DEFER(UBYTE, UByte, value);
+        GET_VALUE_OR_DEFER(USHORT, UShort, value);
+        GET_VALUE_OR_DEFER(UINT, UInt, value);
+        GET_VALUE_OR_DEFER(ULONG, ULong, value);
+        GET_VALUE_OR_DEFER(ULONGLONG, ULongLong, value);
+        /* Ints */
+        GET_VALUE_OR_DEFER(BYTE, Byte, value);
+        GET_VALUE_OR_DEFER(SHORT, Short, value);
+        GET_VALUE_OR_DEFER(INT, Int, value);
+        GET_VALUE_OR_DEFER(LONG, Long, value);
+        GET_VALUE_OR_DEFER(LONGLONG, LongLong, value);
+        /* Floats */
+        case NPY_HALF:
+            if (IS_SAFE(NPY_HALF, NPY_@TYPE@)) {
+                CONVERT_TO_RESULT(npy_half_to_float(PyArrayScalar_VAL(value, Half)));
+                ret = CONVERSION_SUCCESS;
+            }
+            else if (IS_SAFE(NPY_@TYPE@, NPY_HALF)) {
+                ret = DEFER_TO_OTHER_KNOWN_SCALAR;
+            }
+            else {
+                ret = PROMOTION_REQUIRED;
+            }
+            break;
+        GET_VALUE_OR_DEFER(FLOAT, Float, value);
+        GET_VALUE_OR_DEFER(DOUBLE, Double, value);
+        GET_VALUE_OR_DEFER(LONGDOUBLE, LongDouble, value);
+        /* Complex: We should still defer, but the code won't work... */
+        GET_CVALUE_OR_DEFER(CFLOAT, CFloat, value);
+        GET_CVALUE_OR_DEFER(CDOUBLE, CDouble, value);
+        GET_CVALUE_OR_DEFER(CLONGDOUBLE, CLongDouble, value);
+        default:
+            /*
+             * If there is no match, this is an unknown scalar object.  It
+             * would make sense to defer generously here, but it should also
+             * always be safe to use the array path.
+             * The issue is, that the other scalar may or may not be designed
+             * to deal with NumPy scalars.  Without knowing that, we cannot
+             * defer (which would be much faster potentially).
+             * TODO: We could add a DType flag to allow opting in to deferring!
+             */
+            *may_need_deferring = NPY_TRUE;
+            ret = OTHER_IS_UNKNOWN_OBJECT;
      }
-    return 0;
+    Py_DECREF(descr);
+    return ret;
  }
  
+#undef IS_SAFE
+#undef CONVERT_TO_RESULT
+#undef GET_VALUE_OR_DEFER
+#undef GET_CVALUE_OR_DEFER
+#undef IS_@TYPE@
+
  /**end repeat**/
  
  
@@ -756,102 +1108,162 @@ _@name@_convert2_to_ctypes(PyObject *a, @type@ *arg1,
   * #name = (byte, ubyte, short, ushort, int, uint,
   *             long, ulong, longlong, ulonglong)*12,
   *         (half, float, double, longdouble,
- *             cfloat, cdouble, clongdouble)*5,
- *         (half, float, double, longdouble)*2#
+ *             cfloat, cdouble, clongdouble)*4,
+ *         (half, float, double, longdouble)*3#
   * #Name = (Byte, UByte, Short, UShort, Int, UInt,
   *             Long, ULong,LongLong,ULongLong)*12,
   *         (Half, Float, Double, LongDouble,
- *             CFloat, CDouble, CLongDouble)*5,
- *         (Half, Float, Double, LongDouble)*2#
+ *             CFloat, CDouble, CLongDouble)*4,
+ *         (Half, Float, Double, LongDouble)*3#
   * #type = (npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
   *             npy_long, npy_ulong, npy_longlong, npy_ulonglong)*12,
   *         (npy_half, npy_float, npy_double, npy_longdouble,
- *             npy_cfloat, npy_cdouble, npy_clongdouble)*5,
- *         (npy_half, npy_float, npy_double, npy_longdouble)*2#
+ *             npy_cfloat, npy_cdouble, npy_clongdouble)*4,
+ *         (npy_half, npy_float, npy_double, npy_longdouble)*3#
   *
   * #oper = add*10, subtract*10, multiply*10, remainder*10,
   *         divmod*10, floor_divide*10, lshift*10, rshift*10, and*10,
   *         or*10, xor*10, true_divide*10,
- *         add*7, subtract*7, multiply*7, floor_divide*7, true_divide*7,
- *         divmod*4, remainder*4#
+ *         add*7, subtract*7, multiply*7, true_divide*7,
+ *         floor_divide*4, divmod*4, remainder*4#
   *
- * #fperr = 1*60,0*50,1*10,
- *          1*35,
- *          1*8#
+ * #fperr = 0*110, 1*10,
+ *          1*28, 1*12#
   * #twoout = 0*40,1*10,0*70,
- *           0*35,
- *           1*4,0*4#
+ *           0*28,
+ *           0*4,1*4,0*4#
   * #otype = (npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
   *             npy_long, npy_ulong, npy_longlong, npy_ulonglong)*11,
- *         npy_float*4, npy_double*6,
+ *         npy_double*10,
   *         (npy_half, npy_float, npy_double, npy_longdouble,
- *             npy_cfloat, npy_cdouble, npy_clongdouble)*5,
- *         (npy_half, npy_float, npy_double, npy_longdouble)*2#
+ *             npy_cfloat, npy_cdouble, npy_clongdouble)*4,
+ *         (npy_half, npy_float, npy_double, npy_longdouble)*3#
   * #OName = (Byte, UByte, Short, UShort, Int, UInt,
   *              Long, ULong, LongLong, ULongLong)*11,
- *          Float*4, Double*6,
+ *          Double*10,
   *          (Half, Float, Double, LongDouble,
- *              CFloat, CDouble, CLongDouble)*5,
- *          (Half, Float, Double, LongDouble)*2#
+ *              CFloat, CDouble, CLongDouble)*4,
+ *          (Half, Float, Double, LongDouble)*3#
   */
+#define IS_@name@
  
  static PyObject *
  @name@_@oper@(PyObject *a, PyObject *b)
  {
      PyObject *ret;
-    @type@ arg1, arg2;
-    @otype@ out;
-
-#if @twoout@
-    @otype@ out2;
-    PyObject *obj;
-#endif
+    @type@ arg1, arg2, other_val;
  
-#if @fperr@
-    int retstatus;
-    int first;
-#endif
+    /*
+     * Check if this operation may be considered forward.  Note `is_forward`
+     * does not imply that we can defer to a subclass `b`.  It just means that
+     * the first operand fits the method.
+     */
+    int is_forward;
+    if (Py_TYPE(a) == &Py@Name@ArrType_Type) {
+        is_forward = 1;
+    }
+    else if (Py_TYPE(b) == &Py@Name@ArrType_Type) {
+        is_forward = 0;
+    }
+    else {
+        /* subclasses are involved */
+        is_forward = PyArray_IsScalar(a, @Name@);
+        assert(is_forward || PyArray_IsScalar(b, @Name@));
+    }
  
-    BINOP_GIVE_UP_IF_NEEDED(a, b, nb_@oper@, @name@_@oper@);
+    /*
+     * Extract the other value (if it is compatible).  Otherwise, decide
+     * how to deal with it.  This is somewhat complicated.
+     *
+     * Note: This pattern is used multiple times below.
+     */
+    PyObject *other = is_forward ? b : a;
  
-    switch(_@name@_convert2_to_ctypes(a, &arg1, b, &arg2)) {
-        case 0:
-            break;
-        case -1:
-            /* one of them can't be cast safely must be mixed-types*/
-            return PyArray_Type.tp_as_number->nb_@oper@(a,b);
-        case -2:
-            /* use default handling */
-            if (PyErr_Occurred()) {
-                return NULL;
-            }
-            return PyGenericArrType_Type.tp_as_number->nb_@oper@(a,b);
-        case -3:
+    npy_bool may_need_deferring;
+    conversion_result res = convert_to_@name@(
+            other, &other_val, &may_need_deferring);
+    if (res == CONVERSION_ERROR) {
+        return NULL;  /* an error occurred (should never happen) */
+    }
+    if (may_need_deferring) {
+        BINOP_GIVE_UP_IF_NEEDED(a, b, nb_@oper@, @name@_@oper@);
+    }
+    switch (res) {
+        case DEFER_TO_OTHER_KNOWN_SCALAR:
              /*
-             * special case for longdouble and clongdouble
-             * because they have a recursive getitem in their dtype
+             * defer to other;  This is normally a forward operation.  However,
+             * it could be backward if an operation is undefined forward.
+             * An example is the complex remainder: `complex % bool` will defer
+             * even though it would normally handle the operation.
               */
-            Py_INCREF(Py_NotImplemented);
-            return Py_NotImplemented;
+            Py_RETURN_NOTIMPLEMENTED;
+        case CONVERSION_SUCCESS:
+            break;  /* successfully extracted the value, we can proceed */
+        case OTHER_IS_UNKNOWN_OBJECT:
+            /*
+             * Either an array-like, unknown scalar (any Python object, but
+             * also integers that are too large to convert to `long`), or
+             * even a subclass of a NumPy scalar (currently).
+             *
+             * Generally, we try dropping through to the array path here,
+             * but this can lead to infinite recursions for (c)longdouble.
+             */
+#if defined(IS_longdouble) || defined(IS_clongdouble)
+            Py_RETURN_NOTIMPLEMENTED;
+#endif
+        case PROMOTION_REQUIRED:
+            /*
+             * Python scalar that is larger than the current one, or two
+             * NumPy scalars that promote to a third (uint16 + int16 -> int32).
+             *
+             * TODO: We could special case the promotion case here for much
+             *       better speed and to deal with integer overflow warnings
+             *       correctly.  (e.g. `uint8 * int8` cannot warn).
+             */
+            return PyGenericArrType_Type.tp_as_number->nb_@oper@(a,b);
+        default:
+            assert(0);  /* error was checked already, impossible to reach */
+            return NULL;
      }
  
  #if @fperr@
-    npy_clear_floatstatus_barrier((char*)&out);
+    npy_clear_floatstatus_barrier((char*)&arg1);
+#endif
+    if (is_forward) {
+        arg1 = PyArrayScalar_VAL(a, @Name@);
+        arg2 = other_val;
+    }
+    else {
+        arg1 = other_val;
+        arg2 = PyArrayScalar_VAL(b, @Name@);
+    }
+
+    /*
+     * Prepare the actual calculation.
+     */
+    @otype@ out;
+#if @twoout@
+    @otype@ out2;
+    PyObject *obj;
  #endif
  
      /*
       * here we do the actual calculation with arg1 and arg2
       * as a function call.
+     * Note that `retstatus` is the "floating point error" value for integer
+     * functions.  Float functions should always return 0, and then use
+     * the following `npy_get_floatstatus_barrier`.
       */
  #if @twoout@
-    @name@_ctype_@oper@(arg1, arg2, (@otype@ *)&out, &out2);
+    int retstatus = @name@_ctype_@oper@(arg1, arg2, &out, &out2);
  #else
-    @name@_ctype_@oper@(arg1, arg2, (@otype@ *)&out);
+    int retstatus = @name@_ctype_@oper@(arg1, arg2, &out);
  #endif
  
  #if @fperr@
      /* Check status flag.  If it is set, then look up what to do */
-    retstatus = npy_get_floatstatus_barrier((char*)&out);
+    retstatus |= npy_get_floatstatus_barrier((char*)&out);
+#endif
      if (retstatus) {
          int bufsize, errmask;
          PyObject *errobj;
@@ -860,14 +1272,13 @@ static PyObject *
                                  &errobj) < 0) {
              return NULL;
          }
-        first = 1;
+        int first = 1;
          if (PyUFunc_handlefperr(errmask, errobj, retstatus, &first)) {
              Py_XDECREF(errobj);
              return NULL;
          }
          Py_XDECREF(errobj);
      }
-#endif
  
  
  #if @twoout@
@@ -899,6 +1310,9 @@ static PyObject *
      return ret;
  }
  
+
+#undef IS_@name@
+
  /**end repeat**/
  
  #define _IS_ZERO(x) (x == 0)
@@ -920,17 +1334,6 @@ static PyObject *
   *         Half, Float, Double, LongDouble,
   *         CFloat, CDouble, CLongDouble#
   *
- * #oname = float*4, double*6, half, float, double, longdouble,
- *          cfloat, cdouble, clongdouble#
- *
- * #otype = npy_float*4, npy_double*6, npy_half, npy_float,
- *          npy_double, npy_longdouble,
- *          npy_cfloat, npy_cdouble, npy_clongdouble#
- *
- * #OName = Float*4, Double*6, Half, Float,
- *          Double, LongDouble,
- *          CFloat, CDouble, CLongDouble#
- *
   * #isint = 1*10,0*7#
   * #isuint = (0,1)*5,0*7#
   * #cmplx = 0*14,1*3#
@@ -938,46 +1341,80 @@ static PyObject *
   * #zero = 0*10, NPY_HALF_ZERO, 0*6#
   * #one = 1*10, NPY_HALF_ONE, 1*6#
   */
+#define IS_@name@
  
  static PyObject *
  @name@_power(PyObject *a, PyObject *b, PyObject *modulo)
  {
-    PyObject *ret;
-    @type@ arg1, arg2, out;
-
-    BINOP_GIVE_UP_IF_NEEDED(a, b, nb_power, @name@_power);
-
-    switch(_@name@_convert2_to_ctypes(a, &arg1, b, &arg2)) {
-        case 0:
-            break;
-        case -1:
-            /* can't cast both safely mixed-types? */
-            return PyArray_Type.tp_as_number->nb_power(a,b,modulo);
-        case -2:
-            /* use default handling */
-            if (PyErr_Occurred()) {
-                return NULL;
-            }
-            return PyGenericArrType_Type.tp_as_number->nb_power(a,b,modulo);
-        case -3:
-        default:
-            /*
-             * special case for longdouble and clongdouble
-             * because they have a recursive getitem in their dtype
-             */
-            Py_INCREF(Py_NotImplemented);
-            return Py_NotImplemented;
-    }
-
      if (modulo != Py_None) {
          /* modular exponentiation is not implemented (gh-8804) */
          Py_INCREF(Py_NotImplemented);
          return Py_NotImplemented;
      }
  
+    PyObject *ret;
+    @type@ arg1, arg2, other_val;
+
+    int is_forward;
+    if (Py_TYPE(a) == &Py@Name@ArrType_Type) {
+        is_forward = 1;
+    }
+    else if (Py_TYPE(b) == &Py@Name@ArrType_Type) {
+        is_forward = 0;
+    }
+    else {
+        /* subclasses are involved */
+        is_forward = PyArray_IsScalar(a, @Name@);
+        assert(is_forward || PyArray_IsScalar(b, @Name@));
+    }
+    /*
+     * Extract the other value (if it is compatible). See the generic
+     * (non power) version above for detailed notes.
+     */
+    PyObject *other = is_forward ? b : a;
+
+    npy_bool may_need_deferring;
+    int res = convert_to_@name@(other, &other_val, &may_need_deferring);
+    if (res == CONVERSION_ERROR) {
+        return NULL;  /* an error occurred (should never happen) */
+    }
+    if (may_need_deferring) {
+        BINOP_GIVE_UP_IF_NEEDED(a, b, nb_power, @name@_power);
+    }
+    switch (res) {
+        case DEFER_TO_OTHER_KNOWN_SCALAR:
+            Py_RETURN_NOTIMPLEMENTED;
+        case CONVERSION_SUCCESS:
+            break;  /* successfully extracted the value, we can proceed */
+        case OTHER_IS_UNKNOWN_OBJECT:
+#if defined(IS_longdouble) || defined(IS_clongdouble)
+            Py_RETURN_NOTIMPLEMENTED;
+#endif
+        case PROMOTION_REQUIRED:
+            return PyGenericArrType_Type.tp_as_number->nb_power(a, b, modulo);
+        default:
+            assert(0);  /* error was checked already, impossible to reach */
+            return NULL;
+    }
+
  #if !@isint@
-    npy_clear_floatstatus_barrier((char*)&out);
+    npy_clear_floatstatus_barrier((char*)&arg1);
  #endif
+
+    if (is_forward) {
+        arg1 = PyArrayScalar_VAL(a, @Name@);
+        arg2 = other_val;
+    }
+    else {
+        arg1 = other_val;
+        arg2 = PyArrayScalar_VAL(b, @Name@);
+    }
+
+    /*
+     * Prepare the actual calculation:
+     */
+    @type@ out;
+
      /*
       * here we do the actual calculation with arg1 and arg2
       * as a function call.
@@ -989,11 +1426,12 @@ static PyObject *
          return NULL;
      }
  #endif
-    @name@_ctype_power(arg1, arg2, &out);
+    int retstatus = @name@_ctype_power(arg1, arg2, &out);
  
  #if !@isint@
      /* Check status flag.  If it is set, then look up what to do */
-    int retstatus = npy_get_floatstatus_barrier((char*)&out);
+    retstatus |= npy_get_floatstatus_barrier((char*)&out);
+#endif
      if (retstatus) {
          int bufsize, errmask;
          PyObject *errobj;
@@ -1009,7 +1447,6 @@ static PyObject *
          }
          Py_XDECREF(errobj);
      }
-#endif
  
      ret = PyArrayScalar_New(@Name@);
      if (ret == NULL) {
@@ -1021,78 +1458,28 @@ static PyObject *
  }
  
  
+#undef IS_@name@
  /**end repeat**/
  #undef _IS_ZERO
  
  
  /**begin repeat
   *
- * #name = cfloat, cdouble#
- *
- */
-
-/**begin repeat1
- *
- * #oper = divmod, remainder#
- *
- */
-
-#define @name@_@oper@ NULL
-
-/**end repeat1**/
-
-/**end repeat**/
-
-/**begin repeat
- *
- * #oper = divmod, remainder#
+ * #name = (cfloat, cdouble, clongdouble)*3#
+ * #oper = floor_divide*3, divmod*3, remainder*3#
   *
   */
  
  /* 
-Complex numbers do not support remainder operations. Unfortunately, 
-the type inference for long doubles is complicated, and if a remainder 
-operation is not defined - if the relevant field is left NULL - then 
-operations between long doubles and objects lead to an infinite recursion 
-instead of a TypeError. This should ensure that once everything gets
-converted to complex long doubles you correctly get a reasonably
-informative TypeError. This fixes the last part of bug gh-18548.
-*/
-
+ * Complex numbers do not support remainder so we manually make sure that the
+ * operation is not defined.  This is/was especially important for longdoubles
+ * due to their tendency to recurse for some operations, see gh-18548.
+ * (We need to define the slots to avoid inheriting it.)
+ */
  static PyObject *
-clongdouble_@oper@(PyObject *a, PyObject *b)
+@name@_@oper@(PyObject *NPY_UNUSED(a), PyObject *NPY_UNUSED(b))
  {
-    npy_clongdouble arg1, arg2;
-
-    BINOP_GIVE_UP_IF_NEEDED(a, b, nb_@oper@, clongdouble_@oper@);
-
-    switch(_clongdouble_convert2_to_ctypes(a, &arg1, b, &arg2)) {
-        case 0:
-            break;
-        case -1:
-            /* one of them can't be cast safely must be mixed-types*/
-            return PyArray_Type.tp_as_number->nb_@oper@(a,b);
-        case -2:
-            /* use default handling */
-            if (PyErr_Occurred()) {
-                return NULL;
-            }
-            return PyGenericArrType_Type.tp_as_number->nb_@oper@(a,b);
-        case -3:
-            /*
-             * special case for longdouble and clongdouble
-             * because they have a recursive getitem in their dtype
-             */
-            Py_INCREF(Py_NotImplemented);
-            return Py_NotImplemented;
-    }
-
-    /*
-     * here we do the actual calculation with arg1 and arg2
-     * as a function call.
-     */
-    PyErr_SetString(PyExc_TypeError, "complex long doubles do not support remainder");
-    return NULL;
+    Py_RETURN_NOTIMPLEMENTED;
  }
  
  /**end repeat**/
@@ -1124,6 +1511,14 @@ clongdouble_@oper@(PyObject *a, PyObject *b)
   *         byte, ubyte, short, ushort, int, uint,
   *         long, ulong, longlong, ulonglong#
   *
+ * #Name = (Byte, UByte, Short, UShort, Int, UInt,
+ *             Long, ULong, LongLong, ULongLong,
+ *             Half, Float, Double, LongDouble,
+ *             CFloat, CDouble, CLongDouble)*3,
+ *
+ *         Byte, UByte, Short, UShort, Int, UInt,
+ *         Long, ULong, LongLong, ULongLong#
+ *
   * #type = (npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
   *             npy_long, npy_ulong, npy_longlong, npy_ulonglong,
   *             npy_half, npy_float, npy_double, npy_longdouble,
@@ -1161,32 +1556,19 @@ clongdouble_@oper@(PyObject *a, PyObject *b)
  static PyObject *
  @name@_@oper@(PyObject *a)
  {
-    @type@ arg1;
+    @type@ val;
      @otype@ out;
      PyObject *ret;
  
-    switch(_@name@_convert_to_ctype(a, &arg1)) {
-    case 0:
-        break;
-    case -1:
-        /* can't cast both safely use different add function */
-        Py_INCREF(Py_NotImplemented);
-        return Py_NotImplemented;
-    case -2:
-        /* use default handling */
-        if (PyErr_Occurred()) {
-            return NULL;
-        }
-        return PyGenericArrType_Type.tp_as_number->nb_@oper@(a);
-    }
+
+    val = PyArrayScalar_VAL(a, @Name@);
+
+    @name@_ctype_@oper@(val, &out);
  
      /*
-     * here we do the actual calculation with arg1 and arg2
-     * make it a function call.
+     * TODO: Complex absolute should check floating point flags.
       */
  
-    @name@_ctype_@oper@(arg1, &out);
-
      ret = PyArrayScalar_New(@OName@);
      PyArrayScalar_ASSIGN(ret, @OName@, out);
  
@@ -1210,6 +1592,10 @@ static PyObject *
   *         uint, long, ulong, longlong, ulonglong,
   *         half, float, double, longdouble,
   *         cfloat, cdouble, clongdouble#
+ * #Name = Byte, UByte, Short, UShort, Int, UInt,
+ *         Long, ULong, LongLong, ULongLong,
+ *         Half, Float, Double, LongDouble,
+ *         CFloat, CDouble, CLongDouble#
   * #type = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int,
   *         npy_uint, npy_long, npy_ulong, npy_longlong, npy_ulonglong,
   *         npy_half, npy_float, npy_double, npy_longdouble,
@@ -1221,24 +1607,14 @@ static int
  @name@_bool(PyObject *a)
  {
      int ret;
-    @type@ arg1;
+    @type@ val;
  
-    if (_@name@_convert_to_ctype(a, &arg1) < 0) {
-        if (PyErr_Occurred()) {
-            return -1;
-        }
-        return PyGenericArrType_Type.tp_as_number->nb_bool(a);
-    }
-
-    /*
-     * here we do the actual calculation with arg1 and arg2
-     * make it a function call.
-     */
+    val = PyArrayScalar_VAL(a, @Name@);
  
  #if @simp@
-    ret = @nonzero@(arg1);
+    ret = @nonzero@(val);
  #else
-    ret = (@nonzero@(arg1.real) || @nonzero@(arg1.imag));
+    ret = (@nonzero@(val.real) || @nonzero@(val.imag));
  #endif
  
      return ret;
@@ -1321,7 +1697,7 @@ static PyObject *
   * #to_ctype = , , , , , , , , , , npy_half_to_double, , , , , , #
   * #func = PyFloat_FromDouble*17#
   */
-static NPY_INLINE PyObject *
+static PyObject *
  @name@_float(PyObject *obj)
  {
  #if @cmplx@
@@ -1335,6 +1711,12 @@ static NPY_INLINE PyObject *
  }
  /**end repeat**/
  
+#if __GNUC__ < 10
+    /* At least GCC 9.2 issues spurious warnings for arg2 below. */
+    #pragma GCC diagnostic push  /* matching pop after function and repeat */
+    #pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
+
  /**begin repeat
   * #oper = le, ge, lt, gt, eq, ne#
   * #op = <=, >=, <, >, ==, !=#
@@ -1353,36 +1735,49 @@ static NPY_INLINE PyObject *
   *         long, ulong, longlong, ulonglong,
   *         half, float, double, longdouble,
   *         cfloat, cdouble, clongdouble#
+ * #Name = Byte, UByte, Short, UShort, Int, UInt,
+ *         Long, ULong, LongLong, ULongLong,
+ *         Half, Float, Double, LongDouble,
+ *         CFloat, CDouble, CLongDouble#
   * #simp = def*10, def_half, def*3, cmplx*3#
   */
+#define IS_@name@
+
  static PyObject*
  @name@_richcompare(PyObject *self, PyObject *other, int cmp_op)
  {
      npy_@name@ arg1, arg2;
-    int out=0;
-
-    RICHCMP_GIVE_UP_IF_NEEDED(self, other);
+    int out = 0;
  
-    switch(_@name@_convert2_to_ctypes(self, &arg1, other, &arg2)) {
-    case 0:
-        break;
-    case -1:
-        /* can't cast both safely use different add function */
-    case -2:
-        /* use ufunc */
-        if (PyErr_Occurred()) {
+    /*
+     * Extract the other value (if it is compatible).
+     */
+    npy_bool may_need_deferring;
+    int res = convert_to_@name@(other, &arg2, &may_need_deferring);
+    if (res == CONVERSION_ERROR) {
+        return NULL;  /* an error occurred (should never happen) */
+    }
+    if (may_need_deferring) {
+        RICHCMP_GIVE_UP_IF_NEEDED(self, other);
+    }
+    switch (res) {
+        case DEFER_TO_OTHER_KNOWN_SCALAR:
+            Py_RETURN_NOTIMPLEMENTED;
+        case CONVERSION_SUCCESS:
+            break;  /* successfully extracted the value, we can proceed */
+        case OTHER_IS_UNKNOWN_OBJECT:
+#if defined(IS_longdouble) || defined(IS_clongdouble)
+            Py_RETURN_NOTIMPLEMENTED;
+#endif
+        case PROMOTION_REQUIRED:
+            return PyGenericArrType_Type.tp_richcompare(self, other, cmp_op);
+        default:
+            assert(0);  /* error was checked already, impossible to reach */
              return NULL;
-        }
-        return PyGenericArrType_Type.tp_richcompare(self, other, cmp_op);
-    case -3:
-        /*
-         * special case for longdouble and clongdouble
-         * because they have a recursive getitem in their dtype
-         */
-        Py_INCREF(Py_NotImplemented);
-        return Py_NotImplemented;
      }
  
+    arg1 = PyArrayScalar_VAL(self, @Name@);
+
      /* here we do the actual calculation with arg1 and arg2 */
      switch (cmp_op) {
      case Py_EQ:
@@ -1412,8 +1807,15 @@ static PyObject*
          PyArrayScalar_RETURN_FALSE;
      }
  }
+
+#undef IS_@name@
  /**end repeat**/
  
+#if __GNUC__ < 10
+    #pragma GCC diagnostic pop
+#endif
+
+
  /**begin repeat
   *  #name = byte, ubyte, short, ushort, int, uint,
   *          long, ulong, longlong, ulonglong,
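Reviewer note on the scalarmath.c.src changes above: the binary operators, `power`, and `richcompare` now share one dispatch pattern on the result of `convert_to_@name@`. The following is a minimal standalone C sketch of that control flow, not the NumPy implementation. The numeric enum values, the `binop_action` type, and the `dispatch_binop` function are hypothetical names introduced for illustration, and the runtime flag `is_longdouble_like` stands in for the `IS_longdouble` / `IS_clongdouble` preprocessor check used in the template.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the conversion results named in the diff.
 * The numeric values are illustrative assumptions only.
 */
typedef enum {
    CONVERSION_ERROR = -1,
    OTHER_IS_UNKNOWN_OBJECT,
    CONVERSION_SUCCESS,
    PROMOTION_REQUIRED,
    DEFER_TO_OTHER_KNOWN_SCALAR
} conversion_result;

/* Hypothetical labels for what the operator does next. */
typedef enum {
    ACTION_RAISE,            /* return NULL, an error is already set */
    ACTION_NOT_IMPLEMENTED,  /* corresponds to Py_RETURN_NOTIMPLEMENTED */
    ACTION_COMPUTE,          /* continue on the fast scalar path */
    ACTION_GENERIC_PATH      /* hand over to the generic scalar slot */
} binop_action;

static binop_action
dispatch_binop(conversion_result res, int is_longdouble_like)
{
    /* The error case is checked before the switch, as in the diff. */
    if (res == CONVERSION_ERROR) {
        return ACTION_RAISE;
    }
    switch (res) {
        case DEFER_TO_OTHER_KNOWN_SCALAR:
            return ACTION_NOT_IMPLEMENTED;
        case CONVERSION_SUCCESS:
            return ACTION_COMPUTE;
        case OTHER_IS_UNKNOWN_OBJECT:
            /* (c)longdouble must not drop into the generic path. */
            if (is_longdouble_like) {
                return ACTION_NOT_IMPLEMENTED;
            }
            /* fall through */
        case PROMOTION_REQUIRED:
            return ACTION_GENERIC_PATH;
        default:
            assert(0);  /* unreachable: the error was handled above */
            return ACTION_RAISE;
    }
}
```

For example, `dispatch_binop(OTHER_IS_UNKNOWN_OBJECT, 1)` yields `ACTION_NOT_IMPLEMENTED`, mirroring the early return that guards against infinite recursion for (c)longdouble, while the same result with the flag set to 0 falls through to the generic path together with `PROMOTION_REQUIRED`.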
diff --git a/numpy/core/src/umath/simd.inc.src b/numpy/core/src/umath/simd.inc.src
index 0e2c1ab8b31b591038dc1bbef3f019ed77b09ada..b477027b3c8a998ebcff372eaf0191748d654d33 100644
--- a/numpy/core/src/umath/simd.inc.src
+++ b/numpy/core/src/umath/simd.inc.src
@@ -88,38 +88,6 @@ run_unary_avx512f_@func@_@TYPE@(char **args, const npy_intp *dimensions, const n
   *****************************************************************************
   */
  
-/**begin repeat
- * #type = npy_float, npy_double, npy_longdouble#
- * #TYPE = FLOAT, DOUBLE, LONGDOUBLE#
- * #EXISTS = 1, 1, 0#
- */
-
-/**begin repeat1
- *  #func = maximum, minimum#
- */
-
-#if defined HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS && defined NPY_HAVE_SSE2_INTRINSICS && @EXISTS@
-static NPY_INLINE NPY_GCC_TARGET_AVX512F void
-AVX512F_@func@_@TYPE@(char **args, npy_intp const *dimensions, npy_intp const *steps);
-#endif
-
-static NPY_INLINE int
-run_binary_avx512f_@func@_@TYPE@(char **args, npy_intp const *dimensions, npy_intp const *steps)
-{
-#if defined HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS && defined NPY_HAVE_SSE2_INTRINSICS && @EXISTS@
-    if (IS_BINARY_SMALL_STEPS_AND_NOMEMOVERLAP) {
-        AVX512F_@func@_@TYPE@(args, dimensions, steps);
-        return 1;
-    }
-    else
-        return 0;
-#endif
-    return 0;
-}
-/**end repeat1**/
-
-/**end repeat**/
-
  /**begin repeat
   * #type = npy_float, npy_double, npy_longdouble#
   * #TYPE = FLOAT, DOUBLE, LONGDOUBLE#
@@ -154,47 +122,6 @@ run_@func@_avx512_skx_@TYPE@(char **args, npy_intp const *dimensions, npy_intp c
  /**end repeat1**/
  /**end repeat**/
  
-/**begin repeat
- * #ISA = FMA, AVX512F#
- * #isa = fma, avx512f#
- * #CHK = HAVE_ATTRIBUTE_TARGET_AVX2_WITH_INTRINSICS, HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS#
- * #REGISTER_SIZE = 32, 64#
- */
-
-/* prototypes */
-
-/**begin repeat1
- * #type = npy_float, npy_double#
- * #TYPE = FLOAT, DOUBLE#
- */
-
-/**begin repeat2
- *  #func = rint, floor, trunc#
- */
-
-#if defined @CHK@ && defined NPY_HAVE_SSE2_INTRINSICS
-static NPY_INLINE NPY_GCC_TARGET_@ISA@ void
-@ISA@_@func@_@TYPE@(@type@ *, @type@ *, const npy_intp n, const npy_intp stride);
-#endif
-
-static NPY_INLINE int
-run_unary_@isa@_@func@_@TYPE@(char **args, npy_intp const *dimensions, npy_intp const *steps)
-{
-#if defined @CHK@ && defined NPY_HAVE_SSE2_INTRINSICS
-    if (IS_OUTPUT_BLOCKABLE_UNARY(sizeof(@type@), sizeof(@type@), @REGISTER_SIZE@)) {
-        @ISA@_@func@_@TYPE@((@type@*)args[1], (@type@*)args[0], dimensions[0], steps[0]);
-        return 1;
-    }
-    else
-        return 0;
-#endif
-    return 0;
-}
-
-/**end repeat2**/
-/**end repeat1**/
-/**end repeat**/
-
  /**begin repeat
   * Float types
   *  #type = npy_float, npy_double, npy_longdouble#
@@ -204,9 +131,9 @@ run_unary_@isa@_@func@_@TYPE@(char **args, npy_intp const *dimensions, npy_intp
   */
  
  /**begin repeat1
- * #func = negative, minimum, maximum#
- * #check = IS_BLOCKABLE_UNARY, IS_BLOCKABLE_REDUCE*2 #
- * #name = unary, unary_reduce*2#
+ * #func = negative#
+ * #check = IS_BLOCKABLE_UNARY#
+ * #name = unary#
   */
  
  #if @vector@ && defined NPY_HAVE_SSE2_INTRINSICS
@@ -678,55 +605,6 @@ sse2_negative_@TYPE@(@type@ * op, @type@ * ip, const npy_intp n)
  }
  /**end repeat1**/
  
-
-/**begin repeat1
- * #kind = maximum, minimum#
- * #VOP = max, min#
- * #OP = >=, <=#
- **/
-/* arguments swapped as unary reduce has the swapped compared to unary */
-static void
-sse2_@kind@_@TYPE@(@type@ * ip, @type@ * op, const npy_intp n)
-{
-    const npy_intp stride = VECTOR_SIZE_BYTES / (npy_intp)sizeof(@type@);
-    LOOP_BLOCK_ALIGN_VAR(ip, @type@, VECTOR_SIZE_BYTES) {
-        /* Order of operations important for MSVC 2015 */
-        *op = (*op @OP@ ip[i] || npy_isnan(*op)) ? *op : ip[i];
-    }
-    assert(n < stride || npy_is_aligned(&ip[i], VECTOR_SIZE_BYTES));
-    if (i + 3 * stride <= n) {
-        /* load the first elements */
-        @vtype@ c1 = @vpre@_load_@vsuf@((@type@*)&ip[i]);
-        @vtype@ c2 = @vpre@_load_@vsuf@((@type@*)&ip[i + stride]);
-        i += 2 * stride;
-
-        /* minps/minpd will set invalid flag if nan is encountered */
-        npy_clear_floatstatus_barrier((char*)&c1);
-        LOOP_BLOCKED(@type@, 2 * VECTOR_SIZE_BYTES) {
-            @vtype@ v1 = @vpre@_load_@vsuf@((@type@*)&ip[i]);
-            @vtype@ v2 = @vpre@_load_@vsuf@((@type@*)&ip[i + stride]);
-            c1 = @vpre@_@VOP@_@vsuf@(c1, v1);
-            c2 = @vpre@_@VOP@_@vsuf@(c2, v2);
-        }
-        c1 = @vpre@_@VOP@_@vsuf@(c1, c2);
-
-        if (npy_get_floatstatus_barrier((char*)&c1) & NPY_FPE_INVALID) {
-            *op = @nan@;
-        }
-        else {
-            @type@ tmp = sse2_horizontal_@VOP@_@vtype@(c1);
-            /* Order of operations important for MSVC 2015 */
-            *op  = (*op @OP@ tmp || npy_isnan(*op)) ? *op : tmp;
-        }
-    }
-    LOOP_BLOCKED_END {
-        /* Order of operations important for MSVC 2015 */
-        *op  = (*op @OP@ ip[i] || npy_isnan(*op)) ? *op : ip[i];
-    }
-    npy_clear_floatstatus_barrier((char*)op);
-}
-/**end repeat1**/
-
  /**end repeat**/
  
  /* bunch of helper functions used in ISA_exp/log_FLOAT*/
@@ -1199,245 +1077,6 @@ AVX512_SKX_@func@_@TYPE@(npy_bool* op, @type@* ip, const npy_intp array_size, co
  /**end repeat1**/
  /**end repeat**/
  
-/**begin repeat
- * #type = npy_float, npy_double#
- * #TYPE = FLOAT, DOUBLE#
- * #num_lanes = 16, 8#
- * #vsuffix = ps, pd#
- * #mask = __mmask16, __mmask8#
- * #vtype1 = __m512, __m512d#
- * #vtype2 = __m512i, __m256i#
- * #scale = 4, 8#
- * #vindextype = __m512i, __m256i#
- * #vindexsize = 512, 256#
- * #vindexload = _mm512_loadu_si512, _mm256_loadu_si256#
- * #vtype2_load = _mm512_maskz_loadu_epi32, _mm256_maskz_loadu_epi32#
- * #vtype2_gather = _mm512_mask_i32gather_epi32, _mm256_mmask_i32gather_epi32#
- * #vtype2_store = _mm512_mask_storeu_epi32, _mm256_mask_storeu_epi32#
- * #vtype2_scatter = _mm512_mask_i32scatter_epi32, _mm256_mask_i32scatter_epi32#
- * #setzero = _mm512_setzero_epi32, _mm256_setzero_si256#
- */
-/**begin repeat1
- *  #func = maximum, minimum#
- *  #vectorf = max, min#
- */
-
-#if defined HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS && defined NPY_HAVE_SSE2_INTRINSICS
-static NPY_INLINE NPY_GCC_TARGET_AVX512F void
-AVX512F_@func@_@TYPE@(char **args, npy_intp const *dimensions, npy_intp const *steps)
-{
-    const npy_intp stride_ip1 = steps[0]/(npy_intp)sizeof(@type@);
-    const npy_intp stride_ip2 = steps[1]/(npy_intp)sizeof(@type@);
-    const npy_intp stride_op = steps[2]/(npy_intp)sizeof(@type@);
-    const npy_intp array_size = dimensions[0];
-    npy_intp num_remaining_elements = array_size;
-    @type@* ip1 = (@type@*) args[0];
-    @type@* ip2 = (@type@*) args[1];
-    @type@* op  = (@type@*) args[2];
-
-    @mask@ load_mask = avx512_get_full_load_mask_@vsuffix@();
-
-    /*
-     * Note: while generally indices are npy_intp, we ensure that our maximum index
-     * will fit in an int32 as a precondition for this function via
-     * IS_BINARY_SMALL_STEPS_AND_NOMEMOVERLAP
-     */
-
-    npy_int32 index_ip1[@num_lanes@], index_ip2[@num_lanes@], index_op[@num_lanes@];
-    for (npy_int32 ii = 0; ii < @num_lanes@; ii++) {
-        index_ip1[ii] = ii*stride_ip1;
-        index_ip2[ii] = ii*stride_ip2;
-        index_op[ii] = ii*stride_op;
-    }
-    @vindextype@ vindex_ip1 = @vindexload@((@vindextype@*)&index_ip1[0]);
-    @vindextype@ vindex_ip2 = @vindexload@((@vindextype@*)&index_ip2[0]);
-    @vindextype@ vindex_op  = @vindexload@((@vindextype@*)&index_op[0]);
-    @vtype1@ zeros_f = _mm512_setzero_@vsuffix@();
-
-    while (num_remaining_elements > 0) {
-        if (num_remaining_elements < @num_lanes@) {
-            load_mask = avx512_get_partial_load_mask_@vsuffix@(
-                                    num_remaining_elements, @num_lanes@);
-        }
-        @vtype1@ x1, x2;
-        if (stride_ip1 == 1) {
-            x1 = avx512_masked_load_@vsuffix@(load_mask, ip1);
-        }
-        else {
-            x1 = avx512_masked_gather_@vsuffix@(zeros_f, ip1, vindex_ip1, load_mask);
-        }
-        if (stride_ip2 == 1) {
-            x2 = avx512_masked_load_@vsuffix@(load_mask, ip2);
-        }
-        else {
-            x2 = avx512_masked_gather_@vsuffix@(zeros_f, ip2, vindex_ip2, load_mask);
-        }
-
-        /*
-         * when only one of the argument is a nan, the maxps/maxpd instruction
-         * returns the second argument. The additional blend instruction fixes
-         * this issue to conform with NumPy behaviour.
-         */
-        @mask@ nan_mask = _mm512_cmp_@vsuffix@_mask(x1, x1, _CMP_NEQ_UQ);
-        @vtype1@ out = _mm512_@vectorf@_@vsuffix@(x1, x2);
-        out = _mm512_mask_blend_@vsuffix@(nan_mask, out, x1);
-
-        if (stride_op == 1) {
-            _mm512_mask_storeu_@vsuffix@(op, load_mask, out);
-        }
-        else {
-            /* scatter! */
-            _mm512_mask_i32scatter_@vsuffix@(op, load_mask, vindex_op, out, @scale@);
-        }
-
-        ip1 += @num_lanes@*stride_ip1;
-        ip2 += @num_lanes@*stride_ip2;
-        op += @num_lanes@*stride_op;
-        num_remaining_elements -= @num_lanes@;
-    }
-}
-#endif
-/**end repeat1**/
-/**end repeat**/
-
-/**begin repeat
- * #ISA = FMA, AVX512F#
- * #isa = fma, avx512#
- * #vsize = 256, 512#
- * #BYTES = 32, 64#
- * #cvtps_epi32 = _mm256_cvtps_epi32, #
- * #mask = __m256, __mmask16#
- * #vsub = , _mask#
- * #vtype = __m256, __m512#
- * #cvtps_epi32 = _mm256_cvtps_epi32, #
- * #masked_store = _mm256_maskstore_ps, _mm512_mask_storeu_ps#
- * #CHK = HAVE_ATTRIBUTE_TARGET_AVX2_WITH_INTRINSICS, HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS#
- */
-
-/**begin repeat1
- *  #func = rint, floor, trunc#
- *  #vectorf = rint, floor, trunc#
- */
-
-#if defined @CHK@
-static NPY_INLINE NPY_GCC_OPT_3 NPY_GCC_TARGET_@ISA@ void
-@ISA@_@func@_FLOAT(npy_float* op,
-                   npy_float* ip,
-                   const npy_intp array_size,
-                   const npy_intp steps)
-{
-    const npy_intp stride = steps/(npy_intp)sizeof(npy_float);
-    const npy_int num_lanes = @BYTES@/(npy_intp)sizeof(npy_float);
-    npy_intp num_remaining_elements = array_size;
-    @vtype@ ones_f = _mm@vsize@_set1_ps(1.0f);
-    @mask@ load_mask = @isa@_get_full_load_mask_ps();
-    /*
-     * Note: while generally indices are npy_intp, we ensure that our maximum index
-     * will fit in an int32 as a precondition for this function via
-     * IS_OUTPUT_BLOCKABLE_UNARY
-     */
-
-    npy_int32 indexarr[16];
-    for (npy_int32 ii = 0; ii < 16; ii++) {
-        indexarr[ii] = ii*stride;
-    }
-    @vtype@i vindex = _mm@vsize@_loadu_si@vsize@((@vtype@i*)&indexarr[0]);
-
-    while (num_remaining_elements > 0) {
-        if (num_remaining_elements < num_lanes) {
-            load_mask = @isa@_get_partial_load_mask_ps(num_remaining_elements,
-                                                       num_lanes);
-        }
-        @vtype@ x;
-        if (stride == 1) {
-            x = @isa@_masked_load_ps(load_mask, ip);
-        }
-        else {
-            x = @isa@_masked_gather_ps(ones_f, ip, vindex, load_mask);
-        }
-        @vtype@ out = @isa@_@vectorf@_ps(x);
-        @masked_store@(op, @cvtps_epi32@(load_mask), out);
-
-        ip += num_lanes*stride;
-        op += num_lanes;
-        num_remaining_elements -= num_lanes;
-    }
-}
-#endif
-/**end repeat1**/
-/**end repeat**/
-
-/**begin repeat
- * #ISA = FMA, AVX512F#
- * #isa = fma, avx512#
- * #vsize = 256, 512#
- * #BYTES = 32, 64#
- * #cvtps_epi32 = _mm256_cvtps_epi32, #
- * #mask = __m256i, __mmask8#
- * #vsub = , _mask#
- * #vtype = __m256d, __m512d#
- * #vindextype = __m128i, __m256i#
- * #vindexsize = 128, 256#
- * #vindexload = _mm_loadu_si128, _mm256_loadu_si256#
- * #cvtps_epi32 = _mm256_cvtpd_epi32, #
- * #castmask = _mm256_castsi256_pd, #
- * #masked_store = _mm256_maskstore_pd, _mm512_mask_storeu_pd#
- * #CHK = HAVE_ATTRIBUTE_TARGET_AVX2_WITH_INTRINSICS, HAVE_ATTRIBUTE_TARGET_AVX512F_WITH_INTRINSICS#
- */
-
-/**begin repeat1
- *  #func = rint, floor, trunc#
- *  #vectorf =  rint, floor, trunc#
- */
-
-#if defined @CHK@
-static NPY_INLINE NPY_GCC_OPT_3 NPY_GCC_TARGET_@ISA@ void
-@ISA@_@func@_DOUBLE(npy_double* op,
-                    npy_double* ip,
-                    const npy_intp array_size,
-                    const npy_intp steps)
-{
-    const npy_intp stride = steps/(npy_intp)sizeof(npy_double);
-    const npy_int num_lanes = @BYTES@/(npy_intp)sizeof(npy_double);
-    npy_intp num_remaining_elements = array_size;
-    @mask@ load_mask = @isa@_get_full_load_mask_pd();
-    @vtype@ ones_d = _mm@vsize@_set1_pd(1.0f);
-
-    /*
-     * Note: while generally indices are npy_intp, we ensure that our maximum index
-     * will fit in an int32 as a precondition for this function via
-     * IS_OUTPUT_BLOCKABLE_UNARY
-     */
-    npy_int32 indexarr[8];
-    for (npy_int32 ii = 0; ii < 8; ii++) {
-        indexarr[ii] = ii*stride;
-    }
-    @vindextype@ vindex = @vindexload@((@vindextype@*)&indexarr[0]);
-
-    while (num_remaining_elements > 0) {
-        if (num_remaining_elements < num_lanes) {
-            load_mask = @isa@_get_partial_load_mask_pd(num_remaining_elements,
-                                                       num_lanes);
-        }
-        @vtype@ x;
-        if (stride == 1) {
-            x = @isa@_masked_load_pd(load_mask, ip);
-        }
-        else {
-            x = @isa@_masked_gather_pd(ones_d, ip, vindex, @castmask@(load_mask));
-        }
-        @vtype@ out = @isa@_@vectorf@_pd(x);
-        @masked_store@(op, load_mask, out);
-
-        ip += num_lanes*stride;
-        op += num_lanes;
-        num_remaining_elements -= num_lanes;
-    }
-}
-#endif
-/**end repeat1**/
-/**end repeat**/
-
  /**begin repeat
   * #TYPE = CFLOAT, CDOUBLE#
   * #type = npy_float, npy_double#
@@ -1717,3 +1356,4 @@ sse2_@kind@_BOOL(@type@ * op, @type@ * ip, const npy_intp n)
  #undef VECTOR_SIZE_BYTES
  #endif  /* NPY_HAVE_SSE2_INTRINSICS */
  #endif
+
diff --git a/numpy/core/src/umath/ufunc_object.c b/numpy/core/src/umath/ufunc_object.c
index aeca872dd6407a4ecb3c235db622e69cba0a5eea..fce7d61deb58c1abf418f9ec66355e80ff9296a2 100644
--- a/numpy/core/src/umath/ufunc_object.c
+++ b/numpy/core/src/umath/ufunc_object.c
@@ -1073,13 +1073,15 @@ check_for_trivial_loop(PyArrayMethodObject *ufuncimpl,
          int must_copy = !PyArray_ISALIGNED(op[i]);
  
          if (dtypes[i] != PyArray_DESCR(op[i])) {
-            NPY_CASTING safety = PyArray_GetCastSafety(
-                    PyArray_DESCR(op[i]), dtypes[i], NULL);
+            npy_intp view_offset;
+            NPY_CASTING safety = PyArray_GetCastInfo(
+                    PyArray_DESCR(op[i]), dtypes[i], NULL, &view_offset);
              if (safety < 0 && PyErr_Occurred()) {
                  /* A proper error during a cast check, should be rare */
                  return -1;
              }
-            if (!(safety & _NPY_CAST_IS_VIEW)) {
+            if (view_offset != 0) {
+                /* NOTE: Could possibly implement non-zero view offsets */
                  must_copy = 1;
              }
  
@@ -1204,6 +1206,7 @@ prepare_ufunc_output(PyUFuncObject *ufunc,
   * cannot broadcast any other array (as it requires a single stride).
   * The function accepts all 1-D arrays, and N-D arrays that are either all
   * C- or all F-contiguous.
+ * NOTE: Broadcast outputs are implicitly rejected in the overlap detection.
   *
   * Returns -2 if a trivial loop is not possible, 0 on success and -1 on error.
   */
@@ -1240,7 +1243,7 @@ try_trivial_single_output_loop(PyArrayMethod_Context *context,
          int op_ndim = PyArray_NDIM(op[iop]);
  
          /* Special case 0-D since we can handle broadcasting using a 0-stride */
-        if (op_ndim == 0) {
+        if (op_ndim == 0 && iop < nin) {
              fixed_strides[iop] = 0;
              continue;
          }
@@ -1251,7 +1254,7 @@ try_trivial_single_output_loop(PyArrayMethod_Context *context,
              operation_shape = PyArray_SHAPE(op[iop]);
          }
          else if (op_ndim != operation_ndim) {
-            return -2;  /* dimension mismatch (except 0-d ops) */
+            return -2;  /* dimension mismatch (except 0-d input ops) */
          }
          else if (!PyArray_CompareLists(
                  operation_shape, PyArray_DIMS(op[iop]), op_ndim)) {
@@ -1319,6 +1322,10 @@ try_trivial_single_output_loop(PyArrayMethod_Context *context,
       */
      char *data[NPY_MAXARGS];
      npy_intp count = PyArray_MultiplyList(operation_shape, operation_ndim);
+    if (count == 0) {
+        /* Nothing to do */
+        return 0;
+    }
      NPY_BEGIN_THREADS_DEF;
  
      PyArrayMethod_StridedLoop *strided_loop;
@@ -2702,7 +2709,7 @@ PyUFunc_GenericFunction(PyUFuncObject *NPY_UNUSED(ufunc),
   * @param out_descrs New references to the resolved descriptors (on success).
   * @param method The ufunc method, "reduce", "reduceat", or "accumulate".
  
- * @returns ufuncimpl The `ArrayMethod` implemention to use. Or NULL if an
+ * @returns ufuncimpl The `ArrayMethod` implementation to use. Or NULL if an
   *          error occurred.
   */
  static PyArrayMethodObject *
@@ -2817,7 +2824,7 @@ reduce_loop(PyArrayMethod_Context *context,
          npy_intp const *countptr, NpyIter_IterNextFunc *iternext,
          int needs_api, npy_intp skip_first_count)
  {
-    int retval;
+    int retval = 0;
      char *dataptrs_copy[4];
      npy_intp strides_copy[4];
      npy_bool masked;
@@ -2847,19 +2854,20 @@ reduce_loop(PyArrayMethod_Context *context,
                      count = 0;
                  }
              }
-
-            /* Turn the two items into three for the inner loop */
-            dataptrs_copy[0] = dataptrs[0];
-            dataptrs_copy[1] = dataptrs[1];
-            dataptrs_copy[2] = dataptrs[0];
-            strides_copy[0] = strides[0];
-            strides_copy[1] = strides[1];
-            strides_copy[2] = strides[0];
-
-            retval = strided_loop(context,
-                    dataptrs_copy, &count, strides_copy, auxdata);
-            if (retval < 0) {
-                goto finish_loop;
+            if (count > 0) {
+                /* Turn the two items into three for the inner loop */
+                dataptrs_copy[0] = dataptrs[0];
+                dataptrs_copy[1] = dataptrs[1];
+                dataptrs_copy[2] = dataptrs[0];
+                strides_copy[0] = strides[0];
+                strides_copy[1] = strides[1];
+                strides_copy[2] = strides[0];
+
+                retval = strided_loop(context,
+                        dataptrs_copy, &count, strides_copy, auxdata);
+                if (retval < 0) {
+                    goto finish_loop;
+                }
              }
  
              /* Advance loop, and abort on error (or finish) */
@@ -4538,8 +4546,10 @@ resolve_descriptors(int nop,
  
      if (ufuncimpl->resolve_descriptors != &wrapped_legacy_resolve_descriptors) {
          /* The default: use the `ufuncimpl` as nature intended it */
+        npy_intp view_offset = NPY_MIN_INTP;  /* currently ignored */
+
          NPY_CASTING safety = ufuncimpl->resolve_descriptors(ufuncimpl,
-                signature, original_dtypes, dtypes);
+                signature, original_dtypes, dtypes, &view_offset);
          if (safety < 0) {
              goto finish;
          }
diff --git a/numpy/core/src/umath/ufunc_object.h b/numpy/core/src/umath/ufunc_object.h
index 6d4fed7c02d2871f30d04c100ce8060455127d3f..32af6c58ed6af059e30c32dfa00484572b882639 100644
--- a/numpy/core/src/umath/ufunc_object.h
+++ b/numpy/core/src/umath/ufunc_object.h
@@ -13,6 +13,7 @@ NPY_NO_EXPORT const char*
  ufunc_get_name_cstr(PyUFuncObject *ufunc);
  
  /* strings from umathmodule.c that are interned on umath import */
+NPY_VISIBILITY_HIDDEN extern PyObject *npy_um_str_array_ufunc;
  NPY_VISIBILITY_HIDDEN extern PyObject *npy_um_str_array_prepare;
  NPY_VISIBILITY_HIDDEN extern PyObject *npy_um_str_array_wrap;
  NPY_VISIBILITY_HIDDEN extern PyObject *npy_um_str_pyvals_name;
diff --git a/numpy/core/src/umath/ufunc_type_resolution.c b/numpy/core/src/umath/ufunc_type_resolution.c
index 9ed923cf56e776ac4d27d3687131dd6521fe6b57..6edd00e6520e905b582cea4d7ac5f0346309b315 100644
--- a/numpy/core/src/umath/ufunc_type_resolution.c
+++ b/numpy/core/src/umath/ufunc_type_resolution.c
@@ -41,6 +41,7 @@
  #include "ufunc_object.h"
  #include "common.h"
  #include "convert_datatype.h"
+#include "dtypemeta.h"
  
  #include "mem_overlap.h"
  #if defined(HAVE_CBLAS)
@@ -416,12 +417,12 @@ PyUFunc_SimpleBinaryComparisonTypeResolver(PyUFuncObject *ufunc,
              }
          }
          else {
-            /* Usually a failure, but let the the default version handle it */
+            /* Usually a failure, but let the default version handle it */
              return PyUFunc_DefaultTypeResolver(ufunc, casting,
                      operands, type_tup, out_dtypes);
          }
  
-        out_dtypes[0] = ensure_dtype_nbo(descr);
+        out_dtypes[0] = NPY_DT_CALL_ensure_canonical(descr);
          if (out_dtypes[0] == NULL) {
              return -1;
          }
@@ -545,7 +546,8 @@ PyUFunc_SimpleUniformOperationTypeResolver(
      if (type_tup == NULL) {
          /* PyArray_ResultType forgets to force a byte order when n == 1 */
          if (ufunc->nin == 1){
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
          }
          else {
              int iop;
@@ -629,7 +631,7 @@ PyUFunc_SimpleUniformOperationTypeResolver(
              /* Prefer the input descriptor if it matches (preserve metadata) */
              descr = PyArray_DESCR(operands[0]);
          }
-        out_dtypes[0] = ensure_dtype_nbo(descr);
+        out_dtypes[0] = NPY_DT_CALL_ensure_canonical(descr);
      }
  
      /* All types are the same - copy the first one to the rest */
@@ -695,7 +697,7 @@ PyUFunc_IsNaTTypeResolver(PyUFuncObject *ufunc,
          return -1;
      }
  
-    out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+    out_dtypes[0] = NPY_DT_CALL_ensure_canonical(PyArray_DESCR(operands[0]));
      out_dtypes[1] = PyArray_DescrFromType(NPY_BOOL);
  
      return 0;
@@ -714,7 +716,7 @@ PyUFunc_IsFiniteTypeResolver(PyUFuncObject *ufunc,
                                      type_tup, out_dtypes);
      }
  
-    out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+    out_dtypes[0] = NPY_DT_CALL_ensure_canonical(PyArray_DESCR(operands[0]));
      out_dtypes[1] = PyArray_DescrFromType(NPY_BOOL);
  
      return 0;
@@ -816,7 +818,8 @@ PyUFunc_AdditionTypeResolver(PyUFuncObject *ufunc,
          /* m8[<A>] + int => m8[<A>] + m8[<A>] */
          else if (PyTypeNum_ISINTEGER(type_num2) ||
                                      PyTypeNum_ISBOOL(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -852,7 +855,8 @@ PyUFunc_AdditionTypeResolver(PyUFuncObject *ufunc,
          /* M8[<A>] + int => M8[<A>] + m8[<A>] */
          else if (PyTypeNum_ISINTEGER(type_num2) ||
                      PyTypeNum_ISBOOL(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -876,7 +880,8 @@ PyUFunc_AdditionTypeResolver(PyUFuncObject *ufunc,
      else if (PyTypeNum_ISINTEGER(type_num1) || PyTypeNum_ISBOOL(type_num1)) {
          /* int + m8[<A>] => m8[<A>] + m8[<A>] */
          if (type_num2 == NPY_TIMEDELTA) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[1]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[1]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -894,7 +899,8 @@ PyUFunc_AdditionTypeResolver(PyUFuncObject *ufunc,
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
-            out_dtypes[1] = ensure_dtype_nbo(PyArray_DESCR(operands[1]));
+            out_dtypes[1] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[1]));
              if (out_dtypes[1] == NULL) {
                  Py_DECREF(out_dtypes[0]);
                  out_dtypes[0] = NULL;
@@ -985,7 +991,8 @@ PyUFunc_SubtractionTypeResolver(PyUFuncObject *ufunc,
          /* m8[<A>] - int => m8[<A>] - m8[<A>] */
          else if (PyTypeNum_ISINTEGER(type_num2) ||
                                          PyTypeNum_ISBOOL(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1021,7 +1028,8 @@ PyUFunc_SubtractionTypeResolver(PyUFuncObject *ufunc,
          /* M8[<A>] - int => M8[<A>] - m8[<A>] */
          else if (PyTypeNum_ISINTEGER(type_num2) ||
                      PyTypeNum_ISBOOL(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1061,7 +1069,8 @@ PyUFunc_SubtractionTypeResolver(PyUFuncObject *ufunc,
      else if (PyTypeNum_ISINTEGER(type_num1) || PyTypeNum_ISBOOL(type_num1)) {
          /* int - m8[<A>] => m8[<A>] - m8[<A>] */
          if (type_num2 == NPY_TIMEDELTA) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[1]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[1]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1122,7 +1131,8 @@ PyUFunc_MultiplicationTypeResolver(PyUFuncObject *ufunc,
      if (type_num1 == NPY_TIMEDELTA) {
          /* m8[<A>] * int## => m8[<A>] * int64 */
          if (PyTypeNum_ISINTEGER(type_num2) || PyTypeNum_ISBOOL(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1139,7 +1149,8 @@ PyUFunc_MultiplicationTypeResolver(PyUFuncObject *ufunc,
          }
          /* m8[<A>] * float## => m8[<A>] * float64 */
          else if (PyTypeNum_ISFLOAT(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1165,7 +1176,8 @@ PyUFunc_MultiplicationTypeResolver(PyUFuncObject *ufunc,
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
-            out_dtypes[1] = ensure_dtype_nbo(PyArray_DESCR(operands[1]));
+            out_dtypes[1] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[1]));
              if (out_dtypes[1] == NULL) {
                  Py_DECREF(out_dtypes[0]);
                  out_dtypes[0] = NULL;
@@ -1187,7 +1199,8 @@ PyUFunc_MultiplicationTypeResolver(PyUFuncObject *ufunc,
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
-            out_dtypes[1] = ensure_dtype_nbo(PyArray_DESCR(operands[1]));
+            out_dtypes[1] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[1]));
              if (out_dtypes[1] == NULL) {
                  Py_DECREF(out_dtypes[0]);
                  out_dtypes[0] = NULL;
@@ -1278,7 +1291,8 @@ PyUFunc_DivisionTypeResolver(PyUFuncObject *ufunc,
          }
          /* m8[<A>] / int## => m8[<A>] / int64 */
          else if (PyTypeNum_ISINTEGER(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1295,7 +1309,8 @@ PyUFunc_DivisionTypeResolver(PyUFuncObject *ufunc,
          }
          /* m8[<A>] / float## => m8[<A>] / float64 */
          else if (PyTypeNum_ISFLOAT(type_num2)) {
-            out_dtypes[0] = ensure_dtype_nbo(PyArray_DESCR(operands[0]));
+            out_dtypes[0] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(operands[0]));
              if (out_dtypes[0] == NULL) {
                  return -1;
              }
@@ -1528,7 +1543,7 @@ PyUFunc_DefaultLegacyInnerLoopSelector(PyUFuncObject *ufunc,
          }
          if (j == nargs) {
              *out_innerloop = ufunc->functions[i];
-            *out_innerloopdata = ufunc->data[i];
+            *out_innerloopdata = (ufunc->data == NULL) ? NULL : ufunc->data[i];
              return 0;
          }
  
@@ -1672,7 +1687,8 @@ set_ufunc_loop_data_types(PyUFuncObject *self, PyArrayObject **op,
          }
          else if (op[i] != NULL &&
                   PyArray_DESCR(op[i])->type_num == type_nums[i]) {
-            out_dtypes[i] = ensure_dtype_nbo(PyArray_DESCR(op[i]));
+            out_dtypes[i] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(op[i]));
          }
          /*
           * For outputs, copy the dtype from op[0] if the type_num
@@ -1680,7 +1696,8 @@ set_ufunc_loop_data_types(PyUFuncObject *self, PyArrayObject **op,
           */
          else if (i >= nin && op[0] != NULL &&
                              PyArray_DESCR(op[0])->type_num == type_nums[i]) {
-            out_dtypes[i] = ensure_dtype_nbo(PyArray_DESCR(op[0]));
+            out_dtypes[i] = NPY_DT_CALL_ensure_canonical(
+                    PyArray_DESCR(op[0]));
          }
          /* Otherwise create a plain descr from the type number */
          else {
diff --git a/numpy/core/src/umath/umathmodule.c b/numpy/core/src/umath/umathmodule.c
index 272555704cf5362bebbd7debf43c649a021e50f6..49328d19e25a520d99f3a15e9bdc3c316fe17c02 100644
--- a/numpy/core/src/umath/umathmodule.c
+++ b/numpy/core/src/umath/umathmodule.c
@@ -24,6 +24,10 @@
  #include "number.h"
  #include "dispatching.h"
  
+/* Automatically generated code to define all ufuncs: */
+#include "funcs.inc"
+#include "__umath_generated.c"
+
  static PyUFuncGenericFunction pyfunc_functions[] = {PyUFunc_On_Om};
  
  static int
@@ -56,7 +60,7 @@ object_ufunc_loop_selector(PyUFuncObject *ufunc,
                              int *out_needs_api)
  {
      *out_innerloop = ufunc->functions[0];
-    *out_innerloopdata = ufunc->data[0];
+    *out_innerloopdata = (ufunc->data == NULL) ? NULL : ufunc->data[0];
      *out_needs_api = 1;
  
      return 0;
@@ -209,6 +213,7 @@ add_newdoc_ufunc(PyObject *NPY_UNUSED(dummy), PyObject *args)
   *****************************************************************************
   */
  
+NPY_VISIBILITY_HIDDEN PyObject *npy_um_str_array_ufunc = NULL;
  NPY_VISIBILITY_HIDDEN PyObject *npy_um_str_array_prepare = NULL;
  NPY_VISIBILITY_HIDDEN PyObject *npy_um_str_array_wrap = NULL;
  NPY_VISIBILITY_HIDDEN PyObject *npy_um_str_pyvals_name = NULL;
@@ -217,6 +222,10 @@ NPY_VISIBILITY_HIDDEN PyObject *npy_um_str_pyvals_name = NULL;
  static int
  intern_strings(void)
  {
+    npy_um_str_array_ufunc = PyUnicode_InternFromString("__array_ufunc__");
+    if (npy_um_str_array_ufunc == NULL) {
+        return -1;
+    }
      npy_um_str_array_prepare = PyUnicode_InternFromString("__array_prepare__");
      if (npy_um_str_array_prepare == NULL) {
          return -1;
@@ -246,6 +255,10 @@ int initumath(PyObject *m)
      /* Add some symbolic constants to the module */
      d = PyModule_GetDict(m);
  
+    if (InitOperators(d) < 0) {
+        return -1;
+    }
+
      PyDict_SetItemString(d, "pi", s = PyFloat_FromDouble(NPY_PI));
      Py_DECREF(s);
      PyDict_SetItemString(d, "e", s = PyFloat_FromDouble(NPY_E));
@@ -288,8 +301,8 @@ int initumath(PyObject *m)
      PyModule_AddObject(m, "NZERO", PyFloat_FromDouble(NPY_NZERO));
      PyModule_AddObject(m, "NAN", PyFloat_FromDouble(NPY_NAN));
  
-    s = PyDict_GetItemString(d, "true_divide");
-    PyDict_SetItemString(d, "divide", s);
+    s = PyDict_GetItemString(d, "divide");
+    PyDict_SetItemString(d, "true_divide", s);
  
      s = PyDict_GetItemString(d, "conjugate");
      s2 = PyDict_GetItemString(d, "remainder");
diff --git a/numpy/core/src/umath/wrapping_array_method.c b/numpy/core/src/umath/wrapping_array_method.c
new file mode 100644
index 0000000..9f8f036
--- /dev/null
+++ b/numpy/core/src/umath/wrapping_array_method.c
@@ -0,0 +1,301 @@
+/*
+ * This file defines most of the machinery in order to wrap an existing ufunc
+ * loop for use with a different set of dtypes.
+ *
+ * There are two approaches for this. One is to teach the NumPy core about
+ * the possibility that the loop descriptors do not exactly match the result
+ * descriptors.
+ * The other is to handle this fully by "wrapping", so that the NumPy core
+ * knows nothing about it.
+ * The slight difficulty here is that `context` metadata needs to be mutated.
+ * It also adds a tiny bit of overhead, since we have to "fix" the descriptors
+ * and unpack the auxdata.
+ *
+ * This means that this currently needs to live within NumPy: doing it
+ * outside would require extensive API exposure, as well as some thought on
+ * how to expose the `context` without breaking ABI forward compatibility.
+ * (I.e. we probably need to allocate the context and provide a copy function
+ * or so.)
+ */
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#define _MULTIARRAYMODULE
+#define _UMATHMODULE
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#include "numpy/ndarraytypes.h"
+
+#include "common.h"
+#include "array_method.h"
+#include "legacy_array_method.h"
+#include "dtypemeta.h"
+#include "dispatching.h"
+
+
+static NPY_CASTING
+wrapping_method_resolve_descriptors(
+        PyArrayMethodObject *self,
+        PyArray_DTypeMeta *dtypes[],
+        PyArray_Descr *given_descrs[],
+        PyArray_Descr *loop_descrs[],
+        npy_intp *view_offset)
+{
+    int nin = self->nin, nout = self->nout, nargs = nin + nout;
+    PyArray_Descr *orig_given_descrs[NPY_MAXARGS];
+    PyArray_Descr *orig_loop_descrs[NPY_MAXARGS];
+
+    if (self->translate_given_descrs(
+            nin, nout, self->wrapped_dtypes,
+            given_descrs, orig_given_descrs) < 0) {
+        return -1;
+    }
+    NPY_CASTING casting = self->wrapped_meth->resolve_descriptors(
+            self->wrapped_meth, self->wrapped_dtypes,
+            orig_given_descrs, orig_loop_descrs, view_offset);
+    for (int i = 0; i < nargs; i++) {
+        Py_XDECREF(orig_given_descrs[i]);
+    }
+    if (casting < 0) {
+        return -1;
+    }
+    int res = self->translate_loop_descrs(
+            nin, nout, dtypes, given_descrs, orig_loop_descrs, loop_descrs);
+    for (int i = 0; i < nargs; i++) {
+        Py_DECREF(orig_loop_descrs[i]);
+    }
+    if (res < 0) {
+        return -1;
+    }
+    return casting;
+}
+
+
+typedef struct {
+    NpyAuxData base;
+    /* Note that if context is expanded this may become trickier: */
+    PyArrayMethod_Context orig_context;
+    PyArrayMethod_StridedLoop *orig_loop;
+    NpyAuxData *orig_auxdata;
+    PyArray_Descr *descriptors[NPY_MAXARGS];
+} wrapping_auxdata;
+
+
+#define WRAPPING_AUXDATA_FREELIST_SIZE 5
+static int wrapping_auxdata_freenum = 0;
+static wrapping_auxdata *wrapping_auxdata_freelist[WRAPPING_AUXDATA_FREELIST_SIZE] = {NULL};
+
+
+static void
+wrapping_auxdata_free(wrapping_auxdata *wrapping_auxdata)
+{
+    /* Free auxdata, everything else is borrowed: */
+    NPY_AUXDATA_FREE(wrapping_auxdata->orig_auxdata);
+    wrapping_auxdata->orig_auxdata = NULL;
+
+    if (wrapping_auxdata_freenum < WRAPPING_AUXDATA_FREELIST_SIZE) {
+        wrapping_auxdata_freelist[wrapping_auxdata_freenum++] = wrapping_auxdata;
+    }
+    else {
+        PyMem_Free(wrapping_auxdata);
+    }
+}
+
+
+static wrapping_auxdata *
+get_wrapping_auxdata(void)
+{
+    wrapping_auxdata *res;
+    if (wrapping_auxdata_freenum > 0) {
+        wrapping_auxdata_freenum--;
+        res = wrapping_auxdata_freelist[wrapping_auxdata_freenum];
+    }
+    else {
+        res = PyMem_Calloc(1, sizeof(wrapping_auxdata));
+        if (res == NULL) {
+            PyErr_NoMemory();
+            return NULL;
+        }
+        res->base.free = (void *)wrapping_auxdata_free;
+        res->orig_context.descriptors = res->descriptors;
+    }
+
+    return res;
+}
+
+
+static int
+wrapping_method_strided_loop(PyArrayMethod_Context *NPY_UNUSED(context),
+        char *const data[], npy_intp const dimensions[],
+        npy_intp const strides[], wrapping_auxdata *auxdata)
+{
+    /*
+     * If more things get stored on the context, it could be possible that
+     * we would have to copy it here.  But currently, we do not.
+     */
+    return auxdata->orig_loop(
+            &auxdata->orig_context, data, dimensions, strides,
+            auxdata->orig_auxdata);
+}
+
+
+static int
+wrapping_method_get_loop(
+        PyArrayMethod_Context *context,
+        int aligned, int move_references, const npy_intp *strides,
+        PyArrayMethod_StridedLoop **out_loop, NpyAuxData **out_transferdata,
+        NPY_ARRAYMETHOD_FLAGS *flags)
+{
+    assert(move_references == 0);  /* only used internally for "decref" funcs */
+    int nin = context->method->nin, nout = context->method->nout;
+
+    wrapping_auxdata *auxdata = get_wrapping_auxdata();
+    if (auxdata == NULL) {
+        return -1;
+    }
+
+    auxdata->orig_context.method = context->method->wrapped_meth;
+    auxdata->orig_context.caller = context->caller;
+
+    if (context->method->translate_given_descrs(
+            nin, nout, context->method->wrapped_dtypes,
+            context->descriptors, auxdata->orig_context.descriptors) < 0) {
+        NPY_AUXDATA_FREE((NpyAuxData *)auxdata);
+        return -1;
+    }
+    if (context->method->wrapped_meth->get_strided_loop(
+            &auxdata->orig_context, aligned, 0, strides,
+            &auxdata->orig_loop, &auxdata->orig_auxdata,
+            flags) < 0) {
+        NPY_AUXDATA_FREE((NpyAuxData *)auxdata);
+        return -1;
+    }
+
+    *out_loop = (PyArrayMethod_StridedLoop *)&wrapping_method_strided_loop;
+    *out_transferdata = (NpyAuxData *)auxdata;
+    return 0;
+}
+
+
+/**
+ * Allows creating a fairly lightweight wrapper around an existing ufunc
+ * loop.  The idea is mainly for units, as this is currently slightly limited
+ * in that it enforces that you cannot use a loop from another ufunc.
+ *
+ * @param ufunc_obj
+ * @param new_dtypes
+ * @param wrapped_dtypes
+ * @param translate_given_descrs See typedef comment
+ * @param translate_loop_descrs See typedef comment
+ * @return 0 on success, -1 on failure
+ */
+NPY_NO_EXPORT int
+PyUFunc_AddWrappingLoop(PyObject *ufunc_obj,
+        PyArray_DTypeMeta *new_dtypes[], PyArray_DTypeMeta *wrapped_dtypes[],
+        translate_given_descrs_func *translate_given_descrs,
+        translate_loop_descrs_func *translate_loop_descrs)
+{
+    int res = -1;
+    PyUFuncObject *ufunc = (PyUFuncObject *)ufunc_obj;
+    PyObject *wrapped_dt_tuple = NULL;
+    PyObject *new_dt_tuple = NULL;
+    PyArrayMethodObject *meth = NULL;
+
+    if (!PyObject_TypeCheck(ufunc_obj, &PyUFunc_Type)) {
+        PyErr_SetString(PyExc_TypeError,
+                "ufunc object passed is not a ufunc!");
+        return -1;
+    }
+
+    wrapped_dt_tuple = PyArray_TupleFromItems(
+            ufunc->nargs, (PyObject **)wrapped_dtypes, 1);
+    if (wrapped_dt_tuple == NULL) {
+        goto finish;
+    }
+
+    PyArrayMethodObject *wrapped_meth = NULL;
+    PyObject *loops = ufunc->_loops;
+    Py_ssize_t length = PyList_Size(loops);
+    for (Py_ssize_t i = 0; i < length; i++) {
+        PyObject *item = PyList_GetItem(loops, i);
+        PyObject *cur_DType_tuple = PyTuple_GetItem(item, 0);
+        int cmp = PyObject_RichCompareBool(cur_DType_tuple, wrapped_dt_tuple, Py_EQ);
+        if (cmp < 0) {
+            goto finish;
+        }
+        if (cmp == 0) {
+            continue;
+        }
+        wrapped_meth = (PyArrayMethodObject *)PyTuple_GET_ITEM(item, 1);
+        if (!PyObject_TypeCheck(wrapped_meth, &PyArrayMethod_Type)) {
+            PyErr_SetString(PyExc_TypeError,
+                    "Matching loop was not an ArrayMethod.");
+            goto finish;
+        }
+        break;
+    }
+    if (wrapped_meth == NULL) {
+        PyErr_SetString(PyExc_TypeError,
+                "Did not find the to-be-wrapped loop in the ufunc.");
+        goto finish;
+    }
+
+    PyType_Slot slots[] = {
+        {NPY_METH_resolve_descriptors, &wrapping_method_resolve_descriptors},
+        {NPY_METH_get_loop, &wrapping_method_get_loop},
+        {0, NULL}
+    };
+
+    PyArrayMethod_Spec spec = {
+        .name = "wrapped-method",
+        .nin = wrapped_meth->nin,
+        .nout = wrapped_meth->nout,
+        .casting = wrapped_meth->casting,
+        .flags = wrapped_meth->flags,
+        .dtypes = new_dtypes,
+        .slots = slots,
+    };
+    PyBoundArrayMethodObject *bmeth = PyArrayMethod_FromSpec_int(&spec, 1);
+    if (bmeth == NULL) {
+        goto finish;
+    }
+
+    Py_INCREF(bmeth->method);
+    meth = bmeth->method;
+    Py_SETREF(bmeth, NULL);
+
+    /* Finalize the "wrapped" part of the new ArrayMethod */
+    meth->wrapped_dtypes = PyMem_Malloc(ufunc->nargs * sizeof(PyArray_DTypeMeta *));
+    if (meth->wrapped_dtypes == NULL) {
+        goto finish;
+    }
+
+    Py_INCREF(wrapped_meth);
+    meth->wrapped_meth = wrapped_meth;
+    meth->translate_given_descrs = translate_given_descrs;
+    meth->translate_loop_descrs = translate_loop_descrs;
+    for (int i = 0; i < ufunc->nargs; i++) {
+        Py_XINCREF(wrapped_dtypes[i]);
+        meth->wrapped_dtypes[i] = wrapped_dtypes[i];
+    }
+
+    new_dt_tuple = PyArray_TupleFromItems(
+            ufunc->nargs, (PyObject **)new_dtypes, 1);
+    if (new_dt_tuple == NULL) {
+        goto finish;
+    }
+
+    PyObject *info = PyTuple_Pack(2, new_dt_tuple, meth);
+    if (info == NULL) {
+        goto finish;
+    }
+
+    res = PyUFunc_AddLoop(ufunc, info, 0);
+    Py_DECREF(info);
+
+  finish:
+    Py_XDECREF(wrapped_dt_tuple);
+    Py_XDECREF(new_dt_tuple);
+    Py_XDECREF(meth);
+    return res;
+}
diff --git a/numpy/core/tests/data/generate_umath_validation_data.cpp b/numpy/core/tests/data/generate_umath_validation_data.cpp
index 418eae67006f185d1cb2d0ad86cb5329f8e103fa..51ee12501d86b3fa5a026700267944e63dc1ee04 100644
--- a/numpy/core/tests/data/generate_umath_validation_data.cpp
+++ b/numpy/core/tests/data/generate_umath_validation_data.cpp
@@ -1,10 +1,10 @@
  #include <algorithm>
  #include <fstream>
  #include <iostream>
-#include <math.h>
+#include <cmath>
  #include <random>
-#include <stdio.h>
-#include <time.h>
+#include <cstdio>
+#include <ctime>
  #include <vector>
  
  struct ufunc {
diff --git a/numpy/core/tests/data/umath-validation-set-exp.csv b/numpy/core/tests/data/umath-validation-set-exp.csv
index 7c5ef3b334fbd8a63bb1a0e6ad72577f767dbaf6..071fb312932f3d59c92d03c11d505a78b96c9257 100644
--- a/numpy/core/tests/data/umath-validation-set-exp.csv
+++ b/numpy/core/tests/data/umath-validation-set-exp.csv
@@ -135,278 +135,278 @@ np.float32,0xc2867878,0x0effff15,3
  np.float32,0xc2a2324a,0x04fffff4,3
  #float64
  ## near zero ##
-np.float64,0x8000000000000000,0x3ff0000000000000,1
-np.float64,0x8010000000000000,0x3ff0000000000000,1
-np.float64,0x8000000000000001,0x3ff0000000000000,1
-np.float64,0x8360000000000000,0x3ff0000000000000,1
-np.float64,0x9a70000000000000,0x3ff0000000000000,1
-np.float64,0xb9b0000000000000,0x3ff0000000000000,1
-np.float64,0xb810000000000000,0x3ff0000000000000,1
-np.float64,0xbc30000000000000,0x3ff0000000000000,1
-np.float64,0xb6a0000000000000,0x3ff0000000000000,1
-np.float64,0x0000000000000000,0x3ff0000000000000,1
-np.float64,0x0010000000000000,0x3ff0000000000000,1
-np.float64,0x0000000000000001,0x3ff0000000000000,1
-np.float64,0x0360000000000000,0x3ff0000000000000,1
-np.float64,0x1a70000000000000,0x3ff0000000000000,1
-np.float64,0x3c30000000000000,0x3ff0000000000000,1
-np.float64,0x36a0000000000000,0x3ff0000000000000,1
-np.float64,0x39b0000000000000,0x3ff0000000000000,1
-np.float64,0x3810000000000000,0x3ff0000000000000,1
+np.float64,0x8000000000000000,0x3ff0000000000000,2
+np.float64,0x8010000000000000,0x3ff0000000000000,2
+np.float64,0x8000000000000001,0x3ff0000000000000,2
+np.float64,0x8360000000000000,0x3ff0000000000000,2
+np.float64,0x9a70000000000000,0x3ff0000000000000,2
+np.float64,0xb9b0000000000000,0x3ff0000000000000,2
+np.float64,0xb810000000000000,0x3ff0000000000000,2
+np.float64,0xbc30000000000000,0x3ff0000000000000,2
+np.float64,0xb6a0000000000000,0x3ff0000000000000,2
+np.float64,0x0000000000000000,0x3ff0000000000000,2
+np.float64,0x0010000000000000,0x3ff0000000000000,2
+np.float64,0x0000000000000001,0x3ff0000000000000,2
+np.float64,0x0360000000000000,0x3ff0000000000000,2
+np.float64,0x1a70000000000000,0x3ff0000000000000,2
+np.float64,0x3c30000000000000,0x3ff0000000000000,2
+np.float64,0x36a0000000000000,0x3ff0000000000000,2
+np.float64,0x39b0000000000000,0x3ff0000000000000,2
+np.float64,0x3810000000000000,0x3ff0000000000000,2
  ## underflow ##
-np.float64,0xc0c6276800000000,0x0000000000000000,1
-np.float64,0xc0c62d918ce2421d,0x0000000000000000,1
-np.float64,0xc0c62d918ce2421e,0x0000000000000000,1
-np.float64,0xc0c62d91a0000000,0x0000000000000000,1
-np.float64,0xc0c62d9180000000,0x0000000000000000,1
-np.float64,0xc0c62dea45ee3e06,0x0000000000000000,1
-np.float64,0xc0c62dea45ee3e07,0x0000000000000000,1
-np.float64,0xc0c62dea40000000,0x0000000000000000,1
-np.float64,0xc0c62dea60000000,0x0000000000000000,1
-np.float64,0xc0875f1120000000,0x0000000000000000,1
-np.float64,0xc0875f113c30b1c8,0x0000000000000000,1
-np.float64,0xc0875f1140000000,0x0000000000000000,1
-np.float64,0xc093480000000000,0x0000000000000000,1
-np.float64,0xffefffffffffffff,0x0000000000000000,1
-np.float64,0xc7efffffe0000000,0x0000000000000000,1
+np.float64,0xc0c6276800000000,0x0000000000000000,2
+np.float64,0xc0c62d918ce2421d,0x0000000000000000,2
+np.float64,0xc0c62d918ce2421e,0x0000000000000000,2
+np.float64,0xc0c62d91a0000000,0x0000000000000000,2
+np.float64,0xc0c62d9180000000,0x0000000000000000,2
+np.float64,0xc0c62dea45ee3e06,0x0000000000000000,2
+np.float64,0xc0c62dea45ee3e07,0x0000000000000000,2
+np.float64,0xc0c62dea40000000,0x0000000000000000,2
+np.float64,0xc0c62dea60000000,0x0000000000000000,2
+np.float64,0xc0875f1120000000,0x0000000000000000,2
+np.float64,0xc0875f113c30b1c8,0x0000000000000000,2
+np.float64,0xc0875f1140000000,0x0000000000000000,2
+np.float64,0xc093480000000000,0x0000000000000000,2
+np.float64,0xffefffffffffffff,0x0000000000000000,2
+np.float64,0xc7efffffe0000000,0x0000000000000000,2
  ## overflow ##
-np.float64,0x40862e52fefa39ef,0x7ff0000000000000,1
-np.float64,0x40872e42fefa39ef,0x7ff0000000000000,1
+np.float64,0x40862e52fefa39ef,0x7ff0000000000000,2
+np.float64,0x40872e42fefa39ef,0x7ff0000000000000,2
  ## +/- INF, +/- NAN ##
-np.float64,0x7ff0000000000000,0x7ff0000000000000,1
-np.float64,0xfff0000000000000,0x0000000000000000,1
-np.float64,0x7ff8000000000000,0x7ff8000000000000,1
-np.float64,0xfff8000000000000,0xfff8000000000000,1
+np.float64,0x7ff0000000000000,0x7ff0000000000000,2
+np.float64,0xfff0000000000000,0x0000000000000000,2
+np.float64,0x7ff8000000000000,0x7ff8000000000000,2
+np.float64,0xfff8000000000000,0xfff8000000000000,2
  ## output denormal ##
-np.float64,0xc087438520000000,0x0000000000000001,1
-np.float64,0xc08743853f2f4461,0x0000000000000001,1
-np.float64,0xc08743853f2f4460,0x0000000000000001,1
-np.float64,0xc087438540000000,0x0000000000000001,1
+np.float64,0xc087438520000000,0x0000000000000001,2
+np.float64,0xc08743853f2f4461,0x0000000000000001,2
+np.float64,0xc08743853f2f4460,0x0000000000000001,2
+np.float64,0xc087438540000000,0x0000000000000001,2
  ## between -745.13321910 and 709.78271289 ##
-np.float64,0xbff760cd14774bd9,0x3fcdb14ced00ceb6,1
-np.float64,0xbff760cd20000000,0x3fcdb14cd7993879,1
-np.float64,0xbff760cd00000000,0x3fcdb14d12fbd264,1
-np.float64,0xc07f1cf360000000,0x130c1b369af14fda,1
-np.float64,0xbeb0000000000000,0x3feffffe00001000,1
-np.float64,0xbd70000000000000,0x3fefffffffffe000,1
-np.float64,0xc084fd46e5c84952,0x0360000000000139,1
-np.float64,0xc084fd46e5c84953,0x035ffffffffffe71,1
-np.float64,0xc084fd46e0000000,0x0360000b9096d32c,1
-np.float64,0xc084fd4700000000,0x035fff9721d12104,1
-np.float64,0xc086232bc0000000,0x0010003af5e64635,1
-np.float64,0xc086232bdd7abcd2,0x001000000000007c,1
-np.float64,0xc086232bdd7abcd3,0x000ffffffffffe7c,1
-np.float64,0xc086232be0000000,0x000ffffaf57a6fc9,1
-np.float64,0xc086233920000000,0x000fe590e3b45eb0,1
-np.float64,0xc086233938000000,0x000fe56133493c57,1
-np.float64,0xc086233940000000,0x000fe5514deffbbc,1
-np.float64,0xc086234c98000000,0x000fbf1024c32ccb,1
-np.float64,0xc086234ca0000000,0x000fbf0065bae78d,1
-np.float64,0xc086234c80000000,0x000fbf3f623a7724,1
-np.float64,0xc086234ec0000000,0x000fbad237c846f9,1
-np.float64,0xc086234ec8000000,0x000fbac27cfdec97,1
-np.float64,0xc086234ee0000000,0x000fba934cfd3dc2,1
-np.float64,0xc086234ef0000000,0x000fba73d7f618d9,1
-np.float64,0xc086234f00000000,0x000fba54632dddc0,1
-np.float64,0xc0862356e0000000,0x000faae0945b761a,1
-np.float64,0xc0862356f0000000,0x000faac13eb9a310,1
-np.float64,0xc086235700000000,0x000faaa1e9567b0a,1
-np.float64,0xc086236020000000,0x000f98cd75c11ed7,1
-np.float64,0xc086236ca0000000,0x000f8081b4d93f89,1
-np.float64,0xc086236cb0000000,0x000f8062b3f4d6c5,1
-np.float64,0xc086236cc0000000,0x000f8043b34e6f8c,1
-np.float64,0xc086238d98000000,0x000f41220d9b0d2c,1
-np.float64,0xc086238da0000000,0x000f4112cc80a01f,1
-np.float64,0xc086238d80000000,0x000f414fd145db5b,1
-np.float64,0xc08624fd00000000,0x000cbfce8ea1e6c4,1
-np.float64,0xc086256080000000,0x000c250747fcd46e,1
-np.float64,0xc08626c480000000,0x000a34f4bd975193,1
-np.float64,0xbf50000000000000,0x3feff800ffeaac00,1
-np.float64,0xbe10000000000000,0x3fefffffff800000,1
-np.float64,0xbcd0000000000000,0x3feffffffffffff8,1
-np.float64,0xc055d589e0000000,0x38100004bf94f63e,1
-np.float64,0xc055d58a00000000,0x380ffff97f292ce8,1
-np.float64,0xbfd962d900000000,0x3fe585a4b00110e1,1
-np.float64,0x3ff4bed280000000,0x400d411e7a58a303,1
-np.float64,0x3fff0b3620000000,0x401bd7737ffffcf3,1
-np.float64,0x3ff0000000000000,0x4005bf0a8b145769,1
-np.float64,0x3eb0000000000000,0x3ff0000100000800,1
-np.float64,0x3d70000000000000,0x3ff0000000001000,1
-np.float64,0x40862e42e0000000,0x7fefff841808287f,1
-np.float64,0x40862e42fefa39ef,0x7fefffffffffff2a,1
-np.float64,0x40862e0000000000,0x7feef85a11e73f2d,1
-np.float64,0x4000000000000000,0x401d8e64b8d4ddae,1
-np.float64,0x4009242920000000,0x40372a52c383a488,1
-np.float64,0x4049000000000000,0x44719103e4080b45,1
-np.float64,0x4008000000000000,0x403415e5bf6fb106,1
-np.float64,0x3f50000000000000,0x3ff00400800aab55,1
-np.float64,0x3e10000000000000,0x3ff0000000400000,1
-np.float64,0x3cd0000000000000,0x3ff0000000000004,1
-np.float64,0x40562e40a0000000,0x47effed088821c3f,1
-np.float64,0x40562e42e0000000,0x47effff082e6c7ff,1
-np.float64,0x40562e4300000000,0x47f00000417184b8,1
-np.float64,0x3fe8000000000000,0x4000ef9db467dcf8,1
-np.float64,0x402b12e8d4f33589,0x412718f68c71a6fe,1
-np.float64,0x402b12e8d4f3358a,0x412718f68c71a70a,1
-np.float64,0x402b12e8c0000000,0x412718f59a7f472e,1
-np.float64,0x402b12e8e0000000,0x412718f70c0eac62,1
+np.float64,0xbff760cd14774bd9,0x3fcdb14ced00ceb6,2
+np.float64,0xbff760cd20000000,0x3fcdb14cd7993879,2
+np.float64,0xbff760cd00000000,0x3fcdb14d12fbd264,2
+np.float64,0xc07f1cf360000000,0x130c1b369af14fda,2
+np.float64,0xbeb0000000000000,0x3feffffe00001000,2
+np.float64,0xbd70000000000000,0x3fefffffffffe000,2
+np.float64,0xc084fd46e5c84952,0x0360000000000139,2
+np.float64,0xc084fd46e5c84953,0x035ffffffffffe71,2
+np.float64,0xc084fd46e0000000,0x0360000b9096d32c,2
+np.float64,0xc084fd4700000000,0x035fff9721d12104,2
+np.float64,0xc086232bc0000000,0x0010003af5e64635,2
+np.float64,0xc086232bdd7abcd2,0x001000000000007c,2
+np.float64,0xc086232bdd7abcd3,0x000ffffffffffe7c,2
+np.float64,0xc086232be0000000,0x000ffffaf57a6fc9,2
+np.float64,0xc086233920000000,0x000fe590e3b45eb0,2
+np.float64,0xc086233938000000,0x000fe56133493c57,2
+np.float64,0xc086233940000000,0x000fe5514deffbbc,2
+np.float64,0xc086234c98000000,0x000fbf1024c32ccb,2
+np.float64,0xc086234ca0000000,0x000fbf0065bae78d,2
+np.float64,0xc086234c80000000,0x000fbf3f623a7724,2
+np.float64,0xc086234ec0000000,0x000fbad237c846f9,2
+np.float64,0xc086234ec8000000,0x000fbac27cfdec97,2
+np.float64,0xc086234ee0000000,0x000fba934cfd3dc2,2
+np.float64,0xc086234ef0000000,0x000fba73d7f618d9,2
+np.float64,0xc086234f00000000,0x000fba54632dddc0,2
+np.float64,0xc0862356e0000000,0x000faae0945b761a,2
+np.float64,0xc0862356f0000000,0x000faac13eb9a310,2
+np.float64,0xc086235700000000,0x000faaa1e9567b0a,2
+np.float64,0xc086236020000000,0x000f98cd75c11ed7,2
+np.float64,0xc086236ca0000000,0x000f8081b4d93f89,2
+np.float64,0xc086236cb0000000,0x000f8062b3f4d6c5,2
+np.float64,0xc086236cc0000000,0x000f8043b34e6f8c,2
+np.float64,0xc086238d98000000,0x000f41220d9b0d2c,2
+np.float64,0xc086238da0000000,0x000f4112cc80a01f,2
+np.float64,0xc086238d80000000,0x000f414fd145db5b,2
+np.float64,0xc08624fd00000000,0x000cbfce8ea1e6c4,2
+np.float64,0xc086256080000000,0x000c250747fcd46e,2
+np.float64,0xc08626c480000000,0x000a34f4bd975193,2
+np.float64,0xbf50000000000000,0x3feff800ffeaac00,2
+np.float64,0xbe10000000000000,0x3fefffffff800000,2
+np.float64,0xbcd0000000000000,0x3feffffffffffff8,2
+np.float64,0xc055d589e0000000,0x38100004bf94f63e,2
+np.float64,0xc055d58a00000000,0x380ffff97f292ce8,2
+np.float64,0xbfd962d900000000,0x3fe585a4b00110e1,2
+np.float64,0x3ff4bed280000000,0x400d411e7a58a303,2
+np.float64,0x3fff0b3620000000,0x401bd7737ffffcf3,2
+np.float64,0x3ff0000000000000,0x4005bf0a8b145769,2
+np.float64,0x3eb0000000000000,0x3ff0000100000800,2
+np.float64,0x3d70000000000000,0x3ff0000000001000,2
+np.float64,0x40862e42e0000000,0x7fefff841808287f,2
+np.float64,0x40862e42fefa39ef,0x7fefffffffffff2a,2
+np.float64,0x40862e0000000000,0x7feef85a11e73f2d,2
+np.float64,0x4000000000000000,0x401d8e64b8d4ddae,2
+np.float64,0x4009242920000000,0x40372a52c383a488,2
+np.float64,0x4049000000000000,0x44719103e4080b45,2
+np.float64,0x4008000000000000,0x403415e5bf6fb106,2
+np.float64,0x3f50000000000000,0x3ff00400800aab55,2
+np.float64,0x3e10000000000000,0x3ff0000000400000,2
+np.float64,0x3cd0000000000000,0x3ff0000000000004,2
+np.float64,0x40562e40a0000000,0x47effed088821c3f,2
+np.float64,0x40562e42e0000000,0x47effff082e6c7ff,2
+np.float64,0x40562e4300000000,0x47f00000417184b8,2
+np.float64,0x3fe8000000000000,0x4000ef9db467dcf8,2
+np.float64,0x402b12e8d4f33589,0x412718f68c71a6fe,2
+np.float64,0x402b12e8d4f3358a,0x412718f68c71a70a,2
+np.float64,0x402b12e8c0000000,0x412718f59a7f472e,2
+np.float64,0x402b12e8e0000000,0x412718f70c0eac62,2
  ##use 1th entry
-np.float64,0x40631659AE147CB4,0x4db3a95025a4890f,1
-np.float64,0xC061B87D2E85A4E2,0x332640c8e2de2c51,1
-np.float64,0x405A4A50BE243AF4,0x496a45e4b7f0339a,1
-np.float64,0xC0839898B98EC5C6,0x0764027828830df4,1
+np.float64,0x40631659AE147CB4,0x4db3a95025a4890f,2
+np.float64,0xC061B87D2E85A4E2,0x332640c8e2de2c51,2
+np.float64,0x405A4A50BE243AF4,0x496a45e4b7f0339a,2
+np.float64,0xC0839898B98EC5C6,0x0764027828830df4,2
  #use 2th entry
-np.float64,0xC072428C44B6537C,0x2596ade838b96f3e,1
-np.float64,0xC053057C5E1AE9BF,0x3912c8fad18fdadf,1
-np.float64,0x407E89C78328BAA3,0x6bfe35d5b9a1a194,1
-np.float64,0x4083501B6DD87112,0x77a855503a38924e,1
+np.float64,0xC072428C44B6537C,0x2596ade838b96f3e,2
+np.float64,0xC053057C5E1AE9BF,0x3912c8fad18fdadf,2
+np.float64,0x407E89C78328BAA3,0x6bfe35d5b9a1a194,2
+np.float64,0x4083501B6DD87112,0x77a855503a38924e,2
  #use 3th entry
-np.float64,0x40832C6195F24540,0x7741e73c80e5eb2f,1
-np.float64,0xC083D4CD557C2EC9,0x06b61727c2d2508e,1
-np.float64,0x400C48F5F67C99BD,0x404128820f02b92e,1
-np.float64,0x4056E36D9B2DF26A,0x4830f52ff34a8242,1
+np.float64,0x40832C6195F24540,0x7741e73c80e5eb2f,2
+np.float64,0xC083D4CD557C2EC9,0x06b61727c2d2508e,2
+np.float64,0x400C48F5F67C99BD,0x404128820f02b92e,2
+np.float64,0x4056E36D9B2DF26A,0x4830f52ff34a8242,2
  #use 4th entry
-np.float64,0x4080FF700D8CBD06,0x70fa70df9bc30f20,1
-np.float64,0x406C276D39E53328,0x543eb8e20a8f4741,1
-np.float64,0xC070D6159BBD8716,0x27a4a0548c904a75,1
-np.float64,0xC052EBCF8ED61F83,0x391c0e92368d15e4,1
+np.float64,0x4080FF700D8CBD06,0x70fa70df9bc30f20,2
+np.float64,0x406C276D39E53328,0x543eb8e20a8f4741,2
+np.float64,0xC070D6159BBD8716,0x27a4a0548c904a75,2
+np.float64,0xC052EBCF8ED61F83,0x391c0e92368d15e4,2
  #use 5th entry
-np.float64,0xC061F892A8AC5FBE,0x32f807a89efd3869,1
-np.float64,0x4021D885D2DBA085,0x40bd4dc86d3e3270,1
-np.float64,0x40767AEEEE7D4FCF,0x605e22851ee2afb7,1
-np.float64,0xC0757C5D75D08C80,0x20f0751599b992a2,1
+np.float64,0xC061F892A8AC5FBE,0x32f807a89efd3869,2
+np.float64,0x4021D885D2DBA085,0x40bd4dc86d3e3270,2
+np.float64,0x40767AEEEE7D4FCF,0x605e22851ee2afb7,2
+np.float64,0xC0757C5D75D08C80,0x20f0751599b992a2,2
  #use 6th entry
-np.float64,0x405ACF7A284C4CE3,0x499a4e0b7a27027c,1
-np.float64,0xC085A6C9E80D7AF5,0x0175914009d62ec2,1
-np.float64,0xC07E4C02F86F1DAE,0x1439269b29a9231e,1
-np.float64,0x4080D80F9691CC87,0x7088a6cdafb041de,1
+np.float64,0x405ACF7A284C4CE3,0x499a4e0b7a27027c,2
+np.float64,0xC085A6C9E80D7AF5,0x0175914009d62ec2,2
+np.float64,0xC07E4C02F86F1DAE,0x1439269b29a9231e,2
+np.float64,0x4080D80F9691CC87,0x7088a6cdafb041de,2
  #use 7th entry
-np.float64,0x407FDFD84FBA0AC1,0x6deb1ae6f9bc4767,1
-np.float64,0x40630C06A1A2213D,0x4dac7a9d51a838b7,1
-np.float64,0x40685FDB30BB8B4F,0x5183f5cc2cac9e79,1
-np.float64,0x408045A2208F77F4,0x6ee299e08e2aa2f0,1
+np.float64,0x407FDFD84FBA0AC1,0x6deb1ae6f9bc4767,2
+np.float64,0x40630C06A1A2213D,0x4dac7a9d51a838b7,2
+np.float64,0x40685FDB30BB8B4F,0x5183f5cc2cac9e79,2
+np.float64,0x408045A2208F77F4,0x6ee299e08e2aa2f0,2
  #use 8th entry
-np.float64,0xC08104E391F5078B,0x0ed397b7cbfbd230,1
-np.float64,0xC031501CAEFAE395,0x3e6040fd1ea35085,1
-np.float64,0xC079229124F6247C,0x1babf4f923306b1e,1
-np.float64,0x407FB65F44600435,0x6db03beaf2512b8a,1
+np.float64,0xC08104E391F5078B,0x0ed397b7cbfbd230,2
+np.float64,0xC031501CAEFAE395,0x3e6040fd1ea35085,2
+np.float64,0xC079229124F6247C,0x1babf4f923306b1e,2
+np.float64,0x407FB65F44600435,0x6db03beaf2512b8a,2
  #use 9th entry
-np.float64,0xC07EDEE8E8E8A5AC,0x136536cec9cbef48,1
-np.float64,0x4072BB4086099A14,0x5af4d3c3008b56cc,1
-np.float64,0x4050442A2EC42CB4,0x45cd393bd8fad357,1
-np.float64,0xC06AC28FB3D419B4,0x2ca1b9d3437df85f,1
+np.float64,0xC07EDEE8E8E8A5AC,0x136536cec9cbef48,2
+np.float64,0x4072BB4086099A14,0x5af4d3c3008b56cc,2
+np.float64,0x4050442A2EC42CB4,0x45cd393bd8fad357,2
+np.float64,0xC06AC28FB3D419B4,0x2ca1b9d3437df85f,2
  #use 10th entry
-np.float64,0x40567FC6F0A68076,0x480c977fd5f3122e,1
-np.float64,0x40620A2F7EDA59BB,0x4cf278e96f4ce4d7,1
-np.float64,0xC085044707CD557C,0x034aad6c968a045a,1
-np.float64,0xC07374EA5AC516AA,0x23dd6afdc03e83d5,1
+np.float64,0x40567FC6F0A68076,0x480c977fd5f3122e,2
+np.float64,0x40620A2F7EDA59BB,0x4cf278e96f4ce4d7,2
+np.float64,0xC085044707CD557C,0x034aad6c968a045a,2
+np.float64,0xC07374EA5AC516AA,0x23dd6afdc03e83d5,2
  #use 11th entry
-np.float64,0x4073CC95332619C1,0x5c804b1498bbaa54,1
-np.float64,0xC0799FEBBE257F31,0x1af6a954c43b87d2,1
-np.float64,0x408159F19EA424F6,0x7200858efcbfc84d,1
-np.float64,0x404A81F6F24C0792,0x44b664a07ce5bbfa,1
+np.float64,0x4073CC95332619C1,0x5c804b1498bbaa54,2
+np.float64,0xC0799FEBBE257F31,0x1af6a954c43b87d2,2
+np.float64,0x408159F19EA424F6,0x7200858efcbfc84d,2
+np.float64,0x404A81F6F24C0792,0x44b664a07ce5bbfa,2
  #use 12th entry
-np.float64,0x40295FF1EFB9A741,0x4113c0e74c52d7b0,1
-np.float64,0x4073975F4CC411DA,0x5c32be40b4fec2c1,1
-np.float64,0x406E9DE52E82A77E,0x56049c9a3f1ae089,1
-np.float64,0x40748C2F52560ED9,0x5d93bc14fd4cd23b,1
+np.float64,0x40295FF1EFB9A741,0x4113c0e74c52d7b0,2
+np.float64,0x4073975F4CC411DA,0x5c32be40b4fec2c1,2
+np.float64,0x406E9DE52E82A77E,0x56049c9a3f1ae089,2
+np.float64,0x40748C2F52560ED9,0x5d93bc14fd4cd23b,2
  #use 13th entry
-np.float64,0x4062A553CDC4D04C,0x4d6266bfde301318,1
-np.float64,0xC079EC1D63598AB7,0x1a88cb184dab224c,1
-np.float64,0xC0725C1CB3167427,0x25725b46f8a081f6,1
-np.float64,0x407888771D9B45F9,0x6353b1ec6bd7ce80,1
+np.float64,0x4062A553CDC4D04C,0x4d6266bfde301318,2
+np.float64,0xC079EC1D63598AB7,0x1a88cb184dab224c,2
+np.float64,0xC0725C1CB3167427,0x25725b46f8a081f6,2
+np.float64,0x407888771D9B45F9,0x6353b1ec6bd7ce80,2
  #use 14th entry
-np.float64,0xC082CBA03AA89807,0x09b383723831ce56,1
-np.float64,0xC083A8961BB67DD7,0x0735b118d5275552,1
-np.float64,0xC076BC6ECA12E7E3,0x1f2222679eaef615,1
-np.float64,0xC072752503AA1A5B,0x254eb832242c77e1,1
+np.float64,0xC082CBA03AA89807,0x09b383723831ce56,2
+np.float64,0xC083A8961BB67DD7,0x0735b118d5275552,2
+np.float64,0xC076BC6ECA12E7E3,0x1f2222679eaef615,2
+np.float64,0xC072752503AA1A5B,0x254eb832242c77e1,2
  #use 15th entry
-np.float64,0xC058800792125DEC,0x371882372a0b48d4,1
-np.float64,0x4082909FD863E81C,0x7580d5f386920142,1
-np.float64,0xC071616F8FB534F9,0x26dbe20ef64a412b,1
-np.float64,0x406D1AB571CAA747,0x54ee0d55cb38ac20,1
+np.float64,0xC058800792125DEC,0x371882372a0b48d4,2
+np.float64,0x4082909FD863E81C,0x7580d5f386920142,2
+np.float64,0xC071616F8FB534F9,0x26dbe20ef64a412b,2
+np.float64,0x406D1AB571CAA747,0x54ee0d55cb38ac20,2
  #use 16th entry
-np.float64,0x406956428B7DAD09,0x52358682c271237f,1
-np.float64,0xC07EFC2D9D17B621,0x133b3e77c27a4d45,1
-np.float64,0xC08469BAC5BA3CCA,0x050863e5f42cc52f,1
-np.float64,0x407189D9626386A5,0x593cb1c0b3b5c1d3,1
+np.float64,0x406956428B7DAD09,0x52358682c271237f,2
+np.float64,0xC07EFC2D9D17B621,0x133b3e77c27a4d45,2
+np.float64,0xC08469BAC5BA3CCA,0x050863e5f42cc52f,2
+np.float64,0x407189D9626386A5,0x593cb1c0b3b5c1d3,2
  #use 17th entry
-np.float64,0x4077E652E3DEB8C6,0x6269a10dcbd3c752,1
-np.float64,0x407674C97DB06878,0x605485dcc2426ec2,1
-np.float64,0xC07CE9969CF4268D,0x16386cf8996669f2,1
-np.float64,0x40780EE32D5847C4,0x62a436bd1abe108d,1
+np.float64,0x4077E652E3DEB8C6,0x6269a10dcbd3c752,2
+np.float64,0x407674C97DB06878,0x605485dcc2426ec2,2
+np.float64,0xC07CE9969CF4268D,0x16386cf8996669f2,2
+np.float64,0x40780EE32D5847C4,0x62a436bd1abe108d,2
  #use 18th entry
-np.float64,0x4076C3AA5E1E8DA1,0x60c62f56a5e72e24,1
-np.float64,0xC0730AFC7239B9BE,0x24758ead095cec1e,1
-np.float64,0xC085CC2B9C420DDB,0x0109cdaa2e5694c1,1
-np.float64,0x406D0765CB6D7AA4,0x54e06f8dd91bd945,1
+np.float64,0x4076C3AA5E1E8DA1,0x60c62f56a5e72e24,2
+np.float64,0xC0730AFC7239B9BE,0x24758ead095cec1e,2
+np.float64,0xC085CC2B9C420DDB,0x0109cdaa2e5694c1,2
+np.float64,0x406D0765CB6D7AA4,0x54e06f8dd91bd945,2
  #use 19th entry
-np.float64,0xC082D011F3B495E7,0x09a6647661d279c2,1
-np.float64,0xC072826AF8F6AFBC,0x253acd3cd224507e,1
-np.float64,0x404EB9C4810CEA09,0x457933dbf07e8133,1
-np.float64,0x408284FBC97C58CE,0x755f6eb234aa4b98,1
+np.float64,0xC082D011F3B495E7,0x09a6647661d279c2,2
+np.float64,0xC072826AF8F6AFBC,0x253acd3cd224507e,2
+np.float64,0x404EB9C4810CEA09,0x457933dbf07e8133,2
+np.float64,0x408284FBC97C58CE,0x755f6eb234aa4b98,2
  #use 20th entry
-np.float64,0x40856008CF6EDC63,0x7d9c0b3c03f4f73c,1
-np.float64,0xC077CB2E9F013B17,0x1d9b3d3a166a55db,1
-np.float64,0xC0479CA3C20AD057,0x3bad40e081555b99,1
-np.float64,0x40844CD31107332A,0x7a821d70aea478e2,1
+np.float64,0x40856008CF6EDC63,0x7d9c0b3c03f4f73c,2
+np.float64,0xC077CB2E9F013B17,0x1d9b3d3a166a55db,2
+np.float64,0xC0479CA3C20AD057,0x3bad40e081555b99,2
+np.float64,0x40844CD31107332A,0x7a821d70aea478e2,2
  #use 21th entry
-np.float64,0xC07C8FCC0BFCC844,0x16ba1cc8c539d19b,1
-np.float64,0xC085C4E9A3ABA488,0x011ff675ba1a2217,1
-np.float64,0x4074D538B32966E5,0x5dfd9d78043c6ad9,1
-np.float64,0xC0630CA16902AD46,0x3231a446074cede6,1
+np.float64,0xC07C8FCC0BFCC844,0x16ba1cc8c539d19b,2
+np.float64,0xC085C4E9A3ABA488,0x011ff675ba1a2217,2
+np.float64,0x4074D538B32966E5,0x5dfd9d78043c6ad9,2
+np.float64,0xC0630CA16902AD46,0x3231a446074cede6,2
  #use 22th entry
-np.float64,0xC06C826733D7D0B7,0x2b5f1078314d41e1,1
-np.float64,0xC0520DF55B2B907F,0x396c13a6ce8e833e,1
-np.float64,0xC080712072B0F437,0x107eae02d11d98ea,1
-np.float64,0x40528A6150E19EFB,0x469fdabda02228c5,1
+np.float64,0xC06C826733D7D0B7,0x2b5f1078314d41e1,2
+np.float64,0xC0520DF55B2B907F,0x396c13a6ce8e833e,2
+np.float64,0xC080712072B0F437,0x107eae02d11d98ea,2
+np.float64,0x40528A6150E19EFB,0x469fdabda02228c5,2
  #use 23th entry
-np.float64,0xC07B1D74B6586451,0x18d1253883ae3b48,1
-np.float64,0x4045AFD7867DAEC0,0x43d7d634fc4c5d98,1
-np.float64,0xC07A08B91F9ED3E2,0x1a60973e6397fc37,1
-np.float64,0x407B3ECF0AE21C8C,0x673e03e9d98d7235,1
+np.float64,0xC07B1D74B6586451,0x18d1253883ae3b48,2
+np.float64,0x4045AFD7867DAEC0,0x43d7d634fc4c5d98,2
+np.float64,0xC07A08B91F9ED3E2,0x1a60973e6397fc37,2
+np.float64,0x407B3ECF0AE21C8C,0x673e03e9d98d7235,2
  #use 24th entry
-np.float64,0xC078AEB6F30CEABF,0x1c530b93ab54a1b3,1
-np.float64,0x4084495006A41672,0x7a775b6dc7e63064,1
-np.float64,0x40830B1C0EBF95DD,0x76e1e6eed77cfb89,1
-np.float64,0x407D93E8F33D8470,0x6a9adbc9e1e4f1e5,1
+np.float64,0xC078AEB6F30CEABF,0x1c530b93ab54a1b3,2
+np.float64,0x4084495006A41672,0x7a775b6dc7e63064,2
+np.float64,0x40830B1C0EBF95DD,0x76e1e6eed77cfb89,2
+np.float64,0x407D93E8F33D8470,0x6a9adbc9e1e4f1e5,2
  #use 25th entry
-np.float64,0x4066B11A09EFD9E8,0x504dd528065c28a7,1
-np.float64,0x408545823723AEEB,0x7d504a9b1844f594,1
-np.float64,0xC068C711F2CA3362,0x2e104f3496ea118e,1
-np.float64,0x407F317FCC3CA873,0x6cf0732c9948ebf4,1
+np.float64,0x4066B11A09EFD9E8,0x504dd528065c28a7,2
+np.float64,0x408545823723AEEB,0x7d504a9b1844f594,2
+np.float64,0xC068C711F2CA3362,0x2e104f3496ea118e,2
+np.float64,0x407F317FCC3CA873,0x6cf0732c9948ebf4,2
  #use 26th entry
-np.float64,0x407AFB3EBA2ED50F,0x66dc28a129c868d5,1
-np.float64,0xC075377037708ADE,0x21531a329f3d793e,1
-np.float64,0xC07C30066A1F3246,0x174448baa16ded2b,1
-np.float64,0xC06689A75DE2ABD3,0x2fad70662fae230b,1
+np.float64,0x407AFB3EBA2ED50F,0x66dc28a129c868d5,2
+np.float64,0xC075377037708ADE,0x21531a329f3d793e,2
+np.float64,0xC07C30066A1F3246,0x174448baa16ded2b,2
+np.float64,0xC06689A75DE2ABD3,0x2fad70662fae230b,2
  #use 27th entry
-np.float64,0x4081514E9FCCF1E0,0x71e673b9efd15f44,1
-np.float64,0xC0762C710AF68460,0x1ff1ed7d8947fe43,1
-np.float64,0xC0468102FF70D9C4,0x3be0c3a8ff3419a3,1
-np.float64,0xC07EA4CEEF02A83E,0x13b908f085102c61,1
+np.float64,0x4081514E9FCCF1E0,0x71e673b9efd15f44,2
+np.float64,0xC0762C710AF68460,0x1ff1ed7d8947fe43,2
+np.float64,0xC0468102FF70D9C4,0x3be0c3a8ff3419a3,2
+np.float64,0xC07EA4CEEF02A83E,0x13b908f085102c61,2
  #use 28th entry
-np.float64,0xC06290B04AE823C4,0x328a83da3c2e3351,1
-np.float64,0xC0770EB1D1C395FB,0x1eab281c1f1db5fe,1
-np.float64,0xC06F5D4D838A5BAE,0x29500ea32fb474ea,1
-np.float64,0x40723B3133B54C5D,0x5a3c82c7c3a2b848,1
+np.float64,0xC06290B04AE823C4,0x328a83da3c2e3351,2
+np.float64,0xC0770EB1D1C395FB,0x1eab281c1f1db5fe,2
+np.float64,0xC06F5D4D838A5BAE,0x29500ea32fb474ea,2
+np.float64,0x40723B3133B54C5D,0x5a3c82c7c3a2b848,2
  #use 29th entry
-np.float64,0x4085E6454CE3B4AA,0x7f20319b9638d06a,1
-np.float64,0x408389F2A0585D4B,0x7850667c58aab3d0,1
-np.float64,0xC0382798F9C8AE69,0x3dc1c79fe8739d6d,1
-np.float64,0xC08299D827608418,0x0a4335f76cdbaeb5,1
+np.float64,0x4085E6454CE3B4AA,0x7f20319b9638d06a,2
+np.float64,0x408389F2A0585D4B,0x7850667c58aab3d0,2
+np.float64,0xC0382798F9C8AE69,0x3dc1c79fe8739d6d,2
+np.float64,0xC08299D827608418,0x0a4335f76cdbaeb5,2
  #use 30th entry
-np.float64,0xC06F3DED43301BF1,0x2965670ae46750a8,1
-np.float64,0xC070CAF6BDD577D9,0x27b4aa4ffdd29981,1
-np.float64,0x4078529AD4B2D9F2,0x6305c12755d5e0a6,1
-np.float64,0xC055B14E75A31B96,0x381c2eda6d111e5d,1
+np.float64,0xC06F3DED43301BF1,0x2965670ae46750a8,2
+np.float64,0xC070CAF6BDD577D9,0x27b4aa4ffdd29981,2
+np.float64,0x4078529AD4B2D9F2,0x6305c12755d5e0a6,2
+np.float64,0xC055B14E75A31B96,0x381c2eda6d111e5d,2
  #use 31th entry
-np.float64,0x407B13EE414FA931,0x6700772c7544564d,1
-np.float64,0x407EAFDE9DE3EC54,0x6c346a0e49724a3c,1
-np.float64,0xC08362F398B9530D,0x07ffeddbadf980cb,1
-np.float64,0x407E865CDD9EEB86,0x6bf866cac5e0d126,1
+np.float64,0x407B13EE414FA931,0x6700772c7544564d,2
+np.float64,0x407EAFDE9DE3EC54,0x6c346a0e49724a3c,2
+np.float64,0xC08362F398B9530D,0x07ffeddbadf980cb,2
+np.float64,0x407E865CDD9EEB86,0x6bf866cac5e0d126,2
  #use 32th entry
-np.float64,0x407FB62DBC794C86,0x6db009f708ac62cb,1
-np.float64,0xC063D0BAA68CDDDE,0x31a3b2a51ce50430,1
-np.float64,0xC05E7706A2231394,0x34f24bead6fab5c9,1
-np.float64,0x4083E3A06FDE444E,0x79527b7a386d1937,1
+np.float64,0x407FB62DBC794C86,0x6db009f708ac62cb,2
+np.float64,0xC063D0BAA68CDDDE,0x31a3b2a51ce50430,2
+np.float64,0xC05E7706A2231394,0x34f24bead6fab5c9,2
+np.float64,0x4083E3A06FDE444E,0x79527b7a386d1937,2
diff --git a/numpy/core/tests/data/umath-validation-set-log.csv b/numpy/core/tests/data/umath-validation-set-log.csv
index b8f6b08757d5bddbec06328ac2c3f6b694d85fd3..7717745d59bb160f47153de3f2e8bd650e00aa42 100644
--- a/numpy/core/tests/data/umath-validation-set-log.csv
+++ b/numpy/core/tests/data/umath-validation-set-log.csv
@@ -118,154 +118,154 @@ np.float32,0x3f4884e8,0xbe7a214a,4
  np.float32,0x3f486945,0xbe7aae76,4
  #float64
  ## +ve denormal ##
-np.float64,0x0000000000000001,0xc0874385446d71c3,1
-np.float64,0x0001000000000000,0xc086395a2079b70c,1
-np.float64,0x000fffffffffffff,0xc086232bdd7abcd2,1
-np.float64,0x0007ad63e2168cb6,0xc086290bc0b2980f,1
+np.float64,0x0000000000000001,0xc0874385446d71c3,2
+np.float64,0x0001000000000000,0xc086395a2079b70c,2
+np.float64,0x000fffffffffffff,0xc086232bdd7abcd2,2
+np.float64,0x0007ad63e2168cb6,0xc086290bc0b2980f,2
  ## -ve denormal ##
-np.float64,0x8000000000000001,0xfff8000000000001,1
-np.float64,0x8001000000000000,0xfff8000000000001,1
-np.float64,0x800fffffffffffff,0xfff8000000000001,1
-np.float64,0x8007ad63e2168cb6,0xfff8000000000001,1
+np.float64,0x8000000000000001,0xfff8000000000001,2
+np.float64,0x8001000000000000,0xfff8000000000001,2
+np.float64,0x800fffffffffffff,0xfff8000000000001,2
+np.float64,0x8007ad63e2168cb6,0xfff8000000000001,2
  ## +/-0.0f, MAX, MIN##
-np.float64,0x0000000000000000,0xfff0000000000000,1
-np.float64,0x8000000000000000,0xfff0000000000000,1
-np.float64,0x7fefffffffffffff,0x40862e42fefa39ef,1
-np.float64,0xffefffffffffffff,0xfff8000000000001,1
+np.float64,0x0000000000000000,0xfff0000000000000,2
+np.float64,0x8000000000000000,0xfff0000000000000,2
+np.float64,0x7fefffffffffffff,0x40862e42fefa39ef,2
+np.float64,0xffefffffffffffff,0xfff8000000000001,2
  ## near 1.0f ##
-np.float64,0x3ff0000000000000,0x0000000000000000,1
-np.float64,0x3fe8000000000000,0xbfd269621134db92,1
-np.float64,0x3ff0000000000001,0x3cafffffffffffff,1
-np.float64,0x3ff0000020000000,0x3e7fffffe000002b,1
-np.float64,0x3ff0000000000001,0x3cafffffffffffff,1
-np.float64,0x3fefffffe0000000,0xbe70000008000005,1
-np.float64,0x3fefffffffffffff,0xbca0000000000000,1
+np.float64,0x3ff0000000000000,0x0000000000000000,2
+np.float64,0x3fe8000000000000,0xbfd269621134db92,2
+np.float64,0x3ff0000000000001,0x3cafffffffffffff,2
+np.float64,0x3ff0000020000000,0x3e7fffffe000002b,2
+np.float64,0x3ff0000000000001,0x3cafffffffffffff,2
+np.float64,0x3fefffffe0000000,0xbe70000008000005,2
+np.float64,0x3fefffffffffffff,0xbca0000000000000,2
  ## random numbers ##
-np.float64,0x02500186f3d9da56,0xc0855b8abf135773,1
-np.float64,0x09200815a3951173,0xc082ff1ad7131bdc,1
-np.float64,0x0da029623b0243d4,0xc0816fc994695bb5,1
-np.float64,0x48703b8ac483a382,0x40579213a313490b,1
-np.float64,0x09207b74c87c9860,0xc082fee20ff349ef,1
-np.float64,0x62c077698e8df947,0x407821c996d110f0,1
-np.float64,0x2350b45e87c3cfb0,0xc073d6b16b51d072,1
-np.float64,0x3990a23f9ff2b623,0xc051aa60eadd8c61,1
-np.float64,0x0d011386a116c348,0xc081a6cc7ea3b8fb,1
-np.float64,0x1fe0f0303ebe273a,0xc0763870b78a81ca,1
-np.float64,0x0cd1260121d387da,0xc081b7668d61a9d1,1
-np.float64,0x1e6135a8f581d422,0xc077425ac10f08c2,1
-np.float64,0x622168db5fe52d30,0x4077b3c669b9fadb,1
-np.float64,0x69f188e1ec6d1718,0x407d1e2f18c63889,1
-np.float64,0x3aa1bf1d9c4dd1a3,0xc04d682e24bde479,1
-np.float64,0x6c81c4011ce4f683,0x407ee5190e8a8e6a,1
-np.float64,0x2191fa55aa5a5095,0xc0750c0c318b5e2d,1
-np.float64,0x32a1f602a32bf360,0xc06270caa493fc17,1
-np.float64,0x16023c90ba93249b,0xc07d0f88e0801638,1
-np.float64,0x1c525fe6d71fa9ff,0xc078af49c66a5d63,1
-np.float64,0x1a927675815d65b7,0xc079e5bdd7fe376e,1
-np.float64,0x41227b8fe70da028,0x402aa0c9f9a84c71,1
-np.float64,0x4962bb6e853fe87d,0x405a34aa04c83747,1
-np.float64,0x23d2cda00b26b5a4,0xc0737c13a06d00ea,1
-np.float64,0x2d13083fd62987fa,0xc06a25055aeb474e,1
-np.float64,0x10e31e4c9b4579a1,0xc0804e181929418e,1
-np.float64,0x26d3247d556a86a9,0xc0716774171da7e8,1
-np.float64,0x6603379398d0d4ac,0x407a64f51f8a887b,1
-np.float64,0x02d38af17d9442ba,0xc0852d955ac9dd68,1
-np.float64,0x6a2382b4818dd967,0x407d4129d688e5d4,1
-np.float64,0x2ee3c403c79b3934,0xc067a091fefaf8b6,1
-np.float64,0x6493a699acdbf1a4,0x4079663c8602bfc5,1
-np.float64,0x1c8413c4f0de3100,0xc0788c99697059b6,1
-np.float64,0x4573f1ed350d9622,0x404e9bd1e4c08920,1
-np.float64,0x2f34265c9200b69c,0xc067310cfea4e986,1
-np.float64,0x19b43e65fa22029b,0xc07a7f8877de22d6,1
-np.float64,0x0af48ab7925ed6bc,0xc0825c4fbc0e5ade,1
-np.float64,0x4fa49699cad82542,0x4065c76d2a318235,1
-np.float64,0x7204a15e56ade492,0x40815bb87484dffb,1
-np.float64,0x4734aa08a230982d,0x40542a4bf7a361a9,1
-np.float64,0x1ae4ed296c2fd749,0xc079ac4921f20abb,1
-np.float64,0x472514ea4370289c,0x4053ff372bd8f18f,1
-np.float64,0x53a54b3f73820430,0x406b5411fc5f2e33,1
-np.float64,0x64754de5a15684fa,0x407951592e99a5ab,1
-np.float64,0x69358e279868a7c3,0x407c9c671a882c31,1
-np.float64,0x284579ec61215945,0xc0706688e55f0927,1
-np.float64,0x68b5c58806447adc,0x407c43d6f4eff760,1
-np.float64,0x1945a83f98b0e65d,0xc07acc15eeb032cc,1
-np.float64,0x0fc5eb98a16578bf,0xc080b0d02eddca0e,1
-np.float64,0x6a75e208f5784250,0x407d7a7383bf8f05,1
-np.float64,0x0fe63a029c47645d,0xc080a59ca1e98866,1
-np.float64,0x37963ac53f065510,0xc057236281f7bdb6,1
-np.float64,0x135661bb07067ff7,0xc07ee924930c21e4,1
-np.float64,0x4b4699469d458422,0x405f73843756e887,1
-np.float64,0x1a66d73e4bf4881b,0xc07a039ba1c63adf,1
-np.float64,0x12a6b9b119a7da59,0xc07f62e49c6431f3,1
-np.float64,0x24c719aa8fd1bdb5,0xc072d26da4bf84d3,1
-np.float64,0x0fa6ff524ffef314,0xc080bb8514662e77,1
-np.float64,0x1db751d66fdd4a9a,0xc077b77cb50d7c92,1
-np.float64,0x4947374c516da82c,0x4059e9acfc7105bf,1
-np.float64,0x1b1771ab98f3afc8,0xc07989326b8e1f66,1
-np.float64,0x25e78805baac8070,0xc0720a818e6ef080,1
-np.float64,0x4bd7a148225d3687,0x406082d004ea3ee7,1
-np.float64,0x53d7d6b2bbbda00a,0x406b9a398967cbd5,1
-np.float64,0x6997fb9f4e1c685f,0x407ce0a703413eba,1
-np.float64,0x069802c2ff71b951,0xc083df39bf7acddc,1
-np.float64,0x4d683ac9890f66d8,0x4062ae21d8c2acf0,1
-np.float64,0x5a2825863ec14f4c,0x40722d718d549552,1
-np.float64,0x0398799a88f4db80,0xc084e93dab8e2158,1
-np.float64,0x5ed87a8b77e135a5,0x40756d7051777b33,1
-np.float64,0x5828cd6d79b9bede,0x4070cafb22fc6ca1,1
-np.float64,0x7b18ba2a5ec6f068,0x408481386b3ed6fe,1
-np.float64,0x4938fd60922198fe,0x4059c206b762ea7e,1
-np.float64,0x31b8f44fcdd1a46e,0xc063b2faa8b6434e,1
-np.float64,0x5729341c0d918464,0x407019cac0c4a7d7,1
-np.float64,0x13595e9228ee878e,0xc07ee7235a7d8088,1
-np.float64,0x17698b0dc9dd4135,0xc07c1627e3a5ad5f,1
-np.float64,0x63b977c283abb0cc,0x4078cf1ec6ed65be,1
-np.float64,0x7349cc0d4dc16943,0x4081cc697ce4cb53,1
-np.float64,0x4e49a80b732fb28d,0x4063e67e3c5cbe90,1
-np.float64,0x07ba14b848a8ae02,0xc0837ac032a094e0,1
-np.float64,0x3da9f17b691bfddc,0xc03929c25366acda,1
-np.float64,0x02ea39aa6c3ac007,0xc08525af6f21e1c4,1
-np.float64,0x3a6a42f04ed9563d,0xc04e98e825dca46b,1
-np.float64,0x1afa877cd7900be7,0xc0799d6648cb34a9,1
-np.float64,0x58ea986649e052c6,0x4071512e939ad790,1
-np.float64,0x691abbc04647f536,0x407c89aaae0fcb83,1
-np.float64,0x43aabc5063e6f284,0x4044b45d18106fd2,1
-np.float64,0x488b003c893e0bea,0x4057df012a2dafbe,1
-np.float64,0x77eb076ed67caee5,0x40836720de94769e,1
-np.float64,0x5c1b46974aba46f4,0x40738731ba256007,1
-np.float64,0x1a5b29ecb5d3c261,0xc07a0becc77040d6,1
-np.float64,0x5d8b6ccf868c6032,0x4074865c1865e2db,1
-np.float64,0x4cfb6690b4aaf5af,0x406216cd8c7e8ddb,1
-np.float64,0x76cbd8eb5c5fc39e,0x4083038dc66d682b,1
-np.float64,0x28bbd1fec5012814,0xc07014c2dd1b9711,1
-np.float64,0x33dc1b3a4fd6bf7a,0xc060bd0756e07d8a,1
-np.float64,0x52bbe89b37de99f3,0x406a10041aa7d343,1
-np.float64,0x07bc479d15eb2dd3,0xc0837a1a6e3a3b61,1
-np.float64,0x18fc5275711a901d,0xc07aff3e9d62bc93,1
-np.float64,0x114c9758e247dc71,0xc080299a7cf15b05,1
-np.float64,0x25ac8f6d60755148,0xc07233c4c0c511d4,1
-np.float64,0x260cae2bb9e9fd7e,0xc071f128c7e82eac,1
-np.float64,0x572ccdfe0241de82,0x40701bedc84bb504,1
-np.float64,0x0ddcef6c8d41f5ee,0xc0815a7e16d07084,1
-np.float64,0x6dad1d59c988af68,0x407fb4a0bc0142b1,1
-np.float64,0x025d200580d8b6d1,0xc08556c0bc32b1b2,1
-np.float64,0x7aad344b6aa74c18,0x40845bbc453f22be,1
-np.float64,0x5b5d9d6ad9d14429,0x4073036d2d21f382,1
-np.float64,0x49cd8d8dcdf19954,0x405b5c034f5c7353,1
-np.float64,0x63edb9483335c1e6,0x4078f2dd21378786,1
-np.float64,0x7b1dd64c9d2c26bd,0x408482b922017bc9,1
-np.float64,0x782e13e0b574be5f,0x40837e2a0090a5ad,1
-np.float64,0x592dfe18b9d6db2f,0x40717f777fbcb1ec,1
-np.float64,0x654e3232ac60d72c,0x4079e71a95a70446,1
-np.float64,0x7b8e42ad22091456,0x4084a9a6f1e61722,1
-np.float64,0x570e88dfd5860ae6,0x407006ae6c0d137a,1
-np.float64,0x294e98346cb98ef1,0xc06f5edaac12bd44,1
-np.float64,0x1adeaa4ab792e642,0xc079b1431d5e2633,1
-np.float64,0x7b6ead3377529ac8,0x40849eabc8c7683c,1
-np.float64,0x2b8eedae8a9b2928,0xc06c400054deef11,1
-np.float64,0x65defb45b2dcf660,0x407a4b53f181c05a,1
-np.float64,0x1baf582d475e7701,0xc07920bcad4a502c,1
-np.float64,0x461f39cf05a0f15a,0x405126368f984fa1,1
-np.float64,0x7e5f6f5dcfff005b,0x4085a37d610439b4,1
-np.float64,0x136f66e4d09bd662,0xc07ed8a2719f2511,1
-np.float64,0x65afd8983fb6ca1f,0x407a2a7f48bf7fc1,1
-np.float64,0x572fa7f95ed22319,0x40701d706cf82e6f,1
+np.float64,0x02500186f3d9da56,0xc0855b8abf135773,2
+np.float64,0x09200815a3951173,0xc082ff1ad7131bdc,2
+np.float64,0x0da029623b0243d4,0xc0816fc994695bb5,2
+np.float64,0x48703b8ac483a382,0x40579213a313490b,2
+np.float64,0x09207b74c87c9860,0xc082fee20ff349ef,2
+np.float64,0x62c077698e8df947,0x407821c996d110f0,2
+np.float64,0x2350b45e87c3cfb0,0xc073d6b16b51d072,2
+np.float64,0x3990a23f9ff2b623,0xc051aa60eadd8c61,2
+np.float64,0x0d011386a116c348,0xc081a6cc7ea3b8fb,2
+np.float64,0x1fe0f0303ebe273a,0xc0763870b78a81ca,2
+np.float64,0x0cd1260121d387da,0xc081b7668d61a9d1,2
+np.float64,0x1e6135a8f581d422,0xc077425ac10f08c2,2
+np.float64,0x622168db5fe52d30,0x4077b3c669b9fadb,2
+np.float64,0x69f188e1ec6d1718,0x407d1e2f18c63889,2
+np.float64,0x3aa1bf1d9c4dd1a3,0xc04d682e24bde479,2
+np.float64,0x6c81c4011ce4f683,0x407ee5190e8a8e6a,2
+np.float64,0x2191fa55aa5a5095,0xc0750c0c318b5e2d,2
+np.float64,0x32a1f602a32bf360,0xc06270caa493fc17,2
+np.float64,0x16023c90ba93249b,0xc07d0f88e0801638,2
+np.float64,0x1c525fe6d71fa9ff,0xc078af49c66a5d63,2
+np.float64,0x1a927675815d65b7,0xc079e5bdd7fe376e,2
+np.float64,0x41227b8fe70da028,0x402aa0c9f9a84c71,2
+np.float64,0x4962bb6e853fe87d,0x405a34aa04c83747,2
+np.float64,0x23d2cda00b26b5a4,0xc0737c13a06d00ea,2
+np.float64,0x2d13083fd62987fa,0xc06a25055aeb474e,2
+np.float64,0x10e31e4c9b4579a1,0xc0804e181929418e,2
+np.float64,0x26d3247d556a86a9,0xc0716774171da7e8,2
+np.float64,0x6603379398d0d4ac,0x407a64f51f8a887b,2
+np.float64,0x02d38af17d9442ba,0xc0852d955ac9dd68,2
+np.float64,0x6a2382b4818dd967,0x407d4129d688e5d4,2
+np.float64,0x2ee3c403c79b3934,0xc067a091fefaf8b6,2
+np.float64,0x6493a699acdbf1a4,0x4079663c8602bfc5,2
+np.float64,0x1c8413c4f0de3100,0xc0788c99697059b6,2
+np.float64,0x4573f1ed350d9622,0x404e9bd1e4c08920,2
+np.float64,0x2f34265c9200b69c,0xc067310cfea4e986,2
+np.float64,0x19b43e65fa22029b,0xc07a7f8877de22d6,2
+np.float64,0x0af48ab7925ed6bc,0xc0825c4fbc0e5ade,2
+np.float64,0x4fa49699cad82542,0x4065c76d2a318235,2
+np.float64,0x7204a15e56ade492,0x40815bb87484dffb,2
+np.float64,0x4734aa08a230982d,0x40542a4bf7a361a9,2
+np.float64,0x1ae4ed296c2fd749,0xc079ac4921f20abb,2
+np.float64,0x472514ea4370289c,0x4053ff372bd8f18f,2
+np.float64,0x53a54b3f73820430,0x406b5411fc5f2e33,2
+np.float64,0x64754de5a15684fa,0x407951592e99a5ab,2
+np.float64,0x69358e279868a7c3,0x407c9c671a882c31,2
+np.float64,0x284579ec61215945,0xc0706688e55f0927,2
+np.float64,0x68b5c58806447adc,0x407c43d6f4eff760,2
+np.float64,0x1945a83f98b0e65d,0xc07acc15eeb032cc,2
+np.float64,0x0fc5eb98a16578bf,0xc080b0d02eddca0e,2
+np.float64,0x6a75e208f5784250,0x407d7a7383bf8f05,2
+np.float64,0x0fe63a029c47645d,0xc080a59ca1e98866,2
+np.float64,0x37963ac53f065510,0xc057236281f7bdb6,2
+np.float64,0x135661bb07067ff7,0xc07ee924930c21e4,2
+np.float64,0x4b4699469d458422,0x405f73843756e887,2
+np.float64,0x1a66d73e4bf4881b,0xc07a039ba1c63adf,2
+np.float64,0x12a6b9b119a7da59,0xc07f62e49c6431f3,2
+np.float64,0x24c719aa8fd1bdb5,0xc072d26da4bf84d3,2
+np.float64,0x0fa6ff524ffef314,0xc080bb8514662e77,2
+np.float64,0x1db751d66fdd4a9a,0xc077b77cb50d7c92,2
+np.float64,0x4947374c516da82c,0x4059e9acfc7105bf,2
+np.float64,0x1b1771ab98f3afc8,0xc07989326b8e1f66,2
+np.float64,0x25e78805baac8070,0xc0720a818e6ef080,2
+np.float64,0x4bd7a148225d3687,0x406082d004ea3ee7,2
+np.float64,0x53d7d6b2bbbda00a,0x406b9a398967cbd5,2
+np.float64,0x6997fb9f4e1c685f,0x407ce0a703413eba,2
+np.float64,0x069802c2ff71b951,0xc083df39bf7acddc,2
+np.float64,0x4d683ac9890f66d8,0x4062ae21d8c2acf0,2
+np.float64,0x5a2825863ec14f4c,0x40722d718d549552,2
+np.float64,0x0398799a88f4db80,0xc084e93dab8e2158,2
+np.float64,0x5ed87a8b77e135a5,0x40756d7051777b33,2
+np.float64,0x5828cd6d79b9bede,0x4070cafb22fc6ca1,2
+np.float64,0x7b18ba2a5ec6f068,0x408481386b3ed6fe,2
+np.float64,0x4938fd60922198fe,0x4059c206b762ea7e,2
+np.float64,0x31b8f44fcdd1a46e,0xc063b2faa8b6434e,2
+np.float64,0x5729341c0d918464,0x407019cac0c4a7d7,2
+np.float64,0x13595e9228ee878e,0xc07ee7235a7d8088,2
+np.float64,0x17698b0dc9dd4135,0xc07c1627e3a5ad5f,2
+np.float64,0x63b977c283abb0cc,0x4078cf1ec6ed65be,2
+np.float64,0x7349cc0d4dc16943,0x4081cc697ce4cb53,2
+np.float64,0x4e49a80b732fb28d,0x4063e67e3c5cbe90,2
+np.float64,0x07ba14b848a8ae02,0xc0837ac032a094e0,2
+np.float64,0x3da9f17b691bfddc,0xc03929c25366acda,2
+np.float64,0x02ea39aa6c3ac007,0xc08525af6f21e1c4,2
+np.float64,0x3a6a42f04ed9563d,0xc04e98e825dca46b,2
+np.float64,0x1afa877cd7900be7,0xc0799d6648cb34a9,2
+np.float64,0x58ea986649e052c6,0x4071512e939ad790,2
+np.float64,0x691abbc04647f536,0x407c89aaae0fcb83,2
+np.float64,0x43aabc5063e6f284,0x4044b45d18106fd2,2
+np.float64,0x488b003c893e0bea,0x4057df012a2dafbe,2
+np.float64,0x77eb076ed67caee5,0x40836720de94769e,2
+np.float64,0x5c1b46974aba46f4,0x40738731ba256007,2
+np.float64,0x1a5b29ecb5d3c261,0xc07a0becc77040d6,2
+np.float64,0x5d8b6ccf868c6032,0x4074865c1865e2db,2
+np.float64,0x4cfb6690b4aaf5af,0x406216cd8c7e8ddb,2
+np.float64,0x76cbd8eb5c5fc39e,0x4083038dc66d682b,2
+np.float64,0x28bbd1fec5012814,0xc07014c2dd1b9711,2
+np.float64,0x33dc1b3a4fd6bf7a,0xc060bd0756e07d8a,2
+np.float64,0x52bbe89b37de99f3,0x406a10041aa7d343,2
+np.float64,0x07bc479d15eb2dd3,0xc0837a1a6e3a3b61,2
+np.float64,0x18fc5275711a901d,0xc07aff3e9d62bc93,2
+np.float64,0x114c9758e247dc71,0xc080299a7cf15b05,2
+np.float64,0x25ac8f6d60755148,0xc07233c4c0c511d4,2
+np.float64,0x260cae2bb9e9fd7e,0xc071f128c7e82eac,2
+np.float64,0x572ccdfe0241de82,0x40701bedc84bb504,2
+np.float64,0x0ddcef6c8d41f5ee,0xc0815a7e16d07084,2
+np.float64,0x6dad1d59c988af68,0x407fb4a0bc0142b1,2
+np.float64,0x025d200580d8b6d1,0xc08556c0bc32b1b2,2
+np.float64,0x7aad344b6aa74c18,0x40845bbc453f22be,2
+np.float64,0x5b5d9d6ad9d14429,0x4073036d2d21f382,2
+np.float64,0x49cd8d8dcdf19954,0x405b5c034f5c7353,2
+np.float64,0x63edb9483335c1e6,0x4078f2dd21378786,2
+np.float64,0x7b1dd64c9d2c26bd,0x408482b922017bc9,2
+np.float64,0x782e13e0b574be5f,0x40837e2a0090a5ad,2
+np.float64,0x592dfe18b9d6db2f,0x40717f777fbcb1ec,2
+np.float64,0x654e3232ac60d72c,0x4079e71a95a70446,2
+np.float64,0x7b8e42ad22091456,0x4084a9a6f1e61722,2
+np.float64,0x570e88dfd5860ae6,0x407006ae6c0d137a,2
+np.float64,0x294e98346cb98ef1,0xc06f5edaac12bd44,2
+np.float64,0x1adeaa4ab792e642,0xc079b1431d5e2633,2
+np.float64,0x7b6ead3377529ac8,0x40849eabc8c7683c,2
+np.float64,0x2b8eedae8a9b2928,0xc06c400054deef11,2
+np.float64,0x65defb45b2dcf660,0x407a4b53f181c05a,2
+np.float64,0x1baf582d475e7701,0xc07920bcad4a502c,2
+np.float64,0x461f39cf05a0f15a,0x405126368f984fa1,2
+np.float64,0x7e5f6f5dcfff005b,0x4085a37d610439b4,2
+np.float64,0x136f66e4d09bd662,0xc07ed8a2719f2511,2
+np.float64,0x65afd8983fb6ca1f,0x407a2a7f48bf7fc1,2
+np.float64,0x572fa7f95ed22319,0x40701d706cf82e6f,2
diff --git a/numpy/core/tests/examples/cython/checks.pyx b/numpy/core/tests/examples/cython/checks.pyx

index 151979db70436589def4e3a8c8bafedc42dc5341..e41c6d657351628c02b54596f9e05a050b4e5021 100644 (file)
--- a/numpy/core/tests/examples/cython/checks.pyx
+++ b/numpy/core/tests/examples/cython/checks.pyx
@@ -1,3 +1,5 @@
+#cython: language_level=3
+
  """
  Functions in this module give python-space wrappers for cython functions
  exposed in numpy/__init__.pxd, so they can be tested in test_cython.py
diff --git a/numpy/core/tests/test_api.py b/numpy/core/tests/test_api.py

index d3c7211cd1b95059b0603c4e44a11a242a3f5211..b3f3e947d87e56e85a96b568b82f85d4ad5f6408 100644 (file)
--- a/numpy/core/tests/test_api.py
+++ b/numpy/core/tests/test_api.py
@@ -8,9 +8,6 @@ from numpy.testing import (
       HAS_REFCOUNT
      )
  
-# Switch between new behaviour when NPY_RELAXED_STRIDES_CHECKING is set.
-NPY_RELAXED_STRIDES_CHECKING = np.ones((10, 1), order='C').flags.f_contiguous
-
  
  def test_array_array():
      tobj = type(object)
@@ -144,7 +141,7 @@ def test_array_array():
  
  @pytest.mark.parametrize("array", [True, False])
  def test_array_impossible_casts(array):
-    # All builtin types can forst cast as least theoretically
+    # All builtin types can be forcibly cast, at least theoretically,
      # but user dtypes cannot necessarily.
      rt = rational(1, 2)
      if array:
@@ -482,13 +479,6 @@ def test_copy_order():
          assert_equal(x, y)
          assert_equal(res.flags.c_contiguous, ccontig)
          assert_equal(res.flags.f_contiguous, fcontig)
-        # This check is impossible only because
-        # NPY_RELAXED_STRIDES_CHECKING changes the strides actively
-        if not NPY_RELAXED_STRIDES_CHECKING:
-            if strides:
-                assert_equal(x.strides, y.strides)
-            else:
-                assert_(x.strides != y.strides)
  
      # Validate the initial state of a, b, and c
      assert_(a.flags.c_contiguous)
@@ -542,8 +532,7 @@ def test_copy_order():
  
  def test_contiguous_flags():
      a = np.ones((4, 4, 1))[::2,:,:]
-    if NPY_RELAXED_STRIDES_CHECKING:
-        a.strides = a.strides[:2] + (-123,)
+    a.strides = a.strides[:2] + (-123,)
      b = np.ones((2, 2, 1, 2, 2)).swapaxes(3, 4)
  
      def check_contig(a, ccontig, fcontig):
@@ -553,12 +542,8 @@ def test_contiguous_flags():
      # Check if new arrays are correct:
      check_contig(a, False, False)
      check_contig(b, False, False)
-    if NPY_RELAXED_STRIDES_CHECKING:
-        check_contig(np.empty((2, 2, 0, 2, 2)), True, True)
-        check_contig(np.array([[[1], [2]]], order='F'), True, True)
-    else:
-        check_contig(np.empty((2, 2, 0, 2, 2)), True, False)
-        check_contig(np.array([[[1], [2]]], order='F'), False, True)
+    check_contig(np.empty((2, 2, 0, 2, 2)), True, True)
+    check_contig(np.array([[[1], [2]]], order='F'), True, True)
      check_contig(np.empty((2, 2)), True, False)
      check_contig(np.empty((2, 2), order='F'), False, True)
  
@@ -567,18 +552,11 @@ def test_contiguous_flags():
      check_contig(np.array(a, copy=False, order='C'), True, False)
      check_contig(np.array(a, ndmin=4, copy=False, order='F'), False, True)
  
-    if NPY_RELAXED_STRIDES_CHECKING:
-        # Check slicing update of flags and :
-        check_contig(a[0], True, True)
-        check_contig(a[None, ::4, ..., None], True, True)
-        check_contig(b[0, 0, ...], False, True)
-        check_contig(b[:,:, 0:0,:,:], True, True)
-    else:
-        # Check slicing update of flags:
-        check_contig(a[0], True, False)
-        # Would be nice if this was C-Contiguous:
-        check_contig(a[None, 0, ..., None], False, False)
-        check_contig(b[0, 0, 0, ...], False, True)
+    # Check slicing update of flags:
+    check_contig(a[0], True, True)
+    check_contig(a[None, ::4, ..., None], True, True)
+    check_contig(b[0, 0, ...], False, True)
+    check_contig(b[:, :, 0:0, :, :], True, True)
  
      # Test ravel and squeeze.
      check_contig(a.ravel(), True, True)
diff --git a/numpy/core/tests/test_array_coercion.py b/numpy/core/tests/test_array_coercion.py

index 293f5a68f8e6d8480cd086db4d25aca21ce459db..d349f9d023e6b6e0738046d48e2662fac3f288d7 100644 (file)
--- a/numpy/core/tests/test_array_coercion.py
+++ b/numpy/core/tests/test_array_coercion.py
@@ -746,3 +746,22 @@ class TestArrayLikes:
          with pytest.raises(error):
              np.array(BadSequence())
  
+
+class TestSpecialAttributeLookupFailure:
+    # An exception was raised while fetching the attribute
+
+    class WeirdArrayLike:
+        @property
+        def __array__(self):
+            raise RuntimeError("oops!")
+
+    class WeirdArrayInterface:
+        @property
+        def __array_interface__(self):
+            raise RuntimeError("oops!")
+
+    def test_deprecated(self):
+        with pytest.raises(RuntimeError):
+            np.array(self.WeirdArrayLike())
+        with pytest.raises(RuntimeError):
+            np.array(self.WeirdArrayInterface())
diff --git a/numpy/core/tests/test_array_interface.py b/numpy/core/tests/test_array_interface.py

new file mode 100644 (file)

index 0000000..72670ed
--- /dev/null
+++ b/numpy/core/tests/test_array_interface.py
@@ -0,0 +1,216 @@
+import sys
+import pytest
+import numpy as np
+from numpy.testing import extbuild
+
+
+@pytest.fixture
+def get_module(tmp_path):
+    """ Some code to generate data and manage temporary buffers used when
+    sharing with numpy via the array interface protocol.
+    """
+
+    if not sys.platform.startswith('linux'):
+        pytest.skip('link fails on cygwin')
+
+    prologue = '''
+        #include <Python.h>
+        #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
+        #include <numpy/arrayobject.h>
+        #include <stdio.h>
+        #include <math.h>
+
+        NPY_NO_EXPORT
+        void delete_array_struct(PyObject *cap) {
+
+            /* get the array interface structure */
+            PyArrayInterface *inter = (PyArrayInterface*)
+                PyCapsule_GetPointer(cap, NULL);
+
+            /* get the buffer by which data was shared */
+            double *ptr = (double*)PyCapsule_GetContext(cap);
+
+            /* for the purposes of the regression test set the elements
+               to nan */
+            for (npy_intp i = 0; i < inter->shape[0]; ++i)
+                ptr[i] = nan("");
+
+            /* free the shared buffer */
+            free(ptr);
+
+            /* free the array interface structure */
+            free(inter->shape);
+            free(inter);
+
+            fprintf(stderr, "delete_array_struct\\ncap = %ld inter = %ld"
+                " ptr = %ld\\n", (long)cap, (long)inter, (long)ptr);
+        }
+        '''
+
+    functions = [
+        ("new_array_struct", "METH_VARARGS", """
+
+            long long n_elem = 0;
+            double value = 0.0;
+
+            if (!PyArg_ParseTuple(args, "Ld", &n_elem, &value)) {
+                Py_RETURN_NONE;
+            }
+
+            /* allocate and initialize the data to share with numpy */
+            long long n_bytes = n_elem*sizeof(double);
+            double *data = (double*)malloc(n_bytes);
+
+            if (!data) {
+                PyErr_Format(PyExc_MemoryError,
+                    "Failed to malloc %lld bytes", n_bytes);
+
+                Py_RETURN_NONE;
+            }
+
+            for (long long i = 0; i < n_elem; ++i) {
+                data[i] = value;
+            }
+
+            /* calculate the shape and stride */
+            int nd = 1;
+
+            npy_intp *ss = (npy_intp*)malloc(2*nd*sizeof(npy_intp));
+            npy_intp *shape = ss;
+            npy_intp *stride = ss + nd;
+
+            shape[0] = n_elem;
+            stride[0] = sizeof(double);
+
+            /* construct the array interface */
+            PyArrayInterface *inter = (PyArrayInterface*)
+                malloc(sizeof(PyArrayInterface));
+
+            memset(inter, 0, sizeof(PyArrayInterface));
+
+            inter->two = 2;
+            inter->nd = nd;
+            inter->typekind = 'f';
+            inter->itemsize = sizeof(double);
+            inter->shape = shape;
+            inter->strides = stride;
+            inter->data = data;
+            inter->flags = NPY_ARRAY_WRITEABLE | NPY_ARRAY_NOTSWAPPED |
+                           NPY_ARRAY_ALIGNED | NPY_ARRAY_C_CONTIGUOUS;
+
+            /* package into a capsule */
+            PyObject *cap = PyCapsule_New(inter, NULL, delete_array_struct);
+
+            /* save the pointer to the data */
+            PyCapsule_SetContext(cap, data);
+
+            fprintf(stderr, "new_array_struct\\ncap = %ld inter = %ld"
+                " ptr = %ld\\n", (long)cap, (long)inter, (long)data);
+
+            return cap;
+        """)
+        ]
+
+    more_init = "import_array();"
+
+    try:
+        import array_interface_testing
+        return array_interface_testing
+    except ImportError:
+        pass
+
+    # if it does not exist, build and load it
+    return extbuild.build_and_import_extension('array_interface_testing',
+                                               functions,
+                                               prologue=prologue,
+                                               include_dirs=[np.get_include()],
+                                               build_dir=tmp_path,
+                                               more_init=more_init)
+
+
+@pytest.mark.slow
+def test_cstruct(get_module):
+
+    class data_source:
+        """
+        This class is for testing the timing of the PyCapsule destructor
+        invoked when numpy releases its reference to the shared data as part of
+        the numpy array interface protocol. If the PyCapsule destructor is
+        called early the shared data is freed and invalid memory accesses will
+        occur.
+        """
+
+        def __init__(self, size, value):
+            self.size = size
+            self.value = value
+
+        @property
+        def __array_struct__(self):
+            return get_module.new_array_struct(self.size, self.value)
+
+    # write to the same stream as the C code
+    stderr = sys.__stderr__
+
+    # used to validate the shared data.
+    expected_value = -3.1415
+    multiplier = -10000.0
+
+    # create some data to share with numpy via the array interface
+    # assign the data an expected value.
+    stderr.write(' ---- create an object to share data ---- \n')
+    buf = data_source(256, expected_value)
+    stderr.write(' ---- OK!\n\n')
+
+    # share the data
+    stderr.write(' ---- share data via the array interface protocol ---- \n')
+    arr = np.array(buf, copy=False)
+    stderr.write('arr.__array_interface__ = %s\n' % (
+                 str(arr.__array_interface__)))
+    stderr.write('arr.base = %s\n' % (str(arr.base)))
+    stderr.write(' ---- OK!\n\n')
+
+    # release the source of the shared data. this will not release the data
+    # that was shared with numpy, that is done in the PyCapsule destructor.
+    stderr.write(' ---- destroy the object that shared data ---- \n')
+    buf = None
+    stderr.write(' ---- OK!\n\n')
+
+    # check that we got the expected data. If the PyCapsule destructor we
+    # defined was prematurely called then this test will fail because our
+    # destructor sets the elements of the array to NaN before free'ing the
+    # buffer. Reading the values here may also cause a SEGV
+    assert np.allclose(arr, expected_value)
+
+    # read the data. If the PyCapsule destructor we defined was prematurely
+    # called then reading the values here may cause a SEGV and will be reported
+    # as invalid reads by valgrind
+    stderr.write(' ---- read shared data ---- \n')
+    stderr.write('arr = %s\n' % (str(arr)))
+    stderr.write(' ---- OK!\n\n')
+
+    # write to the shared buffer. If the shared data was prematurely deleted
+    # this may cause a SEGV and valgrind will report invalid writes
+    stderr.write(' ---- modify shared data ---- \n')
+    arr *= multiplier
+    expected_value *= multiplier
+    stderr.write('arr.__array_interface__ = %s\n' % (
+                 str(arr.__array_interface__)))
+    stderr.write('arr.base = %s\n' % (str(arr.base)))
+    stderr.write(' ---- OK!\n\n')
+
+    # read the data. If the shared data was prematurely deleted this
+    # may cause a SEGV and valgrind will report invalid reads
+    stderr.write(' ---- read modified shared data ---- \n')
+    stderr.write('arr = %s\n' % (str(arr)))
+    stderr.write(' ---- OK!\n\n')
+
+    # check that we got the expected data. If the PyCapsule destructor we
+    # defined was prematurely called then this test will fail because our
+    # destructor sets the elements of the array to NaN before free'ing the
+    # buffer. Reading the values here may also cause a SEGV
+    assert np.allclose(arr, expected_value)
+
+    # free the shared data, the PyCapsule destructor should run here
+    stderr.write(' ---- free shared data ---- \n')
+    arr = None
+    stderr.write(' ---- OK!\n\n')
diff --git a/numpy/core/tests/test_casting_unittests.py b/numpy/core/tests/test_casting_unittests.py

index cb479209030bd37c0961d09a3f4370cb73779744..5c5ff55b4c070d6bb5d89c1e943b72f79a5c464f 100644 (file)
--- a/numpy/core/tests/test_casting_unittests.py
+++ b/numpy/core/tests/test_casting_unittests.py
@@ -76,7 +76,6 @@ class Casting(enum.IntEnum):
      safe = 2
      same_kind = 3
      unsafe = 4
-    cast_is_view = 1 << 16
  
  
  def _get_cancast_table():
@@ -259,14 +258,14 @@ class TestCasting:
                  del default
  
                  for to_dt in [to_Dt(), to_Dt().newbyteorder()]:
-                    casting, (from_res, to_res) = cast._resolve_descriptors(
-                        (from_dt, to_dt))
+                    casting, (from_res, to_res), view_off = (
+                            cast._resolve_descriptors((from_dt, to_dt)))
                      assert(type(from_res) == from_Dt)
                      assert(type(to_res) == to_Dt)
-                    if casting & Casting.cast_is_view:
+                    if view_off is not None:
                          # If a view is acceptable, this is "no" casting
                          # and byte order must be matching.
-                        assert casting == Casting.no | Casting.cast_is_view
+                        assert casting == Casting.no
                          # The above table lists this as "equivalent"
                          assert Casting.equiv == CAST_TABLE[from_Dt][to_Dt]
                          # Note that to_res may not be the same as from_dt
@@ -299,7 +298,7 @@ class TestCasting:
              to_dt = to_dt.values[0]
              cast = get_castingimpl(type(from_dt), type(to_dt))
  
-            casting, (from_res, to_res) = cast._resolve_descriptors(
+            casting, (from_res, to_res), view_off = cast._resolve_descriptors(
                  (from_dt, to_dt))
  
              if from_res is not from_dt or to_res is not to_dt:
@@ -307,7 +306,7 @@ class TestCasting:
                  # each of which should is tested individually.
                  return
  
-            safe = (casting & ~Casting.cast_is_view) <= Casting.safe
+            safe = casting <= Casting.safe
              del from_res, to_res, casting
  
              arr1, arr2, values = self.get_data(from_dt, to_dt)
@@ -355,14 +354,15 @@ class TestCasting:
          for time_dt in time_dtypes:
              cast = get_castingimpl(type(from_dt), type(time_dt))
  
-            casting, (from_res, to_res) = cast._resolve_descriptors(
+            casting, (from_res, to_res), view_off = cast._resolve_descriptors(
                  (from_dt, time_dt))
  
              assert from_res is from_dt
              assert to_res is time_dt
              del from_res, to_res
  
-            assert(casting & CAST_TABLE[from_Dt][type(time_dt)])
+            assert casting & CAST_TABLE[from_Dt][type(time_dt)]
+            assert view_off is None
  
              int64_dt = np.dtype(np.int64)
              arr1, arr2, values = self.get_data(from_dt, int64_dt)
@@ -391,31 +391,37 @@ class TestCasting:
              assert arr2_o.tobytes() == arr2.tobytes()
  
      @pytest.mark.parametrize(
-            ["from_dt", "to_dt", "expected_casting", "nom", "denom"],
-            [("M8[ns]", None,
-                  Casting.no | Casting.cast_is_view, 1, 1),
-             (str(np.dtype("M8[ns]").newbyteorder()), None, Casting.equiv, 1, 1),
-             ("M8", "M8[ms]", Casting.safe | Casting.cast_is_view, 1, 1),
-             ("M8[ms]", "M8", Casting.unsafe, 1, 1),  # should be invalid cast
-             ("M8[5ms]", "M8[5ms]", Casting.no | Casting.cast_is_view, 1, 1),
-             ("M8[ns]", "M8[ms]", Casting.same_kind, 1, 10**6),
-             ("M8[ms]", "M8[ns]", Casting.safe, 10**6, 1),
-             ("M8[ms]", "M8[7ms]", Casting.same_kind, 1, 7),
-             ("M8[4D]", "M8[1M]", Casting.same_kind, None,
+            ["from_dt", "to_dt", "expected_casting", "expected_view_off",
+             "nom", "denom"],
+            [("M8[ns]", None, Casting.no, 0, 1, 1),
+             (str(np.dtype("M8[ns]").newbyteorder()), None,
+                  Casting.equiv, None, 1, 1),
+             ("M8", "M8[ms]", Casting.safe, 0, 1, 1),
+             # should be invalid cast:
+             ("M8[ms]", "M8", Casting.unsafe, None, 1, 1),
+             ("M8[5ms]", "M8[5ms]", Casting.no, 0, 1, 1),
+             ("M8[ns]", "M8[ms]", Casting.same_kind, None, 1, 10**6),
+             ("M8[ms]", "M8[ns]", Casting.safe, None, 10**6, 1),
+             ("M8[ms]", "M8[7ms]", Casting.same_kind, None, 1, 7),
+             ("M8[4D]", "M8[1M]", Casting.same_kind, None, None,
                    # give full values based on NumPy 1.19.x
                    [-2**63, 0, -1, 1314, -1315, 564442610]),
-             ("m8[ns]", None, Casting.no | Casting.cast_is_view, 1, 1),
-             (str(np.dtype("m8[ns]").newbyteorder()), None, Casting.equiv, 1, 1),
-             ("m8", "m8[ms]", Casting.safe | Casting.cast_is_view, 1, 1),
-             ("m8[ms]", "m8", Casting.unsafe, 1, 1),  # should be invalid cast
-             ("m8[5ms]", "m8[5ms]", Casting.no | Casting.cast_is_view, 1, 1),
-             ("m8[ns]", "m8[ms]", Casting.same_kind, 1, 10**6),
-             ("m8[ms]", "m8[ns]", Casting.safe, 10**6, 1),
-             ("m8[ms]", "m8[7ms]", Casting.same_kind, 1, 7),
-             ("m8[4D]", "m8[1M]", Casting.unsafe, None,
+             ("m8[ns]", None, Casting.no, 0, 1, 1),
+             (str(np.dtype("m8[ns]").newbyteorder()), None,
+                  Casting.equiv, None, 1, 1),
+             ("m8", "m8[ms]", Casting.safe, 0, 1, 1),
+             # should be invalid cast:
+             ("m8[ms]", "m8", Casting.unsafe, None, 1, 1),
+             ("m8[5ms]", "m8[5ms]", Casting.no, 0, 1, 1),
+             ("m8[ns]", "m8[ms]", Casting.same_kind, None, 1, 10**6),
+             ("m8[ms]", "m8[ns]", Casting.safe, None, 10**6, 1),
+             ("m8[ms]", "m8[7ms]", Casting.same_kind, None, 1, 7),
+             ("m8[4D]", "m8[1M]", Casting.unsafe, None, None,
                    # give full values based on NumPy 1.19.x
                    [-2**63, 0, 0, 1314, -1315, 564442610])])
-    def test_time_to_time(self, from_dt, to_dt, expected_casting, nom, denom):
+    def test_time_to_time(self, from_dt, to_dt,
+                          expected_casting, expected_view_off,
+                          nom, denom):
          from_dt = np.dtype(from_dt)
          if to_dt is not None:
              to_dt = np.dtype(to_dt)
@@ -428,10 +434,12 @@ class TestCasting:
  
          DType = type(from_dt)
          cast = get_castingimpl(DType, DType)
-        casting, (from_res, to_res) = cast._resolve_descriptors((from_dt, to_dt))
+        casting, (from_res, to_res), view_off = cast._resolve_descriptors(
+                (from_dt, to_dt))
          assert from_res is from_dt
          assert to_res is to_dt or to_dt is None
          assert casting == expected_casting
+        assert view_off == expected_view_off
  
          if nom is not None:
              expected_out = (values * nom // denom).view(to_res)
@@ -476,9 +484,11 @@ class TestCasting:
          expected_length = get_expected_stringlength(other_dt)
          string_dt = np.dtype(f"{string_char}{expected_length}")
  
-        safety, (res_other_dt, res_dt) = cast._resolve_descriptors((other_dt, None))
+        safety, (res_other_dt, res_dt), view_off = cast._resolve_descriptors(
+                (other_dt, None))
          assert res_dt.itemsize == expected_length * fact
          assert safety == Casting.safe  # we consider to string casts "safe"
+        assert view_off is None
          assert isinstance(res_dt, string_DT)
  
          # These casts currently implement changing the string length, so
@@ -490,19 +500,24 @@ class TestCasting:
                  expected_safety = Casting.same_kind
  
              to_dt = self.string_with_modified_length(string_dt, change_length)
-            safety, (_, res_dt) = cast._resolve_descriptors((other_dt, to_dt))
+            safety, (_, res_dt), view_off = cast._resolve_descriptors(
+                    (other_dt, to_dt))
              assert res_dt is to_dt
              assert safety == expected_safety
+            assert view_off is None
  
          # The opposite direction is always considered unsafe:
          cast = get_castingimpl(string_DT, other_DT)
  
-        safety, _ = cast._resolve_descriptors((string_dt, other_dt))
+        safety, _, view_off = cast._resolve_descriptors((string_dt, other_dt))
          assert safety == Casting.unsafe
+        assert view_off is None
  
          cast = get_castingimpl(string_DT, other_DT)
-        safety, (_, res_dt) = cast._resolve_descriptors((string_dt, None))
+        safety, (_, res_dt), view_off = cast._resolve_descriptors(
+            (string_dt, None))
          assert safety == Casting.unsafe
+        assert view_off is None
          assert other_dt is res_dt  # returns the singleton for simple dtypes
  
      @pytest.mark.parametrize("string_char", ["S", "U"])
@@ -521,7 +536,8 @@ class TestCasting:
  
          cast = get_castingimpl(type(other_dt), string_DT)
          cast_back = get_castingimpl(string_DT, type(other_dt))
-        _, (res_other_dt, string_dt) = cast._resolve_descriptors((other_dt, None))
+        _, (res_other_dt, string_dt), _ = cast._resolve_descriptors(
+                (other_dt, None))
  
          if res_other_dt is not other_dt:
              # do not support non-native byteorder, skip test in that case
@@ -580,13 +596,16 @@ class TestCasting:
          expected_length = other_dt.itemsize // div
          string_dt = np.dtype(f"{string_char}{expected_length}")
  
-        safety, (res_other_dt, res_dt) = cast._resolve_descriptors((other_dt, None))
+        safety, (res_other_dt, res_dt), view_off = cast._resolve_descriptors(
+                (other_dt, None))
          assert res_dt.itemsize == expected_length * fact
          assert isinstance(res_dt, string_DT)
  
+        expected_view_off = None
          if other_dt.char == string_char:
              if other_dt.isnative:
-                expected_safety = Casting.no | Casting.cast_is_view
+                expected_safety = Casting.no
+                expected_view_off = 0
              else:
                  expected_safety = Casting.equiv
          elif string_char == "U":
@@ -594,13 +613,19 @@ class TestCasting:
          else:
              expected_safety = Casting.unsafe
  
+        assert view_off == expected_view_off
          assert expected_safety == safety
  
          for change_length in [-1, 0, 1]:
              to_dt = self.string_with_modified_length(string_dt, change_length)
-            safety, (_, res_dt) = cast._resolve_descriptors((other_dt, to_dt))
+            safety, (_, res_dt), view_off = cast._resolve_descriptors(
+                    (other_dt, to_dt))
  
              assert res_dt is to_dt
+            if change_length <= 0:
+                assert view_off == expected_view_off
+            else:
+                assert view_off is None
              if expected_safety == Casting.unsafe:
                  assert safety == expected_safety
              elif change_length < 0:
@@ -655,12 +680,16 @@ class TestCasting:
          object_dtype = type(np.dtype(object))
          cast = get_castingimpl(object_dtype, type(dtype))
  
-        safety, (_, res_dt) = cast._resolve_descriptors((np.dtype("O"), dtype))
+        safety, (_, res_dt), view_off = cast._resolve_descriptors(
+                (np.dtype("O"), dtype))
          assert safety == Casting.unsafe
+        assert view_off is None
          assert res_dt is dtype
  
-        safety, (_, res_dt) = cast._resolve_descriptors((np.dtype("O"), None))
+        safety, (_, res_dt), view_off = cast._resolve_descriptors(
+                (np.dtype("O"), None))
          assert safety == Casting.unsafe
+        assert view_off is None
          assert res_dt == dtype.newbyteorder("=")
  
      @pytest.mark.parametrize("dtype", simple_dtype_instances())
@@ -669,8 +698,10 @@ class TestCasting:
          object_dtype = type(np.dtype(object))
          cast = get_castingimpl(type(dtype), object_dtype)
  
-        safety, (_, res_dt) = cast._resolve_descriptors((dtype, None))
+        safety, (_, res_dt), view_off = cast._resolve_descriptors(
+                (dtype, None))
          assert safety == Casting.safe
+        assert view_off is None
          assert res_dt is np.dtype("O")
  
      @pytest.mark.parametrize("casting", ["no", "unsafe"])
@@ -681,6 +712,75 @@ class TestCasting:
          assert np.can_cast("V4", dtype, casting=casting) == expected
          assert np.can_cast(dtype, "V4", casting=casting) == expected
  
+    @pytest.mark.parametrize(["to_dt", "expected_off"],
+            [  # Same as `from_dt` but with both fields shifted:
+             (np.dtype({"names": ["a", "b"], "formats": ["i4", "f4"],
+                        "offsets": [0, 4]}), 2),
+             # Additional change of the names
+             (np.dtype({"names": ["b", "a"], "formats": ["i4", "f4"],
+                        "offsets": [0, 4]}), 2),
+             # Incompatible field offset change
+             (np.dtype({"names": ["b", "a"], "formats": ["i4", "f4"],
+                        "offsets": [0, 6]}), None)])
+    def test_structured_field_offsets(self, to_dt, expected_off):
+        # This checks the cast-safety and view offset for swapped and "shifted"
+        # fields which are viewable
+        from_dt = np.dtype({"names": ["a", "b"],
+                            "formats": ["i4", "f4"],
+                            "offsets": [2, 6]})
+        cast = get_castingimpl(type(from_dt), type(to_dt))
+        safety, _, view_off = cast._resolve_descriptors((from_dt, to_dt))
+        if from_dt.names == to_dt.names:
+            assert safety == Casting.equiv
+        else:
+            assert safety == Casting.safe
+        # Shifting the original data pointer by -2 will align both by
+        # effectively adding 2 bytes of spacing before `from_dt`.
+        assert view_off == expected_off
+
+    @pytest.mark.parametrize(("from_dt", "to_dt", "expected_off"), [
+            # Subarray cases:
+            ("i", "(1,1)i", 0),
+            ("(1,1)i", "i", 0),
+            ("(2,1)i", "(2,1)i", 0),
+            # field cases (field to field is tested explicitly also):
+            # Not considered viewable, because a negative offset would allow
+            # a structured dtype to indirectly access invalid memory.
+            ("i", dict(names=["a"], formats=["i"], offsets=[2]), None),
+            (dict(names=["a"], formats=["i"], offsets=[2]), "i", 2),
+            # Currently considered not viewable, due to multiple fields
+            # even though they overlap (maybe we should not allow that?)
+            ("i", dict(names=["a", "b"], formats=["i", "i"], offsets=[2, 2]),
+             None),
+            # different number of fields can't work, should probably just fail
+            # so it never reports "viewable":
+            ("i,i", "i,i,i", None),
+            # Unstructured void cases:
+            ("i4", "V3", 0),  # void smaller or equal
+            ("i4", "V4", 0),  # void smaller or equal
+            ("i4", "V10", None),  # void is larger (no view)
+            ("O", "V4", None),  # currently reject objects for view here.
+            ("O", "V8", None),  # currently reject objects for view here.
+            ("V4", "V3", 0),
+            ("V4", "V4", 0),
+            ("V3", "V4", None),
+            # Note that currently void-to-other cast goes via byte-strings
+            # and is not a "view" based cast like the opposite direction:
+            ("V4", "i4", None),
+            # completely invalid/impossible cast:
+            ("i,i", "i,i,i", None),
+        ])
+    def test_structured_view_offsets_paramteric(
+            self, from_dt, to_dt, expected_off):
+        # TODO: While this test is fairly thorough, right now, it does not
+        # really test some paths that may have nonzero offsets (they don't
+        # really exist).
+        from_dt = np.dtype(from_dt)
+        to_dt = np.dtype(to_dt)
+        cast = get_castingimpl(type(from_dt), type(to_dt))
+        _, _, view_off = cast._resolve_descriptors((from_dt, to_dt))
+        assert view_off == expected_off
+
      @pytest.mark.parametrize("dtype", np.typecodes["All"])
      def test_object_casts_NULL_None_equivalence(self, dtype):
          # None to <other> casts may succeed or fail, but a NULL'ed array must
diff --git a/numpy/core/tests/test_conversion_utils.py b/numpy/core/tests/test_conversion_utils.py

index d8849ee29b0bfa69dddbd951fad4cf28c8101f67..c602eba4bb286f833d081e30b6b8dfabcfe1c1e6 100644
--- a/numpy/core/tests/test_conversion_utils.py
+++ b/numpy/core/tests/test_conversion_utils.py
@@ -2,12 +2,13 @@
  Tests for numpy/core/src/multiarray/conversion_utils.c
  """
  import re
+import sys
  
  import pytest
  
  import numpy as np
  import numpy.core._multiarray_tests as mt
-from numpy.testing import assert_warns
+from numpy.testing import assert_warns, IS_PYPY
  
  
  class StringConverterTestCase:
@@ -189,6 +190,8 @@ class TestIntpConverter:
          with pytest.warns(DeprecationWarning):
              assert self.conv(None) == ()
  
+    @pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+            reason="PyPy bug in error formatting")
      def test_float(self):
          with pytest.raises(TypeError):
              self.conv(1.0)
diff --git a/numpy/core/tests/test_cpu_features.py b/numpy/core/tests/test_cpu_features.py

index 2ccbff41ca63ac8e07045591b4a84939648cf304..1a76897e206c428be8a9adcea477e0ca1b499b92 100644
--- a/numpy/core/tests/test_cpu_features.py
+++ b/numpy/core/tests/test_cpu_features.py
@@ -140,12 +140,23 @@ class Test_X86_Features(AbstractTest):
  is_power = re.match("^(powerpc|ppc)64", machine, re.IGNORECASE)
  @pytest.mark.skipif(not is_linux or not is_power, reason="Only for Linux and Power")
  class Test_POWER_Features(AbstractTest):
-    features = ["VSX", "VSX2", "VSX3"]
-    features_map = dict(VSX2="ARCH_2_07", VSX3="ARCH_3_00")
+    features = ["VSX", "VSX2", "VSX3", "VSX4"]
+    features_map = dict(VSX2="ARCH_2_07", VSX3="ARCH_3_00", VSX4="ARCH_3_1")
  
      def load_flags(self):
          self.load_flags_auxv()
  
+
+is_zarch = re.match("^(s390x)", machine, re.IGNORECASE)
+@pytest.mark.skipif(not is_linux or not is_zarch,
+                    reason="Only for Linux and IBM Z")
+class Test_ZARCH_Features(AbstractTest):
+    features = ["VX", "VXE", "VXE2"]
+
+    def load_flags(self):
+        self.load_flags_auxv()
+
+
  is_arm = re.match("^(arm|aarch64)", machine, re.IGNORECASE)
  @pytest.mark.skipif(not is_linux or not is_arm, reason="Only for Linux and ARM")
  class Test_ARM_Features(AbstractTest):
diff --git a/numpy/core/tests/test_cython.py b/numpy/core/tests/test_cython.py

index 9896de0ec29f6845de5b5ef7515cecd975ca0fdd..a31d9460e6ed55990111ddd631ec949e29f86ba0 100644
--- a/numpy/core/tests/test_cython.py
+++ b/numpy/core/tests/test_cython.py
@@ -13,14 +13,14 @@ try:
  except ImportError:
      cython = None
  else:
-    from distutils.version import LooseVersion
+    from numpy.compat import _pep440
  
-    # Cython 0.29.21 is required for Python 3.9 and there are
+    # Cython 0.29.30 is required for Python 3.11 and there are
      # other fixes in the 0.29 series that are needed even for earlier
      # Python versions.
      # Note: keep in sync with the one in pyproject.toml
-    required_version = LooseVersion("0.29.21")
-    if LooseVersion(cython_version) < required_version:
+    required_version = "0.29.30"
+    if _pep440.parse(cython_version) < _pep440.Version(required_version):
          # too old or wrong cython, skip the test
          cython = None
  
@@ -40,7 +40,7 @@ def install_temp(request, tmp_path):
      # build the examples and "install" them into a temporary directory
  
      install_log = str(tmp_path / "tmp_install_log.txt")
-    subprocess.check_call(
+    subprocess.check_output(
          [
              sys.executable,
              "setup.py",
diff --git a/numpy/core/tests/test_datetime.py b/numpy/core/tests/test_datetime.py

index 5294c7b8d6d725e70aa96409661a46942a4301f6..baae77a35494a0fd41dd96746ccecd364f3d2223 100644
--- a/numpy/core/tests/test_datetime.py
+++ b/numpy/core/tests/test_datetime.py
@@ -1437,7 +1437,7 @@ class TestDateTime:
  
          # NaTs
          with suppress_warnings() as sup:
-            sup.filter(RuntimeWarning,  r".*encountered in true\_divide")
+            sup.filter(RuntimeWarning,  r".*encountered in divide")
              nat = np.timedelta64('NaT')
              for tp in (int, float):
                  assert_equal(np.timedelta64(1) / tp(0), nat)
diff --git a/numpy/core/tests/test_deprecations.py b/numpy/core/tests/test_deprecations.py

index 94583a5ee04f1504588e0208e58867cc20d50811..2b7864433291fad82371b4c3d32457bdb48db109 100644
--- a/numpy/core/tests/test_deprecations.py
+++ b/numpy/core/tests/test_deprecations.py
@@ -13,7 +13,8 @@ import sys
  
  import numpy as np
  from numpy.testing import (
-    assert_raises, assert_warns, assert_, assert_array_equal, SkipTest, KnownFailureException
+    assert_raises, assert_warns, assert_, assert_array_equal, SkipTest,
+    KnownFailureException, break_cycles,
      )
  
  from numpy.core._multiarray_tests import fromstring_null_term_c_api
@@ -137,22 +138,6 @@ class _VisibleDeprecationTestCase(_DeprecationTestCase):
      warning_cls = np.VisibleDeprecationWarning
  
  
-class TestNonTupleNDIndexDeprecation:
-    def test_basic(self):
-        a = np.zeros((5, 5))
-        with warnings.catch_warnings():
-            warnings.filterwarnings('always')
-            assert_warns(FutureWarning, a.__getitem__, [[0, 1], [0, 1]])
-            assert_warns(FutureWarning, a.__getitem__, [slice(None)])
-
-            warnings.filterwarnings('error')
-            assert_raises(FutureWarning, a.__getitem__, [[0, 1], [0, 1]])
-            assert_raises(FutureWarning, a.__getitem__, [slice(None)])
-
-            # a a[[0, 1]] always was advanced indexing, so no error/warning
-            a[[0, 1]]
-
-
  class TestComparisonDeprecations(_DeprecationTestCase):
      """This tests the deprecation, for non-element-wise comparison logic.
      This used to mean that when an error occurred during element-wise comparison
@@ -200,14 +185,6 @@ class TestComparisonDeprecations(_DeprecationTestCase):
          self.assert_deprecated(lambda: np.arange(2) == NotArray())
          self.assert_deprecated(lambda: np.arange(2) != NotArray())
  
-        struct1 = np.zeros(2, dtype="i4,i4")
-        struct2 = np.zeros(2, dtype="i4,i4,i4")
-
-        assert_warns(FutureWarning, lambda: struct1 == 1)
-        assert_warns(FutureWarning, lambda: struct1 == struct2)
-        assert_warns(FutureWarning, lambda: struct1 != 1)
-        assert_warns(FutureWarning, lambda: struct1 != struct2)
-
      def test_array_richcompare_legacy_weirdness(self):
          # It doesn't really work to use assert_deprecated here, b/c part of
          # the point of assert_deprecated is to check that when warnings are
@@ -256,20 +233,6 @@ class TestDatetime64Timezone(_DeprecationTestCase):
          self.assert_deprecated(np.datetime64, args=(dt,))
  
  
-class TestNonCContiguousViewDeprecation(_DeprecationTestCase):
-    """View of non-C-contiguous arrays deprecated in 1.11.0.
-
-    The deprecation will not be raised for arrays that are both C and F
-    contiguous, as C contiguous is dominant. There are more such arrays
-    with relaxed stride checking than without so the deprecation is not
-    as visible with relaxed stride checking in force.
-    """
-
-    def test_fortran_contiguous(self):
-        self.assert_deprecated(np.ones((2,2)).T.view, args=(complex,))
-        self.assert_deprecated(np.ones((2,2)).T.view, args=(np.int8,))
-
-
  class TestArrayDataAttributeAssignmentDeprecation(_DeprecationTestCase):
      """Assigning the 'data' attribute of an ndarray is unsafe as pointed
       out in gh-7093. Eventually, such assignment should NOT be allowed, but
@@ -379,18 +342,6 @@ class TestPyArray_AS2D(_DeprecationTestCase):
          assert_raises(NotImplementedError, npy_pyarrayas2d_deprecation)
  
  
-class Test_UPDATEIFCOPY(_DeprecationTestCase):
-    """
-    v1.14 deprecates creating an array with the UPDATEIFCOPY flag, use
-    WRITEBACKIFCOPY instead
-    """
-    def test_npy_updateifcopy_deprecation(self):
-        from numpy.core._multiarray_tests import npy_updateifcopy_deprecation
-        arr = np.arange(9).reshape(3, 3)
-        v = arr.T
-        self.assert_deprecated(npy_updateifcopy_deprecation, args=(v,))
-
-
  class TestDatetimeEvent(_DeprecationTestCase):
      # 2017-08-11, 1.14.0
      def test_3_tuple(self):
@@ -426,11 +377,6 @@ class TestBincount(_DeprecationTestCase):
          self.assert_deprecated(lambda: np.bincount([1, 2, 3], minlength=None))
  
  
-class TestAlen(_DeprecationTestCase):
-    # 2019-08-02, 1.18.0
-    def test_alen(self):
-        self.assert_deprecated(lambda: np.alen(np.array([1, 2, 3])))
-
  
  class TestGeneratorSum(_DeprecationTestCase):
      # 2018-02-25, 1.15.0
@@ -667,9 +613,6 @@ class TestNonExactMatchDeprecation(_DeprecationTestCase):
  
  class TestDeprecatedGlobals(_DeprecationTestCase):
      # 2020-06-06
-    @pytest.mark.skipif(
-        sys.version_info < (3, 7),
-        reason='module-level __getattr__ not supported')
      def test_type_aliases(self):
          # from builtins
          self.assert_deprecated(lambda: np.bool(True))
@@ -1124,24 +1067,6 @@ class TestComparisonBadObjectDType(_DeprecationTestCase):
                  lambda: np.equal(1, 1, sig=(None, None, object)))
  
  
-class TestSpecialAttributeLookupFailure(_DeprecationTestCase):
-    message = r"An exception was ignored while fetching the attribute"
-
-    class WeirdArrayLike:
-        @property
-        def __array__(self):
-            raise RuntimeError("oops!")
-
-    class WeirdArrayInterface:
-        @property
-        def __array_interface__(self):
-            raise RuntimeError("oops!")
-
-    def test_deprecated(self):
-        self.assert_deprecated(lambda: np.array(self.WeirdArrayLike()))
-        self.assert_deprecated(lambda: np.array(self.WeirdArrayInterface()))
-
-
  class TestCtypesGetter(_DeprecationTestCase):
      # Deprecated 2021-05-18, Numpy 1.21.0
      warning_cls = DeprecationWarning
@@ -1250,3 +1175,66 @@ class TestQuantileInterpolationDeprecation(_DeprecationTestCase):
              warnings.simplefilter("always", DeprecationWarning)
              with pytest.raises(TypeError):
                  func([0., 1.], 0., interpolation="nearest", method="nearest")
+
+
+class TestMemEventHook(_DeprecationTestCase):
+    # Deprecated 2021-11-18, NumPy 1.23
+    def test_mem_seteventhook(self):
+        # The actual tests are within the C code in
+        # multiarray/_multiarray_tests.c.src
+        import numpy.core._multiarray_tests as ma_tests
+        with pytest.warns(DeprecationWarning,
+                          match='PyDataMem_SetEventHook is deprecated'):
+            ma_tests.test_pydatamem_seteventhook_start()
+        # force an allocation and free of a numpy array
+        # needs to be larger then limit of small memory cacher in ctors.c
+        a = np.zeros(1000)
+        del a
+        break_cycles()
+        with pytest.warns(DeprecationWarning,
+                          match='PyDataMem_SetEventHook is deprecated'):
+            ma_tests.test_pydatamem_seteventhook_end()
+
+
+class TestArrayFinalizeNone(_DeprecationTestCase):
+    message = "Setting __array_finalize__ = None"
+
+    def test_use_none_is_deprecated(self):
+        # Deprecated way that ndarray itself showed nothing needs finalizing.
+        class NoFinalize(np.ndarray):
+            __array_finalize__ = None
+
+        self.assert_deprecated(lambda: np.array(1).view(NoFinalize))
+
+class TestAxisNotMAXDIMS(_DeprecationTestCase):
+    # Deprecated 2022-01-08, NumPy 1.23
+    message = r"Using `axis=32` \(MAXDIMS\) is deprecated"
+
+    def test_deprecated(self):
+        a = np.zeros((1,)*32)
+        self.assert_deprecated(lambda: np.repeat(a, 1, axis=np.MAXDIMS))
+
+
+class TestLoadtxtParseIntsViaFloat(_DeprecationTestCase):
+    # Deprecated 2022-07-03, NumPy 1.23
+    # This test can be removed without replacement after the deprecation.
+    # The tests:
+    #   * numpy/lib/tests/test_loadtxt.py::test_integer_signs
+    #   * lib/tests/test_loadtxt.py::test_implicit_cast_float_to_int_fails
+    # Have a warning filter that needs to be removed.
+    message = r"loadtxt\(\): Parsing an integer via a float is deprecated.*"
+
+    @pytest.mark.parametrize("dtype", np.typecodes["AllInteger"])
+    def test_deprecated_warning(self, dtype):
+        with pytest.warns(DeprecationWarning, match=self.message):
+            np.loadtxt(["10.5"], dtype=dtype)
+
+    @pytest.mark.parametrize("dtype", np.typecodes["AllInteger"])
+    def test_deprecated_raised(self, dtype):
+        # The DeprecationWarning is chained when raised, so test manually:
+        with warnings.catch_warnings():
+            warnings.simplefilter("error", DeprecationWarning)
+            try:
+                np.loadtxt(["10.5"], dtype=dtype)
+            except ValueError as e:
+                assert isinstance(e.__cause__, DeprecationWarning)
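The `test_deprecated_raised` case above inspects `__cause__` by hand because the warning, once turned into an error, is wrapped in a `ValueError`. A minimal stdlib-only sketch of that pattern follows. `parse_int` and `cause_of_failure` are hypothetical stand-ins invented for illustration; they are not NumPy functions and do not reproduce how `np.loadtxt` is implemented.

```python
import warnings


def parse_int(text):
    """Hypothetical parser: accepts '10.5' but warns that this is deprecated."""
    try:
        if "." in text:
            warnings.warn("parsing an integer via a float is deprecated",
                          DeprecationWarning, stacklevel=2)
            return int(float(text))
        return int(text)
    except Exception as exc:
        # Any failure, including a warning escalated to an error, is chained.
        raise ValueError(f"could not parse {text!r}") from exc


def cause_of_failure(text):
    """Return the chained cause when DeprecationWarning is made an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            parse_int(text)
        except ValueError as e:
            return e.__cause__
    return None


print(type(cause_of_failure("10.5")).__name__)
```

Checking `__cause__` directly is needed here because `pytest.warns` never sees a warning that has been raised as an exception and then wrapped.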
diff --git a/numpy/core/tests/test_dlpack.py b/numpy/core/tests/test_dlpack.py

index eb9a1765423c7d5b29d3064584c433b809925c3b..717210b5423af358b228b5a59c2c8dbf54970def 100644
--- a/numpy/core/tests/test_dlpack.py
+++ b/numpy/core/tests/test_dlpack.py
@@ -27,12 +27,12 @@ class TestDLPack:
          z = y['int']
  
          with pytest.raises(RuntimeError):
-            np._from_dlpack(z)
+            np.from_dlpack(z)
  
      @pytest.mark.skipif(IS_PYPY, reason="PyPy can't get refcounts.")
      def test_from_dlpack_refcount(self):
          x = np.arange(5)
-        y = np._from_dlpack(x)
+        y = np.from_dlpack(x)
          assert sys.getrefcount(x) == 3
          del y
          assert sys.getrefcount(x) == 2
@@ -45,7 +45,7 @@ class TestDLPack:
      ])
      def test_dtype_passthrough(self, dtype):
          x = np.arange(5, dtype=dtype)
-        y = np._from_dlpack(x)
+        y = np.from_dlpack(x)
  
          assert y.dtype == x.dtype
          assert_array_equal(x, y)
@@ -54,44 +54,44 @@ class TestDLPack:
          x = np.asarray(np.datetime64('2021-05-27'))
  
          with pytest.raises(TypeError):
-            np._from_dlpack(x)
+            np.from_dlpack(x)
  
      def test_invalid_byte_swapping(self):
          dt = np.dtype('=i8').newbyteorder()
          x = np.arange(5, dtype=dt)
  
          with pytest.raises(TypeError):
-            np._from_dlpack(x)
+            np.from_dlpack(x)
  
      def test_non_contiguous(self):
          x = np.arange(25).reshape((5, 5))
  
          y1 = x[0]
-        assert_array_equal(y1, np._from_dlpack(y1))
+        assert_array_equal(y1, np.from_dlpack(y1))
  
          y2 = x[:, 0]
-        assert_array_equal(y2, np._from_dlpack(y2))
+        assert_array_equal(y2, np.from_dlpack(y2))
  
          y3 = x[1, :]
-        assert_array_equal(y3, np._from_dlpack(y3))
+        assert_array_equal(y3, np.from_dlpack(y3))
  
          y4 = x[1]
-        assert_array_equal(y4, np._from_dlpack(y4))
+        assert_array_equal(y4, np.from_dlpack(y4))
  
          y5 = np.diagonal(x).copy()
-        assert_array_equal(y5, np._from_dlpack(y5))
+        assert_array_equal(y5, np.from_dlpack(y5))
  
      @pytest.mark.parametrize("ndim", range(33))
      def test_higher_dims(self, ndim):
          shape = (1,) * ndim
          x = np.zeros(shape, dtype=np.float64)
  
-        assert shape == np._from_dlpack(x).shape
+        assert shape == np.from_dlpack(x).shape
  
      def test_dlpack_device(self):
          x = np.arange(5)
          assert x.__dlpack_device__() == (1, 0)
-        y = np._from_dlpack(x)
+        y = np.from_dlpack(x)
          assert y.__dlpack_device__() == (1, 0)
          z = y[::2]
          assert z.__dlpack_device__() == (1, 0)
@@ -100,7 +100,7 @@ class TestDLPack:
          x = np.arange(5)
          _ = x.__dlpack__()
          raise RuntimeError
-
+    
      def test_dlpack_destructor_exception(self):
          with pytest.raises(RuntimeError):
              self.dlpack_deleter_exception()
@@ -113,11 +113,11 @@ class TestDLPack:
  
      def test_ndim0(self):
          x = np.array(1.0)
-        y = np._from_dlpack(x)
+        y = np.from_dlpack(x)
          assert_array_equal(x, y)
  
      def test_size1dims_arrays(self):
          x = np.ndarray(dtype='f8', shape=(10, 5, 1), strides=(8, 80, 4),
                         buffer=np.ones(1000, dtype=np.uint8), order='F')
-        y = np._from_dlpack(x)
+        y = np.from_dlpack(x)
          assert_array_equal(x, y)
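The hunks above only rename the private `np._from_dlpack` to the public `np.from_dlpack`. A short round-trip sketch of the public name, assuming NumPy 1.23 or newer is installed (the variable names are illustrative only):

```python
import numpy as np

x = np.arange(5)
y = np.from_dlpack(x)  # consumes x.__dlpack__() and wraps the same buffer

same_values = bool((x == y).all())
same_dtype = y.dtype == x.dtype
shares = np.shares_memory(x, y)  # the exchange is expected to be zero-copy
```

The refcount test in the diff relies on the same property: `y` keeps `x` alive instead of copying its data.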
diff --git a/numpy/core/tests/test_dtype.py b/numpy/core/tests/test_dtype.py

index e49604e4db7af0b6701592b6cac0f15a142b0042..60b554d922032c3d5789ae8c250094ff394f6e3b 100644
--- a/numpy/core/tests/test_dtype.py
+++ b/numpy/core/tests/test_dtype.py
@@ -14,6 +14,11 @@ from numpy.testing import (
      IS_PYSTON)
  from numpy.compat import pickle
  from itertools import permutations
+import random
+
+import hypothesis
+from hypothesis.extra import numpy as hynp
+
  
  
  def assert_dtype_equal(a, b):
@@ -175,11 +180,11 @@ class TestBuiltin:
                        'formats': ['i4', 'f4'],
                        'offsets': [0, 4]})
          y = np.dtype({'names': ['B', 'A'],
-                      'formats': ['f4', 'i4'],
+                      'formats': ['i4', 'f4'],
                        'offsets': [4, 0]})
          assert_equal(x == y, False)
-        # But it is currently an equivalent cast:
-        assert np.can_cast(x, y, casting="equiv")
+        # This is a safe cast (not equiv) due to the different names:
+        assert np.can_cast(x, y, casting="safe")
  
  
  class TestRecord:
@@ -1059,6 +1064,139 @@ class TestDtypeAttributes:
              pass
          assert_equal(np.dtype(user_def_subcls).name, 'user_def_subcls')
  
+    def test_zero_stride(self):
+        arr = np.ones(1, dtype="i8")
+        arr = np.broadcast_to(arr, 10)
+        assert arr.strides == (0,)
+        with pytest.raises(ValueError):
+            arr.dtype = "i1"
+
+class TestDTypeMakeCanonical:
+    def check_canonical(self, dtype, canonical):
+        """
+        Check most properties relevant to "canonical" versions of a dtype,
+        which is mainly native byte order for datatypes supporting this.
+
+        The main work is checking structured dtypes with fields, where we
+        reproduce most of the actual logic used in the C-code.
+        """
+        assert type(dtype) is type(canonical)
+
+        # a canonical DType should always have equivalent casting (both ways)
+        assert np.can_cast(dtype, canonical, casting="equiv")
+        assert np.can_cast(canonical, dtype, casting="equiv")
+        # a canonical dtype (and its fields) is always native (checks fields):
+        assert canonical.isnative
+
+        # Check that canonical of canonical is the same (no casting):
+        assert np.result_type(canonical) == canonical
+
+        if not dtype.names:
+            # The flags currently never change for unstructured dtypes
+            assert dtype.flags == canonical.flags
+            return
+
+        # Must have the "needs API" flag set:
+        assert dtype.flags & 0b10000
+
+        # Check that the fields are identical (including titles):
+        assert dtype.fields.keys() == canonical.fields.keys()
+
+        def aligned_offset(offset, alignment):
+            # round up offset:
+            return - (-offset // alignment) * alignment
+
+        totalsize = 0
+        max_alignment = 1
+        for name in dtype.names:
+            # each field is also canonical:
+            new_field_descr = canonical.fields[name][0]
+            self.check_canonical(dtype.fields[name][0], new_field_descr)
+
+            # Must have the "inherited" object related flags:
+            expected = 0b11011 & new_field_descr.flags
+            assert (canonical.flags & expected) == expected
+
+            if canonical.isalignedstruct:
+                totalsize = aligned_offset(totalsize, new_field_descr.alignment)
+                max_alignment = max(new_field_descr.alignment, max_alignment)
+
+            assert canonical.fields[name][1] == totalsize
+            # if a title exists, they must match (otherwise empty tuple):
+            assert dtype.fields[name][2:] == canonical.fields[name][2:]
+
+            totalsize += new_field_descr.itemsize
+
+        if canonical.isalignedstruct:
+            totalsize = aligned_offset(totalsize, max_alignment)
+        assert canonical.itemsize == totalsize
+        assert canonical.alignment == max_alignment
+
+    def test_simple(self):
+        dt = np.dtype(">i4")
+        assert np.result_type(dt).isnative
+        assert np.result_type(dt).num == dt.num
+
+        # dtype with empty space:
+        struct_dt = np.dtype(">i4,<i1,i8,V3")[["f0", "f2"]]
+        canonical = np.result_type(struct_dt)
+        assert canonical.itemsize == 4+8
+        assert canonical.isnative
+
+        # aligned struct dtype with empty space:
+        struct_dt = np.dtype(">i1,<i4,i8,V3", align=True)[["f0", "f2"]]
+        canonical = np.result_type(struct_dt)
+        assert canonical.isalignedstruct
+        assert canonical.itemsize == np.dtype("i8").alignment + 8
+        assert canonical.isnative
+
+    def test_object_flag_not_inherited(self):
+        # The following dtype still indicates "object", because it's included
+        # in the inaccessible space (maybe this could change at some point):
+        arr = np.ones(3, "i,O,i")[["f0", "f2"]]
+        assert arr.dtype.hasobject
+        canonical_dt = np.result_type(arr.dtype)
+        assert not canonical_dt.hasobject
+
+    @pytest.mark.slow
+    @hypothesis.given(dtype=hynp.nested_dtypes())
+    def test_make_canonical_hypothesis(self, dtype):
+        canonical = np.result_type(dtype)
+        self.check_canonical(dtype, canonical)
+        # result_type with two arguments should always give identical results:
+        two_arg_result = np.result_type(dtype, dtype)
+        assert np.can_cast(two_arg_result, canonical, casting="no")
+
+    @pytest.mark.slow
+    @hypothesis.given(
+            dtype=hypothesis.extra.numpy.array_dtypes(
+                subtype_strategy=hypothesis.extra.numpy.array_dtypes(),
+                min_size=5, max_size=10, allow_subarrays=True))
+    def test_structured(self, dtype):
+        # Pick 4 of the fields at random.  This will leave empty space in the
+        # dtype (since we do not canonicalize it here).
+        field_subset = random.sample(dtype.names, k=4)
+        dtype_with_empty_space = dtype[field_subset]
+        assert dtype_with_empty_space.itemsize == dtype.itemsize
+        canonicalized = np.result_type(dtype_with_empty_space)
+        self.check_canonical(dtype_with_empty_space, canonicalized)
+        # promotion with two arguments should always give identical results:
+        two_arg_result = np.promote_types(
+                dtype_with_empty_space, dtype_with_empty_space)
+        assert np.can_cast(two_arg_result, canonicalized, casting="no")
+
+        # Ensure that we also check aligned struct (check the opposite, in
+        # case hypothesis grows support for `align`).  Then repeat the test:
+        dtype_aligned = np.dtype(dtype.descr, align=not dtype.isalignedstruct)
+        dtype_with_empty_space = dtype_aligned[field_subset]
+        assert dtype_with_empty_space.itemsize == dtype_aligned.itemsize
+        canonicalized = np.result_type(dtype_with_empty_space)
+        self.check_canonical(dtype_with_empty_space, canonicalized)
+        # promotion with two arguments should always give identical results:
+        two_arg_result = np.promote_types(
+            dtype_with_empty_space, dtype_with_empty_space)
+        assert np.can_cast(two_arg_result, canonicalized, casting="no")
+
  
  class TestPickling:
  
@@ -1197,6 +1335,16 @@ class TestPromotion:
                  match=r".* no common DType exists for the given inputs"):
              np.result_type(1j, rational(1, 2))
  
+    @pytest.mark.parametrize("val", [2, 2**32, 2**63, 2**64, 2*100])
+    def test_python_integer_promotion(self, val):
+        # If we only pass scalars (mainly python ones!), the result must take
+        # into account that the integer may be considered int32, int64, uint64,
+        # or object depending on the input value.  So test those paths!
+        expected_dtype = np.result_type(np.array(val).dtype, np.array(0).dtype)
+        assert np.result_type(val, 0) == expected_dtype
+        # For completeness' sake, also check with a NumPy scalar as second arg:
+        assert np.result_type(val, np.int8(0)) == expected_dtype
+
      @pytest.mark.parametrize(["other", "expected"],
              [(1, rational), (1., np.float64)])
      def test_float_int_pyscalar_promote_rational(self, other, expected):
@@ -1260,6 +1408,11 @@ def test_keyword_argument():
      assert np.dtype(dtype=np.float64) == np.dtype(np.float64)
  
  
+def test_ulong_dtype():
+    # test for gh-21063
+    assert np.dtype("ulong") == np.dtype(np.uint)
+
+
  class TestFromDTypeAttribute:
      def test_simple(self):
          class dt:
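The `aligned_offset` helper inside `check_canonical` above rounds an offset up to the next multiple of the alignment by using floor division on the negated value. The sketch below repeats that helper and lays out two fields by hand. The field choice (`i1` followed by `i8`) is an assumption made for illustration, and the result is cross-checked against NumPy's own aligned struct rather than hard-coded, since alignments are platform dependent.

```python
import numpy as np


def aligned_offset(offset, alignment):
    # Round `offset` up to the next multiple of `alignment`.
    return -(-offset // alignment) * alignment


# Lay out an int8 followed by an int64 the way an aligned struct would.
fields = [np.dtype("i1"), np.dtype("i8")]
offsets = []
total = 0
for field in fields:
    total = aligned_offset(total, field.alignment)
    offsets.append(total)
    total += field.itemsize
# Pad the struct so its size is a multiple of the largest field alignment.
total = aligned_offset(total, max(f.alignment for f in fields))

reference = np.dtype("i1,i8", align=True)
reference_offsets = [reference.fields[name][1] for name in reference.names]
```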
diff --git a/numpy/core/tests/test_einsum.py b/numpy/core/tests/test_einsum.py

index 172311624c277a581ee54146730d2139157e7fd3..0ef1b714b3a97fbc327221516666be5693771719 100644
--- a/numpy/core/tests/test_einsum.py
+++ b/numpy/core/tests/test_einsum.py
@@ -1019,12 +1019,10 @@ class TestEinsumPath:
          # Long test 2
          long_test2 = self.build_operands('chd,bde,agbc,hiad,bdi,cgh,agdb')
          path, path_str = np.einsum_path(*long_test2, optimize='greedy')
-        print(path)
          self.assert_path_equal(path, ['einsum_path',
                                        (3, 4), (0, 3), (3, 4), (1, 3), (1, 2), (0, 1)])
  
          path, path_str = np.einsum_path(*long_test2, optimize='optimal')
-        print(path)
          self.assert_path_equal(path, ['einsum_path',
                                        (0, 5), (1, 4), (3, 4), (1, 3), (1, 2), (0, 1)])
  
@@ -1091,6 +1089,32 @@ class TestEinsumPath:
          opt = np.einsum(*path_test, optimize=exp_path)
          assert_almost_equal(noopt, opt)
  
+    def test_path_type_input_internal_trace(self):
+        #gh-20962
+        path_test = self.build_operands('cab,cdd->ab')
+        exp_path = ['einsum_path', (1,), (0, 1)]
+
+        path, path_str = np.einsum_path(*path_test, optimize=exp_path)
+        self.assert_path_equal(path, exp_path)
+
+        # Double check einsum works on the input path
+        noopt = np.einsum(*path_test, optimize=False)
+        opt = np.einsum(*path_test, optimize=exp_path)
+        assert_almost_equal(noopt, opt)
+
+    def test_path_type_input_invalid(self):
+        path_test = self.build_operands('ab,bc,cd,de->ae')
+        exp_path = ['einsum_path', (2, 3), (0, 1)]
+        assert_raises(RuntimeError, np.einsum, *path_test, optimize=exp_path)
+        assert_raises(
+            RuntimeError, np.einsum_path, *path_test, optimize=exp_path)
+
+        path_test = self.build_operands('a,a,a->a')
+        exp_path = ['einsum_path', (1,), (0, 1)]
+        assert_raises(RuntimeError, np.einsum, *path_test, optimize=exp_path)
+        assert_raises(
+            RuntimeError, np.einsum_path, *path_test, optimize=exp_path)
+
      def test_spaces(self):
          #gh-10794
          arr = np.array([[1]])
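The new einsum tests above pass an explicit contraction path as the `optimize` argument. A small sketch of what a valid explicit path looks like for four operands; the operand shapes and the random seed are arbitrary choices made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c, d = (rng.random((3, 3)) for _ in range(4))
subscripts = "ab,bc,cd,de->ae"

# Each tuple names the operands (by current position) to contract next;
# contracted operands are removed and the result is appended to the list.
# Three pairwise contractions reduce four operands to the single output.
explicit_path = ["einsum_path", (0, 1), (0, 1), (0, 1)]

path, report = np.einsum_path(subscripts, a, b, c, d, optimize=explicit_path)
unoptimized = np.einsum(subscripts, a, b, c, d, optimize=False)
optimized = np.einsum(subscripts, a, b, c, d, optimize=explicit_path)
```

The path `(2, 3), (0, 1)` rejected in `test_path_type_input_invalid` differs in that it stops with two operands still uncontracted.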
diff --git a/numpy/core/tests/test_indexing.py b/numpy/core/tests/test_indexing.py

index 1c22538567d33ba407c3f6feaf21b04623be8635..efcb92c2e6d15d0e0beb3d827be70e5bf34c4a5c 100644
--- a/numpy/core/tests/test_indexing.py
+++ b/numpy/core/tests/test_indexing.py
@@ -587,6 +587,12 @@ class TestIndexing:
  
          assert arr.dtype is dt
  
+    def test_nontuple_ndindex(self):
+        a = np.arange(25).reshape((5, 5))
+        assert_equal(a[[0, 1]], np.array([a[0], a[1]]))
+        assert_equal(a[[0, 1], [0, 1]], np.array([0, 6]))
+        assert_raises(IndexError, a.__getitem__, [slice(None)])
+
  
  class TestFieldIndexing:
      def test_scalar_return_type(self):
@@ -1332,7 +1338,7 @@ class TestBooleanIndexing:
  
  
  class TestArrayToIndexDeprecation:
-    """Creating an an index from array not 0-D is an error.
+    """Creating an index from array not 0-D is an error.
  
      """
      def test_array_to_index_error(self):
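The `test_nontuple_ndindex` case added above replaces the removed `TestNonTupleNDIndexDeprecation` class: a list index is no longer silently reinterpreted as a tuple. A short sketch of the resulting behavior, assuming NumPy 1.23 or newer (older releases emit a `FutureWarning` for the last case instead of raising):

```python
import numpy as np

a = np.arange(25).reshape((5, 5))

rows = a[[0, 1]]            # one list: advanced indexing along axis 0
pairs = a[[0, 1], [0, 1]]   # two lists: elements (0, 0) and (1, 1)

try:
    a[[slice(None)]]        # a list holding a slice is not treated as a tuple
    list_of_slice_error = None
except IndexError as exc:
    list_of_slice_error = exc
```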
diff --git a/numpy/core/tests/test_limited_api.py b/numpy/core/tests/test_limited_api.py

index 0bb543d593aeefc402833162839a6a6570f08bda..3f9bf346ce52bda7788f117889f3f6d5b98efeb1 100644
--- a/numpy/core/tests/test_limited_api.py
+++ b/numpy/core/tests/test_limited_api.py
@@ -26,7 +26,7 @@ def test_limited_api(tmp_path):
      # build the examples and "install" them into a temporary directory
  
      install_log = str(tmp_path / "tmp_install_log.txt")
-    subprocess.check_call(
+    subprocess.check_output(
          [
              sys.executable,
              "setup.py",
diff --git a/numpy/core/tests/test_multiarray.py b/numpy/core/tests/test_multiarray.py

index 35acf307fc910f523b3b3a421316c514feb318fb..f4454130d7239e7343a1188acd320c57ac0a1b39 100644
--- a/numpy/core/tests/test_multiarray.py
+++ b/numpy/core/tests/test_multiarray.py
@@ -31,6 +31,7 @@ from numpy.testing import (
      )
  from numpy.testing._private.utils import _no_tracing
  from numpy.core.tests._locales import CommaDecimalPointLocale
+from numpy.lib.recfunctions import repack_fields
  
  # Need to test an object that does not fully implement math interface
  from datetime import timedelta, datetime
@@ -235,11 +236,6 @@ class TestFlags:
          assert_equal(self.a.flags.owndata, True)
          assert_equal(self.a.flags.writeable, True)
          assert_equal(self.a.flags.aligned, True)
-        with assert_warns(DeprecationWarning):
-            assert_equal(self.a.flags.updateifcopy, False)
-        with assert_warns(DeprecationWarning):
-            assert_equal(self.a.flags['U'], False)
-            assert_equal(self.a.flags['UPDATEIFCOPY'], False)
          assert_equal(self.a.flags.writebackifcopy, False)
          assert_equal(self.a.flags['X'], False)
          assert_equal(self.a.flags['WRITEBACKIFCOPY'], False)
@@ -450,6 +446,9 @@ class TestArrayConstruction:
      def test_array_empty(self):
          assert_raises(TypeError, np.array)
  
+    def test_0d_array_shape(self):
+        assert np.ones(np.array(3)).shape == (3,)
+
      def test_array_copy_false(self):
          d = np.array([1, 2, 3])
          e = np.array(d, copy=False)
@@ -512,6 +511,7 @@ class TestArrayConstruction:
          else:
              func(a=3)
  
+
  class TestAssignment:
      def test_assignment_broadcasting(self):
          a = np.arange(6).reshape(2, 3)
@@ -1211,7 +1211,8 @@ class TestStructured:
                  assert_equal(a == b, [False, True])
                  assert_equal(a != b, [True, False])
  
-        # Check that broadcasting with a subarray works
+        # Check that broadcasting with a subarray works, including cases that
+        # require promotion to work:
          a = np.array([[(0,)], [(1,)]], dtype=[('a', 'f8')])
          b = np.array([(0,), (0,), (1,)], dtype=[('a', 'f8')])
          assert_equal(a == b, [[True, True, False], [False, False, True]])
@@ -1234,27 +1235,56 @@ class TestStructured:
          # Check that incompatible sub-array shapes don't result to broadcasting
          x = np.zeros((1,), dtype=[('a', ('f4', (1, 2))), ('b', 'i1')])
          y = np.zeros((1,), dtype=[('a', ('f4', (2,))), ('b', 'i1')])
-        # This comparison invokes deprecated behaviour, and will probably
-        # start raising an error eventually. What we really care about in this
-        # test is just that it doesn't return True.
-        with suppress_warnings() as sup:
-            sup.filter(FutureWarning, "elementwise == comparison failed")
-            assert_equal(x == y, False)
+        # What matters most is that it does not return True:
+        with pytest.raises(TypeError):
+            x == y
  
          x = np.zeros((1,), dtype=[('a', ('f4', (2, 1))), ('b', 'i1')])
          y = np.zeros((1,), dtype=[('a', ('f4', (2,))), ('b', 'i1')])
-        # This comparison invokes deprecated behaviour, and will probably
-        # start raising an error eventually. What we really care about in this
-        # test is just that it doesn't return True.
-        with suppress_warnings() as sup:
-            sup.filter(FutureWarning, "elementwise == comparison failed")
-            assert_equal(x == y, False)
+        # What matters most is that it does not return True:
+        with pytest.raises(TypeError):
+            x == y
  
-        # Check that structured arrays that are different only in
-        # byte-order work
+    def test_structured_comparisons_with_promotion(self):
+        # Check that structured arrays can be compared so long as their
+        # dtypes promote fine:
          a = np.array([(5, 42), (10, 1)], dtype=[('a', '>i8'), ('b', '<f8')])
          b = np.array([(5, 43), (10, 1)], dtype=[('a', '<i8'), ('b', '>f8')])
          assert_equal(a == b, [False, True])
+        assert_equal(a != b, [True, False])
+
+        a = np.array([(5, 42), (10, 1)], dtype=[('a', '>f8'), ('b', '<f8')])
+        b = np.array([(5, 43), (10, 1)], dtype=[('a', '<i8'), ('b', '>i8')])
+        assert_equal(a == b, [False, True])
+        assert_equal(a != b, [True, False])
+
+        # Including with embedded subarray dtype (although subarray comparison
+        # itself may still be a bit weird and compare the raw data)
+        a = np.array([(5, 42), (10, 1)], dtype=[('a', '10>f8'), ('b', '5<f8')])
+        b = np.array([(5, 43), (10, 1)], dtype=[('a', '10<i8'), ('b', '5>i8')])
+        assert_equal(a == b, [False, True])
+        assert_equal(a != b, [True, False])
+
+    def test_void_comparison_failures(self):
+        # In principle, one could decide to return an array of False for some
+        # cases if comparisons are impossible.  But right now we raise
+        # TypeError when "void" dtypes are involved.
+        x = np.zeros(3, dtype=[('a', 'i1')])
+        y = np.zeros(3)
+        # Cannot compare non-structured to structured:
+        with pytest.raises(TypeError):
+            x == y
+
+        # Added title prevents promotion, but casts are OK:
+        y = np.zeros(3, dtype=[(('title', 'a'), 'i1')])
+        assert np.can_cast(y.dtype, x.dtype)
+        with pytest.raises(TypeError):
+            x == y
+
+        x = np.zeros(3, dtype="V7")
+        y = np.zeros(3, dtype="V8")
+        with pytest.raises(TypeError):
+            x == y
  
      def test_casting(self):
          # Check that casting a structured array to change its byte order
@@ -1429,7 +1459,7 @@ class TestStructured:
          assert_equal(testassign(arr, v1), ans)
          assert_equal(testassign(arr, v2), ans)
          assert_equal(testassign(arr, v3), ans)
-        assert_raises(ValueError, lambda: testassign(arr, v4))
+        assert_raises(TypeError, lambda: testassign(arr, v4))
          assert_equal(testassign(arr, v5), ans)
          w[:] = 4
          assert_equal(arr, np.array([(1,4),(1,4)], dtype=dt))
@@ -1464,6 +1494,75 @@ class TestStructured:
          assert_raises(ValueError, lambda : a[['b','b']])  # field exists, but repeated
          a[['b','c']]  # no exception
  
+    def test_structured_cast_promotion_fieldorder(self):
+        # gh-15494
+        # dtypes with different field names are not promotable
+        A = ("a", "<i8")
+        B = ("b", ">i8")
+        ab = np.array([(1, 2)], dtype=[A, B])
+        ba = np.array([(1, 2)], dtype=[B, A])
+        assert_raises(TypeError, np.concatenate, ab, ba)
+        assert_raises(TypeError, np.result_type, ab.dtype, ba.dtype)
+        assert_raises(TypeError, np.promote_types, ab.dtype, ba.dtype)
+
+        # dtypes with same field names/order but different memory offsets
+        # and byte-order are promotable to packed nbo.
+        assert_equal(np.promote_types(ab.dtype, ba[['a', 'b']].dtype),
+                     repack_fields(ab.dtype.newbyteorder('N')))
+
+        # gh-13667
+        # dtypes with different fieldnames but castable field types are castable
+        assert_equal(np.can_cast(ab.dtype, ba.dtype), True)
+        assert_equal(ab.astype(ba.dtype).dtype, ba.dtype)
+        assert_equal(np.can_cast('f8,i8', [('f0', 'f8'), ('f1', 'i8')]), True)
+        assert_equal(np.can_cast('f8,i8', [('f1', 'f8'), ('f0', 'i8')]), True)
+        assert_equal(np.can_cast('f8,i8', [('f1', 'i8'), ('f0', 'f8')]), False)
+        assert_equal(np.can_cast('f8,i8', [('f1', 'i8'), ('f0', 'f8')],
+                                 casting='unsafe'), True)
+
+        ab[:] = ba  # make sure assignment still works
+
+        # tests of type-promotion of corresponding fields
+        dt1 = np.dtype([("", "i4")])
+        dt2 = np.dtype([("", "i8")])
+        assert_equal(np.promote_types(dt1, dt2), np.dtype([('f0', 'i8')]))
+        assert_equal(np.promote_types(dt2, dt1), np.dtype([('f0', 'i8')]))
+        assert_raises(TypeError, np.promote_types, dt1, np.dtype([("", "V3")]))
+        assert_equal(np.promote_types('i4,f8', 'i8,f4'),
+                     np.dtype([('f0', 'i8'), ('f1', 'f8')]))
+        # test nested case
+        dt1nest = np.dtype([("", dt1)])
+        dt2nest = np.dtype([("", dt2)])
+        assert_equal(np.promote_types(dt1nest, dt2nest),
+                     np.dtype([('f0', np.dtype([('f0', 'i8')]))]))
+
+        # note that offsets are lost when promoting:
+        dt = np.dtype({'names': ['x'], 'formats': ['i4'], 'offsets': [8]})
+        a = np.ones(3, dtype=dt)
+        assert_equal(np.concatenate([a, a]).dtype, np.dtype([('x', 'i4')]))
+
+    @pytest.mark.parametrize("dtype_dict", [
+            dict(names=["a", "b"], formats=["i4", "f"], itemsize=100),
+            dict(names=["a", "b"], formats=["i4", "f"],
+                 offsets=[0, 12])])
+    @pytest.mark.parametrize("align", [True, False])
+    def test_structured_promotion_packs(self, dtype_dict, align):
+        # Structured dtypes are packed when promoted (we consider the packed
+        # form to be "canonical"), so there is no extra padding.
+        dtype = np.dtype(dtype_dict, align=align)
+        # Remove non "canonical" dtype options:
+        dtype_dict.pop("itemsize", None)
+        dtype_dict.pop("offsets", None)
+        expected = np.dtype(dtype_dict, align=align)
+
+        res = np.promote_types(dtype, dtype)
+        assert res.itemsize == expected.itemsize
+        assert res.fields == expected.fields
+
+        # But the "expected" one should just be returned unchanged:
+        res = np.promote_types(expected, expected)
+        assert res is expected
+
      def test_structured_asarray_is_view(self):
          # A scalar viewing an array preserves its view even when creating a
          # new array. This test documents behaviour, it may not be the best
@@ -3310,11 +3409,11 @@ class TestMethods:
          assert_equal(a.ravel('C'), [3, 2, 1, 0])
          assert_equal(a.ravel('K'), [3, 2, 1, 0])
  
-        # 1-element tidy strides test (NPY_RELAXED_STRIDES_CHECKING):
+        # 1-element tidy strides test:
          a = np.array([[1]])
          a.strides = (123, 432)
-        # If the stride is not 8, NPY_RELAXED_STRIDES_CHECKING is messing
-        # them up on purpose:
+        # If the following stride is not 8, NPY_RELAXED_STRIDES_DEBUG is
+        # messing them up on purpose:
          if np.ones(1).strides == (8,):
              assert_(np.may_share_memory(a.ravel('K'), a))
              assert_equal(a.ravel('K').strides, (a.dtype.itemsize,))
@@ -3700,6 +3799,32 @@ class TestBinop:
          check(make_obj(np.ndarray, array_ufunc=None), True, False, False,
                check_scalar=False)
  
+    @pytest.mark.parametrize("priority", [None, "runtime error"])
+    def test_ufunc_binop_bad_array_priority(self, priority):
+        # Mainly checks that this does not crash.  The second array has a lower
+        # priority than -1 ("error value").  If the __radd__ actually exists,
+        # bad things can happen (I think via the scalar paths).
+        # In principle both of these can probably just be errors in the future.
+        class BadPriority:
+            @property
+            def __array_priority__(self):
+                if priority == "runtime error":
+                    raise RuntimeError("RuntimeError in __array_priority__!")
+                return priority
+
+            def __radd__(self, other):
+                return "result"
+
+        class LowPriority(np.ndarray):
+            __array_priority__ = -1000
+
+        # A failed priority lookup uses the scalar priority (below -1000), so
+        # LowPriority wins with 'result' for each element (inner operation).
+        res = np.arange(3).view(LowPriority) + BadPriority()
+        assert res.shape == (3,)
+        assert res[0] == 'result'
+
+
      def test_ufunc_override_normalize_signature(self):
          # gh-5674
          class SomeClass:
@@ -3960,6 +4085,41 @@ class TestCAPI:
          assert_(IsPythonScalar(2.))
          assert_(IsPythonScalar("a"))
  
+    @pytest.mark.parametrize("converter",
+             [_multiarray_tests.run_scalar_intp_converter,
+              _multiarray_tests.run_scalar_intp_from_sequence])
+    def test_intp_sequence_converters(self, converter):
+        # Test simple values (-1 is special for error return paths)
+        assert converter(10) == (10,)
+        assert converter(-1) == (-1,)
+        # A 0-D array looks a bit like a sequence but must take the integer
+        # path:
+        assert converter(np.array(123)) == (123,)
+        # Test simple sequences (intp_from_sequence only supports length 1):
+        assert converter((10,)) == (10,)
+        assert converter(np.array([11])) == (11,)
+
+    @pytest.mark.parametrize("converter",
+             [_multiarray_tests.run_scalar_intp_converter,
+              _multiarray_tests.run_scalar_intp_from_sequence])
+    @pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+            reason="PyPy bug in error formatting")
+    def test_intp_sequence_converters_errors(self, converter):
+        with pytest.raises(TypeError,
+                match="expected a sequence of integers or a single integer, "):
+            converter(object())
+        with pytest.raises(TypeError,
+                match="expected a sequence of integers or a single integer, "
+                      "got '32.0'"):
+            converter(32.)
+        with pytest.raises(TypeError,
+                match="'float' object cannot be interpreted as an integer"):
+            converter([32.])
+        with pytest.raises(ValueError,
+                match="Maximum allowed dimension"):
+            # These converters currently convert overflows to a ValueError
+            converter(2**64)
+
  
  class TestSubscripting:
      def test_test_zero_rank(self):
@@ -3975,13 +4135,6 @@ class TestPickling:
      def test_correct_protocol5_error_message(self):
          array = np.arange(10)
  
-        if sys.version_info[:2] in ((3, 6), (3, 7)):
-            # For the specific case of python3.6 and 3.7, raise a clear import
-            # error about the pickle5 backport when trying to use protocol=5
-            # without the pickle5 package
-            with pytest.raises(ImportError):
-                array.__reduce_ex__(5)
-
      def test_record_array_with_object_dtype(self):
          my_object = object()
  
@@ -4205,7 +4358,8 @@ class TestArgmaxArgminCommon:
      sizes = [(), (3,), (3, 2), (2, 3),
               (3, 3), (2, 3, 4), (4, 3, 2),
               (1, 2, 3, 4), (2, 3, 4, 1),
-             (3, 4, 1, 2), (4, 1, 2, 3)]
+             (3, 4, 1, 2), (4, 1, 2, 3),
+             (64,), (128,), (256,)]
  
      @pytest.mark.parametrize("size, axis", itertools.chain(*[[(size, axis)
          for axis in list(range(-len(size), len(size))) + [None]]
@@ -4319,9 +4473,9 @@ class TestArgmaxArgminCommon:
      @pytest.mark.parametrize('ndim', [0, 1])
      @pytest.mark.parametrize('method', ['argmax', 'argmin'])
      def test_ret_is_out(self, ndim, method):
-        a = np.ones((4,) + (3,)*ndim)
+        a = np.ones((4,) + (256,)*ndim)
          arg_method = getattr(a, method)
-        out = np.empty((3,)*ndim, dtype=np.intp)
+        out = np.empty((256,)*ndim, dtype=np.intp)
          ret = arg_method(axis=0, out=out)
          assert ret is out
  
@@ -4372,12 +4526,44 @@ class TestArgmaxArgminCommon:
          assert_equal(arg_method(), 1)
  
  class TestArgmax:
-
-    nan_arr = [
-        ([0, 1, 2, 3, np.nan], 4),
-        ([0, 1, 2, np.nan, 3], 3),
-        ([np.nan, 0, 1, 2, 3], 0),
-        ([np.nan, 0, np.nan, 2, 3], 0),
+    usg_data = [
+        ([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0], 0),
+        ([3, 3, 3, 3,  2,  2,  2,  2], 0),
+        ([0, 1, 2, 3,  4,  5,  6,  7], 7),
+        ([7, 6, 5, 4,  3,  2,  1,  0], 0)
+    ]
+    sg_data = usg_data + [
+        ([1, 2, 3, 4, -4, -3, -2, -1], 3),
+        ([1, 2, 3, 4, -1, -2, -3, -4], 3)
+    ]
+    darr = [(np.array(d[0], dtype=t), d[1]) for d, t in (
+        itertools.product(usg_data, (
+            np.uint8, np.uint16, np.uint32, np.uint64
+        ))
+    )]
+    darr = darr + [(np.array(d[0], dtype=t), d[1]) for d, t in (
+        itertools.product(sg_data, (
+            np.int8, np.int16, np.int32, np.int64, np.float32, np.float64
+        ))
+    )]
+    darr = darr + [(np.array(d[0], dtype=t), d[1]) for d, t in (
+        itertools.product((
+            ([0, 1, 2, 3, np.nan], 4),
+            ([0, 1, 2, np.nan, 3], 3),
+            ([np.nan, 0, 1, 2, 3], 0),
+            ([np.nan, 0, np.nan, 2, 3], 0),
+            # To hit the tail of SIMD multi-level(x4, x1) inner loops
+            # on various SIMD widths
+            ([1] * (2*5-1) + [np.nan], 2*5-1),
+            ([1] * (4*5-1) + [np.nan], 4*5-1),
+            ([1] * (8*5-1) + [np.nan], 8*5-1),
+            ([1] * (16*5-1) + [np.nan], 16*5-1),
+            ([1] * (32*5-1) + [np.nan], 32*5-1)
+        ), (
+            np.float32, np.float64
+        ))
+    )]
+    nan_arr = darr + [
          ([0, 1, 2, 3, complex(0, np.nan)], 4),
          ([0, 1, 2, 3, complex(np.nan, 0)], 4),
          ([0, 1, 2, complex(np.nan, 0), 3], 3),
@@ -4447,28 +4633,80 @@ class TestArgmax:
          assert_equal(np.argmax(arr), pos, err_msg="%r" % arr)
          assert_equal(arr[np.argmax(arr)], val, err_msg="%r" % arr)
  
+        # add padding to test SIMD loops
+        rarr = np.repeat(arr, 129)
+        rpos = pos * 129
+        assert_equal(np.argmax(rarr), rpos, err_msg="%r" % rarr)
+        assert_equal(rarr[np.argmax(rarr)], val, err_msg="%r" % rarr)
+
+        padd = np.repeat(np.min(arr), 513)
+        rarr = np.concatenate((arr, padd))
+        rpos = pos
+        assert_equal(np.argmax(rarr), rpos, err_msg="%r" % rarr)
+        assert_equal(rarr[np.argmax(rarr)], val, err_msg="%r" % rarr)
+
+
      def test_maximum_signed_integers(self):
  
          a = np.array([1, 2**7 - 1, -2**7], dtype=np.int8)
          assert_equal(np.argmax(a), 1)
+        a.repeat(129)
+        assert_equal(np.argmax(a), 1)
  
          a = np.array([1, 2**15 - 1, -2**15], dtype=np.int16)
          assert_equal(np.argmax(a), 1)
+        a.repeat(129)
+        assert_equal(np.argmax(a), 1)
  
          a = np.array([1, 2**31 - 1, -2**31], dtype=np.int32)
          assert_equal(np.argmax(a), 1)
+        a.repeat(129)
+        assert_equal(np.argmax(a), 1)
  
          a = np.array([1, 2**63 - 1, -2**63], dtype=np.int64)
          assert_equal(np.argmax(a), 1)
-
+        a.repeat(129)
+        assert_equal(np.argmax(a), 1)
  
  class TestArgmin:
-
-    nan_arr = [
-        ([0, 1, 2, 3, np.nan], 4),
-        ([0, 1, 2, np.nan, 3], 3),
-        ([np.nan, 0, 1, 2, 3], 0),
-        ([np.nan, 0, np.nan, 2, 3], 0),
+    usg_data = [
+        ([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0], 8),
+        ([3, 3, 3, 3,  2,  2,  2,  2], 4),
+        ([0, 1, 2, 3,  4,  5,  6,  7], 0),
+        ([7, 6, 5, 4,  3,  2,  1,  0], 7)
+    ]
+    sg_data = usg_data + [
+        ([1, 2, 3, 4, -4, -3, -2, -1], 4),
+        ([1, 2, 3, 4, -1, -2, -3, -4], 7)
+    ]
+    darr = [(np.array(d[0], dtype=t), d[1]) for d, t in (
+        itertools.product(usg_data, (
+            np.uint8, np.uint16, np.uint32, np.uint64
+        ))
+    )]
+    darr = darr + [(np.array(d[0], dtype=t), d[1]) for d, t in (
+        itertools.product(sg_data, (
+            np.int8, np.int16, np.int32, np.int64, np.float32, np.float64
+        ))
+    )]
+    darr = darr + [(np.array(d[0], dtype=t), d[1]) for d, t in (
+        itertools.product((
+            ([0, 1, 2, 3, np.nan], 4),
+            ([0, 1, 2, np.nan, 3], 3),
+            ([np.nan, 0, 1, 2, 3], 0),
+            ([np.nan, 0, np.nan, 2, 3], 0),
+            # To hit the tail of SIMD multi-level(x4, x1) inner loops
+            # on various SIMD widths
+            ([1] * (2*5-1) + [np.nan], 2*5-1),
+            ([1] * (4*5-1) + [np.nan], 4*5-1),
+            ([1] * (8*5-1) + [np.nan], 8*5-1),
+            ([1] * (16*5-1) + [np.nan], 16*5-1),
+            ([1] * (32*5-1) + [np.nan], 32*5-1)
+        ), (
+            np.float32, np.float64
+        ))
+    )]
+    nan_arr = darr + [
          ([0, 1, 2, 3, complex(0, np.nan)], 4),
          ([0, 1, 2, 3, complex(np.nan, 0)], 4),
          ([0, 1, 2, complex(np.nan, 0), 3], 3),
@@ -4527,30 +4765,50 @@ class TestArgmin:
          ([False, True, False, True, True], 0),
      ]
  
-    def test_combinations(self):
-        for arr, pos in self.nan_arr:
-            with suppress_warnings() as sup:
-                sup.filter(RuntimeWarning,
-                           "invalid value encountered in reduce")
-                min_val = np.min(arr)
+    @pytest.mark.parametrize('data', nan_arr)
+    def test_combinations(self, data):
+        arr, pos = data
+        with suppress_warnings() as sup:
+            sup.filter(RuntimeWarning,
+                       "invalid value encountered in reduce")
+            min_val = np.min(arr)
  
-            assert_equal(np.argmin(arr), pos, err_msg="%r" % arr)
-            assert_equal(arr[np.argmin(arr)], min_val, err_msg="%r" % arr)
+        assert_equal(np.argmin(arr), pos, err_msg="%r" % arr)
+        assert_equal(arr[np.argmin(arr)], min_val, err_msg="%r" % arr)
+
+        # add padding to test SIMD loops
+        rarr = np.repeat(arr, 129)
+        rpos = pos * 129
+        assert_equal(np.argmin(rarr), rpos, err_msg="%r" % rarr)
+        assert_equal(rarr[np.argmin(rarr)], min_val, err_msg="%r" % rarr)
+
+        padd = np.repeat(np.max(arr), 513)
+        rarr = np.concatenate((arr, padd))
+        rpos = pos
+        assert_equal(np.argmin(rarr), rpos, err_msg="%r" % rarr)
+        assert_equal(rarr[np.argmin(rarr)], min_val, err_msg="%r" % rarr)
  
      def test_minimum_signed_integers(self):
  
          a = np.array([1, -2**7, -2**7 + 1, 2**7 - 1], dtype=np.int8)
          assert_equal(np.argmin(a), 1)
+        a.repeat(129)
+        assert_equal(np.argmin(a), 1)
  
          a = np.array([1, -2**15, -2**15 + 1, 2**15 - 1], dtype=np.int16)
          assert_equal(np.argmin(a), 1)
+        a.repeat(129)
+        assert_equal(np.argmin(a), 1)
  
          a = np.array([1, -2**31, -2**31 + 1, 2**31 - 1], dtype=np.int32)
          assert_equal(np.argmin(a), 1)
+        a.repeat(129)
+        assert_equal(np.argmin(a), 1)
  
          a = np.array([1, -2**63, -2**63 + 1, 2**63 - 1], dtype=np.int64)
          assert_equal(np.argmin(a), 1)
-
+        a.repeat(129)
+        assert_equal(np.argmin(a), 1)
  
  class TestMinMax:
  
@@ -5350,12 +5608,13 @@ class TestFromBuffer:
          buf = x.tobytes()
          assert_array_equal(np.frombuffer(buf, dtype=dt), x.flat)
  
-    def test_array_base(self):
-        arr = np.arange(10)
-        new = np.frombuffer(arr)
-        # We currently special case arrays to ensure they are used as a base.
-        # This could probably be changed (removing the test).
-        assert new.base is arr
+    @pytest.mark.parametrize("obj", [np.arange(10), b"12345678"])
+    def test_array_base(self, obj):
+        # Objects (including NumPy arrays) that do not use the
+        # `release_buffer` slot should be used directly as the base object.
+        # See also gh-21612
+        new = np.frombuffer(obj)
+        assert new.base is obj
  
      def test_empty(self):
          assert_array_equal(np.frombuffer(b''), np.array([]))
@@ -5414,19 +5673,8 @@ class TestFlat:
  
          assert_(c.flags.writeable is False)
          assert_(d.flags.writeable is False)
-        # for 1.14 all are set to non-writeable on the way to replacing the
-        # UPDATEIFCOPY array returned for non-contiguous arrays.
          assert_(e.flags.writeable is True)
          assert_(f.flags.writeable is False)
-        with assert_warns(DeprecationWarning):
-            assert_(c.flags.updateifcopy is False)
-        with assert_warns(DeprecationWarning):
-            assert_(d.flags.updateifcopy is False)
-        with assert_warns(DeprecationWarning):
-            assert_(e.flags.updateifcopy is False)
-        with assert_warns(DeprecationWarning):
-            # UPDATEIFCOPY is removed.
-            assert_(f.flags.updateifcopy is False)
          assert_(c.flags.writebackifcopy is False)
          assert_(d.flags.writebackifcopy is False)
          assert_(e.flags.writebackifcopy is False)
@@ -6921,26 +7169,6 @@ class TestInner:
              assert_equal(np.inner(b, a).transpose(2,3,0,1), desired)
  
  
-class TestAlen:
-    def test_basic(self):
-        with pytest.warns(DeprecationWarning):
-            m = np.array([1, 2, 3])
-            assert_equal(np.alen(m), 3)
-
-            m = np.array([[1, 2, 3], [4, 5, 7]])
-            assert_equal(np.alen(m), 2)
-
-            m = [1, 2, 3]
-            assert_equal(np.alen(m), 3)
-
-            m = [[1, 2, 3], [4, 5, 7]]
-            assert_equal(np.alen(m), 2)
-
-    def test_singleton(self):
-        with pytest.warns(DeprecationWarning):
-            assert_equal(np.alen(5), 1)
-
-
  class TestChoose:
      def setup(self):
          self.x = 2*np.ones((3,), dtype=int)
@@ -7609,7 +7837,7 @@ class TestNewBufferProtocol:
              assert_equal(y.format, 'T{b:a:=h:b:i:c:l:d:q:dx:B:e:@H:f:=I:g:L:h:Q:hx:f:i:d:j:^g:k:=Zf:ix:Zd:jx:^Zg:kx:4s:l:=4w:m:3x:n:?:o:@e:p:}')
          else:
              assert_equal(y.format, 'T{b:a:=h:b:i:c:q:d:q:dx:B:e:@H:f:=I:g:Q:h:Q:hx:f:i:d:j:^g:k:=Zf:ix:Zd:jx:^Zg:kx:4s:l:=4w:m:3x:n:?:o:@e:p:}')
-        # Cannot test if NPY_RELAXED_STRIDES_CHECKING changes the strides
+        # Cannot test if NPY_RELAXED_STRIDES_DEBUG changes the strides
          if not (np.ones(1).strides[0] == np.iinfo(np.intp).max):
              assert_equal(y.strides, (sz,))
          assert_equal(y.itemsize, sz)
@@ -7699,10 +7927,7 @@ class TestNewBufferProtocol:
      def test_relaxed_strides(self, c=np.ones((1, 10, 10), dtype='i8')):
          # Note: c defined as parameter so that it is persistent and leak
          # checks will notice gh-16934 (buffer info cache leak).
-
-        # Check for NPY_RELAXED_STRIDES_CHECKING:
-        if np.ones((10, 1), order="C").flags.f_contiguous:
-            c.strides = (-1, 80, 8)
+        c.strides = (-1, 80, 8)  # strides need to be fixed at export
  
          assert_(memoryview(c).strides == (800, 80, 8))
  
@@ -7846,6 +8071,26 @@ class TestNewBufferProtocol:
          # Fix buffer info again before we delete (or we lose the memory)
          _multiarray_tests.corrupt_or_fix_bufferinfo(obj)
  
+    def test_no_suboffsets(self):
+        try:
+            import _testbuffer
+        except ImportError:
+            raise pytest.skip("_testbuffer is not available")
+
+        for shape in [(2, 3), (2, 3, 4)]:
+            data = list(range(np.prod(shape)))
+            buffer = _testbuffer.ndarray(data, shape, format='i',
+                                         flags=_testbuffer.ND_PIL)
+            msg = "NumPy currently does not support.*suboffsets"
+            with pytest.raises(BufferError, match=msg):
+                np.asarray(buffer)
+            with pytest.raises(BufferError, match=msg):
+                np.asarray([buffer])
+
+            # Also check (unrelated and more limited but similar) frombuffer:
+            with pytest.raises(BufferError):
+                np.frombuffer(buffer)
+
  
  class TestArrayCreationCopyArgument(object):
  
@@ -7865,9 +8110,9 @@ class TestArrayCreationCopyArgument(object):
              pyscalar = arr.item(0)
  
              # Test never-copy raises error:
-            assert_raises(ValueError, np.array, scalar, 
+            assert_raises(ValueError, np.array, scalar,
                              copy=np._CopyMode.NEVER)
-            assert_raises(ValueError, np.array, pyscalar, 
+            assert_raises(ValueError, np.array, pyscalar,
                              copy=np._CopyMode.NEVER)
              assert_raises(ValueError, np.array, pyscalar,
                              copy=self.RaiseOnBool())
@@ -8220,18 +8465,6 @@ def test_scalar_element_deletion():
      assert_raises(ValueError, a[0].__delitem__, 'x')
  
  
-class TestMemEventHook:
-    def test_mem_seteventhook(self):
-        # The actual tests are within the C code in
-        # multiarray/_multiarray_tests.c.src
-        _multiarray_tests.test_pydatamem_seteventhook_start()
-        # force an allocation and free of a numpy array
-        # needs to be larger then limit of small memory cacher in ctors.c
-        a = np.zeros(1000)
-        del a
-        break_cycles()
-        _multiarray_tests.test_pydatamem_seteventhook_end()
-
  class TestMapIter:
      def test_mapiter(self):
          # The actual tests are within the C code in
@@ -8330,10 +8563,9 @@ class TestConversion:
  
          self_containing = np.array([None])
          self_containing[0] = self_containing
-        try:
-            Error = RecursionError
-        except NameError:
-            Error = RuntimeError  # python < 3.5
+
+        Error = RecursionError
+
          assert_raises(Error, bool, self_containing)  # previously stack overflow
          self_containing[0] = None  # resolve circular reference
  
@@ -8351,11 +8583,16 @@ class TestConversion:
              assert_equal(5, int_func(np.bytes_(b'5')))
              assert_equal(6, int_func(np.unicode_(u'6')))
  
-            class HasTrunc:
-                def __trunc__(self):
-                    return 3
-            assert_equal(3, int_func(np.array(HasTrunc())))
-            assert_equal(3, int_func(np.array([HasTrunc()])))
+            # The delegation of int() to __trunc__ was deprecated in
+            # Python 3.11.
+            if sys.version_info < (3, 11):
+                class HasTrunc:
+                    def __trunc__(self):
+                        return 3
+                assert_equal(3, int_func(np.array(HasTrunc())))
+                assert_equal(3, int_func(np.array([HasTrunc()])))
+            else:
+                pass
  
              class NotConvertible:
                  def __int__(self):
@@ -9015,12 +9252,28 @@ class TestArrayFinalize:
          a = np.array(1).view(SavesBase)
          assert_(a.saved_base is a.base)
  
-    def test_bad_finalize(self):
+    def test_bad_finalize1(self):
          class BadAttributeArray(np.ndarray):
              @property
              def __array_finalize__(self):
                  raise RuntimeError("boohoo!")
  
+        with pytest.raises(TypeError, match="not callable"):
+            np.arange(10).view(BadAttributeArray)
+
+    def test_bad_finalize2(self):
+        class BadAttributeArray(np.ndarray):
+            def __array_finalize__(self):
+                raise RuntimeError("boohoo!")
+
+        with pytest.raises(TypeError, match="takes 1 positional"):
+            np.arange(10).view(BadAttributeArray)
+
+    def test_bad_finalize3(self):
+        class BadAttributeArray(np.ndarray):
+            def __array_finalize__(self, obj):
+                raise RuntimeError("boohoo!")
+
          with pytest.raises(RuntimeError, match="boohoo!"):
              np.arange(10).view(BadAttributeArray)
  
@@ -9058,6 +9311,14 @@ class TestArrayFinalize:
          break_cycles()
          assert_(obj_ref() is None, "no references should remain")
  
+    def test_can_use_super(self):
+        class SuperFinalize(np.ndarray):
+            def __array_finalize__(self, obj):
+                self.saved_result = super().__array_finalize__(obj)
+
+        a = np.array(1).view(SuperFinalize)
+        assert_(a.saved_result is None)
+
  
  def test_orderconverter_with_nonASCII_unicode_ordering():
      # gh-7475
@@ -9252,3 +9513,115 @@ def test_getfield():
      pytest.raises(ValueError, a.getfield, 'uint8', -1)
      pytest.raises(ValueError, a.getfield, 'uint8', 16)
      pytest.raises(ValueError, a.getfield, 'uint64', 0)
+
+
+class TestViewDtype:
+    """
+    Verify that making a view of a non-contiguous array works as expected.
+    """
+    def test_smaller_dtype_multiple(self):
+        # x is non-contiguous
+        x = np.arange(10, dtype='<i4')[::2]
+        with pytest.raises(ValueError,
+                           match='the last axis must be contiguous'):
+            x.view('<i2')
+        expected = [[0, 0], [2, 0], [4, 0], [6, 0], [8, 0]]
+        assert_array_equal(x[:, np.newaxis].view('<i2'), expected)
+
+    def test_smaller_dtype_not_multiple(self):
+        # x is non-contiguous
+        x = np.arange(5, dtype='<i4')[::2]
+
+        with pytest.raises(ValueError,
+                           match='the last axis must be contiguous'):
+            x.view('S3')
+        with pytest.raises(ValueError,
+                           match='When changing to a smaller dtype'):
+            x[:, np.newaxis].view('S3')
+
+        # Make sure the problem is because of the dtype size
+        expected = [[b''], [b'\x02'], [b'\x04']]
+        assert_array_equal(x[:, np.newaxis].view('S4'), expected)
+
+    def test_larger_dtype_multiple(self):
+        # x is non-contiguous in the first dimension, contiguous in the last
+        x = np.arange(20, dtype='<i2').reshape(10, 2)[::2, :]
+        expected = np.array([[65536], [327684], [589832],
+                             [851980], [1114128]], dtype='<i4')
+        assert_array_equal(x.view('<i4'), expected)
+
+    def test_larger_dtype_not_multiple(self):
+        # x is non-contiguous in the first dimension, contiguous in the last
+        x = np.arange(20, dtype='<i2').reshape(10, 2)[::2, :]
+        with pytest.raises(ValueError,
+                           match='When changing to a larger dtype'):
+            x.view('S3')
+        # Make sure the problem is because of the dtype size
+        expected = [[b'\x00\x00\x01'], [b'\x04\x00\x05'], [b'\x08\x00\t'],
+                    [b'\x0c\x00\r'], [b'\x10\x00\x11']]
+        assert_array_equal(x.view('S4'), expected)
+
+    def test_f_contiguous(self):
+        # x is F-contiguous
+        x = np.arange(4 * 3, dtype='<i4').reshape(4, 3).T
+        with pytest.raises(ValueError,
+                           match='the last axis must be contiguous'):
+            x.view('<i2')
+
+    def test_non_c_contiguous(self):
+        # x is contiguous in axis=-1, but not C-contiguous in other axes
+        x = np.arange(2 * 3 * 4, dtype='i1').\
+                    reshape(2, 3, 4).transpose(1, 0, 2)
+        expected = [[[256, 770], [3340, 3854]],
+                    [[1284, 1798], [4368, 4882]],
+                    [[2312, 2826], [5396, 5910]]]
+        assert_array_equal(x.view('<i2'), expected)
+
+
+# Test various array sizes that hit different code paths in quicksort-avx512
+@pytest.mark.parametrize("N", [8, 16, 24, 32, 48, 64, 96, 128, 151, 191,
+                               256, 383, 512, 1023, 2047])
+def test_sort_float(N):
+    # Regular data with nan sprinkled
+    np.random.seed(42)
+    arr = -0.5 + np.random.sample(N).astype('f')
+    arr[np.random.choice(arr.shape[0], 3)] = np.nan
+    assert_equal(np.sort(arr, kind='quick'), np.sort(arr, kind='heap'))
+
+    # (2) with +INF
+    infarr = np.inf*np.ones(N, dtype='f')
+    infarr[np.random.choice(infarr.shape[0], 5)] = -1.0
+    assert_equal(np.sort(infarr, kind='quick'), np.sort(infarr, kind='heap'))
+
+    # (3) with -INF
+    neginfarr = -np.inf*np.ones(N, dtype='f')
+    neginfarr[np.random.choice(neginfarr.shape[0], 5)] = 1.0
+    assert_equal(np.sort(neginfarr, kind='quick'),
+                 np.sort(neginfarr, kind='heap'))
+
+    # (4) with +/-INF
+    infarr = np.inf*np.ones(N, dtype='f')
+    infarr[np.random.choice(infarr.shape[0], (int)(N/2))] = -np.inf
+    assert_equal(np.sort(infarr, kind='quick'), np.sort(infarr, kind='heap'))
+
+
+def test_sort_int():
+    # Random data with NPY_MAX_INT32 and NPY_MIN_INT32 sprinkled
+    rng = np.random.default_rng(42)
+    N = 2047
+    minv = np.iinfo(np.int32).min
+    maxv = np.iinfo(np.int32).max
+    arr = rng.integers(low=minv, high=maxv, size=N).astype('int32')
+    arr[np.random.choice(arr.shape[0], 10)] = minv
+    arr[np.random.choice(arr.shape[0], 10)] = maxv
+    assert_equal(np.sort(arr, kind='quick'), np.sort(arr, kind='heap'))
+
+
+def test_sort_uint():
+    # Random data with NPY_MAX_UINT32 sprinkled
+    rng = np.random.default_rng(42)
+    N = 2047
+    maxv = np.iinfo(np.uint32).max
+    arr = rng.integers(low=0, high=maxv, size=N).astype('uint32')
+    arr[np.random.choice(arr.shape[0], 10)] = maxv
+    assert_equal(np.sort(arr, kind='quick'), np.sort(arr, kind='heap'))
diff --git a/numpy/core/tests/test_nditer.py b/numpy/core/tests/test_nditer.py
index ed775cac628d92b6ac2bf7815d7e0b3ca62a1887..b43bc50e9dc0b2febcf84d9a0982b2a40129ac2e 100644
--- a/numpy/core/tests/test_nditer.py
+++ b/numpy/core/tests/test_nditer.py
@@ -828,7 +828,7 @@ def test_iter_nbo_align_contig():
                          casting='equiv',
                          op_dtypes=[np.dtype('f4')])
      with i:
-        # context manager triggers UPDATEIFCOPY on i at exit
+        # context manager triggers WRITEBACKIFCOPY on i at exit
          assert_equal(i.dtypes[0].byteorder, a.dtype.byteorder)
          assert_equal(i.operands[0].dtype.byteorder, a.dtype.byteorder)
          assert_equal(i.operands[0], a)
@@ -1990,13 +1990,13 @@ def test_iter_buffered_cast_structured_type_failure_with_cleanup():
      a = np.array([(1, 2, 3), (4, 5, 6)], dtype=sdt1)
  
      for intent in ["readwrite", "readonly", "writeonly"]:
-        # If the following assert fails, the place where the error is raised
-        # within nditer may change. That is fine, but it may make sense for
-        # a new (hard to design) test to replace it. The `simple_arr` is
-        # designed to require a multi-step cast (due to having fields).
-        assert np.can_cast(a.dtype, sdt2, casting="unsafe")
+        # This test was initially designed to test an error at a different
+        # place, but will now raise earlier due to the cast not being possible:
+        # `assert np.can_cast(a.dtype, sdt2, casting="unsafe")` fails.
+        # Without a faulty DType, there is probably no reliable
+        # way to get the initial tested behaviour.
          simple_arr = np.array([1, 2], dtype="i,i")  # requires clean up
-        with pytest.raises(ValueError):
+        with pytest.raises(TypeError):
              nditer((simple_arr, a), ['buffered', 'refs_ok'], [intent, intent],
                     casting='unsafe', op_dtypes=["f,f", sdt2])
  
diff --git a/numpy/core/tests/test_numeric.py b/numpy/core/tests/test_numeric.py
index ad94379115a8532e8e5654364351013326d283ba..0b03c65760f46bf96e1356e58059c1290f8d58d7 100644
--- a/numpy/core/tests/test_numeric.py
+++ b/numpy/core/tests/test_numeric.py
@@ -932,9 +932,28 @@ class TestTypes:
          # Promote with object:
          assert_equal(promote_types('O', S+'30'), np.dtype('O'))
  
+    @pytest.mark.parametrize(["dtype1", "dtype2"],
+            [[np.dtype("V6"), np.dtype("V10")],  # mismatch shape
+             # Mismatching names:
+             [np.dtype([("name1", "i8")]), np.dtype([("name2", "i8")])],
+            ])
+    def test_invalid_void_promotion(self, dtype1, dtype2):
+        with pytest.raises(TypeError):
+            np.promote_types(dtype1, dtype2)
+
+    @pytest.mark.parametrize(["dtype1", "dtype2"],
+            [[np.dtype("V10"), np.dtype("V10")],
+             [np.dtype([("name1", "i8")]),
+              np.dtype([("name1", np.dtype("i8").newbyteorder())])],
+             [np.dtype("i8,i8"), np.dtype("i8,>i8")],
+             [np.dtype("i8,i8"), np.dtype("i4,i4")],
+            ])
+    def test_valid_void_promotion(self, dtype1, dtype2):
+        assert np.promote_types(dtype1, dtype2) == dtype1
+
      @pytest.mark.parametrize("dtype",
-           list(np.typecodes["All"]) +
-           ["i,i", "S3", "S100", "U3", "U100", rational])
+            list(np.typecodes["All"]) +
+            ["i,i", "10i", "S3", "S100", "U3", "U100", rational])
      def test_promote_identical_types_metadata(self, dtype):
          # The same type passed in twice to promote types always
          # preserves metadata
@@ -951,14 +970,14 @@ class TestTypes:
              return
  
          res = np.promote_types(dtype, dtype)
-        if res.char in "?bhilqpBHILQPefdgFDGOmM" or dtype.type is rational:
-            # Metadata is lost for simple promotions (they create a new dtype)
+
+        # Metadata is (currently) generally lost on byte-swapping (except for
+        # unicode).
+        if dtype.char != "U":
              assert res.metadata is None
          else:
              assert res.metadata == metadata
-        if dtype.kind != "V":
-            # the result is native (except for structured void)
-            assert res.isnative
+        assert res.isnative
  
      @pytest.mark.slow
      @pytest.mark.filterwarnings('ignore:Promotion of numbers:FutureWarning')
@@ -987,8 +1006,10 @@ class TestTypes:
              # Promotion failed, this test only checks metadata
              return
  
-        if res.char in "?bhilqpBHILQPefdgFDGOmM" or res.type is rational:
-            # All simple types lose metadata (due to using promotion table):
+        if res.char not in "USV" or res.names is not None or res.shape != ():
+            # All except string dtypes (and unstructured void) lose metadata
+            # on promotion (unless both dtypes are identical).
+            # At some point structured ones did not, but were restrictive.
              assert res.metadata is None
          elif res == dtype1:
              # If one result is the result, it is usually returned unchanged:
@@ -1008,32 +1029,9 @@ class TestTypes:
          dtype1 = dtype1.newbyteorder()
          assert dtype1.metadata == metadata1
          res_bs = np.promote_types(dtype1, dtype2)
-        if res_bs.names is not None:
-            # Structured promotion doesn't remove byteswap:
-            assert res_bs.newbyteorder() == res
-        else:
-            assert res_bs == res
+        assert res_bs == res
          assert res_bs.metadata == res.metadata
  
-    @pytest.mark.parametrize(["dtype1", "dtype2"],
-            [[np.dtype("V6"), np.dtype("V10")],
-             [np.dtype([("name1", "i8")]), np.dtype([("name2", "i8")])],
-             [np.dtype("i8,i8"), np.dtype("i4,i4")],
-            ])
-    def test_invalid_void_promotion(self, dtype1, dtype2):
-        # Mainly test structured void promotion, which currently allows
-        # byte-swapping, but nothing else:
-        with pytest.raises(TypeError):
-            np.promote_types(dtype1, dtype2)
-
-    @pytest.mark.parametrize(["dtype1", "dtype2"],
-            [[np.dtype("V10"), np.dtype("V10")],
-             [np.dtype([("name1", "<i8")]), np.dtype([("name1", ">i8")])],
-             [np.dtype("i8,i8"), np.dtype("i8,>i8")],
-            ])
-    def test_valid_void_promotion(self, dtype1, dtype2):
-        assert np.promote_types(dtype1, dtype2) is dtype1
-
      def test_can_cast(self):
          assert_(np.can_cast(np.int32, np.int64))
          assert_(np.can_cast(np.float64, complex))
@@ -1202,19 +1200,76 @@ class TestFromiter:
                  raise NIterError('error at index %s' % eindex)
              yield e
  
-    def test_2592(self):
-        # Test iteration exceptions are correctly raised.
-        count, eindex = 10, 5
-        assert_raises(NIterError, np.fromiter,
-                          self.load_data(count, eindex), dtype=int, count=count)
-
-    def test_2592_edge(self):
-        # Test iter. exceptions, edge case (exception at end of iterator).
-        count = 10
-        eindex = count-1
-        assert_raises(NIterError, np.fromiter,
-                          self.load_data(count, eindex), dtype=int, count=count)
+    @pytest.mark.parametrize("dtype", [int, object])
+    @pytest.mark.parametrize(["count", "error_index"], [(10, 5), (10, 9)])
+    def test_2592(self, count, error_index, dtype):
+        # Test iteration exceptions are correctly raised. The data/generator
+        # has `count` elements but errors at `error_index`
+        iterable = self.load_data(count, error_index)
+        with pytest.raises(NIterError):
+            np.fromiter(iterable, dtype=dtype, count=count)
+
+    @pytest.mark.parametrize("dtype", ["S", "S0", "V0", "U0"])
+    def test_empty_not_structured(self, dtype):
+        # Note, "S0" could be allowed at some point, so long as "S" (without
+        # any length) is rejected.
+        with pytest.raises(ValueError, match="Must specify length"):
+            np.fromiter([], dtype=dtype)
+
+    @pytest.mark.parametrize(["dtype", "data"],
+            [("d", [1, 2, 3, 4, 5, 6, 7, 8, 9]),
+             ("O", [1, 2, 3, 4, 5, 6, 7, 8, 9]),
+             ("i,O", [(1, 2), (5, 4), (2, 3), (9, 8), (6, 7)]),
+             # subarray dtypes (important because their dimensions end up
+             # in the result array's dimensions):
+             ("2i", [(1, 2), (5, 4), (2, 3), (9, 8), (6, 7)]),
+             (np.dtype(("O", (2, 3))),
+              [((1, 2, 3), (3, 4, 5)), ((3, 2, 1), (5, 4, 3))])])
+    @pytest.mark.parametrize("length_hint", [0, 1])
+    def test_growth_and_complicated_dtypes(self, dtype, data, length_hint):
+        dtype = np.dtype(dtype)
+
+        data = data * 100  # make sure we realloc a bit
+
+        class MyIter:
+            # Class/example from gh-15789
+            def __length_hint__(self):
+                # only required to be an estimate, this is legal
+                return length_hint  # 0 or 1
+
+            def __iter__(self):
+                return iter(data)
+
+        res = np.fromiter(MyIter(), dtype=dtype)
+        expected = np.array(data, dtype=dtype)
+
+        assert_array_equal(res, expected)
+
+    def test_empty_result(self):
+        class MyIter:
+            def __length_hint__(self):
+                return 10
+
+            def __iter__(self):
+                return iter([])  # actual iterator is empty.
+
+        res = np.fromiter(MyIter(), dtype="d")
+        assert res.shape == (0,)
+        assert res.dtype == "d"
+
+    def test_too_few_items(self):
+        msg = "iterator too short: Expected 10 but iterator had only 3 items."
+        with pytest.raises(ValueError, match=msg):
+            np.fromiter([1, 2, 3], count=10, dtype=int)
+
+    def test_failed_itemsetting(self):
+        with pytest.raises(TypeError):
+            np.fromiter([1, None, 3], dtype=int)
  
+        # The following manages to hit somewhat trickier code paths:
+        iterable = ((2, 3, 4) for i in range(5))
+        with pytest.raises(ValueError):
+            np.fromiter(iterable, dtype=np.dtype((int, 2)))
  
  class TestNonzero:
      def test_nonzero_trivial(self):
diff --git a/numpy/core/tests/test_numerictypes.py b/numpy/core/tests/test_numerictypes.py
index 9cb00342dd0c7a9dadd8819f09c32bc0e618bf06..73ff4764d664569dd3f8f3899dd16752cfae0412 100644
--- a/numpy/core/tests/test_numerictypes.py
+++ b/numpy/core/tests/test_numerictypes.py
@@ -436,6 +436,13 @@ class TestSctypeDict:
          assert_(np.sctypeDict['f8'] is not np.longdouble)
          assert_(np.sctypeDict['c16'] is not np.clongdouble)
  
+    def test_ulong(self):
+        # Test that 'ulong' behaves like 'long'. np.sctypeDict['long'] is an
+        # alias for np.int_, but np.long is not supported for historical
+        # reasons (gh-21063)
+        assert_(np.sctypeDict['ulong'] is np.uint)
+        assert_(not hasattr(np, 'ulong'))
+
  
  class TestBitName:
      def test_abstract(self):
diff --git a/numpy/core/tests/test_overrides.py b/numpy/core/tests/test_overrides.py
index 9216a3f5fdfacba9e91c93dc1052705054264f64..36970dbc02ed413353f63bf65a4d089cff81ffed 100644
--- a/numpy/core/tests/test_overrides.py
+++ b/numpy/core/tests/test_overrides.py
@@ -437,6 +437,7 @@ class TestArrayLike:
                  self.function = function
  
              def __array_function__(self, func, types, args, kwargs):
+                assert func is getattr(np, func.__name__)
                  try:
                      my_func = getattr(self, func.__name__)
                  except AttributeError:
diff --git a/numpy/core/tests/test_regression.py b/numpy/core/tests/test_regression.py
index 21cc8c1595f6a7295825cd92276702d68b62d681..98e0df9b84a3595f407ff0cbfbabde1e0cf3a4d7 100644
--- a/numpy/core/tests/test_regression.py
+++ b/numpy/core/tests/test_regression.py
@@ -17,12 +17,6 @@ from numpy.testing import (
  from numpy.testing._private.utils import _no_tracing, requires_memory
  from numpy.compat import asbytes, asunicode, pickle
  
-try:
-    RecursionError
-except NameError:
-    RecursionError = RuntimeError  # python < 3.5
-
-
  
  class TestRegression:
      def test_invalid_round(self):
@@ -430,7 +424,6 @@ class TestRegression:
      def test_lexsort_zerolen_custom_strides(self):
          # Ticket #14228
          xs = np.array([], dtype='i8')
-        assert xs.strides == (8,)
          assert np.lexsort((xs,)).shape[0] == 0 # Works
  
          xs.strides = (16,)
@@ -658,10 +651,10 @@ class TestRegression:
          a = np.ones((0, 2))
          a.shape = (-1, 2)
  
-    # Cannot test if NPY_RELAXED_STRIDES_CHECKING changes the strides.
-    # With NPY_RELAXED_STRIDES_CHECKING the test becomes superfluous.
+    # Cannot test if NPY_RELAXED_STRIDES_DEBUG changes the strides.
+    # With NPY_RELAXED_STRIDES_DEBUG the test becomes superfluous.
      @pytest.mark.skipif(np.ones(1).strides[0] == np.iinfo(np.intp).max,
-                        reason="Using relaxed stride checking")
+                        reason="Using relaxed stride debug")
      def test_reshape_trailing_ones_strides(self):
          # GitHub issue gh-2949, bad strides for trailing ones of new shape
          a = np.zeros(12, dtype=np.int32)[::2]  # not contiguous
@@ -918,11 +911,11 @@ class TestRegression:
          # Ticket #658
          np.indices((0, 3, 4)).T.reshape(-1, 3)
  
-    # Cannot test if NPY_RELAXED_STRIDES_CHECKING changes the strides.
-    # With NPY_RELAXED_STRIDES_CHECKING the test becomes superfluous,
+    # Cannot test if NPY_RELAXED_STRIDES_DEBUG changes the strides.
+    # With NPY_RELAXED_STRIDES_DEBUG the test becomes superfluous,
      # 0-sized reshape itself is tested elsewhere.
      @pytest.mark.skipif(np.ones(1).strides[0] == np.iinfo(np.intp).max,
-                        reason="Using relaxed stride checking")
+                        reason="Using relaxed stride debug")
      def test_copy_detection_corner_case2(self):
          # Ticket #771: strides are not set correctly when reshaping 0-sized
          # arrays
@@ -2310,6 +2303,8 @@ class TestRegression:
              new_shape = (2, 7, 7, 43826197)
          assert_raises(ValueError, a.reshape, new_shape)
  
+    @pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+            reason="PyPy bug in error formatting")
      def test_invalid_structured_dtypes(self):
          # gh-2865
          # mapping python objects to other dtypes
@@ -2485,8 +2480,6 @@ class TestRegression:
          assert arr.shape == (1, 0, 0)
  
      @pytest.mark.skipif(sys.maxsize < 2 ** 31 + 1, reason='overflows 32-bit python')
-    @pytest.mark.skipif(sys.platform == 'win32' and sys.version_info[:2] < (3, 8),
-                        reason='overflows on windows, fixed in bpo-16865')
      def test_to_ctypes(self):
          #gh-14214
          arr = np.zeros((2 ** 31 + 1,), 'b')
diff --git a/numpy/core/tests/test_scalarbuffer.py b/numpy/core/tests/test_scalarbuffer.py
index 851cd3081aeeff6516320d056e019c80760c7bf6..0e6ab1015e15eedf077c3f8f9ccbdcb20d4d1d31 100644
--- a/numpy/core/tests/test_scalarbuffer.py
+++ b/numpy/core/tests/test_scalarbuffer.py
@@ -59,7 +59,6 @@ class TestScalarPEP3118:
                          shape=(), format=code, readonly=True)
  
          mv_x = memoryview(x)
-        print(mv_x.readonly, self._as_dict(mv_x))
          assert self._as_dict(mv_x) == expected
  
      @pytest.mark.parametrize('scalar', scalars_only, ids=codes_only)
@@ -151,4 +150,4 @@ class TestScalarPEP3118:
  
          # Check that we do not allow writeable buffer export
          with pytest.raises(BufferError, match="scalar buffer is readonly"):
-            get_buffer_info(r, ["WRITABLE"])
\ No newline at end of file
+            get_buffer_info(r, ["WRITABLE"])
diff --git a/numpy/core/tests/test_scalarmath.py b/numpy/core/tests/test_scalarmath.py
index 90078a2ea3cefd1e9af11fde3fa26db840ed4292..b7fe5183e0d1b97423fed5a7bd29991fcf051ec5 100644
--- a/numpy/core/tests/test_scalarmath.py
+++ b/numpy/core/tests/test_scalarmath.py
@@ -4,9 +4,11 @@ import warnings
  import itertools
  import operator
  import platform
+from numpy.compat import _pep440
  import pytest
-from hypothesis import given, settings, Verbosity
+from hypothesis import given, settings
  from hypothesis.strategies import sampled_from
+from hypothesis.extra import numpy as hynp
  
  import numpy as np
  from numpy.testing import (
@@ -23,6 +25,14 @@ types = [np.bool_, np.byte, np.ubyte, np.short, np.ushort, np.intc, np.uintc,
  floating_types = np.floating.__subclasses__()
  complex_floating_types = np.complexfloating.__subclasses__()
  
+objecty_things = [object(), None]
+
+reasonable_operators_for_scalars = [
+    operator.lt, operator.le, operator.eq, operator.ne, operator.ge,
+    operator.gt, operator.add, operator.floordiv, operator.mod,
+    operator.mul, operator.pow, operator.sub, operator.truediv,
+]
+
  
  # This compares scalarmath against ufuncs.
  
@@ -65,6 +75,41 @@ class TestTypes:
              np.add(1, 1)
  
  
+@pytest.mark.slow
+@settings(max_examples=10000, deadline=2000)
+@given(sampled_from(reasonable_operators_for_scalars),
+       hynp.arrays(dtype=hynp.scalar_dtypes(), shape=()),
+       hynp.arrays(dtype=hynp.scalar_dtypes(), shape=()))
+def test_array_scalar_ufunc_equivalence(op, arr1, arr2):
+    """
+    This is a thorough test attempting to cover important promotion paths
+    and ensuring that arrays and scalars stay as aligned as possible.
+    However, if it creates troubles, it should maybe just be removed.
+    """
+    scalar1 = arr1[()]
+    scalar2 = arr2[()]
+    assert isinstance(scalar1, np.generic)
+    assert isinstance(scalar2, np.generic)
+
+    if arr1.dtype.kind == "c" or arr2.dtype.kind == "c":
+        comp_ops = {operator.ge, operator.gt, operator.le, operator.lt}
+        if op in comp_ops and (np.isnan(scalar1) or np.isnan(scalar2)):
+            pytest.xfail("complex comp ufuncs use sort-order, scalars do not.")
+
+    # ignore fpe's since they may just mismatch for integers anyway.
+    with warnings.catch_warnings(), np.errstate(all="ignore"):
+        # Comparisons DeprecationWarnings replacing errors (2022-03):
+        warnings.simplefilter("error", DeprecationWarning)
+        try:
+            res = op(arr1, arr2)
+        except Exception as e:
+            with pytest.raises(type(e)):
+                op(scalar1, scalar2)
+        else:
+            scalar_res = op(scalar1, scalar2)
+            assert_array_equal(scalar_res, res)
+
+
  class TestBaseMath:
      def test_blocked(self):
          # test alignments offsets for simd instructions
@@ -680,17 +725,29 @@ class TestAbs:
  
      @pytest.mark.parametrize("dtype", floating_types + complex_floating_types)
      def test_builtin_abs(self, dtype):
-        if sys.platform == "cygwin" and dtype == np.clongdouble:
+        if (
+                sys.platform == "cygwin" and dtype == np.clongdouble and
+                (
+                    _pep440.parse(platform.release().split("-")[0])
+                    < _pep440.Version("3.3.0")
+                )
+        ):
              pytest.xfail(
-                reason="absl is computed in double precision on cygwin"
+                reason="absl is computed in double precision on cygwin < 3.3"
              )
          self._test_abs_func(abs, dtype)
  
      @pytest.mark.parametrize("dtype", floating_types + complex_floating_types)
      def test_numpy_abs(self, dtype):
-        if sys.platform == "cygwin" and dtype == np.clongdouble:
+        if (
+                sys.platform == "cygwin" and dtype == np.clongdouble and
+                (
+                    _pep440.parse(platform.release().split("-")[0])
+                    < _pep440.Version("3.3.0")
+                )
+        ):
              pytest.xfail(
-                reason="absl is computed in double precision on cygwin"
+                reason="absl is computed in double precision on cygwin < 3.3"
              )
          self._test_abs_func(np.abs, dtype)
  
@@ -740,7 +797,6 @@ class TestHash:
              else:
                  val = float(numpy_val)
              assert val == numpy_val
-            print(repr(numpy_val), repr(val))
              assert hash(val) == hash(numpy_val)
  
          if hash(float(np.nan)) != hash(float(np.nan)):
@@ -766,19 +822,9 @@ def recursionlimit(n):
          sys.setrecursionlimit(o)
  
  
-objecty_things = [object(), None]
-reasonable_operators_for_scalars = [
-    operator.lt, operator.le, operator.eq, operator.ne, operator.ge,
-    operator.gt, operator.add, operator.floordiv, operator.mod,
-    operator.mul, operator.matmul, operator.pow, operator.sub,
-    operator.truediv,
-]
-
-
  @given(sampled_from(objecty_things),
         sampled_from(reasonable_operators_for_scalars),
         sampled_from(types))
-@settings(verbosity=Verbosity.verbose)
  def test_operator_object_left(o, op, type_):
      try:
          with recursionlimit(200):
@@ -832,3 +878,163 @@ def test_clongdouble_inf_loop(op):
          op(None, np.longdouble(3))
      except TypeError:
          pass
+
+
+@pytest.mark.parametrize("dtype", np.typecodes["AllInteger"])
+@pytest.mark.parametrize("operation", [
+        lambda min, max: max + max,
+        lambda min, max: min - max,
+        lambda min, max: max * max], ids=["+", "-", "*"])
+def test_scalar_integer_operation_overflow(dtype, operation):
+    st = np.dtype(dtype).type
+    min = st(np.iinfo(dtype).min)
+    max = st(np.iinfo(dtype).max)
+
+    with pytest.warns(RuntimeWarning, match="overflow encountered"):
+        operation(min, max)
+
+
+@pytest.mark.parametrize("dtype", np.typecodes["Integer"])
+@pytest.mark.parametrize("operation", [
+        lambda min, neg_1: abs(min),
+        lambda min, neg_1: min * neg_1,
+        lambda min, neg_1: min // neg_1], ids=["abs", "*", "//"])
+def test_scalar_signed_integer_overflow(dtype, operation):
+    # The minimum signed integer can "overflow" for some additional operations
+    st = np.dtype(dtype).type
+    min = st(np.iinfo(dtype).min)
+    neg_1 = st(-1)
+
+    with pytest.warns(RuntimeWarning, match="overflow encountered"):
+        operation(min, neg_1)
+
+
+@pytest.mark.parametrize("dtype", np.typecodes["UnsignedInteger"])
+@pytest.mark.xfail  # TODO: the check is quite simply missing!
+def test_scalar_unsigned_integer_overflow(dtype):
+    val = np.dtype(dtype).type(8)
+    with pytest.warns(RuntimeWarning, match="overflow encountered"):
+        -val
+
+
+@pytest.mark.parametrize("dtype", np.typecodes["AllInteger"])
+@pytest.mark.parametrize("operation", [
+        lambda val, zero: val // zero,
+        lambda val, zero: val % zero, ], ids=["//", "%"])
+def test_scalar_integer_operation_divbyzero(dtype, operation):
+    st = np.dtype(dtype).type
+    val = st(100)
+    zero = st(0)
+
+    with pytest.warns(RuntimeWarning, match="divide by zero"):
+        operation(val, zero)
+
+
+ops_with_names = [
+    ("__lt__", "__gt__", operator.lt, True),
+    ("__le__", "__ge__", operator.le, True),
+    ("__eq__", "__eq__", operator.eq, True),
+    # Note __op__ and __rop__ may be identical here:
+    ("__ne__", "__ne__", operator.ne, True),
+    ("__gt__", "__lt__", operator.gt, True),
+    ("__ge__", "__le__", operator.ge, True),
+    ("__floordiv__", "__rfloordiv__", operator.floordiv, False),
+    ("__truediv__", "__rtruediv__", operator.truediv, False),
+    ("__add__", "__radd__", operator.add, False),
+    ("__mod__", "__rmod__", operator.mod, False),
+    ("__mul__", "__rmul__", operator.mul, False),
+    ("__pow__", "__rpow__", operator.pow, False),
+    ("__sub__", "__rsub__", operator.sub, False),
+]
+
+
+@pytest.mark.parametrize(["__op__", "__rop__", "op", "cmp"], ops_with_names)
+@pytest.mark.parametrize("sctype", [np.float32, np.float64, np.longdouble])
+def test_subclass_deferral(sctype, __op__, __rop__, op, cmp):
+    """
+    This test covers scalar subclass deferral.  Note that this is exceedingly
+    complicated, especially since it tends to fall back to the array paths and
+    these additionally add the "array priority" mechanism.
+
+    The behaviour was modified subtly in 1.22 (to make it closer to how Python
+    scalars work).  Due to its complexity, and the fact that subclassing NumPy
+    scalars is probably a bad idea to begin with, there is probably room
+    for adjustments here.
+    """
+    class myf_simple1(sctype):
+        pass
+
+    class myf_simple2(sctype):
+        pass
+
+    def op_func(self, other):
+        return __op__
+
+    def rop_func(self, other):
+        return __rop__
+
+    myf_op = type("myf_op", (sctype,), {__op__: op_func, __rop__: rop_func})
+
+    # inheritance has to override, or this is correctly lost:
+    res = op(myf_simple1(1), myf_simple2(2))
+    assert type(res) == sctype or type(res) == np.bool_
+    assert op(myf_simple1(1), myf_simple2(2)) == op(1, 2)  # inherited
+
+    # Two independent subclasses do not really define an order.  This could
+    # be attempted, but we do not since Python's `int` does not either:
+    assert op(myf_op(1), myf_simple1(2)) == __op__
+    assert op(myf_simple1(1), myf_op(2)) == op(1, 2)  # inherited
+
+
+def test_longdouble_complex():
+    # Simple test to check longdouble and complex combinations, since these
+    # need to go through promotion, which longdouble needs to be careful about.
+    x = np.longdouble(1)
+    assert x + 1j == 1+1j
+    assert 1j + x == 1+1j
+
+
+@pytest.mark.parametrize(["__op__", "__rop__", "op", "cmp"], ops_with_names)
+@pytest.mark.parametrize("subtype", [float, int, complex, np.float16])
+def test_pyscalar_subclasses(subtype, __op__, __rop__, op, cmp):
+    def op_func(self, other):
+        return __op__
+
+    def rop_func(self, other):
+        return __rop__
+
+    # Check that deferring is indicated using `__array_ufunc__`:
+    myt = type("myt", (subtype,),
+               {__op__: op_func, __rop__: rop_func, "__array_ufunc__": None})
+
+    # Just like normally, we should never presume we can modify the float.
+    assert op(myt(1), np.float64(2)) == __op__
+    assert op(np.float64(1), myt(2)) == __rop__
+
+    if op in {operator.mod, operator.floordiv} and subtype == complex:
+        return  # modulo is not supported for complex.  Do not test.
+
+    if __rop__ == __op__:
+        return
+
+    # When no deferring is indicated, subclasses are handled normally.
+    myt = type("myt", (subtype,), {__rop__: rop_func})
+
+    # Check for float32, as a float subclass float64 may behave differently
+    res = op(myt(1), np.float16(2))
+    expected = op(subtype(1), np.float16(2))
+    assert res == expected
+    assert type(res) == type(expected)
+    res = op(np.float32(2), myt(1))
+    expected = op(np.float32(2), subtype(1))
+    assert res == expected
+    assert type(res) == type(expected)
+
+    # Same check for longdouble:
+    res = op(myt(1), np.longdouble(2))
+    expected = op(subtype(1), np.longdouble(2))
+    assert res == expected
+    assert type(res) == type(expected)
+    res = op(np.longdouble(2), myt(1))
+    expected = op(np.longdouble(2), subtype(1))
+    assert res == expected
diff --git a/numpy/core/tests/test_scalarprint.py b/numpy/core/tests/test_scalarprint.py
index ee21d4aa5e0d68376d41690038e55f459c933af5..4deb5a0a4c2f05220a3b07455adca86c3b3a1e83 100644
--- a/numpy/core/tests/test_scalarprint.py
+++ b/numpy/core/tests/test_scalarprint.py
@@ -306,6 +306,7 @@ class TestRealScalars:
              assert_equal(fpos(tp('1.2'), unique=False, precision=4, trim='-'),
                           "1.2" if tp != np.float16 else "1.2002")
              assert_equal(fpos(tp('1.'), trim='-'), "1")
+            assert_equal(fpos(tp('1.001'), precision=1, trim='-'), "1")
  
      @pytest.mark.skipif(not platform.machine().startswith("ppc64"),
                          reason="only applies to ppc float128 values")
diff --git a/numpy/core/tests/test_simd.py b/numpy/core/tests/test_simd.py
index 12a67c44dde5fba30dc168531b2a7d71afb6afb1..e4b5e0c8f474a361b6862086564526ffab27f0be 100644
--- a/numpy/core/tests/test_simd.py
+++ b/numpy/core/tests/test_simd.py
@@ -330,16 +330,18 @@ class _SIMD_FP(_Test_Utility):
          square = self.square(vdata)
          assert square == data_square
  
-    @pytest.mark.parametrize("intrin, func", [("self.ceil", math.ceil),
-    ("self.trunc", math.trunc)])
+    @pytest.mark.parametrize("intrin, func", [("ceil", math.ceil),
+    ("trunc", math.trunc), ("floor", math.floor), ("rint", round)])
      def test_rounding(self, intrin, func):
          """
          Test intrinsics:
+            npyv_rint_##SFX
              npyv_ceil_##SFX
              npyv_trunc_##SFX
+            npyv_floor_##SFX
          """
          intrin_name = intrin
-        intrin = eval(intrin)
+        intrin = getattr(self, intrin)
          pinf, ninf, nan = self._pinfinity(), self._ninfinity(), self._nan()
          # special cases
          round_cases = ((nan, nan), (pinf, pinf), (ninf, ninf))
@@ -347,20 +349,25 @@ class _SIMD_FP(_Test_Utility):
              data_round = [desired]*self.nlanes
              _round = intrin(self.setall(case))
              assert _round == pytest.approx(data_round, nan_ok=True)
+
          for x in range(0, 2**20, 256**2):
              for w in (-1.05, -1.10, -1.15, 1.05, 1.10, 1.15):
-                data = [x*w+a for a in range(self.nlanes)]
-                vdata = self.load(data)
+                data = self.load([(x+a)*w for a in range(self.nlanes)])
                  data_round = [func(x) for x in data]
-                _round = intrin(vdata)
+                _round = intrin(data)
                  assert _round == data_round
+
          # signed zero
-        if "ceil" in intrin_name or "trunc" in intrin_name:
-            for w in (-0.25, -0.30, -0.45):
-                _round = self._to_unsigned(intrin(self.setall(w)))
-                data_round = self._to_unsigned(self.setall(-0.0))
-                assert _round == data_round
-    
+        if intrin_name == "floor":
+            data_szero = (-0.0,)
+        else:
+            data_szero = (-0.0, -0.25, -0.30, -0.45, -0.5)
+
+        for w in data_szero:
+            _round = self._to_unsigned(intrin(self.setall(w)))
+            data_round = self._to_unsigned(self.setall(-0.0))
+            assert _round == data_round
+
      def test_max(self):
          """
          Test intrinsics:
@@ -620,6 +627,27 @@ class _SIMD_ALL(_Test_Utility):
                  assert storen_till[64:] == data_till
                  assert storen_till[:64] == [127]*64 # detect overflow
  
+    @pytest.mark.parametrize("intrin, table_size, elsize", [
+        ("self.lut32", 32, 32),
+        ("self.lut16", 16, 64)
+    ])
+    def test_lut(self, intrin, table_size, elsize):
+        """
+        Test lookup table intrinsics:
+            npyv_lut32_##sfx
+            npyv_lut16_##sfx
+        """
+        if elsize != self._scalar_size():
+            return
+        intrin = eval(intrin)
+        idx_itrin = getattr(self.npyv, f"setall_u{elsize}")
+        table = range(0, table_size)
+        for i in table:
+            broadi = self.setall(i)
+            idx = idx_itrin(i)
+            lut = intrin(table, idx)
+            assert lut == broadi
+
      def test_misc(self):
          broadcast_zero = self.zero()
          assert broadcast_zero == [0] * self.nlanes
diff --git a/numpy/core/tests/test_ufunc.py b/numpy/core/tests/test_ufunc.py
index 292797c6d41144a270dfca33f6e171dcf5a5cdb3..852044d32fcc6d27012c0a43a2e814b976d7154c 100644
--- a/numpy/core/tests/test_ufunc.py
+++ b/numpy/core/tests/test_ufunc.py
@@ -799,6 +799,20 @@ class TestUfunc:
          # the result would be just a scalar `5`, but is broadcast fully:
          assert (out == 5).all()
  
+    @pytest.mark.parametrize(["arr", "out"], [
+                ([2], np.empty(())),
+                ([1, 2], np.empty(1)),
+                (np.ones((4, 3)), np.empty((4, 1)))],
+            ids=["(1,)->()", "(2,)->(1,)", "(4, 3)->(4, 1)"])
+    def test_out_broadcast_errors(self, arr, out):
+        # Output is (currently) allowed to broadcast inputs, but it cannot be
+        # smaller than the actual result.
+        with pytest.raises(ValueError, match="non-broadcastable"):
+            np.positive(arr, out=out)
+
+        with pytest.raises(ValueError, match="non-broadcastable"):
+            np.add(np.ones(()), arr, out=out)
+
      def test_type_cast(self):
          msg = "type cast"
          a = np.arange(6, dtype='short').reshape((2, 3))
@@ -2198,8 +2212,7 @@ class TestUfunc:
          arr_le = np.arange(10, dtype="<i8")
  
          assert np.add.reduce(arr_be) == np.add.reduce(arr_le)
-        assert_array_equal(
-            np.add.accumulate(arr_be), np.add.accumulate(arr_le))
+        assert_array_equal(np.add.accumulate(arr_be), np.add.accumulate(arr_le))
          assert_array_equal(
              np.add.reduceat(arr_be, [1]), np.add.reduceat(arr_le, [1]))
  
@@ -2522,3 +2535,57 @@ def test_ufunc_methods_floaterrors(method):
      with np.errstate(all="raise"):
          with pytest.raises(FloatingPointError):
              method(arr)
+
+
+def _check_neg_zero(value):
+    if value != 0.0:
+        return False
+    if not np.signbit(value.real):
+        return False
+    if value.dtype.kind == "c":
+        return np.signbit(value.imag)
+    return True
+
+@pytest.mark.parametrize("dtype", np.typecodes["AllFloat"])
+def test_addition_negative_zero(dtype):
+    dtype = np.dtype(dtype)
+    if dtype.kind == "c":
+        neg_zero = dtype.type(complex(-0.0, -0.0))
+    else:
+        neg_zero = dtype.type(-0.0)
+
+    arr = np.array(neg_zero)
+    arr2 = np.array(neg_zero)
+
+    assert _check_neg_zero(arr + arr2)
+    # In-place ops may end up on a different path (reduce path) see gh-21211
+    arr += arr2
+    assert _check_neg_zero(arr)
+
+
+@pytest.mark.parametrize("dtype", np.typecodes["AllFloat"])
+@pytest.mark.parametrize("use_initial", [True, False])
+def test_addition_reduce_negative_zero(dtype, use_initial):
+    dtype = np.dtype(dtype)
+    if dtype.kind == "c":
+        neg_zero = dtype.type(complex(-0.0, -0.0))
+    else:
+        neg_zero = dtype.type(-0.0)
+
+    kwargs = {}
+    if use_initial:
+        kwargs["initial"] = neg_zero
+    else:
+        pytest.xfail("-0. propagation in sum currently requires initial")
+
+    # Test various lengths, in case SIMD paths or chunking play a role.
+    # 150 extends beyond the pairwise blocksize; probably not important.
+    for i in range(0, 150):
+        arr = np.array([neg_zero] * i, dtype=dtype)
+        res = np.sum(arr, **kwargs)
+        if i > 0 or use_initial:
+            assert _check_neg_zero(res)
+        else:
+            # `sum([])` should probably be 0.0 and not -0.0 like `sum([-0.0])`
+            assert not np.signbit(res.real)
+            assert not np.signbit(res.imag)
diff --git a/numpy/core/tests/test_umath.py b/numpy/core/tests/test_umath.py
index c0b26e75b2c8402758c81e80e89a2ac1b1f1f315..dd0bb88fff730190907bddadd1f9dccbfe30688d 100644
--- a/numpy/core/tests/test_umath.py
+++ b/numpy/core/tests/test_umath.py
@@ -17,18 +17,8 @@ from numpy.testing import (
      assert_array_max_ulp, assert_allclose, assert_no_warnings, suppress_warnings,
      _gen_alignment_data, assert_array_almost_equal_nulp
      )
+from numpy.testing._private.utils import _glibc_older_than
  
-def get_glibc_version():
-    try:
-        ver = os.confstr('CS_GNU_LIBC_VERSION').rsplit(' ')[1]
-    except Exception as inst:
-        ver = '0.0'
-
-    return ver
-
-
-glibcver = get_glibc_version()
-glibc_older_than = lambda x: (glibcver != '0.0' and glibcver < x)
  
  def on_powerpc():
      """ True if we are running on a Power PC platform."""
@@ -1005,16 +995,17 @@ class TestExp:
  
  class TestSpecialFloats:
      def test_exp_values(self):
-        x = [np.nan,  np.nan, np.inf, 0.]
-        y = [np.nan, -np.nan, np.inf, -np.inf]
-        for dt in ['f', 'd', 'g']:
-            xf = np.array(x, dtype=dt)
-            yf = np.array(y, dtype=dt)
-            assert_equal(np.exp(yf), xf)
+        with np.errstate(under='raise', over='raise'):
+            x = [np.nan,  np.nan, np.inf, 0.]
+            y = [np.nan, -np.nan, np.inf, -np.inf]
+            for dt in ['f', 'd', 'g']:
+                xf = np.array(x, dtype=dt)
+                yf = np.array(y, dtype=dt)
+                assert_equal(np.exp(yf), xf)
  
      # See: https://github.com/numpy/numpy/issues/19192
      @pytest.mark.xfail(
-        glibc_older_than("2.17"),
+        _glibc_older_than("2.17"),
          reason="Older glibc versions may not raise appropriate FP exceptions"
      )
      def test_exp_exceptions(self):
@@ -1262,6 +1253,11 @@ class TestSpecialFloats:
                      assert_raises(FloatingPointError, np.arctanh,
                                    np.array(value, dtype=dt))
  
+    # See: https://github.com/numpy/numpy/issues/20448
+    @pytest.mark.xfail(
+        _glibc_older_than("2.17"),
+        reason="Older glibc versions may not raise appropriate FP exceptions"
+    )
      def test_exp2(self):
          with np.errstate(all='ignore'):
              in_ = [np.nan, -np.nan, np.inf, -np.inf]
@@ -1397,7 +1393,7 @@ class TestAVXFloat32Transcendental:
          M = np.int_(N/20)
          index = np.random.randint(low=0, high=N, size=M)
          x_f32 = np.float32(np.random.uniform(low=-100.,high=100.,size=N))
-        if not glibc_older_than("2.17"):
+        if not _glibc_older_than("2.17"):
              # test coverage for elements > 117435.992f for which glibc is used
              # this is known to be problematic on old glibc, so skip it there
              x_f32[index] = np.float32(10E+10*np.random.rand(M))
@@ -1733,6 +1729,27 @@ class TestMaximum(_FilterInvalids):
          assert_equal(np.maximum(arr1[:6:2], arr2[::3], out=out[::3]), np.array([-2.0, 10., np.nan]))
          assert_equal(out, out_maxtrue)
  
+    def test_precision(self):
+        dtypes = [np.float16, np.float32, np.float64, np.longdouble]
+
+        for dt in dtypes:
+            dtmin = np.finfo(dt).min
+            dtmax = np.finfo(dt).max
+            d1 = dt(0.1)
+            d1_next = np.nextafter(d1, np.inf)
+
+            test_cases = [
+                # v1    v2          expected
+                (dtmin, -np.inf,    dtmin),
+                (dtmax, -np.inf,    dtmax),
+                (d1,    d1_next,    d1_next),
+                (dtmax, np.nan,     np.nan),
+            ]
+
+            for v1, v2, expected in test_cases:
+                assert_equal(np.maximum([v1], [v2]), [expected])
+                assert_equal(np.maximum.reduce([v1, v2]), expected)
+
  
  class TestMinimum(_FilterInvalids):
      def test_reduce(self):
@@ -1804,6 +1821,28 @@ class TestMinimum(_FilterInvalids):
          assert_equal(np.minimum(arr1[:6:2], arr2[::3], out=out[::3]), np.array([-4.0, 1.0, np.nan]))
          assert_equal(out, out_mintrue)
  
+    def test_precision(self):
+        dtypes = [np.float16, np.float32, np.float64, np.longdouble]
+
+        for dt in dtypes:
+            dtmin = np.finfo(dt).min
+            dtmax = np.finfo(dt).max
+            d1 = dt(0.1)
+            d1_next = np.nextafter(d1, np.inf)
+
+            test_cases = [
+                # v1    v2          expected
+                (dtmin, np.inf,     dtmin),
+                (dtmax, np.inf,     dtmax),
+                (d1,    d1_next,    d1),
+                (dtmin, np.nan,     np.nan),
+            ]
+
+            for v1, v2, expected in test_cases:
+                assert_equal(np.minimum([v1], [v2]), [expected])
+                assert_equal(np.minimum.reduce([v1, v2]), expected)
+
+
  class TestFmax(_FilterInvalids):
      def test_reduce(self):
          dflt = np.typecodes['AllFloat']
@@ -1845,6 +1884,27 @@ class TestFmax(_FilterInvalids):
              out = np.array([0,    0, nan], dtype=complex)
              assert_equal(np.fmax(arg1, arg2), out)
  
+    def test_precision(self):
+        dtypes = [np.float16, np.float32, np.float64, np.longdouble]
+
+        for dt in dtypes:
+            dtmin = np.finfo(dt).min
+            dtmax = np.finfo(dt).max
+            d1 = dt(0.1)
+            d1_next = np.nextafter(d1, np.inf)
+
+            test_cases = [
+                # v1    v2          expected
+                (dtmin, -np.inf,    dtmin),
+                (dtmax, -np.inf,    dtmax),
+                (d1,    d1_next,    d1_next),
+                (dtmax, np.nan,     dtmax),
+            ]
+
+            for v1, v2, expected in test_cases:
+                assert_equal(np.fmax([v1], [v2]), [expected])
+                assert_equal(np.fmax.reduce([v1, v2]), expected)
+
  
  class TestFmin(_FilterInvalids):
      def test_reduce(self):
@@ -1887,6 +1947,27 @@ class TestFmin(_FilterInvalids):
              out = np.array([0,    0, nan], dtype=complex)
              assert_equal(np.fmin(arg1, arg2), out)
  
+    def test_precision(self):
+        dtypes = [np.float16, np.float32, np.float64, np.longdouble]
+
+        for dt in dtypes:
+            dtmin = np.finfo(dt).min
+            dtmax = np.finfo(dt).max
+            d1 = dt(0.1)
+            d1_next = np.nextafter(d1, np.inf)
+
+            test_cases = [
+                # v1    v2          expected
+                (dtmin, np.inf,     dtmin),
+                (dtmax, np.inf,     dtmax),
+                (d1,    d1_next,    d1),
+                (dtmin, np.nan,     dtmin),
+            ]
+
+            for v1, v2, expected in test_cases:
+                assert_equal(np.fmin([v1], [v2]), [expected])
+                assert_equal(np.fmin.reduce([v1, v2]), expected)
+
  
  class TestBool:
      def test_exceptions(self):
diff --git a/numpy/core/tests/test_umath_accuracy.py b/numpy/core/tests/test_umath_accuracy.py
index 32e2dca66151aa60a623653a7e2e59b9cdba5fcd..3d4d5b5aadc214d05fc8cf5f0968ee072fca6c04 100644
--- a/numpy/core/tests/test_umath_accuracy.py
+++ b/numpy/core/tests/test_umath_accuracy.py
@@ -5,11 +5,14 @@ import sys
  import pytest
  from ctypes import c_longlong, c_double, c_float, c_int, cast, pointer, POINTER
  from numpy.testing import assert_array_max_ulp
+from numpy.testing._private.utils import _glibc_older_than
  from numpy.core._multiarray_umath import __cpu_features__
  
  IS_AVX = __cpu_features__.get('AVX512F', False) or \
          (__cpu_features__.get('FMA3', False) and __cpu_features__.get('AVX2', False))
-runtest = sys.platform.startswith('linux') and IS_AVX
+# only run on linux with AVX, also avoid old glibc (numpy/numpy#20448).
+runtest = (sys.platform.startswith('linux')
+           and IS_AVX and not _glibc_older_than("2.17"))
  platform_skip = pytest.mark.skipif(not runtest,
                                     reason="avoid testing inconsistent platform "
                                     "library implementations")
diff --git a/numpy/ctypeslib.py b/numpy/ctypeslib.py
index 8d105a248f17fc1c6e73c22e3cbd64def49ada30..c4bafca1bd11448a093ea7758b954541010a305f 100644
--- a/numpy/ctypeslib.py
+++ b/numpy/ctypeslib.py
@@ -170,7 +170,7 @@ def _num_fromflags(flaglist):
      return num
  
  _flagnames = ['C_CONTIGUOUS', 'F_CONTIGUOUS', 'ALIGNED', 'WRITEABLE',
-              'OWNDATA', 'UPDATEIFCOPY', 'WRITEBACKIFCOPY']
+              'OWNDATA', 'WRITEBACKIFCOPY']
  def _flags_fromnum(num):
      res = []
      for key in _flagnames:
@@ -261,7 +261,6 @@ def ndpointer(dtype=None, ndim=None, shape=None, flags=None):
            - WRITEABLE / W
            - ALIGNED / A
            - WRITEBACKIFCOPY / X
-          - UPDATEIFCOPY / U
  
      Returns
      -------
diff --git a/numpy/ctypeslib.pyi b/numpy/ctypeslib.pyi
index 1c396d240173df74e226b2d6f3b37f6facebad12..0313cd82a5a9b900324abc83596c58769b41f56d 100644
--- a/numpy/ctypeslib.pyi
+++ b/numpy/ctypeslib.pyi
@@ -5,21 +5,15 @@ from ctypes import c_int64 as _c_intp
  import os
  import sys
  import ctypes
+from collections.abc import Iterable, Sequence
  from typing import (
      Literal as L,
      Any,
-    List,
      Union,
      TypeVar,
-    Type,
      Generic,
-    Optional,
      overload,
-    Iterable,
      ClassVar,
-    Tuple,
-    Sequence,
-    Dict,
  )
  
  from numpy import (
@@ -39,25 +33,22 @@ from numpy import (
      ulonglong,
      single,
      double,
-    float_,
      longdouble,
      void,
  )
  from numpy.core._internal import _ctypes
  from numpy.core.multiarray import flagsobj
-from numpy.typing import (
+from numpy._typing import (
      # Arrays
-    ArrayLike,
      NDArray,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _ArrayLike,
  
      # Shapes
      _ShapeLike,
  
      # DTypes
      DTypeLike,
-    _SupportsDType,
+    _DTypeLike,
      _VoidDTypeLike,
      _BoolCodes,
      _UByteCodes,
@@ -77,23 +68,15 @@ from numpy.typing import (
  
  # TODO: Add a proper `_Shape` bound once we've got variadic typevars
  _DType = TypeVar("_DType", bound=dtype[Any])
-_DTypeOptional = TypeVar("_DTypeOptional", bound=Optional[dtype[Any]])
+_DTypeOptional = TypeVar("_DTypeOptional", bound=None | dtype[Any])
  _SCT = TypeVar("_SCT", bound=generic)
  
-_DTypeLike = Union[
-    dtype[_SCT],
-    Type[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
  _FlagsKind = L[
      'C_CONTIGUOUS', 'CONTIGUOUS', 'C',
      'F_CONTIGUOUS', 'FORTRAN', 'F',
      'ALIGNED', 'A',
      'WRITEABLE', 'W',
      'OWNDATA', 'O',
-    'UPDATEIFCOPY', 'U',
      'WRITEBACKIFCOPY', 'X',
  ]
  
@@ -104,18 +87,18 @@ class _ndptr(ctypes.c_void_p, Generic[_DTypeOptional]):
      _dtype_: ClassVar[_DTypeOptional]
      _shape_: ClassVar[None]
      _ndim_: ClassVar[None | int]
-    _flags_: ClassVar[None | List[_FlagsKind]]
+    _flags_: ClassVar[None | list[_FlagsKind]]
  
      @overload
      @classmethod
-    def from_param(cls: Type[_ndptr[None]], obj: ndarray[Any, Any]) -> _ctypes: ...
+    def from_param(cls: type[_ndptr[None]], obj: ndarray[Any, Any]) -> _ctypes: ...
      @overload
      @classmethod
-    def from_param(cls: Type[_ndptr[_DType]], obj: ndarray[Any, _DType]) -> _ctypes: ...
+    def from_param(cls: type[_ndptr[_DType]], obj: ndarray[Any, _DType]) -> _ctypes: ...
  
  class _concrete_ndptr(_ndptr[_DType]):
      _dtype_: ClassVar[_DType]
-    _shape_: ClassVar[Tuple[int, ...]]
+    _shape_: ClassVar[tuple[int, ...]]
      @property
      def contents(self) -> ndarray[Any, _DType]: ...
  
@@ -124,7 +107,7 @@ def load_library(
      loader_path: str | bytes | os.PathLike[str] | os.PathLike[bytes],
  ) -> ctypes.CDLL: ...
  
-__all__: List[str]
+__all__: list[str]
  
  c_intp = _c_intp
  
@@ -134,7 +117,7 @@ def ndpointer(
      ndim: int = ...,
      shape: None | _ShapeLike = ...,
      flags: None | _FlagsKind | Iterable[_FlagsKind] | int | flagsobj = ...,
-) -> Type[_ndptr[None]]: ...
+) -> type[_ndptr[None]]: ...
  @overload
  def ndpointer(
      dtype: _DTypeLike[_SCT],
@@ -142,7 +125,7 @@ def ndpointer(
      *,
      shape: _ShapeLike,
      flags: None | _FlagsKind | Iterable[_FlagsKind] | int | flagsobj = ...,
-) -> Type[_concrete_ndptr[dtype[_SCT]]]: ...
+) -> type[_concrete_ndptr[dtype[_SCT]]]: ...
  @overload
  def ndpointer(
      dtype: DTypeLike,
@@ -150,54 +133,54 @@ def ndpointer(
      *,
      shape: _ShapeLike,
      flags: None | _FlagsKind | Iterable[_FlagsKind] | int | flagsobj = ...,
-) -> Type[_concrete_ndptr[dtype[Any]]]: ...
+) -> type[_concrete_ndptr[dtype[Any]]]: ...
  @overload
  def ndpointer(
      dtype: _DTypeLike[_SCT],
      ndim: int = ...,
      shape: None = ...,
      flags: None | _FlagsKind | Iterable[_FlagsKind] | int | flagsobj = ...,
-) -> Type[_ndptr[dtype[_SCT]]]: ...
+) -> type[_ndptr[dtype[_SCT]]]: ...
  @overload
  def ndpointer(
      dtype: DTypeLike,
      ndim: int = ...,
      shape: None = ...,
      flags: None | _FlagsKind | Iterable[_FlagsKind] | int | flagsobj = ...,
-) -> Type[_ndptr[dtype[Any]]]: ...
+) -> type[_ndptr[dtype[Any]]]: ...
  
  @overload
-def as_ctypes_type(dtype: _BoolCodes | _DTypeLike[bool_] | Type[ctypes.c_bool]) -> Type[ctypes.c_bool]: ...
+def as_ctypes_type(dtype: _BoolCodes | _DTypeLike[bool_] | type[ctypes.c_bool]) -> type[ctypes.c_bool]: ...
  @overload
-def as_ctypes_type(dtype: _ByteCodes | _DTypeLike[byte] | Type[ctypes.c_byte]) -> Type[ctypes.c_byte]: ...
+def as_ctypes_type(dtype: _ByteCodes | _DTypeLike[byte] | type[ctypes.c_byte]) -> type[ctypes.c_byte]: ...
  @overload
-def as_ctypes_type(dtype: _ShortCodes | _DTypeLike[short] | Type[ctypes.c_short]) -> Type[ctypes.c_short]: ...
+def as_ctypes_type(dtype: _ShortCodes | _DTypeLike[short] | type[ctypes.c_short]) -> type[ctypes.c_short]: ...
  @overload
-def as_ctypes_type(dtype: _IntCCodes | _DTypeLike[intc] | Type[ctypes.c_int]) -> Type[ctypes.c_int]: ...
+def as_ctypes_type(dtype: _IntCCodes | _DTypeLike[intc] | type[ctypes.c_int]) -> type[ctypes.c_int]: ...
  @overload
-def as_ctypes_type(dtype: _IntCodes | _DTypeLike[int_] | Type[int | ctypes.c_long]) -> Type[ctypes.c_long]: ...
+def as_ctypes_type(dtype: _IntCodes | _DTypeLike[int_] | type[int | ctypes.c_long]) -> type[ctypes.c_long]: ...
  @overload
-def as_ctypes_type(dtype: _LongLongCodes | _DTypeLike[longlong] | Type[ctypes.c_longlong]) -> Type[ctypes.c_longlong]: ...
+def as_ctypes_type(dtype: _LongLongCodes | _DTypeLike[longlong] | type[ctypes.c_longlong]) -> type[ctypes.c_longlong]: ...
  @overload
-def as_ctypes_type(dtype: _UByteCodes | _DTypeLike[ubyte] | Type[ctypes.c_ubyte]) -> Type[ctypes.c_ubyte]: ...
+def as_ctypes_type(dtype: _UByteCodes | _DTypeLike[ubyte] | type[ctypes.c_ubyte]) -> type[ctypes.c_ubyte]: ...
  @overload
-def as_ctypes_type(dtype: _UShortCodes | _DTypeLike[ushort] | Type[ctypes.c_ushort]) -> Type[ctypes.c_ushort]: ...
+def as_ctypes_type(dtype: _UShortCodes | _DTypeLike[ushort] | type[ctypes.c_ushort]) -> type[ctypes.c_ushort]: ...
  @overload
-def as_ctypes_type(dtype: _UIntCCodes | _DTypeLike[uintc] | Type[ctypes.c_uint]) -> Type[ctypes.c_uint]: ...
+def as_ctypes_type(dtype: _UIntCCodes | _DTypeLike[uintc] | type[ctypes.c_uint]) -> type[ctypes.c_uint]: ...
  @overload
-def as_ctypes_type(dtype: _UIntCodes | _DTypeLike[uint] | Type[ctypes.c_ulong]) -> Type[ctypes.c_ulong]: ...
+def as_ctypes_type(dtype: _UIntCodes | _DTypeLike[uint] | type[ctypes.c_ulong]) -> type[ctypes.c_ulong]: ...
  @overload
-def as_ctypes_type(dtype: _ULongLongCodes | _DTypeLike[ulonglong] | Type[ctypes.c_ulonglong]) -> Type[ctypes.c_ulonglong]: ...
+def as_ctypes_type(dtype: _ULongLongCodes | _DTypeLike[ulonglong] | type[ctypes.c_ulonglong]) -> type[ctypes.c_ulonglong]: ...
  @overload
-def as_ctypes_type(dtype: _SingleCodes | _DTypeLike[single] | Type[ctypes.c_float]) -> Type[ctypes.c_float]: ...
+def as_ctypes_type(dtype: _SingleCodes | _DTypeLike[single] | type[ctypes.c_float]) -> type[ctypes.c_float]: ...
  @overload
-def as_ctypes_type(dtype: _DoubleCodes | _DTypeLike[double] | Type[float | ctypes.c_double]) -> Type[ctypes.c_double]: ...
+def as_ctypes_type(dtype: _DoubleCodes | _DTypeLike[double] | type[float | ctypes.c_double]) -> type[ctypes.c_double]: ...
  @overload
-def as_ctypes_type(dtype: _LongDoubleCodes | _DTypeLike[longdouble] | Type[ctypes.c_longdouble]) -> Type[ctypes.c_longdouble]: ...
+def as_ctypes_type(dtype: _LongDoubleCodes | _DTypeLike[longdouble] | type[ctypes.c_longdouble]) -> type[ctypes.c_longdouble]: ...
  @overload
-def as_ctypes_type(dtype: _VoidDTypeLike) -> Type[Any]: ...  # `ctypes.Union` or `ctypes.Structure`
+def as_ctypes_type(dtype: _VoidDTypeLike) -> type[Any]: ...  # `ctypes.Union` or `ctypes.Structure`
  @overload
-def as_ctypes_type(dtype: str) -> Type[Any]: ...
+def as_ctypes_type(dtype: str) -> type[Any]: ...
  
  @overload
  def as_array(obj: ctypes._PointerLike, shape: Sequence[int]) -> NDArray[Any]: ...
diff --git a/numpy/distutils/__init__.py b/numpy/distutils/__init__.py
index 79974d1c220a147159873079b5350f29739206bc..f74ed4d3f6dbed79dd9cd8284ebd596853204398 100644
--- a/numpy/distutils/__init__.py
+++ b/numpy/distutils/__init__.py
@@ -19,6 +19,8 @@ LAPACK, and for setting include paths and similar build options, please see
  
  """
  
+import warnings
+
  # Must import local ccompiler ASAP in order to get
  # customized CCompiler.spawn effective.
  from . import ccompiler
@@ -26,6 +28,17 @@ from . import unixccompiler
  
  from .npy_pkg_config import *
  
+warnings.warn("\n\n"
+    "  `numpy.distutils` is deprecated since NumPy 1.23.0, as a result\n"
+    "  of the deprecation of `distutils` itself. It will be removed for\n"
+    "  Python >= 3.12. For older Python versions it will remain present.\n"
+    "  It is recommended to use `setuptools < 60.0` for those Python versions.\n"
+    "  For more details, see:\n"
+    "    https://numpy.org/devdocs/reference/distutils_status_migration.html \n\n",
+    DeprecationWarning, stacklevel=2
+)
+del warnings
+
  # If numpy is installed, add distutils.test()
  try:
      from . import __config__
diff --git a/numpy/distutils/ccompiler.py b/numpy/distutils/ccompiler.py
index 0bc34326c59ffb24b013edcb121cfbefc1aa67e7..8697fae620dc6cf0b376b5ef0e428be1e08c09e1 100644
--- a/numpy/distutils/ccompiler.py
+++ b/numpy/distutils/ccompiler.py
@@ -120,7 +120,7 @@ def CCompiler_spawn(self, cmd, display=None, env=None):
      display : str or sequence of str, optional
          The text to add to the log file kept by `numpy.distutils`.
          If not given, `display` is equal to `cmd`.
-    env: a dictionary for environment variables, optional
+    env : a dictionary for environment variables, optional
  
      Returns
      -------
diff --git a/numpy/distutils/ccompiler_opt.py b/numpy/distutils/ccompiler_opt.py
index 205a8e060f6932e7558942ec35515569280b4ed1..2019dcb254ee0f7ad78e342f9fe7d2a77a8846f8 100644
--- a/numpy/distutils/ccompiler_opt.py
+++ b/numpy/distutils/ccompiler_opt.py
@@ -177,7 +177,7 @@ class _Config:
  
              If the compiler able to successfully compile the C file then `CCompilerOpt`
              will add a C ``#define`` for it into the main dispatch header, e.g.
-            ```#define {conf_c_prefix}_XXXX`` where ``XXXX`` is the case name in upper case.
+            ``#define {conf_c_prefix}_XXXX`` where ``XXXX`` is the case name in upper case.
  
          **NOTES**:
              * space can be used as separator with options that supports "str or list"
@@ -236,6 +236,7 @@ class _Config:
          x64 = "SSE SSE2 SSE3",
          ppc64 = '', # play it safe
          ppc64le = "VSX VSX2",
+        s390x = '',
          armhf = '', # play it safe
          aarch64 = "NEON NEON_FP16 NEON_VFPV4 ASIMD"
      )
@@ -301,6 +302,16 @@ class _Config:
          VSX2 = dict(interest=2, implies="VSX", implies_detect=False),
          ## Power9/ISA 3.00
          VSX3 = dict(interest=3, implies="VSX2", implies_detect=False),
+        ## Power10/ISA 3.1
+        VSX4 = dict(interest=4, implies="VSX3", implies_detect=False,
+                    extra_checks="VSX4_MMA"),
+        # IBM/Z
+        ## VX(z13) support
+        VX = dict(interest=1, headers="vecintrin.h"),
+        ## Vector-Enhancements Facility
+        VXE = dict(interest=2, implies="VX", implies_detect=False),
+        ## Vector-Enhancements Facility 2
+        VXE2 = dict(interest=3, implies="VXE", implies_detect=False),
          # ARM
          NEON  = dict(interest=1, headers="arm_neon.h"),
          NEON_FP16 = dict(interest=2, implies="NEON"),
@@ -343,7 +354,7 @@ class _Config:
              FMA4   = dict(flags="-mfma4"),
              FMA3   = dict(flags="-mfma"),
              AVX2   = dict(flags="-mavx2"),
-            AVX512F = dict(flags="-mavx512f"),
+            AVX512F = dict(flags="-mavx512f -mno-mmx"),
              AVX512CD = dict(flags="-mavx512cd"),
              AVX512_KNL = dict(flags="-mavx512er -mavx512pf"),
              AVX512_KNM = dict(
@@ -471,15 +482,36 @@ class _Config:
                  ),
                  VSX3 = dict(
                      flags="-mcpu=power9 -mtune=power9", implies_detect=False
+                ),
+                VSX4 = dict(
+                    flags="-mcpu=power10 -mtune=power10", implies_detect=False
                  )
              )
              if self.cc_is_clang:
                  partial["VSX"]["flags"]  = "-maltivec -mvsx"
                  partial["VSX2"]["flags"] = "-mpower8-vector"
                  partial["VSX3"]["flags"] = "-mpower9-vector"
+                partial["VSX4"]["flags"] = "-mpower10-vector"
  
              return partial
  
+        on_zarch = self.cc_on_s390x
+        if on_zarch:
+            partial = dict(
+                VX = dict(
+                    flags="-march=arch11 -mzvector"
+                ),
+                VXE = dict(
+                    flags="-march=arch12", implies_detect=False
+                ),
+                VXE2 = dict(
+                    flags="-march=arch13", implies_detect=False
+                )
+            )
+
+            return partial
+
+
          if self.cc_on_aarch64 and is_unix: return dict(
              NEON = dict(
                  implies="NEON_FP16 NEON_VFPV4 ASIMD", autovec=True
@@ -745,18 +777,18 @@ class _Cache:
  
      Parameters
      ----------
-    cache_path: str or None
+    cache_path : str or None
          The path of cache file, if None then cache in file will disabled.
  
-    *factors:
+    *factors :
          The caching factors that need to utilize next to `conf_cache_factors`.
  
      Attributes
      ----------
-    cache_private: set
+    cache_private : set
          Hold the attributes that need be skipped from "in-memory cache".
  
-    cache_infile: bool
+    cache_infile : bool
          Utilized during initializing this class, to determine if the cache was able
          to loaded from the specified cache path in 'cache_path'.
      """
@@ -882,7 +914,11 @@ class _CCompiler:
      cc_on_x64 : bool
          True when the target architecture is 64-bit x86
      cc_on_ppc64 : bool
-        True when the target architecture is 64-bit big-endian PowerPC
+        True when the target architecture is 64-bit big-endian powerpc
+    cc_on_ppc64le : bool
+        True when the target architecture is 64-bit little-endian powerpc
+    cc_on_s390x : bool
+        True when the target architecture is IBM/ZARCH on linux
      cc_on_armhf : bool
          True when the target architecture is 32-bit ARMv7+
      cc_on_aarch64 : bool
@@ -919,50 +955,57 @@ class _CCompiler:
      def __init__(self):
          if hasattr(self, "cc_is_cached"):
              return
-        #      attr                regex
+        #      attr            regex        compiler-expression
          detect_arch = (
-            ("cc_on_x64",      ".*(x|x86_|amd)64.*"),
-            ("cc_on_x86",      ".*(win32|x86|i386|i686).*"),
-            ("cc_on_ppc64le",  ".*(powerpc|ppc)64(el|le).*"),
-            ("cc_on_ppc64",    ".*(powerpc|ppc)64.*"),
-            ("cc_on_aarch64",  ".*(aarch64|arm64).*"),
-            ("cc_on_armhf",    ".*arm.*"),
+            ("cc_on_x64",      ".*(x|x86_|amd)64.*", ""),
+            ("cc_on_x86",      ".*(win32|x86|i386|i686).*", ""),
+            ("cc_on_ppc64le",  ".*(powerpc|ppc)64(el|le).*", ""),
+            ("cc_on_ppc64",    ".*(powerpc|ppc)64.*", ""),
+            ("cc_on_aarch64",  ".*(aarch64|arm64).*", ""),
+            ("cc_on_armhf",    ".*arm.*", "defined(__ARM_ARCH_7__) || "
+                                          "defined(__ARM_ARCH_7A__)"),
+            ("cc_on_s390x",    ".*s390x.*", ""),
              # undefined platform
-            ("cc_on_noarch",    ""),
+            ("cc_on_noarch",   "", ""),
          )
          detect_compiler = (
-            ("cc_is_gcc",     r".*(gcc|gnu\-g).*"),
-            ("cc_is_clang",    ".*clang.*"),
-            ("cc_is_iccw",     ".*(intelw|intelemw|iccw).*"), # intel msvc like
-            ("cc_is_icc",      ".*(intel|icc).*"), # intel unix like
-            ("cc_is_msvc",     ".*msvc.*"),
+            ("cc_is_gcc",     r".*(gcc|gnu\-g).*", ""),
+            ("cc_is_clang",    ".*clang.*", ""),
+            # intel msvc like
+            ("cc_is_iccw",     ".*(intelw|intelemw|iccw).*", ""),
+            ("cc_is_icc",      ".*(intel|icc).*", ""),  # intel unix like
+            ("cc_is_msvc",     ".*msvc.*", ""),
              # undefined compiler will be treat it as gcc
-            ("cc_is_nocc",     ""),
+            ("cc_is_nocc",     "", ""),
          )
          detect_args = (
-           ("cc_has_debug",  ".*(O0|Od|ggdb|coverage|debug:full).*"),
-           ("cc_has_native", ".*(-march=native|-xHost|/QxHost).*"),
+           ("cc_has_debug",  ".*(O0|Od|ggdb|coverage|debug:full).*", ""),
+           ("cc_has_native", ".*(-march=native|-xHost|/QxHost).*", ""),
             # in case if the class run with -DNPY_DISABLE_OPTIMIZATION
-           ("cc_noopt", ".*DISABLE_OPT.*"),
+           ("cc_noopt", ".*DISABLE_OPT.*", ""),
          )
  
          dist_info = self.dist_info()
          platform, compiler_info, extra_args = dist_info
          # set False to all attrs
          for section in (detect_arch, detect_compiler, detect_args):
-            for attr, rgex in section:
+            for attr, rgex, cexpr in section:
                  setattr(self, attr, False)
  
          for detect, searchin in ((detect_arch, platform), (detect_compiler, compiler_info)):
-            for attr, rgex in detect:
+            for attr, rgex, cexpr in detect:
                  if rgex and not re.match(rgex, searchin, re.IGNORECASE):
                      continue
+                if cexpr and not self.cc_test_cexpr(cexpr):
+                    continue
                  setattr(self, attr, True)
                  break
  
-        for attr, rgex in detect_args:
+        for attr, rgex, cexpr in detect_args:
              if rgex and not re.match(rgex, extra_args, re.IGNORECASE):
                  continue
+            if cexpr and not self.cc_test_cexpr(cexpr):
+                continue
              setattr(self, attr, True)
  
          if self.cc_on_noarch:
@@ -991,7 +1034,8 @@ class _CCompiler:
              self.cc_is_gcc = True
  
          self.cc_march = "unknown"
-        for arch in ("x86", "x64", "ppc64", "ppc64le", "armhf", "aarch64"):
+        for arch in ("x86", "x64", "ppc64", "ppc64le",
+                     "armhf", "aarch64", "s390x"):
              if getattr(self, "cc_on_" + arch):
                  self.cc_march = arch
                  break
@@ -1033,6 +1077,25 @@ class _CCompiler:
              self.dist_log("testing failed", stderr=True)
          return test
  
+    @_Cache.me
+    def cc_test_cexpr(self, cexpr, flags=[]):
+        """
+        Same as the above but supports compile-time expressions.
+        """
+        self.dist_log("testing compiler expression", cexpr)
+        test_path = os.path.join(self.conf_tmp_path, "npy_dist_test_cexpr.c")
+        with open(test_path, "w") as fd:
+            fd.write(textwrap.dedent(f"""\
+               #if !({cexpr})
+                   #error "unsupported expression"
+               #endif
+               int dummy;
+            """))
+        test = self.dist_test(test_path, flags)
+        if not test:
+            self.dist_log("testing failed", stderr=True)
+        return test
+
      def cc_normalize_flags(self, flags):
          """
          Remove the conflicts that caused due gathering implied features flags.
@@ -1071,7 +1134,9 @@ class _CCompiler:
      _cc_normalize_unix_frgx = re.compile(
          # 2- to remove any flags starts with
          # -march, -mcpu, -x(INTEL) and '-m' without '='
-        r"^(?!(-mcpu=|-march=|-x[A-Z0-9\-]))(?!-m[a-z0-9\-\.]*.$)"
+        r"^(?!(-mcpu=|-march=|-x[A-Z0-9\-]|-m[a-z0-9\-\.]*.$))|"
+        # exclude:
+        r"(?:-mzvector)"
      )
      _cc_normalize_unix_krgx = re.compile(
          # 3- keep only the highest of
@@ -1199,12 +1264,12 @@ class _Feature:
  
          Parameters
          ----------
-        names: sequence or None, optional
+        names : sequence or None, optional
              Specify certain CPU features to test it against the **C** compiler.
              if None(default), it will test all current supported features.
              **Note**: feature names must be in upper-case.
  
-        force_flags: list or None, optional
+        force_flags : list or None, optional
              If None(default), default compiler flags for every CPU feature will
              be used during the test.
  
@@ -1273,10 +1338,10 @@ class _Feature:
  
          Parameters
          ----------
-        names: str or sequence of str
+        names : str or sequence of str
              CPU feature name(s) in uppercase.
  
-        keep_origins: bool
+        keep_origins : bool
              if False(default) then the returned set will not contain any
              features from 'names'. This case happens only when two features
              imply each other.
@@ -1466,10 +1531,10 @@ class _Feature:
  
          Parameters
          ----------
-        name: str
+        name : str
              Supported CPU feature name.
  
-        force_flags: list or None, optional
+        force_flags : list or None, optional
              If None(default), the returned flags from `feature_flags()`
              will be used.
  
@@ -1505,10 +1570,10 @@ class _Feature:
  
          Parameters
          ----------
-        name: str
+        name : str
              CPU feature name in uppercase.
  
-        force_flags: list or None, optional
+        force_flags : list or None, optional
              If None(default), default compiler flags for every CPU feature will
              be used during test.
  
@@ -1550,7 +1615,7 @@ class _Feature:
  
          Parameters
          ----------
-        names: str
+        names : str
              CPU feature name in uppercase.
          """
          assert isinstance(name, str)
@@ -1636,10 +1701,10 @@ class _Parse:
  
      Parameters
      ----------
-    cpu_baseline: str or None
+    cpu_baseline : str or None
          minimal set of required CPU features or special options.
  
-    cpu_dispatch: str or None
+    cpu_dispatch : str or None
          dispatched set of additional CPU features or special options.
  
      Special options can be:
@@ -1772,7 +1837,7 @@ class _Parse:
  
          Parameters
          ----------
-        source: str
+        source : str
              the path of **C** source file.
  
          Returns
@@ -2211,7 +2276,7 @@ class CCompilerOpt(_Config, _Distutils, _Cache, _CCompiler, _Feature, _Parse):
              Path of parent directory for the generated headers and wrapped sources.
              If None(default) the files will generated in-place.
  
-        ccompiler: CCompiler
+        ccompiler : CCompiler
              Distutils `CCompiler` instance to be used for compilation.
              If None (default), the provided instance during the initialization
              will be used instead.
@@ -2519,6 +2584,8 @@ class CCompilerOpt(_Config, _Distutils, _Cache, _CCompiler, _Feature, _Parse):
          except OSError:
              pass
  
+        os.makedirs(os.path.dirname(config_path), exist_ok=True)
+
          self.dist_log("generate dispatched config -> ", config_path)
          dispatch_calls = []
          for tar in targets:
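
Note on the detection change above: each detection entry now carries a third field, a compile-time expression, and an attribute is set only when both the regex and that expression pass. The sketch below is a minimal illustration under stated assumptions, not the code from `ccompiler_opt.py`. `cexpr_probe_source`, `detect_first`, and the `test_cexpr` callback are hypothetical names; the callback stands in for `cc_test_cexpr`, which in the diff writes the probe file and compiles it.

```python
import re
import textwrap


def cexpr_probe_source(cexpr):
    """Return C source that compiles only if `cexpr` holds in the preprocessor."""
    return textwrap.dedent(f"""\
        #if !({cexpr})
            #error "unsupported expression"
        #endif
        int dummy;
    """)


def detect_first(table, searchin, test_cexpr):
    """Return the first attr whose regex matches and whose expression passes.

    `test_cexpr` is a hypothetical callback replacing the real compiler probe.
    """
    for attr, rgex, cexpr in table:
        if rgex and not re.match(rgex, searchin, re.IGNORECASE):
            continue
        if cexpr and not test_cexpr(cexpr):
            continue
        return attr
    return None


# A subset of the entries shown in the diff, in the same order.
DETECT_ARCH = (
    ("cc_on_aarch64", ".*(aarch64|arm64).*", ""),
    ("cc_on_armhf", ".*arm.*",
     "defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__)"),
    ("cc_on_s390x", ".*s390x.*", ""),
    # undefined platform: empty regex and expression always pass
    ("cc_on_noarch", "", ""),
)
```

With a callback that always fails, a platform string such as `linux-armv6l` falls through to `cc_on_noarch`, which appears to be the purpose of the added expression on the `cc_on_armhf` entry.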
diff --git a/numpy/distutils/checks/cpu_asimd.c b/numpy/distutils/checks/cpu_asimd.c

index 8df556b6c303bff2a977c6665a7a2e4339b9bd31..6bc9022a58d3cd087d167d354224ded89be91884 100644
--- a/numpy/distutils/checks/cpu_asimd.c
+++ b/numpy/distutils/checks/cpu_asimd.c
@@ -3,9 +3,10 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    float32x4_t v1 = vdupq_n_f32(1.0f), v2 = vdupq_n_f32(2.0f);
+    float *src = (float*)argv[argc-1];
+    float32x4_t v1 = vdupq_n_f32(src[0]), v2 = vdupq_n_f32(src[1]);
      /* MAXMIN */
      int ret  = (int)vgetq_lane_f32(vmaxnmq_f32(v1, v2), 0);
          ret += (int)vgetq_lane_f32(vminnmq_f32(v1, v2), 0);
@@ -13,7 +14,8 @@ int main(void)
      ret += (int)vgetq_lane_f32(vrndq_f32(v1), 0);
  #ifdef __aarch64__
      {
-        float64x2_t vd1 = vdupq_n_f64(1.0), vd2 = vdupq_n_f64(2.0);
+        double *src2 = (double*)argv[argc-1];
+        float64x2_t vd1 = vdupq_n_f64(src2[0]), vd2 = vdupq_n_f64(src2[1]);
          /* MAXMIN */
          ret += (int)vgetq_lane_f64(vmaxnmq_f64(vd1, vd2), 0);
          ret += (int)vgetq_lane_f64(vminnmq_f64(vd1, vd2), 0);
diff --git a/numpy/distutils/checks/cpu_asimddp.c b/numpy/distutils/checks/cpu_asimddp.c

index 0158d13543ad430ae3ffe6cbe20ce23e5696eed1..e7068ce02e19856349873f40d03caff438efb6fe 100644
--- a/numpy/distutils/checks/cpu_asimddp.c
+++ b/numpy/distutils/checks/cpu_asimddp.c
@@ -3,9 +3,10 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    uint8x16_t v1 = vdupq_n_u8((unsigned char)1), v2 = vdupq_n_u8((unsigned char)2);
+    unsigned char *src = (unsigned char*)argv[argc-1];
+    uint8x16_t v1 = vdupq_n_u8(src[0]), v2 = vdupq_n_u8(src[1]);
      uint32x4_t va = vdupq_n_u32(3);
      int ret = (int)vgetq_lane_u32(vdotq_u32(va, v1, v2), 0);
  #ifdef __aarch64__
diff --git a/numpy/distutils/checks/cpu_asimdfhm.c b/numpy/distutils/checks/cpu_asimdfhm.c

index bb437aa403525e75863b91a9de8702a9e6df3ab9..54e328098d17b57445024c9859cd4992492c348a 100644
--- a/numpy/distutils/checks/cpu_asimdfhm.c
+++ b/numpy/distutils/checks/cpu_asimdfhm.c
@@ -3,15 +3,17 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    float16x8_t vhp  = vdupq_n_f16((float16_t)1);
-    float16x4_t vlhp = vdup_n_f16((float16_t)1);
-    float32x4_t vf   = vdupq_n_f32(1.0f);
-    float32x2_t vlf  = vdup_n_f32(1.0f);
+    float16_t *src = (float16_t*)argv[argc-1];
+    float *src2 = (float*)argv[argc-2];
+    float16x8_t vhp  = vdupq_n_f16(src[0]);
+    float16x4_t vlhp = vdup_n_f16(src[1]);
+    float32x4_t vf   = vdupq_n_f32(src2[0]);
+    float32x2_t vlf  = vdup_n_f32(src2[1]);
  
-    int ret  = (int)vget_lane_f32(vfmlal_low_u32(vlf, vlhp, vlhp), 0);
-        ret += (int)vgetq_lane_f32(vfmlslq_high_u32(vf, vhp, vhp), 0);
+    int ret  = (int)vget_lane_f32(vfmlal_low_f16(vlf, vlhp, vlhp), 0);
+        ret += (int)vgetq_lane_f32(vfmlslq_high_f16(vf, vhp, vhp), 0);
  
      return ret;
  }
diff --git a/numpy/distutils/checks/cpu_asimdhp.c b/numpy/distutils/checks/cpu_asimdhp.c

index 80b94000f04e02211bd3458373cd22e25cd37766..e2de0306e0acaeda3b861756e598a132f8e1ca9f 100644
--- a/numpy/distutils/checks/cpu_asimdhp.c
+++ b/numpy/distutils/checks/cpu_asimdhp.c
@@ -3,10 +3,11 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    float16x8_t vhp  = vdupq_n_f16((float16_t)-1);
-    float16x4_t vlhp = vdup_n_f16((float16_t)-1);
+    float16_t *src = (float16_t*)argv[argc-1];
+    float16x8_t vhp  = vdupq_n_f16(src[0]);
+    float16x4_t vlhp = vdup_n_f16(src[1]);
  
      int ret  =  (int)vgetq_lane_f16(vabdq_f16(vhp, vhp), 0);
          ret  += (int)vget_lane_f16(vabd_f16(vlhp, vlhp), 0);
diff --git a/numpy/distutils/checks/cpu_neon.c b/numpy/distutils/checks/cpu_neon.c

index 4eab1f384a72a260d7441b2b97eb46115b3c5b3e..8c64f864dea63cb9c4ee60249e52b1ad528751c7 100644
--- a/numpy/distutils/checks/cpu_neon.c
+++ b/numpy/distutils/checks/cpu_neon.c
@@ -3,12 +3,16 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    float32x4_t v1 = vdupq_n_f32(1.0f), v2 = vdupq_n_f32(2.0f);
+    // passing from untraced pointers to avoid optimizing out any constants
+    // so we can test against the linker.
+    float *src = (float*)argv[argc-1];
+    float32x4_t v1 = vdupq_n_f32(src[0]), v2 = vdupq_n_f32(src[1]);
      int ret = (int)vgetq_lane_f32(vmulq_f32(v1, v2), 0);
  #ifdef __aarch64__
-    float64x2_t vd1 = vdupq_n_f64(1.0), vd2 = vdupq_n_f64(2.0);
+    double *src2 = (double*)argv[argc-2];
+    float64x2_t vd1 = vdupq_n_f64(src2[0]), vd2 = vdupq_n_f64(src2[1]);
      ret += (int)vgetq_lane_f64(vmulq_f64(vd1, vd2), 0);
  #endif
      return ret;
diff --git a/numpy/distutils/checks/cpu_neon_fp16.c b/numpy/distutils/checks/cpu_neon_fp16.c

index 745d2e793c4bd1e411e9cad53eb4ee7501a513dc..f3b949770db66a03a6221a230e75e87f67359759 100644
--- a/numpy/distutils/checks/cpu_neon_fp16.c
+++ b/numpy/distutils/checks/cpu_neon_fp16.c
@@ -3,9 +3,9 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    short z4[] = {0, 0, 0, 0, 0, 0, 0, 0};
-    float32x4_t v_z4 = vcvt_f32_f16((float16x4_t)vld1_s16((const short*)z4));
+    short *src = (short*)argv[argc-1];
+    float32x4_t v_z4 = vcvt_f32_f16((float16x4_t)vld1_s16(src));
      return (int)vgetq_lane_f32(v_z4, 0);
  }
diff --git a/numpy/distutils/checks/cpu_neon_vfpv4.c b/numpy/distutils/checks/cpu_neon_vfpv4.c

index 45f7b5d69da4d5bb2dfca3425411db43d290ce50..a039159ddeed006d62f07250a3a1dbb5abfcb6ac 100644
--- a/numpy/distutils/checks/cpu_neon_vfpv4.c
+++ b/numpy/distutils/checks/cpu_neon_vfpv4.c
@@ -3,16 +3,18 @@
  #endif
  #include <arm_neon.h>
  
-int main(void)
+int main(int argc, char **argv)
  {
-    float32x4_t v1 = vdupq_n_f32(1.0f);
-    float32x4_t v2 = vdupq_n_f32(2.0f);
-    float32x4_t v3 = vdupq_n_f32(3.0f);
+    float *src = (float*)argv[argc-1];
+    float32x4_t v1 = vdupq_n_f32(src[0]);
+    float32x4_t v2 = vdupq_n_f32(src[1]);
+    float32x4_t v3 = vdupq_n_f32(src[2]);
      int ret = (int)vgetq_lane_f32(vfmaq_f32(v1, v2, v3), 0);
  #ifdef __aarch64__
-    float64x2_t vd1 = vdupq_n_f64(1.0);
-    float64x2_t vd2 = vdupq_n_f64(2.0);
-    float64x2_t vd3 = vdupq_n_f64(3.0);
+    double *src2 = (double*)argv[argc-2];
+    float64x2_t vd1 = vdupq_n_f64(src2[0]);
+    float64x2_t vd2 = vdupq_n_f64(src2[1]);
+    float64x2_t vd3 = vdupq_n_f64(src2[2]);
      ret += (int)vgetq_lane_f64(vfmaq_f64(vd1, vd2, vd3), 0);
  #endif
      return ret;
diff --git a/numpy/distutils/checks/cpu_vsx4.c b/numpy/distutils/checks/cpu_vsx4.c

new file mode 100644

index 0000000..a6acc73
--- /dev/null
+++ b/numpy/distutils/checks/cpu_vsx4.c
@@ -0,0 +1,14 @@
+#ifndef __VSX__
+    #error "VSX is not supported"
+#endif
+#include <altivec.h>
+
+typedef __vector unsigned int v_uint32x4;
+
+int main(void)
+{
+    v_uint32x4 v1 = (v_uint32x4){2, 4, 8, 16};
+    v_uint32x4 v2 = (v_uint32x4){2, 2, 2, 2};
+    v_uint32x4 v3 = vec_mod(v1, v2);
+    return (int)vec_extractm(v3);
+}
diff --git a/numpy/distutils/checks/cpu_vx.c b/numpy/distutils/checks/cpu_vx.c

new file mode 100644

index 0000000..18fb7ef
--- /dev/null
+++ b/numpy/distutils/checks/cpu_vx.c
@@ -0,0 +1,16 @@
+#if (__VEC__ < 10301) || (__ARCH__ < 11)
+    #error VX not supported
+#endif
+
+#include <vecintrin.h>
+int main(int argc, char **argv)
+{
+    __vector double x = vec_abs(vec_xl(argc, (double*)argv));
+    __vector double y = vec_load_len((double*)argv, (unsigned int)argc);
+
+    x = vec_round(vec_ceil(x) + vec_floor(y));
+    __vector bool long long m = vec_cmpge(x, y);
+    __vector long long i = vec_signed(vec_sel(x, y, m));
+
+    return (int)vec_extract(i, 0);
+}
diff --git a/numpy/distutils/checks/cpu_vxe.c b/numpy/distutils/checks/cpu_vxe.c

new file mode 100644

index 0000000..e6933ad
--- /dev/null
+++ b/numpy/distutils/checks/cpu_vxe.c
@@ -0,0 +1,25 @@
+#if (__VEC__ < 10302) || (__ARCH__ < 12)
+    #error VXE not supported
+#endif
+
+#include <vecintrin.h>
+int main(int argc, char **argv)
+{
+    __vector float x = vec_nabs(vec_xl(argc, (float*)argv));
+    __vector float y = vec_load_len((float*)argv, (unsigned int)argc);
+    
+    x = vec_round(vec_ceil(x) + vec_floor(y));
+    __vector bool int m = vec_cmpge(x, y);
+    x = vec_sel(x, y, m);
+
+    // need to test the existence of intrin "vflls" since vec_doublee
+    // maps to wrong intrin "vfll".
+    // see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100871
+#if defined(__GNUC__) && !defined(__clang__)
+    __vector long long i = vec_signed(__builtin_s390_vflls(x));
+#else
+    __vector long long i = vec_signed(vec_doublee(x));
+#endif
+
+    return (int)vec_extract(i, 0);
+}
diff --git a/numpy/distutils/checks/cpu_vxe2.c b/numpy/distutils/checks/cpu_vxe2.c

new file mode 100644

index 0000000..f36d571
--- /dev/null
+++ b/numpy/distutils/checks/cpu_vxe2.c
@@ -0,0 +1,21 @@
+#if (__VEC__ < 10303) || (__ARCH__ < 13)
+    #error VXE2 not supported
+#endif
+
+#include <vecintrin.h>
+
+int main(int argc, char **argv)
+{
+    int val;
+    __vector signed short large = { 'a', 'b', 'c', 'a', 'g', 'h', 'g', 'o' };
+    __vector signed short search = { 'g', 'h', 'g', 'o' };
+    __vector unsigned char len = { 0 };
+    __vector unsigned char res = vec_search_string_cc(large, search, len, &val);
+    __vector float x = vec_xl(argc, (float*)argv);
+    __vector int i = vec_signed(x);
+
+    i = vec_srdb(vec_sldb(i, i, 2), i, 3);
+    val += (int)vec_extract(res, 1);
+    val += vec_extract(i, 0);
+    return val;
+}
diff --git a/numpy/distutils/checks/extra_vsx4_mma.c b/numpy/distutils/checks/extra_vsx4_mma.c

new file mode 100644

index 0000000..a70b2a9
--- /dev/null
+++ b/numpy/distutils/checks/extra_vsx4_mma.c
@@ -0,0 +1,21 @@
+#ifndef __VSX__
+    #error "VSX is not supported"
+#endif
+#include <altivec.h>
+
+typedef __vector float fv4sf_t;
+typedef __vector unsigned char vec_t;
+
+int main(void)
+{
+    __vector_quad acc0;
+    float a[4] = {0,1,2,3};
+    float b[4] = {0,1,2,3};
+    vec_t *va = (vec_t *) a;
+    vec_t *vb = (vec_t *) b;
+    __builtin_mma_xvf32ger(&acc0, va[0], vb[0]);
+    fv4sf_t result[4];
+    __builtin_mma_disassemble_acc((void *)result, &acc0);
+    fv4sf_t c0 = result[0];
+    return (int)((float*)&c0)[0];
+}
diff --git a/numpy/distutils/command/build.py b/numpy/distutils/command/build.py

index a4fda537d5dcf6444ccadfae75fefb825f4d72f6..80830d559c61dde4b46dacc7d24e387486476349 100644
--- a/numpy/distutils/command/build.py
+++ b/numpy/distutils/command/build.py
@@ -47,7 +47,8 @@ class build(old_build):
              - not part of dispatch-able features(--cpu-dispatch)
              - not supported by compiler or platform
          """
-        self.simd_test = "BASELINE SSE2 SSE42 XOP FMA4 (FMA3 AVX2) AVX512F AVX512_SKX VSX VSX2 VSX3 NEON ASIMD"
+        self.simd_test = "BASELINE SSE2 SSE42 XOP FMA4 (FMA3 AVX2) AVX512F " \
+                         "AVX512_SKX VSX VSX2 VSX3 VSX4 NEON ASIMD VX VXE VXE2"
  
      def finalize_options(self):
          build_scripts = self.build_scripts
diff --git a/numpy/distutils/fcompiler/gnu.py b/numpy/distutils/fcompiler/gnu.py

index 39178071d511b1b9af298f77534a7f0eb78d3eaa..d8143328e05141c1d860a50a3955aa054095cc32 100644
--- a/numpy/distutils/fcompiler/gnu.py
+++ b/numpy/distutils/fcompiler/gnu.py
@@ -324,7 +324,7 @@ class Gnu95FCompiler(GnuFCompiler):
              c_archs[c_archs.index("i386")] = "i686"
          # check the arches the Fortran compiler supports, and compare with
          # arch flags from C compiler
-        for arch in ["ppc", "i686", "x86_64", "ppc64"]:
+        for arch in ["ppc", "i686", "x86_64", "ppc64", "s390x"]:
              if _can_target(cmd, arch) and arch in c_archs:
                  arch_flags.extend(["-arch", arch])
          return arch_flags
diff --git a/numpy/distutils/fcompiler/intel.py b/numpy/distutils/fcompiler/intel.py

index f97c5b3483e1d52477887c40971b5094f994bcd0..1d606590411048e9bebb2dc04d28e56be89783b3 100644
--- a/numpy/distutils/fcompiler/intel.py
+++ b/numpy/distutils/fcompiler/intel.py
@@ -154,7 +154,8 @@ class IntelVisualFCompiler(BaseIntelFCompiler):
      module_include_switch = '/I'
  
      def get_flags(self):
-        opt = ['/nologo', '/MD', '/nbs', '/names:lowercase', '/assume:underscore']
+        opt = ['/nologo', '/MD', '/nbs', '/names:lowercase', 
+               '/assume:underscore', '/fpp']
          return opt
  
      def get_flags_free(self):
diff --git a/numpy/distutils/mingw32ccompiler.py b/numpy/distutils/mingw32ccompiler.py

index fbe3655c965c5cf3a184b2a8a5557ac311482a58..3349a56e8691bfeac45a616546c82fe962ff52c1 100644
--- a/numpy/distutils/mingw32ccompiler.py
+++ b/numpy/distutils/mingw32ccompiler.py
@@ -37,9 +37,6 @@ def get_msvcr_replacement():
      msvcr = msvc_runtime_library()
      return [] if msvcr is None else [msvcr]
  
-# monkey-patch cygwinccompiler with our updated version from misc_util
-# to avoid getting an exception raised on Python 3.5
-distutils.cygwinccompiler.get_msvcr = get_msvcr_replacement
  
  # Useful to generate table of symbols from a dll
  _START = re.compile(r'\[Ordinal/Name Pointer\] Table')
@@ -215,7 +212,7 @@ def find_python_dll():
      elif implementation == 'PyPy':
          dllname = f'libpypy{major_version}-c.dll'
      else:
-        dllname = 'Unknown platform {implementation}' 
+        dllname = f'Unknown platform {implementation}' 
      print("Looking for %s" % dllname)
      for folder in lib_dirs:
          dll = os.path.join(folder, dllname)
diff --git a/numpy/distutils/misc_util.py b/numpy/distutils/misc_util.py

index 513be75db2c56d67fbae96ece8468cc7afea0ba6..78665d351b6e60d94d5314d8ac0be3ddf339a08a 100644
--- a/numpy/distutils/misc_util.py
+++ b/numpy/distutils/misc_util.py
@@ -694,15 +694,11 @@ def get_shared_lib_extension(is_python_ext=False):
      -----
      For Python shared libs, `so_ext` will typically be '.so' on Linux and OS X,
      and '.pyd' on Windows.  For Python >= 3.2 `so_ext` has a tag prepended on
-    POSIX systems according to PEP 3149.  For Python 3.2 this is implemented on
-    Linux, but not on OS X.
+    POSIX systems according to PEP 3149.
  
      """
      confvars = distutils.sysconfig.get_config_vars()
-    # SO is deprecated in 3.3.1, use EXT_SUFFIX instead
-    so_ext = confvars.get('EXT_SUFFIX', None)
-    if so_ext is None:
-        so_ext = confvars.get('SO', '')
+    so_ext = confvars.get('EXT_SUFFIX', '')
  
      if not is_python_ext:
          # hardcode known values, config vars (including SHLIB_SUFFIX) are
@@ -2352,11 +2348,7 @@ def generate_config_py(target):
              extra_dll_dir = os.path.join(os.path.dirname(__file__), '.libs')
  
              if sys.platform == 'win32' and os.path.isdir(extra_dll_dir):
-                if sys.version_info >= (3, 8):
-                    os.add_dll_directory(extra_dll_dir)
-                else:
-                    os.environ.setdefault('PATH', '')
-                    os.environ['PATH'] += os.pathsep + extra_dll_dir
+                os.add_dll_directory(extra_dll_dir)
  
              """))
  
@@ -2499,4 +2491,3 @@ def exec_mod_from_location(modname, modfile):
      foo = importlib.util.module_from_spec(spec)
      spec.loader.exec_module(foo)
      return foo
-
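
Note on the `misc_util.py` hunks above: with Python 3.8 as the minimum version, the fallback to the deprecated `SO` config variable and the `PATH`-based DLL lookup are no longer needed. The sketch below restates the two simplified code paths as standalone helpers. It is an illustration only: `python_ext_suffix` and `register_dll_dir` are hypothetical names, and it reads the value through the standard-library `sysconfig` module rather than `distutils.sysconfig`, which is what the diff uses.

```python
import os
import sys
import sysconfig


def python_ext_suffix():
    """Return the extension-module suffix, or '' when it is not defined.

    EXT_SUFFIX carries the PEP 3149 tag on POSIX systems.
    """
    return sysconfig.get_config_var("EXT_SUFFIX") or ""


def register_dll_dir(path):
    """Add `path` to the DLL search path on Windows; do nothing elsewhere."""
    if sys.platform == "win32" and os.path.isdir(path):
        # os.add_dll_directory exists on Windows since Python 3.8.
        return os.add_dll_directory(path)
    return None
```

Calling `register_dll_dir` with a directory that does not exist returns `None` on every platform, mirroring the `os.path.isdir` guard in the generated config code.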
diff --git a/numpy/distutils/tests/test_ccompiler_opt.py b/numpy/distutils/tests/test_ccompiler_opt.py

index 1b27ab07c393db0b794e48e6a66e4c92086ea6ad..1ca8bc09bf93029220fba6f2c707fa938d9bd8d6 100644
--- a/numpy/distutils/tests/test_ccompiler_opt.py
+++ b/numpy/distutils/tests/test_ccompiler_opt.py
@@ -32,6 +32,7 @@ arch_compilers = dict(
      ppc64le = ("gcc", "clang"),
      armhf = ("gcc", "clang"),
      aarch64 = ("gcc", "clang"),
+    s390x = ("gcc", "clang"),
      noarch = ("gcc",)
  )
  
@@ -382,18 +383,19 @@ class _Test_CCompilerOpt:
              if o == "native" and self.cc_name() == "msvc":
                  continue
              self.expect(o,
-                trap_files=".*cpu_(sse|vsx|neon).c",
-                x86="", ppc64="", armhf=""
+                trap_files=".*cpu_(sse|vsx|neon|vx).c",
+                x86="", ppc64="", armhf="", s390x=""
              )
              self.expect(o,
-                trap_files=".*cpu_(sse3|vsx2|neon_vfpv4).c",
+                trap_files=".*cpu_(sse3|vsx2|neon_vfpv4|vxe).c",
                  x86="sse sse2", ppc64="vsx", armhf="neon neon_fp16",
-                aarch64="", ppc64le=""
+                aarch64="", ppc64le="", s390x="vx"
              )
              self.expect(o,
                  trap_files=".*cpu_(popcnt|vsx3).c",
                  x86="sse .* sse41", ppc64="vsx vsx2",
-                armhf="neon neon_fp16 .* asimd .*"
+                armhf="neon neon_fp16 .* asimd .*",
+                s390x="vx vxe vxe2"
              )
              self.expect(o,
                  x86_gcc=".* xop fma4 .* avx512f .* avx512_knl avx512_knm avx512_skx .*",
@@ -403,13 +405,14 @@ class _Test_CCompilerOpt:
                  # in msvc, avx512_knl avx512_knm aren't supported
                  x86_msvc=".* xop fma4 .* avx512f .* avx512_skx .*",
                  armhf=".* asimd asimdhp asimddp .*",
-                ppc64="vsx vsx2 vsx3.*"
+                ppc64="vsx vsx2 vsx3 vsx4.*",
+                s390x="vx vxe vxe2.*"
              )
          # min
          self.expect("min",
              x86="sse sse2", x64="sse sse2 sse3",
              armhf="", aarch64="neon neon_fp16 .* asimd",
-            ppc64="", ppc64le="vsx vsx2"
+            ppc64="", ppc64le="vsx vsx2", s390x=""
          )
          self.expect(
              "min", trap_files=".*cpu_(sse2|vsx2).c",
@@ -420,7 +423,7 @@ class _Test_CCompilerOpt:
          try:
              self.expect("native",
                  trap_flags=".*(-march=native|-xHost|/QxHost).*",
-                x86=".*", ppc64=".*", armhf=".*"
+                x86=".*", ppc64=".*", armhf=".*", s390x=".*"
              )
              if self.march() != "unknown":
                  raise AssertionError(
@@ -432,14 +435,15 @@ class _Test_CCompilerOpt:
  
      def test_flags(self):
          self.expect_flags(
-            "sse sse2 vsx vsx2 neon neon_fp16",
+            "sse sse2 vsx vsx2 neon neon_fp16 vx vxe",
              x86_gcc="-msse -msse2", x86_icc="-msse -msse2",
              x86_iccw="/arch:SSE2",
              x86_msvc="/arch:SSE2" if self.march() == "x86" else "",
              ppc64_gcc= "-mcpu=power8",
              ppc64_clang="-maltivec -mvsx -mpower8-vector",
              armhf_gcc="-mfpu=neon-fp16 -mfp16-format=ieee",
-            aarch64=""
+            aarch64="",
+            s390x="-mzvector -march=arch12"
          )
          # testing normalize -march
          self.expect_flags(
@@ -463,6 +467,10 @@ class _Test_CCompilerOpt:
              "asimddp asimdhp asimdfhm",
              aarch64_gcc=r"-march=armv8.2-a\+dotprod\+fp16\+fp16fml"
          )
+        self.expect_flags(
+            "vx vxe vxe2",
+            s390x=r"-mzvector -march=arch13"
+        )
  
      def test_targets_exceptions(self):
          for targets in (
@@ -484,7 +492,7 @@ class _Test_CCompilerOpt:
              try:
                  self.expect_targets(
                      targets,
-                    x86="", armhf="", ppc64=""
+                    x86="", armhf="", ppc64="", s390x=""
                  )
                  if self.march() != "unknown":
                      raise AssertionError(
@@ -496,26 +504,26 @@ class _Test_CCompilerOpt:
  
      def test_targets_syntax(self):
          for targets in (
-            "/*@targets $keep_baseline sse vsx neon*/",
-            "/*@targets,$keep_baseline,sse,vsx,neon*/",
-            "/*@targets*$keep_baseline*sse*vsx*neon*/",
+            "/*@targets $keep_baseline sse vsx neon vx*/",
+            "/*@targets,$keep_baseline,sse,vsx,neon vx*/",
+            "/*@targets*$keep_baseline*sse*vsx*neon*vx*/",
              """
              /*
              ** @targets
-            ** $keep_baseline, sse vsx,neon
+            ** $keep_baseline, sse vsx,neon, vx
              */
              """,
              """
              /*
-            ************@targets*************
-            ** $keep_baseline, sse vsx, neon
-            *********************************
+            ************@targets****************
+            ** $keep_baseline, sse vsx, neon, vx
+            ************************************
              */
              """,
              """
              /*
              /////////////@targets/////////////////
-            //$keep_baseline//sse//vsx//neon
+            //$keep_baseline//sse//vsx//neon//vx
              /////////////////////////////////////
              */
              """,
@@ -523,11 +531,11 @@ class _Test_CCompilerOpt:
              /*
              @targets
              $keep_baseline
-            SSE VSX NEON*/
+            SSE VSX NEON VX*/
              """
          ) :
              self.expect_targets(targets,
-                x86="sse", ppc64="vsx", armhf="neon", unknown=""
+                x86="sse", ppc64="vsx", armhf="neon", s390x="vx", unknown=""
              )
  
      def test_targets(self):
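
Note on `test_targets_syntax` above: every comment variant is expected to yield the same target for a given architecture, whatever separator is used (spaces, commas, `*`, `/`, newlines) and whatever the letter case. A rough way to picture that is the tokenizer below. It is a hypothetical simplification, not the parser in `ccompiler_opt.py`: `tokenize_targets` is an invented name, it treats any run of characters outside `[A-Za-z0-9_$]` as a separator, and it ignores the parenthesized groups exercised in other tests.

```python
import re


def tokenize_targets(comment):
    """Split a `@targets` comment into (options, features), lower-cased.

    Hypothetical helper for illustration; parenthesized groups are not handled.
    """
    start = comment.find("@targets")
    if start == -1:
        return [], []
    body = comment[start + len("@targets"):]
    end = body.find("*/")  # stop at the end of the C comment
    if end != -1:
        body = body[:end]
    tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9_$]+", body) if t]
    options = [t[1:] for t in tokens if t.startswith("$")]
    features = [t for t in tokens if not t.startswith("$")]
    return options, features
```

Under these assumptions, `tokenize_targets("/*@targets*$keep_baseline*sse*vsx*neon*vx*/")` and the upper-case multi-line form both produce the option `keep_baseline` and the features `sse`, `vsx`, `neon`, `vx`.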
@@ -536,37 +544,42 @@ class _Test_CCompilerOpt:
              """
              /*@targets
                  sse sse2 sse41 avx avx2 avx512f
-                vsx vsx2 vsx3
+                vsx vsx2 vsx3 vsx4
                  neon neon_fp16 asimdhp asimddp
+                vx vxe vxe2
              */
              """,
-            baseline="avx vsx2 asimd",
-            x86="avx512f avx2", armhf="asimddp asimdhp", ppc64="vsx3"
+            baseline="avx vsx2 asimd vx vxe",
+            x86="avx512f avx2", armhf="asimddp asimdhp", ppc64="vsx4 vsx3",
+            s390x="vxe2"
          )
          # test skipping non-dispatch features
          self.expect_targets(
              """
              /*@targets
                  sse41 avx avx2 avx512f
-                vsx2 vsx3
+                vsx2 vsx3 vsx4
                  asimd asimdhp asimddp
+                vx vxe vxe2
              */
              """,
-            baseline="", dispatch="sse41 avx2 vsx2 asimd asimddp",
-            x86="avx2 sse41", armhf="asimddp asimd", ppc64="vsx2"
+            baseline="", dispatch="sse41 avx2 vsx2 asimd asimddp vxe2",
+            x86="avx2 sse41", armhf="asimddp asimd", ppc64="vsx2", s390x="vxe2"
          )
          # test skipping features that not supported
          self.expect_targets(
              """
              /*@targets
                  sse2 sse41 avx2 avx512f
-                vsx2 vsx3
+                vsx2 vsx3 vsx4
                  neon asimdhp asimddp
+                vx vxe vxe2
              */
              """,
              baseline="",
-            trap_files=".*(avx2|avx512f|vsx3|asimddp).c",
-            x86="sse41 sse2", ppc64="vsx2", armhf="asimdhp neon"
+            trap_files=".*(avx2|avx512f|vsx3|vsx4|asimddp|vxe2).c",
+            x86="sse41 sse2", ppc64="vsx2", armhf="asimdhp neon",
+            s390x="vxe vx"
          )
          # test skipping features that implies each other
          self.expect_targets(
@@ -598,14 +611,16 @@ class _Test_CCompilerOpt:
                  sse2 sse42 avx2 avx512f
                  vsx2 vsx3
                  neon neon_vfpv4 asimd asimddp
+                vx vxe vxe2
              */
              """,
-            baseline="sse41 avx2 vsx2 asimd vsx3",
+            baseline="sse41 avx2 vsx2 asimd vsx3 vxe",
              x86="avx512f avx2 sse42 sse2",
              ppc64="vsx3 vsx2",
              armhf="asimddp asimd neon_vfpv4 neon",
              # neon, neon_vfpv4, asimd implies each other
-            aarch64="asimddp asimd"
+            aarch64="asimddp asimd",
+            s390x="vxe2 vxe vx"
          )
          # 'keep_sort', leave the sort as-is
          self.expect_targets(
@@ -615,13 +630,15 @@ class _Test_CCompilerOpt:
                  avx512f sse42 avx2 sse2
                  vsx2 vsx3
                  asimd neon neon_vfpv4 asimddp
+                vxe vxe2
              */
              """,
              x86="avx512f sse42 avx2 sse2",
              ppc64="vsx2 vsx3",
              armhf="asimd neon neon_vfpv4 asimddp",
              # neon, neon_vfpv4, asimd implies each other
-            aarch64="asimd asimddp"
+            aarch64="asimd asimddp",
+            s390x="vxe vxe2"
          )
          # 'autovec', skipping features that can't be
          # vectorized by the compiler
@@ -736,11 +753,13 @@ class _Test_CCompilerOpt:
                  (sse41 avx sse42) (sse3 avx2 avx512f)
                  (vsx vsx3 vsx2)
                  (asimddp neon neon_vfpv4 asimd asimdhp)
+                (vx vxe vxe2)
              */
              """,
              x86="avx avx512f",
              ppc64="vsx3",
              armhf=r"\(asimdhp asimddp\)",
+            s390x="vxe2"
          )
          # test compiler variety and avoiding duplicating
          self.expect_targets(
diff --git a/numpy/distutils/tests/test_log.py b/numpy/distutils/tests/test_log.py
index 36f49f592c3946a2c2b98ae7a254b8c0033fee4a..72fddf37370f1b5c81473a24c823a236f9f299bc 100644
--- a/numpy/distutils/tests/test_log.py
+++ b/numpy/distutils/tests/test_log.py
@@ -8,7 +8,9 @@ from numpy.distutils import log
  
  
  def setup_module():
-    log.set_verbosity(2, force=True)  # i.e. DEBUG
+    f = io.StringIO()  # changing verbosity also logs here, capture that
+    with redirect_stdout(f):
+        log.set_verbosity(2, force=True)  # i.e. DEBUG
  
  
  def teardown_module():
diff --git a/numpy/doc/ufuncs.py b/numpy/doc/ufuncs.py
index eecc15083d534e43c65e5adb95c8826e3b3a2b51..c99e9abc99a55a799899579fb0dec9ae4dccf54c 100644
--- a/numpy/doc/ufuncs.py
+++ b/numpy/doc/ufuncs.py
@@ -75,7 +75,7 @@ The axis keyword can be used to specify different axes to reduce: ::
   >>> np.add.reduce(np.arange(10).reshape(2,5),axis=1)
   array([10, 35])
  
-**.accumulate(arr)** applies the binary operator and generates an an
+**.accumulate(arr)** applies the binary operator and generates an
  equivalently shaped array that includes the accumulated amount for each
  element of the array. A couple examples: ::
  
diff --git a/numpy/f2py/__init__.py b/numpy/f2py/__init__.py
index f147f1b970a3c799fbfd37f3ea13734fb1fa0f47..84192f7386d16877790c51423f7c96566fa1f59d 100644
--- a/numpy/f2py/__init__.py
+++ b/numpy/f2py/__init__.py
@@ -47,11 +47,11 @@ def compile(source,
      source_fn : str, optional
          Name of the file where the fortran source is written.
          The default is to use a temporary file with the extension
-        provided by the `extension` parameter
-    extension : {'.f', '.f90'}, optional
+        provided by the ``extension`` parameter
+    extension : ``{'.f', '.f90'}``, optional
          Filename extension if `source_fn` is not provided.
          The extension tells which fortran standard is used.
-        The default is `.f`, which implies F77 standard.
+        The default is ``.f``, which implies F77 standard.
  
          .. versionadded:: 1.11.0
  
@@ -124,7 +124,7 @@ def compile(source,
  
  def get_include():
      """
-    Return the directory that contains the fortranobject.c and .h files.
+    Return the directory that contains the ``fortranobject.c`` and ``.h`` files.
  
      .. note::
  
@@ -145,21 +145,21 @@ def get_include():
  
      Notes
      -----
-    .. versionadded:: 1.22.0
+    .. versionadded:: 1.21.1
  
      Unless the build system you are using has specific support for f2py,
      building a Python extension using a ``.pyf`` signature file is a two-step
      process. For a module ``mymod``:
  
-        - Step 1: run ``python -m numpy.f2py mymod.pyf --quiet``. This
-          generates ``_mymodmodule.c`` and (if needed)
-          ``_fblas-f2pywrappers.f`` files next to ``mymod.pyf``.
-        - Step 2: build your Python extension module. This requires the
-          following source files:
+    * Step 1: run ``python -m numpy.f2py mymod.pyf --quiet``. This
+      generates ``_mymodmodule.c`` and (if needed)
+      ``_fblas-f2pywrappers.f`` files next to ``mymod.pyf``.
+    * Step 2: build your Python extension module. This requires the
+      following source files:
  
-              - ``_mymodmodule.c``
-              - ``_mymod-f2pywrappers.f`` (if it was generated in step 1)
-              - ``fortranobject.c``
+      * ``_mymodmodule.c``
+      * ``_mymod-f2pywrappers.f`` (if it was generated in Step 1)
+      * ``fortranobject.c``
  
      See Also
      --------
@@ -169,32 +169,19 @@ def get_include():
      return os.path.join(os.path.dirname(__file__), 'src')
  
  
-if sys.version_info[:2] >= (3, 7):
-    # module level getattr is only supported in 3.7 onwards
-    # https://www.python.org/dev/peps/pep-0562/
-    def __getattr__(attr):
+def __getattr__(attr):
  
-        # Avoid importing things that aren't needed for building
-        # which might import the main numpy module
-        if attr == "f2py_testing":
-            import numpy.f2py.f2py_testing as f2py_testing
-            return f2py_testing
+    # Avoid importing things that aren't needed for building
+    # which might import the main numpy module
+    if attr == "test":
+        from numpy._pytesttester import PytestTester
+        test = PytestTester(__name__)
+        return test
  
-        elif attr == "test":
-            from numpy._pytesttester import PytestTester
-            test = PytestTester(__name__)
-            return test
-
-        else:
-            raise AttributeError("module {!r} has no attribute "
-                                 "{!r}".format(__name__, attr))
-
-    def __dir__():
-        return list(globals().keys() | {"f2py_testing", "test"})
+    else:
+        raise AttributeError("module {!r} has no attribute "
+                              "{!r}".format(__name__, attr))
  
-else:
-    from . import f2py_testing
  
-    from numpy._pytesttester import PytestTester
-    test = PytestTester(__name__)
-    del PytestTester
+def __dir__():
+    return list(globals().keys() | {"test"})
diff --git a/numpy/f2py/__init__.pyi b/numpy/f2py/__init__.pyi
index e52e12bbd156758838f1ebc9aece1cc7ce19917c..6e3a82cf8f444166e9cda92db637565c30db11c3 100644
--- a/numpy/f2py/__init__.pyi
+++ b/numpy/f2py/__init__.pyi
@@ -1,28 +1,29 @@
  import os
  import subprocess
-from typing import Literal as L, Any, List, Iterable, Dict, overload, TypedDict
+from collections.abc import Iterable
+from typing import Literal as L, Any, overload, TypedDict
  
  from numpy._pytesttester import PytestTester
  
  class _F2PyDictBase(TypedDict):
-    csrc: List[str]
-    h: List[str]
+    csrc: list[str]
+    h: list[str]
  
  class _F2PyDict(_F2PyDictBase, total=False):
-    fsrc: List[str]
-    ltx: List[str]
+    fsrc: list[str]
+    ltx: list[str]
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
  
-def run_main(comline_list: Iterable[str]) -> Dict[str, _F2PyDict]: ...
+def run_main(comline_list: Iterable[str]) -> dict[str, _F2PyDict]: ...
  
  @overload
  def compile(  # type: ignore[misc]
      source: str | bytes,
      modulename: str = ...,
-    extra_args: str | List[str] = ...,
+    extra_args: str | list[str] = ...,
      verbose: bool = ...,
      source_fn: None | str | bytes | os.PathLike[Any] = ...,
      extension: L[".f", ".f90"] = ...,
@@ -32,7 +33,7 @@ def compile(  # type: ignore[misc]
  def compile(
      source: str | bytes,
      modulename: str = ...,
-    extra_args: str | List[str] = ...,
+    extra_args: str | list[str] = ...,
      verbose: bool = ...,
      source_fn: None | str | bytes | os.PathLike[Any] = ...,
      extension: L[".f", ".f90"] = ...,
diff --git a/numpy/f2py/capi_maps.py b/numpy/f2py/capi_maps.py
index 581f946e5a2177c22f888aa7aafecea2c6fe5804..51423c2c7e6011ef2022a67837bd0462ffd8828b 100644
--- a/numpy/f2py/capi_maps.py
+++ b/numpy/f2py/capi_maps.py
@@ -170,6 +170,7 @@ f2cmap_all = {'real': {'': 'float', '4': 'float', '8': 'double',
  
  f2cmap_default = copy.deepcopy(f2cmap_all)
  
+f2cmap_mapped = []
  
  def load_f2cmap_file(f2cmap_file):
      global f2cmap_all
@@ -190,7 +191,7 @@ def load_f2cmap_file(f2cmap_file):
      try:
          outmess('Reading f2cmap from {!r} ...\n'.format(f2cmap_file))
          with open(f2cmap_file, 'r') as f:
-            d = eval(f.read(), {}, {})
+            d = eval(f.read().lower(), {}, {})
          for k, d1 in d.items():
              for k1 in d1.keys():
                  d1[k1.lower()] = d1[k1]
@@ -206,6 +207,7 @@ def load_f2cmap_file(f2cmap_file):
                      f2cmap_all[k][k1] = d[k][k1]
                      outmess('\tMapping "%s(kind=%s)" to "%s"\n' %
                              (k, k1, d[k][k1]))
+                    f2cmap_mapped.append(d[k][k1])
                  else:
                      errmess("\tIgnoring map {'%s':{'%s':'%s'}}: '%s' must be in %s\n" % (
                          k, k1, d[k][k1], d[k][k1], list(c2py_map.keys())))
@@ -504,7 +506,8 @@ def sign2map(a, var):
      varname,ctype,atype
      init,init.r,init.i,pytype
      vardebuginfo,vardebugshowvalue,varshowvalue
-    varrfromat
+    varrformat
+
      intent
      """
      out_a = a
diff --git a/numpy/f2py/cfuncs.py b/numpy/f2py/cfuncs.py
index bdd27adaf4c691eb33eead8cedd3fc144cfb4cca..f69933543918715fc85e1e21525697a23cb0bb12 100644
--- a/numpy/f2py/cfuncs.py
+++ b/numpy/f2py/cfuncs.py
@@ -51,8 +51,6 @@ includes0['math.h'] = '#include <math.h>'
  includes0['string.h'] = '#include <string.h>'
  includes0['setjmp.h'] = '#include <setjmp.h>'
  
-includes['Python.h'] = '#include <Python.h>'
-needs['arrayobject.h'] = ['Python.h']
  includes['arrayobject.h'] = '''#define PY_ARRAY_UNIQUE_SYMBOL PyArray_API
  #include "arrayobject.h"'''
  
@@ -66,7 +64,7 @@ typedefs['unsigned_short'] = 'typedef unsigned short unsigned_short;'
  typedefs['unsigned_long'] = 'typedef unsigned long unsigned_long;'
  typedefs['signed_char'] = 'typedef signed char signed_char;'
  typedefs['long_long'] = """\
-#ifdef _WIN32
+#if defined(NPY_OS_WIN32)
  typedef __int64 long_long;
  #else
  typedef long long long_long;
@@ -74,7 +72,7 @@ typedef unsigned long long unsigned_long_long;
  #endif
  """
  typedefs['unsigned_long_long'] = """\
-#ifdef _WIN32
+#if defined(NPY_OS_WIN32)
  typedef __uint64 long_long;
  #else
  typedef unsigned long long unsigned_long_long;
@@ -574,13 +572,13 @@ cppmacros["F2PY_THREAD_LOCAL_DECL"] = """\
  #ifndef F2PY_THREAD_LOCAL_DECL
  #if defined(_MSC_VER)
  #define F2PY_THREAD_LOCAL_DECL __declspec(thread)
-#elif defined(__MINGW32__) || defined(__MINGW64__)
+#elif defined(NPY_OS_MINGW)
  #define F2PY_THREAD_LOCAL_DECL __thread
  #elif defined(__STDC_VERSION__) \\
        && (__STDC_VERSION__ >= 201112L) \\
        && !defined(__STDC_NO_THREADS__) \\
        && (!defined(__GLIBC__) || __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 12)) \\
-      && !defined(__OpenBSD__)
+      && !defined(NPY_OS_OPENBSD)
  /* __STDC_NO_THREADS__ was first defined in a maintenance release of glibc 2.12,
     see https://lists.gnu.org/archive/html/commit-hurd/2012-07/msg00180.html,
     so `!defined(__STDC_NO_THREADS__)` may give false positive for the existence
diff --git a/numpy/f2py/crackfortran.py b/numpy/f2py/crackfortran.py
index 824d87e4cde07bd42588b23ed5177bacc5a5a408..515bdd78777d519119958eea33f80fb3a0585f41 100755
--- a/numpy/f2py/crackfortran.py
+++ b/numpy/f2py/crackfortran.py
@@ -574,11 +574,16 @@ beginpattern90 = re.compile(
  groupends = (r'end|endprogram|endblockdata|endmodule|endpythonmodule|'
               r'endinterface|endsubroutine|endfunction')
  endpattern = re.compile(
-    beforethisafter % ('', groupends, groupends, r'[\w\s]*'), re.I), 'end'
-endifs = r'(end\s*(if|do|where|select|while|forall|associate|block|critical|enum|team))|(module\s*procedure)'
+    beforethisafter % ('', groupends, groupends, r'.*'), re.I), 'end'
+endifs = r'end\s*(if|do|where|select|while|forall|associate|block|' + \
+         r'critical|enum|team)'
  endifpattern = re.compile(
      beforethisafter % (r'[\w]*?', endifs, endifs, r'[\w\s]*'), re.I), 'endif'
  #
+moduleprocedures = r'module\s*procedure'
+moduleprocedurepattern = re.compile(
+    beforethisafter % ('', moduleprocedures, moduleprocedures, r'.*'), re.I), \
+    'moduleprocedure'
  implicitpattern = re.compile(
      beforethisafter % ('', 'implicit', 'implicit', '.*'), re.I), 'implicit'
  dimensionpattern = re.compile(beforethisafter % (
@@ -727,7 +732,8 @@ def crackline(line, reset=0):
                  callpattern, usepattern, containspattern,
                  entrypattern,
                  f2pyenhancementspattern,
-                multilinepattern
+                multilinepattern,
+                moduleprocedurepattern
                  ]:
          m = pat[0].match(line)
          if m:
@@ -797,6 +803,8 @@ def crackline(line, reset=0):
          expectbegin = 0
      elif pat[1] == 'endif':
          pass
+    elif pat[1] == 'moduleprocedure':
+        analyzeline(m, pat[1], line)
      elif pat[1] == 'contains':
          if ignorecontains:
              return
@@ -884,8 +892,14 @@ def appenddecl(decl, decl2, force=1):
  
  selectpattern = re.compile(
      r'\s*(?P<this>(@\(@.*?@\)@|\*[\d*]+|\*\s*@\(@.*?@\)@|))(?P<after>.*)\Z', re.I)
+typedefpattern = re.compile(
+    r'(?:,(?P<attributes>[\w(),]+))?(::)?(?P<name>\b[a-z$_][\w$]*\b)'
+    r'(?:\((?P<params>[\w,]*)\))?\Z', re.I)
  nameargspattern = re.compile(
      r'\s*(?P<name>\b[\w$]+\b)\s*(@\(@\s*(?P<args>[\w\s,]*)\s*@\)@|)\s*((result(\s*@\(@\s*(?P<result>\b[\w$]+\b)\s*@\)@|))|(bind\s*@\(@\s*(?P<bind>.*)\s*@\)@))*\s*\Z', re.I)
+operatorpattern = re.compile(
+    r'\s*(?P<scheme>(operator|assignment))'
+    r'@\(@\s*(?P<name>[^)]+)\s*@\)@\s*\Z', re.I)
  callnameargspattern = re.compile(
      r'\s*(?P<name>\b[\w$]+\b)\s*@\(@\s*(?P<args>.*)\s*@\)@\s*\Z', re.I)
  real16pattern = re.compile(
@@ -903,11 +917,26 @@ def _is_intent_callback(vdecl):
      return 0
  
  
+def _resolvetypedefpattern(line):
+    line = ''.join(line.split())  # removes whitespace
+    m1 = typedefpattern.match(line)
+    print(line, m1)
+    if m1:
+        attrs = m1.group('attributes')
+        attrs = [a.lower() for a in attrs.split(',')] if attrs else []
+        return m1.group('name'), attrs, m1.group('params')
+    return None, [], None
+
+
  def _resolvenameargspattern(line):
      line = markouterparen(line)
      m1 = nameargspattern.match(line)
      if m1:
          return m1.group('name'), m1.group('args'), m1.group('result'), m1.group('bind')
+    m1 = operatorpattern.match(line)
+    if m1:
+        name = m1.group('scheme') + '(' + m1.group('name') + ')'
+        return name, [], None, None
      m1 = callnameargspattern.match(line)
      if m1:
          return m1.group('name'), m1.group('args'), None, None
@@ -947,7 +976,13 @@ def analyzeline(m, case, line):
              block = 'python module'
          elif re.match(r'abstract\s*interface', block, re.I):
              block = 'abstract interface'
-        name, args, result, bind = _resolvenameargspattern(m.group('after'))
+        if block == 'type':
+            name, attrs, _ = _resolvetypedefpattern(m.group('after'))
+            groupcache[groupcounter]['vars'][name] = dict(attrspec = attrs)
+            args = []
+            result = None
+        else:
+            name, args, result, _ = _resolvenameargspattern(m.group('after'))
          if name is None:
              if block == 'block data':
                  name = '_BLOCK_DATA_'
@@ -1151,6 +1186,9 @@ def analyzeline(m, case, line):
                      continue
              else:
                  k = rmbadname1(m1.group('name'))
+            if case in ['public', 'private'] and \
+               (k == 'operator' or k == 'assignment'):
+                k += m1.group('after')
              if k not in edecl:
                  edecl[k] = {}
              if case == 'dimension':
@@ -1193,6 +1231,9 @@ def analyzeline(m, case, line):
          groupcache[groupcounter]['vars'] = edecl
          if last_name is not None:
              previous_context = ('variable', last_name, groupcounter)
+    elif case == 'moduleprocedure':
+        groupcache[groupcounter]['implementedby'] = \
+            [x.strip() for x in m.group('after').split(',')]
      elif case == 'parameter':
          edecl = groupcache[groupcounter]['vars']
          ll = m.group('after').strip()[1:-1]
@@ -2105,7 +2146,8 @@ def analyzebody(block, args, tab=''):
          else:
              as_ = args
          b = postcrack(b, as_, tab=tab + '\t')
-        if b['block'] in ['interface', 'abstract interface'] and not b['body']:
+        if b['block'] in ['interface', 'abstract interface'] and \
+           not b['body'] and not b['implementedby']:
              if 'f2pyenhancements' not in b:
                  continue
          if b['block'].replace(' ', '') == 'pythonmodule':
@@ -2387,8 +2429,15 @@ def get_parameters(vars, global_params={}):
                  outmess(f'get_parameters[TODO]: '
                          f'implement evaluation of complex expression {v}\n')
  
+            # Handle _dp for gh-6624
+            # Also fixes gh-20460
+            if real16pattern.search(v):
+                v = 8
+            elif real8pattern.search(v):
+                v = 4
              try:
                  params[n] = eval(v, g_params, params)
+
              except Exception as msg:
                  params[n] = v
                  outmess('get_parameters: got "%s" on %s\n' % (msg, repr(v)))
@@ -2647,7 +2696,7 @@ def analyzevars(block):
              n_checks = []
              n_is_input = l_or(isintent_in, isintent_inout,
                                isintent_inplace)(vars[n])
-            if 'dimension' in vars[n]:  # n is array
+            if isarray(vars[n]):  # n is array
                  for i, d in enumerate(vars[n]['dimension']):
                      coeffs_and_deps = dimension_exprs.get(d)
                      if coeffs_and_deps is None:
@@ -2658,15 +2707,22 @@ def analyzevars(block):
                          # may define variables used in dimension
                          # specifications.
                          for v, (solver, deps) in coeffs_and_deps.items():
+                            def compute_deps(v, deps):
+                                for v1 in coeffs_and_deps.get(v, [None, []])[1]:
+                                    if v1 not in deps:
+                                        deps.add(v1)
+                                        compute_deps(v1, deps)
+                            all_deps = set()
+                            compute_deps(v, all_deps)
                              if ((v in n_deps
                                   or '=' in vars[v]
                                   or 'depend' in vars[v])):
                                  # Skip a variable that
                                  # - n depends on
                                  # - has user-defined initialization expression
-                                # - has user-defined dependecies
+                                # - has user-defined dependencies
                                  continue
-                            if solver is not None:
+                            if solver is not None and v not in all_deps:
                                  # v can be solved from d, hence, we
                                  # make it an optional argument with
                                  # initialization expression:
diff --git a/numpy/f2py/f2py2e.py b/numpy/f2py/f2py2e.py
index 4d79c304ae919fa2255e5cd8ee8602288fd3bf8c..10508488dc0432d62fa7720d3b2b3d8c5d2f24f1 100755
--- a/numpy/f2py/f2py2e.py
+++ b/numpy/f2py/f2py2e.py
@@ -18,6 +18,7 @@ import sys
  import os
  import pprint
  import re
+from pathlib import Path
  
  from . import crackfortran
  from . import rules
@@ -82,6 +83,9 @@ Options:
                     file <modulename>module.c or extension module <modulename>.
                     Default is 'untitled'.
  
+  '-include<header>'  Writes additional headers in the C wrapper, can be passed
+                      multiple times, generates #include <header> each time.
+
    --[no-]lower     Do [not] lower the cases in <fortran files>. By default,
                     --lower is assumed with -h key, and --no-lower without -h key.
  
@@ -118,6 +122,7 @@ Options:
  
    --quiet          Run quietly.
    --verbose        Run with extra verbosity.
+  --skip-empty-wrappers   Only generate wrapper files when needed.
    -v               Print f2py version ID and exit.
  
  
@@ -175,6 +180,7 @@ def scaninputline(inputline):
      files, skipfuncs, onlyfuncs, debug = [], [], [], []
      f, f2, f3, f5, f6, f7, f8, f9, f10 = 1, 0, 0, 0, 0, 0, 0, 0, 0
      verbose = 1
+    emptygen = True
      dolc = -1
      dolatexdoc = 0
      dorestdoc = 0
@@ -246,6 +252,8 @@ def scaninputline(inputline):
              f7 = 1
          elif l[:15] in '--include-paths':
              f7 = 1
+        elif l == '--skip-empty-wrappers':
+            emptygen = False
          elif l[0] == '-':
              errmess('Unknown option %s\n' % repr(l))
              sys.exit()
@@ -295,6 +303,7 @@ def scaninputline(inputline):
              'Signature file "%s" exists!!! Use --overwrite-signature to overwrite.\n' % (signsfile))
          sys.exit()
  
+    options['emptygen'] = emptygen
      options['debug'] = debug
      options['verbose'] = verbose
      if dolc == -1 and not signsfile:
@@ -408,14 +417,16 @@ def run_main(comline_list):
      where ``<args>=string.join(<list>,' ')``, but in Python.  Unless
      ``-h`` is used, this function returns a dictionary containing
      information on generated modules and their dependencies on source
-    files.  For example, the command ``f2py -m scalar scalar.f`` can be
-    executed from Python as follows
+    files.
  
      You cannot build extension modules with this function, that is,
-    using ``-c`` is not allowed. Use ``compile`` command instead
+    using ``-c`` is not allowed. Use the ``compile`` command instead.
  
      Examples
      --------
+    The command ``f2py -m scalar scalar.f`` can be executed from Python as
+    follows.
+
      .. literalinclude:: ../../source/f2py/code/results/run_main_session.dat
          :language: python
  
@@ -456,7 +467,7 @@ def run_main(comline_list):
                  errmess(
                      'Tip: If your original code is Fortran source then you must use -m option.\n')
              raise TypeError('All blocks must be python module blocks but got %s' % (
-                repr(postlist[i]['block'])))
+                repr(plist['block'])))
      auxfuncs.debugoptions = options['debug']
      f90mod_rules.options = options
      auxfuncs.wrapfuncs = options['wrapfuncs']
@@ -520,7 +531,7 @@ def run_compile():
          sysinfo_flags = [f[7:] for f in sysinfo_flags]
  
      _reg2 = re.compile(
-        r'--((no-|)(wrap-functions|lower)|debug-capi|quiet)|-include')
+        r'--((no-|)(wrap-functions|lower)|debug-capi|quiet|skip-empty-wrappers)|-include')
      f2py_flags = [_m for _m in sys.argv[1:] if _reg2.match(_m)]
      sys.argv = [_m for _m in sys.argv if _m not in f2py_flags]
      f2py_flags2 = []
diff --git a/numpy/f2py/f2py_testing.py b/numpy/f2py/f2py_testing.py
deleted file mode 100644
index 1f109e6..0000000
--- a/numpy/f2py/f2py_testing.py
+++ /dev/null
@@ -1,46 +0,0 @@
-import sys
-import re
-
-from numpy.testing import jiffies, memusage
-
-
-def cmdline():
-    m = re.compile(r'\A\d+\Z')
-    args = []
-    repeat = 1
-    for a in sys.argv[1:]:
-        if m.match(a):
-            repeat = eval(a)
-        else:
-            args.append(a)
-    f2py_opts = ' '.join(args)
-    return repeat, f2py_opts
-
-
-def run(runtest, test_functions, repeat=1):
-    l = [(t, repr(t.__doc__.split('\n')[1].strip())) for t in test_functions]
-    start_memusage = memusage()
-    diff_memusage = None
-    start_jiffies = jiffies()
-    i = 0
-    while i < repeat:
-        i += 1
-        for t, fname in l:
-            runtest(t)
-            if start_memusage is None:
-                continue
-            if diff_memusage is None:
-                diff_memusage = memusage() - start_memusage
-            else:
-                diff_memusage2 = memusage() - start_memusage
-                if diff_memusage2 != diff_memusage:
-                    print('memory usage change at step %i:' % i,
-                          diff_memusage2 - diff_memusage,
-                          fname)
-                    diff_memusage = diff_memusage2
-    current_memusage = memusage()
-    print('run', repeat * len(test_functions), 'tests',
-          'in %.2f seconds' % ((jiffies() - start_jiffies) / 100.0))
-    if start_memusage:
-        print('initial virtual memory size:', start_memusage, 'bytes')
-        print('current virtual memory size:', current_memusage, 'bytes')
diff --git a/numpy/f2py/rules.py b/numpy/f2py/rules.py
index 78810a0a74a9f4d5ae80000c548952e0dc7090bc..4c99c4cd1ae567fec8a31999212cbc0d7e8ad9a2 100755
--- a/numpy/f2py/rules.py
+++ b/numpy/f2py/rules.py
@@ -50,9 +50,10 @@ $Date: 2005/08/30 08:58:42 $
  Pearu Peterson
  
  """
-import os
+import os, sys
  import time
  import copy
+from pathlib import Path
  
  # __version__.version is now the same as the NumPy version
  from . import __version__
@@ -124,6 +125,10 @@ extern \"C\" {
  #define PY_SSIZE_T_CLEAN
  #endif /* PY_SSIZE_T_CLEAN */
  
+/* Unconditionally included */
+#include <Python.h>
+#include <numpy/npy_os.h>
+
  """ + gentitle("See f2py2e/cfuncs.py: includes") + """
  #includes#
  #includes0#
@@ -1198,8 +1203,8 @@ def buildmodule(m, um):
                      break
  
          if not nb:
-            errmess(
-                'buildmodule: Could not found the body of interfaced routine "%s". Skipping.\n' % (n))
+            print(
+                'buildmodule: Could not find the body of interfaced routine "%s". Skipping.\n' % (n), file=sys.stderr)
              continue
          nb_list = [nb]
          if 'entry' in nb:
@@ -1213,6 +1218,22 @@ def buildmodule(m, um):
              # requiresf90wrapper must be called before buildapi as it
              # rewrites assumed shape arrays as automatic arrays.
              isf90 = requiresf90wrapper(nb)
+            # options is in scope here
+            if options['emptygen']:
+                b_path = options['buildpath']
+                m_name = vrd['modulename']
+                outmess('    Generating possibly empty wrappers"\n')
+                Path(f"{b_path}/{vrd['coutput']}").touch()
+                if isf90:
+                    # f77 + f90 wrappers
+                    outmess(f'    Maybe empty "{m_name}-f2pywrappers2.f90"\n')
+                    Path(f'{b_path}/{m_name}-f2pywrappers2.f90').touch()
+                    outmess(f'    Maybe empty "{m_name}-f2pywrappers.f"\n')
+                    Path(f'{b_path}/{m_name}-f2pywrappers.f').touch()
+                else:
+                    # only f77 wrappers
+                    outmess(f'    Maybe empty "{m_name}-f2pywrappers.f"\n')
+                    Path(f'{b_path}/{m_name}-f2pywrappers.f').touch()
              api, wrap = buildapi(nb)
              if wrap:
                  if isf90:
@@ -1241,6 +1262,9 @@ def buildmodule(m, um):
          rd = dictappend(rd, ar)
  
      needs = cfuncs.get_needs()
+    # Add mapped definitions
+    needs['typedefs'] += [cvar for cvar in capi_maps.f2cmap_mapped #
+                          if cvar in typedef_need_dict.values()]
      code = {}
      for n in needs.keys():
          code[n] = []
diff --git a/numpy/f2py/tests/src/abstract_interface/foo.f90 b/numpy/f2py/tests/src/abstract_interface/foo.f90
new file mode 100644
index 0000000..76d16aa
--- /dev/null
+++ b/numpy/f2py/tests/src/abstract_interface/foo.f90
@@ -0,0 +1,34 @@
+module ops_module
+
+  abstract interface
+    subroutine op(x, y, z)
+      integer, intent(in) :: x, y
+      integer, intent(out) :: z
+    end subroutine
+  end interface
+
+contains
+
+  subroutine foo(x, y, r1, r2)
+    integer, intent(in) :: x, y
+    integer, intent(out) :: r1, r2
+    procedure (op) add1, add2
+    procedure (op), pointer::p
+    p=>add1
+    call p(x, y, r1)
+    p=>add2
+    call p(x, y, r2)
+  end subroutine
+end module
+
+subroutine add1(x, y, z)
+  integer, intent(in) :: x, y
+  integer, intent(out) :: z
+  z = x + y
+end subroutine
+
+subroutine add2(x, y, z)
+  integer, intent(in) :: x, y
+  integer, intent(out) :: z
+  z = x + 2 * y
+end subroutine
diff --git a/numpy/f2py/tests/src/abstract_interface/gh18403_mod.f90 b/numpy/f2py/tests/src/abstract_interface/gh18403_mod.f90
new file mode 100644
index 0000000..36791e4
--- /dev/null
+++ b/numpy/f2py/tests/src/abstract_interface/gh18403_mod.f90
@@ -0,0 +1,6 @@
+module test
+  abstract interface
+    subroutine foo()
+    end subroutine
+  end interface
+end module test
diff --git a/numpy/f2py/tests/src/array_from_pyobj/wrapmodule.c b/numpy/f2py/tests/src/array_from_pyobj/wrapmodule.c
index ea47e05558b7db6550d589492c24cabffddee243..c8ae7b9dc52e41beae90a2e26595a1432552d972 100644
--- a/numpy/f2py/tests/src/array_from_pyobj/wrapmodule.c
+++ b/numpy/f2py/tests/src/array_from_pyobj/wrapmodule.c
@@ -202,7 +202,6 @@ PyMODINIT_FUNC PyInit_test_array_from_pyobj_ext(void) {
    ADDCONST("ENSUREARRAY", NPY_ARRAY_ENSUREARRAY);
    ADDCONST("ALIGNED", NPY_ARRAY_ALIGNED);
    ADDCONST("WRITEABLE", NPY_ARRAY_WRITEABLE);
-  ADDCONST("UPDATEIFCOPY", NPY_ARRAY_UPDATEIFCOPY);
    ADDCONST("WRITEBACKIFCOPY", NPY_ARRAY_WRITEBACKIFCOPY);
  
    ADDCONST("BEHAVED", NPY_ARRAY_BEHAVED);
diff --git a/numpy/f2py/tests/src/block_docstring/foo.f b/numpy/f2py/tests/src/block_docstring/foo.f
new file mode 100644
index 0000000..c8315f1
--- /dev/null
+++ b/numpy/f2py/tests/src/block_docstring/foo.f
@@ -0,0 +1,6 @@
+      SUBROUTINE FOO()
+      INTEGER BAR(2, 3)
+
+      COMMON  /BLOCK/ BAR
+      RETURN
+      END
diff --git a/numpy/f2py/tests/src/callback/foo.f b/numpy/f2py/tests/src/callback/foo.f
new file mode 100644
index 0000000..ba397bb
--- /dev/null
+++ b/numpy/f2py/tests/src/callback/foo.f
@@ -0,0 +1,62 @@
+       subroutine t(fun,a)
+       integer a
+cf2py  intent(out) a
+       external fun
+       call fun(a)
+       end
+
+       subroutine func(a)
+cf2py  intent(in,out) a
+       integer a
+       a = a + 11
+       end
+
+       subroutine func0(a)
+cf2py  intent(out) a
+       integer a
+       a = 11
+       end
+
+       subroutine t2(a)
+cf2py  intent(callback) fun
+       integer a
+cf2py  intent(out) a
+       external fun
+       call fun(a)
+       end
+
+       subroutine string_callback(callback, a)
+       external callback
+       double precision callback
+       double precision a
+       character*1 r
+cf2py  intent(out) a
+       r = 'r'
+       a = callback(r)
+       end
+
+       subroutine string_callback_array(callback, cu, lencu, a)
+       external callback
+       integer callback
+       integer lencu
+       character*8 cu(lencu)
+       integer a
+cf2py  intent(out) a
+
+       a = callback(cu, lencu)
+       end
+
+       subroutine hidden_callback(a, r)
+       external global_f
+cf2py  intent(callback, hide) global_f
+       integer a, r, global_f
+cf2py  intent(out) r
+       r = global_f(a)
+       end
+
+       subroutine hidden_callback2(a, r)
+       external global_f
+       integer a, r, global_f
+cf2py  intent(out) r
+       r = global_f(a)
+       end
diff --git a/numpy/f2py/tests/src/callback/gh17797.f90 b/numpy/f2py/tests/src/callback/gh17797.f90
new file mode 100644
index 0000000..49853af
--- /dev/null
+++ b/numpy/f2py/tests/src/callback/gh17797.f90
@@ -0,0 +1,7 @@
+function gh17797(f, y) result(r)
+  external f
+  integer(8) :: r, f
+  integer(8), dimension(:) :: y
+  r = f(0)
+  r = r + sum(y)
+end function gh17797
diff --git a/numpy/f2py/tests/src/callback/gh18335.f90 b/numpy/f2py/tests/src/callback/gh18335.f90
new file mode 100644
index 0000000..92b6d75
--- /dev/null
+++ b/numpy/f2py/tests/src/callback/gh18335.f90
@@ -0,0 +1,17 @@
+        ! When gh18335_workaround is defined as an extension,
+        ! the issue cannot be reproduced.
+        !subroutine gh18335_workaround(f, y)
+        !  implicit none
+        !  external f
+        !  integer(kind=1) :: y(1)
+        !  call f(y)
+        !end subroutine gh18335_workaround
+
+        function gh18335(f) result (r)
+          implicit none
+          external f
+          integer(kind=1) :: y(1), r
+          y(1) = 123
+          call f(y)
+          r = y(1)
+        end function gh18335
diff --git a/numpy/f2py/tests/src/cli/hi77.f b/numpy/f2py/tests/src/cli/hi77.f
new file mode 100644
index 0000000..8b916eb
--- /dev/null
+++ b/numpy/f2py/tests/src/cli/hi77.f
@@ -0,0 +1,3 @@
+      SUBROUTINE HI
+        PRINT*, "HELLO WORLD"
+      END SUBROUTINE
diff --git a/numpy/f2py/tests/src/cli/hiworld.f90 b/numpy/f2py/tests/src/cli/hiworld.f90
new file mode 100644
index 0000000..981f877
--- /dev/null
+++ b/numpy/f2py/tests/src/cli/hiworld.f90
@@ -0,0 +1,3 @@
+function hi()
+  print*, "Hello World"
+end function
diff --git a/numpy/f2py/tests/src/crackfortran/accesstype.f90 b/numpy/f2py/tests/src/crackfortran/accesstype.f90
new file mode 100644
index 0000000..e2cbd44
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/accesstype.f90
@@ -0,0 +1,13 @@
+module foo
+  public
+  type, private, bind(c) :: a
+     integer :: i
+  end type a
+  type, bind(c) :: b_
+     integer :: j
+  end type b_
+  public :: b_
+  type :: c
+     integer :: k
+  end type c
+end module foo
diff --git a/numpy/f2py/tests/src/crackfortran/foo_deps.f90 b/numpy/f2py/tests/src/crackfortran/foo_deps.f90
new file mode 100644
index 0000000..e327b25
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/foo_deps.f90
@@ -0,0 +1,6 @@
+module foo
+  type bar
+    character(len = 4) :: text
+  end type bar
+  type(bar), parameter :: abar = bar('abar')
+end module foo
diff --git a/numpy/f2py/tests/src/crackfortran/gh15035.f b/numpy/f2py/tests/src/crackfortran/gh15035.f
new file mode 100644
index 0000000..1bb2e67
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/gh15035.f
@@ -0,0 +1,16 @@
+        subroutine subb(k)
+          real(8), intent(inout) :: k(:)
+          k=k+1
+        endsubroutine
+
+        subroutine subc(w,k)
+          real(8), intent(in) :: w(:)
+          real(8), intent(out) :: k(size(w))
+          k=w+1
+        endsubroutine
+
+        function t0(value)
+          character value
+          character t0
+          t0 = value
+        endfunction
diff --git a/numpy/f2py/tests/src/crackfortran/gh17859.f b/numpy/f2py/tests/src/crackfortran/gh17859.f
new file mode 100644
index 0000000..9959538
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/gh17859.f
@@ -0,0 +1,12 @@
+        integer(8) function external_as_statement(fcn)
+        implicit none
+        external fcn
+        integer(8) :: fcn
+        external_as_statement = fcn(0)
+        end
+
+        integer(8) function external_as_attribute(fcn)
+        implicit none
+        integer(8), external :: fcn
+        external_as_attribute = fcn(0)
+        end
diff --git a/numpy/f2py/tests/src/crackfortran/gh2848.f90 b/numpy/f2py/tests/src/crackfortran/gh2848.f90
new file mode 100644
index 0000000..31ea932
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/gh2848.f90
@@ -0,0 +1,13 @@
+      subroutine gh2848( &
+        ! first 2 parameters
+        par1, par2,&
+        ! last 2 parameters
+        par3, par4)
+
+        integer, intent(in)  :: par1, par2
+        integer, intent(out) :: par3, par4
+
+        par3 = par1
+        par4 = par2
+
+      end subroutine gh2848
diff --git a/numpy/f2py/tests/src/crackfortran/operators.f90 b/numpy/f2py/tests/src/crackfortran/operators.f90
new file mode 100644
index 0000000..1d060a3
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/operators.f90
@@ -0,0 +1,49 @@
+module foo
+  type bar
+     character(len = 32) :: item
+  end type bar
+  interface operator(.item.)
+     module procedure item_int, item_real
+  end interface operator(.item.)
+  interface operator(==)
+     module procedure items_are_equal
+  end interface operator(==)
+  interface assignment(=)
+     module procedure get_int, get_real
+  end interface assignment(=)
+contains
+  function item_int(val) result(elem)
+    integer, intent(in) :: val
+    type(bar) :: elem
+
+    write(elem%item, "(I32)") val
+  end function item_int
+
+  function item_real(val) result(elem)
+    real, intent(in) :: val
+    type(bar) :: elem
+
+    write(elem%item, "(1PE32.12)") val
+  end function item_real
+
+  function items_are_equal(val1, val2) result(equal)
+    type(bar), intent(in) :: val1, val2
+    logical :: equal
+
+    equal = (val1%item == val2%item)
+  end function items_are_equal
+
+  subroutine get_real(rval, item)
+    real, intent(out) :: rval
+    type(bar), intent(in) :: item
+
+    read(item%item, *) rval
+  end subroutine get_real
+
+  subroutine get_int(rval, item)
+    integer, intent(out) :: rval
+    type(bar), intent(in) :: item
+
+    read(item%item, *) rval
+  end subroutine get_int
+end module foo
diff --git a/numpy/f2py/tests/src/crackfortran/privatemod.f90 b/numpy/f2py/tests/src/crackfortran/privatemod.f90
new file mode 100644
index 0000000..2674c21
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/privatemod.f90
@@ -0,0 +1,11 @@
+module foo
+  private
+  integer :: a
+  public :: setA
+  integer :: b
+contains
+  subroutine setA(v)
+    integer, intent(in) :: v
+    a = v
+  end subroutine setA
+end module foo
diff --git a/numpy/f2py/tests/src/crackfortran/publicmod.f90 b/numpy/f2py/tests/src/crackfortran/publicmod.f90
new file mode 100644
index 0000000..1db76e3
--- /dev/null
+++ b/numpy/f2py/tests/src/crackfortran/publicmod.f90
@@ -0,0 +1,10 @@
+module foo
+  public
+  integer, private :: a
+  public :: setA
+contains
+  subroutine setA(v)
+    integer, intent(in) :: v
+    a = v
+  end subroutine setA
+end module foo
diff --git a/numpy/f2py/tests/src/f2cmap/.f2py_f2cmap b/numpy/f2py/tests/src/f2cmap/.f2py_f2cmap
new file mode 100644
index 0000000..a4425f8
--- /dev/null
+++ b/numpy/f2py/tests/src/f2cmap/.f2py_f2cmap
@@ -0,0 +1 @@
+dict(real=dict(real32='float', real64='double'), integer=dict(int64='long_long'))
diff --git a/numpy/f2py/tests/src/f2cmap/isoFortranEnvMap.f90 b/numpy/f2py/tests/src/f2cmap/isoFortranEnvMap.f90
new file mode 100644
index 0000000..3f0e12c
--- /dev/null
+++ b/numpy/f2py/tests/src/f2cmap/isoFortranEnvMap.f90
@@ -0,0 +1,9 @@
+      subroutine func1(n, x, res)
+        use, intrinsic :: iso_fortran_env, only: int64, real64
+        implicit none
+        integer(int64), intent(in) :: n
+        real(real64), intent(in) :: x(n)
+        real(real64), intent(out) :: res
+Cf2py   intent(hide) :: n
+        res = sum(x)
+      end
diff --git a/numpy/f2py/tests/src/negative_bounds/issue_20853.f90 b/numpy/f2py/tests/src/negative_bounds/issue_20853.f90
new file mode 100644
index 0000000..bf1fa92
--- /dev/null
+++ b/numpy/f2py/tests/src/negative_bounds/issue_20853.f90
@@ -0,0 +1,7 @@
+subroutine foo(is_, ie_, arr, tout)
+ implicit none
+ integer :: is_,ie_
+ real, intent(in) :: arr(is_:ie_)
+ real, intent(out) :: tout(is_:ie_)
+ tout = arr
+end
diff --git a/numpy/f2py/tests/src/quoted_character/foo.f b/numpy/f2py/tests/src/quoted_character/foo.f
new file mode 100644
index 0000000..9dc1cfa
--- /dev/null
+++ b/numpy/f2py/tests/src/quoted_character/foo.f
@@ -0,0 +1,14 @@
+      SUBROUTINE FOO(OUT1, OUT2, OUT3, OUT4, OUT5, OUT6)
+      CHARACTER SINGLE, DOUBLE, SEMICOL, EXCLA, OPENPAR, CLOSEPAR
+      PARAMETER (SINGLE="'", DOUBLE='"', SEMICOL=';', EXCLA="!",
+     1           OPENPAR="(", CLOSEPAR=")")
+      CHARACTER OUT1, OUT2, OUT3, OUT4, OUT5, OUT6
+Cf2py intent(out) OUT1, OUT2, OUT3, OUT4, OUT5, OUT6
+      OUT1 = SINGLE
+      OUT2 = DOUBLE
+      OUT3 = SEMICOL
+      OUT4 = EXCLA
+      OUT5 = OPENPAR
+      OUT6 = CLOSEPAR
+      RETURN
+      END
diff --git a/numpy/f2py/tests/src/return_character/foo77.f b/numpy/f2py/tests/src/return_character/foo77.f
new file mode 100644
index 0000000..facae10
--- /dev/null
+++ b/numpy/f2py/tests/src/return_character/foo77.f
@@ -0,0 +1,45 @@
+       function t0(value)
+         character value
+         character t0
+         t0 = value
+       end
+       function t1(value)
+         character*1 value
+         character*1 t1
+         t1 = value
+       end
+       function t5(value)
+         character*5 value
+         character*5 t5
+         t5 = value
+       end
+       function ts(value)
+         character*(*) value
+         character*(*) ts
+         ts = value
+       end
+
+       subroutine s0(t0,value)
+         character value
+         character t0
+cf2py    intent(out) t0
+         t0 = value
+       end
+       subroutine s1(t1,value)
+         character*1 value
+         character*1 t1
+cf2py    intent(out) t1
+         t1 = value
+       end
+       subroutine s5(t5,value)
+         character*5 value
+         character*5 t5
+cf2py    intent(out) t5
+         t5 = value
+       end
+       subroutine ss(ts,value)
+         character*(*) value
+         character*10 ts
+cf2py    intent(out) ts
+         ts = value
+       end
diff --git a/numpy/f2py/tests/src/return_character/foo90.f90 b/numpy/f2py/tests/src/return_character/foo90.f90
new file mode 100644
index 0000000..36182bc
--- /dev/null
+++ b/numpy/f2py/tests/src/return_character/foo90.f90
@@ -0,0 +1,48 @@
+module f90_return_char
+  contains
+       function t0(value)
+         character :: value
+         character :: t0
+         t0 = value
+       end function t0
+       function t1(value)
+         character(len=1) :: value
+         character(len=1) :: t1
+         t1 = value
+       end function t1
+       function t5(value)
+         character(len=5) :: value
+         character(len=5) :: t5
+         t5 = value
+       end function t5
+       function ts(value)
+         character(len=*) :: value
+         character(len=10) :: ts
+         ts = value
+       end function ts
+
+       subroutine s0(t0,value)
+         character :: value
+         character :: t0
+!f2py    intent(out) t0
+         t0 = value
+       end subroutine s0
+       subroutine s1(t1,value)
+         character(len=1) :: value
+         character(len=1) :: t1
+!f2py    intent(out) t1
+         t1 = value
+       end subroutine s1
+       subroutine s5(t5,value)
+         character(len=5) :: value
+         character(len=5) :: t5
+!f2py    intent(out) t5
+         t5 = value
+       end subroutine s5
+       subroutine ss(ts,value)
+         character(len=*) :: value
+         character(len=10) :: ts
+!f2py    intent(out) ts
+         ts = value
+       end subroutine ss
+end module f90_return_char
diff --git a/numpy/f2py/tests/src/return_complex/foo77.f b/numpy/f2py/tests/src/return_complex/foo77.f
new file mode 100644
index 0000000..37a1ec8
--- /dev/null
+++ b/numpy/f2py/tests/src/return_complex/foo77.f
@@ -0,0 +1,45 @@
+       function t0(value)
+         complex value
+         complex t0
+         t0 = value
+       end
+       function t8(value)
+         complex*8 value
+         complex*8 t8
+         t8 = value
+       end
+       function t16(value)
+         complex*16 value
+         complex*16 t16
+         t16 = value
+       end
+       function td(value)
+         double complex value
+         double complex td
+         td = value
+       end
+
+       subroutine s0(t0,value)
+         complex value
+         complex t0
+cf2py    intent(out) t0
+         t0 = value
+       end
+       subroutine s8(t8,value)
+         complex*8 value
+         complex*8 t8
+cf2py    intent(out) t8
+         t8 = value
+       end
+       subroutine s16(t16,value)
+         complex*16 value
+         complex*16 t16
+cf2py    intent(out) t16
+         t16 = value
+       end
+       subroutine sd(td,value)
+         double complex value
+         double complex td
+cf2py    intent(out) td
+         td = value
+       end
diff --git a/numpy/f2py/tests/src/return_complex/foo90.f90 b/numpy/f2py/tests/src/return_complex/foo90.f90
new file mode 100644
index 0000000..adc27b4
--- /dev/null
+++ b/numpy/f2py/tests/src/return_complex/foo90.f90
@@ -0,0 +1,48 @@
+module f90_return_complex
+  contains
+       function t0(value)
+         complex :: value
+         complex :: t0
+         t0 = value
+       end function t0
+       function t8(value)
+         complex(kind=4) :: value
+         complex(kind=4) :: t8
+         t8 = value
+       end function t8
+       function t16(value)
+         complex(kind=8) :: value
+         complex(kind=8) :: t16
+         t16 = value
+       end function t16
+       function td(value)
+         double complex :: value
+         double complex :: td
+         td = value
+       end function td
+
+       subroutine s0(t0,value)
+         complex :: value
+         complex :: t0
+!f2py    intent(out) t0
+         t0 = value
+       end subroutine s0
+       subroutine s8(t8,value)
+         complex(kind=4) :: value
+         complex(kind=4) :: t8
+!f2py    intent(out) t8
+         t8 = value
+       end subroutine s8
+       subroutine s16(t16,value)
+         complex(kind=8) :: value
+         complex(kind=8) :: t16
+!f2py    intent(out) t16
+         t16 = value
+       end subroutine s16
+       subroutine sd(td,value)
+         double complex :: value
+         double complex :: td
+!f2py    intent(out) td
+         td = value
+       end subroutine sd
+end module f90_return_complex
diff --git a/numpy/f2py/tests/src/return_integer/foo77.f b/numpy/f2py/tests/src/return_integer/foo77.f
new file mode 100644
index 0000000..1ab895b
--- /dev/null
+++ b/numpy/f2py/tests/src/return_integer/foo77.f
@@ -0,0 +1,56 @@
+       function t0(value)
+         integer value
+         integer t0
+         t0 = value
+       end
+       function t1(value)
+         integer*1 value
+         integer*1 t1
+         t1 = value
+       end
+       function t2(value)
+         integer*2 value
+         integer*2 t2
+         t2 = value
+       end
+       function t4(value)
+         integer*4 value
+         integer*4 t4
+         t4 = value
+       end
+       function t8(value)
+         integer*8 value
+         integer*8 t8
+         t8 = value
+       end
+
+       subroutine s0(t0,value)
+         integer value
+         integer t0
+cf2py    intent(out) t0
+         t0 = value
+       end
+       subroutine s1(t1,value)
+         integer*1 value
+         integer*1 t1
+cf2py    intent(out) t1
+         t1 = value
+       end
+       subroutine s2(t2,value)
+         integer*2 value
+         integer*2 t2
+cf2py    intent(out) t2
+         t2 = value
+       end
+       subroutine s4(t4,value)
+         integer*4 value
+         integer*4 t4
+cf2py    intent(out) t4
+         t4 = value
+       end
+       subroutine s8(t8,value)
+         integer*8 value
+         integer*8 t8
+cf2py    intent(out) t8
+         t8 = value
+       end
diff --git a/numpy/f2py/tests/src/return_integer/foo90.f90 b/numpy/f2py/tests/src/return_integer/foo90.f90
new file mode 100644
index 0000000..ba9249a
--- /dev/null
+++ b/numpy/f2py/tests/src/return_integer/foo90.f90
@@ -0,0 +1,59 @@
+module f90_return_integer
+  contains
+       function t0(value)
+         integer :: value
+         integer :: t0
+         t0 = value
+       end function t0
+       function t1(value)
+         integer(kind=1) :: value
+         integer(kind=1) :: t1
+         t1 = value
+       end function t1
+       function t2(value)
+         integer(kind=2) :: value
+         integer(kind=2) :: t2
+         t2 = value
+       end function t2
+       function t4(value)
+         integer(kind=4) :: value
+         integer(kind=4) :: t4
+         t4 = value
+       end function t4
+       function t8(value)
+         integer(kind=8) :: value
+         integer(kind=8) :: t8
+         t8 = value
+       end function t8
+
+       subroutine s0(t0,value)
+         integer :: value
+         integer :: t0
+!f2py    intent(out) t0
+         t0 = value
+       end subroutine s0
+       subroutine s1(t1,value)
+         integer(kind=1) :: value
+         integer(kind=1) :: t1
+!f2py    intent(out) t1
+         t1 = value
+       end subroutine s1
+       subroutine s2(t2,value)
+         integer(kind=2) :: value
+         integer(kind=2) :: t2
+!f2py    intent(out) t2
+         t2 = value
+       end subroutine s2
+       subroutine s4(t4,value)
+         integer(kind=4) :: value
+         integer(kind=4) :: t4
+!f2py    intent(out) t4
+         t4 = value
+       end subroutine s4
+       subroutine s8(t8,value)
+         integer(kind=8) :: value
+         integer(kind=8) :: t8
+!f2py    intent(out) t8
+         t8 = value
+       end subroutine s8
+end module f90_return_integer
diff --git a/numpy/f2py/tests/src/return_logical/foo77.f b/numpy/f2py/tests/src/return_logical/foo77.f
new file mode 100644
index 0000000..ef53014
--- /dev/null
+++ b/numpy/f2py/tests/src/return_logical/foo77.f
@@ -0,0 +1,56 @@
+       function t0(value)
+         logical value
+         logical t0
+         t0 = value
+       end
+       function t1(value)
+         logical*1 value
+         logical*1 t1
+         t1 = value
+       end
+       function t2(value)
+         logical*2 value
+         logical*2 t2
+         t2 = value
+       end
+       function t4(value)
+         logical*4 value
+         logical*4 t4
+         t4 = value
+       end
+c       function t8(value)
+c         logical*8 value
+c         logical*8 t8
+c         t8 = value
+c       end
+
+       subroutine s0(t0,value)
+         logical value
+         logical t0
+cf2py    intent(out) t0
+         t0 = value
+       end
+       subroutine s1(t1,value)
+         logical*1 value
+         logical*1 t1
+cf2py    intent(out) t1
+         t1 = value
+       end
+       subroutine s2(t2,value)
+         logical*2 value
+         logical*2 t2
+cf2py    intent(out) t2
+         t2 = value
+       end
+       subroutine s4(t4,value)
+         logical*4 value
+         logical*4 t4
+cf2py    intent(out) t4
+         t4 = value
+       end
+c       subroutine s8(t8,value)
+c         logical*8 value
+c         logical*8 t8
+cf2py    intent(out) t8
+c         t8 = value
+c       end
diff --git a/numpy/f2py/tests/src/return_logical/foo90.f90 b/numpy/f2py/tests/src/return_logical/foo90.f90
new file mode 100644
index 0000000..a452646
--- /dev/null
+++ b/numpy/f2py/tests/src/return_logical/foo90.f90
@@ -0,0 +1,59 @@
+module f90_return_logical
+  contains
+       function t0(value)
+         logical :: value
+         logical :: t0
+         t0 = value
+       end function t0
+       function t1(value)
+         logical(kind=1) :: value
+         logical(kind=1) :: t1
+         t1 = value
+       end function t1
+       function t2(value)
+         logical(kind=2) :: value
+         logical(kind=2) :: t2
+         t2 = value
+       end function t2
+       function t4(value)
+         logical(kind=4) :: value
+         logical(kind=4) :: t4
+         t4 = value
+       end function t4
+       function t8(value)
+         logical(kind=8) :: value
+         logical(kind=8) :: t8
+         t8 = value
+       end function t8
+
+       subroutine s0(t0,value)
+         logical :: value
+         logical :: t0
+!f2py    intent(out) t0
+         t0 = value
+       end subroutine s0
+       subroutine s1(t1,value)
+         logical(kind=1) :: value
+         logical(kind=1) :: t1
+!f2py    intent(out) t1
+         t1 = value
+       end subroutine s1
+       subroutine s2(t2,value)
+         logical(kind=2) :: value
+         logical(kind=2) :: t2
+!f2py    intent(out) t2
+         t2 = value
+       end subroutine s2
+       subroutine s4(t4,value)
+         logical(kind=4) :: value
+         logical(kind=4) :: t4
+!f2py    intent(out) t4
+         t4 = value
+       end subroutine s4
+       subroutine s8(t8,value)
+         logical(kind=8) :: value
+         logical(kind=8) :: t8
+!f2py    intent(out) t8
+         t8 = value
+       end subroutine s8
+end module f90_return_logical
diff --git a/numpy/f2py/tests/src/return_real/foo77.f b/numpy/f2py/tests/src/return_real/foo77.f
new file mode 100644
index 0000000..bf43dbf
--- /dev/null
+++ b/numpy/f2py/tests/src/return_real/foo77.f
@@ -0,0 +1,45 @@
+       function t0(value)
+         real value
+         real t0
+         t0 = value
+       end
+       function t4(value)
+         real*4 value
+         real*4 t4
+         t4 = value
+       end
+       function t8(value)
+         real*8 value
+         real*8 t8
+         t8 = value
+       end
+       function td(value)
+         double precision value
+         double precision td
+         td = value
+       end
+
+       subroutine s0(t0,value)
+         real value
+         real t0
+cf2py    intent(out) t0
+         t0 = value
+       end
+       subroutine s4(t4,value)
+         real*4 value
+         real*4 t4
+cf2py    intent(out) t4
+         t4 = value
+       end
+       subroutine s8(t8,value)
+         real*8 value
+         real*8 t8
+cf2py    intent(out) t8
+         t8 = value
+       end
+       subroutine sd(td,value)
+         double precision value
+         double precision td
+cf2py    intent(out) td
+         td = value
+       end
diff --git a/numpy/f2py/tests/src/return_real/foo90.f90 b/numpy/f2py/tests/src/return_real/foo90.f90
new file mode 100644
index 0000000..df97199
--- /dev/null
+++ b/numpy/f2py/tests/src/return_real/foo90.f90
@@ -0,0 +1,48 @@
+module f90_return_real
+  contains
+       function t0(value)
+         real :: value
+         real :: t0
+         t0 = value
+       end function t0
+       function t4(value)
+         real(kind=4) :: value
+         real(kind=4) :: t4
+         t4 = value
+       end function t4
+       function t8(value)
+         real(kind=8) :: value
+         real(kind=8) :: t8
+         t8 = value
+       end function t8
+       function td(value)
+         double precision :: value
+         double precision :: td
+         td = value
+       end function td
+
+       subroutine s0(t0,value)
+         real :: value
+         real :: t0
+!f2py    intent(out) t0
+         t0 = value
+       end subroutine s0
+       subroutine s4(t4,value)
+         real(kind=4) :: value
+         real(kind=4) :: t4
+!f2py    intent(out) t4
+         t4 = value
+       end subroutine s4
+       subroutine s8(t8,value)
+         real(kind=8) :: value
+         real(kind=8) :: t8
+!f2py    intent(out) t8
+         t8 = value
+       end subroutine s8
+       subroutine sd(td,value)
+         double precision :: value
+         double precision :: td
+!f2py    intent(out) td
+         td = value
+       end subroutine sd
+end module f90_return_real
diff --git a/numpy/f2py/tests/src/string/fixed_string.f90 b/numpy/f2py/tests/src/string/fixed_string.f90
new file mode 100644
index 0000000..7fd1585
--- /dev/null
+++ b/numpy/f2py/tests/src/string/fixed_string.f90
@@ -0,0 +1,34 @@
+function sint(s) result(i)
+   implicit none
+   character(len=*) :: s
+   integer :: j, i
+   i = 0
+   do j=len(s), 1, -1
+    if (.not.((i.eq.0).and.(s(j:j).eq.' '))) then
+      i = i + ichar(s(j:j)) * 10 ** (j - 1)
+    endif
+   end do
+   return
+ end function sint
+
+ function test_in_bytes4(a) result (i)
+   implicit none
+   integer :: sint
+   character(len=4) :: a
+   integer :: i
+   i = sint(a)
+   a(1:1) = 'A'
+   return
+ end function test_in_bytes4
+
+ function test_inout_bytes4(a) result (i)
+   implicit none
+   integer :: sint
+   character(len=4), intent(inout) :: a
+   integer :: i
+   if (a(1:1).ne.' ') then
+     a(1:1) = 'E'
+   endif
+   i = sint(a)
+   return
+ end function test_inout_bytes4
diff --git a/numpy/f2py/tests/src/string/string.f b/numpy/f2py/tests/src/string/string.f
new file mode 100644
index 0000000..5210ca4
--- /dev/null
+++ b/numpy/f2py/tests/src/string/string.f
@@ -0,0 +1,12 @@
+C FILE: STRING.F
+      SUBROUTINE FOO(A,B,C,D)
+      CHARACTER*5 A, B
+      CHARACTER*(*) C,D
+Cf2py intent(in) a,c
+Cf2py intent(inout) b,d
+      A(1:1) = 'A'
+      B(1:1) = 'B'
+      C(1:1) = 'C'
+      D(1:1) = 'D'
+      END
+C END OF FILE STRING.F
diff --git a/numpy/f2py/tests/test_abstract_interface.py b/numpy/f2py/tests/test_abstract_interface.py
index 936c1f7bc9aef28179c4d7527e1ec2214441b777..29e4b064777b3523005515be49918a7805be5392 100644
--- a/numpy/f2py/tests/test_abstract_interface.py
+++ b/numpy/f2py/tests/test_abstract_interface.py
@@ -1,66 +1,22 @@
+from pathlib import Path
  import textwrap
  from . import util
  from numpy.f2py import crackfortran
  
  
  class TestAbstractInterface(util.F2PyTest):
-    suffix = '.f90'
+    sources = [util.getpath("tests", "src", "abstract_interface", "foo.f90")]
  
-    skip = ['add1', 'add2']
-
-    code = textwrap.dedent("""
-        module ops_module
-
-          abstract interface
-            subroutine op(x, y, z)
-              integer, intent(in) :: x, y
-              integer, intent(out) :: z
-            end subroutine
-          end interface
-
-        contains
-
-          subroutine foo(x, y, r1, r2)
-            integer, intent(in) :: x, y
-            integer, intent(out) :: r1, r2
-            procedure (op) add1, add2
-            procedure (op), pointer::p
-            p=>add1
-            call p(x, y, r1)
-            p=>add2
-            call p(x, y, r2)
-          end subroutine
-        end module
-
-        subroutine add1(x, y, z)
-          integer, intent(in) :: x, y
-          integer, intent(out) :: z
-          z = x + y
-        end subroutine
-
-        subroutine add2(x, y, z)
-          integer, intent(in) :: x, y
-          integer, intent(out) :: z
-          z = x + 2 * y
-        end subroutine
-        """)
+    skip = ["add1", "add2"]
  
      def test_abstract_interface(self):
          assert self.module.ops_module.foo(3, 5) == (8, 13)
  
-    def test_parse_abstract_interface(self, tmp_path):
+    def test_parse_abstract_interface(self):
          # Test gh18403
-        f_path = tmp_path / "gh18403_mod.f90"
-        with f_path.open('w') as ff:
-            ff.write(textwrap.dedent("""\
-                module test
-                  abstract interface
-                    subroutine foo()
-                    end subroutine
-                  end interface
-                end module test
-                """))
-        mod = crackfortran.crackfortran([str(f_path)])
+        fpath = util.getpath("tests", "src", "abstract_interface",
+                             "gh18403_mod.f90")
+        mod = crackfortran.crackfortran([str(fpath)])
          assert len(mod) == 1
-        assert len(mod[0]['body']) == 1
-        assert mod[0]['body'][0]['block'] == 'abstract interface'
+        assert len(mod[0]["body"]) == 1
+        assert mod[0]["body"][0]["block"] == "abstract interface"
diff --git a/numpy/f2py/tests/test_array_from_pyobj.py b/numpy/f2py/tests/test_array_from_pyobj.py
index 649fd1c4863ba605daf00c7bd273bb995b356d72..5a084bc3e974524809dd3aec1afd77edd6809da9 100644
--- a/numpy/f2py/tests/test_array_from_pyobj.py
+++ b/numpy/f2py/tests/test_array_from_pyobj.py
@@ -6,7 +6,6 @@ import pytest
  
  import numpy as np
  
-from numpy.testing import assert_, assert_equal
  from numpy.core.multiarray import typeinfo
  from . import util
  
@@ -31,11 +30,13 @@ def setup_module():
                               define_macros=[])
          """
          d = os.path.dirname(__file__)
-        src = [os.path.join(d, 'src', 'array_from_pyobj', 'wrapmodule.c'),
-               os.path.join(d, '..', 'src', 'fortranobject.c'),
-               os.path.join(d, '..', 'src', 'fortranobject.h')]
+        src = [
+            util.getpath("tests", "src", "array_from_pyobj", "wrapmodule.c"),
+            util.getpath("src", "fortranobject.c"),
+            util.getpath("src", "fortranobject.h"),
+        ]
          wrap = util.build_module_distutils(src, config_code,
-                                           'test_array_from_pyobj_ext')
+                                           "test_array_from_pyobj_ext")
  
  
  def flags_info(arr):
@@ -45,39 +46,48 @@ def flags_info(arr):
  
  def flags2names(flags):
      info = []
-    for flagname in ['CONTIGUOUS', 'FORTRAN', 'OWNDATA', 'ENSURECOPY',
-                     'ENSUREARRAY', 'ALIGNED', 'NOTSWAPPED', 'WRITEABLE',
-                     'WRITEBACKIFCOPY', 'UPDATEIFCOPY', 'BEHAVED', 'BEHAVED_RO',
-                     'CARRAY', 'FARRAY'
-                     ]:
+    for flagname in [
+            "CONTIGUOUS",
+            "FORTRAN",
+            "OWNDATA",
+            "ENSURECOPY",
+            "ENSUREARRAY",
+            "ALIGNED",
+            "NOTSWAPPED",
+            "WRITEABLE",
+            "WRITEBACKIFCOPY",
+            "BEHAVED",
+            "BEHAVED_RO",
+            "CARRAY",
+            "FARRAY",
+    ]:
          if abs(flags) & getattr(wrap, flagname, 0):
              info.append(flagname)
      return info
  
  
  class Intent:
-
      def __init__(self, intent_list=[]):
          self.intent_list = intent_list[:]
          flags = 0
          for i in intent_list:
-            if i == 'optional':
+            if i == "optional":
                  flags |= wrap.F2PY_OPTIONAL
              else:
-                flags |= getattr(wrap, 'F2PY_INTENT_' + i.upper())
+                flags |= getattr(wrap, "F2PY_INTENT_" + i.upper())
          self.flags = flags
  
      def __getattr__(self, name):
          name = name.lower()
-        if name == 'in_':
-            name = 'in'
+        if name == "in_":
+            name = "in"
          return self.__class__(self.intent_list + [name])
  
      def __str__(self):
-        return 'intent(%s)' % (','.join(self.intent_list))
+        return "intent(%s)" % (",".join(self.intent_list))
  
      def __repr__(self):
-        return 'Intent(%r)' % (self.intent_list)
+        return "Intent(%r)" % (self.intent_list)
  
      def is_intent(self, *names):
          for name in names:
@@ -88,32 +98,46 @@ class Intent:
      def is_intent_exact(self, *names):
          return len(self.intent_list) == len(names) and self.is_intent(*names)
  
-intent = Intent()
-
-_type_names = ['BOOL', 'BYTE', 'UBYTE', 'SHORT', 'USHORT', 'INT', 'UINT',
-               'LONG', 'ULONG', 'LONGLONG', 'ULONGLONG',
-               'FLOAT', 'DOUBLE', 'CFLOAT']
-
-_cast_dict = {'BOOL': ['BOOL']}
-_cast_dict['BYTE'] = _cast_dict['BOOL'] + ['BYTE']
-_cast_dict['UBYTE'] = _cast_dict['BOOL'] + ['UBYTE']
-_cast_dict['BYTE'] = ['BYTE']
-_cast_dict['UBYTE'] = ['UBYTE']
-_cast_dict['SHORT'] = _cast_dict['BYTE'] + ['UBYTE', 'SHORT']
-_cast_dict['USHORT'] = _cast_dict['UBYTE'] + ['BYTE', 'USHORT']
-_cast_dict['INT'] = _cast_dict['SHORT'] + ['USHORT', 'INT']
-_cast_dict['UINT'] = _cast_dict['USHORT'] + ['SHORT', 'UINT']
  
-_cast_dict['LONG'] = _cast_dict['INT'] + ['LONG']
-_cast_dict['ULONG'] = _cast_dict['UINT'] + ['ULONG']
-
-_cast_dict['LONGLONG'] = _cast_dict['LONG'] + ['LONGLONG']
-_cast_dict['ULONGLONG'] = _cast_dict['ULONG'] + ['ULONGLONG']
-
-_cast_dict['FLOAT'] = _cast_dict['SHORT'] + ['USHORT', 'FLOAT']
-_cast_dict['DOUBLE'] = _cast_dict['INT'] + ['UINT', 'FLOAT', 'DOUBLE']
+intent = Intent()
  
-_cast_dict['CFLOAT'] = _cast_dict['FLOAT'] + ['CFLOAT']
+_type_names = [
+    "BOOL",
+    "BYTE",
+    "UBYTE",
+    "SHORT",
+    "USHORT",
+    "INT",
+    "UINT",
+    "LONG",
+    "ULONG",
+    "LONGLONG",
+    "ULONGLONG",
+    "FLOAT",
+    "DOUBLE",
+    "CFLOAT",
+]
+
+_cast_dict = {"BOOL": ["BOOL"]}
+_cast_dict["BYTE"] = _cast_dict["BOOL"] + ["BYTE"]
+_cast_dict["UBYTE"] = _cast_dict["BOOL"] + ["UBYTE"]
+_cast_dict["BYTE"] = ["BYTE"]
+_cast_dict["UBYTE"] = ["UBYTE"]
+_cast_dict["SHORT"] = _cast_dict["BYTE"] + ["UBYTE", "SHORT"]
+_cast_dict["USHORT"] = _cast_dict["UBYTE"] + ["BYTE", "USHORT"]
+_cast_dict["INT"] = _cast_dict["SHORT"] + ["USHORT", "INT"]
+_cast_dict["UINT"] = _cast_dict["USHORT"] + ["SHORT", "UINT"]
+
+_cast_dict["LONG"] = _cast_dict["INT"] + ["LONG"]
+_cast_dict["ULONG"] = _cast_dict["UINT"] + ["ULONG"]
+
+_cast_dict["LONGLONG"] = _cast_dict["LONG"] + ["LONGLONG"]
+_cast_dict["ULONGLONG"] = _cast_dict["ULONG"] + ["ULONGLONG"]
+
+_cast_dict["FLOAT"] = _cast_dict["SHORT"] + ["USHORT", "FLOAT"]
+_cast_dict["DOUBLE"] = _cast_dict["INT"] + ["UINT", "FLOAT", "DOUBLE"]
+
+_cast_dict["CFLOAT"] = _cast_dict["FLOAT"] + ["CFLOAT"]
  
  # 32 bit system malloc typically does not provide the alignment required by
  # 16 byte long double types this means the inout intent cannot be satisfied
@@ -121,15 +145,22 @@ _cast_dict['CFLOAT'] = _cast_dict['FLOAT'] + ['CFLOAT']
  # when numpy gains an aligned allocator the tests could be enabled again
  #
  # Furthermore, on macOS ARM64, LONGDOUBLE is an alias for DOUBLE.
-if ((np.intp().dtype.itemsize != 4 or np.clongdouble().dtype.alignment <= 8) and
-        sys.platform != 'win32' and
-        (platform.system(), platform.processor()) != ('Darwin', 'arm')):
-    _type_names.extend(['LONGDOUBLE', 'CDOUBLE', 'CLONGDOUBLE'])
-    _cast_dict['LONGDOUBLE'] = _cast_dict['LONG'] + \
-        ['ULONG', 'FLOAT', 'DOUBLE', 'LONGDOUBLE']
-    _cast_dict['CLONGDOUBLE'] = _cast_dict['LONGDOUBLE'] + \
-        ['CFLOAT', 'CDOUBLE', 'CLONGDOUBLE']
-    _cast_dict['CDOUBLE'] = _cast_dict['DOUBLE'] + ['CFLOAT', 'CDOUBLE']
+if ((np.intp().dtype.itemsize != 4 or np.clongdouble().dtype.alignment <= 8)
+        and sys.platform != "win32"
+        and (platform.system(), platform.processor()) != ("Darwin", "arm")):
+    _type_names.extend(["LONGDOUBLE", "CDOUBLE", "CLONGDOUBLE"])
+    _cast_dict["LONGDOUBLE"] = _cast_dict["LONG"] + [
+        "ULONG",
+        "FLOAT",
+        "DOUBLE",
+        "LONGDOUBLE",
+    ]
+    _cast_dict["CLONGDOUBLE"] = _cast_dict["LONGDOUBLE"] + [
+        "CFLOAT",
+        "CDOUBLE",
+        "CLONGDOUBLE",
+    ]
+    _cast_dict["CDOUBLE"] = _cast_dict["DOUBLE"] + ["CFLOAT", "CDOUBLE"]
  
  
  class Type:
@@ -154,8 +185,8 @@ class Type:
      def _init(self, name):
          self.NAME = name.upper()
          info = typeinfo[self.NAME]
-        self.type_num = getattr(wrap, 'NPY_' + self.NAME)
-        assert_equal(self.type_num, info.num)
+        self.type_num = getattr(wrap, "NPY_" + self.NAME)
+        assert self.type_num == info.num
          self.dtype = np.dtype(info.type)
          self.type = info.type
          self.elsize = info.bits / 8
@@ -195,7 +226,6 @@ class Type:
  
  
  class Array:
-
      def __init__(self, typ, dims, intent, obj):
          self.type = typ
          self.dims = dims
@@ -206,76 +236,78 @@ class Array:
          # arr.dtypechar may be different from typ.dtypechar
          self.arr = wrap.call(typ.type_num, dims, intent.flags, obj)
  
-        assert_(isinstance(self.arr, np.ndarray), repr(type(self.arr)))
+        assert isinstance(self.arr, np.ndarray)
  
          self.arr_attr = wrap.array_attrs(self.arr)
  
          if len(dims) > 1:
-            if self.intent.is_intent('c'):
-                assert_(intent.flags & wrap.F2PY_INTENT_C)
-                assert_(not self.arr.flags['FORTRAN'],
-                        repr((self.arr.flags, getattr(obj, 'flags', None))))
-                assert_(self.arr.flags['CONTIGUOUS'])
-                assert_(not self.arr_attr[6] & wrap.FORTRAN)
+            if self.intent.is_intent("c"):
+                assert (intent.flags & wrap.F2PY_INTENT_C)
+                assert not self.arr.flags["FORTRAN"]
+                assert self.arr.flags["CONTIGUOUS"]
+                assert (not self.arr_attr[6] & wrap.FORTRAN)
              else:
-                assert_(not intent.flags & wrap.F2PY_INTENT_C)
-                assert_(self.arr.flags['FORTRAN'])
-                assert_(not self.arr.flags['CONTIGUOUS'])
-                assert_(self.arr_attr[6] & wrap.FORTRAN)
+                assert (not intent.flags & wrap.F2PY_INTENT_C)
+                assert self.arr.flags["FORTRAN"]
+                assert not self.arr.flags["CONTIGUOUS"]
+                assert (self.arr_attr[6] & wrap.FORTRAN)
  
          if obj is None:
              self.pyarr = None
              self.pyarr_attr = None
              return
  
-        if intent.is_intent('cache'):
-            assert_(isinstance(obj, np.ndarray), repr(type(obj)))
+        if intent.is_intent("cache"):
+            assert isinstance(obj, np.ndarray), repr(type(obj))
              self.pyarr = np.array(obj).reshape(*dims).copy()
          else:
              self.pyarr = np.array(
-                    np.array(obj, dtype=typ.dtypechar).reshape(*dims),
-                    order=self.intent.is_intent('c') and 'C' or 'F')
-            assert_(self.pyarr.dtype == typ,
-                    repr((self.pyarr.dtype, typ)))
-        self.pyarr.setflags(write=self.arr.flags['WRITEABLE'])
-        assert_(self.pyarr.flags['OWNDATA'], (obj, intent))
+                np.array(obj, dtype=typ.dtypechar).reshape(*dims),
+                order=self.intent.is_intent("c") and "C" or "F",
+            )
+            assert self.pyarr.dtype == typ
+        self.pyarr.setflags(write=self.arr.flags["WRITEABLE"])
+        assert self.pyarr.flags["OWNDATA"], (obj, intent)
          self.pyarr_attr = wrap.array_attrs(self.pyarr)
  
          if len(dims) > 1:
-            if self.intent.is_intent('c'):
-                assert_(not self.pyarr.flags['FORTRAN'])
-                assert_(self.pyarr.flags['CONTIGUOUS'])
-                assert_(not self.pyarr_attr[6] & wrap.FORTRAN)
+            if self.intent.is_intent("c"):
+                assert not self.pyarr.flags["FORTRAN"]
+                assert self.pyarr.flags["CONTIGUOUS"]
+                assert (not self.pyarr_attr[6] & wrap.FORTRAN)
              else:
-                assert_(self.pyarr.flags['FORTRAN'])
-                assert_(not self.pyarr.flags['CONTIGUOUS'])
-                assert_(self.pyarr_attr[6] & wrap.FORTRAN)
+                assert self.pyarr.flags["FORTRAN"]
+                assert not self.pyarr.flags["CONTIGUOUS"]
+                assert (self.pyarr_attr[6] & wrap.FORTRAN)
  
-        assert_(self.arr_attr[1] == self.pyarr_attr[1])  # nd
-        assert_(self.arr_attr[2] == self.pyarr_attr[2])  # dimensions
+        assert self.arr_attr[1] == self.pyarr_attr[1]  # nd
+        assert self.arr_attr[2] == self.pyarr_attr[2]  # dimensions
          if self.arr_attr[1] <= 1:
-            assert_(self.arr_attr[3] == self.pyarr_attr[3],
-                    repr((self.arr_attr[3], self.pyarr_attr[3],
-                          self.arr.tobytes(), self.pyarr.tobytes())))  # strides
-        assert_(self.arr_attr[5][-2:] == self.pyarr_attr[5][-2:],
-                repr((self.arr_attr[5], self.pyarr_attr[5])))  # descr
-        assert_(self.arr_attr[6] == self.pyarr_attr[6],
-                repr((self.arr_attr[6], self.pyarr_attr[6],
-                      flags2names(0 * self.arr_attr[6] - self.pyarr_attr[6]),
-                      flags2names(self.arr_attr[6]), intent)))  # flags
-
-        if intent.is_intent('cache'):
-            assert_(self.arr_attr[5][3] >= self.type.elsize,
-                    repr((self.arr_attr[5][3], self.type.elsize)))
+            assert self.arr_attr[3] == self.pyarr_attr[3], repr((
+                self.arr_attr[3],
+                self.pyarr_attr[3],
+                self.arr.tobytes(),
+                self.pyarr.tobytes(),
+            ))  # strides
+        assert self.arr_attr[5][-2:] == self.pyarr_attr[5][-2:]  # descr
+        assert self.arr_attr[6] == self.pyarr_attr[6], repr((
+            self.arr_attr[6],
+            self.pyarr_attr[6],
+            flags2names(0 * self.arr_attr[6] - self.pyarr_attr[6]),
+            flags2names(self.arr_attr[6]),
+            intent,
+        ))  # flags
+
+        if intent.is_intent("cache"):
+            assert self.arr_attr[5][3] >= self.type.elsize
          else:
-            assert_(self.arr_attr[5][3] == self.type.elsize,
-                    repr((self.arr_attr[5][3], self.type.elsize)))
-        assert_(self.arr_equal(self.pyarr, self.arr))
+            assert self.arr_attr[5][3] == self.type.elsize
+        assert (self.arr_equal(self.pyarr, self.arr))
  
          if isinstance(self.obj, np.ndarray):
              if typ.elsize == Type(obj.dtype).elsize:
-                if not intent.is_intent('copy') and self.arr_attr[1] <= 1:
-                    assert_(self.has_shared_memory())
+                if not intent.is_intent("copy") and self.arr_attr[1] <= 1:
+                    assert self.has_shared_memory()
  
      def arr_equal(self, arr1, arr2):
          if arr1.shape != arr2.shape:
@@ -286,8 +318,7 @@ class Array:
          return str(self.arr)
  
      def has_shared_memory(self):
-        """Check that created array shares data with input array.
-        """
+        """Check that created array shares data with input array."""
          if self.obj is self.arr:
              return True
          if not isinstance(self.obj, np.ndarray):
@@ -297,300 +328,300 @@ class Array:
  
  
  class TestIntent:
-
      def test_in_out(self):
-        assert_equal(str(intent.in_.out), 'intent(in,out)')
-        assert_(intent.in_.c.is_intent('c'))
-        assert_(not intent.in_.c.is_intent_exact('c'))
-        assert_(intent.in_.c.is_intent_exact('c', 'in'))
-        assert_(intent.in_.c.is_intent_exact('in', 'c'))
-        assert_(not intent.in_.is_intent('c'))
+        assert str(intent.in_.out) == "intent(in,out)"
+        assert intent.in_.c.is_intent("c")
+        assert not intent.in_.c.is_intent_exact("c")
+        assert intent.in_.c.is_intent_exact("c", "in")
+        assert intent.in_.c.is_intent_exact("in", "c")
+        assert not intent.in_.is_intent("c")
  
  
  class TestSharedMemory:
      num2seq = [1, 2]
      num23seq = [[1, 2, 3], [4, 5, 6]]
  
-    @pytest.fixture(autouse=True, scope='class', params=_type_names)
+    @pytest.fixture(autouse=True, scope="class", params=_type_names)
      def setup_type(self, request):
          request.cls.type = Type(request.param)
-        request.cls.array = lambda self, dims, intent, obj: \
-            Array(Type(request.param), dims, intent, obj)
+        request.cls.array = lambda self, dims, intent, obj: Array(
+            Type(request.param), dims, intent, obj)
  
      def test_in_from_2seq(self):
          a = self.array([2], intent.in_, self.num2seq)
-        assert_(not a.has_shared_memory())
+        assert not a.has_shared_memory()
  
      def test_in_from_2casttype(self):
          for t in self.type.cast_types():
              obj = np.array(self.num2seq, dtype=t.dtype)
              a = self.array([len(self.num2seq)], intent.in_, obj)
              if t.elsize == self.type.elsize:
-                assert_(
-                    a.has_shared_memory(), repr((self.type.dtype, t.dtype)))
+                assert a.has_shared_memory(), repr((self.type.dtype, t.dtype))
              else:
-                assert_(not a.has_shared_memory(), repr(t.dtype))
+                assert not a.has_shared_memory()
  
-    @pytest.mark.parametrize('write', ['w', 'ro'])
-    @pytest.mark.parametrize('order', ['C', 'F'])
-    @pytest.mark.parametrize('inp', ['2seq', '23seq'])
+    @pytest.mark.parametrize("write", ["w", "ro"])
+    @pytest.mark.parametrize("order", ["C", "F"])
+    @pytest.mark.parametrize("inp", ["2seq", "23seq"])
      def test_in_nocopy(self, write, order, inp):
-        """Test if intent(in) array can be passed without copies
-        """
-        seq = getattr(self, 'num' + inp)
+        """Test if intent(in) array can be passed without copies"""
+        seq = getattr(self, "num" + inp)
          obj = np.array(seq, dtype=self.type.dtype, order=order)
-        obj.setflags(write=(write == 'w'))
-        a = self.array(obj.shape, ((order=='C' and intent.in_.c) or intent.in_), obj)
+        obj.setflags(write=(write == "w"))
+        a = self.array(obj.shape,
+                       ((order == "C" and intent.in_.c) or intent.in_), obj)
          assert a.has_shared_memory()
  
      def test_inout_2seq(self):
          obj = np.array(self.num2seq, dtype=self.type.dtype)
          a = self.array([len(self.num2seq)], intent.inout, obj)
-        assert_(a.has_shared_memory())
+        assert a.has_shared_memory()
  
          try:
              a = self.array([2], intent.in_.inout, self.num2seq)
          except TypeError as msg:
-            if not str(msg).startswith('failed to initialize intent'
-                                       '(inout|inplace|cache) array'):
+            if not str(msg).startswith(
+                    "failed to initialize intent(inout|inplace|cache) array"):
                  raise
          else:
-            raise SystemError('intent(inout) should have failed on sequence')
+            raise SystemError("intent(inout) should have failed on sequence")
  
      def test_f_inout_23seq(self):
-        obj = np.array(self.num23seq, dtype=self.type.dtype, order='F')
+        obj = np.array(self.num23seq, dtype=self.type.dtype, order="F")
          shape = (len(self.num23seq), len(self.num23seq[0]))
          a = self.array(shape, intent.in_.inout, obj)
-        assert_(a.has_shared_memory())
+        assert a.has_shared_memory()
  
-        obj = np.array(self.num23seq, dtype=self.type.dtype, order='C')
+        obj = np.array(self.num23seq, dtype=self.type.dtype, order="C")
          shape = (len(self.num23seq), len(self.num23seq[0]))
          try:
              a = self.array(shape, intent.in_.inout, obj)
          except ValueError as msg:
-            if not str(msg).startswith('failed to initialize intent'
-                                       '(inout) array'):
+            if not str(msg).startswith(
+                    "failed to initialize intent(inout) array"):
                  raise
          else:
              raise SystemError(
-                'intent(inout) should have failed on improper array')
+                "intent(inout) should have failed on improper array")
  
      def test_c_inout_23seq(self):
          obj = np.array(self.num23seq, dtype=self.type.dtype)
          shape = (len(self.num23seq), len(self.num23seq[0]))
          a = self.array(shape, intent.in_.c.inout, obj)
-        assert_(a.has_shared_memory())
+        assert a.has_shared_memory()
  
      def test_in_copy_from_2casttype(self):
          for t in self.type.cast_types():
              obj = np.array(self.num2seq, dtype=t.dtype)
              a = self.array([len(self.num2seq)], intent.in_.copy, obj)
-            assert_(not a.has_shared_memory(), repr(t.dtype))
+            assert not a.has_shared_memory()
  
      def test_c_in_from_23seq(self):
-        a = self.array([len(self.num23seq), len(self.num23seq[0])],
-                       intent.in_, self.num23seq)
-        assert_(not a.has_shared_memory())
+        a = self.array(
+            [len(self.num23seq), len(self.num23seq[0])], intent.in_,
+            self.num23seq)
+        assert not a.has_shared_memory()
  
      def test_in_from_23casttype(self):
          for t in self.type.cast_types():
              obj = np.array(self.num23seq, dtype=t.dtype)
-            a = self.array([len(self.num23seq), len(self.num23seq[0])],
-                           intent.in_, obj)
-            assert_(not a.has_shared_memory(), repr(t.dtype))
+            a = self.array(
+                [len(self.num23seq), len(self.num23seq[0])], intent.in_, obj)
+            assert not a.has_shared_memory()
  
      def test_f_in_from_23casttype(self):
          for t in self.type.cast_types():
-            obj = np.array(self.num23seq, dtype=t.dtype, order='F')
-            a = self.array([len(self.num23seq), len(self.num23seq[0])],
-                           intent.in_, obj)
+            obj = np.array(self.num23seq, dtype=t.dtype, order="F")
+            a = self.array(
+                [len(self.num23seq), len(self.num23seq[0])], intent.in_, obj)
              if t.elsize == self.type.elsize:
-                assert_(a.has_shared_memory(), repr(t.dtype))
+                assert a.has_shared_memory()
              else:
-                assert_(not a.has_shared_memory(), repr(t.dtype))
+                assert not a.has_shared_memory()
  
      def test_c_in_from_23casttype(self):
          for t in self.type.cast_types():
              obj = np.array(self.num23seq, dtype=t.dtype)
-            a = self.array([len(self.num23seq), len(self.num23seq[0])],
-                           intent.in_.c, obj)
+            a = self.array(
+                [len(self.num23seq), len(self.num23seq[0])], intent.in_.c, obj)
              if t.elsize == self.type.elsize:
-                assert_(a.has_shared_memory(), repr(t.dtype))
+                assert a.has_shared_memory()
              else:
-                assert_(not a.has_shared_memory(), repr(t.dtype))
+                assert not a.has_shared_memory()
  
      def test_f_copy_in_from_23casttype(self):
          for t in self.type.cast_types():
-            obj = np.array(self.num23seq, dtype=t.dtype, order='F')
-            a = self.array([len(self.num23seq), len(self.num23seq[0])],
-                           intent.in_.copy, obj)
-            assert_(not a.has_shared_memory(), repr(t.dtype))
+            obj = np.array(self.num23seq, dtype=t.dtype, order="F")
+            a = self.array(
+                [len(self.num23seq), len(self.num23seq[0])], intent.in_.copy,
+                obj)
+            assert not a.has_shared_memory()
  
      def test_c_copy_in_from_23casttype(self):
          for t in self.type.cast_types():
              obj = np.array(self.num23seq, dtype=t.dtype)
-            a = self.array([len(self.num23seq), len(self.num23seq[0])],
-                           intent.in_.c.copy, obj)
-            assert_(not a.has_shared_memory(), repr(t.dtype))
+            a = self.array(
+                [len(self.num23seq), len(self.num23seq[0])], intent.in_.c.copy,
+                obj)
+            assert not a.has_shared_memory()
  
      def test_in_cache_from_2casttype(self):
          for t in self.type.all_types():
              if t.elsize != self.type.elsize:
                  continue
              obj = np.array(self.num2seq, dtype=t.dtype)
-            shape = (len(self.num2seq),)
+            shape = (len(self.num2seq), )
              a = self.array(shape, intent.in_.c.cache, obj)
-            assert_(a.has_shared_memory(), repr(t.dtype))
+            assert a.has_shared_memory()
  
              a = self.array(shape, intent.in_.cache, obj)
-            assert_(a.has_shared_memory(), repr(t.dtype))
+            assert a.has_shared_memory()
  
-            obj = np.array(self.num2seq, dtype=t.dtype, order='F')
+            obj = np.array(self.num2seq, dtype=t.dtype, order="F")
              a = self.array(shape, intent.in_.c.cache, obj)
-            assert_(a.has_shared_memory(), repr(t.dtype))
+            assert a.has_shared_memory()
  
              a = self.array(shape, intent.in_.cache, obj)
-            assert_(a.has_shared_memory(), repr(t.dtype))
+            assert a.has_shared_memory(), repr(t.dtype)
  
              try:
                  a = self.array(shape, intent.in_.cache, obj[::-1])
              except ValueError as msg:
-                if not str(msg).startswith('failed to initialize'
-                                           ' intent(cache) array'):
+                if not str(msg).startswith(
+                        "failed to initialize intent(cache) array"):
                      raise
              else:
                  raise SystemError(
-                    'intent(cache) should have failed on multisegmented array')
+                    "intent(cache) should have failed on multisegmented array")
  
      def test_in_cache_from_2casttype_failure(self):
          for t in self.type.all_types():
              if t.elsize >= self.type.elsize:
                  continue
              obj = np.array(self.num2seq, dtype=t.dtype)
-            shape = (len(self.num2seq),)
+            shape = (len(self.num2seq), )
              try:
                  self.array(shape, intent.in_.cache, obj)  # Should succeed
              except ValueError as msg:
-                if not str(msg).startswith('failed to initialize'
-                                           ' intent(cache) array'):
+                if not str(msg).startswith(
+                        "failed to initialize intent(cache) array"):
                      raise
              else:
                  raise SystemError(
-                    'intent(cache) should have failed on smaller array')
+                    "intent(cache) should have failed on smaller array")
  
      def test_cache_hidden(self):
-        shape = (2,)
+        shape = (2, )
          a = self.array(shape, intent.cache.hide, None)
-        assert_(a.arr.shape == shape)
+        assert a.arr.shape == shape
  
          shape = (2, 3)
          a = self.array(shape, intent.cache.hide, None)
-        assert_(a.arr.shape == shape)
+        assert a.arr.shape == shape
  
          shape = (-1, 3)
          try:
              a = self.array(shape, intent.cache.hide, None)
          except ValueError as msg:
-            if not str(msg).startswith('failed to create intent'
-                                       '(cache|hide)|optional array'):
+            if not str(msg).startswith(
+                    "failed to create intent(cache|hide)|optional array"):
                  raise
          else:
              raise SystemError(
-                'intent(cache) should have failed on undefined dimensions')
+                "intent(cache) should have failed on undefined dimensions")
  
      def test_hidden(self):
-        shape = (2,)
+        shape = (2, )
          a = self.array(shape, intent.hide, None)
-        assert_(a.arr.shape == shape)
-        assert_(a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype)))
+        assert a.arr.shape == shape
+        assert a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype))
  
          shape = (2, 3)
          a = self.array(shape, intent.hide, None)
-        assert_(a.arr.shape == shape)
-        assert_(a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype)))
-        assert_(a.arr.flags['FORTRAN'] and not a.arr.flags['CONTIGUOUS'])
+        assert a.arr.shape == shape
+        assert a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype))
+        assert a.arr.flags["FORTRAN"] and not a.arr.flags["CONTIGUOUS"]
  
          shape = (2, 3)
          a = self.array(shape, intent.c.hide, None)
-        assert_(a.arr.shape == shape)
-        assert_(a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype)))
-        assert_(not a.arr.flags['FORTRAN'] and a.arr.flags['CONTIGUOUS'])
+        assert a.arr.shape == shape
+        assert a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype))
+        assert not a.arr.flags["FORTRAN"] and a.arr.flags["CONTIGUOUS"]
  
          shape = (-1, 3)
          try:
              a = self.array(shape, intent.hide, None)
          except ValueError as msg:
-            if not str(msg).startswith('failed to create intent'
-                                       '(cache|hide)|optional array'):
+            if not str(msg).startswith(
+                    "failed to create intent(cache|hide)|optional array"):
                  raise
          else:
-            raise SystemError('intent(hide) should have failed'
-                              ' on undefined dimensions')
+            raise SystemError(
+                "intent(hide) should have failed on undefined dimensions")
  
      def test_optional_none(self):
-        shape = (2,)
+        shape = (2, )
          a = self.array(shape, intent.optional, None)
-        assert_(a.arr.shape == shape)
-        assert_(a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype)))
+        assert a.arr.shape == shape
+        assert a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype))
  
          shape = (2, 3)
          a = self.array(shape, intent.optional, None)
-        assert_(a.arr.shape == shape)
-        assert_(a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype)))
-        assert_(a.arr.flags['FORTRAN'] and not a.arr.flags['CONTIGUOUS'])
+        assert a.arr.shape == shape
+        assert a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype))
+        assert a.arr.flags["FORTRAN"] and not a.arr.flags["CONTIGUOUS"]
  
          shape = (2, 3)
          a = self.array(shape, intent.c.optional, None)
-        assert_(a.arr.shape == shape)
-        assert_(a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype)))
-        assert_(not a.arr.flags['FORTRAN'] and a.arr.flags['CONTIGUOUS'])
+        assert a.arr.shape == shape
+        assert a.arr_equal(a.arr, np.zeros(shape, dtype=self.type.dtype))
+        assert not a.arr.flags["FORTRAN"] and a.arr.flags["CONTIGUOUS"]
  
      def test_optional_from_2seq(self):
          obj = self.num2seq
-        shape = (len(obj),)
+        shape = (len(obj), )
          a = self.array(shape, intent.optional, obj)
-        assert_(a.arr.shape == shape)
-        assert_(not a.has_shared_memory())
+        assert a.arr.shape == shape
+        assert not a.has_shared_memory()
  
      def test_optional_from_23seq(self):
          obj = self.num23seq
          shape = (len(obj), len(obj[0]))
          a = self.array(shape, intent.optional, obj)
-        assert_(a.arr.shape == shape)
-        assert_(not a.has_shared_memory())
+        assert a.arr.shape == shape
+        assert not a.has_shared_memory()
  
          a = self.array(shape, intent.optional.c, obj)
-        assert_(a.arr.shape == shape)
-        assert_(not a.has_shared_memory())
+        assert a.arr.shape == shape
+        assert not a.has_shared_memory()
  
      def test_inplace(self):
          obj = np.array(self.num23seq, dtype=self.type.dtype)
-        assert_(not obj.flags['FORTRAN'] and obj.flags['CONTIGUOUS'])
+        assert not obj.flags["FORTRAN"] and obj.flags["CONTIGUOUS"]
          shape = obj.shape
          a = self.array(shape, intent.inplace, obj)
-        assert_(obj[1][2] == a.arr[1][2], repr((obj, a.arr)))
+        assert obj[1][2] == a.arr[1][2], repr((obj, a.arr))
          a.arr[1][2] = 54
-        assert_(obj[1][2] == a.arr[1][2] ==
-                np.array(54, dtype=self.type.dtype), repr((obj, a.arr)))
-        assert_(a.arr is obj)
-        assert_(obj.flags['FORTRAN'])  # obj attributes are changed inplace!
-        assert_(not obj.flags['CONTIGUOUS'])
+        assert obj[1][2] == a.arr[1][2] == np.array(54, dtype=self.type.dtype)
+        assert a.arr is obj
+        assert obj.flags["FORTRAN"]  # obj attributes are changed inplace!
+        assert not obj.flags["CONTIGUOUS"]
  
      def test_inplace_from_casttype(self):
          for t in self.type.cast_types():
              if t is self.type:
                  continue
              obj = np.array(self.num23seq, dtype=t.dtype)
-            assert_(obj.dtype.type == t.type)
-            assert_(obj.dtype.type is not self.type.type)
-            assert_(not obj.flags['FORTRAN'] and obj.flags['CONTIGUOUS'])
+            assert obj.dtype.type == t.type
+            assert obj.dtype.type is not self.type.type
+            assert not obj.flags["FORTRAN"] and obj.flags["CONTIGUOUS"]
              shape = obj.shape
              a = self.array(shape, intent.inplace, obj)
-            assert_(obj[1][2] == a.arr[1][2], repr((obj, a.arr)))
+            assert obj[1][2] == a.arr[1][2], repr((obj, a.arr))
              a.arr[1][2] = 54
-            assert_(obj[1][2] == a.arr[1][2] ==
-                    np.array(54, dtype=self.type.dtype), repr((obj, a.arr)))
-            assert_(a.arr is obj)
-            assert_(obj.flags['FORTRAN'])  # obj attributes changed inplace!
-            assert_(not obj.flags['CONTIGUOUS'])
-            assert_(obj.dtype.type is self.type.type)  # obj changed inplace!
+            assert obj[1][2] == a.arr[1][2] == np.array(54,
+                                                        dtype=self.type.dtype)
+            assert a.arr is obj
+            assert obj.flags["FORTRAN"]  # obj attributes changed inplace!
+            assert not obj.flags["CONTIGUOUS"]
+            assert obj.dtype.type is self.type.type  # obj changed inplace!
diff --git a/numpy/f2py/tests/test_assumed_shape.py b/numpy/f2py/tests/test_assumed_shape.py
index 79e3ad1384269f5f20f6f45b8be8dda2454f6b74..e546c379b025f1b1f375a37a42c2f7e1cb3e79a7 100644
--- a/numpy/f2py/tests/test_assumed_shape.py
+++ b/numpy/f2py/tests/test_assumed_shape.py
@@ -2,35 +2,31 @@ import os
  import pytest
  import tempfile
  
-from numpy.testing import assert_
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestAssumedShapeSumExample(util.F2PyTest):
-    sources = [_path('src', 'assumed_shape', 'foo_free.f90'),
-               _path('src', 'assumed_shape', 'foo_use.f90'),
-               _path('src', 'assumed_shape', 'precision.f90'),
-               _path('src', 'assumed_shape', 'foo_mod.f90'),
-               _path('src', 'assumed_shape', '.f2py_f2cmap'),
-               ]
+    sources = [
+        util.getpath("tests", "src", "assumed_shape", "foo_free.f90"),
+        util.getpath("tests", "src", "assumed_shape", "foo_use.f90"),
+        util.getpath("tests", "src", "assumed_shape", "precision.f90"),
+        util.getpath("tests", "src", "assumed_shape", "foo_mod.f90"),
+        util.getpath("tests", "src", "assumed_shape", ".f2py_f2cmap"),
+    ]
  
      @pytest.mark.slow
      def test_all(self):
          r = self.module.fsum([1, 2])
-        assert_(r == 3, repr(r))
+        assert r == 3
          r = self.module.sum([1, 2])
-        assert_(r == 3, repr(r))
+        assert r == 3
          r = self.module.sum_with_use([1, 2])
-        assert_(r == 3, repr(r))
+        assert r == 3
  
          r = self.module.mod.sum([1, 2])
-        assert_(r == 3, repr(r))
+        assert r == 3
          r = self.module.mod.fsum([1, 2])
-        assert_(r == 3, repr(r))
+        assert r == 3
  
  
  class TestF2cmapOption(TestAssumedShapeSumExample):
@@ -40,7 +36,7 @@ class TestF2cmapOption(TestAssumedShapeSumExample):
          f2cmap_src = self.sources.pop(-1)
  
          self.f2cmap_file = tempfile.NamedTemporaryFile(delete=False)
-        with open(f2cmap_src, 'rb') as f:
+        with open(f2cmap_src, "rb") as f:
              self.f2cmap_file.write(f.read())
          self.f2cmap_file.close()
  
diff --git a/numpy/f2py/tests/test_block_docstring.py b/numpy/f2py/tests/test_block_docstring.py
index 7d725165b2fbd496e8ef3a257a2798805d0af215..e0eacc0329c5e78733384c21397614a50601fce9 100644
--- a/numpy/f2py/tests/test_block_docstring.py
+++ b/numpy/f2py/tests/test_block_docstring.py
@@ -2,22 +2,16 @@ import sys
  import pytest
  from . import util
  
-from numpy.testing import assert_equal, IS_PYPY
+from numpy.testing import IS_PYPY
  
-class TestBlockDocString(util.F2PyTest):
-    code = """
-      SUBROUTINE FOO()
-      INTEGER BAR(2, 3)
  
-      COMMON  /BLOCK/ BAR
-      RETURN
-      END
-    """
+class TestBlockDocString(util.F2PyTest):
+    sources = [util.getpath("tests", "src", "block_docstring", "foo.f")]
  
-    @pytest.mark.skipif(sys.platform=='win32',
-                        reason='Fails with MinGW64 Gfortran (Issue #9673)')
+    @pytest.mark.skipif(sys.platform == "win32",
+                        reason="Fails with MinGW64 Gfortran (Issue #9673)")
      @pytest.mark.xfail(IS_PYPY,
                         reason="PyPy cannot modify tp_doc after PyType_Ready")
      def test_block_docstring(self):
          expected = "bar : 'i'-array(2,3)\n"
-        assert_equal(self.module.block.__doc__, expected)
+        assert self.module.block.__doc__ == expected
diff --git a/numpy/f2py/tests/test_callback.py b/numpy/f2py/tests/test_callback.py
index 5d2aab94df9a949dd6cb1dee131e756b5fee8542..4e91430fd24fac178b0c7760a1946b41b8f44876 100644
--- a/numpy/f2py/tests/test_callback.py
+++ b/numpy/f2py/tests/test_callback.py
@@ -7,77 +7,14 @@ import traceback
  import time
  
  import numpy as np
-from numpy.testing import assert_, assert_equal, IS_PYPY
+from numpy.testing import IS_PYPY
  from . import util
  
  
  class TestF77Callback(util.F2PyTest):
-    code = """
-       subroutine t(fun,a)
-       integer a
-cf2py  intent(out) a
-       external fun
-       call fun(a)
-       end
-
-       subroutine func(a)
-cf2py  intent(in,out) a
-       integer a
-       a = a + 11
-       end
-
-       subroutine func0(a)
-cf2py  intent(out) a
-       integer a
-       a = 11
-       end
-
-       subroutine t2(a)
-cf2py  intent(callback) fun
-       integer a
-cf2py  intent(out) a
-       external fun
-       call fun(a)
-       end
-
-       subroutine string_callback(callback, a)
-       external callback
-       double precision callback
-       double precision a
-       character*1 r
-cf2py  intent(out) a
-       r = 'r'
-       a = callback(r)
-       end
-
-       subroutine string_callback_array(callback, cu, lencu, a)
-       external callback
-       integer callback
-       integer lencu
-       character*8 cu(lencu)
-       integer a
-cf2py  intent(out) a
-
-       a = callback(cu, lencu)
-       end
-
-       subroutine hidden_callback(a, r)
-       external global_f
-cf2py  intent(callback, hide) global_f
-       integer a, r, global_f
-cf2py  intent(out) r
-       r = global_f(a)
-       end
-
-       subroutine hidden_callback2(a, r)
-       external global_f
-       integer a, r, global_f
-cf2py  intent(out) r
-       r = global_f(a)
-       end
-    """
+    sources = [util.getpath("tests", "src", "callback", "foo.f")]
  
-    @pytest.mark.parametrize('name', 't,t2'.split(','))
+    @pytest.mark.parametrize("name", "t,t2".split(","))
      def test_all(self, name):
          self.check_function(name)
  
@@ -110,75 +47,74 @@ cf2py  intent(out) r
              Return objects:
                  a : int
          """)
-        assert_equal(self.module.t.__doc__, expected)
+        assert self.module.t.__doc__ == expected
  
      def check_function(self, name):
          t = getattr(self.module, name)
          r = t(lambda: 4)
-        assert_(r == 4, repr(r))
-        r = t(lambda a: 5, fun_extra_args=(6,))
-        assert_(r == 5, repr(r))
-        r = t(lambda a: a, fun_extra_args=(6,))
-        assert_(r == 6, repr(r))
-        r = t(lambda a: 5 + a, fun_extra_args=(7,))
-        assert_(r == 12, repr(r))
-        r = t(lambda a: math.degrees(a), fun_extra_args=(math.pi,))
-        assert_(r == 180, repr(r))
-        r = t(math.degrees, fun_extra_args=(math.pi,))
-        assert_(r == 180, repr(r))
-
-        r = t(self.module.func, fun_extra_args=(6,))
-        assert_(r == 17, repr(r))
+        assert r == 4
+        r = t(lambda a: 5, fun_extra_args=(6, ))
+        assert r == 5
+        r = t(lambda a: a, fun_extra_args=(6, ))
+        assert r == 6
+        r = t(lambda a: 5 + a, fun_extra_args=(7, ))
+        assert r == 12
+        r = t(lambda a: math.degrees(a), fun_extra_args=(math.pi, ))
+        assert r == 180
+        r = t(math.degrees, fun_extra_args=(math.pi, ))
+        assert r == 180
+
+        r = t(self.module.func, fun_extra_args=(6, ))
+        assert r == 17
          r = t(self.module.func0)
-        assert_(r == 11, repr(r))
+        assert r == 11
          r = t(self.module.func0._cpointer)
-        assert_(r == 11, repr(r))
+        assert r == 11
  
          class A:
-
              def __call__(self):
                  return 7
  
              def mth(self):
                  return 9
+
          a = A()
          r = t(a)
-        assert_(r == 7, repr(r))
+        assert r == 7
          r = t(a.mth)
-        assert_(r == 9, repr(r))
+        assert r == 9
  
-    @pytest.mark.skipif(sys.platform=='win32',
-                        reason='Fails with MinGW64 Gfortran (Issue #9673)')
+    @pytest.mark.skipif(sys.platform == "win32",
+                        reason="Fails with MinGW64 Gfortran (Issue #9673)")
      def test_string_callback(self):
-
          def callback(code):
-            if code == 'r':
+            if code == "r":
                  return 0
              else:
                  return 1
  
-        f = getattr(self.module, 'string_callback')
+        f = getattr(self.module, "string_callback")
          r = f(callback)
-        assert_(r == 0, repr(r))
+        assert r == 0
  
-    @pytest.mark.skipif(sys.platform=='win32',
-                        reason='Fails with MinGW64 Gfortran (Issue #9673)')
+    @pytest.mark.skipif(sys.platform == "win32",
+                        reason="Fails with MinGW64 Gfortran (Issue #9673)")
      def test_string_callback_array(self):
          # See gh-10027
-        cu = np.zeros((1, 8), 'S1')
+        cu = np.zeros((1, 8), "S1")
  
          def callback(cu, lencu):
              if cu.shape != (lencu, 8):
                  return 1
-            if cu.dtype != 'S1':
+            if cu.dtype != "S1":
                  return 2
-            if not np.all(cu == b''):
+            if not np.all(cu == b""):
                  return 3
              return 0
  
-        f = getattr(self.module, 'string_callback_array')
+        f = getattr(self.module, "string_callback_array")
          res = f(callback, cu, len(cu))
-        assert_(res == 0, repr(res))
+        assert res == 0
  
      def test_threadsafety(self):
          # Segfaults if the callback handling is not threadsafe
@@ -192,7 +128,7 @@ cf2py  intent(out) r
  
              # Check reentrancy
              r = self.module.t(lambda: 123)
-            assert_(r == 123)
+            assert r == 123
  
              return 42
  
@@ -200,13 +136,15 @@ cf2py  intent(out) r
              try:
                  for j in range(50):
                      r = self.module.t(cb)
-                    assert_(r == 42)
+                    assert r == 42
                      self.check_function(name)
              except Exception:
                  errors.append(traceback.format_exc())
  
-        threads = [threading.Thread(target=runner, args=(arg,))
-                   for arg in ("t", "t2") for n in range(20)]
+        threads = [
+            threading.Thread(target=runner, args=(arg, ))
+            for arg in ("t", "t2") for n in range(20)
+        ]
  
          for t in threads:
              t.start()
@@ -222,34 +160,34 @@ cf2py  intent(out) r
          try:
              self.module.hidden_callback(2)
          except Exception as msg:
-            assert_(str(msg).startswith('Callback global_f not defined'))
+            assert str(msg).startswith("Callback global_f not defined")
  
          try:
              self.module.hidden_callback2(2)
          except Exception as msg:
-            assert_(str(msg).startswith('cb: Callback global_f not defined'))
+            assert str(msg).startswith("cb: Callback global_f not defined")
  
          self.module.global_f = lambda x: x + 1
          r = self.module.hidden_callback(2)
-        assert_(r == 3)
+        assert r == 3
  
          self.module.global_f = lambda x: x + 2
          r = self.module.hidden_callback(2)
-        assert_(r == 4)
+        assert r == 4
  
          del self.module.global_f
          try:
              self.module.hidden_callback(2)
          except Exception as msg:
-            assert_(str(msg).startswith('Callback global_f not defined'))
+            assert str(msg).startswith("Callback global_f not defined")
  
          self.module.global_f = lambda x=0: x + 3
          r = self.module.hidden_callback(2)
-        assert_(r == 5)
+        assert r == 5
  
          # reproducer of gh18341
          r = self.module.hidden_callback2(2)
-        assert_(r == 3)
+        assert r == 3
  
  
  class TestF77CallbackPythonTLS(TestF77Callback):
@@ -257,26 +195,14 @@ class TestF77CallbackPythonTLS(TestF77Callback):
      Callback tests using Python thread-local storage instead of
      compiler-provided
      """
+
      options = ["-DF2PY_USE_PYTHON_TLS"]
  
  
  class TestF90Callback(util.F2PyTest):
-
-    suffix = '.f90'
-
-    code = textwrap.dedent(
-        """
-        function gh17797(f, y) result(r)
-          external f
-          integer(8) :: r, f
-          integer(8), dimension(:) :: y
-          r = f(0)
-          r = r + sum(y)
-        end function gh17797
-        """)
+    sources = [util.getpath("tests", "src", "callback", "gh17797.f90")]
  
      def test_gh17797(self):
-
          def incr(x):
              return x + 123
  
@@ -291,32 +217,9 @@ class TestGH18335(util.F2PyTest):
      implemented as a separate test class. Do not extend this test with
      other tests!
      """
-
-    suffix = '.f90'
-
-    code = textwrap.dedent(
-        """
-        ! When gh18335_workaround is defined as an extension,
-        ! the issue cannot be reproduced.
-        !subroutine gh18335_workaround(f, y)
-        !  implicit none
-        !  external f
-        !  integer(kind=1) :: y(1)
-        !  call f(y)
-        !end subroutine gh18335_workaround
-
-        function gh18335(f) result (r)
-          implicit none
-          external f
-          integer(kind=1) :: y(1), r
-          y(1) = 123
-          call f(y)
-          r = y(1)
-        end function gh18335
-        """)
+    sources = [util.getpath("tests", "src", "callback", "gh18335.f90")]
  
      def test_gh18335(self):
-
          def foo(x):
              x[0] += 1
  
diff --git a/numpy/f2py/tests/test_common.py b/numpy/f2py/tests/test_common.py
index e4bf3550476123de91e5890a0c7b281aa87bb27b..8a4b221ef8bd51c7e6b9528fc10215a4c8fdbcdb 100644
--- a/numpy/f2py/tests/test_common.py
+++ b/numpy/f2py/tests/test_common.py
@@ -5,21 +5,14 @@ import pytest
  import numpy as np
  from . import util
  
-from numpy.testing import assert_array_equal
-
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
  
  class TestCommonBlock(util.F2PyTest):
-    sources = [_path('src', 'common', 'block.f')]
+    sources = [util.getpath("tests", "src", "common", "block.f")]
  
-    @pytest.mark.skipif(sys.platform=='win32',
-                        reason='Fails with MinGW64 Gfortran (Issue #9673)')
+    @pytest.mark.skipif(sys.platform == "win32",
+                        reason="Fails with MinGW64 Gfortran (Issue #9673)")
      def test_common_block(self):
          self.module.initcb()
-        assert_array_equal(self.module.block.long_bn,
-                           np.array(1.0, dtype=np.float64))
-        assert_array_equal(self.module.block.string_bn,
-                           np.array('2', dtype='|S1'))
-        assert_array_equal(self.module.block.ok,
-                           np.array(3, dtype=np.int32))
+        assert self.module.block.long_bn == np.array(1.0, dtype=np.float64)
+        assert self.module.block.string_bn == np.array("2", dtype="|S1")
+        assert self.module.block.ok == np.array(3, dtype=np.int32)
diff --git a/numpy/f2py/tests/test_compile_function.py b/numpy/f2py/tests/test_compile_function.py
index f76fd644807c487f953926e800227e75889258ba..3c16f319812f20d9c4b0472f47643a089b52f7c6 100644
--- a/numpy/f2py/tests/test_compile_function.py
+++ b/numpy/f2py/tests/test_compile_function.py
@@ -9,7 +9,6 @@ import pytest
  
  import numpy.f2py
  
-from numpy.testing import assert_equal
  from . import util
  
  
@@ -17,14 +16,13 @@ def setup_module():
      if not util.has_c_compiler():
          pytest.skip("Needs C compiler")
      if not util.has_f77_compiler():
-        pytest.skip('Needs FORTRAN 77 compiler')
+        pytest.skip("Needs FORTRAN 77 compiler")
  
  
  # extra_args can be a list (since gh-11937) or string.
  # also test absence of extra_args
-@pytest.mark.parametrize(
-    "extra_args", [['--noopt', '--debug'], '--noopt --debug', '']
-    )
+@pytest.mark.parametrize("extra_args",
+                         [["--noopt", "--debug"], "--noopt --debug", ""])
  @pytest.mark.leaks_references(reason="Imported module seems never deleted.")
  def test_f2py_init_compile(extra_args):
      # flush through the f2py __init__ compile() function code path as a
@@ -33,7 +31,7 @@ def test_f2py_init_compile(extra_args):
  
      # the Fortran 77 syntax requires 6 spaces before any commands, but
      # more space may be added/
-    fsource =  """
+    fsource = """
          integer function foo()
          foo = 10 + 5
          return
@@ -45,7 +43,7 @@ def test_f2py_init_compile(extra_args):
      modname = util.get_temp_module_name()
  
      cwd = os.getcwd()
-    target = os.path.join(moddir, str(uuid.uuid4()) + '.f')
+    target = os.path.join(moddir, str(uuid.uuid4()) + ".f")
      # try running compile() with and without a source_fn provided so
      # that the code path where a temporary file for writing Fortran
      # source is created is also explored
@@ -54,40 +52,35 @@ def test_f2py_init_compile(extra_args):
          # util.py, but don't actually use build_module() because it has
          # its own invocation of subprocess that circumvents the
          # f2py.compile code block under test
-        try:
-            os.chdir(moddir)
-            ret_val = numpy.f2py.compile(
-                fsource,
-                modulename=modname,
-                extra_args=extra_args,
-                source_fn=source_fn
-                )
-        finally:
-            os.chdir(cwd)
-
-        # check for compile success return value
-        assert_equal(ret_val, 0)
-
-        # we are not currently able to import the Python-Fortran
-        # interface module on Windows / Appveyor, even though we do get
-        # successful compilation on that platform with Python 3.x
-        if sys.platform != 'win32':
-            # check for sensible result of Fortran function; that means
-            # we can import the module name in Python and retrieve the
-            # result of the sum operation
-            return_check = import_module(modname)
-            calc_result = return_check.foo()
-            assert_equal(calc_result, 15)
-            # Removal from sys.modules, is not as such necessary. Even with
-            # removal, the module (dict) stays alive.
-            del sys.modules[modname]
+        with util.switchdir(moddir):
+            ret_val = numpy.f2py.compile(fsource,
+                                         modulename=modname,
+                                         extra_args=extra_args,
+                                         source_fn=source_fn)
+
+            # check for compile success return value
+            assert ret_val == 0
+
+    # we are not currently able to import the Python-Fortran
+    # interface module on Windows / Appveyor, even though we do get
+    # successful compilation on that platform with Python 3.x
+    if sys.platform != "win32":
+        # check for sensible result of Fortran function; that means
+        # we can import the module name in Python and retrieve the
+        # result of the sum operation
+        return_check = import_module(modname)
+        calc_result = return_check.foo()
+        assert calc_result == 15
+        # Removal from sys.modules is not strictly necessary: even with
+        # removal, the module (dict) stays alive.
+        del sys.modules[modname]
  
  
  def test_f2py_init_compile_failure():
      # verify an appropriate integer status value returned by
      # f2py.compile() when invalid Fortran is provided
      ret_val = numpy.f2py.compile(b"invalid")
-    assert_equal(ret_val, 1)
+    assert ret_val == 1
  
  
  def test_f2py_init_compile_bad_cmd():
@@ -99,27 +92,26 @@ def test_f2py_init_compile_bad_cmd():
      # downstream NOTE: how bad of an idea is this patching?
      try:
          temp = sys.executable
-        sys.executable = 'does not exist'
+        sys.executable = "does not exist"
  
          # the OSError should take precedence over invalid Fortran
          ret_val = numpy.f2py.compile(b"invalid")
-        assert_equal(ret_val, 127)
+        assert ret_val == 127
      finally:
          sys.executable = temp
  
  
-@pytest.mark.parametrize('fsource',
-        ['program test_f2py\nend program test_f2py',
-         b'program test_f2py\nend program test_f2py',])
+@pytest.mark.parametrize(
+    "fsource",
+    [
+        "program test_f2py\nend program test_f2py",
+        b"program test_f2py\nend program test_f2py",
+    ],
+)
  def test_compile_from_strings(tmpdir, fsource):
      # Make sure we can compile str and bytes gh-12796
-    cwd = os.getcwd()
-    try:
-        os.chdir(str(tmpdir))
-        ret_val = numpy.f2py.compile(
-                fsource,
-                modulename='test_compile_from_strings',
-                extension='.f90')
-        assert_equal(ret_val, 0)
-    finally:
-        os.chdir(cwd)
+    with util.switchdir(tmpdir):
+        ret_val = numpy.f2py.compile(fsource,
+                                     modulename="test_compile_from_strings",
+                                     extension=".f90")
+        assert ret_val == 0
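Note on `util.switchdir`: the hunks above replace the manual `os.getcwd()` / `try` / `os.chdir()` / `finally` pattern with a `with util.switchdir(...)` block. The helper's body is not part of this section, so the following is only a minimal sketch of how such a context manager could be written, assuming it does nothing more than change into the given directory and restore the previous one on exit. The names `switchdir` and `demo` below are illustrative stand-ins, not the code from `util.py`.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def switchdir(path):
    """Temporarily change the working directory.

    Hypothetical stand-in for util.switchdir; the real helper may differ.
    """
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the body raises.
        os.chdir(previous)


def demo():
    """Return the working directory before, inside, and after the block."""
    before = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        with switchdir(tmp):
            inside = Path.cwd()
        after = Path.cwd()
    return before, inside, after
```

Because the restore step sits in a `finally` clause, a failing assertion inside the `with` block still leaves the process in its original directory, which is the same guarantee the removed `try`/`finally` code gave.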
diff --git a/numpy/f2py/tests/test_crackfortran.py b/numpy/f2py/tests/test_crackfortran.py
index a78b5da914822d3fba42a7dc8839433192e30c02..ea618bf333b14d6f7fb9976bcb18d245cd4fab54 100644
--- a/numpy/f2py/tests/test_crackfortran.py
+++ b/numpy/f2py/tests/test_crackfortran.py
@@ -1,6 +1,5 @@
  import pytest
  import numpy as np
-from numpy.testing import assert_array_equal, assert_equal
  from numpy.f2py.crackfortran import markinnerspaces
  from . import util
  from numpy.f2py import crackfortran
@@ -10,163 +9,118 @@ import textwrap
  class TestNoSpace(util.F2PyTest):
      # issue gh-15035: add handling for endsubroutine, endfunction with no space
      # between "end" and the block name
-    code = """
-        subroutine subb(k)
-          real(8), intent(inout) :: k(:)
-          k=k+1
-        endsubroutine
-
-        subroutine subc(w,k)
-          real(8), intent(in) :: w(:)
-          real(8), intent(out) :: k(size(w))
-          k=w+1
-        endsubroutine
-
-        function t0(value)
-          character value
-          character t0
-          t0 = value
-        endfunction
-    """
+    sources = [util.getpath("tests", "src", "crackfortran", "gh15035.f")]
  
      def test_module(self):
          k = np.array([1, 2, 3], dtype=np.float64)
          w = np.array([1, 2, 3], dtype=np.float64)
          self.module.subb(k)
-        assert_array_equal(k, w + 1)
+        assert np.allclose(k, w + 1)
          self.module.subc([w, k])
-        assert_array_equal(k, w + 1)
-        assert self.module.t0(23) == b'2'
-
-
-class TestPublicPrivate():
-
-    def test_defaultPrivate(self, tmp_path):
-        f_path = tmp_path / "mod.f90"
-        with f_path.open('w') as ff:
-            ff.write(textwrap.dedent("""\
-            module foo
-              private
-              integer :: a
-              public :: setA
-              integer :: b
-            contains
-              subroutine setA(v)
-                integer, intent(in) :: v
-                a = v
-              end subroutine setA
-            end module foo
-            """))
-        mod = crackfortran.crackfortran([str(f_path)])
+        assert np.allclose(k, w + 1)
+        assert self.module.t0(23) == b"2"
+
+
+class TestPublicPrivate:
+    def test_defaultPrivate(self):
+        fpath = util.getpath("tests", "src", "crackfortran", "privatemod.f90")
+        mod = crackfortran.crackfortran([str(fpath)])
          assert len(mod) == 1
          mod = mod[0]
-        assert 'private' in mod['vars']['a']['attrspec']
-        assert 'public' not in mod['vars']['a']['attrspec']
-        assert 'private' in mod['vars']['b']['attrspec']
-        assert 'public' not in mod['vars']['b']['attrspec']
-        assert 'private' not in mod['vars']['seta']['attrspec']
-        assert 'public' in mod['vars']['seta']['attrspec']
+        assert "private" in mod["vars"]["a"]["attrspec"]
+        assert "public" not in mod["vars"]["a"]["attrspec"]
+        assert "private" in mod["vars"]["b"]["attrspec"]
+        assert "public" not in mod["vars"]["b"]["attrspec"]
+        assert "private" not in mod["vars"]["seta"]["attrspec"]
+        assert "public" in mod["vars"]["seta"]["attrspec"]
  
      def test_defaultPublic(self, tmp_path):
-        f_path = tmp_path / "mod.f90"
-        with f_path.open('w') as ff:
-            ff.write(textwrap.dedent("""\
-            module foo
-              public
-              integer, private :: a
-              public :: setA
-            contains
-              subroutine setA(v)
-                integer, intent(in) :: v
-                a = v
-              end subroutine setA
-            end module foo
-            """))
-        mod = crackfortran.crackfortran([str(f_path)])
+        fpath = util.getpath("tests", "src", "crackfortran", "publicmod.f90")
+        mod = crackfortran.crackfortran([str(fpath)])
          assert len(mod) == 1
          mod = mod[0]
-        assert 'private' in mod['vars']['a']['attrspec']
-        assert 'public' not in mod['vars']['a']['attrspec']
-        assert 'private' not in mod['vars']['seta']['attrspec']
-        assert 'public' in mod['vars']['seta']['attrspec']
+        assert "private" in mod["vars"]["a"]["attrspec"]
+        assert "public" not in mod["vars"]["a"]["attrspec"]
+        assert "private" not in mod["vars"]["seta"]["attrspec"]
+        assert "public" in mod["vars"]["seta"]["attrspec"]
+
+    def test_access_type(self, tmp_path):
+        fpath = util.getpath("tests", "src", "crackfortran", "accesstype.f90")
+        mod = crackfortran.crackfortran([str(fpath)])
+        assert len(mod) == 1
+        tt = mod[0]['vars']
+        assert set(tt['a']['attrspec']) == {'private', 'bind(c)'}
+        assert set(tt['b_']['attrspec']) == {'public', 'bind(c)'}
+        assert set(tt['c']['attrspec']) == {'public'}
+
+
+class TestModuleProcedure():
+    def test_moduleOperators(self, tmp_path):
+        fpath = util.getpath("tests", "src", "crackfortran", "operators.f90")
+        mod = crackfortran.crackfortran([str(fpath)])
+        assert len(mod) == 1
+        mod = mod[0]
+        assert "body" in mod and len(mod["body"]) == 9
+        assert mod["body"][1]["name"] == "operator(.item.)"
+        assert "implementedby" in mod["body"][1]
+        assert mod["body"][1]["implementedby"] == \
+            ["item_int", "item_real"]
+        assert mod["body"][2]["name"] == "operator(==)"
+        assert "implementedby" in mod["body"][2]
+        assert mod["body"][2]["implementedby"] == ["items_are_equal"]
+        assert mod["body"][3]["name"] == "assignment(=)"
+        assert "implementedby" in mod["body"][3]
+        assert mod["body"][3]["implementedby"] == \
+            ["get_int", "get_real"]
  
  
  class TestExternal(util.F2PyTest):
      # issue gh-17859: add external attribute support
-    code = """
-        integer(8) function external_as_statement(fcn)
-        implicit none
-        external fcn
-        integer(8) :: fcn
-        external_as_statement = fcn(0)
-        end
-
-        integer(8) function external_as_attribute(fcn)
-        implicit none
-        integer(8), external :: fcn
-        external_as_attribute = fcn(0)
-        end
-    """
+    sources = [util.getpath("tests", "src", "crackfortran", "gh17859.f")]
  
      def test_external_as_statement(self):
          def incr(x):
              return x + 123
+
          r = self.module.external_as_statement(incr)
          assert r == 123
  
      def test_external_as_attribute(self):
          def incr(x):
              return x + 123
+
          r = self.module.external_as_attribute(incr)
          assert r == 123
  
  
  class TestCrackFortran(util.F2PyTest):
-
-    suffix = '.f90'
-
-    code = textwrap.dedent("""
-      subroutine gh2848( &
-        ! first 2 parameters
-        par1, par2,&
-        ! last 2 parameters
-        par3, par4)
-
-        integer, intent(in)  :: par1, par2
-        integer, intent(out) :: par3, par4
-
-        par3 = par1
-        par4 = par2
-
-      end subroutine gh2848
-    """)
+    # gh-2848: commented lines between parameters in subroutine parameter lists
+    sources = [util.getpath("tests", "src", "crackfortran", "gh2848.f90")]
  
      def test_gh2848(self):
          r = self.module.gh2848(1, 2)
          assert r == (1, 2)
  
  
-class TestMarkinnerspaces():
-    # issue #14118: markinnerspaces does not handle multiple quotations
+class TestMarkinnerspaces:
+    # gh-14118: markinnerspaces does not handle multiple quotations
  
      def test_do_not_touch_normal_spaces(self):
          test_list = ["a ", " a", "a b c", "'abcdefghij'"]
          for i in test_list:
-            assert_equal(markinnerspaces(i), i)
+            assert markinnerspaces(i) == i
  
      def test_one_relevant_space(self):
-        assert_equal(markinnerspaces("a 'b c' \\\' \\\'"), "a 'b@_@c' \\' \\'")
-        assert_equal(markinnerspaces(r'a "b c" \" \"'), r'a "b@_@c" \" \"')
+        assert markinnerspaces("a 'b c' \\' \\'") == "a 'b@_@c' \\' \\'"
+        assert markinnerspaces(r'a "b c" \" \"') == r'a "b@_@c" \" \"'
  
      def test_ignore_inner_quotes(self):
-        assert_equal(markinnerspaces('a \'b c" " d\' e'),
-                     "a 'b@_@c\"@_@\"@_@d' e")
-        assert_equal(markinnerspaces('a "b c\' \' d" e'),
-                     "a \"b@_@c'@_@'@_@d\" e")
+        assert markinnerspaces("a 'b c\" \" d' e") == "a 'b@_@c\"@_@\"@_@d' e"
+        assert markinnerspaces("a \"b c' ' d\" e") == "a \"b@_@c'@_@'@_@d\" e"
  
      def test_multiple_relevant_spaces(self):
-        assert_equal(markinnerspaces("a 'b c' 'd e'"), "a 'b@_@c' 'd@_@e'")
-        assert_equal(markinnerspaces(r'a "b c" "d e"'), r'a "b@_@c" "d@_@e"')
+        assert markinnerspaces("a 'b c' 'd e'") == "a 'b@_@c' 'd@_@e'"
+        assert markinnerspaces(r'a "b c" "d e"') == r'a "b@_@c" "d@_@e"'
  
  
  class TestDimSpec(util.F2PyTest):
@@ -200,7 +154,7 @@ class TestDimSpec(util.F2PyTest):
  
      """
  
-    suffix = '.f90'
+    suffix = ".f90"
  
      code_template = textwrap.dedent("""
        function get_arr_size_{count}(a, n) result (length)
@@ -228,7 +182,7 @@ class TestDimSpec(util.F2PyTest):
      nonlinear_dimspecs = ["2*n:3*n*n+2*n"]
      all_dimspecs = linear_dimspecs + nonlinear_dimspecs
  
-    code = ''
+    code = ""
      for count, dimspec in enumerate(all_dimspecs):
          lst = [(d.split(":")[0] if ":" in d else "1") for d in dimspec.split(',')]
          code += code_template.format(
@@ -237,22 +191,22 @@ class TestDimSpec(util.F2PyTest):
              first=", ".join(lst),
          )
  
-    @pytest.mark.parametrize('dimspec', all_dimspecs)
+    @pytest.mark.parametrize("dimspec", all_dimspecs)
      def test_array_size(self, dimspec):
  
          count = self.all_dimspecs.index(dimspec)
-        get_arr_size = getattr(self.module, f'get_arr_size_{count}')
+        get_arr_size = getattr(self.module, f"get_arr_size_{count}")
  
          for n in [1, 2, 3, 4, 5]:
              sz, a = get_arr_size(n)
              assert a.size == sz
  
-    @pytest.mark.parametrize('dimspec', all_dimspecs)
+    @pytest.mark.parametrize("dimspec", all_dimspecs)
      def test_inv_array_size(self, dimspec):
  
          count = self.all_dimspecs.index(dimspec)
-        get_arr_size = getattr(self.module, f'get_arr_size_{count}')
-        get_inv_arr_size = getattr(self.module, f'get_inv_arr_size_{count}')
+        get_arr_size = getattr(self.module, f"get_arr_size_{count}")
+        get_inv_arr_size = getattr(self.module, f"get_inv_arr_size_{count}")
  
          for n in [1, 2, 3, 4, 5]:
              sz, a = get_arr_size(n)
@@ -271,18 +225,9 @@ class TestDimSpec(util.F2PyTest):
              assert sz == sz1, (n, n1, sz, sz1)
  
  
-class TestModuleDeclaration():
+class TestModuleDeclaration:
      def test_dependencies(self, tmp_path):
-        f_path = tmp_path / "mod.f90"
-        with f_path.open('w') as ff:
-            ff.write(textwrap.dedent("""\
-            module foo
-              type bar
-                character(len = 4) :: text
-              end type bar
-              type(bar), parameter :: abar = bar('abar')
-            end module foo
-            """))
-        mod = crackfortran.crackfortran([str(f_path)])
+        fpath = util.getpath("tests", "src", "crackfortran", "foo_deps.f90")
+        mod = crackfortran.crackfortran([str(fpath)])
          assert len(mod) == 1
-        assert mod[0]['vars']['abar']['='] == "bar('abar')"
+        assert mod[0]["vars"]["abar"]["="] == "bar('abar')"
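Note on `util.getpath`: throughout these hunks the per-module `_path` helpers, which joined `os.path.dirname(__file__)` with the given components, are replaced by `util.getpath("tests", "src", ...)`. The result is passed through `str(fpath)` here and is used with `.read_text()` in `test_f2py2e.py`, which suggests it returns a `pathlib.Path`. The helper itself is not shown in this section, so the sketch below only illustrates the idea under that assumption. The `root` parameter and the `demo` function are hypothetical and exist only to keep the example self-contained.

```python
import tempfile
from pathlib import Path


def getpath(*parts, root):
    """Join path components under a root directory and return a Path.

    Hypothetical stand-in for util.getpath; how the real helper picks
    its root directory is not visible in this diff.
    """
    return Path(root).joinpath(*parts)


def demo():
    """Create a small source file under a temp root and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        src = getpath("tests", "src", "demo", "foo.f90", root=tmp)
        src.parent.mkdir(parents=True)
        src.write_text("end\n", encoding="ascii")
        # A Path can be read directly or converted with str() for
        # APIs that expect plain strings.
        return src.name, src.read_text(encoding="ascii"), isinstance(str(src), str)
```

Returning a `Path` lets callers choose between path methods and a plain string, which matches how the tests above use the value in both ways.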
diff --git a/numpy/f2py/tests/test_f2cmap.py b/numpy/f2py/tests/test_f2cmap.py
new file mode 100644
index 0000000..d2967e4
--- /dev/null
+++ b/numpy/f2py/tests/test_f2cmap.py
@@ -0,0 +1,15 @@
+from . import util
+import numpy as np
+
+class TestF2Cmap(util.F2PyTest):
+    sources = [
+        util.getpath("tests", "src", "f2cmap", "isoFortranEnvMap.f90"),
+        util.getpath("tests", "src", "f2cmap", ".f2py_f2cmap")
+    ]
+
+    # gh-15095
+    def test_long_long_map(self):
+        inp = np.ones(3)
+        out = self.module.func1(inp)
+        exp_out = 3
+        assert out == exp_out
diff --git a/numpy/f2py/tests/test_f2py2e.py b/numpy/f2py/tests/test_f2py2e.py
new file mode 100644
index 0000000..9de043d
--- /dev/null
+++ b/numpy/f2py/tests/test_f2py2e.py
@@ -0,0 +1,748 @@
+import textwrap, re, sys, subprocess, shlex
+from pathlib import Path
+from collections import namedtuple
+
+import pytest
+
+from . import util
+from numpy.f2py.f2py2e import main as f2pycli
+
+#########################
+# CLI utils and classes #
+#########################
+
+PPaths = namedtuple("PPaths", "finp, f90inp, pyf, wrap77, wrap90, cmodf")
+
+
+def get_io_paths(fname_inp, mname="untitled"):
+    """Takes in a temporary file for testing and returns the expected output and input paths
+
+    Here expected output is essentially one of any of the possible generated
+    files.
+
+    .. note::
+
+         Since this does not actually run f2py, none of these are guaranteed to
+         exist, and module names are typically incorrect
+
+    Parameters
+    ----------
+    fname_inp : str
+                The input filename
+    mname : str, optional
+                The name of the module, untitled by default
+
+    Returns
+    -------
+    genp : NamedTuple PPaths
+            The possible paths which are generated, not all of which exist
+    """
+    bpath = Path(fname_inp)
+    return PPaths(
+        finp=bpath.with_suffix(".f"),
+        f90inp=bpath.with_suffix(".f90"),
+        pyf=bpath.with_suffix(".pyf"),
+        wrap77=bpath.with_name(f"{mname}-f2pywrappers.f"),
+        wrap90=bpath.with_name(f"{mname}-f2pywrappers2.f90"),
+        cmodf=bpath.with_name(f"{mname}module.c"),
+    )
+
+
+##########################
+# CLI Fixtures and Tests #
+##########################
+
+
+@pytest.fixture(scope="session")
+def hello_world_f90(tmpdir_factory):
+    """Generates a single f90 file for testing"""
+    fdat = util.getpath("tests", "src", "cli", "hiworld.f90").read_text()
+    fn = tmpdir_factory.getbasetemp() / "hello.f90"
+    fn.write_text(fdat, encoding="ascii")
+    return fn
+
+
+@pytest.fixture(scope="session")
+def hello_world_f77(tmpdir_factory):
+    """Generates a single f77 file for testing"""
+    fdat = util.getpath("tests", "src", "cli", "hi77.f").read_text()
+    fn = tmpdir_factory.getbasetemp() / "hello.f"
+    fn.write_text(fdat, encoding="ascii")
+    return fn
+
+
+@pytest.fixture(scope="session")
+def retreal_f77(tmpdir_factory):
+    """Generates a single f77 file for testing"""
+    fdat = util.getpath("tests", "src", "return_real", "foo77.f").read_text()
+    fn = tmpdir_factory.getbasetemp() / "foo.f"
+    fn.write_text(fdat, encoding="ascii")
+    return fn
+
+
+def test_gen_pyf(capfd, hello_world_f90, monkeypatch):
+    """Ensures that a signature file is generated via the CLI
+    CLI :: -h
+    """
+    ipath = Path(hello_world_f90)
+    opath = Path(hello_world_f90).stem + ".pyf"
+    monkeypatch.setattr(sys, "argv", f'f2py -h {opath} {ipath}'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()  # Generate wrappers
+        out, _ = capfd.readouterr()
+        assert "Saving signatures to file" in out
+        assert Path(f'{opath}').exists()
+
+
+def test_gen_pyf_stdout(capfd, hello_world_f90, monkeypatch):
+    """Ensures that a signature file can be dumped to stdout
+    CLI :: -h
+    """
+    ipath = Path(hello_world_f90)
+    monkeypatch.setattr(sys, "argv", f'f2py -h stdout {ipath}'.split())
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "Saving signatures to file" in out
+
+
+def test_gen_pyf_no_overwrite(capfd, hello_world_f90, monkeypatch):
+    """Ensures that the CLI refuses to overwrite signature files
+    CLI :: -h without --overwrite-signature
+    """
+    ipath = Path(hello_world_f90)
+    monkeypatch.setattr(sys, "argv", f'f2py -h faker.pyf {ipath}'.split())
+
+    with util.switchdir(ipath.parent):
+        Path("faker.pyf").write_text("Fake news", encoding="ascii")
+        with pytest.raises(SystemExit):
+            f2pycli()  # Refuse to overwrite
+        _, err = capfd.readouterr()
+        assert "Use --overwrite-signature to overwrite" in err
+
+
+@pytest.mark.xfail
+def test_f2py_skip(capfd, retreal_f77, monkeypatch):
+    """Tests that functions can be skipped
+    CLI :: skip:
+    """
+    foutl = get_io_paths(retreal_f77, mname="test")
+    ipath = foutl.finp
+    toskip = "t0 t4 t8 sd s8 s4"
+    remaining = "td s0"
+    monkeypatch.setattr(
+        sys, "argv",
+        f'f2py {ipath} -m test skip: {toskip}'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, err = capfd.readouterr()
+        for skey in toskip.split():
+            assert (
+                f'buildmodule: Could not found the body of interfaced routine "{skey}". Skipping.'
+                in err)
+        for rkey in remaining.split():
+            assert f'Constructing wrapper function "{rkey}"' in out
+
+
+def test_f2py_only(capfd, retreal_f77, monkeypatch):
+    """Test that functions can be kept by only:
+    CLI :: only:
+    """
+    foutl = get_io_paths(retreal_f77, mname="test")
+    ipath = foutl.finp
+    toskip = "t0 t4 t8 sd s8 s4"
+    tokeep = "td s0"
+    monkeypatch.setattr(
+        sys, "argv",
+        f'f2py {ipath} -m test only: {tokeep}'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, err = capfd.readouterr()
+        for skey in toskip.split():
+            assert (
+                f'buildmodule: Could not find the body of interfaced routine "{skey}". Skipping.'
+                in err)
+        for rkey in tokeep.split():
+            assert f'Constructing wrapper function "{rkey}"' in out
+
+
+def test_file_processing_switch(capfd, hello_world_f90, retreal_f77,
+                                monkeypatch):
+    """Tests that it is possible to return to file processing mode
+    CLI :: :
+    BUG: numpy-gh #20520
+    """
+    foutl = get_io_paths(retreal_f77, mname="test")
+    ipath = foutl.finp
+    toskip = "t0 t4 t8 sd s8 s4"
+    ipath2 = Path(hello_world_f90)
+    tokeep = "td s0 hi"  # hi is in ipath2
+    mname = "blah"
+    monkeypatch.setattr(
+        sys,
+        "argv",
+        f'f2py {ipath} -m {mname} only: {tokeep} : {ipath2}'.split(
+        ),
+    )
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, err = capfd.readouterr()
+        for skey in toskip.split():
+            assert (
+                f'buildmodule: Could not find the body of interfaced routine "{skey}". Skipping.'
+                in err)
+        for rkey in tokeep.split():
+            assert f'Constructing wrapper function "{rkey}"' in out
+
+
+def test_mod_gen_f77(capfd, hello_world_f90, monkeypatch):
+    """Checks the generation of files based on a module name
+    CLI :: -m
+    """
+    MNAME = "hi"
+    foutl = get_io_paths(hello_world_f90, mname=MNAME)
+    ipath = foutl.f90inp
+    monkeypatch.setattr(sys, "argv", f'f2py {ipath} -m {MNAME}'.split())
+    with util.switchdir(ipath.parent):
+        f2pycli()
+
+    # Always generate C module
+    assert Path.exists(foutl.cmodf)
+    # File contains a function, check for F77 wrappers
+    assert Path.exists(foutl.wrap77)
+
+
+def test_lower_cmod(capfd, hello_world_f77, monkeypatch):
+    """Ensures that names are lowercased by flag or when -h is present
+
+    CLI :: --[no-]lower
+    """
+    foutl = get_io_paths(hello_world_f77, mname="test")
+    ipath = foutl.finp
+    capshi = re.compile(r"HI\(\)")
+    capslo = re.compile(r"hi\(\)")
+    # Case I: --lower is passed
+    monkeypatch.setattr(sys, "argv", f'f2py {ipath} -m test --lower'.split())
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert capslo.search(out) is not None
+        assert capshi.search(out) is None
+    # Case II: --no-lower is passed
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py {ipath} -m test --no-lower'.split())
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert capslo.search(out) is None
+        assert capshi.search(out) is not None
+
+
+def test_lower_sig(capfd, hello_world_f77, monkeypatch):
+    """Ensures that names in signature files are lowercased by flag or when -h is present
+
+    CLI :: --[no-]lower -h
+    """
+    foutl = get_io_paths(hello_world_f77, mname="test")
+    ipath = foutl.finp
+    # Signature files
+    capshi = re.compile(r"Block: HI")
+    capslo = re.compile(r"Block: hi")
+    # Case I: --lower is implied by -h
+    # TODO: Clean up to prevent passing --overwrite-signature
+    monkeypatch.setattr(
+        sys,
+        "argv",
+        f'f2py {ipath} -h {foutl.pyf} -m test --overwrite-signature'.split(),
+    )
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert capslo.search(out) is not None
+        assert capshi.search(out) is None
+
+    # Case II: --no-lower overrides -h
+    monkeypatch.setattr(
+        sys,
+        "argv",
+        f'f2py {ipath} -h {foutl.pyf} -m test --overwrite-signature --no-lower'
+        .split(),
+    )
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert capslo.search(out) is None
+        assert capshi.search(out) is not None
+
+
+def test_build_dir(capfd, hello_world_f90, monkeypatch):
+    """Ensures that the build directory can be specified
+
+    CLI :: --build-dir
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    odir = "tttmp"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --build-dir {odir}'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert f"Wrote C/API module \"{mname}\"" in out
+
+
+def test_overwrite(capfd, hello_world_f90, monkeypatch):
+    """Ensures that an existing signature file can be overwritten
+
+    CLI :: --overwrite-signature
+    """
+    ipath = Path(hello_world_f90)
+    monkeypatch.setattr(
+        sys, "argv",
+        f'f2py -h faker.pyf {ipath} --overwrite-signature'.split())
+
+    with util.switchdir(ipath.parent):
+        Path("faker.pyf").write_text("Fake news", encoding="ascii")
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "Saving signatures to file" in out
+
+
+def test_latexdoc(capfd, hello_world_f90, monkeypatch):
+    """Ensures that TeX documentation is written out
+
+    CLI :: --latex-doc
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --latex-doc'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "Documentation is saved to file" in out
+        with Path(f"{mname}module.tex").open() as otex:
+            assert "\\documentclass" in otex.read()
+
+
+def test_nolatexdoc(capfd, hello_world_f90, monkeypatch):
+    """Ensures that TeX documentation is not written out
+
+    CLI :: --no-latex-doc
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --no-latex-doc'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "Documentation is saved to file" not in out
+
+
+def test_shortlatex(capfd, hello_world_f90, monkeypatch):
+    """Ensures that truncated documentation is written out
+
+    TODO: Test to ensure this has no effect without --latex-doc
+    CLI :: --latex-doc --short-latex
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(
+        sys,
+        "argv",
+        f'f2py -m {mname} {ipath} --latex-doc --short-latex'.split(),
+    )
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "Documentation is saved to file" in out
+        with Path(f"./{mname}module.tex").open() as otex:
+            assert "\\documentclass" not in otex.read()
+
+
+def test_restdoc(capfd, hello_world_f90, monkeypatch):
+    """Ensures that ReST documentation is written out
+
+    CLI :: --rest-doc
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --rest-doc'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "ReST Documentation is saved to file" in out
+        with Path(f"./{mname}module.rest").open() as orst:
+            assert r".. -*- rest -*-" in orst.read()
+
+
+def test_norestexdoc(capfd, hello_world_f90, monkeypatch):
+    """Ensures that ReST documentation is not written out
+
+    CLI :: --no-rest-doc
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --no-rest-doc'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "ReST Documentation is saved to file" not in out
+
+
+def test_debugcapi(capfd, hello_world_f90, monkeypatch):
+    """Ensures that debugging wrappers are written
+
+    CLI :: --debug-capi
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --debug-capi'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        with Path(f"./{mname}module.c").open() as ocmod:
+            assert r"#define DEBUGCFUNCS" in ocmod.read()
+
+
+@pytest.mark.xfail(reason="Consistently fails on CI.")
+def test_debugcapi_bld(hello_world_f90, monkeypatch):
+    """Ensures that debugging wrappers work
+
+    CLI :: --debug-capi -c
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} -c --debug-capi'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        cmd_run = shlex.split("python3 -c \"import blah; blah.hi()\"")
+        rout = subprocess.run(cmd_run, capture_output=True, encoding='UTF-8')
+        eout = ' Hello World\n'
+        eerr = textwrap.dedent("""\
+debug-capi:Python C/API function blah.hi()
+debug-capi:float hi=:output,hidden,scalar
+debug-capi:hi=0
+debug-capi:Fortran subroutine `f2pywraphi(&hi)'
+debug-capi:hi=0
+debug-capi:Building return value.
+debug-capi:Python C/API function blah.hi: successful.
+debug-capi:Freeing memory.
+        """)
+        assert rout.stdout == eout
+        assert rout.stderr == eerr
+
+
+def test_wrapfunc_def(capfd, hello_world_f90, monkeypatch):
+    """Ensures that Fortran subroutine wrappers for F77 are included by default
+
+    CLI :: --[no-]wrap-functions
+    """
+    # Implied
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv", f'f2py -m {mname} {ipath}'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+    out, _ = capfd.readouterr()
+    assert r"Fortran 77 wrappers are saved to" in out
+
+    # Explicit
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --wrap-functions'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert r"Fortran 77 wrappers are saved to" in out
+
+
+def test_nowrapfunc(capfd, hello_world_f90, monkeypatch):
+    """Ensures that Fortran subroutine wrappers for F77 can be disabled
+
+    CLI :: --no-wrap-functions
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(sys, "argv",
+                        f'f2py -m {mname} {ipath} --no-wrap-functions'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert r"Fortran 77 wrappers are saved to" not in out
+
+
+def test_inclheader(capfd, hello_world_f90, monkeypatch):
+    """Ensures that extra headers are included in the generated C module
+
+    CLI :: -include
+    TODO: Document this in the help string
+    """
+    ipath = Path(hello_world_f90)
+    mname = "blah"
+    monkeypatch.setattr(
+        sys,
+        "argv",
+        f'f2py -m {mname} {ipath} -include<stdbool.h> -include<stdio.h> '.
+        split(),
+    )
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        with Path(f"./{mname}module.c").open() as ocmod:
+            ocmr = ocmod.read()
+            assert "#include <stdbool.h>" in ocmr
+            assert "#include <stdio.h>" in ocmr
+
+
+def test_inclpath():
+    """Add to the include directories
+
+    CLI :: --include-paths
+    """
+    # TODO: populate
+    pass
+
+
+def test_hlink():
+    """List the system resources available for linking
+
+    CLI :: --help-link
+    """
+    # TODO: populate
+    pass
+
+
+def test_f2cmap():
+    """Check that Fortran-to-Python KIND specs can be passed
+
+    CLI :: --f2cmap
+    """
+    # TODO: populate
+    pass
+
+
+def test_quiet(capfd, hello_world_f90, monkeypatch):
+    """Reduce verbosity
+
+    CLI :: --quiet
+    """
+    ipath = Path(hello_world_f90)
+    monkeypatch.setattr(sys, "argv", f'f2py -m blah {ipath} --quiet'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert len(out) == 0
+
+
+def test_verbose(capfd, hello_world_f90, monkeypatch):
+    """Increase verbosity
+
+    CLI :: --verbose
+    """
+    ipath = Path(hello_world_f90)
+    monkeypatch.setattr(sys, "argv", f'f2py -m blah {ipath} --verbose'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        out, _ = capfd.readouterr()
+        assert "analyzeline" in out
+
+
+def test_version(capfd, monkeypatch):
+    """Ensures that the version is printed
+
+    CLI :: -v
+    """
+    monkeypatch.setattr(sys, "argv", 'f2py -v'.split())
+    # TODO: f2py2e should not call sys.exit() after printing the version
+    with pytest.raises(SystemExit):
+        f2pycli()
+    out, _ = capfd.readouterr()
+    import numpy as np
+    assert np.__version__ == out.strip()
+
+
+@pytest.mark.xfail(reason="Consistently fails on CI.")
+def test_npdistop(hello_world_f90, monkeypatch):
+    """
+    CLI :: -c
+    """
+    ipath = Path(hello_world_f90)
+    monkeypatch.setattr(sys, "argv", f'f2py -m blah {ipath} -c'.split())
+
+    with util.switchdir(ipath.parent):
+        f2pycli()
+        cmd_run = shlex.split("python -c \"import blah; blah.hi()\"")
+        rout = subprocess.run(cmd_run, capture_output=True, encoding='UTF-8')
+        eout = ' Hello World\n'
+        assert rout.stdout == eout
+
+
+# NumPy distutils flags
+# TODO: These should be tested separately
+
+
+def test_npd_fcompiler():
+    """
+    CLI :: -c --fcompiler
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_compiler():
+    """
+    CLI :: -c --compiler
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_help_fcompiler():
+    """
+    CLI :: -c --help-fcompiler
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_f77exec():
+    """
+    CLI :: -c --f77exec
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_f90exec():
+    """
+    CLI :: -c --f90exec
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_f77flags():
+    """
+    CLI :: -c --f77flags
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_f90flags():
+    """
+    CLI :: -c --f90flags
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_opt():
+    """
+    CLI :: -c --opt
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_arch():
+    """
+    CLI :: -c --arch
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_noopt():
+    """
+    CLI :: -c --noopt
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_noarch():
+    """
+    CLI :: -c --noarch
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_debug():
+    """
+    CLI :: -c --debug
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_link_auto():
+    """
+    CLI :: -c --link-<resource>
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_lib():
+    """
+    CLI :: -c -L/path/to/lib/ -l<libname>
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_define():
+    """
+    CLI :: -D<define>
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_undefine():
+    """
+    CLI :: -U<name>
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_incl():
+    """
+    CLI :: -I/path/to/include/
+    """
+    # TODO: populate
+    pass
+
+
+def test_npd_linker():
+    """
+    CLI :: <filename>.o <filename>.so <filename>.a
+    """
+    # TODO: populate
+    pass
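The path bookkeeping done by `get_io_paths` in the new CLI tests can be tried in isolation. The following is a minimal sketch using only `pathlib`; the helper name `sketch_io_paths` and the sample path `/tmp/hello.f90` are illustrative assumptions, not part of the test suite. `PurePosixPath` is used so that no file is created or read and the result does not depend on the host platform.

```python
from collections import namedtuple
from pathlib import PurePosixPath

PPaths = namedtuple("PPaths", "finp, f90inp, pyf, wrap77, wrap90, cmodf")


def sketch_io_paths(fname_inp, mname="untitled"):
    """Derive the candidate input/output paths next to fname_inp (hypothetical helper)."""
    bpath = PurePosixPath(fname_inp)
    return PPaths(
        # Same stem, different suffix
        finp=bpath.with_suffix(".f"),
        f90inp=bpath.with_suffix(".f90"),
        pyf=bpath.with_suffix(".pyf"),
        # Same directory, name derived from the module name
        wrap77=bpath.with_name(f"{mname}-f2pywrappers.f"),
        wrap90=bpath.with_name(f"{mname}-f2pywrappers2.f90"),
        cmodf=bpath.with_name(f"{mname}module.c"),
    )


paths = sketch_io_paths("/tmp/hello.f90", mname="hi")
for field, value in paths._asdict().items():
    print(f"{field:7s} {value}")
```

Note that the suffix-based fields follow the input file name while the wrapper and C module fields follow the module name, which is why the tests pass `mname` explicitly.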
diff --git a/numpy/f2py/tests/test_kind.py b/numpy/f2py/tests/test_kind.py

index a7e2b28ed37c9cabb8dd2c8bd485466ae2552226..f0cb61fb6ce300e09c0129c1f74d04e51b4f94dd 100644 (file)
--- a/numpy/f2py/tests/test_kind.py
+++ b/numpy/f2py/tests/test_kind.py
@@ -1,32 +1,26 @@
  import os
  import pytest
  
-from numpy.testing import assert_
  from numpy.f2py.crackfortran import (
      _selected_int_kind_func as selected_int_kind,
-    _selected_real_kind_func as selected_real_kind
-    )
+    _selected_real_kind_func as selected_real_kind,
+)
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestKind(util.F2PyTest):
-    sources = [_path('src', 'kind', 'foo.f90')]
+    sources = [util.getpath("tests", "src", "kind", "foo.f90")]
  
-    @pytest.mark.slow
      def test_all(self):
          selectedrealkind = self.module.selectedrealkind
          selectedintkind = self.module.selectedintkind
  
          for i in range(40):
-            assert_(selectedintkind(i) in [selected_int_kind(i), -1],
-                    'selectedintkind(%s): expected %r but got %r' %
-                    (i, selected_int_kind(i), selectedintkind(i)))
+            assert selectedintkind(i) == selected_int_kind(
+                i
+            ), f"selectedintkind({i}): expected {selected_int_kind(i)!r} but got {selectedintkind(i)!r}"
  
          for i in range(20):
-            assert_(selectedrealkind(i) in [selected_real_kind(i), -1],
-                    'selectedrealkind(%s): expected %r but got %r' %
-                    (i, selected_real_kind(i), selectedrealkind(i)))
+            assert selectedrealkind(i) == selected_real_kind(
+                i
+            ), f"selectedrealkind({i}): expected {selected_real_kind(i)!r} but got {selectedrealkind(i)!r}"
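For context on what `selectedintkind` is compared against: Fortran's `SELECTED_INT_KIND(R)` returns the smallest integer kind able to represent every value n with -10^R < n < 10^R, or -1 if no such kind exists. The sketch below illustrates that rule under the assumption of two's-complement kinds of 1, 2, 4, 8 and 16 bytes. `selected_int_kind_sketch` is a hypothetical name; it is not NumPy's `_selected_int_kind_func`, whose thresholds may differ.

```python
def selected_int_kind_sketch(r):
    """Smallest assumed integer kind that can hold all n with |n| < 10**r."""
    for kind in (1, 2, 4, 8, 16):  # assumed byte widths, two's complement
        # A signed integer of this width holds values up to 2**(bits - 1) - 1,
        # so 10**r - 1 fits exactly when 10**r <= 2**(bits - 1).
        if 10**r <= 2 ** (8 * kind - 1):
            return kind
    return -1  # no assumed kind is wide enough


for r in (2, 4, 9, 18, 38, 39):
    print(r, selected_int_kind_sketch(r))
```

Because the updated assertions compare with `==` rather than also accepting `-1`, the compiled module and the Python-side helper have to agree for every tested value.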
diff --git a/numpy/f2py/tests/test_mixed.py b/numpy/f2py/tests/test_mixed.py

index 04266ca5b19078dd26306d052f88e76415975868..80653b7d2d7700927d8ad9c93d748c7026f9f9cc 100644 (file)
--- a/numpy/f2py/tests/test_mixed.py
+++ b/numpy/f2py/tests/test_mixed.py
@@ -2,23 +2,21 @@ import os
  import textwrap
  import pytest
  
-from numpy.testing import assert_, assert_equal, IS_PYPY
+from numpy.testing import IS_PYPY
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestMixed(util.F2PyTest):
-    sources = [_path('src', 'mixed', 'foo.f'),
-               _path('src', 'mixed', 'foo_fixed.f90'),
-               _path('src', 'mixed', 'foo_free.f90')]
+    sources = [
+        util.getpath("tests", "src", "mixed", "foo.f"),
+        util.getpath("tests", "src", "mixed", "foo_fixed.f90"),
+        util.getpath("tests", "src", "mixed", "foo_free.f90"),
+    ]
  
      def test_all(self):
-        assert_(self.module.bar11() == 11)
-        assert_(self.module.foo_fixed.bar12() == 12)
-        assert_(self.module.foo_free.bar13() == 13)
+        assert self.module.bar11() == 11
+        assert self.module.foo_fixed.bar12() == 12
+        assert self.module.foo_free.bar13() == 13
  
      @pytest.mark.xfail(IS_PYPY,
                         reason="PyPy cannot modify tp_doc after PyType_Ready")
@@ -32,4 +30,4 @@ class TestMixed(util.F2PyTest):
          -------
          a : int
          """)
-        assert_equal(self.module.bar11.__doc__, expected)
+        assert self.module.bar11.__doc__ == expected
diff --git a/numpy/f2py/tests/test_module_doc.py b/numpy/f2py/tests/test_module_doc.py

index 4b9555cee1fce4067afd6d910eec617a0f04907f..28822d405cc02ac2ce5cc214c27271a199612349 100644 (file)
--- a/numpy/f2py/tests/test_module_doc.py
+++ b/numpy/f2py/tests/test_module_doc.py
@@ -4,27 +4,24 @@ import pytest
  import textwrap
  
  from . import util
-from numpy.testing import assert_equal, IS_PYPY
-
-
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
+from numpy.testing import IS_PYPY
  
  
  class TestModuleDocString(util.F2PyTest):
-    sources = [_path('src', 'module_data', 'module_data_docstring.f90')]
+    sources = [
+        util.getpath("tests", "src", "module_data",
+                     "module_data_docstring.f90")
+    ]
  
-    @pytest.mark.skipif(sys.platform=='win32',
-                        reason='Fails with MinGW64 Gfortran (Issue #9673)')
+    @pytest.mark.skipif(sys.platform == "win32",
+                        reason="Fails with MinGW64 Gfortran (Issue #9673)")
      @pytest.mark.xfail(IS_PYPY,
                         reason="PyPy cannot modify tp_doc after PyType_Ready")
      def test_module_docstring(self):
-        assert_equal(self.module.mod.__doc__,
-                     textwrap.dedent('''\
+        assert self.module.mod.__doc__ == textwrap.dedent("""\
                       i : 'i'-scalar
                       x : 'i'-array(4)
                       a : 'f'-array(2,3)
                       b : 'f'-array(-1,-1), not allocated\x00
                       foo()\n
-                     Wrapper for ``foo``.\n\n''')
-                     )
+                     Wrapper for ``foo``.\n\n""")
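The docstring comparisons in these tests build the expected text with `textwrap.dedent` and a backslash directly after the opening quotes. A small sketch of that idiom follows, using a shortened two-line stand-in for the expected docstring; the variable name `expected` and the sample lines are illustrative only.

```python
import textwrap

expected = textwrap.dedent("""\
    i : 'i'-scalar
    x : 'i'-array(4)
    """)

# The backslash suppresses the newline after the opening quotes, and dedent
# strips the common leading indentation, so each line starts at column zero.
print(repr(expected))
```

This keeps the expected text indented with the surrounding code while still matching a docstring that has no leading whitespace.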
diff --git a/numpy/f2py/tests/test_parameter.py b/numpy/f2py/tests/test_parameter.py

index b6182716987b12ea9d3c982e5fc93ead2a5e1dc3..2f620eaa0722338a39807f91f2d6a3e61ea68ca9 100644 (file)
--- a/numpy/f2py/tests/test_parameter.py
+++ b/numpy/f2py/tests/test_parameter.py
@@ -2,115 +2,111 @@ import os
  import pytest
  
  import numpy as np
-from numpy.testing import assert_raises, assert_equal
  
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestParameters(util.F2PyTest):
      # Check that Fortran parameter (constant) declarations are handled correctly
-    sources = [_path('src', 'parameter', 'constant_real.f90'),
-               _path('src', 'parameter', 'constant_integer.f90'),
-               _path('src', 'parameter', 'constant_both.f90'),
-               _path('src', 'parameter', 'constant_compound.f90'),
-               _path('src', 'parameter', 'constant_non_compound.f90'),
+    sources = [
+        util.getpath("tests", "src", "parameter", "constant_real.f90"),
+        util.getpath("tests", "src", "parameter", "constant_integer.f90"),
+        util.getpath("tests", "src", "parameter", "constant_both.f90"),
+        util.getpath("tests", "src", "parameter", "constant_compound.f90"),
+        util.getpath("tests", "src", "parameter", "constant_non_compound.f90"),
      ]
  
      @pytest.mark.slow
      def test_constant_real_single(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.float32)[::2]
-        assert_raises(ValueError, self.module.foo_single, x)
+        pytest.raises(ValueError, self.module.foo_single, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.float32)
          self.module.foo_single(x)
-        assert_equal(x, [0 + 1 + 2*3, 1, 2])
+        assert np.allclose(x, [0 + 1 + 2 * 3, 1, 2])
  
      @pytest.mark.slow
      def test_constant_real_double(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.float64)[::2]
-        assert_raises(ValueError, self.module.foo_double, x)
+        pytest.raises(ValueError, self.module.foo_double, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.float64)
          self.module.foo_double(x)
-        assert_equal(x, [0 + 1 + 2*3, 1, 2])
+        assert np.allclose(x, [0 + 1 + 2 * 3, 1, 2])
  
      @pytest.mark.slow
      def test_constant_compound_int(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.int32)[::2]
-        assert_raises(ValueError, self.module.foo_compound_int, x)
+        pytest.raises(ValueError, self.module.foo_compound_int, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.int32)
          self.module.foo_compound_int(x)
-        assert_equal(x, [0 + 1 + 2*6, 1, 2])
+        assert np.allclose(x, [0 + 1 + 2 * 6, 1, 2])
  
      @pytest.mark.slow
      def test_constant_non_compound_int(self):
          # check values
          x = np.arange(4, dtype=np.int32)
          self.module.foo_non_compound_int(x)
-        assert_equal(x, [0 + 1 + 2 + 3*4, 1, 2, 3])
+        assert np.allclose(x, [0 + 1 + 2 + 3 * 4, 1, 2, 3])
  
      @pytest.mark.slow
      def test_constant_integer_int(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.int32)[::2]
-        assert_raises(ValueError, self.module.foo_int, x)
+        pytest.raises(ValueError, self.module.foo_int, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.int32)
          self.module.foo_int(x)
-        assert_equal(x, [0 + 1 + 2*3, 1, 2])
+        assert np.allclose(x, [0 + 1 + 2 * 3, 1, 2])
  
      @pytest.mark.slow
      def test_constant_integer_long(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.int64)[::2]
-        assert_raises(ValueError, self.module.foo_long, x)
+        pytest.raises(ValueError, self.module.foo_long, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.int64)
          self.module.foo_long(x)
-        assert_equal(x, [0 + 1 + 2*3, 1, 2])
+        assert np.allclose(x, [0 + 1 + 2 * 3, 1, 2])
  
      @pytest.mark.slow
      def test_constant_both(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.float64)[::2]
-        assert_raises(ValueError, self.module.foo, x)
+        pytest.raises(ValueError, self.module.foo, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.float64)
          self.module.foo(x)
-        assert_equal(x, [0 + 1*3*3 + 2*3*3, 1*3, 2*3])
+        assert np.allclose(x, [0 + 1 * 3 * 3 + 2 * 3 * 3, 1 * 3, 2 * 3])
  
      @pytest.mark.slow
      def test_constant_no(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.float64)[::2]
-        assert_raises(ValueError, self.module.foo_no, x)
+        pytest.raises(ValueError, self.module.foo_no, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.float64)
          self.module.foo_no(x)
-        assert_equal(x, [0 + 1*3*3 + 2*3*3, 1*3, 2*3])
+        assert np.allclose(x, [0 + 1 * 3 * 3 + 2 * 3 * 3, 1 * 3, 2 * 3])
  
      @pytest.mark.slow
      def test_constant_sum(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.float64)[::2]
-        assert_raises(ValueError, self.module.foo_sum, x)
+        pytest.raises(ValueError, self.module.foo_sum, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.float64)
          self.module.foo_sum(x)
-        assert_equal(x, [0 + 1*3*3 + 2*3*3, 1*3, 2*3])
+        assert np.allclose(x, [0 + 1 * 3 * 3 + 2 * 3 * 3, 1 * 3, 2 * 3])
diff --git a/numpy/f2py/tests/test_quoted_character.py b/numpy/f2py/tests/test_quoted_character.py

index 20c77666c59aaff5b21e17722ba35805270f1fb9..82671cd8e72f84733f5a28acdb4b5fb9d56a0a03 100644 (file)
--- a/numpy/f2py/tests/test_quoted_character.py
+++ b/numpy/f2py/tests/test_quoted_character.py
@@ -4,29 +4,13 @@
  import sys
  import pytest
  
-from numpy.testing import assert_equal
  from . import util
  
  
  class TestQuotedCharacter(util.F2PyTest):
-    code = """
-      SUBROUTINE FOO(OUT1, OUT2, OUT3, OUT4, OUT5, OUT6)
-      CHARACTER SINGLE, DOUBLE, SEMICOL, EXCLA, OPENPAR, CLOSEPAR
-      PARAMETER (SINGLE="'", DOUBLE='"', SEMICOL=';', EXCLA="!",
-     1           OPENPAR="(", CLOSEPAR=")")
-      CHARACTER OUT1, OUT2, OUT3, OUT4, OUT5, OUT6
-Cf2py intent(out) OUT1, OUT2, OUT3, OUT4, OUT5, OUT6
-      OUT1 = SINGLE
-      OUT2 = DOUBLE
-      OUT3 = SEMICOL
-      OUT4 = EXCLA
-      OUT5 = OPENPAR
-      OUT6 = CLOSEPAR
-      RETURN
-      END
-    """
+    sources = [util.getpath("tests", "src", "quoted_character", "foo.f")]
  
-    @pytest.mark.skipif(sys.platform=='win32',
-                        reason='Fails with MinGW64 Gfortran (Issue #9673)')
+    @pytest.mark.skipif(sys.platform == "win32",
+                        reason="Fails with MinGW64 Gfortran (Issue #9673)")
      def test_quoted_character(self):
-        assert_equal(self.module.foo(), (b"'", b'"', b';', b'!', b'(', b')'))
+        assert self.module.foo() == (b"'", b'"', b";", b"!", b"(", b")")
diff --git a/numpy/f2py/tests/test_regression.py b/numpy/f2py/tests/test_regression.py

index b91499e4adb3562ffabdda19a1fb74855194a36c..044f952f226830b642a2932dd6acc91e59775778 100644 (file)
--- a/numpy/f2py/tests/test_regression.py
+++ b/numpy/f2py/tests/test_regression.py
@@ -2,54 +2,65 @@ import os
  import pytest
  
  import numpy as np
-from numpy.testing import assert_, assert_raises, assert_equal, assert_string_equal
  
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestIntentInOut(util.F2PyTest):
      # Check that intent(in out) translates as intent(inout)
-    sources = [_path('src', 'regression', 'inout.f90')]
+    sources = [util.getpath("tests", "src", "regression", "inout.f90")]
  
      @pytest.mark.slow
      def test_inout(self):
          # non-contiguous should raise error
          x = np.arange(6, dtype=np.float32)[::2]
-        assert_raises(ValueError, self.module.foo, x)
+        pytest.raises(ValueError, self.module.foo, x)
  
          # check values with contiguous array
          x = np.arange(3, dtype=np.float32)
          self.module.foo(x)
-        assert_equal(x, [3, 1, 2])
+        assert np.allclose(x, [3, 1, 2])
+
+
+class TestNegativeBounds(util.F2PyTest):
+    # Check that negative bounds work correctly
+    sources = [util.getpath("tests", "src", "negative_bounds", "issue_20853.f90")]
+
+    @pytest.mark.slow
+    def test_negbound(self):
+        xvec = np.arange(12)
+        xlow = -6
+        xhigh = 4
+        # Calculate the upper bound,
+        # Keeping the 1 index in mind
+        def ubound(xl, xh):
+            return xh - xl + 1
+        rval = self.module.foo(is_=xlow, ie_=xhigh,
+                        arr=xvec[:ubound(xlow, xhigh)])
+        expval = np.arange(11, dtype = np.float32)
+        assert np.allclose(rval, expval)
  
  
  class TestNumpyVersionAttribute(util.F2PyTest):
      # Check that th attribute __f2py_numpy_version__ is present
      # in the compiled module and that has the value np.__version__.
-    sources = [_path('src', 'regression', 'inout.f90')]
+    sources = [util.getpath("tests", "src", "regression", "inout.f90")]
  
      @pytest.mark.slow
      def test_numpy_version_attribute(self):
  
          # Check that self.module has an attribute named "__f2py_numpy_version__"
-        assert_(hasattr(self.module, "__f2py_numpy_version__"),
-                msg="Fortran module does not have __f2py_numpy_version__")
+        assert hasattr(self.module, "__f2py_numpy_version__")
  
          # Check that the attribute __f2py_numpy_version__ is a string
-        assert_(isinstance(self.module.__f2py_numpy_version__, str),
-                msg="__f2py_numpy_version__ is not a string")
+        assert isinstance(self.module.__f2py_numpy_version__, str)
  
          # Check that __f2py_numpy_version__ has the value numpy.__version__
-        assert_string_equal(np.__version__, self.module.__f2py_numpy_version__)
+        assert np.__version__ == self.module.__f2py_numpy_version__
  
  
  def test_include_path():
      incdir = np.f2py.get_include()
      fnames_in_dir = os.listdir(incdir)
-    for fname in ('fortranobject.c', 'fortranobject.h'):
+    for fname in ("fortranobject.c", "fortranobject.h"):
          assert fname in fnames_in_dir
-
diff --git a/numpy/f2py/tests/test_return_character.py b/numpy/f2py/tests/test_return_character.py
index 2c999ed0b071c1ed0718bb860b43f3cb0dbc9dcc..21055faef61d5a7caeaa5108a0f2b785f53d812b 100644
--- a/numpy/f2py/tests/test_return_character.py
+++ b/numpy/f2py/tests/test_return_character.py
@@ -1,145 +1,45 @@
  import pytest
  
  from numpy import array
-from numpy.testing import assert_
  from . import util
  import platform
-IS_S390X = platform.machine() == 's390x'
  
+IS_S390X = platform.machine() == "s390x"
  
-class TestReturnCharacter(util.F2PyTest):
  
+class TestReturnCharacter(util.F2PyTest):
      def check_function(self, t, tname):
-        if tname in ['t0', 't1', 's0', 's1']:
-            assert_(t(23) == b'2')
-            r = t('ab')
-            assert_(r == b'a', repr(r))
-            r = t(array('ab'))
-            assert_(r == b'a', repr(r))
-            r = t(array(77, 'u1'))
-            assert_(r == b'M', repr(r))
-            #assert_(_raises(ValueError, t, array([77,87])))
-            #assert_(_raises(ValueError, t, array(77)))
-        elif tname in ['ts', 'ss']:
-            assert_(t(23) == b'23', repr(t(23)))
-            assert_(t('123456789abcdef') == b'123456789a')
-        elif tname in ['t5', 's5']:
-            assert_(t(23) == b'23', repr(t(23)))
-            assert_(t('ab') == b'ab', repr(t('ab')))
-            assert_(t('123456789abcdef') == b'12345')
+        if tname in ["t0", "t1", "s0", "s1"]:
+            assert t(23) == b"2"
+            r = t("ab")
+            assert r == b"a"
+            r = t(array("ab"))
+            assert r == b"a"
+            r = t(array(77, "u1"))
+            assert r == b"M"
+        elif tname in ["ts", "ss"]:
+            assert t(23) == b"23"
+            assert t("123456789abcdef") == b"123456789a"
+        elif tname in ["t5", "s5"]:
+            assert t(23) == b"23"
+            assert t("ab") == b"ab"
+            assert t("123456789abcdef") == b"12345"
          else:
              raise NotImplementedError
  
  
-class TestF77ReturnCharacter(TestReturnCharacter):
-    code = """
-       function t0(value)
-         character value
-         character t0
-         t0 = value
-       end
-       function t1(value)
-         character*1 value
-         character*1 t1
-         t1 = value
-       end
-       function t5(value)
-         character*5 value
-         character*5 t5
-         t5 = value
-       end
-       function ts(value)
-         character*(*) value
-         character*(*) ts
-         ts = value
-       end
-
-       subroutine s0(t0,value)
-         character value
-         character t0
-cf2py    intent(out) t0
-         t0 = value
-       end
-       subroutine s1(t1,value)
-         character*1 value
-         character*1 t1
-cf2py    intent(out) t1
-         t1 = value
-       end
-       subroutine s5(t5,value)
-         character*5 value
-         character*5 t5
-cf2py    intent(out) t5
-         t5 = value
-       end
-       subroutine ss(ts,value)
-         character*(*) value
-         character*10 ts
-cf2py    intent(out) ts
-         ts = value
-       end
-    """
+class TestFReturnCharacter(TestReturnCharacter):
+    sources = [
+        util.getpath("tests", "src", "return_character", "foo77.f"),
+        util.getpath("tests", "src", "return_character", "foo90.f90"),
+    ]
  
      @pytest.mark.xfail(IS_S390X, reason="callback returns ' '")
-    @pytest.mark.parametrize('name', 't0,t1,t5,s0,s1,s5,ss'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name", "t0,t1,t5,s0,s1,s5,ss".split(","))
+    def test_all_f77(self, name):
          self.check_function(getattr(self.module, name), name)
  
-
-class TestF90ReturnCharacter(TestReturnCharacter):
-    suffix = ".f90"
-    code = """
-module f90_return_char
-  contains
-       function t0(value)
-         character :: value
-         character :: t0
-         t0 = value
-       end function t0
-       function t1(value)
-         character(len=1) :: value
-         character(len=1) :: t1
-         t1 = value
-       end function t1
-       function t5(value)
-         character(len=5) :: value
-         character(len=5) :: t5
-         t5 = value
-       end function t5
-       function ts(value)
-         character(len=*) :: value
-         character(len=10) :: ts
-         ts = value
-       end function ts
-
-       subroutine s0(t0,value)
-         character :: value
-         character :: t0
-!f2py    intent(out) t0
-         t0 = value
-       end subroutine s0
-       subroutine s1(t1,value)
-         character(len=1) :: value
-         character(len=1) :: t1
-!f2py    intent(out) t1
-         t1 = value
-       end subroutine s1
-       subroutine s5(t5,value)
-         character(len=5) :: value
-         character(len=5) :: t5
-!f2py    intent(out) t5
-         t5 = value
-       end subroutine s5
-       subroutine ss(ts,value)
-         character(len=*) :: value
-         character(len=10) :: ts
-!f2py    intent(out) ts
-         ts = value
-       end subroutine ss
-end module f90_return_char
-    """
-
      @pytest.mark.xfail(IS_S390X, reason="callback returns ' '")
-    @pytest.mark.parametrize('name', 't0,t1,t5,ts,s0,s1,s5,ss'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name", "t0,t1,t5,ts,s0,s1,s5,ss".split(","))
+    def test_all_f90(self, name):
          self.check_function(getattr(self.module.f90_return_char, name), name)
diff --git a/numpy/f2py/tests/test_return_complex.py b/numpy/f2py/tests/test_return_complex.py
index 3d2e2b94f27ac302bd337645a6e8b70fa92c5fc5..dc559289986077b08ed777a81041f230dd15c15a 100644
--- a/numpy/f2py/tests/test_return_complex.py
+++ b/numpy/f2py/tests/test_return_complex.py
@@ -1,163 +1,65 @@
  import pytest
  
  from numpy import array
-from numpy.testing import assert_, assert_raises
  from . import util
  
  
  class TestReturnComplex(util.F2PyTest):
-
      def check_function(self, t, tname):
-        if tname in ['t0', 't8', 's0', 's8']:
+        if tname in ["t0", "t8", "s0", "s8"]:
              err = 1e-5
          else:
              err = 0.0
-        assert_(abs(t(234j) - 234.0j) <= err)
-        assert_(abs(t(234.6) - 234.6) <= err)
-        assert_(abs(t(234) - 234.0) <= err)
-        assert_(abs(t(234.6 + 3j) - (234.6 + 3j)) <= err)
-        #assert_( abs(t('234')-234.)<=err)
-        #assert_( abs(t('234.6')-234.6)<=err)
-        assert_(abs(t(-234) + 234.) <= err)
-        assert_(abs(t([234]) - 234.) <= err)
-        assert_(abs(t((234,)) - 234.) <= err)
-        assert_(abs(t(array(234)) - 234.) <= err)
-        assert_(abs(t(array(23 + 4j, 'F')) - (23 + 4j)) <= err)
-        assert_(abs(t(array([234])) - 234.) <= err)
-        assert_(abs(t(array([[234]])) - 234.) <= err)
-        assert_(abs(t(array([234], 'b')) + 22.) <= err)
-        assert_(abs(t(array([234], 'h')) - 234.) <= err)
-        assert_(abs(t(array([234], 'i')) - 234.) <= err)
-        assert_(abs(t(array([234], 'l')) - 234.) <= err)
-        assert_(abs(t(array([234], 'q')) - 234.) <= err)
-        assert_(abs(t(array([234], 'f')) - 234.) <= err)
-        assert_(abs(t(array([234], 'd')) - 234.) <= err)
-        assert_(abs(t(array([234 + 3j], 'F')) - (234 + 3j)) <= err)
-        assert_(abs(t(array([234], 'D')) - 234.) <= err)
-
-        #assert_raises(TypeError, t, array([234], 'a1'))
-        assert_raises(TypeError, t, 'abc')
-
-        assert_raises(IndexError, t, [])
-        assert_raises(IndexError, t, ())
-
-        assert_raises(TypeError, t, t)
-        assert_raises(TypeError, t, {})
+        assert abs(t(234j) - 234.0j) <= err
+        assert abs(t(234.6) - 234.6) <= err
+        assert abs(t(234) - 234.0) <= err
+        assert abs(t(234.6 + 3j) - (234.6 + 3j)) <= err
+        # assert abs(t('234')-234.)<=err
+        # assert abs(t('234.6')-234.6)<=err
+        assert abs(t(-234) + 234.0) <= err
+        assert abs(t([234]) - 234.0) <= err
+        assert abs(t((234, )) - 234.0) <= err
+        assert abs(t(array(234)) - 234.0) <= err
+        assert abs(t(array(23 + 4j, "F")) - (23 + 4j)) <= err
+        assert abs(t(array([234])) - 234.0) <= err
+        assert abs(t(array([[234]])) - 234.0) <= err
+        assert abs(t(array([234], "b")) + 22.0) <= err
+        assert abs(t(array([234], "h")) - 234.0) <= err
+        assert abs(t(array([234], "i")) - 234.0) <= err
+        assert abs(t(array([234], "l")) - 234.0) <= err
+        assert abs(t(array([234], "q")) - 234.0) <= err
+        assert abs(t(array([234], "f")) - 234.0) <= err
+        assert abs(t(array([234], "d")) - 234.0) <= err
+        assert abs(t(array([234 + 3j], "F")) - (234 + 3j)) <= err
+        assert abs(t(array([234], "D")) - 234.0) <= err
+
+        # pytest.raises(TypeError, t, array([234], 'a1'))
+        pytest.raises(TypeError, t, "abc")
+
+        pytest.raises(IndexError, t, [])
+        pytest.raises(IndexError, t, ())
+
+        pytest.raises(TypeError, t, t)
+        pytest.raises(TypeError, t, {})
  
          try:
-            r = t(10 ** 400)
-            assert_(repr(r) in ['(inf+0j)', '(Infinity+0j)'], repr(r))
+            r = t(10**400)
+            assert repr(r) in ["(inf+0j)", "(Infinity+0j)"]
          except OverflowError:
              pass
  
  
-class TestF77ReturnComplex(TestReturnComplex):
-    code = """
-       function t0(value)
-         complex value
-         complex t0
-         t0 = value
-       end
-       function t8(value)
-         complex*8 value
-         complex*8 t8
-         t8 = value
-       end
-       function t16(value)
-         complex*16 value
-         complex*16 t16
-         t16 = value
-       end
-       function td(value)
-         double complex value
-         double complex td
-         td = value
-       end
+class TestFReturnComplex(TestReturnComplex):
+    sources = [
+        util.getpath("tests", "src", "return_complex", "foo77.f"),
+        util.getpath("tests", "src", "return_complex", "foo90.f90"),
+    ]
  
-       subroutine s0(t0,value)
-         complex value
-         complex t0
-cf2py    intent(out) t0
-         t0 = value
-       end
-       subroutine s8(t8,value)
-         complex*8 value
-         complex*8 t8
-cf2py    intent(out) t8
-         t8 = value
-       end
-       subroutine s16(t16,value)
-         complex*16 value
-         complex*16 t16
-cf2py    intent(out) t16
-         t16 = value
-       end
-       subroutine sd(td,value)
-         double complex value
-         double complex td
-cf2py    intent(out) td
-         td = value
-       end
-    """
-
-    @pytest.mark.parametrize('name', 't0,t8,t16,td,s0,s8,s16,sd'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name", "t0,t8,t16,td,s0,s8,s16,sd".split(","))
+    def test_all_f77(self, name):
          self.check_function(getattr(self.module, name), name)
  
-
-class TestF90ReturnComplex(TestReturnComplex):
-    suffix = ".f90"
-    code = """
-module f90_return_complex
-  contains
-       function t0(value)
-         complex :: value
-         complex :: t0
-         t0 = value
-       end function t0
-       function t8(value)
-         complex(kind=4) :: value
-         complex(kind=4) :: t8
-         t8 = value
-       end function t8
-       function t16(value)
-         complex(kind=8) :: value
-         complex(kind=8) :: t16
-         t16 = value
-       end function t16
-       function td(value)
-         double complex :: value
-         double complex :: td
-         td = value
-       end function td
-
-       subroutine s0(t0,value)
-         complex :: value
-         complex :: t0
-!f2py    intent(out) t0
-         t0 = value
-       end subroutine s0
-       subroutine s8(t8,value)
-         complex(kind=4) :: value
-         complex(kind=4) :: t8
-!f2py    intent(out) t8
-         t8 = value
-       end subroutine s8
-       subroutine s16(t16,value)
-         complex(kind=8) :: value
-         complex(kind=8) :: t16
-!f2py    intent(out) t16
-         t16 = value
-       end subroutine s16
-       subroutine sd(td,value)
-         double complex :: value
-         double complex :: td
-!f2py    intent(out) td
-         td = value
-       end subroutine sd
-end module f90_return_complex
-    """
-
-    @pytest.mark.parametrize('name', 't0,t8,t16,td,s0,s8,s16,sd'.split(','))
-    def test_all(self, name):
-        self.check_function(getattr(self.module.f90_return_complex, name), name)
+    @pytest.mark.parametrize("name", "t0,t8,t16,td,s0,s8,s16,sd".split(","))
+    def test_all_f90(self, name):
+        self.check_function(getattr(self.module.f90_return_complex, name),
+                            name)
diff --git a/numpy/f2py/tests/test_return_integer.py b/numpy/f2py/tests/test_return_integer.py
index 0a8121dc14b83b0914ed395ce03b952b4c11608e..a43c677fd0af1293a9fc9e8a2001298835946928 100644
--- a/numpy/f2py/tests/test_return_integer.py
+++ b/numpy/f2py/tests/test_return_integer.py
@@ -1,175 +1,55 @@
  import pytest
  
  from numpy import array
-from numpy.testing import assert_, assert_raises
  from . import util
  
  
  class TestReturnInteger(util.F2PyTest):
-
      def check_function(self, t, tname):
-        assert_(t(123) == 123, repr(t(123)))
-        assert_(t(123.6) == 123)
-        assert_(t('123') == 123)
-        assert_(t(-123) == -123)
-        assert_(t([123]) == 123)
-        assert_(t((123,)) == 123)
-        assert_(t(array(123)) == 123)
-        assert_(t(array([123])) == 123)
-        assert_(t(array([[123]])) == 123)
-        assert_(t(array([123], 'b')) == 123)
-        assert_(t(array([123], 'h')) == 123)
-        assert_(t(array([123], 'i')) == 123)
-        assert_(t(array([123], 'l')) == 123)
-        assert_(t(array([123], 'B')) == 123)
-        assert_(t(array([123], 'f')) == 123)
-        assert_(t(array([123], 'd')) == 123)
-
-        #assert_raises(ValueError, t, array([123],'S3'))
-        assert_raises(ValueError, t, 'abc')
-
-        assert_raises(IndexError, t, [])
-        assert_raises(IndexError, t, ())
-
-        assert_raises(Exception, t, t)
-        assert_raises(Exception, t, {})
-
-        if tname in ['t8', 's8']:
-            assert_raises(OverflowError, t, 100000000000000000000000)
-            assert_raises(OverflowError, t, 10000000011111111111111.23)
-
-
-class TestF77ReturnInteger(TestReturnInteger):
-    code = """
-       function t0(value)
-         integer value
-         integer t0
-         t0 = value
-       end
-       function t1(value)
-         integer*1 value
-         integer*1 t1
-         t1 = value
-       end
-       function t2(value)
-         integer*2 value
-         integer*2 t2
-         t2 = value
-       end
-       function t4(value)
-         integer*4 value
-         integer*4 t4
-         t4 = value
-       end
-       function t8(value)
-         integer*8 value
-         integer*8 t8
-         t8 = value
-       end
-
-       subroutine s0(t0,value)
-         integer value
-         integer t0
-cf2py    intent(out) t0
-         t0 = value
-       end
-       subroutine s1(t1,value)
-         integer*1 value
-         integer*1 t1
-cf2py    intent(out) t1
-         t1 = value
-       end
-       subroutine s2(t2,value)
-         integer*2 value
-         integer*2 t2
-cf2py    intent(out) t2
-         t2 = value
-       end
-       subroutine s4(t4,value)
-         integer*4 value
-         integer*4 t4
-cf2py    intent(out) t4
-         t4 = value
-       end
-       subroutine s8(t8,value)
-         integer*8 value
-         integer*8 t8
-cf2py    intent(out) t8
-         t8 = value
-       end
-    """
-
-    @pytest.mark.parametrize('name',
-                             't0,t1,t2,t4,t8,s0,s1,s2,s4,s8'.split(','))
-    def test_all(self, name):
+        assert t(123) == 123
+        assert t(123.6) == 123
+        assert t("123") == 123
+        assert t(-123) == -123
+        assert t([123]) == 123
+        assert t((123, )) == 123
+        assert t(array(123)) == 123
+        assert t(array([123])) == 123
+        assert t(array([[123]])) == 123
+        assert t(array([123], "b")) == 123
+        assert t(array([123], "h")) == 123
+        assert t(array([123], "i")) == 123
+        assert t(array([123], "l")) == 123
+        assert t(array([123], "B")) == 123
+        assert t(array([123], "f")) == 123
+        assert t(array([123], "d")) == 123
+
+        # pytest.raises(ValueError, t, array([123],'S3'))
+        pytest.raises(ValueError, t, "abc")
+
+        pytest.raises(IndexError, t, [])
+        pytest.raises(IndexError, t, ())
+
+        pytest.raises(Exception, t, t)
+        pytest.raises(Exception, t, {})
+
+        if tname in ["t8", "s8"]:
+            pytest.raises(OverflowError, t, 100000000000000000000000)
+            pytest.raises(OverflowError, t, 10000000011111111111111.23)
+
+
+class TestFReturnInteger(TestReturnInteger):
+    sources = [
+        util.getpath("tests", "src", "return_integer", "foo77.f"),
+        util.getpath("tests", "src", "return_integer", "foo90.f90"),
+    ]
+
+    @pytest.mark.parametrize("name",
+                             "t0,t1,t2,t4,t8,s0,s1,s2,s4,s8".split(","))
+    def test_all_f77(self, name):
          self.check_function(getattr(self.module, name), name)
  
-
-class TestF90ReturnInteger(TestReturnInteger):
-    suffix = ".f90"
-    code = """
-module f90_return_integer
-  contains
-       function t0(value)
-         integer :: value
-         integer :: t0
-         t0 = value
-       end function t0
-       function t1(value)
-         integer(kind=1) :: value
-         integer(kind=1) :: t1
-         t1 = value
-       end function t1
-       function t2(value)
-         integer(kind=2) :: value
-         integer(kind=2) :: t2
-         t2 = value
-       end function t2
-       function t4(value)
-         integer(kind=4) :: value
-         integer(kind=4) :: t4
-         t4 = value
-       end function t4
-       function t8(value)
-         integer(kind=8) :: value
-         integer(kind=8) :: t8
-         t8 = value
-       end function t8
-
-       subroutine s0(t0,value)
-         integer :: value
-         integer :: t0
-!f2py    intent(out) t0
-         t0 = value
-       end subroutine s0
-       subroutine s1(t1,value)
-         integer(kind=1) :: value
-         integer(kind=1) :: t1
-!f2py    intent(out) t1
-         t1 = value
-       end subroutine s1
-       subroutine s2(t2,value)
-         integer(kind=2) :: value
-         integer(kind=2) :: t2
-!f2py    intent(out) t2
-         t2 = value
-       end subroutine s2
-       subroutine s4(t4,value)
-         integer(kind=4) :: value
-         integer(kind=4) :: t4
-!f2py    intent(out) t4
-         t4 = value
-       end subroutine s4
-       subroutine s8(t8,value)
-         integer(kind=8) :: value
-         integer(kind=8) :: t8
-!f2py    intent(out) t8
-         t8 = value
-       end subroutine s8
-end module f90_return_integer
-    """
-
-    @pytest.mark.parametrize('name',
-                             't0,t1,t2,t4,t8,s0,s1,s2,s4,s8'.split(','))
-    def test_all(self, name):
-        self.check_function(getattr(self.module.f90_return_integer, name), name)
+    @pytest.mark.parametrize("name",
+                             "t0,t1,t2,t4,t8,s0,s1,s2,s4,s8".split(","))
+    def test_all_f90(self, name):
+        self.check_function(getattr(self.module.f90_return_integer, name),
+                            name)
diff --git a/numpy/f2py/tests/test_return_logical.py b/numpy/f2py/tests/test_return_logical.py
index 9db939c7e066d9956d0450ccb8b729cd307a34a4..6f64745ee4817071aed7868c29bc7a164da43939 100644
--- a/numpy/f2py/tests/test_return_logical.py
+++ b/numpy/f2py/tests/test_return_logical.py
@@ -1,185 +1,64 @@
  import pytest
  
  from numpy import array
-from numpy.testing import assert_, assert_raises
  from . import util
  
  
  class TestReturnLogical(util.F2PyTest):
-
      def check_function(self, t):
-        assert_(t(True) == 1, repr(t(True)))
-        assert_(t(False) == 0, repr(t(False)))
-        assert_(t(0) == 0)
-        assert_(t(None) == 0)
-        assert_(t(0.0) == 0)
-        assert_(t(0j) == 0)
-        assert_(t(1j) == 1)
-        assert_(t(234) == 1)
-        assert_(t(234.6) == 1)
-        assert_(t(234.6 + 3j) == 1)
-        assert_(t('234') == 1)
-        assert_(t('aaa') == 1)
-        assert_(t('') == 0)
-        assert_(t([]) == 0)
-        assert_(t(()) == 0)
-        assert_(t({}) == 0)
-        assert_(t(t) == 1)
-        assert_(t(-234) == 1)
-        assert_(t(10 ** 100) == 1)
-        assert_(t([234]) == 1)
-        assert_(t((234,)) == 1)
-        assert_(t(array(234)) == 1)
-        assert_(t(array([234])) == 1)
-        assert_(t(array([[234]])) == 1)
-        assert_(t(array([234], 'b')) == 1)
-        assert_(t(array([234], 'h')) == 1)
-        assert_(t(array([234], 'i')) == 1)
-        assert_(t(array([234], 'l')) == 1)
-        assert_(t(array([234], 'f')) == 1)
-        assert_(t(array([234], 'd')) == 1)
-        assert_(t(array([234 + 3j], 'F')) == 1)
-        assert_(t(array([234], 'D')) == 1)
-        assert_(t(array(0)) == 0)
-        assert_(t(array([0])) == 0)
-        assert_(t(array([[0]])) == 0)
-        assert_(t(array([0j])) == 0)
-        assert_(t(array([1])) == 1)
-        assert_raises(ValueError, t, array([0, 0]))
-
+        assert t(True) == 1
+        assert t(False) == 0
+        assert t(0) == 0
+        assert t(None) == 0
+        assert t(0.0) == 0
+        assert t(0j) == 0
+        assert t(1j) == 1
+        assert t(234) == 1
+        assert t(234.6) == 1
+        assert t(234.6 + 3j) == 1
+        assert t("234") == 1
+        assert t("aaa") == 1
+        assert t("") == 0
+        assert t([]) == 0
+        assert t(()) == 0
+        assert t({}) == 0
+        assert t(t) == 1
+        assert t(-234) == 1
+        assert t(10**100) == 1
+        assert t([234]) == 1
+        assert t((234, )) == 1
+        assert t(array(234)) == 1
+        assert t(array([234])) == 1
+        assert t(array([[234]])) == 1
+        assert t(array([234], "b")) == 1
+        assert t(array([234], "h")) == 1
+        assert t(array([234], "i")) == 1
+        assert t(array([234], "l")) == 1
+        assert t(array([234], "f")) == 1
+        assert t(array([234], "d")) == 1
+        assert t(array([234 + 3j], "F")) == 1
+        assert t(array([234], "D")) == 1
+        assert t(array(0)) == 0
+        assert t(array([0])) == 0
+        assert t(array([[0]])) == 0
+        assert t(array([0j])) == 0
+        assert t(array([1])) == 1
+        pytest.raises(ValueError, t, array([0, 0]))
  
-class TestF77ReturnLogical(TestReturnLogical):
-    code = """
-       function t0(value)
-         logical value
-         logical t0
-         t0 = value
-       end
-       function t1(value)
-         logical*1 value
-         logical*1 t1
-         t1 = value
-       end
-       function t2(value)
-         logical*2 value
-         logical*2 t2
-         t2 = value
-       end
-       function t4(value)
-         logical*4 value
-         logical*4 t4
-         t4 = value
-       end
-c       function t8(value)
-c         logical*8 value
-c         logical*8 t8
-c         t8 = value
-c       end
  
-       subroutine s0(t0,value)
-         logical value
-         logical t0
-cf2py    intent(out) t0
-         t0 = value
-       end
-       subroutine s1(t1,value)
-         logical*1 value
-         logical*1 t1
-cf2py    intent(out) t1
-         t1 = value
-       end
-       subroutine s2(t2,value)
-         logical*2 value
-         logical*2 t2
-cf2py    intent(out) t2
-         t2 = value
-       end
-       subroutine s4(t4,value)
-         logical*4 value
-         logical*4 t4
-cf2py    intent(out) t4
-         t4 = value
-       end
-c       subroutine s8(t8,value)
-c         logical*8 value
-c         logical*8 t8
-cf2py    intent(out) t8
-c         t8 = value
-c       end
-    """
+class TestFReturnLogical(TestReturnLogical):
+    sources = [
+        util.getpath("tests", "src", "return_logical", "foo77.f"),
+        util.getpath("tests", "src", "return_logical", "foo90.f90"),
+    ]
  
      @pytest.mark.slow
-    @pytest.mark.parametrize('name', 't0,t1,t2,t4,s0,s1,s2,s4'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name", "t0,t1,t2,t4,s0,s1,s2,s4".split(","))
+    def test_all_f77(self, name):
          self.check_function(getattr(self.module, name))
  
-
-class TestF90ReturnLogical(TestReturnLogical):
-    suffix = ".f90"
-    code = """
-module f90_return_logical
-  contains
-       function t0(value)
-         logical :: value
-         logical :: t0
-         t0 = value
-       end function t0
-       function t1(value)
-         logical(kind=1) :: value
-         logical(kind=1) :: t1
-         t1 = value
-       end function t1
-       function t2(value)
-         logical(kind=2) :: value
-         logical(kind=2) :: t2
-         t2 = value
-       end function t2
-       function t4(value)
-         logical(kind=4) :: value
-         logical(kind=4) :: t4
-         t4 = value
-       end function t4
-       function t8(value)
-         logical(kind=8) :: value
-         logical(kind=8) :: t8
-         t8 = value
-       end function t8
-
-       subroutine s0(t0,value)
-         logical :: value
-         logical :: t0
-!f2py    intent(out) t0
-         t0 = value
-       end subroutine s0
-       subroutine s1(t1,value)
-         logical(kind=1) :: value
-         logical(kind=1) :: t1
-!f2py    intent(out) t1
-         t1 = value
-       end subroutine s1
-       subroutine s2(t2,value)
-         logical(kind=2) :: value
-         logical(kind=2) :: t2
-!f2py    intent(out) t2
-         t2 = value
-       end subroutine s2
-       subroutine s4(t4,value)
-         logical(kind=4) :: value
-         logical(kind=4) :: t4
-!f2py    intent(out) t4
-         t4 = value
-       end subroutine s4
-       subroutine s8(t8,value)
-         logical(kind=8) :: value
-         logical(kind=8) :: t8
-!f2py    intent(out) t8
-         t8 = value
-       end subroutine s8
-end module f90_return_logical
-    """
-
      @pytest.mark.slow
-    @pytest.mark.parametrize('name',
-                             't0,t1,t2,t4,t8,s0,s1,s2,s4,s8'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name",
+                             "t0,t1,t2,t4,t8,s0,s1,s2,s4,s8".split(","))
+    def test_all_f90(self, name):
          self.check_function(getattr(self.module.f90_return_logical, name))
diff --git a/numpy/f2py/tests/test_return_real.py b/numpy/f2py/tests/test_return_real.py
index 8e5022a8ec97fa8b2d83c155af8834cf27e7f5ae..7705a11229bb9f1580cc86a01c2e65863375e9de 100644
--- a/numpy/f2py/tests/test_return_real.py
+++ b/numpy/f2py/tests/test_return_real.py
@@ -1,59 +1,62 @@
  import platform
  import pytest
+import numpy as np
  
  from numpy import array
-from numpy.testing import assert_, assert_raises
  from . import util
  
  
  class TestReturnReal(util.F2PyTest):
-
      def check_function(self, t, tname):
-        if tname in ['t0', 't4', 's0', 's4']:
+        if tname in ["t0", "t4", "s0", "s4"]:
              err = 1e-5
          else:
              err = 0.0
-        assert_(abs(t(234) - 234.0) <= err)
-        assert_(abs(t(234.6) - 234.6) <= err)
-        assert_(abs(t('234') - 234) <= err)
-        assert_(abs(t('234.6') - 234.6) <= err)
-        assert_(abs(t(-234) + 234) <= err)
-        assert_(abs(t([234]) - 234) <= err)
-        assert_(abs(t((234,)) - 234.) <= err)
-        assert_(abs(t(array(234)) - 234.) <= err)
-        assert_(abs(t(array([234])) - 234.) <= err)
-        assert_(abs(t(array([[234]])) - 234.) <= err)
-        assert_(abs(t(array([234], 'b')) + 22) <= err)
-        assert_(abs(t(array([234], 'h')) - 234.) <= err)
-        assert_(abs(t(array([234], 'i')) - 234.) <= err)
-        assert_(abs(t(array([234], 'l')) - 234.) <= err)
-        assert_(abs(t(array([234], 'B')) - 234.) <= err)
-        assert_(abs(t(array([234], 'f')) - 234.) <= err)
-        assert_(abs(t(array([234], 'd')) - 234.) <= err)
-        if tname in ['t0', 't4', 's0', 's4']:
-            assert_(t(1e200) == t(1e300))  # inf
-
-        #assert_raises(ValueError, t, array([234], 'S1'))
-        assert_raises(ValueError, t, 'abc')
-
-        assert_raises(IndexError, t, [])
-        assert_raises(IndexError, t, ())
-
-        assert_raises(Exception, t, t)
-        assert_raises(Exception, t, {})
+        assert abs(t(234) - 234.0) <= err
+        assert abs(t(234.6) - 234.6) <= err
+        assert abs(t("234") - 234) <= err
+        assert abs(t("234.6") - 234.6) <= err
+        assert abs(t(-234) + 234) <= err
+        assert abs(t([234]) - 234) <= err
+        assert abs(t((234, )) - 234.0) <= err
+        assert abs(t(array(234)) - 234.0) <= err
+        assert abs(t(array([234])) - 234.0) <= err
+        assert abs(t(array([[234]])) - 234.0) <= err
+        assert abs(t(array([234], "b")) + 22) <= err
+        assert abs(t(array([234], "h")) - 234.0) <= err
+        assert abs(t(array([234], "i")) - 234.0) <= err
+        assert abs(t(array([234], "l")) - 234.0) <= err
+        assert abs(t(array([234], "B")) - 234.0) <= err
+        assert abs(t(array([234], "f")) - 234.0) <= err
+        assert abs(t(array([234], "d")) - 234.0) <= err
+        if tname in ["t0", "t4", "s0", "s4"]:
+            assert t(1e200) == t(1e300)  # inf
+
+        # pytest.raises(ValueError, t, array([234], 'S1'))
+        pytest.raises(ValueError, t, "abc")
+
+        pytest.raises(IndexError, t, [])
+        pytest.raises(IndexError, t, ())
+
+        pytest.raises(Exception, t, t)
+        pytest.raises(Exception, t, {})
  
          try:
-            r = t(10 ** 400)
-            assert_(repr(r) in ['inf', 'Infinity'], repr(r))
+            r = t(10**400)
+            assert repr(r) in ["inf", "Infinity"]
          except OverflowError:
              pass
  
  
-
  @pytest.mark.skipif(
-    platform.system() == 'Darwin',
+    platform.system() == "Darwin",
      reason="Prone to error when run with numpy/f2py/tests on mac os, "
-           "but not when run in isolation")
+    "but not when run in isolation",
+)
+@pytest.mark.skipif(
+    np.dtype(np.intp).itemsize < 8,
+    reason="32-bit builds are buggy"
+)
  class TestCReturnReal(TestReturnReal):
      suffix = ".pyf"
      module_name = "c_ext_return_real"
@@ -86,118 +89,21 @@ end interface
  end python module c_ext_return_real
      """
  
-    @pytest.mark.parametrize('name', 't4,t8,s4,s8'.split(','))
+    @pytest.mark.parametrize("name", "t4,t8,s4,s8".split(","))
      def test_all(self, name):
          self.check_function(getattr(self.module, name), name)
  
  
-class TestF77ReturnReal(TestReturnReal):
-    code = """
-       function t0(value)
-         real value
-         real t0
-         t0 = value
-       end
-       function t4(value)
-         real*4 value
-         real*4 t4
-         t4 = value
-       end
-       function t8(value)
-         real*8 value
-         real*8 t8
-         t8 = value
-       end
-       function td(value)
-         double precision value
-         double precision td
-         td = value
-       end
-
-       subroutine s0(t0,value)
-         real value
-         real t0
-cf2py    intent(out) t0
-         t0 = value
-       end
-       subroutine s4(t4,value)
-         real*4 value
-         real*4 t4
-cf2py    intent(out) t4
-         t4 = value
-       end
-       subroutine s8(t8,value)
-         real*8 value
-         real*8 t8
-cf2py    intent(out) t8
-         t8 = value
-       end
-       subroutine sd(td,value)
-         double precision value
-         double precision td
-cf2py    intent(out) td
-         td = value
-       end
-    """
+class TestFReturnReal(TestReturnReal):
+    sources = [
+        util.getpath("tests", "src", "return_real", "foo77.f"),
+        util.getpath("tests", "src", "return_real", "foo90.f90"),
+    ]
  
-    @pytest.mark.parametrize('name', 't0,t4,t8,td,s0,s4,s8,sd'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name", "t0,t4,t8,td,s0,s4,s8,sd".split(","))
+    def test_all_f77(self, name):
          self.check_function(getattr(self.module, name), name)
  
-
-class TestF90ReturnReal(TestReturnReal):
-    suffix = ".f90"
-    code = """
-module f90_return_real
-  contains
-       function t0(value)
-         real :: value
-         real :: t0
-         t0 = value
-       end function t0
-       function t4(value)
-         real(kind=4) :: value
-         real(kind=4) :: t4
-         t4 = value
-       end function t4
-       function t8(value)
-         real(kind=8) :: value
-         real(kind=8) :: t8
-         t8 = value
-       end function t8
-       function td(value)
-         double precision :: value
-         double precision :: td
-         td = value
-       end function td
-
-       subroutine s0(t0,value)
-         real :: value
-         real :: t0
-!f2py    intent(out) t0
-         t0 = value
-       end subroutine s0
-       subroutine s4(t4,value)
-         real(kind=4) :: value
-         real(kind=4) :: t4
-!f2py    intent(out) t4
-         t4 = value
-       end subroutine s4
-       subroutine s8(t8,value)
-         real(kind=8) :: value
-         real(kind=8) :: t8
-!f2py    intent(out) t8
-         t8 = value
-       end subroutine s8
-       subroutine sd(td,value)
-         double precision :: value
-         double precision :: td
-!f2py    intent(out) td
-         td = value
-       end subroutine sd
-end module f90_return_real
-    """
-
-    @pytest.mark.parametrize('name', 't0,t4,t8,td,s0,s4,s8,sd'.split(','))
-    def test_all(self, name):
+    @pytest.mark.parametrize("name", "t0,t4,t8,td,s0,s4,s8,sd".split(","))
+    def test_all_f90(self, name):
          self.check_function(getattr(self.module.f90_return_real, name), name)
diff --git a/numpy/f2py/tests/test_semicolon_split.py b/numpy/f2py/tests/test_semicolon_split.py
index d8b4bf22212273b6d1505b3d8c0a0ce98cf23d2e..6d499046c1a53d706410a3cfbcf34dcc818a41d3 100644
--- a/numpy/f2py/tests/test_semicolon_split.py
+++ b/numpy/f2py/tests/test_semicolon_split.py
@@ -1,18 +1,24 @@
  import platform
  import pytest
+import numpy as np
  
  from . import util
-from numpy.testing import assert_equal
+
  
  @pytest.mark.skipif(
-    platform.system() == 'Darwin',
+    platform.system() == "Darwin",
      reason="Prone to error when run with numpy/f2py/tests on mac os, "
-           "but not when run in isolation")
+    "but not when run in isolation",
+)
+@pytest.mark.skipif(
+    np.dtype(np.intp).itemsize < 8,
+    reason="32-bit builds are buggy"
+)
  class TestMultiline(util.F2PyTest):
      suffix = ".pyf"
      module_name = "multiline"
-    code = """
-python module {module}
+    code = f"""
+python module {module_name}
      usercode '''
  void foo(int* x) {{
      char dummy = ';';
@@ -25,22 +31,27 @@ void foo(int* x) {{
              integer intent(out) :: x
          end subroutine foo
      end interface
-end python module {module}
-    """.format(module=module_name)
+end python module {module_name}
+    """
  
      def test_multiline(self):
-        assert_equal(self.module.foo(), 42)
+        assert self.module.foo() == 42
  
  
  @pytest.mark.skipif(
-    platform.system() == 'Darwin',
+    platform.system() == "Darwin",
      reason="Prone to error when run with numpy/f2py/tests on mac os, "
-           "but not when run in isolation")
+    "but not when run in isolation",
+)
+@pytest.mark.skipif(
+    np.dtype(np.intp).itemsize < 8,
+    reason="32-bit builds are buggy"
+)
  class TestCallstatement(util.F2PyTest):
      suffix = ".pyf"
      module_name = "callstatement"
-    code = """
-python module {module}
+    code = f"""
+python module {module_name}
      usercode '''
  void foo(int* x) {{
  }}
@@ -56,8 +67,8 @@ void foo(int* x) {{
              }}
          end subroutine foo
      end interface
-end python module {module}
-    """.format(module=module_name)
+end python module {module_name}
+    """
  
      def test_callstatement(self):
-        assert_equal(self.module.foo(), 42)
+        assert self.module.foo() == 42
diff --git a/numpy/f2py/tests/test_size.py b/numpy/f2py/tests/test_size.py
index b609fa77f711ef453493535758d037e01cfdf79e..bd2c349df585bd316a9e2547a4a3e50b16364d09 100644
--- a/numpy/f2py/tests/test_size.py
+++ b/numpy/f2py/tests/test_size.py
@@ -1,49 +1,45 @@
  import os
  import pytest
+import numpy as np
  
-from numpy.testing import assert_equal
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestSizeSumExample(util.F2PyTest):
-    sources = [_path('src', 'size', 'foo.f90')]
+    sources = [util.getpath("tests", "src", "size", "foo.f90")]
  
      @pytest.mark.slow
      def test_all(self):
          r = self.module.foo([[]])
-        assert_equal(r, [0], repr(r))
+        assert r == [0]
  
          r = self.module.foo([[1, 2]])
-        assert_equal(r, [3], repr(r))
+        assert r == [3]
  
          r = self.module.foo([[1, 2], [3, 4]])
-        assert_equal(r, [3, 7], repr(r))
+        assert np.allclose(r, [3, 7])
  
          r = self.module.foo([[1, 2], [3, 4], [5, 6]])
-        assert_equal(r, [3, 7, 11], repr(r))
+        assert np.allclose(r, [3, 7, 11])
  
      @pytest.mark.slow
      def test_transpose(self):
          r = self.module.trans([[]])
-        assert_equal(r.T, [[]], repr(r))
+        assert np.allclose(r.T, np.array([[]]))
  
          r = self.module.trans([[1, 2]])
-        assert_equal(r, [[1], [2]], repr(r))
+        assert np.allclose(r, [[1.], [2.]])
  
          r = self.module.trans([[1, 2, 3], [4, 5, 6]])
-        assert_equal(r, [[1, 4], [2, 5], [3, 6]], repr(r))
+        assert np.allclose(r, [[1, 4], [2, 5], [3, 6]])
  
      @pytest.mark.slow
      def test_flatten(self):
          r = self.module.flatten([[]])
-        assert_equal(r, [], repr(r))
+        assert np.allclose(r, [])
  
          r = self.module.flatten([[1, 2]])
-        assert_equal(r, [1, 2], repr(r))
+        assert np.allclose(r, [1, 2])
  
          r = self.module.flatten([[1, 2, 3], [4, 5, 6]])
-        assert_equal(r, [1, 2, 3, 4, 5, 6], repr(r))
+        assert np.allclose(r, [1, 2, 3, 4, 5, 6])
diff --git a/numpy/f2py/tests/test_string.py b/numpy/f2py/tests/test_string.py
index 7b27f8786ed6fb1a01f61c2d74ff2f241c5100b1..9e937188c9309eb09534b5b1ec822b5890a0bbdd 100644
--- a/numpy/f2py/tests/test_string.py
+++ b/numpy/f2py/tests/test_string.py
@@ -1,109 +1,43 @@
  import os
  import pytest
  import textwrap
-from numpy.testing import assert_array_equal
  import numpy as np
  from . import util
  
  
-def _path(*a):
-    return os.path.join(*((os.path.dirname(__file__),) + a))
-
-
  class TestString(util.F2PyTest):
-    sources = [_path('src', 'string', 'char.f90')]
+    sources = [util.getpath("tests", "src", "string", "char.f90")]
  
      @pytest.mark.slow
      def test_char(self):
-        strings = np.array(['ab', 'cd', 'ef'], dtype='c').T
-        inp, out = self.module.char_test.change_strings(strings,
-                                                        strings.shape[1])
-        assert_array_equal(inp, strings)
+        strings = np.array(["ab", "cd", "ef"], dtype="c").T
+        inp, out = self.module.char_test.change_strings(
+            strings, strings.shape[1])
+        assert inp == pytest.approx(strings)
          expected = strings.copy()
-        expected[1, :] = 'AAA'
-        assert_array_equal(out, expected)
+        expected[1, :] = "AAA"
+        assert out == pytest.approx(expected)
  
  
  class TestDocStringArguments(util.F2PyTest):
-    suffix = '.f'
-
-    code = """
-C FILE: STRING.F
-      SUBROUTINE FOO(A,B,C,D)
-      CHARACTER*5 A, B
-      CHARACTER*(*) C,D
-Cf2py intent(in) a,c
-Cf2py intent(inout) b,d
-      PRINT*, "A=",A
-      PRINT*, "B=",B
-      PRINT*, "C=",C
-      PRINT*, "D=",D
-      PRINT*, "CHANGE A,B,C,D"
-      A(1:1) = 'A'
-      B(1:1) = 'B'
-      C(1:1) = 'C'
-      D(1:1) = 'D'
-      PRINT*, "A=",A
-      PRINT*, "B=",B
-      PRINT*, "C=",C
-      PRINT*, "D=",D
-      END
-C END OF FILE STRING.F
-        """
+    sources = [util.getpath("tests", "src", "string", "string.f")]
  
      def test_example(self):
-        a = np.array(b'123\0\0')
-        b = np.array(b'123\0\0')
-        c = np.array(b'123')
-        d = np.array(b'123')
+        a = np.array(b"123\0\0")
+        b = np.array(b"123\0\0")
+        c = np.array(b"123")
+        d = np.array(b"123")
  
          self.module.foo(a, b, c, d)
  
-        assert a.tobytes() == b'123\0\0'
-        assert b.tobytes() == b'B23\0\0', (b.tobytes(),)
-        assert c.tobytes() == b'123'
-        assert d.tobytes() == b'D23'
+        assert a.tobytes() == b"123\0\0"
+        assert b.tobytes() == b"B23\0\0"
+        assert c.tobytes() == b"123"
+        assert d.tobytes() == b"D23"
  
  
  class TestFixedString(util.F2PyTest):
-    suffix = '.f90'
-
-    code = textwrap.dedent("""
-       function sint(s) result(i)
-          implicit none
-          character(len=*) :: s
-          integer :: j, i
-          i = 0
-          do j=len(s), 1, -1
-           if (.not.((i.eq.0).and.(s(j:j).eq.' '))) then
-             i = i + ichar(s(j:j)) * 10 ** (j - 1)
-           endif
-          end do
-          return
-        end function sint
-
-        function test_in_bytes4(a) result (i)
-          implicit none
-          integer :: sint
-          character(len=4) :: a
-          integer :: i
-          i = sint(a)
-          a(1:1) = 'A'
-          return
-        end function test_in_bytes4
-
-        function test_inout_bytes4(a) result (i)
-          implicit none
-          integer :: sint
-          character(len=4), intent(inout) :: a
-          integer :: i
-          if (a(1:1).ne.' ') then
-            a(1:1) = 'E'
-          endif
-          i = sint(a)
-          return
-        end function test_inout_bytes4
-        """)
+    sources = [util.getpath("tests", "src", "string", "fixed_string.f90")]
  
      @staticmethod
      def _sint(s, start=0, end=None):
@@ -122,41 +56,41 @@ class TestFixedString(util.F2PyTest):
              end = len(s)
          i = 0
          for j in range(start, min(end, len(s))):
-            i += s[j] * 10 ** j
+            i += s[j] * 10**j
          return i
  
-    def _get_input(self, intent='in'):
-        if intent in ['in']:
-            yield ''
-            yield '1'
-            yield '1234'
-            yield '12345'
-            yield b''
-            yield b'\0'
-            yield b'1'
-            yield b'\01'
-            yield b'1\0'
-            yield b'1234'
-            yield b'12345'
-        yield np.ndarray((), np.bytes_, buffer=b'')  # array(b'', dtype='|S0')
-        yield np.array(b'')                          # array(b'', dtype='|S1')
-        yield np.array(b'\0')
-        yield np.array(b'1')
-        yield np.array(b'1\0')
-        yield np.array(b'\01')
-        yield np.array(b'1234')
-        yield np.array(b'123\0')
-        yield np.array(b'12345')
+    def _get_input(self, intent="in"):
+        if intent in ["in"]:
+            yield ""
+            yield "1"
+            yield "1234"
+            yield "12345"
+            yield b""
+            yield b"\0"
+            yield b"1"
+            yield b"\01"
+            yield b"1\0"
+            yield b"1234"
+            yield b"12345"
+        yield np.ndarray((), np.bytes_, buffer=b"")  # array(b'', dtype='|S0')
+        yield np.array(b"")  # array(b'', dtype='|S1')
+        yield np.array(b"\0")
+        yield np.array(b"1")
+        yield np.array(b"1\0")
+        yield np.array(b"\01")
+        yield np.array(b"1234")
+        yield np.array(b"123\0")
+        yield np.array(b"12345")
  
      def test_intent_in(self):
          for s in self._get_input():
              r = self.module.test_in_bytes4(s)
              # also checks that s is not changed inplace
              expected = self._sint(s, end=4)
-            assert r == expected, (s)
+            assert r == expected, s
  
      def test_intent_inout(self):
-        for s in self._get_input(intent='inout'):
+        for s in self._get_input(intent="inout"):
              rest = self._sint(s, start=4)
              r = self.module.test_inout_bytes4(s)
              expected = self._sint(s, end=4)
diff --git a/numpy/f2py/tests/test_symbolic.py b/numpy/f2py/tests/test_symbolic.py
index b37ae33ef9ac2919464c034765b200fe7c517382..8452783111ebe7130d17301d228eb5708e9eced7 100644
--- a/numpy/f2py/tests/test_symbolic.py
+++ b/numpy/f2py/tests/test_symbolic.py
@@ -1,35 +1,56 @@
-from numpy.testing import assert_raises
+import pytest
+
  from numpy.f2py.symbolic import (
-    Expr, Op, ArithOp, Language,
-    as_symbol, as_number, as_string, as_array, as_complex,
-    as_terms, as_factors, eliminate_quotes, insert_quotes,
-    fromstring, as_expr, as_apply,
-    as_numer_denom, as_ternary, as_ref, as_deref,
-    normalize, as_eq, as_ne, as_lt, as_gt, as_le, as_ge
-    )
+    Expr,
+    Op,
+    ArithOp,
+    Language,
+    as_symbol,
+    as_number,
+    as_string,
+    as_array,
+    as_complex,
+    as_terms,
+    as_factors,
+    eliminate_quotes,
+    insert_quotes,
+    fromstring,
+    as_expr,
+    as_apply,
+    as_numer_denom,
+    as_ternary,
+    as_ref,
+    as_deref,
+    normalize,
+    as_eq,
+    as_ne,
+    as_lt,
+    as_gt,
+    as_le,
+    as_ge,
+)
  from . import util
  
  
  class TestSymbolic(util.F2PyTest):
-
      def test_eliminate_quotes(self):
          def worker(s):
              r, d = eliminate_quotes(s)
              s1 = insert_quotes(r, d)
              assert s1 == s
  
-        for kind in ['', 'mykind_']:
+        for kind in ["", "mykind_"]:
              worker(kind + '"1234" // "ABCD"')
              worker(kind + '"1234" // ' + kind + '"ABCD"')
-            worker(kind + '"1234" // \'ABCD\'')
-            worker(kind + '"1234" // ' + kind + '\'ABCD\'')
+            worker(kind + "\"1234\" // 'ABCD'")
+            worker(kind + '"1234" // ' + kind + "'ABCD'")
              worker(kind + '"1\\"2\'AB\'34"')
-            worker('a = ' + kind + "'1\\'2\"AB\"34'")
+            worker("a = " + kind + "'1\\'2\"AB\"34'")
  
      def test_sanity(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
  
          assert x.op == Op.SYMBOL
          assert repr(x) == "Expr(Op.SYMBOL, 'x')"
@@ -70,7 +91,7 @@ class TestSymbolic(util.F2PyTest):
          assert s != s2
  
          a = as_array((n, m))
-        b = as_array((n,))
+        b = as_array((n, ))
          assert a.op == Op.ARRAY
          assert repr(a) == ("Expr(Op.ARRAY, (Expr(Op.INTEGER, (123, 4)),"
                             " Expr(Op.INTEGER, (456, 4))))")
@@ -108,75 +129,77 @@ class TestSymbolic(util.F2PyTest):
          assert hash(e) is not None
  
      def test_tostring_fortran(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
          n = as_number(123)
          m = as_number(456)
          a = as_array((n, m))
          c = as_complex(n, m)
  
-        assert str(x) == 'x'
-        assert str(n) == '123'
-        assert str(a) == '[123, 456]'
-        assert str(c) == '(123, 456)'
-
-        assert str(Expr(Op.TERMS, {x: 1})) == 'x'
-        assert str(Expr(Op.TERMS, {x: 2})) == '2 * x'
-        assert str(Expr(Op.TERMS, {x: -1})) == '-x'
-        assert str(Expr(Op.TERMS, {x: -2})) == '-2 * x'
-        assert str(Expr(Op.TERMS, {x: 1, y: 1})) == 'x + y'
-        assert str(Expr(Op.TERMS, {x: -1, y: -1})) == '-x - y'
-        assert str(Expr(Op.TERMS, {x: 2, y: 3})) == '2 * x + 3 * y'
-        assert str(Expr(Op.TERMS, {x: -2, y: 3})) == '-2 * x + 3 * y'
-        assert str(Expr(Op.TERMS, {x: 2, y: -3})) == '2 * x - 3 * y'
-
-        assert str(Expr(Op.FACTORS, {x: 1})) == 'x'
-        assert str(Expr(Op.FACTORS, {x: 2})) == 'x ** 2'
-        assert str(Expr(Op.FACTORS, {x: -1})) == 'x ** -1'
-        assert str(Expr(Op.FACTORS, {x: -2})) == 'x ** -2'
-        assert str(Expr(Op.FACTORS, {x: 1, y: 1})) == 'x * y'
-        assert str(Expr(Op.FACTORS, {x: 2, y: 3})) == 'x ** 2 * y ** 3'
+        assert str(x) == "x"
+        assert str(n) == "123"
+        assert str(a) == "[123, 456]"
+        assert str(c) == "(123, 456)"
+
+        assert str(Expr(Op.TERMS, {x: 1})) == "x"
+        assert str(Expr(Op.TERMS, {x: 2})) == "2 * x"
+        assert str(Expr(Op.TERMS, {x: -1})) == "-x"
+        assert str(Expr(Op.TERMS, {x: -2})) == "-2 * x"
+        assert str(Expr(Op.TERMS, {x: 1, y: 1})) == "x + y"
+        assert str(Expr(Op.TERMS, {x: -1, y: -1})) == "-x - y"
+        assert str(Expr(Op.TERMS, {x: 2, y: 3})) == "2 * x + 3 * y"
+        assert str(Expr(Op.TERMS, {x: -2, y: 3})) == "-2 * x + 3 * y"
+        assert str(Expr(Op.TERMS, {x: 2, y: -3})) == "2 * x - 3 * y"
+
+        assert str(Expr(Op.FACTORS, {x: 1})) == "x"
+        assert str(Expr(Op.FACTORS, {x: 2})) == "x ** 2"
+        assert str(Expr(Op.FACTORS, {x: -1})) == "x ** -1"
+        assert str(Expr(Op.FACTORS, {x: -2})) == "x ** -2"
+        assert str(Expr(Op.FACTORS, {x: 1, y: 1})) == "x * y"
+        assert str(Expr(Op.FACTORS, {x: 2, y: 3})) == "x ** 2 * y ** 3"
  
          v = Expr(Op.FACTORS, {x: 2, Expr(Op.TERMS, {x: 1, y: 1}): 3})
-        assert str(v) == 'x ** 2 * (x + y) ** 3', str(v)
+        assert str(v) == "x ** 2 * (x + y) ** 3", str(v)
          v = Expr(Op.FACTORS, {x: 2, Expr(Op.FACTORS, {x: 1, y: 1}): 3})
-        assert str(v) == 'x ** 2 * (x * y) ** 3', str(v)
+        assert str(v) == "x ** 2 * (x * y) ** 3", str(v)
  
-        assert str(Expr(Op.APPLY, ('f', (), {}))) == 'f()'
-        assert str(Expr(Op.APPLY, ('f', (x,), {}))) == 'f(x)'
-        assert str(Expr(Op.APPLY, ('f', (x, y), {}))) == 'f(x, y)'
-        assert str(Expr(Op.INDEXING, ('f', x))) == 'f[x]'
+        assert str(Expr(Op.APPLY, ("f", (), {}))) == "f()"
+        assert str(Expr(Op.APPLY, ("f", (x, ), {}))) == "f(x)"
+        assert str(Expr(Op.APPLY, ("f", (x, y), {}))) == "f(x, y)"
+        assert str(Expr(Op.INDEXING, ("f", x))) == "f[x]"
  
-        assert str(as_ternary(x, y, z)) == 'merge(y, z, x)'
-        assert str(as_eq(x, y)) == 'x .eq. y'
-        assert str(as_ne(x, y)) == 'x .ne. y'
-        assert str(as_lt(x, y)) == 'x .lt. y'
-        assert str(as_le(x, y)) == 'x .le. y'
-        assert str(as_gt(x, y)) == 'x .gt. y'
-        assert str(as_ge(x, y)) == 'x .ge. y'
+        assert str(as_ternary(x, y, z)) == "merge(y, z, x)"
+        assert str(as_eq(x, y)) == "x .eq. y"
+        assert str(as_ne(x, y)) == "x .ne. y"
+        assert str(as_lt(x, y)) == "x .lt. y"
+        assert str(as_le(x, y)) == "x .le. y"
+        assert str(as_gt(x, y)) == "x .gt. y"
+        assert str(as_ge(x, y)) == "x .ge. y"
  
      def test_tostring_c(self):
          language = Language.C
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
          n = as_number(123)
  
-        assert Expr(Op.FACTORS, {x: 2}).tostring(language=language) == 'x * x'
-        assert Expr(Op.FACTORS, {x + y: 2}).tostring(
-            language=language) == '(x + y) * (x + y)'
-        assert Expr(Op.FACTORS, {x: 12}).tostring(
-            language=language) == 'pow(x, 12)'
-
-        assert as_apply(ArithOp.DIV, x, y).tostring(
-            language=language) == 'x / y'
-        assert as_apply(ArithOp.DIV, x, x + y).tostring(
-            language=language) == 'x / (x + y)'
-        assert as_apply(ArithOp.DIV, x - y, x + y).tostring(
-            language=language) == '(x - y) / (x + y)'
-        assert (x + (x - y) / (x + y) + n).tostring(
-            language=language) == '123 + x + (x - y) / (x + y)'
+        assert Expr(Op.FACTORS, {x: 2}).tostring(language=language) == "x * x"
+        assert (Expr(Op.FACTORS, {
+            x + y: 2
+        }).tostring(language=language) == "(x + y) * (x + y)")
+        assert Expr(Op.FACTORS, {
+            x: 12
+        }).tostring(language=language) == "pow(x, 12)"
+
+        assert as_apply(ArithOp.DIV, x,
+                        y).tostring(language=language) == "x / y"
+        assert (as_apply(ArithOp.DIV, x,
+                         x + y).tostring(language=language) == "x / (x + y)")
+        assert (as_apply(ArithOp.DIV, x - y, x +
+                         y).tostring(language=language) == "(x - y) / (x + y)")
+        assert (x + (x - y) / (x + y) +
+                n).tostring(language=language) == "123 + x + (x - y) / (x + y)"
  
          assert as_ternary(x, y, z).tostring(language=language) == "(x?y:z)"
          assert as_eq(x, y).tostring(language=language) == "x == y"
@@ -187,9 +210,9 @@ class TestSymbolic(util.F2PyTest):
          assert as_ge(x, y).tostring(language=language) == "x >= y"
  
      def test_operations(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
  
          assert x + x == Expr(Op.TERMS, {x: 2})
          assert x - x == Expr(Op.INTEGER, (0, 4))
@@ -205,28 +228,35 @@ class TestSymbolic(util.F2PyTest):
          assert 2 * x + 3 * y == Expr(Op.TERMS, {x: 2, y: 3})
          assert (x + y) * 2 == Expr(Op.TERMS, {x: 2, y: 2})
  
-        assert x ** 2 == Expr(Op.FACTORS, {x: 2})
-        assert (x + y) ** 2 == Expr(Op.TERMS,
-                                    {Expr(Op.FACTORS, {x: 2}): 1,
-                                     Expr(Op.FACTORS, {y: 2}): 1,
-                                     Expr(Op.FACTORS, {x: 1, y: 1}): 2})
-        assert (x + y) * x == x ** 2 + x * y
-        assert (x + y) ** 2 == x ** 2 + 2 * x * y + y ** 2
-        assert (x + y) ** 2 + (x - y) ** 2 == 2 * x ** 2 + 2 * y ** 2
+        assert x**2 == Expr(Op.FACTORS, {x: 2})
+        assert (x + y)**2 == Expr(
+            Op.TERMS,
+            {
+                Expr(Op.FACTORS, {x: 2}): 1,
+                Expr(Op.FACTORS, {y: 2}): 1,
+                Expr(Op.FACTORS, {
+                    x: 1,
+                    y: 1
+                }): 2,
+            },
+        )
+        assert (x + y) * x == x**2 + x * y
+        assert (x + y)**2 == x**2 + 2 * x * y + y**2
+        assert (x + y)**2 + (x - y)**2 == 2 * x**2 + 2 * y**2
          assert (x + y) * z == x * z + y * z
          assert z * (x + y) == x * z + y * z
  
          assert (x / 2) == as_apply(ArithOp.DIV, x, as_number(2))
          assert (2 * x / 2) == x
-        assert (3 * x / 2) == as_apply(ArithOp.DIV, 3*x, as_number(2))
+        assert (3 * x / 2) == as_apply(ArithOp.DIV, 3 * x, as_number(2))
          assert (4 * x / 2) == 2 * x
-        assert (5 * x / 2) == as_apply(ArithOp.DIV, 5*x, as_number(2))
+        assert (5 * x / 2) == as_apply(ArithOp.DIV, 5 * x, as_number(2))
          assert (6 * x / 2) == 3 * x
-        assert ((3*5) * x / 6) == as_apply(ArithOp.DIV, 5*x, as_number(2))
-        assert (30*x**2*y**4 / (24*x**3*y**3)) == as_apply(ArithOp.DIV,
-                                                           5*y, 4*x)
-        assert ((15 * x / 6) / 5) == as_apply(
-            ArithOp.DIV, x, as_number(2)), ((15 * x / 6) / 5)
+        assert ((3 * 5) * x / 6) == as_apply(ArithOp.DIV, 5 * x, as_number(2))
+        assert (30 * x**2 * y**4 / (24 * x**3 * y**3)) == as_apply(
+            ArithOp.DIV, 5 * y, 4 * x)
+        assert ((15 * x / 6) / 5) == as_apply(ArithOp.DIV, x,
+                                              as_number(2)), (15 * x / 6) / 5
          assert (x / (5 / x)) == as_apply(ArithOp.DIV, x**2, as_number(5))
  
          assert (x / 2.0) == Expr(Op.TERMS, {x: 0.5})
@@ -238,127 +268,128 @@ class TestSymbolic(util.F2PyTest):
          assert s // x == Expr(Op.CONCAT, (s, x))
          assert x // s == Expr(Op.CONCAT, (x, s))
  
-        c = as_complex(1., 2.)
-        assert -c == as_complex(-1., -2.)
-        assert c + c == as_expr((1+2j)*2)
-        assert c * c == as_expr((1+2j)**2)
+        c = as_complex(1.0, 2.0)
+        assert -c == as_complex(-1.0, -2.0)
+        assert c + c == as_expr((1 + 2j) * 2)
+        assert c * c == as_expr((1 + 2j)**2)
  
      def test_substitute(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
          a = as_array((x, y))
  
          assert x.substitute({x: y}) == y
          assert (x + y).substitute({x: z}) == y + z
          assert (x * y).substitute({x: z}) == y * z
-        assert (x ** 4).substitute({x: z}) == z ** 4
+        assert (x**4).substitute({x: z}) == z**4
          assert (x / y).substitute({x: z}) == z / y
          assert x.substitute({x: y + z}) == y + z
          assert a.substitute({x: y + z}) == as_array((y + z, y))
  
-        assert as_ternary(x, y, z).substitute(
-            {x: y + z}) == as_ternary(y + z, y, z)
-        assert as_eq(x, y).substitute(
-            {x: y + z}) == as_eq(y + z, y)
+        assert as_ternary(x, y,
+                          z).substitute({x: y + z}) == as_ternary(y + z, y, z)
+        assert as_eq(x, y).substitute({x: y + z}) == as_eq(y + z, y)
  
      def test_fromstring(self):
  
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
-        f = as_symbol('f')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
+        f = as_symbol("f")
          s = as_string('"ABC"')
          t = as_string('"123"')
          a = as_array((x, y))
  
-        assert fromstring('x') == x
-        assert fromstring('+ x') == x
-        assert fromstring('-  x') == -x
-        assert fromstring('x + y') == x + y
-        assert fromstring('x + 1') == x + 1
-        assert fromstring('x * y') == x * y
-        assert fromstring('x * 2') == x * 2
-        assert fromstring('x / y') == x / y
-        assert fromstring('x ** 2',
-                          language=Language.Python) == x ** 2
-        assert fromstring('x ** 2 ** 3',
-                          language=Language.Python) == x ** 2 ** 3
-        assert fromstring('(x + y) * z') == (x + y) * z
-
-        assert fromstring('f(x)') == f(x)
-        assert fromstring('f(x,y)') == f(x, y)
-        assert fromstring('f[x]') == f[x]
-        assert fromstring('f[x][y]') == f[x][y]
+        assert fromstring("x") == x
+        assert fromstring("+ x") == x
+        assert fromstring("-  x") == -x
+        assert fromstring("x + y") == x + y
+        assert fromstring("x + 1") == x + 1
+        assert fromstring("x * y") == x * y
+        assert fromstring("x * 2") == x * 2
+        assert fromstring("x / y") == x / y
+        assert fromstring("x ** 2", language=Language.Python) == x**2
+        assert fromstring("x ** 2 ** 3", language=Language.Python) == x**2**3
+        assert fromstring("(x + y) * z") == (x + y) * z
+
+        assert fromstring("f(x)") == f(x)
+        assert fromstring("f(x,y)") == f(x, y)
+        assert fromstring("f[x]") == f[x]
+        assert fromstring("f[x][y]") == f[x][y]
  
          assert fromstring('"ABC"') == s
-        assert normalize(fromstring('"ABC" // "123" ',
-                                    language=Language.Fortran)) == s // t
+        assert (normalize(
+            fromstring('"ABC" // "123" ',
+                       language=Language.Fortran)) == s // t)
          assert fromstring('f("ABC")') == f(s)
-        assert fromstring('MYSTRKIND_"ABC"') == as_string('"ABC"', 'MYSTRKIND')
-
-        assert fromstring('(/x, y/)') == a, fromstring('(/x, y/)')
-        assert fromstring('f((/x, y/))') == f(a)
-        assert fromstring('(/(x+y)*z/)') == as_array(((x+y)*z,))
-
-        assert fromstring('123') == as_number(123)
-        assert fromstring('123_2') == as_number(123, 2)
-        assert fromstring('123_myintkind') == as_number(123, 'myintkind')
-
-        assert fromstring('123.0') == as_number(123.0, 4)
-        assert fromstring('123.0_4') == as_number(123.0, 4)
-        assert fromstring('123.0_8') == as_number(123.0, 8)
-        assert fromstring('123.0e0') == as_number(123.0, 4)
-        assert fromstring('123.0d0') == as_number(123.0, 8)
-        assert fromstring('123d0') == as_number(123.0, 8)
-        assert fromstring('123e-0') == as_number(123.0, 4)
-        assert fromstring('123d+0') == as_number(123.0, 8)
-        assert fromstring('123.0_myrealkind') == as_number(123.0, 'myrealkind')
-        assert fromstring('3E4') == as_number(30000.0, 4)
-
-        assert fromstring('(1, 2)') == as_complex(1, 2)
-        assert fromstring('(1e2, PI)') == as_complex(
-            as_number(100.0), as_symbol('PI'))
-
-        assert fromstring('[1, 2]') == as_array((as_number(1), as_number(2)))
-
-        assert fromstring('POINT(x, y=1)') == as_apply(
-            as_symbol('POINT'), x, y=as_number(1))
-        assert (fromstring('PERSON(name="John", age=50, shape=(/34, 23/))')
-                == as_apply(as_symbol('PERSON'),
-                            name=as_string('"John"'),
-                            age=as_number(50),
-                            shape=as_array((as_number(34), as_number(23)))))
-
-        assert fromstring('x?y:z') == as_ternary(x, y, z)
-
-        assert fromstring('*x') == as_deref(x)
-        assert fromstring('**x') == as_deref(as_deref(x))
-        assert fromstring('&x') == as_ref(x)
-        assert fromstring('(*x) * (*y)') == as_deref(x) * as_deref(y)
-        assert fromstring('(*x) * *y') == as_deref(x) * as_deref(y)
-        assert fromstring('*x * *y') == as_deref(x) * as_deref(y)
-        assert fromstring('*x**y') == as_deref(x) * as_deref(y)
-
-        assert fromstring('x == y') == as_eq(x, y)
-        assert fromstring('x != y') == as_ne(x, y)
-        assert fromstring('x < y') == as_lt(x, y)
-        assert fromstring('x > y') == as_gt(x, y)
-        assert fromstring('x <= y') == as_le(x, y)
-        assert fromstring('x >= y') == as_ge(x, y)
-
-        assert fromstring('x .eq. y', language=Language.Fortran) == as_eq(x, y)
-        assert fromstring('x .ne. y', language=Language.Fortran) == as_ne(x, y)
-        assert fromstring('x .lt. y', language=Language.Fortran) == as_lt(x, y)
-        assert fromstring('x .gt. y', language=Language.Fortran) == as_gt(x, y)
-        assert fromstring('x .le. y', language=Language.Fortran) == as_le(x, y)
-        assert fromstring('x .ge. y', language=Language.Fortran) == as_ge(x, y)
+        assert fromstring('MYSTRKIND_"ABC"') == as_string('"ABC"', "MYSTRKIND")
+
+        assert fromstring("(/x, y/)") == a, fromstring("(/x, y/)")
+        assert fromstring("f((/x, y/))") == f(a)
+        assert fromstring("(/(x+y)*z/)") == as_array(((x + y) * z, ))
+
+        assert fromstring("123") == as_number(123)
+        assert fromstring("123_2") == as_number(123, 2)
+        assert fromstring("123_myintkind") == as_number(123, "myintkind")
+
+        assert fromstring("123.0") == as_number(123.0, 4)
+        assert fromstring("123.0_4") == as_number(123.0, 4)
+        assert fromstring("123.0_8") == as_number(123.0, 8)
+        assert fromstring("123.0e0") == as_number(123.0, 4)
+        assert fromstring("123.0d0") == as_number(123.0, 8)
+        assert fromstring("123d0") == as_number(123.0, 8)
+        assert fromstring("123e-0") == as_number(123.0, 4)
+        assert fromstring("123d+0") == as_number(123.0, 8)
+        assert fromstring("123.0_myrealkind") == as_number(123.0, "myrealkind")
+        assert fromstring("3E4") == as_number(30000.0, 4)
+
+        assert fromstring("(1, 2)") == as_complex(1, 2)
+        assert fromstring("(1e2, PI)") == as_complex(as_number(100.0),
+                                                     as_symbol("PI"))
+
+        assert fromstring("[1, 2]") == as_array((as_number(1), as_number(2)))
+
+        assert fromstring("POINT(x, y=1)") == as_apply(as_symbol("POINT"),
+                                                       x,
+                                                       y=as_number(1))
+        assert fromstring(
+            'PERSON(name="John", age=50, shape=(/34, 23/))') == as_apply(
+                as_symbol("PERSON"),
+                name=as_string('"John"'),
+                age=as_number(50),
+                shape=as_array((as_number(34), as_number(23))),
+            )
+
+        assert fromstring("x?y:z") == as_ternary(x, y, z)
+
+        assert fromstring("*x") == as_deref(x)
+        assert fromstring("**x") == as_deref(as_deref(x))
+        assert fromstring("&x") == as_ref(x)
+        assert fromstring("(*x) * (*y)") == as_deref(x) * as_deref(y)
+        assert fromstring("(*x) * *y") == as_deref(x) * as_deref(y)
+        assert fromstring("*x * *y") == as_deref(x) * as_deref(y)
+        assert fromstring("*x**y") == as_deref(x) * as_deref(y)
+
+        assert fromstring("x == y") == as_eq(x, y)
+        assert fromstring("x != y") == as_ne(x, y)
+        assert fromstring("x < y") == as_lt(x, y)
+        assert fromstring("x > y") == as_gt(x, y)
+        assert fromstring("x <= y") == as_le(x, y)
+        assert fromstring("x >= y") == as_ge(x, y)
+
+        assert fromstring("x .eq. y", language=Language.Fortran) == as_eq(x, y)
+        assert fromstring("x .ne. y", language=Language.Fortran) == as_ne(x, y)
+        assert fromstring("x .lt. y", language=Language.Fortran) == as_lt(x, y)
+        assert fromstring("x .gt. y", language=Language.Fortran) == as_gt(x, y)
+        assert fromstring("x .le. y", language=Language.Fortran) == as_le(x, y)
+        assert fromstring("x .ge. y", language=Language.Fortran) == as_ge(x, y)
  
      def test_traverse(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
-        f = as_symbol('f')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
+        f = as_symbol("f")
  
          # Use traverse to substitute a symbol
          def replace_visit(s, r=z):
@@ -373,8 +404,9 @@ class TestSymbolic(util.F2PyTest):
          assert (f[y]).traverse(replace_visit) == f[y]
          assert (f[z]).traverse(replace_visit) == f[z]
          assert (x + y + z).traverse(replace_visit) == (2 * z + y)
-        assert (x + f(y, x - z)).traverse(
-            replace_visit) == (z + f(y, as_number(0)))
+        assert (x +
+                f(y, x - z)).traverse(replace_visit) == (z +
+                                                         f(y, as_number(0)))
          assert as_eq(x, y).traverse(replace_visit) == as_eq(z, y)
  
          # Use traverse to collect symbols, method 1
@@ -416,28 +448,28 @@ class TestSymbolic(util.F2PyTest):
          assert symbols == {x}
  
      def test_linear_solve(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
-        z = as_symbol('z')
+        x = as_symbol("x")
+        y = as_symbol("y")
+        z = as_symbol("z")
  
          assert x.linear_solve(x) == (as_number(1), as_number(0))
-        assert (x+1).linear_solve(x) == (as_number(1), as_number(1))
-        assert (2*x).linear_solve(x) == (as_number(2), as_number(0))
-        assert (2*x+3).linear_solve(x) == (as_number(2), as_number(3))
+        assert (x + 1).linear_solve(x) == (as_number(1), as_number(1))
+        assert (2 * x).linear_solve(x) == (as_number(2), as_number(0))
+        assert (2 * x + 3).linear_solve(x) == (as_number(2), as_number(3))
          assert as_number(3).linear_solve(x) == (as_number(0), as_number(3))
          assert y.linear_solve(x) == (as_number(0), y)
-        assert (y*z).linear_solve(x) == (as_number(0), y * z)
+        assert (y * z).linear_solve(x) == (as_number(0), y * z)
  
-        assert (x+y).linear_solve(x) == (as_number(1), y)
-        assert (z*x+y).linear_solve(x) == (z, y)
-        assert ((z+y)*x+y).linear_solve(x) == (z + y, y)
-        assert (z*y*x+y).linear_solve(x) == (z * y, y)
+        assert (x + y).linear_solve(x) == (as_number(1), y)
+        assert (z * x + y).linear_solve(x) == (z, y)
+        assert ((z + y) * x + y).linear_solve(x) == (z + y, y)
+        assert (z * y * x + y).linear_solve(x) == (z * y, y)
  
-        assert_raises(RuntimeError, lambda: (x*x).linear_solve(x))
+        pytest.raises(RuntimeError, lambda: (x * x).linear_solve(x))
  
      def test_as_numer_denom(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
+        x = as_symbol("x")
+        y = as_symbol("y")
          n = as_number(123)
  
          assert as_numer_denom(x) == (x, as_number(1))
@@ -446,11 +478,11 @@ class TestSymbolic(util.F2PyTest):
          assert as_numer_denom(x / y) == (x, y)
          assert as_numer_denom(x * y) == (x * y, as_number(1))
          assert as_numer_denom(n + x / y) == (x + n * y, y)
-        assert as_numer_denom(n + x / (y - x / n)) == (y * n ** 2, y * n - x)
+        assert as_numer_denom(n + x / (y - x / n)) == (y * n**2, y * n - x)
  
      def test_polynomial_atoms(self):
-        x = as_symbol('x')
-        y = as_symbol('y')
+        x = as_symbol("x")
+        y = as_symbol("y")
          n = as_number(123)
  
          assert x.polynomial_atoms() == {x}
@@ -459,4 +491,4 @@ class TestSymbolic(util.F2PyTest):
          assert (y(x)).polynomial_atoms() == {y(x)}
          assert (y(x) + x).polynomial_atoms() == {y(x), x}
          assert (y(x) * x[y]).polynomial_atoms() == {y(x), x[y]}
-        assert (y(x) ** x).polynomial_atoms() == {y(x)}
+        assert (y(x)**x).polynomial_atoms() == {y(x)}
diff --git a/numpy/f2py/tests/util.py b/numpy/f2py/tests/util.py
index 1a6805e751cd56ae6e5c329ecd8c9591232d9356..ae81bbfc4cc3df6fb6ca39659bb7f8e1fef888f1 100644
--- a/numpy/f2py/tests/util.py
+++ b/numpy/f2py/tests/util.py
@@ -3,6 +3,7 @@ Utility functions for
  
  - building and importing modules on test time, using a temporary location
  - detecting if compilers are present
+- determining paths to tests
  
  """
  import os
@@ -14,7 +15,10 @@ import atexit
  import textwrap
  import re
  import pytest
+import contextlib
+import numpy
  
+from pathlib import Path
  from numpy.compat import asbytes, asstr
  from numpy.testing import temppath
  from importlib import import_module
@@ -78,9 +82,11 @@ def _memoize(func):
          if isinstance(ret, Exception):
              raise ret
          return ret
+
      wrapper.__name__ = func.__name__
      return wrapper
  
+
  #
  # Building modules
  #
@@ -93,8 +99,7 @@ def build_module(source_files, options=[], skip=[], only=[], module_name=None):
  
      """
  
-    code = ("import sys; sys.path = %s; import numpy.f2py as f2py2e; "
-            "f2py2e.main()" % repr(sys.path))
+    code = f"import sys; sys.path = {sys.path!r}; import numpy.f2py; numpy.f2py.main()"
  
      d = get_module_dir()
  
@@ -109,29 +114,30 @@ def build_module(source_files, options=[], skip=[], only=[], module_name=None):
          dst_sources.append(dst)
  
          base, ext = os.path.splitext(dst)
-        if ext in ('.f90', '.f', '.c', '.pyf'):
+        if ext in (".f90", ".f", ".c", ".pyf"):
              f2py_sources.append(dst)
  
      # Prepare options
      if module_name is None:
          module_name = get_temp_module_name()
-    f2py_opts = ['-c', '-m', module_name] + options + f2py_sources
+    f2py_opts = ["-c", "-m", module_name] + options + f2py_sources
      if skip:
-        f2py_opts += ['skip:'] + skip
+        f2py_opts += ["skip:"] + skip
      if only:
-        f2py_opts += ['only:'] + only
+        f2py_opts += ["only:"] + only
  
      # Build
      cwd = os.getcwd()
      try:
          os.chdir(d)
-        cmd = [sys.executable, '-c', code] + f2py_opts
-        p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
+        cmd = [sys.executable, "-c", code] + f2py_opts
+        p = subprocess.Popen(cmd,
+                             stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
          out, err = p.communicate()
          if p.returncode != 0:
-            raise RuntimeError("Running f2py failed: %s\n%s"
-                               % (cmd[4:], asstr(out)))
+            raise RuntimeError("Running f2py failed: %s\n%s" %
+                               (cmd[4:], asstr(out)))
      finally:
          os.chdir(cwd)
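
Review note: the build step above launches f2py in a child interpreter, folds stderr into stdout, and raises `RuntimeError` when the exit status is non-zero. A minimal standalone sketch of that pattern follows. The helper name `run_python` and the demo arguments are assumptions made for illustration and are not part of `util.py`.

```python
import subprocess
import sys


def run_python(code, *args):
    """Hypothetical helper: run `code` in a child interpreter and return its output."""
    cmd = [sys.executable, "-c", code, *args]
    # stderr is folded into stdout so a single stream carries the whole log.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = p.communicate()  # second item is None because stderr is redirected
    text = out.decode(errors="replace")
    if p.returncode != 0:
        raise RuntimeError(f"child interpreter failed: {args}\n{text}")
    return text


# Arguments after the `-c` program are passed through to the child's sys.argv.
ok_output = run_python("import sys; print(sys.argv[1:])", "-m", "demo")

try:
    run_python("import sys; sys.exit(3)")
    failed = False
except RuntimeError:
    failed = True
```

Merging the two streams keeps compiler diagnostics in the order they were emitted, at the cost of not being able to tell afterwards which stream a given line came from.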
  
@@ -144,20 +150,28 @@ def build_module(source_files, options=[], skip=[], only=[], module_name=None):
  
  
  @_memoize
-def build_code(source_code, options=[], skip=[], only=[], suffix=None,
+def build_code(source_code,
+               options=[],
+               skip=[],
+               only=[],
+               suffix=None,
                 module_name=None):
      """
      Compile and import Fortran code using f2py.
  
      """
      if suffix is None:
-        suffix = '.f'
+        suffix = ".f"
      with temppath(suffix=suffix) as path:
-        with open(path, 'w') as f:
+        with open(path, "w") as f:
              f.write(source_code)
-        return build_module([path], options=options, skip=skip, only=only,
+        return build_module([path],
+                            options=options,
+                            skip=skip,
+                            only=only,
                              module_name=module_name)
  
+
  #
  # Check if compilers are available at all...
  #
@@ -174,10 +188,10 @@ def _get_compiler_status():
  
      # XXX: this is really ugly. But I don't know how to invoke Distutils
      #      in a safer way...
-    code = textwrap.dedent("""\
+    code = textwrap.dedent(f"""\
          import os
          import sys
-        sys.path = %(syspath)s
+        sys.path = {repr(sys.path)}
  
          def configuration(parent_name='',top_path=None):
              global config
@@ -189,7 +203,7 @@ def _get_compiler_status():
          setup(configuration=configuration)
  
          config_cmd = config.get_config_cmd()
-        have_c = config_cmd.try_compile('void foo() {}')
+        have_c = config_cmd.try_compile('void foo() {{}}')
          print('COMPILERS:%%d,%%d,%%d' %% (have_c,
                                            config.have_f77c(),
                                            config.have_f90c()))
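
Review note: once the template becomes an f-string, literal braces in the generated script have to be doubled, which is why `'void foo() {}'` turns into `'void foo() {{}}'` in the hunk above. A minimal sketch of the escaping rule, assuming a made-up `search_path` value in place of `sys.path`:

```python
import textwrap

search_path = ["/opt/example"]  # hypothetical stand-in for sys.path

script = textwrap.dedent(f"""\
    import sys
    sys.path = {search_path!r}
    have_c = try_compile('void foo() {{}}')
    """)

# `{search_path!r}` is interpolated, while `{{}}` collapses to a literal `{}`.
first_line, path_line, compile_line = script.splitlines()
```

Interpolation happens before `textwrap.dedent` sees the text, so a multi-line interpolated value with less indentation than the template would reduce the common margin that `dedent` strips.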
@@ -199,23 +213,27 @@ def _get_compiler_status():
  
      tmpdir = tempfile.mkdtemp()
      try:
-        script = os.path.join(tmpdir, 'setup.py')
+        script = os.path.join(tmpdir, "setup.py")
  
-        with open(script, 'w') as f:
+        with open(script, "w") as f:
              f.write(code)
  
-        cmd = [sys.executable, 'setup.py', 'config']
-        p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
+        cmd = [sys.executable, "setup.py", "config"]
+        p = subprocess.Popen(cmd,
+                             stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT,
                               cwd=tmpdir)
          out, err = p.communicate()
      finally:
          shutil.rmtree(tmpdir)
  
-    m = re.search(br'COMPILERS:(\d+),(\d+),(\d+)', out)
+    m = re.search(br"COMPILERS:(\d+),(\d+),(\d+)", out)
      if m:
-        _compiler_status = (bool(int(m.group(1))), bool(int(m.group(2))),
-                            bool(int(m.group(3))))
+        _compiler_status = (
+            bool(int(m.group(1))),
+            bool(int(m.group(2))),
+            bool(int(m.group(3))),
+        )
      # Finished
      return _compiler_status
  
@@ -231,6 +249,7 @@ def has_f77_compiler():
  def has_f90_compiler():
      return _get_compiler_status()[2]
  
+
  #
  # Building with distutils
  #
@@ -256,38 +275,38 @@ def build_module_distutils(source_files, config_code, module_name, **kw):
      # Build script
      config_code = textwrap.dedent(config_code).replace("\n", "\n    ")
  
-    code = textwrap.dedent("""\
-        import os
-        import sys
-        sys.path = %(syspath)s
-
-        def configuration(parent_name='',top_path=None):
-            from numpy.distutils.misc_util import Configuration
-            config = Configuration('', parent_name, top_path)
-            %(config_code)s
-            return config
+    code = fr"""
+import os
+import sys
+sys.path = {repr(sys.path)}
  
-        if __name__ == "__main__":
-            from numpy.distutils.core import setup
-            setup(configuration=configuration)
-        """) % dict(config_code=config_code, syspath=repr(sys.path))
+def configuration(parent_name='',top_path=None):
+    from numpy.distutils.misc_util import Configuration
+    config = Configuration('', parent_name, top_path)
+    {config_code}
+    return config
  
-    script = os.path.join(d, get_temp_module_name() + '.py')
+if __name__ == "__main__":
+    from numpy.distutils.core import setup
+    setup(configuration=configuration)
+    """
+    script = os.path.join(d, get_temp_module_name() + ".py")
      dst_sources.append(script)
-    with open(script, 'wb') as f:
+    with open(script, "wb") as f:
          f.write(asbytes(code))
  
      # Build
      cwd = os.getcwd()
      try:
          os.chdir(d)
-        cmd = [sys.executable, script, 'build_ext', '-i']
-        p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
+        cmd = [sys.executable, script, "build_ext", "-i"]
+        p = subprocess.Popen(cmd,
+                             stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
          out, err = p.communicate()
          if p.returncode != 0:
-            raise RuntimeError("Running distutils build failed: %s\n%s"
-                               % (cmd[4:], asstr(out)))
+            raise RuntimeError("Running distutils build failed: %s\n%s" %
+                               (cmd[4:], asstr(out)))
      finally:
          os.chdir(cwd)
  
@@ -299,6 +318,7 @@ def build_module_distutils(source_files, config_code, module_name, **kw):
      __import__(module_name)
      return sys.modules[module_name]
  
+
  #
  # Unittest convenience
  #
@@ -310,13 +330,13 @@ class F2PyTest:
      options = []
      skip = []
      only = []
-    suffix = '.f'
+    suffix = ".f"
      module = None
      module_name = None
  
      def setup(self):
-        if sys.platform == 'win32':
-            pytest.skip('Fails with MinGW64 Gfortran (Issue #9673)')
+        if sys.platform == "win32":
+            pytest.skip("Fails with MinGW64 Gfortran (Issue #9673)")
  
          if self.module is not None:
              return
@@ -333,24 +353,58 @@ class F2PyTest:
  
          needs_f77 = False
          needs_f90 = False
+        needs_pyf = False
          for fn in codes:
-            if fn.endswith('.f'):
+            if str(fn).endswith(".f"):
                  needs_f77 = True
-            elif fn.endswith('.f90'):
+            elif str(fn).endswith(".f90"):
                  needs_f90 = True
+            elif str(fn).endswith(".pyf"):
+                needs_pyf = True
          if needs_f77 and not has_f77_compiler():
              pytest.skip("No Fortran 77 compiler available")
          if needs_f90 and not has_f90_compiler():
              pytest.skip("No Fortran 90 compiler available")
+        if needs_pyf and not (has_f90_compiler() or has_f77_compiler()):
+            pytest.skip("No Fortran compiler available")
  
          # Build the module
          if self.code is not None:
-            self.module = build_code(self.code, options=self.options,
-                                     skip=self.skip, only=self.only,
-                                     suffix=self.suffix,
-                                     module_name=self.module_name)
+            self.module = build_code(
+                self.code,
+                options=self.options,
+                skip=self.skip,
+                only=self.only,
+                suffix=self.suffix,
+                module_name=self.module_name,
+            )
  
          if self.sources is not None:
-            self.module = build_module(self.sources, options=self.options,
-                                       skip=self.skip, only=self.only,
-                                       module_name=self.module_name)
+            self.module = build_module(
+                self.sources,
+                options=self.options,
+                skip=self.skip,
+                only=self.only,
+                module_name=self.module_name,
+            )
+
+
+#
+# Helper functions
+#
+
+
+def getpath(*a):
+    # Package root
+    d = Path(numpy.f2py.__file__).parent.resolve()
+    return d.joinpath(*a)
+
+
+@contextlib.contextmanager
+def switchdir(path):
+    curpath = Path.cwd()
+    os.chdir(path)
+    try:
+        yield
+    finally:
+        os.chdir(curpath)
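
Review note: `switchdir` restores the previous working directory even when the body raises, because the `os.chdir(curpath)` call sits in a `finally` clause. A minimal usage sketch follows. The helper is repeated so the snippet runs on its own, and the temporary directory is an assumption chosen for the demonstration.

```python
import contextlib
import os
import tempfile
from pathlib import Path


@contextlib.contextmanager
def switchdir(path):
    # Same shape as the helper added in the diff: remember the current
    # directory, change into `path`, and always change back on exit.
    curpath = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(curpath)


before = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    with switchdir(tmp):
        inside = Path.cwd()
    # resolve() guards against symlinked temp directories (e.g. on macOS).
    entered_tmp = inside.resolve() == Path(tmp).resolve()

    # The directory is restored even if the body raises.
    try:
        with switchdir(tmp):
            raise ValueError("simulated failure")
    except ValueError:
        pass
    restored = Path.cwd() == before
```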
diff --git a/numpy/fft/__init__.pyi b/numpy/fft/__init__.pyi
index 510e576d38062368821445d97cd903ddf93ff161..5518aac16b00728d4b7449342618f4ba810224a3 100644
--- a/numpy/fft/__init__.pyi
+++ b/numpy/fft/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import Any, List
-
  from numpy._pytesttester import PytestTester
  
  from numpy.fft._pocketfft import (
@@ -26,6 +24,6 @@ from numpy.fft.helper import (
      rfftfreq as rfftfreq,
  )
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
diff --git a/numpy/fft/_pocketfft.pyi b/numpy/fft/_pocketfft.pyi
index 86cf6a60d84ed9a29ede0e8275ad4b76895d7ba8..2bd8b0ba34af4166679c3bb96df1c26f88263bfc 100644
--- a/numpy/fft/_pocketfft.pyi
+++ b/numpy/fft/_pocketfft.pyi
@@ -1,15 +1,12 @@
-from typing import (
-    Literal as L,
-    List,
-    Sequence,
-)
+from collections.abc import Sequence
+from typing import Literal as L
  
  from numpy import complex128, float64
-from numpy.typing import ArrayLike, NDArray, _ArrayLikeNumber_co
+from numpy._typing import ArrayLike, NDArray, _ArrayLikeNumber_co
  
  _NormKind = L[None, "backward", "ortho", "forward"]
  
-__all__: List[str]
+__all__: list[str]
  
  def fft(
      a: ArrayLike,
diff --git a/numpy/fft/helper.pyi b/numpy/fft/helper.pyi
index d75826f4e03e34e1c17c63910b84369861840e3f..b49fc88f727f334cc421388ce77125eeeb13d080 100644
--- a/numpy/fft/helper.pyi
+++ b/numpy/fft/helper.pyi
@@ -1,21 +1,18 @@
-from typing import List, Any, TypeVar, overload
+from typing import Any, TypeVar, overload
  
-from numpy import generic, dtype, integer, floating, complexfloating
-from numpy.typing import (
+from numpy import generic, integer, floating, complexfloating
+from numpy._typing import (
      NDArray,
      ArrayLike,
      _ShapeLike,
-    _SupportsArray,
-    _FiniteNestedSequence,
+    _ArrayLike,
      _ArrayLikeFloat_co,
      _ArrayLikeComplex_co,
  )
  
  _SCT = TypeVar("_SCT", bound=generic)
  
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def fftshift(x: _ArrayLike[_SCT], axes: None | _ShapeLike = ...) -> NDArray[_SCT]: ...
diff --git a/numpy/lib/__init__.pyi b/numpy/lib/__init__.pyi
index f6ec673ec178945bc822309d3f325b1fc12a7c45..0e3da5b413cb4459d09388f4caeac1f12df3566c 100644
--- a/numpy/lib/__init__.pyi
+++ b/numpy/lib/__init__.pyi
@@ -1,5 +1,5 @@
  import math as math
-from typing import Any, List
+from typing import Any
  
  from numpy._pytesttester import PytestTester
  
@@ -237,8 +237,8 @@ from numpy.core.multiarray import (
      tracemalloc_domain as tracemalloc_domain,
  )
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
  
  __version__ = version
diff --git a/numpy/lib/_datasource.py b/numpy/lib/_datasource.py
index 8201d3772887c24f0b8b26d0fc263f70b8bf834a..b7778234e8592cc9c56fc93e298b8e854314b275 100644
--- a/numpy/lib/_datasource.py
+++ b/numpy/lib/_datasource.py
@@ -280,8 +280,9 @@ class DataSource:
      def _splitzipext(self, filename):
          """Split zip extension from filename and return filename.
  
-        *Returns*:
-            base, zip_ext : {tuple}
+        Returns
+        -------
+        base, zip_ext : {tuple}
  
          """
  
diff --git a/numpy/lib/_version.pyi b/numpy/lib/_version.pyi
index 3581d639bcddcb4c021594502f6e5971db4b570c..1c82c99b686e2be8e34a1b6bc45dacce15532082 100644
--- a/numpy/lib/_version.pyi
+++ b/numpy/lib/_version.pyi
@@ -1,6 +1,4 @@
-from typing import Union, List
-
-__all__: List[str]
+__all__: list[str]
  
  class NumpyVersion:
      vstring: str
@@ -11,9 +9,9 @@ class NumpyVersion:
      pre_release: str
      is_devversion: bool
      def __init__(self, vstring: str) -> None: ...
-    def __lt__(self, other: Union[str, NumpyVersion]) -> bool: ...
-    def __le__(self, other: Union[str, NumpyVersion]) -> bool: ...
-    def __eq__(self, other: Union[str, NumpyVersion]) -> bool: ...  # type: ignore[override]
-    def __ne__(self, other: Union[str, NumpyVersion]) -> bool: ...  # type: ignore[override]
-    def __gt__(self, other: Union[str, NumpyVersion]) -> bool: ...
-    def __ge__(self, other: Union[str, NumpyVersion]) -> bool: ...
+    def __lt__(self, other: str | NumpyVersion) -> bool: ...
+    def __le__(self, other: str | NumpyVersion) -> bool: ...
+    def __eq__(self, other: str | NumpyVersion) -> bool: ...  # type: ignore[override]
+    def __ne__(self, other: str | NumpyVersion) -> bool: ...  # type: ignore[override]
+    def __gt__(self, other: str | NumpyVersion) -> bool: ...
+    def __ge__(self, other: str | NumpyVersion) -> bool: ...
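
Review note: the stub now spells the operand type as `str | NumpyVersion` instead of `Union[str, NumpyVersion]`. Stub files are not executed, so the newer syntax is safe there regardless of the interpreter version. A minimal runtime sketch of comparison methods that accept either a string or an instance follows. `Version` is a hypothetical stand-in and not `NumpyVersion`, and the annotations are quoted so the snippet also runs on interpreters older than 3.10.

```python
class Version:
    """Hypothetical dotted-integer version type used only for illustration."""

    def __init__(self, vstring: str) -> None:
        self.parts = tuple(int(p) for p in vstring.split("."))

    def _coerce(self, other: "str | Version") -> "Version":
        # Accept a plain string on the right-hand side of a comparison.
        return Version(other) if isinstance(other, str) else other

    def __lt__(self, other: "str | Version") -> bool:
        return self.parts < self._coerce(other).parts

    def __ge__(self, other: "str | Version") -> bool:
        return not (self < other)


older_than_string = Version("1.22.4") < "1.23.0"
newer_than_instance = Version("1.23.0") >= Version("1.22.4")
```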
diff --git a/numpy/lib/arraypad.pyi b/numpy/lib/arraypad.pyi
index d7c5f48445fa0399810fdb0fd921c4b4c2aa0c44..1ac6fc7d91c868ba077235b8229cd00869386660 100644
--- a/numpy/lib/arraypad.pyi
+++ b/numpy/lib/arraypad.pyi
@@ -1,22 +1,18 @@
  from typing import (
      Literal as L,
      Any,
-    Dict,
-    List,
      overload,
-    Tuple,
      TypeVar,
      Protocol,
  )
  
-from numpy import ndarray, dtype, generic
+from numpy import generic
  
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      NDArray,
      _ArrayLikeInt,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _ArrayLike,
  )
  
  _SCT = TypeVar("_SCT", bound=generic)
@@ -25,9 +21,9 @@ class _ModeFunc(Protocol):
      def __call__(
          self,
          vector: NDArray[Any],
-        iaxis_pad_width: Tuple[int, int],
+        iaxis_pad_width: tuple[int, int],
          iaxis: int,
-        kwargs: Dict[str, Any],
+        kwargs: dict[str, Any],
          /,
      ) -> None: ...
  
@@ -45,9 +41,7 @@ _ModeKind = L[
      "empty",
  ]
  
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
-__all__: List[str]
+__all__: list[str]
  
  # TODO: In practice each keyword argument is exclusive to one or more
  # specific modes. Consider adding more overloads to express this in the future.
diff --git a/numpy/lib/arraysetops.py b/numpy/lib/arraysetops.py
index bd56b697566976cf4a137660759cd1784ba34f83..d42ab26758c7b53e8e27d6c37da3a9024b5aec79 100644
--- a/numpy/lib/arraysetops.py
+++ b/numpy/lib/arraysetops.py
@@ -131,13 +131,13 @@ def _unpack_tuple(x):
  
  
  def _unique_dispatcher(ar, return_index=None, return_inverse=None,
-                       return_counts=None, axis=None):
+                       return_counts=None, axis=None, *, equal_nan=None):
      return (ar,)
  
  
  @array_function_dispatch(_unique_dispatcher)
  def unique(ar, return_index=False, return_inverse=False,
-           return_counts=False, axis=None):
+           return_counts=False, axis=None, *, equal_nan=True):
      """
      Find the unique elements of an array.
  
@@ -162,9 +162,6 @@ def unique(ar, return_index=False, return_inverse=False,
      return_counts : bool, optional
          If True, also return the number of times each unique item appears
          in `ar`.
-
-        .. versionadded:: 1.9.0
-
      axis : int or None, optional
          The axis to operate on. If None, `ar` will be flattened. If an integer,
          the subarrays indexed by the given axis will be flattened and treated
@@ -175,6 +172,11 @@ def unique(ar, return_index=False, return_inverse=False,
  
          .. versionadded:: 1.13.0
  
+    equal_nan : bool, optional
+        If True, collapses multiple NaN values in the return array into one.
+
+        .. versionadded:: 1.24
+
      Returns
      -------
      unique : ndarray
@@ -269,7 +271,8 @@ def unique(ar, return_index=False, return_inverse=False,
      """
      ar = np.asanyarray(ar)
      if axis is None:
-        ret = _unique1d(ar, return_index, return_inverse, return_counts)
+        ret = _unique1d(ar, return_index, return_inverse, return_counts, 
+                        equal_nan=equal_nan)
          return _unpack_tuple(ret)
  
      # axis was specified and not None
@@ -312,13 +315,13 @@ def unique(ar, return_index=False, return_inverse=False,
          return uniq
  
      output = _unique1d(consolidated, return_index,
-                       return_inverse, return_counts)
+                       return_inverse, return_counts, equal_nan=equal_nan)
      output = (reshape_uniq(output[0]),) + output[1:]
      return _unpack_tuple(output)
  
  
  def _unique1d(ar, return_index=False, return_inverse=False,
-              return_counts=False):
+              return_counts=False, *, equal_nan=True):
      """
      Find the unique elements of an array, ignoring shape.
      """
@@ -334,7 +337,8 @@ def _unique1d(ar, return_index=False, return_inverse=False,
          aux = ar
      mask = np.empty(aux.shape, dtype=np.bool_)
      mask[:1] = True
-    if aux.shape[0] > 0 and aux.dtype.kind in "cfmM" and np.isnan(aux[-1]):
+    if (equal_nan and aux.shape[0] > 0 and aux.dtype.kind in "cfmM" and
+            np.isnan(aux[-1])):
          if aux.dtype.kind == "c":  # for complex all NaNs are considered equivalent
              aux_firstnan = np.searchsorted(np.isnan(aux), True, side='left')
          else:
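
Review note: with the added `equal_nan` guard, the NaN-collapsing branch above runs only when the flag is true; otherwise every NaN stays a separate entry because NaN never compares equal to itself. A minimal pure-Python sketch of that behaviour follows. It is not NumPy's implementation, which works on a boolean mask over the sorted array, and `unique_sorted` is a hypothetical name.

```python
import math


def unique_sorted(values, *, equal_nan=True):
    """Hypothetical analogue of 1-D unique for a list of floats."""
    # Sort with NaNs placed last so they end up adjacent to each other.
    aux = sorted(values, key=lambda v: (math.isnan(v), 0.0 if math.isnan(v) else v))
    result = []
    for v in aux:
        if result:
            prev = result[-1]
            both_nan = math.isnan(v) and math.isnan(prev)
            # Without equal_nan, two NaNs are never treated as duplicates.
            if v == prev or (equal_nan and both_nan):
                continue
        result.append(v)
    return result


data = [1.0, float("nan"), 1.0, float("nan")]
collapsed = unique_sorted(data)                  # one 1.0 and one NaN
separate = unique_sorted(data, equal_nan=False)  # one 1.0 and both NaNs
```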
@@ -640,7 +644,7 @@ def _isin_dispatcher(element, test_elements, assume_unique=None, invert=None):
  @array_function_dispatch(_isin_dispatcher)
  def isin(element, test_elements, assume_unique=False, invert=False):
      """
-    Calculates `element in test_elements`, broadcasting over `element` only.
+    Calculates ``element in test_elements``, broadcasting over `element` only.
      Returns a boolean array of the same shape as `element` that is True
      where an element of `element` is in `test_elements` and False otherwise.
  
diff --git a/numpy/lib/arraysetops.pyi b/numpy/lib/arraysetops.pyi
index 6f13ec74b82a3e28ed349af9c9cc4457138fd376..aa1310a3210c9178b6a31d608c3c52e5d9bd2710 100644
--- a/numpy/lib/arraysetops.pyi
+++ b/numpy/lib/arraysetops.pyi
@@ -1,16 +1,12 @@
  from typing import (
      Literal as L,
      Any,
-    List,
-    Union,
      TypeVar,
-    Tuple,
      overload,
      SupportsIndex,
  )
  
  from numpy import (
-    dtype,
      generic,
      number,
      bool_,
@@ -41,11 +37,10 @@ from numpy import (
      void,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      NDArray,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _ArrayLike,
      _ArrayLikeBool_co,
      _ArrayLikeDT64_co,
      _ArrayLikeTD64_co,
@@ -90,9 +85,7 @@ _SCTNoCast = TypeVar(
      void,
  )
  
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def ediff1d(
@@ -132,6 +125,8 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
+    *,
+    equal_nan: bool = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def unique(
@@ -140,6 +135,8 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
+    *,
+    equal_nan: bool = ...,
  ) -> NDArray[Any]: ...
  @overload
  def unique(
@@ -148,7 +145,9 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -156,7 +155,9 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp]]: ...
  @overload
  def unique(
      ar: _ArrayLike[_SCT],
@@ -164,7 +165,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -172,7 +175,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp]]: ...
  @overload
  def unique(
      ar: _ArrayLike[_SCT],
@@ -180,7 +185,9 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -188,7 +195,9 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp]]: ...
  @overload
  def unique(
      ar: _ArrayLike[_SCT],
@@ -196,7 +205,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -204,7 +215,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[False] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: _ArrayLike[_SCT],
@@ -212,7 +225,9 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -220,7 +235,9 @@ def unique(
      return_inverse: L[False] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: _ArrayLike[_SCT],
@@ -228,7 +245,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -236,7 +255,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: _ArrayLike[_SCT],
@@ -244,7 +265,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[_SCT], NDArray[intp], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[_SCT], NDArray[intp], NDArray[intp], NDArray[intp]]: ...
  @overload
  def unique(
      ar: ArrayLike,
@@ -252,7 +275,9 @@ def unique(
      return_inverse: L[True] = ...,
      return_counts: L[True] = ...,
      axis: None | SupportsIndex = ...,
-) -> Tuple[NDArray[Any], NDArray[intp], NDArray[intp], NDArray[intp]]: ...
+    *,
+    equal_nan: bool = ...,
+) -> tuple[NDArray[Any], NDArray[intp], NDArray[intp], NDArray[intp]]: ...
  
  @overload
  def intersect1d(
@@ -274,14 +299,14 @@ def intersect1d(
      ar2: _ArrayLike[_SCTNoCast],
      assume_unique: bool = ...,
      return_indices: L[True] = ...,
-) -> Tuple[NDArray[_SCTNoCast], NDArray[intp], NDArray[intp]]: ...
+) -> tuple[NDArray[_SCTNoCast], NDArray[intp], NDArray[intp]]: ...
  @overload
  def intersect1d(
      ar1: ArrayLike,
      ar2: ArrayLike,
      assume_unique: bool = ...,
      return_indices: L[True] = ...,
-) -> Tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
+) -> tuple[NDArray[Any], NDArray[intp], NDArray[intp]]: ...
  
  @overload
  def setxor1d(
diff --git a/numpy/lib/arrayterator.pyi b/numpy/lib/arrayterator.pyi
index 82c66920640f4302d1b371c2a60a7e5f21ec5152..aa192fb7c40ffeaddc8b082d86755eb3722b8634 100644
--- a/numpy/lib/arrayterator.pyi
+++ b/numpy/lib/arrayterator.pyi
@@ -1,16 +1,13 @@
+from collections.abc import Generator
  from typing import (
-    List,
      Any,
      TypeVar,
-    Generator,
-    List,
      Union,
-    Tuple,
      overload,
  )
  
  from numpy import ndarray, dtype, generic
-from numpy.typing import DTypeLike
+from numpy._typing import DTypeLike
  
  # TODO: Set a shape bound once we've got proper shape support
  _Shape = TypeVar("_Shape", bound=Any)
@@ -19,10 +16,10 @@ _ScalarType = TypeVar("_ScalarType", bound=generic)
  
  _Index = Union[
      Union[ellipsis, int, slice],
-    Tuple[Union[ellipsis, int, slice], ...],
+    tuple[Union[ellipsis, int, slice], ...],
  ]
  
-__all__: List[str]
+__all__: list[str]
  
  # NOTE: In reality `Arrayterator` does not actually inherit from `ndarray`,
  # but its ``__getattr__` method does wrap around the former and thus has
@@ -31,12 +28,12 @@ __all__: List[str]
  class Arrayterator(ndarray[_Shape, _DType]):
      var: ndarray[_Shape, _DType]  # type: ignore[assignment]
      buf_size: None | int
-    start: List[int]
-    stop: List[int]
-    step: List[int]
+    start: list[int]
+    stop: list[int]
+    step: list[int]
  
      @property  # type: ignore[misc]
-    def shape(self) -> Tuple[int, ...]: ...
+    def shape(self) -> tuple[int, ...]: ...
      @property
      def flat(  # type: ignore[override]
          self: ndarray[Any, dtype[_ScalarType]]
diff --git a/numpy/lib/format.py b/numpy/lib/format.py
index 3967b43eefc2761d210a22d4d508d2741df984ba..625768b626405f9aeb283eec220491b54070e7a0 100644
--- a/numpy/lib/format.py
+++ b/numpy/lib/format.py
@@ -370,8 +370,7 @@ def _wrap_header(header, version):
      import struct
      assert version is not None
      fmt, encoding = _header_size_info[version]
-    if not isinstance(header, bytes):  # always true on python 3
-        header = header.encode(encoding)
+    header = header.encode(encoding)
      hlen = len(header) + 1
      padlen = ARRAY_ALIGN - ((MAGIC_LEN + struct.calcsize(fmt) + hlen) % ARRAY_ALIGN)
      try:
@@ -421,10 +420,10 @@ def _write_array_header(fp, d, version=None):
      d : dict
          This has the appropriate entries for writing its string representation
          to the header of the file.
-    version: tuple or None
-        None means use oldest that works
-        explicit version will raise a ValueError if the format does not
-        allow saving this data.  Default: None
+    version : tuple or None
+        None means use oldest that works. Providing an explicit version will
+        raise a ValueError if the format does not allow saving this data.
+        Default: None
      """
      header = ["{"]
      for key, value in sorted(d.items()):
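Illustration only, not part of the upstream change: the `version` docstring above says that `None` selects the oldest format version that works, and `_wrap_header` pads the header to a multiple of `ARRAY_ALIGN`. The sketch below checks both points. It assumes the public helpers `numpy.lib.format.write_array`, `read_magic` and `read_array_header_1_0`; the variable names are made up for the example.

```python
import io

import numpy as np
from numpy.lib import format as npy_format

arr = np.arange(3)

# version=None asks for the oldest format version able to store this array.
buf = io.BytesIO()
npy_format.write_array(buf, arr, version=None)

# Read the file back piece by piece: magic string first, then the header.
buf.seek(0)
version = npy_format.read_magic(buf)
shape, fortran_order, dtype = npy_format.read_array_header_1_0(buf)

# Position of the first data byte, i.e. the total padded header length.
data_offset = buf.tell()
```

For a small integer array like this one, the file should report format version (1, 0), and `data_offset` should be a multiple of `npy_format.ARRAY_ALIGN`.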
diff --git a/numpy/lib/format.pyi b/numpy/lib/format.pyi
index 092245daf01f030d15ce978b083f21f7a926ee00..a4468f52f4646b8b9413f279b09f85cd201aaf51 100644
--- a/numpy/lib/format.pyi
+++ b/numpy/lib/format.pyi
@@ -1,8 +1,8 @@
-from typing import Any, List, Set, Literal, Final
+from typing import Any, Literal, Final
  
-__all__: List[str]
+__all__: list[str]
  
-EXPECTED_KEYS: Final[Set[str]]
+EXPECTED_KEYS: Final[set[str]]
  MAGIC_PREFIX: Final[bytes]
  MAGIC_LEN: Literal[8]
  ARRAY_ALIGN: Literal[64]
diff --git a/numpy/lib/function_base.py b/numpy/lib/function_base.py
index a215f63d30406155fcf38753eaabc7765c13a802..843e1b85ae6b4509d751776750615b95dea445dc 100644
--- a/numpy/lib/function_base.py
+++ b/numpy/lib/function_base.py
@@ -168,7 +168,7 @@ def rot90(m, k=1, axes=(0, 1)):
          Array of two or more dimensions.
      k : integer
          Number of times the array is rotated by 90 degrees.
-    axes: (2,) array_like
+    axes : (2,) array_like
          The array is rotated in the plane defined by the axes.
          Axes must be different.
  
@@ -388,12 +388,14 @@ def iterable(y):
      return True
  
  
-def _average_dispatcher(a, axis=None, weights=None, returned=None):
+def _average_dispatcher(a, axis=None, weights=None, returned=None, *,
+                        keepdims=None):
      return (a, weights)
  
  
  @array_function_dispatch(_average_dispatcher)
-def average(a, axis=None, weights=None, returned=False):
+def average(a, axis=None, weights=None, returned=False, *,
+            keepdims=np._NoValue):
      """
      Compute the weighted average along the specified axis.
  
@@ -428,6 +430,14 @@ def average(a, axis=None, weights=None, returned=False):
          is returned, otherwise only the average is returned.
          If `weights=None`, `sum_of_weights` is equivalent to the number of
          elements over which the average is taken.
+    keepdims : bool, optional
+        If this is set to True, the axes which are reduced are left
+        in the result as dimensions with size one. With this option,
+        the result will broadcast correctly against the original `a`.
+        *Note:* `keepdims` will not work with instances of `numpy.matrix`
+        or other classes whose methods do not support `keepdims`.
+
+        .. versionadded:: 1.23.0
  
      Returns
      -------
@@ -471,7 +481,7 @@ def average(a, axis=None, weights=None, returned=False):
      >>> np.average(np.arange(1, 11), weights=np.arange(10, 0, -1))
      4.0
  
-    >>> data = np.arange(6).reshape((3,2))
+    >>> data = np.arange(6).reshape((3, 2))
      >>> data
      array([[0, 1],
             [2, 3],
@@ -488,11 +498,24 @@ def average(a, axis=None, weights=None, returned=False):
      >>> avg = np.average(a, weights=w)
      >>> print(avg.dtype)
      complex256
+
+    With ``keepdims=True``, the following result has shape (3, 1).
+
+    >>> np.average(data, axis=1, keepdims=True)
+    array([[0.5],
+           [2.5],
+           [4.5]])
      """
      a = np.asanyarray(a)
  
+    if keepdims is np._NoValue:
+        # Don't pass on the keepdims argument if one wasn't given.
+        keepdims_kw = {}
+    else:
+        keepdims_kw = {'keepdims': keepdims}
+
      if weights is None:
-        avg = a.mean(axis)
+        avg = a.mean(axis, **keepdims_kw)
          scl = avg.dtype.type(a.size/avg.size)
      else:
          wgt = np.asanyarray(weights)
@@ -524,7 +547,8 @@ def average(a, axis=None, weights=None, returned=False):
              raise ZeroDivisionError(
                  "Weights sum to zero, can't be normalized")
  
-        avg = np.multiply(a, wgt, dtype=result_dtype).sum(axis)/scl
+        avg = np.multiply(a, wgt,
+                          dtype=result_dtype).sum(axis, **keepdims_kw) / scl
  
      if returned:
          if scl.shape != avg.shape:
@@ -906,7 +930,7 @@ def copy(a, order='K', subok=False):
      >>> b[0] = 3
      >>> b
      array([3, 2, 3])
-    
+
      Note that np.copy is a shallow copy and will not copy object
      elements within arrays. This is mainly important for arrays
      containing Python objects. The new array will contain the
@@ -1656,7 +1680,7 @@ def unwrap(p, discont=None, axis=-1, *, period=2*pi):
          larger than ``period/2``.
      axis : int, optional
          Axis along which unwrap will operate, default is the last axis.
-    period: float, optional
+    period : float, optional
          Size of the range over which the input wraps. By default, it is
          ``2 pi``.
  
@@ -2696,7 +2720,7 @@ def corrcoef(x, y=None, rowvar=True, bias=np._NoValue, ddof=np._NoValue, *,
      relationship between the correlation coefficient matrix, `R`, and the
      covariance matrix, `C`, is
  
-    .. math:: R_{ij} = \\frac{ C_{ij} } { \\sqrt{ C_{ii} * C_{jj} } }
+    .. math:: R_{ij} = \\frac{ C_{ij} } { \\sqrt{ C_{ii} C_{jj} } }
  
      The values of `R` are between -1 and 1, inclusive.
  
@@ -2974,15 +2998,14 @@ def bartlett(M):
                \\frac{M-1}{2} - \\left|n - \\frac{M-1}{2}\\right|
                \\right)
  
-    Most references to the Bartlett window come from the signal
-    processing literature, where it is used as one of many windowing
-    functions for smoothing values.  Note that convolution with this
-    window produces linear interpolation.  It is also known as an
-    apodization (which means"removing the foot", i.e. smoothing
-    discontinuities at the beginning and end of the sampled signal) or
-    tapering function. The fourier transform of the Bartlett is the product
-    of two sinc functions.
-    Note the excellent discussion in Kanasewich.
+    Most references to the Bartlett window come from the signal processing
+    literature, where it is used as one of many windowing functions for
+    smoothing values.  Note that convolution with this window produces linear
+    interpolation.  It is also known as an apodization (which means "removing
+    the foot", i.e. smoothing discontinuities at the beginning and end of the
+    sampled signal) or tapering function. The Fourier transform of the
+    Bartlett window is the product of two sinc functions. Note the excellent
+    discussion in Kanasewich [2]_.
  
      References
      ----------
@@ -3075,7 +3098,7 @@ def hanning(M):
      -----
      The Hanning window is defined as
  
-    .. math::  w(n) = 0.5 - 0.5cos\\left(\\frac{2\\pi{n}}{M-1}\\right)
+    .. math::  w(n) = 0.5 - 0.5\\cos\\left(\\frac{2\\pi{n}}{M-1}\\right)
                 \\qquad 0 \\leq n \\leq M-1
  
      The Hanning was named for Julius von Hann, an Austrian meteorologist.
@@ -3179,7 +3202,7 @@ def hamming(M):
      -----
      The Hamming window is defined as
  
-    .. math::  w(n) = 0.54 - 0.46cos\\left(\\frac{2\\pi{n}}{M-1}\\right)
+    .. math::  w(n) = 0.54 - 0.46\\cos\\left(\\frac{2\\pi{n}}{M-1}\\right)
                 \\qquad 0 \\leq n \\leq M-1
  
      The Hamming was named for R. W. Hamming, an associate of J. W. Tukey
@@ -3539,20 +3562,22 @@ def sinc(x):
      r"""
      Return the normalized sinc function.
  
-    The sinc function is :math:`\sin(\pi x)/(\pi x)`.
+    The sinc function is equal to :math:`\sin(\pi x)/(\pi x)` for any argument
+    :math:`x\ne 0`. ``sinc(0)`` takes the limit value 1, making ``sinc`` not
+    only everywhere continuous but also infinitely differentiable.
  
      .. note::
  
          Note the normalization factor of ``pi`` used in the definition.
          This is the most commonly used definition in signal processing.
          Use ``sinc(x / np.pi)`` to obtain the unnormalized sinc function
-        :math:`\sin(x)/(x)` that is more common in mathematics.
+        :math:`\sin(x)/x` that is more common in mathematics.
  
      Parameters
      ----------
      x : ndarray
-        Array (possibly multi-dimensional) of values for which to to
-        calculate ``sinc(x)``.
+        Array (possibly multi-dimensional) of values for which to calculate
+        ``sinc(x)``.
  
      Returns
      -------
@@ -3561,8 +3586,6 @@ def sinc(x):
  
      Notes
      -----
-    ``sinc(0)`` is the limit value 1.
-
      The name sinc is short for "sine cardinal" or "sinus cardinalis".
  
      The sinc function is used in various signal processing applications,
@@ -3984,18 +4007,21 @@ def percentile(a,
      inverted_cdf:
          method 1 of H&F [1]_.
          This method gives discontinuous results:
+
          * if g > 0 ; then take j
          * if g = 0 ; then take i
  
      averaged_inverted_cdf:
          method 2 of H&F [1]_.
          This method give discontinuous results:
+
          * if g > 0 ; then take j
          * if g = 0 ; then average between bounds
  
      closest_observation:
          method 3 of H&F [1]_.
          This method give discontinuous results:
+
          * if g > 0 ; then take j
          * if g = 0 and index is odd ; then take j
          * if g = 0 and index is even ; then take i
@@ -4003,24 +4029,28 @@ def percentile(a,
      interpolated_inverted_cdf:
          method 4 of H&F [1]_.
          This method give continuous results using:
+
          * alpha = 0
          * beta = 1
  
      hazen:
          method 5 of H&F [1]_.
          This method give continuous results using:
+
          * alpha = 1/2
          * beta = 1/2
  
      weibull:
          method 6 of H&F [1]_.
          This method give continuous results using:
+
          * alpha = 0
          * beta = 0
  
      linear:
          method 7 of H&F [1]_.
          This method give continuous results using:
+
          * alpha = 1
          * beta = 1
  
@@ -4029,6 +4059,7 @@ def percentile(a,
          This method is probably the best method if the sample
          distribution function is unknown (see reference).
          This method give continuous results using:
+
          * alpha = 1/3
          * beta = 1/3
  
@@ -4037,6 +4068,7 @@ def percentile(a,
          This method is probably the best method if the sample
          distribution function is known to be normal.
          This method give continuous results using:
+
          * alpha = 3/8
          * beta = 3/8
  
@@ -4190,7 +4222,7 @@ def quantile(a,
          8. 'median_unbiased'
          9. 'normal_unbiased'
  
-        The first three methods are discontiuous.  NumPy further defines the
+        The first three methods are discontinuous.  NumPy further defines the
          following discontinuous variations of the default 'linear' (7.) option:
  
          * 'lower'
@@ -4241,10 +4273,10 @@ def quantile(a,
      same as the median if ``q=0.5``, the same as the minimum if ``q=0.0`` and
      the same as the maximum if ``q=1.0``.
  
-    This optional `method` parameter specifies the method to use when the
+    The optional `method` parameter specifies the method to use when the
      desired quantile lies between two data points ``i < j``.
-    If ``g`` is the fractional part of the index surrounded by ``i`` and
-    alpha and beta are correction constants modifying i and j.
+    If ``g`` is the fractional part of the index surrounded by ``i`` and ``j``,
+    and alpha and beta are correction constants modifying i and j:
  
      .. math::
          i + g = (q - alpha) / ( n - alpha - beta + 1 )
@@ -4254,43 +4286,50 @@ def quantile(a,
      inverted_cdf:
          method 1 of H&F [1]_.
          This method gives discontinuous results:
+
          * if g > 0 ; then take j
          * if g = 0 ; then take i
  
      averaged_inverted_cdf:
          method 2 of H&F [1]_.
-        This method give discontinuous results:
+        This method gives discontinuous results:
+
          * if g > 0 ; then take j
          * if g = 0 ; then average between bounds
  
      closest_observation:
          method 3 of H&F [1]_.
-        This method give discontinuous results:
+        This method gives discontinuous results:
+
          * if g > 0 ; then take j
          * if g = 0 and index is odd ; then take j
          * if g = 0 and index is even ; then take i
  
      interpolated_inverted_cdf:
          method 4 of H&F [1]_.
-        This method give continuous results using:
+        This method gives continuous results using:
+
          * alpha = 0
          * beta = 1
  
      hazen:
          method 5 of H&F [1]_.
-        This method give continuous results using:
+        This method gives continuous results using:
+
          * alpha = 1/2
          * beta = 1/2
  
      weibull:
          method 6 of H&F [1]_.
-        This method give continuous results using:
+        This method gives continuous results using:
+
          * alpha = 0
          * beta = 0
  
      linear:
          method 7 of H&F [1]_.
-        This method give continuous results using:
+        This method gives continuous results using:
+
          * alpha = 1
          * beta = 1
  
@@ -4298,7 +4337,8 @@ def quantile(a,
          method 8 of H&F [1]_.
          This method is probably the best method if the sample
          distribution function is unknown (see reference).
-        This method give continuous results using:
+        This method gives continuous results using:
+
          * alpha = 1/3
          * beta = 1/3
  
@@ -4306,7 +4346,8 @@ def quantile(a,
          method 9 of H&F [1]_.
          This method is probably the best method if the sample
          distribution function is known to be normal.
-        This method give continuous results using:
+        This method gives continuous results using:
+
          * alpha = 3/8
          * beta = 3/8
  
@@ -4411,7 +4452,7 @@ def _check_interpolation_as_method(method, interpolation, fname):
          f"the `interpolation=` argument to {fname} was renamed to "
          "`method=`, which has additional options.\n"
          "Users of the modes 'nearest', 'lower', 'higher', or "
-        "'midpoint' are encouraged to review the method they. "
+        "'midpoint' are encouraged to review the method they used. "
          "(Deprecated NumPy 1.22)",
          DeprecationWarning, stacklevel=4)
      if method != "linear":
@@ -4713,10 +4754,10 @@ def trapz(y, x=None, dx=1.0, axis=-1):
      Returns
      -------
      trapz : float or ndarray
-        Definite integral of 'y' = n-dimensional array as approximated along
-        a single axis by the trapezoidal rule. If 'y' is a 1-dimensional array,
-        then the result is a float. If 'n' is greater than 1, then the result
-        is an 'n-1' dimensional array.
+        Definite integral of `y` = n-dimensional array as approximated along
+        a single axis by the trapezoidal rule. If `y` is a 1-dimensional array,
+        then the result is a float. If `n` is greater than 1, then the result
+        is an `n`-1 dimensional array.
  
      See Also
      --------
@@ -4847,9 +4888,9 @@ def meshgrid(*xi, copy=True, sparse=False, indexing='xy'):
      Returns
      -------
      X1, X2,..., XN : ndarray
-        For vectors `x1`, `x2`,..., 'xn' with lengths ``Ni=len(xi)`` ,
-        return ``(N1, N2, N3,...Nn)`` shaped arrays if indexing='ij'
-        or ``(N2, N1, N3,...Nn)`` shaped arrays if indexing='xy'
+        For vectors `x1`, `x2`,..., `xn` with lengths ``Ni=len(xi)``,
+        returns ``(N1, N2, N3,..., Nn)`` shaped arrays if indexing='ij'
+        or ``(N2, N1, N3,..., Nn)`` shaped arrays if indexing='xy'
          with the elements of `xi` repeated to fill the matrix along
          the first dimension for `x1`, the second for `x2` and so on.
  
@@ -4907,7 +4948,7 @@ def meshgrid(*xi, copy=True, sparse=False, indexing='xy'):
  
      >>> x = np.linspace(-5, 5, 101)
      >>> y = np.linspace(-5, 5, 101)
-    >>> # full coorindate arrays
+    >>> # full coordinate arrays
      >>> xx, yy = np.meshgrid(x, y)
      >>> zz = np.sqrt(xx**2 + yy**2)
      >>> xx.shape, yy.shape, zz.shape
@@ -4998,7 +5039,7 @@ def delete(arr, obj, axis=None):
      >>> mask[[0,2,4]] = False
      >>> result = arr[mask,...]
  
-    Is equivalent to `np.delete(arr, [0,2,4], axis=0)`, but allows further
+    Is equivalent to ``np.delete(arr, [0,2,4], axis=0)``, but allows further
      use of `mask`.
  
      Examples
@@ -5094,6 +5135,18 @@ def delete(arr, obj, axis=None):
              return new
  
      if isinstance(obj, (int, integer)) and not isinstance(obj, bool):
+        single_value = True
+    else:
+        single_value = False
+        _obj = obj
+        obj = np.asarray(obj)
+        if obj.size == 0 and not isinstance(_obj, np.ndarray):
+            obj = obj.astype(intp)
+        elif obj.size == 1 and not isinstance(_obj, bool):
+            obj = obj.astype(intp).reshape(())
+            single_value = True
+
+    if single_value:
          # optimization for a single value
          if (obj < -N or obj >= N):
              raise IndexError(
@@ -5110,11 +5163,6 @@ def delete(arr, obj, axis=None):
          slobj2[axis] = slice(obj+1, None)
          new[tuple(slobj)] = arr[tuple(slobj2)]
      else:
-        _obj = obj
-        obj = np.asarray(obj)
-        if obj.size == 0 and not isinstance(_obj, np.ndarray):
-            obj = obj.astype(intp)
-
          if obj.dtype == bool:
              if obj.shape != (N,):
                  raise ValueError('boolean array argument obj to delete '
@@ -5182,9 +5230,9 @@ def insert(arr, obj, values, axis=None):
  
      Notes
      -----
-    Note that for higher dimensional inserts `obj=0` behaves very different
-    from `obj=[0]` just like `arr[:,0,:] = values` is different from
-    `arr[:,[0],:] = values`.
+    Note that for higher dimensional inserts ``obj=0`` behaves very different
+    from ``obj=[0]`` just like ``arr[:,0,:] = values`` is different from
+    ``arr[:,[0],:] = values``.
  
      Examples
      --------
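Illustration only, not part of the upstream change: the `keepdims` keyword that this file's diff adds to `np.average` can be exercised as in the sketch below, which reuses the `data` array from the docstring example. It assumes NumPy 1.23.0 or newer; `row_avg_flat`, `row_avg` and `centered` are names invented for the example.

```python
import numpy as np

data = np.arange(6).reshape((3, 2))

# Without keepdims the reduced axis disappears from the result.
row_avg_flat = np.average(data, axis=1)

# With keepdims=True the reduced axis is kept as a dimension of size one ...
row_avg = np.average(data, axis=1, keepdims=True)

# ... so the result broadcasts against the original (3, 2) array.
centered = data - row_avg
```

Here `row_avg_flat` has shape `(3,)` while `row_avg` has shape `(3, 1)`, which is what allows the row-wise subtraction in the last line without an explicit reshape.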
diff --git a/numpy/lib/function_base.pyi b/numpy/lib/function_base.pyi
index 7e227f9da52d44d1b78d22ccf7ca8274831e2e64..6c00d26b46b82296c05fd5288fe7f51c0b4b6691 100644
--- a/numpy/lib/function_base.pyi
+++ b/numpy/lib/function_base.pyi
@@ -1,19 +1,12 @@
  import sys
+from collections.abc import Sequence, Iterator, Callable, Iterable
  from typing import (
      Literal as L,
-    List,
-    Type,
-    Sequence,
-    Tuple,
-    Union,
      Any,
      TypeVar,
-    Iterator,
      overload,
-    Callable,
      Protocol,
      SupportsIndex,
-    Iterable,
      SupportsInt,
  )
  
@@ -25,7 +18,6 @@ else:
  from numpy import (
      vectorize as vectorize,
      ufunc,
-    dtype,
      generic,
      floating,
      complexfloating,
@@ -38,15 +30,14 @@ from numpy import (
      _OrderKACF,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      ArrayLike,
      DTypeLike,
      _ShapeLike,
      _ScalarLike_co,
-    _SupportsDType,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _DTypeLike,
+    _ArrayLike,
      _ArrayLikeInt_co,
      _ArrayLikeFloat_co,
      _ArrayLikeComplex_co,
@@ -73,13 +64,7 @@ _T_co = TypeVar("_T_co", covariant=True)
  _SCT = TypeVar("_SCT", bound=generic)
  _ArrayType = TypeVar("_ArrayType", bound=NDArray[Any])
  
-_2Tuple = Tuple[_T, _T]
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-_DTypeLike = Union[
-    dtype[_SCT],
-    Type[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
+_2Tuple = tuple[_T, _T]
  
  class _TrimZerosSequence(Protocol[_T_co]):
      def __len__(self) -> int: ...
@@ -90,7 +75,7 @@ class _SupportsWriteFlush(Protocol):
      def write(self, s: str, /) -> object: ...
      def flush(self) -> object: ...
  
-__all__: List[str]
+__all__: list[str]
  
  # NOTE: This is in reality a re-export of `np.core.umath._add_newdoc_ufunc`
  def add_newdoc_ufunc(ufunc: ufunc, new_docstring: str, /) -> None: ...
@@ -99,13 +84,13 @@ def add_newdoc_ufunc(ufunc: ufunc, new_docstring: str, /) -> None: ...
  def rot90(
      m: _ArrayLike[_SCT],
      k: int = ...,
-    axes: Tuple[int, int] = ...,
+    axes: tuple[int, int] = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def rot90(
      m: ArrayLike,
      k: int = ...,
-    axes: Tuple[int, int] = ...,
+    axes: tuple[int, int] = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -125,6 +110,7 @@ def average(
      axis: None = ...,
      weights: None | _ArrayLikeFloat_co= ...,
      returned: L[False] = ...,
+    keepdims: L[False] = ...,
  ) -> floating[Any]: ...
  @overload
  def average(
@@ -132,6 +118,7 @@ def average(
      axis: None = ...,
      weights: None | _ArrayLikeComplex_co = ...,
      returned: L[False] = ...,
+    keepdims: L[False] = ...,
  ) -> complexfloating[Any, Any]: ...
  @overload
  def average(
@@ -139,6 +126,7 @@ def average(
      axis: None = ...,
      weights: None | Any = ...,
      returned: L[False] = ...,
+    keepdims: L[False] = ...,
  ) -> Any: ...
  @overload
  def average(
@@ -146,6 +134,7 @@ def average(
      axis: None = ...,
      weights: None | _ArrayLikeFloat_co= ...,
      returned: L[True] = ...,
+    keepdims: L[False] = ...,
  ) -> _2Tuple[floating[Any]]: ...
  @overload
  def average(
@@ -153,6 +142,7 @@ def average(
      axis: None = ...,
      weights: None | _ArrayLikeComplex_co = ...,
      returned: L[True] = ...,
+    keepdims: L[False] = ...,
  ) -> _2Tuple[complexfloating[Any, Any]]: ...
  @overload
  def average(
@@ -160,6 +150,7 @@ def average(
      axis: None = ...,
      weights: None | Any = ...,
      returned: L[True] = ...,
+    keepdims: L[False] = ...,
  ) -> _2Tuple[Any]: ...
  @overload
  def average(
@@ -167,6 +158,7 @@ def average(
      axis: None | _ShapeLike = ...,
      weights: None | Any = ...,
      returned: L[False] = ...,
+    keepdims: bool = ...,
  ) -> Any: ...
  @overload
  def average(
@@ -174,6 +166,7 @@ def average(
      axis: None | _ShapeLike = ...,
      weights: None | Any = ...,
      returned: L[True] = ...,
+    keepdims: bool = ...,
  ) -> _2Tuple[Any]: ...
  
  @overload
@@ -201,6 +194,8 @@ def asarray_chkfinite(
      order: _OrderKACF = ...,
  ) -> NDArray[Any]: ...
  
+# TODO: Use PEP 612 `ParamSpec` once mypy supports `Concatenate`
+# xref python/mypy#8645
  @overload
  def piecewise(
      x: _ArrayLike[_SCT],
@@ -654,7 +649,7 @@ def meshgrid(
      copy: bool = ...,
      sparse: bool = ...,
      indexing: L["xy", "ij"] = ...,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
  
  @overload
  def delete(
diff --git a/numpy/lib/histograms.py b/numpy/lib/histograms.py
index b6909bc1d9e029d5356531c50d392ae0c3f0e662..98182f1c4b039f383fd8d0a31dc4780de5e64922 100644
--- a/numpy/lib/histograms.py
+++ b/numpy/lib/histograms.py
@@ -506,8 +506,8 @@ def histogram_bin_edges(a, bins=10, range=None, weights=None):
              with non-normal datasets.
  
          'scott'
-            Less robust estimator that that takes into account data
-            variability and data size.
+            Less robust estimator that takes into account data variability
+            and data size.
  
          'stone'
              Estimator based on leave-one-out cross-validation estimate of
@@ -562,7 +562,7 @@ def histogram_bin_edges(a, bins=10, range=None, weights=None):
      below, :math:`h` is the binwidth and :math:`n_h` is the number of
      bins. All estimators that compute bin counts are recast to bin width
      using the `ptp` of the data. The final bin count is obtained from
-    ``np.round(np.ceil(range / h))``. The final bin width is often less 
+    ``np.round(np.ceil(range / h))``. The final bin width is often less
      than what is returned by the estimators below.
  
      'auto' (maximum of the 'sturges' and 'fd' estimators)
@@ -581,7 +581,7 @@ def histogram_bin_edges(a, bins=10, range=None, weights=None):
          datasets. The IQR is very robust to outliers.
  
      'scott'
-        .. math:: h = \sigma \sqrt[3]{\frac{24 * \sqrt{\pi}}{n}}
+        .. math:: h = \sigma \sqrt[3]{\frac{24 \sqrt{\pi}}{n}}
  
          The binwidth is proportional to the standard deviation of the
          data and inversely proportional to cube root of ``x.size``. Can
@@ -598,7 +598,7 @@ def histogram_bin_edges(a, bins=10, range=None, weights=None):
          does not take into account data variability.
  
      'sturges'
-        .. math:: n_h = \log _{2}n+1
+        .. math:: n_h = \log _{2}(n) + 1
  
          The number of bins is the base 2 log of ``a.size``.  This
          estimator assumes normality of data and is too conservative for
@@ -607,9 +607,9 @@ def histogram_bin_edges(a, bins=10, range=None, weights=None):
  
      'doane'
          .. math:: n_h = 1 + \log_{2}(n) +
-                        \log_{2}(1 + \frac{|g_1|}{\sigma_{g_1}})
+                        \log_{2}\left(1 + \frac{|g_1|}{\sigma_{g_1}}\right)
  
-            g_1 = mean[(\frac{x - \mu}{\sigma})^3]
+            g_1 = mean\left[\left(\frac{x - \mu}{\sigma}\right)^3\right]
  
              \sigma_{g_1} = \sqrt{\frac{6(n - 2)}{(n + 1)(n + 3)}}
  
@@ -1050,13 +1050,13 @@ def histogramdd(sample, bins=10, range=None, normed=None, weights=None,
              smin, smax = _get_outer_edges(sample[:,i], range[i])
              try:
                  n = operator.index(bins[i])
-            
+
              except TypeError as e:
                  raise TypeError(
                         "`bins[{}]` must be an integer, when a scalar".format(i)
                  ) from e
-                
-            edges[i] = np.linspace(smin, smax, n + 1)    
+
+            edges[i] = np.linspace(smin, smax, n + 1)
          elif np.ndim(bins[i]) == 1:
              edges[i] = np.asarray(bins[i])
              if np.any(edges[i][:-1] > edges[i][1:]):
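
Note on the estimator formulas touched above: the docstring math for 'sturges', 'scott' and 'doane' can be checked by hand. The following is a minimal sketch using only the standard library; the helper names (`sturges_bins`, `scott_width`, `doane_bins`, `bin_count`) are hypothetical and are not part of NumPy. It assumes the population standard deviation for sigma and skips the final rounding and the handling of empty or constant data.

```python
import math


def _mean(values):
    # Plain arithmetic mean of a non-empty sequence.
    return sum(values) / len(values)


def _pstdev(values):
    # Population standard deviation (divides by n, not n - 1).
    mu = _mean(values)
    return math.sqrt(_mean([(x - mu) ** 2 for x in values]))


def sturges_bins(n):
    # 'sturges': n_h = log2(n) + 1
    return math.log2(n) + 1


def scott_width(data):
    # 'scott': h = sigma * cbrt(24 * sqrt(pi) / n)
    n = len(data)
    return _pstdev(data) * (24.0 * math.sqrt(math.pi) / n) ** (1.0 / 3.0)


def doane_bins(data):
    # 'doane': n_h = 1 + log2(n) + log2(1 + |g1| / sigma_g1)
    n = len(data)
    mu = _mean(data)
    sigma = _pstdev(data)
    g1 = _mean([((x - mu) / sigma) ** 3 for x in data])  # sample skewness
    sigma_g1 = math.sqrt(6.0 * (n - 2) / ((n + 1) * (n + 3)))
    return 1 + math.log2(n) + math.log2(1 + abs(g1) / sigma_g1)


def bin_count(data, width):
    # Recast a bin width into a bin count: ceil(range / h).
    return math.ceil((max(data) - min(data)) / width)
```

For symmetric data the skewness term vanishes, so `doane_bins` reduces to `1 + log2(n)`, the same value as `sturges_bins(n)`.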
diff --git a/numpy/lib/histograms.pyi b/numpy/lib/histograms.pyi

index 2ceb60793c7ecced4f54473f0b5e2037fce4181e..27b9dbcfb36157159e7816143f3faf629f950a55 100644 (file)
--- a/numpy/lib/histograms.pyi
+++ b/numpy/lib/histograms.pyi
@@ -1,13 +1,11 @@
+from collections.abc import Sequence
  from typing import (
      Literal as L,
-    List,
-    Tuple,
      Any,
      SupportsIndex,
-    Sequence,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      ArrayLike,
  )
@@ -23,29 +21,29 @@ _BinKind = L[
      "sturges",
  ]
  
-__all__: List[str]
+__all__: list[str]
  
  def histogram_bin_edges(
      a: ArrayLike,
      bins: _BinKind | SupportsIndex | ArrayLike = ...,
-    range: None | Tuple[float, float] = ...,
+    range: None | tuple[float, float] = ...,
      weights: None | ArrayLike = ...,
  ) -> NDArray[Any]: ...
  
  def histogram(
      a: ArrayLike,
      bins: _BinKind | SupportsIndex | ArrayLike = ...,
-    range: None | Tuple[float, float] = ...,
+    range: None | tuple[float, float] = ...,
      normed: None = ...,
      weights: None | ArrayLike = ...,
      density: bool = ...,
-) -> Tuple[NDArray[Any], NDArray[Any]]: ...
+) -> tuple[NDArray[Any], NDArray[Any]]: ...
  
  def histogramdd(
      sample: ArrayLike,
      bins: SupportsIndex | ArrayLike = ...,
-    range: Sequence[Tuple[float, float]] = ...,
+    range: Sequence[tuple[float, float]] = ...,
      normed: None | bool = ...,
      weights: None | ArrayLike = ...,
      density: None | bool = ...,
-) -> Tuple[NDArray[Any], List[NDArray[Any]]]: ...
+) -> tuple[NDArray[Any], list[NDArray[Any]]]: ...
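
Note on the annotation style adopted in this stub: `Tuple[...]` and `List[...]` from `typing` are replaced by the builtin generics `tuple[...]` and `list[...]` (PEP 585), and `Sequence` now comes from `collections.abc`. The same style can be used in ordinary modules, as in the sketch below. The functions `outer_edges` and `equal_width_edges` are hypothetical illustrations and not NumPy APIs, and the parameter `limits` stands in for `range` so the builtin is not shadowed. The `__future__` import is an assumption made so the annotations are not evaluated on interpreters that predate the `X | Y` syntax.

```python
from __future__ import annotations  # annotations stay unevaluated strings

from collections.abc import Sequence


def outer_edges(
    values: Sequence[float],
    limits: None | tuple[float, float] = None,
) -> tuple[float, float]:
    """Return (lo, hi) from `limits` if given, otherwise from the data."""
    if limits is not None:
        lo, hi = limits
        if lo > hi:
            raise ValueError("upper limit must not be smaller than lower limit")
        return float(lo), float(hi)
    return float(min(values)), float(max(values))


def equal_width_edges(
    values: Sequence[float],
    bins: int = 10,
    limits: None | tuple[float, float] = None,
) -> list[float]:
    """Return bins + 1 equally spaced edges covering the data or `limits`."""
    lo, hi = outer_edges(values, limits)
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]
```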
diff --git a/numpy/lib/index_tricks.py b/numpy/lib/index_tricks.py

index 2a4402c89e48df7b71030c7a93d5cbef2c36611c..b69226d4842d201d2ba6a65a3172fed75b367cf1 100644 (file)
--- a/numpy/lib/index_tricks.py
+++ b/numpy/lib/index_tricks.py
@@ -227,13 +227,13 @@ class MGridClass(nd_grid):
  
      See Also
      --------
-    numpy.lib.index_tricks.nd_grid : class of `ogrid` and `mgrid` objects
+    lib.index_tricks.nd_grid : class of `ogrid` and `mgrid` objects
      ogrid : like mgrid but returns open (not fleshed out) mesh grids
      r_ : array concatenator
  
      Examples
      --------
-    >>> np.mgrid[0:5,0:5]
+    >>> np.mgrid[0:5, 0:5]
      array([[[0, 0, 0, 0, 0],
              [1, 1, 1, 1, 1],
              [2, 2, 2, 2, 2],
diff --git a/numpy/lib/index_tricks.pyi b/numpy/lib/index_tricks.pyi

index d16faf81a0bbb426e3a43d0b9d0f61039227d2ea..c9251abd15b385d93e5d314f0160fa94eca5552e 100644 (file)
--- a/numpy/lib/index_tricks.pyi
+++ b/numpy/lib/index_tricks.pyi
@@ -1,12 +1,9 @@
+from collections.abc import Sequence
  from typing import (
      Any,
-    Tuple,
      TypeVar,
      Generic,
      overload,
-    List,
-    Union,
-    Sequence,
      Literal,
      SupportsIndex,
  )
@@ -29,7 +26,7 @@ from numpy import (
      _OrderCF,
      _ModeKind,
  )
-from numpy.typing import (
+from numpy._typing import (
      # Arrays
      ArrayLike,
      _NestedSequence,
@@ -53,25 +50,25 @@ from numpy.core.multiarray import (
  _T = TypeVar("_T")
  _DType = TypeVar("_DType", bound=dtype[Any])
  _BoolType = TypeVar("_BoolType", Literal[True], Literal[False])
-_TupType = TypeVar("_TupType", bound=Tuple[Any, ...])
+_TupType = TypeVar("_TupType", bound=tuple[Any, ...])
  _ArrayType = TypeVar("_ArrayType", bound=ndarray[Any, Any])
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
-def ix_(*args: _FiniteNestedSequence[_SupportsDType[_DType]]) -> Tuple[ndarray[Any, _DType], ...]: ...
+def ix_(*args: _FiniteNestedSequence[_SupportsDType[_DType]]) -> tuple[ndarray[Any, _DType], ...]: ...
  @overload
-def ix_(*args: str | _NestedSequence[str]) -> Tuple[NDArray[str_], ...]: ...
+def ix_(*args: str | _NestedSequence[str]) -> tuple[NDArray[str_], ...]: ...
  @overload
-def ix_(*args: bytes | _NestedSequence[bytes]) -> Tuple[NDArray[bytes_], ...]: ...
+def ix_(*args: bytes | _NestedSequence[bytes]) -> tuple[NDArray[bytes_], ...]: ...
  @overload
-def ix_(*args: bool | _NestedSequence[bool]) -> Tuple[NDArray[bool_], ...]: ...
+def ix_(*args: bool | _NestedSequence[bool]) -> tuple[NDArray[bool_], ...]: ...
  @overload
-def ix_(*args: int | _NestedSequence[int]) -> Tuple[NDArray[int_], ...]: ...
+def ix_(*args: int | _NestedSequence[int]) -> tuple[NDArray[int_], ...]: ...
  @overload
-def ix_(*args: float | _NestedSequence[float]) -> Tuple[NDArray[float_], ...]: ...
+def ix_(*args: float | _NestedSequence[float]) -> tuple[NDArray[float_], ...]: ...
  @overload
-def ix_(*args: complex | _NestedSequence[complex]) -> Tuple[NDArray[complex_], ...]: ...
+def ix_(*args: complex | _NestedSequence[complex]) -> tuple[NDArray[complex_], ...]: ...
  
  class nd_grid(Generic[_BoolType]):
      sparse: _BoolType
@@ -79,13 +76,13 @@ class nd_grid(Generic[_BoolType]):
      @overload
      def __getitem__(
          self: nd_grid[Literal[False]],
-        key: Union[slice, Sequence[slice]],
+        key: slice | Sequence[slice],
      ) -> NDArray[Any]: ...
      @overload
      def __getitem__(
          self: nd_grid[Literal[True]],
-        key: Union[slice, Sequence[slice]],
-    ) -> List[NDArray[Any]]: ...
+        key: slice | Sequence[slice],
+    ) -> list[NDArray[Any]]: ...
  
  class MGridClass(nd_grid[Literal[False]]):
      def __init__(self) -> None: ...
@@ -151,7 +148,7 @@ class IndexExpression(Generic[_BoolType]):
      @overload
      def __getitem__(self, item: _TupType) -> _TupType: ...  # type: ignore[misc]
      @overload
-    def __getitem__(self: IndexExpression[Literal[True]], item: _T) -> Tuple[_T]: ...
+    def __getitem__(self: IndexExpression[Literal[True]], item: _T) -> tuple[_T]: ...
      @overload
      def __getitem__(self: IndexExpression[Literal[False]], item: _T) -> _T: ...
  
@@ -159,7 +156,7 @@ index_exp: IndexExpression[Literal[True]]
  s_: IndexExpression[Literal[False]]
  
  def fill_diagonal(a: ndarray[Any, Any], val: Any, wrap: bool = ...) -> None: ...
-def diag_indices(n: int, ndim: int = ...) -> Tuple[NDArray[int_], ...]: ...
-def diag_indices_from(arr: ArrayLike) -> Tuple[NDArray[int_], ...]: ...
+def diag_indices(n: int, ndim: int = ...) -> tuple[NDArray[int_], ...]: ...
+def diag_indices_from(arr: ArrayLike) -> tuple[NDArray[int_], ...]: ...
  
  # NOTE: see `numpy/__init__.pyi` for `ndenumerate` and `ndindex`
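
Note on the `IndexExpression` overloads above: `index_exp` is typed as `IndexExpression[Literal[True]]` and always returns a tuple, while `s_` is typed as `IndexExpression[Literal[False]]` and returns the key unchanged. A pure-Python stand-in for that behaviour might look as follows. It is a sketch under the assumption that the flag only controls tuple wrapping, and it does not reproduce NumPy's own class.

```python
class IndexExpression:
    """Stand-in that turns subscript syntax into index objects."""

    def __init__(self, maketuple: bool) -> None:
        self.maketuple = maketuple

    def __getitem__(self, item):
        # A key that is already a tuple is returned as-is in both modes;
        # any other key is wrapped only when `maketuple` is set.
        if self.maketuple and not isinstance(item, tuple):
            return (item,)
        return item


index_exp = IndexExpression(maketuple=True)
s_ = IndexExpression(maketuple=False)
```

With these definitions, `s_[1:3]` yields a bare `slice` object, whereas `index_exp[1:3]` yields a one-element tuple containing that slice.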
diff --git a/numpy/lib/mixins.pyi b/numpy/lib/mixins.pyi

index f137bb5bcf4b139123cb9d00c5dffe46e53832fb..c5744213372cf746fcba3a3b711b49730629e28c 100644 (file)
--- a/numpy/lib/mixins.pyi
+++ b/numpy/lib/mixins.pyi
@@ -1,62 +1,74 @@
-from typing import List
  from abc import ABCMeta, abstractmethod
+from typing import Literal as L, Any
  
-__all__: List[str]
+from numpy import ufunc
+
+__all__: list[str]
  
  # NOTE: `NDArrayOperatorsMixin` is not formally an abstract baseclass,
  # even though it's reliant on subclasses implementing `__array_ufunc__`
  
+# NOTE: The accepted input- and output-types of the various dunders are
+# completely dependent on how `__array_ufunc__` is implemented.
+# As such, only little type safety can be provided here.
+
  class NDArrayOperatorsMixin(metaclass=ABCMeta):
      @abstractmethod
-    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs): ...
-    def __lt__(self, other): ...
-    def __le__(self, other): ...
-    def __eq__(self, other): ...
-    def __ne__(self, other): ...
-    def __gt__(self, other): ...
-    def __ge__(self, other): ...
-    def __add__(self, other): ...
-    def __radd__(self, other): ...
-    def __iadd__(self, other): ...
-    def __sub__(self, other): ...
-    def __rsub__(self, other): ...
-    def __isub__(self, other): ...
-    def __mul__(self, other): ...
-    def __rmul__(self, other): ...
-    def __imul__(self, other): ...
-    def __matmul__(self, other): ...
-    def __rmatmul__(self, other): ...
-    def __imatmul__(self, other): ...
-    def __truediv__(self, other): ...
-    def __rtruediv__(self, other): ...
-    def __itruediv__(self, other): ...
-    def __floordiv__(self, other): ...
-    def __rfloordiv__(self, other): ...
-    def __ifloordiv__(self, other): ...
-    def __mod__(self, other): ...
-    def __rmod__(self, other): ...
-    def __imod__(self, other): ...
-    def __divmod__(self, other): ...
-    def __rdivmod__(self, other): ...
-    def __pow__(self, other): ...
-    def __rpow__(self, other): ...
-    def __ipow__(self, other): ...
-    def __lshift__(self, other): ...
-    def __rlshift__(self, other): ...
-    def __ilshift__(self, other): ...
-    def __rshift__(self, other): ...
-    def __rrshift__(self, other): ...
-    def __irshift__(self, other): ...
-    def __and__(self, other): ...
-    def __rand__(self, other): ...
-    def __iand__(self, other): ...
-    def __xor__(self, other): ...
-    def __rxor__(self, other): ...
-    def __ixor__(self, other): ...
-    def __or__(self, other): ...
-    def __ror__(self, other): ...
-    def __ior__(self, other): ...
-    def __neg__(self): ...
-    def __pos__(self): ...
-    def __abs__(self): ...
-    def __invert__(self): ...
+    def __array_ufunc__(
+        self,
+        ufunc: ufunc,
+        method: L["__call__", "reduce", "reduceat", "accumulate", "outer", "inner"],
+        *inputs: Any,
+        **kwargs: Any,
+    ) -> Any: ...
+    def __lt__(self, other: Any) -> Any: ...
+    def __le__(self, other: Any) -> Any: ...
+    def __eq__(self, other: Any) -> Any: ...
+    def __ne__(self, other: Any) -> Any: ...
+    def __gt__(self, other: Any) -> Any: ...
+    def __ge__(self, other: Any) -> Any: ...
+    def __add__(self, other: Any) -> Any: ...
+    def __radd__(self, other: Any) -> Any: ...
+    def __iadd__(self, other: Any) -> Any: ...
+    def __sub__(self, other: Any) -> Any: ...
+    def __rsub__(self, other: Any) -> Any: ...
+    def __isub__(self, other: Any) -> Any: ...
+    def __mul__(self, other: Any) -> Any: ...
+    def __rmul__(self, other: Any) -> Any: ...
+    def __imul__(self, other: Any) -> Any: ...
+    def __matmul__(self, other: Any) -> Any: ...
+    def __rmatmul__(self, other: Any) -> Any: ...
+    def __imatmul__(self, other: Any) -> Any: ...
+    def __truediv__(self, other: Any) -> Any: ...
+    def __rtruediv__(self, other: Any) -> Any: ...
+    def __itruediv__(self, other: Any) -> Any: ...
+    def __floordiv__(self, other: Any) -> Any: ...
+    def __rfloordiv__(self, other: Any) -> Any: ...
+    def __ifloordiv__(self, other: Any) -> Any: ...
+    def __mod__(self, other: Any) -> Any: ...
+    def __rmod__(self, other: Any) -> Any: ...
+    def __imod__(self, other: Any) -> Any: ...
+    def __divmod__(self, other: Any) -> Any: ...
+    def __rdivmod__(self, other: Any) -> Any: ...
+    def __pow__(self, other: Any) -> Any: ...
+    def __rpow__(self, other: Any) -> Any: ...
+    def __ipow__(self, other: Any) -> Any: ...
+    def __lshift__(self, other: Any) -> Any: ...
+    def __rlshift__(self, other: Any) -> Any: ...
+    def __ilshift__(self, other: Any) -> Any: ...
+    def __rshift__(self, other: Any) -> Any: ...
+    def __rrshift__(self, other: Any) -> Any: ...
+    def __irshift__(self, other: Any) -> Any: ...
+    def __and__(self, other: Any) -> Any: ...
+    def __rand__(self, other: Any) -> Any: ...
+    def __iand__(self, other: Any) -> Any: ...
+    def __xor__(self, other: Any) -> Any: ...
+    def __rxor__(self, other: Any) -> Any: ...
+    def __ixor__(self, other: Any) -> Any: ...
+    def __or__(self, other: Any) -> Any: ...
+    def __ror__(self, other: Any) -> Any: ...
+    def __ior__(self, other: Any) -> Any: ...
+    def __neg__(self) -> Any: ...
+    def __pos__(self) -> Any: ...
+    def __abs__(self) -> Any: ...
+    def __invert__(self) -> Any: ...
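
Note on the `Any` annotations above: every operator in the mixin is routed through a single hook, so its input and output types depend entirely on how that hook is implemented. The pattern can be sketched without NumPy as shown below. `OperatorsMixin`, `_apply` and `Boxed` are hypothetical names, with `_apply` playing the role that `__array_ufunc__` plays for `NDArrayOperatorsMixin`. Only a few operators are shown.

```python
import operator
from abc import ABC, abstractmethod
from typing import Any


class OperatorsMixin(ABC):
    """Defines operators in terms of one abstract dispatch hook."""

    @abstractmethod
    def _apply(self, func, *inputs: Any) -> Any:
        """Single hook that every operator below is routed through."""

    def __add__(self, other: Any) -> Any:
        return self._apply(operator.add, self, other)

    def __radd__(self, other: Any) -> Any:
        # Reflected form: the operands keep their written order.
        return self._apply(operator.add, other, self)

    def __mul__(self, other: Any) -> Any:
        return self._apply(operator.mul, self, other)

    def __neg__(self) -> Any:
        return self._apply(operator.neg, self)


class Boxed(OperatorsMixin):
    """Wraps a plain value and re-wraps every result."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def _apply(self, func, *inputs: Any) -> Any:
        # Unwrap Boxed operands, call the operation, wrap the result.
        raw = [x.value if isinstance(x, Boxed) else x for x in inputs]
        return Boxed(func(*raw))
```

Because `Boxed._apply` decides what is returned, a type checker cannot say more about `Boxed(2) + 3` than the mixin's `Any`, which is the limitation the stub comment describes.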
diff --git a/numpy/lib/nanfunctions.py b/numpy/lib/nanfunctions.py

index d7ea1ca65b7d8ffdd17f37bd4b0d52ef521b03a2..cf76e790957913334a1b4e78b6532a8bbf3e1e79 100644 (file)
--- a/numpy/lib/nanfunctions.py
+++ b/numpy/lib/nanfunctions.py
@@ -188,9 +188,8 @@ def _divide_by_count(a, b, out=None):
      """
      Compute a/b ignoring invalid results. If `a` is an array the division
      is done in place. If `a` is a scalar, then its type is preserved in the
-    output. If out is None, then then a is used instead so that the
-    division is in place. Note that this is only called with `a` an inexact
-    type.
+    output. If out is None, then a is used instead so that the division
+    is in place. Note that this is only called with `a` an inexact type.
  
      Parameters
      ----------
diff --git a/numpy/lib/nanfunctions.pyi b/numpy/lib/nanfunctions.pyi

index 54b4a7e260a73116fa74a1f65ea4f53e07338aad..8642055fedd2e5b851c656efd563453e8bd94bd6 100644 (file)
--- a/numpy/lib/nanfunctions.pyi
+++ b/numpy/lib/nanfunctions.pyi
@@ -1,5 +1,3 @@
-from typing import List
-
  from numpy.core.fromnumeric import (
      amin,
      amax,
@@ -20,7 +18,7 @@ from numpy.lib.function_base import (
      quantile,
  )
  
-__all__: List[str]
+__all__: list[str]
  
  # NOTE: In reaility these functions are not aliases but distinct functions
  # with identical signatures.
diff --git a/numpy/lib/npyio.py b/numpy/lib/npyio.py

index 6c34e95fef9aee719b419bb5a918b7230743dc8a..210c0ea94a7e1d8675ca9f9a9c4d9d828f4ed146 100644 (file)
--- a/numpy/lib/npyio.py
+++ b/numpy/lib/npyio.py
@@ -5,6 +5,7 @@ import itertools
  import warnings
  import weakref
  import contextlib
+import operator
  from operator import itemgetter, index as opindex, methodcaller
  from collections.abc import Mapping
  
@@ -13,6 +14,7 @@ from . import format
  from ._datasource import DataSource
  from numpy.core import overrides
  from numpy.core.multiarray import packbits, unpackbits
+from numpy.core._multiarray_umath import _load_from_filelike
  from numpy.core.overrides import set_array_function_like_doc, set_module
  from ._iotools import (
      LineSplitter, NameValidator, StringConverter, ConverterError,
@@ -157,7 +159,7 @@ class NpzFile(Mapping):
      >>> _ = outfile.seek(0)
  
      >>> npz = np.load(outfile)
-    >>> isinstance(npz, np.lib.io.NpzFile)
+    >>> isinstance(npz, np.lib.npyio.NpzFile)
      True
      >>> sorted(npz.files)
      ['x', 'y']
@@ -248,26 +250,6 @@ class NpzFile(Mapping):
          else:
              raise KeyError("%s is not a file in the archive" % key)
  
-    # deprecate the python 2 dict apis that we supported by accident in
-    # python 3. We forgot to implement itervalues() at all in earlier
-    # versions of numpy, so no need to deprecated it here.
-
-    def iteritems(self):
-        # Numpy 1.15, 2018-02-20
-        warnings.warn(
-            "NpzFile.iteritems is deprecated in python 3, to match the "
-            "removal of dict.itertems. Use .items() instead.",
-            DeprecationWarning, stacklevel=2)
-        return self.items()
-
-    def iterkeys(self):
-        # Numpy 1.15, 2018-02-20
-        warnings.warn(
-            "NpzFile.iterkeys is deprecated in python 3, to match the "
-            "removal of dict.iterkeys. Use .keys() instead.",
-            DeprecationWarning, stacklevel=2)
-        return self.keys()
-
  
  @set_module('numpy')
  def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
@@ -285,7 +267,8 @@ def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
      ----------
      file : file-like object, string, or pathlib.Path
          The file to read. File-like objects must support the
-        ``seek()`` and ``read()`` methods. Pickled files require that the
+        ``seek()`` and ``read()`` methods and must always
+        be opened in binary mode.  Pickled files require that the
          file-like object support the ``readline()`` method as well.
      mmap_mode : {None, 'r+', 'r', 'w+', 'c'}, optional
          If not None, then memory-map the file, using the given mode (see
@@ -720,117 +703,367 @@ def _savez(file, args, kwds, compress, allow_pickle=True, pickle_kwargs=None):
      zipf.close()
  
  
-def _floatconv(x):
-    try:
-        return float(x)  # The fastest path.
-    except ValueError:
-        if '0x' in x:  # Don't accidentally convert "a" ("0xa") to 10.
-            try:
-                return float.fromhex(x)
-            except ValueError:
-                pass
-        raise  # Raise the original exception, which makes more sense.
+def _ensure_ndmin_ndarray_check_param(ndmin):
+    """Just checks if the param ndmin is supported on
+        _ensure_ndmin_ndarray. It is intended to be used as
+        verification before running anything expensive.
+        e.g. loadtxt, genfromtxt
+    """
+    # Check correctness of the values of `ndmin`
+    if ndmin not in [0, 1, 2]:
+        raise ValueError(f"Illegal value of ndmin keyword: {ndmin}")
+
+def _ensure_ndmin_ndarray(a, *, ndmin: int):
+    """This is a helper function of loadtxt and genfromtxt to ensure
+        proper minimum dimension as requested
+
+        ndmin : int. Supported values 0, 1, 2
+                     ^^ whenever this changes, keep in sync with
+                        _ensure_ndmin_ndarray_check_param
+    """
+    # Verify that the array has at least dimensions `ndmin`.
+    # Tweak the size and shape of the arrays - remove extraneous dimensions
+    if a.ndim > ndmin:
+        a = np.squeeze(a)
+    # and ensure we have the minimum number of dimensions asked for
+    # - has to be in this order for the odd case ndmin=1, a.squeeze().ndim=0
+    if a.ndim < ndmin:
+        if ndmin == 1:
+            a = np.atleast_1d(a)
+        elif ndmin == 2:
+            a = np.atleast_2d(a).T
+
+    return a
+
+
+# amount of lines loadtxt reads in one chunk, can be overridden for testing
+_loadtxt_chunksize = 50000
+
+
+def _loadtxt_dispatcher(
+        fname, dtype=None, comments=None, delimiter=None,
+        converters=None, skiprows=None, usecols=None, unpack=None,
+        ndmin=None, encoding=None, max_rows=None, *, like=None):
+    return (like,)
  
  
-_CONVERTERS = [  # These converters only ever get strs (not bytes) as input.
-    (np.bool_, lambda x: bool(int(x))),
-    (np.uint64, np.uint64),
-    (np.int64, np.int64),
-    (np.integer, lambda x: int(float(x))),
-    (np.longdouble, np.longdouble),
-    (np.floating, _floatconv),
-    (complex, lambda x: complex(x.replace('+-', '-'))),
-    (np.bytes_, methodcaller('encode', 'latin-1')),
-    (np.unicode_, str),
-]
+def _check_nonneg_int(value, name="argument"):
+    try:
+        operator.index(value)
+    except TypeError:
+        raise TypeError(f"{name} must be an integer") from None
+    if value < 0:
+        raise ValueError(f"{name} must be nonnegative")
  
  
-def _getconv(dtype):
+def _preprocess_comments(iterable, comments, encoding):
+    """
+    Generator that consumes an iterable of lines and strips out the
+    multiple (or multi-character) comments from each line.
+    This is a pre-processing step to achieve feature parity with loadtxt
+    (we assume that this is a niche feature).
      """
-    Find the correct dtype converter. Adapted from matplotlib.
+    for line in iterable:
+        if isinstance(line, bytes):
+            # Need to handle conversion here, or the splitting would fail
+            line = line.decode(encoding)
+
+        for c in comments:
+            line = line.split(c, 1)[0]
+
+        yield line
+
+
+# The number of rows we read in one go if confronted with a parametric dtype
+_loadtxt_chunksize = 50000
+
+
+def _read(fname, *, delimiter=',', comment='#', quote='"',
+          imaginary_unit='j', usecols=None, skiplines=0,
+          max_rows=None, converters=None, ndmin=None, unpack=False,
+          dtype=np.float64, encoding="bytes"):
+    r"""
+    Read a NumPy array from a text file.
+
+    Parameters
+    ----------
+    fname : str or file object
+        The filename or the file to be read.
+    delimiter : str, optional
+        Delimiter that separates the fields on each line of the file.
+        Default is a comma, ','.  If None any sequence of whitespace is
+        considered a delimiter.
+    comment : str or sequence of str or None, optional
+        Character that begins a comment.  All text from the comment
+        character to the end of the line is ignored.
+        Multiple comments or multiple-character comment strings are supported,
+        but may be slower and `quote` must be empty if used.
+        Use None to disable all use of comments.
+    quote : str or None, optional
+        Character that is used to quote string fields. Default is '"'
+        (a double quote). Use None to disable quote support.
+    imaginary_unit : str, optional
+        Character that represents the imaginary unit `sqrt(-1)`.
+        Default is 'j'.
+    usecols : array_like, optional
+        A one-dimensional array of integer column numbers.  These are the
+        columns from the file to be included in the array.  If this value
+        is not given, all the columns are used.
+    skiplines : int, optional
+        Number of lines to skip before interpreting the data in the file.
+    max_rows : int, optional
+        Maximum number of rows of data to read.  Default is to read the
+        entire file.
+    converters : dict or callable, optional
+        A function to parse all column strings into the desired value, or
+        a dictionary mapping column number to a parser function.
+        E.g. if column 0 is a date string: ``converters = {0: datestr2num}``.
+        Converters can also be used to provide a default value for missing
+        data, e.g. ``converters = lambda s: float(s.strip() or 0)`` will
+        convert empty fields to 0.
+        Default: None
+    ndmin : int, optional
+        Minimum dimension of the array returned.
+        Allowed values are 0, 1 or 2.  Default is 0.
+    unpack : bool, optional
+        If True, the returned array is transposed, so that arguments may be
+        unpacked using ``x, y, z = read(...)``.  When used with a structured
+        data-type, arrays are returned for each field.  Default is False.
+    dtype : numpy data type
+        A NumPy dtype instance, can be a structured dtype to map to the
+        columns of the file.
+    encoding : str, optional
+        Encoding used to decode the input file. The special value 'bytes'
+        (the default) enables backwards-compatible behavior for `converters`,
+        ensuring that inputs to the converter functions are encoded
+        bytes objects. The special value 'bytes' has no additional effect if
+        ``converters=None``. If encoding is ``'bytes'`` or ``None``, the
+        default system encoding is used.
+
+    Returns
+    -------
+    ndarray
+        NumPy array.
  
-    Even when a lambda is returned, it is defined at the toplevel, to allow
-    testing for equality and enabling optimization for single-type data.
+    Examples
+    --------
+    First we create a file for the example.
+
+    >>> s1 = '1.0,2.0,3.0\n4.0,5.0,6.0\n'
+    >>> with open('example1.csv', 'w') as f:
+    ...     f.write(s1)
+    >>> a1 = read_from_filename('example1.csv')
+    >>> a1
+    array([[1., 2., 3.],
+           [4., 5., 6.]])
+
+    The second example has columns with different data types, so a
+    one-dimensional array with a structured data type is returned.
+    The tab character is used as the field delimiter.
+
+    >>> s2 = '1.0\t10\talpha\n2.3\t25\tbeta\n4.5\t16\tgamma\n'
+    >>> with open('example2.tsv', 'w') as f:
+    ...     f.write(s2)
+    >>> a2 = read_from_filename('example2.tsv', delimiter='\t')
+    >>> a2
+    array([(1. , 10, b'alpha'), (2.3, 25, b'beta'), (4.5, 16, b'gamma')],
+          dtype=[('f0', '<f8'), ('f1', 'u1'), ('f2', 'S5')])
      """
-    for base, conv in _CONVERTERS:
-        if issubclass(dtype.type, base):
-            return conv
-    return str
-
-
-# _loadtxt_flatten_dtype_internal and _loadtxt_pack_items are loadtxt helpers
-# lifted to the toplevel because recursive inner functions cause either
-# GC-dependent reference loops (because they are closures over loadtxt's
-# internal variables) or large overheads if using a manual trampoline to hide
-# the recursive calls.
-
-
-# not to be confused with the flatten_dtype we import...
-def _loadtxt_flatten_dtype_internal(dt):
-    """Unpack a structured data-type, and produce a packer function."""
-    if dt.names is None:
-        # If the dtype is flattened, return.
-        # If the dtype has a shape, the dtype occurs
-        # in the list more than once.
-        shape = dt.shape
-        if len(shape) == 0:
-            return ([dt.base], None)
-        else:
-            packing = [(shape[-1], list)]
-            if len(shape) > 1:
-                for dim in dt.shape[-2::-1]:
-                    packing = [(dim*packing[0][0], packing*dim)]
-            return ([dt.base] * int(np.prod(dt.shape)),
-                    functools.partial(_loadtxt_pack_items, packing))
+    # Handle special 'bytes' keyword for encoding
+    byte_converters = False
+    if encoding == 'bytes':
+        encoding = None
+        byte_converters = True
+
+    if dtype is None:
+        raise TypeError("a dtype must be provided.")
+    dtype = np.dtype(dtype)
+
+    read_dtype_via_object_chunks = None
+    if dtype.kind in 'SUM' and (
+            dtype == "S0" or dtype == "U0" or dtype == "M8" or dtype == 'm8'):
+        # This is a legacy "flexible" dtype.  We do not truly support
+        # parametric dtypes currently (no dtype discovery step in the core),
+        # but have to support these for backward compatibility.
+        read_dtype_via_object_chunks = dtype
+        dtype = np.dtype(object)
+
+    if usecols is not None:
+        # Allow usecols to be a single int or a sequence of ints, the C-code
+        # handles the rest
+        try:
+            usecols = list(usecols)
+        except TypeError:
+            usecols = [usecols]
+
+    _ensure_ndmin_ndarray_check_param(ndmin)
+
+    if comment is None:
+        comments = None
      else:
-        types = []
-        packing = []
-        for field in dt.names:
-            tp, bytes = dt.fields[field]
-            flat_dt, flat_packer = _loadtxt_flatten_dtype_internal(tp)
-            types.extend(flat_dt)
-            flat_packing = flat_packer.args[0] if flat_packer else None
-            # Avoid extra nesting for subarrays
-            if tp.ndim > 0:
-                packing.extend(flat_packing)
-            else:
-                packing.append((len(flat_dt), flat_packing))
-        return (types, functools.partial(_loadtxt_pack_items, packing))
-
-
-def _loadtxt_pack_items(packing, items):
-    """Pack items into nested lists based on re-packing info."""
-    if packing is None:
-        return items[0]
-    elif packing is tuple:
-        return tuple(items)
-    elif packing is list:
-        return list(items)
+        # assume comments are a sequence of strings
+        if "" in comment:
+            raise ValueError(
+                "comments cannot be an empty string. Use comments=None to "
+                "disable comments."
+            )
+        comments = tuple(comment)
+        comment = None
+        if len(comments) == 0:
+            comments = None  # No comments at all
+        elif len(comments) == 1:
+            # If there is only one comment, and that comment has one character,
+            # the normal parsing can deal with it just fine.
+            if isinstance(comments[0], str) and len(comments[0]) == 1:
+                comment = comments[0]
+                comments = None
+        else:
+            # Input validation if there are multiple comment characters
+            if delimiter in comments:
+                raise TypeError(
+                    f"Comment characters '{comments}' cannot include the "
+                    f"delimiter '{delimiter}'"
+                )
+
+    # comment is now either a 1 or 0 character string or a tuple:
+    if comments is not None:
+        # Note: An earlier version supported two-character comments (and
+        #       could have been extended to multiple characters); we assume
+        #       this is rare enough not to optimize for.
+        if quote is not None:
+            raise ValueError(
+                "when multiple comments or a multi-character comment is "
+                "given, quotes are not supported.  In this case quotechar "
+                "must be set to None.")
+
+    if len(imaginary_unit) != 1:
+        raise ValueError('len(imaginary_unit) must be 1.')
+
+    _check_nonneg_int(skiplines)
+    if max_rows is not None:
+        _check_nonneg_int(max_rows)
      else:
-        start = 0
-        ret = []
-        for length, subpacking in packing:
-            ret.append(
-                _loadtxt_pack_items(subpacking, items[start:start+length]))
-            start += length
-        return tuple(ret)
+        # Passing -1 to the C code means "read the entire file".
+        max_rows = -1
  
+    fh_closing_ctx = contextlib.nullcontext()
+    filelike = False
+    try:
+        if isinstance(fname, os.PathLike):
+            fname = os.fspath(fname)
+        if isinstance(fname, str):
+            fh = np.lib._datasource.open(fname, 'rt', encoding=encoding)
+            if encoding is None:
+                encoding = getattr(fh, 'encoding', 'latin1')
  
-# amount of lines loadtxt reads in one chunk, can be overridden for testing
-_loadtxt_chunksize = 50000
+            fh_closing_ctx = contextlib.closing(fh)
+            data = fh
+            filelike = True
+        else:
+            if encoding is None:
+                encoding = getattr(fname, 'encoding', 'latin1')
+            data = iter(fname)
+    except TypeError as e:
+        raise ValueError(
+            f"fname must be a string, filehandle, list of strings,\n"
+            f"or generator. Got {type(fname)} instead.") from e
  
+    with fh_closing_ctx:
+        if comments is not None:
+            if filelike:
+                data = iter(data)
+                filelike = False
+            data = _preprocess_comments(data, comments, encoding)
+
+        if read_dtype_via_object_chunks is None:
+            arr = _load_from_filelike(
+                data, delimiter=delimiter, comment=comment, quote=quote,
+                imaginary_unit=imaginary_unit,
+                usecols=usecols, skiplines=skiplines, max_rows=max_rows,
+                converters=converters, dtype=dtype,
+                encoding=encoding, filelike=filelike,
+                byte_converters=byte_converters)
  
-def _loadtxt_dispatcher(fname, dtype=None, comments=None, delimiter=None,
-                        converters=None, skiprows=None, usecols=None, unpack=None,
-                        ndmin=None, encoding=None, max_rows=None, *, like=None):
-    return (like,)
+        else:
+            # This branch reads the file into chunks of object arrays and then
+            # casts them to the desired actual dtype.  This ensures correct
+            # string-length and datetime-unit discovery (like `arr.astype()`).
+            # Due to chunking, certain error reports are less clear, currently.
+            if filelike:
+                data = iter(data)  # cannot chunk when reading from file
+
+            c_byte_converters = False
+            if read_dtype_via_object_chunks == "S":
+                c_byte_converters = True  # Use latin1 rather than ascii
+
+            chunks = []
+            while max_rows != 0:
+                if max_rows < 0:
+                    chunk_size = _loadtxt_chunksize
+                else:
+                    chunk_size = min(_loadtxt_chunksize, max_rows)
+
+                next_arr = _load_from_filelike(
+                    data, delimiter=delimiter, comment=comment, quote=quote,
+                    imaginary_unit=imaginary_unit,
+                    usecols=usecols, skiplines=skiplines, max_rows=chunk_size,
+                    converters=converters, dtype=dtype,
+                    encoding=encoding, filelike=filelike,
+                    byte_converters=byte_converters,
+                    c_byte_converters=c_byte_converters)
+                # Cast here already.  We hope that this is better even for
+                # large files because the storage is more compact.  It could
+                # be adapted (in principle the concatenate could cast).
+                chunks.append(next_arr.astype(read_dtype_via_object_chunks))
+
+                skiplines = 0  # Only have to skip for first chunk
+                if max_rows >= 0:
+                    max_rows -= chunk_size
+                if len(next_arr) < chunk_size:
+                    # There was less data than requested, so we are done.
+                    break
+
+            # Need at least one chunk, but if empty, the last one may have
+            # the wrong shape.
+            if len(chunks) > 1 and len(chunks[-1]) == 0:
+                del chunks[-1]
+            if len(chunks) == 1:
+                arr = chunks[0]
+            else:
+                arr = np.concatenate(chunks, axis=0)
+
+    # NOTE: ndmin works as advertised for structured dtypes, but normally
+    #       these would return a 1D result plus the structured dimension,
+    #       so ndmin=2 adds a third dimension even when no squeezing occurs.
+    #       A `squeeze=False` could be a better solution (pandas uses squeeze).
+    arr = _ensure_ndmin_ndarray(arr, ndmin=ndmin)
+
+    if arr.shape:
+        if arr.shape[0] == 0:
+            warnings.warn(
+                f'loadtxt: input contained no data: "{fname}"',
+                category=UserWarning,
+                stacklevel=3
+            )
+
+    if unpack:
+        # Unpack structured dtypes if requested:
+        dt = arr.dtype
+        if dt.names is not None:
+            # For structured arrays, return an array for each field.
+            return [arr[field] for field in dt.names]
+        else:
+            return arr.T
+    else:
+        return arr
  
  
  @set_array_function_like_doc
  @set_module('numpy')
  def loadtxt(fname, dtype=float, comments='#', delimiter=None,
              converters=None, skiprows=0, usecols=None, unpack=False,
-            ndmin=0, encoding='bytes', max_rows=None, *, like=None):
+            ndmin=0, encoding='bytes', max_rows=None, *, quotechar=None,
+            like=None):
      r"""
      Load data from a text file.
  
@@ -849,19 +1082,20 @@ def loadtxt(fname, dtype=float, comments='#', delimiter=None,
          each row will be interpreted as an element of the array.  In this
          case, the number of columns used must match the number of fields in
          the data-type.
-    comments : str or sequence of str, optional
+    comments : str or sequence of str or None, optional
          The characters or list of characters used to indicate the start of a
          comment. None implies no comments. For backwards compatibility, byte
          strings will be decoded as 'latin1'. The default is '#'.
      delimiter : str, optional
          The string used to separate values. For backwards compatibility, byte
          strings will be decoded as 'latin1'. The default is whitespace.
-    converters : dict, optional
-        A dictionary mapping column number to a function that will parse the
-        column string into the desired value.  E.g., if column 0 is a date
-        string: ``converters = {0: datestr2num}``.  Converters can also be
-        used to provide a default value for missing data (but see also
-        `genfromtxt`): ``converters = {3: lambda s: float(s.strip() or 0)}``.
+    converters : dict or callable, optional
+        A function to parse all column strings into the desired value, or
+        a dictionary mapping column number to a parser function.
+        E.g. if column 0 is a date string: ``converters = {0: datestr2num}``.
+        Converters can also be used to provide a default value for missing
+        data, e.g. ``converters = lambda s: float(s.strip() or 0)`` will
+        convert empty fields to 0.
          Default: None.
      skiprows : int, optional
          Skip the first `skiprows` lines, including comments; default: 0.
@@ -899,6 +1133,16 @@ def loadtxt(fname, dtype=float, comments='#', delimiter=None,
          is to read all the lines.
  
          .. versionadded:: 1.16.0
+    quotechar : unicode character or None, optional
+        The character used to denote the start and end of a quoted item.
+        Occurrences of the delimiter or comment characters are ignored within
+        a quoted item. The default value is ``quotechar=None``, which means
+        quoting support is disabled.
+
+        If two consecutive instances of `quotechar` are found within a quoted
+        field, the first is treated as an escape character. See examples.
+
+        .. versionadded:: 1.23.0
      ${ARRAY_FUNCTION_LIKE}
  
          .. versionadded:: 1.20.0
@@ -946,6 +1190,29 @@ def loadtxt(fname, dtype=float, comments='#', delimiter=None,
      >>> y
      array([2., 4.])
  
+    The `converters` argument is used to specify functions to preprocess the
+    text prior to parsing. `converters` can be a dictionary that maps each
+    column number to its preprocessing function:
+
+    >>> s = StringIO("1.618, 2.296\n3.141, 4.669\n")
+    >>> conv = {
+    ...     0: lambda x: np.floor(float(x)),  # conversion fn for column 0
+    ...     1: lambda x: np.ceil(float(x)),  # conversion fn for column 1
+    ... }
+    >>> np.loadtxt(s, delimiter=",", converters=conv)
+    array([[1., 3.],
+           [3., 5.]])
+
+    `converters` can be a callable instead of a dictionary, in which case it
+    is applied to all columns:
+
+    >>> s = StringIO("0xDE 0xAD\n0xC0 0xDE")
+    >>> import functools
+    >>> conv = functools.partial(int, base=16)
+    >>> np.loadtxt(s, converters=conv)
+    array([[222., 173.],
+           [192., 222.]])
+
      This example shows how `converters` can be used to convert a field
      with a trailing minus sign into a negative number.
  
@@ -953,254 +1220,90 @@ def loadtxt(fname, dtype=float, comments='#', delimiter=None,
      >>> def conv(fld):
      ...     return -float(fld[:-1]) if fld.endswith(b'-') else float(fld)
      ...
-    >>> np.loadtxt(s, converters={0: conv, 1: conv})
+    >>> np.loadtxt(s, converters=conv)
      array([[ 10.01, -31.25],
             [ 19.22,  64.31],
             [-17.57,  63.94]])
-    """
-
-    if like is not None:
-        return _loadtxt_with_like(
-            fname, dtype=dtype, comments=comments, delimiter=delimiter,
-            converters=converters, skiprows=skiprows, usecols=usecols,
-            unpack=unpack, ndmin=ndmin, encoding=encoding,
-            max_rows=max_rows, like=like
-        )
  
-    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-    # Nested functions used by loadtxt.
-    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+    Using a callable as the converter can be particularly useful for handling
+    values with different formatting, e.g. floats with underscores:
  
-    def split_line(line: str):
-        """Chop off comments, strip, and split at delimiter."""
-        for comment in comments:  # Much faster than using a single regex.
-            line = line.split(comment, 1)[0]
-        line = line.strip('\r\n')
-        return line.split(delimiter) if line else []
-
-    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-    # Main body of loadtxt.
-    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-
-    # Check correctness of the values of `ndmin`
-    if ndmin not in [0, 1, 2]:
-        raise ValueError('Illegal value of ndmin keyword: %s' % ndmin)
+    >>> s = StringIO("1 2.7 100_000")
+    >>> np.loadtxt(s, converters=float)
+    array([1.e+00, 2.7e+00, 1.e+05])
  
-    # Type conversions for Py3 convenience
-    if comments is not None:
-        if isinstance(comments, (str, bytes)):
-            comments = [comments]
-        comments = [_decode_line(x) for x in comments]
-    else:
-        comments = []
+    This idea can be extended to automatically handle values specified in
+    many different formats:
  
-    if delimiter is not None:
-        delimiter = _decode_line(delimiter)
+    >>> def conv(val):
+    ...     try:
+    ...         return float(val)
+    ...     except ValueError:
+    ...         return float.fromhex(val)
+    >>> s = StringIO("1, 2.5, 3_000, 0b4, 0x1.4000000000000p+2")
+    >>> np.loadtxt(s, delimiter=",", converters=conv, encoding=None)
+    array([1.0e+00, 2.5e+00, 3.0e+03, 1.8e+02, 5.0e+00])
  
-    user_converters = converters
+    Note that with the default ``encoding="bytes"``, the inputs to the
+    converter function are latin-1 encoded byte strings. To deactivate the
+    implicit encoding prior to conversion, use ``encoding=None``:
  
-    byte_converters = False
-    if encoding == 'bytes':
-        encoding = None
-        byte_converters = True
-
-    if usecols is not None:
-        # Copy usecols, allowing it to be a single int or a sequence of ints.
-        try:
-            usecols = list(usecols)
-        except TypeError:
-            usecols = [usecols]
-        for i, col_idx in enumerate(usecols):
-            try:
-                usecols[i] = opindex(col_idx)  # Cast to builtin int now.
-            except TypeError as e:
-                e.args = (
-                    "usecols must be an int or a sequence of ints but "
-                    "it contains at least one element of type %s" %
-                    type(col_idx),
-                    )
-                raise
-        if len(usecols) > 1:
-            usecols_getter = itemgetter(*usecols)
-        else:
-            # Get an iterable back, even if using a single column.
-            usecols_getter = lambda obj, c=usecols[0]: [obj[c]]
-    else:
-        usecols_getter = None
+    >>> s = StringIO('10.01 31.25-\n19.22 64.31\n17.57- 63.94')
+    >>> conv = lambda x: -float(x[:-1]) if x.endswith('-') else float(x)
+    >>> np.loadtxt(s, converters=conv, encoding=None)
+    array([[ 10.01, -31.25],
+           [ 19.22,  64.31],
+           [-17.57,  63.94]])
  
-    # Make sure we're dealing with a proper dtype
-    dtype = np.dtype(dtype)
-    defconv = _getconv(dtype)
+    Support for quoted fields is enabled with the `quotechar` parameter.
+    Comment and delimiter characters are ignored when they appear within a
+    quoted item delineated by `quotechar`:
  
-    dtype_types, packer = _loadtxt_flatten_dtype_internal(dtype)
+    >>> s = StringIO('"alpha, #42", 10.0\n"beta, #64", 2.0\n')
+    >>> dtype = np.dtype([("label", "U12"), ("value", float)])
+    >>> np.loadtxt(s, dtype=dtype, delimiter=",", quotechar='"')
+    array([('alpha, #42', 10.), ('beta, #64',  2.)],
+          dtype=[('label', '<U12'), ('value', '<f8')])
  
-    fh_closing_ctx = contextlib.nullcontext()
-    try:
-        if isinstance(fname, os_PathLike):
-            fname = os_fspath(fname)
-        if _is_string_like(fname):
-            fh = np.lib._datasource.open(fname, 'rt', encoding=encoding)
-            fencoding = getattr(fh, 'encoding', 'latin1')
-            line_iter = iter(fh)
-            fh_closing_ctx = contextlib.closing(fh)
-        else:
-            line_iter = iter(fname)
-            fencoding = getattr(fname, 'encoding', 'latin1')
-            try:
-                first_line = next(line_iter)
-            except StopIteration:
-                pass  # Nothing matters if line_iter is empty.
-            else:
-                # Put first_line back.
-                line_iter = itertools.chain([first_line], line_iter)
-                if isinstance(first_line, bytes):
-                    # Using latin1 matches _decode_line's behavior.
-                    decoder = methodcaller(
-                        "decode",
-                        encoding if encoding is not None else "latin1")
-                    line_iter = map(decoder, line_iter)
-    except TypeError as e:
-        raise ValueError(
-            f"fname must be a string, filehandle, list of strings,\n"
-            f"or generator. Got {type(fname)} instead."
-        ) from e
+    Two consecutive quote characters within a quoted field are treated as a
+    single escaped character:
  
-    with fh_closing_ctx:
+    >>> s = StringIO('"Hello, my name is ""Monty""!"')
+    >>> np.loadtxt(s, dtype="U", delimiter=",", quotechar='"')
+    array('Hello, my name is "Monty"!', dtype='<U26')
  
-        # input may be a python2 io stream
-        if encoding is not None:
-            fencoding = encoding
-        # we must assume local encoding
-        # TODO emit portability warning?
-        elif fencoding is None:
-            import locale
-            fencoding = locale.getpreferredencoding()
-
-        # Skip the first `skiprows` lines
-        for i in range(skiprows):
-            next(line_iter)
-
-        # Read until we find a line with some values, and use it to determine
-        # the need for decoding and estimate the number of columns.
-        for first_line in line_iter:
-            ncols = len(usecols or split_line(first_line))
-            if ncols:
-                # Put first_line back.
-                line_iter = itertools.chain([first_line], line_iter)
-                break
-        else:  # End of lines reached
-            ncols = len(usecols or [])
-            warnings.warn('loadtxt: Empty input file: "%s"' % fname,
-                          stacklevel=2)
-
-        line_iter = itertools.islice(line_iter, max_rows)
-        lineno_words_iter = filter(
-            itemgetter(1),  # item[1] is words; filter skips empty lines.
-            enumerate(map(split_line, line_iter), 1 + skiprows))
-
-        # Now that we know ncols, create the default converters list, and
-        # set packing, if necessary.
-        if len(dtype_types) > 1:
-            # We're dealing with a structured array, each field of
-            # the dtype matches a column
-            converters = [_getconv(dt) for dt in dtype_types]
-        else:
-            # All fields have the same dtype; use specialized packers which are
-            # much faster than those using _loadtxt_pack_items.
-            converters = [defconv for i in range(ncols)]
-            if ncols == 1:
-                packer = itemgetter(0)
-            else:
-                def packer(row): return row
+    """
  
-        # By preference, use the converters specified by the user
-        for i, conv in (user_converters or {}).items():
-            if usecols:
-                try:
-                    i = usecols.index(i)
-                except ValueError:
-                    # Unused converter specified
-                    continue
-            if byte_converters:
-                # converters may use decode to workaround numpy's old
-                # behaviour, so encode the string again (converters are only
-                # called with strings) before passing to the user converter.
-                def tobytes_first(conv, x):
-                    return conv(x.encode("latin1"))
-                converters[i] = functools.partial(tobytes_first, conv)
-            else:
-                converters[i] = conv
-
-        fencode = methodcaller("encode", fencoding)
-        converters = [conv if conv is not bytes else fencode
-                      for conv in converters]
-        if len(set(converters)) == 1:
-            # Optimize single-type data. Note that this is only reached if
-            # `_getconv` returns equal callables (i.e. not local lambdas) on
-            # equal dtypes.
-            def convert_row(vals, _conv=converters[0]):
-                return [*map(_conv, vals)]
-        else:
-            def convert_row(vals):
-                return [conv(val) for conv, val in zip(converters, vals)]
-
-        # read data in chunks and fill it into an array via resize
-        # over-allocating and shrinking the array later may be faster but is
-        # probably not relevant compared to the cost of actually reading and
-        # converting the data
-        X = None
-        while True:
-            chunk = []
-            for lineno, words in itertools.islice(
-                    lineno_words_iter, _loadtxt_chunksize):
-                if usecols_getter is not None:
-                    words = usecols_getter(words)
-                elif len(words) != ncols:
-                    raise ValueError(
-                        f"Wrong number of columns at line {lineno}")
-                # Convert each value according to its column, then pack it
-                # according to the dtype's nesting, and store it.
-                chunk.append(packer(convert_row(words)))
-            if not chunk:  # The islice is empty, i.e. we're done.
-                break
+    if like is not None:
+        return _loadtxt_with_like(
+            fname, dtype=dtype, comments=comments, delimiter=delimiter,
+            converters=converters, skiprows=skiprows, usecols=usecols,
+            unpack=unpack, ndmin=ndmin, encoding=encoding,
+            max_rows=max_rows, like=like
+        )
  
-            if X is None:
-                X = np.array(chunk, dtype)
-            else:
-                nshape = list(X.shape)
-                pos = nshape[0]
-                nshape[0] += len(chunk)
-                X.resize(nshape, refcheck=False)
-                X[pos:, ...] = chunk
+    if isinstance(delimiter, bytes):
+        delimiter = delimiter.decode("latin1")
  
-    if X is None:
-        X = np.array([], dtype)
+    if dtype is None:
+        dtype = np.float64
  
-    # Multicolumn data are returned with shape (1, N, M), i.e.
-    # (1, 1, M) for a single row - remove the singleton dimension there
-    if X.ndim == 3 and X.shape[:2] == (1, 1):
-        X.shape = (1, -1)
+    comment = comments
+    # Control character type conversions for Py3 convenience
+    if comment is not None:
+        if isinstance(comment, (str, bytes)):
+            comment = [comment]
+        comment = [
+            x.decode('latin1') if isinstance(x, bytes) else x for x in comment]
+    if isinstance(delimiter, bytes):
+        delimiter = delimiter.decode('latin1')
  
-    # Verify that the array has at least dimensions `ndmin`.
-    # Tweak the size and shape of the arrays - remove extraneous dimensions
-    if X.ndim > ndmin:
-        X = np.squeeze(X)
-    # and ensure we have the minimum number of dimensions asked for
-    # - has to be in this order for the odd case ndmin=1, X.squeeze().ndim=0
-    if X.ndim < ndmin:
-        if ndmin == 1:
-            X = np.atleast_1d(X)
-        elif ndmin == 2:
-            X = np.atleast_2d(X).T
+    arr = _read(fname, dtype=dtype, comment=comment, delimiter=delimiter,
+                converters=converters, skiplines=skiprows, usecols=usecols,
+                unpack=unpack, ndmin=ndmin, encoding=encoding,
+                max_rows=max_rows, quote=quotechar)
  
-    if unpack:
-        if len(dtype_types) > 1:
-            # For structured arrays, return an array for each field.
-            return [X[field] for field in dtype.names]
-        else:
-            return X.T
-    else:
-        return X
+    return arr
  
  
  _loadtxt_with_like = array_function_dispatch(
@@ -1572,8 +1675,8 @@ def _genfromtxt_dispatcher(fname, dtype=None, comments=None, delimiter=None,
                             names=None, excludelist=None, deletechars=None,
                             replace_space=None, autostrip=None, case_sensitive=None,
                             defaultfmt=None, unpack=None, usemask=None, loose=None,
-                           invalid_raise=None, max_rows=None, encoding=None, *,
-                           like=None):
+                           invalid_raise=None, max_rows=None, encoding=None,
+                           *, ndmin=None, like=None):
      return (like,)
  
  
@@ -1586,8 +1689,8 @@ def genfromtxt(fname, dtype=float, comments='#', delimiter=None,
                 deletechars=''.join(sorted(NameValidator.defaultdeletechars)),
                 replace_space='_', autostrip=False, case_sensitive=True,
                 defaultfmt="f%i", unpack=None, usemask=False, loose=True,
-               invalid_raise=True, max_rows=None, encoding='bytes', *,
-               like=None):
+               invalid_raise=True, max_rows=None, encoding='bytes',
+               *, ndmin=0, like=None):
      """
      Load data from a text file, with missing values handled as specified.
  
@@ -1686,6 +1789,10 @@ def genfromtxt(fname, dtype=float, comments='#', delimiter=None,
          to None the system default is used. The default value is 'bytes'.
  
          .. versionadded:: 1.14.0
+    ndmin : int, optional
+        Same parameter as in `loadtxt`.
+
+        .. versionadded:: 1.23.0
      ${ARRAY_FUNCTION_LIKE}
  
          .. versionadded:: 1.20.0
@@ -1779,9 +1886,12 @@ def genfromtxt(fname, dtype=float, comments='#', delimiter=None,
              case_sensitive=case_sensitive, defaultfmt=defaultfmt,
              unpack=unpack, usemask=usemask, loose=loose,
              invalid_raise=invalid_raise, max_rows=max_rows, encoding=encoding,
+            ndmin=ndmin,
              like=like
          )
  
+    _ensure_ndmin_ndarray_check_param(ndmin)
+
      if max_rows is not None:
          if skip_footer:
              raise ValueError(
@@ -1806,22 +1916,21 @@ def genfromtxt(fname, dtype=float, comments='#', delimiter=None,
          byte_converters = False
  
      # Initialize the filehandle, the LineSplitter and the NameValidator
+    if isinstance(fname, os_PathLike):
+        fname = os_fspath(fname)
+    if isinstance(fname, str):
+        fid = np.lib._datasource.open(fname, 'rt', encoding=encoding)
+        fid_ctx = contextlib.closing(fid)
+    else:
+        fid = fname
+        fid_ctx = contextlib.nullcontext(fid)
      try:
-        if isinstance(fname, os_PathLike):
-            fname = os_fspath(fname)
-        if isinstance(fname, str):
-            fid = np.lib._datasource.open(fname, 'rt', encoding=encoding)
-            fid_ctx = contextlib.closing(fid)
-        else:
-            fid = fname
-            fid_ctx = contextlib.nullcontext(fid)
          fhd = iter(fid)
      except TypeError as e:
          raise TypeError(
-            f"fname must be a string, filehandle, list of strings,\n"
-            f"or generator. Got {type(fname)} instead."
+            "fname must be a string, a filehandle, a sequence of strings,\n"
+            f"or an iterator of strings. Got {type(fname)} instead."
          ) from e
-
      with fid_ctx:
          split_line = LineSplitter(delimiter=delimiter, comments=comments,
                                    autostrip=autostrip, encoding=encoding)
@@ -2291,7 +2400,9 @@ def genfromtxt(fname, dtype=float, comments='#', delimiter=None,
      if usemask:
          output = output.view(MaskedArray)
          output._mask = outputmask
-    output = np.squeeze(output)
+
+    output = _ensure_ndmin_ndarray(output, ndmin=ndmin)
+
      if unpack:
          if names is None:
              return output.T
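Note on the chunked path in `_read` (numpy/lib/npyio.py): the loop in this file reads at most `_loadtxt_chunksize` rows per pass, decrements a non-negative `max_rows`, and stops once a pass returns fewer rows than requested. A minimal pure-Python sketch of that control flow follows. It is not NumPy's implementation: `read_rows`, `CHUNK_SIZE`, and the whitespace/float parsing are hypothetical stand-ins for `_load_from_filelike`, and the sketch assumes each pass is asked for exactly `chunk_size` rows and that leading lines are skipped only before the first pass.

```python
from itertools import islice

CHUNK_SIZE = 3  # hypothetical stand-in for _loadtxt_chunksize


def read_rows(lines, max_rows=-1, skiplines=0, chunk_size=CHUNK_SIZE):
    """Parse whitespace-separated floats chunk by chunk.

    A negative max_rows means "read everything", mirroring the loop
    condition `while max_rows != 0` in the diff.
    """
    data = iter(lines)
    chunks = []
    while max_rows != 0:
        size = chunk_size if max_rows < 0 else min(chunk_size, max_rows)

        # Skip leading lines once, before the first chunk only.
        for _ in range(skiplines):
            next(data, None)
        skiplines = 0

        # Stand-in for _load_from_filelike: take at most `size` rows.
        chunk = [[float(v) for v in line.split()]
                 for line in islice(data, size)]
        chunks.append(chunk)

        if max_rows >= 0:
            max_rows -= size
        if len(chunk) < size:
            break  # fewer rows than requested: the input is exhausted

    # Keep at least one chunk, but drop a trailing empty one.
    if len(chunks) > 1 and not chunks[-1]:
        del chunks[-1]
    return [row for chunk in chunks for row in chunk]


if __name__ == "__main__":
    sample = ["1 2", "3 4", "5 6", "7 8", "9 10", "11 12", "13 14"]
    print(read_rows(sample, max_rows=4, skiplines=1))
```

Requesting exactly `size` rows per pass is what makes the `len(chunk) < size` test a reliable end-of-input signal; if a pass could return more rows than `size`, the `max_rows` bookkeeping would drift.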
diff --git a/numpy/lib/npyio.pyi b/numpy/lib/npyio.pyi

index 75d06e9e33dd620abeab57126f69059b6751d371..8007b2dc717bf6ea5cc6295c0250a17b837c8700 100644 (file)
--- a/numpy/lib/npyio.pyi
+++ b/numpy/lib/npyio.pyi
@@ -2,23 +2,16 @@ import os
  import sys
  import zipfile
  import types
+from re import Pattern
+from collections.abc import Collection, Mapping, Iterator, Sequence, Callable, Iterable
  from typing import (
      Literal as L,
      Any,
-    Mapping,
      TypeVar,
      Generic,
-    List,
-    Type,
-    Iterator,
-    Union,
      IO,
      overload,
-    Sequence,
-    Callable,
-    Pattern,
      Protocol,
-    Iterable,
  )
  
  from numpy import (
@@ -33,7 +26,13 @@ from numpy import (
  )
  
  from numpy.ma.mrecords import MaskedRecords
-from numpy.typing import ArrayLike, DTypeLike, NDArray, _SupportsDType
+from numpy._typing import (
+    ArrayLike,
+    DTypeLike,
+    NDArray,
+    _DTypeLike,
+    _SupportsArrayFunc,
+)
  
  from numpy.core.multiarray import (
      packbits as packbits,
@@ -47,12 +46,6 @@ _SCT = TypeVar("_SCT", bound=generic)
  _CharType_co = TypeVar("_CharType_co", str, bytes, covariant=True)
  _CharType_contra = TypeVar("_CharType_contra", str, bytes, contravariant=True)
  
-_DTypeLike = Union[
-    Type[_SCT],
-    dtype[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
-
  class _SupportsGetItem(Protocol[_T_contra, _T_co]):
      def __getitem__(self, key: _T_contra, /) -> _T_co: ...
  
@@ -66,17 +59,17 @@ class _SupportsReadSeek(Protocol[_CharType_co]):
  class _SupportsWrite(Protocol[_CharType_contra]):
      def write(self, s: _CharType_contra, /) -> object: ...
  
-__all__: List[str]
+__all__: list[str]
  
  class BagObj(Generic[_T_co]):
      def __init__(self, obj: _SupportsGetItem[str, _T_co]) -> None: ...
      def __getattribute__(self, key: str) -> _T_co: ...
-    def __dir__(self) -> List[str]: ...
+    def __dir__(self) -> list[str]: ...
  
  class NpzFile(Mapping[str, NDArray[Any]]):
      zip: zipfile.ZipFile
      fid: None | IO[str]
-    files: List[str]
+    files: list[str]
      allow_pickle: bool
      pickle_kwargs: None | Mapping[str, Any]
      # Represent `f` as a mutable property so we can access the type of `self`
@@ -94,7 +87,7 @@ class NpzFile(Mapping[str, NDArray[Any]]):
      def __enter__(self: _T) -> _T: ...
      def __exit__(
          self,
-        exc_type: None | Type[BaseException],
+        exc_type: None | type[BaseException],
          exc_value: None | BaseException,
          traceback: None | types.TracebackType,
          /,
@@ -150,7 +143,8 @@ def loadtxt(
      encoding: None | str = ...,
      max_rows: None | int = ...,
      *,
-    like: None | ArrayLike = ...
+    quotechar: None | str = ...,
+    like: None | _SupportsArrayFunc = ...
  ) -> NDArray[float64]: ...
  @overload
  def loadtxt(
@@ -166,7 +160,8 @@ def loadtxt(
      encoding: None | str = ...,
      max_rows: None | int = ...,
      *,
-    like: None | ArrayLike = ...
+    quotechar: None | str = ...,
+    like: None | _SupportsArrayFunc = ...
  ) -> NDArray[_SCT]: ...
  @overload
  def loadtxt(
@@ -182,7 +177,8 @@ def loadtxt(
      encoding: None | str = ...,
      max_rows: None | int = ...,
      *,
-    like: None | ArrayLike = ...
+    quotechar: None | str = ...,
+    like: None | _SupportsArrayFunc = ...
  ) -> NDArray[Any]: ...
  
  def savetxt(
@@ -212,27 +208,92 @@ def fromregex(
      encoding: None | str = ...
  ) -> NDArray[Any]: ...
  
-# TODO: Sort out arguments
  @overload
  def genfromtxt(
      fname: str | os.PathLike[str] | Iterable[str] | Iterable[bytes],
      dtype: None = ...,
-    *args: Any,
-    **kwargs: Any,
+    comments: str = ...,
+    delimiter: None | str | int | Iterable[int] = ...,
+    skip_header: int = ...,
+    skip_footer: int = ...,
+    converters: None | Mapping[int | str, Callable[[str], Any]] = ...,
+    missing_values: Any = ...,
+    filling_values: Any = ...,
+    usecols: None | Sequence[int] = ...,
+    names: L[None, True] | str | Collection[str] = ...,
+    excludelist: None | Sequence[str] = ...,
+    deletechars: str = ...,
+    replace_space: str = ...,
+    autostrip: bool = ...,
+    case_sensitive: bool | L['upper', 'lower'] = ...,
+    defaultfmt: str = ...,
+    unpack: None | bool = ...,
+    usemask: bool = ...,
+    loose: bool = ...,
+    invalid_raise: bool = ...,
+    max_rows: None | int = ...,
+    encoding: str = ...,
+    *,
+    ndmin: L[0, 1, 2] = ...,
+    like: None | _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def genfromtxt(
      fname: str | os.PathLike[str] | Iterable[str] | Iterable[bytes],
      dtype: _DTypeLike[_SCT],
-    *args: Any,
-    **kwargs: Any,
+    comments: str = ...,
+    delimiter: None | str | int | Iterable[int] = ...,
+    skip_header: int = ...,
+    skip_footer: int = ...,
+    converters: None | Mapping[int | str, Callable[[str], Any]] = ...,
+    missing_values: Any = ...,
+    filling_values: Any = ...,
+    usecols: None | Sequence[int] = ...,
+    names: L[None, True] | str | Collection[str] = ...,
+    excludelist: None | Sequence[str] = ...,
+    deletechars: str = ...,
+    replace_space: str = ...,
+    autostrip: bool = ...,
+    case_sensitive: bool | L['upper', 'lower'] = ...,
+    defaultfmt: str = ...,
+    unpack: None | bool = ...,
+    usemask: bool = ...,
+    loose: bool = ...,
+    invalid_raise: bool = ...,
+    max_rows: None | int = ...,
+    encoding: str = ...,
+    *,
+    ndmin: L[0, 1, 2] = ...,
+    like: None | _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def genfromtxt(
      fname: str | os.PathLike[str] | Iterable[str] | Iterable[bytes],
      dtype: DTypeLike,
-    *args: Any,
-    **kwargs: Any,
+    comments: str = ...,
+    delimiter: None | str | int | Iterable[int] = ...,
+    skip_header: int = ...,
+    skip_footer: int = ...,
+    converters: None | Mapping[int | str, Callable[[str], Any]] = ...,
+    missing_values: Any = ...,
+    filling_values: Any = ...,
+    usecols: None | Sequence[int] = ...,
+    names: L[None, True] | str | Collection[str] = ...,
+    excludelist: None | Sequence[str] = ...,
+    deletechars: str = ...,
+    replace_space: str = ...,
+    autostrip: bool = ...,
+    case_sensitive: bool | L['upper', 'lower'] = ...,
+    defaultfmt: str = ...,
+    unpack: None | bool = ...,
+    usemask: bool = ...,
+    loose: bool = ...,
+    invalid_raise: bool = ...,
+    max_rows: None | int = ...,
+    encoding: str = ...,
+    *,
+    ndmin: L[0, 1, 2] = ...,
+    like: None | _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
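Note on `ndmin` (typed as `L[0, 1, 2]` in the `genfromtxt` overloads above): the `loadtxt` code removed in this commit squeezes the result when it has more than `ndmin` dimensions and then pads it back up with `np.atleast_1d` or `np.atleast_2d(...).T`. Assuming `_ensure_ndmin_ndarray` keeps that behaviour, the effect on the result shape can be sketched without NumPy. `ndmin_shape` is a hypothetical helper that operates on shape tuples only; it is not part of NumPy.

```python
def ndmin_shape(shape, ndmin):
    """Return the shape an array of `shape` would end up with.

    Models: squeeze if there are more than `ndmin` dimensions, then pad
    up to `ndmin` (for ndmin=2 the padding is followed by a transpose).
    """
    if ndmin not in (0, 1, 2):
        raise ValueError(f"Illegal value of ndmin keyword: {ndmin}")

    shape = tuple(shape)
    if len(shape) > ndmin:
        # np.squeeze drops every axis of length 1.
        shape = tuple(n for n in shape if n != 1)

    if len(shape) < ndmin:
        if ndmin == 1:
            # Only a 0-d result can get here; atleast_1d gives (1,).
            shape = (1,)
        elif ndmin == 2:
            # atleast_2d: () -> (1, 1) and (n,) -> (1, n); then transpose.
            shape = (1, 1) if len(shape) == 0 else (shape[0], 1)
    return shape


if __name__ == "__main__":
    for shp, nd in [((1, 3), 0), ((1, 3), 2), ((3,), 2), ((1, 1), 1)]:
        print(shp, nd, ndmin_shape(shp, nd))
```

The order matters: squeezing first and padding second is what lets a single-element result with `ndmin=1` come back as a one-element 1-D array rather than a 0-d one.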
diff --git a/numpy/lib/polynomial.py b/numpy/lib/polynomial.py

index f824c4c5e2c1ee435feca9b0605f962af3531f29..6aa7088616b26171315bf4059e06d8975f4a5a36 100644 (file)
--- a/numpy/lib/polynomial.py
+++ b/numpy/lib/polynomial.py
@@ -686,7 +686,7 @@ def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False):
                                   "to scale the covariance matrix")
              # note, this used to be: fac = resids / (len(x) - order - 2.0)
              # it was decided that the "- 2" (originally justified by "Bayesian
-            # uncertainty analysis") is not was the user expects
+            # uncertainty analysis") is not what the user expects
              # (see gh-11196 and gh-11197)
              fac = resids / (len(x) - order)
          if y.ndim == 1:
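Note on the scale factor in `polyfit` (numpy/lib/polynomial.py): the comment above records that the covariance is scaled by `resids / (len(x) - order)` rather than the former `resids / (len(x) - order - 2.0)`. In `polyfit`, `order` is `deg + 1`. A small worked example in plain Python, assuming an unweighted degree-1 fit so that `order` is 2; `line_fit_scale` is a hypothetical helper, not part of NumPy.

```python
def line_fit_scale(x, y):
    """Least-squares line fit returning (slope, intercept, fac, old_fac).

    fac     -- residual sum of squares / (n - order), the current scaling
    old_fac -- residual sum of squares / (n - order - 2), the former one
    """
    n = len(x)
    order = 2  # deg + 1 coefficients for a straight line
    if n <= order + 2:
        raise ValueError("need more than order + 2 points for old_fac")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Sum of squared residuals of the fitted line.
    resids = sum((yi - (slope * xi + intercept)) ** 2
                 for xi, yi in zip(x, y))

    fac = resids / (n - order)
    old_fac = resids / (n - order - 2.0)
    return slope, intercept, fac, old_fac


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4]
    ys = [1.0, 3.1, 4.9, 7.2, 8.8]
    print(line_fit_scale(xs, ys))
```

With five points the two denominators are 3 and 1, so the former scaling inflates the covariance by a factor of three; the gap shrinks as the number of points grows.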
diff --git a/numpy/lib/polynomial.pyi b/numpy/lib/polynomial.pyi

index 00065f53b227972e362a0d393a967aa1b3f6f527..14bbaf39d24944fb565cf543002ce1a8ba06ffe0 100644 (file)
--- a/numpy/lib/polynomial.pyi
+++ b/numpy/lib/polynomial.pyi
@@ -1,12 +1,10 @@
  from typing import (
      Literal as L,
-    List,
      overload,
      Any,
      SupportsInt,
      SupportsIndex,
      TypeVar,
-    Tuple,
      NoReturn,
  )
  
@@ -25,7 +23,7 @@ from numpy import (
      object_,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      ArrayLike,
      _ArrayLikeBool_co,
@@ -38,8 +36,8 @@ from numpy.typing import (
  
  _T = TypeVar("_T")
  
-_2Tup = Tuple[_T, _T]
-_5Tup = Tuple[
+_2Tup = tuple[_T, _T]
+_5Tup = tuple[
      _T,
      NDArray[float64],
      NDArray[int32],
@@ -47,7 +45,7 @@ _5Tup = Tuple[
      NDArray[float64],
  ]
  
-__all__: List[str]
+__all__: list[str]
  
  def poly(seq_of_zeros: ArrayLike) -> NDArray[floating[Any]]: ...
  
diff --git a/numpy/lib/recfunctions.py b/numpy/lib/recfunctions.py
index a491f612e07532a673faf3b2aee22772391659e7..4a27c32860b4ed3f4db51d085d6c03607002faaf 100644
--- a/numpy/lib/recfunctions.py
+++ b/numpy/lib/recfunctions.py
@@ -105,7 +105,8 @@ def _get_fieldspec(dtype):
  
  def get_names(adtype):
      """
-    Returns the field names of the input datatype as a tuple.
+    Returns the field names of the input datatype as a tuple. Input datatype
+    must have fields otherwise error is raised.
  
      Parameters
      ----------
@@ -115,15 +116,10 @@ def get_names(adtype):
      Examples
      --------
      >>> from numpy.lib import recfunctions as rfn
-    >>> rfn.get_names(np.empty((1,), dtype=int))
-    Traceback (most recent call last):
-        ...
-    AttributeError: 'numpy.ndarray' object has no attribute 'names'
-
-    >>> rfn.get_names(np.empty((1,), dtype=[('A',int), ('B', float)]))
-    Traceback (most recent call last):
-        ...
-    AttributeError: 'numpy.ndarray' object has no attribute 'names'
+    >>> rfn.get_names(np.empty((1,), dtype=[('A', int)]).dtype)
+    ('A',)
+    >>> rfn.get_names(np.empty((1,), dtype=[('A',int), ('B', float)]).dtype)
+    ('A', 'B')
      >>> adtype = np.dtype([('a', int), ('b', [('ba', int), ('bb', int)])])
      >>> rfn.get_names(adtype)
      ('a', ('b', ('ba', 'bb')))
@@ -141,8 +137,9 @@ def get_names(adtype):
  
  def get_names_flat(adtype):
      """
-    Returns the field names of the input datatype as a tuple. Nested structure
-    are flattened beforehand.
+    Returns the field names of the input datatype as a tuple. Input datatype
+    must have fields otherwise error is raised.
+    Nested structure are flattened beforehand.
  
      Parameters
      ----------
@@ -152,14 +149,10 @@ def get_names_flat(adtype):
      Examples
      --------
      >>> from numpy.lib import recfunctions as rfn
-    >>> rfn.get_names_flat(np.empty((1,), dtype=int)) is None
-    Traceback (most recent call last):
-        ...
-    AttributeError: 'numpy.ndarray' object has no attribute 'names'
-    >>> rfn.get_names_flat(np.empty((1,), dtype=[('A',int), ('B', float)]))
-    Traceback (most recent call last):
-        ...
-    AttributeError: 'numpy.ndarray' object has no attribute 'names'
+    >>> rfn.get_names_flat(np.empty((1,), dtype=[('A', int)]).dtype) is None
+    False
+    >>> rfn.get_names_flat(np.empty((1,), dtype=[('A',int), ('B', str)]).dtype)
+    ('A', 'B')
      >>> adtype = np.dtype([('a', int), ('b', [('ba', int), ('bb', int)])])
      >>> rfn.get_names_flat(adtype)
      ('a', 'b', 'ba', 'bb')
@@ -784,7 +777,8 @@ def repack_fields(a, align=False, recurse=False):
  
      This method removes any overlaps and reorders the fields in memory so they
      have increasing byte offsets, and adds or removes padding bytes depending
-    on the `align` option, which behaves like the `align` option to `np.dtype`.
+    on the `align` option, which behaves like the `align` option to
+    `numpy.dtype`.
  
      If `align=False`, this method produces a "packed" memory layout in which
      each field starts at the byte the previous field ended, and any padding
@@ -917,11 +911,12 @@ def structured_to_unstructured(arr, dtype=None, copy=False, casting='unsafe'):
      dtype : dtype, optional
         The dtype of the output unstructured array.
      copy : bool, optional
-        See copy argument to `ndarray.astype`. If true, always return a copy.
-        If false, and `dtype` requirements are satisfied, a view is returned.
+        See copy argument to `numpy.ndarray.astype`. If true, always return a
+        copy. If false, and `dtype` requirements are satisfied, a view is
+        returned.
      casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional
-        See casting argument of `ndarray.astype`. Controls what kind of data
-        casting may occur.
+        See casting argument of `numpy.ndarray.astype`. Controls what kind of
+        data casting may occur.
  
      Returns
      -------
@@ -1020,11 +1015,12 @@ def unstructured_to_structured(arr, dtype=None, names=None, align=False,
      align : boolean, optional
         Whether to create an aligned memory layout.
      copy : bool, optional
-        See copy argument to `ndarray.astype`. If true, always return a copy.
-        If false, and `dtype` requirements are satisfied, a view is returned.
+        See copy argument to `numpy.ndarray.astype`. If true, always return a
+        copy. If false, and `dtype` requirements are satisfied, a view is
+        returned.
      casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional
-        See casting argument of `ndarray.astype`. Controls what kind of data
-        casting may occur.
+        See casting argument of `numpy.ndarray.astype`. Controls what kind of
+        data casting may occur.
  
      Returns
      -------
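A usage note on the `recfunctions` docstring changes above: `get_names` and `get_names_flat` read `.names` from a dtype, which is why the corrected examples pass `arr.dtype` rather than the array itself. A minimal sketch, assuming a NumPy build that ships `numpy.lib.recfunctions`; all variable names are illustrative only.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# A structured array with two top-level fields.
arr = np.empty((1,), dtype=[('A', int), ('B', float)])
names = rfn.get_names(arr.dtype)

# A nested dtype: get_names keeps the nesting, get_names_flat flattens it.
nested = np.dtype([('a', int), ('b', [('ba', int), ('bb', int)])])
nested_names = rfn.get_names(nested)
flat_names = rfn.get_names_flat(nested)

# Passing the array instead of its dtype fails, as the removed
# docstring examples showed.
try:
    rfn.get_names(arr)
except AttributeError:
    passing_array_fails = True
else:
    passing_array_fails = False
```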
diff --git a/numpy/lib/scimath.py b/numpy/lib/scimath.py
index 308f1328bb3511cc0872b58aa37baff62612206d..b7ef0d7109c63cffc7c30f59d97389a4a4a230f7 100644
--- a/numpy/lib/scimath.py
+++ b/numpy/lib/scimath.py
@@ -234,6 +234,15 @@ def sqrt(x):
      >>> np.emath.sqrt([-1,4])
      array([0.+1.j, 2.+0.j])
  
+    Different results are expected because:
+    floating point 0.0 and -0.0 are distinct.
+
+    For more control, explicitly use complex() as follows:
+
+    >>> np.emath.sqrt(complex(-4.0, 0.0))
+    2j
+    >>> np.emath.sqrt(complex(-4.0, -0.0))
+    -2j
      """
      x = _fix_real_lt_zero(x)
      return nx.sqrt(x)
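The docstring addition above relies on the sign of a zero imaginary part selecting the side of the branch cut. A small sketch that reproduces the two documented results and shows that the inputs compare equal while differing in the sign bit; the variable names are illustrative, and the results assume a platform with IEEE 754 signed zeros.

```python
import math
import numpy as np

z_pos = complex(-4.0, 0.0)    # imaginary part is +0.0
z_neg = complex(-4.0, -0.0)   # imaginary part is -0.0

# The two inputs compare equal, yet their imaginary parts differ in sign.
inputs_equal = (z_pos == z_neg)
sign_pos = math.copysign(1.0, z_pos.imag)
sign_neg = math.copysign(1.0, z_neg.imag)

# The square root lands on opposite sides of the branch cut.
root_pos = np.emath.sqrt(z_pos)
root_neg = np.emath.sqrt(z_neg)
```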
diff --git a/numpy/lib/scimath.pyi b/numpy/lib/scimath.pyi
index d0d4af41eb0cf039ce0877bcd3b047d97e72328e..589feb15f8ff38bc5003928f6d934454c8e2a94d 100644
--- a/numpy/lib/scimath.pyi
+++ b/numpy/lib/scimath.pyi
@@ -1,13 +1,94 @@
-from typing import List
-
-__all__: List[str]
-
-def sqrt(x): ...
-def log(x): ...
-def log10(x): ...
-def logn(n, x): ...
-def log2(x): ...
-def power(x, p): ...
-def arccos(x): ...
-def arcsin(x): ...
-def arctanh(x): ...
+from typing import overload, Any
+
+from numpy import complexfloating
+
+from numpy._typing import (
+    NDArray,
+    _ArrayLikeFloat_co,
+    _ArrayLikeComplex_co,
+    _ComplexLike_co,
+    _FloatLike_co,
+)
+
+__all__: list[str]
+
+@overload
+def sqrt(x: _FloatLike_co) -> Any: ...
+@overload
+def sqrt(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def sqrt(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def sqrt(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def log(x: _FloatLike_co) -> Any: ...
+@overload
+def log(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def log(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def log(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def log10(x: _FloatLike_co) -> Any: ...
+@overload
+def log10(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def log10(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def log10(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def log2(x: _FloatLike_co) -> Any: ...
+@overload
+def log2(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def log2(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def log2(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def logn(n: _FloatLike_co, x: _FloatLike_co) -> Any: ...
+@overload
+def logn(n: _ComplexLike_co, x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def logn(n: _ArrayLikeFloat_co, x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def logn(n: _ArrayLikeComplex_co, x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def power(x: _FloatLike_co, p: _FloatLike_co) -> Any: ...
+@overload
+def power(x: _ComplexLike_co, p: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def power(x: _ArrayLikeFloat_co, p: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def power(x: _ArrayLikeComplex_co, p: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def arccos(x: _FloatLike_co) -> Any: ...
+@overload
+def arccos(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def arccos(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def arccos(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def arcsin(x: _FloatLike_co) -> Any: ...
+@overload
+def arcsin(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def arcsin(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def arcsin(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
+
+@overload
+def arctanh(x: _FloatLike_co) -> Any: ...
+@overload
+def arctanh(x: _ComplexLike_co) -> complexfloating[Any, Any]: ...
+@overload
+def arctanh(x: _ArrayLikeFloat_co) -> NDArray[Any]: ...
+@overload
+def arctanh(x: _ArrayLikeComplex_co) -> NDArray[complexfloating[Any, Any]]: ...
diff --git a/numpy/lib/shape_base.py b/numpy/lib/shape_base.py
index a3fbee3d5fb34eced0fd7697062b54ef35a04974..ab91423d98948e334dd6e88c14c298c7846399eb 100644
--- a/numpy/lib/shape_base.py
+++ b/numpy/lib/shape_base.py
@@ -885,7 +885,7 @@ def hsplit(ary, indices_or_sections):
  
      Please refer to the `split` documentation.  `hsplit` is equivalent
      to `split` with ``axis=1``, the array is always split along the second
-    axis regardless of the array dimension.
+    axis except for 1-D arrays, where it is split at ``axis=0``.
  
      See Also
      --------
@@ -933,6 +933,12 @@ def hsplit(ary, indices_or_sections):
       array([[[2.,  3.]],
             [[6.,  7.]]])]
  
+    With a 1-D array, the split is along axis 0.
+
+    >>> x = np.array([0, 1, 2, 3, 4, 5])
+    >>> np.hsplit(x, 2)
+    [array([0, 1, 2]), array([3, 4, 5])]
+
      """
      if _nx.ndim(ary) == 0:
          raise ValueError('hsplit only works on arrays of 1 or more dimensions')
@@ -1133,35 +1139,49 @@ def kron(a, b):
      True
  
      """
+    # Working:
+    # 1. Equalise the shapes by prepending smaller array with 1s
+    # 2. Expand shapes of both the arrays by adding new axes at
+    #    odd positions for 1st array and even positions for 2nd
+    # 3. Compute the product of the modified array
+    # 4. The inner most array elements now contain the rows of
+    #    the Kronecker product
+    # 5. Reshape the result to kron's shape, which is same as
+    #    product of shapes of the two arrays.
      b = asanyarray(b)
      a = array(a, copy=False, subok=True, ndmin=b.ndim)
+    is_any_mat = isinstance(a, matrix) or isinstance(b, matrix)
      ndb, nda = b.ndim, a.ndim
+    nd = max(ndb, nda)
+
      if (nda == 0 or ndb == 0):
          return _nx.multiply(a, b)
+
      as_ = a.shape
      bs = b.shape
      if not a.flags.contiguous:
          a = reshape(a, as_)
      if not b.flags.contiguous:
          b = reshape(b, bs)
-    nd = ndb
-    if (ndb != nda):
-        if (ndb > nda):
-            as_ = (1,)*(ndb-nda) + as_
-        else:
-            bs = (1,)*(nda-ndb) + bs
-            nd = nda
-    result = outer(a, b).reshape(as_+bs)
-    axis = nd-1
-    for _ in range(nd):
-        result = concatenate(result, axis=axis)
-    wrapper = get_array_prepare(a, b)
-    if wrapper is not None:
-        result = wrapper(result)
-    wrapper = get_array_wrap(a, b)
-    if wrapper is not None:
-        result = wrapper(result)
-    return result
+
+    # Equalise the shapes by prepending smaller one with 1s
+    as_ = (1,)*max(0, ndb-nda) + as_
+    bs = (1,)*max(0, nda-ndb) + bs
+
+    # Insert empty dimensions
+    a_arr = expand_dims(a, axis=tuple(range(ndb-nda)))
+    b_arr = expand_dims(b, axis=tuple(range(nda-ndb)))
+
+    # Compute the product
+    a_arr = expand_dims(a_arr, axis=tuple(range(1, nd*2, 2)))
+    b_arr = expand_dims(b_arr, axis=tuple(range(0, nd*2, 2)))
+    # In case of `mat`, convert result to `array`
+    result = _nx.multiply(a_arr, b_arr, subok=(not is_any_mat))
+
+    # Reshape back
+    result = result.reshape(_nx.multiply(as_, bs))
+
+    return result if not is_any_mat else matrix(result, copy=False)
  
  
  def _tile_dispatcher(A, reps):
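The "Working" comment in the `kron` hunk above describes a broadcast-based computation. The following is a standalone sketch of those steps for plain `ndarray` inputs only; `kron_sketch` is a hypothetical helper written for illustration, not the library function, and it leaves out the `matrix` handling and contiguity checks that the patched code performs.

```python
import numpy as np

def kron_sketch(a, b):
    """Illustrative Kronecker product via interleaved axes (ndarray only)."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim == 0 or b.ndim == 0:
        return np.multiply(a, b)
    nd = max(a.ndim, b.ndim)

    # Step 1: equalise the number of dimensions by prepending 1s.
    a = a.reshape((1,) * (nd - a.ndim) + a.shape)
    b = b.reshape((1,) * (nd - b.ndim) + b.shape)

    # Step 2: add new axes at odd positions for `a` and even positions
    # for `b`, giving shapes (a0, 1, a1, 1, ...) and (1, b0, 1, b1, ...).
    a_exp = np.expand_dims(a, axis=tuple(range(1, 2 * nd, 2)))
    b_exp = np.expand_dims(b, axis=tuple(range(0, 2 * nd, 2)))

    # Step 3: the broadcast product has shape (a0, b0, a1, b1, ...).
    prod = a_exp * b_exp

    # Steps 4-5: merge each (a_i, b_i) axis pair into one axis of
    # length a_i * b_i.
    out_shape = tuple(int(m * n) for m, n in zip(a.shape, b.shape))
    return prod.reshape(out_shape)

a2 = np.array([[1, 2], [3, 4]])
b2 = np.array([[0, 1], [1, 0]])
same_2d = np.array_equal(kron_sketch(a2, b2), np.kron(a2, b2))

# Mixed dimensionality: the 1-D operand is treated as shape (1, 3).
a1 = np.array([1, 2, 3])
same_mixed = np.array_equal(kron_sketch(a1, b2), np.kron(a1, b2))
```

Compared with the removed `outer` plus repeated `concatenate` loop, this form needs a single broadcast multiply and one reshape.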
diff --git a/numpy/lib/shape_base.pyi b/numpy/lib/shape_base.pyi
index 17016c99928be09d7773d2685092fe5f77040c46..1b718da221e052c2be8727a02cba4443d8146238 100644
--- a/numpy/lib/shape_base.pyi
+++ b/numpy/lib/shape_base.pyi
@@ -1,9 +1,9 @@
-from typing import List, TypeVar, Callable, Sequence, Any, overload, Tuple, SupportsIndex, Protocol
+from collections.abc import Callable, Sequence
+from typing import TypeVar, Any, overload, SupportsIndex, Protocol
  
  from numpy import (
      generic,
      integer,
-    dtype,
      ufunc,
      bool_,
      unsignedinteger,
@@ -13,33 +13,30 @@ from numpy import (
      object_,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      NDArray,
      _ShapeLike,
-     _FiniteNestedSequence,
-     _SupportsArray,
-     _ArrayLikeBool_co,
-     _ArrayLikeUInt_co,
-     _ArrayLikeInt_co,
-     _ArrayLikeFloat_co,
-     _ArrayLikeComplex_co,
-     _ArrayLikeObject_co,
+    _ArrayLike,
+    _ArrayLikeBool_co,
+    _ArrayLikeUInt_co,
+    _ArrayLikeInt_co,
+    _ArrayLikeFloat_co,
+    _ArrayLikeComplex_co,
+    _ArrayLikeObject_co,
  )
  
  from numpy.core.shape_base import vstack
  
  _SCT = TypeVar("_SCT", bound=generic)
  
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
  # The signatures of `__array_wrap__` and `__array_prepare__` are the same;
  # give them unique names for the sake of clarity
  class _ArrayWrap(Protocol):
      def __call__(
          self,
          array: NDArray[Any],
-        context: None | Tuple[ufunc, Tuple[Any, ...], int] = ...,
+        context: None | tuple[ufunc, tuple[Any, ...], int] = ...,
          /,
      ) -> Any: ...
  
@@ -47,7 +44,7 @@ class _ArrayPrepare(Protocol):
      def __call__(
          self,
          array: NDArray[Any],
-        context: None | Tuple[ufunc, Tuple[Any, ...], int] = ...,
+        context: None | tuple[ufunc, tuple[Any, ...], int] = ...,
          /,
      ) -> Any: ...
  
@@ -59,7 +56,7 @@ class _SupportsArrayPrepare(Protocol):
      @property
      def __array_prepare__(self) -> _ArrayPrepare: ...
  
-__all__: List[str]
+__all__: list[str]
  
  row_stack = vstack
  
@@ -76,6 +73,8 @@ def put_along_axis(
      axis: None | int,
  ) -> None: ...
  
+# TODO: Use PEP 612 `ParamSpec` once mypy supports `Concatenate`
+# xref python/mypy#8645
  @overload
  def apply_along_axis(
      func1d: Callable[..., _ArrayLike[_SCT]],
@@ -125,59 +124,59 @@ def array_split(
      ary: _ArrayLike[_SCT],
      indices_or_sections: _ShapeLike,
      axis: SupportsIndex = ...,
-) -> List[NDArray[_SCT]]: ...
+) -> list[NDArray[_SCT]]: ...
  @overload
  def array_split(
      ary: ArrayLike,
      indices_or_sections: _ShapeLike,
      axis: SupportsIndex = ...,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
  
  @overload
  def split(
      ary: _ArrayLike[_SCT],
      indices_or_sections: _ShapeLike,
      axis: SupportsIndex = ...,
-) -> List[NDArray[_SCT]]: ...
+) -> list[NDArray[_SCT]]: ...
  @overload
  def split(
      ary: ArrayLike,
      indices_or_sections: _ShapeLike,
      axis: SupportsIndex = ...,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
  
  @overload
  def hsplit(
      ary: _ArrayLike[_SCT],
      indices_or_sections: _ShapeLike,
-) -> List[NDArray[_SCT]]: ...
+) -> list[NDArray[_SCT]]: ...
  @overload
  def hsplit(
      ary: ArrayLike,
      indices_or_sections: _ShapeLike,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
  
  @overload
  def vsplit(
      ary: _ArrayLike[_SCT],
      indices_or_sections: _ShapeLike,
-) -> List[NDArray[_SCT]]: ...
+) -> list[NDArray[_SCT]]: ...
  @overload
  def vsplit(
      ary: ArrayLike,
      indices_or_sections: _ShapeLike,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
  
  @overload
  def dsplit(
      ary: _ArrayLike[_SCT],
      indices_or_sections: _ShapeLike,
-) -> List[NDArray[_SCT]]: ...
+) -> list[NDArray[_SCT]]: ...
  @overload
  def dsplit(
      ary: ArrayLike,
      indices_or_sections: _ShapeLike,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
  
  @overload
  def get_array_prepare(*args: _SupportsArrayPrepare) -> _ArrayPrepare: ...
diff --git a/numpy/lib/stride_tricks.py b/numpy/lib/stride_tricks.py
index 5093993a9e925d38aa9eb3356ea6e7dd72c59b0d..6794ad557a2e309e3ca7e652e0c5cc093d34e615 100644
--- a/numpy/lib/stride_tricks.py
+++ b/numpy/lib/stride_tricks.py
@@ -86,6 +86,7 @@ def as_strided(x, shape=None, strides=None, subok=False, writeable=True):
      Vectorized write operations on such arrays will typically be
      unpredictable. They may even give different results for small, large,
      or transposed arrays.
+
      Since writing to these arrays has to be tested and done with great
      care, you may want to use ``writeable=False`` to avoid accidental write
      operations.
diff --git a/numpy/lib/stride_tricks.pyi b/numpy/lib/stride_tricks.pyi
index aad404107433f0e4ea9d7539a21e158274717270..4c9a98e85f7849ad262ca9e8a3d43f548234d8bb 100644
--- a/numpy/lib/stride_tricks.pyi
+++ b/numpy/lib/stride_tricks.pyi
@@ -1,26 +1,25 @@
-from typing import Any, List, Dict, Iterable, TypeVar, overload, SupportsIndex
+from collections.abc import Iterable
+from typing import Any, TypeVar, overload, SupportsIndex
  
-from numpy import dtype, generic
-from numpy.typing import (
+from numpy import generic
+from numpy._typing import (
      NDArray,
      ArrayLike,
      _ShapeLike,
      _Shape,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _ArrayLike
  )
  
  _SCT = TypeVar("_SCT", bound=generic)
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
  
-__all__: List[str]
+__all__: list[str]
  
  class DummyArray:
-    __array_interface__: Dict[str, Any]
+    __array_interface__: dict[str, Any]
      base: None | NDArray[Any]
      def __init__(
          self,
-        interface: Dict[str, Any],
+        interface: dict[str, Any],
          base: None | NDArray[Any] = ...,
      ) -> None: ...
  
@@ -78,4 +77,4 @@ def broadcast_shapes(*args: _ShapeLike) -> _Shape: ...
  def broadcast_arrays(
      *args: ArrayLike,
      subok: bool = ...,
-) -> List[NDArray[Any]]: ...
+) -> list[NDArray[Any]]: ...
diff --git a/numpy/lib/tests/test_arraypad.py b/numpy/lib/tests/test_arraypad.py
index 75db5928b288760a6b0b07ef219697a915384924..ca3c35335a386779e766f4f5eb4db9c543af2f86 100644
--- a/numpy/lib/tests/test_arraypad.py
+++ b/numpy/lib/tests/test_arraypad.py
@@ -474,7 +474,7 @@ class TestStatistic:
  
      @pytest.mark.filterwarnings("ignore:Mean of empty slice:RuntimeWarning")
      @pytest.mark.filterwarnings(
-        "ignore:invalid value encountered in (true_divide|double_scalars):"
+        "ignore:invalid value encountered in (divide|double_scalars):"
          "RuntimeWarning"
      )
      @pytest.mark.parametrize("mode", ["mean", "median"])
diff --git a/numpy/lib/tests/test_arraysetops.py b/numpy/lib/tests/test_arraysetops.py
index 13385cd2409d7d1615c74d8114211a4907b50eac..e64634b6939f3932616de535df32173a8503e399 100644
--- a/numpy/lib/tests/test_arraysetops.py
+++ b/numpy/lib/tests/test_arraysetops.py
@@ -765,3 +765,11 @@ class TestUnique:
          assert_array_equal(uniq[:, inv], data)
          msg = "Unique's return_counts=True failed with axis=1"
          assert_array_equal(cnt, np.array([2, 1, 1]), msg)
+
+    def test_unique_nanequals(self):
+        # issue 20326
+        a = np.array([1, 1, np.nan, np.nan, np.nan])
+        unq = np.unique(a)
+        not_unq = np.unique(a, equal_nan=False)
+        assert_array_equal(unq, np.array([1, np.nan]))
+        assert_array_equal(not_unq, np.array([1, np.nan, np.nan, np.nan]))
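The new `test_unique_nanequals` test above exercises the `equal_nan` keyword of `np.unique`. A short sketch of the two behaviours, assuming a NumPy version that accepts `equal_nan`; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 1, np.nan, np.nan, np.nan])

# Default behaviour: all NaNs are collapsed into a single entry.
collapsed = np.unique(a)

# equal_nan=False: every NaN is treated as distinct and kept.
separate = np.unique(a, equal_nan=False)

n_nan_collapsed = int(np.isnan(collapsed).sum())
n_nan_separate = int(np.isnan(separate).sum())
```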
diff --git a/numpy/lib/tests/test_financial_expired.py b/numpy/lib/tests/test_financial_expired.py
index 70b0cd7909b2fd6bb38880b03d939c61920efa8f..838f999a61e6d8345c8bf348dbafa5619ec420e0 100644
--- a/numpy/lib/tests/test_financial_expired.py
+++ b/numpy/lib/tests/test_financial_expired.py
@@ -3,8 +3,6 @@ import pytest
  import numpy as np
  
  
-@pytest.mark.skipif(sys.version_info[:2] < (3, 7),
-                    reason="requires python 3.7 or higher")
  def test_financial_expired():
      match = 'NEP 32'
      with pytest.warns(DeprecationWarning, match=match):
diff --git a/numpy/lib/tests/test_format.py b/numpy/lib/tests/test_format.py
index 78e67a89b5a30e17be2077b801b1fb5d75cff941..581d067de375143f550db120f8a1afff8ff463e3 100644
--- a/numpy/lib/tests/test_format.py
+++ b/numpy/lib/tests/test_format.py
@@ -283,8 +283,9 @@ from io import BytesIO
  import numpy as np
  from numpy.testing import (
      assert_, assert_array_equal, assert_raises, assert_raises_regex,
-    assert_warns,
+    assert_warns, IS_PYPY,
      )
+from numpy.testing._private.utils import requires_memory
  from numpy.lib import format
  
  
@@ -879,11 +880,13 @@ def test_large_file_support(tmpdir):
  @pytest.mark.skipif(np.dtype(np.intp).itemsize < 8,
                      reason="test requires 64-bit system")
  @pytest.mark.slow
+@requires_memory(free_bytes=2 * 2**30)
  def test_large_archive(tmpdir):
      # Regression test for product of saving arrays with dimensions of array
      # having a product that doesn't fit in int32.  See gh-7598 for details.
+    shape = (2**30, 2)
      try:
-        a = np.empty((2**30, 2), dtype=np.uint8)
+        a = np.empty(shape, dtype=np.uint8)
      except MemoryError:
          pytest.skip("Could not create large file")
  
@@ -892,10 +895,12 @@ def test_large_archive(tmpdir):
      with open(fname, "wb") as f:
          np.savez(f, arr=a)
  
+    del a
+
      with open(fname, "rb") as f:
          new_a = np.load(f)["arr"]
  
-    assert_(a.shape == new_a.shape)
+    assert new_a.shape == shape
  
  
  def test_empty_npz(tmpdir):
@@ -940,6 +945,8 @@ def test_unicode_field_names(tmpdir):
          float, np.dtype({'names': ['c'], 'formats': [np.dtype(int, metadata={})]})
      ]}), False)
      ])
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+        reason="PyPy bug in error formatting")
  def test_metadata_dtype(dt, fail):
      # gh-14142
      arr = np.ones(10, dtype=dt)
diff --git a/numpy/lib/tests/test_function_base.py b/numpy/lib/tests/test_function_base.py
index b67a31b1850ea536dc47137b5e3efef3609a4f35..bdcbef91d8c19e9820a049bcd89ad494d9ffe83b 100644
--- a/numpy/lib/tests/test_function_base.py
+++ b/numpy/lib/tests/test_function_base.py
@@ -305,6 +305,29 @@ class TestAverage:
          assert_almost_equal(y5.mean(0), average(y5, 0))
          assert_almost_equal(y5.mean(1), average(y5, 1))
  
+    @pytest.mark.parametrize(
+        'x, axis, expected_avg, weights, expected_wavg, expected_wsum',
+        [([1, 2, 3], None, [2.0], [3, 4, 1], [1.75], [8.0]),
+         ([[1, 2, 5], [1, 6, 11]], 0, [[1.0, 4.0, 8.0]],
+          [1, 3], [[1.0, 5.0, 9.5]], [[4, 4, 4]])],
+    )
+    def test_basic_keepdims(self, x, axis, expected_avg,
+                            weights, expected_wavg, expected_wsum):
+        avg = np.average(x, axis=axis, keepdims=True)
+        assert avg.shape == np.shape(expected_avg)
+        assert_array_equal(avg, expected_avg)
+
+        wavg = np.average(x, axis=axis, weights=weights, keepdims=True)
+        assert wavg.shape == np.shape(expected_wavg)
+        assert_array_equal(wavg, expected_wavg)
+
+        wavg, wsum = np.average(x, axis=axis, weights=weights, returned=True,
+                                keepdims=True)
+        assert wavg.shape == np.shape(expected_wavg)
+        assert_array_equal(wavg, expected_wavg)
+        assert wsum.shape == np.shape(expected_wsum)
+        assert_array_equal(wsum, expected_wsum)
+
      def test_weights(self):
          y = np.arange(10)
          w = np.arange(10)
@@ -890,6 +913,19 @@ class TestDelete:
          with pytest.raises(IndexError):
              np.delete([0, 1, 2], np.array([], dtype=float))
  
+    def test_single_item_array(self):
+        a_del = delete(self.a, 1)
+        a_del_arr = delete(self.a, np.array([1]))
+        a_del_lst = delete(self.a, [1])
+        a_del_obj = delete(self.a, np.array([1], dtype=object))
+        assert_equal(a_del, a_del_arr, a_del_lst, a_del_obj)
+
+        nd_a_del = delete(self.nd_a, 1, axis=1)
+        nd_a_del_arr = delete(self.nd_a, np.array([1]), axis=1)
+        nd_a_del_lst = delete(self.nd_a, [1], axis=1)
+        nd_a_del_obj = delete(self.nd_a, np.array([1], dtype=object), axis=1)
+        assert_equal(nd_a_del, nd_a_del_arr, nd_a_del_lst, nd_a_del_obj)
+
  
  class TestGradient:
  
@@ -1229,11 +1265,11 @@ class TestTrimZeros:
          res = trim_zeros(arr)
          assert_array_equal(arr, res)
  
-
      def test_list_to_list(self):
          res = trim_zeros(self.a.tolist())
          assert isinstance(res, list)
  
+
  class TestExtins:
  
      def test_basic(self):
@@ -1746,6 +1782,7 @@ class TestLeaks:
          finally:
              gc.enable()
  
+
  class TestDigitize:
  
      def test_forward(self):
@@ -2326,6 +2363,7 @@ class Test_I0:
          with pytest.raises(TypeError, match="i0 not supported for complex values"):
              res = i0(a)
  
+
  class TestKaiser:
  
      def test_simple(self):
@@ -3461,6 +3499,7 @@ class TestQuantile:
          assert np.isscalar(actual)
          assert_equal(np.quantile(a, 0.5), np.nan)
  
+
  class TestLerp:
      @hypothesis.given(t0=st.floats(allow_nan=False, allow_infinity=False,
                                     min_value=0, max_value=1),
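The `test_basic_keepdims` case added earlier in this file's diff covers the `keepdims` argument of `np.average`. A minimal sketch using the second parametrized input from that test, assuming a NumPy version in which `np.average` accepts `keepdims`; the variable names are illustrative.

```python
import numpy as np

x = [[1, 2, 5], [1, 6, 11]]
weights = [1, 3]

# Unweighted mean along axis 0; keepdims retains the reduced axis.
avg = np.average(x, axis=0, keepdims=True)

# Weighted mean plus the sum of weights, both keeping the reduced axis.
wavg, wsum = np.average(x, axis=0, weights=weights,
                        returned=True, keepdims=True)
```

With `keepdims=True` the results keep shape `(1, 3)`, so they broadcast against the original `(2, 3)` input without an explicit reshape.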
diff --git a/numpy/lib/tests/test_io.py b/numpy/lib/tests/test_io.py
index ce345fa8218deca915d9c1866b8bb715eacaf92a..38a751d1151166a80ae99a4008fd7983d9bb2cba 100644
--- a/numpy/lib/tests/test_io.py
+++ b/numpy/lib/tests/test_io.py
@@ -696,7 +696,7 @@ class TestLoadTxt(LoadTxtBase):
          assert_array_equal(x, a)
  
          d = TextIO()
-        d.write('M 64.0 75.0\nF 25.0 60.0')
+        d.write('M 64 75.0\nF 25 60.0')
          d.seek(0)
          mydescriptor = {'names': ('gender', 'age', 'weight'),
                          'formats': ('S1', 'i4', 'f4')}
@@ -874,16 +874,27 @@ class TestLoadTxt(LoadTxtBase):
          bogus_idx = 1.5
          assert_raises_regex(
              TypeError,
-            '^usecols must be.*%s' % type(bogus_idx),
+            '^usecols must be.*%s' % type(bogus_idx).__name__,
              np.loadtxt, c, usecols=bogus_idx
              )
  
          assert_raises_regex(
              TypeError,
-            '^usecols must be.*%s' % type(bogus_idx),
+            '^usecols must be.*%s' % type(bogus_idx).__name__,
              np.loadtxt, c, usecols=[0, bogus_idx, 0]
              )
  
+    def test_bad_usecols(self):
+        with pytest.raises(OverflowError):
+            np.loadtxt(["1\n"], usecols=[2**64], delimiter=",")
+        with pytest.raises((ValueError, OverflowError)):
+            # Overflow error on 32bit platforms
+            np.loadtxt(["1\n"], usecols=[2**62], delimiter=",")
+        with pytest.raises(TypeError,
+                match="If a structured dtype .*. But 1 usecols were given and "
+                      "the number of fields is 3."):
+            np.loadtxt(["1,1\n"], dtype="i,(2)i", usecols=[0], delimiter=",")
+
      def test_fancy_dtype(self):
          c = TextIO()
          c.write('1,2,3.0\n4,5,6.0\n')
@@ -922,8 +933,7 @@ class TestLoadTxt(LoadTxtBase):
              assert_array_equal(x, a)
  
      def test_empty_file(self):
-        with suppress_warnings() as sup:
-            sup.filter(message="loadtxt: Empty input file:")
+        with pytest.warns(UserWarning, match="input contained no data"):
              c = TextIO()
              x = np.loadtxt(c)
              assert_equal(x.shape, (0,))
@@ -984,7 +994,8 @@ class TestLoadTxt(LoadTxtBase):
          c.write(inp)
          for dt in [float, np.float32]:
              c.seek(0)
-            res = np.loadtxt(c, dtype=dt)
+            res = np.loadtxt(
+                c, dtype=dt, converters=float.fromhex, encoding="latin1")
              assert_equal(res, tgt, err_msg="%s" % dt)
  
      @pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
@@ -995,9 +1006,8 @@ class TestLoadTxt(LoadTxtBase):
          is not called by default. Regression test related to gh-19598.
          """
          c = TextIO("a b c")
-        with pytest.raises(
-            ValueError, match="could not convert string to float"
-        ):
+        with pytest.raises(ValueError,
+                match=".*convert string 'a' to float64 at row 0, column 1"):
              np.loadtxt(c)
  
      @pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
@@ -1008,9 +1018,8 @@ class TestLoadTxt(LoadTxtBase):
          conversion is correct. Regression test related to gh-19598.
          """
          c = TextIO("qrs tuv")  # Invalid values for default float converter
-        with pytest.raises(
-            ValueError, match="could not convert string to float"
-        ):
+        with pytest.raises(ValueError,
+                match="could not convert string 'qrs' to float64"):
              np.loadtxt(c)
  
      def test_from_complex(self):
@@ -1106,8 +1115,7 @@ class TestLoadTxt(LoadTxtBase):
          assert_(x.shape == (3,))
  
          # Test ndmin kw with empty file.
-        with suppress_warnings() as sup:
-            sup.filter(message="loadtxt: Empty input file:")
+        with pytest.warns(UserWarning, match="input contained no data"):
              f = TextIO()
              assert_(np.loadtxt(f, ndmin=2).shape == (0, 1,))
              assert_(np.loadtxt(f, ndmin=1).shape == (0,))
@@ -1139,8 +1147,8 @@ class TestLoadTxt(LoadTxtBase):
      @pytest.mark.skipif(locale.getpreferredencoding() == 'ANSI_X3.4-1968',
                          reason="Wrong preferred encoding")
      def test_binary_load(self):
-        butf8 = b"5,6,7,\xc3\x95scarscar\n\r15,2,3,hello\n\r"\
-                b"20,2,3,\xc3\x95scar\n\r"
+        butf8 = b"5,6,7,\xc3\x95scarscar\r\n15,2,3,hello\r\n"\
+                b"20,2,3,\xc3\x95scar\r\n"
          sutf8 = butf8.decode("UTF-8").replace("\r", "").splitlines()
          with temppath() as path:
              with open(path, "wb") as f:
@@ -1203,6 +1211,30 @@ class TestLoadTxt(LoadTxtBase):
          a = np.array([[1, 2, 3, 5], [4, 5, 7, 8], [2, 1, 4, 5]], int)
          assert_array_equal(x, a)
  
+    @pytest.mark.parametrize(["skip", "data"], [
+            (1, ["ignored\n", "1,2\n", "\n", "3,4\n"]),
+            # "Bad" lines that do not end in newlines:
+            (1, ["ignored", "1,2", "", "3,4"]),
+            (1, StringIO("ignored\n1,2\n\n3,4")),
+            # Same as above, but do not skip any lines:
+            (0, ["-1,0\n", "1,2\n", "\n", "3,4\n"]),
+            (0, ["-1,0", "1,2", "", "3,4"]),
+            (0, StringIO("-1,0\n1,2\n\n3,4"))])
+    def test_max_rows_empty_lines(self, skip, data):
+        with pytest.warns(UserWarning,
+                    match=f"Input line 3.*max_rows={3-skip}"):
+            res = np.loadtxt(data, dtype=int, skiprows=skip, delimiter=",",
+                             max_rows=3-skip)
+            assert_array_equal(res, [[-1, 0], [1, 2], [3, 4]][skip:])
+
+        if isinstance(data, StringIO):
+            data.seek(0)
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("error", UserWarning)
+            with pytest.raises(UserWarning):
+                np.loadtxt(data, dtype=int, skiprows=skip, delimiter=",",
+                           max_rows=3-skip)
  
  class Testfromregex:
      def test_record(self):
@@ -1432,6 +1464,10 @@ class TestFromTxt(LoadTxtBase):
                              ('F', 25.0, 60.0)], dtype=descriptor)
          assert_equal(test, control)
  
+    def test_bad_fname(self):
+        with pytest.raises(TypeError, match='fname must be a string,'):
+            np.genfromtxt(123)
+
      def test_commented_header(self):
          # Check that names can be retrieved even if the line is commented out.
          data = TextIO("""
@@ -2400,6 +2436,13 @@ M   33  21.99
          assert_equal(test['f1'], 17179869184)
          assert_equal(test['f2'], 1024)
  
+    def test_unpack_float_data(self):
+        txt = TextIO("1,2,3\n4,5,6\n7,8,9\n0.0,1.0,2.0")
+        a, b, c = np.loadtxt(txt, delimiter=",", unpack=True)
+        assert_array_equal(a, np.array([1.0, 4.0, 7.0, 0.0]))
+        assert_array_equal(b, np.array([2.0, 5.0, 8.0, 1.0]))
+        assert_array_equal(c, np.array([3.0, 6.0, 9.0, 2.0]))
+
      def test_unpack_structured(self):
          # Regression test for gh-4341
          # Unpacking should work on structured arrays
@@ -2445,6 +2488,17 @@ M   33  21.99
          assert_equal((), test.shape)
          assert_equal(expected.dtype, test.dtype)
  
+    @pytest.mark.parametrize("ndim", [0, 1, 2])
+    def test_ndmin_keyword(self, ndim: int):
+        # let's have the same behaviour of ndmin as loadtxt
+        # as they should be the same for non-missing values
+        txt = "42"
+
+        a = np.loadtxt(StringIO(txt), ndmin=ndim)
+        b = np.genfromtxt(StringIO(txt), ndmin=ndim)
+
+        assert_array_equal(a, b)
+
  
  class TestPathUsage:
      # Test that pathlib.Path can be used
diff --git a/numpy/lib/tests/test_loadtxt.py b/numpy/lib/tests/test_loadtxt.py

new file mode 100644 (file)

index 0000000..0b8fe3c
--- /dev/null
+++ b/numpy/lib/tests/test_loadtxt.py
@@ -0,0 +1,1013 @@
+"""
+Tests specific to `np.loadtxt` added during the move of loadtxt to be backed
+by C code.
+These tests complement those found in `test_io.py`.
+"""
+
+import sys
+import os
+import pytest
+from tempfile import NamedTemporaryFile, mkstemp
+from io import StringIO
+
+import numpy as np
+from numpy.ma.testutils import assert_equal
+from numpy.testing import assert_array_equal, HAS_REFCOUNT, IS_PYPY
+
+
+def test_scientific_notation():
+    """Test that both 'e' and 'E' are parsed correctly."""
+    data = StringIO(
+        (
+            "1.0e-1,2.0E1,3.0\n"
+            "4.0e-2,5.0E-1,6.0\n"
+            "7.0e-3,8.0E1,9.0\n"
+            "0.0e-4,1.0E-1,2.0"
+        )
+    )
+    expected = np.array(
+        [[0.1, 20., 3.0], [0.04, 0.5, 6], [0.007, 80., 9], [0, 0.1, 2]]
+    )
+    assert_array_equal(np.loadtxt(data, delimiter=","), expected)
+
+
+@pytest.mark.parametrize("comment", ["..", "//", "@-", "this is a comment:"])
+def test_comment_multiple_chars(comment):
+    content = "# IGNORE\n1.5, 2.5# ABC\n3.0,4.0# XXX\n5.5,6.0\n"
+    txt = StringIO(content.replace("#", comment))
+    a = np.loadtxt(txt, delimiter=",", comments=comment)
+    assert_equal(a, [[1.5, 2.5], [3.0, 4.0], [5.5, 6.0]])
+
+
+@pytest.fixture
+def mixed_types_structured():
+    """
+    Fixture providing heterogeneous input data with a structured dtype, along
+    with the associated structured array.
+    """
+    data = StringIO(
+        (
+            "1000;2.4;alpha;-34\n"
+            "2000;3.1;beta;29\n"
+            "3500;9.9;gamma;120\n"
+            "4090;8.1;delta;0\n"
+            "5001;4.4;epsilon;-99\n"
+            "6543;7.8;omega;-1\n"
+        )
+    )
+    dtype = np.dtype(
+        [('f0', np.uint16), ('f1', np.float64), ('f2', 'S7'), ('f3', np.int8)]
+    )
+    expected = np.array(
+        [
+            (1000, 2.4, "alpha", -34),
+            (2000, 3.1, "beta", 29),
+            (3500, 9.9, "gamma", 120),
+            (4090, 8.1, "delta", 0),
+            (5001, 4.4, "epsilon", -99),
+            (6543, 7.8, "omega", -1)
+        ],
+        dtype=dtype
+    )
+    return data, dtype, expected
+
+
+@pytest.mark.parametrize('skiprows', [0, 1, 2, 3])
+def test_structured_dtype_and_skiprows_no_empty_lines(
+        skiprows, mixed_types_structured):
+    data, dtype, expected = mixed_types_structured
+    a = np.loadtxt(data, dtype=dtype, delimiter=";", skiprows=skiprows)
+    assert_array_equal(a, expected[skiprows:])
+
+
+def test_unpack_structured(mixed_types_structured):
+    data, dtype, expected = mixed_types_structured
+
+    a, b, c, d = np.loadtxt(data, dtype=dtype, delimiter=";", unpack=True)
+    assert_array_equal(a, expected["f0"])
+    assert_array_equal(b, expected["f1"])
+    assert_array_equal(c, expected["f2"])
+    assert_array_equal(d, expected["f3"])
+
+
+def test_structured_dtype_with_shape():
+    dtype = np.dtype([("a", "u1", 2), ("b", "u1", 2)])
+    data = StringIO("0,1,2,3\n6,7,8,9\n")
+    expected = np.array([((0, 1), (2, 3)), ((6, 7), (8, 9))], dtype=dtype)
+    assert_array_equal(np.loadtxt(data, delimiter=",", dtype=dtype), expected)
+
+
+def test_structured_dtype_with_multi_shape():
+    dtype = np.dtype([("a", "u1", (2, 2))])
+    data = StringIO("0 1 2 3\n")
+    expected = np.array([(((0, 1), (2, 3)),)], dtype=dtype)
+    assert_array_equal(np.loadtxt(data, dtype=dtype), expected)
+
+
+def test_nested_structured_subarray():
+    # Test from gh-16678
+    point = np.dtype([('x', float), ('y', float)])
+    dt = np.dtype([('code', int), ('points', point, (2,))])
+    data = StringIO("100,1,2,3,4\n200,5,6,7,8\n")
+    expected = np.array(
+        [
+            (100, [(1., 2.), (3., 4.)]),
+            (200, [(5., 6.), (7., 8.)]),
+        ],
+        dtype=dt
+    )
+    assert_array_equal(np.loadtxt(data, dtype=dt, delimiter=","), expected)
+
+
+def test_structured_dtype_offsets():
+    # An aligned structured dtype will have additional padding
+    dt = np.dtype("i1, i4, i1, i4, i1, i4", align=True)
+    data = StringIO("1,2,3,4,5,6\n7,8,9,10,11,12\n")
+    expected = np.array([(1, 2, 3, 4, 5, 6), (7, 8, 9, 10, 11, 12)], dtype=dt)
+    assert_array_equal(np.loadtxt(data, delimiter=",", dtype=dt), expected)
+
+
+@pytest.mark.parametrize("param", ("skiprows", "max_rows"))
+def test_exception_negative_row_limits(param):
+    """skiprows and max_rows should raise for negative parameters."""
+    with pytest.raises(ValueError, match="argument must be nonnegative"):
+        np.loadtxt("foo.bar", **{param: -3})
+
+
+@pytest.mark.parametrize("param", ("skiprows", "max_rows"))
+def test_exception_noninteger_row_limits(param):
+    with pytest.raises(TypeError, match="argument must be an integer"):
+        np.loadtxt("foo.bar", **{param: 1.0})
+
+
+@pytest.mark.parametrize(
+    "data, shape",
+    [
+        ("1 2 3 4 5\n", (1, 5)),  # Single row
+        ("1\n2\n3\n4\n5\n", (5, 1)),  # Single column
+    ]
+)
+def test_ndmin_single_row_or_col(data, shape):
+    arr = np.array([1, 2, 3, 4, 5])
+    arr2d = arr.reshape(shape)
+
+    assert_array_equal(np.loadtxt(StringIO(data), dtype=int), arr)
+    assert_array_equal(np.loadtxt(StringIO(data), dtype=int, ndmin=0), arr)
+    assert_array_equal(np.loadtxt(StringIO(data), dtype=int, ndmin=1), arr)
+    assert_array_equal(np.loadtxt(StringIO(data), dtype=int, ndmin=2), arr2d)
+
+
+@pytest.mark.parametrize("badval", [-1, 3, None, "plate of shrimp"])
+def test_bad_ndmin(badval):
+    with pytest.raises(ValueError, match="Illegal value of ndmin keyword"):
+        np.loadtxt("foo.bar", ndmin=badval)
+
+
+@pytest.mark.parametrize(
+    "ws",
+    (
+            " ",  # space
+            "\t",  # tab
+            "\u2003",  # em
+            "\u00A0",  # non-break
+            "\u3000",  # ideographic space
+    )
+)
+def test_blank_lines_spaces_delimit(ws):
+    txt = StringIO(
+        f"1 2{ws}30\n\n{ws}\n"
+        f"4 5 60{ws}\n  {ws}  \n"
+        f"7 8 {ws} 90\n  # comment\n"
+        f"3 2 1"
+    )
+    # NOTE: It is unclear that the `  # comment` should succeed. Except
+    #       for delimiter=None, which should use any whitespace (and maybe
+    #       should just be implemented closer to Python).
+    expected = np.array([[1, 2, 30], [4, 5, 60], [7, 8, 90], [3, 2, 1]])
+    assert_equal(
+        np.loadtxt(txt, dtype=int, delimiter=None, comments="#"), expected
+    )
+
+
+def test_blank_lines_normal_delimiter():
+    txt = StringIO('1,2,30\n\n4,5,60\n\n7,8,90\n# comment\n3,2,1')
+    expected = np.array([[1, 2, 30], [4, 5, 60], [7, 8, 90], [3, 2, 1]])
+    assert_equal(
+        np.loadtxt(txt, dtype=int, delimiter=',', comments="#"), expected
+    )
+
+
+@pytest.mark.parametrize("dtype", (float, object))
+def test_maxrows_no_blank_lines(dtype):
+    txt = StringIO("1.5,2.5\n3.0,4.0\n5.5,6.0")
+    res = np.loadtxt(txt, dtype=dtype, delimiter=",", max_rows=2)
+    assert_equal(res.dtype, dtype)
+    assert_equal(res, np.array([["1.5", "2.5"], ["3.0", "4.0"]], dtype=dtype))
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+@pytest.mark.parametrize("dtype", (np.dtype("f8"), np.dtype("i2")))
+def test_exception_message_bad_values(dtype):
+    txt = StringIO("1,2\n3,XXX\n5,6")
+    msg = f"could not convert string 'XXX' to {dtype} at row 1, column 2"
+    with pytest.raises(ValueError, match=msg):
+        np.loadtxt(txt, dtype=dtype, delimiter=",")
+
+
+def test_converters_negative_indices():
+    txt = StringIO('1.5,2.5\n3.0,XXX\n5.5,6.0')
+    conv = {-1: lambda s: np.nan if s == 'XXX' else float(s)}
+    expected = np.array([[1.5, 2.5], [3.0, np.nan], [5.5, 6.0]])
+    res = np.loadtxt(
+        txt, dtype=np.float64, delimiter=",", converters=conv, encoding=None
+    )
+    assert_equal(res, expected)
+
+
+def test_converters_negative_indices_with_usecols():
+    txt = StringIO('1.5,2.5,3.5\n3.0,4.0,XXX\n5.5,6.0,7.5\n')
+    conv = {-1: lambda s: np.nan if s == 'XXX' else float(s)}
+    expected = np.array([[1.5, 3.5], [3.0, np.nan], [5.5, 7.5]])
+    res = np.loadtxt(
+        txt,
+        dtype=np.float64,
+        delimiter=",",
+        converters=conv,
+        usecols=[0, -1],
+        encoding=None,
+    )
+    assert_equal(res, expected)
+
+    # Second test with variable number of columns:
+    res = np.loadtxt(StringIO('''0,1,2\n0,1,2,3,4'''), delimiter=",",
+                     usecols=[0, -1], converters={-1: (lambda x: -1)})
+    assert_array_equal(res, [[0, -1], [0, -1]])
+
+def test_ragged_usecols():
+    # usecols, and negative ones, work even with varying number of columns.
+    txt = StringIO("0,0,XXX\n0,XXX,0,XXX\n0,XXX,XXX,0,XXX\n")
+    expected = np.array([[0, 0], [0, 0], [0, 0]])
+    res = np.loadtxt(txt, dtype=float, delimiter=",", usecols=[0, -2])
+    assert_equal(res, expected)
+
+    txt = StringIO("0,0,XXX\n0\n0,XXX,XXX,0,XXX\n")
+    with pytest.raises(ValueError,
+                match="invalid column index -2 at row 2 with 1 columns"):
+        # There is no -2 column in the second row:
+        np.loadtxt(txt, dtype=float, delimiter=",", usecols=[0, -2])
+
+
+def test_empty_usecols():
+    txt = StringIO("0,0,XXX\n0,XXX,0,XXX\n0,XXX,XXX,0,XXX\n")
+    res = np.loadtxt(txt, dtype=np.dtype([]), delimiter=",", usecols=[])
+    assert res.shape == (3,)
+    assert res.dtype == np.dtype([])
+
+
+@pytest.mark.parametrize("c1", ["a", "の", "🫕"])
+@pytest.mark.parametrize("c2", ["a", "の", "🫕"])
+def test_large_unicode_characters(c1, c2):
+    # c1 and c2 span ascii, 16bit and 32bit range.
+    txt = StringIO(f"a,{c1},c,1.0\ne,{c2},2.0,g")
+    res = np.loadtxt(txt, dtype=np.dtype('U12'), delimiter=",")
+    expected = np.array(
+        [f"a,{c1},c,1.0".split(","), f"e,{c2},2.0,g".split(",")],
+        dtype=np.dtype('U12')
+    )
+    assert_equal(res, expected)
+
+
+def test_unicode_with_converter():
+    txt = StringIO("cat,dog\nαβγ,δεζ\nabc,def\n")
+    conv = {0: lambda s: s.upper()}
+    res = np.loadtxt(
+        txt,
+        dtype=np.dtype("U12"),
+        converters=conv,
+        delimiter=",",
+        encoding=None
+    )
+    expected = np.array([['CAT', 'dog'], ['ΑΒΓ', 'δεζ'], ['ABC', 'def']])
+    assert_equal(res, expected)
+
+
+def test_converter_with_structured_dtype():
+    txt = StringIO('1.5,2.5,Abc\n3.0,4.0,dEf\n5.5,6.0,ghI\n')
+    dt = np.dtype([('m', np.int32), ('r', np.float32), ('code', 'U8')])
+    conv = {0: lambda s: int(10*float(s)), -1: lambda s: s.upper()}
+    res = np.loadtxt(txt, dtype=dt, delimiter=",", converters=conv)
+    expected = np.array(
+        [(15, 2.5, 'ABC'), (30, 4.0, 'DEF'), (55, 6.0, 'GHI')], dtype=dt
+    )
+    assert_equal(res, expected)
+
+
+def test_converter_with_unicode_dtype():
+    """
+    With the default 'bytes' encoding, tokens are encoded prior to being
+    passed to the converter. This means that the output of the converter may
+    be bytes instead of unicode as expected by `read_rows`.
+
+    This test checks that outputs from the above scenario are properly decoded
+    prior to parsing by `read_rows`.
+    """
+    txt = StringIO('abc,def\nrst,xyz')
+    conv = bytes.upper
+    res = np.loadtxt(
+            txt, dtype=np.dtype("U3"), converters=conv, delimiter=",")
+    expected = np.array([['ABC', 'DEF'], ['RST', 'XYZ']])
+    assert_equal(res, expected)
+
+
+def test_read_huge_row():
+    row = "1.5, 2.5," * 50000
+    row = row[:-1] + "\n"
+    txt = StringIO(row * 2)
+    res = np.loadtxt(txt, delimiter=",", dtype=float)
+    assert_equal(res, np.tile([1.5, 2.5], (2, 50000)))
+
+
+@pytest.mark.parametrize("dtype", "edfgFDG")
+def test_huge_float(dtype):
+    # Covers a non-optimized path that is rarely taken:
+    field = "0" * 1000 + ".123456789"
+    dtype = np.dtype(dtype)
+    value = np.loadtxt([field], dtype=dtype)[()]
+    assert value == dtype.type("0.123456789")
+
+
+@pytest.mark.parametrize(
+    ("given_dtype", "expected_dtype"),
+    [
+        ("S", np.dtype("S5")),
+        ("U", np.dtype("U5")),
+    ],
+)
+def test_string_no_length_given(given_dtype, expected_dtype):
+    """
+    The given dtype is just 'S' or 'U' with no length. In these cases, the
+    length of the resulting dtype is determined by the longest string found
+    in the file.
+    """
+    txt = StringIO("AAA,5-1\nBBBBB,0-3\nC,4-9\n")
+    res = np.loadtxt(txt, dtype=given_dtype, delimiter=",")
+    expected = np.array(
+        [['AAA', '5-1'], ['BBBBB', '0-3'], ['C', '4-9']], dtype=expected_dtype
+    )
+    assert_equal(res, expected)
+    assert_equal(res.dtype, expected_dtype)
+
+
+def test_float_conversion():
+    """
+    Some tests that the conversion to float64 works as accurately as the
+    Python built-in `float` function. In a naive version of the float parser,
+    these strings resulted in values that were off by an ULP or two.
+    """
+    strings = [
+        '0.9999999999999999',
+        '9876543210.123456',
+        '5.43215432154321e+300',
+        '0.901',
+        '0.333',
+    ]
+    txt = StringIO('\n'.join(strings))
+    res = np.loadtxt(txt)
+    expected = np.array([float(s) for s in strings])
+    assert_equal(res, expected)
+
+
+def test_bool():
+    # Simple test for bool via integer
+    txt = StringIO("1, 0\n10, -1")
+    res = np.loadtxt(txt, dtype=bool, delimiter=",")
+    assert res.dtype == bool
+    assert_array_equal(res, [[True, False], [True, True]])
+    # Make sure we use only 1 and 0 on the byte level:
+    assert_array_equal(res.view(np.uint8), [[1, 0], [1, 1]])
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+@pytest.mark.parametrize("dtype", np.typecodes["AllInteger"])
+@pytest.mark.filterwarnings("error:.*integer via a float.*:DeprecationWarning")
+def test_integer_signs(dtype):
+    dtype = np.dtype(dtype)
+    assert np.loadtxt(["+2"], dtype=dtype) == 2
+    if dtype.kind == "u":
+        with pytest.raises(ValueError):
+            np.loadtxt(["-1\n"], dtype=dtype)
+    else:
+        assert np.loadtxt(["-2\n"], dtype=dtype) == -2
+
+    for sign in ["++", "+-", "--", "-+"]:
+        with pytest.raises(ValueError):
+            np.loadtxt([f"{sign}2\n"], dtype=dtype)
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+@pytest.mark.parametrize("dtype", np.typecodes["AllInteger"])
+@pytest.mark.filterwarnings("error:.*integer via a float.*:DeprecationWarning")
+def test_implicit_cast_float_to_int_fails(dtype):
+    txt = StringIO("1.0, 2.1, 3.7\n4, 5, 6")
+    with pytest.raises(ValueError):
+        np.loadtxt(txt, dtype=dtype, delimiter=",")
+
+@pytest.mark.parametrize("dtype", (np.complex64, np.complex128))
+@pytest.mark.parametrize("with_parens", (False, True))
+def test_complex_parsing(dtype, with_parens):
+    s = "(1.0-2.5j),3.75,(7+-5.0j)\n(4),(-19e2j),(0)"
+    if not with_parens:
+        s = s.replace("(", "").replace(")", "")
+
+    res = np.loadtxt(StringIO(s), dtype=dtype, delimiter=",")
+    expected = np.array(
+        [[1.0-2.5j, 3.75, 7-5j], [4.0, -1900j, 0]], dtype=dtype
+    )
+    assert_equal(res, expected)
+
+
+def test_read_from_generator():
+    def gen():
+        for i in range(4):
+            yield f"{i},{2*i},{i**2}"
+
+    res = np.loadtxt(gen(), dtype=int, delimiter=",")
+    expected = np.array([[0, 0, 0], [1, 2, 1], [2, 4, 4], [3, 6, 9]])
+    assert_equal(res, expected)
+
+
+def test_read_from_generator_multitype():
+    def gen():
+        for i in range(3):
+            yield f"{i} {i / 4}"
+
+    res = np.loadtxt(gen(), dtype="i, d", delimiter=" ")
+    expected = np.array([(0, 0.0), (1, 0.25), (2, 0.5)], dtype="i, d")
+    assert_equal(res, expected)
+
+
+def test_read_from_bad_generator():
+    def gen():
+        for entry in ["1,2", b"3, 5", 12738]:
+            yield entry
+
+    with pytest.raises(
+            TypeError, match=r"non-string returned while reading data"):
+        np.loadtxt(gen(), dtype="i, i", delimiter=",")
+
+
+@pytest.mark.skipif(not HAS_REFCOUNT, reason="Python lacks refcounts")
+def test_object_cleanup_on_read_error():
+    sentinel = object()
+    already_read = 0
+
+    def conv(x):
+        nonlocal already_read
+        if already_read > 4999:
+            raise ValueError("failed half-way through!")
+        already_read += 1
+        return sentinel
+
+    txt = StringIO("x\n" * 10000)
+
+    with pytest.raises(ValueError, match="at row 5000, column 1"):
+        np.loadtxt(txt, dtype=object, converters={0: conv})
+
+    assert sys.getrefcount(sentinel) == 2
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+def test_character_not_bytes_compatible():
+    """Test exception when a character cannot be encoded as 'S'."""
+    data = StringIO("–")  # == \u2013
+    with pytest.raises(ValueError):
+        np.loadtxt(data, dtype="S5")
+
+
+@pytest.mark.parametrize("conv", (0, [float], ""))
+def test_invalid_converter(conv):
+    msg = (
+        "converters must be a dictionary mapping columns to converter "
+        "functions or a single callable."
+    )
+    with pytest.raises(TypeError, match=msg):
+        np.loadtxt(StringIO("1 2\n3 4"), converters=conv)
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+def test_converters_dict_raises_non_integer_key():
+    with pytest.raises(TypeError, match="keys of the converters dict"):
+        np.loadtxt(StringIO("1 2\n3 4"), converters={"a": int})
+    with pytest.raises(TypeError, match="keys of the converters dict"):
+        np.loadtxt(StringIO("1 2\n3 4"), converters={"a": int}, usecols=0)
+
+
+@pytest.mark.parametrize("bad_col_ind", (3, -3))
+def test_converters_dict_raises_non_col_key(bad_col_ind):
+    data = StringIO("1 2\n3 4")
+    with pytest.raises(ValueError, match="converter specified for column"):
+        np.loadtxt(data, converters={bad_col_ind: int})
+
+
+def test_converters_dict_raises_val_not_callable():
+    with pytest.raises(TypeError,
+                match="values of the converters dictionary must be callable"):
+        np.loadtxt(StringIO("1 2\n3 4"), converters={0: 1})
+
+
+@pytest.mark.parametrize("q", ('"', "'", "`"))
+def test_quoted_field(q):
+    txt = StringIO(
+        f"{q}alpha, x{q}, 2.5\n{q}beta, y{q}, 4.5\n{q}gamma, z{q}, 5.0\n"
+    )
+    dtype = np.dtype([('f0', 'U8'), ('f1', np.float64)])
+    expected = np.array(
+        [("alpha, x", 2.5), ("beta, y", 4.5), ("gamma, z", 5.0)], dtype=dtype
+    )
+
+    res = np.loadtxt(txt, dtype=dtype, delimiter=",", quotechar=q)
+    assert_array_equal(res, expected)
+
+
+def test_quote_support_default():
+    """Support for quoted fields is disabled by default."""
+    txt = StringIO('"lat,long", 45, 30\n')
+    dtype = np.dtype([('f0', 'U24'), ('f1', np.float64), ('f2', np.float64)])
+
+    with pytest.raises(ValueError, match="the number of columns changed"):
+        np.loadtxt(txt, dtype=dtype, delimiter=",")
+
+    # Enable quoting support with non-None value for quotechar param
+    txt.seek(0)
+    expected = np.array([("lat,long", 45., 30.)], dtype=dtype)
+
+    res = np.loadtxt(txt, dtype=dtype, delimiter=",", quotechar='"')
+    assert_array_equal(res, expected)
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+def test_quotechar_multichar_error():
+    txt = StringIO("1,2\n3,4")
+    msg = r".*must be a single unicode character or None"
+    with pytest.raises(TypeError, match=msg):
+        np.loadtxt(txt, delimiter=",", quotechar="''")
+
+
+def test_comment_multichar_error_with_quote():
+    txt = StringIO("1,2\n3,4")
+    msg = (
+        "when multiple comments or a multi-character comment is given, "
+        "quotes are not supported."
+    )
+    with pytest.raises(ValueError, match=msg):
+        np.loadtxt(txt, delimiter=",", comments="123", quotechar='"')
+    with pytest.raises(ValueError, match=msg):
+        np.loadtxt(txt, delimiter=",", comments=["#", "%"], quotechar='"')
+
+    # A single character string in a tuple is unpacked though:
+    res = np.loadtxt(txt, delimiter=",", comments=("#",), quotechar="'")
+    assert_equal(res, [[1, 2], [3, 4]])
+
+
+def test_structured_dtype_with_quotes():
+    data = StringIO(
+        (
+            "1000;2.4;'alpha';-34\n"
+            "2000;3.1;'beta';29\n"
+            "3500;9.9;'gamma';120\n"
+            "4090;8.1;'delta';0\n"
+            "5001;4.4;'epsilon';-99\n"
+            "6543;7.8;'omega';-1\n"
+        )
+    )
+    dtype = np.dtype(
+        [('f0', np.uint16), ('f1', np.float64), ('f2', 'S7'), ('f3', np.int8)]
+    )
+    expected = np.array(
+        [
+            (1000, 2.4, "alpha", -34),
+            (2000, 3.1, "beta", 29),
+            (3500, 9.9, "gamma", 120),
+            (4090, 8.1, "delta", 0),
+            (5001, 4.4, "epsilon", -99),
+            (6543, 7.8, "omega", -1)
+        ],
+        dtype=dtype
+    )
+    res = np.loadtxt(data, dtype=dtype, delimiter=";", quotechar="'")
+    assert_array_equal(res, expected)
+
+
+def test_quoted_field_is_not_empty():
+    txt = StringIO('1\n\n"4"\n""')
+    expected = np.array(["1", "4", ""], dtype="U1")
+    res = np.loadtxt(txt, delimiter=",", dtype="U1", quotechar='"')
+    assert_equal(res, expected)
+
+def test_quoted_field_is_not_empty_nonstrict():
+    # Same as test_quoted_field_is_not_empty but check that we are not strict
+    # about missing closing quote (this is the `csv.reader` default also)
+    txt = StringIO('1\n\n"4"\n"')
+    expected = np.array(["1", "4", ""], dtype="U1")
+    res = np.loadtxt(txt, delimiter=",", dtype="U1", quotechar='"')
+    assert_equal(res, expected)
+
+def test_consecutive_quotechar_escaped():
+    txt = StringIO('"Hello, my name is ""Monty""!"')
+    expected = np.array('Hello, my name is "Monty"!', dtype="U40")
+    res = np.loadtxt(txt, dtype="U40", delimiter=",", quotechar='"')
+    assert_equal(res, expected)
+
+
+@pytest.mark.parametrize("data", ("", "\n\n\n", "# 1 2 3\n# 4 5 6\n"))
+@pytest.mark.parametrize("ndmin", (0, 1, 2))
+@pytest.mark.parametrize("usecols", [None, (1, 2, 3)])
+def test_warn_on_no_data(data, ndmin, usecols):
+    """Check that a UserWarning is emitted when no data is read from input."""
+    if usecols is not None:
+        expected_shape = (0, 3)
+    elif ndmin == 2:
+        expected_shape = (0, 1)  # guess a single column?!
+    else:
+        expected_shape = (0,)
+
+    txt = StringIO(data)
+    with pytest.warns(UserWarning, match="input contained no data"):
+        res = np.loadtxt(txt, ndmin=ndmin, usecols=usecols)
+    assert res.shape == expected_shape
+
+    with NamedTemporaryFile(mode="w") as fh:
+        fh.write(data)
+        fh.seek(0)
+        with pytest.warns(UserWarning, match="input contained no data"):
+            res = np.loadtxt(txt, ndmin=ndmin, usecols=usecols)
+        assert res.shape == expected_shape
+
+@pytest.mark.parametrize("skiprows", (2, 3))
+def test_warn_on_skipped_data(skiprows):
+    data = "1 2 3\n4 5 6"
+    txt = StringIO(data)
+    with pytest.warns(UserWarning, match="input contained no data"):
+        np.loadtxt(txt, skiprows=skiprows)
+
+
+@pytest.mark.parametrize(["dtype", "value"], [
+        ("i2", 0x0001), ("u2", 0x0001),
+        ("i4", 0x00010203), ("u4", 0x00010203),
+        ("i8", 0x0001020304050607), ("u8", 0x0001020304050607),
+        # The following values are constructed to lead to unique bytes:
+        ("float16", 3.07e-05),
+        ("float32", 9.2557e-41), ("complex64", 9.2557e-41+2.8622554e-29j),
+        ("float64", -1.758571353180402e-24),
+        # Here and below, the repr side-steps a small loss of precision in
+        # complex `str` in PyPy (which is probably fine, as repr works):
+        ("complex128", repr(5.406409232372729e-29-1.758571353180402e-24j)),
+        # Use integer values that fit into double.  Everything else leads to
+        # problems due to longdoubles going via double and decimal strings
+        # causing rounding errors.
+        ("longdouble", 0x01020304050607),
+        ("clongdouble", repr(0x01020304050607 + (0x00121314151617 * 1j))),
+        ("U2", "\U00010203\U000a0b0c")])
+@pytest.mark.parametrize("swap", [True, False])
+def test_byteswapping_and_unaligned(dtype, value, swap):
+    # Try to create "interesting" values within the valid unicode range:
+    dtype = np.dtype(dtype)
+    data = [f"x,{value}\n"]  # repr as PyPy `str` truncates some
+    if swap:
+        dtype = dtype.newbyteorder()
+    full_dt = np.dtype([("a", "S1"), ("b", dtype)], align=False)
+    # The above ensures that the interesting "b" field is unaligned:
+    assert full_dt.fields["b"][1] == 1
+    res = np.loadtxt(data, dtype=full_dt, delimiter=",", encoding=None,
+                     max_rows=1)  # max_rows prevents over-allocation
+    assert res["b"] == dtype.type(value)
+
+
+@pytest.mark.parametrize("dtype",
+        np.typecodes["AllInteger"] + "efdFD" + "?")
+def test_unicode_whitespace_stripping(dtype):
+    # Test that all numeric types (and bool) strip whitespace correctly
+    # \u202F is a narrow no-break space, `\n` is just a whitespace if quoted.
+    # Currently, skip float128 as it did not always support this and has no
+    # "custom" parsing:
+    txt = StringIO(' 3 ,"\u202F2\n"')
+    res = np.loadtxt(txt, dtype=dtype, delimiter=",", quotechar='"')
+    assert_array_equal(res, np.array([3, 2]).astype(dtype))
+
+
+@pytest.mark.parametrize("dtype", "FD")
+def test_unicode_whitespace_stripping_complex(dtype):
+    # Complex has a few extra cases since it has two components and
+    # parentheses
+    line = " 1 , 2+3j , ( 4+5j ), ( 6+-7j )  , 8j , ( 9j ) \n"
+    data = [line, line.replace(" ", "\u202F")]
+    res = np.loadtxt(data, dtype=dtype, delimiter=',')
+    assert_array_equal(res, np.array([[1, 2+3j, 4+5j, 6-7j, 8j, 9j]] * 2))
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+@pytest.mark.parametrize("dtype", "FD")
+@pytest.mark.parametrize("field",
+        ["1 +2j", "1+ 2j", "1+2 j", "1+-+3", "(1j", "(1", "(1+2j", "1+2j)"])
+def test_bad_complex(dtype, field):
+    with pytest.raises(ValueError):
+        np.loadtxt([field + "\n"], dtype=dtype, delimiter=",")
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+@pytest.mark.parametrize("dtype",
+            np.typecodes["AllInteger"] + "efgdFDG" + "?")
+def test_nul_character_error(dtype):
+    # Test that a \0 character is correctly recognized as an error even if
+    # what comes before is valid (not everything gets parsed internally).
+    if dtype.lower() == "g":
+        pytest.xfail("longdouble/clongdouble assignment may misbehave.")
+    with pytest.raises(ValueError):
+        np.loadtxt(["1\000"], dtype=dtype, delimiter=",", quotechar='"')
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+@pytest.mark.parametrize("dtype",
+        np.typecodes["AllInteger"] + "efgdFDG" + "?")
+def test_no_thousands_support(dtype):
+    # Mainly to document behaviour, Python supports thousands like 1_1.
+    # (e and G may end up using different conversion and support it, this is
+    # a bug but happens...)
+    if dtype == "e":
+        pytest.skip("half assignment currently uses Python float converter")
+    if dtype in "eG":
+        pytest.xfail("clongdouble assignment is buggy (uses `complex`?).")
+
+    assert int("1_1") == float("1_1") == complex("1_1") == 11
+    with pytest.raises(ValueError):
+        np.loadtxt(["1_1\n"], dtype=dtype)
+
+
+@pytest.mark.parametrize("data", [
+    ["1,2\n", "2\n,3\n"],
+    ["1,2\n", "2\r,3\n"]])
+def test_bad_newline_in_iterator(data):
+    # In NumPy <=1.22 this was accepted, because newlines were completely
+    # ignored when the input was an iterable.  This could be changed, but right
+    # now, we raise an error.
+    msg = "Found an unquoted embedded newline within a single line"
+    with pytest.raises(ValueError, match=msg):
+        np.loadtxt(data, delimiter=",")
+
+
+@pytest.mark.parametrize("data", [
+    ["1,2\n", "2,3\r\n"],  # a universal newline
+    ["1,2\n", "'2\n',3\n"],  # a quoted newline
+    ["1,2\n", "'2\r',3\n"],
+    ["1,2\n", "'2\r\n',3\n"],
+])
+def test_good_newline_in_iterator(data):
+    # The quoted newlines will be untransformed here, but are just whitespace.
+    res = np.loadtxt(data, delimiter=",", quotechar="'")
+    assert_array_equal(res, [[1., 2.], [2., 3.]])
+
+
+@pytest.mark.parametrize("newline", ["\n", "\r", "\r\n"])
+def test_universal_newlines_quoted(newline):
+    # Check that universal newline support within the tokenizer is not applied
+    # to quoted fields.  (note that lines must end in newline or quoted
+    # fields will not include a newline at all)
+    data = ['1,"2\n"\n', '3,"4\n', '1"\n']
+    data = [row.replace("\n", newline) for row in data]
+    res = np.loadtxt(data, dtype=object, delimiter=",", quotechar='"')
+    assert_array_equal(res, [['1', f'2{newline}'], ['3', f'4{newline}1']])
+
+
+def test_null_character():
+    # Basic tests to check that the NUL character is not special:
+    res = np.loadtxt(["1\0002\0003\n", "4\0005\0006"], delimiter="\000")
+    assert_array_equal(res, [[1, 2, 3], [4, 5, 6]])
+
+    # Also not as part of a field (avoid unicode/arrays as unicode strips \0)
+    res = np.loadtxt(["1\000,2\000,3\n", "4\000,5\000,6"],
+                     delimiter=",", dtype=object)
+    assert res.tolist() == [["1\000", "2\000", "3"], ["4\000", "5\000", "6"]]
+
+
+def test_iterator_fails_getting_next_line():
+    class BadSequence:
+        def __len__(self):
+            return 100
+
+        def __getitem__(self, item):
+            if item == 50:
+                raise RuntimeError("Bad things happened!")
+            return f"{item}, {item+1}"
+
+    with pytest.raises(RuntimeError, match="Bad things happened!"):
+        np.loadtxt(BadSequence(), dtype=int, delimiter=",")
+
+
+class TestCReaderUnitTests:
+    # These are internal tests for paths that should not be possible to hit
+    # unless things go very very wrong somewhere.
+    def test_not_an_filelike(self):
+        with pytest.raises(AttributeError, match=".*read"):
+            np.core._multiarray_umath._load_from_filelike(
+                object(), dtype=np.dtype("i"), filelike=True)
+
+    def test_filelike_read_fails(self):
+        # Can only be reached if loadtxt opens the file, so it is hard to do
+        # via the public interface (although maybe not impossible considering
+        # the current "DataClass" backing).
+        class BadFileLike:
+            counter = 0
+
+            def read(self, size):
+                self.counter += 1
+                if self.counter > 20:
+                    raise RuntimeError("Bad bad bad!")
+                return "1,2,3\n"
+
+        with pytest.raises(RuntimeError, match="Bad bad bad!"):
+            np.core._multiarray_umath._load_from_filelike(
+                BadFileLike(), dtype=np.dtype("i"), filelike=True)
+
+    def test_filelike_bad_read(self):
+        # Can only be reached if loadtxt opens the file, so it is hard to do
+        # via the public interface (although maybe not impossible considering
+        # the current "DataClass" backing).
+
+        class BadFileLike:
+            counter = 0
+
+            def read(self, size):
+                return 1234  # not a string!
+
+        with pytest.raises(TypeError,
+                    match="non-string returned while reading data"):
+            np.core._multiarray_umath._load_from_filelike(
+                BadFileLike(), dtype=np.dtype("i"), filelike=True)
+
+    def test_not_an_iter(self):
+        with pytest.raises(TypeError,
+                    match="error reading from object, expected an iterable"):
+            np.core._multiarray_umath._load_from_filelike(
+                object(), dtype=np.dtype("i"), filelike=False)
+
+    def test_bad_type(self):
+        with pytest.raises(TypeError, match="internal error: dtype must"):
+            np.core._multiarray_umath._load_from_filelike(
+                object(), dtype="i", filelike=False)
+
+    def test_bad_encoding(self):
+        with pytest.raises(TypeError, match="encoding must be a unicode"):
+            np.core._multiarray_umath._load_from_filelike(
+                object(), dtype=np.dtype("i"), filelike=False, encoding=123)
+
+    @pytest.mark.parametrize("newline", ["\r", "\n", "\r\n"])
+    def test_manual_universal_newlines(self, newline):
+        # This is currently not available to users, because we should always
+        # open files with universal newlines enabled (`newline=None`).
+        # (And reading from an iterator uses slightly different code paths.)
+        # We have no real support for `newline="\r"` or `newline="\n"` as the
+        # user cannot specify those options.
+        data = StringIO('0\n1\n"2\n"\n3\n4 #\n'.replace("\n", newline),
+                        newline="")
+
+        res = np.core._multiarray_umath._load_from_filelike(
+            data, dtype=np.dtype("U10"), filelike=True,
+            quote='"', comment="#", skiplines=1)
+        assert_array_equal(res[:, 0], ["1", f"2{newline}", "3", "4 "])
+
+
+def test_delimiter_comment_collision_raises():
+    with pytest.raises(TypeError, match=".*control characters.*incompatible"):
+        np.loadtxt(StringIO("1, 2, 3"), delimiter=",", comments=",")
+
+
+def test_delimiter_quotechar_collision_raises():
+    with pytest.raises(TypeError, match=".*control characters.*incompatible"):
+        np.loadtxt(StringIO("1, 2, 3"), delimiter=",", quotechar=",")
+
+
+def test_comment_quotechar_collision_raises():
+    with pytest.raises(TypeError, match=".*control characters.*incompatible"):
+        np.loadtxt(StringIO("1 2 3"), comments="#", quotechar="#")
+
+
+def test_delimiter_and_multiple_comments_collision_raises():
+    with pytest.raises(
+        TypeError, match="Comment characters.*cannot include the delimiter"
+    ):
+        np.loadtxt(StringIO("1, 2, 3"), delimiter=",", comments=["#", ","])
+
+
+@pytest.mark.parametrize(
+    "ws",
+    (
+        " ",  # space
+        "\t",  # tab
+        "\u2003",  # em space
+        "\u00A0",  # non-breaking space
+        "\u3000",  # ideographic space
+    )
+)
+def test_collision_with_default_delimiter_raises(ws):
+    with pytest.raises(TypeError, match=".*control characters.*incompatible"):
+        np.loadtxt(StringIO(f"1{ws}2{ws}3\n4{ws}5{ws}6\n"), comments=ws)
+    with pytest.raises(TypeError, match=".*control characters.*incompatible"):
+        np.loadtxt(StringIO(f"1{ws}2{ws}3\n4{ws}5{ws}6\n"), quotechar=ws)
+
+
+@pytest.mark.parametrize("nl", ("\n", "\r"))
+def test_control_character_newline_raises(nl):
+    txt = StringIO(f"1{nl}2{nl}3{nl}{nl}4{nl}5{nl}6{nl}{nl}")
+    msg = "control character.*cannot be a newline"
+    with pytest.raises(TypeError, match=msg):
+        np.loadtxt(txt, delimiter=nl)
+    with pytest.raises(TypeError, match=msg):
+        np.loadtxt(txt, comments=nl)
+    with pytest.raises(TypeError, match=msg):
+        np.loadtxt(txt, quotechar=nl)
+
+
+@pytest.mark.parametrize(
+    ("generic_data", "long_datum", "unitless_dtype", "expected_dtype"),
+    [
+        ("2012-03", "2013-01-15", "M8", "M8[D]"),  # Datetimes
+        ("spam-a-lot", "tis_but_a_scratch", "U", "U17"),  # str
+    ],
+)
+@pytest.mark.parametrize("nrows", (10, 50000, 60000))  # lt, eq, gt chunksize
+def test_parametric_unit_discovery(
+    generic_data, long_datum, unitless_dtype, expected_dtype, nrows
+):
+    """Check that the correct unit (e.g. month, day, second) is discovered from
+    the data when a user specifies a unitless datetime."""
+    # Unit should be "D" (days) due to last entry
+    data = [generic_data] * nrows + [long_datum]
+    expected = np.array(data, dtype=expected_dtype)
+
+    # file-like path
+    txt = StringIO("\n".join(data))
+    a = np.loadtxt(txt, dtype=unitless_dtype)
+    assert a.dtype == expected.dtype
+    assert_equal(a, expected)
+
+    # filename path
+    fd, fname = mkstemp()
+    os.close(fd)
+    with open(fname, "w") as fh:
+        fh.write("\n".join(data))
+    a = np.loadtxt(fname, dtype=unitless_dtype)
+    os.remove(fname)
+    assert a.dtype == expected.dtype
+    assert_equal(a, expected)
+
+
+def test_str_dtype_unit_discovery_with_converter():
+    data = ["spam-a-lot"] * 60000 + ["XXXtis_but_a_scratch"]
+    expected = np.array(
+        ["spam-a-lot"] * 60000 + ["tis_but_a_scratch"], dtype="U17"
+    )
+    conv = lambda s: s.strip("XXX")
+
+    # file-like path
+    txt = StringIO("\n".join(data))
+    a = np.loadtxt(txt, dtype="U", converters=conv, encoding=None)
+    assert a.dtype == expected.dtype
+    assert_equal(a, expected)
+
+    # filename path
+    fd, fname = mkstemp()
+    os.close(fd)
+    with open(fname, "w") as fh:
+        fh.write("\n".join(data))
+    a = np.loadtxt(fname, dtype="U", converters=conv, encoding=None)
+    os.remove(fname)
+    assert a.dtype == expected.dtype
+    assert_equal(a, expected)
+
+
+@pytest.mark.skipif(IS_PYPY and sys.implementation.version <= (7, 3, 8),
+                    reason="PyPy bug in error formatting")
+def test_control_character_empty():
+    with pytest.raises(TypeError, match="Text reading control character must"):
+        np.loadtxt(StringIO("1 2 3"), delimiter="")
+    with pytest.raises(TypeError, match="Text reading control character must"):
+        np.loadtxt(StringIO("1 2 3"), quotechar="")
+    with pytest.raises(ValueError, match="comments cannot be an empty string"):
+        np.loadtxt(StringIO("1 2 3"), comments="")
+    with pytest.raises(ValueError, match="comments cannot be an empty string"):
+        np.loadtxt(StringIO("1 2 3"), comments=["#", ""])
+
+
+def test_control_characters_as_bytes():
+    """Byte control characters (comments, delimiter) are supported."""
+    a = np.loadtxt(StringIO("#header\n1,2,3"), comments=b"#", delimiter=b",")
+    assert_equal(a, [1, 2, 3])
diff --git a/numpy/lib/tests/test_recfunctions.py b/numpy/lib/tests/test_recfunctions.py

index 2f3c14df31f0fff85f59231622814484a2b23571..9b2506a7c0fd24f7ae07ce2d0caac8103a7cc616 100644 (file)
--- a/numpy/lib/tests/test_recfunctions.py
+++ b/numpy/lib/tests/test_recfunctions.py
@@ -835,7 +835,6 @@ class TestJoinBy:
          b = np.ones(3, dtype=[('c', 'u1'), ('b', 'f4'), ('a', 'i4')])
          assert_raises(ValueError, join_by, ['a', 'b', 'b'], a, b)
  
-    @pytest.mark.xfail(reason="See comment at gh-9343")
      def test_same_name_different_dtypes_key(self):
          a_dtype = np.dtype([('key', 'S5'), ('value', '<f4')])
          b_dtype = np.dtype([('key', 'S10'), ('value', '<f4')])
diff --git a/numpy/lib/tests/test_shape_base.py b/numpy/lib/tests/test_shape_base.py

index a148e53da68a331905351dc7c7b9f0f792f00ee6..07960627b71e91784fb4aed40dde9e493b0bdd51 100644 (file)
--- a/numpy/lib/tests/test_shape_base.py
+++ b/numpy/lib/tests/test_shape_base.py
@@ -646,15 +646,52 @@ class TestSqueeze:
  class TestKron:
      def test_return_type(self):
          class myarray(np.ndarray):
-            __array_priority__ = 0.0
+            __array_priority__ = 1.0
  
          a = np.ones([2, 2])
          ma = myarray(a.shape, a.dtype, a.data)
          assert_equal(type(kron(a, a)), np.ndarray)
          assert_equal(type(kron(ma, ma)), myarray)
-        assert_equal(type(kron(a, ma)), np.ndarray)
+        assert_equal(type(kron(a, ma)), myarray)
          assert_equal(type(kron(ma, a)), myarray)
  
+    @pytest.mark.parametrize(
+        "array_class", [np.asarray, np.mat]
+    )
+    def test_kron_smoke(self, array_class):
+        a = array_class(np.ones([3, 3]))
+        b = array_class(np.ones([3, 3]))
+        k = array_class(np.ones([9, 9]))
+
+        assert_array_equal(np.kron(a, b), k)
+
+    def test_kron_ma(self):
+        x = np.ma.array([[1, 2], [3, 4]], mask=[[0, 1], [1, 0]])
+        k = np.ma.array(np.diag([1, 4, 4, 16]),
+                mask=~np.array(np.identity(4), dtype=bool))
+
+        assert_array_equal(k, np.kron(x, x))
+
+    @pytest.mark.parametrize(
+        "shape_a,shape_b", [
+            ((1, 1), (1, 1)),
+            ((1, 2, 3), (4, 5, 6)),
+            ((2, 2), (2, 2, 2)),
+            ((1, 0), (1, 1)),
+            ((2, 0, 2), (2, 2)),
+            ((2, 0, 0, 2), (2, 0, 2)),
+        ])
+    def test_kron_shape(self, shape_a, shape_b):
+        a = np.ones(shape_a)
+        b = np.ones(shape_b)
+        normalised_shape_a = (1,) * max(0, len(shape_b)-len(shape_a)) + shape_a
+        normalised_shape_b = (1,) * max(0, len(shape_a)-len(shape_b)) + shape_b
+        expected_shape = np.multiply(normalised_shape_a, normalised_shape_b)
+
+        k = np.kron(a, b)
+        assert np.array_equal(
+                k.shape, expected_shape), "Unexpected shape from kron"
+
  
  class TestTile:
      def test_basic(self):
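The shape rule that `test_kron_shape` checks can be restated without NumPy: the shorter shape is left-padded with ones, then the two shapes are multiplied element by element. The sketch below is only an illustration of that rule; the helper name `kron_shape` is hypothetical and is not part of NumPy or of this commit.

```python
def kron_shape(shape_a, shape_b):
    """Predict the shape of kron(a, b) from the two input shapes.

    Hypothetical helper for illustration; it mirrors the arithmetic in
    test_kron_shape rather than calling into NumPy.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with ones so both have the same length.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    # The result shape is the element-wise product of the padded shapes.
    return tuple(m * n for m, n in zip(padded_a, padded_b))


print(kron_shape((1, 2, 3), (4, 5, 6)))  # (4, 10, 18)
print(kron_shape((2, 2), (2, 2, 2)))     # (2, 4, 4)
print(kron_shape((2, 0, 2), (2, 2)))     # (2, 0, 4)
```

A zero-length axis in either input stays zero in the product, which is why the parametrized cases with empty dimensions are expected to produce empty results rather than errors.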
diff --git a/numpy/lib/tests/test_twodim_base.py b/numpy/lib/tests/test_twodim_base.py

index c1c5a1615d78bd66887061fc5e4240416a553063..141f508fdf7e954e07c06ed043c938731a050440 100644 (file)
--- a/numpy/lib/tests/test_twodim_base.py
+++ b/numpy/lib/tests/test_twodim_base.py
@@ -44,6 +44,12 @@ class TestEye:
          assert_equal(eye(3) == 1,
                       eye(3, dtype=bool))
  
+    def test_uint64(self):
+        # Regression test for gh-9982
+        assert_equal(eye(np.uint64(2), dtype=int), array([[1, 0], [0, 1]]))
+        assert_equal(eye(np.uint64(2), M=np.uint64(4), k=np.uint64(1)),
+                     array([[0, 1, 0, 0], [0, 0, 1, 0]]))
+
      def test_diag(self):
          assert_equal(eye(4, k=1),
                       array([[0, 1, 0, 0],
@@ -382,7 +388,7 @@ def test_tril_triu_dtype():
      assert_equal(np.triu(arr).dtype, arr.dtype)
      assert_equal(np.tril(arr).dtype, arr.dtype)
  
-    arr = np.zeros((3,3), dtype='f4,f4')
+    arr = np.zeros((3, 3), dtype='f4,f4')
      assert_equal(np.triu(arr).dtype, arr.dtype)
      assert_equal(np.tril(arr).dtype, arr.dtype)
  
diff --git a/numpy/lib/twodim_base.py b/numpy/lib/twodim_base.py

index 3e5ad31ff0d52f30b6fb083815ef62b34ba25c82..3d47abbfb77f08d0cd54a9b3616bc7643aa05d86 100644 (file)
--- a/numpy/lib/twodim_base.py
+++ b/numpy/lib/twodim_base.py
@@ -2,6 +2,7 @@
  
  """
  import functools
+import operator
  
  from numpy.core.numeric import (
      asanyarray, arange, zeros, greater_equal, multiply, ones,
@@ -214,6 +215,11 @@ def eye(N, M=None, k=0, dtype=float, order='C', *, like=None):
      m = zeros((N, M), dtype=dtype, order=order)
      if k >= M:
          return m
+    # Ensure M and k are integers, so we don't get any surprise casting
+    # results in the expressions `M-k` and `M+1` used below.  This avoids
+    # a problem with inputs with type (for example) np.uint64.
+    M = operator.index(M)
+    k = operator.index(k)
      if k >= 0:
          i = k
      else:
@@ -494,8 +500,8 @@ def triu(m, k=0):
      Upper triangle of an array.
  
      Return a copy of an array with the elements below the `k`-th diagonal
-    zeroed. For arrays with ``ndim`` exceeding 2, `triu` will apply to the final
-    two axes.
+    zeroed. For arrays with ``ndim`` exceeding 2, `triu` will apply to the
+    final two axes.
  
      Please refer to the documentation for `tril` for further details.
  
@@ -804,7 +810,7 @@ def histogram2d(x, y, bins=10, range=None, normed=None, weights=None,
      >>> plt.show()
      """
      from numpy import histogramdd
-    
+
      if len(x) != len(y):
          raise ValueError('x and y must have the same length.')
  
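The `eye` hunk above normalizes `M` and `k` with `operator.index` before doing arithmetic on them. As a minimal standard-library sketch of what that call does: it returns a plain Python `int` for any object that implements `__index__`, and it rejects floats instead of truncating them. The `Count` class below is a hypothetical stand-in for a fixed-width integer scalar such as `np.uint64`; it is not taken from the commit.

```python
import operator


class Count:
    """Hypothetical integer-like type standing in for a fixed-width scalar."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # operator.index() calls this to obtain a lossless Python int.
        return self._value


m = operator.index(Count(4))
k = operator.index(Count(1))

# Both values are now ordinary Python ints, so `m - k` and `m + 1`
# follow Python integer arithmetic with no dtype promotion involved.
print(type(m) is int, m - k, m + 1)  # True 3 5

# Non-integral inputs are rejected rather than silently truncated.
try:
    operator.index(2.5)
    float_rejected = False
except TypeError:
    float_rejected = True
print(float_rejected)  # True
```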
diff --git a/numpy/lib/twodim_base.pyi b/numpy/lib/twodim_base.pyi

index cba503ca355843213c6be3c474a6ea333b054801..120abb7e051a77aaf041fb210c2e8fd14ba395e5 100644 (file)
--- a/numpy/lib/twodim_base.pyi
+++ b/numpy/lib/twodim_base.pyi
@@ -1,18 +1,12 @@
+from collections.abc import Callable, Sequence
  from typing import (
      Any,
-    Callable,
-    List,
-    Sequence,
      overload,
-    Tuple,
-    Type,
      TypeVar,
      Union,
  )
  
  from numpy import (
-    ndarray,
-    dtype,
      generic,
      number,
      bool_,
@@ -28,13 +22,13 @@ from numpy import (
      _OrderCF,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      DTypeLike,
-    _SupportsDType,
+    _DTypeLike,
      ArrayLike,
+    _ArrayLike,
      NDArray,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _SupportsArrayFunc,
      _ArrayLikeInt_co,
      _ArrayLikeFloat_co,
      _ArrayLikeComplex_co,
@@ -50,14 +44,7 @@ _MaskFunc = Callable[
      NDArray[Union[number[Any], bool_, timedelta64, datetime64, object_]],
  ]
  
-_DTypeLike = Union[
-    Type[_SCT],
-    dtype[_SCT],
-    _SupportsDType[dtype[_SCT]],
-]
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def fliplr(m: _ArrayLike[_SCT]) -> NDArray[_SCT]: ...
@@ -77,7 +64,7 @@ def eye(
      dtype: None = ...,
      order: _OrderCF = ...,
      *,
-    like: None | ArrayLike = ...,
+    like: None | _SupportsArrayFunc = ...,
  ) -> NDArray[float64]: ...
  @overload
  def eye(
@@ -87,7 +74,7 @@ def eye(
      dtype: _DTypeLike[_SCT] = ...,
      order: _OrderCF = ...,
      *,
-    like: None | ArrayLike = ...,
+    like: None | _SupportsArrayFunc = ...,
  ) -> NDArray[_SCT]: ...
  @overload
  def eye(
@@ -97,7 +84,7 @@ def eye(
      dtype: DTypeLike = ...,
      order: _OrderCF = ...,
      *,
-    like: None | ArrayLike = ...,
+    like: None | _SupportsArrayFunc = ...,
  ) -> NDArray[Any]: ...
  
  @overload
@@ -117,7 +104,7 @@ def tri(
      k: int = ...,
      dtype: None = ...,
      *,
-    like: None | ArrayLike = ...
+    like: None | _SupportsArrayFunc = ...
  ) -> NDArray[float64]: ...
  @overload
  def tri(
@@ -126,7 +113,7 @@ def tri(
      k: int = ...,
      dtype: _DTypeLike[_SCT] = ...,
      *,
-    like: None | ArrayLike = ...
+    like: None | _SupportsArrayFunc = ...
  ) -> NDArray[_SCT]: ...
  @overload
  def tri(
@@ -135,7 +122,7 @@ def tri(
      k: int = ...,
      dtype: DTypeLike = ...,
      *,
-    like: None | ArrayLike = ...
+    like: None | _SupportsArrayFunc = ...
  ) -> NDArray[Any]: ...
  
  @overload
@@ -182,7 +169,7 @@ def histogram2d(  # type: ignore[misc]
      normed: None | bool = ...,
      weights: None | _ArrayLikeFloat_co = ...,
      density: None | bool = ...,
-) -> Tuple[
+) -> tuple[
      NDArray[float64],
      NDArray[floating[Any]],
      NDArray[floating[Any]],
@@ -196,7 +183,7 @@ def histogram2d(
      normed: None | bool = ...,
      weights: None | _ArrayLikeFloat_co = ...,
      density: None | bool = ...,
-) -> Tuple[
+) -> tuple[
      NDArray[float64],
      NDArray[complexfloating[Any, Any]],
      NDArray[complexfloating[Any, Any]],
@@ -210,7 +197,7 @@ def histogram2d(
      normed: None | bool = ...,
      weights: None | _ArrayLikeFloat_co = ...,
      density: None | bool = ...,
-) -> Tuple[
+) -> tuple[
      NDArray[float64],
      NDArray[Any],
      NDArray[Any],
@@ -224,32 +211,32 @@ def mask_indices(
      n: int,
      mask_func: _MaskFunc[int],
      k: int = ...,
-) -> Tuple[NDArray[intp], NDArray[intp]]: ...
+) -> tuple[NDArray[intp], NDArray[intp]]: ...
  @overload
  def mask_indices(
      n: int,
      mask_func: _MaskFunc[_T],
      k: _T,
-) -> Tuple[NDArray[intp], NDArray[intp]]: ...
+) -> tuple[NDArray[intp], NDArray[intp]]: ...
  
  def tril_indices(
      n: int,
      k: int = ...,
      m: None | int = ...,
-) -> Tuple[NDArray[int_], NDArray[int_]]: ...
+) -> tuple[NDArray[int_], NDArray[int_]]: ...
  
  def tril_indices_from(
      arr: NDArray[Any],
      k: int = ...,
-) -> Tuple[NDArray[int_], NDArray[int_]]: ...
+) -> tuple[NDArray[int_], NDArray[int_]]: ...
  
  def triu_indices(
      n: int,
      k: int = ...,
      m: None | int = ...,
-) -> Tuple[NDArray[int_], NDArray[int_]]: ...
+) -> tuple[NDArray[int_], NDArray[int_]]: ...
  
  def triu_indices_from(
      arr: NDArray[Any],
      k: int = ...,
-) -> Tuple[NDArray[int_], NDArray[int_]]: ...
+) -> tuple[NDArray[int_], NDArray[int_]]: ...
diff --git a/numpy/lib/type_check.py b/numpy/lib/type_check.py

index 56afd83ce3359da520f0068dc4e9af777b4f30e5..94d525f51da19c54ef5a1ef308fa072ca6e9fd49 100644 (file)
--- a/numpy/lib/type_check.py
+++ b/numpy/lib/type_check.py
@@ -6,7 +6,7 @@ import warnings
  
  __all__ = ['iscomplexobj', 'isrealobj', 'imag', 'iscomplex',
             'isreal', 'nan_to_num', 'real', 'real_if_close',
-           'typename', 'asfarray', 'mintypecode', 'asscalar',
+           'typename', 'asfarray', 'mintypecode',
             'common_type']
  
  import numpy.core.numeric as _nx
@@ -276,22 +276,22 @@ def isreal(x):
      >>> a = np.array([1+1j, 1+0j, 4.5, 3, 2, 2j], dtype=complex)
      >>> np.isreal(a)
      array([False,  True,  True,  True,  True, False])
-    
+
      The function does not work on string arrays.
  
      >>> a = np.array([2j, "a"], dtype="U")
      >>> np.isreal(a)  # Warns about non-elementwise comparison
      False
-    
+
      Returns True for all elements in input array of ``dtype=object`` even if
      any of the elements is complex.
  
      >>> a = np.array([1, "2", 3+4j], dtype=object)
      >>> np.isreal(a)
      array([ True,  True,  True])
-    
+
      isreal should not be used with object arrays
-    
+
      >>> a = np.array([1+2j, 2+1j], dtype=object)
      >>> np.isreal(a)
      array([ True,  True])
@@ -405,14 +405,14 @@ def _nan_to_num_dispatcher(x, copy=None, nan=None, posinf=None, neginf=None):
  def nan_to_num(x, copy=True, nan=0.0, posinf=None, neginf=None):
      """
      Replace NaN with zero and infinity with large finite numbers (default
-    behaviour) or with the numbers defined by the user using the `nan`, 
+    behaviour) or with the numbers defined by the user using the `nan`,
      `posinf` and/or `neginf` keywords.
  
      If `x` is inexact, NaN is replaced by zero or by the user defined value in
-    `nan` keyword, infinity is replaced by the largest finite floating point 
-    values representable by ``x.dtype`` or by the user defined value in 
-    `posinf` keyword and -infinity is replaced by the most negative finite 
-    floating point values representable by ``x.dtype`` or by the user defined 
+    `nan` keyword, infinity is replaced by the largest finite floating point
+    values representable by ``x.dtype`` or by the user defined value in
+    `posinf` keyword and -infinity is replaced by the most negative finite
+    floating point values representable by ``x.dtype`` or by the user defined
      value in `neginf` keyword.
  
      For complex dtypes, the above is applied to each of the real and
@@ -429,27 +429,27 @@ def nan_to_num(x, copy=True, nan=0.0, posinf=None, neginf=None):
          in-place (False). The in-place operation only occurs if
          casting to an array does not require a copy.
          Default is True.
-        
+
          .. versionadded:: 1.13
      nan : int, float, optional
-        Value to be used to fill NaN values. If no value is passed 
+        Value to be used to fill NaN values. If no value is passed
          then NaN values will be replaced with 0.0.
-        
+
          .. versionadded:: 1.17
      posinf : int, float, optional
-        Value to be used to fill positive infinity values. If no value is 
+        Value to be used to fill positive infinity values. If no value is
          passed then positive infinity values will be replaced with a very
          large number.
-        
+
          .. versionadded:: 1.17
      neginf : int, float, optional
-        Value to be used to fill negative infinity values. If no value is 
+        Value to be used to fill negative infinity values. If no value is
          passed then negative infinity values will be replaced with a very
          small (or negative) number.
-        
+
          .. versionadded:: 1.17
  
-        
+
  
      Returns
      -------
@@ -483,7 +483,7 @@ def nan_to_num(x, copy=True, nan=0.0, posinf=None, neginf=None):
      array([ 1.79769313e+308, -1.79769313e+308,  0.00000000e+000, # may vary
             -1.28000000e+002,  1.28000000e+002])
      >>> np.nan_to_num(x, nan=-9999, posinf=33333333, neginf=33333333)
-    array([ 3.3333333e+07,  3.3333333e+07, -9.9990000e+03, 
+    array([ 3.3333333e+07,  3.3333333e+07, -9.9990000e+03,
             -1.2800000e+02,  1.2800000e+02])
      >>> y = np.array([complex(np.inf, np.nan), np.nan, complex(np.nan, np.inf)])
      array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000, # may vary
@@ -529,7 +529,7 @@ def _real_if_close_dispatcher(a, tol=None):
  @array_function_dispatch(_real_if_close_dispatcher)
  def real_if_close(a, tol=100):
      """
-    If input is complex with all imaginary parts close to zero, return 
+    If input is complex with all imaginary parts close to zero, return
      real parts.
  
      "Close to zero" is defined as `tol` * (machine epsilon of the type for
@@ -583,40 +583,6 @@ def real_if_close(a, tol=100):
      return a
  
  
-def _asscalar_dispatcher(a):
-    # 2018-10-10, 1.16
-    warnings.warn('np.asscalar(a) is deprecated since NumPy v1.16, use '
-                  'a.item() instead', DeprecationWarning, stacklevel=3)
-    return (a,)
-
-
-@array_function_dispatch(_asscalar_dispatcher)
-def asscalar(a):
-    """
-    Convert an array of size 1 to its scalar equivalent.
-
-    .. deprecated:: 1.16
-
-        Deprecated, use `numpy.ndarray.item()` instead.
-
-    Parameters
-    ----------
-    a : ndarray
-        Input array of size 1.
-
-    Returns
-    -------
-    out : scalar
-        Scalar representation of `a`. The output data type is the same type
-        returned by the input's `item` method.
-
-    Examples
-    --------
-    >>> np.asscalar(np.array([24]))
-    24
-    """
-    return a.item()
-
  #-----------------------------------------------------------------------------
  
  _namefromtype = {'S1': 'character',
diff --git a/numpy/lib/type_check.pyi b/numpy/lib/type_check.pyi

index 0a55dbf21347757d86032d1bf72e389fd6266ebd..b04da21d44b6bad5cb50ea8abaa091b5a753da11 100644 (file)
--- a/numpy/lib/type_check.pyi
+++ b/numpy/lib/type_check.pyi
@@ -1,11 +1,8 @@
+from collections.abc import Container, Iterable
  from typing import (
      Literal as L,
      Any,
-    Container,
-    Iterable,
-    List,
      overload,
-    Type,
      TypeVar,
      Protocol,
  )
@@ -20,7 +17,7 @@ from numpy import (
      integer,
  )
  
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      DTypeLike,
      NBitBase,
@@ -28,8 +25,7 @@ from numpy.typing import (
      _64Bit,
      _SupportsDType,
      _ScalarLike_co,
-    _FiniteNestedSequence,
-    _SupportsArray,
+    _ArrayLike,
      _DTypeLikeComplex,
  )
  
@@ -39,8 +35,6 @@ _SCT = TypeVar("_SCT", bound=generic)
  _NBit1 = TypeVar("_NBit1", bound=NBitBase)
  _NBit2 = TypeVar("_NBit2", bound=NBitBase)
  
-_ArrayLike = _FiniteNestedSequence[_SupportsArray[dtype[_SCT]]]
-
  class _SupportsReal(Protocol[_T_co]):
      @property
      def real(self) -> _T_co: ...
@@ -49,7 +43,7 @@ class _SupportsImag(Protocol[_T_co]):
      @property
      def imag(self) -> _T_co: ...
  
-__all__: List[str]
+__all__: list[str]
  
  def mintypecode(
      typechars: Iterable[str | ArrayLike],
@@ -62,7 +56,7 @@ def mintypecode(
  @overload
  def asfarray(
      a: object,
-    dtype: None | Type[float] = ...,
+    dtype: None | type[float] = ...,
  ) -> NDArray[float64]: ...
  @overload
  def asfarray(  # type: ignore[misc]
@@ -151,9 +145,6 @@ def real_if_close(
      tol: float = ...,
  ) -> NDArray[Any]: ...
  
-# NOTE: deprecated
-# def asscalar(a): ...
-
  @overload
  def typename(char: L['S1']) -> L['character']: ...
  @overload
@@ -204,28 +195,28 @@ def common_type(  # type: ignore[misc]
      *arrays: _SupportsDType[dtype[
          integer[Any]
      ]]
-) -> Type[floating[_64Bit]]: ...
+) -> type[floating[_64Bit]]: ...
  @overload
  def common_type(  # type: ignore[misc]
      *arrays: _SupportsDType[dtype[
          floating[_NBit1]
      ]]
-) -> Type[floating[_NBit1]]: ...
+) -> type[floating[_NBit1]]: ...
  @overload
  def common_type(  # type: ignore[misc]
      *arrays: _SupportsDType[dtype[
          integer[Any] | floating[_NBit1]
      ]]
-) -> Type[floating[_NBit1 | _64Bit]]: ...
+) -> type[floating[_NBit1 | _64Bit]]: ...
  @overload
  def common_type(  # type: ignore[misc]
      *arrays: _SupportsDType[dtype[
          floating[_NBit1] | complexfloating[_NBit2, _NBit2]
      ]]
-) -> Type[complexfloating[_NBit1 | _NBit2, _NBit1 | _NBit2]]: ...
+) -> type[complexfloating[_NBit1 | _NBit2, _NBit1 | _NBit2]]: ...
  @overload
  def common_type(
      *arrays: _SupportsDType[dtype[
          integer[Any] | floating[_NBit1] | complexfloating[_NBit2, _NBit2]
      ]]
-) -> Type[complexfloating[_64Bit | _NBit1 | _NBit2, _64Bit | _NBit1 | _NBit2]]: ...
+) -> type[complexfloating[_64Bit | _NBit1 | _NBit2, _64Bit | _NBit1 | _NBit2]]: ...
diff --git a/numpy/lib/ufunclike.pyi b/numpy/lib/ufunclike.pyi

index 03f08ebffea3f1dbc725eaeab113324cffdbd4db..82537e2acd953e3ce82541b04cdca0dfba1963b4 100644 (file)
--- a/numpy/lib/ufunclike.pyi
+++ b/numpy/lib/ufunclike.pyi
@@ -1,7 +1,7 @@
-from typing import Any, overload, TypeVar, List, Union
+from typing import Any, overload, TypeVar
  
  from numpy import floating, bool_, object_, ndarray
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      _FloatLike_co,
      _ArrayLikeFloat_co,
@@ -10,7 +10,7 @@ from numpy.typing import (
  
  _ArrayType = TypeVar("_ArrayType", bound=ndarray[Any, Any])
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def fix(  # type: ignore[misc]
@@ -29,7 +29,7 @@ def fix(
  ) -> NDArray[object_]: ...
  @overload
  def fix(
-    x: Union[_ArrayLikeFloat_co, _ArrayLikeObject_co],
+    x: _ArrayLikeFloat_co | _ArrayLikeObject_co,
      out: _ArrayType,
  ) -> _ArrayType: ...
  
diff --git a/numpy/lib/utils.py b/numpy/lib/utils.py

index 1df2ab09b29d25a7f39ddb0ff2ac97032fb897fd..e8f4952d30ebc8e0eed8a2502cd34bf817a328d9 100644 (file)
--- a/numpy/lib/utils.py
+++ b/numpy/lib/utils.py
@@ -25,8 +25,7 @@ def get_include():
  
      Notes
      -----
-    When using ``distutils``, for example in ``setup.py``.
-    ::
+    When using ``distutils``, for example in ``setup.py``::
  
          import numpy as np
          ...
@@ -429,7 +428,7 @@ def _makenamedict(module='numpy'):
      return thedict, dictlist
  
  
-def _info(obj, output=sys.stdout):
+def _info(obj, output=None):
      """Provide information about ndarray obj.
  
      Parameters
@@ -455,6 +454,9 @@ def _info(obj, output=sys.stdout):
      strides = obj.strides
      endian = obj.dtype.byteorder
  
+    if output is None:
+        output = sys.stdout
+
      print("class: ", nm, file=output)
      print("shape: ", obj.shape, file=output)
      print("strides: ", strides, file=output)
@@ -481,7 +483,7 @@ def _info(obj, output=sys.stdout):
  
  
  @set_module('numpy')
-def info(object=None, maxwidth=76, output=sys.stdout, toplevel='numpy'):
+def info(object=None, maxwidth=76, output=None, toplevel='numpy'):
      """
      Get help information for a function, class, or module.
  
@@ -496,7 +498,8 @@ def info(object=None, maxwidth=76, output=sys.stdout, toplevel='numpy'):
          Printing width.
      output : file like object, optional
          File like object that the output is written to, default is
-        ``stdout``.  The object has to be opened in 'w' or 'a' mode.
+        ``None``, in which case ``sys.stdout`` will be used.
+        The object has to be opened in 'w' or 'a' mode.
      toplevel : str, optional
          Start search at this level.
  
@@ -541,6 +544,9 @@ def info(object=None, maxwidth=76, output=sys.stdout, toplevel='numpy'):
      elif hasattr(object, '_ppimport_attr'):
          object = object._ppimport_attr
  
+    if output is None:
+        output = sys.stdout
+
      if object is None:
          info(info)
      elif isinstance(object, ndarray):
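The `output=None` change above moves the lookup of `sys.stdout` from function definition time to call time. The diff does not state its motivation, so the following is only one plausible illustration: a default of `sys.stdout` is captured once when the function is defined, so a later redirection of `sys.stdout` is not seen. The function names `report_early` and `report_late` are hypothetical.

```python
import contextlib
import io
import sys


def report_early(msg, output=sys.stdout):
    # The default was bound when the function was defined.
    print(msg, file=output)


def report_late(msg, output=None):
    # Resolve the default at call time, as the patched `info` does.
    if output is None:
        output = sys.stdout
    print(msg, file=output)


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    report_early("early")  # goes to the stream captured at definition time
    report_late("late")    # goes to the redirected stream

captured = buffer.getvalue()
print(repr(captured))  # 'late\n'
```

Only the call that resolves its default at call time ends up in the buffer.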
diff --git a/numpy/lib/utils.pyi b/numpy/lib/utils.pyi

index f0a8797ad61eba37485edef9011da7f303f485ca..407ce112097f2cb2d836e78486bf3b02bad9bd8c 100644 (file)
--- a/numpy/lib/utils.pyi
+++ b/numpy/lib/utils.pyi
@@ -1,15 +1,9 @@
  from ast import AST
+from collections.abc import Callable, Mapping, Sequence
  from typing import (
      Any,
-    Callable,
-    List,
-    Mapping,
-    Optional,
      overload,
-    Sequence,
-    Tuple,
      TypeVar,
-    Union,
      Protocol,
  )
  
@@ -28,17 +22,17 @@ _FuncType = TypeVar("_FuncType", bound=Callable[..., Any])
  class _SupportsWrite(Protocol[_T_contra]):
      def write(self, s: _T_contra, /) -> Any: ...
  
-__all__: List[str]
+__all__: list[str]
  
  class _Deprecate:
-    old_name: Optional[str]
-    new_name: Optional[str]
-    message: Optional[str]
+    old_name: None | str
+    new_name: None | str
+    message: None | str
      def __init__(
          self,
-        old_name: Optional[str] = ...,
-        new_name: Optional[str] = ...,
-        message: Optional[str] = ...,
+        old_name: None | str = ...,
+        new_name: None | str = ...,
+        message: None | str = ...,
      ) -> None: ...
      # NOTE: `__call__` can in principle take arbitrary `*args` and `**kwargs`,
      # even though they aren't used for anything
@@ -49,47 +43,47 @@ def get_include() -> str: ...
  @overload
  def deprecate(
      *,
-    old_name: Optional[str] = ...,
-    new_name: Optional[str] = ...,
-    message: Optional[str] = ...,
+    old_name: None | str = ...,
+    new_name: None | str = ...,
+    message: None | str = ...,
  ) -> _Deprecate: ...
  @overload
  def deprecate(
      func: _FuncType,
      /,
-    old_name: Optional[str] = ...,
-    new_name: Optional[str] = ...,
-    message: Optional[str] = ...,
+    old_name: None | str = ...,
+    new_name: None | str = ...,
+    message: None | str = ...,
  ) -> _FuncType: ...
  
-def deprecate_with_doc(msg: Optional[str]) -> _Deprecate: ...
+def deprecate_with_doc(msg: None | str) -> _Deprecate: ...
  
  # NOTE: In practice `byte_bounds` can (potentially) take any object
  # implementing the `__array_interface__` protocol. The caveat is
  # that certain keys, marked as optional in the spec, must be present for
  #  `byte_bounds`. This concerns `"strides"` and `"data"`.
-def byte_bounds(a: Union[generic, ndarray[Any, Any]]) -> Tuple[int, int]: ...
+def byte_bounds(a: generic | ndarray[Any, Any]) -> tuple[int, int]: ...
  
-def who(vardict: Optional[Mapping[str, ndarray[Any, Any]]] = ...) -> None: ...
+def who(vardict: None | Mapping[str, ndarray[Any, Any]] = ...) -> None: ...
  
  def info(
      object: object = ...,
      maxwidth: int = ...,
-    output: Optional[_SupportsWrite[str]] = ...,
+    output: None | _SupportsWrite[str] = ...,
      toplevel: str = ...,
  ) -> None: ...
  
  def source(
      object: object,
-    output: Optional[_SupportsWrite[str]] = ...,
+    output: None | _SupportsWrite[str] = ...,
  ) -> None: ...
  
  def lookfor(
      what: str,
-    module: Union[None, str, Sequence[str]] = ...,
+    module: None | str | Sequence[str] = ...,
      import_modules: bool = ...,
      regenerate: bool = ...,
-    output: Optional[_SupportsWrite[str]] =...,
+    output: None | _SupportsWrite[str] =...,
  ) -> None: ...
  
-def safe_eval(source: Union[str, AST]) -> Any: ...
+def safe_eval(source: str | AST) -> Any: ...
diff --git a/numpy/linalg/__init__.pyi b/numpy/linalg/__init__.pyi

index d457f153a02efe36f46348376bcc76be1e9c4fad..d9acd55817325fb703ebddb7db594fac9cab5faf 100644
--- a/numpy/linalg/__init__.pyi
+++ b/numpy/linalg/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import Any, List
-
  from numpy.linalg.linalg import (
      matrix_power as matrix_power,
      solve as solve,
@@ -25,8 +23,8 @@ from numpy.linalg.linalg import (
  
  from numpy._pytesttester import PytestTester
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
  
  class LinAlgError(Exception): ...
diff --git a/numpy/linalg/linalg.py b/numpy/linalg/linalg.py

index 0c27e063175a7edf151cca37d6c7611e532277b0..264bc98b93ce6bb7d30de7526c70e43d10c68921 100644
--- a/numpy/linalg/linalg.py
+++ b/numpy/linalg/linalg.py
@@ -24,7 +24,8 @@ from numpy.core import (
      add, multiply, sqrt, fastCopyAndTranspose, sum, isfinite,
      finfo, errstate, geterrobj, moveaxis, amin, amax, product, abs,
      atleast_2d, intp, asanyarray, object_, matmul,
-    swapaxes, divide, count_nonzero, isnan, sign, argsort, sort
+    swapaxes, divide, count_nonzero, isnan, sign, argsort, sort,
+    reciprocal
  )
  from numpy.core.multiarray import normalize_axis_index
  from numpy.core.overrides import set_module
@@ -299,7 +300,13 @@ def tensorsolve(a, b, axes=None):
      for k in oldshape:
          prod *= k
  
-    a = a.reshape(-1, prod)
+    if a.size != prod ** 2:
+        raise LinAlgError(
+            "Input arrays must satisfy the requirement \
+            prod(a.shape[b.ndim:]) == prod(a.shape[:b.ndim])"
+        )
+
+    a = a.reshape(prod, prod)
      b = b.ravel()
      res = wrap(solve(a, b))
      res.shape = oldshape
@@ -691,8 +698,8 @@ def cholesky(a):
      Returns
      -------
      L : (..., M, M) array_like
-        Upper or lower-triangular Cholesky factor of `a`.  Returns a
-        matrix object if `a` is a matrix object.
+        Lower-triangular Cholesky factor of `a`.  Returns a matrix object if
+        `a` is a matrix object.
  
      Raises
      ------
@@ -893,11 +900,11 @@ def qr(a, mode='reduced'):
             [1, 1],
             [1, 1],
             [2, 1]])
-    >>> b = np.array([1, 0, 2, 1])
+    >>> b = np.array([1, 2, 2, 3])
      >>> q, r = np.linalg.qr(A)
      >>> p = np.dot(q.T, b)
      >>> np.dot(np.linalg.inv(r), p)
-    array([  1.1e-16,   1.0e+00])
+    array([  1.,   1.])
  
      """
      if mode not in ('reduced', 'complete', 'r', 'raw'):
@@ -1472,10 +1479,12 @@ def svd(a, full_matrices=True, compute_uv=True, hermitian=False):
      """
      Singular Value Decomposition.
  
-    When `a` is a 2D array, it is factorized as ``u @ np.diag(s) @ vh
-    = (u * s) @ vh``, where `u` and `vh` are 2D unitary arrays and `s` is a 1D
-    array of `a`'s singular values. When `a` is higher-dimensional, SVD is
-    applied in stacked mode as explained below.
+    When `a` is a 2D array, and ``full_matrices=False``, then it is
+    factorized as ``u @ np.diag(s) @ vh = (u * s) @ vh``, where
+    `u` and the Hermitian transpose of `vh` are 2D arrays with
+    orthonormal columns and `s` is a 1D array of `a`'s singular
+    values. When `a` is higher-dimensional, SVD is applied in
+    stacked mode as explained below.
  
      Parameters
      ----------
@@ -1942,7 +1951,6 @@ def pinv(a, rcond=1e-15, hermitian=False):
      See Also
      --------
      scipy.linalg.pinv : Similar function in SciPy.
-    scipy.linalg.pinv2 : Similar function in SciPy (SVD-based).
      scipy.linalg.pinvh : Compute the (Moore-Penrose) pseudo-inverse of a
                           Hermitian matrix.
  
@@ -2511,9 +2519,11 @@ def norm(x, ord=None, axis=None, keepdims=False):
  
              x = x.ravel(order='K')
              if isComplexType(x.dtype.type):
-                sqnorm = dot(x.real, x.real) + dot(x.imag, x.imag)
+                x_real = x.real
+                x_imag = x.imag
+                sqnorm = x_real.dot(x_real) + x_imag.dot(x_imag)
              else:
-                sqnorm = dot(x, x)
+                sqnorm = x.dot(x)
              ret = sqrt(sqnorm)
              if keepdims:
                  ret = ret.reshape(ndim*[1])
@@ -2553,7 +2563,7 @@ def norm(x, ord=None, axis=None, keepdims=False):
              absx = abs(x)
              absx **= ord
              ret = add.reduce(absx, axis=axis, keepdims=keepdims)
-            ret **= (1 / ord)
+            ret **= reciprocal(ord, dtype=ret.dtype)
              return ret
      elif len(axis) == 2:
          row_axis, col_axis = axis
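
The squareness check added to `tensorsolve` in the diff above (`a.size != prod ** 2`) can be sketched on shapes alone, without NumPy. This is an illustration rather than the library's code: the helper name `is_square_for_tensorsolve` is hypothetical, and it assumes `axes=None`, so no axes are moved before the check.

```python
import math


def is_square_for_tensorsolve(a_shape, b_ndim):
    """Hypothetical helper mirroring the `a.size != prod ** 2` test."""
    # Product of the trailing dimensions, i.e. prod(a.shape[b.ndim:]).
    trailing = math.prod(a_shape[b_ndim:])
    # Total number of elements in `a`.
    size = math.prod(a_shape)
    # `a` can be reshaped to (trailing, trailing) only if this holds.
    return size == trailing ** 2


# Shapes taken from the tests in this commit; `b` has shape a.shape[:2].
accepted = [is_square_for_tensorsolve(shape, 2)
            for shape in [(2, 3, 6), (3, 4, 4, 3), (0, 3, 3, 0)]]
rejected = is_square_for_tensorsolve((4, 6, 8, 2), 2)
```

The zero-size shape `(0, 3, 3, 0)` passes because both sides of the comparison are 0, while `(4, 6, 8, 2)` fails because 384 elements cannot fill a 16 x 16 matrix.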
diff --git a/numpy/linalg/linalg.pyi b/numpy/linalg/linalg.pyi

index a60b9539e8483b3256fe17fe16db4fd91fc4d99c..20cdb708b692d7a41a11d53ed4f77bbd008be181 100644
--- a/numpy/linalg/linalg.pyi
+++ b/numpy/linalg/linalg.pyi
@@ -1,13 +1,11 @@
+from collections.abc import Iterable
  from typing import (
      Literal as L,
-    List,
-    Iterable,
      overload,
      TypeVar,
      Any,
      SupportsIndex,
      SupportsInt,
-    Tuple,
  )
  
  from numpy import (
@@ -21,7 +19,7 @@ from numpy import (
  
  from numpy.linalg import LinAlgError as LinAlgError
  
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      ArrayLike,
      _ArrayLikeInt_co,
@@ -34,10 +32,10 @@ from numpy.typing import (
  _T = TypeVar("_T")
  _ArrayType = TypeVar("_ArrayType", bound=NDArray[Any])
  
-_2Tuple = Tuple[_T, _T]
+_2Tuple = tuple[_T, _T]
  _ModeKind = L["reduced", "complete", "r", "raw"]
  
-__all__: List[str]
+__all__: list[str]
  
  @overload
  def tensorsolve(
@@ -141,17 +139,17 @@ def eig(a: _ArrayLikeComplex_co) -> _2Tuple[NDArray[complexfloating[Any, Any]]]:
  def eigh(
      a: _ArrayLikeInt_co,
      UPLO: L["L", "U", "l", "u"] = ...,
-) -> Tuple[NDArray[float64], NDArray[float64]]: ...
+) -> tuple[NDArray[float64], NDArray[float64]]: ...
  @overload
  def eigh(
      a: _ArrayLikeFloat_co,
      UPLO: L["L", "U", "l", "u"] = ...,
-) -> Tuple[NDArray[floating[Any]], NDArray[floating[Any]]]: ...
+) -> tuple[NDArray[floating[Any]], NDArray[floating[Any]]]: ...
  @overload
  def eigh(
      a: _ArrayLikeComplex_co,
      UPLO: L["L", "U", "l", "u"] = ...,
-) -> Tuple[NDArray[floating[Any]], NDArray[complexfloating[Any, Any]]]: ...
+) -> tuple[NDArray[floating[Any]], NDArray[complexfloating[Any, Any]]]: ...
  
  @overload
  def svd(
@@ -159,7 +157,7 @@ def svd(
      full_matrices: bool = ...,
      compute_uv: L[True] = ...,
      hermitian: bool = ...,
-) -> Tuple[
+) -> tuple[
      NDArray[float64],
      NDArray[float64],
      NDArray[float64],
@@ -170,7 +168,7 @@ def svd(
      full_matrices: bool = ...,
      compute_uv: L[True] = ...,
      hermitian: bool = ...,
-) -> Tuple[
+) -> tuple[
      NDArray[floating[Any]],
      NDArray[floating[Any]],
      NDArray[floating[Any]],
@@ -181,7 +179,7 @@ def svd(
      full_matrices: bool = ...,
      compute_uv: L[True] = ...,
      hermitian: bool = ...,
-) -> Tuple[
+) -> tuple[
      NDArray[complexfloating[Any, Any]],
      NDArray[floating[Any]],
      NDArray[complexfloating[Any, Any]],
@@ -240,21 +238,21 @@ def slogdet(a: _ArrayLikeComplex_co) -> _2Tuple[Any]: ...
  def det(a: _ArrayLikeComplex_co) -> Any: ...
  
  @overload
-def lstsq(a: _ArrayLikeInt_co, b: _ArrayLikeInt_co, rcond: None | float = ...) -> Tuple[
+def lstsq(a: _ArrayLikeInt_co, b: _ArrayLikeInt_co, rcond: None | float = ...) -> tuple[
      NDArray[float64],
      NDArray[float64],
      int32,
      NDArray[float64],
  ]: ...
  @overload
-def lstsq(a: _ArrayLikeFloat_co, b: _ArrayLikeFloat_co, rcond: None | float = ...) -> Tuple[
+def lstsq(a: _ArrayLikeFloat_co, b: _ArrayLikeFloat_co, rcond: None | float = ...) -> tuple[
      NDArray[floating[Any]],
      NDArray[floating[Any]],
      int32,
      NDArray[floating[Any]],
  ]: ...
  @overload
-def lstsq(a: _ArrayLikeComplex_co, b: _ArrayLikeComplex_co, rcond: None | float = ...) -> Tuple[
+def lstsq(a: _ArrayLikeComplex_co, b: _ArrayLikeComplex_co, rcond: None | float = ...) -> tuple[
      NDArray[complexfloating[Any, Any]],
      NDArray[floating[Any]],
      int32,
@@ -272,7 +270,7 @@ def norm(
  def norm(
      x: ArrayLike,
      ord: None | float | L["fro", "nuc"] = ...,
-    axis: SupportsInt | SupportsIndex | Tuple[int, ...] = ...,
+    axis: SupportsInt | SupportsIndex | tuple[int, ...] = ...,
      keepdims: bool = ...,
  ) -> Any: ...
  
diff --git a/numpy/linalg/setup.py b/numpy/linalg/setup.py

index 73c386f1ccbfd833fd08bf9ddf7ea479e81419ea..dc62dff8f3300ce4ef81bcd3b00c15601e0ba742 100644
--- a/numpy/linalg/setup.py
+++ b/numpy/linalg/setup.py
@@ -70,7 +70,7 @@ def configuration(parent_package='', top_path=None):
      # umath_linalg module
      config.add_extension(
          '_umath_linalg',
-        sources=['umath_linalg.c.src', get_lapack_lite_sources],
+        sources=['umath_linalg.cpp', get_lapack_lite_sources],
          depends=['lapack_lite/f2c.h'],
          extra_info=lapack_info,
          extra_cxx_compile_args=NPY_CXX_FLAGS,
diff --git a/numpy/linalg/tests/test_linalg.py b/numpy/linalg/tests/test_linalg.py

index c1ba84a8e674edcde50a03d2c7ac8b593c081c25..ebbd92539c5514245a4ec85f1a77f063cbc36558 100644
--- a/numpy/linalg/tests/test_linalg.py
+++ b/numpy/linalg/tests/test_linalg.py
@@ -1223,6 +1223,14 @@ class _TestNormBase:
      dt = None
      dec = None
  
+    @staticmethod
+    def check_dtype(x, res):
+        if issubclass(x.dtype.type, np.inexact):
+            assert_equal(res.dtype, x.real.dtype)
+        else:
+            # For integer input, don't have to test float precision of output.
+            assert_(issubclass(res.dtype.type, np.floating))
+
  
  class _TestNormGeneral(_TestNormBase):
  
@@ -1239,37 +1247,37 @@ class _TestNormGeneral(_TestNormBase):
  
          all_types = exact_types + inexact_types
  
-        for each_inexact_types in all_types:
-            at = a.astype(each_inexact_types)
+        for each_type in all_types:
+            at = a.astype(each_type)
  
              an = norm(at, -np.inf)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 0.0)
  
              with suppress_warnings() as sup:
                  sup.filter(RuntimeWarning, "divide by zero encountered")
                  an = norm(at, -1)
-                assert_(issubclass(an.dtype.type, np.floating))
+                self.check_dtype(at, an)
                  assert_almost_equal(an, 0.0)
  
              an = norm(at, 0)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 2)
  
              an = norm(at, 1)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 2.0)
  
              an = norm(at, 2)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, an.dtype.type(2.0)**an.dtype.type(1.0/2.0))
  
              an = norm(at, 4)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, an.dtype.type(2.0)**an.dtype.type(1.0/4.0))
  
              an = norm(at, np.inf)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 1.0)
  
      def test_vector(self):
@@ -1402,41 +1410,41 @@ class _TestNorm2D(_TestNormBase):
  
          all_types = exact_types + inexact_types
  
-        for each_inexact_types in all_types:
-            at = a.astype(each_inexact_types)
+        for each_type in all_types:
+            at = a.astype(each_type)
  
              an = norm(at, -np.inf)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 2.0)
  
              with suppress_warnings() as sup:
                  sup.filter(RuntimeWarning, "divide by zero encountered")
                  an = norm(at, -1)
-                assert_(issubclass(an.dtype.type, np.floating))
+                self.check_dtype(at, an)
                  assert_almost_equal(an, 1.0)
  
              an = norm(at, 1)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 2.0)
  
              an = norm(at, 2)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 3.0**(1.0/2.0))
  
              an = norm(at, -2)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 1.0)
  
              an = norm(at, np.inf)
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 2.0)
  
              an = norm(at, 'fro')
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              assert_almost_equal(an, 2.0)
  
              an = norm(at, 'nuc')
-            assert_(issubclass(an.dtype.type, np.floating))
+            self.check_dtype(at, an)
              # Lower bar needed to support low precision floats.
              # They end up being off by 1 in the 7th place.
              np.testing.assert_almost_equal(an, 2.7320508075688772, decimal=6)
@@ -1773,29 +1781,31 @@ class TestQR:
  class TestCholesky:
      # TODO: are there no other tests for cholesky?
  
-    def test_basic_property(self):
+    @pytest.mark.parametrize(
+        'shape', [(1, 1), (2, 2), (3, 3), (50, 50), (3, 10, 10)]
+    )
+    @pytest.mark.parametrize(
+        'dtype', (np.float32, np.float64, np.complex64, np.complex128)
+    )
+    def test_basic_property(self, shape, dtype):
          # Check A = L L^H
-        shapes = [(1, 1), (2, 2), (3, 3), (50, 50), (3, 10, 10)]
-        dtypes = (np.float32, np.float64, np.complex64, np.complex128)
-
-        for shape, dtype in itertools.product(shapes, dtypes):
-            np.random.seed(1)
-            a = np.random.randn(*shape)
-            if np.issubdtype(dtype, np.complexfloating):
-                a = a + 1j*np.random.randn(*shape)
+        np.random.seed(1)
+        a = np.random.randn(*shape)
+        if np.issubdtype(dtype, np.complexfloating):
+            a = a + 1j*np.random.randn(*shape)
  
-            t = list(range(len(shape)))
-            t[-2:] = -1, -2
+        t = list(range(len(shape)))
+        t[-2:] = -1, -2
  
-            a = np.matmul(a.transpose(t).conj(), a)
-            a = np.asarray(a, dtype=dtype)
+        a = np.matmul(a.transpose(t).conj(), a)
+        a = np.asarray(a, dtype=dtype)
  
-            c = np.linalg.cholesky(a)
+        c = np.linalg.cholesky(a)
  
-            b = np.matmul(c, c.transpose(t).conj())
-            assert_allclose(b, a,
-                            err_msg=f'{shape} {dtype}\n{a}\n{c}',
-                            atol=500 * a.shape[0] * np.finfo(dtype).eps)
+        b = np.matmul(c, c.transpose(t).conj())
+        assert_allclose(b, a,
+                        err_msg=f'{shape} {dtype}\n{a}\n{c}',
+                        atol=500 * a.shape[0] * np.finfo(dtype).eps)
  
      def test_0_size(self):
          class ArraySubclass(np.ndarray):
@@ -2103,6 +2113,27 @@ class TestTensorinv:
          assert_allclose(np.tensordot(ainv, b, 1), np.linalg.tensorsolve(a, b))
  
  
+class TestTensorsolve:
+
+    @pytest.mark.parametrize("a, axes", [
+        (np.ones((4, 6, 8, 2)), None),
+        (np.ones((3, 3, 2)), (0, 2)),
+        ])
+    def test_non_square_handling(self, a, axes):
+        with assert_raises(LinAlgError):
+            b = np.ones(a.shape[:2])
+            linalg.tensorsolve(a, b, axes=axes)
+
+    @pytest.mark.parametrize("shape",
+        [(2, 3, 6), (3, 4, 4, 3), (0, 3, 3, 0)],
+    )
+    def test_tensorsolve_result(self, shape):
+        a = np.random.randn(*shape)
+        b = np.ones(a.shape[:2])
+        x = np.linalg.tensorsolve(a, b)
+        assert_allclose(np.tensordot(a, x, axes=len(x.shape)), b)
+
+
  def test_unsupported_commontype():
      # linalg gracefully handles unsupported type
      arr = np.array([[1, -2], [2, 5]], dtype='float16')
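
The property exercised by `TestCholesky.test_basic_property` in the diff above, A = L L^H with L lower-triangular, can be illustrated in plain Python for a small real symmetric positive-definite matrix. This is a sketch of the textbook row-by-row recurrence, not NumPy's LAPACK-backed implementation; `cholesky_lower` and the sample matrix are assumptions made for illustration.

```python
import math


def cholesky_lower(a):
    """Hypothetical helper: lower-triangular L with L @ L.T == a (real SPD a)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the columns of L that are already known.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


a = [[4.0, 2.0],
     [2.0, 3.0]]
L = cholesky_lower(a)

# Rebuild a from its factor; for real input L^H is simply the transpose.
n = len(L)
reconstructed = [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
                 for i in range(n)]
```

Entries above the diagonal of `L` stay zero, which matches the docstring correction in this commit stating that the returned factor is lower-triangular.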
diff --git a/numpy/linalg/umath_linalg.c.src b/numpy/linalg/umath_linalg.c.src

deleted file mode 100644

index f8a1544..0000000
--- a/numpy/linalg/umath_linalg.c.src
+++ /dev/null
@@ -1,4435 +0,0 @@
-/* -*- c -*- */
-
-/*
- *****************************************************************************
- **                            INCLUDES                                     **
- *****************************************************************************
- */
-#define PY_SSIZE_T_CLEAN
-#include <Python.h>
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-#include "numpy/arrayobject.h"
-#include "numpy/ufuncobject.h"
-
-#include "npy_pycompat.h"
-
-#include "npy_config.h"
-
-#include "npy_cblas.h"
-
-#include <stddef.h>
-#include <stdio.h>
-#include <assert.h>
-#include <math.h>
-
-
-static const char* umath_linalg_version_string = "0.1.5";
-
-/*
- ****************************************************************************
- *                        Debugging support                                 *
- ****************************************************************************
- */
-#define TRACE_TXT(...) do { fprintf (stderr, __VA_ARGS__); } while (0)
-#define STACK_TRACE do {} while (0)
-#define TRACE\
-    do {                                        \
-        fprintf (stderr,                        \
-                 "%s:%d:%s\n",                  \
-                 __FILE__,                      \
-                 __LINE__,                      \
-                 __FUNCTION__);                 \
-        STACK_TRACE;                            \
-    } while (0)
-
-#if 0
-#include <execinfo.h>
-void
-dbg_stack_trace()
-{
-    void *trace[32];
-    size_t size;
-
-    size = backtrace(trace, sizeof(trace)/sizeof(trace[0]));
-    backtrace_symbols_fd(trace, size, 1);
-}
-
-#undef STACK_TRACE
-#define STACK_TRACE do { dbg_stack_trace(); } while (0)
-#endif
-
-/*
- *****************************************************************************
- *                    BLAS/LAPACK calling macros                             *
- *****************************************************************************
- */
-
-#define FNAME(x) BLAS_FUNC(x)
-
-typedef CBLAS_INT         fortran_int;
-
-typedef struct { float r, i; } f2c_complex;
-typedef struct { double r, i; } f2c_doublecomplex;
-/* typedef long int (*L_fp)(); */
-
-typedef float             fortran_real;
-typedef double            fortran_doublereal;
-typedef f2c_complex       fortran_complex;
-typedef f2c_doublecomplex fortran_doublecomplex;
-
-extern fortran_int
-FNAME(sgeev)(char *jobvl, char *jobvr, fortran_int *n,
-             float a[], fortran_int *lda, float wr[], float wi[],
-             float vl[], fortran_int *ldvl, float vr[], fortran_int *ldvr,
-             float work[], fortran_int lwork[],
-             fortran_int *info);
-extern fortran_int
-FNAME(dgeev)(char *jobvl, char *jobvr, fortran_int *n,
-             double a[], fortran_int *lda, double wr[], double wi[],
-             double vl[], fortran_int *ldvl, double vr[], fortran_int *ldvr,
-             double work[], fortran_int lwork[],
-             fortran_int *info);
-extern fortran_int
-FNAME(cgeev)(char *jobvl, char *jobvr, fortran_int *n,
-             f2c_doublecomplex a[], fortran_int *lda,
-             f2c_doublecomplex w[],
-             f2c_doublecomplex vl[], fortran_int *ldvl,
-             f2c_doublecomplex vr[], fortran_int *ldvr,
-             f2c_doublecomplex work[], fortran_int *lwork,
-             double rwork[],
-             fortran_int *info);
-extern fortran_int
-FNAME(zgeev)(char *jobvl, char *jobvr, fortran_int *n,
-             f2c_doublecomplex a[], fortran_int *lda,
-             f2c_doublecomplex w[],
-             f2c_doublecomplex vl[], fortran_int *ldvl,
-             f2c_doublecomplex vr[], fortran_int *ldvr,
-             f2c_doublecomplex work[], fortran_int *lwork,
-             double rwork[],
-             fortran_int *info);
-
-extern fortran_int
-FNAME(ssyevd)(char *jobz, char *uplo, fortran_int *n,
-              float a[], fortran_int *lda, float w[], float work[],
-              fortran_int *lwork, fortran_int iwork[], fortran_int *liwork,
-              fortran_int *info);
-extern fortran_int
-FNAME(dsyevd)(char *jobz, char *uplo, fortran_int *n,
-              double a[], fortran_int *lda, double w[], double work[],
-              fortran_int *lwork, fortran_int iwork[], fortran_int *liwork,
-              fortran_int *info);
-extern fortran_int
-FNAME(cheevd)(char *jobz, char *uplo, fortran_int *n,
-              f2c_complex a[], fortran_int *lda,
-              float w[], f2c_complex work[],
-              fortran_int *lwork, float rwork[], fortran_int *lrwork, fortran_int iwork[],
-              fortran_int *liwork,
-              fortran_int *info);
-extern fortran_int
-FNAME(zheevd)(char *jobz, char *uplo, fortran_int *n,
-              f2c_doublecomplex a[], fortran_int *lda,
-              double w[], f2c_doublecomplex work[],
-              fortran_int *lwork, double rwork[], fortran_int *lrwork, fortran_int iwork[],
-              fortran_int *liwork,
-              fortran_int *info);
-
-extern fortran_int
-FNAME(sgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
-              float a[], fortran_int *lda, float b[], fortran_int *ldb,
-              float s[], float *rcond, fortran_int *rank,
-              float work[], fortran_int *lwork, fortran_int iwork[],
-              fortran_int *info);
-extern fortran_int
-FNAME(dgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
-              double a[], fortran_int *lda, double b[], fortran_int *ldb,
-              double s[], double *rcond, fortran_int *rank,
-              double work[], fortran_int *lwork, fortran_int iwork[],
-              fortran_int *info);
-extern fortran_int
-FNAME(cgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
-              f2c_complex a[], fortran_int *lda,
-              f2c_complex b[], fortran_int *ldb,
-              float s[], float *rcond, fortran_int *rank,
-              f2c_complex work[], fortran_int *lwork,
-              float rwork[], fortran_int iwork[],
-              fortran_int *info);
-extern fortran_int
-FNAME(zgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
-              f2c_doublecomplex a[], fortran_int *lda,
-              f2c_doublecomplex b[], fortran_int *ldb,
-              double s[], double *rcond, fortran_int *rank,
-              f2c_doublecomplex work[], fortran_int *lwork,
-              double rwork[], fortran_int iwork[],
-              fortran_int *info);
-
-extern fortran_int
-FNAME(dgeqrf)(fortran_int *m, fortran_int *n, double a[], fortran_int *lda,
-              double tau[], double work[],
-              fortran_int *lwork, fortran_int *info);
-extern fortran_int
-FNAME(zgeqrf)(fortran_int *m, fortran_int *n, f2c_doublecomplex a[], fortran_int *lda,
-              f2c_doublecomplex tau[], f2c_doublecomplex work[],
-              fortran_int *lwork, fortran_int *info);
-
-extern fortran_int
-FNAME(dorgqr)(fortran_int *m, fortran_int *n, fortran_int *k, double a[], fortran_int *lda,
-              double tau[], double work[],
-              fortran_int *lwork, fortran_int *info);
-extern fortran_int
-FNAME(zungqr)(fortran_int *m, fortran_int *n, fortran_int *k, f2c_doublecomplex a[],
-              fortran_int *lda, f2c_doublecomplex tau[],
-              f2c_doublecomplex work[], fortran_int *lwork, fortran_int *info);
-
-extern fortran_int
-FNAME(sgesv)(fortran_int *n, fortran_int *nrhs,
-             float a[], fortran_int *lda,
-             fortran_int ipiv[],
-             float b[], fortran_int *ldb,
-             fortran_int *info);
-extern fortran_int
-FNAME(dgesv)(fortran_int *n, fortran_int *nrhs,
-             double a[], fortran_int *lda,
-             fortran_int ipiv[],
-             double b[], fortran_int *ldb,
-             fortran_int *info);
-extern fortran_int
-FNAME(cgesv)(fortran_int *n, fortran_int *nrhs,
-             f2c_complex a[], fortran_int *lda,
-             fortran_int ipiv[],
-             f2c_complex b[], fortran_int *ldb,
-             fortran_int *info);
-extern fortran_int
-FNAME(zgesv)(fortran_int *n, fortran_int *nrhs,
-             f2c_doublecomplex a[], fortran_int *lda,
-             fortran_int ipiv[],
-             f2c_doublecomplex b[], fortran_int *ldb,
-             fortran_int *info);
-
-extern fortran_int
-FNAME(sgetrf)(fortran_int *m, fortran_int *n,
-              float a[], fortran_int *lda,
-              fortran_int ipiv[],
-              fortran_int *info);
-extern fortran_int
-FNAME(dgetrf)(fortran_int *m, fortran_int *n,
-              double a[], fortran_int *lda,
-              fortran_int ipiv[],
-              fortran_int *info);
-extern fortran_int
-FNAME(cgetrf)(fortran_int *m, fortran_int *n,
-              f2c_complex a[], fortran_int *lda,
-              fortran_int ipiv[],
-              fortran_int *info);
-extern fortran_int
-FNAME(zgetrf)(fortran_int *m, fortran_int *n,
-              f2c_doublecomplex a[], fortran_int *lda,
-              fortran_int ipiv[],
-              fortran_int *info);
-
-extern fortran_int
-FNAME(spotrf)(char *uplo, fortran_int *n,
-              float a[], fortran_int *lda,
-              fortran_int *info);
-extern fortran_int
-FNAME(dpotrf)(char *uplo, fortran_int *n,
-              double a[], fortran_int *lda,
-              fortran_int *info);
-extern fortran_int
-FNAME(cpotrf)(char *uplo, fortran_int *n,
-              f2c_complex a[], fortran_int *lda,
-              fortran_int *info);
-extern fortran_int
-FNAME(zpotrf)(char *uplo, fortran_int *n,
-              f2c_doublecomplex a[], fortran_int *lda,
-              fortran_int *info);
-
-extern fortran_int
-FNAME(sgesdd)(char *jobz, fortran_int *m, fortran_int *n,
-              float a[], fortran_int *lda, float s[], float u[],
-              fortran_int *ldu, float vt[], fortran_int *ldvt, float work[],
-              fortran_int *lwork, fortran_int iwork[], fortran_int *info);
-extern fortran_int
-FNAME(dgesdd)(char *jobz, fortran_int *m, fortran_int *n,
-              double a[], fortran_int *lda, double s[], double u[],
-              fortran_int *ldu, double vt[], fortran_int *ldvt, double work[],
-              fortran_int *lwork, fortran_int iwork[], fortran_int *info);
-extern fortran_int
-FNAME(cgesdd)(char *jobz, fortran_int *m, fortran_int *n,
-              f2c_complex a[], fortran_int *lda,
-              float s[], f2c_complex u[], fortran_int *ldu,
-              f2c_complex vt[], fortran_int *ldvt,
-              f2c_complex work[], fortran_int *lwork,
-              float rwork[], fortran_int iwork[], fortran_int *info);
-extern fortran_int
-FNAME(zgesdd)(char *jobz, fortran_int *m, fortran_int *n,
-              f2c_doublecomplex a[], fortran_int *lda,
-              double s[], f2c_doublecomplex u[], fortran_int *ldu,
-              f2c_doublecomplex vt[], fortran_int *ldvt,
-              f2c_doublecomplex work[], fortran_int *lwork,
-              double rwork[], fortran_int iwork[], fortran_int *info);
-
-extern fortran_int
-FNAME(spotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
-              float a[], fortran_int *lda,
-              float b[], fortran_int *ldb,
-              fortran_int *info);
-extern fortran_int
-FNAME(dpotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
-              double a[], fortran_int *lda,
-              double b[], fortran_int *ldb,
-              fortran_int *info);
-extern fortran_int
-FNAME(cpotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
-              f2c_complex a[], fortran_int *lda,
-              f2c_complex b[], fortran_int *ldb,
-              fortran_int *info);
-extern fortran_int
-FNAME(zpotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
-              f2c_doublecomplex a[], fortran_int *lda,
-              f2c_doublecomplex b[], fortran_int *ldb,
-              fortran_int *info);
-
-extern fortran_int
-FNAME(spotri)(char *uplo, fortran_int *n,
-              float a[], fortran_int *lda,
-              fortran_int *info);
-extern fortran_int
-FNAME(dpotri)(char *uplo, fortran_int *n,
-              double a[], fortran_int *lda,
-              fortran_int *info);
-extern fortran_int
-FNAME(cpotri)(char *uplo, fortran_int *n,
-              f2c_complex a[], fortran_int *lda,
-              fortran_int *info);
-extern fortran_int
-FNAME(zpotri)(char *uplo, fortran_int *n,
-              f2c_doublecomplex a[], fortran_int *lda,
-              fortran_int *info);
-
-extern fortran_int
-FNAME(scopy)(fortran_int *n,
-             float *sx, fortran_int *incx,
-             float *sy, fortran_int *incy);
-extern fortran_int
-FNAME(dcopy)(fortran_int *n,
-             double *sx, fortran_int *incx,
-             double *sy, fortran_int *incy);
-extern fortran_int
-FNAME(ccopy)(fortran_int *n,
-             f2c_complex *sx, fortran_int *incx,
-             f2c_complex *sy, fortran_int *incy);
-extern fortran_int
-FNAME(zcopy)(fortran_int *n,
-             f2c_doublecomplex *sx, fortran_int *incx,
-             f2c_doublecomplex *sy, fortran_int *incy);
-
-extern float
-FNAME(sdot)(fortran_int *n,
-            float *sx, fortran_int *incx,
-            float *sy, fortran_int *incy);
-extern double
-FNAME(ddot)(fortran_int *n,
-            double *sx, fortran_int *incx,
-            double *sy, fortran_int *incy);
-extern void
-FNAME(cdotu)(f2c_complex *ret, fortran_int *n,
-             f2c_complex *sx, fortran_int *incx,
-             f2c_complex *sy, fortran_int *incy);
-extern void
-FNAME(zdotu)(f2c_doublecomplex *ret, fortran_int *n,
-             f2c_doublecomplex *sx, fortran_int *incx,
-             f2c_doublecomplex *sy, fortran_int *incy);
-extern void
-FNAME(cdotc)(f2c_complex *ret, fortran_int *n,
-             f2c_complex *sx, fortran_int *incx,
-             f2c_complex *sy, fortran_int *incy);
-extern void
-FNAME(zdotc)(f2c_doublecomplex *ret, fortran_int *n,
-             f2c_doublecomplex *sx, fortran_int *incx,
-             f2c_doublecomplex *sy, fortran_int *incy);
-
-extern fortran_int
-FNAME(sgemm)(char *transa, char *transb,
-             fortran_int *m, fortran_int *n, fortran_int *k,
-             float *alpha,
-             float *a, fortran_int *lda,
-             float *b, fortran_int *ldb,
-             float *beta,
-             float *c, fortran_int *ldc);
-extern fortran_int
-FNAME(dgemm)(char *transa, char *transb,
-             fortran_int *m, fortran_int *n, fortran_int *k,
-             double *alpha,
-             double *a, fortran_int *lda,
-             double *b, fortran_int *ldb,
-             double *beta,
-             double *c, fortran_int *ldc);
-extern fortran_int
-FNAME(cgemm)(char *transa, char *transb,
-             fortran_int *m, fortran_int *n, fortran_int *k,
-             f2c_complex *alpha,
-             f2c_complex *a, fortran_int *lda,
-             f2c_complex *b, fortran_int *ldb,
-             f2c_complex *beta,
-             f2c_complex *c, fortran_int *ldc);
-extern fortran_int
-FNAME(zgemm)(char *transa, char *transb,
-             fortran_int *m, fortran_int *n, fortran_int *k,
-             f2c_doublecomplex *alpha,
-             f2c_doublecomplex *a, fortran_int *lda,
-             f2c_doublecomplex *b, fortran_int *ldb,
-             f2c_doublecomplex *beta,
-             f2c_doublecomplex *c, fortran_int *ldc);
-
-
-#define LAPACK_T(FUNC)                                          \
-    TRACE_TXT("Calling LAPACK ( " # FUNC " )\n");               \
-    FNAME(FUNC)
-
-#define BLAS(FUNC)                              \
-    FNAME(FUNC)
-
-#define LAPACK(FUNC)                            \
-    FNAME(FUNC)
-
-
-/*
- *****************************************************************************
- **                      Some handy functions                               **
- *****************************************************************************
- */
-
-static NPY_INLINE int
-get_fp_invalid_and_clear(void)
-{
-    int status;
-    status = npy_clear_floatstatus_barrier((char*)&status);
-    return !!(status & NPY_FPE_INVALID);
-}
-
-static NPY_INLINE void
-set_fp_invalid_or_clear(int error_occurred)
-{
-    if (error_occurred) {
-        npy_set_floatstatus_invalid();
-    }
-    else {
-        npy_clear_floatstatus_barrier((char*)&error_occurred);
-    }
-}
-
-/*
- *****************************************************************************
- **                      Some handy constants                               **
- *****************************************************************************
- */
-
-#define UMATH_LINALG_MODULE_NAME "_umath_linalg"
-
-typedef union {
-    fortran_complex f;
-    npy_cfloat npy;
-    float array[2];
-} COMPLEX_t;
-
-typedef union {
-    fortran_doublecomplex f;
-    npy_cdouble npy;
-    double array[2];
-} DOUBLECOMPLEX_t;
-
-static float s_one;
-static float s_zero;
-static float s_minus_one;
-static float s_ninf;
-static float s_nan;
-static double d_one;
-static double d_zero;
-static double d_minus_one;
-static double d_ninf;
-static double d_nan;
-static COMPLEX_t c_one;
-static COMPLEX_t c_zero;
-static COMPLEX_t c_minus_one;
-static COMPLEX_t c_ninf;
-static COMPLEX_t c_nan;
-static DOUBLECOMPLEX_t z_one;
-static DOUBLECOMPLEX_t z_zero;
-static DOUBLECOMPLEX_t z_minus_one;
-static DOUBLECOMPLEX_t z_ninf;
-static DOUBLECOMPLEX_t z_nan;
-
-static void init_constants(void)
-{
-    /*
-       this is needed as NPY_INFINITY and NPY_NAN macros
-       can't be used as initializers. I prefer to just set
-       all the constants the same way.
-    */
-    s_one  = 1.0f;
-    s_zero = 0.0f;
-    s_minus_one = -1.0f;
-    s_ninf = -NPY_INFINITYF;
-    s_nan = NPY_NANF;
-
-    d_one  = 1.0;
-    d_zero = 0.0;
-    d_minus_one = -1.0;
-    d_ninf = -NPY_INFINITY;
-    d_nan = NPY_NAN;
-
-    c_one.array[0]  = 1.0f;
-    c_one.array[1]  = 0.0f;
-    c_zero.array[0] = 0.0f;
-    c_zero.array[1] = 0.0f;
-    c_minus_one.array[0] = -1.0f;
-    c_minus_one.array[1] = 0.0f;
-    c_ninf.array[0] = -NPY_INFINITYF;
-    c_ninf.array[1] = 0.0f;
-    c_nan.array[0] = NPY_NANF;
-    c_nan.array[1] = NPY_NANF;
-
-    z_one.array[0]  = 1.0;
-    z_one.array[1]  = 0.0;
-    z_zero.array[0] = 0.0;
-    z_zero.array[1] = 0.0;
-    z_minus_one.array[0] = -1.0;
-    z_minus_one.array[1] = 0.0;
-    z_ninf.array[0] = -NPY_INFINITY;
-    z_ninf.array[1] = 0.0;
-    z_nan.array[0] = NPY_NAN;
-    z_nan.array[1] = NPY_NAN;
-}
-
-/*
- *****************************************************************************
- **               Structs used for data rearrangement                       **
- *****************************************************************************
- */
-
-
-/*
- * this struct contains information about how to linearize a matrix in a local
- * buffer so that it can be used by blas functions.  All strides are specified
- * in bytes and are converted to elements later in type specific functions.
- *
- * rows: number of rows in the matrix
- * columns: number of columns in the matrix
- * row_strides: the number of bytes between consecutive rows.
- * column_strides: the number of bytes between consecutive columns.
- * output_lead_dim: BLAS/LAPACK-side leading dimension, in elements
- */
-typedef struct linearize_data_struct
-{
-  npy_intp rows;
-  npy_intp columns;
-  npy_intp row_strides;
-  npy_intp column_strides;
-  npy_intp output_lead_dim;
-} LINEARIZE_DATA_t;
-
-static NPY_INLINE void
-init_linearize_data_ex(LINEARIZE_DATA_t *lin_data,
-                       npy_intp rows,
-                       npy_intp columns,
-                       npy_intp row_strides,
-                       npy_intp column_strides,
-                       npy_intp output_lead_dim)
-{
-    lin_data->rows = rows;
-    lin_data->columns = columns;
-    lin_data->row_strides = row_strides;
-    lin_data->column_strides = column_strides;
-    lin_data->output_lead_dim = output_lead_dim;
-}
-
-static NPY_INLINE void
-init_linearize_data(LINEARIZE_DATA_t *lin_data,
-                    npy_intp rows,
-                    npy_intp columns,
-                    npy_intp row_strides,
-                    npy_intp column_strides)
-{
-    init_linearize_data_ex(
-        lin_data, rows, columns, row_strides, column_strides, columns);
-}
-
-static NPY_INLINE void
-dump_ufunc_object(PyUFuncObject* ufunc)
-{
-    TRACE_TXT("\n\n%s '%s' (%d input(s), %d output(s), %d specialization(s)).\n",
-              ufunc->core_enabled? "generalized ufunc" : "scalar ufunc",
-              ufunc->name, ufunc->nin, ufunc->nout, ufunc->ntypes);
-    if (ufunc->core_enabled) {
-        int arg;
-        int dim;
-        TRACE_TXT("\t%s (%d dimension(s) detected).\n",
-                  ufunc->core_signature, ufunc->core_num_dim_ix);
-
-        for (arg = 0; arg < ufunc->nargs; arg++){
-            int * arg_dim_ix = ufunc->core_dim_ixs + ufunc->core_offsets[arg];
-            TRACE_TXT("\t\targ %d (%s) has %d dimension(s): (",
-                      arg, arg < ufunc->nin? "INPUT" : "OUTPUT",
-                      ufunc->core_num_dims[arg]);
-            for (dim = 0; dim < ufunc->core_num_dims[arg]; dim ++) {
-                TRACE_TXT(" %d", arg_dim_ix[dim]);
-            }
-            TRACE_TXT(" )\n");
-        }
-    }
-}
-
-static NPY_INLINE void
-dump_linearize_data(const char* name, const LINEARIZE_DATA_t* params)
-{
-    TRACE_TXT("\n\t%s rows: %zd columns: %zd"\
-              "\n\t\trow_strides: %td column_strides: %td"\
-              "\n", name, params->rows, params->columns,
-              params->row_strides, params->column_strides);
-}
-
-static NPY_INLINE void
-print_FLOAT(npy_float s)
-{
-    TRACE_TXT(" %8.4f", s);
-}
-static NPY_INLINE void
-print_DOUBLE(npy_double d)
-{
-    TRACE_TXT(" %10.6f", d);
-}
-static NPY_INLINE void
-print_CFLOAT(npy_cfloat c)
-{
-    float* c_parts = (float*)&c;
-    TRACE_TXT("(%8.4f, %8.4fj)", c_parts[0], c_parts[1]);
-}
-static NPY_INLINE void
-print_CDOUBLE(npy_cdouble z)
-{
-    double* z_parts = (double*)&z;
-    TRACE_TXT("(%8.4f, %8.4fj)", z_parts[0], z_parts[1]);
-}
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #typ = npy_float, npy_double, npy_cfloat, npy_cdouble#
- */
-static NPY_INLINE void
-dump_@TYPE@_matrix(const char* name,
-                   size_t rows, size_t columns,
-                   const @typ@* ptr)
-{
-    size_t i, j;
-
-    TRACE_TXT("\n%s %p (%zd, %zd)\n", name, ptr, rows, columns);
-    for (i = 0; i < rows; i++)
-    {
-        TRACE_TXT("| ");
-        for (j = 0; j < columns; j++)
-        {
-            print_@TYPE@(ptr[j*rows + i]);
-            TRACE_TXT(", ");
-        }
-        TRACE_TXT(" |\n");
-    }
-}
-/**end repeat**/
-
-
-/*
- *****************************************************************************
- **                            Basics                                       **
- *****************************************************************************
- */
-
-static NPY_INLINE fortran_int
-fortran_int_min(fortran_int x, fortran_int y) {
-    return x < y ? x : y;
-}
-
-static NPY_INLINE fortran_int
-fortran_int_max(fortran_int x, fortran_int y) {
-    return x > y ? x : y;
-}
-
-#define INIT_OUTER_LOOP_1 \
-    npy_intp dN = *dimensions++;\
-    npy_intp N_;\
-    npy_intp s0 = *steps++;
-
-#define INIT_OUTER_LOOP_2 \
-    INIT_OUTER_LOOP_1\
-    npy_intp s1 = *steps++;
-
-#define INIT_OUTER_LOOP_3 \
-    INIT_OUTER_LOOP_2\
-    npy_intp s2 = *steps++;
-
-#define INIT_OUTER_LOOP_4 \
-    INIT_OUTER_LOOP_3\
-    npy_intp s3 = *steps++;
-
-#define INIT_OUTER_LOOP_5 \
-    INIT_OUTER_LOOP_4\
-    npy_intp s4 = *steps++;
-
-#define INIT_OUTER_LOOP_6  \
-    INIT_OUTER_LOOP_5\
-    npy_intp s5 = *steps++;
-
-#define INIT_OUTER_LOOP_7  \
-    INIT_OUTER_LOOP_6\
-    npy_intp s6 = *steps++;
-
-#define BEGIN_OUTER_LOOP_2 \
-    for (N_ = 0;\
-         N_ < dN;\
-         N_++, args[0] += s0,\
-             args[1] += s1) {
-
-#define BEGIN_OUTER_LOOP_3 \
-    for (N_ = 0;\
-         N_ < dN;\
-         N_++, args[0] += s0,\
-             args[1] += s1,\
-             args[2] += s2) {
-
-#define BEGIN_OUTER_LOOP_4 \
-    for (N_ = 0;\
-         N_ < dN;\
-         N_++, args[0] += s0,\
-             args[1] += s1,\
-             args[2] += s2,\
-             args[3] += s3) {
-
-#define BEGIN_OUTER_LOOP_5 \
-    for (N_ = 0;\
-         N_ < dN;\
-         N_++, args[0] += s0,\
-             args[1] += s1,\
-             args[2] += s2,\
-             args[3] += s3,\
-             args[4] += s4) {
-
-#define BEGIN_OUTER_LOOP_6 \
-    for (N_ = 0;\
-         N_ < dN;\
-         N_++, args[0] += s0,\
-             args[1] += s1,\
-             args[2] += s2,\
-             args[3] += s3,\
-             args[4] += s4,\
-             args[5] += s5) {
-
-#define BEGIN_OUTER_LOOP_7 \
-    for (N_ = 0;\
-         N_ < dN;\
-         N_++, args[0] += s0,\
-             args[1] += s1,\
-             args[2] += s2,\
-             args[3] += s3,\
-             args[4] += s4,\
-             args[5] += s5,\
-             args[6] += s6) {
-
-#define END_OUTER_LOOP  }
-
-static NPY_INLINE void
-update_pointers(npy_uint8** bases, ptrdiff_t* offsets, size_t count)
-{
-    size_t i;
-    for (i = 0; i < count; ++i) {
-        bases[i] += offsets[i];
-    }
-}
-
-
-/* disable -Wmaybe-uninitialized, as some of the code below generates false
-   positives with this warning
-*/
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
-
-/*
- *****************************************************************************
- **                             HELPER FUNCS                                **
- *****************************************************************************
- */
-
-             /* rearranging of 2D matrices using blas */
-
-/**begin repeat
-    #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-    #typ = float, double, COMPLEX_t, DOUBLECOMPLEX_t#
-    #copy = scopy, dcopy, ccopy, zcopy#
-    #nan = s_nan, d_nan, c_nan, z_nan#
-    #zero = s_zero, d_zero, c_zero, z_zero#
- */
-static NPY_INLINE void *
-linearize_@TYPE@_matrix(void *dst_in,
-                        void *src_in,
-                        const LINEARIZE_DATA_t* data)
-{
-    @typ@ *src = (@typ@ *) src_in;
-    @typ@ *dst = (@typ@ *) dst_in;
-
-    if (dst) {
-        int i, j;
-        @typ@* rv = dst;
-        fortran_int columns = (fortran_int)data->columns;
-        fortran_int column_strides =
-            (fortran_int)(data->column_strides/sizeof(@typ@));
-        fortran_int one = 1;
-        for (i = 0; i < data->rows; i++) {
-            if (column_strides > 0) {
-                FNAME(@copy@)(&columns,
-                              (void*)src, &column_strides,
-                              (void*)dst, &one);
-            }
-            else if (column_strides < 0) {
-                FNAME(@copy@)(&columns,
-                              (void*)((@typ@*)src + (columns-1)*column_strides),
-                              &column_strides,
-                              (void*)dst, &one);
-            }
-            else {
-                /*
-                 * Zero stride has undefined behavior in some BLAS
-                 * implementations (e.g. OSX Accelerate), so do it
-                 * manually
-                 */
-                for (j = 0; j < columns; ++j) {
-                    memcpy((@typ@*)dst + j, (@typ@*)src, sizeof(@typ@));
-                }
-            }
-            src += data->row_strides/sizeof(@typ@);
-            dst += data->output_lead_dim;
-        }
-        return rv;
-    } else {
-        return src;
-    }
-}
-
-static NPY_INLINE void *
-delinearize_@TYPE@_matrix(void *dst_in,
-                          void *src_in,
-                          const LINEARIZE_DATA_t* data)
-{
-    @typ@ *src = (@typ@ *) src_in;
-    @typ@ *dst = (@typ@ *) dst_in;
-
-    if (src) {
-        int i;
-        @typ@ *rv = src;
-        fortran_int columns = (fortran_int)data->columns;
-        fortran_int column_strides =
-            (fortran_int)(data->column_strides/sizeof(@typ@));
-        fortran_int one = 1;
-        for (i = 0; i < data->rows; i++) {
-            if (column_strides > 0) {
-                FNAME(@copy@)(&columns,
-                              (void*)src, &one,
-                              (void*)dst, &column_strides);
-            }
-            else if (column_strides < 0) {
-                FNAME(@copy@)(&columns,
-                              (void*)src, &one,
-                              (void*)((@typ@*)dst + (columns-1)*column_strides),
-                              &column_strides);
-            }
-            else {
-                /*
-                 * Zero stride has undefined behavior in some BLAS
-                 * implementations (e.g. OSX Accelerate), so do it
-                 * manually
-                 */
-                if (columns > 0) {
-                    memcpy((@typ@*)dst,
-                           (@typ@*)src + (columns-1),
-                           sizeof(@typ@));
-                }
-            }
-            src += data->output_lead_dim;
-            dst += data->row_strides/sizeof(@typ@);
-        }
-
-        return rv;
-    } else {
-        return src;
-    }
-}
-
-static NPY_INLINE void
-nan_@TYPE@_matrix(void *dst_in, const LINEARIZE_DATA_t* data)
-{
-    @typ@ *dst = (@typ@ *) dst_in;
-
-    int i, j;
-    for (i = 0; i < data->rows; i++) {
-        @typ@ *cp = dst;
-        ptrdiff_t cs = data->column_strides/sizeof(@typ@);
-        for (j = 0; j < data->columns; ++j) {
-            *cp = @nan@;
-            cp += cs;
-        }
-        dst += data->row_strides/sizeof(@typ@);
-    }
-}
-
-static NPY_INLINE void
-zero_@TYPE@_matrix(void *dst_in, const LINEARIZE_DATA_t* data)
-{
-    @typ@ *dst = (@typ@ *) dst_in;
-
-    int i, j;
-    for (i = 0; i < data->rows; i++) {
-        @typ@ *cp = dst;
-        ptrdiff_t cs = data->column_strides/sizeof(@typ@);
-        for (j = 0; j < data->columns; ++j) {
-            *cp = @zero@;
-            cp += cs;
-        }
-        dst += data->row_strides/sizeof(@typ@);
-    }
-}
-
-/**end repeat**/
-
-               /* identity square matrix generation */
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #typ = float, double, COMPLEX_t, DOUBLECOMPLEX_t#
-   #cblas_type = s, d, c, z#
- */
-static NPY_INLINE void
-identity_@TYPE@_matrix(void *ptr, size_t n)
-{
-    size_t i;
-    @typ@ *matrix = (@typ@*) ptr;
-    /* in IEEE floating point, zeroes are represented as bitwise 0 */
-    memset(matrix, 0, n*n*sizeof(@typ@));
-
-    for (i = 0; i < n; ++i)
-    {
-        *matrix = @cblas_type@_one;
-        matrix += n+1;
-    }
-}
-/**end repeat**/
-
-         /* lower/upper triangular matrix using blas (in place) */
-/**begin repeat
-
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #typ = float, double, COMPLEX_t, DOUBLECOMPLEX_t#
-   #cblas_type = s, d, c, z#
- */
-
-static NPY_INLINE void
-triu_@TYPE@_matrix(void *ptr, size_t n)
-{
-    size_t i, j;
-    @typ@ *matrix = (@typ@*)ptr;
-    matrix += n;
-    for (i = 1; i < n; ++i) {
-        for (j = 0; j < i; ++j) {
-            matrix[j] = @cblas_type@_zero;
-        }
-        matrix += n;
-    }
-}
-/**end repeat**/
-
-
-/* -------------------------------------------------------------------------- */
-                          /* Determinants */
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE#
-   #typ = npy_float, npy_double#
-   #log_func = npy_logf, npy_log#
-   #exp_func = npy_expf, npy_exp#
-   #zero = 0.0f, 0.0#
-*/
-
-static NPY_INLINE void
-@TYPE@_slogdet_from_factored_diagonal(@typ@* src,
-                                      fortran_int m,
-                                      @typ@ *sign,
-                                      @typ@ *logdet)
-{
-    @typ@ acc_sign = *sign;
-    @typ@ acc_logdet = @zero@;
-    int i;
-    for (i = 0; i < m; i++) {
-        @typ@ abs_element = *src;
-        if (abs_element < @zero@) {
-            acc_sign = -acc_sign;
-            abs_element = -abs_element;
-        }
-
-        acc_logdet += @log_func@(abs_element);
-        src += m+1;
-    }
-
-    *sign = acc_sign;
-    *logdet = acc_logdet;
-}
-
-static NPY_INLINE @typ@
-@TYPE@_det_from_slogdet(@typ@ sign, @typ@ logdet)
-{
-    @typ@ result = sign * @exp_func@(logdet);
-    return result;
-}
-
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = CFLOAT, CDOUBLE#
-   #typ = npy_cfloat, npy_cdouble#
-   #basetyp = npy_float, npy_double#
-   #abs_func = npy_cabsf, npy_cabs#
-   #log_func = npy_logf, npy_log#
-   #exp_func = npy_expf, npy_exp#
-   #zero = 0.0f, 0.0#
-*/
-#define RE(COMPLEX) (((@basetyp@*)(&COMPLEX))[0])
-#define IM(COMPLEX) (((@basetyp@*)(&COMPLEX))[1])
-
-static NPY_INLINE @typ@
-@TYPE@_mult(@typ@ op1, @typ@ op2)
-{
-    @typ@ rv;
-
-    RE(rv) = RE(op1)*RE(op2) - IM(op1)*IM(op2);
-    IM(rv) = RE(op1)*IM(op2) + IM(op1)*RE(op2);
-
-    return rv;
-}
-
-
-static NPY_INLINE void
-@TYPE@_slogdet_from_factored_diagonal(@typ@* src,
-                                      fortran_int m,
-                                      @typ@ *sign,
-                                      @basetyp@ *logdet)
-{
-    int i;
-    @typ@ sign_acc = *sign;
-    @basetyp@ logdet_acc = @zero@;
-
-    for (i = 0; i < m; i++)
-    {
-        @basetyp@ abs_element = @abs_func@(*src);
-        @typ@ sign_element;
-        RE(sign_element) = RE(*src) / abs_element;
-        IM(sign_element) = IM(*src) / abs_element;
-
-        sign_acc = @TYPE@_mult(sign_acc, sign_element);
-        logdet_acc += @log_func@(abs_element);
-        src += m + 1;
-    }
-
-    *sign = sign_acc;
-    *logdet = logdet_acc;
-}
-
-static NPY_INLINE @typ@
-@TYPE@_det_from_slogdet(@typ@ sign, @basetyp@ logdet)
-{
-    @typ@ tmp;
-    RE(tmp) = @exp_func@(logdet);
-    IM(tmp) = @zero@;
-    return @TYPE@_mult(sign, tmp);
-}
-#undef RE
-#undef IM
-/**end repeat**/
-
-
-/* As in the linalg package, the determinant is computed via LU factorization
- * using LAPACK.
- * slogdet computes the sign and log(abs(determinant)).
- * det computes sign * exp(logdet).
- */
-/**begin repeat
-
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #typ = npy_float, npy_double, npy_cfloat, npy_cdouble#
-   #basetyp = npy_float, npy_double, npy_float, npy_double#
-   #cblas_type = s, d, c, z#
-*/
-
-static NPY_INLINE void
-@TYPE@_slogdet_single_element(fortran_int m,
-                              void* src,
-                              fortran_int* pivots,
-                              @typ@ *sign,
-                              @basetyp@ *logdet)
-{
-    fortran_int info = 0;
-    fortran_int lda = fortran_int_max(m, 1);
-    int i;
-    /* note: done in place */
-    LAPACK(@cblas_type@getrf)(&m, &m, (void *)src, &lda, pivots, &info);
-
-    if (info == 0) {
-        int change_sign = 0;
-        /* note: fortran uses 1 based indexing */
-        for (i = 0; i < m; i++)
-        {
-            change_sign += (pivots[i] != (i+1));
-        }
-
-        memcpy(sign,
-               (change_sign % 2)?
-                   &@cblas_type@_minus_one :
-                   &@cblas_type@_one
-               , sizeof(*sign));
-        @TYPE@_slogdet_from_factored_diagonal(src, m, sign, logdet);
-    } else {
-        /*
-          if getrf fails, use 0 as sign and -inf as logdet
-        */
-        memcpy(sign, &@cblas_type@_zero, sizeof(*sign));
-        memcpy(logdet, &@cblas_type@_ninf, sizeof(*logdet));
-    }
-}
-
-static void
-@TYPE@_slogdet(char **args,
-               npy_intp const *dimensions,
-               npy_intp const *steps,
-               void *NPY_UNUSED(func))
-{
-    fortran_int m;
-    npy_uint8 *tmp_buff = NULL;
-    size_t matrix_size;
-    size_t pivot_size;
-    size_t safe_m;
-    /* notes:
-     *   matrix will need to be copied always, as factorization in lapack is
-     *          made inplace
-     *   matrix will need to be in column-major order, as expected by lapack
-     *          code (fortran)
-     *   always a square matrix
-     *   need to allocate memory for both, matrix_buffer and pivot buffer
-     */
-    INIT_OUTER_LOOP_3
-    m = (fortran_int) dimensions[0];
-    safe_m = m;
-    matrix_size = safe_m * safe_m * sizeof(@typ@);
-    pivot_size = safe_m * sizeof(fortran_int);
-    tmp_buff = (npy_uint8 *)malloc(matrix_size + pivot_size);
-
-    if (tmp_buff) {
-        LINEARIZE_DATA_t lin_data;
-        /* swapped steps to get matrix in FORTRAN order */
-        init_linearize_data(&lin_data, m, m, steps[1], steps[0]);
-        BEGIN_OUTER_LOOP_3
-            linearize_@TYPE@_matrix(tmp_buff, args[0], &lin_data);
-            @TYPE@_slogdet_single_element(m,
-                                          (void*)tmp_buff,
-                                          (fortran_int*)(tmp_buff+matrix_size),
-                                          (@typ@*)args[1],
-                                          (@basetyp@*)args[2]);
-        END_OUTER_LOOP
-
-        free(tmp_buff);
-    }
-}
-
-static void
-@TYPE@_det(char **args,
-           npy_intp const *dimensions,
-           npy_intp const *steps,
-           void *NPY_UNUSED(func))
-{
-    fortran_int m;
-    npy_uint8 *tmp_buff;
-    size_t matrix_size;
-    size_t pivot_size;
-    size_t safe_m;
-    /* notes:
-     *   matrix will need to be copied always, as factorization in lapack is
-     *       made inplace
-     *   matrix will need to be in column-major order, as expected by lapack
-     *       code (fortran)
-     *   always a square matrix
-     *   need to allocate memory for both, matrix_buffer and pivot buffer
-     */
-    INIT_OUTER_LOOP_2
-    m = (fortran_int) dimensions[0];
-    safe_m = m;
-    matrix_size = safe_m * safe_m * sizeof(@typ@);
-    pivot_size = safe_m * sizeof(fortran_int);
-    tmp_buff = (npy_uint8 *)malloc(matrix_size + pivot_size);
-
-    if (tmp_buff) {
-        LINEARIZE_DATA_t lin_data;
-        @typ@ sign;
-        @basetyp@ logdet;
-        /* swapped steps to get matrix in FORTRAN order */
-        init_linearize_data(&lin_data, m, m, steps[1], steps[0]);
-
-        BEGIN_OUTER_LOOP_2
-            linearize_@TYPE@_matrix(tmp_buff, args[0], &lin_data);
-            @TYPE@_slogdet_single_element(m,
-                                          (void*)tmp_buff,
-                                          (fortran_int*)(tmp_buff + matrix_size),
-                                          &sign,
-                                          &logdet);
-            *(@typ@ *)args[1] = @TYPE@_det_from_slogdet(sign, logdet);
-        END_OUTER_LOOP
-
-        free(tmp_buff);
-    }
-}
-/**end repeat**/
-
-
-/* -------------------------------------------------------------------------- */
-                          /* Eigh family */
-
-typedef struct eigh_params_struct {
-    void *A;     /* matrix */
-    void *W;     /* eigenvalue vector */
-    void *WORK;  /* main work buffer */
-    void *RWORK; /* secondary work buffer (for complex versions) */
-    void *IWORK;
-    fortran_int N;
-    fortran_int LWORK;
-    fortran_int LRWORK;
-    fortran_int LIWORK;
-    char JOBZ;
-    char UPLO;
-    fortran_int LDA;
-} EIGH_PARAMS_t;
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE#
-   #typ = npy_float, npy_double#
-   #ftyp = fortran_real, fortran_doublereal#
-   #lapack_func = ssyevd, dsyevd#
-*/
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(EIGH_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->JOBZ, &params->UPLO, &params->N,
-                          params->A, &params->LDA, params->W,
-                          params->WORK, &params->LWORK,
-                          params->IWORK, &params->LIWORK,
-                          &rv);
-    return rv;
-}
-
-/*
- * Initialize the parameters to use for the LAPACK function _syevd
- * Handles buffer allocation
- */
-static NPY_INLINE int
-init_@lapack_func@(EIGH_PARAMS_t* params, char JOBZ, char UPLO,
-                   fortran_int N)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    fortran_int lwork;
-    fortran_int liwork;
-    npy_uint8 *a, *w, *work, *iwork;
-    size_t safe_N = N;
-    size_t alloc_size = safe_N * (safe_N + 1) * sizeof(@typ@);
-    fortran_int lda = fortran_int_max(N, 1);
-
-    mem_buff = malloc(alloc_size);
-
-    if (!mem_buff) {
-        goto error;
-    }
-    a = mem_buff;
-    w = mem_buff + safe_N * safe_N * sizeof(@typ@);
-
-    params->A = a;
-    params->W = w;
-    params->RWORK = NULL; /* unused */
-    params->N = N;
-    params->LRWORK = 0; /* unused */
-    params->JOBZ = JOBZ;
-    params->UPLO = UPLO;
-    params->LDA = lda;
-
-    /* Work size query */
-    {
-        @typ@ query_work_size;
-        fortran_int query_iwork_size;
-
-        params->LWORK = -1;
-        params->LIWORK = -1;
-        params->WORK = &query_work_size;
-        params->IWORK = &query_iwork_size;
-
-        if (call_@lapack_func@(params) != 0) {
-            goto error;
-        }
-
-        lwork = (fortran_int)query_work_size;
-        liwork = query_iwork_size;
-    }
-
-    mem_buff2 = malloc(lwork*sizeof(@typ@) + liwork*sizeof(fortran_int));
-    if (!mem_buff2) {
-        goto error;
-    }
-
-    work = mem_buff2;
-    iwork = mem_buff2 + lwork*sizeof(@typ@);
-
-    params->LWORK = lwork;
-    params->WORK = work;
-    params->LIWORK = liwork;
-    params->IWORK = iwork;
-
-    return 1;
-
- error:
-    /* something failed */
-    memset(params, 0, sizeof(*params));
-    free(mem_buff2);
-    free(mem_buff);
-
-    return 0;
-}
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = CFLOAT, CDOUBLE#
-   #typ = npy_cfloat, npy_cdouble#
-   #basetyp = npy_float, npy_double#
-   #ftyp = fortran_complex, fortran_doublecomplex#
-   #fbasetyp = fortran_real, fortran_doublereal#
-   #lapack_func = cheevd, zheevd#
-*/
-static NPY_INLINE fortran_int
-call_@lapack_func@(EIGH_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->JOBZ, &params->UPLO, &params->N,
-                          params->A, &params->LDA, params->W,
-                          params->WORK, &params->LWORK,
-                          params->RWORK, &params->LRWORK,
-                          params->IWORK, &params->LIWORK,
-                          &rv);
-    return rv;
-}
-
-/*
- * Initialize the parameters to use for the LAPACK function _heevd
- * Handles buffer allocation
- */
-static NPY_INLINE int
-init_@lapack_func@(EIGH_PARAMS_t *params,
-                   char JOBZ,
-                   char UPLO,
-                   fortran_int N)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    fortran_int lwork;
-    fortran_int lrwork;
-    fortran_int liwork;
-    npy_uint8 *a, *w, *work, *rwork, *iwork;
-    size_t safe_N = N;
-    fortran_int lda = fortran_int_max(N, 1);
-
-    mem_buff = malloc(safe_N * safe_N * sizeof(@typ@) +
-                      safe_N * sizeof(@basetyp@));
-    if (!mem_buff) {
-        goto error;
-    }
-    a = mem_buff;
-    w = mem_buff + safe_N * safe_N * sizeof(@typ@);
-
-    params->A = a;
-    params->W = w;
-    params->N = N;
-    params->JOBZ = JOBZ;
-    params->UPLO = UPLO;
-    params->LDA = lda;
-
-    /* Work size query */
-    {
-        @ftyp@ query_work_size;
-        @fbasetyp@ query_rwork_size;
-        fortran_int query_iwork_size;
-
-        params->LWORK = -1;
-        params->LRWORK = -1;
-        params->LIWORK = -1;
-        params->WORK = &query_work_size;
-        params->RWORK = &query_rwork_size;
-        params->IWORK = &query_iwork_size;
-
-        if (call_@lapack_func@(params) != 0) {
-            goto error;
-        }
-
-        lwork = (fortran_int)*(@fbasetyp@*)&query_work_size;
-        lrwork = (fortran_int)query_rwork_size;
-        liwork = query_iwork_size;
-    }
-
-    mem_buff2 = malloc(lwork*sizeof(@typ@) +
-                       lrwork*sizeof(@basetyp@) +
-                       liwork*sizeof(fortran_int));
-    if (!mem_buff2) {
-        goto error;
-    }
-
-    work = mem_buff2;
-    rwork = work + lwork*sizeof(@typ@);
-    iwork = rwork + lrwork*sizeof(@basetyp@);
-
-    params->WORK = work;
-    params->RWORK = rwork;
-    params->IWORK = iwork;
-    params->LWORK = lwork;
-    params->LRWORK = lrwork;
-    params->LIWORK = liwork;
-
-    return 1;
-
-    /* something failed */
-error:
-    memset(params, 0, sizeof(*params));
-    free(mem_buff2);
-    free(mem_buff);
-
-    return 0;
-}
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #BASETYPE = FLOAT, DOUBLE, FLOAT, DOUBLE#
-   #typ = npy_float, npy_double, npy_cfloat, npy_cdouble#
-   #basetyp = npy_float, npy_double, npy_float, npy_double#
-   #lapack_func = ssyevd, dsyevd, cheevd, zheevd#
-**/
-/*
- * (M, M)->(M,)(M, M)
- * dimensions[1] -> M
- * args[0] -> A[in]
- * args[1] -> W
- * args[2] -> A[out]
- */
-
-static NPY_INLINE void
-release_@lapack_func@(EIGH_PARAMS_t *params)
-{
-    /* allocated memory in A and WORK */
-    free(params->A);
-    free(params->WORK);
-    memset(params, 0, sizeof(*params));
-}
-
-
-static NPY_INLINE void
-@TYPE@_eigh_wrapper(char JOBZ,
-                    char UPLO,
-                    char**args,
-                    npy_intp const *dimensions,
-                    npy_intp const *steps)
-{
-    ptrdiff_t outer_steps[3];
-    size_t iter;
-    size_t outer_dim = *dimensions++;
-    size_t op_count = (JOBZ=='N')?2:3;
-    EIGH_PARAMS_t eigh_params;
-    int error_occurred = get_fp_invalid_and_clear();
-
-    for (iter = 0; iter < op_count; ++iter) {
-        outer_steps[iter] = (ptrdiff_t) steps[iter];
-    }
-    steps += op_count;
-
-    if (init_@lapack_func@(&eigh_params,
-                           JOBZ,
-                           UPLO,
-                           (fortran_int)dimensions[0])) {
-        LINEARIZE_DATA_t matrix_in_ld;
-        LINEARIZE_DATA_t eigenvectors_out_ld;
-        LINEARIZE_DATA_t eigenvalues_out_ld;
-
-        init_linearize_data(&matrix_in_ld,
-                            eigh_params.N, eigh_params.N,
-                            steps[1], steps[0]);
-        init_linearize_data(&eigenvalues_out_ld,
-                            1, eigh_params.N,
-                            0, steps[2]);
-        if ('V' == eigh_params.JOBZ) {
-            init_linearize_data(&eigenvectors_out_ld,
-                                eigh_params.N, eigh_params.N,
-                                steps[4], steps[3]);
-        }
-
-        for (iter = 0; iter < outer_dim; ++iter) {
-            int not_ok;
-            /* copy the matrix in */
-            linearize_@TYPE@_matrix(eigh_params.A, args[0], &matrix_in_ld);
-            not_ok = call_@lapack_func@(&eigh_params);
-            if (!not_ok) {
-                /* lapack ok, copy result out */
-                delinearize_@BASETYPE@_matrix(args[1],
-                                              eigh_params.W,
-                                              &eigenvalues_out_ld);
-
-                if ('V' == eigh_params.JOBZ) {
-                    delinearize_@TYPE@_matrix(args[2],
-                                              eigh_params.A,
-                                              &eigenvectors_out_ld);
-                }
-            } else {
-                /* lapack fail, set result to nan */
-                error_occurred = 1;
-                nan_@BASETYPE@_matrix(args[1], &eigenvalues_out_ld);
-                if ('V' == eigh_params.JOBZ) {
-                    nan_@TYPE@_matrix(args[2], &eigenvectors_out_ld);
-                }
-            }
-            update_pointers((npy_uint8**)args, outer_steps, op_count);
-        }
-
-        release_@lapack_func@(&eigh_params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
- */
-static void
-@TYPE@_eighlo(char **args,
-              npy_intp const *dimensions,
-              npy_intp const *steps,
-              void *NPY_UNUSED(func))
-{
-    @TYPE@_eigh_wrapper('V', 'L', args, dimensions, steps);
-}
-
-static void
-@TYPE@_eighup(char **args,
-              npy_intp const *dimensions,
-              npy_intp const *steps,
-              void* NPY_UNUSED(func))
-{
-    @TYPE@_eigh_wrapper('V', 'U', args, dimensions, steps);
-}
-
-static void
-@TYPE@_eigvalshlo(char **args,
-                  npy_intp const *dimensions,
-                  npy_intp const *steps,
-                  void* NPY_UNUSED(func))
-{
-    @TYPE@_eigh_wrapper('N', 'L', args, dimensions, steps);
-}
-
-static void
-@TYPE@_eigvalshup(char **args,
-                  npy_intp const *dimensions,
-                  npy_intp const *steps,
-                  void* NPY_UNUSED(func))
-{
-    @TYPE@_eigh_wrapper('N', 'U', args, dimensions, steps);
-}
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                  /* Solve family (includes inv) */
-
-typedef struct gesv_params_struct
-{
-    void *A; /* A is (N, N) of base type */
-    void *B; /* B is (N, NRHS) of base type */
-    fortran_int * IPIV; /* IPIV is (N) */
-
-    fortran_int N;
-    fortran_int NRHS;
-    fortran_int LDA;
-    fortran_int LDB;
-} GESV_PARAMS_t;
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #typ = npy_float, npy_double, npy_cfloat, npy_cdouble#
-   #ftyp = fortran_real, fortran_doublereal,
-           fortran_complex, fortran_doublecomplex#
-   #lapack_func = sgesv, dgesv, cgesv, zgesv#
-*/
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(GESV_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->N, &params->NRHS,
-                          params->A, &params->LDA,
-                          params->IPIV,
-                          params->B, &params->LDB,
-                          &rv);
-    return rv;
-}
-
-/*
- * Initialize the parameters to use for the lapack function _gesv
- * Handles buffer allocation
- */
-static NPY_INLINE int
-init_@lapack_func@(GESV_PARAMS_t *params, fortran_int N, fortran_int NRHS)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *a, *b, *ipiv;
-    size_t safe_N = N;
-    size_t safe_NRHS = NRHS;
-    fortran_int ld = fortran_int_max(N, 1);
-    mem_buff = malloc(safe_N * safe_N * sizeof(@ftyp@) +
-                      safe_N * safe_NRHS*sizeof(@ftyp@) +
-                      safe_N * sizeof(fortran_int));
-    if (!mem_buff) {
-        goto error;
-    }
-    a = mem_buff;
-    b = a + safe_N * safe_N * sizeof(@ftyp@);
-    ipiv = b + safe_N * safe_NRHS * sizeof(@ftyp@);
-
-    params->A = a;
-    params->B = b;
-    params->IPIV = (fortran_int*)ipiv;
-    params->N = N;
-    params->NRHS = NRHS;
-    params->LDA = ld;
-    params->LDB = ld;
-
-    return 1;
- error:
-    free(mem_buff);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-static NPY_INLINE void
-release_@lapack_func@(GESV_PARAMS_t *params)
-{
-    /* memory block base is in A */
-    free(params->A);
-    memset(params, 0, sizeof(*params));
-}
-
-static void
-@TYPE@_solve(char **args, npy_intp const *dimensions, npy_intp const *steps,
-             void *NPY_UNUSED(func))
-{
-    GESV_PARAMS_t params;
-    fortran_int n, nrhs;
-    int error_occurred = get_fp_invalid_and_clear();
-    INIT_OUTER_LOOP_3
-
-    n = (fortran_int)dimensions[0];
-    nrhs = (fortran_int)dimensions[1];
-    if (init_@lapack_func@(&params, n, nrhs)) {
-        LINEARIZE_DATA_t a_in, b_in, r_out;
-
-        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
-        init_linearize_data(&b_in, nrhs, n, steps[3], steps[2]);
-        init_linearize_data(&r_out, nrhs, n, steps[5], steps[4]);
-
-        BEGIN_OUTER_LOOP_3
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.B, args[1], &b_in);
-            not_ok =call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[2], params.B, &r_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[2], &r_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-static void
-@TYPE@_solve1(char **args, npy_intp const *dimensions, npy_intp const *steps,
-              void *NPY_UNUSED(func))
-{
-    GESV_PARAMS_t params;
-    int error_occurred = get_fp_invalid_and_clear();
-    fortran_int n;
-    INIT_OUTER_LOOP_3
-
-    n = (fortran_int)dimensions[0];
-    if (init_@lapack_func@(&params, n, 1)) {
-        LINEARIZE_DATA_t a_in, b_in, r_out;
-        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
-        init_linearize_data(&b_in, 1, n, 1, steps[2]);
-        init_linearize_data(&r_out, 1, n, 1, steps[3]);
-
-        BEGIN_OUTER_LOOP_3
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.B, args[1], &b_in);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[2], params.B, &r_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[2], &r_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-static void
-@TYPE@_inv(char **args, npy_intp const *dimensions, npy_intp const *steps,
-           void *NPY_UNUSED(func))
-{
-    GESV_PARAMS_t params;
-    fortran_int n;
-    int error_occurred = get_fp_invalid_and_clear();
-    INIT_OUTER_LOOP_2
-
-    n = (fortran_int)dimensions[0];
-    if (init_@lapack_func@(&params, n, n)) {
-        LINEARIZE_DATA_t a_in, r_out;
-        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
-        init_linearize_data(&r_out, n, n, steps[3], steps[2]);
-
-        BEGIN_OUTER_LOOP_2
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            identity_@TYPE@_matrix(params.B, n);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[1], params.B, &r_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[1], &r_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-/**end repeat**/
-
-
-/* -------------------------------------------------------------------------- */
-                     /* Cholesky decomposition */
-
-typedef struct potr_params_struct
-{
-    void *A;
-    fortran_int N;
-    fortran_int LDA;
-    char UPLO;
-} POTR_PARAMS_t;
-
-/**begin repeat
-
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #ftyp = fortran_real, fortran_doublereal,
-           fortran_complex, fortran_doublecomplex#
-   #lapack_func = spotrf, dpotrf, cpotrf, zpotrf#
- */
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(POTR_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->UPLO,
-                          &params->N, params->A, &params->LDA,
-                          &rv);
-    return rv;
-}
-
-static NPY_INLINE int
-init_@lapack_func@(POTR_PARAMS_t *params, char UPLO, fortran_int N)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *a;
-    size_t safe_N = N;
-    fortran_int lda = fortran_int_max(N, 1);
-
-    mem_buff = malloc(safe_N * safe_N * sizeof(@ftyp@));
-    if (!mem_buff) {
-        goto error;
-    }
-
-    a = mem_buff;
-
-    params->A = a;
-    params->N = N;
-    params->LDA = lda;
-    params->UPLO = UPLO;
-
-    return 1;
- error:
-    free(mem_buff);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-static NPY_INLINE void
-release_@lapack_func@(POTR_PARAMS_t *params)
-{
-    /* memory block base in A */
-    free(params->A);
-    memset(params, 0, sizeof(*params));
-}
-
-static void
-@TYPE@_cholesky(char uplo, char **args, npy_intp const *dimensions, npy_intp const *steps)
-{
-    POTR_PARAMS_t params;
-    int error_occurred = get_fp_invalid_and_clear();
-    fortran_int n;
-    INIT_OUTER_LOOP_2
-
-    assert(uplo == 'L');
-
-    n = (fortran_int)dimensions[0];
-    if (init_@lapack_func@(&params, uplo, n)) {
-        LINEARIZE_DATA_t a_in, r_out;
-        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
-        init_linearize_data(&r_out, n, n, steps[3], steps[2]);
-        BEGIN_OUTER_LOOP_2
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                triu_@TYPE@_matrix(params.A, params.N);
-                delinearize_@TYPE@_matrix(args[1], params.A, &r_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[1], &r_out);
-            }
-        END_OUTER_LOOP
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-static void
-@TYPE@_cholesky_lo(char **args, npy_intp const *dimensions, npy_intp const *steps,
-                void *NPY_UNUSED(func))
-{
-    @TYPE@_cholesky('L', args, dimensions, steps);
-}
-
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                          /* eig family  */
-
-typedef struct geev_params_struct {
-    void *A;
-    void *WR; /* RWORK in complex versions, REAL W buffer for (sd)geev*/
-    void *WI;
-    void *VLR; /* REAL VL buffers for _geev where _ is s, d */
-    void *VRR; /* REAL VR buffers for _geev where _ is s, d */
-    void *WORK;
-    void *W;  /* final w */
-    void *VL; /* final vl */
-    void *VR; /* final vr */
-
-    fortran_int N;
-    fortran_int LDA;
-    fortran_int LDVL;
-    fortran_int LDVR;
-    fortran_int LWORK;
-
-    char JOBVL;
-    char JOBVR;
-} GEEV_PARAMS_t;
-
-static NPY_INLINE void
-dump_geev_params(const char *name, GEEV_PARAMS_t* params)
-{
-    TRACE_TXT("\n%s\n"
-
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-              "\t%10s: %p\n"\
-
-              "\t%10s: %d\n"\
-              "\t%10s: %d\n"\
-              "\t%10s: %d\n"\
-              "\t%10s: %d\n"\
-              "\t%10s: %d\n"\
-
-              "\t%10s: %c\n"\
-              "\t%10s: %c\n",
-
-              name,
-
-              "A", params->A,
-              "WR", params->WR,
-              "WI", params->WI,
-              "VLR", params->VLR,
-              "VRR", params->VRR,
-              "WORK", params->WORK,
-              "W", params->W,
-              "VL", params->VL,
-              "VR", params->VR,
-
-              "N", (int)params->N,
-              "LDA", (int)params->LDA,
-              "LDVL", (int)params->LDVL,
-              "LDVR", (int)params->LDVR,
-              "LWORK", (int)params->LWORK,
-
-              "JOBVL", params->JOBVL,
-              "JOBVR", params->JOBVR);
-}
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE#
-   #CTYPE = CFLOAT, CDOUBLE#
-   #typ = float, double#
-   #complextyp = COMPLEX_t, DOUBLECOMPLEX_t#
-   #lapack_func = sgeev, dgeev#
-   #zero = 0.0f, 0.0#
-*/
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(GEEV_PARAMS_t* params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->JOBVL, &params->JOBVR,
-                          &params->N, params->A, &params->LDA,
-                          params->WR, params->WI,
-                          params->VLR, &params->LDVL,
-                          params->VRR, &params->LDVR,
-                          params->WORK, &params->LWORK,
-                          &rv);
-    return rv;
-}
-
-static NPY_INLINE int
-init_@lapack_func@(GEEV_PARAMS_t *params, char jobvl, char jobvr, fortran_int n)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *wr, *wi, *vlr, *vrr, *work, *w, *vl, *vr;
-    size_t safe_n = n;
-    size_t a_size = safe_n * safe_n * sizeof(@typ@);
-    size_t wr_size = safe_n * sizeof(@typ@);
-    size_t wi_size = safe_n * sizeof(@typ@);
-    size_t vlr_size = jobvl=='V' ? safe_n * safe_n * sizeof(@typ@) : 0;
-    size_t vrr_size = jobvr=='V' ? safe_n * safe_n * sizeof(@typ@) : 0;
-    size_t w_size = wr_size*2;
-    size_t vl_size = vlr_size*2;
-    size_t vr_size = vrr_size*2;
-    size_t work_count = 0;
-    fortran_int ld = fortran_int_max(n, 1);
-
-    /* allocate data for known sizes (all but work) */
-    mem_buff = malloc(a_size + wr_size + wi_size +
-                      vlr_size + vrr_size +
-                      w_size + vl_size + vr_size);
-    if (!mem_buff) {
-        goto error;
-    }
-
-    a = mem_buff;
-    wr = a + a_size;
-    wi = wr + wr_size;
-    vlr = wi + wi_size;
-    vrr = vlr + vlr_size;
-    w = vrr + vrr_size;
-    vl = w + w_size;
-    vr = vl + vl_size;
-
-    params->A = a;
-    params->WR = wr;
-    params->WI = wi;
-    params->VLR = vlr;
-    params->VRR = vrr;
-    params->W = w;
-    params->VL = vl;
-    params->VR = vr;
-    params->N = n;
-    params->LDA = ld;
-    params->LDVL = ld;
-    params->LDVR = ld;
-    params->JOBVL = jobvl;
-    params->JOBVR = jobvr;
-
-    /* Work size query */
-    {
-        @typ@ work_size_query;
-
-        params->LWORK = -1;
-        params->WORK = &work_size_query;
-
-        if (call_@lapack_func@(params) != 0) {
-            goto error;
-        }
-
-        work_count = (size_t)work_size_query;
-    }
-
-    mem_buff2 = malloc(work_count*sizeof(@typ@));
-    if (!mem_buff2) {
-        goto error;
-    }
-    work = mem_buff2;
-
-    params->LWORK = (fortran_int)work_count;
-    params->WORK = work;
-
-    return 1;
- error:
-    free(mem_buff2);
-    free(mem_buff);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-static NPY_INLINE void
-mk_@TYPE@_complex_array_from_real(@complextyp@ *c, const @typ@ *re, size_t n)
-{
-    size_t iter;
-    for (iter = 0; iter < n; ++iter) {
-        c[iter].array[0] = re[iter];
-        c[iter].array[1] = @zero@;
-    }
-}
-
-static NPY_INLINE void
-mk_@TYPE@_complex_array(@complextyp@ *c,
-                        const @typ@ *re,
-                        const @typ@ *im,
-                        size_t n)
-{
-    size_t iter;
-    for (iter = 0; iter < n; ++iter) {
-        c[iter].array[0] = re[iter];
-        c[iter].array[1] = im[iter];
-    }
-}
-
-static NPY_INLINE void
-mk_@TYPE@_complex_array_conjugate_pair(@complextyp@ *c,
-                                       const @typ@ *r,
-                                       size_t n)
-{
-    size_t iter;
-    for (iter = 0; iter < n; ++iter) {
-        @typ@ re = r[iter];
-        @typ@ im = r[iter+n];
-        c[iter].array[0] = re;
-        c[iter].array[1] = im;
-        c[iter+n].array[0] = re;
-        c[iter+n].array[1] = -im;
-    }
-}
-
-/*
- * make the complex eigenvectors from the real array produced by sgeev/dgeev.
- * c is the array where the results will be left.
- * r is the source array of reals produced by sgeev/dgeev
- * i is the eigenvalue imaginary part produced by sgeev/dgeev
- * n is the order of the matrix (the matrix is n by n)
- */
-static NPY_INLINE void
-mk_@lapack_func@_complex_eigenvectors(@complextyp@ *c,
-                                      const @typ@ *r,
-                                      const @typ@ *i,
-                                      size_t n)
-{
-    size_t iter = 0;
-    while (iter < n)
-    {
-        if (i[iter] ==  @zero@) {
-            /* eigenvalue was real, eigenvectors as well...  */
-            mk_@TYPE@_complex_array_from_real(c, r, n);
-            c += n;
-            r += n;
-            iter ++;
-        } else {
-            /* eigenvalue was complex, generate a pair of eigenvectors */
-            mk_@TYPE@_complex_array_conjugate_pair(c, r, n);
-            c += 2*n;
-            r += 2*n;
-            iter += 2;
-        }
-    }
-}
-
-
-static NPY_INLINE void
-process_@lapack_func@_results(GEEV_PARAMS_t *params)
-{
-    /* REAL versions of geev need the results to be translated
-     * into complex versions. This is the way to deal with imaginary
-     * results. In our gufuncs we will always return complex arrays!
-     */
-    mk_@TYPE@_complex_array(params->W, params->WR, params->WI, params->N);
-
-    /* handle the eigenvectors */
-    if ('V' == params->JOBVL) {
-        mk_@lapack_func@_complex_eigenvectors(params->VL, params->VLR,
-                                              params->WI, params->N);
-    }
-    if ('V' == params->JOBVR) {
-        mk_@lapack_func@_complex_eigenvectors(params->VR, params->VRR,
-                                              params->WI, params->N);
-    }
-}
-
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = CFLOAT, CDOUBLE#
-   #typ = COMPLEX_t, DOUBLECOMPLEX_t#
-   #ftyp = fortran_complex, fortran_doublecomplex#
-   #realtyp = float, double#
-   #lapack_func = cgeev, zgeev#
- */
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(GEEV_PARAMS_t* params)
-{
-    fortran_int rv;
-
-    LAPACK(@lapack_func@)(&params->JOBVL, &params->JOBVR,
-                          &params->N, params->A, &params->LDA,
-                          params->W,
-                          params->VL, &params->LDVL,
-                          params->VR, &params->LDVR,
-                          params->WORK, &params->LWORK,
-                          params->WR, /* actually RWORK */
-                          &rv);
-    return rv;
-}
-
-static NPY_INLINE int
-init_@lapack_func@(GEEV_PARAMS_t* params,
-                   char jobvl,
-                   char jobvr,
-                   fortran_int n)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *w, *vl, *vr, *work, *rwork;
-    size_t safe_n = n;
-    size_t a_size = safe_n * safe_n * sizeof(@ftyp@);
-    size_t w_size = safe_n * sizeof(@ftyp@);
-    size_t vl_size = jobvl=='V'? safe_n * safe_n * sizeof(@ftyp@) : 0;
-    size_t vr_size = jobvr=='V'? safe_n * safe_n * sizeof(@ftyp@) : 0;
-    size_t rwork_size = 2 * safe_n * sizeof(@realtyp@);
-    size_t work_count = 0;
-    size_t total_size = a_size + w_size + vl_size + vr_size + rwork_size;
-    fortran_int ld = fortran_int_max(n, 1);
-
-    mem_buff = malloc(total_size);
-    if (!mem_buff) {
-        goto error;
-    }
-
-    a = mem_buff;
-    w = a + a_size;
-    vl = w + w_size;
-    vr = vl + vl_size;
-    rwork = vr + vr_size;
-
-    params->A = a;
-    params->WR = rwork;
-    params->WI = NULL;
-    params->VLR = NULL;
-    params->VRR = NULL;
-    params->VL = vl;
-    params->VR = vr;
-    params->W = w;
-    params->N = n;
-    params->LDA = ld;
-    params->LDVL = ld;
-    params->LDVR = ld;
-    params->JOBVL = jobvl;
-    params->JOBVR = jobvr;
-
-    /* Work size query */
-    {
-        @typ@ work_size_query;
-
-        params->LWORK = -1;
-        params->WORK = &work_size_query;
-
-        if (call_@lapack_func@(params) != 0) {
-            goto error;
-        }
-
-        work_count = (size_t) work_size_query.array[0];
-        /* Fix a bug in lapack 3.0.0 */
-        if(work_count == 0) work_count = 1;
-    }
-
-    mem_buff2 = malloc(work_count*sizeof(@ftyp@));
-    if (!mem_buff2) {
-        goto error;
-    }
-
-    work = mem_buff2;
-
-    params->LWORK = (fortran_int)work_count;
-    params->WORK = work;
-
-    return 1;
- error:
-    free(mem_buff2);
-    free(mem_buff);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-
-static NPY_INLINE void
-process_@lapack_func@_results(GEEV_PARAMS_t *NPY_UNUSED(params))
-{
-    /* nothing to do here, complex versions are ready to copy out */
-}
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CDOUBLE#
-   #COMPLEXTYPE = CFLOAT, CDOUBLE, CDOUBLE#
-   #ftype = fortran_real, fortran_doublereal, fortran_doublecomplex#
-   #lapack_func = sgeev, dgeev, zgeev#
- */
-
-static NPY_INLINE void
-release_@lapack_func@(GEEV_PARAMS_t *params)
-{
-    free(params->WORK);
-    free(params->A);
-    memset(params, 0, sizeof(*params));
-}
-
-static NPY_INLINE void
-@TYPE@_eig_wrapper(char JOBVL,
-                   char JOBVR,
-                   char**args,
-                   npy_intp const *dimensions,
-                   npy_intp const *steps)
-{
-    ptrdiff_t outer_steps[4];
-    size_t iter;
-    size_t outer_dim = *dimensions++;
-    size_t op_count = 2;
-    int error_occurred = get_fp_invalid_and_clear();
-    GEEV_PARAMS_t geev_params;
-
-    assert(JOBVL == 'N');
-
-    STACK_TRACE;
-    op_count += 'V'==JOBVL?1:0;
-    op_count += 'V'==JOBVR?1:0;
-
-    for (iter = 0; iter < op_count; ++iter) {
-        outer_steps[iter] = (ptrdiff_t) steps[iter];
-    }
-    steps += op_count;
-
-    if (init_@lapack_func@(&geev_params,
-                           JOBVL, JOBVR,
-                           (fortran_int)dimensions[0])) {
-        LINEARIZE_DATA_t a_in;
-        LINEARIZE_DATA_t w_out;
-        LINEARIZE_DATA_t vl_out;
-        LINEARIZE_DATA_t vr_out;
-
-        init_linearize_data(&a_in,
-                            geev_params.N, geev_params.N,
-                            steps[1], steps[0]);
-        steps += 2;
-        init_linearize_data(&w_out,
-                            1, geev_params.N,
-                            0, steps[0]);
-        steps += 1;
-        if ('V' == geev_params.JOBVL) {
-            init_linearize_data(&vl_out,
-                                geev_params.N, geev_params.N,
-                                steps[1], steps[0]);
-            steps += 2;
-        }
-        if ('V' == geev_params.JOBVR) {
-            init_linearize_data(&vr_out,
-                                geev_params.N, geev_params.N,
-                                steps[1], steps[0]);
-        }
-
-        for (iter = 0; iter < outer_dim; ++iter) {
-            int not_ok;
-            char **arg_iter = args;
-            /* copy the matrix in */
-            linearize_@TYPE@_matrix(geev_params.A, *arg_iter++, &a_in);
-            not_ok = call_@lapack_func@(&geev_params);
-
-            if (!not_ok) {
-                process_@lapack_func@_results(&geev_params);
-                delinearize_@COMPLEXTYPE@_matrix(*arg_iter++,
-                                                 geev_params.W,
-                                                 &w_out);
-
-                if ('V' == geev_params.JOBVL) {
-                    delinearize_@COMPLEXTYPE@_matrix(*arg_iter++,
-                                                     geev_params.VL,
-                                                     &vl_out);
-                }
-                if ('V' == geev_params.JOBVR) {
-                    delinearize_@COMPLEXTYPE@_matrix(*arg_iter++,
-                                                     geev_params.VR,
-                                                     &vr_out);
-                }
-            } else {
-                /* geev failed */
-                error_occurred = 1;
-                nan_@COMPLEXTYPE@_matrix(*arg_iter++, &w_out);
-                if ('V' == geev_params.JOBVL) {
-                    nan_@COMPLEXTYPE@_matrix(*arg_iter++, &vl_out);
-                }
-                if ('V' == geev_params.JOBVR) {
-                    nan_@COMPLEXTYPE@_matrix(*arg_iter++, &vr_out);
-                }
-            }
-            update_pointers((npy_uint8**)args, outer_steps, op_count);
-        }
-
-        release_@lapack_func@(&geev_params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-static void
-@TYPE@_eig(char **args,
-           npy_intp const *dimensions,
-           npy_intp const *steps,
-           void *NPY_UNUSED(func))
-{
-    @TYPE@_eig_wrapper('N', 'V', args, dimensions, steps);
-}
-
-static void
-@TYPE@_eigvals(char **args,
-               npy_intp const *dimensions,
-               npy_intp const *steps,
-               void *NPY_UNUSED(func))
-{
-    @TYPE@_eig_wrapper('N', 'N', args, dimensions, steps);
-}
-
-/**end repeat**/
-
-
-/* -------------------------------------------------------------------------- */
-                 /* singular value decomposition  */
-
-typedef struct gessd_params_struct
-{
-    void *A;
-    void *S;
-    void *U;
-    void *VT;
-    void *WORK;
-    void *RWORK;
-    void *IWORK;
-
-    fortran_int M;
-    fortran_int N;
-    fortran_int LDA;
-    fortran_int LDU;
-    fortran_int LDVT;
-    fortran_int LWORK;
-    char JOBZ;
-} GESDD_PARAMS_t;
-
-
-static NPY_INLINE void
-dump_gesdd_params(const char *name,
-                  GESDD_PARAMS_t *params)
-{
-    TRACE_TXT("\n%s:\n"\
-
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-
-              "%14s: %15c'%c'\n",
-
-              name,
-
-              "A", params->A,
-              "S", params->S,
-              "U", params->U,
-              "VT", params->VT,
-              "WORK", params->WORK,
-              "RWORK", params->RWORK,
-              "IWORK", params->IWORK,
-
-              "M", (int)params->M,
-              "N", (int)params->N,
-              "LDA", (int)params->LDA,
-              "LDU", (int)params->LDU,
-              "LDVT", (int)params->LDVT,
-              "LWORK", (int)params->LWORK,
-
-              "JOBZ", ' ', params->JOBZ);
-}
-
-static NPY_INLINE int
-compute_urows_vtcolumns(char jobz,
-                        fortran_int m, fortran_int n,
-                        fortran_int *urows, fortran_int *vtcolumns)
-{
-    fortran_int min_m_n = fortran_int_min(m, n);
-    switch(jobz)
-    {
-    case 'N':
-        *urows = 0;
-        *vtcolumns = 0;
-        break;
-    case 'A':
-        *urows = m;
-        *vtcolumns = n;
-        break;
-    case 'S':
-        {
-            *urows = min_m_n;
-            *vtcolumns = min_m_n;
-        }
-        break;
-    default:
-        return 0;
-    }
-
-    return 1;
-}
-
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE#
-   #lapack_func = sgesdd, dgesdd#
-   #ftyp = fortran_real, fortran_doublereal#
- */
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(GESDD_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->JOBZ, &params->M, &params->N,
-                          params->A, &params->LDA,
-                          params->S,
-                          params->U, &params->LDU,
-                          params->VT, &params->LDVT,
-                          params->WORK, &params->LWORK,
-                          params->IWORK,
-                          &rv);
-    return rv;
-}
-
-static NPY_INLINE int
-init_@lapack_func@(GESDD_PARAMS_t *params,
-                   char jobz,
-                   fortran_int m,
-                   fortran_int n)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *s, *u, *vt, *work, *iwork;
-    size_t safe_m = m;
-    size_t safe_n = n;
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    fortran_int min_m_n = fortran_int_min(m, n);
-    size_t safe_min_m_n = min_m_n;
-    size_t s_size = safe_min_m_n * sizeof(@ftyp@);
-    fortran_int u_row_count, vt_column_count;
-    size_t safe_u_row_count, safe_vt_column_count;
-    size_t u_size, vt_size;
-    fortran_int work_count;
-    size_t work_size;
-    size_t iwork_size = 8 * safe_min_m_n * sizeof(fortran_int);
-    fortran_int ld = fortran_int_max(m, 1);
-
-    if (!compute_urows_vtcolumns(jobz, m, n, &u_row_count, &vt_column_count)) {
-        goto error;
-    }
-
-    safe_u_row_count = u_row_count;
-    safe_vt_column_count = vt_column_count;
-
-    u_size = safe_u_row_count * safe_m * sizeof(@ftyp@);
-    vt_size = safe_n * safe_vt_column_count * sizeof(@ftyp@);
-
-    mem_buff = malloc(a_size + s_size + u_size + vt_size + iwork_size);
-
-    if (!mem_buff) {
-        goto error;
-    }
-
-    a = mem_buff;
-    s = a + a_size;
-    u = s + s_size;
-    vt = u + u_size;
-    iwork = vt + vt_size;
-
-    /* fix vt_column_count so that it is a valid lapack parameter (0 is not) */
-    vt_column_count = fortran_int_max(1, vt_column_count);
-
-    params->M = m;
-    params->N = n;
-    params->A = a;
-    params->S = s;
-    params->U = u;
-    params->VT = vt;
-    params->RWORK = NULL;
-    params->IWORK = iwork;
-    params->LDA = ld;
-    params->LDU = ld;
-    params->LDVT = vt_column_count;
-    params->JOBZ = jobz;
-
-    /* Work size query */
-    {
-        @ftyp@ work_size_query;
-
-        params->LWORK = -1;
-        params->WORK = &work_size_query;
-
-        if (call_@lapack_func@(params) != 0) {
-            goto error;
-        }
-
-        work_count = (fortran_int)work_size_query;
-        /* Fix a bug in lapack 3.0.0 */
-        if(work_count == 0) work_count = 1;
-        work_size = (size_t)work_count * sizeof(@ftyp@);
-    }
-
-    mem_buff2 = malloc(work_size);
-    if (!mem_buff2) {
-        goto error;
-    }
-
-    work = mem_buff2;
-
-    params->LWORK = work_count;
-    params->WORK = work;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE = CFLOAT, CDOUBLE#
-   #ftyp = fortran_complex, fortran_doublecomplex#
-   #frealtyp = fortran_real, fortran_doublereal#
-   #typ = COMPLEX_t, DOUBLECOMPLEX_t#
-   #lapack_func = cgesdd, zgesdd#
- */
-
-static NPY_INLINE fortran_int
-call_@lapack_func@(GESDD_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->JOBZ, &params->M, &params->N,
-                          params->A, &params->LDA,
-                          params->S,
-                          params->U, &params->LDU,
-                          params->VT, &params->LDVT,
-                          params->WORK, &params->LWORK,
-                          params->RWORK,
-                          params->IWORK,
-                          &rv);
-    return rv;
-}
-
-static NPY_INLINE int
-init_@lapack_func@(GESDD_PARAMS_t *params,
-                   char jobz,
-                   fortran_int m,
-                   fortran_int n)
-{
-    npy_uint8 *mem_buff = NULL, *mem_buff2 = NULL;
-    npy_uint8 *a,*s, *u, *vt, *work, *rwork, *iwork;
-    size_t a_size, s_size, u_size, vt_size, work_size, rwork_size, iwork_size;
-    size_t safe_u_row_count, safe_vt_column_count;
-    fortran_int u_row_count, vt_column_count, work_count;
-    size_t safe_m = m;
-    size_t safe_n = n;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    size_t safe_min_m_n = min_m_n;
-    fortran_int ld = fortran_int_max(m, 1);
-
-    if (!compute_urows_vtcolumns(jobz, m, n, &u_row_count, &vt_column_count)) {
-        goto error;
-    }
-
-    safe_u_row_count = u_row_count;
-    safe_vt_column_count = vt_column_count;
-
-    a_size = safe_m * safe_n * sizeof(@ftyp@);
-    s_size = safe_min_m_n * sizeof(@frealtyp@);
-    u_size = safe_u_row_count * safe_m * sizeof(@ftyp@);
-    vt_size = safe_n * safe_vt_column_count * sizeof(@ftyp@);
-    rwork_size = 'N'==jobz?
-        (7 * safe_min_m_n) :
-        (5*safe_min_m_n * safe_min_m_n + 5*safe_min_m_n);
-    rwork_size *= sizeof(@ftyp@);
-    iwork_size = 8 * safe_min_m_n* sizeof(fortran_int);
-
-    mem_buff = malloc(a_size +
-                      s_size +
-                      u_size +
-                      vt_size +
-                      rwork_size +
-                      iwork_size);
-    if (!mem_buff) {
-        goto error;
-    }
-
-    a = mem_buff;
-    s = a + a_size;
-    u = s + s_size;
-    vt = u + u_size;
-    rwork = vt + vt_size;
-    iwork = rwork + rwork_size;
-
-    /* fix vt_column_count so that it is a valid lapack parameter (0 is not) */
-    vt_column_count = fortran_int_max(1, vt_column_count);
-
-    params->A = a;
-    params->S = s;
-    params->U = u;
-    params->VT = vt;
-    params->RWORK = rwork;
-    params->IWORK = iwork;
-    params->M = m;
-    params->N = n;
-    params->LDA = ld;
-    params->LDU = ld;
-    params->LDVT = vt_column_count;
-    params->JOBZ = jobz;
-
-    /* Work size query */
-    {
-        @ftyp@ work_size_query;
-
-        params->LWORK = -1;
-        params->WORK = &work_size_query;
-
-        if (call_@lapack_func@(params) != 0) {
-            goto error;
-        }
-
-        work_count = (fortran_int)((@typ@*)&work_size_query)->array[0];
-        /* Fix a bug in lapack 3.0.0 */
-        if(work_count == 0) work_count = 1;
-        work_size = (size_t)work_count * sizeof(@ftyp@);
-    }
-
-    mem_buff2 = malloc(work_size);
-    if (!mem_buff2) {
-        goto error;
-    }
-
-    work = mem_buff2;
-
-    params->LWORK = work_count;
-    params->WORK = work;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff2);
-    free(mem_buff);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
-   #REALTYPE = FLOAT, DOUBLE, FLOAT, DOUBLE#
-   #lapack_func = sgesdd, dgesdd, cgesdd, zgesdd#
- */
-static NPY_INLINE void
-release_@lapack_func@(GESDD_PARAMS_t* params)
-{
-    /* A and WORK contain allocated blocks */
-    free(params->A);
-    free(params->WORK);
-    memset(params, 0, sizeof(*params));
-}
-
-static NPY_INLINE void
-@TYPE@_svd_wrapper(char JOBZ,
-                   char **args,
-                   npy_intp const *dimensions,
-                   npy_intp const *steps)
-{
-    ptrdiff_t outer_steps[4];
-    int error_occurred = get_fp_invalid_and_clear();
-    size_t iter;
-    size_t outer_dim = *dimensions++;
-    size_t op_count = (JOBZ=='N')?2:4;
-    GESDD_PARAMS_t params;
-
-    for (iter = 0; iter < op_count; ++iter) {
-        outer_steps[iter] = (ptrdiff_t) steps[iter];
-    }
-    steps += op_count;
-
-    if (init_@lapack_func@(&params,
-                           JOBZ,
-                           (fortran_int)dimensions[0],
-                           (fortran_int)dimensions[1])) {
-        LINEARIZE_DATA_t a_in, u_out, s_out, v_out;
-        fortran_int min_m_n = params.M < params.N ? params.M : params.N;
-
-        init_linearize_data(&a_in, params.N, params.M, steps[1], steps[0]);
-        if ('N' == params.JOBZ) {
-            /* only the singular values are wanted */
-            init_linearize_data(&s_out, 1, min_m_n, 0, steps[2]);
-        } else {
-            fortran_int u_columns, v_rows;
-            if ('S' == params.JOBZ) {
-                u_columns = min_m_n;
-                v_rows = min_m_n;
-            } else { /* JOBZ == 'A' */
-                u_columns = params.M;
-                v_rows = params.N;
-            }
-            init_linearize_data(&u_out,
-                                u_columns, params.M,
-                                steps[3], steps[2]);
-            init_linearize_data(&s_out,
-                                1, min_m_n,
-                                0, steps[4]);
-            init_linearize_data(&v_out,
-                                params.N, v_rows,
-                                steps[6], steps[5]);
-        }
-
-        for (iter = 0; iter < outer_dim; ++iter) {
-            int not_ok;
-            /* copy the matrix in */
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                if ('N' == params.JOBZ) {
-                    delinearize_@REALTYPE@_matrix(args[1], params.S, &s_out);
-                } else {
-                    if ('A' == params.JOBZ && min_m_n == 0) {
-                        /* Lapack has betrayed us and left these uninitialized,
-                         * so produce an identity matrix for whichever of u
-                         * and v is not empty.
-                         */
-                        identity_@TYPE@_matrix(params.U, params.M);
-                        identity_@TYPE@_matrix(params.VT, params.N);
-                    }
-
-                    delinearize_@TYPE@_matrix(args[1], params.U, &u_out);
-                    delinearize_@REALTYPE@_matrix(args[2], params.S, &s_out);
-                    delinearize_@TYPE@_matrix(args[3], params.VT, &v_out);
-                }
-            } else {
-                error_occurred = 1;
-                if ('N' == params.JOBZ) {
-                    nan_@REALTYPE@_matrix(args[1], &s_out);
-                } else {
-                    nan_@TYPE@_matrix(args[1], &u_out);
-                    nan_@REALTYPE@_matrix(args[2], &s_out);
-                    nan_@TYPE@_matrix(args[3], &v_out);
-                }
-            }
-            update_pointers((npy_uint8**)args, outer_steps, op_count);
-        }
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-/**end repeat*/
-
-
-/* svd gufunc entry points */
-/**begin repeat
-   #TYPE = FLOAT, DOUBLE, CFLOAT, CDOUBLE#
- */
-static void
-@TYPE@_svd_N(char **args,
-             npy_intp const *dimensions,
-             npy_intp const *steps,
-             void *NPY_UNUSED(func))
-{
-    @TYPE@_svd_wrapper('N', args, dimensions, steps);
-}
-
-static void
-@TYPE@_svd_S(char **args,
-             npy_intp const *dimensions,
-             npy_intp const *steps,
-             void *NPY_UNUSED(func))
-{
-    @TYPE@_svd_wrapper('S', args, dimensions, steps);
-}
-
-static void
-@TYPE@_svd_A(char **args,
-             npy_intp const *dimensions,
-             npy_intp const *steps,
-             void *NPY_UNUSED(func))
-{
-    @TYPE@_svd_wrapper('A', args, dimensions, steps);
-}
-
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                 /* qr (modes - r, raw) */
-
-typedef struct geqfr_params_struct
-{
-    fortran_int M;
-    fortran_int N;
-    void *A;
-    fortran_int LDA;
-    void* TAU;
-    void *WORK;
-    fortran_int LWORK;
-} GEQRF_PARAMS_t;
-
-
-static inline void
-dump_geqrf_params(const char *name,
-                  GEQRF_PARAMS_t *params)
-{
-    TRACE_TXT("\n%s:\n"\
-
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n",
-
-              name,
-
-              "A", params->A,
-              "TAU", params->TAU,
-              "WORK", params->WORK,
-
-              "M", (int)params->M,
-              "N", (int)params->N,
-              "LDA", (int)params->LDA,
-              "LWORK", (int)params->LWORK);
-}
-
-/**begin repeat
-   #lapack_func=dgeqrf,zgeqrf#
- */
-
-static inline fortran_int
-call_@lapack_func@(GEQRF_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->M, &params->N,
-                          params->A, &params->LDA,
-                          params->TAU,
-                          params->WORK, &params->LWORK,
-                          &rv);
-    return rv;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE=DOUBLE#
-   #lapack_func=dgeqrf#
-   #ftyp=fortran_doublereal#
- */
-static inline int
-init_@lapack_func@(GEQRF_PARAMS_t *params,
-                   fortran_int m,
-                   fortran_int n)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *tau, *work;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    size_t safe_min_m_n = min_m_n;
-    size_t safe_m = m;
-    size_t safe_n = n;
-
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    size_t tau_size = safe_min_m_n * sizeof(@ftyp@);
-
-    fortran_int work_count;
-    size_t work_size;
-    fortran_int lda = fortran_int_max(1, m);
-
-    mem_buff = malloc(a_size + tau_size);
-
-    if (!mem_buff)
-        goto error;
-
-    a = mem_buff;
-    tau = a + a_size;
-    memset(tau, 0, tau_size);
-
-
-    params->M = m;
-    params->N = n;
-    params->A = a;
-    params->TAU = tau;
-    params->LDA = lda;
-
-    {
-        /* compute optimal work size */
-
-        @ftyp@ work_size_query;
-
-        params->WORK = &work_size_query;
-        params->LWORK = -1;
-
-        if (call_@lapack_func@(params) != 0)
-            goto error;
-
-        work_count = (fortran_int) *(@ftyp@*) params->WORK;
-
-    }
-
-    params->LWORK = fortran_int_max(fortran_int_max(1, n), work_count);
-
-    work_size = (size_t) params->LWORK * sizeof(@ftyp@);
-    mem_buff2 = malloc(work_size);
-    if (!mem_buff2)
-        goto error;
-
-    work = mem_buff2;
-
-    params->WORK = work;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE=CDOUBLE#
-   #lapack_func=zgeqrf#
-   #ftyp=fortran_doublecomplex#
- */
-static inline int
-init_@lapack_func@(GEQRF_PARAMS_t *params,
-                   fortran_int m,
-                   fortran_int n)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *tau, *work;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    size_t safe_min_m_n = min_m_n;
-    size_t safe_m = m;
-    size_t safe_n = n;
-
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    size_t tau_size = safe_min_m_n * sizeof(@ftyp@);
-
-    fortran_int work_count;
-    size_t work_size;
-    fortran_int lda = fortran_int_max(1, m);
-
-    mem_buff = malloc(a_size + tau_size);
-
-    if (!mem_buff)
-        goto error;
-
-    a = mem_buff;
-    tau = a + a_size;
-    memset(tau, 0, tau_size);
-
-
-    params->M = m;
-    params->N = n;
-    params->A = a;
-    params->TAU = tau;
-    params->LDA = lda;
-
-    {
-        /* compute optimal work size */
-
-        @ftyp@ work_size_query;
-
-        params->WORK = &work_size_query;
-        params->LWORK = -1;
-
-        if (call_@lapack_func@(params) != 0)
-            goto error;
-
-        work_count = (fortran_int) ((@ftyp@*)params->WORK)->r;
-
-    }
-
-    params->LWORK = fortran_int_max(fortran_int_max(1, n),
-                                    work_count);
-
-    work_size = (size_t) params->LWORK * sizeof(@ftyp@);
-
-    mem_buff2 = malloc(work_size);
-    if (!mem_buff2)
-        goto error;
-
-    work = mem_buff2;
-
-    params->WORK = work;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #lapack_func=dgeqrf,zgeqrf#
- */
-static inline void
-release_@lapack_func@(GEQRF_PARAMS_t* params)
-{
-    /* A and WORK contain allocated blocks */
-    free(params->A);
-    free(params->WORK);
-    memset(params, 0, sizeof(*params));
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE=DOUBLE,CDOUBLE#
-   #REALTYPE=DOUBLE,DOUBLE#
-   #lapack_func=dgeqrf,zgeqrf#
-   #typ     = npy_double,npy_cdouble#
-   #basetyp = npy_double,npy_double#
-   #ftyp = fortran_doublereal,fortran_doublecomplex#
-   #cmplx = 0, 1#
- */
-static void
-@TYPE@_qr_r_raw(char **args, npy_intp const *dimensions, npy_intp const *steps,
-          void *NPY_UNUSED(func))
-{
-    GEQRF_PARAMS_t params;
-    int error_occurred = get_fp_invalid_and_clear();
-    fortran_int n, m;
-
-    INIT_OUTER_LOOP_2
-
-    m = (fortran_int)dimensions[0];
-    n = (fortran_int)dimensions[1];
-
-    if (init_@lapack_func@(&params, m, n)) {
-        LINEARIZE_DATA_t a_in, tau_out;
-
-        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
-        init_linearize_data(&tau_out, 1, fortran_int_min(m, n), 1, steps[2]);
-
-        BEGIN_OUTER_LOOP_2
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[0], params.A, &a_in);
-                delinearize_@TYPE@_matrix(args[1], params.TAU, &tau_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[1], &tau_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                 /* qr common code (modes - reduced and complete) */ 
-
-typedef struct gqr_params_struct
-{
-    fortran_int M;
-    fortran_int MC;
-    fortran_int MN;
-    void* A;
-    void *Q;
-    fortran_int LDA;
-    void* TAU;
-    void *WORK;
-    fortran_int LWORK;
-} GQR_PARAMS_t;
-
-/**begin repeat
-   #lapack_func=dorgqr,zungqr#
- */
-
-static inline fortran_int
-call_@lapack_func@(GQR_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->M, &params->MC, &params->MN,
-                          params->Q, &params->LDA,
-                          params->TAU,
-                          params->WORK, &params->LWORK,
-                          &rv);
-    return rv;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #lapack_func=dorgqr#
-   #ftyp=fortran_doublereal#
- */
-static inline int
-init_@lapack_func@_common(GQR_PARAMS_t *params,
-                          fortran_int m,
-                          fortran_int n,
-                          fortran_int mc)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *q, *tau, *work;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    size_t safe_mc = mc;
-    size_t safe_min_m_n = min_m_n;
-    size_t safe_m = m;
-    size_t safe_n = n;
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    size_t q_size = safe_m * safe_mc * sizeof(@ftyp@);
-    size_t tau_size = safe_min_m_n * sizeof(@ftyp@);
-
-    fortran_int work_count;
-    size_t work_size;
-    fortran_int lda = fortran_int_max(1, m);
-
-    mem_buff = malloc(q_size + tau_size + a_size);
-
-    if (!mem_buff)
-        goto error;
-
-    q = mem_buff;
-    tau = q + q_size;
-    a = tau + tau_size;
-
-
-    params->M = m;
-    params->MC = mc;
-    params->MN = min_m_n;
-    params->A = a;
-    params->Q = q;
-    params->TAU = tau;
-    params->LDA = lda;
-
-    {
-        /* compute optimal work size */
-        @ftyp@ work_size_query;
-
-        params->WORK = &work_size_query;
-        params->LWORK = -1;
-
-        if (call_@lapack_func@(params) != 0)
-            goto error;
-
-        work_count = (fortran_int) *(@ftyp@*) params->WORK;
-
-    }
-
-    params->LWORK = fortran_int_max(fortran_int_max(1, n), work_count);
-
-    work_size = (size_t) params->LWORK * sizeof(@ftyp@);
-
-    mem_buff2 = malloc(work_size);
-    if (!mem_buff2)
-        goto error;
-
-    work = mem_buff2;
-
-    params->WORK = work;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #lapack_func=zungqr#
-   #ftyp=fortran_doublecomplex#
- */
-static inline int
-init_@lapack_func@_common(GQR_PARAMS_t *params,
-                          fortran_int m,
-                          fortran_int n,
-                          fortran_int mc)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *q, *tau, *work;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    size_t safe_mc = mc;
-    size_t safe_min_m_n = min_m_n;
-    size_t safe_m = m;
-    size_t safe_n = n;
-
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    size_t q_size = safe_m * safe_mc * sizeof(@ftyp@);
-    size_t tau_size = safe_min_m_n * sizeof(@ftyp@);
-
-    fortran_int work_count;
-    size_t work_size;
-    fortran_int lda = fortran_int_max(1, m);
-
-    mem_buff = malloc(q_size + tau_size + a_size);
-
-    if (!mem_buff)
-        goto error;
-
-    q = mem_buff;
-    tau = q + q_size;
-    a = tau + tau_size;
-
-
-    params->M = m;
-    params->MC = mc;
-    params->MN = min_m_n;
-    params->A = a;
-    params->Q = q;
-    params->TAU = tau;
-    params->LDA = lda;
-
-    {
-        /* compute optimal work size */
-        @ftyp@ work_size_query;
-
-        params->WORK = &work_size_query;
-        params->LWORK = -1;
-
-        if (call_@lapack_func@(params) != 0)
-            goto error;
-
-        work_count = (fortran_int) ((@ftyp@*)params->WORK)->r;
-
-    }
-
-    params->LWORK = fortran_int_max(fortran_int_max(1, n),
-                                    work_count);
-
-    work_size = (size_t) params->LWORK * sizeof(@ftyp@);
-
-    mem_buff2 = malloc(work_size);
-    if (!mem_buff2)
-        goto error;
-
-    work = mem_buff2;
-
-    params->WORK = work;
-    params->LWORK = work_count;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                 /* qr (modes - reduced) */
-
-
-static inline void
-dump_gqr_params(const char *name,
-                GQR_PARAMS_t *params)
-{
-    TRACE_TXT("\n%s:\n"\
-
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n",
-
-              name,
-
-              "Q", params->Q,
-              "TAU", params->TAU,
-              "WORK", params->WORK,
-
-              "M", (int)params->M,
-              "MC", (int)params->MC,
-              "MN", (int)params->MN,
-              "LDA", (int)params->LDA,
-              "LWORK", (int)params->LWORK);
-}
-
-/**begin repeat
-   #lapack_func=dorgqr,zungqr#
-   #ftyp=fortran_doublereal,fortran_doublecomplex#
- */
-static inline int
-init_@lapack_func@(GQR_PARAMS_t *params,
-                   fortran_int m,
-                   fortran_int n)
-{
-    return init_@lapack_func@_common(
-        params, m, n, 
-        fortran_int_min(m, n));
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #lapack_func=dorgqr,zungqr#
- */
-static inline void
-release_@lapack_func@(GQR_PARAMS_t* params)
-{
-    /* A and WORK contain allocated blocks */
-    free(params->Q);
-    free(params->WORK);
-    memset(params, 0, sizeof(*params));
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE=DOUBLE,CDOUBLE#
-   #REALTYPE=DOUBLE,DOUBLE#
-   #lapack_func=dorgqr,zungqr#
-   #typ     = npy_double, npy_cdouble#
-   #basetyp = npy_double, npy_double#
-   #ftyp = fortran_doublereal,fortran_doublecomplex#
-   #cmplx = 0, 1#
- */
-static void
-@TYPE@_qr_reduced(char **args, npy_intp const *dimensions, npy_intp const *steps,
-                  void *NPY_UNUSED(func))
-{
-    GQR_PARAMS_t params;
-    int error_occurred = get_fp_invalid_and_clear();
-    fortran_int n, m;
-
-    INIT_OUTER_LOOP_3
-
-    m = (fortran_int)dimensions[0];
-    n = (fortran_int)dimensions[1];
-
-    if (init_@lapack_func@(&params, m, n)) {
-        LINEARIZE_DATA_t a_in, tau_in, q_out;
-
-        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
-        init_linearize_data(&tau_in, 1, fortran_int_min(m, n), 1, steps[2]);
-        init_linearize_data(&q_out, fortran_int_min(m, n), m, steps[4], steps[3]);
-
-        BEGIN_OUTER_LOOP_3
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.Q, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.TAU, args[1], &tau_in);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[2], params.Q, &q_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[2], &q_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                 /* qr (modes - complete) */
-
-/**begin repeat
-   #lapack_func=dorgqr,zungqr#
-   #ftyp=fortran_doublereal,fortran_doublecomplex#
- */
-static inline int
-init_@lapack_func@_complete(GQR_PARAMS_t *params,
-                            fortran_int m,
-                            fortran_int n)
-{
-    return init_@lapack_func@_common(params, m, n, m);
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE=DOUBLE,CDOUBLE#
-   #REALTYPE=DOUBLE,DOUBLE#
-   #lapack_func=dorgqr,zungqr#
-   #typ     = npy_double,npy_cdouble#
-   #basetyp = npy_double,npy_double#
-   #ftyp = fortran_doublereal,fortran_doublecomplex#
-   #cmplx = 0, 1#
- */
-static void
-@TYPE@_qr_complete(char **args, npy_intp const *dimensions, npy_intp const *steps,
-                  void *NPY_UNUSED(func))
-{
-    GQR_PARAMS_t params;
-    int error_occurred = get_fp_invalid_and_clear();
-    fortran_int n, m;
-
-    INIT_OUTER_LOOP_3
-
-    m = (fortran_int)dimensions[0];
-    n = (fortran_int)dimensions[1];
-
-
-    if (init_@lapack_func@_complete(&params, m, n)) {
-        LINEARIZE_DATA_t a_in, tau_in, q_out;
-
-        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
-        init_linearize_data(&tau_in, 1, fortran_int_min(m, n), 1, steps[2]);
-        init_linearize_data(&q_out, m, m, steps[4], steps[3]);
-
-        BEGIN_OUTER_LOOP_3
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.Q, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.TAU, args[1], &tau_in);
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[2], params.Q, &q_out);
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[2], &q_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-/**end repeat**/
-
-/* -------------------------------------------------------------------------- */
-                 /* least squares */
-
-typedef struct gelsd_params_struct
-{
-    fortran_int M;
-    fortran_int N;
-    fortran_int NRHS;
-    void *A;
-    fortran_int LDA;
-    void *B;
-    fortran_int LDB;
-    void *S;
-    void *RCOND;
-    fortran_int RANK;
-    void *WORK;
-    fortran_int LWORK;
-    void *RWORK;
-    void *IWORK;
-} GELSD_PARAMS_t;
-
-
-static inline void
-dump_gelsd_params(const char *name,
-                  GELSD_PARAMS_t *params)
-{
-    TRACE_TXT("\n%s:\n"\
-
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-              "%14s: %18p\n"\
-
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-              "%14s: %18d\n"\
-
-              "%14s: %18p\n",
-
-              name,
-
-              "A", params->A,
-              "B", params->B,
-              "S", params->S,
-              "WORK", params->WORK,
-              "RWORK", params->RWORK,
-              "IWORK", params->IWORK,
-
-              "M", (int)params->M,
-              "N", (int)params->N,
-              "NRHS", (int)params->NRHS,
-              "LDA", (int)params->LDA,
-              "LDB", (int)params->LDB,
-              "LWORK", (int)params->LWORK,
-              "RANK", (int)params->RANK,
-
-              "RCOND", params->RCOND);
-}
-
-
-/**begin repeat
-   #TYPE=FLOAT,DOUBLE#
-   #lapack_func=sgelsd,dgelsd#
-   #ftyp=fortran_real,fortran_doublereal#
- */
-
-static inline fortran_int
-call_@lapack_func@(GELSD_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->M, &params->N, &params->NRHS,
-                          params->A, &params->LDA,
-                          params->B, &params->LDB,
-                          params->S,
-                          params->RCOND, &params->RANK,
-                          params->WORK, &params->LWORK,
-                          params->IWORK,
-                          &rv);
-    return rv;
-}
-
-static inline int
-init_@lapack_func@(GELSD_PARAMS_t *params,
-                   fortran_int m,
-                   fortran_int n,
-                   fortran_int nrhs)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *b, *s, *work, *iwork;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    fortran_int max_m_n = fortran_int_max(m, n);
-    size_t safe_min_m_n = min_m_n;
-    size_t safe_max_m_n = max_m_n;
-    size_t safe_m = m;
-    size_t safe_n = n;
-    size_t safe_nrhs = nrhs;
-
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    size_t b_size = safe_max_m_n * safe_nrhs * sizeof(@ftyp@);
-    size_t s_size = safe_min_m_n * sizeof(@ftyp@);
-
-    fortran_int work_count;
-    size_t work_size;
-    size_t iwork_size;
-    fortran_int lda = fortran_int_max(1, m);
-    fortran_int ldb = fortran_int_max(1, fortran_int_max(m,n));
-
-    mem_buff = malloc(a_size + b_size + s_size);
-
-    if (!mem_buff)
-        goto error;
-
-    a = mem_buff;
-    b = a + a_size;
-    s = b + b_size;
-
-
-    params->M = m;
-    params->N = n;
-    params->NRHS = nrhs;
-    params->A = a;
-    params->B = b;
-    params->S = s;
-    params->LDA = lda;
-    params->LDB = ldb;
-
-    {
-        /* compute optimal work size */
-        @ftyp@ work_size_query;
-        fortran_int iwork_size_query;
-
-        params->WORK = &work_size_query;
-        params->IWORK = &iwork_size_query;
-        params->RWORK = NULL;
-        params->LWORK = -1;
-
-        if (call_@lapack_func@(params) != 0)
-            goto error;
-
-        work_count = (fortran_int)work_size_query;
-
-        work_size  = (size_t) work_size_query * sizeof(@ftyp@);
-        iwork_size = (size_t)iwork_size_query * sizeof(fortran_int);
-    }
-
-    mem_buff2 = malloc(work_size + iwork_size);
-    if (!mem_buff2)
-        goto error;
-
-    work = mem_buff2;
-    iwork = work + work_size;
-
-    params->WORK = work;
-    params->RWORK = NULL;
-    params->IWORK = iwork;
-    params->LWORK = work_count;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-/**begin repeat
-   #TYPE=CFLOAT,CDOUBLE#
-   #ftyp=fortran_complex,fortran_doublecomplex#
-   #frealtyp=fortran_real,fortran_doublereal#
-   #typ=COMPLEX_t,DOUBLECOMPLEX_t#
-   #lapack_func=cgelsd,zgelsd#
- */
-
-static inline fortran_int
-call_@lapack_func@(GELSD_PARAMS_t *params)
-{
-    fortran_int rv;
-    LAPACK(@lapack_func@)(&params->M, &params->N, &params->NRHS,
-                          params->A, &params->LDA,
-                          params->B, &params->LDB,
-                          params->S,
-                          params->RCOND, &params->RANK,
-                          params->WORK, &params->LWORK,
-                          params->RWORK, params->IWORK,
-                          &rv);
-    return rv;
-}
-
-static inline int
-init_@lapack_func@(GELSD_PARAMS_t *params,
-                   fortran_int m,
-                   fortran_int n,
-                   fortran_int nrhs)
-{
-    npy_uint8 *mem_buff = NULL;
-    npy_uint8 *mem_buff2 = NULL;
-    npy_uint8 *a, *b, *s, *work, *iwork, *rwork;
-    fortran_int min_m_n = fortran_int_min(m, n);
-    fortran_int max_m_n = fortran_int_max(m, n);
-    size_t safe_min_m_n = min_m_n;
-    size_t safe_max_m_n = max_m_n;
-    size_t safe_m = m;
-    size_t safe_n = n;
-    size_t safe_nrhs = nrhs;
-
-    size_t a_size = safe_m * safe_n * sizeof(@ftyp@);
-    size_t b_size = safe_max_m_n * safe_nrhs * sizeof(@ftyp@);
-    size_t s_size = safe_min_m_n * sizeof(@frealtyp@);
-
-    fortran_int work_count;
-    size_t work_size, rwork_size, iwork_size;
-    fortran_int lda = fortran_int_max(1, m);
-    fortran_int ldb = fortran_int_max(1, fortran_int_max(m,n));
-
-    mem_buff = malloc(a_size + b_size + s_size);
-
-    if (!mem_buff)
-        goto error;
-
-    a = mem_buff;
-    b = a + a_size;
-    s = b + b_size;
-
-
-    params->M = m;
-    params->N = n;
-    params->NRHS = nrhs;
-    params->A = a;
-    params->B = b;
-    params->S = s;
-    params->LDA = lda;
-    params->LDB = ldb;
-
-    {
-        /* compute optimal work size */
-        @ftyp@ work_size_query;
-        @frealtyp@ rwork_size_query;
-        fortran_int iwork_size_query;
-
-        params->WORK = &work_size_query;
-        params->IWORK = &iwork_size_query;
-        params->RWORK = &rwork_size_query;
-        params->LWORK = -1;
-
-        if (call_@lapack_func@(params) != 0)
-            goto error;
-
-        work_count = (fortran_int)work_size_query.r;
-
-        work_size  = (size_t )work_size_query.r * sizeof(@ftyp@);
-        rwork_size = (size_t)rwork_size_query * sizeof(@frealtyp@);
-        iwork_size = (size_t)iwork_size_query * sizeof(fortran_int);
-    }
-
-    mem_buff2 = malloc(work_size + rwork_size + iwork_size);
-    if (!mem_buff2)
-        goto error;
-
-    work = mem_buff2;
-    rwork = work + work_size;
-    iwork = rwork + rwork_size;
-
-    params->WORK = work;
-    params->RWORK = rwork;
-    params->IWORK = iwork;
-    params->LWORK = work_count;
-
-    return 1;
- error:
-    TRACE_TXT("%s failed init\n", __FUNCTION__);
-    free(mem_buff);
-    free(mem_buff2);
-    memset(params, 0, sizeof(*params));
-
-    return 0;
-}
-
-/**end repeat**/
-
-
-/**begin repeat
-   #TYPE=FLOAT,DOUBLE,CFLOAT,CDOUBLE#
-   #REALTYPE=FLOAT,DOUBLE,FLOAT,DOUBLE#
-   #lapack_func=sgelsd,dgelsd,cgelsd,zgelsd#
-   #dot_func=sdot,ddot,cdotc,zdotc#
-   #typ     = npy_float, npy_double, npy_cfloat, npy_cdouble#
-   #basetyp = npy_float, npy_double, npy_float,  npy_double#
-   #ftyp = fortran_real, fortran_doublereal,
-           fortran_complex, fortran_doublecomplex#
-   #cmplx = 0, 0, 1, 1#
- */
-static inline void
-release_@lapack_func@(GELSD_PARAMS_t* params)
-{
-    /* A and WORK contain allocated blocks */
-    free(params->A);
-    free(params->WORK);
-    memset(params, 0, sizeof(*params));
-}
-
-/** Compute the squared l2 norm of a contiguous vector */
-static @basetyp@
-@TYPE@_abs2(@typ@ *p, npy_intp n) {
-    npy_intp i;
-    @basetyp@ res = 0;
-    for (i = 0; i < n; i++) {
-        @typ@ el = p[i];
-#if @cmplx@
-        res += el.real*el.real + el.imag*el.imag;
-#else
-        res += el*el;
-#endif
-    }
-    return res;
-}
-
-static void
-@TYPE@_lstsq(char **args, npy_intp const *dimensions, npy_intp const *steps,
-             void *NPY_UNUSED(func))
-{
-    GELSD_PARAMS_t params;
-    int error_occurred = get_fp_invalid_and_clear();
-    fortran_int n, m, nrhs;
-    fortran_int excess;
-
-    INIT_OUTER_LOOP_7
-
-    m = (fortran_int)dimensions[0];
-    n = (fortran_int)dimensions[1];
-    nrhs = (fortran_int)dimensions[2];
-    excess = m - n;
-
-    if (init_@lapack_func@(&params, m, n, nrhs)) {
-        LINEARIZE_DATA_t a_in, b_in, x_out, s_out, r_out;
-
-        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
-        init_linearize_data_ex(&b_in, nrhs, m, steps[3], steps[2], fortran_int_max(n, m));
-        init_linearize_data_ex(&x_out, nrhs, n, steps[5], steps[4], fortran_int_max(n, m));
-        init_linearize_data(&r_out, 1, nrhs, 1, steps[6]);
-        init_linearize_data(&s_out, 1, fortran_int_min(n, m), 1, steps[7]);
-
-        BEGIN_OUTER_LOOP_7
-            int not_ok;
-            linearize_@TYPE@_matrix(params.A, args[0], &a_in);
-            linearize_@TYPE@_matrix(params.B, args[1], &b_in);
-            params.RCOND = args[2];
-            not_ok = call_@lapack_func@(&params);
-            if (!not_ok) {
-                delinearize_@TYPE@_matrix(args[3], params.B, &x_out);
-                *(npy_int*) args[5] = params.RANK;
-                delinearize_@REALTYPE@_matrix(args[6], params.S, &s_out);
-
-                /* Note that linalg.lstsq discards this when excess == 0 */
-                if (excess >= 0 && params.RANK == n) {
-                    /* Compute the residuals as the square sum of each column */
-                    int i;
-                    char *resid = args[4];
-                    @ftyp@ *components = (@ftyp@ *)params.B + n;
-                    for (i = 0; i < nrhs; i++) {
-                        @ftyp@ *vector = components + i*m;
-                        /* Numpy and fortran floating types are the same size,
-                         * so this cast is safe */
-                        @basetyp@ abs2 = @TYPE@_abs2((@typ@ *)vector, excess);
-                        memcpy(
-                            resid + i*r_out.column_strides,
-                            &abs2, sizeof(abs2));
-                    }
-                }
-                else {
-                    /* Note that this is always discarded by linalg.lstsq */
-                    nan_@REALTYPE@_matrix(args[4], &r_out);
-                }
-            } else {
-                error_occurred = 1;
-                nan_@TYPE@_matrix(args[3], &x_out);
-                nan_@REALTYPE@_matrix(args[4], &r_out);
-                *(npy_int*) args[5] = -1;
-                nan_@REALTYPE@_matrix(args[6], &s_out);
-            }
-        END_OUTER_LOOP
-
-        release_@lapack_func@(&params);
-    }
-
-    set_fp_invalid_or_clear(error_occurred);
-}
-
-/**end repeat**/
-
-#pragma GCC diagnostic pop
-
-/* -------------------------------------------------------------------------- */
-              /* gufunc registration  */
-
-static void *array_of_nulls[] = {
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL,
-    (void *)NULL
-};
-
-#define FUNC_ARRAY_NAME(NAME) NAME ## _funcs
-
-#define GUFUNC_FUNC_ARRAY_REAL(NAME)                    \
-    static PyUFuncGenericFunction                       \
-    FUNC_ARRAY_NAME(NAME)[] = {                         \
-        FLOAT_ ## NAME,                                 \
-        DOUBLE_ ## NAME                                 \
-    }
-
-#define GUFUNC_FUNC_ARRAY_REAL_COMPLEX(NAME)            \
-    static PyUFuncGenericFunction                       \
-    FUNC_ARRAY_NAME(NAME)[] = {                         \
-        FLOAT_ ## NAME,                                 \
-        DOUBLE_ ## NAME,                                \
-        CFLOAT_ ## NAME,                                \
-        CDOUBLE_ ## NAME                                \
-    }
-
-/* There are problems with eig in complex single precision.
- * That kernel is disabled
- */
-#define GUFUNC_FUNC_ARRAY_EIG(NAME)                     \
-    static PyUFuncGenericFunction                       \
-    FUNC_ARRAY_NAME(NAME)[] = {                         \
-        FLOAT_ ## NAME,                                 \
-        DOUBLE_ ## NAME,                                \
-        CDOUBLE_ ## NAME                                \
-    }
-
-/* The single precision functions are not used at all,
- * due to input data being promoted to double precision
- * in Python, so they are not implemented here.
- */
-#define GUFUNC_FUNC_ARRAY_QR(NAME)                      \
-    static PyUFuncGenericFunction                       \
-    FUNC_ARRAY_NAME(NAME)[] = {                         \
-        DOUBLE_ ## NAME,                                \
-        CDOUBLE_ ## NAME                                \
-    }
-
-
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(slogdet);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(det);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(eighlo);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(eighup);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(eigvalshlo);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(eigvalshup);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(solve);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(solve1);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(inv);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(cholesky_lo);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(svd_N);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(svd_S);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(svd_A);
-GUFUNC_FUNC_ARRAY_QR(qr_r_raw);
-GUFUNC_FUNC_ARRAY_QR(qr_reduced);
-GUFUNC_FUNC_ARRAY_QR(qr_complete);
-GUFUNC_FUNC_ARRAY_REAL_COMPLEX(lstsq);
-GUFUNC_FUNC_ARRAY_EIG(eig);
-GUFUNC_FUNC_ARRAY_EIG(eigvals);
-
-static char equal_2_types[] = {
-    NPY_FLOAT, NPY_FLOAT,
-    NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT, NPY_CFLOAT,
-    NPY_CDOUBLE, NPY_CDOUBLE
-};
-
-static char equal_3_types[] = {
-    NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
-    NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT, NPY_CFLOAT, NPY_CFLOAT,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE
-};
-
-/* second result is logdet, that will always be a REAL */
-static char slogdet_types[] = {
-    NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
-    NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT, NPY_CFLOAT, NPY_FLOAT,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_DOUBLE
-};
-
-static char eigh_types[] = {
-    NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
-    NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT, NPY_FLOAT, NPY_CFLOAT,
-    NPY_CDOUBLE, NPY_DOUBLE, NPY_CDOUBLE
-};
-
-static char eighvals_types[] = {
-    NPY_FLOAT, NPY_FLOAT,
-    NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT, NPY_FLOAT,
-    NPY_CDOUBLE, NPY_DOUBLE
-};
-
-static char eig_types[] = {
-    NPY_FLOAT, NPY_CFLOAT, NPY_CFLOAT,
-    NPY_DOUBLE, NPY_CDOUBLE, NPY_CDOUBLE,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE
-};
-
-static char eigvals_types[] = {
-    NPY_FLOAT, NPY_CFLOAT,
-    NPY_DOUBLE, NPY_CDOUBLE,
-    NPY_CDOUBLE, NPY_CDOUBLE
-};
-
-static char svd_1_1_types[] = {
-    NPY_FLOAT, NPY_FLOAT,
-    NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT, NPY_FLOAT,
-    NPY_CDOUBLE, NPY_DOUBLE
-};
-
-static char svd_1_3_types[] = {
-    NPY_FLOAT,   NPY_FLOAT,   NPY_FLOAT,  NPY_FLOAT,
-    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE, NPY_DOUBLE,
-    NPY_CFLOAT,  NPY_CFLOAT,  NPY_FLOAT,  NPY_CFLOAT,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_DOUBLE, NPY_CDOUBLE
-};
-
-/* A, tau */
-static char qr_r_raw_types[] = {
-    NPY_DOUBLE,  NPY_DOUBLE,
-    NPY_CDOUBLE, NPY_CDOUBLE,
-};
-
-/* A, tau, q */
-static char qr_reduced_types[] = {
-    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE,
-};
-
-/* A, tau, q */
-static char qr_complete_types[] = {
-    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE,
-};
-
-/*  A,           b,           rcond,      x,           resid,      rank,    s,        */
-static char lstsq_types[] = {
-    NPY_FLOAT,   NPY_FLOAT,   NPY_FLOAT,  NPY_FLOAT,   NPY_FLOAT,  NPY_INT, NPY_FLOAT,
-    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE, NPY_DOUBLE,  NPY_DOUBLE, NPY_INT, NPY_DOUBLE,
-    NPY_CFLOAT,  NPY_CFLOAT,  NPY_FLOAT,  NPY_CFLOAT,  NPY_FLOAT,  NPY_INT, NPY_FLOAT,
-    NPY_CDOUBLE, NPY_CDOUBLE, NPY_DOUBLE, NPY_CDOUBLE, NPY_DOUBLE, NPY_INT, NPY_DOUBLE,
-};
-
-typedef struct gufunc_descriptor_struct {
-    char *name;
-    char *signature;
-    char *doc;
-    int ntypes;
-    int nin;
-    int nout;
-    PyUFuncGenericFunction *funcs;
-    char *types;
-} GUFUNC_DESCRIPTOR_t;
-
-GUFUNC_DESCRIPTOR_t gufunc_descriptors [] = {
-    {
-        "slogdet",
-        "(m,m)->(),()",
-        "slogdet on the last two dimensions and broadcast on the rest. \n"\
-        "Results in two arrays, one with sign and the other with log of the"\
-        " determinants. \n"\
-        "    \"(m,m)->(),()\" \n",
-        4, 1, 2,
-        FUNC_ARRAY_NAME(slogdet),
-        slogdet_types
-    },
-    {
-        "det",
-        "(m,m)->()",
-        "det of the last two dimensions and broadcast on the rest. \n"\
-        "    \"(m,m)->()\" \n",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(det),
-        equal_2_types
-    },
-    {
-        "eigh_lo",
-        "(m,m)->(m),(m,m)",
-        "eigh on the last two dimension and broadcast to the rest, using"\
-        " lower triangle \n"\
-        "Results in a vector of eigenvalues and a matrix with the"\
-        "eigenvectors. \n"\
-        "    \"(m,m)->(m),(m,m)\" \n",
-        4, 1, 2,
-        FUNC_ARRAY_NAME(eighlo),
-        eigh_types
-    },
-    {
-        "eigh_up",
-        "(m,m)->(m),(m,m)",
-        "eigh on the last two dimension and broadcast to the rest, using"\
-        " upper triangle. \n"\
-        "Results in a vector of eigenvalues and a matrix with the"\
-        " eigenvectors. \n"\
-        "    \"(m,m)->(m),(m,m)\" \n",
-        4, 1, 2,
-        FUNC_ARRAY_NAME(eighup),
-        eigh_types
-    },
-    {
-        "eigvalsh_lo",
-        "(m,m)->(m)",
-        "eigh on the last two dimension and broadcast to the rest, using"\
-        " lower triangle. \n"\
-        "Results in a vector of eigenvalues and a matrix with the"\
-        "eigenvectors. \n"\
-        "    \"(m,m)->(m)\" \n",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(eigvalshlo),
-        eighvals_types
-    },
-    {
-        "eigvalsh_up",
-        "(m,m)->(m)",
-        "eigvalsh on the last two dimension and broadcast to the rest,"\
-        " using upper triangle. \n"\
-        "Results in a vector of eigenvalues and a matrix with the"\
-        "eigenvectors.\n"\
-        "    \"(m,m)->(m)\" \n",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(eigvalshup),
-        eighvals_types
-    },
-    {
-        "solve",
-        "(m,m),(m,n)->(m,n)",
-        "solve the system a x = b, on the last two dimensions, broadcast"\
-        " to the rest. \n"\
-        "Results in a matrices with the solutions. \n"\
-        "    \"(m,m),(m,n)->(m,n)\" \n",
-        4, 2, 1,
-        FUNC_ARRAY_NAME(solve),
-        equal_3_types
-    },
-    {
-        "solve1",
-        "(m,m),(m)->(m)",
-        "solve the system a x = b, for b being a vector, broadcast in"\
-        " the outer dimensions. \n"\
-        "Results in vectors with the solutions. \n"\
-        "    \"(m,m),(m)->(m)\" \n",
-        4, 2, 1,
-        FUNC_ARRAY_NAME(solve1),
-        equal_3_types
-    },
-    {
-        "inv",
-        "(m, m)->(m, m)",
-        "compute the inverse of the last two dimensions and broadcast"\
-        " to the rest. \n"\
-        "Results in the inverse matrices. \n"\
-        "    \"(m,m)->(m,m)\" \n",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(inv),
-        equal_2_types
-    },
-    {
-        "cholesky_lo",
-        "(m,m)->(m,m)",
-        "cholesky decomposition of hermitian positive-definite matrices. \n"\
-        "Broadcast to all outer dimensions. \n"\
-        "    \"(m,m)->(m,m)\" \n",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(cholesky_lo),
-        equal_2_types
-    },
-    {
-        "svd_m",
-        "(m,n)->(m)",
-        "svd when n>=m. ",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(svd_N),
-        svd_1_1_types
-    },
-    {
-        "svd_n",
-        "(m,n)->(n)",
-        "svd when n<=m",
-        4, 1, 1,
-        FUNC_ARRAY_NAME(svd_N),
-        svd_1_1_types
-    },
-    {
-        "svd_m_s",
-        "(m,n)->(m,m),(m),(m,n)",
-        "svd when m<=n",
-        4, 1, 3,
-        FUNC_ARRAY_NAME(svd_S),
-        svd_1_3_types
-    },
-    {
-        "svd_n_s",
-        "(m,n)->(m,n),(n),(n,n)",
-        "svd when m>=n",
-        4, 1, 3,
-        FUNC_ARRAY_NAME(svd_S),
-        svd_1_3_types
-    },
-    {
-        "svd_m_f",
-        "(m,n)->(m,m),(m),(n,n)",
-        "svd when m<=n",
-        4, 1, 3,
-        FUNC_ARRAY_NAME(svd_A),
-        svd_1_3_types
-    },
-    {
-        "svd_n_f",
-        "(m,n)->(m,m),(n),(n,n)",
-        "svd when m>=n",
-        4, 1, 3,
-        FUNC_ARRAY_NAME(svd_A),
-        svd_1_3_types
-    },
-    {
-        "eig",
-        "(m,m)->(m),(m,m)",
-        "eig on the last two dimension and broadcast to the rest. \n"\
-        "Results in a vector with the  eigenvalues and a matrix with the"\
-        " eigenvectors. \n"\
-        "    \"(m,m)->(m),(m,m)\" \n",
-        3, 1, 2,
-        FUNC_ARRAY_NAME(eig),
-        eig_types
-    },
-    {
-        "eigvals",
-        "(m,m)->(m)",
-        "eigvals on the last two dimension and broadcast to the rest. \n"\
-        "Results in a vector of eigenvalues. \n",
-        3, 1, 1,
-        FUNC_ARRAY_NAME(eigvals),
-        eigvals_types
-    },
-    {
-        "qr_r_raw_m",
-        "(m,n)->(m)",
-        "Compute TAU vector for the last two dimensions \n"\
-        "and broadcast to the rest. For m <= n. \n",
-        2, 1, 1,
-        FUNC_ARRAY_NAME(qr_r_raw),
-        qr_r_raw_types
-    },
-    {
-        "qr_r_raw_n",
-        "(m,n)->(n)",
-        "Compute TAU vector for the last two dimensions \n"\
-        "and broadcast to the rest. For m > n. \n",
-        2, 1, 1,
-        FUNC_ARRAY_NAME(qr_r_raw),
-        qr_r_raw_types
-    },
-    {
-        "qr_reduced",
-        "(m,n),(k)->(m,k)",
-        "Compute Q matrix for the last two dimensions \n"\
-        "and broadcast to the rest. \n",
-        2, 2, 1,
-        FUNC_ARRAY_NAME(qr_reduced),
-        qr_reduced_types
-    },
-    {
-        "qr_complete",
-        "(m,n),(n)->(m,m)",
-        "Compute Q matrix for the last two dimensions \n"\
-        "and broadcast to the rest. For m > n. \n",
-        2, 2, 1,
-        FUNC_ARRAY_NAME(qr_complete),
-        qr_complete_types
-    },
-    {
-        "lstsq_m",
-        "(m,n),(m,nrhs),()->(n,nrhs),(nrhs),(),(m)",
-        "least squares on the last two dimensions and broadcast to the rest. \n"\
-        "For m <= n. \n",
-        4, 3, 4,
-        FUNC_ARRAY_NAME(lstsq),
-        lstsq_types
-    },
-    {
-        "lstsq_n",
-        "(m,n),(m,nrhs),()->(n,nrhs),(nrhs),(),(n)",
-        "least squares on the last two dimensions and broadcast to the rest. \n"\
-        "For m >= n, meaning that residuals are produced. \n",
-        4, 3, 4,
-        FUNC_ARRAY_NAME(lstsq),
-        lstsq_types
-    }
-};
-
-static int
-addUfuncs(PyObject *dictionary) {
-    PyObject *f;
-    int i;
-    const int gufunc_count = sizeof(gufunc_descriptors)/
-        sizeof(gufunc_descriptors[0]);
-    for (i = 0; i < gufunc_count; i++) {
-        GUFUNC_DESCRIPTOR_t* d = &gufunc_descriptors[i];
-        f = PyUFunc_FromFuncAndDataAndSignature(d->funcs,
-                                                array_of_nulls,
-                                                d->types,
-                                                d->ntypes,
-                                                d->nin,
-                                                d->nout,
-                                                PyUFunc_None,
-                                                d->name,
-                                                d->doc,
-                                                0,
-                                                d->signature);
-        if (f == NULL) {
-            return -1;
-        }
-#if 0
-        dump_ufunc_object((PyUFuncObject*) f);
-#endif
-        int ret = PyDict_SetItemString(dictionary, d->name, f);
-        Py_DECREF(f);
-        if (ret < 0) {
-            return -1;
-        }
-    }
-    return 0;
-}
-
-
-
-/* -------------------------------------------------------------------------- */
-                  /* Module initialization stuff  */
-
-static PyMethodDef UMath_LinAlgMethods[] = {
-    {NULL, NULL, 0, NULL}        /* Sentinel */
-};
-
-static struct PyModuleDef moduledef = {
-        PyModuleDef_HEAD_INIT,
-        UMATH_LINALG_MODULE_NAME,
-        NULL,
-        -1,
-        UMath_LinAlgMethods,
-        NULL,
-        NULL,
-        NULL,
-        NULL
-};
-
-PyMODINIT_FUNC PyInit__umath_linalg(void)
-{
-    PyObject *m;
-    PyObject *d;
-    PyObject *version;
-
-    init_constants();
-    m = PyModule_Create(&moduledef);
-    if (m == NULL) {
-        return NULL;
-    }
-
-    import_array();
-    import_ufunc();
-
-    d = PyModule_GetDict(m);
-    if (d == NULL) {
-        return NULL;
-    }
-
-    version = PyUnicode_FromString(umath_linalg_version_string);
-    if (version == NULL) {
-        return NULL;
-    }
-    int ret = PyDict_SetItemString(d, "__version__", version);
-    Py_DECREF(version);
-    if (ret < 0) {
-        return NULL;
-    }
-
-    /* Load the ufunc operators into the module's namespace */
-    if (addUfuncs(d) < 0) {
-        return NULL;
-    }
-
-    return m;
-}
diff --git a/numpy/linalg/umath_linalg.cpp b/numpy/linalg/umath_linalg.cpp
new file mode 100644
index 0000000..bbeb379
--- /dev/null
+++ b/numpy/linalg/umath_linalg.cpp
@@ -0,0 +1,4564 @@
+/* -*- c -*- */
+
+/*
+ *****************************************************************************
+ **                            INCLUDES                                     **
+ *****************************************************************************
+ */
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+#include "numpy/arrayobject.h"
+#include "numpy/ufuncobject.h"
+
+#include "npy_pycompat.h"
+
+#include "npy_config.h"
+
+#include "npy_cblas.h"
+
+#include <cstddef>
+#include <cstdio>
+#include <cassert>
+#include <cmath>
+#include <utility>
+
+
+static const char* umath_linalg_version_string = "0.1.5";
+
+struct scalar_trait {};
+struct complex_trait {};
+template<typename typ>
+using dispatch_scalar = typename std::conditional<std::is_scalar<typ>::value, scalar_trait, complex_trait>::type;
+
+/*
+ ****************************************************************************
+ *                        Debugging support                                 *
+ ****************************************************************************
+ */
+#define TRACE_TXT(...) do { fprintf (stderr, __VA_ARGS__); } while (0)
+#define STACK_TRACE do {} while (0)
+#define TRACE\
+    do {                                        \
+        fprintf (stderr,                        \
+                 "%s:%d:%s\n",                  \
+                 __FILE__,                      \
+                 __LINE__,                      \
+                 __FUNCTION__);                 \
+        STACK_TRACE;                            \
+    } while (0)
+
+#if 0
+#include <execinfo.h>
+void
+dbg_stack_trace()
+{
+    void *trace[32];
+    size_t size;
+
+    size = backtrace(trace, sizeof(trace)/sizeof(trace[0]));
+    backtrace_symbols_fd(trace, size, 1);
+}
+
+#undef STACK_TRACE
+#define STACK_TRACE do { dbg_stack_trace(); } while (0)
+#endif
+
+/*
+ *****************************************************************************
+ *                    BLAS/LAPACK calling macros                             *
+ *****************************************************************************
+ */
+
+#define FNAME(x) BLAS_FUNC(x)
+
+typedef CBLAS_INT         fortran_int;
+
+typedef struct { float r, i; } f2c_complex;
+typedef struct { double r, i; } f2c_doublecomplex;
+/* typedef long int (*L_fp)(); */
+
+typedef float             fortran_real;
+typedef double            fortran_doublereal;
+typedef f2c_complex       fortran_complex;
+typedef f2c_doublecomplex fortran_doublecomplex;
+
+extern "C" fortran_int
+FNAME(sgeev)(char *jobvl, char *jobvr, fortran_int *n,
+             float a[], fortran_int *lda, float wr[], float wi[],
+             float vl[], fortran_int *ldvl, float vr[], fortran_int *ldvr,
+             float work[], fortran_int lwork[],
+             fortran_int *info);
+extern "C" fortran_int
+FNAME(dgeev)(char *jobvl, char *jobvr, fortran_int *n,
+             double a[], fortran_int *lda, double wr[], double wi[],
+             double vl[], fortran_int *ldvl, double vr[], fortran_int *ldvr,
+             double work[], fortran_int lwork[],
+             fortran_int *info);
+extern "C" fortran_int
+FNAME(cgeev)(char *jobvl, char *jobvr, fortran_int *n,
+             f2c_complex a[], fortran_int *lda,
+             f2c_complex w[],
+             f2c_complex vl[], fortran_int *ldvl,
+             f2c_complex vr[], fortran_int *ldvr,
+             f2c_complex work[], fortran_int *lwork,
+             float rwork[],
+             fortran_int *info);
+extern "C" fortran_int
+FNAME(zgeev)(char *jobvl, char *jobvr, fortran_int *n,
+             f2c_doublecomplex a[], fortran_int *lda,
+             f2c_doublecomplex w[],
+             f2c_doublecomplex vl[], fortran_int *ldvl,
+             f2c_doublecomplex vr[], fortran_int *ldvr,
+             f2c_doublecomplex work[], fortran_int *lwork,
+             double rwork[],
+             fortran_int *info);
+
+extern "C" fortran_int
+FNAME(ssyevd)(char *jobz, char *uplo, fortran_int *n,
+              float a[], fortran_int *lda, float w[], float work[],
+              fortran_int *lwork, fortran_int iwork[], fortran_int *liwork,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(dsyevd)(char *jobz, char *uplo, fortran_int *n,
+              double a[], fortran_int *lda, double w[], double work[],
+              fortran_int *lwork, fortran_int iwork[], fortran_int *liwork,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(cheevd)(char *jobz, char *uplo, fortran_int *n,
+              f2c_complex a[], fortran_int *lda,
+              float w[], f2c_complex work[],
+              fortran_int *lwork, float rwork[], fortran_int *lrwork, fortran_int iwork[],
+              fortran_int *liwork,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(zheevd)(char *jobz, char *uplo, fortran_int *n,
+              f2c_doublecomplex a[], fortran_int *lda,
+              double w[], f2c_doublecomplex work[],
+              fortran_int *lwork, double rwork[], fortran_int *lrwork, fortran_int iwork[],
+              fortran_int *liwork,
+              fortran_int *info);
+
+extern "C" fortran_int
+FNAME(sgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
+              float a[], fortran_int *lda, float b[], fortran_int *ldb,
+              float s[], float *rcond, fortran_int *rank,
+              float work[], fortran_int *lwork, fortran_int iwork[],
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(dgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
+              double a[], fortran_int *lda, double b[], fortran_int *ldb,
+              double s[], double *rcond, fortran_int *rank,
+              double work[], fortran_int *lwork, fortran_int iwork[],
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(cgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
+              f2c_complex a[], fortran_int *lda,
+              f2c_complex b[], fortran_int *ldb,
+              float s[], float *rcond, fortran_int *rank,
+              f2c_complex work[], fortran_int *lwork,
+              float rwork[], fortran_int iwork[],
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(zgelsd)(fortran_int *m, fortran_int *n, fortran_int *nrhs,
+              f2c_doublecomplex a[], fortran_int *lda,
+              f2c_doublecomplex b[], fortran_int *ldb,
+              double s[], double *rcond, fortran_int *rank,
+              f2c_doublecomplex work[], fortran_int *lwork,
+              double rwork[], fortran_int iwork[],
+              fortran_int *info);
+
+extern "C" fortran_int
+FNAME(dgeqrf)(fortran_int *m, fortran_int *n, double a[], fortran_int *lda,
+              double tau[], double work[],
+              fortran_int *lwork, fortran_int *info);
+extern "C" fortran_int
+FNAME(zgeqrf)(fortran_int *m, fortran_int *n, f2c_doublecomplex a[], fortran_int *lda,
+              f2c_doublecomplex tau[], f2c_doublecomplex work[],
+              fortran_int *lwork, fortran_int *info);
+
+extern "C" fortran_int
+FNAME(dorgqr)(fortran_int *m, fortran_int *n, fortran_int *k, double a[], fortran_int *lda,
+              double tau[], double work[],
+              fortran_int *lwork, fortran_int *info);
+extern "C" fortran_int
+FNAME(zungqr)(fortran_int *m, fortran_int *n, fortran_int *k, f2c_doublecomplex a[],
+              fortran_int *lda, f2c_doublecomplex tau[],
+              f2c_doublecomplex work[], fortran_int *lwork, fortran_int *info);
+
+extern "C" fortran_int
+FNAME(sgesv)(fortran_int *n, fortran_int *nrhs,
+             float a[], fortran_int *lda,
+             fortran_int ipiv[],
+             float b[], fortran_int *ldb,
+             fortran_int *info);
+extern "C" fortran_int
+FNAME(dgesv)(fortran_int *n, fortran_int *nrhs,
+             double a[], fortran_int *lda,
+             fortran_int ipiv[],
+             double b[], fortran_int *ldb,
+             fortran_int *info);
+extern "C" fortran_int
+FNAME(cgesv)(fortran_int *n, fortran_int *nrhs,
+             f2c_complex a[], fortran_int *lda,
+             fortran_int ipiv[],
+             f2c_complex b[], fortran_int *ldb,
+             fortran_int *info);
+extern "C" fortran_int
+FNAME(zgesv)(fortran_int *n, fortran_int *nrhs,
+             f2c_doublecomplex a[], fortran_int *lda,
+             fortran_int ipiv[],
+             f2c_doublecomplex b[], fortran_int *ldb,
+             fortran_int *info);
+
+extern "C" fortran_int
+FNAME(sgetrf)(fortran_int *m, fortran_int *n,
+              float a[], fortran_int *lda,
+              fortran_int ipiv[],
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(dgetrf)(fortran_int *m, fortran_int *n,
+              double a[], fortran_int *lda,
+              fortran_int ipiv[],
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(cgetrf)(fortran_int *m, fortran_int *n,
+              f2c_complex a[], fortran_int *lda,
+              fortran_int ipiv[],
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(zgetrf)(fortran_int *m, fortran_int *n,
+              f2c_doublecomplex a[], fortran_int *lda,
+              fortran_int ipiv[],
+              fortran_int *info);
+
+extern "C" fortran_int
+FNAME(spotrf)(char *uplo, fortran_int *n,
+              float a[], fortran_int *lda,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(dpotrf)(char *uplo, fortran_int *n,
+              double a[], fortran_int *lda,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(cpotrf)(char *uplo, fortran_int *n,
+              f2c_complex a[], fortran_int *lda,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(zpotrf)(char *uplo, fortran_int *n,
+              f2c_doublecomplex a[], fortran_int *lda,
+              fortran_int *info);
+
+extern "C" fortran_int
+FNAME(sgesdd)(char *jobz, fortran_int *m, fortran_int *n,
+              float a[], fortran_int *lda, float s[], float u[],
+              fortran_int *ldu, float vt[], fortran_int *ldvt, float work[],
+              fortran_int *lwork, fortran_int iwork[], fortran_int *info);
+extern "C" fortran_int
+FNAME(dgesdd)(char *jobz, fortran_int *m, fortran_int *n,
+              double a[], fortran_int *lda, double s[], double u[],
+              fortran_int *ldu, double vt[], fortran_int *ldvt, double work[],
+              fortran_int *lwork, fortran_int iwork[], fortran_int *info);
+extern "C" fortran_int
+FNAME(cgesdd)(char *jobz, fortran_int *m, fortran_int *n,
+              f2c_complex a[], fortran_int *lda,
+              float s[], f2c_complex u[], fortran_int *ldu,
+              f2c_complex vt[], fortran_int *ldvt,
+              f2c_complex work[], fortran_int *lwork,
+              float rwork[], fortran_int iwork[], fortran_int *info);
+extern "C" fortran_int
+FNAME(zgesdd)(char *jobz, fortran_int *m, fortran_int *n,
+              f2c_doublecomplex a[], fortran_int *lda,
+              double s[], f2c_doublecomplex u[], fortran_int *ldu,
+              f2c_doublecomplex vt[], fortran_int *ldvt,
+              f2c_doublecomplex work[], fortran_int *lwork,
+              double rwork[], fortran_int iwork[], fortran_int *info);
+
+extern "C" fortran_int
+FNAME(spotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
+              float a[], fortran_int *lda,
+              float b[], fortran_int *ldb,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(dpotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
+              double a[], fortran_int *lda,
+              double b[], fortran_int *ldb,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(cpotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
+              f2c_complex a[], fortran_int *lda,
+              f2c_complex b[], fortran_int *ldb,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(zpotrs)(char *uplo, fortran_int *n, fortran_int *nrhs,
+              f2c_doublecomplex a[], fortran_int *lda,
+              f2c_doublecomplex b[], fortran_int *ldb,
+              fortran_int *info);
+
+extern "C" fortran_int
+FNAME(spotri)(char *uplo, fortran_int *n,
+              float a[], fortran_int *lda,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(dpotri)(char *uplo, fortran_int *n,
+              double a[], fortran_int *lda,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(cpotri)(char *uplo, fortran_int *n,
+              f2c_complex a[], fortran_int *lda,
+              fortran_int *info);
+extern "C" fortran_int
+FNAME(zpotri)(char *uplo, fortran_int *n,
+              f2c_doublecomplex a[], fortran_int *lda,
+              fortran_int *info);
+
+extern "C" fortran_int
+FNAME(scopy)(fortran_int *n,
+             float *sx, fortran_int *incx,
+             float *sy, fortran_int *incy);
+extern "C" fortran_int
+FNAME(dcopy)(fortran_int *n,
+             double *sx, fortran_int *incx,
+             double *sy, fortran_int *incy);
+extern "C" fortran_int
+FNAME(ccopy)(fortran_int *n,
+             f2c_complex *sx, fortran_int *incx,
+             f2c_complex *sy, fortran_int *incy);
+extern "C" fortran_int
+FNAME(zcopy)(fortran_int *n,
+             f2c_doublecomplex *sx, fortran_int *incx,
+             f2c_doublecomplex *sy, fortran_int *incy);
+
+extern "C" float
+FNAME(sdot)(fortran_int *n,
+            float *sx, fortran_int *incx,
+            float *sy, fortran_int *incy);
+extern "C" double
+FNAME(ddot)(fortran_int *n,
+            double *sx, fortran_int *incx,
+            double *sy, fortran_int *incy);
+extern "C" void
+FNAME(cdotu)(f2c_complex *ret, fortran_int *n,
+             f2c_complex *sx, fortran_int *incx,
+             f2c_complex *sy, fortran_int *incy);
+extern "C" void
+FNAME(zdotu)(f2c_doublecomplex *ret, fortran_int *n,
+             f2c_doublecomplex *sx, fortran_int *incx,
+             f2c_doublecomplex *sy, fortran_int *incy);
+extern "C" void
+FNAME(cdotc)(f2c_complex *ret, fortran_int *n,
+             f2c_complex *sx, fortran_int *incx,
+             f2c_complex *sy, fortran_int *incy);
+extern "C" void
+FNAME(zdotc)(f2c_doublecomplex *ret, fortran_int *n,
+             f2c_doublecomplex *sx, fortran_int *incx,
+             f2c_doublecomplex *sy, fortran_int *incy);
+
+extern "C" fortran_int
+FNAME(sgemm)(char *transa, char *transb,
+             fortran_int *m, fortran_int *n, fortran_int *k,
+             float *alpha,
+             float *a, fortran_int *lda,
+             float *b, fortran_int *ldb,
+             float *beta,
+             float *c, fortran_int *ldc);
+extern "C" fortran_int
+FNAME(dgemm)(char *transa, char *transb,
+             fortran_int *m, fortran_int *n, fortran_int *k,
+             double *alpha,
+             double *a, fortran_int *lda,
+             double *b, fortran_int *ldb,
+             double *beta,
+             double *c, fortran_int *ldc);
+extern "C" fortran_int
+FNAME(cgemm)(char *transa, char *transb,
+             fortran_int *m, fortran_int *n, fortran_int *k,
+             f2c_complex *alpha,
+             f2c_complex *a, fortran_int *lda,
+             f2c_complex *b, fortran_int *ldb,
+             f2c_complex *beta,
+             f2c_complex *c, fortran_int *ldc);
+extern "C" fortran_int
+FNAME(zgemm)(char *transa, char *transb,
+             fortran_int *m, fortran_int *n, fortran_int *k,
+             f2c_doublecomplex *alpha,
+             f2c_doublecomplex *a, fortran_int *lda,
+             f2c_doublecomplex *b, fortran_int *ldb,
+             f2c_doublecomplex *beta,
+             f2c_doublecomplex *c, fortran_int *ldc);
+
+
+#define LAPACK_T(FUNC)                                          \
+    TRACE_TXT("Calling LAPACK ( " # FUNC " )\n");               \
+    FNAME(FUNC)
+
+#define BLAS(FUNC)                              \
+    FNAME(FUNC)
+
+#define LAPACK(FUNC)                            \
+    FNAME(FUNC)
+
+
+/*
+ *****************************************************************************
+ **                      Some handy functions                               **
+ *****************************************************************************
+ */
+
+static inline int
+get_fp_invalid_and_clear(void)
+{
+    int status;
+    status = npy_clear_floatstatus_barrier((char*)&status);
+    return !!(status & NPY_FPE_INVALID);
+}
+
+static inline void
+set_fp_invalid_or_clear(int error_occurred)
+{
+    if (error_occurred) {
+        npy_set_floatstatus_invalid();
+    }
+    else {
+        npy_clear_floatstatus_barrier((char*)&error_occurred);
+    }
+}
+
+/*
+ *****************************************************************************
+ **                      Some handy constants                               **
+ *****************************************************************************
+ */
+
+#define UMATH_LINALG_MODULE_NAME "_umath_linalg"
+
+template<typename T>
+struct numeric_limits;
+
+template<>
+struct numeric_limits<float> {
+static constexpr float one = 1.0f;
+static constexpr float zero = 0.0f;
+static constexpr float minus_one = -1.0f;
+static const float ninf;
+static const float nan;
+};
+constexpr float numeric_limits<float>::one;
+constexpr float numeric_limits<float>::zero;
+constexpr float numeric_limits<float>::minus_one;
+const float numeric_limits<float>::ninf = -NPY_INFINITYF;
+const float numeric_limits<float>::nan = NPY_NANF;
+
+template<>
+struct numeric_limits<double> {
+static constexpr double one = 1.0;
+static constexpr double zero = 0.0;
+static constexpr double minus_one = -1.0;
+static const double ninf;
+static const double nan;
+};
+constexpr double numeric_limits<double>::one;
+constexpr double numeric_limits<double>::zero;
+constexpr double numeric_limits<double>::minus_one;
+const double numeric_limits<double>::ninf = -NPY_INFINITY;
+const double numeric_limits<double>::nan = NPY_NAN;
+
+template<>
+struct numeric_limits<npy_cfloat> {
+static constexpr npy_cfloat one = {1.0f, 0.0f};
+static constexpr npy_cfloat zero = {0.0f, 0.0f};
+static constexpr npy_cfloat minus_one = {-1.0f, 0.0f};
+static const npy_cfloat ninf;
+static const npy_cfloat nan;
+};
+constexpr npy_cfloat numeric_limits<npy_cfloat>::one;
+constexpr npy_cfloat numeric_limits<npy_cfloat>::zero;
+constexpr npy_cfloat numeric_limits<npy_cfloat>::minus_one;
+const npy_cfloat numeric_limits<npy_cfloat>::ninf = {-NPY_INFINITYF, 0.0f};
+const npy_cfloat numeric_limits<npy_cfloat>::nan = {NPY_NANF, NPY_NANF};
+
+template<>
+struct numeric_limits<f2c_complex> {
+static constexpr f2c_complex one = {1.0f, 0.0f};
+static constexpr f2c_complex zero = {0.0f, 0.0f};
+static constexpr f2c_complex minus_one = {-1.0f, 0.0f};
+static const f2c_complex ninf;
+static const f2c_complex nan;
+};
+constexpr f2c_complex numeric_limits<f2c_complex>::one;
+constexpr f2c_complex numeric_limits<f2c_complex>::zero;
+constexpr f2c_complex numeric_limits<f2c_complex>::minus_one;
+const f2c_complex numeric_limits<f2c_complex>::ninf = {-NPY_INFINITYF, 0.0f};
+const f2c_complex numeric_limits<f2c_complex>::nan = {NPY_NANF, NPY_NANF};
+
+template<>
+struct numeric_limits<npy_cdouble> {
+static constexpr npy_cdouble one = {1.0, 0.0};
+static constexpr npy_cdouble zero = {0.0, 0.0};
+static constexpr npy_cdouble minus_one = {-1.0, 0.0};
+static const npy_cdouble ninf;
+static const npy_cdouble nan;
+};
+constexpr npy_cdouble numeric_limits<npy_cdouble>::one;
+constexpr npy_cdouble numeric_limits<npy_cdouble>::zero;
+constexpr npy_cdouble numeric_limits<npy_cdouble>::minus_one;
+const npy_cdouble numeric_limits<npy_cdouble>::ninf = {-NPY_INFINITY, 0.0};
+const npy_cdouble numeric_limits<npy_cdouble>::nan = {NPY_NAN, NPY_NAN};
+
+template<>
+struct numeric_limits<f2c_doublecomplex> {
+static constexpr f2c_doublecomplex one = {1.0, 0.0};
+static constexpr f2c_doublecomplex zero = {0.0, 0.0};
+static constexpr f2c_doublecomplex minus_one = {-1.0, 0.0};
+static const f2c_doublecomplex ninf;
+static const f2c_doublecomplex nan;
+};
+constexpr f2c_doublecomplex numeric_limits<f2c_doublecomplex>::one;
+constexpr f2c_doublecomplex numeric_limits<f2c_doublecomplex>::zero;
+constexpr f2c_doublecomplex numeric_limits<f2c_doublecomplex>::minus_one;
+const f2c_doublecomplex numeric_limits<f2c_doublecomplex>::ninf = {-NPY_INFINITY, 0.0};
+const f2c_doublecomplex numeric_limits<f2c_doublecomplex>::nan = {NPY_NAN, NPY_NAN};
+
+/*
+ *****************************************************************************
+ **               Structs used for data rearrangement                       **
+ *****************************************************************************
+ */
+
+
+/*
+ * this struct contains information about how to linearize a matrix in a local
+ * buffer so that it can be used by blas functions.  All strides are specified
+ * in bytes and are converted to elements later in type specific functions.
+ *
+ * rows: number of rows in the matrix
+ * columns: number of columns in the matrix
+ * row_strides: the number of bytes between consecutive rows.
+ * column_strides: the number of bytes between consecutive columns.
+ * output_lead_dim: BLAS/LAPACK-side leading dimension, in elements
+ */
+typedef struct linearize_data_struct
+{
+  npy_intp rows;
+  npy_intp columns;
+  npy_intp row_strides;
+  npy_intp column_strides;
+  npy_intp output_lead_dim;
+} LINEARIZE_DATA_t;
+
+static inline void
+init_linearize_data_ex(LINEARIZE_DATA_t *lin_data,
+                       npy_intp rows,
+                       npy_intp columns,
+                       npy_intp row_strides,
+                       npy_intp column_strides,
+                       npy_intp output_lead_dim)
+{
+    lin_data->rows = rows;
+    lin_data->columns = columns;
+    lin_data->row_strides = row_strides;
+    lin_data->column_strides = column_strides;
+    lin_data->output_lead_dim = output_lead_dim;
+}
+
+static inline void
+init_linearize_data(LINEARIZE_DATA_t *lin_data,
+                    npy_intp rows,
+                    npy_intp columns,
+                    npy_intp row_strides,
+                    npy_intp column_strides)
+{
+    init_linearize_data_ex(
+        lin_data, rows, columns, row_strides, column_strides, columns);
+}
+
+static inline void
+dump_ufunc_object(PyUFuncObject* ufunc)
+{
+    TRACE_TXT("\n\n%s '%s' (%d input(s), %d output(s), %d specialization(s)).\n",
+              ufunc->core_enabled? "generalized ufunc" : "scalar ufunc",
+              ufunc->name, ufunc->nin, ufunc->nout, ufunc->ntypes);
+    if (ufunc->core_enabled) {
+        int arg;
+        int dim;
+        TRACE_TXT("\t%s (%d dimension(s) detected).\n",
+                  ufunc->core_signature, ufunc->core_num_dim_ix);
+
+        for (arg = 0; arg < ufunc->nargs; arg++){
+            int * arg_dim_ix = ufunc->core_dim_ixs + ufunc->core_offsets[arg];
+            TRACE_TXT("\t\targ %d (%s) has %d dimension(s): (",
+                      arg, arg < ufunc->nin? "INPUT" : "OUTPUT",
+                      ufunc->core_num_dims[arg]);
+            for (dim = 0; dim < ufunc->core_num_dims[arg]; dim ++) {
+                TRACE_TXT(" %d", arg_dim_ix[dim]);
+            }
+            TRACE_TXT(" )\n");
+        }
+    }
+}
+
+static inline void
+dump_linearize_data(const char* name, const LINEARIZE_DATA_t* params)
+{
+    TRACE_TXT("\n\t%s rows: %zd columns: %zd"\
+              "\n\t\trow_strides: %td column_strides: %td"\
+              "\n", name, params->rows, params->columns,
+              params->row_strides, params->column_strides);
+}
+
+static inline void
+print(npy_float s)
+{
+    TRACE_TXT(" %8.4f", s);
+}
+static inline void
+print(npy_double d)
+{
+    TRACE_TXT(" %10.6f", d);
+}
+static inline void
+print(npy_cfloat c)
+{
+    float* c_parts = (float*)&c;
+    TRACE_TXT("(%8.4f, %8.4fj)", c_parts[0], c_parts[1]);
+}
+static inline void
+print(npy_cdouble z)
+{
+    double* z_parts = (double*)&z;
+    TRACE_TXT("(%8.4f, %8.4fj)", z_parts[0], z_parts[1]);
+}
+
+template<typename typ>
+static inline void
+dump_matrix(const char* name,
+                   size_t rows, size_t columns,
+                  const typ* ptr)
+{
+    size_t i, j;
+
+    TRACE_TXT("\n%s %p (%zu, %zu)\n", name, (const void*)ptr, rows, columns);
+    for (i = 0; i < rows; i++)
+    {
+        TRACE_TXT("| ");
+        for (j = 0; j < columns; j++)
+        {
+            print(ptr[j*rows + i]);
+            TRACE_TXT(", ");
+        }
+        TRACE_TXT(" |\n");
+    }
+}
+
+
+/*
+ *****************************************************************************
+ **                            Basics                                       **
+ *****************************************************************************
+ */
+
+static inline fortran_int
+fortran_int_min(fortran_int x, fortran_int y) {
+    return x < y ? x : y;
+}
+
+static inline fortran_int
+fortran_int_max(fortran_int x, fortran_int y) {
+    return x > y ? x : y;
+}
+
+#define INIT_OUTER_LOOP_1 \
+    npy_intp dN = *dimensions++;\
+    npy_intp N_;\
+    npy_intp s0 = *steps++;
+
+#define INIT_OUTER_LOOP_2 \
+    INIT_OUTER_LOOP_1\
+    npy_intp s1 = *steps++;
+
+#define INIT_OUTER_LOOP_3 \
+    INIT_OUTER_LOOP_2\
+    npy_intp s2 = *steps++;
+
+#define INIT_OUTER_LOOP_4 \
+    INIT_OUTER_LOOP_3\
+    npy_intp s3 = *steps++;
+
+#define INIT_OUTER_LOOP_5 \
+    INIT_OUTER_LOOP_4\
+    npy_intp s4 = *steps++;
+
+#define INIT_OUTER_LOOP_6  \
+    INIT_OUTER_LOOP_5\
+    npy_intp s5 = *steps++;
+
+#define INIT_OUTER_LOOP_7  \
+    INIT_OUTER_LOOP_6\
+    npy_intp s6 = *steps++;
+
+#define BEGIN_OUTER_LOOP_2 \
+    for (N_ = 0;\
+         N_ < dN;\
+         N_++, args[0] += s0,\
+             args[1] += s1) {
+
+#define BEGIN_OUTER_LOOP_3 \
+    for (N_ = 0;\
+         N_ < dN;\
+         N_++, args[0] += s0,\
+             args[1] += s1,\
+             args[2] += s2) {
+
+#define BEGIN_OUTER_LOOP_4 \
+    for (N_ = 0;\
+         N_ < dN;\
+         N_++, args[0] += s0,\
+             args[1] += s1,\
+             args[2] += s2,\
+             args[3] += s3) {
+
+#define BEGIN_OUTER_LOOP_5 \
+    for (N_ = 0;\
+         N_ < dN;\
+         N_++, args[0] += s0,\
+             args[1] += s1,\
+             args[2] += s2,\
+             args[3] += s3,\
+             args[4] += s4) {
+
+#define BEGIN_OUTER_LOOP_6 \
+    for (N_ = 0;\
+         N_ < dN;\
+         N_++, args[0] += s0,\
+             args[1] += s1,\
+             args[2] += s2,\
+             args[3] += s3,\
+             args[4] += s4,\
+             args[5] += s5) {
+
+#define BEGIN_OUTER_LOOP_7 \
+    for (N_ = 0;\
+         N_ < dN;\
+         N_++, args[0] += s0,\
+             args[1] += s1,\
+             args[2] += s2,\
+             args[3] += s3,\
+             args[4] += s4,\
+             args[5] += s5,\
+             args[6] += s6) {
+
+#define END_OUTER_LOOP  }
+
+static inline void
+update_pointers(npy_uint8** bases, ptrdiff_t* offsets, size_t count)
+{
+    size_t i;
+    for (i = 0; i < count; ++i) {
+        bases[i] += offsets[i];
+    }
+}
+
+
+/* Disable -Wmaybe-uninitialized, as some of the code below generates false
+   positives with this warning.
+*/
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+
+/*
+ *****************************************************************************
+ **                             DISPATCHER FUNCS                            **
+ *****************************************************************************
+ */
+static fortran_int copy(fortran_int *n,
+        float *sx, fortran_int *incx,
+        float *sy, fortran_int *incy) { return FNAME(scopy)(n, sx, incx,
+            sy, incy);
+}
+static fortran_int copy(fortran_int *n,
+        double *sx, fortran_int *incx,
+        double *sy, fortran_int *incy) { return FNAME(dcopy)(n, sx, incx,
+            sy, incy);
+}
+static fortran_int copy(fortran_int *n,
+        f2c_complex *sx, fortran_int *incx,
+        f2c_complex *sy, fortran_int *incy) { return FNAME(ccopy)(n, sx, incx,
+            sy, incy);
+}
+static fortran_int copy(fortran_int *n,
+        f2c_doublecomplex *sx, fortran_int *incx,
+        f2c_doublecomplex *sy, fortran_int *incy) { return FNAME(zcopy)(n, sx, incx,
+            sy, incy);
+}
+
+static fortran_int getrf(fortran_int *m, fortran_int *n, float a[], fortran_int
+*lda, fortran_int ipiv[], fortran_int *info) {
+ return LAPACK(sgetrf)(m, n, a, lda, ipiv, info);
+}
+static fortran_int getrf(fortran_int *m, fortran_int *n, double a[], fortran_int
+*lda, fortran_int ipiv[], fortran_int *info) {
+ return LAPACK(dgetrf)(m, n, a, lda, ipiv, info);
+}
+static fortran_int getrf(fortran_int *m, fortran_int *n, f2c_complex a[], fortran_int
+*lda, fortran_int ipiv[], fortran_int *info) {
+ return LAPACK(cgetrf)(m, n, a, lda, ipiv, info);
+}
+static fortran_int getrf(fortran_int *m, fortran_int *n, f2c_doublecomplex a[], fortran_int
+*lda, fortran_int ipiv[], fortran_int *info) {
+ return LAPACK(zgetrf)(m, n, a, lda, ipiv, info);
+}
+
+/*
+ *****************************************************************************
+ **                             HELPER FUNCS                                **
+ *****************************************************************************
+ */
+template<typename T>
+struct fortran_type {
+using type = T;
+};
+
+template<> struct fortran_type<npy_cfloat> { using type = f2c_complex;};
+template<> struct fortran_type<npy_cdouble> { using type = f2c_doublecomplex;};
+template<typename T>
+using fortran_type_t = typename fortran_type<T>::type;
+
+template<typename T>
+struct basetype {
+using type = T;
+};
+template<> struct basetype<npy_cfloat> { using type = npy_float;};
+template<> struct basetype<npy_cdouble> { using type = npy_double;};
+template<> struct basetype<f2c_complex> { using type = fortran_real;};
+template<> struct basetype<f2c_doublecomplex> { using type = fortran_doublereal;};
+template<typename T>
+using basetype_t = typename basetype<T>::type;
+
+             /* rearranging of 2D matrices using blas */
+
+template<typename typ>
+static inline void *
+linearize_matrix(typ *dst,
+                        typ *src,
+                        const LINEARIZE_DATA_t* data)
+{
+    using ftyp = fortran_type_t<typ>;
+    if (dst) {
+        int i, j;
+        typ* rv = dst;
+        fortran_int columns = (fortran_int)data->columns;
+        fortran_int column_strides =
+            (fortran_int)(data->column_strides/sizeof(typ));
+        fortran_int one = 1;
+        for (i = 0; i < data->rows; i++) {
+            if (column_strides > 0) {
+                copy(&columns,
+                              (ftyp*)src, &column_strides,
+                              (ftyp*)dst, &one);
+            }
+            else if (column_strides < 0) {
+                copy(&columns,
+                              ((ftyp*)src + (columns-1)*column_strides),
+                              &column_strides,
+                              (ftyp*)dst, &one);
+            }
+            else {
+                /*
+                 * Zero stride has undefined behavior in some BLAS
+                 * implementations (e.g. OSX Accelerate), so do it
+                 * manually
+                 */
+                for (j = 0; j < columns; ++j) {
+                    memcpy(dst + j, src, sizeof(typ));
+                }
+            }
+            src += data->row_strides/sizeof(typ);
+            dst += data->output_lead_dim;
+        }
+        return rv;
+    } else {
+        return src;
+    }
+}
+
+template<typename typ>
+static inline void *
+delinearize_matrix(typ *dst,
+                          typ *src,
+                          const LINEARIZE_DATA_t* data)
+{
+    using ftyp = fortran_type_t<typ>;
+
+    if (src) {
+        int i;
+        typ *rv = src;
+        fortran_int columns = (fortran_int)data->columns;
+        fortran_int column_strides =
+            (fortran_int)(data->column_strides/sizeof(typ));
+        fortran_int one = 1;
+        for (i = 0; i < data->rows; i++) {
+            if (column_strides > 0) {
+                copy(&columns,
+                              (ftyp*)src, &one,
+                              (ftyp*)dst, &column_strides);
+            }
+            else if (column_strides < 0) {
+                copy(&columns,
+                              (ftyp*)src, &one,
+                              ((ftyp*)dst + (columns-1)*column_strides),
+                              &column_strides);
+            }
+            else {
+                /*
+                 * Zero stride has undefined behavior in some BLAS
+                 * implementations (e.g. OSX Accelerate), so do it
+                 * manually
+                 */
+                if (columns > 0) {
+                    memcpy(dst,
+                           src + (columns-1),
+                           sizeof(typ));
+                }
+            }
+            src += data->output_lead_dim;
+            dst += data->row_strides/sizeof(typ);
+        }
+
+        return rv;
+    } else {
+        return src;
+    }
+}
+
+template<typename typ>
+static inline void
+nan_matrix(typ *dst, const LINEARIZE_DATA_t* data)
+{
+    int i, j;
+    for (i = 0; i < data->rows; i++) {
+        typ *cp = dst;
+        ptrdiff_t cs = data->column_strides/sizeof(typ);
+        for (j = 0; j < data->columns; ++j) {
+            *cp = numeric_limits<typ>::nan;
+            cp += cs;
+        }
+        dst += data->row_strides/sizeof(typ);
+    }
+}
+
+template<typename typ>
+static inline void
+zero_matrix(typ *dst, const LINEARIZE_DATA_t* data)
+{
+    int i, j;
+    for (i = 0; i < data->rows; i++) {
+        typ *cp = dst;
+        ptrdiff_t cs = data->column_strides/sizeof(typ);
+        for (j = 0; j < data->columns; ++j) {
+            *cp = numeric_limits<typ>::zero;
+            cp += cs;
+        }
+        dst += data->row_strides/sizeof(typ);
+    }
+}
+
+               /* identity square matrix generation */
+template<typename typ>
+static inline void
+identity_matrix(typ *matrix, size_t n)
+{
+    size_t i;
+    /* in IEEE floating point, zeroes are represented as bitwise 0 */
+    memset(matrix, 0, n*n*sizeof(typ));
+
+    for (i = 0; i < n; ++i)
+    {
+        *matrix = numeric_limits<typ>::one;
+        matrix += n+1;
+    }
+}
+
+  /* zero the entries above the diagonal of a column-major square matrix (in place) */
+
+template<typename typ>
+static inline void
+triu_matrix(typ *matrix, size_t n)
+{
+    size_t i, j;
+    matrix += n;
+    for (i = 1; i < n; ++i) {
+        for (j = 0; j < i; ++j) {
+            matrix[j] = numeric_limits<typ>::zero;
+        }
+        matrix += n;
+    }
+}
+
+
+/* -------------------------------------------------------------------------- */
+                          /* Determinants */
+
+static npy_float npylog(npy_float f) { return npy_logf(f);}
+static npy_double npylog(npy_double d) { return npy_log(d);}
+static npy_float npyexp(npy_float f) { return npy_expf(f);}
+static npy_double npyexp(npy_double d) { return npy_exp(d);}
+
+template<typename typ>
+static inline void
+slogdet_from_factored_diagonal(typ* src,
+                                      fortran_int m,
+                                      typ *sign,
+                                      typ *logdet)
+{
+    typ acc_sign = *sign;
+    typ acc_logdet = numeric_limits<typ>::zero;
+    int i;
+    for (i = 0; i < m; i++) {
+        typ abs_element = *src;
+        if (abs_element < numeric_limits<typ>::zero) {
+            acc_sign = -acc_sign;
+            abs_element = -abs_element;
+        }
+
+        acc_logdet += npylog(abs_element);
+        src += m+1;
+    }
+
+    *sign = acc_sign;
+    *logdet = acc_logdet;
+}
+
+template<typename typ>
+static inline typ
+det_from_slogdet(typ sign, typ logdet)
+{
+    typ result = sign * npyexp(logdet);
+    return result;
+}
+
+
+npy_float npyabs(npy_cfloat z) { return npy_cabsf(z);}
+npy_double npyabs(npy_cdouble z) { return npy_cabs(z);}
+
+#define RE(COMPLEX) (COMPLEX).real
+#define IM(COMPLEX) (COMPLEX).imag
+
+template<typename typ>
+static inline typ
+mult(typ op1, typ op2)
+{
+    typ rv;
+
+    RE(rv) = RE(op1)*RE(op2) - IM(op1)*IM(op2);
+    IM(rv) = RE(op1)*IM(op2) + IM(op1)*RE(op2);
+
+    return rv;
+}
+
+
+template<typename typ, typename basetyp>
+static inline void
+slogdet_from_factored_diagonal(typ* src,
+                                      fortran_int m,
+                                      typ *sign,
+                                      basetyp *logdet)
+{
+    int i;
+    typ sign_acc = *sign;
+    basetyp logdet_acc = numeric_limits<basetyp>::zero;
+
+    for (i = 0; i < m; i++)
+    {
+        basetyp abs_element = npyabs(*src);
+        typ sign_element;
+        RE(sign_element) = RE(*src) / abs_element;
+        IM(sign_element) = IM(*src) / abs_element;
+
+        sign_acc = mult(sign_acc, sign_element);
+        logdet_acc += npylog(abs_element);
+        src += m + 1;
+    }
+
+    *sign = sign_acc;
+    *logdet = logdet_acc;
+}
+
+template<typename typ, typename basetyp>
+static inline typ
+det_from_slogdet(typ sign, basetyp logdet)
+{
+    typ tmp;
+    RE(tmp) = npyexp(logdet);
+    IM(tmp) = numeric_limits<basetyp>::zero;
+    return mult(sign, tmp);
+}
+#undef RE
+#undef IM
+
+
+/* As in the linalg package, the determinant is computed via LU factorization
+ * using LAPACK.
+ * slogdet computes the sign and the log of the absolute value of the
+ * determinant.
+ * det computes sign * exp(logdet).
+ */
+template<typename typ, typename basetyp>
+static inline void
+slogdet_single_element(fortran_int m,
+                              typ* src,
+                              fortran_int* pivots,
+                              typ *sign,
+                              basetyp *logdet)
+{
+    using ftyp = fortran_type_t<typ>;
+    fortran_int info = 0;
+    fortran_int lda = fortran_int_max(m, 1);
+    int i;
+    /* note: done in place */
+    getrf(&m, &m, (ftyp*)src, &lda, pivots, &info);
+
+    if (info == 0) {
+        int change_sign = 0;
+        /* note: fortran uses 1 based indexing */
+        for (i = 0; i < m; i++)
+        {
+            change_sign += (pivots[i] != (i+1));
+        }
+
+        *sign = (change_sign % 2)?numeric_limits<typ>::minus_one:numeric_limits<typ>::one;
+        slogdet_from_factored_diagonal(src, m, sign, logdet);
+    } else {
+        /*
+          if getrf fails, use 0 as sign and -inf as logdet
+        */
+        *sign = numeric_limits<typ>::zero;
+        *logdet = numeric_limits<basetyp>::ninf;
+    }
+}
+
+template<typename typ, typename basetyp>
+static void
+slogdet(char **args,
+               npy_intp const *dimensions,
+               npy_intp const *steps,
+               void *NPY_UNUSED(func))
+{
+    fortran_int m;
+    npy_uint8 *tmp_buff = NULL;
+    size_t matrix_size;
+    size_t pivot_size;
+    size_t safe_m;
+    /* notes:
+     *   matrix will need to be copied always, as factorization in lapack is
+     *          made inplace
+     *   matrix will need to be in column-major order, as expected by lapack
+     *          code (fortran)
+     *   always a square matrix
+     *   need to allocate memory for both, matrix_buffer and pivot buffer
+     */
+    INIT_OUTER_LOOP_3
+    m = (fortran_int) dimensions[0];
+    safe_m = m;
+    matrix_size = safe_m * safe_m * sizeof(typ);
+    pivot_size = safe_m * sizeof(fortran_int);
+    tmp_buff = (npy_uint8 *)malloc(matrix_size + pivot_size);
+
+    if (tmp_buff) {
+        LINEARIZE_DATA_t lin_data;
+        /* swapped steps to get matrix in FORTRAN order */
+        init_linearize_data(&lin_data, m, m, steps[1], steps[0]);
+        BEGIN_OUTER_LOOP_3
+            linearize_matrix((typ*)tmp_buff, (typ*)args[0], &lin_data);
+            slogdet_single_element(m,
+                                   (typ*)tmp_buff,
+                                          (fortran_int*)(tmp_buff+matrix_size),
+                                          (typ*)args[1],
+                                          (basetyp*)args[2]);
+        END_OUTER_LOOP
+
+        free(tmp_buff);
+    }
+}
+
+template<typename typ, typename basetyp>
+static void
+det(char **args,
+           npy_intp const *dimensions,
+           npy_intp const *steps,
+           void *NPY_UNUSED(func))
+{
+    fortran_int m;
+    npy_uint8 *tmp_buff;
+    size_t matrix_size;
+    size_t pivot_size;
+    size_t safe_m;
+    /* notes:
+     *   matrix will need to be copied always, as factorization in lapack is
+     *       made inplace
+     *   matrix will need to be in column-major order, as expected by lapack
+     *       code (fortran)
+     *   always a square matrix
+     *   need to allocate memory for both, matrix_buffer and pivot buffer
+     */
+    INIT_OUTER_LOOP_2
+    m = (fortran_int) dimensions[0];
+    safe_m = m;
+    matrix_size = safe_m * safe_m * sizeof(typ);
+    pivot_size = safe_m * sizeof(fortran_int);
+    tmp_buff = (npy_uint8 *)malloc(matrix_size + pivot_size);
+
+    if (tmp_buff) {
+        LINEARIZE_DATA_t lin_data;
+        typ sign;
+        basetyp logdet;
+        /* swapped steps to get matrix in FORTRAN order */
+        init_linearize_data(&lin_data, m, m, steps[1], steps[0]);
+
+        BEGIN_OUTER_LOOP_2
+            linearize_matrix((typ*)tmp_buff, (typ*)args[0], &lin_data);
+            slogdet_single_element(m,
+                                         (typ*)tmp_buff,
+                                          (fortran_int*)(tmp_buff + matrix_size),
+                                          &sign,
+                                          &logdet);
+            *(typ *)args[1] = det_from_slogdet(sign, logdet);
+        END_OUTER_LOOP
+
+        free(tmp_buff);
+    }
+}
+
+
+/* -------------------------------------------------------------------------- */
+                          /* Eigh family */
+
+template<typename typ>
+struct EIGH_PARAMS_t {
+    typ *A;     /* matrix */
+    basetype_t<typ> *W;     /* eigenvalue vector */
+    typ *WORK;  /* main work buffer */
+    basetype_t<typ> *RWORK; /* secondary work buffer (for complex versions) */
+    fortran_int *IWORK;
+    fortran_int N;
+    fortran_int LWORK;
+    fortran_int LRWORK;
+    fortran_int LIWORK;
+    char JOBZ;
+    char UPLO;
+    fortran_int LDA;
+} ;
+
+static inline fortran_int
+call_evd(EIGH_PARAMS_t<npy_float> *params)
+{
+    fortran_int rv;
+    LAPACK(ssyevd)(&params->JOBZ, &params->UPLO, &params->N,
+                          params->A, &params->LDA, params->W,
+                          params->WORK, &params->LWORK,
+                          params->IWORK, &params->LIWORK,
+                          &rv);
+    return rv;
+}
+static inline fortran_int
+call_evd(EIGH_PARAMS_t<npy_double> *params)
+{
+    fortran_int rv;
+    LAPACK(dsyevd)(&params->JOBZ, &params->UPLO, &params->N,
+                          params->A, &params->LDA, params->W,
+                          params->WORK, &params->LWORK,
+                          params->IWORK, &params->LIWORK,
+                          &rv);
+    return rv;
+}
+
+
+/*
+ * Initialize the parameters to use for the LAPACK function _syevd.
+ * Handles buffer allocation.
+ */
+template<typename typ>
+static inline int
+init_evd(EIGH_PARAMS_t<typ>* params, char JOBZ, char UPLO,
+                   fortran_int N, scalar_trait)
+{
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    fortran_int lwork;
+    fortran_int liwork;
+    npy_uint8 *a, *w, *work, *iwork;
+    size_t safe_N = N;
+    size_t alloc_size = safe_N * (safe_N + 1) * sizeof(typ);
+    fortran_int lda = fortran_int_max(N, 1);
+
+    mem_buff = (npy_uint8 *)malloc(alloc_size);
+
+    if (!mem_buff) {
+        goto error;
+    }
+    a = mem_buff;
+    w = mem_buff + safe_N * safe_N * sizeof(typ);
+
+    params->A = (typ*)a;
+    params->W = (typ*)w;
+    params->RWORK = NULL; /* unused */
+    params->N = N;
+    params->LRWORK = 0; /* unused */
+    params->JOBZ = JOBZ;
+    params->UPLO = UPLO;
+    params->LDA = lda;
+
+    /* Work size query */
+    {
+        typ query_work_size;
+        fortran_int query_iwork_size;
+
+        params->LWORK = -1;
+        params->LIWORK = -1;
+        params->WORK = &query_work_size;
+        params->IWORK = &query_iwork_size;
+
+        if (call_evd(params) != 0) {
+            goto error;
+        }
+
+        lwork = (fortran_int)query_work_size;
+        liwork = query_iwork_size;
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(lwork*sizeof(typ) + liwork*sizeof(fortran_int));
+    if (!mem_buff2) {
+        goto error;
+    }
+
+    work = mem_buff2;
+    iwork = mem_buff2 + lwork*sizeof(typ);
+
+    params->LWORK = lwork;
+    params->WORK = (typ*)work;
+    params->LIWORK = liwork;
+    params->IWORK = (fortran_int*)iwork;
+
+    return 1;
+
+ error:
+    /* something failed */
+    memset(params, 0, sizeof(*params));
+    free(mem_buff2);
+    free(mem_buff);
+
+    return 0;
+}
+
+
+static inline fortran_int
+call_evd(EIGH_PARAMS_t<npy_cfloat> *params)
+{
+    fortran_int rv;
+    LAPACK(cheevd)(&params->JOBZ, &params->UPLO, &params->N,
+                          (fortran_type_t<npy_cfloat>*)params->A, &params->LDA, params->W,
+                          (fortran_type_t<npy_cfloat>*)params->WORK, &params->LWORK,
+                          params->RWORK, &params->LRWORK,
+                          params->IWORK, &params->LIWORK,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_evd(EIGH_PARAMS_t<npy_cdouble> *params)
+{
+    fortran_int rv;
+    LAPACK(zheevd)(&params->JOBZ, &params->UPLO, &params->N,
+                          (fortran_type_t<npy_cdouble>*)params->A, &params->LDA, params->W,
+                          (fortran_type_t<npy_cdouble>*)params->WORK, &params->LWORK,
+                          params->RWORK, &params->LRWORK,
+                          params->IWORK, &params->LIWORK,
+                          &rv);
+    return rv;
+}
+
+template<typename typ>
+static inline int
+init_evd(EIGH_PARAMS_t<typ> *params,
+                   char JOBZ,
+                   char UPLO,
+                   fortran_int N, complex_trait)
+{
+    using basetyp = basetype_t<typ>;
+    using ftyp = fortran_type_t<typ>;
+    using fbasetyp = fortran_type_t<basetyp>;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    fortran_int lwork;
+    fortran_int lrwork;
+    fortran_int liwork;
+    npy_uint8 *a, *w, *work, *rwork, *iwork;
+    size_t safe_N = N;
+    fortran_int lda = fortran_int_max(N, 1);
+
+    mem_buff = (npy_uint8 *)malloc(safe_N * safe_N * sizeof(typ) +
+                      safe_N * sizeof(basetyp));
+    if (!mem_buff) {
+        goto error;
+    }
+    a = mem_buff;
+    w = mem_buff + safe_N * safe_N * sizeof(typ);
+
+    params->A = (typ*)a;
+    params->W = (basetyp*)w;
+    params->N = N;
+    params->JOBZ = JOBZ;
+    params->UPLO = UPLO;
+    params->LDA = lda;
+
+    /* Work size query */
+    {
+        ftyp query_work_size;
+        fbasetyp query_rwork_size;
+        fortran_int query_iwork_size;
+
+        params->LWORK = -1;
+        params->LRWORK = -1;
+        params->LIWORK = -1;
+        params->WORK = (typ*)&query_work_size;
+        params->RWORK = (basetyp*)&query_rwork_size;
+        params->IWORK = &query_iwork_size;
+
+        if (call_evd(params) != 0) {
+            goto error;
+        }
+
+        lwork = (fortran_int)*(fbasetyp*)&query_work_size;
+        lrwork = (fortran_int)query_rwork_size;
+        liwork = query_iwork_size;
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(lwork*sizeof(typ) +
+                       lrwork*sizeof(basetyp) +
+                       liwork*sizeof(fortran_int));
+    if (!mem_buff2) {
+        goto error;
+    }
+
+    work = mem_buff2;
+    rwork = work + lwork*sizeof(typ);
+    iwork = rwork + lrwork*sizeof(basetyp);
+
+    params->WORK = (typ*)work;
+    params->RWORK = (basetyp*)rwork;
+    params->IWORK = (fortran_int*)iwork;
+    params->LWORK = lwork;
+    params->LRWORK = lrwork;
+    params->LIWORK = liwork;
+
+    return 1;
+
+    /* something failed */
+error:
+    memset(params, 0, sizeof(*params));
+    free(mem_buff2);
+    free(mem_buff);
+
+    return 0;
+}
+
+/*
+ * (M, M)->(M,)(M, M)
+ * dimensions[1] -> M
+ * args[0] -> A[in]
+ * args[1] -> W
+ * args[2] -> A[out]
+ */
+
+template<typename typ>
+static inline void
+release_evd(EIGH_PARAMS_t<typ> *params)
+{
+    /* allocated memory in A and WORK */
+    free(params->A);
+    free(params->WORK);
+    memset(params, 0, sizeof(*params));
+}
+
+
+template<typename typ>
+static inline void
+eigh_wrapper(char JOBZ,
+                    char UPLO,
+                    char**args,
+                    npy_intp const *dimensions,
+                    npy_intp const *steps)
+{
+    using basetyp = basetype_t<typ>;
+    ptrdiff_t outer_steps[3];
+    size_t iter;
+    size_t outer_dim = *dimensions++;
+    size_t op_count = (JOBZ=='N')?2:3;
+    EIGH_PARAMS_t<typ> eigh_params;
+    int error_occurred = get_fp_invalid_and_clear();
+
+    for (iter = 0; iter < op_count; ++iter) {
+        outer_steps[iter] = (ptrdiff_t) steps[iter];
+    }
+    steps += op_count;
+
+    if (init_evd(&eigh_params,
+                           JOBZ,
+                           UPLO,
+                           (fortran_int)dimensions[0], dispatch_scalar<typ>())) {
+        LINEARIZE_DATA_t matrix_in_ld;
+        LINEARIZE_DATA_t eigenvectors_out_ld;
+        LINEARIZE_DATA_t eigenvalues_out_ld;
+
+        init_linearize_data(&matrix_in_ld,
+                            eigh_params.N, eigh_params.N,
+                            steps[1], steps[0]);
+        init_linearize_data(&eigenvalues_out_ld,
+                            1, eigh_params.N,
+                            0, steps[2]);
+        if ('V' == eigh_params.JOBZ) {
+            init_linearize_data(&eigenvectors_out_ld,
+                                eigh_params.N, eigh_params.N,
+                                steps[4], steps[3]);
+        }
+
+        for (iter = 0; iter < outer_dim; ++iter) {
+            int not_ok;
+            /* copy the matrix in */
+            linearize_matrix((typ*)eigh_params.A, (typ*)args[0], &matrix_in_ld);
+            not_ok = call_evd(&eigh_params);
+            if (!not_ok) {
+                /* lapack ok, copy result out */
+                delinearize_matrix((basetyp*)args[1],
+                                              (basetyp*)eigh_params.W,
+                                              &eigenvalues_out_ld);
+
+                if ('V' == eigh_params.JOBZ) {
+                    delinearize_matrix((typ*)args[2],
+                                              (typ*)eigh_params.A,
+                                              &eigenvectors_out_ld);
+                }
+            } else {
+                /* lapack fail, set result to nan */
+                error_occurred = 1;
+                nan_matrix((basetyp*)args[1], &eigenvalues_out_ld);
+                if ('V' == eigh_params.JOBZ) {
+                    nan_matrix((typ*)args[2], &eigenvectors_out_ld);
+                }
+            }
+            update_pointers((npy_uint8**)args, outer_steps, op_count);
+        }
+
+        release_evd(&eigh_params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+
+template<typename typ>
+static void
+eighlo(char **args,
+              npy_intp const *dimensions,
+              npy_intp const *steps,
+              void *NPY_UNUSED(func))
+{
+    eigh_wrapper<typ>('V', 'L', args, dimensions, steps);
+}
+
+template<typename typ>
+static void
+eighup(char **args,
+              npy_intp const *dimensions,
+              npy_intp const *steps,
+              void* NPY_UNUSED(func))
+{
+    eigh_wrapper<typ>('V', 'U', args, dimensions, steps);
+}
+
+template<typename typ>
+static void
+eigvalshlo(char **args,
+                  npy_intp const *dimensions,
+                  npy_intp const *steps,
+                  void* NPY_UNUSED(func))
+{
+    eigh_wrapper<typ>('N', 'L', args, dimensions, steps);
+}
+
+template<typename typ>
+static void
+eigvalshup(char **args,
+                  npy_intp const *dimensions,
+                  npy_intp const *steps,
+                  void* NPY_UNUSED(func))
+{
+    eigh_wrapper<typ>('N', 'U', args, dimensions, steps);
+}
+
+/* -------------------------------------------------------------------------- */
+                  /* Solve family (includes inv) */
+
+template<typename typ>
+struct GESV_PARAMS_t
+{
+    typ *A; /* A is (N, N) of base type */
+    typ *B; /* B is (N, NRHS) of base type */
+    fortran_int * IPIV; /* IPIV is (N) */
+
+    fortran_int N;
+    fortran_int NRHS;
+    fortran_int LDA;
+    fortran_int LDB;
+};
+
+static inline fortran_int
+call_gesv(GESV_PARAMS_t<fortran_real> *params)
+{
+    fortran_int rv;
+    LAPACK(sgesv)(&params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->IPIV,
+                          params->B, &params->LDB,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_gesv(GESV_PARAMS_t<fortran_doublereal> *params)
+{
+    fortran_int rv;
+    LAPACK(dgesv)(&params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->IPIV,
+                          params->B, &params->LDB,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_gesv(GESV_PARAMS_t<fortran_complex> *params)
+{
+    fortran_int rv;
+    LAPACK(cgesv)(&params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->IPIV,
+                          params->B, &params->LDB,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_gesv(GESV_PARAMS_t<fortran_doublecomplex> *params)
+{
+    fortran_int rv;
+    LAPACK(zgesv)(&params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->IPIV,
+                          params->B, &params->LDB,
+                          &rv);
+    return rv;
+}
+
+
+/*
+ * Initialize the parameters to use for the LAPACK function _gesv
+ * Handles buffer allocation
+ */
+template<typename ftyp>
+static inline int
+init_gesv(GESV_PARAMS_t<ftyp> *params, fortran_int N, fortran_int NRHS)
+{
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *a, *b, *ipiv;
+    size_t safe_N = N;
+    size_t safe_NRHS = NRHS;
+    fortran_int ld = fortran_int_max(N, 1);
+    mem_buff = (npy_uint8 *)malloc(safe_N * safe_N * sizeof(ftyp) +
+                      safe_N * safe_NRHS*sizeof(ftyp) +
+                      safe_N * sizeof(fortran_int));
+    if (!mem_buff) {
+        goto error;
+    }
+    a = mem_buff;
+    b = a + safe_N * safe_N * sizeof(ftyp);
+    ipiv = b + safe_N * safe_NRHS * sizeof(ftyp);
+
+    params->A = (ftyp*)a;
+    params->B = (ftyp*)b;
+    params->IPIV = (fortran_int*)ipiv;
+    params->N = N;
+    params->NRHS = NRHS;
+    params->LDA = ld;
+    params->LDB = ld;
+
+    return 1;
+ error:
+    free(mem_buff);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+template<typename ftyp>
+static inline void
+release_gesv(GESV_PARAMS_t<ftyp> *params)
+{
+    /* memory block base is in A */
+    free(params->A);
+    memset(params, 0, sizeof(*params));
+}
+
+template<typename typ>
+static void
+solve(char **args, npy_intp const *dimensions, npy_intp const *steps,
+             void *NPY_UNUSED(func))
+{
+    using ftyp = fortran_type_t<typ>;
+    GESV_PARAMS_t<ftyp> params;
+    fortran_int n, nrhs;
+    int error_occurred = get_fp_invalid_and_clear();
+    INIT_OUTER_LOOP_3
+
+    n = (fortran_int)dimensions[0];
+    nrhs = (fortran_int)dimensions[1];
+    if (init_gesv(&params, n, nrhs)) {
+        LINEARIZE_DATA_t a_in, b_in, r_out;
+
+        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
+        init_linearize_data(&b_in, nrhs, n, steps[3], steps[2]);
+        init_linearize_data(&r_out, nrhs, n, steps[5], steps[4]);
+
+        BEGIN_OUTER_LOOP_3
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.B, (typ*)args[1], &b_in);
+            not_ok = call_gesv(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[2], (typ*)params.B, &r_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[2], &r_out);
+            }
+        END_OUTER_LOOP
+
+        release_gesv(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+
+template<typename typ>
+static void
+solve1(char **args, npy_intp const *dimensions, npy_intp const *steps,
+              void *NPY_UNUSED(func))
+{
+    using ftyp = fortran_type_t<typ>;
+    GESV_PARAMS_t<ftyp> params;
+    int error_occurred = get_fp_invalid_and_clear();
+    fortran_int n;
+    INIT_OUTER_LOOP_3
+
+    n = (fortran_int)dimensions[0];
+    if (init_gesv(&params, n, 1)) {
+        LINEARIZE_DATA_t a_in, b_in, r_out;
+        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
+        init_linearize_data(&b_in, 1, n, 1, steps[2]);
+        init_linearize_data(&r_out, 1, n, 1, steps[3]);
+
+        BEGIN_OUTER_LOOP_3
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.B, (typ*)args[1], &b_in);
+            not_ok = call_gesv(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[2], (typ*)params.B, &r_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[2], &r_out);
+            }
+        END_OUTER_LOOP
+
+        release_gesv(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+template<typename typ>
+static void
+inv(char **args, npy_intp const *dimensions, npy_intp const *steps,
+           void *NPY_UNUSED(func))
+{
+    using ftyp = fortran_type_t<typ>;
+    GESV_PARAMS_t<ftyp> params;
+    fortran_int n;
+    int error_occurred = get_fp_invalid_and_clear();
+    INIT_OUTER_LOOP_2
+
+    n = (fortran_int)dimensions[0];
+    if (init_gesv(&params, n, n)) {
+        LINEARIZE_DATA_t a_in, r_out;
+        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
+        init_linearize_data(&r_out, n, n, steps[3], steps[2]);
+
+        BEGIN_OUTER_LOOP_2
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            identity_matrix((typ*)params.B, n);
+            not_ok = call_gesv(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[1], (typ*)params.B, &r_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[1], &r_out);
+            }
+        END_OUTER_LOOP
+
+        release_gesv(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+
+/* -------------------------------------------------------------------------- */
+                     /* Cholesky decomposition */
+
+template<typename typ>
+struct POTR_PARAMS_t
+{
+    typ *A;
+    fortran_int N;
+    fortran_int LDA;
+    char UPLO;
+};
+
+
+static inline fortran_int
+call_potrf(POTR_PARAMS_t<fortran_real> *params)
+{
+    fortran_int rv;
+    LAPACK(spotrf)(&params->UPLO,
+                          &params->N, params->A, &params->LDA,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_potrf(POTR_PARAMS_t<fortran_doublereal> *params)
+{
+    fortran_int rv;
+    LAPACK(dpotrf)(&params->UPLO,
+                          &params->N, params->A, &params->LDA,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_potrf(POTR_PARAMS_t<fortran_complex> *params)
+{
+    fortran_int rv;
+    LAPACK(cpotrf)(&params->UPLO,
+                          &params->N, params->A, &params->LDA,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_potrf(POTR_PARAMS_t<fortran_doublecomplex> *params)
+{
+    fortran_int rv;
+    LAPACK(zpotrf)(&params->UPLO,
+                          &params->N, params->A, &params->LDA,
+                          &rv);
+    return rv;
+}
+
+template<typename ftyp>
+static inline int
+init_potrf(POTR_PARAMS_t<ftyp> *params, char UPLO, fortran_int N)
+{
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *a;
+    size_t safe_N = N;
+    fortran_int lda = fortran_int_max(N, 1);
+
+    mem_buff = (npy_uint8 *)malloc(safe_N * safe_N * sizeof(ftyp));
+    if (!mem_buff) {
+        goto error;
+    }
+
+    a = mem_buff;
+
+    params->A = (ftyp*)a;
+    params->N = N;
+    params->LDA = lda;
+    params->UPLO = UPLO;
+
+    return 1;
+ error:
+    free(mem_buff);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+template<typename ftyp>
+static inline void
+release_potrf(POTR_PARAMS_t<ftyp> *params)
+{
+    /* memory block base in A */
+    free(params->A);
+    memset(params, 0, sizeof(*params));
+}
+
+template<typename typ>
+static void
+cholesky(char uplo, char **args, npy_intp const *dimensions, npy_intp const *steps)
+{
+    using ftyp = fortran_type_t<typ>;
+    POTR_PARAMS_t<ftyp> params;
+    int error_occurred = get_fp_invalid_and_clear();
+    fortran_int n;
+    INIT_OUTER_LOOP_2
+
+    assert(uplo == 'L');
+
+    n = (fortran_int)dimensions[0];
+    if (init_potrf(&params, uplo, n)) {
+        LINEARIZE_DATA_t a_in, r_out;
+        init_linearize_data(&a_in, n, n, steps[1], steps[0]);
+        init_linearize_data(&r_out, n, n, steps[3], steps[2]);
+        BEGIN_OUTER_LOOP_2
+            int not_ok;
+            linearize_matrix(params.A, (ftyp*)args[0], &a_in);
+            not_ok = call_potrf(&params);
+            if (!not_ok) {
+                triu_matrix(params.A, params.N);
+                delinearize_matrix((ftyp*)args[1], params.A, &r_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((ftyp*)args[1], &r_out);
+            }
+        END_OUTER_LOOP
+        release_potrf(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+template<typename typ>
+static void
+cholesky_lo(char **args, npy_intp const *dimensions, npy_intp const *steps,
+                void *NPY_UNUSED(func))
+{
+    cholesky<typ>('L', args, dimensions, steps);
+}
+
+/* -------------------------------------------------------------------------- */
+                          /* eig family  */
+
+template<typename typ>
+struct GEEV_PARAMS_t {
+    typ *A;
+    basetype_t<typ> *WR; /* RWORK in complex versions, REAL W buffer for (sd)geev*/
+    typ *WI;
+    typ *VLR; /* REAL VL buffers for _geev where _ is s, d */
+    typ *VRR; /* REAL VR buffers for _geev where _ is s, d */
+    typ *WORK;
+    typ *W;  /* final w */
+    typ *VL; /* final vl */
+    typ *VR; /* final vr */
+
+    fortran_int N;
+    fortran_int LDA;
+    fortran_int LDVL;
+    fortran_int LDVR;
+    fortran_int LWORK;
+
+    char JOBVL;
+    char JOBVR;
+};
+
+template<typename typ>
+static inline void
+dump_geev_params(const char *name, GEEV_PARAMS_t<typ>* params)
+{
+    TRACE_TXT("\n%s\n"
+
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+              "\t%10s: %p\n"\
+
+              "\t%10s: %d\n"\
+              "\t%10s: %d\n"\
+              "\t%10s: %d\n"\
+              "\t%10s: %d\n"\
+              "\t%10s: %d\n"\
+
+              "\t%10s: %c\n"\
+              "\t%10s: %c\n",
+
+              name,
+
+              "A", params->A,
+              "WR", params->WR,
+              "WI", params->WI,
+              "VLR", params->VLR,
+              "VRR", params->VRR,
+              "WORK", params->WORK,
+              "W", params->W,
+              "VL", params->VL,
+              "VR", params->VR,
+
+              "N", (int)params->N,
+              "LDA", (int)params->LDA,
+              "LDVL", (int)params->LDVL,
+              "LDVR", (int)params->LDVR,
+              "LWORK", (int)params->LWORK,
+
+              "JOBVL", params->JOBVL,
+              "JOBVR", params->JOBVR);
+}
+
+static inline fortran_int
+call_geev(GEEV_PARAMS_t<float>* params)
+{
+    fortran_int rv;
+    LAPACK(sgeev)(&params->JOBVL, &params->JOBVR,
+                          &params->N, params->A, &params->LDA,
+                          params->WR, params->WI,
+                          params->VLR, &params->LDVL,
+                          params->VRR, &params->LDVR,
+                          params->WORK, &params->LWORK,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_geev(GEEV_PARAMS_t<double>* params)
+{
+    fortran_int rv;
+    LAPACK(dgeev)(&params->JOBVL, &params->JOBVR,
+                          &params->N, params->A, &params->LDA,
+                          params->WR, params->WI,
+                          params->VLR, &params->LDVL,
+                          params->VRR, &params->LDVR,
+                          params->WORK, &params->LWORK,
+                          &rv);
+    return rv;
+}
+
+
+template<typename typ>
+static inline int
+init_geev(GEEV_PARAMS_t<typ> *params, char jobvl, char jobvr, fortran_int n,
+          scalar_trait)
+{
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *wr, *wi, *vlr, *vrr, *work, *w, *vl, *vr;
+    size_t safe_n = n;
+    size_t a_size = safe_n * safe_n * sizeof(typ);
+    size_t wr_size = safe_n * sizeof(typ);
+    size_t wi_size = safe_n * sizeof(typ);
+    size_t vlr_size = jobvl=='V' ? safe_n * safe_n * sizeof(typ) : 0;
+    size_t vrr_size = jobvr=='V' ? safe_n * safe_n * sizeof(typ) : 0;
+    size_t w_size = wr_size*2;
+    size_t vl_size = vlr_size*2;
+    size_t vr_size = vrr_size*2;
+    size_t work_count = 0;
+    fortran_int ld = fortran_int_max(n, 1);
+
+    /* allocate data for known sizes (all but work) */
+    mem_buff = (npy_uint8 *)malloc(a_size + wr_size + wi_size +
+                      vlr_size + vrr_size +
+                      w_size + vl_size + vr_size);
+    if (!mem_buff) {
+        goto error;
+    }
+
+    a = mem_buff;
+    wr = a + a_size;
+    wi = wr + wr_size;
+    vlr = wi + wi_size;
+    vrr = vlr + vlr_size;
+    w = vrr + vrr_size;
+    vl = w + w_size;
+    vr = vl + vl_size;
+
+    params->A = (typ*)a;
+    params->WR = (typ*)wr;
+    params->WI = (typ*)wi;
+    params->VLR = (typ*)vlr;
+    params->VRR = (typ*)vrr;
+    params->W = (typ*)w;
+    params->VL = (typ*)vl;
+    params->VR = (typ*)vr;
+    params->N = n;
+    params->LDA = ld;
+    params->LDVL = ld;
+    params->LDVR = ld;
+    params->JOBVL = jobvl;
+    params->JOBVR = jobvr;
+
+    /* Work size query */
+    {
+        typ work_size_query;
+
+        params->LWORK = -1;
+        params->WORK = &work_size_query;
+
+        if (call_geev(params) != 0) {
+            goto error;
+        }
+
+        work_count = (size_t)work_size_query;
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(work_count*sizeof(typ));
+    if (!mem_buff2) {
+        goto error;
+    }
+    work = mem_buff2;
+
+    params->LWORK = (fortran_int)work_count;
+    params->WORK = (typ*)work;
+
+    return 1;
+ error:
+    free(mem_buff2);
+    free(mem_buff);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+template<typename complextyp, typename typ>
+static inline void
+mk_complex_array_from_real(complextyp *c, const typ *re, size_t n)
+{
+    size_t iter;
+    for (iter = 0; iter < n; ++iter) {
+        c[iter].r = re[iter];
+        c[iter].i = numeric_limits<typ>::zero;
+    }
+}
+
+template<typename complextyp, typename typ>
+static inline void
+mk_complex_array(complextyp *c,
+                        const typ *re,
+                        const typ *im,
+                        size_t n)
+{
+    size_t iter;
+    for (iter = 0; iter < n; ++iter) {
+        c[iter].r = re[iter];
+        c[iter].i = im[iter];
+    }
+}
+
+template<typename complextyp, typename typ>
+static inline void
+mk_complex_array_conjugate_pair(complextyp *c,
+                                       const typ *r,
+                                       size_t n)
+{
+    size_t iter;
+    for (iter = 0; iter < n; ++iter) {
+        typ re = r[iter];
+        typ im = r[iter+n];
+        c[iter].r = re;
+        c[iter].i = im;
+        c[iter+n].r = re;
+        c[iter+n].i = -im;
+    }
+}
+
+/*
+ * Make the complex eigenvectors from the real array produced by sgeev/dgeev.
+ * c is the array where the results will be left.
+ * r is the source array of reals produced by sgeev/dgeev
+ * i is the eigenvalue imaginary part produced by sgeev/dgeev
+ * n is the order of the matrix (the matrix is n by n)
+ */
+template<typename complextyp, typename typ>
+static inline void
+mk_geev_complex_eigenvectors(complextyp *c,
+                                      const typ *r,
+                                      const typ *i,
+                                      size_t n)
+{
+    size_t iter = 0;
+    while (iter < n)
+    {
+        if (i[iter] ==  numeric_limits<typ>::zero) {
+            /* eigenvalue was real, eigenvectors as well...  */
+            mk_complex_array_from_real(c, r, n);
+            c += n;
+            r += n;
+            iter ++;
+        } else {
+            /* eigenvalue was complex, generate a pair of eigenvectors */
+            mk_complex_array_conjugate_pair(c, r, n);
+            c += 2*n;
+            r += 2*n;
+            iter += 2;
+        }
+    }
+}
+
+
+template<typename complextyp, typename typ>
+static inline void
+process_geev_results(GEEV_PARAMS_t<typ> *params, scalar_trait)
+{
+    /* REAL versions of geev need the results to be translated
+     * into complex versions. This is the way to deal with imaginary
+     * results. In our gufuncs we will always return complex arrays!
+     */
+    mk_complex_array((complextyp*)params->W, (typ*)params->WR, (typ*)params->WI, params->N);
+
+    /* handle the eigenvectors */
+    if ('V' == params->JOBVL) {
+        mk_geev_complex_eigenvectors((complextyp*)params->VL, (typ*)params->VLR,
+                                              (typ*)params->WI, params->N);
+    }
+    if ('V' == params->JOBVR) {
+        mk_geev_complex_eigenvectors((complextyp*)params->VR, (typ*)params->VRR,
+                                              (typ*)params->WI, params->N);
+    }
+}
+
+
+static inline fortran_int
+call_geev(GEEV_PARAMS_t<fortran_complex>* params)
+{
+    fortran_int rv;
+
+    LAPACK(cgeev)(&params->JOBVL, &params->JOBVR,
+                          &params->N, params->A, &params->LDA,
+                          params->W,
+                          params->VL, &params->LDVL,
+                          params->VR, &params->LDVR,
+                          params->WORK, &params->LWORK,
+                          params->WR, /* actually RWORK */
+                          &rv);
+    return rv;
+}
+static inline fortran_int
+call_geev(GEEV_PARAMS_t<fortran_doublecomplex>* params)
+{
+    fortran_int rv;
+
+    LAPACK(zgeev)(&params->JOBVL, &params->JOBVR,
+                          &params->N, params->A, &params->LDA,
+                          params->W,
+                          params->VL, &params->LDVL,
+                          params->VR, &params->LDVR,
+                          params->WORK, &params->LWORK,
+                          params->WR, /* actually RWORK */
+                          &rv);
+    return rv;
+}
+
+template<typename ftyp>
+static inline int
+init_geev(GEEV_PARAMS_t<ftyp>* params,
+                   char jobvl,
+                   char jobvr,
+                   fortran_int n, complex_trait)
+{
+    using realtyp = basetype_t<ftyp>;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *w, *vl, *vr, *work, *rwork;
+    size_t safe_n = n;
+    size_t a_size = safe_n * safe_n * sizeof(ftyp);
+    size_t w_size = safe_n * sizeof(ftyp);
+    size_t vl_size = jobvl=='V'? safe_n * safe_n * sizeof(ftyp) : 0;
+    size_t vr_size = jobvr=='V'? safe_n * safe_n * sizeof(ftyp) : 0;
+    size_t rwork_size = 2 * safe_n * sizeof(realtyp);
+    size_t work_count = 0;
+    size_t total_size = a_size + w_size + vl_size + vr_size + rwork_size;
+    fortran_int ld = fortran_int_max(n, 1);
+
+    mem_buff = (npy_uint8 *)malloc(total_size);
+    if (!mem_buff) {
+        goto error;
+    }
+
+    a = mem_buff;
+    w = a + a_size;
+    vl = w + w_size;
+    vr = vl + vl_size;
+    rwork = vr + vr_size;
+
+    params->A = (ftyp*)a;
+    params->WR = (realtyp*)rwork;
+    params->WI = NULL;
+    params->VLR = NULL;
+    params->VRR = NULL;
+    params->VL = (ftyp*)vl;
+    params->VR = (ftyp*)vr;
+    params->W = (ftyp*)w;
+    params->N = n;
+    params->LDA = ld;
+    params->LDVL = ld;
+    params->LDVR = ld;
+    params->JOBVL = jobvl;
+    params->JOBVR = jobvr;
+
+    /* Work size query */
+    {
+        ftyp work_size_query;
+
+        params->LWORK = -1;
+        params->WORK = &work_size_query;
+
+        if (call_geev(params) != 0) {
+            goto error;
+        }
+
+        work_count = (size_t) work_size_query.r;
+        /* Fix a bug in lapack 3.0.0 */
+        if(work_count == 0) work_count = 1;
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(work_count*sizeof(ftyp));
+    if (!mem_buff2) {
+        goto error;
+    }
+
+    work = mem_buff2;
+
+    params->LWORK = (fortran_int)work_count;
+    params->WORK = (ftyp*)work;
+
+    return 1;
+ error:
+    free(mem_buff2);
+    free(mem_buff);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+template<typename complextyp, typename typ>
+static inline void
+process_geev_results(GEEV_PARAMS_t<typ> *NPY_UNUSED(params), complex_trait)
+{
+    /* nothing to do here, complex versions are ready to copy out */
+}
+
+
+
+template<typename typ>
+static inline void
+release_geev(GEEV_PARAMS_t<typ> *params)
+{
+    free(params->WORK);
+    free(params->A);
+    memset(params, 0, sizeof(*params));
+}
+
+template<typename fctype, typename ftype>
+static inline void
+eig_wrapper(char JOBVL,
+                   char JOBVR,
+                   char**args,
+                   npy_intp const *dimensions,
+                   npy_intp const *steps)
+{
+    ptrdiff_t outer_steps[4];
+    size_t iter;
+    size_t outer_dim = *dimensions++;
+    size_t op_count = 2;
+    int error_occurred = get_fp_invalid_and_clear();
+    GEEV_PARAMS_t<ftype> geev_params;
+
+    assert(JOBVL == 'N');
+
+    STACK_TRACE;
+    op_count += 'V'==JOBVL?1:0;
+    op_count += 'V'==JOBVR?1:0;
+
+    for (iter = 0; iter < op_count; ++iter) {
+        outer_steps[iter] = (ptrdiff_t) steps[iter];
+    }
+    steps += op_count;
+
+    if (init_geev(&geev_params,
+                           JOBVL, JOBVR,
+                           (fortran_int)dimensions[0], dispatch_scalar<ftype>())) {
+        LINEARIZE_DATA_t a_in;
+        LINEARIZE_DATA_t w_out;
+        LINEARIZE_DATA_t vl_out;
+        LINEARIZE_DATA_t vr_out;
+
+        init_linearize_data(&a_in,
+                            geev_params.N, geev_params.N,
+                            steps[1], steps[0]);
+        steps += 2;
+        init_linearize_data(&w_out,
+                            1, geev_params.N,
+                            0, steps[0]);
+        steps += 1;
+        if ('V' == geev_params.JOBVL) {
+            init_linearize_data(&vl_out,
+                                geev_params.N, geev_params.N,
+                                steps[1], steps[0]);
+            steps += 2;
+        }
+        if ('V' == geev_params.JOBVR) {
+            init_linearize_data(&vr_out,
+                                geev_params.N, geev_params.N,
+                                steps[1], steps[0]);
+        }
+
+        for (iter = 0; iter < outer_dim; ++iter) {
+            int not_ok;
+            char **arg_iter = args;
+            /* copy the matrix in */
+            linearize_matrix((ftype*)geev_params.A, (ftype*)*arg_iter++, &a_in);
+            not_ok = call_geev(&geev_params);
+
+            if (!not_ok) {
+                process_geev_results<fctype>(&geev_params,
+                                             dispatch_scalar<ftype>{});
+                delinearize_matrix((fctype*)*arg_iter++,
+                                                 (fctype*)geev_params.W,
+                                                 &w_out);
+
+                if ('V' == geev_params.JOBVL) {
+                    delinearize_matrix((fctype*)*arg_iter++,
+                                                     (fctype*)geev_params.VL,
+                                                     &vl_out);
+                }
+                if ('V' == geev_params.JOBVR) {
+                    delinearize_matrix((fctype*)*arg_iter++,
+                                                     (fctype*)geev_params.VR,
+                                                     &vr_out);
+                }
+            } else {
+                /* geev failed */
+                error_occurred = 1;
+                nan_matrix((fctype*)*arg_iter++, &w_out);
+                if ('V' == geev_params.JOBVL) {
+                    nan_matrix((fctype*)*arg_iter++, &vl_out);
+                }
+                if ('V' == geev_params.JOBVR) {
+                    nan_matrix((fctype*)*arg_iter++, &vr_out);
+                }
+            }
+            update_pointers((npy_uint8**)args, outer_steps, op_count);
+        }
+
+        release_geev(&geev_params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+template<typename fctype, typename ftype>
+static void
+eig(char **args,
+           npy_intp const *dimensions,
+           npy_intp const *steps,
+           void *NPY_UNUSED(func))
+{
+    eig_wrapper<fctype, ftype>('N', 'V', args, dimensions, steps);
+}
+
+template<typename fctype, typename ftype>
+static void
+eigvals(char **args,
+               npy_intp const *dimensions,
+               npy_intp const *steps,
+               void *NPY_UNUSED(func))
+{
+    eig_wrapper<fctype, ftype>('N', 'N', args, dimensions, steps);
+}
+
+
+
+/* -------------------------------------------------------------------------- */
+                 /* singular value decomposition  */
+
+template<typename ftyp>
+struct GESDD_PARAMS_t
+{
+    ftyp *A;
+    basetype_t<ftyp> *S;
+    ftyp *U;
+    ftyp *VT;
+    ftyp *WORK;
+    basetype_t<ftyp> *RWORK;
+    fortran_int *IWORK;
+
+    fortran_int M;
+    fortran_int N;
+    fortran_int LDA;
+    fortran_int LDU;
+    fortran_int LDVT;
+    fortran_int LWORK;
+    char JOBZ;
+} ;
+
+
+template<typename ftyp>
+static inline void
+dump_gesdd_params(const char *name,
+                  GESDD_PARAMS_t<ftyp> *params)
+{
+    TRACE_TXT("\n%s:\n"\
+
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+
+              "%14s: %15c'%c'\n",
+
+              name,
+
+              "A", params->A,
+              "S", params->S,
+              "U", params->U,
+              "VT", params->VT,
+              "WORK", params->WORK,
+              "RWORK", params->RWORK,
+              "IWORK", params->IWORK,
+
+              "M", (int)params->M,
+              "N", (int)params->N,
+              "LDA", (int)params->LDA,
+              "LDU", (int)params->LDU,
+              "LDVT", (int)params->LDVT,
+              "LWORK", (int)params->LWORK,
+
+              "JOBZ", ' ', params->JOBZ);
+}
+
+static inline int
+compute_urows_vtcolumns(char jobz,
+                        fortran_int m, fortran_int n,
+                        fortran_int *urows, fortran_int *vtcolumns)
+{
+    fortran_int min_m_n = fortran_int_min(m, n);
+    switch(jobz)
+    {
+    case 'N':
+        *urows = 0;
+        *vtcolumns = 0;
+        break;
+    case 'A':
+        *urows = m;
+        *vtcolumns = n;
+        break;
+    case 'S':
+        {
+            *urows = min_m_n;
+            *vtcolumns = min_m_n;
+        }
+        break;
+    default:
+        return 0;
+    }
+
+    return 1;
+}
+
+static inline fortran_int
+call_gesdd(GESDD_PARAMS_t<fortran_real> *params)
+{
+    fortran_int rv;
+    LAPACK(sgesdd)(&params->JOBZ, &params->M, &params->N,
+                          params->A, &params->LDA,
+                          params->S,
+                          params->U, &params->LDU,
+                          params->VT, &params->LDVT,
+                          params->WORK, &params->LWORK,
+                          (fortran_int*)params->IWORK,
+                          &rv);
+    return rv;
+}
+static inline fortran_int
+call_gesdd(GESDD_PARAMS_t<fortran_doublereal> *params)
+{
+    fortran_int rv;
+    LAPACK(dgesdd)(&params->JOBZ, &params->M, &params->N,
+                          params->A, &params->LDA,
+                          params->S,
+                          params->U, &params->LDU,
+                          params->VT, &params->LDVT,
+                          params->WORK, &params->LWORK,
+                          (fortran_int*)params->IWORK,
+                          &rv);
+    return rv;
+}
+
+template<typename ftyp>
+static inline int
+init_gesdd(GESDD_PARAMS_t<ftyp> *params,
+                   char jobz,
+                   fortran_int m,
+                   fortran_int n, scalar_trait)
+{
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *s, *u, *vt, *work, *iwork;
+    size_t safe_m = m;
+    size_t safe_n = n;
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    fortran_int min_m_n = fortran_int_min(m, n);
+    size_t safe_min_m_n = min_m_n;
+    size_t s_size = safe_min_m_n * sizeof(ftyp);
+    fortran_int u_row_count, vt_column_count;
+    size_t safe_u_row_count, safe_vt_column_count;
+    size_t u_size, vt_size;
+    fortran_int work_count;
+    size_t work_size;
+    size_t iwork_size = 8 * safe_min_m_n * sizeof(fortran_int);
+    fortran_int ld = fortran_int_max(m, 1);
+
+    if (!compute_urows_vtcolumns(jobz, m, n, &u_row_count, &vt_column_count)) {
+        goto error;
+    }
+
+    safe_u_row_count = u_row_count;
+    safe_vt_column_count = vt_column_count;
+
+    u_size = safe_u_row_count * safe_m * sizeof(ftyp);
+    vt_size = safe_n * safe_vt_column_count * sizeof(ftyp);
+
+    mem_buff = (npy_uint8 *)malloc(a_size + s_size + u_size + vt_size + iwork_size);
+
+    if (!mem_buff) {
+        goto error;
+    }
+
+    a = mem_buff;
+    s = a + a_size;
+    u = s + s_size;
+    vt = u + u_size;
+    iwork = vt + vt_size;
+
+    /* fix vt_column_count so that it is a valid lapack parameter (0 is not) */
+    vt_column_count = fortran_int_max(1, vt_column_count);
+
+    params->M = m;
+    params->N = n;
+    params->A = (ftyp*)a;
+    params->S = (ftyp*)s;
+    params->U = (ftyp*)u;
+    params->VT = (ftyp*)vt;
+    params->RWORK = NULL;
+    params->IWORK = (fortran_int*)iwork;
+    params->LDA = ld;
+    params->LDU = ld;
+    params->LDVT = vt_column_count;
+    params->JOBZ = jobz;
+
+    /* Work size query */
+    {
+        ftyp work_size_query;
+
+        params->LWORK = -1;
+        params->WORK = &work_size_query;
+
+        if (call_gesdd(params) != 0) {
+            goto error;
+        }
+
+        work_count = (fortran_int)work_size_query;
+        /* Fix a bug in lapack 3.0.0 */
+        if(work_count == 0) work_count = 1;
+        work_size = (size_t)work_count * sizeof(ftyp);
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size);
+    if (!mem_buff2) {
+        goto error;
+    }
+
+    work = mem_buff2;
+
+    params->LWORK = work_count;
+    params->WORK = (ftyp*)work;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+static inline fortran_int
+call_gesdd(GESDD_PARAMS_t<fortran_complex> *params)
+{
+    fortran_int rv;
+    LAPACK(cgesdd)(&params->JOBZ, &params->M, &params->N,
+                          params->A, &params->LDA,
+                          params->S,
+                          params->U, &params->LDU,
+                          params->VT, &params->LDVT,
+                          params->WORK, &params->LWORK,
+                          params->RWORK,
+                          params->IWORK,
+                          &rv);
+    return rv;
+}
+static inline fortran_int
+call_gesdd(GESDD_PARAMS_t<fortran_doublecomplex> *params)
+{
+    fortran_int rv;
+    LAPACK(zgesdd)(&params->JOBZ, &params->M, &params->N,
+                          params->A, &params->LDA,
+                          params->S,
+                          params->U, &params->LDU,
+                          params->VT, &params->LDVT,
+                          params->WORK, &params->LWORK,
+                          params->RWORK,
+                          params->IWORK,
+                          &rv);
+    return rv;
+}
+
+template<typename ftyp>
+static inline int
+init_gesdd(GESDD_PARAMS_t<ftyp> *params,
+                   char jobz,
+                   fortran_int m,
+                   fortran_int n, complex_trait)
+{
+    using frealtyp = basetype_t<ftyp>;
+    npy_uint8 *mem_buff = NULL, *mem_buff2 = NULL;
+    npy_uint8 *a, *s, *u, *vt, *work, *rwork, *iwork;
+    size_t a_size, s_size, u_size, vt_size, work_size, rwork_size, iwork_size;
+    size_t safe_u_row_count, safe_vt_column_count;
+    fortran_int u_row_count, vt_column_count, work_count;
+    size_t safe_m = m;
+    size_t safe_n = n;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    size_t safe_min_m_n = min_m_n;
+    fortran_int ld = fortran_int_max(m, 1);
+
+    if (!compute_urows_vtcolumns(jobz, m, n, &u_row_count, &vt_column_count)) {
+        goto error;
+    }
+
+    safe_u_row_count = u_row_count;
+    safe_vt_column_count = vt_column_count;
+
+    a_size = safe_m * safe_n * sizeof(ftyp);
+    s_size = safe_min_m_n * sizeof(frealtyp);
+    u_size = safe_u_row_count * safe_m * sizeof(ftyp);
+    vt_size = safe_n * safe_vt_column_count * sizeof(ftyp);
+    rwork_size = 'N' == jobz ?
+        (7 * safe_min_m_n) :
+        (5 * safe_min_m_n * safe_min_m_n + 5 * safe_min_m_n);
+    rwork_size *= sizeof(ftyp);
+    iwork_size = 8 * safe_min_m_n * sizeof(fortran_int);
+
+    mem_buff = (npy_uint8 *)malloc(a_size +
+                      s_size +
+                      u_size +
+                      vt_size +
+                      rwork_size +
+                      iwork_size);
+    if (!mem_buff) {
+        goto error;
+    }
+
+    a = mem_buff;
+    s = a + a_size;
+    u = s + s_size;
+    vt = u + u_size;
+    rwork = vt + vt_size;
+    iwork = rwork + rwork_size;
+
+    /* fix vt_column_count so that it is a valid lapack parameter (0 is not) */
+    vt_column_count = fortran_int_max(1, vt_column_count);
+
+    params->A = (ftyp*)a;
+    params->S = (frealtyp*)s;
+    params->U = (ftyp*)u;
+    params->VT = (ftyp*)vt;
+    params->RWORK = (frealtyp*)rwork;
+    params->IWORK = (fortran_int*)iwork;
+    params->M = m;
+    params->N = n;
+    params->LDA = ld;
+    params->LDU = ld;
+    params->LDVT = vt_column_count;
+    params->JOBZ = jobz;
+
+    /* Work size query */
+    {
+        ftyp work_size_query;
+
+        params->LWORK = -1;
+        params->WORK = &work_size_query;
+
+        if (call_gesdd(params) != 0) {
+            goto error;
+        }
+
+        work_count = (fortran_int)(*(frealtyp*)&work_size_query);
+        /* Fix a bug in lapack 3.0.0 */
+        if(work_count == 0) work_count = 1;
+        work_size = (size_t)work_count * sizeof(ftyp);
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size);
+    if (!mem_buff2) {
+        goto error;
+    }
+
+    work = mem_buff2;
+
+    params->LWORK = work_count;
+    params->WORK = (ftyp*)work;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff2);
+    free(mem_buff);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+template<typename typ>
+static inline void
+release_gesdd(GESDD_PARAMS_t<typ>* params)
+{
+    /* A and WORK contain allocated blocks */
+    free(params->A);
+    free(params->WORK);
+    memset(params, 0, sizeof(*params));
+}
+
+template<typename typ>
+static inline void
+svd_wrapper(char JOBZ,
+                   char **args,
+                   npy_intp const *dimensions,
+                   npy_intp const *steps)
+{
+    using basetyp = basetype_t<typ>;
+    ptrdiff_t outer_steps[4];
+    int error_occurred = get_fp_invalid_and_clear();
+    size_t iter;
+    size_t outer_dim = *dimensions++;
+    size_t op_count = (JOBZ=='N')?2:4;
+    GESDD_PARAMS_t<typ> params;
+
+    for (iter = 0; iter < op_count; ++iter) {
+        outer_steps[iter] = (ptrdiff_t) steps[iter];
+    }
+    steps += op_count;
+
+    if (init_gesdd(&params,
+                   JOBZ,
+                   (fortran_int)dimensions[0],
+                   (fortran_int)dimensions[1],
+                   dispatch_scalar<typ>())) {
+        LINEARIZE_DATA_t a_in, u_out, s_out, v_out;
+        fortran_int min_m_n = params.M < params.N ? params.M : params.N;
+
+        init_linearize_data(&a_in, params.N, params.M, steps[1], steps[0]);
+        if ('N' == params.JOBZ) {
+            /* only the singular values are wanted */
+            init_linearize_data(&s_out, 1, min_m_n, 0, steps[2]);
+        } else {
+            fortran_int u_columns, v_rows;
+            if ('S' == params.JOBZ) {
+                u_columns = min_m_n;
+                v_rows = min_m_n;
+            } else { /* JOBZ == 'A' */
+                u_columns = params.M;
+                v_rows = params.N;
+            }
+            init_linearize_data(&u_out,
+                                u_columns, params.M,
+                                steps[3], steps[2]);
+            init_linearize_data(&s_out,
+                                1, min_m_n,
+                                0, steps[4]);
+            init_linearize_data(&v_out,
+                                params.N, v_rows,
+                                steps[6], steps[5]);
+        }
+
+        for (iter = 0; iter < outer_dim; ++iter) {
+            int not_ok;
+            /* copy the matrix in */
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            not_ok = call_gesdd(&params);
+            if (!not_ok) {
+                if ('N' == params.JOBZ) {
+                    delinearize_matrix((basetyp*)args[1], (basetyp*)params.S, &s_out);
+                } else {
+                    if ('A' == params.JOBZ && min_m_n == 0) {
+                        /* Lapack has betrayed us and left these uninitialized,
+                         * so produce an identity matrix for whichever of u
+                         * and v is not empty.
+                         */
+                        identity_matrix((typ*)params.U, params.M);
+                        identity_matrix((typ*)params.VT, params.N);
+                    }
+
+                    delinearize_matrix((typ*)args[1], (typ*)params.U, &u_out);
+                    delinearize_matrix((basetyp*)args[2], (basetyp*)params.S, &s_out);
+                    delinearize_matrix((typ*)args[3], (typ*)params.VT, &v_out);
+                }
+            } else {
+                error_occurred = 1;
+                if ('N' == params.JOBZ) {
+                    nan_matrix((basetyp*)args[1], &s_out);
+                } else {
+                    nan_matrix((typ*)args[1], &u_out);
+                    nan_matrix((basetyp*)args[2], &s_out);
+                    nan_matrix((typ*)args[3], &v_out);
+                }
+            }
+            update_pointers((npy_uint8**)args, outer_steps, op_count);
+        }
+
+        release_gesdd(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+
+template<typename typ>
+static void
+svd_N(char **args,
+             npy_intp const *dimensions,
+             npy_intp const *steps,
+             void *NPY_UNUSED(func))
+{
+    svd_wrapper<fortran_type_t<typ>>('N', args, dimensions, steps);
+}
+
+template<typename typ>
+static void
+svd_S(char **args,
+             npy_intp const *dimensions,
+             npy_intp const *steps,
+             void *NPY_UNUSED(func))
+{
+    svd_wrapper<fortran_type_t<typ>>('S', args, dimensions, steps);
+}
+
+template<typename typ>
+static void
+svd_A(char **args,
+             npy_intp const *dimensions,
+             npy_intp const *steps,
+             void *NPY_UNUSED(func))
+{
+    svd_wrapper<fortran_type_t<typ>>('A', args, dimensions, steps);
+}
+
+/* -------------------------------------------------------------------------- */
+                 /* qr (modes - r, raw) */
+
+template<typename typ>
+struct GEQRF_PARAMS_t
+{
+    fortran_int M;
+    fortran_int N;
+    typ *A;
+    fortran_int LDA;
+    typ* TAU;
+    typ *WORK;
+    fortran_int LWORK;
+};
+
+
+template<typename typ>
+static inline void
+dump_geqrf_params(const char *name,
+                  GEQRF_PARAMS_t<typ> *params)
+{
+    TRACE_TXT("\n%s:\n"\
+
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n",
+
+              name,
+
+              "A", params->A,
+              "TAU", params->TAU,
+              "WORK", params->WORK,
+
+              "M", (int)params->M,
+              "N", (int)params->N,
+              "LDA", (int)params->LDA,
+              "LWORK", (int)params->LWORK);
+}
+
+static inline fortran_int
+call_geqrf(GEQRF_PARAMS_t<double> *params)
+{
+    fortran_int rv;
+    LAPACK(dgeqrf)(&params->M, &params->N,
+                          params->A, &params->LDA,
+                          params->TAU,
+                          params->WORK, &params->LWORK,
+                          &rv);
+    return rv;
+}
+static inline fortran_int
+call_geqrf(GEQRF_PARAMS_t<f2c_doublecomplex> *params)
+{
+    fortran_int rv;
+    LAPACK(zgeqrf)(&params->M, &params->N,
+                          params->A, &params->LDA,
+                          params->TAU,
+                          params->WORK, &params->LWORK,
+                          &rv);
+    return rv;
+}
+
+
+static inline int
+init_geqrf(GEQRF_PARAMS_t<fortran_doublereal> *params,
+                   fortran_int m,
+                   fortran_int n)
+{
+    using ftyp = fortran_doublereal;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *tau, *work;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    size_t safe_min_m_n = min_m_n;
+    size_t safe_m = m;
+    size_t safe_n = n;
+
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    size_t tau_size = safe_min_m_n * sizeof(ftyp);
+
+    fortran_int work_count;
+    size_t work_size;
+    fortran_int lda = fortran_int_max(1, m);
+
+    mem_buff = (npy_uint8 *)malloc(a_size + tau_size);
+
+    if (!mem_buff)
+        goto error;
+
+    a = mem_buff;
+    tau = a + a_size;
+    memset(tau, 0, tau_size);
+
+
+    params->M = m;
+    params->N = n;
+    params->A = (ftyp*)a;
+    params->TAU = (ftyp*)tau;
+    params->LDA = lda;
+
+    {
+        /* compute optimal work size */
+
+        ftyp work_size_query;
+
+        params->WORK = &work_size_query;
+        params->LWORK = -1;
+
+        if (call_geqrf(params) != 0)
+            goto error;
+
+        work_count = (fortran_int) *(ftyp*) params->WORK;
+
+    }
+
+    params->LWORK = fortran_int_max(fortran_int_max(1, n), work_count);
+
+    work_size = (size_t) params->LWORK * sizeof(ftyp);
+    mem_buff2 = (npy_uint8 *)malloc(work_size);
+    if (!mem_buff2)
+        goto error;
+
+    work = mem_buff2;
+
+    params->WORK = (ftyp*)work;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+
+static inline int
+init_geqrf(GEQRF_PARAMS_t<fortran_doublecomplex> *params,
+                   fortran_int m,
+                   fortran_int n)
+{
+    using ftyp = fortran_doublecomplex;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *tau, *work;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    size_t safe_min_m_n = min_m_n;
+    size_t safe_m = m;
+    size_t safe_n = n;
+
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    size_t tau_size = safe_min_m_n * sizeof(ftyp);
+
+    fortran_int work_count;
+    size_t work_size;
+    fortran_int lda = fortran_int_max(1, m);
+
+    mem_buff = (npy_uint8 *)malloc(a_size + tau_size);
+
+    if (!mem_buff)
+        goto error;
+
+    a = mem_buff;
+    tau = a + a_size;
+    memset(tau, 0, tau_size);
+
+
+    params->M = m;
+    params->N = n;
+    params->A = (ftyp*)a;
+    params->TAU = (ftyp*)tau;
+    params->LDA = lda;
+
+    {
+        /* compute optimal work size */
+
+        ftyp work_size_query;
+
+        params->WORK = &work_size_query;
+        params->LWORK = -1;
+
+        if (call_geqrf(params) != 0)
+            goto error;
+
+        work_count = (fortran_int) ((ftyp*)params->WORK)->r;
+
+    }
+
+    params->LWORK = fortran_int_max(fortran_int_max(1, n),
+                                    work_count);
+
+    work_size = (size_t) params->LWORK * sizeof(ftyp);
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size);
+    if (!mem_buff2)
+        goto error;
+
+    work = mem_buff2;
+
+    params->WORK = (ftyp*)work;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+
+template<typename ftyp>
+static inline void
+release_geqrf(GEQRF_PARAMS_t<ftyp>* params)
+{
+    /* A and WORK contain allocated blocks */
+    free(params->A);
+    free(params->WORK);
+    memset(params, 0, sizeof(*params));
+}
+
+template<typename typ>
+static void
+qr_r_raw(char **args, npy_intp const *dimensions, npy_intp const *steps,
+          void *NPY_UNUSED(func))
+{
+    using ftyp = fortran_type_t<typ>;
+
+    GEQRF_PARAMS_t<ftyp> params;
+    int error_occurred = get_fp_invalid_and_clear();
+    fortran_int n, m;
+
+    INIT_OUTER_LOOP_2
+
+    m = (fortran_int)dimensions[0];
+    n = (fortran_int)dimensions[1];
+
+    if (init_geqrf(&params, m, n)) {
+        LINEARIZE_DATA_t a_in, tau_out;
+
+        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
+        init_linearize_data(&tau_out, 1, fortran_int_min(m, n), 1, steps[2]);
+
+        BEGIN_OUTER_LOOP_2
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            not_ok = call_geqrf(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[0], (typ*)params.A, &a_in);
+                delinearize_matrix((typ*)args[1], (typ*)params.TAU, &tau_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[1], &tau_out);
+            }
+        END_OUTER_LOOP
+
+        release_geqrf(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+
+/* -------------------------------------------------------------------------- */
+                 /* qr common code (modes - reduced and complete) */
+
+template<typename typ>
+struct GQR_PARAMS_t
+{
+    fortran_int M;
+    fortran_int MC;
+    fortran_int MN;
+    void* A;
+    typ *Q;
+    fortran_int LDA;
+    typ* TAU;
+    typ *WORK;
+    fortran_int LWORK;
+} ;
+
+static inline fortran_int
+call_gqr(GQR_PARAMS_t<double> *params)
+{
+    fortran_int rv;
+    LAPACK(dorgqr)(&params->M, &params->MC, &params->MN,
+                          params->Q, &params->LDA,
+                          params->TAU,
+                          params->WORK, &params->LWORK,
+                          &rv);
+    return rv;
+}
+static inline fortran_int
+call_gqr(GQR_PARAMS_t<f2c_doublecomplex> *params)
+{
+    fortran_int rv;
+    LAPACK(zungqr)(&params->M, &params->MC, &params->MN,
+                          params->Q, &params->LDA,
+                          params->TAU,
+                          params->WORK, &params->LWORK,
+                          &rv);
+    return rv;
+}
+
+static inline int
+init_gqr_common(GQR_PARAMS_t<fortran_doublereal> *params,
+                          fortran_int m,
+                          fortran_int n,
+                          fortran_int mc)
+{
+    using ftyp = fortran_doublereal;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *q, *tau, *work;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    size_t safe_mc = mc;
+    size_t safe_min_m_n = min_m_n;
+    size_t safe_m = m;
+    size_t safe_n = n;
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    size_t q_size = safe_m * safe_mc * sizeof(ftyp);
+    size_t tau_size = safe_min_m_n * sizeof(ftyp);
+
+    fortran_int work_count;
+    size_t work_size;
+    fortran_int lda = fortran_int_max(1, m);
+
+    mem_buff = (npy_uint8 *)malloc(q_size + tau_size + a_size);
+
+    if (!mem_buff)
+        goto error;
+
+    q = mem_buff;
+    tau = q + q_size;
+    a = tau + tau_size;
+
+
+    params->M = m;
+    params->MC = mc;
+    params->MN = min_m_n;
+    params->A = a;
+    params->Q = (ftyp*)q;
+    params->TAU = (ftyp*)tau;
+    params->LDA = lda;
+
+    {
+        /* compute optimal work size */
+        ftyp work_size_query;
+
+        params->WORK = &work_size_query;
+        params->LWORK = -1;
+
+        if (call_gqr(params) != 0)
+            goto error;
+
+        work_count = (fortran_int) *(ftyp*) params->WORK;
+
+    }
+
+    params->LWORK = fortran_int_max(fortran_int_max(1, n), work_count);
+
+    work_size = (size_t) params->LWORK * sizeof(ftyp);
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size);
+    if (!mem_buff2)
+        goto error;
+
+    work = mem_buff2;
+
+    params->WORK = (ftyp*)work;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+
+static inline int
+init_gqr_common(GQR_PARAMS_t<fortran_doublecomplex> *params,
+                          fortran_int m,
+                          fortran_int n,
+                          fortran_int mc)
+{
+    using ftyp = fortran_doublecomplex;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *q, *tau, *work;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    size_t safe_mc = mc;
+    size_t safe_min_m_n = min_m_n;
+    size_t safe_m = m;
+    size_t safe_n = n;
+
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    size_t q_size = safe_m * safe_mc * sizeof(ftyp);
+    size_t tau_size = safe_min_m_n * sizeof(ftyp);
+
+    fortran_int work_count;
+    size_t work_size;
+    fortran_int lda = fortran_int_max(1, m);
+
+    mem_buff = (npy_uint8 *)malloc(q_size + tau_size + a_size);
+
+    if (!mem_buff)
+        goto error;
+
+    q = mem_buff;
+    tau = q + q_size;
+    a = tau + tau_size;
+
+
+    params->M = m;
+    params->MC = mc;
+    params->MN = min_m_n;
+    params->A = a;
+    params->Q = (ftyp*)q;
+    params->TAU = (ftyp*)tau;
+    params->LDA = lda;
+
+    {
+        /* compute optimal work size */
+        ftyp work_size_query;
+
+        params->WORK = &work_size_query;
+        params->LWORK = -1;
+
+        if (call_gqr(params) != 0)
+            goto error;
+
+        work_count = (fortran_int) ((ftyp*)params->WORK)->r;
+
+    }
+
+    params->LWORK = fortran_int_max(fortran_int_max(1, n),
+                                    work_count);
+
+    work_size = (size_t) params->LWORK * sizeof(ftyp);
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size);
+    if (!mem_buff2)
+        goto error;
+
+    work = mem_buff2;
+
+    params->WORK = (ftyp*)work;
+    params->LWORK = work_count;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+/* -------------------------------------------------------------------------- */
+                 /* qr (modes - reduced) */
+
+
+template<typename typ>
+static inline void
+dump_gqr_params(const char *name,
+                GQR_PARAMS_t<typ> *params)
+{
+    TRACE_TXT("\n%s:\n"\
+
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n",
+
+              name,
+
+              "Q", params->Q,
+              "TAU", params->TAU,
+              "WORK", params->WORK,
+
+              "M", (int)params->M,
+              "MC", (int)params->MC,
+              "MN", (int)params->MN,
+              "LDA", (int)params->LDA,
+              "LWORK", (int)params->LWORK);
+}
+
+template<typename ftyp>
+static inline int
+init_gqr(GQR_PARAMS_t<ftyp> *params,
+                   fortran_int m,
+                   fortran_int n)
+{
+    return init_gqr_common(
+        params, m, n,
+        fortran_int_min(m, n));
+}
+
+
+template<typename typ>
+static inline void
+release_gqr(GQR_PARAMS_t<typ>* params)
+{
+    /* Q and WORK contain allocated blocks (TAU and A share Q's block) */
+    free(params->Q);
+    free(params->WORK);
+    memset(params, 0, sizeof(*params));
+}
+
+template<typename typ>
+static void
+qr_reduced(char **args, npy_intp const *dimensions, npy_intp const *steps,
+                  void *NPY_UNUSED(func))
+{
+    using ftyp = fortran_type_t<typ>;
+    GQR_PARAMS_t<ftyp> params;
+    int error_occurred = get_fp_invalid_and_clear();
+    fortran_int n, m;
+
+    INIT_OUTER_LOOP_3
+
+    m = (fortran_int)dimensions[0];
+    n = (fortran_int)dimensions[1];
+
+    if (init_gqr(&params, m, n)) {
+        LINEARIZE_DATA_t a_in, tau_in, q_out;
+
+        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
+        init_linearize_data(&tau_in, 1, fortran_int_min(m, n), 1, steps[2]);
+        init_linearize_data(&q_out, fortran_int_min(m, n), m, steps[4], steps[3]);
+
+        BEGIN_OUTER_LOOP_3
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.Q, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.TAU, (typ*)args[1], &tau_in);
+            not_ok = call_gqr(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[2], (typ*)params.Q, &q_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[2], &q_out);
+            }
+        END_OUTER_LOOP
+
+        release_gqr(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+/* -------------------------------------------------------------------------- */
+                 /* qr (modes - complete) */
+
+template<typename ftyp>
+static inline int
+init_gqr_complete(GQR_PARAMS_t<ftyp> *params,
+                            fortran_int m,
+                            fortran_int n)
+{
+    return init_gqr_common(params, m, n, m);
+}
+
+
+template<typename typ>
+static void
+qr_complete(char **args, npy_intp const *dimensions, npy_intp const *steps,
+                  void *NPY_UNUSED(func))
+{
+    using ftyp = fortran_type_t<typ>;
+    GQR_PARAMS_t<ftyp> params;
+    int error_occurred = get_fp_invalid_and_clear();
+    fortran_int n, m;
+
+    INIT_OUTER_LOOP_3
+
+    m = (fortran_int)dimensions[0];
+    n = (fortran_int)dimensions[1];
+
+
+    if (init_gqr_complete(&params, m, n)) {
+        LINEARIZE_DATA_t a_in, tau_in, q_out;
+
+        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
+        init_linearize_data(&tau_in, 1, fortran_int_min(m, n), 1, steps[2]);
+        init_linearize_data(&q_out, m, m, steps[4], steps[3]);
+
+        BEGIN_OUTER_LOOP_3
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.Q, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.TAU, (typ*)args[1], &tau_in);
+            not_ok = call_gqr(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[2], (typ*)params.Q, &q_out);
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[2], &q_out);
+            }
+        END_OUTER_LOOP
+
+        release_gqr(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+/* -------------------------------------------------------------------------- */
+                 /* least squares */
+
+template<typename typ>
+struct GELSD_PARAMS_t
+{
+    fortran_int M;
+    fortran_int N;
+    fortran_int NRHS;
+    typ *A;
+    fortran_int LDA;
+    typ *B;
+    fortran_int LDB;
+    basetype_t<typ> *S;
+    basetype_t<typ> *RCOND;
+    fortran_int RANK;
+    typ *WORK;
+    fortran_int LWORK;
+    basetype_t<typ> *RWORK;
+    fortran_int *IWORK;
+};
+
+template<typename typ>
+static inline void
+dump_gelsd_params(const char *name,
+                  GELSD_PARAMS_t<typ> *params)
+{
+    TRACE_TXT("\n%s:\n"\
+
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+              "%14s: %18p\n"\
+
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+              "%14s: %18d\n"\
+
+              "%14s: %18p\n",
+
+              name,
+
+              "A", params->A,
+              "B", params->B,
+              "S", params->S,
+              "WORK", params->WORK,
+              "RWORK", params->RWORK,
+              "IWORK", params->IWORK,
+
+              "M", (int)params->M,
+              "N", (int)params->N,
+              "NRHS", (int)params->NRHS,
+              "LDA", (int)params->LDA,
+              "LDB", (int)params->LDB,
+              "LWORK", (int)params->LWORK,
+              "RANK", (int)params->RANK,
+
+              "RCOND", params->RCOND);
+}
+
+static inline fortran_int
+call_gelsd(GELSD_PARAMS_t<fortran_real> *params)
+{
+    fortran_int rv;
+    LAPACK(sgelsd)(&params->M, &params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->B, &params->LDB,
+                          params->S,
+                          params->RCOND, &params->RANK,
+                          params->WORK, &params->LWORK,
+                          params->IWORK,
+                          &rv);
+    return rv;
+}
+
+
+static inline fortran_int
+call_gelsd(GELSD_PARAMS_t<fortran_doublereal> *params)
+{
+    fortran_int rv;
+    LAPACK(dgelsd)(&params->M, &params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->B, &params->LDB,
+                          params->S,
+                          params->RCOND, &params->RANK,
+                          params->WORK, &params->LWORK,
+                          params->IWORK,
+                          &rv);
+    return rv;
+}
+
+
+template<typename ftyp>
+static inline int
+init_gelsd(GELSD_PARAMS_t<ftyp> *params,
+                   fortran_int m,
+                   fortran_int n,
+                   fortran_int nrhs,
+scalar_trait)
+{
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *b, *s, *work, *iwork;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    fortran_int max_m_n = fortran_int_max(m, n);
+    size_t safe_min_m_n = min_m_n;
+    size_t safe_max_m_n = max_m_n;
+    size_t safe_m = m;
+    size_t safe_n = n;
+    size_t safe_nrhs = nrhs;
+
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    size_t b_size = safe_max_m_n * safe_nrhs * sizeof(ftyp);
+    size_t s_size = safe_min_m_n * sizeof(ftyp);
+
+    fortran_int work_count;
+    size_t work_size;
+    size_t iwork_size;
+    fortran_int lda = fortran_int_max(1, m);
+    fortran_int ldb = fortran_int_max(1, fortran_int_max(m,n));
+
+    mem_buff = (npy_uint8 *)malloc(a_size + b_size + s_size);
+
+    if (!mem_buff)
+        goto error;
+
+    a = mem_buff;
+    b = a + a_size;
+    s = b + b_size;
+
+
+    params->M = m;
+    params->N = n;
+    params->NRHS = nrhs;
+    params->A = (ftyp*)a;
+    params->B = (ftyp*)b;
+    params->S = (ftyp*)s;
+    params->LDA = lda;
+    params->LDB = ldb;
+
+    {
+        /* compute optimal work size */
+        ftyp work_size_query;
+        fortran_int iwork_size_query;
+
+        params->WORK = &work_size_query;
+        params->IWORK = &iwork_size_query;
+        params->RWORK = NULL;
+        params->LWORK = -1;
+
+        if (call_gelsd(params) != 0)
+            goto error;
+
+        work_count = (fortran_int)work_size_query;
+
+        work_size  = (size_t) work_size_query * sizeof(ftyp);
+        iwork_size = (size_t)iwork_size_query * sizeof(fortran_int);
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size + iwork_size);
+    if (!mem_buff2)
+        goto error;
+
+    work = mem_buff2;
+    iwork = work + work_size;
+
+    params->WORK = (ftyp*)work;
+    params->RWORK = NULL;
+    params->IWORK = (fortran_int*)iwork;
+    params->LWORK = work_count;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+static inline fortran_int
+call_gelsd(GELSD_PARAMS_t<fortran_complex> *params)
+{
+    fortran_int rv;
+    LAPACK(cgelsd)(&params->M, &params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->B, &params->LDB,
+                          params->S,
+                          params->RCOND, &params->RANK,
+                          params->WORK, &params->LWORK,
+                          params->RWORK, (fortran_int*)params->IWORK,
+                          &rv);
+    return rv;
+}
+
+static inline fortran_int
+call_gelsd(GELSD_PARAMS_t<fortran_doublecomplex> *params)
+{
+    fortran_int rv;
+    LAPACK(zgelsd)(&params->M, &params->N, &params->NRHS,
+                          params->A, &params->LDA,
+                          params->B, &params->LDB,
+                          params->S,
+                          params->RCOND, &params->RANK,
+                          params->WORK, &params->LWORK,
+                          params->RWORK, (fortran_int*)params->IWORK,
+                          &rv);
+    return rv;
+}
+
+
+template<typename ftyp>
+static inline int
+init_gelsd(GELSD_PARAMS_t<ftyp> *params,
+                   fortran_int m,
+                   fortran_int n,
+                   fortran_int nrhs,
+complex_trait)
+{
+using frealtyp = basetype_t<ftyp>;
+    npy_uint8 *mem_buff = NULL;
+    npy_uint8 *mem_buff2 = NULL;
+    npy_uint8 *a, *b, *s, *work, *iwork, *rwork;
+    fortran_int min_m_n = fortran_int_min(m, n);
+    fortran_int max_m_n = fortran_int_max(m, n);
+    size_t safe_min_m_n = min_m_n;
+    size_t safe_max_m_n = max_m_n;
+    size_t safe_m = m;
+    size_t safe_n = n;
+    size_t safe_nrhs = nrhs;
+
+    size_t a_size = safe_m * safe_n * sizeof(ftyp);
+    size_t b_size = safe_max_m_n * safe_nrhs * sizeof(ftyp);
+    size_t s_size = safe_min_m_n * sizeof(frealtyp);
+
+    fortran_int work_count;
+    size_t work_size, rwork_size, iwork_size;
+    fortran_int lda = fortran_int_max(1, m);
+    fortran_int ldb = fortran_int_max(1, fortran_int_max(m,n));
+
+    mem_buff = (npy_uint8 *)malloc(a_size + b_size + s_size);
+
+    if (!mem_buff)
+        goto error;
+
+    a = mem_buff;
+    b = a + a_size;
+    s = b + b_size;
+
+
+    params->M = m;
+    params->N = n;
+    params->NRHS = nrhs;
+    params->A = (ftyp*)a;
+    params->B = (ftyp*)b;
+    params->S = (frealtyp*)s;
+    params->LDA = lda;
+    params->LDB = ldb;
+
+    {
+        /* compute optimal work size */
+        ftyp work_size_query;
+        frealtyp rwork_size_query;
+        fortran_int iwork_size_query;
+
+        params->WORK = &work_size_query;
+        params->IWORK = &iwork_size_query;
+        params->RWORK = &rwork_size_query;
+        params->LWORK = -1;
+
+        if (call_gelsd(params) != 0)
+            goto error;
+
+        work_count = (fortran_int)work_size_query.r;
+
+        work_size  = (size_t )work_size_query.r * sizeof(ftyp);
+        rwork_size = (size_t)rwork_size_query * sizeof(frealtyp);
+        iwork_size = (size_t)iwork_size_query * sizeof(fortran_int);
+    }
+
+    mem_buff2 = (npy_uint8 *)malloc(work_size + rwork_size + iwork_size);
+    if (!mem_buff2)
+        goto error;
+
+    work = mem_buff2;
+    rwork = work + work_size;
+    iwork = rwork + rwork_size;
+
+    params->WORK = (ftyp*)work;
+    params->RWORK = (frealtyp*)rwork;
+    params->IWORK = (fortran_int*)iwork;
+    params->LWORK = work_count;
+
+    return 1;
+ error:
+    TRACE_TXT("%s failed init\n", __FUNCTION__);
+    free(mem_buff);
+    free(mem_buff2);
+    memset(params, 0, sizeof(*params));
+
+    return 0;
+}
+
+template<typename ftyp>
+static inline void
+release_gelsd(GELSD_PARAMS_t<ftyp>* params)
+{
+    /* A and WORK contain allocated blocks */
+    free(params->A);
+    free(params->WORK);
+    memset(params, 0, sizeof(*params));
+}
+
+/** Compute the squared l2 norm of a contiguous vector */
+template<typename typ>
+static basetype_t<typ>
+abs2(typ *p, npy_intp n, scalar_trait) {
+    npy_intp i;
+    basetype_t<typ> res = 0;
+    for (i = 0; i < n; i++) {
+        typ el = p[i];
+        res += el*el;
+    }
+    return res;
+}
+template<typename typ>
+static basetype_t<typ>
+abs2(typ *p, npy_intp n, complex_trait) {
+    npy_intp i;
+    basetype_t<typ> res = 0;
+    for (i = 0; i < n; i++) {
+        typ el = p[i];
+        res += el.real*el.real + el.imag*el.imag;
+    }
+    return res;
+}
+
+
+template<typename typ>
+static void
+lstsq(char **args, npy_intp const *dimensions, npy_intp const *steps,
+             void *NPY_UNUSED(func))
+{
+using ftyp = fortran_type_t<typ>;
+using basetyp = basetype_t<typ>;
+    GELSD_PARAMS_t<ftyp> params;
+    int error_occurred = get_fp_invalid_and_clear();
+    fortran_int n, m, nrhs;
+    fortran_int excess;
+
+    INIT_OUTER_LOOP_7
+
+    m = (fortran_int)dimensions[0];
+    n = (fortran_int)dimensions[1];
+    nrhs = (fortran_int)dimensions[2];
+    excess = m - n;
+
+    if (init_gelsd(&params, m, n, nrhs, dispatch_scalar<ftyp>{})) {
+        LINEARIZE_DATA_t a_in, b_in, x_out, s_out, r_out;
+
+        init_linearize_data(&a_in, n, m, steps[1], steps[0]);
+        init_linearize_data_ex(&b_in, nrhs, m, steps[3], steps[2], fortran_int_max(n, m));
+        init_linearize_data_ex(&x_out, nrhs, n, steps[5], steps[4], fortran_int_max(n, m));
+        init_linearize_data(&r_out, 1, nrhs, 1, steps[6]);
+        init_linearize_data(&s_out, 1, fortran_int_min(n, m), 1, steps[7]);
+
+        BEGIN_OUTER_LOOP_7
+            int not_ok;
+            linearize_matrix((typ*)params.A, (typ*)args[0], &a_in);
+            linearize_matrix((typ*)params.B, (typ*)args[1], &b_in);
+            params.RCOND = (basetyp*)args[2];
+            not_ok = call_gelsd(&params);
+            if (!not_ok) {
+                delinearize_matrix((typ*)args[3], (typ*)params.B, &x_out);
+                *(npy_int*) args[5] = params.RANK;
+                delinearize_matrix((basetyp*)args[6], (basetyp*)params.S, &s_out);
+
+                /* Note that linalg.lstsq discards this when excess == 0 */
+                if (excess >= 0 && params.RANK == n) {
+                    /* Compute the residuals as the square sum of each column */
+                    int i;
+                    char *resid = args[4];
+                    ftyp *components = (ftyp *)params.B + n;
+                    for (i = 0; i < nrhs; i++) {
+                        ftyp *vector = components + i*m;
+                        /* Numpy and fortran floating types are the same size,
+                         * so this cast is safe */
+                        basetyp abs = abs2((typ *)vector, excess,
+dispatch_scalar<typ>{});
+                        memcpy(
+                            resid + i*r_out.column_strides,
+                            &abs, sizeof(abs));
+                    }
+                }
+                else {
+                    /* Note that this is always discarded by linalg.lstsq */
+                    nan_matrix((basetyp*)args[4], &r_out);
+                }
+            } else {
+                error_occurred = 1;
+                nan_matrix((typ*)args[3], &x_out);
+                nan_matrix((basetyp*)args[4], &r_out);
+                *(npy_int*) args[5] = -1;
+                nan_matrix((basetyp*)args[6], &s_out);
+            }
+        END_OUTER_LOOP
+
+        release_gelsd(&params);
+    }
+
+    set_fp_invalid_or_clear(error_occurred);
+}
+
+#pragma GCC diagnostic pop
+
+/* -------------------------------------------------------------------------- */
+              /* gufunc registration  */
+
+static void *array_of_nulls[] = {
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL,
+    (void *)NULL
+};
+
+#define FUNC_ARRAY_NAME(NAME) NAME ## _funcs
+
+#define GUFUNC_FUNC_ARRAY_REAL(NAME)                    \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        FLOAT_ ## NAME,                                 \
+        DOUBLE_ ## NAME                                 \
+    }
+
+#define GUFUNC_FUNC_ARRAY_REAL_COMPLEX(NAME)            \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        FLOAT_ ## NAME,                                 \
+        DOUBLE_ ## NAME,                                \
+        CFLOAT_ ## NAME,                                \
+        CDOUBLE_ ## NAME                                \
+    }
+#define GUFUNC_FUNC_ARRAY_REAL_COMPLEX_(NAME)            \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        NAME<npy_float, npy_float>,                                 \
+        NAME<npy_double, npy_double>,                                \
+        NAME<npy_cfloat, npy_float>,                                \
+        NAME<npy_cdouble, npy_double>                                \
+    }
+#define GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(NAME)            \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        NAME<npy_float>,                                 \
+        NAME<npy_double>,                                \
+        NAME<npy_cfloat>,                                \
+        NAME<npy_cdouble>                                \
+    }
+
+/* There are problems with eig in complex single precision.
+ * That kernel is disabled
+ */
+#define GUFUNC_FUNC_ARRAY_EIG(NAME)                     \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        NAME<fortran_complex,fortran_real>,                                 \
+        NAME<fortran_doublecomplex,fortran_doublereal>,                                \
+        NAME<fortran_doublecomplex,fortran_doublecomplex>                                \
+    }
+
+/* The single precision functions are not used at all,
+ * due to input data being promoted to double precision
+ * in Python, so they are not implemented here.
+ */
+#define GUFUNC_FUNC_ARRAY_QR(NAME)                      \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        DOUBLE_ ## NAME,                                \
+        CDOUBLE_ ## NAME                                \
+    }
+#define GUFUNC_FUNC_ARRAY_QR__(NAME)                      \
+    static PyUFuncGenericFunction                       \
+    FUNC_ARRAY_NAME(NAME)[] = {                         \
+        NAME<npy_double>,                                \
+        NAME<npy_cdouble>                                \
+    }
+
+
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX_(slogdet);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX_(det);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(eighlo);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(eighup);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(eigvalshlo);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(eigvalshup);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(solve);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(solve1);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(inv);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(cholesky_lo);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(svd_N);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(svd_S);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(svd_A);
+GUFUNC_FUNC_ARRAY_QR__(qr_r_raw);
+GUFUNC_FUNC_ARRAY_QR__(qr_reduced);
+GUFUNC_FUNC_ARRAY_QR__(qr_complete);
+GUFUNC_FUNC_ARRAY_REAL_COMPLEX__(lstsq);
+GUFUNC_FUNC_ARRAY_EIG(eig);
+GUFUNC_FUNC_ARRAY_EIG(eigvals);
+
+static char equal_2_types[] = {
+    NPY_FLOAT, NPY_FLOAT,
+    NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT, NPY_CFLOAT,
+    NPY_CDOUBLE, NPY_CDOUBLE
+};
+
+static char equal_3_types[] = {
+    NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
+    NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT, NPY_CFLOAT, NPY_CFLOAT,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE
+};
+
+/* second result is logdet, that will always be a REAL */
+static char slogdet_types[] = {
+    NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
+    NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT, NPY_CFLOAT, NPY_FLOAT,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_DOUBLE
+};
+
+static char eigh_types[] = {
+    NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
+    NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT, NPY_FLOAT, NPY_CFLOAT,
+    NPY_CDOUBLE, NPY_DOUBLE, NPY_CDOUBLE
+};
+
+static char eighvals_types[] = {
+    NPY_FLOAT, NPY_FLOAT,
+    NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT, NPY_FLOAT,
+    NPY_CDOUBLE, NPY_DOUBLE
+};
+
+static char eig_types[] = {
+    NPY_FLOAT, NPY_CFLOAT, NPY_CFLOAT,
+    NPY_DOUBLE, NPY_CDOUBLE, NPY_CDOUBLE,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE
+};
+
+static char eigvals_types[] = {
+    NPY_FLOAT, NPY_CFLOAT,
+    NPY_DOUBLE, NPY_CDOUBLE,
+    NPY_CDOUBLE, NPY_CDOUBLE
+};
+
+static char svd_1_1_types[] = {
+    NPY_FLOAT, NPY_FLOAT,
+    NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT, NPY_FLOAT,
+    NPY_CDOUBLE, NPY_DOUBLE
+};
+
+static char svd_1_3_types[] = {
+    NPY_FLOAT,   NPY_FLOAT,   NPY_FLOAT,  NPY_FLOAT,
+    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE, NPY_DOUBLE,
+    NPY_CFLOAT,  NPY_CFLOAT,  NPY_FLOAT,  NPY_CFLOAT,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_DOUBLE, NPY_CDOUBLE
+};
+
+/* A, tau */
+static char qr_r_raw_types[] = {
+    NPY_DOUBLE,  NPY_DOUBLE,
+    NPY_CDOUBLE, NPY_CDOUBLE,
+};
+
+/* A, tau, q */
+static char qr_reduced_types[] = {
+    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE,
+};
+
+/* A, tau, q */
+static char qr_complete_types[] = {
+    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_CDOUBLE,
+};
+
+/*  A,           b,           rcond,      x,           resid,      rank,    s,        */
+static char lstsq_types[] = {
+    NPY_FLOAT,   NPY_FLOAT,   NPY_FLOAT,  NPY_FLOAT,   NPY_FLOAT,  NPY_INT, NPY_FLOAT,
+    NPY_DOUBLE,  NPY_DOUBLE,  NPY_DOUBLE, NPY_DOUBLE,  NPY_DOUBLE, NPY_INT, NPY_DOUBLE,
+    NPY_CFLOAT,  NPY_CFLOAT,  NPY_FLOAT,  NPY_CFLOAT,  NPY_FLOAT,  NPY_INT, NPY_FLOAT,
+    NPY_CDOUBLE, NPY_CDOUBLE, NPY_DOUBLE, NPY_CDOUBLE, NPY_DOUBLE, NPY_INT, NPY_DOUBLE,
+};
+
+typedef struct gufunc_descriptor_struct {
+    const char *name;
+    const char *signature;
+    const char *doc;
+    int ntypes;
+    int nin;
+    int nout;
+    PyUFuncGenericFunction *funcs;
+    char *types;
+} GUFUNC_DESCRIPTOR_t;
+
+GUFUNC_DESCRIPTOR_t gufunc_descriptors [] = {
+    {
+        "slogdet",
+        "(m,m)->(),()",
+        "slogdet on the last two dimensions and broadcast on the rest. \n"\
+        "Results in two arrays, one with sign and the other with log of the"\
+        " determinants. \n"\
+        "    \"(m,m)->(),()\" \n",
+        4, 1, 2,
+        FUNC_ARRAY_NAME(slogdet),
+        slogdet_types
+    },
+    {
+        "det",
+        "(m,m)->()",
+        "det of the last two dimensions and broadcast on the rest. \n"\
+        "    \"(m,m)->()\" \n",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(det),
+        equal_2_types
+    },
+    {
+        "eigh_lo",
+        "(m,m)->(m),(m,m)",
+        "eigh on the last two dimensions and broadcast to the rest, using"\
+        " lower triangle. \n"\
+        "Results in a vector of eigenvalues and a matrix with the"\
+        " eigenvectors. \n"\
+        "    \"(m,m)->(m),(m,m)\" \n",
+        4, 1, 2,
+        FUNC_ARRAY_NAME(eighlo),
+        eigh_types
+    },
+    {
+        "eigh_up",
+        "(m,m)->(m),(m,m)",
+        "eigh on the last two dimensions and broadcast to the rest, using"\
+        " upper triangle. \n"\
+        "Results in a vector of eigenvalues and a matrix with the"\
+        " eigenvectors. \n"\
+        "    \"(m,m)->(m),(m,m)\" \n",
+        4, 1, 2,
+        FUNC_ARRAY_NAME(eighup),
+        eigh_types
+    },
+    {
+        "eigvalsh_lo",
+        "(m,m)->(m)",
+        "eigvalsh on the last two dimensions and broadcast to the rest,"\
+        " using lower triangle. \n"\
+        "Results in a vector of"\
+        " eigenvalues. \n"\
+        "    \"(m,m)->(m)\" \n",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(eigvalshlo),
+        eighvals_types
+    },
+    {
+        "eigvalsh_up",
+        "(m,m)->(m)",
+        "eigvalsh on the last two dimensions and broadcast to the rest,"\
+        " using upper triangle. \n"\
+        "Results in a vector of"\
+        " eigenvalues. \n"\
+        "    \"(m,m)->(m)\" \n",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(eigvalshup),
+        eighvals_types
+    },
+    {
+        "solve",
+        "(m,m),(m,n)->(m,n)",
+        "solve the system a x = b, on the last two dimensions, broadcast"\
+        " to the rest. \n"\
+        "Results in matrices with the solutions. \n"\
+        "    \"(m,m),(m,n)->(m,n)\" \n",
+        4, 2, 1,
+        FUNC_ARRAY_NAME(solve),
+        equal_3_types
+    },
+    {
+        "solve1",
+        "(m,m),(m)->(m)",
+        "solve the system a x = b, for b being a vector, broadcast in"\
+        " the outer dimensions. \n"\
+        "Results in vectors with the solutions. \n"\
+        "    \"(m,m),(m)->(m)\" \n",
+        4, 2, 1,
+        FUNC_ARRAY_NAME(solve1),
+        equal_3_types
+    },
+    {
+        "inv",
+        "(m, m)->(m, m)",
+        "compute the inverse of the last two dimensions and broadcast"\
+        " to the rest. \n"\
+        "Results in the inverse matrices. \n"\
+        "    \"(m,m)->(m,m)\" \n",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(inv),
+        equal_2_types
+    },
+    {
+        "cholesky_lo",
+        "(m,m)->(m,m)",
+        "cholesky decomposition of hermitian positive-definite matrices. \n"\
+        "Broadcast to all outer dimensions. \n"\
+        "    \"(m,m)->(m,m)\" \n",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(cholesky_lo),
+        equal_2_types
+    },
+    {
+        "svd_m",
+        "(m,n)->(m)",
+        "svd when n>=m. ",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(svd_N),
+        svd_1_1_types
+    },
+    {
+        "svd_n",
+        "(m,n)->(n)",
+        "svd when n<=m",
+        4, 1, 1,
+        FUNC_ARRAY_NAME(svd_N),
+        svd_1_1_types
+    },
+    {
+        "svd_m_s",
+        "(m,n)->(m,m),(m),(m,n)",
+        "svd when m<=n",
+        4, 1, 3,
+        FUNC_ARRAY_NAME(svd_S),
+        svd_1_3_types
+    },
+    {
+        "svd_n_s",
+        "(m,n)->(m,n),(n),(n,n)",
+        "svd when m>=n",
+        4, 1, 3,
+        FUNC_ARRAY_NAME(svd_S),
+        svd_1_3_types
+    },
+    {
+        "svd_m_f",
+        "(m,n)->(m,m),(m),(n,n)",
+        "svd when m<=n",
+        4, 1, 3,
+        FUNC_ARRAY_NAME(svd_A),
+        svd_1_3_types
+    },
+    {
+        "svd_n_f",
+        "(m,n)->(m,m),(n),(n,n)",
+        "svd when m>=n",
+        4, 1, 3,
+        FUNC_ARRAY_NAME(svd_A),
+        svd_1_3_types
+    },
+    {
+        "eig",
+        "(m,m)->(m),(m,m)",
+        "eig on the last two dimensions and broadcast to the rest. \n"\
+        "Results in a vector with the eigenvalues and a matrix with the"\
+        " eigenvectors. \n"\
+        "    \"(m,m)->(m),(m,m)\" \n",
+        3, 1, 2,
+        FUNC_ARRAY_NAME(eig),
+        eig_types
+    },
+    {
+        "eigvals",
+        "(m,m)->(m)",
+        "eigvals on the last two dimensions and broadcast to the rest. \n"\
+        "Results in a vector of eigenvalues. \n",
+        3, 1, 1,
+        FUNC_ARRAY_NAME(eigvals),
+        eigvals_types
+    },
+    {
+        "qr_r_raw_m",
+        "(m,n)->(m)",
+        "Compute TAU vector for the last two dimensions \n"\
+        "and broadcast to the rest. For m <= n. \n",
+        2, 1, 1,
+        FUNC_ARRAY_NAME(qr_r_raw),
+        qr_r_raw_types
+    },
+    {
+        "qr_r_raw_n",
+        "(m,n)->(n)",
+        "Compute TAU vector for the last two dimensions \n"\
+        "and broadcast to the rest. For m > n. \n",
+        2, 1, 1,
+        FUNC_ARRAY_NAME(qr_r_raw),
+        qr_r_raw_types
+    },
+    {
+        "qr_reduced",
+        "(m,n),(k)->(m,k)",
+        "Compute Q matrix for the last two dimensions \n"\
+        "and broadcast to the rest. \n",
+        2, 2, 1,
+        FUNC_ARRAY_NAME(qr_reduced),
+        qr_reduced_types
+    },
+    {
+        "qr_complete",
+        "(m,n),(n)->(m,m)",
+        "Compute Q matrix for the last two dimensions \n"\
+        "and broadcast to the rest. For m > n. \n",
+        2, 2, 1,
+        FUNC_ARRAY_NAME(qr_complete),
+        qr_complete_types
+    },
+    {
+        "lstsq_m",
+        "(m,n),(m,nrhs),()->(n,nrhs),(nrhs),(),(m)",
+        "least squares on the last two dimensions and broadcast to the rest. \n"\
+        "For m <= n. \n",
+        4, 3, 4,
+        FUNC_ARRAY_NAME(lstsq),
+        lstsq_types
+    },
+    {
+        "lstsq_n",
+        "(m,n),(m,nrhs),()->(n,nrhs),(nrhs),(),(n)",
+        "least squares on the last two dimensions and broadcast to the rest. \n"\
+        "For m >= n, meaning that residuals are produced. \n",
+        4, 3, 4,
+        FUNC_ARRAY_NAME(lstsq),
+        lstsq_types
+    }
+};
+
+static int
+addUfuncs(PyObject *dictionary) {
+    PyObject *f;
+    int i;
+    const int gufunc_count = sizeof(gufunc_descriptors)/
+        sizeof(gufunc_descriptors[0]);
+    for (i = 0; i < gufunc_count; i++) {
+        GUFUNC_DESCRIPTOR_t* d = &gufunc_descriptors[i];
+        f = PyUFunc_FromFuncAndDataAndSignature(d->funcs,
+                                                array_of_nulls,
+                                                d->types,
+                                                d->ntypes,
+                                                d->nin,
+                                                d->nout,
+                                                PyUFunc_None,
+                                                d->name,
+                                                d->doc,
+                                                0,
+                                                d->signature);
+        if (f == NULL) {
+            return -1;
+        }
+#if 0
+        dump_ufunc_object((PyUFuncObject*) f);
+#endif
+        int ret = PyDict_SetItemString(dictionary, d->name, f);
+        Py_DECREF(f);
+        if (ret < 0) {
+            return -1;
+        }
+    }
+    return 0;
+}
+
+
+
+/* -------------------------------------------------------------------------- */
+                  /* Module initialization stuff  */
+
+static PyMethodDef UMath_LinAlgMethods[] = {
+    {NULL, NULL, 0, NULL}        /* Sentinel */
+};
+
+static struct PyModuleDef moduledef = {
+        PyModuleDef_HEAD_INIT,
+        UMATH_LINALG_MODULE_NAME,
+        NULL,
+        -1,
+        UMath_LinAlgMethods,
+        NULL,
+        NULL,
+        NULL,
+        NULL
+};
+
+PyMODINIT_FUNC PyInit__umath_linalg(void)
+{
+    PyObject *m;
+    PyObject *d;
+    PyObject *version;
+
+    m = PyModule_Create(&moduledef);
+    if (m == NULL) {
+        return NULL;
+    }
+
+    import_array();
+    import_ufunc();
+
+    d = PyModule_GetDict(m);
+    if (d == NULL) {
+        return NULL;
+    }
+
+    version = PyUnicode_FromString(umath_linalg_version_string);
+    if (version == NULL) {
+        return NULL;
+    }
+    int ret = PyDict_SetItemString(d, "__version__", version);
+    Py_DECREF(version);
+    if (ret < 0) {
+        return NULL;
+    }
+
+    /* Load the ufunc operators into the module's namespace */
+    if (addUfuncs(d) < 0) {
+        return NULL;
+    }
+
+    return m;
+}
diff --git a/numpy/ma/__init__.pyi b/numpy/ma/__init__.pyi
index 26d44b508c47fa99bd5f16e3b38c80dcc16584a6..7f5cb56a80d6b842498bd641d8a89e02ded75aba 100644
--- a/numpy/ma/__init__.pyi
+++ b/numpy/ma/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import Any, List
-
  from numpy._pytesttester import PytestTester
  
  from numpy.ma import extras as extras
@@ -218,6 +216,7 @@ from numpy.ma.extras import (
      masked_all_like as masked_all_like,
      median as median,
      mr_ as mr_,
+    ndenumerate as ndenumerate,
      notmasked_contiguous as notmasked_contiguous,
      notmasked_edges as notmasked_edges,
      polyfit as polyfit,
@@ -231,6 +230,6 @@ from numpy.ma.extras import (
      vstack as vstack,
  )
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
diff --git a/numpy/ma/core.py b/numpy/ma/core.py
index 491c2c60550f4c778898ca00da2f11c62f12bd18..ed17b1b223088b3c6a6bd7a422f755b8fa60f9ca 100644
--- a/numpy/ma/core.py
+++ b/numpy/ma/core.py
@@ -405,7 +405,7 @@ def _recursive_set_fill_value(fillvalue, dt):
  
      Returns
      -------
-    val: tuple
+    val : tuple
          A tuple of values corresponding to the structured fill value.
  
      """
@@ -2842,7 +2842,7 @@ class MaskedArray(ndarray):
          # still has the _mask attribute like MaskedArrays
          if hasattr(data, '_mask') and not isinstance(data, ndarray):
              _data._mask = data._mask
-            # FIXME: should we set `_data._sharedmask = True`? 
+            # FIXME: should we set `_data._sharedmask = True`?
          # Process mask.
          # Type of the mask
          mdtype = make_mask_descr(_data.dtype)
@@ -3542,15 +3542,17 @@ class MaskedArray(ndarray):
  
      def harden_mask(self):
          """
-        Force the mask to hard.
+        Force the mask to hard, preventing unmasking by assignment.
  
          Whether the mask of a masked array is hard or soft is determined by
          its `~ma.MaskedArray.hardmask` property. `harden_mask` sets
-        `~ma.MaskedArray.hardmask` to ``True``.
+        `~ma.MaskedArray.hardmask` to ``True`` (and returns the modified
+        self).
  
          See Also
          --------
          ma.MaskedArray.hardmask
+        ma.MaskedArray.soften_mask
  
          """
          self._hardmask = True
@@ -3558,15 +3560,17 @@ class MaskedArray(ndarray):
  
      def soften_mask(self):
          """
-        Force the mask to soft.
+        Force the mask to soft (default), allowing unmasking by assignment.
  
          Whether the mask of a masked array is hard or soft is determined by
          its `~ma.MaskedArray.hardmask` property. `soften_mask` sets
-        `~ma.MaskedArray.hardmask` to ``False``.
+        `~ma.MaskedArray.hardmask` to ``False`` (and returns the modified
+        self).
  
          See Also
          --------
          ma.MaskedArray.hardmask
+        ma.MaskedArray.harden_mask
  
          """
          self._hardmask = False
@@ -3574,16 +3578,55 @@ class MaskedArray(ndarray):
  
      @property
      def hardmask(self):
-        """ Hardness of the mask """
+        """
+        Specifies whether values can be unmasked through assignments.
+
+        By default, assigning definite values to masked array entries will
+        unmask them.  When `hardmask` is ``True``, the mask will not change
+        through assignments.
+
+        See Also
+        --------
+        ma.MaskedArray.harden_mask
+        ma.MaskedArray.soften_mask
+
+        Examples
+        --------
+        >>> x = np.arange(10)
+        >>> m = np.ma.masked_array(x, x>5)
+        >>> assert not m.hardmask
+
+        Since `m` has a soft mask, assigning an element value unmasks that
+        element:
+
+        >>> m[8] = 42
+        >>> m
+        masked_array(data=[0, 1, 2, 3, 4, 5, --, --, 42, --],
+                     mask=[False, False, False, False, False, False,
+                           True, True, False, True],
+               fill_value=999999)
+
+        After hardening, the mask is not affected by assignments:
+
+        >>> hardened = np.ma.harden_mask(m)
+        >>> assert m.hardmask and hardened is m
+        >>> m[:] = 23
+        >>> m
+        masked_array(data=[23, 23, 23, 23, 23, 23, --, --, 23, --],
+                     mask=[False, False, False, False, False, False,
+                           True, True, False, True],
+               fill_value=999999)
+
+        """
          return self._hardmask
  
      def unshare_mask(self):
          """
-        Copy the mask and set the sharedmask flag to False.
+        Copy the mask and set the `sharedmask` flag to ``False``.
  
          Whether the mask is shared between masked arrays can be seen from
-        the `sharedmask` property. `unshare_mask` ensures the mask is not shared.
-        A copy of the mask is only made if it was shared.
+        the `sharedmask` property. `unshare_mask` ensures the mask is not
+        shared. A copy of the mask is only made if it was shared.
  
          See Also
          --------
@@ -4811,7 +4854,6 @@ class MaskedArray(ndarray):
            WRITEABLE : True
            ALIGNED : True
            WRITEBACKIFCOPY : False
-          UPDATEIFCOPY : False
  
          """
          return self.flags['CONTIGUOUS']
@@ -5666,9 +5708,12 @@ class MaskedArray(ndarray):
  
          Parameters
          ----------
-        axis : {None, int}, optional
+        axis : None or int or tuple of ints, optional
              Axis along which to operate.  By default, ``axis`` is None and the
              flattened input is used.
+            .. versionadded:: 1.7.0
+            If this is a tuple of ints, the minimum is selected over multiple
+            axes, instead of a single axis or all the axes as before.
          out : array_like, optional
              Alternative output array in which to place the result.  Must be of
              the same shape and buffer length as the expected output.
@@ -5800,9 +5845,12 @@ class MaskedArray(ndarray):
  
          Parameters
          ----------
-        axis : {None, int}, optional
+        axis : None or int or tuple of ints, optional
              Axis along which to operate.  By default, ``axis`` is None and the
              flattened input is used.
+            .. versionadded:: 1.7.0
+            If this is a tuple of ints, the maximum is selected over multiple
+            axes, instead of a single axis or all the axes as before.
          out : array_like, optional
              Alternative output array in which to place the result.  Must
              be of the same shape and buffer length as the expected output.
@@ -8160,7 +8208,7 @@ class _convert2ma:
  
  
  arange = _convert2ma(
-    'arange', 
+    'arange',
      params=dict(fill_value=None, hardmask=False),
      np_ret='arange : ndarray',
      np_ma_ret='arange : MaskedArray',
@@ -8178,7 +8226,7 @@ diff = _convert2ma(
      np_ma_ret='diff : MaskedArray',
  )
  empty = _convert2ma(
-    'empty', 
+    'empty',
      params=dict(fill_value=None, hardmask=False),
      np_ret='out : ndarray',
      np_ma_ret='out : MaskedArray',
@@ -8199,7 +8247,7 @@ fromfunction = _convert2ma(
     np_ma_ret='fromfunction: MaskedArray',
  )
  identity = _convert2ma(
-    'identity', 
+    'identity',
      params=dict(fill_value=None, hardmask=False),
      np_ret='out : ndarray',
      np_ma_ret='out : MaskedArray',
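
Reviewer note on the numpy/ma/core.py docstring changes above: the snippet below is a minimal sketch, not part of the patch, of the hard/soft mask behaviour the new `hardmask`, `harden_mask` and `soften_mask` docstrings describe. It uses the method forms `m.harden_mask()` and `m.soften_mask()` rather than the `np.ma.harden_mask(m)` function shown in the docstring example; the variable names are illustrative only.

```python
import numpy as np

x = np.arange(10)
m = np.ma.masked_array(x, mask=x > 5)

# Soft mask (the default): assigning to a masked element unmasks it.
m[8] = 42
mask_after_soft_assign = m.mask.copy()

# harden_mask() sets hardmask to True and returns the array itself.
hardened = m.harden_mask()

# Hard mask: a blanket assignment only reaches the unmasked elements,
# and the mask itself is left untouched.
m[:] = 23
mask_after_hard_assign = m.mask.copy()

# soften_mask() restores the default, so assignment unmasks again.
m.soften_mask()
m[6] = 0
```

The two mask snapshots are identical, which is the point of the hardened state: only the values of already-unmasked elements change.
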
diff --git a/numpy/ma/core.pyi b/numpy/ma/core.pyi
index bc1f45a8d5adc61e6023700a729e2747b8a5702a..ffdb219839f571e8ac48564705152e1d0b01fa55 100644
--- a/numpy/ma/core.pyi
+++ b/numpy/ma/core.pyi
@@ -1,4 +1,5 @@
-from typing import Any, List, TypeVar, Callable
+from collections.abc import Callable
+from typing import Any, TypeVar
  from numpy import ndarray, dtype, float64
  
  from numpy import (
@@ -23,7 +24,7 @@ from numpy.lib.function_base import (
  _ShapeType = TypeVar("_ShapeType", bound=Any)
  _DType_co = TypeVar("_DType_co", bound=dtype[Any], covariant=True)
  
-__all__: List[str]
+__all__: list[str]
  
  MaskType = bool_
  nomask: bool_
diff --git a/numpy/ma/extras.py b/numpy/ma/extras.py
index 38bf1f0e83952746ae58e914ece2890321b43ddf..641f4746f0589856cff7ed672e869462f60fcf24 100644
--- a/numpy/ma/extras.py
+++ b/numpy/ma/extras.py
@@ -10,12 +10,12 @@ A collection of utilities for `numpy.ma`.
  """
  __all__ = [
      'apply_along_axis', 'apply_over_axes', 'atleast_1d', 'atleast_2d',
-    'atleast_3d', 'average', 'clump_masked', 'clump_unmasked',
-    'column_stack', 'compress_cols', 'compress_nd', 'compress_rowcols',
-    'compress_rows', 'count_masked', 'corrcoef', 'cov', 'diagflat', 'dot',
-    'dstack', 'ediff1d', 'flatnotmasked_contiguous', 'flatnotmasked_edges',
-    'hsplit', 'hstack', 'isin', 'in1d', 'intersect1d', 'mask_cols', 'mask_rowcols',
-    'mask_rows', 'masked_all', 'masked_all_like', 'median', 'mr_',
+    'atleast_3d', 'average', 'clump_masked', 'clump_unmasked', 'column_stack',
+    'compress_cols', 'compress_nd', 'compress_rowcols', 'compress_rows',
+    'count_masked', 'corrcoef', 'cov', 'diagflat', 'dot', 'dstack', 'ediff1d',
+    'flatnotmasked_contiguous', 'flatnotmasked_edges', 'hsplit', 'hstack',
+    'isin', 'in1d', 'intersect1d', 'mask_cols', 'mask_rowcols', 'mask_rows',
+    'masked_all', 'masked_all_like', 'median', 'mr_', 'ndenumerate',
      'notmasked_contiguous', 'notmasked_edges', 'polyfit', 'row_stack',
      'setdiff1d', 'setxor1d', 'stack', 'unique', 'union1d', 'vander', 'vstack',
      ]
@@ -110,8 +110,8 @@ def masked_all(shape, dtype=float):
  
      Parameters
      ----------
-    shape : tuple
-        Shape of the required MaskedArray.
+    shape : int or tuple of ints
+        Shape of the required MaskedArray, e.g., ``(2, 3)`` or ``2``.
      dtype : dtype, optional
          Data type of the output.
  
@@ -475,6 +475,7 @@ def apply_over_axes(func, a, axes):
                          "an array of the correct shape")
      return val
  
+
  if apply_over_axes.__doc__ is not None:
      apply_over_axes.__doc__ = np.apply_over_axes.__doc__[
          :np.apply_over_axes.__doc__.find('Notes')].rstrip() + \
@@ -524,7 +525,8 @@ if apply_over_axes.__doc__ is not None:
      """
  
  
-def average(a, axis=None, weights=None, returned=False):
+def average(a, axis=None, weights=None, returned=False, *,
+            keepdims=np._NoValue):
      """
      Return the weighted average of array over the given axis.
  
@@ -550,6 +552,14 @@ def average(a, axis=None, weights=None, returned=False):
          Flag indicating whether a tuple ``(result, sum of weights)``
          should be returned as output (True), or just the result (False).
          Default is False.
+    keepdims : bool, optional
+        If this is set to True, the axes which are reduced are left
+        in the result as dimensions with size one. With this option,
+        the result will broadcast correctly against the original `a`.
+        *Note:* `keepdims` will not work with instances of `numpy.matrix`
+        or other classes whose methods do not support `keepdims`.
+
+        .. versionadded:: 1.23.0
  
      Returns
      -------
@@ -582,17 +592,32 @@ def average(a, axis=None, weights=None, returned=False):
                   mask=[False, False],
             fill_value=1e+20)
  
+    With ``keepdims=True``, the following result has shape (3, 1).
+
+    >>> np.ma.average(x, axis=1, keepdims=True)
+    masked_array(
+      data=[[0.5],
+            [2.5],
+            [4.5]],
+      mask=False,
+      fill_value=1e+20)
      """
      a = asarray(a)
      m = getmask(a)
  
      # inspired by 'average' in numpy/lib/function_base.py
  
+    if keepdims is np._NoValue:
+        # Don't pass on the keepdims argument if one wasn't given.
+        keepdims_kw = {}
+    else:
+        keepdims_kw = {'keepdims': keepdims}
+
      if weights is None:
-        avg = a.mean(axis)
+        avg = a.mean(axis, **keepdims_kw)
          scl = avg.dtype.type(a.count(axis))
      else:
-        wgt = np.asanyarray(weights)
+        wgt = asarray(weights)
  
          if issubclass(a.dtype.type, (np.integer, np.bool_)):
              result_dtype = np.result_type(a.dtype, wgt.dtype, 'f8')
@@ -618,9 +643,11 @@ def average(a, axis=None, weights=None, returned=False):
  
          if m is not nomask:
              wgt = wgt*(~a.mask)
+            wgt.mask |= a.mask
  
          scl = wgt.sum(axis=axis, dtype=result_dtype)
-        avg = np.multiply(a, wgt, dtype=result_dtype).sum(axis)/scl
+        avg = np.multiply(a, wgt,
+                          dtype=result_dtype).sum(axis, **keepdims_kw) / scl
  
      if returned:
          if scl.shape != avg.shape:
@@ -712,6 +739,7 @@ def median(a, axis=None, out=None, overwrite_input=False, keepdims=False):
      else:
          return r
  
+
  def _median(a, axis=None, out=None, overwrite_input=False):
      # when an unmasked NaN is present return it, so we need to sort the NaN
      # values behind the mask
@@ -839,6 +867,7 @@ def compress_nd(x, axis=None):
          data = data[(slice(None),)*ax + (~m.any(axis=axes),)]
      return data
  
+
  def compress_rowcols(x, axis=None):
      """
      Suppress the rows and/or columns of a 2-D array that contain
@@ -911,6 +940,7 @@ def compress_rows(a):
          raise NotImplementedError("compress_rows works for 2D arrays only.")
      return compress_rowcols(a, 0)
  
+
  def compress_cols(a):
      """
      Suppress whole columns of a 2-D array that contain masked values.
@@ -928,6 +958,7 @@ def compress_cols(a):
          raise NotImplementedError("compress_cols works for 2D arrays only.")
      return compress_rowcols(a, 1)
  
+
  def mask_rows(a, axis=np._NoValue):
      """
      Mask rows of a 2D array that contain masked values.
@@ -978,6 +1009,7 @@ def mask_rows(a, axis=np._NoValue):
              "will raise TypeError", DeprecationWarning, stacklevel=2)
      return mask_rowcols(a, 0)
  
+
  def mask_cols(a, axis=np._NoValue):
      """
      Mask columns of a 2D array that contain masked values.
@@ -1515,10 +1547,79 @@ class mr_class(MAxisConcatenator):
  
  mr_ = mr_class()
  
+
  #####--------------------------------------------------------------------------
  #---- Find unmasked data ---
  #####--------------------------------------------------------------------------
  
+def ndenumerate(a, compressed=True):
+    """
+    Multidimensional index iterator.
+
+    Return an iterator yielding pairs of array coordinates and values,
+    skipping elements that are masked. With `compressed=False`,
+    `ma.masked` is yielded as the value of masked elements. This
+    behavior differs from that of `numpy.ndenumerate`, which yields the
+    value of the underlying data array.
+
+    Notes
+    -----
+    .. versionadded:: 1.23.0
+
+    Parameters
+    ----------
+    a : array_like
+        An array with (possibly) masked elements.
+    compressed : bool, optional
+        If True (default), masked elements are skipped.
+
+    See Also
+    --------
+    numpy.ndenumerate : Equivalent function ignoring any mask.
+
+    Examples
+    --------
+    >>> a = np.ma.arange(9).reshape((3, 3))
+    >>> a[1, 0] = np.ma.masked
+    >>> a[1, 2] = np.ma.masked
+    >>> a[2, 1] = np.ma.masked
+    >>> a
+    masked_array(
+      data=[[0, 1, 2],
+            [--, 4, --],
+            [6, --, 8]],
+      mask=[[False, False, False],
+            [ True, False,  True],
+            [False,  True, False]],
+      fill_value=999999)
+    >>> for index, x in np.ma.ndenumerate(a):
+    ...     print(index, x)
+    (0, 0) 0
+    (0, 1) 1
+    (0, 2) 2
+    (1, 1) 4
+    (2, 0) 6
+    (2, 2) 8
+
+    >>> for index, x in np.ma.ndenumerate(a, compressed=False):
+    ...     print(index, x)
+    (0, 0) 0
+    (0, 1) 1
+    (0, 2) 2
+    (1, 0) --
+    (1, 1) 4
+    (1, 2) --
+    (2, 0) 6
+    (2, 1) --
+    (2, 2) 8
+    """
+    for it, mask in zip(np.ndenumerate(a), getmaskarray(a).flat):
+        if not mask:
+            yield it
+        elif not compressed:
+            yield it[0], masked
+
+
  def flatnotmasked_edges(a):
      """
      Find the indices of the first and last unmasked values.
@@ -1627,11 +1728,11 @@ def notmasked_edges(a, axis=None):
  
  def flatnotmasked_contiguous(a):
      """
-    Find contiguous unmasked data in a masked array along the given axis.
+    Find contiguous unmasked data in a masked array.
  
      Parameters
      ----------
-    a : narray
+    a : array_like
          The input array.
  
      Returns
@@ -1681,6 +1782,7 @@ def flatnotmasked_contiguous(a):
          i += n
      return result
  
+
  def notmasked_contiguous(a, axis=None):
      """
      Find contiguous unmasked data in a masked array along the given axis.
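
Reviewer note on the numpy/ma/extras.py changes above: the following is a hedged sketch, not part of the patch, exercising the two user-visible additions, `np.ma.ndenumerate` and the `keepdims` argument of `np.ma.average`. It assumes NumPy 1.23.0 or newer is installed; the variable names are illustrative. Only the unweighted `keepdims` path shown in the new docstring example is used here.

```python
import numpy as np

a = np.ma.arange(9).reshape((3, 3))
a[1, 0] = np.ma.masked
a[1, 2] = np.ma.masked
a[2, 1] = np.ma.masked

# Default (compressed=True): masked elements are skipped entirely.
unmasked_items = [(idx, int(val)) for idx, val in np.ma.ndenumerate(a)]

# compressed=False: every index is visited and masked slots yield
# the np.ma.masked singleton instead of the underlying data value.
all_items = list(np.ma.ndenumerate(a, compressed=False))
masked_indices = [idx for idx, val in all_items if val is np.ma.masked]

# keepdims=True keeps the reduced axis as a length-1 dimension,
# so the result broadcasts against the original array.
x = np.ma.arange(6.).reshape(3, 2)
row_means = np.ma.average(x, axis=1, keepdims=True)
```

Here `row_means` has shape (3, 1), so an expression such as `x - row_means` centres each row without any manual reshaping.
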
diff --git a/numpy/ma/extras.pyi b/numpy/ma/extras.pyi
index e58e43badf233654a1ee85342f5c9f95d2adc7b6..56228b927080a3963159206a9afc830d6d7335cc 100644
--- a/numpy/ma/extras.pyi
+++ b/numpy/ma/extras.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List
+from typing import Any
  from numpy.lib.index_tricks import AxisConcatenator
  
  from numpy.ma.core import (
@@ -6,7 +6,7 @@ from numpy.ma.core import (
      mask_rowcols as mask_rowcols,
  )
  
-__all__: List[str]
+__all__: list[str]
  
  def count_masked(arr, axis=...): ...
  def masked_all(shape, dtype = ...): ...
@@ -44,7 +44,7 @@ diagflat: _fromnxfunction_single
  
  def apply_along_axis(func1d, axis, arr, *args, **kwargs): ...
  def apply_over_axes(func, a, axes): ...
-def average(a, axis=..., weights=..., returned=...): ...
+def average(a, axis=..., weights=..., returned=..., keepdims=...): ...
  def median(a, axis=..., out=..., overwrite_input=..., keepdims=...): ...
  def compress_nd(x, axis=...): ...
  def compress_rowcols(x, axis=...): ...
@@ -74,6 +74,7 @@ class mr_class(MAxisConcatenator):
  
  mr_: mr_class
  
+def ndenumerate(a, compressed=...): ...
  def flatnotmasked_edges(a): ...
  def notmasked_edges(a, axis=...): ...
  def flatnotmasked_contiguous(a): ...
diff --git a/numpy/ma/mrecords.pyi b/numpy/ma/mrecords.pyi
index 7bd8678cf12de57ae4f7c73379f17f4ef544ded3..264807e05d57e03e2f8b71d2db2677d8a68ab17e 100644
--- a/numpy/ma/mrecords.pyi
+++ b/numpy/ma/mrecords.pyi
@@ -1,9 +1,9 @@
-from typing import List, Any, TypeVar
+from typing import Any, TypeVar
  
  from numpy import dtype
  from numpy.ma import MaskedArray
  
-__all__: List[str]
+__all__: list[str]
  
  # TODO: Set the `bound` to something more suitable once we
  # have proper shape support
@@ -84,7 +84,7 @@ def fromtextfile(
      varnames=...,
      vartypes=...,
      # NOTE: deprecated: NumPy 1.22.0, 2021-09-23
-    # delimitor=..., 
+    # delimitor=...,
  ): ...
  
  def addfield(mrecord, newfield, newfieldname=...): ...
diff --git a/numpy/ma/tests/test_extras.py b/numpy/ma/tests/test_extras.py
index e735b9bc77fa7e3a07a67c0609be9f4f449c6c18..1827edd1f1022f3d5b04416ac8da964c6fff267b 100644
--- a/numpy/ma/tests/test_extras.py
+++ b/numpy/ma/tests/test_extras.py
@@ -28,7 +28,7 @@ from numpy.ma.extras import (
      ediff1d, apply_over_axes, apply_along_axis, compress_nd, compress_rowcols,
      mask_rowcols, clump_masked, clump_unmasked, flatnotmasked_contiguous,
      notmasked_contiguous, notmasked_edges, masked_all, masked_all_like, isin,
-    diagflat, stack, vstack
+    diagflat, ndenumerate, stack, vstack
      )
  
  
@@ -75,7 +75,7 @@ class TestGeneric:
          assert_equal(len(masked_arr['b']['c']), 1)
          assert_equal(masked_arr['b']['c'].shape, (1, 1))
          assert_equal(masked_arr['b']['c']._fill_value.shape, ())
-    
+
      def test_masked_all_with_object(self):
          # same as above except that the array is not nested
          my_dtype = np.dtype([('b', (object, (1,)))])
@@ -292,6 +292,29 @@ class TestAverage:
          assert_almost_equal(wav1.real, expected1.real)
          assert_almost_equal(wav1.imag, expected1.imag)
  
+    @pytest.mark.parametrize(
+        'x, axis, expected_avg, weights, expected_wavg, expected_wsum',
+        [([1, 2, 3], None, [2.0], [3, 4, 1], [1.75], [8.0]),
+         ([[1, 2, 5], [1, 6, 11]], 0, [[1.0, 4.0, 8.0]],
+          [1, 3], [[1.0, 5.0, 9.5]], [[4, 4, 4]])],
+    )
+    def test_basic_keepdims(self, x, axis, expected_avg,
+                            weights, expected_wavg, expected_wsum):
+        avg = np.ma.average(x, axis=axis, keepdims=True)
+        assert avg.shape == np.shape(expected_avg)
+        assert_array_equal(avg, expected_avg)
+
+        wavg = np.ma.average(x, axis=axis, weights=weights, keepdims=True)
+        assert wavg.shape == np.shape(expected_wavg)
+        assert_array_equal(wavg, expected_wavg)
+
+        wavg, wsum = np.ma.average(x, axis=axis, weights=weights,
+                                   returned=True, keepdims=True)
+        assert wavg.shape == np.shape(expected_wavg)
+        assert_array_equal(wavg, expected_wavg)
+        assert wsum.shape == np.shape(expected_wsum)
+        assert_array_equal(wsum, expected_wsum)
+
      def test_masked_weights(self):
          # Test with masked weights.
          # (Regression test for https://github.com/numpy/numpy/issues/10438)
@@ -309,6 +332,32 @@ class TestAverage:
          expected_masked = np.array([6.0, 5.576271186440678, 6.576271186440678])
          assert_almost_equal(avg_masked, expected_masked)
  
+        # weights should be masked if needed
+        # depending on the array mask. This is to avoid summing
+        # masked nan or other values that are not cancelled by a zero
+        a = np.ma.array([1.0,   2.0,   3.0,  4.0],
+                   mask=[False, False, True, True])
+        avg_unmasked = average(a, weights=[1, 1, 1, np.nan])
+
+        assert_almost_equal(avg_unmasked, 1.5)
+
+        a = np.ma.array([
+            [1.0, 2.0, 3.0, 4.0],
+            [5.0, 6.0, 7.0, 8.0],
+            [9.0, 1.0, 2.0, 3.0],
+        ], mask=[
+            [False, True, True, False],
+            [True, False, True, True],
+            [True, False, True, False],
+        ])
+
+        avg_masked = np.ma.average(a, weights=[1, np.nan, 1], axis=0)
+        avg_expected = np.ma.array([1.0, np.nan, np.nan, 3.5],
+                              mask=[False, True, True, False])
+
+        assert_almost_equal(avg_masked, avg_expected)
+        assert_equal(avg_masked.mask, avg_expected.mask)
+
  
  class TestConcatenator:
      # Tests for mr_, the equivalent of r_ for masked arrays.
@@ -1617,12 +1666,49 @@ class TestShapeBase:
              assert_equal(a.mask.shape, a.shape)
              assert_equal(a.data.shape, a.shape)
  
-
          b = diagflat(1.0)
          assert_equal(b.shape, (1, 1))
          assert_equal(b.mask.shape, b.data.shape)
  
  
+class TestNDEnumerate:
+
+    def test_ndenumerate_nomasked(self):
+        ordinary = np.arange(6.).reshape((1, 3, 2))
+        empty_mask = np.zeros_like(ordinary, dtype=bool)
+        with_mask = masked_array(ordinary, mask=empty_mask)
+        assert_equal(list(np.ndenumerate(ordinary)),
+                     list(ndenumerate(ordinary)))
+        assert_equal(list(ndenumerate(ordinary)),
+                     list(ndenumerate(with_mask)))
+        assert_equal(list(ndenumerate(with_mask)),
+                     list(ndenumerate(with_mask, compressed=False)))
+
+    def test_ndenumerate_allmasked(self):
+        a = masked_all(())
+        b = masked_all((100,))
+        c = masked_all((2, 3, 4))
+        assert_equal(list(ndenumerate(a)), [])
+        assert_equal(list(ndenumerate(b)), [])
+        assert_equal(list(ndenumerate(b, compressed=False)),
+                     list(zip(np.ndindex((100,)), 100 * [masked])))
+        assert_equal(list(ndenumerate(c)), [])
+        assert_equal(list(ndenumerate(c, compressed=False)),
+                     list(zip(np.ndindex((2, 3, 4)), 2 * 3 * 4 * [masked])))
+
+    def test_ndenumerate_mixedmasked(self):
+        a = masked_array(np.arange(12).reshape((3, 4)),
+                         mask=[[1, 1, 1, 1],
+                               [1, 1, 0, 1],
+                               [0, 0, 0, 0]])
+        items = [((1, 2), 6),
+                 ((2, 0), 8), ((2, 1), 9), ((2, 2), 10), ((2, 3), 11)]
+        assert_equal(list(ndenumerate(a)), items)
+        assert_equal(len(list(ndenumerate(a, compressed=False))), a.size)
+        for coordinate, value in ndenumerate(a, compressed=False):
+            assert_equal(a[coordinate], value)
+
+
  class TestStack:
  
      def test_stack_1d(self):
diff --git a/numpy/ma/tests/test_subclassing.py b/numpy/ma/tests/test_subclassing.py
index 83a9b2f5187ce3881aa8ac102c02f8303516a602..3491cef7f450c74dd64c84159044973a4d3fb47d 100644
--- a/numpy/ma/tests/test_subclassing.py
+++ b/numpy/ma/tests/test_subclassing.py
@@ -28,8 +28,7 @@ class SubArray(np.ndarray):
          return x
  
      def __array_finalize__(self, obj):
-        if callable(getattr(super(), '__array_finalize__', None)):
-            super().__array_finalize__(obj)
+        super().__array_finalize__(obj)
          self.info = getattr(obj, 'info', {}).copy()
          return
  
@@ -315,7 +314,7 @@ class TestSubclassing:
          assert_startswith(repr(mx), 'masked_array')
          xsub = SubArray(x)
          mxsub = masked_array(xsub, mask=[True, False, True, False, False])
-        assert_startswith(repr(mxsub), 
+        assert_startswith(repr(mxsub),
              f'masked_{SubArray.__name__}(data=[--, 1, --, 3, 4]')
  
      def test_subclass_str(self):
diff --git a/numpy/matrixlib/__init__.pyi b/numpy/matrixlib/__init__.pyi
index c1b82d2ecdb778aa233c5972a8271d684f037578..b0ca8c9ca03d39efa03bede061f2a4f8ef90523a 100644
--- a/numpy/matrixlib/__init__.pyi
+++ b/numpy/matrixlib/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import List
-
  from numpy._pytesttester import PytestTester
  
  from numpy import (
@@ -12,6 +10,6 @@ from numpy.matrixlib.defmatrix import (
      asmatrix as asmatrix,
  )
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
diff --git a/numpy/matrixlib/defmatrix.pyi b/numpy/matrixlib/defmatrix.pyi
index 6c86ea1ef769d47249d8fb70151011f181a34210..9d0d1ee50b6600bce80f1f5b1363e5ee3102a02a 100644
--- a/numpy/matrixlib/defmatrix.pyi
+++ b/numpy/matrixlib/defmatrix.pyi
@@ -1,8 +1,9 @@
-from typing import List, Any, Sequence, Mapping
+from collections.abc import Sequence, Mapping
+from typing import Any
  from numpy import matrix as matrix
-from numpy.typing import ArrayLike, DTypeLike, NDArray
+from numpy._typing import ArrayLike, DTypeLike, NDArray
  
-__all__: List[str]
+__all__: list[str]
  
  def bmat(
      obj: str | Sequence[ArrayLike] | NDArray[Any],
diff --git a/numpy/polynomial/__init__.pyi b/numpy/polynomial/__init__.pyi
index e0cfedd7aae7791a009409474dad7cd67d5188e7..c9d1c27a96c2d8ccfeb9e378a2599c2e70003ee4 100644
--- a/numpy/polynomial/__init__.pyi
+++ b/numpy/polynomial/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import List
-
  from numpy._pytesttester import PytestTester
  
  from numpy.polynomial import (
@@ -17,8 +15,8 @@ from numpy.polynomial.laguerre import Laguerre as Laguerre
  from numpy.polynomial.legendre import Legendre as Legendre
  from numpy.polynomial.polynomial import Polynomial as Polynomial
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
  
  def set_default_printstyle(style): ...
diff --git a/numpy/polynomial/_polybase.pyi b/numpy/polynomial/_polybase.pyi
index c4160146947f048dfc5c638b67dd23b16ff0933a..537221c45c6110df138135e73f8fd9b01a3a400b 100644
--- a/numpy/polynomial/_polybase.pyi
+++ b/numpy/polynomial/_polybase.pyi
@@ -1,7 +1,7 @@
  import abc
-from typing import Any, List, ClassVar
+from typing import Any, ClassVar
  
-__all__: List[str]
+__all__: list[str]
  
  class ABCPolyBase(abc.ABC):
      __hash__: ClassVar[None]  # type: ignore[assignment]
diff --git a/numpy/polynomial/chebyshev.py b/numpy/polynomial/chebyshev.py
index 89ce815d571e815f7ada7141f05ad27920e0980d..5c595bcf668d0ae6917d6811ce154dcb0ae78fce 100644
--- a/numpy/polynomial/chebyshev.py
+++ b/numpy/polynomial/chebyshev.py
@@ -1119,7 +1119,7 @@ def chebval(x, c, tensor=True):
          If `x` is a list or tuple, it is converted to an ndarray, otherwise
          it is left unchanged and treated as a scalar. In either case, `x`
          or its elements must support addition and multiplication with
-        with themselves and with the elements of `c`.
+        themselves and with the elements of `c`.
      c : array_like
          Array of coefficients ordered so that the coefficients for terms of
          degree n are contained in c[n]. If `c` is multidimensional the
diff --git a/numpy/polynomial/chebyshev.pyi b/numpy/polynomial/chebyshev.pyi
index 841c0859b1b0a412724b789373f3f18b1388b4d8..e8113dbae780263de1bd99ae841df16a4646d761 100644
--- a/numpy/polynomial/chebyshev.pyi
+++ b/numpy/polynomial/chebyshev.pyi
@@ -1,10 +1,10 @@
-from typing import Any, List
+from typing import Any
  
  from numpy import ndarray, dtype, int_
  from numpy.polynomial._polybase import ABCPolyBase
  from numpy.polynomial.polyutils import trimcoef
  
-__all__: List[str]
+__all__: list[str]
  
  chebtrim = trimcoef
  
diff --git a/numpy/polynomial/hermite.py b/numpy/polynomial/hermite.py
index 9b0735a9aad34ed14877e95bb94703d45e12e70d..e203391218846c0db6697317a5cafa4a5f260268 100644
--- a/numpy/polynomial/hermite.py
+++ b/numpy/polynomial/hermite.py
@@ -827,7 +827,7 @@ def hermval(x, c, tensor=True):
          If `x` is a list or tuple, it is converted to an ndarray, otherwise
          it is left unchanged and treated as a scalar. In either case, `x`
          or its elements must support addition and multiplication with
-        with themselves and with the elements of `c`.
+        themselves and with the elements of `c`.
      c : array_like
          Array of coefficients ordered so that the coefficients for terms of
          degree n are contained in c[n]. If `c` is multidimensional the
diff --git a/numpy/polynomial/hermite.pyi b/numpy/polynomial/hermite.pyi
index 8364a5b0fcbc64f5363f39e89ea1907d35150784..0d3556d696410689b4614138ad4cf1f6c2283a9c 100644
--- a/numpy/polynomial/hermite.pyi
+++ b/numpy/polynomial/hermite.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List
+from typing import Any
  
  from numpy import ndarray, dtype, int_, float_
  from numpy.polynomial._polybase import ABCPolyBase
diff --git a/numpy/polynomial/hermite_e.pyi b/numpy/polynomial/hermite_e.pyi
index c029bfda7788a63a5f361fa3ba9414326de91a84..0b7152a253b654da2c069711a1bfdbd4e084cf6f 100644
--- a/numpy/polynomial/hermite_e.pyi
+++ b/numpy/polynomial/hermite_e.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List
+from typing import Any
  
  from numpy import ndarray, dtype, int_
  from numpy.polynomial._polybase import ABCPolyBase
diff --git a/numpy/polynomial/laguerre.py b/numpy/polynomial/laguerre.py
index d9ca373ddd5cdc3800302898ecb8e26515700850..5d058828d99d6b86e1d716aeffb79f7390882243 100644
--- a/numpy/polynomial/laguerre.py
+++ b/numpy/polynomial/laguerre.py
@@ -826,7 +826,7 @@ def lagval(x, c, tensor=True):
          If `x` is a list or tuple, it is converted to an ndarray, otherwise
          it is left unchanged and treated as a scalar. In either case, `x`
          or its elements must support addition and multiplication with
-        with themselves and with the elements of `c`.
+        themselves and with the elements of `c`.
      c : array_like
          Array of coefficients ordered so that the coefficients for terms of
          degree n are contained in c[n]. If `c` is multidimensional the
diff --git a/numpy/polynomial/laguerre.pyi b/numpy/polynomial/laguerre.pyi
index 2b9ab34e0afacda4c632d678dc0fdd859329c738..e546bc20a54c0e522cd7ea851ad8e8a42d895980 100644
--- a/numpy/polynomial/laguerre.pyi
+++ b/numpy/polynomial/laguerre.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List
+from typing import Any
  
  from numpy import ndarray, dtype, int_
  from numpy.polynomial._polybase import ABCPolyBase
diff --git a/numpy/polynomial/legendre.py b/numpy/polynomial/legendre.py
index 2e8052e7c00767182aaae48a7961e6b00202210a..028e2fe7b36644c32045b116ff4d1bca1253dbdf 100644
--- a/numpy/polynomial/legendre.py
+++ b/numpy/polynomial/legendre.py
@@ -185,7 +185,7 @@ def leg2poly(c):
      >>> p = c.convert(kind=P.Polynomial)
      >>> p
      Polynomial([-1. , -3.5,  3. ,  7.5], domain=[-1.,  1.], window=[-1.,  1.])
-    >>> P.leg2poly(range(4))
+    >>> P.legendre.leg2poly(range(4))
      array([-1. , -3.5,  3. ,  7.5])
  
  
@@ -857,7 +857,7 @@ def legval(x, c, tensor=True):
          If `x` is a list or tuple, it is converted to an ndarray, otherwise
          it is left unchanged and treated as a scalar. In either case, `x`
          or its elements must support addition and multiplication with
-        with themselves and with the elements of `c`.
+        themselves and with the elements of `c`.
      c : array_like
          Array of coefficients ordered so that the coefficients for terms of
          degree n are contained in c[n]. If `c` is multidimensional the
diff --git a/numpy/polynomial/legendre.pyi b/numpy/polynomial/legendre.pyi
index 86aef179304e731c5edd363d1ea427209b0d764e..63a1c3f3a1f89c2c2da61e385f7dba1e7be16c06 100644
--- a/numpy/polynomial/legendre.pyi
+++ b/numpy/polynomial/legendre.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List
+from typing import Any
  
  from numpy import ndarray, dtype, int_
  from numpy.polynomial._polybase import ABCPolyBase
diff --git a/numpy/polynomial/polynomial.py b/numpy/polynomial/polynomial.py
index 3c2663b6cc9587fd9551f636cf3a40396be06058..b4741355f2364b2ececb57a8dab085c17967ee36 100644
--- a/numpy/polynomial/polynomial.py
+++ b/numpy/polynomial/polynomial.py
@@ -772,7 +772,7 @@ def polyvalfromroots(x, r, tensor=True):
  
      If `r` is a 1-D array, then `p(x)` will have the same shape as `x`.  If `r`
      is multidimensional, then the shape of the result depends on the value of
-    `tensor`. If `tensor is ``True`` the shape will be r.shape[1:] + x.shape;
+    `tensor`. If `tensor` is ``True`` the shape will be r.shape[1:] + x.shape;
      that is, each polynomial is evaluated at every value of `x`. If `tensor` is
      ``False``, the shape will be r.shape[1:]; that is, each polynomial is
      evaluated only for the corresponding broadcast value of `x`. Note that
diff --git a/numpy/polynomial/polynomial.pyi b/numpy/polynomial/polynomial.pyi
index f779300a9c5a47d327dde95d0ebb06a54579b53b..3c87f9d2926615e09bffd03d00306b6f235ec1c2 100644
--- a/numpy/polynomial/polynomial.pyi
+++ b/numpy/polynomial/polynomial.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List
+from typing import Any
  
  from numpy import ndarray, dtype, int_
  from numpy.polynomial._polybase import ABCPolyBase
diff --git a/numpy/polynomial/polyutils.pyi b/numpy/polynomial/polyutils.pyi
index 52c9cfc4a6074a3117bda5c5dbbd26bea94453f4..06260a9f1fa9f449fa3bc4da267cd5b5785abe12 100644
--- a/numpy/polynomial/polyutils.pyi
+++ b/numpy/polynomial/polyutils.pyi
@@ -1,6 +1,4 @@
-from typing import List
-
-__all__: List[str]
+__all__: list[str]
  
  class RankWarning(UserWarning): ...
  
diff --git a/numpy/random/__init__.pyi b/numpy/random/__init__.pyi
index bf6147697b2d286a55c2236d3e747d797850579a..32bd64a0b01d21d2c047af8ec87bef522278cec5 100644
--- a/numpy/random/__init__.pyi
+++ b/numpy/random/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import List
-
  from numpy._pytesttester import PytestTester
  
  from numpy.random._generator import Generator as Generator
@@ -67,6 +65,6 @@ from numpy.random.mtrand import (
      zipf as zipf,
  )
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
diff --git a/numpy/random/_common.pxd b/numpy/random/_common.pxd

index 9f2e8c3ca117e9179a2b371fbbe995f746eefecb..3625634cd4e33ee5892c90321b52d7b461b51e8b 100644 (file)
--- a/numpy/random/_common.pxd
+++ b/numpy/random/_common.pxd
@@ -45,7 +45,7 @@ ctypedef double (*random_double_1)(void *state, double a) nogil
  ctypedef double (*random_double_2)(void *state, double a, double b) nogil
  ctypedef double (*random_double_3)(void *state, double a, double b, double c) nogil
  
-ctypedef double (*random_float_fill)(bitgen_t *state, np.npy_intp count, float* out) nogil
+ctypedef void (*random_float_fill)(bitgen_t *state, np.npy_intp count, float* out) nogil
  ctypedef float (*random_float_0)(bitgen_t *state) nogil
  ctypedef float (*random_float_1)(bitgen_t *state, float a) nogil
  
diff --git a/numpy/random/_common.pyx b/numpy/random/_common.pyx

index ffc821e4485e5b12fc9c7acaed789c1a8d3bf13c..607034a385e2944d675551d5b67a4e37a15b6e1a 100644 (file)
--- a/numpy/random/_common.pyx
+++ b/numpy/random/_common.pyx
@@ -65,7 +65,7 @@ cdef object random_raw(bitgen_t *bitgen, object lock, object size, object output
  
      Notes
      -----
-    This method directly exposes the the raw underlying pseudo-random
+    This method directly exposes the raw underlying pseudo-random
      number generator. All values are returned as unsigned 64-bit
      values irrespective of the number of bits produced by the PRNG.
  
@@ -172,8 +172,23 @@ cdef object prepare_ctypes(bitgen_t *bitgen):
      return _ctypes
  
  cdef double kahan_sum(double *darr, np.npy_intp n):
+    """
+    Parameters
+    ----------
+    darr : reference to double array
+        Address of values to sum
+    n : intp
+        Length of darr
+    
+    Returns
+    -------
+    float
+        The sum. 0.0 if n <= 0.
+    """
      cdef double c, y, t, sum
      cdef np.npy_intp i
+    if n <= 0:
+        return 0.0
      sum = darr[0]
      c = 0.0
      for i in range(1, n):
@@ -400,10 +415,10 @@ cdef int check_array_constraint(np.ndarray val, object name, constraint_type con
  cdef int check_constraint(double val, object name, constraint_type cons) except -1:
      cdef bint is_nan
      if cons == CONS_NON_NEGATIVE:
-        if not np.isnan(val) and np.signbit(val):
+        if not npmath.isnan(val) and npmath.signbit(val):
              raise ValueError(name + " < 0")
      elif cons == CONS_POSITIVE or cons == CONS_POSITIVE_NOT_NAN:
-        if cons == CONS_POSITIVE_NOT_NAN and np.isnan(val):
+        if cons == CONS_POSITIVE_NOT_NAN and npmath.isnan(val):
              raise ValueError(name + " must not be NaN")
          elif val <= 0:
              raise ValueError(name + " <= 0")
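The `kahan_sum` hunk above adds an early return for `n <= 0`, so the function no longer reads `darr[0]` from an empty buffer. The following is a minimal pure-Python sketch of the same idea, not the Cython implementation. The loop body is not visible in the hunk and is assumed here to be textbook compensated (Kahan) summation. The input is a plain Python sequence rather than a `double *`.

```python
import math


def kahan_sum(values):
    """Compensated summation of a sequence of floats.

    Returns 0.0 for an empty sequence, mirroring the guard added in the hunk.
    """
    n = len(values)
    if n <= 0:
        # Guard: without this, values[0] below would fail on empty input.
        return 0.0
    total = values[0]
    c = 0.0  # running compensation for lost low-order bits
    for i in range(1, n):
        y = values[i] - c        # apply the correction carried from the last step
        t = total + y            # low-order bits of y may be lost here
        c = (t - total) - y      # recover what was lost (algebraically zero)
        total = t
    return total


# Compare against math.fsum, which returns a correctly rounded sum.
sample = [0.1] * 10
difference = abs(kahan_sum(sample) - math.fsum(sample))
```

Carrying the compensation term `c` keeps the accumulated rounding error nearly independent of the number of terms, which matters when a sum of probabilities is compared against 1.0.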
diff --git a/numpy/random/_examples/cython/extending.pyx b/numpy/random/_examples/cython/extending.pyx

index 3a7f81aa0466cf4144d1308d3cf266c573741023..30efd7447748c1747e11bd4a053d0e01911fa2e7 100644 (file)
--- a/numpy/random/_examples/cython/extending.pyx
+++ b/numpy/random/_examples/cython/extending.pyx
@@ -31,7 +31,7 @@ def uniform_mean(Py_ssize_t n):
      random_values = np.empty(n)
      # Best practice is to acquire the lock whenever generating random values.
      # This prevents other threads from modifying the state. Acquiring the lock
-    # is only necessary if if the GIL is also released, as in this example.
+    # is only necessary if the GIL is also released, as in this example.
      with x.lock, nogil:
          for i in range(n):
              random_values[i] = rng.next_double(rng.state)
diff --git a/numpy/random/_generator.pyi b/numpy/random/_generator.pyi

index c574bef9a5cbdc51aa0295b6a8c03c906c184d77..f0d814fef798ec55f6036a05da66e476ec4e67e2 100644 (file)
--- a/numpy/random/_generator.pyi
+++ b/numpy/random/_generator.pyi
@@ -1,4 +1,5 @@
-from typing import Any, Callable, Dict, Optional, Tuple, Type, Union, overload, TypeVar, Literal
+from collections.abc import Callable
+from typing import Any, Union, overload, TypeVar, Literal
  
  from numpy import (
      bool_,
@@ -18,7 +19,7 @@ from numpy import (
      uint64,
  )
  from numpy.random import BitGenerator, SeedSequence
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      _ArrayLikeFloat_co,
      _ArrayLikeInt_co,
@@ -48,7 +49,7 @@ _ArrayType = TypeVar("_ArrayType", bound=ndarray[Any, Any])
  _DTypeLikeFloat32 = Union[
      dtype[float32],
      _SupportsDType[dtype[float32]],
-    Type[float32],
+    type[float32],
      _Float32Codes,
      _SingleCodes,
  ]
@@ -56,8 +57,8 @@ _DTypeLikeFloat32 = Union[
  _DTypeLikeFloat64 = Union[
      dtype[float64],
      _SupportsDType[dtype[float64]],
-    Type[float],
-    Type[float64],
+    type[float],
+    type[float64],
      _Float64Codes,
      _DoubleCodes,
  ]
@@ -66,9 +67,9 @@ class Generator:
      def __init__(self, bit_generator: BitGenerator) -> None: ...
      def __repr__(self) -> str: ...
      def __str__(self) -> str: ...
-    def __getstate__(self) -> Dict[str, Any]: ...
-    def __setstate__(self, state: Dict[str, Any]) -> None: ...
-    def __reduce__(self) -> Tuple[Callable[[str], Generator], Tuple[str], Dict[str, Any]]: ...
+    def __getstate__(self) -> dict[str, Any]: ...
+    def __setstate__(self, state: dict[str, Any]) -> None: ...
+    def __reduce__(self) -> tuple[Callable[[str], Generator], tuple[str], dict[str, Any]]: ...
      @property
      def bit_generator(self) -> BitGenerator: ...
      def bytes(self, length: int) -> bytes: ...
@@ -76,7 +77,7 @@ class Generator:
      def standard_normal(  # type: ignore[misc]
          self,
          size: None = ...,
-        dtype: Union[_DTypeLikeFloat32, _DTypeLikeFloat64] = ...,
+        dtype: _DTypeLikeFloat32 | _DTypeLikeFloat64 = ...,
          out: None = ...,
      ) -> float: ...
      @overload
@@ -95,14 +96,14 @@ class Generator:
          self,
          size: _ShapeLike = ...,
          dtype: _DTypeLikeFloat32 = ...,
-        out: Optional[ndarray[Any, dtype[float32]]] = ...,
+        out: None | ndarray[Any, dtype[float32]] = ...,
      ) -> ndarray[Any, dtype[float32]]: ...
      @overload
      def standard_normal(  # type: ignore[misc]
          self,
          size: _ShapeLike = ...,
          dtype: _DTypeLikeFloat64 = ...,
-        out: Optional[ndarray[Any, dtype[float64]]] = ...,
+        out: None | ndarray[Any, dtype[float64]] = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def permutation(self, x: int, axis: int = ...) -> ndarray[Any, dtype[int64]]: ...
@@ -112,7 +113,7 @@ class Generator:
      def standard_exponential(  # type: ignore[misc]
          self,
          size: None = ...,
-        dtype: Union[_DTypeLikeFloat32, _DTypeLikeFloat64] = ...,
+        dtype: _DTypeLikeFloat32 | _DTypeLikeFloat64 = ...,
          method: Literal["zig", "inv"] = ...,
          out: None = ...,
      ) -> float: ...
@@ -133,7 +134,7 @@ class Generator:
          size: _ShapeLike = ...,
          *,
          method: Literal["zig", "inv"] = ...,
-        out: Optional[ndarray[Any, dtype[float64]]] = ...,
+        out: None | ndarray[Any, dtype[float64]] = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_exponential(
@@ -141,7 +142,7 @@ class Generator:
          size: _ShapeLike = ...,
          dtype: _DTypeLikeFloat32 = ...,
          method: Literal["zig", "inv"] = ...,
-        out: Optional[ndarray[Any, dtype[float32]]] = ...,
+        out: None | ndarray[Any, dtype[float32]] = ...,
      ) -> ndarray[Any, dtype[float32]]: ...
      @overload
      def standard_exponential(
@@ -149,13 +150,13 @@ class Generator:
          size: _ShapeLike = ...,
          dtype: _DTypeLikeFloat64 = ...,
          method: Literal["zig", "inv"] = ...,
-        out: Optional[ndarray[Any, dtype[float64]]] = ...,
+        out: None | ndarray[Any, dtype[float64]] = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def random(  # type: ignore[misc]
          self,
          size: None = ...,
-        dtype: Union[_DTypeLikeFloat32, _DTypeLikeFloat64] = ...,
+        dtype: _DTypeLikeFloat32 | _DTypeLikeFloat64 = ...,
          out: None = ...,
      ) -> float: ...
      @overload
@@ -169,45 +170,45 @@ class Generator:
          self,
          size: _ShapeLike = ...,
          *,
-        out: Optional[ndarray[Any, dtype[float64]]] = ...,
+        out: None | ndarray[Any, dtype[float64]] = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def random(
          self,
          size: _ShapeLike = ...,
          dtype: _DTypeLikeFloat32 = ...,
-        out: Optional[ndarray[Any, dtype[float32]]] = ...,
+        out: None | ndarray[Any, dtype[float32]] = ...,
      ) -> ndarray[Any, dtype[float32]]: ...
      @overload
      def random(
          self,
          size: _ShapeLike = ...,
          dtype: _DTypeLikeFloat64 = ...,
-        out: Optional[ndarray[Any, dtype[float64]]] = ...,
+        out: None | ndarray[Any, dtype[float64]] = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def beta(self, a: float, b: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def beta(
-        self, a: _ArrayLikeFloat_co, b: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, b: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def exponential(self, scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def exponential(
-        self, scale: _ArrayLikeFloat_co = ..., size: Optional[_ShapeLike] = ...
+        self, scale: _ArrayLikeFloat_co = ..., size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: int,
-        high: Optional[int] = ...,
+        high: None | int = ...,
      ) -> int: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: int,
-        high: Optional[int] = ...,
+        high: None | int = ...,
          size: None = ...,
          dtype: _DTypeLikeBool = ...,
          endpoint: bool = ...,
@@ -216,24 +217,24 @@ class Generator:
      def integers(  # type: ignore[misc]
          self,
          low: int,
-        high: Optional[int] = ...,
+        high: None | int = ...,
          size: None = ...,
-        dtype: Union[_DTypeLikeInt, _DTypeLikeUInt] = ...,
+        dtype: _DTypeLikeInt | _DTypeLikeUInt = ...,
          endpoint: bool = ...,
      ) -> int: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
          dtype: _DTypeLikeBool = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[bool_]]: ...
@@ -241,110 +242,100 @@ class Generator:
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[int8], Type[int8], _Int8Codes, _SupportsDType[dtype[int8]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int8] | type[int8] | _Int8Codes | _SupportsDType[dtype[int8]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[int8]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[int16], Type[int16], _Int16Codes, _SupportsDType[dtype[int16]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int16] | type[int16] | _Int16Codes | _SupportsDType[dtype[int16]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[int16]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[int32], Type[int32], _Int32Codes, _SupportsDType[dtype[int32]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int32] | type[int32] | _Int32Codes | _SupportsDType[dtype[int32]] = ...,
          endpoint: bool = ...,
-    ) -> ndarray[Any, dtype[Union[int32]]]: ...
+    ) -> ndarray[Any, dtype[int32]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Optional[
-            Union[dtype[int64], Type[int64], _Int64Codes, _SupportsDType[dtype[int64]]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: None | dtype[int64] | type[int64] | _Int64Codes | _SupportsDType[dtype[int64]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[uint8], Type[uint8], _UInt8Codes, _SupportsDType[dtype[uint8]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint8] | type[uint8] | _UInt8Codes | _SupportsDType[dtype[uint8]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[uint8]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[uint16], Type[uint16], _UInt16Codes, _SupportsDType[dtype[uint16]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint16] | type[uint16] | _UInt16Codes | _SupportsDType[dtype[uint16]] = ...,
          endpoint: bool = ...,
-    ) -> ndarray[Any, dtype[Union[uint16]]]: ...
+    ) -> ndarray[Any, dtype[uint16]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[uint32], Type[uint32], _UInt32Codes, _SupportsDType[dtype[uint32]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint32] | type[uint32] | _UInt32Codes | _SupportsDType[dtype[uint32]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[uint32]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[uint64], Type[uint64], _UInt64Codes, _SupportsDType[dtype[uint64]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint64] | type[uint64] | _UInt64Codes | _SupportsDType[dtype[uint64]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[uint64]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[int_], Type[int], Type[int_], _IntCodes, _SupportsDType[dtype[int_]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int_] | type[int] | type[int_] | _IntCodes | _SupportsDType[dtype[int_]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def integers(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[uint], Type[uint], _UIntCodes, _SupportsDType[dtype[uint]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint] | type[uint] | _UIntCodes | _SupportsDType[dtype[uint]] = ...,
          endpoint: bool = ...,
      ) -> ndarray[Any, dtype[uint]]: ...
-    # TODO: Use a TypeVar _T here to get away from Any output?  Should be int->ndarray[Any,dtype[int64]], ArrayLike[_T] -> Union[_T, ndarray[Any,Any]]
+    # TODO: Use a TypeVar _T here to get away from Any output?  Should be int->ndarray[Any,dtype[int64]], ArrayLike[_T] -> _T | ndarray[Any,Any]
      @overload
      def choice(
          self,
          a: int,
          size: None = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
          axis: int = ...,
          shuffle: bool = ...,
      ) -> int: ...
@@ -354,7 +345,7 @@ class Generator:
          a: int,
          size: _ShapeLike = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
          axis: int = ...,
          shuffle: bool = ...,
      ) -> ndarray[Any, dtype[int64]]: ...
@@ -364,7 +355,7 @@ class Generator:
          a: ArrayLike,
          size: None = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
          axis: int = ...,
          shuffle: bool = ...,
      ) -> Any: ...
@@ -374,7 +365,7 @@ class Generator:
          a: ArrayLike,
          size: _ShapeLike = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
          axis: int = ...,
          shuffle: bool = ...,
      ) -> ndarray[Any, Any]: ...
@@ -385,7 +376,7 @@ class Generator:
          self,
          low: _ArrayLikeFloat_co = ...,
          high: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def normal(self, loc: float = ..., scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -394,21 +385,21 @@ class Generator:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_gamma(  # type: ignore[misc]
          self,
          shape: float,
          size: None = ...,
-        dtype: Union[_DTypeLikeFloat32, _DTypeLikeFloat64] = ...,
+        dtype: _DTypeLikeFloat32 | _DTypeLikeFloat64 = ...,
          out: None = ...,
      ) -> float: ...
      @overload
      def standard_gamma(
          self,
          shape: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_gamma(
@@ -421,17 +412,17 @@ class Generator:
      def standard_gamma(
          self,
          shape: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
          dtype: _DTypeLikeFloat32 = ...,
-        out: Optional[ndarray[Any, dtype[float32]]] = ...,
+        out: None | ndarray[Any, dtype[float32]] = ...,
      ) -> ndarray[Any, dtype[float32]]: ...
      @overload
      def standard_gamma(
          self,
          shape: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
          dtype: _DTypeLikeFloat64 = ...,
-        out: Optional[ndarray[Any, dtype[float64]]] = ...,
+        out: None | ndarray[Any, dtype[float64]] = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def gamma(self, shape: float, scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -440,13 +431,13 @@ class Generator:
          self,
          shape: _ArrayLikeFloat_co,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def f(self, dfnum: float, dfden: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def f(
-        self, dfnum: _ArrayLikeFloat_co, dfden: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, dfnum: _ArrayLikeFloat_co, dfden: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def noncentral_f(self, dfnum: float, dfden: float, nonc: float, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -456,19 +447,19 @@ class Generator:
          dfnum: _ArrayLikeFloat_co,
          dfden: _ArrayLikeFloat_co,
          nonc: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def chisquare(self, df: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def chisquare(
-        self, df: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, df: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def noncentral_chisquare(self, df: float, nonc: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def noncentral_chisquare(
-        self, df: _ArrayLikeFloat_co, nonc: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, df: _ArrayLikeFloat_co, nonc: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_t(self, df: float, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -484,25 +475,25 @@ class Generator:
      def vonmises(self, mu: float, kappa: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def vonmises(
-        self, mu: _ArrayLikeFloat_co, kappa: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, mu: _ArrayLikeFloat_co, kappa: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def pareto(self, a: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def pareto(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def weibull(self, a: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def weibull(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def power(self, a: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def power(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_cauchy(self, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -515,7 +506,7 @@ class Generator:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def gumbel(self, loc: float = ..., scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -524,7 +515,7 @@ class Generator:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def logistic(self, loc: float = ..., scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -533,7 +524,7 @@ class Generator:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def lognormal(self, mean: float = ..., sigma: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -542,19 +533,19 @@ class Generator:
          self,
          mean: _ArrayLikeFloat_co = ...,
          sigma: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def rayleigh(self, scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def rayleigh(
-        self, scale: _ArrayLikeFloat_co = ..., size: Optional[_ShapeLike] = ...
+        self, scale: _ArrayLikeFloat_co = ..., size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def wald(self, mean: float, scale: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def wald(
-        self, mean: _ArrayLikeFloat_co, scale: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, mean: _ArrayLikeFloat_co, scale: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def triangular(self, left: float, mode: float, right: float, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -564,37 +555,37 @@ class Generator:
          left: _ArrayLikeFloat_co,
          mode: _ArrayLikeFloat_co,
          right: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def binomial(self, n: int, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def binomial(
-        self, n: _ArrayLikeInt_co, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, n: _ArrayLikeInt_co, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def negative_binomial(self, n: float, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def negative_binomial(
-        self, n: _ArrayLikeFloat_co, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, n: _ArrayLikeFloat_co, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def poisson(self, lam: float = ..., size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def poisson(
-        self, lam: _ArrayLikeFloat_co = ..., size: Optional[_ShapeLike] = ...
+        self, lam: _ArrayLikeFloat_co = ..., size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def zipf(self, a: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def zipf(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def geometric(self, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def geometric(
-        self, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def hypergeometric(self, ngood: int, nbad: int, nsample: int, size: None = ...) -> int: ...  # type: ignore[misc]
@@ -604,19 +595,19 @@ class Generator:
          ngood: _ArrayLikeInt_co,
          nbad: _ArrayLikeInt_co,
          nsample: _ArrayLikeInt_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def logseries(self, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def logseries(
-        self, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      def multivariate_normal(
          self,
          mean: _ArrayLikeFloat_co,
          cov: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
          check_valid: Literal["warn", "raise", "ignore"] = ...,
          tol: float = ...,
          *,
@@ -625,23 +616,23 @@ class Generator:
      def multinomial(
          self, n: _ArrayLikeInt_co,
              pvals: _ArrayLikeFloat_co,
-            size: Optional[_ShapeLike] = ...
+            size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int64]]: ...
      def multivariate_hypergeometric(
          self,
          colors: _ArrayLikeInt_co,
          nsample: int,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
          method: Literal["marginals", "count"] = ...,
      ) -> ndarray[Any, dtype[int64]]: ...
      def dirichlet(
-        self, alpha: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, alpha: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      def permuted(
-        self, x: ArrayLike, *, axis: Optional[int] = ..., out: Optional[ndarray[Any, Any]] = ...
+        self, x: ArrayLike, *, axis: None | int = ..., out: None | ndarray[Any, Any] = ...
      ) -> ndarray[Any, Any]: ...
      def shuffle(self, x: ArrayLike, axis: int = ...) -> None: ...
  
  def default_rng(
-    seed: Union[None, _ArrayLikeInt_co, SeedSequence, BitGenerator, Generator] = ...
+    seed: None | _ArrayLikeInt_co | SeedSequence | BitGenerator | Generator = ...
  ) -> Generator: ...
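The stub changes above replace `typing.List`, `Dict`, `Tuple` and `Type` with the builtin generics (PEP 585), and `Optional[X]` / `Union[X, Y]` with `None | X` / `X | Y` (PEP 604). The sketch below checks at runtime that the old and new spellings describe the same types. It is an illustration only: the helper `same_shape` and the `PAIRS` list are hypothetical names, and the `X | Y` form is not evaluated because it needs Python 3.10 or newer at runtime (in `.pyi` files it is never executed).

```python
from typing import Any, Dict, List, Optional, Tuple, Type, Union, get_args, get_origin

# Pairs of (legacy typing alias, builtin generic) in the style used by the stubs.
PAIRS = [
    (List[str], list[str]),
    (Dict[str, Any], dict[str, Any]),
    (Tuple[str], tuple[str]),
    (Type[float], type[float]),
]


def same_shape(old, new):
    """Return True if both annotations share the same origin and arguments."""
    return get_origin(old) is get_origin(new) and get_args(old) == get_args(new)


all_equivalent = all(same_shape(old, new) for old, new in PAIRS)

# Optional[X] is only shorthand for Union[X, None]; argument order is irrelevant.
optional_is_union = Optional[int] == Union[None, int]
```

Because the builtin forms need no imports from `typing`, the `from typing import List` lines can be dropped from the stubs, which is what several hunks in this diff do.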
diff --git a/numpy/random/_generator.pyx b/numpy/random/_generator.pyx

index 391987a1ecd34b15b38a9d38b3b20f9e1e07b140..5218c6d0eb231f341304506f963de15acce4c763 100644 (file)
--- a/numpy/random/_generator.pyx
+++ b/numpy/random/_generator.pyx
@@ -15,6 +15,7 @@ from numpy.core.multiarray import normalize_axis_index
  
  from .c_distributions cimport *
  from libc cimport string
+from libc.math cimport sqrt
  from libc.stdint cimport (uint8_t, uint16_t, uint32_t, uint64_t,
                            int32_t, int64_t, INT64_MAX, SIZE_MAX)
  from ._bounded_integers cimport (_rand_bool, _rand_int32, _rand_int64,
@@ -32,6 +33,8 @@ from ._common cimport (POISSON_LAM_MAX, CONS_POSITIVE, CONS_NONE,
  
  cdef extern from "numpy/arrayobject.h":
      int PyArray_ResolveWritebackIfCopy(np.ndarray)
+    int PyArray_FailUnlessWriteable(np.PyArrayObject *obj,
+                                    const char *name) except -1
      object PyArray_FromArray(np.PyArrayObject *, np.PyArray_Descr *, int)
  
      enum:
@@ -2995,6 +2998,22 @@ cdef class Generator:
          then the probability distribution of the number of non-"1"s that
          appear before the third "1" is a negative binomial distribution.
  
+        Because this method internally calls ``Generator.poisson`` with an
+        intermediate random value, a ValueError is raised when the choice of
+        :math:`n` and :math:`p` would result in the mean + 10 sigma of the sampled
+        intermediate distribution exceeding the max acceptable value of the
+        ``Generator.poisson`` method. This happens when :math:`p` is too low
+        (a lot of failures happen for every success) and :math:`n` is too big
+        (a lot of successes are allowed).
+        Therefore, the :math:`n` and :math:`p` values must satisfy the constraint:
+
+        .. math:: n\\frac{1-p}{p}+10\\sqrt{n}\\frac{1-p}{p}<2^{63}-1-10\\sqrt{2^{63}-1},
+
+        where the left side of the inequality is the derived mean + 10 sigma of
+        a sample from the gamma distribution internally used as the ``lam``
+        parameter of a Poisson sample, and the right side of the inequality is
+        the constraint for the maximum value of ``lam`` in ``Generator.poisson``.
+
          References
          ----------
          .. [1] Weisstein, Eric W. "Negative Binomial Distribution." From
@@ -3019,9 +3038,41 @@ cdef class Generator:
          ...    print(i, "wells drilled, probability of one success =", probability)
  
          """
+
+        cdef bint is_scalar = True
+        cdef double *_dn
+        cdef double *_dp
+        cdef double _dmax_lam
+
+        p_arr = <np.ndarray>np.PyArray_FROM_OTF(p, np.NPY_DOUBLE, np.NPY_ALIGNED)
+        is_scalar = is_scalar and np.PyArray_NDIM(p_arr) == 0
+        n_arr = <np.ndarray>np.PyArray_FROM_OTF(n, np.NPY_DOUBLE, np.NPY_ALIGNED)
+        is_scalar = is_scalar and np.PyArray_NDIM(n_arr) == 0
+
+        if not is_scalar:
+            check_array_constraint(n_arr, 'n', CONS_POSITIVE_NOT_NAN)
+            check_array_constraint(p_arr, 'p', CONS_BOUNDED_GT_0_1)
+            # Check that the choice of negative_binomial parameters won't result in a
+            # call to the poisson distribution function with a value of lam too large.
+            max_lam_arr = (1 - p_arr) / p_arr * (n_arr + 10 * np.sqrt(n_arr))
+            if np.any(np.greater(max_lam_arr, POISSON_LAM_MAX)):
+                raise ValueError("n too large or p too small, see Generator.negative_binomial Notes")
+
+        else:
+            _dn = <double*>np.PyArray_DATA(n_arr)
+            _dp = <double*>np.PyArray_DATA(p_arr)
+
+            check_constraint(_dn[0], 'n', CONS_POSITIVE_NOT_NAN)
+            check_constraint(_dp[0], 'p', CONS_BOUNDED_GT_0_1)
+            # Check that the choice of negative_binomial parameters won't result in a
+            # call to the poisson distribution function with a value of lam too large.
+            _dmax_lam = (1 - _dp[0]) / _dp[0] * (_dn[0] + 10 * sqrt(_dn[0]))
+            if _dmax_lam > POISSON_LAM_MAX:
+                raise ValueError("n too large or p too small, see Generator.negative_binomial Notes")
+
          return disc(&random_negative_binomial, &self._bitgen, size, self.lock, 2, 0,
-                    n, 'n', CONS_POSITIVE_NOT_NAN,
-                    p, 'p', CONS_BOUNDED_GT_0_1,
+                    n_arr, 'n', CONS_NONE,
+                    p_arr, 'p', CONS_NONE,
                      0.0, '', CONS_NONE)
  
      def poisson(self, lam=1.0, size=None):
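The parameter check added to ``Generator.negative_binomial`` in the hunk above can be sketched in pure Python. This is only an illustration of the scalar branch, assuming ``POISSON_LAM_MAX`` equals the right-hand side of the docstring constraint, 2**63 - 1 - 10*sqrt(2**63 - 1). The helper name ``negative_binomial_params_ok`` is hypothetical and is not part of NumPy.

```python
import math

# Assumed value of POISSON_LAM_MAX, taken from the right-hand side of the
# docstring constraint: 2**63 - 1 - 10*sqrt(2**63 - 1).
INT64_MAX = 2**63 - 1
POISSON_LAM_MAX = INT64_MAX - 10 * math.sqrt(INT64_MAX)


def negative_binomial_params_ok(n: float, p: float) -> bool:
    """Hypothetical helper mirroring the scalar branch of the new check."""
    # n must be positive and not NaN; p must lie in (0, 1].
    if not (n > 0) or not (0 < p <= 1):
        raise ValueError("n must be positive and p must be in (0, 1]")
    # Mean + 10 sigma of the intermediate gamma sample that is passed to
    # the Poisson sampler as ``lam``.
    max_lam = (1 - p) / p * (n + 10 * math.sqrt(n))
    return max_lam <= POISSON_LAM_MAX


print(negative_binomial_params_ok(10, 0.5))     # small n, moderate p
print(negative_binomial_params_ok(1e18, 1e-3))  # huge n, tiny p
```

The patch raises ``ValueError`` where this sketch returns ``False``; returning a boolean here just makes the condition easy to inspect.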
@@ -3572,12 +3623,35 @@ cdef class Generator:
          >>> y.shape
          (3, 3, 2)
  
-        The following is probably true, given that 0.6 is roughly twice the
-        standard deviation:
+        Here we generate 800 samples from the bivariate normal distribution
+        with mean [0, 0] and covariance matrix [[6, -3], [-3, 3.5]].  The
+        expected variances of the first and second components of the sample
+        are 6 and 3.5, respectively, and the expected correlation
+        coefficient is -3/sqrt(6*3.5) ≈ -0.65465.
+
+        >>> cov = np.array([[6, -3], [-3, 3.5]])
+        >>> pts = rng.multivariate_normal([0, 0], cov, size=800)
+
+        Check that the mean, covariance, and correlation coefficient of the
+        sample are close to the expected values:
  
-        >>> list((x[0,0,:] - mean) < 0.6)
-        [True, True] # random
+        >>> pts.mean(axis=0)
+        array([ 0.0326911 , -0.01280782])  # may vary
+        >>> np.cov(pts.T)
+        array([[ 5.96202397, -2.85602287],
+               [-2.85602287,  3.47613949]])  # may vary
+        >>> np.corrcoef(pts.T)[0, 1]
+        -0.6273591314603949  # may vary
  
+        We can visualize this data with a scatter plot.  The orientation
+        of the point cloud illustrates the negative correlation of the
+        components of this sample.
+
+        >>> import matplotlib.pyplot as plt
+        >>> plt.plot(pts[:, 0], pts[:, 1], '.', alpha=0.5)
+        >>> plt.axis('equal')
+        >>> plt.grid()
+        >>> plt.show()
          """
          if method not in {'eigh', 'svd', 'cholesky'}:
              raise ValueError(
@@ -4355,6 +4429,12 @@ cdef class Generator:
          --------
          shuffle
          permutation
+        
+        Notes
+        -----
+        An important distinction between the methods ``shuffle`` and ``permuted``
+        is how they treat the ``axis`` parameter. This is described in
+        :ref:`generator-handling-axis-parameter`.
  
          Examples
          --------
@@ -4416,6 +4496,7 @@ cdef class Generator:
          else:
              if type(out) is not np.ndarray:
                  raise TypeError('out must be a numpy array')
+            PyArray_FailUnlessWriteable(<np.PyArrayObject *>out, "out")
              if out.shape != x.shape:
                  raise ValueError('out must have the same shape as x')
              np.copyto(out, x, casting='safe')
@@ -4496,15 +4577,32 @@ cdef class Generator:
          -------
          None
  
+        See Also
+        --------
+        permuted
+        permutation
+
+        Notes
+        -----
+        An important distinction between the methods ``shuffle`` and ``permuted``
+        is how they treat the ``axis`` parameter. This is described in
+        :ref:`generator-handling-axis-parameter`.
+
          Examples
          --------
          >>> rng = np.random.default_rng()
          >>> arr = np.arange(10)
+        >>> arr
+        array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
          >>> rng.shuffle(arr)
          >>> arr
-        [1 7 5 2 9 4 3 6 0 8] # random
+        array([2, 0, 7, 5, 1, 4, 8, 9, 3, 6]) # random
  
          >>> arr = np.arange(9).reshape((3, 3))
+        >>> arr
+        array([[0, 1, 2],
+               [3, 4, 5],
+               [6, 7, 8]])
          >>> rng.shuffle(arr)
          >>> arr
          array([[3, 4, 5], # random
@@ -4512,6 +4610,10 @@ cdef class Generator:
                 [0, 1, 2]])
  
          >>> arr = np.arange(9).reshape((3, 3))
+        >>> arr
+        array([[0, 1, 2],
+               [3, 4, 5],
+               [6, 7, 8]])
          >>> rng.shuffle(arr, axis=1)
          >>> arr
          array([[2, 0, 1], # random
@@ -4524,6 +4626,8 @@ cdef class Generator:
              char* buf_ptr
  
          if isinstance(x, np.ndarray):
+            if not x.flags.writeable:
+                raise ValueError('array is read-only')
              # Only call ndim on ndarrays, see GH 18142
              axis = normalize_axis_index(axis, np.ndim(x))
  
diff --git a/numpy/random/_mt19937.pyi b/numpy/random/_mt19937.pyi
index 820f27392f0f10dea80fe5d2f529f3a2da5f6fcb..55cfb2db42b17a80a1e15b6f8eabee1ccbdee282 100644
--- a/numpy/random/_mt19937.pyi
+++ b/numpy/random/_mt19937.pyi
@@ -1,8 +1,8 @@
-from typing import Any, Union, TypedDict
+from typing import Any, TypedDict
  
  from numpy import dtype, ndarray, uint32
  from numpy.random.bit_generator import BitGenerator, SeedSequence
-from numpy.typing import _ArrayLikeInt_co
+from numpy._typing import _ArrayLikeInt_co
  
  class _MT19937Internal(TypedDict):
      key: ndarray[Any, dtype[uint32]]
@@ -13,7 +13,7 @@ class _MT19937State(TypedDict):
      state: _MT19937Internal
  
  class MT19937(BitGenerator):
-    def __init__(self, seed: Union[None, _ArrayLikeInt_co, SeedSequence] = ...) -> None: ...
+    def __init__(self, seed: None | _ArrayLikeInt_co | SeedSequence = ...) -> None: ...
      def _legacy_seeding(self, seed: _ArrayLikeInt_co) -> None: ...
      def jumped(self, jumps: int = ...) -> MT19937: ...
      @property
diff --git a/numpy/random/_mt19937.pyx b/numpy/random/_mt19937.pyx
index 16a377cc63b9105bed93a1ee1450f58c7b19b271..5a8d52e6bde842a003fd3ee808dbc391abb7a0df 100644
--- a/numpy/random/_mt19937.pyx
+++ b/numpy/random/_mt19937.pyx
@@ -109,7 +109,7 @@ cdef class MT19937(BitGenerator):
  
      **Compatibility Guarantee**
  
-    ``MT19937`` makes a guarantee that a fixed seed and will always produce
+    ``MT19937`` makes a guarantee that a fixed seed will always produce
      the same random integer stream.
  
      References
@@ -214,7 +214,7 @@ cdef class MT19937(BitGenerator):
  
          Returns a new bit generator with the state jumped
  
-        The state of the returned big generator is jumped as-if
+        The state of the returned bit generator is jumped as-if
          2**(128 * jumps) random numbers have been generated.
  
          Parameters
diff --git a/numpy/random/_pcg64.pyi b/numpy/random/_pcg64.pyi
index 4881a987e2a79e90a79653e95eedd55d424b3db8..470aee867493b48817670f7c4ff7b24d8be31f26 100644
--- a/numpy/random/_pcg64.pyi
+++ b/numpy/random/_pcg64.pyi
@@ -1,7 +1,7 @@
-from typing import Union, TypedDict
+from typing import TypedDict
  
  from numpy.random.bit_generator import BitGenerator, SeedSequence
-from numpy.typing import _ArrayLikeInt_co
+from numpy._typing import _ArrayLikeInt_co
  
  class _PCG64Internal(TypedDict):
      state: int
@@ -14,7 +14,7 @@ class _PCG64State(TypedDict):
      uinteger: int
  
  class PCG64(BitGenerator):
-    def __init__(self, seed: Union[None, _ArrayLikeInt_co, SeedSequence] = ...) -> None: ...
+    def __init__(self, seed: None | _ArrayLikeInt_co | SeedSequence = ...) -> None: ...
      def jumped(self, jumps: int = ...) -> PCG64: ...
      @property
      def state(
@@ -28,7 +28,7 @@ class PCG64(BitGenerator):
      def advance(self, delta: int) -> PCG64: ...
  
  class PCG64DXSM(BitGenerator):
-    def __init__(self, seed: Union[None, _ArrayLikeInt_co, SeedSequence] = ...) -> None: ...
+    def __init__(self, seed: None | _ArrayLikeInt_co | SeedSequence = ...) -> None: ...
      def jumped(self, jumps: int = ...) -> PCG64DXSM: ...
      @property
      def state(
diff --git a/numpy/random/_philox.pyi b/numpy/random/_philox.pyi
index dd1c5e6e9bab97441453ff78827a9f9687f2e8c2..26ce726ecf4a6f0e9fda4d596368a05da5629124 100644
--- a/numpy/random/_philox.pyi
+++ b/numpy/random/_philox.pyi
@@ -1,8 +1,8 @@
-from typing import Any, Union, TypedDict
+from typing import Any, TypedDict
  
  from numpy import dtype, ndarray, uint64
  from numpy.random.bit_generator import BitGenerator, SeedSequence
-from numpy.typing import _ArrayLikeInt_co
+from numpy._typing import _ArrayLikeInt_co
  
  class _PhiloxInternal(TypedDict):
      counter: ndarray[Any, dtype[uint64]]
@@ -19,9 +19,9 @@ class _PhiloxState(TypedDict):
  class Philox(BitGenerator):
      def __init__(
          self,
-        seed: Union[None, _ArrayLikeInt_co, SeedSequence] = ...,
-        counter: Union[None, _ArrayLikeInt_co] = ...,
-        key: Union[None, _ArrayLikeInt_co] = ...,
+        seed: None | _ArrayLikeInt_co | SeedSequence = ...,
+        counter: None | _ArrayLikeInt_co = ...,
+        key: None | _ArrayLikeInt_co = ...,
      ) -> None: ...
      @property
      def state(
diff --git a/numpy/random/_philox.pyx b/numpy/random/_philox.pyx
index 0fe8ebd7cd5fdb811982b78366eca8e7f19e94f2..d9a366e86554ff41816344e4873b174874afac9a 100644
--- a/numpy/random/_philox.pyx
+++ b/numpy/random/_philox.pyx
@@ -266,8 +266,8 @@ cdef class Philox(BitGenerator):
  
          Returns a new bit generator with the state jumped
  
-        The state of the returned big generator is jumped as-if
-        2**(128 * jumps) random numbers have been generated.
+        The state of the returned bit generator is jumped as-if
+        (2**128) * jumps random numbers have been generated.
  
          Parameters
          ----------
diff --git a/numpy/random/_pickle.py b/numpy/random/_pickle.py
index a32f64f4a3d3fb2ac8d1a8d5cfa32e2bc953a8eb..5e89071e88ab03c5e6f163937d59c8b761338a3f 100644
--- a/numpy/random/_pickle.py
+++ b/numpy/random/_pickle.py
@@ -25,7 +25,7 @@ def __generator_ctor(bit_generator_name='MT19937'):
  
      Returns
      -------
-    rg: Generator
+    rg : Generator
          Generator using the named core BitGenerator
      """
      if bit_generator_name in BitGenerators:
@@ -48,7 +48,7 @@ def __bit_generator_ctor(bit_generator_name='MT19937'):
  
      Returns
      -------
-    bit_generator: BitGenerator
+    bit_generator : BitGenerator
          BitGenerator instance
      """
      if bit_generator_name in BitGenerators:
@@ -71,7 +71,7 @@ def __randomstate_ctor(bit_generator_name='MT19937'):
  
      Returns
      -------
-    rs: RandomState
+    rs : RandomState
          Legacy RandomState using the named core BitGenerator
      """
      if bit_generator_name in BitGenerators:
diff --git a/numpy/random/_sfc64.pyi b/numpy/random/_sfc64.pyi
index 94d11a210fe67ce1640a982a512e169a84a5c685..e1810e7d5261490d83ba1ec8f4f3df863837a5f0 100644
--- a/numpy/random/_sfc64.pyi
+++ b/numpy/random/_sfc64.pyi
@@ -1,10 +1,10 @@
-from typing import Any, Union, TypedDict
+from typing import Any, TypedDict
  
  from numpy import dtype as dtype
  from numpy import ndarray as ndarray
  from numpy import uint64
  from numpy.random.bit_generator import BitGenerator, SeedSequence
-from numpy.typing import _ArrayLikeInt_co
+from numpy._typing import _ArrayLikeInt_co
  
  class _SFC64Internal(TypedDict):
      state: ndarray[Any, dtype[uint64]]
@@ -16,7 +16,7 @@ class _SFC64State(TypedDict):
      uinteger: int
  
  class SFC64(BitGenerator):
-    def __init__(self, seed: Union[None, _ArrayLikeInt_co, SeedSequence] = ...) -> None: ...
+    def __init__(self, seed: None | _ArrayLikeInt_co | SeedSequence = ...) -> None: ...
      @property
      def state(
          self,
diff --git a/numpy/random/bit_generator.pyi b/numpy/random/bit_generator.pyi
index fa2f1ab12c90a1d81f029547d37faf36a109838a..e6e3b10cd4504a6f036ecf3db26b44abb72cf406 100644
--- a/numpy/random/bit_generator.pyi
+++ b/numpy/random/bit_generator.pyi
@@ -1,16 +1,9 @@
  import abc
  from threading import Lock
+from collections.abc import Callable, Mapping, Sequence
  from typing import (
      Any,
-    Callable,
-    Dict,
-    List,
-    Mapping,
      NamedTuple,
-    Optional,
-    Sequence,
-    Tuple,
-    Type,
      TypedDict,
      TypeVar,
      Union,
@@ -19,26 +12,26 @@ from typing import (
  )
  
  from numpy import dtype, ndarray, uint32, uint64
-from numpy.typing import _ArrayLikeInt_co, _ShapeLike, _SupportsDType, _UInt32Codes, _UInt64Codes
+from numpy._typing import _ArrayLikeInt_co, _ShapeLike, _SupportsDType, _UInt32Codes, _UInt64Codes
  
  _T = TypeVar("_T")
  
  _DTypeLikeUint32 = Union[
      dtype[uint32],
      _SupportsDType[dtype[uint32]],
-    Type[uint32],
+    type[uint32],
      _UInt32Codes,
  ]
  _DTypeLikeUint64 = Union[
      dtype[uint64],
      _SupportsDType[dtype[uint64]],
-    Type[uint64],
+    type[uint64],
      _UInt64Codes,
  ]
  
  class _SeedSeqState(TypedDict):
-    entropy: Union[None, int, Sequence[int]]
-    spawn_key: Tuple[int, ...]
+    entropy: None | int | Sequence[int]
+    spawn_key: tuple[int, ...]
      pool_size: int
      n_children_spawned: int
  
@@ -53,28 +46,28 @@ class _Interface(NamedTuple):
  class ISeedSequence(abc.ABC):
      @abc.abstractmethod
      def generate_state(
-        self, n_words: int, dtype: Union[_DTypeLikeUint32, _DTypeLikeUint64] = ...
-    ) -> ndarray[Any, dtype[Union[uint32, uint64]]]: ...
+        self, n_words: int, dtype: _DTypeLikeUint32 | _DTypeLikeUint64 = ...
+    ) -> ndarray[Any, dtype[uint32 | uint64]]: ...
  
  class ISpawnableSeedSequence(ISeedSequence):
      @abc.abstractmethod
-    def spawn(self: _T, n_children: int) -> List[_T]: ...
+    def spawn(self: _T, n_children: int) -> list[_T]: ...
  
  class SeedlessSeedSequence(ISpawnableSeedSequence):
      def generate_state(
-        self, n_words: int, dtype: Union[_DTypeLikeUint32, _DTypeLikeUint64] = ...
-    ) -> ndarray[Any, dtype[Union[uint32, uint64]]]: ...
-    def spawn(self: _T, n_children: int) -> List[_T]: ...
+        self, n_words: int, dtype: _DTypeLikeUint32 | _DTypeLikeUint64 = ...
+    ) -> ndarray[Any, dtype[uint32 | uint64]]: ...
+    def spawn(self: _T, n_children: int) -> list[_T]: ...
  
  class SeedSequence(ISpawnableSeedSequence):
-    entropy: Union[None, int, Sequence[int]]
-    spawn_key: Tuple[int, ...]
+    entropy: None | int | Sequence[int]
+    spawn_key: tuple[int, ...]
      pool_size: int
      n_children_spawned: int
      pool: ndarray[Any, dtype[uint32]]
      def __init__(
          self,
-        entropy: Union[None, int, Sequence[int], _ArrayLikeInt_co] = ...,
+        entropy: None | int | Sequence[int] | _ArrayLikeInt_co = ...,
          *,
          spawn_key: Sequence[int] = ...,
          pool_size: int = ...,
@@ -86,18 +79,18 @@ class SeedSequence(ISpawnableSeedSequence):
          self,
      ) -> _SeedSeqState: ...
      def generate_state(
-        self, n_words: int, dtype: Union[_DTypeLikeUint32, _DTypeLikeUint64] = ...
-    ) -> ndarray[Any, dtype[Union[uint32, uint64]]]: ...
-    def spawn(self, n_children: int) -> List[SeedSequence]: ...
+        self, n_words: int, dtype: _DTypeLikeUint32 | _DTypeLikeUint64 = ...
+    ) -> ndarray[Any, dtype[uint32 | uint64]]: ...
+    def spawn(self, n_children: int) -> list[SeedSequence]: ...
  
  class BitGenerator(abc.ABC):
      lock: Lock
-    def __init__(self, seed: Union[None, _ArrayLikeInt_co, SeedSequence] = ...) -> None: ...
-    def __getstate__(self) -> Dict[str, Any]: ...
-    def __setstate__(self, state: Dict[str, Any]) -> None: ...
+    def __init__(self, seed: None | _ArrayLikeInt_co | SeedSequence = ...) -> None: ...
+    def __getstate__(self) -> dict[str, Any]: ...
+    def __setstate__(self, state: dict[str, Any]) -> None: ...
      def __reduce__(
          self,
-    ) -> Tuple[Callable[[str], BitGenerator], Tuple[str], Tuple[Dict[str, Any]]]: ...
+    ) -> tuple[Callable[[str], BitGenerator], tuple[str], tuple[dict[str, Any]]]: ...
      @abc.abstractmethod
      @property
      def state(self) -> Mapping[str, Any]: ...
@@ -108,7 +101,7 @@ class BitGenerator(abc.ABC):
      @overload
      def random_raw(self, size: _ShapeLike = ..., output: Literal[True] = ...) -> ndarray[Any, dtype[uint64]]: ...  # type: ignore[misc]
      @overload
-    def random_raw(self, size: Optional[_ShapeLike] = ..., output: Literal[False] = ...) -> None: ...  # type: ignore[misc]
+    def random_raw(self, size: None | _ShapeLike = ..., output: Literal[False] = ...) -> None: ...  # type: ignore[misc]
      def _benchmark(self, cnt: int, method: str = ...) -> None: ...
      @property
      def ctypes(self) -> _Interface: ...
diff --git a/numpy/random/bit_generator.pyx b/numpy/random/bit_generator.pyx
index 123d77b40e2e680d594ea7efc29900e4e7c9f065..2c50dbf70517d7fbe45d7c0d620249ee05c8b378 100644
--- a/numpy/random/bit_generator.pyx
+++ b/numpy/random/bit_generator.pyx
@@ -35,13 +35,7 @@ import abc
  import sys
  from itertools import cycle
  import re
-
-try:
-    from secrets import randbits
-except ImportError:
-    # secrets unavailable on python 3.5 and before
-    from random import SystemRandom
-    randbits = SystemRandom().getrandbits
+from secrets import randbits
  
  from threading import Lock
  
@@ -576,7 +570,7 @@ cdef class BitGenerator():
  
          Notes
          -----
-        This method directly exposes the the raw underlying pseudo-random
+        This method directly exposes the raw underlying pseudo-random
          number generator. All values are returned as unsigned 64-bit
          values irrespective of the number of bits produced by the PRNG.
  
diff --git a/numpy/random/mtrand.pyi b/numpy/random/mtrand.pyi
index cbe87a29984205124e63ad93a6eb9988995eed2b..b6eb77f00df21f91697284ba511ebf166064b14e 100644
--- a/numpy/random/mtrand.pyi
+++ b/numpy/random/mtrand.pyi
@@ -1,4 +1,5 @@
-from typing import Any, Callable, Dict, Optional, Tuple, Type, Union, overload, Literal
+from collections.abc import Callable
+from typing import Any, Union, overload, Literal
  
  from numpy import (
      bool_,
@@ -18,7 +19,7 @@ from numpy import (
      uint64,
  )
  from numpy.random.bit_generator import BitGenerator
-from numpy.typing import (
+from numpy._typing import (
      ArrayLike,
      _ArrayLikeFloat_co,
      _ArrayLikeInt_co,
@@ -46,7 +47,7 @@ from numpy.typing import (
  _DTypeLikeFloat32 = Union[
      dtype[float32],
      _SupportsDType[dtype[float32]],
-    Type[float32],
+    type[float32],
      _Float32Codes,
      _SingleCodes,
  ]
@@ -54,29 +55,29 @@ _DTypeLikeFloat32 = Union[
  _DTypeLikeFloat64 = Union[
      dtype[float64],
      _SupportsDType[dtype[float64]],
-    Type[float],
-    Type[float64],
+    type[float],
+    type[float64],
      _Float64Codes,
      _DoubleCodes,
  ]
  
  class RandomState:
      _bit_generator: BitGenerator
-    def __init__(self, seed: Union[None, _ArrayLikeInt_co, BitGenerator] = ...) -> None: ...
+    def __init__(self, seed: None | _ArrayLikeInt_co | BitGenerator = ...) -> None: ...
      def __repr__(self) -> str: ...
      def __str__(self) -> str: ...
-    def __getstate__(self) -> Dict[str, Any]: ...
-    def __setstate__(self, state: Dict[str, Any]) -> None: ...
-    def __reduce__(self) -> Tuple[Callable[[str], RandomState], Tuple[str], Dict[str, Any]]: ...
-    def seed(self, seed: Optional[_ArrayLikeFloat_co] = ...) -> None: ...
+    def __getstate__(self) -> dict[str, Any]: ...
+    def __setstate__(self, state: dict[str, Any]) -> None: ...
+    def __reduce__(self) -> tuple[Callable[[str], RandomState], tuple[str], dict[str, Any]]: ...
+    def seed(self, seed: None | _ArrayLikeFloat_co = ...) -> None: ...
      @overload
-    def get_state(self, legacy: Literal[False] = ...) -> Dict[str, Any]: ...
+    def get_state(self, legacy: Literal[False] = ...) -> dict[str, Any]: ...
      @overload
      def get_state(
          self, legacy: Literal[True] = ...
-    ) -> Union[Dict[str, Any], Tuple[str, ndarray[Any, dtype[uint32]], int, int, float]]: ...
+    ) -> dict[str, Any] | tuple[str, ndarray[Any, dtype[uint32]], int, int, float]: ...
      def set_state(
-        self, state: Union[Dict[str, Any], Tuple[str, ndarray[Any, dtype[uint32]], int, int, float]]
+        self, state: dict[str, Any] | tuple[str, ndarray[Any, dtype[uint32]], int, int, float]
      ) -> None: ...
      @overload
      def random_sample(self, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -90,13 +91,13 @@ class RandomState:
      def beta(self, a: float, b: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def beta(
-        self, a: _ArrayLikeFloat_co, b: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, b: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def exponential(self, scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def exponential(
-        self, scale: _ArrayLikeFloat_co = ..., size: Optional[_ShapeLike] = ...
+        self, scale: _ArrayLikeFloat_co = ..., size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_exponential(self, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -110,13 +111,13 @@ class RandomState:
      def randint(  # type: ignore[misc]
          self,
          low: int,
-        high: Optional[int] = ...,
+        high: None | int = ...,
      ) -> int: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: int,
-        high: Optional[int] = ...,
+        high: None | int = ...,
          size: None = ...,
          dtype: _DTypeLikeBool = ...,
      ) -> bool: ...
@@ -124,114 +125,104 @@ class RandomState:
      def randint(  # type: ignore[misc]
          self,
          low: int,
-        high: Optional[int] = ...,
+        high: None | int = ...,
          size: None = ...,
-        dtype: Union[_DTypeLikeInt, _DTypeLikeUInt] = ...,
+        dtype: _DTypeLikeInt | _DTypeLikeUInt = ...,
      ) -> int: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
          dtype: _DTypeLikeBool = ...,
      ) -> ndarray[Any, dtype[bool_]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[int8], Type[int8], _Int8Codes, _SupportsDType[dtype[int8]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int8] | type[int8] | _Int8Codes | _SupportsDType[dtype[int8]] = ...,
      ) -> ndarray[Any, dtype[int8]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[int16], Type[int16], _Int16Codes, _SupportsDType[dtype[int16]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int16] | type[int16] | _Int16Codes | _SupportsDType[dtype[int16]] = ...,
      ) -> ndarray[Any, dtype[int16]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[int32], Type[int32], _Int32Codes, _SupportsDType[dtype[int32]]] = ...,
-    ) -> ndarray[Any, dtype[Union[int32]]]: ...
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int32] | type[int32] | _Int32Codes | _SupportsDType[dtype[int32]] = ...,
+    ) -> ndarray[Any, dtype[int32]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Optional[
-            Union[dtype[int64], Type[int64], _Int64Codes, _SupportsDType[dtype[int64]]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: None | dtype[int64] | type[int64] | _Int64Codes | _SupportsDType[dtype[int64]] = ...,
      ) -> ndarray[Any, dtype[int64]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[uint8], Type[uint8], _UInt8Codes, _SupportsDType[dtype[uint8]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint8] | type[uint8] | _UInt8Codes | _SupportsDType[dtype[uint8]] = ...,
      ) -> ndarray[Any, dtype[uint8]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[uint16], Type[uint16], _UInt16Codes, _SupportsDType[dtype[uint16]]
-        ] = ...,
-    ) -> ndarray[Any, dtype[Union[uint16]]]: ...
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint16] | type[uint16] | _UInt16Codes | _SupportsDType[dtype[uint16]] = ...,
+    ) -> ndarray[Any, dtype[uint16]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[uint32], Type[uint32], _UInt32Codes, _SupportsDType[dtype[uint32]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint32] | type[uint32] | _UInt32Codes | _SupportsDType[dtype[uint32]] = ...,
      ) -> ndarray[Any, dtype[uint32]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[uint64], Type[uint64], _UInt64Codes, _SupportsDType[dtype[uint64]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint64] | type[uint64] | _UInt64Codes | _SupportsDType[dtype[uint64]] = ...,
      ) -> ndarray[Any, dtype[uint64]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[
-            dtype[int_], Type[int], Type[int_], _IntCodes, _SupportsDType[dtype[int_]]
-        ] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[int_] | type[int] | type[int_] | _IntCodes | _SupportsDType[dtype[int_]] = ...,
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def randint(  # type: ignore[misc]
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
-        dtype: Union[dtype[uint], Type[uint], _UIntCodes, _SupportsDType[dtype[uint]]] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
+        dtype: dtype[uint] | type[uint] | _UIntCodes | _SupportsDType[dtype[uint]] = ...,
      ) -> ndarray[Any, dtype[uint]]: ...
      def bytes(self, length: int) -> bytes: ...
      @overload
@@ -240,7 +231,7 @@ class RandomState:
          a: int,
          size: None = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
      ) -> int: ...
      @overload
      def choice(
@@ -248,7 +239,7 @@ class RandomState:
          a: int,
          size: _ShapeLike = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def choice(
@@ -256,7 +247,7 @@ class RandomState:
          a: ArrayLike,
          size: None = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
      ) -> Any: ...
      @overload
      def choice(
@@ -264,7 +255,7 @@ class RandomState:
          a: ArrayLike,
          size: _ShapeLike = ...,
          replace: bool = ...,
-        p: Optional[_ArrayLikeFloat_co] = ...,
+        p: None | _ArrayLikeFloat_co = ...,
      ) -> ndarray[Any, Any]: ...
      @overload
      def uniform(self, low: float = ..., high: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -273,7 +264,7 @@ class RandomState:
          self,
          low: _ArrayLikeFloat_co = ...,
          high: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def rand(self) -> float: ...
@@ -284,13 +275,13 @@ class RandomState:
      @overload
      def randn(self, *args: int) -> ndarray[Any, dtype[float64]]: ...
      @overload
-    def random_integers(self, low: int, high: Optional[int] = ..., size: None = ...) -> int: ...  # type: ignore[misc]
+    def random_integers(self, low: int, high: None | int = ..., size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def random_integers(
          self,
          low: _ArrayLikeInt_co,
-        high: Optional[_ArrayLikeInt_co] = ...,
-        size: Optional[_ShapeLike] = ...,
+        high: None | _ArrayLikeInt_co = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def standard_normal(self, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -305,7 +296,7 @@ class RandomState:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_gamma(  # type: ignore[misc]
@@ -317,7 +308,7 @@ class RandomState:
      def standard_gamma(
          self,
          shape: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def gamma(self, shape: float, scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -326,13 +317,13 @@ class RandomState:
          self,
          shape: _ArrayLikeFloat_co,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def f(self, dfnum: float, dfden: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def f(
-        self, dfnum: _ArrayLikeFloat_co, dfden: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, dfnum: _ArrayLikeFloat_co, dfden: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def noncentral_f(self, dfnum: float, dfden: float, nonc: float, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -342,19 +333,19 @@ class RandomState:
          dfnum: _ArrayLikeFloat_co,
          dfden: _ArrayLikeFloat_co,
          nonc: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def chisquare(self, df: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def chisquare(
-        self, df: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, df: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def noncentral_chisquare(self, df: float, nonc: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def noncentral_chisquare(
-        self, df: _ArrayLikeFloat_co, nonc: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, df: _ArrayLikeFloat_co, nonc: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_t(self, df: float, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -370,25 +361,25 @@ class RandomState:
      def vonmises(self, mu: float, kappa: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def vonmises(
-        self, mu: _ArrayLikeFloat_co, kappa: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, mu: _ArrayLikeFloat_co, kappa: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def pareto(self, a: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def pareto(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def weibull(self, a: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def weibull(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def power(self, a: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def power(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def standard_cauchy(self, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -401,7 +392,7 @@ class RandomState:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def gumbel(self, loc: float = ..., scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -410,7 +401,7 @@ class RandomState:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def logistic(self, loc: float = ..., scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -419,7 +410,7 @@ class RandomState:
          self,
          loc: _ArrayLikeFloat_co = ...,
          scale: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def lognormal(self, mean: float = ..., sigma: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
@@ -428,19 +419,19 @@ class RandomState:
          self,
          mean: _ArrayLikeFloat_co = ...,
          sigma: _ArrayLikeFloat_co = ...,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def rayleigh(self, scale: float = ..., size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def rayleigh(
-        self, scale: _ArrayLikeFloat_co = ..., size: Optional[_ShapeLike] = ...
+        self, scale: _ArrayLikeFloat_co = ..., size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def wald(self, mean: float, scale: float, size: None = ...) -> float: ...  # type: ignore[misc]
      @overload
      def wald(
-        self, mean: _ArrayLikeFloat_co, scale: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, mean: _ArrayLikeFloat_co, scale: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def triangular(self, left: float, mode: float, right: float, size: None = ...) -> float: ...  # type: ignore[misc]
@@ -450,37 +441,37 @@ class RandomState:
          left: _ArrayLikeFloat_co,
          mode: _ArrayLikeFloat_co,
          right: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      @overload
      def binomial(self, n: int, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def binomial(
-        self, n: _ArrayLikeInt_co, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, n: _ArrayLikeInt_co, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def negative_binomial(self, n: float, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def negative_binomial(
-        self, n: _ArrayLikeFloat_co, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, n: _ArrayLikeFloat_co, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def poisson(self, lam: float = ..., size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def poisson(
-        self, lam: _ArrayLikeFloat_co = ..., size: Optional[_ShapeLike] = ...
+        self, lam: _ArrayLikeFloat_co = ..., size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def zipf(self, a: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def zipf(
-        self, a: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, a: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def geometric(self, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def geometric(
-        self, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def hypergeometric(self, ngood: int, nbad: int, nsample: int, size: None = ...) -> int: ...  # type: ignore[misc]
@@ -490,27 +481,27 @@ class RandomState:
          ngood: _ArrayLikeInt_co,
          nbad: _ArrayLikeInt_co,
          nsample: _ArrayLikeInt_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
      ) -> ndarray[Any, dtype[int_]]: ...
      @overload
      def logseries(self, p: float, size: None = ...) -> int: ...  # type: ignore[misc]
      @overload
      def logseries(
-        self, p: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, p: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      def multivariate_normal(
          self,
          mean: _ArrayLikeFloat_co,
          cov: _ArrayLikeFloat_co,
-        size: Optional[_ShapeLike] = ...,
+        size: None | _ShapeLike = ...,
          check_valid: Literal["warn", "raise", "ignore"] = ...,
          tol: float = ...,
      ) -> ndarray[Any, dtype[float64]]: ...
      def multinomial(
-        self, n: _ArrayLikeInt_co, pvals: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, n: _ArrayLikeInt_co, pvals: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[int_]]: ...
      def dirichlet(
-        self, alpha: _ArrayLikeFloat_co, size: Optional[_ShapeLike] = ...
+        self, alpha: _ArrayLikeFloat_co, size: None | _ShapeLike = ...
      ) -> ndarray[Any, dtype[float64]]: ...
      def shuffle(self, x: ArrayLike) -> None: ...
      @overload
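The stub changes above swap `Optional[X]` for the PEP 604 spelling `None | X`. Stub files are never executed, so the `|` form is safe there even on interpreters older than Python 3.10. The sketch below is an illustration only: `describe_draw` is a hypothetical helper, not a NumPy API, and it quotes the annotation so that the same spelling also works in a runtime module on older interpreters.

```python
from typing import Optional, Union


def describe_draw(size: "None | int" = None) -> str:
    """Hypothetical helper (not part of NumPy).

    Mirrors the stub convention shown in the diff: ``size=None`` means a
    scalar draw, any other value means an array draw.
    """
    return "scalar" if size is None else f"array of {size}"


# The two older spellings describe the same type.
same_type = Optional[int] == Union[int, None]

# A quoted annotation is stored verbatim and never evaluated at definition time.
size_annotation = describe_draw.__annotations__["size"]
```

The quoted form is only one option; a runtime module could equally keep `Optional[...]` until Python 3.10 is its minimum version.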
diff --git a/numpy/random/mtrand.pyx b/numpy/random/mtrand.pyx

index ce09a041c94ea6e0e1d61f21a6b14b9a2d242b57..fcc1f27d266fdeaef3ef1693c96871ab8a2d4861 100644 (file)
--- a/numpy/random/mtrand.pyx
+++ b/numpy/random/mtrand.pyx
@@ -402,7 +402,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.random: which should be used for new code.
+        random.Generator.random: which should be used for new code.
  
          Examples
          --------
@@ -476,7 +476,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.beta: which should be used for new code.
+        random.Generator.beta: which should be used for new code.
          """
          return cont(&legacy_beta, &self._aug_state, size, self.lock, 2,
                      a, 'a', CONS_POSITIVE,
@@ -525,7 +525,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.exponential: which should be used for new code.
+        random.Generator.exponential: which should be used for new code.
  
          References
          ----------
@@ -570,7 +570,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.standard_exponential: which should be used for new code.
+        random.Generator.standard_exponential: which should be used for new code.
  
          Examples
          --------
@@ -688,7 +688,7 @@ cdef class RandomState:
          random_integers : similar to `randint`, only for the closed
              interval [`low`, `high`], and 1 is the lowest value if `high` is
              omitted.
-        Generator.integers: which should be used for new code.
+        random.Generator.integers: which should be used for new code.
  
          Examples
          --------
@@ -790,7 +790,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.bytes: which should be used for new code.
+        random.Generator.bytes: which should be used for new code.
  
          Examples
          --------
@@ -850,7 +850,7 @@ cdef class RandomState:
          See Also
          --------
          randint, shuffle, permutation
-        Generator.choice: which should be used in new code
+        random.Generator.choice: which should be used in new code
  
          Notes
          -----
@@ -1058,7 +1058,7 @@ cdef class RandomState:
          rand : Convenience function that accepts dimensions as input, e.g.,
                 ``rand(2,2)`` would generate a 2-by-2 array of floats,
                 uniformly distributed over ``[0, 1)``.
-        Generator.uniform: which should be used for new code.
+        random.Generator.uniform: which should be used for new code.
  
          Notes
          -----
@@ -1220,7 +1220,7 @@ cdef class RandomState:
          --------
          standard_normal : Similar, but takes a tuple as its argument.
          normal : Also accepts mu and sigma arguments.
-        Generator.standard_normal: which should be used for new code.
+        random.Generator.standard_normal: which should be used for new code.
  
          Notes
          -----
@@ -1369,7 +1369,7 @@ cdef class RandomState:
          normal :
              Equivalent function with additional ``loc`` and ``scale`` arguments
              for setting the mean and standard deviation.
-        Generator.standard_normal: which should be used for new code.
+        random.Generator.standard_normal: which should be used for new code.
  
          Notes
          -----
@@ -1448,7 +1448,7 @@ cdef class RandomState:
          --------
          scipy.stats.norm : probability density function, distribution or
              cumulative density function, etc.
-        Generator.normal: which should be used for new code.
+        random.Generator.normal: which should be used for new code.
  
          Notes
          -----
@@ -1545,7 +1545,7 @@ cdef class RandomState:
          --------
          scipy.stats.gamma : probability density function, distribution or
              cumulative density function, etc.
-        Generator.standard_gamma: which should be used for new code.
+        random.Generator.standard_gamma: which should be used for new code.
  
          Notes
          -----
@@ -1629,7 +1629,7 @@ cdef class RandomState:
          --------
          scipy.stats.gamma : probability density function, distribution or
              cumulative density function, etc.
-        Generator.gamma: which should be used for new code.
+        random.Generator.gamma: which should be used for new code.
  
          Notes
          -----
@@ -1717,7 +1717,7 @@ cdef class RandomState:
          --------
          scipy.stats.f : probability density function, distribution or
              cumulative density function, etc.
-        Generator.f: which should be used for new code.
+        random.Generator.f: which should be used for new code.
  
          Notes
          -----
@@ -1810,7 +1810,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.noncentral_f: which should be used for new code.
+        random.Generator.noncentral_f: which should be used for new code.
  
          Notes
          -----
@@ -1892,7 +1892,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.chisquare: which should be used for new code.
+        random.Generator.chisquare: which should be used for new code.
  
          Notes
          -----
@@ -1964,7 +1964,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.noncentral_chisquare: which should be used for new code.
+        random.Generator.noncentral_chisquare: which should be used for new code.
  
          Notes
          -----
@@ -2042,7 +2042,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.standard_cauchy: which should be used for new code.
+        random.Generator.standard_cauchy: which should be used for new code.
  
          Notes
          -----
@@ -2121,7 +2121,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.standard_t: which should be used for new code.
+        random.Generator.standard_t: which should be used for new code.
  
          Notes
          -----
@@ -2242,7 +2242,7 @@ cdef class RandomState:
          --------
          scipy.stats.vonmises : probability density function, distribution, or
              cumulative density function, etc.
-        Generator.vonmises: which should be used for new code.
+        random.Generator.vonmises: which should be used for new code.
  
          Notes
          -----
@@ -2340,7 +2340,7 @@ cdef class RandomState:
              cumulative density function, etc.
          scipy.stats.genpareto : probability density function, distribution or
              cumulative density function, etc.
-        Generator.pareto: which should be used for new code.
+        random.Generator.pareto: which should be used for new code.
  
          Notes
          -----
@@ -2434,7 +2434,7 @@ cdef class RandomState:
          scipy.stats.weibull_min
          scipy.stats.genextreme
          gumbel
-        Generator.weibull: which should be used for new code.
+        random.Generator.weibull: which should be used for new code.
  
          Notes
          -----
@@ -2531,7 +2531,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.power: which should be used for new code.
+        random.Generator.power: which should be used for new code.
  
          Notes
          -----
@@ -2640,7 +2640,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.laplace: which should be used for new code.
+        random.Generator.laplace: which should be used for new code.
  
          Notes
          -----
@@ -2735,7 +2735,7 @@ cdef class RandomState:
          scipy.stats.gumbel_r
          scipy.stats.genextreme
          weibull
-        Generator.gumbel: which should be used for new code.
+        random.Generator.gumbel: which should be used for new code.
  
          Notes
          -----
@@ -2855,7 +2855,7 @@ cdef class RandomState:
          --------
          scipy.stats.logistic : probability density function, distribution or
              cumulative density function, etc.
-        Generator.logistic: which should be used for new code.
+        random.Generator.logistic: which should be used for new code.
  
          Notes
          -----
@@ -2942,7 +2942,7 @@ cdef class RandomState:
          --------
          scipy.stats.lognorm : probability density function, distribution,
              cumulative density function, etc.
-        Generator.lognormal: which should be used for new code.
+        random.Generator.lognormal: which should be used for new code.
  
          Notes
          -----
@@ -3050,7 +3050,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.rayleigh: which should be used for new code.
+        random.Generator.rayleigh: which should be used for new code.
  
          Notes
          -----
@@ -3134,7 +3134,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.wald: which should be used for new code.
+        random.Generator.wald: which should be used for new code.
  
          Notes
          -----
@@ -3211,7 +3211,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.triangular: which should be used for new code.
+        random.Generator.triangular: which should be used for new code.
  
          Notes
          -----
@@ -3318,7 +3318,7 @@ cdef class RandomState:
          --------
          scipy.stats.binom : probability density function, distribution or
              cumulative density function, etc.
-        Generator.binomial: which should be used for new code.
+        random.Generator.binomial: which should be used for new code.
  
          Notes
          -----
@@ -3466,7 +3466,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.negative_binomial: which should be used for new code.
+        random.Generator.negative_binomial: which should be used for new code.
  
          Notes
          -----
@@ -3549,7 +3549,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.poisson: which should be used for new code.
+        random.Generator.poisson: which should be used for new code.
  
          Notes
          -----
@@ -3636,7 +3636,7 @@ cdef class RandomState:
          --------
          scipy.stats.zipf : probability density function, distribution, or
              cumulative density function, etc.
-        Generator.zipf: which should be used for new code.
+        random.Generator.zipf: which should be used for new code.
  
          Notes
          -----
@@ -3733,7 +3733,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.geometric: which should be used for new code.
+        random.Generator.geometric: which should be used for new code.
  
          Examples
          --------
@@ -3797,7 +3797,7 @@ cdef class RandomState:
          --------
          scipy.stats.hypergeom : probability density function, distribution or
              cumulative density function, etc.
-        Generator.hypergeometric: which should be used for new code.
+        random.Generator.hypergeometric: which should be used for new code.
  
          Notes
          -----
@@ -3920,7 +3920,7 @@ cdef class RandomState:
          --------
          scipy.stats.logser : probability density function, distribution or
              cumulative density function, etc.
-        Generator.logseries: which should be used for new code.
+        random.Generator.logseries: which should be used for new code.
  
          Notes
          -----
@@ -4023,7 +4023,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.multivariate_normal: which should be used for new code.
+        random.Generator.multivariate_normal: which should be used for new code.
  
          Notes
          -----
@@ -4079,12 +4079,35 @@ cdef class RandomState:
          >>> x.shape
          (3, 3, 2)
  
-        The following is probably true, given that 0.6 is roughly twice the
-        standard deviation:
+        Here we generate 800 samples from the bivariate normal distribution
+        with mean [0, 0] and covariance matrix [[6, -3], [-3, 3.5]].  The
+        expected variances of the first and second components of the sample
+        are 6 and 3.5, respectively, and the expected correlation
+        coefficient is -3/sqrt(6*3.5) ≈ -0.65465.
  
-        >>> list((x[0,0,:] - mean) < 0.6)
-        [True, True] # random
+        >>> cov = np.array([[6, -3], [-3, 3.5]])
+        >>> pts = np.random.multivariate_normal([0, 0], cov, size=800)
  
+        Check that the mean, covariance, and correlation coefficient of the
+        sample are close to the expected values:
+
+        >>> pts.mean(axis=0)
+        array([ 0.0326911 , -0.01280782])  # may vary
+        >>> np.cov(pts.T)
+        array([[ 5.96202397, -2.85602287],
+               [-2.85602287,  3.47613949]])  # may vary
+        >>> np.corrcoef(pts.T)[0, 1]
+        -0.6273591314603949  # may vary
+
+        We can visualize this data with a scatter plot.  The orientation
+        of the point cloud illustrates the negative correlation of the
+        components of this sample.
+
+        >>> import matplotlib.pyplot as plt
+        >>> plt.plot(pts[:, 0], pts[:, 1], '.', alpha=0.5)
+        >>> plt.axis('equal')
+        >>> plt.grid()
+        >>> plt.show()
          """
          from numpy.linalg import svd
  
@@ -4193,7 +4216,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.multinomial: which should be used for new code.
+        random.Generator.multinomial: which should be used for new code.
  
          Examples
          --------
@@ -4318,13 +4341,13 @@ cdef class RandomState:
              The drawn samples, of shape ``(size, k)``.
  
          Raises
-        -------
+        ------
          ValueError
              If any value in ``alpha`` is less than or equal to zero
  
          See Also
          --------
-        Generator.dirichlet: which should be used for new code.
+        random.Generator.dirichlet: which should be used for new code.
  
          Notes
          -----
@@ -4459,7 +4482,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.shuffle: which should be used for new code.
+        random.Generator.shuffle: which should be used for new code.
  
          Examples
          --------
@@ -4483,6 +4506,9 @@ cdef class RandomState:
              char* x_ptr
              char* buf_ptr
  
+        if isinstance(x, np.ndarray) and not x.flags.writeable:
+            raise ValueError('array is read-only')
+
          if type(x) is np.ndarray and x.ndim == 1 and x.size:
              # Fast, statically typed path: shuffle the underlying buffer.
              # Only for non-empty, 1d objects of class ndarray (subclasses such
@@ -4514,7 +4540,7 @@ cdef class RandomState:
                          "Shuffling a one dimensional array subclass containing "
                          "objects gives incorrect results for most array "
                          "subclasses.  "
-                        "Please us the new random number API instead: "
+                        "Please use the new random number API instead: "
                          "https://numpy.org/doc/stable/reference/random/index.html\n"
                          "The new API fixes this issue. This version will not "
                          "be fixed due to stability guarantees of the API.",
@@ -4582,7 +4608,7 @@ cdef class RandomState:
  
          See Also
          --------
-        Generator.permutation: which should be used for new code.
+        random.Generator.permutation: which should be used for new code.
  
          Examples
          --------
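The check added to `RandomState.shuffle` in this file rejects read-only arrays before the buffer-level fast path runs. A minimal sketch of the resulting behaviour, assuming NumPy 1.23 or newer is installed (older releases may not raise here); the seed and array contents are arbitrary choices for illustration.

```python
import numpy as np

rs = np.random.RandomState(0)  # legacy generator, the class patched in this diff
a = np.arange(5)
a.flags.writeable = False      # mark the array read-only

try:
    rs.shuffle(a)
    raised = False
    message = ""
except ValueError as exc:
    # With the new check in place, shuffle refuses to touch the buffer.
    raised = True
    message = str(exc)
```

Callers that need a shuffled result from a read-only input can shuffle a copy instead, for example `rs.permutation(a)`, which returns a new array.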
diff --git a/numpy/random/setup.py b/numpy/random/setup.py

index 866c0cb2f0ab56dd40e62c080fd9cd757ffc36ab..233344430c77be6f11560a0b8e4a4f8e2678aba1 100644 (file)
--- a/numpy/random/setup.py
+++ b/numpy/random/setup.py
@@ -50,6 +50,13 @@ def configuration(parent_package='', top_path=None):
          # Some bit generators require c99
          EXTRA_COMPILE_ARGS += ['-std=c99']
  
+    if sys.platform == 'cygwin':
+        # Export symbols without __declspec(dllexport) for use by Cython.
+        # Using __declspec(dllexport) does not export other necessary symbols
+        # in Cygwin package's Cython environment, making it impossible to
+        # import modules.
+        EXTRA_LINK_ARGS += ['-Wl,--export-all-symbols']
+
      # Use legacy integer variable sizes
      LEGACY_DEFS = [('NP_RANDOM_LEGACY', '1')]
      PCG64_DEFS = []
diff --git a/numpy/random/tests/test_extending.py b/numpy/random/tests/test_extending.py

index d362092b58857d5061d2aeb2b354fc2fabc474d2..04b13cb8cf703fe81705289c6b0dcb2216846a1e 100644 (file)
--- a/numpy/random/tests/test_extending.py
+++ b/numpy/random/tests/test_extending.py
@@ -31,13 +31,13 @@ try:
  except ImportError:
      cython = None
  else:
-    from distutils.version import LooseVersion
-    # Cython 0.29.21 is required for Python 3.9 and there are
+    from numpy.compat import _pep440
+    # Cython 0.29.30 is required for Python 3.11 and there are
      # other fixes in the 0.29 series that are needed even for earlier
      # Python versions.
      # Note: keep in sync with the one in pyproject.toml
-    required_version = LooseVersion('0.29.21')
-    if LooseVersion(cython_version) < required_version:
+    required_version = '0.29.30'
+    if _pep440.parse(cython_version) < _pep440.Version(required_version):
          # too old or wrong cython, skip the test
          cython = None
  
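The hunk above replaces `distutils.version.LooseVersion` with a PEP 440 parser for the minimum-Cython check. The sketch below is not that parser: `release_tuple` is a hypothetical, much-simplified stand-in that handles only plain dotted numeric versions (no pre-release or local segments). It is meant only to show why version strings have to be parsed before they are compared.

```python
def release_tuple(version: str) -> tuple:
    """Hypothetical helper: turn '0.29.30' into (0, 29, 30).

    Handles only dotted numeric releases, unlike a full PEP 440 parser.
    """
    return tuple(int(part) for part in version.split("."))


required = release_tuple("0.29.30")

# Parsed comparison orders components numerically.
too_old = release_tuple("0.29.21") < required       # older release is rejected
new_enough = release_tuple("0.29.32") >= required   # newer release is accepted

# Plain string comparison is lexicographic and mis-orders multi-digit parts.
string_says_newer = "0.29.9" > "0.29.30"
tuple_says_newer = release_tuple("0.29.9") > required
```

Here the string comparison claims that 0.29.9 is newer than 0.29.30, while the parsed comparison gives the intended answer.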
diff --git a/numpy/random/tests/test_generator_mt19937.py b/numpy/random/tests/test_generator_mt19937.py

index e5411b8ef569b6d0a3fe030bdbdbb28f5eb3816a..3ccb9103c5c9cd25b4ca1325c17f3cfd0443c486 100644 (file)
--- a/numpy/random/tests/test_generator_mt19937.py
+++ b/numpy/random/tests/test_generator_mt19937.py
@@ -1020,6 +1020,13 @@ class TestRandomDist:
          arr = np.ones((3, 2))
          assert_raises(np.AxisError, random.shuffle, arr, 2)
  
+    def test_shuffle_not_writeable(self):
+        random = Generator(MT19937(self.seed))
+        a = np.zeros(5)
+        a.flags.writeable = False
+        with pytest.raises(ValueError, match='read-only'):
+            random.shuffle(a)
+
      def test_permutation(self):
          random = Generator(MT19937(self.seed))
          alist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
@@ -1116,6 +1123,12 @@ class TestRandomDist:
          with pytest.raises(TypeError, match='Cannot cast'):
              random.permuted(x, axis=1, out=out)
  
+    def test_permuted_not_writeable(self):
+        x = np.zeros((2, 5))
+        x.flags.writeable = False
+        with pytest.raises(ValueError, match='read-only'):
+            random.permuted(x, axis=1, out=x)
+
      def test_beta(self):
          random = Generator(MT19937(self.seed))
          actual = random.beta(.1, .9, size=(3, 2))
@@ -1472,6 +1485,13 @@ class TestRandomDist:
          with assert_raises(ValueError):
              x = random.negative_binomial(1, 0)
  
+    def test_negative_binomial_invalid_p_n_combination(self):
+        # Verify that values of p and n that would result in an overflow
+        # or infinite loop raise an exception.
+        with np.errstate(invalid='ignore'):
+            assert_raises(ValueError, random.negative_binomial, 2**62, 0.1)
+            assert_raises(ValueError, random.negative_binomial, [2**62], [0.1])
+
      def test_noncentral_chisquare(self):
          random = Generator(MT19937(self.seed))
          actual = random.noncentral_chisquare(df=5, nonc=5, size=(3, 2))
@@ -2550,7 +2570,7 @@ class TestSingleEltArrayInput:
  def test_jumped(config):
      # Each config contains the initial seed, a number of raw steps
      # the sha256 hashes of the initial and the final states' keys and
-    # the position of of the initial and the final state.
+    # the position of the initial and the final state.
      # These were produced using the original C implementation.
      seed = config["seed"]
      steps = config["steps"]
diff --git a/numpy/random/tests/test_random.py b/numpy/random/tests/test_random.py
index 6a584a511e1c9aec8ea3a709708e980862e86b9d..773b63653c6704e16d9ac2591e108e7d12eaa1eb 100644
--- a/numpy/random/tests/test_random.py
+++ b/numpy/random/tests/test_random.py
@@ -564,6 +564,12 @@ class TestRandomDist:
          rng.shuffle(a)
          assert_equal(np.asarray(a), [4, 1, 0, 3, 2])
  
+    def test_shuffle_not_writeable(self):
+        a = np.zeros(3)
+        a.flags.writeable = False
+        with pytest.raises(ValueError, match='read-only'):
+            np.random.shuffle(a)
+
      def test_beta(self):
          np.random.seed(self.seed)
          actual = np.random.beta(.1, .9, size=(3, 2))
diff --git a/numpy/setup.py b/numpy/setup.py
index a0ca99919b3a11a39598b7ccbe9a19b96652af56..28c28d1acffa76a08056f12dfda244168c37b5f7 100644
--- a/numpy/setup.py
+++ b/numpy/setup.py
@@ -19,10 +19,12 @@ def configuration(parent_package='',top_path=None):
      config.add_subpackage('random')
      config.add_subpackage('testing')
      config.add_subpackage('typing')
+    config.add_subpackage('_typing')
      config.add_data_dir('doc')
      config.add_data_files('py.typed')
      config.add_data_files('*.pyi')
      config.add_subpackage('tests')
+    config.add_subpackage('_pyinstaller')
      config.make_config_py() # installs __config__.py
      return config
  
diff --git a/numpy/testing/__init__.pyi b/numpy/testing/__init__.pyi
index def0f9f583210bf13f40c5f6097145dd996aeba4..a981d61136dc38fd7445ea590f2ee031dd35aa33 100644
--- a/numpy/testing/__init__.pyi
+++ b/numpy/testing/__init__.pyi
@@ -1,5 +1,3 @@
-from typing import List
-
  from numpy._pytesttester import PytestTester
  
  from unittest import (
@@ -48,11 +46,11 @@ from numpy.testing._private.utils import (
      HAS_LAPACK64 as HAS_LAPACK64,
  )
  
-__all__: List[str]
-__path__: List[str]
+__all__: list[str]
+__path__: list[str]
  test: PytestTester
  
  def run_module_suite(
      file_to_run: None | str = ...,
-    argv: None | List[str] = ...,
+    argv: None | list[str] = ...,
  ) -> None: ...
diff --git a/numpy/testing/_private/extbuild.py b/numpy/testing/_private/extbuild.py
index fc39163953d9e8db76465d7a3014e734b8b3534c..b7a071e7f5f14b069ab0bb01561dd30a4855c536 100644
--- a/numpy/testing/_private/extbuild.py
+++ b/numpy/testing/_private/extbuild.py
@@ -25,7 +25,7 @@ def build_and_import_extension(
      functions : list of fragments
          Each fragment is a sequence of func_name, calling convention, snippet.
      prologue : string
-        Code to preceed the rest, usually extra ``#include`` or ``#define``
+        Code to precede the rest, usually extra ``#include`` or ``#define``
          macros.
      build_dir : pathlib.Path
          Where to build the module, usually a temporary directory
diff --git a/numpy/testing/_private/parameterized.py b/numpy/testing/_private/parameterized.py
index db9629a94680b57330fc10cf83c273f2e45fa503..3a29a1811d344720a312aeb85cd8d4e7a4c2529b 100644
--- a/numpy/testing/_private/parameterized.py
+++ b/numpy/testing/_private/parameterized.py
@@ -1,5 +1,5 @@
  """
-tl;dr: all code code is licensed under simplified BSD, unless stated otherwise.
+tl;dr: all code is licensed under simplified BSD, unless stated otherwise.
  
  Unless stated otherwise in the source files, all code is copyright 2010 David
  Wolever <david@wolever.net>. All rights reserved.
diff --git a/numpy/testing/_private/utils.py b/numpy/testing/_private/utils.py
index 0eb945d15cc7b7eacca3482e20fd98ae9e4e3a71..4a8f42e06ef5eb3c12fbace6c64a28f44b1fc4b1 100644
--- a/numpy/testing/_private/utils.py
+++ b/numpy/testing/_private/utils.py
@@ -810,7 +810,7 @@ def assert_array_compare(comparison, x, y, err_msg='', verbose=True, header='',
                  'Mismatched elements: {} / {} ({:.3g}%)'.format(
                      n_mismatch, n_elements, percent_mismatch)]
  
-            with errstate(invalid='ignore', divide='ignore'):
+            with errstate(all='ignore'):
                  # ignore errors for non-numeric types
                  with contextlib.suppress(TypeError):
                      error = abs(x - y)
@@ -1342,9 +1342,6 @@ def assert_raises_regex(exception_class, expected_regexp, *args, **kwargs):
  
      Alternatively, can be used as a context manager like `assert_raises`.
  
-    Name of this function adheres to Python 3.2+ reference, but should work in
-    all versions down to 2.6.
-
      Notes
      -----
      .. versionadded:: 1.9.0
@@ -2518,3 +2515,16 @@ def _no_tracing(func):
              finally:
                  sys.settrace(original_trace)
          return wrapper
+
+
+def _get_glibc_version():
+    try:
+        ver = os.confstr('CS_GNU_LIBC_VERSION').rsplit(' ')[1]
+    except Exception as inst:
+        ver = '0.0'
+
+    return ver
+
+
+_glibcver = _get_glibc_version()
+_glibc_older_than = lambda x: (_glibcver != '0.0' and _glibcver < x)
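Review note on the `_glibc_older_than` helper added in the hunk above: it compares version strings with `<`, which is a lexicographic comparison, so a single-digit minor version such as '2.9' would not be treated as older than '2.17'. A minimal sketch of a numeric comparison follows. The names `_parse_version` and `glibc_older_than` are illustrative and not part of NumPy, and the sketch assumes purely numeric, dot-separated version strings.

```python
def _parse_version(ver: str) -> tuple:
    """Turn a dotted version string such as '2.17' into the tuple (2, 17)."""
    return tuple(int(part) for part in ver.split("."))


def glibc_older_than(current: str, wanted: str) -> bool:
    """Return True if `current` is a detected glibc version older than `wanted`.

    '0.0' is the sentinel the diff uses when the version cannot be detected;
    as in the original lambda, it is never reported as "older".
    """
    if current == "0.0":
        return False
    # Tuples of ints compare component by component, so (2, 9) < (2, 17).
    return _parse_version(current) < _parse_version(wanted)
```

Whether this matters in practice depends on which glibc versions the test suite needs to distinguish. The sketch only shows where string and numeric ordering diverge.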
diff --git a/numpy/testing/_private/utils.pyi b/numpy/testing/_private/utils.pyi
index 4ba5d82ee7bfce1abe327503fe37bc0795815ff7..0be13b7297e0502c83b620ae682bf88258452d7f 100644
--- a/numpy/testing/_private/utils.pyi
+++ b/numpy/testing/_private/utils.pyi
@@ -5,31 +5,25 @@ import types
  import warnings
  import unittest
  import contextlib
+from re import Pattern
+from collections.abc import Callable, Iterable, Sequence
  from typing import (
      Literal as L,
      Any,
      AnyStr,
-    Callable,
      ClassVar,
-    Dict,
-    Iterable,
-    List,
      NoReturn,
      overload,
-    Pattern,
-    Sequence,
-    Set,
-    Tuple,
-    Type,
      type_check_only,
      TypeVar,
      Union,
      Final,
      SupportsIndex,
  )
+from typing_extensions import ParamSpec
  
  from numpy import generic, dtype, number, object_, bool_, _FloatValue
-from numpy.typing import (
+from numpy._typing import (
      NDArray,
      ArrayLike,
      DTypeLike,
@@ -43,6 +37,7 @@ from unittest.case import (
      SkipTest as SkipTest,
  )
  
+_P = ParamSpec("_P")
  _T = TypeVar("_T")
  _ET = TypeVar("_ET", bound=BaseException)
  _FT = TypeVar("_FT", bound=Callable[..., Any])
@@ -59,14 +54,14 @@ _ComparisonFunc = Callable[
      ],
  ]
  
-__all__: List[str]
+__all__: list[str]
  
  class KnownFailureException(Exception): ...
  class IgnoreException(Exception): ...
  
  class clear_and_catch_warnings(warnings.catch_warnings):
-    class_modules: ClassVar[Tuple[types.ModuleType, ...]]
-    modules: Set[types.ModuleType]
+    class_modules: ClassVar[tuple[types.ModuleType, ...]]
+    modules: set[types.ModuleType]
      @overload
      def __new__(
          cls,
@@ -85,10 +80,10 @@ class clear_and_catch_warnings(warnings.catch_warnings):
          record: bool,
          modules: Iterable[types.ModuleType] = ...,
      ) -> clear_and_catch_warnings: ...
-    def __enter__(self) -> None | List[warnings.WarningMessage]: ...
+    def __enter__(self) -> None | list[warnings.WarningMessage]: ...
      def __exit__(
          self,
-        __exc_type: None | Type[BaseException] = ...,
+        __exc_type: None | type[BaseException] = ...,
          __exc_val: None | BaseException = ...,
          __exc_tb: None | types.TracebackType = ...,
      ) -> None: ...
@@ -98,34 +93,34 @@ class clear_and_catch_warnings(warnings.catch_warnings):
  
  @type_check_only
  class _clear_and_catch_warnings_with_records(clear_and_catch_warnings):
-    def __enter__(self) -> List[warnings.WarningMessage]: ...
+    def __enter__(self) -> list[warnings.WarningMessage]: ...
  
  @type_check_only
  class _clear_and_catch_warnings_without_records(clear_and_catch_warnings):
      def __enter__(self) -> None: ...
  
  class suppress_warnings:
-    log: List[warnings.WarningMessage]
+    log: list[warnings.WarningMessage]
      def __init__(
          self,
          forwarding_rule: L["always", "module", "once", "location"] = ...,
      ) -> None: ...
      def filter(
          self,
-        category: Type[Warning] = ...,
+        category: type[Warning] = ...,
          message: str = ...,
          module: None | types.ModuleType = ...,
      ) -> None: ...
      def record(
          self,
-        category: Type[Warning] = ...,
+        category: type[Warning] = ...,
          message: str = ...,
          module: None | types.ModuleType = ...,
-    ) -> List[warnings.WarningMessage]: ...
+    ) -> list[warnings.WarningMessage]: ...
      def __enter__(self: _T) -> _T: ...
      def __exit__(
          self,
-        __exc_type: None | Type[BaseException] = ...,
+        __exc_type: None | type[BaseException] = ...,
          __exc_val: None | BaseException = ...,
          __exc_tb: None | types.TracebackType = ...,
      ) -> None: ...
@@ -151,10 +146,10 @@ else:
  if sys.platform == "linux":
      def jiffies(
          _proc_pid_stat: str | bytes | os.PathLike[Any] = ...,
-        _load_time: List[float] = ...,
+        _load_time: list[float] = ...,
      ) -> int: ...
  else:
-    def jiffies(_load_time: List[float] = ...) -> int: ...
+    def jiffies(_load_time: list[float] = ...) -> int: ...
  
  def build_err_msg(
      arrays: Iterable[object],
@@ -246,7 +241,7 @@ def assert_array_less(
  
  def runstring(
      astr: str | bytes | types.CodeType,
-    dict: None | Dict[str, Any],
+    dict: None | dict[str, Any],
  ) -> Any: ...
  
  def assert_string_equal(actual: str, desired: str) -> None: ...
@@ -256,42 +251,42 @@ def rundocs(
      raise_on_error: bool = ...,
  ) -> None: ...
  
-def raises(*args: Type[BaseException]) -> Callable[[_FT], _FT]: ...
+def raises(*args: type[BaseException]) -> Callable[[_FT], _FT]: ...
  
  @overload
  def assert_raises(  # type: ignore
-    expected_exception: Type[BaseException] | Tuple[Type[BaseException], ...],
-    callable: Callable[..., Any],
+    expected_exception: type[BaseException] | tuple[type[BaseException], ...],
+    callable: Callable[_P, Any],
      /,
-    *args: Any,
-    **kwargs: Any,
+    *args: _P.args,
+    **kwargs: _P.kwargs,
  ) -> None: ...
  @overload
  def assert_raises(
-    expected_exception: Type[_ET] | Tuple[Type[_ET], ...],
+    expected_exception: type[_ET] | tuple[type[_ET], ...],
      *,
      msg: None | str = ...,
  ) -> unittest.case._AssertRaisesContext[_ET]: ...
  
  @overload
  def assert_raises_regex(
-    expected_exception: Type[BaseException] | Tuple[Type[BaseException], ...],
+    expected_exception: type[BaseException] | tuple[type[BaseException], ...],
      expected_regex: str | bytes | Pattern[Any],
-    callable: Callable[..., Any],
+    callable: Callable[_P, Any],
      /,
-    *args: Any,
-    **kwargs: Any,
+    *args: _P.args,
+    **kwargs: _P.kwargs,
  ) -> None: ...
  @overload
  def assert_raises_regex(
-    expected_exception: Type[_ET] | Tuple[Type[_ET], ...],
+    expected_exception: type[_ET] | tuple[type[_ET], ...],
      expected_regex: str | bytes | Pattern[Any],
      *,
      msg: None | str = ...,
  ) -> unittest.case._AssertRaisesContext[_ET]: ...
  
  def decorate_methods(
-    cls: Type[Any],
+    cls: type[Any],
      decorator: Callable[[Callable[..., Any]], Any],
      testmatch: None | str | bytes | Pattern[Any] = ...,
  ) -> None: ...
@@ -338,25 +333,25 @@ def assert_array_max_ulp(
  
  @overload
  def assert_warns(
-    warning_class: Type[Warning],
+    warning_class: type[Warning],
  ) -> contextlib._GeneratorContextManager[None]: ...
  @overload
  def assert_warns(
-    warning_class: Type[Warning],
-    func: Callable[..., _T],
+    warning_class: type[Warning],
+    func: Callable[_P, _T],
      /,
-    *args: Any,
-    **kwargs: Any,
+    *args: _P.args,
+    **kwargs: _P.kwargs,
  ) -> _T: ...
  
  @overload
  def assert_no_warnings() -> contextlib._GeneratorContextManager[None]: ...
  @overload
  def assert_no_warnings(
-    func: Callable[..., _T],
+    func: Callable[_P, _T],
      /,
-    *args: Any,
-    **kwargs: Any,
+    *args: _P.args,
+    **kwargs: _P.kwargs,
  ) -> _T: ...
  
  @overload
@@ -391,10 +386,10 @@ def temppath(
  def assert_no_gc_cycles() -> contextlib._GeneratorContextManager[None]: ...
  @overload
  def assert_no_gc_cycles(
-    func: Callable[..., Any],
+    func: Callable[_P, Any],
      /,
-    *args: Any,
-    **kwargs: Any,
+    *args: _P.args,
+    **kwargs: _P.kwargs,
  ) -> None: ...
  
  def break_cycles() -> None: ...
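Review note on the `ParamSpec` changes in the stub above: replacing `Callable[..., Any]` with `Callable[_P, Any]` and annotating `*args: _P.args, **kwargs: _P.kwargs` lets a static type checker verify that the forwarded arguments match the signature of the wrapped callable. The sketch below illustrates the pattern on a hypothetical helper. `call_without_warnings` and `scale` are made-up names, not NumPy APIs. The stub imports `ParamSpec` from `typing_extensions`; the sketch tries `typing` first, which provides it on Python 3.10 and later.

```python
import warnings
from typing import Callable, TypeVar

try:
    from typing import ParamSpec  # Python 3.10+
except ImportError:  # older interpreters; assumes typing_extensions is installed
    from typing_extensions import ParamSpec

_P = ParamSpec("_P")
_T = TypeVar("_T")


def call_without_warnings(
    func: Callable[_P, _T],
    /,
    *args: _P.args,
    **kwargs: _P.kwargs,
) -> _T:
    """Call `func` and turn any warning it emits into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return func(*args, **kwargs)


def scale(x: float, *, factor: float = 2.0) -> float:
    """Hypothetical function used only to exercise the wrapper."""
    return x * factor


result = call_without_warnings(scale, 3.0, factor=4.0)
```

With these annotations, a type checker that supports `ParamSpec` should flag a call such as `call_without_warnings(scale, "3", factor=4.0)`, which the earlier `*args: Any` form would accept silently. Runtime behavior is unchanged.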
diff --git a/numpy/testing/print_coercion_tables.py b/numpy/testing/print_coercion_tables.py
index 3a447cd2db5eb0dc2c57f2524d1f0ebb2bb43fa9..c1d4cdff8fd0b7e9cb9b539d9a49f3374a098a11 100755
--- a/numpy/testing/print_coercion_tables.py
+++ b/numpy/testing/print_coercion_tables.py
@@ -87,11 +87,12 @@ def print_new_cast_table(*, can_cast=True, legacy=False, flags=False):
      from numpy.core._multiarray_tests import get_all_cast_information
  
      cast_table = {
-        0 : "#",  # No cast (classify as equivalent here)
-        1 : "#",  # equivalent casting
-        2 : "=",  # safe casting
-        3 : "~",  # same-kind casting
-        4 : ".",  # unsafe casting
+        -1: " ",
+        0: "#",  # No cast (classify as equivalent here)
+        1: "#",  # equivalent casting
+        2: "=",  # safe casting
+        3: "~",  # same-kind casting
+        4: ".",  # unsafe casting
      }
      flags_table = {
          0 : "▗", 7: "█",
diff --git a/numpy/testing/tests/test_utils.py b/numpy/testing/tests/test_utils.py
index 31d2cdc76b3e0e445a957254cc0045c6e1373d31..49eeecc8ee03d9cbd7749a8f27a2f92dcc67ab89 100644
--- a/numpy/testing/tests/test_utils.py
+++ b/numpy/testing/tests/test_utils.py
@@ -151,14 +151,13 @@ class TestArrayEqual(_GenericTest):
  
          self._test_equal(a, b)
  
-        c = np.empty(2, [('floupipi', float), ('floupa', float)])
+        c = np.empty(2, [('floupipi', float),
+                         ('floupi', float), ('floupa', float)])
          c['floupipi'] = a['floupi'].copy()
          c['floupa'] = a['floupa'].copy()
  
-        with suppress_warnings() as sup:
-            l = sup.record(FutureWarning, message="elementwise == ")
+        with pytest.raises(TypeError):
              self._test_not_equal(c, b)
-            assert_equal(len(l), 1)
  
      def test_masked_nan_inf(self):
          # Regression test for gh-11121
@@ -207,6 +206,14 @@ class TestArrayEqual(_GenericTest):
          self._test_not_equal(a, b)
          self._test_not_equal(b, a)
  
+    def test_suppress_overflow_warnings(self):
+        # Based on issue #18992
+        with pytest.raises(AssertionError):
+            with np.errstate(all="raise"):
+                np.testing.assert_array_equal(
+                    np.array([1, 2, 3], np.float32),
+                    np.array([1, 1e-40, 3], np.float32))
+
  
  class TestBuildErrorMessage:
  
@@ -1216,7 +1223,7 @@ class TestStringEqual:
                        lambda: assert_string_equal("aaa", "a+b"))
  
  
-def assert_warn_len_equal(mod, n_in_context, py34=None, py37=None):
+def assert_warn_len_equal(mod, n_in_context):
      try:
          mod_warns = mod.__warningregistry__
      except AttributeError:
@@ -1230,26 +1237,15 @@ def assert_warn_len_equal(mod, n_in_context, py34=None, py37=None):
          mod_warns = {}
  
      num_warns = len(mod_warns)
-    # Python 3.4 appears to clear any pre-existing warnings of the same type,
-    # when raising warnings inside a catch_warnings block. So, there is a
-    # warning generated by the tests within the context manager, but no
-    # previous warnings.
+
      if 'version' in mod_warns:
          # Python 3 adds a 'version' entry to the registry,
          # do not count it.
          num_warns -= 1
  
-        # Behavior of warnings is Python version dependent. Adjust the
-        # expected result to compensate. In particular, Python 3.7 does
-        # not make an entry for ignored warnings.
-        if sys.version_info[:2] >= (3, 7):
-            if py37 is not None:
-                n_in_context = py37
-        else:
-            if py34 is not None:
-                n_in_context = py34
      assert_equal(num_warns, n_in_context)
  
+
  def test_warn_len_equal_call_scenarios():
      # assert_warn_len_equal is called under
      # varying circumstances depending on serial
@@ -1298,24 +1294,28 @@ def test_clear_and_catch_warnings():
          warnings.simplefilter('ignore')
          warnings.warn('Some warning')
      assert_equal(my_mod.__warningregistry__, {})
-    # Without specified modules, don't clear warnings during context
-    # Python 3.7 catch_warnings doesn't make an entry for 'ignore'.
+    # Without specified modules, don't clear warnings during context.
+    # catch_warnings doesn't make an entry for 'ignore'.
      with clear_and_catch_warnings():
          warnings.simplefilter('ignore')
          warnings.warn('Some warning')
-    assert_warn_len_equal(my_mod, 1, py37=0)
+    assert_warn_len_equal(my_mod, 0)
+
+    # Manually adding two warnings to the registry:
+    my_mod.__warningregistry__ = {'warning1': 1,
+                                  'warning2': 2}
+
      # Confirm that specifying module keeps old warning, does not add new
      with clear_and_catch_warnings(modules=[my_mod]):
          warnings.simplefilter('ignore')
          warnings.warn('Another warning')
-    assert_warn_len_equal(my_mod, 1, py37=0)
-    # Another warning, no module spec does add to warnings dict, except on
-    # Python 3.4 (see comments in `assert_warn_len_equal`)
-    # Python 3.7 catch_warnings doesn't make an entry for 'ignore'.
+    assert_warn_len_equal(my_mod, 2)
+
+    # Another warning, no module spec it clears up registry
      with clear_and_catch_warnings():
          warnings.simplefilter('ignore')
          warnings.warn('Another warning')
-    assert_warn_len_equal(my_mod, 2, py34=1, py37=0)
+    assert_warn_len_equal(my_mod, 0)
  
  
  def test_suppress_warnings_module():
@@ -1344,7 +1344,7 @@ def test_suppress_warnings_module():
      # got filtered)
      assert_equal(len(sup.log), 1)
      assert_equal(sup.log[0].message.args[0], "Some warning")
-    assert_warn_len_equal(my_mod, 0, py37=0)
+    assert_warn_len_equal(my_mod, 0)
      sup = suppress_warnings()
      # Will have to be changed if apply_along_axis is moved:
      sup.filter(module=my_mod)
@@ -1357,12 +1357,12 @@ def test_suppress_warnings_module():
          warnings.warn('Some warning')
      assert_warn_len_equal(my_mod, 0)
  
-    # Without specified modules, don't clear warnings during context
-    # Python 3.7 does not add ignored warnings.
+    # Without specified modules
      with suppress_warnings():
          warnings.simplefilter('ignore')
          warnings.warn('Some warning')
-    assert_warn_len_equal(my_mod, 1, py37=0)
+    assert_warn_len_equal(my_mod, 0)
+
  
  def test_suppress_warnings_type():
      # Initial state of module, no warnings
@@ -1385,12 +1385,11 @@ def test_suppress_warnings_type():
          warnings.warn('Some warning')
      assert_warn_len_equal(my_mod, 0)
  
-    # Without specified modules, don't clear warnings during context
-    # Python 3.7 does not add ignored warnings.
+    # Without specified modules
      with suppress_warnings():
          warnings.simplefilter('ignore')
          warnings.warn('Some warning')
-    assert_warn_len_equal(my_mod, 1, py37=0)
+    assert_warn_len_equal(my_mod, 0)
  
  
  def test_suppress_warnings_decorate_no_record():
diff --git a/numpy/tests/test_public_api.py b/numpy/tests/test_public_api.py
index bb15e10e8241e8bfdf5f34387ad2cd10b38b017f..e028488d39d448e6b3469bc9a03b3882a5152e64 100644
--- a/numpy/tests/test_public_api.py
+++ b/numpy/tests/test_public_api.py
@@ -252,7 +252,6 @@ PRIVATE_BUT_PRESENT_MODULES = ['numpy.' + s for s in [
      "f2py.crackfortran",
      "f2py.diagnose",
      "f2py.f2py2e",
-    "f2py.f2py_testing",
      "f2py.f90mod_rules",
      "f2py.func2subr",
      "f2py.rules",
@@ -317,6 +316,7 @@ SKIP_LIST = [
      "numpy.core.code_generators.generate_numpy_api",
      "numpy.core.code_generators.generate_ufunc_api",
      "numpy.core.code_generators.numpy_api",
+    "numpy.core.code_generators.generate_umath_doc",
      "numpy.core.cversions",
      "numpy.core.generate_numpy_api",
      "numpy.distutils.msvc9compiler",
diff --git a/numpy/tests/test_scripts.py b/numpy/tests/test_scripts.py
index e67a829471dcfbc860f6f1e17ffd40e640d2609b..5aa8191bbd2d98818a0f0e377f36a2166f644f73 100644
--- a/numpy/tests/test_scripts.py
+++ b/numpy/tests/test_scripts.py
@@ -24,8 +24,8 @@ def find_f2py_commands():
      else:
          # Three scripts are installed in Unix-like systems:
          # 'f2py', 'f2py{major}', and 'f2py{major.minor}'. For example,
-        # if installed with python3.7 the scripts would be named
-        # 'f2py', 'f2py3', and 'f2py3.7'.
+        # if installed with python3.9 the scripts would be named
+        # 'f2py', 'f2py3', and 'f2py3.9'.
          version = sys.version_info
          major = str(version.major)
          minor = str(version.minor)
diff --git a/numpy/typing/__init__.py b/numpy/typing/__init__.py
index a7e4fa5e6eb923ac8cca835a597ee181e88887f0..840b9ca7259d96fdec8945d26a2a77ed25b6f207 100644
--- a/numpy/typing/__init__.py
+++ b/numpy/typing/__init__.py
@@ -155,241 +155,17 @@ API
  # NOTE: The API section will be appended with additional entries
  # further down in this file
  
-from __future__ import annotations
-
-from numpy import ufunc
-from typing import TYPE_CHECKING, final
-
-if not TYPE_CHECKING:
-    __all__ = ["ArrayLike", "DTypeLike", "NBitBase", "NDArray"]
-else:
-    # Ensure that all objects within this module are accessible while
-    # static type checking. This includes private ones, as we need them
-    # for internal use.
-    #
-    # Declare to mypy that `__all__` is a list of strings without assigning
-    # an explicit value
-    __all__: list[str]
-    __path__: list[str]
-
-
-@final  # Disallow the creation of arbitrary `NBitBase` subclasses
-class NBitBase:
-    """
-    A type representing `numpy.number` precision during static type checking.
-
-    Used exclusively for the purpose static type checking, `NBitBase`
-    represents the base of a hierarchical set of subclasses.
-    Each subsequent subclass is herein used for representing a lower level
-    of precision, *e.g.* ``64Bit > 32Bit > 16Bit``.
-
-    .. versionadded:: 1.20
-
-    Examples
-    --------
-    Below is a typical usage example: `NBitBase` is herein used for annotating
-    a function that takes a float and integer of arbitrary precision
-    as arguments and returns a new float of whichever precision is largest
-    (*e.g.* ``np.float16 + np.int64 -> np.float64``).
-
-    .. code-block:: python
-
-        >>> from __future__ import annotations
-        >>> from typing import TypeVar, TYPE_CHECKING
-        >>> import numpy as np
-        >>> import numpy.typing as npt
-
-        >>> T1 = TypeVar("T1", bound=npt.NBitBase)
-        >>> T2 = TypeVar("T2", bound=npt.NBitBase)
-
-        >>> def add(a: np.floating[T1], b: np.integer[T2]) -> np.floating[T1 | T2]:
-        ...     return a + b
-
-        >>> a = np.float16()
-        >>> b = np.int64()
-        >>> out = add(a, b)
-
-        >>> if TYPE_CHECKING:
-        ...     reveal_locals()
-        ...     # note: Revealed local types are:
-        ...     # note:     a: numpy.floating[numpy.typing._16Bit*]
-        ...     # note:     b: numpy.signedinteger[numpy.typing._64Bit*]
-        ...     # note:     out: numpy.floating[numpy.typing._64Bit*]
-
-    """
-
-    def __init_subclass__(cls) -> None:
-        allowed_names = {
-            "NBitBase", "_256Bit", "_128Bit", "_96Bit", "_80Bit",
-            "_64Bit", "_32Bit", "_16Bit", "_8Bit",
-        }
-        if cls.__name__ not in allowed_names:
-            raise TypeError('cannot inherit from final class "NBitBase"')
-        super().__init_subclass__()
-
-
-# Silence errors about subclassing a `@final`-decorated class
-class _256Bit(NBitBase):  # type: ignore[misc]
-    pass
-
-class _128Bit(_256Bit):  # type: ignore[misc]
-    pass
-
-class _96Bit(_128Bit):  # type: ignore[misc]
-    pass
-
-class _80Bit(_96Bit):  # type: ignore[misc]
-    pass
-
-class _64Bit(_80Bit):  # type: ignore[misc]
-    pass
-
-class _32Bit(_64Bit):  # type: ignore[misc]
-    pass
-
-class _16Bit(_32Bit):  # type: ignore[misc]
-    pass
-
-class _8Bit(_16Bit):  # type: ignore[misc]
-    pass
-
-
-from ._nested_sequence import (
-    _NestedSequence as _NestedSequence,
-)
-from ._nbit import (
-    _NBitByte as _NBitByte,
-    _NBitShort as _NBitShort,
-    _NBitIntC as _NBitIntC,
-    _NBitIntP as _NBitIntP,
-    _NBitInt as _NBitInt,
-    _NBitLongLong as _NBitLongLong,
-    _NBitHalf as _NBitHalf,
-    _NBitSingle as _NBitSingle,
-    _NBitDouble as _NBitDouble,
-    _NBitLongDouble as _NBitLongDouble,
-)
-from ._char_codes import (
-    _BoolCodes as _BoolCodes,
-    _UInt8Codes as _UInt8Codes,
-    _UInt16Codes as _UInt16Codes,
-    _UInt32Codes as _UInt32Codes,
-    _UInt64Codes as _UInt64Codes,
-    _Int8Codes as _Int8Codes,
-    _Int16Codes as _Int16Codes,
-    _Int32Codes as _Int32Codes,
-    _Int64Codes as _Int64Codes,
-    _Float16Codes as _Float16Codes,
-    _Float32Codes as _Float32Codes,
-    _Float64Codes as _Float64Codes,
-    _Complex64Codes as _Complex64Codes,
-    _Complex128Codes as _Complex128Codes,
-    _ByteCodes as _ByteCodes,
-    _ShortCodes as _ShortCodes,
-    _IntCCodes as _IntCCodes,
-    _IntPCodes as _IntPCodes,
-    _IntCodes as _IntCodes,
-    _LongLongCodes as _LongLongCodes,
-    _UByteCodes as _UByteCodes,
-    _UShortCodes as _UShortCodes,
-    _UIntCCodes as _UIntCCodes,
-    _UIntPCodes as _UIntPCodes,
-    _UIntCodes as _UIntCodes,
-    _ULongLongCodes as _ULongLongCodes,
-    _HalfCodes as _HalfCodes,
-    _SingleCodes as _SingleCodes,
-    _DoubleCodes as _DoubleCodes,
-    _LongDoubleCodes as _LongDoubleCodes,
-    _CSingleCodes as _CSingleCodes,
-    _CDoubleCodes as _CDoubleCodes,
-    _CLongDoubleCodes as _CLongDoubleCodes,
-    _DT64Codes as _DT64Codes,
-    _TD64Codes as _TD64Codes,
-    _StrCodes as _StrCodes,
-    _BytesCodes as _BytesCodes,
-    _VoidCodes as _VoidCodes,
-    _ObjectCodes as _ObjectCodes,
-)
-from ._scalars import (
-    _CharLike_co as _CharLike_co,
-    _BoolLike_co as _BoolLike_co,
-    _UIntLike_co as _UIntLike_co,
-    _IntLike_co as _IntLike_co,
-    _FloatLike_co as _FloatLike_co,
-    _ComplexLike_co as _ComplexLike_co,
-    _TD64Like_co as _TD64Like_co,
-    _NumberLike_co as _NumberLike_co,
-    _ScalarLike_co as _ScalarLike_co,
-    _VoidLike_co as _VoidLike_co,
-)
-from ._shape import (
-    _Shape as _Shape,
-    _ShapeLike as _ShapeLike,
-)
-from ._dtype_like import (
-    DTypeLike as DTypeLike,
-    _SupportsDType as _SupportsDType,
-    _VoidDTypeLike as _VoidDTypeLike,
-    _DTypeLikeBool as _DTypeLikeBool,
-    _DTypeLikeUInt as _DTypeLikeUInt,
-    _DTypeLikeInt as _DTypeLikeInt,
-    _DTypeLikeFloat as _DTypeLikeFloat,
-    _DTypeLikeComplex as _DTypeLikeComplex,
-    _DTypeLikeTD64 as _DTypeLikeTD64,
-    _DTypeLikeDT64 as _DTypeLikeDT64,
-    _DTypeLikeObject as _DTypeLikeObject,
-    _DTypeLikeVoid as _DTypeLikeVoid,
-    _DTypeLikeStr as _DTypeLikeStr,
-    _DTypeLikeBytes as _DTypeLikeBytes,
-    _DTypeLikeComplex_co as _DTypeLikeComplex_co,
-)
-from ._array_like import (
-    ArrayLike as ArrayLike,
-    _ArrayLike as _ArrayLike,
-    _FiniteNestedSequence as _FiniteNestedSequence,
-    _SupportsArray as _SupportsArray,
-    _ArrayLikeInt as _ArrayLikeInt,
-    _ArrayLikeBool_co as _ArrayLikeBool_co,
-    _ArrayLikeUInt_co as _ArrayLikeUInt_co,
-    _ArrayLikeInt_co as _ArrayLikeInt_co,
-    _ArrayLikeFloat_co as _ArrayLikeFloat_co,
-    _ArrayLikeComplex_co as _ArrayLikeComplex_co,
-    _ArrayLikeNumber_co as _ArrayLikeNumber_co,
-    _ArrayLikeTD64_co as _ArrayLikeTD64_co,
-    _ArrayLikeDT64_co as _ArrayLikeDT64_co,
-    _ArrayLikeObject_co as _ArrayLikeObject_co,
-    _ArrayLikeVoid_co as _ArrayLikeVoid_co,
-    _ArrayLikeStr_co as _ArrayLikeStr_co,
-    _ArrayLikeBytes_co as _ArrayLikeBytes_co,
-)
-from ._generic_alias import (
-    NDArray as NDArray,
-    _DType as _DType,
-    _GenericAlias as _GenericAlias,
+from numpy._typing import (
+    ArrayLike,
+    DTypeLike,
+    NBitBase,
+    NDArray,
  )
  
-if TYPE_CHECKING:
-    from ._ufunc import (
-        _UFunc_Nin1_Nout1 as _UFunc_Nin1_Nout1,
-        _UFunc_Nin2_Nout1 as _UFunc_Nin2_Nout1,
-        _UFunc_Nin1_Nout2 as _UFunc_Nin1_Nout2,
-        _UFunc_Nin2_Nout2 as _UFunc_Nin2_Nout2,
-        _GUFunc_Nin2_Nout1 as _GUFunc_Nin2_Nout1,
-    )
-else:
-    # Declare the (type-check-only) ufunc subclasses as ufunc aliases during
-    # runtime; this helps autocompletion tools such as Jedi (numpy/numpy#19834)
-    _UFunc_Nin1_Nout1 = ufunc
-    _UFunc_Nin2_Nout1 = ufunc
-    _UFunc_Nin1_Nout2 = ufunc
-    _UFunc_Nin2_Nout2 = ufunc
-    _GUFunc_Nin2_Nout1 = ufunc
-
-# Clean up the namespace
-del TYPE_CHECKING, final, ufunc
+__all__ = ["ArrayLike", "DTypeLike", "NBitBase", "NDArray"]
  
  if __doc__ is not None:
-    from ._add_docstring import _docstrings
+    from numpy._typing._add_docstring import _docstrings
      __doc__ += _docstrings
      __doc__ += '\n.. autoclass:: numpy.typing.NBitBase\n'
      del _docstrings
diff --git a/numpy/typing/_add_docstring.py b/numpy/typing/_add_docstring.py
deleted file mode 100644
index 10d77f5..0000000
--- a/numpy/typing/_add_docstring.py
+++ /dev/null
@@ -1,152 +0,0 @@
-"""A module for creating docstrings for sphinx ``data`` domains."""
-
-import re
-import textwrap
-
-from ._generic_alias import NDArray
-
-_docstrings_list = []
-
-
-def add_newdoc(name: str, value: str, doc: str) -> None:
-    """Append ``_docstrings_list`` with a docstring for `name`.
-
-    Parameters
-    ----------
-    name : str
-        The name of the object.
-    value : str
-        A string-representation of the object.
-    doc : str
-        The docstring of the object.
-
-    """
-    _docstrings_list.append((name, value, doc))
-
-
-def _parse_docstrings() -> str:
-    """Convert all docstrings in ``_docstrings_list`` into a single
-    sphinx-legible text block.
-
-    """
-    type_list_ret = []
-    for name, value, doc in _docstrings_list:
-        s = textwrap.dedent(doc).replace("\n", "\n    ")
-
-        # Replace sections by rubrics
-        lines = s.split("\n")
-        new_lines = []
-        indent = ""
-        for line in lines:
-            m = re.match(r'^(\s+)[-=]+\s*$', line)
-            if m and new_lines:
-                prev = textwrap.dedent(new_lines.pop())
-                if prev == "Examples":
-                    indent = ""
-                    new_lines.append(f'{m.group(1)}.. rubric:: {prev}')
-                else:
-                    indent = 4 * " "
-                    new_lines.append(f'{m.group(1)}.. admonition:: {prev}')
-                new_lines.append("")
-            else:
-                new_lines.append(f"{indent}{line}")
-
-        s = "\n".join(new_lines)
-        s_block = f""".. data:: {name}\n    :value: {value}\n    {s}"""
-        type_list_ret.append(s_block)
-    return "\n".join(type_list_ret)
-
-
-add_newdoc('ArrayLike', 'typing.Union[...]',
-    """
-    A `~typing.Union` representing objects that can be coerced
-    into an `~numpy.ndarray`.
-
-    Among others this includes the likes of:
-
-    * Scalars.
-    * (Nested) sequences.
-    * Objects implementing the `~class.__array__` protocol.
-
-    .. versionadded:: 1.20
-
-    See Also
-    --------
-    :term:`array_like`:
-        Any scalar or sequence that can be interpreted as an ndarray.
-
-    Examples
-    --------
-    .. code-block:: python
-
-        >>> import numpy as np
-        >>> import numpy.typing as npt
-
-        >>> def as_array(a: npt.ArrayLike) -> np.ndarray:
-        ...     return np.array(a)
-
-    """)
-
-add_newdoc('DTypeLike', 'typing.Union[...]',
-    """
-    A `~typing.Union` representing objects that can be coerced
-    into a `~numpy.dtype`.
-
-    Among others this includes the likes of:
-
-    * :class:`type` objects.
-    * Character codes or the names of :class:`type` objects.
-    * Objects with the ``.dtype`` attribute.
-
-    .. versionadded:: 1.20
-
-    See Also
-    --------
-    :ref:`Specifying and constructing data types <arrays.dtypes.constructing>`
-        A comprehensive overview of all objects that can be coerced
-        into data types.
-
-    Examples
-    --------
-    .. code-block:: python
-
-        >>> import numpy as np
-        >>> import numpy.typing as npt
-
-        >>> def as_dtype(d: npt.DTypeLike) -> np.dtype:
-        ...     return np.dtype(d)
-
-    """)
-
-add_newdoc('NDArray', repr(NDArray),
-    """
-    A :term:`generic <generic type>` version of
-    `np.ndarray[Any, np.dtype[+ScalarType]] <numpy.ndarray>`.
-
-    Can be used during runtime for typing arrays with a given dtype
-    and unspecified shape.
-
-    .. versionadded:: 1.21
-
-    Examples
-    --------
-    .. code-block:: python
-
-        >>> import numpy as np
-        >>> import numpy.typing as npt
-
-        >>> print(npt.NDArray)
-        numpy.ndarray[typing.Any, numpy.dtype[+ScalarType]]
-
-        >>> print(npt.NDArray[np.float64])
-        numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]
-
-        >>> NDArrayInt = npt.NDArray[np.int_]
-        >>> a: NDArrayInt = np.arange(10)
-
-        >>> def func(a: npt.ArrayLike) -> npt.NDArray[Any]:
-        ...     return np.array(a)
-
-    """)
-
-_docstrings = _parse_docstrings()
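Note on the removed `_parse_docstrings` helper: its inner loop turns numpydoc-style underlined section headers into Sphinx directives. The block below is a minimal standalone sketch of that idea, not the NumPy implementation. The name `sections_to_directives` is hypothetical, and the regular expression is relaxed (an assumption) so that unindented underlines also match, which the removed code does not do.

```python
import re
import textwrap


def sections_to_directives(doc: str) -> str:
    """Rewrite underlined section headers as Sphinx directives (sketch)."""
    lines = textwrap.dedent(doc).split("\n")
    out = []
    indent = ""
    for line in lines:
        # An underline is a line made up only of '-' or '=' characters.
        # Assumption: leading whitespace is optional here.
        m = re.match(r"^(\s*)[-=]+\s*$", line)
        if m and out:
            # The line just before the underline is the section title.
            title = out.pop().strip()
            if title == "Examples":
                indent = ""
                out.append(f"{m.group(1)}.. rubric:: {title}")
            else:
                # Admonition bodies must be indented under the directive.
                indent = 4 * " "
                out.append(f"{m.group(1)}.. admonition:: {title}")
            out.append("")
        else:
            out.append(f"{indent}{line}")
    return "\n".join(out)
```

As in the removed code, "Examples" becomes a rubric with an unindented body, while every other section becomes an admonition whose body is indented by four spaces.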
diff --git a/numpy/typing/_array_like.py b/numpy/typing/_array_like.py
deleted file mode 100644
index 02e5ee5..0000000
--- a/numpy/typing/_array_like.py
+++ /dev/null
@@ -1,123 +0,0 @@
-from __future__ import annotations
-
-from typing import Any, Sequence, Protocol, Union, TypeVar
-from numpy import (
-    ndarray,
-    dtype,
-    generic,
-    bool_,
-    unsignedinteger,
-    integer,
-    floating,
-    complexfloating,
-    number,
-    timedelta64,
-    datetime64,
-    object_,
-    void,
-    str_,
-    bytes_,
-)
-from ._nested_sequence import _NestedSequence
-
-_T = TypeVar("_T")
-_ScalarType = TypeVar("_ScalarType", bound=generic)
-_DType = TypeVar("_DType", bound="dtype[Any]")
-_DType_co = TypeVar("_DType_co", covariant=True, bound="dtype[Any]")
-
-# The `_SupportsArray` protocol only cares about the default dtype
-# (i.e. `dtype=None` or no `dtype` parameter at all) of the to-be returned
-# array.
-# Concrete implementations of the protocol are responsible for adding
-# any and all remaining overloads
-class _SupportsArray(Protocol[_DType_co]):
-    def __array__(self) -> ndarray[Any, _DType_co]: ...
-
-
-# TODO: Wait until mypy supports recursive objects in combination with typevars
-_FiniteNestedSequence = Union[
-    _T,
-    Sequence[_T],
-    Sequence[Sequence[_T]],
-    Sequence[Sequence[Sequence[_T]]],
-    Sequence[Sequence[Sequence[Sequence[_T]]]],
-]
-
-# A union representing array-like objects; consists of two typevars:
-# One representing types that can be parametrized w.r.t. `np.dtype`
-# and another one for the rest
-_ArrayLike = Union[
-    _SupportsArray[_DType],
-    _NestedSequence[_SupportsArray[_DType]],
-    _T,
-    _NestedSequence[_T],
-]
-
-# TODO: support buffer protocols once
-#
-# https://bugs.python.org/issue27501
-#
-# is resolved. See also the python/typing issue:
-#
-# https://github.com/python/typing/issues/593
-ArrayLike = _ArrayLike[
-    dtype,
-    Union[bool, int, float, complex, str, bytes],
-]
-
-# `ArrayLike<X>_co`: array-like objects that can be coerced into `X`
-# given the casting rules `same_kind`
-_ArrayLikeBool_co = _ArrayLike[
-    "dtype[bool_]",
-    bool,
-]
-_ArrayLikeUInt_co = _ArrayLike[
-    "dtype[Union[bool_, unsignedinteger[Any]]]",
-    bool,
-]
-_ArrayLikeInt_co = _ArrayLike[
-    "dtype[Union[bool_, integer[Any]]]",
-    Union[bool, int],
-]
-_ArrayLikeFloat_co = _ArrayLike[
-    "dtype[Union[bool_, integer[Any], floating[Any]]]",
-    Union[bool, int, float],
-]
-_ArrayLikeComplex_co = _ArrayLike[
-    "dtype[Union[bool_, integer[Any], floating[Any], complexfloating[Any, Any]]]",
-    Union[bool, int, float, complex],
-]
-_ArrayLikeNumber_co = _ArrayLike[
-    "dtype[Union[bool_, number[Any]]]",
-    Union[bool, int, float, complex],
-]
-_ArrayLikeTD64_co = _ArrayLike[
-    "dtype[Union[bool_, integer[Any], timedelta64]]",
-    Union[bool, int],
-]
-_ArrayLikeDT64_co = Union[
-    _SupportsArray["dtype[datetime64]"],
-    _NestedSequence[_SupportsArray["dtype[datetime64]"]],
-]
-_ArrayLikeObject_co = Union[
-    _SupportsArray["dtype[object_]"],
-    _NestedSequence[_SupportsArray["dtype[object_]"]],
-]
-
-_ArrayLikeVoid_co = Union[
-    _SupportsArray["dtype[void]"],
-    _NestedSequence[_SupportsArray["dtype[void]"]],
-]
-_ArrayLikeStr_co = _ArrayLike[
-    "dtype[str_]",
-    str,
-]
-_ArrayLikeBytes_co = _ArrayLike[
-    "dtype[bytes_]",
-    bytes,
-]
-
-_ArrayLikeInt = _ArrayLike[
-    "dtype[integer[Any]]",
-    int,
-]
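Note on the removed `_FiniteNestedSequence` alias: it approximates a recursive type by spelling out a fixed number of nesting levels in a `Union`. The block below is a rough sketch of the same pattern using only the standard library. The names `Nested2` and `flatten` are hypothetical, and the alias stops at two levels for brevity.

```python
from typing import List, Sequence, TypeVar, Union

T = TypeVar("T")

# Hypothetical two-level analogue of the removed four-level union.
Nested2 = Union[T, Sequence[T], Sequence[Sequence[T]]]


def flatten(obj: Nested2[int]) -> List[int]:
    """Collect the integers from a scalar or a nested sequence."""
    if isinstance(obj, int):
        return [obj]
    out: List[int] = []
    for item in obj:
        out.extend(flatten(item))
    return out
```

Subscripting the alias (`Nested2[int]`) substitutes the type variable at every level, which is how the removed `_ArrayLike[...]` aliases are specialized as well.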
diff --git a/numpy/typing/_callable.pyi b/numpy/typing/_callable.pyi
deleted file mode 100644
index e1149f2..0000000
--- a/numpy/typing/_callable.pyi
+++ /dev/null
@@ -1,327 +0,0 @@
-"""
-A module with various ``typing.Protocol`` subclasses that implement
-the ``__call__`` magic method.
-
-See the `Mypy documentation`_ on protocols for more details.
-
-.. _`Mypy documentation`: https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols
-
-"""
-
-from __future__ import annotations
-
-from typing import (
-    Union,
-    TypeVar,
-    overload,
-    Any,
-    Tuple,
-    NoReturn,
-    Protocol,
-)
-
-from numpy import (
-    ndarray,
-    dtype,
-    generic,
-    bool_,
-    timedelta64,
-    number,
-    integer,
-    unsignedinteger,
-    signedinteger,
-    int8,
-    int_,
-    floating,
-    float64,
-    complexfloating,
-    complex128,
-)
-from ._nbit import _NBitInt, _NBitDouble
-from ._scalars import (
-    _BoolLike_co,
-    _IntLike_co,
-    _FloatLike_co,
-    _NumberLike_co,
-)
-from . import NBitBase
-from ._generic_alias import NDArray
-
-_T1 = TypeVar("_T1")
-_T2 = TypeVar("_T2")
-_T1_contra = TypeVar("_T1_contra", contravariant=True)
-_T2_contra = TypeVar("_T2_contra", contravariant=True)
-_2Tuple = Tuple[_T1, _T1]
-
-_NBit1 = TypeVar("_NBit1", bound=NBitBase)
-_NBit2 = TypeVar("_NBit2", bound=NBitBase)
-
-_IntType = TypeVar("_IntType", bound=integer)
-_FloatType = TypeVar("_FloatType", bound=floating)
-_NumberType = TypeVar("_NumberType", bound=number)
-_NumberType_co = TypeVar("_NumberType_co", covariant=True, bound=number)
-_GenericType_co = TypeVar("_GenericType_co", covariant=True, bound=generic)
-
-class _BoolOp(Protocol[_GenericType_co]):
-    @overload
-    def __call__(self, other: _BoolLike_co, /) -> _GenericType_co: ...
-    @overload  # platform dependent
-    def __call__(self, other: int, /) -> int_: ...
-    @overload
-    def __call__(self, other: float, /) -> float64: ...
-    @overload
-    def __call__(self, other: complex, /) -> complex128: ...
-    @overload
-    def __call__(self, other: _NumberType, /) -> _NumberType: ...
-
-class _BoolBitOp(Protocol[_GenericType_co]):
-    @overload
-    def __call__(self, other: _BoolLike_co, /) -> _GenericType_co: ...
-    @overload  # platform dependent
-    def __call__(self, other: int, /) -> int_: ...
-    @overload
-    def __call__(self, other: _IntType, /) -> _IntType: ...
-
-class _BoolSub(Protocol):
-    # Note that `other: bool_` is absent here
-    @overload
-    def __call__(self, other: bool, /) -> NoReturn: ...
-    @overload  # platform dependent
-    def __call__(self, other: int, /) -> int_: ...
-    @overload
-    def __call__(self, other: float, /) -> float64: ...
-    @overload
-    def __call__(self, other: complex, /) -> complex128: ...
-    @overload
-    def __call__(self, other: _NumberType, /) -> _NumberType: ...
-
-class _BoolTrueDiv(Protocol):
-    @overload
-    def __call__(self, other: float | _IntLike_co, /) -> float64: ...
-    @overload
-    def __call__(self, other: complex, /) -> complex128: ...
-    @overload
-    def __call__(self, other: _NumberType, /) -> _NumberType: ...
-
-class _BoolMod(Protocol):
-    @overload
-    def __call__(self, other: _BoolLike_co, /) -> int8: ...
-    @overload  # platform dependent
-    def __call__(self, other: int, /) -> int_: ...
-    @overload
-    def __call__(self, other: float, /) -> float64: ...
-    @overload
-    def __call__(self, other: _IntType, /) -> _IntType: ...
-    @overload
-    def __call__(self, other: _FloatType, /) -> _FloatType: ...
-
-class _BoolDivMod(Protocol):
-    @overload
-    def __call__(self, other: _BoolLike_co, /) -> _2Tuple[int8]: ...
-    @overload  # platform dependent
-    def __call__(self, other: int, /) -> _2Tuple[int_]: ...
-    @overload
-    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
-    @overload
-    def __call__(self, other: _IntType, /) -> _2Tuple[_IntType]: ...
-    @overload
-    def __call__(self, other: _FloatType, /) -> _2Tuple[_FloatType]: ...
-
-class _TD64Div(Protocol[_NumberType_co]):
-    @overload
-    def __call__(self, other: timedelta64, /) -> _NumberType_co: ...
-    @overload
-    def __call__(self, other: _BoolLike_co, /) -> NoReturn: ...
-    @overload
-    def __call__(self, other: _FloatLike_co, /) -> timedelta64: ...
-
-class _IntTrueDiv(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> floating[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> floating[_NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: complex, /,
-    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(self, other: integer[_NBit2], /) -> floating[_NBit1 | _NBit2]: ...
-
-class _UnsignedIntOp(Protocol[_NBit1]):
-    # NOTE: `uint64 + signedinteger -> float64`
-    @overload
-    def __call__(self, other: bool, /) -> unsignedinteger[_NBit1]: ...
-    @overload
-    def __call__(
-        self, other: int | signedinteger[Any], /
-    ) -> Any: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: complex, /,
-    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: unsignedinteger[_NBit2], /
-    ) -> unsignedinteger[_NBit1 | _NBit2]: ...
-
-class _UnsignedIntBitOp(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> unsignedinteger[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> signedinteger[Any]: ...
-    @overload
-    def __call__(self, other: signedinteger[Any], /) -> signedinteger[Any]: ...
-    @overload
-    def __call__(
-        self, other: unsignedinteger[_NBit2], /
-    ) -> unsignedinteger[_NBit1 | _NBit2]: ...
-
-class _UnsignedIntMod(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> unsignedinteger[_NBit1]: ...
-    @overload
-    def __call__(
-        self, other: int | signedinteger[Any], /
-    ) -> Any: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: unsignedinteger[_NBit2], /
-    ) -> unsignedinteger[_NBit1 | _NBit2]: ...
-
-class _UnsignedIntDivMod(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> _2Tuple[signedinteger[_NBit1]]: ...
-    @overload
-    def __call__(
-        self, other: int | signedinteger[Any], /
-    ) -> _2Tuple[Any]: ...
-    @overload
-    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
-    @overload
-    def __call__(
-        self, other: unsignedinteger[_NBit2], /
-    ) -> _2Tuple[unsignedinteger[_NBit1 | _NBit2]]: ...
-
-class _SignedIntOp(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> signedinteger[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> signedinteger[_NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: complex, /,
-    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: signedinteger[_NBit2], /,
-    ) -> signedinteger[_NBit1 | _NBit2]: ...
-
-class _SignedIntBitOp(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> signedinteger[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> signedinteger[_NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(
-        self, other: signedinteger[_NBit2], /,
-    ) -> signedinteger[_NBit1 | _NBit2]: ...
-
-class _SignedIntMod(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> signedinteger[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> signedinteger[_NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: signedinteger[_NBit2], /,
-    ) -> signedinteger[_NBit1 | _NBit2]: ...
-
-class _SignedIntDivMod(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> _2Tuple[signedinteger[_NBit1]]: ...
-    @overload
-    def __call__(self, other: int, /) -> _2Tuple[signedinteger[_NBit1 | _NBitInt]]: ...
-    @overload
-    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
-    @overload
-    def __call__(
-        self, other: signedinteger[_NBit2], /,
-    ) -> _2Tuple[signedinteger[_NBit1 | _NBit2]]: ...
-
-class _FloatOp(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> floating[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> floating[_NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: complex, /,
-    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: integer[_NBit2] | floating[_NBit2], /
-    ) -> floating[_NBit1 | _NBit2]: ...
-
-class _FloatMod(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> floating[_NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> floating[_NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(self, other: float, /) -> floating[_NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self, other: integer[_NBit2] | floating[_NBit2], /
-    ) -> floating[_NBit1 | _NBit2]: ...
-
-class _FloatDivMod(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> _2Tuple[floating[_NBit1]]: ...
-    @overload
-    def __call__(self, other: int, /) -> _2Tuple[floating[_NBit1 | _NBitInt]]: ...
-    @overload
-    def __call__(self, other: float, /) -> _2Tuple[floating[_NBit1 | _NBitDouble]]: ...
-    @overload
-    def __call__(
-        self, other: integer[_NBit2] | floating[_NBit2], /
-    ) -> _2Tuple[floating[_NBit1 | _NBit2]]: ...
-
-class _ComplexOp(Protocol[_NBit1]):
-    @overload
-    def __call__(self, other: bool, /) -> complexfloating[_NBit1, _NBit1]: ...
-    @overload
-    def __call__(self, other: int, /) -> complexfloating[_NBit1 | _NBitInt, _NBit1 | _NBitInt]: ...
-    @overload
-    def __call__(
-        self, other: complex, /,
-    ) -> complexfloating[_NBit1 | _NBitDouble, _NBit1 | _NBitDouble]: ...
-    @overload
-    def __call__(
-        self,
-        other: Union[
-            integer[_NBit2],
-            floating[_NBit2],
-            complexfloating[_NBit2, _NBit2],
-        ], /,
-    ) -> complexfloating[_NBit1 | _NBit2, _NBit1 | _NBit2]: ...
-
-class _NumberOp(Protocol):
-    def __call__(self, other: _NumberLike_co, /) -> Any: ...
-
-class _ComparisonOp(Protocol[_T1_contra, _T2_contra]):
-    @overload
-    def __call__(self, other: _T1_contra, /) -> bool_: ...
-    @overload
-    def __call__(self, other: _T2_contra, /) -> NDArray[bool_]: ...
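Note on the removed callback protocols: each class in this stub describes a callable through overloaded `__call__` signatures, so the return type depends on the argument type. The block below is a minimal sketch of that pattern with plain Python numbers. `_ScaleOp` and `Doubler` are hypothetical names and do not appear in NumPy.

```python
from typing import Protocol, overload


class _ScaleOp(Protocol):
    # Each overload maps one argument type to one return type;
    # the trailing "/" makes `other` positional-only.
    @overload
    def __call__(self, other: int, /) -> int: ...
    @overload
    def __call__(self, other: float, /) -> float: ...


class Doubler:
    """Any object with a compatible __call__ satisfies the protocol."""

    def __call__(self, other, /):
        return other * 2


op: _ScaleOp = Doubler()
```

Matching is structural: `Doubler` never inherits from `_ScaleOp`, and a static type checker accepts the assignment based on the signature of `__call__` alone.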
diff --git a/numpy/typing/_char_codes.py b/numpy/typing/_char_codes.py
deleted file mode 100644
index 1394710..0000000
--- a/numpy/typing/_char_codes.py
+++ /dev/null
@@ -1,111 +0,0 @@
-from typing import Literal
-
-_BoolCodes = Literal["?", "=?", "<?", ">?", "bool", "bool_", "bool8"]
-
-_UInt8Codes = Literal["uint8", "u1", "=u1", "<u1", ">u1"]
-_UInt16Codes = Literal["uint16", "u2", "=u2", "<u2", ">u2"]
-_UInt32Codes = Literal["uint32", "u4", "=u4", "<u4", ">u4"]
-_UInt64Codes = Literal["uint64", "u8", "=u8", "<u8", ">u8"]
-
-_Int8Codes = Literal["int8", "i1", "=i1", "<i1", ">i1"]
-_Int16Codes = Literal["int16", "i2", "=i2", "<i2", ">i2"]
-_Int32Codes = Literal["int32", "i4", "=i4", "<i4", ">i4"]
-_Int64Codes = Literal["int64", "i8", "=i8", "<i8", ">i8"]
-
-_Float16Codes = Literal["float16", "f2", "=f2", "<f2", ">f2"]
-_Float32Codes = Literal["float32", "f4", "=f4", "<f4", ">f4"]
-_Float64Codes = Literal["float64", "f8", "=f8", "<f8", ">f8"]
-
-_Complex64Codes = Literal["complex64", "c8", "=c8", "<c8", ">c8"]
-_Complex128Codes = Literal["complex128", "c16", "=c16", "<c16", ">c16"]
-
-_ByteCodes = Literal["byte", "b", "=b", "<b", ">b"]
-_ShortCodes = Literal["short", "h", "=h", "<h", ">h"]
-_IntCCodes = Literal["intc", "i", "=i", "<i", ">i"]
-_IntPCodes = Literal["intp", "int0", "p", "=p", "<p", ">p"]
-_IntCodes = Literal["long", "int", "int_", "l", "=l", "<l", ">l"]
-_LongLongCodes = Literal["longlong", "q", "=q", "<q", ">q"]
-
-_UByteCodes = Literal["ubyte", "B", "=B", "<B", ">B"]
-_UShortCodes = Literal["ushort", "H", "=H", "<H", ">H"]
-_UIntCCodes = Literal["uintc", "I", "=I", "<I", ">I"]
-_UIntPCodes = Literal["uintp", "uint0", "P", "=P", "<P", ">P"]
-_UIntCodes = Literal["uint", "L", "=L", "<L", ">L"]
-_ULongLongCodes = Literal["ulonglong", "Q", "=Q", "<Q", ">Q"]
-
-_HalfCodes = Literal["half", "e", "=e", "<e", ">e"]
-_SingleCodes = Literal["single", "f", "=f", "<f", ">f"]
-_DoubleCodes = Literal["double", "float", "float_", "d", "=d", "<d", ">d"]
-_LongDoubleCodes = Literal["longdouble", "longfloat", "g", "=g", "<g", ">g"]
-
-_CSingleCodes = Literal["csingle", "singlecomplex", "F", "=F", "<F", ">F"]
-_CDoubleCodes = Literal["cdouble", "complex", "complex_", "cfloat", "D", "=D", "<D", ">D"]
-_CLongDoubleCodes = Literal["clongdouble", "clongfloat", "longcomplex", "G", "=G", "<G", ">G"]
-
-_StrCodes = Literal["str", "str_", "str0", "unicode", "unicode_", "U", "=U", "<U", ">U"]
-_BytesCodes = Literal["bytes", "bytes_", "bytes0", "S", "=S", "<S", ">S"]
-_VoidCodes = Literal["void", "void0", "V", "=V", "<V", ">V"]
-_ObjectCodes = Literal["object", "object_", "O", "=O", "<O", ">O"]
-
-_DT64Codes = Literal[
-    "datetime64", "=datetime64", "<datetime64", ">datetime64",
-    "datetime64[Y]", "=datetime64[Y]", "<datetime64[Y]", ">datetime64[Y]",
-    "datetime64[M]", "=datetime64[M]", "<datetime64[M]", ">datetime64[M]",
-    "datetime64[W]", "=datetime64[W]", "<datetime64[W]", ">datetime64[W]",
-    "datetime64[D]", "=datetime64[D]", "<datetime64[D]", ">datetime64[D]",
-    "datetime64[h]", "=datetime64[h]", "<datetime64[h]", ">datetime64[h]",
-    "datetime64[m]", "=datetime64[m]", "<datetime64[m]", ">datetime64[m]",
-    "datetime64[s]", "=datetime64[s]", "<datetime64[s]", ">datetime64[s]",
-    "datetime64[ms]", "=datetime64[ms]", "<datetime64[ms]", ">datetime64[ms]",
-    "datetime64[us]", "=datetime64[us]", "<datetime64[us]", ">datetime64[us]",
-    "datetime64[ns]", "=datetime64[ns]", "<datetime64[ns]", ">datetime64[ns]",
-    "datetime64[ps]", "=datetime64[ps]", "<datetime64[ps]", ">datetime64[ps]",
-    "datetime64[fs]", "=datetime64[fs]", "<datetime64[fs]", ">datetime64[fs]",
-    "datetime64[as]", "=datetime64[as]", "<datetime64[as]", ">datetime64[as]",
-    "M", "=M", "<M", ">M",
-    "M8", "=M8", "<M8", ">M8",
-    "M8[Y]", "=M8[Y]", "<M8[Y]", ">M8[Y]",
-    "M8[M]", "=M8[M]", "<M8[M]", ">M8[M]",
-    "M8[W]", "=M8[W]", "<M8[W]", ">M8[W]",
-    "M8[D]", "=M8[D]", "<M8[D]", ">M8[D]",
-    "M8[h]", "=M8[h]", "<M8[h]", ">M8[h]",
-    "M8[m]", "=M8[m]", "<M8[m]", ">M8[m]",
-    "M8[s]", "=M8[s]", "<M8[s]", ">M8[s]",
-    "M8[ms]", "=M8[ms]", "<M8[ms]", ">M8[ms]",
-    "M8[us]", "=M8[us]", "<M8[us]", ">M8[us]",
-    "M8[ns]", "=M8[ns]", "<M8[ns]", ">M8[ns]",
-    "M8[ps]", "=M8[ps]", "<M8[ps]", ">M8[ps]",
-    "M8[fs]", "=M8[fs]", "<M8[fs]", ">M8[fs]",
-    "M8[as]", "=M8[as]", "<M8[as]", ">M8[as]",
-]
-_TD64Codes = Literal[
-    "timedelta64", "=timedelta64", "<timedelta64", ">timedelta64",
-    "timedelta64[Y]", "=timedelta64[Y]", "<timedelta64[Y]", ">timedelta64[Y]",
-    "timedelta64[M]", "=timedelta64[M]", "<timedelta64[M]", ">timedelta64[M]",
-    "timedelta64[W]", "=timedelta64[W]", "<timedelta64[W]", ">timedelta64[W]",
-    "timedelta64[D]", "=timedelta64[D]", "<timedelta64[D]", ">timedelta64[D]",
-    "timedelta64[h]", "=timedelta64[h]", "<timedelta64[h]", ">timedelta64[h]",
-    "timedelta64[m]", "=timedelta64[m]", "<timedelta64[m]", ">timedelta64[m]",
-    "timedelta64[s]", "=timedelta64[s]", "<timedelta64[s]", ">timedelta64[s]",
-    "timedelta64[ms]", "=timedelta64[ms]", "<timedelta64[ms]", ">timedelta64[ms]",
-    "timedelta64[us]", "=timedelta64[us]", "<timedelta64[us]", ">timedelta64[us]",
-    "timedelta64[ns]", "=timedelta64[ns]", "<timedelta64[ns]", ">timedelta64[ns]",
-    "timedelta64[ps]", "=timedelta64[ps]", "<timedelta64[ps]", ">timedelta64[ps]",
-    "timedelta64[fs]", "=timedelta64[fs]", "<timedelta64[fs]", ">timedelta64[fs]",
-    "timedelta64[as]", "=timedelta64[as]", "<timedelta64[as]", ">timedelta64[as]",
-    "m", "=m", "<m", ">m",
-    "m8", "=m8", "<m8", ">m8",
-    "m8[Y]", "=m8[Y]", "<m8[Y]", ">m8[Y]",
-    "m8[M]", "=m8[M]", "<m8[M]", ">m8[M]",
-    "m8[W]", "=m8[W]", "<m8[W]", ">m8[W]",
-    "m8[D]", "=m8[D]", "<m8[D]", ">m8[D]",
-    "m8[h]", "=m8[h]", "<m8[h]", ">m8[h]",
-    "m8[m]", "=m8[m]", "<m8[m]", ">m8[m]",
-    "m8[s]", "=m8[s]", "<m8[s]", ">m8[s]",
-    "m8[ms]", "=m8[ms]", "<m8[ms]", ">m8[ms]",
-    "m8[us]", "=m8[us]", "<m8[us]", ">m8[us]",
-    "m8[ns]", "=m8[ns]", "<m8[ns]", ">m8[ns]",
-    "m8[ps]", "=m8[ps]", "<m8[ps]", ">m8[ps]",
-    "m8[fs]", "=m8[fs]", "<m8[fs]", ">m8[fs]",
-    "m8[as]", "=m8[as]", "<m8[as]", ">m8[as]",
-]
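Note on the removed character-code aliases: each alias is a `typing.Literal` that enumerates the strings accepted for one dtype. The block below sketches how such an alias can also be inspected at runtime through `typing.get_args`. The helper `is_float64_code` is hypothetical; only the alias contents are copied from the removed file.

```python
from typing import Literal, get_args

# Copied from the removed file.
_Float64Codes = Literal["float64", "f8", "=f8", "<f8", ">f8"]


def is_float64_code(code: str) -> bool:
    """Return True if `code` is one of the listed literal strings."""
    return code in get_args(_Float64Codes)
```

The removed module uses the aliases for static checking only; a runtime lookup like this one is an illustration, not something the module itself performs.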
diff --git a/numpy/typing/_dtype_like.py b/numpy/typing/_dtype_like.py
deleted file mode 100644
index c9bf1a1..0000000
--- a/numpy/typing/_dtype_like.py
+++ /dev/null
@@ -1,236 +0,0 @@
-from typing import (
-    Any,
-    List,
-    Sequence,
-    Tuple,
-    Union,
-    Type,
-    TypeVar,
-    Protocol,
-    TypedDict,
-)
-
-import numpy as np
-
-from ._shape import _ShapeLike
-from ._generic_alias import _DType as DType
-
-from ._char_codes import (
-    _BoolCodes,
-    _UInt8Codes,
-    _UInt16Codes,
-    _UInt32Codes,
-    _UInt64Codes,
-    _Int8Codes,
-    _Int16Codes,
-    _Int32Codes,
-    _Int64Codes,
-    _Float16Codes,
-    _Float32Codes,
-    _Float64Codes,
-    _Complex64Codes,
-    _Complex128Codes,
-    _ByteCodes,
-    _ShortCodes,
-    _IntCCodes,
-    _IntPCodes,
-    _IntCodes,
-    _LongLongCodes,
-    _UByteCodes,
-    _UShortCodes,
-    _UIntCCodes,
-    _UIntPCodes,
-    _UIntCodes,
-    _ULongLongCodes,
-    _HalfCodes,
-    _SingleCodes,
-    _DoubleCodes,
-    _LongDoubleCodes,
-    _CSingleCodes,
-    _CDoubleCodes,
-    _CLongDoubleCodes,
-    _DT64Codes,
-    _TD64Codes,
-    _StrCodes,
-    _BytesCodes,
-    _VoidCodes,
-    _ObjectCodes,
-)
-
-_DTypeLikeNested = Any  # TODO: wait for support for recursive types
-_DType_co = TypeVar("_DType_co", covariant=True, bound=DType[Any])
-
-# Mandatory keys
-class _DTypeDictBase(TypedDict):
-    names: Sequence[str]
-    formats: Sequence[_DTypeLikeNested]
-
-
-# Mandatory + optional keys
-class _DTypeDict(_DTypeDictBase, total=False):
-    # Only `str` elements are usable as indexing aliases,
-    # but `titles` can in principle accept any object
-    offsets: Sequence[int]
-    titles: Sequence[Any]
-    itemsize: int
-    aligned: bool
-
-
-# A protocol for anything with the dtype attribute
-class _SupportsDType(Protocol[_DType_co]):
-    @property
-    def dtype(self) -> _DType_co: ...
-
-
-# Would create a dtype[np.void]
-_VoidDTypeLike = Union[
-    # (flexible_dtype, itemsize)
-    Tuple[_DTypeLikeNested, int],
-    # (fixed_dtype, shape)
-    Tuple[_DTypeLikeNested, _ShapeLike],
-    # [(field_name, field_dtype, field_shape), ...]
-    #
-    # The type here is quite broad because NumPy accepts quite a wide
-    # range of inputs inside the list; see the tests for some
-    # examples.
-    List[Any],
-    # {'names': ..., 'formats': ..., 'offsets': ..., 'titles': ...,
-    #  'itemsize': ...}
-    _DTypeDict,
-    # (base_dtype, new_dtype)
-    Tuple[_DTypeLikeNested, _DTypeLikeNested],
-]
-
-# Anything that can be coerced into numpy.dtype.
-# Reference: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
-DTypeLike = Union[
-    DType[Any],
-    # default data type (float64)
-    None,
-    # array-scalar types and generic types
-    Type[Any],  # NOTE: We're stuck with `Type[Any]` due to object dtypes
-    # anything with a dtype attribute
-    _SupportsDType[DType[Any]],
-    # character codes, type strings or comma-separated fields, e.g., 'float64'
-    str,
-    _VoidDTypeLike,
-]
-
-# NOTE: while it is possible to provide the dtype as a dict of
-# dtype-like objects (e.g. `{'field1': ..., 'field2': ..., ...}`),
-# this syntax is officially discouraged and
-# therefore not included in the Union defining `DTypeLike`.
-#
-# See https://github.com/numpy/numpy/issues/16891 for more details.
-
-# Aliases for commonly used dtype-like objects.
-# Note that the precision of `np.number` subclasses is ignored herein.
-_DTypeLikeBool = Union[
-    Type[bool],
-    Type[np.bool_],
-    DType[np.bool_],
-    _SupportsDType[DType[np.bool_]],
-    _BoolCodes,
-]
-_DTypeLikeUInt = Union[
-    Type[np.unsignedinteger],
-    DType[np.unsignedinteger],
-    _SupportsDType[DType[np.unsignedinteger]],
-    _UInt8Codes,
-    _UInt16Codes,
-    _UInt32Codes,
-    _UInt64Codes,
-    _UByteCodes,
-    _UShortCodes,
-    _UIntCCodes,
-    _UIntPCodes,
-    _UIntCodes,
-    _ULongLongCodes,
-]
-_DTypeLikeInt = Union[
-    Type[int],
-    Type[np.signedinteger],
-    DType[np.signedinteger],
-    _SupportsDType[DType[np.signedinteger]],
-    _Int8Codes,
-    _Int16Codes,
-    _Int32Codes,
-    _Int64Codes,
-    _ByteCodes,
-    _ShortCodes,
-    _IntCCodes,
-    _IntPCodes,
-    _IntCodes,
-    _LongLongCodes,
-]
-_DTypeLikeFloat = Union[
-    Type[float],
-    Type[np.floating],
-    DType[np.floating],
-    _SupportsDType[DType[np.floating]],
-    _Float16Codes,
-    _Float32Codes,
-    _Float64Codes,
-    _HalfCodes,
-    _SingleCodes,
-    _DoubleCodes,
-    _LongDoubleCodes,
-]
-_DTypeLikeComplex = Union[
-    Type[complex],
-    Type[np.complexfloating],
-    DType[np.complexfloating],
-    _SupportsDType[DType[np.complexfloating]],
-    _Complex64Codes,
-    _Complex128Codes,
-    _CSingleCodes,
-    _CDoubleCodes,
-    _CLongDoubleCodes,
-]
-_DTypeLikeTD64 = Union[
-    Type[np.timedelta64],
-    DType[np.timedelta64],
-    _SupportsDType[DType[np.timedelta64]],
-    _TD64Codes,
-]
-_DTypeLikeDT64 = Union[
-    Type[np.datetime64],
-    DType[np.datetime64],
-    _SupportsDType[DType[np.datetime64]],
-    _DT64Codes,
-]
-_DTypeLikeStr = Union[
-    Type[str],
-    Type[np.str_],
-    DType[np.str_],
-    _SupportsDType[DType[np.str_]],
-    _StrCodes,
-]
-_DTypeLikeBytes = Union[
-    Type[bytes],
-    Type[np.bytes_],
-    DType[np.bytes_],
-    _SupportsDType[DType[np.bytes_]],
-    _BytesCodes,
-]
-_DTypeLikeVoid = Union[
-    Type[np.void],
-    DType[np.void],
-    _SupportsDType[DType[np.void]],
-    _VoidCodes,
-    _VoidDTypeLike,
-]
-_DTypeLikeObject = Union[
-    type,
-    DType[np.object_],
-    _SupportsDType[DType[np.object_]],
-    _ObjectCodes,
-]
-
-_DTypeLikeComplex_co = Union[
-    _DTypeLikeBool,
-    _DTypeLikeUInt,
-    _DTypeLikeInt,
-    _DTypeLikeFloat,
-    _DTypeLikeComplex,
-]
diff --git a/numpy/typing/_extended_precision.py b/numpy/typing/_extended_precision.py
deleted file mode 100644
index edc1778..0000000
--- a/numpy/typing/_extended_precision.py
+++ /dev/null
@@ -1,43 +0,0 @@
-"""A module with platform-specific extended precision
-`numpy.number` subclasses.
-
-The subclasses are defined here (instead of ``__init__.pyi``) such
-that they can be imported conditionally via the numpy's mypy plugin.
-"""
-
-from typing import Any, TYPE_CHECKING
-
-import numpy as np
-from . import (
-    _80Bit,
-    _96Bit,
-    _128Bit,
-    _256Bit,
-)
-
-if TYPE_CHECKING:
-    uint128 = np.unsignedinteger[_128Bit]
-    uint256 = np.unsignedinteger[_256Bit]
-    int128 = np.signedinteger[_128Bit]
-    int256 = np.signedinteger[_256Bit]
-    float80 = np.floating[_80Bit]
-    float96 = np.floating[_96Bit]
-    float128 = np.floating[_128Bit]
-    float256 = np.floating[_256Bit]
-    complex160 = np.complexfloating[_80Bit, _80Bit]
-    complex192 = np.complexfloating[_96Bit, _96Bit]
-    complex256 = np.complexfloating[_128Bit, _128Bit]
-    complex512 = np.complexfloating[_256Bit, _256Bit]
-else:
-    uint128 = Any
-    uint256 = Any
-    int128 = Any
-    int256 = Any
-    float80 = Any
-    float96 = Any
-    float128 = Any
-    float256 = Any
-    complex160 = Any
-    complex192 = Any
-    complex256 = Any
-    complex512 = Any
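Note on the removed `TYPE_CHECKING` branch: a name gets a precise definition for static analysis and falls back to `Any` at runtime. The block below is a minimal sketch of that pattern. `BigNumber` is a hypothetical alias, and `decimal.Decimal` merely stands in for the platform-specific NumPy types.

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen by static type checkers only.
    from decimal import Decimal

    BigNumber = Decimal
else:
    # Executed at runtime, where TYPE_CHECKING is False.
    BigNumber = Any
```

Because the `else` branch is the one that runs, `Any` has to be imported alongside `TYPE_CHECKING`.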
diff --git a/numpy/typing/_generic_alias.py b/numpy/typing/_generic_alias.py
deleted file mode 100644
index 1eb2c8c..0000000
--- a/numpy/typing/_generic_alias.py
+++ /dev/null
@@ -1,215 +0,0 @@
-from __future__ import annotations
-
-import sys
-import types
-from typing import (
-    Any,
-    ClassVar,
-    FrozenSet,
-    Generator,
-    Iterable,
-    Iterator,
-    List,
-    NoReturn,
-    Tuple,
-    Type,
-    TypeVar,
-    TYPE_CHECKING,
-)
-
-import numpy as np
-
-__all__ = ["_GenericAlias", "NDArray"]
-
-_T = TypeVar("_T", bound="_GenericAlias")
-
-
-def _to_str(obj: object) -> str:
-    """Helper function for `_GenericAlias.__repr__`."""
-    if obj is Ellipsis:
-        return '...'
-    elif isinstance(obj, type) and not isinstance(obj, _GENERIC_ALIAS_TYPE):
-        if obj.__module__ == 'builtins':
-            return obj.__qualname__
-        else:
-            return f'{obj.__module__}.{obj.__qualname__}'
-    else:
-        return repr(obj)
-
-
-def _parse_parameters(args: Iterable[Any]) -> Generator[TypeVar, None, None]:
-    """Search for all typevars and typevar-containing objects in `args`.
-
-    Helper function for `_GenericAlias.__init__`.
-
-    """
-    for i in args:
-        if hasattr(i, "__parameters__"):
-            yield from i.__parameters__
-        elif isinstance(i, TypeVar):
-            yield i
-
-
-def _reconstruct_alias(alias: _T, parameters: Iterator[TypeVar]) -> _T:
-    """Recursively replace all typevars with those from `parameters`.
-
-    Helper function for `_GenericAlias.__getitem__`.
-
-    """
-    args = []
-    for i in alias.__args__:
-        if isinstance(i, TypeVar):
-            value: Any = next(parameters)
-        elif isinstance(i, _GenericAlias):
-            value = _reconstruct_alias(i, parameters)
-        elif hasattr(i, "__parameters__"):
-            prm_tup = tuple(next(parameters) for _ in i.__parameters__)
-            value = i[prm_tup]
-        else:
-            value = i
-        args.append(value)
-
-    cls = type(alias)
-    return cls(alias.__origin__, tuple(args))
-
-
-class _GenericAlias:
-    """A python-based backport of the `types.GenericAlias` class.
-
-    E.g. for ``t = list[int]``, ``t.__origin__`` is ``list`` and
-    ``t.__args__`` is ``(int,)``.
-
-    See Also
-    --------
-    :pep:`585`
-        The PEP responsible for introducing `types.GenericAlias`.
-
-    """
-
-    __slots__ = ("__weakref__", "_origin", "_args", "_parameters", "_hash")
-
-    @property
-    def __origin__(self) -> type:
-        return super().__getattribute__("_origin")
-
-    @property
-    def __args__(self) -> Tuple[object, ...]:
-        return super().__getattribute__("_args")
-
-    @property
-    def __parameters__(self) -> Tuple[TypeVar, ...]:
-        """Type variables in the ``GenericAlias``."""
-        return super().__getattribute__("_parameters")
-
-    def __init__(
-        self,
-        origin: type,
-        args: object | Tuple[object, ...],
-    ) -> None:
-        self._origin = origin
-        self._args = args if isinstance(args, tuple) else (args,)
-        self._parameters = tuple(_parse_parameters(self.__args__))
-
-    @property
-    def __call__(self) -> type:
-        return self.__origin__
-
-    def __reduce__(self: _T) -> Tuple[
-        Type[_T],
-        Tuple[type, Tuple[object, ...]],
-    ]:
-        cls = type(self)
-        return cls, (self.__origin__, self.__args__)
-
-    def __mro_entries__(self, bases: Iterable[object]) -> Tuple[type]:
-        return (self.__origin__,)
-
-    def __dir__(self) -> List[str]:
-        """Implement ``dir(self)``."""
-        cls = type(self)
-        dir_origin = set(dir(self.__origin__))
-        return sorted(cls._ATTR_EXCEPTIONS | dir_origin)
-
-    def __hash__(self) -> int:
-        """Return ``hash(self)``."""
-        # Attempt to use the cached hash
-        try:
-            return super().__getattribute__("_hash")
-        except AttributeError:
-            self._hash: int = hash(self.__origin__) ^ hash(self.__args__)
-            return super().__getattribute__("_hash")
-
-    def __instancecheck__(self, obj: object) -> NoReturn:
-        """Check if an `obj` is an instance."""
-        raise TypeError("isinstance() argument 2 cannot be a "
-                        "parameterized generic")
-
-    def __subclasscheck__(self, cls: type) -> NoReturn:
-        """Check if a `cls` is a subclass."""
-        raise TypeError("issubclass() argument 2 cannot be a "
-                        "parameterized generic")
-
-    def __repr__(self) -> str:
-        """Return ``repr(self)``."""
-        args = ", ".join(_to_str(i) for i in self.__args__)
-        origin = _to_str(self.__origin__)
-        return f"{origin}[{args}]"
-
-    def __getitem__(self: _T, key: object | Tuple[object, ...]) -> _T:
-        """Return ``self[key]``."""
-        key_tup = key if isinstance(key, tuple) else (key,)
-
-        if len(self.__parameters__) == 0:
-            raise TypeError(f"There are no type variables left in {self}")
-        elif len(key_tup) > len(self.__parameters__):
-            raise TypeError(f"Too many arguments for {self}")
-        elif len(key_tup) < len(self.__parameters__):
-            raise TypeError(f"Too few arguments for {self}")
-
-        key_iter = iter(key_tup)
-        return _reconstruct_alias(self, key_iter)
-
-    def __eq__(self, value: object) -> bool:
-        """Return ``self == value``."""
-        if not isinstance(value, _GENERIC_ALIAS_TYPE):
-            return NotImplemented
-        return (
-            self.__origin__ == value.__origin__ and
-            self.__args__ == value.__args__
-        )
-
-    _ATTR_EXCEPTIONS: ClassVar[FrozenSet[str]] = frozenset({
-        "__origin__",
-        "__args__",
-        "__parameters__",
-        "__mro_entries__",
-        "__reduce__",
-        "__reduce_ex__",
-        "__copy__",
-        "__deepcopy__",
-    })
-
-    def __getattribute__(self, name: str) -> Any:
-        """Return ``getattr(self, name)``."""
-        # Pull the attribute from `__origin__` unless its
-        # name is in `_ATTR_EXCEPTIONS`
-        cls = type(self)
-        if name in cls._ATTR_EXCEPTIONS:
-            return super().__getattribute__(name)
-        return getattr(self.__origin__, name)
-
-
-# See `_GenericAlias.__eq__`
-if sys.version_info >= (3, 9):
-    _GENERIC_ALIAS_TYPE = (_GenericAlias, types.GenericAlias)
-else:
-    _GENERIC_ALIAS_TYPE = (_GenericAlias,)
-
-ScalarType = TypeVar("ScalarType", bound=np.generic, covariant=True)
-
-if TYPE_CHECKING or sys.version_info >= (3, 9):
-    _DType = np.dtype[ScalarType]
-    NDArray = np.ndarray[Any, np.dtype[ScalarType]]
-else:
-    _DType = _GenericAlias(np.dtype, (ScalarType,))
-    NDArray = _GenericAlias(np.ndarray, (Any, _DType))
diff --git a/numpy/typing/_nbit.py b/numpy/typing/_nbit.py
deleted file mode 100644
index b8d35db..0000000
--- a/numpy/typing/_nbit.py
+++ /dev/null
@@ -1,16 +0,0 @@
-"""A module with the precisions of platform-specific `~numpy.number`s."""
-
-from typing import Any
-
-# To-be replaced with a `npt.NBitBase` subclass by numpy's mypy plugin
-_NBitByte = Any
-_NBitShort = Any
-_NBitIntC = Any
-_NBitIntP = Any
-_NBitInt = Any
-_NBitLongLong = Any
-
-_NBitHalf = Any
-_NBitSingle = Any
-_NBitDouble = Any
-_NBitLongDouble = Any
diff --git a/numpy/typing/_nested_sequence.py b/numpy/typing/_nested_sequence.py
deleted file mode 100644
index a853303..0000000
--- a/numpy/typing/_nested_sequence.py
+++ /dev/null
@@ -1,90 +0,0 @@
-"""A module containing the `_NestedSequence` protocol."""
-
-from __future__ import annotations
-
-from typing import (
-    Any,
-    Iterator,
-    overload,
-    TypeVar,
-    Protocol,
-)
-
-__all__ = ["_NestedSequence"]
-
-_T_co = TypeVar("_T_co", covariant=True)
-
-
-class _NestedSequence(Protocol[_T_co]):
-    """A protocol for representing nested sequences.
-
-    Warning
-    -------
-    `_NestedSequence` currently does not work in combination with typevars,
-    *e.g.* ``def func(a: _NestedSequnce[T]) -> T: ...``.
-
-    See Also
-    --------
-    `collections.abc.Sequence`
-        ABCs for read-only and mutable :term:`sequences`.
-
-    Examples
-    --------
-    .. code-block:: python
-
-        >>> from __future__ import annotations
-
-        >>> from typing import TYPE_CHECKING
-        >>> import numpy as np
-        >>> from numpy.typing import _NestedSequnce
-
-        >>> def get_dtype(seq: _NestedSequnce[float]) -> np.dtype[np.float64]:
-        ...     return np.asarray(seq).dtype
-
-        >>> a = get_dtype([1.0])
-        >>> b = get_dtype([[1.0]])
-        >>> c = get_dtype([[[1.0]]])
-        >>> d = get_dtype([[[[1.0]]]])
-
-        >>> if TYPE_CHECKING:
-        ...     reveal_locals()
-        ...     # note: Revealed local types are:
-        ...     # note:     a: numpy.dtype[numpy.floating[numpy.typing._64Bit]]
-        ...     # note:     b: numpy.dtype[numpy.floating[numpy.typing._64Bit]]
-        ...     # note:     c: numpy.dtype[numpy.floating[numpy.typing._64Bit]]
-        ...     # note:     d: numpy.dtype[numpy.floating[numpy.typing._64Bit]]
-
-    """
-
-    def __len__(self, /) -> int:
-        """Implement ``len(self)``."""
-        raise NotImplementedError
-
-    @overload
-    def __getitem__(self, index: int, /) -> _T_co | _NestedSequence[_T_co]: ...
-    @overload
-    def __getitem__(self, index: slice, /) -> _NestedSequence[_T_co]: ...
-
-    def __getitem__(self, index, /):
-        """Implement ``self[x]``."""
-        raise NotImplementedError
-
-    def __contains__(self, x: object, /) -> bool:
-        """Implement ``x in self``."""
-        raise NotImplementedError
-
-    def __iter__(self, /) -> Iterator[_T_co | _NestedSequence[_T_co]]:
-        """Implement ``iter(self)``."""
-        raise NotImplementedError
-
-    def __reversed__(self, /) -> Iterator[_T_co | _NestedSequence[_T_co]]:
-        """Implement ``reversed(self)``."""
-        raise NotImplementedError
-
-    def count(self, value: Any, /) -> int:
-        """Return the number of occurrences of `value`."""
-        raise NotImplementedError
-
-    def index(self, value: Any, /) -> int:
-        """Return the first index of `value`."""
-        raise NotImplementedError
diff --git a/numpy/typing/_scalars.py b/numpy/typing/_scalars.py
deleted file mode 100644
index 516b996..0000000
--- a/numpy/typing/_scalars.py
+++ /dev/null
@@ -1,30 +0,0 @@
-from typing import Union, Tuple, Any
-
-import numpy as np
-
-# NOTE: `_StrLike_co` and `_BytesLike_co` are pointless, as `np.str_` and
-# `np.bytes_` are already subclasses of their builtin counterpart
-
-_CharLike_co = Union[str, bytes]
-
-# The 6 `<X>Like_co` type-aliases below represent all scalars that can be
-# coerced into `<X>` (with the casting rule `same_kind`)
-_BoolLike_co = Union[bool, np.bool_]
-_UIntLike_co = Union[_BoolLike_co, np.unsignedinteger]
-_IntLike_co = Union[_BoolLike_co, int, np.integer]
-_FloatLike_co = Union[_IntLike_co, float, np.floating]
-_ComplexLike_co = Union[_FloatLike_co, complex, np.complexfloating]
-_TD64Like_co = Union[_IntLike_co, np.timedelta64]
-
-_NumberLike_co = Union[int, float, complex, np.number, np.bool_]
-_ScalarLike_co = Union[
-    int,
-    float,
-    complex,
-    str,
-    bytes,
-    np.generic,
-]
-
-# `_VoidLike_co` is technically not a scalar, but it's close enough
-_VoidLike_co = Union[Tuple[Any, ...], np.void]
diff --git a/numpy/typing/_shape.py b/numpy/typing/_shape.py
deleted file mode 100644
index c28859b..0000000
--- a/numpy/typing/_shape.py
+++ /dev/null
@@ -1,6 +0,0 @@
-from typing import Sequence, Tuple, Union, SupportsIndex
-
-_Shape = Tuple[int, ...]
-
-# Anything that can be coerced to a shape tuple
-_ShapeLike = Union[SupportsIndex, Sequence[SupportsIndex]]
diff --git a/numpy/typing/_ufunc.pyi b/numpy/typing/_ufunc.pyi
deleted file mode 100644
index 1be3500..0000000
--- a/numpy/typing/_ufunc.pyi
+++ /dev/null
@@ -1,405 +0,0 @@
-"""A module with private type-check-only `numpy.ufunc` subclasses.
-
-The signatures of the ufuncs are too varied to reasonably type
-with a single class. So instead, `ufunc` has been expanded into
-four private subclasses, one for each combination of
-`~ufunc.nin` and `~ufunc.nout`.
-
-"""
-
-from typing import (
-    Any,
-    Generic,
-    List,
-    overload,
-    Tuple,
-    TypeVar,
-    Literal,
-    SupportsIndex,
-)
-
-from numpy import ufunc, _CastingKind, _OrderKACF
-from numpy.typing import NDArray
-
-from ._shape import _ShapeLike
-from ._scalars import _ScalarLike_co
-from ._array_like import ArrayLike, _ArrayLikeBool_co, _ArrayLikeInt_co
-from ._dtype_like import DTypeLike
-
-_T = TypeVar("_T")
-_2Tuple = Tuple[_T, _T]
-_3Tuple = Tuple[_T, _T, _T]
-_4Tuple = Tuple[_T, _T, _T, _T]
-
-_NTypes = TypeVar("_NTypes", bound=int)
-_IDType = TypeVar("_IDType", bound=Any)
-_NameType = TypeVar("_NameType", bound=str)
-
-# NOTE: In reality `extobj` should be a length of list 3 containing an
-# int, an int, and a callable, but there's no way to properly express
-# non-homogenous lists.
-# Use `Any` over `Union` to avoid issues related to lists invariance.
-
-# NOTE: `reduce`, `accumulate`, `reduceat` and `outer` raise a ValueError for
-# ufuncs that don't accept two input arguments and return one output argument.
-# In such cases the respective methods are simply typed as `None`.
-
-# NOTE: Similarly, `at` won't be defined for ufuncs that return
-# multiple outputs; in such cases `at` is typed as `None`
-
-# NOTE: If 2 output types are returned then `out` must be a
-# 2-tuple of arrays. Otherwise `None` or a plain array are also acceptable
-
-class _UFunc_Nin1_Nout1(ufunc, Generic[_NameType, _NTypes, _IDType]):
-    @property
-    def __name__(self) -> _NameType: ...
-    @property
-    def ntypes(self) -> _NTypes: ...
-    @property
-    def identity(self) -> _IDType: ...
-    @property
-    def nin(self) -> Literal[1]: ...
-    @property
-    def nout(self) -> Literal[1]: ...
-    @property
-    def nargs(self) -> Literal[2]: ...
-    @property
-    def signature(self) -> None: ...
-    @property
-    def reduce(self) -> None: ...
-    @property
-    def accumulate(self) -> None: ...
-    @property
-    def reduceat(self) -> None: ...
-    @property
-    def outer(self) -> None: ...
-
-    @overload
-    def __call__(
-        self,
-        __x1: _ScalarLike_co,
-        out: None = ...,
-        *,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _2Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> Any: ...
-    @overload
-    def __call__(
-        self,
-        __x1: ArrayLike,
-        out: None | NDArray[Any] | Tuple[NDArray[Any]] = ...,
-        *,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _2Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> NDArray[Any]: ...
-
-    def at(
-        self,
-        a: NDArray[Any],
-        indices: _ArrayLikeInt_co,
-        /,
-    ) -> None: ...
-
-class _UFunc_Nin2_Nout1(ufunc, Generic[_NameType, _NTypes, _IDType]):
-    @property
-    def __name__(self) -> _NameType: ...
-    @property
-    def ntypes(self) -> _NTypes: ...
-    @property
-    def identity(self) -> _IDType: ...
-    @property
-    def nin(self) -> Literal[2]: ...
-    @property
-    def nout(self) -> Literal[1]: ...
-    @property
-    def nargs(self) -> Literal[3]: ...
-    @property
-    def signature(self) -> None: ...
-
-    @overload
-    def __call__(
-        self,
-        __x1: _ScalarLike_co,
-        __x2: _ScalarLike_co,
-        out: None = ...,
-        *,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> Any: ...
-    @overload
-    def __call__(
-        self,
-        __x1: ArrayLike,
-        __x2: ArrayLike,
-        out: None | NDArray[Any] | Tuple[NDArray[Any]] = ...,
-        *,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> NDArray[Any]: ...
-
-    def at(
-        self,
-        a: NDArray[Any],
-        indices: _ArrayLikeInt_co,
-        b: ArrayLike,
-        /,
-    ) -> None: ...
-
-    def reduce(
-        self,
-        array: ArrayLike,
-        axis: None | _ShapeLike = ...,
-        dtype: DTypeLike = ...,
-        out: None | NDArray[Any] = ...,
-        keepdims: bool = ...,
-        initial: Any = ...,
-        where: _ArrayLikeBool_co = ...,
-    ) -> Any: ...
-
-    def accumulate(
-        self,
-        array: ArrayLike,
-        axis: SupportsIndex = ...,
-        dtype: DTypeLike = ...,
-        out: None | NDArray[Any] = ...,
-    ) -> NDArray[Any]: ...
-
-    def reduceat(
-        self,
-        array: ArrayLike,
-        indices: _ArrayLikeInt_co,
-        axis: SupportsIndex = ...,
-        dtype: DTypeLike = ...,
-        out: None | NDArray[Any] = ...,
-    ) -> NDArray[Any]: ...
-
-    # Expand `**kwargs` into explicit keyword-only arguments
-    @overload
-    def outer(
-        self,
-        A: _ScalarLike_co,
-        B: _ScalarLike_co,
-        /, *,
-        out: None = ...,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> Any: ...
-    @overload
-    def outer(  # type: ignore[misc]
-        self,
-        A: ArrayLike,
-        B: ArrayLike,
-        /, *,
-        out: None | NDArray[Any] | Tuple[NDArray[Any]] = ...,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> NDArray[Any]: ...
-
-class _UFunc_Nin1_Nout2(ufunc, Generic[_NameType, _NTypes, _IDType]):
-    @property
-    def __name__(self) -> _NameType: ...
-    @property
-    def ntypes(self) -> _NTypes: ...
-    @property
-    def identity(self) -> _IDType: ...
-    @property
-    def nin(self) -> Literal[1]: ...
-    @property
-    def nout(self) -> Literal[2]: ...
-    @property
-    def nargs(self) -> Literal[3]: ...
-    @property
-    def signature(self) -> None: ...
-    @property
-    def at(self) -> None: ...
-    @property
-    def reduce(self) -> None: ...
-    @property
-    def accumulate(self) -> None: ...
-    @property
-    def reduceat(self) -> None: ...
-    @property
-    def outer(self) -> None: ...
-
-    @overload
-    def __call__(
-        self,
-        __x1: _ScalarLike_co,
-        __out1: None = ...,
-        __out2: None = ...,
-        *,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> _2Tuple[Any]: ...
-    @overload
-    def __call__(
-        self,
-        __x1: ArrayLike,
-        __out1: None | NDArray[Any] = ...,
-        __out2: None | NDArray[Any] = ...,
-        *,
-        out: _2Tuple[NDArray[Any]] = ...,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> _2Tuple[NDArray[Any]]: ...
-
-class _UFunc_Nin2_Nout2(ufunc, Generic[_NameType, _NTypes, _IDType]):
-    @property
-    def __name__(self) -> _NameType: ...
-    @property
-    def ntypes(self) -> _NTypes: ...
-    @property
-    def identity(self) -> _IDType: ...
-    @property
-    def nin(self) -> Literal[2]: ...
-    @property
-    def nout(self) -> Literal[2]: ...
-    @property
-    def nargs(self) -> Literal[4]: ...
-    @property
-    def signature(self) -> None: ...
-    @property
-    def at(self) -> None: ...
-    @property
-    def reduce(self) -> None: ...
-    @property
-    def accumulate(self) -> None: ...
-    @property
-    def reduceat(self) -> None: ...
-    @property
-    def outer(self) -> None: ...
-
-    @overload
-    def __call__(
-        self,
-        __x1: _ScalarLike_co,
-        __x2: _ScalarLike_co,
-        __out1: None = ...,
-        __out2: None = ...,
-        *,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _4Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> _2Tuple[Any]: ...
-    @overload
-    def __call__(
-        self,
-        __x1: ArrayLike,
-        __x2: ArrayLike,
-        __out1: None | NDArray[Any] = ...,
-        __out2: None | NDArray[Any] = ...,
-        *,
-        out: _2Tuple[NDArray[Any]] = ...,
-        where: None | _ArrayLikeBool_co = ...,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _4Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-    ) -> _2Tuple[NDArray[Any]]: ...
-
-class _GUFunc_Nin2_Nout1(ufunc, Generic[_NameType, _NTypes, _IDType]):
-    @property
-    def __name__(self) -> _NameType: ...
-    @property
-    def ntypes(self) -> _NTypes: ...
-    @property
-    def identity(self) -> _IDType: ...
-    @property
-    def nin(self) -> Literal[2]: ...
-    @property
-    def nout(self) -> Literal[1]: ...
-    @property
-    def nargs(self) -> Literal[3]: ...
-
-    # NOTE: In practice the only gufunc in the main name is `matmul`,
-    # so we can use its signature here
-    @property
-    def signature(self) -> Literal["(n?,k),(k,m?)->(n?,m?)"]: ...
-    @property
-    def reduce(self) -> None: ...
-    @property
-    def accumulate(self) -> None: ...
-    @property
-    def reduceat(self) -> None: ...
-    @property
-    def outer(self) -> None: ...
-    @property
-    def at(self) -> None: ...
-
-    # Scalar for 1D array-likes; ndarray otherwise
-    @overload
-    def __call__(
-        self,
-        __x1: ArrayLike,
-        __x2: ArrayLike,
-        out: None = ...,
-        *,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-        axes: List[_2Tuple[SupportsIndex]] = ...,
-    ) -> Any: ...
-    @overload
-    def __call__(
-        self,
-        __x1: ArrayLike,
-        __x2: ArrayLike,
-        out: NDArray[Any] | Tuple[NDArray[Any]],
-        *,
-        casting: _CastingKind = ...,
-        order: _OrderKACF = ...,
-        dtype: DTypeLike = ...,
-        subok: bool = ...,
-        signature: str | _3Tuple[None | str] = ...,
-        extobj: List[Any] = ...,
-        axes: List[_2Tuple[SupportsIndex]] = ...,
-    ) -> NDArray[Any]: ...
diff --git a/numpy/typing/mypy_plugin.py b/numpy/typing/mypy_plugin.py
index 5ac75f94da93c6f9e81f20cf6b23964204c3ff0a..1ffe74fa97b10c0ce7d46c541360e73dc583484b 100644
--- a/numpy/typing/mypy_plugin.py
+++ b/numpy/typing/mypy_plugin.py
@@ -70,7 +70,7 @@ def _get_precision_dict() -> dict[str, str]:
      ret = {}
      for name, typ in names:
          n: int = 8 * typ().dtype.itemsize
-        ret[f'numpy.typing._nbit.{name}'] = f"numpy._{n}Bit"
+        ret[f'numpy._typing._nbit.{name}'] = f"numpy._{n}Bit"
      return ret
  
  
@@ -106,7 +106,7 @@ def _get_c_intp_name() -> str:
          return "c_long"
  
  
-#: A dictionary mapping type-aliases in `numpy.typing._nbit` to
+#: A dictionary mapping type-aliases in `numpy._typing._nbit` to
  #: concrete `numpy.typing.NBitBase` subclasses.
  _PRECISION_DICT: Final = _get_precision_dict()
  
@@ -121,7 +121,7 @@ def _hook(ctx: AnalyzeTypeContext) -> Type:
      """Replace a type-alias with a concrete ``NBitBase`` subclass."""
      typ, _, api = ctx
      name = typ.name.split(".")[-1]
-    name_new = _PRECISION_DICT[f"numpy.typing._nbit.{name}"]
+    name_new = _PRECISION_DICT[f"numpy._typing._nbit.{name}"]
      return api.named_type(name_new)
  
  
@@ -177,7 +177,7 @@ if TYPE_CHECKING or MYPY_EX is None:
  
              if file.fullname == "numpy":
                  _override_imports(
-                    file, "numpy.typing._extended_precision",
+                    file, "numpy._typing._extended_precision",
                      imports=[(v, v) for v in _EXTENDED_PRECISION_LIST],
                  )
              elif file.fullname == "numpy.ctypeslib":
diff --git a/numpy/typing/setup.py b/numpy/typing/setup.py
index 694a756dc5abca54bd7a8a0c9232a0974e194edb..c444e769fb6d94ffc0bff6cec25cd30a86858f2e 100644
--- a/numpy/typing/setup.py
+++ b/numpy/typing/setup.py
@@ -3,7 +3,6 @@ def configuration(parent_package='', top_path=None):
      config = Configuration('typing', parent_package, top_path)
      config.add_subpackage('tests')
      config.add_data_dir('tests/data')
-    config.add_data_files('*.pyi')
      return config
  
  
diff --git a/numpy/typing/tests/data/fail/arithmetic.pyi b/numpy/typing/tests/data/fail/arithmetic.pyi
index b99b24c1f6b4f7e064b6681e95cff96ef1433a2e..3bbc101cfd236c01a8d72e24bcf36cda87da8a10 100644
--- a/numpy/typing/tests/data/fail/arithmetic.pyi
+++ b/numpy/typing/tests/data/fail/arithmetic.pyi
@@ -1,4 +1,4 @@
-from typing import List, Any
+from typing import Any
  import numpy as np
  
  b_ = np.bool_()
@@ -15,13 +15,13 @@ AR_M: np.ndarray[Any, np.dtype[np.datetime64]]
  
  ANY: Any
  
-AR_LIKE_b: List[bool]
-AR_LIKE_u: List[np.uint32]
-AR_LIKE_i: List[int]
-AR_LIKE_f: List[float]
-AR_LIKE_c: List[complex]
-AR_LIKE_m: List[np.timedelta64]
-AR_LIKE_M: List[np.datetime64]
+AR_LIKE_b: list[bool]
+AR_LIKE_u: list[np.uint32]
+AR_LIKE_i: list[int]
+AR_LIKE_f: list[float]
+AR_LIKE_c: list[complex]
+AR_LIKE_m: list[np.timedelta64]
+AR_LIKE_M: list[np.datetime64]
  
  # Array subtraction
  
diff --git a/numpy/typing/tests/data/fail/array_constructors.pyi b/numpy/typing/tests/data/fail/array_constructors.pyi
index 4f0a60b5ba93e03f03aebe733e8f8868564cf56a..278894631f937151177e9b4e5a4c815772a16676 100644
--- a/numpy/typing/tests/data/fail/array_constructors.pyi
+++ b/numpy/typing/tests/data/fail/array_constructors.pyi
@@ -21,11 +21,13 @@ np.linspace(0, 2, retstep=b'False')  # E: No overload variant
  np.linspace(0, 2, dtype=0)  # E: No overload variant
  np.linspace(0, 2, axis=None)  # E: No overload variant
  
-np.logspace(None, 'bob')  # E: Argument 1
-np.logspace(0, 2, base=None)  # E: Argument "base"
+np.logspace(None, 'bob')  # E: No overload variant
+np.logspace(0, 2, base=None)  # E: No overload variant
  
-np.geomspace(None, 'bob')  # E: Argument 1
+np.geomspace(None, 'bob')  # E: No overload variant
  
  np.stack(generator)  # E: No overload variant
  np.hstack({1, 2})  # E: No overload variant
  np.vstack(1)  # E: No overload variant
+
+np.array([1], like=1)  # E: No overload variant
diff --git a/numpy/typing/tests/data/fail/array_like.pyi b/numpy/typing/tests/data/fail/array_like.pyi
index 3bbd2906150f46c5105e38dd1a8557d4ee00f4b1..133b5fd497006be2680dd108ed4cb5696442bad5 100644
--- a/numpy/typing/tests/data/fail/array_like.pyi
+++ b/numpy/typing/tests/data/fail/array_like.pyi
@@ -1,5 +1,5 @@
  import numpy as np
-from numpy.typing import ArrayLike
+from numpy._typing import ArrayLike
  
  
  class A:
diff --git a/numpy/typing/tests/data/fail/arrayprint.pyi b/numpy/typing/tests/data/fail/arrayprint.pyi
index 86297a0b24a4b6b7e9c6962be7607b760412411a..71b921e3a5a3774548a9beab955a3b481d360d21 100644
--- a/numpy/typing/tests/data/fail/arrayprint.pyi
+++ b/numpy/typing/tests/data/fail/arrayprint.pyi
@@ -1,4 +1,5 @@
-from typing import Callable, Any
+from collections.abc import Callable
+from typing import Any
  import numpy as np
  
  AR: np.ndarray
diff --git a/numpy/typing/tests/data/fail/einsumfunc.pyi b/numpy/typing/tests/data/fail/einsumfunc.pyi
index 33722f861199c93dca8239c08c93a364d6370b54..f0e3f1e95711c23af33053c2eb6b08ef41b3e77e 100644
--- a/numpy/typing/tests/data/fail/einsumfunc.pyi
+++ b/numpy/typing/tests/data/fail/einsumfunc.pyi
@@ -1,4 +1,4 @@
-from typing import List, Any
+from typing import Any
  import numpy as np
  
  AR_i: np.ndarray[Any, np.dtype[np.int64]]
diff --git a/numpy/typing/tests/data/fail/flatiter.pyi b/numpy/typing/tests/data/fail/flatiter.pyi
index 544ffbe4a7dba04a2a3771bdbebe1314fc694555..b4ce10ba566d7ccfb2c6523c926bf4571c7e4a27 100644
--- a/numpy/typing/tests/data/fail/flatiter.pyi
+++ b/numpy/typing/tests/data/fail/flatiter.pyi
@@ -1,7 +1,7 @@
  from typing import Any
  
  import numpy as np
-from numpy.typing import _SupportsArray
+from numpy._typing import _SupportsArray
  
  
  class Index:
diff --git a/numpy/typing/tests/data/fail/fromnumeric.pyi b/numpy/typing/tests/data/fail/fromnumeric.pyi
index 8fafed1b77051964ae02c72987e9a920639f93e0..b679703c7dd61ccc0fb5b54a0582a1401095e67d 100644
--- a/numpy/typing/tests/data/fail/fromnumeric.pyi
+++ b/numpy/typing/tests/data/fail/fromnumeric.pyi
@@ -1,38 +1,40 @@
  """Tests for :mod:`numpy.core.fromnumeric`."""
  
  import numpy as np
+import numpy.typing as npt
  
  A = np.array(True, ndmin=2, dtype=bool)
  A.setflags(write=False)
+AR_U: npt.NDArray[np.str_]
  
  a = np.bool_(True)
  
-np.take(a, None)  # E: incompatible type
-np.take(a, axis=1.0)  # E: incompatible type
-np.take(A, out=1)  # E: incompatible type
-np.take(A, mode="bob")  # E: incompatible type
+np.take(a, None)  # E: No overload variant
+np.take(a, axis=1.0)  # E: No overload variant
+np.take(A, out=1)  # E: No overload variant
+np.take(A, mode="bob")  # E: No overload variant
  
-np.reshape(a, None)  # E: Argument 2 to "reshape" has incompatible type
-np.reshape(A, 1, order="bob")  # E: Argument "order" to "reshape" has incompatible type
+np.reshape(a, None)  # E: No overload variant
+np.reshape(A, 1, order="bob")  # E: No overload variant
  
-np.choose(a, None)  # E: incompatible type
-np.choose(a, out=1.0)  # E: incompatible type
-np.choose(A, mode="bob")  # E: incompatible type
+np.choose(a, None)  # E: No overload variant
+np.choose(a, out=1.0)  # E: No overload variant
+np.choose(A, mode="bob")  # E: No overload variant
  
-np.repeat(a, None)  # E: Argument 2 to "repeat" has incompatible type
-np.repeat(A, 1, axis=1.0)  # E: Argument "axis" to "repeat" has incompatible type
+np.repeat(a, None)  # E: No overload variant
+np.repeat(A, 1, axis=1.0)  # E: No overload variant
  
-np.swapaxes(A, None, 1)  # E: Argument 2 to "swapaxes" has incompatible type
-np.swapaxes(A, 1, [0])  # E: Argument 3 to "swapaxes" has incompatible type
+np.swapaxes(A, None, 1)  # E: No overload variant
+np.swapaxes(A, 1, [0])  # E: No overload variant
  
-np.transpose(A, axes=1.0)  # E: Argument "axes" to "transpose" has incompatible type
+np.transpose(A, axes=1.0)  # E: No overload variant
  
-np.partition(a, None)  # E: Argument 2 to "partition" has incompatible type
-np.partition(
-    a, 0, axis="bob"  # E: Argument "axis" to "partition" has incompatible type
+np.partition(a, None)  # E: No overload variant
+np.partition(  # E: No overload variant
+    a, 0, axis="bob"
  )
-np.partition(
-    A, 0, kind="bob"  # E: Argument "kind" to "partition" has incompatible type
+np.partition(  # E: No overload variant
+    A, 0, kind="bob"
  )
  np.partition(
      A, 0, order=range(5)  # E: Argument "order" to "partition" has incompatible type
@@ -51,8 +53,8 @@ np.argpartition(
      A, 0, order=range(5)  # E: Argument "order" to "argpartition" has incompatible type
  )
  
-np.sort(A, axis="bob")  # E: Argument "axis" to "sort" has incompatible type
-np.sort(A, kind="bob")  # E: Argument "kind" to "sort" has incompatible type
+np.sort(A, axis="bob")  # E: No overload variant
+np.sort(A, kind="bob")  # E: No overload variant
  np.sort(A, order=range(5))  # E: Argument "order" to "sort" has incompatible type
  
  np.argsort(A, axis="bob")  # E: Argument "axis" to "argsort" has incompatible type
@@ -72,30 +74,29 @@ np.searchsorted(  # E: No overload variant of "searchsorted" matches argument ty
      A[0], 0, sorter=1.0
  )
  
-np.resize(A, 1.0)  # E: Argument 2 to "resize" has incompatible type
+np.resize(A, 1.0)  # E: No overload variant
  
  np.squeeze(A, 1.0)  # E: No overload variant of "squeeze" matches argument type
  
-np.diagonal(A, offset=None)  # E: Argument "offset" to "diagonal" has incompatible type
-np.diagonal(A, axis1="bob")  # E: Argument "axis1" to "diagonal" has incompatible type
-np.diagonal(A, axis2=[])  # E: Argument "axis2" to "diagonal" has incompatible type
+np.diagonal(A, offset=None)  # E: No overload variant
+np.diagonal(A, axis1="bob")  # E: No overload variant
+np.diagonal(A, axis2=[])  # E: No overload variant
  
-np.trace(A, offset=None)  # E: Argument "offset" to "trace" has incompatible type
-np.trace(A, axis1="bob")  # E: Argument "axis1" to "trace" has incompatible type
-np.trace(A, axis2=[])  # E: Argument "axis2" to "trace" has incompatible type
+np.trace(A, offset=None)  # E: No overload variant
+np.trace(A, axis1="bob")  # E: No overload variant
+np.trace(A, axis2=[])  # E: No overload variant
  
-np.ravel(a, order="bob")  # E: Argument "order" to "ravel" has incompatible type
+np.ravel(a, order="bob")  # E: No overload variant
  
-np.compress(
-    [True], A, axis=1.0  # E: Argument "axis" to "compress" has incompatible type
+np.compress(  # E: No overload variant
+    [True], A, axis=1.0
  )
  
  np.clip(a, 1, 2, out=1)  # E: No overload variant of "clip" matches argument type
-np.clip(1, None, None)  # E: No overload variant of "clip" matches argument type
  
-np.sum(a, axis=1.0)  # E: incompatible type
-np.sum(a, keepdims=1.0)  # E: incompatible type
-np.sum(a, initial=[1])  # E: incompatible type
+np.sum(a, axis=1.0)  # E: No overload variant
+np.sum(a, keepdims=1.0)  # E: No overload variant
+np.sum(a, initial=[1])  # E: No overload variant
  
  np.all(a, axis=1.0)  # E: No overload variant
  np.all(a, keepdims=1.0)  # E: No overload variant
@@ -105,50 +106,56 @@ np.any(a, axis=1.0)  # E: No overload variant
  np.any(a, keepdims=1.0)  # E: No overload variant
  np.any(a, out=1.0)  # E: No overload variant
  
-np.cumsum(a, axis=1.0)  # E: incompatible type
-np.cumsum(a, dtype=1.0)  # E: incompatible type
-np.cumsum(a, out=1.0)  # E: incompatible type
+np.cumsum(a, axis=1.0)  # E: No overload variant
+np.cumsum(a, dtype=1.0)  # E: No overload variant
+np.cumsum(a, out=1.0)  # E: No overload variant
  
-np.ptp(a, axis=1.0)  # E: incompatible type
-np.ptp(a, keepdims=1.0)  # E: incompatible type
-np.ptp(a, out=1.0)  # E: incompatible type
+np.ptp(a, axis=1.0)  # E: No overload variant
+np.ptp(a, keepdims=1.0)  # E: No overload variant
+np.ptp(a, out=1.0)  # E: No overload variant
  
-np.amax(a, axis=1.0)  # E: incompatible type
-np.amax(a, keepdims=1.0)  # E: incompatible type
-np.amax(a, out=1.0)  # E: incompatible type
-np.amax(a, initial=[1.0])  # E: incompatible type
+np.amax(a, axis=1.0)  # E: No overload variant
+np.amax(a, keepdims=1.0)  # E: No overload variant
+np.amax(a, out=1.0)  # E: No overload variant
+np.amax(a, initial=[1.0])  # E: No overload variant
  np.amax(a, where=[1.0])  # E: incompatible type
  
-np.amin(a, axis=1.0)  # E: incompatible type
-np.amin(a, keepdims=1.0)  # E: incompatible type
-np.amin(a, out=1.0)  # E: incompatible type
-np.amin(a, initial=[1.0])  # E: incompatible type
+np.amin(a, axis=1.0)  # E: No overload variant
+np.amin(a, keepdims=1.0)  # E: No overload variant
+np.amin(a, out=1.0)  # E: No overload variant
+np.amin(a, initial=[1.0])  # E: No overload variant
  np.amin(a, where=[1.0])  # E: incompatible type
  
-np.prod(a, axis=1.0)  # E: incompatible type
-np.prod(a, out=False)  # E: incompatible type
-np.prod(a, keepdims=1.0)  # E: incompatible type
-np.prod(a, initial=int)  # E: incompatible type
-np.prod(a, where=1.0)  # E: incompatible type
+np.prod(a, axis=1.0)  # E: No overload variant
+np.prod(a, out=False)  # E: No overload variant
+np.prod(a, keepdims=1.0)  # E: No overload variant
+np.prod(a, initial=int)  # E: No overload variant
+np.prod(a, where=1.0)  # E: No overload variant
+np.prod(AR_U)  # E: incompatible type
  
-np.cumprod(a, axis=1.0)  # E: Argument "axis" to "cumprod" has incompatible type
-np.cumprod(a, out=False)  # E: Argument "out" to "cumprod" has incompatible type
+np.cumprod(a, axis=1.0)  # E: No overload variant
+np.cumprod(a, out=False)  # E: No overload variant
+np.cumprod(AR_U)  # E: incompatible type
  
  np.size(a, axis=1.0)  # E: Argument "axis" to "size" has incompatible type
  
-np.around(a, decimals=1.0)  # E: incompatible type
-np.around(a, out=type)  # E: incompatible type
-
-np.mean(a, axis=1.0)  # E: incompatible type
-np.mean(a, out=False)  # E: incompatible type
-np.mean(a, keepdims=1.0)  # E: incompatible type
-
-np.std(a, axis=1.0)  # E: incompatible type
-np.std(a, out=False)  # E: incompatible type
-np.std(a, ddof='test')  # E: incompatible type
-np.std(a, keepdims=1.0)  # E: incompatible type
-
-np.var(a, axis=1.0)  # E: incompatible type
-np.var(a, out=False)  # E: incompatible type
-np.var(a, ddof='test')  # E: incompatible type
-np.var(a, keepdims=1.0)  # E: incompatible type
+np.around(a, decimals=1.0)  # E: No overload variant
+np.around(a, out=type)  # E: No overload variant
+np.around(AR_U)  # E: incompatible type
+
+np.mean(a, axis=1.0)  # E: No overload variant
+np.mean(a, out=False)  # E: No overload variant
+np.mean(a, keepdims=1.0)  # E: No overload variant
+np.mean(AR_U)  # E: incompatible type
+
+np.std(a, axis=1.0)  # E: No overload variant
+np.std(a, out=False)  # E: No overload variant
+np.std(a, ddof='test')  # E: No overload variant
+np.std(a, keepdims=1.0)  # E: No overload variant
+np.std(AR_U)  # E: incompatible type
+
+np.var(a, axis=1.0)  # E: No overload variant
+np.var(a, out=False)  # E: No overload variant
+np.var(a, ddof='test')  # E: No overload variant
+np.var(a, keepdims=1.0)  # E: No overload variant
+np.var(AR_U)  # E: incompatible type
diff --git a/numpy/typing/tests/data/fail/index_tricks.pyi b/numpy/typing/tests/data/fail/index_tricks.pyi
index 565e81a9ab256a56602a383f7f2105cd17caa2a0..22f6f4a61e8e11079e40d3755b0c01200ffdf762 100644
--- a/numpy/typing/tests/data/fail/index_tricks.pyi
+++ b/numpy/typing/tests/data/fail/index_tricks.pyi
@@ -1,8 +1,7 @@
-from typing import List
  import numpy as np
  
-AR_LIKE_i: List[int]
-AR_LIKE_f: List[float]
+AR_LIKE_i: list[int]
+AR_LIKE_f: list[float]
  
  np.ndindex([1, 2, 3])  # E: No overload variant
  np.unravel_index(AR_LIKE_f, (1, 2, 3))  # E: incompatible type
diff --git a/numpy/typing/tests/data/fail/multiarray.pyi b/numpy/typing/tests/data/fail/multiarray.pyi
index 22bcf8c92c909667151cb73f6ac3f0f063b1cb23..425ec3d0fb4f2b6120ed497d39bff344c6e097cb 100644
--- a/numpy/typing/tests/data/fail/multiarray.pyi
+++ b/numpy/typing/tests/data/fail/multiarray.pyi
@@ -1,4 +1,3 @@
-from typing import List
  import numpy as np
  import numpy.typing as npt
  
@@ -12,7 +11,7 @@ AR_M: npt.NDArray[np.datetime64]
  
  M: np.datetime64
  
-AR_LIKE_f: List[float]
+AR_LIKE_f: list[float]
  
  def func(a: int) -> None: ...
  
@@ -40,7 +39,7 @@ np.arange(stop=10)  # E: No overload variant
  
  np.datetime_data(int)  # E: incompatible type
  
-np.busday_offset("2012", 10)  # E: incompatible type
+np.busday_offset("2012", 10)  # E: No overload variant
  
  np.datetime_as_string("2012")  # E: No overload variant
  
diff --git a/numpy/typing/tests/data/fail/ndarray_misc.pyi b/numpy/typing/tests/data/fail/ndarray_misc.pyi
index 8320a44f3caa773975d01399c1040b69a7e5d3e8..77bd9a44e8902ce85ae9b4e6e4d94c64a16f4f6e 100644
--- a/numpy/typing/tests/data/fail/ndarray_misc.pyi
+++ b/numpy/typing/tests/data/fail/ndarray_misc.pyi
@@ -39,3 +39,5 @@ AR_b.__index__()  # E: Invalid self argument
  AR_f8[1.5]  # E: No overload variant
  AR_f8["field_a"]  # E: No overload variant
  AR_f8[["field_a", "field_b"]]  # E: Invalid index type
+
+AR_f8.__array_finalize__(object())  # E: incompatible type
diff --git a/numpy/typing/tests/data/fail/nested_sequence.pyi b/numpy/typing/tests/data/fail/nested_sequence.pyi
index e28661a058e963b26c6cf2fdd89b2c85e09a38ad..6301e51769fee30db50bfaf1e2777bf894166de8 100644
--- a/numpy/typing/tests/data/fail/nested_sequence.pyi
+++ b/numpy/typing/tests/data/fail/nested_sequence.pyi
@@ -1,13 +1,13 @@
-from typing import Sequence, Tuple, List
-import numpy.typing as npt
+from collections.abc import Sequence
+from numpy._typing import _NestedSequence
  
  a: Sequence[float]
-b: List[complex]
-c: Tuple[str, ...]
+b: list[complex]
+c: tuple[str, ...]
  d: int
  e: str
  
-def func(a: npt._NestedSequence[int]) -> None:
+def func(a: _NestedSequence[int]) -> None:
      ...
  
  reveal_type(func(a))  # E: incompatible type
diff --git a/numpy/typing/tests/data/fail/random.pyi b/numpy/typing/tests/data/fail/random.pyi
index c4d1e3e3e8024c3b539061d71a27c79200f99d2c..f0e682019281662559b2cf6bcece236a00695351 100644
--- a/numpy/typing/tests/data/fail/random.pyi
+++ b/numpy/typing/tests/data/fail/random.pyi
@@ -1,9 +1,9 @@
  import numpy as np
-from typing import Any, List
+from typing import Any
  
  SEED_FLOAT: float = 457.3
  SEED_ARR_FLOAT: np.ndarray[Any, np.dtype[np.float64]] = np.array([1.0, 2, 3, 4])
-SEED_ARRLIKE_FLOAT: List[float] = [1.0, 2.0, 3.0, 4.0]
+SEED_ARRLIKE_FLOAT: list[float] = [1.0, 2.0, 3.0, 4.0]
  SEED_SEED_SEQ: np.random.SeedSequence = np.random.SeedSequence(0)
  SEED_STR: str = "String seeding not allowed"
  # default rng
diff --git a/numpy/typing/tests/data/fail/testing.pyi b/numpy/typing/tests/data/fail/testing.pyi
index e753a9810ab3318992872c1dfd16902cf0c8a303..803870e2feadd18815ebc57665aa63d42423c752 100644
--- a/numpy/typing/tests/data/fail/testing.pyi
+++ b/numpy/typing/tests/data/fail/testing.pyi
@@ -22,5 +22,7 @@ np.testing.assert_array_max_ulp(AR_U, AR_U)  # E: incompatible type
  
  np.testing.assert_warns(warning_class=RuntimeWarning, func=func)  # E: No overload variant
  np.testing.assert_no_warnings(func=func)  # E: No overload variant
+np.testing.assert_no_warnings(func, None)  # E: Too many arguments
+np.testing.assert_no_warnings(func, test=None)  # E: Unexpected keyword argument
  
  np.testing.assert_no_gc_cycles(func=func)  # E: No overload variant
diff --git a/numpy/typing/tests/data/fail/twodim_base.pyi b/numpy/typing/tests/data/fail/twodim_base.pyi
index ab34a374ccf53eb706efc8aa1da5c86916950f7d..faa430095a5fabbf721732ecec867cac434e9259 100644
--- a/numpy/typing/tests/data/fail/twodim_base.pyi
+++ b/numpy/typing/tests/data/fail/twodim_base.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List, TypeVar
+from typing import Any, TypeVar
  
  import numpy as np
  import numpy.typing as npt
@@ -15,7 +15,7 @@ def func2(ar: npt.NDArray[Any], a: float) -> float:
  AR_b: npt.NDArray[np.bool_]
  AR_m: npt.NDArray[np.timedelta64]
  
-AR_LIKE_b: List[bool]
+AR_LIKE_b: list[bool]
  
  np.eye(10, M=20.0)  # E: No overload variant
  np.eye(10, k=2.5, dtype=int)  # E: No overload variant
diff --git a/numpy/typing/tests/data/fail/ufunclike.pyi b/numpy/typing/tests/data/fail/ufunclike.pyi
index 82a5f3a1d091f9cd6119c74f36ba07fd46fbf4fd..2f9fd14c8cf2082bfaf6b4e6a816cafd5299e47f 100644
--- a/numpy/typing/tests/data/fail/ufunclike.pyi
+++ b/numpy/typing/tests/data/fail/ufunclike.pyi
@@ -1,4 +1,4 @@
-from typing import List, Any
+from typing import Any
  import numpy as np
  
  AR_c: np.ndarray[Any, np.dtype[np.complex128]]
diff --git a/numpy/typing/tests/data/pass/arithmetic.py b/numpy/typing/tests/data/pass/arithmetic.py
index fe1612906e01a7fb7811b23eb93168ebd758f86f..4ed69c9238a3a1dc428340437a8f9ab2d9de0b9b 100644
--- a/numpy/typing/tests/data/pass/arithmetic.py
+++ b/numpy/typing/tests/data/pass/arithmetic.py
@@ -185,6 +185,10 @@ AR_LIKE_f - AR_O
  AR_LIKE_c - AR_O
  AR_LIKE_O - AR_O
  
+AR_u += AR_b
+AR_u += AR_u
+AR_u += 1  # Allowed during runtime as long as the object is 0D and >=0
+
  # Array floor division
  
  AR_b // AR_LIKE_b
diff --git a/numpy/typing/tests/data/pass/array_constructors.py b/numpy/typing/tests/data/pass/array_constructors.py
index 2763d9c9272ac6acb0143a16319750f540dd5989..e035a73c6fe914a14f80131184f6c78ccc3d84f1 100644
--- a/numpy/typing/tests/data/pass/array_constructors.py
+++ b/numpy/typing/tests/data/pass/array_constructors.py
@@ -23,9 +23,8 @@ B = A.view(SubClass).copy()
  B_stack = np.array([[1], [1]]).view(SubClass)
  C = [1]
  
-if sys.version_info >= (3, 8):
-    np.ndarray(Index())
-    np.ndarray([Index()])
+np.ndarray(Index())
+np.ndarray([Index()])
  
  np.array(1, dtype=float)
  np.array(1, copy=False)
diff --git a/numpy/typing/tests/data/pass/array_like.py b/numpy/typing/tests/data/pass/array_like.py
index 5bd2fda20e5cc55c0a93dcfc0ea6f7207b7ebad0..da2520e961e7a3b2d38e0e04183378851c17b479 100644
--- a/numpy/typing/tests/data/pass/array_like.py
+++ b/numpy/typing/tests/data/pass/array_like.py
@@ -1,7 +1,9 @@
-from typing import Any, Optional
+from __future__ import annotations
+
+from typing import Any
  
  import numpy as np
-from numpy.typing import ArrayLike, _SupportsArray
+from numpy._typing import ArrayLike, _SupportsArray
  
  x1: ArrayLike = True
  x2: ArrayLike = 5
@@ -18,7 +20,7 @@ x12: ArrayLike = memoryview(b'foo')
  
  
  class A:
-    def __array__(self, dtype: Optional[np.dtype] = None) -> np.ndarray:
+    def __array__(self, dtype: None | np.dtype[Any] = None) -> np.ndarray:
          return np.array([1, 2, 3])
  
  
diff --git a/numpy/typing/tests/data/pass/literal.py b/numpy/typing/tests/data/pass/literal.py
index 8eaeb6afb2ad990a745a921a7c42bcdde71ff577..d06431eed4da31ada71eeb3947f6b238e7b2fb74 100644
--- a/numpy/typing/tests/data/pass/literal.py
+++ b/numpy/typing/tests/data/pass/literal.py
@@ -1,5 +1,7 @@
+from __future__ import annotations
+
  from functools import partial
-from typing import Callable, List, Tuple
+from collections.abc import Callable
  
  import pytest  # type: ignore
  import numpy as np
@@ -11,7 +13,7 @@ KACF = frozenset({None, "K", "A", "C", "F"})
  ACF = frozenset({None, "A", "C", "F"})
  CF = frozenset({None, "C", "F"})
  
-order_list: List[Tuple[frozenset, Callable]] = [
+order_list: list[tuple[frozenset, Callable]] = [
      (KACF, partial(np.ndarray, 1)),
      (KACF, AR.tobytes),
      (KACF, partial(AR.astype, int)),
diff --git a/numpy/typing/tests/data/pass/numeric.py b/numpy/typing/tests/data/pass/numeric.py
index 34fef7270443a0c47b2891fbc6a61bfeba09144f..c4a73c1e9b7c2792da739047b1d7c88c22c6acfb 100644
--- a/numpy/typing/tests/data/pass/numeric.py
+++ b/numpy/typing/tests/data/pass/numeric.py
@@ -5,7 +5,8 @@ Does not include tests which fall under ``array_constructors``.
  
  """
  
-from typing import List
+from __future__ import annotations
+
  import numpy as np
  
  class SubClass(np.ndarray):
@@ -14,7 +15,7 @@ class SubClass(np.ndarray):
  i8 = np.int64(1)
  
  A = np.arange(27).reshape(3, 3, 3)
-B: List[List[List[int]]] = A.tolist()
+B: list[list[list[int]]] = A.tolist()
  C = np.empty((27, 27)).view(SubClass)
  
  np.count_nonzero(i8)
diff --git a/numpy/typing/tests/data/pass/numerictypes.py b/numpy/typing/tests/data/pass/numerictypes.py
index 5af0d171ca0482bb6e1ad73b9b3e7277b635aec9..7f1dd0945438c636dc21315b0ccf60b907c106a2 100644
--- a/numpy/typing/tests/data/pass/numerictypes.py
+++ b/numpy/typing/tests/data/pass/numerictypes.py
@@ -38,9 +38,9 @@ np.nbytes[np.int64]
  
  np.ScalarType
  np.ScalarType[0]
-np.ScalarType[4]
-np.ScalarType[9]
-np.ScalarType[11]
+np.ScalarType[3]
+np.ScalarType[8]
+np.ScalarType[10]
  
  np.typecodes["Character"]
  np.typecodes["Complex"]
diff --git a/numpy/typing/tests/data/pass/random.py b/numpy/typing/tests/data/pass/random.py
index 05bd62112ff2211af59c1affda1e293ae92f06e4..9816cd2c3f95c821ee65e4f82a8d8075159cdd39 100644
--- a/numpy/typing/tests/data/pass/random.py
+++ b/numpy/typing/tests/data/pass/random.py
@@ -1,13 +1,12 @@
  from __future__ import annotations
  
-from typing import Any, List, Dict
-
+from typing import Any
  import numpy as np
  
  SEED_NONE = None
  SEED_INT = 4579435749574957634658964293569
  SEED_ARR: np.ndarray[Any, np.dtype[np.int64]] = np.array([1, 2, 3, 4], dtype=np.int64)
-SEED_ARRLIKE: List[int] = [1, 2, 3, 4]
+SEED_ARRLIKE: list[int] = [1, 2, 3, 4]
  SEED_SEED_SEQ: np.random.SeedSequence = np.random.SeedSequence(0)
  SEED_MT19937: np.random.MT19937 = np.random.MT19937(0)
  SEED_PCG64: np.random.PCG64 = np.random.PCG64(0)
@@ -76,13 +75,13 @@ D_arr_0p9: np.ndarray[Any, np.dtype[np.float64]] = np.array([0.9])
  D_arr_1p5: np.ndarray[Any, np.dtype[np.float64]] = np.array([1.5])
  I_arr_10: np.ndarray[Any, np.dtype[np.int_]] = np.array([10], dtype=np.int_)
  I_arr_20: np.ndarray[Any, np.dtype[np.int_]] = np.array([20], dtype=np.int_)
-D_arr_like_0p1: List[float] = [0.1]
-D_arr_like_0p5: List[float] = [0.5]
-D_arr_like_0p9: List[float] = [0.9]
-D_arr_like_1p5: List[float] = [1.5]
-I_arr_like_10: List[int] = [10]
-I_arr_like_20: List[int] = [20]
-D_2D_like: List[List[float]] = [[1, 2], [2, 3], [3, 4], [4, 5.1]]
+D_arr_like_0p1: list[float] = [0.1]
+D_arr_like_0p5: list[float] = [0.5]
+D_arr_like_0p9: list[float] = [0.9]
+D_arr_like_1p5: list[float] = [1.5]
+I_arr_like_10: list[int] = [10]
+I_arr_like_20: list[int] = [20]
+D_2D_like: list[list[float]] = [[1, 2], [2, 3], [3, 4], [4, 5.1]]
  D_2D: np.ndarray[Any, np.dtype[np.float64]] = np.array(D_2D_like)
  
  S_out: np.ndarray[Any, np.dtype[np.float32]] = np.empty(1, dtype=np.float32)
@@ -499,7 +498,7 @@ def_gen.integers([100])
  def_gen.integers(0, [100])
  
  I_bool_low: np.ndarray[Any, np.dtype[np.bool_]] = np.array([0], dtype=np.bool_)
-I_bool_low_like: List[int] = [0]
+I_bool_low_like: list[int] = [0]
  I_bool_high_open: np.ndarray[Any, np.dtype[np.bool_]] = np.array([1], dtype=np.bool_)
  I_bool_high_closed: np.ndarray[Any, np.dtype[np.bool_]] = np.array([1], dtype=np.bool_)
  
@@ -528,7 +527,7 @@ def_gen.integers(I_bool_low, I_bool_high_closed, dtype=np.bool_, endpoint=True)
  def_gen.integers(0, I_bool_high_closed, dtype=np.bool_, endpoint=True)
  
  I_u1_low: np.ndarray[Any, np.dtype[np.uint8]] = np.array([0], dtype=np.uint8)
-I_u1_low_like: List[int] = [0]
+I_u1_low_like: list[int] = [0]
  I_u1_high_open: np.ndarray[Any, np.dtype[np.uint8]] = np.array([255], dtype=np.uint8)
  I_u1_high_closed: np.ndarray[Any, np.dtype[np.uint8]] = np.array([255], dtype=np.uint8)
  
@@ -569,7 +568,7 @@ def_gen.integers(I_u1_low, I_u1_high_closed, dtype=np.uint8, endpoint=True)
  def_gen.integers(0, I_u1_high_closed, dtype=np.uint8, endpoint=True)
  
  I_u2_low: np.ndarray[Any, np.dtype[np.uint16]] = np.array([0], dtype=np.uint16)
-I_u2_low_like: List[int] = [0]
+I_u2_low_like: list[int] = [0]
  I_u2_high_open: np.ndarray[Any, np.dtype[np.uint16]] = np.array([65535], dtype=np.uint16)
  I_u2_high_closed: np.ndarray[Any, np.dtype[np.uint16]] = np.array([65535], dtype=np.uint16)
  
@@ -610,7 +609,7 @@ def_gen.integers(I_u2_low, I_u2_high_closed, dtype=np.uint16, endpoint=True)
  def_gen.integers(0, I_u2_high_closed, dtype=np.uint16, endpoint=True)
  
  I_u4_low: np.ndarray[Any, np.dtype[np.uint32]] = np.array([0], dtype=np.uint32)
-I_u4_low_like: List[int] = [0]
+I_u4_low_like: list[int] = [0]
  I_u4_high_open: np.ndarray[Any, np.dtype[np.uint32]] = np.array([4294967295], dtype=np.uint32)
  I_u4_high_closed: np.ndarray[Any, np.dtype[np.uint32]] = np.array([4294967295], dtype=np.uint32)
  
@@ -651,7 +650,7 @@ def_gen.integers(I_u4_low, I_u4_high_closed, dtype=np.uint32, endpoint=True)
  def_gen.integers(0, I_u4_high_closed, dtype=np.uint32, endpoint=True)
  
  I_u8_low: np.ndarray[Any, np.dtype[np.uint64]] = np.array([0], dtype=np.uint64)
-I_u8_low_like: List[int] = [0]
+I_u8_low_like: list[int] = [0]
  I_u8_high_open: np.ndarray[Any, np.dtype[np.uint64]] = np.array([18446744073709551615], dtype=np.uint64)
  I_u8_high_closed: np.ndarray[Any, np.dtype[np.uint64]] = np.array([18446744073709551615], dtype=np.uint64)
  
@@ -692,7 +691,7 @@ def_gen.integers(I_u8_low, I_u8_high_closed, dtype=np.uint64, endpoint=True)
  def_gen.integers(0, I_u8_high_closed, dtype=np.uint64, endpoint=True)
  
  I_i1_low: np.ndarray[Any, np.dtype[np.int8]] = np.array([-128], dtype=np.int8)
-I_i1_low_like: List[int] = [-128]
+I_i1_low_like: list[int] = [-128]
  I_i1_high_open: np.ndarray[Any, np.dtype[np.int8]] = np.array([127], dtype=np.int8)
  I_i1_high_closed: np.ndarray[Any, np.dtype[np.int8]] = np.array([127], dtype=np.int8)
  
@@ -733,7 +732,7 @@ def_gen.integers(I_i1_low, I_i1_high_closed, dtype=np.int8, endpoint=True)
  def_gen.integers(-128, I_i1_high_closed, dtype=np.int8, endpoint=True)
  
  I_i2_low: np.ndarray[Any, np.dtype[np.int16]] = np.array([-32768], dtype=np.int16)
-I_i2_low_like: List[int] = [-32768]
+I_i2_low_like: list[int] = [-32768]
  I_i2_high_open: np.ndarray[Any, np.dtype[np.int16]] = np.array([32767], dtype=np.int16)
  I_i2_high_closed: np.ndarray[Any, np.dtype[np.int16]] = np.array([32767], dtype=np.int16)
  
@@ -774,7 +773,7 @@ def_gen.integers(I_i2_low, I_i2_high_closed, dtype=np.int16, endpoint=True)
  def_gen.integers(-32768, I_i2_high_closed, dtype=np.int16, endpoint=True)
  
  I_i4_low: np.ndarray[Any, np.dtype[np.int32]] = np.array([-2147483648], dtype=np.int32)
-I_i4_low_like: List[int] = [-2147483648]
+I_i4_low_like: list[int] = [-2147483648]
  I_i4_high_open: np.ndarray[Any, np.dtype[np.int32]] = np.array([2147483647], dtype=np.int32)
  I_i4_high_closed: np.ndarray[Any, np.dtype[np.int32]] = np.array([2147483647], dtype=np.int32)
  
@@ -815,7 +814,7 @@ def_gen.integers(I_i4_low, I_i4_high_closed, dtype=np.int32, endpoint=True)
  def_gen.integers(-2147483648, I_i4_high_closed, dtype=np.int32, endpoint=True)
  
  I_i8_low: np.ndarray[Any, np.dtype[np.int64]] = np.array([-9223372036854775808], dtype=np.int64)
-I_i8_low_like: List[int] = [-9223372036854775808]
+I_i8_low_like: list[int] = [-9223372036854775808]
  I_i8_high_open: np.ndarray[Any, np.dtype[np.int64]] = np.array([9223372036854775807], dtype=np.int64)
  I_i8_high_closed: np.ndarray[Any, np.dtype[np.int64]] = np.array([9223372036854775807], dtype=np.int64)
  
@@ -912,7 +911,7 @@ def_gen.shuffle(D_2D, axis=1)
  
  def_gen.__str__()
  def_gen.__repr__()
-def_gen_state: Dict[str, Any]
+def_gen_state: dict[str, Any]
  def_gen_state = def_gen.__getstate__()
  def_gen.__setstate__(def_gen_state)
  
diff --git a/numpy/typing/tests/data/pass/scalars.py b/numpy/typing/tests/data/pass/scalars.py
index b258db49fd7c7878364fef5933db23e3a63a0a79..684d41fad61fd0631d6de732091d4100e05f6b2d 100644
--- a/numpy/typing/tests/data/pass/scalars.py
+++ b/numpy/typing/tests/data/pass/scalars.py
@@ -59,10 +59,9 @@ np.float64(None)
  np.float32("1")
  np.float16(b"2.5")
  
-if sys.version_info >= (3, 8):
-    np.uint64(D())
-    np.float32(D())
-    np.complex64(D())
+np.uint64(D())
+np.float32(D())
+np.complex64(D())
  
  np.bytes_(b"hello")
  np.bytes_("hello", 'utf-8')
diff --git a/numpy/typing/tests/data/pass/simple.py b/numpy/typing/tests/data/pass/simple.py
index 85965e0de707593239ba55b1a71502ad0852d353..03ca3e83f74b46c4ffdd407b66e2ab309116f3f1 100644
--- a/numpy/typing/tests/data/pass/simple.py
+++ b/numpy/typing/tests/data/pass/simple.py
@@ -2,7 +2,7 @@
  import operator
  
  import numpy as np
-from typing import Iterable  # noqa: F401
+from collections.abc import Iterable
  
  # Basic checks
  array = np.array([1, 2])
diff --git a/numpy/typing/tests/data/reveal/arithmetic.pyi b/numpy/typing/tests/data/reveal/arithmetic.pyi
index c5b46746980d2596579a2d0bab4b5f7915b80481..0ca5e9772958313c0e24402d672a6ef09774d11f 100644
--- a/numpy/typing/tests/data/reveal/arithmetic.pyi
+++ b/numpy/typing/tests/data/reveal/arithmetic.pyi
@@ -1,9 +1,10 @@
-from typing import Any, List
+from typing import Any
+
  import numpy as np
-import numpy.typing as npt
+from numpy._typing import NDArray, _128Bit
  
  # Can't directly import `np.float128` as it is not available on all platforms
-f16: np.floating[npt._128Bit]
+f16: np.floating[_128Bit]
  
  c16 = np.complex128()
  f8 = np.float64()
@@ -33,18 +34,21 @@ AR_c: np.ndarray[Any, np.dtype[np.complex128]]
  AR_m: np.ndarray[Any, np.dtype[np.timedelta64]]
  AR_M: np.ndarray[Any, np.dtype[np.datetime64]]
  AR_O: np.ndarray[Any, np.dtype[np.object_]]
+AR_number: NDArray[np.number[Any]]
  
-AR_LIKE_b: List[bool]
-AR_LIKE_u: List[np.uint32]
-AR_LIKE_i: List[int]
-AR_LIKE_f: List[float]
-AR_LIKE_c: List[complex]
-AR_LIKE_m: List[np.timedelta64]
-AR_LIKE_M: List[np.datetime64]
-AR_LIKE_O: List[np.object_]
+AR_LIKE_b: list[bool]
+AR_LIKE_u: list[np.uint32]
+AR_LIKE_i: list[int]
+AR_LIKE_f: list[float]
+AR_LIKE_c: list[complex]
+AR_LIKE_m: list[np.timedelta64]
+AR_LIKE_M: list[np.datetime64]
+AR_LIKE_O: list[np.object_]
  
  # Array subtraction
  
+reveal_type(AR_number - AR_number)  # E: ndarray[Any, dtype[number[Any]]]
+
  reveal_type(AR_b - AR_LIKE_u)  # E: ndarray[Any, dtype[unsignedinteger[Any]]]
  reveal_type(AR_b - AR_LIKE_i)  # E: ndarray[Any, dtype[signedinteger[Any]]]
  reveal_type(AR_b - AR_LIKE_f)  # E: ndarray[Any, dtype[floating[Any]]]
diff --git a/numpy/typing/tests/data/reveal/array_constructors.pyi b/numpy/typing/tests/data/reveal/array_constructors.pyi
index 265690a4df42699220b6edd6c7104bceaf82b962..4a865a31c22b283faec6715822e0ee2a6c7947be 100644
--- a/numpy/typing/tests/data/reveal/array_constructors.pyi
+++ b/numpy/typing/tests/data/reveal/array_constructors.pyi
@@ -1,4 +1,4 @@
-from typing import List, Any, TypeVar
+from typing import Any, TypeVar
  from pathlib import Path
  
  import numpy as np
@@ -12,7 +12,7 @@ i8: np.int64
  
  A: npt.NDArray[np.float64]
  B: SubClass[np.float64]
-C: List[int]
+C: list[int]
  
  def func(i: int, j: int, **kwargs: Any) -> SubClass[np.float64]: ...
  
@@ -28,6 +28,7 @@ reveal_type(np.array(B, subok=True))  # E: SubClass[{float64}]
  reveal_type(np.array([1, 1.0]))  # E: ndarray[Any, dtype[Any]]
  reveal_type(np.array(A, dtype=np.int64))  # E: ndarray[Any, dtype[{int64}]]
  reveal_type(np.array(A, dtype='c16'))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.array(A, like=A))  # E: ndarray[Any, dtype[{float64}]]
  
  reveal_type(np.zeros([1, 5, 6]))  # E: ndarray[Any, dtype[{float64}]]
  reveal_type(np.zeros([1, 5, 6], dtype=np.int64))  # E: ndarray[Any, dtype[{int64}]]
@@ -119,10 +120,24 @@ reveal_type(np.require(B, requirements="W"))  # E: SubClass[{float64}]
  reveal_type(np.require(B, requirements="A"))  # E: SubClass[{float64}]
  reveal_type(np.require(C))  # E: ndarray[Any, Any]
  
-reveal_type(np.linspace(0, 10))  # E: ndarray[Any, Any]
-reveal_type(np.linspace(0, 10, retstep=True))  # E: Tuple[ndarray[Any, Any], Any]
-reveal_type(np.logspace(0, 10))  # E: ndarray[Any, Any]
-reveal_type(np.geomspace(1, 10))  # E: ndarray[Any, Any]
+reveal_type(np.linspace(0, 10))  # E: ndarray[Any, dtype[floating[Any]]]
+reveal_type(np.linspace(0, 10j))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+reveal_type(np.linspace(0, 10, dtype=np.int64))  # E: ndarray[Any, dtype[{int64}]]
+reveal_type(np.linspace(0, 10, dtype=int))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.linspace(0, 10, retstep=True))  # E: Tuple[ndarray[Any, dtype[floating[Any]]], floating[Any]]
+reveal_type(np.linspace(0j, 10, retstep=True))  # E: Tuple[ndarray[Any, dtype[complexfloating[Any, Any]]], complexfloating[Any, Any]]
+reveal_type(np.linspace(0, 10, retstep=True, dtype=np.int64))  # E: Tuple[ndarray[Any, dtype[{int64}]], {int64}]
+reveal_type(np.linspace(0j, 10, retstep=True, dtype=int))  # E: Tuple[ndarray[Any, dtype[Any]], Any]
+
+reveal_type(np.logspace(0, 10))  # E: ndarray[Any, dtype[floating[Any]]]
+reveal_type(np.logspace(0, 10j))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+reveal_type(np.logspace(0, 10, dtype=np.int64))  # E: ndarray[Any, dtype[{int64}]]
+reveal_type(np.logspace(0, 10, dtype=int))  # E: ndarray[Any, dtype[Any]]
+
+reveal_type(np.geomspace(0, 10))  # E: ndarray[Any, dtype[floating[Any]]]
+reveal_type(np.geomspace(0, 10j))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+reveal_type(np.geomspace(0, 10, dtype=np.int64))  # E: ndarray[Any, dtype[{int64}]]
+reveal_type(np.geomspace(0, 10, dtype=int))  # E: ndarray[Any, dtype[Any]]
  
  reveal_type(np.zeros_like(A))  # E: ndarray[Any, dtype[{float64}]]
  reveal_type(np.zeros_like(C))  # E: ndarray[Any, dtype[Any]]
@@ -183,5 +198,5 @@ reveal_type(np.stack([C, C]))  # E: ndarray[Any, dtype[Any]]
  reveal_type(np.stack([A, A], axis=0))  # E: Any
  reveal_type(np.stack([A, A], out=B))  # E: SubClass[{float64}]
  
-reveal_type(np.block([[A, A], [A, A]]))  # E: ndarray[Any, Any]
+reveal_type(np.block([[A, A], [A, A]]))  # E: ndarray[Any, dtype[Any]]
  reveal_type(np.block(C))  # E: ndarray[Any, dtype[Any]]
diff --git a/numpy/typing/tests/data/reveal/arraypad.pyi b/numpy/typing/tests/data/reveal/arraypad.pyi
index 995f82b579e66f3671bacadad7c1476c646b13e5..a05d44034644ff1e2b19dbccd07baf2a4b00c0fd 100644
--- a/numpy/typing/tests/data/reveal/arraypad.pyi
+++ b/numpy/typing/tests/data/reveal/arraypad.pyi
@@ -1,18 +1,19 @@
-from typing import List, Any, Mapping, Tuple, SupportsIndex
+from collections.abc import Mapping
+from typing import Any, SupportsIndex
  
  import numpy as np
  import numpy.typing as npt
  
  def mode_func(
      ar: npt.NDArray[np.number[Any]],
-    width: Tuple[int, int],
+    width: tuple[int, int],
      iaxis: SupportsIndex,
      kwargs: Mapping[str, Any],
  ) -> None: ...
  
  AR_i8: npt.NDArray[np.int64]
  AR_f8: npt.NDArray[np.float64]
-AR_LIKE: List[int]
+AR_LIKE: list[int]
  
  reveal_type(np.pad(AR_i8, (2, 3), "constant"))  # E: ndarray[Any, dtype[{int64}]]
  reveal_type(np.pad(AR_LIKE, (2, 3), "constant"))  # E: ndarray[Any, dtype[Any]]
diff --git a/numpy/typing/tests/data/reveal/arrayprint.pyi b/numpy/typing/tests/data/reveal/arrayprint.pyi
index e797097ebb944e7bad5b6b78905aeaeea67076ac..6e65a8d8ad248c79de2054439abdb4d09f093cf4 100644
--- a/numpy/typing/tests/data/reveal/arrayprint.pyi
+++ b/numpy/typing/tests/data/reveal/arrayprint.pyi
@@ -1,4 +1,5 @@
-from typing import Any, Callable
+from collections.abc import Callable
+from typing import Any
  import numpy as np
  
  AR: np.ndarray[Any, Any]
diff --git a/numpy/typing/tests/data/reveal/char.pyi b/numpy/typing/tests/data/reveal/char.pyi
index ce8c1b2690a9e89aa62d957912511b67ce44acd4..0563b34727e4be89e15b01fee8ea3c339dd23df4 100644
--- a/numpy/typing/tests/data/reveal/char.pyi
+++ b/numpy/typing/tests/data/reveal/char.pyi
@@ -1,6 +1,6 @@
  import numpy as np
  import numpy.typing as npt
-from typing import Sequence
+from collections.abc import Sequence
  
  AR_U: npt.NDArray[np.str_]
  AR_S: npt.NDArray[np.bytes_]
diff --git a/numpy/typing/tests/data/reveal/chararray.pyi b/numpy/typing/tests/data/reveal/chararray.pyi
index 3da2e15993fe8907d0398930e1a0db352b18a5c5..61906c8606759fda5bb912491bca6e8798233df0 100644
--- a/numpy/typing/tests/data/reveal/chararray.pyi
+++ b/numpy/typing/tests/data/reveal/chararray.pyi
@@ -127,3 +127,6 @@ reveal_type(AR_S.istitle())  # E: ndarray[Any, dtype[bool_]]
  
  reveal_type(AR_U.isupper())  # E: ndarray[Any, dtype[bool_]]
  reveal_type(AR_S.isupper())  # E: ndarray[Any, dtype[bool_]]
+
+reveal_type(AR_U.__array_finalize__(object()))  # E: None
+reveal_type(AR_S.__array_finalize__(object()))  # E: None
diff --git a/numpy/typing/tests/data/reveal/einsumfunc.pyi b/numpy/typing/tests/data/reveal/einsumfunc.pyi
index 5b07e6d3c803dd6069cf65cafb0a2130c78acfd6..3c7146adaff9b74d595728182ca08c6def425c5b 100644
--- a/numpy/typing/tests/data/reveal/einsumfunc.pyi
+++ b/numpy/typing/tests/data/reveal/einsumfunc.pyi
@@ -1,12 +1,12 @@
-from typing import List, Any
+from typing import Any
  import numpy as np
  
-AR_LIKE_b: List[bool]
-AR_LIKE_u: List[np.uint32]
-AR_LIKE_i: List[int]
-AR_LIKE_f: List[float]
-AR_LIKE_c: List[complex]
-AR_LIKE_U: List[str]
+AR_LIKE_b: list[bool]
+AR_LIKE_u: list[np.uint32]
+AR_LIKE_i: list[int]
+AR_LIKE_f: list[float]
+AR_LIKE_c: list[complex]
+AR_LIKE_U: list[str]
  
  OUT_f: np.ndarray[Any, np.dtype[np.float64]]
  
diff --git a/numpy/typing/tests/data/reveal/emath.pyi b/numpy/typing/tests/data/reveal/emath.pyi
new file mode 100644
index 0000000..9ab2d72
--- /dev/null
+++ b/numpy/typing/tests/data/reveal/emath.pyi
@@ -0,0 +1,52 @@
+import numpy as np
+import numpy.typing as npt
+
+AR_f8: npt.NDArray[np.float64]
+AR_c16: npt.NDArray[np.complex128]
+f8: np.float64
+c16: np.complex128
+
+reveal_type(np.emath.sqrt(f8))  # E: Any
+reveal_type(np.emath.sqrt(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.sqrt(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.sqrt(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.log(f8))  # E: Any
+reveal_type(np.emath.log(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.log(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.log(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.log10(f8))  # E: Any
+reveal_type(np.emath.log10(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.log10(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.log10(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.log2(f8))  # E: Any
+reveal_type(np.emath.log2(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.log2(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.log2(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.logn(f8, 2))  # E: Any
+reveal_type(np.emath.logn(AR_f8, 4))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.logn(f8, 1j))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.logn(AR_c16, 1.5))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.power(f8, 2))  # E: Any
+reveal_type(np.emath.power(AR_f8, 4))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.power(f8, 2j))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.power(AR_c16, 1.5))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.arccos(f8))  # E: Any
+reveal_type(np.emath.arccos(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.arccos(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.arccos(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.arcsin(f8))  # E: Any
+reveal_type(np.emath.arcsin(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.arcsin(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.arcsin(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+
+reveal_type(np.emath.arctanh(f8))  # E: Any
+reveal_type(np.emath.arctanh(AR_f8))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.emath.arctanh(c16))  # E: complexfloating[Any, Any]
+reveal_type(np.emath.arctanh(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
diff --git a/numpy/typing/tests/data/reveal/fromnumeric.pyi b/numpy/typing/tests/data/reveal/fromnumeric.pyi
index cbd8d65b9f73f288d8711e07512195002c19e953..e769abcf5e526fd50ca41d37834ec8f2e32400ff 100644
--- a/numpy/typing/tests/data/reveal/fromnumeric.pyi
+++ b/numpy/typing/tests/data/reveal/fromnumeric.pyi
@@ -1,264 +1,297 @@
  """Tests for :mod:`core.fromnumeric`."""
  
  import numpy as np
+import numpy.typing as npt
+
+class NDArraySubclass(npt.NDArray[np.complex128]):
+    ...
+
+AR_b: npt.NDArray[np.bool_]
+AR_f4: npt.NDArray[np.float32]
+AR_c16: npt.NDArray[np.complex128]
+AR_u8: npt.NDArray[np.uint64]
+AR_i8: npt.NDArray[np.int64]
+AR_O: npt.NDArray[np.object_]
+AR_subclass: NDArraySubclass
+
+b: np.bool_
+f4: np.float32
+i8: np.int64
+f: float
+
+reveal_type(np.take(b, 0))  # E: bool_
+reveal_type(np.take(f4, 0))  # E: {float32}
+reveal_type(np.take(f, 0))  # E: Any
+reveal_type(np.take(AR_b, 0))  # E: bool_
+reveal_type(np.take(AR_f4, 0))  # E: {float32}
+reveal_type(np.take(AR_b, [0]))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.take(AR_f4, [0]))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.take([1], [0]))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.take(AR_f4, [0], out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.reshape(b, 1))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.reshape(f4, 1))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.reshape(f, 1))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.reshape(AR_b, 1))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.reshape(AR_f4, 1))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.choose(1, [True, True]))  # E: Any
+reveal_type(np.choose([1], [True, True]))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.choose([1], AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.choose([1], AR_b, out=AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.repeat(b, 1))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.repeat(f4, 1))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.repeat(f, 1))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.repeat(AR_b, 1))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.repeat(AR_f4, 1))  # E: ndarray[Any, dtype[{float32}]]
+
+# TODO: array_bdd tests for np.put()
+
+reveal_type(np.swapaxes([[0, 1]], 0, 0))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.swapaxes(AR_b, 0, 0))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.swapaxes(AR_f4, 0, 0))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.transpose(b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.transpose(f4))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.transpose(f))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.transpose(AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.transpose(AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.partition(b, 0, axis=None))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.partition(f4, 0, axis=None))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.partition(f, 0, axis=None))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.partition(AR_b, 0))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.partition(AR_f4, 0))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.argpartition(b, 0))  # E: ndarray[Any, dtype[{intp}]]
+reveal_type(np.argpartition(f4, 0))  # E: ndarray[Any, dtype[{intp}]]
+reveal_type(np.argpartition(f, 0))  # E: ndarray[Any, dtype[{intp}]]
+reveal_type(np.argpartition(AR_b, 0))  # E: ndarray[Any, dtype[{intp}]]
+reveal_type(np.argpartition(AR_f4, 0))  # E: ndarray[Any, dtype[{intp}]]
+
+reveal_type(np.sort([2, 1], 0))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.sort(AR_b, 0))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.sort(AR_f4, 0))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.argsort(AR_b, 0))  # E: ndarray[Any, dtype[{intp}]]
+reveal_type(np.argsort(AR_f4, 0))  # E: ndarray[Any, dtype[{intp}]]
+
+reveal_type(np.argmax(AR_b))  # E: {intp}
+reveal_type(np.argmax(AR_f4))  # E: {intp}
+reveal_type(np.argmax(AR_b, axis=0))  # E: Any
+reveal_type(np.argmax(AR_f4, axis=0))  # E: Any
+reveal_type(np.argmax(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.argmin(AR_b))  # E: {intp}
+reveal_type(np.argmin(AR_f4))  # E: {intp}
+reveal_type(np.argmin(AR_b, axis=0))  # E: Any
+reveal_type(np.argmin(AR_f4, axis=0))  # E: Any
+reveal_type(np.argmin(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.searchsorted(AR_b[0], 0))  # E: {intp}
+reveal_type(np.searchsorted(AR_f4[0], 0))  # E: {intp}
+reveal_type(np.searchsorted(AR_b[0], [0]))  # E: ndarray[Any, dtype[{intp}]]
+reveal_type(np.searchsorted(AR_f4[0], [0]))  # E: ndarray[Any, dtype[{intp}]]
+
+reveal_type(np.resize(b, (5, 5)))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.resize(f4, (5, 5)))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.resize(f, (5, 5)))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.resize(AR_b, (5, 5)))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.resize(AR_f4, (5, 5)))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.squeeze(b))  # E: bool_
+reveal_type(np.squeeze(f4))  # E: {float32}
+reveal_type(np.squeeze(f))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.squeeze(AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.squeeze(AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.diagonal(AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.diagonal(AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.trace(AR_b))  # E: Any
+reveal_type(np.trace(AR_f4))  # E: Any
+reveal_type(np.trace(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.ravel(b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.ravel(f4))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.ravel(f))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.ravel(AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.ravel(AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.nonzero(b))  # E: tuple[ndarray[Any, dtype[{intp}]], ...]
+reveal_type(np.nonzero(f4))  # E: tuple[ndarray[Any, dtype[{intp}]], ...]
+reveal_type(np.nonzero(f))  # E: tuple[ndarray[Any, dtype[{intp}]], ...]
+reveal_type(np.nonzero(AR_b))  # E: tuple[ndarray[Any, dtype[{intp}]], ...]
+reveal_type(np.nonzero(AR_f4))  # E: tuple[ndarray[Any, dtype[{intp}]], ...]
  
-A = np.array(True, ndmin=2, dtype=bool)
-B = np.array(1.0, ndmin=2, dtype=np.float32)
-A.setflags(write=False)
-B.setflags(write=False)
-
-a = np.bool_(True)
-b = np.float32(1.0)
-c = 1.0
-d = np.array(1.0, dtype=np.float32)  # writeable
-
-reveal_type(np.take(a, 0))  # E: Any
-reveal_type(np.take(b, 0))  # E: Any
-reveal_type(np.take(c, 0))  # E: Any
-reveal_type(np.take(A, 0))  # E: Any
-reveal_type(np.take(B, 0))  # E: Any
-reveal_type(np.take(A, [0]))  # E: Any
-reveal_type(np.take(B, [0]))  # E: Any
-
-reveal_type(np.reshape(a, 1))  # E: ndarray[Any, Any]
-reveal_type(np.reshape(b, 1))  # E: ndarray[Any, Any]
-reveal_type(np.reshape(c, 1))  # E: ndarray[Any, Any]
-reveal_type(np.reshape(A, 1))  # E: ndarray[Any, Any]
-reveal_type(np.reshape(B, 1))  # E: ndarray[Any, Any]
-
-reveal_type(np.choose(a, [True, True]))  # E: Any
-reveal_type(np.choose(A, [True, True]))  # E: Any
-
-reveal_type(np.repeat(a, 1))  # E: ndarray[Any, Any]
-reveal_type(np.repeat(b, 1))  # E: ndarray[Any, Any]
-reveal_type(np.repeat(c, 1))  # E: ndarray[Any, Any]
-reveal_type(np.repeat(A, 1))  # E: ndarray[Any, Any]
-reveal_type(np.repeat(B, 1))  # E: ndarray[Any, Any]
-
-# TODO: Add tests for np.put()
-
-reveal_type(np.swapaxes(A, 0, 0))  # E: ndarray[Any, Any]
-reveal_type(np.swapaxes(B, 0, 0))  # E: ndarray[Any, Any]
-
-reveal_type(np.transpose(a))  # E: ndarray[Any, Any]
-reveal_type(np.transpose(b))  # E: ndarray[Any, Any]
-reveal_type(np.transpose(c))  # E: ndarray[Any, Any]
-reveal_type(np.transpose(A))  # E: ndarray[Any, Any]
-reveal_type(np.transpose(B))  # E: ndarray[Any, Any]
-
-reveal_type(np.partition(a, 0, axis=None))  # E: ndarray[Any, Any]
-reveal_type(np.partition(b, 0, axis=None))  # E: ndarray[Any, Any]
-reveal_type(np.partition(c, 0, axis=None))  # E: ndarray[Any, Any]
-reveal_type(np.partition(A, 0))  # E: ndarray[Any, Any]
-reveal_type(np.partition(B, 0))  # E: ndarray[Any, Any]
-
-reveal_type(np.argpartition(a, 0))  # E: Any
-reveal_type(np.argpartition(b, 0))  # E: Any
-reveal_type(np.argpartition(c, 0))  # E: Any
-reveal_type(np.argpartition(A, 0))  # E: Any
-reveal_type(np.argpartition(B, 0))  # E: Any
-
-reveal_type(np.sort(A, 0))  # E: ndarray[Any, Any]
-reveal_type(np.sort(B, 0))  # E: ndarray[Any, Any]
-
-reveal_type(np.argsort(A, 0))  # E: ndarray[Any, Any]
-reveal_type(np.argsort(B, 0))  # E: ndarray[Any, Any]
-
-reveal_type(np.argmax(A))  # E: {intp}
-reveal_type(np.argmax(B))  # E: {intp}
-reveal_type(np.argmax(A, axis=0))  # E: Any
-reveal_type(np.argmax(B, axis=0))  # E: Any
-
-reveal_type(np.argmin(A))  # E: {intp}
-reveal_type(np.argmin(B))  # E: {intp}
-reveal_type(np.argmin(A, axis=0))  # E: Any
-reveal_type(np.argmin(B, axis=0))  # E: Any
-
-reveal_type(np.searchsorted(A[0], 0))  # E: {intp}
-reveal_type(np.searchsorted(B[0], 0))  # E: {intp}
-reveal_type(np.searchsorted(A[0], [0]))  # E: ndarray[Any, Any]
-reveal_type(np.searchsorted(B[0], [0]))  # E: ndarray[Any, Any]
-
-reveal_type(np.resize(a, (5, 5)))  # E: ndarray[Any, Any]
-reveal_type(np.resize(b, (5, 5)))  # E: ndarray[Any, Any]
-reveal_type(np.resize(c, (5, 5)))  # E: ndarray[Any, Any]
-reveal_type(np.resize(A, (5, 5)))  # E: ndarray[Any, Any]
-reveal_type(np.resize(B, (5, 5)))  # E: ndarray[Any, Any]
-
-reveal_type(np.squeeze(a))  # E: bool_
-reveal_type(np.squeeze(b))  # E: {float32}
-reveal_type(np.squeeze(c))  # E: ndarray[Any, Any]
-reveal_type(np.squeeze(A))  # E: ndarray[Any, Any]
-reveal_type(np.squeeze(B))  # E: ndarray[Any, Any]
-
-reveal_type(np.diagonal(A))  # E: ndarray[Any, Any]
-reveal_type(np.diagonal(B))  # E: ndarray[Any, Any]
-
-reveal_type(np.trace(A))  # E: Any
-reveal_type(np.trace(B))  # E: Any
-
-reveal_type(np.ravel(a))  # E: ndarray[Any, Any]
-reveal_type(np.ravel(b))  # E: ndarray[Any, Any]
-reveal_type(np.ravel(c))  # E: ndarray[Any, Any]
-reveal_type(np.ravel(A))  # E: ndarray[Any, Any]
-reveal_type(np.ravel(B))  # E: ndarray[Any, Any]
-
-reveal_type(np.nonzero(a))  # E: tuple[ndarray[Any, Any], ...]
-reveal_type(np.nonzero(b))  # E: tuple[ndarray[Any, Any], ...]
-reveal_type(np.nonzero(c))  # E: tuple[ndarray[Any, Any], ...]
-reveal_type(np.nonzero(A))  # E: tuple[ndarray[Any, Any], ...]
-reveal_type(np.nonzero(B))  # E: tuple[ndarray[Any, Any], ...]
-
-reveal_type(np.shape(a))  # E: tuple[builtins.int, ...]
  reveal_type(np.shape(b))  # E: tuple[builtins.int, ...]
-reveal_type(np.shape(c))  # E: tuple[builtins.int, ...]
-reveal_type(np.shape(A))  # E: tuple[builtins.int, ...]
-reveal_type(np.shape(B))  # E: tuple[builtins.int, ...]
-
-reveal_type(np.compress([True], a))  # E: ndarray[Any, Any]
-reveal_type(np.compress([True], b))  # E: ndarray[Any, Any]
-reveal_type(np.compress([True], c))  # E: ndarray[Any, Any]
-reveal_type(np.compress([True], A))  # E: ndarray[Any, Any]
-reveal_type(np.compress([True], B))  # E: ndarray[Any, Any]
-
-reveal_type(np.clip(a, 0, 1.0))  # E: Any
-reveal_type(np.clip(b, -1, 1))  # E: Any
-reveal_type(np.clip(c, 0, 1))  # E: Any
-reveal_type(np.clip(A, 0, 1))  # E: Any
-reveal_type(np.clip(B, 0, 1))  # E: Any
-
-reveal_type(np.sum(a))  # E: Any
-reveal_type(np.sum(b))  # E: Any
-reveal_type(np.sum(c))  # E: Any
-reveal_type(np.sum(A))  # E: Any
-reveal_type(np.sum(B))  # E: Any
-reveal_type(np.sum(A, axis=0))  # E: Any
-reveal_type(np.sum(B, axis=0))  # E: Any
-
-reveal_type(np.all(a))  # E: bool_
+reveal_type(np.shape(f4))  # E: tuple[builtins.int, ...]
+reveal_type(np.shape(f))  # E: tuple[builtins.int, ...]
+reveal_type(np.shape(AR_b))  # E: tuple[builtins.int, ...]
+reveal_type(np.shape(AR_f4))  # E: tuple[builtins.int, ...]
+
+reveal_type(np.compress([True], b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.compress([True], f4))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.compress([True], f))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.compress([True], AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.compress([True], AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+
+reveal_type(np.clip(b, 0, 1.0))  # E: bool_
+reveal_type(np.clip(f4, -1, 1))  # E: {float32}
+reveal_type(np.clip(f, 0, 1))  # E: Any
+reveal_type(np.clip(AR_b, 0, 1))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.clip(AR_f4, 0, 1))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.clip([0], 0, 1))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.clip(AR_b, 0, 1, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.sum(b))  # E: bool_
+reveal_type(np.sum(f4))  # E: {float32}
+reveal_type(np.sum(f))  # E: Any
+reveal_type(np.sum(AR_b))  # E: bool_
+reveal_type(np.sum(AR_f4))  # E: {float32}
+reveal_type(np.sum(AR_b, axis=0))  # E: Any
+reveal_type(np.sum(AR_f4, axis=0))  # E: Any
+reveal_type(np.sum(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
  reveal_type(np.all(b))  # E: bool_
-reveal_type(np.all(c))  # E: bool_
-reveal_type(np.all(A))  # E: bool_
-reveal_type(np.all(B))  # E: bool_
-reveal_type(np.all(A, axis=0))  # E: Any
-reveal_type(np.all(B, axis=0))  # E: Any
-reveal_type(np.all(A, keepdims=True))  # E: Any
-reveal_type(np.all(B, keepdims=True))  # E: Any
-
-reveal_type(np.any(a))  # E: bool_
+reveal_type(np.all(f4))  # E: bool_
+reveal_type(np.all(f))  # E: bool_
+reveal_type(np.all(AR_b))  # E: bool_
+reveal_type(np.all(AR_f4))  # E: bool_
+reveal_type(np.all(AR_b, axis=0))  # E: Any
+reveal_type(np.all(AR_f4, axis=0))  # E: Any
+reveal_type(np.all(AR_b, keepdims=True))  # E: Any
+reveal_type(np.all(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.all(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
  reveal_type(np.any(b))  # E: bool_
-reveal_type(np.any(c))  # E: bool_
-reveal_type(np.any(A))  # E: bool_
-reveal_type(np.any(B))  # E: bool_
-reveal_type(np.any(A, axis=0))  # E: Any
-reveal_type(np.any(B, axis=0))  # E: Any
-reveal_type(np.any(A, keepdims=True))  # E: Any
-reveal_type(np.any(B, keepdims=True))  # E: Any
-
-reveal_type(np.cumsum(a))  # E: ndarray[Any, Any]
-reveal_type(np.cumsum(b))  # E: ndarray[Any, Any]
-reveal_type(np.cumsum(c))  # E: ndarray[Any, Any]
-reveal_type(np.cumsum(A))  # E: ndarray[Any, Any]
-reveal_type(np.cumsum(B))  # E: ndarray[Any, Any]
-
-reveal_type(np.ptp(a))  # E: Any
-reveal_type(np.ptp(b))  # E: Any
-reveal_type(np.ptp(c))  # E: Any
-reveal_type(np.ptp(A))  # E: Any
-reveal_type(np.ptp(B))  # E: Any
-reveal_type(np.ptp(A, axis=0))  # E: Any
-reveal_type(np.ptp(B, axis=0))  # E: Any
-reveal_type(np.ptp(A, keepdims=True))  # E: Any
-reveal_type(np.ptp(B, keepdims=True))  # E: Any
-
-reveal_type(np.amax(a))  # E: Any
-reveal_type(np.amax(b))  # E: Any
-reveal_type(np.amax(c))  # E: Any
-reveal_type(np.amax(A))  # E: Any
-reveal_type(np.amax(B))  # E: Any
-reveal_type(np.amax(A, axis=0))  # E: Any
-reveal_type(np.amax(B, axis=0))  # E: Any
-reveal_type(np.amax(A, keepdims=True))  # E: Any
-reveal_type(np.amax(B, keepdims=True))  # E: Any
-
-reveal_type(np.amin(a))  # E: Any
-reveal_type(np.amin(b))  # E: Any
-reveal_type(np.amin(c))  # E: Any
-reveal_type(np.amin(A))  # E: Any
-reveal_type(np.amin(B))  # E: Any
-reveal_type(np.amin(A, axis=0))  # E: Any
-reveal_type(np.amin(B, axis=0))  # E: Any
-reveal_type(np.amin(A, keepdims=True))  # E: Any
-reveal_type(np.amin(B, keepdims=True))  # E: Any
-
-reveal_type(np.prod(a))  # E: Any
-reveal_type(np.prod(b))  # E: Any
-reveal_type(np.prod(c))  # E: Any
-reveal_type(np.prod(A))  # E: Any
-reveal_type(np.prod(B))  # E: Any
-reveal_type(np.prod(A, axis=0))  # E: Any
-reveal_type(np.prod(B, axis=0))  # E: Any
-reveal_type(np.prod(A, keepdims=True))  # E: Any
-reveal_type(np.prod(B, keepdims=True))  # E: Any
-reveal_type(np.prod(b, out=d))  # E: Any
-reveal_type(np.prod(B, out=d))  # E: Any
-
-reveal_type(np.cumprod(a))  # E: ndarray[Any, Any]
-reveal_type(np.cumprod(b))  # E: ndarray[Any, Any]
-reveal_type(np.cumprod(c))  # E: ndarray[Any, Any]
-reveal_type(np.cumprod(A))  # E: ndarray[Any, Any]
-reveal_type(np.cumprod(B))  # E: ndarray[Any, Any]
-
-reveal_type(np.ndim(a))  # E: int
+reveal_type(np.any(f4))  # E: bool_
+reveal_type(np.any(f))  # E: bool_
+reveal_type(np.any(AR_b))  # E: bool_
+reveal_type(np.any(AR_f4))  # E: bool_
+reveal_type(np.any(AR_b, axis=0))  # E: Any
+reveal_type(np.any(AR_f4, axis=0))  # E: Any
+reveal_type(np.any(AR_b, keepdims=True))  # E: Any
+reveal_type(np.any(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.any(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.cumsum(b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.cumsum(f4))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.cumsum(f))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.cumsum(AR_b))  # E: ndarray[Any, dtype[bool_]]
+reveal_type(np.cumsum(AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.cumsum(f, dtype=float))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.cumsum(f, dtype=np.float64))  # E: ndarray[Any, dtype[{float64}]]
+reveal_type(np.cumsum(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.ptp(b))  # E: bool_
+reveal_type(np.ptp(f4))  # E: {float32}
+reveal_type(np.ptp(f))  # E: Any
+reveal_type(np.ptp(AR_b))  # E: bool_
+reveal_type(np.ptp(AR_f4))  # E: {float32}
+reveal_type(np.ptp(AR_b, axis=0))  # E: Any
+reveal_type(np.ptp(AR_f4, axis=0))  # E: Any
+reveal_type(np.ptp(AR_b, keepdims=True))  # E: Any
+reveal_type(np.ptp(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.ptp(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.amax(b))  # E: bool_
+reveal_type(np.amax(f4))  # E: {float32}
+reveal_type(np.amax(f))  # E: Any
+reveal_type(np.amax(AR_b))  # E: bool_
+reveal_type(np.amax(AR_f4))  # E: {float32}
+reveal_type(np.amax(AR_b, axis=0))  # E: Any
+reveal_type(np.amax(AR_f4, axis=0))  # E: Any
+reveal_type(np.amax(AR_b, keepdims=True))  # E: Any
+reveal_type(np.amax(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.amax(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.amin(b))  # E: bool_
+reveal_type(np.amin(f4))  # E: {float32}
+reveal_type(np.amin(f))  # E: Any
+reveal_type(np.amin(AR_b))  # E: bool_
+reveal_type(np.amin(AR_f4))  # E: {float32}
+reveal_type(np.amin(AR_b, axis=0))  # E: Any
+reveal_type(np.amin(AR_f4, axis=0))  # E: Any
+reveal_type(np.amin(AR_b, keepdims=True))  # E: Any
+reveal_type(np.amin(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.amin(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.prod(AR_b))  # E: {int_}
+reveal_type(np.prod(AR_u8))  # E: {uint64}
+reveal_type(np.prod(AR_i8))  # E: {int64}
+reveal_type(np.prod(AR_f4))  # E: floating[Any]
+reveal_type(np.prod(AR_c16))  # E: complexfloating[Any, Any]
+reveal_type(np.prod(AR_O))  # E: Any
+reveal_type(np.prod(AR_f4, axis=0))  # E: Any
+reveal_type(np.prod(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.prod(AR_f4, dtype=np.float64))  # E: {float64}
+reveal_type(np.prod(AR_f4, dtype=float))  # E: Any
+reveal_type(np.prod(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.cumprod(AR_b))  # E: ndarray[Any, dtype[{int_}]]
+reveal_type(np.cumprod(AR_u8))  # E: ndarray[Any, dtype[{uint64}]]
+reveal_type(np.cumprod(AR_i8))  # E: ndarray[Any, dtype[{int64}]]
+reveal_type(np.cumprod(AR_f4))  # E: ndarray[Any, dtype[floating[Any]]]
+reveal_type(np.cumprod(AR_c16))  # E: ndarray[Any, dtype[complexfloating[Any, Any]]]
+reveal_type(np.cumprod(AR_O))  # E: ndarray[Any, dtype[object_]]
+reveal_type(np.cumprod(AR_f4, axis=0))  # E: ndarray[Any, dtype[floating[Any]]]
+reveal_type(np.cumprod(AR_f4, dtype=np.float64))  # E: ndarray[Any, dtype[{float64}]]
+reveal_type(np.cumprod(AR_f4, dtype=float))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.cumprod(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
  reveal_type(np.ndim(b))  # E: int
-reveal_type(np.ndim(c))  # E: int
-reveal_type(np.ndim(A))  # E: int
-reveal_type(np.ndim(B))  # E: int
+reveal_type(np.ndim(f4))  # E: int
+reveal_type(np.ndim(f))  # E: int
+reveal_type(np.ndim(AR_b))  # E: int
+reveal_type(np.ndim(AR_f4))  # E: int
  
-reveal_type(np.size(a))  # E: int
  reveal_type(np.size(b))  # E: int
-reveal_type(np.size(c))  # E: int
-reveal_type(np.size(A))  # E: int
-reveal_type(np.size(B))  # E: int
-
-reveal_type(np.around(a))  # E: Any
-reveal_type(np.around(b))  # E: Any
-reveal_type(np.around(c))  # E: Any
-reveal_type(np.around(A))  # E: Any
-reveal_type(np.around(B))  # E: Any
-
-reveal_type(np.mean(a))  # E: Any
-reveal_type(np.mean(b))  # E: Any
-reveal_type(np.mean(c))  # E: Any
-reveal_type(np.mean(A))  # E: Any
-reveal_type(np.mean(B))  # E: Any
-reveal_type(np.mean(A, axis=0))  # E: Any
-reveal_type(np.mean(B, axis=0))  # E: Any
-reveal_type(np.mean(A, keepdims=True))  # E: Any
-reveal_type(np.mean(B, keepdims=True))  # E: Any
-reveal_type(np.mean(b, out=d))  # E: Any
-reveal_type(np.mean(B, out=d))  # E: Any
-
-reveal_type(np.std(a))  # E: Any
-reveal_type(np.std(b))  # E: Any
-reveal_type(np.std(c))  # E: Any
-reveal_type(np.std(A))  # E: Any
-reveal_type(np.std(B))  # E: Any
-reveal_type(np.std(A, axis=0))  # E: Any
-reveal_type(np.std(B, axis=0))  # E: Any
-reveal_type(np.std(A, keepdims=True))  # E: Any
-reveal_type(np.std(B, keepdims=True))  # E: Any
-reveal_type(np.std(b, out=d))  # E: Any
-reveal_type(np.std(B, out=d))  # E: Any
-
-reveal_type(np.var(a))  # E: Any
-reveal_type(np.var(b))  # E: Any
-reveal_type(np.var(c))  # E: Any
-reveal_type(np.var(A))  # E: Any
-reveal_type(np.var(B))  # E: Any
-reveal_type(np.var(A, axis=0))  # E: Any
-reveal_type(np.var(B, axis=0))  # E: Any
-reveal_type(np.var(A, keepdims=True))  # E: Any
-reveal_type(np.var(B, keepdims=True))  # E: Any
-reveal_type(np.var(b, out=d))  # E: Any
-reveal_type(np.var(B, out=d))  # E: Any
+reveal_type(np.size(f4))  # E: int
+reveal_type(np.size(f))  # E: int
+reveal_type(np.size(AR_b))  # E: int
+reveal_type(np.size(AR_f4))  # E: int
+
+reveal_type(np.around(b))  # E: {float16}
+reveal_type(np.around(f))  # E: Any
+reveal_type(np.around(i8))  # E: {int64}
+reveal_type(np.around(f4))  # E: {float32}
+reveal_type(np.around(AR_b))  # E: ndarray[Any, dtype[{float16}]]
+reveal_type(np.around(AR_i8))  # E: ndarray[Any, dtype[{int64}]]
+reveal_type(np.around(AR_f4))  # E: ndarray[Any, dtype[{float32}]]
+reveal_type(np.around([1.5]))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.around(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.mean(AR_b))  # E: floating[Any]
+reveal_type(np.mean(AR_i8))  # E: floating[Any]
+reveal_type(np.mean(AR_f4))  # E: floating[Any]
+reveal_type(np.mean(AR_c16))  # E: complexfloating[Any, Any]
+reveal_type(np.mean(AR_O))  # E: Any
+reveal_type(np.mean(AR_f4, axis=0))  # E: Any
+reveal_type(np.mean(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.mean(AR_f4, dtype=float))  # E: Any
+reveal_type(np.mean(AR_f4, dtype=np.float64))  # E: {float64}
+reveal_type(np.mean(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.std(AR_b))  # E: floating[Any]
+reveal_type(np.std(AR_i8))  # E: floating[Any]
+reveal_type(np.std(AR_f4))  # E: floating[Any]
+reveal_type(np.std(AR_c16))  # E: floating[Any]
+reveal_type(np.std(AR_O))  # E: Any
+reveal_type(np.std(AR_f4, axis=0))  # E: Any
+reveal_type(np.std(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.std(AR_f4, dtype=float))  # E: Any
+reveal_type(np.std(AR_f4, dtype=np.float64))  # E: {float64}
+reveal_type(np.std(AR_f4, out=AR_subclass))  # E: NDArraySubclass
+
+reveal_type(np.var(AR_b))  # E: floating[Any]
+reveal_type(np.var(AR_i8))  # E: floating[Any]
+reveal_type(np.var(AR_f4))  # E: floating[Any]
+reveal_type(np.var(AR_c16))  # E: floating[Any]
+reveal_type(np.var(AR_O))  # E: Any
+reveal_type(np.var(AR_f4, axis=0))  # E: Any
+reveal_type(np.var(AR_f4, keepdims=True))  # E: Any
+reveal_type(np.var(AR_f4, dtype=float))  # E: Any
+reveal_type(np.var(AR_f4, dtype=np.float64))  # E: {float64}
+reveal_type(np.var(AR_f4, out=AR_subclass))  # E: NDArraySubclass
diff --git a/numpy/typing/tests/data/reveal/index_tricks.pyi b/numpy/typing/tests/data/reveal/index_tricks.pyi
index a41431025ced4e3d281d2b91fb62fecda42d93cb..707d6f3d42f9f64ade648175f1a5b4ff7766e9bb 100644
--- a/numpy/typing/tests/data/reveal/index_tricks.pyi
+++ b/numpy/typing/tests/data/reveal/index_tricks.pyi
@@ -1,10 +1,10 @@
-from typing import Any, List
+from typing import Any
  import numpy as np
  
-AR_LIKE_b: List[bool]
-AR_LIKE_i: List[int]
-AR_LIKE_f: List[float]
-AR_LIKE_U: List[str]
+AR_LIKE_b: list[bool]
+AR_LIKE_i: list[int]
+AR_LIKE_f: list[float]
+AR_LIKE_U: list[str]
  
  AR_i8: np.ndarray[Any, np.dtype[np.int64]]
  
diff --git a/numpy/typing/tests/data/reveal/lib_utils.pyi b/numpy/typing/tests/data/reveal/lib_utils.pyi
index d820127078a3c539e431e91c899a884c88bc5b35..9b1bf4123da7892e3b8c1cc5cced3a9d71ecdae8 100644
--- a/numpy/typing/tests/data/reveal/lib_utils.pyi
+++ b/numpy/typing/tests/data/reveal/lib_utils.pyi
@@ -1,10 +1,10 @@
  from io import StringIO
-from typing import Any, Dict
+from typing import Any
  
  import numpy as np
  
  AR: np.ndarray[Any, np.dtype[np.float64]]
-AR_DICT: Dict[str, np.ndarray[Any, np.dtype[np.float64]]]
+AR_DICT: dict[str, np.ndarray[Any, np.dtype[np.float64]]]
  FILE: StringIO
  
  def func(a: int) -> bool: ...
diff --git a/numpy/typing/tests/data/reveal/memmap.pyi b/numpy/typing/tests/data/reveal/memmap.pyi
index 86de8eb08e28b052d2b25e3060c3f2c99fc6c49c..af730749920ba9000f668ff9588a9e9d18423c23 100644
--- a/numpy/typing/tests/data/reveal/memmap.pyi
+++ b/numpy/typing/tests/data/reveal/memmap.pyi
@@ -14,3 +14,5 @@ reveal_type(np.memmap("file.txt", offset=5))  # E: memmap[Any, dtype[{uint8}]]
  reveal_type(np.memmap(b"file.txt", dtype=np.float64, shape=(10, 3)))  # E: memmap[Any, dtype[{float64}]]
  with open("file.txt", "rb") as f:
      reveal_type(np.memmap(f, dtype=float, order="K"))  # E: memmap[Any, dtype[Any]]
+
+reveal_type(memmap_obj.__array_finalize__(object()))  # E: None
diff --git a/numpy/typing/tests/data/reveal/multiarray.pyi b/numpy/typing/tests/data/reveal/multiarray.pyi
index 72fb887742b8d95512ba6342728ea4d59afd29e9..27a54f50d6e7d49aacf9d2b8d57594fc92e8a397 100644
--- a/numpy/typing/tests/data/reveal/multiarray.pyi
+++ b/numpy/typing/tests/data/reveal/multiarray.pyi
@@ -1,5 +1,5 @@
  import datetime as dt
-from typing import Any, List, TypeVar
+from typing import Any, TypeVar
  from pathlib import Path
  
  import numpy as np
@@ -17,8 +17,8 @@ AR_u1: npt.NDArray[np.uint8]
  AR_m: npt.NDArray[np.timedelta64]
  AR_M: npt.NDArray[np.datetime64]
  
-AR_LIKE_f: List[float]
-AR_LIKE_i: List[int]
+AR_LIKE_f: list[float]
+AR_LIKE_i: list[int]
  
  m: np.timedelta64
  M: np.datetime64
diff --git a/numpy/typing/tests/data/reveal/nbit_base_example.pyi b/numpy/typing/tests/data/reveal/nbit_base_example.pyi
index d34f6f69a31de4c5ecb6ce46be87bbaf85230fe2..a7cc681947eab723068650d78270d57425ce2102 100644
--- a/numpy/typing/tests/data/reveal/nbit_base_example.pyi
+++ b/numpy/typing/tests/data/reveal/nbit_base_example.pyi
@@ -1,11 +1,13 @@
-from typing import TypeVar, Union
+from __future__ import annotations
+
+from typing import TypeVar
  import numpy as np
  import numpy.typing as npt
  
  T1 = TypeVar("T1", bound=npt.NBitBase)
  T2 = TypeVar("T2", bound=npt.NBitBase)
  
-def add(a: np.floating[T1], b: np.integer[T2]) -> np.floating[Union[T1, T2]]:
+def add(a: np.floating[T1], b: np.integer[T2]) -> np.floating[T1 | T2]:
      return a + b
  
  i8: np.int64
diff --git a/numpy/typing/tests/data/reveal/ndarray_misc.pyi b/numpy/typing/tests/data/reveal/ndarray_misc.pyi
index da4da3dcfcb43486c46727414b54473bd9cdb22d..779d0909b75275f497c725651ed7cfc004182a40 100644
--- a/numpy/typing/tests/data/reveal/ndarray_misc.pyi
+++ b/numpy/typing/tests/data/reveal/ndarray_misc.pyi
@@ -11,7 +11,7 @@ import ctypes as ct
  from typing import Any
  
  import numpy as np
-from numpy.typing import NDArray
+from numpy._typing import NDArray
  
  class SubClass(NDArray[np.object_]): ...
  
@@ -204,6 +204,8 @@ reveal_type(AR_V[AR_i8])  # E: Any
  reveal_type(AR_V[AR_i8, AR_i8])  # E: Any
  reveal_type(AR_V[AR_i8, None])  # E: ndarray[Any, dtype[void]]
  reveal_type(AR_V[0, ...])  # E: ndarray[Any, dtype[void]]
+reveal_type(AR_V[[0]])  # E: ndarray[Any, dtype[void]]
+reveal_type(AR_V[[0], [0]])  # E: ndarray[Any, dtype[void]]
  reveal_type(AR_V[:])  # E: ndarray[Any, dtype[void]]
  reveal_type(AR_V["a"])  # E: ndarray[Any, dtype[Any]]
  reveal_type(AR_V[["a", "b"]])  # E: ndarray[Any, dtype[void]]
@@ -212,3 +214,7 @@ reveal_type(AR_f8.dump("test_file"))  # E: None
  reveal_type(AR_f8.dump(b"test_file"))  # E: None
  with open("test_file", "wb") as f:
      reveal_type(AR_f8.dump(f))  # E: None
+
+reveal_type(AR_f8.__array_finalize__(None))  # E: None
+reveal_type(AR_f8.__array_finalize__(B))  # E: None
+reveal_type(AR_f8.__array_finalize__(AR_f8))  # E: None
diff --git a/numpy/typing/tests/data/reveal/nested_sequence.pyi b/numpy/typing/tests/data/reveal/nested_sequence.pyi
index 4d3aad467efc8bac5ae076e293a3891ec1151489..286f75ac5c441a144d42e19a6b8b2705427b4174 100644
--- a/numpy/typing/tests/data/reveal/nested_sequence.pyi
+++ b/numpy/typing/tests/data/reveal/nested_sequence.pyi
@@ -1,16 +1,18 @@
-from typing import Sequence, Tuple, List, Any
-import numpy.typing as npt
+from collections.abc import Sequence
+from typing import Any
+
+from numpy._typing import _NestedSequence
  
  a: Sequence[int]
  b: Sequence[Sequence[int]]
  c: Sequence[Sequence[Sequence[int]]]
  d: Sequence[Sequence[Sequence[Sequence[int]]]]
  e: Sequence[bool]
-f: Tuple[int, ...]
-g: List[int]
+f: tuple[int, ...]
+g: list[int]
  h: Sequence[Any]
  
-def func(a: npt._NestedSequence[int]) -> None:
+def func(a: _NestedSequence[int]) -> None:
      ...
  
  reveal_type(func(a))  # E: None
diff --git a/numpy/typing/tests/data/reveal/npyio.pyi b/numpy/typing/tests/data/reveal/npyio.pyi
index 637bdb6619fda7a6518d37ab74c33311323f554d..68605cf9460196864c81e628b93dd7392610e19e 100644
--- a/numpy/typing/tests/data/reveal/npyio.pyi
+++ b/numpy/typing/tests/data/reveal/npyio.pyi
@@ -1,6 +1,6 @@
  import re
  import pathlib
-from typing import IO, List
+from typing import IO
  
  import numpy.typing as npt
  import numpy as np
@@ -14,7 +14,7 @@ bag_obj: np.lib.npyio.BagObj[int]
  npz_file: np.lib.npyio.NpzFile
  
  AR_i8: npt.NDArray[np.int64]
-AR_LIKE_f8: List[float]
+AR_LIKE_f8: list[float]
  
  class BytesWriter:
      def write(self, data: bytes) -> None: ...
@@ -77,7 +77,7 @@ reveal_type(np.fromregex(bytes_reader, "test", np.float64))  # E: ndarray[Any, d
  
  reveal_type(np.genfromtxt(bytes_file))  # E: ndarray[Any, dtype[{float64}]]
  reveal_type(np.genfromtxt(pathlib_path, dtype=np.str_))  # E: ndarray[Any, dtype[str_]]
-reveal_type(np.genfromtxt(str_path, dtype=str, skiprows=2))  # E: ndarray[Any, dtype[Any]]
+reveal_type(np.genfromtxt(str_path, dtype=str, skip_header=2))  # E: ndarray[Any, dtype[Any]]
  reveal_type(np.genfromtxt(str_file, comments="test"))  # E: ndarray[Any, dtype[{float64}]]
  reveal_type(np.genfromtxt(str_path, delimiter="\n"))  # E: ndarray[Any, dtype[{float64}]]
  reveal_type(np.genfromtxt(str_path, ndmin=2))  # E: ndarray[Any, dtype[{float64}]]
diff --git a/numpy/typing/tests/data/reveal/numeric.pyi b/numpy/typing/tests/data/reveal/numeric.pyi

index e9a884c7c339cfacd286c14a54fc327af11d971c..b8fe15d3a08a7128a7907f339996b49f59cdbedf 100644 (file)
--- a/numpy/typing/tests/data/reveal/numeric.pyi
+++ b/numpy/typing/tests/data/reveal/numeric.pyi
@@ -5,7 +5,6 @@ Does not include tests which fall under ``array_constructors``.
  
  """
  
-from typing import List
  import numpy as np
  import numpy.typing as npt
  
@@ -22,7 +21,7 @@ AR_c16: npt.NDArray[np.complex128]
  AR_m: npt.NDArray[np.timedelta64]
  AR_O: npt.NDArray[np.object_]
  
-B: List[int]
+B: list[int]
  C: SubClass
  
  reveal_type(np.count_nonzero(i8))  # E: int
diff --git a/numpy/typing/tests/data/reveal/numerictypes.pyi b/numpy/typing/tests/data/reveal/numerictypes.pyi

index cc2335264113f42589ca0b05b412a9f00ca694a4..e1857557d90a28839095751b7a19d26d8ef58274 100644 (file)
--- a/numpy/typing/tests/data/reveal/numerictypes.pyi
+++ b/numpy/typing/tests/data/reveal/numerictypes.pyi
@@ -33,9 +33,9 @@ reveal_type(np.nbytes[np.int64])  # E: int
  
  reveal_type(np.ScalarType)  # E: Tuple
  reveal_type(np.ScalarType[0])  # E: Type[builtins.int]
-reveal_type(np.ScalarType[4])  # E: Type[builtins.bool]
-reveal_type(np.ScalarType[9])  # E: Type[{csingle}]
-reveal_type(np.ScalarType[11])  # E: Type[{clongdouble}]
+reveal_type(np.ScalarType[3])  # E: Type[builtins.bool]
+reveal_type(np.ScalarType[8])  # E: Type[{csingle}]
+reveal_type(np.ScalarType[10])  # E: Type[{clongdouble}]
  
  reveal_type(np.typecodes["Character"])  # E: Literal['c']
  reveal_type(np.typecodes["Complex"])  # E: Literal['FDG']
diff --git a/numpy/typing/tests/data/reveal/random.pyi b/numpy/typing/tests/data/reveal/random.pyi

index 4e06aa7d5bd7724372c626c6109176d5f0a9786b..edea6a2911e4991febd55e9a7dda08d217e29018 100644 (file)
--- a/numpy/typing/tests/data/reveal/random.pyi
+++ b/numpy/typing/tests/data/reveal/random.pyi
@@ -1,6 +1,6 @@
  from __future__ import annotations
  
-from typing import Any, List
+from typing import Any
  
  import numpy as np
  
@@ -79,13 +79,13 @@ D_arr_0p9: np.ndarray[Any, np.dtype[np.float64]] = np.array([0.9])
  D_arr_1p5: np.ndarray[Any, np.dtype[np.float64]] = np.array([1.5])
  I_arr_10: np.ndarray[Any, np.dtype[np.int_]] = np.array([10], dtype=np.int_)
  I_arr_20: np.ndarray[Any, np.dtype[np.int_]] = np.array([20], dtype=np.int_)
-D_arr_like_0p1: List[float] = [0.1]
-D_arr_like_0p5: List[float] = [0.5]
-D_arr_like_0p9: List[float] = [0.9]
-D_arr_like_1p5: List[float] = [1.5]
-I_arr_like_10: List[int] = [10]
-I_arr_like_20: List[int] = [20]
-D_2D_like: List[List[float]] = [[1, 2], [2, 3], [3, 4], [4, 5.1]]
+D_arr_like_0p1: list[float] = [0.1]
+D_arr_like_0p5: list[float] = [0.5]
+D_arr_like_0p9: list[float] = [0.9]
+D_arr_like_1p5: list[float] = [1.5]
+I_arr_like_10: list[int] = [10]
+I_arr_like_20: list[int] = [20]
+D_2D_like: list[list[float]] = [[1, 2], [2, 3], [3, 4], [4, 5.1]]
  D_2D: np.ndarray[Any, np.dtype[np.float64]] = np.array(D_2D_like)
  S_out: np.ndarray[Any, np.dtype[np.float32]] = np.empty(1, dtype=np.float32)
  D_out: np.ndarray[Any, np.dtype[np.float64]] = np.empty(1)
@@ -501,7 +501,7 @@ reveal_type(def_gen.integers([100]))  # E: ndarray[Any, dtype[signedinteger[typi
  reveal_type(def_gen.integers(0, [100]))  # E: ndarray[Any, dtype[signedinteger[typing._64Bit]]]
  
  I_bool_low: np.ndarray[Any, np.dtype[np.bool_]] = np.array([0], dtype=np.bool_)
-I_bool_low_like: List[int] = [0]
+I_bool_low_like: list[int] = [0]
  I_bool_high_open: np.ndarray[Any, np.dtype[np.bool_]] = np.array([1], dtype=np.bool_)
  I_bool_high_closed: np.ndarray[Any, np.dtype[np.bool_]] = np.array([1], dtype=np.bool_)
  
@@ -530,7 +530,7 @@ reveal_type(def_gen.integers(I_bool_low, I_bool_high_closed, dtype=np.bool_, end
  reveal_type(def_gen.integers(0, I_bool_high_closed, dtype=np.bool_, endpoint=True))  # E: ndarray[Any, dtype[bool_]
  
  I_u1_low: np.ndarray[Any, np.dtype[np.uint8]] = np.array([0], dtype=np.uint8)
-I_u1_low_like: List[int] = [0]
+I_u1_low_like: list[int] = [0]
  I_u1_high_open: np.ndarray[Any, np.dtype[np.uint8]] = np.array([255], dtype=np.uint8)
  I_u1_high_closed: np.ndarray[Any, np.dtype[np.uint8]] = np.array([255], dtype=np.uint8)
  
@@ -571,7 +571,7 @@ reveal_type(def_gen.integers(I_u1_low, I_u1_high_closed, dtype=np.uint8, endpoin
  reveal_type(def_gen.integers(0, I_u1_high_closed, dtype=np.uint8, endpoint=True))  # E: ndarray[Any, dtype[unsignedinteger[typing._8Bit]]]
  
  I_u2_low: np.ndarray[Any, np.dtype[np.uint16]] = np.array([0], dtype=np.uint16)
-I_u2_low_like: List[int] = [0]
+I_u2_low_like: list[int] = [0]
  I_u2_high_open: np.ndarray[Any, np.dtype[np.uint16]] = np.array([65535], dtype=np.uint16)
  I_u2_high_closed: np.ndarray[Any, np.dtype[np.uint16]] = np.array([65535], dtype=np.uint16)
  
@@ -612,7 +612,7 @@ reveal_type(def_gen.integers(I_u2_low, I_u2_high_closed, dtype=np.uint16, endpoi
  reveal_type(def_gen.integers(0, I_u2_high_closed, dtype=np.uint16, endpoint=True))  # E: ndarray[Any, dtype[unsignedinteger[typing._16Bit]]]
  
  I_u4_low: np.ndarray[Any, np.dtype[np.uint32]] = np.array([0], dtype=np.uint32)
-I_u4_low_like: List[int] = [0]
+I_u4_low_like: list[int] = [0]
  I_u4_high_open: np.ndarray[Any, np.dtype[np.uint32]] = np.array([4294967295], dtype=np.uint32)
  I_u4_high_closed: np.ndarray[Any, np.dtype[np.uint32]] = np.array([4294967295], dtype=np.uint32)
  
@@ -678,7 +678,7 @@ reveal_type(def_gen.integers(I_u4_low, I_u4_high_closed, dtype=np.uint, endpoint
  reveal_type(def_gen.integers(0, I_u4_high_closed, dtype=np.uint, endpoint=True))  # E: ndarray[Any, dtype[{uint}]]
  
  I_u8_low: np.ndarray[Any, np.dtype[np.uint64]] = np.array([0], dtype=np.uint64)
-I_u8_low_like: List[int] = [0]
+I_u8_low_like: list[int] = [0]
  I_u8_high_open: np.ndarray[Any, np.dtype[np.uint64]] = np.array([18446744073709551615], dtype=np.uint64)
  I_u8_high_closed: np.ndarray[Any, np.dtype[np.uint64]] = np.array([18446744073709551615], dtype=np.uint64)
  
@@ -719,7 +719,7 @@ reveal_type(def_gen.integers(I_u8_low, I_u8_high_closed, dtype=np.uint64, endpoi
  reveal_type(def_gen.integers(0, I_u8_high_closed, dtype=np.uint64, endpoint=True))  # E: ndarray[Any, dtype[unsignedinteger[typing._64Bit]]]
  
  I_i1_low: np.ndarray[Any, np.dtype[np.int8]] = np.array([-128], dtype=np.int8)
-I_i1_low_like: List[int] = [-128]
+I_i1_low_like: list[int] = [-128]
  I_i1_high_open: np.ndarray[Any, np.dtype[np.int8]] = np.array([127], dtype=np.int8)
  I_i1_high_closed: np.ndarray[Any, np.dtype[np.int8]] = np.array([127], dtype=np.int8)
  
@@ -760,7 +760,7 @@ reveal_type(def_gen.integers(I_i1_low, I_i1_high_closed, dtype=np.int8, endpoint
  reveal_type(def_gen.integers(-128, I_i1_high_closed, dtype=np.int8, endpoint=True))  # E: ndarray[Any, dtype[signedinteger[typing._8Bit]]]
  
  I_i2_low: np.ndarray[Any, np.dtype[np.int16]] = np.array([-32768], dtype=np.int16)
-I_i2_low_like: List[int] = [-32768]
+I_i2_low_like: list[int] = [-32768]
  I_i2_high_open: np.ndarray[Any, np.dtype[np.int16]] = np.array([32767], dtype=np.int16)
  I_i2_high_closed: np.ndarray[Any, np.dtype[np.int16]] = np.array([32767], dtype=np.int16)
  
@@ -801,7 +801,7 @@ reveal_type(def_gen.integers(I_i2_low, I_i2_high_closed, dtype=np.int16, endpoin
  reveal_type(def_gen.integers(-32768, I_i2_high_closed, dtype=np.int16, endpoint=True))  # E: ndarray[Any, dtype[signedinteger[typing._16Bit]]]
  
  I_i4_low: np.ndarray[Any, np.dtype[np.int32]] = np.array([-2147483648], dtype=np.int32)
-I_i4_low_like: List[int] = [-2147483648]
+I_i4_low_like: list[int] = [-2147483648]
  I_i4_high_open: np.ndarray[Any, np.dtype[np.int32]] = np.array([2147483647], dtype=np.int32)
  I_i4_high_closed: np.ndarray[Any, np.dtype[np.int32]] = np.array([2147483647], dtype=np.int32)
  
@@ -842,7 +842,7 @@ reveal_type(def_gen.integers(I_i4_low, I_i4_high_closed, dtype=np.int32, endpoin
  reveal_type(def_gen.integers(-2147483648, I_i4_high_closed, dtype=np.int32, endpoint=True))  # E: ndarray[Any, dtype[signedinteger[typing._32Bit]]]
  
  I_i8_low: np.ndarray[Any, np.dtype[np.int64]] = np.array([-9223372036854775808], dtype=np.int64)
-I_i8_low_like: List[int] = [-9223372036854775808]
+I_i8_low_like: list[int] = [-9223372036854775808]
  I_i8_high_open: np.ndarray[Any, np.dtype[np.int64]] = np.array([9223372036854775807], dtype=np.int64)
  I_i8_high_closed: np.ndarray[Any, np.dtype[np.int64]] = np.array([9223372036854775807], dtype=np.int64)
  
diff --git a/numpy/typing/tests/data/reveal/rec.pyi b/numpy/typing/tests/data/reveal/rec.pyi

index 9921621f1fd9aa4bbe66628460294dc452ef1bd7..8ea4a6ee8d9c35deec679f58a142c6418e6e53bb 100644 (file)
--- a/numpy/typing/tests/data/reveal/rec.pyi
+++ b/numpy/typing/tests/data/reveal/rec.pyi
@@ -1,12 +1,12 @@
  import io
-from typing import Any, List
+from typing import Any
  
  import numpy as np
  import numpy.typing as npt
  
  AR_i8: npt.NDArray[np.int64]
  REC_AR_V: np.recarray[Any, np.dtype[np.record]]
-AR_LIST: List[npt.NDArray[np.int64]]
+AR_LIST: list[npt.NDArray[np.int64]]
  
  format_parser: np.format_parser
  record: np.record
@@ -33,6 +33,7 @@ reveal_type(REC_AR_V.field(0, AR_i8))  # E: None
  reveal_type(REC_AR_V.field("field_a", AR_i8))  # E: None
  reveal_type(REC_AR_V["field_a"])  # E: Any
  reveal_type(REC_AR_V.field_a)  # E: Any
+reveal_type(REC_AR_V.__array_finalize__(object()))  # E: None
  
  reveal_type(np.recarray(  # recarray[Any, dtype[record]]
      shape=(10, 5),
diff --git a/numpy/typing/tests/data/reveal/shape_base.pyi b/numpy/typing/tests/data/reveal/shape_base.pyi

index f13678c3af3b1003388b6524ef7995e0a619722d..b907a432803909523bada36b2c33d8fde7bf4e08 100644 (file)
--- a/numpy/typing/tests/data/reveal/shape_base.pyi
+++ b/numpy/typing/tests/data/reveal/shape_base.pyi
@@ -1,6 +1,6 @@
  import numpy as np
-from numpy.typing import NDArray
-from typing import Any, List
+from numpy._typing import NDArray
+from typing import Any
  
  i8: np.int64
  f8: np.float64
@@ -9,7 +9,7 @@ AR_b: NDArray[np.bool_]
  AR_i8: NDArray[np.int64]
  AR_f8: NDArray[np.float64]
  
-AR_LIKE_f8: List[float]
+AR_LIKE_f8: list[float]
  
  reveal_type(np.take_along_axis(AR_f8, AR_i8, axis=1))  # E: ndarray[Any, dtype[{float64}]]
  reveal_type(np.take_along_axis(f8, AR_i8, axis=None))  # E: ndarray[Any, dtype[{float64}]]
diff --git a/numpy/typing/tests/data/reveal/stride_tricks.pyi b/numpy/typing/tests/data/reveal/stride_tricks.pyi

index 207cb34088e61b8cfee52a9f01d05976fb592aa1..17769dc4bb3954773eac23156b23cd70ffb40188 100644 (file)
--- a/numpy/typing/tests/data/reveal/stride_tricks.pyi
+++ b/numpy/typing/tests/data/reveal/stride_tricks.pyi
@@ -1,10 +1,10 @@
-from typing import List, Dict, Any
+from typing import Any
  import numpy as np
  import numpy.typing as npt
  
  AR_f8: npt.NDArray[np.float64]
-AR_LIKE_f: List[float]
-interface_dict: Dict[str, Any]
+AR_LIKE_f: list[float]
+interface_dict: dict[str, Any]
  
  reveal_type(np.lib.stride_tricks.DummyArray(interface_dict))  # E: lib.stride_tricks.DummyArray
  
diff --git a/numpy/typing/tests/data/reveal/testing.pyi b/numpy/typing/tests/data/reveal/testing.pyi

index 47cb1d04740610c14d17d5bf5fb2c04618854f32..5440af80065eb39725c5a1ae4600aae1adad03eb 100644 (file)
--- a/numpy/typing/tests/data/reveal/testing.pyi
+++ b/numpy/typing/tests/data/reveal/testing.pyi
@@ -2,7 +2,8 @@ from __future__ import annotations
  
  import re
  import sys
-from typing import Any, Callable, TypeVar
+from collections.abc import Callable
+from typing import Any, TypeVar
  from pathlib import Path
  
  import numpy as np
@@ -153,8 +154,12 @@ reveal_type(np.testing.assert_array_max_ulp(AR_i8, AR_f8, dtype=np.float32))  #
  reveal_type(np.testing.assert_warns(RuntimeWarning))  # E: _GeneratorContextManager[None]
  reveal_type(np.testing.assert_warns(RuntimeWarning, func3, 5))  # E: bool
  
+def func4(a: int, b: str) -> bool: ...
+
  reveal_type(np.testing.assert_no_warnings())  # E: _GeneratorContextManager[None]
  reveal_type(np.testing.assert_no_warnings(func3, 5))  # E: bool
+reveal_type(np.testing.assert_no_warnings(func4, a=1, b="test"))  # E: bool
+reveal_type(np.testing.assert_no_warnings(func4, 1, "test"))  # E: bool
  
  reveal_type(np.testing.tempdir("test_dir"))  # E: _GeneratorContextManager[builtins.str]
  reveal_type(np.testing.tempdir(prefix=b"test"))  # E: _GeneratorContextManager[builtins.bytes]
diff --git a/numpy/typing/tests/data/reveal/twodim_base.pyi b/numpy/typing/tests/data/reveal/twodim_base.pyi

index 0318c3cf18a5e8a6f8ec3674524843581aa732e9..0dc58d43786cb2a5fc104c776edfa9f0710f5223 100644 (file)
--- a/numpy/typing/tests/data/reveal/twodim_base.pyi
+++ b/numpy/typing/tests/data/reveal/twodim_base.pyi
@@ -1,4 +1,4 @@
-from typing import Any, List, TypeVar
+from typing import Any, TypeVar
  
  import numpy as np
  import numpy.typing as npt
@@ -21,7 +21,7 @@ AR_f: npt.NDArray[np.float64]
  AR_c: npt.NDArray[np.complex128]
  AR_O: npt.NDArray[np.object_]
  
-AR_LIKE_b: List[bool]
+AR_LIKE_b: list[bool]
  
  reveal_type(np.fliplr(AR_b))  # E: ndarray[Any, dtype[bool_]]
  reveal_type(np.fliplr(AR_LIKE_b))  # E: ndarray[Any, dtype[Any]]
diff --git a/numpy/typing/tests/data/reveal/type_check.pyi b/numpy/typing/tests/data/reveal/type_check.pyi

index 13d41d8441288740b6981574d6ec5fa61ed7e720..ddd319a94adfc53af90c9929512d23766cb6a804 100644 (file)
--- a/numpy/typing/tests/data/reveal/type_check.pyi
+++ b/numpy/typing/tests/data/reveal/type_check.pyi
@@ -1,6 +1,6 @@
-from typing import List
  import numpy as np
  import numpy.typing as npt
+from numpy._typing import _128Bit
  
  f8: np.float64
  f: float
@@ -10,11 +10,11 @@ AR_i8: npt.NDArray[np.int64]
  AR_i4: npt.NDArray[np.int32]
  AR_f2: npt.NDArray[np.float16]
  AR_f8: npt.NDArray[np.float64]
-AR_f16: npt.NDArray[np.floating[npt._128Bit]]
+AR_f16: npt.NDArray[np.floating[_128Bit]]
  AR_c8: npt.NDArray[np.complex64]
  AR_c16: npt.NDArray[np.complex128]
  
-AR_LIKE_f: List[float]
+AR_LIKE_f: list[float]
  
  class RealObj:
      real: slice
diff --git a/numpy/typing/tests/data/reveal/ufunclike.pyi b/numpy/typing/tests/data/reveal/ufunclike.pyi

index 2d67c923fe8d96af7fd77daeb85245dfeeeb03d9..9f06600b6420e3da146a1b374114d1d9a9b33087 100644 (file)
--- a/numpy/typing/tests/data/reveal/ufunclike.pyi
+++ b/numpy/typing/tests/data/reveal/ufunclike.pyi
@@ -1,11 +1,11 @@
-from typing import List, Any
+from typing import Any
  import numpy as np
  
-AR_LIKE_b: List[bool]
-AR_LIKE_u: List[np.uint32]
-AR_LIKE_i: List[int]
-AR_LIKE_f: List[float]
-AR_LIKE_O: List[np.object_]
+AR_LIKE_b: list[bool]
+AR_LIKE_u: list[np.uint32]
+AR_LIKE_i: list[int]
+AR_LIKE_f: list[float]
+AR_LIKE_O: list[np.object_]
  
  AR_U: np.ndarray[Any, np.dtype[np.str_]]
  
diff --git a/numpy/typing/tests/data/reveal/warnings_and_errors.pyi b/numpy/typing/tests/data/reveal/warnings_and_errors.pyi

index d5c50448ae6cbdb2f376baefb27ff94c694ba71b..19fa432f91a464364c1765ce1f3c588dc9bf2127 100644 (file)
--- a/numpy/typing/tests/data/reveal/warnings_and_errors.pyi
+++ b/numpy/typing/tests/data/reveal/warnings_and_errors.pyi
@@ -1,5 +1,3 @@
-from typing import Type
-
  import numpy as np
  
  reveal_type(np.ModuleDeprecationWarning())  # E: ModuleDeprecationWarning
diff --git a/numpy/typing/tests/test_generic_alias.py b/numpy/typing/tests/test_generic_alias.py

index 39343420bdc53523515bbad400d3b083e74f3356..093e12109da6053b109f91e04c1151ca700c062b 100644 (file)
--- a/numpy/typing/tests/test_generic_alias.py
+++ b/numpy/typing/tests/test_generic_alias.py
@@ -5,11 +5,12 @@ import copy
  import types
  import pickle
  import weakref
-from typing import TypeVar, Any, Callable, Tuple, Type, Union
+from typing import TypeVar, Any, Union, Callable
  
  import pytest
  import numpy as np
-from numpy.typing._generic_alias import _GenericAlias
+from numpy._typing._generic_alias import _GenericAlias
+from typing_extensions import Unpack
  
  ScalarType = TypeVar("ScalarType", bound=np.generic, covariant=True)
  T1 = TypeVar("T1")
@@ -17,28 +18,32 @@ T2 = TypeVar("T2")
  DType = _GenericAlias(np.dtype, (ScalarType,))
  NDArray = _GenericAlias(np.ndarray, (Any, DType))
  
+# NOTE: The `npt._GenericAlias` *class* isn't quite stable on Python >=3.11.
+# This is not a problem at runtime (as it's 3.8-exclusive), but we still
+# need it on >=3.9 in order to verify that its semantics match those of its
+# `types.GenericAlias` replacement. xref numpy/numpy#21526
  if sys.version_info >= (3, 9):
      DType_ref = types.GenericAlias(np.dtype, (ScalarType,))
      NDArray_ref = types.GenericAlias(np.ndarray, (Any, DType_ref))
-    FuncType = Callable[[Union[_GenericAlias, types.GenericAlias]], Any]
+    FuncType = Callable[["_GenericAlias | types.GenericAlias"], Any]
  else:
      DType_ref = Any
      NDArray_ref = Any
-    FuncType = Callable[[_GenericAlias], Any]
+    FuncType = Callable[["_GenericAlias"], Any]
  
  GETATTR_NAMES = sorted(set(dir(np.ndarray)) - _GenericAlias._ATTR_EXCEPTIONS)
  
  BUFFER = np.array([1], dtype=np.int64)
  BUFFER.setflags(write=False)
  
-def _get_subclass_mro(base: type) -> Tuple[type, ...]:
+def _get_subclass_mro(base: type) -> tuple[type, ...]:
      class Subclass(base):  # type: ignore[misc,valid-type]
          pass
      return Subclass.__mro__[1:]
  
  
  class TestGenericAlias:
-    """Tests for `numpy.typing._generic_alias._GenericAlias`."""
+    """Tests for `numpy._typing._generic_alias._GenericAlias`."""
  
      @pytest.mark.parametrize("name,func", [
          ("__init__", lambda n: n),
@@ -51,8 +56,6 @@ class TestGenericAlias:
          ("__origin__", lambda n: n.__origin__),
          ("__args__", lambda n: n.__args__),
          ("__parameters__", lambda n: n.__parameters__),
-        ("__reduce__", lambda n: n.__reduce__()[1:]),
-        ("__reduce_ex__", lambda n: n.__reduce_ex__(1)[1:]),
          ("__mro_entries__", lambda n: n.__mro_entries__([object])),
          ("__hash__", lambda n: hash(n)),
          ("__repr__", lambda n: repr(n)),
@@ -62,7 +65,6 @@ class TestGenericAlias:
          ("__getitem__", lambda n: n[Union[T1, T2]][np.float32, np.float64]),
          ("__eq__", lambda n: n == n),
          ("__ne__", lambda n: n != np.ndarray),
-        ("__dir__", lambda n: dir(n)),
          ("__call__", lambda n: n((1,), np.int64, BUFFER)),
          ("__call__", lambda n: n(shape=(1,), dtype=np.int64, buffer=BUFFER)),
          ("subclassing", lambda n: _get_subclass_mro(n)),
@@ -96,6 +98,45 @@ class TestGenericAlias:
              value_ref = func(NDArray_ref)
              assert value == value_ref
  
+    def test_dir(self) -> None:
+        value = dir(NDArray)
+        if sys.version_info < (3, 9):
+            return
+
+        # A number of attributes only exist in `types.GenericAlias` in >= 3.11
+        if sys.version_info < (3, 11, 0, "beta", 3):
+            value.remove("__typing_unpacked_tuple_args__")
+        if sys.version_info < (3, 11, 0, "beta", 1):
+            value.remove("__unpacked__")
+        assert value == dir(NDArray_ref)
+
+    @pytest.mark.parametrize("name,func,dev_version", [
+        ("__iter__", lambda n: len(list(n)), ("beta", 1)),
+        ("__iter__", lambda n: next(iter(n)), ("beta", 1)),
+        ("__unpacked__", lambda n: n.__unpacked__, ("beta", 1)),
+        ("Unpack", lambda n: Unpack[n], ("beta", 1)),
+
+        # The right operand should now have `__unpacked__ = True`,
+        # and they are thus no longer equivalent
+        ("__ne__", lambda n: n != next(iter(n)), ("beta", 1)),
+
+        # >= beta3 stuff
+        ("__typing_unpacked_tuple_args__",
+         lambda n: n.__typing_unpacked_tuple_args__, ("beta", 3)),
+    ])
+    def test_py311_features(
+        self,
+        name: str,
+        func: FuncType,
+        dev_version: tuple[str, int],
+    ) -> None:
+        """Test Python 3.11 features."""
+        value = func(NDArray)
+
+        if sys.version_info >= (3, 11, 0, *dev_version):
+            value_ref = func(NDArray_ref)
+            assert value == value_ref
+
      def test_weakref(self) -> None:
          """Test ``__weakref__``."""
          value = weakref.ref(NDArray)()
@@ -132,7 +173,7 @@ class TestGenericAlias:
      def test_raise(
          self,
          name: str,
-        exc_type: Type[BaseException],
+        exc_type: type[BaseException],
          func: FuncType,
      ) -> None:
          """Test operations that are supposed to raise."""
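Note on `test_py311_features` above: the gate `sys.version_info >= (3, 11, 0, *dev_version)` relies on ordinary tuple comparison, where the release level strings happen to sort in release order ("alpha" < "beta" < "candidate" < "final"). A minimal sketch of that comparison follows. The function name `supports` is hypothetical, and plain tuples stand in for `sys.version_info` so the cases can be checked on any interpreter.

```python
import sys


def supports(version_info, dev_version):
    """Return True if version_info is at or past 3.11.0 at the given pre-release.

    `dev_version` is a (releaselevel, serial) pair such as ("beta", 1).
    """
    return tuple(version_info) >= (3, 11, 0, *dev_version)


if __name__ == "__main__":
    # Plain tuples in the same layout as sys.version_info.
    print(supports((3, 11, 0, "beta", 3), ("beta", 1)))
    print(supports((3, 11, 0, "alpha", 7), ("beta", 1)))
    print(supports((3, 10, 4, "final", 0), ("beta", 1)))
    print(supports(sys.version_info, ("beta", 1)))
```

Because the comparison is lexicographic, any 3.10.x release compares as older regardless of its release level, and a final 3.11.0 compares as newer than every beta.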
diff --git a/numpy/typing/tests/test_typing.py b/numpy/typing/tests/test_typing.py

index fe58a8f4c5e89382551e988f14099f08e9225417..5011339b56fb3f16b7206e0dc526486b5cdf7ba7 100644 (file)
--- a/numpy/typing/tests/test_typing.py
+++ b/numpy/typing/tests/test_typing.py
@@ -136,7 +136,7 @@ def test_fail(path: str) -> None:
      output_mypy = OUTPUT_MYPY
      assert path in output_mypy
      for error_line in output_mypy[path]:
-        error_line = _strip_filename(error_line)
+        error_line = _strip_filename(error_line).split("\n", 1)[0]
          match = re.match(
              r"(?P<lineno>\d+): (error|note): .+$",
              error_line,
@@ -228,41 +228,41 @@ def _construct_ctypes_dict() -> dict[str, str]:
  
  
  def _construct_format_dict() -> dict[str, str]:
-    dct = {k.split(".")[-1]: v.replace("numpy", "numpy.typing") for
+    dct = {k.split(".")[-1]: v.replace("numpy", "numpy._typing") for
             k, v in _PRECISION_DICT.items()}
  
      return {
-        "uint8": "numpy.unsignedinteger[numpy.typing._8Bit]",
-        "uint16": "numpy.unsignedinteger[numpy.typing._16Bit]",
-        "uint32": "numpy.unsignedinteger[numpy.typing._32Bit]",
-        "uint64": "numpy.unsignedinteger[numpy.typing._64Bit]",
-        "uint128": "numpy.unsignedinteger[numpy.typing._128Bit]",
-        "uint256": "numpy.unsignedinteger[numpy.typing._256Bit]",
-        "int8": "numpy.signedinteger[numpy.typing._8Bit]",
-        "int16": "numpy.signedinteger[numpy.typing._16Bit]",
-        "int32": "numpy.signedinteger[numpy.typing._32Bit]",
-        "int64": "numpy.signedinteger[numpy.typing._64Bit]",
-        "int128": "numpy.signedinteger[numpy.typing._128Bit]",
-        "int256": "numpy.signedinteger[numpy.typing._256Bit]",
-        "float16": "numpy.floating[numpy.typing._16Bit]",
-        "float32": "numpy.floating[numpy.typing._32Bit]",
-        "float64": "numpy.floating[numpy.typing._64Bit]",
-        "float80": "numpy.floating[numpy.typing._80Bit]",
-        "float96": "numpy.floating[numpy.typing._96Bit]",
-        "float128": "numpy.floating[numpy.typing._128Bit]",
-        "float256": "numpy.floating[numpy.typing._256Bit]",
+        "uint8": "numpy.unsignedinteger[numpy._typing._8Bit]",
+        "uint16": "numpy.unsignedinteger[numpy._typing._16Bit]",
+        "uint32": "numpy.unsignedinteger[numpy._typing._32Bit]",
+        "uint64": "numpy.unsignedinteger[numpy._typing._64Bit]",
+        "uint128": "numpy.unsignedinteger[numpy._typing._128Bit]",
+        "uint256": "numpy.unsignedinteger[numpy._typing._256Bit]",
+        "int8": "numpy.signedinteger[numpy._typing._8Bit]",
+        "int16": "numpy.signedinteger[numpy._typing._16Bit]",
+        "int32": "numpy.signedinteger[numpy._typing._32Bit]",
+        "int64": "numpy.signedinteger[numpy._typing._64Bit]",
+        "int128": "numpy.signedinteger[numpy._typing._128Bit]",
+        "int256": "numpy.signedinteger[numpy._typing._256Bit]",
+        "float16": "numpy.floating[numpy._typing._16Bit]",
+        "float32": "numpy.floating[numpy._typing._32Bit]",
+        "float64": "numpy.floating[numpy._typing._64Bit]",
+        "float80": "numpy.floating[numpy._typing._80Bit]",
+        "float96": "numpy.floating[numpy._typing._96Bit]",
+        "float128": "numpy.floating[numpy._typing._128Bit]",
+        "float256": "numpy.floating[numpy._typing._256Bit]",
          "complex64": ("numpy.complexfloating"
-                      "[numpy.typing._32Bit, numpy.typing._32Bit]"),
+                      "[numpy._typing._32Bit, numpy._typing._32Bit]"),
          "complex128": ("numpy.complexfloating"
-                       "[numpy.typing._64Bit, numpy.typing._64Bit]"),
+                       "[numpy._typing._64Bit, numpy._typing._64Bit]"),
          "complex160": ("numpy.complexfloating"
-                       "[numpy.typing._80Bit, numpy.typing._80Bit]"),
+                       "[numpy._typing._80Bit, numpy._typing._80Bit]"),
          "complex192": ("numpy.complexfloating"
-                       "[numpy.typing._96Bit, numpy.typing._96Bit]"),
+                       "[numpy._typing._96Bit, numpy._typing._96Bit]"),
          "complex256": ("numpy.complexfloating"
-                       "[numpy.typing._128Bit, numpy.typing._128Bit]"),
+                       "[numpy._typing._128Bit, numpy._typing._128Bit]"),
          "complex512": ("numpy.complexfloating"
-                       "[numpy.typing._256Bit, numpy.typing._256Bit]"),
+                       "[numpy._typing._256Bit, numpy._typing._256Bit]"),
  
          "ubyte": f"numpy.unsignedinteger[{dct['_NBitByte']}]",
          "ushort": f"numpy.unsignedinteger[{dct['_NBitShort']}]",
@@ -310,7 +310,7 @@ def _parse_reveals(file: IO[str]) -> tuple[npt.NDArray[np.str_], list[str]]:
  
      All format keys will be substituted for their respective value
      from `FORMAT_DICT`, *e.g.* ``"{float64}"`` becomes
-    ``"numpy.floating[numpy.typing._64Bit]"``.
+    ``"numpy.floating[numpy._typing._64Bit]"``.
      """
      string = file.read().replace("*", "")
  
@@ -368,6 +368,7 @@ Expression: {}
  Expected reveal: {!r}
  Observed reveal: {!r}
  """
+_STRIP_PATTERN = re.compile(r"(\w+\.)+(\w+)")
  
  
  def _test_reveal(
@@ -378,9 +379,8 @@ def _test_reveal(
      lineno: int,
  ) -> None:
      """Error-reporting helper function for `test_reveal`."""
-    strip_pattern = re.compile(r"(\w+\.)+(\w+)")
-    stripped_reveal = strip_pattern.sub(strip_func, reveal)
-    stripped_expected_reveal = strip_pattern.sub(strip_func, expected_reveal)
+    stripped_reveal = _STRIP_PATTERN.sub(strip_func, reveal)
+    stripped_expected_reveal = _STRIP_PATTERN.sub(strip_func, expected_reveal)
      if stripped_reveal not in stripped_expected_reveal:
          raise AssertionError(
              _REVEAL_MSG.format(lineno,
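Note on `_STRIP_PATTERN` above: the pattern matches a dotted name and captures its last component, so substituting with that capture drops the module prefix. A minimal sketch follows, assuming the `strip_func` helper simply returns the second capture group. Its definition is not shown in this diff, so `strip_module_names` below is a hypothetical stand-in.

```python
import re

# Same pattern as in the hunk above, compiled once at module level.
_STRIP_PATTERN = re.compile(r"(\w+\.)+(\w+)")


def strip_module_names(text: str) -> str:
    """Replace every dotted name with its final component (hypothetical helper)."""
    return _STRIP_PATTERN.sub(lambda match: match.group(2), text)


if __name__ == "__main__":
    print(strip_module_names("numpy.floating[numpy._typing._64Bit]"))
```

Hoisting the compiled pattern to module level, as the diff does, avoids rebuilding the pattern object on every call of the error-reporting helper.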
diff --git a/pavement.py b/pavement.py

index 05f775d44876f78df446fe5b27e70a94f9e6fbf2..025489cbd9f5d13a52a5483894817dbea0800781 100644 (file)
--- a/pavement.py
+++ b/pavement.py
@@ -38,7 +38,7 @@ from paver.easy import Bunch, options, task, sh
  #-----------------------------------
  
  # Path to the release notes
-RELEASE_NOTES = 'doc/source/release/1.22.4-notes.rst'
+RELEASE_NOTES = 'doc/source/release/1.23.0-notes.rst'
  
  
  #-------------------------------------------------------
diff --git a/pyproject.toml b/pyproject.toml

index de54171996543d063ca94d449e69d3d6aa995202..b5564fc0779d3ae7a127d7c5c91e8b8f937c6fd9 100644 (file)
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ requires = [
      "packaging==20.5; platform_machine=='arm64'",  # macos M1
      "setuptools==59.2.0",
      "wheel==0.37.0",
-    "Cython>=0.29.30,<3.0",  # Note: keep in sync with tools/cythonize.py
+    "Cython>=0.29.30,<3.0",
  ]
  
  
@@ -74,3 +74,34 @@ requires = [
          directory = "change"
          name = "Changes"
          showcontent = true
+
+
+[tool.cibuildwheel]
+skip = "cp36-* cp37-* pp37-* *-manylinux_i686 *_ppc64le *_s390x *-musllinux*"
+build-verbosity = "3"
+before-build = "bash {project}/tools/wheels/cibw_before_build.sh {project}"
+before-test = "pip install -r {project}/test_requirements.txt"
+test-command = "bash {project}/tools/wheels/cibw_test_command.sh {project}"
+
+[tool.cibuildwheel.linux]
+manylinux-x86_64-image = "manylinux2014"
+manylinux-aarch64-image = "manylinux2014"
+environment = { CFLAGS="-std=c99 -fno-strict-aliasing", LDFLAGS="-Wl,--strip-debug", OPENBLAS64_="/usr/local", NPY_USE_BLAS_ILP64="1", RUNNER_OS="Linux" }
+
+[tool.cibuildwheel.macos]
+# For universal2 wheels, we will need to fuse them manually
+# instead of going through cibuildwheel.
+# This is because cibuildwheel tries to make a fat wheel; see
+# https://github.com/multi-build/multibuild/blame/devel/README.rst#L541-L565
+# for more info
+archs = "x86_64 arm64"
+test-skip = "*_arm64 *_universal2:arm64"
+# The macOS linker doesn't support stripping symbols
+environment = { CFLAGS="-std=c99 -fno-strict-aliasing", OPENBLAS64_="/usr/local", NPY_USE_BLAS_ILP64="1", CC="clang", CXX = "clang++" }
+
+[tool.cibuildwheel.windows]
+environment = { OPENBLAS64_="openblas", OPENBLAS="", NPY_USE_BLAS_ILP64="1", CFLAGS="", LDFLAGS="" }
+
+[[tool.cibuildwheel.overrides]]
+select = "*-win32"
+environment = { OPENBLAS64_="", OPENBLAS="openblas", NPY_USE_BLAS_ILP64="0", CFLAGS="-m32", LDFLAGS="-m32" }
diff --git a/pytest.ini b/pytest.ini

index 1d84f4c4803b6b86be9180745ac2ec5d4689beda..f1c49d0ff33cf7105bebf5fba20df7738d5b5a0b 100644 (file)
--- a/pytest.ini
+++ b/pytest.ini
@@ -11,6 +11,8 @@ filterwarnings =
      ignore:numpy.dtype size changed
      ignore:numpy.ufunc size changed
      ignore::UserWarning:cpuinfo,
+    ignore: divide by zero encountered in log
+    ignore: invalid value encountered in log
  # Matrix PendingDeprecationWarning.
      ignore:the matrix subclass is not
      ignore:Importing from numpy.matlib is
@@ -18,5 +20,8 @@ filterwarnings =
      ignore:assertions not in test modules or plugins:pytest.PytestConfigWarning
  # TODO: remove below when array_api user warning is removed
      ignore:The numpy.array_api submodule is still experimental. See NEP 47.
+# ignore matplotlib headless warning for pyplot
+    ignore:Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.:UserWarning
  # Ignore DeprecationWarnings from distutils
      ignore::DeprecationWarning:.*distutils
+    ignore:\n\n  `numpy.distutils`:DeprecationWarning
diff --git a/runtests.py b/runtests.py

index ac057a358dd32052b9afa51db0ed3fc7e6c37891..fea29a10dc4914ff659d081fb4a8ac67becc26af 100755 (executable)
--- a/runtests.py
+++ b/runtests.py
@@ -199,8 +199,9 @@ def main(argv):
                  os.environ.get('PYTHONPATH', '')
              ))
      else:
-        _temp = __import__(PROJECT_MODULE)
-        site_dir = os.path.sep.join(_temp.__file__.split(os.path.sep)[:-2])
+        if not args.bench_compare:
+            _temp = __import__(PROJECT_MODULE)
+            site_dir = os.path.sep.join(_temp.__file__.split(os.path.sep)[:-2])
  
      extra_argv = args.args[:]
      if not args.bench:
@@ -286,6 +287,8 @@ def main(argv):
      if args.refguide_check:
          cmd = [os.path.join(ROOT_DIR, 'tools', 'refguide_check.py'),
                 '--doctests']
+        if args.verbose:
+            cmd += ['-' + 'v'*args.verbose]
          if args.submodule:
              cmd += [args.submodule]
          os.execv(sys.executable, [sys.executable] + cmd)
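Note on the `--refguide-check` hunk above: `'-' + 'v'*args.verbose` turns a verbosity count into a single short-option cluster such as `-vv`. A minimal sketch of that forwarding follows, assuming the count comes from an `argparse` option with `action="count"`. The parser setup, the default of 0, and the name `build_cmd` are assumptions for illustration and are not copied from `runtests.py`.

```python
import argparse


def build_cmd(argv):
    """Build a refguide-check style command line, forwarding -v flags (sketch)."""
    parser = argparse.ArgumentParser()
    # Assumed option definition: each -v increments the counter.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    args = parser.parse_args(argv)

    cmd = ["refguide_check.py", "--doctests"]
    if args.verbose:
        # A count of 2 becomes the single argument "-vv".
        cmd += ["-" + "v" * args.verbose]
    return cmd


if __name__ == "__main__":
    print(build_cmd(["-vv"]))
    print(build_cmd([]))
```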
diff --git a/setup.py b/setup.py

index 0257fc5ba7558a5def63ab05cd6e7facf43c16b3..353b6c096930fc0820009755b44b5eaabe49c50f 100755 (executable)
--- a/setup.py
+++ b/setup.py
@@ -16,6 +16,11 @@ variety of databases.
  
  All NumPy wheels distributed on PyPI are BSD licensed.
  
+NumPy requires ``pytest`` and ``hypothesis``.  Tests can then be run after
+installation with::
+
+    python -c 'import numpy; numpy.test()'
+
  """
  DOCLINES = (__doc__ or '').split("\n")
  
@@ -202,7 +207,7 @@ def get_build_overrides():
      """
      from numpy.distutils.command.build_clib import build_clib
      from numpy.distutils.command.build_ext import build_ext
-    from distutils.version import LooseVersion
+    from numpy.compat import _pep440
  
      def _needs_gcc_c99_flag(obj):
          if obj.compiler.compiler_type != 'unix':
@@ -216,15 +221,16 @@ def get_build_overrides():
          out = subprocess.run([cc, '-dumpversion'], stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE, universal_newlines=True)
          # -std=c99 is default from this version on
-        if LooseVersion(out.stdout) >= LooseVersion('5.0'):
+        if _pep440.parse(out.stdout) >= _pep440.Version('5.0'):
              return False
          return True
  
      class new_build_clib(build_clib):
          def build_a_library(self, build_info, lib_name, libraries):
+            from numpy.distutils.ccompiler_opt import NPY_CXX_FLAGS
              if _needs_gcc_c99_flag(self):
                  build_info['extra_cflags'] = ['-std=c99']
-            build_info['extra_cxxflags'] = ['-std=c++11']
+            build_info['extra_cxxflags'] = NPY_CXX_FLAGS
              build_clib.build_a_library(self, build_info, lib_name, libraries)
  
      class new_build_ext(build_ext):
@@ -237,6 +243,31 @@ def get_build_overrides():
  
  
  def generate_cython():
+    # Check Cython version
+    from numpy.compat import _pep440
+    try:
+        # try the cython in the installed python first (somewhat related to
+        # scipy/scipy#2397)
+        import Cython
+        from Cython.Compiler.Version import version as cython_version
+    except ImportError as e:
+        # The `cython` command need not point to the version installed in the
+        # Python running this script, so raise an error to avoid the chance of
+        # using the wrong version of Cython.
+        msg = 'Cython needs to be installed in Python as a module'
+        raise OSError(msg) from e
+    else:
+        # Note: keep in sync with that in pyproject.toml
+        # Update for Python 3.11
+        required_version = '0.29.30'
+
+        if _pep440.parse(cython_version) < _pep440.Version(required_version):
+            cython_path = Cython.__file__
+            msg = 'Building NumPy requires Cython >= {}, found {} at {}'
+            msg = msg.format(required_version, cython_version, cython_path)
+            raise RuntimeError(msg)
+
+    # Process files
      cwd = os.path.abspath(os.path.dirname(__file__))
      print("Cythonizing sources")
      for d in ('random',):
@@ -293,7 +324,7 @@ def parse_setuppy_commands():
  
                - `pip install .`       (from a git repo or downloaded source
                                         release)
-              - `pip install numpy`   (last NumPy release on PyPi)
+              - `pip install numpy`   (last NumPy release on PyPI)
  
              """))
          return True
@@ -305,7 +336,7 @@ def parse_setuppy_commands():
  
              To install NumPy from here with reliable uninstall, we recommend
              that you use `pip install .`. To install the latest NumPy release
-            from PyPi, use `pip install numpy`.
+            from PyPI, use `pip install numpy`.
  
              For help with build/installation issues, please ask on the
              numpy-discussion mailing list.  If you are sure that you have run
@@ -373,7 +404,7 @@ def get_docs_url():
      if 'dev' in VERSION:
          return "https://numpy.org/devdocs"
      else:
-        # For releases, this URL ends up on pypi.
+        # For releases, this URL ends up on PyPI.
          # By pinning the version, users looking at old PyPI releases can get
          # to the associated docs easily.
          return "https://numpy.org/doc/{}.{}".format(MAJOR, MINOR)
@@ -423,6 +454,7 @@ def setup_package():
          entry_points={
              'console_scripts': f2py_cmds,
              'array_api': ['numpy = numpy.array_api'],
+            'pyinstaller40': ['hook-dirs = numpy:_pyinstaller_hooks_dir'],
          },
      )
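
The `setup.py` hunks above replace `distutils.version.LooseVersion` with a PEP 440 parser and add a minimum-version gate for Cython. The sketch below illustrates the same kind of gate using only the standard library. `release_tuple` and `check_cython` are hypothetical names, not NumPy functions, and the parser deliberately reads only the numeric release segment. Unlike a full PEP 440 implementation, it ignores pre-release and local suffixes.

```python
import re


def release_tuple(version):
    """Return the numeric release segment of a version string as ints.

    Simplification (assumption): suffixes such as 'a1' or '.dev0' are ignored.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError("cannot parse version: {!r}".format(version))
    parts = [int(part) for part in match.group(1).split(".")]
    # Drop trailing zeros so that '5', '5.0' and '5.0.0' compare equal.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def check_cython(found, required="0.29.30"):
    """Raise RuntimeError if `found` is older than `required`."""
    if release_tuple(found) < release_tuple(required):
        msg = "Building NumPy requires Cython >= {}, found {}"
        raise RuntimeError(msg.format(required, found))


check_cython("0.29.32")  # new enough, returns silently
```

Comparing integer tuples rather than strings is the point of the exercise: as strings, `"0.29.4"` sorts after `"0.29.30"`, whereas the tuples `(0, 29, 4)` and `(0, 29, 30)` order correctly.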
  
diff --git a/site.cfg.example b/site.cfg.example

index 1a6b36d2c6eb301a9a7123fed249be1fa1782c54..4df01a2106e3d1352533795ae5fb56cf0187f6d5 100644 (file)
--- a/site.cfg.example
+++ b/site.cfg.example
@@ -85,10 +85,11 @@
  # ATLAS
  # -----
  # ATLAS is an open source optimized implementation of the BLAS and LAPACK
-# routines. NumPy will try to build against ATLAS by default when available in
-# the system library dirs. To build NumPy against a custom installation of
-# ATLAS you can add an explicit section such as the following. Here we assume
-# that ATLAS was configured with ``prefix=/opt/atlas``.
+# routines. NumPy will try to build against ATLAS when available in
+# the system library dirs (and OpenBLAS, MKL and BLIS are not installed). To
+# build NumPy against a custom installation of ATLAS you can add an explicit
+# section such as the following. Here we assume that ATLAS was configured with
+# ``prefix=/opt/atlas``.
  #
  # [atlas]
  # library_dirs = /opt/atlas/lib
@@ -96,29 +97,11 @@
  
  # OpenBLAS
  # --------
-# OpenBLAS is another open source optimized implementation of BLAS and LAPACK
-# and can be seen as an alternative to ATLAS. To build NumPy against OpenBLAS
-# instead of ATLAS, use this section instead of the above, adjusting as needed
-# for your configuration (in the following example we installed OpenBLAS with
-# ``make install PREFIX=/opt/OpenBLAS``.
-# OpenBLAS is generically installed as a shared library, to force the OpenBLAS
-# library linked to also be used at runtime you can utilize the
-# runtime_library_dirs variable.
-#
-# **Warning**: OpenBLAS, by default, is built in multithreaded mode. Due to the
-# way Python's multiprocessing is implemented, a multithreaded OpenBLAS can
-# cause programs using both to hang as soon as a worker process is forked on
-# POSIX systems (Linux, Mac).
-# This is fixed in OpenBLAS 0.2.9 for the pthread build, the OpenMP build using
-# GNU openmp is as of gcc-4.9 not fixed yet.
-# Python 3.4 will introduce a new feature in multiprocessing, called the
-# "forkserver", which solves this problem. For older versions, make sure
-# OpenBLAS is built using pthreads or use Python threads instead of
-# multiprocessing.
-# (This problem does not exist with multithreaded ATLAS.)
-#
-# https://docs.python.org/library/multiprocessing.html#contexts-and-start-methods
-# https://github.com/xianyi/OpenBLAS/issues/294
+# OpenBLAS is an open source optimized implementation of BLAS and LAPACK
+# and is the default choice for NumPy itself (CI, wheels). OpenBLAS will be
+# selected above ATLAS and Netlib BLAS/LAPACK.  OpenBLAS is generically
+# installed as a shared library, to force the OpenBLAS library linked to also
+# be used at runtime you can utilize the runtime_library_dirs variable.
  #
  # [openblas]
  # libraries = openblas
diff --git a/test_requirements.txt b/test_requirements.txt

index 0bc83e2daf91adb25d02f6061046092bf1aa62f4..c5fec8cd7e48cbb727ab73d021d9ed504f478da3 100644 (file)
--- a/test_requirements.txt
+++ b/test_requirements.txt
@@ -1,4 +1,4 @@
-cython==0.29.30
+cython>=0.29.30,<3.0
  wheel==0.37.0
  setuptools==59.2.0
  hypothesis==6.24.1
@@ -10,4 +10,5 @@ cffi; python_version < '3.10'
  # For testing types. Notes on the restrictions:
  # - Mypy relies on C API features not present in PyPy
  # NOTE: Keep mypy in sync with environment.yml
-mypy==0.940; platform_python_implementation != "PyPy"
+mypy==0.950; platform_python_implementation != "PyPy"
+typing_extensions>=4.2.0
diff --git a/tools/allocation_tracking/README.md b/tools/allocation_tracking/README.md

index fd4f2c8719408f897323f61e8be77fcab07806a4..6cc4c2a582f4530bf6a8bdbdd9e8e4397ab44cf1 100644 (file)
--- a/tools/allocation_tracking/README.md
+++ b/tools/allocation_tracking/README.md
@@ -1,11 +1,7 @@
-Example for using the `PyDataMem_SetEventHook` to track allocations inside numpy.
-
-`alloc_hook.pyx` implements a hook in Cython that calls back into a python
-function. `track_allocations.py` uses it for a simple listing of allocations.
-It can be built with the `setup.py` file in this folder.
-
  Note that since Python 3.6 the builtin tracemalloc module can be used to
  track allocations inside numpy.
  Numpy places its CPU memory allocations into the `np.lib.tracemalloc_domain`
  domain.
  See https://docs.python.org/3/library/tracemalloc.html.
+
+The tool that used to be here has been deprecated.
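
The updated README points to the built-in `tracemalloc` module as the replacement for the removed hook. A minimal sketch of how that might look is shown here. It assumes `np.lib.tracemalloc_domain` is available in the installed NumPy, and `numpy_traced_bytes` is a hypothetical helper name.

```python
import tracemalloc

import numpy as np


def numpy_traced_bytes(snapshot):
    """Sum the sizes of traced blocks that belong to NumPy's domain."""
    # Keep only traces recorded under NumPy's tracemalloc domain.
    domain_filter = tracemalloc.DomainFilter(
        inclusive=True, domain=np.lib.tracemalloc_domain)
    numpy_only = snapshot.filter_traces([domain_filter])
    return sum(stat.size for stat in numpy_only.statistics("lineno"))


tracemalloc.start()
arr = np.ones((1000, 1000))  # allocated while tracing is active
traced = numpy_traced_bytes(tracemalloc.take_snapshot())
tracemalloc.stop()

print("NumPy array data currently traced: {} bytes".format(traced))
```

Only allocations made after `tracemalloc.start()` are recorded, so tracing has to be enabled before the arrays of interest are created.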
diff --git a/tools/allocation_tracking/alloc_hook.pyx b/tools/allocation_tracking/alloc_hook.pyx

deleted file mode 100644 (file)

index eeefe17..0000000
--- a/tools/allocation_tracking/alloc_hook.pyx
+++ /dev/null
@@ -1,42 +0,0 @@
-# A cython wrapper for using python functions as callbacks for
-# PyDataMem_SetEventHook.
-
-cimport numpy as np
-
-cdef extern from "Python.h":
-    object PyLong_FromVoidPtr(void *)
-    void *PyLong_AsVoidPtr(object)
-
-ctypedef void PyDataMem_EventHookFunc(void *inp, void *outp, size_t size,
-                                      void *user_data)
-cdef extern from "numpy/arrayobject.h":
-    PyDataMem_EventHookFunc * \
-        PyDataMem_SetEventHook(PyDataMem_EventHookFunc *newhook,
-                               void *user_data, void **old_data)
-
-np.import_array()
-
-cdef void pyhook(void *old, void *new, size_t size, void *user_data):
-    cdef object pyfunc = <object> user_data
-    pyfunc(PyLong_FromVoidPtr(old),
-           PyLong_FromVoidPtr(new),
-           size)
-
-class NumpyAllocHook:
-    def __init__(self, callback):
-        self.callback = callback
-
-    def __enter__(self):
-        cdef void *old_hook, *old_data
-        old_hook = <void *> \
-            PyDataMem_SetEventHook(<PyDataMem_EventHookFunc *> pyhook,
-                                    <void *> self.callback,
-                                    <void **> &old_data)
-        self.old_hook = PyLong_FromVoidPtr(old_hook)
-        self.old_data = PyLong_FromVoidPtr(old_data)
-
-    def __exit__(self):
-        PyDataMem_SetEventHook(<PyDataMem_EventHookFunc *> \
-                                    PyLong_AsVoidPtr(self.old_hook),
-                                <void *> PyLong_AsVoidPtr(self.old_data),
-                                <void **> 0)
diff --git a/tools/allocation_tracking/setup.py b/tools/allocation_tracking/setup.py

deleted file mode 100644 (file)

index 4462f9f..0000000
--- a/tools/allocation_tracking/setup.py
+++ /dev/null
@@ -1,9 +0,0 @@
-from distutils.core import setup
-from distutils.extension import Extension
-from Cython.Distutils import build_ext
-import numpy
-
-setup(
-    cmdclass = {'build_ext': build_ext},
-    ext_modules = [Extension("alloc_hook", ["alloc_hook.pyx"],
-                             include_dirs=[numpy.get_include()])])
diff --git a/tools/allocation_tracking/sorttable.js b/tools/allocation_tracking/sorttable.js

deleted file mode 100644 (file)

index c952887..0000000
--- a/tools/allocation_tracking/sorttable.js
+++ /dev/null
@@ -1,493 +0,0 @@
-/*
-  SortTable
-  version 2
-  7th April 2007
-  Stuart Langridge, https://www.kryogenix.org/code/browser/sorttable/
-  
-  Instructions:
-  Download this file
-  Add <script src="sorttable.js"></script> to your HTML
-  Add class="sortable" to any table you'd like to make sortable
-  Click on the headers to sort
-  
-  Thanks to many, many people for contributions and suggestions.
-  Licenced as X11: https://www.kryogenix.org/code/browser/licence.html
-  This basically means: do what you want with it.
-*/
-
- 
-var stIsIE = /*@cc_on!@*/false;
-
-sorttable = {
-  init: function() {
-    // quit if this function has already been called
-    if (arguments.callee.done) return;
-    // flag this function so we don't do the same thing twice
-    arguments.callee.done = true;
-    // kill the timer
-    if (_timer) clearInterval(_timer);
-    
-    if (!document.createElement || !document.getElementsByTagName) return;
-    
-    sorttable.DATE_RE = /^(\d\d?)[\/\.-](\d\d?)[\/\.-]((\d\d)?\d\d)$/;
-    
-    forEach(document.getElementsByTagName('table'), function(table) {
-      if (table.className.search(/\bsortable\b/) != -1) {
-        sorttable.makeSortable(table);
-      }
-    });
-    
-  },
-  
-  makeSortable: function(table) {
-    if (table.getElementsByTagName('thead').length == 0) {
-      // table doesn't have a tHead. Since it should have, create one and
-      // put the first table row in it.
-      the = document.createElement('thead');
-      the.appendChild(table.rows[0]);
-      table.insertBefore(the,table.firstChild);
-    }
-    // Safari doesn't support table.tHead, sigh
-    if (table.tHead == null) table.tHead = table.getElementsByTagName('thead')[0];
-    
-    if (table.tHead.rows.length != 1) return; // can't cope with two header rows
-    
-    // Sorttable v1 put rows with a class of "sortbottom" at the bottom (as
-    // "total" rows, for example). This is B&R, since what you're supposed
-    // to do is put them in a tfoot. So, if there are sortbottom rows,
-    // for backwards compatibility, move them to tfoot (creating it if needed).
-    sortbottomrows = [];
-    for (var i=0; i<table.rows.length; i++) {
-      if (table.rows[i].className.search(/\bsortbottom\b/) != -1) {
-        sortbottomrows[sortbottomrows.length] = table.rows[i];
-      }
-    }
-    if (sortbottomrows) {
-      if (table.tFoot == null) {
-        // table doesn't have a tfoot. Create one.
-        tfo = document.createElement('tfoot');
-        table.appendChild(tfo);
-      }
-      for (var i=0; i<sortbottomrows.length; i++) {
-        tfo.appendChild(sortbottomrows[i]);
-      }
-      delete sortbottomrows;
-    }
-    
-    // work through each column and calculate its type
-    headrow = table.tHead.rows[0].cells;
-    for (var i=0; i<headrow.length; i++) {
-      // manually override the type with a sorttable_type attribute
-      if (!headrow[i].className.match(/\bsorttable_nosort\b/)) { // skip this col
-        mtch = headrow[i].className.match(/\bsorttable_([a-z0-9]+)\b/);
-        if (mtch) { override = mtch[1]; }
-             if (mtch && typeof sorttable["sort_"+override] == 'function') {
-               headrow[i].sorttable_sortfunction = sorttable["sort_"+override];
-             } else {
-               headrow[i].sorttable_sortfunction = sorttable.guessType(table,i);
-             }
-             // make it clickable to sort
-             headrow[i].sorttable_columnindex = i;
-             headrow[i].sorttable_tbody = table.tBodies[0];
-             dean_addEvent(headrow[i],"click", function(e) {
-
-          if (this.className.search(/\bsorttable_sorted\b/) != -1) {
-            // if we're already sorted by this column, just 
-            // reverse the table, which is quicker
-            sorttable.reverse(this.sorttable_tbody);
-            this.className = this.className.replace('sorttable_sorted',
-                                                    'sorttable_sorted_reverse');
-            this.removeChild(document.getElementById('sorttable_sortfwdind'));
-            sortrevind = document.createElement('span');
-            sortrevind.id = "sorttable_sortrevind";
-            sortrevind.innerHTML = stIsIE ? '&nbsp<font face="webdings">5</font>' : '&nbsp;&#x25B4;';
-            this.appendChild(sortrevind);
-            return;
-          }
-          if (this.className.search(/\bsorttable_sorted_reverse\b/) != -1) {
-            // if we're already sorted by this column in reverse, just 
-            // re-reverse the table, which is quicker
-            sorttable.reverse(this.sorttable_tbody);
-            this.className = this.className.replace('sorttable_sorted_reverse',
-                                                    'sorttable_sorted');
-            this.removeChild(document.getElementById('sorttable_sortrevind'));
-            sortfwdind = document.createElement('span');
-            sortfwdind.id = "sorttable_sortfwdind";
-            sortfwdind.innerHTML = stIsIE ? '&nbsp<font face="webdings">6</font>' : '&nbsp;&#x25BE;';
-            this.appendChild(sortfwdind);
-            return;
-          }
-          
-          // remove sorttable_sorted classes
-          theadrow = this.parentNode;
-          forEach(theadrow.childNodes, function(cell) {
-            if (cell.nodeType == 1) { // an element
-              cell.className = cell.className.replace('sorttable_sorted_reverse','');
-              cell.className = cell.className.replace('sorttable_sorted','');
-            }
-          });
-          sortfwdind = document.getElementById('sorttable_sortfwdind');
-          if (sortfwdind) { sortfwdind.parentNode.removeChild(sortfwdind); }
-          sortrevind = document.getElementById('sorttable_sortrevind');
-          if (sortrevind) { sortrevind.parentNode.removeChild(sortrevind); }
-          
-          this.className += ' sorttable_sorted';
-          sortfwdind = document.createElement('span');
-          sortfwdind.id = "sorttable_sortfwdind";
-          sortfwdind.innerHTML = stIsIE ? '&nbsp<font face="webdings">6</font>' : '&nbsp;&#x25BE;';
-          this.appendChild(sortfwdind);
-
-               // build an array to sort. This is a Schwartzian transform thing,
-               // i.e., we "decorate" each row with the actual sort key,
-               // sort based on the sort keys, and then put the rows back in order
-               // which is a lot faster because you only do getInnerText once per row
-               row_array = [];
-               col = this.sorttable_columnindex;
-               rows = this.sorttable_tbody.rows;
-               for (var j=0; j<rows.length; j++) {
-                 row_array[row_array.length] = [sorttable.getInnerText(rows[j].cells[col]), rows[j]];
-               }
-               /* If you want a stable sort, uncomment the following line */
-               //sorttable.shaker_sort(row_array, this.sorttable_sortfunction);
-               /* and comment out this one */
-               row_array.sort(this.sorttable_sortfunction);
-               
-               tb = this.sorttable_tbody;
-               for (var j=0; j<row_array.length; j++) {
-                 tb.appendChild(row_array[j][1]);
-               }
-               
-               delete row_array;
-             });
-           }
-    }
-  },
-  
-  guessType: function(table, column) {
-    // guess the type of a column based on its first non-blank row
-    sortfn = sorttable.sort_alpha;
-    for (var i=0; i<table.tBodies[0].rows.length; i++) {
-      text = sorttable.getInnerText(table.tBodies[0].rows[i].cells[column]);
-      if (text != '') {
-        if (text.match(/^-?[£$¤]?[\d,.]+%?$/)) {
-          return sorttable.sort_numeric;
-        }
-        // check for a date: dd/mm/yyyy or dd/mm/yy 
-        // can have / or . or - as separator
-        // can be mm/dd as well
-        possdate = text.match(sorttable.DATE_RE)
-        if (possdate) {
-          // looks like a date
-          first = parseInt(possdate[1]);
-          second = parseInt(possdate[2]);
-          if (first > 12) {
-            // definitely dd/mm
-            return sorttable.sort_ddmm;
-          } else if (second > 12) {
-            return sorttable.sort_mmdd;
-          } else {
-            // looks like a date, but we can't tell which, so assume
-            // that it's dd/mm (English imperialism!) and keep looking
-            sortfn = sorttable.sort_ddmm;
-          }
-        }
-      }
-    }
-    return sortfn;
-  },
-  
-  getInnerText: function(node) {
-    // gets the text we want to use for sorting for a cell.
-    // strips leading and trailing whitespace.
-    // this is *not* a generic getInnerText function; it's special to sorttable.
-    // for example, you can override the cell text with a customkey attribute.
-    // it also gets .value for <input> fields.
-    
-    hasInputs = (typeof node.getElementsByTagName == 'function') &&
-                 node.getElementsByTagName('input').length;
-    
-    if (node.getAttribute("sorttable_customkey") != null) {
-      return node.getAttribute("sorttable_customkey");
-    }
-    else if (typeof node.textContent != 'undefined' && !hasInputs) {
-      return node.textContent.replace(/^\s+|\s+$/g, '');
-    }
-    else if (typeof node.innerText != 'undefined' && !hasInputs) {
-      return node.innerText.replace(/^\s+|\s+$/g, '');
-    }
-    else if (typeof node.text != 'undefined' && !hasInputs) {
-      return node.text.replace(/^\s+|\s+$/g, '');
-    }
-    else {
-      switch (node.nodeType) {
-        case 3:
-          if (node.nodeName.toLowerCase() == 'input') {
-            return node.value.replace(/^\s+|\s+$/g, '');
-          }
-        case 4:
-          return node.nodeValue.replace(/^\s+|\s+$/g, '');
-          break;
-        case 1:
-        case 11:
-          var innerText = '';
-          for (var i = 0; i < node.childNodes.length; i++) {
-            innerText += sorttable.getInnerText(node.childNodes[i]);
-          }
-          return innerText.replace(/^\s+|\s+$/g, '');
-          break;
-        default:
-          return '';
-      }
-    }
-  },
-  
-  reverse: function(tbody) {
-    // reverse the rows in a tbody
-    newrows = [];
-    for (var i=0; i<tbody.rows.length; i++) {
-      newrows[newrows.length] = tbody.rows[i];
-    }
-    for (var i=newrows.length-1; i>=0; i--) {
-       tbody.appendChild(newrows[i]);
-    }
-    delete newrows;
-  },
-  
-  /* sort functions
-     each sort function takes two parameters, a and b
-     you are comparing a[0] and b[0] */
-  sort_numeric: function(a,b) {
-    aa = parseFloat(a[0].replace(/[^0-9.-]/g,''));
-    if (isNaN(aa)) aa = 0;
-    bb = parseFloat(b[0].replace(/[^0-9.-]/g,'')); 
-    if (isNaN(bb)) bb = 0;
-    return aa-bb;
-  },
-  sort_alpha: function(a,b) {
-    if (a[0]==b[0]) return 0;
-    if (a[0]<b[0]) return -1;
-    return 1;
-  },
-  sort_ddmm: function(a,b) {
-    mtch = a[0].match(sorttable.DATE_RE);
-    y = mtch[3]; m = mtch[2]; d = mtch[1];
-    if (m.length == 1) m = '0'+m;
-    if (d.length == 1) d = '0'+d;
-    dt1 = y+m+d;
-    mtch = b[0].match(sorttable.DATE_RE);
-    y = mtch[3]; m = mtch[2]; d = mtch[1];
-    if (m.length == 1) m = '0'+m;
-    if (d.length == 1) d = '0'+d;
-    dt2 = y+m+d;
-    if (dt1==dt2) return 0;
-    if (dt1<dt2) return -1;
-    return 1;
-  },
-  sort_mmdd: function(a,b) {
-    mtch = a[0].match(sorttable.DATE_RE);
-    y = mtch[3]; d = mtch[2]; m = mtch[1];
-    if (m.length == 1) m = '0'+m;
-    if (d.length == 1) d = '0'+d;
-    dt1 = y+m+d;
-    mtch = b[0].match(sorttable.DATE_RE);
-    y = mtch[3]; d = mtch[2]; m = mtch[1];
-    if (m.length == 1) m = '0'+m;
-    if (d.length == 1) d = '0'+d;
-    dt2 = y+m+d;
-    if (dt1==dt2) return 0;
-    if (dt1<dt2) return -1;
-    return 1;
-  },
-  
-  shaker_sort: function(list, comp_func) {
-    // A stable sort function to allow multi-level sorting of data
-    // see: https://en.wikipedia.org/wiki/Cocktail_shaker_sort
-    // thanks to Joseph Nahmias
-    var b = 0;
-    var t = list.length - 1;
-    var swap = true;
-
-    while(swap) {
-        swap = false;
-        for(var i = b; i < t; ++i) {
-            if ( comp_func(list[i], list[i+1]) > 0 ) {
-                var q = list[i]; list[i] = list[i+1]; list[i+1] = q;
-                swap = true;
-            }
-        } // for
-        t--;
-
-        if (!swap) break;
-
-        for(var i = t; i > b; --i) {
-            if ( comp_func(list[i], list[i-1]) < 0 ) {
-                var q = list[i]; list[i] = list[i-1]; list[i-1] = q;
-                swap = true;
-            }
-        } // for
-        b++;
-
-    } // while(swap)
-  }  
-}
-
-/* ******************************************************************
-   Supporting functions: bundled here to avoid depending on a library
-   ****************************************************************** */
-
-// Dean Edwards/Matthias Miller/John Resig
-
-/* for Mozilla/Opera9 */
-if (document.addEventListener) {
-    document.addEventListener("DOMContentLoaded", sorttable.init, false);
-}
-
-/* for Internet Explorer */
-/*@cc_on @*/
-/*@if (@_win32)
-    document.write("<script id=__ie_onload defer src=javascript:void(0)><\/script>");
-    var script = document.getElementById("__ie_onload");
-    script.onreadystatechange = function() {
-        if (this.readyState == "complete") {
-            sorttable.init(); // call the onload handler
-        }
-    };
-/*@end @*/
-
-/* for Safari */
-if (/WebKit/i.test(navigator.userAgent)) { // sniff
-    var _timer = setInterval(function() {
-        if (/loaded|complete/.test(document.readyState)) {
-            sorttable.init(); // call the onload handler
-        }
-    }, 10);
-}
-
-/* for other browsers */
-window.onload = sorttable.init;
-
-// written by Dean Edwards, 2005
-// with input from Tino Zijdel, Matthias Miller, Diego Perini
-
-// http://dean.edwards.name/weblog/2005/10/add-event/
-
-function dean_addEvent(element, type, handler) {
-       if (element.addEventListener) {
-               element.addEventListener(type, handler, false);
-       } else {
-               // assign each event handler a unique ID
-               if (!handler.$$guid) handler.$$guid = dean_addEvent.guid++;
-               // create a hash table of event types for the element
-               if (!element.events) element.events = {};
-               // create a hash table of event handlers for each element/event pair
-               var handlers = element.events[type];
-               if (!handlers) {
-                       handlers = element.events[type] = {};
-                       // store the existing event handler (if there is one)
-                       if (element["on" + type]) {
-                               handlers[0] = element["on" + type];
-                       }
-               }
-               // store the event handler in the hash table
-               handlers[handler.$$guid] = handler;
-               // assign a global event handler to do all the work
-               element["on" + type] = handleEvent;
-       }
-};
-// a counter used to create unique IDs
-dean_addEvent.guid = 1;
-
-function removeEvent(element, type, handler) {
-       if (element.removeEventListener) {
-               element.removeEventListener(type, handler, false);
-       } else {
-               // delete the event handler from the hash table
-               if (element.events && element.events[type]) {
-                       delete element.events[type][handler.$$guid];
-               }
-       }
-};
-
-function handleEvent(event) {
-       var returnValue = true;
-       // grab the event object (IE uses a global event object)
-       event = event || fixEvent(((this.ownerDocument || this.document || this).parentWindow || window).event);
-       // get a reference to the hash table of event handlers
-       var handlers = this.events[event.type];
-       // execute each event handler
-       for (var i in handlers) {
-               this.$$handleEvent = handlers[i];
-               if (this.$$handleEvent(event) === false) {
-                       returnValue = false;
-               }
-       }
-       return returnValue;
-};
-
-function fixEvent(event) {
-       // add W3C standard event methods
-       event.preventDefault = fixEvent.preventDefault;
-       event.stopPropagation = fixEvent.stopPropagation;
-       return event;
-};
-fixEvent.preventDefault = function() {
-       this.returnValue = false;
-};
-fixEvent.stopPropagation = function() {
-  this.cancelBubble = true;
-}
-
-// Dean's forEach: http://dean.edwards.name/base/forEach.js
-/*
-       forEach, version 1.0
-       Copyright 2006, Dean Edwards
-       License: https://www.opensource.org/licenses/mit-license.php
-*/
-
-// array-like enumeration
-if (!Array.forEach) { // mozilla already supports this
-       Array.forEach = function(array, block, context) {
-               for (var i = 0; i < array.length; i++) {
-                       block.call(context, array[i], i, array);
-               }
-       };
-}
-
-// generic enumeration
-Function.prototype.forEach = function(object, block, context) {
-       for (var key in object) {
-               if (typeof this.prototype[key] == "undefined") {
-                       block.call(context, object[key], key, object);
-               }
-       }
-};
-
-// character enumeration
-String.forEach = function(string, block, context) {
-       Array.forEach(string.split(""), function(chr, index) {
-               block.call(context, chr, index, string);
-       });
-};
-
-// globally resolve forEach enumeration
-var forEach = function(object, block, context) {
-       if (object) {
-               var resolve = Object; // default
-               if (object instanceof Function) {
-                       // functions have a "length" property
-                       resolve = Function;
-               } else if (object.forEach instanceof Function) {
-                       // the object implements a custom forEach method so use that
-                       object.forEach(block, context);
-                       return;
-               } else if (typeof object == "string") {
-                       // the object is a string
-                       resolve = String;
-               } else if (typeof object.length == "number") {
-                       // the object is array-like
-                       resolve = Array;
-               }
-               resolve.forEach(object, block, context);
-       }
-};
-
diff --git a/tools/allocation_tracking/track_allocations.py b/tools/allocation_tracking/track_allocations.py

deleted file mode 100644 (file)

index 2a80d8f..0000000
--- a/tools/allocation_tracking/track_allocations.py
+++ /dev/null
@@ -1,140 +0,0 @@
-import numpy as np
-import gc
-import inspect
-from alloc_hook import NumpyAllocHook
-
-class AllocationTracker:
-    def __init__(self, threshold=0):
-        '''track numpy allocations of size threshold bytes or more.'''
-
-        self.threshold = threshold
-
-        # The total number of bytes currently allocated with size above
-        # threshold
-        self.total_bytes = 0
-
-        # We buffer requests line by line and move them into the allocation
-        # trace when a new line occurs
-        self.current_line = None
-        self.pending_allocations = []
-
-        self.blocksizes = {}
-
-        # list of (lineinfo, bytes allocated, bytes freed, # allocations, #
-        # frees, maximum memory usage, long-lived bytes allocated)
-        self.allocation_trace = []
-
-        self.numpy_hook = NumpyAllocHook(self.hook)
-
-    def __enter__(self):
-        self.numpy_hook.__enter__()
-
-    def __exit__(self, type, value, traceback):
-        self.check_line_changed()  # forces pending events to be handled
-        self.numpy_hook.__exit__()
-
-    def hook(self, inptr, outptr, size):
-        # minimize the chances that the garbage collector kicks in during a
-        # cython __dealloc__ call and causes a double delete of the current
-        # object. To avoid this fully the hook would have to avoid all python
-        # api calls, e.g. by being implemented in C like python 3.4's
-        # tracemalloc module
-        gc_on = gc.isenabled()
-        gc.disable()
-        if outptr == 0:  # it's a free
-            self.free_cb(inptr)
-        elif inptr != 0:  # realloc
-            self.realloc_cb(inptr, outptr, size)
-        else:  # malloc
-            self.alloc_cb(outptr, size)
-        if gc_on:
-            gc.enable()
-
-    def alloc_cb(self, ptr, size):
-        if size >= self.threshold:
-            self.check_line_changed()
-            self.blocksizes[ptr] = size
-            self.pending_allocations.append(size)
-
-    def free_cb(self, ptr):
-        size = self.blocksizes.pop(ptr, 0)
-        if size:
-            self.check_line_changed()
-            self.pending_allocations.append(-size)
-
-    def realloc_cb(self, newptr, oldptr, size):
-        if (size >= self.threshold) or (oldptr in self.blocksizes):
-            self.check_line_changed()
-            oldsize = self.blocksizes.pop(oldptr, 0)
-            self.pending_allocations.append(size - oldsize)
-            self.blocksizes[newptr] = size
-
-    def get_code_line(self):
-        # first frame is this line, then check_line_changed(), then 2 callbacks,
-        # then actual code.
-        try:
-            return inspect.stack()[4][1:]
-        except Exception:
-            return inspect.stack()[0][1:]
-
-    def check_line_changed(self):
-        line = self.get_code_line()
-        if line != self.current_line and (self.current_line is not None):
-            # move pending events into the allocation_trace
-            max_size = self.total_bytes
-            bytes_allocated = 0
-            bytes_freed = 0
-            num_allocations = 0
-            num_frees = 0
-            before_size = self.total_bytes
-            for allocation in self.pending_allocations:
-                self.total_bytes += allocation
-                if allocation > 0:
-                    bytes_allocated += allocation
-                    num_allocations += 1
-                else:
-                    bytes_freed += -allocation
-                    num_frees += 1
-                max_size = max(max_size, self.total_bytes)
-            long_lived = max(self.total_bytes - before_size, 0)
-            self.allocation_trace.append((self.current_line, bytes_allocated,
-                                          bytes_freed, num_allocations,
-                                          num_frees, max_size, long_lived))
-            # clear pending allocations
-            self.pending_allocations = []
-        # move to the new line
-        self.current_line = line
-
-    def write_html(self, filename):
-        with open(filename, "w") as f:
-            f.write('<HTML><HEAD><script src="sorttable.js"></script></HEAD><BODY>\n')
-            f.write('<TABLE class="sortable" width=100%>\n')
-            f.write("<TR>\n")
-            cols = "event#,lineinfo,bytes allocated,bytes freed,#allocations,#frees,max memory usage,long lived bytes".split(',')
-            for header in cols:
-                f.write("  <TH>{0}</TH>".format(header))
-            f.write("\n</TR>\n")
-            for idx, event in enumerate(self.allocation_trace):
-                f.write("<TR>\n")
-                event = [idx] + list(event)
-                for col, val in zip(cols, event):
-                    if col == 'lineinfo':
-                        # special handling
-                        try:
-                            filename, line, module, code, index = val
-                            val = "{0}({1}): {2}".format(filename, line, code[index])
-                        except Exception:
-                            # sometimes this info is not available (from eval()?)
-                            val = str(val)
-                    f.write("  <TD>{0}</TD>".format(val))
-                f.write("\n</TR>\n")
-            f.write("</TABLE></BODY></HTML>\n")
-
-
-if __name__ == '__main__':
-    tracker = AllocationTracker(1000)
-    with tracker:
-        for i in range(100):
-            np.zeros(i * 100)
-            np.zeros(i * 200)
-    tracker.write_html("allocations.html")
diff --git a/tools/changelog.py b/tools/changelog.py

index 444d96882216643c2f054032f13bbeb3fbaca7ca..7b7e66ddb5119d58b01f15acb7d95e81f8f22848 100755 (executable)
--- a/tools/changelog.py
+++ b/tools/changelog.py
@@ -39,9 +39,6 @@ import re
  from git import Repo
  from github import Github
  
-if sys.version_info[:2] < (3, 6):
-    raise RuntimeError("Python version must be >= 3.6")
-
  this_repo = Repo(os.path.join(os.path.dirname(__file__), ".."))
  
  author_msg =\
diff --git a/tools/cythonize.py b/tools/cythonize.py

index f7b179574c99e4860a59006f56f7cc499040cc7b..002b2fad74ddab2a3c6ce353c0e60f17deed15ce 100755 (executable)
--- a/tools/cythonize.py
+++ b/tools/cythonize.py
@@ -48,33 +48,8 @@ def process_pyx(fromfile, tofile):
      if tofile.endswith('.cxx'):
          flags.append('--cplus')
  
-    try:
-        # try the cython in the installed python first (somewhat related to scipy/scipy#2397)
-        import Cython
-        from Cython.Compiler.Version import version as cython_version
-    except ImportError as e:
-        # The `cython` command need not point to the version installed in the
-        # Python running this script, so raise an error to avoid the chance of
-        # using the wrong version of Cython.
-        msg = 'Cython needs to be installed in Python as a module'
-        raise OSError(msg) from e
-    else:
-        # check the version, and invoke through python
-        from distutils.version import LooseVersion
-
-        # Cython 0.29.21 is required for Python 3.9 and there are
-        # other fixes in the 0.29 series that are needed even for earlier
-        # Python versions.
-        # Note: keep in sync with that in pyproject.toml
-        # Update for Python 3.10
-        required_version = LooseVersion('0.29.30')
-
-        if LooseVersion(cython_version) < required_version:
-            cython_path = Cython.__file__
-            raise RuntimeError(f'Building {VENDOR} requires Cython >= {required_version}'
-                               f', found {cython_version} at {cython_path}')
-        subprocess.check_call(
-            [sys.executable, '-m', 'cython'] + flags + ["-o", tofile, fromfile])
+    subprocess.check_call(
+        [sys.executable, '-m', 'cython'] + flags + ["-o", tofile, fromfile])
  
  
  def process_tempita_pyx(fromfile, tofile):
diff --git a/tools/download-wheels.py b/tools/download-wheels.py

index dd066d9adba9f0236f7eaecc511d15a384ff30ee..41e1e9e5d849c95925efb7b46e1fa2d12fb7c6d6 100644 (file)
--- a/tools/download-wheels.py
+++ b/tools/download-wheels.py
@@ -31,11 +31,17 @@ import argparse
  import urllib3
  from bs4 import BeautifulSoup
  
-__version__ = '0.1'
+__version__ = "0.1"
  
  # Edit these for other projects.
-STAGING_URL = 'https://anaconda.org/multibuild-wheels-staging/numpy'
-PREFIX = 'numpy'
+STAGING_URL = "https://anaconda.org/multibuild-wheels-staging/numpy"
+PREFIX = "numpy"
+
+# Name endings of the files to download.
+WHL = r"-.*\.whl$"
+ZIP = r"\.zip$"
+GZIP = r"\.tar\.gz$"
+SUFFIX = rf"({WHL}|{GZIP}|{ZIP})"
  
  
  def get_wheel_names(version):
@@ -50,11 +56,11 @@ def get_wheel_names(version):
          The release version. For instance, "1.18.3".
  
      """
-    http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED')
-    tmpl = re.compile(rf"^.*{PREFIX}-{version}-.*\.whl$")
+    http = urllib3.PoolManager(cert_reqs="CERT_REQUIRED")
+    tmpl = re.compile(rf"^.*{PREFIX}-{version}{SUFFIX}")
      index_url = f"{STAGING_URL}/files"
-    index_html = http.request('GET', index_url)
-    soup = BeautifulSoup(index_html.data, 'html.parser')
+    index_html = http.request("GET", index_url)
+    soup = BeautifulSoup(index_html.data, "html.parser")
      return soup.findAll(text=tmpl)
  
  
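Note on the hunks above: the new SUFFIX pattern widens get_wheel_names() from wheels only to wheels plus .tar.gz and .zip source archives. The following is a minimal sketch of that matching behaviour, not part of the commit. It copies the pattern constants from the diff, but the helper names build_pattern and select_release_files and the sample file names are hypothetical, and it filters a plain list instead of fetching the staging index with urllib3 and BeautifulSoup.

```python
import re

# Constants copied from the updated tools/download-wheels.py.
PREFIX = "numpy"
WHL = r"-.*\.whl$"
ZIP = r"\.zip$"
GZIP = r"\.tar\.gz$"
SUFFIX = rf"({WHL}|{GZIP}|{ZIP})"


def build_pattern(version):
    """Hypothetical helper: compile the same template the script builds."""
    return re.compile(rf"^.*{PREFIX}-{version}{SUFFIX}")


def select_release_files(names, version):
    """Hypothetical helper: keep only the names the template accepts."""
    tmpl = build_pattern(version)
    return [name for name in names if tmpl.search(name)]


# Made-up file names, for illustration only.
names = [
    "numpy-1.23.0-cp38-cp38-win32.whl",
    "numpy-1.23.0.tar.gz",
    "numpy-1.23.0.zip",
    "numpy-1.23.0rc1.tar.gz",
    "numpy-1.22.4-cp38-cp38-win32.whl",
]
selected = select_release_files(names, "1.23.0")
print(selected)
```

With these sample names, the wheel and both source archives for 1.23.0 are kept, while the release candidate and the 1.22.4 wheel are rejected. One caveat visible in the sketch: the version string is interpolated without escaping, so each "." in it matches any character. A stricter variant could pass the version through re.escape, which would be a change beyond what this commit does.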
@@ -72,20 +78,20 @@ def download_wheels(version, wheelhouse):
          Directory in which to download the wheels.
  
      """
-    http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED')
+    http = urllib3.PoolManager(cert_reqs="CERT_REQUIRED")
      wheel_names = get_wheel_names(version)
  
      for i, wheel_name in enumerate(wheel_names):
          wheel_url = f"{STAGING_URL}/{version}/download/{wheel_name}"
          wheel_path = os.path.join(wheelhouse, wheel_name)
-        with open(wheel_path, 'wb') as f:
-            with http.request('GET', wheel_url, preload_content=False,) as r:
+        with open(wheel_path, "wb") as f:
+            with http.request("GET", wheel_url, preload_content=False,) as r:
                  print(f"{i + 1:<4}{wheel_name}")
                  shutil.copyfileobj(r, f)
      print(f"\nTotal files downloaded: {len(wheel_names)}")
  
  
-if __name__ == '__main__':
+if __name__ == "__main__":
      parser = argparse.ArgumentParser()
      parser.add_argument(
          "version",
diff --git a/tools/find_deprecated_escaped_characters.py b/tools/find_deprecated_escaped_characters.py

index 22efaae65b69c09e8915fd1d56dbe93ffd2976f6..d7225b8e85f649dfd25fe68ac3aa4c174e6a6b7b 100644 (file)
--- a/tools/find_deprecated_escaped_characters.py
+++ b/tools/find_deprecated_escaped_characters.py
@@ -7,7 +7,7 @@ were accepted before. For instance, '\(' was previously accepted but must now
  be written as '\\(' or r'\('.
  
  """
-import sys
+
  
  def main(root):
      """Find deprecated escape sequences.
@@ -56,9 +56,6 @@ def main(root):
  if __name__ == "__main__":
      from argparse import ArgumentParser
  
-    if sys.version_info[:2] < (3, 6):
-        raise RuntimeError("Python version must be >= 3.6")
-
      parser = ArgumentParser(description="Find deprecated escaped characters")
      parser.add_argument('root', help='directory or file to be checked')
      args = parser.parse_args()
diff --git a/tools/functions_missing_types.py b/tools/functions_missing_types.py

index 0461aabd3634227a0eb7f4dbb32238bf031e1cad..99c6887a981b99521fa09b260df5c1fac38d4343 100755 (executable)
--- a/tools/functions_missing_types.py
+++ b/tools/functions_missing_types.py
@@ -32,7 +32,6 @@ EXCLUDE_LIST = {
          "math",
          # Accidentally public, deprecated, or shouldn't be used
          "Tester",
-        "alen",
          "add_docstring",
          "add_newdoc",
          "add_newdoc_ufunc",
diff --git a/tools/gitpod/Dockerfile b/tools/gitpod/Dockerfile

index e2e0e1bc957151ac7be7198baeba8f121a861a4e..592a5ee0ab11246b6150e339a6307f1e2100bdf5 100644 (file)
--- a/tools/gitpod/Dockerfile
+++ b/tools/gitpod/Dockerfile
@@ -27,7 +27,7 @@
  # OS/ARCH: linux/amd64
  FROM gitpod/workspace-base:latest
  
-ARG MAMBAFORGE_VERSION="4.10.0-0"
+ARG MAMBAFORGE_VERSION="4.11.0-0"
  ARG CONDA_ENV=numpy-dev
  
  
diff --git a/tools/gitpod/gitpod.Dockerfile b/tools/gitpod/gitpod.Dockerfile

index 7894be5bc3582850c9ea18343e3746441211c408..8dac0d597d5c6f6d5eb11cc886783ac9d5982217 100644 (file)
--- a/tools/gitpod/gitpod.Dockerfile
+++ b/tools/gitpod/gitpod.Dockerfile
@@ -34,6 +34,7 @@ COPY --from=clone --chown=gitpod /tmp/numpy ${WORKSPACE}
  WORKDIR ${WORKSPACE}
  
  # Build numpy to populate the cache used by ccache
+RUN git config --global --add safe.directory /workspace/numpy
  RUN git submodule update --init --depth=1 -- numpy/core/src/umath/svml
  RUN conda activate ${CONDA_ENV} && \ 
      python setup.py build_ext --inplace && \
diff --git a/tools/gitpod/settings.json b/tools/gitpod/settings.json

index 8f070c04c05a643c892da40b30f10fc170a15bae..50296336dde42bb61e14a9aaac9ab3f389d534a3 100644 (file)
--- a/tools/gitpod/settings.json
+++ b/tools/gitpod/settings.json
@@ -1,9 +1,8 @@
  {
-    "restructuredtext.languageServer.disabled": true,
-    "restructuredtext.builtDocumentationPath": "${workspaceRoot}/doc/build/html",
-    "restructuredtext.confPath": "",
      "restructuredtext.updateOnTextChanged": "true",
      "restructuredtext.updateDelay": 300,
-    "restructuredtext.linter.disabled": true,
-    "python.pythonPath": "/home/gitpod/mambaforge3/envs/numpy-dev/bin/python"
+    "restructuredtext.linter.disabledLinters": ["doc8","rst-lint", "rstcheck"],
+    "python.defaultInterpreterPath": "/home/gitpod/mambaforge3/envs/numpy-dev/bin/python",
+    "esbonio.sphinx.buildDir": "${workspaceRoot}/doc/build/html",
+    "esbonio.sphinx.confDir": ""
  }
 \ No newline at end of file
diff --git a/tools/openblas_support.py b/tools/openblas_support.py

index 4eb72dbc9b4b3d910212bf0f4257e64c5fa13dfc..ce677f9a5f89570bc2a5fae77cdac2360ad40805 100644 (file)
--- a/tools/openblas_support.py
+++ b/tools/openblas_support.py
@@ -13,8 +13,8 @@ from tempfile import mkstemp, gettempdir
  from urllib.request import urlopen, Request
  from urllib.error import HTTPError
  
-OPENBLAS_V = '0.3.18'
-OPENBLAS_LONG = 'v0.3.18'
+OPENBLAS_V = '0.3.20'
+OPENBLAS_LONG = 'v0.3.20'
  BASE_LOC = 'https://anaconda.org/multibuild-wheels-staging/openblas-libs'
  BASEURL = f'{BASE_LOC}/{OPENBLAS_LONG}/download'
  SUPPORTED_PLATFORMS = [
diff --git a/tools/pypy-test.sh b/tools/pypy-test.sh

deleted file mode 100755 (executable)

index e6c6ae7..0000000
--- a/tools/pypy-test.sh
+++ /dev/null
@@ -1,49 +0,0 @@
-#!/usr/bin/env bash
-
-# Exit if a command fails
-set -e
-set -o pipefail
-# Print expanded commands
-set -x
-
-sudo apt-get -yq update
-sudo apt-get -yq install gfortran-5
-export F77=gfortran-5
-export F90=gfortran-5
-
-# Download the proper OpenBLAS x64 precompiled library
-target=$(python3 tools/openblas_support.py)
-ls -lR "$target"
-echo getting OpenBLAS into $target
-export LD_LIBRARY_PATH=$target/lib
-export LIB=$target/lib
-export INCLUDE=$target/include
-
-# Use a site.cfg to build with local openblas
-cat << EOF > site.cfg
-[openblas]
-libraries = openblas
-library_dirs = $target/lib:$LIB
-include_dirs = $target/lib:$LIB
-runtime_library_dirs = $target/lib
-EOF
-
-echo getting PyPy 3.6-v7.3.2
-wget -q https://downloads.python.org/pypy/pypy3.6-v7.3.2-linux64.tar.bz2 -O pypy.tar.bz2
-mkdir -p pypy3
-(cd pypy3; tar --strip-components=1 -xf ../pypy.tar.bz2)
-pypy3/bin/pypy3 -mensurepip
-pypy3/bin/pypy3 -m pip install --upgrade pip
-pypy3/bin/pypy3 -m pip install --user -r test_requirements.txt --no-warn-script-location
-
-echo
-echo pypy3 version
-pypy3/bin/pypy3 -c "import sys; print(sys.version)"
-echo
-
-pypy3/bin/pypy3 runtests.py --debug-info --show-build-log -v -- -rsx \
-      --junitxml=junit/test-results.xml --durations 10
-
-echo Make sure the correct openblas has been linked in
-pypy3/bin/pypy3 -mpip install --no-build-isolation .
-pypy3/bin/pypy3 tools/openblas_support.py --check_version
diff --git a/tools/refguide_check.py b/tools/refguide_check.py

index 21ba5a448dcc507e9c555235941c28e1a7577566..eb9a27ab5d44d1653ee22f8d0aea2f85ea8e2072 100644 (file)
--- a/tools/refguide_check.py
+++ b/tools/refguide_check.py
@@ -104,6 +104,8 @@ DOCTEST_SKIPDICT = {
      'numpy.random.vonmises': None,
      'numpy.random.power': None,
      'numpy.random.zipf': None,
+    # cases where NumPy docstrings import things from other 3'rd party libs:
+    'numpy.core.from_dlpack': None,
      # remote / local file IO with DataSource is problematic in doctest:
      'numpy.lib.DataSource': None,
      'numpy.lib.Repository': None,
@@ -131,13 +133,12 @@ RST_SKIPLIST = [
      'c-info.ufunc-tutorial.rst',
      'c-info.python-as-glue.rst',
      'f2py.getting-started.rst',
+    'f2py-examples.rst',
      'arrays.nditer.cython.rst',
      # See PR 17222, these should be fixed
-    'basics.byteswapping.rst',
      'basics.dispatch.rst',
-    'basics.indexing.rst',
      'basics.subclassing.rst',
-    'basics.types.rst',
+    'basics.interoperability.rst',
      'misc.rst',
  ]
  
diff --git a/tools/swig/README b/tools/swig/README

index 7fa0599c6b4f54743c95904e0906a6637543a881..c539c597f8c6739304bb08632edd00ed49c0ea0e 100644 (file)
--- a/tools/swig/README
+++ b/tools/swig/README
@@ -15,7 +15,7 @@ system used here, can be found in the NumPy reference guide.
  Testing
  -------
  The tests are a good example of what we are trying to do with numpy.i.
-The files related to testing are are in the test subdirectory::
+The files related to testing are in the test subdirectory::
  
      Vector.h
      Vector.cxx
diff --git a/tools/swig/numpy.i b/tools/swig/numpy.i

index 99ed073abe11452649b784c542db922f05d2d705..0ef92bab1527a2824a3807dbb8c652cefd51fd15 100644 (file)
--- a/tools/swig/numpy.i
+++ b/tools/swig/numpy.i
@@ -524,7 +524,7 @@
      return success;
    }
  
-  /* Require the given PyArrayObject to to be Fortran ordered.  If the
+  /* Require the given PyArrayObject to be Fortran ordered.  If the
     * the PyArrayObject is already Fortran ordered, do nothing.  Else,
     * set the Fortran ordering flag and recompute the strides.
     */
diff --git a/tools/travis-test.sh b/tools/travis-test.sh

index b395942fba8af23cda001c1ab5f98a48c3351cc0..db5b3f744556f049c3ad15b994a4e0897e5a0f39 100755 (executable)
--- a/tools/travis-test.sh
+++ b/tools/travis-test.sh
@@ -83,7 +83,7 @@ run_test()
    # in test_requirements.txt) does not provide a wheel, and the source tar
    # file does not install correctly when Python's optimization level is set
    # to strip docstrings (see https://github.com/eliben/pycparser/issues/291).
-  PYTHONOPTIMIZE="" $PIP install -r test_requirements.txt
+  PYTHONOPTIMIZE="" $PIP install -r test_requirements.txt pyinstaller
    DURATIONS_FLAG="--durations 10"
  
    if [ -n "$USE_DEBUG" ]; then
diff --git a/tools/wheels/LICENSE_win32.txt b/tools/wheels/LICENSE_win32.txt

new file mode 100644 (file)

index 0000000..1014d77
--- /dev/null
+++ b/tools/wheels/LICENSE_win32.txt
@@ -0,0 +1,938 @@
+
+----
+
+This binary distribution of NumPy also bundles the following software:
+
+
+Name: OpenBLAS
+Files: extra-dll\libopenb*.dll
+Description: bundled as a dynamically linked library
+Availability: https://github.com/xianyi/OpenBLAS/
+License: 3-clause BSD
+  Copyright (c) 2011-2014, The OpenBLAS Project
+  All rights reserved.
+
+  Redistribution and use in source and binary forms, with or without
+  modification, are permitted provided that the following conditions are
+  met:
+
+     1. Redistributions of source code must retain the above copyright
+        notice, this list of conditions and the following disclaimer.
+
+     2. Redistributions in binary form must reproduce the above copyright
+        notice, this list of conditions and the following disclaimer in
+        the documentation and/or other materials provided with the
+        distribution.
+     3. Neither the name of the OpenBLAS project nor the names of
+        its contributors may be used to endorse or promote products
+        derived from this software without specific prior written
+        permission.
+
+  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+  CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+  OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+  USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+Name: LAPACK
+Files: extra-dll\libopenb*.dll
+Description: bundled in OpenBLAS
+Availability: https://github.com/xianyi/OpenBLAS/
+License 3-clause BSD
+  Copyright (c) 1992-2013 The University of Tennessee and The University
+                          of Tennessee Research Foundation.  All rights
+                          reserved.
+  Copyright (c) 2000-2013 The University of California Berkeley. All
+                          rights reserved.
+  Copyright (c) 2006-2013 The University of Colorado Denver.  All rights
+                          reserved.
+
+  $COPYRIGHT$
+
+  Additional copyrights may follow
+
+  $HEADER$
+
+  Redistribution and use in source and binary forms, with or without
+  modification, are permitted provided that the following conditions are
+  met:
+
+  - Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+
+  - Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer listed
+    in this license in the documentation and/or other materials
+    provided with the distribution.
+
+  - Neither the name of the copyright holders nor the names of its
+    contributors may be used to endorse or promote products derived from
+    this software without specific prior written permission.
+
+  The copyright holders provide no reassurances that the source code
+  provided does not infringe any patent, copyright, or any other
+  intellectual property rights of third parties.  The copyright holders
+  disclaim any liability to any recipient for claims brought against
+  recipient by any third party for infringement of that parties
+  intellectual property rights.
+
+  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+Name: GCC runtime library
+Files: extra-dll\*.dll
+Description: statically linked, in DLL files compiled with gfortran only
+Availability: https://gcc.gnu.org/viewcvs/gcc/
+License: GPLv3 + runtime exception
+  Copyright (C) 2002-2017 Free Software Foundation, Inc.
+
+  Libgfortran is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 3, or (at your option)
+  any later version.
+
+  Libgfortran is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  Under Section 7 of GPL version 3, you are granted additional
+  permissions described in the GCC Runtime Library Exception, version
+  3.1, as published by the Free Software Foundation.
+
+  You should have received a copy of the GNU General Public License and
+  a copy of the GCC Runtime Library Exception along with this program;
+  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+  <http://www.gnu.org/licenses/>.
+
+
+Name: Microsoft Visual C++ Runtime Files
+Files: extra-dll\msvcp140.dll
+License: MSVC
+  https://www.visualstudio.com/license-terms/distributable-code-microsoft-visual-studio-2015-rc-microsoft-visual-studio-2015-sdk-rc-includes-utilities-buildserver-files/#visual-c-runtime
+
+  Subject to the License Terms for the software, you may copy and
+  distribute with your program any of the files within the followng
+  folder and its subfolders except as noted below. You may not modify
+  these files.
+
+    C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist
+
+  You may not distribute the contents of the following folders:
+
+    C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\debug_nonredist
+    C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\onecore\debug_nonredist
+
+  Subject to the License Terms for the software, you may copy and
+  distribute the following files with your program in your program’s
+  application local folder or by deploying them into the Global
+  Assembly Cache (GAC):
+
+  VC\atlmfc\lib\mfcmifc80.dll
+  VC\atlmfc\lib\amd64\mfcmifc80.dll
+
+
+Name: Microsoft Visual C++ Runtime Files
+Files: extra-dll\msvc*90.dll, extra-dll\Microsoft.VC90.CRT.manifest
+License: MSVC
+  For your convenience, we have provided the following folders for
+  use when redistributing VC++ runtime files. Subject to the license
+  terms for the software, you may redistribute the folder
+  (unmodified) in the application local folder as a sub-folder with
+  no change to the folder name. You may also redistribute all the
+  files (*.dll and *.manifest) within a folder, listed below the
+  folder for your convenience, as an entire set.
+
+  \VC\redist\x86\Microsoft.VC90.ATL\
+   atl90.dll
+   Microsoft.VC90.ATL.manifest
+  \VC\redist\ia64\Microsoft.VC90.ATL\
+   atl90.dll
+   Microsoft.VC90.ATL.manifest
+  \VC\redist\amd64\Microsoft.VC90.ATL\
+   atl90.dll
+   Microsoft.VC90.ATL.manifest
+  \VC\redist\x86\Microsoft.VC90.CRT\
+   msvcm90.dll
+   msvcp90.dll
+   msvcr90.dll
+   Microsoft.VC90.CRT.manifest
+  \VC\redist\ia64\Microsoft.VC90.CRT\
+   msvcm90.dll
+   msvcp90.dll
+   msvcr90.dll
+   Microsoft.VC90.CRT.manifest
+
+----
+
+Full text of license texts referred to above follows (that they are
+listed below does not necessarily imply the conditions apply to the
+present binary release):
+
+----
+
+GCC RUNTIME LIBRARY EXCEPTION
+
+Version 3.1, 31 March 2009
+
+Copyright (C) 2009 Free Software Foundation, Inc. <http://fsf.org/>
+
+Everyone is permitted to copy and distribute verbatim copies of this
+license document, but changing it is not allowed.
+
+This GCC Runtime Library Exception ("Exception") is an additional
+permission under section 7 of the GNU General Public License, version
+3 ("GPLv3"). It applies to a given file (the "Runtime Library") that
+bears a notice placed by the copyright holder of the file stating that
+the file is governed by GPLv3 along with this Exception.
+
+When you use GCC to compile a program, GCC may combine portions of
+certain GCC header files and runtime libraries with the compiled
+program. The purpose of this Exception is to allow compilation of
+non-GPL (including proprietary) programs to use, in this way, the
+header files and runtime libraries covered by this Exception.
+
+0. Definitions.
+
+A file is an "Independent Module" if it either requires the Runtime
+Library for execution after a Compilation Process, or makes use of an
+interface provided by the Runtime Library, but is not otherwise based
+on the Runtime Library.
+
+"GCC" means a version of the GNU Compiler Collection, with or without
+modifications, governed by version 3 (or a specified later version) of
+the GNU General Public License (GPL) with the option of using any
+subsequent versions published by the FSF.
+
+"GPL-compatible Software" is software whose conditions of propagation,
+modification and use would permit combination with GCC in accord with
+the license of GCC.
+
+"Target Code" refers to output from any compiler for a real or virtual
+target processor architecture, in executable form or suitable for
+input to an assembler, loader, linker and/or execution
+phase. Notwithstanding that, Target Code does not include data in any
+format that is used as a compiler intermediate representation, or used
+for producing a compiler intermediate representation.
+
+The "Compilation Process" transforms code entirely represented in
+non-intermediate languages designed for human-written code, and/or in
+Java Virtual Machine byte code, into Target Code. Thus, for example,
+use of source code generators and preprocessors need not be considered
+part of the Compilation Process, since the Compilation Process can be
+understood as starting with the output of the generators or
+preprocessors.
+
+A Compilation Process is "Eligible" if it is done using GCC, alone or
+with other GPL-compatible software, or if it is done without using any
+work based on GCC. For example, using non-GPL-compatible Software to
+optimize any GCC intermediate representations would not qualify as an
+Eligible Compilation Process.
+
+1. Grant of Additional Permission.
+
+You have permission to propagate a work of Target Code formed by
+combining the Runtime Library with Independent Modules, even if such
+propagation would otherwise violate the terms of GPLv3, provided that
+all Target Code was generated by Eligible Compilation Processes. You
+may then convey such a combination under terms of your choice,
+consistent with the licensing of the Independent Modules.
+
+2. No Weakening of GCC Copyleft.
+
+The availability of this Exception does not imply any general
+presumption that third-party software is unaffected by the copyleft
+requirements of the license of GCC.
+
+----
+
+                    GNU GENERAL PUBLIC LICENSE
+                       Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                            Preamble
+
+  The GNU General Public License is a free, copyleft license for
+software and other kinds of works.
+
+  The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works.  By contrast,
+the GNU General Public License is intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users.  We, the Free Software Foundation, use the
+GNU General Public License for most of our software; it applies also to
+any other work released this way by its authors.  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+  To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights.  Therefore, you have
+certain responsibilities if you distribute copies of the software, or if
+you modify it: responsibilities to respect the freedom of others.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received.  You must make sure that they, too, receive
+or can get the source code.  And you must show them these terms so they
+know their rights.
+
+  Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+
+  For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software.  For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+
+  Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the manufacturer
+can do so.  This is fundamentally incompatible with the aim of
+protecting users' freedom to change the software.  The systematic
+pattern of such abuse occurs in the area of products for individuals to
+use, which is precisely where it is most unacceptable.  Therefore, we
+have designed this version of the GPL to prohibit the practice for those
+products.  If such problems arise substantially in other domains, we
+stand ready to extend this provision to those domains in future versions
+of the GPL, as needed to protect the freedom of users.
+
+  Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary.  To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+                       TERMS AND CONDITIONS
+
+  0. Definitions.
+
+  "This License" refers to version 3 of the GNU General Public License.
+
+  "Copyright" also means copyright-like laws that apply to other kinds of
+works, such as semiconductor masks.
+
+  "The Program" refers to any copyrightable work licensed under this
+License.  Each licensee is addressed as "you".  "Licensees" and
+"recipients" may be individuals or organizations.
+
+  To "modify" a work means to copy from or adapt all or part of the work
+in a fashion requiring copyright permission, other than the making of an
+exact copy.  The resulting work is called a "modified version" of the
+earlier work or a work "based on" the earlier work.
+
+  A "covered work" means either the unmodified Program or a work based
+on the Program.
+
+  To "propagate" a work means to do anything with it that, without
+permission, would make you directly or secondarily liable for
+infringement under applicable copyright law, except executing it on a
+computer or modifying a private copy.  Propagation includes copying,
+distribution (with or without modification), making available to the
+public, and in some countries other activities as well.
+
+  To "convey" a work means any kind of propagation that enables other
+parties to make or receive copies.  Mere interaction with a user through
+a computer network, with no transfer of a copy, is not conveying.
+
+  An interactive user interface displays "Appropriate Legal Notices"
+to the extent that it includes a convenient and prominently visible
+feature that (1) displays an appropriate copyright notice, and (2)
+tells the user that there is no warranty for the work (except to the
+extent that warranties are provided), that licensees may convey the
+work under this License, and how to view a copy of this License.  If
+the interface presents a list of user commands or options, such as a
+menu, a prominent item in the list meets this criterion.
+
+  1. Source Code.
+
+  The "source code" for a work means the preferred form of the work
+for making modifications to it.  "Object code" means any non-source
+form of a work.
+
+  A "Standard Interface" means an interface that either is an official
+standard defined by a recognized standards body, or, in the case of
+interfaces specified for a particular programming language, one that
+is widely used among developers working in that language.
+
+  The "System Libraries" of an executable work include anything, other
+than the work as a whole, that (a) is included in the normal form of
+packaging a Major Component, but which is not part of that Major
+Component, and (b) serves only to enable use of the work with that
+Major Component, or to implement a Standard Interface for which an
+implementation is available to the public in source code form.  A
+"Major Component", in this context, means a major essential component
+(kernel, window system, and so on) of the specific operating system
+(if any) on which the executable work runs, or a compiler used to
+produce the work, or an object code interpreter used to run it.
+
+  The "Corresponding Source" for a work in object code form means all
+the source code needed to generate, install, and (for an executable
+work) run the object code and to modify the work, including scripts to
+control those activities.  However, it does not include the work's
+System Libraries, or general-purpose tools or generally available free
+programs which are used unmodified in performing those activities but
+which are not part of the work.  For example, Corresponding Source
+includes interface definition files associated with source files for
+the work, and the source code for shared libraries and dynamically
+linked subprograms that the work is specifically designed to require,
+such as by intimate data communication or control flow between those
+subprograms and other parts of the work.
+
+  The Corresponding Source need not include anything that users
+can regenerate automatically from other parts of the Corresponding
+Source.
+
+  The Corresponding Source for a work in source code form is that
+same work.
+
+  2. Basic Permissions.
+
+  All rights granted under this License are granted for the term of
+copyright on the Program, and are irrevocable provided the stated
+conditions are met.  This License explicitly affirms your unlimited
+permission to run the unmodified Program.  The output from running a
+covered work is covered by this License only if the output, given its
+content, constitutes a covered work.  This License acknowledges your
+rights of fair use or other equivalent, as provided by copyright law.
+
+  You may make, run and propagate covered works that you do not
+convey, without conditions so long as your license otherwise remains
+in force.  You may convey covered works to others for the sole purpose
+of having them make modifications exclusively for you, or provide you
+with facilities for running those works, provided that you comply with
+the terms of this License in conveying all material for which you do
+not control copyright.  Those thus making or running the covered works
+for you must do so exclusively on your behalf, under your direction
+and control, on terms that prohibit them from making any copies of
+your copyrighted material outside their relationship with you.
+
+  Conveying under any other circumstances is permitted solely under
+the conditions stated below.  Sublicensing is not allowed; section 10
+makes it unnecessary.
+
+  3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+  No covered work shall be deemed part of an effective technological
+measure under any applicable law fulfilling obligations under article
+11 of the WIPO copyright treaty adopted on 20 December 1996, or
+similar laws prohibiting or restricting circumvention of such
+measures.
+
+  When you convey a covered work, you waive any legal power to forbid
+circumvention of technological measures to the extent such circumvention
+is effected by exercising rights under this License with respect to
+the covered work, and you disclaim any intention to limit operation or
+modification of the work as a means of enforcing, against the work's
+users, your or third parties' legal rights to forbid circumvention of
+technological measures.
+
+  4. Conveying Verbatim Copies.
+
+  You may convey verbatim copies of the Program's source code as you
+receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy an appropriate copyright notice;
+keep intact all notices stating that this License and any
+non-permissive terms added in accord with section 7 apply to the code;
+keep intact all notices of the absence of any warranty; and give all
+recipients a copy of this License along with the Program.
+
+  You may charge any price or no price for each copy that you convey,
+and you may offer support or warranty protection for a fee.
+
+  5. Conveying Modified Source Versions.
+
+  You may convey a work based on the Program, or the modifications to
+produce it from the Program, in the form of source code under the
+terms of section 4, provided that you also meet all of these conditions:
+
+    a) The work must carry prominent notices stating that you modified
+    it, and giving a relevant date.
+
+    b) The work must carry prominent notices stating that it is
+    released under this License and any conditions added under section
+    7.  This requirement modifies the requirement in section 4 to
+    "keep intact all notices".
+
+    c) You must license the entire work, as a whole, under this
+    License to anyone who comes into possession of a copy.  This
+    License will therefore apply, along with any applicable section 7
+    additional terms, to the whole of the work, and all its parts,
+    regardless of how they are packaged.  This License gives no
+    permission to license the work in any other way, but it does not
+    invalidate such permission if you have separately received it.
+
+    d) If the work has interactive user interfaces, each must display
+    Appropriate Legal Notices; however, if the Program has interactive
+    interfaces that do not display Appropriate Legal Notices, your
+    work need not make them do so.
+
+  A compilation of a covered work with other separate and independent
+works, which are not by their nature extensions of the covered work,
+and which are not combined with it such as to form a larger program,
+in or on a volume of a storage or distribution medium, is called an
+"aggregate" if the compilation and its resulting copyright are not
+used to limit the access or legal rights of the compilation's users
+beyond what the individual works permit.  Inclusion of a covered work
+in an aggregate does not cause this License to apply to the other
+parts of the aggregate.
+
+  6. Conveying Non-Source Forms.
+
+  You may convey a covered work in object code form under the terms
+of sections 4 and 5, provided that you also convey the
+machine-readable Corresponding Source under the terms of this License,
+in one of these ways:
+
+    a) Convey the object code in, or embodied in, a physical product
+    (including a physical distribution medium), accompanied by the
+    Corresponding Source fixed on a durable physical medium
+    customarily used for software interchange.
+
+    b) Convey the object code in, or embodied in, a physical product
+    (including a physical distribution medium), accompanied by a
+    written offer, valid for at least three years and valid for as
+    long as you offer spare parts or customer support for that product
+    model, to give anyone who possesses the object code either (1) a
+    copy of the Corresponding Source for all the software in the
+    product that is covered by this License, on a durable physical
+    medium customarily used for software interchange, for a price no
+    more than your reasonable cost of physically performing this
+    conveying of source, or (2) access to copy the
+    Corresponding Source from a network server at no charge.
+
+    c) Convey individual copies of the object code with a copy of the
+    written offer to provide the Corresponding Source.  This
+    alternative is allowed only occasionally and noncommercially, and
+    only if you received the object code with such an offer, in accord
+    with subsection 6b.
+
+    d) Convey the object code by offering access from a designated
+    place (gratis or for a charge), and offer equivalent access to the
+    Corresponding Source in the same way through the same place at no
+    further charge.  You need not require recipients to copy the
+    Corresponding Source along with the object code.  If the place to
+    copy the object code is a network server, the Corresponding Source
+    may be on a different server (operated by you or a third party)
+    that supports equivalent copying facilities, provided you maintain
+    clear directions next to the object code saying where to find the
+    Corresponding Source.  Regardless of what server hosts the
+    Corresponding Source, you remain obligated to ensure that it is
+    available for as long as needed to satisfy these requirements.
+
+    e) Convey the object code using peer-to-peer transmission, provided
+    you inform other peers where the object code and Corresponding
+    Source of the work are being offered to the general public at no
+    charge under subsection 6d.
+
+  A separable portion of the object code, whose source code is excluded
+from the Corresponding Source as a System Library, need not be
+included in conveying the object code work.
+
+  A "User Product" is either (1) a "consumer product", which means any
+tangible personal property which is normally used for personal, family,
+or household purposes, or (2) anything designed or sold for incorporation
+into a dwelling.  In determining whether a product is a consumer product,
+doubtful cases shall be resolved in favor of coverage.  For a particular
+product received by a particular user, "normally used" refers to a
+typical or common use of that class of product, regardless of the status
+of the particular user or of the way in which the particular user
+actually uses, or expects or is expected to use, the product.  A product
+is a consumer product regardless of whether the product has substantial
+commercial, industrial or non-consumer uses, unless such uses represent
+the only significant mode of use of the product.
+
+  "Installation Information" for a User Product means any methods,
+procedures, authorization keys, or other information required to install
+and execute modified versions of a covered work in that User Product from
+a modified version of its Corresponding Source.  The information must
+suffice to ensure that the continued functioning of the modified object
+code is in no case prevented or interfered with solely because
+modification has been made.
+
+  If you convey an object code work under this section in, or with, or
+specifically for use in, a User Product, and the conveying occurs as
+part of a transaction in which the right of possession and use of the
+User Product is transferred to the recipient in perpetuity or for a
+fixed term (regardless of how the transaction is characterized), the
+Corresponding Source conveyed under this section must be accompanied
+by the Installation Information.  But this requirement does not apply
+if neither you nor any third party retains the ability to install
+modified object code on the User Product (for example, the work has
+been installed in ROM).
+
+  The requirement to provide Installation Information does not include a
+requirement to continue to provide support service, warranty, or updates
+for a work that has been modified or installed by the recipient, or for
+the User Product in which it has been modified or installed.  Access to a
+network may be denied when the modification itself materially and
+adversely affects the operation of the network or violates the rules and
+protocols for communication across the network.
+
+  Corresponding Source conveyed, and Installation Information provided,
+in accord with this section must be in a format that is publicly
+documented (and with an implementation available to the public in
+source code form), and must require no special password or key for
+unpacking, reading or copying.
+
+  7. Additional Terms.
+
+  "Additional permissions" are terms that supplement the terms of this
+License by making exceptions from one or more of its conditions.
+Additional permissions that are applicable to the entire Program shall
+be treated as though they were included in this License, to the extent
+that they are valid under applicable law.  If additional permissions
+apply only to part of the Program, that part may be used separately
+under those permissions, but the entire Program remains governed by
+this License without regard to the additional permissions.
+
+  When you convey a copy of a covered work, you may at your option
+remove any additional permissions from that copy, or from any part of
+it.  (Additional permissions may be written to require their own
+removal in certain cases when you modify the work.)  You may place
+additional permissions on material, added by you to a covered work,
+for which you have or can give appropriate copyright permission.
+
+  Notwithstanding any other provision of this License, for material you
+add to a covered work, you may (if authorized by the copyright holders of
+that material) supplement the terms of this License with terms:
+
+    a) Disclaiming warranty or limiting liability differently from the
+    terms of sections 15 and 16 of this License; or
+
+    b) Requiring preservation of specified reasonable legal notices or
+    author attributions in that material or in the Appropriate Legal
+    Notices displayed by works containing it; or
+
+    c) Prohibiting misrepresentation of the origin of that material, or
+    requiring that modified versions of such material be marked in
+    reasonable ways as different from the original version; or
+
+    d) Limiting the use for publicity purposes of names of licensors or
+    authors of the material; or
+
+    e) Declining to grant rights under trademark law for use of some
+    trade names, trademarks, or service marks; or
+
+    f) Requiring indemnification of licensors and authors of that
+    material by anyone who conveys the material (or modified versions of
+    it) with contractual assumptions of liability to the recipient, for
+    any liability that these contractual assumptions directly impose on
+    those licensors and authors.
+
+  All other non-permissive additional terms are considered "further
+restrictions" within the meaning of section 10.  If the Program as you
+received it, or any part of it, contains a notice stating that it is
+governed by this License along with a term that is a further
+restriction, you may remove that term.  If a license document contains
+a further restriction but permits relicensing or conveying under this
+License, you may add to a covered work material governed by the terms
+of that license document, provided that the further restriction does
+not survive such relicensing or conveying.
+
+  If you add terms to a covered work in accord with this section, you
+must place, in the relevant source files, a statement of the
+additional terms that apply to those files, or a notice indicating
+where to find the applicable terms.
+
+  Additional terms, permissive or non-permissive, may be stated in the
+form of a separately written license, or stated as exceptions;
+the above requirements apply either way.
+
+  8. Termination.
+
+  You may not propagate or modify a covered work except as expressly
+provided under this License.  Any attempt otherwise to propagate or
+modify it is void, and will automatically terminate your rights under
+this License (including any patent licenses granted under the third
+paragraph of section 11).
+
+  However, if you cease all violation of this License, then your
+license from a particular copyright holder is reinstated (a)
+provisionally, unless and until the copyright holder explicitly and
+finally terminates your license, and (b) permanently, if the copyright
+holder fails to notify you of the violation by some reasonable means
+prior to 60 days after the cessation.
+
+  Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+
+  Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License.  If your rights have been terminated and not permanently
+reinstated, you do not qualify to receive new licenses for the same
+material under section 10.
+
+  9. Acceptance Not Required for Having Copies.
+
+  You are not required to accept this License in order to receive or
+run a copy of the Program.  Ancillary propagation of a covered work
+occurring solely as a consequence of using peer-to-peer transmission
+to receive a copy likewise does not require acceptance.  However,
+nothing other than this License grants you permission to propagate or
+modify any covered work.  These actions infringe copyright if you do
+not accept this License.  Therefore, by modifying or propagating a
+covered work, you indicate your acceptance of this License to do so.
+
+  10. Automatic Licensing of Downstream Recipients.
+
+  Each time you convey a covered work, the recipient automatically
+receives a license from the original licensors, to run, modify and
+propagate that work, subject to this License.  You are not responsible
+for enforcing compliance by third parties with this License.
+
+  An "entity transaction" is a transaction transferring control of an
+organization, or substantially all assets of one, or subdividing an
+organization, or merging organizations.  If propagation of a covered
+work results from an entity transaction, each party to that
+transaction who receives a copy of the work also receives whatever
+licenses to the work the party's predecessor in interest had or could
+give under the previous paragraph, plus a right to possession of the
+Corresponding Source of the work from the predecessor in interest, if
+the predecessor has it or can get it with reasonable efforts.
+
+  You may not impose any further restrictions on the exercise of the
+rights granted or affirmed under this License.  For example, you may
+not impose a license fee, royalty, or other charge for exercise of
+rights granted under this License, and you may not initiate litigation
+(including a cross-claim or counterclaim in a lawsuit) alleging that
+any patent claim is infringed by making, using, selling, offering for
+sale, or importing the Program or any portion of it.
+
+  11. Patents.
+
+  A "contributor" is a copyright holder who authorizes use under this
+License of the Program or a work on which the Program is based.  The
+work thus licensed is called the contributor's "contributor version".
+
+  A contributor's "essential patent claims" are all patent claims
+owned or controlled by the contributor, whether already acquired or
+hereafter acquired, that would be infringed by some manner, permitted
+by this License, of making, using, or selling its contributor version,
+but do not include claims that would be infringed only as a
+consequence of further modification of the contributor version.  For
+purposes of this definition, "control" includes the right to grant
+patent sublicenses in a manner consistent with the requirements of
+this License.
+
+  Each contributor grants you a non-exclusive, worldwide, royalty-free
+patent license under the contributor's essential patent claims, to
+make, use, sell, offer for sale, import and otherwise run, modify and
+propagate the contents of its contributor version.
+
+  In the following three paragraphs, a "patent license" is any express
+agreement or commitment, however denominated, not to enforce a patent
+(such as an express permission to practice a patent or covenant not to
+sue for patent infringement).  To "grant" such a patent license to a
+party means to make such an agreement or commitment not to enforce a
+patent against the party.
+
+  If you convey a covered work, knowingly relying on a patent license,
+and the Corresponding Source of the work is not available for anyone
+to copy, free of charge and under the terms of this License, through a
+publicly available network server or other readily accessible means,
+then you must either (1) cause the Corresponding Source to be so
+available, or (2) arrange to deprive yourself of the benefit of the
+patent license for this particular work, or (3) arrange, in a manner
+consistent with the requirements of this License, to extend the patent
+license to downstream recipients.  "Knowingly relying" means you have
+actual knowledge that, but for the patent license, your conveying the
+covered work in a country, or your recipient's use of the covered work
+in a country, would infringe one or more identifiable patents in that
+country that you have reason to believe are valid.
+
+  If, pursuant to or in connection with a single transaction or
+arrangement, you convey, or propagate by procuring conveyance of, a
+covered work, and grant a patent license to some of the parties
+receiving the covered work authorizing them to use, propagate, modify
+or convey a specific copy of the covered work, then the patent license
+you grant is automatically extended to all recipients of the covered
+work and works based on it.
+
+  A patent license is "discriminatory" if it does not include within
+the scope of its coverage, prohibits the exercise of, or is
+conditioned on the non-exercise of one or more of the rights that are
+specifically granted under this License.  You may not convey a covered
+work if you are a party to an arrangement with a third party that is
+in the business of distributing software, under which you make payment
+to the third party based on the extent of your activity of conveying
+the work, and under which the third party grants, to any of the
+parties who would receive the covered work from you, a discriminatory
+patent license (a) in connection with copies of the covered work
+conveyed by you (or copies made from those copies), or (b) primarily
+for and in connection with specific products or compilations that
+contain the covered work, unless you entered into that arrangement,
+or that patent license was granted, prior to 28 March 2007.
+
+  Nothing in this License shall be construed as excluding or limiting
+any implied license or other defenses to infringement that may
+otherwise be available to you under applicable patent law.
+
+  12. No Surrender of Others' Freedom.
+
+  If conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot convey a
+covered work so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you may
+not convey it at all.  For example, if you agree to terms that obligate you
+to collect a royalty for further conveying from those to whom you convey
+the Program, the only way you could satisfy both those terms and this
+License would be to refrain entirely from conveying the Program.
+
+  13. Use with the GNU Affero General Public License.
+
+  Notwithstanding any other provision of this License, you have
+permission to link or combine any covered work with a work licensed
+under version 3 of the GNU Affero General Public License into a single
+combined work, and to convey the resulting work.  The terms of this
+License will continue to apply to the part which is the covered work,
+but the special requirements of the GNU Affero General Public License,
+section 13, concerning interaction through a network will apply to the
+combination as such.
+
+  14. Revised Versions of this License.
+
+  The Free Software Foundation may publish revised and/or new versions of
+the GNU General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+  Each version is given a distinguishing version number.  If the
+Program specifies that a certain numbered version of the GNU General
+Public License "or any later version" applies to it, you have the
+option of following the terms and conditions either of that numbered
+version or of any later version published by the Free Software
+Foundation.  If the Program does not specify a version number of the
+GNU General Public License, you may choose any version ever published
+by the Free Software Foundation.
+
+  If the Program specifies that a proxy can decide which future
+versions of the GNU General Public License can be used, that proxy's
+public statement of acceptance of a version permanently authorizes you
+to choose that version for the Program.
+
+  Later license versions may give you additional or different
+permissions.  However, no additional obligations are imposed on any
+author or copyright holder as a result of your choosing to follow a
+later version.
+
+  15. Disclaimer of Warranty.
+
+  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+  16. Limitation of Liability.
+
+  IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGES.
+
+  17. Interpretation of Sections 15 and 16.
+
+  If the disclaimer of warranty and limitation of liability provided
+above cannot be given local legal effect according to their terms,
+reviewing courts shall apply local law that most closely approximates
+an absolute waiver of all civil liability in connection with the
+Program, unless a warranty or assumption of liability accompanies a
+copy of the Program in return for a fee.
+
+                     END OF TERMS AND CONDITIONS
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 3 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+  If the program does terminal interaction, make it output a short
+notice like this when it starts in an interactive mode:
+
+    <program>  Copyright (C) <year>  <name of author>
+    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, your program's commands
+might be different; for a GUI interface, you would use an "about box".
+
+  You should also get your employer (if you work as a programmer) or school,
+if any, to sign a "copyright disclaimer" for the program, if necessary.
+For more information on this, and how to apply and follow the GNU GPL, see
+<http://www.gnu.org/licenses/>.
+
+  The GNU General Public License does not permit incorporating your program
+into proprietary programs.  If your program is a subroutine library, you
+may consider it more useful to permit linking proprietary applications with
+the library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.  But first, please read
+<http://www.gnu.org/philosophy/why-not-lgpl.html>.
+\ No newline at end of file
diff --git a/tools/wheels/cibw_before_build.sh b/tools/wheels/cibw_before_build.sh
index 36410ba1fa34f3b28ae92280e697450f0d3bd0b6..5cf7d8088647cbe290307d72478039d9b4fa4300 100644
--- a/tools/wheels/cibw_before_build.sh
+++ b/tools/wheels/cibw_before_build.sh
@@ -1,24 +1,37 @@
  set -xe
  
  PROJECT_DIR="$1"
-UNAME="$(uname)"
+PLATFORM=$(PYTHONPATH=tools python -c "import openblas_support; print(openblas_support.get_plat())")
  
  # Update license
-if [[ $UNAME == "Linux" ]] ; then
+if [[ $RUNNER_OS == "Linux" ]] ; then
      cat $PROJECT_DIR/tools/wheels/LICENSE_linux.txt >> $PROJECT_DIR/LICENSE.txt
-elif [[ $UNAME == "Darwin" ]]; then
+elif [[ $RUNNER_OS == "macOS" ]]; then
      cat $PROJECT_DIR/tools/wheels/LICENSE_osx.txt >> $PROJECT_DIR/LICENSE.txt
+elif [[ $RUNNER_OS == "Windows" ]]; then
+    cat $PROJECT_DIR/tools/wheels/LICENSE_win32.txt >> $PROJECT_DIR/LICENSE.txt
  fi
  
  # Install Openblas
-if [[ $UNAME == "Linux" || $UNAME == "Darwin" ]] ; then
+if [[ $RUNNER_OS == "Linux" || $RUNNER_OS == "macOS" ]] ; then
      basedir=$(python tools/openblas_support.py)
      cp -r $basedir/lib/* /usr/local/lib
      cp $basedir/include/* /usr/local/include
+    if [[ $RUNNER_OS == "macOS" && $PLATFORM == "macosx-arm64" ]]; then
+        sudo mkdir -p /opt/arm64-builds/lib /opt/arm64-builds/include
+        sudo chown -R $USER /opt/arm64-builds
+        cp -r $basedir/lib/* /opt/arm64-builds/lib
+        cp $basedir/include/* /opt/arm64-builds/include
+    fi
+elif [[ $RUNNER_OS == "Windows" ]]; then
+    PYTHONPATH=tools python -c "import openblas_support; openblas_support.make_init('numpy')"
+    target=$(python tools/openblas_support.py)
+    mkdir -p openblas
+    cp $target openblas
  fi
  
  # Install GFortran
-if [[ $UNAME == "Darwin" ]]; then
+if [[ $RUNNER_OS == "macOS" ]]; then
      # same version of gfortran as the openblas-libs and numpy-wheel builds
      curl -L https://github.com/MacPython/gfortran-install/raw/master/archives/gfortran-4.9.0-Mavericks.dmg -o gfortran.dmg
      GFORTRAN_SHA256=$(shasum -a 256 gfortran.dmg)
@@ -27,9 +40,17 @@ if [[ $UNAME == "Darwin" ]]; then
          echo sha256 mismatch
          exit 1
      fi
+
      hdiutil attach -mountpoint /Volumes/gfortran gfortran.dmg
      sudo installer -pkg /Volumes/gfortran/gfortran.pkg -target /
      otool -L /usr/local/gfortran/lib/libgfortran.3.dylib
+
+    # arm64 stuff from gfortran_utils
+    if [[ $PLATFORM == "macosx-arm64" ]]; then
+        source $PROJECT_DIR/tools/wheels/gfortran_utils.sh
+        install_arm64_cross_gfortran
+    fi
+
      # Manually symlink gfortran-4.9 to plain gfortran for f2py.
      # No longer needed after Feb 13 2020 as gfortran is already present
      # and the attempted link errors. Keep this for future reference.
diff --git a/tools/wheels/cibw_test_command.sh b/tools/wheels/cibw_test_command.sh
index f09395e847a1f5ea8bed51cfb138aca25e980364..b296993fc1d594eab618ec06b973424e0a653f3a 100644
--- a/tools/wheels/cibw_test_command.sh
+++ b/tools/wheels/cibw_test_command.sh
@@ -4,10 +4,14 @@
  set -xe
  
  PROJECT_DIR="$1"
-UNAME="$(uname)"
  
  python -c "import numpy; numpy.show_config()"
-python -c "import sys; import numpy; sys.exit(not numpy.test('full', extra_argv=['-vv']))"
+if [[ $RUNNER_OS == "Windows" ]]; then
+    # GH 20391
+    PY_DIR=$(python -c "import sys; print(sys.prefix)")
+    mkdir $PY_DIR/libs
+fi
+python -c "import sys; import numpy; sys.exit(not numpy.test('full', extra_argv=['-vvv']))"
  
  python $PROJECT_DIR/tools/wheels/check_license.py
  if [[ $UNAME == "Linux" || $UNAME == "Darwin" ]] ; then
diff --git a/tools/wheels/gfortran_utils.sh b/tools/wheels/gfortran_utils.sh
new file mode 100644
index 0000000..e0c7054
--- /dev/null
+++ b/tools/wheels/gfortran_utils.sh
@@ -0,0 +1,168 @@
+# This file is vendored from github.com/MacPython/gfortran-install It is
+# licensed under BSD-2 which is copied as a comment below
+
+# Copyright 2016-2021 Matthew Brett, Isuru Fernando, Matti Picus
+
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+
+# Redistributions of source code must retain the above copyright notice, this
+# list of conditions and the following disclaimer.
+
+# Redistributions in binary form must reproduce the above copyright notice, this
+# list of conditions and the following disclaimer in the documentation and/or
+# other materials provided with the distribution.
+
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# Bash utilities for use with gfortran
+
+GF_LIB_URL="https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com"
+ARCHIVE_SDIR="${ARCHIVE_SDIR:-archives}"
+
+GF_UTIL_DIR=$(dirname "${BASH_SOURCE[0]}")
+
+function get_distutils_platform {
+    # Report platform as in form of distutils get_platform.
+    # This is like the platform tag that pip will use.
+    # Modify fat architecture tags on macOS to reflect compiled architecture
+
+    # Deprecate this function once get_distutils_platform_ex is used in all
+    # downstream projects
+    local plat=$1
+    case $plat in
+        i686|x86_64|arm64|universal2|intel|aarch64|s390x|ppc64le) ;;
+        *) echo Did not recognize plat $plat; return 1 ;;
+    esac
+    local uname=${2:-$(uname)}
+    if [ "$uname" != "Darwin" ]; then
+        if [ "$plat" == "intel" ]; then
+            echo plat=intel not allowed for Manylinux
+            return 1
+        fi
+        echo "manylinux1_$plat"
+        return
+    fi
+    # macOS 32-bit arch is i386
+    [ "$plat" == "i686" ] && plat="i386"
+    local target=$(echo $MACOSX_DEPLOYMENT_TARGET | tr .- _)
+    echo "macosx_${target}_${plat}"
+}
+
+function get_distutils_platform_ex {
+    # Report platform as in form of distutils get_platform.
+    # This is like the platform tag that pip will use.
+    # Modify fat architecture tags on macOS to reflect compiled architecture
+    # For non-darwin, report manylinux version
+    local plat=$1
+    local mb_ml_ver=${MB_ML_VER:-1}
+    case $plat in
+        i686|x86_64|arm64|universal2|intel|aarch64|s390x|ppc64le) ;;
+        *) echo Did not recognize plat $plat; return 1 ;;
+    esac
+    local uname=${2:-$(uname)}
+    if [ "$uname" != "Darwin" ]; then
+        if [ "$plat" == "intel" ]; then
+            echo plat=intel not allowed for Manylinux
+            return 1
+        fi
+        echo "manylinux${mb_ml_ver}_${plat}"
+        return
+    fi
+    # macOS 32-bit arch is i386
+    [ "$plat" == "i686" ] && plat="i386"
+    local target=$(echo $MACOSX_DEPLOYMENT_TARGET | tr .- _)
+    echo "macosx_${target}_${plat}"
+}
+
+function get_macosx_target {
+    # Report MACOSX_DEPLOYMENT_TARGET as given by distutils get_platform.
+    python -c "import sysconfig as s; print(s.get_config_vars()['MACOSX_DEPLOYMENT_TARGET'])"
+}
+
+function check_gfortran {
+    # Check that gfortran exists on the path
+    if [ -z "$(which gfortran)" ]; then
+        echo Missing gfortran
+        exit 1
+    fi
+}
+
+function get_gf_lib_for_suf {
+    local suffix=$1
+    local prefix=$2
+    local plat=${3:-$PLAT}
+    local uname=${4:-$(uname)}
+    if [ -z "$prefix" ]; then echo Prefix not defined; exit 1; fi
+    local plat_tag=$(get_distutils_platform_ex $plat $uname)
+    if [ -n "$suffix" ]; then suffix="-$suffix"; fi
+    local fname="$prefix-${plat_tag}${suffix}.tar.gz"
+    local out_fname="${ARCHIVE_SDIR}/$fname"
+    if [ ! -e "$out_fname" ]; then
+        curl -L "${GF_LIB_URL}/$fname" > $out_fname || (echo "Fetch of $out_fname failed"; exit 1)
+    fi
+    [ -s $out_fname ] || (echo "$out_fname is empty"; exit 24)
+    echo "$out_fname"
+}
+
+if [ "$(uname)" == "Darwin" ]; then
+    mac_target=${MACOSX_DEPLOYMENT_TARGET:-$(get_macosx_target)}
+    export MACOSX_DEPLOYMENT_TARGET=$mac_target
+    GFORTRAN_DMG="${GF_UTIL_DIR}/archives/gfortran-4.9.0-Mavericks.dmg"
+    export GFORTRAN_SHA="$(shasum $GFORTRAN_DMG)"
+
+    function install_arm64_cross_gfortran {
+        curl -L -O https://github.com/isuruf/gcc/releases/download/gcc-10-arm-20210228/gfortran-darwin-arm64.tar.gz
+        export GFORTRAN_SHA=f26990f6f08e19b2ec150b9da9d59bd0558261dd
+        if [[ "$(shasum gfortran-darwin-arm64.tar.gz)" != "${GFORTRAN_SHA}  gfortran-darwin-arm64.tar.gz" ]]; then
+            echo "shasum mismatch for gfortran-darwin-arm64"
+            exit 1
+        fi
+        sudo mkdir -p /opt/
+        sudo cp "gfortran-darwin-arm64.tar.gz" /opt/gfortran-darwin-arm64.tar.gz
+        pushd /opt
+            sudo tar -xvf gfortran-darwin-arm64.tar.gz
+            sudo rm gfortran-darwin-arm64.tar.gz
+        popd
+        export FC_ARM64="$(find /opt/gfortran-darwin-arm64/bin -name "*-gfortran")"
+        local libgfortran="$(find /opt/gfortran-darwin-arm64/lib -name libgfortran.dylib)"
+        local libdir=$(dirname $libgfortran)
+
+        export FC_ARM64_LDFLAGS="-L$libdir -Wl,-rpath,$libdir"
+        if [[ "${PLAT:-}" == "arm64" ]]; then
+            export FC=$FC_ARM64
+        fi
+    }
+    function install_gfortran {
+        hdiutil attach -mountpoint /Volumes/gfortran $GFORTRAN_DMG
+        sudo installer -pkg /Volumes/gfortran/gfortran.pkg -target /
+        check_gfortran
+        if [[ "${PLAT:-}" == "universal2" || "${PLAT:-}" == "arm64" ]]; then
+            install_arm64_cross_gfortran
+        fi
+    }
+
+    function get_gf_lib {
+        # Get lib with gfortran suffix
+        get_gf_lib_for_suf "gf_${GFORTRAN_SHA:0:7}" $@
+    }
+else
+    function install_gfortran {
+        # No-op - already installed on manylinux image
+        check_gfortran
+    }
+
+    function get_gf_lib {
+        # Get library with no suffix
+        get_gf_lib_for_suf "" $@
+    }
+fi
diff --git a/tools/wheels/upload_wheels.sh b/tools/wheels/upload_wheels.sh
new file mode 100644
index 0000000..beeca27
--- /dev/null
+++ b/tools/wheels/upload_wheels.sh
@@ -0,0 +1,59 @@
+set_travis_vars() {
+    # Set env vars
+    echo "TRAVIS_EVENT_TYPE is $TRAVIS_EVENT_TYPE"
+    echo "TRAVIS_TAG is $TRAVIS_TAG"
+    if [[ "$TRAVIS_EVENT_TYPE" == "push" && "$TRAVIS_TAG" == v* ]]; then
+      IS_PUSH="true"
+    else
+      IS_PUSH="false"
+    fi
+    if [[ "$TRAVIS_EVENT_TYPE" == "cron" ]]; then
+      IS_SCHEDULE_DISPATCH="true"
+    elif [[ "$TRAVIS_EVENT_TYPE" == "api" ]]; then
+      # Manual CI run, so upload
+      IS_SCHEDULE_DISPATCH="true"
+    else
+      IS_SCHEDULE_DISPATCH="false"
+    fi
+}
+set_upload_vars() {
+    echo "IS_PUSH is $IS_PUSH"
+    echo "IS_SCHEDULE_DISPATCH is $IS_SCHEDULE_DISPATCH"
+    if [[ "$IS_PUSH" == "true" ]]; then
+        echo push and tag event
+        export ANACONDA_ORG="multibuild-wheels-staging"
+        export TOKEN="$NUMPY_STAGING_UPLOAD_TOKEN"
+        export ANACONDA_UPLOAD="true"
+    elif [[ "$IS_SCHEDULE_DISPATCH" == "true" ]]; then
+        echo scheduled or dispatched event
+        export ANACONDA_ORG="scipy-wheels-nightly"
+        export TOKEN="$NUMPY_NIGHTLY_UPLOAD_TOKEN"
+        export ANACONDA_UPLOAD="true"
+    else
+        echo non-dispatch event
+        export ANACONDA_UPLOAD="false"
+    fi
+}
+upload_wheels() {
+    echo ${PWD}
+    if [[ ${ANACONDA_UPLOAD} == true ]]; then
+        if [ -z ${TOKEN} ]; then
+            echo no token set, not uploading
+        else
+            python -m pip install \
+            git+https://github.com/Anaconda-Platform/anaconda-client.git@be1e14936a8e947da94d026c990715f0596d7043
+            # sdists are located under dist folder when built through setup.py
+            if compgen -G "./dist/*.gz"; then
+                echo "Found sdist"
+                anaconda -t ${TOKEN} upload --skip -u ${ANACONDA_ORG} ./dist/*.gz
+            elif compgen -G "./wheelhouse/*.whl"; then
+                echo "Found wheel"
+                anaconda -t ${TOKEN} upload --skip -u ${ANACONDA_ORG} ./wheelhouse/*.whl
+            else
+                echo "Files do not exist"
+                return 1
+            fi
+            echo "PyPI-style index: https://pypi.anaconda.org/$ANACONDA_ORG/simple"
+        fi
+    fi
+}
diff --git a/tox.ini b/tox.ini
index 9bc2bbac36dd46c12eaebae4559ff1f3d7c890cf..f47551398ad87bd2f0768d870375233fa7185f84 100644
--- a/tox.ini
+++ b/tox.ini
@@ -13,7 +13,7 @@
  #     - Use pip to install the numpy sdist into the virtualenv
  #     - Run the numpy tests
  # To run against a specific subset of Python versions, use:
-#   tox -e py37
+#   tox -e py39
  
  # Extra arguments will be passed to runtests.py. To run
  # the full testsuite:
@@ -26,18 +26,13 @@
  
  [tox]
  envlist =
-  py37,py38,py39,
-  py37-not-relaxed-strides
+  py38,py39
  
  [testenv]
  deps= -Ur{toxinidir}/test_requirements.txt
  changedir={envdir}
  commands={envpython} -b {toxinidir}/runtests.py --mode=full {posargs:}
  
-[testenv:py37-not-relaxed-strides]
-basepython=python3.7
-env=NPY_RELAXED_STRIDES_CHECKING=0
-
  # Not run by default. Set up the way you want then use 'tox -e debug'
  # if you want it:
  [testenv:debug]