--- /dev/null
+
+Contributors
+============
+
+A total of 7 people contributed to this release. People with a "+" by their
+names contributed a patch for the first time.
+
+* Allan Haldane
+* Charles Harris
+* Kevin Sheppard
+* Matti Picus
+* Ralf Gommers
+* Sebastian Berg
+* Warren Weckesser
+
+Pull requests merged
+====================
+
+A total of 12 pull requests were merged for this release.
+
+* `#14456 <https://github.com/numpy/numpy/pull/14456>`__: MAINT: clean up pocketfft modules inside numpy.fft namespace.
+* `#14463 <https://github.com/numpy/numpy/pull/14463>`__: BUG: random.hypergeometric assumes npy_long is npy_int64, hung...
+* `#14502 <https://github.com/numpy/numpy/pull/14502>`__: BUG: random: Revert gh-14458 and refix gh-14557.
+* `#14504 <https://github.com/numpy/numpy/pull/14504>`__: BUG: add a specialized loop for boolean matmul.
+* `#14506 <https://github.com/numpy/numpy/pull/14506>`__: MAINT: Update pytest version for Python 3.8
+* `#14512 <https://github.com/numpy/numpy/pull/14512>`__: DOC: random: fix doc linking, was referencing private submodules.
+* `#14513 <https://github.com/numpy/numpy/pull/14513>`__: BUG,MAINT: Some fixes and minor cleanup based on clang analysis
+* `#14515 <https://github.com/numpy/numpy/pull/14515>`__: BUG: Fix randint when range is 2**32
+* `#14519 <https://github.com/numpy/numpy/pull/14519>`__: MAINT: remove the entropy c-extension module
+* `#14563 <https://github.com/numpy/numpy/pull/14563>`__: DOC: remove note about Pocketfft license file (non-existing here).
+* `#14578 <https://github.com/numpy/numpy/pull/14578>`__: BUG: random: Create a legacy implementation of random.binomial.
+* `#14687 <https://github.com/numpy/numpy/pull/14687>`__: BUG: properly define PyArray_DescrCheck
--- /dev/null
+.. currentmodule:: numpy
+
+==========================
+NumPy 1.17.3 Release Notes
+==========================
+
+This release contains fixes for bugs reported against NumPy 1.17.2 along with
+some documentation improvements. The Python versions supported in this release
+are 3.5-3.8.
+
+Downstream developers should use Cython >= 0.29.13 for Python 3.8 support and
+OpenBLAS >= 0.3.7 to avoid errors on the Skylake architecture.
+
+
+Highlights
+==========
+
+- Wheels for Python 3.8
+- Boolean ``matmul`` fixed to use booleans instead of integers.
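As a rough illustration of the boolean ``matmul`` highlight, the pure-Python sketch below computes each output element as a logical OR of ANDs and stops at the first true pair, so every result is strictly ``True`` or ``False``. The helper name ``bool_matmul`` is hypothetical and is not part of NumPy; the real fix is the specialized C loop added in gh-14504.

```python
def bool_matmul(a, b):
    """Boolean matrix product of two lists of lists (illustrative only)."""
    n_inner = len(b)
    n_cols = len(b[0]) if b else 0
    out = []
    for row in a:
        out_row = []
        for p in range(n_cols):
            # any() short-circuits at the first pair where both inputs are True,
            # mirroring the early exit in a specialized boolean loop.
            out_row.append(any(row[n] and b[n][p] for n in range(n_inner)))
        out.append(out_row)
    return out


a = [[True, False], [True, True]]
result = bool_matmul(a, a)
```

Because the accumulation is a logical OR rather than an integer sum, no element can ever hold a value other than the two boolean states.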
+
+
+Compatibility notes
+===================
+
+- The seldom used ``PyArray_DescrCheck`` macro has been fixed: it now uses
+  ``PyObject_TypeCheck`` instead of an exact ``ob_type`` comparison.
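The diff for this release changes ``PyArray_DescrCheck`` from an exact ``ob_type`` comparison to ``PyObject_TypeCheck``, which also accepts subtypes. A loose Python analogy is the difference between ``type(obj) is T`` and ``isinstance(obj, T)``. The classes ``Descr`` and ``SubDescr`` below are hypothetical stand-ins, not NumPy types, and the sketch only illustrates the semantics of the two kinds of check.

```python
class Descr:
    """Hypothetical stand-in for a descriptor type."""


class SubDescr(Descr):
    """Hypothetical subclass of the stand-in type."""


def exact_check(obj):
    # Analogue of comparing ob_type against the type directly.
    return type(obj) is Descr


def type_check(obj):
    # Analogue of PyObject_TypeCheck: instances of subclasses pass too.
    return isinstance(obj, Descr)


base, sub = Descr(), SubDescr()
checks = {
    "exact": (exact_check(base), exact_check(sub)),
    "type": (type_check(base), type_check(sub)),
}
```

Only the subclass instance is treated differently by the two checks; instances of the base type pass both.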
+
+
+Contributors
+============
+
+A total of 7 people contributed to this release. People with a "+" by their
+names contributed a patch for the first time.
+
+* Allan Haldane
+* Charles Harris
+* Kevin Sheppard
+* Matti Picus
+* Ralf Gommers
+* Sebastian Berg
+* Warren Weckesser
+
+
+Pull requests merged
+====================
+
+A total of 12 pull requests were merged for this release.
+
+* `#14456 <https://github.com/numpy/numpy/pull/14456>`__: MAINT: clean up pocketfft modules inside numpy.fft namespace.
+* `#14463 <https://github.com/numpy/numpy/pull/14463>`__: BUG: random.hypergeometric assumes npy_long is npy_int64, hung...
+* `#14502 <https://github.com/numpy/numpy/pull/14502>`__: BUG: random: Revert gh-14458 and refix gh-14557.
+* `#14504 <https://github.com/numpy/numpy/pull/14504>`__: BUG: add a specialized loop for boolean matmul.
+* `#14506 <https://github.com/numpy/numpy/pull/14506>`__: MAINT: Update pytest version for Python 3.8
+* `#14512 <https://github.com/numpy/numpy/pull/14512>`__: DOC: random: fix doc linking, was referencing private submodules.
+* `#14513 <https://github.com/numpy/numpy/pull/14513>`__: BUG,MAINT: Some fixes and minor cleanup based on clang analysis
+* `#14515 <https://github.com/numpy/numpy/pull/14515>`__: BUG: Fix randint when range is 2**32
+* `#14519 <https://github.com/numpy/numpy/pull/14519>`__: MAINT: remove the entropy c-extension module
+* `#14563 <https://github.com/numpy/numpy/pull/14563>`__: DOC: remove note about Pocketfft license file (non-existing here).
+* `#14578 <https://github.com/numpy/numpy/pull/14578>`__: BUG: random: Create a legacy implementation of random.binomial.
+* `#14687 <https://github.com/numpy/numpy/pull/14687>`__: BUG: properly define PyArray_DescrCheck
--- /dev/null
+Remove ``numpy.random.entropy`` without a deprecation
+-----------------------------------------------------
+
+``numpy.random.entropy`` was added to the `numpy.random` namespace in 1.17.0.
+It was meant to be a private c-extension module, but was exposed as public.
+It has been replaced by `numpy.random.SeedSequence` so the module was
+completely removed.
--- /dev/null
+`numpy.random.randint` produced an incorrect value when the range was ``2**32``
+-------------------------------------------------------------------------------
+The implementation introduced in 1.17.0 had an incorrect check when
+determining whether to use the 32-bit path or the full 64-bit path. The check
+incorrectly redirected random integer generation with a ``high - low``
+range of ``2**32`` to the 64-bit generator.
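To make the boundary case concrete, here is a minimal sketch of how an off-by-one in such a path-selection check could arise. It is not the NumPy source: the function names and the exact comparisons are assumptions chosen for illustration. The point is that with a half-open interval ``[low, high)`` of span ``2**32``, the largest offset is ``2**32 - 1``, which still fits in an unsigned 32-bit integer.

```python
UINT32_MAX = 0xFFFFFFFF


def needs_64bit_path_off_by_one(low, high):
    # Hypothetical faulty check: compares the span itself, so a span of
    # exactly 2**32 is sent to the 64-bit path.
    return (high - low) > UINT32_MAX


def needs_64bit_path(low, high):
    # Hypothetical corrected check: compares the largest offset
    # (high - low - 1), which fits in 32 bits for a span of 2**32.
    return (high - low - 1) > UINT32_MAX


span_low, span_high = 0, 2**32
faulty = needs_64bit_path_off_by_one(span_low, span_high)
corrected = needs_64bit_path(span_low, span_high)
```

The two checks agree for every span except exactly ``2**32``, which is why a bug of this shape would only surface at that one boundary.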
-Mersenne Twister (MT19937)
+Mersenne Twister (MT19937)
--------------------------
-.. module:: numpy.random.mt19937
-
-.. currentmodule:: numpy.random.mt19937
+.. currentmodule:: numpy.random
.. autoclass:: MT19937
:exclude-members:
Permuted Congruential Generator (64-bit, PCG64)
-----------------------------------------------
-.. module:: numpy.random.pcg64
-
-.. currentmodule:: numpy.random.pcg64
+.. currentmodule:: numpy.random
.. autoclass:: PCG64
:exclude-members:
Philox Counter-based RNG
------------------------
-.. module:: numpy.random.philox
-
-.. currentmodule:: numpy.random.philox
+.. currentmodule:: numpy.random
.. autoclass:: Philox
:exclude-members:
SFC64 Small Fast Chaotic PRNG
-----------------------------
-.. module:: numpy.random.sfc64
-
-.. currentmodule:: numpy.random.sfc64
+.. currentmodule:: numpy.random
.. autoclass:: SFC64
:exclude-members:
+++ /dev/null
-System Entropy
-==============
-
-.. module:: numpy.random.entropy
-
-.. autofunction:: random_entropy
select distributions
* Optional ``out`` argument that allows existing arrays to be filled for
select distributions
-* `~entropy.random_entropy` provides access to the system
- source of randomness that is used in cryptographic applications (e.g.,
- ``/dev/urandom`` on Unix).
* All BitGenerators can produce doubles, uint64s and uint32s via CTypes
(`~.PCG64.ctypes`) and CFFI (`~.PCG64.cffi`). This allows the bit generators
to be used in numba.
:maxdepth: 1
generator
- legacy mtrand <legacy>
+ Legacy Generator (RandomState) <legacy>
BitGenerators, SeedSequences <bit_generators/index>
Features
new-or-different
Comparing Performance <performance>
extending
- Reading System Entropy <entropy>
Original Source
~~~~~~~~~~~~~~~
Legacy Random Generation
------------------------
-The `~mtrand.RandomState` provides access to
+The `RandomState` provides access to
legacy generators. This generator is considered frozen and will have
no further improvements. It is guaranteed to produce the same values
as the final point release of NumPy v1.16. These all depend on Box-Muller
if it is essential to have randoms that are identical to what
would have been produced by previous versions of NumPy.
-`~mtrand.RandomState` adds additional information
+`RandomState` adds additional information
to the state which is required when using Box-Muller normals since these
are produced in pairs. It is important to use
-`~mtrand.RandomState.get_state`, and not the underlying bit generators
+`RandomState.get_state`, and not the underlying bit generator's
`state`, when accessing the state so that these extra values are saved.
-Although we provide the `~mt19937.MT19937` BitGenerator for use independent of
-`~mtrand.RandomState`, note that its default seeding uses `~SeedSequence`
-rather than the legacy seeding algorithm. `~mtrand.RandomState` will use the
+Although we provide the `MT19937` BitGenerator for use independent of
+`RandomState`, note that its default seeding uses `SeedSequence`
+rather than the legacy seeding algorithm. `RandomState` will use the
legacy seeding algorithm. The methods to use the legacy seeding algorithm are
currently private as the main reason to use them is just to implement
-`~mtrand.RandomState`. However, one can reset the state of `~mt19937.MT19937`
-using the state of the `~mtrand.RandomState`:
+`RandomState`. However, one can reset the state of `MT19937`
+using the state of the `RandomState`:
.. code-block:: python
rs2.standard_exponential()
-.. currentmodule:: numpy.random.mtrand
-
.. autoclass:: RandomState
:exclude-members:
And in more detail:
-* `~.entropy.random_entropy` provides access to the system
- source of randomness that is used in cryptographic applications (e.g.,
- ``/dev/urandom`` on Unix).
* Simulate from the complex normal distribution
(`~.Generator.complex_normal`)
* The normal, exponential and gamma generators use 256-step Ziggurat
Release Notes
*************
+.. include:: ../release/1.17.3-notes.rst
.. include:: ../release/1.17.2-notes.rst
.. include:: ../release/1.17.1-notes.rst
.. include:: ../release/1.17.0-notes.rst
/* C-API that requires previous API to be defined */
-#define PyArray_DescrCheck(op) (((PyObject*)(op))->ob_type==&PyArrayDescr_Type)
+#define PyArray_DescrCheck(op) PyObject_TypeCheck(op, &PyArrayDescr_Type)
#define PyArray_Check(op) PyObject_TypeCheck(op, &PyArray_Type)
#define PyArray_CheckExact(op) (((PyObject*)(op))->ob_type == &PyArray_Type)
return retcode;
}
-/*
- * Fills an array with ones.
- *
- * dst: The destination array.
- * wheremask: If non-NULL, a boolean mask specifying where to set the values.
- *
- * Returns 0 on success, -1 on failure.
- */
-NPY_NO_EXPORT int
-PyArray_AssignOne(PyArrayObject *dst,
- PyArrayObject *wheremask)
-{
- npy_bool value;
- PyArray_Descr *bool_dtype;
- int retcode;
-
- /* Create a raw bool scalar with the value True */
- bool_dtype = PyArray_DescrFromType(NPY_BOOL);
- if (bool_dtype == NULL) {
- return -1;
- }
- value = 1;
-
- retcode = PyArray_AssignRawScalar(dst, bool_dtype, (char *)&value,
- wheremask, NPY_SAFE_CASTING);
-
- Py_DECREF(bool_dtype);
- return retcode;
-}
/*NUMPY_API
* Copy an array.
}
-/*
- * Call the python _is_from_ctypes
- */
-NPY_NO_EXPORT int
-_is_from_ctypes(PyObject *obj) {
- PyObject *ret_obj;
- static PyObject *py_func = NULL;
-
- npy_cache_import("numpy.core._internal", "_is_from_ctypes", &py_func);
-
- if (py_func == NULL) {
- return -1;
- }
- ret_obj = PyObject_CallFunctionObjArgs(py_func, obj, NULL);
- if (ret_obj == NULL) {
- return -1;
- }
-
- return PyObject_IsTrue(ret_obj);
-}
-
-
NPY_NO_EXPORT PyObject *
_array_from_buffer_3118(PyObject *memoryview)
{
/* If there are subarrays, need to wrap it */
else if (PyDataType_HASSUBARRAY(src_dtype)) {
PyArray_Dims src_shape = {NULL, -1};
- npy_intp src_size = 1;
+ npy_intp src_size;
PyArray_StridedUnaryOp *stransfer;
NpyAuxData *data;
int *swaps;
assert(N > 0); /* Guaranteed and assumed by indbuffer */
- int valbufsize = N * maxelsize;
+ npy_intp valbufsize = N * maxelsize;
if (NPY_UNLIKELY(valbufsize) == 0) {
valbufsize = 1; /* Ensure allocation is not empty */
}
if (check_and_adjust_index(&start, self->size, -1, NULL) < 0) {
goto finish;
}
- retval = 0;
PyArray_ITER_GOTO1D(self, start);
retval = type->f->setitem(val, self->dataptr, self->ao);
PyArray_ITER_RESET(self);
PyArrayObject *oparr = NULL, *ret = NULL;
npy_bool subok = NPY_FALSE;
npy_bool copy = NPY_TRUE;
- int ndmin = 0, nd;
+ int nd;
+ npy_intp ndmin = 0;
PyArray_Descr *type = NULL;
PyArray_Descr *oldtype = NULL;
NPY_ORDER order = NPY_KEEPORDER;
}
}
- /* copy=False with default dtype, order and ndim */
- if (STRIDING_OK(oparr, order)) {
- ret = oparr;
- Py_INCREF(ret);
- goto finish;
- }
+ /* copy=False with default dtype, order (any is OK) and ndim */
+ ret = oparr;
+ Py_INCREF(ret);
+ goto finish;
}
}
npy_intp istrides, nstrides = NAD_NSTRIDES();
NpyIter_AxisData *axisdata = NIT_AXISDATA(iter);
npy_intp sizeof_axisdata = NIT_AXISDATA_SIZEOF(itflags, ndim, nop);
- NpyIter_AxisData *ad_compress;
+ NpyIter_AxisData *ad_compress = axisdata;
npy_intp new_ndim = 1;
/* The HASMULTIINDEX or IDENTPERM flags do not apply after coalescing */
NIT_ITFLAGS(iter) &= ~(NPY_ITFLAG_IDENTPERM|NPY_ITFLAG_HASMULTIINDEX);
- axisdata = NIT_AXISDATA(iter);
- ad_compress = axisdata;
-
for (idim = 0; idim < ndim-1; ++idim) {
int can_coalesce = 1;
npy_intp shape0 = NAD_SHAPE(ad_compress);
}
static int
-npyiter_convert_op_axes(PyObject *op_axes_in, npy_intp nop,
+npyiter_convert_op_axes(PyObject *op_axes_in, int nop,
int **op_axes, int *oa_ndim)
{
PyObject *a;
{
npy_intp oldnbytes, newnbytes;
npy_intp oldsize, newsize;
- int new_nd=newshape->len, k, n, elsize;
+ int new_nd=newshape->len, k, elsize;
int refcnt;
npy_intp* new_dimensions=newshape->ptr;
npy_intp new_strides[NPY_MAXDIMS];
PyObject *zero = PyInt_FromLong(0);
char *optr;
optr = PyArray_BYTES(self) + oldnbytes;
- n = newsize - oldsize;
- for (k = 0; k < n; k++) {
+ npy_intp n_new = newsize - oldsize;
+ for (npy_intp i = 0; i < n_new; i++) {
_putzero((char *)optr, zero, PyArray_DESCR(self));
optr += elsize;
}
* FLOAT, DOUBLE, HALF,
* CFLOAT, CDOUBLE, CLONGDOUBLE,
* UBYTE, USHORT, UINT, ULONG, ULONGLONG,
- * BYTE, SHORT, INT, LONG, LONGLONG,
- * BOOL#
+ * BYTE, SHORT, INT, LONG, LONGLONG#
* #typ = npy_longdouble,
* npy_float,npy_double,npy_half,
* npy_cfloat, npy_cdouble, npy_clongdouble,
* npy_ubyte, npy_ushort, npy_uint, npy_ulong, npy_ulonglong,
- * npy_byte, npy_short, npy_int, npy_long, npy_longlong,
- * npy_bool#
- * #IS_COMPLEX = 0, 0, 0, 0, 1, 1, 1, 0*11#
- * #IS_HALF = 0, 0, 0, 1, 0*14#
+ * npy_byte, npy_short, npy_int, npy_long, npy_longlong#
+ * #IS_COMPLEX = 0, 0, 0, 0, 1, 1, 1, 0*10#
+ * #IS_HALF = 0, 0, 0, 1, 0*13#
*/
NPY_NO_EXPORT void
}
/**end repeat**/
+NPY_NO_EXPORT void
+BOOL_matmul_inner_noblas(void *_ip1, npy_intp is1_m, npy_intp is1_n,
+ void *_ip2, npy_intp is2_n, npy_intp is2_p,
+ void *_op, npy_intp os_m, npy_intp os_p,
+ npy_intp dm, npy_intp dn, npy_intp dp)
+
+{
+ npy_intp m, n, p;
+ npy_intp ib2_p, ob_p;
+ char *ip1 = (char *)_ip1, *ip2 = (char *)_ip2, *op = (char *)_op;
+ ib2_p = is2_p * dp;
+ ob_p = os_p * dp;
+
+ for (m = 0; m < dm; m++) {
+ for (p = 0; p < dp; p++) {
+ char *ip1tmp = ip1;
+ char *ip2tmp = ip2;
+ *(npy_bool *)op = NPY_FALSE;
+ for (n = 0; n < dn; n++) {
+ npy_bool val1 = (*(npy_bool *)ip1tmp);
+ npy_bool val2 = (*(npy_bool *)ip2tmp);
+ if (val1 != 0 && val2 != 0) {
+ *(npy_bool *)op = NPY_TRUE;
+ break;
+ }
+ ip2tmp += is2_n;
+ ip1tmp += is1_n;
+ }
+ op += os_p;
+ ip2 += is2_p;
+ }
+ op -= ob_p;
+ ip2 -= ib2_p;
+ ip1 += is1_m;
+ op += os_m;
+ }
+}
NPY_NO_EXPORT void
OBJECT_matmul_inner_noblas(void *_ip1, npy_intp is1_m, npy_intp is1_n,
typedef int converter(PyObject *, void *);
while (PyDict_Next(kwds, &pos, &key, &value)) {
- int i;
+ npy_intp i;
converter *convert;
void *output = NULL;
npy_intp index = locate_key(kwnames, key);
int *op_axes[3] = {op_axes_arrays[0], op_axes_arrays[1],
op_axes_arrays[2]};
npy_uint32 op_flags[3];
- int i, idim, ndim, otype_final;
+ int idim, ndim, otype_final;
int need_outer_iterator = 0;
NpyIter *iter = NULL;
/* The reduceat indices - ind must be validated outside this call */
npy_intp *reduceat_ind;
- npy_intp ind_size, red_axis_size;
+ npy_intp i, ind_size, red_axis_size;
/* The selected inner loop */
PyUFuncGenericFunction innerloop = NULL;
void *innerloopdata = NULL;
#endif
/* Set up the op_axes for the outer loop */
- for (i = 0, idim = 0; idim < ndim; ++idim) {
+ for (idim = 0; idim < ndim; ++idim) {
/* Use the i-th iteration dimension to match up ind */
if (idim == axis) {
op_axes_arrays[0][idim] = axis;
with assert_raises(TypeError):
b = np.matmul(a, a)
+ def test_matmul_bool(self):
+ # gh-14439
+ a = np.array([[1, 0],[1, 1]], dtype=bool)
+ assert np.max(a.view(np.uint8)) == 1
+ b = np.matmul(a, a)
+ # matmul with boolean output should always be 0, 1
+ assert np.max(b.view(np.uint8)) == 1
+
+ rg = np.random.default_rng(np.random.PCG64(43))
+ d = rg.integers(2, size=4*5, dtype=np.int8)
+ d = d.reshape(4, 5) > 0
+ out1 = np.matmul(d, d.reshape(5, 4))
+ out2 = np.dot(d, d.reshape(5, 4))
+ assert_equal(out1, out2)
+
+ c = np.matmul(np.zeros((2, 0), dtype=bool), np.zeros(0, dtype=bool))
+ assert not np.any(c)
if sys.version_info[:2] >= (3, 5):
- worst case complexity for transform sizes with large prime factors is
`N*log(N)`, because Bluestein's algorithm [3] is used for these cases.
-License
--------
-
-3-clause BSD (see LICENSE.md)
-
Some code details
-----------------
-from __future__ import division, absolute_import, print_function
+"""
+Discrete Fourier Transform (:mod:`numpy.fft`)
+=============================================
+
+.. currentmodule:: numpy.fft
+
+Standard FFTs
+-------------
+
+.. autosummary::
+ :toctree: generated/
+
+ fft Discrete Fourier transform.
+ ifft Inverse discrete Fourier transform.
+ fft2 Discrete Fourier transform in two dimensions.
+ ifft2 Inverse discrete Fourier transform in two dimensions.
+ fftn Discrete Fourier transform in N dimensions.
+ ifftn Inverse discrete Fourier transform in N dimensions.
+
+Real FFTs
+---------
+
+.. autosummary::
+ :toctree: generated/
+
+ rfft Real discrete Fourier transform.
+ irfft Inverse real discrete Fourier transform.
+ rfft2 Real discrete Fourier transform in two dimensions.
+ irfft2 Inverse real discrete Fourier transform in two dimensions.
+ rfftn Real discrete Fourier transform in N dimensions.
+ irfftn Inverse real discrete Fourier transform in N dimensions.
+
+Hermitian FFTs
+--------------
+
+.. autosummary::
+ :toctree: generated/
+
+ hfft Hermitian discrete Fourier transform.
+ ihfft Inverse Hermitian discrete Fourier transform.
+
+Helper routines
+---------------
+
+.. autosummary::
+ :toctree: generated/
+
+ fftfreq Discrete Fourier Transform sample frequencies.
+ rfftfreq DFT sample frequencies (for usage with rfft, irfft).
+ fftshift Shift zero-frequency component to center of spectrum.
+ ifftshift Inverse of fftshift.
+
+
+Background information
+----------------------
+
+Fourier analysis is fundamentally a method for expressing a function as a
+sum of periodic components, and for recovering the function from those
+components. When both the function and its Fourier transform are
+replaced with discretized counterparts, it is called the discrete Fourier
+transform (DFT). The DFT has become a mainstay of numerical computing in
+part because of a very fast algorithm for computing it, called the Fast
+Fourier Transform (FFT), which was known to Gauss (1805) and was brought
+to light in its current form by Cooley and Tukey [CT]_. Press et al. [NR]_
+provide an accessible introduction to Fourier analysis and its
+applications.
+
+Because the discrete Fourier transform separates its input into
+components that contribute at discrete frequencies, it has a great number
+of applications in digital signal processing, e.g., for filtering, and in
+this context the discretized input to the transform is customarily
+referred to as a *signal*, which exists in the *time domain*. The output
+is called a *spectrum* or *transform* and exists in the *frequency
+domain*.
+
+Implementation details
+----------------------
+
+There are many ways to define the DFT, varying in the sign of the
+exponent, normalization, etc. In this implementation, the DFT is defined
+as
+
+.. math::
+ A_k = \\sum_{m=0}^{n-1} a_m \\exp\\left\\{-2\\pi i{mk \\over n}\\right\\}
+ \\qquad k = 0,\\ldots,n-1.
+
+The DFT is in general defined for complex inputs and outputs, and a
+single-frequency component at linear frequency :math:`f` is
+represented by a complex exponential
+:math:`a_m = \\exp\\{2\\pi i\\,f m\\Delta t\\}`, where :math:`\\Delta t`
+is the sampling interval.
-# To get sub-modules
-from .info import __doc__
+The values in the result follow so-called "standard" order: If ``A =
+fft(a, n)``, then ``A[0]`` contains the zero-frequency term (the sum of
+the signal), which is always purely real for real inputs. Then ``A[1:n/2]``
+contains the positive-frequency terms, and ``A[n/2+1:]`` contains the
+negative-frequency terms, in order of decreasingly negative frequency.
+For an even number of input points, ``A[n/2]`` represents both positive and
+negative Nyquist frequency, and is also purely real for real input. For
+an odd number of input points, ``A[(n-1)/2]`` contains the largest positive
+frequency, while ``A[(n+1)/2]`` contains the largest negative frequency.
+The routine ``np.fft.fftfreq(n)`` returns an array giving the frequencies
+of corresponding elements in the output. The routine
+``np.fft.fftshift(A)`` shifts transforms and their frequencies to put the
+zero-frequency components in the middle, and ``np.fft.ifftshift(A)`` undoes
+that shift.
+
+When the input `a` is a time-domain signal and ``A = fft(a)``, ``np.abs(A)``
+is its amplitude spectrum and ``np.abs(A)**2`` is its power spectrum.
+The phase spectrum is obtained by ``np.angle(A)``.
+
+The inverse DFT is defined as
+
+.. math::
+ a_m = \\frac{1}{n}\\sum_{k=0}^{n-1}A_k\\exp\\left\\{2\\pi i{mk\\over n}\\right\\}
+ \\qquad m = 0,\\ldots,n-1.
+
+It differs from the forward transform by the sign of the exponential
+argument and the default normalization by :math:`1/n`.
+
+Normalization
+-------------
+The default normalization leaves the direct transforms unscaled and scales the
+inverse transforms by :math:`1/n`. It is possible to obtain unitary
+transforms by setting the keyword argument ``norm`` to ``"ortho"`` (default is
+`None`) so that both direct and inverse transforms will be scaled by
+:math:`1/\\sqrt{n}`.
+
+Real and Hermitian transforms
+-----------------------------
+
+When the input is purely real, its transform is Hermitian, i.e., the
+component at frequency :math:`f_k` is the complex conjugate of the
+component at frequency :math:`-f_k`, which means that for real
+inputs there is no information in the negative frequency components that
+is not already available from the positive frequency components.
+The family of `rfft` functions is
+designed to operate on real inputs, and exploits this symmetry by
+computing only the positive frequency components, up to and including the
+Nyquist frequency. Thus, ``n`` input points produce ``n/2+1`` complex
+output points. The inverses of this family assume the same symmetry of
+their input, and for an output of ``n`` points use ``n/2+1`` input points.
+
+Correspondingly, when the spectrum is purely real, the signal is
+Hermitian. The `hfft` family of functions exploits this symmetry by
+using ``n/2+1`` complex points in the input (time) domain for ``n`` real
+points in the frequency domain.
+
+In higher dimensions, FFTs are used, e.g., for image analysis and
+filtering. The computational efficiency of the FFT means that it can
+also be a faster way to compute large convolutions, using the property
+that a convolution in the time domain is equivalent to a point-by-point
+multiplication in the frequency domain.
+
+Higher dimensions
+-----------------
+
+In two dimensions, the DFT is defined as
+
+.. math::
+ A_{kl} = \\sum_{m=0}^{M-1} \\sum_{n=0}^{N-1}
+ a_{mn}\\exp\\left\\{-2\\pi i \\left({mk\\over M}+{nl\\over N}\\right)\\right\\}
+ \\qquad k = 0, \\ldots, M-1;\\quad l = 0, \\ldots, N-1,
+
+which extends in the obvious way to higher dimensions, and the inverses
+in higher dimensions also extend in the same way.
+
+References
+----------
+
+.. [CT] Cooley, James W., and John W. Tukey, 1965, "An algorithm for the
+ machine calculation of complex Fourier series," *Math. Comput.*
+ 19: 297-301.
+
+.. [NR] Press, W., Teukolsky, S., Vetterling, W.T., and Flannery, B.P.,
+ 2007, *Numerical Recipes: The Art of Scientific Computing*, ch.
+ 12-13. Cambridge Univ. Press, Cambridge, UK.
+
+Examples
+--------
+
+For examples, see the various functions.
+
+"""
+
+from __future__ import division, absolute_import, print_function
-from .pocketfft import *
+from ._pocketfft import *
from .helper import *
from numpy._pytesttester import PytestTester
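The forward and inverse DFT definitions given in the ``numpy.fft`` docstring can be checked with a direct O(n^2) summation. The sketch below uses only the standard library; the names ``dft`` and ``idft`` are hypothetical helpers written for this illustration and say nothing about how pocketfft computes the transform.

```python
import cmath


def dft(a):
    """Forward DFT: A_k = sum_m a_m * exp(-2*pi*i*m*k/n)."""
    n = len(a)
    return [
        sum(a[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def idft(A):
    """Inverse DFT: a_m = (1/n) * sum_k A_k * exp(+2*pi*i*m*k/n)."""
    n = len(A)
    return [
        sum(A[k] * cmath.exp(2j * cmath.pi * m * k / n) for k in range(n)) / n
        for m in range(n)
    ]


signal = [1.0, 2.0, 0.0, -1.0]
spectrum = dft(signal)
recovered = idft(spectrum)
```

With the default normalization, ``spectrum[0]`` equals the sum of the signal, and applying ``idft`` to the spectrum recovers the input up to floating-point rounding.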
--- /dev/null
+/*
+ * This file is part of pocketfft.
+ * Licensed under a 3-clause BSD style license - see LICENSE.md
+ */
+
+/*
+ * Main implementation file.
+ *
+ * Copyright (C) 2004-2018 Max-Planck-Society
+ * \author Martin Reinecke
+ */
+
+#include <math.h>
+#include <string.h>
+#include <stdlib.h>
+
+#include "npy_config.h"
+#define restrict NPY_RESTRICT
+
+#define RALLOC(type,num) \
+ ((type *)malloc((num)*sizeof(type)))
+#define DEALLOC(ptr) \
+ do { free(ptr); (ptr)=NULL; } while(0)
+
+#define SWAP(a,b,type) \
+ do { type tmp_=(a); (a)=(b); (b)=tmp_; } while(0)
+
+#ifdef __GNUC__
+#define NOINLINE __attribute__((noinline))
+#define WARN_UNUSED_RESULT __attribute__ ((warn_unused_result))
+#else
+#define NOINLINE
+#define WARN_UNUSED_RESULT
+#endif
+
+struct cfft_plan_i;
+typedef struct cfft_plan_i * cfft_plan;
+struct rfft_plan_i;
+typedef struct rfft_plan_i * rfft_plan;
+
+// adapted from https://stackoverflow.com/questions/42792939/
+// CAUTION: this function only works for arguments in the range [-0.25; 0.25]!
+static void my_sincosm1pi (double a, double *restrict res)
+ {
+ double s = a * a;
+ /* Approximate cos(pi*x)-1 for x in [-0.25,0.25] */
+ double r = -1.0369917389758117e-4;
+ r = fma (r, s, 1.9294935641298806e-3);
+ r = fma (r, s, -2.5806887942825395e-2);
+ r = fma (r, s, 2.3533063028328211e-1);
+ r = fma (r, s, -1.3352627688538006e+0);
+ r = fma (r, s, 4.0587121264167623e+0);
+ r = fma (r, s, -4.9348022005446790e+0);
+ double c = r*s;
+ /* Approximate sin(pi*x) for x in [-0.25,0.25] */
+ r = 4.6151442520157035e-4;
+ r = fma (r, s, -7.3700183130883555e-3);
+ r = fma (r, s, 8.2145868949323936e-2);
+ r = fma (r, s, -5.9926452893214921e-1);
+ r = fma (r, s, 2.5501640398732688e+0);
+ r = fma (r, s, -5.1677127800499516e+0);
+ s = s * a;
+ r = r * s;
+ s = fma (a, 3.1415926535897931e+0, r);
+ res[0] = c;
+ res[1] = s;
+ }
+
+NOINLINE static void calc_first_octant(size_t den, double * restrict res)
+ {
+ size_t n = (den+4)>>3;
+ if (n==0) return;
+ res[0]=1.; res[1]=0.;
+ if (n==1) return;
+ size_t l1=(size_t)sqrt(n);
+ for (size_t i=1; i<l1; ++i)
+ my_sincosm1pi((2.*i)/den,&res[2*i]);
+ size_t start=l1;
+ while(start<n)
+ {
+ double cs[2];
+ my_sincosm1pi((2.*start)/den,cs);
+ res[2*start] = cs[0]+1.;
+ res[2*start+1] = cs[1];
+ size_t end = l1;
+ if (start+end>n) end = n-start;
+ for (size_t i=1; i<end; ++i)
+ {
+ double csx[2]={res[2*i], res[2*i+1]};
+ res[2*(start+i)] = ((cs[0]*csx[0] - cs[1]*csx[1] + cs[0]) + csx[0]) + 1.;
+ res[2*(start+i)+1] = (cs[0]*csx[1] + cs[1]*csx[0]) + cs[1] + csx[1];
+ }
+ start += l1;
+ }
+ for (size_t i=1; i<l1; ++i)
+ res[2*i] += 1.;
+ }
+
+NOINLINE static void calc_first_quadrant(size_t n, double * restrict res)
+ {
+ double * restrict p = res+n;
+ calc_first_octant(n<<1, p);
+ size_t ndone=(n+2)>>2;
+ size_t i=0, idx1=0, idx2=2*ndone-2;
+ for (; i+1<ndone; i+=2, idx1+=2, idx2-=2)
+ {
+ res[idx1] = p[2*i];
+ res[idx1+1] = p[2*i+1];
+ res[idx2] = p[2*i+3];
+ res[idx2+1] = p[2*i+2];
+ }
+ if (i!=ndone)
+ {
+ res[idx1 ] = p[2*i];
+ res[idx1+1] = p[2*i+1];
+ }
+ }
+
+NOINLINE static void calc_first_half(size_t n, double * restrict res)
+ {
+ int ndone=(n+1)>>1;
+ double * p = res+n-1;
+ calc_first_octant(n<<2, p);
+ int i4=0, in=n, i=0;
+ for (; i4<=in-i4; ++i, i4+=4) // octant 0
+ {
+ res[2*i] = p[2*i4]; res[2*i+1] = p[2*i4+1];
+ }
+ for (; i4-in <= 0; ++i, i4+=4) // octant 1
+ {
+ int xm = in-i4;
+ res[2*i] = p[2*xm+1]; res[2*i+1] = p[2*xm];
+ }
+ for (; i4<=3*in-i4; ++i, i4+=4) // octant 2
+ {
+ int xm = i4-in;
+ res[2*i] = -p[2*xm+1]; res[2*i+1] = p[2*xm];
+ }
+ for (; i<ndone; ++i, i4+=4) // octant 3
+ {
+ int xm = 2*in-i4;
+ res[2*i] = -p[2*xm]; res[2*i+1] = p[2*xm+1];
+ }
+ }
+
+NOINLINE static void fill_first_quadrant(size_t n, double * restrict res)
+ {
+ const double hsqt2 = 0.707106781186547524400844362104849;
+ size_t quart = n>>2;
+ if ((n&7)==0)
+ res[quart] = res[quart+1] = hsqt2;
+ for (size_t i=2, j=2*quart-2; i<quart; i+=2, j-=2)
+ {
+ res[j ] = res[i+1];
+ res[j+1] = res[i ];
+ }
+ }
+
+NOINLINE static void fill_first_half(size_t n, double * restrict res)
+ {
+ size_t half = n>>1;
+ if ((n&3)==0)
+ for (size_t i=0; i<half; i+=2)
+ {
+ res[i+half] = -res[i+1];
+ res[i+half+1] = res[i ];
+ }
+ else
+ for (size_t i=2, j=2*half-2; i<half; i+=2, j-=2)
+ {
+ res[j ] = -res[i ];
+ res[j+1] = res[i+1];
+ }
+ }
+
+NOINLINE static void fill_second_half(size_t n, double * restrict res)
+ {
+ if ((n&1)==0)
+ for (size_t i=0; i<n; ++i)
+ res[i+n] = -res[i];
+ else
+ for (size_t i=2, j=2*n-2; i<n; i+=2, j-=2)
+ {
+ res[j ] = res[i ];
+ res[j+1] = -res[i+1];
+ }
+ }
+
+NOINLINE static void sincos_2pibyn_half(size_t n, double * restrict res)
+ {
+ if ((n&3)==0)
+ {
+ calc_first_octant(n, res);
+ fill_first_quadrant(n, res);
+ fill_first_half(n, res);
+ }
+ else if ((n&1)==0)
+ {
+ calc_first_quadrant(n, res);
+ fill_first_half(n, res);
+ }
+ else
+ calc_first_half(n, res);
+ }
+
+NOINLINE static void sincos_2pibyn(size_t n, double * restrict res)
+ {
+ sincos_2pibyn_half(n, res);
+ fill_second_half(n, res);
+ }
+
+NOINLINE static size_t largest_prime_factor (size_t n)
+ {
+ size_t res=1;
+ size_t tmp;
+ while (((tmp=(n>>1))<<1)==n)
+ { res=2; n=tmp; }
+
+ size_t limit=(size_t)sqrt(n+0.01);
+ for (size_t x=3; x<=limit; x+=2)
+ while (((tmp=(n/x))*x)==n)
+ {
+ res=x;
+ n=tmp;
+ limit=(size_t)sqrt(n+0.01);
+ }
+ if (n>1) res=n;
+
+ return res;
+ }
+
+NOINLINE static double cost_guess (size_t n)
+ {
+ const double lfp=1.1; // penalty for non-hardcoded larger factors
+ size_t ni=n;
+ double result=0.;
+ size_t tmp;
+ while (((tmp=(n>>1))<<1)==n)
+ { result+=2; n=tmp; }
+
+ size_t limit=(size_t)sqrt(n+0.01);
+ for (size_t x=3; x<=limit; x+=2)
+ while ((tmp=(n/x))*x==n)
+ {
+ result+= (x<=5) ? x : lfp*x; // penalize larger prime factors
+ n=tmp;
+ limit=(size_t)sqrt(n+0.01);
+ }
+ if (n>1) result+=(n<=5) ? n : lfp*n;
+
+ return result*ni;
+ }
+
+/* returns the smallest composite of 2, 3, 5, 7 and 11 which is >= n */
+NOINLINE static size_t good_size(size_t n)
+ {
+ if (n<=6) return n;
+
+ size_t bestfac=2*n;
+ for (size_t f2=1; f2<bestfac; f2*=2)
+ for (size_t f23=f2; f23<bestfac; f23*=3)
+ for (size_t f235=f23; f235<bestfac; f235*=5)
+ for (size_t f2357=f235; f2357<bestfac; f2357*=7)
+ for (size_t f235711=f2357; f235711<bestfac; f235711*=11)
+ if (f235711>=n) bestfac=f235711;
+ return bestfac;
+ }
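For readers who want to experiment with the size search without compiling C, the following is a line-by-line Python transcription of the ``good_size`` loop nest above. It is a sketch for exploration only, written under the assumption that arbitrary-precision Python integers stand in for ``size_t``.

```python
def good_size(n):
    """Smallest product of powers of 2, 3, 5, 7 and 11 that is >= n."""
    if n <= 6:
        return n
    best = 2 * n  # a power of two always lies in [n, 2n), so this bounds the search
    f2 = 1
    while f2 < best:
        f23 = f2
        while f23 < best:
            f235 = f23
            while f235 < best:
                f2357 = f235
                while f2357 < best:
                    f = f2357
                    while f < best:
                        if f >= n:
                            best = f  # tighten the bound as soon as a candidate fits
                        f *= 11
                    f2357 *= 7
                f235 *= 5
            f23 *= 3
        f2 *= 2
    return best


sizes = [good_size(n) for n in (6, 13, 17, 97, 1021)]
```

Each time a candidate is accepted the upper bound shrinks, so the loops only ever visit products below the best size found so far.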
+
+typedef struct cmplx {
+ double r,i;
+} cmplx;
+
+#define NFCT 25
+typedef struct cfftp_fctdata
+ {
+ size_t fct;
+ cmplx *tw, *tws;
+ } cfftp_fctdata;
+
+typedef struct cfftp_plan_i
+ {
+ size_t length, nfct;
+ cmplx *mem;
+ cfftp_fctdata fct[NFCT];
+ } cfftp_plan_i;
+typedef struct cfftp_plan_i * cfftp_plan;
+
+#define PMC(a,b,c,d) { a.r=c.r+d.r; a.i=c.i+d.i; b.r=c.r-d.r; b.i=c.i-d.i; }
+#define ADDC(a,b,c) { a.r=b.r+c.r; a.i=b.i+c.i; }
+#define SCALEC(a,b) { a.r*=b; a.i*=b; }
+#define ROT90(a) { double tmp_=a.r; a.r=-a.i; a.i=tmp_; }
+#define ROTM90(a) { double tmp_=-a.r; a.r=a.i; a.i=tmp_; }
+#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
+#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
+#define WA(x,i) wa[(i)-1+(x)*(ido-1)]
+/* a = b*c */
+#define A_EQ_B_MUL_C(a,b,c) { a.r=b.r*c.r-b.i*c.i; a.i=b.r*c.i+b.i*c.r; }
+/* a = conj(b)*c*/
+#define A_EQ_CB_MUL_C(a,b,c) { a.r=b.r*c.r+b.i*c.i; a.i=b.r*c.i-b.i*c.r; }
+
+#define PMSIGNC(a,b,c,d) { a.r=c.r+sign*d.r; a.i=c.i+sign*d.i; b.r=c.r-sign*d.r; b.i=c.i-sign*d.i; }
+/* a = b*c */
+#define MULPMSIGNC(a,b,c) { a.r=b.r*c.r-sign*b.i*c.i; a.i=b.r*c.i+sign*b.i*c.r; }
+/* a *= b */
+#define MULPMSIGNCEQ(a,b) { double xtmp=a.r; a.r=b.r*a.r-sign*b.i*a.i; a.i=b.r*a.i+sign*b.i*xtmp; }
+
+NOINLINE static void pass2b (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=2;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
+ for (size_t i=1; i<ido; ++i)
+ {
+ cmplx t;
+ PMC (CH(i,k,0),t,CC(i,0,k),CC(i,1,k))
+ A_EQ_B_MUL_C (CH(i,k,1),WA(0,i),t)
+ }
+ }
+ }
+
+NOINLINE static void pass2f (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=2;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
+ for (size_t i=1; i<ido; ++i)
+ {
+ cmplx t;
+ PMC (CH(i,k,0),t,CC(i,0,k),CC(i,1,k))
+ A_EQ_CB_MUL_C (CH(i,k,1),WA(0,i),t)
+ }
+ }
+ }
+
+#define PREP3(idx) \
+ cmplx t0 = CC(idx,0,k), t1, t2; \
+ PMC (t1,t2,CC(idx,1,k),CC(idx,2,k)) \
+ CH(idx,k,0).r=t0.r+t1.r; \
+ CH(idx,k,0).i=t0.i+t1.i;
+#define PARTSTEP3a(u1,u2,twr,twi) \
+ { \
+ cmplx ca,cb; \
+ ca.r=t0.r+twr*t1.r; \
+ ca.i=t0.i+twr*t1.i; \
+ cb.i=twi*t2.r; \
+ cb.r=-(twi*t2.i); \
+ PMC(CH(0,k,u1),CH(0,k,u2),ca,cb) \
+ }
+
+#define PARTSTEP3b(u1,u2,twr,twi) \
+ { \
+ cmplx ca,cb,da,db; \
+ ca.r=t0.r+twr*t1.r; \
+ ca.i=t0.i+twr*t1.i; \
+ cb.i=twi*t2.r; \
+ cb.r=-(twi*t2.i); \
+ PMC(da,db,ca,cb) \
+ A_EQ_B_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
+ A_EQ_B_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
+ }
+NOINLINE static void pass3b (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=3;
+ const double tw1r=-0.5, tw1i= 0.86602540378443864676;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ PREP3(0)
+ PARTSTEP3a(1,2,tw1r,tw1i)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ PREP3(0)
+ PARTSTEP3a(1,2,tw1r,tw1i)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ PREP3(i)
+ PARTSTEP3b(1,2,tw1r,tw1i)
+ }
+ }
+ }
+#define PARTSTEP3f(u1,u2,twr,twi) \
+ { \
+ cmplx ca,cb,da,db; \
+ ca.r=t0.r+twr*t1.r; \
+ ca.i=t0.i+twr*t1.i; \
+ cb.i=twi*t2.r; \
+ cb.r=-(twi*t2.i); \
+ PMC(da,db,ca,cb) \
+ A_EQ_CB_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
+ A_EQ_CB_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
+ }
+NOINLINE static void pass3f (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=3;
+ const double tw1r=-0.5, tw1i= -0.86602540378443864676;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ PREP3(0)
+ PARTSTEP3a(1,2,tw1r,tw1i)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ PREP3(0)
+ PARTSTEP3a(1,2,tw1r,tw1i)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ PREP3(i)
+ PARTSTEP3f(1,2,tw1r,tw1i)
+ }
+ }
+ }
+
+NOINLINE static void pass4b (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=4;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ cmplx t1, t2, t3, t4;
+ PMC(t2,t1,CC(0,0,k),CC(0,2,k))
+ PMC(t3,t4,CC(0,1,k),CC(0,3,k))
+ ROT90(t4)
+ PMC(CH(0,k,0),CH(0,k,2),t2,t3)
+ PMC(CH(0,k,1),CH(0,k,3),t1,t4)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ cmplx t1, t2, t3, t4;
+ PMC(t2,t1,CC(0,0,k),CC(0,2,k))
+ PMC(t3,t4,CC(0,1,k),CC(0,3,k))
+ ROT90(t4)
+ PMC(CH(0,k,0),CH(0,k,2),t2,t3)
+ PMC(CH(0,k,1),CH(0,k,3),t1,t4)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ cmplx c2, c3, c4, t1, t2, t3, t4;
+ cmplx cc0=CC(i,0,k), cc1=CC(i,1,k),cc2=CC(i,2,k),cc3=CC(i,3,k);
+ PMC(t2,t1,cc0,cc2)
+ PMC(t3,t4,cc1,cc3)
+ ROT90(t4)
+ cmplx wa0=WA(0,i), wa1=WA(1,i),wa2=WA(2,i);
+ PMC(CH(i,k,0),c3,t2,t3)
+ PMC(c2,c4,t1,t4)
+ A_EQ_B_MUL_C (CH(i,k,1),wa0,c2)
+ A_EQ_B_MUL_C (CH(i,k,2),wa1,c3)
+ A_EQ_B_MUL_C (CH(i,k,3),wa2,c4)
+ }
+ }
+ }
+NOINLINE static void pass4f (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=4;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ cmplx t1, t2, t3, t4;
+ PMC(t2,t1,CC(0,0,k),CC(0,2,k))
+ PMC(t3,t4,CC(0,1,k),CC(0,3,k))
+ ROTM90(t4)
+ PMC(CH(0,k,0),CH(0,k,2),t2,t3)
+ PMC(CH(0,k,1),CH(0,k,3),t1,t4)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ cmplx t1, t2, t3, t4;
+ PMC(t2,t1,CC(0,0,k),CC(0,2,k))
+ PMC(t3,t4,CC(0,1,k),CC(0,3,k))
+ ROTM90(t4)
+ PMC(CH(0,k,0),CH(0,k,2),t2,t3)
+ PMC (CH(0,k,1),CH(0,k,3),t1,t4)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ cmplx c2, c3, c4, t1, t2, t3, t4;
+ cmplx cc0=CC(i,0,k), cc1=CC(i,1,k),cc2=CC(i,2,k),cc3=CC(i,3,k);
+ PMC(t2,t1,cc0,cc2)
+ PMC(t3,t4,cc1,cc3)
+ ROTM90(t4)
+ cmplx wa0=WA(0,i), wa1=WA(1,i),wa2=WA(2,i);
+ PMC(CH(i,k,0),c3,t2,t3)
+ PMC(c2,c4,t1,t4)
+ A_EQ_CB_MUL_C (CH(i,k,1),wa0,c2)
+ A_EQ_CB_MUL_C (CH(i,k,2),wa1,c3)
+ A_EQ_CB_MUL_C (CH(i,k,3),wa2,c4)
+ }
+ }
+ }
+
+#define PREP5(idx) \
+ cmplx t0 = CC(idx,0,k), t1, t2, t3, t4; \
+ PMC (t1,t4,CC(idx,1,k),CC(idx,4,k)) \
+ PMC (t2,t3,CC(idx,2,k),CC(idx,3,k)) \
+ CH(idx,k,0).r=t0.r+t1.r+t2.r; \
+ CH(idx,k,0).i=t0.i+t1.i+t2.i;
+
+#define PARTSTEP5a(u1,u2,twar,twbr,twai,twbi) \
+ { \
+ cmplx ca,cb; \
+ ca.r=t0.r+twar*t1.r+twbr*t2.r; \
+ ca.i=t0.i+twar*t1.i+twbr*t2.i; \
+ cb.i=twai*t4.r twbi*t3.r; \
+ cb.r=-(twai*t4.i twbi*t3.i); \
+ PMC(CH(0,k,u1),CH(0,k,u2),ca,cb) \
+ }
+
+#define PARTSTEP5b(u1,u2,twar,twbr,twai,twbi) \
+ { \
+ cmplx ca,cb,da,db; \
+ ca.r=t0.r+twar*t1.r+twbr*t2.r; \
+ ca.i=t0.i+twar*t1.i+twbr*t2.i; \
+ cb.i=twai*t4.r twbi*t3.r; \
+ cb.r=-(twai*t4.i twbi*t3.i); \
+ PMC(da,db,ca,cb) \
+ A_EQ_B_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
+ A_EQ_B_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
+ }
+NOINLINE static void pass5b (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=5;
+ const double tw1r= 0.3090169943749474241,
+ tw1i= 0.95105651629515357212,
+ tw2r= -0.8090169943749474241,
+ tw2i= 0.58778525229247312917;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ PREP5(0)
+ PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
+ PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ PREP5(0)
+ PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
+ PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ PREP5(i)
+ PARTSTEP5b(1,4,tw1r,tw2r,+tw1i,+tw2i)
+ PARTSTEP5b(2,3,tw2r,tw1r,+tw2i,-tw1i)
+ }
+ }
+ }
+#define PARTSTEP5f(u1,u2,twar,twbr,twai,twbi) \
+ { \
+ cmplx ca,cb,da,db; \
+ ca.r=t0.r+twar*t1.r+twbr*t2.r; \
+ ca.i=t0.i+twar*t1.i+twbr*t2.i; \
+ cb.i=twai*t4.r twbi*t3.r; \
+ cb.r=-(twai*t4.i twbi*t3.i); \
+ PMC(da,db,ca,cb) \
+ A_EQ_CB_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
+ A_EQ_CB_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
+ }
+NOINLINE static void pass5f (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa)
+ {
+ const size_t cdim=5;
+ const double tw1r= 0.3090169943749474241,
+ tw1i= -0.95105651629515357212,
+ tw2r= -0.8090169943749474241,
+ tw2i= -0.58778525229247312917;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ PREP5(0)
+ PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
+ PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ PREP5(0)
+ PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
+ PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ PREP5(i)
+ PARTSTEP5f(1,4,tw1r,tw2r,+tw1i,+tw2i)
+ PARTSTEP5f(2,3,tw2r,tw1r,+tw2i,-tw1i)
+ }
+ }
+ }
+
+#define PREP7(idx) \
+ cmplx t1 = CC(idx,0,k), t2, t3, t4, t5, t6, t7; \
+ PMC (t2,t7,CC(idx,1,k),CC(idx,6,k)) \
+ PMC (t3,t6,CC(idx,2,k),CC(idx,5,k)) \
+ PMC (t4,t5,CC(idx,3,k),CC(idx,4,k)) \
+ CH(idx,k,0).r=t1.r+t2.r+t3.r+t4.r; \
+ CH(idx,k,0).i=t1.i+t2.i+t3.i+t4.i;
+
+#define PARTSTEP7a0(u1,u2,x1,x2,x3,y1,y2,y3,out1,out2) \
+ { \
+ cmplx ca,cb; \
+ ca.r=t1.r+x1*t2.r+x2*t3.r+x3*t4.r; \
+ ca.i=t1.i+x1*t2.i+x2*t3.i+x3*t4.i; \
+ cb.i=y1*t7.r y2*t6.r y3*t5.r; \
+ cb.r=-(y1*t7.i y2*t6.i y3*t5.i); \
+ PMC(out1,out2,ca,cb) \
+ }
+#define PARTSTEP7a(u1,u2,x1,x2,x3,y1,y2,y3) \
+ PARTSTEP7a0(u1,u2,x1,x2,x3,y1,y2,y3,CH(0,k,u1),CH(0,k,u2))
+#define PARTSTEP7(u1,u2,x1,x2,x3,y1,y2,y3) \
+ { \
+ cmplx da,db; \
+ PARTSTEP7a0(u1,u2,x1,x2,x3,y1,y2,y3,da,db) \
+ MULPMSIGNC (CH(i,k,u1),WA(u1-1,i),da) \
+ MULPMSIGNC (CH(i,k,u2),WA(u2-1,i),db) \
+ }
+
+NOINLINE static void pass7(size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa, const int sign)
+ {
+ const size_t cdim=7;
+ const double tw1r= 0.623489801858733530525,
+ tw1i= sign * 0.7818314824680298087084,
+ tw2r= -0.222520933956314404289,
+ tw2i= sign * 0.9749279121818236070181,
+ tw3r= -0.9009688679024191262361,
+ tw3i= sign * 0.4338837391175581204758;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ PREP7(0)
+ PARTSTEP7a(1,6,tw1r,tw2r,tw3r,+tw1i,+tw2i,+tw3i)
+ PARTSTEP7a(2,5,tw2r,tw3r,tw1r,+tw2i,-tw3i,-tw1i)
+ PARTSTEP7a(3,4,tw3r,tw1r,tw2r,+tw3i,-tw1i,+tw2i)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ PREP7(0)
+ PARTSTEP7a(1,6,tw1r,tw2r,tw3r,+tw1i,+tw2i,+tw3i)
+ PARTSTEP7a(2,5,tw2r,tw3r,tw1r,+tw2i,-tw3i,-tw1i)
+ PARTSTEP7a(3,4,tw3r,tw1r,tw2r,+tw3i,-tw1i,+tw2i)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ PREP7(i)
+ PARTSTEP7(1,6,tw1r,tw2r,tw3r,+tw1i,+tw2i,+tw3i)
+ PARTSTEP7(2,5,tw2r,tw3r,tw1r,+tw2i,-tw3i,-tw1i)
+ PARTSTEP7(3,4,tw3r,tw1r,tw2r,+tw3i,-tw1i,+tw2i)
+ }
+ }
+ }
+
+#define PREP11(idx) \
+ cmplx t1 = CC(idx,0,k), t2, t3, t4, t5, t6, t7, t8, t9, t10, t11; \
+ PMC (t2,t11,CC(idx,1,k),CC(idx,10,k)) \
+ PMC (t3,t10,CC(idx,2,k),CC(idx, 9,k)) \
+ PMC (t4,t9 ,CC(idx,3,k),CC(idx, 8,k)) \
+ PMC (t5,t8 ,CC(idx,4,k),CC(idx, 7,k)) \
+ PMC (t6,t7 ,CC(idx,5,k),CC(idx, 6,k)) \
+ CH(idx,k,0).r=t1.r+t2.r+t3.r+t4.r+t5.r+t6.r; \
+ CH(idx,k,0).i=t1.i+t2.i+t3.i+t4.i+t5.i+t6.i;
+
+#define PARTSTEP11a0(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,out1,out2) \
+ { \
+ cmplx ca,cb; \
+ ca.r=t1.r+x1*t2.r+x2*t3.r+x3*t4.r+x4*t5.r+x5*t6.r; \
+ ca.i=t1.i+x1*t2.i+x2*t3.i+x3*t4.i+x4*t5.i+x5*t6.i; \
+ cb.i=y1*t11.r y2*t10.r y3*t9.r y4*t8.r y5*t7.r; \
+ cb.r=-(y1*t11.i y2*t10.i y3*t9.i y4*t8.i y5*t7.i ); \
+ PMC(out1,out2,ca,cb) \
+ }
+#define PARTSTEP11a(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5) \
+ PARTSTEP11a0(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,CH(0,k,u1),CH(0,k,u2))
+#define PARTSTEP11(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5) \
+ { \
+ cmplx da,db; \
+ PARTSTEP11a0(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,da,db) \
+ MULPMSIGNC (CH(i,k,u1),WA(u1-1,i),da) \
+ MULPMSIGNC (CH(i,k,u2),WA(u2-1,i),db) \
+ }
+
+NOINLINE static void pass11 (size_t ido, size_t l1, const cmplx * restrict cc,
+ cmplx * restrict ch, const cmplx * restrict wa, const int sign)
+ {
+ const size_t cdim=11;
+ const double tw1r = 0.8412535328311811688618,
+ tw1i = sign * 0.5406408174555975821076,
+ tw2r = 0.4154150130018864255293,
+ tw2i = sign * 0.9096319953545183714117,
+ tw3r = -0.1423148382732851404438,
+ tw3i = sign * 0.9898214418809327323761,
+ tw4r = -0.6548607339452850640569,
+ tw4i = sign * 0.755749574354258283774,
+ tw5r = -0.9594929736144973898904,
+ tw5i = sign * 0.2817325568414296977114;
+
+ if (ido==1)
+ for (size_t k=0; k<l1; ++k)
+ {
+ PREP11(0)
+ PARTSTEP11a(1,10,tw1r,tw2r,tw3r,tw4r,tw5r,+tw1i,+tw2i,+tw3i,+tw4i,+tw5i)
+ PARTSTEP11a(2, 9,tw2r,tw4r,tw5r,tw3r,tw1r,+tw2i,+tw4i,-tw5i,-tw3i,-tw1i)
+ PARTSTEP11a(3, 8,tw3r,tw5r,tw2r,tw1r,tw4r,+tw3i,-tw5i,-tw2i,+tw1i,+tw4i)
+ PARTSTEP11a(4, 7,tw4r,tw3r,tw1r,tw5r,tw2r,+tw4i,-tw3i,+tw1i,+tw5i,-tw2i)
+ PARTSTEP11a(5, 6,tw5r,tw1r,tw4r,tw2r,tw3r,+tw5i,-tw1i,+tw4i,-tw2i,+tw3i)
+ }
+ else
+ for (size_t k=0; k<l1; ++k)
+ {
+ {
+ PREP11(0)
+ PARTSTEP11a(1,10,tw1r,tw2r,tw3r,tw4r,tw5r,+tw1i,+tw2i,+tw3i,+tw4i,+tw5i)
+ PARTSTEP11a(2, 9,tw2r,tw4r,tw5r,tw3r,tw1r,+tw2i,+tw4i,-tw5i,-tw3i,-tw1i)
+ PARTSTEP11a(3, 8,tw3r,tw5r,tw2r,tw1r,tw4r,+tw3i,-tw5i,-tw2i,+tw1i,+tw4i)
+ PARTSTEP11a(4, 7,tw4r,tw3r,tw1r,tw5r,tw2r,+tw4i,-tw3i,+tw1i,+tw5i,-tw2i)
+ PARTSTEP11a(5, 6,tw5r,tw1r,tw4r,tw2r,tw3r,+tw5i,-tw1i,+tw4i,-tw2i,+tw3i)
+ }
+ for (size_t i=1; i<ido; ++i)
+ {
+ PREP11(i)
+ PARTSTEP11(1,10,tw1r,tw2r,tw3r,tw4r,tw5r,+tw1i,+tw2i,+tw3i,+tw4i,+tw5i)
+ PARTSTEP11(2, 9,tw2r,tw4r,tw5r,tw3r,tw1r,+tw2i,+tw4i,-tw5i,-tw3i,-tw1i)
+ PARTSTEP11(3, 8,tw3r,tw5r,tw2r,tw1r,tw4r,+tw3i,-tw5i,-tw2i,+tw1i,+tw4i)
+ PARTSTEP11(4, 7,tw4r,tw3r,tw1r,tw5r,tw2r,+tw4i,-tw3i,+tw1i,+tw5i,-tw2i)
+ PARTSTEP11(5, 6,tw5r,tw1r,tw4r,tw2r,tw3r,+tw5i,-tw1i,+tw4i,-tw2i,+tw3i)
+ }
+ }
+ }
+
+#define CX(a,b,c) cc[(a)+ido*((b)+l1*(c))]
+#define CX2(a,b) cc[(a)+idl1*(b)]
+#define CH2(a,b) ch[(a)+idl1*(b)]
+
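+/* Generic pass for a factor ip that has no specialized routine. cc holds the
+   input and receives the result, ch is scratch space, and csarr holds the
+   ip-th roots of unity. Returns 0 on success, -1 if the temporary buffer
+   cannot be allocated. */
+NOINLINE static int passg (size_t ido, size_t ip, size_t l1,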
+ cmplx * restrict cc, cmplx * restrict ch, const cmplx * restrict wa,
+ const cmplx * restrict csarr, const int sign)
+ {
+ const size_t cdim=ip;
+ size_t ipph = (ip+1)/2;
+ size_t idl1 = ido*l1;
+
+ cmplx * restrict wal=RALLOC(cmplx,ip);
+ if (!wal) return -1;
+ wal[0]=(cmplx){1.,0.};
+ for (size_t i=1; i<ip; ++i)
+ wal[i]=(cmplx){csarr[i].r,sign*csarr[i].i};
+
+ for (size_t k=0; k<l1; ++k)
+ for (size_t i=0; i<ido; ++i)
+ CH(i,k,0) = CC(i,0,k);
+ for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc)
+ for (size_t k=0; k<l1; ++k)
+ for (size_t i=0; i<ido; ++i)
+ PMC(CH(i,k,j),CH(i,k,jc),CC(i,j,k),CC(i,jc,k))
+ for (size_t k=0; k<l1; ++k)
+ for (size_t i=0; i<ido; ++i)
+ {
+ cmplx tmp = CH(i,k,0);
+ for (size_t j=1; j<ipph; ++j)
+ ADDC(tmp,tmp,CH(i,k,j))
+ CX(i,k,0) = tmp;
+ }
+ for (size_t l=1, lc=ip-1; l<ipph; ++l, --lc)
+ {
+ // start the sums with the j=0,1,2 terms
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ CX2(ik,l).r = CH2(ik,0).r+wal[l].r*CH2(ik,1).r+wal[2*l].r*CH2(ik,2).r;
+ CX2(ik,l).i = CH2(ik,0).i+wal[l].r*CH2(ik,1).i+wal[2*l].r*CH2(ik,2).i;
+ CX2(ik,lc).r=-wal[l].i*CH2(ik,ip-1).i-wal[2*l].i*CH2(ik,ip-2).i;
+ CX2(ik,lc).i=wal[l].i*CH2(ik,ip-1).r+wal[2*l].i*CH2(ik,ip-2).r;
+ }
+
+ size_t iwal=2*l;
+ size_t j=3, jc=ip-3;
+ for (; j<ipph-1; j+=2, jc-=2)
+ {
+ iwal+=l; if (iwal>ip) iwal-=ip;
+ cmplx xwal=wal[iwal];
+ iwal+=l; if (iwal>ip) iwal-=ip;
+ cmplx xwal2=wal[iwal];
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ CX2(ik,l).r += CH2(ik,j).r*xwal.r+CH2(ik,j+1).r*xwal2.r;
+ CX2(ik,l).i += CH2(ik,j).i*xwal.r+CH2(ik,j+1).i*xwal2.r;
+ CX2(ik,lc).r -= CH2(ik,jc).i*xwal.i+CH2(ik,jc-1).i*xwal2.i;
+ CX2(ik,lc).i += CH2(ik,jc).r*xwal.i+CH2(ik,jc-1).r*xwal2.i;
+ }
+ }
+ for (; j<ipph; ++j, --jc)
+ {
+ iwal+=l; if (iwal>ip) iwal-=ip;
+ cmplx xwal=wal[iwal];
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ CX2(ik,l).r += CH2(ik,j).r*xwal.r;
+ CX2(ik,l).i += CH2(ik,j).i*xwal.r;
+ CX2(ik,lc).r -= CH2(ik,jc).i*xwal.i;
+ CX2(ik,lc).i += CH2(ik,jc).r*xwal.i;
+ }
+ }
+ }
+ DEALLOC(wal);
+
+ // shuffling and twiddling
+ if (ido==1)
+ for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc)
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ cmplx t1=CX2(ik,j), t2=CX2(ik,jc);
+ PMC(CX2(ik,j),CX2(ik,jc),t1,t2)
+ }
+ else
+ {
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc)
+ for (size_t k=0; k<l1; ++k)
+ {
+ cmplx t1=CX(0,k,j), t2=CX(0,k,jc);
+ PMC(CX(0,k,j),CX(0,k,jc),t1,t2)
+ for (size_t i=1; i<ido; ++i)
+ {
+ cmplx x1, x2;
+ PMC(x1,x2,CX(i,k,j),CX(i,k,jc))
+ size_t idij=(j-1)*(ido-1)+i-1;
+ MULPMSIGNC (CX(i,k,j),wa[idij],x1)
+ idij=(jc-1)*(ido-1)+i-1;
+ MULPMSIGNC (CX(i,k,jc),wa[idij],x2)
+ }
+ }
+ }
+ return 0;
+ }
+
+#undef CH2
+#undef CX2
+#undef CX
+
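+/* Applies all passes of the plan to c, alternating between c and a scratch
+   buffer, and scales the result by fct. sign=-1 selects the forward
+   transform, sign=+1 the backward one. Returns 0 on success, -1 if the
+   scratch buffer (or the temporary inside passg) cannot be allocated. */
+NOINLINE WARN_UNUSED_RESULT static int pass_all(cfftp_plan plan, cmplx c[], double fct,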
+ const int sign)
+ {
+ if (plan->length==1) return 0;
+ size_t len=plan->length;
+ size_t l1=1, nf=plan->nfct;
+ cmplx *ch = RALLOC(cmplx, len);
+ if (!ch) return -1;
+ cmplx *p1=c, *p2=ch;
+
+ for(size_t k1=0; k1<nf; k1++)
+ {
+ size_t ip=plan->fct[k1].fct;
+ size_t l2=ip*l1;
+ size_t ido = len/l2;
+ if (ip==4)
+ sign>0 ? pass4b (ido, l1, p1, p2, plan->fct[k1].tw)
+ : pass4f (ido, l1, p1, p2, plan->fct[k1].tw);
+ else if(ip==2)
+ sign>0 ? pass2b (ido, l1, p1, p2, plan->fct[k1].tw)
+ : pass2f (ido, l1, p1, p2, plan->fct[k1].tw);
+ else if(ip==3)
+ sign>0 ? pass3b (ido, l1, p1, p2, plan->fct[k1].tw)
+ : pass3f (ido, l1, p1, p2, plan->fct[k1].tw);
+ else if(ip==5)
+ sign>0 ? pass5b (ido, l1, p1, p2, plan->fct[k1].tw)
+ : pass5f (ido, l1, p1, p2, plan->fct[k1].tw);
+ else if(ip==7) pass7 (ido, l1, p1, p2, plan->fct[k1].tw, sign);
+ else if(ip==11) pass11(ido, l1, p1, p2, plan->fct[k1].tw, sign);
+ else
+ {
+ if (passg(ido, ip, l1, p1, p2, plan->fct[k1].tw, plan->fct[k1].tws, sign))
+ { DEALLOC(ch); return -1; }
+ // passg leaves its result in p1, so swap here to cancel the swap below
+ SWAP(p1,p2,cmplx *);
+ }
+ SWAP(p1,p2,cmplx *);
+ l1=l2;
+ }
+ if (p1!=c)
+ {
+ if (fct!=1.)
+ for (size_t i=0; i<len; ++i)
+ {
+ c[i].r = ch[i].r*fct;
+ c[i].i = ch[i].i*fct;
+ }
+ else
+ memcpy (c,p1,len*sizeof(cmplx));
+ }
+ else
+ if (fct!=1.)
+ for (size_t i=0; i<len; ++i)
+ {
+ c[i].r *= fct;
+ c[i].i *= fct;
+ }
+ DEALLOC(ch);
+ return 0;
+ }
+
+#undef PMSIGNC
+#undef A_EQ_B_MUL_C
+#undef A_EQ_CB_MUL_C
+#undef MULPMSIGNC
+#undef MULPMSIGNCEQ
+
+#undef WA
+#undef CC
+#undef CH
+#undef ROT90
+#undef SCALEC
+#undef ADDC
+#undef PMC
+
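+/* Forward (sign=-1) and backward (sign=+1) complex transforms. c holds
+   plan->length interleaved (re,im) pairs and is transformed in place; the
+   result is scaled by fct. */
+NOINLINE WARN_UNUSED_RESULT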
+static int cfftp_forward(cfftp_plan plan, double c[], double fct)
+ { return pass_all(plan,(cmplx *)c, fct, -1); }
+
+NOINLINE WARN_UNUSED_RESULT
+static int cfftp_backward(cfftp_plan plan, double c[], double fct)
+ { return pass_all(plan,(cmplx *)c, fct, 1); }
+
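+/* Splits plan->length into the factors stored in plan->fct: factors of 4
+   first, then at most one factor 2 (moved to the front of the list), then
+   the odd prime factors in increasing order. Returns 0 on success and -1 on
+   failure (too many factors). */
+NOINLINE WARN_UNUSED_RESULT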
+static int cfftp_factorize (cfftp_plan plan)
+ {
+ size_t length=plan->length;
+ size_t nfct=0;
+ while ((length%4)==0)
+ { if (nfct>=NFCT) return -1; plan->fct[nfct++].fct=4; length>>=2; }
+ if ((length%2)==0)
+ {
+ length>>=1;
+ // factor 2 should be at the front of the factor list
+ if (nfct>=NFCT) return -1;
+ plan->fct[nfct++].fct=2;
+ SWAP(plan->fct[0].fct, plan->fct[nfct-1].fct,size_t);
+ }
+ size_t maxl=(size_t)(sqrt((double)length))+1;
+ for (size_t divisor=3; (length>1)&&(divisor<maxl); divisor+=2)
+ if ((length%divisor)==0)
+ {
+ while ((length%divisor)==0)
+ {
+ if (nfct>=NFCT) return -1;
+ plan->fct[nfct++].fct=divisor;
+ length/=divisor;
+ }
+ maxl=(size_t)(sqrt((double)length))+1;
+ }
+ if (length>1)
+ { if (nfct>=NFCT) return -1; plan->fct[nfct++].fct=length; }
+ plan->nfct=nfct;
+ return 0;
+ }
+
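+/* Number of cmplx twiddle entries the plan needs: (ip-1)*(ido-1) per factor,
+   plus ip more for every factor above 11 (those go through passg). */
+NOINLINE static size_t cfftp_twsize (cfftp_plan plan)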
+ {
+ size_t twsize=0, l1=1;
+ for (size_t k=0; k<plan->nfct; ++k)
+ {
+ size_t ip=plan->fct[k].fct, ido= plan->length/(l1*ip);
+ twsize+=(ip-1)*(ido-1);
+ if (ip>11)
+ twsize+=ip;
+ l1*=ip;
+ }
+ return twsize;
+ }
+
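+/* Fills plan->mem with the twiddle factors for every pass. The values are
+   read from the interleaved (re,im) table that sincos_2pibyn writes for the
+   full length. Returns 0 on success, -1 if that table cannot be allocated. */
+NOINLINE WARN_UNUSED_RESULT static int cfftp_comp_twiddle (cfftp_plan plan)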
+ {
+ size_t length=plan->length;
+ double *twid = RALLOC(double, 2*length);
+ if (!twid) return -1;
+ sincos_2pibyn(length, twid);
+ size_t l1=1;
+ size_t memofs=0;
+ for (size_t k=0; k<plan->nfct; ++k)
+ {
+ size_t ip=plan->fct[k].fct, ido= length/(l1*ip);
+ plan->fct[k].tw=plan->mem+memofs;
+ memofs+=(ip-1)*(ido-1);
+ for (size_t j=1; j<ip; ++j)
+ for (size_t i=1; i<ido; ++i)
+ {
+ plan->fct[k].tw[(j-1)*(ido-1)+i-1].r = twid[2*j*l1*i];
+ plan->fct[k].tw[(j-1)*(ido-1)+i-1].i = twid[2*j*l1*i+1];
+ }
+ if (ip>11)
+ {
+ plan->fct[k].tws=plan->mem+memofs;
+ memofs+=ip;
+ for (size_t j=0; j<ip; ++j)
+ {
+ plan->fct[k].tws[j].r = twid[2*j*l1*ido];
+ plan->fct[k].tws[j].i = twid[2*j*l1*ido+1];
+ }
+ }
+ l1*=ip;
+ }
+ DEALLOC(twid);
+ return 0;
+ }
+
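+/* Creates a plan for complex transforms of the given length. Returns NULL if
+   length is 0, if factorization fails, or if memory cannot be allocated.
+   Release the plan with destroy_cfftp_plan. */
+static cfftp_plan make_cfftp_plan (size_t length)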
+ {
+ if (length==0) return NULL;
+ cfftp_plan plan = RALLOC(cfftp_plan_i,1);
+ if (!plan) return NULL;
+ plan->length=length;
+ plan->nfct=0;
+ for (size_t i=0; i<NFCT; ++i)
+ plan->fct[i]=(cfftp_fctdata){0,0,0};
+ plan->mem=0;
+ if (length==1) return plan;
+ if (cfftp_factorize(plan)!=0) { DEALLOC(plan); return NULL; }
+ size_t tws=cfftp_twsize(plan);
+ plan->mem=RALLOC(cmplx,tws);
+ if (!plan->mem) { DEALLOC(plan); return NULL; }
+ if (cfftp_comp_twiddle(plan)!=0)
+ { DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
+ return plan;
+ }
+
+/* Frees the plan and its twiddle storage. plan must not be NULL. */
+static void destroy_cfftp_plan (cfftp_plan plan)
+ {
+ DEALLOC(plan->mem);
+ DEALLOC(plan);
+ }
+
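+/* Plan data for real-valued transforms. The layout mirrors the cfftp types,
+   but the twiddle factors are stored as plain doubles. */
+typedef struct rfftp_fctdata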
+ {
+ size_t fct;
+ double *tw, *tws;
+ } rfftp_fctdata;
+
+typedef struct rfftp_plan_i
+ {
+ size_t length, nfct;
+ double *mem;
+ rfftp_fctdata fct[NFCT];
+ } rfftp_plan_i;
+typedef struct rfftp_plan_i * rfftp_plan;
+
+#define WA(x,i) wa[(i)+(x)*(ido-1)]
+/* a = c+d, b = c-d */
+#define PM(a,b,c,d) { a=c+d; b=c-d; }
+/* (a+ib) = conj(c+id) * (e+if) */
+#define MULPM(a,b,c,d,e,f) { a=c*e+d*f; b=c*f-d*e; }
+
+#define CC(a,b,c) cc[(a)+ido*((b)+l1*(c))]
+#define CH(a,b,c) ch[(a)+ido*((b)+cdim*(c))]
+
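+/* radfN: radix-N pass of the forward real transform
+   (cc: input, ch: output, wa: twiddle factors). */
+NOINLINE static void radf2 (size_t ido, size_t l1, const double * restrict cc,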
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=2;
+
+ for (size_t k=0; k<l1; k++)
+ PM (CH(0,0,k),CH(ido-1,1,k),CC(0,k,0),CC(0,k,1))
+ if ((ido&1)==0)
+ for (size_t k=0; k<l1; k++)
+ {
+ CH( 0,1,k) = -CC(ido-1,k,1);
+ CH(ido-1,0,k) = CC(ido-1,k,0);
+ }
+ if (ido<=2) return;
+ for (size_t k=0; k<l1; k++)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ size_t ic=ido-i;
+ double tr2, ti2;
+ MULPM (tr2,ti2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1))
+ PM (CH(i-1,0,k),CH(ic-1,1,k),CC(i-1,k,0),tr2)
+ PM (CH(i ,0,k),CH(ic ,1,k),ti2,CC(i ,k,0))
+ }
+ }
+
+NOINLINE static void radf3(size_t ido, size_t l1, const double * restrict cc,
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=3;
+ static const double taur=-0.5, taui=0.86602540378443864676;
+
+ for (size_t k=0; k<l1; k++)
+ {
+ double cr2=CC(0,k,1)+CC(0,k,2);
+ CH(0,0,k) = CC(0,k,0)+cr2;
+ CH(0,2,k) = taui*(CC(0,k,2)-CC(0,k,1));
+ CH(ido-1,1,k) = CC(0,k,0)+taur*cr2;
+ }
+ if (ido==1) return;
+ for (size_t k=0; k<l1; k++)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ size_t ic=ido-i;
+ double di2, di3, dr2, dr3;
+ MULPM (dr2,di2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1)) // d2=conj(WA0)*CC1
+ MULPM (dr3,di3,WA(1,i-2),WA(1,i-1),CC(i-1,k,2),CC(i,k,2)) // d3=conj(WA1)*CC2
+ double cr2=dr2+dr3; // c2 = d2+d3
+ double ci2=di2+di3;
+ CH(i-1,0,k) = CC(i-1,k,0)+cr2; // CH0 = CC0+c2
+ CH(i ,0,k) = CC(i ,k,0)+ci2;
+ double tr2 = CC(i-1,k,0)+taur*cr2; // t2 = CC0+taur*c2
+ double ti2 = CC(i ,k,0)+taur*ci2;
+ double tr3 = taui*(di2-di3); // t3 = taui*i*(d3-d2)
+ double ti3 = taui*(dr3-dr2);
+ PM(CH(i-1,2,k),CH(ic-1,1,k),tr2,tr3) // PM(i) = t2+t3
+ PM(CH(i ,2,k),CH(ic ,1,k),ti3,ti2) // PM(ic) = conj(t2-t3)
+ }
+ }
+
+NOINLINE static void radf4(size_t ido, size_t l1, const double * restrict cc,
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=4;
+ static const double hsqt2=0.70710678118654752440;
+
+ for (size_t k=0; k<l1; k++)
+ {
+ double tr1,tr2;
+ PM (tr1,CH(0,2,k),CC(0,k,3),CC(0,k,1))
+ PM (tr2,CH(ido-1,1,k),CC(0,k,0),CC(0,k,2))
+ PM (CH(0,0,k),CH(ido-1,3,k),tr2,tr1)
+ }
+ if ((ido&1)==0)
+ for (size_t k=0; k<l1; k++)
+ {
+ double ti1=-hsqt2*(CC(ido-1,k,1)+CC(ido-1,k,3));
+ double tr1= hsqt2*(CC(ido-1,k,1)-CC(ido-1,k,3));
+ PM (CH(ido-1,0,k),CH(ido-1,2,k),CC(ido-1,k,0),tr1)
+ PM (CH( 0,3,k),CH( 0,1,k),ti1,CC(ido-1,k,2))
+ }
+ if (ido<=2) return;
+ for (size_t k=0; k<l1; k++)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ size_t ic=ido-i;
+ double ci2, ci3, ci4, cr2, cr3, cr4, ti1, ti2, ti3, ti4, tr1, tr2, tr3, tr4;
+ MULPM(cr2,ci2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1))
+ MULPM(cr3,ci3,WA(1,i-2),WA(1,i-1),CC(i-1,k,2),CC(i,k,2))
+ MULPM(cr4,ci4,WA(2,i-2),WA(2,i-1),CC(i-1,k,3),CC(i,k,3))
+ PM(tr1,tr4,cr4,cr2)
+ PM(ti1,ti4,ci2,ci4)
+ PM(tr2,tr3,CC(i-1,k,0),cr3)
+ PM(ti2,ti3,CC(i ,k,0),ci3)
+ PM(CH(i-1,0,k),CH(ic-1,3,k),tr2,tr1)
+ PM(CH(i ,0,k),CH(ic ,3,k),ti1,ti2)
+ PM(CH(i-1,2,k),CH(ic-1,1,k),tr3,ti4)
+ PM(CH(i ,2,k),CH(ic ,1,k),tr4,ti3)
+ }
+ }
+
+NOINLINE static void radf5(size_t ido, size_t l1, const double * restrict cc,
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=5;
+ static const double tr11= 0.3090169943749474241, ti11=0.95105651629515357212,
+ tr12=-0.8090169943749474241, ti12=0.58778525229247312917;
+
+ for (size_t k=0; k<l1; k++)
+ {
+ double cr2, cr3, ci4, ci5;
+ PM (cr2,ci5,CC(0,k,4),CC(0,k,1))
+ PM (cr3,ci4,CC(0,k,3),CC(0,k,2))
+ CH(0,0,k)=CC(0,k,0)+cr2+cr3;
+ CH(ido-1,1,k)=CC(0,k,0)+tr11*cr2+tr12*cr3;
+ CH(0,2,k)=ti11*ci5+ti12*ci4;
+ CH(ido-1,3,k)=CC(0,k,0)+tr12*cr2+tr11*cr3;
+ CH(0,4,k)=ti12*ci5-ti11*ci4;
+ }
+ if (ido==1) return;
+ for (size_t k=0; k<l1;++k)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ double ci2, di2, ci4, ci5, di3, di4, di5, ci3, cr2, cr3, dr2, dr3,
+ dr4, dr5, cr5, cr4, ti2, ti3, ti5, ti4, tr2, tr3, tr4, tr5;
+ size_t ic=ido-i;
+ MULPM (dr2,di2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1))
+ MULPM (dr3,di3,WA(1,i-2),WA(1,i-1),CC(i-1,k,2),CC(i,k,2))
+ MULPM (dr4,di4,WA(2,i-2),WA(2,i-1),CC(i-1,k,3),CC(i,k,3))
+ MULPM (dr5,di5,WA(3,i-2),WA(3,i-1),CC(i-1,k,4),CC(i,k,4))
+ PM(cr2,ci5,dr5,dr2)
+ PM(ci2,cr5,di2,di5)
+ PM(cr3,ci4,dr4,dr3)
+ PM(ci3,cr4,di3,di4)
+ CH(i-1,0,k)=CC(i-1,k,0)+cr2+cr3;
+ CH(i ,0,k)=CC(i ,k,0)+ci2+ci3;
+ tr2=CC(i-1,k,0)+tr11*cr2+tr12*cr3;
+ ti2=CC(i ,k,0)+tr11*ci2+tr12*ci3;
+ tr3=CC(i-1,k,0)+tr12*cr2+tr11*cr3;
+ ti3=CC(i ,k,0)+tr12*ci2+tr11*ci3;
+ MULPM(tr5,tr4,cr5,cr4,ti11,ti12)
+ MULPM(ti5,ti4,ci5,ci4,ti11,ti12)
+ PM(CH(i-1,2,k),CH(ic-1,1,k),tr2,tr5)
+ PM(CH(i ,2,k),CH(ic ,1,k),ti5,ti2)
+ PM(CH(i-1,4,k),CH(ic-1,3,k),tr3,tr4)
+ PM(CH(i ,4,k),CH(ic ,3,k),ti4,ti3)
+ }
+ }
+
+#undef CC
+#undef CH
+#define C1(a,b,c) cc[(a)+ido*((b)+l1*(c))]
+#define C2(a,b) cc[(a)+idl1*(b)]
+#define CH2(a,b) ch[(a)+idl1*(b)]
+#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
+#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
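+/* Generic radix-ip pass of the forward real transform, used for factors
+   without a specialized routine. csarr holds interleaved cos/sin values of
+   the ip-th roots of unity. Both cc and ch are overwritten; the result ends
+   up in cc. The trailing numeric comments (// 114 etc.) appear to be the
+   statement labels of the FFTPACK Fortran routine this was ported from. */
+NOINLINE static void radfg(size_t ido, size_t ip, size_t l1,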
+ double * restrict cc, double * restrict ch, const double * restrict wa,
+ const double * restrict csarr)
+ {
+ const size_t cdim=ip;
+ size_t ipph=(ip+1)/2;
+ size_t idl1 = ido*l1;
+
+ if (ido>1)
+ {
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 114
+ {
+ size_t is=(j-1)*(ido-1),
+ is2=(jc-1)*(ido-1);
+ for (size_t k=0; k<l1; ++k) // 113
+ {
+ size_t idij=is;
+ size_t idij2=is2;
+ for (size_t i=1; i<=ido-2; i+=2) // 112
+ {
+ double t1=C1(i,k,j ), t2=C1(i+1,k,j ),
+ t3=C1(i,k,jc), t4=C1(i+1,k,jc);
+ double x1=wa[idij]*t1 + wa[idij+1]*t2,
+ x2=wa[idij]*t2 - wa[idij+1]*t1,
+ x3=wa[idij2]*t3 + wa[idij2+1]*t4,
+ x4=wa[idij2]*t4 - wa[idij2+1]*t3;
+ C1(i ,k,j ) = x1+x3;
+ C1(i ,k,jc) = x2-x4;
+ C1(i+1,k,j ) = x2+x4;
+ C1(i+1,k,jc) = x3-x1;
+ idij+=2;
+ idij2+=2;
+ }
+ }
+ }
+ }
+
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 123
+ for (size_t k=0; k<l1; ++k) // 122
+ {
+ double t1=C1(0,k,j), t2=C1(0,k,jc);
+ C1(0,k,j ) = t1+t2;
+ C1(0,k,jc) = t2-t1;
+ }
+
+// everything in CC at this point!
+//memset(ch,0,ip*l1*ido*sizeof(double));
+
+ for (size_t l=1,lc=ip-1; l<ipph; ++l,--lc) // 127
+ {
+ for (size_t ik=0; ik<idl1; ++ik) // 124
+ {
+ CH2(ik,l ) = C2(ik,0)+csarr[2*l]*C2(ik,1)+csarr[4*l]*C2(ik,2);
+ CH2(ik,lc) = csarr[2*l+1]*C2(ik,ip-1)+csarr[4*l+1]*C2(ik,ip-2);
+ }
+ size_t iang = 2*l;
+ size_t j=3, jc=ip-3;
+ for (; j<ipph-3; j+=4,jc-=4) // 126
+ {
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar3=csarr[2*iang], ai3=csarr[2*iang+1];
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar4=csarr[2*iang], ai4=csarr[2*iang+1];
+ for (size_t ik=0; ik<idl1; ++ik) // 125
+ {
+ CH2(ik,l ) += ar1*C2(ik,j )+ar2*C2(ik,j +1)
+ +ar3*C2(ik,j +2)+ar4*C2(ik,j +3);
+ CH2(ik,lc) += ai1*C2(ik,jc)+ai2*C2(ik,jc-1)
+ +ai3*C2(ik,jc-2)+ai4*C2(ik,jc-3);
+ }
+ }
+ for (; j<ipph-1; j+=2,jc-=2) // 126
+ {
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
+ for (size_t ik=0; ik<idl1; ++ik) // 125
+ {
+ CH2(ik,l ) += ar1*C2(ik,j )+ar2*C2(ik,j +1);
+ CH2(ik,lc) += ai1*C2(ik,jc)+ai2*C2(ik,jc-1);
+ }
+ }
+ for (; j<ipph; ++j,--jc) // 126
+ {
+ iang+=l; if (iang>=ip) iang-=ip;
+ double ar=csarr[2*iang], ai=csarr[2*iang+1];
+ for (size_t ik=0; ik<idl1; ++ik) // 125
+ {
+ CH2(ik,l ) += ar*C2(ik,j );
+ CH2(ik,lc) += ai*C2(ik,jc);
+ }
+ }
+ }
+ for (size_t ik=0; ik<idl1; ++ik) // 101
+ CH2(ik,0) = C2(ik,0);
+ for (size_t j=1; j<ipph; ++j) // 129
+ for (size_t ik=0; ik<idl1; ++ik) // 128
+ CH2(ik,0) += C2(ik,j);
+
+// everything in CH at this point!
+//memset(cc,0,ip*l1*ido*sizeof(double));
+
+ for (size_t k=0; k<l1; ++k) // 131
+ for (size_t i=0; i<ido; ++i) // 130
+ CC(i,0,k) = CH(i,k,0);
+
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 137
+ {
+ size_t j2=2*j-1;
+ for (size_t k=0; k<l1; ++k) // 136
+ {
+ CC(ido-1,j2,k) = CH(0,k,j);
+ CC(0,j2+1,k) = CH(0,k,jc);
+ }
+ }
+
+ if (ido==1) return;
+
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 140
+ {
+ size_t j2=2*j-1;
+ for(size_t k=0; k<l1; ++k) // 139
+ for(size_t i=1, ic=ido-i-2; i<=ido-2; i+=2, ic-=2) // 138
+ {
+ CC(i ,j2+1,k) = CH(i ,k,j )+CH(i ,k,jc);
+ CC(ic ,j2 ,k) = CH(i ,k,j )-CH(i ,k,jc);
+ CC(i+1 ,j2+1,k) = CH(i+1,k,j )+CH(i+1,k,jc);
+ CC(ic+1,j2 ,k) = CH(i+1,k,jc)-CH(i+1,k,j );
+ }
+ }
+ }
+#undef C1
+#undef C2
+#undef CH2
+
+#undef CH
+#undef CC
+#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
+#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
+
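+/* radbN: radix-N pass of the backward real transform
+   (cc: input, ch: output, wa: twiddle factors). */
+NOINLINE static void radb2(size_t ido, size_t l1, const double * restrict cc,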
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=2;
+
+ for (size_t k=0; k<l1; k++)
+ PM (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(ido-1,1,k))
+ if ((ido&1)==0)
+ for (size_t k=0; k<l1; k++)
+ {
+ CH(ido-1,k,0) = 2.*CC(ido-1,0,k);
+ CH(ido-1,k,1) =-2.*CC(0 ,1,k);
+ }
+ if (ido<=2) return;
+ for (size_t k=0; k<l1;++k)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ size_t ic=ido-i;
+ double ti2, tr2;
+ PM (CH(i-1,k,0),tr2,CC(i-1,0,k),CC(ic-1,1,k))
+ PM (ti2,CH(i ,k,0),CC(i ,0,k),CC(ic ,1,k))
+ MULPM (CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),ti2,tr2)
+ }
+ }
+
+NOINLINE static void radb3(size_t ido, size_t l1, const double * restrict cc,
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=3;
+ static const double taur=-0.5, taui=0.86602540378443864676;
+
+ for (size_t k=0; k<l1; k++)
+ {
+ double tr2=2.*CC(ido-1,1,k);
+ double cr2=CC(0,0,k)+taur*tr2;
+ CH(0,k,0)=CC(0,0,k)+tr2;
+ double ci3=2.*taui*CC(0,2,k);
+ PM (CH(0,k,2),CH(0,k,1),cr2,ci3);
+ }
+ if (ido==1) return;
+ for (size_t k=0; k<l1; k++)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ size_t ic=ido-i;
+ double tr2=CC(i-1,2,k)+CC(ic-1,1,k); // t2=CC(i) + conj(CC(ic))
+ double ti2=CC(i ,2,k)-CC(ic ,1,k);
+ double cr2=CC(i-1,0,k)+taur*tr2; // c2=CC +taur*t2
+ double ci2=CC(i ,0,k)+taur*ti2;
+ CH(i-1,k,0)=CC(i-1,0,k)+tr2; // CH=CC+t2
+ CH(i ,k,0)=CC(i ,0,k)+ti2;
+ double cr3=taui*(CC(i-1,2,k)-CC(ic-1,1,k));// c3=taui*(CC(i)-conj(CC(ic)))
+ double ci3=taui*(CC(i ,2,k)+CC(ic ,1,k));
+ double di2, di3, dr2, dr3;
+ PM(dr3,dr2,cr2,ci3) // d2= (cr2-ci3, ci2+cr3) = c2+i*c3
+ PM(di2,di3,ci2,cr3) // d3= (cr2+ci3, ci2-cr3) = c2-i*c3
+ MULPM(CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),di2,dr2) // ch = WA*d2
+ MULPM(CH(i,k,2),CH(i-1,k,2),WA(1,i-2),WA(1,i-1),di3,dr3)
+ }
+ }
+
+NOINLINE static void radb4(size_t ido, size_t l1, const double * restrict cc,
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=4;
+ static const double sqrt2=1.41421356237309504880;
+
+ for (size_t k=0; k<l1; k++)
+ {
+ double tr1, tr2;
+ PM (tr2,tr1,CC(0,0,k),CC(ido-1,3,k))
+ double tr3=2.*CC(ido-1,1,k);
+ double tr4=2.*CC(0,2,k);
+ PM (CH(0,k,0),CH(0,k,2),tr2,tr3)
+ PM (CH(0,k,3),CH(0,k,1),tr1,tr4)
+ }
+ if ((ido&1)==0)
+ for (size_t k=0; k<l1; k++)
+ {
+ double tr1,tr2,ti1,ti2;
+ PM (ti1,ti2,CC(0 ,3,k),CC(0 ,1,k))
+ PM (tr2,tr1,CC(ido-1,0,k),CC(ido-1,2,k))
+ CH(ido-1,k,0)=tr2+tr2;
+ CH(ido-1,k,1)=sqrt2*(tr1-ti1);
+ CH(ido-1,k,2)=ti2+ti2;
+ CH(ido-1,k,3)=-sqrt2*(tr1+ti1);
+ }
+ if (ido<=2) return;
+ for (size_t k=0; k<l1;++k)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ double ci2, ci3, ci4, cr2, cr3, cr4, ti1, ti2, ti3, ti4, tr1, tr2, tr3, tr4;
+ size_t ic=ido-i;
+ PM (tr2,tr1,CC(i-1,0,k),CC(ic-1,3,k))
+ PM (ti1,ti2,CC(i ,0,k),CC(ic ,3,k))
+ PM (tr4,ti3,CC(i ,2,k),CC(ic ,1,k))
+ PM (tr3,ti4,CC(i-1,2,k),CC(ic-1,1,k))
+ PM (CH(i-1,k,0),cr3,tr2,tr3)
+ PM (CH(i ,k,0),ci3,ti2,ti3)
+ PM (cr4,cr2,tr1,tr4)
+ PM (ci2,ci4,ti1,ti4)
+ MULPM (CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),ci2,cr2)
+ MULPM (CH(i,k,2),CH(i-1,k,2),WA(1,i-2),WA(1,i-1),ci3,cr3)
+ MULPM (CH(i,k,3),CH(i-1,k,3),WA(2,i-2),WA(2,i-1),ci4,cr4)
+ }
+ }
+
+NOINLINE static void radb5(size_t ido, size_t l1, const double * restrict cc,
+ double * restrict ch, const double * restrict wa)
+ {
+ const size_t cdim=5;
+ static const double tr11= 0.3090169943749474241, ti11=0.95105651629515357212,
+ tr12=-0.8090169943749474241, ti12=0.58778525229247312917;
+
+ for (size_t k=0; k<l1; k++)
+ {
+ double ti5=CC(0,2,k)+CC(0,2,k);
+ double ti4=CC(0,4,k)+CC(0,4,k);
+ double tr2=CC(ido-1,1,k)+CC(ido-1,1,k);
+ double tr3=CC(ido-1,3,k)+CC(ido-1,3,k);
+ CH(0,k,0)=CC(0,0,k)+tr2+tr3;
+ double cr2=CC(0,0,k)+tr11*tr2+tr12*tr3;
+ double cr3=CC(0,0,k)+tr12*tr2+tr11*tr3;
+ double ci4, ci5;
+ MULPM(ci5,ci4,ti5,ti4,ti11,ti12)
+ PM(CH(0,k,4),CH(0,k,1),cr2,ci5)
+ PM(CH(0,k,3),CH(0,k,2),cr3,ci4)
+ }
+ if (ido==1) return;
+ for (size_t k=0; k<l1;++k)
+ for (size_t i=2; i<ido; i+=2)
+ {
+ size_t ic=ido-i;
+ double tr2, tr3, tr4, tr5, ti2, ti3, ti4, ti5;
+ PM(tr2,tr5,CC(i-1,2,k),CC(ic-1,1,k))
+ PM(ti5,ti2,CC(i ,2,k),CC(ic ,1,k))
+ PM(tr3,tr4,CC(i-1,4,k),CC(ic-1,3,k))
+ PM(ti4,ti3,CC(i ,4,k),CC(ic ,3,k))
+ CH(i-1,k,0)=CC(i-1,0,k)+tr2+tr3;
+ CH(i ,k,0)=CC(i ,0,k)+ti2+ti3;
+ double cr2=CC(i-1,0,k)+tr11*tr2+tr12*tr3;
+ double ci2=CC(i ,0,k)+tr11*ti2+tr12*ti3;
+ double cr3=CC(i-1,0,k)+tr12*tr2+tr11*tr3;
+ double ci3=CC(i ,0,k)+tr12*ti2+tr11*ti3;
+ double ci4, ci5, cr5, cr4;
+ MULPM(cr5,cr4,tr5,tr4,ti11,ti12)
+ MULPM(ci5,ci4,ti5,ti4,ti11,ti12)
+ double dr2, dr3, dr4, dr5, di2, di3, di4, di5;
+ PM(dr4,dr3,cr3,ci4)
+ PM(di3,di4,ci3,cr4)
+ PM(dr5,dr2,cr2,ci5)
+ PM(di2,di5,ci2,cr5)
+ MULPM(CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),di2,dr2)
+ MULPM(CH(i,k,2),CH(i-1,k,2),WA(1,i-2),WA(1,i-1),di3,dr3)
+ MULPM(CH(i,k,3),CH(i-1,k,3),WA(2,i-2),WA(2,i-1),di4,dr4)
+ MULPM(CH(i,k,4),CH(i-1,k,4),WA(3,i-2),WA(3,i-1),di5,dr5)
+ }
+ }
+
+#undef CC
+#undef CH
+#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
+#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
+#define C1(a,b,c) cc[(a)+ido*((b)+l1*(c))]
+#define C2(a,b) cc[(a)+idl1*(b)]
+#define CH2(a,b) ch[(a)+idl1*(b)]
+
+NOINLINE static void radbg(size_t ido, size_t ip, size_t l1,
+ double * restrict cc, double * restrict ch, const double * restrict wa,
+ const double * restrict csarr)
+ {
+ const size_t cdim=ip;
+ size_t ipph=(ip+1)/ 2;
+ size_t idl1 = ido*l1;
+
+ for (size_t k=0; k<l1; ++k) // 102
+ for (size_t i=0; i<ido; ++i) // 101
+ CH(i,k,0) = CC(i,0,k);
+ for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc) // 108
+ {
+ size_t j2=2*j-1;
+ for (size_t k=0; k<l1; ++k)
+ {
+ CH(0,k,j ) = 2*CC(ido-1,j2,k);
+ CH(0,k,jc) = 2*CC(0,j2+1,k);
+ }
+ }
+
+ if (ido!=1)
+ {
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 111
+ {
+ size_t j2=2*j-1;
+ for (size_t k=0; k<l1; ++k)
+ for (size_t i=1, ic=ido-i-2; i<=ido-2; i+=2, ic-=2) // 109
+ {
+ CH(i ,k,j ) = CC(i ,j2+1,k)+CC(ic ,j2,k);
+ CH(i ,k,jc) = CC(i ,j2+1,k)-CC(ic ,j2,k);
+ CH(i+1,k,j ) = CC(i+1,j2+1,k)-CC(ic+1,j2,k);
+ CH(i+1,k,jc) = CC(i+1,j2+1,k)+CC(ic+1,j2,k);
+ }
+ }
+ }
+ for (size_t l=1,lc=ip-1; l<ipph; ++l,--lc)
+ {
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ C2(ik,l ) = CH2(ik,0)+csarr[2*l]*CH2(ik,1)+csarr[4*l]*CH2(ik,2);
+ C2(ik,lc) = csarr[2*l+1]*CH2(ik,ip-1)+csarr[4*l+1]*CH2(ik,ip-2);
+ }
+ size_t iang=2*l;
+ size_t j=3,jc=ip-3;
+ for(; j<ipph-3; j+=4,jc-=4)
+ {
+ iang+=l; if(iang>ip) iang-=ip;
+ double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
+ iang+=l; if(iang>ip) iang-=ip;
+ double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
+ iang+=l; if(iang>ip) iang-=ip;
+ double ar3=csarr[2*iang], ai3=csarr[2*iang+1];
+ iang+=l; if(iang>ip) iang-=ip;
+ double ar4=csarr[2*iang], ai4=csarr[2*iang+1];
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ C2(ik,l ) += ar1*CH2(ik,j )+ar2*CH2(ik,j +1)
+ +ar3*CH2(ik,j +2)+ar4*CH2(ik,j +3);
+ C2(ik,lc) += ai1*CH2(ik,jc)+ai2*CH2(ik,jc-1)
+ +ai3*CH2(ik,jc-2)+ai4*CH2(ik,jc-3);
+ }
+ }
+ for(; j<ipph-1; j+=2,jc-=2)
+ {
+ iang+=l; if(iang>ip) iang-=ip;
+ double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
+ iang+=l; if(iang>ip) iang-=ip;
+ double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ C2(ik,l ) += ar1*CH2(ik,j )+ar2*CH2(ik,j +1);
+ C2(ik,lc) += ai1*CH2(ik,jc)+ai2*CH2(ik,jc-1);
+ }
+ }
+ for(; j<ipph; ++j,--jc)
+ {
+ iang+=l; if(iang>ip) iang-=ip;
+ double war=csarr[2*iang], wai=csarr[2*iang+1];
+ for (size_t ik=0; ik<idl1; ++ik)
+ {
+ C2(ik,l ) += war*CH2(ik,j );
+ C2(ik,lc) += wai*CH2(ik,jc);
+ }
+ }
+ }
+ for (size_t j=1; j<ipph; ++j)
+ for (size_t ik=0; ik<idl1; ++ik)
+ CH2(ik,0) += CH2(ik,j);
+ for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 124
+ for (size_t k=0; k<l1; ++k)
+ {
+ CH(0,k,j ) = C1(0,k,j)-C1(0,k,jc);
+ CH(0,k,jc) = C1(0,k,j)+C1(0,k,jc);
+ }
+
+ if (ido==1) return;
+
+ for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc) // 127
+ for (size_t k=0; k<l1; ++k)
+ for (size_t i=1; i<=ido-2; i+=2)
+ {
+ CH(i ,k,j ) = C1(i ,k,j)-C1(i+1,k,jc);
+ CH(i ,k,jc) = C1(i ,k,j)+C1(i+1,k,jc);
+ CH(i+1,k,j ) = C1(i+1,k,j)+C1(i ,k,jc);
+ CH(i+1,k,jc) = C1(i+1,k,j)-C1(i ,k,jc);
+ }
+
+// All in CH
+
+ for (size_t j=1; j<ip; ++j)
+ {
+ size_t is = (j-1)*(ido-1);
+ for (size_t k=0; k<l1; ++k)
+ {
+ size_t idij = is;
+ for (size_t i=1; i<=ido-2; i+=2)
+ {
+ double t1=CH(i,k,j), t2=CH(i+1,k,j);
+ CH(i ,k,j) = wa[idij]*t1-wa[idij+1]*t2;
+ CH(i+1,k,j) = wa[idij]*t2+wa[idij+1]*t1;
+ idij+=2;
+ }
+ }
+ }
+ }
+#undef C1
+#undef C2
+#undef CH2
+
+#undef CC
+#undef CH
+#undef PM
+#undef MULPM
+#undef WA
+
+static void copy_and_norm(double *c, double *p1, size_t n, double fct)
+ {
+ if (p1!=c)
+ {
+ if (fct!=1.)
+ for (size_t i=0; i<n; ++i)
+ c[i] = fct*p1[i];
+ else
+ memcpy (c,p1,n*sizeof(double));
+ }
+ else
+ if (fct!=1.)
+ for (size_t i=0; i<n; ++i)
+ c[i] *= fct;
+ }
+
+WARN_UNUSED_RESULT
+static int rfftp_forward(rfftp_plan plan, double c[], double fct)
+ {
+ if (plan->length==1) return 0;
+ size_t n=plan->length;
+ size_t l1=n, nf=plan->nfct;
+ double *ch = RALLOC(double, n);
+ if (!ch) return -1;
+ double *p1=c, *p2=ch;
+
+ for(size_t k1=0; k1<nf;++k1)
+ {
+ size_t k=nf-k1-1;
+ size_t ip=plan->fct[k].fct;
+ size_t ido=n / l1;
+ l1 /= ip;
+ if(ip==4)
+ radf4(ido, l1, p1, p2, plan->fct[k].tw);
+ else if(ip==2)
+ radf2(ido, l1, p1, p2, plan->fct[k].tw);
+ else if(ip==3)
+ radf3(ido, l1, p1, p2, plan->fct[k].tw);
+ else if(ip==5)
+ radf5(ido, l1, p1, p2, plan->fct[k].tw);
+ else
+ {
+ radfg(ido, ip, l1, p1, p2, plan->fct[k].tw, plan->fct[k].tws);
+ SWAP (p1,p2,double *);
+ }
+ SWAP (p1,p2,double *);
+ }
+ copy_and_norm(c,p1,n,fct);
+ DEALLOC(ch);
+ return 0;
+ }
+
+WARN_UNUSED_RESULT
+static int rfftp_backward(rfftp_plan plan, double c[], double fct)
+ {
+ if (plan->length==1) return 0;
+ size_t n=plan->length;
+ size_t l1=1, nf=plan->nfct;
+ double *ch = RALLOC(double, n);
+ if (!ch) return -1;
+ double *p1=c, *p2=ch;
+
+ for(size_t k=0; k<nf; k++)
+ {
+ size_t ip = plan->fct[k].fct,
+ ido= n/(ip*l1);
+ if(ip==4)
+ radb4(ido, l1, p1, p2, plan->fct[k].tw);
+ else if(ip==2)
+ radb2(ido, l1, p1, p2, plan->fct[k].tw);
+ else if(ip==3)
+ radb3(ido, l1, p1, p2, plan->fct[k].tw);
+ else if(ip==5)
+ radb5(ido, l1, p1, p2, plan->fct[k].tw);
+ else
+ radbg(ido, ip, l1, p1, p2, plan->fct[k].tw, plan->fct[k].tws);
+ SWAP (p1,p2,double *);
+ l1*=ip;
+ }
+ copy_and_norm(c,p1,n,fct);
+ DEALLOC(ch);
+ return 0;
+ }
+
+WARN_UNUSED_RESULT
+static int rfftp_factorize (rfftp_plan plan)
+ {
+ size_t length=plan->length;
+ size_t nfct=0;
+ while ((length%4)==0)
+ { if (nfct>=NFCT) return -1; plan->fct[nfct++].fct=4; length>>=2; }
+ if ((length%2)==0)
+ {
+ length>>=1;
+ // factor 2 should be at the front of the factor list
+ if (nfct>=NFCT) return -1;
+ plan->fct[nfct++].fct=2;
+ SWAP(plan->fct[0].fct, plan->fct[nfct-1].fct,size_t);
+ }
+ size_t maxl=(size_t)(sqrt((double)length))+1;
+ for (size_t divisor=3; (length>1)&&(divisor<maxl); divisor+=2)
+ if ((length%divisor)==0)
+ {
+ while ((length%divisor)==0)
+ {
+ if (nfct>=NFCT) return -1;
+ plan->fct[nfct++].fct=divisor;
+ length/=divisor;
+ }
+ maxl=(size_t)(sqrt((double)length))+1;
+ }
+ if (length>1) { if (nfct>=NFCT) return -1; plan->fct[nfct++].fct=length; }
+ plan->nfct=nfct;
+ return 0;
+ }
+
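The factor ordering that `rfftp_factorize` builds can be sketched in Python. This is only an illustration under stated assumptions: the name `factorize_like_rfftp` is hypothetical, the `NFCT` capacity check is left out, and `math.isqrt` stands in for the C cast of `sqrt`.

```python
import math


def factorize_like_rfftp(length):
    """Return the factors of `length` in the order the C routine stores them."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    factors = []
    # Pull out as many factors of 4 as possible.
    while length % 4 == 0:
        factors.append(4)
        length //= 4
    # At most one factor of 2 can remain; it is swapped to the front.
    if length % 2 == 0:
        length //= 2
        factors.append(2)
        factors[0], factors[-1] = factors[-1], factors[0]
    # Trial division by odd numbers up to sqrt(remaining length).
    maxl = math.isqrt(length) + 1
    divisor = 3
    while length > 1 and divisor < maxl:
        if length % divisor == 0:
            while length % divisor == 0:
                factors.append(divisor)
                length //= divisor
            maxl = math.isqrt(length) + 1
        divisor += 2
    # Whatever is left is a single prime factor.
    if length > 1:
        factors.append(length)
    return factors
```

For instance, `factorize_like_rfftp(60)` yields `[4, 3, 5]` and `factorize_like_rfftp(8)` yields `[2, 4]`, showing the lone factor of 2 placed first.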
+static size_t rfftp_twsize(rfftp_plan plan)
+ {
+ size_t twsize=0, l1=1;
+ for (size_t k=0; k<plan->nfct; ++k)
+ {
+ size_t ip=plan->fct[k].fct, ido= plan->length/(l1*ip);
+ twsize+=(ip-1)*(ido-1);
+ if (ip>5) twsize+=2*ip;
+ l1*=ip;
+ }
+ return twsize;
+ }
+
+WARN_UNUSED_RESULT NOINLINE static int rfftp_comp_twiddle (rfftp_plan plan)
+ {
+ size_t length=plan->length;
+ double *twid = RALLOC(double, 2*length);
+ if (!twid) return -1;
+ sincos_2pibyn_half(length, twid);
+ size_t l1=1;
+ double *ptr=plan->mem;
+ for (size_t k=0; k<plan->nfct; ++k)
+ {
+ size_t ip=plan->fct[k].fct, ido=length/(l1*ip);
+ if (k<plan->nfct-1) // last factor doesn't need twiddles
+ {
+ plan->fct[k].tw=ptr; ptr+=(ip-1)*(ido-1);
+ for (size_t j=1; j<ip; ++j)
+ for (size_t i=1; i<=(ido-1)/2; ++i)
+ {
+ plan->fct[k].tw[(j-1)*(ido-1)+2*i-2] = twid[2*j*l1*i];
+ plan->fct[k].tw[(j-1)*(ido-1)+2*i-1] = twid[2*j*l1*i+1];
+ }
+ }
+ if (ip>5) // special factors required by *g functions
+ {
+ plan->fct[k].tws=ptr; ptr+=2*ip;
+ plan->fct[k].tws[0] = 1.;
+ plan->fct[k].tws[1] = 0.;
+ for (size_t i=1; i<=(ip>>1); ++i)
+ {
+ plan->fct[k].tws[2*i ] = twid[2*i*(length/ip)];
+ plan->fct[k].tws[2*i+1] = twid[2*i*(length/ip)+1];
+ plan->fct[k].tws[2*(ip-i) ] = twid[2*i*(length/ip)];
+ plan->fct[k].tws[2*(ip-i)+1] = -twid[2*i*(length/ip)+1];
+ }
+ }
+ l1*=ip;
+ }
+ DEALLOC(twid);
+ return 0;
+ }
+
+NOINLINE static rfftp_plan make_rfftp_plan (size_t length)
+ {
+ if (length==0) return NULL;
+ rfftp_plan plan = RALLOC(rfftp_plan_i,1);
+ if (!plan) return NULL;
+ plan->length=length;
+ plan->nfct=0;
+ plan->mem=NULL;
+ for (size_t i=0; i<NFCT; ++i)
+ plan->fct[i]=(rfftp_fctdata){0,0,0};
+ if (length==1) return plan;
+ if (rfftp_factorize(plan)!=0) { DEALLOC(plan); return NULL; }
+ size_t tws=rfftp_twsize(plan);
+ plan->mem=RALLOC(double,tws);
+ if (!plan->mem) { DEALLOC(plan); return NULL; }
+ if (rfftp_comp_twiddle(plan)!=0)
+ { DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
+ return plan;
+ }
+
+NOINLINE static void destroy_rfftp_plan (rfftp_plan plan)
+ {
+ DEALLOC(plan->mem);
+ DEALLOC(plan);
+ }
+
+typedef struct fftblue_plan_i
+ {
+ size_t n, n2;
+ cfftp_plan plan;
+ double *mem;
+ double *bk, *bkf;
+ } fftblue_plan_i;
+typedef struct fftblue_plan_i * fftblue_plan;
+
+NOINLINE static fftblue_plan make_fftblue_plan (size_t length)
+ {
+ fftblue_plan plan = RALLOC(fftblue_plan_i,1);
+ if (!plan) return NULL;
+ plan->n = length;
+ plan->n2 = good_size(plan->n*2-1);
+ plan->mem = RALLOC(double, 2*plan->n+2*plan->n2);
+ if (!plan->mem) { DEALLOC(plan); return NULL; }
+ plan->bk = plan->mem;
+ plan->bkf = plan->bk+2*plan->n;
+
+/* initialize b_k */
+ double *tmp = RALLOC(double,4*plan->n);
+ if (!tmp) { DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
+ sincos_2pibyn(2*plan->n,tmp);
+ plan->bk[0] = 1;
+ plan->bk[1] = 0;
+
+ size_t coeff=0;
+ for (size_t m=1; m<plan->n; ++m)
+ {
+ coeff+=2*m-1;
+ if (coeff>=2*plan->n) coeff-=2*plan->n;
+ plan->bk[2*m ] = tmp[2*coeff ];
+ plan->bk[2*m+1] = tmp[2*coeff+1];
+ }
+
+ /* initialize the zero-padded, Fourier transformed b_k. Add normalisation. */
+ double xn2 = 1./plan->n2;
+ plan->bkf[0] = plan->bk[0]*xn2;
+ plan->bkf[1] = plan->bk[1]*xn2;
+ for (size_t m=2; m<2*plan->n; m+=2)
+ {
+ plan->bkf[m] = plan->bkf[2*plan->n2-m] = plan->bk[m] *xn2;
+ plan->bkf[m+1] = plan->bkf[2*plan->n2-m+1] = plan->bk[m+1] *xn2;
+ }
+ for (size_t m=2*plan->n;m<=(2*plan->n2-2*plan->n+1);++m)
+ plan->bkf[m]=0.;
+ plan->plan=make_cfftp_plan(plan->n2);
+ if (!plan->plan)
+ { DEALLOC(tmp); DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
+ if (cfftp_forward(plan->plan,plan->bkf,1.)!=0)
+ { DEALLOC(tmp); destroy_cfftp_plan(plan->plan); DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
+ DEALLOC(tmp);
+
+ return plan;
+ }
+
+NOINLINE static void destroy_fftblue_plan (fftblue_plan plan)
+ {
+ DEALLOC(plan->mem);
+ destroy_cfftp_plan(plan->plan);
+ DEALLOC(plan);
+ }
+
+NOINLINE WARN_UNUSED_RESULT
+static int fftblue_fft(fftblue_plan plan, double c[], int isign, double fct)
+ {
+ size_t n=plan->n;
+ size_t n2=plan->n2;
+ double *bk = plan->bk;
+ double *bkf = plan->bkf;
+ double *akf = RALLOC(double, 2*n2);
+ if (!akf) return -1;
+
+/* initialize a_k and FFT it */
+ if (isign>0)
+ for (size_t m=0; m<2*n; m+=2)
+ {
+ akf[m] = c[m]*bk[m] - c[m+1]*bk[m+1];
+ akf[m+1] = c[m]*bk[m+1] + c[m+1]*bk[m];
+ }
+ else
+ for (size_t m=0; m<2*n; m+=2)
+ {
+ akf[m] = c[m]*bk[m] + c[m+1]*bk[m+1];
+ akf[m+1] =-c[m]*bk[m+1] + c[m+1]*bk[m];
+ }
+ for (size_t m=2*n; m<2*n2; ++m)
+ akf[m]=0;
+
+ if (cfftp_forward (plan->plan,akf,fct)!=0)
+ { DEALLOC(akf); return -1; }
+
+/* do the convolution */
+ if (isign>0)
+ for (size_t m=0; m<2*n2; m+=2)
+ {
+ double im = -akf[m]*bkf[m+1] + akf[m+1]*bkf[m];
+ akf[m ] = akf[m]*bkf[m] + akf[m+1]*bkf[m+1];
+ akf[m+1] = im;
+ }
+ else
+ for (size_t m=0; m<2*n2; m+=2)
+ {
+ double im = akf[m]*bkf[m+1] + akf[m+1]*bkf[m];
+ akf[m ] = akf[m]*bkf[m] - akf[m+1]*bkf[m+1];
+ akf[m+1] = im;
+ }
+
+/* inverse FFT */
+ if (cfftp_backward (plan->plan,akf,1.)!=0)
+ { DEALLOC(akf); return -1; }
+
+/* multiply by b_k */
+ if (isign>0)
+ for (size_t m=0; m<2*n; m+=2)
+ {
+ c[m] = bk[m] *akf[m] - bk[m+1]*akf[m+1];
+ c[m+1] = bk[m+1]*akf[m] + bk[m] *akf[m+1];
+ }
+ else
+ for (size_t m=0; m<2*n; m+=2)
+ {
+ c[m] = bk[m] *akf[m] + bk[m+1]*akf[m+1];
+ c[m+1] =-bk[m+1]*akf[m] + bk[m] *akf[m+1];
+ }
+ DEALLOC(akf);
+ return 0;
+ }
+
+WARN_UNUSED_RESULT
+static int cfftblue_backward(fftblue_plan plan, double c[], double fct)
+ { return fftblue_fft(plan,c,1,fct); }
+
+WARN_UNUSED_RESULT
+static int cfftblue_forward(fftblue_plan plan, double c[], double fct)
+ { return fftblue_fft(plan,c,-1,fct); }
+
+WARN_UNUSED_RESULT
+static int rfftblue_backward(fftblue_plan plan, double c[], double fct)
+ {
+ size_t n=plan->n;
+ double *tmp = RALLOC(double,2*n);
+ if (!tmp) return -1;
+ tmp[0]=c[0];
+ tmp[1]=0.;
+ memcpy (tmp+2,c+1, (n-1)*sizeof(double));
+ if ((n&1)==0) tmp[n+1]=0.;
+ for (size_t m=2; m<n; m+=2)
+ {
+ tmp[2*n-m]=tmp[m];
+ tmp[2*n-m+1]=-tmp[m+1];
+ }
+ if (fftblue_fft(plan,tmp,1,fct)!=0)
+ { DEALLOC(tmp); return -1; }
+ for (size_t m=0; m<n; ++m)
+ c[m] = tmp[2*m];
+ DEALLOC(tmp);
+ return 0;
+ }
+
+WARN_UNUSED_RESULT
+static int rfftblue_forward(fftblue_plan plan, double c[], double fct)
+ {
+ size_t n=plan->n;
+ double *tmp = RALLOC(double,2*n);
+ if (!tmp) return -1;
+ for (size_t m=0; m<n; ++m)
+ {
+ tmp[2*m] = c[m];
+ tmp[2*m+1] = 0.;
+ }
+ if (fftblue_fft(plan,tmp,-1,fct)!=0)
+ { DEALLOC(tmp); return -1; }
+ c[0] = tmp[0];
+ memcpy (c+1, tmp+2, (n-1)*sizeof(double));
+ DEALLOC(tmp);
+ return 0;
+ }
+
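`rfftblue_forward` and `rfftblue_backward` convert between a full complex spectrum and the packed real layout `[Re X0, Re X1, Im X1, Re X2, ...]`, where the imaginary parts of `X0` and (for even `n`) of the Nyquist bin are dropped because they are zero. A small Python sketch of that layout, with hypothetical function names and the non-negative-frequency bins passed in directly:

```python
def pack_halfcomplex(spectrum, n):
    """Pack bins X0..X[n//2] of a real length-n signal into n real numbers."""
    packed = [spectrum[0].real]
    for k in range(1, n // 2 + 1):
        packed.append(spectrum[k].real)
        packed.append(spectrum[k].imag)
    # For even n the trailing Im(X[n/2]) is zero and is not stored.
    return packed[:n]


def unpack_halfcomplex(packed):
    """Rebuild bins X0..X[n//2] from the packed real layout."""
    n = len(packed)
    spectrum = [complex(packed[0], 0.0)]
    for k in range(1, n // 2 + 1):
        re = packed[2 * k - 1]
        im = packed[2 * k] if 2 * k < n else 0.0
        spectrum.append(complex(re, im))
    return spectrum
```

The remaining bins follow from Hermitian symmetry, which is what the mirroring loop in `rfftblue_backward` reconstructs before calling the complex transform.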
+typedef struct cfft_plan_i
+ {
+ cfftp_plan packplan;
+ fftblue_plan blueplan;
+ } cfft_plan_i;
+
+static cfft_plan make_cfft_plan (size_t length)
+ {
+ if (length==0) return NULL;
+ cfft_plan plan = RALLOC(cfft_plan_i,1);
+ if (!plan) return NULL;
+ plan->blueplan=0;
+ plan->packplan=0;
+ if ((length<50) || (largest_prime_factor(length)<=sqrt(length)))
+ {
+ plan->packplan=make_cfftp_plan(length);
+ if (!plan->packplan) { DEALLOC(plan); return NULL; }
+ return plan;
+ }
+ double comp1 = cost_guess(length);
+ double comp2 = 2*cost_guess(good_size(2*length-1));
+ comp2*=1.5; /* fudge factor that appears to give good overall performance */
+ if (comp2<comp1) // use Bluestein
+ {
+ plan->blueplan=make_fftblue_plan(length);
+ if (!plan->blueplan) { DEALLOC(plan); return NULL; }
+ }
+ else
+ {
+ plan->packplan=make_cfftp_plan(length);
+ if (!plan->packplan) { DEALLOC(plan); return NULL; }
+ }
+ return plan;
+ }
+
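The first branch of `make_cfft_plan` can be sketched as a predicate: short lengths, and lengths whose largest prime factor does not exceed `sqrt(length)`, always take the direct (`packplan`) route. The Python below is an illustration only. `largest_prime_factor` here is a stand-in written for this sketch, since the C helper of that name is not shown in this section, and the later cost comparison (which relies on `cost_guess` and `good_size`) is not reproduced.

```python
import math


def largest_prime_factor(n):
    """Stand-in helper: largest prime factor of n by trial division."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    if n > 1:
        largest = n
    return largest


def always_uses_packplan(length):
    """True when the plan is built directly, without weighing Bluestein."""
    return length < 50 or largest_prime_factor(length) <= math.sqrt(length)
```

Lengths for which this returns `False`, such as a large prime, proceed to the cost comparison and may or may not end up using Bluestein.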
+static void destroy_cfft_plan (cfft_plan plan)
+ {
+ if (plan->blueplan)
+ destroy_fftblue_plan(plan->blueplan);
+ if (plan->packplan)
+ destroy_cfftp_plan(plan->packplan);
+ DEALLOC(plan);
+ }
+
+WARN_UNUSED_RESULT static int cfft_backward(cfft_plan plan, double c[], double fct)
+ {
+ if (plan->packplan)
+ return cfftp_backward(plan->packplan,c,fct);
+ // if (plan->blueplan)
+ return cfftblue_backward(plan->blueplan,c,fct);
+ }
+
+WARN_UNUSED_RESULT static int cfft_forward(cfft_plan plan, double c[], double fct)
+ {
+ if (plan->packplan)
+ return cfftp_forward(plan->packplan,c,fct);
+ // if (plan->blueplan)
+ return cfftblue_forward(plan->blueplan,c,fct);
+ }
+
+typedef struct rfft_plan_i
+ {
+ rfftp_plan packplan;
+ fftblue_plan blueplan;
+ } rfft_plan_i;
+
+static rfft_plan make_rfft_plan (size_t length)
+ {
+ if (length==0) return NULL;
+ rfft_plan plan = RALLOC(rfft_plan_i,1);
+ if (!plan) return NULL;
+ plan->blueplan=0;
+ plan->packplan=0;
+ if ((length<50) || (largest_prime_factor(length)<=sqrt(length)))
+ {
+ plan->packplan=make_rfftp_plan(length);
+ if (!plan->packplan) { DEALLOC(plan); return NULL; }
+ return plan;
+ }
+ double comp1 = 0.5*cost_guess(length);
+ double comp2 = 2*cost_guess(good_size(2*length-1));
+ comp2*=1.5; /* fudge factor that appears to give good overall performance */
+ if (comp2<comp1) // use Bluestein
+ {
+ plan->blueplan=make_fftblue_plan(length);
+ if (!plan->blueplan) { DEALLOC(plan); return NULL; }
+ }
+ else
+ {
+ plan->packplan=make_rfftp_plan(length);
+ if (!plan->packplan) { DEALLOC(plan); return NULL; }
+ }
+ return plan;
+ }
+
+static void destroy_rfft_plan (rfft_plan plan)
+ {
+ if (plan->blueplan)
+ destroy_fftblue_plan(plan->blueplan);
+ if (plan->packplan)
+ destroy_rfftp_plan(plan->packplan);
+ DEALLOC(plan);
+ }
+
+WARN_UNUSED_RESULT static int rfft_backward(rfft_plan plan, double c[], double fct)
+ {
+ if (plan->packplan)
+ return rfftp_backward(plan->packplan,c,fct);
+ else // if (plan->blueplan)
+ return rfftblue_backward(plan->blueplan,c,fct);
+ }
+
+WARN_UNUSED_RESULT static int rfft_forward(rfft_plan plan, double c[], double fct)
+ {
+ if (plan->packplan)
+ return rfftp_forward(plan->packplan,c,fct);
+ else // if (plan->blueplan)
+ return rfftblue_forward(plan->blueplan,c,fct);
+ }
+
+#define NPY_NO_DEPRECATED_API NPY_API_VERSION
+
+#include "Python.h"
+#include "numpy/arrayobject.h"
+
+static PyObject *
+execute_complex(PyObject *a1, int is_forward, double fct)
+{
+ PyArrayObject *data = (PyArrayObject *)PyArray_FromAny(a1,
+ PyArray_DescrFromType(NPY_CDOUBLE), 1, 0,
+ NPY_ARRAY_ENSURECOPY | NPY_ARRAY_DEFAULT |
+ NPY_ARRAY_ENSUREARRAY | NPY_ARRAY_FORCECAST,
+ NULL);
+ if (!data) return NULL;
+
+ int npts = PyArray_DIM(data, PyArray_NDIM(data) - 1);
+ cfft_plan plan=NULL;
+
+ int nrepeats = PyArray_SIZE(data)/npts;
+ double *dptr = (double *)PyArray_DATA(data);
+ int fail=0;
+ Py_BEGIN_ALLOW_THREADS;
+ NPY_SIGINT_ON;
+ plan = make_cfft_plan(npts);
+ if (!plan) fail=1;
+ if (!fail)
+ for (int i = 0; i < nrepeats; i++) {
+ int res = is_forward ?
+ cfft_forward(plan, dptr, fct) : cfft_backward(plan, dptr, fct);
+ if (res!=0) { fail=1; break; }
+ dptr += npts*2;
+ }
+ if (plan) destroy_cfft_plan(plan);
+ NPY_SIGINT_OFF;
+ Py_END_ALLOW_THREADS;
+ if (fail) {
+ Py_XDECREF(data);
+ return PyErr_NoMemory();
+ }
+ return (PyObject *)data;
+}
+
+static PyObject *
+execute_real_forward(PyObject *a1, double fct)
+{
+ rfft_plan plan=NULL;
+ int fail = 0;
+ PyArrayObject *data = (PyArrayObject *)PyArray_FromAny(a1,
+ PyArray_DescrFromType(NPY_DOUBLE), 1, 0,
+ NPY_ARRAY_DEFAULT | NPY_ARRAY_ENSUREARRAY | NPY_ARRAY_FORCECAST,
+ NULL);
+ if (!data) return NULL;
+
+ int ndim = PyArray_NDIM(data);
+ const npy_intp *odim = PyArray_DIMS(data);
+ int npts = odim[ndim - 1];
+ npy_intp *tdim=(npy_intp *)malloc(ndim*sizeof(npy_intp));
+ if (!tdim)
+ { Py_XDECREF(data); return PyErr_NoMemory(); }
+ for (int d=0; d<ndim-1; ++d)
+ tdim[d] = odim[d];
+ tdim[ndim-1] = npts/2 + 1;
+ PyArrayObject *ret = (PyArrayObject *)PyArray_Empty(ndim,
+ tdim, PyArray_DescrFromType(NPY_CDOUBLE), 0);
+ free(tdim);
+ if (!ret) fail=1;
+ if (!fail) {
+ int rstep = PyArray_DIM(ret, PyArray_NDIM(ret) - 1)*2;
+
+ int nrepeats = PyArray_SIZE(data)/npts;
+ double *rptr = (double *)PyArray_DATA(ret),
+ *dptr = (double *)PyArray_DATA(data);
+
+ Py_BEGIN_ALLOW_THREADS;
+ NPY_SIGINT_ON;
+ plan = make_rfft_plan(npts);
+ if (!plan) fail=1;
+ if (!fail)
+ for (int i = 0; i < nrepeats; i++) {
+ rptr[rstep-1] = 0.0;
+ memcpy((char *)(rptr+1), dptr, npts*sizeof(double));
+ if (rfft_forward(plan, rptr+1, fct)!=0) {fail=1; break;}
+ rptr[0] = rptr[1];
+ rptr[1] = 0.0;
+ rptr += rstep;
+ dptr += npts;
+ }
+ if (plan) destroy_rfft_plan(plan);
+ NPY_SIGINT_OFF;
+ Py_END_ALLOW_THREADS;
+ }
+ if (fail) {
+ Py_XDECREF(data);
+ Py_XDECREF(ret);
+ return PyErr_NoMemory();
+ }
+ Py_DECREF(data);
+ return (PyObject *)ret;
+}
+static PyObject *
+execute_real_backward(PyObject *a1, double fct)
+{
+ rfft_plan plan=NULL;
+ PyArrayObject *data = (PyArrayObject *)PyArray_FromAny(a1,
+ PyArray_DescrFromType(NPY_CDOUBLE), 1, 0,
+ NPY_ARRAY_DEFAULT | NPY_ARRAY_ENSUREARRAY | NPY_ARRAY_FORCECAST,
+ NULL);
+ if (!data) return NULL;
+ int npts = PyArray_DIM(data, PyArray_NDIM(data) - 1);
+ PyArrayObject *ret = (PyArrayObject *)PyArray_Empty(PyArray_NDIM(data),
+ PyArray_DIMS(data), PyArray_DescrFromType(NPY_DOUBLE), 0);
+ int fail = 0;
+ if (!ret) fail=1;
+ if (!fail) {
+ int nrepeats = PyArray_SIZE(ret)/npts;
+ double *rptr = (double *)PyArray_DATA(ret),
+ *dptr = (double *)PyArray_DATA(data);
+
+ Py_BEGIN_ALLOW_THREADS;
+ NPY_SIGINT_ON;
+ plan = make_rfft_plan(npts);
+ if (!plan) fail=1;
+ if (!fail) {
+ for (int i = 0; i < nrepeats; i++) {
+ memcpy((char *)(rptr + 1), (dptr + 2), (npts - 1)*sizeof(double));
+ rptr[0] = dptr[0];
+ if (rfft_backward(plan, rptr, fct)!=0) {fail=1; break;}
+ rptr += npts;
+ dptr += npts*2;
+ }
+ }
+ if (plan) destroy_rfft_plan(plan);
+ NPY_SIGINT_OFF;
+ Py_END_ALLOW_THREADS;
+ }
+ if (fail) {
+ Py_XDECREF(data);
+ Py_XDECREF(ret);
+ return PyErr_NoMemory();
+ }
+ Py_DECREF(data);
+ return (PyObject *)ret;
+}
+
+static PyObject *
+execute_real(PyObject *a1, int is_forward, double fct)
+{
+ return is_forward ? execute_real_forward(a1, fct)
+ : execute_real_backward(a1, fct);
+}
+
+static const char execute__doc__[] = "";
+
+static PyObject *
+execute(PyObject *NPY_UNUSED(self), PyObject *args)
+{
+ PyObject *a1;
+ int is_real, is_forward;
+ double fct;
+
+ if(!PyArg_ParseTuple(args, "Oiid:execute", &a1, &is_real, &is_forward, &fct)) {
+ return NULL;
+ }
+
+ return is_real ? execute_real(a1, is_forward, fct)
+ : execute_complex(a1, is_forward, fct);
+}
+
+/* List of methods defined in the module */
+
+static struct PyMethodDef methods[] = {
+ {"execute", execute, METH_VARARGS, execute__doc__},
+ {NULL, NULL, 0, NULL} /* sentinel */
+};
+
+#if PY_MAJOR_VERSION >= 3
+static struct PyModuleDef moduledef = {
+ PyModuleDef_HEAD_INIT,
+ "_pocketfft_internal",
+ NULL,
+ -1,
+ methods,
+ NULL,
+ NULL,
+ NULL,
+ NULL
+};
+#endif
+
+/* Initialization function for the module */
+#if PY_MAJOR_VERSION >= 3
+#define RETVAL(x) x
+PyMODINIT_FUNC PyInit__pocketfft_internal(void)
+#else
+#define RETVAL(x)
+PyMODINIT_FUNC
+init_pocketfft_internal(void)
+#endif
+{
+ PyObject *m;
+#if PY_MAJOR_VERSION >= 3
+ m = PyModule_Create(&moduledef);
+#else
+ static const char module_documentation[] = "";
+
+ m = Py_InitModule4("_pocketfft_internal", methods,
+ module_documentation,
+ (PyObject*)NULL,PYTHON_API_VERSION);
+#endif
+ if (m == NULL) {
+ return RETVAL(NULL);
+ }
+
+ /* Import the array object */
+ import_array();
+
+ /* XXXX Add constants here */
+
+ return RETVAL(m);
+}
--- /dev/null
+"""
+Discrete Fourier Transforms
+
+Routines in this module:
+
+fft(a, n=None, axis=-1)
+ifft(a, n=None, axis=-1)
+rfft(a, n=None, axis=-1)
+irfft(a, n=None, axis=-1)
+hfft(a, n=None, axis=-1)
+ihfft(a, n=None, axis=-1)
+fftn(a, s=None, axes=None)
+ifftn(a, s=None, axes=None)
+rfftn(a, s=None, axes=None)
+irfftn(a, s=None, axes=None)
+fft2(a, s=None, axes=(-2,-1))
+ifft2(a, s=None, axes=(-2, -1))
+rfft2(a, s=None, axes=(-2,-1))
+irfft2(a, s=None, axes=(-2, -1))
+
+i = inverse transform
+r = transform of purely real data
+h = Hermite transform
+n = n-dimensional transform
+2 = 2-dimensional transform
+(Note: 2D routines are just nD routines with different default
+behavior.)
+
+"""
+from __future__ import division, absolute_import, print_function
+
+__all__ = ['fft', 'ifft', 'rfft', 'irfft', 'hfft', 'ihfft', 'rfftn',
+ 'irfftn', 'rfft2', 'irfft2', 'fft2', 'ifft2', 'fftn', 'ifftn']
+
+import functools
+
+from numpy.core import asarray, zeros, swapaxes, conjugate, take, sqrt
+from . import _pocketfft_internal as pfi
+from numpy.core.multiarray import normalize_axis_index
+from numpy.core import overrides
+
+
+array_function_dispatch = functools.partial(
+ overrides.array_function_dispatch, module='numpy.fft')
+
+
+# `inv_norm` is a float by which the result of the transform needs to be
+# divided. This replaces the original, more intuitive `fct` parameter to avoid
+# divisions by zero (or alternatively additional checks) in the case of
+# zero-length axes during its computation.
+def _raw_fft(a, n, axis, is_real, is_forward, inv_norm):
+ axis = normalize_axis_index(axis, a.ndim)
+ if n is None:
+ n = a.shape[axis]
+
+ if n < 1:
+ raise ValueError("Invalid number of FFT data points (%d) specified."
+ % n)
+
+ fct = 1/inv_norm
+
+ if a.shape[axis] != n:
+ s = list(a.shape)
+ if s[axis] > n:
+ index = [slice(None)]*len(s)
+ index[axis] = slice(0, n)
+ a = a[tuple(index)]
+ else:
+ index = [slice(None)]*len(s)
+ index[axis] = slice(0, s[axis])
+ s[axis] = n
+ z = zeros(s, a.dtype.char)
+ z[tuple(index)] = a
+ a = z
+
+ if axis == a.ndim-1:
+ r = pfi.execute(a, is_real, is_forward, fct)
+ else:
+ a = swapaxes(a, axis, -1)
+ r = pfi.execute(a, is_real, is_forward, fct)
+ r = swapaxes(r, axis, -1)
+ return r
+
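The crop-or-pad step in `_raw_fft` can be illustrated on a plain one-dimensional list. This is a minimal sketch with a hypothetical helper name; the real function works along an arbitrary axis of an ndarray.

```python
def fit_to_length(values, n):
    """Crop `values` to n items, or append zeros until it has n items."""
    if n < 1:
        raise ValueError("Invalid number of FFT data points (%d) specified." % n)
    if len(values) >= n:
        # Longer input: keep only the first n samples.
        return list(values[:n])
    # Shorter input: zeros are appended at the end.
    return list(values) + [0] * (n - len(values))
```

Note that padding always happens at the end of the axis, which is why the `ifft` notes recommend padding manually when a different placement is wanted.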
+
+def _unitary(norm):
+ if norm is None:
+ return False
+ if norm=="ortho":
+ return True
+ raise ValueError("Invalid norm value %s, should be None or \"ortho\"."
+ % norm)
+
+
+def _fft_dispatcher(a, n=None, axis=None, norm=None):
+ return (a,)
+
+
+@array_function_dispatch(_fft_dispatcher)
+def fft(a, n=None, axis=-1, norm=None):
+ """
+ Compute the one-dimensional discrete Fourier Transform.
+
+ This function computes the one-dimensional *n*-point discrete Fourier
+ Transform (DFT) with the efficient Fast Fourier Transform (FFT)
+ algorithm [CT].
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, can be complex.
+ n : int, optional
+ Length of the transformed axis of the output.
+ If `n` is smaller than the length of the input, the input is cropped.
+ If it is larger, the input is padded with zeros. If `n` is not given,
+ the length of the input along the axis specified by `axis` is used.
+ axis : int, optional
+ Axis over which to compute the FFT. If not given, the last axis is
+ used.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axis
+ indicated by `axis`, or the last one if `axis` is not specified.
+
+ Raises
+ ------
+ IndexError
+ If `axis` is larger than the last axis of `a`.
+
+ See Also
+ --------
+ numpy.fft : for definition of the DFT and conventions used.
+ ifft : The inverse of `fft`.
+ fft2 : The two-dimensional FFT.
+ fftn : The *n*-dimensional FFT.
+ rfftn : The *n*-dimensional FFT of real input.
+ fftfreq : Frequency bins for given FFT parameters.
+
+ Notes
+ -----
+ FFT (Fast Fourier Transform) refers to a way the discrete Fourier
+ Transform (DFT) can be calculated efficiently, by using symmetries in the
+ calculated terms. The symmetry is highest when `n` is a power of 2, and
+ the transform is therefore most efficient for these sizes.
+
+ The DFT is defined, with the conventions used in this implementation, in
+ the documentation for the `numpy.fft` module.
+
+ References
+ ----------
+ .. [CT] Cooley, James W., and John W. Tukey, 1965, "An algorithm for the
+ machine calculation of complex Fourier series," *Math. Comput.*
+ 19: 297-301.
+
+ Examples
+ --------
+ >>> np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
+ array([-2.33486982e-16+1.14423775e-17j, 8.00000000e+00-1.25557246e-15j,
+ 2.33486982e-16+2.33486982e-16j, 0.00000000e+00+1.22464680e-16j,
+ -1.14423775e-17+2.33486982e-16j, 0.00000000e+00+5.20784380e-16j,
+ 1.14423775e-17+1.14423775e-17j, 0.00000000e+00+1.22464680e-16j])
+
+ In this example, real input has an FFT which is Hermitian, i.e., symmetric
+ in the real part and anti-symmetric in the imaginary part, as described in
+ the `numpy.fft` documentation:
+
+ >>> import matplotlib.pyplot as plt
+ >>> t = np.arange(256)
+ >>> sp = np.fft.fft(np.sin(t))
+ >>> freq = np.fft.fftfreq(t.shape[-1])
+ >>> plt.plot(freq, sp.real, freq, sp.imag)
+ [<matplotlib.lines.Line2D object at 0x...>, <matplotlib.lines.Line2D object at 0x...>]
+ >>> plt.show()
+
+ """
+
+ a = asarray(a)
+ if n is None:
+ n = a.shape[axis]
+ inv_norm = 1
+ if norm is not None and _unitary(norm):
+ inv_norm = sqrt(n)
+ output = _raw_fft(a, n, axis, False, True, inv_norm)
+ return output
+
+
+@array_function_dispatch(_fft_dispatcher)
+def ifft(a, n=None, axis=-1, norm=None):
+ """
+ Compute the one-dimensional inverse discrete Fourier Transform.
+
+ This function computes the inverse of the one-dimensional *n*-point
+ discrete Fourier transform computed by `fft`. In other words,
+ ``ifft(fft(a)) == a`` to within numerical accuracy.
+ For a general description of the algorithm and definitions,
+ see `numpy.fft`.
+
+ The input should be ordered in the same way as is returned by `fft`,
+ i.e.,
+
+ * ``a[0]`` should contain the zero frequency term,
+ * ``a[1:n//2]`` should contain the positive-frequency terms,
+ * ``a[n//2 + 1:]`` should contain the negative-frequency terms, in
+ increasing order starting from the most negative frequency.
+
+ For an even number of input points, ``a[n//2]`` represents the sum of
+ the values at the positive and negative Nyquist frequencies, as the two
+ are aliased together. See `numpy.fft` for details.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, can be complex.
+ n : int, optional
+ Length of the transformed axis of the output.
+ If `n` is smaller than the length of the input, the input is cropped.
+ If it is larger, the input is padded with zeros. If `n` is not given,
+ the length of the input along the axis specified by `axis` is used.
+ See notes about padding issues.
+ axis : int, optional
+ Axis over which to compute the inverse DFT. If not given, the last
+ axis is used.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axis
+ indicated by `axis`, or the last one if `axis` is not specified.
+
+ Raises
+ ------
+ IndexError
+ If `axis` is larger than the last axis of `a`.
+
+ See Also
+ --------
+ numpy.fft : An introduction, with definitions and general explanations.
+ fft : The one-dimensional (forward) FFT, of which `ifft` is the inverse.
+ ifft2 : The two-dimensional inverse FFT.
+ ifftn : The n-dimensional inverse FFT.
+
+ Notes
+ -----
+ If the input parameter `n` is larger than the size of the input, the input
+ is padded by appending zeros at the end. Even though this is the common
+ approach, it might lead to surprising results. If a different padding is
+ desired, it must be performed before calling `ifft`.
+
+ Examples
+ --------
+ >>> np.fft.ifft([0, 4, 0, 0])
+ array([ 1.+0.j, 0.+1.j, -1.+0.j, 0.-1.j]) # may vary
+
+ Create and plot a band-limited signal with random phases:
+
+ >>> import matplotlib.pyplot as plt
+ >>> t = np.arange(400)
+ >>> n = np.zeros((400,), dtype=complex)
+ >>> n[40:60] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20,)))
+ >>> s = np.fft.ifft(n)
+ >>> plt.plot(t, s.real, 'b-', t, s.imag, 'r--')
+ [<matplotlib.lines.Line2D object at ...>, <matplotlib.lines.Line2D object at ...>]
+ >>> plt.legend(('real', 'imaginary'))
+ <matplotlib.legend.Legend object at ...>
+ >>> plt.show()
+
+ """
+ a = asarray(a)
+ if n is None:
+ n = a.shape[axis]
+ if norm is not None and _unitary(norm):
+ inv_norm = sqrt(max(n, 1))
+ else:
+ inv_norm = n
+ output = _raw_fft(a, n, axis, False, False, inv_norm)
+ return output
+
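The `inv_norm` values chosen by `fft` and `ifft` can be checked with a direct DFT in pure Python. This sketch assumes one-dimensional input and uses hypothetical names (`forward`, `inverse`); it mirrors only the scaling logic, not the pocketfft backend.

```python
import cmath
import math


def _dft(x, sign):
    """Direct O(n^2) transform with exponent sign -1 (forward) or +1 (inverse)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for j in range(n))
            for k in range(n)]


def forward(x, norm=None):
    # Default: unscaled. "ortho": divide by sqrt(n).
    inv_norm = math.sqrt(len(x)) if norm == "ortho" else 1
    return [v / inv_norm for v in _dft(x, -1)]


def inverse(x, norm=None):
    # Default: divide by n. "ortho": divide by sqrt(n).
    inv_norm = math.sqrt(len(x)) if norm == "ortho" else len(x)
    return [v / inv_norm for v in _dft(x, +1)]
```

With either setting, `inverse(forward(x, norm), norm)` returns `x` to within rounding error, because the two scale factors always multiply to `1/n`.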
+
+
+@array_function_dispatch(_fft_dispatcher)
+def rfft(a, n=None, axis=-1, norm=None):
+ """
+ Compute the one-dimensional discrete Fourier Transform for real input.
+
+ This function computes the one-dimensional *n*-point discrete Fourier
+ Transform (DFT) of a real-valued array by means of an efficient algorithm
+ called the Fast Fourier Transform (FFT).
+
+ Parameters
+ ----------
+ a : array_like
+ Input array
+ n : int, optional
+ Number of points along transformation axis in the input to use.
+ If `n` is smaller than the length of the input, the input is cropped.
+ If it is larger, the input is padded with zeros. If `n` is not given,
+ the length of the input along the axis specified by `axis` is used.
+ axis : int, optional
+ Axis over which to compute the FFT. If not given, the last axis is
+ used.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axis
+ indicated by `axis`, or the last one if `axis` is not specified.
+ If `n` is even, the length of the transformed axis is ``(n/2)+1``.
+ If `n` is odd, the length is ``(n+1)/2``.
+
+ Raises
+ ------
+ IndexError
+ If `axis` is larger than the last axis of `a`.
+
+ See Also
+ --------
+ numpy.fft : For definition of the DFT and conventions used.
+ irfft : The inverse of `rfft`.
+ fft : The one-dimensional FFT of general (complex) input.
+ fftn : The *n*-dimensional FFT.
+ rfftn : The *n*-dimensional FFT of real input.
+
+ Notes
+ -----
+ When the DFT is computed for purely real input, the output is
+ Hermitian-symmetric, i.e. the negative frequency terms are just the complex
+ conjugates of the corresponding positive-frequency terms, and the
+ negative-frequency terms are therefore redundant. This function does not
+ compute the negative frequency terms, and the length of the transformed
+ axis of the output is therefore ``n//2 + 1``.
+
+ When ``A = rfft(a)`` and fs is the sampling frequency, ``A[0]`` contains
+ the zero-frequency term 0*fs, which is real due to Hermitian symmetry.
+
+ If `n` is even, ``A[-1]`` contains the term representing both positive
+ and negative Nyquist frequency (+fs/2 and -fs/2), and must also be purely
+ real. If `n` is odd, there is no term at fs/2; ``A[-1]`` contains
+ the largest positive frequency (fs/2*(n-1)/n), and is complex in the
+ general case.
+
+ If the input `a` contains an imaginary part, it is silently discarded.
+
+ Examples
+ --------
+ >>> np.fft.fft([0, 1, 0, 0])
+ array([ 1.+0.j, 0.-1.j, -1.+0.j, 0.+1.j]) # may vary
+ >>> np.fft.rfft([0, 1, 0, 0])
+ array([ 1.+0.j, 0.-1.j, -1.+0.j]) # may vary
+
+ Notice how the final element of the `fft` output is the complex conjugate
+ of the second element, for real input. For `rfft`, this symmetry is
+ exploited to compute only the non-negative frequency terms.
+
+ """
+    a = asarray(a)
+    inv_norm = 1
+    if norm is not None and _unitary(norm):
+        if n is None:
+            n = a.shape[axis]
+        inv_norm = sqrt(n)
+    output = _raw_fft(a, n, axis, True, True, inv_norm)
+    return output
+
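The relationship between `fft` and `rfft` described in the notes can be illustrated with a short check. This is a sketch only: the helper `compare_fft_rfft` and the random test signals are assumptions introduced for illustration, not part of NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)


def compare_fft_rfft(n):
    """Return (full, half) spectra of a random real signal of length n."""
    x = rng.standard_normal(n)
    return np.fft.fft(x), np.fft.rfft(x)


full_even, half_even = compare_fft_rfft(8)
full_odd, half_odd = compare_fft_rfft(9)

# rfft keeps the first n//2 + 1 terms of the full transform.
print(half_even.shape, half_odd.shape)
print(np.allclose(half_even, full_even[:5]), np.allclose(half_odd, full_odd[:5]))

# For n = 8, the discarded terms (indices 5, 6, 7) are the conjugates of the
# retained positive-frequency terms (indices 3, 2, 1).
print(np.allclose(full_even[5:], np.conj(half_even[1:4][::-1])))

# For even n the last retained term is the Nyquist term, which is real.
print(abs(half_even[-1].imag) < 1e-12)
```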
+
+@array_function_dispatch(_fft_dispatcher)
+def irfft(a, n=None, axis=-1, norm=None):
+ """
+ Compute the inverse of the n-point DFT for real input.
+
+ This function computes the inverse of the one-dimensional *n*-point
+ discrete Fourier Transform of real input computed by `rfft`.
+ In other words, ``irfft(rfft(a), len(a)) == a`` to within numerical
+ accuracy. (See Notes below for why ``len(a)`` is necessary here.)
+
+ The input is expected to be in the form returned by `rfft`, i.e. the
+ real zero-frequency term followed by the complex positive frequency terms
+ in order of increasing frequency. Since the discrete Fourier Transform of
+ real input is Hermitian-symmetric, the negative frequency terms are taken
+ to be the complex conjugates of the corresponding positive frequency terms.
+
+ Parameters
+ ----------
+ a : array_like
+ The input array.
+ n : int, optional
+ Length of the transformed axis of the output.
+ For `n` output points, ``n//2+1`` input points are necessary. If the
+ input is longer than this, it is cropped. If it is shorter than this,
+ it is padded with zeros. If `n` is not given, it is taken to be
+ ``2*(m-1)`` where ``m`` is the length of the input along the axis
+ specified by `axis`.
+ axis : int, optional
+ Axis over which to compute the inverse FFT. If not given, the last
+ axis is used.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : ndarray
+ The truncated or zero-padded input, transformed along the axis
+ indicated by `axis`, or the last one if `axis` is not specified.
+ The length of the transformed axis is `n`, or, if `n` is not given,
+ ``2*(m-1)`` where ``m`` is the length of the transformed axis of the
+ input. To get an odd number of output points, `n` must be specified.
+
+ Raises
+ ------
+ IndexError
+ If `axis` is larger than the last axis of `a`.
+
+ See Also
+ --------
+ numpy.fft : For definition of the DFT and conventions used.
+ rfft : The one-dimensional FFT of real input, of which `irfft` is inverse.
+ fft : The one-dimensional FFT.
+ irfft2 : The inverse of the two-dimensional FFT of real input.
+ irfftn : The inverse of the *n*-dimensional FFT of real input.
+
+ Notes
+ -----
+ Returns the real valued `n`-point inverse discrete Fourier transform
+ of `a`, where `a` contains the non-negative frequency terms of a
+ Hermitian-symmetric sequence. `n` is the length of the result, not the
+ input.
+
+ If you specify an `n` such that `a` must be zero-padded or truncated, the
+ extra/removed values will be added/removed at high frequencies. One can
+ thus resample a series to `m` points via Fourier interpolation by:
+ ``a_resamp = irfft(rfft(a), m)``.
+
+ The correct interpretation of the Hermitian input depends on the length of
+ the original data, as given by `n`. This is because each input shape could
+ correspond to either an odd or even length signal. By default, `irfft`
+ assumes an even output length, which puts the last entry at the Nyquist
+ frequency, where it aliases with its symmetric counterpart. By Hermitian
+ symmetry, the value is thus treated as purely real. To avoid losing
+ information, the correct length of the real input **must** be given.
+
+ Examples
+ --------
+ >>> np.fft.ifft([1, -1j, -1, 1j])
+ array([0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]) # may vary
+ >>> np.fft.irfft([1, -1j, -1])
+ array([0., 1., 0., 0.])
+
+ Notice how the last term in the input to the ordinary `ifft` is the
+ complex conjugate of the second term, and the output has zero imaginary
+ part everywhere. When calling `irfft`, the negative frequencies are not
+ specified, and the output array is purely real.
+
+ """
+    a = asarray(a)
+    if n is None:
+        n = (a.shape[axis] - 1) * 2
+    inv_norm = n
+    if norm is not None and _unitary(norm):
+        inv_norm = sqrt(n)
+    output = _raw_fft(a, n, axis, True, False, inv_norm)
+    return output
+
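The need to pass `n` for odd-length signals, and the Fourier resampling idiom from the `irfft` notes, can be sketched as follows. The arrays are made-up test data. The rescaling step is an observation about the default normalization (the inverse divides by the output length), not something the docstring states.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])  # odd length: 5 samples
X = np.fft.rfft(x)                        # 5//2 + 1 = 3 complex terms

default = np.fft.irfft(X)                 # assumes n = 2*(3 - 1) = 4
explicit = np.fft.irfft(X, n=len(x))      # n = 5 recovers the original

print(default.shape, explicit.shape)
print(np.allclose(explicit, x))

# Fourier resampling of a constant signal from 4 to 8 points. Under the
# default normalization the result is scaled by len(a)/m, so multiply by
# m/len(a) if the original amplitude should be preserved.
a = np.ones(4)
m = 8
resampled = np.fft.irfft(np.fft.rfft(a), m)
rescaled = resampled * (m / a.size)
print(resampled[0], rescaled[0])
```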
+
+@array_function_dispatch(_fft_dispatcher)
+def hfft(a, n=None, axis=-1, norm=None):
+ """
+ Compute the FFT of a signal that has Hermitian symmetry, i.e., a real
+ spectrum.
+
+ Parameters
+ ----------
+ a : array_like
+ The input array.
+ n : int, optional
+ Length of the transformed axis of the output. For `n` output
+ points, ``n//2 + 1`` input points are necessary. If the input is
+ longer than this, it is cropped. If it is shorter than this, it is
+ padded with zeros. If `n` is not given, it is taken to be ``2*(m-1)``
+ where ``m`` is the length of the input along the axis specified by
+ `axis`.
+ axis : int, optional
+ Axis over which to compute the FFT. If not given, the last
+ axis is used.
+ norm : {None, "ortho"}, optional
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ .. versionadded:: 1.10.0
+
+ Returns
+ -------
+ out : ndarray
+ The truncated or zero-padded input, transformed along the axis
+ indicated by `axis`, or the last one if `axis` is not specified.
+ The length of the transformed axis is `n`, or, if `n` is not given,
+ ``2*m - 2`` where ``m`` is the length of the transformed axis of
+ the input. To get an odd number of output points, `n` must be
+ specified, for instance as ``2*m - 1`` in the typical case.
+
+ Raises
+ ------
+ IndexError
+ If `axis` is larger than the last axis of `a`.
+
+ See Also
+ --------
+ rfft : Compute the one-dimensional FFT for real input.
+ ihfft : The inverse of `hfft`.
+
+ Notes
+ -----
+ `hfft`/`ihfft` are a pair analogous to `rfft`/`irfft`, but for the
+ opposite case: here the signal has Hermitian symmetry in the time
+ domain and is real in the frequency domain. So here it's `hfft` for
+ which you must supply the length of the result if it is to be odd.
+
+ * even: ``ihfft(hfft(a, 2*len(a) - 2)) == a``, within roundoff error,
+ * odd: ``ihfft(hfft(a, 2*len(a) - 1)) == a``, within roundoff error.
+
+ The correct interpretation of the Hermitian input depends on the length of
+ the original data, as given by `n`. This is because each input shape could
+ correspond to either an odd or even length signal. By default, `hfft`
+ assumes an even output length, which puts the last entry at the Nyquist
+ frequency, where it aliases with its symmetric counterpart. By Hermitian
+ symmetry, the value is thus treated as purely real. To avoid losing
+ information, the shape of the full signal **must** be given.
+
+ Examples
+ --------
+ >>> signal = np.array([1, 2, 3, 4, 3, 2])
+ >>> np.fft.fft(signal)
+ array([15.+0.j, -4.+0.j, 0.+0.j, -1.-0.j, 0.+0.j, -4.+0.j]) # may vary
+ >>> np.fft.hfft(signal[:4]) # Input first half of signal
+ array([15., -4., 0., -1., 0., -4.])
+ >>> np.fft.hfft(signal, 6) # Input entire signal and truncate
+ array([15., -4., 0., -1., 0., -4.])
+
+
+ >>> signal = np.array([[1, 1.j], [-1.j, 2]])
+ >>> np.conj(signal.T) - signal # check Hermitian symmetry
+ array([[ 0.-0.j, -0.+0.j], # may vary
+ [ 0.+0.j, 0.-0.j]])
+ >>> freq_spectrum = np.fft.hfft(signal)
+ >>> freq_spectrum
+ array([[ 1., 1.],
+ [ 2., -2.]])
+
+ """
+ a = asarray(a)
+ if n is None:
+ n = (a.shape[axis] - 1) * 2
+ unitary = _unitary(norm)
+ return irfft(conjugate(a), n, axis) * (sqrt(n) if unitary else n)
+
+
+@array_function_dispatch(_fft_dispatcher)
+def ihfft(a, n=None, axis=-1, norm=None):
+ """
+ Compute the inverse FFT of a signal that has Hermitian symmetry.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array.
+ n : int, optional
+ Length of the inverse FFT, the number of points along
+ transformation axis in the input to use. If `n` is smaller than
+ the length of the input, the input is cropped. If it is larger,
+ the input is padded with zeros. If `n` is not given, the length of
+ the input along the axis specified by `axis` is used.
+ axis : int, optional
+ Axis over which to compute the inverse FFT. If not given, the last
+ axis is used.
+ norm : {None, "ortho"}, optional
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ .. versionadded:: 1.10.0
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axis
+ indicated by `axis`, or the last one if `axis` is not specified.
+ The length of the transformed axis is ``n//2 + 1``.
+
+ See Also
+ --------
+ hfft, irfft
+
+ Notes
+ -----
+ `hfft`/`ihfft` are a pair analogous to `rfft`/`irfft`, but for the
+ opposite case: here the signal has Hermitian symmetry in the time
+ domain and is real in the frequency domain. So here it's `hfft` for
+ which you must supply the length of the result if it is to be odd:
+
+ * even: ``ihfft(hfft(a, 2*len(a) - 2)) == a``, within roundoff error,
+ * odd: ``ihfft(hfft(a, 2*len(a) - 1)) == a``, within roundoff error.
+
+ Examples
+ --------
+ >>> spectrum = np.array([ 15, -4, 0, -1, 0, -4])
+ >>> np.fft.ifft(spectrum)
+ array([1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 3.+0.j, 2.+0.j]) # may vary
+ >>> np.fft.ihfft(spectrum)
+ array([ 1.-0.j, 2.-0.j, 3.-0.j, 4.-0.j]) # may vary
+
+ """
+ a = asarray(a)
+ if n is None:
+ n = a.shape[axis]
+ unitary = _unitary(norm)
+ output = conjugate(rfft(a, n, axis))
+ return output * (1 / (sqrt(n) if unitary else n))
+
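The even and odd round trips listed in the `hfft`/`ihfft` notes can be tried on a small example. This is a sketch under the assumption that `a` holds the non-negative-index half of a Hermitian-symmetric signal, so `a[0]` is real; the values are arbitrary.

```python
import numpy as np

# Half of a Hermitian-symmetric signal; the last entry is deliberately complex.
a = np.array([1.0, 2 + 1j, 3 - 1j, 4 + 2j])

# Odd output length: every entry of `a` survives the round trip.
odd_back = np.fft.ihfft(np.fft.hfft(a, 2 * len(a) - 1))

# Even output length (the default): the last entry lands on the Nyquist
# position, so only its real part is kept.
even_back = np.fft.ihfft(np.fft.hfft(a, 2 * len(a) - 2))

print(np.allclose(odd_back, a))
print(np.allclose(even_back[:-1], a[:-1]), even_back[-1])
```

The even case reproduces `a` exactly only when its last entry is already real, as in the `signal[:4]` example in the `hfft` docstring.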
+
+def _cook_nd_args(a, s=None, axes=None, invreal=0):
+    if s is None:
+        shapeless = 1
+        if axes is None:
+            s = list(a.shape)
+        else:
+            s = take(a.shape, axes)
+    else:
+        shapeless = 0
+    s = list(s)
+    if axes is None:
+        axes = list(range(-len(s), 0))
+    if len(s) != len(axes):
+        raise ValueError("Shape and axes have different lengths.")
+    if invreal and shapeless:
+        s[-1] = (a.shape[axes[-1]] - 1) * 2
+    return s, axes
+
+
+def _raw_fftnd(a, s=None, axes=None, function=fft, norm=None):
+    a = asarray(a)
+    s, axes = _cook_nd_args(a, s, axes)
+    itl = list(range(len(axes)))
+    itl.reverse()
+    for ii in itl:
+        a = function(a, n=s[ii], axis=axes[ii], norm=norm)
+    return a
+
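`_raw_fftnd` builds the N-dimensional transform by applying the one-dimensional `function` along each requested axis in turn, starting from the last. A minimal sketch of the same idea using only the public API (the array and loop variable are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 4, 5))

# Apply the 1-D FFT along each axis in turn, last axis first.
step = a
for axis in reversed(range(a.ndim)):
    step = np.fft.fft(step, axis=axis)

print(np.allclose(step, np.fft.fftn(a)))
```

Because the multi-dimensional DFT is separable, the order in which the axes are processed does not change the result beyond rounding.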
+
+def _fftn_dispatcher(a, s=None, axes=None, norm=None):
+ return (a,)
+
+
+@array_function_dispatch(_fftn_dispatcher)
+def fftn(a, s=None, axes=None, norm=None):
+ """
+ Compute the N-dimensional discrete Fourier Transform.
+
+ This function computes the *N*-dimensional discrete Fourier Transform over
+ any number of axes in an *M*-dimensional array by means of the Fast Fourier
+ Transform (FFT).
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, can be complex.
+ s : sequence of ints, optional
+ Shape (length of each transformed axis) of the output
+ (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
+ This corresponds to ``n`` for ``fft(x, n)``.
+ Along any axis, if the given shape is smaller than that of the input,
+ the input is cropped. If it is larger, the input is padded with zeros.
+ If `s` is not given, the shape of the input along the axes specified
+ by `axes` is used.
+ axes : sequence of ints, optional
+ Axes over which to compute the FFT. If not given, the last ``len(s)``
+ axes are used, or all axes if `s` is also not specified.
+ Repeated indices in `axes` mean that the transform over that axis is
+ performed multiple times.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axes
+ indicated by `axes`, or by a combination of `s` and `a`,
+ as explained in the parameters section above.
+
+ Raises
+ ------
+ ValueError
+ If `s` and `axes` have different length.
+ IndexError
+ If an element of `axes` is larger than the number of axes of `a`.
+
+ See Also
+ --------
+ numpy.fft : Overall view of discrete Fourier transforms, with definitions
+ and conventions used.
+ ifftn : The inverse of `fftn`, the inverse *n*-dimensional FFT.
+ fft : The one-dimensional FFT, with definitions and conventions used.
+ rfftn : The *n*-dimensional FFT of real input.
+ fft2 : The two-dimensional FFT.
+ fftshift : Shifts zero-frequency terms to the centre of the array.
+
+ Notes
+ -----
+ The output, analogously to `fft`, contains the term for zero frequency in
+ the low-order corner of all axes, the positive frequency terms in the
+ first half of all axes, the term for the Nyquist frequency in the middle
+ of all axes and the negative frequency terms in the second half of all
+ axes, in order of decreasingly negative frequency.
+
+ See `numpy.fft` for details, definitions and conventions used.
+
+ Examples
+ --------
+ >>> a = np.mgrid[:3, :3, :3][0]
+ >>> np.fft.fftn(a, axes=(1, 2))
+ array([[[ 0.+0.j, 0.+0.j, 0.+0.j], # may vary
+ [ 0.+0.j, 0.+0.j, 0.+0.j],
+ [ 0.+0.j, 0.+0.j, 0.+0.j]],
+ [[ 9.+0.j, 0.+0.j, 0.+0.j],
+ [ 0.+0.j, 0.+0.j, 0.+0.j],
+ [ 0.+0.j, 0.+0.j, 0.+0.j]],
+ [[18.+0.j, 0.+0.j, 0.+0.j],
+ [ 0.+0.j, 0.+0.j, 0.+0.j],
+ [ 0.+0.j, 0.+0.j, 0.+0.j]]])
+ >>> np.fft.fftn(a, (2, 2), axes=(0, 1))
+ array([[[ 2.+0.j, 2.+0.j, 2.+0.j], # may vary
+ [ 0.+0.j, 0.+0.j, 0.+0.j]],
+ [[-2.+0.j, -2.+0.j, -2.+0.j],
+ [ 0.+0.j, 0.+0.j, 0.+0.j]]])
+
+ >>> import matplotlib.pyplot as plt
+ >>> [X, Y] = np.meshgrid(2 * np.pi * np.arange(200) / 12,
+ ... 2 * np.pi * np.arange(200) / 34)
+ >>> S = np.sin(X) + np.cos(Y) + np.random.uniform(0, 1, X.shape)
+ >>> FS = np.fft.fftn(S)
+ >>> plt.imshow(np.log(np.abs(np.fft.fftshift(FS))**2))
+ <matplotlib.image.AxesImage object at 0x...>
+ >>> plt.show()
+
+ """
+
+ return _raw_fftnd(a, s, axes, fft, norm)
+
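The effect of the `norm` parameter shared by these functions can be illustrated in one dimension. The following sketch assumes the convention implemented by the code in this file, where `norm="ortho"` uses a `sqrt(n)` factor in both directions; the signal is random test data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)

X_default = np.fft.fft(x)
X_ortho = np.fft.fft(x, norm="ortho")

# With norm="ortho" the forward transform is scaled by 1/sqrt(n).
scale_ok = np.allclose(X_ortho, X_default / np.sqrt(x.size))

# The orthonormal transform preserves the signal energy (Parseval).
energy_time = np.sum(np.abs(x) ** 2)
energy_freq = np.sum(np.abs(X_ortho) ** 2)

# The inverse with the same norm restores the input.
roundtrip = np.fft.ifft(X_ortho, norm="ortho")

print(scale_ok, np.isclose(energy_time, energy_freq), np.allclose(roundtrip, x))
```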
+
+@array_function_dispatch(_fftn_dispatcher)
+def ifftn(a, s=None, axes=None, norm=None):
+ """
+ Compute the N-dimensional inverse discrete Fourier Transform.
+
+ This function computes the inverse of the N-dimensional discrete
+ Fourier Transform over any number of axes in an M-dimensional array by
+ means of the Fast Fourier Transform (FFT). In other words,
+ ``ifftn(fftn(a)) == a`` to within numerical accuracy.
+ For a description of the definitions and conventions used, see `numpy.fft`.
+
+ The input, analogously to `ifft`, should be ordered in the same way as is
+ returned by `fftn`, i.e. it should have the term for zero frequency
+ in all axes in the low-order corner, the positive frequency terms in the
+ first half of all axes, the term for the Nyquist frequency in the middle
+ of all axes and the negative frequency terms in the second half of all
+ axes, in order of decreasingly negative frequency.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, can be complex.
+ s : sequence of ints, optional
+ Shape (length of each transformed axis) of the output
+ (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
+ This corresponds to ``n`` for ``ifft(x, n)``.
+ Along any axis, if the given shape is smaller than that of the input,
+ the input is cropped. If it is larger, the input is padded with zeros.
+ If `s` is not given, the shape of the input along the axes specified
+ by `axes` is used. See notes for issue on `ifft` zero padding.
+ axes : sequence of ints, optional
+ Axes over which to compute the IFFT. If not given, the last ``len(s)``
+ axes are used, or all axes if `s` is also not specified.
+ Repeated indices in `axes` mean that the inverse transform over that
+ axis is performed multiple times.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axes
+ indicated by `axes`, or by a combination of `s` and `a`,
+ as explained in the parameters section above.
+
+ Raises
+ ------
+ ValueError
+ If `s` and `axes` have different length.
+ IndexError
+ If an element of `axes` is larger than the number of axes of `a`.
+
+ See Also
+ --------
+ numpy.fft : Overall view of discrete Fourier transforms, with definitions
+ and conventions used.
+ fftn : The forward *n*-dimensional FFT, of which `ifftn` is the inverse.
+ ifft : The one-dimensional inverse FFT.
+ ifft2 : The two-dimensional inverse FFT.
+ ifftshift : Undoes `fftshift`, shifts zero-frequency terms to beginning
+ of array.
+
+ Notes
+ -----
+ See `numpy.fft` for definitions and conventions used.
+
+ Zero-padding, analogously with `ifft`, is performed by appending zeros to
+ the input along the specified dimension. Although this is the common
+ approach, it might lead to surprising results. If another form of zero
+ padding is desired, it must be performed before `ifftn` is called.
+
+ Examples
+ --------
+ >>> a = np.eye(4)
+ >>> np.fft.ifftn(np.fft.fftn(a, axes=(0,)), axes=(1,))
+ array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
+ [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
+ [0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
+ [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]])
+
+
+ Create and plot an image with band-limited frequency content:
+
+ >>> import matplotlib.pyplot as plt
+ >>> n = np.zeros((200,200), dtype=complex)
+ >>> n[60:80, 20:40] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20, 20)))
+ >>> im = np.fft.ifftn(n).real
+ >>> plt.imshow(im)
+ <matplotlib.image.AxesImage object at 0x...>
+ >>> plt.show()
+
+ """
+
+ return _raw_fftnd(a, s, axes, ifft, norm)
+
+
+@array_function_dispatch(_fftn_dispatcher)
+def fft2(a, s=None, axes=(-2, -1), norm=None):
+ """
+ Compute the 2-dimensional discrete Fourier Transform.
+
+ This function computes the *n*-dimensional discrete Fourier Transform
+ over any axes in an *M*-dimensional array by means of the
+ Fast Fourier Transform (FFT). By default, the transform is computed over
+ the last two axes of the input array, i.e., a 2-dimensional FFT.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, can be complex.
+ s : sequence of ints, optional
+ Shape (length of each transformed axis) of the output
+ (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
+ This corresponds to ``n`` for ``fft(x, n)``.
+ Along each axis, if the given shape is smaller than that of the input,
+ the input is cropped. If it is larger, the input is padded with zeros.
+ If `s` is not given, the shape of the input along the axes specified
+ by `axes` is used.
+ axes : sequence of ints, optional
+ Axes over which to compute the FFT. If not given, the last two
+ axes are used. A repeated index in `axes` means the transform over
+ that axis is performed multiple times. A one-element sequence means
+ that a one-dimensional FFT is performed.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axes
+ indicated by `axes`, or the last two axes if `axes` is not given.
+
+ Raises
+ ------
+ ValueError
+ If `s` and `axes` have different length, or `axes` not given and
+ ``len(s) != 2``.
+ IndexError
+ If an element of `axes` is larger than the number of axes of `a`.
+
+ See Also
+ --------
+ numpy.fft : Overall view of discrete Fourier transforms, with definitions
+ and conventions used.
+ ifft2 : The inverse two-dimensional FFT.
+ fft : The one-dimensional FFT.
+ fftn : The *n*-dimensional FFT.
+ fftshift : Shifts zero-frequency terms to the center of the array.
+ For two-dimensional input, swaps first and third quadrants, and second
+ and fourth quadrants.
+
+ Notes
+ -----
+ `fft2` is just `fftn` with a different default for `axes`.
+
+ The output, analogously to `fft`, contains the term for zero frequency in
+ the low-order corner of the transformed axes, the positive frequency terms
+ in the first half of these axes, the term for the Nyquist frequency in the
+ middle of the axes and the negative frequency terms in the second half of
+ the axes, in order of decreasingly negative frequency.
+
+ See `fftn` for details and a plotting example, and `numpy.fft` for
+ definitions and conventions used.
+
+
+ Examples
+ --------
+ >>> a = np.mgrid[:5, :5][0]
+ >>> np.fft.fft2(a)
+ array([[ 50. +0.j , 0. +0.j , 0. +0.j , # may vary
+ 0. +0.j , 0. +0.j ],
+ [-12.5+17.20477401j, 0. +0.j , 0. +0.j ,
+ 0. +0.j , 0. +0.j ],
+ [-12.5 +4.0614962j , 0. +0.j , 0. +0.j ,
+ 0. +0.j , 0. +0.j ],
+ [-12.5 -4.0614962j , 0. +0.j , 0. +0.j ,
+ 0. +0.j , 0. +0.j ],
+ [-12.5-17.20477401j, 0. +0.j , 0. +0.j ,
+ 0. +0.j , 0. +0.j ]])
+
+ """
+
+ return _raw_fftnd(a, s, axes, fft, norm)
+
+
+@array_function_dispatch(_fftn_dispatcher)
+def ifft2(a, s=None, axes=(-2, -1), norm=None):
+ """
+ Compute the 2-dimensional inverse discrete Fourier Transform.
+
+ This function computes the inverse of the 2-dimensional discrete Fourier
+ Transform over any number of axes in an M-dimensional array by means of
+ the Fast Fourier Transform (FFT). In other words, ``ifft2(fft2(a)) == a``
+ to within numerical accuracy. By default, the inverse transform is
+ computed over the last two axes of the input array.
+
+ The input, analogously to `ifft`, should be ordered in the same way as is
+ returned by `fft2`, i.e. it should have the term for zero frequency
+ in the low-order corner of the two axes, the positive frequency terms in
+ the first half of these axes, the term for the Nyquist frequency in the
+ middle of the axes and the negative frequency terms in the second half of
+ both axes, in order of decreasingly negative frequency.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, can be complex.
+ s : sequence of ints, optional
+ Shape (length of each axis) of the output (``s[0]`` refers to axis 0,
+ ``s[1]`` to axis 1, etc.). This corresponds to `n` for ``ifft(x, n)``.
+ Along each axis, if the given shape is smaller than that of the input,
+ the input is cropped. If it is larger, the input is padded with zeros.
+ If `s` is not given, the shape of the input along the axes specified
+ by `axes` is used. See notes for issue on `ifft` zero padding.
+ axes : sequence of ints, optional
+ Axes over which to compute the inverse FFT. If not given, the last two
+ axes are used. A repeated index in `axes` means the inverse transform
+ over that axis is performed multiple times. A one-element sequence
+ means that a one-dimensional inverse FFT is performed.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axes
+ indicated by `axes`, or the last two axes if `axes` is not given.
+
+ Raises
+ ------
+ ValueError
+ If `s` and `axes` have different length, or `axes` not given and
+ ``len(s) != 2``.
+ IndexError
+ If an element of `axes` is larger than the number of axes of `a`.
+
+ See Also
+ --------
+ numpy.fft : Overall view of discrete Fourier transforms, with definitions
+ and conventions used.
+ fft2 : The forward 2-dimensional FFT, of which `ifft2` is the inverse.
+ ifftn : The inverse of the *n*-dimensional FFT.
+ fft : The one-dimensional FFT.
+ ifft : The one-dimensional inverse FFT.
+
+ Notes
+ -----
+ `ifft2` is just `ifftn` with a different default for `axes`.
+
+ See `ifftn` for details and a plotting example, and `numpy.fft` for
+ definitions and conventions used.
+
+ Zero-padding, analogously with `ifft`, is performed by appending zeros to
+ the input along the specified dimension. Although this is the common
+ approach, it might lead to surprising results. If another form of zero
+ padding is desired, it must be performed before `ifft2` is called.
+
+ Examples
+ --------
+ >>> a = 4 * np.eye(4)
+ >>> np.fft.ifft2(a)
+ array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
+ [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j],
+ [0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
+ [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]])
+
+ """
+
+ return _raw_fftnd(a, s, axes, ifft, norm)
+
+
+@array_function_dispatch(_fftn_dispatcher)
+def rfftn(a, s=None, axes=None, norm=None):
+ """
+ Compute the N-dimensional discrete Fourier Transform for real input.
+
+ This function computes the N-dimensional discrete Fourier Transform over
+ any number of axes in an M-dimensional real array by means of the Fast
+ Fourier Transform (FFT). By default, all axes are transformed, with the
+ real transform performed over the last axis, while the remaining
+ transforms are complex.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, taken to be real.
+ s : sequence of ints, optional
+ Shape (length along each transformed axis) to use from the input.
+ (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
+ The final element of `s` corresponds to `n` for ``rfft(x, n)``, while
+ for the remaining axes, it corresponds to `n` for ``fft(x, n)``.
+ Along any axis, if the given shape is smaller than that of the input,
+ the input is cropped. If it is larger, the input is padded with zeros.
+ If `s` is not given, the shape of the input along the axes specified
+ by `axes` is used.
+ axes : sequence of ints, optional
+ Axes over which to compute the FFT. If not given, the last ``len(s)``
+ axes are used, or all axes if `s` is also not specified.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : complex ndarray
+ The truncated or zero-padded input, transformed along the axes
+ indicated by `axes`, or by a combination of `s` and `a`,
+ as explained in the parameters section above.
+ The length of the last axis transformed will be ``s[-1]//2+1``,
+ while the remaining transformed axes will have lengths according to
+ `s`, or unchanged from the input.
+
+ Raises
+ ------
+ ValueError
+ If `s` and `axes` have different length.
+ IndexError
+ If an element of `axes` is larger than the number of axes of `a`.
+
+ See Also
+ --------
+ irfftn : The inverse of `rfftn`, i.e. the inverse of the n-dimensional FFT
+ of real input.
+ fft : The one-dimensional FFT, with definitions and conventions used.
+ rfft : The one-dimensional FFT of real input.
+ fftn : The n-dimensional FFT.
+ rfft2 : The two-dimensional FFT of real input.
+
+ Notes
+ -----
+ The transform for real input is performed over the last transformation
+ axis, as by `rfft`, then the transform over the remaining axes is
+ performed as by `fftn`. The order of the output is as for `rfft` for the
+ final transformation axis, and as for `fftn` for the remaining
+ transformation axes.
+
+ See `fft` for details, definitions and conventions used.
+
+ Examples
+ --------
+ >>> a = np.ones((2, 2, 2))
+ >>> np.fft.rfftn(a)
+ array([[[8.+0.j, 0.+0.j], # may vary
+ [0.+0.j, 0.+0.j]],
+ [[0.+0.j, 0.+0.j],
+ [0.+0.j, 0.+0.j]]])
+
+ >>> np.fft.rfftn(a, axes=(2, 0))
+ array([[[4.+0.j, 0.+0.j], # may vary
+ [4.+0.j, 0.+0.j]],
+ [[0.+0.j, 0.+0.j],
+ [0.+0.j, 0.+0.j]]])
+
+ """
+    a = asarray(a)
+    s, axes = _cook_nd_args(a, s, axes)
+    a = rfft(a, s[-1], axes[-1], norm)
+    for ii in range(len(axes)-1):
+        a = fft(a, s[ii], axes[ii], norm)
+    return a
+
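The output shape rules for `rfftn` can be confirmed with a small experiment. This is a sketch with made-up array sizes. `axes` is passed explicitly alongside `s` here, which is a choice made for this example rather than a requirement stated in the docstring.

```python
import numpy as np

a = np.ones((4, 6, 7))

# All axes transformed: only the last axis is shortened, to 7//2 + 1 = 4.
out = np.fft.rfftn(a)
print(out.shape)

# Transform the last two axes with explicit lengths s = (8, 10):
# axis 1 becomes 8, the last axis becomes 10//2 + 1 = 6.
out_s = np.fft.rfftn(a, s=(8, 10), axes=(1, 2))
print(out_s.shape)

# The result matches the leading part of the full complex transform.
print(np.allclose(out, np.fft.fftn(a)[..., :4]))
```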
+
+@array_function_dispatch(_fftn_dispatcher)
+def rfft2(a, s=None, axes=(-2, -1), norm=None):
+ """
+ Compute the 2-dimensional FFT of a real array.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array, taken to be real.
+ s : sequence of ints, optional
+ Shape of the FFT.
+ axes : sequence of ints, optional
+ Axes over which to compute the FFT.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : ndarray
+ The result of the real 2-D FFT.
+
+ See Also
+ --------
+ rfftn : Compute the N-dimensional discrete Fourier Transform for real
+ input.
+
+ Notes
+ -----
+ This is really just `rfftn` with different default behavior.
+ For more details see `rfftn`.
+
+ """
+
+ return rfftn(a, s, axes, norm)
+
+
+@array_function_dispatch(_fftn_dispatcher)
+def irfftn(a, s=None, axes=None, norm=None):
+ """
+ Compute the inverse of the N-dimensional FFT of real input.
+
+ This function computes the inverse of the N-dimensional discrete
+ Fourier Transform for real input over any number of axes in an
+ M-dimensional array by means of the Fast Fourier Transform (FFT). In
+ other words, ``irfftn(rfftn(a), a.shape) == a`` to within numerical
+ accuracy. (The ``a.shape`` is necessary like ``len(a)`` is for `irfft`,
+ and for the same reason.)
+
+ The input should be ordered in the same way as is returned by `rfftn`,
+ i.e. as for `irfft` for the final transformation axis, and as for `ifftn`
+ along all the other axes.
+
+ Parameters
+ ----------
+ a : array_like
+ Input array.
+ s : sequence of ints, optional
+ Shape (length of each transformed axis) of the output
+ (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.). `s` is also the
+ number of input points used along this axis, except for the last axis,
+ where ``s[-1]//2+1`` points of the input are used.
+ Along any axis, if the shape indicated by `s` is smaller than that of
+ the input, the input is cropped. If it is larger, the input is padded
+ with zeros. If `s` is not given, the shape of the input along the axes
+ specified by `axes` is used, except for the last axis, which is taken to be
+ ``2*(m-1)`` where ``m`` is the length of the input along that axis.
+ axes : sequence of ints, optional
+ Axes over which to compute the inverse FFT. If not given, the last
+ ``len(s)`` axes are used, or all axes if `s` is also not specified.
+ Repeated indices in `axes` mean that the inverse transform over that
+ axis is performed multiple times.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : ndarray
+ The truncated or zero-padded input, transformed along the axes
+ indicated by `axes`, or by a combination of `s` and `a`,
+ as explained in the parameters section above.
+ The length of each transformed axis is as given by the corresponding
+ element of `s`, or the length of the input in every axis except for the
+ last one if `s` is not given. In the final transformed axis the length
+ of the output when `s` is not given is ``2*(m-1)`` where ``m`` is the
+ length of the final transformed axis of the input. To get an odd
+ number of output points in the final axis, `s` must be specified.
+
+ Raises
+ ------
+ ValueError
+ If `s` and `axes` have different length.
+ IndexError
+ If an element of `axes` is larger than the number of axes of `a`.
+
+ See Also
+ --------
+ rfftn : The forward n-dimensional FFT of real input,
+ of which `irfftn` is the inverse.
+ fft : The one-dimensional FFT, with definitions and conventions used.
+ irfft : The inverse of the one-dimensional FFT of real input.
+ irfft2 : The inverse of the two-dimensional FFT of real input.
+
+ Notes
+ -----
+ See `fft` for definitions and conventions used.
+
+ See `rfft` for definitions and conventions used for real input.
+
+ The correct interpretation of the hermitian input depends on the shape of
+ the original data, as given by `s`. This is because each input shape could
+ correspond to either an odd or even length signal. By default, `irfftn`
+ assumes an even output length, which puts the last entry at the Nyquist
+ frequency, where it aliases with its symmetric counterpart. When performing
+ the final complex to real transform, the last value is thus treated as
+ purely real. To avoid losing information, the correct shape of the original
+ real data **must** be given.
+
+ Examples
+ --------
+ >>> a = np.zeros((3, 2, 2))
+ >>> a[0, 0, 0] = 3 * 2 * 2
+ >>> np.fft.irfftn(a)
+ array([[[1., 1.],
+ [1., 1.]],
+ [[1., 1.],
+ [1., 1.]],
+ [[1., 1.],
+ [1., 1.]]])
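+
+ When the final axis of the original data has odd length, `s` must be
+ passed to recover it. A small illustration, using an arbitrary 3-by-5
+ array chosen only for demonstration:
+
+ >>> x = np.arange(15.0).reshape(3, 5)
+ >>> y = np.fft.rfftn(x)
+ >>> y.shape
+ (3, 3)
+ >>> np.allclose(np.fft.irfftn(y, x.shape), x)
+ True
+ >>> np.fft.irfftn(y).shape
+ (3, 4)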
+
+ """
+ a = asarray(a)
+ s, axes = _cook_nd_args(a, s, axes, invreal=1)
+ for ii in range(len(axes)-1):
+     a = ifft(a, s[ii], axes[ii], norm)
+ a = irfft(a, s[-1], axes[-1], norm)
+ return a
+
+
+@array_function_dispatch(_fftn_dispatcher)
+def irfft2(a, s=None, axes=(-2, -1), norm=None):
+ """
+ Compute the 2-dimensional inverse FFT of a real array.
+
+ Parameters
+ ----------
+ a : array_like
+ The input array
+ s : sequence of ints, optional
+ Shape of the real output to the inverse FFT.
+ axes : sequence of ints, optional
+ The axes over which to compute the inverse fft.
+ Default is the last two axes.
+ norm : {None, "ortho"}, optional
+ .. versionadded:: 1.10.0
+
+ Normalization mode (see `numpy.fft`). Default is None.
+
+ Returns
+ -------
+ out : ndarray
+ The result of the inverse real 2-D FFT.
+
+ See Also
+ --------
+ irfftn : Compute the inverse of the N-dimensional FFT of real input.
+
+ Notes
+ -----
+ This is really `irfftn` with different defaults.
+ For more details see `irfftn`.
+
+ """
+
+ return irfftn(a, s, axes, norm)
+++ /dev/null
-"""
-Discrete Fourier Transform (:mod:`numpy.fft`)
-=============================================
-
-.. currentmodule:: numpy.fft
-
-Standard FFTs
--------------
-
-.. autosummary::
- :toctree: generated/
-
- fft Discrete Fourier transform.
- ifft Inverse discrete Fourier transform.
- fft2 Discrete Fourier transform in two dimensions.
- ifft2 Inverse discrete Fourier transform in two dimensions.
- fftn Discrete Fourier transform in N-dimensions.
- ifftn Inverse discrete Fourier transform in N dimensions.
-
-Real FFTs
----------
-
-.. autosummary::
- :toctree: generated/
-
- rfft Real discrete Fourier transform.
- irfft Inverse real discrete Fourier transform.
- rfft2 Real discrete Fourier transform in two dimensions.
- irfft2 Inverse real discrete Fourier transform in two dimensions.
- rfftn Real discrete Fourier transform in N dimensions.
- irfftn Inverse real discrete Fourier transform in N dimensions.
-
-Hermitian FFTs
---------------
-
-.. autosummary::
- :toctree: generated/
-
- hfft Hermitian discrete Fourier transform.
- ihfft Inverse Hermitian discrete Fourier transform.
-
-Helper routines
----------------
-
-.. autosummary::
- :toctree: generated/
-
- fftfreq Discrete Fourier Transform sample frequencies.
- rfftfreq DFT sample frequencies (for usage with rfft, irfft).
- fftshift Shift zero-frequency component to center of spectrum.
- ifftshift Inverse of fftshift.
-
-
-Background information
-----------------------
-
-Fourier analysis is fundamentally a method for expressing a function as a
-sum of periodic components, and for recovering the function from those
-components. When both the function and its Fourier transform are
-replaced with discretized counterparts, it is called the discrete Fourier
-transform (DFT). The DFT has become a mainstay of numerical computing in
-part because of a very fast algorithm for computing it, called the Fast
-Fourier Transform (FFT), which was known to Gauss (1805) and was brought
-to light in its current form by Cooley and Tukey [CT]_. Press et al. [NR]_
-provide an accessible introduction to Fourier analysis and its
-applications.
-
-Because the discrete Fourier transform separates its input into
-components that contribute at discrete frequencies, it has a great number
-of applications in digital signal processing, e.g., for filtering, and in
-this context the discretized input to the transform is customarily
-referred to as a *signal*, which exists in the *time domain*. The output
-is called a *spectrum* or *transform* and exists in the *frequency
-domain*.
-
-Implementation details
-----------------------
-
-There are many ways to define the DFT, varying in the sign of the
-exponent, normalization, etc. In this implementation, the DFT is defined
-as
-
-.. math::
- A_k = \\sum_{m=0}^{n-1} a_m \\exp\\left\\{-2\\pi i{mk \\over n}\\right\\}
- \\qquad k = 0,\\ldots,n-1.
-
-The DFT is in general defined for complex inputs and outputs, and a
-single-frequency component at linear frequency :math:`f` is
-represented by a complex exponential
-:math:`a_m = \\exp\\{2\\pi i\\,f m\\Delta t\\}`, where :math:`\\Delta t`
-is the sampling interval.
-
-The values in the result follow so-called "standard" order: If ``A =
-fft(a, n)``, then ``A[0]`` contains the zero-frequency term (the sum of
-the signal), which is always purely real for real inputs. Then ``A[1:n/2]``
-contains the positive-frequency terms, and ``A[n/2+1:]`` contains the
-negative-frequency terms, in order of decreasingly negative frequency.
-For an even number of input points, ``A[n/2]`` represents both positive and
-negative Nyquist frequency, and is also purely real for real input. For
-an odd number of input points, ``A[(n-1)/2]`` contains the largest positive
-frequency, while ``A[(n+1)/2]`` contains the largest negative frequency.
-The routine ``np.fft.fftfreq(n)`` returns an array giving the frequencies
-of corresponding elements in the output. The routine
-``np.fft.fftshift(A)`` shifts transforms and their frequencies to put the
-zero-frequency components in the middle, and ``np.fft.ifftshift(A)`` undoes
-that shift.
-
-When the input `a` is a time-domain signal and ``A = fft(a)``, ``np.abs(A)``
-is its amplitude spectrum and ``np.abs(A)**2`` is its power spectrum.
-The phase spectrum is obtained by ``np.angle(A)``.
-
-The inverse DFT is defined as
-
-.. math::
- a_m = \\frac{1}{n}\\sum_{k=0}^{n-1}A_k\\exp\\left\\{2\\pi i{mk\\over n}\\right\\}
- \\qquad m = 0,\\ldots,n-1.
-
-It differs from the forward transform by the sign of the exponential
-argument and the default normalization by :math:`1/n`.
-
-Normalization
--------------
-The default normalization has the direct transforms unscaled and the inverse
-transforms are scaled by :math:`1/n`. It is possible to obtain unitary
-transforms by setting the keyword argument ``norm`` to ``"ortho"`` (default is
-`None`) so that both direct and inverse transforms will be scaled by
-:math:`1/\\sqrt{n}`.
-
-Real and Hermitian transforms
------------------------------
-
-When the input is purely real, its transform is Hermitian, i.e., the
-component at frequency :math:`f_k` is the complex conjugate of the
-component at frequency :math:`-f_k`, which means that for real
-inputs there is no information in the negative frequency components that
-is not already available from the positive frequency components.
-The family of `rfft` functions is
-designed to operate on real inputs, and exploits this symmetry by
-computing only the positive frequency components, up to and including the
-Nyquist frequency. Thus, ``n`` input points produce ``n/2+1`` complex
-output points. The inverses of this family assumes the same symmetry of
-its input, and for an output of ``n`` points uses ``n/2+1`` input points.
-
-Correspondingly, when the spectrum is purely real, the signal is
-Hermitian. The `hfft` family of functions exploits this symmetry by
-using ``n/2+1`` complex points in the input (time) domain for ``n`` real
-points in the frequency domain.
-
-In higher dimensions, FFTs are used, e.g., for image analysis and
-filtering. The computational efficiency of the FFT means that it can
-also be a faster way to compute large convolutions, using the property
-that a convolution in the time domain is equivalent to a point-by-point
-multiplication in the frequency domain.
-
-Higher dimensions
------------------
-
-In two dimensions, the DFT is defined as
-
-.. math::
- A_{kl} = \\sum_{m=0}^{M-1} \\sum_{n=0}^{N-1}
- a_{mn}\\exp\\left\\{-2\\pi i \\left({mk\\over M}+{nl\\over N}\\right)\\right\\}
- \\qquad k = 0, \\ldots, M-1;\\quad l = 0, \\ldots, N-1,
-
-which extends in the obvious way to higher dimensions, and the inverses
-in higher dimensions also extend in the same way.
-
-References
-----------
-
-.. [CT] Cooley, James W., and John W. Tukey, 1965, "An algorithm for the
- machine calculation of complex Fourier series," *Math. Comput.*
- 19: 297-301.
-
-.. [NR] Press, W., Teukolsky, S., Vetterline, W.T., and Flannery, B.P.,
- 2007, *Numerical Recipes: The Art of Scientific Computing*, ch.
- 12-13. Cambridge Univ. Press, Cambridge, UK.
-
-Examples
---------
-
-For examples, see the various functions.
-
-"""
-from __future__ import division, absolute_import, print_function
-
-depends = ['core']
+++ /dev/null
-/*
- * This file is part of pocketfft.
- * Licensed under a 3-clause BSD style license - see LICENSE.md
- */
-
-/*
- * Main implementation file.
- *
- * Copyright (C) 2004-2018 Max-Planck-Society
- * \author Martin Reinecke
- */
-
-#include <math.h>
-#include <string.h>
-#include <stdlib.h>
-
-#include "npy_config.h"
-#define restrict NPY_RESTRICT
-
-#define RALLOC(type,num) \
- ((type *)malloc((num)*sizeof(type)))
-#define DEALLOC(ptr) \
- do { free(ptr); (ptr)=NULL; } while(0)
-
-#define SWAP(a,b,type) \
- do { type tmp_=(a); (a)=(b); (b)=tmp_; } while(0)
-
-#ifdef __GNUC__
-#define NOINLINE __attribute__((noinline))
-#define WARN_UNUSED_RESULT __attribute__ ((warn_unused_result))
-#else
-#define NOINLINE
-#define WARN_UNUSED_RESULT
-#endif
-
-struct cfft_plan_i;
-typedef struct cfft_plan_i * cfft_plan;
-struct rfft_plan_i;
-typedef struct rfft_plan_i * rfft_plan;
-
-// adapted from https://stackoverflow.com/questions/42792939/
-// CAUTION: this function only works for arguments in the range [-0.25; 0.25]!
-static void my_sincosm1pi (double a, double *restrict res)
- {
- double s = a * a;
- /* Approximate cos(pi*x)-1 for x in [-0.25,0.25] */
- double r = -1.0369917389758117e-4;
- r = fma (r, s, 1.9294935641298806e-3);
- r = fma (r, s, -2.5806887942825395e-2);
- r = fma (r, s, 2.3533063028328211e-1);
- r = fma (r, s, -1.3352627688538006e+0);
- r = fma (r, s, 4.0587121264167623e+0);
- r = fma (r, s, -4.9348022005446790e+0);
- double c = r*s;
- /* Approximate sin(pi*x) for x in [-0.25,0.25] */
- r = 4.6151442520157035e-4;
- r = fma (r, s, -7.3700183130883555e-3);
- r = fma (r, s, 8.2145868949323936e-2);
- r = fma (r, s, -5.9926452893214921e-1);
- r = fma (r, s, 2.5501640398732688e+0);
- r = fma (r, s, -5.1677127800499516e+0);
- s = s * a;
- r = r * s;
- s = fma (a, 3.1415926535897931e+0, r);
- res[0] = c;
- res[1] = s;
- }
-
-NOINLINE static void calc_first_octant(size_t den, double * restrict res)
- {
- size_t n = (den+4)>>3;
- if (n==0) return;
- res[0]=1.; res[1]=0.;
- if (n==1) return;
- size_t l1=(size_t)sqrt(n);
- for (size_t i=1; i<l1; ++i)
- my_sincosm1pi((2.*i)/den,&res[2*i]);
- size_t start=l1;
- while(start<n)
- {
- double cs[2];
- my_sincosm1pi((2.*start)/den,cs);
- res[2*start] = cs[0]+1.;
- res[2*start+1] = cs[1];
- size_t end = l1;
- if (start+end>n) end = n-start;
- for (size_t i=1; i<end; ++i)
- {
- double csx[2]={res[2*i], res[2*i+1]};
- res[2*(start+i)] = ((cs[0]*csx[0] - cs[1]*csx[1] + cs[0]) + csx[0]) + 1.;
- res[2*(start+i)+1] = (cs[0]*csx[1] + cs[1]*csx[0]) + cs[1] + csx[1];
- }
- start += l1;
- }
- for (size_t i=1; i<l1; ++i)
- res[2*i] += 1.;
- }
-
-NOINLINE static void calc_first_quadrant(size_t n, double * restrict res)
- {
- double * restrict p = res+n;
- calc_first_octant(n<<1, p);
- size_t ndone=(n+2)>>2;
- size_t i=0, idx1=0, idx2=2*ndone-2;
- for (; i+1<ndone; i+=2, idx1+=2, idx2-=2)
- {
- res[idx1] = p[2*i];
- res[idx1+1] = p[2*i+1];
- res[idx2] = p[2*i+3];
- res[idx2+1] = p[2*i+2];
- }
- if (i!=ndone)
- {
- res[idx1 ] = p[2*i];
- res[idx1+1] = p[2*i+1];
- }
- }
-
-NOINLINE static void calc_first_half(size_t n, double * restrict res)
- {
- int ndone=(n+1)>>1;
- double * p = res+n-1;
- calc_first_octant(n<<2, p);
- int i4=0, in=n, i=0;
- for (; i4<=in-i4; ++i, i4+=4) // octant 0
- {
- res[2*i] = p[2*i4]; res[2*i+1] = p[2*i4+1];
- }
- for (; i4-in <= 0; ++i, i4+=4) // octant 1
- {
- int xm = in-i4;
- res[2*i] = p[2*xm+1]; res[2*i+1] = p[2*xm];
- }
- for (; i4<=3*in-i4; ++i, i4+=4) // octant 2
- {
- int xm = i4-in;
- res[2*i] = -p[2*xm+1]; res[2*i+1] = p[2*xm];
- }
- for (; i<ndone; ++i, i4+=4) // octant 3
- {
- int xm = 2*in-i4;
- res[2*i] = -p[2*xm]; res[2*i+1] = p[2*xm+1];
- }
- }
-
-NOINLINE static void fill_first_quadrant(size_t n, double * restrict res)
- {
- const double hsqt2 = 0.707106781186547524400844362104849;
- size_t quart = n>>2;
- if ((n&7)==0)
- res[quart] = res[quart+1] = hsqt2;
- for (size_t i=2, j=2*quart-2; i<quart; i+=2, j-=2)
- {
- res[j ] = res[i+1];
- res[j+1] = res[i ];
- }
- }
-
-NOINLINE static void fill_first_half(size_t n, double * restrict res)
- {
- size_t half = n>>1;
- if ((n&3)==0)
- for (size_t i=0; i<half; i+=2)
- {
- res[i+half] = -res[i+1];
- res[i+half+1] = res[i ];
- }
- else
- for (size_t i=2, j=2*half-2; i<half; i+=2, j-=2)
- {
- res[j ] = -res[i ];
- res[j+1] = res[i+1];
- }
- }
-
-NOINLINE static void fill_second_half(size_t n, double * restrict res)
- {
- if ((n&1)==0)
- for (size_t i=0; i<n; ++i)
- res[i+n] = -res[i];
- else
- for (size_t i=2, j=2*n-2; i<n; i+=2, j-=2)
- {
- res[j ] = res[i ];
- res[j+1] = -res[i+1];
- }
- }
-
-NOINLINE static void sincos_2pibyn_half(size_t n, double * restrict res)
- {
- if ((n&3)==0)
- {
- calc_first_octant(n, res);
- fill_first_quadrant(n, res);
- fill_first_half(n, res);
- }
- else if ((n&1)==0)
- {
- calc_first_quadrant(n, res);
- fill_first_half(n, res);
- }
- else
- calc_first_half(n, res);
- }
-
-NOINLINE static void sincos_2pibyn(size_t n, double * restrict res)
- {
- sincos_2pibyn_half(n, res);
- fill_second_half(n, res);
- }
-
-NOINLINE static size_t largest_prime_factor (size_t n)
- {
- size_t res=1;
- size_t tmp;
- while (((tmp=(n>>1))<<1)==n)
- { res=2; n=tmp; }
-
- size_t limit=(size_t)sqrt(n+0.01);
- for (size_t x=3; x<=limit; x+=2)
- while (((tmp=(n/x))*x)==n)
- {
- res=x;
- n=tmp;
- limit=(size_t)sqrt(n+0.01);
- }
- if (n>1) res=n;
-
- return res;
- }
-
-NOINLINE static double cost_guess (size_t n)
- {
- const double lfp=1.1; // penalty for non-hardcoded larger factors
- size_t ni=n;
- double result=0.;
- size_t tmp;
- while (((tmp=(n>>1))<<1)==n)
- { result+=2; n=tmp; }
-
- size_t limit=(size_t)sqrt(n+0.01);
- for (size_t x=3; x<=limit; x+=2)
- while ((tmp=(n/x))*x==n)
- {
- result+= (x<=5) ? x : lfp*x; // penalize larger prime factors
- n=tmp;
- limit=(size_t)sqrt(n+0.01);
- }
- if (n>1) result+=(n<=5) ? n : lfp*n;
-
- return result*ni;
- }
-
-/* returns the smallest composite of 2, 3, 5, 7 and 11 which is >= n */
-NOINLINE static size_t good_size(size_t n)
- {
- if (n<=6) return n;
-
- size_t bestfac=2*n;
- for (size_t f2=1; f2<bestfac; f2*=2)
- for (size_t f23=f2; f23<bestfac; f23*=3)
- for (size_t f235=f23; f235<bestfac; f235*=5)
- for (size_t f2357=f235; f2357<bestfac; f2357*=7)
- for (size_t f235711=f2357; f235711<bestfac; f235711*=11)
- if (f235711>=n) bestfac=f235711;
- return bestfac;
- }
-
-typedef struct cmplx {
- double r,i;
-} cmplx;
-
-#define NFCT 25
-typedef struct cfftp_fctdata
- {
- size_t fct;
- cmplx *tw, *tws;
- } cfftp_fctdata;
-
-typedef struct cfftp_plan_i
- {
- size_t length, nfct;
- cmplx *mem;
- cfftp_fctdata fct[NFCT];
- } cfftp_plan_i;
-typedef struct cfftp_plan_i * cfftp_plan;
-
-#define PMC(a,b,c,d) { a.r=c.r+d.r; a.i=c.i+d.i; b.r=c.r-d.r; b.i=c.i-d.i; }
-#define ADDC(a,b,c) { a.r=b.r+c.r; a.i=b.i+c.i; }
-#define SCALEC(a,b) { a.r*=b; a.i*=b; }
-#define ROT90(a) { double tmp_=a.r; a.r=-a.i; a.i=tmp_; }
-#define ROTM90(a) { double tmp_=-a.r; a.r=a.i; a.i=tmp_; }
-#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
-#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
-#define WA(x,i) wa[(i)-1+(x)*(ido-1)]
-/* a = b*c */
-#define A_EQ_B_MUL_C(a,b,c) { a.r=b.r*c.r-b.i*c.i; a.i=b.r*c.i+b.i*c.r; }
-/* a = conj(b)*c*/
-#define A_EQ_CB_MUL_C(a,b,c) { a.r=b.r*c.r+b.i*c.i; a.i=b.r*c.i-b.i*c.r; }
-
-#define PMSIGNC(a,b,c,d) { a.r=c.r+sign*d.r; a.i=c.i+sign*d.i; b.r=c.r-sign*d.r; b.i=c.i-sign*d.i; }
-/* a = b*c */
-#define MULPMSIGNC(a,b,c) { a.r=b.r*c.r-sign*b.i*c.i; a.i=b.r*c.i+sign*b.i*c.r; }
-/* a *= b */
-#define MULPMSIGNCEQ(a,b) { double xtmp=a.r; a.r=b.r*a.r-sign*b.i*a.i; a.i=b.r*a.i+sign*b.i*xtmp; }
-
-NOINLINE static void pass2b (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=2;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
- else
- for (size_t k=0; k<l1; ++k)
- {
- PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
- for (size_t i=1; i<ido; ++i)
- {
- cmplx t;
- PMC (CH(i,k,0),t,CC(i,0,k),CC(i,1,k))
- A_EQ_B_MUL_C (CH(i,k,1),WA(0,i),t)
- }
- }
- }
-
-NOINLINE static void pass2f (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=2;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
- else
- for (size_t k=0; k<l1; ++k)
- {
- PMC (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(0,1,k))
- for (size_t i=1; i<ido; ++i)
- {
- cmplx t;
- PMC (CH(i,k,0),t,CC(i,0,k),CC(i,1,k))
- A_EQ_CB_MUL_C (CH(i,k,1),WA(0,i),t)
- }
- }
- }
-
-#define PREP3(idx) \
- cmplx t0 = CC(idx,0,k), t1, t2; \
- PMC (t1,t2,CC(idx,1,k),CC(idx,2,k)) \
- CH(idx,k,0).r=t0.r+t1.r; \
- CH(idx,k,0).i=t0.i+t1.i;
-#define PARTSTEP3a(u1,u2,twr,twi) \
- { \
- cmplx ca,cb; \
- ca.r=t0.r+twr*t1.r; \
- ca.i=t0.i+twr*t1.i; \
- cb.i=twi*t2.r; \
- cb.r=-(twi*t2.i); \
- PMC(CH(0,k,u1),CH(0,k,u2),ca,cb) \
- }
-
-#define PARTSTEP3b(u1,u2,twr,twi) \
- { \
- cmplx ca,cb,da,db; \
- ca.r=t0.r+twr*t1.r; \
- ca.i=t0.i+twr*t1.i; \
- cb.i=twi*t2.r; \
- cb.r=-(twi*t2.i); \
- PMC(da,db,ca,cb) \
- A_EQ_B_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
- A_EQ_B_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
- }
-NOINLINE static void pass3b (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=3;
- const double tw1r=-0.5, tw1i= 0.86602540378443864676;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- PREP3(0)
- PARTSTEP3a(1,2,tw1r,tw1i)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- PREP3(0)
- PARTSTEP3a(1,2,tw1r,tw1i)
- }
- for (size_t i=1; i<ido; ++i)
- {
- PREP3(i)
- PARTSTEP3b(1,2,tw1r,tw1i)
- }
- }
- }
-#define PARTSTEP3f(u1,u2,twr,twi) \
- { \
- cmplx ca,cb,da,db; \
- ca.r=t0.r+twr*t1.r; \
- ca.i=t0.i+twr*t1.i; \
- cb.i=twi*t2.r; \
- cb.r=-(twi*t2.i); \
- PMC(da,db,ca,cb) \
- A_EQ_CB_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
- A_EQ_CB_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
- }
-NOINLINE static void pass3f (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=3;
- const double tw1r=-0.5, tw1i= -0.86602540378443864676;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- PREP3(0)
- PARTSTEP3a(1,2,tw1r,tw1i)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- PREP3(0)
- PARTSTEP3a(1,2,tw1r,tw1i)
- }
- for (size_t i=1; i<ido; ++i)
- {
- PREP3(i)
- PARTSTEP3f(1,2,tw1r,tw1i)
- }
- }
- }
-
-NOINLINE static void pass4b (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=4;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- cmplx t1, t2, t3, t4;
- PMC(t2,t1,CC(0,0,k),CC(0,2,k))
- PMC(t3,t4,CC(0,1,k),CC(0,3,k))
- ROT90(t4)
- PMC(CH(0,k,0),CH(0,k,2),t2,t3)
- PMC(CH(0,k,1),CH(0,k,3),t1,t4)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- cmplx t1, t2, t3, t4;
- PMC(t2,t1,CC(0,0,k),CC(0,2,k))
- PMC(t3,t4,CC(0,1,k),CC(0,3,k))
- ROT90(t4)
- PMC(CH(0,k,0),CH(0,k,2),t2,t3)
- PMC(CH(0,k,1),CH(0,k,3),t1,t4)
- }
- for (size_t i=1; i<ido; ++i)
- {
- cmplx c2, c3, c4, t1, t2, t3, t4;
- cmplx cc0=CC(i,0,k), cc1=CC(i,1,k),cc2=CC(i,2,k),cc3=CC(i,3,k);
- PMC(t2,t1,cc0,cc2)
- PMC(t3,t4,cc1,cc3)
- ROT90(t4)
- cmplx wa0=WA(0,i), wa1=WA(1,i),wa2=WA(2,i);
- PMC(CH(i,k,0),c3,t2,t3)
- PMC(c2,c4,t1,t4)
- A_EQ_B_MUL_C (CH(i,k,1),wa0,c2)
- A_EQ_B_MUL_C (CH(i,k,2),wa1,c3)
- A_EQ_B_MUL_C (CH(i,k,3),wa2,c4)
- }
- }
- }
-NOINLINE static void pass4f (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=4;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- cmplx t1, t2, t3, t4;
- PMC(t2,t1,CC(0,0,k),CC(0,2,k))
- PMC(t3,t4,CC(0,1,k),CC(0,3,k))
- ROTM90(t4)
- PMC(CH(0,k,0),CH(0,k,2),t2,t3)
- PMC(CH(0,k,1),CH(0,k,3),t1,t4)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- cmplx t1, t2, t3, t4;
- PMC(t2,t1,CC(0,0,k),CC(0,2,k))
- PMC(t3,t4,CC(0,1,k),CC(0,3,k))
- ROTM90(t4)
- PMC(CH(0,k,0),CH(0,k,2),t2,t3)
- PMC (CH(0,k,1),CH(0,k,3),t1,t4)
- }
- for (size_t i=1; i<ido; ++i)
- {
- cmplx c2, c3, c4, t1, t2, t3, t4;
- cmplx cc0=CC(i,0,k), cc1=CC(i,1,k),cc2=CC(i,2,k),cc3=CC(i,3,k);
- PMC(t2,t1,cc0,cc2)
- PMC(t3,t4,cc1,cc3)
- ROTM90(t4)
- cmplx wa0=WA(0,i), wa1=WA(1,i),wa2=WA(2,i);
- PMC(CH(i,k,0),c3,t2,t3)
- PMC(c2,c4,t1,t4)
- A_EQ_CB_MUL_C (CH(i,k,1),wa0,c2)
- A_EQ_CB_MUL_C (CH(i,k,2),wa1,c3)
- A_EQ_CB_MUL_C (CH(i,k,3),wa2,c4)
- }
- }
- }
-
-#define PREP5(idx) \
- cmplx t0 = CC(idx,0,k), t1, t2, t3, t4; \
- PMC (t1,t4,CC(idx,1,k),CC(idx,4,k)) \
- PMC (t2,t3,CC(idx,2,k),CC(idx,3,k)) \
- CH(idx,k,0).r=t0.r+t1.r+t2.r; \
- CH(idx,k,0).i=t0.i+t1.i+t2.i;
-
-#define PARTSTEP5a(u1,u2,twar,twbr,twai,twbi) \
- { \
- cmplx ca,cb; \
- ca.r=t0.r+twar*t1.r+twbr*t2.r; \
- ca.i=t0.i+twar*t1.i+twbr*t2.i; \
- cb.i=twai*t4.r twbi*t3.r; \
- cb.r=-(twai*t4.i twbi*t3.i); \
- PMC(CH(0,k,u1),CH(0,k,u2),ca,cb) \
- }
-
-#define PARTSTEP5b(u1,u2,twar,twbr,twai,twbi) \
- { \
- cmplx ca,cb,da,db; \
- ca.r=t0.r+twar*t1.r+twbr*t2.r; \
- ca.i=t0.i+twar*t1.i+twbr*t2.i; \
- cb.i=twai*t4.r twbi*t3.r; \
- cb.r=-(twai*t4.i twbi*t3.i); \
- PMC(da,db,ca,cb) \
- A_EQ_B_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
- A_EQ_B_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
- }
-NOINLINE static void pass5b (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=5;
- const double tw1r= 0.3090169943749474241,
- tw1i= 0.95105651629515357212,
- tw2r= -0.8090169943749474241,
- tw2i= 0.58778525229247312917;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- PREP5(0)
- PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
- PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- PREP5(0)
- PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
- PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
- }
- for (size_t i=1; i<ido; ++i)
- {
- PREP5(i)
- PARTSTEP5b(1,4,tw1r,tw2r,+tw1i,+tw2i)
- PARTSTEP5b(2,3,tw2r,tw1r,+tw2i,-tw1i)
- }
- }
- }
-#define PARTSTEP5f(u1,u2,twar,twbr,twai,twbi) \
- { \
- cmplx ca,cb,da,db; \
- ca.r=t0.r+twar*t1.r+twbr*t2.r; \
- ca.i=t0.i+twar*t1.i+twbr*t2.i; \
- cb.i=twai*t4.r twbi*t3.r; \
- cb.r=-(twai*t4.i twbi*t3.i); \
- PMC(da,db,ca,cb) \
- A_EQ_CB_MUL_C (CH(i,k,u1),WA(u1-1,i),da) \
- A_EQ_CB_MUL_C (CH(i,k,u2),WA(u2-1,i),db) \
- }
-NOINLINE static void pass5f (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa)
- {
- const size_t cdim=5;
- const double tw1r= 0.3090169943749474241,
- tw1i= -0.95105651629515357212,
- tw2r= -0.8090169943749474241,
- tw2i= -0.58778525229247312917;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- PREP5(0)
- PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
- PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- PREP5(0)
- PARTSTEP5a(1,4,tw1r,tw2r,+tw1i,+tw2i)
- PARTSTEP5a(2,3,tw2r,tw1r,+tw2i,-tw1i)
- }
- for (size_t i=1; i<ido; ++i)
- {
- PREP5(i)
- PARTSTEP5f(1,4,tw1r,tw2r,+tw1i,+tw2i)
- PARTSTEP5f(2,3,tw2r,tw1r,+tw2i,-tw1i)
- }
- }
- }
-
-#define PREP7(idx) \
- cmplx t1 = CC(idx,0,k), t2, t3, t4, t5, t6, t7; \
- PMC (t2,t7,CC(idx,1,k),CC(idx,6,k)) \
- PMC (t3,t6,CC(idx,2,k),CC(idx,5,k)) \
- PMC (t4,t5,CC(idx,3,k),CC(idx,4,k)) \
- CH(idx,k,0).r=t1.r+t2.r+t3.r+t4.r; \
- CH(idx,k,0).i=t1.i+t2.i+t3.i+t4.i;
-
-#define PARTSTEP7a0(u1,u2,x1,x2,x3,y1,y2,y3,out1,out2) \
- { \
- cmplx ca,cb; \
- ca.r=t1.r+x1*t2.r+x2*t3.r+x3*t4.r; \
- ca.i=t1.i+x1*t2.i+x2*t3.i+x3*t4.i; \
- cb.i=y1*t7.r y2*t6.r y3*t5.r; \
- cb.r=-(y1*t7.i y2*t6.i y3*t5.i); \
- PMC(out1,out2,ca,cb) \
- }
-#define PARTSTEP7a(u1,u2,x1,x2,x3,y1,y2,y3) \
- PARTSTEP7a0(u1,u2,x1,x2,x3,y1,y2,y3,CH(0,k,u1),CH(0,k,u2))
-#define PARTSTEP7(u1,u2,x1,x2,x3,y1,y2,y3) \
- { \
- cmplx da,db; \
- PARTSTEP7a0(u1,u2,x1,x2,x3,y1,y2,y3,da,db) \
- MULPMSIGNC (CH(i,k,u1),WA(u1-1,i),da) \
- MULPMSIGNC (CH(i,k,u2),WA(u2-1,i),db) \
- }
-
-NOINLINE static void pass7(size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa, const int sign)
- {
- const size_t cdim=7;
- const double tw1r= 0.623489801858733530525,
- tw1i= sign * 0.7818314824680298087084,
- tw2r= -0.222520933956314404289,
- tw2i= sign * 0.9749279121818236070181,
- tw3r= -0.9009688679024191262361,
- tw3i= sign * 0.4338837391175581204758;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- PREP7(0)
- PARTSTEP7a(1,6,tw1r,tw2r,tw3r,+tw1i,+tw2i,+tw3i)
- PARTSTEP7a(2,5,tw2r,tw3r,tw1r,+tw2i,-tw3i,-tw1i)
- PARTSTEP7a(3,4,tw3r,tw1r,tw2r,+tw3i,-tw1i,+tw2i)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- PREP7(0)
- PARTSTEP7a(1,6,tw1r,tw2r,tw3r,+tw1i,+tw2i,+tw3i)
- PARTSTEP7a(2,5,tw2r,tw3r,tw1r,+tw2i,-tw3i,-tw1i)
- PARTSTEP7a(3,4,tw3r,tw1r,tw2r,+tw3i,-tw1i,+tw2i)
- }
- for (size_t i=1; i<ido; ++i)
- {
- PREP7(i)
- PARTSTEP7(1,6,tw1r,tw2r,tw3r,+tw1i,+tw2i,+tw3i)
- PARTSTEP7(2,5,tw2r,tw3r,tw1r,+tw2i,-tw3i,-tw1i)
- PARTSTEP7(3,4,tw3r,tw1r,tw2r,+tw3i,-tw1i,+tw2i)
- }
- }
- }
-
-#define PREP11(idx) \
- cmplx t1 = CC(idx,0,k), t2, t3, t4, t5, t6, t7, t8, t9, t10, t11; \
- PMC (t2,t11,CC(idx,1,k),CC(idx,10,k)) \
- PMC (t3,t10,CC(idx,2,k),CC(idx, 9,k)) \
- PMC (t4,t9 ,CC(idx,3,k),CC(idx, 8,k)) \
- PMC (t5,t8 ,CC(idx,4,k),CC(idx, 7,k)) \
- PMC (t6,t7 ,CC(idx,5,k),CC(idx, 6,k)) \
- CH(idx,k,0).r=t1.r+t2.r+t3.r+t4.r+t5.r+t6.r; \
- CH(idx,k,0).i=t1.i+t2.i+t3.i+t4.i+t5.i+t6.i;
-
-#define PARTSTEP11a0(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,out1,out2) \
- { \
- cmplx ca,cb; \
- ca.r=t1.r+x1*t2.r+x2*t3.r+x3*t4.r+x4*t5.r+x5*t6.r; \
- ca.i=t1.i+x1*t2.i+x2*t3.i+x3*t4.i+x4*t5.i+x5*t6.i; \
- cb.i=y1*t11.r y2*t10.r y3*t9.r y4*t8.r y5*t7.r; \
- cb.r=-(y1*t11.i y2*t10.i y3*t9.i y4*t8.i y5*t7.i ); \
- PMC(out1,out2,ca,cb) \
- }
-#define PARTSTEP11a(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5) \
- PARTSTEP11a0(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,CH(0,k,u1),CH(0,k,u2))
-#define PARTSTEP11(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5) \
- { \
- cmplx da,db; \
- PARTSTEP11a0(u1,u2,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,da,db) \
- MULPMSIGNC (CH(i,k,u1),WA(u1-1,i),da) \
- MULPMSIGNC (CH(i,k,u2),WA(u2-1,i),db) \
- }
-
-NOINLINE static void pass11 (size_t ido, size_t l1, const cmplx * restrict cc,
- cmplx * restrict ch, const cmplx * restrict wa, const int sign)
- {
- const size_t cdim=11;
- const double tw1r = 0.8412535328311811688618,
- tw1i = sign * 0.5406408174555975821076,
- tw2r = 0.4154150130018864255293,
- tw2i = sign * 0.9096319953545183714117,
- tw3r = -0.1423148382732851404438,
- tw3i = sign * 0.9898214418809327323761,
- tw4r = -0.6548607339452850640569,
- tw4i = sign * 0.755749574354258283774,
- tw5r = -0.9594929736144973898904,
- tw5i = sign * 0.2817325568414296977114;
-
- if (ido==1)
- for (size_t k=0; k<l1; ++k)
- {
- PREP11(0)
- PARTSTEP11a(1,10,tw1r,tw2r,tw3r,tw4r,tw5r,+tw1i,+tw2i,+tw3i,+tw4i,+tw5i)
- PARTSTEP11a(2, 9,tw2r,tw4r,tw5r,tw3r,tw1r,+tw2i,+tw4i,-tw5i,-tw3i,-tw1i)
- PARTSTEP11a(3, 8,tw3r,tw5r,tw2r,tw1r,tw4r,+tw3i,-tw5i,-tw2i,+tw1i,+tw4i)
- PARTSTEP11a(4, 7,tw4r,tw3r,tw1r,tw5r,tw2r,+tw4i,-tw3i,+tw1i,+tw5i,-tw2i)
- PARTSTEP11a(5, 6,tw5r,tw1r,tw4r,tw2r,tw3r,+tw5i,-tw1i,+tw4i,-tw2i,+tw3i)
- }
- else
- for (size_t k=0; k<l1; ++k)
- {
- {
- PREP11(0)
- PARTSTEP11a(1,10,tw1r,tw2r,tw3r,tw4r,tw5r,+tw1i,+tw2i,+tw3i,+tw4i,+tw5i)
- PARTSTEP11a(2, 9,tw2r,tw4r,tw5r,tw3r,tw1r,+tw2i,+tw4i,-tw5i,-tw3i,-tw1i)
- PARTSTEP11a(3, 8,tw3r,tw5r,tw2r,tw1r,tw4r,+tw3i,-tw5i,-tw2i,+tw1i,+tw4i)
- PARTSTEP11a(4, 7,tw4r,tw3r,tw1r,tw5r,tw2r,+tw4i,-tw3i,+tw1i,+tw5i,-tw2i)
- PARTSTEP11a(5, 6,tw5r,tw1r,tw4r,tw2r,tw3r,+tw5i,-tw1i,+tw4i,-tw2i,+tw3i)
- }
- for (size_t i=1; i<ido; ++i)
- {
- PREP11(i)
- PARTSTEP11(1,10,tw1r,tw2r,tw3r,tw4r,tw5r,+tw1i,+tw2i,+tw3i,+tw4i,+tw5i)
- PARTSTEP11(2, 9,tw2r,tw4r,tw5r,tw3r,tw1r,+tw2i,+tw4i,-tw5i,-tw3i,-tw1i)
- PARTSTEP11(3, 8,tw3r,tw5r,tw2r,tw1r,tw4r,+tw3i,-tw5i,-tw2i,+tw1i,+tw4i)
- PARTSTEP11(4, 7,tw4r,tw3r,tw1r,tw5r,tw2r,+tw4i,-tw3i,+tw1i,+tw5i,-tw2i)
- PARTSTEP11(5, 6,tw5r,tw1r,tw4r,tw2r,tw3r,+tw5i,-tw1i,+tw4i,-tw2i,+tw3i)
- }
- }
- }
-
-#define CX(a,b,c) cc[(a)+ido*((b)+l1*(c))]
-#define CX2(a,b) cc[(a)+idl1*(b)]
-#define CH2(a,b) ch[(a)+idl1*(b)]
-
-NOINLINE static int passg (size_t ido, size_t ip, size_t l1,
- cmplx * restrict cc, cmplx * restrict ch, const cmplx * restrict wa,
- const cmplx * restrict csarr, const int sign)
- {
- const size_t cdim=ip;
- size_t ipph = (ip+1)/2;
- size_t idl1 = ido*l1;
-
- cmplx * restrict wal=RALLOC(cmplx,ip);
- if (!wal) return -1;
- wal[0]=(cmplx){1.,0.};
- for (size_t i=1; i<ip; ++i)
- wal[i]=(cmplx){csarr[i].r,sign*csarr[i].i};
-
- for (size_t k=0; k<l1; ++k)
- for (size_t i=0; i<ido; ++i)
- CH(i,k,0) = CC(i,0,k);
- for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc)
- for (size_t k=0; k<l1; ++k)
- for (size_t i=0; i<ido; ++i)
- PMC(CH(i,k,j),CH(i,k,jc),CC(i,j,k),CC(i,jc,k))
- for (size_t k=0; k<l1; ++k)
- for (size_t i=0; i<ido; ++i)
- {
- cmplx tmp = CH(i,k,0);
- for (size_t j=1; j<ipph; ++j)
- ADDC(tmp,tmp,CH(i,k,j))
- CX(i,k,0) = tmp;
- }
- for (size_t l=1, lc=ip-1; l<ipph; ++l, --lc)
- {
- // j=0
- for (size_t ik=0; ik<idl1; ++ik)
- {
- CX2(ik,l).r = CH2(ik,0).r+wal[l].r*CH2(ik,1).r+wal[2*l].r*CH2(ik,2).r;
- CX2(ik,l).i = CH2(ik,0).i+wal[l].r*CH2(ik,1).i+wal[2*l].r*CH2(ik,2).i;
- CX2(ik,lc).r=-wal[l].i*CH2(ik,ip-1).i-wal[2*l].i*CH2(ik,ip-2).i;
- CX2(ik,lc).i=wal[l].i*CH2(ik,ip-1).r+wal[2*l].i*CH2(ik,ip-2).r;
- }
-
- size_t iwal=2*l;
- size_t j=3, jc=ip-3;
- for (; j<ipph-1; j+=2, jc-=2)
- {
- iwal+=l; if (iwal>ip) iwal-=ip;
- cmplx xwal=wal[iwal];
- iwal+=l; if (iwal>ip) iwal-=ip;
- cmplx xwal2=wal[iwal];
- for (size_t ik=0; ik<idl1; ++ik)
- {
- CX2(ik,l).r += CH2(ik,j).r*xwal.r+CH2(ik,j+1).r*xwal2.r;
- CX2(ik,l).i += CH2(ik,j).i*xwal.r+CH2(ik,j+1).i*xwal2.r;
- CX2(ik,lc).r -= CH2(ik,jc).i*xwal.i+CH2(ik,jc-1).i*xwal2.i;
- CX2(ik,lc).i += CH2(ik,jc).r*xwal.i+CH2(ik,jc-1).r*xwal2.i;
- }
- }
- for (; j<ipph; ++j, --jc)
- {
- iwal+=l; if (iwal>ip) iwal-=ip;
- cmplx xwal=wal[iwal];
- for (size_t ik=0; ik<idl1; ++ik)
- {
- CX2(ik,l).r += CH2(ik,j).r*xwal.r;
- CX2(ik,l).i += CH2(ik,j).i*xwal.r;
- CX2(ik,lc).r -= CH2(ik,jc).i*xwal.i;
- CX2(ik,lc).i += CH2(ik,jc).r*xwal.i;
- }
- }
- }
- DEALLOC(wal);
-
- // shuffling and twiddling
- if (ido==1)
- for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc)
- for (size_t ik=0; ik<idl1; ++ik)
- {
- cmplx t1=CX2(ik,j), t2=CX2(ik,jc);
- PMC(CX2(ik,j),CX2(ik,jc),t1,t2)
- }
- else
- {
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc)
- for (size_t k=0; k<l1; ++k)
- {
- cmplx t1=CX(0,k,j), t2=CX(0,k,jc);
- PMC(CX(0,k,j),CX(0,k,jc),t1,t2)
- for (size_t i=1; i<ido; ++i)
- {
- cmplx x1, x2;
- PMC(x1,x2,CX(i,k,j),CX(i,k,jc))
- size_t idij=(j-1)*(ido-1)+i-1;
- MULPMSIGNC (CX(i,k,j),wa[idij],x1)
- idij=(jc-1)*(ido-1)+i-1;
- MULPMSIGNC (CX(i,k,jc),wa[idij],x2)
- }
- }
- }
- return 0;
- }
-
-#undef CH2
-#undef CX2
-#undef CX
-
-NOINLINE WARN_UNUSED_RESULT static int pass_all(cfftp_plan plan, cmplx c[], double fct,
- const int sign)
- {
- if (plan->length==1) return 0;
- size_t len=plan->length;
- size_t l1=1, nf=plan->nfct;
- cmplx *ch = RALLOC(cmplx, len);
- if (!ch) return -1;
- cmplx *p1=c, *p2=ch;
-
- for(size_t k1=0; k1<nf; k1++)
- {
- size_t ip=plan->fct[k1].fct;
- size_t l2=ip*l1;
- size_t ido = len/l2;
- if (ip==4)
- sign>0 ? pass4b (ido, l1, p1, p2, plan->fct[k1].tw)
- : pass4f (ido, l1, p1, p2, plan->fct[k1].tw);
- else if(ip==2)
- sign>0 ? pass2b (ido, l1, p1, p2, plan->fct[k1].tw)
- : pass2f (ido, l1, p1, p2, plan->fct[k1].tw);
- else if(ip==3)
- sign>0 ? pass3b (ido, l1, p1, p2, plan->fct[k1].tw)
- : pass3f (ido, l1, p1, p2, plan->fct[k1].tw);
- else if(ip==5)
- sign>0 ? pass5b (ido, l1, p1, p2, plan->fct[k1].tw)
- : pass5f (ido, l1, p1, p2, plan->fct[k1].tw);
- else if(ip==7) pass7 (ido, l1, p1, p2, plan->fct[k1].tw, sign);
- else if(ip==11) pass11(ido, l1, p1, p2, plan->fct[k1].tw, sign);
- else
- {
- if (passg(ido, ip, l1, p1, p2, plan->fct[k1].tw, plan->fct[k1].tws, sign))
- { DEALLOC(ch); return -1; }
- SWAP(p1,p2,cmplx *);
- }
- SWAP(p1,p2,cmplx *);
- l1=l2;
- }
- if (p1!=c)
- {
- if (fct!=1.)
- for (size_t i=0; i<len; ++i)
- {
- c[i].r = ch[i].r*fct;
- c[i].i = ch[i].i*fct;
- }
- else
- memcpy (c,p1,len*sizeof(cmplx));
- }
- else
- if (fct!=1.)
- for (size_t i=0; i<len; ++i)
- {
- c[i].r *= fct;
- c[i].i *= fct;
- }
- DEALLOC(ch);
- return 0;
- }
-
-#undef PMSIGNC
-#undef A_EQ_B_MUL_C
-#undef A_EQ_CB_MUL_C
-#undef MULPMSIGNC
-#undef MULPMSIGNCEQ
-
-#undef WA
-#undef CC
-#undef CH
-#undef ROT90
-#undef SCALEC
-#undef ADDC
-#undef PMC
-
-NOINLINE WARN_UNUSED_RESULT
-static int cfftp_forward(cfftp_plan plan, double c[], double fct)
- { return pass_all(plan,(cmplx *)c, fct, -1); }
-
-NOINLINE WARN_UNUSED_RESULT
-static int cfftp_backward(cfftp_plan plan, double c[], double fct)
- { return pass_all(plan,(cmplx *)c, fct, 1); }
-
-NOINLINE WARN_UNUSED_RESULT
-static int cfftp_factorize (cfftp_plan plan)
- {
- size_t length=plan->length;
- size_t nfct=0;
- while ((length%4)==0)
- { if (nfct>=NFCT) return -1; plan->fct[nfct++].fct=4; length>>=2; }
- if ((length%2)==0)
- {
- length>>=1;
- // factor 2 should be at the front of the factor list
- if (nfct>=NFCT) return -1;
- plan->fct[nfct++].fct=2;
- SWAP(plan->fct[0].fct, plan->fct[nfct-1].fct,size_t);
- }
- size_t maxl=(size_t)(sqrt((double)length))+1;
- for (size_t divisor=3; (length>1)&&(divisor<maxl); divisor+=2)
- if ((length%divisor)==0)
- {
- while ((length%divisor)==0)
- {
- if (nfct>=NFCT) return -1;
- plan->fct[nfct++].fct=divisor;
- length/=divisor;
- }
- maxl=(size_t)(sqrt((double)length))+1;
- }
- if (length>1) plan->fct[nfct++].fct=length;
- plan->nfct=nfct;
- return 0;
- }
-
-NOINLINE static size_t cfftp_twsize (cfftp_plan plan)
- {
- size_t twsize=0, l1=1;
- for (size_t k=0; k<plan->nfct; ++k)
- {
- size_t ip=plan->fct[k].fct, ido= plan->length/(l1*ip);
- twsize+=(ip-1)*(ido-1);
- if (ip>11)
- twsize+=ip;
- l1*=ip;
- }
- return twsize;
- }
-
-NOINLINE WARN_UNUSED_RESULT static int cfftp_comp_twiddle (cfftp_plan plan)
- {
- size_t length=plan->length;
- double *twid = RALLOC(double, 2*length);
- if (!twid) return -1;
- sincos_2pibyn(length, twid);
- size_t l1=1;
- size_t memofs=0;
- for (size_t k=0; k<plan->nfct; ++k)
- {
- size_t ip=plan->fct[k].fct, ido= length/(l1*ip);
- plan->fct[k].tw=plan->mem+memofs;
- memofs+=(ip-1)*(ido-1);
- for (size_t j=1; j<ip; ++j)
- for (size_t i=1; i<ido; ++i)
- {
- plan->fct[k].tw[(j-1)*(ido-1)+i-1].r = twid[2*j*l1*i];
- plan->fct[k].tw[(j-1)*(ido-1)+i-1].i = twid[2*j*l1*i+1];
- }
- if (ip>11)
- {
- plan->fct[k].tws=plan->mem+memofs;
- memofs+=ip;
- for (size_t j=0; j<ip; ++j)
- {
- plan->fct[k].tws[j].r = twid[2*j*l1*ido];
- plan->fct[k].tws[j].i = twid[2*j*l1*ido+1];
- }
- }
- l1*=ip;
- }
- DEALLOC(twid);
- return 0;
- }
-
-static cfftp_plan make_cfftp_plan (size_t length)
- {
- if (length==0) return NULL;
- cfftp_plan plan = RALLOC(cfftp_plan_i,1);
- if (!plan) return NULL;
- plan->length=length;
- plan->nfct=0;
- for (size_t i=0; i<NFCT; ++i)
- plan->fct[i]=(cfftp_fctdata){0,0,0};
- plan->mem=0;
- if (length==1) return plan;
- if (cfftp_factorize(plan)!=0) { DEALLOC(plan); return NULL; }
- size_t tws=cfftp_twsize(plan);
- plan->mem=RALLOC(cmplx,tws);
- if (!plan->mem) { DEALLOC(plan); return NULL; }
- if (cfftp_comp_twiddle(plan)!=0)
- { DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
- return plan;
- }
-
-static void destroy_cfftp_plan (cfftp_plan plan)
- {
- DEALLOC(plan->mem);
- DEALLOC(plan);
- }
-
-typedef struct rfftp_fctdata
- {
- size_t fct;
- double *tw, *tws;
- } rfftp_fctdata;
-
-typedef struct rfftp_plan_i
- {
- size_t length, nfct;
- double *mem;
- rfftp_fctdata fct[NFCT];
- } rfftp_plan_i;
-typedef struct rfftp_plan_i * rfftp_plan;
-
-#define WA(x,i) wa[(i)+(x)*(ido-1)]
-#define PM(a,b,c,d) { a=c+d; b=c-d; }
-/* (a+ib) = conj(c+id) * (e+if) */
-#define MULPM(a,b,c,d,e,f) { a=c*e+d*f; b=c*f-d*e; }
-
-#define CC(a,b,c) cc[(a)+ido*((b)+l1*(c))]
-#define CH(a,b,c) ch[(a)+ido*((b)+cdim*(c))]
-
-NOINLINE static void radf2 (size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=2;
-
- for (size_t k=0; k<l1; k++)
- PM (CH(0,0,k),CH(ido-1,1,k),CC(0,k,0),CC(0,k,1))
- if ((ido&1)==0)
- for (size_t k=0; k<l1; k++)
- {
- CH( 0,1,k) = -CC(ido-1,k,1);
- CH(ido-1,0,k) = CC(ido-1,k,0);
- }
- if (ido<=2) return;
- for (size_t k=0; k<l1; k++)
- for (size_t i=2; i<ido; i+=2)
- {
- size_t ic=ido-i;
- double tr2, ti2;
- MULPM (tr2,ti2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1))
- PM (CH(i-1,0,k),CH(ic-1,1,k),CC(i-1,k,0),tr2)
- PM (CH(i ,0,k),CH(ic ,1,k),ti2,CC(i ,k,0))
- }
- }
-
-NOINLINE static void radf3(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=3;
- static const double taur=-0.5, taui=0.86602540378443864676;
-
- for (size_t k=0; k<l1; k++)
- {
- double cr2=CC(0,k,1)+CC(0,k,2);
- CH(0,0,k) = CC(0,k,0)+cr2;
- CH(0,2,k) = taui*(CC(0,k,2)-CC(0,k,1));
- CH(ido-1,1,k) = CC(0,k,0)+taur*cr2;
- }
- if (ido==1) return;
- for (size_t k=0; k<l1; k++)
- for (size_t i=2; i<ido; i+=2)
- {
- size_t ic=ido-i;
- double di2, di3, dr2, dr3;
- MULPM (dr2,di2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1)) // d2=conj(WA0)*CC1
- MULPM (dr3,di3,WA(1,i-2),WA(1,i-1),CC(i-1,k,2),CC(i,k,2)) // d3=conj(WA1)*CC2
- double cr2=dr2+dr3; // c add
- double ci2=di2+di3;
- CH(i-1,0,k) = CC(i-1,k,0)+cr2; // c add
- CH(i ,0,k) = CC(i ,k,0)+ci2;
- double tr2 = CC(i-1,k,0)+taur*cr2; // c add
- double ti2 = CC(i ,k,0)+taur*ci2;
- double tr3 = taui*(di2-di3); // t3 = taui*i*(d3-d2)?
- double ti3 = taui*(dr3-dr2);
- PM(CH(i-1,2,k),CH(ic-1,1,k),tr2,tr3) // PM(i) = t2+t3
- PM(CH(i ,2,k),CH(ic ,1,k),ti3,ti2) // PM(ic) = conj(t2-t3)
- }
- }
-
-NOINLINE static void radf4(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=4;
- static const double hsqt2=0.70710678118654752440;
-
- for (size_t k=0; k<l1; k++)
- {
- double tr1,tr2;
- PM (tr1,CH(0,2,k),CC(0,k,3),CC(0,k,1))
- PM (tr2,CH(ido-1,1,k),CC(0,k,0),CC(0,k,2))
- PM (CH(0,0,k),CH(ido-1,3,k),tr2,tr1)
- }
- if ((ido&1)==0)
- for (size_t k=0; k<l1; k++)
- {
- double ti1=-hsqt2*(CC(ido-1,k,1)+CC(ido-1,k,3));
- double tr1= hsqt2*(CC(ido-1,k,1)-CC(ido-1,k,3));
- PM (CH(ido-1,0,k),CH(ido-1,2,k),CC(ido-1,k,0),tr1)
- PM (CH( 0,3,k),CH( 0,1,k),ti1,CC(ido-1,k,2))
- }
- if (ido<=2) return;
- for (size_t k=0; k<l1; k++)
- for (size_t i=2; i<ido; i+=2)
- {
- size_t ic=ido-i;
- double ci2, ci3, ci4, cr2, cr3, cr4, ti1, ti2, ti3, ti4, tr1, tr2, tr3, tr4;
- MULPM(cr2,ci2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1))
- MULPM(cr3,ci3,WA(1,i-2),WA(1,i-1),CC(i-1,k,2),CC(i,k,2))
- MULPM(cr4,ci4,WA(2,i-2),WA(2,i-1),CC(i-1,k,3),CC(i,k,3))
- PM(tr1,tr4,cr4,cr2)
- PM(ti1,ti4,ci2,ci4)
- PM(tr2,tr3,CC(i-1,k,0),cr3)
- PM(ti2,ti3,CC(i ,k,0),ci3)
- PM(CH(i-1,0,k),CH(ic-1,3,k),tr2,tr1)
- PM(CH(i ,0,k),CH(ic ,3,k),ti1,ti2)
- PM(CH(i-1,2,k),CH(ic-1,1,k),tr3,ti4)
- PM(CH(i ,2,k),CH(ic ,1,k),tr4,ti3)
- }
- }
-
-NOINLINE static void radf5(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=5;
- static const double tr11= 0.3090169943749474241, ti11=0.95105651629515357212,
- tr12=-0.8090169943749474241, ti12=0.58778525229247312917;
-
- for (size_t k=0; k<l1; k++)
- {
- double cr2, cr3, ci4, ci5;
- PM (cr2,ci5,CC(0,k,4),CC(0,k,1))
- PM (cr3,ci4,CC(0,k,3),CC(0,k,2))
- CH(0,0,k)=CC(0,k,0)+cr2+cr3;
- CH(ido-1,1,k)=CC(0,k,0)+tr11*cr2+tr12*cr3;
- CH(0,2,k)=ti11*ci5+ti12*ci4;
- CH(ido-1,3,k)=CC(0,k,0)+tr12*cr2+tr11*cr3;
- CH(0,4,k)=ti12*ci5-ti11*ci4;
- }
- if (ido==1) return;
- for (size_t k=0; k<l1;++k)
- for (size_t i=2; i<ido; i+=2)
- {
- double ci2, di2, ci4, ci5, di3, di4, di5, ci3, cr2, cr3, dr2, dr3,
- dr4, dr5, cr5, cr4, ti2, ti3, ti5, ti4, tr2, tr3, tr4, tr5;
- size_t ic=ido-i;
- MULPM (dr2,di2,WA(0,i-2),WA(0,i-1),CC(i-1,k,1),CC(i,k,1))
- MULPM (dr3,di3,WA(1,i-2),WA(1,i-1),CC(i-1,k,2),CC(i,k,2))
- MULPM (dr4,di4,WA(2,i-2),WA(2,i-1),CC(i-1,k,3),CC(i,k,3))
- MULPM (dr5,di5,WA(3,i-2),WA(3,i-1),CC(i-1,k,4),CC(i,k,4))
- PM(cr2,ci5,dr5,dr2)
- PM(ci2,cr5,di2,di5)
- PM(cr3,ci4,dr4,dr3)
- PM(ci3,cr4,di3,di4)
- CH(i-1,0,k)=CC(i-1,k,0)+cr2+cr3;
- CH(i ,0,k)=CC(i ,k,0)+ci2+ci3;
- tr2=CC(i-1,k,0)+tr11*cr2+tr12*cr3;
- ti2=CC(i ,k,0)+tr11*ci2+tr12*ci3;
- tr3=CC(i-1,k,0)+tr12*cr2+tr11*cr3;
- ti3=CC(i ,k,0)+tr12*ci2+tr11*ci3;
- MULPM(tr5,tr4,cr5,cr4,ti11,ti12)
- MULPM(ti5,ti4,ci5,ci4,ti11,ti12)
- PM(CH(i-1,2,k),CH(ic-1,1,k),tr2,tr5)
- PM(CH(i ,2,k),CH(ic ,1,k),ti5,ti2)
- PM(CH(i-1,4,k),CH(ic-1,3,k),tr3,tr4)
- PM(CH(i ,4,k),CH(ic ,3,k),ti4,ti3)
- }
- }
-
-#undef CC
-#undef CH
-#define C1(a,b,c) cc[(a)+ido*((b)+l1*(c))]
-#define C2(a,b) cc[(a)+idl1*(b)]
-#define CH2(a,b) ch[(a)+idl1*(b)]
-#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
-#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
-NOINLINE static void radfg(size_t ido, size_t ip, size_t l1,
- double * restrict cc, double * restrict ch, const double * restrict wa,
- const double * restrict csarr)
- {
- const size_t cdim=ip;
- size_t ipph=(ip+1)/2;
- size_t idl1 = ido*l1;
-
- if (ido>1)
- {
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 114
- {
- size_t is=(j-1)*(ido-1),
- is2=(jc-1)*(ido-1);
- for (size_t k=0; k<l1; ++k) // 113
- {
- size_t idij=is;
- size_t idij2=is2;
- for (size_t i=1; i<=ido-2; i+=2) // 112
- {
- double t1=C1(i,k,j ), t2=C1(i+1,k,j ),
- t3=C1(i,k,jc), t4=C1(i+1,k,jc);
- double x1=wa[idij]*t1 + wa[idij+1]*t2,
- x2=wa[idij]*t2 - wa[idij+1]*t1,
- x3=wa[idij2]*t3 + wa[idij2+1]*t4,
- x4=wa[idij2]*t4 - wa[idij2+1]*t3;
- C1(i ,k,j ) = x1+x3;
- C1(i ,k,jc) = x2-x4;
- C1(i+1,k,j ) = x2+x4;
- C1(i+1,k,jc) = x3-x1;
- idij+=2;
- idij2+=2;
- }
- }
- }
- }
-
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 123
- for (size_t k=0; k<l1; ++k) // 122
- {
- double t1=C1(0,k,j), t2=C1(0,k,jc);
- C1(0,k,j ) = t1+t2;
- C1(0,k,jc) = t2-t1;
- }
-
-//everything in C
-//memset(ch,0,ip*l1*ido*sizeof(double));
-
- for (size_t l=1,lc=ip-1; l<ipph; ++l,--lc) // 127
- {
- for (size_t ik=0; ik<idl1; ++ik) // 124
- {
- CH2(ik,l ) = C2(ik,0)+csarr[2*l]*C2(ik,1)+csarr[4*l]*C2(ik,2);
- CH2(ik,lc) = csarr[2*l+1]*C2(ik,ip-1)+csarr[4*l+1]*C2(ik,ip-2);
- }
- size_t iang = 2*l;
- size_t j=3, jc=ip-3;
- for (; j<ipph-3; j+=4,jc-=4) // 126
- {
- iang+=l; if (iang>=ip) iang-=ip;
- double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
- iang+=l; if (iang>=ip) iang-=ip;
- double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
- iang+=l; if (iang>=ip) iang-=ip;
- double ar3=csarr[2*iang], ai3=csarr[2*iang+1];
- iang+=l; if (iang>=ip) iang-=ip;
- double ar4=csarr[2*iang], ai4=csarr[2*iang+1];
- for (size_t ik=0; ik<idl1; ++ik) // 125
- {
- CH2(ik,l ) += ar1*C2(ik,j )+ar2*C2(ik,j +1)
- +ar3*C2(ik,j +2)+ar4*C2(ik,j +3);
- CH2(ik,lc) += ai1*C2(ik,jc)+ai2*C2(ik,jc-1)
- +ai3*C2(ik,jc-2)+ai4*C2(ik,jc-3);
- }
- }
- for (; j<ipph-1; j+=2,jc-=2) // 126
- {
- iang+=l; if (iang>=ip) iang-=ip;
- double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
- iang+=l; if (iang>=ip) iang-=ip;
- double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
- for (size_t ik=0; ik<idl1; ++ik) // 125
- {
- CH2(ik,l ) += ar1*C2(ik,j )+ar2*C2(ik,j +1);
- CH2(ik,lc) += ai1*C2(ik,jc)+ai2*C2(ik,jc-1);
- }
- }
- for (; j<ipph; ++j,--jc) // 126
- {
- iang+=l; if (iang>=ip) iang-=ip;
- double ar=csarr[2*iang], ai=csarr[2*iang+1];
- for (size_t ik=0; ik<idl1; ++ik) // 125
- {
- CH2(ik,l ) += ar*C2(ik,j );
- CH2(ik,lc) += ai*C2(ik,jc);
- }
- }
- }
- for (size_t ik=0; ik<idl1; ++ik) // 101
- CH2(ik,0) = C2(ik,0);
- for (size_t j=1; j<ipph; ++j) // 129
- for (size_t ik=0; ik<idl1; ++ik) // 128
- CH2(ik,0) += C2(ik,j);
-
-// everything in CH at this point!
-//memset(cc,0,ip*l1*ido*sizeof(double));
-
- for (size_t k=0; k<l1; ++k) // 131
- for (size_t i=0; i<ido; ++i) // 130
- CC(i,0,k) = CH(i,k,0);
-
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 137
- {
- size_t j2=2*j-1;
- for (size_t k=0; k<l1; ++k) // 136
- {
- CC(ido-1,j2,k) = CH(0,k,j);
- CC(0,j2+1,k) = CH(0,k,jc);
- }
- }
-
- if (ido==1) return;
-
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 140
- {
- size_t j2=2*j-1;
- for(size_t k=0; k<l1; ++k) // 139
- for(size_t i=1, ic=ido-i-2; i<=ido-2; i+=2, ic-=2) // 138
- {
- CC(i ,j2+1,k) = CH(i ,k,j )+CH(i ,k,jc);
- CC(ic ,j2 ,k) = CH(i ,k,j )-CH(i ,k,jc);
- CC(i+1 ,j2+1,k) = CH(i+1,k,j )+CH(i+1,k,jc);
- CC(ic+1,j2 ,k) = CH(i+1,k,jc)-CH(i+1,k,j );
- }
- }
- }
-#undef C1
-#undef C2
-#undef CH2
-
-#undef CH
-#undef CC
-#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
-#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
-
-NOINLINE static void radb2(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=2;
-
- for (size_t k=0; k<l1; k++)
- PM (CH(0,k,0),CH(0,k,1),CC(0,0,k),CC(ido-1,1,k))
- if ((ido&1)==0)
- for (size_t k=0; k<l1; k++)
- {
- CH(ido-1,k,0) = 2.*CC(ido-1,0,k);
- CH(ido-1,k,1) =-2.*CC(0 ,1,k);
- }
- if (ido<=2) return;
- for (size_t k=0; k<l1;++k)
- for (size_t i=2; i<ido; i+=2)
- {
- size_t ic=ido-i;
- double ti2, tr2;
- PM (CH(i-1,k,0),tr2,CC(i-1,0,k),CC(ic-1,1,k))
- PM (ti2,CH(i ,k,0),CC(i ,0,k),CC(ic ,1,k))
- MULPM (CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),ti2,tr2)
- }
- }
-
-NOINLINE static void radb3(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=3;
- static const double taur=-0.5, taui=0.86602540378443864676;
-
- for (size_t k=0; k<l1; k++)
- {
- double tr2=2.*CC(ido-1,1,k);
- double cr2=CC(0,0,k)+taur*tr2;
- CH(0,k,0)=CC(0,0,k)+tr2;
- double ci3=2.*taui*CC(0,2,k);
- PM (CH(0,k,2),CH(0,k,1),cr2,ci3);
- }
- if (ido==1) return;
- for (size_t k=0; k<l1; k++)
- for (size_t i=2; i<ido; i+=2)
- {
- size_t ic=ido-i;
- double tr2=CC(i-1,2,k)+CC(ic-1,1,k); // t2=CC(I) + conj(CC(ic))
- double ti2=CC(i ,2,k)-CC(ic ,1,k);
- double cr2=CC(i-1,0,k)+taur*tr2; // c2=CC +taur*t2
- double ci2=CC(i ,0,k)+taur*ti2;
- CH(i-1,k,0)=CC(i-1,0,k)+tr2; // CH=CC+t2
- CH(i ,k,0)=CC(i ,0,k)+ti2;
- double cr3=taui*(CC(i-1,2,k)-CC(ic-1,1,k));// c3=taui*(CC(i)-conj(CC(ic)))
- double ci3=taui*(CC(i ,2,k)+CC(ic ,1,k));
- double di2, di3, dr2, dr3;
- PM(dr3,dr2,cr2,ci3) // d2= (cr2-ci3, ci2+cr3) = c2+i*c3
- PM(di2,di3,ci2,cr3) // d3= (cr2+ci3, ci2-cr3) = c2-i*c3
- MULPM(CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),di2,dr2) // ch = WA*d2
- MULPM(CH(i,k,2),CH(i-1,k,2),WA(1,i-2),WA(1,i-1),di3,dr3)
- }
- }
-
-NOINLINE static void radb4(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=4;
- static const double sqrt2=1.41421356237309504880;
-
- for (size_t k=0; k<l1; k++)
- {
- double tr1, tr2;
- PM (tr2,tr1,CC(0,0,k),CC(ido-1,3,k))
- double tr3=2.*CC(ido-1,1,k);
- double tr4=2.*CC(0,2,k);
- PM (CH(0,k,0),CH(0,k,2),tr2,tr3)
- PM (CH(0,k,3),CH(0,k,1),tr1,tr4)
- }
- if ((ido&1)==0)
- for (size_t k=0; k<l1; k++)
- {
- double tr1,tr2,ti1,ti2;
- PM (ti1,ti2,CC(0 ,3,k),CC(0 ,1,k))
- PM (tr2,tr1,CC(ido-1,0,k),CC(ido-1,2,k))
- CH(ido-1,k,0)=tr2+tr2;
- CH(ido-1,k,1)=sqrt2*(tr1-ti1);
- CH(ido-1,k,2)=ti2+ti2;
- CH(ido-1,k,3)=-sqrt2*(tr1+ti1);
- }
- if (ido<=2) return;
- for (size_t k=0; k<l1;++k)
- for (size_t i=2; i<ido; i+=2)
- {
- double ci2, ci3, ci4, cr2, cr3, cr4, ti1, ti2, ti3, ti4, tr1, tr2, tr3, tr4;
- size_t ic=ido-i;
- PM (tr2,tr1,CC(i-1,0,k),CC(ic-1,3,k))
- PM (ti1,ti2,CC(i ,0,k),CC(ic ,3,k))
- PM (tr4,ti3,CC(i ,2,k),CC(ic ,1,k))
- PM (tr3,ti4,CC(i-1,2,k),CC(ic-1,1,k))
- PM (CH(i-1,k,0),cr3,tr2,tr3)
- PM (CH(i ,k,0),ci3,ti2,ti3)
- PM (cr4,cr2,tr1,tr4)
- PM (ci2,ci4,ti1,ti4)
- MULPM (CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),ci2,cr2)
- MULPM (CH(i,k,2),CH(i-1,k,2),WA(1,i-2),WA(1,i-1),ci3,cr3)
- MULPM (CH(i,k,3),CH(i-1,k,3),WA(2,i-2),WA(2,i-1),ci4,cr4)
- }
- }
-
-NOINLINE static void radb5(size_t ido, size_t l1, const double * restrict cc,
- double * restrict ch, const double * restrict wa)
- {
- const size_t cdim=5;
- static const double tr11= 0.3090169943749474241, ti11=0.95105651629515357212,
- tr12=-0.8090169943749474241, ti12=0.58778525229247312917;
-
- for (size_t k=0; k<l1; k++)
- {
- double ti5=CC(0,2,k)+CC(0,2,k);
- double ti4=CC(0,4,k)+CC(0,4,k);
- double tr2=CC(ido-1,1,k)+CC(ido-1,1,k);
- double tr3=CC(ido-1,3,k)+CC(ido-1,3,k);
- CH(0,k,0)=CC(0,0,k)+tr2+tr3;
- double cr2=CC(0,0,k)+tr11*tr2+tr12*tr3;
- double cr3=CC(0,0,k)+tr12*tr2+tr11*tr3;
- double ci4, ci5;
- MULPM(ci5,ci4,ti5,ti4,ti11,ti12)
- PM(CH(0,k,4),CH(0,k,1),cr2,ci5)
- PM(CH(0,k,3),CH(0,k,2),cr3,ci4)
- }
- if (ido==1) return;
- for (size_t k=0; k<l1;++k)
- for (size_t i=2; i<ido; i+=2)
- {
- size_t ic=ido-i;
- double tr2, tr3, tr4, tr5, ti2, ti3, ti4, ti5;
- PM(tr2,tr5,CC(i-1,2,k),CC(ic-1,1,k))
- PM(ti5,ti2,CC(i ,2,k),CC(ic ,1,k))
- PM(tr3,tr4,CC(i-1,4,k),CC(ic-1,3,k))
- PM(ti4,ti3,CC(i ,4,k),CC(ic ,3,k))
- CH(i-1,k,0)=CC(i-1,0,k)+tr2+tr3;
- CH(i ,k,0)=CC(i ,0,k)+ti2+ti3;
- double cr2=CC(i-1,0,k)+tr11*tr2+tr12*tr3;
- double ci2=CC(i ,0,k)+tr11*ti2+tr12*ti3;
- double cr3=CC(i-1,0,k)+tr12*tr2+tr11*tr3;
- double ci3=CC(i ,0,k)+tr12*ti2+tr11*ti3;
- double ci4, ci5, cr5, cr4;
- MULPM(cr5,cr4,tr5,tr4,ti11,ti12)
- MULPM(ci5,ci4,ti5,ti4,ti11,ti12)
- double dr2, dr3, dr4, dr5, di2, di3, di4, di5;
- PM(dr4,dr3,cr3,ci4)
- PM(di3,di4,ci3,cr4)
- PM(dr5,dr2,cr2,ci5)
- PM(di2,di5,ci2,cr5)
- MULPM(CH(i,k,1),CH(i-1,k,1),WA(0,i-2),WA(0,i-1),di2,dr2)
- MULPM(CH(i,k,2),CH(i-1,k,2),WA(1,i-2),WA(1,i-1),di3,dr3)
- MULPM(CH(i,k,3),CH(i-1,k,3),WA(2,i-2),WA(2,i-1),di4,dr4)
- MULPM(CH(i,k,4),CH(i-1,k,4),WA(3,i-2),WA(3,i-1),di5,dr5)
- }
- }
-
-#undef CC
-#undef CH
-#define CC(a,b,c) cc[(a)+ido*((b)+cdim*(c))]
-#define CH(a,b,c) ch[(a)+ido*((b)+l1*(c))]
-#define C1(a,b,c) cc[(a)+ido*((b)+l1*(c))]
-#define C2(a,b) cc[(a)+idl1*(b)]
-#define CH2(a,b) ch[(a)+idl1*(b)]
-
-NOINLINE static void radbg(size_t ido, size_t ip, size_t l1,
- double * restrict cc, double * restrict ch, const double * restrict wa,
- const double * restrict csarr)
- {
- const size_t cdim=ip;
- size_t ipph=(ip+1)/ 2;
- size_t idl1 = ido*l1;
-
- for (size_t k=0; k<l1; ++k) // 102
- for (size_t i=0; i<ido; ++i) // 101
- CH(i,k,0) = CC(i,0,k);
- for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc) // 108
- {
- size_t j2=2*j-1;
- for (size_t k=0; k<l1; ++k)
- {
- CH(0,k,j ) = 2*CC(ido-1,j2,k);
- CH(0,k,jc) = 2*CC(0,j2+1,k);
- }
- }
-
- if (ido!=1)
- {
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 111
- {
- size_t j2=2*j-1;
- for (size_t k=0; k<l1; ++k)
- for (size_t i=1, ic=ido-i-2; i<=ido-2; i+=2, ic-=2) // 109
- {
- CH(i ,k,j ) = CC(i ,j2+1,k)+CC(ic ,j2,k);
- CH(i ,k,jc) = CC(i ,j2+1,k)-CC(ic ,j2,k);
- CH(i+1,k,j ) = CC(i+1,j2+1,k)-CC(ic+1,j2,k);
- CH(i+1,k,jc) = CC(i+1,j2+1,k)+CC(ic+1,j2,k);
- }
- }
- }
- for (size_t l=1,lc=ip-1; l<ipph; ++l,--lc)
- {
- for (size_t ik=0; ik<idl1; ++ik)
- {
- C2(ik,l ) = CH2(ik,0)+csarr[2*l]*CH2(ik,1)+csarr[4*l]*CH2(ik,2);
- C2(ik,lc) = csarr[2*l+1]*CH2(ik,ip-1)+csarr[4*l+1]*CH2(ik,ip-2);
- }
- size_t iang=2*l;
- size_t j=3,jc=ip-3;
- for(; j<ipph-3; j+=4,jc-=4)
- {
- iang+=l; if(iang>ip) iang-=ip;
- double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
- iang+=l; if(iang>ip) iang-=ip;
- double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
- iang+=l; if(iang>ip) iang-=ip;
- double ar3=csarr[2*iang], ai3=csarr[2*iang+1];
- iang+=l; if(iang>ip) iang-=ip;
- double ar4=csarr[2*iang], ai4=csarr[2*iang+1];
- for (size_t ik=0; ik<idl1; ++ik)
- {
- C2(ik,l ) += ar1*CH2(ik,j )+ar2*CH2(ik,j +1)
- +ar3*CH2(ik,j +2)+ar4*CH2(ik,j +3);
- C2(ik,lc) += ai1*CH2(ik,jc)+ai2*CH2(ik,jc-1)
- +ai3*CH2(ik,jc-2)+ai4*CH2(ik,jc-3);
- }
- }
- for(; j<ipph-1; j+=2,jc-=2)
- {
- iang+=l; if(iang>ip) iang-=ip;
- double ar1=csarr[2*iang], ai1=csarr[2*iang+1];
- iang+=l; if(iang>ip) iang-=ip;
- double ar2=csarr[2*iang], ai2=csarr[2*iang+1];
- for (size_t ik=0; ik<idl1; ++ik)
- {
- C2(ik,l ) += ar1*CH2(ik,j )+ar2*CH2(ik,j +1);
- C2(ik,lc) += ai1*CH2(ik,jc)+ai2*CH2(ik,jc-1);
- }
- }
- for(; j<ipph; ++j,--jc)
- {
- iang+=l; if(iang>ip) iang-=ip;
- double war=csarr[2*iang], wai=csarr[2*iang+1];
- for (size_t ik=0; ik<idl1; ++ik)
- {
- C2(ik,l ) += war*CH2(ik,j );
- C2(ik,lc) += wai*CH2(ik,jc);
- }
- }
- }
- for (size_t j=1; j<ipph; ++j)
- for (size_t ik=0; ik<idl1; ++ik)
- CH2(ik,0) += CH2(ik,j);
- for (size_t j=1, jc=ip-1; j<ipph; ++j,--jc) // 124
- for (size_t k=0; k<l1; ++k)
- {
- CH(0,k,j ) = C1(0,k,j)-C1(0,k,jc);
- CH(0,k,jc) = C1(0,k,j)+C1(0,k,jc);
- }
-
- if (ido==1) return;
-
- for (size_t j=1, jc=ip-1; j<ipph; ++j, --jc) // 127
- for (size_t k=0; k<l1; ++k)
- for (size_t i=1; i<=ido-2; i+=2)
- {
- CH(i ,k,j ) = C1(i ,k,j)-C1(i+1,k,jc);
- CH(i ,k,jc) = C1(i ,k,j)+C1(i+1,k,jc);
- CH(i+1,k,j ) = C1(i+1,k,j)+C1(i ,k,jc);
- CH(i+1,k,jc) = C1(i+1,k,j)-C1(i ,k,jc);
- }
-
-// All in CH
-
- for (size_t j=1; j<ip; ++j)
- {
- size_t is = (j-1)*(ido-1);
- for (size_t k=0; k<l1; ++k)
- {
- size_t idij = is;
- for (size_t i=1; i<=ido-2; i+=2)
- {
- double t1=CH(i,k,j), t2=CH(i+1,k,j);
- CH(i ,k,j) = wa[idij]*t1-wa[idij+1]*t2;
- CH(i+1,k,j) = wa[idij]*t2+wa[idij+1]*t1;
- idij+=2;
- }
- }
- }
- }
-#undef C1
-#undef C2
-#undef CH2
-
-#undef CC
-#undef CH
-#undef PM
-#undef MULPM
-#undef WA
-
-static void copy_and_norm(double *c, double *p1, size_t n, double fct)
- {
- if (p1!=c)
- {
- if (fct!=1.)
- for (size_t i=0; i<n; ++i)
- c[i] = fct*p1[i];
- else
- memcpy (c,p1,n*sizeof(double));
- }
- else
- if (fct!=1.)
- for (size_t i=0; i<n; ++i)
- c[i] *= fct;
- }
-
-WARN_UNUSED_RESULT
-static int rfftp_forward(rfftp_plan plan, double c[], double fct)
- {
- if (plan->length==1) return 0;
- size_t n=plan->length;
- size_t l1=n, nf=plan->nfct;
- double *ch = RALLOC(double, n);
- if (!ch) return -1;
- double *p1=c, *p2=ch;
-
- for(size_t k1=0; k1<nf;++k1)
- {
- size_t k=nf-k1-1;
- size_t ip=plan->fct[k].fct;
- size_t ido=n / l1;
- l1 /= ip;
- if(ip==4)
- radf4(ido, l1, p1, p2, plan->fct[k].tw);
- else if(ip==2)
- radf2(ido, l1, p1, p2, plan->fct[k].tw);
- else if(ip==3)
- radf3(ido, l1, p1, p2, plan->fct[k].tw);
- else if(ip==5)
- radf5(ido, l1, p1, p2, plan->fct[k].tw);
- else
- {
- radfg(ido, ip, l1, p1, p2, plan->fct[k].tw, plan->fct[k].tws);
- SWAP (p1,p2,double *);
- }
- SWAP (p1,p2,double *);
- }
- copy_and_norm(c,p1,n,fct);
- DEALLOC(ch);
- return 0;
- }
-
-WARN_UNUSED_RESULT
-static int rfftp_backward(rfftp_plan plan, double c[], double fct)
- {
- if (plan->length==1) return 0;
- size_t n=plan->length;
- size_t l1=1, nf=plan->nfct;
- double *ch = RALLOC(double, n);
- if (!ch) return -1;
- double *p1=c, *p2=ch;
-
- for(size_t k=0; k<nf; k++)
- {
- size_t ip = plan->fct[k].fct,
- ido= n/(ip*l1);
- if(ip==4)
- radb4(ido, l1, p1, p2, plan->fct[k].tw);
- else if(ip==2)
- radb2(ido, l1, p1, p2, plan->fct[k].tw);
- else if(ip==3)
- radb3(ido, l1, p1, p2, plan->fct[k].tw);
- else if(ip==5)
- radb5(ido, l1, p1, p2, plan->fct[k].tw);
- else
- radbg(ido, ip, l1, p1, p2, plan->fct[k].tw, plan->fct[k].tws);
- SWAP (p1,p2,double *);
- l1*=ip;
- }
- copy_and_norm(c,p1,n,fct);
- DEALLOC(ch);
- return 0;
- }
-
-WARN_UNUSED_RESULT
-static int rfftp_factorize (rfftp_plan plan)
- {
- size_t length=plan->length;
- size_t nfct=0;
- while ((length%4)==0)
- { if (nfct>=NFCT) return -1; plan->fct[nfct++].fct=4; length>>=2; }
- if ((length%2)==0)
- {
- length>>=1;
- // factor 2 should be at the front of the factor list
- if (nfct>=NFCT) return -1;
- plan->fct[nfct++].fct=2;
- SWAP(plan->fct[0].fct, plan->fct[nfct-1].fct,size_t);
- }
- size_t maxl=(size_t)(sqrt((double)length))+1;
- for (size_t divisor=3; (length>1)&&(divisor<maxl); divisor+=2)
- if ((length%divisor)==0)
- {
- while ((length%divisor)==0)
- {
- if (nfct>=NFCT) return -1;
- plan->fct[nfct++].fct=divisor;
- length/=divisor;
- }
- maxl=(size_t)(sqrt((double)length))+1;
- }
- if (length>1) plan->fct[nfct++].fct=length;
- plan->nfct=nfct;
- return 0;
- }
-
-static size_t rfftp_twsize(rfftp_plan plan)
- {
- size_t twsize=0, l1=1;
- for (size_t k=0; k<plan->nfct; ++k)
- {
- size_t ip=plan->fct[k].fct, ido= plan->length/(l1*ip);
- twsize+=(ip-1)*(ido-1);
- if (ip>5) twsize+=2*ip;
- l1*=ip;
- }
- return twsize;
- return 0;
- }
-
-WARN_UNUSED_RESULT NOINLINE static int rfftp_comp_twiddle (rfftp_plan plan)
- {
- size_t length=plan->length;
- double *twid = RALLOC(double, 2*length);
- if (!twid) return -1;
- sincos_2pibyn_half(length, twid);
- size_t l1=1;
- double *ptr=plan->mem;
- for (size_t k=0; k<plan->nfct; ++k)
- {
- size_t ip=plan->fct[k].fct, ido=length/(l1*ip);
- if (k<plan->nfct-1) // last factor doesn't need twiddles
- {
- plan->fct[k].tw=ptr; ptr+=(ip-1)*(ido-1);
- for (size_t j=1; j<ip; ++j)
- for (size_t i=1; i<=(ido-1)/2; ++i)
- {
- plan->fct[k].tw[(j-1)*(ido-1)+2*i-2] = twid[2*j*l1*i];
- plan->fct[k].tw[(j-1)*(ido-1)+2*i-1] = twid[2*j*l1*i+1];
- }
- }
- if (ip>5) // special factors required by *g functions
- {
- plan->fct[k].tws=ptr; ptr+=2*ip;
- plan->fct[k].tws[0] = 1.;
- plan->fct[k].tws[1] = 0.;
- for (size_t i=1; i<=(ip>>1); ++i)
- {
- plan->fct[k].tws[2*i ] = twid[2*i*(length/ip)];
- plan->fct[k].tws[2*i+1] = twid[2*i*(length/ip)+1];
- plan->fct[k].tws[2*(ip-i) ] = twid[2*i*(length/ip)];
- plan->fct[k].tws[2*(ip-i)+1] = -twid[2*i*(length/ip)+1];
- }
- }
- l1*=ip;
- }
- DEALLOC(twid);
- return 0;
- }
-
-NOINLINE static rfftp_plan make_rfftp_plan (size_t length)
- {
- if (length==0) return NULL;
- rfftp_plan plan = RALLOC(rfftp_plan_i,1);
- if (!plan) return NULL;
- plan->length=length;
- plan->nfct=0;
- plan->mem=NULL;
- for (size_t i=0; i<NFCT; ++i)
- plan->fct[i]=(rfftp_fctdata){0,0,0};
- if (length==1) return plan;
- if (rfftp_factorize(plan)!=0) { DEALLOC(plan); return NULL; }
- size_t tws=rfftp_twsize(plan);
- plan->mem=RALLOC(double,tws);
- if (!plan->mem) { DEALLOC(plan); return NULL; }
- if (rfftp_comp_twiddle(plan)!=0)
- { DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
- return plan;
- }
-
-NOINLINE static void destroy_rfftp_plan (rfftp_plan plan)
- {
- DEALLOC(plan->mem);
- DEALLOC(plan);
- }
-
-typedef struct fftblue_plan_i
- {
- size_t n, n2;
- cfftp_plan plan;
- double *mem;
- double *bk, *bkf;
- } fftblue_plan_i;
-typedef struct fftblue_plan_i * fftblue_plan;
-
-NOINLINE static fftblue_plan make_fftblue_plan (size_t length)
- {
- fftblue_plan plan = RALLOC(fftblue_plan_i,1);
- if (!plan) return NULL;
- plan->n = length;
- plan->n2 = good_size(plan->n*2-1);
- plan->mem = RALLOC(double, 2*plan->n+2*plan->n2);
- if (!plan->mem) { DEALLOC(plan); return NULL; }
- plan->bk = plan->mem;
- plan->bkf = plan->bk+2*plan->n;
-
-/* initialize b_k */
- double *tmp = RALLOC(double,4*plan->n);
- if (!tmp) { DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
- sincos_2pibyn(2*plan->n,tmp);
- plan->bk[0] = 1;
- plan->bk[1] = 0;
-
- size_t coeff=0;
- for (size_t m=1; m<plan->n; ++m)
- {
- coeff+=2*m-1;
- if (coeff>=2*plan->n) coeff-=2*plan->n;
- plan->bk[2*m ] = tmp[2*coeff ];
- plan->bk[2*m+1] = tmp[2*coeff+1];
- }
-
- /* initialize the zero-padded, Fourier transformed b_k. Add normalisation. */
- double xn2 = 1./plan->n2;
- plan->bkf[0] = plan->bk[0]*xn2;
- plan->bkf[1] = plan->bk[1]*xn2;
- for (size_t m=2; m<2*plan->n; m+=2)
- {
- plan->bkf[m] = plan->bkf[2*plan->n2-m] = plan->bk[m] *xn2;
- plan->bkf[m+1] = plan->bkf[2*plan->n2-m+1] = plan->bk[m+1] *xn2;
- }
- for (size_t m=2*plan->n;m<=(2*plan->n2-2*plan->n+1);++m)
- plan->bkf[m]=0.;
- plan->plan=make_cfftp_plan(plan->n2);
- if (!plan->plan)
- { DEALLOC(tmp); DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
- if (cfftp_forward(plan->plan,plan->bkf,1.)!=0)
- { DEALLOC(tmp); DEALLOC(plan->mem); DEALLOC(plan); return NULL; }
- DEALLOC(tmp);
-
- return plan;
- }
-
-NOINLINE static void destroy_fftblue_plan (fftblue_plan plan)
- {
- DEALLOC(plan->mem);
- destroy_cfftp_plan(plan->plan);
- DEALLOC(plan);
- }
-
-NOINLINE WARN_UNUSED_RESULT
-static int fftblue_fft(fftblue_plan plan, double c[], int isign, double fct)
- {
- size_t n=plan->n;
- size_t n2=plan->n2;
- double *bk = plan->bk;
- double *bkf = plan->bkf;
- double *akf = RALLOC(double, 2*n2);
- if (!akf) return -1;
-
-/* initialize a_k and FFT it */
- if (isign>0)
- for (size_t m=0; m<2*n; m+=2)
- {
- akf[m] = c[m]*bk[m] - c[m+1]*bk[m+1];
- akf[m+1] = c[m]*bk[m+1] + c[m+1]*bk[m];
- }
- else
- for (size_t m=0; m<2*n; m+=2)
- {
- akf[m] = c[m]*bk[m] + c[m+1]*bk[m+1];
- akf[m+1] =-c[m]*bk[m+1] + c[m+1]*bk[m];
- }
- for (size_t m=2*n; m<2*n2; ++m)
- akf[m]=0;
-
- if (cfftp_forward (plan->plan,akf,fct)!=0)
- { DEALLOC(akf); return -1; }
-
-/* do the convolution */
- if (isign>0)
- for (size_t m=0; m<2*n2; m+=2)
- {
- double im = -akf[m]*bkf[m+1] + akf[m+1]*bkf[m];
- akf[m ] = akf[m]*bkf[m] + akf[m+1]*bkf[m+1];
- akf[m+1] = im;
- }
- else
- for (size_t m=0; m<2*n2; m+=2)
- {
- double im = akf[m]*bkf[m+1] + akf[m+1]*bkf[m];
- akf[m ] = akf[m]*bkf[m] - akf[m+1]*bkf[m+1];
- akf[m+1] = im;
- }
-
-/* inverse FFT */
- if (cfftp_backward (plan->plan,akf,1.)!=0)
- { DEALLOC(akf); return -1; }
-
-/* multiply by b_k */
- if (isign>0)
- for (size_t m=0; m<2*n; m+=2)
- {
- c[m] = bk[m] *akf[m] - bk[m+1]*akf[m+1];
- c[m+1] = bk[m+1]*akf[m] + bk[m] *akf[m+1];
- }
- else
- for (size_t m=0; m<2*n; m+=2)
- {
- c[m] = bk[m] *akf[m] + bk[m+1]*akf[m+1];
- c[m+1] =-bk[m+1]*akf[m] + bk[m] *akf[m+1];
- }
- DEALLOC(akf);
- return 0;
- }
-
-WARN_UNUSED_RESULT
-static int cfftblue_backward(fftblue_plan plan, double c[], double fct)
- { return fftblue_fft(plan,c,1,fct); }
-
-WARN_UNUSED_RESULT
-static int cfftblue_forward(fftblue_plan plan, double c[], double fct)
- { return fftblue_fft(plan,c,-1,fct); }
-
-WARN_UNUSED_RESULT
-static int rfftblue_backward(fftblue_plan plan, double c[], double fct)
- {
- size_t n=plan->n;
- double *tmp = RALLOC(double,2*n);
- if (!tmp) return -1;
- tmp[0]=c[0];
- tmp[1]=0.;
- memcpy (tmp+2,c+1, (n-1)*sizeof(double));
- if ((n&1)==0) tmp[n+1]=0.;
- for (size_t m=2; m<n; m+=2)
- {
- tmp[2*n-m]=tmp[m];
- tmp[2*n-m+1]=-tmp[m+1];
- }
- if (fftblue_fft(plan,tmp,1,fct)!=0)
- { DEALLOC(tmp); return -1; }
- for (size_t m=0; m<n; ++m)
- c[m] = tmp[2*m];
- DEALLOC(tmp);
- return 0;
- }
-
-WARN_UNUSED_RESULT
-static int rfftblue_forward(fftblue_plan plan, double c[], double fct)
- {
- size_t n=plan->n;
- double *tmp = RALLOC(double,2*n);
- if (!tmp) return -1;
- for (size_t m=0; m<n; ++m)
- {
- tmp[2*m] = c[m];
- tmp[2*m+1] = 0.;
- }
- if (fftblue_fft(plan,tmp,-1,fct)!=0)
- { DEALLOC(tmp); return -1; }
- c[0] = tmp[0];
- memcpy (c+1, tmp+2, (n-1)*sizeof(double));
- DEALLOC(tmp);
- return 0;
- }
-
-typedef struct cfft_plan_i
- {
- cfftp_plan packplan;
- fftblue_plan blueplan;
- } cfft_plan_i;
-
-static cfft_plan make_cfft_plan (size_t length)
- {
- if (length==0) return NULL;
- cfft_plan plan = RALLOC(cfft_plan_i,1);
- if (!plan) return NULL;
- plan->blueplan=0;
- plan->packplan=0;
- if ((length<50) || (largest_prime_factor(length)<=sqrt(length)))
- {
- plan->packplan=make_cfftp_plan(length);
- if (!plan->packplan) { DEALLOC(plan); return NULL; }
- return plan;
- }
- double comp1 = cost_guess(length);
- double comp2 = 2*cost_guess(good_size(2*length-1));
- comp2*=1.5; /* fudge factor that appears to give good overall performance */
- if (comp2<comp1) // use Bluestein
- {
- plan->blueplan=make_fftblue_plan(length);
- if (!plan->blueplan) { DEALLOC(plan); return NULL; }
- }
- else
- {
- plan->packplan=make_cfftp_plan(length);
- if (!plan->packplan) { DEALLOC(plan); return NULL; }
- }
- return plan;
- }
-
-static void destroy_cfft_plan (cfft_plan plan)
- {
- if (plan->blueplan)
- destroy_fftblue_plan(plan->blueplan);
- if (plan->packplan)
- destroy_cfftp_plan(plan->packplan);
- DEALLOC(plan);
- }
-
-WARN_UNUSED_RESULT static int cfft_backward(cfft_plan plan, double c[], double fct)
- {
- if (plan->packplan)
- return cfftp_backward(plan->packplan,c,fct);
- // if (plan->blueplan)
- return cfftblue_backward(plan->blueplan,c,fct);
- }
-
-WARN_UNUSED_RESULT static int cfft_forward(cfft_plan plan, double c[], double fct)
- {
- if (plan->packplan)
- return cfftp_forward(plan->packplan,c,fct);
- // if (plan->blueplan)
- return cfftblue_forward(plan->blueplan,c,fct);
- }
-
-typedef struct rfft_plan_i
- {
- rfftp_plan packplan;
- fftblue_plan blueplan;
- } rfft_plan_i;
-
-static rfft_plan make_rfft_plan (size_t length)
- {
- if (length==0) return NULL;
- rfft_plan plan = RALLOC(rfft_plan_i,1);
- if (!plan) return NULL;
- plan->blueplan=0;
- plan->packplan=0;
- if ((length<50) || (largest_prime_factor(length)<=sqrt(length)))
- {
- plan->packplan=make_rfftp_plan(length);
- if (!plan->packplan) { DEALLOC(plan); return NULL; }
- return plan;
- }
- double comp1 = 0.5*cost_guess(length);
- double comp2 = 2*cost_guess(good_size(2*length-1));
- comp2*=1.5; /* fudge factor that appears to give good overall performance */
- if (comp2<comp1) // use Bluestein
- {
- plan->blueplan=make_fftblue_plan(length);
- if (!plan->blueplan) { DEALLOC(plan); return NULL; }
- }
- else
- {
- plan->packplan=make_rfftp_plan(length);
- if (!plan->packplan) { DEALLOC(plan); return NULL; }
- }
- return plan;
- }
-
-static void destroy_rfft_plan (rfft_plan plan)
- {
- if (plan->blueplan)
- destroy_fftblue_plan(plan->blueplan);
- if (plan->packplan)
- destroy_rfftp_plan(plan->packplan);
- DEALLOC(plan);
- }
-
-WARN_UNUSED_RESULT static int rfft_backward(rfft_plan plan, double c[], double fct)
- {
- if (plan->packplan)
- return rfftp_backward(plan->packplan,c,fct);
- else // if (plan->blueplan)
- return rfftblue_backward(plan->blueplan,c,fct);
- }
-
-WARN_UNUSED_RESULT static int rfft_forward(rfft_plan plan, double c[], double fct)
- {
- if (plan->packplan)
- return rfftp_forward(plan->packplan,c,fct);
- else // if (plan->blueplan)
- return rfftblue_forward(plan->blueplan,c,fct);
- }
-
-#define NPY_NO_DEPRECATED_API NPY_API_VERSION
-
-#include "Python.h"
-#include "numpy/arrayobject.h"
-
-static PyObject *
-execute_complex(PyObject *a1, int is_forward, double fct)
-{
- PyArrayObject *data = (PyArrayObject *)PyArray_FromAny(a1,
- PyArray_DescrFromType(NPY_CDOUBLE), 1, 0,
- NPY_ARRAY_ENSURECOPY | NPY_ARRAY_DEFAULT |
- NPY_ARRAY_ENSUREARRAY | NPY_ARRAY_FORCECAST,
- NULL);
- if (!data) return NULL;
-
- int npts = PyArray_DIM(data, PyArray_NDIM(data) - 1);
- cfft_plan plan=NULL;
-
- int nrepeats = PyArray_SIZE(data)/npts;
- double *dptr = (double *)PyArray_DATA(data);
- int fail=0;
- Py_BEGIN_ALLOW_THREADS;
- NPY_SIGINT_ON;
- plan = make_cfft_plan(npts);
- if (!plan) fail=1;
- if (!fail)
- for (int i = 0; i < nrepeats; i++) {
- int res = is_forward ?
- cfft_forward(plan, dptr, fct) : cfft_backward(plan, dptr, fct);
- if (res!=0) { fail=1; break; }
- dptr += npts*2;
- }
- if (plan) destroy_cfft_plan(plan);
- NPY_SIGINT_OFF;
- Py_END_ALLOW_THREADS;
- if (fail) {
- Py_XDECREF(data);
- return PyErr_NoMemory();
- }
- return (PyObject *)data;
-}
-
-static PyObject *
-execute_real_forward(PyObject *a1, double fct)
-{
- rfft_plan plan=NULL;
- int fail = 0;
- PyArrayObject *data = (PyArrayObject *)PyArray_FromAny(a1,
- PyArray_DescrFromType(NPY_DOUBLE), 1, 0,
- NPY_ARRAY_DEFAULT | NPY_ARRAY_ENSUREARRAY | NPY_ARRAY_FORCECAST,
- NULL);
- if (!data) return NULL;
-
- int ndim = PyArray_NDIM(data);
- const npy_intp *odim = PyArray_DIMS(data);
- int npts = odim[ndim - 1];
- npy_intp *tdim=(npy_intp *)malloc(ndim*sizeof(npy_intp));
- if (!tdim)
- { Py_XDECREF(data); return PyErr_NoMemory(); }
- for (int d=0; d<ndim-1; ++d)
- tdim[d] = odim[d];
- tdim[ndim-1] = npts/2 + 1;
- PyArrayObject *ret = (PyArrayObject *)PyArray_Empty(ndim,
- tdim, PyArray_DescrFromType(NPY_CDOUBLE), 0);
- free(tdim);
- if (!ret) fail=1;
- if (!fail) {
- int rstep = PyArray_DIM(ret, PyArray_NDIM(ret) - 1)*2;
-
- int nrepeats = PyArray_SIZE(data)/npts;
- double *rptr = (double *)PyArray_DATA(ret),
- *dptr = (double *)PyArray_DATA(data);
-
- Py_BEGIN_ALLOW_THREADS;
- NPY_SIGINT_ON;
- plan = make_rfft_plan(npts);
- if (!plan) fail=1;
- if (!fail)
- for (int i = 0; i < nrepeats; i++) {
- rptr[rstep-1] = 0.0;
- memcpy((char *)(rptr+1), dptr, npts*sizeof(double));
- if (rfft_forward(plan, rptr+1, fct)!=0) {fail=1; break;}
- rptr[0] = rptr[1];
- rptr[1] = 0.0;
- rptr += rstep;
- dptr += npts;
- }
- if (plan) destroy_rfft_plan(plan);
- NPY_SIGINT_OFF;
- Py_END_ALLOW_THREADS;
- }
- if (fail) {
- Py_XDECREF(data);
- Py_XDECREF(ret);
- return PyErr_NoMemory();
- }
- Py_DECREF(data);
- return (PyObject *)ret;
-}
-static PyObject *
-execute_real_backward(PyObject *a1, double fct)
-{
- rfft_plan plan=NULL;
- PyArrayObject *data = (PyArrayObject *)PyArray_FromAny(a1,
- PyArray_DescrFromType(NPY_CDOUBLE), 1, 0,
- NPY_ARRAY_DEFAULT | NPY_ARRAY_ENSUREARRAY | NPY_ARRAY_FORCECAST,
- NULL);
- if (!data) return NULL;
- int npts = PyArray_DIM(data, PyArray_NDIM(data) - 1);
- PyArrayObject *ret = (PyArrayObject *)PyArray_Empty(PyArray_NDIM(data),
- PyArray_DIMS(data), PyArray_DescrFromType(NPY_DOUBLE), 0);
- int fail = 0;
- if (!ret) fail=1;
- if (!fail) {
- int nrepeats = PyArray_SIZE(ret)/npts;
- double *rptr = (double *)PyArray_DATA(ret),
- *dptr = (double *)PyArray_DATA(data);
-
- Py_BEGIN_ALLOW_THREADS;
- NPY_SIGINT_ON;
- plan = make_rfft_plan(npts);
- if (!plan) fail=1;
- if (!fail) {
- for (int i = 0; i < nrepeats; i++) {
- memcpy((char *)(rptr + 1), (dptr + 2), (npts - 1)*sizeof(double));
- rptr[0] = dptr[0];
- if (rfft_backward(plan, rptr, fct)!=0) {fail=1; break;}
- rptr += npts;
- dptr += npts*2;
- }
- }
- if (plan) destroy_rfft_plan(plan);
- NPY_SIGINT_OFF;
- Py_END_ALLOW_THREADS;
- }
- if (fail) {
- Py_XDECREF(data);
- Py_XDECREF(ret);
- return PyErr_NoMemory();
- }
- Py_DECREF(data);
- return (PyObject *)ret;
-}
-
-static PyObject *
-execute_real(PyObject *a1, int is_forward, double fct)
-{
- return is_forward ? execute_real_forward(a1, fct)
- : execute_real_backward(a1, fct);
-}
-
-static const char execute__doc__[] = "";
-
-static PyObject *
-execute(PyObject *NPY_UNUSED(self), PyObject *args)
-{
- PyObject *a1;
- int is_real, is_forward;
- double fct;
-
- if(!PyArg_ParseTuple(args, "Oiid:execute", &a1, &is_real, &is_forward, &fct)) {
- return NULL;
- }
-
- return is_real ? execute_real(a1, is_forward, fct)
- : execute_complex(a1, is_forward, fct);
-}
-
-/* List of methods defined in the module */
-
-static struct PyMethodDef methods[] = {
- {"execute", execute, 1, execute__doc__},
- {NULL, NULL, 0, NULL} /* sentinel */
-};
-
-#if PY_MAJOR_VERSION >= 3
-static struct PyModuleDef moduledef = {
- PyModuleDef_HEAD_INIT,
- "pocketfft_internal",
- NULL,
- -1,
- methods,
- NULL,
- NULL,
- NULL,
- NULL
-};
-#endif
-
-/* Initialization function for the module */
-#if PY_MAJOR_VERSION >= 3
-#define RETVAL(x) x
-PyMODINIT_FUNC PyInit_pocketfft_internal(void)
-#else
-#define RETVAL(x)
-PyMODINIT_FUNC
-initpocketfft_internal(void)
-#endif
-{
- PyObject *m;
-#if PY_MAJOR_VERSION >= 3
- m = PyModule_Create(&moduledef);
-#else
- static const char module_documentation[] = "";
-
- m = Py_InitModule4("pocketfft_internal", methods,
- module_documentation,
- (PyObject*)NULL,PYTHON_API_VERSION);
-#endif
- if (m == NULL) {
- return RETVAL(NULL);
- }
-
- /* Import the array object */
- import_array();
-
- /* XXXX Add constants here */
-
- return RETVAL(m);
-}
+++ /dev/null
-"""
-Discrete Fourier Transforms
-
-Routines in this module:
-
-fft(a, n=None, axis=-1)
-ifft(a, n=None, axis=-1)
-rfft(a, n=None, axis=-1)
-irfft(a, n=None, axis=-1)
-hfft(a, n=None, axis=-1)
-ihfft(a, n=None, axis=-1)
-fftn(a, s=None, axes=None)
-ifftn(a, s=None, axes=None)
-rfftn(a, s=None, axes=None)
-irfftn(a, s=None, axes=None)
-fft2(a, s=None, axes=(-2,-1))
-ifft2(a, s=None, axes=(-2, -1))
-rfft2(a, s=None, axes=(-2,-1))
-irfft2(a, s=None, axes=(-2, -1))
-
-i = inverse transform
-r = transform of purely real data
-h = Hermite transform
-n = n-dimensional transform
-2 = 2-dimensional transform
-(Note: 2D routines are just nD routines with different default
-behavior.)
-
-"""
-from __future__ import division, absolute_import, print_function
-
-__all__ = ['fft', 'ifft', 'rfft', 'irfft', 'hfft', 'ihfft', 'rfftn',
- 'irfftn', 'rfft2', 'irfft2', 'fft2', 'ifft2', 'fftn', 'ifftn']
-
-import functools
-
-from numpy.core import asarray, zeros, swapaxes, conjugate, take, sqrt
-from . import pocketfft_internal as pfi
-from numpy.core.multiarray import normalize_axis_index
-from numpy.core import overrides
-
-
-array_function_dispatch = functools.partial(
- overrides.array_function_dispatch, module='numpy.fft')
-
-
-# `inv_norm` is a float by which the result of the transform needs to be
-# divided. This replaces the original, more intuitive `fct` parameter to avoid
-# divisions by zero (or alternatively additional checks) in the case of
-# zero-length axes during its computation.
-def _raw_fft(a, n, axis, is_real, is_forward, inv_norm):
- axis = normalize_axis_index(axis, a.ndim)
- if n is None:
- n = a.shape[axis]
-
- if n < 1:
- raise ValueError("Invalid number of FFT data points (%d) specified."
- % n)
-
- fct = 1/inv_norm
-
- if a.shape[axis] != n:
- s = list(a.shape)
- if s[axis] > n:
- index = [slice(None)]*len(s)
- index[axis] = slice(0, n)
- a = a[tuple(index)]
- else:
- index = [slice(None)]*len(s)
- index[axis] = slice(0, s[axis])
- s[axis] = n
- z = zeros(s, a.dtype.char)
- z[tuple(index)] = a
- a = z
-
- if axis == a.ndim-1:
- r = pfi.execute(a, is_real, is_forward, fct)
- else:
- a = swapaxes(a, axis, -1)
- r = pfi.execute(a, is_real, is_forward, fct)
- r = swapaxes(r, axis, -1)
- return r
-
-
-def _unitary(norm):
- if norm is None:
- return False
- if norm=="ortho":
- return True
- raise ValueError("Invalid norm value %s, should be None or \"ortho\"."
- % norm)
-
-
-def _fft_dispatcher(a, n=None, axis=None, norm=None):
- return (a,)
-
-
-@array_function_dispatch(_fft_dispatcher)
-def fft(a, n=None, axis=-1, norm=None):
- """
- Compute the one-dimensional discrete Fourier Transform.
-
- This function computes the one-dimensional *n*-point discrete Fourier
- Transform (DFT) with the efficient Fast Fourier Transform (FFT)
- algorithm [CT].
-
- Parameters
- ----------
- a : array_like
- Input array, can be complex.
- n : int, optional
- Length of the transformed axis of the output.
- If `n` is smaller than the length of the input, the input is cropped.
- If it is larger, the input is padded with zeros. If `n` is not given,
- the length of the input along the axis specified by `axis` is used.
- axis : int, optional
- Axis over which to compute the FFT. If not given, the last axis is
- used.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axis
- indicated by `axis`, or the last one if `axis` is not specified.
-
- Raises
- ------
- IndexError
- If `axis` is larger than the last axis of `a`.
-
- See Also
- --------
- numpy.fft : for definition of the DFT and conventions used.
- ifft : The inverse of `fft`.
- fft2 : The two-dimensional FFT.
- fftn : The *n*-dimensional FFT.
- rfftn : The *n*-dimensional FFT of real input.
- fftfreq : Frequency bins for given FFT parameters.
-
- Notes
- -----
- FFT (Fast Fourier Transform) refers to a way the discrete Fourier
- Transform (DFT) can be calculated efficiently, by using symmetries in the
- calculated terms. The symmetry is highest when `n` is a power of 2, and
- the transform is therefore most efficient for these sizes.
-
- The DFT is defined, with the conventions used in this implementation, in
- the documentation for the `numpy.fft` module.
-
- References
- ----------
- .. [CT] Cooley, James W., and John W. Tukey, 1965, "An algorithm for the
- machine calculation of complex Fourier series," *Math. Comput.*
- 19: 297-301.
-
- Examples
- --------
- >>> np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
- array([-2.33486982e-16+1.14423775e-17j, 8.00000000e+00-1.25557246e-15j,
- 2.33486982e-16+2.33486982e-16j, 0.00000000e+00+1.22464680e-16j,
- -1.14423775e-17+2.33486982e-16j, 0.00000000e+00+5.20784380e-16j,
- 1.14423775e-17+1.14423775e-17j, 0.00000000e+00+1.22464680e-16j])
-
- In this example, real input has an FFT which is Hermitian, i.e., symmetric
- in the real part and anti-symmetric in the imaginary part, as described in
- the `numpy.fft` documentation:
-
- >>> import matplotlib.pyplot as plt
- >>> t = np.arange(256)
- >>> sp = np.fft.fft(np.sin(t))
- >>> freq = np.fft.fftfreq(t.shape[-1])
- >>> plt.plot(freq, sp.real, freq, sp.imag)
- [<matplotlib.lines.Line2D object at 0x...>, <matplotlib.lines.Line2D object at 0x...>]
- >>> plt.show()
-
- """
-
- a = asarray(a)
- if n is None:
- n = a.shape[axis]
- inv_norm = 1
- if norm is not None and _unitary(norm):
- inv_norm = sqrt(n)
- output = _raw_fft(a, n, axis, False, True, inv_norm)
- return output
-
-
-@array_function_dispatch(_fft_dispatcher)
-def ifft(a, n=None, axis=-1, norm=None):
- """
- Compute the one-dimensional inverse discrete Fourier Transform.
-
- This function computes the inverse of the one-dimensional *n*-point
- discrete Fourier transform computed by `fft`. In other words,
- ``ifft(fft(a)) == a`` to within numerical accuracy.
- For a general description of the algorithm and definitions,
- see `numpy.fft`.
-
- The input should be ordered in the same way as is returned by `fft`,
- i.e.,
-
- * ``a[0]`` should contain the zero frequency term,
- * ``a[1:n//2]`` should contain the positive-frequency terms,
- * ``a[n//2 + 1:]`` should contain the negative-frequency terms, in
- increasing order starting from the most negative frequency.
-
- For an even number of input points, ``a[n//2]`` represents the sum of
- the values at the positive and negative Nyquist frequencies, as the two
- are aliased together. See `numpy.fft` for details.
-
- Parameters
- ----------
- a : array_like
- Input array, can be complex.
- n : int, optional
- Length of the transformed axis of the output.
- If `n` is smaller than the length of the input, the input is cropped.
- If it is larger, the input is padded with zeros. If `n` is not given,
- the length of the input along the axis specified by `axis` is used.
- See notes about padding issues.
- axis : int, optional
- Axis over which to compute the inverse DFT. If not given, the last
- axis is used.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axis
- indicated by `axis`, or the last one if `axis` is not specified.
-
- Raises
- ------
- IndexError
- If `axis` is larger than the last axis of `a`.
-
- See Also
- --------
- numpy.fft : An introduction, with definitions and general explanations.
- fft : The one-dimensional (forward) FFT, of which `ifft` is the inverse.
- ifft2 : The two-dimensional inverse FFT.
- ifftn : The n-dimensional inverse FFT.
-
- Notes
- -----
- If the input parameter `n` is larger than the size of the input, the input
- is padded by appending zeros at the end. Even though this is the common
- approach, it might lead to surprising results. If a different padding is
- desired, it must be performed before calling `ifft`.
-
- Examples
- --------
- >>> np.fft.ifft([0, 4, 0, 0])
- array([ 1.+0.j, 0.+1.j, -1.+0.j, 0.-1.j]) # may vary
-
- Create and plot a band-limited signal with random phases:
-
- >>> import matplotlib.pyplot as plt
- >>> t = np.arange(400)
- >>> n = np.zeros((400,), dtype=complex)
- >>> n[40:60] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20,)))
- >>> s = np.fft.ifft(n)
- >>> plt.plot(t, s.real, 'b-', t, s.imag, 'r--')
- [<matplotlib.lines.Line2D object at ...>, <matplotlib.lines.Line2D object at ...>]
- >>> plt.legend(('real', 'imaginary'))
- <matplotlib.legend.Legend object at ...>
- >>> plt.show()
-
- """
- a = asarray(a)
- if n is None:
- n = a.shape[axis]
- if norm is not None and _unitary(norm):
- inv_norm = sqrt(max(n, 1))
- else:
- inv_norm = n
- output = _raw_fft(a, n, axis, False, False, inv_norm)
- return output
-
-
-
-@array_function_dispatch(_fft_dispatcher)
-def rfft(a, n=None, axis=-1, norm=None):
- """
- Compute the one-dimensional discrete Fourier Transform for real input.
-
- This function computes the one-dimensional *n*-point discrete Fourier
- Transform (DFT) of a real-valued array by means of an efficient algorithm
- called the Fast Fourier Transform (FFT).
-
- Parameters
- ----------
- a : array_like
- Input array
- n : int, optional
- Number of points along transformation axis in the input to use.
- If `n` is smaller than the length of the input, the input is cropped.
- If it is larger, the input is padded with zeros. If `n` is not given,
- the length of the input along the axis specified by `axis` is used.
- axis : int, optional
- Axis over which to compute the FFT. If not given, the last axis is
- used.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axis
- indicated by `axis`, or the last one if `axis` is not specified.
- If `n` is even, the length of the transformed axis is ``(n/2)+1``.
- If `n` is odd, the length is ``(n+1)/2``.
-
- Raises
- ------
- IndexError
- If `axis` is larger than the last axis of `a`.
-
- See Also
- --------
- numpy.fft : For definition of the DFT and conventions used.
- irfft : The inverse of `rfft`.
- fft : The one-dimensional FFT of general (complex) input.
- fftn : The *n*-dimensional FFT.
- rfftn : The *n*-dimensional FFT of real input.
-
- Notes
- -----
- When the DFT is computed for purely real input, the output is
- Hermitian-symmetric, i.e. the negative frequency terms are just the complex
- conjugates of the corresponding positive-frequency terms, and the
- negative-frequency terms are therefore redundant. This function does not
- compute the negative frequency terms, and the length of the transformed
- axis of the output is therefore ``n//2 + 1``.
-
- When ``A = rfft(a)`` and fs is the sampling frequency, ``A[0]`` contains
- the zero-frequency term 0*fs, which is real due to Hermitian symmetry.
-
- If `n` is even, ``A[-1]`` contains the term representing both positive
- and negative Nyquist frequency (+fs/2 and -fs/2), and must also be purely
- real. If `n` is odd, there is no term at fs/2; ``A[-1]`` contains
- the largest positive frequency (fs/2*(n-1)/n), and is complex in the
- general case.
-
- If the input `a` contains an imaginary part, it is silently discarded.
-
- Examples
- --------
- >>> np.fft.fft([0, 1, 0, 0])
- array([ 1.+0.j, 0.-1.j, -1.+0.j, 0.+1.j]) # may vary
- >>> np.fft.rfft([0, 1, 0, 0])
- array([ 1.+0.j, 0.-1.j, -1.+0.j]) # may vary
-
- Notice how the final element of the `fft` output is the complex conjugate
- of the second element, for real input. For `rfft`, this symmetry is
- exploited to compute only the non-negative frequency terms.
-
- """
- a = asarray(a)
- inv_norm = 1
- if norm is not None and _unitary(norm):
- if n is None:
- n = a.shape[axis]
- inv_norm = sqrt(n)
- output = _raw_fft(a, n, axis, True, True, inv_norm)
- return output
-
-
-@array_function_dispatch(_fft_dispatcher)
-def irfft(a, n=None, axis=-1, norm=None):
- """
- Compute the inverse of the n-point DFT for real input.
-
- This function computes the inverse of the one-dimensional *n*-point
- discrete Fourier Transform of real input computed by `rfft`.
- In other words, ``irfft(rfft(a), len(a)) == a`` to within numerical
- accuracy. (See Notes below for why ``len(a)`` is necessary here.)
-
- The input is expected to be in the form returned by `rfft`, i.e. the
- real zero-frequency term followed by the complex positive frequency terms
- in order of increasing frequency. Since the discrete Fourier Transform of
- real input is Hermitian-symmetric, the negative frequency terms are taken
- to be the complex conjugates of the corresponding positive frequency terms.
-
- Parameters
- ----------
- a : array_like
- The input array.
- n : int, optional
- Length of the transformed axis of the output.
- For `n` output points, ``n//2+1`` input points are necessary. If the
- input is longer than this, it is cropped. If it is shorter than this,
- it is padded with zeros. If `n` is not given, it is taken to be
- ``2*(m-1)`` where ``m`` is the length of the input along the axis
- specified by `axis`.
- axis : int, optional
- Axis over which to compute the inverse FFT. If not given, the last
- axis is used.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : ndarray
- The truncated or zero-padded input, transformed along the axis
- indicated by `axis`, or the last one if `axis` is not specified.
- The length of the transformed axis is `n`, or, if `n` is not given,
- ``2*(m-1)`` where ``m`` is the length of the transformed axis of the
- input. To get an odd number of output points, `n` must be specified.
-
- Raises
- ------
- IndexError
- If `axis` is larger than the last axis of `a`.
-
- See Also
- --------
- numpy.fft : For definition of the DFT and conventions used.
- rfft : The one-dimensional FFT of real input, of which `irfft` is inverse.
- fft : The one-dimensional FFT.
- irfft2 : The inverse of the two-dimensional FFT of real input.
- irfftn : The inverse of the *n*-dimensional FFT of real input.
-
- Notes
- -----
- Returns the real valued `n`-point inverse discrete Fourier transform
- of `a`, where `a` contains the non-negative frequency terms of a
- Hermitian-symmetric sequence. `n` is the length of the result, not the
- input.
-
- If you specify an `n` such that `a` must be zero-padded or truncated, the
- extra/removed values will be added/removed at high frequencies. One can
- thus resample a series to `m` points via Fourier interpolation by:
- ``a_resamp = irfft(rfft(a), m)``.
-
- The correct interpretation of the Hermitian input depends on the length of
- the original data, as given by `n`. This is because each input shape could
- correspond to either an odd or even length signal. By default, `irfft`
- assumes an even output length which puts the last entry at the Nyquist
- frequency, aliasing with its symmetric counterpart. By Hermitian symmetry,
- the value is thus treated as purely real. To avoid losing information, the
- correct length of the real input **must** be given.
-
- Examples
- --------
- >>> np.fft.ifft([1, -1j, -1, 1j])
- array([0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]) # may vary
- >>> np.fft.irfft([1, -1j, -1])
- array([0., 1., 0., 0.])
-
- Notice how the last term in the input to the ordinary `ifft` is the
- complex conjugate of the second term, and the output has zero imaginary
- part everywhere. When calling `irfft`, the negative frequencies are not
- specified, and the output array is purely real.
-
- """
- a = asarray(a)
- if n is None:
- n = (a.shape[axis] - 1) * 2
- inv_norm = n
- if norm is not None and _unitary(norm):
- inv_norm = sqrt(n)
- output = _raw_fft(a, n, axis, True, False, inv_norm)
- return output
-
-
-@array_function_dispatch(_fft_dispatcher)
-def hfft(a, n=None, axis=-1, norm=None):
- """
- Compute the FFT of a signal that has Hermitian symmetry, i.e., a real
- spectrum.
-
- Parameters
- ----------
- a : array_like
- The input array.
- n : int, optional
- Length of the transformed axis of the output. For `n` output
- points, ``n//2 + 1`` input points are necessary. If the input is
- longer than this, it is cropped. If it is shorter than this, it is
- padded with zeros. If `n` is not given, it is taken to be ``2*(m-1)``
- where ``m`` is the length of the input along the axis specified by
- `axis`.
- axis : int, optional
- Axis over which to compute the FFT. If not given, the last
- axis is used.
- norm : {None, "ortho"}, optional
- Normalization mode (see `numpy.fft`). Default is None.
-
- .. versionadded:: 1.10.0
-
- Returns
- -------
- out : ndarray
- The truncated or zero-padded input, transformed along the axis
- indicated by `axis`, or the last one if `axis` is not specified.
- The length of the transformed axis is `n`, or, if `n` is not given,
- ``2*m - 2`` where ``m`` is the length of the transformed axis of
- the input. To get an odd number of output points, `n` must be
- specified, for instance as ``2*m - 1`` in the typical case.
-
- Raises
- ------
- IndexError
- If `axis` is larger than the last axis of `a`.
-
- See Also
- --------
- rfft : Compute the one-dimensional FFT for real input.
- ihfft : The inverse of `hfft`.
-
- Notes
- -----
- `hfft`/`ihfft` are a pair analogous to `rfft`/`irfft`, but for the
- opposite case: here the signal has Hermitian symmetry in the time
- domain and is real in the frequency domain. So here it's `hfft` for
- which you must supply the length of the result if it is to be odd.
-
- * even: ``ihfft(hfft(a, 2*len(a) - 2)) == a``, within roundoff error,
- * odd: ``ihfft(hfft(a, 2*len(a) - 1)) == a``, within roundoff error.
-
- The correct interpretation of the Hermitian input depends on the length of
- the original data, as given by `n`. This is because each input shape could
- correspond to either an odd or even length signal. By default, `hfft`
- assumes an even output length which puts the last entry at the Nyquist
- frequency, aliasing with its symmetric counterpart. By Hermitian symmetry,
- the value is thus treated as purely real. To avoid losing information, the
- shape of the full signal **must** be given.
-
- Examples
- --------
- >>> signal = np.array([1, 2, 3, 4, 3, 2])
- >>> np.fft.fft(signal)
- array([15.+0.j, -4.+0.j, 0.+0.j, -1.-0.j, 0.+0.j, -4.+0.j]) # may vary
- >>> np.fft.hfft(signal[:4]) # Input first half of signal
- array([15., -4., 0., -1., 0., -4.])
- >>> np.fft.hfft(signal, 6) # Input entire signal and truncate
- array([15., -4., 0., -1., 0., -4.])
-
-
- >>> signal = np.array([[1, 1.j], [-1.j, 2]])
- >>> np.conj(signal.T) - signal # check Hermitian symmetry
- array([[ 0.-0.j, -0.+0.j], # may vary
- [ 0.+0.j, 0.-0.j]])
- >>> freq_spectrum = np.fft.hfft(signal)
- >>> freq_spectrum
- array([[ 1., 1.],
- [ 2., -2.]])
-
- """
- a = asarray(a)
- if n is None:
- n = (a.shape[axis] - 1) * 2
- unitary = _unitary(norm)
- return irfft(conjugate(a), n, axis) * (sqrt(n) if unitary else n)
-
-
-@array_function_dispatch(_fft_dispatcher)
-def ihfft(a, n=None, axis=-1, norm=None):
- """
- Compute the inverse FFT of a signal that has Hermitian symmetry.
-
- Parameters
- ----------
- a : array_like
- Input array.
- n : int, optional
- Length of the inverse FFT, the number of points along
- transformation axis in the input to use. If `n` is smaller than
- the length of the input, the input is cropped. If it is larger,
- the input is padded with zeros. If `n` is not given, the length of
- the input along the axis specified by `axis` is used.
- axis : int, optional
- Axis over which to compute the inverse FFT. If not given, the last
- axis is used.
- norm : {None, "ortho"}, optional
- Normalization mode (see `numpy.fft`). Default is None.
-
- .. versionadded:: 1.10.0
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axis
- indicated by `axis`, or the last one if `axis` is not specified.
- The length of the transformed axis is ``n//2 + 1``.
-
- See Also
- --------
- hfft, irfft
-
- Notes
- -----
- `hfft`/`ihfft` are a pair analogous to `rfft`/`irfft`, but for the
- opposite case: here the signal has Hermitian symmetry in the time
- domain and is real in the frequency domain. So here it's `hfft` for
- which you must supply the length of the result if it is to be odd:
-
- * even: ``ihfft(hfft(a, 2*len(a) - 2)) == a``, within roundoff error,
- * odd: ``ihfft(hfft(a, 2*len(a) - 1)) == a``, within roundoff error.
-
- Examples
- --------
- >>> spectrum = np.array([ 15, -4, 0, -1, 0, -4])
- >>> np.fft.ifft(spectrum)
- array([1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 3.+0.j, 2.+0.j]) # may vary
- >>> np.fft.ihfft(spectrum)
- array([ 1.-0.j, 2.-0.j, 3.-0.j, 4.-0.j]) # may vary
-
- """
- a = asarray(a)
- if n is None:
- n = a.shape[axis]
- unitary = _unitary(norm)
- output = conjugate(rfft(a, n, axis))
- return output * (1 / (sqrt(n) if unitary else n))
-
-
-def _cook_nd_args(a, s=None, axes=None, invreal=0):
- if s is None:
- shapeless = 1
- if axes is None:
- s = list(a.shape)
- else:
- s = take(a.shape, axes)
- else:
- shapeless = 0
- s = list(s)
- if axes is None:
- axes = list(range(-len(s), 0))
- if len(s) != len(axes):
- raise ValueError("Shape and axes have different lengths.")
- if invreal and shapeless:
- s[-1] = (a.shape[axes[-1]] - 1) * 2
- return s, axes
-
-
-def _raw_fftnd(a, s=None, axes=None, function=fft, norm=None):
- a = asarray(a)
- s, axes = _cook_nd_args(a, s, axes)
- itl = list(range(len(axes)))
- itl.reverse()
- for ii in itl:
- a = function(a, n=s[ii], axis=axes[ii], norm=norm)
- return a
-
-
-def _fftn_dispatcher(a, s=None, axes=None, norm=None):
- return (a,)
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def fftn(a, s=None, axes=None, norm=None):
- """
- Compute the N-dimensional discrete Fourier Transform.
-
- This function computes the *N*-dimensional discrete Fourier Transform over
- any number of axes in an *M*-dimensional array by means of the Fast Fourier
- Transform (FFT).
-
- Parameters
- ----------
- a : array_like
- Input array, can be complex.
- s : sequence of ints, optional
- Shape (length of each transformed axis) of the output
- (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
- This corresponds to ``n`` for ``fft(x, n)``.
- Along any axis, if the given shape is smaller than that of the input,
- the input is cropped. If it is larger, the input is padded with zeros.
- If `s` is not given, the shape of the input along the axes specified
- by `axes` is used.
- axes : sequence of ints, optional
- Axes over which to compute the FFT. If not given, the last ``len(s)``
- axes are used, or all axes if `s` is also not specified.
- Repeated indices in `axes` means that the transform over that axis is
- performed multiple times.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axes
- indicated by `axes`, or by a combination of `s` and `a`,
- as explained in the parameters section above.
-
- Raises
- ------
- ValueError
- If `s` and `axes` have different length.
- IndexError
- If an element of `axes` is larger than the number of axes of `a`.
-
- See Also
- --------
- numpy.fft : Overall view of discrete Fourier transforms, with definitions
- and conventions used.
- ifftn : The inverse of `fftn`, the inverse *n*-dimensional FFT.
- fft : The one-dimensional FFT, with definitions and conventions used.
- rfftn : The *n*-dimensional FFT of real input.
- fft2 : The two-dimensional FFT.
- fftshift : Shifts zero-frequency terms to centre of array
-
- Notes
- -----
- The output, analogously to `fft`, contains the term for zero frequency in
- the low-order corner of all axes, the positive frequency terms in the
- first half of all axes, the term for the Nyquist frequency in the middle
- of all axes and the negative frequency terms in the second half of all
- axes, in order of decreasingly negative frequency.
-
- See `numpy.fft` for details, definitions and conventions used.
-
- Examples
- --------
- >>> a = np.mgrid[:3, :3, :3][0]
- >>> np.fft.fftn(a, axes=(1, 2))
- array([[[ 0.+0.j, 0.+0.j, 0.+0.j], # may vary
- [ 0.+0.j, 0.+0.j, 0.+0.j],
- [ 0.+0.j, 0.+0.j, 0.+0.j]],
- [[ 9.+0.j, 0.+0.j, 0.+0.j],
- [ 0.+0.j, 0.+0.j, 0.+0.j],
- [ 0.+0.j, 0.+0.j, 0.+0.j]],
- [[18.+0.j, 0.+0.j, 0.+0.j],
- [ 0.+0.j, 0.+0.j, 0.+0.j],
- [ 0.+0.j, 0.+0.j, 0.+0.j]]])
- >>> np.fft.fftn(a, (2, 2), axes=(0, 1))
- array([[[ 2.+0.j, 2.+0.j, 2.+0.j], # may vary
- [ 0.+0.j, 0.+0.j, 0.+0.j]],
- [[-2.+0.j, -2.+0.j, -2.+0.j],
- [ 0.+0.j, 0.+0.j, 0.+0.j]]])
-
- >>> import matplotlib.pyplot as plt
- >>> [X, Y] = np.meshgrid(2 * np.pi * np.arange(200) / 12,
- ... 2 * np.pi * np.arange(200) / 34)
- >>> S = np.sin(X) + np.cos(Y) + np.random.uniform(0, 1, X.shape)
- >>> FS = np.fft.fftn(S)
- >>> plt.imshow(np.log(np.abs(np.fft.fftshift(FS))**2))
- <matplotlib.image.AxesImage object at 0x...>
- >>> plt.show()
-
- """
-
- return _raw_fftnd(a, s, axes, fft, norm)
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def ifftn(a, s=None, axes=None, norm=None):
- """
- Compute the N-dimensional inverse discrete Fourier Transform.
-
- This function computes the inverse of the N-dimensional discrete
- Fourier Transform over any number of axes in an M-dimensional array by
- means of the Fast Fourier Transform (FFT). In other words,
- ``ifftn(fftn(a)) == a`` to within numerical accuracy.
- For a description of the definitions and conventions used, see `numpy.fft`.
-
- The input, analogously to `ifft`, should be ordered in the same way as is
- returned by `fftn`, i.e. it should have the term for zero frequency
- in all axes in the low-order corner, the positive frequency terms in the
- first half of all axes, the term for the Nyquist frequency in the middle
- of all axes and the negative frequency terms in the second half of all
- axes, in order of decreasingly negative frequency.
-
- Parameters
- ----------
- a : array_like
- Input array, can be complex.
- s : sequence of ints, optional
- Shape (length of each transformed axis) of the output
- (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
- This corresponds to ``n`` for ``ifft(x, n)``.
- Along any axis, if the given shape is smaller than that of the input,
- the input is cropped. If it is larger, the input is padded with zeros.
- If `s` is not given, the shape of the input along the axes specified
- by `axes` is used. See notes for issue on `ifft` zero padding.
- axes : sequence of ints, optional
- Axes over which to compute the IFFT. If not given, the last ``len(s)``
- axes are used, or all axes if `s` is also not specified.
- Repeated indices in `axes` means that the inverse transform over that
- axis is performed multiple times.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axes
- indicated by `axes`, or by a combination of `s` and `a`,
- as explained in the parameters section above.
-
- Raises
- ------
- ValueError
- If `s` and `axes` have different length.
- IndexError
- If an element of `axes` is larger than the number of axes of `a`.
-
- See Also
- --------
- numpy.fft : Overall view of discrete Fourier transforms, with definitions
- and conventions used.
- fftn : The forward *n*-dimensional FFT, of which `ifftn` is the inverse.
- ifft : The one-dimensional inverse FFT.
- ifft2 : The two-dimensional inverse FFT.
- ifftshift : Undoes `fftshift`, shifts zero-frequency terms to beginning
- of array.
-
- Notes
- -----
- See `numpy.fft` for definitions and conventions used.
-
- Zero-padding, analogously with `ifft`, is performed by appending zeros to
- the input along the specified dimension. Although this is the common
- approach, it might lead to surprising results. If another form of zero
- padding is desired, it must be performed before `ifftn` is called.
-
- Examples
- --------
- >>> a = np.eye(4)
- >>> np.fft.ifftn(np.fft.fftn(a, axes=(0,)), axes=(1,))
- array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
- [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
- [0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
- [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]])
-
-
- Create and plot an image with band-limited frequency content:
-
- >>> import matplotlib.pyplot as plt
- >>> n = np.zeros((200,200), dtype=complex)
- >>> n[60:80, 20:40] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20, 20)))
- >>> im = np.fft.ifftn(n).real
- >>> plt.imshow(im)
- <matplotlib.image.AxesImage object at 0x...>
- >>> plt.show()
-
- """
-
- return _raw_fftnd(a, s, axes, ifft, norm)
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def fft2(a, s=None, axes=(-2, -1), norm=None):
- """
- Compute the 2-dimensional discrete Fourier Transform
-
- This function computes the *n*-dimensional discrete Fourier Transform
- over any axes in an *M*-dimensional array by means of the
- Fast Fourier Transform (FFT). By default, the transform is computed over
- the last two axes of the input array, i.e., a 2-dimensional FFT.
-
- Parameters
- ----------
- a : array_like
- Input array, can be complex
- s : sequence of ints, optional
- Shape (length of each transformed axis) of the output
- (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
- This corresponds to ``n`` for ``fft(x, n)``.
- Along each axis, if the given shape is smaller than that of the input,
- the input is cropped. If it is larger, the input is padded with zeros.
- If `s` is not given, the shape of the input along the axes specified
- by `axes` is used.
- axes : sequence of ints, optional
- Axes over which to compute the FFT. If not given, the last two
- axes are used. A repeated index in `axes` means the transform over
- that axis is performed multiple times. A one-element sequence means
- that a one-dimensional FFT is performed.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axes
- indicated by `axes`, or the last two axes if `axes` is not given.
-
- Raises
- ------
- ValueError
- If `s` and `axes` have different length, or `axes` not given and
- ``len(s) != 2``.
- IndexError
- If an element of `axes` is larger than the number of axes of `a`.
-
- See Also
- --------
- numpy.fft : Overall view of discrete Fourier transforms, with definitions
- and conventions used.
- ifft2 : The inverse two-dimensional FFT.
- fft : The one-dimensional FFT.
- fftn : The *n*-dimensional FFT.
- fftshift : Shifts zero-frequency terms to the center of the array.
- For two-dimensional input, swaps first and third quadrants, and second
- and fourth quadrants.
-
- Notes
- -----
- `fft2` is just `fftn` with a different default for `axes`.
-
- The output, analogously to `fft`, contains the term for zero frequency in
- the low-order corner of the transformed axes, the positive frequency terms
- in the first half of these axes, the term for the Nyquist frequency in the
- middle of the axes and the negative frequency terms in the second half of
- the axes, in order of decreasingly negative frequency.
-
- See `fftn` for details and a plotting example, and `numpy.fft` for
- definitions and conventions used.
-
-
- Examples
- --------
- >>> a = np.mgrid[:5, :5][0]
- >>> np.fft.fft2(a)
- array([[ 50. +0.j , 0. +0.j , 0. +0.j , # may vary
- 0. +0.j , 0. +0.j ],
- [-12.5+17.20477401j, 0. +0.j , 0. +0.j ,
- 0. +0.j , 0. +0.j ],
- [-12.5 +4.0614962j , 0. +0.j , 0. +0.j ,
- 0. +0.j , 0. +0.j ],
- [-12.5 -4.0614962j , 0. +0.j , 0. +0.j ,
- 0. +0.j , 0. +0.j ],
- [-12.5-17.20477401j, 0. +0.j , 0. +0.j ,
- 0. +0.j , 0. +0.j ]])
-
- """
-
- return _raw_fftnd(a, s, axes, fft, norm)
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def ifft2(a, s=None, axes=(-2, -1), norm=None):
- """
- Compute the 2-dimensional inverse discrete Fourier Transform.
-
- This function computes the inverse of the 2-dimensional discrete Fourier
- Transform over any number of axes in an M-dimensional array by means of
- the Fast Fourier Transform (FFT). In other words, ``ifft2(fft2(a)) == a``
- to within numerical accuracy. By default, the inverse transform is
- computed over the last two axes of the input array.
-
- The input, analogously to `ifft`, should be ordered in the same way as is
- returned by `fft2`, i.e. it should have the term for zero frequency
- in the low-order corner of the two axes, the positive frequency terms in
- the first half of these axes, the term for the Nyquist frequency in the
- middle of the axes and the negative frequency terms in the second half of
- both axes, in order of decreasingly negative frequency.
-
- Parameters
- ----------
- a : array_like
- Input array, can be complex.
- s : sequence of ints, optional
- Shape (length of each axis) of the output (``s[0]`` refers to axis 0,
- ``s[1]`` to axis 1, etc.). This corresponds to `n` for ``ifft(x, n)``.
- Along each axis, if the given shape is smaller than that of the input,
- the input is cropped. If it is larger, the input is padded with zeros.
- If `s` is not given, the shape of the input along the axes specified
- by `axes` is used. See notes for issue on `ifft` zero padding.
- axes : sequence of ints, optional
- Axes over which to compute the FFT. If not given, the last two
- axes are used. A repeated index in `axes` means the transform over
- that axis is performed multiple times. A one-element sequence means
- that a one-dimensional FFT is performed.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axes
- indicated by `axes`, or the last two axes if `axes` is not given.
-
- Raises
- ------
- ValueError
- If `s` and `axes` have different length, or `axes` not given and
- ``len(s) != 2``.
- IndexError
- If an element of `axes` is larger than the number of axes of `a`.
-
- See Also
- --------
- numpy.fft : Overall view of discrete Fourier transforms, with definitions
- and conventions used.
- fft2 : The forward 2-dimensional FFT, of which `ifft2` is the inverse.
- ifftn : The inverse of the *n*-dimensional FFT.
- fft : The one-dimensional FFT.
- ifft : The one-dimensional inverse FFT.
-
- Notes
- -----
- `ifft2` is just `ifftn` with a different default for `axes`.
-
- See `ifftn` for details and a plotting example, and `numpy.fft` for
- definition and conventions used.
-
- Zero-padding, analogously with `ifft`, is performed by appending zeros to
- the input along the specified dimension. Although this is the common
- approach, it might lead to surprising results. If another form of zero
- padding is desired, it must be performed before `ifft2` is called.
-
- Examples
- --------
- >>> a = 4 * np.eye(4)
- >>> np.fft.ifft2(a)
- array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
- [0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j],
- [0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
- [0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]])
-
- """
-
- return _raw_fftnd(a, s, axes, ifft, norm)
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def rfftn(a, s=None, axes=None, norm=None):
- """
- Compute the N-dimensional discrete Fourier Transform for real input.
-
- This function computes the N-dimensional discrete Fourier Transform over
- any number of axes in an M-dimensional real array by means of the Fast
- Fourier Transform (FFT). By default, all axes are transformed, with the
- real transform performed over the last axis, while the remaining
- transforms are complex.
-
- Parameters
- ----------
- a : array_like
- Input array, taken to be real.
- s : sequence of ints, optional
- Shape (length along each transformed axis) to use from the input.
- (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.).
- The final element of `s` corresponds to `n` for ``rfft(x, n)``, while
- for the remaining axes, it corresponds to `n` for ``fft(x, n)``.
- Along any axis, if the given shape is smaller than that of the input,
- the input is cropped. If it is larger, the input is padded with zeros.
- If `s` is not given, the shape of the input along the axes specified
- by `axes` is used.
- axes : sequence of ints, optional
- Axes over which to compute the FFT. If not given, the last ``len(s)``
- axes are used, or all axes if `s` is also not specified.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : complex ndarray
- The truncated or zero-padded input, transformed along the axes
- indicated by `axes`, or by a combination of `s` and `a`,
- as explained in the parameters section above.
- The length of the last axis transformed will be ``s[-1]//2+1``,
- while the remaining transformed axes will have lengths according to
- `s`, or unchanged from the input.
-
- Raises
- ------
- ValueError
- If `s` and `axes` have different length.
- IndexError
- If an element of `axes` is larger than the number of axes of `a`.
-
- See Also
- --------
- irfftn : The inverse of `rfftn`, i.e. the inverse of the n-dimensional FFT
- of real input.
- fft : The one-dimensional FFT, with definitions and conventions used.
- rfft : The one-dimensional FFT of real input.
- fftn : The n-dimensional FFT.
- rfft2 : The two-dimensional FFT of real input.
-
- Notes
- -----
- The transform for real input is performed over the last transformation
- axis, as by `rfft`, then the transform over the remaining axes is
- performed as by `fftn`. The order of the output is as for `rfft` for the
- final transformation axis, and as for `fftn` for the remaining
- transformation axes.
-
- See `fft` for details, definitions and conventions used.
-
- Examples
- --------
- >>> a = np.ones((2, 2, 2))
- >>> np.fft.rfftn(a)
- array([[[8.+0.j, 0.+0.j], # may vary
- [0.+0.j, 0.+0.j]],
- [[0.+0.j, 0.+0.j],
- [0.+0.j, 0.+0.j]]])
-
- >>> np.fft.rfftn(a, axes=(2, 0))
- array([[[4.+0.j, 0.+0.j], # may vary
- [4.+0.j, 0.+0.j]],
- [[0.+0.j, 0.+0.j],
- [0.+0.j, 0.+0.j]]])
-
- """
- a = asarray(a)
- s, axes = _cook_nd_args(a, s, axes)
- a = rfft(a, s[-1], axes[-1], norm)
- for ii in range(len(axes)-1):
- a = fft(a, s[ii], axes[ii], norm)
- return a
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def rfft2(a, s=None, axes=(-2, -1), norm=None):
- """
- Compute the 2-dimensional FFT of a real array.
-
- Parameters
- ----------
- a : array
- Input array, taken to be real.
- s : sequence of ints, optional
- Shape of the FFT.
- axes : sequence of ints, optional
- Axes over which to compute the FFT.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : ndarray
- The result of the real 2-D FFT.
-
- See Also
- --------
- rfftn : Compute the N-dimensional discrete Fourier Transform for real
- input.
-
- Notes
- -----
- This is really just `rfftn` with different default behavior.
- For more details see `rfftn`.
-
- """
-
- return rfftn(a, s, axes, norm)
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def irfftn(a, s=None, axes=None, norm=None):
- """
- Compute the inverse of the N-dimensional FFT of real input.
-
- This function computes the inverse of the N-dimensional discrete
- Fourier Transform for real input over any number of axes in an
- M-dimensional array by means of the Fast Fourier Transform (FFT). In
- other words, ``irfftn(rfftn(a), a.shape) == a`` to within numerical
- accuracy. (The ``a.shape`` is necessary like ``len(a)`` is for `irfft`,
- and for the same reason.)
-
- The input should be ordered in the same way as is returned by `rfftn`,
- i.e. as for `irfft` for the final transformation axis, and as for `ifftn`
- along all the other axes.
-
- Parameters
- ----------
- a : array_like
- Input array.
- s : sequence of ints, optional
- Shape (length of each transformed axis) of the output
- (``s[0]`` refers to axis 0, ``s[1]`` to axis 1, etc.). `s` is also the
- number of input points used along this axis, except for the last axis,
- where ``s[-1]//2+1`` points of the input are used.
- Along any axis, if the shape indicated by `s` is smaller than that of
- the input, the input is cropped. If it is larger, the input is padded
- with zeros. If `s` is not given, the shape of the input along the axes
- specified by `axes` is used, except for the last axis, which is taken to
- be ``2*(m-1)``, where ``m`` is the length of the input along that axis.
- axes : sequence of ints, optional
- Axes over which to compute the inverse FFT. If not given, the last
- `len(s)` axes are used, or all axes if `s` is also not specified.
- Repeated indices in `axes` means that the inverse transform over that
- axis is performed multiple times.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : ndarray
- The truncated or zero-padded input, transformed along the axes
- indicated by `axes`, or by a combination of `s` and `a`,
- as explained in the parameters section above.
- The length of each transformed axis is as given by the corresponding
- element of `s`, or the length of the input in every axis except for the
- last one if `s` is not given. In the final transformed axis the length
- of the output when `s` is not given is ``2*(m-1)`` where ``m`` is the
- length of the final transformed axis of the input. To get an odd
- number of output points in the final axis, `s` must be specified.
-
- Raises
- ------
- ValueError
- If `s` and `axes` have different length.
- IndexError
- If an element of `axes` is larger than the number of axes of `a`.
-
- See Also
- --------
- rfftn : The forward n-dimensional FFT of real input,
- of which `irfftn` is the inverse.
- fft : The one-dimensional FFT, with definitions and conventions used.
- irfft : The inverse of the one-dimensional FFT of real input.
- irfft2 : The inverse of the two-dimensional FFT of real input.
-
- Notes
- -----
- See `fft` for definitions and conventions used.
-
- See `rfft` for definitions and conventions used for real input.
-
- The correct interpretation of the hermitian input depends on the shape of
- the original data, as given by `s`. This is because each input shape could
- correspond to either an odd or even length signal. By default, `irfftn`
- assumes an even output length which puts the last entry at the Nyquist
- frequency; aliasing with its symmetric counterpart. When performing the
- final complex to real transform, the last value is thus treated as purely
- real. To avoid losing information, the correct shape of the real input
- **must** be given.
-
- Examples
- --------
- >>> a = np.zeros((3, 2, 2))
- >>> a[0, 0, 0] = 3 * 2 * 2
- >>> np.fft.irfftn(a)
- array([[[1., 1.],
- [1., 1.]],
- [[1., 1.],
- [1., 1.]],
- [[1., 1.],
- [1., 1.]]])
-
- """
- a = asarray(a)
- s, axes = _cook_nd_args(a, s, axes, invreal=1)
- for ii in range(len(axes)-1):
- a = ifft(a, s[ii], axes[ii], norm)
- a = irfft(a, s[-1], axes[-1], norm)
- return a
-
-
-@array_function_dispatch(_fftn_dispatcher)
-def irfft2(a, s=None, axes=(-2, -1), norm=None):
- """
- Compute the 2-dimensional inverse FFT of a real array.
-
- Parameters
- ----------
- a : array_like
- The input array
- s : sequence of ints, optional
- Shape of the real output to the inverse FFT.
- axes : sequence of ints, optional
- The axes over which to compute the inverse fft.
- Default is the last two axes.
- norm : {None, "ortho"}, optional
- .. versionadded:: 1.10.0
-
- Normalization mode (see `numpy.fft`). Default is None.
-
- Returns
- -------
- out : ndarray
- The result of the inverse real 2-D FFT.
-
- See Also
- --------
- irfftn : Compute the inverse of the N-dimensional FFT of real input.
-
- Notes
- -----
- This is really `irfftn` with different defaults.
- For more details see `irfftn`.
-
- """
-
- return irfftn(a, s, axes, norm)
config.add_data_dir('tests')
# Configure pocketfft_internal
- config.add_extension('pocketfft_internal',
- sources=['pocketfft.c']
+ config.add_extension('_pocketfft_internal',
+ sources=['_pocketfft.c']
)
return config
from . import _pickle
from . import common
from . import bounded_integers
-from . import entropy
from .mtrand import *
from .generator import Generator, default_rng
+++ /dev/null
-cimport numpy as np
-import numpy as np
-
-from libc.stdint cimport uint32_t, uint64_t
-
-__all__ = ['random_entropy', 'seed_by_array']
-
-np.import_array()
-
-cdef extern from "src/splitmix64/splitmix64.h":
- cdef uint64_t splitmix64_next(uint64_t *state) nogil
-
-cdef extern from "src/entropy/entropy.h":
- cdef bint entropy_getbytes(void* dest, size_t size)
- cdef bint entropy_fallback_getbytes(void *dest, size_t size)
-
-cdef Py_ssize_t compute_numel(size):
- cdef Py_ssize_t i, n = 1
- if isinstance(size, tuple):
- for i in range(len(size)):
- n *= size[i]
- else:
- n = size
- return n
-
-
-def seed_by_array(object seed, Py_ssize_t n):
- """
- Transforms a seed array into an initial state
-
- Parameters
- ----------
- seed: ndarray, 1d, uint64
- Array to use. If seed is a scalar, promote to array.
- n : int
- Number of 64-bit unsigned integers required
-
- Notes
- -----
- Uses splitmix64 to perform the transformation
- """
- cdef uint64_t seed_copy = 0
- cdef uint64_t[::1] seed_array
- cdef uint64_t[::1] initial_state
- cdef Py_ssize_t seed_size, iter_bound
- cdef int i, loc = 0
-
- if hasattr(seed, 'squeeze'):
- seed = seed.squeeze()
- arr = np.asarray(seed)
- if arr.shape == ():
- err_msg = 'Scalar seeds must be integers between 0 and 2**64 - 1'
- if not np.isreal(arr):
- raise TypeError(err_msg)
- int_seed = int(seed)
- if int_seed != seed:
- raise TypeError(err_msg)
- if int_seed < 0 or int_seed > 2**64 - 1:
- raise ValueError(err_msg)
- seed_array = np.array([int_seed], dtype=np.uint64)
- elif issubclass(arr.dtype.type, np.inexact):
- raise TypeError('seed array must be integers')
- else:
- err_msg = "Seed values must be integers between 0 and 2**64 - 1"
- obj = np.asarray(seed).astype(np.object)
- if obj.ndim != 1:
- raise ValueError('Array-valued seeds must be 1-dimensional')
- if not np.isreal(obj).all():
- raise TypeError(err_msg)
- if ((obj > int(2**64 - 1)) | (obj < 0)).any():
- raise ValueError(err_msg)
- try:
- obj_int = obj.astype(np.uint64, casting='unsafe')
- except ValueError:
- raise ValueError(err_msg)
- if not (obj == obj_int).all():
- raise TypeError(err_msg)
- seed_array = obj_int
-
- seed_size = seed_array.shape[0]
- iter_bound = n if n > seed_size else seed_size
-
- initial_state = <np.ndarray>np.empty(n, dtype=np.uint64)
- for i in range(iter_bound):
- if i < seed_size:
- seed_copy ^= seed_array[i]
- initial_state[loc] = splitmix64_next(&seed_copy)
- loc += 1
- if loc == n:
- loc = 0
-
- return np.array(initial_state)
-
-
-def random_entropy(size=None, source='system'):
- """
- random_entropy(size=None, source='system')
-
- Read entropy from the system cryptographic provider
-
- Parameters
- ----------
- size : int or tuple of ints, optional
- Output shape. If the given shape is, e.g., ``(m, n, k)``, then
- ``m * n * k`` samples are drawn. Default is None, in which case a
- single value is returned.
- source : str {'system', 'fallback'}
- Source of entropy. 'system' uses system cryptographic pool.
- 'fallback' uses a hash of the time and process id.
-
- Returns
- -------
- entropy : scalar or array
- Entropy bits in 32-bit unsigned integers. A scalar is returned if size
- is `None`.
-
- Notes
- -----
- On Unix-like machines, reads from ``/dev/urandom``. On Windows machines
- reads from the RSA algorithm provided by the cryptographic service
- provider.
-
- This function reads from the system entropy pool and so samples are
- not reproducible. In particular, it does *NOT* make use of a
- BitGenerator, and so ``seed`` and setting ``state`` have no
- effect.
-
- Raises RuntimeError if the command fails.
- """
- cdef bint success = True
- cdef Py_ssize_t n = 0
- cdef uint32_t random = 0
- cdef uint32_t [:] randoms
-
- if source not in ('system', 'fallback'):
- raise ValueError('Unknown value in source.')
-
- if size is None:
- if source == 'system':
- success = entropy_getbytes(<void *>&random, 4)
- else:
- success = entropy_fallback_getbytes(<void *>&random, 4)
- else:
- n = compute_numel(size)
- randoms = np.zeros(n, dtype=np.uint32)
- if source == 'system':
- success = entropy_getbytes(<void *>(&randoms[0]), 4 * n)
- else:
- success = entropy_fallback_getbytes(<void *>(&randoms[0]), 4 * n)
- if not success:
- raise RuntimeError('Unable to read from system cryptographic provider')
-
- if n == 0:
- return random
- return np.asarray(randoms).reshape(size)
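For readers who want to follow what the removed ``seed_by_array`` loop did, the following is a minimal pure-Python sketch of the same mixing scheme. It is not the removed module itself. The names ``splitmix64_next`` (here returning a tuple) and ``expand_seed`` are hypothetical, and the splitmix64 constants are taken from the public reference implementation on the assumption that ``src/splitmix64/splitmix64.h``, which is not shown in this diff, matches it.

```python
MASK64 = (1 << 64) - 1  # emulate uint64 wrap-around arithmetic


def splitmix64_next(state):
    """Advance a splitmix64 state and return (new_state, output).

    Constants follow the public reference implementation (assumption:
    the header removed in this diff uses the same ones).
    """
    state = (state + 0x9E3779B97F4A7C15) & MASK64
    z = state
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return state, z ^ (z >> 31)


def expand_seed(seed_words, n):
    """Mix a sequence of uint64 seed words into n uint64 state words.

    Mirrors the loop in the removed seed_by_array: each seed word is
    XORed into the running state before the next splitmix64 step, and
    the write position wraps around when the seed is longer than n.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    for word in seed_words:
        if not 0 <= word <= MASK64:
            raise ValueError("Seed values must be integers between 0 and 2**64 - 1")

    state = 0
    out = [0] * n
    loc = 0
    for i in range(max(n, len(seed_words))):
        if i < len(seed_words):
            state ^= seed_words[i]
        state, out[loc] = splitmix64_next(state)
        loc = (loc + 1) % n
    return out


initial_state = expand_seed([1, 2, 3], 4)
```

The sketch leaves out the scalar promotion and dtype checks that the Cython version performed with NumPy; it only illustrates the mixing step.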
double nonc) nogil
double legacy_wald(aug_bitgen_t *aug_state, double mean, double scale) nogil
double legacy_lognormal(aug_bitgen_t *aug_state, double mean, double sigma) nogil
+ int64_t legacy_random_binomial(bitgen_t *bitgen_state, double p,
+ int64_t n, binomial_t *binomial) nogil
int64_t legacy_negative_binomial(aug_bitgen_t *aug_state, double n, double p) nogil
int64_t legacy_random_hypergeometric(bitgen_t *bitgen_state, int64_t good, int64_t bad, int64_t sample) nogil
int64_t legacy_random_logseries(bitgen_t *bitgen_state, double p) nogil
from .common cimport *
from .bit_generator cimport BitGenerator, SeedSequence
-from .entropy import random_entropy
__all__ = ['MT19937']
Random seed initializing the pseudo-random number generator.
Can be an integer in [0, 2**32-1], array of integers in
[0, 2**32-1], a `SeedSequence`, or ``None``. If `seed`
- is ``None``, then sample entropy for a seed.
+ is ``None``, then fresh, unpredictable entropy will be pulled from
+ the OS.
Raises
------
with self.lock:
try:
if seed is None:
- val = random_entropy(RK_STATE_LEN)
+ seed = SeedSequence()
+ val = seed.generate_state(RK_STATE_LEN)
# MSB is 1; assuring non-zero initial array
self.rng_state.key[0] = 0x80000000UL
for i in range(1, RK_STATE_LEN):
See Also
--------
Generator
- mt19937.MT19937
- Bit_Generators
+ MT19937
+ :ref:`bit_generator`
"""
cdef public object _bit_generator
for i in range(cnt):
_dp = (<double*>np.PyArray_MultiIter_DATA(it, 1))[0]
_in = (<long*>np.PyArray_MultiIter_DATA(it, 2))[0]
- (<long*>np.PyArray_MultiIter_DATA(it, 0))[0] = random_binomial(&self._bitgen, _dp, _in, &self._binomial)
+ (<long*>np.PyArray_MultiIter_DATA(it, 0))[0] = \
+ legacy_random_binomial(&self._bitgen, _dp, _in,
+ &self._binomial)
np.PyArray_MultiIter_NEXT(it)
if size is None:
with self.lock:
- return random_binomial(&self._bitgen, _dp, _in, &self._binomial)
+ return <long>legacy_random_binomial(&self._bitgen, _dp, _in,
+ &self._binomial)
randoms = <np.ndarray>np.empty(size, int)
cnt = np.PyArray_SIZE(randoms)
with self.lock, nogil:
for i in range(cnt):
- randoms_data[i] = random_binomial(&self._bitgen, _dp, _in,
- &self._binomial)
+ randoms_data[i] = legacy_random_binomial(&self._bitgen, _dp, _in,
+ &self._binomial)
return randoms
# Convert to int64, if necessary, to use int64 infrastructure
ongood = ongood.astype(np.int64)
onbad = onbad.astype(np.int64)
- onbad = onbad.astype(np.int64)
+ onsample = onsample.astype(np.int64)
out = discrete_broadcast_iii(&legacy_random_hypergeometric,&self._bitgen, size, self.lock,
ongood, 'ngood', CONS_NON_NEGATIVE,
onbad, 'nbad', CONS_NON_NEGATIVE,
# One can force emulated 128-bit arithmetic if one wants.
#PCG64_DEFS += [('PCG_FORCE_EMULATED_128BIT_MATH', '1')]
- config.add_extension('entropy',
- sources=['entropy.c', 'src/entropy/entropy.c'] +
- [generate_libraries],
- libraries=EXTRA_LIBRARIES,
- extra_compile_args=EXTRA_COMPILE_ARGS,
- extra_link_args=EXTRA_LINK_ARGS,
- depends=[join('src', 'splitmix64', 'splitmix.h'),
- join('src', 'entropy', 'entropy.h'),
- 'entropy.pyx',
- ],
- define_macros=defs,
- )
for gen in ['mt19937']:
# gen.pyx, src/gen/gen.c, src/gen/gen-jump.c
config.add_extension(gen,
return X;
}
-RAND_INT_TYPE random_binomial(bitgen_t *bitgen_state, double p, RAND_INT_TYPE n,
- binomial_t *binomial) {
+int64_t random_binomial(bitgen_t *bitgen_state, double p, int64_t n,
+ binomial_t *binomial) {
double q;
if ((n == 0LL) || (p == 0.0f))
uint64_t rng, uint64_t mask, bool use_masked) {
if (rng == 0) {
return off;
- } else if (rng < 0xFFFFFFFFUL) {
+ } else if (rng <= 0xFFFFFFFFUL) {
/* Call 32-bit generator if range in 32-bit. */
if (use_masked) {
return off + buffered_bounded_masked_uint32(bitgen_state, rng, mask, NULL,
for (i = 0; i < cnt; i++) {
out[i] = off;
}
- } else if (rng < 0xFFFFFFFFUL) {
+ } else if (rng <= 0xFFFFFFFFUL) {
uint32_t buf = 0;
int bcnt = 0;
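The effect of changing ``<`` to ``<=`` in the two hunks above can be illustrated with a small Python sketch. It assumes that ``rng`` holds the inclusive span of the requested interval, so that ``rng == 0xFFFFFFFF`` corresponds to a range of 2**32 values; the function names are hypothetical and only model which branch is taken, not the sampling itself.

```python
UINT32_MAX = 0xFFFFFFFF


def pick_branch_fixed(rng):
    """Branch selection with the '<=' comparison added in this diff."""
    if rng == 0:
        return "constant"   # only one possible value: return the offset
    if rng <= UINT32_MAX:
        return "uint32"     # span fits in 32 bits: use the 32-bit generator
    return "uint64"


def pick_branch_old(rng):
    """Branch selection with the '<' comparison removed by this diff."""
    if rng == 0:
        return "constant"
    if rng < UINT32_MAX:
        return "uint32"
    return "uint64"


# The two versions differ only at the boundary value.
for span in (0, 1, UINT32_MAX - 1, UINT32_MAX, UINT32_MAX + 1):
    print(hex(span), pick_branch_old(span), pick_branch_fixed(span))
```

With the old comparison, a span of exactly ``0xFFFFFFFF`` skipped the 32-bit branch even though it fits in a ``uint32_t``.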
typedef struct s_binomial_t {
int has_binomial; /* !=0: following parameters initialized for binomial */
double psave;
- int64_t nsave;
+ RAND_INT_TYPE nsave;
double r;
double q;
double fm;
- int64_t m;
+ RAND_INT_TYPE m;
double p1;
double xm;
double xl;
DECLDIR RAND_INT_TYPE random_poisson(bitgen_t *bitgen_state, double lam);
DECLDIR RAND_INT_TYPE random_negative_binomial(bitgen_t *bitgen_state, double n,
double p);
-DECLDIR RAND_INT_TYPE random_binomial(bitgen_t *bitgen_state, double p, RAND_INT_TYPE n,
- binomial_t *binomial);
+
+DECLDIR RAND_INT_TYPE random_binomial_btpe(bitgen_t *bitgen_state,
+ RAND_INT_TYPE n,
+ double p,
+ binomial_t *binomial);
+DECLDIR RAND_INT_TYPE random_binomial_inversion(bitgen_t *bitgen_state,
+ RAND_INT_TYPE n,
+ double p,
+ binomial_t *binomial);
+DECLDIR int64_t random_binomial(bitgen_t *bitgen_state, double p,
+ int64_t n, binomial_t *binomial);
+
DECLDIR RAND_INT_TYPE random_logseries(bitgen_t *bitgen_state, double p);
DECLDIR RAND_INT_TYPE random_geometric_search(bitgen_t *bitgen_state, double p);
DECLDIR RAND_INT_TYPE random_geometric_inversion(bitgen_t *bitgen_state, double p);
+++ /dev/null
-#include <stddef.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include "entropy.h"
-#ifdef _WIN32
-/* Windows */
-#include <sys/timeb.h>
-#include <time.h>
-#include <windows.h>
-
-#include <wincrypt.h>
-#else
-/* Unix */
-#include <sys/time.h>
-#include <time.h>
-#include <unistd.h>
-#include <fcntl.h>
-#endif
-
-bool entropy_getbytes(void *dest, size_t size) {
-#ifndef _WIN32
-
- int fd = open("/dev/urandom", O_RDONLY);
- if (fd < 0)
- return false;
- ssize_t sz = read(fd, dest, size);
- if ((sz < 0) || ((size_t)sz < size))
- return false;
- return close(fd) == 0;
-
-#else
-
- HCRYPTPROV hCryptProv;
- BOOL done;
-
- if (!CryptAcquireContext(&hCryptProv, NULL, NULL, PROV_RSA_FULL,
- CRYPT_VERIFYCONTEXT) ||
- !hCryptProv) {
- return true;
- }
- done = CryptGenRandom(hCryptProv, (DWORD)size, (unsigned char *)dest);
- CryptReleaseContext(hCryptProv, 0);
- if (!done) {
- return false;
- }
-
- return true;
-#endif
-}
-
-/* Thomas Wang 32/64 bits integer hash function */
-uint32_t entropy_hash_32(uint32_t key) {
- key += ~(key << 15);
- key ^= (key >> 10);
- key += (key << 3);
- key ^= (key >> 6);
- key += ~(key << 11);
- key ^= (key >> 16);
- return key;
-}
-
-uint64_t entropy_hash_64(uint64_t key) {
- key = (~key) + (key << 21); // key = (key << 21) - key - 1;
- key = key ^ (key >> 24);
- key = (key + (key << 3)) + (key << 8); // key * 265
- key = key ^ (key >> 14);
- key = (key + (key << 2)) + (key << 4); // key * 21
- key = key ^ (key >> 28);
- key = key + (key << 31);
- return key;
-}
-
-uint32_t entropy_randombytes(void) {
-
-#ifndef _WIN32
- struct timeval tv;
- gettimeofday(&tv, NULL);
- return entropy_hash_32(getpid()) ^ entropy_hash_32(tv.tv_sec) ^
- entropy_hash_32(tv.tv_usec) ^ entropy_hash_32(clock());
-#else
- uint32_t out = 0;
- int64_t counter;
- struct _timeb tv;
- _ftime_s(&tv);
- out = entropy_hash_32(GetCurrentProcessId()) ^
- entropy_hash_32((uint32_t)tv.time) ^ entropy_hash_32(tv.millitm) ^
- entropy_hash_32(clock());
- if (QueryPerformanceCounter((LARGE_INTEGER *)&counter) != 0)
- out ^= entropy_hash_32((uint32_t)(counter & 0xffffffff));
- return out;
-#endif
-}
-
-bool entropy_fallback_getbytes(void *dest, size_t size) {
- int hashes = (int)size;
- uint32_t *hash = malloc(hashes * sizeof(uint32_t));
- int i;
- for (i = 0; i < hashes; i++) {
- hash[i] = entropy_randombytes();
- }
- memcpy(dest, (void *)hash, size);
- free(hash);
- return true;
-}
-
-void entropy_fill(void *dest, size_t size) {
- bool success;
- success = entropy_getbytes(dest, size);
- if (!success) {
- entropy_fallback_getbytes(dest, size);
- }
-}
+++ /dev/null
-#ifndef _RANDOMDGEN__ENTROPY_H_
-#define _RANDOMDGEN__ENTROPY_H_
-
-#include <stddef.h>
-#include <stdbool.h>
-#include <stdint.h>
-
-extern void entropy_fill(void *dest, size_t size);
-
-extern bool entropy_getbytes(void *dest, size_t size);
-
-extern bool entropy_fallback_getbytes(void *dest, size_t size);
-
-#endif
}
+static RAND_INT_TYPE legacy_random_binomial_original(bitgen_t *bitgen_state,
+                                                     double p,
+                                                     RAND_INT_TYPE n,
+                                                     binomial_t *binomial) {
+  double q;
+
+  if (p <= 0.5) {
+    if (p * n <= 30.0) {
+      return random_binomial_inversion(bitgen_state, n, p, binomial);
+    } else {
+      return random_binomial_btpe(bitgen_state, n, p, binomial);
+    }
+  } else {
+    q = 1.0 - p;
+    if (q * n <= 30.0) {
+      return n - random_binomial_inversion(bitgen_state, n, q, binomial);
+    } else {
+      return n - random_binomial_btpe(bitgen_state, n, q, binomial);
+    }
+  }
+}
+
+
+int64_t legacy_random_binomial(bitgen_t *bitgen_state, double p,
+                               int64_t n, binomial_t *binomial) {
+  return (int64_t) legacy_random_binomial_original(bitgen_state, p,
+                                                   (RAND_INT_TYPE) n,
+                                                   binomial);
+}
+
+
static RAND_INT_TYPE random_hypergeometric_hyp(bitgen_t *bitgen_state,
RAND_INT_TYPE good,
RAND_INT_TYPE bad,
extern double legacy_normal(aug_bitgen_t *aug_state, double loc, double scale);
extern double legacy_standard_gamma(aug_bitgen_t *aug_state, double shape);
extern double legacy_exponential(aug_bitgen_t *aug_state, double scale);
+extern int64_t legacy_random_binomial(bitgen_t *bitgen_state, double p,
+ int64_t n, binomial_t *binomial);
extern int64_t legacy_negative_binomial(aug_bitgen_t *aug_state, double n,
double p);
extern int64_t legacy_random_hypergeometric(bitgen_t *bitgen_state,
assert c.dtype == np.dtype(int)
c = np.random.choice(10, replace=False, size=2)
assert c.dtype == np.dtype(int)
+
+    @pytest.mark.skipif(np.iinfo('l').max < 2**32,
+                        reason='Cannot test with 32-bit C long')
+    def test_randint_117(self):
+        # GH 14189
+        random.seed(0)
+        expected = np.array([2357136044, 2546248239, 3071714933, 3626093760,
+                             2588848963, 3684848379, 2340255427, 3638918503,
+                             1819583497, 2678185683], dtype='int64')
+        actual = random.randint(2**32, size=10)
+        assert_array_equal(actual, expected)
+
+    def test_p_zero_stream(self):
+        # Regression test for gh-14522. Ensure that future versions
+        # generate the same variates as version 1.16.
+        np.random.seed(12345)
+        assert_array_equal(random.binomial(1, [0, 0.25, 0.5, 0.75, 1]),
+                           [0, 0, 0, 1, 1])
+
+    def test_n_zero_stream(self):
+        # Regression test for gh-14522. Ensure that future versions
+        # generate the same variates as version 1.16.
+        np.random.seed(8675309)
+        expected = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+                             [3, 4, 2, 3, 3, 1, 5, 3, 1, 3]])
+        assert_array_equal(random.binomial([[0], [10]], 0.25, size=(2, 10)),
+                           expected)
import numpy as np
import pytest
from numpy.testing import assert_equal, assert_, assert_array_equal
-from numpy.random import (Generator, MT19937, PCG64, Philox, SFC64, entropy)
+from numpy.random import (Generator, MT19937, PCG64, Philox, SFC64)
@pytest.fixture(scope='module',
params=(np.bool, np.int8, np.int16, np.int32, np.int64,
np.random.default_rng(-1)
with pytest.raises(ValueError):
np.random.default_rng([12345, -1])
-
-
-class TestEntropy(object):
- def test_entropy(self):
- e1 = entropy.random_entropy()
- e2 = entropy.random_entropy()
- assert_((e1 != e2))
- e1 = entropy.random_entropy(10)
- e2 = entropy.random_entropy(10)
- assert_((e1 != e2).all())
- e1 = entropy.random_entropy(10, source='system')
- e2 = entropy.random_entropy(10, source='system')
- assert_((e1 != e2).all())
-
- def test_fallback(self):
- e1 = entropy.random_entropy(source='fallback')
- time.sleep(0.1)
- e2 = entropy.random_entropy(source='fallback')
- assert_((e1 != e2))
-
#-----------------------------------
# Path to the release notes
-RELEASE_NOTES = 'doc/release/1.17.2-notes.rst'
+RELEASE_NOTES = 'doc/release/1.17.3-notes.rst'
#-------------------------------------------------------
MAJOR = 1
MINOR = 17
-MICRO = 2
+MICRO = 3
ISRELEASED = True
VERSION = '%d.%d.%d' % (MAJOR, MINOR, MICRO)
pip install --upgrade pip setuptools
-pip install pytz cython pytest==5.0.1
+pip install pytz cython pytest==5.1.2
if [ -n "$USE_ASV" ]; then pip install asv; fi
popd