This file contains all available output stages for GEMMLowp on OpenCL. More...

Namespaces
	detail

	gles

	graph

	graph_utils

	io

	logging

	misc

	quantization

	strong_type

	support

	test

	traits

	tuners

	utility

	utils

	wrapper

Data Structures
class	AccessWindowAutoPadding
	Dummy access window. More...

class	AccessWindowHorizontal
	Implementation of a row access pattern. More...

class	AccessWindowRectangle
	Implementation of a rectangular access pattern. More...

class	AccessWindowStatic
	Implementation of a static rectangular access pattern. More...

class	AccessWindowTranspose
	Implementation of a XY-transpose access pattern. More...

class	AccessWindowVertical
	Implementation of a column access pattern. More...

class	ActivationLayerInfo
	Activation Layer Information class. More...

class	Allocator
	Default malloc allocator implementation. More...

class	Array
	Basic implementation of the IArray interface which allocates a static number of T values. More...

class	AssemblyKernelGlue
	Assembly kernel glue. More...

class	BlobLifetimeManager
	Concrete class that tracks the lifetime of registered tensors and calculates the system's memory requirements in terms of blobs. More...

class	BlobMemoryPool
	Blob memory pool. More...

struct	BorderSize
	Container for 2D border size. More...

class	CLAbsoluteDifference
	Basic function to run CLAbsoluteDifferenceKernel. More...

class	CLAbsoluteDifferenceKernel
	Interface for the absolute difference kernel. More...

class	CLAccumulate
	Basic function to run CLAccumulateKernel. More...

class	CLAccumulateKernel
	Interface for the accumulate kernel. More...

class	CLAccumulateSquared
	Basic function to run CLAccumulateSquaredKernel. More...

class	CLAccumulateSquaredKernel
	Interface for the accumulate squared kernel. More...

class	CLAccumulateWeighted
	Basic function to run CLAccumulateWeightedKernel. More...

class	CLAccumulateWeightedKernel
	Interface for the accumulate weighted kernel. More...

class	CLActivationLayer
	Basic function to run CLActivationLayerKernel. More...

class	CLActivationLayerKernel
	Interface for the activation layer kernel. More...

class	CLArithmeticAddition
	Basic function to run CLArithmeticAdditionKernel. More...

class	CLArithmeticAdditionKernel
	Interface for the arithmetic addition kernel. More...

class	CLArithmeticSubtraction
	Basic function to run CLArithmeticSubtractionKernel. More...

class	CLArithmeticSubtractionKernel
	Interface for the arithmetic subtraction kernel. More...

class	CLArray
	CLArray implementation. More...

class	CLBatchNormalizationLayer
	Basic function to run CLBatchNormalizationLayerKernel and simulate a batch normalization layer. More...

class	CLBatchNormalizationLayerKernel
	Interface for the BatchNormalization layer kernel. More...

class	CLBitwiseAnd
	Basic function to run CLBitwiseAndKernel. More...

class	CLBitwiseAndKernel
	Interface for the bitwise AND operation kernel. More...

class	CLBitwiseNot
	Basic function to run CLBitwiseNotKernel. More...

class	CLBitwiseNotKernel
	Interface for the bitwise NOT operation kernel. More...

class	CLBitwiseOr
	Basic function to run CLBitwiseOrKernel. More...

class	CLBitwiseOrKernel
	Interface for the bitwise OR operation kernel. More...

class	CLBitwiseXor
	Basic function to run CLBitwiseXorKernel. More...

class	CLBitwiseXorKernel
	Interface for the bitwise XOR operation kernel. More...

class	CLBox3x3
	Basic function to execute box filter 3x3. More...

class	CLBox3x3Kernel
	Interface for the box 3x3 filter kernel. More...

class	CLBufferAllocator
	Default OpenCL cl buffer allocator implementation. More...

class	CLBufferMemoryRegion
	OpenCL buffer memory region implementation. More...

class	CLBuildOptions
	Build options. More...

class	CLCannyEdge
	Basic function to execute canny edge on OpenCL. More...

class	CLChannelCombine
	Basic function to run CLChannelCombineKernel to perform channel combination. More...

class	CLChannelCombineKernel
	Interface for the channel combine kernel. More...

class	CLChannelExtract
	Basic function to run CLChannelExtractKernel to perform channel extraction. More...

class	CLChannelExtractKernel
	Interface for the channel extract kernel. More...

class	CLChannelShuffleLayer
	Basic function to run CLChannelShuffleLayerKernel. More...

class	CLChannelShuffleLayerKernel
	Interface for the channel shuffle kernel. More...

class	CLCoarseSVMMemoryRegion
	OpenCL coarse-grain SVM memory region implementation. More...

struct	CLCoefficientTable
	Structure for storing Spatial Gradient Matrix and the minimum eigenvalue for each keypoint. More...

class	CLCol2ImKernel
	Interface for the col2im reshaping kernel. More...

class	CLColorConvert
	Basic function to run CLColorConvertKernel. More...

class	CLColorConvertKernel
	Interface for the color convert kernel. More...

class	CLConvertFullyConnectedWeights
	Basic function to run CLConvertFullyConnectedWeightsKernel. More...

class	CLConvertFullyConnectedWeightsKernel
	Interface to convert the 2D Fully Connected weights from NCHW to NHWC or vice versa. More...

class	CLConvolution3x3
	Basic function to execute convolution of size 3x3. More...

class	CLConvolutionKernel
	Interface for the kernel to run an arbitrary size convolution on a tensor. More...

class	CLConvolutionLayer
	Basic function to compute the convolution layer. More...

class	CLConvolutionLayerReshapeWeights
	Function to reshape and transpose the weights. More...

class	CLConvolutionRectangle
	Basic function to execute non-square convolution. More...

class	CLConvolutionRectangleKernel
	Kernel for running a convolution on a rectangular matrix. More...

class	CLConvolutionSquare
	Basic function to execute square convolution. Currently it supports 5x5, 7x7, 9x9. More...

class	CLCopy

class	CLCopyKernel
	OpenCL kernel to perform a copy between two tensors. More...

class	CLCopyToArrayKernel
	CL kernel to copy keypoint information to ICLKeyPointArray and count the number of keypoints. More...

class	CLDeconvolutionLayer
	Function to run the deconvolution layer. More...

class	CLDeconvolutionLayerUpsample
	Basic function to run CLDeconvolutionLayerUpsampleKernel. More...

class	CLDeconvolutionLayerUpsampleKernel
	Interface for the Deconvolution layer kernel on OpenCL. More...

class	CLDepthConcatenateLayer
	Basic function to concatenate tensors along the z axis. More...

class	CLDepthConcatenateLayerKernel
	Interface for the depth concatenate kernel. More...

class	CLDepthConvertLayer
	Basic function to run CLDepthConvertLayerKernel. More...

class	CLDepthConvertLayerKernel
	Interface for the depth conversion kernel. More...

class	CLDepthwiseConvolutionLayer
	Basic function to execute a generic depthwise convolution. More...

class	CLDepthwiseConvolutionLayer3x3
	Basic function to execute a depthwise convolution for kernel size 3x3xC (when data layout NCHW) or Cx3x3 (when data layout NHWC). More...

class	CLDepthwiseConvolutionLayer3x3NCHWKernel
	Interface for the kernel to run a 3x3 depthwise convolution on a tensor when the data layout is NCHW. More...

class	CLDepthwiseConvolutionLayer3x3NHWCKernel
	Interface for the kernel to run a 3x3 depthwise convolution on a tensor when the data layout is NHWC. More...

class	CLDepthwiseIm2ColKernel
	Interface for the depthwise im2col reshape kernel. More...

class	CLDepthwiseSeparableConvolutionLayer
	Basic function to execute depthwise convolution. More...

class	CLDepthwiseVectorToTensorKernel
	Interface for the depthwise vector to tensor kernel. More...

class	CLDepthwiseWeightsReshapeKernel
	Interface for the depthwise weights reshape kernel. More...

class	CLDequantizationLayer
	Basic function to simulate a dequantization layer. More...

class	CLDequantizationLayerKernel
	Interface for the dequantization layer kernel. More...

class	CLDerivative
	Basic function to execute first order derivative operator. More...

class	CLDerivativeKernel
	Interface for the derivative kernel. More...

struct	CLDeviceOptions
	OpenCL device options. More...

class	CLDilate
	Basic function to execute dilate. More...

class	CLDilateKernel
	Interface for the dilate kernel. More...

class	CLDirectConvolutionLayer
	Basic function to execute the direct convolution function. More...

class	CLDirectConvolutionLayerKernel
	Interface for the direct convolution kernel. More...

class	CLDirectConvolutionLayerOutputStageKernel
	OpenCL kernel to accumulate the biases, if provided, or downscale in case of quantized input. More...

class	CLDistribution1D
	CLDistribution1D object class. More...

class	CLEdgeNonMaxSuppressionKernel
	OpenCL kernel to perform Non-Maxima suppression for Canny Edge. More...

class	CLEdgeTraceKernel
	OpenCL kernel to perform Edge tracing. More...

class	CLEqualizeHistogram
	Basic function to execute histogram equalization. More...

class	CLErode
	Basic function to execute erode. More...

class	CLErodeKernel
	Interface for the erode kernel. More...

class	CLFastCorners
	Basic function to execute fast corners. More...

class	CLFastCornersKernel
	CL kernel to perform fast corners. More...

class	CLFillBorder
	Basic function to run CLFillBorderKernel. More...

class	CLFillBorderKernel
	Interface for filling the border of a kernel. More...

class	CLFineSVMMemoryRegion
	OpenCL fine-grain SVM memory region implementation. More...

class	CLFlattenLayer
	Basic function to execute flatten. More...

class	CLFloor
	Basic function to run CLFloorKernel. More...

class	CLFloorKernel
	OpenCL kernel to perform a floor operation. More...

class	CLFullyConnectedLayer
	Basic function to compute a Fully Connected layer on OpenCL. More...

class	CLFullyConnectedLayerReshapeWeights
	Basic function to reshape the weights of Fully Connected layer with OpenCL. More...

class	CLGaussian3x3
	Basic function to execute gaussian filter 3x3. More...

class	CLGaussian3x3Kernel
	Interface for the Gaussian 3x3 filter kernel. More...

class	CLGaussian5x5
	Basic function to execute gaussian filter 5x5. More...

class	CLGaussian5x5HorKernel
	Interface for the kernel to run the horizontal pass of 5x5 Gaussian filter on a tensor. More...

class	CLGaussian5x5VertKernel
	Interface for the kernel to run the vertical pass of 5x5 Gaussian filter on a tensor. More...

class	CLGaussianPyramid
	Common interface for all Gaussian pyramid functions. More...

class	CLGaussianPyramidHalf
	Basic function to execute gaussian pyramid with HALF scale factor. More...

class	CLGaussianPyramidHorKernel
	OpenCL kernel to perform a Gaussian filter and half scaling across width (horizontal pass). More...

class	CLGaussianPyramidOrb
	Basic function to execute gaussian pyramid with ORB scale factor. More...

class	CLGaussianPyramidVertKernel
	OpenCL kernel to perform a Gaussian filter and half scaling across height (vertical pass). More...

class	CLGEMM
	Basic function to execute GEMM on OpenCL. More...

class	CLGEMMConvolutionLayer
	Basic function to compute the convolution layer. More...

class	CLGEMMInterleave4x4
	Basic function to execute CLGEMMInterleave4x4Kernel. More...

class	CLGEMMInterleave4x4Kernel
	OpenCL kernel which interleaves the elements of a matrix A in chunks of 4x4. More...
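
One way to picture a 4x4 interleave is shown below. This is a minimal CPU sketch, assuming that each group of four input rows is merged into one output row by taking one element from each of the four rows in turn. The function name `interleave_4x4`, the use of `std::vector`, and the requirement that the row count is a multiple of 4 are assumptions made for illustration and are not taken from the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference routine; not part of the library API.
// Assumes a rectangular input whose number of rows is a multiple of 4.
// Every block of 4 input rows becomes one output row of length 4 * cols:
//   a00 a10 a20 a30 a01 a11 a21 a31 ...
template <typename T>
std::vector<std::vector<T>> interleave_4x4(const std::vector<std::vector<T>> &in)
{
    const std::size_t rows = in.size();
    const std::size_t cols = rows ? in[0].size() : 0;

    std::vector<std::vector<T>> out(rows / 4, std::vector<T>(cols * 4));
    for(std::size_t r = 0; r < (rows / 4) * 4; ++r)
    {
        for(std::size_t x = 0; x < cols; ++x)
        {
            // Column x of the block maps to 4 consecutive output elements,
            // one per row of the block.
            out[r / 4][x * 4 + r % 4] = in[r][x];
        }
    }
    return out;
}
```

With this layout, the four values of one input column sit next to each other in memory, which suits a matrix multiplication that consumes A four rows at a time.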

class	CLGEMMLowpMatrixAReductionKernel
	OpenCL kernel used to compute the row-vectors of sums of all the entries in each row of Matrix A. More...

class	CLGEMMLowpMatrixBReductionKernel
	OpenCL kernel used to compute the row-vectors of sums of all the entries in each column of Matrix B. More...

class	CLGEMMLowpMatrixMultiplyCore
	Basic function to execute GEMMLowpMatrixMultiplyCore on OpenCL. More...

class	CLGEMMLowpMatrixMultiplyKernel
	OpenCL kernel to multiply matrices. More...

class	CLGEMMLowpOffsetContributionKernel
	OpenCL kernel used to add the offset contribution after CLGEMMLowpMatrixMultiplyKernel. More...
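
The reduction kernels and the offset contribution kernel listed above fit together through a standard identity of quantized matrix multiplication: if an offset is added to every entry of A and of B, the product can be recovered from the raw product plus terms that depend only on the row sums of A and the column sums of B. The sketch below checks that identity on the CPU. All names (`MatU8`, `row_sums_a`, `col_sums_b`, `multiply_raw`, `add_offset_contribution`) and the sign convention for the offsets are assumptions for illustration, not the library's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for this sketch; not part of the library API.
using MatU8  = std::vector<std::vector<std::uint8_t>>;
using MatS32 = std::vector<std::vector<std::int32_t>>;

// Sum of all the entries in each row of A (one value per row).
std::vector<std::int32_t> row_sums_a(const MatU8 &a)
{
    std::vector<std::int32_t> sums;
    for(const auto &row : a)
    {
        std::int32_t s = 0;
        for(std::uint8_t v : row)
        {
            s += v;
        }
        sums.push_back(s);
    }
    return sums;
}

// Sum of all the entries in each column of B (one value per column).
std::vector<std::int32_t> col_sums_b(const MatU8 &b)
{
    std::vector<std::int32_t> sums(b.empty() ? 0 : b[0].size(), 0);
    for(const auto &row : b)
    {
        for(std::size_t j = 0; j < row.size(); ++j)
        {
            sums[j] += row[j];
        }
    }
    return sums;
}

// Product of the raw (offset-free) values, accumulated in int32.
// Assumes non-empty rectangular inputs with a[0].size() == b.size().
MatS32 multiply_raw(const MatU8 &a, const MatU8 &b)
{
    MatS32 c(a.size(), std::vector<std::int32_t>(b[0].size(), 0));
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        for(std::size_t k = 0; k < b.size(); ++k)
        {
            for(std::size_t j = 0; j < b[0].size(); ++j)
            {
                c[i][j] += static_cast<std::int32_t>(a[i][k]) * b[k][j];
            }
        }
    }
    return c;
}

// After this call: c[i][j] == sum_k (a[i][k] + a_offset) * (b[k][j] + b_offset),
// where k is the shared dimension (columns of A, rows of B).
void add_offset_contribution(MatS32 &c,
                             const std::vector<std::int32_t> &row_sums,
                             const std::vector<std::int32_t> &col_sums,
                             std::int32_t a_offset, std::int32_t b_offset, std::int32_t k)
{
    for(std::size_t i = 0; i < c.size(); ++i)
    {
        for(std::size_t j = 0; j < c[i].size(); ++j)
        {
            c[i][j] += b_offset * row_sums[i] + a_offset * col_sums[j] + k * a_offset * b_offset;
        }
    }
}
```

The benefit of this split is that the inner multiply loop works on raw 8-bit values only, while the offset terms cost one extra pass over the output.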

class	CLGEMMLowpQuantizeDownInt32ToUint8Scale
	Basic function to execute CLGEMMLowpQuantizeDownInt32ToUint8Scale on OpenCL. More...

class	CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
	Basic function to execute CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint on OpenCL. More...

class	CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel
	OpenCL kernel used to quantize down the int32 accumulator values of GEMMLowp to QASYMM8. More...

class	CLGEMMLowpQuantizeDownInt32ToUint8ScaleKernel
	OpenCL kernel used to quantize down the int32 accumulator values of GEMMLowp to QASYMM8. More...
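
The quantize-down kernels above reduce int32 accumulators to 8-bit values. One common integer-only formulation, sketched here as an assumption rather than as the library's exact arithmetic, adds an offset, multiplies by an integer, shifts right, and clamps the result to [0, 255]. The function name and the parameter names `result_offset`, `result_mult_int` and `result_shift` are illustrative, and rounding behaviour in a real kernel may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical scalar reference; not part of the library API.
// Assumes 0 <= result_shift < 63.
std::uint8_t quantize_down_scale(std::int32_t acc,
                                 std::int32_t result_offset,
                                 std::int32_t result_mult_int,
                                 std::int32_t result_shift)
{
    // Use 64-bit intermediates so the multiply cannot overflow.
    const std::int64_t scaled = (static_cast<std::int64_t>(acc) + result_offset) * result_mult_int;

    // Non-positive values clamp to 0; returning early also avoids
    // right-shifting a negative number.
    if(scaled <= 0)
    {
        return 0;
    }

    const std::int64_t shifted = scaled >> result_shift;
    return static_cast<std::uint8_t>(std::min<std::int64_t>(shifted, 255));
}
```

For example, an accumulator of 1000 with offset 24, multiplier 2 and shift 4 gives (1024 * 2) >> 4 = 128.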

class	CLGEMMMatrixAccumulateBiasesKernel
	Interface to add a bias to each row of the input tensor. More...

class	CLGEMMMatrixAdditionKernel
	OpenCL kernel to perform the in-place matrix addition between 2 matrices, taking into account that the second matrix might be weighted by a scalar value beta. More...

class	CLGEMMMatrixMultiplyKernel
	OpenCL kernel to multiply two input matrices "A" and "B". More...

class	CLGEMMMatrixVectorMultiplyKernel
	Interface for the GEMM matrix vector multiply kernel. More...

class	CLGEMMTranspose1xW
	Basic function to execute CLGEMMTranspose1xWKernel. More...

class	CLGEMMTranspose1xWKernel
	OpenCL kernel which transposes the elements of a matrix in chunks of 1xW, where W is equal to (16 / element size of the tensor). More...
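
The chunk width W follows directly from the rule stated above: 16 bytes divided by the element size, so 16 for 8-bit, 8 for 16-bit and 4 for 32-bit elements. The sketch below computes W and then applies one plausible chunk layout, in which the 1xW chunk at row r and chunk index c of the input lands in output row c starting at column r * W. That layout, the function names and the `std::vector` storage are assumptions for illustration only.

```cpp
#include <cstddef>
#include <vector>

// W as described above: 16 bytes divided by the element size in bytes.
constexpr std::size_t transpose_width(std::size_t element_size_bytes)
{
    return 16 / element_size_bytes;
}

// Hypothetical reference routine; not part of the library API.
// Assumes a rectangular input and an element type of at most 16 bytes.
// Output shape: ceil(cols / W) rows of rows * W elements, zero padded.
template <typename T>
std::vector<std::vector<T>> transpose_1xw(const std::vector<std::vector<T>> &in)
{
    const std::size_t w      = transpose_width(sizeof(T));
    const std::size_t rows   = in.size();
    const std::size_t cols   = rows ? in[0].size() : 0;
    const std::size_t chunks = (cols + w - 1) / w;

    std::vector<std::vector<T>> out(chunks, std::vector<T>(rows * w, T{}));
    for(std::size_t r = 0; r < rows; ++r)
    {
        for(std::size_t x = 0; x < cols; ++x)
        {
            // Chunk index x / w selects the output row; the input row r
            // selects which W-wide slot inside that output row is filled.
            out[x / w][r * w + x % w] = in[r][x];
        }
    }
    return out;
}
```

Under this assumption a 2x4 float matrix collapses to a single row of 8 values, while a 1x8 float matrix becomes two rows of 4.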

class	CLGradientKernel
	OpenCL kernel to perform Gradient computation. More...

class	CLHarrisCorners
	Basic function to execute harris corners detection. More...

class	CLHarrisScoreKernel
	Interface for the harris score kernel. More...

class	CLHistogram
	Basic function to execute histogram. More...

class	CLHistogramBorderKernel
	Interface to run the histogram kernel to handle the leftover part of the image. More...

class	CLHistogramKernel
	Interface to run the histogram kernel. More...

class	CLHOG
	OpenCL implementation of HOG data-object. More...

class	CLHOGBlockNormalizationKernel
	OpenCL kernel to perform HOG block normalization. More...

class	CLHOGDescriptor
	Basic function to calculate HOG descriptor. More...

class	CLHOGDetector
	Basic function to execute HOG detector based on linear SVM. More...

class	CLHOGDetectorKernel
	OpenCL kernel to perform HOG detector kernel using linear SVM. More...

class	CLHOGGradient
	Basic function to calculate the gradient for HOG. More...

class	CLHOGMultiDetection
	Basic function to detect multiple objects (or the same object at different scales) on the same input image using HOG. More...

class	CLHOGOrientationBinningKernel
	OpenCL kernel to perform HOG Orientation Binning. More...

class	CLIm2ColKernel
	Interface for the im2col reshape kernel. More...

class	CLIntegralImage
	Basic function to execute integral image. More...

class	CLIntegralImageHorKernel
	Interface to run the horizontal pass of the integral image kernel. More...

class	CLIntegralImageVertKernel
	Interface to run the vertical pass of the integral image kernel. More...
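
The split into a horizontal and a vertical pass reflects how an integral image can be built from two running sums. The sketch below is a CPU illustration under assumed names (`Image32`, `integral_hor_pass`, `integral_vert_pass`) and is not the kernels' actual code: the first pass accumulates along each row, the second accumulates the result along each column.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image type for this sketch; not part of the library API.
using Image32 = std::vector<std::vector<std::uint32_t>>;

// Horizontal pass: each pixel becomes the sum of itself and all pixels to its left.
void integral_hor_pass(Image32 &img)
{
    for(auto &row : img)
    {
        for(std::size_t x = 1; x < row.size(); ++x)
        {
            row[x] += row[x - 1];
        }
    }
}

// Vertical pass: each pixel becomes the sum of itself and all pixels above it.
// Applied after the horizontal pass, this yields the integral image.
void integral_vert_pass(Image32 &img)
{
    for(std::size_t y = 1; y < img.size(); ++y)
    {
        for(std::size_t x = 0; x < img[y].size(); ++x)
        {
            img[y][x] += img[y - 1][x];
        }
    }
}
```

After both passes, the bottom-right value equals the sum of every pixel in the input.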

class	CLKernelLibrary
	CLKernelLibrary class. More...

class	CLL2NormalizeLayer
	Basic function to perform an L2 normalization on a given axis. More...

class	CLL2NormalizeLayerKernel
	Interface for performing an L2 normalization on a given axis, given the sum of squares along that axis. More...

class	CLLaplacianPyramid
	Basic function to execute laplacian pyramid. More...

class	CLLaplacianReconstruct
	Basic function to execute laplacian reconstruction. More...

struct	CLLKInternalKeypoint
	Internal keypoint structure for Lucas-Kanade Optical Flow. More...

class	CLLKTrackerFinalizeKernel
	Interface to run the finalize step of LKTracker, where it truncates the coordinates stored in new_points array. More...

class	CLLKTrackerInitKernel
	Interface to run the initialization step of LKTracker. More...

class	CLLKTrackerStage0Kernel
	Interface to run the first stage of LKTracker, where A11, A12, A22, min_eig, ival, ixval and iyval are computed. More...

class	CLLKTrackerStage1Kernel
	Interface to run the second stage of LKTracker, where the motion vectors of the given points are computed. More...

class	CLLocallyConnectedLayer
	Basic function to compute the locally connected layer. More...

class	CLLocallyConnectedMatrixMultiplyKernel
	OpenCL kernel to multiply each row of the first tensor with the low 2 dimensions of the second tensor. More...

class	CLLogits1DMaxKernel
	Interface for identifying the max value of 1D Logits. More...

class	CLLogits1DMaxShiftExpSumKernel
	Interface for max, shifting, exponentiating and summing the logits. More...

class	CLLogits1DNormKernel
	Interface for calculating the final step of the Softmax Layer where each logit value is multiplied by the inverse of the sum of the logits. More...

class	CLLogits1DShiftExpSumKernel
	Interface for shifting, exponentiating and summing the logits. More...
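
Taken together, the logits kernels above describe the usual three steps of a numerically stable softmax: find the maximum, shift by it while exponentiating and summing, then multiply by the inverse of the sum. The sketch below mirrors those steps on a single 1D vector. The function names and the use of `std::vector<float>` are assumptions for illustration, not the library's interface.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Step 1: maximum of the logits. Assumes a non-empty vector.
float logits_max(const std::vector<float> &logits)
{
    return *std::max_element(logits.begin(), logits.end());
}

// Step 2: replace each logit by exp(logit - max) and return the sum of the results.
// Subtracting the maximum keeps every exponent at or below zero, which avoids overflow.
float logits_shift_exp_sum(std::vector<float> &logits, float max_value)
{
    float sum = 0.f;
    for(float &v : logits)
    {
        v = std::exp(v - max_value);
        sum += v;
    }
    return sum;
}

// Step 3: multiply each exponentiated value by the inverse of the sum.
void logits_norm(std::vector<float> &exps, float sum)
{
    const float inv_sum = 1.f / sum;
    for(float &v : exps)
    {
        v *= inv_sum;
    }
}
```

Calling the three functions in order on the same vector leaves it holding probabilities that sum to 1, up to floating-point rounding.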

class	CLLSTMLayer
	This function performs a single time step in a Long Short-Term Memory (LSTM) layer. More...

class	CLLut
	Basic implementation of the OpenCL lut interface. More...

class	CLLutAllocator
	Basic implementation of a CL memory LUT allocator. More...

class	CLMagnitude
	Basic function to run CLMagnitudePhaseKernel. More...

class	CLMagnitudePhaseKernel
	Template interface for the kernel to compute magnitude and phase. More...

class	CLMeanStdDev
	Basic function to execute mean and standard deviation by calling CLMeanStdDevKernel. More...

class	CLMeanStdDevKernel
	Interface for the kernel to calculate mean and standard deviation of input image pixels. More...

class	CLMedian3x3
	Basic function to execute median filter. More...

class	CLMedian3x3Kernel
	Interface for the median 3x3 filter kernel. More...

class	CLMemory
	OpenCL implementation of memory object. More...

class	CLMinMaxKernel
	Interface for the kernel to perform min max search on an image. More...

class	CLMinMaxLayerKernel
	Interface for the kernel to perform min max search on a 3D tensor. More...

class	CLMinMaxLocation
	Basic function to execute min and max location. More...

class	CLMinMaxLocationKernel
	Interface for the kernel to find min max locations of an image. More...

class	CLMultiHOG
	Basic implementation of the CL multi HOG data-objects. More...

class	CLMultiImage
	Basic implementation of the CL multi-planar image interface. More...

class	CLNonLinearFilter
	Basic function to execute non linear filter. More...

class	CLNonLinearFilterKernel
	Interface for the kernel to apply a non-linear filter. More...

class	CLNonMaximaSuppression3x3
	Basic function to execute non-maxima suppression over a 3x3 window. More...

class	CLNonMaximaSuppression3x3Kernel
	Interface to perform Non-Maxima suppression over a 3x3 window using OpenCL. More...

class	CLNormalizationLayer
	Basic function to compute a normalization layer. More...

class	CLNormalizationLayerKernel
	Interface for the normalization layer kernel. More...

struct	CLOldValue
	Structure for storing ival, ixval and iyval for each point inside the window. More...

class	CLOpticalFlow
	Basic function to execute optical flow. More...

class	CLPermute
	Basic function to execute a CLPermuteKernel. More...

class	CLPermuteKernel
	OpenCL kernel to perform tensor permutation. More...

class	CLPhase
	Basic function to execute a CLMagnitudePhaseKernel. More...

class	CLPixelWiseMultiplication
	Basic function to run CLPixelWiseMultiplicationKernel. More...

class	CLPixelWiseMultiplicationKernel
	Interface for the pixelwise multiplication kernel. More...

class	CLPoolingLayer
	Basic function to simulate a pooling layer with the specified pooling operation. More...

class	CLPoolingLayerKernel
	Interface for the pooling layer kernel. More...

class	CLPyramid
	Basic implementation of the OpenCL pyramid interface. More...

class	CLQuantizationLayer
	Basic function to simulate a quantization layer. More...

class	CLQuantizationLayerKernel
	Interface for the quantization layer kernel. More...

class	CLReductionOperation
	Perform reduction operation. More...

class	CLReductionOperationKernel
	Interface for the reduction operation kernel. More...

class	CLRemap
	Basic function to execute remap. More...

class	CLRemapKernel
	OpenCL kernel to perform a remap on a tensor. More...

class	CLReshapeLayer
	Basic function to run CLReshapeLayerKernel. More...

class	CLReshapeLayerKernel
	Interface for the kernel to perform tensor reshaping. More...

class	CLRNNLayer
	Basic function to run CLRNNLayer. More...

class	CLROIPoolingLayer
	Basic function to run CLROIPoolingLayerKernel. More...

class	CLROIPoolingLayerKernel
	Interface for the ROI pooling layer kernel. More...

class	CLScale
	Basic function to run CLScaleKernel. More...

class	CLScaleKernel
	Interface for the scale kernel. More...

class	CLScharr3x3
	Basic function to execute scharr 3x3 filter. More...

class	CLScharr3x3Kernel
	Interface for the kernel to run a 3x3 Scharr filter on a tensor. More...

class	CLScheduler
	Provides global access to a CL context and command queue. More...

class	CLSeparableConvolutionHorKernel
	Kernel for the Horizontal pass of a Separable Convolution. More...

class	CLSeparableConvolutionVertKernel
	Kernel for the Vertical pass of a Separable Convolution. More...

class	CLSobel3x3
	Basic function to execute sobel 3x3 filter. More...

class	CLSobel3x3Kernel
	Interface for the kernel to run a 3x3 Sobel filter on a tensor. More...

class	CLSobel5x5
	Basic function to execute sobel 5x5 filter. More...

class	CLSobel5x5HorKernel
	Interface for the kernel to run the horizontal pass of 5x5 Sobel filter on a tensor. More...

class	CLSobel5x5VertKernel
	Interface for the kernel to run the vertical pass of 5x5 Sobel filter on a tensor. More...

class	CLSobel7x7
	Basic function to execute sobel 7x7 filter. More...

class	CLSobel7x7HorKernel
	Interface for the kernel to run the horizontal pass of 7x7 Sobel filter on a tensor. More...

class	CLSobel7x7VertKernel
	Interface for the kernel to run the vertical pass of 7x7 Sobel filter on a tensor. More...

class	CLSoftmaxLayer
	Basic function to compute a SoftmaxLayer. More...

class	CLSubTensor
	Basic implementation of the OpenCL sub-tensor interface. More...

class	CLSymbols
	Class for loading OpenCL symbols. More...

class	CLTableLookup
	Basic function to run CLTableLookupKernel. More...

class	CLTableLookupKernel
	Interface for the kernel to perform table lookup calculations. More...

class	CLTensor
	Basic implementation of the OpenCL tensor interface. More...

class	CLTensorAllocator
	Basic implementation of a CL memory tensor allocator. More...

class	CLThreshold
	Basic function to run CLThresholdKernel. More...

class	CLThresholdKernel
	Interface for the thresholding kernel. More...

class	CLTranspose
	Basic function to transpose a matrix on OpenCL. More...

class	CLTransposeKernel
	OpenCL kernel which transposes the elements of a matrix. More...

class	CLTuner
	Basic implementation of the OpenCL tuner interface. More...

class	CLWarpAffine
	Basic function to run CLWarpAffineKernel for AFFINE transformation. More...

class	CLWarpAffineKernel
	Interface for the warp affine kernel. More...

class	CLWarpPerspective
	Basic function to run CLWarpPerspectiveKernel for PERSPECTIVE transformation. More...

class	CLWarpPerspectiveKernel
	Interface for the warp perspective kernel. More...

class	CLWeightsReshapeKernel
	OpenCL kernel to perform reshaping on the weights used by convolution and locally connected layer. More...

class	CLWidthConcatenateLayer
	Basic function to concatenate tensors along the x axis. More...

class	CLWidthConcatenateLayerKernel
	Interface for the width concatenate kernel. More...

class	CLWinogradConvolutionLayer
	Basic function to execute Winograd-based convolution on OpenCL. More...

class	CLWinogradFilterTransformKernel
	Interface for the Winograd filter transform kernel. More...

class	CLWinogradInputTransform
	Basic function to execute a CLWinogradInputTransformKernel. More...

class	CLWinogradInputTransformKernel
	OpenCL kernel to perform Winograd input transform. More...

class	CLWinogradOutputTransformKernel
	Interface for the Winograd output transform kernel. More...

class	Coordinates
	Coordinates of an item. More...

struct	Coordinates2D
	Coordinate type. More...

struct	Coordinates3D
	Coordinate type. More...

class	CPPCornerCandidatesKernel
	CPP kernel to perform corner candidates. More...

class	CPPDetectionWindowNonMaximaSuppressionKernel
	CPP kernel to perform in-place computation of euclidean distance on IDetectionWindowArray. More...

class	CPPPermute
	Basic function to run CPPPermuteKernel. More...

class	CPPPermuteKernel
	CPP kernel to perform tensor permutation. More...

class	CPPScheduler
	C++11 implementation of a pool of threads to automatically split a kernel's execution among several threads. More...

class	CPPSortEuclideanDistanceKernel
	CPP kernel to perform sorting and euclidean distance. More...

class	CPPUpsample
	Basic function to run CPPUpsample. More...

class	CPPUpsampleKernel
	CPP kernel to perform tensor upsample. More...

class	CPUInfo

struct	DetectionWindow
	Detection window used for the object detection. More...

class	Dimensions
	Dimensions with dimensionality. More...

class	Distribution1D
	Basic implementation of the 1D distribution interface. More...

struct	enable_bitwise_ops
	Disable bitwise operations by default. More...

struct	enable_bitwise_ops< arm_compute::GPUTarget >
	Enable bitwise operations on GPUTarget enumerations. More...

class	GCAbsoluteDifference
	Basic function to run GCAbsoluteDifferenceKernel. More...

class	GCAbsoluteDifferenceKernel
	Interface for the absolute difference kernel. More...

class	GCActivationLayer
	Basic function to run GCActivationLayerKernel. More...

class	GCActivationLayerKernel
	Interface for the activation layer kernel. More...

class	GCArithmeticAddition
	Basic function to run GCArithmeticAdditionKernel. More...

class	GCArithmeticAdditionKernel
	Interface for the arithmetic addition kernel. More...

class	GCBatchNormalizationLayer
	Basic function to run GCBatchNormalizationLayerKernel and simulate a batch normalization layer. More...

class	GCBatchNormalizationLayerKernel
	Interface for the BatchNormalization layer kernel. More...

class	GCBufferAllocator
	Default GLES buffer allocator implementation. More...

class	GCCol2ImKernel
	Interface for the col2im reshaping kernel. More...

class	GCConvolutionLayer
	Basic function to compute the convolution layer. More...

class	GCConvolutionLayerReshapeWeights
	Function to reshape and transpose the weights. More...

class	GCDepthConcatenateLayer
	Basic function to concatenate tensors along the z axis. More...

class	GCDepthConcatenateLayerKernel
	Interface for the depth concatenate kernel. More...

class	GCDepthwiseConvolutionLayer3x3
	Basic function to execute a depthwise convolution for kernel size 3x3xC. More...

class	GCDepthwiseConvolutionLayer3x3Kernel
	Interface for the kernel to run a 3x3 depthwise convolution on a tensor. More...

class	GCDirectConvolutionLayer
	Basic function to execute direct convolution function. More...

class	GCDirectConvolutionLayerKernel
	Interface for the direct convolution kernel. More...

class	GCDropoutLayer
	Basic function to perform the dropout operation. More...

class	GCDropoutLayerKernel
	Interface for the dropout layer kernel. More...

class	GCFillBorder
	Basic function to run GCFillBorderKernel. More...

class	GCFillBorderKernel
	Interface for filling the border of a kernel. More...

class	GCFullyConnectedLayer
	Basic function to compute a Fully Connected layer on OpenGL ES. More...

class	GCFullyConnectedLayerReshapeWeights
	Basic function to reshape the weights of Fully Connected layer with OpenGL ES. More...

class	GCGEMM
	Basic function to execute GEMM on OpenGLES Compute. More...

class	GCGEMMInterleave4x4
	Basic function to execute GCGEMMInterleave4x4Kernel. More...

class	GCGEMMInterleave4x4Kernel
	OpenGL ES kernel which interleaves the elements of a matrix A in chunks of 4x4. More...

class	GCGEMMMatrixAccumulateBiasesKernel
	Interface to add a bias to each row of the input tensor. More...

class	GCGEMMMatrixAdditionKernel
	OpenGL ES kernel to perform the in-place matrix addition between 2 matrices, taking into account that the second matrix might be weighted by a scalar value beta. More...

class	GCGEMMMatrixMultiplyKernel
	GLES Compute kernel to multiply two input matrices "A" and "B" or to multiply a vector "A" by a matrix "B". More...

class	GCGEMMTranspose1xW
	Basic function to execute GCGEMMTranspose1xWKernel. More...

class	GCGEMMTranspose1xWKernel
	OpenGL ES kernel which transposes the elements of a matrix in chunks of 1xW, where W is equal to (16 / element size of the tensor). More...

class	GCIm2ColKernel
	Interface for the im2col reshape kernel. More...

class	GCKernel
	GCKernel class. More...

class	GCKernelLibrary
	GCKernelLibrary class. More...

class	GCLogits1DMaxKernel
	Interface for identifying the max value of 1D Logits. More...

class	GCLogits1DNormKernel
	Interface for calculating the final step of the Softmax Layer where each logit value is multiplied by the inverse of the sum of the logits. More...

class	GCLogits1DShiftExpSumKernel
	Interface for shifting the logits values around the max value and exponentiating the result. More...

class	GCNormalizationLayer
	Basic function to compute a normalization layer. More...

class	GCNormalizationLayerKernel
	Interface for the normalization layer kernel. More...

class	GCNormalizePlanarYUVLayer
	Basic function to run GCNormalizePlanarYUVLayerKernel. More...

class	GCNormalizePlanarYUVLayerKernel
	Interface for the NormalizePlanarYUV layer kernel. More...

class	GCPixelWiseMultiplication
	Basic function to run GCPixelWiseMultiplicationKernel. More...

class	GCPixelWiseMultiplicationKernel
	Interface for the pixelwise multiplication kernel. More...

class	GCPoolingLayer
	Basic function to simulate a pooling layer with the specified pooling operation. More...

class	GCPoolingLayerKernel
	Interface for the pooling layer kernel. More...

class	GCProgram
	GCProgram class. More...

class	GCScale
	Basic function to run GCScaleKernel. More...

class	GCScaleKernel
	Interface for the scale kernel. More...

class	GCScheduler
	Provides global access to an OpenGL ES context and command queue. More...

class	GCSoftmaxLayer
	Basic function to compute a SoftmaxLayer. More...

class	GCTensor
	Interface for OpenGL ES tensor. More...

class	GCTensorAllocator
	Basic implementation of a GLES memory tensor allocator. More...

class	GCTensorShift
	Basic function to execute shift function for tensor. More...

class	GCTensorShiftKernel
	Interface for the kernel to shift valid data on a tensor. More...

class	GCTranspose
	Basic function to transpose a matrix on OpenGL ES. More...

class	GCTransposeKernel
	OpenGL ES kernel which transposes the elements of a matrix. More...

class	GCWeightsReshapeKernel
	GLES Compute kernel to perform reshaping on the weights used by convolution and locally connected layer. More...

class	GEMMInfo
	GEMM information class. More...

class	GEMMReshapeInfo
	GEMM reshape information class. More...

class	GLBufferWrapper

class	HOG
	CPU implementation of HOG data-object. More...

class	HOGInfo
	Store the HOG's metadata. More...

class	IAccessWindow
	Interface describing methods to update access window and padding based on kernel parameters. More...

class	IAllocator
	Allocator interface. More...

class	IArray
	Array of type T. More...

class	ICLArray
	Interface for OpenCL Array. More...

class	ICLDepthwiseConvolutionLayer3x3Kernel
	Interface for the kernel to run a 3x3 depthwise convolution on a tensor. More...

class	ICLDistribution1D
	ICLDistribution1D interface class. More...

class	ICLGEMMLowpReductionKernel
	Common interface for all OpenCL reduction kernels. More...

class	ICLHOG
	Interface for OpenCL HOG data-object. More...

class	ICLKernel
	Common interface for all the OpenCL kernels. More...

class	ICLLut
	Interface for OpenCL LUT. More...

class	ICLMemoryRegion
	OpenCL memory region interface. More...

class	ICLMultiHOG
	Interface for storing multiple HOG data-objects. More...

class	ICLMultiImage
	Interface for OpenCL multi-planar images. More...

class	ICLSimple2DKernel
	Interface for simple OpenCL kernels having 1 tensor input and 1 tensor output. More...

class	ICLSimple3DKernel
	Interface for simple OpenCL kernels having 1 tensor input and 1 tensor output. More...

class	ICLSimpleFunction
	Basic interface for functions which have a single OpenCL kernel. More...

class	ICLSimpleKernel
	Interface for simple OpenCL kernels having 1 tensor input and 1 tensor output. More...

class	ICLSVMMemoryRegion
	OpenCL SVM memory region interface. More...

class	ICLTensor
	Interface for OpenCL tensor. More...

class	ICLTuner
	Basic interface for tuning the OpenCL kernels. More...

class	ICPPKernel
	Common interface for all kernels implemented in C++. More...

class	ICPPSimpleFunction
	Basic interface for functions which have a single CPP kernel. More...

class	ICPPSimpleKernel
	Interface for simple C++ kernels having 1 tensor input and 1 tensor output. More...

class	IDistribution
	Interface for distribution objects. More...

class	IDistribution1D
	1D Distribution interface. More...

class	IFunction
	Base class for all functions. More...

class	IGCKernel
	Common interface for all the GLES kernels. More...

class	IGCSimple2DKernel
	Interface for simple OpenGL ES kernels having 1 tensor input and 1 tensor output. More...

class	IGCSimple3DKernel
	Interface for simple GLES kernels having 1 tensor input and 1 tensor output. More...

class	IGCSimpleFunction
	Basic interface for functions which have a single OpenGL ES kernel. More...

class	IGCSimpleKernel
	Interface for simple OpenGL ES kernels having 1 tensor input and 1 tensor output. More...

class	IGCTensor
	Interface for GLES Compute tensor. More...

class	IHOG
	Interface for HOG data-object. More...

class	IKernel
	Common information for all the kernels. More...

class	ILifetimeManager
	Interface for managing the lifetime of objects. More...

class	ILut
	Lookup Table object interface. More...

class	ILutAllocator
	Basic interface to allocate LUTs. More...

class	IMemoryGroup
	Memory group interface. More...

class	IMemoryManager
	Memory manager interface to handle allocations of backing memory. More...

class	IMemoryPool
	Memory Pool Interface. More...

class	IMemoryRegion
	Memory region interface. More...

class	IMultiHOG
	Interface for storing multiple HOG data-objects. More...

class	IMultiImage
	Interface for multi-planar images. More...

class	INEGEMMLowpReductionKernel
	Common interface for all NEON reduction kernels. More...

class	INEHarrisScoreKernel
	Common interface for all Harris Score kernels. More...

class	INESimpleFunction
	Basic interface for functions which have a single NEON kernel. More...

class	INEWarpKernel
	Common interface for warp affine and warp perspective. More...

class	INEWinogradLayerBatchedGEMMKernel
	Interface for the NEON kernel to perform Winograd. More...

class	INEWinogradLayerTransformInputKernel
	Interface for the NEON kernel to perform Winograd input transform. More...

class	INEWinogradLayerTransformOutputKernel
	Interface for the NEON kernel to perform Winograd output transform. More...

class	INEWinogradLayerTransformWeightsKernel
	Interface for the NEON kernel to perform Winograd weights transform. More...

struct	InternalKeyPoint
	Internal keypoint class for Lucas-Kanade Optical Flow. More...

struct	IOFormatInfo
	IO formatting information class. More...

class	IPoolManager
	Memory pool manager interface. More...

class	IPyramid
	Interface for pyramid data-object. More...

class	IScheduler
	Scheduler interface to run kernels. More...

class	ISimpleLifetimeManager
	Abstract class of the simple lifetime manager interface. More...

class	ITensor
	Interface for NEON tensor. More...

class	ITensorAllocator
	Interface to allocate tensors. More...

class	ITensorInfo
	Store the tensor's metadata. More...

class	Iterator
	Iterator updated by execute_window_loop for each window element. More...

class	Kernel
	Kernel class. More...

struct	KeyPoint
	Keypoint type. More...

class	LSTMParams

class	Lut
	Basic implementation of the LUT interface. More...

class	LutAllocator
	Basic implementation of a CPU memory LUT allocator. More...

class	Memory
	CPU implementation of memory object. More...

class	MemoryGroupBase
	Memory group. More...

class	MemoryManagerOnDemand
	On-demand memory manager. More...

class	MemoryRegion
	Memory region CPU implementation. More...

struct	MinMaxLocationValues
	Min and max values and locations. More...

class	MultiHOG
	CPU implementation of multi HOG data-object. More...

class	MultiImage
	Basic implementation of the multi-planar image interface. More...

class	MultiImageInfo
	Store the multi-planar image's metadata. More...

class	NEAbsoluteDifference
	Basic function to run NEAbsoluteDifferenceKernel. More...

class	NEAbsoluteDifferenceKernel
	Interface for the absolute difference kernel. More...

class	NEAccumulate
	Basic function to run NEAccumulateKernel. More...

class	NEAccumulateKernel
	Interface for the accumulate kernel. More...

class	NEAccumulateSquared
	Basic function to run NEAccumulateSquaredKernel. More...

class	NEAccumulateSquaredKernel
	Interface for the accumulate squared kernel. More...

class	NEAccumulateWeighted
	Basic function to run NEAccumulateWeightedKernel. More...

class	NEAccumulateWeightedKernel
	Interface for the accumulate weighted kernel. More...

class	NEActivationLayer
	Basic function to run NEActivationLayerKernel. More...

class	NEActivationLayerKernel
	Interface for the activation layer kernel. More...

class	NEArithmeticAddition
	Basic function to run NEArithmeticAdditionKernel. More...

class	NEArithmeticAdditionKernel
	Interface for the kernel to perform addition between two tensors. More...

class	NEArithmeticSubtraction
	Basic function to run NEArithmeticSubtractionKernel. More...

class	NEArithmeticSubtractionKernel
	Interface for the kernel to perform subtraction between two tensors. More...

class	NEBatchNormalizationLayer
	Basic function to run NEBatchNormalizationLayerKernel and simulate a batch normalization layer. More...

class	NEBatchNormalizationLayerKernel
	Interface for the batch normalization layer kernel. More...

class	NEBitwiseAnd
	Basic function to run NEBitwiseAndKernel. More...

class	NEBitwiseAndKernel
	Interface for the kernel to perform bitwise AND between XY-planes of two tensors. More...

class	NEBitwiseNot
	Basic function to run NEBitwiseNotKernel. More...

class	NEBitwiseNotKernel
	Interface for the kernel to perform bitwise NOT operation. More...

class	NEBitwiseOr
	Basic function to run NEBitwiseOrKernel. More...

class	NEBitwiseOrKernel
	Interface for the kernel to perform bitwise inclusive OR between two tensors. More...

class	NEBitwiseXor
	Basic function to run NEBitwiseXorKernel. More...

class	NEBitwiseXorKernel
	Interface for the kernel to perform bitwise exclusive OR (XOR) between two tensors. More...

class	NEBox3x3
	Basic function to execute box filter 3x3. More...

class	NEBox3x3Kernel
	NEON kernel to perform a Box 3x3 filter. More...

class	NECannyEdge
	Basic function to execute canny edge on NEON. More...

class	NEChannelCombine
	Basic function to run NEChannelCombineKernel to perform channel combination. More...

class	NEChannelCombineKernel
	Interface for the channel combine kernel. More...

class	NEChannelExtract
	Basic function to run NEChannelExtractKernel to perform channel extraction. More...

class	NEChannelExtractKernel
	Interface for the channel extract kernel. More...

class	NECol2Im
	Basic function to run NECol2ImKernel. More...

class	NECol2ImKernel
	NEON kernel to perform col2im reshaping. More...

class	NEColorConvert
	Basic function to run NEColorConvertKernel to perform color conversion. More...

class	NEColorConvertKernel
	Interface for the color convert kernel. More...

class	NEConvertFullyConnectedWeights
	Basic function to run NEConvertFullyConnectedWeightsKernel. More...

class	NEConvertFullyConnectedWeightsKernel
	Interface to convert the 2D Fully Connected weights from NCHW to NHWC or vice versa. More...

class	NEConvolution3x3
	Basic function to execute convolution of size 3x3. More...

class	NEConvolutionKernel
	Interface for the kernel to run an arbitrary size convolution on a tensor. More...

class	NEConvolutionLayer
	Basic function to simulate a convolution layer. More...

class	NEConvolutionLayerReshapeWeights
	Function to reshape and perform 1xW transposition on the weights. More...

class	NEConvolutionRectangle
	Basic function to execute non-square convolution. More...

class	NEConvolutionRectangleKernel
	Kernel for running convolution on a rectangular matrix. More...

class	NEConvolutionSquare
	Basic function to execute convolution of size 5x5, 7x7, 9x9. More...

class	NECumulativeDistributionKernel
	Interface for the cumulative distribution (cumulative summation) calculation kernel. More...

class	NEDeconvolutionLayer
	Function to run the deconvolution layer. More...

class	NEDepthConcatenateLayer
	Basic function to concatenate tensors along the z axis. More...

class	NEDepthConcatenateLayerKernel
	Interface for the depth concatenate kernel. More...

class	NEDepthConvertLayer
	Basic function to run NEDepthConvertLayerKernel. More...

class	NEDepthConvertLayerKernel
	Depth conversion kernel. More...

class	NEDepthwiseConvolutionLayer
	Basic function to execute a generic depthwise convolution. More...

class	NEDepthwiseConvolutionLayer3x3
	Basic function to execute a depthwise convolution for kernel size 3x3xC. More...

class	NEDepthwiseConvolutionLayer3x3Kernel
	Interface for the kernel to run a 3x3 depthwise convolution on a tensor. More...

class	NEDepthwiseIm2ColKernel
	Interface for the depthwise im2col reshape kernel. More...

class	NEDepthwiseSeparableConvolutionLayer
	Basic function to execute depthwise convolution. More...

class	NEDepthwiseVectorToTensorKernel
	Interface for the depthwise vector to tensor kernel. More...

class	NEDepthwiseWeightsReshapeKernel
	Interface for the depthwise weights reshape kernel. More...

class	NEDequantizationLayer
	Basic function to simulate a dequantization layer. More...

class	NEDequantizationLayerKernel
	Interface for the dequantization layer kernel. More...

class	NEDerivative
	Basic function to execute first order derivative operator. More...

class	NEDerivativeKernel
	Interface for the kernel to run the derivative along the X/Y directions on a tensor. More...

class	NEDilate
	Basic function to execute dilate. More...

class	NEDilateKernel
	Interface for the kernel to perform boolean image dilation. More...

class	NEDirectConvolutionLayer
	Function to run the direct convolution. More...

class	NEDirectConvolutionLayerKernel
	NEON interface for Direct Convolution Layer kernel. More...

class	NEDirectConvolutionLayerOutputStageKernel
	NEON kernel to accumulate the biases, if provided, or downscale in case of quantized input. More...

class	NEEdgeNonMaxSuppressionKernel
	NEON kernel to perform Non-Maxima suppression for Canny Edge. More...

class	NEEdgeTraceKernel
	NEON kernel to perform Edge tracing. More...

class	NEEqualizeHistogram
	Basic function to execute histogram equalization. More...

class	NEErode
	Basic function to execute erode. More...

class	NEErodeKernel
	Interface for the kernel to perform boolean image erosion. More...

class	NEFastCorners
	Basic function to execute fast corners. More...

class	NEFastCornersKernel
	NEON kernel to perform fast corners. More...

class	NEFillArrayKernel
	This kernel adds all texels greater than or equal to the threshold value to the keypoint array. More...

class	NEFillBorder
	Basic function to run NEFillBorderKernel. More...

class	NEFillBorderKernel
	Interface for the kernel to fill borders. More...

class	NEFillInnerBorderKernel
	Interface for the kernel to fill the interior borders. More...

class	NEFlattenLayer
	Basic function to execute flatten. More...

class	NEFloor
	Basic function to run NEFloorKernel. More...

class	NEFloorKernel
	NEON kernel to perform a floor operation. More...

class	NEFullyConnectedLayer
	Basic function to compute a Fully Connected layer on NEON. More...

class	NEFullyConnectedLayerReshapeWeights
	Basic function to reshape the weights of Fully Connected layer with NEON. More...

class	NEGaussian3x3
	Basic function to execute gaussian filter 3x3. More...

class	NEGaussian3x3Kernel
	NEON kernel to perform a Gaussian 3x3 filter. More...

class	NEGaussian5x5
	Basic function to execute gaussian filter 5x5. More...

class	NEGaussian5x5HorKernel
	NEON kernel to perform a Gaussian 5x5 filter (horizontal pass). More...

class	NEGaussian5x5VertKernel
	NEON kernel to perform a Gaussian 5x5 filter (vertical pass). More...

class	NEGaussianPyramid
	Common interface for all Gaussian pyramid functions. More...

class	NEGaussianPyramidHalf
	Basic function to execute gaussian pyramid with HALF scale factor. More...

class	NEGaussianPyramidHorKernel
	NEON kernel to perform a GaussianPyramid (horizontal pass). More...

class	NEGaussianPyramidOrb
	Basic function to execute gaussian pyramid with ORB scale factor. More...

class	NEGaussianPyramidVertKernel
	NEON kernel to perform a GaussianPyramid (vertical pass). More...

class	NEGEMM
	Basic function to execute GEMM on NEON. More...

class	NEGEMMAssemblyBaseKernel
	Base class for GEMM NEON kernels implemented in Assembly. More...

class	NEGEMMConvolutionLayer
	Basic function to simulate a convolution layer. More...

class	NEGEMMInterleave4x4
	Basic function to execute NEGEMMInterleave4x4Kernel. More...

class	NEGEMMInterleave4x4Kernel
	NEON kernel to interleave the elements of a matrix. More...

class	NEGEMMLowpAssemblyMatrixMultiplyCore
	Basic function to execute matrix multiply assembly kernels. More...

class	NEGEMMLowpMatrixAReductionKernel
	NEON kernel used to compute the row-vectors of sums of all the entries in each row of Matrix A. More...

class	NEGEMMLowpMatrixBReductionKernel
	NEON kernel used to compute the row-vectors of sums of all the entries in each column of Matrix B. More...

class	NEGEMMLowpMatrixMultiplyCore
	Basic function to execute GEMMLowpMatrixMultiplyCore on NEON. More...

class	NEGEMMLowpMatrixMultiplyKernel
	NEON kernel to multiply matrices. More...

class	NEGEMMLowpOffsetContributionKernel
	NEON kernel used to add the offset contribution after NEGEMMLowpMatrixMultiplyKernel. More...

class	NEGEMMLowpQuantizeDownInt32ToUint8Scale
	Basic function to execute NEGEMMLowpQuantizeDownInt32ToUint8Scale on NEON. More...

class	NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
	Basic function to execute NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint on NEON. More...

class	NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel
	NEON kernel used to quantize down the int32 accumulator values of GEMMLowp to QASYMM8. More...

class	NEGEMMLowpQuantizeDownInt32ToUint8ScaleKernel
	NEON kernel used to quantize down the int32 accumulator values of GEMMLowp to QASYMM8. More...

class	NEGEMMMatrixAccumulateBiasesKernel
	NEON kernel to add a bias to each row of the input tensor. More...

class	NEGEMMMatrixAdditionKernel
	NEON kernel to perform the in-place matrix addition between 2 matrices, taking into account that the second matrix might be weighted by a scalar value beta. More...

class	NEGEMMMatrixMultiplyKernel
	NEON kernel to multiply two input matrices "A" and "B". More...

class	NEGEMMMatrixVectorMultiplyKernel
	Interface for the GEMM matrix vector multiply kernel. More...

class	NEGEMMTranspose1xW
	Basic function to execute NEGEMMTranspose1xWKernel. More...

class	NEGEMMTranspose1xWKernel
	NEON kernel which transposes the elements of a matrix in chunks of 1xW, where W is equal to (16 / element size of the tensor). More...

class	NEGradientKernel
	Computes magnitude and quantised phase from input gradients. More...

class	NEHarrisCorners
	Basic function to execute harris corners detection. More...

class	NEHarrisScoreKernel
	Template NEON kernel to perform Harris Score. More...

class	NEHistogram
	Basic function to run NEHistogramKernel. More...

class	NEHistogramKernel
	Interface for the histogram kernel. More...

class	NEHOGBlockNormalizationKernel
	NEON kernel to perform HOG block normalization. More...

class	NEHOGDescriptor
	Basic function to calculate HOG descriptor. More...

class	NEHOGDetector
	Basic function to execute HOG detector based on linear SVM. More...

class	NEHOGDetectorKernel
	NEON kernel to perform HOG detector kernel using linear SVM. More...

class	NEHOGGradient
	Basic function to calculate the gradient for HOG. More...

class	NEHOGMultiDetection
	Basic function to detect multiple objects (or the same object at different scales) on the same input image using HOG. More...

class	NEHOGOrientationBinningKernel
	NEON kernel to perform HOG Orientation Binning. More...

class	NEIm2Col
	Basic function to run NEIm2ColKernel. More...

class	NEIm2ColKernel
	Interface for the im2col reshape kernel. More...

class	NEIntegralImage
	Basic function to run a NEIntegralImageKernel. More...

class	NEIntegralImageKernel
	Kernel to compute the integral image of an input image. More...

class	NEL2NormalizeLayer
	Basic function to perform an L2 normalization on a given axis. More...

class	NEL2NormalizeLayerKernel
	Interface for performing an L2 normalization on a given axis, given the sum of squares along that axis. More...

class	NELaplacianPyramid
	Basic function to execute laplacian pyramid. More...

class	NELaplacianReconstruct
	Basic function to execute laplacian reconstruction. More...

struct	NELKInternalKeypoint
	Internal keypoint class for Lucas-Kanade Optical Flow. More...

class	NELKTrackerKernel
	Interface for the Lucas-Kanade tracker kernel. More...

class	NELocallyConnectedLayer
	Basic function to compute the locally connected layer. More...

class	NELocallyConnectedMatrixMultiplyKernel
	NEON kernel to multiply each row of first tensor with low 2 dimensions of second tensor. More...

class	NELogits1DMaxKernel
	Interface for identifying the max value of 1D Logits. More...

class	NELogits1DSoftmaxKernel
	Interface for softmax computation for QASYMM8 with pre-computed max. More...

class	NEMagnitude
	Basic function to run NEMagnitudePhaseKernel. More...

class	NEMagnitudePhaseKernel
	Template interface for the kernel to compute magnitude and phase. More...

class	NEMeanStdDev
	Basic function to execute mean and std deviation. More...

class	NEMeanStdDevKernel
	Interface for the kernel to calculate mean and standard deviation of input image pixels. More...

class	NEMedian3x3
	Basic function to execute median filter. More...

class	NEMedian3x3Kernel
	Kernel to perform a median filter on a tensor. More...

class	NEMinMaxKernel
	Interface for the kernel to perform min max search on an image. More...

class	NEMinMaxLayerKernel
	Interface for the kernel to perform min max search on a 3D tensor. More...

class	NEMinMaxLocation
	Basic function to execute min and max location. More...

class	NEMinMaxLocationKernel
	Interface for the kernel to find min max locations of an image. More...

class	NENonLinearFilter
	Basic function to execute non linear filter. More...

class	NENonLinearFilterKernel
	Interface for the kernel to apply a non-linear filter. More...

class	NENonMaximaSuppression3x3
	Basic function to execute non-maxima suppression over a 3x3 window. More...

class	NENonMaximaSuppression3x3Kernel
	Interface to perform Non-Maxima suppression over a 3x3 window using NEON. More...

class	NENormalizationLayer
	Basic function to compute a normalization layer. More...

class	NENormalizationLayerKernel
	Interface for the normalization layer kernel. More...

class	NEOpticalFlow
	Basic function to execute optical flow. More...

class	NEPermute
	Basic function to run NEPermuteKernel. More...

class	NEPermuteKernel
	NEON kernel to perform tensor permutation. More...

class	NEPhase
	Basic function to run NEMagnitudePhaseKernel. More...

class	NEPixelWiseMultiplication
	Basic function to run NEPixelWiseMultiplicationKernel. More...

class	NEPixelWiseMultiplicationKernel
	Interface for the kernel to perform pixel-wise multiplication between two tensors. More...

class	NEPoolingLayer
	Basic function to simulate a pooling layer with the specified pooling operation. More...

class	NEPoolingLayerKernel
	Interface for the pooling layer kernel. More...

class	NEQuantizationLayer
	Basic function to simulate a quantization layer. More...

class	NEQuantizationLayerKernel
	Interface for the quantization layer kernel. More...

class	NEReductionOperation
	Basic function to simulate a reduction operation. More...

class	NEReductionOperationKernel
	NEON kernel to perform a reduction operation. More...

class	NERemap
	Basic function to execute remap. More...

class	NERemapKernel
	NEON kernel to perform a remap on a tensor. More...

class	NEReshapeLayer
	Basic function to run NEReshapeLayerKernel. More...

class	NEReshapeLayerKernel
	Interface for the kernel to perform tensor reshaping. More...

class	NEROIPoolingLayer
	Basic function to run NEROIPoolingLayerKernel. More...

class	NEROIPoolingLayerKernel
	Interface for the ROI pooling layer kernel. More...

class	NEScale
	Basic function to run NEScaleKernel. More...

class	NEScaleKernel
	NEON kernel to perform scaling on a tensor. More...

class	NEScharr3x3
	Basic function to execute scharr 3x3 filter. More...

class	NEScharr3x3Kernel
	Interface for the kernel to run a 3x3 Scharr filter on a tensor. More...

class	NESeparableConvolutionHorKernel
	Kernel for the Horizontal pass of a Separable Convolution. More...

class	NESeparableConvolutionVertKernel
	Kernel for the Vertical pass of a Separable Convolution. More...

class	NESobel3x3
	Basic function to execute sobel 3x3 filter. More...

class	NESobel3x3Kernel
	Interface for the kernel to run a 3x3 Sobel X filter on a tensor. More...

class	NESobel5x5
	Basic function to execute sobel 5x5 filter. More...

class	NESobel5x5HorKernel
	Interface for the kernel to run the horizontal pass of 5x5 Sobel filter on a tensor. More...

class	NESobel5x5VertKernel
	Interface for the kernel to run the vertical pass of 5x5 Sobel Y filter on a tensor. More...

class	NESobel7x7
	Basic function to execute sobel 7x7 filter. More...

class	NESobel7x7HorKernel
	Interface for the kernel to run the horizontal pass of 7x7 Sobel filter on a tensor. More...

class	NESobel7x7VertKernel
	Interface for the kernel to run the vertical pass of 7x7 Sobel Y filter on a tensor. More...

class	NESoftmaxLayer
	Basic function to compute a SoftmaxLayer. More...

class	NETableLookup
	Basic function to run NETableLookupKernel. More...

class	NETableLookupKernel
	Interface for the kernel to perform table lookup calculations. More...

class	NEThreshold
	Basic function to run NEThresholdKernel. More...

class	NEThresholdKernel
	Interface for the thresholding kernel. More...

class	NETranspose
	Basic function to transpose a matrix on NEON. More...

class	NETransposeKernel
	NEON kernel which transposes the elements of a matrix. More...

class	NEWarpAffine
	Basic function to run NEWarpAffineKernel. More...

class	NEWarpAffineKernel
	Template interface for the kernel to compute warp affine. More...

class	NEWarpPerspective
	Basic function to run NEWarpPerspectiveKernel. More...

class	NEWarpPerspectiveKernel
	Template interface for the kernel to compute warp perspective. More...

class	NEWeightsReshapeKernel
	NEON kernel to perform reshaping on the weights used by convolution and locally connected layer. More...

class	NEWinogradConvolutionLayer
	Basic function to simulate a convolution layer. More...

class	NEWinogradLayerBatchedGEMMKernel
	NEON kernel to perform Winograd. More...

class	NEWinogradLayerTransformInputKernel
	NEON kernel to perform Winograd input transform. More...

class	NEWinogradLayerTransformOutputKernel
	NEON kernel to perform Winograd output transform. More...

class	NEWinogradLayerTransformWeightsKernel
	NEON kernel to perform Winograd weights transform. More...

class	NormalizationLayerInfo
	Normalization Layer Information class. More...

class	OffsetLifetimeManager
	Concrete class that tracks the lifetime of registered tensors and calculates the system's memory requirements in terms of a single blob and a list of offsets. More...

class	OffsetMemoryPool
	Offset based memory pool. More...

class	OMPScheduler
	Pool of threads to automatically split a kernel's execution among several threads. More...

struct	OpticalFlowParameters
	Parameters of Optical Flow algorithm. More...

class	PadStrideInfo
	Padding and stride information class. More...

class	PixelValue
	Class describing the value of a pixel for any image format. More...

class	PoolingLayerInfo
	Pooling Layer Information class. More...

class	PoolManager
	Memory pool manager. More...

class	Program
	Program class. More...

class	Pyramid
	Basic implementation of the pyramid interface. More...

class	PyramidInfo
	Store the Pyramid's metadata. More...

struct	QuantizationInfo
	Quantization settings (used for QASYMM8 data type). More...

struct	Rectangle
	Rectangle type. More...

struct	ROI
	Region of interest. More...

class	ROIPoolingLayerInfo
	ROI Pooling Layer Information class. More...

class	Scheduler
	Configurable scheduler which supports multiple multithreading APIs and choosing between different schedulers at runtime. More...

class	Semaphore
	Semaphore class. More...

class	SingleThreadScheduler
	Scheduler that runs each kernel on a single thread. More...

class	Size2D
	Class for specifying the size of an image or rectangle. More...

class	Status
	Status class. More...

class	Steps
	Class to describe a number of elements in each dimension. More...

class	Strides
	Strides of an item in bytes. More...

class	SubTensor
	Basic implementation of the sub-tensor interface. More...

class	SubTensorInfo
	Store the sub tensor's metadata. More...

class	Tensor
	Basic implementation of the tensor interface. More...

class	TensorAllocator
	Basic implementation of a CPU memory tensor allocator. More...

class	TensorInfo
	Store the tensor's metadata. More...

class	TensorShape
	Shape of a tensor. More...

struct	ThreadInfo
	Information about executing thread and CPU. More...

struct	ValidRegion
	Container for valid region of a window. More...

class	WeightsInfo
	Convolution Layer Weights Information class. More...

class	Window
	Describe a multidimensional execution window. More...

struct	WinogradInfo
	Winograd information. More...
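
The brief for NEGEMMTranspose1xWKernel above gives the chunk width as W = 16 / element size of the tensor. The following is only a minimal sketch of that arithmetic, not the library's implementation: the helper names transpose_1xw_width and chunks_per_row are hypothetical and do not exist in the library, and rounding a partial trailing chunk up to a whole chunk is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the library API).
// Width W of a 1xW chunk: 16 bytes divided by the size of one element.
// Assumes element_size_bytes is non-zero and no larger than 16.
constexpr std::size_t transpose_1xw_width(std::size_t element_size_bytes)
{
    return 16 / element_size_bytes;
}

// Hypothetical helper (not part of the library API).
// Number of 1xW chunks needed to cover one row of row_length elements.
// Assumption: a partial trailing chunk is counted as one full chunk.
constexpr std::size_t chunks_per_row(std::size_t row_length, std::size_t element_size_bytes)
{
    const std::size_t w = transpose_1xw_width(element_size_bytes);
    return (row_length + w - 1) / w;
}
```

Under these assumptions, a tensor of 1-byte elements gives W = 16, 2-byte elements give W = 8, and 4-byte elements give W = 4.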

Typedefs
using	ICLKeyPointArray = ICLArray< KeyPoint >
	Interface for OpenCL Array of Key Points. More...

using	ICLCoordinates2DArray = ICLArray< Coordinates2D >
	Interface for OpenCL Array of 2D Coordinates. More...

using	ICLDetectionWindowArray = ICLArray< DetectionWindow >
	Interface for OpenCL Array of Detection Windows. More...

using	ICLROIArray = ICLArray< ROI >
	Interface for OpenCL Array of ROIs. More...

using	ICLSize2DArray = ICLArray< Size2D >
	Interface for OpenCL Array of 2D Sizes. More...

using	ICLUInt8Array = ICLArray< cl_uchar >
	Interface for OpenCL Array of uint8s. More...

using	ICLUInt16Array = ICLArray< cl_ushort >
	Interface for OpenCL Array of uint16s. More...

using	ICLUInt32Array = ICLArray< cl_uint >
	Interface for OpenCL Array of uint32s. More...

using	ICLInt16Array = ICLArray< cl_short >
	Interface for OpenCL Array of int16s. More...

using	ICLInt32Array = ICLArray< cl_int >
	Interface for OpenCL Array of int32s. More...

using	ICLFloatArray = ICLArray< cl_float >
	Interface for OpenCL Array of floats. More...

using	ICLImage = ICLTensor
	Interface for OpenCL images. More...

using	CLConvolution3x3Kernel = CLConvolutionKernel< 3 >
	Interface for the kernel which applies a 3x3 convolution to a tensor. More...

using	CLConvolution5x5Kernel = CLConvolutionKernel< 5 >
	Interface for the kernel which applies a 5x5 convolution to a tensor. More...

using	CLConvolution7x7Kernel = CLConvolutionKernel< 7 >
	Interface for the kernel which applies a 7x7 convolution to a tensor. More...

using	CLConvolution9x9Kernel = CLConvolutionKernel< 9 >
	Interface for the kernel which applies a 9x9 convolution to a tensor. More...

using	CLSeparableConvolution5x5HorKernel = CLSeparableConvolutionHorKernel< 5 >
	Interface for the kernel which applies a horizontal pass of 5x5 convolution to a tensor. More...

using	CLSeparableConvolution7x7HorKernel = CLSeparableConvolutionHorKernel< 7 >
	Interface for the kernel which applies a horizontal pass of 7x7 convolution to a tensor. More...

using	CLSeparableConvolution9x9HorKernel = CLSeparableConvolutionHorKernel< 9 >
	Interface for the kernel which applies a horizontal pass of 9x9 convolution to a tensor. More...

using	CLSeparableConvolution5x5VertKernel = CLSeparableConvolutionVertKernel< 5 >
	Interface for the kernel which applies a vertical pass of 5x5 convolution to a tensor. More...

using	CLSeparableConvolution7x7VertKernel = CLSeparableConvolutionVertKernel< 7 >
	Interface for the kernel which applies a vertical pass of 7x7 convolution to a tensor. More...

using	CLSeparableConvolution9x9VertKernel = CLSeparableConvolutionVertKernel< 9 >
	Interface for the kernel which applies a vertical pass of 9x9 convolution to a tensor. More...

using	ICLLKInternalKeypointArray = ICLArray< CLLKInternalKeypoint >
	Interface for OpenCL Array of Internal Key Points. More...

using	ICLCoefficientTableArray = ICLArray< CLCoefficientTable >
	Interface for OpenCL Array of Coefficient Tables. More...

using	ICLOldValArray = ICLArray< CLOldValue >
	Interface for OpenCL Array of Old Values. More...

using	IImage = ITensor
	Interface for CPP Images. More...

using	qint8_t = int8_t
	8 bit fixed point scalar value. More...

using	qint16_t = int16_t
	16 bit fixed point scalar value. More...

using	qint32_t = int32_t
	32 bit fixed point scalar value. More...

using	qint64_t = int64_t
	64 bit fixed point scalar value. More...

using	IGCImage = IGCTensor
	Interface for GLES Compute image. More...

using	GCDirectConvolutionLayer1x1Kernel = GCDirectConvolutionLayerKernel< 1 >
	Interface for the 1x1 direct convolution kernel. More...

using	GCDirectConvolutionLayer3x3Kernel = GCDirectConvolutionLayerKernel< 3 >
	Interface for the 3x3 direct convolution kernel. More...

using	GCDirectConvolutionLayer5x5Kernel = GCDirectConvolutionLayerKernel< 5 >
	Interface for the 5x5 direct convolution kernel. More...

using	IKeyPointArray = IArray< KeyPoint >
	Interface for Array of Key Points. More...

using	ICoordinates2DArray = IArray< Coordinates2D >
	Interface for Array of 2D Coordinates. More...

using	IDetectionWindowArray = IArray< DetectionWindow >
	Interface for Array of Detection Windows. More...

using	IROIArray = IArray< ROI >
	Interface for Array of ROIs. More...

using	ISize2DArray = IArray< Size2D >
	Interface for Array of 2D Sizes. More...

using	IUInt8Array = IArray< uint8_t >
	Interface for Array of uint8s. More...

using	IUInt16Array = IArray< uint16_t >
	Interface for Array of uint16s. More...

using	IUInt32Array = IArray< uint32_t >
	Interface for Array of uint32s. More...

using	IInt16Array = IArray< int16_t >
	Interface for Array of int16s. More...

using	IInt32Array = IArray< int32_t >
	Interface for Array of int32s. More...

using	IFloatArray = IArray< float >
	Interface for Array of floats. More...

using	INEKernel = ICPPKernel
	Common interface for all kernels implemented in NEON. More...

using	INESimpleKernel = ICPPSimpleKernel
	Interface for simple NEON kernels having 1 tensor input and 1 tensor output. More...

using	NEAccumulateWeightedFP16Kernel = NEAccumulateWeightedKernel
	Interface for the accumulate weighted kernel using F16. More...

using	NEBox3x3FP16Kernel = NEBox3x3Kernel
	NEON kernel to perform a Box 3x3 filter for FP16 datatype. More...

using	NEGradientFP16Kernel = NEGradientKernel
	NEON kernel to perform Gradient computation for FP16 datatype. More...

using	NEConvolution3x3Kernel = NEConvolutionKernel< 3 >
	Interface for the kernel which applies a 3x3 convolution to a tensor. More...

using	NEConvolution5x5Kernel = NEConvolutionKernel< 5 >
	Interface for the kernel which applies a 5x5 convolution to a tensor. More...

using	NEConvolution7x7Kernel = NEConvolutionKernel< 7 >
	Interface for the kernel which applies a 7x7 convolution to a tensor. More...

using	NEConvolution9x9Kernel = NEConvolutionKernel< 9 >
	Interface for the kernel which applies a 9x9 convolution to a tensor. More...

using	NESeparableConvolution5x5HorKernel = NESeparableConvolutionHorKernel< 5 >
	Interface for the kernel which applies a 5x1 horizontal convolution to a tensor. More...

using	NESeparableConvolution7x7HorKernel = NESeparableConvolutionHorKernel< 7 >
	Interface for the kernel which applies a 7x1 horizontal convolution to a tensor. More...

using	NESeparableConvolution9x9HorKernel = NESeparableConvolutionHorKernel< 9 >
	Interface for the kernel which applies a 9x1 horizontal convolution to a tensor. More...

using	NESeparableConvolution5x5VertKernel = NESeparableConvolutionVertKernel< 5 >
	Interface for the kernel which applies a 1x5 vertical convolution to a tensor. More...

using	NESeparableConvolution7x7VertKernel = NESeparableConvolutionVertKernel< 7 >
	Interface for the kernel which applies a 1x7 vertical convolution to a tensor. More...

using	NESeparableConvolution9x9VertKernel = NESeparableConvolutionVertKernel< 9 >
	Interface for the kernel which applies a 1x9 vertical convolution to a tensor. More...

template<int32_t block_size>
using	NEHarrisScoreFP16Kernel = NEHarrisScoreKernel< block_size >
	Interface for the Harris score kernel using FP16. More...

using	INELKInternalKeypointArray = IArray< NELKInternalKeypoint >
	Interface for NEON Array of Internal Key Points. More...

template<MagnitudeType mag_type, PhaseType phase_type>
using	NEMagnitudePhaseFP16Kernel = NEMagnitudePhaseKernel< mag_type, phase_type >
	Template interface for the kernel to compute magnitude and phase. More...

using	NENonMaximaSuppression3x3FP16Kernel = NENonMaximaSuppression3x3Kernel
	NEON kernel to perform Non-Maxima suppression 3x3 with intermediate results in FP16 if the input data type is FP32. More...

using	qasymm8x8_t = uint8x8_t
	8 bit quantized asymmetric vector with 8 elements More...

using	qasymm8x8x2_t = uint8x8x2_t
	8 bit quantized asymmetric vector with 16 elements More...

using	qasymm8x8x3_t = uint8x8x3_t
	8 bit quantized asymmetric vector with 24 elements More...

using	qasymm8x8x4_t = uint8x8x4_t
	8 bit quantized asymmetric vector with 32 elements More...

using	qasymm8x16_t = uint8x16_t
	8 bit quantized asymmetric vector with 16 elements More...

using	qint8x8_t = int8x8_t
	8 bit fixed point vector with 8 elements More...

using	qint8x8x2_t = int8x8x2_t
	8 bit fixed point vector with 16 elements More...

using	qint8x8x3_t = int8x8x3_t
	8 bit fixed point vector with 24 elements More...

using	qint8x8x4_t = int8x8x4_t
	8 bit fixed point vector with 32 elements More...

using	qint8x16_t = int8x16_t
	8 bit fixed point vector with 16 elements More...

using	qint8x16x2_t = int8x16x2_t
	8 bit fixed point vector with 32 elements More...

using	qint8x16x3_t = int8x16x3_t
	8 bit fixed point vector with 48 elements More...

using	qint8x16x4_t = int8x16x4_t
	8 bit fixed point vector with 64 elements More...

using	qint16x4_t = int16x4_t
	16 bit fixed point vector with 4 elements More...

using	qint16x4x2_t = int16x4x2_t
	16 bit fixed point vector with 8 elements More...

using	qint16x4x3_t = int16x4x3_t
	16 bit fixed point vector with 12 elements More...

using	qint16x4x4_t = int16x4x4_t
	16 bit fixed point vector with 16 elements More...

using	qint16x8_t = int16x8_t
	16 bit fixed point vector with 8 elements More...

using	qint16x8x2_t = int16x8x2_t
	16 bit fixed point vector with 16 elements More...

using	qint16x8x3_t = int16x8x3_t
	16 bit fixed point vector with 24 elements More...

using	qint16x8x4_t = int16x8x4_t
	16 bit fixed point vector with 32 elements More...

using	qint32x2_t = int32x2_t
	32 bit fixed point vector with 2 elements More...

using	qint32x4_t = int32x4_t
	32 bit fixed point vector with 4 elements More...

using	qint32x4x2_t = int32x4x2_t
	32 bit fixed point vector with 8 elements More...

using	qasymm8_t = uint8_t
	8 bit quantized asymmetric scalar value More...

using	half = half_float::half
	16-bit floating point type More...

using	PermutationVector = Strides
	Permutation vector. More...

using	PaddingSize = BorderSize
	Container for 2D padding size. More...

using	InternalKeypoint = std::tuple< float, float, float >
	Internal key point. More...

using	KeyPointArray = Array< KeyPoint >
	Array of Key Points. More...

using	Coordinates2DArray = Array< Coordinates2D >
	Array of 2D Coordinates. More...

using	DetectionWindowArray = Array< DetectionWindow >
	Array of Detection Windows. More...

using	ROIArray = Array< ROI >
	Array of ROIs. More...

using	Size2DArray = Array< Size2D >
	Array of 2D Sizes. More...

using	UInt8Array = Array< uint8_t >
	Array of uint8s. More...

using	UInt16Array = Array< uint16_t >
	Array of uint16s. More...

using	UInt32Array = Array< uint32_t >
	Array of uint32s. More...

using	Int16Array = Array< int16_t >
	Array of int16s. More...

using	Int32Array = Array< int32_t >
	Array of int32s. More...

using	FloatArray = Array< float >
	Array of floats. More...

using	CLKeyPointArray = CLArray< KeyPoint >
	OpenCL Array of Key Points. More...

using	CLCoordinates2DArray = CLArray< Coordinates2D >
	OpenCL Array of 2D Coordinates. More...

using	CLDetectionWindowArray = CLArray< DetectionWindow >
	OpenCL Array of Detection Windows. More...

using	CLROIArray = CLArray< ROI >
	OpenCL Array of ROIs. More...

using	CLSize2DArray = CLArray< Size2D >
	OpenCL Array of 2D Sizes. More...

using	CLUInt8Array = CLArray< cl_uchar >
	OpenCL Array of uint8s. More...

using	CLUInt16Array = CLArray< cl_ushort >
	OpenCL Array of uint16s. More...

using	CLUInt32Array = CLArray< cl_uint >
	OpenCL Array of uint32s. More...

using	CLInt16Array = CLArray< cl_short >
	OpenCL Array of int16s. More...

using	CLInt32Array = CLArray< cl_int >
	OpenCL Array of int32s. More...

using	CLFloatArray = CLArray< cl_float >
	OpenCL Array of floats. More...

using	CLMemoryGroup = MemoryGroupBase< CLTensor >
	Memory Group in OpenCL. More...

using	CLImage = CLTensor
	OpenCL Image. More...

using	CLConvolution5x5 = CLConvolutionSquare< 5 >
	Basic function to run 5x5 convolution. More...

using	CLConvolution7x7 = CLConvolutionSquare< 7 >
	Basic function to run 7x7 convolution. More...

using	CLConvolution9x9 = CLConvolutionSquare< 9 >
	Basic function to run 9x9 convolution. More...

using	CLLKInternalKeypointArray = CLArray< CLLKInternalKeypoint >
	OpenCL Array of Internal Keypoints. More...

using	CLCoefficientTableArray = CLArray< CLCoefficientTable >
	OpenCL Array of Coefficient Tables. More...

using	CLOldValueArray = CLArray< CLOldValue >
	OpenCL Array of Old Values. More...

using	GCMemoryGroup = MemoryGroupBase< GCTensor >

using	GCImage = GCTensor
	OpenGL ES Image. More...

using	MemoryGroup = MemoryGroupBase< Tensor >
	Memory Group. More...

using	AssemblyKernelGlueF32 = AssemblyKernelGlue< float, float >
	Float 32 assembly kernel glue. More...

using	AssemblyKernelGlueU8U32 = AssemblyKernelGlue< uint8_t, uint32_t >
	Uint 8 to Uint 32 kernel glue. More...

using	AssemblyKernelGlueS8S32 = AssemblyKernelGlue< int8_t, int32_t >
	Int 8 to Int 32 kernel glue. More...

using	NEConvolution5x5 = NEConvolutionSquare< 5 >
	Basic function to run 5x5 convolution. More...

using	NEConvolution7x7 = NEConvolutionSquare< 7 >
	Basic function to run 7x7 convolution. More...

using	NEConvolution9x9 = NEConvolutionSquare< 9 >
	Basic function to run 9x9 convolution. More...

using	LKInternalKeypointArray = Array< NELKInternalKeypoint >
	Array of LK Internal Keypoints. More...

using	NEScheduler = Scheduler
	NEON Scheduler. More...

using	Image = Tensor
	Image. More...

using	MemoryMappings = std::map< void **, size_t >
	A map of (handle, index/offset), where handle is the memory handle of the object to provide the memory for and index/offset is the buffer/offset from the pool that should be used. More...

using	GroupMappings = std::map< size_t, MemoryMappings >
	A map of the groups and memory mappings. More...

using	Mutex = std::mutex
	Wrapper of Mutex data-object. More...

Enumerations
enum	CLVersion { CL10, CL11, CL12, CL20, UNKNOWN }
	Available OpenCL Version. More...

enum	CPUModel { GENERIC, A53, A55r0, A55r1 }
	CPU models - we only need to detect CPUs we have microarchitecture-specific code for. More...

enum	ErrorCode { OK, RUNTIME_ERROR }
	Available error codes. More...

enum	GPUTarget { UNKNOWN = 0x101, GPU_ARCH_MASK = 0xF00, MIDGARD = 0x100, BIFROST = 0x200, T600 = 0x110, T700 = 0x120, T800 = 0x130, G71 = 0x210, G72 = 0x220, G51 = 0x230, G51BIG = 0x231, G51LIT = 0x232, TNOX = 0x240, TTRX = 0x250, TBOX = 0x260 }
	Available GPU Targets. More...

enum	RoundingPolicy { TO_ZERO, TO_NEAREST_UP, TO_NEAREST_EVEN }
	Rounding method. More...

enum	Format { UNKNOWN, U8, S16, U16, S32, U32, F16, F32, UV88, RGB888, RGBA8888, YUV444, YUYV422, NV12, NV21, IYUV, UYVY422 }
	Image colour formats. More...

enum	DataType { UNKNOWN, U8, S8, QS8, QASYMM8, U16, S16, QS16, U32, S32, QS32, U64, S64, F16, F32, F64, SIZET }
	Available data types. More...

enum	SamplingPolicy { CENTER, TOP_LEFT }
	Available Sampling Policies. More...

enum	DataLayout { UNKNOWN, NCHW, NHWC }
	Supported tensor data layouts. More...

enum	DataLayoutDimension { CHANNEL, HEIGHT, WIDTH, BATCHES }
	Supported tensor data layout dimensions. More...

enum	BorderMode { UNDEFINED, CONSTANT, REPLICATE }
	Methods available to handle borders. More...

enum	ConvertPolicy { WRAP, SATURATE }
	Policy to handle overflow. More...

enum	InterpolationPolicy { NEAREST_NEIGHBOR, BILINEAR, AREA }
	Interpolation method. More...

enum	BilinearInterpolation { BILINEAR_OLD_NEW, BILINEAR_SCHARR }
	Bilinear Interpolation method used by LKTracker. More...

enum	ThresholdType { BINARY, RANGE }
	Threshold mode. More...

enum	Termination { TERM_CRITERIA_EPSILON, TERM_CRITERIA_ITERATIONS, TERM_CRITERIA_BOTH }
	Termination criteria. More...

enum	MagnitudeType { L1NORM, L2NORM }
	Magnitude calculation type. More...

enum	PhaseType { SIGNED, UNSIGNED }
	Phase calculation type. More...

enum	Channel { UNKNOWN, C0, C1, C2, C3, R, G, B, A, Y, U, V }
	Available channels. More...

enum	MatrixPattern { BOX, CROSS, DISK, OTHER }
	Available matrix patterns. More...

enum	NonLinearFilterFunction : unsigned { MEDIAN = 0, MIN = 1, MAX = 2 }
	Available non linear functions. More...

enum	ReductionOperation { SUM_SQUARE, SUM }
	Available reduction operations. More...

enum	NormType { IN_MAP_1D, IN_MAP_2D, CROSS_MAP }
	The normalization type used for the normalization layer. More...

enum	HOGNormType { L2_NORM = 1, L2HYS_NORM = 2, L1_NORM = 3 }
	Normalization type for Histogram of Oriented Gradients (HOG) More...

enum	DimensionRoundingType { FLOOR, CEIL }
	Dimension rounding type when down-scaling on CNNs. More...

enum	PoolingType { MAX, AVG, L2 }
	Available pooling types. More...

enum	ConvolutionMethod { GEMM, DIRECT, WINOGRAD }
	Available ConvolutionMethod. More...

enum	MappingType { BLOBS, OFFSETS }
	Mapping type. More...

enum	FixedPointOp { ADD, SUB, MUL, EXP, LOG, INV_SQRT, RECIPROCAL }
	Fixed point operation. More...

enum	GradientDimension { GRAD_XY }
	Gradient dimension type. More...

Functions
std::string	get_cl_type_from_data_type (const DataType &dt)
	Translates a tensor data type to the appropriate OpenCL type. More...

std::string	get_data_size_from_data_type (const DataType &dt)
	Get the size of a data type in number of bits. More...

std::string	get_underlying_cl_type_from_data_type (const DataType &dt)
	Translates fixed point tensor data type to the underlying OpenCL type. More...

GPUTarget	get_target_from_device (cl::Device &device)
	Helper function to get the GPU target from CL device. More...

CLVersion	get_cl_version (const cl::Device &device)
	Helper function to get the highest OpenCL version supported. More...

bool	device_supports_extension (const cl::Device &device, const char *extension_name)
	Helper function to check whether a given extension is supported. More...

bool	fp16_supported (const cl::Device &device)
	Helper function to check whether the cl_khr_fp16 extension is supported. More...

bool	arm_non_uniform_workgroup_supported (const cl::Device &device)
	Helper function to check whether the arm_non_uniform_work_group_size extension is supported. More...

void	enqueue (cl::CommandQueue &queue, ICLKernel &kernel, const Window &window, const cl::NDRange &lws_hint=CLKernelLibrary::get().default_ndrange())
	Add the kernel to the command queue with the given window. More...

bool	opencl_is_available ()
	Check if OpenCL is available. More...

template<typename T >
bool	operator== (const Dimensions< T > &lhs, const Dimensions< T > &rhs)
	Check that given dimensions are equal. More...

template<typename T >
bool	operator!= (const Dimensions< T > &lhs, const Dimensions< T > &rhs)
	Check that given dimensions are not equal. More...

template<typename... T>
void	ignore_unused (T &&...)
	Ignores unused arguments. More...
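
A helper of this kind is usually an empty variadic template. The following is a minimal sketch, not the library's definition; `ignore_unused_sketch` and `scaled` are hypothetical names used only for illustration.

```cpp
// Hypothetical stand-in for ignore_unused: accepts any arguments and does nothing with them.
template <typename... T>
inline void ignore_unused_sketch(T &&...)
{
}

// Example use: 'verbose' is only read in some builds, so the warning is silenced here.
inline int scaled(int value, bool verbose)
{
    ignore_unused_sketch(verbose);
    return value * 2;
}
```

Passing a parameter through such a helper keeps the function signature stable while avoiding unused-parameter warnings.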

Status	create_error_va_list (ErrorCode error_code, const char *function, const char *file, const int line, const char *msg, va_list args)
	Creates an error containing the error message from variable argument list. More...

Status	create_error (ErrorCode error_code, const char *function, const char *file, const int line, const char *msg,...)
	Creates an error containing the error message. More...

void	error (const char *function, const char *file, const int line, const char *msg,...)
	Print an error message then throw an std::runtime_error. More...

qint8_t	sqshl_qs8 (qint8_t a, int shift)
	8 bit fixed point scalar saturating shift left More...

qint8_t	sshr_qs8 (qint8_t a, int shift)
	8 bit fixed point scalar shift right More...

qint16_t	sshr_qs16 (qint16_t a, int shift)
	16 bit fixed point scalar shift right More...

qint16_t	sqshl_qs16 (qint16_t a, int shift)
	16 bit fixed point scalar saturating shift left More...

qint8_t	sabs_qs8 (qint8_t a)
	8 bit fixed point scalar absolute value More...

qint16_t	sabs_qs16 (qint16_t a)
	16 bit fixed point scalar absolute value More...

qint8_t	sadd_qs8 (qint8_t a, qint8_t b)
	8 bit fixed point scalar add More...

qint16_t	sadd_qs16 (qint16_t a, qint16_t b)
	16 bit fixed point scalar add More...

qint8_t	sqadd_qs8 (qint8_t a, qint8_t b)
	8 bit fixed point scalar saturating add More...

qint16_t	sqadd_qs16 (qint16_t a, qint16_t b)
	16 bit fixed point scalar saturating add More...

qint32_t	sqadd_qs32 (qint32_t a, qint32_t b)
	32 bit fixed point scalar saturating add More...
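
The difference between the plain and the saturating adds listed above can be illustrated with a small standalone sketch. It assumes that an 8 bit fixed point scalar is stored in a plain `int8_t`; `q8_t` and `saturating_add_q8` are hypothetical names, not the library's implementation.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: an 8 bit fixed point scalar is stored in a plain int8_t.
using q8_t = std::int8_t;

// Add in a wider type, then clamp to the int8_t range instead of wrapping around.
inline q8_t saturating_add_q8(q8_t a, q8_t b)
{
    const int wide = static_cast<int>(a) + static_cast<int>(b);
    return static_cast<q8_t>(std::max(-128, std::min(127, wide)));
}
```

With this sketch, adding 100 and 100 yields 127 rather than a wrapped negative value.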

qint8_t	ssub_qs8 (qint8_t a, qint8_t b)
	8 bit fixed point scalar subtraction More...

qint16_t	ssub_qs16 (qint16_t a, qint16_t b)
	16 bit fixed point scalar subtraction More...

qint8_t	sqsub_qs8 (qint8_t a, qint8_t b)
	8 bit fixed point scalar saturating subtraction More...

qint16_t	sqsub_qs16 (qint16_t a, qint16_t b)
	16 bit fixed point scalar saturating subtraction More...

qint8_t	smul_qs8 (qint8_t a, qint8_t b, int fixed_point_position)
	8 bit fixed point scalar multiply More...

qint16_t	smul_qs16 (qint16_t a, qint16_t b, int fixed_point_position)
	16 bit fixed point scalar multiply More...

qint8_t	sqmul_qs8 (qint8_t a, qint8_t b, int fixed_point_position)
	8 bit fixed point scalar saturating multiply More...

qint16_t	sqmul_qs16 (qint16_t a, qint16_t b, int fixed_point_position)
	16 bit fixed point scalar saturating multiply More...
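
The role of `fixed_point_position` in the multiplies above can be sketched as follows: the raw product carries twice as many fractional bits as the operands, so it is shifted back by `fixed_point_position`. The rounding term and the helper name `saturating_mul_q8` are assumptions made for illustration, not the library's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Multiply two values that each have 'fixed_point_position' fractional bits.
inline std::int8_t saturating_mul_q8(std::int8_t a, std::int8_t b, int fixed_point_position)
{
    assert(fixed_point_position > 0 && fixed_point_position < 8);
    const int product  = static_cast<int>(a) * static_cast<int>(b);
    const int rounding = 1 << (fixed_point_position - 1); // assumed round-to-nearest term
    // Right shift of a negative value is arithmetic on mainstream compilers (guaranteed from C++20).
    const int rescaled = (product + rounding) >> fixed_point_position;
    // Clamp to the int8_t range instead of wrapping around.
    return static_cast<std::int8_t>(std::max(-128, std::min(127, rescaled)));
}
```

For example, with four fractional bits the raw values 24 (1.5) and 32 (2.0) multiply to 48 (3.0).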

qint16_t	sqmull_qs8 (qint8_t a, qint8_t b, int fixed_point_position)
	8 bit fixed point scalar multiply long More...

qint32_t	sqmull_qs16 (qint16_t a, qint16_t b, int fixed_point_position)
	16 bit fixed point scalar multiply long More...

qint8_t	sinvsqrt_qs8 (qint8_t a, int fixed_point_position)
	8 bit fixed point scalar inverse square root More...

qint16_t	sinvsqrt_qs16 (qint16_t a, int fixed_point_position)
	16 bit fixed point scalar inverse square root More...

qint8_t	sdiv_qs8 (qint8_t a, qint8_t b, int fixed_point_position)
	8 bit fixed point scalar division More...

qint16_t	sdiv_qs16 (qint16_t a, qint16_t b, int fixed_point_position)
	16 bit fixed point scalar division More...

qint8_t	sqexp_qs8 (qint8_t a, int fixed_point_position)
	8 bit fixed point scalar exponential More...

qint16_t	sqexp_qs16 (qint16_t a, int fixed_point_position)
	16 bit fixed point scalar exponential More...

qint16_t	sexp_qs16 (qint16_t a, int fixed_point_position)
	16 bit fixed point scalar exponential More...

qint8_t	slog_qs8 (qint8_t a, int fixed_point_position)
	8 bit fixed point scalar logarithm More...

qint16_t	slog_qs16 (qint16_t a, int fixed_point_position)
	16 bit fixed point scalar logarithm More...

float	scvt_f32_qs8 (qint8_t a, int fixed_point_position)
	Convert an 8 bit fixed point to float. More...

qint8_t	sqcvt_qs8_f32 (float a, int fixed_point_position)
	Convert a float to 8 bit fixed point. More...

float	scvt_f32_qs16 (qint16_t a, int fixed_point_position)
	Convert a 16 bit fixed point to float. More...

qint16_t	sqcvt_qs16_f32 (float a, int fixed_point_position)
	Convert a float to 16 bit fixed point. More...
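
The conversions above relate a raw integer to a real value through a power-of-two scale. A minimal sketch for the 8 bit case follows, assuming round-to-nearest on the way in and finite inputs; `q8_to_float` and `float_to_q8` are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>

// Real value = raw / 2^fixed_point_position.
inline float q8_to_float(std::int8_t raw, int fixed_point_position)
{
    return static_cast<float>(raw) / static_cast<float>(1 << fixed_point_position);
}

// Scale, round to nearest (assumption), then saturate to the int8_t range.
// Assumes 'value' is finite.
inline std::int8_t float_to_q8(float value, int fixed_point_position)
{
    const float scaled = std::round(value * static_cast<float>(1 << fixed_point_position));
    if (scaled > 127.0f)
    {
        return 127;
    }
    if (scaled < -128.0f)
    {
        return -128;
    }
    return static_cast<std::int8_t>(scaled);
}
```

With four fractional bits, 1.5 is stored as the raw value 24, and values outside the representable range saturate.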

qint8_t	sqmovn_qs16 (qint16_t a)
	Scalar saturating move and narrow. More...

qint16_t	sqmovn_qs32 (qint32_t a)
	Scalar saturating move and narrow. More...

GPUTarget	get_target_from_device ()
	Helper function to get the GPU target from GLES using GL_RENDERER enum. More...

void	enqueue (IGCKernel &kernel, const Window &window, const gles::NDRange &lws=gles::NDRange(1U, 1U, 1U))
	Add the kernel to the command queue with the given window. More...

bool	opengles31_is_available ()
	Check if the OpenGL ES 3.1 API is available at runtime. More...

const std::string &	string_from_target (GPUTarget target)
	Translates a given gpu device target to string. More...

GPUTarget	get_target_from_name (const std::string &device_name)
	Helper function to get the GPU target from a device name. More...

GPUTarget	get_arch_from_target (GPUTarget target)
	Helper function to get the GPU arch. More...
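
The enumerator values of GPUTarget listed on this page suggest that the architecture is encoded in the bits selected by GPU_ARCH_MASK. The sketch below shows that masking step under this assumption; the enum is a subset redeclared here so that the example stands alone, and `arch_of` is a hypothetical name.

```cpp
#include <cstdint>

// Subset of the GPUTarget values listed on this page, redeclared for a standalone example.
enum class GPUTarget : std::uint32_t
{
    GPU_ARCH_MASK = 0xF00,
    MIDGARD       = 0x100,
    BIFROST       = 0x200,
    T600          = 0x110,
    G71           = 0x210,
    G72           = 0x220
};

// Hypothetical stand-in for get_arch_from_target: keep only the architecture bits.
constexpr GPUTarget arch_of(GPUTarget target)
{
    return static_cast<GPUTarget>(static_cast<std::uint32_t>(target) &
                                  static_cast<std::uint32_t>(GPUTarget::GPU_ARCH_MASK));
}
```

Under this reading, G71 (0x210) maps to BIFROST (0x200) and T600 (0x110) maps to MIDGARD (0x100).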

template<typename... Args>
bool	gpu_target_is_in (GPUTarget target_to_check, GPUTarget target, Args...targets)
	Helper function to check whether a gpu target is equal to the provided targets. More...

bool	gpu_target_is_in (GPUTarget target_to_check, GPUTarget target)
	Variant of gpu_target_is_in for comparing two targets. More...
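
The pair of overloads above follows a common variadic pattern: a two-argument base case plus a template that peels off one candidate at a time. The following is a sketch of that pattern only; `target_is_in` and the reduced enum are hypothetical, and the library's comparison may differ in detail.

```cpp
// Subset of the GPUTarget values listed on this page, redeclared for a standalone example.
enum class GPUTarget
{
    T600 = 0x110,
    G71  = 0x210,
    G72  = 0x220
};

// Base case: compare against a single candidate.
inline bool target_is_in(GPUTarget target_to_check, GPUTarget target)
{
    return target_to_check == target;
}

// Recursive case: test the first candidate, then recurse on the remaining ones.
template <typename... Args>
inline bool target_is_in(GPUTarget target_to_check, GPUTarget target, Args... targets)
{
    return (target_to_check == target) || target_is_in(target_to_check, targets...);
}
```

The base case is declared first so that the recursive call can resolve to it once a single candidate remains.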

template<typename Kernel , typename... T>
std::unique_ptr< Kernel >	create_configure_kernel (T &&...args)
	Helper function to create and return a unique_ptr pointing to a CL/GLES kernel object. It also calls the kernel's configuration. More...

template<typename Kernel >
std::unique_ptr< Kernel >	create_kernel ()
	Helper function to create and return a unique_ptr pointed to a CL/GLES kernel object. More...

template<typename T >
T	delta_bilinear_c1 (const T *pixel_ptr, size_t stride, float dx, float dy)
	Computes bilinear interpolation using the pointer to the top-left pixel and the pixel's distance between the real coordinates and the smallest following integer coordinates. More...
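
Bilinear interpolation from a top-left pixel blends the four neighbouring pixels by the fractional offsets. The sketch below assumes that `stride` is counted in elements, that `dx` and `dy` lie in [0, 1], and that the result is returned as `float`; `bilinear_from_top_left` is a hypothetical name, not the library function.

```cpp
#include <cstddef>

// Blend the 2x2 neighbourhood whose top-left pixel is at pixel_ptr.
// Assumptions: stride is in elements, dx and dy are fractional offsets in [0, 1].
template <typename T>
inline float bilinear_from_top_left(const T *pixel_ptr, std::size_t stride, float dx, float dy)
{
    const float a00 = static_cast<float>(pixel_ptr[0]);
    const float a01 = static_cast<float>(pixel_ptr[1]);
    const float a10 = static_cast<float>(pixel_ptr[stride]);
    const float a11 = static_cast<float>(pixel_ptr[stride + 1]);

    // Interpolate along x on both rows, then along y between the rows.
    const float top    = a00 * (1.0f - dx) + a01 * dx;
    const float bottom = a10 * (1.0f - dx) + a11 * dx;
    return top * (1.0f - dy) + bottom * dy;
}
```

For a 2x2 image with rows (0, 10) and (20, 30), the centre point (dx = dy = 0.5) evaluates to 15.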

template<typename T >
T	delta_linear_c1_y (const T *pixel_ptr, size_t stride, float dy)
	Computes linear interpolation using the pointer to the top pixel and the pixel's distance between the real coordinates and the smallest following integer coordinates. More...

template<typename T >
T	delta_linear_c1_x (const T *pixel_ptr, float dx)
	Computes linear interpolation using the pointer to the left pixel and the pixel's distance between the real coordinates and the smallest following integer coordinates. More...

template<typename T >
T	pixel_bilinear_c1 (const T *first_pixel_ptr, size_t stride, float x, float y)
	Return the pixel at (x,y) using bilinear interpolation. More...

template<typename T >
uint8_t	pixel_bilinear_c1_clamp (const T *first_pixel_ptr, size_t stride, size_t width, size_t height, float x, float y)
	Return the pixel at (x,y) using bilinear interpolation by clamping when out of borders. More...

uint8_t	pixel_area_c1u8_clamp (const uint8_t *first_pixel_ptr, size_t stride, size_t width, size_t height, float wr, float hr, int x, int y)
	Return the pixel at (x,y) using area interpolation by clamping when out of borders. More...

template<typename L , typename... Ts>
void	execute_window_loop (const Window &w, L &&lambda_function, Ts &&...iterators)
	Iterate through the passed window, automatically adjusting the iterators and calling the lambda_function for each element. More...

template<typename... Ts>
bool	update_window_and_padding (Window &win, Ts &&...patterns)
	Update window and padding size for each of the access patterns. More...

Window	calculate_max_window (const ValidRegion &valid_region, const Steps &steps=Steps(), bool skip_border=false, BorderSize border_size=BorderSize())
	Calculate the maximum window for a given tensor shape and border setting. More...

Window	calculate_max_window (const ITensorInfo &info, const Steps &steps=Steps(), bool skip_border=false, BorderSize border_size=BorderSize())
	Calculate the maximum window for a given tensor shape and border setting. More...

Window	calculate_max_window_horizontal (const ValidRegion &valid_region, const Steps &steps=Steps(), bool skip_border=false, BorderSize border_size=BorderSize())
	Calculate the maximum window used by a horizontal kernel for a given tensor shape and border setting. More...

Window	calculate_max_window_horizontal (const ITensorInfo &info, const Steps &steps=Steps(), bool skip_border=false, BorderSize border_size=BorderSize())
	Calculate the maximum window used by a horizontal kernel for a given tensor shape and border setting. More...

Window	calculate_max_enlarged_window (const ValidRegion &valid_region, const Steps &steps=Steps(), BorderSize border_size=BorderSize())
	Calculate the maximum window for a given tensor shape and border setting. More...

Window	calculate_max_enlarged_window (const ITensorInfo &info, const Steps &steps=Steps(), BorderSize border_size=BorderSize())
	Calculate the maximum window for a given tensor shape and border setting. More...

template<typename... Ts>
ValidRegion	intersect_valid_regions (const Ts &...regions)
	Intersect multiple valid regions. More...

template<typename T , typename... Ts>
Strides	compute_strides (const ITensorInfo &info, T stride_x, Ts &&...fixed_strides)
	Create a strides object based on the provided strides and the tensor dimensions. More...

template<typename... Ts>
Strides	compute_strides (const ITensorInfo &info)
	Create a strides object based on the tensor dimensions. More...

template<typename T >
void	permute (Dimensions< T > &dimensions, const PermutationVector &perm)
	Permutes given Dimensions according to a permutation vector. More...

void	permute (TensorShape &shape, const PermutationVector &perm)
	Permutes given TensorShape according to a permutation vector. More...

bool	auto_init_if_empty (ITensorInfo &info, const TensorShape &shape, int num_channels, DataType data_type, int fixed_point_position, QuantizationInfo quantization_info=QuantizationInfo())
	Auto initialize the tensor info (shape, number of channels, data type and fixed point position) if the current assignment is empty. More...

bool	auto_init_if_empty (ITensorInfo &info_sink, const ITensorInfo &info_source)
	Auto initialize the tensor info using another tensor info. More...

bool	set_shape_if_empty (ITensorInfo &info, const TensorShape &shape)
	Set the shape to the specified value if the current assignment is empty. More...

bool	set_format_if_unknown (ITensorInfo &info, Format format)
	Set the format, data type and number of channels to the specified value if the current data type is unknown. More...

bool	set_data_type_if_unknown (ITensorInfo &info, DataType data_type)
	Set the data type and number of channels to the specified value if the current data type is unknown. More...

bool	set_data_layout_if_unknown (ITensorInfo &info, DataLayout data_layout)
	Set the data layout to the specified value if the current data layout is unknown. More...

bool	set_fixed_point_position_if_zero (ITensorInfo &info, int fixed_point_position)
	Set the fixed point position to the specified value if the current fixed point position is 0 and the data type is QS8 or QS16. More...

bool	set_quantization_info_if_empty (ITensorInfo &info, QuantizationInfo quantization_info)
	Set the quantization info to the specified value if the current quantization info is empty and the data type is an asymmetric quantized type. More...

ValidRegion	calculate_valid_region_scale (const ITensorInfo &src_info, const TensorShape &dst_shape, InterpolationPolicy interpolate_policy, SamplingPolicy sampling_policy, bool border_undefined)
	Helper function to calculate the Valid Region for Scale. More...

Coordinates	index2coords (const TensorShape &shape, int index)
	Convert a linear index into n-dimensional coordinates. More...

int	coords2index (const TensorShape &shape, const Coordinates &coord)
	Convert n-dimensional coordinates into a linear index. More...
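
The two conversions above are inverses of each other. The sketch below uses `std::vector<int>` in place of TensorShape and Coordinates and assumes that dimension 0 varies fastest; `index_to_coords` and `coords_to_index` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Peel off one coordinate per dimension, starting with dimension 0 (assumed fastest-varying).
inline std::vector<int> index_to_coords(const std::vector<int> &shape, int index)
{
    std::vector<int> coords(shape.size(), 0);
    for (std::size_t d = 0; d < shape.size(); ++d)
    {
        assert(shape[d] > 0);
        coords[d] = index % shape[d];
        index /= shape[d];
    }
    return coords;
}

// Accumulate coordinate * stride, where the stride grows by the size of each dimension.
inline int coords_to_index(const std::vector<int> &shape, const std::vector<int> &coords)
{
    assert(shape.size() == coords.size());
    int index  = 0;
    int stride = 1;
    for (std::size_t d = 0; d < shape.size(); ++d)
    {
        index += coords[d] * stride;
        stride *= shape[d];
    }
    return index;
}
```

For a shape of (4, 3, 2), the linear index 17 corresponds to the coordinates (1, 1, 1), and converting back gives 17 again.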

size_t	get_data_layout_dimension_index (const DataLayout data_layout, const DataLayoutDimension data_layout_dimension)
	Get the index of the given dimension. More...
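
One way to read this lookup is as a table from layout and logical dimension to a position in the shape. The sketch below assumes that dimension 0 is the innermost one, so a layout name such as NCHW is read from right to left; the enums are reduced copies of those listed on this page and `layout_dimension_index` is a hypothetical name.

```cpp
#include <cstddef>

// Reduced copies of the enumerations listed on this page (UNKNOWN omitted).
enum class DataLayout { NCHW, NHWC };
enum class DataLayoutDimension { CHANNEL, HEIGHT, WIDTH, BATCHES };

// Assumption: index 0 is the innermost dimension, so "NCHW" maps W to 0 and N to 3.
inline std::size_t layout_dimension_index(DataLayout layout, DataLayoutDimension dim)
{
    const bool nchw = (layout == DataLayout::NCHW);
    switch (dim)
    {
        case DataLayoutDimension::WIDTH:
            return nchw ? 0 : 1;
        case DataLayoutDimension::HEIGHT:
            return nchw ? 1 : 2;
        case DataLayoutDimension::CHANNEL:
            return nchw ? 2 : 0;
        default: // BATCHES is outermost in both layouts
            return 3;
    }
}
```

Code written against such a lookup can address the width or channel dimension without hard-coding a layout.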

int	adjust_down (int required, int available, int step)
	Decrease `required` in steps of `step` until it's less than `available`. More...

int	adjust_up (int required, int available, int step)
	Increase `required` in steps of `step` until it's greater than `available`. More...
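
The two adjustments above can be written as simple loops. This sketch assumes that stepping stops as soon as `required` is no longer on the wrong side of `available`, so equality is accepted; the names carry a `_sketch` suffix to mark them as illustrative.

```cpp
#include <cassert>

// Step 'required' downwards until it no longer exceeds 'available'.
inline int adjust_down_sketch(int required, int available, int step)
{
    assert(step > 0);
    while (required > available)
    {
        required -= step;
    }
    return required;
}

// Step 'required' upwards until it is no longer below 'available'.
inline int adjust_up_sketch(int required, int available, int step)
{
    assert(step > 0);
    while (required < available)
    {
        required += step;
    }
    return required;
}
```

A closed-form expression using integer division gives the same results without a loop, which may be preferable when the gap is large.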

int32x4_t	rounding_divide_by_pow2 (int32x4_t x, int exponent)
	Round to the nearest division by a power-of-two using exponent. More...
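
A scalar sketch of a rounding division by a power of two is given below for a single lane. It assumes that ties are rounded away from zero, which the description above does not state; `rounding_divide_by_pow2_scalar` is a hypothetical name and not the NEON implementation.

```cpp
#include <cassert>
#include <cstdint>

// Divide x by 2^exponent, rounding to nearest with ties away from zero (assumption).
inline std::int32_t rounding_divide_by_pow2_scalar(std::int32_t x, int exponent)
{
    assert(exponent >= 0 && exponent < 31);
    if (exponent == 0)
    {
        return x;
    }
    const std::int64_t divisor = std::int64_t{1} << exponent;
    const std::int64_t half    = divisor / 2;
    const std::int64_t wide    = x;
    // Integer division truncates toward zero, so bias by half in the direction of the sign.
    const std::int64_t biased = (wide >= 0) ? (wide + half) : (wide - half);
    return static_cast<std::int32_t>(biased / divisor);
}
```

The widening to 64 bits avoids overflow when the bias is added to values near the limits of `int32_t`.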

uint8x16_t	vmlaq_qasymm8 (qasymm8x16_t vd, float32x4_t vs, float32x4_t vo)
	Perform a multiply-accumulate on all 16 components of a QASYMM8 vector. More...

template<bool is_bounded_relu>
uint8x16_t	finalize_quantization (int32x4x4_t &in_s32, int result_fixedpoint_multiplier, int32_t result_shift, int32x4_t result_offset_after_shift_s32, uint8x16_t min_u8, uint8x16_t max_u8)
	Performs final quantization step on 16 elements. More...

void	colorconvert_rgb_to_rgbx (const void *__restrict input, void *__restrict output, const Window &win)
	Convert RGB to RGBX. More...

void	colorconvert_rgbx_to_rgb (const void *input, void *output, const Window &win)
	Convert RGBX to RGB. More...

template<bool yuyv, bool alpha>
void	colorconvert_yuyv_to_rgb (const void *__restrict input, void *__restrict output, const Window &win)
	Convert YUYV to RGB. More...

template<bool uv, bool alpha>
void	colorconvert_nv12_to_rgb (const void *__restrict input, void *__restrict output, const Window &win)
	Convert NV12 to RGB. More...

template<bool alpha>
void	colorconvert_iyuv_to_rgb (const void *__restrict input, void *__restrict output, const Window &win)
	Convert IYUV to RGB. More...

template<bool yuyv>
void	colorconvert_yuyv_to_nv12 (const void *__restrict input, void *__restrict output, const Window &win)
	Convert YUYV to NV12. More...

void	colorconvert_iyuv_to_nv12 (const void *__restrict input, void *__restrict output, const Window &win)
	Convert IYUV to NV12. More...

template<bool uv>
void	colorconvert_nv12_to_iyuv (const void *__restrict input, void *__restrict output, const Window &win)
	Convert NV12 to IYUV. More...

template<bool yuyv>
void	colorconvert_yuyv_to_iyuv (const void *__restrict input, void *__restrict output, const Window &win)
	Convert YUYV to IYUV. More...

template<bool uv>
void	colorconvert_nv12_to_yuv4 (const void *__restrict input, void *__restrict output, const Window &win)
	Convert NV12 to YUV4. More...

void	colorconvert_iyuv_to_yuv4 (const void *__restrict input, void *__restrict output, const Window &win)
	Convert IYUV to YUV4. More...

template<bool alpha>
void	colorconvert_rgb_to_nv12 (const void *__restrict input, void *__restrict output, const Window &win)
	Convert RGB to NV12. More...

template<bool alpha>
void	colorconvert_rgb_to_iyuv (const void *__restrict input, void *__restrict output, const Window &win)
	Convert RGB to IYUV. More...

template<bool alpha>
void	colorconvert_rgb_to_yuv4 (const void *__restrict input, void *__restrict output, const Window &win)
	Convert RGB to YUV4. More...

qint8x8_t	vget_low_qs8 (qint8x16_t a)
	Get the lower half of a 16 elements vector. More...

qint16x4_t	vget_low_qs16 (qint16x8_t a)
	Get the lower half of an 8-element vector. More...

qint8x8_t	vget_high_qs8 (qint8x16_t a)
	Get the higher half of a 16 elements vector. More...

qint16x4_t	vget_high_qs16 (qint16x8_t a)
	Get the higher half of an 8-element vector. More...

qint8x8_t	vld1_qs8 (const qint8_t *addr)
	Load a single 8 bit fixed point vector from memory (8 elements) More...

qint16x4_t	vld1_qs16 (const qint16_t *addr)
	Load a single 16 bit fixed point vector from memory (4 elements) More...

qint8x16_t	vld1q_qs8 (const qint8_t *addr)
	Load a single 8 bit fixed point vector from memory (16 elements) More...

qint16x8_t	vld1q_qs16 (const qint16_t *addr)
	Load a single 16 bit fixed point vector from memory (8 elements) More...

qint8x8_t	vld1_dup_qs8 (const qint8_t *addr)
	Load all lanes of 8 bit fixed point vector with same value from memory (8 elements) More...

qint16x4_t	vld1_dup_qs16 (const qint16_t *addr)
	Load all lanes of 16 bit fixed point vector with same value from memory (4 elements) More...

qint8x16_t	vld1q_dup_qs8 (const qint8_t *addr)
	Load all lanes of 8 bit fixed point vector with same value from memory (16 elements) More...

qint16x8_t	vld1q_dup_qs16 (const qint16_t *addr)
	Load all lanes of 16 bit fixed point vector with same value from memory (8 elements) More...

qint16x8x2_t	vld2q_qs16 (qint16_t *addr)
	Load two 16 bit fixed point vectors from memory (8x2 elements) More...

void	vst1_qs8 (qint8_t *addr, qint8x8_t b)
	Store a single 8 bit fixed point vector to memory (8 elements) More...

void	vst1_qs16 (qint16_t *addr, qint16x4_t b)
	Store a single 16 bit fixed point vector to memory (4 elements) More...

void	vst1q_qs8 (qint8_t *addr, qint8x16_t b)
	Store a single 8 bit fixed point vector to memory (16 elements) More...

void	vst1q_qs16 (qint16_t *addr, qint16x8_t b)
	Store a single 16 bit fixed point vector to memory (8 elements) More...

void	vst2q_qs16 (qint16_t *addr, qint16x8x2_t b)
	Store two 16 bit fixed point vectors to memory (8x2 elements) More...

qint8x8_t	vqmovn_q16 (qint16x8_t a)
	16 bit fixed point vector saturating narrow (8 elements) More...

qint16x4_t	vqmovn_q32 (qint32x4_t a)
	32 bit fixed point vector saturating narrow (4 elements) More...

qint8x8_t	vdup_n_qs8 (qint8_t a)
	8 bit fixed point vector duplicate (8 elements) More...

qint16x4_t	vdup_n_qs16 (qint16_t a)
	16 bit fixed point vector duplicate (4 elements) More...

qint8x16_t	vdupq_n_qs8 (qint8_t a)
	8 bit fixed point vector duplicate (16 elements) More...

qint8x16_t	vdupq_n_qs8_f32 (float a, int fixed_point_position)
	Duplicate a float and convert it to 8 bit fixed point vector (16 elements) More...

qint16x8_t	vdupq_n_qs16_f32 (float a, int fixed_point_position)
	Duplicate a float and convert it to 16 bit fixed point vector (8 elements) More...

qint16x8_t	vdupq_n_qs16 (qint16x8_t a)
	16 bit fixed point vector duplicate (8 elements) More...

qint8x8_t	vabs_qs8 (qint8x8_t a)
	Absolute value of 8 bit fixed point vector (8 elements) More...

qint16x4_t	vabs_qs16 (qint16x4_t a)
	Absolute value of 16 bit fixed point vector (4 elements) More...

qint8x16_t	vabsq_qs8 (qint8x16_t a)
	Absolute value of 8 bit fixed point vector (16 elements) More...

qint16x8_t	vabsq_qs16 (qint16x8_t a)
	Absolute value of 16 bit fixed point vector (8 elements) More...

qint8x8_t	vqabs_qs8 (qint8x8_t a)
	Saturating absolute value of 8 bit fixed point vector (8 elements) More...

qint16x4_t	vqabs_qs16 (qint16x4_t a)
	Saturating absolute value of 16 bit fixed point vector (4 elements) More...

qint8x16_t	vqabsq_qs8 (qint8x16_t a)
	Saturating absolute value of 8 bit fixed point vector (16 elements) More...

qint16x8_t	vqabsq_qs16 (qint16x8_t a)
	Saturating absolute value of 16 bit fixed point vector (8 elements) More...

qint8x8_t	vmax_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector max (8 elements) More...

qint16x4_t	vmax_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector max (4 elements) More...

qint8x16_t	vmaxq_qs8 (qint8x16_t a, qint8x16_t b)
	8 bit fixed point vector max (16 elements) More...

qint16x8_t	vmaxq_qs16 (qint16x8_t a, qint16x8_t b)
	16 bit fixed point vector max (8 elements) More...

qint8x8_t	vpmax_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector pairwise max (8 elements) More...

qint16x4_t	vpmax_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector pairwise max (4 elements) More...

qint8x8_t	vmin_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector min (8 elements) More...

qint16x4_t	vmin_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector min (4 elements) More...

qint8x16_t	vminq_qs8 (qint8x16_t a, qint8x16_t b)
	8 bit fixed point vector min (16 elements) More...

qint16x8_t	vminq_qs16 (qint16x8_t a, qint16x8_t b)
	16 bit fixed point vector min (8 elements) More...

qint8x8_t	vpmin_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector pairwise min (8 elements) More...

qint16x4_t	vpmin_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector pairwise min (4 elements) More...

qint8x8_t	vadd_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector add (8 elements) More...

qint16x4_t	vadd_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector add (4 elements) More...

qint8x16_t	vaddq_qs8 (qint8x16_t a, qint8x16_t b)
	8 bit fixed point vector add (16 elements) More...

qint16x8_t	vaddq_qs16 (qint16x8_t a, qint16x8_t b)
	16 bit fixed point vector add (8 elements) More...

qint8x8_t	vqadd_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector saturating add (8 elements) More...

qint16x4_t	vqadd_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector saturating add (4 elements) More...

qint8x16_t	vqaddq_qs8 (qint8x16_t a, qint8x16_t b)
	8 bit fixed point vector saturating add (16 elements) More...

qint16x8_t	vqaddq_qs16 (qint16x8_t a, qint16x8_t b)
	16 bit fixed point vector saturating add (8 elements) More...
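
The plain and saturating add entries above differ only in how overflow is handled. A minimal scalar sketch of that difference for one 8 bit lane follows; `sat_add_q8` and `wrap_add_q8` are hypothetical helper names used for illustration and are not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of one lane of a saturating 8 bit add:
// the sum is formed in a wider type and clamped to the int8_t range.
inline int8_t sat_add_q8(int8_t a, int8_t b)
{
    const int lo  = std::numeric_limits<int8_t>::min();
    const int hi  = std::numeric_limits<int8_t>::max();
    const int sum = static_cast<int>(a) + static_cast<int>(b);
    return static_cast<int8_t>(sum < lo ? lo : (sum > hi ? hi : sum));
}

// Hypothetical scalar model of one lane of a non-saturating add:
// the sum wraps around modulo 256 (two's complement assumed).
inline int8_t wrap_add_q8(int8_t a, int8_t b)
{
    const uint8_t sum = static_cast<uint8_t>(static_cast<uint8_t>(a) + static_cast<uint8_t>(b));
    return static_cast<int8_t>(sum);
}
```

Because both operands share the same fixed point position, the add itself needs no rescaling; only the overflow policy changes.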

int16x4_t	vpaddl_qs8 (qint8x8_t a)
	8 bit fixed point vector saturating pairwise add (8 elements) More...

qint8x8_t	vsub_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector subtraction (8 elements) More...

qint16x4_t	vsub_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector subtraction (4 elements) More...

qint8x16_t	vsubq_qs8 (qint8x16_t a, qint8x16_t b)
	8 bit fixed point vector subtraction (16 elements) More...

qint16x8_t	vsubq_qs16 (qint16x8_t a, qint16x8_t b)
	16 bit fixed point vector subtraction (8 elements) More...

qint8x8_t	vqsub_qs8 (qint8x8_t a, qint8x8_t b)
	8 bit fixed point vector saturating subtraction (8 elements) More...

qint16x4_t	vqsub_qs16 (qint16x4_t a, qint16x4_t b)
	16 bit fixed point vector saturating subtraction (4 elements) More...

qint8x16_t	vqsubq_qs8 (qint8x16_t a, qint8x16_t b)
	8 bit fixed point vector saturating subtraction (16 elements) More...

qint16x8_t	vqsubq_qs16 (qint16x8_t a, qint16x8_t b)
	16 bit fixed point vector saturating subtraction (8 elements) More...

qint8x8_t	vmul_qs8 (qint8x8_t a, qint8x8_t b, int fixed_point_position)
	8 bit fixed point vector multiply (8 elements) More...

qint16x4_t	vmul_qs16 (qint16x4_t a, qint16x4_t b, int fixed_point_position)
	16 bit fixed point vector multiply (4 elements) More...

qint8x16_t	vmulq_qs8 (qint8x16_t a, qint8x16_t b, int fixed_point_position)
	8 bit fixed point vector multiply (16 elements) More...

qint16x8_t	vmulq_qs16 (qint16x8_t a, qint16x8_t b, int fixed_point_position)
	16 bit fixed point vector multiply (8 elements) More...

qint8x8_t	vqmul_qs8 (qint8x8_t a, qint8x8_t b, int fixed_point_position)
	8 bit fixed point vector saturating multiply (8 elements) More...

qint16x4_t	vqmul_qs16 (qint16x4_t a, qint16x4_t b, int fixed_point_position)
	16 bit fixed point vector saturating multiply (4 elements) More...

qint8x16_t	vqmulq_qs8 (qint8x16_t a, qint8x16_t b, int fixed_point_position)
	8 bit fixed point vector saturating multiply (16 elements) More...

qint16x8_t	vqmulq_qs16 (qint16x8_t a, qint16x8_t b, int fixed_point_position)
	16 bit fixed point vector saturating multiply (8 elements) More...
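
The multiply entries above take a `fixed_point_position` argument because the product of two fixed point values carries twice as many fractional bits and has to be shifted back. The scalar sketch below shows one way to do this for a single 8 bit lane; the helper name `sat_mul_q8` and the round-to-nearest step are assumptions for illustration, not a statement of how the library implements it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a saturating 8 bit fixed point multiply.
// Assumes round-to-nearest before the shift and an arithmetic right shift
// for negative values (true on mainstream compilers).
inline int8_t sat_mul_q8(int8_t a, int8_t b, int fixed_point_position)
{
    assert(fixed_point_position > 0 && fixed_point_position < 8);
    const int rounding = 1 << (fixed_point_position - 1);
    const int product  = static_cast<int>(a) * static_cast<int>(b) + rounding;
    const int shifted  = product >> fixed_point_position;
    // Clamp to the representable range instead of wrapping.
    const int clamped = shifted < -128 ? -128 : (shifted > 127 ? 127 : shifted);
    return static_cast<int8_t>(clamped);
}
```

With `fixed_point_position = 4`, the value 1.5 is stored as 24 and 2.0 as 32, so the sketch returns 48, which represents 3.0.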

qint16x8_t	vmull_qs8 (qint8x8_t a, qint8x8_t b, int fixed_point_position)
	8 bit fixed point vector long multiply (8 elements) More...

qint32x4_t	vmull_qs16 (qint16x4_t a, qint16x4_t b, int fixed_point_position)
	16 bit fixed point vector long multiply (4 elements) More...

qint8x8_t	vmla_qs8 (qint8x8_t a, qint8x8_t b, qint8x8_t c, int fixed_point_position)
	8 bit fixed point vector multiply-accumulate (8 elements). More...

qint16x4_t	vmla_qs16 (qint16x4_t a, qint16x4_t b, qint16x4_t c, int fixed_point_position)
	16 bit fixed point vector multiply-accumulate (4 elements). More...

qint8x16_t	vmlaq_qs8 (qint8x16_t a, qint8x16_t b, qint8x16_t c, int fixed_point_position)
	8 bit fixed point vector multiply-accumulate (16 elements). More...

qint16x8_t	vmlaq_qs16 (qint16x8_t a, qint16x8_t b, qint16x8_t c, int fixed_point_position)
	16 bit fixed point vector multiply-accumulate (8 elements). More...

qint8x8_t	vqmla_qs8 (qint8x8_t a, qint8x8_t b, qint8x8_t c, int fixed_point_position)
	8 bit fixed point vector saturating multiply-accumulate (8 elements). More...

qint16x4_t	vqmla_qs16 (qint16x4_t a, qint16x4_t b, qint16x4_t c, int fixed_point_position)
	16 bit fixed point vector saturating multiply-accumulate (4 elements). More...

qint8x16_t	vqmlaq_qs8 (qint8x16_t a, qint8x16_t b, qint8x16_t c, int fixed_point_position)
	8 bit fixed point vector saturating multiply-accumulate (16 elements). More...

qint16x8_t	vqmlaq_qs16 (qint16x8_t a, qint16x8_t b, qint16x8_t c, int fixed_point_position)
	16 bit fixed point vector saturating multiply-accumulate (8 elements). More...

qint16x8_t	vmlal_qs8 (qint16x8_t a, qint8x8_t b, qint8x8_t c, int fixed_point_position)
	8 bit fixed point vector multiply-accumulate long (8 elements). More...

qint32x4_t	vmlal_qs16 (qint32x4_t a, qint16x4_t b, qint16x4_t c, int fixed_point_position)
	16 bit fixed point vector multiply-accumulate long (4 elements). More...

qint16x8_t	vqmlal_qs8 (qint16x8_t a, qint8x8_t b, qint8x8_t c, int fixed_point_position)
	8 bit fixed point vector saturating multiply-accumulate long (8 elements). More...

qint32x4_t	vqmlal_qs16 (qint32x4_t a, qint16x4_t b, qint16x4_t c, int fixed_point_position)
	16 bit fixed point vector saturating multiply-accumulate long (4 elements). More...

qint8x8_t	vqcvt_qs8_f32 (const float32x4x2_t a, int fixed_point_position)
	Convert a float vector with 4x2 elements to 8 bit fixed point vector with 8 elements. More...

qint16x4_t	vqcvt_qs16_f32 (const float32x4_t a, int fixed_point_position)
	Convert a float vector with 4 elements to 16 bit fixed point vector with 4 elements. More...

qint8x16_t	vqcvtq_qs8_f32 (const float32x4x4_t &a, int fixed_point_position)
	Convert a float vector with 4x4 elements to 8 bit fixed point vector with 16 elements. More...

qint16x8_t	vqcvtq_qs16_f32 (const float32x4x2_t &a, int fixed_point_position)
	Convert a float vector with 4x2 elements to 16 bit fixed point vector with 8 elements. More...

float32x4x2_t	vcvt_f32_qs8 (qint8x8_t a, int fixed_point_position)
	Convert an 8 bit fixed point vector with 8 elements to a float vector with 4x2 elements. More...

float32x4_t	vcvt_f32_qs16 (qint16x4_t a, int fixed_point_position)
	Convert a 16 bit fixed point vector with 4 elements to a float vector with 4 elements. More...

float32x4x4_t	vcvtq_qs8_f32 (qint8x16_t a, int fixed_point_position)
	Convert an 8 bit fixed point vector with 16 elements to a float vector with 4x4 elements. More...

float32x4x2_t	vcvtq_qs16_f32 (qint16x8_t a, int fixed_point_position)
	Convert a 16 bit fixed point vector with 8 elements to a float vector with 4x2 elements. More...
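
The conversion entries above map between floats and fixed point values whose binary point sits at `fixed_point_position`. A minimal scalar sketch of that mapping follows; `float_to_q8` and `q8_to_float` are hypothetical names, and the round-to-nearest behaviour is an assumption rather than a documented property of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar model: float -> 8 bit fixed point with saturation.
// The value is scaled by 2^fixed_point_position, rounded, then clamped.
inline int8_t float_to_q8(float value, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    const float scale   = static_cast<float>(1 << fixed_point_position);
    const float scaled  = std::round(value * scale);
    const float clamped = scaled < -128.0f ? -128.0f : (scaled > 127.0f ? 127.0f : scaled);
    return static_cast<int8_t>(clamped);
}

// Hypothetical scalar model: 8 bit fixed point -> float.
inline float q8_to_float(int8_t q, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    return static_cast<float>(q) / static_cast<float>(1 << fixed_point_position);
}
```

Values outside the representable range saturate, which is what the `q` in the `vqcvt` names indicates.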

qint8x8_t	vrecip_qs8 (qint8x8_t a, int fixed_point_position)
	Calculate reciprocal of a fixed point 8bit number using the Newton-Raphson method. More...

qint16x4_t	vrecip_qs16 (qint16x4_t a, int fixed_point_position)
	Calculate reciprocal of a fixed point 16 bit number using the Newton-Raphson method. More...

qint8x16_t	vrecipq_qs8 (qint8x16_t a, int fixed_point_position)
	Calculate reciprocal of a fixed point 8bit number using the Newton-Raphson method. More...

qint16x8_t	vrecipq_qs16 (qint16x8_t a, int fixed_point_position)
	Calculate reciprocal of a fixed point 16 bit number using the Newton-Raphson method. More...
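
The reciprocal entries above name the Newton-Raphson method. For 1/a the iteration is x_{n+1} = x_n * (2 - a * x_n), which needs only multiplies and subtractions. The sketch below shows it in plain float arithmetic; the function name, the caller-supplied initial guess and the iteration count are all assumptions for illustration, and the fixed point versions would perform the same steps with fixed point multiplies.

```cpp
#include <cassert>

// Hypothetical float sketch of the Newton-Raphson reciprocal iteration.
// Converges when the initial guess x0 lies in (0, 2/a) for a > 0.
inline float recip_newton(float a, float x0, int iterations)
{
    assert(a > 0.0f && iterations >= 0);
    float x = x0;
    for(int i = 0; i < iterations; ++i)
    {
        x = x * (2.0f - a * x); // error roughly squares on every step
    }
    return x;
}
```

Each step roughly doubles the number of correct bits, which is why a small, fixed number of iterations is usually enough for 8 or 16 bit results.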

qint8x8_t	vdiv_qs8 (qint8x8_t a, int8x8_t b, int fixed_point_position)
	Division fixed point 8bit (8 elements) More...

qint16x4_t	vdiv_qs16 (qint16x4_t a, qint16x4_t b, int fixed_point_position)
	Division fixed point 16 bit (4 elements) More...

qint8x16_t	vdivq_qs8 (qint8x16_t a, qint8x16_t b, int fixed_point_position)
	Division fixed point 8bit (16 elements) More...

qint16x8_t	vdivq_qs16 (qint16x8_t a, qint16x8_t b, int fixed_point_position)
	Division fixed point 16 bit (8 elements) More...

template<bool islog>
qint8x8_t	vtaylor_poly_qs8 (qint8x8_t a, int fixed_point_position)
	Perform a 4th degree polynomial approximation. More...

template<bool islog>
qint16x4_t	vtaylor_poly_qs16 (qint16x4_t a, int fixed_point_position)
	Perform a 4th degree polynomial approximation. More...

template<bool islog>
qint8x16_t	vtaylor_polyq_qs8 (qint8x16_t a, int fixed_point_position)
	Perform a 4th degree polynomial approximation. More...

template<bool islog>
qint16x8_t	vtaylor_polyq_qs16 (qint16x8_t a, int fixed_point_position)
	Perform a 4th degree polynomial approximation. More...

qint8x8_t	vqexp_qs8 (qint8x8_t a, int fixed_point_position)
	Calculate saturating exponential fixed point 8bit (8 elements) More...

qint16x4_t	vqexp_qs16 (qint16x4_t a, int fixed_point_position)
	Calculate saturating exponential fixed point 16 bit (4 elements) More...

qint8x16_t	vqexpq_qs8 (qint8x16_t a, int fixed_point_position)
	Calculate saturating exponential fixed point 8bit (16 elements) More...

qint16x8_t	vqexpq_qs16 (qint16x8_t a, int fixed_point_position)
	Calculate saturating exponential fixed point 16 bit (8 elements) More...

qint8x8_t	vlog_qs8 (qint8x8_t a, int fixed_point_position)
	Calculate logarithm fixed point 8 bit (8 elements) More...

qint16x4_t	vlog_qs16 (qint16x4_t a, int fixed_point_position)
	Calculate logarithm fixed point 16 bit (4 elements) More...

qint8x16_t	vlogq_qs8 (qint8x16_t a, int fixed_point_position)
	Calculate logarithm fixed point 8 bit (16 elements) More...

qint16x8_t	vlogq_qs16 (qint16x8_t a, int fixed_point_position)
	Calculate logarithm fixed point 16 bit (8 elements) More...

qint8x8_t	vinvsqrt_qs8 (qint8x8_t a, int fixed_point_position)
	Calculate inverse square root for fixed point 8bit using Newton-Raphson method (8 elements) More...

qint16x4_t	vinvsqrt_qs16 (qint16x4_t a, int fixed_point_position)
	Calculate inverse square root for fixed point 16 bit using Newton-Raphson method (4 elements) More...

qint8x8_t	vqinvsqrt_qs8 (qint8x8_t a, int fixed_point_position)
	Calculate saturating inverse square root for fixed point 8bit using Newton-Raphson method (8 elements) More...

qint16x4_t	vqinvsqrt_qs16 (qint16x4_t a, int fixed_point_position)
	Calculate saturating inverse square root for fixed point 16 bit using Newton-Raphson method (4 elements) More...

qint8x16_t	vinvsqrtq_qs8 (qint8x16_t a, int fixed_point_position)
	Calculate inverse square root for fixed point 8bit using Newton-Raphson method (16 elements) More...

qint16x8_t	vinvsqrtq_qs16 (qint16x8_t a, int fixed_point_position)
	Calculate inverse square root for fixed point 16 bit using Newton-Raphson method (8 elements) More...

qint8x16_t	vqinvsqrtq_qs8 (qint8x16_t a, int fixed_point_position)
	Calculate saturating inverse square root for fixed point 8bit using Newton-Raphson method (16 elements) More...

qint16x8_t	vqinvsqrtq_qs16 (qint16x8_t a, int fixed_point_position)
	Calculate saturating inverse square root for fixed point 16 bit using Newton-Raphson method (8 elements) More...
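
For the inverse square root entries above, the Newton-Raphson iteration for 1/sqrt(a) is x_{n+1} = x_n * (3 - a * x_n^2) / 2. A float sketch follows; the function name, the caller-supplied initial guess and the iteration count are assumptions for illustration only.

```cpp
#include <cassert>

// Hypothetical float sketch of the Newton-Raphson inverse square root
// iteration. The initial guess x0 must be positive and reasonably close
// to 1/sqrt(a) for the iteration to converge.
inline float invsqrt_newton(float a, float x0, int iterations)
{
    assert(a > 0.0f && x0 > 0.0f && iterations >= 0);
    float x = x0;
    for(int i = 0; i < iterations; ++i)
    {
        x = x * (1.5f - 0.5f * a * x * x);
    }
    return x;
}
```

The iteration avoids both division and square root, which makes it a natural fit for fixed point arithmetic.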

qint8x8_t	vqtanh_qs8 (qint8x8_t a, int fixed_point_position)
	Calculate hyperbolic tangent for fixed point 8bit (8 elements) More...

qint16x4_t	vqtanh_qs16 (qint16x4_t a, int fixed_point_position)
	Calculate hyperbolic tangent for fixed point 16 bit (4 elements) More...

qint8x16_t	vqtanhq_qs8 (qint8x16_t a, int fixed_point_position)
	Calculate hyperbolic tangent for fixed point 8bit (16 elements) More...

qint16x8_t	vqtanhq_qs16 (qint16x8_t a, int fixed_point_position)
	Calculate hyperbolic tangent for fixed point 16bit (8 elements) More...

qint8x16_t	vqpowq_qs8 (qint8x16_t a, qint8x16_t b, int fixed_point_position)
	Calculate saturating n power for fixed point 8bit (16 elements). More...

qint16x8_t	vqpowq_qs16 (qint16x8_t a, qint16x8_t b, int fixed_point_position)
	Calculate saturating n power for fixed point 16bit (8 elements). More...

float32x4x2_t	vmax2q_f32 (float32x4x2_t a, float32x4x2_t b)
	Compute lane-by-lane maximum between elements of a float vector with 4x2 elements. More...

float32x4_t	vfloorq_f32 (float32x4_t val)
	Calculate floor of a vector. More...

float32x2_t	vinvsqrt_f32 (float32x2_t x)
	Calculate inverse square root. More...

float32x4_t	vinvsqrtq_f32 (float32x4_t x)
	Calculate inverse square root. More...

float32x2_t	vinv_f32 (float32x2_t x)
	Calculate reciprocal. More...

float32x4_t	vinvq_f32 (float32x4_t x)
	Calculate reciprocal. More...

float32x4_t	vtaylor_polyq_f32 (float32x4_t x, const std::array< float32x4_t, 8 > &coeffs)
	Perform a 7th degree polynomial approximation using Estrin's method. More...
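
Estrin's method, named in the entry above, evaluates a polynomial by grouping terms into independent pairs so that the partial results can be computed in parallel, unlike Horner's rule where each step depends on the previous one. The scalar sketch below evaluates a 7th degree polynomial this way; the function name and the coefficient ordering (`c[i]` multiplies `x^i`) are assumptions for illustration and may differ from the ordering the library expects in `coeffs`.

```cpp
#include <array>
#include <cassert>

// Hypothetical scalar sketch of Estrin's scheme for
// p(x) = c[0] + c[1]*x + ... + c[7]*x^7.
inline float estrin_poly7(float x, const std::array<float, 8> &c)
{
    const float x2 = x * x;
    const float x4 = x2 * x2;
    // Four independent first-degree pieces.
    const float p01 = c[0] + c[1] * x;
    const float p23 = c[2] + c[3] * x;
    const float p45 = c[4] + c[5] * x;
    const float p67 = c[6] + c[7] * x;
    // Combine pairs with x^2, then the two halves with x^4.
    const float low  = p01 + p23 * x2;
    const float high = p45 + p67 * x2;
    return low + high * x4;
}
```

On SIMD hardware each of the pairwise pieces maps to one multiply-accumulate, and the short dependency chain keeps the pipeline busy.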

float32x4_t	vexpq_f32 (float32x4_t x)
	Calculate exponential. More...

float32x4_t	vlogq_f32 (float32x4_t x)
	Calculate logarithm. More...

float32x4_t	vtanhq_f32 (float32x4_t val)
	Calculate hyperbolic tangent. More...

float32x4_t	vpowq_f32 (float32x4_t val, float32x4_t n)
	Calculate n power of a number. More...

int	round (float x, RoundingPolicy rounding_policy)
	Return a rounded value of x. More...

template<typename S , typename T >
constexpr auto	DIV_CEIL (S val, T m) -> decltype((val+m-1)/m)
	Calculate the rounded up quotient of val / m. More...

template<typename S , typename T >
auto	ceil_to_multiple (S value, T divisor) -> decltype(((value+divisor-1)/divisor)*divisor)
	Computes the smallest number greater than or equal to value that is a multiple of divisor. More...

template<typename S , typename T >
auto	floor_to_multiple (S value, T divisor) -> decltype((value/divisor)*divisor)
	Computes the largest number less than or equal to value that is a multiple of divisor. More...
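
The trailing return types of the three helpers above already spell out their arithmetic. The sketch below restates those expressions as standalone functions with illustrative names (`div_ceil_sketch` and so on, chosen to avoid clashing with the library) plus a hypothetical `num_tiles` usage helper. The expressions assume non-negative values and a positive divisor.

```cpp
#include <cassert>

// Rounded-up quotient, mirroring the expression in DIV_CEIL's return type.
template <typename S, typename T>
constexpr auto div_ceil_sketch(S val, T m) -> decltype((val + m - 1) / m)
{
    return (val + m - 1) / m;
}

// Smallest multiple of divisor that is >= value.
template <typename S, typename T>
constexpr auto ceil_to_multiple_sketch(S value, T divisor) -> decltype(((value + divisor - 1) / divisor) * divisor)
{
    return ((value + divisor - 1) / divisor) * divisor;
}

// Largest multiple of divisor that is <= value.
template <typename S, typename T>
constexpr auto floor_to_multiple_sketch(S value, T divisor) -> decltype((value / divisor) * divisor)
{
    return (value / divisor) * divisor;
}

// Hypothetical usage: how many tiles of a given size cover a length.
inline unsigned int num_tiles(unsigned int length, unsigned int tile)
{
    assert(tile > 0);
    return div_ceil_sketch(length, tile);
}
```

Helpers of this kind are typically used to size windows and padded buffers so that a kernel can process a whole number of vector-width steps.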

std::string	build_information ()
	Returns the arm_compute library build information. More...

std::string	read_file (const std::string &filename, bool binary)
	Load an entire file in memory. More...

size_t	data_size_from_type (DataType data_type)
	The size in bytes of the data type. More...

size_t	pixel_size_from_format (Format format)
	The size in bytes of the pixel format. More...

size_t	element_size_from_data_type (DataType dt)
	The size in bytes of the data type. More...

DataType	data_type_from_format (Format format)
	Return the data type used by a given single-planar pixel format. More...

int	plane_idx_from_channel (Format format, Channel channel)
	Return the plane index of a given channel given an input format. More...

int	channel_idx_from_format (Format format, Channel channel)
	Return the channel index of a given channel given an input format. More...

size_t	num_planes_from_format (Format format)
	Return the number of planes for a given format. More...

size_t	num_channels_from_format (Format format)
	Return the number of channels for a given single-planar pixel format. More...

DataType	get_promoted_data_type (DataType dt)
	Return the promoted data type of a given data type. More...

bool	has_format_horizontal_subsampling (Format format)
	Return true if the given format has horizontal subsampling. More...

bool	has_format_vertical_subsampling (Format format)
	Return true if the given format has vertical subsampling. More...

bool	separate_matrix (const int16_t *conv, int16_t *conv_col, int16_t *conv_row, uint8_t size)
	Separate a 2D convolution into two 1D convolutions. More...

uint32_t	calculate_matrix_scale (const int16_t *matrix, unsigned int matrix_size)
	Calculate the scale of the given square matrix. More...

template<typename T >
TensorShape	calculate_depth_concatenate_shape (const std::vector< T * > &inputs_vector)
	Calculate the output shapes of the depth concatenate function. More...

TensorShape	adjust_odd_shape (const TensorShape &shape, Format format)
	Adjust tensor shape size if width or height are odd for a given multi-planar format. More...

TensorShape	calculate_subsampled_shape (const TensorShape &shape, Format format, Channel channel=Channel::UNKNOWN)
	Calculate subsampled shape for a given format and channel. More...

std::pair< DataType, DataType >	data_type_for_convolution (const int16_t *conv_col, const int16_t *conv_row, size_t size)
	Calculate the accuracy required by the horizontal and vertical convolution computations. More...

DataType	data_type_for_convolution_matrix (const int16_t *conv, size_t size)
	Calculate the accuracy required by the squared convolution calculation. More...

PadStrideInfo	calculate_same_pad (TensorShape input_shape, TensorShape weights_shape, PadStrideInfo conv_info)
	Calculate padding requirements in case of SAME padding. More...

TensorShape	deconvolution_output_shape (const std::pair< unsigned int, unsigned int > &out_dims, TensorShape input, TensorShape weights)
	Returns expected shape for the deconvolution output tensor. More...

const std::pair< unsigned int, unsigned int >	deconvolution_output_dimensions (unsigned int in_width, unsigned int in_height, unsigned int kernel_width, unsigned int kernel_height, unsigned int padx, unsigned int pady, unsigned int inner_border_right, unsigned int inner_border_top, unsigned int stride_x, unsigned int stride_y)
	Returns expected width and height of the deconvolution's output tensor. More...

const std::pair< unsigned int, unsigned int >	scaled_dimensions (unsigned int width, unsigned int height, unsigned int kernel_width, unsigned int kernel_height, const PadStrideInfo &pad_stride_info, const Size2D &dilation=Size2D(1U, 1U))
	Returns expected width and height of output scaled tensor depending on dimensions rounding mode. More...
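
The output size described above follows the usual convolution arithmetic: the padded input minus the (dilated) kernel extent is divided by the stride, rounded down or up, and one is added. The sketch below computes this for a single dimension; the `Rounding` enum, the function name and the flat parameter list are hypothetical stand-ins for the information the library carries in `PadStrideInfo` and `Size2D`.

```cpp
#include <cassert>

// Hypothetical rounding selector standing in for the dimensions rounding mode.
enum class Rounding
{
    Floor,
    Ceil
};

// Hypothetical one-dimensional sketch of the scaled output size.
inline unsigned int scaled_dimension(unsigned int in, unsigned int kernel,
                                     unsigned int pad_before, unsigned int pad_after,
                                     unsigned int stride, unsigned int dilation, Rounding rounding)
{
    assert(stride > 0 && kernel > 0 && dilation > 0);
    // A dilated kernel covers dilation * (kernel - 1) + 1 input elements.
    const unsigned int effective_kernel = dilation * (kernel - 1) + 1;
    const unsigned int padded           = in + pad_before + pad_after;
    assert(padded >= effective_kernel);
    const unsigned int span  = padded - effective_kernel;
    const unsigned int steps = (rounding == Rounding::Floor) ? span / stride : (span + stride - 1) / stride;
    return steps + 1;
}
```

For a 224 wide input, a 7 wide kernel, padding of 3 on each side and stride 2, floor rounding gives 112 and ceil rounding gives 113.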

const std::string &	string_from_format (Format format)
	Convert a tensor format into a string. More...

const std::string &	string_from_channel (Channel channel)
	Convert a channel identity into a string. More...

const std::string &	string_from_data_layout (DataLayout dl)
	Convert a data layout identity into a string. More...

const std::string &	string_from_data_type (DataType dt)
	Convert a data type identity into a string. More...

const std::string &	string_from_matrix_pattern (MatrixPattern pattern)
	Convert a matrix pattern into a string. More...

const std::string &	string_from_activation_func (ActivationLayerInfo::ActivationFunction act)
	Translates a given activation function to a string. More...

const std::string &	string_from_non_linear_filter_function (NonLinearFilterFunction function)
	Translates a given non linear function to a string. More...

const std::string &	string_from_interpolation_policy (InterpolationPolicy policy)
	Translates a given interpolation policy to a string. More...

const std::string &	string_from_border_mode (BorderMode border_mode)
	Translates a given border mode policy to a string. More...

const std::string &	string_from_norm_type (NormType type)
	Translates a given normalization type to a string. More...

const std::string &	string_from_pooling_type (PoolingType type)
	Translates a given pooling type to a string. More...

std::string	lower_string (const std::string &val)
	Convert a given string to lower case. More...

bool	is_data_type_float (DataType dt)
	Check if a given data type is of floating point type. More...

bool	is_data_type_quantized (DataType dt)
	Check if a given data type is of quantized type. More...

bool	is_data_type_fixed_point (DataType dt)
	Check if a given data type is of fixed point type. More...

bool	is_data_type_quantized_asymmetric (DataType dt)
	Check if a given data type is of asymmetric quantized type. More...

std::string	float_to_string_with_full_precision (float val)
	Create a string with the float in full precision. More...

template<typename T >
void	print_consecutive_elements_impl (std::ostream &s, const T *ptr, unsigned int n, int stream_width=0, const std::string &element_delim=" ")
	Print consecutive elements to an output stream. More...

template<typename T >
int	max_consecutive_elements_display_width_impl (std::ostream &s, const T *ptr, unsigned int n)
	Identify the maximum width of n consecutive elements. More...

void	print_consecutive_elements (std::ostream &s, DataType dt, const uint8_t *ptr, unsigned int n, int stream_width, const std::string &element_delim=" ")
	Print consecutive elements to an output stream. More...

int	max_consecutive_elements_display_width (std::ostream &s, DataType dt, const uint8_t *ptr, unsigned int n)
	Identify the maximum width of n consecutive elements. More...

template<typename... Ts>
arm_compute::Status	error_on_nullptr (const char *function, const char *file, const int line, Ts &&...pointers)
	Create an error if one of the pointers is a nullptr. More...

arm_compute::Status	error_on_mismatching_windows (const char *function, const char *file, const int line, const Window &full, const Window &win)
	Return an error if the passed window is invalid. More...

arm_compute::Status	error_on_invalid_subwindow (const char *function, const char *file, const int line, const Window &full, const Window &sub)
	Return an error if the passed subwindow is invalid. More...

arm_compute::Status	error_on_window_not_collapsable_at_dimension (const char *function, const char *file, const int line, const Window &full, const Window &window, const int dim)
	Return an error if the window can't be collapsed at the given dimension. More...

arm_compute::Status	error_on_coordinates_dimensions_gte (const char *function, const char *file, const int line, const Coordinates &pos, unsigned int max_dim)
	Return an error if the passed coordinates have too many dimensions. More...

arm_compute::Status	error_on_window_dimensions_gte (const char *function, const char *file, const int line, const Window &win, unsigned int max_dim)
	Return an error if the passed window has too many dimensions. More...

template<typename T , typename... Ts>
arm_compute::Status	error_on_mismatching_dimensions (const char *function, const char *file, int line, const Dimensions< T > &dim1, const Dimensions< T > &dim2, Ts &&...dims)
	Return an error if the passed dimension objects differ. More...

template<typename... Ts>
arm_compute::Status	error_on_tensors_not_even (const char *function, const char *file, int line, const Format &format, const ITensor *tensor1, Ts...tensors)
	Return an error if the passed tensor objects are not even. More...

template<typename... Ts>
arm_compute::Status	error_on_tensors_not_subsampled (const char *function, const char *file, int line, const Format &format, const TensorShape &shape, const ITensor *tensor1, Ts...tensors)
	Return an error if the passed tensor objects are not sub-sampled. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_shapes (const char *function, const char *file, const int line, const ITensorInfo *tensor_info_1, const ITensorInfo *tensor_info_2, Ts...tensor_infos)
	Return an error if the passed two tensor infos have different shapes from the given dimension. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_shapes (const char *function, const char *file, const int line, const ITensor *tensor_1, const ITensor *tensor_2, Ts...tensors)
	Return an error if the passed two tensors have different shapes from the given dimension. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_shapes (const char *function, const char *file, const int line, unsigned int upper_dim, const ITensorInfo *tensor_info_1, const ITensorInfo *tensor_info_2, Ts...tensor_infos)
	Return an error if the passed two tensors have different shapes from the given dimension. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_shapes (const char *function, const char *file, const int line, unsigned int upper_dim, const ITensor *tensor_1, const ITensor *tensor_2, Ts...tensors)
	Return an error if the passed two tensors have different shapes from the given dimension. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_data_layouts (const char *function, const char *file, const int line, const ITensorInfo *tensor_info, Ts...tensor_infos)
	Return an error if the passed tensor infos have different data layouts. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_data_layouts (const char *function, const char *file, const int line, const ITensor *tensor, Ts...tensors)
	Return an error if the passed tensors have different data layouts. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_data_types (const char *function, const char *file, const int line, const ITensorInfo *tensor_info, Ts...tensor_infos)
	Return an error if the passed two tensor infos have different data types. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_data_types (const char *function, const char *file, const int line, const ITensor *tensor, Ts...tensors)
	Return an error if the passed two tensors have different data types. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_fixed_point (const char *function, const char *file, const int line, const ITensorInfo *tensor_info_1, const ITensorInfo *tensor_info_2, Ts...tensor_infos)
	Return an error if the passed tensor infos have different fixed point data types or different fixed point positions. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_fixed_point (const char *function, const char *file, const int line, const ITensor *tensor_1, const ITensor *tensor_2, Ts...tensors)
	Return an error if the passed tensors have different fixed point data types or different fixed point positions. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_quantization_info (const char *function, const char *file, const int line, const ITensorInfo *tensor_info_1, const ITensorInfo *tensor_info_2, Ts...tensor_infos)
	Return an error if the passed tensor infos have different asymmetric quantized data types or different quantization info. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_quantization_info (const char *function, const char *file, const int line, const ITensor *tensor_1, const ITensor *tensor_2, Ts...tensors)
	Return an error if the passed tensors have different asymmetric quantized data types or different quantization info. More...

template<typename T , typename F , typename... Fs>
void	error_on_format_not_in (const char *function, const char *file, const int line, const T *object, F &&format, Fs &&...formats)
	Throw an error if the format of the passed tensor/multi-image does not match any of the formats provided. More...

template<typename T , typename... Ts>
arm_compute::Status	error_on_data_type_not_in (const char *function, const char *file, const int line, const ITensorInfo *tensor_info, T &&dt, Ts &&...dts)
	Return an error if the data type of the passed tensor info does not match any of the data types provided. More...

template<typename T , typename... Ts>
arm_compute::Status	error_on_data_type_not_in (const char *function, const char *file, const int line, const ITensor *tensor, T &&dt, Ts &&...dts)
	Return an error if the data type of the passed tensor does not match any of the data types provided. More...

template<typename T , typename... Ts>
arm_compute::Status	error_on_data_type_channel_not_in (const char *function, const char *file, const int line, const ITensorInfo *tensor_info, size_t num_channels, T &&dt, Ts &&...dts)
	Return an error if the data type or the number of channels of the passed tensor info does not match any of the data types and number of channels provided. More...

template<typename T , typename... Ts>
arm_compute::Status	error_on_data_type_channel_not_in (const char *function, const char *file, const int line, const ITensor *tensor, size_t num_channels, T &&dt, Ts &&...dts)
	Return an error if the data type or the number of channels of the passed tensor does not match any of the data types and number of channels provided. More...

arm_compute::Status	error_on_tensor_not_2d (const char *function, const char *file, const int line, const ITensor *tensor)
	Return an error if the tensor is not 2D. More...

template<typename T , typename... Ts>
arm_compute::Status	error_on_channel_not_in (const char *function, const char *file, const int line, T cn, T &&channel, Ts &&...channels)
	Return an error if the channel is not in channels. More...

arm_compute::Status	error_on_channel_not_in_known_format (const char *function, const char *file, const int line, Format fmt, Channel cn)
	Return an error if the channel is not in format. More...

arm_compute::Status	error_on_invalid_multi_hog (const char *function, const char *file, const int line, const IMultiHOG *multi_hog)
	Return an error if the IMultiHOG container is invalid. More...

arm_compute::Status	error_on_unconfigured_kernel (const char *function, const char *file, const int line, const IKernel *kernel)
	Return an error if the kernel is not configured. More...

arm_compute::Status	error_on_invalid_subtensor (const char *function, const char *file, const int line, const TensorShape &parent_shape, const Coordinates &coords, const TensorShape &shape)
	Return an error if the coordinates and shape of the subtensor are not within the parent tensor. More...

arm_compute::Status	error_on_invalid_subtensor_valid_region (const char *function, const char *file, const int line, const ValidRegion &parent_valid_region, const ValidRegion &valid_region)
	Return an error if the valid region of a subtensor is not inside the valid region of the parent tensor. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_fixed_point_position (const char *function, const char *file, const int line, const ITensorInfo *tensor_info_1, const ITensorInfo *tensor_info_2, Ts...tensor_infos)
	Return an error if the input fixed-point positions are different. More...

template<typename... Ts>
arm_compute::Status	error_on_mismatching_fixed_point_position (const char *function, const char *file, const int line, const ITensor *tensor_1, const ITensor *tensor_2, Ts...tensors)
	Return an error if the input fixed-point positions are different. More...

arm_compute::Status	error_on_value_not_representable_in_fixed_point (const char *function, const char *file, int line, float value, const ITensorInfo *tensor_info)
	Return an error if the fixed-point value is not representable in the specified Q format. More...

arm_compute::Status	error_on_value_not_representable_in_fixed_point (const char *function, const char *file, int line, float value, const ITensor *tensor)
	Return an error if the fixed-point value is not representable in the specified Q format. More...

void	get_cpu_configuration (CPUInfo &cpuinfo)
	This function will try to detect the CPU configuration on the system and will fill the cpuinfo object accordingly to reflect this. More...

unsigned int	get_threads_hint ()
	Some systems have both big and small cores; this function computes the minimum number of cores that are exactly the same on the system. More...

void	allocate_workspace (size_t workspace_size, Tensor &workspace, MemoryGroup *memory_group, size_t alignment, unsigned int num_threads)
	Allocate a workspace tensor. More...

template<typename T >
bool	setup_assembly_kernel (const ITensor *a, const ITensor *b, ITensor *d, float alpha, float beta, bool pretranspose_hint, Tensor &workspace, Tensor &B_pretranspose, MemoryGroup &memory_group, T &asm_glue)
	Create a wrapper kernel. More...

const std::string &	string_from_scheduler_type (Scheduler::Type t)
	Convert a Scheduler::Type into a string. More...

inline::std::istream &	operator>> (::std::istream &is, BorderMode &mode)
	Formatted input of the BorderMode type. More...

template<typename T >
inline::std::ostream &	operator<< (::std::ostream &os, const Dimensions< T > &dimensions)
	Formatted output of the Dimensions type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const NonLinearFilterFunction &function)
	Formatted output of the NonLinearFilterFunction type. More...

std::string	to_string (const NonLinearFilterFunction &function)
	Formatted output of the NonLinearFilterFunction type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const MatrixPattern &pattern)
	Formatted output of the MatrixPattern type. More...

std::string	to_string (const MatrixPattern &pattern)
	Formatted output of the MatrixPattern type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const RoundingPolicy &rounding_policy)
	Formatted output of the RoundingPolicy type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const WeightsInfo &weights_info)
	Formatted output of the WeightsInfo type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const ROIPoolingLayerInfo &pool_info)
	Formatted output of the ROIPoolingLayerInfo type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const QuantizationInfo &quantization_info)
	Formatted output of the QuantizationInfo type. More...

std::string	to_string (const QuantizationInfo &quantization_info)
	Formatted output of the QuantizationInfo type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const FixedPointOp &op)
	Formatted output of the FixedPointOp type. More...

std::string	to_string (const FixedPointOp &op)
	Formatted output of the FixedPointOp type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const ActivationLayerInfo::ActivationFunction &act_function)
	Formatted output of the activation function type. More...

std::string	to_string (const arm_compute::ActivationLayerInfo &info)
	Formatted output of the activation function info type. More...

std::string	to_string (const arm_compute::ActivationLayerInfo::ActivationFunction &function)
	Formatted output of the activation function type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const NormType &norm_type)
	Formatted output of the NormType type. More...

std::string	to_string (const arm_compute::NormalizationLayerInfo &info)
	Formatted output of NormalizationLayerInfo. More...

inline::std::ostream &	operator<< (::std::ostream &os, const NormalizationLayerInfo &info)
	Formatted output of NormalizationLayerInfo. More...

inline::std::ostream &	operator<< (::std::ostream &os, const PoolingType &pool_type)
	Formatted output of the PoolingType type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const PoolingLayerInfo &info)
	Formatted output of PoolingLayerInfo. More...

std::string	to_string (const RoundingPolicy &rounding_policy)
	Formatted output of RoundingPolicy. More...

inline::std::ostream &	operator<< (::std::ostream &os, const DataLayout &data_layout)
	Formatted output of the DataLayout type. More...

std::string	to_string (const arm_compute::DataLayout &data_layout)
	Formatted output of the DataLayout type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const DataType &data_type)
	Formatted output of the DataType type. More...

std::string	to_string (const arm_compute::DataType &data_type)
	Formatted output of the DataType type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const Format &format)
	Formatted output of the Format type. More...

std::string	to_string (const Format &format)
	Formatted output of the Format type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const Channel &channel)
	Formatted output of the Channel type. More...

std::string	to_string (const Channel &channel)
	Formatted output of the Channel type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const BorderMode &mode)
	Formatted output of the BorderMode type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const BorderSize &border)
	Formatted output of the BorderSize type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const InterpolationPolicy &policy)
	Formatted output of the InterpolationPolicy type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const SamplingPolicy &policy)
	Formatted output of the SamplingPolicy type. More...

std::string	to_string (const TensorInfo &info)
	Formatted output of the TensorInfo type. More...

template<typename T >
std::string	to_string (const Dimensions< T > &dimensions)
	Formatted output of the Dimensions type. More...

std::string	to_string (const Strides &stride)
	Formatted output of the Strides type. More...

std::string	to_string (const TensorShape &shape)
	Formatted output of the TensorShape type. More...

std::string	to_string (const Coordinates &coord)
	Formatted output of the Coordinates type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const Rectangle &rect)
	Formatted output of the Rectangle type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const PadStrideInfo &pad_stride_info)
	Formatted output of the PadStrideInfo type. More...

std::string	to_string (const PadStrideInfo &pad_stride_info)
	Formatted output of the PadStrideInfo type. More...

std::string	to_string (const BorderMode &mode)
	Formatted output of the BorderMode type. More...

std::string	to_string (const BorderSize &border)
	Formatted output of the BorderSize type. More...

std::string	to_string (const InterpolationPolicy &policy)
	Formatted output of the InterpolationPolicy type. More...

std::string	to_string (const SamplingPolicy &policy)
	Formatted output of the SamplingPolicy type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const ConvertPolicy &policy)
	Formatted output of the ConvertPolicy type. More...

std::string	to_string (const ConvertPolicy &policy)

inline::std::ostream &	operator<< (::std::ostream &os, const ReductionOperation &op)
	Formatted output of the Reduction Operations. More...

std::string	to_string (const ReductionOperation &op)
	Formatted output of the Reduction Operations. More...

std::string	to_string (const NormType &type)
	Formatted output of the Norm Type. More...

std::string	to_string (const PoolingType &type)
	Formatted output of the Pooling Type. More...

std::string	to_string (const PoolingLayerInfo &info)
	Formatted output of the Pooling Layer Info. More...

inline::std::ostream &	operator<< (::std::ostream &os, const KeyPoint &point)
	Formatted output of the KeyPoint type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const PhaseType &phase_type)
	Formatted output of the PhaseType type. More...

std::string	to_string (const arm_compute::PhaseType &type)
	Formatted output of the PhaseType type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const MagnitudeType &magnitude_type)
	Formatted output of the MagnitudeType type. More...

std::string	to_string (const arm_compute::MagnitudeType &type)
	Formatted output of the MagnitudeType type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const GradientDimension &dim)
	Formatted output of the GradientDimension type. More...

std::string	to_string (const arm_compute::GradientDimension &type)
	Formatted output of the GradientDimension type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const HOGNormType &norm_type)
	Formatted output of the HOGNormType type. More...

std::string	to_string (const HOGNormType &type)
	Formatted output of the HOGNormType type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const Size2D &size)
	Formatted output of the Size2D type. More...

std::string	to_string (const Size2D &type)
	Formatted output of the Size2D type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const HOGInfo &hog_info)
	Formatted output of the HOGInfo type. More...

std::string	to_string (const HOGInfo &type)
	Formatted output of the HOGInfo type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const ConvolutionMethod &conv_method)
	Formatted output of the ConvolutionMethod type. More...

std::string	to_string (const ConvolutionMethod &conv_method)
	Formatted output of the ConvolutionMethod type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const GPUTarget &gpu_target)
	Formatted output of the GPUTarget type. More...

std::string	to_string (const GPUTarget &gpu_target)
	Formatted output of the GPUTarget type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const DetectionWindow &detection_window)
	Formatted output of the DetectionWindow type. More...

std::string	to_string (const DetectionWindow &detection_window)
	Formatted output of the DetectionWindow type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const Termination &termination)
	Formatted output of the Termination type. More...

std::string	to_string (const Termination &termination)
	Formatted output of the Termination type. More...

inline::std::ostream &	operator<< (::std::ostream &os, const WinogradInfo &info)
	Formatted output of the WinogradInfo type. More...

std::string	to_string (const WinogradInfo &type)

Variables
constexpr size_t	MAX_DIMS = 6
	Constant value used to indicate maximum dimensions of a Window, TensorShape and Coordinates. More...

const std::array< float32x4_t, 8 >	exp_tab
	Exponent polynomial coefficients. More...

const std::array< float32x4_t, 8 >	log_tab
	Logarithm polynomial coefficients. More...

constexpr uint8_t	CONSTANT_BORDER_VALUE = 199
	Constant value of the border pixels when using BorderMode::CONSTANT. More...

constexpr float	SCALE_PYRAMID_HALF = 0.5f
	Constant value used to indicate a half-scale pyramid. More...

constexpr float	SCALE_PYRAMID_ORB = 8.408964152537146130583778358414e-01
	Constant value used to indicate an ORB-scaled pyramid. More...
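
The value of SCALE_PYRAMID_ORB is numerically 2^(-1/4), so four successive ORB levels shrink a dimension by about the same factor as one half-scale level. The sketch below illustrates this with a hypothetical helper, pyramid_level_size, which is not part of the library; rounding fractional sizes up is an assumption made only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Values copied from the Variables list above.
constexpr float SCALE_PYRAMID_HALF = 0.5f;
constexpr float SCALE_PYRAMID_ORB  = 8.408964152537146130583778358414e-01;

// Hypothetical helper (not a library function): size of one dimension at a
// given pyramid level. Rounding up is an assumption of this sketch.
inline std::size_t pyramid_level_size(std::size_t base, float scale, unsigned int level)
{
    assert(scale > 0.f && scale <= 1.f);
    const float factor = std::pow(scale, static_cast<float>(level));
    return static_cast<std::size_t>(std::ceil(static_cast<float>(base) * factor));
}
```

For example, pyramid_level_size(640, SCALE_PYRAMID_HALF, 1) gives 320, and level 4 of an ORB pyramid lands within rounding error of the same size.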

Detailed Description

This file contains all available output stages for GEMMLowp on OpenCL.

This file contains all available output stages for GEMMLowp on NEON.

In gemmlowp, the "output stage" is the process that takes a final int32 accumulator value (the output of CLGEMMLowpMatrixMultiplyCore), and processes it to obtain the final ASYMM8 value.

More information about the GEMMLowp output stage can be found at https://github.com/google/gemmlowp/blob/master/doc/output.md

In gemmlowp, the "output stage" is the process that takes a final int32 accumulator value (the output of NEGEMMLowpMatrixMultiplyCore), and processes it to obtain the final ASYMM8 value.
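
As a rough scalar illustration of what an output stage does, the sketch below maps one int32 accumulator to an 8-bit result by adding an offset, multiplying by an integer, shifting right and clamping to [0, 255]. The function name output_stage_scale and the parameter names are assumptions made for this example; the actual kernels are vectorised and may use different rounding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of an integer "quantize down" output stage.
// All names here are illustrative and not taken from the library headers.
inline std::uint8_t output_stage_scale(std::int32_t acc,
                                       std::int32_t result_offset,
                                       std::int32_t result_mult_int,
                                       std::int32_t result_shift)
{
    assert(result_shift >= 0 && result_shift < 63);

    // Widen to 64 bits so that the multiplication cannot overflow.
    const std::int64_t scaled = (static_cast<std::int64_t>(acc) + result_offset) * result_mult_int;

    // Right shift of a negative value is arithmetic on mainstream compilers
    // (and guaranteed to be so from C++20 onwards).
    const std::int64_t shifted = scaled >> result_shift;

    // Clamp to the representable range of an unsigned 8-bit value.
    const std::int64_t clamped = std::min<std::int64_t>(255, std::max<std::int64_t>(0, shifted));
    return static_cast<std::uint8_t>(clamped);
}
```

With an offset of 0, a multiplier of 1 and a shift of 3, an accumulator of 1000 becomes 125; results above 255 or below 0 saturate.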


Typedef Documentation

using AssemblyKernelGlueF32 = AssemblyKernelGlue<float, float>

Float 32 assembly kernel glue.

Definition at line 121 of file AssemblyHelper.h.

using AssemblyKernelGlueS8S32 = AssemblyKernelGlue<int8_t, int32_t>

Int 8 to Int 32 kernel glue.

Definition at line 125 of file AssemblyHelper.h.

using AssemblyKernelGlueU8U32 = AssemblyKernelGlue<uint8_t, uint32_t>

Uint 8 to Uint 32 kernel glue.

Definition at line 123 of file AssemblyHelper.h.

using CLCoefficientTableArray = CLArray<CLCoefficientTable>

OpenCL Array of Coefficient Tables.

Definition at line 49 of file CLOpticalFlow.h.

using CLConvolution3x3Kernel = CLConvolutionKernel<3>

Interface for the kernel which applies a 3x3 convolution to a tensor.

Definition at line 70 of file CLConvolutionKernel.h.

using CLConvolution5x5 = CLConvolutionSquare<5>

Basic function to run 5x5 convolution.

Definition at line 102 of file CLConvolution.h.

using CLConvolution5x5Kernel = CLConvolutionKernel<5>

Interface for the kernel which applies a 5x5 convolution to a tensor.

Definition at line 72 of file CLConvolutionKernel.h.

using CLConvolution7x7 = CLConvolutionSquare<7>

Basic function to run 7x7 convolution.

Definition at line 104 of file CLConvolution.h.

using CLConvolution7x7Kernel = CLConvolutionKernel<7>

Interface for the kernel which applies a 7x7 convolution to a tensor.

Definition at line 74 of file CLConvolutionKernel.h.

using CLConvolution9x9 = CLConvolutionSquare<9>

Basic function to run 9x9 convolution.

Definition at line 106 of file CLConvolution.h.

using CLConvolution9x9Kernel = CLConvolutionKernel<9>

Interface for the kernel which applies a 9x9 convolution to a tensor.

Definition at line 76 of file CLConvolutionKernel.h.

using CLCoordinates2DArray = CLArray<Coordinates2D>

OpenCL Array of 2D Coordinates.

Definition at line 109 of file CLArray.h.

using CLDetectionWindowArray = CLArray<DetectionWindow>

OpenCL Array of Detection Windows.

Definition at line 111 of file CLArray.h.

using CLFloatArray = CLArray<cl_float>

OpenCL Array of floats.

Definition at line 127 of file CLArray.h.

using CLImage = CLTensor

OpenCL Image.

Definition at line 80 of file CLTensor.h.

using CLInt16Array = CLArray<cl_short>

OpenCL Array of int16s.

Definition at line 123 of file CLArray.h.

using CLInt32Array = CLArray<cl_int>

OpenCL Array of int32s.

Definition at line 125 of file CLArray.h.

using CLKeyPointArray = CLArray<KeyPoint>

OpenCL Array of Key Points.

Definition at line 107 of file CLArray.h.

using CLLKInternalKeypointArray = CLArray<CLLKInternalKeypoint>

OpenCL Array of Internal Keypoints.

Definition at line 47 of file CLOpticalFlow.h.

typedef MemoryGroupBase< CLTensor > CLMemoryGroup

Memory Group in OpenCL.

Definition at line 35 of file CLMemoryGroup.h.

using CLOldValueArray = CLArray<CLOldValue>

OpenCL Array of Old Values.

Definition at line 51 of file CLOpticalFlow.h.

using CLROIArray = CLArray<ROI>

OpenCL Array of ROIs.

Definition at line 113 of file CLArray.h.

using CLSeparableConvolution5x5HorKernel = CLSeparableConvolutionHorKernel<5>

Interface for the kernel which applies a horizontal pass of 5x5 convolution to a tensor.

Definition at line 106 of file CLConvolutionKernel.h.

using CLSeparableConvolution5x5VertKernel = CLSeparableConvolutionVertKernel<5>

Interface for the kernel which applies a vertical pass of 5x5 convolution to a tensor.

Definition at line 133 of file CLConvolutionKernel.h.

using CLSeparableConvolution7x7HorKernel = CLSeparableConvolutionHorKernel<7>

Interface for the kernel which applies a horizontal pass of 7x7 convolution to a tensor.

Definition at line 108 of file CLConvolutionKernel.h.

using CLSeparableConvolution7x7VertKernel = CLSeparableConvolutionVertKernel<7>

Interface for the kernel which applies a vertical pass of 7x7 convolution to a tensor.

Definition at line 135 of file CLConvolutionKernel.h.

using CLSeparableConvolution9x9HorKernel = CLSeparableConvolutionHorKernel<9>

Interface for the kernel which applies a horizontal pass of 9x9 convolution to a tensor.

Definition at line 110 of file CLConvolutionKernel.h.

using CLSeparableConvolution9x9VertKernel = CLSeparableConvolutionVertKernel<9>

Interface for the kernel which applies a vertical pass of 9x9 convolution to a tensor.

Definition at line 137 of file CLConvolutionKernel.h.

using CLSize2DArray = CLArray<Size2D>

OpenCL Array of 2D Sizes.

Definition at line 115 of file CLArray.h.

using CLUInt16Array = CLArray<cl_ushort>

OpenCL Array of uint16s.

Definition at line 119 of file CLArray.h.

using CLUInt32Array = CLArray<cl_uint>

OpenCL Array of uint32s.

Definition at line 121 of file CLArray.h.

using CLUInt8Array = CLArray<cl_uchar>

OpenCL Array of uint8s.

Definition at line 117 of file CLArray.h.

using Coordinates2DArray = Array<Coordinates2D>

Array of 2D Coordinates.

Definition at line 67 of file Array.h.

using DetectionWindowArray = Array<DetectionWindow>

Array of Detection Windows.

Definition at line 69 of file Array.h.

using FloatArray = Array<float>

Array of floats.

Definition at line 85 of file Array.h.

using GCDirectConvolutionLayer1x1Kernel = GCDirectConvolutionLayerKernel<1>

Interface for the 1x1 direct convolution kernel.

Definition at line 86 of file GCDirectConvolutionLayerKernel.h.

using GCDirectConvolutionLayer3x3Kernel = GCDirectConvolutionLayerKernel<3>

Interface for the 3x3 direct convolution kernel.

Definition at line 88 of file GCDirectConvolutionLayerKernel.h.

using GCDirectConvolutionLayer5x5Kernel = GCDirectConvolutionLayerKernel<5>

Interface for the 5x5 direct convolution kernel.

Definition at line 90 of file GCDirectConvolutionLayerKernel.h.

using GCImage = GCTensor

OpenGL ES Image.

Definition at line 98 of file GCTensor.h.

typedef MemoryGroupBase< GCTensor > GCMemoryGroup

Definition at line 35 of file GCMemoryGroup.h.

using GroupMappings = std::map<size_t, MemoryMappings>

A map of the groups and memory mappings.

Definition at line 46 of file Types.h.

using half = half_float::half

16-bit floating point type

Definition at line 44 of file Types.h.

using ICLCoefficientTableArray = ICLArray<CLCoefficientTable>

Interface for OpenCL Array of Coefficient Tables.

Definition at line 68 of file CLLKTrackerKernel.h.

using ICLCoordinates2DArray = ICLArray<Coordinates2D>

Interface for OpenCL Array of 2D Coordinates.

Definition at line 121 of file ICLArray.h.

using ICLDetectionWindowArray = ICLArray<DetectionWindow>

Interface for OpenCL Array of Detection Windows.

Definition at line 123 of file ICLArray.h.

using ICLFloatArray = ICLArray<cl_float>

Interface for OpenCL Array of floats.

Definition at line 139 of file ICLArray.h.

typedef ICLTensor ICLImage

Interface for OpenCL images.

Definition at line 33 of file ICLMultiImage.h.

using ICLInt16Array = ICLArray<cl_short>

Interface for OpenCL Array of int16s.

Definition at line 135 of file ICLArray.h.

using ICLInt32Array = ICLArray<cl_int>

Interface for OpenCL Array of int32s.

Definition at line 137 of file ICLArray.h.

using ICLKeyPointArray = ICLArray<KeyPoint>

Interface for OpenCL Array of Key Points.

Definition at line 119 of file ICLArray.h.

using ICLLKInternalKeypointArray = ICLArray<CLLKInternalKeypoint>

Interface for OpenCL Array of Internal Key Points.

Definition at line 66 of file CLLKTrackerKernel.h.

using ICLOldValArray = ICLArray<CLOldValue>

Interface for OpenCL Array of Old Values.

Definition at line 70 of file CLLKTrackerKernel.h.

using ICLROIArray = ICLArray<ROI>

Interface for OpenCL Array of ROIs.

Definition at line 125 of file ICLArray.h.

using ICLSize2DArray = ICLArray<Size2D>

Interface for OpenCL Array of 2D Sizes.

Definition at line 127 of file ICLArray.h.

using ICLUInt16Array = ICLArray<cl_ushort>

Interface for OpenCL Array of uint16s.

Definition at line 131 of file ICLArray.h.

using ICLUInt32Array = ICLArray<cl_uint>

Interface for OpenCL Array of uint32s.

Definition at line 133 of file ICLArray.h.

using ICLUInt8Array = ICLArray<cl_uchar>

Interface for OpenCL Array of uint8s.

Definition at line 129 of file ICLArray.h.

using ICoordinates2DArray = IArray<Coordinates2D>

Interface for Array of 2D Coordinates.

Definition at line 142 of file IArray.h.

using IDetectionWindowArray = IArray<DetectionWindow>

Interface for Array of Detection Windows.

Definition at line 144 of file IArray.h.

using IFloatArray = IArray<float>

Interface for Array of floats.

Definition at line 160 of file IArray.h.

using IGCImage = IGCTensor

Interface for GLES Compute image.

Definition at line 111 of file IGCTensor.h.

typedef ITensor IImage

Interface for CPP Images.

Definition at line 37 of file CPPCornerCandidatesKernel.h.

using IInt16Array = IArray<int16_t>

Interface for Array of int16s.

Definition at line 156 of file IArray.h.

using IInt32Array = IArray<int32_t>

Interface for Array of int32s.

Definition at line 158 of file IArray.h.

using IKeyPointArray = IArray<KeyPoint>

Interface for Array of Key Points.

Definition at line 140 of file IArray.h.

using Image = Tensor

Image.

Definition at line 64 of file Tensor.h.

using INEKernel = ICPPKernel

Common interface for all kernels implemented in NEON.

Definition at line 32 of file INEKernel.h.

using INELKInternalKeypointArray = IArray<NELKInternalKeypoint>

Interface for NEON Array of Internal Key Points.

Definition at line 49 of file NELKTrackerKernel.h.

using INESimpleKernel = ICPPSimpleKernel

Interface for simple NEON kernels having 1 tensor input and 1 tensor output.

Definition at line 32 of file INESimpleKernel.h.

using Int16Array = Array<int16_t>

Array of int16s.

Definition at line 81 of file Array.h.

using Int32Array = Array<int32_t>

Array of int32s.

Definition at line 83 of file Array.h.

using InternalKeypoint = std::tuple<float, float, float>

Internal key point.

Definition at line 447 of file Types.h.

using IROIArray = IArray<ROI>

Interface for Array of ROIs.

Definition at line 146 of file IArray.h.

using ISize2DArray = IArray<Size2D>

Interface for Array of 2D Sizes.

Definition at line 148 of file IArray.h.

using IUInt16Array = IArray<uint16_t>

Interface for Array of uint16s.

Definition at line 152 of file IArray.h.

using IUInt32Array = IArray<uint32_t>

Interface for Array of uint32s.

Definition at line 154 of file IArray.h.

using IUInt8Array = IArray<uint8_t>

Interface for Array of uint8s.

Definition at line 150 of file IArray.h.

using KeyPointArray = Array<KeyPoint>

Array of Key Points.

Definition at line 65 of file Array.h.

using LKInternalKeypointArray = Array<NELKInternalKeypoint>

Array of LK Internal Keypoints.

Definition at line 46 of file NEOpticalFlow.h.

typedef MemoryGroupBase< Tensor > MemoryGroup

Memory Group.

Definition at line 34 of file MemoryGroup.h.

using MemoryMappings = std::map<void **, size_t>

A map of (handle, index/offset), where handle is the memory handle of the object to provide the memory for and index/offset is the buffer/offset from the pool that should be used.

Note: All objects are pre-pinned to specific buffers to avoid any relevant overheads

Definition at line 43 of file Types.h.
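
A minimal sketch of how such a mapping could be consumed, assuming the mapped value is an index into a pool of buffers. The helper bind_handles and the plain pointer array used as the pool are hypothetical and only serve this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

using MemoryMappings = std::map<void **, std::size_t>;

// Hypothetical helper (not a library function): make every registered handle
// point at the pool buffer selected by its index.
inline void bind_handles(const MemoryMappings &mappings, void *const *pool, std::size_t pool_size)
{
    for(const auto &entry : mappings)
    {
        assert(entry.second < pool_size); // index must refer to an existing buffer
        *entry.first = pool[entry.second];
    }
}
```

Because the map stores the address of each handle (void **), the pool can redirect an object to its buffer without knowing the object's type.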

using Mutex = std::mutex

Wrapper of Mutex data-object.

Definition at line 33 of file Mutex.h.

using NEAccumulateWeightedFP16Kernel = NEAccumulateWeightedKernel

Interface for the accumulate weighted kernel using F16.

Definition at line 105 of file NEAccumulateKernel.h.

using NEBox3x3FP16Kernel = NEBox3x3Kernel

NEON kernel to perform a Box 3x3 filter for FP16 datatype.

Definition at line 68 of file NEBox3x3Kernel.h.

using NEConvolution3x3Kernel = NEConvolutionKernel<3>

Interface for the kernel which applies a 3x3 convolution to a tensor.

Definition at line 88 of file NEConvolutionKernel.h.

using NEConvolution5x5 = NEConvolutionSquare<5>

Basic function to run 5x5 convolution.

Definition at line 102 of file NEConvolution.h.

using NEConvolution5x5Kernel = NEConvolutionKernel<5>

Interface for the kernel which applies a 5x5 convolution to a tensor.

Definition at line 90 of file NEConvolutionKernel.h.

using NEConvolution7x7 = NEConvolutionSquare<7>

Basic function to run 7x7 convolution.

Definition at line 104 of file NEConvolution.h.

using NEConvolution7x7Kernel = NEConvolutionKernel<7>

Interface for the kernel which applies a 7x7 convolution to a tensor.

Definition at line 92 of file NEConvolutionKernel.h.

using NEConvolution9x9 = NEConvolutionSquare<9>

Basic function to run 9x9 convolution.

Definition at line 106 of file NEConvolution.h.

using NEConvolution9x9Kernel = NEConvolutionKernel<9>

Interface for the kernel which applies a 9x9 convolution to a tensor.

Definition at line 94 of file NEConvolutionKernel.h.

using NEGradientFP16Kernel = NEGradientKernel

NEON kernel to perform Gradient computation for FP16 datatype.

Definition at line 103 of file NECannyEdgeKernel.h.

using NEHarrisScoreFP16Kernel = NEHarrisScoreKernel<block_size>

Interface for the Harris score kernel using FP16.

Definition at line 132 of file NEHarrisCornersKernel.h.

using NEMagnitudePhaseFP16Kernel = NEMagnitudePhaseKernel<mag_type, phase_type>

Template interface for the kernel to compute magnitude and phase.

Definition at line 170 of file NEMagnitudePhaseKernel.h.

using NENonMaximaSuppression3x3FP16Kernel = NENonMaximaSuppression3x3Kernel

NEON kernel to perform Non-Maxima suppression 3x3 with intermediate results in FP16 if the input data type is FP32.

Definition at line 105 of file NENonMaximaSuppression3x3Kernel.h.

using NEScheduler = Scheduler

NEON Scheduler.

Definition at line 32 of file NEScheduler.h.

using NESeparableConvolution5x5HorKernel = NESeparableConvolutionHorKernel<5>

Interface for the kernel which applies a 5x1 horizontal convolution to a tensor.

Definition at line 138 of file NEConvolutionKernel.h.

using NESeparableConvolution5x5VertKernel = NESeparableConvolutionVertKernel<5>

Interface for the kernel which applies a 1x5 vertical convolution to a tensor.

Definition at line 198 of file NEConvolutionKernel.h.

using NESeparableConvolution7x7HorKernel = NESeparableConvolutionHorKernel<7>

Interface for the kernel which applies a 7x1 horizontal convolution to a tensor.

Definition at line 140 of file NEConvolutionKernel.h.

using NESeparableConvolution7x7VertKernel = NESeparableConvolutionVertKernel<7>

Interface for the kernel which applies a 1x7 vertical convolution to a tensor.

Definition at line 200 of file NEConvolutionKernel.h.

using NESeparableConvolution9x9HorKernel = NESeparableConvolutionHorKernel<9>

Interface for the kernel which applies a 9x1 horizontal convolution to a tensor.

Definition at line 142 of file NEConvolutionKernel.h.

using NESeparableConvolution9x9VertKernel = NESeparableConvolutionVertKernel<9>

Interface for the kernel which applies a 1x9 vertical convolution to a tensor.

Definition at line 202 of file NEConvolutionKernel.h.

using PaddingSize = BorderSize

Container for 2D padding size.

Definition at line 378 of file Types.h.

using PermutationVector = Strides

Permutation vector.

Definition at line 47 of file Types.h.

using qasymm8_t = uint8_t

8 bit quantized asymmetric scalar value

Definition at line 30 of file QAsymm8.h.
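
For orientation, asymmetric 8-bit quantization is commonly modelled as real = scale * (q - offset). The sketch below follows that model; the function names, the round-to-nearest choice and the parameter names scale and offset are assumptions of this example, not a statement about the library's exact arithmetic.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

using qasymm8_t = std::uint8_t;

// Hypothetical helper: quantize a real value with an affine (scale, offset)
// mapping, saturating to the [0, 255] range of qasymm8_t.
inline qasymm8_t quantize_qasymm8(float value, float scale, int offset)
{
    assert(scale > 0.f);
    const int q = static_cast<int>(std::lround(value / scale)) + offset;
    return static_cast<qasymm8_t>(std::min(255, std::max(0, q)));
}

// Hypothetical helper: inverse mapping, real = scale * (q - offset).
inline float dequantize_qasymm8(qasymm8_t q, float scale, int offset)
{
    return scale * static_cast<float>(static_cast<int>(q) - offset);
}
```

With scale 0.5 and offset 10, the real value 1.0 quantizes to 12 and dequantizes back to 1.0; values outside the representable range saturate to 0 or 255.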

using qasymm8x16_t = uint8x16_t

8 bit quantized asymmetric vector with 16 elements

Definition at line 35 of file NEAsymm.h.

using qasymm8x8_t = uint8x8_t

8 bit quantized asymmetric vector with 8 elements

Definition at line 31 of file NEAsymm.h.

using qasymm8x8x2_t = uint8x8x2_t

8 bit quantized asymmetric vector with 16 elements

Definition at line 32 of file NEAsymm.h.

using qasymm8x8x3_t = uint8x8x3_t

8 bit quantized asymmetric vector with 24 elements

Definition at line 33 of file NEAsymm.h.

using qasymm8x8x4_t = uint8x8x4_t

8 bit quantized asymmetric vector with 32 elements

Definition at line 34 of file NEAsymm.h.

using qint16_t = int16_t

16 bit fixed point scalar value

Definition at line 30 of file FixedPoint.h.

using qint16x4_t = int16x4_t

16 bit fixed point vector with 4 elements

Definition at line 41 of file NEFixedPoint.h.

using qint16x4x2_t = int16x4x2_t

16 bit fixed point vector with 8 elements

Definition at line 42 of file NEFixedPoint.h.

using qint16x4x3_t = int16x4x3_t

16 bit fixed point vector with 12 elements

Definition at line 43 of file NEFixedPoint.h.

using qint16x4x4_t = int16x4x4_t

16 bit fixed point vector with 16 elements

Definition at line 44 of file NEFixedPoint.h.

using qint16x8_t = int16x8_t

16 bit fixed point vector with 8 elements

Definition at line 45 of file NEFixedPoint.h.

using qint16x8x2_t = int16x8x2_t

16 bit fixed point vector with 16 elements

Definition at line 46 of file NEFixedPoint.h.

using qint16x8x3_t = int16x8x3_t

16 bit fixed point vector with 24 elements

Definition at line 47 of file NEFixedPoint.h.

using qint16x8x4_t = int16x8x4_t

16 bit fixed point vector with 32 elements

Definition at line 48 of file NEFixedPoint.h.

using qint32_t = int32_t

32 bit fixed point scalar value

Definition at line 31 of file FixedPoint.h.

using qint32x2_t = int32x2_t

32 bit fixed point vector with 2 elements

Definition at line 49 of file NEFixedPoint.h.

using qint32x4_t = int32x4_t

32 bit fixed point vector with 4 elements

Definition at line 50 of file NEFixedPoint.h.

using qint32x4x2_t = int32x4x2_t

32 bit fixed point vector with 8 elements

Definition at line 51 of file NEFixedPoint.h.

using qint64_t = int64_t

64 bit fixed point scalar value

Definition at line 32 of file FixedPoint.h.

using qint8_t = int8_t

8 bit fixed point scalar value

Definition at line 29 of file FixedPoint.h.
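
A fixed-point scalar stores a real number as an integer scaled by a power of two given by the fixed-point position (the number of fractional bits of the Q format). The sketch below shows the conversion in both directions; the helper names are hypothetical, and round-to-nearest with saturation is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

using qint8_t = std::int8_t;

// Hypothetical helper: convert a float to 8-bit fixed point with
// 'fixed_point_position' fractional bits, saturating to [-128, 127].
inline qint8_t float_to_qint8_saturate(float value, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    const float scaled = std::round(value * static_cast<float>(1 << fixed_point_position));
    return static_cast<qint8_t>(std::min(127.f, std::max(-128.f, scaled)));
}

// Hypothetical helper: convert an 8-bit fixed-point value back to float.
inline float qint8_to_float(qint8_t value, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    return static_cast<float>(value) / static_cast<float>(1 << fixed_point_position);
}
```

With 4 fractional bits, 1.5 is stored as 24. A value such as 100.0 is not representable in that format and saturates to 127, which is the kind of situation error_on_value_not_representable_in_fixed_point is described as reporting.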

using qint8x16_t = int8x16_t

8 bit fixed point vector with 16 elements

Definition at line 37 of file NEFixedPoint.h.

using qint8x16x2_t = int8x16x2_t

8 bit fixed point vector with 32 elements

Definition at line 38 of file NEFixedPoint.h.

using qint8x16x3_t = int8x16x3_t

8 bit fixed point vector with 48 elements

Definition at line 39 of file NEFixedPoint.h.

using qint8x16x4_t = int8x16x4_t

8 bit fixed point vector with 64 elements

Definition at line 40 of file NEFixedPoint.h.

using qint8x8_t = int8x8_t

8 bit fixed point vector with 8 elements

Definition at line 33 of file NEFixedPoint.h.

using qint8x8x2_t = int8x8x2_t

8 bit fixed point vector with 16 elements

Definition at line 34 of file NEFixedPoint.h.

using qint8x8x3_t = int8x8x3_t

8 bit fixed point vector with 24 elements

Definition at line 35 of file NEFixedPoint.h.

using qint8x8x4_t = int8x8x4_t

8 bit fixed point vector with 32 elements

Definition at line 36 of file NEFixedPoint.h.

using ROIArray = Array<ROI>

Array of ROIs.

Definition at line 71 of file Array.h.

using Size2DArray = Array<Size2D>

Array of 2D Sizes.

Definition at line 73 of file Array.h.

using UInt16Array = Array<uint16_t>

Array of uint16s.

Definition at line 77 of file Array.h.

using UInt32Array = Array<uint32_t>

Array of uint32s.

Definition at line 79 of file Array.h.

using UInt8Array = Array<uint8_t>

Array of uint8s.

Definition at line 75 of file Array.h.

Enumeration Type Documentation

enum BilinearInterpolation

strong

Bilinear Interpolation method used by LKTracker.

Enumerator
BILINEAR_OLD_NEW	Old-new method.
BILINEAR_SCHARR	Scharr method.

Definition at line 396 of file Types.h.

 {
     BILINEAR_OLD_NEW, 
     BILINEAR_SCHARR   
 };

enum BorderMode

strong

Methods available to handle borders.

Enumerator
UNDEFINED	Borders are left undefined.
CONSTANT	Pixels outside the image are assumed to have a constant value.
REPLICATE	Pixels outside the image are assumed to have the same value as the closest image pixel.

Definition at line 283 of file Types.h.

 {
     UNDEFINED, 
     CONSTANT,  
     REPLICATE  
 };
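
The difference between the modes can be sketched with a small pixel reader for a row-major 8-bit image. The function read_pixel is hypothetical and exists only for this example; the enum is redeclared locally so that the snippet is self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class BorderMode { UNDEFINED, CONSTANT, REPLICATE };

// Hypothetical helper: read pixel (x, y), applying the border mode when the
// coordinates fall outside the image.
inline std::uint8_t read_pixel(const std::vector<std::uint8_t> &image, int width, int height,
                               int x, int y, BorderMode mode, std::uint8_t constant_value)
{
    assert(width > 0 && height > 0);
    assert(image.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    const bool inside = (x >= 0) && (x < width) && (y >= 0) && (y < height);
    if(!inside)
    {
        switch(mode)
        {
            case BorderMode::CONSTANT:
                return constant_value; // same value for every outside pixel
            case BorderMode::REPLICATE:
                x = std::min(std::max(x, 0), width - 1); // clamp to the closest image pixel
                y = std::min(std::max(y, 0), height - 1);
                break;
            case BorderMode::UNDEFINED:
                return 0; // arbitrary in this sketch; callers must not rely on it
        }
    }
    return image[static_cast<std::size_t>(y) * static_cast<std::size_t>(width) + static_cast<std::size_t>(x)];
}
```

For a 2x2 image {1, 2, 3, 4}, reading (-1, 0) returns 1 under REPLICATE and the supplied constant under CONSTANT.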

enum Channel

strong

Available channels.

Enumerator
UNKNOWN	Unknown channel format.
C0	First channel (used by formats with unknown channel types).
C1	Second channel (used by formats with unknown channel types).
C2	Third channel (used by formats with unknown channel types).
C3	Fourth channel (used by formats with unknown channel types).
R	Red channel.
G	Green channel.
B	Blue channel.
A	Alpha channel.
Y	Luma channel.
U	Cb/U channel.
V	Cr/V/Value channel.

Definition at line 481 of file Types.h.

enum CLVersion

strong

Available OpenCL Version.

Enumerator
CL10
CL11
CL12
CL20
UNKNOWN

Definition at line 37 of file CLTypes.h.

 {
     CL10,   /* the OpenCL 1.0 */
     CL11,   /* the OpenCL 1.1 */
     CL12,   /* the OpenCL 1.2 */
     CL20,   /* the OpenCL 2.0 and above */
     UNKNOWN /* unknown version */
 };

enum ConvertPolicy

strong

Policy to handle overflow.

Enumerator
WRAP	Wrap around.
SATURATE	Saturate.

Definition at line 381 of file Types.h.

 {
     WRAP,    
     SATURATE 
 };
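
The two policies can be illustrated with an 8-bit unsigned addition. The function add_u8 is hypothetical; the enum is redeclared locally so that the snippet is self-contained.

```cpp
#include <algorithm>
#include <cstdint>

enum class ConvertPolicy { WRAP, SATURATE };

// Hypothetical helper: add two 8-bit values under the given overflow policy.
inline std::uint8_t add_u8(std::uint8_t a, std::uint8_t b, ConvertPolicy policy)
{
    const int sum = static_cast<int>(a) + static_cast<int>(b);
    if(policy == ConvertPolicy::SATURATE)
    {
        return static_cast<std::uint8_t>(std::min(sum, 255)); // clamp at the maximum
    }
    return static_cast<std::uint8_t>(sum); // conversion keeps the low 8 bits (modulo 256)
}
```

Adding 200 and 100 yields 44 under WRAP (300 modulo 256) and 255 under SATURATE.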

enum ConvolutionMethod

strong

Available ConvolutionMethod.

Enumerator
GEMM	Convolution using GEMM.
DIRECT	Direct convolution.
WINOGRAD	Convolution using Winograd.

Definition at line 1220 of file Types.h.

 {
     GEMM,    
     DIRECT,  
     WINOGRAD 
 };

enum CPUModel

strong

CPU models - we only need to detect CPUs we have microarchitecture-specific code for.

Architecture features are detected via HWCAPs.

Enumerator
GENERIC
A53
A55r0
A55r1

Definition at line 36 of file CPPTypes.h.

 {
     GENERIC,
     A53,
     A55r0,
     A55r1,
 };

enum DataLayout

strong

Supported tensor data layouts.

Enumerator
UNKNOWN	Unknown data layout.
NCHW	Num samples, channels, height, width.
NHWC	Num samples, height, width, channels.

Definition at line 110 of file Types.h.

 {
     UNKNOWN, 
     NCHW,    
     NHWC     
 };

enum DataLayoutDimension

strong

Supported tensor data layout dimensions.

Enumerator
CHANNEL	channel
HEIGHT	height
WIDTH	width
BATCHES	batches

Definition at line 118 of file Types.h.

 {
     CHANNEL, 
     HEIGHT,  
     WIDTH,   
     BATCHES  
 };

enum DataType

strong

Available data types.

Enumerator
UNKNOWN	Unknown data type.
U8	unsigned 8-bit number
S8	signed 8-bit number
QS8	quantized, symmetric fixed-point 8-bit number
QASYMM8	quantized, asymmetric fixed-point 8-bit number
U16	unsigned 16-bit number
S16	signed 16-bit number
QS16	quantized, symmetric fixed-point 16-bit number
U32	unsigned 32-bit number
S32	signed 32-bit number
QS32	quantized, symmetric fixed-point 32-bit number
U64	unsigned 64-bit number
S64	signed 64-bit number
F16	16-bit floating-point number
F32	32-bit floating-point number
F64	64-bit floating-point number
SIZET	size_t

Definition at line 72 of file Types.h.

 {
     UNKNOWN, 
     U8,      
     S8,      
     QS8,     
     QASYMM8, 
     U16,     
     S16,     
     QS16,    
     U32,     
     S32,     
     QS32,    
     U64,     
     S64,     
     F16,     
     F32,     
     F64,     
     SIZET    
 };

enum DimensionRoundingType

strong

Dimension rounding type when down-scaling on CNNs.

Note: Used in pooling and convolution layer

Enumerator
FLOOR	Floor rounding.
CEIL	Ceil rounding.

Definition at line 556 of file Types.h.

 {
     FLOOR, 
     CEIL   
 };
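
The rounding type only matters when the stride does not divide the scaled extent exactly. The sketch below assumes the commonly used output-size formula (in + 2 * pad - kernel) / stride + 1, which is not stated on this page; `Rounding` and `scaled_dim` are hypothetical names.

```cpp
#include <cmath>

// Hypothetical stand-in for arm_compute::DimensionRoundingType.
enum class Rounding
{
    FLOOR,
    CEIL
};

// Output extent of a pooling/convolution window sweep along one dimension.
// Assumption: out = round((in + 2 * pad - kernel) / stride) + 1.
inline int scaled_dim(int in, int kernel, int stride, int pad, Rounding rounding)
{
    const double span = static_cast<double>(in + 2 * pad - kernel) / stride;
    const double r    = (rounding == Rounding::FLOOR) ? std::floor(span) : std::ceil(span);
    return static_cast<int>(r) + 1;
}
```

For an input of 7 with a 2-wide kernel, stride 2 and no padding, the span is 2.5, so FLOOR gives 3 outputs and CEIL gives 4. For an input of 8 both give 4.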

enum ErrorCode

strong

Available error codes.

Enumerator
OK	No error.
RUNTIME_ERROR	Generic runtime error.

Definition at line 44 of file Error.h.

 {
     OK,           
     RUNTIME_ERROR 
 };

enum FixedPointOp

strong

Fixed point operation.

Enumerator
ADD	Addition.
SUB	Subtraction.
MUL	Multiplication.
EXP	Exponential.
LOG	Logarithm.
INV_SQRT	Inverse square root.
RECIPROCAL	Reciprocal.

Definition at line 34 of file Types.h.

 {
     ADD,       
     SUB,       
     MUL,       
     EXP,       
     LOG,       
     INV_SQRT,  
     RECIPROCAL 
 };

enum Format

strong

Image colour formats.

Enumerator
UNKNOWN	Unknown image format.
U8	1 channel, 1 U8 per channel
S16	1 channel, 1 S16 per channel
U16	1 channel, 1 U16 per channel
S32	1 channel, 1 S32 per channel
U32	1 channel, 1 U32 per channel
F16	1 channel, 1 F16 per channel
F32	1 channel, 1 F32 per channel
UV88	2 channel, 1 U8 per channel
RGB888	3 channels, 1 U8 per channel
RGBA8888	4 channels, 1 U8 per channel
YUV444	A 3 plane of 8 bit 4:4:4 sampled Y, U, V planes.
YUYV422	A single plane of 32-bit macro pixel of Y0, U0, Y1, V0 bytes.
NV12	A 2 plane YUV format of Luma (Y) and interleaved UV data at 4:2:0 sampling.
NV21	A 2 plane YUV format of Luma (Y) and interleaved VU data at 4:2:0 sampling.
IYUV	A 3 plane of 8-bit 4:2:0 sampled Y, U, V planes.
UYVY422	A single plane of 32-bit macro pixel of U0, Y0, V0, Y1 byte.

Definition at line 50 of file Types.h.

 {
     UNKNOWN,  
     U8,       
     S16,      
     U16,      
     S32,      
     U32,      
     F16,      
     F32,      
     UV88,     
     RGB888,   
     RGBA8888, 
     YUV444,   
     YUYV422,  
     NV12,     
     NV21,     
     IYUV,     
     UYVY422   
 };

enum GPUTarget

strong

Available GPU Targets.

Enumerator
UNKNOWN
GPU_ARCH_MASK
MIDGARD
BIFROST
T600
T700
T800
G71
G72
G51
G51BIG
G51LIT
TNOX
TTRX
TBOX

Definition at line 34 of file GPUTarget.h.

 {
     UNKNOWN       = 0x101,
     GPU_ARCH_MASK = 0xF00,
     MIDGARD       = 0x100,
     BIFROST       = 0x200,
     T600          = 0x110,
     T700          = 0x120,
     T800          = 0x130,
     G71           = 0x210,
     G72           = 0x220,
     G51           = 0x230,
     G51BIG        = 0x231,
     G51LIT        = 0x232,
     TNOX          = 0x240,
     TTRX          = 0x250,
     TBOX          = 0x260
 };
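
The values above suggest that bits 8 to 11 encode the architecture, so masking a target with GPU_ARCH_MASK recovers MIDGARD or BIFROST. The following standalone sketch illustrates that reading; `Target` copies a subset of the values listed above and `arch_of` is a hypothetical helper, not necessarily the function the library provides for this purpose.

```cpp
#include <cstdint>

// Subset of the GPUTarget values shown above, copied for a standalone example.
enum class Target : uint32_t
{
    UNKNOWN       = 0x101,
    GPU_ARCH_MASK = 0xF00,
    MIDGARD       = 0x100,
    BIFROST       = 0x200,
    T600          = 0x110,
    G71           = 0x210,
    G51BIG        = 0x231
};

// Hypothetical helper: keep only the architecture bits of a target.
constexpr Target arch_of(Target target)
{
    return static_cast<Target>(static_cast<uint32_t>(target) & static_cast<uint32_t>(Target::GPU_ARCH_MASK));
}
```

Note that UNKNOWN (0x101) masks to 0x100, so under this scheme an unknown target is treated as MIDGARD.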

enum GradientDimension

strong

Gradient dimension type.

Enumerator
GRAD_X	x gradient dimension
GRAD_Y	y gradient dimension
GRAD_XY	x and y gradient dimension

Definition at line 46 of file Types.h.

 {
     GRAD_X,  
     GRAD_Y,  
     GRAD_XY, 
 };

enum HOGNormType

strong

Normalization type for Histogram of Oriented Gradients (HOG)

Enumerator
L2_NORM	L2-norm.
L2HYS_NORM	L2-norm followed by clipping.
L1_NORM	L1 norm.

Definition at line 530 of file Types.h.

 {
     L2_NORM    = 1, 
     L2HYS_NORM = 2, 
     L1_NORM    = 3  
 };

enum InterpolationPolicy

strong

Interpolation method.

Enumerator
NEAREST_NEIGHBOR	Output values are defined to match the source pixel whose center is nearest to the sample position.
BILINEAR	Output values are defined by bilinear interpolation between the pixels.
AREA	Output values are determined by averaging the source pixels whose areas fall under the area of the destination pixel, projected onto the source image.

Definition at line 388 of file Types.h.

 {
     NEAREST_NEIGHBOR, 
     BILINEAR,         
     AREA,             
 };

enum MagnitudeType

strong

Magnitude calculation type.

Enumerator
L1NORM	L1 normalization type.
L2NORM	L2 normalization type.

Definition at line 418 of file Types.h.

 {
     L1NORM, 
     L2NORM  
 };

enum MappingType

strong

Mapping type.

Enumerator
BLOBS	Mappings are in blob granularity.
OFFSETS	Mappings are in offset granularity in the same blob.

Definition at line 32 of file Types.h.

 {
     BLOBS,  
     OFFSETS 
 };

enum MatrixPattern

strong

Available matrix patterns.

Enumerator
BOX	Box pattern matrix.
CROSS	Cross pattern matrix.
DISK	Disk pattern matrix.
OTHER	Any other matrix pattern.

Definition at line 498 of file Types.h.

 {
     BOX,   
     CROSS, 
     DISK,  
     OTHER  
 };

enum NonLinearFilterFunction : unsigned

strong

Available non linear functions.

Enumerator
MEDIAN	Non linear median filter.
MIN	Non linear erode.
MAX	Non linear dilate.

Definition at line 507 of file Types.h.

 {
     MEDIAN = 0, 
     MIN    = 1, 
     MAX    = 2, 
 };

enum NormType

strong

The normalization type used for the normalization layer.

Enumerator
IN_MAP_1D	Normalization applied within the same map in 1D region.
IN_MAP_2D	Normalization applied within the same map in 2D region.
CROSS_MAP	Normalization applied cross maps.

Definition at line 522 of file Types.h.

 {
     IN_MAP_1D, 
     IN_MAP_2D, 
     CROSS_MAP  
 };

enum PhaseType

strong

Phase calculation type.

Note: When PhaseType == SIGNED, each angle is mapped to the range 0 to 255 inclusive; otherwise, angles lie between 0 and 180.

Enumerator
SIGNED	Angle range: [0, 360].
UNSIGNED	Angle range: [0, 180].

Definition at line 428 of file Types.h.

 {
     SIGNED,  
     UNSIGNED 
 };

enum PoolingType

strong

Available pooling types.

Enumerator
MAX	Max Pooling.
AVG	Average Pooling.
L2	L2 Pooling.

Definition at line 563 of file Types.h.

 {
     MAX, 
     AVG, 
     L2   
 };

enum ReductionOperation

strong

Available reduction operations.

Enumerator
SUM_SQUARE	Sum of squares.
SUM	Sum.

Definition at line 515 of file Types.h.

 {
     SUM_SQUARE, 
     SUM,        
 };

enum RoundingPolicy

strong

Rounding method.

Enumerator
TO_ZERO	Truncates the least significant values that are lost in operations.
TO_NEAREST_UP	Rounds to nearest value; half rounds away from zero.
TO_NEAREST_EVEN	Rounds to nearest value; half rounds to nearest even.

Definition at line 30 of file Rounding.h.

 {
     TO_ZERO,         
     TO_NEAREST_UP,   
     TO_NEAREST_EVEN, 
 };
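
The three policies differ only in how they treat discarded fractional parts, most visibly for exact halves. A minimal sketch using standard library functions follows; `Rounding` and `round_with` are hypothetical names, and the TO_NEAREST_EVEN branch assumes the default floating-point rounding mode (FE_TONEAREST) is in effect.

```cpp
#include <cmath>

// Hypothetical stand-in for arm_compute::RoundingPolicy.
enum class Rounding
{
    TO_ZERO,
    TO_NEAREST_UP,
    TO_NEAREST_EVEN
};

// Round a floating-point value to an integral value under the given policy.
inline double round_with(double value, Rounding policy)
{
    switch(policy)
    {
        case Rounding::TO_ZERO:
            return std::trunc(value); // drop the fractional part
        case Rounding::TO_NEAREST_UP:
            return std::round(value); // halves round away from zero
        default:
            return std::nearbyint(value); // halves round to even (default rounding mode)
    }
}
```

For 2.5 the results are 2, 3 and 2 respectively; for 3.5, TO_NEAREST_EVEN gives 4.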

enum SamplingPolicy

strong

Available Sampling Policies.

Enumerator
CENTER	Samples are taken at pixel center.
TOP_LEFT	Samples are taken at pixel top left corner.

Definition at line 94 of file Types.h.

 {
     CENTER,  
     TOP_LEFT 
 };

enum Termination

strong

Termination criteria.

Enumerator
TERM_CRITERIA_EPSILON	Terminate when within epsilon of a threshold.
TERM_CRITERIA_ITERATIONS	Terminate after a maximum number of iterations.
TERM_CRITERIA_BOTH	Terminate on whichever of the other conditions occurs first.

Definition at line 410 of file Types.h.

 {
     TERM_CRITERIA_EPSILON,    
     TERM_CRITERIA_ITERATIONS, 
     TERM_CRITERIA_BOTH        
 };

enum ThresholdType

strong

Threshold mode.

Enumerator
BINARY	Threshold with one value.
RANGE	Threshold with two values.

Definition at line 403 of file Types.h.

 {
     BINARY, 
     RANGE   
 };

Function Documentation

int arm_compute::adjust_down	(	int	required,
		int	available,
		int	step
	)

inline

Decrease required in steps of step until it is no greater than available.

Parameters

[in]	required	Number of required bytes.
[in]	available	Number of available bytes.
[in]	step	Step size used to decrease required bytes.

Returns: Largest value not greater than available that can be reached from required by subtracting a multiple of step.

Definition at line 47 of file IAccessWindow.h.

References ARM_COMPUTE_ERROR_ON.

 {
     ARM_COMPUTE_ERROR_ON(step <= 0);
 
     return required - step * ((required - available + step - 1) / step);
 }
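
A worked example may help when reading the arithmetic. The sketch below copies the expression from the body above into a standalone function (`adjust_down_sketch` is a hypothetical name) and omits the error check; it is meant for the case required >= available.

```cpp
// Same expression as in the function body above, without the error check.
constexpr int adjust_down_sketch(int required, int available, int step)
{
    // (required - available + step - 1) / step is the number of steps to remove,
    // i.e. the ceiling of (required - available) / step for required >= available.
    return required - step * ((required - available + step - 1) / step);
}
```

With required = 10, available = 7 and step = 2, two steps are removed and the result is 6. With available = 6 the result is also 6, so the returned value can equal available.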

TensorShape arm_compute::adjust_odd_shape	(	const TensorShape &	shape,
		Format	format
	)

inline

Adjust tensor shape size if width or height are odd for a given multi-planar format.

No modification is done for other formats.

Note: The following links discuss the issue of odd sizes and share the same solution:
Android Source: https://android.googlesource.com/platform/frameworks/base/+/refs/heads/master/graphics/java/android/graphics/YuvImage.java
WebM: https://groups.google.com/a/webmproject.org/forum/#!topic/webm-discuss/LaCKpqiDTXM
libYUV: https://bugs.chromium.org/p/libyuv/issues/detail?id=198&can=1&q=odd%20width
YUVPlayer: https://sourceforge.net/p/raw-yuvplayer/bugs/1/

Parameters

[in]	shape	Tensor shape of 2D size
[in]	format	Format of the tensor

Returns: The adjusted tensor shape.

Definition at line 688 of file Utils.h.

References has_format_horizontal_subsampling(), has_format_vertical_subsampling(), TensorShape::set(), and U.

Referenced by error_on_tensors_not_even().

 {
     TensorShape output{ shape };
 
     // Force width to be even for formats which require subsampling of the U and V channels
     if(has_format_horizontal_subsampling(format))
     {
         output.set(0, output.x() & ~1U);
     }
 
     // Force height to be even for formats which require subsampling of the U and V channels
     if(has_format_vertical_subsampling(format))
     {
         output.set(1, output.y() & ~1U);
     }
 
     return output;
 }
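
The body above forces a dimension to be even by clearing its lowest bit. A minimal standalone sketch of that step, with `force_even` as a hypothetical name and a mask of the same width as the value:

```cpp
#include <cstddef>

// Clear the lowest bit so that odd values are rounded down to the previous even value.
constexpr std::size_t force_even(std::size_t value)
{
    return value & ~static_cast<std::size_t>(1);
}
```

For example, a width of 641 becomes 640, while an already even height of 480 is unchanged.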

int arm_compute::adjust_up	(	int	required,
		int	available,
		int	step
	)

inline

Increase required in steps of step until it is no smaller than available.

Parameters

[in]	required	Number of required bytes.
[in]	available	Number of available bytes.
[in]	step	Step size used to increase required bytes.

Returns: Smallest value not smaller than available that can be reached from required by adding a multiple of step.

Definition at line 63 of file IAccessWindow.h.

References ARM_COMPUTE_ERROR_ON.

 {
     ARM_COMPUTE_ERROR_ON(step <= 0);
 
     return required + step * ((available - required + step - 1) / step);
 }
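
As a worked example of the arithmetic, the sketch below copies the expression from the body above into a standalone function (`adjust_up_sketch` is a hypothetical name) without the error check; it is meant for the case required <= available.

```cpp
// Same expression as in the function body above, without the error check.
constexpr int adjust_up_sketch(int required, int available, int step)
{
    // (available - required + step - 1) / step is the number of steps to add,
    // i.e. the ceiling of (available - required) / step for required <= available.
    return required + step * ((available - required + step - 1) / step);
}
```

With required = 6, available = 11 and step = 4, two steps are added and the result is 14. With available = 14 the result is also 14, so the returned value can equal available.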

void arm_compute::allocate_workspace	(	size_t	workspace_size,
		Tensor &	workspace,
		MemoryGroup *	memory_group,
		size_t	alignment,
		unsigned int	num_threads
	)

inline

Allocate a workspace tensor.

Parameters

[in]	workspace_size	Size to allocate.
[out]	workspace	Tensor to allocate.
[in]	memory_group	Tensor memory group.
[in]	alignment	Workspace memory alignment.
[in]	num_threads	Number of workspace threads.

Definition at line 135 of file AssemblyHelper.h.

References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_UNUSED, TensorAllocator::init(), and S8.

Referenced by setup_assembly_kernel().

 {
     ARM_COMPUTE_UNUSED(memory_group);
     ARM_COMPUTE_ERROR_ON_MSG(workspace_size == 0, "size cannot be 0");
     workspace.allocator()->init(TensorInfo(TensorShape{ (workspace_size + alignment - 1) * num_threads }, 1, DataType::S8));
     workspace.allocator()->allocate();
 }

bool arm_compute::arm_non_uniform_workgroup_supported ( const cl::Device & device )

Helper function to check whether the arm_non_uniform_work_group_size extension is supported.

Parameters

[in] device A CL device

Returns: True if the extension is supported

bool auto_init_if_empty	(	ITensorInfo &	info,
		const TensorShape &	shape,
		int	num_channels,
		DataType	data_type,
		int	fixed_point_position,
		QuantizationInfo	quantization_info = `QuantizationInfo()`
	)

inline

Auto initialize the tensor info (shape, number of channels, data type and fixed point position) if the current assignment is empty.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	shape	New shape.
[in]	num_channels	New number of channels.
[in]	data_type	New data type
[in]	fixed_point_position	New fixed point position
[in]	quantization_info	(Optional) New quantization info

Returns: True if the tensor info has been initialized

Definition at line 201 of file Helpers.inl.

References ITensorInfo::set_data_type(), ITensorInfo::set_fixed_point_position(), ITensorInfo::set_num_channels(), ITensorInfo::set_quantization_info(), ITensorInfo::set_tensor_shape(), ITensorInfo::tensor_shape(), and TensorShape::total_size().

Referenced by permute().

 {
     if(info.tensor_shape().total_size() == 0)
     {
         info.set_data_type(data_type);
         info.set_num_channels(num_channels);
         info.set_tensor_shape(shape);
         info.set_fixed_point_position(fixed_point_position);
         info.set_quantization_info(quantization_info);
         return true;
     }
 
     return false;
 }

bool auto_init_if_empty	(	ITensorInfo &	info_sink,
		const ITensorInfo &	info_source
	)

inline

Auto initialize the tensor info using another tensor info.

Parameters

[in,out]	info_sink	Tensor info used to check and assign
[in]	info_source	Tensor info used to assign

Returns: True if the tensor info has been initialized

Definition at line 221 of file Helpers.inl.

References ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensorInfo::fixed_point_position(), ITensorInfo::num_channels(), ITensorInfo::quantization_info(), ITensorInfo::set_data_layout(), ITensorInfo::set_data_type(), ITensorInfo::set_fixed_point_position(), ITensorInfo::set_num_channels(), ITensorInfo::set_quantization_info(), ITensorInfo::set_tensor_shape(), ITensorInfo::tensor_shape(), and TensorShape::total_size().

 {
     if(info_sink.tensor_shape().total_size() == 0)
     {
         info_sink.set_data_type(info_source.data_type());
         info_sink.set_num_channels(info_source.num_channels());
         info_sink.set_tensor_shape(info_source.tensor_shape());
         info_sink.set_fixed_point_position(info_source.fixed_point_position());
         info_sink.set_quantization_info(info_source.quantization_info());
         info_sink.set_data_layout(info_source.data_layout());
         return true;
     }
 
     return false;
 }

std::string arm_compute::build_information ( )

Returns the arm_compute library build information.

Contains the version number and the build options used to build the library

Returns: The arm_compute library build information

Referenced by floor_to_multiple(), and main().

TensorShape arm_compute::calculate_depth_concatenate_shape ( const std::vector< T * > & inputs_vector )

Calculate the output shapes of the depth concatenate function.

Parameters

[in] inputs_vector The vector that stores all the pointers to input.

Returns: the output shape

Definition at line 651 of file Utils.h.

References ARM_COMPUTE_ERROR_ON, arm_compute::test::fixed_point_arithmetic::detail::max(), TensorShape::set(), arm_compute::test::validation::shape, Dimensions< T >::x(), Dimensions< T >::y(), and Dimensions< T >::z().

 {
     TensorShape out_shape = inputs_vector[0]->info()->tensor_shape();
 
     size_t max_x = 0;
     size_t max_y = 0;
     size_t depth = 0;
 
     for(const auto &tensor : inputs_vector)
     {
         ARM_COMPUTE_ERROR_ON(tensor == nullptr);
         const TensorShape shape = tensor->info()->tensor_shape();
         max_x                   = std::max(shape.x(), max_x);
         max_y                   = std::max(shape.y(), max_y);
         depth += shape.z();
     }
 
     out_shape.set(0, max_x);
     out_shape.set(1, max_y);
     out_shape.set(2, depth);
 
     return out_shape;
 }
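
The rule implemented above is: take the maximum width and height over all inputs and sum their depths. A standalone sketch of the same rule on a plain struct follows; `Shape3` and `depth_concat_shape` are hypothetical names that stand in for TensorShape and the library function.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical minimal 3D shape: width (x), height (y), depth (z).
struct Shape3
{
    std::size_t x;
    std::size_t y;
    std::size_t z;
};

// Output shape of a depth concatenation: max width, max height, summed depth.
inline Shape3 depth_concat_shape(const std::vector<Shape3> &inputs)
{
    Shape3 out{ 0, 0, 0 };
    for(const auto &shape : inputs)
    {
        out.x = std::max(out.x, shape.x);
        out.y = std::max(out.y, shape.y);
        out.z += shape.z;
    }
    return out;
}
```

Concatenating shapes 27x27x96 and 25x25x128 gives 27x27x224.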

uint32_t arm_compute::calculate_matrix_scale	(	const int16_t *	matrix,
		unsigned int	matrix_size
	)

inline

Calculate the scale of the given square matrix.

The scale is the absolute value of the sum of all the coefficients in the matrix.

Note: If the coefficients add up to 0 then the scale is set to 1.

Parameters

[in]	matrix	Matrix coefficients
[in]	matrix_size	Number of elements per side of the square matrix. (Number of coefficients = matrix_size * matrix_size).

Returns: The absolute value of the sum of the coefficients if they don't add up to 0, otherwise 1.

Definition at line 637 of file Utils.h.

References arm_compute::test::fixed_point_arithmetic::detail::abs(), accumulate(), and arm_compute::test::fixed_point_arithmetic::detail::max().

 {
     const size_t size = matrix_size * matrix_size;
 
     return std::max(1, std::abs(std::accumulate(matrix, matrix + size, 0)));
 }
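
To illustrate the note about coefficients that add up to 0, the sketch below restates the computation as a standalone function (`matrix_scale` is a hypothetical name) with explicit standard headers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <numeric>

// Scale of a square matrix: |sum of coefficients|, or 1 if the sum is 0.
inline uint32_t matrix_scale(const int16_t *matrix, unsigned int matrix_size)
{
    const std::size_t size = static_cast<std::size_t>(matrix_size) * matrix_size;
    const int         sum  = std::accumulate(matrix, matrix + size, 0);
    return static_cast<uint32_t>(std::max(1, std::abs(sum)));
}
```

A 3x3 box filter of ones has scale 9, whereas a 3x3 Sobel kernel, whose coefficients sum to 0, has scale 1.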

Window arm_compute::calculate_max_enlarged_window	(	const ValidRegion &	valid_region,
		const Steps &	steps = `Steps()`,
		BorderSize	border_size = `BorderSize()`
	)

Calculate the maximum window for a given tensor shape and border setting.

The window will also include the border.

Parameters

[in]	valid_region	Valid region object defining the shape of the tensor space for which the window is created.
[in]	steps	(Optional) Number of elements processed for each step.
[in]	border_size	(Optional) Border size. The border region will be included in the window.

Returns: The maximum window the kernel can be executed on.

Referenced by calculate_max_enlarged_window(), and calculate_max_window_horizontal().

Window arm_compute::calculate_max_enlarged_window	(	const ITensorInfo &	info,
		const Steps &	steps = `Steps()`,
		BorderSize	border_size = `BorderSize()`
	)

inline

Calculate the maximum window for a given tensor shape and border setting.

The window will also include the border.

Parameters

[in]	info	Tensor info object defining the shape of the object for which the window is created.
[in]	steps	(Optional) Number of elements processed for each step.
[in]	border_size	(Optional) Border size. The border region will be included in the window.

Returns: The maximum window the kernel can be executed on.

Definition at line 457 of file Helpers.h.

References calculate_max_enlarged_window(), and ITensorInfo::valid_region().

 {
     return calculate_max_enlarged_window(info.valid_region(), steps, border_size);
 }

Window arm_compute::calculate_max_window	(	const ValidRegion &	valid_region,
		const Steps &	steps = `Steps()`,
		bool	skip_border = `false`,
		BorderSize	border_size = `BorderSize()`
	)

Calculate the maximum window for a given tensor shape and border setting.

Parameters

[in]	valid_region	Valid region object defining the shape of the tensor space for which the window is created.
[in]	steps	(Optional) Number of elements processed for each step.
[in]	skip_border	(Optional) If true exclude the border region from the window.
[in]	border_size	(Optional) Border size.

Returns: The maximum window the kernel can be executed on.

Referenced by calculate_max_window(), and update_window_and_padding().

Window arm_compute::calculate_max_window	(	const ITensorInfo &	info,
		const Steps &	steps = `Steps()`,
		bool	skip_border = `false`,
		BorderSize	border_size = `BorderSize()`
	)

inline

Calculate the maximum window for a given tensor shape and border setting.

Parameters

[in]	info	Tensor info object defining the shape of the object for which the window is created.
[in]	steps	(Optional) Number of elements processed for each step.
[in]	skip_border	(Optional) If true exclude the border region from the window.
[in]	border_size	(Optional) Border size.

Returns: The maximum window the kernel can be executed on.

Definition at line 409 of file Helpers.h.

References calculate_max_window(), calculate_max_window_horizontal(), and ITensorInfo::valid_region().

 {
     return calculate_max_window(info.valid_region(), steps, skip_border, border_size);
 }

Window arm_compute::calculate_max_window_horizontal	(	const ValidRegion &	valid_region,
		const Steps &	steps = `Steps()`,
		bool	skip_border = `false`,
		BorderSize	border_size = `BorderSize()`
	)

Calculate the maximum window used by a horizontal kernel for a given tensor shape and border setting.

Parameters

[in]	valid_region	Valid region object defining the shape of the tensor space for which the window is created.
[in]	steps	(Optional) Number of elements processed for each step.
[in]	skip_border	(Optional) If true exclude the border region from the window.
[in]	border_size	(Optional) Border size. The border region will be excluded from the window.

Returns: The maximum window the kernel can be executed on.

Referenced by calculate_max_window(), and calculate_max_window_horizontal().

Window arm_compute::calculate_max_window_horizontal	(	const ITensorInfo &	info,
		const Steps &	steps = `Steps()`,
		bool	skip_border = `false`,
		BorderSize	border_size = `BorderSize()`
	)

inline

Calculate the maximum window used by a horizontal kernel for a given tensor shape and border setting.

Parameters

[in]	info	Tensor info object defining the shape of the object for which the window is created.
[in]	steps	(Optional) Number of elements processed for each step.
[in]	skip_border	(Optional) If true exclude the border region from the window.
[in]	border_size	(Optional) Border size.

Returns: The maximum window the kernel can be executed on.

Definition at line 434 of file Helpers.h.

References calculate_max_enlarged_window(), calculate_max_window_horizontal(), and ITensorInfo::valid_region().

 {
     return calculate_max_window_horizontal(info.valid_region(), steps, skip_border, border_size);
 }

PadStrideInfo arm_compute::calculate_same_pad	(	TensorShape	input_shape,
		TensorShape	weights_shape,
		PadStrideInfo	conv_info
	)

Calculate padding requirements in case of SAME padding.

Parameters

[in]	input_shape	Input shape
[in]	weights_shape	Weights shape
[in]	conv_info	Convolution information (containing strides)

Returns: PadStrideInfo for SAME padding

Referenced by data_type_for_convolution_matrix().

TensorShape arm_compute::calculate_subsampled_shape	(	const TensorShape &	shape,
		Format	format,
		Channel	channel = `Channel::UNKNOWN`
	)

inline

Calculate subsampled shape for a given format and channel.

Parameters

[in]	shape	Shape of the tensor to calculate the extracted channel.
[in]	format	Format of the tensor.
[in]	channel	Channel to create tensor shape to be extracted.

Returns: The subsampled tensor shape.

Definition at line 715 of file Utils.h.

References has_format_horizontal_subsampling(), has_format_vertical_subsampling(), TensorShape::set(), U, UNKNOWN, and V.

Referenced by arm_compute::test::validation::reference::channel_extract(), and error_on_tensors_not_subsampled().

 {
     TensorShape output{ shape };
 
     // Subsample shape only for U or V channel
     if(Channel::U == channel || Channel::V == channel || Channel::UNKNOWN == channel)
     {
         // Subsample width for the tensor shape when channel is U or V
         if(has_format_horizontal_subsampling(format))
         {
             output.set(0, output.x() / 2U);
         }
 
         // Subsample height for the tensor shape when channel is U or V
         if(has_format_vertical_subsampling(format))
         {
             output.set(1, output.y() / 2U);
         }
     }
 
     return output;
 }
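
The halving step above can be tried in isolation. In the sketch below the two subsampling properties are passed in as flags instead of being derived from a Format, since which formats are subsampled in each direction is decided by has_format_horizontal_subsampling() and has_format_vertical_subsampling(), which are not shown here. `Shape2` and `subsample` are hypothetical names.

```cpp
#include <cstddef>

// Hypothetical minimal 2D shape.
struct Shape2
{
    std::size_t w;
    std::size_t h;
};

// Halve width and/or height, as done for the U and V channels of subsampled formats.
constexpr Shape2 subsample(Shape2 shape, bool horizontal, bool vertical)
{
    return Shape2{ horizontal ? shape.w / 2 : shape.w, vertical ? shape.h / 2 : shape.h };
}
```

A 640x480 luma shape becomes 320x240 when both directions are subsampled (as in 4:2:0 sampling) and 320x480 when only the horizontal direction is.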

ValidRegion arm_compute::calculate_valid_region_scale	(	const ITensorInfo &	src_info,
		const TensorShape &	dst_shape,
		InterpolationPolicy	interpolate_policy,
		SamplingPolicy	sampling_policy,
		bool	border_undefined
	)

Helper function to calculate the Valid Region for Scale.

Parameters

[in]	src_info	Input tensor info used to check.
[in]	dst_shape	Shape of the output.
[in]	interpolate_policy	Interpolation policy.
[in]	sampling_policy	Sampling policy.
[in]	border_undefined	True if the border is undefined.

Returns: The corresponding valid region

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), arm_compute::test::validation::FIXTURE_DATA_TEST_CASE(), and permute().

auto arm_compute::ceil_to_multiple	(	S	value,
		T	divisor
	)		-> decltype(((value + divisor - 1) / divisor) * divisor)

inline

Computes the smallest number greater than or equal to value that is a multiple of divisor.

Parameters

[in]	value	Lower bound value
[in]	divisor	Value to compute multiple of.

Returns: the result.

Definition at line 64 of file Utils.h.

References ARM_COMPUTE_ERROR_ON, and DIV_CEIL().

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), and Window::scale().

 {
     ARM_COMPUTE_ERROR_ON(value < 0 || divisor <= 0);
     return DIV_CEIL(value, divisor) * divisor;
 }
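
The trailing return type in the signature above spells out the arithmetic: ((value + divisor - 1) / divisor) * divisor. A minimal integer-only sketch of that expression, with `ceil_to_multiple_sketch` as a hypothetical name and without the error check:

```cpp
// Round value up to the next multiple of divisor (value >= 0, divisor > 0).
constexpr int ceil_to_multiple_sketch(int value, int divisor)
{
    return ((value + divisor - 1) / divisor) * divisor;
}
```

For example, 13 rounded up to a multiple of 4 is 16, and a value that is already a multiple, such as 16, is returned unchanged.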

int arm_compute::channel_idx_from_format	(	Format	format,
		Channel	channel
	)

inline

Return the channel index of a given channel given an input format.

Parameters

[in]	format	Input format
[in]	channel	Input channel

Returns: The channel index of the specific channel of the specific format

Definition at line 318 of file Utils.h.

References A, ARM_COMPUTE_ERROR, B, G, IYUV, NV12, NV21, R, RGB888, RGBA8888, U, UYVY422, V, Y, YUV444, and YUYV422.

Referenced by arm_compute::test::validation::reference::channel_extract().

 {
     switch(format)
     {
         case Format::RGB888:
         {
             switch(channel)
             {
                 case Channel::R:
                     return 0;
                 case Channel::G:
                     return 1;
                 case Channel::B:
                     return 2;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::RGBA8888:
         {
             switch(channel)
             {
                 case Channel::R:
                     return 0;
                 case Channel::G:
                     return 1;
                 case Channel::B:
                     return 2;
                 case Channel::A:
                     return 3;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::YUYV422:
         {
             switch(channel)
             {
                 case Channel::Y:
                     return 0;
                 case Channel::U:
                     return 1;
                 case Channel::V:
                     return 3;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::UYVY422:
         {
             switch(channel)
             {
                 case Channel::Y:
                     return 1;
                 case Channel::U:
                     return 0;
                 case Channel::V:
                     return 2;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::NV12:
         {
             switch(channel)
             {
                 case Channel::Y:
                     return 0;
                 case Channel::U:
                     return 0;
                 case Channel::V:
                     return 1;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::NV21:
         {
             switch(channel)
             {
                 case Channel::Y:
                     return 0;
                 case Channel::U:
                     return 1;
                 case Channel::V:
                     return 0;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::YUV444:
         case Format::IYUV:
         {
             switch(channel)
             {
                 case Channel::Y:
                     return 0;
                 case Channel::U:
                     return 0;
                 case Channel::V:
                     return 0;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         default:
             ARM_COMPUTE_ERROR("Not supported format");
             return 0;
     }
 }
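
For the packed formats, the returned index reads naturally as a byte offset inside one macro pixel. The sketch below shows this for YUYV422, whose macro pixel is described above as the bytes Y0, U0, Y1, V0; `MacroPixel`, `Ch`, `yuyv422_index` and `first_sample` are hypothetical names, and the interpretation as a byte offset is an assumption drawn from the Format descriptions.

```cpp
#include <cstdint>

// One YUYV422 macro pixel: bytes Y0 U0 Y1 V0 (two horizontally adjacent pixels).
struct MacroPixel
{
    uint8_t bytes[4];
};

enum class Ch
{
    Y,
    U,
    V
};

// Index of the first sample of each channel, matching the YUYV422 case above.
constexpr int yuyv422_index(Ch channel)
{
    return channel == Ch::Y ? 0 : (channel == Ch::U ? 1 : 3);
}

// Read the first sample of the requested channel from a macro pixel.
inline uint8_t first_sample(const MacroPixel &pixel, Ch channel)
{
    return pixel.bytes[yuyv422_index(channel)];
}
```

For UYVY422 the same idea applies with the indices 1, 0 and 2 for Y, U and V, as listed in the switch above.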

void arm_compute::colorconvert_iyuv_to_nv12	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert IYUV to NV12.

Parameters

[in]	input	Input IYUV data buffer.
[out]	output	Output NV12 buffer.
[in]	win	Window for iterating the buffers.

Definition at line 601 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IMultiImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in_y(input_ptr->plane(0), win);
     Iterator in_u(input_ptr->plane(1), win_uv);
     Iterator in_v(input_ptr->plane(2), win_uv);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_uv(output_ptr->plane(1), win_uv);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto   ta_y_top    = vld2q_u8(in_y.ptr());
         const auto   ta_y_bottom = vld2q_u8(in_y.ptr() + input_ptr->plane(0)->info()->strides_in_bytes().y());
         uint8x16x2_t ta_uv;
         ta_uv.val[0] = vld1q_u8(in_u.ptr());
         ta_uv.val[1] = vld1q_u8(in_v.ptr());
         //ta_y.val[0] = Y0 Y2 Y4 Y6 ...
         //ta_y.val[1] = Y1 Y3 Y5 Y7 ...
         //ta_uv.val[0] = U0 U2 U4 U6 ...
         //ta_uv.val[1] = V0 V2 V4 V6 ...
 
         vst2q_u8(out_y.ptr(), ta_y_top);
         vst2q_u8(out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(), ta_y_bottom);
         vst2q_u8(out_uv.ptr(), ta_uv);
     },
     in_y, in_u, in_v, out_y, out_uv);
 }

void arm_compute::colorconvert_iyuv_to_rgb	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert IYUV to RGB.

Parameters

[in]	input	Input IYUV data buffer.
[out]	output	Output RGB buffer.
[in]	win	Window for iterating the buffers.

Definition at line 482 of file NEColorConvertHelper.inl.

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), ITensor::info(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), ITensorInfo::strides_in_bytes(), Window::validate(), Window::x(), Dimensions< T >::y(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IMultiImage *__restrict>(input);
     const auto output_ptr = static_cast<IImage *__restrict>(output);
 
     constexpr auto element_size = alpha ? 32 : 24;
     const auto     out_stride   = output_ptr->info()->strides_in_bytes().y();
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in_y(input_ptr->plane(0), win);
     Iterator in_u(input_ptr->plane(1), win_uv);
     Iterator in_v(input_ptr->plane(2), win_uv);
     Iterator out(output_ptr, win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_y_top    = vld2q_u8(in_y.ptr());
         const auto ta_y_bottom = vld2q_u8(in_y.ptr() + input_ptr->plane(0)->info()->strides_in_bytes().y());
         const auto ta_u        = vld1q_u8(in_u.ptr());
         const auto ta_v        = vld1q_u8(in_v.ptr());
         //ta_y.val[0] = Y0 Y2 Y4 Y6 ...
         //ta_y.val[1] = Y1 Y3 Y5 Y7 ...
         //ta_u = U0 U2 U4 U6 ...
         //ta_v = V0 V2 V4 V6 ...
 
         // Convert each uint8x16_t to a float32x4x4_t
         float32x4x4_t yvec_top, yyvec_top, yvec_bottom, yyvec_bottom, uvec, vvec;
         convert_uint8x16_to_float32x4x4(ta_y_top.val[0], yvec_top);
         convert_uint8x16_to_float32x4x4(ta_y_top.val[1], yyvec_top);
         convert_uint8x16_to_float32x4x4(ta_y_bottom.val[0], yvec_bottom);
         convert_uint8x16_to_float32x4x4(ta_y_bottom.val[1], yyvec_bottom);
         convert_uint8x16_to_float32x4x4(ta_u, uvec);
         convert_uint8x16_to_float32x4x4(ta_v, vvec);
 
         yuyv_to_rgb_calculation(yvec_top.val[0], uvec.val[0], yyvec_top.val[0], vvec.val[0], out.ptr() + 0 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_top.val[1], uvec.val[1], yyvec_top.val[1], vvec.val[1], out.ptr() + 1 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_top.val[2], uvec.val[2], yyvec_top.val[2], vvec.val[2], out.ptr() + 2 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_top.val[3], uvec.val[3], yyvec_top.val[3], vvec.val[3], out.ptr() + 3 * element_size, alpha);
 
         yuyv_to_rgb_calculation(yvec_bottom.val[0], uvec.val[0], yyvec_bottom.val[0], vvec.val[0], out.ptr() + out_stride + 0 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_bottom.val[1], uvec.val[1], yyvec_bottom.val[1], vvec.val[1], out.ptr() + out_stride + 1 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_bottom.val[2], uvec.val[2], yyvec_bottom.val[2], vvec.val[2], out.ptr() + out_stride + 2 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_bottom.val[3], uvec.val[3], yyvec_bottom.val[3], vvec.val[3], out.ptr() + out_stride + 3 * element_size, alpha);
     },
     in_y, in_u, in_v, out);
 }
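
The chroma window used above is derived from the luma window by halving start and end in both dimensions, halving the X step, and forcing the Y step to 1. A small sketch of that arithmetic, assuming a stand-in Dim struct in place of Window::Dimension (Dim, halve_x and halve_y are illustrative names only):

```cpp
// Hypothetical stand-in for Window::Dimension: a half-open range [start, end) with a step.
struct Dim
{
    int start;
    int end;
    int step;
};

// X dimension of the chroma window: start, end and step are all halved.
Dim halve_x(const Dim &d)
{
    return Dim{ d.start / 2, d.end / 2, d.step / 2 };
}

// Y dimension of the chroma window: start and end are halved, the step is fixed at 1.
Dim halve_y(const Dim &d)
{
    return Dim{ d.start / 2, d.end / 2, 1 };
}
```

For instance, if the luma window advanced 32 pixels per iteration in X, the chroma window would advance 16, so both iterators stay aligned on the same 2x2 pixel blocks.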

void arm_compute::colorconvert_iyuv_to_yuv4	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert IYUV to YUV4.

Parameters

[in]	input	Input IYUV data buffer.
[out]	output	Output YUV4 buffer.
[in]	win	Window for iterating the buffers.

Definition at line 816 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IMultiImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in_y(input_ptr->plane(0), win);
     Iterator in_u(input_ptr->plane(1), win_uv);
     Iterator in_v(input_ptr->plane(2), win_uv);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_u(output_ptr->plane(1), win);
     Iterator out_v(output_ptr->plane(2), win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_y_top    = vld2q_u8(in_y.ptr());
         const auto ta_y_bottom = vld2q_u8(in_y.ptr() + input_ptr->plane(0)->info()->strides_in_bytes().y());
         const auto ta_u        = vld1q_u8(in_u.ptr());
         const auto ta_v        = vld1q_u8(in_v.ptr());
         //ta_y.val[0] = Y0 Y2 Y4 Y6 ...
         //ta_y.val[1] = Y1 Y3 Y5 Y7 ...
         //ta_u = U0 U2 U4 U6 ...
         //ta_v = V0 V2 V4 V6 ...
 
         vst2q_u8(out_y.ptr(), ta_y_top);
         vst2q_u8(out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(), ta_y_bottom);
 
         uint8x16x2_t uvec;
         uvec.val[0] = ta_u;
         uvec.val[1] = ta_u;
         vst2q_u8(out_u.ptr(), uvec);
         vst2q_u8(out_u.ptr() + output_ptr->plane(1)->info()->strides_in_bytes().y(), uvec);
 
         uint8x16x2_t vvec;
         vvec.val[0] = ta_v;
         vvec.val[1] = ta_v;
         vst2q_u8(out_v.ptr(), vvec);
         vst2q_u8(out_v.ptr() + output_ptr->plane(2)->info()->strides_in_bytes().y(), vvec);
     },
     in_y, in_u, in_v, out_y, out_u, out_v);
 }
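
In the listing above, storing the same vector in both val[0] and val[1] duplicates every chroma sample horizontally, and the second store at the next row duplicates it vertically. A scalar sketch of that 2x2 replication, assuming tightly packed planes with no row padding (upsample_chroma_2x2 is a hypothetical name):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expand a (half_w x half_h) chroma plane to (2*half_w x 2*half_h)
// by copying each source sample into a 2x2 block of the destination.
std::vector<std::uint8_t> upsample_chroma_2x2(const std::vector<std::uint8_t> &src, std::size_t half_w, std::size_t half_h)
{
    const std::size_t w = half_w * 2;
    const std::size_t h = half_h * 2;

    std::vector<std::uint8_t> dst(w * h);

    for(std::size_t y = 0; y < h; ++y)
    {
        for(std::size_t x = 0; x < w; ++x)
        {
            dst[y * w + x] = src[(y / 2) * half_w + (x / 2)];
        }
    }

    return dst;
}
```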

void arm_compute::colorconvert_nv12_to_iyuv	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert NV12 to IYUV.

Parameters

[in]	input	Input NV12 data buffer.
[out]	output	Output IYUV buffer.
[in]	win	Window for iterating the buffers.

Definition at line 649 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IMultiImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     constexpr auto shift = uv ? 0 : 1;
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in_y(input_ptr->plane(0), win);
     Iterator in_uv(input_ptr->plane(1), win_uv);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_u(output_ptr->plane(1), win_uv);
     Iterator out_v(output_ptr->plane(2), win_uv);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_y_top    = vld2q_u8(in_y.ptr());
         const auto ta_y_bottom = vld2q_u8(in_y.ptr() + input_ptr->plane(0)->info()->strides_in_bytes().y());
         const auto ta_uv       = vld2q_u8(in_uv.ptr());
         //ta_y.val[0] = Y0 Y2 Y4 Y6 ...
         //ta_y.val[1] = Y1 Y3 Y5 Y7 ...
         //ta_uv.val[0] = U0 U2 U4 U6 ...
         //ta_uv.val[1] = V0 V2 V4 V6 ...
 
         vst2q_u8(out_y.ptr(), ta_y_top);
         vst2q_u8(out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(), ta_y_bottom);
         vst1q_u8(out_u.ptr(), ta_uv.val[0 + shift]);
         vst1q_u8(out_v.ptr(), ta_uv.val[1 - shift]);
     },
     in_y, in_uv, out_y, out_u, out_v);
 }
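
The shift constant above selects which half of the de-interleaved UV pair is treated as U. The uv flag is not declared in this listing; it is presumably a compile-time parameter distinguishing U-first (NV12) from V-first (NV21) input. A scalar sketch under that assumption (Planes and split_uv are hypothetical names):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two output chroma planes.
struct Planes
{
    std::vector<std::uint8_t> u;
    std::vector<std::uint8_t> v;
};

// Split an interleaved chroma plane. u_first == true assumes U0 V0 U1 V1 ...,
// u_first == false assumes V0 U0 V1 U1 ...
Planes split_uv(const std::vector<std::uint8_t> &uv, bool u_first)
{
    const std::size_t shift = u_first ? 0 : 1;

    Planes out;
    for(std::size_t i = 0; i + 1 < uv.size(); i += 2)
    {
        out.u.push_back(uv[i + shift]);     // mirrors ta_uv.val[0 + shift]
        out.v.push_back(uv[i + 1 - shift]); // mirrors ta_uv.val[1 - shift]
    }

    return out;
}
```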

void arm_compute::colorconvert_nv12_to_rgb	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert NV12 to RGB.

Parameters

[in]	input	Input NV12 data buffer.
[out]	output	Output RGB buffer.
[in]	win	Window for iterating the buffers.

Definition at line 419 of file NEColorConvertHelper.inl.

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), ITensor::info(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), ITensorInfo::strides_in_bytes(), Window::validate(), Window::x(), Dimensions< T >::y(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IMultiImage *__restrict>(input);
     const auto output_ptr = static_cast<IImage *__restrict>(output);
 
     constexpr auto element_size = alpha ? 32 : 24;
     const auto     out_stride   = output_ptr->info()->strides_in_bytes().y();
     constexpr auto shift        = uv ? 0 : 1;
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in_y(input_ptr->plane(0), win);
     Iterator in_uv(input_ptr->plane(1), win_uv);
     Iterator out(output_ptr, win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_y_top    = vld2q_u8(in_y.ptr());
         const auto ta_y_bottom = vld2q_u8(in_y.ptr() + input_ptr->plane(0)->info()->strides_in_bytes().y());
         const auto ta_uv       = vld2q_u8(in_uv.ptr());
         //ta_y.val[0] = Y0 Y2 Y4 Y6 ...
         //ta_y.val[1] = Y1 Y3 Y5 Y7 ...
         //ta_uv.val[0] = U0 U2 U4 U6 ...
         //ta_uv.val[1] = V0 V2 V4 V6 ...
 
         // Convert each uint8x16_t to a float32x4x4_t
         float32x4x4_t yvec_top, yyvec_top, yvec_bottom, yyvec_bottom, uvec, vvec;
         convert_uint8x16_to_float32x4x4(ta_y_top.val[0], yvec_top);
         convert_uint8x16_to_float32x4x4(ta_y_top.val[1], yyvec_top);
         convert_uint8x16_to_float32x4x4(ta_y_bottom.val[0], yvec_bottom);
         convert_uint8x16_to_float32x4x4(ta_y_bottom.val[1], yyvec_bottom);
         convert_uint8x16_to_float32x4x4(ta_uv.val[0 + shift], uvec);
         convert_uint8x16_to_float32x4x4(ta_uv.val[1 - shift], vvec);
 
         yuyv_to_rgb_calculation(yvec_top.val[0], uvec.val[0], yyvec_top.val[0], vvec.val[0], out.ptr() + 0 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_top.val[1], uvec.val[1], yyvec_top.val[1], vvec.val[1], out.ptr() + 1 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_top.val[2], uvec.val[2], yyvec_top.val[2], vvec.val[2], out.ptr() + 2 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_top.val[3], uvec.val[3], yyvec_top.val[3], vvec.val[3], out.ptr() + 3 * element_size, alpha);
 
         yuyv_to_rgb_calculation(yvec_bottom.val[0], uvec.val[0], yyvec_bottom.val[0], vvec.val[0], out.ptr() + out_stride + 0 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_bottom.val[1], uvec.val[1], yyvec_bottom.val[1], vvec.val[1], out.ptr() + out_stride + 1 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_bottom.val[2], uvec.val[2], yyvec_bottom.val[2], vvec.val[2], out.ptr() + out_stride + 2 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec_bottom.val[3], uvec.val[3], yyvec_bottom.val[3], vvec.val[3], out.ptr() + out_stride + 3 * element_size, alpha);
     },
     in_y, in_uv, out);
 }
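
The output offsets above advance by element_size bytes per call. Each call receives four Y and four YY lanes, which suggests that one call writes 8 pixels, i.e. 24 bytes for RGB or 32 bytes for RGBX; that reading is inferred from the listing rather than stated in it. A sketch of the offset arithmetic under that assumption (group_offset is a hypothetical name):

```cpp
#include <cstddef>

// Hypothetical helper: byte offset of the n-th 8-pixel group on a given output row.
// Assumes 3 bytes per pixel without alpha and 4 bytes per pixel with alpha.
constexpr std::size_t group_offset(std::size_t group, bool alpha, std::size_t row, std::size_t row_stride)
{
    return row * row_stride + group * (alpha ? 32u : 24u);
}
```

With row = 1 and row_stride = out_stride this reproduces the `out.ptr() + out_stride + n * element_size` expressions used for the bottom row.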

void arm_compute::colorconvert_nv12_to_yuv4	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert NV12 to YUV4.

Parameters

[in]	input	Input NV12 data buffer.
[out]	output	Output YUV4 buffer.
[in]	win	Window for iterating the buffers.

Definition at line 758 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IMultiImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     constexpr auto shift = uv ? 0 : 1;
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in_y(input_ptr->plane(0), win);
     Iterator in_uv(input_ptr->plane(1), win_uv);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_u(output_ptr->plane(1), win);
     Iterator out_v(output_ptr->plane(2), win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_y_top    = vld2q_u8(in_y.ptr());
         const auto ta_y_bottom = vld2q_u8(in_y.ptr() + input_ptr->plane(0)->info()->strides_in_bytes().y());
         const auto ta_uv       = vld2q_u8(in_uv.ptr());
         //ta_y.val[0] = Y0 Y2 Y4 Y6 ...
         //ta_y.val[1] = Y1 Y3 Y5 Y7 ...
         //ta_uv.val[0] = U0 U2 U4 U6 ...
         //ta_uv.val[1] = V0 V2 V4 V6 ...
 
         vst2q_u8(out_y.ptr(), ta_y_top);
         vst2q_u8(out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(), ta_y_bottom);
 
         uint8x16x2_t uvec;
         uvec.val[0] = ta_uv.val[0 + shift];
         uvec.val[1] = ta_uv.val[0 + shift];
         vst2q_u8(out_u.ptr(), uvec);
         vst2q_u8(out_u.ptr() + output_ptr->plane(1)->info()->strides_in_bytes().y(), uvec);
 
         uint8x16x2_t vvec;
         vvec.val[0] = ta_uv.val[1 - shift];
         vvec.val[1] = ta_uv.val[1 - shift];
         vst2q_u8(out_v.ptr(), vvec);
         vst2q_u8(out_v.ptr() + output_ptr->plane(2)->info()->strides_in_bytes().y(), vvec);
     },
     in_y, in_uv, out_y, out_u, out_v);
 }

void arm_compute::colorconvert_rgb_to_iyuv	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert RGB to IYUV.

Parameters

[in]	input	Input RGB data buffer.
[out]	output	Output IYUV buffer.
[in]	win	Window for iterating the buffers.

Definition at line 918 of file NEColorConvertHelper.inl.

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Iterator::ptr(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in(input_ptr, win);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_u(output_ptr->plane(1), win_uv);
     Iterator out_v(output_ptr->plane(2), win_uv);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_rgb_top    = load_rgb(in.ptr(), alpha);
         const auto ta_rgb_bottom = load_rgb(in.ptr() + input_ptr->info()->strides_in_bytes().y(), alpha);
         //ta_rgb.val[0] = R0 R1 R2 R3 ...
         //ta_rgb.val[1] = G0 G1 G2 G3 ...
         //ta_rgb.val[2] = B0 B1 B2 B3 ...
 
         store_rgb_to_iyuv(ta_rgb_top.val[0], ta_rgb_top.val[1], ta_rgb_top.val[2],
                           ta_rgb_bottom.val[0], ta_rgb_bottom.val[1], ta_rgb_bottom.val[2],
                           out_y.ptr(), out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(),
                           out_u.ptr(), out_v.ptr());
     },
     in, out_y, out_u, out_v);
 }

void arm_compute::colorconvert_rgb_to_nv12	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert RGB to NV12.

Parameters

[in]	input	Input RGB data buffer.
[out]	output	Output NV12 buffer.
[in]	win	Window for iterating the buffers.

Definition at line 875 of file NEColorConvertHelper.inl.

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Iterator::ptr(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     // UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in(input_ptr, win);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_uv(output_ptr->plane(1), win_uv);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_rgb_top    = load_rgb(in.ptr(), alpha);
         const auto ta_rgb_bottom = load_rgb(in.ptr() + input_ptr->info()->strides_in_bytes().y(), alpha);
         //ta_rgb.val[0] = R0 R1 R2 R3 ...
         //ta_rgb.val[1] = G0 G1 G2 G3 ...
         //ta_rgb.val[2] = B0 B1 B2 B3 ...
 
         store_rgb_to_nv12(ta_rgb_top.val[0], ta_rgb_top.val[1], ta_rgb_top.val[2],
                           ta_rgb_bottom.val[0], ta_rgb_bottom.val[1], ta_rgb_bottom.val[2],
                           out_y.ptr(), out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(),
                           out_uv.ptr());
     },
     in, out_y, out_uv);
 }

void arm_compute::colorconvert_rgb_to_rgbx	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert RGB to RGBX.

Parameters

[in]	input	Input RGB data buffer.
[out]	output	Output RGBX buffer.
[in]	win	Window for iterating the buffers.

Definition at line 312 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, execute_window_loop(), and Iterator::ptr().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IImage *__restrict>(output);
 
     Iterator in(input_ptr, win);
     Iterator out(output_ptr, win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto   ta1 = vld3q_u8(in.ptr());
         uint8x16x4_t ta2;
         ta2.val[0] = ta1.val[0];
         ta2.val[1] = ta1.val[1];
         ta2.val[2] = ta1.val[2];
         ta2.val[3] = vdupq_n_u8(255);
         vst4q_u8(out.ptr(), ta2);
     },
     in, out);
 }
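
The loop above loads 16 RGB pixels as three channel vectors, adds a fourth vector filled with 255, and stores the four vectors interleaved. A scalar sketch of the same per-pixel result, assuming a tightly packed byte buffer (rgb_to_rgbx is a hypothetical name):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar equivalent: copy R, G, B and append a constant 255 as the fourth channel.
std::vector<std::uint8_t> rgb_to_rgbx(const std::vector<std::uint8_t> &rgb)
{
    std::vector<std::uint8_t> out;
    out.reserve(rgb.size() / 3 * 4);

    for(std::size_t i = 0; i + 2 < rgb.size(); i += 3)
    {
        out.push_back(rgb[i]);
        out.push_back(rgb[i + 1]);
        out.push_back(rgb[i + 2]);
        out.push_back(255); // same value as vdupq_n_u8(255)
    }

    return out;
}
```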

void arm_compute::colorconvert_rgb_to_yuv4	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert RGB to YUV4.

Parameters

[in]	input	Input RGB data buffer.
[out]	output	Output YUV4 buffer.
[in]	win	Window for iterating the buffers.

Definition at line 962 of file NEColorConvertHelper.inl.

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, execute_window_loop(), Iterator::ptr(), and Window::validate().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     Iterator in(input_ptr, win);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_u(output_ptr->plane(1), win);
     Iterator out_v(output_ptr->plane(2), win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_rgb = load_rgb(in.ptr(), alpha);
         //ta_rgb.val[0] = R0 R1 R2 R3 ...
         //ta_rgb.val[1] = G0 G1 G2 G3 ...
         //ta_rgb.val[2] = B0 B1 B2 B3 ...
 
         store_rgb_to_yuv4(ta_rgb.val[0], ta_rgb.val[1], ta_rgb.val[2],
                           out_y.ptr(), out_u.ptr(), out_v.ptr());
     },
     in, out_y, out_u, out_v);
 }

void arm_compute::colorconvert_rgbx_to_rgb	(	const void *	input,
		void *	output,
		const Window &	win
	)

Convert RGBX to RGB.

Parameters

[in]	input	Input RGBX data buffer.
[out]	output	Output RGB buffer.
[in]	win	Window for iterating the buffers.

Definition at line 343 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, execute_window_loop(), and Iterator::ptr().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IImage *__restrict>(output);
 
     Iterator in(input_ptr, win);
     Iterator out(output_ptr, win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto   ta1 = vld4q_u8(in.ptr());
         uint8x16x3_t ta2;
         ta2.val[0] = ta1.val[0];
         ta2.val[1] = ta1.val[1];
         ta2.val[2] = ta1.val[2];
         vst3q_u8(out.ptr(), ta2);
     },
     in, out);
 }
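
The reverse direction simply discards the fourth channel vector before the interleaved store. As a scalar sketch, again assuming a tightly packed byte buffer (rgbx_to_rgb is a hypothetical name):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar equivalent: keep bytes 0..2 of every 4-byte pixel and drop byte 3.
std::vector<std::uint8_t> rgbx_to_rgb(const std::vector<std::uint8_t> &rgbx)
{
    std::vector<std::uint8_t> out;
    out.reserve(rgbx.size() / 4 * 3);

    for(std::size_t i = 0; i + 3 < rgbx.size(); i += 4)
    {
        out.push_back(rgbx[i]);
        out.push_back(rgbx[i + 1]);
        out.push_back(rgbx[i + 2]);
    }

    return out;
}
```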

void arm_compute::colorconvert_yuyv_to_iyuv	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert YUYV to IYUV.

Parameters

[in]	input	Input YUYV data buffer.
[out]	output	Output IYUV buffer.
[in]	win	Window for iterating the buffers.

Definition at line 698 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Iterator::ptr(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     constexpr auto shift = yuyv ? 0 : 1;
 
     // Destination's UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in(input_ptr, win);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_u(output_ptr->plane(1), win_uv);
     Iterator out_v(output_ptr->plane(2), win_uv);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_top    = vld4q_u8(in.ptr());
         const auto ta_bottom = vld4q_u8(in.ptr() + input_ptr->info()->strides_in_bytes().y());
         //ta.val[0] = Y0 Y2 Y4 Y6 ...
         //ta.val[1] = U0 U2 U4 U6 ...
         //ta.val[2] = Y1 Y3 Y5 Y7 ...
         //ta.val[3] = V0 V2 V4 V6 ...
 
         uint8x16x2_t yvec;
         yvec.val[0] = ta_top.val[0 + shift];
         yvec.val[1] = ta_top.val[2 + shift];
         vst2q_u8(out_y.ptr(), yvec);
 
         uint8x16x2_t yyvec;
         yyvec.val[0] = ta_bottom.val[0 + shift];
         yyvec.val[1] = ta_bottom.val[2 + shift];
         vst2q_u8(out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(), yyvec);
 
         uint8x16_t uvec;
         uvec = vhaddq_u8(ta_top.val[1 - shift], ta_bottom.val[1 - shift]);
         vst1q_u8(out_u.ptr(), uvec);
 
         uint8x16_t vvec;
         vvec = vhaddq_u8(ta_top.val[3 - shift], ta_bottom.val[3 - shift]);
         vst1q_u8(out_v.ptr(), vvec);
     },
     in, out_y, out_u, out_v);
 }
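
The vertical chroma subsampling above relies on vhaddq_u8(), which adds two unsigned 8-bit lanes and halves the result without overflowing. A one-lane scalar sketch of that arithmetic (halving_add is a hypothetical name):

```cpp
#include <cstdint>

// One lane of a halving add: (a + b) >> 1, computed in a wider type so the sum cannot wrap.
inline std::uint8_t halving_add(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>((static_cast<unsigned int>(a) + static_cast<unsigned int>(b)) >> 1);
}
```

The result is truncated rather than rounded, so averaging 1 and 2 yields 1.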

void arm_compute::colorconvert_yuyv_to_nv12	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert YUYV to NV12.

Parameters

[in]	input	Input YUYV data buffer.
[out]	output	Output NV12 buffer.
[in]	win	Window for iterating the buffers.

Definition at line 546 of file NEColorConvertHelper.inl.

References ARM_COMPUTE_ERROR_ON, Window::DimX, Window::DimY, Window::Dimension::end(), execute_window_loop(), Iterator::ptr(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), Window::validate(), Window::x(), and Window::y().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
     win.validate();
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IMultiImage *__restrict>(output);
 
     constexpr auto shift = yuyv ? 0 : 1;
 
     // NV12's UV's width and height are subsampled
     Window win_uv(win);
     win_uv.set(Window::DimX, Window::Dimension(win_uv.x().start() / 2, win_uv.x().end() / 2, win_uv.x().step() / 2));
     win_uv.set(Window::DimY, Window::Dimension(win_uv.y().start() / 2, win_uv.y().end() / 2, 1));
     win_uv.validate();
 
     Iterator in(input_ptr, win);
     Iterator out_y(output_ptr->plane(0), win);
     Iterator out_uv(output_ptr->plane(1), win_uv);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         const auto ta_top    = vld4q_u8(in.ptr());
         const auto ta_bottom = vld4q_u8(in.ptr() + input_ptr->info()->strides_in_bytes().y());
         //ta.val[0] = Y0 Y2 Y4 Y6 ...
         //ta.val[1] = U0 U2 U4 U6 ...
         //ta.val[2] = Y1 Y3 Y5 Y7 ...
         //ta.val[3] = V0 V2 V4 V6 ...
 
         uint8x16x2_t yvec;
         yvec.val[0] = ta_top.val[0 + shift];
         yvec.val[1] = ta_top.val[2 + shift];
         vst2q_u8(out_y.ptr(), yvec);
 
         uint8x16x2_t yyvec;
         yyvec.val[0] = ta_bottom.val[0 + shift];
         yyvec.val[1] = ta_bottom.val[2 + shift];
         vst2q_u8(out_y.ptr() + output_ptr->plane(0)->info()->strides_in_bytes().y(), yyvec);
 
         uint8x16x2_t uvvec;
         uvvec.val[0] = vhaddq_u8(ta_top.val[1 - shift], ta_bottom.val[1 - shift]);
         uvvec.val[1] = vhaddq_u8(ta_top.val[3 - shift], ta_bottom.val[3 - shift]);
         vst2q_u8(out_uv.ptr(), uvvec);
     },
     in, out_y, out_uv);
 }
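
The `val[0 + shift]`, `val[1 - shift]`, `val[2 + shift]` and `val[3 - shift]` indices above pick luma and chroma out of a 4-byte 4:2:2 group. The yuyv flag is not declared in this listing; the sketch below assumes it distinguishes Y0 U Y1 V ordering from U Y0 V Y1 ordering (Yuv422Pair and unpack_422 are hypothetical names):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: two luma samples sharing one U and one V sample.
struct Yuv422Pair
{
    std::uint8_t y0;
    std::uint8_t u;
    std::uint8_t y1;
    std::uint8_t v;
};

// yuyv == true assumes bytes ordered Y0 U Y1 V; yuyv == false assumes U Y0 V Y1.
Yuv422Pair unpack_422(const std::array<std::uint8_t, 4> &p, bool yuyv)
{
    const std::size_t shift = yuyv ? 0 : 1;
    return Yuv422Pair{ p[0 + shift], p[1 - shift], p[2 + shift], p[3 - shift] };
}
```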

void arm_compute::colorconvert_yuyv_to_rgb	(	const void *__restrict	input,
		void *__restrict	output,
		const Window &	win
	)

Convert YUYV to RGB.

Parameters

[in]	input	Input YUYV data buffer.
[out]	output	Output RGB buffer.
[in]	win	Window for iterating the buffers.

Definition at line 374 of file NEColorConvertHelper.inl.

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, execute_window_loop(), and Iterator::ptr().

 {
     ARM_COMPUTE_ERROR_ON(nullptr == input);
     ARM_COMPUTE_ERROR_ON(nullptr == output);
 
     const auto input_ptr  = static_cast<const IImage *__restrict>(input);
     const auto output_ptr = static_cast<IImage *__restrict>(output);
 
     constexpr auto element_size = alpha ? 32 : 24;
     constexpr auto shift        = yuyv ? 0 : 1;
 
     Iterator in(input_ptr, win);
     Iterator out(output_ptr, win);
 
     execute_window_loop(win, [&](const Coordinates & id)
     {
         float32x4x4_t uvec, yvec, vvec, yyvec;
         const auto    ta = vld4q_u8(in.ptr());
         //ta.val[0] = Y0 Y2 Y4 Y6 ...
         //ta.val[1] = U0 U2 U4 U6 ...
         //ta.val[2] = Y1 Y3 Y5 Y7 ...
         //ta.val[3] = V0 V2 V4 V6 ...
 
         // Convert each uint8x16_t to a float32x4x4_t
         convert_uint8x16_to_float32x4x4(ta.val[0 + shift], yvec);
         convert_uint8x16_to_float32x4x4(ta.val[1 - shift], uvec);
         convert_uint8x16_to_float32x4x4(ta.val[2 + shift], yyvec);
         convert_uint8x16_to_float32x4x4(ta.val[3 - shift], vvec);
 
         yuyv_to_rgb_calculation(yvec.val[0], uvec.val[0], yyvec.val[0], vvec.val[0], out.ptr() + 0 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec.val[1], uvec.val[1], yyvec.val[1], vvec.val[1], out.ptr() + 1 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec.val[2], uvec.val[2], yyvec.val[2], vvec.val[2], out.ptr() + 2 * element_size, alpha);
         yuyv_to_rgb_calculation(yvec.val[3], uvec.val[3], yyvec.val[3], vvec.val[3], out.ptr() + 3 * element_size, alpha);
     },
     in, out);
 }
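
The body of yuyv_to_rgb_calculation() is not shown in this listing. As a rough scalar illustration of a YUV to RGB step, the sketch below assumes BT.709 coefficients and chroma centred on 128; both are assumptions and may differ from what the library actually does (Rgb, clamp_u8 and yuv_to_rgb are hypothetical names):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result type for one pixel.
struct Rgb
{
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Saturate a float to the [0, 255] range and truncate to 8 bits.
inline std::uint8_t clamp_u8(float x)
{
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, x)));
}

// Assumed BT.709 conversion; the coefficients are not taken from the listing above.
Rgb yuv_to_rgb(std::uint8_t y, std::uint8_t u, std::uint8_t v)
{
    const float yf = static_cast<float>(y);
    const float uf = static_cast<float>(u) - 128.0f;
    const float vf = static_cast<float>(v) - 128.0f;

    return Rgb{ clamp_u8(yf + 1.5748f * vf),
                clamp_u8(yf - 0.1873f * uf - 0.4681f * vf),
                clamp_u8(yf + 1.8556f * uf) };
}
```

With neutral chroma (U = V = 128) every channel equals Y, which is a quick sanity check for any coefficient set.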

Strides arm_compute::compute_strides	(	const ITensorInfo &	info,
		T	stride_x,
		Ts &&...	fixed_strides
	)

inline

Create a strides object based on the provided strides and the tensor dimensions.

Parameters

[in]	info	Tensor info object providing the shape of the tensor for unspecified strides.
[in]	stride_x	Stride to be used in X dimension (in bytes).
[in]	fixed_strides	Strides to be used in higher dimensions starting at Y (in bytes).

Returns: Strides object based on the specified strides. Missing strides are calculated based on the tensor shape and the strides of lower dimensions.

Definition at line 501 of file Helpers.h.

References ITensorInfo::num_dimensions(), Dimensions< T >::set(), arm_compute::test::validation::shape, and ITensorInfo::tensor_shape().

Referenced by compute_strides().

 {
     const TensorShape &shape = info.tensor_shape();
 
     // Create strides object
     Strides strides(stride_x, fixed_strides...);
 
     for(size_t i = 1 + sizeof...(Ts); i < info.num_dimensions(); ++i)
     {
         strides.set(i, shape[i - 1] * strides[i - 1]);
     }
 
     return strides;
 }
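
The loop above fills in every stride that was not given explicitly as the previous dimension's extent times the previous stride. A self-contained sketch of the same recurrence, assuming std::vector in place of Strides and TensorShape (compute_strides_sketch is a hypothetical name):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in: 'fixed' holds stride_x followed by any explicitly given
// higher-dimension strides; the remaining strides are derived from the shape.
std::vector<std::size_t> compute_strides_sketch(const std::vector<std::size_t> &shape, std::vector<std::size_t> fixed)
{
    std::vector<std::size_t> strides(std::move(fixed));

    // At least stride_x is required, otherwise there is nothing to derive from.
    if(strides.empty())
    {
        return strides;
    }

    for(std::size_t i = strides.size(); i < shape.size(); ++i)
    {
        strides.push_back(shape[i - 1] * strides[i - 1]);
    }

    return strides;
}
```

For a 4 x 3 x 2 tensor of 4-byte elements this gives strides of 4, 16 and 48 bytes; supplying a padded row stride of 32 bytes gives 4, 32 and 96.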

Strides arm_compute::compute_strides ( const ITensorInfo & info )

inline

Create a strides object based on the tensor dimensions.

Parameters

[in] info Tensor info object used to compute the strides.

Returns: Strides object based on element size and tensor shape.

Definition at line 523 of file Helpers.h.

References compute_strides(), and ITensorInfo::element_size().

 {
     return compute_strides(info, info.element_size());
 }

int coords2index	(	const TensorShape &	shape,
		const Coordinates &	coord
	)

inline

Convert n-dimensional coordinates into a linear index.

Parameters

[in]	shape	Shape of the n-dimensional tensor.
[in]	coord	N-dimensional coordinates.

Returns: Linear index.

Definition at line 322 of file Helpers.inl.

References ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_UNUSED, Dimensions< T >::num_dimensions(), and TensorShape::total_size().

Referenced by arm_compute::test::validation::reference::convert_fully_connected_weights(), permute(), and arm_compute::test::validation::reference::winograd_input_transform().

 {
     int num_elements = shape.total_size();
     ARM_COMPUTE_UNUSED(num_elements);
     ARM_COMPUTE_ERROR_ON_MSG(num_elements == 0, "Cannot create linear index from empty shape!");
 
     int index  = 0;
     int stride = 1;
 
     for(unsigned int d = 0; d < coord.num_dimensions(); ++d)
     {
         index += coord[d] * stride;
         stride *= shape[d];
     }
 
     return index;
 }
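
The loop above accumulates each coordinate multiplied by the product of all lower-dimension extents. A standalone sketch of the same computation, assuming std::vector<int> in place of TensorShape and Coordinates (coords_to_index is a hypothetical name):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: dimension 0 varies fastest, as in the listing above.
int coords_to_index(const std::vector<int> &shape, const std::vector<int> &coord)
{
    int index  = 0;
    int stride = 1;

    for(std::size_t d = 0; d < coord.size() && d < shape.size(); ++d)
    {
        index += coord[d] * stride;
        stride *= shape[d];
    }

    return index;
}
```

For a shape of (4, 3, 2), the coordinate (1, 2, 1) maps to 1 + 2 * 4 + 1 * 12 = 21.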

std::unique_ptr<Kernel> arm_compute::create_configure_kernel ( T &&... args )

Helper function to create and return a unique_ptr pointing to a CL/GLES kernel object. It also calls the kernel's configure() function.

Parameters

[in] args All the arguments that need to be passed to the kernel's configure() function.

Returns: A unique pointer pointing to a CL/GLES kernel object.

Definition at line 74 of file Helpers.h.

 {
     std::unique_ptr<Kernel> k = arm_compute::support::cpp14::make_unique<Kernel>();
     k->configure(std::forward<T>(args)...);
     return k;
 }
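
A usage sketch of the create-then-configure pattern, assuming a made-up DemoKernel type with a two-argument configure(); neither DemoKernel nor create_configure_kernel_sketch exists in the library, and plain `new` is used here so the example also builds as C++11:

```cpp
#include <memory>
#include <utility>

// Hypothetical kernel type used only for illustration.
struct DemoKernel
{
    int width  = 0;
    int height = 0;

    void configure(int w, int h)
    {
        width  = w;
        height = h;
    }
};

// Same shape as the helper above: construct the kernel, then forward all arguments to configure().
template <typename Kernel, typename... T>
std::unique_ptr<Kernel> create_configure_kernel_sketch(T &&... args)
{
    std::unique_ptr<Kernel> k(new Kernel());
    k->configure(std::forward<T>(args)...);
    return k;
}
```

A call such as `create_configure_kernel_sketch<DemoKernel>(640, 480)` returns an owned kernel that has already been configured, so the caller cannot forget the second step.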

Status arm_compute::create_error	(	ErrorCode	error_code,
		const char *	function,
		const char *	file,
		const int	line,
		const char *	msg,
			...
	)

Creates an error containing the error message.

Parameters

[in]	error_code	Error code
[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	msg	Message to display before aborting.
[in]	...	Variable number of arguments of the message.

Returns: status containing the error

Referenced by Status::throw_if_error().

Status arm_compute::create_error_va_list	(	ErrorCode	error_code,
		const char *	function,
		const char *	file,
		const int	line,
		const char *	msg,
		va_list	args
	)

Creates an error containing the error message from variable argument list.

Parameters

[in]	error_code	Error code
[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	msg	Message to display before aborting.
[in]	args	Variable argument list of the message.

Returns: status containing the error

Referenced by Status::throw_if_error().

std::unique_ptr<Kernel> arm_compute::create_kernel ( )

Helper function to create and return a unique_ptr pointing to a CL/GLES kernel object.

Returns: A unique pointer pointing to a Kernel object.

Definition at line 86 of file Helpers.h.

Referenced by GCKernelLibrary::set_context(), and CLKernelLibrary::set_device().

 {
     std::unique_ptr<Kernel> k = arm_compute::support::cpp14::make_unique<Kernel>();
     return k;
 }

size_t arm_compute::data_size_from_type ( DataType data_type )

inline

The size in bytes of the data type.

Parameters

[in] data_type Input data type

Returns: The size in bytes of the data type

Definition at line 107 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, F64, QASYMM8, QS16, QS32, QS8, S16, S32, S64, S8, SIZET, U16, U32, U64, and U8.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), TensorInfo::element_size(), and AlexNetNetwork< ITensorType, TensorType, SubTensorType, Accessor, ActivationLayerFunction, ConvolutionLayerFunction, DirectConvolutionLayerFunction, FullyConnectedLayerFunction, NormalizationLayerFunction, PoolingLayerFunction, SoftmaxLayerFunction >::init().

 {
     switch(data_type)
     {
         case DataType::U8:
         case DataType::S8:
         case DataType::QS8:
         case DataType::QASYMM8:
             return 1;
         case DataType::U16:
         case DataType::S16:
         case DataType::F16:
         case DataType::QS16:
             return 2;
         case DataType::F32:
         case DataType::U32:
         case DataType::S32:
         case DataType::QS32:
             return 4;
         case DataType::F64:
         case DataType::U64:
         case DataType::S64:
             return 8;
         case DataType::SIZET:
             return sizeof(size_t);
         default:
             ARM_COMPUTE_ERROR("Invalid data type");
             return 0;
     }
 }
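
A typical use of a per-element size is to derive the byte size of a densely packed buffer. The sketch below assumes a hypothetical ElemType enum covering a small subset of the types above and ignores padding and strides; none of these names belong to the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subset of element types, for illustration only.
enum class ElemType
{
    U8,
    S16,
    F32,
    F64
};

// Size in bytes of one element, mirroring the switch shown above.
inline std::size_t element_bytes(ElemType t)
{
    switch(t)
    {
        case ElemType::U8:
            return 1;
        case ElemType::S16:
            return 2;
        case ElemType::F32:
            return 4;
        case ElemType::F64:
            return 8;
    }
    return 0; // unreachable for the enumerators above
}

// Bytes needed for a densely packed buffer (no padding, no strides).
inline std::size_t buffer_bytes(const std::vector<std::size_t> &shape, ElemType t)
{
    std::size_t count = 1;
    for(std::size_t extent : shape)
    {
        count *= extent;
    }
    return count * element_bytes(t);
}
```

Under these assumptions a 640 x 480 U8 image needs 307200 bytes and the same image in F32 needs four times as much.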

std::pair<DataType, DataType> arm_compute::data_type_for_convolution	(	const int16_t *	conv_col,
		const int16_t *	conv_row,
		size_t	size
	)

inline

Calculate the accuracy required by the horizontal and vertical convolution computations.

Parameters

[in]	conv_col	Pointer to the vertical vector of the separated convolution filter
[in]	conv_row	Pointer to the horizontal vector of the convolution filter
[in]	size	Number of elements per vector of the separated matrix

Returns: A pair of data types. The first element is the biggest data type needed for the first stage; the second element is the biggest data type needed for the second stage.

Definition at line 747 of file Utils.h.

References accumulate(), S16, S32, U16, and UNKNOWN.

 {
     DataType first_stage  = DataType::UNKNOWN;
     DataType second_stage = DataType::UNKNOWN;
 
     auto gez = [](const int16_t &v)
     {
         return v >= 0;
     };
 
     auto accu_neg = [](const int &first, const int &second)
     {
         return first + (second < 0 ? second : 0);
     };
 
     auto accu_pos = [](const int &first, const int &second)
     {
         return first + (second > 0 ? second : 0);
     };
 
     const bool only_positive_coefficients = std::all_of(conv_row, conv_row + size, gez) && std::all_of(conv_col, conv_col + size, gez);
 
     if(only_positive_coefficients)
     {
         const int max_row_value = std::accumulate(conv_row, conv_row + size, 0) * UINT8_MAX;
         const int max_value     = std::accumulate(conv_col, conv_col + size, 0) * max_row_value;
 
         first_stage = (max_row_value <= UINT16_MAX) ? DataType::U16 : DataType::S32;
 
         second_stage = (max_value <= UINT16_MAX) ? DataType::U16 : DataType::S32;
     }
     else
     {
         const int min_row_value  = std::accumulate(conv_row, conv_row + size, 0, accu_neg) * UINT8_MAX;
         const int max_row_value  = std::accumulate(conv_row, conv_row + size, 0, accu_pos) * UINT8_MAX;
         const int neg_coeffs_sum = std::accumulate(conv_col, conv_col + size, 0, accu_neg);
         const int pos_coeffs_sum = std::accumulate(conv_col, conv_col + size, 0, accu_pos);
         const int min_value      = neg_coeffs_sum * max_row_value + pos_coeffs_sum * min_row_value;
         const int max_value      = neg_coeffs_sum * min_row_value + pos_coeffs_sum * max_row_value;
 
         first_stage = ((INT16_MIN <= min_row_value) && (max_row_value <= INT16_MAX)) ? DataType::S16 : DataType::S32;
 
         second_stage = ((INT16_MIN <= min_value) && (max_value <= INT16_MAX)) ? DataType::S16 : DataType::S32;
     }
 
     return std::make_pair(first_stage, second_stage);
 }
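
The range arithmetic in the mixed-sign branch can be checked by hand. The sketch below is a simplified, hypothetical helper (Range, range_after_filter and fits_s16 are not library names) that propagates a value interval through one filter stage, assuming 8-bit unsigned input for the first stage.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Closed interval of values an intermediate result can take.
struct Range
{
    int min;
    int max;
};

// Hypothetical helper: range of sum(c[i] * x[i]) when every input x[i] lies in `in`.
inline Range range_after_filter(const std::vector<std::int16_t> &coeffs, Range in)
{
    int neg = 0; // sum of the negative coefficients
    int pos = 0; // sum of the non-negative coefficients
    for(std::int16_t c : coeffs)
    {
        (c < 0 ? neg : pos) += c;
    }
    // Negative coefficients reach their minimum on the largest input, and vice versa.
    return Range{ neg * in.max + pos * in.min, neg * in.min + pos * in.max };
}

// True if the whole interval is representable as a signed 16-bit integer.
inline bool fits_s16(Range r)
{
    return INT16_MIN <= r.min && r.max <= INT16_MAX;
}
```

For the separable Sobel pair row = {-1, 0, 1} and col = {1, 2, 1}, the first stage spans [-255, 255] and the second stage spans [-1020, 1020], so both stages fit in S16.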

DataType arm_compute::data_type_for_convolution_matrix	(	const int16_t *	conv,
		size_t	size
	)

inline

Calculate the accuracy required by the squared convolution calculation.

Parameters

[in]	conv	Pointer to the squared convolution matrix
[in]	size	The total size of the convolution matrix

Returns: The biggest data type needed to do the convolution.

Definition at line 803 of file Utils.h.

References accumulate(), S16, S32, and U16.

 {
     auto gez = [](const int16_t v)
     {
         return v >= 0;
     };
 
     const bool only_positive_coefficients = std::all_of(conv, conv + size, gez);
 
     if(only_positive_coefficients)
     {
         const int max_conv_value = std::accumulate(conv, conv + size, 0) * UINT8_MAX;
         if(max_conv_value <= UINT16_MAX)
         {
             return DataType::U16;
         }
         else
         {
             return DataType::S32;
         }
     }
     else
     {
         const int min_value = std::accumulate(conv, conv + size, 0, [](int a, int b)
         {
             return b < 0 ? a + b : a;
         })
         * UINT8_MAX;
 
         const int max_value = std::accumulate(conv, conv + size, 0, [](int a, int b)
         {
             return b > 0 ? a + b : a;
         })
         * UINT8_MAX;
 
         if((INT16_MIN <= min_value) && (INT16_MAX >= max_value))
         {
             return DataType::S16;
         }
         else
         {
             return DataType::S32;
         }
     }
 }

DataType arm_compute::data_type_from_format ( Format format )

inline

Return the data type used by a given single-planar pixel format.

Parameters

[in] format Input format

Returns: The data type used by the given pixel format.

Definition at line 213 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, IYUV, NV12, NV21, RGB888, RGBA8888, S16, S32, U16, U32, U8, UNKNOWN, UV88, UYVY422, YUV444, and YUYV422.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), and SimpleTensor< T >::data_type().

 {
     switch(format)
     {
         case Format::U8:
         case Format::UV88:
         case Format::RGB888:
         case Format::RGBA8888:
         case Format::YUYV422:
         case Format::UYVY422:
             return DataType::U8;
         case Format::U16:
             return DataType::U16;
         case Format::S16:
             return DataType::S16;
         case Format::U32:
             return DataType::U32;
         case Format::S32:
             return DataType::S32;
         case Format::F16:
             return DataType::F16;
         case Format::F32:
             return DataType::F32;
         //Doesn't make sense for planar formats:
         case Format::NV12:
         case Format::NV21:
         case Format::IYUV:
         case Format::YUV444:
         default:
             ARM_COMPUTE_ERROR("Not supported data_type for given format");
             return DataType::UNKNOWN;
     }
 }

const std::pair<unsigned int, unsigned int> arm_compute::deconvolution_output_dimensions	(	unsigned int	in_width,
		unsigned int	in_height,
		unsigned int	kernel_width,
		unsigned int	kernel_height,
		unsigned int	padx,
		unsigned int	pady,
		unsigned int	inner_border_right,
		unsigned int	inner_border_top,
		unsigned int	stride_x,
		unsigned int	stride_y
	)

Returns expected width and height of the deconvolution's output tensor.

Parameters

[in]	in_width	Width of input tensor (Number of columns)
[in]	in_height	Height of input tensor (Number of rows)
[in]	kernel_width	Kernel width.
[in]	kernel_height	Kernel height.
[in]	padx	X axis padding.
[in]	pady	Y axis padding.
[in]	inner_border_right	The number of zeros added to right edge of the input.
[in]	inner_border_top	The number of zeros added to top edge of the input.
[in]	stride_x	X axis input stride.
[in]	stride_y	Y axis input stride.

Returns: A pair with the new width in the first position and the new height in the second.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), and data_type_for_convolution_matrix().

TensorShape arm_compute::deconvolution_output_shape	(	const std::pair< unsigned int, unsigned int > &	out_dims,
		TensorShape	input,
		TensorShape	weights
	)

Returns expected shape for the deconvolution output tensor.

Parameters

[in]	out_dims	Width and height of the output tensor. These values can be obtained with the function deconvolution_output_dimensions().
[in]	input	Shape of the input tensor.
[in]	weights	Shape of the weights tensor.

Returns: Deconvolution output tensor shape.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), and data_type_for_convolution_matrix().

T arm_compute::delta_bilinear_c1	(	const T *	pixel_ptr,
		size_t	stride,
		float	dx,
		float	dy
	)

inline

Computes bilinear interpolation using the pointer to the top-left pixel and the pixel's distance between the real coordinates and the smallest following integer coordinates.

Input must be in single channel format.

Parameters

[in]	pixel_ptr	Pointer to the top-left pixel value of a single channel input.
[in]	stride	Stride to access the bottom-left and bottom-right pixel values
[in]	dx	Pixel's distance between the X real coordinate and the smallest X following integer
[in]	dy	Pixel's distance between the Y real coordinate and the smallest Y following integer

Note: dx and dy must be in the range [0, 1.0]

Returns: The bilinear interpolated pixel value

Definition at line 127 of file Helpers.h.

References ARM_COMPUTE_ERROR_ON.

Referenced by pixel_bilinear_c1(), and pixel_bilinear_c1_clamp().

 {
     ARM_COMPUTE_ERROR_ON(pixel_ptr == nullptr);
 
     const float dx1 = 1.0f - dx;
     const float dy1 = 1.0f - dy;
 
     const T a00 = *pixel_ptr;
     const T a01 = *(pixel_ptr + 1);
     const T a10 = *(pixel_ptr + stride);
     const T a11 = *(pixel_ptr + stride + 1);
 
     const float w1 = dx1 * dy1;
     const float w2 = dx * dy1;
     const float w3 = dx1 * dy;
     const float w4 = dx * dy;
 
     return static_cast<T>(a00 * w1 + a01 * w2 + a10 * w3 + a11 * w4);
 }
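
The four weights always sum to 1, so the result is a convex combination of the four neighbouring pixels. A minimal stand-alone sketch, restricted to float pixels and using the hypothetical name bilinear_c1:

```cpp
#include <cassert>
#include <cstddef>

// Stand-alone sketch of the weighting used above, for float pixels only.
// `stride` is the number of elements between two consecutive rows.
inline float bilinear_c1(const float *top_left, std::size_t stride, float dx, float dy)
{
    assert(top_left != nullptr);

    const float a00 = top_left[0];          // top-left
    const float a01 = top_left[1];          // top-right
    const float a10 = top_left[stride];     // bottom-left
    const float a11 = top_left[stride + 1]; // bottom-right

    const float dx1 = 1.0f - dx;
    const float dy1 = 1.0f - dy;

    return a00 * dx1 * dy1 + a01 * dx * dy1 + a10 * dx1 * dy + a11 * dx * dy;
}
```

With the 2 x 2 patch {10, 20, 30, 40} and a stride of 2, dx = dy = 0.5 gives the mean of the four pixels, 25, while dx = dy = 0 returns the top-left pixel unchanged.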

T arm_compute::delta_linear_c1_x	(	const T *	pixel_ptr,
		float	dx
	)

inline

Computes linear interpolation using the pointer to the left pixel and the pixel's distance between the real coordinates and the smallest following integer coordinates.

Input must be in single channel format.

Parameters

[in]	pixel_ptr	Pointer to the left pixel value of a single channel input.
[in]	dx	Pixel's distance between the X real coordinate and the smallest X following integer

Note: dx must be in the range [0, 1.0]

Returns: The linear interpolated pixel value

Definition at line 184 of file Helpers.h.

References ARM_COMPUTE_ERROR_ON.

Referenced by pixel_bilinear_c1_clamp().

 {
     ARM_COMPUTE_ERROR_ON(pixel_ptr == nullptr);
 
     const T a00 = *pixel_ptr;
     const T a01 = *(pixel_ptr + 1);
 
     const float dx1 = 1.0f - dx;
 
     const float w1 = dx1;
     const float w2 = dx;
 
     return static_cast<T>(a00 * w1 + a01 * w2);
 }

T arm_compute::delta_linear_c1_y	(	const T *	pixel_ptr,
		size_t	stride,
		float	dy
	)

inline

Computes linear interpolation using the pointer to the top pixel and the pixel's distance between the real coordinates and the smallest following integer coordinates.

Input must be in single channel format.

Parameters

[in]	pixel_ptr	Pointer to the top pixel value of a single channel input.
[in]	stride	Stride to access the bottom pixel value
[in]	dy	Pixel's distance between the Y real coordinate and the smallest Y following integer

Note: dy must be in the range [0, 1.0]

Returns: The linear interpolated pixel value

Definition at line 159 of file Helpers.h.

References ARM_COMPUTE_ERROR_ON.

Referenced by pixel_bilinear_c1_clamp().

 {
     ARM_COMPUTE_ERROR_ON(pixel_ptr == nullptr);
 
     const float dy1 = 1.0f - dy;
 
     const T a00 = *pixel_ptr;
     const T a10 = *(pixel_ptr + stride);
 
     const float w1 = dy1;
     const float w3 = dy;
 
     return static_cast<T>(a00 * w1 + a10 * w3);
 }

bool arm_compute::device_supports_extension	(	const cl::Device &	device,
		const char *	extension_name
	)

Helper function to check whether a given extension is supported.

Parameters

[in]	device	A CL device
[in]	extension_name	Name of the extension to be checked

Returns: True if the extension is supported

Referenced by CLScheduler::default_init().
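
OpenCL reports device extensions as a single space-separated string (the CL_DEVICE_EXTENSIONS query). The sketch below shows one way such a check could be written against that string; extension_listed is a hypothetical helper, not the library's implementation, and it takes the string directly so that it needs no OpenCL headers.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: `extensions` is a space-separated list of extension names.
// Returns true only if `name` matches one whole entry of the list.
inline bool extension_listed(const std::string &extensions, const std::string &name)
{
    std::istringstream stream(extensions);
    std::string        token;
    while(stream >> token)
    {
        if(token == name)
        {
            return true;
        }
    }
    return false;
}
```

Comparing whole tokens avoids a false positive when the requested name is only a prefix of a listed extension.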

constexpr auto arm_compute::DIV_CEIL	(	S	val,
		T	m
	)		-> decltype((val + m - 1) / m)

Calculate the rounded up quotient of val / m.

Parameters

[in]	val	Value to divide and round up.
[in]	m	Value to divide by.

Returns: the result.

Definition at line 51 of file Utils.h.

Referenced by ceil_to_multiple().

 {
     return (val + m - 1) / m;
 }
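
Because the function is constexpr, its behaviour can be checked at compile time. The sketch below reproduces the expression under the hypothetical name div_ceil and adds a hypothetical round_up_to_multiple helper built on it. It assumes non-negative val and positive m, and that val + m - 1 does not overflow.

```cpp
#include <cassert>

// Same expression as above, reproduced so the sketch compiles on its own.
template <typename S, typename T>
constexpr auto div_ceil(S val, T m) -> decltype((val + m - 1) / m)
{
    return (val + m - 1) / m;
}

// Hypothetical helper: round val up to the next multiple of m.
template <typename S, typename T>
constexpr auto round_up_to_multiple(S val, T m) -> decltype(div_ceil(val, m) * m)
{
    return div_ceil(val, m) * m;
}

static_assert(div_ceil(10, 4) == 3, "10 / 4 rounds up to 3");
static_assert(div_ceil(8, 4) == 2, "exact multiples are unchanged");
static_assert(div_ceil(1, 16) == 1, "any positive remainder rounds up");
static_assert(round_up_to_multiple(10, 4) == 12, "next multiple of 4 after 10");
```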

size_t arm_compute::element_size_from_data_type ( DataType dt )

inline

The size in bytes of the data type.

Parameters

[in] dt Input data type

Returns: The size in bytes of the data type

Definition at line 182 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, QASYMM8, QS16, QS32, QS8, S16, S32, S8, U16, U32, and U8.

Referenced by SimpleTensor< T >::element_size(), error_on_value_not_representable_in_fixed_point(), and arm_compute::test::validation::validate().

 {
     switch(dt)
     {
         case DataType::S8:
         case DataType::U8:
         case DataType::QS8:
         case DataType::QASYMM8:
             return 1;
         case DataType::U16:
         case DataType::S16:
         case DataType::QS16:
         case DataType::F16:
             return 2;
         case DataType::U32:
         case DataType::S32:
         case DataType::F32:
         case DataType::QS32:
             return 4;
         default:
             ARM_COMPUTE_ERROR("Undefined element size for given data type");
             return 0;
     }
 }

void arm_compute::enqueue	(	IGCKernel &	kernel,
		const Window &	window,
		const gles::NDRange &	lws = `gles::NDRange(1U, 1U, 1U)`
	)

Add the kernel to the command queue with the given window.

Note: Depending on the size of the window, this might translate into several jobs being enqueued. If kernel->kernel() is empty, the function returns without adding anything to the queue.

Parameters

[in]	kernel	Kernel to enqueue
[in]	window	Window the kernel has to process.
[in]	lws	Local workgroup size requested, by default (1, 1, 1)

Note: If any dimension of the lws is greater than the global workgroup size then no lws will be passed.

void arm_compute::enqueue	(	cl::CommandQueue &	queue,
		ICLKernel &	kernel,
		const Window &	window,
		const cl::NDRange &	lws_hint = `CLKernelLibrary::get().default_ndrange()`
	)

Add the kernel to the command queue with the given window.

Note: Depending on the size of the window, this might translate into several jobs being enqueued. If kernel->kernel() is empty, the function returns without adding anything to the queue.

Parameters

[in,out]	queue	OpenCL command queue.
[in]	kernel	Kernel to enqueue
[in]	window	Window the kernel has to process.
[in]	lws_hint	Local workgroup size requested. Default is based on the device target.

Note: If any dimension of the lws is greater than the global workgroup size then no lws will be passed.

Referenced by IGCKernel::get_target(), and ICLKernel::get_target().

void arm_compute::error	(	const char *	function,
		const char *	file,
		const int	line,
		const char *	msg,
			...
	)

Print an error message then throw an std::runtime_error.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	msg	Message to display before aborting.
[in]	...	Variable number of arguments of the message.

Referenced by Framework::error_on_missing_assets(), main(), and Status::throw_if_error().

arm_compute::Status arm_compute::error_on_channel_not_in	(	const char *	function,
		const char *	file,
		const int	line,
		T	cn,
		T &&	channel,
		Ts &&...	channels
	)

inline

Return an error if the channel is not in channels.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	cn	Input channel
[in]	channel	First channel allowed.
[in]	channels	(Optional) Further allowed channels.

Returns: Status

Definition at line 835 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, and UNKNOWN.

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(cn == Channel::UNKNOWN, function, file, line);
 
     const std::array<T, sizeof...(Ts)> channels_array{ { std::forward<Ts>(channels)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(channel != cn && std::none_of(channels_array.begin(), channels_array.end(), [&](const T & f)
     {
         return f == cn;
     }),
     function, file, line);
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_channel_not_in_known_format	(	const char *	function,
		const char *	file,
		const int	line,
		Format	fmt,
		Channel	cn
	)

Return an error if the channel is not in format.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	fmt	Input format
[in]	cn	Channel to look for in the format.

Returns: Status

arm_compute::Status arm_compute::error_on_coordinates_dimensions_gte	(	const char *	function,
		const char *	file,
		const int	line,
		const Coordinates &	pos,
		unsigned int	max_dim
	)

Return an error if the passed coordinates have too many dimensions.

The coordinates have too many dimensions if any dimension with an index greater than or equal to max_dim is different from 0.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	pos	Coordinates to validate
[in]	max_dim	Maximum number of dimensions allowed.

Returns: Status
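
The rule can be sketched as a predicate over the trailing coordinates. The example assumes a plain std::vector<int> in place of Coordinates; has_too_many_dimensions is a hypothetical name and returns a bool rather than a Status.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: true if any coordinate at index >= max_dim is non-zero.
inline bool has_too_many_dimensions(const std::vector<int> &pos, std::size_t max_dim)
{
    if(max_dim >= pos.size())
    {
        return false; // nothing beyond max_dim to inspect
    }
    return std::any_of(pos.begin() + static_cast<std::ptrdiff_t>(max_dim), pos.end(),
                       [](int c) { return c != 0; });
}
```

With max_dim = 2, the coordinates (3, 2, 0, 0) are accepted, whereas (3, 2, 1, 0) are rejected because dimension 2 is non-zero.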

arm_compute::Status arm_compute::error_on_data_type_channel_not_in	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info,
		size_t	num_channels,
		T &&	dt,
		Ts &&...	dts
	)

inline

Return an error if the data type or the number of channels of the passed tensor info does not match any of the data types and number of channels provided.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info	Tensor info to validate.
[in]	num_channels	Number of channels to check
[in]	dt	First data type allowed.
[in]	dts	(Optional) Further allowed data types.

Returns: Status

Definition at line 774 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ARM_COMPUTE_RETURN_ON_ERROR, error_on_data_type_not_in(), and ITensorInfo::num_channels().

Referenced by error_on_data_type_channel_not_in().

 {
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_data_type_not_in(function, file, line, tensor_info, std::forward<T>(dt), std::forward<Ts>(dts)...));
     const size_t tensor_nc = tensor_info->num_channels();
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(tensor_nc != num_channels, function, file, line, "Number of channels %d. Required number of channels %d", tensor_nc, num_channels);
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_data_type_channel_not_in	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor,
		size_t	num_channels,
		T &&	dt,
		Ts &&...	dts
	)

inline

Return an error if the data type or the number of channels of the passed tensor does not match any of the data types and number of channels provided.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor	Tensor to validate.
[in]	num_channels	Number of channels to check
[in]	dt	First data type allowed.
[in]	dts	(Optional) Further allowed data types.

Returns: Status

Definition at line 795 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ON_ERROR, error_on_data_type_channel_not_in(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(error_on_data_type_channel_not_in(function, file, line, tensor->info(), num_channels, std::forward<T>(dt), std::forward<Ts>(dts)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_data_type_not_in	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info,
		T &&	dt,
		Ts &&...	dts
	)

inline

Return an error if the data type of the passed tensor info does not match any of the data types provided.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info	Tensor info to validate.
[in]	dt	First data type allowed.
[in]	dts	(Optional) Further allowed data types.

Returns: Status

Definition at line 721 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ITensorInfo::data_type(), string_from_data_type(), and UNKNOWN.

Referenced by error_on_data_type_channel_not_in(), and error_on_data_type_not_in().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_info == nullptr, function, file, line);
 
     const DataType &tensor_dt = tensor_info->data_type(); //NOLINT
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_dt == DataType::UNKNOWN, function, file, line);
 
     const std::array<T, sizeof...(Ts)> dts_array{ { std::forward<Ts>(dts)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(tensor_dt != dt && std::none_of(dts_array.begin(), dts_array.end(), [&](const T & d)
     {
         return d == tensor_dt;
     }),
     function, file, line, "ITensor data type %s not supported by this kernel", string_from_data_type(tensor_dt).c_str());
     return arm_compute::Status{};
 }
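
The "first allowed value plus optional further values" signature is a common variadic idiom. The sketch below isolates it with a hypothetical DType enum and is_one_of function. Unlike the code above, it stores the first value in the same array as the rest, which keeps the array non-empty.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical data type enum, for illustration only.
enum class DType
{
    U8,
    S16,
    F32,
    UNKNOWN
};

// True if `value` equals `first` or any of the optional further values.
template <typename T, typename... Ts>
bool is_one_of(T value, T first, Ts... rest)
{
    const std::array<T, 1 + sizeof...(Ts)> allowed{ { first, rest... } };
    return std::any_of(allowed.begin(), allowed.end(), [&](const T &candidate)
    {
        return candidate == value;
    });
}
```

Requiring at least one allowed value in the signature turns a call with an empty list into a compile error rather than a check that always fails.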

arm_compute::Status arm_compute::error_on_data_type_not_in	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor,
		T &&	dt,
		Ts &&...	dts
	)

inline

Return an error if the data type of the passed tensor does not match any of the data types provided.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor	Tensor to validate.
[in]	dt	First data type allowed.
[in]	dts	(Optional) Further allowed data types.

Returns: Status

Definition at line 749 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ON_ERROR, error_on_data_type_not_in(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_data_type_not_in(function, file, line, tensor->info(), std::forward<T>(dt), std::forward<Ts>(dts)...));
     return arm_compute::Status{};
 }

void arm_compute::error_on_format_not_in	(	const char *	function,
		const char *	file,
		const int	line,
		const T *	object,
		F &&	format,
		Fs &&...	formats
	)

Throw an error if the format of the passed tensor/multi-image does not match any of the formats provided.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	object	Tensor/multi-image to validate.
[in]	format	First format allowed.
[in]	formats	(Optional) Further allowed formats.

Definition at line 688 of file Validate.h.

References ARM_COMPUTE_ERROR_ON_LOC, ARM_COMPUTE_ERROR_ON_LOC_MSG, ARM_COMPUTE_UNUSED, string_from_format(), and UNKNOWN.

 {
     ARM_COMPUTE_ERROR_ON_LOC(object == nullptr, function, file, line);
 
     Format &&object_format = object->info()->format();
     ARM_COMPUTE_UNUSED(object_format);
 
     ARM_COMPUTE_ERROR_ON_LOC(object_format == Format::UNKNOWN, function, file, line);
 
     const std::array<F, sizeof...(Fs)> formats_array{ { std::forward<Fs>(formats)... } };
     ARM_COMPUTE_UNUSED(formats_array);
 
     ARM_COMPUTE_ERROR_ON_LOC_MSG(object_format != format && std::none_of(formats_array.begin(), formats_array.end(), [&](const F & f)
     {
         return f == object_format;
     }),
     function, file, line, "Format %s not supported by this kernel", string_from_format(object_format).c_str());
 }

arm_compute::Status arm_compute::error_on_invalid_multi_hog	(	const char *	function,
		const char *	file,
		const int	line,
		const IMultiHOG *	multi_hog
	)

Return an error if the IMultiHOG container is invalid.

An IMultiHOG container is invalid if:

it is a nullptr
it doesn't contain models
its HOG data objects do not all share the same phase_type, normalization_type and l2_hyst_threshold (if normalization_type == L2HYS_NORM)

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	multi_hog	IMultiHOG container to validate

Returns: Status

arm_compute::Status arm_compute::error_on_invalid_subtensor	(	const char *	function,
		const char *	file,
		const int	line,
		const TensorShape &	parent_shape,
		const Coordinates &	coords,
		const TensorShape &	shape
	)

Return an error if the coordinates and shape of the subtensor are not within the parent tensor.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	parent_shape	Parent tensor shape
[in]	coords	Coordinates inside the parent tensor where the first element of the subtensor is
[in]	shape	Shape of the subtensor

Returns: Status
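
The implementation is not shown here. As an assumption, the containment test can be sketched per dimension: the start coordinate must lie inside the parent, and start plus extent must not exceed the parent's extent. subtensor_fits is a hypothetical name and uses plain vectors instead of TensorShape and Coordinates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: true if the subtensor lies entirely inside the parent.
// All three vectors are assumed to have the same number of dimensions.
inline bool subtensor_fits(const std::vector<int> &parent_shape,
                           const std::vector<int> &coords,
                           const std::vector<int> &shape)
{
    assert(coords.size() == parent_shape.size());
    assert(shape.size() == parent_shape.size());

    for(std::size_t d = 0; d < parent_shape.size(); ++d)
    {
        const bool starts_outside = coords[d] < 0 || coords[d] >= parent_shape[d];
        const bool ends_outside   = coords[d] + shape[d] > parent_shape[d];
        if(starts_outside || ends_outside)
        {
            return false;
        }
    }
    return true;
}
```

In an 8 x 8 parent, a 4 x 4 subtensor starting at (4, 4) fits exactly, while the same subtensor starting at (6, 2) overruns dimension 0.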

arm_compute::Status arm_compute::error_on_invalid_subtensor_valid_region	(	const char *	function,
		const char *	file,
		const int	line,
		const ValidRegion &	parent_valid_region,
		const ValidRegion &	valid_region
	)

Return an error if the valid region of a subtensor is not inside the valid region of the parent tensor.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	parent_valid_region	Parent valid region.
[in]	valid_region	Valid region of subtensor.

Returns: Status

arm_compute::Status arm_compute::error_on_invalid_subwindow	(	const char *	function,
		const char *	file,
		const int	line,
		const Window &	full,
		const Window &	sub
	)

Return an error if the passed subwindow is invalid.

The subwindow is invalid if:

It is not a valid window.
It is not fully contained inside the full window
The step for each of its dimensions is not identical to the corresponding one of the full window.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	full	Full size window
[in]	sub	Sub-window to validate.

Returns: Status

arm_compute::Status arm_compute::error_on_mismatching_data_layouts	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info,
		Ts...	tensor_infos
	)

inline

Return an error if the passed tensor infos have different data layouts.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info	The first tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 457 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ARM_COMPUTE_RETURN_ON_ERROR, ITensorInfo::data_layout(), and error_on_nullptr().

Referenced by error_on_mismatching_data_layouts().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_info == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensor_infos)...));
 
     DataLayout &&tensor_data_layout = tensor_info->data_layout();
     const std::array<const ITensorInfo *, sizeof...(Ts)> tensors_infos_array{ { std::forward<Ts>(tensor_infos)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensors_infos_array.begin(), tensors_infos_array.end(), [&](const ITensorInfo * tensor_info_obj)
     {
         return tensor_info_obj->data_layout() != tensor_data_layout;
     }),
     function, file, line, "Tensors have different data layouts");
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_data_layouts	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor,
		Ts...	tensors
	)

inline

Return an error if the passed tensors have different data layouts.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor	The first tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 483 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ON_ERROR, error_on_mismatching_data_layouts(), error_on_nullptr(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensors)...));
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_mismatching_data_layouts(function, file, line, tensor->info(),
                                                                                  detail::get_tensor_info_t<ITensorInfo *>()(tensors)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_data_types	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info,
		Ts...	tensor_infos
	)

inline

Return an error if the passed two tensor infos have different data types.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info	The first tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 508 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ARM_COMPUTE_RETURN_ON_ERROR, ITensorInfo::data_type(), and error_on_nullptr().

Referenced by error_on_mismatching_data_types().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_info == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensor_infos)...));
 
     DataType &&tensor_data_type = tensor_info->data_type();
     const std::array<const ITensorInfo *, sizeof...(Ts)> tensors_infos_array{ { std::forward<Ts>(tensor_infos)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensors_infos_array.begin(), tensors_infos_array.end(), [&](const ITensorInfo * tensor_info_obj)
     {
         return tensor_info_obj->data_type() != tensor_data_type;
     }),
     function, file, line, "Tensors have different data types");
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_data_types	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor,
		Ts...	tensors
	)

inline

Return an error if the passed two tensors have different data types.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor	The first tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 534 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ON_ERROR, error_on_mismatching_data_types(), error_on_nullptr(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensors)...));
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_mismatching_data_types(function, file, line, tensor->info(),
                                                                                detail::get_tensor_info_t<ITensorInfo *>()(tensors)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_dimensions	(	const char *	function,
		const char *	file,
		int	line,
		const Dimensions< T > &	dim1,
		const Dimensions< T > &	dim2,
		Ts &&...	dims
	)

Return an error if the passed dimension objects differ.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	dim1	The first object to be compared.
[in]	dim2	The second object to be compared.
[in]	dims	(Optional) Further allowed objects.

Returns: Status

Definition at line 280 of file Validate.h.

References ARM_COMPUTE_RETURN_ON_ERROR, and arm_compute::detail::for_each_error().

 {
     ARM_COMPUTE_RETURN_ON_ERROR(detail::for_each_error(detail::compare_dimension<T>(dim1, function, file, line), dim2, std::forward<Ts>(dims)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_fixed_point	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info_1,
		const ITensorInfo *	tensor_info_2,
		Ts...	tensor_infos
	)

inline

Return an error if the passed tensor infos have different fixed point data types or different fixed point positions.

Note: If the first tensor info doesn't have a fixed point data type, the function returns an empty Status without reporting an error.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info_1	The first tensor info to be compared.
[in]	tensor_info_2	The second tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 562 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ITensorInfo::data_type(), ITensorInfo::fixed_point_position(), and is_data_type_fixed_point().

Referenced by error_on_mismatching_fixed_point().

 {
     DataType &&first_data_type            = tensor_info_1->data_type();
     const int  first_fixed_point_position = tensor_info_1->fixed_point_position();
 
     if(!is_data_type_fixed_point(first_data_type))
     {
         return arm_compute::Status{};
     }
 
     const std::array < const ITensorInfo *, 1 + sizeof...(Ts) > tensor_infos_array{ { tensor_info_2, std::forward<Ts>(tensor_infos)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensor_infos_array.begin(), tensor_infos_array.end(), [&](const ITensorInfo * tensor_info)
     {
         return tensor_info->data_type() != first_data_type;
     }),
     function, file, line, "Tensors have different fixed point data types");
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensor_infos_array.begin(), tensor_infos_array.end(), [&](const ITensorInfo * tensor_info)
     {
         return tensor_info->fixed_point_position() != first_fixed_point_position;
     }),
     function, file, line, "Tensors have different fixed point positions");
 
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_fixed_point	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor_1,
		const ITensor *	tensor_2,
		Ts...	tensors
	)

inline

Return an error if the passed tensors have different fixed point data types or different fixed point positions.

Note: If the first tensor doesn't have a fixed point data type, the function returns an empty Status without reporting an error.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_1	The first tensor to be compared.
[in]	tensor_2	The second tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 601 of file Validate.h.

References ARM_COMPUTE_RETURN_ON_ERROR, error_on_mismatching_fixed_point(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_mismatching_fixed_point(function, file, line, tensor_1->info(), tensor_2->info(),
                                                                                 detail::get_tensor_info_t<ITensorInfo *>()(tensors)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_fixed_point_position	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info_1,
		const ITensorInfo *	tensor_info_2,
		Ts...	tensor_infos
	)

inline

Return an error if the input fixed-point positions are different.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info_1	The first tensor info to be compared.
[in]	tensor_info_2	The second tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 955 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, and ITensorInfo::fixed_point_position().

Referenced by error_on_mismatching_fixed_point_position().

 {
     const std::array < const ITensorInfo *, 1 + sizeof...(Ts) > tensor_info_array{ { tensor_info_2, std::forward<Ts>(tensor_infos)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensor_info_array.begin(), tensor_info_array.end(), [&](const ITensorInfo * tensor_info)
     {
         return tensor_info->fixed_point_position() != tensor_info_1->fixed_point_position();
     }),
     function, file, line, "Tensors have different fixed-point positions");
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_fixed_point_position	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor_1,
		const ITensor *	tensor_2,
		Ts...	tensors
	)

inline

Return an error if the input fixed-point positions are different.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_1	The first tensor to be compared.
[in]	tensor_2	The second tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 978 of file Validate.h.

References ARM_COMPUTE_RETURN_ON_ERROR, error_on_mismatching_fixed_point_position(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_mismatching_fixed_point_position(function, file, line, tensor_1->info(), tensor_2->info(),
                                                                                          detail::get_tensor_info_t<ITensorInfo *>()(tensors)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_quantization_info	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info_1,
		const ITensorInfo *	tensor_info_2,
		Ts...	tensor_infos
	)

inline

Return an error if the passed tensor infos have different asymmetric quantized data types or different quantization info.

Note: If the first tensor info doesn't have an asymmetric quantized data type, the function returns an empty Status without reporting an error.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info_1	The first tensor info to be compared.
[in]	tensor_info_2	The second tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 627 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ITensorInfo::data_type(), is_data_type_quantized_asymmetric(), and ITensorInfo::quantization_info().

Referenced by error_on_mismatching_quantization_info().

 {
     DataType             &&first_data_type         = tensor_info_1->data_type();
     const QuantizationInfo first_quantization_info = tensor_info_1->quantization_info();
 
     if(!is_data_type_quantized_asymmetric(first_data_type))
     {
         return arm_compute::Status{};
     }
 
     const std::array < const ITensorInfo *, 1 + sizeof...(Ts) > tensor_infos_array{ { tensor_info_2, std::forward<Ts>(tensor_infos)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensor_infos_array.begin(), tensor_infos_array.end(), [&](const ITensorInfo * tensor_info)
     {
         return tensor_info->data_type() != first_data_type;
     }),
     function, file, line, "Tensors have different asymmetric quantized data types");
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensor_infos_array.begin(), tensor_infos_array.end(), [&](const ITensorInfo * tensor_info)
     {
         return tensor_info->quantization_info() != first_quantization_info;
     }),
     function, file, line, "Tensors have different quantization information");
 
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_quantization_info	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor_1,
		const ITensor *	tensor_2,
		Ts...	tensors
	)

inline

Return an error if the passed tensors have different asymmetric quantized data types or different quantization info.

Note: If the first tensor doesn't have an asymmetric quantized data type, the function returns an empty Status without reporting an error.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_1	The first tensor to be compared.
[in]	tensor_2	The second tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 666 of file Validate.h.

References ARM_COMPUTE_RETURN_ON_ERROR, error_on_mismatching_quantization_info(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_mismatching_quantization_info(function, file, line, tensor_1->info(), tensor_2->info(),
                                                                                       detail::get_tensor_info_t<ITensorInfo *>()(tensors)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_shapes	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensorInfo *	tensor_info_1,
		const ITensorInfo *	tensor_info_2,
		Ts...	tensor_infos
	)

inline

Return an error if the passed two tensor infos have different shapes from the given dimension.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_info_1	The first tensor info to be compared.
[in]	tensor_info_2	The second tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 368 of file Validate.h.

References U.

Referenced by error_on_mismatching_shapes().

 {
     return error_on_mismatching_shapes(function, file, line, 0U, tensor_info_1, tensor_info_2, std::forward<Ts>(tensor_infos)...);
 }

arm_compute::Status arm_compute::error_on_mismatching_shapes	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor_1,
		const ITensor *	tensor_2,
		Ts...	tensors
	)

inline

Return an error if the passed two tensors have different shapes from the given dimension.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor_1	The first tensor to be compared.
[in]	tensor_2	The second tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 385 of file Validate.h.

References error_on_mismatching_shapes(), and U.

 {
     return error_on_mismatching_shapes(function, file, line, 0U, tensor_1, tensor_2, std::forward<Ts>(tensors)...);
 }

arm_compute::Status arm_compute::error_on_mismatching_shapes	(	const char *	function,
		const char *	file,
		const int	line,
		unsigned int	upper_dim,
		const ITensorInfo *	tensor_info_1,
		const ITensorInfo *	tensor_info_2,
		Ts...	tensor_infos
	)

inline

Return an error if the passed tensor infos have different shapes from the given dimension.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	upper_dim	The dimension from which to check.
[in]	tensor_info_1	The first tensor info to be compared.
[in]	tensor_info_2	The second tensor info to be compared.
[in]	tensor_infos	(Optional) Further allowed tensor infos.

Returns: Status

Definition at line 403 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ARM_COMPUTE_RETURN_ON_ERROR, error_on_nullptr(), and arm_compute::detail::have_different_dimensions().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_info_1 == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_info_2 == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensor_infos)...));
 
     const std::array < const ITensorInfo *, 2 + sizeof...(Ts) > tensors_info_array{ { tensor_info_1, tensor_info_2, std::forward<Ts>(tensor_infos)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(std::next(tensors_info_array.cbegin()), tensors_info_array.cend(), [&](const ITensorInfo * tensor_info)
     {
         return detail::have_different_dimensions((*tensors_info_array.cbegin())->tensor_shape(), tensor_info->tensor_shape(), upper_dim);
     }),
     function, file, line, "Tensors have different shapes");
     return arm_compute::Status{};
 }
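The shape check above compares every further tensor info against the first one, starting at upper_dim. A minimal standalone sketch of that pattern follows. The Shape alias and the names differ_from and shapes_mismatch are hypothetical stand-ins, and the sketch assumes that detail::have_different_dimensions() compares all dimensions with an index greater than or equal to upper_dim.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <initializer_list>

namespace sketch
{
// Hypothetical stand-in for a tensor shape with a fixed maximum of 6 dimensions.
using Shape = std::array<std::size_t, 6>;

// Assumed behaviour: true if the shapes differ in any dimension >= upper_dim.
inline bool differ_from(const Shape &a, const Shape &b, unsigned int upper_dim)
{
    for(std::size_t i = upper_dim; i < a.size(); ++i)
    {
        if(a[i] != b[i])
        {
            return true;
        }
    }
    return false;
}

// Compare every further shape against the first one, as the listing does with std::any_of.
inline bool shapes_mismatch(unsigned int upper_dim, const Shape &first, std::initializer_list<Shape> others)
{
    return std::any_of(others.begin(), others.end(), [&](const Shape & other)
    {
        return differ_from(first, other, upper_dim);
    });
}
} // namespace sketch
```

With shapes {4,3,2,1,1,1} and {8,3,2,1,1,1}, the sketch reports a mismatch for upper_dim 0 but not for upper_dim 1, because only dimension 0 differs.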

arm_compute::Status arm_compute::error_on_mismatching_shapes	(	const char *	function,
		const char *	file,
		const int	line,
		unsigned int	upper_dim,
		const ITensor *	tensor_1,
		const ITensor *	tensor_2,
		Ts...	tensors
	)

inline

Return an error if the passed two tensors have different shapes from the given dimension.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	upper_dim	The dimension from which to check.
[in]	tensor_1	The first tensor to be compared.
[in]	tensor_2	The second tensor to be compared.
[in]	tensors	(Optional) Further allowed tensors.

Returns: Status

Definition at line 431 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ON_ERROR, error_on_mismatching_shapes(), error_on_nullptr(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_1 == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor_2 == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensors)...));
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_mismatching_shapes(function, file, line, upper_dim, tensor_1->info(), tensor_2->info(),
                                                                            detail::get_tensor_info_t<ITensorInfo *>()(tensors)...));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_mismatching_windows	(	const char *	function,
		const char *	file,
		const int	line,
		const Window &	full,
		const Window &	win
	)

Return an error if the passed window is invalid.

The subwindow is invalid if:

It is not a valid window.
Its dimensions don't match those of the full window.
The step of any of its dimensions is not identical to the corresponding step of the full window.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	full	Full size window
[in]	win	Window to validate.

Returns: Status

arm_compute::Status arm_compute::error_on_nullptr	(	const char *	function,
		const char *	file,
		const int	line,
		Ts &&...	pointers
	)

inline

Create an error if one of the pointers is a nullptr.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	pointers	Pointers to check against nullptr.

Returns: Status

Definition at line 151 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG.

Referenced by error_on_mismatching_data_layouts(), error_on_mismatching_data_types(), error_on_mismatching_shapes(), error_on_tensors_not_even(), and error_on_tensors_not_subsampled().

 {
     const std::array<const void *, sizeof...(Ts)> pointers_array{ { std::forward<Ts>(pointers)... } };
     bool has_nullptr = std::any_of(pointers_array.begin(), pointers_array.end(), [&](const void *ptr)
     {
         return (ptr == nullptr);
     });
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(has_nullptr, function, file, line, "Nullptr object!");
     return arm_compute::Status{};
 }
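The nullptr check above packs a variadic list of pointers into a std::array of const void * and scans it with std::any_of. A minimal standalone sketch of that idiom, without the library's Status type or error macros, could look like this. The name any_nullptr is hypothetical.

```cpp
#include <algorithm>
#include <array>

namespace sketch
{
// Returns true if at least one of the passed pointers is null.
// Intended to be called with one or more pointer arguments of any object type.
template <typename... Ts>
bool any_nullptr(Ts... pointers)
{
    // Every object pointer converts implicitly to const void *, so mixed pointer types are fine.
    const std::array<const void *, sizeof...(Ts)> pointers_array{ { pointers... } };
    return std::any_of(pointers_array.begin(), pointers_array.end(), [](const void *ptr)
    {
        return ptr == nullptr;
    });
}
} // namespace sketch
```

Erasing the pointee type through const void * is what allows one function to validate tensors, tensor infos and kernels in a single call.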

arm_compute::Status arm_compute::error_on_tensor_not_2d	(	const char *	function,
		const char *	file,
		const int	line,
		const ITensor *	tensor
	)

Return an error if the tensor is not 2D.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	tensor	Tensor to validate.

Returns: Status

arm_compute::Status arm_compute::error_on_tensors_not_even	(	const char *	function,
		const char *	file,
		int	line,
		const Format &	format,
		const ITensor *	tensor1,
		Ts...	tensors
	)

Return an error if the passed tensor objects are not even.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	format	Format to check if odd shape is allowed
[in]	tensor1	The first object to be compared for odd shape.
[in]	tensors	(Optional) Further allowed objects.

Returns: Status

Definition at line 303 of file Validate.h.

References adjust_odd_shape(), ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ARM_COMPUTE_RETURN_ON_ERROR, error_on_nullptr(), and arm_compute::detail::have_different_dimensions().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor1 == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensors)...));
     const std::array < const ITensor *, 1 + sizeof...(Ts) > tensors_info_array{ { tensor1, std::forward<Ts>(tensors)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensors_info_array.cbegin(), tensors_info_array.cend(), [&](const ITensor * tensor)
     {
         const TensorShape correct_shape = adjust_odd_shape(tensor->info()->tensor_shape(), format);
         return detail::have_different_dimensions(tensor->info()->tensor_shape(), correct_shape, 2);
     }),
     function, file, line, "Tensor shape has odd dimensions");
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_tensors_not_subsampled	(	const char *	function,
		const char *	file,
		int	line,
		const Format &	format,
		const TensorShape &	shape,
		const ITensor *	tensor1,
		Ts...	tensors
	)

Return an error if the passed tensor objects are not sub-sampled.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	format	Format to check if sub-sampling allowed.
[in]	shape	The tensor shape to calculate sub-sampling from.
[in]	tensor1	The first object to be compared.
[in]	tensors	(Optional) Further allowed objects.

Returns: Status

Definition at line 336 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ARM_COMPUTE_RETURN_ON_ERROR, calculate_subsampled_shape(), error_on_nullptr(), and arm_compute::detail::have_different_dimensions().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor1 == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_nullptr(function, file, line, std::forward<Ts>(tensors)...));
     const TensorShape sub2_shape = calculate_subsampled_shape(shape, format);
     const std::array < const ITensor *, 1 + sizeof...(Ts) > tensors_info_array{ { tensor1, std::forward<Ts>(tensors)... } };
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(std::any_of(tensors_info_array.cbegin(), tensors_info_array.cend(), [&](const ITensor * tensor)
     {
         return detail::have_different_dimensions(tensor->info()->tensor_shape(), sub2_shape, 2);
     }),
     function, file, line, "Tensor shape has mismatch dimensions for sub-sampling");
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_unconfigured_kernel	(	const char *	function,
		const char *	file,
		const int	line,
		const IKernel *	kernel
	)

Return an error if the kernel is not configured.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	kernel	Kernel to validate.

Returns: Status

arm_compute::Status arm_compute::error_on_value_not_representable_in_fixed_point	(	const char *	function,
		const char *	file,
		int	line,
		float	value,
		const ITensorInfo *	tensor_info
	)

inline

Return an error if the fixed-point value is not representable in the specified Q format.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	value	The floating point value to be checked.
[in]	tensor_info	Input tensor info that has information on data type and fixed-point position.

Returns: Status

Definition at line 1000 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG, ITensorInfo::data_type(), element_size_from_data_type(), ITensorInfo::fixed_point_position(), and string_from_data_type().

Referenced by error_on_value_not_representable_in_fixed_point().

 {
     const int          fixed_point_position = tensor_info->fixed_point_position();
     const DataType     dt                   = tensor_info->data_type();
     const unsigned int q_max_range          = 0xFFFFFFFFu >> (((sizeof(unsigned int) - element_size_from_data_type(dt)) * 8) + 1);
     const float        max_range            = q_max_range / (static_cast<float>(1 << fixed_point_position));
 
     ARM_COMPUTE_RETURN_ERROR_ON_LOC_MSG(value > max_range, function, file, line,
                                         "Value %f is not representable in %s with fixed-point position %d", value, string_from_data_type(dt).c_str(), fixed_point_position);
     return arm_compute::Status{};
 }
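The range computation above can be followed with concrete numbers. The sketch below reproduces the arithmetic in isolation. The names max_fixed_point_value and is_representable are hypothetical, and the element size is passed in directly instead of being derived from a DataType. For a 1-byte element the mask is 0x7F (127), so with a fixed-point position of 4 the largest representable value is 127 / 16 = 7.9375.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch
{
// Largest value representable by a signed fixed-point element of the given size.
// Assumes element_size_bytes is between 1 and 4.
inline float max_fixed_point_value(std::size_t element_size_bytes, int fixed_point_position)
{
    // Keep the low (8 * size - 1) bits: the largest positive signed integer of that width.
    const std::uint32_t q_max_range = std::uint32_t{ 0xFFFFFFFFu } >> (((sizeof(std::uint32_t) - element_size_bytes) * 8) + 1);
    // Dividing by 2^position converts the raw integer range to the real-valued range.
    return q_max_range / static_cast<float>(1 << fixed_point_position);
}

// Mirrors the listing: only the upper bound is checked.
inline bool is_representable(float value, std::size_t element_size_bytes, int fixed_point_position)
{
    return !(value > max_fixed_point_value(element_size_bytes, fixed_point_position));
}
} // namespace sketch
```

As in the listing, only values above the maximum are rejected. The lower bound is not tested.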

arm_compute::Status arm_compute::error_on_value_not_representable_in_fixed_point	(	const char *	function,
		const char *	file,
		int	line,
		float	value,
		const ITensor *	tensor
	)

inline

Return an error if the fixed-point value is not representable in the specified Q format.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	value	The floating point value to be checked.
[in]	tensor	Input tensor that has information on data type and fixed-point position.

Returns: Status

Definition at line 1022 of file Validate.h.

References ARM_COMPUTE_RETURN_ERROR_ON_LOC, ARM_COMPUTE_RETURN_ON_ERROR, error_on_value_not_representable_in_fixed_point(), and ITensor::info().

 {
     ARM_COMPUTE_RETURN_ERROR_ON_LOC(tensor == nullptr, function, file, line);
     ARM_COMPUTE_RETURN_ON_ERROR(::arm_compute::error_on_value_not_representable_in_fixed_point(function, file, line, value, tensor->info()));
     return arm_compute::Status{};
 }

arm_compute::Status arm_compute::error_on_window_dimensions_gte	(	const char *	function,
		const char *	file,
		const int	line,
		const Window &	win,
		unsigned int	max_dim
	)

Return an error if the passed window has too many dimensions.

The window has too many dimensions if any dimension with an index greater than or equal to max_dim is different from 0.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	win	Window to validate
[in]	max_dim	Maximum number of dimensions allowed.

Returns: Status

arm_compute::Status arm_compute::error_on_window_not_collapsable_at_dimension	(	const char *	function,
		const char *	file,
		const int	line,
		const Window &	full,
		const Window &	window,
		const int	dim
	)

Return an error if the window can't be collapsed at the given dimension.

The window cannot be collapsed if the given dimension is not equal to the full window's dimension or does not start from 0.

Parameters

[in]	function	Function in which the error occurred.
[in]	file	Name of the file where the error occurred.
[in]	line	Line on which the error occurred.
[in]	full	Full size window
[in]	window	Window to be collapsed.
[in]	dim	Dimension that needs to be checked.

Returns: Status

void execute_window_loop	(	const Window &	w,
		L &&	lambda_function,
		Ts &&...	iterators
	)

inline

Iterate through the passed window, automatically adjusting the iterators and calling the lambda_function for each element.

It passes the x and y positions to the lambda_function for each iteration.

Parameters

[in]	w	Window to iterate through.
[in]	lambda_function	The function of type void(function)( const Coordinates & id ) to call at each iteration, where id represents the absolute coordinates of the item to process.
[in,out]	iterators	Tensor iterators which will be updated by this function before calling lambda_function.

Definition at line 122 of file Helpers.inl.

References ARM_COMPUTE_ERROR_ON, Dimensions< int >::num_max_dimensions, and Window::validate().

 {
     w.validate();
 
     for(unsigned int i = 0; i < Coordinates::num_max_dimensions; ++i)
     {
         ARM_COMPUTE_ERROR_ON(w[i].step() == 0);
     }
 
     Coordinates id;
     ForEachDimension<Coordinates::num_max_dimensions>::unroll(w, id, std::forward<L>(lambda_function), std::forward<Ts>(iterators)...);
 }
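The traversal described above can be pictured as an odometer over per-dimension (start, end, step) ranges. The following standalone sketch illustrates that idea only. Dim and for_each_coordinate are hypothetical names, tensor iterators are left out entirely, and the visiting order (dimension 0 fastest) is an assumption about the library rather than something this page states.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

namespace sketch
{
// Hypothetical stand-in for one window dimension: half-open range [start, end) walked in steps.
struct Dim
{
    int start;
    int end;
    int step;
};

// Calls fn(id) for every coordinate in the window; dimension 0 varies fastest (assumed order).
template <std::size_t N, typename F>
void for_each_coordinate(const std::array<Dim, N> &window, F &&fn)
{
    std::array<int, N> id{};
    for(std::size_t i = 0; i < N; ++i)
    {
        // Like the listing, reject a zero step, which would never terminate.
        if(window[i].step <= 0)
        {
            throw std::invalid_argument("step must be positive");
        }
        if(window[i].start >= window[i].end)
        {
            return; // Empty range: nothing to visit.
        }
        id[i] = window[i].start;
    }
    while(true)
    {
        fn(id);
        // Advance like an odometer: bump dimension 0, carry into higher dimensions on overflow.
        std::size_t d = 0;
        while(d < N)
        {
            id[d] += window[d].step;
            if(id[d] < window[d].end)
            {
                break;
            }
            id[d] = window[d].start;
            ++d;
        }
        if(d == N)
        {
            return;
        }
    }
}
} // namespace sketch
```

For a window of {0, 4, 2} by {0, 3, 1}, the callback runs six times, starting at (0, 0) and ending at (2, 2).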

uint8x16_t arm_compute::finalize_quantization	(	int32x4x4_t &	in_s32,
		int	result_fixedpoint_multiplier,
		int32_t	result_shift,
		int32x4_t	result_offset_after_shift_s32,
		uint8x16_t	min_u8,
		uint8x16_t	max_u8
	)

Performs final quantization step on 16 elements.

Template Parameters

is_bounded_relu Specifies whether a fused bounded ReLU should be applied

Parameters

in_s32	Input to be quantized.
result_fixedpoint_multiplier	Result multiplier parameter
result_shift	Result shift parameter
result_offset_after_shift_s32	Result offset parameter
min_u8	Relu lower bound
max_u8	Relu upper bound

Returns: Quantized values

Definition at line 74 of file NEAsymm.h.

References rounding_divide_by_pow2().

 {
     const static int32x4_t zero_s32 = vdupq_n_s32(0);
 
     // Fixed point multiplication with vector saturating rounding doubling multiply high with scalar
     in_s32.val[0] = vqrdmulhq_n_s32(in_s32.val[0], result_fixedpoint_multiplier);
     in_s32.val[1] = vqrdmulhq_n_s32(in_s32.val[1], result_fixedpoint_multiplier);
     in_s32.val[2] = vqrdmulhq_n_s32(in_s32.val[2], result_fixedpoint_multiplier);
     in_s32.val[3] = vqrdmulhq_n_s32(in_s32.val[3], result_fixedpoint_multiplier);
 
     // Round to the nearest division by a power-of-two using result_shift
     in_s32.val[0] = rounding_divide_by_pow2(in_s32.val[0], result_shift);
     in_s32.val[1] = rounding_divide_by_pow2(in_s32.val[1], result_shift);
     in_s32.val[2] = rounding_divide_by_pow2(in_s32.val[2], result_shift);
     in_s32.val[3] = rounding_divide_by_pow2(in_s32.val[3], result_shift);
 
     // Add the offset terms
     in_s32.val[0] = vaddq_s32(in_s32.val[0], result_offset_after_shift_s32);
     in_s32.val[1] = vaddq_s32(in_s32.val[1], result_offset_after_shift_s32);
     in_s32.val[2] = vaddq_s32(in_s32.val[2], result_offset_after_shift_s32);
     in_s32.val[3] = vaddq_s32(in_s32.val[3], result_offset_after_shift_s32);
 
     // Saturate negative values
     in_s32.val[0] = vmaxq_s32(in_s32.val[0], zero_s32);
     in_s32.val[1] = vmaxq_s32(in_s32.val[1], zero_s32);
     in_s32.val[2] = vmaxq_s32(in_s32.val[2], zero_s32);
     in_s32.val[3] = vmaxq_s32(in_s32.val[3], zero_s32);
 
     // Convert S32 to S16
     const int16x8x2_t in_s16 =
     {
         {
             vcombine_s16(vqmovn_s32(in_s32.val[0]), vqmovn_s32(in_s32.val[1])),
             vcombine_s16(vqmovn_s32(in_s32.val[2]), vqmovn_s32(in_s32.val[3]))
         }
     };
 
     // Convert S16 to U8
     uint8x16_t out_u8 = vcombine_u8(vqmovun_s16(in_s16.val[0]), vqmovun_s16(in_s16.val[1]));
 
     if(is_bounded_relu)
     {
         out_u8 = vmaxq_u8(out_u8, min_u8);
         out_u8 = vminq_u8(out_u8, max_u8);
     }
 
     return out_u8;
 }
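To follow the vector code above one lane at a time, a scalar sketch of the same pipeline can help. It is an illustration under assumptions, not the library's reference implementation. All names are hypothetical, the multiply models the rounding behaviour of the NEON instruction behind vqrdmulhq_n_s32, and rounding_divide_by_pow2() is assumed to round to nearest with ties away from zero. Right shifts of negative values are assumed to be arithmetic, which holds on common compilers.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

namespace sketch
{
// Scalar model of a saturating rounding doubling multiply returning the high half.
inline std::int32_t rounding_doubling_high_mul(std::int32_t a, std::int32_t b)
{
    // The only overflowing case is INT32_MIN * INT32_MIN, which saturates.
    if(a == std::numeric_limits<std::int32_t>::min() && b == std::numeric_limits<std::int32_t>::min())
    {
        return std::numeric_limits<std::int32_t>::max();
    }
    const std::int64_t ab = static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
    // (2 * ab + 2^31) >> 32 is the same as (ab + 2^30) >> 31.
    return static_cast<std::int32_t>((ab + (std::int64_t{ 1 } << 30)) >> 31);
}

// Assumed rounding: nearest, ties away from zero.
inline std::int32_t rounding_shift_right(std::int32_t x, int exponent)
{
    if(exponent <= 0)
    {
        return x;
    }
    const std::int64_t nudged = static_cast<std::int64_t>(x) - (x < 0 ? 1 : 0) + (std::int64_t{ 1 } << (exponent - 1));
    return static_cast<std::int32_t>(nudged >> exponent);
}

// One lane of the pipeline: multiply, shift, add offset, saturate to U8, optional bounded ReLU.
inline std::uint8_t quantize_down(std::int32_t acc, std::int32_t multiplier, int shift, std::int32_t offset,
                                  bool bounded_relu, std::uint8_t min_u8, std::uint8_t max_u8)
{
    std::int64_t v = rounding_doubling_high_mul(acc, multiplier);
    v              = rounding_shift_right(static_cast<std::int32_t>(v), shift);
    v += offset; // 64-bit add: wrap-around of the 32-bit vector add is not modelled.
    v = std::min<std::int64_t>(std::max<std::int64_t>(v, 0), 255);

    std::uint8_t out = static_cast<std::uint8_t>(v);
    if(bounded_relu)
    {
        out = std::max(out, min_u8);
        out = std::min(out, max_u8);
    }
    return out;
}
} // namespace sketch
```

As a worked case, an accumulator of 1000 with multiplier 2^30 (0.5 in Q31), shift 2 and offset 10 gives 500 after the multiply, 125 after the shift and 135 after the offset.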

std::string arm_compute::float_to_string_with_full_precision ( float val )

inline

Create a string with the float in full precision.

Parameters

val	Floating point value

Returns: String with the floating point value.

Definition at line 1073 of file Utils.h.

 {
     std::stringstream ss;
     ss.precision(std::numeric_limits<float>::digits10 + 1);
     ss << val;
     return ss.str();
 }
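The listing above sets the stream precision to `digits10 + 1`, which is 7 significant digits for an IEEE-754 `float`. A small sketch of what that changes compared to the default stream precision of 6 follows. The helper names are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format a float with a caller-chosen number of
// significant digits, using the default (general) float notation.
inline std::string float_to_string_with_digits(float val, int digits)
{
    assert(digits > 0);
    std::stringstream ss;
    ss.precision(digits);
    ss << val;
    return ss.str();
}

// Same precision setting as the documented function: digits10 + 1.
inline std::string full_precision_sketch(float val)
{
    return float_to_string_with_digits(val, std::numeric_limits<float>::digits10 + 1);
}
```

For `1.0f / 3.0f` the sketch yields `0.3333333`, while six digits yield `0.333333`. Values that need fewer digits, such as `0.5f`, are printed without trailing zeros.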

auto arm_compute::floor_to_multiple	(	S	value,
		T	divisor
	)		-> decltype((value / divisor) * divisor)

inline

Computes the largest number smaller than or equal to value that is a multiple of divisor.

Parameters

[in]	value	Upper bound value
[in]	divisor	Value to compute multiple of.

Returns: The largest multiple of divisor that does not exceed value.

Definition at line 78 of file Utils.h.

References ARM_COMPUTE_ERROR_ON, build_information(), and read_file().

 {
     ARM_COMPUTE_ERROR_ON(value < 0 || divisor <= 0);
     return (value / divisor) * divisor;
 }
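A standalone sketch of the same computation, with a plain `assert` standing in for `ARM_COMPUTE_ERROR_ON`. The name `floor_to_multiple_sketch` is hypothetical.

```cpp
#include <cassert>

// Round value down to the nearest multiple of divisor.
// Relies on integer division truncating, so it is meant for integral types.
template <typename S, typename T>
inline auto floor_to_multiple_sketch(S value, T divisor) -> decltype((value / divisor) * divisor)
{
    assert(value >= 0 && divisor > 0);
    return (value / divisor) * divisor;
}
```

For example, `floor_to_multiple_sketch(17, 4)` gives 16, and a value that is already a multiple, such as 16, is returned unchanged.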

bool arm_compute::fp16_supported ( const cl::Device & device )

Helper function to check whether the cl_khr_fp16 extension is supported.

Parameters

[in] device A CL device

Returns: True if the extension is supported

GPUTarget arm_compute::get_arch_from_target ( GPUTarget target )

Helper function to get the GPU arch.

Parameters

[in] target GPU target

Returns: The GPU target that identifies the architecture.

std::string arm_compute::get_cl_type_from_data_type ( const DataType & dt )

Translates a tensor data type to the appropriate OpenCL type.

Parameters

[in] dt DataType to be translated to OpenCL type.

Returns: The string specifying the OpenCL type to be used.

CLVersion arm_compute::get_cl_version ( const cl::Device & device )

Helper function to get the highest OpenCL version supported.

Parameters

[in] device A CL device

Returns: the highest OpenCL version supported

void arm_compute::get_cpu_configuration ( CPUInfo & cpuinfo )

Tries to detect the CPU configuration of the system and fills the cpuinfo object accordingly.

Parameters

[out] cpuinfo CPUInfo to be used to hold the system's cpu configuration.

size_t get_data_layout_dimension_index	(	const DataLayout	data_layout,
		const DataLayoutDimension	data_layout_dimension
	)

inline

Get the index of the given dimension.

Parameters

[in]	data_layout	The data layout.
[in]	data_layout_dimension	The dimension which this index is requested for.

Returns: The int conversion of the requested data layout index.

Definition at line 340 of file Helpers.inl.

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_MSG, BATCHES, CHANNEL, HEIGHT, NCHW, UNKNOWN, and WIDTH.

Referenced by arm_compute::misc::shape_calculator::compute_deep_convolution_shape(), arm_compute::misc::shape_calculator::compute_depthwise_convolution_shape(), arm_compute::misc::shape_calculator::compute_im2col_conv_shape(), arm_compute::misc::shape_calculator::compute_pool_shape(), arm_compute::misc::shape_calculator::compute_winograd_filter_transform_shape(), arm_compute::misc::shape_calculator::compute_winograd_input_transform_shape(), arm_compute::misc::shape_calculator::compute_winograd_output_transform_shape(), arm_compute::test::validation::DATA_TEST_CASE(), SubTensorInfo::dimension(), TensorInfo::dimension(), PPMLoader::fill_planar_tensor(), permute(), and arm_compute::graph::backends::detail::validate_depthwise_convolution_layer().

 {
     ARM_COMPUTE_ERROR_ON_MSG(data_layout == DataLayout::UNKNOWN, "Cannot retrieve the dimension index for an unknown layout!");
 
     /* Return the index based on the data layout
      * [N C H W]
      * [3 2 1 0]
      * [N H W C]
     */
     switch(data_layout_dimension)
     {
         case DataLayoutDimension::CHANNEL:
             return (data_layout == DataLayout::NCHW) ? 2 : 0;
             break;
         case DataLayoutDimension::HEIGHT:
             return (data_layout == DataLayout::NCHW) ? 1 : 2;
             break;
         case DataLayoutDimension::WIDTH:
             return (data_layout == DataLayout::NCHW) ? 0 : 1;
             break;
         case DataLayoutDimension::BATCHES:
             return 3;
             break;
         default:
             ARM_COMPUTE_ERROR("Data layout index not supported!");
             break;
     }
 }
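The mapping in the listing above lets layout-agnostic code look up a dimension by name instead of hard-coding an index. The following is a minimal sketch with hypothetical stand-in enums (`Layout`, `Dim`) and a plain array as the shape; it does not use the library's types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for DataLayout and DataLayoutDimension.
enum class Layout { NCHW, NHWC };
enum class Dim { CHANNEL, HEIGHT, WIDTH, BATCHES };

// Same index table as the listing above: index 0 is the innermost dimension.
//   NCHW -> W=0, H=1, C=2, N=3
//   NHWC -> C=0, W=1, H=2, N=3
inline std::size_t dimension_index_sketch(Layout layout, Dim dim)
{
    switch(dim)
    {
        case Dim::CHANNEL:
            return (layout == Layout::NCHW) ? 2 : 0;
        case Dim::HEIGHT:
            return (layout == Layout::NCHW) ? 1 : 2;
        case Dim::WIDTH:
            return (layout == Layout::NCHW) ? 0 : 1;
        case Dim::BATCHES:
            return 3;
    }
    assert(false && "unsupported dimension");
    return 0;
}

// Read the width of a 4D shape without knowing the layout at the call site.
inline std::size_t width_of(const std::size_t (&shape)[4], Layout layout)
{
    return shape[dimension_index_sketch(layout, Dim::WIDTH)];
}
```

A 224x112 image with 3 channels is stored as `{224, 112, 3, 1}` in NCHW and as `{3, 224, 112, 1}` in NHWC; `width_of` returns 224 for both.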

std::string arm_compute::get_data_size_from_data_type ( const DataType & dt )

Get the size of a data type in number of bits.

Parameters

[in] dt DataType.

Returns: Number of bits in the data type specified.

DataType arm_compute::get_promoted_data_type ( DataType dt )

inline

Return the promoted data type of a given data type.

Note: If the promoted data type is not supported, an error will be thrown.

Parameters

[in] dt Data type to get the promoted type of.

Returns: Promoted data type

Definition at line 517 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, QASYMM8, QS16, QS32, QS8, S16, S32, S8, U16, U32, U8, and UNKNOWN.

 {
     switch(dt)
     {
         case DataType::U8:
             return DataType::U16;
         case DataType::S8:
             return DataType::S16;
         case DataType::QS8:
             return DataType::QS16;
         case DataType::U16:
             return DataType::U32;
         case DataType::S16:
             return DataType::S32;
         case DataType::QS16:
             return DataType::QS32;
         case DataType::QASYMM8:
         case DataType::F16:
         case DataType::U32:
         case DataType::S32:
         case DataType::F32:
         case DataType::QS32:
             ARM_COMPUTE_ERROR("Unsupported data type promotions!");
         default:
             ARM_COMPUTE_ERROR("Undefined data type!");
     }
     return DataType::UNKNOWN;
 }

GPUTarget arm_compute::get_target_from_device ( )

Helper function to get the GPU target from GLES using GL_RENDERER enum.

Returns: the GPU target

GPUTarget arm_compute::get_target_from_device ( cl::Device & device )

Helper function to get the GPU target from CL device.

Parameters

[in] device A CL device

Returns: the GPU target

Referenced by CLScheduler::init().

GPUTarget arm_compute::get_target_from_name ( const std::string & device_name )

Helper function to get the GPU target from a device name.

Parameters

[in] device_name A device name

Returns: the GPU target

Referenced by arm_compute::test::validation::TEST_CASE().

unsigned int arm_compute::get_threads_hint ( )

Some systems have both big and small cores; this function computes the minimum number of cores that are exactly the same on the system.

To maximize performance, the library attempts to process workloads concurrently using as many threads as there are big cores available on the system.

Returns: The minimum number of common cores.

std::string arm_compute::get_underlying_cl_type_from_data_type ( const DataType & dt )

Translates fixed point tensor data type to the underlying OpenCL type.

Parameters

[in] dt DataType to be translated to OpenCL type.

Returns: The string specifying the underlying OpenCL type to be used.

bool arm_compute::gpu_target_is_in	(	GPUTarget	target_to_check,
		GPUTarget	target,
		Args...	targets
	)

Helper function to check whether a GPU target is equal to any of the provided targets.

Parameters

[in]	target_to_check	GPU target to check
[in]	target	First target to compare against
[in]	targets	(Optional) Additional targets to compare with

Returns: True if the target is equal to at least one of the targets.

Definition at line 92 of file GPUTarget.h.

Referenced by arm_compute::test::validation::TEST_CASE().

 {
     return (target_to_check == target) | gpu_target_is_in(target_to_check, targets...);
 }

bool arm_compute::gpu_target_is_in	(	GPUTarget	target_to_check,
		GPUTarget	target
	)

inline

Variant of gpu_target_is_in for comparing two targets.

Definition at line 98 of file GPUTarget.h.

 {
     return target_to_check == target;
 }
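The two overloads above form a recursive variadic pattern: the variadic overload compares against the first candidate and recurses on the rest, and the two-argument overload ends the recursion. The sketch below shows the same pattern with a hypothetical enum `Gpu` and a hypothetical function name `is_one_of`. Unlike the listing, it uses `||`, which short-circuits.

```cpp
#include <cassert>

// Hypothetical stand-in for GPUTarget.
enum class Gpu { A, B, C, D };

// Base case: a single candidate.
inline bool is_one_of(Gpu to_check, Gpu candidate)
{
    return to_check == candidate;
}

// Recursive case: compare with the first candidate, then with the rest.
template <typename... Rest>
inline bool is_one_of(Gpu to_check, Gpu candidate, Rest... rest)
{
    return (to_check == candidate) || is_one_of(to_check, rest...);
}

// Usage example.
inline void is_one_of_usage()
{
    assert(is_one_of(Gpu::C, Gpu::A, Gpu::B, Gpu::C));
    assert(!is_one_of(Gpu::D, Gpu::A, Gpu::B, Gpu::C));
}
```

Declaring the base case first keeps the final recursive call resolvable to the non-template overload.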

bool arm_compute::has_format_horizontal_subsampling ( Format format )

inline

Return true if the given format has horizontal subsampling.

Parameters

[in] format Format to determine subsampling.

Returns: True if the format can be subsampled horizontally.

Definition at line 552 of file Utils.h.

References IYUV, NV12, NV21, UV88, UYVY422, and YUYV422.

Referenced by adjust_odd_shape(), and calculate_subsampled_shape().

 {
     return (format == Format::YUYV422 || format == Format::UYVY422 || format == Format::NV12 || format == Format::NV21 || format == Format::IYUV || format == Format::UV88) ? true : false;
 }

bool arm_compute::has_format_vertical_subsampling ( Format format )

inline

Return true if the given format has vertical subsampling.

Parameters

[in] format Format to determine subsampling.

Returns: True if the format can be subsampled vertically.

Definition at line 563 of file Utils.h.

References IYUV, NV12, NV21, and UV88.

Referenced by adjust_odd_shape(), and calculate_subsampled_shape().

 {
     return (format == Format::NV12 || format == Format::NV21 || format == Format::IYUV || format == Format::UV88) ? true : false;
 }

void arm_compute::ignore_unused ( T && ... )

inline

Ignores unused arguments.

Template Parameters

T	Argument types

Parameters

[in] ... Ignored arguments

Definition at line 39 of file Error.h.

 {
 }

Coordinates index2coords	(	const TensorShape &	shape,
		int	index
	)

inline

Convert a linear index into n-dimensional coordinates.

Parameters

[in]	shape	Shape of the n-dimensional tensor.
[in]	index	Linear index specifying the i-th element.

Returns: n-dimensional coordinates.

Definition at line 303 of file Helpers.inl.

References ARM_COMPUTE_ERROR_ON_MSG, Dimensions< T >::num_dimensions(), TensorShape::set(), and TensorShape::total_size().

Referenced by arm_compute::test::validation::reference::convert_fully_connected_weights(), and permute().

 {
     int num_elements = shape.total_size();
 
     ARM_COMPUTE_ERROR_ON_MSG(index < 0 || index >= num_elements, "Index has to be in [0, num_elements]!");
     ARM_COMPUTE_ERROR_ON_MSG(num_elements == 0, "Cannot create coordinate from empty shape!");
 
     Coordinates coord{ 0 };
 
     for(int d = shape.num_dimensions() - 1; d >= 0; --d)
     {
         num_elements /= shape[d];
         coord.set(d, index / num_elements);
         index %= num_elements;
     }
 
     return coord;
 }
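A worked example may help with the loop above. The sketch below repeats the same decomposition on a plain `std::vector<int>` shape, where dimension 0 varies fastest. The function name is hypothetical, and `assert` stands in for the library's error macros.

```cpp
#include <cassert>
#include <vector>

// Convert a linear index into per-dimension coordinates.
// shape[0] is the fastest-varying dimension.
inline std::vector<int> index_to_coords_sketch(const std::vector<int> &shape, int index)
{
    int num_elements = 1;
    for(int extent : shape)
    {
        num_elements *= extent;
    }
    assert(num_elements > 0);
    assert(index >= 0 && index < num_elements);

    std::vector<int> coord(shape.size(), 0);

    // Walk from the slowest dimension to the fastest one, peeling off
    // one coordinate per step.
    for(int d = static_cast<int>(shape.size()) - 1; d >= 0; --d)
    {
        num_elements /= shape[d];          // elements spanned by one step in dimension d
        coord[d] = index / num_elements;   // whole steps that fit in the index
        index %= num_elements;             // remainder for the lower dimensions
    }
    return coord;
}
```

For a shape of `{4, 3, 2}` and index 17, the steps are 17 / 12 = 1 (remainder 5), 5 / 4 = 1 (remainder 1) and 1 / 1 = 1, giving coordinates `{1, 1, 1}`. The check is 1 + 1 * 4 + 1 * 12 = 17.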

ValidRegion arm_compute::intersect_valid_regions ( const Ts &... regions )

Intersect multiple valid regions.

Parameters

[in] regions Valid regions.

Returns: Intersection of all regions.

Definition at line 469 of file Helpers.h.

References ValidRegion::anchor, arm_compute::utility::foldl(), arm_compute::test::fixed_point_arithmetic::detail::max(), arm_compute::test::fixed_point_arithmetic::detail::min(), Dimensions< T >::num_dimensions(), Dimensions< T >::set(), TensorShape::set(), and ValidRegion::shape.

 {
     auto intersect = [](const ValidRegion & r1, const ValidRegion & r2) -> ValidRegion
     {
         ValidRegion region;
 
         for(size_t d = 0; d < std::min(r1.anchor.num_dimensions(), r2.anchor.num_dimensions()); ++d)
         {
             region.anchor.set(d, std::max(r1.anchor[d], r2.anchor[d]));
         }
 
         for(size_t d = 0; d < std::min(r1.shape.num_dimensions(), r2.shape.num_dimensions()); ++d)
         {
             region.shape.set(d, std::min(r1.shape[d], r2.shape[d]));
         }
 
         return region;
     };
 
     return utility::foldl(intersect, regions...);
 }
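The listing above combines two regions per dimension (largest anchor, smallest shape) and folds that operation over all arguments. The following sketch reproduces this rule with a hypothetical `Region` struct and an explicit loop in place of `utility::foldl`. It mirrors the arithmetic of the listing only and makes no claim about the library's types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for ValidRegion.
struct Region
{
    std::vector<int>         anchor; // start coordinate per dimension
    std::vector<std::size_t> shape;  // extent per dimension
};

// Per dimension: keep the largest anchor and the smallest extent.
inline Region intersect_two(const Region &r1, const Region &r2)
{
    Region out;
    for(std::size_t d = 0; d < std::min(r1.anchor.size(), r2.anchor.size()); ++d)
    {
        out.anchor.push_back(std::max(r1.anchor[d], r2.anchor[d]));
    }
    for(std::size_t d = 0; d < std::min(r1.shape.size(), r2.shape.size()); ++d)
    {
        out.shape.push_back(std::min(r1.shape[d], r2.shape[d]));
    }
    return out;
}

// Left fold of intersect_two over a non-empty list of regions.
inline Region intersect_all(std::initializer_list<Region> regions)
{
    assert(regions.size() > 0);
    auto   it  = regions.begin();
    Region acc = *it;
    for(++it; it != regions.end(); ++it)
    {
        acc = intersect_two(acc, *it);
    }
    return acc;
}
```

With anchors `(0,0)`, `(2,1)`, `(1,3)` and shapes `(10,8)`, `(6,9)`, `(7,7)`, the result has anchor `(2,3)` and shape `(6,7)`.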

bool arm_compute::is_data_type_fixed_point ( DataType dt )

inline

Check if a given data type is of fixed point type.

Parameters

[in] dt Input data type.

Returns: True if data type is of fixed point type, else false.

Definition at line 1037 of file Utils.h.

References QS16, QS32, and QS8.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), error_on_mismatching_fixed_point(), and arm_compute::test::validation::FIXTURE_DATA_TEST_CASE().

 {
     switch(dt)
     {
         case DataType::QS8:
         case DataType::QS16:
         case DataType::QS32:
             return true;
         default:
             return false;
     }
 }

bool arm_compute::is_data_type_float ( DataType dt )

inline

Check if a given data type is of floating point type.

Parameters

[in] dt Input data type.

Returns: True if data type is of floating point type, else false.

Definition at line 997 of file Utils.h.

References F16, and F32.

 {
     switch(dt)
     {
         case DataType::F16:
         case DataType::F32:
             return true;
         default:
             return false;
     }
 }

bool arm_compute::is_data_type_quantized ( DataType dt )

inline

Check if a given data type is of quantized type.

Note: Quantized is considered a super-set of fixed-point and asymmetric data types.

Parameters

[in] dt Input data type.

Returns: True if data type is of quantized type, else false.

Definition at line 1017 of file Utils.h.

References QASYMM8, QS16, QS32, and QS8.

 {
     switch(dt)
     {
         case DataType::QS8:
         case DataType::QASYMM8:
         case DataType::QS16:
         case DataType::QS32:
             return true;
         default:
             return false;
     }
 }

bool arm_compute::is_data_type_quantized_asymmetric ( DataType dt )

inline

Check if a given data type is of asymmetric quantized type.

Parameters

[in] dt Input data type.

Returns: True if data type is of asymmetric quantized type, else false.

Definition at line 1056 of file Utils.h.

References QASYMM8.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), error_on_mismatching_quantization_info(), arm_compute::test::validation::reference::im2col_nchw(), arm_compute::test::validation::reference::im2col_nhwc(), set_quantization_info_if_empty(), and arm_compute::graph::backends::detail::validate_convolution_layer().

 {
     switch(dt)
     {
         case DataType::QASYMM8:
             return true;
         default:
             return false;
     }
 }

std::string arm_compute::lower_string ( const std::string & val )

Convert a given string to lowercase.

Parameters

[in] val String to convert to lowercase.

Returns: The lowercase string.

Referenced by data_type_for_convolution_matrix().

int arm_compute::max_consecutive_elements_display_width	(	std::ostream &	s,
		DataType	dt,
		const uint8_t *	ptr,
		unsigned int	n
	)

Identify the maximum width of n consecutive elements.

Parameters

[in]	s	Output stream to print the elements to.
[in]	dt	Data type of the elements
[in]	ptr	Pointer to print the elements from.
[in]	n	Number of elements to print.

Returns: The maximum width of the elements.

Referenced by max_consecutive_elements_display_width_impl().

int arm_compute::max_consecutive_elements_display_width_impl	(	std::ostream &	s,
		const T *	ptr,
		unsigned int	n
	)

Identify the maximum width of n consecutive elements.

Parameters

[in]	s	The output stream which will be used to print the elements. Used to extract the stream format.
[in]	ptr	Pointer to the elements.
[in]	n	Number of elements.

Returns: The maximum width of the elements.

Definition at line 1123 of file Utils.h.

References max_consecutive_elements_display_width(), and print_consecutive_elements().

 {
     using print_type = typename std::conditional<std::is_floating_point<T>::value, T, int>::type;
 
     int max_width = -1;
     for(unsigned int i = 0; i < n; ++i)
     {
         std::stringstream ss;
         ss.copyfmt(s);
 
         if(std::is_same<typename std::decay<T>::type, half>::value)
         {
             // We use T instead of print_type here because std::is_floating_point<half> returns false, which would make print_type an int.
             ss << static_cast<T>(ptr[i]);
         }
         else
         {
             ss << static_cast<print_type>(ptr[i]);
         }
 
         max_width = std::max<int>(max_width, ss.str().size());
     }
     return max_width;
 }
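The technique above is to format each element into a scratch stream that copies the target stream's formatting state, then keep the longest result. The sketch below is simplified: it leaves out the `half` special case, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <type_traits>

// Widest textual representation among n consecutive elements, measured with
// the formatting flags of stream s. Returns -1 when n is 0.
template <typename T>
int max_display_width_sketch(std::ostream &s, const T *ptr, unsigned int n)
{
    assert(ptr != nullptr || n == 0);

    // Non-floating-point types are printed as int so that 8-bit values
    // appear as numbers rather than characters.
    using print_type = typename std::conditional<std::is_floating_point<T>::value, T, int>::type;

    int max_width = -1;
    for(unsigned int i = 0; i < n; ++i)
    {
        std::stringstream ss;
        ss.copyfmt(s); // reuse precision, flags and fill of the target stream
        ss << static_cast<print_type>(ptr[i]);
        max_width = std::max<int>(max_width, static_cast<int>(ss.str().size()));
    }
    return max_width;
}
```

The returned width can then be passed to `std::setw` so that columns line up when the elements are printed.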

size_t arm_compute::num_channels_from_format ( Format format )

inline

Return the number of channels for a given single-planar pixel format.

Parameters

[in] format Input format

Returns: The number of channels for a given image format.

Definition at line 476 of file Utils.h.

References F16, F32, IYUV, NV12, NV21, RGB888, RGBA8888, S16, S32, U16, U32, U8, UV88, UYVY422, YUV444, and YUYV422.

 {
     switch(format)
     {
         case Format::U8:
         case Format::U16:
         case Format::S16:
         case Format::U32:
         case Format::S32:
         case Format::F16:
         case Format::F32:
             return 1;
         // Because the U and V channels are subsampled
         // these formats appear like having only 2 channels:
         case Format::YUYV422:
         case Format::UYVY422:
             return 2;
         case Format::UV88:
             return 2;
         case Format::RGB888:
             return 3;
         case Format::RGBA8888:
             return 4;
         //Doesn't make sense for planar formats:
         case Format::NV12:
         case Format::NV21:
         case Format::IYUV:
         case Format::YUV444:
         default:
             return 0;
     }
 }

size_t arm_compute::num_planes_from_format ( Format format )

inline

Return the number of planes for a given format.

Parameters

[in] format Input format

Returns: The number of planes for a given image format.

Definition at line 442 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, IYUV, NV12, NV21, RGB888, RGBA8888, S16, S32, U16, U32, U8, UYVY422, YUV444, and YUYV422.

 {
     switch(format)
     {
         case Format::U8:
         case Format::S16:
         case Format::U16:
         case Format::S32:
         case Format::U32:
         case Format::F16:
         case Format::F32:
         case Format::RGB888:
         case Format::RGBA8888:
         case Format::YUYV422:
         case Format::UYVY422:
             return 1;
         case Format::NV12:
         case Format::NV21:
             return 2;
         case Format::IYUV:
         case Format::YUV444:
             return 3;
         default:
             ARM_COMPUTE_ERROR("Not supported format");
             return 0;
     }
 }

bool arm_compute::opencl_is_available ( )

Check if OpenCL is available.

Returns: True if OpenCL is available.

Referenced by main(), and arm_compute::test::sync_if_necessary().

bool arm_compute::opengles31_is_available ( )

Check if the OpenGL ES 3.1 API is available at runtime.

Returns: true if the OpenGL ES 3.1 API is available.

Referenced by NDRange::get(), and arm_compute::test::sync_tensor_if_necessary().

bool arm_compute::operator!=	(	const Dimensions< T > &	lhs,
		const Dimensions< T > &	rhs
	)

inline

Check that given dimensions are not equal.

Parameters

[in]	lhs	Left-hand side Dimensions.
[in]	rhs	Right-hand side Dimensions.

Returns: True if the given dimensions are not equal.

Definition at line 246 of file Dimensions.h.

 {
     return !(lhs == rhs);
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const Dimensions< T > &	dimensions
	)

Formatted output of the Dimensions type.

Parameters

[out]	os	Output stream.
[in]	dimensions	Type to output.

Returns: Modified output stream.

Definition at line 53 of file TypePrinter.h.

 {
     if(dimensions.num_dimensions() > 0)
     {
         os << dimensions[0];
 
         for(unsigned int d = 1; d < dimensions.num_dimensions(); ++d)
         {
             os << "x" << dimensions[d];
         }
     }
 
     return os;
 }
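The operator above prints the dimensions separated by `x` and prints nothing for an empty object. A self-contained sketch of the same output format follows, using a hypothetical `Dims` wrapper around `std::vector` instead of the library's `Dimensions< T >` class.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for Dimensions<T>.
struct Dims
{
    std::vector<std::size_t> values;
};

// Print the extents separated by 'x', for example "4x3x2".
inline std::ostream &operator<<(std::ostream &os, const Dims &dims)
{
    if(!dims.values.empty())
    {
        os << dims.values[0];
        for(std::size_t d = 1; d < dims.values.size(); ++d)
        {
            os << "x" << dims.values[d];
        }
    }
    return os;
}

// Convenience wrapper that collects the formatted output in a string.
inline std::string dims_to_string(const Dims &dims)
{
    std::stringstream ss;
    ss << dims;
    return ss.str();
}
```

Printing the first element outside the loop avoids a leading or trailing separator.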

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const NonLinearFilterFunction &	function
	)

Formatted output of the NonLinearFilterFunction type.

Parameters

[out]	os	Output stream.
[in]	function	Type to output.

Returns: Modified output stream.

Definition at line 75 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, MAX, MEDIAN, and MIN.

 {
     switch(function)
     {
         case NonLinearFilterFunction::MAX:
             os << "MAX";
             break;
         case NonLinearFilterFunction::MEDIAN:
             os << "MEDIAN";
             break;
         case NonLinearFilterFunction::MIN:
             os << "MIN";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const MatrixPattern &	pattern
	)

Formatted output of the MatrixPattern type.

Parameters

[out]	os	Output stream.
[in]	pattern	Type to output.

Returns: Modified output stream.

Definition at line 115 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, BOX, CROSS, DISK, and OTHER.

 {
     switch(pattern)
     {
         case MatrixPattern::BOX:
             os << "BOX";
             break;
         case MatrixPattern::CROSS:
             os << "CROSS";
             break;
         case MatrixPattern::DISK:
             os << "DISK";
             break;
         case MatrixPattern::OTHER:
             os << "OTHER";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const RoundingPolicy &	rounding_policy
	)

Formatted output of the RoundingPolicy type.

Parameters

[out]	os	Output stream.
[in]	rounding_policy	Type to output.

Returns: Modified output stream.

Definition at line 158 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, TO_NEAREST_EVEN, TO_NEAREST_UP, and TO_ZERO.

 {
     switch(rounding_policy)
     {
         case RoundingPolicy::TO_ZERO:
             os << "TO_ZERO";
             break;
         case RoundingPolicy::TO_NEAREST_UP:
             os << "TO_NEAREST_UP";
             break;
         case RoundingPolicy::TO_NEAREST_EVEN:
             os << "TO_NEAREST_EVEN";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const WeightsInfo &	weights_info
	)

Formatted output of the WeightsInfo type.

Parameters

[out]	os	Output stream.
[in]	weights_info	Type to output.

Returns: Modified output stream.

Definition at line 185 of file TypePrinter.h.

References WeightsInfo::are_reshaped(), WeightsInfo::kernel_size(), and WeightsInfo::num_kernels().

 {
     os << weights_info.are_reshaped() << ";";
     os << weights_info.num_kernels() << ";" << weights_info.kernel_size().first << "," << weights_info.kernel_size().second;
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const ROIPoolingLayerInfo &	pool_info
	)

Formatted output of the ROIPoolingLayerInfo type.

Parameters

[out]	os	Output stream.
[in]	pool_info	Type to output.

Returns: Modified output stream.

Definition at line 200 of file TypePrinter.h.

References ROIPoolingLayerInfo::pooled_height(), ROIPoolingLayerInfo::pooled_width(), and ROIPoolingLayerInfo::spatial_scale().

 {
     os << pool_info.pooled_width() << "x" << pool_info.pooled_height() << "~" << pool_info.spatial_scale();
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const QuantizationInfo &	quantization_info
	)

Formatted output of the QuantizationInfo type.

Parameters

[out]	os	Output stream.
[in]	quantization_info	Type to output.

Returns: Modified output stream.

Definition at line 213 of file TypePrinter.h.

References QuantizationInfo::offset, and QuantizationInfo::scale.

 {
     os << "Scale:" << quantization_info.scale << "~"
        << "Offset:" << quantization_info.offset;
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const FixedPointOp &	op
	)

Formatted output of the FixedPointOp type.

Parameters

[out]	os	Output stream.
[in]	op	Type to output.

Returns: Modified output stream.

Definition at line 240 of file TypePrinter.h.

References ADD, ARM_COMPUTE_ERROR, EXP, INV_SQRT, LOG, MUL, RECIPROCAL, and SUB.

 {
     switch(op)
     {
         case FixedPointOp::ADD:
             os << "ADD";
             break;
         case FixedPointOp::SUB:
             os << "SUB";
             break;
         case FixedPointOp::MUL:
             os << "MUL";
             break;
         case FixedPointOp::EXP:
             os << "EXP";
             break;
         case FixedPointOp::LOG:
             os << "LOG";
             break;
         case FixedPointOp::INV_SQRT:
             os << "INV_SQRT";
             break;
         case FixedPointOp::RECIPROCAL:
             os << "RECIPROCAL";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const ActivationLayerInfo::ActivationFunction &	act_function
	)

Formatted output of the activation function type.

Parameters

[out]	os	Output stream.
[in]	act_function	Type to output.

Returns: Modified output stream.

Definition at line 292 of file TypePrinter.h.

References ActivationLayerInfo::ABS, ARM_COMPUTE_ERROR, ActivationLayerInfo::BOUNDED_RELU, ActivationLayerInfo::LEAKY_RELU, ActivationLayerInfo::LINEAR, ActivationLayerInfo::LOGISTIC, ActivationLayerInfo::LU_BOUNDED_RELU, ActivationLayerInfo::RELU, ActivationLayerInfo::SOFT_RELU, ActivationLayerInfo::SQRT, ActivationLayerInfo::SQUARE, and ActivationLayerInfo::TANH.

 {
     switch(act_function)
     {
         case ActivationLayerInfo::ActivationFunction::ABS:
             os << "ABS";
             break;
         case ActivationLayerInfo::ActivationFunction::LINEAR:
             os << "LINEAR";
             break;
         case ActivationLayerInfo::ActivationFunction::LOGISTIC:
             os << "LOGISTIC";
             break;
         case ActivationLayerInfo::ActivationFunction::RELU:
             os << "RELU";
             break;
         case ActivationLayerInfo::ActivationFunction::BOUNDED_RELU:
             os << "BOUNDED_RELU";
             break;
         case ActivationLayerInfo::ActivationFunction::LEAKY_RELU:
             os << "LEAKY_RELU";
             break;
         case ActivationLayerInfo::ActivationFunction::SOFT_RELU:
             os << "SOFT_RELU";
             break;
         case ActivationLayerInfo::ActivationFunction::SQRT:
             os << "SQRT";
             break;
         case ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU:
             os << "LU_BOUNDED_RELU";
             break;
         case ActivationLayerInfo::ActivationFunction::SQUARE:
             os << "SQUARE";
             break;
         case ActivationLayerInfo::ActivationFunction::TANH:
             os << "TANH";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const NormType &	norm_type
	)

Formatted output of the NormType type.

Parameters

[out]	os	Output stream.
[in]	norm_type	Type to output.

Returns: Modified output stream.

Definition at line 372 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, CROSS_MAP, IN_MAP_1D, and IN_MAP_2D.

 {
     switch(norm_type)
     {
         case NormType::CROSS_MAP:
             os << "CROSS_MAP";
             break;
         case NormType::IN_MAP_1D:
             os << "IN_MAP_1D";
             break;
         case NormType::IN_MAP_2D:
             os << "IN_MAP_2D";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const NormalizationLayerInfo &	info
	)

Formatted output of NormalizationLayerInfo.

Parameters

[out]	os	Output stream.
[in]	info	Type to output.

Returns: Modified output stream.

Definition at line 412 of file TypePrinter.h.

References NormalizationLayerInfo::norm_size(), and NormalizationLayerInfo::type().

 {
     os << info.type() << ":NormSize=" << info.norm_size();
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const PoolingType &	pool_type
	)

Formatted output of the PoolingType type.

Parameters

[out]	os	Output stream.
[in]	pool_type	Type to output.

Returns: Modified output stream.

Definition at line 425 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, AVG, L2, and MAX.

 {
     switch(pool_type)
     {
         case PoolingType::AVG:
             os << "AVG";
             break;
         case PoolingType::MAX:
             os << "MAX";
             break;
         case PoolingType::L2:
             os << "L2";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const PoolingLayerInfo &	info
	)

Formatted output of PoolingLayerInfo.

Parameters

[out]	os	Output stream.
[in]	info	Type to output.

Returns: Modified output stream.

Definition at line 452 of file TypePrinter.h.

References PoolingLayerInfo::pool_type().

 {
     os << info.pool_type();
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const DataLayout &	data_layout
	)

Formatted output of the DataLayout type.

Parameters

[out]	os	Output stream.
[in]	data_layout	Type to output.

Returns: Modified output stream.

Definition at line 479 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, NCHW, NHWC, and UNKNOWN.

 {
     switch(data_layout)
     {
         case DataLayout::UNKNOWN:
             os << "UNKNOWN";
             break;
         case DataLayout::NHWC:
             os << "NHWC";
             break;
         case DataLayout::NCHW:
             os << "NCHW";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const DataType &	data_type
	)

Formatted output of the DataType type.

Parameters

[out]	os	Output stream.
[in]	data_type	Type to output.

Returns: Modified output stream.

Definition at line 519 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, F16, F32, F64, QASYMM8, QS16, QS8, S16, S32, S64, S8, SIZET, U16, U32, U64, U8, and UNKNOWN.

 {
     switch(data_type)
     {
         case DataType::UNKNOWN:
             os << "UNKNOWN";
             break;
         case DataType::U8:
             os << "U8";
             break;
         case DataType::QS8:
             os << "QS8";
             break;
         case DataType::QASYMM8:
             os << "QASYMM8";
             break;
         case DataType::S8:
             os << "S8";
             break;
         case DataType::U16:
             os << "U16";
             break;
         case DataType::S16:
             os << "S16";
             break;
         case DataType::QS16:
             os << "QS16";
             break;
         case DataType::U32:
             os << "U32";
             break;
         case DataType::S32:
             os << "S32";
             break;
         case DataType::U64:
             os << "U64";
             break;
         case DataType::S64:
             os << "S64";
             break;
         case DataType::F16:
             os << "F16";
             break;
         case DataType::F32:
             os << "F32";
             break;
         case DataType::F64:
             os << "F64";
             break;
         case DataType::SIZET:
             os << "SIZET";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const Format &	format
	)

Formatted output of the Format type.

Parameters

[out]	os	Output stream.
[in]	format	Type to output.

Returns: Modified output stream.

Definition at line 598 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, F16, F32, IYUV, NV12, NV21, RGB888, RGBA8888, S16, S32, U16, U32, U8, UNKNOWN, UV88, UYVY422, YUV444, and YUYV422.

 {
     switch(format)
     {
         case Format::UNKNOWN:
             os << "UNKNOWN";
             break;
         case Format::U8:
             os << "U8";
             break;
         case Format::S16:
             os << "S16";
             break;
         case Format::U16:
             os << "U16";
             break;
         case Format::S32:
             os << "S32";
             break;
         case Format::U32:
             os << "U32";
             break;
         case Format::F16:
             os << "F16";
             break;
         case Format::F32:
             os << "F32";
             break;
         case Format::UV88:
             os << "UV88";
             break;
         case Format::RGB888:
             os << "RGB888";
             break;
         case Format::RGBA8888:
             os << "RGBA8888";
             break;
         case Format::YUV444:
             os << "YUV444";
             break;
         case Format::YUYV422:
             os << "YUYV422";
             break;
         case Format::NV12:
             os << "NV12";
             break;
         case Format::NV21:
             os << "NV21";
             break;
         case Format::IYUV:
             os << "IYUV";
             break;
         case Format::UYVY422:
             os << "UYVY422";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const Channel &	channel
	)

Formatted output of the Channel type.

Parameters

[out]	os	Output stream.
[in]	channel	Type to output.

Returns: Modified output stream.

Definition at line 680 of file TypePrinter.h.

References A, ARM_COMPUTE_ERROR, B, C0, C1, C2, C3, G, R, U, UNKNOWN, V, and Y.

 {
     switch(channel)
     {
         case Channel::UNKNOWN:
             os << "UNKNOWN";
             break;
         case Channel::C0:
             os << "C0";
             break;
         case Channel::C1:
             os << "C1";
             break;
         case Channel::C2:
             os << "C2";
             break;
         case Channel::C3:
             os << "C3";
             break;
         case Channel::R:
             os << "R";
             break;
         case Channel::G:
             os << "G";
             break;
         case Channel::B:
             os << "B";
             break;
         case Channel::A:
             os << "A";
             break;
         case Channel::Y:
             os << "Y";
             break;
         case Channel::U:
             os << "U";
             break;
         case Channel::V:
             os << "V";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const BorderMode &	mode
	)

Formatted output of the BorderMode type.

Parameters

[out]	os	Output stream.
[in]	mode	Type to output.

Returns: Modified output stream.

Definition at line 747 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, CONSTANT, REPLICATE, and UNDEFINED.

 {
     switch(mode)
     {
         case BorderMode::UNDEFINED:
             os << "UNDEFINED";
             break;
         case BorderMode::CONSTANT:
             os << "CONSTANT";
             break;
         case BorderMode::REPLICATE:
             os << "REPLICATE";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const BorderSize &	border
	)

Formatted output of the BorderSize type.

Parameters

[out]	os	Output stream.
[in]	border	Type to output.

Returns: Modified output stream.

Definition at line 774 of file TypePrinter.h.

References BorderSize::bottom, BorderSize::left, BorderSize::right, and BorderSize::top.

 {
     os << border.top << ","
        << border.right << ","
        << border.bottom << ","
        << border.left;
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const InterpolationPolicy &	policy
	)

Formatted output of the InterpolationPolicy type.

Parameters

[out]	os	Output stream.
[in]	policy	Type to output.

Returns: Modified output stream.

Definition at line 791 of file TypePrinter.h.

References AREA, ARM_COMPUTE_ERROR, BILINEAR, and NEAREST_NEIGHBOR.

 {
     switch(policy)
     {
         case InterpolationPolicy::NEAREST_NEIGHBOR:
             os << "NEAREST_NEIGHBOR";
             break;
         case InterpolationPolicy::BILINEAR:
             os << "BILINEAR";
             break;
         case InterpolationPolicy::AREA:
             os << "AREA";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const SamplingPolicy &	policy
	)

Formatted output of the SamplingPolicy type.

Parameters

[out]	os	Output stream.
[in]	policy	Type to output.

Returns: Modified output stream.

Definition at line 818 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, CENTER, and TOP_LEFT.

 {
     switch(policy)
     {
         case SamplingPolicy::CENTER:
             os << "CENTER";
             break;
         case SamplingPolicy::TOP_LEFT:
             os << "TOP_LEFT";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const Rectangle &	rect
	)

Formatted output of the Rectangle type.

Parameters

[out]	os	Output stream.
[in]	rect	Type to output.

Returns: Modified output stream.

Definition at line 911 of file TypePrinter.h.

References Rectangle::height, Rectangle::width, Rectangle::x, and Rectangle::y.

 {
     os << rect.width << "x" << rect.height;
     os << "+" << rect.x << "+" << rect.y;
 
     return os;
 }
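
As a rough illustration of how such a stream operator can be used to build strings, the sketch below re-creates the same `<width>x<height>+<x>+<y>` layout for a stand-in type. `Rect` and `to_text` are hypothetical names introduced for this example, and the field types are assumed rather than taken from the library.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the Rectangle type (field types are an assumption).
struct Rect
{
    std::uint16_t x;
    std::uint16_t y;
    std::uint16_t width;
    std::uint16_t height;
};

// Same "<width>x<height>+<x>+<y>" layout as the operator documented above.
inline std::ostream &operator<<(std::ostream &os, const Rect &rect)
{
    os << rect.width << "x" << rect.height;
    os << "+" << rect.x << "+" << rect.y;
    return os;
}

// Hypothetical helper: render any streamable value into a std::string.
template <typename T>
std::string to_text(const T &value)
{
    std::ostringstream ss;
    ss << value;
    return ss.str();
}
```

With these definitions, `to_text(Rect{10, 20, 640, 480})` yields `640x480+10+20`.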

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const PadStrideInfo &	pad_stride_info
	)

Formatted output of the PadStrideInfo type.

Parameters

[out]	os	Output stream.
[in]	pad_stride_info	Type to output.

Returns: Modified output stream.

Definition at line 926 of file TypePrinter.h.

References PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), and PadStrideInfo::stride().

 {
     os << pad_stride_info.stride().first << "," << pad_stride_info.stride().second;
     os << ";";
     os << pad_stride_info.pad_left() << "," << pad_stride_info.pad_right() << ","
        << pad_stride_info.pad_top() << "," << pad_stride_info.pad_bottom();
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const ConvertPolicy &	policy
	)

Formatted output of the ConvertPolicy type.

Parameters

[out]	os	Output stream.
[in]	policy	Type to output.

Returns: Modified output stream.

Definition at line 1008 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, SATURATE, and WRAP.

 {
     switch(policy)
     {
         case ConvertPolicy::WRAP:
             os << "WRAP";
             break;
         case ConvertPolicy::SATURATE:
             os << "SATURATE";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const ReductionOperation &	op
	)

Formatted output of the ReductionOperation type.

Parameters

[out]	os	Output stream.
[in]	op	Type to output.

Returns: Modified output stream.

Definition at line 1039 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, and SUM_SQUARE.

 {
     switch(op)
     {
         case ReductionOperation::SUM_SQUARE:
             os << "SUM_SQUARE";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const KeyPoint &	point
	)

Formatted output of the KeyPoint type.

Parameters

[out]	os	Output stream
[in]	point	Type to output.

Returns: Modified output stream.

Definition at line 1120 of file TypePrinter.h.

References KeyPoint::error, KeyPoint::orientation, KeyPoint::scale, KeyPoint::strength, KeyPoint::tracking_status, KeyPoint::x, and KeyPoint::y.

 {
     os << "{x=" << point.x << ","
        << "y=" << point.y << ","
        << "strength=" << point.strength << ","
        << "scale=" << point.scale << ","
        << "orientation=" << point.orientation << ","
        << "tracking_status=" << point.tracking_status << ","
        << "error=" << point.error << "}";
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const PhaseType &	phase_type
	)

Formatted output of the PhaseType type.

Parameters

[out]	os	Output stream
[in]	phase_type	Type to output.

Returns: Modified output stream.

Definition at line 1140 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, SIGNED, and UNSIGNED.

 {
     switch(phase_type)
     {
         case PhaseType::SIGNED:
             os << "SIGNED";
             break;
         case PhaseType::UNSIGNED:
             os << "UNSIGNED";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const MagnitudeType &	magnitude_type
	)

Formatted output of the MagnitudeType type.

Parameters

[out]	os	Output stream
[in]	magnitude_type	Type to output.

Returns: Modified output stream.

Definition at line 1177 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, L1NORM, and L2NORM.

 {
     switch(magnitude_type)
     {
         case MagnitudeType::L1NORM:
             os << "L1NORM";
             break;
         case MagnitudeType::L2NORM:
             os << "L2NORM";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const GradientDimension &	dim
	)

Formatted output of the GradientDimension type.

Parameters

[out]	os	Output stream
[in]	dim	Type to output

Returns: Modified output stream.

Definition at line 1214 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, and GRAD_XY.

 {
     switch(dim)
     {
         case GradientDimension::GRAD_X:
             os << "GRAD_X";
             break;
         case GradientDimension::GRAD_Y:
             os << "GRAD_Y";
             break;
         case GradientDimension::GRAD_XY:
             os << "GRAD_XY";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const HOGNormType &	norm_type
	)

Formatted output of the HOGNormType type.

Parameters

[out]	os	Output stream
[in]	norm_type	Type to output

Returns: Modified output stream.

Definition at line 1254 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, L1_NORM, L2_NORM, and L2HYS_NORM.

 {
     switch(norm_type)
     {
         case HOGNormType::L1_NORM:
             os << "L1_NORM";
             break;
         case HOGNormType::L2_NORM:
             os << "L2_NORM";
             break;
         case HOGNormType::L2HYS_NORM:
             os << "L2HYS_NORM";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const Size2D &	size
	)

Formatted output of the Size2D type.

Parameters

[out]	os	Output stream
[in]	size	Type to output

Returns: Modified output stream.

Definition at line 1294 of file TypePrinter.h.

References Size2D::height, and Size2D::width.

 {
     os << size.width << "x" << size.height;
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const HOGInfo &	hog_info
	)

Formatted output of the HOGInfo type.

Parameters

[out]	os	Output stream
[in]	hog_info	Type to output

Returns: Modified output stream.

Definition at line 1321 of file TypePrinter.h.

References HOGInfo::block_size(), HOGInfo::block_stride(), HOGInfo::cell_size(), HOGInfo::detection_window_size(), HOGInfo::l2_hyst_threshold(), HOGInfo::normalization_type(), HOGInfo::num_bins(), and HOGInfo::phase_type().

 {
     os << "{CellSize=" << hog_info.cell_size() << ","
        << "BlockSize=" << hog_info.block_size() << ","
        << "DetectionWindowSize=" << hog_info.detection_window_size() << ","
        << "BlockStride=" << hog_info.block_stride() << ","
        << "NumBins=" << hog_info.num_bins() << ","
        << "NormType=" << hog_info.normalization_type() << ","
        << "L2HystThreshold=" << hog_info.l2_hyst_threshold() << ","
        << "PhaseType=" << hog_info.phase_type() << "}";
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const ConvolutionMethod &	conv_method
	)

Formatted output of the ConvolutionMethod type.

Parameters

[out]	os	Output stream
[in]	conv_method	Type to output

Returns: Modified output stream.

Definition at line 1355 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, DIRECT, GEMM, and WINOGRAD.

 {
     switch(conv_method)
     {
         case ConvolutionMethod::GEMM:
             os << "GEMM";
             break;
         case ConvolutionMethod::DIRECT:
             os << "DIRECT";
             break;
         case ConvolutionMethod::WINOGRAD:
             os << "WINOGRAD";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const GPUTarget &	gpu_target
	)

Formatted output of the GPUTarget type.

Parameters

[out]	os	Output stream
[in]	gpu_target	Type to output

Returns: Modified output stream.

Definition at line 1395 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, BIFROST, G51, G51BIG, G51LIT, G71, G72, GPU_ARCH_MASK, MIDGARD, T600, T700, T800, TBOX, TNOX, and TTRX.

 {
     switch(gpu_target)
     {
         case GPUTarget::GPU_ARCH_MASK:
             os << "GPU_ARCH_MASK";
             break;
         case GPUTarget::MIDGARD:
             os << "MIDGARD";
             break;
         case GPUTarget::BIFROST:
             os << "BIFROST";
             break;
         case GPUTarget::T600:
             os << "T600";
             break;
         case GPUTarget::T700:
             os << "T700";
             break;
         case GPUTarget::T800:
             os << "T800";
             break;
         case GPUTarget::G71:
             os << "G71";
             break;
         case GPUTarget::G72:
             os << "G72";
             break;
         case GPUTarget::G51:
             os << "G51";
             break;
         case GPUTarget::G51BIG:
             os << "G51BIG";
             break;
         case GPUTarget::G51LIT:
             os << "G51LIT";
             break;
         case GPUTarget::TNOX:
             os << "TNOX";
             break;
         case GPUTarget::TTRX:
             os << "TTRX";
             break;
         case GPUTarget::TBOX:
             os << "TBOX";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const DetectionWindow &	detection_window
	)

Formatted output of the DetectionWindow type.

Parameters

[out]	os	Output stream
[in]	detection_window	Type to output

Returns: Modified output stream.

Definition at line 1468 of file TypePrinter.h.

References DetectionWindow::height, DetectionWindow::idx_class, DetectionWindow::score, DetectionWindow::width, DetectionWindow::x, and DetectionWindow::y.

 {
     os << "{x=" << detection_window.x << ","
        << "y=" << detection_window.y << ","
        << "width=" << detection_window.width << ","
        << "height=" << detection_window.height << ","
        << "idx_class=" << detection_window.idx_class << ","
        << "score=" << detection_window.score << "}";
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const Termination &	termination
	)

Formatted output of the Termination type.

Parameters

[out]	os	Output stream
[in]	termination	Type to output

Returns: Modified output stream.

Definition at line 1500 of file TypePrinter.h.

References ARM_COMPUTE_ERROR, TERM_CRITERIA_BOTH, TERM_CRITERIA_EPSILON, and TERM_CRITERIA_ITERATIONS.

 {
     switch(termination)
     {
         case Termination::TERM_CRITERIA_EPSILON:
             os << "TERM_CRITERIA_EPSILON";
             break;
         case Termination::TERM_CRITERIA_ITERATIONS:
             os << "TERM_CRITERIA_ITERATIONS";
             break;
         case Termination::TERM_CRITERIA_BOTH:
             os << "TERM_CRITERIA_BOTH";
             break;
         default:
             ARM_COMPUTE_ERROR("NOT_SUPPORTED!");
     }
 
     return os;
 }

inline ::std::ostream& arm_compute::operator<<	(	::std::ostream &	os,
		const WinogradInfo &	info
	)

Formatted output of the WinogradInfo type.

Definition at line 1534 of file TypePrinter.h.

References WinogradInfo::convolution_info, WinogradInfo::kernel_size, WinogradInfo::output_data_layout, and WinogradInfo::output_tile_size.

 {
     os << "{OutputTileSize=" << info.output_tile_size << ","
        << "KernelSize=" << info.kernel_size << ","
        << "PadStride=" << info.convolution_info << ","
        << "OutputDataLayout=" << info.output_data_layout << "}";
 
     return os;
 }

bool arm_compute::operator==	(	const Dimensions< T > &	lhs,
		const Dimensions< T > &	rhs
	)

inline

Check that given dimensions are equal.

Parameters

[in]	lhs	Left-hand side Dimensions.
[in]	rhs	Right-hand side Dimensions.

Returns: True if the given dimensions are equal.

Definition at line 234 of file Dimensions.h.

References Dimensions< T >::cbegin(), Dimensions< T >::cend(), and Dimensions< T >::num_dimensions().

 {
     return ((lhs.num_dimensions() == rhs.num_dimensions()) && std::equal(lhs.cbegin(), lhs.cend(), rhs.cbegin()));
 }
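
The comparison above checks the dimension count first and only then compares elements. A minimal sketch of that pattern follows, using a simplified stand-in container. `Dims` is hypothetical, and it assumes that `cend()` points one past the last used dimension, which may differ from the library's `Dimensions< T >`.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical, simplified stand-in for Dimensions<T>: fixed storage plus a used-dimension count.
template <typename T, std::size_t MaxDims = 6>
struct Dims
{
    std::array<T, MaxDims> values;
    std::size_t            count;

    const T *cbegin() const { return values.data(); }
    const T *cend() const { return values.data() + count; } // assumption: end of the used range
    std::size_t num_dimensions() const { return count; }
};

// Equal only if both the number of dimensions and every used element match.
template <typename T, std::size_t N>
bool operator==(const Dims<T, N> &lhs, const Dims<T, N> &rhs)
{
    return (lhs.num_dimensions() == rhs.num_dimensions())
           && std::equal(lhs.cbegin(), lhs.cend(), rhs.cbegin());
}
```

Checking the count first keeps `std::equal` from comparing ranges of different lengths.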

inline ::std::istream& arm_compute::operator>>	(	::std::istream &	is,
		BorderMode &	mode
	)

Formatted input of the BorderMode type.

Parameters

[in,out]	is	Input stream.
[out]	mode	Border mode.

Returns: the modified input stream.

Definition at line 42 of file TypeReader.h.

References arm_compute::test::validation::c, CONSTANT, REPLICATE, and UNDEFINED.

 {
     std::string value;
 
     is >> value;
 
     std::transform(value.begin(), value.end(), value.begin(), [](unsigned char c)
     {
         return std::toupper(c);
     });
 
     if(value == "UNDEFINED")
     {
         mode = BorderMode::UNDEFINED;
     }
     else if(value == "CONSTANT")
     {
         mode = BorderMode::CONSTANT;
     }
     else if(value == "REPLICATE")
     {
         mode = BorderMode::REPLICATE;
     }
     else
     {
         throw std::invalid_argument("Unsupported value '" + value + "' for border mode");
     }
 
     return is;
 }
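
The parsing approach above (read a token, upper-case it, map it to an enumerator, throw on anything else) can be tried in isolation with the sketch below. `Mode` and `parse_mode` are hypothetical names used only for this example. They mirror the listing but are not part of the library.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for BorderMode.
enum class Mode
{
    UNDEFINED,
    CONSTANT,
    REPLICATE
};

// Case-insensitive extraction, following the same steps as the listing above.
inline std::istream &operator>>(std::istream &is, Mode &mode)
{
    std::string value;
    is >> value;

    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    if(value == "UNDEFINED")
    {
        mode = Mode::UNDEFINED;
    }
    else if(value == "CONSTANT")
    {
        mode = Mode::CONSTANT;
    }
    else if(value == "REPLICATE")
    {
        mode = Mode::REPLICATE;
    }
    else
    {
        throw std::invalid_argument("Unsupported value '" + value + "' for border mode");
    }
    return is;
}

// Hypothetical convenience wrapper: parse a mode from a string.
inline Mode parse_mode(const std::string &text)
{
    std::istringstream ss(text);
    Mode               mode = Mode::UNDEFINED;
    ss >> mode;
    return mode;
}
```

Because the token is upper-cased before matching, `replicate`, `Replicate` and `REPLICATE` all map to the same enumerator, while an unknown token such as `mirror` raises `std::invalid_argument`.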

void arm_compute::permute	(	Dimensions< T > &	dimensions,
		const PermutationVector &	perm
	)

inline

Permutes given Dimensions according to a permutation vector.

Warning: Validity of permutation is not checked

Parameters

[in,out]	dimensions	Dimensions to permute
[in]	perm	Permutation vector

Definition at line 536 of file Helpers.h.

References Dimensions< T >::begin(), Dimensions< T >::end(), Dimensions< T >::num_dimensions(), and Dimensions< T >::set().

Referenced by NumPyBinLoader::access_tensor(), arm_compute::misc::shape_calculator::compute_permutation_output_shape(), arm_compute::test::validation::DATA_TEST_CASE(), AssetsLibrary::fill(), arm_compute::test::validation::validate(), and arm_compute::test::validation::validate_wrap().

 {
     auto dimensions_copy = utility::make_array<Dimensions<T>::num_max_dimensions>(dimensions.begin(), dimensions.end());
     for(unsigned int i = 0; i < perm.num_dimensions(); ++i)
     {
         T dimension_val = (perm[i] < dimensions.num_dimensions()) ? dimensions_copy[perm[i]] : 0;
         dimensions.set(i, dimension_val);
     }
 }
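
The permutation rule in this listing is `result[i] = original[perm[i]]`, with a fallback value when `perm[i]` refers to a dimension that does not exist. The sketch below shows that rule on plain vectors. `permuted` is a hypothetical helper that returns a copy instead of modifying its argument. The fallback is passed in explicitly because the Dimensions overload uses 0 while the TensorShape overload uses 1.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a permuted copy of dims.
// Entries beyond perm.size() are left as they were, as in the listing above.
template <typename T>
std::vector<T> permuted(const std::vector<T> &dims, const std::vector<std::size_t> &perm, T fill)
{
    std::vector<T> out(dims);
    if(out.size() < perm.size())
    {
        out.resize(perm.size(), fill);
    }
    for(std::size_t i = 0; i < perm.size(); ++i)
    {
        // Read from the untouched input so earlier writes cannot affect later reads.
        out[i] = (perm[i] < dims.size()) ? dims[perm[i]] : fill;
    }
    return out;
}
```

For example, permuting `{2, 3, 4}` with `{2, 0, 1}` gives `{4, 2, 3}`. As with the library functions, the sketch does not check that `perm` is a valid permutation.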

void arm_compute::permute	(	TensorShape &	shape,
		const PermutationVector &	perm
	)

inline

Permutes given TensorShape according to a permutation vector.

Warning: Validity of permutation is not checked

Parameters

[in,out]	shape	Shape to permute
[in]	perm	Permutation vector

Definition at line 553 of file Helpers.h.

References auto_init_if_empty(), calculate_valid_region_scale(), coords2index(), arm_compute::test::validation::data_type, get_data_layout_dimension_index(), index2coords(), Dimensions< T >::num_dimensions(), TensorShape::set(), set_data_layout_if_unknown(), set_data_type_if_unknown(), set_fixed_point_position_if_zero(), set_format_if_unknown(), set_quantization_info_if_empty(), set_shape_if_empty(), and arm_compute::test::validation::shape.

 {
     TensorShape shape_copy = shape;
     for(unsigned int i = 0; i < perm.num_dimensions(); ++i)
     {
         size_t dimension_val = (perm[i] < shape.num_dimensions()) ? shape_copy[perm[i]] : 1;
         shape.set(i, dimension_val, false); // Avoid changes in _num_dimension
     }
 }

uint8_t pixel_area_c1u8_clamp	(	const uint8_t *	first_pixel_ptr,
		size_t	stride,
		size_t	width,
		size_t	height,
		float	wr,
		float	hr,
		int	x,
		int	y
	)

inline

Return the pixel at (x,y) using area interpolation by clamping when out of borders.

The image must be single channel U8

Note: The interpolation area depends on the width and height ratios of the input and output images. Currently the average of the contributing pixels is calculated.

Parameters

[in]	first_pixel_ptr	Pointer to the first pixel of a single channel U8 image.
[in]	stride	Stride in bytes of the image
[in]	width	Width of the image
[in]	height	Height of the image
[in]	wr	Width ratio between the input image width and the output image width.
[in]	hr	Height ratio between the input image height and the output image height.
[in]	x	X position of the wanted pixel
[in]	y	Y position of the wanted pixel

Returns: The pixel at (x, y) using area interpolation.

Definition at line 32 of file Helpers.inl.

References accumulate(), ARM_COMPUTE_ERROR_ON, arm_compute::utility::for_each(), arm_compute::test::fixed_point_arithmetic::detail::max(), arm_compute::test::fixed_point_arithmetic::detail::min(), Window::set(), and sum().

Referenced by pixel_bilinear_c1_clamp().

 {
     ARM_COMPUTE_ERROR_ON(first_pixel_ptr == nullptr);
 
     // Calculate sampling position
     float in_x = (x + 0.5f) * wr - 0.5f;
     float in_y = (y + 0.5f) * hr - 0.5f;
 
     // Get bounding box offsets
     int x_from = std::floor(x * wr - 0.5f - in_x);
     int y_from = std::floor(y * hr - 0.5f - in_y);
     int x_to   = std::ceil((x + 1) * wr - 0.5f - in_x);
     int y_to   = std::ceil((y + 1) * hr - 0.5f - in_y);
 
     // Clamp position to borders
     in_x = std::max(-1.f, std::min(in_x, static_cast<float>(width)));
     in_y = std::max(-1.f, std::min(in_y, static_cast<float>(height)));
 
     // Clamp bounding box offsets to borders
     x_from = ((in_x + x_from) < -1) ? -1 : x_from;
     y_from = ((in_y + y_from) < -1) ? -1 : y_from;
     x_to   = ((in_x + x_to) > width) ? (width - in_x) : x_to;
     y_to   = ((in_y + y_to) > height) ? (height - in_y) : y_to;
 
     // Get pixel index
     const int xi = std::floor(in_x);
     const int yi = std::floor(in_y);
 
     // Bounding box elements in each dimension
     const int x_elements = (x_to - x_from + 1);
     const int y_elements = (y_to - y_from + 1);
     ARM_COMPUTE_ERROR_ON(x_elements == 0 || y_elements == 0);
 
     // Sum pixels in area
     int sum = 0;
     for(int j = yi + y_from, je = yi + y_to; j <= je; ++j)
     {
         const uint8_t *ptr = first_pixel_ptr + j * stride + xi + x_from;
         sum                = std::accumulate(ptr, ptr + x_elements, sum);
     }
 
     // Return average
     return sum / (x_elements * y_elements);
 }
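
The core idea of the routine above is to average every pixel inside a bounding box after clamping the box to the image. The following sketch shows only that idea on a tightly packed row-major image. `box_average_u8` is a hypothetical helper. Unlike the library function, it takes the box corners directly and clamps to `[0, size - 1]` rather than allowing a one-pixel border, so it is a simplification and not a reimplementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: average of the pixels in the inclusive box [x0, x1] x [y0, y1],
// with the box clamped to the image. Assumes a tightly packed single-channel U8 image.
inline std::uint8_t box_average_u8(const std::vector<std::uint8_t> &img, std::size_t width, std::size_t height,
                                   int x0, int y0, int x1, int y1)
{
    assert(width > 0 && height > 0 && img.size() == width * height);

    const int max_x = static_cast<int>(width) - 1;
    const int max_y = static_cast<int>(height) - 1;

    // Clamp the box to the image and keep it non-empty.
    x0 = std::min(std::max(x0, 0), max_x);
    y0 = std::min(std::max(y0, 0), max_y);
    x1 = std::min(std::max(x1, x0), max_x);
    y1 = std::min(std::max(y1, y0), max_y);

    const int x_elements = x1 - x0 + 1;
    const int y_elements = y1 - y0 + 1;

    // Sum the pixels row by row.
    int sum = 0;
    for(int j = y0; j <= y1; ++j)
    {
        const std::uint8_t *row = img.data() + static_cast<std::size_t>(j) * width + x0;
        sum                     = std::accumulate(row, row + x_elements, sum);
    }

    // Integer average, truncated towards zero.
    return static_cast<std::uint8_t>(sum / (x_elements * y_elements));
}
```

On a 2x2 image holding 10, 20, 30 and 40, the full box averages to 25.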

T arm_compute::pixel_bilinear_c1	(	const T *	first_pixel_ptr,
		size_t	stride,
		float	x,
		float	y
	)

inline

Return the pixel at (x,y) using bilinear interpolation.

Warning: Only works if the iterator was created with an IImage

Parameters

[in]	first_pixel_ptr	Pointer to the first pixel of a single channel input.
[in]	stride	Stride in bytes of the image
[in]	x	X position of the wanted pixel
[in]	y	Y position of the wanted pixel

Returns: The pixel at (x, y) using bilinear interpolation.

Definition at line 210 of file Helpers.h.

References ARM_COMPUTE_ERROR_ON, and delta_bilinear_c1().

 {
     ARM_COMPUTE_ERROR_ON(first_pixel_ptr == nullptr);
 
     const int32_t xi = std::floor(x);
     const int32_t yi = std::floor(y);
 
     const float dx = x - xi;
     const float dy = y - yi;
 
     return delta_bilinear_c1(first_pixel_ptr + xi + yi * stride, stride, dx, dy);
 }
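
The body of `delta_bilinear_c1()` is not shown in this section, so the sketch below uses the textbook bilinear formula as an assumption about what such a helper computes: the four neighbouring pixels are weighted by the fractional offsets `dx` and `dy`. `bilinear_sample` is a hypothetical name, it works on `float` pixels, and its stride is expressed in elements rather than bytes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: textbook bilinear interpolation on a single-channel float image.
// The caller must ensure that (floor(x) + 1, floor(y) + 1) is still inside the buffer.
inline float bilinear_sample(const float *first_pixel, std::ptrdiff_t stride, float x, float y)
{
    assert(first_pixel != nullptr);

    // Integer pixel position and fractional offsets, as in the listing above.
    const int   xi = static_cast<int>(std::floor(x));
    const int   yi = static_cast<int>(std::floor(y));
    const float dx = x - static_cast<float>(xi);
    const float dy = y - static_cast<float>(yi);

    const float *p = first_pixel + xi + yi * stride;

    const float a00 = p[0];          // top-left
    const float a01 = p[1];          // top-right
    const float a10 = p[stride];     // bottom-left
    const float a11 = p[stride + 1]; // bottom-right

    // Weights sum to 1.
    return a00 * (1.f - dx) * (1.f - dy) + a01 * dx * (1.f - dy)
           + a10 * (1.f - dx) * dy + a11 * dx * dy;
}
```

Sampling the centre of a 2x2 image holding 0, 10, 20 and 30 returns the mean of the four pixels.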

uint8_t arm_compute::pixel_bilinear_c1_clamp	(	const T *	first_pixel_ptr,
		size_t	stride,
		size_t	width,
		size_t	height,
		float	x,
		float	y
	)

inline

Return the pixel at (x,y) using bilinear interpolation by clamping when out of borders.

The image must be single channel input

Warning: Only works if the iterator was created with an IImage

Parameters

[in]	first_pixel_ptr	Pointer to the first pixel of a single channel image.
[in]	stride	Stride in bytes of the image
[in]	width	Width of the image
[in]	height	Height of the image
[in]	x	X position of the wanted pixel
[in]	y	Y position of the wanted pixel

Returns: The pixel at (x, y) using bilinear interpolation.

Definition at line 237 of file Helpers.h.

References ARM_COMPUTE_ERROR_ON, delta_bilinear_c1(), delta_linear_c1_x(), delta_linear_c1_y(), arm_compute::test::fixed_point_arithmetic::detail::max(), arm_compute::test::fixed_point_arithmetic::detail::min(), and pixel_area_c1u8_clamp().

 {
     ARM_COMPUTE_ERROR_ON(first_pixel_ptr == nullptr);
 
     x = std::max(-1.f, std::min(x, static_cast<float>(width)));
     y = std::max(-1.f, std::min(y, static_cast<float>(height)));
 
     const float xi = std::floor(x);
     const float yi = std::floor(y);
 
     const float dx = x - xi;
     const float dy = y - yi;
 
     if(dx == 0.0f)
     {
         if(dy == 0.0f)
         {
             return static_cast<T>(first_pixel_ptr[static_cast<int32_t>(xi) + static_cast<int32_t>(yi) * stride]);
         }
         return delta_linear_c1_y(first_pixel_ptr + static_cast<int32_t>(xi) + static_cast<int32_t>(yi) * stride, stride, dy);
     }
     if(dy == 0.0f)
     {
         return delta_linear_c1_x(first_pixel_ptr + static_cast<int32_t>(xi) + static_cast<int32_t>(yi) * stride, dx);
     }
     return delta_bilinear_c1(first_pixel_ptr + static_cast<int32_t>(xi) + static_cast<int32_t>(yi) * stride, stride, dx, dy);
 }

size_t arm_compute::pixel_size_from_format ( Format format )

inline

The size in bytes of the pixel format.

Parameters

[in] format Input format

Returns: The size in bytes of the pixel format

Definition at line 144 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, IYUV, NV12, NV21, RGB888, RGBA8888, S16, S32, U16, U32, U8, UV88, UYVY422, YUV444, and YUYV422.

 {
     switch(format)
     {
         case Format::U8:
             return 1;
         case Format::U16:
         case Format::S16:
         case Format::F16:
         case Format::UV88:
         case Format::YUYV422:
         case Format::UYVY422:
             return 2;
         case Format::RGB888:
             return 3;
         case Format::RGBA8888:
             return 4;
         case Format::U32:
         case Format::S32:
         case Format::F32:
             return 4;
         //Doesn't make sense for planar formats:
         case Format::NV12:
         case Format::NV21:
         case Format::IYUV:
         case Format::YUV444:
         default:
             ARM_COMPUTE_ERROR("Undefined pixel size for given format");
             return 0;
     }
 }
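
A typical use of a per-pixel size is computing how many bytes one packed image row occupies. The sketch below illustrates this with a reduced stand-in enumeration. `PixelFormat`, `bytes_per_pixel` and `packed_row_bytes` are hypothetical names, only a few interleaved formats are covered, and the sketch throws instead of calling `ARM_COMPUTE_ERROR`.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical, reduced stand-in for the Format enumeration.
enum class PixelFormat
{
    U8,
    S16,
    RGB888,
    RGBA8888,
    F32
};

// Size in bytes of one pixel, using the same values as the listing above.
inline std::size_t bytes_per_pixel(PixelFormat format)
{
    switch(format)
    {
        case PixelFormat::U8:
            return 1;
        case PixelFormat::S16:
            return 2;
        case PixelFormat::RGB888:
            return 3;
        case PixelFormat::RGBA8888:
        case PixelFormat::F32:
            return 4;
    }
    throw std::invalid_argument("Undefined pixel size for given format");
}

// Bytes needed for one row without padding.
inline std::size_t packed_row_bytes(PixelFormat format, std::size_t width)
{
    return width * bytes_per_pixel(format);
}
```

For a 640-pixel-wide RGB888 image this gives 1920 bytes per row. Real buffers may add padding, so the actual stride can be larger.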

int arm_compute::plane_idx_from_channel	(	Format	format,
		Channel	channel
	)

inline

Return the plane index of a given channel given an input format.

Parameters

[in]	format	Input format
[in]	channel	Input channel

Returns: The plane index of the specific channel of the specific format

Definition at line 254 of file Utils.h.

References ARM_COMPUTE_ERROR, F16, F32, IYUV, NV12, NV21, RGB888, RGBA8888, S16, S32, U, U16, U32, U8, UV88, UYVY422, V, Y, YUV444, and YUYV422.

Referenced by arm_compute::test::validation::reference::channel_extract().

 {
     switch(format)
     {
         // Single planar formats have a single plane
         case Format::U8:
         case Format::U16:
         case Format::S16:
         case Format::U32:
         case Format::S32:
         case Format::F16:
         case Format::F32:
         case Format::UV88:
         case Format::RGB888:
         case Format::RGBA8888:
         case Format::YUYV422:
         case Format::UYVY422:
             return 0;
         // Multi planar formats
         case Format::NV12:
         case Format::NV21:
         {
             // Channel U and V share the same plane of format UV88
             switch(channel)
             {
                 case Channel::Y:
                     return 0;
                 case Channel::U:
                 case Channel::V:
                     return 1;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         case Format::IYUV:
         case Format::YUV444:
         {
             switch(channel)
             {
                 case Channel::Y:
                     return 0;
                 case Channel::U:
                     return 1;
                 case Channel::V:
                     return 2;
                 default:
                     ARM_COMPUTE_ERROR("Not supported channel");
                     return 0;
             }
         }
         default:
             ARM_COMPUTE_ERROR("Not supported format");
             return 0;
     }
 }

void arm_compute::print_consecutive_elements	(	std::ostream &	s,
		DataType	dt,
		const uint8_t *	ptr,
		unsigned int	n,
		int	stream_width,
		const std::string &	element_delim = `" "`
	)

Print consecutive elements to an output stream.

Parameters

[out]	s	Output stream to print the elements to.
[in]	dt	Data type of the elements
[in]	ptr	Pointer to print the elements from.
[in]	n	Number of elements to print.
[in]	stream_width	(Optional) Width of the stream. If set to 0 the element's width is used. Defaults to 0.
[in]	element_delim	(Optional) Delimiter between consecutive elements. Defaults to a single space.

Referenced by max_consecutive_elements_display_width_impl().

void arm_compute::print_consecutive_elements_impl	(	std::ostream &	s,
		const T *	ptr,
		unsigned int	n,
		int	stream_width = `0`,
		const std::string &	element_delim = `" "`
	)

Print consecutive elements to an output stream.

Parameters

[out]	s	Output stream to print the elements to.
[in]	ptr	Pointer to print the elements from.
[in]	n	Number of elements to print.
[in]	stream_width	(Optional) Width of the stream. If set to 0 the element's width is used. Defaults to 0.
[in]	element_delim	(Optional) Delimiter between consecutive elements. Defaults to a single space.

Definition at line 1090 of file Utils.h.

 {
     using print_type = typename std::conditional<std::is_floating_point<T>::value, T, int>::type;
 
     for(unsigned int i = 0; i < n; ++i)
     {
         // Set stream width as it is not a "sticky" stream manipulator
         if(stream_width != 0)
         {
             s.width(stream_width);
         }
 
         if(std::is_same<typename std::decay<T>::type, half>::value)
         {
             // T is used instead of print_type here because std::is_floating_point<half> returns false, which would make print_type int.
             s << std::right << static_cast<T>(ptr[i]) << element_delim;
         }
         else
         {
             s << std::right << static_cast<print_type>(ptr[i]) << element_delim;
         }
     }
 }
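
Two details of the listing above are easy to miss: integer elements are promoted to `int` so that 8-bit values print as numbers rather than characters, and the stream width has to be set again before every element because it is reset after each formatted output. The sketch below demonstrates both on a string stream. `join_elements` is a hypothetical helper and leaves out the `half` special case.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: format n consecutive elements into a string.
template <typename T>
std::string join_elements(const T *ptr, unsigned int n, int stream_width = 0, const std::string &element_delim = " ")
{
    // Floating-point types print as themselves, everything else is promoted to int.
    using print_type = typename std::conditional<std::is_floating_point<T>::value, T, int>::type;

    std::ostringstream s;
    for(unsigned int i = 0; i < n; ++i)
    {
        // width() only applies to the next formatted output, so it is set on every iteration.
        if(stream_width != 0)
        {
            s.width(stream_width);
        }
        s << std::right << static_cast<print_type>(ptr[i]) << element_delim;
    }
    return s.str();
}
```

Without the promotion, a `uint8_t` value of 65 would be written as the character `A`. Note that the delimiter is also appended after the last element.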

std::string arm_compute::read_file	(	const std::string &	filename,
		bool	binary
	)

Load an entire file in memory.

Parameters

[in]	filename	Name of the file to read.
[in]	binary	Whether the file is binary.

Returns: The content of the file.

Referenced by floor_to_multiple().

int arm_compute::round	(	float	x,
		RoundingPolicy	rounding_policy
	)

Return a rounded value of x.

Rounding is done according to the rounding_policy.

Parameters

[in]	x	Float value to be rounded.
[in]	rounding_policy	Policy determining how rounding is done.

Returns: Rounded value of the argument x.

Referenced by DATA_TEST_CASE(), finalize(), lktracker_stage0(), lktracker_stage1(), pooling_layer_MxN_quantized_nchw(), pooling_layer_MxN_quantized_nhwc(), and roi_pooling_layer().

int32x4_t rounding_divide_by_pow2	(	int32x4_t	x,
		int	exponent
	)

inline

Round to the nearest division by a power-of-two using exponent.

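Note: As the listing below shows, this function performs a rounding right shift, approximately (x + 2^(n-1)) / 2^n where n = exponent, with a sign-based fix-up so that ties on negative values round away from zero.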

Parameters

[in]	x	Vector of 4 elements
[in]	exponent	Integer value used to round to nearest division by a power-of-two

Returns: the nearest division by a power-of-two using exponent

Definition at line 26 of file NEAsymm.inl.

Referenced by finalize_quantization().

 {
     const int32x4_t shift_vec  = vdupq_n_s32(-exponent);
     const int32x4_t fixup      = vshrq_n_s32(vandq_s32(x, shift_vec), 31);
     const int32x4_t fixed_up_x = vqaddq_s32(x, fixup);
     return vrshlq_s32(fixed_up_x, shift_vec);
 }
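A scalar sketch may make the NEON sequence above easier to follow. It assumes `0 <= exponent < 31` and an arithmetic right shift on negative values; the name `rounding_divide_by_pow2_scalar` is hypothetical, and a 64-bit intermediate stands in for the saturating add used by the vector code.

```cpp
#include <cassert>
#include <cstdint>

// Scalar sketch of a rounding divide by 2^exponent (ties rounded away from zero).
// Assumes 0 <= exponent < 31 and that >> on negative values is an arithmetic shift.
inline int32_t rounding_divide_by_pow2_scalar(int32_t x, int exponent)
{
    assert(exponent >= 0 && exponent < 31);
    if(exponent == 0)
    {
        return x;
    }
    const int64_t fixup    = (x < 0) ? -1 : 0;                        // mirrors the sign-based fix-up
    const int64_t rounding = static_cast<int64_t>(1) << (exponent - 1); // half of the divisor
    return static_cast<int32_t>((static_cast<int64_t>(x) + fixup + rounding) >> exponent);
}
```

For example, 5 / 2 rounds to 3 and -5 / 2 rounds to -3, whereas a plain arithmetic shift would give 2 and -3.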

qint16_t sabs_qs16 ( qint16_t a )

inline

16 bit fixed point scalar absolute value

Parameters

[in] a 16 bit fixed point input

Returns: The result of the 16 bit fixed point absolute value

Definition at line 67 of file FixedPoint.inl.

References arm_compute::test::validation::a, arm_compute::test::fixed_point_arithmetic::detail::max(), and arm_compute::test::fixed_point_arithmetic::detail::min().

Referenced by sqexp_qs16().

 {
     return (a < 0) ? (a == std::numeric_limits<int16_t>::min()) ? std::numeric_limits<int16_t>::max() : -a : a;
 }

qint8_t sabs_qs8 ( qint8_t a )

inline

8 bit fixed point scalar absolute value

Parameters

[in] a 8 bit fixed point input

Returns: The result of the 8 bit fixed point absolute value

Definition at line 62 of file FixedPoint.inl.

References arm_compute::test::validation::a, arm_compute::test::fixed_point_arithmetic::detail::max(), and arm_compute::test::fixed_point_arithmetic::detail::min().

Referenced by sqexp_qs8().

 {
     return (a < 0) ? (a == std::numeric_limits<int8_t>::min()) ? std::numeric_limits<int8_t>::max() : -a : a;
 }
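The nested conditional in both absolute-value listings exists because the most negative value has no positive counterpart in two's complement. A minimal sketch on plain `int8_t`, with the hypothetical name `saturating_abs_i8`, spells the three cases out:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Saturating absolute value on a plain int8_t (hypothetical helper).
inline int8_t saturating_abs_i8(int8_t a)
{
    if(a >= 0)
    {
        return a;
    }
    // -(-128) does not fit in int8_t, so clamp to +127.
    if(a == std::numeric_limits<int8_t>::min())
    {
        return std::numeric_limits<int8_t>::max();
    }
    return static_cast<int8_t>(-a);
}
```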

qint16_t sadd_qs16	(	qint16_t	a,
		qint16_t	b
	)

inline

16 bit fixed point scalar add

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input

Returns: The result of the 16 bit fixed point addition

Definition at line 77 of file FixedPoint.inl.

References arm_compute::test::validation::b.

Referenced by slog_qs16().

 {
     return a + b;
 }

qint8_t sadd_qs8	(	qint8_t	a,
		qint8_t	b
	)

inline

8 bit fixed point scalar add

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input

Returns: The result of the 8 bit fixed point addition

Definition at line 72 of file FixedPoint.inl.

References arm_compute::test::validation::b.

Referenced by slog_qs8().

 {
     return a + b;
 }

const std::pair<unsigned int, unsigned int> arm_compute::scaled_dimensions	(	unsigned int	width,
		unsigned int	height,
		unsigned int	kernel_width,
		unsigned int	kernel_height,
		const PadStrideInfo &	pad_stride_info,
		const Size2D &	dilation = `Size2D(1U, 1U)`
	)

Returns expected width and height of output scaled tensor depending on dimensions rounding mode.

Parameters

[in]	width	Width of input tensor (Number of columns)
[in]	height	Height of input tensor (Number of rows)
[in]	kernel_width	Kernel width.
[in]	kernel_height	Kernel height.
[in]	pad_stride_info	Pad and stride information.
[in]	dilation	(Optional) Dilation, in elements, across x and y. Defaults to (1, 1).

Returns: A pair with the new width in the first position and the new height in the second.

Referenced by arm_compute::misc::shape_calculator::compute_deep_convolution_shape(), arm_compute::misc::shape_calculator::compute_depthwise_convolution_shape(), arm_compute::misc::shape_calculator::compute_im2col_conv_shape(), arm_compute::misc::shape_calculator::compute_pool_shape(), arm_compute::misc::shape_calculator::compute_winograd_output_transform_shape(), arm_compute::test::validation::reference::convolution_layer_nchw(), data_type_for_convolution_matrix(), and arm_compute::test::validation::reference::locally_connected().
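This page does not show the body of scaled_dimensions(), so the following is only a sketch of the conventional output-size formula for a padded, strided, dilated window, under the assumption that the library follows it. `PadStrideSketch`, `RoundingSketch`, `scaled_dim` and `scaled_dimensions_sketch` are hypothetical stand-ins, not library types or functions.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical stand-ins; these are not the library's PadStrideInfo or rounding enum.
enum class RoundingSketch { Floor, Ceil };

struct PadStrideSketch
{
    unsigned int   stride_x, stride_y;
    unsigned int   pad_left, pad_right, pad_top, pad_bottom;
    RoundingSketch rounding;
};

// One output dimension: round((in + pads - effective_kernel) / stride) + 1.
inline unsigned int scaled_dim(unsigned int in, unsigned int kernel, unsigned int pad_a, unsigned int pad_b,
                               unsigned int stride, unsigned int dilation, RoundingSketch r)
{
    assert(stride > 0 && kernel > 0 && dilation > 0);
    const unsigned int effective_kernel = dilation * (kernel - 1) + 1;
    assert(in + pad_a + pad_b >= effective_kernel);
    const float steps   = static_cast<float>(in + pad_a + pad_b - effective_kernel) / static_cast<float>(stride);
    const float rounded = (r == RoundingSketch::Floor) ? std::floor(steps) : std::ceil(steps);
    return static_cast<unsigned int>(rounded) + 1;
}

// Returns (width, height), matching the order described above.
inline std::pair<unsigned int, unsigned int> scaled_dimensions_sketch(unsigned int width, unsigned int height,
                                                                      unsigned int kernel_width, unsigned int kernel_height,
                                                                      const PadStrideSketch &info,
                                                                      unsigned int dilation_x = 1, unsigned int dilation_y = 1)
{
    return std::make_pair(scaled_dim(width, kernel_width, info.pad_left, info.pad_right, info.stride_x, dilation_x, info.rounding),
                          scaled_dim(height, kernel_height, info.pad_top, info.pad_bottom, info.stride_y, dilation_y, info.rounding));
}
```

Under this formula a 224x224 input with a 3x3 kernel, stride 2 and padding 1 gives 112x112 with floor rounding and 113x113 with ceil rounding.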

float scvt_f32_qs16	(	qint16_t	a,
		int	fixed_point_position
	)

inline

Convert a 16 bit fixed point to float.

Parameters

[in]	a	Input to convert
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion 16 bit fixed point -> float

Definition at line 384 of file FixedPoint.inl.

References arm_compute::test::validation::a.

 {
     return static_cast<float>(a) / (1 << fixed_point_position);
 }

float scvt_f32_qs8	(	qint8_t	a,
		int	fixed_point_position
	)

inline

Convert an 8 bit fixed point to float.

Parameters

[in]	a	Input to convert
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion 8 bit fixed point -> float

Definition at line 373 of file FixedPoint.inl.

References arm_compute::test::validation::a.

 {
     return static_cast<float>(a) / (1 << fixed_point_position);
 }
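Both conversions above divide the raw integer by 2^fixed_point_position. A minimal sketch on plain `int8_t` follows; `fixed_to_float_i8` is a hypothetical name, and the range check on the position is an added assumption.

```cpp
#include <cassert>
#include <cstdint>

// Interpret a raw 8-bit value as a fixed point number with `fixed_point_position` fractional bits.
inline float fixed_to_float_i8(int8_t a, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    return static_cast<float>(a) / static_cast<float>(1 << fixed_point_position);
}
```

With 4 fractional bits, the raw value 24 represents 24 / 16 = 1.5.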

qint16_t sdiv_qs16	(	qint16_t	a,
		qint16_t	b,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar division

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point division.

Definition at line 255 of file FixedPoint.inl.

References arm_compute::test::validation::b.

Referenced by slog_qs16().

 {
     const qint32_t temp = a << fixed_point_position;
     return static_cast<qint16_t>(temp / b);
 }

qint8_t sdiv_qs8	(	qint8_t	a,
		qint8_t	b,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar division

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point division.

Definition at line 249 of file FixedPoint.inl.

References arm_compute::test::validation::b.

Referenced by slog_qs8().

 {
     const qint16_t temp = a << fixed_point_position;
     return static_cast<qint8_t>(temp / b);
 }
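The division listings pre-scale the numerator by 2^fixed_point_position so that the quotient keeps the same number of fractional bits. A sketch on plain integers, with the hypothetical name `fixed_div_i8` and an added non-zero check on the divisor:

```cpp
#include <cassert>
#include <cstdint>

// Fixed point division: (a * 2^fpp) / b keeps `fpp` fractional bits in the result.
inline int8_t fixed_div_i8(int8_t a, int8_t b, int fixed_point_position)
{
    assert(b != 0);
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    // Multiply instead of shifting so that negative numerators stay well defined.
    const int16_t temp = static_cast<int16_t>(a * (1 << fixed_point_position));
    return static_cast<int8_t>(temp / b); // no saturation, as in the listing above
}
```

With 4 fractional bits, 3.0 / 2.0 is computed as (48 * 16) / 32 = 24, which represents 1.5. Like the original, this sketch does not saturate, so quotients outside the 8-bit range wrap on the final cast.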

bool arm_compute::separate_matrix	(	const int16_t *	conv,
		int16_t *	conv_col,
		int16_t *	conv_row,
		uint8_t	size
	)

inline

Separate a 2D convolution into two 1D convolutions.

Parameters

[in]	conv	2D convolution
[out]	conv_col	1D vertical convolution
[out]	conv_row	1D horizontal convolution
[in]	size	Size of the 2D convolution

Returns: true if the separation was successful

Definition at line 577 of file Utils.h.

References arm_compute::test::fixed_point_arithmetic::detail::abs().

 {
     int32_t min_col     = -1;
     int16_t min_col_val = -1;
 
     for(int32_t i = 0; i < size; ++i)
     {
         if(conv[i] != 0 && (min_col < 0 || abs(min_col_val) > abs(conv[i])))
         {
             min_col     = i;
             min_col_val = conv[i];
         }
     }
 
     if(min_col < 0)
     {
         return false;
     }
 
     for(uint32_t j = 0; j < size; ++j)
     {
         conv_col[j] = conv[min_col + j * size];
     }
 
     for(uint32_t i = 0; i < size; i++)
     {
         if(static_cast<int>(i) == min_col)
         {
             conv_row[i] = 1;
         }
         else
         {
             int16_t coeff = conv[i] / conv[min_col];
 
             for(uint32_t j = 1; j < size; ++j)
             {
                 if(conv[i + j * size] != (conv_col[j] * coeff))
                 {
                     return false;
                 }
             }
 
             conv_row[i] = coeff;
         }
     }
 
     return true;
 }
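A successful separation means that every entry of the 2D kernel equals the product of one column coefficient and one row coefficient, that is, `conv[i + j * size] == conv_col[j] * conv_row[i]`. The sketch below rebuilds the kernel from its two factors, which can be used to check a result; `outer_product` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rebuild the size x size kernel (row-major) from its vertical and horizontal 1D factors.
inline std::vector<int16_t> outer_product(const std::vector<int16_t> &conv_col, const std::vector<int16_t> &conv_row)
{
    assert(conv_col.size() == conv_row.size());
    const std::size_t    size = conv_col.size();
    std::vector<int16_t> conv(size * size);
    for(std::size_t j = 0; j < size; ++j) // j indexes rows of the 2D kernel
    {
        for(std::size_t i = 0; i < size; ++i) // i indexes columns
        {
            conv[i + j * size] = static_cast<int16_t>(conv_col[j] * conv_row[i]);
        }
    }
    return conv;
}
```

For the 3x3 horizontal Sobel kernel {-1, 0, 1, -2, 0, 2, -1, 0, 1}, tracing the listing above by hand gives conv_col = {-1, -2, -1} and conv_row = {1, 0, -1}, and their outer product reproduces the kernel.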

bool set_data_layout_if_unknown	(	ITensorInfo &	info,
		DataLayout	data_layout
	)

inline

Set the data layout to the specified value if the current data layout is unknown.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	data_layout	New data layout.

Returns: True if the data layout has been changed.

Definition at line 270 of file Helpers.inl.

References ITensorInfo::data_layout(), ITensorInfo::set_data_layout(), and UNKNOWN.

Referenced by permute().

 {
     if(info.data_layout() == DataLayout::UNKNOWN)
     {
         info.set_data_layout(data_layout);
         return true;
     }
 
     return false;
 }

bool set_data_type_if_unknown	(	ITensorInfo &	info,
		DataType	data_type
	)

inline

Set the data type and number of channels to the specified value if the current data type is unknown.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	data_type	New data type.

Returns: True if the data type has been changed.

Definition at line 259 of file Helpers.inl.

References ITensorInfo::data_type(), ITensorInfo::set_data_type(), and UNKNOWN.

Referenced by permute().

 {
     if(info.data_type() == DataType::UNKNOWN)
     {
         info.set_data_type(data_type);
         return true;
     }
 
     return false;
 }

bool set_fixed_point_position_if_zero	(	ITensorInfo &	info,
		int	fixed_point_position
	)

inline

Set the fixed point position to the specified value if the current fixed point position is 0 and the data type is QS8 or QS16.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	fixed_point_position	New fixed point position

Returns: True if the fixed point position has been changed.

Definition at line 281 of file Helpers.inl.

References ITensorInfo::data_type(), ITensorInfo::fixed_point_position(), QS16, QS8, and ITensorInfo::set_fixed_point_position().

Referenced by permute().

 {
     if(info.fixed_point_position() == 0 && (info.data_type() == DataType::QS8 || info.data_type() == DataType::QS16))
     {
         info.set_fixed_point_position(fixed_point_position);
         return true;
     }
 
     return false;
 }

bool set_format_if_unknown	(	ITensorInfo &	info,
		Format	format
	)

inline

Set the format, data type and number of channels to the specified value if the current data type is unknown.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	format	New format.

Returns: True if the format has been changed.

Definition at line 248 of file Helpers.inl.

References ITensorInfo::data_type(), ITensorInfo::set_format(), and UNKNOWN.

Referenced by permute().

 {
     if(info.data_type() == DataType::UNKNOWN)
     {
         info.set_format(format);
         return true;
     }
 
     return false;
 }

bool set_quantization_info_if_empty	(	ITensorInfo &	info,
		QuantizationInfo	quantization_info
	)

inline

Set the quantization info to the specified value if the current quantization info is empty and the data type is an asymmetric quantized type.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	quantization_info	Quantization info

Returns: True if the quantization info has been changed.

Definition at line 292 of file Helpers.inl.

References ITensorInfo::data_type(), QuantizationInfo::empty(), is_data_type_quantized_asymmetric(), ITensorInfo::quantization_info(), and ITensorInfo::set_quantization_info().

Referenced by permute().

 {
     if(info.quantization_info().empty() && (is_data_type_quantized_asymmetric(info.data_type())))
     {
         info.set_quantization_info(quantization_info);
         return true;
     }
 
     return false;
 }

bool set_shape_if_empty	(	ITensorInfo &	info,
		const TensorShape &	shape
	)

inline

Set the shape to the specified value if the current assignment is empty.

Parameters

[in,out]	info	Tensor info used to check and assign.
[in]	shape	New shape.

Returns: True if the shape has been changed.

Definition at line 237 of file Helpers.inl.

References ITensorInfo::set_tensor_shape(), ITensorInfo::tensor_shape(), and TensorShape::total_size().

Referenced by permute().

 {
     if(info.tensor_shape().total_size() == 0)
     {
         info.set_tensor_shape(shape);
         return true;
     }
 
     return false;
 }

bool arm_compute::setup_assembly_kernel	(	const ITensor *	a,
		const ITensor *	b,
		ITensor *	d,
		float	alpha,
		float	beta,
		bool	pretranspose_hint,
		Tensor &	workspace,
		Tensor &	B_pretranspose,
		MemoryGroup &	memory_group,
		T &	asm_glue
	)

inline

Create a wrapper kernel.

Parameters

[in]	a	Input tensor A.
[in]	b	Input tensor B.
[out]	d	Output tensor.
[in]	alpha	Alpha value.
[in]	beta	Beta value.
[in]	pretranspose_hint	Pre-transpose hint in case matrix b should be pre-transposed
[out]	workspace	Workspace tensor
[out]	B_pretranspose	Tensor to hold the pre-transposed B
[in]	memory_group	Tensor memory group.
[out]	asm_glue	Assembly glue kernel.

Returns: true if the wrapper kernel was created and configured, false otherwise.

Definition at line 159 of file AssemblyHelper.h.

References arm_compute::test::validation::a, allocate_workspace(), ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::test::validation::b, Tensor::buffer(), IScheduler::cpu_info(), Scheduler::get(), ITensor::info(), IScheduler::num_threads(), ITensorInfo::tensor_shape(), TensorShape::total_size_upper(), Dimensions< T >::x(), Dimensions< T >::y(), and Dimensions< T >::z().

 {
     const CPUInfo &ci          = NEScheduler::get().cpu_info();
     const int      M           = d->info()->tensor_shape().y();
     const int      N           = d->info()->tensor_shape().x();
     const int      K           = a->info()->tensor_shape().x();
     const int      batches     = d->info()->tensor_shape().total_size_upper(2);
     const int      multis      = b->info()->tensor_shape().z();
     unsigned int   num_threads = NEScheduler::get().num_threads();
 
     // unique_ptr to a Gemm object
     std::unique_ptr<typename T::AssemblyGemm>
     asm_gemm(arm_gemm::gemm<typename T::TypeOperator, typename T::TypeResult>(ci, M, N, K, batches, multis, false, false, alpha, beta, num_threads, pretranspose_hint));
     // arm_compute wrapper for the Gemm object (see above)
     std::unique_ptr<NEGEMMAssemblyWrapper<typename T::AssemblyGemm>>
                                                                   acl_gemm_wrapper = support::cpp14::make_unique<NEGEMMAssemblyWrapper<typename T::AssemblyGemm>>();
     if(acl_gemm_wrapper != nullptr && asm_gemm != nullptr)
     {
         acl_gemm_wrapper->configure(asm_gemm.get());
         const size_t workspace_size = asm_gemm->get_working_size();
         if(workspace_size)
         {
             // Allocate workspace
             const unsigned int alignment = 4096;
             allocate_workspace(workspace_size, workspace, &memory_group, alignment, num_threads);
             ARM_COMPUTE_ERROR_ON_NULLPTR(workspace.buffer());
             asm_gemm->set_working_space(reinterpret_cast<typename T::TypeResult *>(workspace.buffer()));
         }
 
         //if we disable this code below in brackets then ConvLayer deadlocks when threads > 1 and
         //the shapes are In=1x1x1024 Weights=1x1x1024x1001 Biases=1001 Out=1x1x1001
         {
             const unsigned int window_size = asm_gemm->get_window_size();
             if(window_size < num_threads)
             {
                 num_threads = window_size;
                 asm_gemm->set_nthreads(num_threads);
             }
         }
 
         // Check for pre-transposed support
         if(asm_gemm->B_pretranspose_required())
         {
             // Forcing 128-byte alignment (required by 32-bit kernels)
             const unsigned int alignment           = 128;
             const size_t       B_pretranspose_size = asm_gemm->get_B_pretransposed_array_size();
             allocate_workspace(B_pretranspose_size, B_pretranspose, nullptr, alignment, 1);
             ARM_COMPUTE_ERROR_ON_NULLPTR(B_pretranspose.buffer());
             asm_glue._pretranspose = &B_pretranspose;
         }
 
         asm_glue._gemm_kernel_asm  = std::move(asm_gemm);
         asm_glue._optimised_kernel = std::move(acl_gemm_wrapper);
         // We need to setup the ptrs in the run() method
         asm_glue._a = a;
         asm_glue._b = b;
         asm_glue._d = d;
         return true;
     }
     return false;
 }

qint16_t arm_compute::sexp_qs16	(	qint16_t	a,
		int	fixed_point_position
	)

16 bit fixed point scalar exponential

Parameters

[in]	a	16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point exponential.

qint16_t sinvsqrt_qs16	(	qint16_t	a,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar inverse square root

Parameters

[in]	a	16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point inverse square root.

Definition at line 229 of file FixedPoint.inl.

References smul_qs16(), and ssub_qs16().

 {
     const qint16_t shift = 16 - (fixed_point_position + (__builtin_clz(a) - 16));
 
     const qint16_t const_three = (3 << fixed_point_position);
     qint16_t       temp        = shift < 0 ? (a << -shift) : (a >> shift);
     qint16_t       x2          = temp;
 
     // We need three iterations to find the result
     for(int i = 0; i < 3; ++i)
     {
         qint16_t three_minus_dx = ssub_qs16(const_three, smul_qs16(temp, smul_qs16(x2, x2, fixed_point_position), fixed_point_position));
         x2                      = smul_qs16(x2, three_minus_dx, fixed_point_position) >> 1;
     }
 
     temp = shift < 0 ? (x2 << ((-shift) >> 1)) : (x2 >> (shift >> 1));
 
     return temp;
 }

qint8_t sinvsqrt_qs8	(	qint8_t	a,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar inverse square root

Parameters

[in]	a	8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point inverse square root.

Definition at line 209 of file FixedPoint.inl.

References smul_qs8(), and ssub_qs8().

 {
     const qint8_t shift = 8 - (fixed_point_position + (__builtin_clz(a) - 24));
 
     const qint8_t const_three = (3 << fixed_point_position);
     qint8_t       temp        = shift < 0 ? (a << -shift) : (a >> shift);
     qint8_t       x2          = temp;
 
     // We need three iterations to find the result
     for(int i = 0; i < 3; ++i)
     {
         qint8_t three_minus_dx = ssub_qs8(const_three, smul_qs8(temp, smul_qs8(x2, x2, fixed_point_position), fixed_point_position));
         x2                     = (smul_qs8(x2, three_minus_dx, fixed_point_position) >> 1);
     }
 
     temp = shift < 0 ? (x2 << (-shift >> 1)) : (x2 >> (shift >> 1));
 
     return temp;
 }
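The loop in both inverse square root listings is the Newton-Raphson update x <- x * (3 - a * x^2) / 2. A floating point sketch of the same three iterations follows; it assumes the input has already been normalised into [0.5, 2), uses 1 as the starting value rather than the shifted input, and the name `inv_sqrt_newton` is hypothetical.

```cpp
#include <cassert>

// Newton-Raphson iteration for 1/sqrt(a): x <- x * (3 - a * x * x) / 2.
// Sketch only: assumes a in [0.5, 2), for which the start value 1 converges quickly.
inline float inv_sqrt_newton(float a, int iterations = 3)
{
    assert(a >= 0.5f && a < 2.0f);
    float x = 1.0f;
    for(int i = 0; i < iterations; ++i)
    {
        const float three_minus_dx = 3.0f - a * x * x;
        x                          = x * three_minus_dx * 0.5f;
    }
    return x;
}
```

Three iterations bring the estimate for a = 0.5 to within roughly 0.001 of 1/sqrt(0.5), which is about 1.4142.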

qint16_t slog_qs16	(	qint16_t	a,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar logarithm

Parameters

[in]	a	16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point logarithm.

Definition at line 340 of file FixedPoint.inl.

References A, B, sadd_qs16(), sdiv_qs16(), smul_qs16(), sqadd_qs16(), sqmul_qs16(), ssub_qs16(), and sum().

 {
     // Constants
     qint16_t const_one = (1 << fixed_point_position);
     qint16_t ln2       = (0x58B9 >> (7 - fixed_point_position));
     qint16_t A         = (0x5C0F >> (7 - fixed_point_position - 1));
     qint16_t B         = -(0x56AE >> (7 - fixed_point_position));
     qint16_t C         = (0x2933 >> (7 - fixed_point_position));
     qint16_t D         = -(0x0AA7 >> (7 - fixed_point_position));
 
     if((const_one == a) || (a < 0))
     {
         return 0;
     }
     else if(a < const_one)
     {
         return -slog_qs16(sdiv_qs16(const_one, a, fixed_point_position), fixed_point_position);
     }
 
     // Remove even powers of 2
     qint16_t shift_val = 31 - __builtin_clz(a >> fixed_point_position);
     a >>= shift_val;
     a = ssub_qs16(a, const_one);
 
     // Polynomial expansion
     qint16_t sum = sqadd_qs16(sqmul_qs16(a, D, fixed_point_position), C);
     sum          = sqadd_qs16(sqmul_qs16(a, sum, fixed_point_position), B);
     sum          = sqadd_qs16(sqmul_qs16(a, sum, fixed_point_position), A);
     sum          = sqmul_qs16(a, sum, fixed_point_position);
 
     return smul_qs16(sadd_qs16(sum, shift_val << fixed_point_position), ln2, fixed_point_position);
 }

qint8_t slog_qs8	(	qint8_t	a,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar logarithm

Parameters

[in]	a	8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point logarithm.

Definition at line 307 of file FixedPoint.inl.

References A, B, sadd_qs8(), sdiv_qs8(), smul_qs8(), sqadd_qs8(), sqmul_qs8(), ssub_qs8(), and sum().

 {
     // Constants
     qint8_t const_one = (1 << fixed_point_position);
     qint8_t ln2       = (0x58 >> (7 - fixed_point_position));
     qint8_t A         = (0x5C >> (7 - fixed_point_position - 1));
     qint8_t B         = -(0x56 >> (7 - fixed_point_position));
     qint8_t C         = (0x29 >> (7 - fixed_point_position));
     qint8_t D         = -(0x0A >> (7 - fixed_point_position));
 
     if((const_one == a) || (a < 0))
     {
         return 0;
     }
     else if(a < const_one)
     {
         return -slog_qs8(sdiv_qs8(const_one, a, fixed_point_position), fixed_point_position);
     }
 
     // Remove even powers of 2
     qint8_t shift_val = 31 - __builtin_clz(a >> fixed_point_position);
     a >>= shift_val;
     a = ssub_qs8(a, const_one);
 
     // Polynomial expansion
     qint8_t sum = sqadd_qs8(sqmul_qs8(a, D, fixed_point_position), C);
     sum         = sqadd_qs8(sqmul_qs8(a, sum, fixed_point_position), B);
     sum         = sqadd_qs8(sqmul_qs8(a, sum, fixed_point_position), A);
     sum         = sqmul_qs8(a, sum, fixed_point_position);
 
     return smul_qs8(sadd_qs8(sum, shift_val << fixed_point_position), ln2, fixed_point_position);
 }
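The logarithm listings combine range reduction with a cubic polynomial evaluated in Horner form. The floating point sketch below mirrors that structure, reading the 8-bit constants as Q7 fractions (for example 0x58 / 128 is about 0.6875, close to ln 2); that reading is an interpretation of the listing, and `log_sketch` is a hypothetical name.

```cpp
#include <cassert>

// Floating point sketch of the structure used by slog_qs8:
//   ln(a) = ln2 * (k + log2(m)), with a = 2^k * m and m in [1, 2).
inline float log_sketch(float a)
{
    assert(a > 0.0f);
    if(a < 1.0f)
    {
        return -log_sketch(1.0f / a); // same reflection as the listing above
    }

    // Range reduction: strip powers of two until the mantissa lies in [1, 2)
    int k = 0;
    while(a >= 2.0f)
    {
        a *= 0.5f;
        ++k;
    }
    const float x = a - 1.0f;

    // Constants from the 8-bit listing, read as Q7 values (an interpretation).
    // A carries one fewer shift in the listing, hence the division by 64.
    const float ln2 = 0x58 / 128.0f;
    const float A   = 0x5C / 64.0f;
    const float B   = -(0x56 / 128.0f);
    const float C   = 0x29 / 128.0f;
    const float D   = -(0x0A / 128.0f);

    // Polynomial approximation of log2(1 + x), evaluated in Horner form
    float sum = x * D + C;
    sum       = x * sum + B;
    sum       = x * sum + A;
    sum       = x * sum;

    return (sum + static_cast<float>(k)) * ln2;
}
```

Because the constants carry only 7 bits of precision, the sketch is accurate to about two decimal places.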

qint16_t smul_qs16	(	qint16_t	a,
		qint16_t	b,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar multiply

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point multiplication.

Definition at line 149 of file FixedPoint.inl.

References arm_compute::test::validation::a.

Referenced by sinvsqrt_qs16(), and slog_qs16().

 {
     const qint32_t round_up_const = (1 << (fixed_point_position - 1));
 
     qint32_t tmp = static_cast<qint32_t>(a) * static_cast<qint32_t>(b);
 
     // Rounding up
     tmp += round_up_const;
 
     return static_cast<qint16_t>(tmp >> fixed_point_position);
 }

qint8_t smul_qs8	(	qint8_t	a,
		qint8_t	b,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar multiply

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point multiplication.

Definition at line 137 of file FixedPoint.inl.

References arm_compute::test::validation::a.

Referenced by sinvsqrt_qs8(), and slog_qs8().

 {
     const qint16_t round_up_const = (1 << (fixed_point_position - 1));
 
     qint16_t tmp = static_cast<qint16_t>(a) * static_cast<qint16_t>(b);
 
     // Rounding up
     tmp += round_up_const;
 
     return static_cast<qint8_t>(tmp >> fixed_point_position);
 }
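Multiplying two values with n fractional bits yields 2n fractional bits, so the multiply listings add half a unit and shift right by n. A sketch on plain integers, with the hypothetical name `fixed_mul_i8` and an added range check on the position:

```cpp
#include <cassert>
#include <cstdint>

// Fixed point multiply with rounding; like the listing above, it does not saturate.
inline int8_t fixed_mul_i8(int8_t a, int8_t b, int fixed_point_position)
{
    assert(fixed_point_position > 0 && fixed_point_position < 8);
    const int16_t round_up_const = static_cast<int16_t>(1 << (fixed_point_position - 1));

    // The product of two values with n fractional bits has 2n fractional bits
    int16_t tmp = static_cast<int16_t>(a * b);

    // Add half a unit in the last place before dropping n bits
    tmp = static_cast<int16_t>(tmp + round_up_const);

    return static_cast<int8_t>(tmp >> fixed_point_position);
}
```

With 4 fractional bits, 1.5 * 2.0 is computed as (24 * 32 + 8) >> 4 = 48, which represents 3.0.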

qint16_t sqadd_qs16	(	qint16_t	a,
		qint16_t	b
	)

inline

16 bit fixed point scalar saturating add

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input

Returns: The result of the 16 bit fixed point addition. The result is saturated in case of overflow

Definition at line 91 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by slog_qs16(), sqexp_qs16(), and sshr_qs16().

 {
     // We need to store the temporary result in qint32_t otherwise we cannot evaluate the overflow
     qint32_t tmp = (static_cast<qint32_t>(a) + static_cast<qint32_t>(b));
 
     // Saturate the result in case of overflow and cast to qint16_t
     return utility::saturate_cast<qint16_t>(tmp);
 }

qint32_t sqadd_qs32	(	qint32_t	a,
		qint32_t	b
	)

inline

32 bit fixed point scalar saturating add

Parameters

[in]	a	First 32 bit fixed point input
[in]	b	Second 32 bit fixed point input

Returns: The result of the 32 bit fixed point addition. The result is saturated in case of overflow

Definition at line 100 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

 {
     // We need to store the temporary result in qint64_t otherwise we cannot evaluate the overflow
     qint64_t tmp = (static_cast<qint64_t>(a) + static_cast<qint64_t>(b));
 
     // Saturate the result in case of overflow and cast to qint32_t
     return utility::saturate_cast<qint32_t>(tmp);
 }

qint8_t sqadd_qs8	(	qint8_t	a,
		qint8_t	b
	)

inline

8 bit fixed point scalar saturating add

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input

Returns: The result of the 8 bit fixed point addition. The result is saturated in case of overflow

Definition at line 82 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by slog_qs8(), sqexp_qs8(), and sshr_qs8().

 {
     // We need to store the temporary result in qint16_t otherwise we cannot evaluate the overflow
     qint16_t tmp = (static_cast<qint16_t>(a) + static_cast<qint16_t>(b));
 
     // Saturate the result in case of overflow and cast to qint8_t
     return utility::saturate_cast<qint8_t>(tmp);
 }
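The saturating adds widen the operands, add, and clamp. The sketch below shows the 8-bit case with an explicit clamp standing in for `utility::saturate_cast`; `saturating_add_i8` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Saturating 8-bit add: compute in 16 bits, then clamp to the int8_t range.
inline int8_t saturating_add_i8(int8_t a, int8_t b)
{
    const int16_t tmp = static_cast<int16_t>(static_cast<int16_t>(a) + static_cast<int16_t>(b));
    const int16_t lo  = std::numeric_limits<int8_t>::min();
    const int16_t hi  = std::numeric_limits<int8_t>::max();
    return static_cast<int8_t>(std::min(std::max(tmp, lo), hi));
}
```

For example, 100 + 100 clamps to 127 instead of wrapping to -56.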

qint16_t sqcvt_qs16_f32	(	float	a,
		int	fixed_point_position
	)

inline

Convert a float to 16 bit fixed point.

Parameters

[in]	a	Input to convert
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion float -> 16 bit fixed point

Definition at line 389 of file FixedPoint.inl.

References arm_compute::utility::saturate_cast().

 {
     // round_nearest_integer(a * 2^(fixed_point_position))
     return utility::saturate_cast<qint16_t>(a * (1 << fixed_point_position) + ((a >= 0) ? 0.5 : -0.5));
 }

qint8_t sqcvt_qs8_f32	(	float	a,
		int	fixed_point_position
	)

inline

Convert a float to 8 bit fixed point.

Parameters

[in]	a	Input to convert
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion float -> 8 bit fixed point

Definition at line 378 of file FixedPoint.inl.

References arm_compute::utility::saturate_cast().

 {
     // round_nearest_integer(a * 2^(fixed_point_position))
     return utility::saturate_cast<qint8_t>(a * (1 << fixed_point_position) + ((a >= 0) ? 0.5 : -0.5));
 }
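The float-to-fixed conversions scale by 2^fixed_point_position, add or subtract 0.5 according to the sign, and saturate. A sketch of the 8-bit case follows, with an explicit clamp and truncation standing in for `utility::saturate_cast` (whose exact behaviour is not shown on this page); `float_to_fixed_i8` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Convert a float to an 8-bit fixed point value, rounding to nearest and saturating.
inline int8_t float_to_fixed_i8(float a, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);
    const float scaled = a * static_cast<float>(1 << fixed_point_position) + ((a >= 0.0f) ? 0.5f : -0.5f);
    if(scaled >= 127.0f)
    {
        return 127;
    }
    if(scaled <= -128.0f)
    {
        return -128;
    }
    return static_cast<int8_t>(scaled); // truncation toward zero completes the rounding
}
```

With 4 fractional bits, 1.5 maps to 24 and anything at or above 8.0 saturates to 127.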

qint16_t sqexp_qs16	(	qint16_t	a,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar exponential

Parameters

[in]	a	16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point exponential.

Definition at line 284 of file FixedPoint.inl.

References A, arm_compute::test::validation::alpha, B, sabs_qs16(), sqadd_qs16(), sqmul_qs16(), sqshl_qs16(), sqsub_qs16(), and sum().

 {
     // Constants
     const qint16_t const_one = (1 << fixed_point_position);
     const qint16_t ln2       = ((0x58B9 >> (14 - fixed_point_position)) + 1) >> 1;
     const qint16_t inv_ln2   = (((0x38AA >> (14 - fixed_point_position)) + 1) >> 1) | const_one;
     const qint16_t A         = ((0x7FBA >> (14 - fixed_point_position)) + 1) >> 1;
     const qint16_t B         = ((0x3FE9 >> (14 - fixed_point_position)) + 1) >> 1;
     const qint16_t C         = ((0x1693 >> (14 - fixed_point_position)) + 1) >> 1;
     const qint16_t D         = ((0x0592 >> (14 - fixed_point_position)) + 1) >> 1;
 
     // Polynomial expansion
     const int      dec_a = (sqmul_qs16(a, inv_ln2, fixed_point_position) >> fixed_point_position);
     const qint16_t alpha = sabs_qs16(sqsub_qs16(a, sqmul_qs16(ln2, sqshl_qs16(dec_a, fixed_point_position), fixed_point_position)));
     qint16_t       sum   = sqadd_qs16(sqmul_qs16(alpha, D, fixed_point_position), C);
     sum                  = sqadd_qs16(sqmul_qs16(alpha, sum, fixed_point_position), B);
     sum                  = sqadd_qs16(sqmul_qs16(alpha, sum, fixed_point_position), A);
     sum                  = sqmul_qs16(alpha, sum, fixed_point_position);
     sum                  = sqadd_qs16(sum, const_one);
 
     return (dec_a < 0) ? (sum >> -dec_a) : sqshl_qs16(sum, dec_a);
 }

qint8_t sqexp_qs8	(	qint8_t	a,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar exponential

Parameters

[in]	a	8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point exponential.

Definition at line 261 of file FixedPoint.inl.

References A, arm_compute::test::validation::alpha, B, sabs_qs8(), sqadd_qs8(), sqmul_qs8(), sqshl_qs8(), sqsub_qs8(), and sum().

 {
     // Constants
     const qint8_t const_one = (1 << fixed_point_position);
     const qint8_t ln2       = ((0x58 >> (6 - fixed_point_position)) + 1) >> 1;
     const qint8_t inv_ln2   = (((0x38 >> (6 - fixed_point_position)) + 1) >> 1) | const_one;
     const qint8_t A         = ((0x7F >> (6 - fixed_point_position)) + 1) >> 1;
     const qint8_t B         = ((0x3F >> (6 - fixed_point_position)) + 1) >> 1;
     const qint8_t C         = ((0x16 >> (6 - fixed_point_position)) + 1) >> 1;
     const qint8_t D         = ((0x05 >> (6 - fixed_point_position)) + 1) >> 1;
 
     // Polynomial expansion
     const int     dec_a = (sqmul_qs8(a, inv_ln2, fixed_point_position) >> fixed_point_position);
     const qint8_t alpha = sabs_qs8(sqsub_qs8(a, sqmul_qs8(ln2, sqshl_qs8(dec_a, fixed_point_position), fixed_point_position)));
     qint8_t       sum   = sqadd_qs8(sqmul_qs8(alpha, D, fixed_point_position), C);
     sum                 = sqadd_qs8(sqmul_qs8(alpha, sum, fixed_point_position), B);
     sum                 = sqadd_qs8(sqmul_qs8(alpha, sum, fixed_point_position), A);
     sum                 = sqmul_qs8(alpha, sum, fixed_point_position);
     sum                 = sqadd_qs8(sum, const_one);
 
     return (dec_a < 0) ? (sum >> -dec_a) : sqshl_qs8(sum, dec_a);
 }
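The exponential listings split the input as a = dec_a * ln2 + alpha, approximate e^alpha with a polynomial, and scale the result by 2^dec_a. The floating point sketch below mirrors that structure, reading the 8-bit constants as Q7 fractions; that reading is an interpretation of the listing, and `exp_sketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Floating point sketch of the structure used by sqexp_qs8:
//   e^a = 2^dec_a * e^alpha, with dec_a = floor(a / ln2) and alpha = |a - dec_a * ln2|.
inline float exp_sketch(float a)
{
    // Constants from the 8-bit listing, read as Q7 values (an interpretation)
    const float ln2     = 0x58 / 128.0f;
    const float inv_ln2 = 1.0f + 0x38 / 128.0f; // the listing ORs in const_one, i.e. adds 1
    const float A       = 0x7F / 128.0f;
    const float B       = 0x3F / 128.0f;
    const float C       = 0x16 / 128.0f;
    const float D       = 0x05 / 128.0f;

    // Range reduction
    const int   dec_a = static_cast<int>(std::floor(a * inv_ln2));
    const float alpha = std::fabs(a - static_cast<float>(dec_a) * ln2);

    // Polynomial approximation of e^alpha, evaluated in Horner form
    float sum = alpha * D + C;
    sum       = alpha * sum + B;
    sum       = alpha * sum + A;
    sum       = alpha * sum + 1.0f;

    // Scale by 2^dec_a (the listing uses a shift for this step)
    return std::ldexp(sum, dec_a);
}
```

Because the constants carry only 7 bits of precision, the sketch is accurate to roughly two significant digits.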

qint8_t sqmovn_qs16 ( qint16_t a )

inline

Scalar saturating move and narrow.

Parameters

[in] a Input to convert to 8 bit fixed point

Returns: The narrowing conversion to 8 bit

Definition at line 395 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

 {
     // Saturate the result in case of overflow and cast to qint8_t
     return utility::saturate_cast<qint8_t>(a);
 }

qint16_t sqmovn_qs32 ( qint32_t a )

inline

Scalar saturating move and narrow.

Parameters

[in] a Input to convert to 16 bit fixed point

Returns: The narrowing conversion to 16 bit

Definition at line 401 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

 {
     // Saturate the result in case of overflow and cast to qint16_t
     return utility::saturate_cast<qint16_t>(a);
 }

qint16_t sqmul_qs16	(	qint16_t	a,
		qint16_t	b,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar saturating multiply

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point multiplication. The result is saturated in case of overflow

Definition at line 173 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by slog_qs16(), and sqexp_qs16().

 {
     const qint32_t round_up_const = (1 << (fixed_point_position - 1));
 
     qint32_t tmp = static_cast<qint32_t>(a) * static_cast<qint32_t>(b);
 
     // Rounding up
     tmp += round_up_const;
 
     return utility::saturate_cast<qint16_t>(tmp >> fixed_point_position);
 }

qint8_t sqmul_qs8	(	qint8_t	a,
		qint8_t	b,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar saturating multiply

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point multiplication. The result is saturated in case of overflow

Definition at line 161 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by slog_qs8(), and sqexp_qs8().

 {
     const qint16_t round_up_const = (1 << (fixed_point_position - 1));
 
     qint16_t tmp = static_cast<qint16_t>(a) * static_cast<qint16_t>(b);
 
     // Rounding up
     tmp += round_up_const;
 
     return utility::saturate_cast<qint8_t>(tmp >> fixed_point_position);
 }
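
As a worked example of the multiply, round and saturate sequence, the sketch below reimplements the same steps on plain integer types. It assumes `qint8_t` behaves like `int8_t`, uses a 32-bit intermediate for simplicity, and relies on arithmetic right shift for negative values (guaranteed from C++20, the usual behaviour before that). The name `fixed_mul_q8` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical re-statement of an 8 bit saturating fixed point multiply.
inline int8_t fixed_mul_q8(int8_t a, int8_t b, int fixed_point_position)
{
    assert(fixed_point_position >= 1 && fixed_point_position <= 6);

    // Half of one output unit, added before the shift so the result is rounded
    const int32_t round_up_const = 1 << (fixed_point_position - 1);

    // The raw product carries twice the fractional bits of the inputs
    int32_t tmp = static_cast<int32_t>(a) * static_cast<int32_t>(b) + round_up_const;

    // Drop the extra fractional bits
    tmp >>= fixed_point_position;

    // Saturate to the 8 bit range
    return static_cast<int8_t>(std::min<int32_t>(std::max<int32_t>(tmp, -128), 127));
}
```

With 4 fractional bits, 1.5 is stored as 24 and 2.25 as 36; the sketch returns 54, which is 3.375. Multiplying 7.5 (120) by 2.0 (32) would give 15.0, which does not fit, so the result saturates to 127.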

qint32_t sqmull_qs16	(	qint16_t	a,
		qint16_t	b,
		int	fixed_point_position
	)

inline

16 bit fixed point scalar multiply long

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point multiplication long, widened to 32 bit. The listing below applies no saturating cast to the widened value

Definition at line 197 of file FixedPoint.inl.

References arm_compute::test::validation::a.

 {
     const qint32_t round_up_const = (1 << (fixed_point_position - 1));
 
     qint32_t tmp = static_cast<qint32_t>(a) * static_cast<qint32_t>(b);
 
     // Rounding up
     tmp += round_up_const;
 
     return tmp >> fixed_point_position;
 }

qint16_t sqmull_qs8	(	qint8_t	a,
		qint8_t	b,
		int	fixed_point_position
	)

inline

8 bit fixed point scalar multiply long

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point multiplication long, widened to 16 bit. The listing below applies no saturating cast to the widened value

Definition at line 185 of file FixedPoint.inl.

References arm_compute::test::validation::a.

 {
     const qint16_t round_up_const = (1 << (fixed_point_position - 1));
 
     qint16_t tmp = static_cast<qint16_t>(a) * static_cast<qint16_t>(b);
 
     // Rounding up
     tmp += round_up_const;
 
     return tmp >> fixed_point_position;
 }
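
The "long" variants keep the wider result type, so a product that would saturate in the narrowing multiply is still representable. A minimal sketch under the assumption that `qint8_t` and `qint16_t` behave like `int8_t` and `int16_t`; `fixed_mul_long_q8` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical re-statement of an 8 bit multiply long: same rounding, no narrowing.
inline int16_t fixed_mul_long_q8(int8_t a, int8_t b, int fixed_point_position)
{
    assert(fixed_point_position >= 1 && fixed_point_position <= 6);

    const int32_t round_up_const = 1 << (fixed_point_position - 1);
    const int32_t tmp            = static_cast<int32_t>(a) * static_cast<int32_t>(b) + round_up_const;

    // The shifted product of two 8 bit inputs always fits in 16 bit
    return static_cast<int16_t>(tmp >> fixed_point_position);
}
```

With 4 fractional bits, 7.5 (120) times 2.0 (32) yields 240, which is 15.0 in the widened format.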

qint16_t sqshl_qs16	(	qint16_t	a,
		int	shift
	)

inline

16 bit fixed point scalar saturating shift left

Parameters

[in]	a	First 16 bit fixed point input
[in]	shift	Shift amount (positive values only)

Returns: The result of the 16 bit fixed point shift. The result is saturated in case of overflow

Definition at line 40 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by sqexp_qs16().

 {
     qint32_t tmp = static_cast<qint32_t>(a) << shift;
 
     // Saturate the result in case of overflow and cast to qint16_t
     return utility::saturate_cast<qint16_t>(tmp);
 }

qint8_t sqshl_qs8	(	qint8_t	a,
		int	shift
	)

inline

8 bit fixed point scalar saturating shift left

Parameters

[in]	a	First 8 bit fixed point input
[in]	shift	Shift amount (positive values only)

Returns: The result of the 8 bit fixed point shift. The result is saturated in case of overflow

Definition at line 32 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by sqexp_qs8().

 {
     qint16_t tmp = static_cast<qint16_t>(a) << shift;
 
     // Saturate the result in case of overflow and cast to qint8_t
     return utility::saturate_cast<qint8_t>(tmp);
 }
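
The saturating shift can be sketched on plain integers as follows, assuming `qint8_t` behaves like `int8_t`. The sketch multiplies by a power of two in a wider type, which is equivalent to the left shift for in-range values and avoids shifting a negative operand. The name `sat_shl_q8` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical re-statement of an 8 bit saturating shift left.
inline int8_t sat_shl_q8(int8_t a, int shift)
{
    assert(shift >= 0 && shift <= 8);

    // Widen first so the shifted value cannot overflow the intermediate
    const int32_t tmp = static_cast<int32_t>(a) * (int32_t{ 1 } << shift);

    // Clamp to the 8 bit range instead of wrapping around
    return static_cast<int8_t>(std::min<int32_t>(std::max<int32_t>(tmp, -128), 127));
}
```

For instance, shifting 3 left by 2 gives 12, while shifting 100 left by 1 would give 200 and therefore saturates to 127.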

qint16_t sqsub_qs16	(	qint16_t	a,
		qint16_t	b
	)

inline

16 bit fixed point scalar saturating subtraction

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input

Returns: The result of the 16 bit fixed point subtraction. The result is saturated in case of overflow

Definition at line 128 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by sqexp_qs16().

 {
     // We need to store the temporary result in qint32_t otherwise we cannot evaluate the overflow
     qint32_t tmp = static_cast<qint32_t>(a) - static_cast<qint32_t>(b);
 
     // Saturate the result in case of overflow and cast to qint16_t
     return utility::saturate_cast<qint16_t>(tmp);
 }

qint8_t sqsub_qs8	(	qint8_t	a,
		qint8_t	b
	)

inline

8 bit fixed point scalar saturating subtraction

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input

Returns: The result of the 8 bit fixed point subtraction. The result is saturated in case of overflow

Definition at line 119 of file FixedPoint.inl.

References arm_compute::test::validation::a, and arm_compute::utility::saturate_cast().

Referenced by sqexp_qs8().

 {
     // We need to store the temporary result in qint16_t otherwise we cannot evaluate the overflow
     qint16_t tmp = static_cast<qint16_t>(a) - static_cast<qint16_t>(b);
 
     // Saturate the result in case of overflow and cast to qint8_t
     return utility::saturate_cast<qint8_t>(tmp);
 }
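
The reason for the wider temporary is that the difference of two 8 bit values can need 9 bits. A minimal sketch, assuming `qint8_t` behaves like `int8_t`; `sat_sub_q8` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical re-statement of an 8 bit saturating subtraction.
inline int8_t sat_sub_q8(int8_t a, int8_t b)
{
    // The true difference lies in [-255, 255], so compute it in a wider type
    const int32_t tmp = static_cast<int32_t>(a) - static_cast<int32_t>(b);

    // Clamp to the 8 bit range
    return static_cast<int8_t>(std::min<int32_t>(std::max<int32_t>(tmp, -128), 127));
}
```

Subtracting 100 from -100 gives -200 in the wide type and -128 after saturation; a plain 8 bit subtraction would wrap around instead.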

qint16_t sshr_qs16	(	qint16_t	a,
		int	shift
	)

inline

16 bit fixed point scalar shift right

Parameters

[in]	a	First 16 bit fixed point input
[in]	shift	Shift amount (positive values only, must not be zero)

Returns: The result of the 16 bit fixed point shift

Definition at line 55 of file FixedPoint.inl.

References ARM_COMPUTE_ERROR_ON_MSG, and sqadd_qs16().

 {
     ARM_COMPUTE_ERROR_ON_MSG(shift == 0, "Shift should not be zero");
     const qint16_t round_val = 1 << (shift - 1);
     return sqadd_qs16(a, round_val) >> shift;
 }

qint8_t sshr_qs8	(	qint8_t	a,
		int	shift
	)

inline

8 bit fixed point scalar shift right

Parameters

[in]	a	First 8 bit fixed point input
[in]	shift	Shift amount (positive values only, must not be zero)

Returns: The result of the 8 bit fixed point shift

Definition at line 48 of file FixedPoint.inl.

References ARM_COMPUTE_ERROR_ON_MSG, and sqadd_qs8().

 {
     ARM_COMPUTE_ERROR_ON_MSG(shift == 0, "Shift should not be zero");
     const qint8_t round_val = 1 << (shift - 1);
     return sqadd_qs8(a, round_val) >> shift;
 }
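
The right shift adds half of the discarded range before shifting, so the result is rounded rather than truncated. The sketch below mirrors those steps on plain integers, assuming `qint8_t` behaves like `int8_t` and that right shift of a negative value is arithmetic (guaranteed from C++20). The name `rounding_shr_q8` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical re-statement of an 8 bit rounding shift right.
inline int8_t rounding_shr_q8(int8_t a, int shift)
{
    assert(shift >= 1 && shift <= 7);

    // Half of the range that the shift discards
    const int32_t round_val = 1 << (shift - 1);

    // Saturating add of the rounding constant (only the upper bound can be hit)
    const int32_t biased = std::min<int32_t>(static_cast<int32_t>(a) + round_val, 127);

    return static_cast<int8_t>(biased >> shift);
}
```

Shifting 5 right by 1 returns 3 (2.5 rounded up). Because the rounding constant is added with saturation, an input of 127 shifted by 1 returns 63 rather than 64 in this sketch.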

qint16_t ssub_qs16	(	qint16_t	a,
		qint16_t	b
	)

inline

16 bit fixed point scalar subtraction

Parameters

[in]	a	First 16 bit fixed point input
[in]	b	Second 16 bit fixed point input

Returns: The result of the 16 bit fixed point subtraction

Definition at line 114 of file FixedPoint.inl.

References arm_compute::test::validation::b.

Referenced by sinvsqrt_qs16(), and slog_qs16().

 {
     return a - b;
 }

qint8_t ssub_qs8	(	qint8_t	a,
		qint8_t	b
	)

inline

8 bit fixed point scalar subtraction

Parameters

[in]	a	First 8 bit fixed point input
[in]	b	Second 8 bit fixed point input

Returns: The result of the 8 bit fixed point subtraction

Definition at line 109 of file FixedPoint.inl.

References arm_compute::test::validation::b.

Referenced by sinvsqrt_qs8(), and slog_qs8().

 {
     return a - b;
 }

const std::string& arm_compute::string_from_activation_func ( ActivationLayerInfo::ActivationFunction act )

Translates a given activation function to a string.

Parameters

[in] act ActivationLayerInfo::ActivationFunction to be translated to string.

Returns: The string describing the activation function.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_border_mode ( BorderMode border_mode )

Translates a given border mode policy to a string.

Parameters

[in] border_mode BorderMode to be translated to string.

Returns: The string describing the border mode.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_channel ( Channel channel )

Convert a channel identity into a string.

Parameters

[in] channel Channel to be translated to string.

Returns: The string describing the channel.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_data_layout ( DataLayout dl )

Convert a data layout identity into a string.

Parameters

[in] dl DataLayout to be translated to string.

Returns: The string describing the data layout.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_data_type ( DataType dt )

Convert a data type identity into a string.

Parameters

[in] dt DataType to be translated to string.

Returns: The string describing the data type.

Referenced by data_type_for_convolution_matrix(), error_on_data_type_not_in(), and error_on_value_not_representable_in_fixed_point().

const std::string& arm_compute::string_from_format ( Format format )

Convert a tensor format into a string.

Parameters

[in] format Format to be translated to string.

Returns: The string describing the format.

Referenced by data_type_for_convolution_matrix(), and error_on_format_not_in().

const std::string& arm_compute::string_from_interpolation_policy ( InterpolationPolicy policy )

Translates a given interpolation policy to a string.

Parameters

[in] policy InterpolationPolicy to be translated to string.

Returns: The string describing the interpolation policy.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_matrix_pattern ( MatrixPattern pattern )

Convert a matrix pattern into a string.

Parameters

[in] pattern MatrixPattern to be translated to string.

Returns: The string describing the matrix pattern.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_non_linear_filter_function ( NonLinearFilterFunction function )

Translates a given non linear function to a string.

Parameters

[in] function NonLinearFilterFunction to be translated to string.

Returns: The string describing the non linear function.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_norm_type ( NormType type )

Translates a given normalization type to a string.

Parameters

[in] type NormType to be translated to string.

Returns: The string describing the normalization type.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_pooling_type ( PoolingType type )

Translates a given pooling type to a string.

Parameters

[in] type PoolingType to be translated to string.

Returns: The string describing the pooling type.

Referenced by data_type_for_convolution_matrix().

const std::string& arm_compute::string_from_scheduler_type ( Scheduler::Type t )

Convert a Scheduler::Type into a string.

Parameters

[in] t Scheduler::Type to be translated to string.

Returns: The string describing the scheduler type.

const std::string& arm_compute::string_from_target ( GPUTarget target )

Translates a given GPU device target to a string.

Parameters

[in] target Given GPU target.

Returns: The string describing the target.

std::string arm_compute::to_string ( const NonLinearFilterFunction & function )

inline

Formatted output of the NonLinearFilterFunction type.

Parameters

[in] function Type to output.

Returns: Formatted string.

Definition at line 101 of file TypePrinter.h.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), and main().

 {
     std::stringstream str;
     str << function;
     return str.str();
 }
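
Each `to_string` overload relies on a stream insertion operator existing for the type. The sketch below shows the pairing for a made-up enum; `SampleBorderMode` and its enumerators are hypothetical and are not the library's `BorderMode`.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical enum used only for this illustration
enum class SampleBorderMode
{
    UNDEFINED,
    CONSTANT,
    REPLICATE
};

// Stream insertion operator: the single place that knows the textual names
inline std::ostream &operator<<(std::ostream &os, const SampleBorderMode &mode)
{
    switch(mode)
    {
        case SampleBorderMode::UNDEFINED:
            os << "UNDEFINED";
            break;
        case SampleBorderMode::CONSTANT:
            os << "CONSTANT";
            break;
        case SampleBorderMode::REPLICATE:
            os << "REPLICATE";
            break;
    }
    return os;
}

// to_string simply forwards to the stream operator
inline std::string to_string(const SampleBorderMode &mode)
{
    std::stringstream str;
    str << mode;
    return str.str();
}
```

Keeping the formatting logic in `operator<<` means the same text is produced whether a value is streamed directly or converted with `to_string`.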

std::string arm_compute::to_string ( const MatrixPattern & pattern )

inline

Formatted output of the MatrixPattern type.

Parameters

[in] pattern Type to output.

Returns: Formatted string.

Definition at line 144 of file TypePrinter.h.

 {
     std::stringstream str;
     str << pattern;
     return str.str();
 }

std::string arm_compute::to_string ( const QuantizationInfo & quantization_info )

inline

Formatted output of the QuantizationInfo type.

Parameters

[in] quantization_info Type to output.

Returns: Formatted string.

Definition at line 226 of file TypePrinter.h.

 {
     std::stringstream str;
     str << quantization_info;
     return str.str();
 }

std::string arm_compute::to_string ( const FixedPointOp & op )

inline

Formatted output of the FixedPointOp type.

Parameters

[in] op Type to output.

Returns: Formatted string.

Definition at line 278 of file TypePrinter.h.

 {
     std::stringstream str;
     str << op;
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::ActivationLayerInfo & info )

inline

Formatted output of the activation function info type.

Parameters

[in] info Type to output.

Returns: Formatted string.

Definition at line 342 of file TypePrinter.h.

References ActivationLayerInfo::activation(), and ActivationLayerInfo::enabled().

 {
     std::stringstream str;
     if(info.enabled())
     {
         str << info.activation();
     }
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::ActivationLayerInfo::ActivationFunction & function )

inline

Formatted output of the activation function type.

Parameters

[in] function Type to output.

Returns: Formatted string.

Definition at line 358 of file TypePrinter.h.

 {
     std::stringstream str;
     str << function;
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::NormalizationLayerInfo & info )

inline

Formatted output of NormalizationLayerInfo.

Parameters

[in] info Type to output.

Returns: Formatted string.

Definition at line 398 of file TypePrinter.h.

References NormalizationLayerInfo::norm_size(), and NormalizationLayerInfo::type().

 {
     std::stringstream str;
     str << info.type() << ":NormSize=" << info.norm_size();
     return str.str();
 }

std::string arm_compute::to_string ( const RoundingPolicy & rounding_policy )

inline

Formatted output of RoundingPolicy.

Parameters

[in] rounding_policy Type to output.

Returns: Formatted string.

Definition at line 465 of file TypePrinter.h.

References arm_compute::test::validation::rounding_policy.

 {
     std::stringstream str;
     str << rounding_policy;
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::DataLayout & data_layout )

inline

Formatted output of the DataLayout type.

Parameters

[in] data_layout Type to output.

Returns: Formatted string.

Definition at line 505 of file TypePrinter.h.

 {
     std::stringstream str;
     str << data_layout;
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::DataType & data_type )

inline

Formatted output of the DataType type.

Parameters

[in] data_type Type to output.

Returns: Formatted string.

Definition at line 584 of file TypePrinter.h.

References arm_compute::test::validation::data_type.

 {
     std::stringstream str;
     str << data_type;
     return str.str();
 }

std::string arm_compute::to_string ( const Format & format )

inline

Formatted output of the Format type.

Parameters

[in] format Type to output.

Returns: Formatted string.

Definition at line 666 of file TypePrinter.h.

 {
     std::stringstream str;
     str << format;
     return str.str();
 }

std::string arm_compute::to_string ( const Channel & channel )

inline

Formatted output of the Channel type.

Parameters

[in] channel Type to output.

Returns: Formatted string.

Definition at line 733 of file TypePrinter.h.

 {
     std::stringstream str;
     str << channel;
     return str.str();
 }

std::string arm_compute::to_string ( const TensorInfo & info )

inline

Formatted output of the TensorInfo type.

Parameters

[in] info Type to output.

Returns: Formatted string.

Definition at line 841 of file TypePrinter.h.

References TensorInfo::data_type(), TensorInfo::fixed_point_position(), TensorInfo::num_channels(), and TensorInfo::tensor_shape().

 {
     std::stringstream str;
     str << "{Shape=" << info.tensor_shape() << ","
         << "Type=" << info.data_type() << ","
         << "Channels=" << info.num_channels() << ","
         << "FixedPointPos=" << info.fixed_point_position() << "}";
     return str.str();
 }

std::string arm_compute::to_string ( const Dimensions< T > & dimensions )

inline

Formatted output of the Dimensions type.

Parameters

[in] dimensions Type to output.

Returns: Formatted string.

Definition at line 858 of file TypePrinter.h.

 {
     std::stringstream str;
     str << dimensions;
     return str.str();
 }

std::string arm_compute::to_string ( const Strides & stride )

inline

Formatted output of the Strides type.

Parameters

[in] stride Type to output.

Returns: Formatted string.

Definition at line 871 of file TypePrinter.h.

 {
     std::stringstream str;
     str << stride;
     return str.str();
 }

std::string arm_compute::to_string ( const TensorShape & shape )

inline

Formatted output of the TensorShape type.

Parameters

[in] shape Type to output.

Returns: Formatted string.

Definition at line 884 of file TypePrinter.h.

References arm_compute::test::validation::shape.

 {
     std::stringstream str;
     str << shape;
     return str.str();
 }

std::string arm_compute::to_string ( const Coordinates & coord )

inline

Formatted output of the Coordinates type.

Parameters

[in] coord Type to output.

Returns: Formatted string.

Definition at line 897 of file TypePrinter.h.

 {
     std::stringstream str;
     str << coord;
     return str.str();
 }

std::string arm_compute::to_string ( const PadStrideInfo & pad_stride_info )

inline

Formatted output of the PadStrideInfo type.

Parameters

[in] pad_stride_info Type to output.

Returns: Formatted string.

Definition at line 942 of file TypePrinter.h.

 {
     std::stringstream str;
     str << pad_stride_info;
     return str.str();
 }

std::string arm_compute::to_string ( const BorderMode & mode )

inline

Formatted output of the BorderMode type.

Parameters

[in] mode Type to output.

Returns: Formatted string.

Definition at line 955 of file TypePrinter.h.

 {
     std::stringstream str;
     str << mode;
     return str.str();
 }

std::string arm_compute::to_string ( const BorderSize & border )

inline

Formatted output of the BorderSize type.

Parameters

[in] border Type to output.

Returns: Formatted string.

Definition at line 968 of file TypePrinter.h.

 {
     std::stringstream str;
     str << border;
     return str.str();
 }

std::string arm_compute::to_string ( const InterpolationPolicy & policy )

inline

Formatted output of the InterpolationPolicy type.

Parameters

[in] policy Type to output.

Returns: Formatted string.

Definition at line 981 of file TypePrinter.h.

 {
     std::stringstream str;
     str << policy;
     return str.str();
 }

std::string arm_compute::to_string ( const SamplingPolicy & policy )

inline

Formatted output of the SamplingPolicy type.

Parameters

[in] policy Type to output.

Returns: Formatted string.

Definition at line 994 of file TypePrinter.h.

 {
     std::stringstream str;
     str << policy;
     return str.str();
 }

std::string arm_compute::to_string ( const ConvertPolicy & policy )

inline

Definition at line 1025 of file TypePrinter.h.

 {
     std::stringstream str;
     str << policy;
     return str.str();
 }

std::string arm_compute::to_string ( const ReductionOperation & op )

inline

Formatted output of the Reduction Operations.

Parameters

[in] op Type to output.

Returns: Formatted string.

Definition at line 1059 of file TypePrinter.h.

 {
     std::stringstream str;
     str << op;
     return str.str();
 }

std::string arm_compute::to_string ( const NormType & type )

inline

Formatted output of the Norm Type.

Parameters

[in] type Type to output.

Returns: Formatted string.

Definition at line 1072 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const PoolingType & type )

inline

Formatted output of the Pooling Type.

Parameters

[in] type Type to output.

Returns: Formatted string.

Definition at line 1085 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const PoolingLayerInfo & info )

inline

Formatted output of the Pooling Layer Info.

Parameters

[in] info Type to output.

Returns: Formatted string.

Definition at line 1098 of file TypePrinter.h.

References Size2D::height, PoolingLayerInfo::is_global_pooling(), PoolingLayerInfo::pad_stride_info(), PoolingLayerInfo::pool_size(), PoolingLayerInfo::pool_type(), and Size2D::width.

 {
     std::stringstream str;
     str << "{Type=" << info.pool_type() << ","
         << "IsGlobalPooling=" << info.is_global_pooling();
     if(!info.is_global_pooling())
     {
         str << ","
             << "PoolSize=" << info.pool_size().width << "," << info.pool_size().height << ","
             << "PadStride=" << info.pad_stride_info();
     }
     str << "}";
     return str.str();
 }
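
The listing above only prints the pool size and pad/stride fields when pooling is not global. A self-contained sketch of that conditional formatting, using a simplified hypothetical struct `SamplePoolInfo` in place of `PoolingLayerInfo` and omitting the pad/stride field:

```cpp
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for PoolingLayerInfo (not the library type)
struct SamplePoolInfo
{
    std::string  type;
    bool         is_global;
    unsigned int width;
    unsigned int height;
};

inline std::string format_pool_info(const SamplePoolInfo &info)
{
    std::stringstream str;
    str << "{Type=" << info.type << ","
        << "IsGlobalPooling=" << info.is_global;
    // The pool size is only printed when pooling is not global
    if(!info.is_global)
    {
        str << ","
            << "PoolSize=" << info.width << "," << info.height;
    }
    str << "}";
    return str.str();
}
```

Note that a `bool` is streamed as `1` or `0` by default, so a global max pooling entry in this sketch reads `{Type=MAX,IsGlobalPooling=1}`.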

std::string arm_compute::to_string ( const arm_compute::PhaseType & type )

inline

Formatted output of the PhaseType type.

Parameters

[in] type Type to output.

Returns: Formatted string.

Definition at line 1163 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::MagnitudeType & type )

inline

Formatted output of the MagnitudeType type.

Parameters

[in] type Type to output.

Returns: Formatted string.

Definition at line 1200 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const arm_compute::GradientDimension & type )

inline

Formatted output of the GradientDimension type.

Parameters

[in] type Type to output

Returns: Formatted string.

Definition at line 1240 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const HOGNormType & type )

inline

Formatted output of the HOGNormType type.

Parameters

[in] type Type to output

Returns: Formatted string.

Definition at line 1280 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const Size2D & type )

inline

Formatted output of the Size2D type.

Parameters

[in] type Type to output

Returns: Formatted string.

Definition at line 1307 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const HOGInfo & type )

inline

Formatted output of the HOGInfo type.

Parameters

[in] type Type to output

Returns: Formatted string.

Definition at line 1341 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

std::string arm_compute::to_string ( const ConvolutionMethod & conv_method )

inline

Formatted output of the ConvolutionMethod type.

Parameters

[in] conv_method Type to output

Returns: Formatted string.

Definition at line 1381 of file TypePrinter.h.

 {
     std::stringstream str;
     str << conv_method;
     return str.str();
 }

std::string arm_compute::to_string ( const GPUTarget & gpu_target )

inline

Formatted output of the GPUTarget type.

Parameters

[in] gpu_target Type to output

Returns: Formatted string.

Definition at line 1454 of file TypePrinter.h.

 {
     std::stringstream str;
     str << gpu_target;
     return str.str();
 }

std::string arm_compute::to_string ( const DetectionWindow & detection_window )

inline

Formatted output of the DetectionWindow type.

Parameters

[in] detection_window Type to output

Returns: Formatted string.

Definition at line 1486 of file TypePrinter.h.

 {
     std::stringstream str;
     str << detection_window;
     return str.str();
 }

std::string arm_compute::to_string ( const Termination & termination )

inline

Formatted output of the Termination type.

Parameters

[in] termination Type to output

Returns: Formatted string.

Definition at line 1526 of file TypePrinter.h.

 {
     std::stringstream str;
     str << termination;
     return str.str();
 }

std::string arm_compute::to_string ( const WinogradInfo & type )

inline

Definition at line 1544 of file TypePrinter.h.

 {
     std::stringstream str;
     str << type;
     return str.str();
 }

bool arm_compute::update_window_and_padding	(	Window &	win,
		Ts &&...	patterns
	)

Update window and padding size for each of the access patterns.

First the window size is reduced based on all access patterns that are not allowed to modify the padding of the underlying tensor. Then the padding of the remaining tensors is increased to match the window.

Parameters

[in]	win	Window that is used by the kernel.
[in]	patterns	Access patterns used to calculate the final window and padding.

Returns: True if the window has been changed. Changes to the padding do not influence the returned value.

Definition at line 368 of file Helpers.h.

References calculate_max_window(), arm_compute::utility::for_each(), IAccessWindow::update_padding_if_needed(), and IAccessWindow::update_window_if_needed().

 {
     bool window_changed = false;
 
     utility::for_each([&](const IAccessWindow & w)
     {
         window_changed |= w.update_window_if_needed(win);
     },
     patterns...);
 
     bool padding_changed = false;
 
     utility::for_each([&](IAccessWindow & w)
     {
         padding_changed |= w.update_padding_if_needed(win);
     },
     patterns...);
 
     return window_changed;
 }
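
The two passes can be illustrated with a small self-contained model. Everything in the sketch below is hypothetical: `SampleWindow`, `SampleAccess` and `for_each_arg` are simplified stand-ins for `Window`, `IAccessWindow` and `utility::for_each`, and the one-dimensional shrink and pad rules are invented for the example.

```cpp
#include <utility>

// Hypothetical helper: call func on every argument in order
template <typename F>
inline void for_each_arg(F &&)
{
}

template <typename F, typename T, typename... Ts>
inline void for_each_arg(F &&func, T &&arg, Ts &&... args)
{
    func(std::forward<T>(arg));
    for_each_arg(std::forward<F>(func), std::forward<Ts>(args)...);
}

// Hypothetical one-dimensional window
struct SampleWindow
{
    int end;
};

// Hypothetical access pattern over a tensor that is valid up to max_end
struct SampleAccess
{
    int  max_end;
    bool can_pad;
    int  padding;

    // Patterns that cannot pad shrink the window instead
    bool update_window_if_needed(SampleWindow &win) const
    {
        if(!can_pad && win.end > max_end)
        {
            win.end = max_end;
            return true;
        }
        return false;
    }

    // Patterns that can pad grow their padding to cover the final window
    bool update_padding_if_needed(const SampleWindow &win)
    {
        if(can_pad && win.end > max_end)
        {
            padding = win.end - max_end;
            return true;
        }
        return false;
    }
};

template <typename... Ts>
bool update_window_and_padding_sketch(SampleWindow &win, Ts &&... patterns)
{
    bool window_changed = false;

    // Pass 1: shrink the window for every pattern that cannot pad
    for_each_arg([&](const SampleAccess &w) { window_changed |= w.update_window_if_needed(win); }, patterns...);

    // Pass 2: with the window final, extend padding where it is allowed
    for_each_arg([&](SampleAccess &w) { w.update_padding_if_needed(win); }, patterns...);

    return window_changed;
}
```

Running both passes separately matters: padding is computed against the final window, after every non-paddable pattern has had a chance to shrink it.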

qint16x4_t arm_compute::vabs_qs16 ( qint16x4_t a )

Absolute value of 16 bit fixed point vector (4 elements)

Parameters

[in] a 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector absolute value

qint8x8_t arm_compute::vabs_qs8 ( qint8x8_t a )

Absolute value of 8 bit fixed point vector (8 elements)

Parameters

[in] a 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector absolute value

qint16x8_t arm_compute::vabsq_qs16 ( qint16x8_t a )

Absolute value of 16 bit fixed point vector (8 elements)

Parameters

[in] a 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector absolute value

qint8x16_t arm_compute::vabsq_qs8 ( qint8x16_t a )

Absolute value of 8 bit fixed point vector (16 elements)

Parameters

[in] a 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector absolute value

qint16x4_t arm_compute::vadd_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector add (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector addition

qint8x8_t arm_compute::vadd_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector add (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector addition

qint16x8_t arm_compute::vaddq_qs16	(	qint16x8_t	a,
		qint16x8_t	b
	)

16 bit fixed point vector add (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector addition

qint8x16_t arm_compute::vaddq_qs8	(	qint8x16_t	a,
		qint8x16_t	b
	)

8 bit fixed point vector add (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector addition

float32x4_t arm_compute::vcvt_f32_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Convert a 16 bit fixed point vector with 4 elements to a float vector with 4 elements.

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion 16 bit fixed point -> float32x4

float32x4x2_t arm_compute::vcvt_f32_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Convert an 8 bit fixed point vector with 8 elements to a float vector with 4x2 elements.

Parameters

[in]	a	8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion 8 bit fixed point -> float32x4x2

float32x4x2_t arm_compute::vcvtq_qs16_f32	(	qint16x8_t	a,
		int	fixed_point_position
	)

Convert a 16 bit fixed point vector with 8 elements to a float vector with 4x2 elements.

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion 16 bit fixed point -> float32x4x2

float32x4x4_t arm_compute::vcvtq_qs8_f32	(	qint8x16_t	a,
		int	fixed_point_position
	)

Convert an 8 bit fixed point vector with 16 elements to a float vector with 4x4 elements.

Parameters

[in]	a	8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion 8 bit fixed point -> float32x4x4
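
Per element, converting between fixed point and float amounts to scaling by 2 to the power of `fixed_point_position`. The scalar sketch below illustrates this for 8 bit values, assuming `qint8_t` behaves like `int8_t`. The helper names are hypothetical, and rounding to nearest with saturation in the float to fixed direction is an assumption rather than a statement about the library's vector implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical scalar conversion: fixed point -> float
inline float fixed_to_float_q8(int8_t a, int fixed_point_position)
{
    return static_cast<float>(a) / static_cast<float>(1 << fixed_point_position);
}

// Hypothetical scalar conversion: float -> fixed point (round to nearest, then saturate)
inline int8_t float_to_fixed_q8(float v, int fixed_point_position)
{
    const long scaled = std::lround(v * static_cast<float>(1 << fixed_point_position));
    return static_cast<int8_t>(std::min<long>(std::max<long>(scaled, -128), 127));
}
```

With 4 fractional bits, the stored value 24 converts to 1.5f and back to 24, while 100.0f lies outside the representable range and saturates to 127.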

qint16x4_t arm_compute::vdiv_qs16	(	qint16x4_t	a,
		qint16x4_t	b,
		int	fixed_point_position
	)

Division fixed point 16 bit (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The quotient and remainder number in fixed point format.

qint8x8_t arm_compute::vdiv_qs8	(	qint8x8_t	a,
		int8x8_t	b,
		int	fixed_point_position
	)

Division fixed point 8 bit (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The quotient and remainder number in fixed point format.

qint16x8_t arm_compute::vdivq_qs16	(	qint16x8_t	a,
		qint16x8_t	b,
		int	fixed_point_position
	)

Division fixed point 16 bit (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The quotient and remainder number in 16 bit fixed point format.

qint8x16_t arm_compute::vdivq_qs8	(	qint8x16_t	a,
		qint8x16_t	b,
		int	fixed_point_position
	)

Division fixed point 8 bit (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The quotient and remainder number in 8 bit fixed point format.

qint16x4_t arm_compute::vdup_n_qs16 ( qint16_t a )

16 bit fixed point vector duplicate (4 elements)

Parameters

[in] a 16 bit fixed point to duplicate

Returns: The result of the vector duplication

qint8x8_t arm_compute::vdup_n_qs8 ( qint8_t a )

8 bit fixed point vector duplicate (8 elements)

Parameters

[in] a 8 bit fixed point to duplicate

Returns: The result of the vector duplication

qint16x8_t arm_compute::vdupq_n_qs16 ( qint16x8_t a )

16 bit fixed point vector duplicate (8 elements)

Parameters

[in] a 16 bit fixed point to duplicate

Returns: The result of the vector duplication

qint16x8_t arm_compute::vdupq_n_qs16_f32	(	float	a,
		int	fixed_point_position
	)

Duplicate a float and convert it to 16 bit fixed point vector (8 elements)

Parameters

[in]	a	floating point value to convert and duplicate
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the vector duplication

qint8x16_t arm_compute::vdupq_n_qs8 ( qint8_t a )

8 bit fixed point vector duplicate (16 elements)

Parameters

[in] a 8 bit fixed point to duplicate

Returns: The result of the vector duplication

qint8x16_t arm_compute::vdupq_n_qs8_f32	(	float	a,
		int	fixed_point_position
	)

Duplicate a float and convert it to 8 bit fixed point vector (16 elements)

Parameters

[in]	a	floating point value to convert and duplicate
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the vector duplication
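
The float to fixed point conversion that precedes the duplication can be pictured with a scalar model such as the one below. This is only a sketch: the helper name float_to_qs8_scalar is hypothetical, and the round-to-nearest and saturation behaviour are assumptions rather than a statement of what the library does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar model: convert a float to an 8 bit fixed point raw value
// with fixed_point_position fractional bits.
inline int8_t float_to_qs8_scalar(float a, int fixed_point_position)
{
    assert(fixed_point_position >= 0 && fixed_point_position < 8);

    // Scale by 2^fixed_point_position and round to the nearest integer.
    const float scaled = std::round(a * static_cast<float>(1 << fixed_point_position));

    // Saturate to the int8_t range (an assumption of this sketch).
    if (scaled > 127.0f) return 127;
    if (scaled < -128.0f) return -128;
    return static_cast<int8_t>(scaled);
}
```

For example, 0.5f with fixed_point_position = 5 maps to the raw value 16, while 10.0f does not fit and saturates to 127. A duplicating variant would then copy that raw value into every lane.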

float32x4_t arm_compute::vexpq_f32 ( float32x4_t x )

Calculate exponential.

Parameters

[in] x Input vector value in F32 format.

Returns: The calculated exponential.

float32x4_t arm_compute::vfloorq_f32 ( float32x4_t val )

Calculate floor of a vector.

Parameters

[in] val Input vector value in F32 format.

Returns: The calculated floor vector.

qint16x4_t arm_compute::vget_high_qs16 ( qint16x8_t a )

Get the higher half of an 8-element vector.

Parameters

[in] a vector of 8 elements

Returns: 16 bit fixed point vector (4 elements)

qint8x8_t arm_compute::vget_high_qs8 ( qint8x16_t a )

Get the higher half of a 16 elements vector.

Parameters

[in] a vector of 16 elements

Returns: 8 bit fixed point vector (8 elements)

qint16x4_t arm_compute::vget_low_qs16 ( qint16x8_t a )

Get the lower half of an 8-element vector.

Parameters

[in] a vector of 8 elements

Returns: 16 bit fixed point vector (4 elements)

qint8x8_t arm_compute::vget_low_qs8 ( qint8x16_t a )

Get the lower half of a 16 elements vector.

Parameters

[in] a vector of 16 elements

Returns: 8 bit fixed point vector (8 elements)

float32x2_t arm_compute::vinv_f32 ( float32x2_t x )

Calculate reciprocal.

Parameters

[in] x Input value.

Returns: The calculated reciprocal.

float32x4_t arm_compute::vinvq_f32 ( float32x4_t x )

Calculate reciprocal.

Parameters

[in] x Input value.

Returns: The calculated reciprocal.

float32x2_t arm_compute::vinvsqrt_f32 ( float32x2_t x )

Calculate inverse square root.

Parameters

[in] x Input value.

Returns: The calculated inverse square root.

qint16x4_t arm_compute::vinvsqrt_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Calculate inverse square root for fixed point 16 bit using Newton-Raphson method (4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit inverse sqrt.
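
The Newton-Raphson iteration commonly used for the inverse square root is x_{n+1} = x_n * (3 - a * x_n^2) / 2. The sketch below runs it in double precision to show the recurrence only. The function name invsqrt_newton, the caller-supplied initial guess and the iteration count are assumptions; the fixed point kernels documented here are not reproduced.

```cpp
#include <cassert>

// Hypothetical illustration of the Newton-Raphson recurrence for 1 / sqrt(a).
// The iteration converges when 0 < x0 and a * x0 * x0 < 3.
inline double invsqrt_newton(double a, double x0, int iterations)
{
    assert(a > 0.0);
    assert(x0 > 0.0 && a * x0 * x0 < 3.0);

    double x = x0;
    for (int i = 0; i < iterations; ++i)
    {
        // One Newton step on f(x) = 1 / x^2 - a.
        x = x * (3.0 - a * x * x) * 0.5;
    }
    return x;
}
```

Starting from x0 = 0.4 with a = 4, a handful of iterations brings the estimate very close to 0.5. Each step needs only multiplications and a subtraction, which is why the recurrence suits fixed point arithmetic.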

qint8x8_t arm_compute::vinvsqrt_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Calculate inverse square root for fixed point 8bit using Newton-Raphson method (8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit inverse sqrt.

float32x4_t arm_compute::vinvsqrtq_f32 ( float32x4_t x )

Calculate inverse square root.

Parameters

[in] x Input value.

Returns: The calculated inverse square root.

qint16x8_t arm_compute::vinvsqrtq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Calculate inverse square root for fixed point 16 bit using Newton-Raphson method (8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit inverse sqrt.

qint8x16_t arm_compute::vinvsqrtq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Calculate inverse square root for fixed point 8bit using Newton-Raphson method (16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit inverse sqrt.

qint16x4_t arm_compute::vld1_dup_qs16 ( const qint16_t * addr )

Load all lanes of 16 bit fixed point vector with same value from memory (4 elements)

Parameters

[in] addr Memory address of the 16 bit fixed point scalar value to load

Returns: 16 bit fixed point vector (4 elements)

qint8x8_t arm_compute::vld1_dup_qs8 ( const qint8_t * addr )

Load all lanes of 8 bit fixed point vector with same value from memory (8 elements)

Parameters

[in] addr Memory address of the 8 bit fixed point scalar value to load

Returns: 8 bit fixed point vector (8 elements)

Referenced by arm_compute::detail::load_matrix_row().

qint16x4_t arm_compute::vld1_qs16 ( const qint16_t * addr )

Load a single 16 bit fixed point vector from memory (4 elements)

Parameters

[in] addr Memory address of the 16 bit fixed point vector to load

Returns: 16 bit fixed point vector (4 elements)

qint8x8_t arm_compute::vld1_qs8 ( const qint8_t * addr )

Load a single 8 bit fixed point vector from memory (8 elements)

Parameters

[in] addr Memory address of the 8 bit fixed point vector to load

Returns: 8 bit fixed point vector (8 elements)

Referenced by arm_compute::detail::convolve_3x3< 1 >().

qint16x8_t arm_compute::vld1q_dup_qs16 ( const qint16_t * addr )

Load all lanes of 16 bit fixed point vector with same value from memory (8 elements)

Parameters

[in] addr Memory address of the 16 bit fixed point scalar value to load

Returns: 16 bit fixed point vector (8 elements)

qint8x16_t arm_compute::vld1q_dup_qs8 ( const qint8_t * addr )

Load all lanes of 8 bit fixed point vector with same value from memory (16 elements)

Parameters

[in] addr Memory address of the 8 bit fixed point scalar value to load

Returns: 8 bit fixed point vector (16 elements)

qint16x8_t arm_compute::vld1q_qs16 ( const qint16_t * addr )

Load a single 16 bit fixed point vector from memory (8 elements)

Parameters

[in] addr Memory address of the 16 bit fixed point vector to load

Returns: 16 bit fixed point vector (8 elements)

qint8x16_t arm_compute::vld1q_qs8 ( const qint8_t * addr )

Load a single 8 bit fixed point vector from memory (16 elements)

Parameters

[in] addr Memory address of the 8 bit fixed point vector to load

Returns: 8 bit fixed point vector (16 elements)

qint16x8x2_t arm_compute::vld2q_qs16 ( qint16_t * addr )

Load two 16 bit fixed point vectors from memory (8x2 elements)

Parameters

[in] addr Memory address of the 16 bit fixed point vectors to load

Returns: 16 bit fixed point vectors (8x2 elements)

qint16x4_t arm_compute::vlog_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Calculate logarithm fixed point 16 bit (4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit logarithm.

qint8x8_t arm_compute::vlog_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Calculate logarithm fixed point 8 bit (8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit logarithm.

float32x4_t arm_compute::vlogq_f32 ( float32x4_t x )

Calculate logarithm.

Parameters

[in] x Input vector value in F32 format.

Returns: The calculated logarithm.

qint16x8_t arm_compute::vlogq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Calculate logarithm fixed point 16 bit (8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit logarithm.

qint8x16_t arm_compute::vlogq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Calculate logarithm fixed point 8 bit (16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit logarithm.

float32x4x2_t arm_compute::vmax2q_f32	(	float32x4x2_t	a,
		float32x4x2_t	b
	)

Compute lane-by-lane maximum between elements of a float vector with 4x2 elements.

Parameters

[in]	a	Float input vector
[in]	b	Float input vector

Returns: The lane-by-lane maximum -> float32x4x2

qint16x4_t arm_compute::vmax_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector max (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector max operation

qint8x8_t arm_compute::vmax_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector max (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector max operation

qint16x8_t arm_compute::vmaxq_qs16	(	qint16x8_t	a,
		qint16x8_t	b
	)

16 bit fixed point vector max (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector max operation

qint8x16_t arm_compute::vmaxq_qs8	(	qint8x16_t	a,
		qint8x16_t	b
	)

8 bit fixed point vector max (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector max operation

qint16x4_t arm_compute::vmin_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector min (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector min operation

qint8x8_t arm_compute::vmin_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector min (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector min operation

qint16x8_t arm_compute::vminq_qs16	(	qint16x8_t	a,
		qint16x8_t	b
	)

16 bit fixed point vector min (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector min operation

qint8x16_t arm_compute::vminq_qs8	(	qint8x16_t	a,
		qint8x16_t	b
	)

8 bit fixed point vector min (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector min operation

qint16x4_t arm_compute::vmla_qs16	(	qint16x4_t	a,
		qint16x4_t	b,
		qint16x4_t	c,
		int	fixed_point_position
	)

16 bit fixed point vector multiply-accumulate (4 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 16 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 16 bit fixed point input vector
[in]	c	Third 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiply-accumulate
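
A scalar picture of a + b * c in fixed point may help here: the raw product of two values with f fractional bits has 2f fractional bits, so it has to be shifted back by f before it is added to a. The sketch below models one 16 bit lane. The name mla_qs16_scalar and the round-half-up step before the shift are assumptions, and the non-saturating variant is modelled by requiring that the result fits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of a 16 bit fixed point multiply-accumulate.
inline int16_t mla_qs16_scalar(int16_t a, int16_t b, int16_t c, int fixed_point_position)
{
    assert(fixed_point_position > 0 && fixed_point_position < 16);

    // The raw product carries 2 * fixed_point_position fractional bits.
    const int32_t product = static_cast<int32_t>(b) * static_cast<int32_t>(c);

    // Add half an output LSB, then shift back to fixed_point_position fractional bits.
    // An arithmetic right shift is assumed for negative values (guaranteed since C++20).
    const int32_t rounding = int32_t{1} << (fixed_point_position - 1);
    const int32_t rescaled = (product + rounding) >> fixed_point_position;

    // No saturation in this variant: the caller must keep the sum in range.
    const int32_t sum = a + rescaled;
    assert(sum >= INT16_MIN && sum <= INT16_MAX);
    return static_cast<int16_t>(sum);
}
```

With fixed_point_position = 8, the raw inputs 256, 384 and 512 represent 1.0, 1.5 and 2.0, and the sketch returns 1024, which represents 4.0.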

qint8x8_t arm_compute::vmla_qs8	(	qint8x8_t	a,
		qint8x8_t	b,
		qint8x8_t	c,
		int	fixed_point_position
	)

8 bit fixed point vector multiply-accumulate (8 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 8 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 8 bit fixed point input vector
[in]	c	Third 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiply-accumulate

qint32x4_t arm_compute::vmlal_qs16	(	qint32x4_t	a,
		qint16x4_t	b,
		qint16x4_t	c,
		int	fixed_point_position
	)

16 bit fixed point vector multiply-accumulate long (4 elements).

This operation computes the product of b and c and adds the result to the 32 bit fixed point vector a (a + b * c).

Parameters

[in]	a	First 32 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 16 bit fixed point input vector
[in]	c	Third 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiply-accumulate long

qint16x8_t arm_compute::vmlal_qs8	(	qint16x8_t	a,
		qint8x8_t	b,
		qint8x8_t	c,
		int	fixed_point_position
	)

8 bit fixed point vector multiply-accumulate long (8 elements).

This operation computes the product of b and c and adds the result to the 16 bit fixed point vector a (a + b * c).

Parameters

[in]	a	First 16 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 8 bit fixed point input vector
[in]	c	Third 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiply-accumulate long

qasymm8x16_t vmlaq_qasymm8	(	qasymm8x16_t	vd,
		float32x4_t	vs,
		float32x4_t	vo
	)

inline

Perform a multiply-accumulate on all 16 components of a QASYMM8 vector.

vd*vs + vo

Parameters

[in]	vd	Input vector value in QASYMM8 format
[in]	vs	Vector multiplier in F32 format. The multiplier value must be duplicated across all four lanes.
[in]	vo	Vector addend in F32 format. The addend value must be duplicated across all four lanes.

Returns: A 16-component vector in QASYMM8 format, saturated to fit

Definition at line 34 of file NEAsymm.inl.

 {
     // Convert uint8 vectors to uint16 vectors
     const uint8x8_t vd_low        = vget_low_u8(vd);
     const uint8x8_t vd_high       = vget_high_u8(vd);
     uint16x8_t      vd_low_u16x8  = vmovl_u8(vd_low);
     uint16x8_t      vd_high_u16x8 = vmovl_u8(vd_high);
     // Convert uint16 vectors to uint32 vectors
     uint32x4_t A_u32x4 = vmovl_u16(vget_low_u16(vd_low_u16x8));
     uint32x4_t B_u32x4 = vmovl_u16(vget_high_u16(vd_low_u16x8));
     uint32x4_t C_u32x4 = vmovl_u16(vget_low_u16(vd_high_u16x8));
     uint32x4_t D_u32x4 = vmovl_u16(vget_high_u16(vd_high_u16x8));
     // Convert uint32 vectors to float32 vectors
     float32x4_t A_f32x4 = vcvtq_f32_u32(A_u32x4);
     float32x4_t B_f32x4 = vcvtq_f32_u32(B_u32x4);
     float32x4_t C_f32x4 = vcvtq_f32_u32(C_u32x4);
     float32x4_t D_f32x4 = vcvtq_f32_u32(D_u32x4);
     // vd = vd*vs + vo
     A_f32x4 = vmlaq_f32(vo, A_f32x4, vs);
     B_f32x4 = vmlaq_f32(vo, B_f32x4, vs);
     C_f32x4 = vmlaq_f32(vo, C_f32x4, vs);
     D_f32x4 = vmlaq_f32(vo, D_f32x4, vs);
     // Convert float32 vectors to uint32 vectors
     A_u32x4 = vcvtq_u32_f32(A_f32x4);
     B_u32x4 = vcvtq_u32_f32(B_f32x4);
     C_u32x4 = vcvtq_u32_f32(C_f32x4);
     D_u32x4 = vcvtq_u32_f32(D_f32x4);
     // Convert uint32 vectors to uint16 vectors (with saturation)
     vd_low_u16x8  = vcombine_u16(vqmovn_u32(A_u32x4), vqmovn_u32(B_u32x4));
     vd_high_u16x8 = vcombine_u16(vqmovn_u32(C_u32x4), vqmovn_u32(D_u32x4));
     // Convert uint16 vectors to uint8 vectors (with saturation)
     return vcombine_u8(vqmovn_u16(vd_low_u16x8), vqmovn_u16(vd_high_u16x8));
 }
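
The listing above widens each QASYMM8 component to float, applies vd*vs + vo, converts back with truncation and then narrows with saturation. A per-component scalar model of that pipeline might look like the sketch below. The helper name mla_qasymm8_scalar is hypothetical, and non-finite multipliers or addends are deliberately left out of the model.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar model of one component of the QASYMM8 multiply-accumulate.
inline uint8_t mla_qasymm8_scalar(uint8_t vd, float vs, float vo)
{
    // This sketch does not model NaN or infinite multipliers and addends.
    assert(std::isfinite(vs) && std::isfinite(vo));

    // Widen to float and apply vd * vs + vo.
    const float r = vo + static_cast<float>(vd) * vs;

    // The float to unsigned conversion clamps negative values to zero,
    // and the saturating narrows clamp anything above 255.
    if (r <= 0.0f) return 0;
    if (r >= 255.0f) return 255;

    // In-range values are truncated toward zero.
    return static_cast<uint8_t>(r);
}
```

For instance, a component of 200 with vs = 2.0f and vo = 0.0f would give 400 and therefore saturates to 255.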

qint16x8_t arm_compute::vmlaq_qs16	(	qint16x8_t	a,
		qint16x8_t	b,
		qint16x8_t	c,
		int	fixed_point_position
	)

16 bit fixed point vector multiply-accumulate (8 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 16 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 16 bit fixed point input vector
[in]	c	Third 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiply-accumulate

qint8x16_t arm_compute::vmlaq_qs8	(	qint8x16_t	a,
		qint8x16_t	b,
		qint8x16_t	c,
		int	fixed_point_position
	)

8 bit fixed point vector multiply-accumulate (16 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 8 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 8 bit fixed point input vector
[in]	c	Third 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiply-accumulate

qint16x4_t arm_compute::vmul_qs16	(	qint16x4_t	a,
		qint16x4_t	b,
		int	fixed_point_position
	)

16 bit fixed point vector multiply (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiplication.

qint8x8_t arm_compute::vmul_qs8	(	qint8x8_t	a,
		qint8x8_t	b,
		int	fixed_point_position
	)

8 bit fixed point vector multiply (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiplication.

qint32x4_t arm_compute::vmull_qs16	(	qint16x4_t	a,
		qint16x4_t	b,
		int	fixed_point_position
	)

16 bit fixed point vector long multiply (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 32 bit fixed point long vector multiplication.

qint16x8_t arm_compute::vmull_qs8	(	qint8x8_t	a,
		qint8x8_t	b,
		int	fixed_point_position
	)

8 bit fixed point vector long multiply (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point long vector multiplication.

Referenced by arm_compute::detail::convolve_3x3< 1 >().

qint16x8_t arm_compute::vmulq_qs16	(	qint16x8_t	a,
		qint16x8_t	b,
		int	fixed_point_position
	)

16 bit fixed point vector multiply (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiplication.

qint8x16_t arm_compute::vmulq_qs8	(	qint8x16_t	a,
		qint8x16_t	b,
		int	fixed_point_position
	)

8 bit fixed point vector multiply (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiplication.

int16x4_t arm_compute::vpaddl_qs8 ( qint8x8_t a )

8 bit fixed point vector saturating pairwise add (8 elements)

Parameters

[in] a 8 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector addition. The result is saturated in case of overflow

qint16x4_t arm_compute::vpmax_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector pairwise max (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector pairwise max operation

qint8x8_t arm_compute::vpmax_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector pairwise max (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector pairwise max operation

qint16x4_t arm_compute::vpmin_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector pairwise min (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector pairwise min operation

qint8x8_t arm_compute::vpmin_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector pairwise min (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector pairwise min operation

float32x4_t arm_compute::vpowq_f32	(	float32x4_t	val,
		float32x4_t	n
	)

Calculate the n-th power of a number.

pow(x,n) = e^(n*log(x))

Parameters

[in]	val	Input vector value in F32 format.
[in]	n	Powers to raise the input to.

Returns: The calculated power.
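
The identity pow(x,n) = e^(n*log(x)) can be checked with a scalar sketch using the standard library. The function name pow_via_exp_log is hypothetical. The identity only holds for positive x, which the sketch asserts; how the vector function treats other inputs is not described here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar illustration of pow(x, n) = exp(n * log(x)).
inline float pow_via_exp_log(float x, float n)
{
    assert(x > 0.0f); // log(x) is only real-valued for positive x
    return std::exp(n * std::log(x));
}
```

Calling pow_via_exp_log(2.0f, 3.0f) yields a value close to 8, up to floating point rounding in the exponential and the logarithm.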

qint16x4_t arm_compute::vqabs_qs16 ( qint16x4_t a )

Saturating absolute value of 16 bit fixed point vector (4 elements)

Parameters

[in] a 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector absolute value

qint8x8_t arm_compute::vqabs_qs8 ( qint8x8_t a )

Saturating absolute value of 8 bit fixed point vector (8 elements)

Parameters

[in] a 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector absolute value

qint16x8_t arm_compute::vqabsq_qs16 ( qint16x8_t a )

Saturating absolute value of 16 bit fixed point vector (8 elements)

Parameters

[in] a 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector absolute value

qint8x16_t arm_compute::vqabsq_qs8 ( qint8x16_t a )

Saturating absolute value of 8 bit fixed point vector (16 elements)

Parameters

[in] a 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector absolute value

qint16x4_t arm_compute::vqadd_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector saturating add (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector addition. The result is saturated in case of overflow

qint8x8_t arm_compute::vqadd_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector saturating add (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector addition. The result is saturated in case of overflow
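
Saturation means that a sum which does not fit the lane type is clamped to the nearest representable value instead of wrapping around. A minimal scalar sketch for one 8 bit lane follows; the names saturate_to_int8_range and qadd_qs8_scalar are hypothetical.

```cpp
#include <cstdint>

// Clamp a wide intermediate value to the int8_t range.
constexpr int saturate_to_int8_range(int v)
{
    return v > 127 ? 127 : (v < -128 ? -128 : v);
}

// Hypothetical scalar model of one lane of an 8 bit saturating add.
constexpr int8_t qadd_qs8_scalar(int8_t a, int8_t b)
{
    // Add in a wider type so the intermediate sum cannot overflow.
    return static_cast<int8_t>(saturate_to_int8_range(static_cast<int>(a) + static_cast<int>(b)));
}
```

Here 100 + 100 clamps to 127 and -100 + -100 clamps to -128, whereas a wrapping add would produce -56 and 56. Since fixed point addition does not rescale, fixed_point_position plays no role.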

qint16x8_t arm_compute::vqaddq_qs16	(	qint16x8_t	a,
		qint16x8_t	b
	)

16 bit fixed point vector saturating add (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector addition. The result is saturated in case of overflow

qint8x16_t arm_compute::vqaddq_qs8	(	qint8x16_t	a,
		qint8x16_t	b
	)

8 bit fixed point vector saturating add (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector addition. The result is saturated in case of overflow

qint16x4_t arm_compute::vqcvt_qs16_f32	(	const float32x4_t	a,
		int	fixed_point_position
	)

Convert a float vector with 4 elements to 16 bit fixed point vector with 4 elements.

Parameters

[in]	a	Float input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion float -> 16 bit fixed point. The result is saturated in case of overflow

qint8x8_t arm_compute::vqcvt_qs8_f32	(	const float32x4x2_t	a,
		int	fixed_point_position
	)

Convert a float vector with 4x2 elements to 8 bit fixed point vector with 8 elements.

Parameters

[in]	a	Float input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion float -> 8 bit fixed point. The result is saturated in case of overflow

qint16x8_t arm_compute::vqcvtq_qs16_f32	(	const float32x4x2_t &	a,
		int	fixed_point_position
	)

Convert a float vector with 4x2 elements to 16 bit fixed point vector with 8 elements.

Parameters

[in]	a	Float input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion float -> 16 bit fixed point. The result is saturated in case of overflow

qint8x16_t arm_compute::vqcvtq_qs8_f32	(	const float32x4x4_t &	a,
		int	fixed_point_position
	)

Convert a float vector with 4x4 elements to 8 bit fixed point vector with 16 elements.

Parameters

[in]	a	Float input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the conversion float -> 8 bit fixed point. The result is saturated in case of overflow

qint16x4_t arm_compute::vqexp_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Calculate saturating exponential fixed point 16 bit (4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit saturating exponential

qint8x8_t arm_compute::vqexp_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Calculate saturating exponential fixed point 8bit (8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit saturating exponential

qint16x8_t arm_compute::vqexpq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Calculate saturating exponential fixed point 16 bit (8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit saturating exponential

qint8x16_t arm_compute::vqexpq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Calculate saturating exponential fixed point 8bit (16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit saturating exponential

qint16x4_t arm_compute::vqinvsqrt_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Calculate saturating inverse square root for fixed point 16 bit using Newton-Raphson method (4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit inverse sqrt.

qint8x8_t arm_compute::vqinvsqrt_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Calculate saturating inverse square root for fixed point 8bit using Newton-Raphson method (8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit inverse sqrt.

qint16x8_t arm_compute::vqinvsqrtq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Calculate saturating inverse square root for fixed point 16 bit using Newton-Raphson method (8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit inverse sqrt.

qint8x16_t arm_compute::vqinvsqrtq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Calculate saturating inverse square root for fixed point 8bit using Newton-Raphson method (16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit inverse sqrt.

qint16x4_t arm_compute::vqmla_qs16	(	qint16x4_t	a,
		qint16x4_t	b,
		qint16x4_t	c,
		int	fixed_point_position
	)

16 bit fixed point vector saturating multiply-accumulate (4 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 16 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 16 bit fixed point input vector
[in]	c	Third 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiply-accumulate. The result is saturated in case of overflow

qint8x8_t arm_compute::vqmla_qs8	(	qint8x8_t	a,
		qint8x8_t	b,
		qint8x8_t	c,
		int	fixed_point_position
	)

8 bit fixed point vector saturating multiply-accumulate (8 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 8 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 8 bit fixed point input vector
[in]	c	Third 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiply-accumulate. The result is saturated in case of overflow

qint32x4_t arm_compute::vqmlal_qs16	(	qint32x4_t	a,
		qint16x4_t	b,
		qint16x4_t	c,
		int	fixed_point_position
	)

16 bit fixed point vector saturating multiply-accumulate long (4 elements).

The saturation is performed on the 32 bit fixed point output vector. This operation computes the product of b and c and adds the result to the 32 bit fixed point vector a (a + b * c).

Parameters

[in]	a	First 32 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 16 bit fixed point input vector
[in]	c	Third 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiply-accumulate long

qint16x8_t arm_compute::vqmlal_qs8	(	qint16x8_t	a,
		qint8x8_t	b,
		qint8x8_t	c,
		int	fixed_point_position
	)

8 bit fixed point vector saturating multiply-accumulate long (8 elements).

The saturation is performed on the 16 bit fixed point output vector. This operation computes the product of b and c and adds the result to the 16 bit fixed point vector a (a + b * c).

Parameters

[in]	a	First 16 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 8 bit fixed point input vector
[in]	c	Third 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiply-accumulate long

Referenced by arm_compute::detail::convolve_3x3< 1 >().
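
As a rough illustration of the multiply-accumulate long semantics described above, the following scalar sketch models a single lane of an 8 bit to 16 bit operation. The helper name qmlal_qs8_lane is hypothetical, and the round-to-nearest rescaling of the product is an assumption; the documentation only states that the 16 bit output saturates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of an 8 bit -> 16 bit saturating
// multiply-accumulate long: a + (b * c). The raw product carries
// 2 * fixed_point_position fractional bits and is rescaled back to
// fixed_point_position fractional bits before the accumulation.
inline int16_t qmlal_qs8_lane(int16_t a, int8_t b, int8_t c, int fixed_point_position)
{
    assert(fixed_point_position > 0 && fixed_point_position < 8);

    // Assumption: round to nearest by adding half an LSB before shifting.
    // The right shift of a negative value is assumed to be arithmetic.
    const int32_t rounding = int32_t{1} << (fixed_point_position - 1);
    const int32_t product  = (int32_t{b} * int32_t{c} + rounding) >> fixed_point_position;

    // Accumulate in a wider type, then saturate to the 16 bit output range.
    const int32_t sum = int32_t{a} + product;
    return static_cast<int16_t>(std::min<int32_t>(INT16_MAX, std::max<int32_t>(INT16_MIN, sum)));
}
```

With fixed_point_position = 4, the value 1.5 is stored as 24 and 2.0 as 32, so qmlal_qs8_lane(16, 24, 32, 4) yields 64, which reads as 1.0 + 3.0 = 4.0 in that format.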

qint16x8_t arm_compute::vqmlaq_qs16	(	qint16x8_t	a,
		qint16x8_t	b,
		qint16x8_t	c,
		int	fixed_point_position
	)

16 bit fixed point vector saturating multiply-accumulate (8 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 16 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 16 bit fixed point input vector
[in]	c	Third 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiply-accumulate. The result is saturated in case of overflow

qint8x16_t arm_compute::vqmlaq_qs8	(	qint8x16_t	a,
		qint8x16_t	b,
		qint8x16_t	c,
		int	fixed_point_position
	)

8 bit fixed point vector saturating multiply-accumulate (16 elements).

This operation computes the product of b and c and adds the result to a (a + b * c).

Parameters

[in]	a	First 8 bit fixed point input vector where the result of multiplication must be added to
[in]	b	Second 8 bit fixed point input vector
[in]	c	Third 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiply-accumulate. The result is saturated in case of overflow

qint8x8_t arm_compute::vqmovn_q16 ( qint16x8_t a )

16 bit fixed point vector saturating narrow (8 elements)

Parameters

[in] a 16 bit fixed point vector to convert

Returns: 8 bit fixed point vector

qint16x4_t arm_compute::vqmovn_q32 ( qint32x4_t a )

32 bit fixed point vector saturating narrow (4 elements)

Parameters

[in] a 32 bit fixed point vector to convert

Returns: 16 bit fixed point vector
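
The saturating narrow operations above can be pictured per lane as a clamp followed by a cast. The sketch below is a scalar model only; the helper names qmovn_16_to_8 and qmovn_32_to_16 are hypothetical and are not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of a 16 bit -> 8 bit saturating narrow.
inline int8_t qmovn_16_to_8(int16_t a)
{
    const int16_t clamped = std::min<int16_t>(INT8_MAX, std::max<int16_t>(INT8_MIN, a));
    assert(clamped >= INT8_MIN && clamped <= INT8_MAX); // the cast below cannot overflow
    return static_cast<int8_t>(clamped);
}

// Hypothetical scalar model of one lane of a 32 bit -> 16 bit saturating narrow.
inline int16_t qmovn_32_to_16(int32_t a)
{
    const int32_t clamped = std::min<int32_t>(INT16_MAX, std::max<int32_t>(INT16_MIN, a));
    assert(clamped >= INT16_MIN && clamped <= INT16_MAX); // the cast below cannot overflow
    return static_cast<int16_t>(clamped);
}
```

Values that already fit in the narrower type pass through unchanged; anything outside the range is pinned to the nearest representable limit instead of wrapping around.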

qint16x4_t arm_compute::vqmul_qs16	(	qint16x4_t	a,
		qint16x4_t	b,
		int	fixed_point_position
	)

16 bit fixed point vector saturating multiply (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiplication. The result is saturated in case of overflow

qint8x8_t arm_compute::vqmul_qs8	(	qint8x8_t	a,
		qint8x8_t	b,
		int	fixed_point_position
	)

8 bit fixed point vector saturating multiply (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiplication. The result is saturated in case of overflow

qint16x8_t arm_compute::vqmulq_qs16	(	qint16x8_t	a,
		qint16x8_t	b,
		int	fixed_point_position
	)

16 bit fixed point vector saturating multiply (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit fixed point vector multiplication. The result is saturated in case of overflow

qint8x16_t arm_compute::vqmulq_qs8	(	qint8x16_t	a,
		qint8x16_t	b,
		int	fixed_point_position
	)

8 bit fixed point vector saturating multiply (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8 bit fixed point vector multiplication. The result is saturated in case of overflow
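
To make the role of fixed_point_position in the saturating multiplies more concrete, here is a scalar sketch of a single 8 bit lane. The name qmul_qs8_lane is hypothetical, and the rounding step is an assumption; the documentation only specifies saturation on overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of an 8 bit saturating fixed point
// multiply. Both inputs hold fixed_point_position fractional bits, so the raw
// product holds twice as many and must be shifted back down.
inline int8_t qmul_qs8_lane(int8_t a, int8_t b, int fixed_point_position)
{
    assert(fixed_point_position > 0 && fixed_point_position < 8);

    // Assumption: round to nearest. The right shift of a negative value is
    // assumed to be arithmetic.
    const int32_t rounding = int32_t{1} << (fixed_point_position - 1);
    const int32_t product  = (int32_t{a} * int32_t{b} + rounding) >> fixed_point_position;

    // Saturate to the 8 bit output range instead of wrapping around.
    return static_cast<int8_t>(std::min<int32_t>(INT8_MAX, std::max<int32_t>(INT8_MIN, product)));
}
```

With four fractional bits, 1.5 * 2.0 is qmul_qs8_lane(24, 32, 4), which gives 48 (3.0). The product 4.0 * 4.0 does not fit in 8 bits with four fractional bits, so qmul_qs8_lane(64, 64, 4) saturates to 127.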

qint16x8_t arm_compute::vqpowq_qs16	(	qint16x8_t	a,
		qint16x8_t	b,
		int	fixed_point_position
	)

Calculate the saturating power a^b for fixed point 16bit (8 elements).

pow(a,b) = e^(b*log(a))

Parameters

[in]	a	16bit fixed point input vector
[in]	b	16bit fixed point power vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16bit power.

qint8x16_t arm_compute::vqpowq_qs8	(	qint8x16_t	a,
		qint8x16_t	b,
		int	fixed_point_position
	)

Calculate the saturating power a^b for fixed point 8bit (16 elements).

pow(a,b) = e^(b*log(a))

Parameters

[in]	a	8bit fixed point input vector
[in]	b	8bit fixed point power vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit power.
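
The identity pow(a,b) = e^(b*log(a)) used by the power functions can be checked in plain floating point. The sketch below is not the fixed point implementation; pow_via_exp_log is a hypothetical helper, and it assumes a > 0 because the logarithm is undefined otherwise.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical floating point illustration of pow(a, b) = e^(b * log(a)).
// Only valid for a > 0, since log(a) is undefined for a <= 0.
inline float pow_via_exp_log(float a, float b)
{
    assert(a > 0.f);
    return std::exp(b * std::log(a));
}
```

A fixed point version would replace std::exp and std::log with fixed point approximations and saturate the result; that part is not shown here.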

qint16x4_t arm_compute::vqsub_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector saturating subtraction (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector subtraction. The result is saturated in case of overflow

qint8x8_t arm_compute::vqsub_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector saturating subtraction (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector subtraction. The result is saturated in case of overflow

qint16x8_t arm_compute::vqsubq_qs16	(	qint16x8_t	a,
		qint16x8_t	b
	)

16 bit fixed point vector saturating subtraction (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector subtraction. The result is saturated in case of overflow

qint8x16_t arm_compute::vqsubq_qs8	(	qint8x16_t	a,
		qint8x16_t	b
	)

8 bit fixed point vector saturating subtraction (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector subtraction. The result is saturated in case of overflow
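
For the saturating subtractions above, a per-lane scalar model may help. Since both operands share the same fixed point position, no rescaling is needed and the function takes no fixed_point_position argument. The name qsub_qs8_lane is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of an 8 bit saturating subtraction.
inline int8_t qsub_qs8_lane(int8_t a, int8_t b)
{
    // Compute the difference in a wider type so it cannot overflow.
    const int32_t diff    = int32_t{a} - int32_t{b};
    const int32_t clamped = std::min<int32_t>(INT8_MAX, std::max<int32_t>(INT8_MIN, diff));
    assert(clamped >= INT8_MIN && clamped <= INT8_MAX); // the cast below cannot overflow
    return static_cast<int8_t>(clamped);
}
```

For example, qsub_qs8_lane(-100, 100) returns -128 rather than the wrapped value a plain 8 bit subtraction would produce.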

qint16x4_t arm_compute::vqtanh_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Calculate hyperbolic tangent for fixed point 16 bit (4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The calculated Hyperbolic Tangent.

qint8x8_t arm_compute::vqtanh_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Calculate hyperbolic tangent for fixed point 8bit (8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The calculated Hyperbolic Tangent.

qint16x8_t arm_compute::vqtanhq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Calculate hyperbolic tangent for fixed point 16bit (8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The calculated Hyperbolic Tangent.

qint8x16_t arm_compute::vqtanhq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Calculate hyperbolic tangent for fixed point 8bit (16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The calculated Hyperbolic Tangent.

qint16x4_t arm_compute::vrecip_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Calculate reciprocal of a fixed point 16bit number using the Newton-Raphson method.

(4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit reciprocal (1/a).

qint8x8_t arm_compute::vrecip_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Calculate reciprocal of a fixed point 8bit number using the Newton-Raphson method.

(8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit reciprocal (1/a).

qint16x8_t arm_compute::vrecipq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Calculate reciprocal of a fixed point 16bit number using the Newton-Raphson method.

(8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit reciprocal (1/a).

qint8x16_t arm_compute::vrecipq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Calculate reciprocal of a fixed point 8bit number using the Newton-Raphson method.

(16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit reciprocal (1/a).
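
The Newton-Raphson iteration behind the reciprocal functions can be sketched in floating point as x_{n+1} = x_n * (2 - a * x_n). The code below is an illustration only: the helper name recip_newton, the normalisation of the input to [0.5, 1), the linear initial estimate 48/17 - 32/17 * m and the default of three iterations are all assumptions and are not taken from this documentation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical floating point sketch of a Newton-Raphson reciprocal.
inline float recip_newton(float a, int iterations = 3)
{
    assert(a > 0.f && iterations >= 0);

    // Split a into m * 2^exponent with m in [0.5, 1) so that one initial
    // estimate works for the whole input range.
    int         exponent = 0;
    const float m        = std::frexp(a, &exponent);

    // Assumed linear initial estimate of 1/m on [0.5, 1).
    float x = 48.f / 17.f - (32.f / 17.f) * m;

    // Each step roughly squares the relative error of the estimate.
    for(int i = 0; i < iterations; ++i)
    {
        x = x * (2.f - m * x);
    }

    // Undo the normalisation: 1/a = (1/m) * 2^(-exponent).
    return std::ldexp(x, -exponent);
}
```

The iteration uses only multiplications and subtractions, which is why it suits fixed point hardware without a divide instruction.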

void arm_compute::vst1_qs16	(	qint16_t *	addr,
		qint16x4_t	b
	)

Store a single 16 bit fixed point vector to memory (4 elements)

Parameters

[in]	addr	Memory address where the 16 bit fixed point vector should be stored
[in]	b	16 bit fixed point vector to store

Referenced by arm_compute::detail::store_results< 3 >().

void arm_compute::vst1_qs8	(	qint8_t *	addr,
		qint8x8_t	b
	)

Store a single 8 bit fixed point vector to memory (8 elements)

Parameters

[in]	addr	Memory address where the 8 bit fixed point vector should be stored
[in]	b	8 bit fixed point vector to store

void arm_compute::vst1q_qs16	(	qint16_t *	addr,
		qint16x8_t	b
	)

Store a single 16 bit fixed point vector to memory (8 elements)

Parameters

[in]	addr	Memory address where the 16 bit fixed point vector should be stored
[in]	b	16 bit fixed point vector to store

Referenced by arm_compute::detail::store_results< 1 >(), and arm_compute::detail::store_results< 2 >().

void arm_compute::vst1q_qs8	(	qint8_t *	addr,
		qint8x16_t	b
	)

Store a single 8 bit fixed point vector to memory (16 elements)

Parameters

[in]	addr	Memory address where the 8 bit fixed point vector should be stored
[in]	b	8 bit fixed point vector to store

void arm_compute::vst2q_qs16	(	qint16_t *	addr,
		qint16x8x2_t	b
	)

Store two 16 bit fixed point vectors to memory (8x2 elements)

Parameters

[in]	addr	Memory address where the 16 bit fixed point vectors should be stored
[in]	b	16 bit fixed point vectors to store

qint16x4_t arm_compute::vsub_qs16	(	qint16x4_t	a,
		qint16x4_t	b
	)

16 bit fixed point vector subtraction (4 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector subtraction

qint8x8_t arm_compute::vsub_qs8	(	qint8x8_t	a,
		qint8x8_t	b
	)

8 bit fixed point vector subtraction (8 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector subtraction

qint16x8_t arm_compute::vsubq_qs16	(	qint16x8_t	a,
		qint16x8_t	b
	)

16 bit fixed point vector subtraction (8 elements)

Parameters

[in]	a	First 16 bit fixed point input vector
[in]	b	Second 16 bit fixed point input vector

Returns: The result of the 16 bit fixed point vector subtraction

qint8x16_t arm_compute::vsubq_qs8	(	qint8x16_t	a,
		qint8x16_t	b
	)

8 bit fixed point vector subtraction (16 elements)

Parameters

[in]	a	First 8 bit fixed point input vector
[in]	b	Second 8 bit fixed point input vector

Returns: The result of the 8 bit fixed point vector subtraction

float32x4_t arm_compute::vtanhq_f32 ( float32x4_t val )

Calculate hyperbolic tangent.

tanh(x) = (e^(2x) - 1)/(e^(2x) + 1)

Note: x is clamped to [-5,5] to avoid overflow issues.

Parameters

[in] val Input vector value in F32 format.

Returns: The calculated Hyperbolic Tangent.
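
The formula and the clamping note for the F32 hyperbolic tangent can be mirrored by a scalar sketch. The function name tanh_clamped is hypothetical, and std::exp stands in for whatever exponential approximation the vector code uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical scalar sketch of tanh(x) = (e^(2x) - 1) / (e^(2x) + 1) with the
// input clamped to [-5, 5], as described in the note above.
inline float tanh_clamped(float x)
{
    assert(!std::isnan(x));

    // Outside [-5, 5] the true value is already within about 1e-4 of +/-1,
    // so clamping costs little accuracy and keeps e^(2x) well in range.
    const float cx  = std::min(5.f, std::max(-5.f, x));
    const float e2x = std::exp(2.f * cx);
    return (e2x - 1.f) / (e2x + 1.f);
}
```

Because of the clamp, every input above 5 returns the same value as an input of exactly 5, and likewise every input below -5 returns the same value as -5.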

qint16x4_t arm_compute::vtaylor_poly_qs16	(	qint16x4_t	a,
		int	fixed_point_position
	)

Perform a 4th degree polynomial approximation.

(4 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit Taylor approximation.

qint8x8_t arm_compute::vtaylor_poly_qs8	(	qint8x8_t	a,
		int	fixed_point_position
	)

Perform a 4th degree polynomial approximation.

(8 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit Taylor approximation.

float32x4_t arm_compute::vtaylor_polyq_f32	(	float32x4_t	x,
		const std::array< float32x4_t, 8 > &	coeffs
	)

Perform a 7th degree polynomial approximation using Estrin's method.

Parameters

[in]	x	Input vector value in F32 format.
[in]	coeffs	Polynomial coefficients table.

Returns: The calculated approximation.

qint16x8_t arm_compute::vtaylor_polyq_qs16	(	qint16x8_t	a,
		int	fixed_point_position
	)

Perform a 4th degree polynomial approximation.

(8 elements)

Parameters

[in]	a	16 bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 16 bit Taylor approximation.

qint8x16_t arm_compute::vtaylor_polyq_qs8	(	qint8x16_t	a,
		int	fixed_point_position
	)

Perform a 4th degree polynomial approximation.

(16 elements)

Parameters

[in]	a	8bit fixed point input vector
[in]	fixed_point_position	Fixed point position that expresses the number of bits for the fractional part of the number

Returns: The result of the 8bit Taylor approximation.

Variable Documentation

constexpr uint8_t CONSTANT_BORDER_VALUE = 199

Constant value of the border pixels when using BorderMode::CONSTANT.

Definition at line 101 of file Types.h.

const std::array<float32x4_t, 8> exp_tab

Initial value:

=
{
    {
        vdupq_n_f32(1.f),
        vdupq_n_f32(0.0416598916054f),
        vdupq_n_f32(0.500000596046f),
        vdupq_n_f32(0.0014122662833f),
        vdupq_n_f32(1.00000011921f),
        vdupq_n_f32(0.00833693705499f),
        vdupq_n_f32(0.166665703058f),
        vdupq_n_f32(0.000195780929062f),
    }
}

Exponent polynomial coefficients.

Definition at line 28 of file NEMath.inl.
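
The values in exp_tab are close to 1/k! for k = 0..7, but they are not stored in ascending order of degree. Judging from the values alone, indices 0, 4, 2, 6, 1, 5, 3, 7 appear to hold the coefficients of x^0 through x^7, which is the grouping a 7th degree Estrin evaluation would use. The scalar sketch below relies on that inferred layout; it is an assumption, and the names exp_tab_scalar and taylor_poly_estrin are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Scalar copy of the exp_tab values listed above, in the same order.
const std::array<float, 8> exp_tab_scalar = {{
    1.f,
    0.0416598916054f,
    0.500000596046f,
    0.0014122662833f,
    1.00000011921f,
    0.00833693705499f,
    0.166665703058f,
    0.000195780929062f,
}};

// Hypothetical scalar sketch of a 7th degree Estrin evaluation.
// Assumed layout: indices {0,4,2,6,1,5,3,7} hold the coefficients of x^0..x^7.
inline float taylor_poly_estrin(float x, const std::array<float, 8> &coeffs)
{
    assert(std::isfinite(x));

    // Four independent linear terms that can be computed in parallel.
    const float A = coeffs[0] + coeffs[4] * x; // c0 + c1 * x
    const float B = coeffs[2] + coeffs[6] * x; // c2 + c3 * x
    const float C = coeffs[1] + coeffs[5] * x; // c4 + c5 * x
    const float D = coeffs[3] + coeffs[7] * x; // c6 + c7 * x

    const float x2 = x * x;
    const float x4 = x2 * x2;

    // Combine pairwise: (A + B x^2) + (C + D x^2) x^4.
    return (A + B * x2) + (C + D * x2) * x4;
}
```

Compared with Horner's rule, Estrin's scheme shortens the dependency chain, which tends to suit SIMD pipelines. For small arguments the result should track std::exp closely; larger arguments would need a range reduction step that is not shown here.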

const std::array<float32x4_t, 8> log_tab

Initial value:

=
{
    {
        vdupq_n_f32(-2.29561495781f),
        vdupq_n_f32(-2.47071170807f),
        vdupq_n_f32(-5.68692588806f),
        vdupq_n_f32(-0.165253549814f),
        vdupq_n_f32(5.17591238022f),
        vdupq_n_f32(0.844007015228f),
        vdupq_n_f32(4.58445882797f),
        vdupq_n_f32(0.0141278216615f),
    }
}

Logarithm polynomial coefficients.

Definition at line 43 of file NEMath.inl.

constexpr size_t MAX_DIMS = 6

Constant value used to indicate maximum dimensions of a Window, TensorShape and Coordinates.

Definition at line 37 of file Dimensions.h.

constexpr float SCALE_PYRAMID_HALF = 0.5f

Constant value used to indicate a half-scale pyramid.

Definition at line 104 of file Types.h.

Referenced by arm_compute::test::validation::DATA_TEST_CASE(), arm_compute::test::validation::reference::gaussian_pyramid_half(), and arm_compute::test::validation::reference::optical_flow().

constexpr float SCALE_PYRAMID_ORB = 8.408964152537146130583778358414e-01

Constant value used to indicate an ORB-scaled pyramid.

Definition at line 107 of file Types.h.
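
Numerically, the value of SCALE_PYRAMID_ORB equals 2^(-1/4), so four ORB-scaled steps shrink an image by about the same factor as one half-scale step. The sketch below checks that relationship; the names kScalePyramidHalf, kScalePyramidOrb and pyramid_scale_at_level are hypothetical local stand-ins, not library symbols.

```cpp
#include <cassert>
#include <cmath>

// Local copies of the two constants documented above (hypothetical names).
constexpr float kScalePyramidHalf = 0.5f;
constexpr float kScalePyramidOrb  = 8.408964152537146130583778358414e-01f;

// Cumulative linear scale factor after `levels` pyramid steps.
inline float pyramid_scale_at_level(float scale, int levels)
{
    assert(levels >= 0);
    return std::pow(scale, static_cast<float>(levels));
}
```

How the library rounds the resulting image dimensions at each level is not covered by this sketch.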
