#ifndef __ARM_COMPUTE_CLSOFTMAXLAYERKERNEL_H__
#define __ARM_COMPUTE_CLSOFTMAXLAYERKERNEL_H__
static const unsigned int _grid_size;
static const unsigned int _serial_vector_size;
static const unsigned int _parallel_vector_size;
Interface for computing the max of the logits, then shifting, exponentiating and summing them.
ICLSimpleKernel & operator=(const ICLSimpleKernel &)=delete
Prevent instances of this class from being copied (as this class contains pointers) ...
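The deleted copy assignment operator follows the usual C++ idiom for types that store raw pointers. The following is a minimal sketch of that idiom, not the library's code: `PointerHoldingKernel` and its members are hypothetical names chosen for illustration.

```cpp
#include <type_traits>

// Hypothetical stand-in for a kernel class that stores raw tensor pointers.
class PointerHoldingKernel
{
public:
    PointerHoldingKernel() = default;

    // Copying is disabled so two objects never alias the same pointers.
    PointerHoldingKernel(const PointerHoldingKernel &) = delete;
    PointerHoldingKernel &operator=(const PointerHoldingKernel &) = delete;

    // Moves stay available in this sketch.
    PointerHoldingKernel(PointerHoldingKernel &&) = default;
    PointerHoldingKernel &operator=(PointerHoldingKernel &&) = default;

    const float *input() const { return _input; }
    float       *output() const { return _output; }

private:
    const float *_input{ nullptr };
    float       *_output{ nullptr };
};

// Compile-time checks that the class behaves as described.
static_assert(!std::is_copy_assignable<PointerHoldingKernel>::value, "copy assignment is deleted");
static_assert(std::is_move_assignable<PointerHoldingKernel>::value, "move assignment is available");
```

With the copy operations deleted, an accidental `a = b;` on two such objects fails at compile time rather than silently duplicating the pointers.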
DATA_TYPE sum(__global const DATA_TYPE *input)
Calculate sum of a vector.
Store the tensor's metadata.
Common interface for all the OpenCL kernels.
This file contains all available output stages for GEMMLowp on OpenCL.
static Status validate(const ITensorInfo *input, const ITensorInfo *output)
Static function to check if given info will lead to a valid configuration of CLLogits1DMaxKernel.
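A static `validate` of this kind is typically called before `configure`, so that an invalid pair of tensor descriptions is rejected without creating a kernel. The sketch below only illustrates that call pattern. `SketchStatus`, `SketchTensorInfo`, `validate_max` and `configure_checked` are hypothetical simplifications, and the shape rule (one output element per input row) is an assumption rather than the library's actual check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-ins for the library's Status and ITensorInfo.
struct SketchStatus
{
    bool        ok;
    std::string message;
};

struct SketchTensorInfo
{
    std::size_t width;
    std::size_t height;
};

// Assumed rule for a row-wise max: the output holds one element per input row.
SketchStatus validate_max(const SketchTensorInfo *input, const SketchTensorInfo *output)
{
    if(input == nullptr || output == nullptr)
    {
        return { false, "null tensor info" };
    }
    if(output->width != 1 || output->height != input->height)
    {
        return { false, "output shape does not match a per-row maximum" };
    }
    return { true, "" };
}

// Call pattern: validate first, continue with configuration only on success.
void configure_checked(const SketchTensorInfo &input, const SketchTensorInfo &output)
{
    const SketchStatus status = validate_max(&input, &output);
    assert(status.ok && "configuration rejected");
    (void)status; // avoids an unused-variable warning when NDEBUG is defined
}
```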
Interface for calculating the final step of the Softmax Layer where each logit value is multiplied by...
Interface for simple OpenCL kernels having 1 tensor input and 1 tensor output.
Interface for shifting, exponentiating and summing the logits.
Interface for identifying the max value of 1D logits.
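Taken together, the kernel descriptions above outline the stages of a numerically stable softmax: find the row maximum, shift and exponentiate while summing, then normalise. The following host-side C++ reference is a sketch of that arithmetic only; it is not the OpenCL implementation. `softmax_1d` is a hypothetical name, and the final step assumes the standard softmax normalisation (scaling by the inverse of the sum).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference softmax over one row of logits, split into three stages.
std::vector<float> softmax_1d(const std::vector<float> &logits)
{
    assert(!logits.empty());

    // Stage 1: identify the max value of the 1D logits.
    const float max_val = *std::max_element(logits.begin(), logits.end());

    // Stage 2: shift by the max, exponentiate, and accumulate the sum.
    // Shifting keeps every exponent <= 0, so exp() cannot overflow.
    std::vector<float> out(logits.size());
    float              sum = 0.f;
    for(std::size_t i = 0; i < logits.size(); ++i)
    {
        out[i] = std::exp(logits[i] - max_val);
        sum += out[i];
    }

    // Stage 3: normalise so that the outputs sum to 1.
    const float inv_sum = 1.f / sum;
    for(float &v : out)
    {
        v *= inv_sum;
    }
    return out;
}
```

Subtracting the maximum does not change the result mathematically, because the common factor cancels in the normalisation. It only keeps the intermediate values in a safe range.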
Interface for OpenCL tensor.
void run(const Window &window, cl::CommandQueue &queue) override
Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue...
fixed_point< T > max(fixed_point< T > x, fixed_point< T > y)
std::tuple< bool, unsigned int > ParallelReductionInfo
Info for whether a parallel reduction will be run and the vector size of the execution.
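A `std::tuple<bool, unsigned int>` like this is usually produced by a small decision function and unpacked by the caller. The sketch below shows one way that could look. The alias matches the listing, but `choose_reduction`, the threshold and the constant values are hypothetical, since the actual heuristic and the values of `_serial_vector_size` and `_parallel_vector_size` are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Same alias as in the listing: (run a parallel reduction?, vector size).
using ParallelReductionInfo = std::tuple<bool, unsigned int>;

// Hypothetical values; the real constants are not visible in this listing.
constexpr unsigned int serial_vector_size   = 8;
constexpr unsigned int parallel_vector_size = 4;
constexpr std::size_t  parallel_threshold   = 512; // assumed cut-off

// Hypothetical heuristic: rows at or above the threshold use a parallel reduction.
ParallelReductionInfo choose_reduction(std::size_t row_length)
{
    assert(row_length > 0);
    const bool parallel = row_length >= parallel_threshold;
    return std::make_tuple(parallel, parallel ? parallel_vector_size : serial_vector_size);
}
```

A caller can unpack the result with `std::tie(is_parallel, vector_size) = choose_reduction(n);` or read the fields individually with `std::get<0>` and `std::get<1>`.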
void configure(const ICLTensor *input, ICLTensor *output)
Set the input and output tensors.
const Window & window() const
The maximum window the kernel can be executed on.
Describe a multidimensional execution window.