Basic function to execute a depthwise convolution for kernel size 3x3xC (when data layout NCHW) or Cx3x3 (when data layout NHWC).

Public Member Functions
	CLDepthwiseConvolutionLayer3x3 ()
	Default constructor.

void	configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo())
	Initialize the function's source, destination, conv and border_size.

void	run () override
	Run the kernels contained in the function.

Public Member Functions inherited from IFunction
virtual	~IFunction ()=default
	Destructor.

virtual void	prepare ()
	Prepare the function for executing.

Static Public Member Functions
static Status	validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, unsigned int depth_multiplier=1, ActivationLayerInfo act_info=ActivationLayerInfo(), GPUTarget gpu_target=GPUTarget::MIDGARD)
	Static function to check if given info will lead to a valid configuration of CLDepthwiseConvolutionLayer3x3.

Detailed Description

Basic function to execute a depthwise convolution for kernel size 3x3xC (when data layout NCHW) or Cx3x3 (when data layout NHWC).
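
To make the operation concrete, the following is a minimal CPU reference sketch of a 3x3 depthwise convolution in plain C++. It is not the library's OpenCL implementation. The helper name `depthwise_conv3x3_reference`, the flattened NCHW layout, the fixed stride of 1 with no padding, and the output channel ordering `c * depth_multiplier + m` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, not part of the Compute Library.
// input:   [channels][height][width], flattened row-major (NCHW, batch of 1)
// weights: [channels * depth_multiplier][3][3], flattened row-major
// Returns  [channels * depth_multiplier][height - 2][width - 2]
// Assumes stride 1 and no padding.
std::vector<float> depthwise_conv3x3_reference(const std::vector<float> &input,
                                               const std::vector<float> &weights,
                                               std::size_t channels,
                                               std::size_t height,
                                               std::size_t width,
                                               std::size_t depth_multiplier = 1)
{
    assert(height >= 3 && width >= 3);
    assert(input.size() == channels * height * width);
    assert(weights.size() == channels * depth_multiplier * 9);

    const std::size_t out_h = height - 2;
    const std::size_t out_w = width - 2;
    std::vector<float> output(channels * depth_multiplier * out_h * out_w, 0.0f);

    for (std::size_t c = 0; c < channels; ++c) {
        for (std::size_t m = 0; m < depth_multiplier; ++m) {
            // Each input channel feeds depth_multiplier output channels.
            const std::size_t oc = c * depth_multiplier + m;
            for (std::size_t y = 0; y < out_h; ++y) {
                for (std::size_t x = 0; x < out_w; ++x) {
                    float acc = 0.0f;
                    // Accumulate over the 3x3 window of a single input channel.
                    for (std::size_t ky = 0; ky < 3; ++ky) {
                        for (std::size_t kx = 0; kx < 3; ++kx) {
                            acc += input[(c * height + y + ky) * width + x + kx] *
                                   weights[(oc * 3 + ky) * 3 + kx];
                        }
                    }
                    output[(oc * out_h + y) * out_w + x] = acc;
                }
            }
        }
    }
    return output;
}
```

Unlike a standard convolution, each filter here reads only one input channel, so no accumulation across channels takes place.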

This function calls the following OpenCL kernels:

Constructor & Destructor Documentation

CLDepthwiseConvolutionLayer3x3 ( )

Default constructor.

Member Function Documentation

void configure	(	ICLTensor *	input,
		const ICLTensor *	weights,
		const ICLTensor *	biases,
		ICLTensor *	output,
		const PadStrideInfo &	conv_info,
		unsigned int	depth_multiplier = `1`,
		ActivationLayerInfo	act_info = `ActivationLayerInfo()`
	)

Initialize the function's source, destination, conv and border_size.

Parameters

[in,out]	input	Source tensor. Data type supported: QASYMM8/F16/F32. (Written to only for border filling).
[in]	weights	Weights tensor. A 3D tensor with shape [3, 3, IFM]. Data type supported: Same as `input`.
[in]	biases	(Optional) Biases tensor. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as `input`.
[out]	output	Destination tensor. Data type supported: same as `input`.
[in]	conv_info	Padding and stride information to use for the convolution.
[in]	depth_multiplier	(Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]	act_info	(Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported for 3x3 QASYMM8.
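
The relationship between `conv_info`, `depth_multiplier` and the expected output dimensions can be sketched as below. This is an illustration only: `ConvGeometry`, `OutputShape` and `depthwise3x3_output_shape` are hypothetical stand-ins rather than library types, and the sketch assumes that output sizes are rounded down (floor), which may not match every `PadStrideInfo` configuration.

```cpp
#include <cassert>

// Hypothetical stand-in for the stride and padding values carried by conv_info.
struct ConvGeometry
{
    unsigned int stride_x, stride_y;
    unsigned int pad_left, pad_right, pad_top, pad_bottom;
};

// Hypothetical stand-in describing the expected output dimensions.
struct OutputShape
{
    unsigned int width, height, depth;
};

// Computes the expected output dimensions for a 3x3 depthwise convolution.
// Assumption: spatial sizes are rounded down (floor division).
OutputShape depthwise3x3_output_shape(unsigned int in_width,
                                      unsigned int in_height,
                                      unsigned int in_depth,
                                      const ConvGeometry &g,
                                      unsigned int depth_multiplier = 1)
{
    const unsigned int kernel = 3;
    assert(g.stride_x > 0 && g.stride_y > 0);
    assert(in_width + g.pad_left + g.pad_right >= kernel);
    assert(in_height + g.pad_top + g.pad_bottom >= kernel);

    OutputShape out{};
    out.width  = (in_width + g.pad_left + g.pad_right - kernel) / g.stride_x + 1;
    out.height = (in_height + g.pad_top + g.pad_bottom - kernel) / g.stride_y + 1;
    // The output depth is the input depth scaled by depth_multiplier.
    out.depth  = in_depth * depth_multiplier;
    return out;
}
```

Under these assumptions, an 8x8 input with 16 channels, stride 1 and a padding of 1 on every side keeps its 8x8x16 shape, while `depth_multiplier = 2` doubles the depth to 32.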

void run ( )

override virtual

Run the kernels contained in the function.

For NEON kernels:

Note: CPPScheduler::set_num_threads() can be used to set the number of threads manually.

For OpenCL kernels:

Note: The function will not block until the kernels are executed. It is the user's responsibility to wait. prepare() will be called on the first run if it has not been called already.

Implements IFunction.
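
The "prepare on first run" behaviour described above can be pictured with the small stand-in class below. `LazyPreparedFunction` and its counters are hypothetical and exist only to illustrate the pattern; the real function enqueues OpenCL kernels and returns without waiting, so callers typically synchronise through the library's OpenCL scheduler before reading the output.

```cpp
#include <cassert>

// Hypothetical stand-in illustrating the prepare-once pattern; not a library class.
class LazyPreparedFunction
{
public:
    // One-off setup. Calling it again after the first time has no effect.
    void prepare()
    {
        if (!_is_prepared)
        {
            ++_prepare_calls;
            _is_prepared = true;
        }
    }

    // Runs the function, preparing it first if the caller has not done so.
    void run()
    {
        prepare();
        ++_run_calls;
    }

    int prepare_calls() const { return _prepare_calls; }
    int run_calls() const { return _run_calls; }

private:
    bool _is_prepared = false;
    int  _prepare_calls = 0;
    int  _run_calls = 0;
};
```

Calling `prepare()` explicitly before the first `run()` moves the one-off setup cost out of the first inference, which can matter when the first run is being timed.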

static Status validate	(	const ITensorInfo *	input,
		const ITensorInfo *	weights,
		const ITensorInfo *	biases,
		const ITensorInfo *	output,
		const PadStrideInfo &	conv_info,
		unsigned int	depth_multiplier = `1`,
		ActivationLayerInfo	act_info = `ActivationLayerInfo()`,
		GPUTarget	gpu_target = `GPUTarget::MIDGARD`
	)

static

Static function to check if given info will lead to a valid configuration of CLDepthwiseConvolutionLayer3x3.

Parameters

[in]	input	Source tensor. Data type supported: QASYMM8 for all layouts, F16/F32 for NCHW.
[in]	weights	Weights tensor. A 3D tensor with shape [3, 3, IFM]. Data type supported: Same as `input`.
[in]	biases	Biases tensor. A 1D tensor with shape [IFM]. Must be nullptr if not needed. Data type supported: Same as `input`, S32 when input is QASYMM8.
[in]	output	Destination tensor. Data type supported: same as `input`.
[in]	conv_info	Padding and stride information to use for the convolution.
[in]	depth_multiplier	(Optional) Multiplier to apply to the input's depth in order to retrieve the output's depth. Defaults to 1.
[in]	act_info	(Optional) Activation layer information in case of a fused activation. Only RELU, BOUNDED_RELU and LU_BOUNDED_RELU are supported for 3x3 QASYMM8.
[in]	gpu_target	(Optional) GPU target to validate the kernel for. Defaults to GPUTarget::MIDGARD.
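
The constraints listed in the parameter table can be summarised as a plain C++ check, shown below as a sketch. It is not the library's actual validation logic. `DataType`, `Activation`, `TensorDesc` and `check_documented_constraints` are hypothetical stand-ins, and the sketch assumes that "no activation" is always acceptable and that the activation restriction applies only when the input is QASYMM8.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; these are not the library's own types.
enum class DataType { QASYMM8, F16, F32, S32 };
enum class Activation { NONE, RELU, BOUNDED_RELU, LU_BOUNDED_RELU, OTHER };

struct TensorDesc
{
    std::vector<std::size_t> shape;
    DataType                 type;
};

// Returns true when the documented constraints hold for the given descriptors.
// ifm is the number of input feature maps; biases may be nullptr.
bool check_documented_constraints(const TensorDesc &input,
                                  const TensorDesc &weights,
                                  const TensorDesc *biases,
                                  Activation        act,
                                  std::size_t       ifm)
{
    // Input must be QASYMM8, F16 or F32.
    if (input.type == DataType::S32) { return false; }

    // Weights: 3D tensor with shape [3, 3, IFM], same data type as the input.
    const std::vector<std::size_t> expected_weights{3, 3, ifm};
    if (weights.shape != expected_weights || weights.type != input.type) { return false; }

    // Biases (optional): 1D tensor with shape [IFM]; S32 when the input is QASYMM8.
    if (biases != nullptr)
    {
        const std::vector<std::size_t> expected_biases{ifm};
        const DataType expected_type =
            (input.type == DataType::QASYMM8) ? DataType::S32 : input.type;
        if (biases->shape != expected_biases || biases->type != expected_type) { return false; }
    }

    // Assumption: for QASYMM8 only the three listed activations (or none) pass.
    if (input.type == DataType::QASYMM8 && act == Activation::OTHER) { return false; }

    return true;
}
```

Running such a check before configuring a function surfaces shape or data type mismatches early, which is the role `validate()` plays for the real tensors.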
