This extension of the OpenCL tuner is still experimental. It allows
controlling the size of the batches of work groups distributed to
compute units.
Resolves COMPMID-3938
Change-Id: I8e55db6877717ef5d50bc7eee24b248b5a2f9414
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5027
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
CLTunerMode tuner_mode = CLTunerMode::NORMAL; /**< Parameter to select the level (granularity) of the tuning */
bool tune_wbsm = false; /**< Flag to tune the batches of work groups distributed to compute units.
Internally, the library will check if this feature is available on
- the target platform */
+ the target platform. This OpenCL tuner extension is still in an experimental phase */
};
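The comment on `tune_wbsm` says the library checks internally whether the feature is available on the target platform. A minimal sketch of that gating, assuming local stand-in types rather than the library's headers: the struct name `TuningInfoSketch`, the helper `wbsm_tuning_enabled`, and the `platform_supports_wbsm` argument are hypothetical and only illustrate that the flag alone does not enable the tuning.

```cpp
// Stand-in for the tuner mode enumeration. Only NORMAL appears in the diff;
// the other enumerators are assumptions made for this sketch.
enum class CLTunerMode
{
    EXHAUSTIVE,
    NORMAL,
    RAPID
};

// Hypothetical mirror of the two fields shown in the diff, with the same defaults.
struct TuningInfoSketch
{
    CLTunerMode tuner_mode = CLTunerMode::NORMAL; // Level (granularity) of the tuning
    bool        tune_wbsm  = false;               // Request tuning of work group batch sizes
};

// Hypothetical helper: the batch size is tuned only when the caller asked for
// it AND the platform reports support. How the library actually detects
// support is not shown in the diff, so it is passed in as a plain bool here.
bool wbsm_tuning_enabled(const TuningInfoSketch &info, bool platform_supports_wbsm)
{
    return info.tune_wbsm && platform_supports_wbsm;
}
```

With the defaults, the helper returns false on every platform. Setting `tune_wbsm = true` only has an effect where the platform check passes, which matches the opt-in, experimental nature described in the comment.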
/** Converts a string to a strongly typed enumeration @ref CLTunerMode
- NEGEMMMatrixVectorMultiplyKernel
- NELocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedMatrixMultiplyKernel
- NEUpsampleLayerKernel / CLUpsampleLayerKernel
+ - Extend OpenCL tuner with workgroup batch size support
+ - Experimental extension for the OpenCL tuner to tune the batches of work groups distributed to compute units
v20.11 Public major release
- Various bug fixes.