Implement internal HAL for GEMM and matrix decompositions