The thresholds given in 3ae30cd were far too low because I
mixed up m and m*n in my measurements. Note that the new ones
are close to the [z]gemv ones, which is a comforting sign
that both are right.
#ifdef SMPTEST
// Threshold chosen so that speed-up is > 1 on a Xeon E5-2630
- if(1L * m * n > 24L * GEMM_MULTITHREAD_THRESHOLD)
+ if(1L * m * n > 2048L * GEMM_MULTITHREAD_THRESHOLD)
nthreads = num_cpu_avail(2);
else
nthreads = 1;
#ifdef SMPTEST
// Threshold chosen so that speed-up is > 1 on a Xeon E5-2630
- if(1L * m * n > 3L * sizeof(FLOAT) * GEMM_MULTITHREAD_THRESHOLD)
+ if(1L * m * n > 36L * sizeof(FLOAT) * sizeof(FLOAT) * GEMM_MULTITHREAD_THRESHOLD)
nthreads = num_cpu_avail(2);
else
nthreads = 1;
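
To make the size of the change concrete, here is a minimal sketch of the two new tests as standalone predicates. The helper names are hypothetical, and the value 4 for GEMM_MULTITHREAD_THRESHOLD is only an assumption for illustration; the real value is fixed at build time. `elem_size` stands in for `sizeof(FLOAT)`.

```c
#include <stddef.h>

/* Assumed value, for illustration only; the build system defines the real one. */
#ifndef GEMM_MULTITHREAD_THRESHOLD
#define GEMM_MULTITHREAD_THRESHOLD 4
#endif

/* Hypothetical helper mirroring the first hunk's new condition:
 * returns 1 when m*n is large enough to use more than one thread. */
int use_threads_fixed(long m, long n)
{
    return 1L * m * n > 2048L * GEMM_MULTITHREAD_THRESHOLD;
}

/* Hypothetical helper mirroring the second hunk's new condition;
 * elem_size plays the role of sizeof(FLOAT). */
int use_threads_scaled(long m, long n, size_t elem_size)
{
    long s = (long)elem_size;
    return 1L * m * n > 36L * s * s * GEMM_MULTITHREAD_THRESHOLD;
}
```

Under that assumed value, the first test switches to multiple threads only once m*n exceeds 8192, and the second once m*n exceeds 9216 for 8-byte elements (2304 for 4-byte elements). The old conditions (24L * 4, and 3L * 8 * 4 for 8-byte elements) both switched at m*n > 96.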