Replace vpermpd with vpermilpd in the Haswell DTRMM kernel
author Martin Kroeker <martin@ruby.chemie.uni-freiburg.de>
Sun, 28 Jul 2019 21:17:28 +0000 (23:17 +0200)
committer GitHub <noreply@github.com>
Sun, 28 Jul 2019 21:17:28 +0000 (23:17 +0200)
commit 2dfb804cb943ac12035fe51859d109daca76b4f4
tree bbfb95aa52fc84f4329c7695b2bc38067e3082b3
parent abea977ded8729c6dcfcfbee51a18eceef8d8440
Replace vpermpd with vpermilpd in the Haswell DTRMM kernel

to improve performance on AMD Zen (#2180), applying wjc404's improvement of the DGEMM kernel from #2186
kernel/x86_64/dtrmm_kernel_4x8_haswell.c
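
For readers unfamiliar with the two instructions, the sketch below models their documented semantics in plain C rather than reproducing the kernel's inline assembly. It assumes, as an illustration only, that the permutation being replaced is the adjacent-pair swap, written as immediate 0xb1 for vpermpd and 0x05 for vpermilpd; the commit page itself does not state the immediates. The function names are hypothetical. vpermpd can move a double across the two 128-bit lanes, while vpermilpd only permutes within each lane, so the substitution is valid only for patterns that never cross a lane, such as this swap.

```c
#include <assert.h>
#include <string.h>

/* Scalar model of VPERMPD ymm, ymm, imm8 (hypothetical helper name):
 * each 2-bit field of imm8 may select any of the four source doubles,
 * so the permutation can cross the two 128-bit lanes. */
static void model_vpermpd(double dst[4], const double src[4], unsigned imm8)
{
    double tmp[4];
    for (int i = 0; i < 4; i++)
        tmp[i] = src[(imm8 >> (2 * i)) & 3u];
    memcpy(dst, tmp, sizeof tmp);
}

/* Scalar model of VPERMILPD ymm, ymm, imm8 (hypothetical helper name):
 * bit i of imm8 selects the low or high double of the SAME 128-bit lane
 * that destination element i lives in. */
static void model_vpermilpd(double dst[4], const double src[4], unsigned imm8)
{
    double tmp[4];
    for (int i = 0; i < 4; i++) {
        int lane_base = (i / 2) * 2;            /* 0 for elements 0-1, 2 for 2-3 */
        tmp[i] = src[lane_base + ((imm8 >> i) & 1u)];
    }
    memcpy(dst, tmp, sizeof tmp);
}

/* Returns 1 when both forms produce the same adjacent-pair swap.
 * The immediates 0xb1 and 0x05 are assumptions made for this sketch. */
int pair_swap_forms_agree(void)
{
    const double src[4] = {10.0, 11.0, 12.0, 13.0};
    double a[4], b[4];

    model_vpermpd(a, src, 0xb1);    /* selects elements 1, 0, 3, 2 */
    model_vpermilpd(b, src, 0x05);  /* swaps within each lane */

    assert(a[0] == 11.0 && a[1] == 10.0 && a[2] == 13.0 && a[3] == 12.0);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling pair_swap_forms_agree() from a test program checks that the two encodings agree on this pattern. A pattern that crosses lanes, such as the full reversal vpermpd 0x1b, has no vpermilpd equivalent.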