crypto: x86/chacha20 - Add a 4-block AVX-512VL variant
author Martin Willi <martin@strongswan.org>
Tue, 20 Nov 2018 16:30:50 +0000 (17:30 +0100)
committer Herbert Xu <herbert@gondor.apana.org.au>
Thu, 29 Nov 2018 08:27:04 +0000 (16:27 +0800)
commit 180def6c4ad139ae6f97953ae810092ace295d5b
tree ea9451b8ed9a9da6adac4ed41c2eab0769e4ccf1
parent 29a47b54e030efe308aa90e6c26a9ce7f5f84ed8

This version uses the same principle as the AVX2 version by scheduling the
operations for two block pairs in parallel. It benefits from the AVX-512VL
rotate instructions and from more efficient partial block handling using
"vmovdqu8", which speeds up the raw block function by ~20%.
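
As a rough illustration of the two points above, here is a portable C sketch, not the assembly added by this commit. rotl32() is the operation that a vector rotate instruction performs in one step; without one, a rotate is typically built from two shifts and an OR. xor_partial_block() is a hypothetical helper that emulates byte-granular masking of a trailing partial block. The 64-byte width, the mask construction and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate left by n bits (0 < n < 32). With a vector rotate instruction
 * this is one operation per rotate; otherwise it costs two shifts and
 * an OR, as written here. */
static inline uint32_t rotl32(uint32_t v, unsigned n)
{
	return (v << n) | (v >> (32 - n));
}

/* ChaCha quarter round on four state words (RFC 7539, section 2.1).
 * Each of the four steps ends in a rotate, so cheaper rotates help
 * every round. */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
				 uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Hypothetical helper: XOR the first len bytes of src with a keystream
 * block into dst. Bit i of the mask selects byte i, which mimics a
 * byte-masked load and store: bytes outside the mask are neither read
 * from src nor written to dst. */
static void xor_partial_block(uint8_t *dst, const uint8_t *src,
			      const uint8_t keystream[64], size_t len)
{
	assert(len <= 64);

	/* Low len bits set; len == 64 is special-cased to avoid an
	 * undefined 64-bit shift. */
	uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);

	for (size_t i = 0; i < 64; i++) {
		if (mask & (1ULL << i))
			dst[i] = src[i] ^ keystream[i];
	}
}
```

A caller would process whole blocks first and pass only the trailing len bytes to xor_partial_block(), so that no bounce buffer or byte-wise tail loop is needed at the call site.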

Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
arch/x86/crypto/chacha20-avx512vl-x86_64.S
arch/x86/crypto/chacha20_glue.c