crypto: x86/chacha20 - Add a 2-block AVX-512VL variant
author Martin Willi <martin@strongswan.org>
Tue, 20 Nov 2018 16:30:49 +0000 (17:30 +0100)
committer Herbert Xu <herbert@gondor.apana.org.au>
Thu, 29 Nov 2018 08:27:04 +0000 (16:27 +0800)
commit 29a47b54e030efe308aa90e6c26a9ce7f5f84ed8
tree b53c29fb2903d8d4f62afb9804cb10e6643c7034
parent cee7a36ecb5bafef8c87fb2c10641e6125044154

This version uses the same principle as the AVX2 version. It benefits
from the AVX-512VL rotate instructions and the more efficient partial
block handling using "vmovdqu8", resulting in a speedup of ~20%.
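As an illustration of the two points above, here is a minimal scalar sketch in portable C. It is not the assembly from this patch. The helper names (rotl32, chacha_quarter_round, tail_mask, xor_tail_masked) are hypothetical. The first half shows the 32-bit rotates in a ChaCha quarter round, each of which AVX-512VL can issue as one vector rotate. The second half emulates a byte-masked load/xor/store of the kind "vmovdqu8" permits, assuming a tail of at most 64 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate left by n bits (0 < n < 32). With AVX-512VL this is a single
 * vector instruction; AVX2 has no vector rotate and must combine
 * shifts and an OR (or use a byte shuffle for some rotate counts). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four state words (rotates 16, 12, 8, 7). */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
				 uint32_t *c, uint32_t *d)
{
	*a += *b; *d = rotl32(*d ^ *a, 16);
	*c += *d; *b = rotl32(*b ^ *c, 12);
	*a += *b; *d = rotl32(*d ^ *a, 8);
	*c += *d; *b = rotl32(*b ^ *c, 7);
}

/* Byte mask for a tail of len bytes: bit i set means byte i is live.
 * Assumption for this sketch: at most 64 bytes are covered per mask. */
static uint64_t tail_mask(size_t len)
{
	return len >= 64 ? ~0ULL : (1ULL << len) - 1;
}

/* Scalar emulation of a masked load/xor/store: only bytes selected by
 * the mask are read from src and written to dst, so a partial block
 * needs no bounce buffer and no byte-by-byte fallback path. */
static void xor_tail_masked(uint8_t *dst, const uint8_t *src,
			    const uint8_t *keystream, size_t len)
{
	uint64_t mask = tail_mask(len);

	for (size_t i = 0; i < 64; i++)
		if (mask & (1ULL << i))
			dst[i] = src[i] ^ keystream[i];
}
```

In the vectorized code the loop in xor_tail_masked would collapse into masked vector moves; the scalar loop here only demonstrates which bytes are touched.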

Unlike the AVX2 version, it is faster than the single-block SSSE3 version
at processing a single block. Hence we use the new function for (partial)
single-block lengths as well.
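The selection rule described above can be sketched as a small dispatcher for inputs of at most two blocks. This is a hypothetical illustration, not the code in chacha20_glue.c: the enum, the function name and the feature flags are invented for the example, and the AVX2 branch reflects only the assumption that the AVX2 two-block routine is not worthwhile for a single block.

```c
#include <stdbool.h>
#include <stddef.h>

#define CHACHA20_BLOCK_SIZE 64

/* Hypothetical labels for the candidate routines. */
enum chacha_impl {
	IMPL_SSSE3_1BLOCK,
	IMPL_AVX2_2BLOCK,
	IMPL_AVX512VL_2BLOCK,
};

/* Pick a routine for a short input (bytes <= 2 * CHACHA20_BLOCK_SIZE).
 * have_avx2 and have_avx512vl stand in for runtime CPU feature checks. */
static enum chacha_impl pick_short_impl(size_t bytes, bool have_avx2,
					bool have_avx512vl)
{
	/* The AVX-512VL 2-block routine also wins for a single or
	 * partial block, so it takes every short length. */
	if (have_avx512vl)
		return IMPL_AVX512VL_2BLOCK;

	/* Assumed: AVX2 only pays off once more than one block is needed. */
	if (have_avx2 && bytes > CHACHA20_BLOCK_SIZE)
		return IMPL_AVX2_2BLOCK;

	return IMPL_SSSE3_1BLOCK;
}
```

For example, a 10-byte request on an AVX-512VL machine would go to the two-block routine, while the same request on an AVX2-only machine would stay on the SSSE3 single-block path.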

Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
arch/x86/crypto/chacha20-avx512vl-x86_64.S
arch/x86/crypto/chacha20_glue.c