x86: Make the divisor in setting `non_temporal_threshold` cpu specific
author Noah Goldstein <goldstein.w.n@gmail.com>
Wed, 7 Jun 2023 18:18:03 +0000 (13:18 -0500)
committer Noah Goldstein <goldstein.w.n@gmail.com>
Mon, 12 Jun 2023 16:33:39 +0000 (11:33 -0500)
commit 180897c161a171d8ef0faee1c6c9fd6b57d8b13b
tree 89e71e02a6e1edc57bb13f311228816dcbc92bd6
parent f193ea20eddc6cef84cba54cf1a647204ee6a86b

Different systems prefer different divisors.

Benchmarks[1] so far have found the following preferred divisors:
    ICX     : 2
    SKX     : 2
    BWD     : 8

For Intel, we are generalizing that BWD and older prefer 8 as a
divisor, and SKL and newer prefer 2. These values can be tuned further
as more benchmarks are run.
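
A minimal sketch of the idea described above, not the code from this
commit: the threshold is the shared cache size divided by a divisor
picked per CPU. The enum, the function names, and the fallback divisor
of 4 are assumptions made for illustration only; the divisors 8 and 2
come from the benchmark table above.

```c
/* Hypothetical microarchitecture tags; not glibc identifiers.  */
enum uarch
{
  UARCH_BWD,      /* Divisor 8 per the benchmarks above.  */
  UARCH_SKX,      /* Divisor 2 per the benchmarks above.  */
  UARCH_ICX,      /* Divisor 2 per the benchmarks above.  */
  UARCH_UNKNOWN   /* No benchmark data.  */
};

/* Return the divisor to apply to the shared cache size.  */
static unsigned long int
non_temporal_divisor (enum uarch u)
{
  switch (u)
    {
    case UARCH_BWD:
      return 8;
    case UARCH_SKX:
    case UARCH_ICX:
      return 2;
    default:
      return 4;   /* Assumed fallback, chosen only for this sketch.  */
    }
}

/* Size in bytes at or above which a copy would use non-temporal
   stores in this sketch: shared cache size / CPU-specific divisor.  */
static unsigned long int
non_temporal_threshold (unsigned long int shared, enum uarch u)
{
  return shared / non_temporal_divisor (u);
}
```

With a 32 MiB shared cache, this sketch gives a 4 MiB threshold for BWD
and a 16 MiB threshold for SKX and ICX.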

[1]: https://github.com/goldsteinn/memcpy-nt-benchmarks
Reviewed-by: DJ Delorie <dj@redhat.com>
sysdeps/x86/cpu-features.c
sysdeps/x86/dl-cacheinfo.h
sysdeps/x86/dl-diagnostics-cpu.c
sysdeps/x86/include/cpu-features.h