x86: Improve svml_s_atanhf8_core_avx2.S
author Noah Goldstein <goldstein.w.n@gmail.com>
Thu, 9 Jun 2022 18:16:35 +0000 (11:16 -0700)
committer Noah Goldstein <goldstein.w.n@gmail.com>
Thu, 9 Jun 2022 19:51:04 +0000 (12:51 -0700)
commit 65897e991685c87f4575694197d3ce24f7fc9c5a
tree 69ff6478a0982003b267bd380813743a56645985
parent 73bae395cfc862a30e640e9de6f2defecd6fd100

Improvements are:
    1. Reduce code size (-60 bytes).
    2. Remove redundant move instructions.
    3. Slightly improve instruction selection/scheduling where
       possible.
    4. Prefer registers which get short instruction encoding.
    5. Shrink rodata usage (-32 bytes).

The throughput improvement is modest (3-5%) because the port 0
bottleneck is unavoidable.

       Function, New Time, Old Time, New / Old
_ZGVdN8v_atanhf,    2.799,    2.923,     0.958
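
As a quick check on how the "New / Old" column relates to the quoted
3-5% range, the ratio can be recomputed from the two timings in the row
above. This is only an illustrative sketch: the helper name time_ratio
is hypothetical and not part of glibc or its benchmark tooling, and the
timings are copied verbatim from the table.

```python
def time_ratio(new_time: float, old_time: float) -> float:
    """Return new/old; a value below 1.0 means the new code is faster."""
    return new_time / old_time


# Timings copied from the benchmark row for _ZGVdN8v_atanhf.
new_time = 2.799
old_time = 2.923

ratio = time_ratio(new_time, old_time)
reduction_pct = (1.0 - ratio) * 100.0

print(f"New / Old = {ratio:.3f}")
print(f"Time reduction = {reduction_pct:.1f}%")
```

For this row the reduction works out to roughly 4.2%, which falls inside
the 3-5% range stated above.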
sysdeps/x86_64/fpu/multiarch/svml_s_atanhf8_core_avx2.S