[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions.
author Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net>
Thu, 22 Aug 2019 15:20:16 +0000 (15:20 +0000)
committer Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net>
Thu, 22 Aug 2019 15:20:16 +0000 (15:20 +0000)
commit c9649eb9dab747c3b5c1d2b8ab6d54145fce40b2
tree 54d93c7e67553f2da1451b44cf9c0bc09fa7fdf6
parent 4ae79199ed1a2d6dc8961ed124048f5622b95bab

Single-operand MUL instructions, which implicitly write EAX, have the following
latency/throughput profile (each row lists latency, uOP count, and JMul
resource cycles):

imul %cl              # latency: 3cy - uOPs: 1 - 1 JMul
imul %cx              # latency: 3cy - uOPs: 3 - 3 JMul
imul %ecx             # latency: 3cy - uOPs: 2 - 2 JMul
imul %rcx             # latency: 6cy - uOPs: 2 - 4 JMul

mul %cl               # latency: 3cy - uOPs: 1 - 1 JMul
mul %cx               # latency: 3cy - uOPs: 3 - 3 JMul
mul %ecx              # latency: 3cy - uOPs: 2 - 2 JMul
mul %rcx              # latency: 6cy - uOPs: 2 - 4 JMul

Excluding the 64-bit variant, which has a latency of 6cy, every other instruction
has a latency of 3cy. However, the number of decoded macro-opcodes (as well as
the resource cycles) depends on the MUL size.
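
As a rough illustration of how the figures in the listing above relate, here is
a minimal Python sketch. It assumes that "uOPs: N - M JMul" means N uOPs and M
cycles on the JMul resource, and that there is a single JMul unit, so that the
reciprocal throughput of an isolated instruction equals its JMul cycles. It
ignores dispatch limits. The names `MulProfile`, `SINGLE_OPERAND`, and
`reciprocal_throughput` are hypothetical and are not part of LLVM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MulProfile:
    latency: int      # cycles until the result is available
    uops: int         # decoded macro-ops
    jmul_cycles: int  # cycles the JMul resource is consumed


# Values copied from the single-operand listing (mul and imul are identical).
SINGLE_OPERAND = {
    8: MulProfile(latency=3, uops=1, jmul_cycles=1),
    16: MulProfile(latency=3, uops=3, jmul_cycles=3),
    32: MulProfile(latency=3, uops=2, jmul_cycles=2),
    64: MulProfile(latency=6, uops=2, jmul_cycles=4),
}

JMUL_UNITS = 1  # assumption: one JMul unit on Jaguar


def reciprocal_throughput(profile, units=JMUL_UNITS):
    """Cycles per instruction when only JMul pressure is considered."""
    return profile.jmul_cycles / units


for width, profile in SINGLE_OPERAND.items():
    print(f"{width:2d}-bit: latency {profile.latency}cy, "
          f"rthroughput {reciprocal_throughput(profile):.1f}")
```

The sketch makes the point of the paragraph above explicit: latency is flat at
3cy up to 32 bits, while JMul occupancy varies with operand size.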

The two-operand MULs have a more predictable profile (see below):

imul %dx, %dx         # latency: 3cy - uOPs: 1 - 1 JMul
imul %edx, %edx       # latency: 3cy - uOPs: 1 - 1 JMul
imul %rdx, %rdx       # latency: 6cy - uOPs: 1 - 4 JMul

imul $3, %dx, %dx     # latency: 4cy - uOPs: 2 - 2 JMul
imul $3, %ecx, %ecx   # latency: 3cy - uOPs: 1 - 1 JMul
imul $3, %rdx, %rdx   # latency: 6cy - uOPs: 1 - 4 JMul
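
To show why both latency and resource cycles matter, the following Python
sketch gives a first-order cycle estimate for a run of register-register
two-operand multiplies, using the values listed above. It assumes a dependent
chain is bound by latency and an independent stream is bound by JMul occupancy
on a single JMul unit. Pipeline fill, dispatch width, and other resources are
ignored. `TWO_OPERAND_RR` and `estimated_cycles` are hypothetical names.

```python
# (latency, JMul cycles) for `imul %reg, %reg`, copied from the listing.
TWO_OPERAND_RR = {
    16: (3, 1),
    32: (3, 1),
    64: (6, 4),
}


def estimated_cycles(width, count, dependent):
    """First-order estimate of cycles for `count` multiplies.

    dependent=True:  each multiply consumes the previous result, so the
                     chain advances one latency at a time.
    dependent=False: multiplies are independent, so the single JMul unit
                     (an assumption) is the bottleneck.
    """
    latency, jmul_cycles = TWO_OPERAND_RR[width]
    per_instruction = latency if dependent else jmul_cycles
    return count * per_instruction


# Compare a dependent chain with an independent stream of 64-bit multiplies.
print(estimated_cycles(64, 100, dependent=True))
print(estimated_cycles(64, 100, dependent=False))
```

Under these assumptions the 64-bit form is the only one where an independent
stream is still limited to fewer than one multiply per cycle.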

This patch updates the values in the Jaguar scheduling model and regenerates
llvm-mca tests.

Differential Revision: https://reviews.llvm.org/D66547

llvm-svn: 369661
16 files changed:
llvm/lib/Target/X86/X86ScheduleBtVer2.td
llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-1.s
llvm/test/tools/llvm-mca/X86/BtVer2/cmpxchg-read-advance.s
llvm/test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-2.s
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-2.s
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-4.s
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-6.s
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update-7.s
llvm/test/tools/llvm-mca/X86/BtVer2/partial-reg-update.s
llvm/test/tools/llvm-mca/X86/BtVer2/read-advance-2.s
llvm/test/tools/llvm-mca/X86/BtVer2/resources-x86_64.s
llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s
llvm/test/tools/llvm-mca/X86/BtVer2/xchg.s
llvm/test/tools/llvm-mca/X86/intel-syntax.s
llvm/test/tools/llvm-mca/X86/llvm-mca-markers-10.s
llvm/test/tools/llvm-mca/X86/llvm-mca-markers-9.s