[exegesis][X86] ParallelSnippetGenerator: don't accidentally create serialized instru...
authorRoman Lebedev <lebedev.ri@gmail.com>
Tue, 7 Sep 2021 08:47:20 +0000 (11:47 +0300)
committerRoman Lebedev <lebedev.ri@gmail.com>
Tue, 7 Sep 2021 09:39:23 +0000 (12:39 +0300)
commit03512ae9bf31b725d8233c03094fd463b5f46285
tree7a5e0fcc294749e4528aa756e296cec7669f1624
parentc33e296be1daa621f5933f3729a0ce51172f1505
[exegesis][X86] ParallelSnippetGenerator: don't accidentally create serialized instructions

In the case of no tied variables, we pick random defs, and then random uses that don't alias with defs we just picked.
Sounds good, except that an X86 instruction may have implicit reg uses,
e.g. for `MULX` it's `EDX`/`RDX`: `Intel SDM, 4-162 Vol. 2B MULX — Unsigned Multiply Without Affecting Flags`
> Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand
> (the third operand) and stores the low half of the result in the second destination (second operand), the high half
> of the result in the first destination operand (first operand), without reading or writing the arithmetic flags.

And indeed, every once in a while `llvm-exegesis` happened to pick EDX as a def while measuring throughput,
and producing garbage output:
```
$ ./bin/llvm-exegesis -num-repetitions=1000000 -mode=inverse_throughput -repetition-mode=min --loop-body-size=4096 -dump-object-to-disk=false -opcode-name=MULX32rr --max-configs-per-opcode=65536
---
mode:            inverse_throughput
key:
  instructions:
    - 'MULX32rr EDX R11D R12D'
  config:          ''
  register_initial_values:
    - 'R12D=0x0'
    - 'EDX=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 4.00014, per_snippet_value: 4.00014 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 415441BC00000000BA00000000C4C223F6D4C4C223F6D4C4C223F6D4C4C223F6D4415CC3415441BC00000000BA0000000049B80200000000000000C4C223F6D4C4C223F6D44983C0FF75F0415CC3
...
```
```
$ ./bin/llvm-exegesis -num-repetitions=1000000 -mode=inverse_throughput -repetition-mode=min --loop-body-size=4096 -dump-object-to-disk=false -opcode-name=MULX32rr --max-configs-per-opcode=65536
---
mode:            inverse_throughput
key:
  instructions:
    - 'MULX32rr R13D EDX ECX'
  config:          ''
  register_initial_values:
    - 'ECX=0x0'
    - 'EDX=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 3.00013, per_snippet_value: 3.00013 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 4155B900000000BA00000000C4626BF6E9C4626BF6E9C4626BF6E9C4626BF6E9415DC34155B900000000BA0000000049B80200000000000000C4626BF6E9C4626BF6E94983C0FF75F0415DC3
...
```
Oops! Not only does that not look fun, i did hit that pitfail during AMD Zen 3 enablement.
While i have since then addressed this in rGd4d459e7475b4bb0d15280f12ed669342fa5edcd,
i suspect there may be other buggy results lying around, so we should at least stop producing them.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D109275
llvm/tools/llvm-exegesis/lib/ParallelSnippetGenerator.cpp
llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp