
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)

This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics-based and an inline_asm implementation.
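For readers unfamiliar with the trick, the following is a scalar Python model of one well-known blend-based 8x8 transpose sequence (unpack, then one shuffle plus two blends per pair, then 128-bit lane permutes). It is only a sketch: the helper names are made up for illustration, and the immediates (0x4E, 0xCC, 0x33, 0x20, 0x31) are those of the commonly cited AVX sequence, assumed rather than copied from AVXTranspose.cpp, whose exact sequence may differ.

```python
# Scalar model of a blend-based 8x8 float transpose (illustrative only).
# Each "register" is a list of 8 values: two 128-bit lanes of 4 elements.

def unpacklo(a, b):
    # vunpcklps: interleave the low two elements of each 128-bit lane.
    return [a[0], b[0], a[1], b[1], a[4], b[4], a[5], b[5]]

def unpackhi(a, b):
    # vunpckhps: interleave the high two elements of each 128-bit lane.
    return [a[2], b[2], a[3], b[3], a[6], b[6], a[7], b[7]]

def shuffle(a, b, imm):
    # vshufps: per lane, two elements picked from a, then two from b.
    out = []
    for base in (0, 4):
        out += [a[base + (imm & 3)], a[base + ((imm >> 2) & 3)],
                b[base + ((imm >> 4) & 3)], b[base + ((imm >> 6) & 3)]]
    return out

def blend(a, b, imm):
    # vblendps: bit i of imm selects b[i], otherwise a[i].
    return [b[i] if (imm >> i) & 1 else a[i] for i in range(8)]

def permute2f128(a, b, imm):
    # vperm2f128: each result lane is one of a.lo, a.hi, b.lo, b.hi.
    lanes = [a[:4], a[4:], b[:4], b[4:]]
    return lanes[imm & 3] + lanes[(imm >> 4) & 3]

def transpose8x8(rows):
    # Stage 1: interleave adjacent rows.
    t = []
    for i in range(0, 8, 2):
        t += [unpacklo(rows[i], rows[i + 1]), unpackhi(rows[i], rows[i + 1])]
    # Stage 2: one shuffle feeds two blends (instead of two shuffles).
    s = []
    for lo, hi in ((0, 2), (1, 3), (4, 6), (5, 7)):
        v = shuffle(t[lo], t[hi], 0x4E)
        s += [blend(t[lo], v, 0xCC), blend(t[hi], v, 0x33)]
    # Stage 3: stitch the 128-bit halves together.
    return ([permute2f128(s[i], s[i + 4], 0x20) for i in range(4)] +
            [permute2f128(s[i], s[i + 4], 0x31) for i in range(4)])

if __name__ == "__main__":
    m = [[8 * r + c for c in range(8)] for r in range(8)]
    assert transpose8x8(m) == [list(col) for col in zip(*m)]
```

In this model, stage 2 uses 4 shuffles and 8 blends where a shuffle-only variant would use 8 shuffles, so the blend variant trades shuffles for blends at the cost of a few extra instructions.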

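This results in roughly 17% fewer cycles for the inline_asm version (2015 vs. 2415 total cycles; equivalently, the intrinsic version takes about 20% more), as reported by llvm-mca: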

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```
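As a sanity check on the two llvm-mca summaries above, the derived figures can be recomputed from the raw counts. This is a small sketch; the dictionary and function names are illustrative and the numbers are copied from the reports.

```python
# Raw counts copied from the two llvm-mca summaries (100 iterations each).
intrinsic = {"instructions": 5900, "cycles": 2415, "uops": 7300}
inline_asm = {"instructions": 6300, "cycles": 2015, "uops": 7700}

def ipc(run):
    # Instructions retired per cycle.
    return run["instructions"] / run["cycles"]

def uops_per_cycle(run):
    return run["uops"] / run["cycles"]

def cycle_reduction(before, after):
    # Fraction of cycles saved going from `before` to `after`.
    return 1 - after["cycles"] / before["cycles"]

if __name__ == "__main__":
    for name, run in (("intrinsic", intrinsic), ("inline_asm", inline_asm)):
        print(f"{name}: IPC={ipc(run):.2f}, uOps/cycle={uops_per_cycle(run):.2f}")
    print(f"cycles saved: {cycle_reduction(intrinsic, inline_asm):.1%}")
    print(f"intrinsic / inline_asm cycles: "
          f"{intrinsic['cycles'] / inline_asm['cycles']:.2f}x")
```

The inline_asm version executes more instructions and uOps in fewer cycles, which fits the lower SKXPort5 pressure in its report.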

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Reviewed By: ftynse, dcaballe

Differential Revision: https://reviews.llvm.org/D114335

author	Nicolas Vasilache <nicolas.vasilache@gmail.com>
	Mon, 22 Nov 2021 10:22:37 +0000 (10:22 +0000)
committer	Nicolas Vasilache <nicolas.vasilache@gmail.com>
	Mon, 22 Nov 2021 10:32:34 +0000 (10:32 +0000)
commit	a9e236bed835c58be381dadb973a1db0681e4795
tree	f49eaed687cba9eaedde7061518eba41bfe581ca
parent	4d21b64464ac548ec8442bc0d2a7e984ba78bd88

mlir/include/mlir/Dialect/X86Vector/Transforms.h
mlir/lib/Dialect/X86Vector/Transforms/AVXTranspose.cpp
mlir/test/Dialect/Vector/vector-transpose-lowering.mlir
mlir/test/Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm-vector.mlir	[new file with mode: 0644]
mlir/test/lib/Dialect/Vector/CMakeLists.txt
mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp
utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel