review.tizen.org Git - platform/upstream/llvm.git/commit

author	jeff <Jeffrey.Byrnes@amd.com>
	Wed, 21 Sep 2022 18:09:30 +0000 (18:09 +0000)
committer	Jeffrey Byrnes <Jeffrey.Byrnes@amd.com>
	Mon, 3 Oct 2022 19:58:29 +0000 (12:58 -0700)
commit	f4e6149d8217176f71591b277e9cd08be5f732c1
tree	c8b7b1d897bebe6ca36592487db2e9b6ef2df539
parent	c1bfa414284f887f3f0b5709ebbd281dc3ccc34f

[AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)

If we cannot prove that the f16 operands of a buildvector are canonicalized, we cannot lower it into a V_PACK. Previously, this case was lowered into some combination of and(sdwa), shr, and or. This patch allows matching into V_PERM instead.

Change-Id: Ifa4a74fdb81ef44f22ba490c7fdf81ec8aebc945
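
For illustration only, here is a minimal Python sketch of why a byte permute can stand in for the and/shr/or sequence when building a 32-bit value from two 16-bit halves. It is not taken from the patch. The function `perm_b32` is a hypothetical, simplified model of a V_PERM-style byte select that handles only selector values 0-7 (plain byte picks from the 64-bit concatenation of the two sources); the selector constant `0x05040100` and the operand order are assumptions made for this sketch and should be checked against the ISA documentation and the updated tests.

```python
def perm_b32(s0, s1, sel):
    """Simplified model of a V_PERM-style byte permute.

    Assumption: the 8 source bytes are {s0, s1}, with s1 in bytes 0-3 and
    s0 in bytes 4-7, and each selector byte picks one of them. Selector
    values above 7 (constants, sign fills) are not modelled here.
    """
    src = ((s0 & 0xFFFFFFFF) << 32) | (s1 & 0xFFFFFFFF)
    out = 0
    for i in range(4):
        idx = (sel >> (8 * i)) & 0xFF
        if idx > 7:
            raise ValueError("selector values above 7 are not modelled in this sketch")
        out |= ((src >> (8 * idx)) & 0xFF) << (8 * i)
    return out


def pack_with_shift_or(lo, hi):
    """Reference result: the and / shift / or way of packing two 16-bit halves."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


# Hypothetical selector: result bytes 0-1 come from the low half of `lo`,
# result bytes 2-3 come from the low half of `hi`.
PACK_LO16_SEL = 0x05040100

lo, hi = 0xAAAA7E01, 0xBBBB3C00  # arbitrary bit patterns; upper halves are ignored
packed = perm_b32(hi, lo, PACK_LO16_SEL)
print(hex(packed))
```

Because the permute only copies bytes, the 16-bit payloads reach the result bit for bit, and under these assumptions it matches the and/shr/or result in a single operation.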

53 files changed:

llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
llvm/lib/Target/AMDGPU/SIInstructions.td
llvm/lib/Target/AMDGPU/SOPInstructions.td
llvm/test/CodeGen/AMDGPU/GlobalISel/fpow.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.2darraymsaa.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.3d.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.cd.g16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.g16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll
llvm/test/CodeGen/AMDGPU/add.v2i16.ll
llvm/test/CodeGen/AMDGPU/build-vector-packed-partial-undef.ll
llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
llvm/test/CodeGen/AMDGPU/combine-vload-extract.ll
llvm/test/CodeGen/AMDGPU/divergence-driven-buildvector.ll
llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll
llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll
llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll
llvm/test/CodeGen/AMDGPU/fmax_legacy.f16.ll
llvm/test/CodeGen/AMDGPU/fmin_legacy.f16.ll
llvm/test/CodeGen/AMDGPU/fshr.ll
llvm/test/CodeGen/AMDGPU/idot4s.ll
llvm/test/CodeGen/AMDGPU/idot4u.ll
llvm/test/CodeGen/AMDGPU/idot8s.ll
llvm/test/CodeGen/AMDGPU/idot8u.ll
llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.a16.dim.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.encode.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.a16.dim.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.encode.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.ll
llvm/test/CodeGen/AMDGPU/load-hi16.ll
llvm/test/CodeGen/AMDGPU/load-lo16.ll
llvm/test/CodeGen/AMDGPU/pack.v2f16.ll
llvm/test/CodeGen/AMDGPU/pack.v2i16.ll
llvm/test/CodeGen/AMDGPU/partial-shift-shrink.ll
llvm/test/CodeGen/AMDGPU/strict_fadd.f16.ll
llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll
llvm/test/CodeGen/AMDGPU/strict_fmul.f16.ll
llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll
llvm/test/CodeGen/AMDGPU/sub.v2i16.ll
llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll
