[AMDGPU] Fix and simplify AMDGPUCodeGenPrepare::expandDivRem32
author Jay Foad <jay.foad@amd.com>
Mon, 22 Jun 2020 14:27:37 +0000 (15:27 +0100)
committer Jay Foad <jay.foad@amd.com>
Wed, 8 Jul 2020 18:14:48 +0000 (19:14 +0100)
commit f4bd01c1918e90f232a098b4878b52c6f7d4a215
tree 4ab33d1745112656f8455dee904bd6ec9a8fbe20
parent c444b1b904b11356c57980a41a19f4ef361b80a8

Fix the division/remainder algorithm by adding a second quotient
refinement step, which is required in some cases like
0xFFFFFFFFu / 0x11111111u (https://bugs.llvm.org/show_bug.cgi?id=46212).

Also document and rewrite the algorithm so that the estimate of inv(y),
the reciprocal of the divisor y, is always a lower bound. This
simplifies both the UNR step and the quotient refinement steps.
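
The shape described above (a lower-bound reciprocal estimate, one
Newton-Raphson round, then two quotient refinement rounds) can be
sketched on the host as follows. This is an illustration, not the code
from AMDGPUCodeGenPrepare.cpp: the names umulh and udiv32_sketch, the
refine_rounds parameter, the scaling constant, and the use of a
double-precision divide in place of a hardware reciprocal are all
assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of the 64-bit product a * b.
static uint32_t umulh(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// Hypothetical helper: unsigned 32-bit division built from a lower-bound
// reciprocal estimate. refine_rounds exists only so the effect of the
// second refinement round can be observed; the default is two rounds.
static uint32_t udiv32_sketch(uint32_t x, uint32_t y, int refine_rounds = 2) {
  assert(y != 0 && "division by zero is not handled in this sketch");

  // Initial estimate z of 2^32 / y. Assumption: scaling by slightly less
  // than 2^32 and truncating keeps z strictly below the true value, so z
  // is a lower bound. A double divide stands in for a hardware reciprocal.
  uint32_t z = static_cast<uint32_t>((4294967296.0 - 512.0) /
                                     static_cast<double>(y));

  // One Newton-Raphson round. Because z is a lower bound, y * z does not
  // exceed 2^32, so the wrapped value 0 - y * z is the non-negative
  // residual 2^32 - y * z and no sign handling is needed.
  z += umulh(z, 0u - y * z);

  // Quotient and remainder estimate. q never exceeds the true quotient
  // because z never exceeds 2^32 / y.
  uint32_t q = umulh(x, z);
  uint32_t r = x - q * y;

  // Quotient refinement: each round corrects q upward by at most one.
  for (int i = 0; i < refine_rounds; ++i) {
    if (r >= y) {
      ++q;
      r -= y;
    }
  }
  return q;
}
```

In this sketch, udiv32_sketch(0xFFFFFFFFu, 0x11111111u) starts from a
quotient estimate of 13 and needs both refinement rounds to reach the
exact answer 15; with refine_rounds set to 1 it stops at 14.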

Differential Revision: https://reviews.llvm.org/D83381
llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i32.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i32.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i32.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
llvm/test/CodeGen/AMDGPU/bypass-div.ll
llvm/test/CodeGen/AMDGPU/idiv-licm.ll
llvm/test/CodeGen/AMDGPU/sdiv.ll
llvm/test/CodeGen/AMDGPU/udivrem.ll