review.tizen.org Git - platform/upstream/llvm.git/commit

author	Ivan Chikish <nekotekina@gmail.com>
	Mon, 15 May 2023 10:25:32 +0000 (11:25 +0100)
committer	Simon Pilgrim <llvm-dev@redking.me.uk>
	Mon, 15 May 2023 10:25:32 +0000 (11:25 +0100)
commit	579812c081a23f70804b7214d59776f59d056914
tree	f925fac82daf9c5fe2757c074494609a5ea2803c
parent	b5d1ea9d2b771b25df4a0997e600beab7684800f

[X86] LowerRotate: prefer unpack-based algorithm

Split out from, and improving on, https://reviews.llvm.org/D146357

When running tests for LowerShift, I discovered some poor codegen in the rotate and funnel-shift tests. This patch attempts to address some of those cases.

Splitting the vector with unpack and shifting at double the element bit width may improve performance, according to https://uica.uops.info tests:

    No cross-lane shuffles
    No dirtying double-width registers
    Massive improvement for AVX2 rotates in some cases (var_funnnel_v8i16, var_funnnel_v16i16), because unpack is currently used only for vXi8 vectors.
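
The idea behind the unpack-based rotate can be sketched with a scalar model. This is only an illustration, not the code in X86ISelLowering.cpp: the function names (rotl16_ref, rotl16_unpack, rotl_v8i16_unpack) are hypothetical, and the sketch assumes the lowering pairs each 16-bit lane with a copy of itself (what unpack(x, x) would produce), shifts the resulting 32-bit lane left by the amount modulo 16, and keeps the high half.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Reference rotate-left for a single 16-bit lane.
constexpr std::uint16_t rotl16_ref(std::uint16_t x, unsigned amt) {
  amt &= 15u;
  const std::uint32_t v = x;
  return amt == 0 ? x
                  : static_cast<std::uint16_t>((v << amt) | (v >> (16u - amt)));
}

// Scalar model of the unpack trick (an assumption about the lowering):
// place x in both halves of a 32-bit lane, shift left at double width,
// then take the high 16 bits. Bits shifted out of the low copy of x
// land in the bottom of the high half, which is exactly a rotate.
constexpr std::uint16_t rotl16_unpack(std::uint16_t x, unsigned amt) {
  std::uint32_t wide = (static_cast<std::uint32_t>(x) << 16) | x;
  wide <<= (amt & 15u);
  return static_cast<std::uint16_t>(wide >> 16);
}

// Per-lane variable rotate over a v8i16-shaped array, each lane
// handled independently, so no lane ever crosses into another.
inline std::array<std::uint16_t, 8>
rotl_v8i16_unpack(const std::array<std::uint16_t, 8> &x,
                  const std::array<std::uint16_t, 8> &amt) {
  std::array<std::uint16_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = rotl16_unpack(x[i], amt[i]);
  return out;
}
```

For example, rotl16_unpack(0x8001, 1) widens to 0x80018001, shifts to 0x00030002, and returns 0x0003, matching rotl16_ref. Because every lane is widened and narrowed in place, the model needs no cross-lane data movement, which mirrors the first point in the list above.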

Differential Revision: https://reviews.llvm.org/D149071

llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/min-legal-vector-width.ll
llvm/test/CodeGen/X86/vector-fshl-rot-128.ll
llvm/test/CodeGen/X86/vector-fshl-rot-256.ll
llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
llvm/test/CodeGen/X86/vector-fshr-rot-128.ll
llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
llvm/test/CodeGen/X86/vector-rotate-128.ll
llvm/test/CodeGen/X86/vector-rotate-256.ll
llvm/test/CodeGen/X86/vector-rotate-512.ll