review.tizen.org Git - platform/upstream/llvm.git/commit

author	Sanjay Patel <spatel@rotateright.com>
	Tue, 4 Jun 2019 16:40:04 +0000 (16:40 +0000)
committer	Sanjay Patel <spatel@rotateright.com>
	Tue, 4 Jun 2019 16:40:04 +0000 (16:40 +0000)
commit	606eb2367f9f0bef2d1e0182bbb2bf4effb1711e
tree	07ad29ff737cfeb198014fa795f057a9150954e6
parent	f15e3d856fddd3ecf80fdbb798be64d0c4bc6de4

[x86] split 256-bit store of concatenated vectors

This shows up as a side issue to the main problem for the AVX target example from PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428 - https://godbolt.org/z/7tpRa3

But as we can see in the pile of existing test diffs, it's actually a widespread problem
that affects any AVX or later target. Apart from a couple of oddballs, I think these are
all improvements for the reasons stated in the code comment: we do not want to enable YMM
unnecessarily (avoid vzeroupper and frequency throttling) and some cores split 256-bit
stores anyway.

We could say that MergeConsecutiveStores() is going overboard on some of these examples,
but reining it in would not solve the problem completely. It is, however, one reason I'm
proposing this as a lowering rather than a combine: if we tried this earlier, we would
loop infinitely fighting the merge code.

Differential Revision: https://reviews.llvm.org/D62498

llvm-svn: 362524
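
The transform described in the commit message can be modeled in portable C++ as a
rough sketch. This is not code from the patch: Vec128, Vec256, concat, store_wide,
store_split and split_matches_wide are hypothetical names chosen for illustration,
and plain byte arrays stand in for XMM/YMM registers. It only shows that one 32-byte
store of a concatenation has the same memory effect as two 16-byte stores of the
halves, which is the property the lowering relies on.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for 128-bit (XMM-sized) and 256-bit (YMM-sized) values.
using Vec128 = std::array<std::uint8_t, 16>;
using Vec256 = std::array<std::uint8_t, 32>;

// Models a concatenation of two 128-bit halves: low half first, high half second.
Vec256 concat(const Vec128 &lo, const Vec128 &hi) {
  Vec256 wide{};
  std::memcpy(wide.data(), lo.data(), lo.size());
  std::memcpy(wide.data() + lo.size(), hi.data(), hi.size());
  return wide;
}

// "Before" shape: build the 256-bit value, then issue a single 32-byte store.
void store_wide(std::uint8_t *dst, const Vec128 &lo, const Vec128 &hi) {
  const Vec256 wide = concat(lo, hi);
  std::memcpy(dst, wide.data(), wide.size());
}

// "After" shape: skip the concatenation and issue two 16-byte stores.
void store_split(std::uint8_t *dst, const Vec128 &lo, const Vec128 &hi) {
  std::memcpy(dst, lo.data(), lo.size());
  std::memcpy(dst + lo.size(), hi.data(), hi.size());
}

// Returns true when both shapes leave identical bytes in memory.
bool split_matches_wide() {
  Vec128 lo{};
  Vec128 hi{};
  for (std::size_t i = 0; i < lo.size(); ++i) {
    lo[i] = static_cast<std::uint8_t>(i);        // bytes 0..15
    hi[i] = static_cast<std::uint8_t>(0x80 + i); // bytes 0x80..0x8f
  }
  std::uint8_t a[32] = {};
  std::uint8_t b[32] = {};
  store_wide(a, lo, hi);
  store_split(b, lo, hi);
  return std::memcmp(a, b, sizeof(a)) == 0;
}
```

Calling split_matches_wide() is expected to return true. The sketch says nothing
about performance; the motivation given in the commit message (avoiding unnecessary
YMM use, and cores that split 256-bit stores anyway) applies to the generated x86
code, not to this model.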

25 files changed:

llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/avg.ll
llvm/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll
llvm/test/CodeGen/X86/avx-intrinsics-x86.ll
llvm/test/CodeGen/X86/avx512-trunc-widen.ll
llvm/test/CodeGen/X86/avx512-trunc.ll
llvm/test/CodeGen/X86/nontemporal-2.ll
llvm/test/CodeGen/X86/oddsubvector.ll
llvm/test/CodeGen/X86/pmovsx-inreg.ll
llvm/test/CodeGen/X86/shrink_vmul-widen.ll
llvm/test/CodeGen/X86/shrink_vmul.ll
llvm/test/CodeGen/X86/shuffle-vs-trunc-512-widen.ll
llvm/test/CodeGen/X86/shuffle-vs-trunc-512.ll
llvm/test/CodeGen/X86/subvector-broadcast.ll
llvm/test/CodeGen/X86/vec_fptrunc.ll
llvm/test/CodeGen/X86/vec_saddo.ll
llvm/test/CodeGen/X86/vec_smulo.ll
llvm/test/CodeGen/X86/vec_ssubo.ll
llvm/test/CodeGen/X86/vec_uaddo.ll
llvm/test/CodeGen/X86/vec_umulo.ll
llvm/test/CodeGen/X86/vec_usubo.ll
llvm/test/CodeGen/X86/vector-gep.ll
llvm/test/CodeGen/X86/vector-trunc-widen.ll
llvm/test/CodeGen/X86/vector-trunc.ll
llvm/test/CodeGen/X86/x86-interleaved-access.ll

Domain: System / Toolchain;
