Rewrite more vector loads to scalar loads
This teaches forwprop to rewrite more vector loads that are only
used in BIT_FIELD_REFs as scalar loads. This provides the
remaining uplift to SPEC CPU 2017 510.parest_r on Zen 2 which
has CPU gathers disabled.
In particular vector load + vec_unpack + bit-field-ref is turned
into (extending) scalar loads which avoids costly XMM/GPR
transitions. To not conflict with vector load + bit-field-ref
+ vector constructor matching to vector load + shuffle the
extended transform is only done after vector lowering.
2021-07-30 Richard Biener <rguenther@suse.de>
* tree-ssa-forwprop.c (pass_forwprop::execute): Split
out code to decompose vector loads ...
(optimize_vector_load): ... here. Generalize it to
handle intermediate widening and TARGET_MEM_REF loads
and apply it to loads with a supported vector mode as well.