[RISCV] Don't use zero-stride vector load if there's no optimized u-arch
For vector strided instructions, the RVV spec says:
> When rs2=x0, then an implementation is allowed, but not required, to
> perform fewer memory operations than the number of active elements, and
> may perform different numbers of memory operations across different
> dynamic executions of the same static instruction.
So the compiler shouldn't assume that fewer memory operations will be
performed when rs2=x0.
We add a target feature to specify whether the u-arch supports optimized
zero-stride vector loads, and we do the vector splat optimization only if
this feature is supported.
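As a rough illustration of the gating described above, here is a minimal
standalone sketch. It is not the code in this patch: SubtargetSketch,
SplatLowering and chooseSplatLowering are hypothetical names made up for
this example, and the real decision is made in the RISC-V backend's
lowering code.

```cpp
// Hypothetical stand-in for the subtarget query; not LLVM's actual class.
struct SubtargetSketch {
  bool HasOptimizedZeroStrideLoad;
  // Defaults to true, mirroring "enabled by default" in this change.
  constexpr explicit SubtargetSketch(bool Opt = true)
      : HasOptimizedZeroStrideLoad(Opt) {}
};

// The two ways this sketch models splatting a scalar loaded from memory.
enum class SplatLowering {
  ZeroStrideLoad,      // strided vector load with rs2=x0
  ScalarLoadThenSplat  // ordinary scalar load, then splat the register
};

// Pick the zero-stride form only when the splatted value comes from a load
// AND the u-arch is known to optimize rs2=x0; otherwise stay conservative.
constexpr SplatLowering chooseSplatLowering(const SubtargetSketch &ST,
                                            bool SplatOfLoadedScalar) {
  return (SplatOfLoadedScalar && ST.HasOptimizedZeroStrideLoad)
             ? SplatLowering::ZeroStrideLoad
             : SplatLowering::ScalarLoadThenSplat;
}

// Default subtarget: the optimization is assumed present.
static_assert(chooseSplatLowering(SubtargetSketch(), true) ==
                  SplatLowering::ZeroStrideLoad,
              "default u-arch uses the zero-stride load");
```

With the feature turned off, the same query falls back to the scalar load
plus splat, so no assumption is made about how many memory operations the
hardware performs.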
This feature is enabled by default since most designs implement this
optimization.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D137699