Improve fusion logic of (a dot b) * alpha
The previous approach didn't work because a multiplication by a scalar value
will be changed into an explicit broadcast.
Another issue that is fixed in this CL is retrieving the constant value from
the literal. This depends on the PrimitiveType, before we always assumed it to be double.
Also when checking ImplementedAsGemm() we should not call it recursively, but instead just the check related to kDot.
Finally add an execution test and adjust the fusion logic test.
PiperOrigin-RevId:
195638795