Enabled XLA for TF C API.
Summary of changes:
1. Set MarkForCompilationPassFlags::tf_xla_cpu_global_jit default to true in
C_API unit test env when XLA-execute is intended. Together with setting session
config config.graph_options.optimizer_options.global_jit_level to > 0, this
turns on XLA for the entire graph (eligible nodes only, with _Arg and _RetVal
nodes excluded).
We decided against defaulting MarkForCompilationPassFlags::tf_xla_cpu_global_jit
to true, due to performance concerns with the single-threaded nature of the XLA
CPU backend (see
https://www.tensorflow.org/performance/xla/jit#turning_on_jit_compilation).
2. In FindCompilationCandidates() during MarkForCompilationPass, skip compiling
any '_Arg'-typed nodes. This is necessary to avoid hitting a "Invalid argument
number" error during MarkForCompilationPass.
3. Extended C API based build rules to link in XLA libraries, and added unit
test "CAPI.Session_Min_XLA_CPU".
Also added some misc improvements and debugging aids.
PiperOrigin-RevId:
185193314