Add a kernel usable as a GEBP inner loop for an LLVM IR GEMM
author Sanjoy Das <sanjoy@google.com>
Mon, 21 May 2018 18:11:48 +0000 (11:11 -0700)
committer TensorFlower Gardener <gardener@tensorflow.org>
Mon, 21 May 2018 18:14:06 +0000 (11:14 -0700)
commit a0e4081cf5e11556c8e3d3e022a17afca991b3fe
tree 31b3e6f4d087b58ae7e4b3cc4d96963fbdd00c07
parent 0f192f9b0ab0f5c30b7284d5b8eff86993aebb3e

This kernel is not yet used in any real code path, but I've added an escape
hatch that runs regular matrix multiplies through it for testing purposes.
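
For readers unfamiliar with the term, here is a rough sketch of what routing a
regular matrix multiply through a GEBP-style kernel can look like. It assumes
GEBP means the usual "general block-times-panel" formulation, and it is plain
scalar C++, not the LLVM IR emitter touched by this change. All names
(GebpInnerKernel, MatMulViaGebp, the mc/kc block sizes) are hypothetical and do
not appear in the XLA sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; row-major storage throughout.
// Inner kernel: C[0:mc, 0:n) += A_block[0:mc, 0:kc) * B_panel[0:kc, 0:n).
// lda/ldb/ldc are the row strides of the full matrices the pointers index into.
void GebpInnerKernel(const float* a, const float* b, float* c,
                     std::size_t lda, std::size_t ldb, std::size_t ldc,
                     std::size_t mc, std::size_t kc, std::size_t n) {
  for (std::size_t i = 0; i < mc; ++i) {
    for (std::size_t p = 0; p < kc; ++p) {
      const float a_ip = a[i * lda + p];  // Reused across the whole row of B.
      for (std::size_t j = 0; j < n; ++j) {
        c[i * ldc + j] += a_ip * b[p * ldb + j];
      }
    }
  }
}

// "Escape hatch" style driver: computes C = A * B (A is m x k, B is k x n)
// by tiling A into blocks of at most mc x kc and feeding each block, together
// with the matching kc x n panel of B, to the inner kernel.
std::vector<float> MatMulViaGebp(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t m, std::size_t k, std::size_t n,
                                 std::size_t mc = 2, std::size_t kc = 2) {
  assert(a.size() == m * k && b.size() == k * n);
  assert(mc > 0 && kc > 0);
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t k0 = 0; k0 < k; k0 += kc) {
    const std::size_t kb = std::min(kc, k - k0);  // Ragged edge along k.
    for (std::size_t i0 = 0; i0 < m; i0 += mc) {
      const std::size_t mb = std::min(mc, m - i0);  // Ragged edge along m.
      GebpInnerKernel(a.data() + i0 * k + k0,  // Block of A at (i0, k0).
                      b.data() + k0 * n,       // Panel of B starting at row k0.
                      c.data() + i0 * n,       // Rows of C starting at i0.
                      /*lda=*/k, /*ldb=*/n, /*ldc=*/n, mb, kb, n);
    }
  }
  return c;
}
```

Calling MatMulViaGebp(a, b, 3, 3, 2) with the default 2 x 2 blocks exercises
both the full-block and the ragged-edge paths. The block sizes here are
arbitrary; a tuned implementation would pick them to fit the cache hierarchy.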

As far as I can tell, this is functionally correct, but I don't yet have a
proper apples-to-apples performance comparison -- that'll have to wait until
the implementation is complete.

PiperOrigin-RevId: 197422075
tensorflow/compiler/xla/service/cpu/cpu_options.cc
tensorflow/compiler/xla/service/cpu/cpu_options.h
tensorflow/compiler/xla/service/cpu/dot_op_emitter.cc
tensorflow/compiler/xla/service/cpu/dot_op_emitter.h
tensorflow/compiler/xla/service/cpu/vector_support_library.h
tensorflow/compiler/xla/service/llvm_ir/kernel_support_library.h