Add AllReduceOp to GPU dialect with lowering to NVVM.
author     Christian Sigg <csigg@google.com>
           Thu, 26 Sep 2019 07:17:13 +0000 (00:17 -0700)
committer  A. Unique TensorFlower <gardener@tensorflow.org>
           Thu, 26 Sep 2019 07:17:50 +0000 (00:17 -0700)
commit     116dac00baa6870aec2a2b469b2d6f95c2fbb316
tree       1ff87872c7a0db12f4fed9ff715327b05921792d
parent     94298cea933991b29dcb7f340725bc25e78cebcf
Add AllReduceOp to GPU dialect with lowering to NVVM.

The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup".
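As a sketch of how the new op might be used, here is a hypothetical function in MLIR's generic op form (the custom assembly format, if any, at this revision may differ; the function name and argument are illustrative only):

```mlir
func @workgroup_sum(%arg0 : f32) {
  // Every work item in the workgroup contributes %arg0, and every
  // work item receives the combined result. The reduction kind is
  // currently hard-coded to "add" and the scope to "workgroup".
  %sum = "gpu.all_reduce"(%arg0) : (f32) -> f32
  return
}
```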

The implementation is currently limited to workgroup sizes that are a multiple of 32 (the warp size) and no larger than 1024.

PiperOrigin-RevId: 271290265
mlir/include/mlir/Dialect/GPU/GPUOps.td
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
mlir/test/Conversion/GPUToNVVM/gpu-to-nvvm.mlir
mlir/test/Dialect/GPU/ops.mlir
mlir/test/mlir-cuda-runner/all-reduce.mlir [new file with mode: 0644]