[mlir] Transform scf.parallel to scf.for + async.execute
author Eugene Zhulenev <ezhulenev@google.com>
Fri, 13 Nov 2020 11:01:52 +0000 (03:01 -0800)
committer Eugene Zhulenev <ezhulenev@google.com>
Fri, 13 Nov 2020 12:02:56 +0000 (04:02 -0800)
commit c30ab6c2a307cfdce8323ed94c3d70eb2d26bc14
tree f1cb1f1e49d7d133ae9ebcb4c5faf15a40611528
parent 7da0d0a67ffc61a455512103f00a53a27d880bdc

Depends On D89958

1. Add an async group (`async.create_group`, `async.add_to_group`, `async.await_all`) to group together multiple async tokens/values.
2. Rewrite the scf.parallel operation into multiple concurrent async.execute operations over non-overlapping subranges of the original loop.

Example:

```
   scf.parallel (%i, %j) = (%lbi, %lbj) to (%ubi, %ubj) step (%si, %sj) {
     "do_some_compute"(%i, %j): (index, index) -> ()
   }
```

Converted to:

```
   %c0 = constant 0 : index
   %c1 = constant 1 : index

   // Compute the number of blocks and the block size for each induction variable.
   %num_blocks_i = ... : index
   %num_blocks_j = ... : index
   %block_size_i = ... : index
   %block_size_j = ... : index

   // Create an async group to track async execute ops.
   %group = async.create_group

   scf.for %bi = %c0 to %num_blocks_i step %c1 {
     %block_start_i = ... : index
     %block_end_i   = ... : index

     scf.for %bj = %c0 to %num_blocks_j step %c1 {
       %block_start_j = ... : index
       %block_end_j   = ... : index

       // Execute the body of original parallel operation for the current
       // block.
       %token = async.execute {
         scf.for %i = %block_start_i to %block_end_i step %si {
           scf.for %j = %block_start_j to %block_end_j step %sj {
             "do_some_compute"(%i, %j): (index, index) -> ()
           }
         }
       }

       // Add produced async token to the group.
       async.add_to_group %token, %group
     }
   }

   // Await completion of all async.execute operations.
   async.await_all %group
```
In this example, the outer loops over blocks launch the inner block-level loops
as separate async.execute operations, which are executed concurrently.

At the end, it waits for the completion of all async.execute operations.
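
For readers less familiar with the async dialect, the same scheme can be
sketched in plain C++ with std::async. This is only an analogy, not the code
of the pass: `parallelFor2D`, `ceilDiv`, and the `targetBlocksI`/`targetBlocksJ`
parameters are hypothetical names, and the ceil-division block sizing is an
assumed policy standing in for the elided `...` computations above. The vector
of futures plays the role of the async group.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <future>
#include <vector>

// Hypothetical helper: ceil(a / b) for b > 0 and a >= 0.
static int64_t ceilDiv(int64_t a, int64_t b) { return (a + b - 1) / b; }

// Runs body(i, j) over [lbi, ubi) x [lbj, ubj) with steps si, sj, launching
// one asynchronous task per block of the iteration space.
// ASSUMPTION: block sizes are derived from caller-provided target block
// counts; the actual pass may pick block sizes differently.
void parallelFor2D(int64_t lbi, int64_t ubi, int64_t si,
                   int64_t lbj, int64_t ubj, int64_t sj,
                   int64_t targetBlocksI, int64_t targetBlocksJ,
                   const std::function<void(int64_t, int64_t)> &body) {
  assert(si > 0 && sj > 0 && targetBlocksI > 0 && targetBlocksJ > 0);

  // Number of iterations along each dimension (zero for empty ranges).
  int64_t tripI = ubi > lbi ? ceilDiv(ubi - lbi, si) : 0;
  int64_t tripJ = ubj > lbj ? ceilDiv(ubj - lbj, sj) : 0;

  // Block size (in iterations) and number of blocks per dimension.
  int64_t blockSizeI = std::max<int64_t>(1, ceilDiv(tripI, targetBlocksI));
  int64_t blockSizeJ = std::max<int64_t>(1, ceilDiv(tripJ, targetBlocksJ));
  int64_t numBlocksI = ceilDiv(tripI, blockSizeI);
  int64_t numBlocksJ = ceilDiv(tripJ, blockSizeJ);

  // Counterpart of async.create_group.
  std::vector<std::future<void>> group;

  for (int64_t bi = 0; bi < numBlocksI; ++bi) {
    int64_t startI = lbi + bi * blockSizeI * si;
    int64_t endI = std::min(ubi, startI + blockSizeI * si);

    for (int64_t bj = 0; bj < numBlocksJ; ++bj) {
      int64_t startJ = lbj + bj * blockSizeJ * sj;
      int64_t endJ = std::min(ubj, startJ + blockSizeJ * sj);

      // Counterpart of async.execute followed by async.add_to_group: the
      // task runs the original loop body over one non-overlapping block.
      group.push_back(std::async(std::launch::async, [=, &body] {
        for (int64_t i = startI; i < endI; i += si)
          for (int64_t j = startJ; j < endJ; j += sj)
            body(i, j);
      }));
    }
  }

  // Counterpart of async.await_all: block until every task has finished.
  for (std::future<void> &task : group)
    task.get();
}
```

Because the blocks never overlap, each `(i, j)` pair is visited by exactly one
task, so the body only needs synchronization for state it shares across
iterations. The body may run on several threads at once, as with the original
parallel loop.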

Reviewed By: ftynse, mehdi_amini

Differential Revision: https://reviews.llvm.org/D89963
22 files changed:
mlir/include/mlir/Dialect/Async/CMakeLists.txt
mlir/include/mlir/Dialect/Async/IR/Async.h
mlir/include/mlir/Dialect/Async/IR/AsyncBase.td
mlir/include/mlir/Dialect/Async/IR/AsyncOps.td
mlir/include/mlir/Dialect/Async/Passes.h [new file with mode: 0644]
mlir/include/mlir/Dialect/Async/Passes.td [new file with mode: 0644]
mlir/include/mlir/ExecutionEngine/AsyncRuntime.h
mlir/include/mlir/InitAllPasses.h
mlir/integration_test/Dialect/Async/CPU/lit.local.cfg [new file with mode: 0644]
mlir/integration_test/Dialect/Async/CPU/test-async-parallel-for-1d.mlir [new file with mode: 0644]
mlir/integration_test/Dialect/Async/CPU/test-async-parallel-for-2d.mlir [new file with mode: 0644]
mlir/lib/Conversion/AsyncToLLVM/AsyncToLLVM.cpp
mlir/lib/Dialect/Async/CMakeLists.txt
mlir/lib/Dialect/Async/IR/Async.cpp
mlir/lib/Dialect/Async/Transforms/AsyncParallelFor.cpp [new file with mode: 0644]
mlir/lib/Dialect/Async/Transforms/CMakeLists.txt [new file with mode: 0644]
mlir/lib/Dialect/Async/Transforms/PassDetail.h [new file with mode: 0644]
mlir/lib/ExecutionEngine/AsyncRuntime.cpp
mlir/test/Conversion/AsyncToLLVM/convert-to-llvm.mlir
mlir/test/Dialect/Async/async-parallel-for.mlir [new file with mode: 0644]
mlir/test/Dialect/Async/ops.mlir
mlir/test/mlir-cpu-runner/async-group.mlir [new file with mode: 0644]