From 3d62ef80181578f0799aa878dfeb490494c7e575 Mon Sep 17 00:00:00 2001
From: Alex Zinenko <zinenko@google.com>
Date: Mon, 13 May 2019 08:51:34 -0700
Subject: [PATCH]     Update region documentation

    Restructure the Regions section in LangRef to avoid having a wall of text and
    reflect a recent evolution of the design.  Unspecify region types, which are put
    on hold until use cases arise.

    Update the Rationale doc with a list of design decisions related to regions.
    Separately list the design alternatives that were considered and discarded due
    to the lack of existing use cases.

--

PiperOrigin-RevId: 247943144
---
 mlir/g3doc/LangRef.md   |  87 +++++++++++++++++++++++++----------------
 mlir/g3doc/Rationale.md | 100 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 154 insertions(+), 33 deletions(-)

diff --git a/mlir/g3doc/LangRef.md b/mlir/g3doc/LangRef.md
index a8f62ff..bb0731d 100644
--- a/mlir/g3doc/LangRef.md
+++ b/mlir/g3doc/LangRef.md
@@ -1249,31 +1249,36 @@ func @count(%x: i64) -> (i64, i64)
 
 ## Regions
 
-A region is a CFG of MLIR [Blocks](#blocks). Regions serve as a generalization
-of a function body that can be nested under arbitrary operations. A region
-semantics is defined by the containing entity (operation or function). Regions
-do not have a name or an address, only the blocks contained in a region do.
+### Definition
+
+A region is a CFG of MLIR [Blocks](#blocks). Regions serve to group semantically
+connected blocks, where the semantics is not imposed by the IR. Instead, the
+containing entity (operation or function) defines the semantics of the regions
+it contains. Regions do not have a name or an address, only the blocks contained
+in a region do. Regions are meaningless outside of the containing entity and
+have no type or attributes.
 
 The first block in the region cannot be a successor of any other block. The
-arguments of this block are treated as arguments of the region. The syntax for
-the region is as follows:
+syntax for the region is as follows:
 
 ``` {.ebnf}
-region ::= region-signature? region-body
-region-signature ::= `(` argument-list `)` (`->` function-result-type)?
-region-body ::= `{` block+ `}`
+region ::= `{` block+ `}`
 ```
 
-The function body is an example of a region, the body of an `affine.for`
-operation is another example, this time of an single-block region.
+The function body is an example of a region: it consists of a CFG of blocks and
+has additional semantic restrictions that other types of regions may not have
+(block terminators must either branch to a different block, or return from a
+function where the types of the `return` arguments must match the result types
+of the function signature).
+
+### Control and Value Scoping
 
 Regions provide nested control isolation: it is impossible to branch to a block
 within a region from outside it, or to branch from within a region to a block
 outside it. Similarly it provides a natural scoping for value visibility: SSA
 values defined in a region don't escape to the enclosing region if any. By
-default, a region can referenced values defined outside of the region, whenever
+default, a region can reference values defined outside of the region, whenever
 it would have been legal to use them as operands to the enclosing operation.
-This can be further restricted using custom verifier.
 
 Example:
 
@@ -1299,6 +1304,11 @@ func $@accelerator_compute(i64, i1) -> i64 {
 }
 ```
 
+Value visibility can be further restricted using a custom verifier, for example
+by disallowing references to values defined outside the region completely.
+
+### Control Flow
+
 Regions are Single-Entry-Multiple-Exit (SEME). It means that control can only
 flow into the first block of the region, but can flow out of the region at the
 end of any of the blocks it contains. (This behavior is similar to that of
@@ -1306,30 +1316,32 @@ functions in most programming languages). Nonetheless, when exiting the region
 from any of its multiple exit points, the control flows to the same successor.
 
 Regions present in an operation can be executed any number of times. The IR does
-not guarantee if a region passed as an argument to an operation will be
-executed; if so, how many times and when. In particular, a region can be
-executed zero, one or multiple times, in no particular order with respect to
-other regions or operations. It may be executed as a part of an operation, or by
-some later operation using any values produced by the operation that contains
-the region. The successor to a region’s exit points may not necessarily exist:
-regions enclosing non-terminating code such as infinite loops are possible, as
-well as an operation implementing an infinite loop over a region. Concurrent or
-asynchronous execution of regions is unspecified. Operations may define pecific
+not guarantee whether a region of an operation is executed and, if so, how many
+times and when. In particular, a region can be executed zero, one or multiple
+times, in no particular order with respect to other regions or operations. It
+may be executed as a part of an operation, or by some later operation using any
+values produced by the operation that contains the region. The successor to a
+region’s exit points may not necessarily exist: regions enclosing
+non-terminating code such as infinite loops are possible, as well as an
+operation implementing an infinite loop over a region. Concurrent or
+asynchronous execution of regions is unspecified. Operations may define specific
 rules of execution, e.g. sequential loops or switch-like blocks.
 
 In case of zero executions, control does not flow into the region. In case of
 multiple executions, the control may exit the region from any of the region exit
 points and enter it again at its entry point. It may also enter another region.
-If an operation has multiple region arguments, the semantics of the operation
-defines into which regions the control flows and in which order, if any. An
-operation may trigger execution of regions that were specified in other
-operations, in particular those that defined the values the given operation
-uses. When all argument regions were executed the number of times required by
-the operation semantics, the control flows from any of the region exit points to
-the original control-successor of the operation that triggered the execution.
-Thus operations with region arguments can be treated opaquely in the enclosing
-control flow graph, providing a level of control flow isolation similar to that
-of the call operation.
+If an operation has multiple regions, the semantics of the operation defines
+into which regions the control flows and in which order, if any. An operation
+may trigger execution of regions that were specified in other operations, in
+particular those that defined the values the given operation uses. When all
+regions have been executed the number of times required by the operation
+semantics, the control flows from any of the region exit points to the original
+control-successor of the operation that triggered the execution. Thus operations
+with regions can be treated opaquely in the enclosing control flow
+graph, providing a level of control flow isolation similar to that of the call
+operation.
+
+### Closure
 
 Regions allow to define an operation that creates a closure, for example by
 “boxing” the body of the region into a value they produce. It remains up to the
@@ -1343,9 +1355,18 @@ Note that if an operation triggers asynchronous execution of the region, it is
 under the responsibility of the operation caller to wait for the region to be
 executed guaranteeing that any directly used values remain live.
 
+### Arguments and Results
+
+The arguments of the first block of a region are treated as arguments of the
+region. The source of these arguments is defined by the parent entity of the
+region. If a region is a function body, its arguments are the function
+arguments. If a region is used in an operation, the operation semantics
+specifies how these values are produced. They may correspond to some of the
+values the operation itself uses.
+
 Regions produce a (possibly empty) list of values. For function body regions,
 `return` is the standard region-exiting terminator, but dialects can provide
-their own. For regions passed as operation arguments, the operation semantics
+their own. For regions that belong to an operation, the operation semantics
 defines the relation between the region results and the operation results.
 
 ## Blocks
diff --git a/mlir/g3doc/Rationale.md b/mlir/g3doc/Rationale.md
index 56e0e48..a28b9d9 100644
--- a/mlir/g3doc/Rationale.md
+++ b/mlir/g3doc/Rationale.md
@@ -371,6 +371,62 @@ However, this control flow granularity is not available in the ML functions
 where min/max, and thus `select`, are likely to appear. In addition, simpler
 control flow may be beneficial for optimization in general.
 
+### Regions
+
+#### Attributes of type 'Block'
+
+We considered representing regions through `ArrayAttr`s containing a list of a
+special type `IRBlockAttr`, which in turn would contain a list of operations.
+All attributes in MLIR are unique’d within the context, which would make the IR
+inside the regions immortal for no good reason.
+
+#### Use "inlined" functions as regions
+
+We considered attaching a "force-inline" attribute on a function and/or a
+function `call` operation. Even the minimal region support (use cases in
+affine.for and affine.if existing before the regions) requires access to the
+values defined in the dominating block, which is not supported by functions.
+Conceptually, function bodies are instances of regions rather than the inverse;
+regions can also be device kernels, alternative sections, etc.
+
+#### Dedicated `region` operation
+
+This would mean we have a special kind of operation that is allowed to have
+regions while other operations are not. Such distinction is similar to the
+Stmt/Op difference we have had and chose to remove to make the IR simpler and
+more flexible. It would also require analyses and passes to consider the
+interplay between operations (e.g., an `affine.for` operation must be followed
+by a region operation). Finally, a region operation can be introduced using the
+current implementation, among other operations and without being special in any
+sense.
+
+#### Explicit capture of the values used in a region
+
+Being able to use values defined outside the region implies that use-def chains
+may contain uses from different nested regions. Consequently, IR transformations
+and analyses can pull the instruction defining the value across region
+boundaries, for example in case of TableGen-defined canonicalization patterns.
+This would not be the case if all used values had been passed as region
+arguments. One of the motivations for introducing regions in the IR is precisely
+to enable cross-region analyses and transformations that are simpler than
+inter-procedural transformations. Having uses from different regions appear in
+the same use-def chain, contrary to an additional data structure maintaining
+correspondence between function call arguments as uses of the original
+definitions and formal arguments as new definitions, enables such
+simplification. Since individual operations now belong to blocks, which belong
+to regions, it is always possible to check if the definition of the value
+belongs to the same region as its particular use. The risk is that any IR
+traversal will need to handle this situation explicitly and it is easy to forget
+a check (or conversely it isn’t easy to design the right check in a TableGen
+pattern for example): traversing use-def chains may implicitly cross
+semantic barriers, making it possible to unknowingly break region semantics.
+This is expected to be caught in the verifier after the transformation.
+
+At the same time, one may choose to pass certain or all values as region
+arguments to explicitly break the use-def chains in the current proposal. This
+can be combined with an attribute-imposed semantic requirement disallowing the
+body of the region to refer to any value from outside it.
+
 ### Quantized integer operations
 
 We haven't designed integer quantized operations in MLIR, but experience from
@@ -860,6 +916,50 @@ bb0 (%0, %1: memref<128xf32>, i64):
 }
 ```
 
+### Regions
+
+#### Making function definition an operation
+
+MLIR supports values of a Function type. Instead of having a first-class IR
+concept for functions, one could define an operation with a body region that
+defines a function value. The particularity of functions is that their names are
+globally visible and can be referred to before being defined, unlike SSA values
+that must be defined first. Implementing a "function definition" operation would
+require relaxing some of the SSA constraints in a region and making the IR
+Module a region as well. It would also affect the core infrastructure (e.g.,
+function passes) only for the sake of concept unification.
+
+#### Having types on a region
+
+Instead of inspecting the types of arguments of the first block, one could give
+the region itself a type. This type would be redundant with block argument
+types, which must be specified anyway, and would create room for type mismatches. While
+functions do have types that are partly redundant with the arguments of the
+first block in the function, this is necessary to support function declarations
+that do not have a body which we can refer to in order to obtain the argument
+types. A region is always contained in an operation or a function that can be
+queried to obtain the “type” of the region if necessary.
+
+A type on a region could be justified if regions were to be considered separately
+from the enclosing entity (operation or function) and had their own semantics
+that should be checked.
+
+#### Attaching attributes to regions
+
+Regions could be annotated with dialect attributes to use attribute verification
+hooks. An operation could take multiple regions as arguments, and each of them
+may require different attributes. However, there are currently very few
+practical cases where this would be necessary. Instead, one could simulate
+per-region attributes with array attributes attached to the entity containing
+the region (operation or function). This decreases the overall complexity of the
+IR and enables more concise and op-specific forms, e.g., when all regions of an
+op have the same attribute that can be only mentioned once. Since the semantics
+of the region is entirely defined by the enclosing entity, it also makes sense
+to have attributes attached to that entity rather than to the region itself.
+
+This can be reconsidered in the future if we see a non-negligible number of use
+cases.
+
 ### Read/Write/May_Read/May_Write sets for External Functions
 
 Having read, write, may_read, and may_write sets for external functions which
-- 
2.7.4