We provide a ``PassInfo`` object to contain the basic information needed by
a pass. ``name`` is the pass name, ``opt_level`` indicates at which optimization
level the pass will be enabled, and ``required`` represents the passes that are
-required to execute a certain pass (see `include/tvm/relay/transform.h`_ for
+required to execute a certain pass (see `include/tvm/ir/transform.h`_ for
more details). For example, during registration of a pass (will be covered in
later), the pass developers can specify the name of the pass, the optimization
level it will be performed at, and/or the passes that are required.
sequential passes. Each subclass itself could act as a pass manager. For
instance, they could collect the required passes and execute them or build
a dependency graph based on the given metadata. The full definition of them
-can be found in `src/relay/pass/pass_manager.cc`_
+can be found in `src/relay/ir/transform.cc`_ and `src/ir/transform.cc`_.
Module-Level Passes
^^^^^^^^^^^^^^^^^^^
.. _Relay module: https://docs.tvm.ai/langref/relay_expr.html#module-and-global-functions
-.. _include/tvm/relay/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/relay/transform.h
+.. _include/tvm/ir/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/ir/transform.h
-.. _src/relay/pass/pass_manager.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/pass_manager.cc
+.. _src/relay/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/ir/transform.cc
+
+.. _src/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/ir/transform.cc
.. _src/relay/pass/fold_constant.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/fold_constant.cc
- PackedFunc itself
- Module for compiled modules
- DLTensor* for tensor object exchange
-- TVM Node to represent any object in IR
+- TVM Object to represent any object in IR
The restriction makes the implementation simple without the need of serialization.
Despite being minimum, the PackedFunc is sufficient for the use-case of deep learning deployment as
.. _here: https://github.com/apache/incubator-tvm/tree/master/src/api
-To keep the runtime minimum, we isolated the IR Node support from the deployment runtime. The resulting runtime takes around 200K - 600K depending on how many runtime driver modules (e.g., CUDA) get included.
+To keep the runtime minimum, we isolated the IR Object support from the deployment runtime. The resulting runtime takes around 200K - 600K depending on how many runtime driver modules (e.g., CUDA) get included.
The overhead of calling into PackedFunc vs. a normal function is small, as it is only saving a few values on the stack.
So it is OK as long as we don't wrap small functions.
This instant feedback gives us a lot of advantages. For example, to test the correctness of generated code on iPhone, we no longer have to write test-cases in swift/objective-c from scratch -- We can use RPC to execute on iPhone, copy the result back and do verification on the host via numpy. We can also do the profiling using the same script.
-TVM Node and Compiler Stack
+TVM Object and Compiler Stack
---------------------------
As we mentioned earlier, we build compiler stack API on top of the PackedFunc runtime system.
- be able to serialize any language object and IRs
- be able to explore, print, and manipulate the IR objects in front-end language to do quick prototyping.
-We introduced a base class, called `Node`_ to solve this problem.
-All the language object in the compiler stack is a subclass of Node. Each node contains a string type_key that uniquely identifies
-the type of object. We choose string instead of int as type key so new Node class can be added in the decentralized fashion without
+We introduced a base class, called `Object`_ to solve this problem.
+All the language object in the compiler stack is a subclass of ``Object``. Each object contains a string type_key that uniquely identifies
+the type of object. We choose string instead of int as type key so new ``Object`` class can be added in the decentralized fashion without
adding the code back to the central repo. To ease the speed of dispatching, we allocate an integer type_index at runtime for each type_key.
-.. _Node: https://github.com/dmlc/HalideIR/blob/master/src/tvm/node/node.h#L61
+.. _Object: https://github.com/apache/incubator-tvm/blob/master/include/tvm/runtime/object.h
-Since usually one Node object could be referenced in multiple places in the language, we use a shared_ptr to keep
-track of reference. We use NodeRef class to represent a reference to the Node.
-We can roughly view NodeRef class as shared_ptr to the Node container.
-We can also define subclass NodeRef to hold each subtypes of Node. Each Node class needs to define the VisitAttr function.
+Since usually one ``Object`` could be referenced in multiple places in the language, we use a shared_ptr to keep
+track of reference. We use ``ObjectRef`` class to represent a reference to the ``Object``.
+We can roughly view ``ObjectRef`` class as shared_ptr to the ``Object`` container.
+We can also define subclass ``ObjectRef`` to hold each subtypes of ``Object``. Each subclass of ``Object`` needs to define the VisitAttr function.
.. code:: c
virtual void Visit(const char* key, std::string* value) = 0;
virtual void Visit(const char* key, void** value) = 0;
virtual void Visit(const char* key, Type* value) = 0;
- virtual void Visit(const char* key, NodeRef* value) = 0;
+ virtual void Visit(const char* key, ObjectRef* value) = 0;
// ...
};
- class Node {
+ class BaseAttrsNode : public Object {
public:
- virtual void VisitAttrs(AttrVisitor* visitor) {}
+ virtual void VisitAttrs(AttrVisitor* v) {}
// ...
};
-Each Node subclass will override this to visit its members. Here is an example implementation of TensorNode.
+Each ``Object`` subclass will override this to visit its members. Here is an example implementation of TensorNode.
.. code:: c
- class TensorNode : public Node {
+ class TensorNode : public Object {
public:
/*! \brief The shape of the tensor */
Array<Expr> shape;
}
};
-In the above examples, both ``Operation`` and ``Array<Expr>`` are NodeRef.
+In the above examples, both ``Operation`` and ``Array<Expr>`` are ObjectRef.
The VisitAttrs gives us a reflection API to visit each member of the object.
We can use this function to visit the node and serialize any language object recursively.
It also allows us to get members of an object easily in front-end language.
# access the op field of TensorNode
print(x.op.name)
-New Node can be added to C++ without changing the front-end runtime, making it easy to make extensions to the compiler stack.
+New ``Object`` can be added to C++ without changing the front-end runtime, making it easy to make extensions to the compiler stack.
Note that this is not the fastest way to expose members to front-end language, but might be one of the simplest
approaches possible. We also find that it fits our purposes as we mainly use python for testing and prototyping and still use c++
to do the heavy lifting job.