# circle-execution-plan

The _circle-execution-plan_ tool provides a model with an "execution plan".

This tool takes a circle file as input and returns a modified circle file.
The output circle file contains plan (`CircleNodeMemoryPlan`) information for every node.

An "execution plan" contains:
- a number that determines the order in which nodes are executed
- memory offsets for the node's output tensors, measured from the beginning of the shared memory buffer

To record and read this metadata, we use `CircleImportMetadata` and `CircleExportMetadata`.
The plans are kept in `std::map<uint32_t, std::vector<uint32_t>> _memory_plan_table`, which maps each node ID (the key) to the encoded `CircleNodeMemoryPlan` data for that node.
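
The table described above can be pictured with a small sketch. `NodePlan`, `encode`, `decode` and `MemoryPlanTable` are hypothetical names introduced only for illustration, and the layout (order number first, then one offset per output tensor) is an assumption; the encoding actually used for `CircleNodeMemoryPlan` may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical mirror of a per-node plan; the real class is CircleNodeMemoryPlan.
struct NodePlan
{
  uint32_t order_in_plan;        // position of the node in the execution order
  std::vector<uint32_t> offsets; // one offset per output tensor, from the buffer start
};

// Assumed layout: [order, offset_0, offset_1, ...]. The actual encoding may differ.
std::vector<uint32_t> encode(const NodePlan &plan)
{
  std::vector<uint32_t> data{plan.order_in_plan};
  data.insert(data.end(), plan.offsets.begin(), plan.offsets.end());
  return data;
}

NodePlan decode(const std::vector<uint32_t> &data)
{
  assert(!data.empty()); // at least the order number must be present
  return NodePlan{data[0], std::vector<uint32_t>(data.begin() + 1, data.end())};
}

// Same shape as the table described above: node ID -> encoded plan.
using MemoryPlanTable = std::map<uint32_t, std::vector<uint32_t>>;
```

With these helpers, `table[7] = encode(NodePlan{2, {128, 640}})` would record that node 7 runs third (order 2, counting from zero) and writes its two outputs at byte offsets 128 and 640; `decode(table.at(7))` recovers the same values.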

### Execution plan building

To build an "execution plan" we use the `ExecutionPlanner` class.
Its main method is `get_execution_plan()`, which computes the "execution plan" for each node and writes it to the node's annotations. This takes two steps:
- Determining the execution order of the nodes, which is stored in the `_ordered_nodes` vector.
  Currently there is only one default method for this, `get_default_execution_order_plan()`, which uses `loco::postorder_traversal(const std::vector<loco::Node *> &roots)`.
  In the future we can add new methods and choose the most suitable graph traversal.

- Determining the memory offsets of the nodes from the beginning of the shared memory buffer, which are stored in `_offsets`.
  Currently there is one method for this, `get_offsets_with_greedy_by_size()`, which implements the "Greedy by Size" algorithm described in https://arxiv.org/pdf/2001.03288.pdf.
  The main objective is to minimize the size of the allocated memory block.
  In the future, other methods for determining memory offsets may be added here.
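
The two steps can be sketched on a toy graph as follows. This is a simplified illustration under stated assumptions, not the code of `ExecutionPlanner`: `Node`, `Lifetime`, `execution_order`, `lifetimes` and `greedy_by_size` are hypothetical names, each node is assumed to have a single output tensor, every node is assumed to be reachable from a root, and the offset search follows the general idea of "Greedy by Size" (place the largest tensors first, reusing the smallest gap that fits among tensors with overlapping lifetimes).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a graph node; not a luci or loco type.
struct Node
{
  std::vector<int> inputs; // indices of the nodes that produce this node's inputs
  uint32_t size;           // size of the output tensor in bytes
};

// Lifetime of a node's output, as positions in the execution order.
struct Lifetime
{
  int first; // position at which the tensor is produced
  int last;  // position of its last consumer
};

// Step 1: postorder traversal from the roots, so inputs come before consumers.
std::vector<int> execution_order(const std::vector<Node> &graph, const std::vector<int> &roots)
{
  std::vector<int> order;
  std::vector<bool> seen(graph.size(), false);
  std::function<void(int)> visit = [&](int n) {
    if (seen[n])
      return;
    seen[n] = true;
    for (int in : graph[n].inputs)
      visit(in);
    order.push_back(n);
  };
  for (int r : roots)
    visit(r);
  return order;
}

// Derive lifetimes from the order. Assumes every node is reachable from a root.
std::vector<Lifetime> lifetimes(const std::vector<Node> &graph, const std::vector<int> &order,
                                const std::vector<int> &roots)
{
  std::vector<Lifetime> life(graph.size(), Lifetime{0, 0});
  for (int pos = 0; pos < static_cast<int>(order.size()); ++pos)
  {
    life[order[pos]] = Lifetime{pos, pos};
    for (int in : graph[order[pos]].inputs)
      life[in].last = pos; // inputs were placed earlier, so this only grows
  }
  for (int r : roots)
    life[r].last = static_cast<int>(order.size()) - 1; // outputs live until the end
  return life;
}

// Step 2: place tensors from largest to smallest into the shared buffer.
std::vector<uint32_t> greedy_by_size(const std::vector<Node> &graph,
                                     const std::vector<Lifetime> &life)
{
  assert(graph.size() == life.size());
  std::vector<size_t> by_size(graph.size());
  std::iota(by_size.begin(), by_size.end(), 0);
  std::stable_sort(by_size.begin(), by_size.end(),
                   [&](size_t a, size_t b) { return graph[a].size > graph[b].size; });

  std::vector<uint32_t> offsets(graph.size(), 0);
  std::vector<size_t> placed; // already placed tensors, kept sorted by offset
  for (size_t id : by_size)
  {
    uint32_t prev_end = 0, best_offset = 0, best_gap = UINT32_MAX;
    bool found = false;
    for (size_t other : placed)
    {
      if (life[other].last < life[id].first || life[id].last < life[other].first)
        continue; // lifetimes do not overlap, so the memory can be shared
      if (offsets[other] >= prev_end)
      {
        uint32_t gap = offsets[other] - prev_end;
        if (gap >= graph[id].size && gap < best_gap)
        {
          found = true;
          best_gap = gap;
          best_offset = prev_end;
        }
      }
      prev_end = std::max(prev_end, offsets[other] + graph[other].size);
    }
    offsets[id] = found ? best_offset : prev_end; // no gap fits: append after the last conflict
    auto pos = std::upper_bound(placed.begin(), placed.end(), id,
                                [&](size_t a, size_t b) { return offsets[a] < offsets[b]; });
    placed.insert(pos, id);
  }
  return offsets;
}
```

For a four-node chain whose outputs take 100, 300, 200 and 50 bytes, with the last node as the only root, this sketch orders the nodes 0, 1, 2, 3 and assigns offsets 300, 0, 300, 0. The shared buffer then needs 500 bytes instead of the 650 bytes required if every tensor had its own region.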