docs/overview/roadmap.md

# Roadmap

This document describes the roadmap of the **ONE** project.

**ONE** aims to provide a high-performance, on-device neural network (NN) inference framework that
runs a given NN model on processors such as CPU, GPU, DSP, or NPU, on target platforms such as
Tizen, Android, and Ubuntu.

## Progress

By last year, we had already seen significant gains from acceleration with a single CPU or GPU
backend. We have since seen further performance improvements, not only with a single backend, but
also when mixing CPU and GPU backends according to the characteristics of individual operations.
This gives us a high degree of freedom in terms of operator coverage, and possibly better
performance than single-backend acceleration.

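
As a minimal sketch of what mixing backends per operation can look like, the snippet below picks,
for each operation, the supported backend with the lowest estimated cost. The cost table, the
operation names, and `assign_backends` are hypothetical and only illustrate the idea; this is not
the scheduler used by the ONE runtime, and it ignores the cost of moving tensors between backends.

```python
# Hypothetical per-operation latency estimates in milliseconds (made-up numbers).
COST_MS = {
    "Conv2D": {"cpu": 8.0, "gpu": 2.0},
    "Softmax": {"cpu": 0.3, "gpu": 0.9},
    "TopK": {"cpu": 0.5},  # assumed to have no GPU kernel, so only the CPU can run it
}


def assign_backends(ops, cost_table):
    """Pick, for each operation, the supported backend with the lowest estimated cost."""
    plan = []
    for op in ops:
        candidates = cost_table[op]  # backends that support this operation
        backend = min(candidates, key=candidates.get)
        plan.append((op, backend))
    return plan


plan = assign_backends(["Conv2D", "Softmax", "TopK"], COST_MS)
print(plan)  # [('Conv2D', 'gpu'), ('Softmax', 'cpu'), ('TopK', 'cpu')]
```
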
In addition, we introduced the compiler as a front-end. It will support a variety of deep learning
frameworks in the relatively resource-rich host PC environment, while the runtime running on the
target device is intended to carry a smaller burden. In this approach, the compiler and the runtime
share information through the common IR, named _circle_, and a container format referred to as the
_NN Package_.

## Goal

Until now, we have been working on improving acceleration performance based on vision models.
Starting this year, we begin working on voice models. The runtime requirements for voice models
will differ from those of vision models. There will be new requirements that we have not yet
identified, along with some already recognized ones such as control flow and dynamic tensors. In
addition, recent studies on voice models require efficient support for specific architectures such
as attention, transformer, and BERT. Also, because most voice models require large memory
bandwidth, we will have to put more effort into optimizing memory bandwidth usage at runtime.

## Deliverables

- Runtime
  + Control Flow support (IF/WHILE)
  + Dynamic Tensor support
  + High-quality kernel development for UINT8 quantized models
  + End-to-end performance optimization for voice models
- Compiler
  + Support for more than 100 operations
  + Standalone _circle_ interpreter
  + Completion and application of _circle2circle_ pass
    - _circle-quantizer_ for UINT8 and INT16
    - _circle-optimizer_
  + Graphical _circle_ model viewer

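
For readers unfamiliar with UINT8 quantization, the following is a minimal sketch of the common
asymmetric affine scheme, which maps a float range onto the integers 0 to 255 using a scale and a
zero point. The function names are hypothetical, and this is an illustration of the general
technique under that assumption, not the algorithm implemented by _circle-quantizer_.

```python
def quantize_uint8(values):
    """Asymmetric affine quantization of a list of floats to integers in [0, 255]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive scale works
    zero_point = round(-lo / scale)
    # Quantize, then clamp to the UINT8 range.
    q = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [scale * (x - zero_point) for x in q]


weights = [-1.0, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
# Each restored value differs from the original by at most one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, zero_point, max_error)
```
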
## Milestones

- [2020 Project Milestones](https://github.com/Samsung/ONE/projects/1)

## Workgroups (WGs)

- We organize WGs around major topics. Each WG works on its own topic by breaking it into small
  tasks/issues, carrying them out within the WG, and collaborating with other WGs.
- The WG information can be found [here](workgroup.md).