docs/api/device_runner.md

   1 # CHIP on-device testing
   2
   3 _Requirements and high-level design_
   4
   5 ## Background
   6
   7 The ability to run tests on actual and emulated hardware is paramount in
   8 embedded projects, and CHIP is no exception. We want on-device testing to be a
   9 first-class goal of the CHIP architecture. On-device testing requirements apply
  10 both to Continuous Integration testing for main CHIP software stack development
  11 and to eventual CHIP product certification. This document explores the
  12 requirements and evaluates potential solutions.
  13
  14 ## Overview of requirements
  15
  16 A good device test infrastructure is built on four pillars.
  17
  18 ### Pillar 1: Using a test framework
  19
  20 A test framework provides a testing structure that developers can follow and
  21 potentially reduces some of the burden of test setup and teardown (less
  22 boilerplate). Support for state-oriented and asynchronous structuring of tests
  23 would be beneficial. Many test frameworks leverage scripting languages such as
  24 Python to simplify the quick development of tests and to leverage rich sets of
  25 libraries for device/systems access and results generation.
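
As an illustration of state-oriented, asynchronous test structure, here is a minimal sketch using Python's standard `unittest` module. This document does not select a framework; `FakeDevice`, `BootTest`, and `run_suite` are hypothetical names used only for this example, and the `asyncio.sleep(0)` calls stand in for waiting on real hardware.

```python
import asyncio
import unittest


class FakeDevice:
    """Hypothetical stand-in for a device under test (not a CHIP API)."""

    def __init__(self):
        self.state = "off"

    async def boot(self):
        await asyncio.sleep(0)  # placeholder for waiting on real hardware
        self.state = "ready"

    async def shutdown(self):
        await asyncio.sleep(0)
        self.state = "off"


class BootTest(unittest.IsolatedAsyncioTestCase):
    """Setup and teardown live in the framework hooks, not in each test."""

    async def asyncSetUp(self):
        self.device = FakeDevice()
        await self.device.boot()

    async def asyncTearDown(self):
        await self.device.shutdown()

    async def test_device_is_ready_after_boot(self):
        self.assertEqual(self.device.state, "ready")


def run_suite():
    """Load and run BootTest, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BootTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` executes the single test and returns a result whose `wasSuccessful()` method reports the outcome.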
  26
  27 ### Pillar 2: Dispatching tests
  28
  29 Tests can run on lab machines or on the developer's local workstation. Tests can
  30 be triggered manually by the developer or automatically when a continuous
  31 integration (CI) server finishes building a changeset. CHIP involves multiple
  32 stakeholders, many of whom will want to contribute lab capacity to the testing
  33 effort. The infrastructure must therefore be prepared for cross-organization
  34 test dispatch.
  35
  36 To facilitate uniform dispatch of tests we will probably need a simple
  37 request/response protocol, potentially HTTPS-based and RESTful. Because device
  38 tests are long-running, the response to a test scheduling request could be a
  39 test ID rather than the test result. That ID could be used to query the test
  40 status, subscribe to notifications on status changes, and pull the test
  41 results. Core aspects of such a scheme include conventions for the contents of
  42 request artifacts and the minimum expected contents of results once the run is
  43 complete.
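
The scheduling flow described above can be sketched in memory, without any transport. This is only an assumption about how such a protocol might be shaped: `Dispatcher`, `ScheduledRun`, `Status`, and the `advance` method are hypothetical names, and a real service would expose these operations over HTTPS rather than as direct method calls.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional


class Status(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETE = "complete"


@dataclass
class ScheduledRun:
    """Hypothetical record kept by the lab for one scheduled test."""
    test_id: str
    artifacts: Dict[str, str]
    status: Status = Status.QUEUED
    results: Optional[dict] = None
    listeners: List[Callable[[str, Status], None]] = field(default_factory=list)


class Dispatcher:
    """In-memory stand-in for a lab's test scheduling endpoint."""

    def __init__(self):
        self._runs: Dict[str, ScheduledRun] = {}

    def schedule(self, artifacts: Dict[str, str]) -> str:
        # The response is a test ID, not the test result.
        test_id = uuid.uuid4().hex
        self._runs[test_id] = ScheduledRun(test_id, dict(artifacts))
        return test_id

    def status(self, test_id: str) -> Status:
        return self._runs[test_id].status

    def subscribe(self, test_id: str, callback: Callable[[str, Status], None]) -> None:
        self._runs[test_id].listeners.append(callback)

    def results(self, test_id: str) -> Optional[dict]:
        return self._runs[test_id].results

    def advance(self, test_id: str, status: Status, results: Optional[dict] = None) -> None:
        # Called by the (hypothetical) lab side as the run progresses.
        run = self._runs[test_id]
        run.status = status
        run.results = results
        for callback in run.listeners:
            callback(test_id, status)
```

A CI job would call `schedule` with its artifacts, keep the returned ID, and then either poll `status` or register a callback with `subscribe` before fetching `results`.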
  44
  45 ### Pillar 3: Interacting with devices
  46
  47 The test host environment has to reset devices, flash images on them, issue
  48 commands, monitor status and collect test results. It may also need to integrate
  49 virtual (simulated) and real devices in the same test. At first this can be done
  50 in an ad-hoc way per platform, but eventually we can move to a device access
  51 abstraction, i.e. define a common device testing interface which CHIP-compliant
  52 devices can expose. The test host has to be prepared to drive multiple devices
  53 at the same time for a single test, e.g. for tests that check communication
  54 between multiple devices.
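
One possible shape for such a common device interface is sketched below. The interface itself is not defined by this document: `Device`, `SimulatedDevice`, `prepare_all`, and the method names are all assumptions chosen to mirror the operations listed above (reset, flash, issue commands).

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class Device(ABC):
    """Hypothetical common interface that real and simulated devices share."""

    @abstractmethod
    def reset(self) -> None:
        """Return the device to a known state."""

    @abstractmethod
    def flash(self, image: bytes) -> None:
        """Write a firmware image to the device."""

    @abstractmethod
    def send_command(self, command: str) -> str:
        """Issue a command and return the device's reply."""


class SimulatedDevice(Device):
    """Virtual device that records every interaction in a log."""

    def __init__(self, name: str):
        self.name = name
        self.image: Optional[bytes] = None
        self.log: List[str] = []

    def reset(self) -> None:
        self.log.append("reset")

    def flash(self, image: bytes) -> None:
        self.image = image
        self.log.append("flash")

    def send_command(self, command: str) -> str:
        if self.image is None:
            raise RuntimeError(f"{self.name}: no image flashed")
        self.log.append(command)
        return f"{self.name}: ok {command}"


def prepare_all(devices: List[Device], image: bytes) -> None:
    """Reset and flash every device taking part in a multi-device test."""
    for device in devices:
        device.reset()
        device.flash(image)
```

Because the test host only depends on `Device`, a test that drives two devices can mix a `SimulatedDevice` with a hardware-backed implementation without changing the test body.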
  55
  56 ### Pillar 4: Collecting results
  57
  58 Ideally, test results are output in standard formats and similar or analogous
  59 results between different devices and tests are output the same way. This
  60 ensures reusability of code that processes similar data while allowing
  61 aggregation of results across different dimensions. Failed tests must propagate
  62 errors from device platform layers all the way to the CHIP stack and present
  63 errors and potential stack traces in a standard result format. As the purpose of
  64 on-device tests is to capture bugs, it is important that the test outputs
  65 highlight the failure reason(s), so that developers don't have to browse through
  66 thousands of lines of logs to find the one line that sheds light on why a test
  67 failed.
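
To make the idea of a standard result record concrete, here is a small sketch. The field names and the `RunResult` and `summarize` names are assumptions for illustration only; this document does not yet specify a result format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class RunResult:
    """Hypothetical uniform result record shared by all devices and tests."""
    test_name: str
    device: str
    passed: bool
    failure_reason: Optional[str] = None
    stack_trace: Optional[str] = None
    log_lines: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True)


def summarize(results: List[RunResult]) -> List[str]:
    """Produce one line per test, putting the failure reason up front."""
    lines = []
    for r in results:
        if r.passed:
            lines.append(f"PASS {r.test_name} [{r.device}]")
        else:
            lines.append(f"FAIL {r.test_name} [{r.device}]: {r.failure_reason}")
    return lines
```

Keeping `failure_reason` as a dedicated field, separate from `log_lines`, is what lets a summary surface the cause of a failure without anyone reading the full log.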
  68
  69 ## Priorities
  70
  71 In the spirit of CHIP's charter, it would be great to see something take off
  72 as soon as possible, to support continuous testing of the evolving CHIP stack.
  73 We could then improve on that first iteration, even if we have to throw away
  74 some temporary concepts and code.
  75
  76 Test dispatch (Pillar 2) emerges as the highest priority, because all other
  77 pillars can have ad-hoc solutions. The first need is an interface between a
  78 CircleCI job and a test execution host at a participating organization. This
  79 would enable dispatching tests to a variety of existing in-house infrastructure,
  80 while retaining common request/response protocols to shield the CI system from
  81 implementation details of each lab.
  82
  83 The next most important goal is to provide a test framework (Pillar 1). With a
  84 standard framework, developers can start writing tests, even if those tests are
  85 device-specific and use ad-hoc input and output formats. The general structure
  86 of the tests will nevertheless be in place, and the tests can later be adapted
  87 to standard interactions (Pillar 3) and result formats (Pillar 4).
  88
  89 Specifying result formats (Pillar 4) for the most common outputs
  90 (success/failure, failure reason, stack trace, memory and CPU usage time series,
  91 pcaps of network traffic, etc.) will be an ongoing effort. The simplest output
  92 formats can be specified together with the test framework.
  93
  94 Lastly, we want to look into a common device interaction interface that would
  95 enable reusing tests between different devices.
  96
  97 ## Baseline hardware platforms for CHIP
  98
  99 The TSG is targeting the following platforms/boards for early bring-up:
 100
 101 -   Nordic nRF52 board <TODO: REF>
 102 -   SiLabs XXXX board <TODO:REF>
 103 -   Espressif ESP32 XXXX board <TODO:REF>