@page tests Validation and benchmark tests

@section tests_overview Overview

Benchmark and validation tests are based on the same framework to set up and
run the tests. In addition to running simple, self-contained test functions
the framework supports fixtures and data test cases. The former allow common
setup routines to be shared between various backends, thus reducing the amount
of duplicated code. The latter can be used to parameterize tests or fixtures
with different inputs, e.g. different tensor shapes. One limitation is that
tests/fixtures cannot be parameterized based on the data type if static type
information is needed within the test (e.g. to validate the results).

@subsection tests_overview_fixtures Fixtures

Fixtures can be used to share common setup, teardown or even run tasks among
multiple test cases. For that purpose a fixture can define a `setup`,
`teardown` and `run` method. Additionally, the constructor and destructor can
be customised.

An instance of the fixture is created immediately before the actual test is
executed. After construction the @ref framework::Fixture::setup method is called. Then the test
function or the fixture's `run` method is invoked. After test execution the
@ref framework::Fixture::teardown method is called and finally the fixture is destructed.
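
To make the order of these calls concrete, the following sketch approximates
what the framework does for every test (illustrative pseudo-code only, not the
actual framework implementation; `MyFixture` is a placeholder name):

    {
        MyFixture fixture;      // 1. fixture is constructed
        fixture.setup();        // 2. setup is called
        fixture.run();          // 3. test function or the fixture's run method
        fixture.teardown();     // 4. teardown is called
    }                           // 5. fixture is destructed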

@subsubsection tests_overview_fixtures_fixture Fixture

Fixtures for non-parameterized tests are straightforward. The custom fixture
class has to inherit from @ref framework::Fixture and can implement any of the
`setup`, `teardown` or `run` methods. None of the methods takes any arguments
or returns anything.

    class CustomFixture : public framework::Fixture
    {
        void setup()
        {
            _ptr = malloc(4000);
        }

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsubsection tests_overview_fixtures_data_fixture Data fixture

The advantage of a parameterized fixture is that arguments can be passed to
the setup method at runtime. To make this possible, the setup method has to be
a template with a type parameter for every argument (though the template
parameter doesn't have to be used). All other methods remain the same.

    class CustomFixture : public framework::Fixture
    {
    #ifdef ALTERNATIVE_DECLARATION
        template <typename ...>
        void setup(size_t size)
        {
            _ptr = malloc(size);
        }
    #else
        template <typename T>
        void setup(T size)
        {
            _ptr = malloc(size);
        }
    #endif

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsection tests_overview_test_cases Test cases

All of the following macros can optionally be prefixed with
`EXPECTED_FAILURE_` or `DISABLED_`.

@subsubsection tests_overview_test_cases_test_case Test case

A simple test case function taking no inputs and having no (shared) state.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.

    TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_fixture_test_case Fixture test case

A simple test case function taking no inputs that inherits from a fixture. The
test case will have access to all public and protected members of the fixture.
Only the setup and teardown methods of the fixture will be used. The body of
this function will be used as the test function.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.

    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

    protected:
        int _one;
    };

    FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_register_fixture_test_case Registering a fixture as test case

Allows a fixture to be used directly as a test case. Instead of defining a new
test function, the fixture's `run` method will be executed.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.

    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
        }

    protected:
        int _one;
    };

    REGISTER_FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT);

@subsubsection tests_overview_test_cases_data_test_case Data test case

A parameterized test case function that has no (shared) state. The dataset
will be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.
- Third argument is the dataset.
- Further arguments specify names of the arguments to the test function. The
  number must match the arity of the dataset.

    DATA_TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}), num)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }

@subsubsection tests_overview_test_cases_fixture_data_test_case Fixture data test case

A parameterized test case that inherits from a fixture. The test case will have
access to all public and protected members of the fixture. Only the setup and
teardown methods of the fixture will be used. The setup method of the fixture
needs to be a template and has to accept inputs from the dataset as arguments.
The body of this function will be used as the test function. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.

    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

    protected:
        int _num;
    };

    FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}))
    {
        ARM_COMPUTE_ASSERT(_num < 4);
    }

@subsubsection tests_overview_test_cases_register_fixture_data_test_case Registering a fixture as data test case

Allows a fixture to be used directly as a parameterized test case. Instead of
defining a new test function, the fixture's `run` method will be executed. The
setup method of the fixture needs to be a template and has to accept inputs
from the dataset as arguments. The dataset will be used to generate versions
of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.

    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT(_num < 4);
        }

    protected:
        int _num;
    };

    REGISTER_FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}));

@section writing_tests Writing validation tests

Before starting a new test case have a look at the existing ones. They should
provide a good overview of how test cases are structured.

- The C++ reference needs to be added to `tests/validation/CPP/`. The
  reference function is typically a template parameterized by the underlying
  value type of the `SimpleTensor`. This makes it easy to specialise for
  different data types.
- If all backends have a common interface it makes sense to share the setup
  code. This can be done by adding a fixture in
  `tests/validation/fixtures/`. Inside the `setup` method of a fixture
  the tensors can be created and initialised and the function can be configured
  and run. The actual test will then only have to validate the results. To be
  shared among multiple backends the fixture class is usually a template that
  accepts the specific types (data, tensor class, function class etc.) as
  parameters (see the sketch after this list).
- The actual test cases need to be added for each backend individually.
  Typically there will be multiple tests for different data types and for
  different execution modes, e.g. precommit and nightly.
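
As a rough sketch of how the reference and the fixture fit together (all names
below, e.g. `copy_reference` and `CopyValidationFixture`, are made up for
illustration and are not actual files or classes in the codebase):

    // tests/validation/CPP/: reference implementation, templated on the
    // underlying value type so it can be specialised per data type.
    template <typename T>
    SimpleTensor<T> copy_reference(const SimpleTensor<T> &src)
    {
        SimpleTensor<T> dst(src.shape(), src.data_type());
        std::copy_n(src.data(), src.num_elements(), dst.data());
        return dst;
    }

    // tests/validation/fixtures/: shared setup code, templated on the
    // backend-specific types so that e.g. NEON and CL tests can reuse it.
    template <typename TensorType, typename FunctionType, typename T>
    class CopyValidationFixture : public framework::Fixture
    {
    public:
        template <typename...>
        void setup(TensorShape shape)
        {
            // Create and initialise the tensors, configure and run
            // FunctionType to produce _target, and compute _reference using
            // the C++ reference above. The backend-specific test then only
            // has to validate _target against _reference.
        }

    protected:
        TensorType      _target{};
        SimpleTensor<T> _reference{};
    };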

@section tests_running_tests Running tests
@subsection tests_running_tests_benchmarking Benchmarking
@subsubsection tests_running_tests_benchmarking_filter Filter tests

All tests can be run by invoking

    ./arm_compute_benchmark ./data

where `./data` contains the assets needed by the tests.

If only a subset of the tests has to be executed, the `--filter` option takes
a regular expression to select matching tests.

    ./arm_compute_benchmark --filter='NEON/.*AlexNet' ./data

Additionally, each test has a test id which can be used as a filter, too.
However, the test id is not guaranteed to be stable when new tests are added;
a test only keeps the same id within one specific build.

    ./arm_compute_benchmark --filter-id=10 ./data

All available tests can be displayed with the `--list-tests` switch.

    ./arm_compute_benchmark --list-tests

More options can be found in the `--help` message.

@subsubsection tests_running_tests_benchmarking_runtime Runtime
By default every test is run once on a single thread. The number of iterations
can be controlled via the `--iterations` option and the number of threads via
`--threads`.
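
For example, to run each test ten times on four threads:

    ./arm_compute_benchmark --iterations=10 --threads=4 ./data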

@subsubsection tests_running_tests_benchmarking_output Output
By default the benchmarking results are printed in a human-readable format on
the command line. The colored output can be disabled via `--no-color-output`.
As an alternative output format JSON is supported and can be selected via
`--log-format=json`. To write the output to a file instead of stdout the
`--log-file` option can be used.
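
For example, to write the results as JSON to a file (`output.json` is just an
example name):

    ./arm_compute_benchmark --log-format=json --log-file=output.json ./data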

@subsubsection tests_running_tests_benchmarking_mode Mode
Tests contain different datasets of different sizes, some of which will take
several hours to run. You can select which datasets to use via the `--mode`
option; we recommend starting with `--mode=precommit`.
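
For example:

    ./arm_compute_benchmark --mode=precommit ./data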

@subsubsection tests_running_tests_benchmarking_instruments Instruments
You can use the `--instruments` option to select one or more instruments to
measure the execution time of the benchmark tests.

`PMU` will try to read the CPU PMU events from the kernel (they need to be
enabled on your platform).

`MALI` will try to collect Mali hardware performance counters (you need to
have a recent enough Mali driver).

`WALL_CLOCK` will measure time using `gettimeofday`: this should work on all
platforms.

You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK`
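
For example, as a full command line:

    ./arm_compute_benchmark --instruments=PMU,WALL_CLOCK ./data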

@note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.

@subsection tests_running_tests_validation Validation

@note The new validation tests have the same interface as the benchmarking tests.

} // namespace arm_compute