From 3a9cdd780de28deeda45600fb5b8b134d91d17f2 Mon Sep 17 00:00:00 2001 From: Alyssa Rosenzweig Date: Sat, 29 Oct 2022 13:36:02 -0400 Subject: [PATCH] panfrost/ci: Disable trace-based testing Trace-based testing has not worked for Panfrost. It was a neat experiment, and I'm glad we tried it, but the results have been mostly negative for the driver. Disable the trace-based tests. For testing that specific API features work correctly, we run the conformance tests (dEQP), which are thorough for OpenGL ES. For big GL features, we run Piglit, and if there are big GL features that we are not testing adequately, we should extend Piglit for these. For fine-grained driver correctness, we are already covered. Where trace-based testing can fit in is as a smoke test, ensuring that the overall rendering of complex scenes does not regress. In principle, that's a lovely idea, but the current implementation has not worked out for Panfrost thus far. The crux of the issue is that the trace based tests are based on checksums, not fuzzy-compared reference images. That requires updating checksums any time rendering changes. However, a rendering change to a trace is NOT a regression. The behaviour of OpenGL is specified very loosely. For a given trace, there are many different valid checksums. That means that correct changes to core code frequently fail CI after running through the rest of CI, only because a checksum changed in a still correct way. That's a pain to deal with, exacerbated by rebase pains, and provides negative value to the project. Some recent examples of this I've hit in the past two weeks alone: panfrost: Enable rendering to 16-bit and 32-bit 4b49241f7d7 ("panfrost: Use proper formats for pntc varying") ac2964dfbd1 ("nir: Be smarter fusing ffma") The last example were virgl traces, but were especially bad: due to a rebase fail, I had to update traces /twice/, wasting two full runs of pre-merge CI across *all* hardware. This was extremely wasteful. The value of trace-based testing is as a smoke test to check that traces still render correctly. That is useful, but it turns out that checksums are the wrong way to go about it. A better implementation would be storing only a single reference image from a software rasterizer per trace. No driver-specific references would be stored. That reference image must never change, provided the trace never changes. CI would then check rendered results against that image with tolerant fuzzy comparisons. That tolerance matches with the fuzzy comparison that the human eye would do when investigating a checksum change anyway. Yes, the image comparison JavaScript will now report that 0 pixels changed within the tolerance, but there's nothing a human eye can do with that information other than an error prone copypaste of new checksums back in the yaml file and kicking it back to CI, itself a waste of time. Finally, in the time we've had trace-based testing alongside the conformance tests, I cannot remember a single actual regression in one of my commits the trace jobs have identified that the conformance tests have not also identified. By contrast, the conformance test coverage has prevented the merge of a number of actual regressions, with very few flakes or xfail changes, and I am grateful we have that coverage. That means the value added from the trace jobs is close to zero, while the above checksum issues means that the cost is tremendous, even ignoring the physical cost of the extra CI jobs. If you work on trace-based testing and would like to understand how it could adapted to be useful for Panfrost, see my recommendations above. If you work on CI in general and would like to improve Panfrost's CI coverage, what we need right now is not trace-based testing, it's GLES3.1 conformance runs on MediaTek MT8192 or MT8195. That hardware is already in the Collabora LAVA lab, but it's not being used for Mesa CI as the required kernel patches haven't made their way to mainline yet and nobody has cherry-picked them to the gfx-ci kernel. If you are a Collaboran and interested in improving Panfrost CI, please ping AngeloGioacchino for information on which specific patches need to be backported or cherry-picked to our gfx-ci kernel. Thank you. Signed-off-by: Alyssa Rosenzweig Acked-by: Jason Ekstrand Part-of: --- src/panfrost/ci/gitlab-ci.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/panfrost/ci/gitlab-ci.yml b/src/panfrost/ci/gitlab-ci.yml index 7ce82d3..d2b610e 100644 --- a/src/panfrost/ci/gitlab-ci.yml +++ b/src/panfrost/ci/gitlab-ci.yml @@ -50,7 +50,7 @@ panfrost-t760-gles2:armhf: - .lava-rk3288-veyron-jaq - .test-manual-mr -panfrost-t760-traces:armhf: +.panfrost-t760-traces:armhf: extends: - .lava-piglit-traces:armhf - .panfrost-midgard-rules @@ -80,7 +80,7 @@ panfrost-t860-gl:arm64: FDO_CI_CONCURRENT: 6 DEQP_SUITE: panfrost-t860 -panfrost-t860-traces:arm64: +.panfrost-t860-traces:arm64: extends: - .lava-piglit-traces:arm64 - .lava-rk3399-gru-kevin -- 2.7.4