Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function
author Anupam Pandey <anupam.pandey@ittiam.com>
Tue, 6 Jun 2023 06:57:34 +0000 (12:27 +0530)
committer Anupam Pandey <anupam.pandey@ittiam.com>
Fri, 9 Jun 2023 11:45:37 +0000 (17:15 +0530)
commit 8c308aefea7c58a1a979b81f4aa6d68908e379ee
tree 99d4aa93cb371171dd450fdf48c02ac13a0dc95a
parent e510716d7e9a0a34592eb8ff1f8a65b951fe2eeb

This CL resolves the mismatch between the C and intrinsic implementations
of the vpx_hadamard_32x32() function. The mismatch was caused by integer
overflow in the addition step of the intrinsic functions: the addition
was performed in 16-bit arithmetic, but a0 + a1 can require 17 bits.

The fix performs the addition at 32-bit precision (after sign extension)
in both the SSE2 and AVX2 versions, and converts the results back to
16 bits after the right shift.
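
The overflow can be illustrated with a small scalar sketch. This is not
the libvpx code: the helper names and the shift amount of 2 are
assumptions made for illustration, and the 16-bit wraparound is modeled
in plain C rather than with SSE2 or AVX2 intrinsics. It also assumes
that right-shifting a negative value is an arithmetic shift, as on the
usual x86 compilers.

```c
#include <stdint.h>

/* Assumed shift amount for illustration; the commit message only says
 * "a right shift". */
#define SKETCH_SHIFT 2

/* Reinterpret the low 16 bits of v as a signed 16-bit value, the way a
 * 16-bit SIMD lane would hold the result of a wrapping add. */
static int16_t wrap_to_int16(int32_t v) {
  int32_t low = v & 0xFFFF;
  if (low >= 0x8000) low -= 0x10000;
  return (int16_t)low;
}

/* Old behavior (modeled): the sum is truncated to 16 bits before the
 * shift, so a 17-bit sum loses its top bit. */
static int16_t add_shift_16bit(int16_t a0, int16_t a1) {
  int16_t sum = wrap_to_int16((int32_t)a0 + (int32_t)a1);
  return (int16_t)(sum >> SKETCH_SHIFT);
}

/* Fixed behavior (modeled): sign-extend to 32 bits, add, shift, and
 * only then narrow back to 16 bits. */
static int16_t add_shift_32bit(int16_t a0, int16_t a1) {
  int32_t sum = (int32_t)a0 + (int32_t)a1;
  return (int16_t)(sum >> SKETCH_SHIFT);
}
```

With a0 = a1 = 20000 the true sum is 40000, which does not fit in a
signed 16-bit value. The 32-bit path yields 10000, while the modeled
16-bit path wraps the sum to -25536 and yields -6384. For small inputs
the two paths agree, which is why the mismatch only shows up for
large-magnitude coefficients.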

STATS_CHANGED

Change-Id: I576ca64e3b9ebb31d143fcd2da64322790bc5853
test/hadamard_test.cc
vpx_dsp/avg.c
vpx_dsp/x86/avg_intrin_avx2.c
vpx_dsp/x86/avg_intrin_sse2.c