Chris Wilson [Sun, 13 Sep 2009 17:02:10 +0000 (18:02 +0100)]
[test] Exercise repeating patterns for composite.
Chris Wilson [Sun, 13 Sep 2009 14:04:30 +0000 (15:04 +0100)]
[build] Add rule to generate asm for inspection.
Chris Wilson [Sun, 13 Sep 2009 14:04:54 +0000 (15:04 +0100)]
[sse2] Don't emit prefetch 0 for an absent mask
Chris Wilson [Sun, 13 Sep 2009 14:07:08 +0000 (15:07 +0100)]
[test] Add composite test from rendercheck
Iterate over all destination formats for dst, src and composite and
compare the result of all operators with a selection of colours.
Chris Wilson [Thu, 27 Aug 2009 08:19:14 +0000 (09:19 +0100)]
build: Suppress verbose compile lines
Compile warnings are being lost in the sea of noise. Automake-1.11 finally
introduced AM_SILENT_RULES to suppress the echoing of the compile line for
every object. Enable this to bring sanity to the pixman build.
Chris Wilson [Sun, 13 Sep 2009 15:32:27 +0000 (16:32 +0100)]
Merge branch '0.16'
Conflicts:
configure.ac
pixman/pixman-sse2.c
Chris Wilson [Sun, 16 Aug 2009 11:16:46 +0000 (12:16 +0100)]
Remove duplicated declaration
The pixman_tranform_pixman_f_transform() declaration is repeated 4 lines
down.
Chris Wilson [Sun, 13 Sep 2009 15:26:29 +0000 (16:26 +0100)]
Enable component alpha on solid masks.
Chris Wilson [Sun, 13 Sep 2009 15:26:52 +0000 (16:26 +0100)]
[sse2] Bit-reversing typo: src != dst
Søren Sandmann Pedersen [Fri, 11 Sep 2009 01:33:24 +0000 (21:33 -0400)]
Fix off-by-one error in source_image_needs_out_of_bounds_workaround()
If extents->x2/y2 are equal to image->width/height, then the clip is
still inside the drawable, so no workaround is necessary.
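The boundary condition described above can be sketched as follows. This is a minimal illustration, not the actual pixman function: `box_t` and `extents_inside_drawable()` are hypothetical names, and the extents are assumed to be half-open (x2/y2 exclusive).

```c
#include <stdbool.h>

/* Hypothetical box type: x2/y2 are assumed to be exclusive bounds. */
typedef struct { int x1, y1, x2, y2; } box_t;

/* Returns true when the clip extents lie entirely inside a
 * width x height drawable.  Note '<=' for x2/y2: an extent whose
 * x2 equals width still ends on the last valid column. */
static bool
extents_inside_drawable (const box_t *e, int width, int height)
{
    return e->x1 >= 0 && e->y1 >= 0 &&
           e->x2 <= width && e->y2 <= height;
}
```

Using '<' instead of '<=' for the x2/y2 comparisons would flag a clip that exactly covers the drawable as out of bounds, which is the off-by-one case the commit describes.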
Gaetan Nadon [Wed, 9 Sep 2009 00:06:19 +0000 (20:06 -0400)]
Remove unused generated libcomp.pc #23801
Siarhei Siamashka [Fri, 4 Sep 2009 11:14:00 +0000 (14:14 +0300)]
Change CFLAGS order for PPC and ARM configure tests
CFLAGS are always appended to the end of gcc options when compiling
sources in autotools based projects. Configure tests should do the
same. Otherwise build fails on PPC when using CFLAGS="-O2 -mno-altivec"
for example. Similar problem affects ARM.
Siarhei Siamashka [Wed, 2 Sep 2009 16:46:47 +0000 (19:46 +0300)]
ARM: Remove fallback to ARMv6 implementation from NEON delegate chain
This can help to fix build problems with '-mthumb' gcc option in CFLAGS.
ARMv6 optimized code can't be compiled for thumb (because of its inline
assembly) and gets automatically disabled in configure. Reference
to it from NEON optimized code resulted in linking problems.
Every ARMv6 optimized fast path function also has a better NEON
counterpart, so there is no need to fall back to ARMv6. A shorter
delegate chain should additionally result in slightly better performance.
M Joonas Pihlaja [Mon, 31 Aug 2009 22:02:53 +0000 (23:02 +0100)]
Default to optimised builds when using a Sun Studio compiler.
Autoconf's AC_PROG_CC sets the default CFLAGS to -O2 -g for
gcc and -g for every other compiler. This patch defaults
CFLAGS to the equivalent -O -g when we're using Sun Studio's cc
if the user or site admin hasn't already set CFLAGS.
M Joonas Pihlaja [Mon, 31 Aug 2009 19:27:32 +0000 (20:27 +0100)]
Work around a Sun Studio 12 code generation bug involving _mm_set_epi32().
Calling a static function wrapper around _mm_set_epi32() when not
using optimisation causes Sun Studio 12's cc to emit a spurious
floating point load which confuses the assembler. Using a macro wrapper
rather than a function steps around the problem.
M Joonas Pihlaja [Mon, 31 Aug 2009 19:24:04 +0000 (20:24 +0100)]
Work around differing _mm_prefetch() prototypes on Solaris.
Sun Studio 12 expects the address to prefetch to be
a const char pointer rather than a __m128i pointer or
void pointer.
Siarhei Siamashka [Fri, 28 Aug 2009 19:34:21 +0000 (22:34 +0300)]
ARM: workaround for gcc bug in vshll_n_u8 intrinsic
Some versions of gcc (cs2009q1, 4.4.1) incorrectly reject
shift operand having value >= 8, claiming that it is out of
range. So inline assembly is used as a workaround.
Søren Sandmann Pedersen [Tue, 2 Jun 2009 12:27:33 +0000 (08:27 -0400)]
Enable the x888_8_8888 sse2 fast path.
Søren Sandmann Pedersen [Wed, 2 Sep 2009 20:09:32 +0000 (16:09 -0400)]
Set version number to 0.16.1
Makoto Kato [Tue, 1 Sep 2009 01:59:05 +0000 (10:59 +0900)]
Add CPU detection for VC++ x64
VC++ x64 has no inline assembler and x64 mode supports SSE2.
So, it is unnecessary to call cpuid.
Søren Sandmann Pedersen [Tue, 1 Sep 2009 12:23:23 +0000 (08:23 -0400)]
Change names of add_8888_8_8 fast paths to add_n_8_8
The source is solid in those.
Søren Sandmann Pedersen [Fri, 28 Aug 2009 12:14:04 +0000 (08:14 -0400)]
Post-release version bump
Søren Sandmann Pedersen [Fri, 28 Aug 2009 11:55:30 +0000 (07:55 -0400)]
Pre-release version bump
Chris Wilson [Fri, 28 Aug 2009 10:31:06 +0000 (06:31 -0400)]
_pixman_run_fast_path: typo
This is one example of a compiler warning that was lost amid the build
noise.
The error here is that in a list of required conditions we used ';'
instead of '&&' with the result of continuing to use the fast-path
even if we had a wide mask.
Another error is that it was testing src, not mask as it should.
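The shape of this bug can be sketched as below. This is not the actual `_pixman_run_fast_path` code: `composite_info_t`, its fields, and `fast_path_allowed()` are hypothetical names chosen only to illustrate the two mistakes (a ';' where '&&' was meant, and a test on src where mask was meant).

```c
#include <stdbool.h>

/* Hypothetical summary of what a fast path needs to know. */
typedef struct
{
    bool src_is_wide;   /* source needs more than 8 bits per channel */
    bool has_mask;
    bool mask_is_wide;  /* mask needs more than 8 bits per channel */
} composite_info_t;

/* Buggy shape, shown only as a comment because compilers warn about it:
 *
 *     ok = !info->src_is_wide;
 *     (!info->has_mask || !info->src_is_wide);
 *
 * The ';' ends the condition early, so the second line is evaluated
 * and thrown away.  It also tests src where mask was intended.
 */
static bool
fast_path_allowed (const composite_info_t *info)
{
    /* Fixed shape: every requirement is joined with '&&', and the
     * second requirement looks at the mask. */
    return !info->src_is_wide &&
           (!info->has_mask || !info->mask_is_wide);
}
```

A statement that computes a value and discards it is what gcc reports as having no effect, which is why the commit calls this a warning that was lost in the build noise.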
Makoto Kato [Fri, 28 Aug 2009 08:09:15 +0000 (04:09 -0400)]
Remove spurious spaces in pixman-x64-mmx-emulation.h
Søren Sandmann Pedersen [Wed, 12 Aug 2009 18:08:58 +0000 (14:08 -0400)]
Check if we have posix_memalign() in configure.ac. [23260, 23261]
Fall back to malloc() in blitters-test.c if we don't.
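A fallback of this kind might look like the sketch below. `aligned_malloc()` is a hypothetical helper name, and `HAVE_POSIX_MEMALIGN` is assumed to be the macro that the configure check defines (the usual autoconf naming); neither is confirmed by the commit message.

```c
#include <stdlib.h>

/* Allocate 'size' bytes, aligned to 'align' bytes when the platform
 * provides posix_memalign().  Without it, plain malloc() is used and
 * the requested alignment is not guaranteed. */
static void *
aligned_malloc (size_t align, size_t size)
{
    void *ptr = NULL;

#ifdef HAVE_POSIX_MEMALIGN
    /* posix_memalign() returns 0 on success and leaves the result
     * in 'ptr'; 'align' must be a power of two multiple of
     * sizeof (void *). */
    if (posix_memalign (&ptr, align, size) != 0)
        return NULL;
#else
    (void) align;   /* alignment request cannot be honoured here */
    ptr = malloc (size);
#endif

    return ptr;
}
```

Memory from either branch can be released with `free()`, so callers do not need to know which path was taken.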
Siarhei Siamashka [Wed, 12 Aug 2009 17:22:24 +0000 (20:22 +0300)]
ARM: a fix to pass blitters-test for 'neon_composite_over_n_8_0565'
Inline assembly for handling <8 pixels width did not pass blitters-test.
Fortunately gcc has no problems compiling alternative implementation
which is using RVCT style intrinsics, so it can be used instead.
Søren Sandmann Pedersen [Tue, 11 Aug 2009 18:03:24 +0000 (14:03 -0400)]
Post-release version bump
Søren Sandmann Pedersen [Tue, 11 Aug 2009 17:56:16 +0000 (13:56 -0400)]
Pre-release version-bump
Søren Sandmann Pedersen [Tue, 11 Aug 2009 06:04:40 +0000 (02:04 -0400)]
Merge branch 'blitter-test'
Søren Sandmann Pedersen [Tue, 11 Aug 2009 00:47:36 +0000 (20:47 -0400)]
Fix x/y mixup in bits_image_fetch_pixel_convolution()
Bug 23224, reported by Michel Dänzer.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 04:45:53 +0000 (00:45 -0400)]
Update CRC value in blitters-test.
At this point, the SIMD, SSE2, MMX and general implementations all
agree.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 04:25:56 +0000 (00:25 -0400)]
Various formatting fixes
Søren Sandmann Pedersen [Wed, 5 Aug 2009 20:28:10 +0000 (16:28 -0400)]
Add the ability to print intermediate CRC values
Søren Sandmann Pedersen [Wed, 5 Aug 2009 19:53:33 +0000 (15:53 -0400)]
Reenable commented-out tests in blitter-test.
The crashes and valgrind issues are all fixed at this point.
Siarhei Siamashka [Sun, 2 Aug 2009 21:01:01 +0000 (00:01 +0300)]
One more update to blitters-test - use aligned memory
allocations in order to make reproducibility
of alignment sensitive bugs more deterministic
Also testing of masks is reenabled
Siarhei Siamashka [Fri, 31 Jul 2009 23:20:12 +0000 (02:20 +0300)]
HACK: updated test to better cover new neon optimizations
Siarhei Siamashka [Tue, 21 Jul 2009 22:29:51 +0000 (01:29 +0300)]
Test program for stressing the use of different formats and operators
The code and overall method is mostly based on scaling-test. This one
focuses on trying to stress as many different color formats and types
of composition operations as possible.
This is an initial implementation which may need more tuning. Also
not all color format and operator combinations are actually used.
When cpu specific optimizations are disabled, this test provides
identical deterministic results on x86, PPC and ARM.
Script blitters-test-bisect.rb now works in non-stop mode, until
it finds any problem. This allows it to be run overnight, for example,
in order to test many more variants of pixman calls and increase
the chances of detecting problems in pixman. Just like with scaling-test,
running blitters-test binary alone with no command line arguments
runs a small predefined number of tests and compares checksum
with a reference value for quick verification.
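The checksum comparison can be sketched as below, assuming a standard CRC-32 (reflected polynomial 0xEDB88320) over the destination buffers. The function name and the choice of this particular CRC variant are assumptions; the test's actual checksum routine may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold 'len' bytes into a running CRC-32.  Start with crc = 0 and
 * feed each buffer in turn; the final value is the checksum that
 * would be compared against a stored reference. */
static uint32_t
crc32_update (uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;

    crc = ~crc;
    while (len--)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
        {
            /* Shift out one bit; XOR in the polynomial if it was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

Because every test result is folded into one running value, any single differing pixel in any test changes the final checksum, which is what makes a one-number comparison sufficient for quick verification.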
Søren Sandmann Pedersen [Fri, 7 Aug 2009 16:00:07 +0000 (12:00 -0400)]
Delete commented out code in pixman-vmx.c
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:53:50 +0000 (11:53 -0400)]
Misc formatting fixes for pixman-vmx.c
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:48:22 +0000 (11:48 -0400)]
In vmx_combine_atop_reverse_ca() extract alpha after inverting
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:46:09 +0000 (11:46 -0400)]
Really fix vmx_combine_over_reverse_ca()
The inverse destination alpha is just one component, not four.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:40:42 +0000 (11:40 -0400)]
Fix vmx_combine_out_reverse_ca()
The source alpha is just one component, not four.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:38:03 +0000 (11:38 -0400)]
Fix vmx_over_reverse_ca()
Destination alpha must be extracted after inverting, otherwise we end
up with 0xFFs in the rgb channels.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:35:20 +0000 (11:35 -0400)]
Multiply with the alpha of dest, not inverse alpha
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:32:31 +0000 (11:32 -0400)]
Fix vmx_combine_vmx_atop_ca()
It didn't compute the mask correctly before.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:26:23 +0000 (11:26 -0400)]
Fix vmx_combine_over_ca().
In the non-vector code, the mask needs to be multiplied with source
alpha.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:21:43 +0000 (11:21 -0400)]
In vmx_combine_out_ca() multiply with the alpha of the negated vdest.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:16:31 +0000 (11:16 -0400)]
Fix vmx_combine_out_ca()
It should multiply with just the destination alpha channel, not all
four channels.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 15:07:16 +0000 (11:07 -0400)]
Do the full four-component IN computation in vmx_combine_in_ca().
Søren Sandmann Pedersen [Fri, 7 Aug 2009 14:54:16 +0000 (10:54 -0400)]
Fix bug in vmx_combine_xor_ca()
The destination needs to be inverted before the alpha channel is
extracted; otherwise, the RGB channels of da will be 0xff.
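The ordering problem can be shown with a scalar sketch on one a8r8g8b8 pixel. This is not the VMX code: it assumes that "extracting alpha" keeps only the alpha byte and zeroes the RGB bytes, and the function names are hypothetical.

```c
#include <stdint.h>

#define ALPHA_MASK 0xff000000u

/* Wrong order: extract alpha first, then invert.  The zeroed RGB
 * bytes turn into 0xff when inverted. */
static uint32_t
inverse_alpha_extract_first (uint32_t dest)
{
    return ~(dest & ALPHA_MASK);
}

/* Right order: invert the whole pixel, then extract alpha.  The RGB
 * bytes stay zero. */
static uint32_t
inverse_alpha_invert_first (uint32_t dest)
{
    return ~dest & ALPHA_MASK;
}
```

For a destination of 0x80112233 the first function yields 0x7fffffff, while the second yields 0x7f000000.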
Søren Sandmann Pedersen [Fri, 7 Aug 2009 05:07:01 +0000 (01:07 -0400)]
Make pix_multiply bit-exact
Søren Sandmann Pedersen [Fri, 7 Aug 2009 03:50:32 +0000 (23:50 -0400)]
Change the SSE2 versions of pix_add_multiply() to produce bit-exact results.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 03:52:11 +0000 (23:52 -0400)]
Fix a couple of alpha==0 vs src==0 issues in pixman-sse2.c
Søren Sandmann Pedersen [Fri, 7 Aug 2009 03:05:36 +0000 (23:05 -0400)]
Rename mmx_composite_add_8888_8_8() to mmx_composite_add_n_8_8().
Søren Sandmann Pedersen [Fri, 7 Aug 2009 02:46:50 +0000 (22:46 -0400)]
Fix a couple more alpha==0 vs src==0 bugs in pixman-mmx.c
Søren Sandmann Pedersen [Fri, 7 Aug 2009 02:42:25 +0000 (22:42 -0400)]
Make pix_add_mul() in pixman-mmx.c produce exact results.
Previously this routine would compute (x * a + y * b) / 255. Now it
computes (x * a) / 255 + (y * b) / 255, so that the results are
bitwise equivalent to the non-mmx versions.
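The difference between the two formulas comes from where the rounding happens. The sketch below assumes a round-to-nearest division by 255 and a saturating result; the helper names are hypothetical and the exact rounding used in pixman may differ.

```c
#include <stdint.h>

/* Round-to-nearest division by 255 (an assumption for this sketch). */
static uint32_t
div_255 (uint32_t v)
{
    return (v + 127) / 255;
}

/* Saturate to the 0..255 range of one channel. */
static uint8_t
clamp_un8 (uint32_t v)
{
    return v > 255 ? 255 : (uint8_t) v;
}

/* Old behaviour: add the products, then divide once. */
static uint8_t
add_mul_combined (uint8_t x, uint8_t a, uint8_t y, uint8_t b)
{
    return clamp_un8 (div_255 ((uint32_t) x * a + (uint32_t) y * b));
}

/* New behaviour: divide each product, then add. */
static uint8_t
add_mul_separate (uint8_t x, uint8_t a, uint8_t y, uint8_t b)
{
    return clamp_un8 (div_255 ((uint32_t) x * a) +
                      div_255 ((uint32_t) y * b));
}
```

With x = y = 1 and a = b = 127, the combined form gives 1 (254 / 255 rounds up) while the separate form gives 0 (each 127 / 255 rounds down). Both are defensible, but only one of them can match a reference implementation bit for bit.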
Søren Sandmann Pedersen [Fri, 7 Aug 2009 00:29:44 +0000 (20:29 -0400)]
Rewrite the two-component arithmetic macros.
Previously they were not bit-for-bit equivalent to the one-component
versions. The new code is also simpler and easier to read because it
factors out some common sub-macros.
The x * a + y * b macro now only uses four multiplications - the
previous version used eight.
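The idea behind two-component arithmetic is to keep two 8-bit channels in one 32-bit word (as 0x00RR00BB) so that a single multiplication scales both. The sketch below is written in that spirit and is not a copy of the macros in the tree; `mul_un8()` and `mul_rb_un8()` are hypothetical names.

```c
#include <stdint.h>

#define RB_MASK 0x00ff00ffu   /* two 8-bit components, 16 bits apart */
#define RB_HALF 0x00800080u   /* 0x80 rounding term for each of them */

/* One-component reference: x * a / 255, rounded to nearest. */
static uint8_t
mul_un8 (uint8_t x, uint8_t a)
{
    uint32_t t = (uint32_t) x * a + 0x80;
    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Two-component version: the same computation applied to both
 * halves of 'rb' with one multiplication. */
static uint32_t
mul_rb_un8 (uint32_t rb, uint8_t a)
{
    uint32_t t = (rb & RB_MASK) * a + RB_HALF;
    return ((t + ((t >> 8) & RB_MASK)) >> 8) & RB_MASK;
}
```

Each half of the packed word goes through exactly the same steps as `mul_un8()`, so the results agree bit for bit. A four-channel pixel is two such packed words, so x * a + y * b needs two multiplications per word, four in total.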
Søren Sandmann Pedersen [Fri, 7 Aug 2009 00:41:04 +0000 (20:41 -0400)]
Fix a bunch of srca == 0 checks that should be src == 0 in pixman-mmx.c
Søren Sandmann Pedersen [Thu, 6 Aug 2009 01:24:50 +0000 (21:24 -0400)]
Don't run fast paths if the format requires wide compositing.
This could happen because the wide formats would still be considered
solid if the image was 1x1 and repeating.
Søren Sandmann Pedersen [Thu, 6 Aug 2009 01:16:14 +0000 (21:16 -0400)]
Fix bug in combine_mask_alpha_ca()
If the mask was 0xffffffff, the source would end up being shifted
twice by A_SHIFT.
Søren Sandmann Pedersen [Thu, 6 Aug 2009 00:40:36 +0000 (20:40 -0400)]
Fix another case of changing the solid source.
This time in fast_path_composite_n_8888_8888().
Søren Sandmann Pedersen [Thu, 6 Aug 2009 00:31:41 +0000 (20:31 -0400)]
Fix incorrect optimization in combine_over_ca().
Previously the code assumed that an alpha of 0 meant that no change
would take place. This is incorrect because an alpha of 0 can happen
as the result of the source having alpha=0, but rgb != 0.
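Why the shortcut is wrong can be seen in a simplified sketch. It uses plain (unified alpha) OVER on one a8r8g8b8 pixel rather than the component-alpha combiner the commit touches, and all function names are hypothetical.

```c
#include <stdint.h>

/* x * a / 255, rounded to nearest. */
static uint8_t
mul_un8 (uint8_t x, uint8_t a)
{
    uint32_t t = (uint32_t) x * a + 0x80;
    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Saturating 8-bit add. */
static uint8_t
add_un8 (uint8_t x, uint8_t y)
{
    uint32_t s = (uint32_t) x + y;
    return s > 255 ? 255 : (uint8_t) s;
}

/* OVER per channel: result = src + dest * (255 - src_alpha) / 255. */
static uint32_t
over (uint32_t src, uint32_t dest)
{
    uint8_t  ia = (uint8_t) (255 - (src >> 24));
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8)
    {
        uint8_t s = (uint8_t) (src >> shift);
        uint8_t d = (uint8_t) (dest >> shift);
        result |= (uint32_t) add_un8 (s, mul_un8 (d, ia)) << shift;
    }
    return result;
}

/* Incorrect shortcut: skips the work whenever alpha is 0. */
static uint32_t
over_skip_on_alpha (uint32_t src, uint32_t dest)
{
    return (src >> 24) == 0 ? dest : over (src, dest);
}

/* Correct shortcut: skips the work only when the whole pixel is 0. */
static uint32_t
over_skip_on_zero (uint32_t src, uint32_t dest)
{
    return src == 0 ? dest : over (src, dest);
}
```

With src = 0x00102030 (alpha 0, RGB non-zero) and dest = 0xff000000, the full computation and the src == 0 shortcut give 0xff102030, while the alpha == 0 shortcut leaves the destination at 0xff000000.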
Søren Sandmann Pedersen [Wed, 5 Aug 2009 22:18:37 +0000 (18:18 -0400)]
Don't change the constant source in fast_composite_over_n_8888_0565.
Søren Sandmann Pedersen [Wed, 5 Aug 2009 20:17:52 +0000 (16:17 -0400)]
Fix bugs in combine_over_reverse_ca().
The computation cannot be optimized away when alpha is 0 because that
can happen when the source has alpha zero and rgb non-zero.
Søren Sandmann Pedersen [Fri, 31 Jul 2009 21:27:38 +0000 (17:27 -0400)]
Add a dirty bit to the image struct, and validate before using the image.
This cuts down the number of property_changed calls significantly.
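The dirty-bit pattern can be sketched as follows. The struct layout, the setter names, and the `recompute_count` field are hypothetical and exist only to make the effect observable; they are not pixman's image struct.

```c
#include <stdbool.h>

typedef struct
{
    int  repeat;
    int  filter;
    bool dirty;            /* set when any property changes */
    int  recompute_count;  /* how often derived state was rebuilt */
} image_t;

static void
image_init (image_t *image)
{
    image->repeat = 0;
    image->filter = 0;
    image->dirty = true;   /* derived state has never been computed */
    image->recompute_count = 0;
}

/* Setters only record the change; they do no expensive work. */
static void
image_set_repeat (image_t *image, int repeat)
{
    image->repeat = repeat;
    image->dirty = true;
}

static void
image_set_filter (image_t *image, int filter)
{
    image->filter = filter;
    image->dirty = true;
}

/* Called before the image is used: rebuilds derived state at most
 * once, no matter how many setters ran since the last use. */
static void
image_validate (image_t *image)
{
    if (!image->dirty)
        return;

    image->recompute_count++;   /* stands in for property_changed() */
    image->dirty = false;
}
```

Several property changes followed by one use now cost a single recomputation instead of one per setter call.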
Søren Sandmann Pedersen [Fri, 31 Jul 2009 14:39:41 +0000 (10:39 -0400)]
Add sse2 version of add_n_8888_8888()
Søren Sandmann Pedersen [Fri, 31 Jul 2009 14:26:10 +0000 (10:26 -0400)]
Add a fast path for the add_n_8888_8888() operation.
It shows up on gnome-terminal traces.
Søren Sandmann Pedersen [Fri, 31 Jul 2009 11:29:31 +0000 (07:29 -0400)]
Move bounds checks for REPEAT_NONE to get_pixel()
On a P4, this is a large speedup for the swfdec-fill-rate-2xaa trace:
After:
[ # ]  backend  test                   min(s)  median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  33.061  33.061     0.00%    1/1
Before:
[ # ]  backend  test                   min(s)  median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  40.342  40.342     0.00%    1/1
Pixman 0.14.0 produces this:
[ # ]  backend  test                   min(s)  median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  36.896  36.896     0.00%    1/1
Søren Sandmann Pedersen [Thu, 30 Jul 2009 14:51:38 +0000 (10:51 -0400)]
Remove leftover 0xffffffff in repeat()
Søren Sandmann Pedersen [Thu, 30 Jul 2009 14:45:18 +0000 (10:45 -0400)]
Remove unused function
Søren Sandmann Pedersen [Thu, 30 Jul 2009 14:03:44 +0000 (10:03 -0400)]
Misc formatting
Søren Sandmann Pedersen [Thu, 30 Jul 2009 13:58:12 +0000 (09:58 -0400)]
Change all the fetch_pixels() functions to only fetch one pixel.
Søren Sandmann Pedersen [Tue, 28 Jul 2009 13:43:12 +0000 (09:43 -0400)]
Add fetch_pixel_raw_32 and fetch_pixel_32 virtual functions.
By default both are initialized to bits_image_fetch_pixel_raw(), but if
there is an alpha map, then fetch_pixel_32() is set to
bits_image_fetch_pixel_alpha().
Søren Sandmann Pedersen [Tue, 28 Jul 2009 13:12:51 +0000 (09:12 -0400)]
Various renamings and clean-ups
Søren Sandmann Pedersen [Tue, 28 Jul 2009 12:58:41 +0000 (08:58 -0400)]
Change bits_image_fetch_alpha_pixels() to fetch just one pixel.
Søren Sandmann Pedersen [Tue, 28 Jul 2009 12:44:40 +0000 (08:44 -0400)]
Change bits_image_fetch_pixels_convolution() to fetch just one pixel.
Søren Sandmann Pedersen [Tue, 28 Jul 2009 12:33:28 +0000 (08:33 -0400)]
Change bits_image_fetch_bilinear_pixels() to fetch one pixel at a time.
Søren Sandmann Pedersen [Tue, 28 Jul 2009 12:03:44 +0000 (08:03 -0400)]
Make the repeat routine work on only one coordinate at a time.
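A single-coordinate repeat routine might look like the sketch below. The enum, the function names, and the exact reflect formula are assumptions for illustration; the same function would be called once for x with the image width and once for y with the height.

```c
typedef enum
{
    REPEAT_NORMAL,   /* tile the image */
    REPEAT_PAD,      /* clamp to the nearest edge pixel */
    REPEAT_REFLECT   /* mirror at every edge */
} repeat_mode_t;

/* Modulo that always returns a value in [0, b). */
static int
positive_mod (int a, int b)
{
    int m = a % b;
    return m < 0 ? m + b : m;
}

/* Map one coordinate into [0, size) according to the repeat mode. */
static int
repeat_coord (repeat_mode_t mode, int c, int size)
{
    switch (mode)
    {
    case REPEAT_NORMAL:
        return positive_mod (c, size);

    case REPEAT_PAD:
        if (c < 0)
            return 0;
        if (c >= size)
            return size - 1;
        return c;

    case REPEAT_REFLECT:
        c = positive_mod (c, size * 2);
        return c >= size ? size * 2 - c - 1 : c;
    }
    return c;
}
```

Working on one coordinate at a time means the routine needs no coordinate buffer, which fits the move towards fetching one pixel at a time.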
Søren Sandmann Pedersen [Tue, 28 Jul 2009 11:55:27 +0000 (07:55 -0400)]
Make bits_image_fetch_nearest() return one pixel.
Previously it would work on a buffer of coordinates.
Søren Sandmann Pedersen [Tue, 28 Jul 2009 11:42:34 +0000 (07:42 -0400)]
Change bits_image_fetch_transformed() to work one pixel at a time.
Previously, it would generate a buffer of coordinates, then pass that
off to a pixel fetcher, but this caused a large performance regression
with the swfdec-fill-rate-2xfsaa cairo trace.
This is the first step towards fixing that.
Søren Sandmann Pedersen [Fri, 7 Aug 2009 04:11:20 +0000 (00:11 -0400)]
Only define PIXMAN_TIMERS if timers are actually enabled [bug 23169]
Søren Sandmann Pedersen [Tue, 28 Jul 2009 13:58:52 +0000 (09:58 -0400)]
Various updates to the CODING_STYLE document
Søren Sandmann Pedersen [Tue, 28 Jul 2009 08:05:26 +0000 (04:05 -0400)]
Add a CODING_STYLE document based on the one from cairo.
Søren Sandmann Pedersen [Wed, 22 Jul 2009 08:51:08 +0000 (04:51 -0400)]
Remove a couple of unused variables
Søren Sandmann Pedersen [Wed, 22 Jul 2009 08:32:07 +0000 (04:32 -0400)]
Rename source_pict_class_t to source_image_class_t
Søren Sandmann Pedersen [Wed, 22 Jul 2009 08:28:08 +0000 (04:28 -0400)]
Replace a bunch of 'pict's with 'image'
Chris Wilson [Fri, 24 Jul 2009 08:36:08 +0000 (09:36 +0100)]
Explain how we can simplify the radial gradient computation
Soeren rightfully complained that I had removed all the comments from
André's patch, most importantly those that explain why the transformation
is valid. So add a few details to show that B varies linearly across the
scanline and how we can therefore reduce the per-pixel cost of evaluating
B.
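The linearity argument can be sketched as below. The expression used for B and the parameter names are assumptions about the gradient's quadratic, and a step of exactly one unit in x per pixel (no transform) is assumed as well. What matters is only that B is affine in x, so it can be advanced by a constant per pixel.

```c
/* Hypothetical parameter block for a radial gradient:
 * (c1x, c1y, r1) is the first circle, (cdx, cdy, dr) the difference
 * between the second circle and the first. */
typedef struct
{
    double c1x, c1y, r1;
    double cdx, cdy, dr;
} radial_t;

/* Direct evaluation of B at pixel (x, y): three multiplications
 * per pixel. */
static double
b_at (const radial_t *g, double x, double y)
{
    return -2.0 * ((x - g->c1x) * g->cdx +
                   (y - g->c1y) * g->cdy +
                   g->r1 * g->dr);
}

/* Incremental evaluation along one scanline: B is computed once at
 * x = 0 and then advanced by the constant dB/dx = -2 * cdx. */
static void
fill_b_incremental (const radial_t *g, double y, int width, double *out)
{
    double b  = b_at (g, 0.0, y);
    double db = -2.0 * g->cdx;

    for (int x = 0; x < width; x++)
    {
        out[x] = b;
        b += db;
    }
}
```

The per-pixel cost of B drops from several multiplications to one addition; with floating point inputs that are not exactly representable, the incremental values may drift slightly from the direct ones over a long scanline.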
Chris Wilson [Thu, 23 Jul 2009 18:08:40 +0000 (19:08 +0100)]
Fix inversion of radial gradients when r2 > r1
Fixes: Bug 22908 -- Invalid output of radial gradient
http://bugs.freedesktop.org/show_bug.cgi?id=22908
We also include a modified patch by André Tupinambá <andrelrt@gmail.com>,
to pull constant expressions out of the inner radial gradient walker.
Benjamin Otte [Thu, 23 Jul 2009 07:54:49 +0000 (09:54 +0200)]
Don't warn for empty rectangles, only degenerate ones
Benjamin Otte [Tue, 21 Jul 2009 13:00:52 +0000 (15:00 +0200)]
Log errors for invalid rectangles passed to region code
Benjamin Otte [Tue, 21 Jul 2009 12:57:59 +0000 (14:57 +0200)]
Simplify code that logs errors
Benjamin Otte [Tue, 21 Jul 2009 12:50:30 +0000 (14:50 +0200)]
Make the text when reporting a broken region more useful