profile/ivi/pixman.git
14 years agoCompute src_format outside the fast path loop.
Søren Sandmann Pedersen [Sat, 12 Sep 2009 06:11:12 +0000 (02:11 -0400)]
Compute src_format outside the fast path loop.

Inside the loop all we have to do is check that the formats match.

14 years agoEliminate the NEED_COMPONENT_ALPHA flag.
Søren Sandmann Pedersen [Sat, 12 Sep 2009 05:53:54 +0000 (01:53 -0400)]
Eliminate the NEED_COMPONENT_ALPHA flag.

Instead introduce two new fake formats

PIXMAN_a8r8g8b8_ca
PIXMAN_a8b8g8r8_ca

that are used in the fast path tables for this case.

14 years agoEliminate the NEED_SOLID_MASK flag
Søren Sandmann Pedersen [Sat, 12 Sep 2009 05:35:56 +0000 (01:35 -0400)]
Eliminate the NEED_SOLID_MASK flag

This flag was used to indicate that the mask was solid while still
allowing a specific format to be required. However, there is not
actually any need for this because the fast paths all used
_pixman_image_get_solid() which already allowed arbitrary formats.

The one thing that had to be dealt with was component alpha. In
addition to interpreting the presence of the NEED_COMPONENT_ALPHA
flag, we now also interpret the *absence* of this flag as a
requirement that the mask does *not* have component alpha.

Siarhei Siamashka pointed out that the first version of this commit
had a bug, in which a NEED_SOLID_MASK was accidentally not turned into
a PIXMAN_solid in the ARM NEON implementation.

14 years agoUse the destination buffer directly in more cases instead of fetching.
Søren Sandmann Pedersen [Sat, 19 Sep 2009 10:14:38 +0000 (06:14 -0400)]
Use the destination buffer directly in more cases instead of fetching.

When the destination buffer is either a8r8g8b8 or x8r8g8b8, we can use
it directly instead of fetching into a temporary buffer. When the
format is x8r8g8b8, we require the operator to not make use of
destination alpha, but when it is a8r8g8b8, there are no restrictions.

This is approximately a 5% speedup on the poppler cairo benchmark:

[ # ]  backend                         test   min(s) median(s) stddev. count

Before:
[  0]    image                      poppler    6.661    6.709   0.59%    6/6

After:
[  0]    image                      poppler    6.307    6.320   0.12%    5/6

14 years agotest: Move image_endian_swap() from blitters-test.c to utils.[ch]
Søren Sandmann Pedersen [Tue, 10 Nov 2009 20:48:36 +0000 (15:48 -0500)]
test: Move image_endian_swap() from blitters-test.c to utils.[ch]

14 years agotest: Move random number generator from blitters/scaling-test to utils.[ch]
Søren Sandmann Pedersen [Tue, 10 Nov 2009 20:45:17 +0000 (15:45 -0500)]
test: Move random number generator from blitters/scaling-test to utils.[ch]

14 years agotest: In scaling-test use the crc32 from utils.c
Søren Sandmann Pedersen [Tue, 10 Nov 2009 20:32:12 +0000 (15:32 -0500)]
test: In scaling-test use the crc32 from utils.c

14 years agotest: Move CRC32 code from blitters-test to new files utils.[ch]
Søren Sandmann Pedersen [Tue, 10 Nov 2009 20:29:20 +0000 (15:29 -0500)]
test: Move CRC32 code from blitters-test to new files utils.[ch]

14 years agotest: Rename utils.[ch] to gtk-utils.[ch]
Søren Sandmann Pedersen [Tue, 10 Nov 2009 19:58:19 +0000 (14:58 -0500)]
test: Rename utils.[ch] to gtk-utils.[ch]

14 years agosse2: Add a fast path for OVER 8888 x 8 x 8888
Søren Sandmann Pedersen [Sun, 20 Sep 2009 21:37:36 +0000 (17:37 -0400)]
sse2: Add a fast path for OVER 8888 x 8 x 8888

This is a small speedup on the swfdec-youtube benchmark:

Before:
[  0]    image               swfdec-youtube    5.789    5.806   0.20%    6/6

After:
[  0]    image               swfdec-youtube    5.489    5.524   0.27%    6/6

I.e., approximately 5% faster.

14 years agoARM: enabled 'neon_composite_add_8000_8000' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:14:14 +0000 (17:14 +0200)]
ARM: enabled 'neon_composite_add_8000_8000' fast path

14 years agoARM: enabled 'neon_composite_add_8_8_8' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:13:31 +0000 (17:13 +0200)]
ARM: enabled 'neon_composite_add_8_8_8' fast path

14 years agoARM: enabled 'neon_composite_add_n_8_8' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:12:56 +0000 (17:12 +0200)]
ARM: enabled 'neon_composite_add_n_8_8' fast path

14 years agoARM: enabled 'neon_composite_over_8888_8888' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:12:14 +0000 (17:12 +0200)]
ARM: enabled 'neon_composite_over_8888_8888' fast path

14 years agoARM: enabled 'neon_composite_over_8888_0565' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:11:32 +0000 (17:11 +0200)]
ARM: enabled 'neon_composite_over_8888_0565' fast path

14 years agoARM: enabled 'neon_composite_over_8888_n_8888' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:10:55 +0000 (17:10 +0200)]
ARM: enabled 'neon_composite_over_8888_n_8888' fast path

14 years agoARM: enabled 'neon_composite_over_n_8_8888' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:10:09 +0000 (17:10 +0200)]
ARM: enabled 'neon_composite_over_n_8_8888' fast path

14 years agoARM: enabled 'neon_composite_over_n_8_0565' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:09:31 +0000 (17:09 +0200)]
ARM: enabled 'neon_composite_over_n_8_0565' fast path

14 years agoARM: enabled 'neon_composite_src_0888_0888' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:08:48 +0000 (17:08 +0200)]
ARM: enabled 'neon_composite_src_0888_0888' fast path

14 years agoARM: enabled 'neon_composite_src_8888_0565' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:08:09 +0000 (17:08 +0200)]
ARM: enabled 'neon_composite_src_8888_0565' fast path

14 years agoARM: enabled 'neon_composite_src_0565_0565' fast path
Siarhei Siamashka [Wed, 4 Nov 2009 15:07:36 +0000 (17:07 +0200)]
ARM: enabled 'neon_composite_src_0565_0565' fast path

14 years agoARM: added 'bindings' for NEON assembly optimized functions
Siarhei Siamashka [Wed, 4 Nov 2009 15:05:46 +0000 (17:05 +0200)]
ARM: added 'bindings' for NEON assembly optimized functions

These functions serve as 'adaptors', converting standard internal
pixman fast path function arguments into arguments expected
by assembly functions.

14 years agoARM: enabled new implementation for pixman_fill_neon
Siarhei Siamashka [Wed, 4 Nov 2009 13:29:27 +0000 (15:29 +0200)]
ARM: enabled new implementation for pixman_fill_neon

14 years agoARM: introduction of the new framework for NEON fast path optimizations
Siarhei Siamashka [Wed, 4 Nov 2009 13:18:38 +0000 (15:18 +0200)]
ARM: introduction of the new framework for NEON fast path optimizations

GNU assembler and its macro preprocessor are now used to generate
NEON optimized functions from a common template. This automatically
takes care of nuisances like ensuring optimal alignment, dealing with
leading/trailing pixels, doing prefetch, etc.

Implementations for a lot of compositing functions are also added,
but not enabled.

14 years agoARM: removed old ARM NEON optimizations
Siarhei Siamashka [Wed, 4 Nov 2009 12:25:27 +0000 (14:25 +0200)]
ARM: removed old ARM NEON optimizations

14 years agoDefine PIXMAN_USE_INTERNAL_API in pixman-private.h
Søren Sandmann Pedersen [Sat, 7 Nov 2009 19:47:22 +0000 (14:47 -0500)]
Define PIXMAN_USE_INTERNAL_API in pixman-private.h

Instead of mucking around with CFLAGS in configure.ac, preventing
users from setting their own CFLAGS, just define the
PIXMAN_USE_INTERNAL_API and PIXMAN_DISABLE_DEPRECATED in
pixman-private.h

14 years agoInclude <inttypes.h> when compiled with HP's C compiler.
Søren Sandmann Pedersen [Tue, 27 Oct 2009 13:11:28 +0000 (09:11 -0400)]
Include <inttypes.h> when compiled with HP's C compiler.

Fixes bug 23169.

14 years agoC fast path function for 'over_n_1_8888'
Siarhei Siamashka [Tue, 27 Oct 2009 10:25:13 +0000 (12:25 +0200)]
C fast path function for 'over_n_1_8888'

This function is needed to improve performance of xfce4 terminal.
Some other applications may potentially benefit too.

14 years agoC fast path function for 'add_1000_1000'
Siarhei Siamashka [Tue, 27 Oct 2009 10:11:05 +0000 (12:11 +0200)]
C fast path function for 'add_1000_1000'

This function is needed to improve performance of xfce4 terminal.
Some other applications may potentially benefit too.

14 years agoblitters-test updated to also randomly generate mask_x/mask_y
Siarhei Siamashka [Fri, 23 Oct 2009 17:56:30 +0000 (20:56 +0300)]
blitters-test updated to also randomly generate mask_x/mask_y

14 years agoAdd fast path scaled, bilinear fetcher.
André Tupinambá [Sun, 20 Sep 2009 03:01:50 +0000 (23:01 -0400)]
Add fast path scaled, bilinear fetcher.

This adds a bilinear fetcher for the case where the image has a scaled
transformation, does not repeat, and has the format {ax}8r8g8b8.

Results for the swfdec-youtube benchmark

Before:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image               swfdec-youtube    7.841    7.915   0.72%    6/6

After:

[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]    image               swfdec-youtube    6.677    6.780   0.94%    6/6

These results were measured on a faster machine than the ones in the
previous commit, so the numbers are not comparable.

Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>

14 years agoSpeed up bilinear interpolation.
André Tupinambá [Sat, 19 Sep 2009 13:32:37 +0000 (09:32 -0400)]
Speed up bilinear interpolation.

Speed up bilinear interpolation by processing more than one component
at a time on 64 bit architectures, and by precomputing the dist{ixiy}
products on 32 bit architectures.

Previously bilinear interpolation for one pixel would take 24
multiplications. With this improvement it takes 12 on 64 bit, and 20
on 32 bit.
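
A minimal sketch of the 64 bit idea, not pixman's actual code: two 8 bit
channels are placed 32 bits apart in a uint64_t so that one multiplication
by a weight scales both of them. It assumes a8r8g8b8 pixels and distx/disty
in the range 0..255, so that the four weights sum to 256 * 256. The names
bilinear_per_channel, bilinear_packed and two_lanes are hypothetical.

```c
#include <stdint.h>

/* Reference version: every channel is weighted separately
 * (4 channels x 4 multiplications, plus the weight products). */
static uint32_t
bilinear_per_channel (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                      int distx, int disty)
{
    uint32_t w_tl = (uint32_t) ((256 - distx) * (256 - disty));
    uint32_t w_tr = (uint32_t) (distx * (256 - disty));
    uint32_t w_bl = (uint32_t) ((256 - distx) * disty);
    uint32_t w_br = (uint32_t) (distx * disty);
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t f = ((tl >> shift) & 0xff) * w_tl +
                     ((tr >> shift) & 0xff) * w_tr +
                     ((bl >> shift) & 0xff) * w_bl +
                     ((br >> shift) & 0xff) * w_br;

        result |= ((f >> 16) & 0xff) << shift;
    }
    return result;
}

/* Put one channel in bits 0..7 and another in bits 32..39. Each weighted
 * sum is at most 0xff * 0x10000, so the low lane never carries into the
 * high lane. */
static uint64_t
two_lanes (uint32_t pixel, int lo_shift, int hi_shift)
{
    return (uint64_t) ((pixel >> lo_shift) & 0xff) |
           ((uint64_t) ((pixel >> hi_shift) & 0xff) << 32);
}

/* Packed version: 2 lane pairs x 4 multiplications, plus the weights. */
static uint32_t
bilinear_packed (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                 int distx, int disty)
{
    uint64_t w_tl = (uint64_t) ((256 - distx) * (256 - disty));
    uint64_t w_tr = (uint64_t) (distx * (256 - disty));
    uint64_t w_bl = (uint64_t) ((256 - distx) * disty);
    uint64_t w_br = (uint64_t) (distx * disty);

    /* Blue (bits 0..7) and green (bits 8..15) share one accumulator. */
    uint64_t f_gb = two_lanes (tl, 0, 8) * w_tl + two_lanes (tr, 0, 8) * w_tr +
                    two_lanes (bl, 0, 8) * w_bl + two_lanes (br, 0, 8) * w_br;

    /* Red (bits 16..23) and alpha (bits 24..31) share the other. */
    uint64_t f_ar = two_lanes (tl, 16, 24) * w_tl + two_lanes (tr, 16, 24) * w_tr +
                    two_lanes (bl, 16, 24) * w_bl + two_lanes (br, 16, 24) * w_br;

    return (uint32_t) (((f_gb >> 16) & 0xff) |
                       (((f_gb >> 48) & 0xff) << 8) |
                       (((f_ar >> 16) & 0xff) << 16) |
                       (((f_ar >> 48) & 0xff) << 24));
}
```

In this sketch the packed version performs eight pixel multiplications
instead of sixteen; together with the four weight products that gives the
12 multiplications mentioned above for 64 bit.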

This is a small but consistent speedup on the swfdec-youtube
benchmark:

[ # ]  backend                         test   min(s) median(s) stddev. count
Before:
[  0]    image               swfdec-youtube   18.010   18.020   0.09%    4/5

After:
[  0]    image               swfdec-youtube   17.488   17.584   0.22%    5/6

Signed-off-by: Søren Sandmann Pedersen <sandmann@redhat.com>

14 years agoExtend scaling-test to also test bilinear filtering.
Søren Sandmann Pedersen [Sun, 27 Sep 2009 13:41:25 +0000 (09:41 -0400)]
Extend scaling-test to also test bilinear filtering.

14 years agoThis is not a GNU project, so declare it foreign.
Jeremy Huddleston [Wed, 21 Oct 2009 19:47:27 +0000 (12:47 -0700)]
This is not a GNU project, so declare it foreign.

On Wed, 2009-10-21 at 13:36 +1000, Peter Hutterer wrote:
> On Tue, Oct 20, 2009 at 08:23:55PM -0700, Jeremy Huddleston wrote:
> > I noticed an INSTALL file in xlsclients and libXvMC today, and it
> > was quite annoying to work around since 'autoreconf -fvi' replaces
> > it and git wants to commit it.  Should these files even be in git?
> > Can I nuke them for the betterment of humanity and since they get
> > created by autoreconf anyways?
>
> See https://bugs.freedesktop.org/show_bug.cgi?id=24206

As an interim measure, replace AM_INIT_AUTOMAKE([dist-bzip2]) with
AM_INIT_AUTOMAKE([foreign dist-bzip2]). This will prevent the generation
of the INSTALL file. It is also part of the 24206 solution.

Signed-off-by: Jeremy Huddleston <jeremyhu@freedesktop.org>

14 years agoMake walk_region_internal() use 32 bit dimensions
Søren Sandmann Pedersen [Tue, 20 Oct 2009 00:32:37 +0000 (20:32 -0400)]
Make walk_region_internal() use 32 bit dimensions

14 years agoMake pixman_compute_composite_region32() use 32 bit dimensions
Søren Sandmann Pedersen [Tue, 20 Oct 2009 00:31:54 +0000 (20:31 -0400)]
Make pixman_compute_composite_region32() use 32 bit dimensions

14 years agoChange prototype of _pixman_walk_composite_region from int16_t to int32_t
Søren Sandmann Pedersen [Tue, 20 Oct 2009 00:30:22 +0000 (20:30 -0400)]
Change prototype of _pixman_walk_composite_region from int16_t to int32_t

14 years agoRemove unused color_table and color_table_size fields
Søren Sandmann Pedersen [Tue, 20 Oct 2009 00:27:36 +0000 (20:27 -0400)]
Remove unused color_table and color_table_size fields

14 years agoRemove BOUNDS() macro.
Søren Sandmann Pedersen [Sun, 18 Oct 2009 07:02:28 +0000 (03:02 -0400)]
Remove BOUNDS() macro.

It was bounding the clip region to INT16_MIN, INT16_MAX, but this was
a relic from the X server. We don't need it since we are already
restricting the clip region to the geometry of the destination.

14 years ago--enable-maintainer-mode is gone from configure, so remove it
Benjamin Otte [Wed, 30 Sep 2009 06:02:39 +0000 (08:02 +0200)]
--enable-maintainer-mode is gone from configure, so remove it

14 years agoAdd default cases for all switch statements
Benjamin Otte [Thu, 17 Sep 2009 11:19:04 +0000 (13:19 +0200)]
Add default cases for all switch statements

Fixes compilation with -Wswitch-default. Compilation with -Wswitch-enum
works fine as is.

14 years agoFix compile warnings
Benjamin Otte [Thu, 17 Sep 2009 11:18:22 +0000 (13:18 +0200)]
Fix compile warnings

14 years agoARM: Removal of unused/broken NEON code
Siarhei Siamashka [Sun, 26 Jul 2009 22:21:26 +0000 (01:21 +0300)]
ARM: Removal of unused/broken NEON code

14 years agoFix double semicolon; pointed out by Travis Griggs
Søren Sandmann Pedersen [Thu, 8 Oct 2009 17:01:27 +0000 (13:01 -0400)]
Fix double semicolon; pointed out by Travis Griggs

14 years agoFix build with Visual Studio 2008
Gerdus van Zyl [Tue, 29 Sep 2009 10:28:03 +0000 (12:28 +0200)]
Fix build with Visual Studio 2008

Moved the __m64 ms declaration in sse2_composite_over_x888_8_8888 to the top
of the function so that it compiles with Visual Studio 2008.

14 years agoFix composite on big-endian systems.
Andrea Canciani [Sun, 27 Sep 2009 09:40:52 +0000 (11:40 +0200)]
Fix composite on big-endian systems.

Data narrower than 32bpp is padded to an unsigned long and on
big-endian systems this shifts the value by the padding bits.

14 years agoFix fetch-test for big-endian systems.
Søren Sandmann Pedersen [Sat, 26 Sep 2009 17:12:14 +0000 (13:12 -0400)]
Fix fetch-test for big-endian systems.

Data narrower than 32bpp should be stored in the correct
endianness. Reported by Andrea Canciani.

14 years agoAdd missing break in composite.c
Søren Sandmann Pedersen [Thu, 24 Sep 2009 12:57:26 +0000 (08:57 -0400)]
Add missing break in composite.c

14 years agopixman: Update .gitignore
Guillem Jover [Tue, 22 Sep 2009 17:51:13 +0000 (19:51 +0200)]
pixman: Update .gitignore

Generalize to catch all .pc files. Add more tests.

Signed-off-by: Guillem Jover <guillem@hadrons.org>

14 years agoIn the compositing test, don't try to use component alpha with solid fills.
Søren Sandmann Pedersen [Thu, 24 Sep 2009 12:10:00 +0000 (08:10 -0400)]
In the compositing test, don't try to use component alpha with solid fills.

It's not supported yet.

14 years agoUpdate CRC value in blitters-test for the new bug fixes
Søren Sandmann Pedersen [Fri, 18 Sep 2009 15:33:18 +0000 (11:33 -0400)]
Update CRC value in blitters-test for the new bug fixes

14 years agoFix bug in blitters-test with BGRA formats.
Søren Sandmann Pedersen [Fri, 18 Sep 2009 12:16:56 +0000 (08:16 -0400)]
Fix bug in blitters-test with BGRA formats.

When masking out the x bits, blitters-test would make the incorrect
assumption that they were always in the topmost position. This is
not correct for formats of type PIXMAN_TYPE_BGRA.
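
A minimal sketch of the distinction, assuming a pixel of at most 32 bits
whose unused ("x") bits sit either above or below the used bits. The
function mask_unused_bits and its parameters are hypothetical and are not
taken from blitters-test.

```c
#include <stdint.h>

/* Clear the unused ("x") bits of a pixel.
 *   bpp              - bits per pixel (at most 32)
 *   depth            - number of bits that carry colour or alpha
 *   x_bits_at_bottom - nonzero for BGRA-style layouts, where the padding
 *                      occupies the low bits instead of the high bits */
static uint32_t
mask_unused_bits (uint32_t pixel, int bpp, int depth, int x_bits_at_bottom)
{
    uint32_t used = (depth >= 32) ? 0xffffffffu : ((1u << depth) - 1);

    if (x_bits_at_bottom)
        used <<= (bpp - depth);

    return pixel & used;
}
```

For a 32 bpp format with 24 used bits, the first layout keeps the low 24
bits and the BGRA-style layout keeps the high 24 bits.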

14 years agoFix bugs in fetch_*_b2g3r3().
Søren Sandmann Pedersen [Fri, 18 Sep 2009 13:43:14 +0000 (09:43 -0400)]
Fix bugs in fetch_*_b2g3r3().

The red channel should only be shifted five positions, not six.

14 years agoFix bugs in a1b2g1r1.
Søren Sandmann Pedersen [Thu, 24 Sep 2009 11:48:46 +0000 (07:48 -0400)]
Fix bugs in a1b2g1r1.

The first bug is that it is treating the input as if it were a1r1g1b1;
the second one is that the red channel should only be shifted two
bits, not three.

14 years agoFix shift bug in fetch_scanline/pixel_a2b2g2r2()
Søren Sandmann Pedersen [Fri, 18 Sep 2009 12:48:04 +0000 (08:48 -0400)]
Fix shift bug in fetch_scanline/pixel_a2b2g2r2()

0x30 * 0x55 is 0xff0, so the red channel should be shifted four bits,
not six.
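
The arithmetic can be checked with a small sketch. It assumes only what
the message states: a 2 bit field under the mask 0x30 is expanded by
multiplying by 0x55 and then shifted down. The function name
expand_field_0x30 is hypothetical.

```c
#include <stdint.h>

/* Expand the 2-bit field under mask 0x30 to a full 8-bit value.
 * Multiplying a 2-bit value by 0x55 replicates it four times; the
 * field starts at bit 4, so the product has to be shifted down by 4. */
static uint32_t
expand_field_0x30 (uint32_t p)
{
    return ((p & 0x30) * 0x55) >> 4;
}
```

With a shift of six the largest possible result would be 0xff0 >> 6 =
0x3f rather than 0xff.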

14 years agoFix four bit formats.
Søren Sandmann Pedersen [Fri, 18 Sep 2009 12:13:46 +0000 (08:13 -0400)]
Fix four bit formats.

The original Render code used to index pixels with their position in
bits in the image. When the scanline code was introduced pixels were
indexed in bytes, but the FETCH/STORE_4/8 macros still assumed bits.

This commit fixes that by making the FETCH/STORE_4 macros first
convert the index to bit position.
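
A minimal sketch of the conversion, written as functions rather than
macros. It assumes a nibble order in which the even pixel occupies the low
four bits of each byte; the real macros also have to handle the opposite
order. The names fetch_4 and store_4 here are illustrative only.

```c
#include <stdint.h>

/* Read the 4-bit pixel at `index` from a scanline of packed nibbles. */
static uint8_t
fetch_4 (const uint8_t *line, int index)
{
    int bit_pos = 4 * index;             /* pixel index -> bit position */
    uint8_t byte = line[bit_pos >> 3];   /* bit position -> byte */

    return (bit_pos & 4) ? (uint8_t) (byte >> 4) : (uint8_t) (byte & 0x0f);
}

/* Write the 4-bit pixel at `index`, leaving the neighbouring nibble alone. */
static void
store_4 (uint8_t *line, int index, uint8_t value)
{
    int bit_pos = 4 * index;
    uint8_t *byte = &line[bit_pos >> 3];

    if (bit_pos & 4)
        *byte = (uint8_t) ((*byte & 0x0f) | ((value & 0x0f) << 4));
    else
        *byte = (uint8_t) ((*byte & 0xf0) | (value & 0x0f));
}
```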

14 years agoHide PIXMAN_OP_NONE and PIXMAN_N_OPERATORS behind PIXMAN_INTERNAL_API.
Søren Sandmann Pedersen [Sun, 20 Sep 2009 20:50:37 +0000 (16:50 -0400)]
Hide PIXMAN_OP_NONE and PIXMAN_N_OPERATORS behind PIXMAN_INTERNAL_API.

These cannot sanely be used by applications since they may change in
new versions.

14 years agoAdd a few notes about testing to TODO
Søren Sandmann Pedersen [Fri, 18 Sep 2009 12:06:32 +0000 (08:06 -0400)]
Add a few notes about testing to TODO

14 years agoFix alpha handling for 10 bpc formats.
Søren Sandmann Pedersen [Fri, 18 Sep 2009 13:11:04 +0000 (09:11 -0400)]
Fix alpha handling for 10 bpc formats.

These generally extracted the 2 bits of alpha, then shifted them 62
bits and replicated across 16 bits. Then they were shifted another 48
bits, making the resulting alpha channel 0.
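
A sketch of what a correct expansion can look like, not the code from this
commit. It assumes the 2 bits of alpha are the top bits of a 32 bit pixel
and that the widened alpha belongs in the top 16 bits of a 64 bit pixel;
the function name wide_alpha_from_2_bits is hypothetical.

```c
#include <stdint.h>

/* Widen 2 bits of alpha to 16 bits by replication and place the result
 * in bits 48..63 of a 64-bit pixel. */
static uint64_t
wide_alpha_from_2_bits (uint32_t p)
{
    uint64_t a = p >> 30;   /* 2 bits of alpha: 0..3 */

    a <<= 14;               /* move them to the top of a 16-bit field */
    a |= a >> 2;            /* replicate: 2 -> 4 bits */
    a |= a >> 4;            /* 4 -> 8 bits */
    a |= a >> 8;            /* 8 -> 16 bits */

    return a << 48;         /* one shift into the alpha position */
}
```

Full alpha (binary 11) becomes 0xffff and the value stays inside the 64
bit word because only a single shift by 48 is applied.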

14 years agoReturn result from pixman_image_set_transform().
Søren Sandmann Pedersen [Thu, 24 Sep 2009 09:22:33 +0000 (05:22 -0400)]
Return result from pixman_image_set_transform().

Previously it would always return TRUE, even when malloc() had failed.

14 years agoRevert "Enable component alpha on solid masks."
Søren Sandmann Pedersen [Tue, 15 Sep 2009 11:43:23 +0000 (07:43 -0400)]
Revert "Enable component alpha on solid masks."

For consistency we will probably want to allow component alpha to be
set on all masks at some point, but this commit only enabled it for
solid images.

This reverts commit 29e22cf38e8abc54b9dddbdeb3909d02866a82a0.

14 years ago[Makefile] Set the SIMD specific CFLAGS for inspecting asm.
Chris Wilson [Tue, 15 Sep 2009 12:16:17 +0000 (13:16 +0100)]
[Makefile] Set the SIMD specific CFLAGS for inspecting asm.

14 years agoRemove optimization for 0xffffffff and 0xff in the add_n_8888_8888_ca fast path
Søren Sandmann Pedersen [Mon, 14 Sep 2009 22:48:32 +0000 (18:48 -0400)]
Remove optimization for 0xffffffff and 0xff in the add_n_8888_8888_ca fast path

This is an ADD operation, not an OVER. Fixes bug 23934, reported by
Siarhei Siamashka.

14 years agoDon't prefetch from NULL in the SSE2 fast paths.
M Joonas Pihlaja [Mon, 14 Sep 2009 19:52:29 +0000 (22:52 +0300)]
Don't prefetch from NULL in the SSE2 fast paths.

On an Athlon64 box prefetch from NULL slows down
the rgba OVER rgba fast path for predominantly solid sources
by up to 3.5x in the one-rounded-rectangle test case
when run using a tiling polygon renderer.  This patch
conditionalises the prefetches of the mask everywhere
where the mask pointer may be NULL in a fast path.

14 years agoReformat test/composite.c to follow the standard coding style.
Søren Sandmann Pedersen [Mon, 14 Sep 2009 10:58:03 +0000 (06:58 -0400)]
Reformat test/composite.c to follow the standard coding style.

14 years ago[test] Exercise repeating patterns for composite.
Chris Wilson [Sun, 13 Sep 2009 17:02:10 +0000 (18:02 +0100)]
[test] Exercise repeating patterns for composite.

14 years ago[build] Add rule to generate asm for inspection.
Chris Wilson [Sun, 13 Sep 2009 14:04:30 +0000 (15:04 +0100)]
[build] Add rule to generate asm for inspection.

14 years ago[sse2] Don't emit prefetch 0 for an absent mask
Chris Wilson [Sun, 13 Sep 2009 14:04:54 +0000 (15:04 +0100)]
[sse2] Don't emit prefetch 0 for an absent mask

14 years ago[test] Add composite test from rendercheck
Chris Wilson [Sun, 13 Sep 2009 14:07:08 +0000 (15:07 +0100)]
[test] Add composite test from rendercheck

Iterate over all destination formats for dst, src and composite and
compare the result of all operators with a selection of colours.

14 years agobuild: Suppress verbose compile lines
Chris Wilson [Thu, 27 Aug 2009 08:19:14 +0000 (09:19 +0100)]
build: Suppress verbose compile lines

Compile warnings are being lost in the sea of noise. Automake-1.11 finally
introduced AM_SILENT_RULES to suppress the echoing of the compile line for
every object. Enable this to bring sanity to the pixman build.

14 years agoMerge branch '0.16'
Chris Wilson [Sun, 13 Sep 2009 15:32:27 +0000 (16:32 +0100)]
Merge branch '0.16'

Conflicts:
configure.ac
pixman/pixman-sse2.c

14 years agoRemove duplicated declaration
Chris Wilson [Sun, 16 Aug 2009 11:16:46 +0000 (12:16 +0100)]
Remove duplicated declaration

The pixman_tranform_pixman_f_transform() declaration is repeated 4 lines
down.

14 years agoEnable component alpha on solid masks.
Chris Wilson [Sun, 13 Sep 2009 15:26:29 +0000 (16:26 +0100)]
Enable component alpha on solid masks.

14 years ago[sse2] Bit-reversing typo: src != dst
Chris Wilson [Sun, 13 Sep 2009 15:26:52 +0000 (16:26 +0100)]
[sse2] Bit-reversing typo: src != dst

14 years agoFix off-by-one error in source_image_needs_out_of_bounds_workaround()
Søren Sandmann Pedersen [Fri, 11 Sep 2009 01:33:24 +0000 (21:33 -0400)]
Fix off-by-one error in source_image_needs_out_of_bounds_workaround()

If extents->x2/y2 are equal to image->width/height, then the clip is
still inside the drawable, so no workaround is necessary.
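
A minimal sketch of the boundary condition, assuming half-open extents in
which x2 and y2 are exclusive. The type box_t and the function
extents_inside_image are hypothetical stand-ins, not the pixman
definitions.

```c
/* A rectangle covering x1 <= x < x2 and y1 <= y < y2. */
typedef struct
{
    int x1, y1, x2, y2;
} box_t;

/* Returns nonzero when the extents lie entirely inside the image.
 * Because x2/y2 are exclusive, being equal to width/height still
 * counts as inside, hence <= rather than <. */
static int
extents_inside_image (const box_t *extents, int width, int height)
{
    return extents->x1 >= 0 && extents->y1 >= 0 &&
           extents->x2 <= width && extents->y2 <= height;
}
```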

14 years agoRemove unused generated libcomp.pc #23801
Gaetan Nadon [Wed, 9 Sep 2009 00:06:19 +0000 (20:06 -0400)]
Remove unused generated libcomp.pc #23801

14 years agoChange CFLAGS order for PPC and ARM configure tests
Siarhei Siamashka [Fri, 4 Sep 2009 11:14:00 +0000 (14:14 +0300)]
Change CFLAGS order for PPC and ARM configure tests

CFLAGS are always appended to the end of gcc options when compiling
sources in autotools based projects. Configure tests should do the
same. Otherwise build fails on PPC when using CFLAGS="-O2 -mno-altivec"
for example. Similar problem affects ARM.

14 years agoARM: Remove fallback to ARMv6 implementation from NEON delegate chain
Siarhei Siamashka [Wed, 2 Sep 2009 16:46:47 +0000 (19:46 +0300)]
ARM: Remove fallback to ARMv6 implementation from NEON delegate chain

This can help to fix build problems with '-mthumb' gcc option in CFLAGS.
ARMv6 optimized code can't be compiled for thumb (because of its inline
assembly) and gets automatically disabled in configure. Reference
to it from NEON optimized code resulted in linking problems.

Every ARMv6 optimized fast path function also has a better NEON
counterpart, so there is no need to fall back to ARMv6. A shorter
delegate chain should additionally result in slightly better performance.

14 years agoDefault to optimised builds when using a Sun Studio compiler.
M Joonas Pihlaja [Mon, 31 Aug 2009 22:02:53 +0000 (23:02 +0100)]
Default to optimised builds when using a Sun Studio compiler.

Autoconf's AC_PROG_CC sets the default CFLAGS to -O2 -g for
gcc and -g for every other compiler.  This patch defaults
CFLAGS to the equivalent -O -g when we're using Sun Studio's cc
if the user or site admin hasn't already set CFLAGS.

14 years agoWork around a Sun Studio 12 code generation bug involving _mm_set_epi32().
M Joonas Pihlaja [Mon, 31 Aug 2009 19:27:32 +0000 (20:27 +0100)]
Work around a Sun Studio 12 code generation bug involving _mm_set_epi32().

Calling a static function wrapper around _mm_set_epi32() when not
using optimisation causes Sun Studio 12's cc to emit a spurious
floating point load which confuses the assembler.  Using a macro wrapper
rather than a function steps around the problem.

14 years agoWork around differing _mm_prefetch() prototypes on Solaris.
M Joonas Pihlaja [Mon, 31 Aug 2009 19:24:04 +0000 (20:24 +0100)]
Work around differing _mm_prefetch() prototypes on Solaris.

Sun Studio 12 expects the address to prefetch to be
a const char pointer rather than a __m128i pointer or
void pointer.

14 years agoARM: workaround for gcc bug in vshll_n_u8 intrinsic
Siarhei Siamashka [Fri, 28 Aug 2009 19:34:21 +0000 (22:34 +0300)]
ARM: workaround for gcc bug in vshll_n_u8 intrinsic

Some versions of gcc (cs2009q1, 4.4.1) incorrectly reject a
shift operand with a value >= 8, claiming that it is out of
range. Inline assembly is therefore used as a workaround.

14 years agoEnable the x888_8_8888 sse2 fast path.
Søren Sandmann Pedersen [Tue, 2 Jun 2009 12:27:33 +0000 (08:27 -0400)]
Enable the x888_8_8888 sse2 fast path.

14 years agoSet version number to 0.16.1
Søren Sandmann Pedersen [Wed, 2 Sep 2009 20:09:32 +0000 (16:09 -0400)]
Set version number to 0.16.1

14 years agoAdd CPU detection for VC++ x64
Makoto Kato [Tue, 1 Sep 2009 01:59:05 +0000 (10:59 +0900)]
Add CPU detection for VC++ x64

VC++ x64 has no inline assembler, and x64 mode always supports SSE2,
so it is unnecessary to call cpuid.

14 years agoChange names of add_8888_8_8 fast paths to add_n_8_8
Søren Sandmann Pedersen [Tue, 1 Sep 2009 12:23:23 +0000 (08:23 -0400)]
Change names of add_8888_8_8 fast paths to add_n_8_8

The source is solid in those.

14 years agoPost-release version bump
Søren Sandmann Pedersen [Fri, 28 Aug 2009 12:14:04 +0000 (08:14 -0400)]
Post-release version bump

14 years agoPre-release version bump
Søren Sandmann Pedersen [Fri, 28 Aug 2009 11:55:30 +0000 (07:55 -0400)]
Pre-release version bump

14 years ago_pixman_run_fast_path: typo
Chris Wilson [Fri, 28 Aug 2009 10:31:06 +0000 (06:31 -0400)]
_pixman_run_fast_path: typo

This is one example of a compiler warning that was lost amid the build
noise.

The error here is that in a list of required conditions we used ';'
instead of '&&' with the result of continuing to use the fast-path
even if we had a wide mask.

Another error is that it was testing src, not mask as it should.

14 years agoRemove spurious spaces in pixman-x64-mmx-emulation.h
Makoto Kato [Fri, 28 Aug 2009 08:09:15 +0000 (04:09 -0400)]
Remove spurious spaces in pixman-x64-mmx-emulation.h

14 years agoCheck if we have posix_memalign() in configure.ac. [23260, 23261]
Søren Sandmann Pedersen [Wed, 12 Aug 2009 18:08:58 +0000 (14:08 -0400)]
Check if we have posix_memalign() in configure.ac. [23260, 23261]

Fall back to malloc() in blitters-test.c if we don't.
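
A sketch of such a fallback, assuming that configure defines
HAVE_POSIX_MEMALIGN when the function is found (the usual Autoconf naming
convention; the macro actually used by this commit is not shown here). The
helper name aligned_malloc is hypothetical.

```c
#include <stdlib.h>

/* Allocate `size` bytes, aligned to `align` when posix_memalign() is
 * available; otherwise fall back to plain malloc(). */
static void *
aligned_malloc (size_t align, size_t size)
{
#ifdef HAVE_POSIX_MEMALIGN
    void *ptr = NULL;

    if (posix_memalign (&ptr, align, size) != 0)
        return NULL;
    return ptr;
#else
    (void) align;   /* no alignment guarantee beyond malloc's */
    return malloc (size);
#endif
}
```

Either way the result can be released with free().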

14 years agoARM: a fix to pass blitters-test for 'neon_composite_over_n_8_0565'
Siarhei Siamashka [Wed, 12 Aug 2009 17:22:24 +0000 (20:22 +0300)]
ARM: a fix to pass blitters-test for 'neon_composite_over_n_8_0565'

Inline assembly for handling widths of fewer than 8 pixels did not pass
blitters-test. Fortunately gcc has no problems compiling the alternative
implementation, which uses RVCT-style intrinsics, so it can be used instead.

14 years agoPost-release version bump
Søren Sandmann Pedersen [Tue, 11 Aug 2009 18:03:24 +0000 (14:03 -0400)]
Post-release version bump