Søren Sandmann Pedersen [Wed, 4 Aug 2010 21:55:14 +0000 (17:55 -0400)]
Add alpha-loop test program
This tests what happens if you attempt to make an image with an alpha
map that has the image as its alpha map. This results in an infinite
loop in _pixman_image_validate(), so the test sets up a SIGALRM to
exit if it runs for more than five seconds.
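A minimal sketch of such a watchdog, assuming a POSIX system. The names on_alarm() and install_timeout(), and the exit status used, are hypothetical and not taken from the test program itself:

```c
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical handler: leave the process as soon as the alarm fires.
 * _exit() is async-signal-safe, unlike exit(). The status value 1 is
 * an assumption made for this sketch. */
static void
on_alarm (int signum)
{
    (void) signum;
    _exit (1);
}

/* Arrange for the process to terminate after `seconds` seconds.
 * Returns 0 on success, -1 if the handler could not be installed. */
static int
install_timeout (unsigned int seconds)
{
    assert (seconds > 0);

    if (signal (SIGALRM, on_alarm) == SIG_ERR)
        return -1;

    alarm (seconds);
    return 0;
}
```

Calling install_timeout (5) before building the self-referencing alpha map bounds the run time even if validation never returns.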
Siarhei Siamashka [Mon, 31 May 2010 16:24:43 +0000 (19:24 +0300)]
ARM: 'neon_combine_out_reverse_u' combiner
This operation was seen in mozilla browser profiling logs.
Implemented so that 'over' and 'out_reverse' operations
now reuse common parts of code.
Siarhei Siamashka [Fri, 19 Mar 2010 10:21:32 +0000 (12:21 +0200)]
Code simplification (no need to advance 'vx' at the end of scanline)
Søren Sandmann Pedersen [Fri, 2 Jul 2010 18:14:21 +0000 (14:14 -0400)]
Store the various bits image fetchers in a table with formats and flags.
Similarly to how the fast paths are done, put the various bits_image
fetchers in a table, so that we can quickly find the best one based on
the image's flags and format.
Søren Sandmann Pedersen [Fri, 2 Jul 2010 16:53:56 +0000 (12:53 -0400)]
Add some new FAST_PATH flags
The flags are:
* AFFINE_TRANSFORM, for affine transforms
* Y_UNIT_ZERO, for when the [1][0] entry of the transformation matrix is zero
* FILTER_BILINEAR, for when the image has a bilinear filter
* NO_NORMAL_REPEAT, for when the repeat mode is not NORMAL
* HAS_TRANSFORM, for when the transform is not NULL
Also add some new FAST_PATH_REPEAT_* macros. These are just shorthands
for the image not having any of the other repeat modes. For example
REPEAT_NORMAL is (NO_NONE | NO_PAD | NO_REFLECT).
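As a sketch of how such a shorthand composes, assuming illustrative bit positions and hypothetical helper names (repeat_flags(), flags_match() and the repeat_t enum are not pixman API, and the real flag values may differ):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only. */
#define FAST_PATH_NO_NONE_REPEAT     (1u << 0)
#define FAST_PATH_NO_PAD_REPEAT      (1u << 1)
#define FAST_PATH_NO_REFLECT_REPEAT  (1u << 2)
#define FAST_PATH_NO_NORMAL_REPEAT   (1u << 3)

/* "The repeat mode is NORMAL" is spelled "it is none of the others". */
#define FAST_PATH_REPEAT_NORMAL                                  \
    (FAST_PATH_NO_NONE_REPEAT | FAST_PATH_NO_PAD_REPEAT |        \
     FAST_PATH_NO_REFLECT_REPEAT)

typedef enum { REPEAT_NONE, REPEAT_NORMAL, REPEAT_PAD, REPEAT_REFLECT } repeat_t;

/* Flags an image with the given repeat mode would carry: every NO_*
 * flag is set except the one that names its own mode. */
static uint32_t
repeat_flags (repeat_t repeat)
{
    const uint32_t all =
        FAST_PATH_NO_NONE_REPEAT | FAST_PATH_NO_PAD_REPEAT |
        FAST_PATH_NO_REFLECT_REPEAT | FAST_PATH_NO_NORMAL_REPEAT;

    switch (repeat)
    {
    case REPEAT_NONE:    return all & ~FAST_PATH_NO_NONE_REPEAT;
    case REPEAT_NORMAL:  return all & ~FAST_PATH_NO_NORMAL_REPEAT;
    case REPEAT_PAD:     return all & ~FAST_PATH_NO_PAD_REPEAT;
    case REPEAT_REFLECT: return all & ~FAST_PATH_NO_REFLECT_REPEAT;
    }

    assert (0 && "unknown repeat mode");
    return 0;
}

/* A fast path applies when the image has every flag it requires. */
static int
flags_match (uint32_t image_flags, uint32_t required)
{
    return (image_flags & required) == required;
}
```

Expressing the modes negatively means one bitwise test can accept several repeat modes at once, for example "anything but NORMAL".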
Søren Sandmann Pedersen [Fri, 2 Jul 2010 16:45:44 +0000 (12:45 -0400)]
Remove "_raw_" from all the accessors.
There are no non-raw accessors anymore.
Søren Sandmann Pedersen [Fri, 2 Jul 2010 16:34:42 +0000 (12:34 -0400)]
Eliminate the store_scanline_{32,64} function pointers.
Now that we can't recurse on alpha maps, they are not needed anymore.
Søren Sandmann Pedersen [Fri, 2 Jul 2010 16:31:50 +0000 (12:31 -0400)]
Split bits_image_fetch_transformed() into two functions.
One function deals with the common affine, no-alpha-map case. The
other deals with perspective transformations and alpha maps.
Søren Sandmann Pedersen [Fri, 2 Jul 2010 16:11:44 +0000 (12:11 -0400)]
Eliminate get_pixel_32() and get_pixel_64() from bits_image.
These functions can simply be passed as arguments to the various pixel
fetchers. We don't need to store them. Since they are known at compile
time and the pixel fetchers are force_inline, this is not a
performance issue.
Also temporarily make all pixel access go through the alpha path.
Søren Sandmann Pedersen [Fri, 2 Jul 2010 15:58:23 +0000 (11:58 -0400)]
Eliminate recursion from alpha map code
Alpha maps with alpha maps are no longer supported. It's not a useful
feature and it could lead to infinite recursion.
Søren Sandmann Pedersen [Thu, 22 Jul 2010 08:27:45 +0000 (04:27 -0400)]
Replace compute_src_extent_flags() with analyze_extents()
This commit fixes two separate problems: 1. Incorrect computation of
the FAST_PATH_SAMPLES_COVER_CLIP flag, and 2. FAST_PATH_16BIT_SAFE is
a nonsensical thing to compute.
== 1. Incorrect computation of SAMPLES_COVER_CLIP:
Previously we were using pixman_transform_bounds() to compute which
source samples would be used for a composite operation. This is
incorrect for several reasons:
(a) pixman_transform_bounds() is transforming the integer bounding box
of the destination samples, where it should be transforming the
bounding box of the samples themselves. In other words, it is too
pessimistic in some cases.
(b) pixman_transform_bounds() is not rounding the same way as we do
during sampling. For example, for a NEAREST filter we subtract
pixman_fixed_e before rounding off to the nearest sample so that a
transformed value of 1 will round to the sample at 0.5 and not to the
one at 1.5. However, pixman_transform_bounds() would simply truncate
to 1 which would imply that the first sample to be used was the one at
1.5. In other words, it is too optimistic in some cases.
(c) The result of pixman_transform_bounds() does not account for the
interpolation filter applied to the source.
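The rounding difference in (b) can be sketched as follows. The type and helper names are stand-ins invented for this example; only the idea of subtracting the smallest fixed-point step before rounding comes from the text above:

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;                  /* stand-in for pixman_fixed_t */
#define FIXED_E    ((fixed_16_16_t) 1)          /* smallest step, like pixman_fixed_e */
#define FIXED_ONE  ((fixed_16_16_t) 1 << 16)    /* 1.0 in 16.16 */

/* Index of the sample a NEAREST filter picks for coordinate v.
 * Sample i is centered at i + 0.5, so a coordinate of exactly 1.0 is
 * equally far from samples 0 and 1; subtracting FIXED_E first breaks
 * the tie towards sample 0. Assumes >> on negative values is an
 * arithmetic shift, as on common compilers. */
static int
nearest_sample_index (fixed_16_16_t v)
{
    assert (v > INT32_MIN);     /* v - FIXED_E must not wrap */
    return (int) ((v - FIXED_E) >> 16);
}

/* What plain truncation of the same coordinate would report. */
static int
truncated_index (fixed_16_16_t v)
{
    return (int) (v >> 16);
}
```

For v = 1.0 the first helper gives sample 0 while truncation gives 1, which is the mismatch described above.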
== 2. FAST_PATH_16BIT_SAFE is nonsensical
The FAST_PATH_16BIT_SAFE is a flag that indicates that various
computations can be safely done within a 16.16 fixed-point
variable. It was used by certain fast paths that relied on those
computations succeeding. The problem is that many other compositing
functions were making similar assumptions but not actually requiring
the flag to be set. Notably, all the general compositing functions
simply walk the source region using 16.16 variables. If the
transformation happens to overflow, strange things will happen.
So instead of computing this flag in certain cases, it is better to
simply detect that overflows will happen and not try to composite at
all in that case. This has the advantage that most compositing
functions can be written in a natural way.
It does have the disadvantage that we are giving up on some cases that
previously worked, but those are all corner cases where the areas
involved were very close to the limits of the coordinate
system. Relying on these working reliably was always a somewhat
dubious proposition. The most important case that might have worked
previously was untransformed compositing involving images larger than
32 bits. But even in those cases, if you had REPEAT_PAD or
REPEAT_REFLECT turned on, you would hit bits_image_fetch_transformed()
which has the 16 bit limitations.
== Fixes
This patch fixes both problems by introducing a new function called
analyze_extents() that has the responsibility to reject corner cases,
and to compute flags based on the extents.
It does this through a new compute_sample_extents() function that will
compute a conservative (but tight) approximation to the bounding box
of the samples that will actually be needed. By basing the computation
on the positions of the _sample_ locations in the destination, and by
taking the interpolation filter into account, it fixes problem one.
The same function is also used with a one-pixel expanded version of
the destination extents. By checking if the transformed bounding box
will overflow 16.16 fixed point, it fixes problem two.
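A sketch of the kind of overflow check this implies, assuming the transformed box is first computed in a wider 48.16 representation. The box_48_16_t type and both helpers are hypothetical names for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounding box whose coordinates are 16.16 values
 * computed with 64 bits of headroom (48.16). */
typedef struct
{
    int64_t x1, y1, x2, y2;
} box_48_16_t;

/* A value is representable in 16.16 exactly when it fits in int32_t. */
static int
fits_in_16_16 (int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Nonzero if every corner of the box can be walked with 16.16
 * variables; otherwise the caller should refuse to composite. */
static int
box_is_16_16_safe (const box_48_16_t *b)
{
    assert (b->x1 <= b->x2 && b->y1 <= b->y2);

    return fits_in_16_16 (b->x1) && fits_in_16_16 (b->y1) &&
           fits_in_16_16 (b->x2) && fits_in_16_16 (b->y2);
}
```

Running the check on extents expanded by one pixel leaves room for the one extra step that scanline loops typically take.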
Søren Sandmann Pedersen [Wed, 28 Jul 2010 06:11:08 +0000 (02:11 -0400)]
Extend scaling-crash-test in various ways
This extends scaling-crash-test to test some more things:
- All combinations of NEAREST/BILINEAR/CONVOLUTION filters and
NORMAL/PAD/REFLECT repeat modes.
- Tests various scale factors very close to 1/7th such that the source
area is very close to the edge of the source image.
- The same things, only with scale factors very close to 1/32767th.
- Enables the commented-out tests for accessing memory outside the
source buffer.
Also there is now a border around the source buffer which has a
different color than the source buffer itself so that if we sample
outside, it will show up.
Finally, the test now allows the destination buffer to not be changed
at all. This allows pixman to simply bail out in cases where the
transformation is too strange.
Søren Sandmann Pedersen [Thu, 5 Aug 2010 23:00:56 +0000 (19:00 -0400)]
Fix Altivec/OpenBSD patch
As Brad pointed out, I pushed the wrong version of this patch.
Brad Smith [Sat, 31 Jul 2010 09:07:02 +0000 (05:07 -0400)]
Add support for AltiVec detection for OpenBSD/PowerPC.
Bug 29331.
Søren Sandmann Pedersen [Wed, 4 Aug 2010 13:50:30 +0000 (09:50 -0400)]
CODING_STYLE: Delete the stuff about trailing spaces
Also fix various other minor issues.
Søren Sandmann Pedersen [Wed, 28 Jul 2010 07:17:35 +0000 (03:17 -0400)]
If we bail out of do_composite, make sure to undo any workarounds.
The workaround for an old X bug has to be undone if we bail from
do_composite, so we can't just return.
Søren Sandmann Pedersen [Wed, 4 Aug 2010 12:58:51 +0000 (08:58 -0400)]
Add x14r6g6b6 format to blitters-test
Marek Vasut [Sun, 1 Aug 2010 00:18:52 +0000 (02:18 +0200)]
Add support for 32bpp X14R6G6B6 format.
This format is used on PXA framebuffer with some boards. It uses only 18 bits
from the 32 bit framebuffer to interpret color.
Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Siarhei Siamashka [Wed, 14 Jul 2010 13:43:16 +0000 (16:43 +0300)]
test: 'scaling-test' updated to provide better coverage
Negative scale factors are now also tested. A small additional
translate transform helps to stress the use of fractional
coordinates better.
Also, the default number of iterations has been increased to
compensate for the increased variety of operations to be tested.
Siarhei Siamashka [Mon, 19 Jul 2010 17:25:05 +0000 (20:25 +0300)]
test: 'scaling-crash-test' added
This test tries to exploit some corner cases and previously known
bugs in nearest neighbor scaling fast path code, attempting to
crash pixman or cause some other nasty effect.
Søren Sandmann Pedersen [Fri, 16 Jul 2010 03:40:28 +0000 (23:40 -0400)]
bits: Fix potential divide-by-zero in projective code
If the homogeneous coordinate is 0, just set the coordinates to 0.
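In sketch form, with a hypothetical vector type and helper name (only the "w == 0 gives (0, 0)" rule is taken from the commit message):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical homogeneous vector: 16.16 values held in 64 bits so
 * the intermediate products below cannot overflow for 32-bit inputs. */
typedef struct
{
    int64_t x, y, w;
} hvec_t;

/* Divide through by the homogeneous coordinate, producing 16.16
 * results. A zero w yields (0, 0) instead of dividing by zero. */
static void
project (const hvec_t *v, int32_t *x_out, int32_t *y_out)
{
    assert (v && x_out && y_out);

    if (v->w == 0)
    {
        *x_out = 0;
        *y_out = 0;
        return;
    }

    /* Multiply by 65536 rather than shifting, since shifting a
     * negative value left is undefined in C. */
    *x_out = (int32_t) ((v->x * 65536) / v->w);
    *y_out = (int32_t) ((v->y * 65536) / v->w);
}
```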
Søren Sandmann Pedersen [Mon, 26 Apr 2010 00:25:50 +0000 (20:25 -0400)]
[sse2] Add sse2_composite_add_n_8()
This shows up when epiphany displays the "ImageTest" on
glimr.rubyforge.org/cake/canvas.html
Søren Sandmann Pedersen [Sun, 25 Apr 2010 23:54:28 +0000 (19:54 -0400)]
[sse2] Add sse2_composite_in_n_8()
This shows up when epiphany displays the "ImageTest" on
glimr.rubyforge.org/cake/canvas.html
Søren Sandmann Pedersen [Tue, 13 Jul 2010 04:31:35 +0000 (00:31 -0400)]
[sse2] Add sse2_composite_src_x888_8888()
This operation shows up when Firefox displays
http://dougx.net/plunder/plunder.html
Søren Sandmann Pedersen [Tue, 13 Jul 2010 04:08:10 +0000 (00:08 -0400)]
[fast] Add fast_composite_src_x888_8888()
This shows up when Firefox displays http://dougx.net/plunder/plunder.html
M Joonas Pihlaja [Wed, 14 Jul 2010 06:51:27 +0000 (09:51 +0300)]
Fix thinko in configure.ac's macro to test linking.
Copy-paste carnage. Renames save_{cflags,libs,ldflags} to
save_{CFLAGS,LIBS,LDFLAGS}.
M Joonas Pihlaja [Sun, 11 Jul 2010 16:59:01 +0000 (19:59 +0300)]
Avoid trailing slashes on automake install dirs.
The install-sh on a Solaris box couldn't cope with
trailing slashes.
M Joonas Pihlaja [Sat, 10 Jul 2010 12:36:41 +0000 (15:36 +0300)]
Check for specific flags by actually trying to compile and link.
Instead of relying on preprocessor version checks to see if
some compiler flags are supported, actually try to compile and
link a test program with the flags.
M Joonas Pihlaja [Sat, 10 Jul 2010 01:41:01 +0000 (02:41 +0100)]
Check that the OpenMP pragmas don't cause link errors.
This patch adds extra guards around our use of
OpenMP pragmas and checks that the pragmas won't
cause link errors. This fixes the build on
Tru64 and Solaris with the native compilers and clang.
M Joonas Pihlaja [Fri, 9 Jul 2010 09:09:07 +0000 (12:09 +0300)]
Don't trust OpenBSD's gcc to produce working code for __thread.
The gcc on OpenBSD 4.5 to 4.7 at least produces bad code for __thread,
without as much as a warning.
See PR #6410 "Using __thread TLS variables compiles ok but segfault at runtime."
http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=6410
M Joonas Pihlaja [Fri, 9 Jul 2010 09:07:35 +0000 (12:07 +0300)]
Try harder to find suitable flags for pthreads.
The flags -D_REENTRANT -lpthread work on more systems than
does -pthread unfortunately, so give that a go too.
Søren Sandmann Pedersen [Mon, 12 Jul 2010 19:13:49 +0000 (15:13 -0400)]
Check for read accessors before taking the bilinear fast path
The bilinear fast path accesses pixels directly, so if the image has a
read accessor, then it can't be used.
Søren Sandmann Pedersen [Sun, 11 Jul 2010 23:58:49 +0000 (19:58 -0400)]
fast-path: Some formatting fixes
Add spaces before parentheses; fix indentation in the macro.
Søren Sandmann Pedersen [Sun, 11 Jul 2010 23:57:29 +0000 (19:57 -0400)]
In the FAST_NEAREST macro call the function 8888_8888 and not x888_x888
The x888 suggests that they have something to do with the x8r8g8b8
formats, but that's not the case; they are assuming a8r8g8b8
formats. (Although in some cases they also work for x8r8g8b8 type
formats).
Søren Sandmann Pedersen [Sun, 11 Jul 2010 23:45:22 +0000 (19:45 -0400)]
Make the repeat mode explicit in the FAST_NEAREST macro.
Before, it was 0 or 1 meaning 'no repeat' and 'normal repeat'
respectively. Now we explicitly pass in either NONE or NORMAL.
Søren Sandmann Pedersen [Sun, 11 Jul 2010 00:47:01 +0000 (20:47 -0400)]
When converting indexed formats to 64 bits, don't correct for channel widths
Indexed formats are mapped to a8r8g8b8 with full precision, so when
expanding we shouldn't correct for the width of the channels
Søren Sandmann Pedersen [Sat, 10 Jul 2010 22:40:06 +0000 (18:40 -0400)]
test: Make sure the palettes for indexed format roundtrip properly
The palettes for indexed formats must satisfy the condition that if
some index maps to a color C, then the 15 bit version of that color
must map back to the index. This ensures that the destination operator
is always a no-op, which seems like a reasonable assumption to make.
Søren Sandmann Pedersen [Sat, 10 Jul 2010 20:49:51 +0000 (16:49 -0400)]
Split the fast path caching into its own force_inline function
The do_composite() function is a lot more readable this way.
Søren Sandmann Pedersen [Sat, 10 Jul 2010 20:08:51 +0000 (16:08 -0400)]
Cache the implementation along with the fast paths.
When calling a fast path, we need to pass the corresponding
implementation since it might contain information necessary to run the
fast path.
Søren Sandmann Pedersen [Sat, 10 Jul 2010 19:47:12 +0000 (15:47 -0400)]
Hide the global implementation variable behind a force_inline function.
Previously the global variable was called 'imp' which was confusing
with the argument to various other functions also being called imp.
Søren Sandmann Pedersen [Wed, 30 Jun 2010 06:31:10 +0000 (02:31 -0400)]
Fix memory leak in the pthreads thread local storage code
When a thread exits, we leak whatever is stored in thread local
variables, so install a destructor to free it.
Søren Sandmann Pedersen [Thu, 1 Jul 2010 20:54:30 +0000 (16:54 -0400)]
Make the combiner macros less likely to cause name collisions.
Protect the arguments to the combiner macros with parentheses, and
postfix their temporary variables with underscores to avoid name space
collisions with the surrounding code.
Alexander Shulgin pointed out that underscore-prefixed identifiers are
reserved for the C implementation, so we use postfix underscores
instead.
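A sketch of both precautions on a made-up macro. MUL_UN8_SKETCH and mul_un8() are illustrative names; this is not one of the actual combiner macros:

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 8-bit channel values, treating 255 as 1.0 and rounding
 * to nearest. Every argument is parenthesized, and the temporary has a
 * trailing underscore so it cannot collide with a caller's `t`. */
#define MUL_UN8_SKETCH(a, b, result)                                  \
    do                                                                \
    {                                                                 \
        uint32_t t_ = (uint32_t) (a) * (uint32_t) (b) + 0x80;         \
        (result) = (uint8_t) ((t_ + (t_ >> 8)) >> 8);                 \
    } while (0)

static uint8_t
mul_un8 (uint32_t x, uint32_t alpha)
{
    uint8_t t;  /* would clash with a macro temporary also named `t` */

    assert (x <= 255);

    /* `alpha & 0xff` binds more loosely than `*`; without the
     * parentheses inside the macro it would be evaluated wrongly. */
    MUL_UN8_SKETCH (x, alpha & 0xff, t);
    return t;
}
```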
Søren Sandmann Pedersen [Mon, 21 Jun 2010 19:30:46 +0000 (15:30 -0400)]
Minor tweaks to README
Søren Sandmann Pedersen [Sun, 20 Jun 2010 17:12:27 +0000 (13:12 -0400)]
Store the conical angle in floating point radians, not fixed point degrees
This is a slight simplification.
Søren Sandmann Pedersen [Sat, 19 Jun 2010 22:57:45 +0000 (18:57 -0400)]
Fix conical gradients to match QConicalGradient from Qt
Under the assumption that pixman gradients are supposed to match
QConicalGradient, described here:
http://doc.trolltech.com/4.4/qconicalgradient.html
this patch fixes two separate bugs in pixman-conical-gradient.c.
The first bug is that the output of atan2() is in the range of [-pi,
pi], which means the parameter into the gradient can be negative. This
is wrong since a QConicalGradient always interpolates around the
center from 0 to 1. The fix for that is to (a) make sure the given
angle is between 0 and 360, and (b) add or subtract 2 * M_PI if the
computed angle ends up outside [0, 2 * pi].
The other bug is that we were interpolating clockwise, whereas
QConicalGradient calls for a counter-clockwise interpolation. This is
easily fixed by subtracting the parameter from 1.
Finally, this patch encapsulates the computation in a new force-inline
function so that it can be reused in both the affine and non-affine
case.
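A sketch of that computation, assuming the angle from atan2() is passed in already computed and is simply added to the gradient's start angle. The function name, the PI_SKETCH constant and that sign convention are assumptions of this example, not a copy of pixman-conical-gradient.c:

```c
#include <assert.h>

#define PI_SKETCH 3.14159265358979323846

/* theta:       result of atan2 (y, x), in [-pi, pi]
 * start_angle: gradient angle in radians, in [0, 2 * pi]
 * Returns the gradient parameter in (0, 1]. */
static double
conical_parameter (double theta, double start_angle)
{
    double t;

    assert (theta >= -PI_SKETCH && theta <= PI_SKETCH);
    assert (start_angle >= 0 && start_angle <= 2 * PI_SKETCH);

    t = theta + start_angle;

    /* Bring the angle back into [0, 2 * pi). */
    while (t < 0)
        t += 2 * PI_SKETCH;
    while (t >= 2 * PI_SKETCH)
        t -= 2 * PI_SKETCH;

    /* Subtract from 1 to reverse the direction of interpolation. */
    return 1 - t / (2 * PI_SKETCH);
}
```

Taking the angle as an argument keeps the wrapping logic separate from the coordinate transform, so the same helper could serve both the affine and the non-affine caller.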
Søren Sandmann Pedersen [Sun, 30 May 2010 22:26:28 +0000 (18:26 -0400)]
Make separate gray scanline storers.
For gray formats the palettes are indexed by luminance, not RGB, so we
can't use the color storers for gray too.
Søren Sandmann Pedersen [Sun, 30 May 2010 20:52:09 +0000 (16:52 -0400)]
When storing a g1 pixel, store the lowest bit, rather than comparing with 0.
Andrea Canciani [Wed, 9 Jun 2010 14:35:37 +0000 (16:35 +0200)]
test: verify that gradients do not crash pixman
Test gradients under particular conditions (no stops, all the stops
at the same offset) to check that pixman does not misbehave.
Andrea Canciani [Tue, 8 Jun 2010 18:36:15 +0000 (20:36 +0200)]
support single-stop gradients
Just like conical gradients, linear and radial gradients can now
have a single stop.
Søren Sandmann Pedersen [Wed, 19 May 2010 02:27:46 +0000 (22:27 -0400)]
Eliminate mask_bits from all the scanline fetchers.
Back in the day, the mask_bits argument was used to distinguish
between masks used for component alpha (where it was 0xffffffff) and
masks for unified alpha (where it was 0xff000000). In this way, the
fetchers could check if just the alpha channel was 0 and in that case
avoid fetching the source.
However, we haven't actually used it like that for a long time; it is
currently always either 0xffffffff or 0 (if the mask is NULL). It also
doesn't seem worthwhile resurrecting it because for premultiplied
buffers, if alpha is 0, then so are the color channels
normally.
This patch eliminates the mask_bits and changes the fetchers to just
assume it is 0xffffffff if mask is non-NULL.
Jeff Muizelaar [Mon, 15 Mar 2010 12:56:38 +0000 (14:56 +0200)]
create getter for component alpha
This patch comes from the mozilla central tree. See
http://hg.mozilla.org/mozilla-central/rev/89338a224278 for the
original changeset.
Signed-off-by: Jeff Muizelaar <jmuizelaar@mozilla.com>
Signed-off-by: Egor Starkov <egor.starkov@nokia.com>
Signed-off-by: Rami Ylimaki <ext-rami.ylimaki@nokia.com>
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@nokia.com>
Siarhei Siamashka [Tue, 11 May 2010 22:34:57 +0000 (01:34 +0300)]
test: added OpenMP support for better utilization of multiple CPU cores
Some of the tests are quite heavy CPU users and may benefit from
using multiple CPU cores, so the programs from 'test' directory
are now built with OpenMP support. OpenMP is easy to use, portable
and also takes care of making a decision about how many threads
to spawn.
Siarhei Siamashka [Tue, 11 May 2010 21:10:04 +0000 (00:10 +0300)]
test: scaling-test updated to use new fuzzer_test_main() function
Siarhei Siamashka [Tue, 11 May 2010 20:21:05 +0000 (23:21 +0300)]
test: blitters-test updated to use new fuzzer_test_main() function
Siarhei Siamashka [Tue, 11 May 2010 19:57:48 +0000 (22:57 +0300)]
test: blitters-test-bisect.rb converted to perl
This new script can be used to run continuously to compare two test
programs based on fuzzer_test_main() function from 'util.c' and
narrow down to a single problematic test from the batch which results
in different behavior.
Siarhei Siamashka [Tue, 11 May 2010 19:46:47 +0000 (22:46 +0300)]
test: main loop from blitters-test added as a new function to utils.c
This new generalized function can be reused in both blitters-test
and scaling-test. Final checksum calculation changed in order to make
it parallelizable (it is a sum of individual 32-bit values returned
by a callback function, which is now responsible for running test-specific
code). Return values may be crc32, some other hash or even just zero on
success and non-zero on error (in this case, the expected result of the
whole test run should be 0).
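Why a plain sum parallelizes can be sketched like this. The callback type and function names are hypothetical and do not reproduce the fuzzer_test_main() signature:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: runs test number `testnum` and returns a
 * 32-bit value (a crc32, another hash, or 0 for success). */
typedef uint32_t (*test_func_t) (int testnum);

/* Sum the per-test values for tests first..last inclusive. Unsigned
 * addition wraps modulo 2^32 and is commutative, so the total does
 * not depend on the order in which tests finish; the loop could be
 * split across threads and the partial sums added afterwards. */
static uint32_t
sum_test_results (int first, int last, test_func_t run_one)
{
    uint32_t checksum = 0;
    int i;

    assert (run_one);

    for (i = first; i <= last; i++)
        checksum += run_one (i);

    return checksum;
}

/* Toy stand-in for a real test: a cheap multiplicative hash. */
static uint32_t
toy_test (int testnum)
{
    return (uint32_t) testnum * 2654435761u;
}
```

Summing tests 1..2 and 3..4 separately and adding the two results gives the same checksum as one pass over 1..4.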
Søren Sandmann Pedersen [Sun, 9 May 2010 18:24:24 +0000 (14:24 -0400)]
Merge branch 'for-master'
Søren Sandmann Pedersen [Wed, 5 May 2010 22:05:40 +0000 (01:05 +0300)]
test/gtk-utils: Set the size of the window to the size of the image
Jeff Muizelaar [Tue, 4 May 2010 15:55:30 +0000 (11:55 -0400)]
Add support for compiling pixman without thread/tls support
Søren Sandmann Pedersen [Sat, 24 Apr 2010 22:43:38 +0000 (18:43 -0400)]
Add macros for thread local storage on MinGW 32
These macros are identical to the ones that Tor Lillqvist posted here:
http://lists.freedesktop.org/archives/pixman/2010-April/000160.html
with one exception: the variable is allocated with calloc() and not
malloc().
Cc: tml@iki.fi
Søren Sandmann Pedersen [Fri, 23 Apr 2010 16:34:19 +0000 (12:34 -0400)]
Don't use __thread on MinGW.
It is apparently broken. See this:
http://mingw-users.1079350.n2.nabble.com/gcc-4-4-multi-threaded-exception-handling-thread-specifier-not-working-td3440749.html
We'll need to support thread local storage on MinGW32 some other way.
Cc: tml@iki.fi
Søren Sandmann Pedersen [Mon, 29 Mar 2010 03:02:43 +0000 (23:02 -0400)]
Add support for 8bpp to pixman_fill_sse2()
Søren Sandmann Pedersen [Sat, 24 Apr 2010 17:11:50 +0000 (13:11 -0400)]
sse2: Add sse2_composite_over_reverse_n_8888
This is a small speed-up for the poppler benchmark:
Before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] image poppler 4.443 4.474 0.31% 6/6
After:
[ # ] backend test min(s) median(s) stddev. count
[ 0] image poppler 4.224 4.248 0.42% 6/6
Søren Sandmann Pedersen [Sat, 24 Apr 2010 19:15:05 +0000 (15:15 -0400)]
Don't consider indexed formats opaque.
The indexed formats have 0 bits of alpha, but can't be considered
opaque because there may be non-opaque colors in the palette.
Søren Sandmann Pedersen [Thu, 25 Feb 2010 00:21:50 +0000 (19:21 -0500)]
Add an over_8888_8888_8888 sse2 fast path.
Søren Sandmann Pedersen [Wed, 18 Feb 2009 04:03:25 +0000 (23:03 -0500)]
Add pixman_region{,32}_intersect_rect()
Søren Sandmann Pedersen [Wed, 22 Jul 2009 00:52:26 +0000 (20:52 -0400)]
Rename fast_composite_src_8888_x888 to fast_composite_src_memcpy()
Then generalize it and use it for SRC copying between various
identical formats.
Jeff Muizelaar [Tue, 27 Apr 2010 19:23:20 +0000 (15:23 -0400)]
Add missing HAVE_CONFIG_H guards for config.h inclusion
Søren Sandmann Pedersen [Thu, 22 Apr 2010 16:14:23 +0000 (12:14 -0400)]
Remove alphamap from the GTK+ part of tests/Makefile.am
It doesn't use GTK+ and it was already listed in the non-GTK+ part.
Søren Sandmann Pedersen [Wed, 21 Apr 2010 13:59:29 +0000 (09:59 -0400)]
Add pixman_image_get_format() accessor
Søren Sandmann Pedersen [Wed, 21 Apr 2010 13:55:35 +0000 (09:55 -0400)]
Some minor updates to README
Søren Sandmann Pedersen [Sun, 18 Apr 2010 20:24:39 +0000 (16:24 -0400)]
Update README to mention the pixman mailing list
Søren Sandmann Pedersen [Wed, 7 Apr 2010 23:34:41 +0000 (19:34 -0400)]
[mmx] Fix mask creation bugs
This line:
mask = mask | mask >> 8 | mask >> 16 | mask >> 24;
only works when mask has 0s in the lower 24 bits, so add
mask &= 0xff000000;
before.
Reported by Todd Rinaldo on the #cairo IRC channel.
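The two lines together, wrapped in a function for illustration (replicate_alpha() is a name made up for this sketch):

```c
#include <assert.h>
#include <stdint.h>

/* Copy the top byte of `mask` into all four bytes. */
static uint32_t
replicate_alpha (uint32_t mask)
{
    uint32_t result;

    /* Without this, stray low bits would be OR-ed into the result. */
    mask &= 0xff000000;

    result = mask | mask >> 8 | mask >> 16 | mask >> 24;

    /* Sanity check: top and bottom bytes now agree. */
    assert ((result >> 24) == (result & 0xff));
    return result;
}
```

For an input of 0x80ffffff the masked version yields 0x80808080, whereas the unmasked expression would yield 0x80ffffff.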
Søren Sandmann Pedersen [Wed, 7 Apr 2010 05:44:12 +0000 (01:44 -0400)]
Fixes for pthread thread local storage.
The tls_name_key variable is passed to tls_name_get(), and the first
time this happens it isn't initialized. tls_name_get() then passes it
on to tls_name_alloc() which passes it on to pthread_setspecific()
leading to undefined behavior.
None of this is actually necessary at all because there is only one
such variable per thread local variable, so it doesn't need to be
passed as a parameter at all.
All of this was pointed out by Tor Lillqvist on the cairo mailing
list.
Søren Sandmann Pedersen [Wed, 7 Apr 2010 05:39:14 +0000 (01:39 -0400)]
Fix uninitialized cache when pthreads are used
The thread local cache is allocated with malloc(), but we rely on it
being initialized to zero, so allocate it with calloc() instead.
Siddharth Agarwal [Tue, 13 Apr 2010 14:15:29 +0000 (10:15 -0400)]
Visual Studio 2010 includes stdint.h
Use the builtin version instead of defining the types ourselves.
Søren Sandmann Pedersen [Thu, 1 Apr 2010 10:21:21 +0000 (06:21 -0400)]
Post-release version bump to 0.19.1
Søren Sandmann Pedersen [Thu, 1 Apr 2010 09:23:31 +0000 (05:23 -0400)]
Pre-release version bump to 0.18.0
Matthias Hopf [Wed, 24 Mar 2010 17:54:29 +0000 (18:54 +0100)]
Revert "Improve PIXREGION_NIL to return true on degenerated regions."
This reverts commit ebba1493136a5a0dd7667073165b2115de203eda.
Scheduled for re-discussion after stable 0.18 has been released.
Matthias Hopf [Wed, 24 Mar 2010 11:00:21 +0000 (12:00 +0100)]
Improve PIXREGION_NIL to return true on degenerated regions.
Fixes Novell bug 568811.
Søren Sandmann Pedersen [Tue, 23 Mar 2010 21:25:54 +0000 (17:25 -0400)]
Post-release version bump to 0.17.15
Søren Sandmann Pedersen [Tue, 23 Mar 2010 20:52:02 +0000 (16:52 -0400)]
Pre-release version bump to 0.17.14
Søren Sandmann Pedersen [Tue, 23 Mar 2010 15:00:04 +0000 (11:00 -0400)]
Merge remote branch 'ssvb/arm-fixes'
Siarhei Siamashka [Mon, 22 Mar 2010 19:56:17 +0000 (21:56 +0200)]
ARM: SIMD optimizations moved to a separate .S file
This should be the last step in providing full armv4t compatibility
with CPU features runtime autodetection in pixman.
Siarhei Siamashka [Mon, 22 Mar 2010 17:51:00 +0000 (19:51 +0200)]
ARM: SIMD optimizations updated to use common assembly calling conventions
Siarhei Siamashka [Mon, 22 Mar 2010 16:51:54 +0000 (18:51 +0200)]
ARM: Helper ARM NEON assembly binding macros moved into a separate header
This is needed for future reuse of the same macros for the other
ARM assembly optimizations (armv4t, armv6)
Siarhei Siamashka [Sat, 26 Dec 2009 22:27:53 +0000 (00:27 +0200)]
ARM: Workaround for a NEON bug in assembler from binutils 2.18
The problem was reported as bug 25534 against pixman in
freedesktop.org bugzilla. Link to a patch for binutils:
http://sourceware.org/ml/binutils/2008-03/msg00260.html
For pixman the impact is a build failure when using
binutils 2.18. Versions 2.19 and higher are fine. Still
some distros may be using older versions of binutils and
this is causing problems.
This patch works around the problem by replacing a problematic
"vmov a, b" instruction with equivalent "vorr a, b, b". Actually
they even map to the same instruction opcode in the generated
code, so the resulting binary is identical with and without patch.
Siarhei Siamashka [Mon, 22 Mar 2010 09:54:51 +0000 (11:54 +0200)]
ARM: Use '.object_arch' directive in NEON assembly file
This can be used to override the architecture recorded in the EABI object
attribute section. We set the minimum arch to 'armv4'. Binutils documentation
recommends using this directive with code that performs runtime detection
of CPU features.
Additionally NEON/VFP EABI attributes are suppressed. And the instruction
set to use is explicitly set to '.arm'.
Configure test for NEON support is also updated to include a bunch of
these new directives (if any of these is unsupported by the assembler,
it is better to fail configure test than to fail library build).
All these changes are required to fix SIGILL problem on armv4t, reported in
http://lists.freedesktop.org/archives/pixman/2010-March/000123.html
Jon TURNEY [Wed, 17 Mar 2010 21:07:06 +0000 (21:07 +0000)]
Avoid a potential division-by-zero exception in window-test
Avoid a division-by-zero exception if the first number returned by
rand() is a multiple of 500, causing us to create a zero width pixmap,
and then attempt to use get_rand(0) when generating a random stride...
Fixes https://bugs.freedesktop.org/attachment.cgi?id=34162
Søren Sandmann Pedersen [Wed, 17 Mar 2010 19:12:06 +0000 (15:12 -0400)]
Post-release version bump to 0.17.13
Søren Sandmann Pedersen [Wed, 17 Mar 2010 17:46:44 +0000 (13:46 -0400)]
Pre-release version bump to 0.17.12
Søren Sandmann Pedersen [Wed, 17 Mar 2010 14:50:42 +0000 (10:50 -0400)]
Specialize the fast_composite_scaled_nearest_* scalers to positive x units
This avoids a test in the inner loop, which improves performance
especially for tiled sources.
On x86-32, I get these results:
Before:
op=1, src_fmt=20028888, dst_fmt=20028888, speed=306.96 MPix/s (73.18 FPS)
op=1, src_fmt=20028888, dst_fmt=10020565, speed=102.67 MPix/s (24.48 FPS)
op=1, src_fmt=10020565, dst_fmt=10020565, speed=324.85 MPix/s (77.45 FPS)
After:
op=1, src_fmt=20028888, dst_fmt=20028888, speed=332.19 MPix/s (79.20 FPS)
op=1, src_fmt=20028888, dst_fmt=10020565, speed=110.41 MPix/s (26.32 FPS)
op=1, src_fmt=10020565, dst_fmt=10020565, speed=363.28 MPix/s (86.61 FPS)
Søren Sandmann Pedersen [Wed, 17 Mar 2010 14:35:34 +0000 (10:35 -0400)]
Add a FAST_PATH_X_UNIT_POSITIVE flag
This is the common case for a lot of transformed images. If the unit
were negative, the transformation would be a reflection which is
fairly rare.
Alexander Larsson [Wed, 17 Mar 2010 10:58:05 +0000 (11:58 +0100)]
Use the right format for the OVER_8888_565 fast path
Alexander Larsson [Fri, 12 Mar 2010 14:45:04 +0000 (15:45 +0100)]
Add specialized fast nearest scalers
This is a macroized version of SRC/OVER repeat normal/unneeded nearest
neighbour scaling instantiated for some common 8888 and 565 formats.
Based on work by Siarhei Siamashka
Alexander Larsson [Fri, 12 Mar 2010 14:41:01 +0000 (15:41 +0100)]
Add FAST_PATH_SAMPLES_COVER_CLIP and FAST_PATH_16BIT_SAFE
FAST_PATH_SAMPLES_COVER_CLIP:
This is set if the source sample grid, unrepeated but transformed,
completely covers the clip destination. If this is set
you can use a simple scaler that doesn't have to care about the repeat
mode.
FAST_PATH_16BIT_SAFE:
This signifies two things:
1) The size of the src/mask fits in a 16.16 fixed point, so something like:
max_vx = src_image->bits.width << 16;
Is allowed and is guaranteed to not overflow max_vx
2) When stepping the source space we're guaranteed to never overflow
a 16.16 fixed-point variable, even if we step one extra step
in the destination space. This means that a loop doing:
x = vx >> 16;
vx += unit_x; d = src_row[x];
will never overflow vx causing x to be negative.
And additionally, if you track vx like above and apply NORMAL repeat
after the vx addition with something like:
while (vx >= max_vx) vx -= max_vx;
This will never overflow the vx even on the final increment that
takes vx one past the end of where we will read, which makes the
repeat loop safe.
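Put together, the fragments above correspond to a loop of roughly this shape. This is a sketch with hypothetical names, and the asserts spell out as preconditions what the flag is meant to guarantee:

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;    /* 16.16 fixed point */

/* Fill `width` destination pixels from one source row using nearest
 * sampling and NORMAL repeat. */
static void
scaled_nearest_scanline_normal (uint32_t       *dst,
                                const uint32_t *src_row,
                                int             src_width,
                                int             width,
                                fixed_t         vx,
                                fixed_t         unit_x)
{
    fixed_t max_vx;

    /* 1) src_width << 16 must fit in a 16.16 variable. */
    assert (src_width > 0 && src_width < 32768);
    max_vx = (fixed_t) src_width << 16;

    /* 2) One step past any in-range vx must not overflow. */
    assert (unit_x > 0 && unit_x <= INT32_MAX - max_vx);
    assert (vx >= 0 && vx < max_vx);

    while (width-- > 0)
    {
        int x = vx >> 16;

        vx += unit_x;
        while (vx >= max_vx)    /* NORMAL repeat: wrap around */
            vx -= max_vx;

        *dst++ = src_row[x];
    }
}
```

With a three-pixel source row and a unit of 1.0, five destination pixels come out as the row followed by its first two pixels again.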
Alexander Larsson [Fri, 12 Mar 2010 14:40:07 +0000 (15:40 +0100)]
Add FAST_PATH_NO_NONE_REPEAT flag
Alexander Larsson [Tue, 16 Mar 2010 13:18:29 +0000 (14:18 +0100)]
Add CONVERT_8888_TO_8888 and CONVERT_0565_TO_0565 macros
These are useful for macroization
Alexander Larsson [Fri, 12 Mar 2010 15:23:42 +0000 (16:23 +0100)]
Add CONVERT_0565_TO_8888 macro
This lets us simplify some fast paths since we get a consistent
naming that always has 8888 and gets some value for alpha.
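A sketch of such a conversion written as a function. The bit-replication expansion and the choice of 0xff as the alpha value are assumptions of this example and may not match the macro exactly:

```c
#include <assert.h>
#include <stdint.h>

/* Expand an r5g6b5 pixel to a8r8g8b8 with opaque alpha. */
static uint32_t
convert_0565_to_8888 (uint16_t s)
{
    uint32_t r = (s >> 11) & 0x1f;
    uint32_t g = (s >> 5) & 0x3f;
    uint32_t b = s & 0x1f;

    /* Replicate the high bits into the low bits so that the maximum
     * 5- or 6-bit value maps to 255 and 0 stays 0. */
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    assert (r <= 0xff && g <= 0xff && b <= 0xff);

    return 0xff000000 | (r << 16) | (g << 8) | b;
}
```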
Søren Sandmann Pedersen [Mon, 15 Mar 2010 15:51:09 +0000 (11:51 -0400)]
Ensure that only the low 4 bits of 4 bit pixels are stored.
In some cases we end up trying to use the STORE_4 macro with 8 bit
values, which resulted in other pixels getting overwritten. Fix this
by always keeping only the low 4 bits of the value.
This fixes blitters-test on big-endian machines.
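A sketch of the fix as a function rather than a macro. The nibble order (even offsets in the low nibble) is an assumption of this example; the real macro depends on endianness:

```c
#include <assert.h>
#include <stdint.h>

/* Store a 4-bit pixel at pixel index `offset` in a packed buffer. */
static void
store_4 (uint8_t *bits, int offset, uint32_t value)
{
    uint8_t v4 = (uint8_t) (value & 0x0f);  /* drop any stray high bits */
    uint8_t *byte;

    assert (bits && offset >= 0);
    byte = bits + (offset >> 1);

    if (offset & 1)
        *byte = (uint8_t) ((*byte & 0x0f) | (v4 << 4));
    else
        *byte = (uint8_t) ((*byte & 0xf0) | v4);
}
```

Without the initial mask, storing 0xAB at an even offset would also overwrite the neighbouring pixel in the high nibble.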