pixman/refactor

   1 Roadmap
   2
   3 - Move all the fetchers etc. into pixman-image to make pixman-compose.c
   4   less intimidating.
   5
   6   DONE
   7
   8 - Make combiners for unified alpha take a mask argument. That way
   9   we won't need two separate paths for unified vs component in the
  10   general compositing code.
  11
  12   DONE, except that the Altivec code needs to be updated. Luca is
  13   looking into that.
  14
  15 - Delete separate 'unified alpha' path
  16
  17   DONE
  18
  19 - Split images into their own files
  20
  21   DONE
  22
  23 - Split the gradient walker code out into its own file
  24
  25   DONE
  26
  27 - Add scanline getters per image
  28
  29   DONE
  30
  31 - Generic 64 bit fetcher
  32
  33   DONE
  34
  35 - Split fast path tables into their respective architecture dependent
  36   files.
  37
  38 See "Render Algorithm" below for rationale
  39
  40 Images will eventually have these virtual functions:
  41
  42        get_scanline()
  43        get_scanline_wide()
  44        get_pixel()
  45        get_pixel_wide()
  46        get_untransformed_pixel()
  47        get_untransformed_pixel_wide()
  48        get_unfiltered_pixel()
  49        get_unfiltered_pixel_wide()
  50
  51        store_scanline()
  52        store_scanline_wide()
  53
  54 1.
  55
  56 Initially we will just have get_scanline() and get_scanline_wide();
  57 these will be based on the ones in pixman-compose. Hopefully this will
  58 reduce the complexity in pixman_composite_rect_general().
  59
  60 Note that there are access considerations: the compose function is
  61 being compiled twice.
  62
  63
  64 2.
  65
  66 Split image types into their own source files. Export noop virtual
  67 reinit() call.  Call this whenever a property of the image changes.
  68
  69
  70 3.
  71
  72 Split the get_scanline() call into smaller functions that are
  73 initialized by the reinit() call.
  74
  75 The Render Algorithm:
  76         (first repeat, then filter, then transform, then clip)
  77
  78 Starting from a destination pixel (x, y), do
  79
  80         1 x = x - xDst + xSrc
  81           y = y - yDst + ySrc
  82
  83         2 reject pixel that is outside the clip
  84
  85         This treats clipping as something that happens after
  86         transformation, which I think is correct for client clips. For
  87         hierarchy clips it is wrong, but who really cares? Without
  88         GraphicsExposes hierarchy clips are basically irrelevant. Yes,
  89         you could imagine cases where the pixels of a subwindow of a
  90         redirected, transformed window should be treated as
  91         transparent. I don't really care.
  92
  93         Basically, I think the render spec should say that pixels that
  94         are unavailable due to the hierarchy have undefined content,
  95         and that GraphicsExposes are not generated. Ie., basically
  96         that using non-redirected windows as sources is fail. This is
  97         at least consistent with the current implementation and we can
  98         update the spec later if someone makes it work.
  99
 100         The implication for render is that it should stop passing the
 101         hierarchy clip to pixman. In pixman, if a source image has a
 102         clip it should be used in computing the composite region and
 103         nowhere else, regardless of what "has_client_clip" says. The
 104         default should be for there to not be any clip.
 105
 106         I would really like to get rid of the client clip as well for
 107         source images, but unfortunately there is at least one
 108         application in the wild that uses them.
 109
 110         3 Transform pixel: (x, y) = T(x, y)
 111
 112         4 Call p = GetUntransformedPixel (x, y)
 113
 114         5 If the image has an alpha map, then
 115
 116                 Call GetUntransformedPixel (x, y) on the alpha map
 117
 118                 add resulting alpha channel to p
 119
 120            return p
 121
 122         Where GetUntransformedPixel is:
 123
 124         6 switch (filter)
 125           {
 126           case NEAREST:
 127                 return GetUnfilteredPixel (x, y);
 128                 break;
 129
 130           case BILINEAR:
 131                 return GetUnfilteredPixel (...) // 4 times
 132                 break;
 133
 134           case CONVOLUTION:
 135                 return GetUnfilteredPixel (...) // as many times as necessary.
 136                 break;
 137           }
 138
 139         Where GetUnfilteredPixel (x, y) is
 140
 141         7 switch (repeat)
 142            {
 143            case REPEAT_NORMAL:
 144            case REPEAT_PAD:
 145            case REPEAT_REFLECT:
 146                 // adjust x, y as appropriate
 147                 break;
 148
 149            case REPEAT_NONE:
 150                 if (x, y) is outside image bounds
 151                      return 0;
 152                 break;
 153            }
 154
 155            return GetRawPixel(x, y)
 156
 157         Where GetRawPixel (x, y) is
 158
 159         8 Compute the pixel in question, depending on image type.
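The repeat step (7) and the raw fetch (8) can be sketched for a bits
image as below. This is only an illustration: the names repeat_t,
repeat_coord() and get_unfiltered_pixel() are made up for the sketch
and are not existing pixman API, and the image is assumed to be a
tightly packed array of 32 bit pixels.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not pixman's actual API. */
typedef enum
{
    REPEAT_NONE,
    REPEAT_NORMAL,
    REPEAT_PAD,
    REPEAT_REFLECT
} repeat_t;

/* Map coordinate *c into [0, size) according to the repeat mode.
 * Returns false when the coordinate is outside the image and the
 * repeat mode is REPEAT_NONE.
 */
bool
repeat_coord (repeat_t repeat, int *c, int size)
{
    int v = *c;

    switch (repeat)
    {
    case REPEAT_NORMAL:
        v %= size;
        if (v < 0)
            v += size;
        break;

    case REPEAT_PAD:
        if (v < 0)
            v = 0;
        else if (v >= size)
            v = size - 1;
        break;

    case REPEAT_REFLECT:
        v %= 2 * size;
        if (v < 0)
            v += 2 * size;
        if (v >= size)
            v = 2 * size - 1 - v;
        break;

    case REPEAT_NONE:
        if (v < 0 || v >= size)
            return false;
        break;
    }

    *c = v;
    return true;
}

/* Step 7: adjust the coordinates, then do the raw fetch (step 8). */
uint32_t
get_unfiltered_pixel (const uint32_t *bits, int width, int height,
                      repeat_t repeat, int x, int y)
{
    if (!repeat_coord (repeat, &x, width) ||
        !repeat_coord (repeat, &y, height))
    {
        return 0;
    }

    return bits[y * width + x];
}
```

Gradients would not go through repeat_coord() at all, since their
repeat applies to the gradient parameter and not to pixel coordinates.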
 160
 161 For gradients, repeat has a totally different meaning, so
 162 UnfilteredPixel() and RawPixel() must be the same function so that
 163 gradients can do their own repeat algorithm.
 164
 165 So, the GetRawPixel
 166
 167         for bits must deal with repeats
 168         for gradients must deal with repeats (differently)
 169         for solids, should ignore repeats.
 170
 171         for polygons, when we add them, either ignore repeats or do
 172         something similar to bits (in which case, we may want an extra
 173         layer of indirection to modify the coordinates).
 174
 175 It is then possible to build things like "get scanline" or "get tile" on
 176 top of this. In the simplest case, just repeatedly calling GetPixel()
 177 would work, but specialized get_scanline()s or get_tile()s could be
 178 plugged in for common cases.
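A minimal sketch of the "simplest case": a generic scanline getter that
only relies on get_pixel(). The struct layout and the solid image used
to exercise it are assumptions made for the example, not the real
pixman image structures.

```c
#include <stdint.h>

typedef struct image image_t;

/* Hypothetical image object holding its own function pointers. */
struct image
{
    uint32_t (*get_pixel)    (const image_t *image, int x, int y);
    void     (*get_scanline) (const image_t *image, int x, int y,
                              int width, uint32_t *buffer);
    uint32_t color;     /* only used by the solid image below */
};

/* Generic fallback: works for any image that provides get_pixel(). */
void
generic_get_scanline (const image_t *image, int x, int y,
                      int width, uint32_t *buffer)
{
    for (int i = 0; i < width; ++i)
        buffer[i] = image->get_pixel (image, x + i, y);
}

/* A solid image ignores the coordinates entirely. */
uint32_t
solid_get_pixel (const image_t *image, int x, int y)
{
    (void) x;
    (void) y;

    return image->color;
}

void
solid_image_init (image_t *image, uint32_t color)
{
    image->color = color;
    image->get_pixel = solid_get_pixel;

    /* A specialized getter could be plugged in here instead. */
    image->get_scanline = generic_get_scanline;
}
```

Callers only ever go through image->get_scanline(), so swapping in a
specialized version for a common case does not affect them.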
 179
 180 By not plugging anything in for images with access functions, we only
 181 have to compile the pixel functions twice, not the scanline functions.
 182
 183 And we can get rid of fetchers for the bizarre formats that no one
 184 uses, such as b2g3r3. r1g2b1? Seriously? It is also worth
 185 considering a generic format based pixel fetcher for these edge cases.
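One way such a generic fetcher could look, assuming the format is
described by its four channel widths and that the channels are packed
in a, r, g, b order from the high bits down (so b2g3r3 style orderings
would need an extra shift table). format_t, expand_to_8() and
generic_fetch() are names invented for this sketch.

```c
#include <stdint.h>

/* Channel widths in bits, each assumed to be between 0 and 8. */
typedef struct
{
    int a, r, g, b;
} format_t;

uint32_t
channel_mask (int bits)
{
    return (1u << bits) - 1;
}

/* Expand an n-bit value to 8 bits by replicating its bit pattern. */
uint8_t
expand_to_8 (uint32_t v, int bits)
{
    uint32_t result;

    if (bits == 0)
        return 0;

    result = v << (8 - bits);
    for (int filled = bits; filled < 8; filled *= 2)
        result |= result >> filled;

    return (uint8_t) result;
}

/* Convert one raw pixel value to a8r8g8b8. */
uint32_t
generic_fetch (uint32_t raw, format_t f)
{
    uint32_t b = raw & channel_mask (f.b);
    uint32_t g = (raw >> f.b) & channel_mask (f.g);
    uint32_t r = (raw >> (f.b + f.g)) & channel_mask (f.r);
    uint32_t a = (raw >> (f.b + f.g + f.r)) & channel_mask (f.a);

    /* A format without alpha is treated as opaque. */
    uint32_t a8 = f.a ? expand_to_8 (a, f.a) : 0xff;

    return (a8 << 24) |
           ((uint32_t) expand_to_8 (r, f.r) << 16) |
           ((uint32_t) expand_to_8 (g, f.g) << 8) |
           (uint32_t) expand_to_8 (b, f.b);
}
```

This is slow compared to a dedicated fetcher, which is fine for formats
that nobody uses in practice.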
 186
 187 Since the actual routines depend on the image attributes, the images
 188 must be notified when those change and update their function pointers
 189 appropriately. So there should probably be a virtual function called
 190 (* reinit) or something like that.
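A sketch of how reinit could work, assuming a bits image whose only
interesting attribute is whether its format has an alpha channel. All
names here (image_t, bits_image_reinit(), image_set_has_alpha()) are
hypothetical and chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct image image_t;

struct image
{
    const uint32_t *bits;
    int             width;
    bool            has_alpha;  /* a8r8g8b8 if true, else x8r8g8b8 */

    void     (*reinit)    (image_t *image);
    uint32_t (*get_pixel) (const image_t *image, int x, int y);
};

uint32_t
fetch_a8r8g8b8 (const image_t *image, int x, int y)
{
    return image->bits[y * image->width + x];
}

uint32_t
fetch_x8r8g8b8 (const image_t *image, int x, int y)
{
    /* No alpha channel: the pixel is treated as opaque. */
    return image->bits[y * image->width + x] | 0xff000000u;
}

/* Pick the function pointers that match the current attributes. */
void
bits_image_reinit (image_t *image)
{
    image->get_pixel = image->has_alpha ? fetch_a8r8g8b8 : fetch_x8r8g8b8;
}

/* Every property setter ends by calling reinit. */
void
image_set_has_alpha (image_t *image, bool has_alpha)
{
    image->has_alpha = has_alpha;
    image->reinit (image);
}

void
bits_image_init (image_t *image, const uint32_t *bits, int width,
                 bool has_alpha)
{
    image->bits = bits;
    image->width = width;
    image->has_alpha = has_alpha;
    image->reinit = bits_image_reinit;
    image->reinit (image);
}
```

The point of the design is that the per-pixel code never tests the
attributes; the decision is made once, when the attribute changes.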
 191
 192 There will also be wide fetchers for both pixels and lines. The line
 193 fetcher will just call the wide pixel fetcher. The wide pixel fetcher
 194 will just call expand, except for 10 bit formats.
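For 8 bit formats the expansion could look like the sketch below, which
assumes that "wide" means 16 bits per channel packed into a 64 bit
a16r16g16b16 value and that each channel is widened by replicating its
byte. The function names are invented for the example.

```c
#include <stdint.h>

/* Widen one 8 bit channel to 16 bits so that 0xff maps to 0xffff. */
uint16_t
expand_channel (uint8_t c)
{
    return (uint16_t) ((c << 8) | c);
}

/* Wide pixel fetch for a8r8g8b8: just expand the narrow pixel. */
uint64_t
expand_pixel (uint32_t p)
{
    uint64_t a = expand_channel ((uint8_t) (p >> 24));
    uint64_t r = expand_channel ((uint8_t) (p >> 16));
    uint64_t g = expand_channel ((uint8_t) (p >> 8));
    uint64_t b = expand_channel ((uint8_t) p);

    return (a << 48) | (r << 32) | (g << 16) | b;
}

/* The wide line fetcher just calls the wide pixel fetcher. */
void
get_scanline_wide (const uint32_t *narrow, int width, uint64_t *wide)
{
    for (int i = 0; i < width; ++i)
        wide[i] = expand_pixel (narrow[i]);
}
```

10 bit formats cannot take this route because expanding from an 8 bit
intermediate would throw away the two extra bits.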
 195
 196 Rendering pipeline:
 197
 198 Drawable:
 199         0. if (picture has alpha map)
 200                 0.1. Position alpha map according to the alpha_x/alpha_y
 201                 0.2. Where the two drawables intersect, replace the
 202                      alpha channel of the source with the one from the
 203                      alpha map. Replacement only takes place in the
 204                      intersection of the two drawables' geometries.
 205         1. Repeat the drawable according to the repeat attribute
 206         2. Reconstruct a continuous image according to the filter
 207         3. Transform according to the transform attribute
 208         4. Position image such that src_x, src_y is over dst_x, dst_y
 209         5. Sample once per destination pixel
 210         6. Clip. If a pixel is not within the source clip, then no
 211            compositing takes place at that pixel. (Ie., it's *not*
 212            treated as 0).
 213
 214         Sampling a drawable:
 215
 216         - If the drawable does not have an alpha channel, the pixels in it
 217           are treated as opaque.
 218
 219         Note on reconstruction:
 220
 221         - The top left pixel has coordinates (0.5, 0.5) and pixels are
 222           spaced 1 apart.
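To make the (0.5, 0.5) convention concrete, here is a sketch of
bilinear reconstruction for a single 8 bit channel. It assumes that
samples outside the image are clamped to the edge (ie., PAD behaviour),
which is a simplification for the example; sample_bilinear() is not an
existing pixman function.

```c
#include <stdint.h>

int
clamp_int (int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* floor() for the small values used here, without needing libm */
int
floor_to_int (double v)
{
    int i = (int) v;

    return (v < i) ? i - 1 : i;
}

/* Reconstruct the image at the continuous position (x, y). */
double
sample_bilinear (const uint8_t *bits, int width, int height,
                 double x, double y)
{
    /* Pixel (i, j) has its center at (i + 0.5, j + 0.5). */
    double fx = x - 0.5;
    double fy = y - 0.5;
    int x0 = floor_to_int (fx);
    int y0 = floor_to_int (fy);
    double wx = fx - x0;
    double wy = fy - y0;

    /* The four neighbours, clamped to the image (an assumption). */
    int xa = clamp_int (x0, 0, width - 1);
    int xb = clamp_int (x0 + 1, 0, width - 1);
    int ya = clamp_int (y0, 0, height - 1);
    int yb = clamp_int (y0 + 1, 0, height - 1);

    double top = (1 - wx) * bits[ya * width + xa] +
                 wx * bits[ya * width + xb];
    double bottom = (1 - wx) * bits[yb * width + xa] +
                    wx * bits[yb * width + xb];

    return (1 - wy) * top + wy * bottom;
}
```

Sampling at (0.5, 0.5) returns the top left pixel unchanged, while
sampling at (1.0, 1.0) is an equal blend of the four pixels around that
corner.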
 223
 224 Gradient:
 225         1. Unless gradient type is conical, repeat the underlying (0, 1)
 226                 gradient according to the repeat attribute
 227         2. Integrate the gradient across the plane according to type.
 228         3. Transform according to transform attribute
 229         4. Position gradient
 230         5. Sample once per destination pixel.
 231         6. Clip
 232
 233 Solid Fill:
 234         1. Repeat has no effect
 235         2. Image is already continuous and defined for the entire plane
 236         3. Transform has no effect
 237         4. Positioning has no effect
 238         5. Sample once per destination pixel.
 239         6. Clip
 240
 241 Polygon:
 242         1. Repeat has no effect
 243         2. Image is already continuous and defined on the whole plane
 244         3. Transform according to transform attribute
 245         4. Position image
 246         5. Supersample 15x17 per destination pixel.
 247         6. Clip
 248
 249 Possibly interesting additions:
 250         - More general transformations, such as warping, or general
 251           shading.
 252
 253         - Shader image where a function is called to generate the
 254           pixel (ie., uploading assembly code).
 255
 256         - Resampling kernels
 257
 258           In principle the polygon image uses a 15x17 box filter for
 259           resampling. If we allow general resampling filters, then we
 260           get all the various antialiasing types for free.
 261
 262           Bilinear downsampling looks terrible and could be much
 263           improved by a resampling filter. NEAREST reconstruction
 264           combined with a box resampling filter is what GdkPixbuf
 265           does, I believe.
 266
 267           Useful for high frequency gradients as well.
 268
 269           (Note that the difference between a reconstruction and a
 270           resampling filter is mainly where in the pipeline they
 271           occur. High quality resampling should use a correctly
 272           oriented kernel so it should happen after transformation.
 273
 274           An implementation can transform the resampling kernel and
 275           convolve it with the reconstruction if it so desires, but it
 276           will need to deal with the fact that the resampling kernel
 277           will not necessarily be pixel aligned.)
 278
 279           "Output kernels"
 280
 281           One could imagine doing the resampling after compositing,
 282           ie., for each destination pixel sample each source image 16
 283           times, then composite those subpixels individually, then
 284           finally apply a kernel.
 285
 286           However, this is effectively the same as full screen
 287           antialiasing, which is a simpler way to think about it. So
 288           resampling kernels may make sense for individual images, but
 289           not as a post-compositing step.
 290
 291           Fullscreen AA is inefficient without chained compositing
 292           though. Consider an (image scaled up to oversample size IN
 293           some polygon) scaled down to screen size. With the current
 294           implementation, there will be a huge temporary. With chained
 295           compositing, the whole thing ends up being equivalent to the
 296           output kernel from above.
 297
 298         - Color space conversion
 299
 300           The complete model here is that each surface has a color
 301           space associated with it and that the compositing operation
 302           also has one associated with it. Note also that gradients
 303           should have associated colorspaces.
 304
 305         - Dithering
 306
 307           If people dither something that is already dithered, it will
 308           look terrible, but don't do that, then. (Dithering happens
 309           after resampling if at all - what is the relationship
 310           with color spaces? Presumably dithering should happen in linear
 311           intensity space).
 312
 313         - Floating point surfaces, 16, 32 and possibly 64 bit per
 314           channel.
 315
 316         Maybe crack:
 317
 318         - Glyph polygons
 319
 320           If glyphs could be given as polygons, they could be
 321           positioned and rasterized more accurately. The glyph
 322           structure would need subpixel positioning though.
 323
 324         - Luminance vs. coverage for the alpha channel
 325
 326           Whether the alpha channel should be interpreted as luminance
 327           modulation or as coverage (intensity modulation). This is a
 328           bit of a departure from the rendering model though. It could
 329           also be considered whether it should be possible to have
 330           both channels in the same drawable.
 331
 332         - Alternative for component alpha
 333
 334           - Set component-alpha on the output image.
 335
 336             - This means each of the components are sampled
 337               independently and composited in the corresponding
 338               channel only.
 339
 340           - Have 3 x oversampled mask
 341
 342           - Scale it down by 3 horizontally, with [ 1/3, 1/3, 1/3 ]
 343             resampling filter.
 344
 345             Is this equivalent to just using a component alpha mask?
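The downscaling step on its own is easy to write down. The sketch below
assumes an 8 bit mask whose width is exactly three times the
destination width and rounds each average to the nearest integer;
downscale_by_3() is a name made up for the example. It does not answer
the equivalence question.

```c
#include <stdint.h>

/* Scale a 3 x oversampled mask row down with a [ 1/3, 1/3, 1/3 ]
 * filter. The source row must contain 3 * dst_width samples.
 */
void
downscale_by_3 (const uint8_t *mask, int dst_width, uint8_t *out)
{
    for (int i = 0; i < dst_width; ++i)
    {
        int sum = mask[3 * i] + mask[3 * i + 1] + mask[3 * i + 2];

        /* Divide by 3, rounding to nearest. */
        out[i] = (uint8_t) ((sum + 1) / 3);
    }
}
```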
 346
 347         Incompatible changes:
 348
 349         - Gradients could be specified with premultiplied colors. (You
 350           can use a mask to get things like gradients from solid red to
 351           transparent red.)
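For reference, this is what premultiplying a color means, assuming
a8r8g8b8 colors with 8 bits per channel. It shows why "transparent red"
cannot be expressed in premultiplied form: once alpha is 0, all color
channels are 0 too. The helper names are made up for this sketch.

```c
#include <stdint.h>

/* Multiply two 8 bit values that represent fractions of 255. */
uint8_t
mul_un8 (uint8_t c, uint8_t a)
{
    return (uint8_t) ((c * a + 127) / 255);
}

/* Convert a non-premultiplied a8r8g8b8 color to premultiplied. */
uint32_t
premultiply (uint32_t argb)
{
    uint8_t a = (uint8_t) (argb >> 24);
    uint32_t r = mul_un8 ((uint8_t) (argb >> 16), a);
    uint32_t g = mul_un8 ((uint8_t) (argb >> 8), a);
    uint32_t b = mul_un8 ((uint8_t) argb, a);

    return ((uint32_t) a << 24) | (r << 16) | (g << 8) | b;
}
```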
 352
 353 Refactoring pixman
 354
 355 The pixman code is not particularly nice, to put it mildly. Among the
 356 issues are
 357
 358 - inconsistent naming style (fb vs Fb, camelCase vs
 359   underscore_naming). Sometimes there is even inconsistency *within*
 360   one name.
 361
 362       fetchProc32 ACCESS(pixman_fetchProcForPicture32)
 363
 364   may be one of the ugliest names ever created.
 365
 366   coding style:
 367          use the one from cairo except that pixman uses this brace style:
 368
 369                 while (blah)
 370                 {
 371                 }
 372
 373         Format do while like this:
 374
 375                do
 376                {
 377
 378                }
 379                while (...);
 380
 381 - pixman_composite_rect_general() is horribly complex
 382
 383 - switch case logic in pixman-access.c
 384
 385   Instead it would be better to just store function pointers in the
 386   image objects themselves,
 387
 388         get_pixel()
 389         get_scanline()
 390
 391 - Much of the scanline fetching code is for formats that no one
 392   ever uses. a2r2g2b2 anyone?
 393
 394   It would probably be worthwhile having a generic fetcher for any
 395   pixman format whatsoever.
 396
 397 - Code related to particular image types should be split into individual
 398   files.
 399
 400         pixman-bits-image.c
 401         pixman-linear-gradient-image.c
 402         pixman-radial-gradient-image.c
 403         pixman-solid-image.c
 404
 405 - Fast path code should be split into files based on architecture:
 406
 407        pixman-mmx-fastpath.c
 408        pixman-sse2-fastpath.c
 409        pixman-c-fastpath.c
 410
 411        etc.
 412
 413   Each of these files should then export a fastpath table, which would
 414   be declared in pixman-private.h. This should allow us to get rid
 415   of the pixman-mmx.h files.
 416
 417   The fast path table should describe each fast path. Ie., there should
 418   be bitfields indicating what things the fast path can handle, rather than
 419   like now where it is only allowed to take one format per src/mask/dest. Ie.,
 420
 421   {
 422     FAST_a8r8g8b8 | FAST_x8r8g8b8,
 423     FAST_null,
 424     FAST_x8r8g8b8,
 425     FAST_repeat_normal | FAST_repeat_none,
 426     the_fast_path
 427   }
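A sketch of what such a table and its lookup could look like. The
FAST_* names follow the example above, but their values, the field
names and lookup_fast_path() are assumptions made for the illustration.
In particular, FAST_null is given its own bit here so that "no mask"
can be matched with the same bit test as a real format.

```c
#include <stddef.h>
#include <stdint.h>

enum
{
    FAST_null          = 1 << 0,    /* no image, eg. no mask */
    FAST_a8r8g8b8      = 1 << 1,
    FAST_x8r8g8b8      = 1 << 2,
    FAST_repeat_none   = 1 << 3,
    FAST_repeat_normal = 1 << 4
};

typedef void (*fast_path_func_t) (void);

typedef struct
{
    uint32_t         src_formats;
    uint32_t         mask_formats;
    uint32_t         dest_formats;
    uint32_t         repeats;
    fast_path_func_t func;
} fast_path_t;

/* Placeholder for a real compositing routine. */
void
the_fast_path (void)
{
}

/* The table one architecture file would export. */
const fast_path_t c_fast_paths[] =
{
    {
        FAST_a8r8g8b8 | FAST_x8r8g8b8,
        FAST_null,
        FAST_x8r8g8b8,
        FAST_repeat_normal | FAST_repeat_none,
        the_fast_path
    }
};

/* Each argument is the single bit describing the actual operand. */
const fast_path_t *
lookup_fast_path (const fast_path_t *table, size_t n_entries,
                  uint32_t src, uint32_t mask, uint32_t dest,
                  uint32_t repeat)
{
    for (size_t i = 0; i < n_entries; ++i)
    {
        const fast_path_t *entry = &table[i];

        if ((entry->src_formats & src) &&
            (entry->mask_formats & mask) &&
            (entry->dest_formats & dest) &&
            (entry->repeats & repeat))
        {
            return entry;
        }
    }

    return NULL;
}
```

With bitfields, one entry covers what would otherwise be four separate
rows (two source formats times two repeat modes).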
 428
 429 There should then be *one* file that implements pixman_image_composite().
 430 This should do this:
 431
 432      optimize_operator();
 433
 434      convert 1x1 repeat to solid (actually this should be done at
 435      image creation time).
 436
 437      is there a useful fastpath?
 438
 439 There should be a file called pixman-cpu.c that contains all the
 440 architecture specific stuff to detect what CPU features we have.
 441
 442 Issues that must be kept in mind:
 443
 444        - we need accessor code to be preserved
 445
 446        - maybe there should be a "store_scanline" too?
 447
 448          Is this sufficient?
 449
 450          We should preserve the optimization where the
 451          compositing happens directly in the destination
 452          whenever possible.
 453
 454         - It should be possible to create GPU samplers from the
 455           images.
 456
 457 The "horizontal" classification should be a bit in the image, the
 458 "vertical" classification should just happen inside the gradient
 459 file. Note though that
 460
 461       (a) these will change if the transformation/repeat changes.
 462
 463       (b) at the moment the optimization for linear gradients
 464           takes the source rectangle into account. Presumably
 465           this is to also optimize the case where the gradient
 466           is close enough to horizontal?
 467
 468 Who is responsible for repeats? In principle it should be the scanline
 469 fetch. Right now NORMAL repeats are handled by walk_composite_region()
 470 while other repeats are handled by the scanline code.
 471
 472
 473 (Random note on filtering: do you filter before or after
 474 transformation?  Hardware is going to filter after transformation;
 475 this is also what pixman does currently). It's not completely clear
 476 what filtering *after* transformation means. One thing that might look
 477 good would be to do *supersampling*, ie., compute multiple subpixels
 478 per destination pixel, then average them together.
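The supersampling idea can be sketched as follows, assuming an n x n
regular grid of subpixels and a callback that returns the (already
transformed and reconstructed) source value at a continuous position.
sample_func_t, supersample() and left_half() are invented for the
example.

```c
/* Returns the source value at a continuous position. A real
 * implementation would apply the transform and the reconstruction
 * filter inside this function.
 */
typedef double (*sample_func_t) (double x, double y);

/* Average n x n subpixel samples inside destination pixel
 * (dst_x, dst_y), which covers [dst_x, dst_x + 1) x [dst_y, dst_y + 1).
 */
double
supersample (sample_func_t sample, int dst_x, int dst_y, int n)
{
    double sum = 0.0;

    for (int j = 0; j < n; ++j)
    {
        for (int i = 0; i < n; ++i)
        {
            double sx = dst_x + (i + 0.5) / n;
            double sy = dst_y + (j + 0.5) / n;

            sum += sample (sx, sy);
        }
    }

    return sum / (n * n);
}

/* Example source: 1 to the left of x = 0.5, 0 elsewhere. */
double
left_half (double x, double y)
{
    (void) y;

    return x < 0.5 ? 1.0 : 0.0;
}
```

With a 4 x 4 grid, the destination pixel at (0, 0) is half covered by
the example source, so its averaged value comes out at one half.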