ARM: NEON: Bilinear macro template for instruction scheduling
author Taekyun Kim <tkq.kim@samsung.com>
Tue, 20 Sep 2011 12:32:35 +0000 (21:32 +0900)
committer Taekyun Kim <tkq.kim@samsung.com>
Tue, 18 Oct 2011 04:00:06 +0000 (13:00 +0900)
commit 6682b2b3597c9f431900bfe7b1b42dfbe006bae5
tree 5f1f80ff135a2f1da94e0fce5c02f467b684a7c4
parent b5e4355fa4973e3edd4abeb11bdc47c42371cc76

This macro template takes 6 code blocks.

1. process_last_pixel
2. process_two_pixels
3. process_four_pixels
4. process_pixblock_head
5. process_pixblock_tail
6. process_pixblock_tail_head

process_last_pixel does not need to update the horizontal weight; the
template does this. The two- and four-pixel code blocks must update the
horizontal weight themselves. The head/tail/tail_head blocks make up
the unrolled core loop. You can apply instruction scheduling to the
tail_head block.
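
The division of work among the six blocks can be sketched in C. This is
only an illustration of the control flow the description implies, not
the template's actual assembly: the function names, the trace_t
structure, and the order in which leftover pixels are handled are all
assumptions made for the example.

```c
/* Hypothetical counters standing in for the six code blocks. */
typedef struct {
    int last_pixel, two_pixels, four_pixels;
    int head, tail, tail_head;
    int pixels_done;
} trace_t;

/* Leftover handlers: each one finishes a fixed number of pixels. */
static void process_last_pixel(trace_t *t)  { t->last_pixel++;  t->pixels_done += 1; }
static void process_two_pixels(trace_t *t)  { t->two_pixels++;  t->pixels_done += 2; }
static void process_four_pixels(trace_t *t) { t->four_pixels++; t->pixels_done += 4; }

/* head starts a pixel block but does not finish it. */
static void process_pixblock_head(trace_t *t) { t->head++; }

/* tail finishes the block that head (or tail_head) started. */
static void process_pixblock_tail(trace_t *t, int size)
{
    t->tail++;
    t->pixels_done += size;
}

/* tail_head finishes the current block and starts the next one;
 * merging the two is what leaves room for instruction scheduling. */
static void process_pixblock_tail_head(trace_t *t, int size)
{
    t->tail_head++;
    t->pixels_done += size;
}

/* Assumed dispatch order: whole pixel blocks first, then 4, 2, 1. */
static trace_t run_scanline(int width, int pixblock_size)
{
    trace_t t = {0};
    int n = width;

    if (n >= pixblock_size) {
        process_pixblock_head(&t);
        n -= pixblock_size;
        while (n >= pixblock_size) {
            process_pixblock_tail_head(&t, pixblock_size);
            n -= pixblock_size;
        }
        process_pixblock_tail(&t, pixblock_size);
    }
    /* n is now smaller than pixblock_size (4 or 8). */
    if (n & 4) process_four_pixels(&t);
    if (n & 2) process_two_pixels(&t);
    if (n & 1) process_last_pixel(&t);
    return t;
}
```

With this sketch, run_scanline(21, 8) runs head once, tail_head once and
tail once (16 pixels), then the four-pixel block and the last-pixel
block for the remaining 5.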

You can also specify the size of the pixel block. Supported sizes are 4
and 8. If you want to use a mask, pass the BILINEAR_FLAG_USE_MASK flag
to the template; you can then use the MASK register. When using the
d8~d15 registers, pass BILINEAR_FLAG_USE_ALL_NEON_REGS to make sure
they are properly saved on the stack and later restored.
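
The effect of the two flags can be modelled in C as a small bitmask
check. This is a sketch only: the bit values, the prologue_plan_t
structure and plan_prologue() are hypothetical, since the commit message
names the flags but not their encoding. The save/restore requirement
itself follows from the ARM procedure call standard, under which d8~d15
are callee-saved.

```c
/* Assumed encoding; only the flag names come from the commit message. */
enum {
    BILINEAR_FLAG_USE_MASK          = 1 << 0,
    BILINEAR_FLAG_USE_ALL_NEON_REGS = 1 << 1
};

/* Hypothetical summary of what the template sets up for a function. */
typedef struct {
    int mask_available; /* code blocks may refer to the MASK register */
    int saves_d8_d15;   /* d8~d15 are pushed on entry and popped on exit */
} prologue_plan_t;

static prologue_plan_t plan_prologue(unsigned flags)
{
    prologue_plan_t p;
    p.mask_available = (flags & BILINEAR_FLAG_USE_MASK) != 0;
    p.saves_d8_d15   = (flags & BILINEAR_FLAG_USE_ALL_NEON_REGS) != 0;
    return p;
}
```

The flags are independent, so a masked function that also clobbers
d8~d15 would pass both of them combined with a bitwise OR.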
pixman/pixman-arm-neon-asm-bilinear.S