jumper, factor out load4() and from_half()
author Mike Klein <mtklein@chromium.org>
Tue, 4 Apr 2017 02:21:15 +0000 (22:21 -0400)
committer Skia Commit-Bot <skia-commit-bot@chromium.org>
Tue, 4 Apr 2017 13:57:54 +0000 (13:57 +0000)
commit 114e6b33d67537f034b749e77f68d168ef9bfbc6
tree 6b92567de9d110f80da64e1eb48778f764dca229
parent 88ec28e3d7567ec2c3e26fed66c16a68a8f8ae64

Splitting load_f16 apart gives slightly worse codegen on ARMv7, SSE2,
SSE4.1, and AVX than the previous fused versions did.  But the stage
code becomes much simpler.

I'm happy to make those trades until someone complains.

load4() will be useful on its own to implement a couple other stages.
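
To illustrate the split, here is a minimal scalar sketch of how a load_f16
stage can be composed from the two helpers.  It is not the code from this
commit: the lane count N, the array-based U16 and F types, and the function
bodies are all assumptions made for illustration, and the real helpers in
SkJumper_vectors.h are written with per-platform SIMD.  This from_half()
also simplifies by flushing denormals to zero and not handling inf or NaN.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical lane count and vector types, standing in for real SIMD types.
constexpr std::size_t N = 4;
using U16 = std::array<uint16_t, N>;
using F   = std::array<float, N>;

// Deinterleave N pixels stored as rgba rgba ... (16 bits per channel)
// into four planar vectors, one per channel.
inline void load4(const uint16_t* ptr, U16* r, U16* g, U16* b, U16* a) {
    for (std::size_t i = 0; i < N; i++) {
        (*r)[i] = ptr[4*i + 0];
        (*g)[i] = ptr[4*i + 1];
        (*b)[i] = ptr[4*i + 2];
        (*a)[i] = ptr[4*i + 3];
    }
}

// Convert half floats to floats, lane by lane.
// Assumption of this sketch: denormal inputs become (signed) zero, and
// inf/NaN inputs are not supported.
inline F from_half(const U16& h) {
    F out;
    for (std::size_t i = 0; i < N; i++) {
        uint32_t sign = static_cast<uint32_t>(h[i] & 0x8000u) << 16;
        uint32_t em   = h[i] & 0x7fffu;              // exponent and mantissa
        uint32_t bits = sign;
        if (em >= 0x0400u) {                         // normal half value
            // Move the fields into float position and rebias the
            // exponent from 15 to 127.
            bits |= (em << 13) + ((127u - 15u) << 23);
        }
        std::memcpy(&out[i], &bits, sizeof(bits));
    }
    return out;
}

// The stage body is now just the two helpers in sequence.
inline void load_f16(const uint16_t* ptr, F* r, F* g, F* b, F* a) {
    U16 R, G, B, A;
    load4(ptr, &R, &G, &B, &A);
    *r = from_half(R);
    *g = from_half(G);
    *b = from_half(B);
    *a = from_half(A);
}
```

Because load4() only deinterleaves, the same helper could feed any other
conversion for a 4x16-bit pixel format in place of from_half().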

Everything draws the same.  I intend to follow up with more of the
same sort of refactoring, but this change was tricky enough that I
want to do the rest in small steps.

Change-Id: Ib4aa86a58d000f2d7916937cd4f22dc2bd135a49
Reviewed-on: https://skia-review.googlesource.com/11186
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
src/jumper/SkJumper_generated.S
src/jumper/SkJumper_generated_win.S
src/jumper/SkJumper_stages.cpp
src/jumper/SkJumper_vectors.h