Move immintrin/arm_neon includes to where they are used.
author mtklein <mtklein@chromium.org>
Tue, 7 Jun 2016 16:35:27 +0000 (09:35 -0700)
committer Commit bot <commit-bot@chromium.org>
Tue, 7 Jun 2016 16:35:28 +0000 (09:35 -0700)
commit 12dfaaa53c23f3d03050bde8f64136ac1f44164a
tree 63cfa96123575974f0560f785b3bc63367e15e63
parent d62e28b19a23b913c549b7891ecf79e779577181
Move immintrin/arm_neon includes to where they are used.

On my Mac (so, immintrin), this improves compile time, both wall and CPU,
by about 16%.  To test, I ran this on an SSD with the files hot in cache:

  $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
    ninja -C out/Release -t clean && \
    time ninja -C out/Release

  Before: 159 wall / 3367 cpu
          159 wall / 3368 cpu

  After:  137 wall / 2860 cpu
          136 wall / 2863 cpu

I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc.
That made no significant difference, so I've kept immintrin for its simplicity.
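
As a rough illustration of the pattern (a sketch, not code from this change), a
header that actually uses SIMD intrinsics can pull in immintrin.h or arm_neon.h
itself, so files that never touch intrinsics do not pay to parse them. The
sum4 helper and the exact preprocessor checks below are hypothetical and chosen
only for the example:

```cpp
// Illustrative sketch only; sum4 is a hypothetical helper, not Skia code.
// The intrinsics header is included here, in the one file that needs it,
// rather than in a widely included base header.
#if defined(__SSE2__) || defined(_M_X64)
    #include <immintrin.h>

    // Sum four floats with SSE.
    static inline float sum4(const float v[4]) {
        __m128 x  = _mm_loadu_ps(v);              // [a, b, c, d]
        __m128 hi = _mm_movehl_ps(x, x);          // [c, d, c, d]
        __m128 s  = _mm_add_ps(x, hi);            // [a+c, b+d, ...]
        __m128 s1 = _mm_shuffle_ps(s, s, 1);      // lane 0 = b+d
        return _mm_cvtss_f32(_mm_add_ss(s, s1));  // (a+c) + (b+d)
    }
#elif defined(__ARM_NEON)
    #include <arm_neon.h>

    // Sum four floats with NEON.
    static inline float sum4(const float v[4]) {
        float32x4_t x = vld1q_f32(v);                                 // [a, b, c, d]
        float32x2_t s = vadd_f32(vget_low_f32(x), vget_high_f32(x));  // [a+c, b+d]
        s = vpadd_f32(s, s);                                          // lane 0 = a+c+b+d
        return vget_lane_f32(s, 0);
    }
#else
    // Portable fallback: no intrinsics header is needed at all.
    static inline float sum4(const float v[4]) {
        return (v[0] + v[2]) + (v[1] + v[3]);
    }
#endif
```

Only translation units that include this header see the intrinsics header; a
common base header stays free of it.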

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

TBR=reed@google.com
No public API changes.

Review-Url: https://codereview.chromium.org/2045633002
include/core/SkTypes.h
include/private/SkFloatingPoint.h
src/opts/SkBlurImageFilter_opts.h
src/opts/SkNx_neon.h
src/opts/SkNx_sse.h
src/opts/SkSwizzler_opts.h