Optimize CPU version performance of the nonzero function. (#15190)
author Vitaly Fedyunin <vitalyf@fb.com>
Wed, 9 Jan 2019 21:33:34 +0000 (13:33 -0800)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Wed, 9 Jan 2019 21:37:38 +0000 (13:37 -0800)
commit 5838b59c5d6a18bc0a76d3bc592d71f6dc1a0929
tree f5e16b6b6972a8b1d6384ae2c38dcf9f6de1fe12
parent 0571eaebab3d7785449fb1efb9bed5f1fbf45f44
Summary:
Optimized the CPU implementation of nonzero. It is now about 2x faster on average than NumPy's nonzero.

It can be further optimized for 1D tensors and boolean tensors.
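
For readers unfamiliar with the operation, here is a minimal sketch of one common way to compute nonzero indices on the CPU: count first, allocate once, then fill while stepping a multi-dimensional index counter. It is an illustration only and is not taken from this commit. The function name nonzero_indices, the float element type, and the contiguous row-major layout are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, NOT the code from this commit.
// Returns the multi-dimensional indices of the nonzero elements of a
// contiguous row-major array as a flat (count x ndim) buffer.
// Assumes data.size() equals the product of the entries in sizes.
std::vector<int64_t> nonzero_indices(const std::vector<float>& data,
                                     const std::vector<int64_t>& sizes) {
  const std::size_t ndim = sizes.size();

  // Pass 1: count nonzero elements so the output is allocated exactly once.
  std::size_t count = 0;
  for (float v : data) {
    if (v != 0.0f) ++count;
  }

  std::vector<int64_t> out(count * ndim);
  std::vector<int64_t> idx(ndim, 0);  // running multi-dimensional index
  std::size_t row = 0;

  // Pass 2: walk the data linearly and keep the index counter in step,
  // rather than dividing the flat offset by the strides for every element.
  for (float v : data) {
    if (v != 0.0f) {
      for (std::size_t d = 0; d < ndim; ++d) out[row * ndim + d] = idx[d];
      ++row;
    }
    // Advance the counter like an odometer, last dimension fastest.
    for (std::size_t d = ndim; d-- > 0;) {
      if (++idx[d] < sizes[d]) break;
      idx[d] = 0;
    }
  }
  return out;
}
```

For a 2x3 input holding 0, 1, 0, 2, 0, 3, the returned buffer holds the index pairs (0, 1), (1, 0) and (1, 2). A 1D or boolean specialization, as mentioned above, could skip the odometer step or the floating-point comparison respectively.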
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15190

Differential Revision: D13468570

Pulled By: VitalyFedyunin

fbshipit-source-id: e55ce54d60626a42d9a10a02e407856458b8055e
aten/src/TH/generic/THTensorEvenMoreMath.cpp