[MicroBench] Added a log_vml version of the signed log1p kernel (#64205)
author Raghavan Raman <raghavanr@fb.com>
Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)
committer Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com>
Fri, 10 Sep 2021 23:49:06 +0000 (16:49 -0700)
commit 2cc97784950739a0a71abad59ee263e7583ea080
tree 148ddbbb29d8ea3709ea162d8490d81e5ba1d24d
parent cad7a4b0eab0001a98ef15a787c841d52e04652c

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64205

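The log_vml version of the micro-benchmark is over **2x** faster than both log1p versions: by wall-clock time, roughly 2.1x faster than the NNC log1p kernel and roughly 2.3x faster than the ATen one. Here are the perf numbers: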

```
---------------------------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------
SignedLog1pBench/ATen/10/1467           45915 ns        45908 ns        14506 GB/s=2.5564G/s
SignedLog1pBench/NNC/10/1467            40469 ns        40466 ns        17367 GB/s=2.9002G/s
SignedLog1pBench/NNCLogVml/10/1467      19560 ns        19559 ns        35902 GB/s=6.00016G/s
```
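
For readers unfamiliar with the kernel, here is a minimal standalone sketch of the two formulations being compared. It is not the benchmark code: it assumes the signed log1p kernel computes `sign(x) * log1p(|x|)` and that the log_vml variant computes the same value as `sign(x) * log(|x| + 1)`. The function names are hypothetical, and plain `<cmath>` calls stand in for the ATen and NNC operators.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Sign of x as a float: -1, 0, or +1.
inline float sign_of(float x) {
  return static_cast<float>((x > 0.0f) - (x < 0.0f));
}

// Assumed reference form: sign(x) * log1p(|x|).
inline float signed_log1p(float x) {
  return sign_of(x) * std::log1p(std::fabs(x));
}

// Assumed alternative form: sign(x) * log(|x| + 1).
// Here std::log is only a stand-in for a vectorized log routine.
inline float signed_log_plus_one(float x) {
  return sign_of(x) * std::log(std::fabs(x) + 1.0f);
}

// Largest absolute difference between the two forms over the inputs.
inline float max_abs_diff(const std::vector<float>& xs) {
  float worst = 0.0f;
  for (float x : xs) {
    worst = std::max(
        worst, std::fabs(signed_log1p(x) - signed_log_plus_one(x)));
  }
  return worst;
}
```

Calling `max_abs_diff` on a handful of moderate inputs such as `{0.5f, -0.5f, 1.0f, -3.0f}` shows the two forms agreeing to within float rounding. For very small `|x|`, `log(|x| + 1)` loses precision relative to `log1p`, so the faster form is a trade-off rather than a drop-in replacement for all input ranges.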

Thanks to bertmaher for pointing this out.

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30644716

Pulled By: navahgar

fbshipit-source-id: ba2b32c79d4265cd48a2886b0c62d0e89ff69c19
benchmarks/cpp/tensorexpr/bench_signed_log1p.cpp