[QNN] Requantize operator (#3531)
author Animesh Jain <anijain@umich.edu>
Thu, 8 Aug 2019 18:41:24 +0000 (11:41 -0700)
committer Tianqi Chen <tqchen@users.noreply.github.com>
Thu, 8 Aug 2019 18:41:24 +0000 (11:41 -0700)
commit a78adbd53a7ef608cc3821ac0c0424dba3fc687a
tree b96c31f51ad04cfb4281bbc1c283141a7826e1b8
parent 60607eff494ab0b40ec21d6faf72c2f3dcef502f

* [Relay] [Quantization] WIP - Common files for the quantization work.

* [Relay] [Quantization] WIP - Prototyping requantize op.

* Requantize operator implementation.

Requantize converts one quantized tensor representation to another quantized
representation. This PR has the following implementation features:

- Requantize operator defined in qnn namespace - relay.qnn.requantize
- Lowering of requantize to existing Relay operators
- Integer fixed point implementation of requantize
    - Two rounding modes - FE_UPWARDS (round towards infinity) and
    FE_AWAY_FROM_ZERO (std::round behavior)
- A floating point implementation as well, which can act as a reference or be
used on devices when FP32 computation is not used.
- Unit test cases

Relevant Issue - https://github.com/dmlc/tvm/issues/2351

Credit to TFLite and GemmLowp for providing reference implementations.
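
For readers unfamiliar with requantization, the features listed above can be
illustrated with a minimal plain-Python sketch. It is not the code added by
this PR: the function names, the default int8 clamp range, and the handling of
the case where the rounded significand reaches 2^31 are all assumptions made
for illustration. It assumes the usual affine scheme
real = scale * (q - zero_point), and models a floating point reference next to
an integer-only path that uses a 31-bit fixed point multiplier and a shift.

```python
import math


def round_half(x, rounding):
    """Round a float to an int, breaking ties according to `rounding`."""
    if rounding == "FE_UPWARDS":
        # Ties go towards +infinity: -1.5 -> -1, 1.5 -> 2.
        return math.floor(x + 0.5)
    # FE_AWAY_FROM_ZERO (std::round behavior): -1.5 -> -2, 1.5 -> 2.
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)


def requantize_float(q_in, in_scale, in_zp, out_scale, out_zp,
                     rounding="FE_UPWARDS", qmin=-128, qmax=127):
    """Floating point reference: rescale, round, shift, clamp."""
    scaled = (in_scale / out_scale) * (q_in - in_zp)
    return max(qmin, min(qmax, round_half(scaled, rounding) + out_zp))


def fixed_point_multiplier_shift(real_multiplier):
    """Hypothetical helper: split a positive float into (int32 multiplier, exponent)
    such that real_multiplier ~= multiplier * 2**(exponent - 31)."""
    significand, exponent = math.frexp(real_multiplier)  # significand in [0.5, 1)
    multiplier = math.floor(significand * (1 << 31) + 0.5)
    # Assumed corner case: rounding can push the multiplier to exactly 2**31,
    # which no longer fits in int32, so halve it and bump the exponent.
    if multiplier == (1 << 31):
        multiplier //= 2
        exponent += 1
    return multiplier, exponent


def requantize_fixed_point(q_in, in_scale, in_zp, out_scale, out_zp,
                           rounding="FE_UPWARDS", qmin=-128, qmax=127):
    """Integer-only path: multiply, add a rounding term, arithmetic shift."""
    multiplier, exponent = fixed_point_multiplier_shift(in_scale / out_scale)
    right_shift = 31 - exponent
    # Python ints are unbounded; a C++ version would need a 64-bit product.
    value = (q_in - in_zp) * multiplier
    if right_shift > 0:
        half = 1 << (right_shift - 1)
        if rounding == "FE_UPWARDS" or value >= 0:
            value += half
        else:
            value += half - 1  # negative ties round away from zero
        value >>= right_shift  # arithmetic shift, i.e. floor division
    else:
        value <<= -right_shift
    return max(qmin, min(qmax, value + out_zp))
```

With in_scale=0.5, out_scale=1.0 and zero points of 0, an input of -3 maps to
the real value -1.5, so the two rounding modes differ: FE_UPWARDS yields -1 and
FE_AWAY_FROM_ZERO yields -2, in both the float and the fixed point path.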

* Typo and lint fixes.

* Doc fix.

* Uncommenting the lint script (fixing mistake).

* Modifying the unit tests.

* Moving C++ files into src/relay/qnn

* Moving python files to python/tvm/relay/qnn. Some minor fixes.

* Moving the attrs.h inside the include directory.

* Pushing files that I forgot earlier. Changing util location.

* Incorporating comments. API change. Lint fixes.

* Modifying the GetFixedPointMultiplierShift API as per comments.

* Forgot the dialect change.

* Changing rewrite to qnn_lower.

* Renaming Quantize to Qnn for clarity.

* Remove use_int_domain.

* Incorporating review comments.

* Adding API doc for QNN dialect.

* Move the qnn_lower pass to transform namespace.

* Moving from expr to module. Adding namespace in C++.

* Minor sentence rewrites. Added qnn namespace.

* Added the API doc.

* Changing default out_dtype to int8. Adding a test with in/out_dtype as uint8.

* Style fixes. Better error messages.

* Adding documentation.

* More documentation fixes.

* Adding out dtype check for requantize.

* Adding corner case for FP32 to fixed point conversion.

* Adding extra line.

* Documentation fix.

* Adding static inline.

* Incorporating jackwish comment. Removed idtype from requantize lowering.

* Removing Quantize/Dequantize code. Restricting Requantize to (u)int8/int32.

* Style fixes.

* Fix the docs.

* Move to Legalize API.
docs/langref/relay_op.rst
include/tvm/relay/qnn/attrs.h [new file with mode: 0644]
python/tvm/relay/__init__.py
python/tvm/relay/qnn/__init__.py [new file with mode: 0644]
python/tvm/relay/qnn/op/__init__.py [new file with mode: 0644]
python/tvm/relay/qnn/op/_make.py [new file with mode: 0644]
python/tvm/relay/qnn/op/qnn.py [new file with mode: 0644]
src/relay/pass/pattern_util.h
src/relay/qnn/op/requantize.cc [new file with mode: 0644]
src/relay/qnn/util.h [new file with mode: 0644]
tests/python/relay/test_qnn_requantize.py [new file with mode: 0644]