Fold col offsets into bias; optimize A symmetric quant (#16942)
authorJongsoo Park <jongsoo@fb.com>
Wed, 13 Feb 2019 01:00:33 +0000 (17:00 -0800)
committerFacebook Github Bot <facebook-github-bot@users.noreply.github.com>
Wed, 13 Feb 2019 01:33:06 +0000 (17:33 -0800)
commit92221ad8408c9e95d6e140b857225bd228907d1d
tree640c11d8851c5a208ca032857bf967d1e183f1ed
parent3e1e5d5a8bdbee579339144110e77124cf4e62b2
Fold col offsets into bias; optimize A symmetric quant (#16942)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16942

We can fold the column offsets into the bias when the zero point of the activation is constant.
fbgemm still needs to provide an option to pass column offsets separately for the case where the activation zero point keeps changing (e.g., dynamic quantization).
A trick to optimize the static quantization case is to set the A zero point to 0 after folding it into the bias.
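A minimal sketch of why the folding works, in plain Python with hypothetical shapes and quantization parameters (the names `A_zp`, `col_offsets`, etc. are illustrative, not fbgemm's API): the A_zp * col_offsets term in the quantized matmul is constant per output column when A_zp is fixed, so it can be precomputed into the bias.

```python
# Hypothetical small matrices: uint8 activations A, int8 weights B.
M, K, N = 2, 3, 2
A = [[130, 7, 200], [64, 255, 3]]
B = [[-5, 12], [30, -1], [7, 7]]
bias = [4, -9]
A_zp, B_zp = 13, 5  # constant zero points (static quantization)

# Reference: subtract zero points explicitly, then add the bias.
ref = [[sum((A[i][k] - A_zp) * (B[k][j] - B_zp) for k in range(K)) + bias[j]
        for j in range(N)] for i in range(M)]

# col_offsets[j] = sum_k B[k][j]; row_offsets[i] = sum_k A[i][k]
col_offsets = [sum(B[k][j] for k in range(K)) for j in range(N)]
row_offsets = [sum(A[i][k] for k in range(K)) for i in range(M)]

# Fold the A_zp-dependent terms into the bias once, ahead of time;
# at run time A_zp can then be treated as 0.
folded_bias = [bias[j] - A_zp * (col_offsets[j] - K * B_zp) for j in range(N)]
out = [[sum(A[i][k] * B[k][j] for k in range(K))
        - B_zp * row_offsets[i] + folded_bias[j]
        for j in range(N)] for i in range(M)]

assert out == ref
```

The identity behind it: sum_k (A_ik - A_zp)(B_kj - B_zp) = (A B)_ij - B_zp * row_offsets_i - A_zp * col_offsets_j + K * A_zp * B_zp; the last two terms depend only on j, so they move into the bias.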

This diff also optimizes the case where weights use symmetric quantization: when the B zero point is 0, we use PackAMatrix instead of PackAWithRowOffset.
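A sketch (plain Python, hypothetical numbers) of why symmetric weight quantization allows the cheaper A packing: with B_zero_point == 0 the row-offset term of the quantized matmul vanishes, so packing A no longer needs to compute per-row sums at all, which is the difference between PackAMatrix and PackAWithRowOffset.

```python
M, K, N = 2, 3, 2
A = [[130, 7, 200], [64, 255, 3]]
B = [[-5, 12], [30, -1], [7, 7]]
A_zp, B_zp = 13, 0  # B_zp == 0: symmetric weight quantization

ref = [[sum((A[i][k] - A_zp) * (B[k][j] - B_zp) for k in range(K))
        for j in range(N)] for i in range(M)]

# No row offsets needed: the -B_zp * row_offsets[i] term is zero.
col_offsets = [sum(B[k][j] for k in range(K)) for j in range(N)]
out = [[sum(A[i][k] * B[k][j] for k in range(K)) - A_zp * col_offsets[j]
        for j in range(N)] for i in range(M)]

assert out == ref
```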

TODO:
Ideally, PackAWithRowOffset should perform as fast as PackAMatrix when B_zero_point is 0, to keep client code simpler.
The same applies to PackAWithIm2Col and depth-wise convolution (group convolution already does this).

Reviewed By: csummersea

Differential Revision: D14013931

fbshipit-source-id: e4d313343e2a16a451eb910beed30e35de02a40c
caffe2/quantization/server/fully_connected_dnnlowp_acc16_op.cc
caffe2/quantization/server/fully_connected_dnnlowp_op.cc
caffe2/quantization/server/fully_connected_dnnlowp_op.h
caffe2/quantization/server/fully_connected_dnnlowp_op_test.py