Fold col offsets into bias; optimize A symmetric quant (#16942)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16942
We can fold col offsets into the bias when the zero point of the activation is constant.
fbgemm still needs to provide an option to pass col offsets in case the zero point of the activation keeps changing (e.g., dynamic quantization).
A trick to optimize the static quantization case is to set the A zero point to 0 after folding the col offsets into the bias.
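The folding above can be checked with a small pure-Python sketch. This is not fbgemm code: the function names are hypothetical, and the definition of a col offset as the column sum of B minus K * B_zero_point is an assumption made for illustration.

```python
def reference(A, B, a_zp, b_zp, bias):
    """Reference result: sum_k (A[i][k] - a_zp) * (B[k][j] - b_zp) + bias[j]."""
    M, K, N = len(A), len(B), len(B[0])
    return [
        [sum((A[i][k] - a_zp) * (B[k][j] - b_zp) for k in range(K)) + bias[j]
         for j in range(N)]
        for i in range(M)
    ]


def fold_col_offsets_into_bias(B, a_zp, b_zp, bias):
    """Precompute bias'[j] = bias[j] - a_zp * col_offset[j] (hypothetical helper).

    Assumed definition: col_offset[j] = sum_k B[k][j] - K * b_zp.
    Only valid when a_zp is known ahead of time (static quantization).
    """
    K, N = len(B), len(B[0])
    col_offsets = [sum(B[k][j] for k in range(K)) - K * b_zp for j in range(N)]
    return [bias[j] - a_zp * col_offsets[j] for j in range(N)]


def folded(A, B, b_zp, folded_bias):
    """Same computation with the A zero point set to 0 and the folded bias."""
    return reference(A, B, 0, b_zp, folded_bias)


A = [[3, 5, 7], [2, 4, 6]]
B = [[1, -2], [0, 3], [-1, 4]]
bias = [10, -5]
a_zp, b_zp = 2, 1

folded_bias = fold_col_offsets_into_bias(B, a_zp, b_zp, bias)
assert folded(A, B, b_zp, folded_bias) == reference(A, B, a_zp, b_zp, bias)
```

With dynamic quantization, a_zp changes per call, so the folded bias cannot be precomputed and the col offsets have to be passed separately.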
This diff also optimizes the case where weights use symmetric quantization: when the B zero point is 0, we use PackAMatrix instead of PackAWithRowOffset.
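A rough sketch of why row offsets become unnecessary when the B zero point is 0, assuming the A zero point has already been folded into the bias. The helper names are hypothetical and only mirror the arithmetic, not the fbgemm packing classes.

```python
def matmul_with_row_offsets(A, B, b_zp, bias):
    """acc[i][j] = sum_k A[i][k] * B[k][j] - b_zp * row_offset[i] + bias[j].

    row_offset[i] is the sum of row i of A; computing it is the extra work
    that a row-offset-aware packing routine has to do.
    """
    K, N = len(B), len(B[0])
    row_offsets = [sum(row) for row in A]
    return [
        [sum(A[i][k] * B[k][j] for k in range(K)) - b_zp * row_offsets[i] + bias[j]
         for j in range(N)]
        for i in range(len(A))
    ]


def matmul_plain(A, B, bias):
    """Plain integer matmul plus bias; no row offsets are computed."""
    K, N = len(B), len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(K)) + bias[j] for j in range(N)]
        for i in range(len(A))
    ]


A = [[3, 5, 7], [2, 4, 6]]
B = [[1, -2], [0, 3], [-1, 4]]
bias = [10, -5]

# With symmetric weights (b_zp == 0) the row-offset term vanishes,
# so the cheaper path gives the same result.
assert matmul_with_row_offsets(A, B, 0, bias) == matmul_plain(A, B, bias)
```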
TODO:
Ideally, PackAWithRowOffset should perform as fast as PackAMatrix when B_zero_point is 0, to make client code simpler.
Do the same in PackAWithIm2Col and depth-wise convolution (group convolution already does this).
Reviewed By: csummersea
Differential Revision: D14013931
fbshipit-source-id: e4d313343e2a16a451eb910beed30e35de02a40c