pre-pack operation of dnnlowp conv with 16-bit accumulation (#14881)
author Jongsoo Park <jongsoo@fb.com>
Mon, 10 Dec 2018 09:06:17 +0000 (01:06 -0800)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Mon, 10 Dec 2018 09:08:21 +0000 (01:08 -0800)
commit b039a715ce4e9cca82ae3bf72cb84652957b2844
tree c7807d0219b5b67bd49b8a3e60073f8421ce9207
parent e747acbebbae2de381ccda8b70010953046c397d

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14881

This diff allows us to pre-quantize and pre-pack the weight matrix used by the DNNLOWP_ACC16 engine.
The intended usage pattern is to run Int8ConvPackWeight in init_net to generate a packed weight, which Int8Conv with the DNNLOWP_ACC16 engine then uses.
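
The pack-once, reuse-many pattern described above can be illustrated with a small pure-Python sketch. This is not the Caffe2 or FBGEMM implementation: PackedWeight, pack_weight, and matvec_packed are hypothetical names, the symmetric int8 scheme is an assumption chosen for brevity, a matrix-vector product stands in for the convolution, and 16-bit accumulation is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PackedWeight:
    """Hypothetical stand-in for a pre-quantized, pre-packed weight blob."""
    values: List[int]  # quantized weights, flattened row-major
    scale: float       # real_value is approximately scale * quantized_value
    rows: int
    cols: int


def pack_weight(weight: List[List[float]]) -> PackedWeight:
    """Quantize and flatten the weight once (the "init_net" step).

    Assumption: symmetric quantization to the range [-127, 127].
    """
    rows, cols = len(weight), len(weight[0])
    max_abs = max(abs(v) for row in weight for v in row) or 1.0
    scale = max_abs / 127.0
    values = [
        max(-127, min(127, round(v / scale)))
        for row in weight
        for v in row
    ]
    return PackedWeight(values=values, scale=scale, rows=rows, cols=cols)


def matvec_packed(packed: PackedWeight, x_q: List[int]) -> List[int]:
    """Consume the packed weight (the per-inference step).

    No quantization or layout work happens here; the integer
    accumulators are returned without requantization.
    """
    out = []
    for r in range(packed.rows):
        row = packed.values[r * packed.cols:(r + 1) * packed.cols]
        out.append(sum(w * x for w, x in zip(row, x_q)))
    return out


if __name__ == "__main__":
    # Pack once, then reuse the same packed blob for several inputs.
    packed = pack_weight([[127.0, -64.0], [0.0, 32.0]])
    for x_q in ([1, 2], [3, -1]):
        print(matvec_packed(packed, x_q))
```

The point of the split is that the cost of quantizing and rearranging the weight is paid once at initialization rather than on every forward pass.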

Reviewed By: csummersea

Differential Revision: D13374662

fbshipit-source-id: dd02b9a4eb7af1fe208aa857fcd0b445e6e395af
26 files changed:
caffe2/quantization/server/CMakeLists.txt
caffe2/quantization/server/conv_depthwise_dnnlowp_op_test.py
caffe2/quantization/server/conv_dnnlowp_acc16_op.cc
caffe2/quantization/server/conv_dnnlowp_acc16_op.h
caffe2/quantization/server/conv_dnnlowp_acc16_op_test.py
caffe2/quantization/server/conv_dnnlowp_op.cc
caffe2/quantization/server/conv_dnnlowp_op.h
caffe2/quantization/server/conv_dnnlowp_op_test.py
caffe2/quantization/server/conv_groupwise_dnnlowp_acc16_op_test.py
caffe2/quantization/server/conv_groupwise_dnnlowp_op_test.py
caffe2/quantization/server/conv_pool_dnnlowp_op_base.h
caffe2/quantization/server/dnnlowp_op.h
caffe2/quantization/server/fbgemm_pack_blob.h [new file with mode: 0644]
caffe2/quantization/server/fbgemm_pack_op.cc [new file with mode: 0644]
caffe2/quantization/server/fbgemm_pack_op.h [new file with mode: 0644]
caffe2/quantization/server/fully_connected_dnnlowp_acc16_op.cc
caffe2/quantization/server/fully_connected_dnnlowp_acc16_op.h
caffe2/quantization/server/fully_connected_dnnlowp_acc16_op_test.py
caffe2/quantization/server/fully_connected_dnnlowp_op.cc
caffe2/quantization/server/fully_connected_dnnlowp_op.h
caffe2/quantization/server/fully_connected_dnnlowp_op_test.py
caffe2/quantization/server/fully_connected_rowwise_dnnlowp_op.cc
caffe2/quantization/server/fully_connected_rowwise_dnnlowp_op.h
caffe2/quantization/server/fully_connected_rowwise_dnnlowp_op_test.py
caffe2/quantization/server/group_norm_dnnlowp_op_test.py
caffe2/quantization/server/utils.py