[quant] support linear_relu_dynamic for qnnpack backend (#63820)
authorSupriya Rao <supriyar@fb.com>
Fri, 27 Aug 2021 04:05:56 +0000 (21:05 -0700)
committerFacebook GitHub Bot <facebook-github-bot@users.noreply.github.com>
Fri, 27 Aug 2021 04:12:02 +0000 (21:12 -0700)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63820

Adds support in the operator to call the relu operator directly when relu fusion is enabled.
Once QNNPACK natively supports relu fusion in linear_dynamic, this can be removed.
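The fallback applied in the C++ change below is simply an in-place relu after the dynamic linear output is produced. A minimal sketch of the equivalent computation in eager-mode PyTorch (using float ops for illustration, not the quantized QNNPACK path):

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 4)   # input activations
w = torch.randn(3, 4)   # weight
b = torch.randn(3)      # bias

# "linear_relu" via the fallback: run linear, then clamp negatives in place,
# mirroring output.relu_() in apply_dynamic_impl when ReluFused is true.
out = F.linear(x, w, b)
out.relu_()
```

This produces the same result as a natively fused linear+relu kernel would, at the cost of one extra pass over the output tensor.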

Test Plan:
python test/test_quantization.py TestDynamicQuantizedLinear.test_qlinear

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D30502813

fbshipit-source-id: 3352ee5f73e482b6d1941f389d720a461b84ba23

aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp
test/quantization/core/test_quantized_op.py

index da64197..23c6158 100644 (file)
@@ -349,6 +349,12 @@ at::Tensor PackedLinearWeightsQnnp::apply_dynamic_impl(at::Tensor input) {
   TORCH_INTERNAL_ASSERT(
       runStatus == pytorch_qnnp_status_success,
       "failed to run QNNPACK Linear operator");
+
+  // Call the relu operator here until qlinear dynamic in QNNPACK
+  // supports it natively.
+  if (ReluFused) {
+    output.relu_();
+  }
   return output;
 }
 
index 9243fe2..86fe350 100644 (file)
@@ -2606,7 +2606,6 @@ class TestDynamicQuantizedLinear(TestCase):
     def test_qlinear(self, batch_size, input_channels, output_channels,
                      use_bias, use_relu, use_multi_dim_input, use_channelwise, reduce_range):
         if torch.backends.quantized.engine == 'qnnpack':
-            use_relu = False
             reduce_range = False
 
         qlinear_prepack = torch.ops.quantized.linear_prepack