From: Supriya Rao <supriyar@fb.com>
Date: Fri, 27 Aug 2021 04:05:56 +0000 (-0700)
Subject: [quant] support linear_relu_dynamic for qnnpack backend (#63820)
X-Git-Tag: accepted/tizen/8.0/unified/20231005.095509~662
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=975f4ccad6fb7ca13c50ee628ec3fb3a77a64893;p=platform%2Fupstream%2Fpytorch.git

[quant] support linear_relu_dynamic for qnnpack backend (#63820)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63820

Adds support in the operator directly to call relu operator if relu fusion is enabled.
Once QNNPACK natively supports relu fusion in the linear_dynamic this can be removed

Test Plan:
python test/test_quantization.py TestDynamicQuantizedLinear.test_qlinear

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D30502813

fbshipit-source-id: 3352ee5f73e482b6d1941f389d720a461b84ba23
---

diff --git a/aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp b/aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp
index da64197..23c6158 100644
--- a/aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp
+++ b/aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp
@@ -349,6 +349,12 @@ at::Tensor PackedLinearWeightsQnnp::apply_dynamic_impl(at::Tensor input) {
   TORCH_INTERNAL_ASSERT(
       runStatus == pytorch_qnnp_status_success,
       "failed to run QNNPACK Linear operator");
+
+  // Call the relu operator here until qlinear dynamic in QNNPACK
+  // supports it natively.
+  if (ReluFused) {
+    output.relu_();
+  }
   return output;
 }
 
diff --git a/test/quantization/core/test_quantized_op.py b/test/quantization/core/test_quantized_op.py
index 9243fe2..86fe350 100644
--- a/test/quantization/core/test_quantized_op.py
+++ b/test/quantization/core/test_quantized_op.py
@@ -2606,7 +2606,6 @@ class TestDynamicQuantizedLinear(TestCase):
     def test_qlinear(self, batch_size, input_channels, output_channels,
                      use_bias, use_relu, use_multi_dim_input, use_channelwise, reduce_range):
         if torch.backends.quantized.engine == 'qnnpack':
-            use_relu = False
             reduce_range = False
 
         qlinear_prepack = torch.ops.quantized.linear_prepack