author Jerry Zhang <jerryzh@fb.com>
Sat, 18 Sep 2021 19:49:07 +0000 (12:49 -0700)
committer Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com>
Sat, 18 Sep 2021 20:30:37 +0000 (13:30 -0700)
commit d8189db80f309f2750d7ee41fadf8a1b860643d7
tree 52abea49a52e7c2c9ce77cf58ede7f1400dc869e
parent 7f8d622d70f93373b392e7e84420b36badff8ed4
[quant][fx2trt] Generate engine graph for explicit quant/implicit quant and fp16 graph (#65289)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65289

Turn on VERBOSE logging and use the engine visualizer to generate the engine graphs for the explicit quant, implicit quant, and fp16 engines.

Runtime:
```
explicit quant result diff max tensor(0.0771)
implicit quant result diff max tensor(0.1909)
trt fp16 time (ms/iter) 1.0740923881530762
trt int8 time (ms/iter) 0.5288887023925781
trt implicit int8 time (ms/iter) 0.6334662437438965
PyTorch time (CUDA) (ms/iter) 4.448361396789551
PyTorch time (CPU) (ms/iter) 45.13296604156494
```
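
For reference, the ratios implied by the timings above can be recomputed with a short script. This is only a sketch: the numbers are copied from the Runtime output, while the names `REPORTED_MS_PER_ITER`, `speedups`, and `ms_per_iter` are hypothetical and not part of the test script. `ms_per_iter` shows one plausible way a ms/iter figure might be measured; how the test actually times each engine is not shown in this commit.

```python
import time

# Timings (ms/iter) copied from the Runtime output above.
REPORTED_MS_PER_ITER = {
    "trt fp16": 1.0740923881530762,
    "trt int8": 0.5288887023925781,
    "trt implicit int8": 0.6334662437438965,
    "PyTorch (CUDA)": 4.448361396789551,
    "PyTorch (CPU)": 45.13296604156494,
}


def speedups(timings, baseline):
    """Return baseline_time / time for every entry other than the baseline."""
    base = timings[baseline]
    return {name: base / ms for name, ms in timings.items() if name != baseline}


def ms_per_iter(fn, iters=100, warmup=10):
    """Hypothetical helper: average wall-clock milliseconds per call of fn.

    For CUDA work the caller would also need to synchronize the device
    before reading the clock; that step is omitted here.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) * 1000.0 / iters


if __name__ == "__main__":
    for name, ratio in speedups(REPORTED_MS_PER_ITER, "PyTorch (CUDA)").items():
        print(f"{name}: {ratio:.2f}x vs PyTorch (CUDA)")
```

With these numbers, explicit int8 comes out roughly 8x faster than PyTorch on CUDA and roughly 2x faster than TRT fp16, with implicit int8 in between.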

Generated Graphs:
```
explicit int8: https://www.internalfb.com/intern/graphviz/?paste=P458669571
implicit int8: https://www.internalfb.com/intern/graphviz/?paste=P458669656
fp16: https://www.internalfb.com/intern/graphviz/?paste=P458669708
```

Test Plan:
```
buck run mode/opt -c python.package_style=inplace caffe2:fx2trt_quantized_resnet_test 2>log
buck run //deeplearning/trt/fx2trt/tools:engine_layer_visualize -- --log_file log
```

Reviewed By: 842974287

Differential Revision: D30955035

fbshipit-source-id: 24949458ad9823fb026d56d78a6ee1c6874b6034
torch/fx/experimental/fx2trt/example/quantized_resnet_test.py