Improvements for symbolic AD (#14758)
authorAdam Paszke <adam.paszke@gmail.com>
Wed, 5 Dec 2018 04:35:51 +0000 (20:35 -0800)
committerFacebook Github Bot <facebook-github-bot@users.noreply.github.com>
Wed, 5 Dec 2018 04:38:21 +0000 (20:38 -0800)
commit8dfebc16cc2dbd6e4f9fd03515428d5b8d49c4c3
tree810785163ea02eeef5bd6210a13e423ff7e46b83
parent38eb1beff5bbaed0d3cc8ad59039b50f850f7245

Summary:
**Review only the last commit.**

This commit adds a few optimizations to AD that let us dramatically
reduce the number of sizes we capture from the forward pass.

We now:
- collapse chains of SumToSize
- avoid capturing sizes of tensors that are captured anyway
- more aggressively DCE the reverse code
- run CSE on the primal code to deduplicate `aten::size` calls
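The CSE point can be illustrated with a minimal sketch. The toy SSA instruction list and the `cse` function below are hypothetical stand-ins for the real JIT IR and the C++ pass in `torch/csrc/jit`; they only show why deduplicating pure ops like `aten::size` means each size is captured from the forward graph once instead of once per use:

```python
# Minimal common-subexpression-elimination sketch over a toy SSA
# instruction list (hypothetical IR, not the actual torch/csrc/jit pass).
# Duplicate pure ops are dropped and their outputs rewritten to the
# first (canonical) occurrence.

def cse(instrs):
    """instrs: list of (output, op, inputs) tuples in SSA form."""
    seen = {}      # (op, inputs) -> canonical output value
    replace = {}   # duplicate output -> canonical output
    out = []
    for dst, op, ins in instrs:
        # Rewrite inputs through replacements found so far.
        ins = tuple(replace.get(i, i) for i in ins)
        key = (op, ins)
        if key in seen:
            # Same pure op on the same inputs: drop it, reuse the result.
            replace[dst] = seen[key]
        else:
            seen[key] = dst
            out.append((dst, op, ins))
    return out

primal = [
    ("s1", "aten::size", ("x",)),
    ("s2", "aten::size", ("x",)),       # duplicate of s1
    ("g1", "SumToSize", ("dy", "s1")),
    ("g2", "SumToSize", ("dz", "s2")),  # rewritten to use s1 after CSE
]

optimized = cse(primal)
assert optimized == [
    ("s1", "aten::size", ("x",)),
    ("g1", "SumToSize", ("dy", "s1")),
    ("g2", "SumToSize", ("dz", "s1")),
]
```

After the duplicate `aten::size` is gone, only one size value remains live to be captured for the reverse graph, which is the reduction in captured sizes the summary describes.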

cc zou3519 zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14758

Differential Revision: D13324440

Pulled By: zou3519

fbshipit-source-id: 45ccbc13605adcef2b461840c6089d3200000c72
test/cpp/jit/tests.h
test/expect/TestJit.test_cpp_cuda.expect
test/expect/TestScript.test_lstm_fusion_cuda-backward.expect
test/expect/TestScript.test_lstm_fusion_cuda-forward.expect
test/expect/TestScript.test_milstm_fusion_cuda-backward.expect
test/expect/TestScript.test_milstm_fusion_cuda-forward.expect
torch/csrc/jit/autodiff.cpp
torch/csrc/jit/passes/dead_code_elimination.cpp
torch/csrc/jit/passes/dead_code_elimination.h
torch/csrc/jit/passes/graph_fuser.cpp
torch/csrc/jit/passes/peephole.cpp