Fork/join parallelism for ensemble export modules (#310)
authorJames Reed <jamesreed@fb.com>
Tue, 5 Feb 2019 09:51:52 +0000 (01:51 -0800)
committerFacebook Github Bot <facebook-github-bot@users.noreply.github.com>
Tue, 5 Feb 2019 09:55:09 +0000 (01:55 -0800)
commit0cd918f4d395f6799ad6f57c726615c85df8c8d2
tree5d4ca82e47639aa326a3cc90b15396f68454488a
parentce15ae8f23180c703929969b07bf5d04437e65e7
Fork/join parallelism for ensemble export modules (#310)

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/310

This adds fork/join parallelism to the EncoderEnsemble and DecoderBatchedStepEnsemble models. Note that when run in Python, these calls are no-ops, and similarly we remove these calls before exporting to ONNX. But when we run in the PyTorch native runtime, we will now have the opportunity to run these sections in parallel.
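To illustrate the fork/join pattern described above, here is a minimal stdlib sketch of the semantics (analogous to `torch.jit.fork` / `torch.jit.wait` in the JIT runtime). The `encoder_a`/`encoder_b` functions and `run_ensemble` are hypothetical stand-ins for the ensemble members, not the actual model code:

```python
# Hedged sketch: fork/join over two hypothetical ensemble members,
# mimicking the semantics of torch.jit.fork / torch.jit.wait.
from concurrent.futures import ThreadPoolExecutor

def encoder_a(x):  # hypothetical ensemble member
    return x + 1

def encoder_b(x):  # hypothetical ensemble member
    return x * 2

def run_ensemble(x):
    with ThreadPoolExecutor() as pool:
        # "fork": launch each member as an asynchronous task
        f1 = pool.submit(encoder_a, x)
        f2 = pool.submit(encoder_b, x)
        # "join": block until each forked task completes.
        # In eager Python the real fork/wait calls degrade to
        # synchronous no-ops, as noted in the summary above.
        return f1.result() + f2.result()

print(run_ensemble(3))  # → 10
```

In the native runtime the forked sections can execute concurrently; for ONNX export the fork/wait pairs are erased (see the new `erase_fork_wait.h` pass) so the exported graph stays sequential.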

Benchmark validation is pending while I work through FBLearner Flow issues, as usual

Reviewed By: jmp84

Differential Revision: D13827861

fbshipit-source-id: 0cb9df6e10c0ba64a6b81fa374e077bce90f1d5b
torch/csrc/jit/passes/erase_fork_wait.h [new file with mode: 0644]