author    Derek Kim <bluewhale8202@gmail.com>
          Thu, 17 Jan 2019 08:59:11 +0000 (00:59 -0800)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
          Thu, 17 Jan 2019 09:02:44 +0000 (01:02 -0800)
commit    4171ef3728b92e50e5f791ab84c304994238dda0
tree      33ebdff99198fb726f57fd227970ed9596d8860e
parent    ded4ff87af6e4cd0d22ef091f8ba7d49a4a49d1c
Enhance the documentation for DistributedDataParallel from torch.nn.parallel.distributed (#16010)

Summary:
- fixed a typo
- made the docs consistent with #5108

And maybe one more change is needed. According to the current docs:
> The batch size should be larger than the number of GPUs used **locally**.

But shouldn't the batch size be larger than the number of GPUs used **either locally or remotely**? Sadly, I couldn't test this myself with my single GPU.
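
For context, here is a minimal sketch (not part of this patch) of the single-process, multi-GPU mode of `DistributedDataParallel` that the quoted sentence refers to. The rendezvous address/port, backend, and toy `Linear` model are placeholder assumptions, and it assumes at least one CUDA device is visible to the process. The relevant behavior is that each process scatters its own input batch across its local `device_ids`, which is why the docs tie the batch size to the number of GPUs used locally:

```python
# Hypothetical sketch; MASTER_ADDR/MASTER_PORT values, the backend, and the
# toy model are assumptions, not taken from the patch.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous address
os.environ.setdefault("MASTER_PORT", "29500")      # placeholder rendezvous port
dist.init_process_group(backend="gloo", rank=0, world_size=1)

local_gpus = list(range(torch.cuda.device_count()))  # GPUs used by this process
model = torch.nn.Linear(10, 10).cuda(local_gpus[0])
ddp_model = DistributedDataParallel(model, device_ids=local_gpus)

# Each process feeds its own batch, which is scattered across local_gpus only,
# so it needs to be larger than the number of GPUs used by *this* process.
batch = torch.randn(2 * len(local_gpus), 10).cuda(local_gpus[0])
output = ddp_model(batch)
```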
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16010

Differential Revision: D13709516

Pulled By: ezyang

fbshipit-source-id: e44459a602a8a834fd365fe46e4063e9e045d5ce
torch/nn/parallel/distributed.py