
Fix TestDataLoader.test_proper_exit (#15665)

Summary:
Currently, in `test_proper_exit`,
1. the `kill_pid` function does not kill the `pid` that is passed to it
https://github.com/pytorch/pytorch/blob/fe15d6a2c231a7bc1b32781217ed336ccf9adff7/test/test_dataloader.py#L325-L329
2. the Windows command that detects process status doesn't actually work
https://github.com/pytorch/pytorch/blob/fe15d6a2c231a7bc1b32781217ed336ccf9adff7/test/test_dataloader.py#L641-L646
3. the `worker_error` and `worker_kill` cases are (at least sometimes) not actually tested, because the workers may exit naturally before the failure is triggered: with pre-fetching and too small a `dataset size / batch size` ratio, there is too little work left per worker.
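
To make point 3 concrete, here is a rough sketch of the arithmetic. It is not code from this PR: the helper names are hypothetical, the prefetch depth of two batches per worker is an assumption exposed as the `prefetch_per_worker` parameter, and the batch count assumes the last partial batch is kept.

```python
import math


def batches_per_worker(dataset_size, batch_size, num_workers):
    """Upper bound on the number of batches a single worker handles per epoch."""
    total_batches = math.ceil(dataset_size / batch_size)
    return math.ceil(total_batches / num_workers)


def all_batches_prefetched(dataset_size, batch_size, num_workers,
                           prefetch_per_worker=2):
    """True if every batch can be handed out up front.

    prefetch_per_worker is an ASSUMED prefetch depth, not a value taken
    from this commit. When this returns True, workers can run out of work
    before the test gets a chance to inject an error or kill them.
    """
    total_batches = math.ceil(dataset_size / batch_size)
    return total_batches <= prefetch_per_worker * num_workers
```

Under these assumptions, `all_batches_prefetched(12, 4, 2)` is true (3 batches against 4 prefetch slots), so a fault injected later may never be observed, while `all_batches_prefetched(1200, 4, 2)` is false and leaves plenty of pending work per worker.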

In this PR, I, in separate commits:
1. Install `psutil` (a Python package built specifically for process monitoring) on some CI builds. (The Linux build installations are done in https://github.com/pietern/pytorch-dockerfiles/pull/29, https://github.com/pietern/pytorch-dockerfiles/pull/30, https://github.com/pytorch/ossci-job-dsl/pull/36 and https://github.com/pytorch/pytorch/pull/15795.)
2. Rewrite `test_proper_exit` with `psutil` so that we

    1. do not rely on the hacky `is_process_alive` https://github.com/pytorch/pytorch/blob/fe15d6a2c231a7bc1b32781217ed336ccf9adff7/test/test_dataloader.py#L640-L653
    2. increase the number of tasks per worker so that `worker_error` and `worker_kill` properly trigger
    3. test the error message content to ensure that the loader exits with the correct message for each exit scenario.

3. Fix the Windows data loader, which had no mechanism to detect worker failures.
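
One portable way to detect worker failures, sketched here only as an illustration and not as the code in `torch/utils/data/dataloader.py`, is to poll the result queue with a timeout and check worker liveness whenever the poll comes up empty. The function name, the `poll_interval` parameter, and the error text below are all hypothetical. The sketch only needs a queue whose `get(timeout=...)` raises `queue.Empty` and workers that expose `is_alive()` and `name`, which holds for both `multiprocessing` and `threading` objects.

```python
import queue


def get_or_raise_if_worker_died(result_queue, workers, poll_interval=1.0):
    """Block for the next result, but fail loudly if a worker has died.

    result_queue: object with get(timeout=...) / get_nowait() raising queue.Empty.
    workers: iterable of objects exposing is_alive() and name.
    """
    while True:
        try:
            return result_queue.get(timeout=poll_interval)
        except queue.Empty:
            dead = [w.name for w in workers if not w.is_alive()]
            if not dead:
                continue  # everyone is still running, keep waiting
            # A worker may have produced a result just before exiting,
            # so look once more before reporting the failure.
            try:
                return result_queue.get_nowait()
            except queue.Empty:
                raise RuntimeError(
                    "worker(s) {} exited unexpectedly".format(", ".join(dead))
                )
```

The design choice is to avoid platform-specific signals: a plain blocking `get()` would hang forever if a worker is killed, whereas a timed poll plus a liveness check turns that hang into an error the caller (and a test such as `test_proper_exit`) can assert on.
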
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15665

Differential Revision: D13615527

Pulled By: soumith

fbshipit-source-id: cfb2f67837d2d87928a53f00b4d20f09754b7949

author	SsnL <tongzhou.wang.1994@gmail.com>
	Thu, 10 Jan 2019 16:44:32 +0000 (08:44 -0800)
committer	Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
	Thu, 10 Jan 2019 16:47:27 +0000 (08:47 -0800)
commit	9b5ec2a076982c57033f2e345cee3051b55de996
tree	ddbc4ecf1fb3c6887598e229998e7d752edf30ba
parent	0ed3f766e9eb803e8f37f728af9494d756aec9a7

.jenkins/pytorch/macos-test.sh
.jenkins/pytorch/test.sh
.jenkins/pytorch/win-test.sh
test/test_dataloader.py
torch/utils/data/dataloader.py