Mikhail Zolotukhin [Thu, 24 Jan 2019 19:05:07 +0000 (11:05 -0800)]
Directly include headers from ATen.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16287
Differential Revision:
D13792949
Pulled By: ZolotukhinM
fbshipit-source-id:
d627d8dc469df048063c70d0b5b8d33fede809a3
Richard Zou [Thu, 24 Jan 2019 19:01:57 +0000 (11:01 -0800)]
Refactor the docs build workflow (#16265)
Summary:
In preparation for setting up a doc build job for stable docs, I wanted
to refactor the workflow so that future changes will be easier.
This PR makes the following changes:
- Refactor the doc push script into a reusable command
- Add command line options for the doc push script.
These don't matter too much for now but will be useful
for setting up future jobs for building different versions of the
docs.
- Instead of checking out pytorch/pytorch:master, we re-use the pytorch
installation inside the docker image.
- Change the sed in the script to a perl command. sed is annoyingly
different across platforms; the perl command is more stable
- Run the script in dry run mode (without pushing the doc build)
whenever a PR is opened. This lets us test changes to the doc build workflow.
Test Plan
- I tested the doc build script locally with my own credentials and it
worked fine.
- Wait for the pytorch_doc_push CI.
- After merging this PR, keep an eye on the pytorch_doc_push CI status.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16265
Differential Revision:
D13803511
Pulled By: zou3519
fbshipit-source-id:
4564bca3e74d490f89a1d1da9fb8b98eb44bdbb1
Owen Anderson [Thu, 24 Jan 2019 18:40:38 +0000 (10:40 -0800)]
Save a little bit of work in constant pooling by not moving nodes that will get deleted.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16161
Differential Revision:
D13791247
Pulled By: resistor
fbshipit-source-id:
2a5a4f98309509b4ba875373ee57e6f63c75a4fd
Gregory Chanan [Thu, 24 Jan 2019 15:36:54 +0000 (07:36 -0800)]
Handle non-contiguous inputs with mkldnn convolution. (#16300)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/16018.
Backwards appears to be fine because the derivative is written in terms of mkldnn_convolution itself.
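A minimal usage sketch (not from the PR) of the case this fixes: a non-contiguous input fed to a convolution. Whether the MKL-DNN path is actually taken depends on the build; the shapes are arbitrary.
```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 8, 16, 16).transpose(2, 3)   # non-contiguous input
w = torch.randn(4, 8, 3, 3)
assert not x.is_contiguous()
y = F.conv2d(x, w, padding=1)                    # should work even on the MKL-DNN path
print(y.shape)
```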
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16300
Differential Revision:
D13797776
Pulled By: gchanan
fbshipit-source-id:
68a990b8a3c186412a99d176931314806c9ed7bf
Xiaomeng Yang [Thu, 24 Jan 2019 10:50:35 +0000 (02:50 -0800)]
Optimize SpatialBN on GPU (#16202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16202
Optimize SpatialBN on GPU
Reviewed By: houseroad
Differential Revision:
D13747581
fbshipit-source-id:
48a885a240ef2a325235e8f89ebbe50e7c780c84
Xiaomeng Yang [Thu, 24 Jan 2019 07:55:06 +0000 (23:55 -0800)]
optimize group_norm (#16216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16216
Optimize GroupNormOp
Reviewed By: houseroad
Differential Revision:
D13754145
fbshipit-source-id:
650f64c81486c6c9d276f2e3325392d5838751ba
Lu Fang [Thu, 24 Jan 2019 05:32:57 +0000 (21:32 -0800)]
Fix the tensor deserialization problem of jit script module on CUDA (#16279)
Summary:
Now we create a temporary tensor for the whole record.
Fix https://github.com/pytorch/pytorch/issues/15271
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16279
Reviewed By: BIT-silence
Differential Revision:
D13791442
Pulled By: houseroad
fbshipit-source-id:
6f52ca09627fb684f74121357cc42e4adadec36a
Erik Brinkman [Thu, 24 Jan 2019 03:37:29 +0000 (19:37 -0800)]
Small fixes for pdist (#16210)
Summary:
pdist was recently patched to remove buggy batch support and fix issues
with large tensors. That fix missed a few spots and didn't address a few recommendations; this commit handles them.
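For reference, a minimal usage example (unrelated to the fix itself) of the non-batched op being patched; the shape is arbitrary.
```python
import torch

x = torch.randn(100, 3)
d = torch.pdist(x)      # condensed pairwise distances, shape (100 * 99 / 2,)
print(d.shape)          # torch.Size([4950])
```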
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16210
Differential Revision:
D13791914
Pulled By: gchanan
fbshipit-source-id:
0595841be1b298f7268fd4c02a6628acfec918f2
Jerry Zhang [Thu, 24 Jan 2019 03:24:38 +0000 (19:24 -0800)]
Fix comparison in ReinitializeTensor (#16294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16294
In `ReinitializeTensor`, we compare `tensor->GetDevice()` and `options.device()`, but in the callsite, we actually just provide an option with `device_type`, which means the `device_id` will always be default(-1) for `options`, but for tensor, although it is passed a `device` with default `device_id`, when we allocate the data, the `device` of the `tensor` is the `device` of `Storage`, which is the `device` of underlying `DataPtr`, which is the same as the `device` of the `Context` of the operator, which has a non-default `device_id`.
Therefore every time we do `ReinitializeTensor`, we'll find the `device` does not match, and after the `ReinitializeTensor` call the `device` still does not match. That's why we allocate a new Tensor every time, which causes perf regressions for ops that use `ReinitializeTensor` on multiple GPUs.
Reviewed By: BIT-silence
Differential Revision:
D13795635
fbshipit-source-id:
24d6afa1a0196a32eb0134ee08b4280244cdb0c3
Benny Chen [Thu, 24 Jan 2019 03:02:14 +0000 (19:02 -0800)]
Fix issues under caffe round 1
Summary: Some automation to fix uninitialized members for caffe2 code. Ran a canary to make sure there are no regressions in prod, but not sure how to test comprehensively for caffe2.
Reviewed By: ezyang
Differential Revision:
D13776185
fbshipit-source-id:
fb2a479971cc0276d8784be1c44f01252410bd24
David Riazati [Thu, 24 Jan 2019 02:11:04 +0000 (18:11 -0800)]
Add support for overloaded functions (#15556)
Summary:
This PR adds support for overloaded functions as a step toward adding rnn modules to the JIT standard library.
Possible overloads must be manually specified, and when resolving the overload it chooses the first one that passes the schema matching logic. The structure is very similar to boolean dispatch in #14425. The overload will only work on weak modules.
To avoid having to support overloaded methods in Python that match the JIT execution, the current setup offloads that work to the user. In the test added in `test_jit.py`, two methods are used to overload the `forward` method. In order to call `forward` outside the JIT, a Python-only `forward` that does the right argument type switching must also be provided; see the sketch below.
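A purely illustrative sketch (not the JIT overload API from this PR) of what such a Python-side type-switching `forward` could look like; the module and method names are made up.
```python
import torch

class MaybePacked(torch.nn.Module):
    # hypothetical module with two "overloads" of forward
    def forward_tensor(self, x):
        return x * 2

    def forward_pair(self, x, lengths):
        return x * 2, lengths

    def forward(self, x, lengths=None):
        # Python-only dispatch on the argument types, as described above
        if lengths is None:
            return self.forward_tensor(x)
        return self.forward_pair(x, lengths)

m = MaybePacked()
out = m(torch.randn(2, 3))                          # picks forward_tensor
out2 = m(torch.randn(2, 3), torch.tensor([3, 2]))   # picks forward_pair
```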
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15556
Differential Revision:
D13576348
Pulled By: driazati
fbshipit-source-id:
7d3bdd4ee5a6088cc20c92f26a696d1ee5b9204b
Elias Ellison [Thu, 24 Jan 2019 01:47:29 +0000 (17:47 -0800)]
Constant propagation changes (#16244)
Summary:
- remove loop node that is guaranteed not to execute
- remove extra loop outputs that are no longer needed
- if we are inlining an if node, only run constant propagation on the block that will execute
- remove the recurse argument since we only expose the Graph Constant Propagation and it's not used
This also includes a few extra hooks to python_ir that I think make it a little be easier to test graph conditions from python.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16244
Differential Revision:
D13791635
Pulled By: eellison
fbshipit-source-id:
d16351fffcfc8013b02015db200f8fde002e0577
nlml [Thu, 24 Jan 2019 00:02:55 +0000 (16:02 -0800)]
raise exception if try jit.load non-existent file (#16270)
Summary:
addresses https://github.com/pytorch/pytorch/issues/16267
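A minimal sketch of the behavior this adds; the file path is assumed not to exist, and the exact exception type is not specified in this PR.
```python
import torch

try:
    torch.jit.load("does_not_exist.pt")   # missing file
except Exception as e:                    # exact exception type not specified here
    print("jit.load raised:", type(e).__name__, e)
```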
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16270
Differential Revision:
D13791773
Pulled By: suo
fbshipit-source-id:
256304a02dbf724a7c0baade48c94b3ee77f53cf
Jesse Hellemn [Wed, 23 Jan 2019 23:35:00 +0000 (15:35 -0800)]
Fixing upload of nightly binaries and clean MacOS output (#16016)
Summary:
- Fix environment variable used to guard binary uploads
- Move common MacOS brew setup-code into a common function to decrease code duplication and also to move that noisy console output into its own CircleCI step
- Split Mac builds into separate build-test and upload jobs. Add one of these jobs to PR runs; add upload jobs to nightly binarybuilds workflow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16016
Differential Revision:
D13791084
Pulled By: pjh5
fbshipit-source-id:
8eeb8e1963d46eab84f0f6dad9f0265163d5bf73
Teng Li [Wed, 23 Jan 2019 22:04:47 +0000 (14:04 -0800)]
CUDA event should only be recorded after NCCL group (#8219)
Summary:
Otherwise, it won't work if we sync on this event.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8219
Reviewed By: pietern
Differential Revision:
D13788657
Pulled By: teng-li
fbshipit-source-id:
8c96e9691ed2441d7a685fb7ae8fece906f58daf
Edward Yang [Wed, 23 Jan 2019 21:51:05 +0000 (13:51 -0800)]
Change data() accessor in Caffe2 to return non-const pointer. (#16176)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16176
This makes PyTorch and Caffe2's data() method line up.
Historically, PyTorch made no distinction between tensors
with const or non-const data, and thus provided a
non-const pointer with data() member. Changing the API to
return a const-pointer would break all mutable code, whereas
changing the Caffe2 API to change a pointer doesn't break
any code, *except* for code which required an exact match
on const-ness (e.g., in template arguments). Since the latter
is less disruptive, we've opted for it here.
The few places downstream that broke due to this are fixed
in this patch.
Reviewed By: smessmer
Differential Revision:
D13742916
fbshipit-source-id:
baa4b4544cfdf7c1f369f4d69a1e0d5953c1bd99
svcscm [Wed, 23 Jan 2019 21:05:30 +0000 (13:05 -0800)]
Updating submodules
Reviewed By: cdelahousse
fbshipit-source-id:
99d58034f9369846f8c82a5ea11c71e202e52a4e
Christian Puhrsch [Wed, 23 Jan 2019 20:30:47 +0000 (12:30 -0800)]
Align native_functions.yaml func schema more with JIT signature schema (#16111)
Summary:
This PR applies a few minor modifications leading to 100s of additional matches
Modifications to native_functions.yaml
1) double to float
2) int64_t to int
3) IntList[\d*] to int[\d*]
4) {} to []
5) Tensor? x=[] to Tensor? x=None
6) TensorList to Tensor[]
7) 1e-x to 1e-0x
8) Generator* x = nullptr to Generator? x = None
9) `{.*}` to `[.*]`
Overall this adds about 300 new matches and brings us to roughly 50% of native_functions func schemas matching their JIT signature equivalents.
While this is still a draft, "tools/jit/gen_jit_dispatch.py" contains code to aid in finding close signatures.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16111
Reviewed By: ezyang
Differential Revision:
D13738123
Pulled By: cpuhrsch
fbshipit-source-id:
d1ec1e089bdb26ec155f6f31ccf768270acb76c7
Syed Tousif Ahmed [Wed, 23 Jan 2019 20:29:32 +0000 (12:29 -0800)]
Fixes selection of cuDNN algorithm (#15881)
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.
Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on https://github.com/pytorch/pytorch/pull/15851)
Differential Revision:
D13787601
Pulled By: ezyang
fbshipit-source-id:
81fe86727673d021306fe1c99c3e528b7c9ad17f
Edward Yang [Wed, 23 Jan 2019 19:53:55 +0000 (11:53 -0800)]
Disable flaky test
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16274
Reviewed By: pietern
Differential Revision:
D13788036
fbshipit-source-id:
a9b7353fb0655908e6d47387cc77af33e9471aed
Junjie Bai [Wed, 23 Jan 2019 17:31:14 +0000 (09:31 -0800)]
Update third_party protobuf to v3.6.1
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16251
Reviewed By: ezyang
Differential Revision:
D13781444
Pulled By: bddppq
fbshipit-source-id:
b713a021033d214f30a49ee02b95edf8633bcc50
Armaan Sethi [Wed, 23 Jan 2019 16:32:29 +0000 (08:32 -0800)]
fix sigma in the middle of when word (#16227)
Summary:
there is a stray sigma character in the word "when" on:
https://pytorch.org/cppdocs/contributing.html
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16227
Differential Revision:
D13762753
Pulled By: goldsborough
fbshipit-source-id:
3d4bf4be859a3069402fe8c3fbc8ebee4f25cc5a
Derek Kim [Wed, 23 Jan 2019 10:57:56 +0000 (02:57 -0800)]
Typos and broken RSTs fixed in torch.distribution (#16136)
Summary:
- probabilty -> probability
- make long lines break
- Add LogitRelaxedBernoulli in distribution.rst
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16136
Differential Revision:
D13780406
Pulled By: soumith
fbshipit-source-id:
54beb975eb18c7d67779a9631dacf7d1461a6b32
Johannes M Dieterich [Wed, 23 Jan 2019 02:21:07 +0000 (18:21 -0800)]
tune elementwise for AMD uarch (#16217)
Summary:
Tune elementwise kernel for AMD architectures by increasing the work group sizes and launch bounds. This change improves training throughput for torchvision models by up to 11% in our tests while exhibiting no significant performance regression.
No functional/performance change for CUDA - just shifting numbers into constexpr.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16217
Differential Revision:
D13776684
Pulled By: bddppq
fbshipit-source-id:
edbaebe904598b2de66a9e9a68a1aa219ebc01e9
rohithkrn [Wed, 23 Jan 2019 01:19:15 +0000 (17:19 -0800)]
fix typo in resnet50_trainer.py
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16219
Differential Revision:
D13776742
Pulled By: bddppq
fbshipit-source-id:
10a6ab4c58159b3f619b739074f773662722c1d9
Lu Fang [Tue, 22 Jan 2019 22:55:50 +0000 (14:55 -0800)]
update of fbcode/onnx to
dc75285d4a1cff9618400164dfdb26c5a1bab70a
Summary:
Previous import was
c553fb32a0902ce5dd42e1b40123e9e9b38bdbe7
Included changes:
- **[dc75285](https://github.com/onnx/onnx/commit/dc75285)**: Relax constraint that the initializers must be a subset of graph inputs (#1718) <G. Ramalingam>
- **[985c8cd](https://github.com/onnx/onnx/commit/985c8cd)**: Fix typo in scan shape inferencing (#1753) <Scott McKay>
- **[ab52a5d](https://github.com/onnx/onnx/commit/ab52a5d)**: remove stale test cases <Lu Fang>
- **[56434bb](https://github.com/onnx/onnx/commit/56434bb)**: Removing experimental ConstantFill op. <Spandan Tiwari>
- **[881c63c](https://github.com/onnx/onnx/commit/881c63c)**: Show string names of data types instead of int IDs (#1749) <Shinichiro Hamaji>
- **[0a12fe4](https://github.com/onnx/onnx/commit/0a12fe4)**: Update ConstantOfShape op. (#1744) <Bowen Bao>
- **[ef028e5](https://github.com/onnx/onnx/commit/ef028e5)**: Update definition of Cast Op to support casting to/from string (#1704) <Raymond Yang>
Reviewed By: BIT-silence
Differential Revision:
D13773962
fbshipit-source-id:
b98079277994a699d4807210ba1d9c27f4672090
Shen Li [Tue, 22 Jan 2019 22:28:18 +0000 (14:28 -0800)]
Add default_stream() and enhance current_stream() (#16200)
Summary:
Closes #16156
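A minimal usage sketch (not from the PR); the optional device argument that "enhance" may refer to is not shown here.
```python
import torch

if torch.cuda.is_available():
    default = torch.cuda.default_stream()
    current = torch.cuda.current_stream()
    # with no other stream active, the current stream is the default stream
    print(default == current)
```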
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16200
Differential Revision:
D13747455
Pulled By: mrshenli
fbshipit-source-id:
00c0d5f341c3ac7a757bdb4631a17e11fbc6d3ec
Edward Yang [Tue, 22 Jan 2019 22:17:20 +0000 (14:17 -0800)]
complex_registration_extension.cpp includes to angled brackets
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16122
Reviewed By: smessmer
Differential Revision:
D13717900
fbshipit-source-id:
8401f39d993482d3e08d2d79bc1841deafee2a5b
Edward Yang [Tue, 22 Jan 2019 22:17:20 +0000 (14:17 -0800)]
Remove ATen/Allocator.h forwarding header.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16121
Reviewed By: smessmer
Differential Revision:
D13717899
fbshipit-source-id:
83488f2aa801ca75059949ec85171ec03e64c4ff
Edward Yang [Tue, 22 Jan 2019 21:35:03 +0000 (13:35 -0800)]
Remove dead curVal store.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16116
Reviewed By: smessmer
Differential Revision:
D13717719
fbshipit-source-id:
2ecee3f08f64e64ec5ac3c92fb326bc3df37e40e
Sebastian Messmer [Tue, 22 Jan 2019 21:21:38 +0000 (13:21 -0800)]
Make kernel registration constexpr again (#16166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16166
Since we now don't use std::function anymore, we can make kernel registration constexpr again.
Reviewed By: ezyang
Differential Revision:
D13738630
fbshipit-source-id:
918fa3a3c8c6f0ddbd0f08b3b143cdf066265387
Sebastian Messmer [Tue, 22 Jan 2019 21:21:38 +0000 (13:21 -0800)]
Avoid closure around kernel (#16165)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16165
Store kernels as direct function pointers instead of std::function.
Using direct function pointers avoids a performance risk std::function would introduce.
Reviewed By: ezyang
Differential Revision:
D13738627
fbshipit-source-id:
a348906c8a201436699681980a82ca95065a06a0
Sebastian Messmer [Tue, 22 Jan 2019 21:21:37 +0000 (13:21 -0800)]
Pass IValues from JIT to c10 dispatcher (#16066)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16066
Don't unwrap and re-wrap but directly pass through the IValues
Reviewed By: ezyang
Differential Revision:
D13689037
fbshipit-source-id:
99b8155e640eb61a3c0597bf0f2b9c338712b45e
Shen Li [Tue, 22 Jan 2019 21:14:24 +0000 (13:14 -0800)]
Release GIL when synchronize or wait (#16182)
Summary:
address the second future work item in #15937
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16182
Differential Revision:
D13744972
Pulled By: mrshenli
fbshipit-source-id:
e9812e3fd4a5623e99b639d9f334bfc2d1827d92
Wanchao Liang [Tue, 22 Jan 2019 20:11:23 +0000 (12:11 -0800)]
Revert
D13540278: [pytorch][PR] Unhide unique from C++, make unique partially scriptable
Differential Revision:
D13540278
Original commit changeset:
3768c76a90b0
fbshipit-source-id:
7a31c239f9dca6ff467344d99820095addcae9d7
Xiang Gao [Tue, 22 Jan 2019 19:09:18 +0000 (11:09 -0800)]
Return namedtuples from torch.* function with multiple return arguments for C++ operators (#15429)
Summary:
Partially fixes: https://github.com/pytorch/pytorch/issues/394
Implementation detail:
Codegen is modified to generate code that looks like the following:
```C++
static PyObject * THPVariable_svd(PyObject* self_, PyObject* args, PyObject* kwargs)
{
HANDLE_TH_ERRORS
static PythonArgParser parser({
"svd(Tensor input, bool some=True, bool compute_uv=True, *, TensorList[3] out=None)",
}, /*traceable=*/true);
ParsedArgs<6> parsed_args;
auto r = parser.parse(args, kwargs, parsed_args);
static PyStructSequence_Field fields0[] = {
{"U", ""}, {"S", ""}, {"V", ""}, {nullptr}
};
static PyStructSequence_Desc desc0 = {
"torch.return_types.svd_out", nullptr,
fields0, 3
};
static PyTypeObject type0;
static bool namedtuple_type_initialized0 = false;
if (!namedtuple_type_initialized0) {
PyStructSequence_InitType(&type0, &desc0);
namedtuple_type_initialized0 = true;
}
static PyStructSequence_Field fields1[] = {
{"U", ""}, {"S", ""}, {"V", ""}, {nullptr}
};
static PyStructSequence_Desc desc1 = {
"torch.return_types.svd", nullptr,
fields1, 3
};
static PyTypeObject type1;
static bool namedtuple_type_initialized1 = false;
if (!namedtuple_type_initialized1) {
PyStructSequence_InitType(&type1, &desc1);
namedtuple_type_initialized1 = true;
}
if (r.idx == 0) {
if (r.isNone(3)) {
return wrap(&type1, dispatch_svd(r.tensor(0), r.toBool(1), r.toBool(2)));
} else {
auto results = r.tensorlist_n<3>(3);
return wrap(&type0, dispatch_svd(r.tensor(0), r.toBool(1), r.toBool(2), results[0], results[1], results[2]));
}
}
Py_RETURN_NONE;
END_HANDLE_TH_ERRORS
}
```
Types are defined as static members of the `THPVariable_${op_name}` functions, and initialized the first time the function is called.
When parsing function prototypes in `native_functions.yaml`, the parser will set the specified name as `field_name` when it sees things like `-> (Tensor t1, ...)`. These field names will be the field names of the namedtuple. The class of namedtuples will be named `torch.return_types.${op_name}`.
In some Python 2 builds, `PyStructSequence` is not a subtype of tuple, so we have to create some functions that check whether an object is a tuple or namedtuple, for compatibility.
Operators in `native_functions.yaml` are changed such that only `max` and `svd` are generated as namedtuple. Tests are added for these two operators to see if the return value works as expected. Docs for these two ops are also updated to explicitly mention the return value is a namedtuple. More ops will be added in later PRs.
There is an issue with the Windows build where the linker is unable to resolve `PyStructSequence_UnnamedField`; a workaround is added to deal with this case.
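A short usage sketch (not from the PR) of the two operators this change converts to namedtuple returns:
```python
import torch

x = torch.randn(4, 5)
result = torch.max(x, dim=1)
print(result.values, result.indices)    # fields of torch.return_types.max

u, s, v = torch.svd(torch.randn(3, 3))  # still unpacks like a plain tuple
```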
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15429
Differential Revision:
D13709678
Pulled By: ezyang
fbshipit-source-id:
23a511c9436977098afc49374e9a748b6e30bccf
Jongsoo Park [Tue, 22 Jan 2019 18:08:33 +0000 (10:08 -0800)]
Fix formating in caffe2/quantization/server/README.md
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14237
Reviewed By: dskhudia
Differential Revision:
D13751791
Pulled By: jspark1105
fbshipit-source-id:
54f73d5134e596817802c66d43098d18458c2799
Yaxun (Sam) Liu [Tue, 22 Jan 2019 17:07:18 +0000 (09:07 -0800)]
hip-clang enablement (#16085)
Summary:
Initial enabling of the upcoming hip-clang compiler for the PyTorch source base.
Changes:
* update the Eigen submodule to a version including our upstreamed hip-clang enabling there
* modify a few ifdef guards with the `__HIP__` macro used by hip-clang
* use `__lane_id` instead of `hc::__lane_id`
* add Debug flags for ROCm to the cmake infrastructure
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16085
Differential Revision:
D13709459
Pulled By: ezyang
fbshipit-source-id:
1b7b33fe810a0434766180580d4443ea177eb7c7
Andy Wei [Tue, 22 Jan 2019 16:46:06 +0000 (08:46 -0800)]
Raise CalledProcessError when torch.distributed launch process not return 0 (#16069)
Summary:
`torch.distributed.launch.py` does not raise an error when a process launched via `subprocess.Popen` does not return 0.
For better debugging, it should always raise an error if launched processes exhibit unusual behavior; a sketch of the pattern follows.
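A minimal sketch of the raise-on-nonzero-exit pattern described above; the function and variable names are made up, and this is not the actual source of launch.py.
```python
import subprocess

def wait_and_check(processes, cmd):
    # hypothetical helper: wait for each launched worker and raise if any
    # of them exited with a non-zero return code
    for process in processes:
        process.wait()
        if process.returncode != 0:
            raise subprocess.CalledProcessError(returncode=process.returncode,
                                                cmd=cmd)
```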
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16069
Differential Revision:
D13709467
Pulled By: ezyang
fbshipit-source-id:
31d32a5ec8fed7bccd62d845bfba0e670ed3fe20
Shahzad Lone [Tue, 22 Jan 2019 16:00:00 +0000 (08:00 -0800)]
Reserve vectors that we know the size in advance for. (#16201)
Summary:
Save reallocation costs by reserving vectors according to how many elements we expect to put in them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16201
Differential Revision:
D13762594
Pulled By: ezyang
fbshipit-source-id:
7e3bfe421489dde48a2ddb0920dd155f69baecc0
Will Feng [Tue, 22 Jan 2019 05:53:43 +0000 (21:53 -0800)]
cpp doc fix (#16221)
Summary:
Fixed a few C++ API callsites to work with v1.0.1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16221
Differential Revision:
D13759207
Pulled By: yf225
fbshipit-source-id:
bd92c2b95a0c6ff3ba5d73cb249d0bc88cfdc340
Lu Fang [Tue, 22 Jan 2019 04:13:07 +0000 (20:13 -0800)]
Move away from ConstantFill (#16214)
Summary:
Prerequisite of https://github.com/onnx/onnx/pull/1434
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16214
Reviewed By: BIT-silence
Differential Revision:
D13755116
Pulled By: houseroad
fbshipit-source-id:
a46be8d7df959b5ede93e1f9c911a9a9326e6879
Zachary DeVito [Tue, 22 Jan 2019 03:57:33 +0000 (19:57 -0800)]
ban conv_double_backward from sandcastle, it takes too long
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16220
Differential Revision:
D13755108
Pulled By: zdevito
fbshipit-source-id:
46b1b128b155964c25249add0c84680491845e9b
Zachary DeVito [Tue, 22 Jan 2019 01:24:32 +0000 (17:24 -0800)]
Remove dead code from setup.py, remove need for build target. (#16162)
Summary:
Now it is only necessary to use 'develop' or 'install' to build. Incremental cmake is on by default. `develop --cmake` forces it to rerun.
The NinjaBuilder stuff is dead. It was used to make building _C.so
faster but now _C.so is just an empty stub file.
Removed a bunch of custom build commands from setup.py that are
no longer meaningful now that cmake handles most of the build.
Removed unused targets in build_pytorch_lib.sh/bat
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16162
Differential Revision:
D13744155
Pulled By: zdevito
fbshipit-source-id:
d836484782c65b7f8e8c7a82620886f7a7777892
Xiang Gao [Mon, 21 Jan 2019 20:28:56 +0000 (12:28 -0800)]
Unhide unique from C++, make unique partially scriptable (#15256)
Summary:
This PR does three things:
~~Allow `int64_t?` in function schema, which provides an elegant way of implementing nullable int arguments, as discussed in https://github.com/pytorch/pytorch/pull/15208#pullrequestreview-185230081~~
~~Originally implemented in https://github.com/pytorch/pytorch/pull/15235~~
~~Example:~~
```yaml
- func: myop(Tensor self, int64_t? dim=None) -> Tensor
variants: function
```
~~cc: zou3519~~
Edit: implemented in https://github.com/pytorch/pytorch/pull/15234
Previously tried in https://github.com/pytorch/pytorch/pull/12064. There was a problem that C++ does not have kwarg support, which makes it confusing to know whether `unique(t, 1)` actually means `unique(t, dim=1)` or `unique(t, sorted=1)`.
Now I think I have a better idea of how to implement this: there are two ATen operators, `unique` and `unique_dim`. `unique` has the same signature as in Python, and is exported to both Python and C++. `unique_dim` has the signature `unique_dim(tensor, dim, sorted=False, return_inverse=False)`, and is only exported to C++, where it can be used more naturally by a C++ user.
Differential Revision:
D13540278
Pulled By: wanchaol
fbshipit-source-id:
3768c76a90b0881f565a1f890459ebccbdfe6ecd
Lu Fang [Mon, 21 Jan 2019 17:43:40 +0000 (09:43 -0800)]
update of fbcode/onnx to
c553fb32a0902ce5dd42e1b40123e9e9b38bdbe7 (#16190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16190
Previous import was
fd60104394fa353e1762f44ecad1b2166e33deef
Included changes:
- **[c553fb3](https://github.com/onnx/onnx/commit/c553fb3)**: Handle negative axis in scan shape inference (#1748) <G. Ramalingam>
- **[51b6ecc](https://github.com/onnx/onnx/commit/51b6ecc)**: external_data: Store large tensor values in separate files (#678) <Michał Karzyński>
- **[ba05f26](https://github.com/onnx/onnx/commit/ba05f26)**: Scan output axes (#1737) <G. Ramalingam>
- **[90920c0](https://github.com/onnx/onnx/commit/90920c0)**: Add NonZero op. (#1714) <Sergii Dymchenko>
- **[c4cf112](https://github.com/onnx/onnx/commit/c4cf112)**: fix the test cases for constantofshape (#1746) <Lu Fang>
- **[d902349](https://github.com/onnx/onnx/commit/d902349)**: Add sample implementation support (#1712) <Lu Fang>
Differential Revision:
D13745693
fbshipit-source-id:
05e2cce9ae1dfa2865db83840df64673d55cea57
Xiaomeng Yang [Sun, 20 Jan 2019 16:50:32 +0000 (08:50 -0800)]
Separate Moments from math and optimize it (#16175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16175
Separate Moments from math and optimize it
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision:
D13742472
fbshipit-source-id:
90757d908d38c98ca69818855aaf68315e525992
Shen Li [Sun, 20 Jan 2019 06:58:54 +0000 (22:58 -0800)]
Unify device() return type in Stream, Event, and Tensor (#16150)
Summary:
Addresses one future work item in #15937
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16150
Differential Revision:
D13732299
Pulled By: mrshenli
fbshipit-source-id:
4d0b35df573a3bf92dea6e2e7eb42fe8bac77b18
Spandan Tiwari [Sun, 20 Jan 2019 03:50:20 +0000 (19:50 -0800)]
Replace use of ConstantLike with with ConstantOfShape (#16095)
Summary:
Submitting this PR as an update to the existing PR (https://github.com/pytorch/pytorch/pull/15938) at houseroad's request.
This PR replaces the use of the ONNX op `ConstantLike` with `ConstantOfShape` in the ONNX exporter. In addition to removing the call sites in `symbolic.py`, it also replaces the call site in `peephole.cpp`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16095
Differential Revision:
D13745723
Pulled By: houseroad
fbshipit-source-id:
e2a5f534f01adf199df9e27544f7afcfa540e1f0
Miro Furtado [Sat, 19 Jan 2019 22:58:36 +0000 (14:58 -0800)]
Fix LBFGS issue (#16167)
Summary:
Resolves #15923 where LBFGS threw "Error: a leaf Variable that requires grad has been used in an in-place operation."
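A minimal repro-style sketch (not from the PR) of the optimizer usage that previously hit this error:
```python
import torch

x = torch.randn(10, requires_grad=True)
opt = torch.optim.LBFGS([x], lr=0.1)

def closure():
    opt.zero_grad()
    loss = (x ** 2).sum()
    loss.backward()
    return loss

opt.step(closure)   # should no longer hit the in-place-on-leaf error from #15923
```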
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16167
Differential Revision:
D13745822
Pulled By: soumith
fbshipit-source-id:
7d1d0511d06838c0c6f4c8a6b53cf15193283059
Kjell Schubert [Sat, 19 Jan 2019 13:57:37 +0000 (05:57 -0800)]
Allow for concurrent quantization in FullyConnectedDNNLowPOp (#16174)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16174
Our service creates a new caffe2 workspace for the same underlying network on multiple threads concurrently at service startup time (later these workspaces are being reused for sequential requests), resulting in concurrent quantization via FullyConnectedDNNLowPOp calling GetOrCreateFbgemmPackBMatrix(). The lazily performed quantizations during the first inference in each workspace are all funnelled through GetOrCreateFbgemmPackBMatrix()'s cache_mutex, which means quantization is serialized, so at service startup time only a single CPU core is being used for around a minute until the serial quantization is done.
A better solution would be to avoid quantizing the same weight matrix of the operator copies in different net copies to begin with, but this is the simpler solution for our current problem.
Reviewed By: jspark1105
Differential Revision:
D13708785
fbshipit-source-id:
537519896b3b939c552d67f400bafc8a69ce11eb
Lu Fang [Sat, 19 Jan 2019 06:55:41 +0000 (22:55 -0800)]
Support ConstantOfShape in Caffe2 ONNX Backend (#16108)
Summary:
This PR is the prerequisite to land https://github.com/pytorch/pytorch/pull/16095
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16108
Reviewed By: BIT-silence
Differential Revision:
D13725722
Pulled By: houseroad
fbshipit-source-id:
28c0fb72f075cd04f9db44dfab0163844c20c620
Xiaomeng Yang [Sat, 19 Jan 2019 06:37:12 +0000 (22:37 -0800)]
Separate affine_channel from math and optimize it (#16135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16135
Separate affine_channel from math and optimize it
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision:
D13727606
fbshipit-source-id:
8980af4afadaf964a18a9da581106fe30896a7e9
Sebastian Messmer [Fri, 18 Jan 2019 23:55:57 +0000 (15:55 -0800)]
Pass IValue from c10 dispatcher to caffe2 operator (#16065)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16065
Before, we registered the caffe2 kernel with the c10 dispatcher using plain C types.
Now, we pass in IValues, which avoids the unwrapping inbetween.
Reviewed By: ezyang
Differential Revision:
D13689036
fbshipit-source-id:
b976a2c46a5a541f6a926b3df255e8a535e32420
Sebastian Messmer [Fri, 18 Jan 2019 23:55:57 +0000 (15:55 -0800)]
Make c10 dispatcher use boxed kernel function pointers (#16051)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16051
This changes the kernels stored in the c10 dispatcher from plain C function pointers to IValue-based KernelFunction*.
Note that KernelFunction is currently taking an `ArrayRef<IValue>` as arguments. A later diff will change that to it taking a `Stack*`.
Reviewed By: ezyang
Differential Revision:
D13684518
fbshipit-source-id:
1fa54f60cec2e967b92a4a043d6e3ac1627ed991
Thomas Viehmann [Fri, 18 Jan 2019 23:27:12 +0000 (15:27 -0800)]
add back NNPACK in PyTorch (#15924)
Summary:
This tests the water for adding back NNPACK in PyTorch, it's a lot better than the fallback THNN versions.
In #6151, we (ezyang and soumith) removed NNPACK support from PyTorch. Of course Maratyszcza might have advice, too. (Or an opinion on the CMake changes.)
The only functional changes are to use NNPack more aggressively on mobile and a .contiguous() to match NNPack's assumption (I stumbled over that while using NNPack for style transfer.)
The CMake changes try to use the NNPack we already have in git.
In terms of lines of code this is a large part of the diff of https://lernapparat.de/pytorch-jit-android/ . As far as I can tell, we don't have MKLDNN on mobile and the native THNN implementations are prohibitively expensive in terms of both CPU and memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15924
Differential Revision:
D13709576
Pulled By: ezyang
fbshipit-source-id:
f2e287739909451c173abf046588209a7450ca2c
Natalia Gimelshein [Fri, 18 Jan 2019 22:53:32 +0000 (14:53 -0800)]
improve performance of unique with inverse indices (#16145)
Summary:
Partial fix for #15804, only w/o dim.
For jcjohnson benchmarking script I'm getting the following results on V100:
Before:
```
Running with N = 10000, M = 10000
cuda (no inverse): 0.98 ms
cpu (no inverse): 0.96 ms
cuda (with inverse): 1.07 ms
cpu (with inverse): 1.76 ms
Running with N = 10000, M = 100000
cuda (no inverse): 0.76 ms
cpu (no inverse): 1.53 ms
cuda (with inverse): 1.23 ms
cpu (with inverse): 3.02 ms
Running with N = 100000, M = 100000
cuda (no inverse): 1.28 ms
cpu (no inverse): 11.22 ms
cuda (with inverse): 69.76 ms
cpu (with inverse): 20.28 ms
Running with N = 100000, M = 1000000
cuda (no inverse): 0.78 ms
cpu (no inverse): 18.78 ms
cuda (with inverse): 133.45 ms
cpu (with inverse): 34.09 ms
Running with N = 500000, M = 500000
cuda (no inverse): 1.43 ms
cpu (no inverse): 61.13 ms
cuda (with inverse): 3315.18 ms
cpu (with inverse): 104.57 ms
Running with N = 500000, M = 5000000
cuda (no inverse): 0.86 ms
cpu (no inverse): 96.44 ms
cuda (with inverse): 5209.93 ms
cpu (with inverse): 176.10 ms
```
After
```
Running with N = 10000, M = 10000
cuda (no inverse): 1.04 ms
cpu (no inverse): 0.94 ms
cuda (with inverse): 0.64 ms
cpu (with inverse): 1.76 ms
Running with N = 10000, M = 100000
cuda (no inverse): 0.77 ms
cpu (no inverse): 1.55 ms
cuda (with inverse): 0.58 ms
cpu (with inverse): 2.79 ms
Running with N = 100000, M = 100000
cuda (no inverse): 1.30 ms
cpu (no inverse): 14.15 ms
cuda (with inverse): 1.63 ms
cpu (with inverse): 20.90 ms
Running with N = 100000, M = 1000000
cuda (no inverse): 0.82 ms
cpu (no inverse): 18.63 ms
cuda (with inverse): 0.61 ms
cpu (with inverse): 33.52 ms
Running with N = 500000, M = 500000
cuda (no inverse): 1.51 ms
cpu (no inverse): 59.81 ms
cuda (with inverse): 1.23 ms
cpu (with inverse): 110.69 ms
Running with N = 500000, M = 5000000
cuda (no inverse): 0.92 ms
cpu (no inverse): 104.26 ms
cuda (with inverse): 0.84 ms
cpu (with inverse): 187.12 ms
```
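A short usage sketch (not from the PR) of the code path these numbers exercise, i.e. `unique` with inverse indices:
```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randint(0, 1000, (100000,), device=device)
vals, inverse = torch.unique(x, sorted=True, return_inverse=True)
# inverse maps each element of x back to its position in vals
assert torch.equal(vals[inverse], x)
```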
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16145
Differential Revision:
D13738821
Pulled By: soumith
fbshipit-source-id:
0811fb4ade47e3b466cebbc124e3f3333a986749
Michael Suo [Fri, 18 Jan 2019 22:01:41 +0000 (14:01 -0800)]
fix for clang-tidy (#16164)
Summary:
It turns out that clang-tidy is bundled with travis's standard trusty distribution, so no need to install it manually.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16164
Differential Revision:
D13738986
Pulled By: suo
fbshipit-source-id:
d0cd76c615625b2ed7f18951289412989f15849d
Shen Li [Fri, 18 Jan 2019 20:32:20 +0000 (12:32 -0800)]
Change current device in stream context manager if necessary (#16128)
Summary:
Fixes #16019
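A minimal sketch of the behavior this fixes, namely that entering a stream context switches the current device to the stream's device when necessary; it assumes at least two GPUs and is not taken from the PR.
```python
import torch

if torch.cuda.device_count() >= 2:
    s = torch.cuda.Stream(device=1)
    with torch.cuda.stream(s):
        # after this fix the current device follows the stream's device
        print(torch.cuda.current_device())  # expected: 1
```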
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16128
Differential Revision:
D13721850
Pulled By: mrshenli
fbshipit-source-id:
422c6c0b97c1cd46e127e265b532cb8c74a3aac5
Jerry Zhang [Fri, 18 Jan 2019 20:14:34 +0000 (12:14 -0800)]
Fix SoftmaxOps (#16049)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16049
We might see the pattern
```
if (scale_.numel() != N) {
scale_->Resize(N);
// set initial value for scale_
}
// In class:
Tensor scale_{CPU};
```
before in the code, where `scale_` is a member variable of Type `caffe2::Tensor`
This pattern actually serves two purposes, if `scale_` is partially initialized with device type but not size, this call will
initialize Tensor with the correct size, or if `scale_` is already initialized with size, it will check whether the size
matches a runtime value `N` and if not it will Resize. To rewrite this we'll do the following:
```
if (!scale_.defined() || scale_.numel() != N) {
ReinitializeTensor(&scale_, {N}, at::dtype<float>().device(CPU));
// set initial value for scale_
}
```
There are some variants, if `scale_` is resized to a constant size, we can call `ReinitializeTensor` instead
```
if (scale_.numel() != 1) {
scale_->Resize(1);
}
```
-->
```
ReinitializeTensor(&scale_, {1}, at::dtype<float>().device(CPU));
```
Normal Resize will be refactored directly into ReinitializeTensor:
```
scale_->Resize(N);
```
-->
```
ReinitializeTensor(&scale_, {N}, at::dtype<float>().device(CPU));
```
Reviewed By: dzhulgakov
Differential Revision:
D13667883
fbshipit-source-id:
2c7cb61544b72765b594011b99150eb5a1b50836
Jerry Zhang [Fri, 18 Jan 2019 19:49:36 +0000 (11:49 -0800)]
rest of uses for deprecation of dims() in Tensor (#16118)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16118
att
Differential Revision:
D13697211
fbshipit-source-id:
12bf6edd1794240ac748cc1b8fecb0c1e8eb9112
Nikita Shulga [Fri, 18 Jan 2019 19:33:40 +0000 (11:33 -0800)]
RNN operators should inherit step_net device_options (#16086)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16086
[caffe2] RNN operators should inherit step_net device_options
According to the NetDef documentation, if a network has a specific device option, it applies to all network operators that do not explicitly specify one.
But this does not seem to be the case for RecurrentNetwork operators.
Reviewed By: orionr
Differential Revision:
D13699552
fbshipit-source-id:
14529bc9504e3b02f763e3c2429be21e46f82b68
Elias Ellison [Fri, 18 Jan 2019 19:17:34 +0000 (11:17 -0800)]
Add implicit optional unwrapping (#15587)
Summary:
Add support for type inference for optional type refinement.
If a conditional is of the form "x is None" or "x is not None", or is a boolean expression containing multiple none checks, the proper type refinements are inserted in each branch.
For example:
if optional_tensor is not None and len(optional_tensor) < 2:
# optional_tensor is a Tensor
if optional_tensor1 is not None and optional_tensor2 is not None:
# both optional_tensor1 and optional_tensor2 are Tensors
TODO:
- not run an op for unchecked unwrap optional in the interpreter
- potentially refine types to prim::None (omitted for now to simplify things & because it's not an actual use case).
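A small sketch of the refinement described above, assuming current `torch.jit.script` syntax with Python 3 annotations; this is not a test from the PR.
```python
import torch
from typing import Optional

@torch.jit.script
def count_elems(x: Optional[torch.Tensor]) -> int:
    if x is not None:
        return x.numel()   # x is refined from Optional[Tensor] to Tensor here
    return 0

print(count_elems(torch.randn(2, 3)), count_elems(None))
```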
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15587
Differential Revision:
D13733810
Pulled By: eellison
fbshipit-source-id:
57c32be9f5a09ab5542ba0144a6059b96de23d7a
Jerry Zhang [Fri, 18 Jan 2019 18:59:53 +0000 (10:59 -0800)]
Add defined() to caffe2::Tensor (#16125)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16125
Add defined() method to check whether the Tensor is defined.
Reviewed By: ezyang
Differential Revision:
D13719222
fbshipit-source-id:
ff8efef2159ed1026bd16acaea40c768a1e20a47
Edward Yang [Fri, 18 Jan 2019 18:52:18 +0000 (10:52 -0800)]
Remove ATen/Half.h and ATen/core/Half.h forwarding headers.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16115
Reviewed By: bddppq
Differential Revision:
D13717049
fbshipit-source-id:
fb1d690183a932a1fa1a2d235f3219520f51620a
Shen Li [Fri, 18 Jan 2019 18:07:33 +0000 (10:07 -0800)]
Port legacy any(*) to ATen
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15547
Differential Revision:
D13549495
Pulled By: mrshenli
fbshipit-source-id:
09a065a8ffa7d73f409759b779c7314cc87f4853
Richard Zou [Fri, 18 Jan 2019 15:56:17 +0000 (07:56 -0800)]
Improve pack_sequence and pack_padded_sequence error message (#16084)
Summary:
Mention that if enforce_sorted=True, the user can set
enforce_sorted=False. This is a new flag that is probably hard to
discover unless one thoroughly reads the docs.
Fixes #15567
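A minimal usage sketch (not from the PR) of the flag the improved error message points to:
```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

seqs = torch.randn(3, 5, 4)          # batch of 3, max length 5, feature dim 4
lengths = torch.tensor([3, 5, 2])    # not sorted in decreasing order
# with the default enforce_sorted=True this raises; False sorts internally
packed = pack_padded_sequence(seqs, lengths, batch_first=True, enforce_sorted=False)
```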
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16084
Differential Revision:
D13701118
Pulled By: zou3519
fbshipit-source-id:
c9aeb47ae9769d28b0051bcedb8f2f51a5a5c260
Teng Li [Fri, 18 Jan 2019 10:23:51 +0000 (02:23 -0800)]
TCP init method race condition fix (#15684)
Summary:
This PR fixes a race condition for the TCP init method, where the master rank can exit earlier than the slave ranks, and thus the TCP daemon thread gets shut down before the other slaves are able to access it.
This will let every rank (process) write a special key to the store to mark that it has completed (and is thus about to exit). The master rank (which hosts the server) will always wait for all the ranks to complete before completing itself.
This should fix: https://github.com/pytorch/pytorch/issues/15638
Tested using the repro of https://github.com/pytorch/pytorch/issues/15638 and works fine. Also test_distributed and test_c10d should have already had this coverage.
I had to make the rendezvous test in c10d use a world size of 1, since it is single-process code.
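A single-process sketch (world size 1, gloo backend; not from the PR) of the TCP init method the fix applies to; in a real job every rank runs this with its own rank and world size.
```python
import torch.distributed as dist

dist.init_process_group(backend="gloo",
                        init_method="tcp://127.0.0.1:23456",
                        rank=0, world_size=1)
# ... collectives / training would go here ...
dist.destroy_process_group()
```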
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15684
Differential Revision:
D13570904
Pulled By: teng-li
fbshipit-source-id:
34f3bc471204bbd29320df359347ad5561c6b589
Dmytro Dzhulgakov [Fri, 18 Jan 2019 08:28:26 +0000 (00:28 -0800)]
Remove caffe2::Tensor copy constructor (#15416)
Summary:
Based on offline discussion it should be less surprising to the users of existing code. Thus caffe2::Tensor is now a move-only class (as it used to be), explicit calls to UnsafeSharedInstance() are necessary to get shared_ptr behavior.
This change also identified a few places that misused the copy constructor - those are fixed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15416
Reviewed By: Yangqing
Differential Revision:
D13524598
fbshipit-source-id:
aea12d6dff77342606fa88ce4ddddbff266245a7
Zachary DeVito [Fri, 18 Jan 2019 08:01:48 +0000 (00:01 -0800)]
Fix RERUN_CMAKE
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16132
Differential Revision:
D13726816
Pulled By: zdevito
fbshipit-source-id:
26ad70651b0138642ad5240670f5c452018c13a2
Mikhail Zolotukhin [Fri, 18 Jan 2019 02:03:38 +0000 (18:03 -0800)]
Cleanup includes in python_print.cpp.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16129
Differential Revision:
D13724297
Pulled By: ZolotukhinM
fbshipit-source-id:
24e140bc052c85ef40b928eb84f463d341346a51
Mikhail Zolotukhin [Fri, 18 Jan 2019 01:27:36 +0000 (17:27 -0800)]
Refactor attributes.h (#16098)
Summary:
This PR inlines `Attributes` into `Node`. It helps to clean up the code a little, as everything is in one place (some of the cleanups are included in the PR).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16098
Differential Revision:
D13717637
Pulled By: ZolotukhinM
fbshipit-source-id:
c54ae65178a95a01354688921a9ccb1ca699f8eb
Sebastian Messmer [Thu, 17 Jan 2019 23:53:52 +0000 (15:53 -0800)]
Fix export macros (#15856)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15856
They seem to be wrong.
cc zdevito to take a look but I think this is now more correct.
It's weird this didn't cause linker errors. Probably, this functionality isn't used across library boundaries yet.
Reviewed By: dzhulgakov
Differential Revision:
D13605257
fbshipit-source-id:
7077ca9027c3ac79a4847ec15ead7ddb28696445
Sebastian Messmer [Thu, 17 Jan 2019 23:53:52 +0000 (15:53 -0800)]
Remove some dependencies from ivalue.h to ATen (#15855)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15855
This is preparation work for moving IValue to c10.
Reviewed By: ezyang
Differential Revision:
D13605259
fbshipit-source-id:
cc545f582ab8607bb02aaf71273cb2710200b295
Sebastian Messmer [Thu, 17 Jan 2019 23:53:52 +0000 (15:53 -0800)]
Code style cleanup (#15854)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15854
- Remove unnecessary `inline` keyword
- Add a TODO stating the intention for Blob::ShareExternal()
Reviewed By: dzhulgakov
Differential Revision:
D13605258
fbshipit-source-id:
c0bc85c74c4ca4b3811d42ac7f866182e159d840
Sebastian Messmer [Thu, 17 Jan 2019 23:47:16 +0000 (15:47 -0800)]
Use intrusive_ptr for Blob in IValue (#16052)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16052
We need IValue to take/return Blob as an intrusive_ptr because we want to pass it around and Blob has disabled copying.
This is needed in a diff on top.
Reviewed By: ezyang
Differential Revision:
D13684761
fbshipit-source-id:
7cb3d7e9fec39a2bc9f063d4d30404e6d7016eb2
Sebastian Messmer [Thu, 17 Jan 2019 23:47:16 +0000 (15:47 -0800)]
Move c10 dispatcher back to ATen/core (#16050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16050
The c10 dispatcher will (soon) depend on IValue and IValue can't be moved to c10 yet because it depends on at::Tensor, which depends on legacy Type dispatch and we don't want the legacy dispatch in c10.
So instead, we move the c10 dispatcher back to ATen/core until we can actually move at::Tensor to c10.
Reviewed By: ezyang
Differential Revision:
D13684517
fbshipit-source-id:
1125f4254223907c52f96ff73034f6d4ae9fd0a7
Chris Gottbrath [Thu, 17 Jan 2019 22:55:49 +0000 (14:55 -0800)]
Moving cuda-convnet2 to the internal fb dir to satisfy internal dependencies. (#16104)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16104
PyTorch PR 15784 removed cuda-convnet from the contrib directory. This broke
some internal-only fb dependencies. Moving this to the internal area.
Reviewed By: ezyang
Differential Revision:
D13709112
fbshipit-source-id:
2d7811545da67489869b59c350a29817eff693cf
Michael Suo [Thu, 17 Jan 2019 22:38:42 +0000 (14:38 -0800)]
further wildcard cleanups (#16041)
Summary:
Some cleanup to wildcard handling, including one bugfix: previously, we were not considering writes to the wildcard set as part of the potential write set for nodes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16041
Differential Revision:
D13705738
Pulled By: suo
fbshipit-source-id:
acb8ccbaa70fe47445577ddf24a69f84630de411
David Riazati [Thu, 17 Jan 2019 21:39:07 +0000 (13:39 -0800)]
Refactor _jit_internal (#16058)
Summary:
Use qualified names in `jit/__init__.py` to avoid polluting that namespace
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16058
Differential Revision:
D13718745
Pulled By: driazati
fbshipit-source-id:
19d150569c8374541250a961f24f70c3f523de03
Jesse Hellemn [Thu, 17 Jan 2019 21:37:04 +0000 (13:37 -0800)]
Include all Caffe2 headers in Python installations (#16124)
Summary:
Confirmed on a local run that all the additional headers are present. This shouldn't be caught in any existing tests though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16124
Differential Revision:
D13720773
Pulled By: pjh5
fbshipit-source-id:
22a42639f5649cac555ecc5a8b6760a8cbfcf01f
Aaron Jaech [Thu, 17 Jan 2019 21:13:07 +0000 (13:13 -0800)]
Add comment to explain rnn bias vectors (#15843)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15843
RNN/LSTMs only need one bias vector, but our implementation uses two to be compatible with CuDNN. This diff adds a comment to explain this.
Reviewed By: ezyang
Differential Revision:
D13602365
fbshipit-source-id:
eef5bd9383d9f241dc0ef0472f753b4a44cc19b5
Will Feng [Thu, 17 Jan 2019 20:18:15 +0000 (12:18 -0800)]
Add @yf225 to cpp codeowner
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16120
Differential Revision:
D13718880
Pulled By: yf225
fbshipit-source-id:
1c0a41ffba71855a3ad88b8d263ba2bd5076351d
Yangqing Jia [Thu, 17 Jan 2019 20:15:07 +0000 (12:15 -0800)]
Update FP16 to latest master (#14498)
Summary:
TSIA - fp16 cmake had a bug that is fixed in https://github.com/Maratyszcza/FP16/pull/9 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14498
Differential Revision:
D13240829
Pulled By: Yangqing
fbshipit-source-id:
724745750efe4f1b49d29ee07380c36997579915
Egil Martinsson [Thu, 17 Jan 2019 20:14:39 +0000 (12:14 -0800)]
Cleanup gumbel_softmax (#13339)
Summary:
Fixes #12643, amends to #3341.
- Allow multidimensional input ~~(but apply softmax over `dim=-1`)~~ with `dim` argument
- Cleaner: Less lines of code
- Faster (1.32x speedup vs original, 2x speedup vs using `torch.Distributions`)
- Small fixes in docstring
- Remove some references in docstring. Was the linked (excellent) ipynb the first to do the straight-through trick? Instead, I propose changing the reference to the two papers best known for it.
- Add deprecationwarning for `eps`. It's not needed anymore.
- Initial commit keeps some code alternatives commented to exploit CI
- As of discussion when `gumbel_softmax` was added (#3341), this was merged into `torch.nn.functional` before all the work with `Distributions` and `Pyro`, and there will probably be multiple other best practices for this in the future.
I've tested building using the `Distributions`-api, but it was too slow, see below.
I therefore propose not using `Distributions` to keep it fast and simple, but adding a comment in docstring that `gumbel_softmax` may be deprecated in the future.
```
dist = torch.distributions.RelaxedOneHotCategorical(temperature=tau, logits=logits, validate_args=False)
y_soft = dist.rsample()
```
Pros:
* Built using tricks like `logsumexp` etc
* Explicitly uses `torch.distributions.utils._finfo` to avoid overflow (old implementation had an `eps` flag)
* Maintained for this exact purpose.
Cons:
* Very slow. Construction of distribution adds overhead see timings below. May be solved in future with speedups of `TransformedDistribution` and `Distribution`.
* Assumes which `dim` to apply softmax over.
```
y_soft = logits.new(logits.shape)
y_soft = (logits - y_soft.exponential_().log()) / tau # Gumbel noise
y_soft = y_soft.softmax(dim) # Gumbel softmax noise
```
Pros:
* Faster
```
import time
import torch
from torch.nn.functional import gumbel_softmax

num_draws = 1000000
logits = torch.randn(1, 3)
counts = torch.zeros_like(logits)
start = time.time()
for draw in range(num_draws):
    y_draw = gumbel_softmax(logits, hard=True)
    counts = counts + y_draw
end = time.time()
print(end - start)
>> 12.995795965194702
>> 7.658372640609741
>> 20.3382670879364
```
Decide on which path to choose. I'll commit changes to the unit tests in a while to show that it passes both the old and new tests. I'll also remove the commented code about `RelaxedOneHotCategorical`.
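A minimal usage sketch of the `dim` argument added here; the keyword names are assumed to match the current `torch.nn.functional.gumbel_softmax` signature.
```python
import torch
import torch.nn.functional as F

logits = torch.randn(2, 3, 10)                       # multidimensional input
y = F.gumbel_softmax(logits, tau=0.5, hard=True, dim=-1)
print(y.sum(dim=-1))                                  # each slice along dim sums to 1
```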
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13339
Differential Revision:
D13092434
Pulled By: ezyang
fbshipit-source-id:
4c21788df336f4e9c2ac289022e395b261227b4b
Christian Puhrsch [Thu, 17 Jan 2019 20:07:04 +0000 (12:07 -0800)]
Add matches_jit_signature attribute to native_functions.yaml (#16040)
Summary:
If "matches_jit_signature" is set to True for a particular function, we will assume that the func syntax follows the JIT signature syntax. This is a temporary attribute and doesn't need to be set by developers outside the core team. It serves as a means of tracking an ongoing schema unification with the goal of aligning func syntax with other components of PyTorch in order to reduce overall complexity and match coverage of different function descriptions.
Followup PRs might be about removing _out from native_functions.yaml and using Tensor annotations instead, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16040
Reviewed By: ezyang
Differential Revision:
D13703176
Pulled By: cpuhrsch
fbshipit-source-id:
ce248e1823a6f18efa95502f9f3eebf023b4a46c
FrankHui [Thu, 17 Jan 2019 19:31:31 +0000 (11:31 -0800)]
add if in register_buffer like register_parameters (#16110)
Summary:
without this "if", code below will throw error " Linear' object has no attribute '_buffers' "
And with this if, error would be "cannot assign buffer before Module.\_\_init\_\_() call", which I think it's more accurate, just like register_parameter.
```
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn import functional as F
from torch.nn import Module

class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):
        self.in_features = in_features
        self.out_features = out_features
        self.register_buffer('test', torch.Tensor(out_features, in_features))
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        super(Linear, self).__init__()
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

linear = Linear(3, 4)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16110
Differential Revision:
D13715839
Pulled By: soumith
fbshipit-source-id:
c300eff0a8655aade448354cf489a592f7db722a
Edward Yang [Thu, 17 Jan 2019 19:15:39 +0000 (11:15 -0800)]
Revert
D13709409: [pytorch][PR] Exclude pyi from flake8 checks.
Differential Revision:
D13709409
Original commit changeset:
ec4a959e146f
fbshipit-source-id:
feabed5719a0bfdfe7979074b7e1ba9756c4ba25
Guoqiang Jerry Chen [Thu, 17 Jan 2019 18:54:43 +0000 (10:54 -0800)]
respect grad guard for torch.jit._fork and torch.jit._wait (#16101)
Summary:
respect grad guard for torch.jit._fork and torch.jit._wait.
Verified that the test failed without the fix, and pass with the fix.
Ideally I would like to enable and disable grad inside the forked function. It doesn't seem to be supported at this moment; this code handles that as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16101
Differential Revision:
D13708374
Pulled By: gqchen
fbshipit-source-id:
0533f080c4d0253fb4c61d2a0d3cc22de5721a09
Gregory Chanan [Thu, 17 Jan 2019 18:12:47 +0000 (10:12 -0800)]
Revert batched pdist, improve existing kernel, add test (#15901)
Summary:
1) Reverts https://github.com/pytorch/pytorch/pull/12302 which added support for batched pdist. Except I kept the (non-batched) test improvements that came with that PR, because they are nice to have. Motivation: https://github.com/pytorch/pytorch/issues/15511
2) For the non-batched pdist, improved the existing kernel by forcing fp64 math and properly checking cuda launch errors
3) Added a 'large tensor' test that at least on my machine, fails on the batch pdist implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15901
Reviewed By: ezyang
Differential Revision:
D13616730
Pulled By: gchanan
fbshipit-source-id:
620d3f9b9acd492dc131bad9d2ff618d69fc2954
Derek Kim [Thu, 17 Jan 2019 18:07:46 +0000 (10:07 -0800)]
Fix trivial typos in torch.cuda._utils (#16026)
Summary:
Trivial typo fixes.
Maybe the indefinite article "an" is needed before each "specified index" but I'm not perfectly sure.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16026
Differential Revision:
D13709499
Pulled By: ezyang
fbshipit-source-id:
698b000bb8aa063afd81db6e67046456a439b2ce
Sasha Rush [Thu, 17 Jan 2019 18:04:51 +0000 (10:04 -0800)]
Unify the shape notation for all of the pytorch modules (#15741)
Summary:
PR to update the shape notation for all of the torch.nn modules to take a unified form. The goal is to make these definitions machine-readable and checkable by unifying the style across all of the different modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15741
Differential Revision:
D13709601
Pulled By: ezyang
fbshipit-source-id:
fb89a03903fdf0cd0dcf76f3e469b8582b2f3634
Neeraj Pradhan [Thu, 17 Jan 2019 17:59:58 +0000 (09:59 -0800)]
Fix numerical stability in binomial.log_prob (#15962)
Summary:
This issue was discovered by fehiepsi in https://github.com/uber/pyro/issues/1706 with the `log_prob` computation for Binomial, ~and can be seen with `torch.float32` when we have a combination of low probability value and high `total_count` - a test is added to capture this (since scipy only uses float64, the comparison is done using relative tolerance).~
The problem is in the code that tries to pull out the minimum values amongst the logits (written by me earlier, presumably to avoid numerical instability issues), but it is not needed.
EDIT: After a few attempts, I have been unable to reliably show that the change is more numerically stable, and have removed my previous test which fails on linux. The reason is that the issue manifests itself when `total_count` is high and `probs` is very low. However, the precision of `lgamma` when `total_count` is high is bad enough to wash away any benefits. The justification for this still stands though - (a) simplifies code (removes the unnecessary bit), (b) is no worse than the previous implementation, (c) has better continuity behavior as observed by fehiepsi in the issue above.
cc. fehiepsi, alicanb, fritzo
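A small sketch of the regime discussed above (low probability, high `total_count`); the values are chosen for illustration and this is not a test from the PR.
```python
import torch
from torch.distributions import Binomial

d = Binomial(total_count=100000, probs=torch.tensor(1e-4))
print(d.log_prob(torch.tensor(10.)))   # expected to be finite and stable
```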
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15962
Differential Revision:
D13709541
Pulled By: ezyang
fbshipit-source-id:
596c6853b6e4d5fba42336afa168a665ab6fbde2
Lu Fang [Thu, 17 Jan 2019 17:53:15 +0000 (09:53 -0800)]
update of fbcode/onnx to
fd60104394fa353e1762f44ecad1b2166e33deef (#16094)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16094
Previous import was
84a0441ae28795a928005863dc142bee81827566
Included changes:
- **[fd60104](https://github.com/onnx/onnx/commit/fd60104)**: deprecate no-spatial mode of BN (#1637) <liqunfu>
Reviewed By: BIT-silence
Differential Revision:
D13705357
fbshipit-source-id:
44dbc8bf15fced6d50048b04c2882e38f75c0e34
Derek Kim [Thu, 17 Jan 2019 17:50:55 +0000 (09:50 -0800)]
A trivial typo fixed in onnx.verify.verify (#15871)
Summary:
A trivial typo fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15871
Differential Revision:
D13709588
Pulled By: ezyang
fbshipit-source-id:
84460e53e30470bef72bc836c08fd149b4d725cf
Syed Tousif Ahmed [Thu, 17 Jan 2019 17:49:44 +0000 (09:49 -0800)]
Remove support for CUDNN 6 (#15851)
Summary: This PR aims to remove support for cuDNN 6.
Differential Revision:
D13709595
Pulled By: ezyang
fbshipit-source-id:
853624db1cf66b0534d7028654c38c2806fb4107
Edward Yang [Thu, 17 Jan 2019 17:49:03 +0000 (09:49 -0800)]
Exclude pyi from flake8 checks. (#16105)
Summary:
Idiomatic pyi files will fail with Python 2 flake8 even
though they would work with mypy. This is because pyi
files generally use Python 3 only syntax. No point
in linting them.
There are currently no pyi files checked in, this is purely
a prophylactic measure.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16105
Reviewed By: zou3519
Differential Revision:
D13709409
Pulled By: ezyang
fbshipit-source-id:
ec4a959e146f81ccb9533b04348be8dd78808421
Marat Dukhan [Thu, 17 Jan 2019 17:18:04 +0000 (09:18 -0800)]
Update cpuinfo to avoid reporting error when sysfs is not accessible (#16107)
Summary:
On some cloud-based x86 systems /sys/ is not mounted.
cpuinfo has a work-around for these systems, but it reports an error if sysfs files fail to read, and this error was confusing to some users (e.g. pytorch/cpuinfo#20). This update downgrades the error to a warning, so it is not reported with default configuration options.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16107
Differential Revision:
D13715243
Pulled By: soumith
fbshipit-source-id:
f5c4c86422343ca449487f0185f3a8865ccf3b9d
bddppq [Thu, 17 Jan 2019 17:15:14 +0000 (09:15 -0800)]
Export PyTorch erf to ONNX Erf and add Caffe2 Erf operator
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16106
Differential Revision:
D13709490
Pulled By: bddppq
fbshipit-source-id:
1b5b32261f06543371f7bd7ac9b11957a5eb4ad0
DavidWongEA [Thu, 17 Jan 2019 16:30:55 +0000 (08:30 -0800)]
Potential fix for model inference crash on Win10 (#15919) (#16092)
Summary:
Please refer to issue #15919
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16092
Differential Revision:
D13712897
Pulled By: soumith
fbshipit-source-id:
edcd1ed3504f1fa1af841a1757616382c745958f