platform/upstream/pytorch.git
5 years agoPass IValue from c10 dispatcher to caffe2 operator (#16065)
Sebastian Messmer [Fri, 18 Jan 2019 23:55:57 +0000 (15:55 -0800)]
Pass IValue from c10 dispatcher to caffe2 operator (#16065)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16065

Before, we registered the caffe2 kernel with the c10 dispatcher using plain C types.
Now, we pass in IValues, which avoids the unwrapping in between.

Reviewed By: ezyang

Differential Revision: D13689036

fbshipit-source-id: b976a2c46a5a541f6a926b3df255e8a535e32420

5 years agoMake c10 dispatcher use boxed kernel function pointers (#16051)
Sebastian Messmer [Fri, 18 Jan 2019 23:55:57 +0000 (15:55 -0800)]
Make c10 dispatcher use boxed kernel function pointers (#16051)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16051
This changes the kernels stored in the c10 dispatcher from plain C function pointers to IValue-based KernelFunction*.

Note that KernelFunction currently takes an `ArrayRef<IValue>` as arguments. A later diff will change that to take a `Stack*`.

Reviewed By: ezyang

Differential Revision: D13684518

fbshipit-source-id: 1fa54f60cec2e967b92a4a043d6e3ac1627ed991

5 years agoadd back NNPACK in PyTorch (#15924)
Thomas Viehmann [Fri, 18 Jan 2019 23:27:12 +0000 (15:27 -0800)]
add back NNPACK in PyTorch (#15924)

Summary:
This tests the waters for adding NNPACK back to PyTorch; it's a lot better than the fallback THNN versions.

In #6151, we (ezyang and soumith) removed NNPACK support from PyTorch. Of course Maratyszcza might have advice, too. (Or an opinion on the CMake changes.)

The only functional changes are to use NNPACK more aggressively on mobile and to add a .contiguous() call to match NNPACK's assumptions (I stumbled over that while using NNPACK for style transfer).
The CMake changes try to use the NNPACK we already have in git.

In terms of lines of code this is a large part of the diff of https://lernapparat.de/pytorch-jit-android/ . As far as I can tell, we don't have MKLDNN on mobile and the native THNN implementations are prohibitively expensive in terms of both CPU and memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15924

Differential Revision: D13709576

Pulled By: ezyang

fbshipit-source-id: f2e287739909451c173abf046588209a7450ca2c

5 years agoimprove performance of unique with inverse indices (#16145)
Natalia Gimelshein [Fri, 18 Jan 2019 22:53:32 +0000 (14:53 -0800)]
improve performance of unique with inverse indices (#16145)

Summary:
Partial fix for #15804; covers only the case without `dim`.
For jcjohnson's benchmarking script I'm getting the following results on V100:
Before:
```
Running with N = 10000, M = 10000
cuda (no inverse): 0.98 ms
cpu (no inverse): 0.96 ms
cuda (with inverse): 1.07 ms
cpu (with inverse): 1.76 ms

Running with N = 10000, M = 100000
cuda (no inverse): 0.76 ms
cpu (no inverse): 1.53 ms
cuda (with inverse): 1.23 ms
cpu (with inverse): 3.02 ms

Running with N = 100000, M = 100000
cuda (no inverse): 1.28 ms
cpu (no inverse): 11.22 ms
cuda (with inverse): 69.76 ms
cpu (with inverse): 20.28 ms

Running with N = 100000, M = 1000000
cuda (no inverse): 0.78 ms
cpu (no inverse): 18.78 ms
cuda (with inverse): 133.45 ms
cpu (with inverse): 34.09 ms

Running with N = 500000, M = 500000
cuda (no inverse): 1.43 ms
cpu (no inverse): 61.13 ms
cuda (with inverse): 3315.18 ms
cpu (with inverse): 104.57 ms

Running with N = 500000, M = 5000000
cuda (no inverse): 0.86 ms
cpu (no inverse): 96.44 ms
cuda (with inverse): 5209.93 ms
cpu (with inverse): 176.10 ms
```
After:
```
Running with N = 10000, M = 10000
cuda (no inverse): 1.04 ms
cpu (no inverse): 0.94 ms
cuda (with inverse): 0.64 ms
cpu (with inverse): 1.76 ms

Running with N = 10000, M = 100000
cuda (no inverse): 0.77 ms
cpu (no inverse): 1.55 ms
cuda (with inverse): 0.58 ms
cpu (with inverse): 2.79 ms

Running with N = 100000, M = 100000
cuda (no inverse): 1.30 ms
cpu (no inverse): 14.15 ms
cuda (with inverse): 1.63 ms
cpu (with inverse): 20.90 ms

Running with N = 100000, M = 1000000
cuda (no inverse): 0.82 ms
cpu (no inverse): 18.63 ms
cuda (with inverse): 0.61 ms
cpu (with inverse): 33.52 ms

Running with N = 500000, M = 500000
cuda (no inverse): 1.51 ms
cpu (no inverse): 59.81 ms
cuda (with inverse): 1.23 ms
cpu (with inverse): 110.69 ms

Running with N = 500000, M = 5000000
cuda (no inverse): 0.92 ms
cpu (no inverse): 104.26 ms
cuda (with inverse): 0.84 ms
cpu (with inverse): 187.12 ms
```
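For reference, a minimal sketch of the kind of call being timed (this is an assumption about the benchmark, not jcjohnson's actual script):
```
import torch

# Draw M integers from [0, N) and compute unique values plus inverse indices.
N, M = 100000, 1000000
x = torch.randint(0, N, (M,), device='cuda')
values, inverse = torch.unique(x, sorted=True, return_inverse=True)
assert torch.equal(values[inverse], x)
```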
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16145

Differential Revision: D13738821

Pulled By: soumith

fbshipit-source-id: 0811fb4ade47e3b466cebbc124e3f3333a986749

5 years agofix for clang-tidy (#16164)
Michael Suo [Fri, 18 Jan 2019 22:01:41 +0000 (14:01 -0800)]
fix for clang-tidy (#16164)

Summary:
It turns out that clang-tidy is bundled with Travis's standard Trusty distribution, so there's no need to install it manually.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16164

Differential Revision: D13738986

Pulled By: suo

fbshipit-source-id: d0cd76c615625b2ed7f18951289412989f15849d

5 years agoChange current device in stream context manager if necessary (#16128)
Shen Li [Fri, 18 Jan 2019 20:32:20 +0000 (12:32 -0800)]
Change current device in stream context manager if necessary (#16128)

Summary:
Fixes #16019
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16128

Differential Revision: D13721850

Pulled By: mrshenli

fbshipit-source-id: 422c6c0b97c1cd46e127e265b532cb8c74a3aac5

5 years agoFix SoftmaxOps (#16049)
Jerry Zhang [Fri, 18 Jan 2019 20:14:34 +0000 (12:14 -0800)]
Fix SoftmaxOps (#16049)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16049

We might see the pattern
```
if (scale_.numel() != N) {
   scale_.Resize(N);
   // set initial value for scale_
}

// In class:
Tensor scale_{CPU};
```
in the existing code, where `scale_` is a member variable of type `caffe2::Tensor`.
This pattern serves two purposes: if `scale_` is partially initialized with a device type but no size, the call
initializes the Tensor with the correct size; if `scale_` is already initialized with a size, it checks whether the size
matches the runtime value `N` and calls Resize if not. To rewrite this we'll do the following:
```
if (!scale_.defined() || scale_.numel() != N) {
  ReinitializeTensor(&scale_, {N}, at::dtype<float>().device(CPU));
  // set initial value for scale_
}

```
There are some variants: if `scale_` is resized to a constant size, we can call `ReinitializeTensor` unconditionally instead
```
if (scale_.numel() != 1) {
  scale_.Resize(1);
}
```
-->
```
ReinitializeTensor(&scale_, {1}, at::dtype<float>().device(CPU));
```

Normal Resize will be refactored directly into ReinitializeTensor:
```
scale_.Resize(N);
```
-->
```
ReinitializeTensor(&scale_, {N}, at::dtype<float>().device(CPU));
```

Reviewed By: dzhulgakov

Differential Revision: D13667883

fbshipit-source-id: 2c7cb61544b72765b594011b99150eb5a1b50836

5 years agorest of uses for deprecation of dims() in Tensor (#16118)
Jerry Zhang [Fri, 18 Jan 2019 19:49:36 +0000 (11:49 -0800)]
rest of uses for deprecation of dims() in Tensor (#16118)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16118

As titled.

Differential Revision: D13697211

fbshipit-source-id: 12bf6edd1794240ac748cc1b8fecb0c1e8eb9112

5 years agoRNN operators should inherit step_net device_options (#16086)
Nikita Shulga [Fri, 18 Jan 2019 19:33:40 +0000 (11:33 -0800)]
RNN operators should inherit step_net device_options (#16086)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16086

[caffe2] RNN operators should inherit step_net device_options
According to the NetDef documentation, if a network has a specific device option it applies to all of the network's operators that do not explicitly specify one.
But this does not seem to be the case for RecurrentNetwork operators.

Reviewed By: orionr

Differential Revision: D13699552

fbshipit-source-id: 14529bc9504e3b02f763e3c2429be21e46f82b68

5 years agoAdd implicit optional unwrapping (#15587)
Elias Ellison [Fri, 18 Jan 2019 19:17:34 +0000 (11:17 -0800)]
Add implicit optional unwrapping (#15587)

Summary:
Add support for type inference for optional type refinement.

If a conditional is of the form "x is None" or "x is not None", or is a boolean expression containing multiple none checks, the proper type refinements are inserted in each branch.

For example:
if optional_tensor is not None and len(optional_tensor) < 2:
# optional_tensor is a Tensor

if optional_tensor1 is not None and optional_tensor2 is not None:
# both optional_tensor1 and optional_tensor2 are Tensors
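A minimal runnable sketch of the refinement (assuming current TorchScript scripting syntax; the function itself is illustrative):
```
import torch
from typing import Optional

@torch.jit.script
def count_elems(x: Optional[torch.Tensor]) -> int:
    if x is not None:
        # x is refined from Optional[Tensor] to Tensor in this branch,
        # so Tensor methods can be used without an explicit unwrap.
        return x.numel()
    return 0

print(count_elems(torch.ones(3)), count_elems(None))
```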

TODO:

- do not run an op for unchecked unwrap optional in the interpreter

- potentially refine types to prim::None (omitted for now to simplify things & because it's not an actual use case).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15587

Differential Revision: D13733810

Pulled By: eellison

fbshipit-source-id: 57c32be9f5a09ab5542ba0144a6059b96de23d7a

5 years agoAdd defined() to caffe2::Tensor (#16125)
Jerry Zhang [Fri, 18 Jan 2019 18:59:53 +0000 (10:59 -0800)]
Add defined() to caffe2::Tensor (#16125)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16125

Add defined() method to check whether the Tensor is defined.

Reviewed By: ezyang

Differential Revision: D13719222

fbshipit-source-id: ff8efef2159ed1026bd16acaea40c768a1e20a47

5 years agoRemove ATen/Half.h and ATen/core/Half.h forwarding headers.
Edward Yang [Fri, 18 Jan 2019 18:52:18 +0000 (10:52 -0800)]
Remove ATen/Half.h and ATen/core/Half.h forwarding headers.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16115

Reviewed By: bddppq

Differential Revision: D13717049

fbshipit-source-id: fb1d690183a932a1fa1a2d235f3219520f51620a

5 years agoPort legacy any(*) to ATen
Shen Li [Fri, 18 Jan 2019 18:07:33 +0000 (10:07 -0800)]
Port legacy any(*) to ATen

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15547

Differential Revision: D13549495

Pulled By: mrshenli

fbshipit-source-id: 09a065a8ffa7d73f409759b779c7314cc87f4853

5 years agoImprove pack_sequence and pack_padded_sequence error message (#16084)
Richard Zou [Fri, 18 Jan 2019 15:56:17 +0000 (07:56 -0800)]
Improve pack_sequence and pack_padded_sequence error message (#16084)

Summary:
Mention that when enforce_sorted=True rejects the input, the user can set
enforce_sorted=False. This is a new flag that is probably hard to
discover unless one thoroughly reads the docs.
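A minimal sketch of the escape hatch the message now points to (shapes are illustrative):
```
import torch
from torch.nn.utils.rnn import pack_padded_sequence

padded = torch.zeros(3, 5, 7)        # (batch, max_len, features)
lengths = torch.tensor([2, 5, 3])    # not sorted by length
# The default enforce_sorted=True rejects this input; the improved error
# message tells the user about the flag used below.
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)
```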

Fixes #15567
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16084

Differential Revision: D13701118

Pulled By: zou3519

fbshipit-source-id: c9aeb47ae9769d28b0051bcedb8f2f51a5a5c260

5 years agoTCP init method race condition fix (#15684)
Teng Li [Fri, 18 Jan 2019 10:23:51 +0000 (02:23 -0800)]
TCP init method race condition fix (#15684)

Summary:
This PR fixes a race condition for the TCP init method, where the master rank can exit earlier than the slave ranks, so the TCP daemon thread gets shut down before the other ranks are able to access it.

This lets every rank (process) write a special key to the store to mark that it has completed (and is thus about to exit). The master rank (which hosts the server) always waits for all ranks to complete before completing itself.
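For context, a minimal sketch of the TCP init path affected here (the address, port, and environment variables are placeholders):
```
import os
import torch.distributed as dist

rank = int(os.environ.get('RANK', '0'))
world_size = int(os.environ.get('WORLD_SIZE', '1'))

# Every rank connects to the TCP store hosted by rank 0; with this fix the
# master waits for all ranks to report completion before shutting it down.
dist.init_process_group(backend='gloo',
                        init_method='tcp://127.0.0.1:23456',
                        rank=rank, world_size=world_size)
```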

This should fix: https://github.com/pytorch/pytorch/issues/15638

Tested using the repro from https://github.com/pytorch/pytorch/issues/15638 and it works fine. Also test_distributed and test_c10d should already have this coverage.

I had to make the rendezvous test in c10d use a world size of 1, since it is single-process code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15684

Differential Revision: D13570904

Pulled By: teng-li

fbshipit-source-id: 34f3bc471204bbd29320df359347ad5561c6b589

5 years agoRemove caffe2::Tensor copy constructor (#15416)
Dmytro Dzhulgakov [Fri, 18 Jan 2019 08:28:26 +0000 (00:28 -0800)]
Remove caffe2::Tensor copy constructor (#15416)

Summary:
Based on offline discussion, this should be less surprising to users of existing code. Thus caffe2::Tensor is now a move-only class (as it used to be); explicit calls to UnsafeSharedInstance() are necessary to get shared_ptr behavior.

This change also identified a few places that misused the copy constructor - those are fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/15416

Reviewed By: Yangqing

Differential Revision: D13524598

fbshipit-source-id: aea12d6dff77342606fa88ce4ddddbff266245a7

5 years agoFix RERUN_CMAKE
Zachary DeVito [Fri, 18 Jan 2019 08:01:48 +0000 (00:01 -0800)]
Fix RERUN_CMAKE

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16132

Differential Revision: D13726816

Pulled By: zdevito

fbshipit-source-id: 26ad70651b0138642ad5240670f5c452018c13a2

5 years agoCleanup includes in python_print.cpp.
Mikhail Zolotukhin [Fri, 18 Jan 2019 02:03:38 +0000 (18:03 -0800)]
Cleanup includes in python_print.cpp.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16129

Differential Revision: D13724297

Pulled By: ZolotukhinM

fbshipit-source-id: 24e140bc052c85ef40b928eb84f463d341346a51

5 years agoRefactor attributes.h (#16098)
Mikhail Zolotukhin [Fri, 18 Jan 2019 01:27:36 +0000 (17:27 -0800)]
Refactor attributes.h (#16098)

Summary:
This PR inlines `Attributes` into `Node`. It helps to clean up the code a little as everything is in one place (some of the cleanups are included in the PR).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16098

Differential Revision: D13717637

Pulled By: ZolotukhinM

fbshipit-source-id: c54ae65178a95a01354688921a9ccb1ca699f8eb

5 years agoFix export macros (#15856)
Sebastian Messmer [Thu, 17 Jan 2019 23:53:52 +0000 (15:53 -0800)]
Fix export macros (#15856)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15856

They seem to be wrong.
cc zdevito to take a look but I think this is now more correct.

It's weird this didn't cause linker errors. Probably, this functionality isn't used across library boundaries yet.

Reviewed By: dzhulgakov

Differential Revision: D13605257

fbshipit-source-id: 7077ca9027c3ac79a4847ec15ead7ddb28696445

5 years agoRemove some dependencies from ivalue.h to ATen (#15855)
Sebastian Messmer [Thu, 17 Jan 2019 23:53:52 +0000 (15:53 -0800)]
Remove some dependencies from ivalue.h to ATen (#15855)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15855

This is preparation work for moving IValue to c10.

Reviewed By: ezyang

Differential Revision: D13605259

fbshipit-source-id: cc545f582ab8607bb02aaf71273cb2710200b295

5 years agoCode style cleanup (#15854)
Sebastian Messmer [Thu, 17 Jan 2019 23:53:52 +0000 (15:53 -0800)]
Code style cleanup (#15854)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15854

- Remove unnecessary `inline` keyword
- Add a TODO stating the intention for Blob::ShareExternal()

Reviewed By: dzhulgakov

Differential Revision: D13605258

fbshipit-source-id: c0bc85c74c4ca4b3811d42ac7f866182e159d840

5 years agoUse intrusive_ptr for Blob in IValue (#16052)
Sebastian Messmer [Thu, 17 Jan 2019 23:47:16 +0000 (15:47 -0800)]
Use intrusive_ptr for Blob in IValue (#16052)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16052

We need IValue to take/return Blob as an intrusive_ptr because we want to pass it around and Blob has disabled copying.
This is needed in a diff on top.

Reviewed By: ezyang

Differential Revision: D13684761

fbshipit-source-id: 7cb3d7e9fec39a2bc9f063d4d30404e6d7016eb2

5 years agoMove c10 dispatcher back to ATen/core (#16050)
Sebastian Messmer [Thu, 17 Jan 2019 23:47:16 +0000 (15:47 -0800)]
Move c10 dispatcher back to ATen/core (#16050)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16050

The c10 dispatcher will (soon) depend on IValue and IValue can't be moved to c10 yet because it depends on at::Tensor, which depends on legacy Type dispatch and we don't want the legacy dispatch in c10.

So instead, we move the c10 dispatcher back to ATen/core until we can actually move at::Tensor to c10.

Reviewed By: ezyang

Differential Revision: D13684517

fbshipit-source-id: 1125f4254223907c52f96ff73034f6d4ae9fd0a7

5 years agoMoving cuda-convnet2 to the internal fb dir to satisfy internal dependencies. (#16104)
Chris Gottbrath [Thu, 17 Jan 2019 22:55:49 +0000 (14:55 -0800)]
Moving cuda-convnet2 to the internal fb dir to satisfy internal dependencies. (#16104)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16104

PyTorch PR 15784 removed cuda-convnet from the contrib directory. This broke
some internal-only fb dependencies. Moving this to the internal area.

Reviewed By: ezyang

Differential Revision: D13709112

fbshipit-source-id: 2d7811545da67489869b59c350a29817eff693cf

5 years agofurther wildcard cleanups (#16041)
Michael Suo [Thu, 17 Jan 2019 22:38:42 +0000 (14:38 -0800)]
further wildcard cleanups (#16041)

Summary:
Some cleanup to wildcard handling, including one bugfix: previously, we were not considering writes to the wildcard set as part of the potential write set for nodes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16041

Differential Revision: D13705738

Pulled By: suo

fbshipit-source-id: acb8ccbaa70fe47445577ddf24a69f84630de411

5 years agoRefactor _jit_internal (#16058)
David Riazati [Thu, 17 Jan 2019 21:39:07 +0000 (13:39 -0800)]
Refactor _jit_internal (#16058)

Summary:
Use qualified names in `jit/__init__.py` to avoid polluting that namespace
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16058

Differential Revision: D13718745

Pulled By: driazati

fbshipit-source-id: 19d150569c8374541250a961f24f70c3f523de03

5 years agoInclude all Caffe2 headers in Python installations (#16124)
Jesse Hellemn [Thu, 17 Jan 2019 21:37:04 +0000 (13:37 -0800)]
Include all Caffe2 headers in Python installations (#16124)

Summary:
Confirmed on a local run that all the additional headers are present. This isn't caught by any existing tests, though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16124

Differential Revision: D13720773

Pulled By: pjh5

fbshipit-source-id: 22a42639f5649cac555ecc5a8b6760a8cbfcf01f

5 years agoAdd comment to explain rnn bias vectors (#15843)
Aaron Jaech [Thu, 17 Jan 2019 21:13:07 +0000 (13:13 -0800)]
Add comment to explain rnn bias vectors (#15843)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15843

RNN/LSTMs only need one bias vector, but our implementation uses two to be compatible with CuDNN. This diff adds a comment to explain this.

Reviewed By: ezyang

Differential Revision: D13602365

fbshipit-source-id: eef5bd9383d9f241dc0ef0472f753b4a44cc19b5

5 years agoAdd @yf225 to cpp codeowner
Will Feng [Thu, 17 Jan 2019 20:18:15 +0000 (12:18 -0800)]
Add @yf225 to cpp codeowner

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16120

Differential Revision: D13718880

Pulled By: yf225

fbshipit-source-id: 1c0a41ffba71855a3ad88b8d263ba2bd5076351d

5 years agoUpdate FP16 to latest master (#14498)
Yangqing Jia [Thu, 17 Jan 2019 20:15:07 +0000 (12:15 -0800)]
Update FP16 to latest master (#14498)

Summary:
TSIA - fp16 cmake had a bug that is fixed in https://github.com/Maratyszcza/FP16/pull/9 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14498

Differential Revision: D13240829

Pulled By: Yangqing

fbshipit-source-id: 724745750efe4f1b49d29ee07380c36997579915

5 years agoCleanup gumbel_softmax (#13339)
Egil Martinsson [Thu, 17 Jan 2019 20:14:39 +0000 (12:14 -0800)]
Cleanup gumbel_softmax (#13339)

Summary:
Fixes #12643; an amendment to #3341.

- Allow multidimensional input ~~(but apply softmax over `dim=-1`)~~ with `dim` argument
- Cleaner: fewer lines of code
- Faster (1.32x speedup vs original, 2x speedup vs using `torch.Distributions`)
- Small fixes in docstring
- Remove some references in the docstring. Was the linked (excellent) ipynb the first to do the straight-through trick? Instead, I propose referencing the two papers best known for it.
- Add a DeprecationWarning for `eps`. It's not needed anymore.
- The initial commit keeps some code alternatives commented out to exercise CI

- As discussed when `gumbel_softmax` was added (#3341), it was merged into `torch.nn.functional` before all the work on `Distributions` and `Pyro`, and there will probably be multiple other best practices for this in the future.
I've tested building with the `Distributions` API, but it was too slow; see below.

I therefore propose not using `Distributions` to keep it fast and simple, but adding a comment in the docstring that `gumbel_softmax` may be deprecated in the future.

```
dist = torch.distributions.RelaxedOneHotCategorical(temperature=tau, logits=logits, validate_args=False)
y_soft = dist.rsample()
```

Pros:
* Built using tricks like `logsumexp` etc
* Explicitly uses `torch.distributions.utils._finfo` to avoid overflow (old implementation had an `eps` flag)
* Maintained for this exact purpose.

Cons:
* Very slow. Construction of the distribution adds overhead; see timings below. May be solved in the future with speedups of `TransformedDistribution` and `Distribution`.
* Assumes which `dim` to apply softmax over.

```
    y_soft = logits.new(logits.shape)
    y_soft = (logits - y_soft.exponential_().log()) / tau  # Gumbel noise
    y_soft = y_soft.softmax(dim)  # Gumbel softmax noise
```
Pros:
* Faster

```
    import time
    import torch
    # assuming the torch.nn.functional implementation being benchmarked
    from torch.nn.functional import gumbel_softmax

    start = time.time()
    num_draws = 1000000
    logits = torch.randn(1, 3)
    counts = torch.zeros_like(logits)

    for draw in range(num_draws):
        y_draw = gumbel_softmax(logits, hard=True)
        counts = counts + y_draw
    end = time.time()
    print(end - start)

>> 12.995795965194702

>> 7.658372640609741

>> 20.3382670879364
```

Decide on which path to choose. I'll commit changes to the unit tests in a while to show that it passes both the old and the new tests. I'll also remove the commented code about `RelaxedOneHotCategorical`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13339

Differential Revision: D13092434

Pulled By: ezyang

fbshipit-source-id: 4c21788df336f4e9c2ac289022e395b261227b4b

5 years agoAdd matches_jit_signature attribute to native_functions.yaml (#16040)
Christian Puhrsch [Thu, 17 Jan 2019 20:07:04 +0000 (12:07 -0800)]
Add matches_jit_signature attribute to native_functions.yaml (#16040)

Summary:
If "matches_jit_signature" is set to True for a particular function, we will assume that the func syntax follows the JIT signature syntax. This is a temporary attribute and doesn't need to be set by developers outside the core team. It serves as a means of tracking an ongoing schema unification with the goal of aligning func syntax with other components of PyTorch in order to reduce overall complexity and match coverage of different function descriptions.

Followup PRs might be about removing _out from native_functions.yaml and using Tensor annotations instead, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16040

Reviewed By: ezyang

Differential Revision: D13703176

Pulled By: cpuhrsch

fbshipit-source-id: ce248e1823a6f18efa95502f9f3eebf023b4a46c

5 years agoadd if in register_buffer like register_parameters (#16110)
FrankHui [Thu, 17 Jan 2019 19:31:31 +0000 (11:31 -0800)]
add if in register_buffer like register_parameters (#16110)

Summary:
without this "if", code below will throw error " Linear' object has no attribute '_buffers' "
And with this if, error would be "cannot assign buffer before Module.\_\_init\_\_() call", which I think it's more accurate, just like register_parameter.
```
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn import functional as F
from torch.nn import Module
class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):

        self.in_features = in_features
        self.out_features = out_features
        self.register_buffer('test', torch.Tensor(out_features, in_features))
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)

        super(Linear, self).__init__()

        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

linear = Linear(3,4)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16110

Differential Revision: D13715839

Pulled By: soumith

fbshipit-source-id: c300eff0a8655aade448354cf489a592f7db722a

5 years agoRevert D13709409: [pytorch][PR] Exclude pyi from flake8 checks.
Edward Yang [Thu, 17 Jan 2019 19:15:39 +0000 (11:15 -0800)]
Revert D13709409: [pytorch][PR] Exclude pyi from flake8 checks.

Differential Revision:
D13709409

Original commit changeset: ec4a959e146f

fbshipit-source-id: feabed5719a0bfdfe7979074b7e1ba9756c4ba25

5 years agorespect grad guard for torch.jit._fork and torch.jit._wait (#16101)
Guoqiang Jerry Chen [Thu, 17 Jan 2019 18:54:43 +0000 (10:54 -0800)]
respect grad guard for torch.jit._fork and torch.jit._wait (#16101)

Summary:
respect grad guard for torch.jit._fork and torch.jit._wait.

Verified that the test failed without the fix, and pass with the fix.

Ideally I would like to enable and disable grad inside the forked function.
That doesn't seem to be supported at the moment. This code handles that
case as well.
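A minimal sketch of the behaviour this enables (my reading of the change, using the private `torch.jit._fork`/`_wait` helpers):
```
import torch

@torch.jit.script
def square(x):
    return x * x

x = torch.ones(2, requires_grad=True)
with torch.no_grad():
    fut = torch.jit._fork(square, x)
    y = torch.jit._wait(fut)

# The forked task now respects the surrounding no_grad guard,
# so the result is detached from the autograd graph.
print(y.requires_grad)  # expected: False
```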
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16101

Differential Revision: D13708374

Pulled By: gqchen

fbshipit-source-id: 0533f080c4d0253fb4c61d2a0d3cc22de5721a09

5 years agoRevert batched pdist, improve existing kernel, add test (#15901)
Gregory Chanan [Thu, 17 Jan 2019 18:12:47 +0000 (10:12 -0800)]
Revert batched pdist, improve existing kernel, add test (#15901)

Summary:
1) Reverts https://github.com/pytorch/pytorch/pull/12302 which added support for batched pdist. Except I kept the (non-batched) test improvements that came with that PR, because they are nice to have.  Motivation: https://github.com/pytorch/pytorch/issues/15511
2) For the non-batched pdist, improved the existing kernel by forcing fp64 math and properly checking cuda launch errors
3) Added a 'large tensor' test that at least on my machine, fails on the batch pdist implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15901

Reviewed By: ezyang

Differential Revision: D13616730

Pulled By: gchanan

fbshipit-source-id: 620d3f9b9acd492dc131bad9d2ff618d69fc2954

5 years agoFix trivial typos in torch.cuda._utils (#16026)
Derek Kim [Thu, 17 Jan 2019 18:07:46 +0000 (10:07 -0800)]
Fix trivial typos in torch.cuda._utils (#16026)

Summary:
Trivial typo fixes.

Maybe the indefinite article "an" is needed before each "specified index" but I'm not perfectly sure.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16026

Differential Revision: D13709499

Pulled By: ezyang

fbshipit-source-id: 698b000bb8aa063afd81db6e67046456a439b2ce

5 years agoUnify the shape notation for all of the pytorch modules (#15741)
Sasha Rush [Thu, 17 Jan 2019 18:04:51 +0000 (10:04 -0800)]
Unify the shape notation for all of the pytorch modules (#15741)

Summary:
PR to update the shape notation for all of the torch.nn modules to take a unified form. The goal is to make these definitions machine-readable and checkable by unifying the style across all of the different modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15741

Differential Revision: D13709601

Pulled By: ezyang

fbshipit-source-id: fb89a03903fdf0cd0dcf76f3e469b8582b2f3634

5 years agoFix numerical stability in binomial.log_prob (#15962)
Neeraj Pradhan [Thu, 17 Jan 2019 17:59:58 +0000 (09:59 -0800)]
Fix numerical stability in binomial.log_prob (#15962)

Summary:
This issue was discovered by fehiepsi in https://github.com/uber/pyro/issues/1706 with the `log_prob` computation for Binomial, ~and can be seen with `torch.float32` when we have a combination of low probability value and high `total_count` - a test is added to capture this (since scipy only uses float64, the comparison is done using relative tolerance).~

The problem is in the code that tries to pull out the minimum values amongst the logits (written by me earlier, presumably to avoid numerical instability issues), but it is not needed.

EDIT: After a few attempts, I have been unable to reliably show that the change is more numerically stable, and have removed my previous test which fails on linux. The reason is that the issue manifests itself when `total_count` is high and `probs` is very low. However, the precision of `lgamma` when `total_count` is high is bad enough to wash away any benefits. The justification for this still stands though - (a) simplifies code (removes the unnecessary bit), (b) is no worse than the previous implementation, (c) has better continuity behavior as observed by fehiepsi in the issue above.
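For reference, a minimal sketch of the regime in question (high `total_count` with a tiny probability; the numbers are illustrative):
```
import torch
from torch.distributions import Binomial

# High counts combined with very small probs in float32 is where the old
# min-logit manipulation could lose precision.
d = Binomial(total_count=8000, probs=torch.tensor([1e-4]))
lp = d.log_prob(torch.tensor([0.0]))
```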

cc. fehiepsi, alicanb, fritzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15962

Differential Revision: D13709541

Pulled By: ezyang

fbshipit-source-id: 596c6853b6e4d5fba42336afa168a665ab6fbde2

5 years agoAutomatic update of fbcode/onnx to fd60104394fa353e1762f44ecad1b2166e33deef (#16094)
Lu Fang [Thu, 17 Jan 2019 17:53:15 +0000 (09:53 -0800)]
Automatic update of fbcode/onnx to fd60104394fa353e1762f44ecad1b2166e33deef (#16094)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16094

Previous import was 84a0441ae28795a928005863dc142bee81827566

Included changes:
- **[fd60104](https://github.com/onnx/onnx/commit/fd60104)**: deprecate no-spatial mode of BN (#1637) <liqunfu>

Reviewed By: BIT-silence

Differential Revision: D13705357

fbshipit-source-id: 44dbc8bf15fced6d50048b04c2882e38f75c0e34

5 years agoA trivial typo fixed in onnx.verify.verify (#15871)
Derek Kim [Thu, 17 Jan 2019 17:50:55 +0000 (09:50 -0800)]
A trivial typo fixed in onnx.verify.verify (#15871)

Summary:
A trivial typo fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15871

Differential Revision: D13709588

Pulled By: ezyang

fbshipit-source-id: 84460e53e30470bef72bc836c08fd149b4d725cf

5 years agoRemove support for CUDNN 6 (#15851)
Syed Tousif Ahmed [Thu, 17 Jan 2019 17:49:44 +0000 (09:49 -0800)]
Remove support for CUDNN 6 (#15851)

Summary: This PR aims to remove support for cuDNN 6.

Differential Revision: D13709595

Pulled By: ezyang

fbshipit-source-id: 853624db1cf66b0534d7028654c38c2806fb4107

5 years agoExclude pyi from flake8 checks. (#16105)
Edward Yang [Thu, 17 Jan 2019 17:49:03 +0000 (09:49 -0800)]
Exclude pyi from flake8 checks. (#16105)

Summary:
Idiomatic pyi files will fail with Python 2 flake8 even
though they would work with mypy.  This is because pyi
files generally use Python 3 only syntax.  No point
in linting them.

There are currently no pyi files checked in, this is purely
a prophylactic measure.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16105

Reviewed By: zou3519

Differential Revision: D13709409

Pulled By: ezyang

fbshipit-source-id: ec4a959e146f81ccb9533b04348be8dd78808421

5 years agoUpdate cpuinfo to avoid reporting error when sysfs is not accessible (#16107)
Marat Dukhan [Thu, 17 Jan 2019 17:18:04 +0000 (09:18 -0800)]
Update cpuinfo to avoid reporting error when sysfs is not accessible (#16107)

Summary:
On some cloud-based x86 systems /sys/ is not mounted.
cpuinfo has a work-around for these systems, but it reports an error if sysfs files fail to read, and this error was confusing to some users (e.g. pytorch/cpuinfo#20). This update downgrades the error to a warning, so it is not reported with default configuration options.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16107

Differential Revision: D13715243

Pulled By: soumith

fbshipit-source-id: f5c4c86422343ca449487f0185f3a8865ccf3b9d

5 years agoExport PyTorch erf to ONNX Erf and add Caffe2 Erf operator
bddppq [Thu, 17 Jan 2019 17:15:14 +0000 (09:15 -0800)]
Export PyTorch erf to ONNX Erf and add Caffe2 Erf operator

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16106

Differential Revision: D13709490

Pulled By: bddppq

fbshipit-source-id: 1b5b32261f06543371f7bd7ac9b11957a5eb4ad0

5 years agoPotential fix for model inference crash on Win10 (#15919) (#16092)
DavidWongEA [Thu, 17 Jan 2019 16:30:55 +0000 (08:30 -0800)]
Potential fix for model inference crash on Win10 (#15919) (#16092)

Summary:
Please refer to issue #15919
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16092

Differential Revision: D13712897

Pulled By: soumith

fbshipit-source-id: edcd1ed3504f1fa1af841a1757616382c745958f

5 years agoMove all Stream and Event Python implementation to C++ (#15937)
Shen Li [Thu, 17 Jan 2019 15:22:42 +0000 (07:22 -0800)]
Move all Stream and Event Python implementation to C++ (#15937)

Summary:
1. Added `torch/csrc/cuda/Event.h` and `torch/csrc/cuda/Event.cpp` to bind Python Event class to C++ implementation.
2. Move all CUDA runtime invocations from `torch/cuda/streams.py` to C++
3. Added tests to cover the Stream and Event APIs. ~(event IPC handle tests are introduced in #15974)~ A minimal usage sketch of the Python-side API follows below.
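A minimal sketch of the Python-facing API covered by the new bindings (device availability, sizes, and timings are illustrative):
```
import torch

if torch.cuda.is_available():
    s = torch.cuda.Stream()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    with torch.cuda.stream(s):
        start.record()
        a = torch.randn(1024, 1024, device='cuda')
        b = a @ a
        end.record()
    end.synchronize()
    print(start.elapsed_time(end))  # milliseconds
```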
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15937

Differential Revision: D13649001

Pulled By: mrshenli

fbshipit-source-id: 84ca58f35f6ba679a4ba33150ceba678d760d240

5 years agoA trivial typo fix in caffe2.python (#15907)
Derek Kim [Thu, 17 Jan 2019 12:55:03 +0000 (04:55 -0800)]
A trivial typo fix in caffe2.python (#15907)

Summary:
blobl -> globl
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15907

Differential Revision: D13709586

Pulled By: ezyang

fbshipit-source-id: 9d3ad76b7fea76c7934407d3c164417b4157e234

5 years agoAdd count_include_pad for avg_pool on CuDNN (#16100)
Xiaomeng Yang [Thu, 17 Jan 2019 10:07:04 +0000 (02:07 -0800)]
Add count_include_pad for avg_pool on CuDNN (#16100)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16100

Add count_include_pad for avg_pool on CuDNN

Reviewed By: houseroad

Differential Revision: D13707959

fbshipit-source-id: 261f5d116066fef75cf9a5787dfbc5d12b5b9f9b

5 years agoEnhance the documentation for DistributedDataParallel from torch.nn.parallel.distribu...
Derek Kim [Thu, 17 Jan 2019 08:59:11 +0000 (00:59 -0800)]
Enhance the documentation for DistributedDataParallel from torch.nn.parallel.distributed (#16010)

Summary:
- a typo fixed
- made the docs consistent with #5108

And maybe one more change is needed. According to the current docs
> The batch size should be larger than the number of GPUs used **locally**.

But shouldn't the batch size be larger than the number of GPUs used **either locally or remotely**? Sadly, I couldn't experiment with this on my single GPU.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16010

Differential Revision: D13709516

Pulled By: ezyang

fbshipit-source-id: e44459a602a8a834fd365fe46e4063e9e045d5ce

5 years agofix a little error in comments (#15922)
QingfengLi [Thu, 17 Jan 2019 08:22:28 +0000 (00:22 -0800)]
fix a little error in comments (#15922)

Summary:
There is a little error in the comment: for a dependency "A->B", task B must start after task A finishes, not after task "B".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15922

Differential Revision: D13709579

Pulled By: ezyang

fbshipit-source-id: 735afe83f4532b7c7456da3e96209b3e07071f37

5 years agoCorresponding data type for BYTE (#15627)
fulltopic [Thu, 17 Jan 2019 08:15:00 +0000 (00:15 -0800)]
Corresponding data type for BYTE (#15627)

Summary:
TensorProto.DataType in caffe2/proto/caffe2.proto has BYTE = 3 defined, while there is no corresponding TypeMeta defined in DataTypeToTypeMeta in caffe2/core/types.cc. This issue broke the C++ tutorial of MNIST + LMDB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15627

Differential Revision: D13709602

Pulled By: ezyang

fbshipit-source-id: d4826d0f9b3975e6a8478d4bad1abbbedcaea197

5 years agoFix possible importing errors in build_libtorch.py (#15471)
Derek Kim [Thu, 17 Jan 2019 07:52:37 +0000 (23:52 -0800)]
Fix possible importing errors in build_libtorch.py (#15471)

Summary:
1. I fixed the importing process, which had some problems
    -  **I think `setup_helpers` should not be imported as the top level module. It can lead to many future errors. For example, what if `setup_helpers` imports another module from the upper level?** So we need to change it.
    - The code is not consistent with other modules in the `tools` package. For example, other
    modules in the package import `from tools.setuptools...`, not `from setuptools...`.
    - **It should be possible to run it with the `python -m tools.build_libtorch` command** because this module is part of the tools package. Currently you cannot do that, and I think that's simply wrong.

~~2. I Added platform specific warning messages.
    - I constantly forgot that I needed to define some environment variables in advance specific to my platform to build libtorch, especially when I'm working at a non pytorch root directory. So I thought adding warnings for common options would be helpful .~~

~~3. Made the build output path configurable. And a few other changes.~~

orionr  ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15471

Differential Revision: D13709607

Pulled By: ezyang

fbshipit-source-id: 950d5727aa09f857d973538c50b1ab169d88da38

5 years agoRemove redundant includes from ir.{h,cpp}.
Mikhail Zolotukhin [Thu, 17 Jan 2019 07:38:13 +0000 (23:38 -0800)]
Remove redundant includes from ir.{h,cpp}.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16080

Differential Revision: D13701796

Pulled By: ZolotukhinM

fbshipit-source-id: 7efae3a0fd969376e4b438a8d8fb96adb33dc55c

5 years agoGenerate PDB files for better debugging on Windows (#16008)
peter [Thu, 17 Jan 2019 07:31:57 +0000 (23:31 -0800)]
Generate PDB files for better debugging on Windows (#16008)

Summary:
1. Unify `build_pytorch_libs.bat`, `setup.py` and `torch/CMakeLists.txt` on the debugging flags with the `CMAKE_BUILD_TYPE` being `Debug`, `Release` and `RelWithDebInfo`.
2. Install PDBs through CMake if they are generated.

Reference:
1. CMake PDB install: https://gitlab.kitware.com/cmake/cmake/issues/18393#note_459199
2. About debugging flags https://stackoverflow.com/a/4662345
3. MSDN page about /DEBUG flag: https://docs.microsoft.com/en-us/cpp/build/reference/debug-generate-debug-info?view=vs-2017
4. MSDN page about /Z{i/I/7}: https://docs.microsoft.com/en-us/cpp/build/reference/z7-zi-zi-debug-information-format?view=vs-2017

Work to do:
- [x] Test the changes work in Release config through this PR
- [ ] <del> Test debug build through https://github.com/pytorch/pytorch/pull/16009 </del>
- [x] Test release build with debugging symbols through #16013

Difficulties:
- [x] Replace /Zi flags with /Z7 (which will be added if DEBUG or RelWithDebInfo is used), as it is not supported by sccache
- [x] Resolve `LINK : fatal error LNK1210: exceeded internal ILK size limit; link with /INCREMENTAL:NO` in the debug build
- [ ] DEBUG build blocked by a MSVC bug. In order to resolve it, we'll need to update the MSVC in CI: https://developercommunity.visualstudio.com/content/problem/225957/fatal-error-lnk1318-unexpected-pdb-error-ok-0.html
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16008

Differential Revision: D13709527

Pulled By: ezyang

fbshipit-source-id: e8365bc75d9ec64099093f7001f83d99a06b196b

5 years agoUpdate int8_simd.h (#13859)
JerryShih [Thu, 17 Jan 2019 07:17:03 +0000 (23:17 -0800)]
Update int8_simd.h (#13859)

Summary:
If we use clang with SSE4 support, we get a function redefinition
error between [1] and [2]. This patch adds some checks to fix the
problem.

I just turned on USE_NATIVE_ARCH with clang, and then I hit the redefinition error.

[1]
caffe2/operators/quantized/int8_simd.h
[2]
third_party/gemmlowp/gemmlowp/fixedpoint/fixedpoint_sse.h
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13859

Differential Revision: D13095694

Pulled By: ezyang

fbshipit-source-id: c65166e4d5a04bb54e2b82c52740af00116ccb0d

5 years agoAdd IS_PYTORCH_CI flag for testing (#16006)
SsnL [Thu, 17 Jan 2019 06:56:56 +0000 (22:56 -0800)]
Add IS_PYTORCH_CI flag for testing (#16006)

Summary:
Use case:
Some data loader tests rely on `psutil` (a third party lib). So they are guarded by `skipIf`. But we want to always test them on CI envs. With `IS_PYTORCH_CI`, we can raise if `psutil` is not found.
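A minimal sketch of the intended pattern (the detection logic and names here are assumptions for illustration, not the exact helpers added by this PR):
```
import os
import unittest

IS_PYTORCH_CI = os.environ.get('IN_CI') == '1'   # hypothetical detection

try:
    import psutil
    HAS_PSUTIL = True
except ImportError:
    HAS_PSUTIL = False
    if IS_PYTORCH_CI:
        raise RuntimeError('psutil is expected to be installed on PyTorch CI')

@unittest.skipIf(not HAS_PSUTIL, 'psutil not found')
class DataLoaderPsutilTest(unittest.TestCase):
    def test_noop(self):
        pass
```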
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16006

Reviewed By: ezyang

Differential Revision: D13673957

Pulled By: yf225

fbshipit-source-id: c63a7138093f45333c0b371fed0bcc88b67f2a22

5 years agoMoving torch.norm to ATen using TensorIterator (#15414)
jiej [Thu, 17 Jan 2019 06:12:13 +0000 (22:12 -0800)]
Moving torch.norm to ATen using TensorIterator (#15414)

Summary:
Add support for torch.norm:
i. multiple dimensions for dim
ii. a dtype argument that specifies the math/output tensor type
A short usage sketch of both follows.
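A minimal sketch of both additions (assuming the torch.norm keyword API):
```
import torch

x = torch.randn(4, 5, 6)
# (i) reduce over multiple dimensions at once
frob = torch.norm(x, p='fro', dim=(1, 2))             # shape: (4,)
# (ii) request the math/output dtype explicitly
l2 = torch.norm(x, p=2, dim=1, dtype=torch.float64)   # float64 result
```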
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15414

Differential Revision: D13702022

Pulled By: ezyang

fbshipit-source-id: da2676f2b6aff988889b1539d0de8ecd4946823a

5 years agoResolve errors in perfkernel for Windows (#16031)
Tongliang Liao [Thu, 17 Jan 2019 05:38:13 +0000 (21:38 -0800)]
Resolve errors in perfkernel for Windows (#16031)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16031

1. MSVC only has _mm_prefetch(const char*, int). Fixed in both python codegen and C++ files.
2. uint32_t in "cvtsh_ss_bugfix.h" requires "#include <cstdint>".
3. Some files use gflags headers. Add dependency via c10.
4. Isolate arch flags with interface library and private compile options.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15753

Reviewed By: dskhudia

Differential Revision: D13636233

Pulled By: jspark1105

fbshipit-source-id: cdcbd4240e07b749554a2a5676c11af88f23c31d

5 years agoadd a constexpr in c10::Half (#16091)
Soumith Chintala [Thu, 17 Jan 2019 05:09:09 +0000 (21:09 -0800)]
add a constexpr in c10::Half (#16091)

Summary:
Debug builds generate references which are not otherwise resolved, as recognized by dlibenzi.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16091

Differential Revision: D13703584

Pulled By: soumith

fbshipit-source-id: 6ac5666d2c6b1520e083f6eac9c535a1609d9c6b

5 years agoTensor reinitialization codemod - 3/5 (#15912)
Jerry Zhang [Thu, 17 Jan 2019 03:46:19 +0000 (19:46 -0800)]
Tensor reinitialization codemod - 3/5 (#15912)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15912

Codemod generated with clangr shard mode, 25 files per diff.
To eliminate partially initialized Tensors, we split the initialization of local Tensor variables into two steps: first declare an uninitialized Tensor, then
call `ReinitializeTensor` to initialize it.
motivation: https://github.com/pytorch/pytorch/pull/12407

Reviewed By: dzhulgakov

Differential Revision: D13586734

fbshipit-source-id: 8485d2c51225343961351c7a2e8f95055534f9a9

5 years agoBound shape inference for c2 (#16081)
Yinghai Lu [Thu, 17 Jan 2019 02:58:08 +0000 (18:58 -0800)]
Bound shape inference for c2 (#16081)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16081

A simple version of bound shape inference, conditioned on batch size. In addition to doing normal shape inference, it will change the batch size (1st dim of the shape) of the inputs as well as of batch-size-modulating ops such as `SparseLengthsSum`. Support for more ops, such as `SparseToDense`, is probably needed. We can build on this.

Reviewed By: jackm321, rdzhabarov

Differential Revision: D13661968

fbshipit-source-id: 6a724a647e109757c26e3e26e15a49725ecc75cc

5 years agoFix max_pool_grad test (#16088)
Xiaomeng Yang [Wed, 16 Jan 2019 23:22:45 +0000 (15:22 -0800)]
Fix max_pool_grad test (#16088)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16088

Fix max_pool_grad test

Reviewed By: houseroad

Differential Revision: D13700917

fbshipit-source-id: f4f942ee920bcd943c38a8f8a6aafd1d13c4515f

5 years agoRevert D12812029: [pt1][tensor] Remove deprecated caffe2::Tensor APIs
Edward Yang [Wed, 16 Jan 2019 22:38:37 +0000 (14:38 -0800)]
Revert D12812029: [pt1][tensor] Remove deprecated caffe2::Tensor APIs

Differential Revision:
D12812029

Original commit changeset: ea0c3dd882be

fbshipit-source-id: d5bb4cbb1d7c9be08789599a7db0fb3313f3dbc4

5 years agoPort the backend of FractionalMaxPool3d from TH to ATen (#15575)
Chandler Zuo [Wed, 16 Jan 2019 22:01:39 +0000 (14:01 -0800)]
Port the backend of FractionalMaxPool3d from TH to ATen (#15575)

Summary:
1. Port the FractionalMaxPool3d implementation from THNN/THCUNN to ATen.
2. Expose this function in the Python nn module (a small usage sketch follows).
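A minimal usage sketch, assuming the nn.FractionalMaxPool3d module exposed by this port (shapes are illustrative):
```
import torch
import torch.nn as nn

pool = nn.FractionalMaxPool3d(kernel_size=3, output_ratio=(0.5, 0.5, 0.5))
x = torch.randn(1, 4, 16, 16, 16)    # (N, C, T, H, W)
y = pool(x)                          # roughly (1, 4, 8, 8, 8)
```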
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15575

Differential Revision: D13612848

Pulled By: chandlerzuo

fbshipit-source-id: 5f474b39005efa7788e984e8a805456dcdc43f6c

5 years agoupdate pytorch docker to cuda 10
Natalia Gimelshein [Wed, 16 Jan 2019 21:13:33 +0000 (13:13 -0800)]
update pytorch docker to cuda 10

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16082

Differential Revision: D13699081

Pulled By: soumith

fbshipit-source-id: 86942e2c5595931384cf87dd1ef75936a4d74a57

5 years agomultinomial: fix detection of zero probability (#16075)
Thomas Viehmann [Wed, 16 Jan 2019 20:15:12 +0000 (12:15 -0800)]
multinomial: fix detection of zero probability (#16075)

Summary:
The cumsum over the probabilities can fail to be monotonically
non-decreasing, so it is hard to detect zero-probability
classes using just the cumsum.
This changes the binary-search postprocessing to use the
(non-cumulated) distribution instead.
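A minimal sketch of the failure mode being fixed (not the original reproducer; requires a CUDA device):
```
import torch

probs = torch.tensor([0.0, 1.0, 0.0], device='cuda')
samples = torch.multinomial(probs, num_samples=10000, replacement=True)
# Zero-probability classes must never be drawn.
assert bool((samples == 1).all())
```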

Thank you, jcjohnson, for the bug report with
reproducing case.

Fixes: #13867
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16075

Differential Revision: D13695565

Pulled By: soumith

fbshipit-source-id: 02c4d6f868f0050c1ae7d333f4317c5610e49cd9

5 years agoEnable single graph sharing between multiple threads for onnxifiop (#16047)
Kimish Patel [Wed, 16 Jan 2019 19:46:04 +0000 (11:46 -0800)]
Enable single graph sharing between multiple threads for onnxifiop (#16047)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16047

Implements a single thread-safe map enabling sharing of the generated graph between
different ops.
Added model_id to every onnxified op to help create a unique id in the map.
Some formatting fixes.

Reviewed By: yinghai

Differential Revision: D13663927

fbshipit-source-id: 27417e8fe752fdd48abb6a87966cd76d592e1206

5 years agoFix error message formatting in AT_CHECK/AT_ERROR (#16067)
vishwakftw [Wed, 16 Jan 2019 19:12:47 +0000 (11:12 -0800)]
Fix error message formatting in AT_CHECK/AT_ERROR (#16067)

Summary:
Changelog:

- Fix formatting for error messages in prelu, EmbeddingBag, RNN

Fixes https://github.com/pytorch/pytorch/issues/16043
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16067

Differential Revision: D13693286

Pulled By: soumith

fbshipit-source-id: b0760d13c9a45e82dababfc44dabe648e5345ca3

5 years agoCorrect sphinx-note in symeig (wrong indentation)
Rasmus Diederichsen [Wed, 16 Jan 2019 18:22:08 +0000 (10:22 -0800)]
Correct sphinx-note in symeig (wrong indentation)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16073

Differential Revision: D13692874

Pulled By: soumith

fbshipit-source-id: ea2a98e88679d382f9a2edab199e9ba7c8ce2213

5 years agoFix the caffe2_gpu linkage with torch on Windows (#16071)
peter [Wed, 16 Jan 2019 17:06:22 +0000 (09:06 -0800)]
Fix the caffe2_gpu linkage with torch on Windows (#16071)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/15992.
Inspired by https://docs.microsoft.com/en-us/cpp/build/reference/optimization-best-practices?view=vs-2017. But this PR needs to be tested.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16071

Differential Revision: D13693006

Pulled By: soumith

fbshipit-source-id: e83e9ae2591fa4da01d2b1b593558dba3bdc3cf7

5 years agoPort legacy all(*) to ATen (#15540)
Shen Li [Wed, 16 Jan 2019 17:02:44 +0000 (09:02 -0800)]
Port legacy all(*) to ATen (#15540)

Summary:
Questions:

1. ~This PR disables `common_dtype` computation [in `TensorIterator.cpp`](https://github.com/mrshenli/pytorch/blob/all/aten/src/ATen/native/TensorIterator.cpp#L489-L491) for `all*` operators. The reason is that, [this code](https://github.com/mrshenli/pytorch/blob/all/aten/src/ATen/native/TensorIterator.cpp#L120) otherwise complains type mismatch, where the `op.tensor` is `type Variable[CPUByteType]` while the `op` is `CPUByteType`. I am not sure if this is the right solution for this problem.~

2. Should I clean up all occurrences of `_th_all` and `_th_all_out` (and `logicalAnd`, `logicalAndAll`)?

3. Do I need to implement derivatives for `all`?

gchanan

Benchmark:

<img width="590" alt="screen shot 2018-12-26 at 3 24 31 pm" src="https://user-images.githubusercontent.com/16999635/50456505-e9596a00-0922-11e9-844e-00c4b4aad7ca.png">

<img width="587" alt="screen shot 2018-12-26 at 3 26 10 pm" src="https://user-images.githubusercontent.com/16999635/50456509-ef4f4b00-0922-11e9-96bf-0a30c8574fe7.png">

<img width="590" alt="screen shot 2018-12-26 at 3 26 54 pm" src="https://user-images.githubusercontent.com/16999635/50456510-ef4f4b00-0922-11e9-8a63-e47988843cc8.png">

<img width="589" alt="screen shot 2018-12-26 at 3 27 16 pm" src="https://user-images.githubusercontent.com/16999635/50456511-ef4f4b00-0922-11e9-9004-2518aebcdc6e.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15540

Differential Revision: D13548938

Pulled By: mrshenli

fbshipit-source-id: 5a2e5eef1047decb4c79906cb9f3332034908c9c

5 years agoRename away uses of THAllocator and THCDeviceAllocator (#16061)
Edward Yang [Wed, 16 Jan 2019 13:33:14 +0000 (05:33 -0800)]
Rename away uses of THAllocator and THCDeviceAllocator (#16061)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16061

I discovered I needed to delete these names in preparation for moving
THCCachingAllocator to c10_cuda; might as well fix all the other
sites too.

Reviewed By: dzhulgakov

Differential Revision: D13686869

fbshipit-source-id: e8cc55d39ac4bfd3e3a22c761f89a7a111ce5f5e

5 years agoStop pretending that TH headers are both C++ and C compatible. (#16059)
Edward Yang [Wed, 16 Jan 2019 13:33:14 +0000 (05:33 -0800)]
Stop pretending that TH headers are both C++ and C compatible. (#16059)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16059

Just deleted all __cplusplus ifdef guards; we only ever use
these headers in C++ contexts.

Reviewed By: dzhulgakov

Differential Revision: D13686580

fbshipit-source-id: ce28c4a32f3596bfb17aeeb34904a02899991453

5 years agoFix logic errors when accumulating reductions in output (CUDA) (#16023)
Brennan Vincent [Wed, 16 Jan 2019 03:55:13 +0000 (19:55 -0800)]
Fix logic errors when accumulating reductions in output (CUDA) (#16023)

Summary:
The correct logic is as follows:

* If there is an earlier split, we need to combine with its result
* If there is *not* a later split, we need to project before saving into the output.

This should partially fix #15837. For example:
```
In [7]: a=torch.ones([1838860800], dtype=torch.float, device="cuda:1")

In [8]: a.mean()
Out[8]: tensor(1., device='cuda:1')
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16023

Differential Revision: D13678449

Pulled By: umanwizard

fbshipit-source-id: ab5078484c88e96bb30121b5cf24a0e8b0a8c2f8

5 years agoRemove deprecated caffe2::Tensor APIs (#15814)
Jerry Zhang [Wed, 16 Jan 2019 02:39:29 +0000 (18:39 -0800)]
Remove deprecated caffe2::Tensor APIs (#15814)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15814

The plan is to remove the APIs we want to deprecate one by one and make sure it still builds in Sandcastle and OSS CI.

Reviewed By: ezyang

Differential Revision: D12812029

fbshipit-source-id: ea0c3dd882bec95fcd4507160ebc61f598b6d040

5 years agoRemaining Tensor API fixes - dims() -> sizes() (#15743)
Jerry Zhang [Wed, 16 Jan 2019 02:39:28 +0000 (18:39 -0800)]
Remaining Tensor API fixes - dims() -> sizes() (#15743)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15743

Remaining fixes so that D12812029 will compile

Reviewed By: dzhulgakov

Differential Revision: D13535559

fbshipit-source-id: 2c8b3403570c8c35ac8efe2d827233abc0e6e0d1

5 years agoComment about CuDNNWrapper (#15496)
Edward Yang [Wed, 16 Jan 2019 01:57:27 +0000 (17:57 -0800)]
Comment about CuDNNWrapper (#15496)

Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15496

Differential Revision: D13544130

Pulled By: ezyang

fbshipit-source-id: 51bdd8312b482925b30a478774cdfa629c57ee4e

5 years agoPort FractionalMaxPool2d from TH to ATen (#15531)
Chandler Zuo [Wed, 16 Jan 2019 01:54:20 +0000 (17:54 -0800)]
Port FractionalMaxPool2d from TH to ATen (#15531)

Summary:
Tested:

pytest test/test_nn.py -k Fractional
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15531

Differential Revision: D13612833

Pulled By: chandlerzuo

fbshipit-source-id: b919d698d068b97ba7a4f8021367e7f6c8aae39c

5 years agoSupport tracing GenericList (#15969)
James Reed [Wed, 16 Jan 2019 01:29:48 +0000 (17:29 -0800)]
Support tracing GenericList (#15969)

Summary:
Treat GenericList similarly to tuples and TensorList: recursively unpack them and call assignValueTrace accordingly. Also add interpreter support for ListUnpack on GenericList.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15969

Differential Revision: D13665139

Pulled By: jamesr66a

fbshipit-source-id: cd8cb3dd7475f424e48a69d217f2eac529df9f6a

5 years agos/fwdproxy.any/fwdproxy/g in fbsource (#16024)
Kyle Lexmond [Wed, 16 Jan 2019 01:20:24 +0000 (17:20 -0800)]
s/fwdproxy.any/fwdproxy/g in fbsource (#16024)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16024

codemod with 'Yes to all': s/fwdproxy.any/fwdproxy/g in fbsource

Reviewed By: maxgeorg

Differential Revision: D13666336

fbshipit-source-id: a5a694d66efec5304a1c8c231d638441f88efe1d

5 years agoAutomatic update of fbcode/onnx to 84a0441ae28795a928005863dc142bee81827566 (#16046)
Lu Fang [Wed, 16 Jan 2019 01:10:56 +0000 (17:10 -0800)]
Automatic update of fbcode/onnx to 84a0441ae28795a928005863dc142bee81827566 (#16046)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16046

Previous import was 7abd834091f1024c11749dcfd25126802db9fdd5

Included changes:
- **[84a0441](https://github.com/onnx/onnx/commit/84a0441)**: Clarify namescopes in the presence of nested subgraphs (#1665) <G. Ramalingam>
- **[118fec5](https://github.com/onnx/onnx/commit/118fec5)**: Add Where op. (#1569) <Sergii Dymchenko>
- **[beefa15](https://github.com/onnx/onnx/commit/beefa15)**: Use strings directly for casing as np.object w/o redundant StringHolder. (#1736) <Dmitri Smirnov>
- **[4023bae](https://github.com/onnx/onnx/commit/4023bae)**: Add a capability to input/output unicode strings (#1734) <Dmitri Smirnov>
- **[1a8a7fc](https://github.com/onnx/onnx/commit/1a8a7fc)**: typos fixed: iutput -> input (#1726) <Beomsoo Kim>
- **[0128478](https://github.com/onnx/onnx/commit/0128478)**: Scan test update (#1732) <G. Ramalingam>
- **[c6a24fd](https://github.com/onnx/onnx/commit/c6a24fd)**: turn rtol to 0.002 on densenet121, since AMD and Nvidia GPU's precion difference (#1733) <Lu Fang>
- **[5b7ac72](https://github.com/onnx/onnx/commit/5b7ac72)**: Add Shrink operator (#1622) <Rui Zhu>

Reviewed By: yinghai

Differential Revision: D13676711

fbshipit-source-id: 513cc137223469b47af48919432aaecf58006012

5 years agoAdd count_include_pad to average_pool_gradient_op (#15997)
Xiaomeng Yang [Wed, 16 Jan 2019 00:44:33 +0000 (16:44 -0800)]
Add count_include_pad to average_pool_gradient_op (#15997)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15997

Add count_include_pad to average_pool_gradient_op

Reviewed By: houseroad

Differential Revision: D13648339

fbshipit-source-id: 205cb2acb32dc24a85256b628298b1a11f0ffa2c

5 years agoRemove cuda from autograd profiler (#15898)
Zachary DeVito [Wed, 16 Jan 2019 00:25:28 +0000 (16:25 -0800)]
Remove cuda from autograd profiler (#15898)

Summary:
This puts stubs in the autograd profiler for the CUDA APIs it uses, allowing the CUDA parts of libtorch to be linked separately from the CPU parts.

This also edits the buck build.

Previous:

For GPU builds:
_C -> csrc -> caffe2
For CPU builds:
_C -> csrc-cpu -> caffe2

Now:
For GPU builds:
_C -> libtorch_cuda -> libtorch -> caffe2
For CPU builds:
_C -> libtorch -> caffe2

Pull Request resolved: https://github.com/pytorch/pytorch/pull/15898
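
This is purely a build/linking change; the user-facing profiler API is untouched. For context, a minimal sketch of the entry point whose CUDA hooks now go through the stubs (how a CPU-only build reacts to `use_cuda=True` is an assumption here, so the sketch sticks to the CPU path):
```
import torch

x = torch.randn(128, 128)
# CPU-only profiling; the CUDA event calls behind use_cuda=True are the
# ones this change moves behind stubs registered by the CUDA library.
with torch.autograd.profiler.profile(use_cuda=False) as prof:
    y = x.mm(x)
print(prof.key_averages().table(sort_by="cpu_time_total"))
```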

Reviewed By: ailzhang

Differential Revision: D13617991

Pulled By: zdevito

fbshipit-source-id: 6d84a50bb356a54b4217f93219902755601b00e1

5 years agoFix namespace typo. (#16021)
Yavuz Yetim [Wed, 16 Jan 2019 00:17:01 +0000 (16:17 -0800)]
Fix namespace typo. (#16021)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16021

Adds nom:: so that TRIVIAL_CONVERTER works more generally.

Reviewed By: janewangfb

Differential Revision: D13664748

fbshipit-source-id: 100f47a8326e41bd0ac2ae281669f5a0363fe060

5 years agoFixing missing cpp tests for Caffe2 setup.py builds (#16037)
Jesse Hellemn [Tue, 15 Jan 2019 20:10:23 +0000 (12:10 -0800)]
Fixing missing cpp tests for Caffe2 setup.py builds (#16037)

Summary:
These were broken (always skipped in setup.py builds) by https://github.com/pytorch/pytorch/pull/15917
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16037

Differential Revision: D13675549

Pulled By: pjh5

fbshipit-source-id: fed50855dd0b5d0c80fface3d8b2156f18aae4e7

5 years agoTest cases for calling caffe2 LayerNorm from PyTorch and JIT
Sebastian Messmer [Tue, 15 Jan 2019 19:24:00 +0000 (11:24 -0800)]
Test cases for calling caffe2 LayerNorm from PyTorch and JIT

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15895

Reviewed By: dzhulgakov

Differential Revision: D13615336

fbshipit-source-id: de28fef8ce025d6d37a4c80c029ec97b7195cfd9

5 years agoEnhance cpu support on gloo based multi-nodes mode. (#11330)
Shane Li [Tue, 15 Jan 2019 19:07:55 +0000 (11:07 -0800)]
Enhance cpu support on gloo based multi-nodes mode. (#11330)

Summary:
1. Add some Gloo communication operators to the related fallback list;
2. Work around compile errors when using fallback operators whose CPU operator inherits directly from 'OperatorBase', such as PrefetchOperator;
3. Add CPU context support to some Python module files and the resnet50 training example file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11330

Reviewed By: yinghai

Differential Revision: D13624519

Pulled By: wesolwsk

fbshipit-source-id: ce39d57ddb8cd7786db2e873bfe954069d972f4f

5 years agoConstant prop prim::None (#15979)
Elias Ellison [Tue, 15 Jan 2019 18:56:17 +0000 (10:56 -0800)]
Constant prop prim::None (#15979)

Summary:
Previously we were only constant-propagating prim::Constant nodes, but we should constant-propagate prim::None as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15979
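
A sketch in the spirit of the new behaviour (the shape of the check mirrors the JIT test style; whether the dead branch is fully folded at any given point in the pass pipeline is an assumption):
```
import torch
from typing import Optional

@torch.jit.script
def typed_none():
    # type: () -> Optional[int]
    return None

@torch.jit.script
def g():
    a = typed_none()
    # `a` comes from a prim::None constant; with this change the `is None`
    # check can be constant-propagated and the dead branch dropped.
    if a is None:
        return 1
    return 2

print(g.graph)  # inspect which branch survives after constant propagation
```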

Differential Revision: D13664692

Pulled By: eellison

fbshipit-source-id: 01839403576c21fc030c427e49275b8e1210fa8f

5 years agoAdd a note about THNN height/width/etc argument reordering. (#15819)
Edward Yang [Tue, 15 Jan 2019 18:19:22 +0000 (10:19 -0800)]
Add a note about THNN height/width/etc argument reordering. (#15819)

Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15819

Differential Revision: D13665297

Pulled By: ezyang

fbshipit-source-id: 4570275bc9e65269788f836f2447d09474cefeff

5 years agoFix Python path finding for benchmark tests
Jesse Hellemn [Tue, 15 Jan 2019 18:12:18 +0000 (10:12 -0800)]
Fix Python path finding for benchmark tests

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16022

Differential Revision: D13673792

Pulled By: pjh5

fbshipit-source-id: 177a823ef343b7f60e26ad9ef51415332045438d

5 years agoQuantized RNNCell modules (#15469)
James Reed [Tue, 15 Jan 2019 18:07:18 +0000 (10:07 -0800)]
Quantized RNNCell modules (#15469)

Summary:
Similarly to https://github.com/pytorch/pytorch/pull/13777, we apply post-processing quantization to RNN cell modules (`RNNCell`, `LSTMCell`, and `GRUCell`).

A further follow-up PR will involve quantizing the full `RNN`, `GRU`, and `LSTM` modules. This depends on those modules being scriptable as part of the standard library scripting effort, though. Note that infrastructure in this PR, such as `gather_quantized_params`, is currently unused but should be used in the future once we can port over the full RNN modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15469
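
A hypothetical usage sketch of the post-processing flow described above (the helper name `quantize_rnn_cell_modules` under `torch.jit.quantized` and its copy-vs-in-place behaviour are assumptions based on this PR's description, not a documented API):
```
import torch
from torch.jit.quantized import quantize_rnn_cell_modules  # assumed helper

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.cell = torch.nn.LSTMCell(input_size=32, hidden_size=64)

    def forward(self, x, hx, cx):
        return self.cell(x, (hx, cx))

model = Model()
# Post-processing step: swap float cell modules for quantized equivalents.
qmodel = quantize_rnn_cell_modules(model)

x, hx, cx = torch.randn(8, 32), torch.zeros(8, 64), torch.zeros(8, 64)
h, c = qmodel(x, hx, cx)
```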

Differential Revision: D13545802

Pulled By: jamesr66a

fbshipit-source-id: ad3b694517842893ea619438e9f5e88fd7b96510

5 years agoMiscellaneous broken RSTs fixed (#16033)
Derek Kim [Tue, 15 Jan 2019 17:44:50 +0000 (09:44 -0800)]
Miscellaneous broken RSTs fixed (#16033)

Summary:
The following pages had broken RST markup:
https://pytorch.org/docs/master/tensors.html#torch.Tensor.bernoulli_
https://pytorch.org/docs/master/torch.html#torch.addmm
https://pytorch.org/docs/master/distributed_deprecated.html#torch.distributed.deprecated.reduce_multigpu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16033

Differential Revision: D13671202

Pulled By: soumith

fbshipit-source-id: 276e10e610affe205376573e7f0f9894695d218d

5 years agoAdd PyTorchPredictorContainer (#15899)
Lu Fang [Tue, 15 Jan 2019 17:13:16 +0000 (09:13 -0800)]
Add PyTorchPredictorContainer (#15899)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15899

Add PyTorchPredictorContainer to support multiple jit script modules

Reviewed By: pritamdamania87

Differential Revision: D13596139

fbshipit-source-id: 3ce0bdf2f4dbba7aa1d20e824d03e5ac98f5d887

5 years agoAdd `itertools.{prod, combinations, combinations_with_replacement}` like op to pytorc...
Xiang Gao [Tue, 15 Jan 2019 16:24:27 +0000 (08:24 -0800)]
Add `itertools.{prod, combinations, combinations_with_replacement}` like op to pytorch (#9393)

Summary:
closes https://github.com/pytorch/pytorch/issues/7580
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9393
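
The ops are tensor-level counterparts of the Python itertools functions; a minimal sketch (the operator names `torch.cartesian_prod` and `torch.combinations` follow the current public API; if this revision used different names, treat them as assumptions):
```
import torch

a = torch.tensor([1, 2])
b = torch.tensor([10, 20, 30])

# Like itertools.product: all (a_i, b_j) pairs, shape (6, 2).
print(torch.cartesian_prod(a, b))

c = torch.tensor([1, 2, 3])
# Like itertools.combinations(c, 2): pairs without repetition, shape (3, 2).
print(torch.combinations(c, r=2))
# Like itertools.combinations_with_replacement(c, 2).
print(torch.combinations(c, r=2, with_replacement=True))
```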

Differential Revision: D13659628

Pulled By: zou3519

fbshipit-source-id: 3a233befa785709395a793ba8833413be394a6fd

5 years agouse fbgemm gconv in dnnlowp (#16020)
Jongsoo Park [Tue, 15 Jan 2019 07:59:33 +0000 (23:59 -0800)]
use fbgemm gconv in dnnlowp (#16020)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16020

This needs to go through more iterations. For conv, I think we need a high-level interface that abstracts out the low-level details of which code path is taken (acc16, outlier-aware, depth-wise, group conv, ...); otherwise the client code becomes complex, as can be seen from the DNNLOWP Conv ops. This will also help us keep the interface stable.

Reviewed By: dskhudia, jianyuh

Differential Revision: D13588996

fbshipit-source-id: 9afce9e441bcaf20437fcc2874fb9d4165a46bcb

5 years ago`var` for multiple dimensions (#15892)
Brennan Vincent [Tue, 15 Jan 2019 04:14:04 +0000 (20:14 -0800)]
`var` for multiple dimensions (#15892)

Summary:
Timings are the same as for `std`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15892
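
A minimal sketch of the extended call (passing a tuple of dims mirrors what `std` already accepts per the note above; keyword names follow the existing reduction API):
```
import torch

x = torch.randn(4, 5, 6)

# Reduce over dims 0 and 2 in one call instead of chaining two var() calls.
v = x.var(dim=(0, 2), unbiased=True, keepdim=False)
print(v.shape)  # torch.Size([5])
```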

Differential Revision: D13651173

Pulled By: umanwizard

fbshipit-source-id: a26bf1021dd972aa9e3e60fb901cd4983bfa190f

5 years agoUpdating submodules
svcscm [Tue, 15 Jan 2019 02:42:13 +0000 (18:42 -0800)]
Updating submodules

Reviewed By: yns88

fbshipit-source-id: 19841cff4a7fd69318d7828db75c16cd75757edd

5 years agoUpdating submodules
svcscm [Tue, 15 Jan 2019 02:35:14 +0000 (18:35 -0800)]
Updating submodules

Reviewed By: yns88

fbshipit-source-id: 68b7c41366618ffd636c2b9c45c7ffbbcbc44f85