Jerry Zhang [Wed, 3 Apr 2019 20:13:26 +0000 (13:13 -0700)]
QTensor (#18230)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18230
Implementing minimum qtensor API to unblock other workstreams in quantization
Changes:
- Added Quantizer which represents different quantization schemes
- Added qint8 as a data type for QTensor
- Added a new ScalarType QInt8
- Added QTensorImpl for QTensor
- Added following user facing APIs
- quantize_linear(scale, zero_point)
- dequantize()
- q_scale()
- q_zero_point()
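As a rough illustration of what `quantize_linear(scale, zero_point)` and `dequantize()` compute, the sketch below applies the usual affine mapping to plain Python floats. It assumes a signed 8-bit range of [-128, 127] and Python's built-in rounding, neither of which is stated in this commit. The helper names are hypothetical stand-ins, not the PyTorch API.

```python
QMIN, QMAX = -128, 127  # assumed signed 8-bit range (not stated in the commit)


def quantize_linear(x, scale, zero_point):
    """Map a float to an integer code: q = clamp(round(x / scale) + zero_point)."""
    q = round(x / scale) + zero_point
    return max(QMIN, min(QMAX, q))


def dequantize(q, scale, zero_point):
    """Recover an approximate float: x ~= (q - zero_point) * scale."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    scale, zero_point = 0.5, 10
    for x in (1.0, -3.2, 1000.0):
        q = quantize_linear(x, scale, zero_point)
        print(x, "->", q, "->", dequantize(q, scale, zero_point))
```

Values that fall outside the representable range saturate at the range limits, and the round trip only recovers the input up to a multiple of `scale`.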
Reviewed By: dzhulgakov
Differential Revision:
D14524641
fbshipit-source-id:
c1c0ae0978fb500d47cdb23fb15b747773429e6c
Dmytro Dzhulgakov [Wed, 3 Apr 2019 20:12:28 +0000 (13:12 -0700)]
Enforce import order to make protobuf cpp implementation in python work (#18560)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18560
We have to import python protobuf here **before** we load cpp extension.
Otherwise it breaks under certain build conditions if cpp implementation of
protobuf is used. Presumably there's some registry in protobuf library and
python side has to initialize the dictionary first, before static
initialization in python extension does so. Otherwise, duplicated protobuf
descriptors will be created and it can lead to obscure errors like
Parameter to MergeFrom() must be instance of same class: expected caffe2.NetDef got caffe2.NetDef.
I think it also fixes https://github.com/facebookarchive/caffe2/issues/1573
Reviewed By: ezyang, iroot900
Differential Revision:
D14622054
fbshipit-source-id:
2499eb88ecdee85ff8d845859048f7ae5da2a480
Lu Fang [Wed, 3 Apr 2019 20:08:21 +0000 (13:08 -0700)]
Pin onnx ir_version to 4 (#18768)
Summary:
to make test_operators.py more stable. In the future, we will bump this up manually, and I think that's acceptable, since ir_version shouldn't be bumped too often.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18768
Reviewed By: zrphercule
Differential Revision:
D14741514
Pulled By: houseroad
fbshipit-source-id:
0369dbc55424e345a113e49fc104a441ea290d58
Soumith Chintala [Wed, 3 Apr 2019 19:44:35 +0000 (12:44 -0700)]
fix nccl compilation to make sure it compiles for architectures that pytorch compiles for (#18739)
Summary:
resubmit of https://github.com/pytorch/pytorch/pull/18704 with additional fixes
Fixes https://github.com/pytorch/pytorch/issues/18359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18739
Differential Revision:
D14737274
Pulled By: soumith
fbshipit-source-id:
cfbbbf68b098594bd045861d1b2c085da693ea51
Soumith Chintala [Wed, 3 Apr 2019 19:27:19 +0000 (12:27 -0700)]
push magma init into lazyInitCUDA (#18527)
Summary:
Tries to fix C++ API's usage of MAGMA-based functions.
Attempts to Fix https://github.com/pytorch/pytorch/issues/18074
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18527
Differential Revision:
D14691694
Pulled By: soumith
fbshipit-source-id:
dd04e74418e486d73ea4a92193ddf79352ed71ba
Jerry Zhang [Wed, 3 Apr 2019 19:01:57 +0000 (12:01 -0700)]
For some files that are touched by the QTensor diff (#18765)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18765
att
Reviewed By: ZolotukhinM
Differential Revision:
D14733442
fbshipit-source-id:
525002034e6dccc2045da645e1193671fd0474b3
Wanchao Liang [Wed, 3 Apr 2019 18:18:05 +0000 (11:18 -0700)]
Fix contiguous AD and Autogradzero inconsistency (#18633)
Summary:
Fixes #17962
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18633
Differential Revision:
D14700449
Pulled By: wanchaol
fbshipit-source-id:
3d15d67c01b69b28394a0f2f001db90ed9fd31dc
Iurii Zdebskyi [Wed, 3 Apr 2019 17:53:11 +0000 (10:53 -0700)]
Added indexing for bool tensors and bool Indices (#18583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18583
ghimport-source-id:
2b1941449827f4ab632fa0f5c8cf0791a6be0845
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18583 Added indexing for bool tensors and bool Indices**
* #18505 Added numpy conversion
* #18166 Bool Tensor for CUDA
-----------
This PR enables bool tensor indexing and indexing with bool indices. This is a part of Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
a) CPU [Done]
b) CUDA [In review]
3. Tensor Conversions. [In review]
4. Tensor Indexing. [This PR]
5. Tensor Operations.
6. Back compatibility related changes.
TODO:
As a follow-up, we should move the nonzero method from TH to ATen to make the code cleaner.
Change:
```
v = torch.tensor([True, False, True], dtype=torch.bool)
boolIndices = torch.tensor([True, False, False], dtype=torch.bool)
v[boolIndices]
-> tensor([True], dtype=torch.bool)
v = torch.randn(5, 7, 3)
boolIndices = torch.tensor([True, False, True, True, False], dtype=torch.bool)
v[boolIndices]
->
tensor([[[ 0.5885, -0.3322, 0.7388],
[ 1.1182, 0.7808, -1.1492],
[-0.7952, 0.5255, -0.0251],
[ 0.7128, 0.8099, 1.2689],
[-0.7018, -1.4733, -0.3732],
[ 0.4503, 0.4986, -1.1605],
[ 0.3348, -1.3767, -0.2976]],
[[-2.0303, -0.4720, -0.1448],
[-0.1914, -0.6821, 2.0061],
[-1.0420, -0.1872, -0.3438],
[ 1.7587, -0.4183, -0.7577],
[ 1.0094, -0.1950, -0.2430],
[ 0.1174, 0.3308, -0.5700],
[ 0.1110, -0.2714, 1.3006]],
[[-0.1946, -1.4747, -0.4650],
[-1.0567, 1.0110, -0.2809],
[ 0.3729, -0.5699, 0.0815],
[-0.7733, -0.8316, 0.1674],
[ 1.2000, -0.3745, -1.1679],
[ 1.7105, 0.9851, -0.1907],
[-1.1077, 0.2086, -0.0548]]])
```
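The selection rule behind `v[boolIndices]` can be sketched without tensors: a boolean mask over the first dimension keeps exactly the entries whose mask value is True. The sketch uses nested Python lists and a hypothetical `mask_select` helper, so it only mirrors the semantics shown above and says nothing about the implementation.

```python
def mask_select(values, mask):
    """Keep the items of `values` whose corresponding mask entry is True."""
    if len(values) != len(mask):
        raise IndexError("mask length must match the first dimension")
    return [item for item, keep in zip(values, mask) if keep]


if __name__ == "__main__":
    print(mask_select([True, False, True], [True, False, False]))  # [True]
    rows = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
    # Three of the five rows survive, matching the 3 x 7 x 3 result above in spirit.
    print(mask_select(rows, [True, False, True, True, False]))
```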
Differential Revision:
D14673403
fbshipit-source-id:
2b88ec2c7eb26a4f5ef64f8707fb68068d476fc9
Lu Fang [Wed, 3 Apr 2019 17:51:41 +0000 (10:51 -0700)]
add an assertion to check the param num (#18145)
Summary:
Introduce this check to see whether it will break any existing workflow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18145
Reviewed By: dzhulgakov
Differential Revision:
D14511711
Pulled By: houseroad
fbshipit-source-id:
a7bb6ac84c9133fe94d3fe2f1a8566faed14a136
Jiakai Liu [Wed, 3 Apr 2019 17:42:30 +0000 (10:42 -0700)]
add Android NDK param to CI docker build script (#18782)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18782
ghimport-source-id:
6c4bde7dc835b59209c1d5f7b243f00c9fe99de2
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18782 [pytorch] add Android NDK param to CI docker build script**
Inspired by discussion: https://github.com/pytorch/pytorch/pull/16242
Reviewed By: dreiss
Differential Revision:
D14739471
fbshipit-source-id:
0a081045186cbf359eb3cdadee722741cd8cd62f
Gu, Jinghui [Wed, 3 Apr 2019 17:29:19 +0000 (10:29 -0700)]
Upgrade mkldnn-bridge for dnnlowp support (#16308)
Summary:
The mkldnn-bridge is upgraded in this PR to support DNNLOWP operators.
Meanwhile, APIs have been updated in caffe2 to use latest version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16308
Differential Revision:
D14697018
Pulled By: yinghai
fbshipit-source-id:
ca952589098accb08295fd5aa92924c61e74d69c
Michael Kösel [Wed, 3 Apr 2019 17:11:33 +0000 (10:11 -0700)]
add 'abs' builtin
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18502
Differential Revision:
D14750173
Pulled By: eellison
fbshipit-source-id:
359cf08938ada442ca1a3b3ea14022ce10229499
kshitij12345 [Wed, 3 Apr 2019 16:16:29 +0000 (09:16 -0700)]
Fix dense Embedding to work with double backward (#9078)
Summary:
Fixes : #6469
1. `ATen/native/native_functions.yaml` had [dispatch](https://github.com/pytorch/pytorch/blob/03e7953a98875c0164cb8e2c19b45800e85f4347/aten/src/ATen/native/native_functions.yaml#L451-L455) variants for `embedding_dense_backward`, however `embedding_backward` explicitly made a [call](https://github.com/pytorch/pytorch/blob/03e7953a98875c0164cb8e2c19b45800e85f4347/aten/src/ATen/native/Embedding.cpp#L35-L45) to it, thus leading to an error.
2. In the case of a CUDA tensor, the function used to crash when dereferencing the indices' data [pointer](https://github.com/pytorch/pytorch/blob/03e7953a98875c0164cb8e2c19b45800e85f4347/aten/src/ATen/native/Embedding.cpp#L93).
Both have been solved and checked against (on CUDA and CPU)
1. As mentioned in the issue
```
import torch
class Test(torch.nn.Module):
    def __init__(self):
        super(Test,self).__init__()
        self.embd = torch.nn.Embedding(1000, 100)
        self.dense = torch.nn.Linear(100, 1)
    def forward(self, inp):
        inp = self.embd(inp)
        return self.dense(inp)
test = Test()
inp = torch.tensor([0,1,2,1,1])
out = test(inp)
raw_loss = out.mean(dim=0)
loss_grad = torch.autograd.grad(outputs=raw_loss,
                                inputs=list(test.parameters()),
                                retain_graph=True, create_graph=True, only_inputs=True)
norm = sum([param.norm()**2 for param in loss_grad])
loss = raw_loss + norm
loss.backward(retain_graph=True)
print(test.embd.weight.grad)
```
2. Test Script
```
import torch
import time
start = time.time()
l = [1,1]*100
input = torch.tensor([[1,0],[1,0]],device='cpu')
embedding_matrix = torch.tensor([[1.0,3.0],[2.0,4]],requires_grad=True,device='cpu')
sq = embedding_matrix * embedding_matrix
emb = torch.nn.functional.embedding(input, sq,scale_grad_by_freq=False)
print('Embedding Matrix')
print(embedding_matrix)
print('-----------------')
sum_ = emb.sum()#prod.sum()
loss_grad, = torch.autograd.grad(outputs=sum_,inputs=embedding_matrix,create_graph=True)
print('Gradient')
print(loss_grad)
print('-----------------')
sum2_ = sum_ + loss_grad.sum()
print(sum2_)
sum2_.backward()
print(embedding_matrix.grad)
print(time.time() - start)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9078
Reviewed By: ezyang
Differential Revision:
D14691901
Pulled By: soumith
fbshipit-source-id:
78e2612ba39080be564c876311671eb5a0119a0f
Shen Li [Wed, 3 Apr 2019 16:06:09 +0000 (09:06 -0700)]
Highlight NCCL all_reduce and all_gather requirements (#18741)
Summary:
See #18689
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18741
Differential Revision:
D14726874
Pulled By: mrshenli
fbshipit-source-id:
a92404c653e3c62fc23fa3ccacfb3b2959b2e307
svcscm [Wed, 3 Apr 2019 15:25:14 +0000 (08:25 -0700)]
Updating submodules
Reviewed By: zpao
fbshipit-source-id:
ea0b06ce68d3fd6092eaea7c835a8b51c1120ea0
peter [Wed, 3 Apr 2019 15:19:45 +0000 (08:19 -0700)]
Make it possible for users to select /Zi or /ZI over /Z7 when using MSVC (#18790)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/18701.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18790
Differential Revision:
D14748195
Pulled By: ezyang
fbshipit-source-id:
e50df1b5ca199a88d7b5ea3ea45d25d23cd31a27
Jongsoo Park [Wed, 3 Apr 2019 14:55:02 +0000 (07:55 -0700)]
use optimization in D14020675 (#16945)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16945
As title
Reviewed By: jianyuh
Differential Revision:
D14020769
fbshipit-source-id:
fc0f05fcc57bfe9b4aa0c5750060d7b2ba57dd7a
Gregory Chanan [Wed, 3 Apr 2019 14:52:54 +0000 (07:52 -0700)]
Add device and dtype to storage. (#18749)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18749
ghimport-source-id:
9026a037f5e11cdb9ccd386f4b6b5768b9c3259b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18751 Disallow changing the device of a tensor via set_.
* #18750 Use non-legacy constructors for tensor deserialization.
* **#18749 Add device and dtype to storage.**
The goal here is to fix our serialization, which currently depends on the legacy constructors. Having dtype and device on Storage allows us to use the non-legacy constructors.
This fits somewhat with our goal of removing Storage, by having Storage act like a Tensor.
Differential Revision:
D14729516
fbshipit-source-id:
bf4a3e8669ad4859931f4a3fa56df605cbc08dcb
Gregory Chanan [Wed, 3 Apr 2019 14:51:15 +0000 (07:51 -0700)]
Use non-legacy constructors for tensor deserialization. (#18750)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18750
ghimport-source-id:
f1475cfb67841c41d9867d4429ba9125d5c7dd07
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18751 Disallow changing the device of a tensor via set_.
* **#18750 Use non-legacy constructors for tensor deserialization.**
* #18749 Add device and dtype to storage.
Deserialization currently uses legacy constructors. This is bad because we need to maintain them, but there is a more immediate problem:
1) We are trying to implement device caching on TensorImpl to get rid of a virtual dispatch
2) This doesn't work if one is able to change the device of a Tensor underlying a Variable.
3) Deserialization does 2)
So the plan is to change deserialization, then enforce that we don't change the device out from underneath a Variable.
Differential Revision:
D14729513
fbshipit-source-id:
090d6cdb375b94dc1bf4f554b2df243952b8cdc6
Iurii Zdebskyi [Wed, 3 Apr 2019 14:22:38 +0000 (07:22 -0700)]
Added numpy conversion (#18505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18505
ghimport-source-id:
f3c9b9251e5793f9e192f587194ddfebb45facc1
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18505 [WIP]Added numpy conversion**
* #18166 Bool Tensor for CUDA
Differential Revision:
D14646403
fbshipit-source-id:
79d39d692c778ce1981c1d35b1c33e3d93111041
Gregory Chanan [Wed, 3 Apr 2019 14:05:16 +0000 (07:05 -0700)]
Remove THTensor_(newUnfold). (#18773)
Summary:
It's not used and unfold's use of `device_guard: False` is scary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18773
Differential Revision:
D14736526
Pulled By: gchanan
fbshipit-source-id:
6281a284bee45fa5038783e4c1ed4d1ed7ca81ab
mingzhe0908 [Wed, 3 Apr 2019 05:49:49 +0000 (22:49 -0700)]
temp fix for flake8 error (#18788)
Summary:
Fix lint error
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18788
Reviewed By: houseroad
Differential Revision:
D14741840
Pulled By: mingzhe09088
fbshipit-source-id:
1fa630e3c6e606e3d78fe8293e5b0e7ea1b78da3
Igor Fedan [Wed, 3 Apr 2019 04:10:22 +0000 (21:10 -0700)]
Fix flake8 issues
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18762
Reviewed By: houseroad
Differential Revision:
D14734152
Pulled By: ifedan
fbshipit-source-id:
5adf123f88273895ad34ee9041896358d686de08
Jerry Zhang [Wed, 3 Apr 2019 03:54:28 +0000 (20:54 -0700)]
Change ReinitializeTensor to use C10_LOG_FIRST_N (#18531)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18531
Currently we use C10_LOG_EVERY_MS to log the data type change, but it pollutes the logs of some services,
so we would like to change it to C10_LOG_FIRST_N to prevent that.
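The difference between the two throttling policies can be sketched in a few lines. The classes below are hypothetical stand-ins whose behaviour is inferred only from the macro names: a time-based limiter keeps emitting for as long as the process runs, while a first-N limiter goes silent after a fixed number of messages. They are not the C10 macros themselves.

```python
import time


class LogFirstN:
    """Emit only the first `n` messages, then stay silent."""

    def __init__(self, n):
        self.remaining = n

    def should_log(self):
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


class LogEveryMs:
    """Emit at most one message per `interval_ms` milliseconds."""

    def __init__(self, interval_ms, clock=time.monotonic):
        self.interval_s = interval_ms / 1000.0
        self.clock = clock  # injectable so the policy can be tested deterministically
        self.last = None

    def should_log(self):
        now = self.clock()
        if self.last is None or now - self.last >= self.interval_s:
            self.last = now
            return True
        return False


if __name__ == "__main__":
    first_n = LogFirstN(2)
    print(sum(first_n.should_log() for _ in range(1000)))  # 2
```

For a long-running service, the time-based policy still produces an unbounded number of lines over the lifetime of the process, which is consistent with the log pollution described above.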
Reviewed By: dzhulgakov
Differential Revision:
D14647704
fbshipit-source-id:
b84e4002bd4aa94d616133cd1049c3d4ab05386e
Yinghai Lu [Wed, 3 Apr 2019 03:52:58 +0000 (20:52 -0700)]
Add support for getting TensorProto argument (#18364)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18364
att
Reviewed By: bddppq
Differential Revision:
D14584784
fbshipit-source-id:
03f9207d5cf4f7f4b812428a931edbcdcb21ca8d
Michael Suo [Wed, 3 Apr 2019 01:06:07 +0000 (18:06 -0700)]
make test module hook use save/load (#18284)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18284
ghimport-source-id:
5a92c03fda19072ffb6afd40e0f56806716c7be6
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18296 [jit] Add namespacing for ScriptClasses
* **#18284 [jit] make test module hook use save/load**
* #18211 [jit] Turn script_type_parser into a class
* #18148 [jit] python interop for script classes
Instead of python-printing and comparing strings (which does not capture
dependency information, etc.), use save/load on in-memory buffers and
compare the main module contents inside the buffer
Reviewed By: ailzhang
Differential Revision:
D14581129
fbshipit-source-id:
52264ae9ce076775ab3fd1a0c32c8d6f6677a903
Zachary DeVito [Wed, 3 Apr 2019 00:33:06 +0000 (17:33 -0700)]
Add ability to specialize class types to ArgumentSpec (#18314)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18314
ghimport-source-id:
8cecb768d476ab19c9460f39c8f94a764e4cb052
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18314 Add ability to specialize class types to ArgumentSpec**
* #18226 Add Slot type to abstract the raw pointers being used for slots.
Differential Revision:
D14574395
fbshipit-source-id:
cc3af6e56e9ae52990f4a1ad56ecceaa2d493577
Mingzhe Li [Wed, 3 Apr 2019 00:03:23 +0000 (17:03 -0700)]
Operator-level performance microbenchmarks (#18740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18740
Test utilities for writing Caffe2/PyTorch performance microbenchmarks. Brief description of the file structure
* benchmark_core.py : core utilities for running microbenchmark tests
* benchmark_caffe2.py : Caffe2 specific benchmark utilities
* benchmark_pytorch.py: PyTorch specific benchmark utilities
* benchmark_runner.py : Main function. Currently it can run the microbenchmark tests in a stand-alone mode. The next step is to have this integrate with AI-PEP.
The utilities are located at https://github.com/pytorch/pytorch/tree/master/test to have access to both Caffe2/PyTorch Python's frontend.
Include two operator microbenchmarks; support both Caffe2/PyTorch:
* MatMul
* Add
Reference: PyTorch benchmarks : https://github.com/pytorch/benchmark/tree/master/timing/python. In this work, we start with two example binary operators, MatMul and Add, but eventually we should cover unary operators as in the PyTorch benchmark repo.
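To make the idea of an operator-level microbenchmark concrete, here is a minimal stand-alone sketch built on the standard `timeit` module. It times pure-Python versions of MatMul and Add. The function names and the `run_benchmark` helper are hypothetical and unrelated to the `benchmark_*.py` utilities listed above.

```python
import timeit


def matmul(a, b):
    """Naive matrix product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def run_benchmark(op, args, repeat=3, number=10):
    """Return the best average time per call, in seconds, over `repeat` runs."""
    timer = timeit.Timer(lambda: op(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    n = 16
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i * j) for j in range(n)] for i in range(n)]
    for name, op in (("MatMul", matmul), ("Add", add)):
        print(f"{name}: {run_benchmark(op, (a, b)) * 1e6:.1f} us per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; a real harness would also vary the input shapes.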
Reviewed By: zheng-xq
Differential Revision:
D13887111
fbshipit-source-id:
b7a56b95448c9ec3e674b0de0ffb96af4439bfce
Iurii Zdebskyi [Tue, 2 Apr 2019 23:10:43 +0000 (16:10 -0700)]
Bool Tensor for CUDA (#18166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18166
ghimport-source-id:
a8e2ba2d966e49747a55701c4f6863c5e24d6f14
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18166 Bool Tensor for CUDA**
* #18165 Resolved comments from Bool Tensor for CPU PR
------
This PR enables bool tensor creation and some basic operations for the CUDA backend. This is a part of Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
a) CPU [Done]
b) CUDA [This PR]
3. Tensor Conversions.
4. Tensor Indexing.
5. Tensor Operations.
6. Back compatibility related changes.
Change:
Enable bool tensor in CUDA with the following operations:
torch.zeros
torch.tensor
torch.ones
torch.rand/rand_like/randint/randint_like
torch.full
torch.full_like
torch.empty
torch.empty_like
Tested via unit tests and local scripts.
Differential Revision:
D14605104
fbshipit-source-id:
b7d7340a7d70edd03a109222d271e68becba762c
Jan Schlüter [Tue, 2 Apr 2019 22:15:31 +0000 (15:15 -0700)]
Add helpful information to the gradient/inplace operation exception (#18523)
Summary:
To debug a `one of the variables needed for gradient computation has been modified by an inplace operation` error, I wanted to know *which* variable has been modified, so I extended the error message with what information is easily available at this point.
Before:
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```
After:
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [80, 1]], which is output 0 of UnsqueezeBackward0, is at version 1, not expected version 0. Hint: enable anomaly detection to find the forward pass operation which modified it.
```
The hint to enable anomaly detection is only shown when it is not enabled. It's meant to save people some googling. I'd even go further and reference `torch.autograd.set_detect_anomaly(True)`, but maybe we're not running Python?
Disclaimer: I haven't looked at other parts of the code to check if using `std::stringstream` is acceptable practice, let me know if it isn't. Similarly, I haven't checked about indentation practices.
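The "is at version 1, not expected version 0" part of the new message comes from a version-counter check. A toy model of that mechanism, with hypothetical `Tensor` and `SavedTensor` classes that only imitate the idea and are not autograd's actual data structures, might look like this:

```python
class Tensor:
    """Toy value holder with an in-place modification counter."""

    def __init__(self, data):
        self.data = list(data)
        self.version = 0

    def add_(self, value):
        """In-place add: mutates the data and bumps the version."""
        self.data = [x + value for x in self.data]
        self.version += 1
        return self


class SavedTensor:
    """Remembers which version of a tensor the backward pass expects."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.expected_version = tensor.version

    def unpack(self):
        """Return the tensor, or fail if it was modified after being saved."""
        if self.tensor.version != self.expected_version:
            raise RuntimeError(
                "one of the variables needed for gradient computation has been "
                "modified by an inplace operation: "
                f"is at version {self.tensor.version}, "
                f"not expected version {self.expected_version}"
            )
        return self.tensor


if __name__ == "__main__":
    t = Tensor([1.0, 2.0])
    saved = SavedTensor(t)  # forward pass saves t for backward
    t.add_(1.0)             # in-place modification bumps the version
    try:
        saved.unpack()      # backward pass detects the mismatch
    except RuntimeError as err:
        print(err)
```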
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18523
Differential Revision:
D14683249
Pulled By: soumith
fbshipit-source-id:
f97a99d4aabea7461df766d66cd72300b48e2350
Mikhail Zolotukhin [Tue, 2 Apr 2019 22:14:18 +0000 (15:14 -0700)]
build_variables.py: turn on link_whole for _C_impl library. (#18763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18763
Without the `link_whole` flag in opt-builds, some of the files are not linked into the `_C_impl` library, which causes some static initializers not to run (namely, registering a customPythonOperation from python_interpreter.cpp). This diff fixes it.
Differential Revision:
D14732471
fbshipit-source-id:
57cff6b4b6d479ad7ab7fd29f677746d91d6ff45
vaeksare [Tue, 2 Apr 2019 21:25:28 +0000 (14:25 -0700)]
Fix windows msbuild bug (#18748)
Summary:
Fix the bug introduced by #18681 where an undefined variable was being used to limit max cpu count when building for Windows without Ninja.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18748
Differential Revision:
D14733209
Pulled By: soumith
fbshipit-source-id:
52fc0dd4dde99da75a6956b63f02da2e647eed4f
Igor Fedan [Tue, 2 Apr 2019 20:18:20 +0000 (13:18 -0700)]
torch.cross' dim default changed to c10::optional instead of int=-1 (#17582)
Summary:
Argument dim=-1 doesn't work for torch.cross. The signature of torch.cross has been changed to take c10::optional<int64_t> dim instead of int64_t. This follows the documentation: "If dim is not given, it defaults to the first dimension found with the size 3." If dim is specified (even as a negative value), the corresponding dimension is used.
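The resolution rule for `dim` can be written out as a small pure-Python function. This is a sketch of the behaviour described in this summary, using a hypothetical `resolve_cross_dim` helper that works on a shape tuple. The extra size-3 check for an explicit `dim` is an assumption, and the function is not the ATen implementation.

```python
def resolve_cross_dim(shape, dim=None):
    """Pick the dimension to compute a cross product along.

    dim=None  -> first dimension whose size is 3 (the documented default).
    dim given -> that dimension, with negative values counted from the end.
    """
    ndim = len(shape)
    if dim is None:
        for i, size in enumerate(shape):
            if size == 3:
                return i
        raise ValueError(f"no dimension of size 3 in shape {shape}")
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for {ndim} dimensions")
    resolved = dim % ndim  # wrap negative indices
    if shape[resolved] != 3:  # assumed check, not taken from the commit
        raise ValueError(f"dimension {resolved} has size {shape[resolved]}, expected 3")
    return resolved


if __name__ == "__main__":
    print(resolve_cross_dim((3, 4, 3)))      # 0
    print(resolve_cross_dim((3, 4, 3), -1))  # 2
```

With a sentinel default of -1 the two cases above could not be told apart, which is why an optional value is needed.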
Fixes #17229
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17582
Differential Revision:
D14483063
Pulled By: ifedan
fbshipit-source-id:
f9699093ec401cb185fd33ca4563c8a46cdcd746
Sacha [Tue, 2 Apr 2019 20:15:10 +0000 (13:15 -0700)]
Fix multi-configuration on Windows CMake (CUDA) (#18548)
Summary:
Multiple configurations are the default (e.g. Release;Debug) on Windows, and this check always broke that setup because CMAKE_BUILD_TYPE was not set. The workaround was to always set CMAKE_BUILD_TYPE to Debug or Release, which was very unfortunate.
The correct method is to use generator expressions that expand depending on the current CONFIG being processed.
Side note: Anywhere else CMAKE_BUILD_TYPE is checked should probably be fixed too.
Note that the CMakeLists.txt forces it into Release mode. However, I came across this error when importing the prebuilt Config into another project, where CMAKE_BUILD_TYPE was not set.
> 3>CMake Error at pre_built/pytorch-1.0.1/share/cmake/Caffe2/public/cuda.cmake:380 (message):
> 3> Unknown cmake build type:
Proper support for configurations would mean we can build debug and release at the same time and as you can see, it is less CMake code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18548
Differential Revision:
D14730790
Pulled By: ezyang
fbshipit-source-id:
70ae16832870d742c577c34a50ec7564c3da0afb
Igor Fedan [Tue, 2 Apr 2019 19:32:52 +0000 (12:32 -0700)]
Fix flake8 issues in gradgrad test
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18727
Differential Revision:
D14724887
Pulled By: ifedan
fbshipit-source-id:
8c1db6460303e746e4aea0142302b8d61277c067
Sebastian Messmer [Tue, 2 Apr 2019 19:23:15 +0000 (12:23 -0700)]
Register operators by passing arguments to RegisterOperators constructor (#18577)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18577
This is also part of the legacy API and we need to support it if we want to replace it.
Reviewed By: dzhulgakov
Differential Revision:
D14671432
fbshipit-source-id:
007abf4ab816647a509fc08e35d79b6c1aa55b03
Sebastian Messmer [Tue, 2 Apr 2019 19:23:13 +0000 (12:23 -0700)]
Allow registering an operator schema without a kernel (#18551)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18551
This is helpful for defining a set of operators as an interface but not adding concrete kernels just yet.
The registration logic will ensure that any other libraries that add kernels for these schemas exactly match the schema defined here.
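A toy model of "schema first, kernels later" is sketched below. The `OperatorRegistry` class, its method names and the schema strings are all hypothetical. The sketch only illustrates the exact-match rule described above and does not reflect the c10 registration API.

```python
class OperatorRegistry:
    """Toy registry: schemas may be declared before any kernel exists."""

    def __init__(self):
        self._schemas = {}
        self._kernels = {}

    def register_schema(self, name, schema):
        """Declare an operator interface without providing a kernel."""
        existing = self._schemas.setdefault(name, schema)
        if existing != schema:
            raise ValueError(f"schema mismatch for {name}: {existing!r} vs {schema!r}")

    def register_kernel(self, name, schema, kernel):
        """Attach a kernel; its schema must match the declared one exactly."""
        self.register_schema(name, schema)
        self._kernels[name] = kernel

    def call(self, name, *args):
        if name not in self._schemas:
            raise KeyError(f"unknown operator {name}")
        if name not in self._kernels:
            raise NotImplementedError(f"{name} has a schema but no kernel yet")
        return self._kernels[name](*args)


if __name__ == "__main__":
    registry = OperatorRegistry()
    registry.register_schema("my::add", "(int a, int b) -> int")
    registry.register_kernel("my::add", "(int a, int b) -> int", lambda a, b: a + b)
    print(registry.call("my::add", 1, 2))  # 3
```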
Reviewed By: dzhulgakov
Differential Revision:
D14660208
fbshipit-source-id:
7adb5a4876cff5a0ad21d92d8c450cb889f00cc3
Sebastian Messmer [Tue, 2 Apr 2019 19:23:13 +0000 (12:23 -0700)]
Improve compiler error messages of the op registration API (#18550)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18550
When the operator registration API is used wrongly, in most cases we should now get a nice compiler error
instead of weird template error messages.
This is done by making the enable_if conditions more broad so they also match error cases,
but then having static_asserts against these error cases inside the function.
Before that, since the function didn't match, the error message said something like "no function found to match your call",
now it will show the error message specified in the static_asserts.
Reviewed By: dzhulgakov
Differential Revision:
D14659178
fbshipit-source-id:
7ca4fb72d9051eadf0a7e2717b962bf1213a52b2
Sebastian Messmer [Tue, 2 Apr 2019 19:23:13 +0000 (12:23 -0700)]
Improve and test error messages for signature mismatches (#18547)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18547
- Argument indices in the error messages are 1-indexed not 0-indexed.
- Add test cases checking that a mismatching signature actually shows the correct error messages
Reviewed By: dzhulgakov
Differential Revision:
D14656695
fbshipit-source-id:
55e45634baa3117e18b8687ea6b2a2f83715bdf6
Sebastian Messmer [Tue, 2 Apr 2019 19:23:13 +0000 (12:23 -0700)]
Enable gmock and fix system gtest issue (#18706)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18706
- Enable gmock
- Fix issue where the gtest source files in third_party would include system gtest headers
Reviewed By: ezyang
Differential Revision:
D14715302
fbshipit-source-id:
5335390913e651bda85c69d7ea9b5c1bce58f172
Edward Yang [Tue, 2 Apr 2019 17:46:16 +0000 (10:46 -0700)]
Emergency workaround for apt-get failure. (#18733)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18733
ghimport-source-id:
b56766fb4b1084d8a7947cf622275d44e325141b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18733 Emergency workaround for apt-get failure.**
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Reviewed By: dreiss
Differential Revision:
D14725779
fbshipit-source-id:
6855347853a3f13461ca267ed563e2db5815166e
Pieter Noordhuis [Tue, 2 Apr 2019 17:29:54 +0000 (10:29 -0700)]
Fix clang-tidy errors in torch/csrc/distributed
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18709
Differential Revision:
D14725936
Pulled By: pietern
fbshipit-source-id:
307bc446d53da5d0e04d730bb51b7fb29212ace3
Eli Amesefe [Tue, 2 Apr 2019 17:07:22 +0000 (10:07 -0700)]
Undefined behavior with memset of std::string to 0 (#18703)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18703
`zeroPtr` is sometimes a `std::string` tensor, so `memset` to 0 is undefined behavior.
This might be accidentally safe with `std::string` implementations that use SSO (Small String Optimization), but will crash otherwise.
Reviewed By: zheng-xq
Differential Revision:
D14714458
fbshipit-source-id:
012a18464e6514d38ff791509b88ddc3fc55b2b1
Soumith Chintala [Tue, 2 Apr 2019 16:36:05 +0000 (09:36 -0700)]
Revert D14717015: [pytorch][PR] fix nccl compilation to make sure it compiles for architectures that pytorch compiles for
Differential Revision:
D14717015
Original commit changeset:
4aac036f57e5
fbshipit-source-id:
c820b8dfb27564271e6b80e133fe655658a7c25c
Lu Fang [Tue, 2 Apr 2019 16:11:14 +0000 (09:11 -0700)]
update of fbcode/onnx to f0d7df2c643c4e37f1fd7735ef02c972c4d19fb5 (#18695)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18695
Previous import was fb1a80692c1ab0bd27b1072f2e7bffacba336777
Included changes:
- **[f0d7df2c](https://github.com/onnx/onnx/commit/f0d7df2c)**: fix testcase names of maxpool_2d_ceil and averagepool_2d_ceil (#1896) <karljang>
Reviewed By: zrphercule
Differential Revision:
D14709993
fbshipit-source-id:
7fe2145a481ea2c1b6d85ba1c85c662200a53241
Vitaly Fedyunin [Tue, 2 Apr 2019 15:44:27 +0000 (08:44 -0700)]
Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors. (#18455)
Summary:
Make it possible to construct a pinned memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, as the copy operation is unnecessary.
Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
Part of the bigger `Remove Storage` plan.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18455
Reviewed By: ezyang
Differential Revision:
D14672084
Pulled By: VitalyFedyunin
fbshipit-source-id:
9d0997ec00f59500ee018f8b851934d334012124
Edward Yang [Tue, 2 Apr 2019 15:03:29 +0000 (08:03 -0700)]
Improve Backend comment. (#18567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18567
ghimport-source-id:
1e50e611a3afcfae86828b7afe06c3fdc6a7bef7
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18567 Improve Backend comment.**
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Reviewed By: dzhulgakov
Differential Revision:
D14666189
fbshipit-source-id:
64a41c4a998b1a59ff780d1ae06fa16e5ef3c7c4
vishwakftw [Tue, 2 Apr 2019 14:53:34 +0000 (07:53 -0700)]
Expose alias multinomial methods to ATen (#17904)
Summary:
This PR exposes the multinomialAliasSetup and multinomialAliasDraw methods.
cc: neerajprad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17904
Differential Revision:
D14700205
Pulled By: ezyang
fbshipit-source-id:
16462fb1f1ef1d560fd586632ea356b23e966ee3
BloodAxe [Tue, 2 Apr 2019 14:49:40 +0000 (07:49 -0700)]
Update cpp_extension.py (#18638)
Summary:
Hi. It seems that when building CPP-extensions with CUDA for Windows, the `extra_cuda_cflags` options are not properly forwarded to `nvcc`.
Use of extra CUDA options is necessary to build, for instance, InplaceABN (https://github.com/mapillary/inplace_abn), which requires the `--expt-extended-lambda` option.
This PR adds one line that correctly appends `extra_cuda_cflags`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18638
Differential Revision:
D14704270
Pulled By: ezyang
fbshipit-source-id:
e1e330d193d9afd5707a5437a74c0499460d2b90
Mark Pare [Tue, 2 Apr 2019 14:48:45 +0000 (07:48 -0700)]
fix typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18653
Differential Revision:
D14713920
Pulled By: ezyang
fbshipit-source-id:
170295a162dd23916c1dcc9330918d33277cc9ed
Gregory Chanan [Tue, 2 Apr 2019 14:31:54 +0000 (07:31 -0700)]
Kill LegacyBridge functions that don't do multiple dispatch. (#18696)
Summary:
At some point, we needed these functions to deal with autograd dispatching to the sparse or TH version of a backwards. But we rewrote all backwards definitions in terms of native functions, so this is no longer necessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18696
Differential Revision:
D14710834
Pulled By: gchanan
fbshipit-source-id:
b22568c58eefc79d672555bd8832398ccd965cb7
svcscm [Tue, 2 Apr 2019 13:47:23 +0000 (06:47 -0700)]
Updating submodules
Reviewed By: zpao
fbshipit-source-id:
da3cd711bb81b07c6c284426ffc5e10a969b0d2b
Jongsoo Park [Tue, 2 Apr 2019 06:45:01 +0000 (23:45 -0700)]
add Int8FCRelu (#18673)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18673
Add a fused FC + Relu
Reviewed By: csummersea
Differential Revision:
D14667055
fbshipit-source-id:
d88fefba008fc0ca450291532d2b320694c6b785
David Riazati [Tue, 2 Apr 2019 00:31:53 +0000 (17:31 -0700)]
Fix uninitialized value in pickler (#18678)
Summary:
Fixes #18671
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18678
Differential Revision:
D14708969
Pulled By: driazati
fbshipit-source-id:
d372c6e3a2a3d3fc48d8afc1fa6807f2ce0e5c6e
Soumith Chintala [Tue, 2 Apr 2019 00:07:27 +0000 (17:07 -0700)]
fixes multiprocessing serialization for integer nn.Parameter (#18639)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/17345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18639
Differential Revision:
D14711565
Pulled By: soumith
fbshipit-source-id:
0063ed138a215b95d6571dcd68b18569714abe19
Soumith Chintala [Tue, 2 Apr 2019 00:07:08 +0000 (17:07 -0700)]
fix nccl compilation to make sure it compiles for architectures that pytorch compiles for (#18704)
Summary:
cc: t-vi gchanan zou3519
This fixes https://github.com/pytorch/pytorch/issues/18359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18704
Differential Revision:
D14717015
Pulled By: soumith
fbshipit-source-id:
4aac036f57e564b05d759662e8ad7a80170901c0
Jon Malmaud [Mon, 1 Apr 2019 22:56:21 +0000 (15:56 -0700)]
More type stubs (#18511)
Summary:
Added stubs for:
* The `device` module
* The `cuda` module
* Parts of the `optim` module
* Began adding stubs for the `autograd` module. I'll annotate more later but `no_grad` and friends are probably the most used exports from it so it seemed like a good place to start.
This would close #16996, although comments on that issue reference other missing stubs so maybe it's worth keeping open as an umbrella issue.
The big remaining missing package is `nn`.
Also added a `py.typed` file so mypy will pick up on the type stubs. That closes #17639.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18511
Differential Revision:
D14715053
Pulled By: ezyang
fbshipit-source-id:
9e4882ac997063650e6ce47604b3eaf1232c61c9
Gregory Chanan [Mon, 1 Apr 2019 22:51:44 +0000 (15:51 -0700)]
NCCL build fix WITH_DISTRIBUTED=1.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18691
Reviewed By: ezyang
Differential Revision:
D14706205
Pulled By: gchanan
fbshipit-source-id:
802f19bfd7df3703c0dbce03036e2f2e32eb3efb
Duc Ngo [Mon, 1 Apr 2019 22:49:56 +0000 (15:49 -0700)]
caffe2 - set up correct inheritance structure for remaining operator test classes (#18622)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18622
Set up correct inheritance structure for remaining operator test classes
Reviewed By: ezyang
Differential Revision:
D14685941
fbshipit-source-id:
a6b1b3be325935b7fec7515be13a4994b3016bf0
Elias Ellison [Mon, 1 Apr 2019 22:33:35 +0000 (15:33 -0700)]
Peephole Optimize Shape Ops (#18549)
Summary:
Peephole optimize ops that just require Dimensioned Tensor Type, which is what we specialize graphs on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18549
Differential Revision:
D14690827
Pulled By: eellison
fbshipit-source-id:
9d7439eb584f0a5b877f5aa53cf80150f00e7e5f
Sebastian Messmer [Mon, 1 Apr 2019 21:53:08 +0000 (14:53 -0700)]
Deprecated lambda based API (#18542)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18542
This adds the deprecated API for defining kernels as lambdas. The new API for defining kernels as lambdas was introduced in
D14653005.
Reviewed By: dzhulgakov
Differential Revision:
D14653551
fbshipit-source-id:
99900f1436716c69e52c83b68333b642ec2c8558
Sebastian Messmer [Mon, 1 Apr 2019 21:53:08 +0000 (14:53 -0700)]
deprecated function based API (#18444)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18444
This adds the deprecated function based API to c10::RegisterOperators().
This is the API currently exposed under jit::RegisterOperators() and we need to support it for backwards compatibility.
Reviewed By: dzhulgakov
Differential Revision:
D14514218
fbshipit-source-id:
c77676851cfd431d66f18fd8038cf153a3a7d7cc
Junjie Bai [Mon, 1 Apr 2019 21:30:09 +0000 (14:30 -0700)]
Revert "Tensor construction codemod(raw_mutable_data) (#16373)" (#18680)
Summary:
This reverts commit
d73c830e236f5b980e5c91914b818d150b60278c.
We have observed a significant perf drop when training ResNext101 with multiple AMD GPUs:
Before:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-bench/1636/console
2 GPUs ResNext training got 150~160 imgs/sec
4 GPUs ResNext training got 270~280 imgs/sec
After:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-bench/1637/console
Both 2 and 4 GPUs ResNext training drop to 110~120 imgs/sec
Similar perf drops are seen on ResNet50 training jobs as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18680
Differential Revision:
D14702941
Pulled By: bddppq
fbshipit-source-id:
828141805afc23f25c08d4a2eb6d4b99f817c128
Pieter Noordhuis [Mon, 1 Apr 2019 21:27:03 +0000 (14:27 -0700)]
C++ handler for gradient reduction (#18251)
Summary:
This commit adds the `c10d::Reducer` class that hooks into autograd
and performs gradient bucketing and reduction. These are the core
parts of `nn.parallel.DistributedDataParallel` that up to now were
only usable for CUDA models.
This should enable the following:
* Distributed data parallelism for models defined using the C++ frontend.
* Allow overlap of gradient computation and reduction for non-CUDA models.
* Enable distributed data parallelism for models with some unused parameters.
This does not include any logic for computing bucket assignment, which
can be done separately; either by observing autograd execution order
(this is what Apex does), or by assigning buckets based on some
maximum byte size, or both.
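For the size-based option mentioned above, a minimal sketch of one possible policy is a greedy pass that starts a new bucket whenever the byte cap would be exceeded. This is an assumption about how such an assignment could look, not what `c10d::Reducer` or Apex does; `assign_buckets` and its arguments are hypothetical names.

```python
def assign_buckets(param_sizes_bytes, max_bucket_bytes):
    """Greedily group parameter indices into buckets capped by total bytes."""
    buckets = []
    current = []
    current_bytes = 0
    for index, size in enumerate(param_sizes_bytes):
        # Close the current bucket if adding this parameter would exceed the cap.
        if current and current_bytes + size > max_bucket_bytes:
            buckets.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        buckets.append(current)
    return buckets


example_buckets = assign_buckets([400, 400, 1200, 100, 100], max_bucket_bytes=1000)
```

A parameter larger than the cap ends up alone in its own bucket, which keeps the loop simple at the cost of one oversized reduction.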
Also see #17757 and #13273.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18251
Reviewed By: mrshenli
Differential Revision:
D14571899
Pulled By: pietern
fbshipit-source-id:
20f95eefd288dfe8cfffe0a28ca22fa7c9c3cd4c
svcscm [Mon, 1 Apr 2019 20:54:40 +0000 (13:54 -0700)]
Updating submodules
Reviewed By: zpao
fbshipit-source-id:
735fc388bff7066e8f46526266a73bf35e121442
Jongsoo Park [Mon, 1 Apr 2019 20:02:02 +0000 (13:02 -0700)]
add ConvRelu schema (#18693)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18693
As title
Reviewed By: protonu
Differential Revision:
D14662880
fbshipit-source-id:
3664faa660a04e1f528a413d2a1700b872c3c684
Karl Ostmo [Mon, 1 Apr 2019 20:01:29 +0000 (13:01 -0700)]
offload scripts from win-test.sh
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18601
Differential Revision:
D14711856
Pulled By: kostmo
fbshipit-source-id:
75fe620541fe2903f69a53dbd1b6d51a0d718113
peter [Mon, 1 Apr 2019 19:37:00 +0000 (12:37 -0700)]
Some fixes for the build script on Windows (#18681)
Summary:
Fixes https://discuss.pytorch.org/t/pytorch-build-from-source-on-windows/40288/13?u=peterjc123.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18681
Differential Revision:
D14711039
Pulled By: soumith
fbshipit-source-id:
f7e1a94b163064c055670b2925cd4502e7773599
Igor Fedan [Mon, 1 Apr 2019 19:30:29 +0000 (12:30 -0700)]
Fix for double backwards tests (#18190)
Summary:
If none of the outputs require_grad, we don't actually check gradgrad; instead, we check that their numerical gradients are 0.
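To make the idea concrete, here is a small sketch in plain Python (no autograd involved): an output that does not depend on the input has a numerical gradient of zero under central differences. `numerical_gradient` and the two sample functions are hypothetical and only stand in for the real test utilities.

```python
def numerical_gradient(fn, x, eps=1e-6):
    """Central-difference estimate of d fn / d x at a scalar point x."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)


def detached_output(x):
    # Stand-in for an output that does not require grad: it ignores x.
    return 3.0


def differentiable_output(x):
    return x * x


zero_grad = numerical_gradient(detached_output, 1.5)
nonzero_grad = numerical_gradient(differentiable_output, 1.5)
```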
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18190
Differential Revision:
D14563388
Pulled By: ifedan
fbshipit-source-id:
a4eb94c9eb60f14dbe6986cd8cef1fe78a7bc839
David Riazati [Mon, 1 Apr 2019 18:58:28 +0000 (11:58 -0700)]
Add string index/slice operations (#18247)
Summary:
Adds support for string indexing (`"a"[0]`) and slicing (`"abc"[1:3]`)
to script.
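The two operations behave as in ordinary Python. The sketch below uses plain functions so that it runs without PyTorch; the assumption is that the same bodies, once compiled with `torch.jit.script`, are what this change makes possible. The function names are made up for the example.

```python
def first_char(s: str) -> str:
    # String indexing, as in "a"[0]
    return s[0]


def middle(s: str) -> str:
    # String slicing, as in "abc"[1:3]
    return s[1:3]


indexed = first_char("a")
sliced = middle("abc")
```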
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18247
Differential Revision:
D14574486
Pulled By: driazati
fbshipit-source-id:
4b42aa0881e5398ea7f112be46c0335e6e19dced
eellison [Mon, 1 Apr 2019 18:48:39 +0000 (11:48 -0700)]
Re-land Parsing file check (#18570)
Summary:
The last time I tried to land it there was a merge race with the docs coverage test lol. Re-landing with the fix.
Re-land of https://github.com/pytorch/pytorch/pull/18304
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18570
Reviewed By: driazati
Differential Revision:
D14707285
Pulled By: eellison
fbshipit-source-id:
3a0265928aa8cad78961723d8bf0fbf871fdb71d
Ru Li [Mon, 1 Apr 2019 17:31:08 +0000 (10:31 -0700)]
Create Node2Vec ModuleKeeper
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18504
Reviewed By: sunnieshang
Differential Revision:
D14632091
fbshipit-source-id:
d4544866552dc6bcbc7515be9e88cb11e7622a44
Jongsoo Park [Mon, 1 Apr 2019 15:49:37 +0000 (08:49 -0700)]
use acc16 only when n>128 and k>128 in Skylake (#18672)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18672
In Skylake, when n < 128 or k < 128, acc16 is slower.
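The rule can be written as a small predicate. This is only a sketch of the condition in the title; the function name and constant are hypothetical, and the behaviour at exactly 128 follows the title (`n>128 and k>128`) since the summary does not say which side the boundary falls on.

```python
ACC16_MIN_DIM = 128  # threshold taken from the commit title


def should_use_acc16(n: int, k: int) -> bool:
    """Return True only when both dimensions exceed the threshold."""
    return n > ACC16_MIN_DIM and k > ACC16_MIN_DIM
```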
Reviewed By: jianyuh
Differential Revision:
D14700576
fbshipit-source-id:
80ca9f1af4626637eed9c5ca49f95ae744811189
Gregory Chanan [Mon, 1 Apr 2019 14:57:48 +0000 (07:57 -0700)]
Move ideep singleton registration to ATen from C2. (#18335)
Summary:
Since we are going to add ideep to ATen, and ATen is always compiled, it makes sense to have the registration in ATen rather than C2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18335
Reviewed By: bddppq
Differential Revision:
D14578652
Pulled By: gchanan
fbshipit-source-id:
4d77fcfc21a362b21d5291a127498aa722548873
Shuichi KITAGUCHI [Mon, 1 Apr 2019 14:24:27 +0000 (07:24 -0700)]
Create torch/lib directory before copying _C.lib on Windows environment. (#18666)
Summary:
`python setup.py develop` fails with the following messages.
~~~
...
-- Building with NumPy bindings
-- Not using cuDNN
-- Not using MIOpen
-- Not using CUDA
-- Using MKLDNN
-- Not using NCCL
-- Building without distributed package
Copying extension caffe2.python.caffe2_pybind11_state
Copying caffe2.python.caffe2_pybind11_state from torch\Lib\site-packages\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd to C:\data\source\pytorch\build\lib.win-amd64-3.7\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd
copying torch\Lib\site-packages\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd -> C:\data\source\pytorch\build\lib.win-amd64-3.7\caffe2\python
building 'torch._C' extension
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
creating build\temp.win-amd64-3.7\Release\torch
creating build\temp.win-amd64-3.7\Release\torch\csrc
...
creating C:\data\source\pytorch\build\lib.win-amd64-3.7\torch
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /nodefaultlib:libucrt.lib ucrt.lib /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\data\source\pytorch\torch\lib /LIBPATH:C:\data\dlenv\libs /LIBPATH:C:\data\dlenv\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.16.27023\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.17763.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.17763.0\um\x64" shm.lib torch_python.lib /EXPORT:PyInit__C build\temp.win-amd64-3.7\Release\torch/csrc/stub.obj /OUT:build\lib.win-amd64-3.7\torch\_C.cp37-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.lib /NODEFAULTLIB:LIBCMT.LIB
Creating library build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.lib and object build\temp.win-amd64-3.7\Release\torch/csrc\_C.cp37-win_amd64.exp
Generating code
Finished generating code
copying build\lib.win-amd64-3.7\torch\_C.cp37-win_amd64.pyd -> torch
copying build\lib.win-amd64-3.7\caffe2\python\caffe2_pybind11_state.cp37-win_amd64.pyd -> caffe2\python
copying build/temp.win-amd64-3.7/Release/torch/csrc/_C.cp37-win_amd64.lib -> build/lib.win-amd64-3.7/torch/lib/_C.lib
error: could not create 'build/lib.win-amd64-3.7/torch/lib/_C.lib': No such file or directory
~~~
When `python setup.py install` is executed, `torch/lib` has already been created by an earlier step (which copies many files), so this copy succeeds. In develop mode that step is not executed, so this copy fails.
This patch creates the `torch/lib` directory if it does not exist.
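The general pattern is to create the destination's parent directory before copying. The following is a minimal sketch with the standard library, not the code in `setup.py`; `copy_with_parent_dirs` is a hypothetical helper and the paths live in a temporary directory.

```python
import os
import shutil
import tempfile


def copy_with_parent_dirs(src, dst):
    """Copy src to dst, creating dst's parent directory if it is missing."""
    parent = os.path.dirname(dst)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    shutil.copyfile(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "_C.lib")
    with open(src, "wb") as handle:
        handle.write(b"stub")
    dst = os.path.join(tmp, "build", "torch", "lib", "_C.lib")
    copy_with_parent_dirs(src, dst)
    copied = os.path.exists(dst)
```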
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18666
Differential Revision:
D14704269
Pulled By: ezyang
fbshipit-source-id:
b2d7c698a906b945bf34bb78f17b91b4fdfd3294
Sacha [Mon, 1 Apr 2019 14:23:06 +0000 (07:23 -0700)]
Move flags that do not work on MSVC (#18686)
Summary:
MSVC errors on these flags as they are not supported
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18686
Differential Revision:
D14704254
Pulled By: ezyang
fbshipit-source-id:
936d33ed6b7474d7774a49505cdac50dbe8dd99a
Junjie Bai [Mon, 1 Apr 2019 05:26:27 +0000 (22:26 -0700)]
Fix unused lambda capture warnings (#18662)
Summary:
```
aten/src/ATen/native/cpu/DistanceOpsKernel.cpp.DEFAULT.cpp:109:104: warning: lambda capture 'combs' is not used [-Wunused-lambda-capture]
parallel_for(0, combs, internal::GRAIN_SIZE / (16 * m), [p, self_start, self_end, n, m, res_start, combs](int64_t k, int64_t end) {
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18662
Differential Revision:
D14699379
Pulled By: bddppq
fbshipit-source-id:
5062d4327bb5f7b485c2ffa30c98e10576416f03
Jongsoo Park [Mon, 1 Apr 2019 04:25:17 +0000 (21:25 -0700)]
handle a rare case of histogram min is inf/nan (#18239)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18239
When min is inf or nan, we get UBSAN errors
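One way to guard against this kind of input, sketched in Python rather than taken from the patch, is to refuse to compute a bin index when the histogram range is not finite. `bin_index` and its parameters are hypothetical.

```python
import math


def bin_index(value, hist_min, hist_max, num_bins):
    """Return the bin for value, or None when the range is unusable."""
    if not (math.isfinite(hist_min) and math.isfinite(hist_max)):
        return None  # inf/nan bounds would make the arithmetic below meaningless
    if not math.isfinite(value) or hist_max <= hist_min:
        return None
    width = (hist_max - hist_min) / num_bins
    index = int((value - hist_min) / width)
    # Clamp so that value == hist_max lands in the last bin.
    return min(max(index, 0), num_bins - 1)
```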
Reviewed By: csummersea
Differential Revision:
D14537668
fbshipit-source-id:
e70ffb5ecd2b10793356070c69fdabf8f25b203e
Edward Yang [Mon, 1 Apr 2019 02:08:03 +0000 (19:08 -0700)]
Delete duplicated technical content from contribution_guide.rst (#18628)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18628
ghimport-source-id:
d94b81a6f303883d97beaae25344fd591e13ce52
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18629 Provide flake8 install instructions.
* **#18628 Delete duplicated technical content from contribution_guide.rst**
There's a useful guide in contribution_guide.rst, but the
technical bits were straight up copy-pasted from CONTRIBUTING.md,
and I don't think it makes sense to break the CONTRIBUTING.md
link. Instead, I deleted the duplicate bits and added a cross
reference to the rst document.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision:
D14701003
fbshipit-source-id:
3bbb102fae225cbda27628a59138bba769bfa288
Edward Yang [Mon, 1 Apr 2019 01:56:12 +0000 (18:56 -0700)]
Provide flake8 install instructions. (#18629)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18629
ghimport-source-id:
66a8871c56ffcfa7d4bfdf601e180fae99194e28
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18629 Provide flake8 install instructions.**
* #18628 Delete duplicated technical content from contribution_guide.rst
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision:
D14701004
fbshipit-source-id:
b64292f0ef01b7894cf6b9ff8d5fd9e921c8d162
Rui Zhu [Mon, 1 Apr 2019 00:37:17 +0000 (17:37 -0700)]
Adding quantized tensor shape/type info support for caffe2=>glow in caffe2 side (#18621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18621
This diff added caffe2 support for onnxifi quantization.
Reviewed By: yinghai
Differential Revision:
D14648767
fbshipit-source-id:
4ddb492cacbba6142305866e6dbb875880acaea3
David Riazati [Sun, 31 Mar 2019 23:17:28 +0000 (16:17 -0700)]
Fix test on windows (#18667)
Summary:
Breakage in #18188
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18667
Differential Revision:
D14700133
Pulled By: driazati
fbshipit-source-id:
4cc26bd579fc1b074b3bef6046cc1030facee130
Ailing Zhang [Sun, 31 Mar 2019 15:41:46 +0000 (08:41 -0700)]
Enforce check ad in test_jit (#18509)
Summary:
If a test triggers autodiff, it must have a `DifferentiableGraph` in its differentiated forward graph, and this subgraph must have either the original aten node or the corresponding nodes used in the AD formula.
Typically a forward differentiable graph looks like this:
```
graph(%i0 : Float(),
%i1 : Float()):
%3 : Float() = prim::DifferentiableGraph_0(%i0, %i1)
return (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Float(),
%1 : Float()):
%2 : Float() = aten::max(%0, %1)
return (%2)
```
which tells us `aten::max(Tensor self, Tensor other) -> Tensor` is symbolically differentiable.
Update: there are a lot of cases (fusions/ConstantChunk/python implementations) that break it, so I decided to make the check optionally take node names if they differ from the function name.
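A text-based stand-in for such a check is sketched below, using the graph printed above as input. The real check in test_jit works on graph objects; `has_differentiable_nodes` is a hypothetical helper that only inspects the printed form.

```python
GRAPH_TEXT = """\
graph(%i0 : Float(),
%i1 : Float()):
%3 : Float() = prim::DifferentiableGraph_0(%i0, %i1)
return (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Float(),
%1 : Float()):
%2 : Float() = aten::max(%0, %1)
return (%2)
"""


def has_differentiable_nodes(graph_text, expected_nodes):
    """True if a DifferentiableGraph subgraph contains every expected node."""
    marker = "with prim::DifferentiableGraph"
    _, found, subgraph = graph_text.partition(marker)
    if not found:
        return False  # autodiff was not triggered at all
    return all(node in subgraph for node in expected_nodes)


max_is_differentiable = has_differentiable_nodes(GRAPH_TEXT, ["aten::max"])
```

Passing a list of node names mirrors the optional argument described in the update: a test for `chunk`, for example, could pass the name of the prim node that replaces it.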
~~[OLD]Theoretically I could also check if `aten::max` is in the differentiable block or not to be more precise, but there're also cases like `chunk` where in a differentiable block it's replaced with a prim node (ConstantChunk) and we will have to special case them. Any suggestions here (to be more precise or no) is very welcome!~~
We used to have a list of nn tests that should be run against AD; I moved it to a field set when constructing each test (either torch or nn). I think it's cleaner this way, and it matches the fact that for a given op we may support one schema but not all: this way we can turn on just the test that triggers the supported schema.
cc: apaszke zdevito wanchaol ngimel for a review
[Done] :
- Going through a manual second pass of all tests to check if they should enable AD test or not....
- Add a readme about how to add AD for an op and how to add/enable its test in test_jit.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18509
Differential Revision:
D14696811
Pulled By: ailzhang
fbshipit-source-id:
c5e693277baac585cd3aed5ab2c0e7faa5e6f29f
Junjie Bai [Sun, 31 Mar 2019 09:05:14 +0000 (02:05 -0700)]
Use proper isnan check
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18663
Differential Revision:
D14699385
Pulled By: bddppq
fbshipit-source-id:
596ad3371e7704802591e49f7e1c55dc6cd2896f
Soumith Chintala [Sat, 30 Mar 2019 20:24:11 +0000 (13:24 -0700)]
pad_circular -> _pad_circular (#18608)
Summary:
pad_circular is really private, as circular padding is exposed via `F.pad`
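For readers unfamiliar with circular padding, here is a 1-D sketch in plain Python of what the mode does: values wrap around from the opposite end. It is an illustration only, not the implementation behind `F.pad`; `circular_pad_1d` is a hypothetical name.

```python
def circular_pad_1d(values, left, right):
    """Pad a list by wrapping elements around from the opposite end."""
    n = len(values)
    if left > n or right > n:
        raise ValueError("padding must not exceed the sequence length")
    head = values[n - left:]  # last `left` elements; empty when left == 0
    tail = values[:right]     # first `right` elements
    return head + values + tail


padded = circular_pad_1d([1, 2, 3, 4], left=1, right=2)
```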
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18608
Differential Revision:
D14691704
Pulled By: soumith
fbshipit-source-id:
8c2f90596feed670976115041efed3ca071e8306
Will Feng [Sat, 30 Mar 2019 18:28:44 +0000 (11:28 -0700)]
Fix wrap(at::Scalar) (#18632)
Summary:
Problem:
```cpp
// This function expects a `Variable` as input
inline PyObject* wrap(at::Tensor tensor) {
  return THPVariable_Wrap(Variable(std::move(tensor)));
}
inline PyObject* wrap(at::Scalar scalar) {
  // This function calls `wrap(at::Tensor tensor)` (the function above), but since
  // `scalar_to_tensor(...)` returns a `Tensor` and not a `Variable`, the call to
  // `wrap(at::Tensor tensor)` will fail with "Tensor that was converted to Variable
  // was not actually a Variable", which is not what we want.
  return wrap(scalar_to_tensor(scalar));
}
```
The right fix is to call `make_variable(...)` with the tensor returned from `scalar_to_tensor(scalar)`.
This unblocks https://github.com/pytorch/pytorch/pull/18230 as it is the only patch that hits this code path now. All other native functions that return Scalar (such as `item()` or `_local_scalar_dense()`) either have a custom-defined implementation that doesn't go through this path, or are not exposed to Python at all.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18632
Differential Revision:
D14689293
Pulled By: yf225
fbshipit-source-id:
be7ba5d3de83a69533a2997de97ad92989ff78ee
Gao, Xiang [Sat, 30 Mar 2019 17:50:48 +0000 (10:50 -0700)]
Deprecated type() -> scalar_type()
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18642
Differential Revision:
D14696848
Pulled By: ezyang
fbshipit-source-id:
43d1f86ecee5f6c6c5b70fd7d0e2063c3fc473ab
Edward Yang [Sat, 30 Mar 2019 15:58:10 +0000 (08:58 -0700)]
Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id:
c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments. flake8-3 will
report an import unused; flake8-2 will not. For now, I just
noqa'd all these sites.
All the changes were done by hand.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision:
D14687478
fbshipit-source-id:
30d532381e914091aadfa0d2a5a89404819663e3
ryan [Sat, 30 Mar 2019 08:20:55 +0000 (01:20 -0700)]
Update documentation for CTCLoss (#18415)
Summary:
This is meant to resolve #18249, where I pointed out a few things that could improve the CTCLoss docs.
My main goal was to clarify:
- Target sequences are sequences of class indices, excluding the blank index
- Lengths of `target` and `input` are needed for masking unequal length sequences, and do not necessarily equal S, which is the length of the longest sequence in the batch.
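The relationship between padded targets and their lengths can be sketched in plain Python. `pad_targets` is a hypothetical helper; in PyTorch the resulting lengths would be what gets passed as `target_lengths` to `CTCLoss`, and the padding value is irrelevant because those positions are masked out.

```python
def pad_targets(targets, pad_value=0):
    """Pad class-index sequences to the longest one and record true lengths."""
    lengths = [len(t) for t in targets]
    longest = max(lengths)  # this is S for the batch
    padded = [list(t) + [pad_value] * (longest - len(t)) for t in targets]
    return padded, lengths


# Class indices start at 1 here because index 0 is assumed to be the blank.
padded_targets, target_lengths = pad_targets([[1, 2, 3], [4]])
```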
I thought about Thomas's suggestion to link the distill.pub article, but I'm not sure about it. I think that should be up to y'all to decide.
I have no experience with .rst, so it might not render as expected :)
t-vi ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18415
Differential Revision:
D14691969
Pulled By: soumith
fbshipit-source-id:
381a2d52307174661c58053ae9dfae6e40cbfd46
Sebastian Messmer [Sat, 30 Mar 2019 07:03:46 +0000 (00:03 -0700)]
Fallback kernels (#18443)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18443
Allow registering a kernel without a dispatch key. In this case, the kernel becomes a fallback kernel that is called whenever no other kernel matches.
This is also useful for the legacy function based API (since that API doesn't know about dispatch keys) or any other custom ops that don't care about dispatch
and just want one kernel to be called no matter the dispatch key.
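The lookup rule can be sketched as a toy registry in Python. This is not the c10 dispatcher: `OperatorTable`, its methods, and the string dispatch keys are hypothetical and only model "use the matching kernel, else the fallback, else report an error".

```python
class OperatorTable:
    """Toy kernel table: per-key kernels plus an optional fallback."""

    def __init__(self):
        self._kernels = {}
        self._fallback = None

    def register(self, kernel, dispatch_key=None):
        # Registering without a dispatch key installs the fallback kernel.
        if dispatch_key is None:
            self._fallback = kernel
        else:
            self._kernels[dispatch_key] = kernel

    def call(self, dispatch_key, *args):
        kernel = self._kernels.get(dispatch_key, self._fallback)
        if kernel is None:
            raise LookupError(f"no kernel registered for {dispatch_key!r}")
        return kernel(*args)


table = OperatorTable()
table.register(lambda x: ("cpu", x), dispatch_key="CPU")
table.register(lambda x: ("fallback", x))
cpu_result = table.call("CPU", 1)
other_result = table.call("CUDA", 1)
```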
Reviewed By: dzhulgakov
Differential Revision:
D14603258
fbshipit-source-id:
242dc8871dad2989ca25079854d0cc97429e7199
Sebastian Messmer [Sat, 30 Mar 2019 07:03:46 +0000 (00:03 -0700)]
Introduce lambda-based kernel API (#18541)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18541
Allow registering lambdas as c10 kernels.
Reviewed By: dzhulgakov
Differential Revision:
D14653005
fbshipit-source-id:
f867cc776b1339e83b7a2e1935f5cf924cfba44a
Sebastian Messmer [Sat, 30 Mar 2019 07:03:46 +0000 (00:03 -0700)]
Report better errors when kernel or dispatch key are missing (#18302)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18302
These might be use cases we want to support in the future, but they don't work yet.
Let's at least report an error instead of doing segfaults or worse.
Reviewed By: dzhulgakov
Differential Revision:
D14572346
fbshipit-source-id:
49262ce131493bc887defe2978d8b22f202cd8cc
Sebastian Messmer [Sat, 30 Mar 2019 07:03:44 +0000 (00:03 -0700)]
Move stuff to cpp files (#18301)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18301
Move code out of headers and templates into source files and non-templates.
Reviewed By: dzhulgakov
Differential Revision:
D14572347
fbshipit-source-id:
9fd5d62d54000a95e93076cd73f591ba2c5c2653
Sebastian Messmer [Sat, 30 Mar 2019 07:03:44 +0000 (00:03 -0700)]
Check kernel against function schema in c10 op registration (#18256)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18256
This diff infers the function schema from the kernel function/functor and checks that it matches the specified function schema.
This diff does not yet allow omitting the function schema in the registration API. That will come in a future diff.
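An analogous check can be sketched in Python with `inspect.signature`: infer argument and return types from the kernel's annotations and compare them with the declared schema. The c10 implementation does this with C++ templates; `schema_mismatches` and `add_kernel` are hypothetical names used only here.

```python
import inspect


def schema_mismatches(kernel, arg_types, return_type):
    """Compare a kernel's annotated signature with a declared schema."""
    signature = inspect.signature(kernel)
    inferred_args = [p.annotation for p in signature.parameters.values()]
    problems = []
    if inferred_args != list(arg_types):
        problems.append(
            f"arguments: kernel has {inferred_args}, schema has {list(arg_types)}"
        )
    if signature.return_annotation is not return_type:
        problems.append(
            f"return: kernel has {signature.return_annotation}, schema has {return_type}"
        )
    return problems


def add_kernel(a: int, b: int) -> int:
    return a + b


matching = schema_mismatches(add_kernel, [int, int], int)
mismatching = schema_mismatches(add_kernel, [int, float], int)
```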
Reviewed By: dzhulgakov
Differential Revision:
D14552738
fbshipit-source-id:
00202b489ede19f26ae686c97416b38c72c11532
Sebastian Messmer [Sat, 30 Mar 2019 07:03:44 +0000 (00:03 -0700)]
Add functor- and function-based kernel registration API (#18162)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18162
- Adds the API to register a functor- and function-based kernel.
- Changes the experimental c10 ops to use this new API instead of the old one
- Deletes the old APIs in KernelRegistration.h and OpSchemaRegistration.h
Reviewed By: dzhulgakov
Differential Revision:
D14514239
fbshipit-source-id:
35b2f6e8f62964e54886450a6a5fac812ed20f26
Sebastian Messmer [Sat, 30 Mar 2019 07:03:43 +0000 (00:03 -0700)]
New operator registration MVP (#18161)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18161
This introduces version 0 for the new operator registration.
For now, it only works with kernels that are defined as stack-based functions.
This is actually not the intended public API for defining kernels, but it's the basis which is going to be used to define the public APIs (see diffs on top for them),
and it's also the API used for exposing caffe2 operators.
This diff also switches the mechanism for exposing caffe2 operators to the new mechanism.
Reviewed By: dzhulgakov
Differential Revision:
D14514231
fbshipit-source-id:
454ab7b5b46a10203aa27b175400d23f818dd1df
Junjie Bai [Sat, 30 Mar 2019 05:42:18 +0000 (22:42 -0700)]
Fix trt installation in CI (#18609)
Summary:
caffe2_py2_cuda9_0_cudnn7_ubuntu16_04_build is failing
```
...
Mar 29 04:44:46 Need to get 174 MB of archives.
Mar 29 04:44:46 After this operation, 576 MB of additional disk space will be used.
Mar 29 04:44:46 Do you want to continue? [Y/n] Abort.
Exited with code 1
...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18609
Differential Revision:
D14694990
Pulled By: bddppq
fbshipit-source-id:
260446a8650f660a2baf123a3f17efdf0a8d6c64
David Riazati [Sat, 30 Mar 2019 02:06:06 +0000 (19:06 -0700)]
Attribute serialization improvements (#18188)
Summary:
* adds attributes to `ScriptModule.__getattr__` so they can be accessed in Python after re-importing
* full support for all the possible values for an `int64_t`
* this necessitated a bunch more `pushWhatever` functions, so re-introduced a templated version to cut down on duplicate code
* tests to validate references / value sharing works
* adds `torch.jit.Unpickler` which people can use to de-serialize the pickle files into Python / have a quick reference on how to do this without PyTorch
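To see why the full `int64_t` range needs more than one encoding, the sketch below uses Python's stock `pickle` and `pickletools` modules (not `torch.jit.Unpickler` or the C++ pickler) to round-trip boundary values and report which opcode protocol 2 picks for each. `int_opcode` is a hypothetical helper.

```python
import pickle
import pickletools

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1

values = [0, -1, 2 ** 31 - 1, 2 ** 31, INT64_MIN, INT64_MAX]
restored = pickle.loads(pickle.dumps(values, protocol=2))


def int_opcode(value):
    """Name of the opcode protocol 2 uses to store a single integer."""
    data = pickle.dumps(value, protocol=2)
    names = [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]
    # names is [PROTO, <integer opcode>, STOP] for a lone int
    return names[1]


small_opcode = int_opcode(2 ** 31 - 1)
large_opcode = int_opcode(2 ** 31)
```

Values that fit in a signed 32-bit field use a fixed-width opcode, while larger ones switch to a length-prefixed form, which is presumably the kind of case distinction a hand-written pickler has to reproduce.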
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18188
Differential Revision:
D14527490
Pulled By: driazati
fbshipit-source-id:
efd15579cc04aa2e28c4b2c9490d82d849dee559
Cheng,Penghui [Sat, 30 Mar 2019 01:51:50 +0000 (18:51 -0700)]
support pre-convert filter format for mkldnn training mode and change 'OptimizeForIdeep' to 'OptimizeForMkldnn' (#15171)
Summary:
For MKL-DNN, the filter data is reordered to the primitive format, which takes a lot of time.
So this patch provides a method to convert the filter format before training.
And "OptimizeForIdeep" will be changed to "OptimizeForMkldnn" in this patch.
This patch depends on https://github.com/pytorch/pytorch/pull/12866
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15171
Differential Revision:
D14590741
Pulled By: yinghai
fbshipit-source-id:
07971c9977edac3c8eec08ca2c39cda639683492
Jerry Zhang [Sat, 30 Mar 2019 01:26:07 +0000 (18:26 -0700)]
Tensor construction codemod(raw_mutable_data) (#16373)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16373
motivation: https://github.com/pytorch/pytorch/pull/12407
This is a manual diff.
most of the fixes should be:
```
auto* Y = Output(0);
Y->Resize(dims);
Y->raw_mutable_data(dtype);
```
-->
```
auto* Y = Output(0, dims, at::dtype(dtype));
```
But there might be other cases.
Reviewed By: dzhulgakov
Differential Revision:
D13725460
fbshipit-source-id:
649a4b0e42f62cda1a60171dd9fa3e440dc9dca1