platform/upstream/pytorch.git
Ilia Cherniavskii [Thu, 22 Nov 2018 01:19:37 +0000 (17:19 -0800)]
Remove extra include

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14206

Reviewed By: dzhulgakov

Differential Revision: D13131318

fbshipit-source-id: 559b55b8d98cdf6b7d1d3e31237c5473edc5e462

Teng Li [Thu, 22 Nov 2018 00:54:36 +0000 (16:54 -0800)]
Removed redundant allreduce options in DDP (#14208)

Summary:
This somehow was not cleaned up after the C++ migration. It is unused and can be removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14208

Differential Revision: D13132492

Pulled By: teng-li

fbshipit-source-id: 0f05b6368174664ebb2560c037347c8eb45f7c38

David Riazati [Thu, 22 Nov 2018 00:30:43 +0000 (16:30 -0800)]
Add list inequality operator (#14129)

Summary:
This PR adds `aten::neq` for list inequality comparisons and converts
`nll_loss` to weak script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14129

Differential Revision: D13123894

Pulled By: driazati

fbshipit-source-id: 8c1edf7c163217ec00eb653f95d196db3998613f

Yinghai Lu [Wed, 21 Nov 2018 23:43:10 +0000 (15:43 -0800)]
Add onnxifi support to SparseLengthsWeightedSum (#14210)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14210

We left out `SparseLengthsWeightedSum` because the benchmark was not testing it due to an fp16 filler issue. The gap was flushed out by unit tests, hence we add the support here.

Reviewed By: bddppq

Differential Revision: D13132320

fbshipit-source-id: b21c30c185c9e1fbf3980641bc3cdc39e85af2e1

Gu, Jinghui [Wed, 21 Nov 2018 23:42:29 +0000 (15:42 -0800)]
Add "axis" and "axis_w" arguments in FC to support a customized axis to reduce dim. (#12971)

Summary:
Add "axis" and "axis_w" arguments in FC to support a customized axis to reduce dim.
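The reduction the argument describes can be illustrated with NumPy: FC coerces its input to 2D by collapsing all dimensions from `axis` onward before the matmul (a hedged sketch of the semantics, not caffe2 code):

```python
import numpy as np

# Sketch of FC's axis semantics: dims before `axis` form the batch dimension,
# dims from `axis` onward are flattened into the feature dimension.
X = np.arange(24, dtype=float).reshape(2, 3, 4)
axis = 2
outer = int(np.prod(X.shape[:axis]))   # 2 * 3 = 6
X2 = X.reshape(outer, -1)              # coerced input: (6, 4)
W = np.ones((5, 4))                    # weights: (num_output, inner)
Y = X2 @ W.T                           # output: (6, 5)
```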
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12971

Reviewed By: bddppq

Differential Revision: D12850675

Pulled By: yinghai

fbshipit-source-id: f1cde163201bd7add53b8475329db1f038a73019

Viswanath Sivakumar [Wed, 21 Nov 2018 21:42:04 +0000 (13:42 -0800)]
IDEEP fallback for ResizeNearest op (#14212)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14212

TSIA

Reviewed By: yinghai

Differential Revision: D13134134

fbshipit-source-id: e3c5c9c8756d6e25b213f8dde9d809a44373d7a3

zrphercule [Wed, 21 Nov 2018 21:12:18 +0000 (13:12 -0800)]
Fix ONNX_ATEN mode (#14239)

Summary:
Fix ONNX_ATEN mode by adding it to the validateBlock method.
Before this PR, validateBlock would throw an exception when using this mode.

I will add related test cases for ONNX_ATEN mode in a different PR once this is merged, since we don't have any currently.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14239

Differential Revision: D13145443

Pulled By: zrphercule

fbshipit-source-id: 60e7942aa126acfe67bdb428ef231ac3066234b1

Pieter Noordhuis [Wed, 21 Nov 2018 19:25:42 +0000 (11:25 -0800)]
Bump gloo (#14281)

Summary:
Includes more robust error handling and timeout support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14281

Differential Revision: D13158232

Pulled By: pietern

fbshipit-source-id: e80432799a020576d5abdcd9a21d66b629479caf

Jongsoo Park [Wed, 21 Nov 2018 17:37:58 +0000 (09:37 -0800)]
fix comment on dnnlowp op arguments (#14265)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14265

Fix comment

Reviewed By: hx89

Differential Revision: D13152106

fbshipit-source-id: fbe98906963cbd5cb20a583a737a792fbc38292e

Gregory Chanan [Wed, 21 Nov 2018 17:04:59 +0000 (09:04 -0800)]
native NN wrappers, including with buffers.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14256

Differential Revision: D13148783

Pulled By: gchanan

fbshipit-source-id: 4b6179033cf1df26061b6731eaaa4e008692e592

Pieter Noordhuis [Wed, 21 Nov 2018 16:43:14 +0000 (08:43 -0800)]
Remove header generated at configuration time (#14244)

Summary:
The build was picking up the empty stub header instead of the generated
one. Because of the large number of include paths we end up passing to
the compiler, it is brittle to have both an empty stub file and a
generated file and expect the compiler to pick up the right one.

With the recent change to compile everything from a single CMake run, we
can now use native CMake facilities to propagate macros that indicate
backend support. The target_compile_definitions stanzas with the
INTERFACE flag ensure that these macros are set only for downstream
consumers of the c10d target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14244

Reviewed By: teng-li

Differential Revision: D13144293

Pulled By: pietern

fbshipit-source-id: f49324220db689c68c126b159f4f00a8b9bc1252

Zachary DeVito [Wed, 21 Nov 2018 14:36:26 +0000 (06:36 -0800)]
Address jittering issues in python_print (#14064)

Summary:
export - print a method with python_print
import - import a method with import_method

We want to ensure:

    export(g) == export(import(export(g)))

That is, after exporting/importing once, the graph will stay exactly
the same. This is less strict than g == import(export(g)), which would
require us to maintain a lot more information about the structure of the
IR and about the names of debug symbols.

This PR addresses this with the following fixes:
* print out double-precision numbers with high enough precision such
  that they always parse in the same way
* when creating loop-carried dependencies, sort them
  by variable name, ensuring a consistent order
* parse nan correctly
* DCE: remove unused outputs of if statements, and loop-carried dependencies
  in loops that are dead both after the loop and inside the body of the
  loop.
* Do not set uniqueName for variables whose names are _[0-9]+; these
  are probably rare in user code, and we need a way to communicate
  that we do not care about a variable name when re-parsing the graph.
  Otherwise temporary variable names will jitter around.
* Expand the definition of a constant in printing code to None,
  and family.
* Allow re-treeing to work as long as the only thing in its way is a
  constant node. These do not have side effects but are sometimes
  inserted in a different order when tracing compared to how we print them.
* Print all constant nodes out first in the order in which they are used
  (or, if they are inlined, ensure they get assigned CONSTANT.cX numbers
  in a consistent order). Clean up tuples (this is done in the compiler,
  but not in the tracer, leading to some tuple indexing jitter if not
  done).
* use strtod_l, not std::stod which can throw exceptions

Other:
* Add REL_WITH_DEB_INFO to setup.py. It already existed for the
  cmake files. Threading it into setup.py allows us to turn on
  debug symbols with optimization everywhere.
* enable round trip testing for all generated graphs. This only adds
  ~6 seconds to total build time but tests printing for every graph.
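The round-trip invariant and the float-printing bullet can be sketched as follows (stand-in function names; the real entry points are python_print and import_method):

```python
# Sketch of the stability property the PR enforces; `export_fn` and
# `import_fn` stand in for python_print / import_method.
def round_trip_stable(graph, export_fn, import_fn):
    once = export_fn(graph)
    twice = export_fn(import_fn(once))
    return once == twice

# The double-precision bullet in action: 17 significant digits are enough
# for a printed double to parse back to the exact same value.
x = 0.1 + 0.2
assert float("%.17g" % x) == x
```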
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14064

Differential Revision: D13094637

Pulled By: zdevito

fbshipit-source-id: 0a1c6912194d965f15d6b0c6cf838ccc551f161d

svcscm [Wed, 21 Nov 2018 10:16:29 +0000 (02:16 -0800)]
Updating submodules

Reviewed By: cdelahousse

fbshipit-source-id: 27838fb2dad82c78906faf3cc2d124557c30e88f

svcscm [Wed, 21 Nov 2018 08:25:17 +0000 (00:25 -0800)]
Updating submodules

Reviewed By: cdelahousse

fbshipit-source-id: 3c17e12a579245a84e9a56b1d8a1641232150675

Lu Fang [Wed, 21 Nov 2018 07:33:30 +0000 (23:33 -0800)]
Add tensor table in ModelDef and use it for jit script serialization and deserialization (#13861)

Summary:
As we discussed, the tensors in the torch script will be associated with the tensor data in the serialized file. So let's add a table of tensors (actually a repeated TensorProto field) in the ModelDef. TensorProto.name will be the id.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13861

Reviewed By: dzhulgakov

Differential Revision: D13036940

Pulled By: zrphercule

fbshipit-source-id: ecb91b062ac4bc26af2a8d6d12c91d5614efd559

Tongzhou Wang [Wed, 21 Nov 2018 07:27:16 +0000 (23:27 -0800)]
c10d Automatically retry on EINTR (#14180)

Summary:
Probably fixes https://github.com/pytorch/pytorch/issues/14170

Actually, I probably shouldn't retry all `SYSCHECK` calls. I'll leave it to the reviewers to decide.
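The retry pattern under discussion, sketched generically in Python (the real change wraps c10d's C++ SYSCHECK'd syscalls):

```python
import errno

def retry_on_eintr(fn, *args, **kwargs):
    """Re-issue an interruptible call until it stops failing with EINTR."""
    while True:
        try:
            return fn(*args, **kwargs)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise  # any other error propagates; only EINTR is retried
```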
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14180

Reviewed By: pietern

Differential Revision: D13144741

Pulled By: SsnL

fbshipit-source-id: d73288f76b18cae14b1b43dad4e5e8d010a96d95

Teng Li [Wed, 21 Nov 2018 05:10:18 +0000 (21:10 -0800)]
Make NCCL backend support barrier op (#14142)

Summary:
This is a feature request from: https://github.com/pytorch/pytorch/issues/13573

As the title says, this PR makes NCCL backend support barrier op.

There are a couple of scenarios that need to be addressed:
(1) When an NCCL op has already happened, we need to record which GPU device(s) the previous op ran on and queue the allreduce barrier op on the same device(s).
(2) When there is no NCCL op yet, we will try to assign each process a separate single GPU as a best effort.

As for the async work, during wait we not only wait for the NCCL kernel to complete, but also block the thread until both the current stream and the NCCL stream return.

`test_distributed` should cover this. I also manually tested both scenarios.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14142

Differential Revision: D13113391

Pulled By: teng-li

fbshipit-source-id: 96c33d4d129e2977e6892d85d0fc449424c35499

Yinghai Lu [Wed, 21 Nov 2018 02:00:14 +0000 (18:00 -0800)]
Fix memory leakage in onnxifi transformer (#14245)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14245

tsia

Reviewed By: bddppq, rdzhabarov

Differential Revision: D13144783

fbshipit-source-id: 5e07bb7ab883ba1af68547a26272cd320967b9e3

David Riazati [Wed, 21 Nov 2018 00:42:00 +0000 (16:42 -0800)]
Allow undefined tensors as constants (#14120)

Summary:
This PR inserts `prim::None` constants for undefined tensors. This comes in the standard library if an `Optional[Tensor]` is statically determined to be `None`:

```python
@torch.jit.script
def fn(x=None):
    # type: (Optional[Tensor]) -> Tensor
    return torch.jit._unwrap_optional(x)

@torch.jit.script
def fn2():
    # type: () -> Tensor
    return fn()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14120

Differential Revision: D13124625

Pulled By: driazati

fbshipit-source-id: 9eaa82e478c49c503f68ed89d8c770e8273ea569

Wanchao Liang [Tue, 20 Nov 2018 22:09:27 +0000 (14:09 -0800)]
Export BatchNorm functional and module, add necessary JIT support (#14016)

Summary:
This PR did three things:

1. It exports the BatchNorm functional and module, and rewrites some of the components to stay aligned with the currently supported JIT features
2. In the process of exporting, it adds the necessary compiler support for in-place augmented-assignment ops
3. It changes the test_jit behavior in add_module_test to utilize a single RNG state during module initialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016

Differential Revision: D13112064

Pulled By: wanchaol

fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91

Thomas Viehmann [Tue, 20 Nov 2018 20:43:23 +0000 (12:43 -0800)]
Have PYTORCH_FUSION_DEBUG print C kernel source (#14213)

Summary:
- Move handling of the environment variable up from CPU only to all backends
- Introduce two levels, enabled with PYTORCH_FUSION_DEBUG=n:
  1: print C source
  2: print CPU assembly, too (the previous effect of PYTORCH_FUSION_DEBUG)

apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14213

Differential Revision: D13135393

Pulled By: soumith

fbshipit-source-id: befa4ebea3b3c97e471393a9f6402b93a6b24031

Tugrul Ates [Tue, 20 Nov 2018 20:23:14 +0000 (12:23 -0800)]
Delete backwards compatibility StorageImpl.h and TensorImpl.h (#14230)

Summary:
Since they directly include the real ones in core.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14230

Differential Revision: D13140323

Pulled By: tugrulates

fbshipit-source-id: d7e3b94e891b2d7fa273d01c0b7edfebdbd7e368

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
remove unused parameters from caffe2_dnnlowp_utils.cc (#14164)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14164

See title

Reviewed By: csummersea

Differential Revision: D13115470

fbshipit-source-id: d754f558cd06e5f4c1cd00315e912cdb7b50731a

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
use pragma once (#14163)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14163

Some of the names we were using to guard the header files were too short (e.g. DYNAMIC_HISTOGRAM_H), risking collisions with unrelated headers.

Reviewed By: csummersea

Differential Revision: D13115451

fbshipit-source-id: cef8c84c62922616ceea17effff7bdf8d67302a2

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
format python files (#14161)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14161

Formatting using Nuclide

Reviewed By: hx89

Differential Revision: D13115348

fbshipit-source-id: 7432ce6072a1822d7287b4ebcfcb6309282e15ac

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
clang-format (#14160)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14160

clang-format of C++ files

Reviewed By: hx89

Differential Revision: D13115201

fbshipit-source-id: d2ad65f66209e00578ef90f87f41272de2d24aa9

Hui Wu [Tue, 20 Nov 2018 06:54:19 +0000 (22:54 -0800)]
Add sigmoid op based on MKL-DNN

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13097

Differential Revision: D13105366

Pulled By: yinghai

fbshipit-source-id: d156e8fd519baeecf61c25dcd8fa2c2fa7351ef4

Daya S Khudia [Tue, 20 Nov 2018 06:45:00 +0000 (22:45 -0800)]
OSS build fix (#14192)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14192

We can only use C10_* in OSS. The build is only broken if built with USE_FBGEMM=ON

Reviewed By: jianyuh

Differential Revision: D13121781

fbshipit-source-id: f0ee9a75997766e63e1da8a53de7ddb98296a171

Lu Fang [Tue, 20 Nov 2018 06:12:16 +0000 (22:12 -0800)]
Make EncodeMethod in jit script serialization return a string (#14167)

Summary:
Nit

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14167

Reviewed By: ezyang

Differential Revision: D13116584

Pulled By: dzhulgakov

fbshipit-source-id: c0e7e71a81004031564bd2fc59f393041e1283d5

Jongsoo Park [Tue, 20 Nov 2018 05:44:29 +0000 (21:44 -0800)]
Create README.md of caffe2/quantization/server

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14217

Reviewed By: csummersea

Differential Revision: D13135086

Pulled By: jspark1105

fbshipit-source-id: bddf4f1c2dc5ec8ea6ebe9e265956f367e082d52

Will Feng [Tue, 20 Nov 2018 05:28:29 +0000 (21:28 -0800)]
CircleCI: fix NCCL install (#14172)

Summary:
The `$BUILD_ENVIRONMENT` checks work in `test.sh` but not in `build.sh`; this PR fixes the issue.

This replaces https://github.com/pytorch/pytorch/pull/14124.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14172

Differential Revision: D13135087

Pulled By: yf225

fbshipit-source-id: 42fff3926734778713d483d74ba0a89e5502dd9e

zrphercule [Tue, 20 Nov 2018 02:43:58 +0000 (18:43 -0800)]
Fix a bug in test case of onnx::If

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14209

Differential Revision: D13132607

Pulled By: zrphercule

fbshipit-source-id: b7f7ccc6a6cbdeb57a7f88a1971d15dd81e6fc81

Teng Li [Tue, 20 Nov 2018 02:25:00 +0000 (18:25 -0800)]
Tensor type checking and informative error messages for torch.distributed (#14204)

Summary:
This will address https://github.com/pytorch/pytorch/issues/13574

This error message should be more informative to the user for all the non-multi-GPU ops, since we always python-bind to the multi-GPU ops.

test_distributed should cover all of this. I also tested both RuntimeErrors:

```
>>> a = torch.ByteTensor([])
>>> b = [a, a]
>>> dist.all_reduce(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 809, in all_reduce
    _check_single_tensor(tensor, "tensor")
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 207, in _check_single_tensor
    "to be a torch.Tensor type".format(param_name))
RuntimeError: Invalid function argument. Expecting parameter: tensor to be a torch.Tensor type

>>> b = ["b"]
>>> dist.all_gather(b, a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 1006, in all_gather
    _check_tensor_list(tensor_list, "tensor_list")
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 225, in _check_tensor_list
    "to be a List[torch.Tensor] type".format(param_name))
RuntimeError: Invalid function argument. Expecting parameter: tensor_list to be a List[torch.Tensor] type
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14204

Differential Revision: D13131526

Pulled By: teng-li

fbshipit-source-id: bca3d881e41044a013a6b90fa187e722b9dd45f2

Edward Yang [Tue, 20 Nov 2018 01:01:34 +0000 (17:01 -0800)]
Move stream functions from CUDAContext to CUDAStream (#14110)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14110

I'm planning to move CUDAStream to c10/cuda, without also moving
CUDAContext, and so it's most convenient if these definitions
are in the actual header file in question.

Reviewed By: smessmer

Differential Revision: D13104693

fbshipit-source-id: 23ce492003091adadaa5ca6a17124213005046c2

Edward Yang [Tue, 20 Nov 2018 01:01:34 +0000 (17:01 -0800)]
Move CUDAStreamInternals inside detail namespace. (#14109)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14109

Previously it was at the top level, because the author was under
the impression that you could only refer to top-level C++ names
from C, but this is not true; you just need to make a stub struct
conditioned on __cplusplus.

Reviewed By: smessmer

Differential Revision: D13104694

fbshipit-source-id: ecb7ae6dcfa4ab4e062aad7a886937dca15fd1b2

Edward Yang [Tue, 20 Nov 2018 01:01:33 +0000 (17:01 -0800)]
Delete dependencies from CUDAStream; remove synchronize_with (#13920)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13920

I want to move CUDAStream and CUDAGuard to c10_cuda without also
bringing along CUDAContext or CUDAEvent for the ride (at least for
now).  To do this, I need to eliminate those dependencies.

There are a few functions in CUDAContext.h which don't really need
THCState, so they're separated out and put in the general-purpose
c10/cuda/CUDAFunctions.h.

Reviewed By: smessmer

Differential Revision: D13047468

fbshipit-source-id: 7ed9d5e660f95805ab39d7af25892327edae050e

Yavuz Yetim [Mon, 19 Nov 2018 23:57:28 +0000 (15:57 -0800)]
Fix race in AtomicFetchAdd. (#13479)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13479

Increases the lock scope to above Output() calls.

These calls potentially allocate the underlying blob/tensor
objects and multiple invocations race each other over the
same output blobs/tensors.
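A minimal sketch of the fix, with illustrative names (the real operator is caffe2's C++ AtomicFetchAdd; Python's threading stands in for the C++ mutex):

```python
import threading

class AtomicFetchAddSketch:
    # Sketch of the fix: the lock covers the (lazy) output allocation too,
    # not just the increment, so racing invocations cannot both allocate.
    def __init__(self):
        self._lock = threading.Lock()
        self._slot = None

    def _output(self):
        # Stands in for Output(): may allocate the shared output on first use.
        if self._slot is None:
            self._slot = [0]
        return self._slot

    def run(self, value):
        with self._lock:          # taken BEFORE _output(), per the fix
            out = self._output()
            out[0] += value
            return out[0]
```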

Reviewed By: bwasti

Differential Revision: D12891629

fbshipit-source-id: a6015cfdb08e352521a1f062eb9d94a971cfbdb0

Sebastian Messmer [Mon, 19 Nov 2018 23:35:18 +0000 (15:35 -0800)]
Remove API macros from intrusive_ptr (#14137)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14137

This is a templated header-only class and shouldn't need export/import macros.

Reviewed By: ezyang

Differential Revision: D13111712

fbshipit-source-id: c8c958e75b090d011d25156af22f37f9ca605196

Jerry Zhang [Mon, 19 Nov 2018 23:29:45 +0000 (15:29 -0800)]
Tensor construction: combine Resize+mutable_data - 1/4 (#13942)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13942

Codemod generated with clangr shard mode, 25 files per diff,
motivation: https://github.com/pytorch/pytorch/pull/12407

Reviewed By: smessmer

Differential Revision: D13054770

fbshipit-source-id: a9e86e5dfcb4f7cebf5243e1d359fad064561bed

Jerry Zhang [Mon, 19 Nov 2018 23:25:43 +0000 (15:25 -0800)]
Tensor construction: combine Resize+mutable_data - 3/4 (#13944)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13944

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13854

Codemod generated with clangr shard mode, 25 files per diff,
motivation: https://github.com/pytorch/pytorch/pull/12407

Reviewed By: ezyang

Differential Revision: D13054836

fbshipit-source-id: 5de07a156687f1ee607d0450410881d9176a87a7

Lu Fang [Mon, 19 Nov 2018 22:29:31 +0000 (14:29 -0800)]
Store the optimize flag in module (#14166)

Summary:
When saving/loading a script module, we store the optimize flag in the module instead of encoding it in each method.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14166

Reviewed By: ezyang

Differential Revision: D13117577

Pulled By: dzhulgakov

fbshipit-source-id: dc322948bda0ac5809d8ef9a345497ebb8f33a61

Junjie Bai [Mon, 19 Nov 2018 22:21:20 +0000 (14:21 -0800)]
Cleanup caffe2 hipify exclude patterns (#14198)

Summary:
depthwise_3x3_conv_op.cu does not exist
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14198

Differential Revision: D13127479

Pulled By: bddppq

fbshipit-source-id: ec6bd434055a49ea405c4b399bde8c074114f955

Gregory Chanan [Mon, 19 Nov 2018 22:10:47 +0000 (14:10 -0800)]
Support 'python_module' of 'nn' in native functions. (#14126)

Summary:
Also move mse_loss, binary_cross_entropy, l1_loss to use this functionality.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14126

Reviewed By: ezyang

Differential Revision: D13109975

Pulled By: gchanan

fbshipit-source-id: 0b29dc8cf222d25db14da7532d8dc096a988a0ec

Junjie Bai [Mon, 19 Nov 2018 21:25:32 +0000 (13:25 -0800)]
Use onnx proto_utils to support using protobuf-lite

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14150

Differential Revision: D13115586

Pulled By: bddppq

fbshipit-source-id: d6b6935a8deac60f6f58d62a71f6840182a72a51

Daya S Khudia [Mon, 19 Nov 2018 20:08:35 +0000 (12:08 -0800)]
Use fbgemm revision file added by shipit (#14105)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14105

Pull Request resolved: https://github.com/facebook/fbshipit/pull/62

Use the fbgemm revision file created by ShipIt to update the fbgemm revision for pytorch. We don't have to manually update the submodule now.

Reviewed By: yns88

Differential Revision: D13072074

fbshipit-source-id: bef9eabad50f7140179c370a60bd9ca73067b9b5

Your Name [Mon, 19 Nov 2018 19:26:38 +0000 (11:26 -0800)]
Setup sccache for PyTorch ROCm CI (#14153)

Summary:
Discovered a huge build time difference between the caffe2 ROCm build and the pytorch ROCm build (6 min vs. 30 min); it turns out the sccache setup present in the caffe2 docker images is not in the pytorch build script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14153

Differential Revision: D13115097

Pulled By: bddppq

fbshipit-source-id: 88414f164b980f0e667c8e138479b4a75ab7692e

Ailing Zhang [Mon, 19 Nov 2018 17:45:28 +0000 (09:45 -0800)]
allow empty index for scatter_* methods (#14077)

Summary:
Fixes #2027
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14077

Differential Revision: D13095788

Pulled By: ailzhang

fbshipit-source-id: ad2c8bbf83d36e07940782b9206fbdcde8905fd3

ArmenAg [Mon, 19 Nov 2018 17:18:45 +0000 (09:18 -0800)]
use at::Device throughout JIT (#14181)

Summary:
zdevito soumith

Sorry about the previous PR, I had some git issues. This is the exact same code as the previous PR, but updated w.r.t. pytorch/master.

fixes #13254
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14181

Differential Revision: D13117688

Pulled By: soumith

fbshipit-source-id: 044840b2c7a0101ef43dd16655fd9a0f9981f53f

Gregory Chanan [Mon, 19 Nov 2018 16:18:47 +0000 (08:18 -0800)]
Support named return arguments in native_functions. (#14100)

Summary:
Note there was a hacky way of doing this before by specifying "return:" lists manually; this makes the
return names part of the function declaration itself.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14100

Differential Revision: D13101810

Pulled By: gchanan

fbshipit-source-id: 1c80574cd4e8263764fc65126427b122fe36df35

Edward Yang [Mon, 19 Nov 2018 16:13:08 +0000 (08:13 -0800)]
Split out CUDAMultiStreamGuard from CUDAGuard (#13912)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13912

The implementation and API of CUDAMultiStreamGuard are less mature,
and it cannot be implemented generically (yet) in c10_cuda. This
might be a reasonable thing to do eventually, but not for now.

Reviewed By: smessmer

Differential Revision: D13046500

fbshipit-source-id: 4ea39ca1344f1ad5ae7c82c98617aa348c327848

Edward Yang [Mon, 19 Nov 2018 16:13:08 +0000 (08:13 -0800)]
Move AT_CUDA_CHECK to c10

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13910

Reviewed By: smessmer

Differential Revision: D13046201

fbshipit-source-id: 8d360a0e4d6c2edf070d130e600c6b04f0ee0058

Edward Yang [Mon, 19 Nov 2018 16:13:07 +0000 (08:13 -0800)]
Add c10 cuda library. (#13900)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13900

Add c10 cuda library.

Right now, this is not used by anything, and only tests that the CUDA
headers are available (and not, e.g., that linking works).

Extra changes:
- cmake/public/cuda.cmake is now correctly include-guarded, so you
  can include it multiple times without trouble.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Reviewed By: smessmer

Differential Revision: D13025313

fbshipit-source-id: fda85b4c35783ffb48ddd6bbb98dbd9154119d86

Marat Dukhan [Mon, 19 Nov 2018 07:55:01 +0000 (23:55 -0800)]
Switch Int8Add operator to QNNPACK (#14089)

Summary:
- Improved single-threaded performance due to optimized low-level micro-kernels
- Improved parallelization (previously was parallelized across images in a batch and pixels only, now within channels as well)
- Slightly different results due to a different implementation of fixed-point arithmetic (no accuracy loss expected)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14089

Differential Revision: D13110135

Pulled By: Maratyszcza

fbshipit-source-id: 1f149394af5c16940f79a3fd36e183bba1be2497

Teng Li [Sun, 18 Nov 2018 21:51:15 +0000 (13:51 -0800)]
No more -werror for c10d (#14155)

Summary:
As the title says
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14155

Differential Revision: D13115769

Pulled By: teng-li

fbshipit-source-id: 278deba090364544d92fa603621604ce37fa974e

Summer Deng [Sun, 18 Nov 2018 20:49:39 +0000 (12:49 -0800)]
Add ultra low precision options (#14133)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14133

Experiment with ultra low precisions on the Resnext-101 URU trunk model

Reviewed By: jspark1105

Differential Revision: D10108518

fbshipit-source-id: f04d74fbe1c9e75efafcd9845719bdb2efbbfe9c

Soumith Chintala [Sun, 18 Nov 2018 17:20:29 +0000 (09:20 -0800)]
Adds symbolic diff for THNN Conv2d and aten native BatchNorm (#13888)

Summary:
Adds symbolic diff and tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13888

Differential Revision: D13115548

Pulled By: soumith

fbshipit-source-id: ba75b01a95a5715a7761724dda018168b6188917

Your Name [Sun, 18 Nov 2018 08:09:25 +0000 (00:09 -0800)]
Print warning when ROCm memory leaking is detected in pytorch tests (#14151)

Summary:
We keep seeing random failures in CI because of ROCm memory leaks, e.g.:

https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/3102//console
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/3080//console

To make the CI more stable, turn it into a warning instead of a failure.

iotamudelta please help investigate the memory leaks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14151

Differential Revision: D13115096

Pulled By: bddppq

fbshipit-source-id: a13b68274ecba363d9d8436aa6a62ac40a77d78c

vishwakftw [Sun, 18 Nov 2018 06:25:39 +0000 (22:25 -0800)]
Remove debugging code in test_cholesky_batched (#14156)

Summary:
They didn't turn up in my tests because I use pytest, which doesn't
print debug statements if the tests pass.

Differential Revision: D13115227

Pulled By: soumith

fbshipit-source-id: 46a7d47da7412d6b071158a23ab21e7fb0c6e11b

5 years agoBack out "[reland][codemod][caffe2] Tensor construction: combine Resize+mutable_data...
Jerry Zhang [Sun, 18 Nov 2018 03:42:42 +0000 (19:42 -0800)]
Back out "[reland][codemod][caffe2] Tensor construction: combine Resize+mutable_data - 2/4" (#14154)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14154

Original commit changeset: e89c2e692178

Reviewed By: amateurcoffee

Differential Revision: D13115023

fbshipit-source-id: 8f9fb55842ae6c8139d5cd88ec6d0abb0c5cc5e7

5 years agoCostInference for 1D conv (#14009)
Martin Schatz [Sun, 18 Nov 2018 01:26:09 +0000 (17:26 -0800)]
CostInference for 1D conv (#14009)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14009

As title

Reviewed By: yinghai

Differential Revision: D13078718

fbshipit-source-id: 081e7b13ad6741c635ef413915b555f10f93bd33

5 years agoBatched cholesky decomposition (#14017)
vishwakftw [Sat, 17 Nov 2018 18:47:17 +0000 (10:47 -0800)]
Batched cholesky decomposition (#14017)

Summary:
Implements batching for the Cholesky decomposition.

Performance could be improved with a dedicated batched `tril` and `triu` op, which is also impeding autograd operations.

Changes made:
- batching code
- tests in `test_torch.py`, `test_cuda.py` and `test_autograd.py`.
- doc string modification
- autograd modification
- removal of `_batch_potrf` in `MultivariateNormal`.
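Semantically, a batched Cholesky factorization just applies the unbatched factorization independently over the leading batch dimension(s). A minimal pure-Python sketch for a batch of 2x2 SPD matrices (illustrative helper names only; this is not the actual implementation, which dispatches to LAPACK/MAGMA):

```python
import math

def cholesky_2x2(a):
    # a = [[a00, a01], [a10, a11]], symmetric positive definite.
    # Returns lower-triangular L with L @ L.T == a.
    l00 = math.sqrt(a[0][0])
    l10 = a[1][0] / l00
    l11 = math.sqrt(a[1][1] - l10 * l10)
    return [[l00, 0.0], [l10, l11]]

def batched_cholesky_2x2(batch):
    # Batched version: apply the factorization over the batch dimension.
    return [cholesky_2x2(a) for a in batch]

batch = [[[4.0, 2.0], [2.0, 3.0]],
         [[9.0, 3.0], [3.0, 5.0]]]
factors = batched_cholesky_2x2(batch)
```

The dedicated batched `tril`/`triu` mentioned above would let this loop be fused on the GPU instead of iterating per matrix.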
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14017

Differential Revision: D13087945

Pulled By: ezyang

fbshipit-source-id: 2386db887140295475ffc247742d5e9562a42f6e

5 years agoremove unnecessary file from avx2 list (#14012)
Jongsoo Park [Sat, 17 Nov 2018 18:26:56 +0000 (10:26 -0800)]
remove unnecessary file from avx2 list (#14012)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14012

conv_dnnlowp_op.cc doesn't need avx2 anymore.

Reviewed By: dskhudia

Differential Revision: D13079665

fbshipit-source-id: dbfe8d2213de4969b6334d54de81d51149268cbd

5 years agoChange from using enum to int to store data_type
Your Name [Sat, 17 Nov 2018 17:22:09 +0000 (09:22 -0800)]
Change from using enum to int to store data_type

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14140

Differential Revision: D13112937

Pulled By: bddppq

fbshipit-source-id: 124d9546bfbd1f9c207a21e40eb3646f7739bd58

5 years agoRevert "CircleCI: fix NCCL install (#14124)" (#14146)
Junjie Bai [Sat, 17 Nov 2018 08:20:44 +0000 (00:20 -0800)]
Revert "CircleCI: fix NCCL install (#14124)" (#14146)

Summary:
This reverts commit a1fa9d8cf9b2b0e7373ec420c2487d4dfd0e587c.

[pytorch_linux_trusty_py2_7_9_build](https://circleci.com/gh/pytorch/pytorch/270206?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link/console):
```
Nov 17 07:37:27 + sudo apt-get -qq update
Nov 17 07:37:30 W: Ignoring Provides line with DepCompareOp for package gdb-minimal
Nov 17 07:37:30 W: You may want to run apt-get update to correct these problems
Nov 17 07:37:30 + sudo apt-get -qq install --allow-downgrades --allow-change-held-packages openmpi-bin libopenmpi-dev
Nov 17 07:37:30 E: Command line option --allow-downgrades is not understood
Nov 17 07:37:30 + cleanup
Nov 17 07:37:30 + retcode=100
Nov 17 07:37:30 + set +x
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14146

Differential Revision: D13113912

Pulled By: bddppq

fbshipit-source-id: cd9d371cf72159f03d12a8b56ed5bd2060ebbe59

5 years agoRevert D10428917: [Caffe2] Add cost into profile observer
Junjie Bai [Sat, 17 Nov 2018 07:26:12 +0000 (23:26 -0800)]
Revert D10428917: [Caffe2] Add cost into profile observer

Differential Revision:
D10428917

Original commit changeset: 7c100e551bdd

fbshipit-source-id: 5164d9ba61cc103eccfdeb91a5cc140cea31a819

5 years agoRevert D10439558: Add cost for non-linear ops
Junjie Bai [Sat, 17 Nov 2018 07:26:12 +0000 (23:26 -0800)]
Revert D10439558: Add cost for non-linear ops

Differential Revision:
D10439558

Original commit changeset: 9aeb05bac8b5

fbshipit-source-id: f00977b4f95bdd500d254eb44fb5b0c816506ee4

5 years agoUpdate FXdiv submodule (#14128)
Marat Dukhan [Sat, 17 Nov 2018 05:57:42 +0000 (21:57 -0800)]
Update FXdiv submodule (#14128)

Summary:
Use the most recent version that disables inline assembly.
I suspect inline assembly causes miscompilation on some versions of gcc7.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14128

Reviewed By: bddppq

Differential Revision: D13112370

Pulled By: Maratyszcza

fbshipit-source-id: 36cc95dc51390a293b72c18ae982c3a515a11981

5 years agoRename neon2sse.h to NEON_2_SSE.h to match upstream repo
Marat Dukhan [Sat, 17 Nov 2018 05:21:40 +0000 (21:21 -0800)]
Rename neon2sse.h to NEON_2_SSE.h to match upstream repo

Summary:
- NEON2SSE is a header that implements NEON intrinsics on top of SSE intrinsics
- Upstream repo provides NEON_2_SSE.h header, but internally it was imported as neon2sse.h
- This patch fixes incompatibilities between internal and upstream versions

Reviewed By: hlu1

Differential Revision: D13096755

fbshipit-source-id: 65e1df9a2a5e74bd52c9aee9be27469ba938cd8c

5 years agoDisable QNNPACK for multi-architecture iOS builds (#14125)
Marat Dukhan [Sat, 17 Nov 2018 05:02:37 +0000 (21:02 -0800)]
Disable QNNPACK for multi-architecture iOS builds (#14125)

Summary:
QNNPACK contains assembly files, and CMake tries to build them for the wrong architectures in multi-arch builds. This patch has two effects:
- Disables QNNPACK in multi-arch iOS builds
- Specifies a single `IOS_ARCH=arm64` by default (covers most iPhones/iPads on the market)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14125

Differential Revision: D13112366

Pulled By: Maratyszcza

fbshipit-source-id: b369083045b440e41d506667a92e41139c11a971

5 years agoRegister caffe2 layer norm with c10 dispatcher (#13693)
Sebastian Messmer [Sat, 17 Nov 2018 04:10:31 +0000 (20:10 -0800)]
Register caffe2 layer norm with c10 dispatcher (#13693)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13693

We can't directly call the caffe2::Operator class from c10 yet because that class isn't deprotobuffed yet.
Instead, we factor out the kernel into a reusable static method and call it from the caffe2::Operator and
also register it with c10.

Reviewed By: ezyang

Differential Revision: D12912242

fbshipit-source-id: c57502f14cea7a8be281f9787b175bb6e402d00c

5 years agoAdd c10/core/ to cmake build (#14111)
Sebastian Messmer [Sat, 17 Nov 2018 04:10:30 +0000 (20:10 -0800)]
Add c10/core/ to cmake build (#14111)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14111

It was already in TARGETs, but we forgot it in cmake.

Reviewed By: ezyang

Differential Revision: D13105166

fbshipit-source-id: f09549e98ebca751339b5ada1150e00cc4cd9540

5 years agoUpdate atol scale in dnnlowp test (#14135)
Haixin Liu [Sat, 17 Nov 2018 03:08:49 +0000 (19:08 -0800)]
Update atol scale in dnnlowp test (#14135)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14135

Update the atol scale of the dnnlowp test. I can't reproduce the flaky test error from the task locally even after setting the same seed value, but according to the comments in check_quantized_results_close(), atol_scale should be 1/1.9 = 0.526315789473684, which is larger than the current value of 0.51. So increase atol_scale to 0.53.

Reviewed By: jspark1105

Differential Revision: D13108415

fbshipit-source-id: 1e8840659fdf0092f51b439cf499858795f9706a

5 years agofix sparse_adagrad param_size overflow error (#14049)
Jongsoo Park [Sat, 17 Nov 2018 02:49:08 +0000 (18:49 -0800)]
fix sparse_adagrad param_size overflow error (#14049)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14049

param_size should be passed as int64_t
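The failure mode is the usual 32-bit truncation: a parameter size above 2^31 - 1 wraps to a negative value if it is passed through a 32-bit signed int. A small sketch simulating that truncation (illustrative only; the real code is C++):

```python
def to_int32(x):
    # Simulate passing a value through a signed 32-bit integer.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

param_size = 3_000_000_000  # e.g. a very large embedding table
truncated = to_int32(param_size)  # wraps negative: int64_t avoids this
```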

Reviewed By: hyuen

Differential Revision: D13090511

fbshipit-source-id: 7892d315d7c82c7d7ca103fb36d30cdf1fe24785

5 years agoAdd cost for non-linear ops (#13327)
Haixin Liu [Sat, 17 Nov 2018 02:30:49 +0000 (18:30 -0800)]
Add cost for non-linear ops (#13327)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13327

Add cost inference function to non-linear ops. Since the actual flops of a non-linear operator depend on the implementation, we use the number of non-linear operations as a proxy for the analytical flops of non-linear operators.
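For an elementwise non-linearity this proxy reduces to the output element count, with bytes read and written proportional to it. A hedged sketch of such a cost function (hypothetical dict fields; Caffe2's actual cost struct lives in C++ OpSchema):

```python
from math import prod

def elementwise_cost(dims, itemsize=4):
    # Proxy: one "non-linear op" per element; read input, write output.
    n = prod(dims)
    return {"flops": n, "bytes_read": n * itemsize, "bytes_written": n * itemsize}

cost = elementwise_cost((17, 1, 2048))  # fp32 tensor shape
```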

Reviewed By: jspark1105

Differential Revision: D10439558

fbshipit-source-id: 9aeb05bac8b5c7ae5d351ebf365e0a81cf4fc227

5 years agoAdd cost into profile observer (#12793)
Haixin Liu [Sat, 17 Nov 2018 02:30:49 +0000 (18:30 -0800)]
Add cost into profile observer (#12793)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12793

Add analytical cost into profile observer. It includes the op level cost information for each op run and net level aggregated cost information for each op type.

It outputs the following information:
1. analytical flops
2. analytical bytes_read
3. analytical bytes_written

Example output at op level:
```I1017 14:58:14.245978 3686541 profile_observer_gpu.cc:26] --------- Starting operator FC op#24 ---------
I1017 14:58:14.246049 3686541 profile_observer_gpu.cc:33] Input 0: Tensor model1/embedded_encoder_inputs of type float. Dims: (17,1,256,):
I1017 14:58:14.246109 3686541 profile_observer_gpu.cc:33] Input 1: Tensor model1/encoder/layer0/fw/milstm/i2h_w of type float. Dims: (2048,256,):
I1017 14:58:14.246176 3686541 profile_observer_gpu.cc:33] Input 2: Tensor model1/encoder/layer0/fw/milstm/i2h_b of type float. Dims: (2048,):
I1017 14:58:14.246217 3686541 profile_observer_gpu.cc:44] Argument 0: name: "use_cudnn" i: 1
I1017 14:58:14.246271 3686541 profile_observer_gpu.cc:44] Argument 1: name: "cudnn_exhaustive_search" i: 0
I1017 14:58:14.246338 3686541 profile_observer_gpu.cc:44] Argument 2: name: "order" s: "NHWC"
I1017 14:58:14.246372 3686541 profile_observer_gpu.cc:44] Argument 3: name: "axis" i: 2
I1017 14:58:14.246418 3686541 profile_observer_gpu.cc:44] Argument 4: name: "quantization_scheme" i: 1
I1017 14:58:14.246470 3686541 profile_observer_gpu.cc:53] Output 0: Tensor model1/encoder/layer0/fw/milstm/i2h of type float. Dims: (17,1,2048,):
I1017 14:58:14.246596 3686541 profile_observer_gpu.cc:61] Cost (flops, bytes_read, bytes_written):
I1017 14:58:14.246649 3686541 profile_observer_gpu.cc:62]        17860608 2122752 139264
I1017 14:58:14.246677 3686541 profile_observer_gpu.cc:64] --------- Finished operator FC in 0.764221 ms ---------
```
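The FC numbers in the log above are easy to reproduce by hand: with M = 17 (batch x time), K = 256, N = 2048, the flops are 2*M*N*K for the matmul plus M*N for the bias add; bytes_read covers the fp32 input, weight, and bias; bytes_written is the fp32 output:

```python
M, K, N, itemsize = 17, 256, 2048, 4  # fp32

flops = 2 * M * N * K + M * N                # matmul + bias add
bytes_read = (M * K + N * K + N) * itemsize  # input + weight + bias
bytes_written = M * N * itemsize             # output
```

These match the logged `17860608 2122752 139264` exactly.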
Example output at net level:
```
I1017 11:13:44.675585 3146691 profile_observer_gpu.cc:165] ================ Detailed stats for net model0/encoder/layer0/bw/milstm ================
I1017 11:13:44.675662 3146691 profile_observer_gpu.cc:167] Cost (flops, bytes_read, bytes_written) per operator type:
I1017 11:13:44.675706 3146691 profile_observer_gpu.cc:169]        20992000 42045440 81920 FC
I1017 11:13:44.675745 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Mul
I1017 11:13:44.675824 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Sum
I1017 11:13:44.675878 3146691 profile_observer_gpu.cc:169]               0 0 0 ElementwiseLinear
I1017 11:13:44.675909 3146691 profile_observer_gpu.cc:169]               0 0 0 LSTMUnit
I1017 11:13:44.675958 3146691 profile_observer_gpu.cc:169]               0 0 0 rnn_internal_apply_link
```

Reviewed By: mdschatz

Differential Revision: D10428917

fbshipit-source-id: 7c100e551bdd3ac8d7c09be12c72d70a2d67cae1

5 years agoCircleCI: fix NCCL install (#14124)
Will Feng [Sat, 17 Nov 2018 02:28:55 +0000 (18:28 -0800)]
CircleCI: fix NCCL install (#14124)

Summary:
The `$BUILD_ENVIRONMENT` checks work in `test.sh` but not `build.sh`; this PR is trying to figure out why.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14124

Reviewed By: teng-li

Differential Revision: D13112483

Pulled By: yf225

fbshipit-source-id: 5f65997586648805cf52217a261389625b5535e1

5 years agoFixed MPI build with higher version of GCC (#14122)
Teng Li [Sat, 17 Nov 2018 02:02:13 +0000 (18:02 -0800)]
Fixed MPI build with higher version of GCC (#14122)

Summary:
This appeared once I enabled -Werror in the c10d build. Good to catch this and fix it.

Should fix https://github.com/pytorch/pytorch/issues/14078 and https://github.com/pytorch/pytorch/issues/13962
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14122

Differential Revision: D13110678

Pulled By: teng-li

fbshipit-source-id: f4c19e16976d65debbd33ed59e17ddbaa19f765a

5 years agomultiprocessing.spawn python version check (#14039)
Teng Li [Sat, 17 Nov 2018 01:49:56 +0000 (17:49 -0800)]
multiprocessing.spawn python version check (#14039)

Summary:
This will be super helpful to the user
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14039

Differential Revision: D13089200

Pulled By: teng-li

fbshipit-source-id: 29e7507bd8fe5a0c58a85c52f976bfca282b4c1b

5 years agoDon't python bind _thnn_ functions. (#14101)
Gregory Chanan [Sat, 17 Nov 2018 00:47:00 +0000 (16:47 -0800)]
Don't python bind _thnn_ functions. (#14101)

Summary:
This is needed for moving nn functions to native functions, but since some functions are already named
this way, I'm going to stop binding pre-emptively so we can check if there are any current dependencies.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14101

Differential Revision: D13102219

Pulled By: gchanan

fbshipit-source-id: 6bbcca33a03ab1bf648f1b73cadfe84339fa3050

5 years agoFix docs/cpp/requirements.txt (#14121)
Peter Goldsborough [Fri, 16 Nov 2018 22:53:19 +0000 (14:53 -0800)]
Fix docs/cpp/requirements.txt (#14121)

Summary:
soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14121

Differential Revision: D13108063

Pulled By: goldsborough

fbshipit-source-id: 35cf65ba776e8826c5cab7ae6d3a2d446f87e7cc

5 years agoAllow cooperative structured objects to be passed modules in tracing (#13961)
Thomas Viehmann [Fri, 16 Nov 2018 21:59:31 +0000 (13:59 -0800)]
Allow cooperative structured objects to be passed modules in tracing (#13961)

Summary:
Before this patch, the JIT does not allow Module's forward to take
structured objects.
This patch allows cooperative objects to do so.
Cooperative means:
- It has a method self._jit_unwrap() that returns (a list/tuple of)
  tensors. These are then used in _iter_tensors.
- It has a method self._jit_wrap(flattened_input) that takes
  (a list/tuple?) the flattened_input (potentially more than it needs)
  and returns itself (updated) and the unconsumed flattened_inputs.
  This is then used in the _unflatten mechanism.
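A minimal sketch of such a cooperative container (pure Python; `BoxPair` is a hypothetical class, and strings stand in for tensors — the real BoxList/ImageList carry tensors):

```python
class BoxPair:
    """Structured object the tracer can flatten and rebuild."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def _jit_unwrap(self):
        # Hand the tracer a flat list of the tensors we hold.
        return [self.a, self.b]

    def _jit_wrap(self, flattened_input):
        # Consume what we need; return (self, unconsumed leftovers).
        self.a, self.b = flattened_input[0], flattened_input[1]
        return self, flattened_input[2:]

box = BoxPair("t0", "t1")
flat = box._jit_unwrap() + ["extra"]
rebuilt, rest = BoxPair(None, None)._jit_wrap(flat)
```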

This is all it takes to permit maskrcnn-benchmark to use
its structured BoxList/ImageList types and trace it without calling
the .forward directly.
I'll push a model working with this patch in
https://github.com/facebookresearch/maskrcnn-benchmark/pull/138

I must admit I haven't fully checked whether there are ONNX changes needed before it, too, can profit, but I would be hopeful that anything currently usable remains so.

fmassa zdevito

So the main downside that I'm aware of is that people will later want to use more elaborate mechanisms, but I think this could be done by just amending what wrap/unwrap are returning / consuming.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13961

Differential Revision: D13103927

Pulled By: soumith

fbshipit-source-id: 2cbc724cc4b53197388b662f75d9e601a495c087

5 years agoAdd SharedDataset (#13800)
Peter Goldsborough [Fri, 16 Nov 2018 21:01:25 +0000 (13:01 -0800)]
Add SharedDataset (#13800)

Summary:
This PR adds a `SharedDataset` to the C++ frontend data API, which allows wrapping a shared_ptr to a dataset into a class that conforms to the `Dataset` interface (with `get_batch`). This enables use cases where a custom dataset is (1) thread-safe and (2) expensive to copy. All workers will reference a single instance of this dataset. No additional copies are incurred.

jaliyae apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13800

Differential Revision: D13075610

Pulled By: goldsborough

fbshipit-source-id: 4ffdfd7959d49b042c0e254110085f62a0bfeb6c

5 years agoremove dynamic initialization warning (#13913) (#13967)
jjsjann123 [Fri, 16 Nov 2018 20:59:01 +0000 (12:59 -0800)]
remove dynamic initialization warning (#13913) (#13967)

Summary:
removed assignment in default constructor.
removed static shared memory and used dynamic shared memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13967

Differential Revision: D13089996

Pulled By: soumith

fbshipit-source-id: 2a218b909c849bed39636b45a02d10ebc279a0b0

5 years agoMissing .decode() after check_output in cpp_extensions (#13935)
Peter Goldsborough [Fri, 16 Nov 2018 20:12:01 +0000 (12:12 -0800)]
Missing .decode() after check_output in cpp_extensions (#13935)

Summary:
soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13935

Differential Revision: D13090852

Pulled By: goldsborough

fbshipit-source-id: 47da269d074fd1e7220e90580692d6ee489ec78b

5 years agoWindows shared build (#13550)
ArutyunovG [Fri, 16 Nov 2018 20:06:21 +0000 (12:06 -0800)]
Windows shared build (#13550)

Summary:
Hi guys,

I'd like to build Caffe2 with more supported options in Windows with Microsoft Visual Studios.
This is the first pull request.
Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015.
CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system.
Python is 3.5, Detectron works from python interface as well.
It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built.

What is disappointing is that the c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.

After this pull request the next step is to add Visual Studio 2017 support in the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc

5 years agoMake JOIN_TIMEOUT longer for ppc64le (#14107)
Freddie Mendoza [Fri, 16 Nov 2018 20:05:27 +0000 (12:05 -0800)]
Make JOIN_TIMEOUT longer for ppc64le (#14107)

Summary:
This should resolve the ppc64le issue with FAIL: test_proper_exit (__main__.TestDataLoader). It only happens when the CI build machine is very busy and the test fails with a timeout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14107

Differential Revision: D13103859

Pulled By: soumith

fbshipit-source-id: 268be80b59840853c5025f3211af272f68608fe5

5 years agoLog error from the net's run (#14035)
Ilia Cherniavskii [Fri, 16 Nov 2018 20:01:01 +0000 (12:01 -0800)]
Log error from the net's run (#14035)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14035

Log the error message in case of a net's run failure

Reviewed By: andrewwdye

Differential Revision: D13085431

fbshipit-source-id: d79f76782410cd3a5bd2d8d7f5fb1e535d821051

5 years agoChange hip filename extension to .hip (#14036)
Junjie Bai [Fri, 16 Nov 2018 19:50:29 +0000 (11:50 -0800)]
Change hip filename extension to .hip (#14036)

Summary:
xw285cornell

- To make hip files to have unique filename extension we change hip files from _hip.cc to .hip (it's the only blessing option other than .cu in hipcc https://github.com/ROCm-Developer-Tools/HIP/blob/3d51a1fb0105e2f2312d2523c20e0034339f6ada/bin/hipcc#L552).
- Change to use the host compiler to compile .cc|.cpp files. Previously we used hcc to compile them, which was unnecessary.
- Change the hipify script to not replace "gpu" with "hip" in the filenames of the generated hipified files. Previously we did this because hcc has a bug when linking files that have the same filename. We have now changed to use the host linker to do linking, so this is no longer necessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14036

Reviewed By: xw285cornell

Differential Revision: D13091813

Pulled By: bddppq

fbshipit-source-id: ea3d887751d8abb39d75f5d5104aa66ce66b9ee0

5 years agoEnable Caffe2 ROCm test on centos (#14090)
Your Name [Fri, 16 Nov 2018 19:47:02 +0000 (11:47 -0800)]
Enable Caffe2 ROCm test on centos (#14090)

Summary:
xw285cornell petrex ashishfarmer rohithkrn
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14090

Differential Revision: D13096874

Pulled By: bddppq

fbshipit-source-id: b471c6e4db95cd51567745a2f758d58bba7eafad

5 years agoEnable Caffe2 test on centos (#14091)
Junjie Bai [Fri, 16 Nov 2018 19:45:01 +0000 (11:45 -0800)]
Enable Caffe2 test on centos (#14091)

Summary:
Turns out we don't have any CentOS test CI jobs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14091

Differential Revision: D13104722

Pulled By: bddppq

fbshipit-source-id: 22fe92ad4b7f2c391eea16b8b95658fa1ee605e2

5 years agoRelax limits for gradients in test_jit's checkGraph (#14094)
Thomas Viehmann [Fri, 16 Nov 2018 19:36:06 +0000 (11:36 -0800)]
Relax limits for gradients in test_jit's checkGraph (#14094)

Summary:
- This should help TestJit.test_lstm_fusion_concat_cuda
  to be less flaky. (Checked on manual_seed 0..99)
  Fixes: #14026
- Revert the renaming of test_fused_abs that was introduced
  to game the order of tests to avoid the flakiness above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14094

Differential Revision: D13100174

Pulled By: soumith

fbshipit-source-id: 91bb63b07a960a81dddfc0bf25c67696c0f6c46d

5 years agoadd torch-python target (#12742)
Anders Papitto [Fri, 16 Nov 2018 19:32:51 +0000 (11:32 -0800)]
add torch-python target (#12742)

Summary:
This is the next minimal step towards moving _C into cmake. For now,
leave _C in setup.py, but reduce it to an empty stub file. All of its
sources are now part of the new torch-python cmake target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12742

Reviewed By: soumith

Differential Revision: D13089691

Pulled By: anderspapitto

fbshipit-source-id: 1c746fda33cfebb26e02a7f0781fefa8b0d86385

5 years agoalias annotation parsing #2 (#14053)
Michael Suo [Fri, 16 Nov 2018 19:32:34 +0000 (11:32 -0800)]
alias annotation parsing #2 (#14053)

Summary:
hopefully this one doesn't break master.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14053

Differential Revision: D13093406

Pulled By: suo

fbshipit-source-id: 8fed44f1a3d463748726cb14acac2ea53dedf29b

5 years agoMake THPDtype_New error instead of truncate (#14103)
Andy Chen [Fri, 16 Nov 2018 19:32:05 +0000 (11:32 -0800)]
Make THPDtype_New error instead of truncate (#14103)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14103

Addressing T34828781, we change THPDtype_New so that it throws a RuntimeError if the length of the name is greater than the buffer size (DTYPE_NAME_LEN), instead of truncating the string to fit the buffer.
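The behavior change is the classic truncate-vs-fail choice. Sketched in Python (the real check lives in C around a fixed char buffer; the value 64 is assumed here for illustration, not the actual DTYPE_NAME_LEN):

```python
DTYPE_NAME_LEN = 64  # assumed buffer size, for illustration only

def set_dtype_name(name):
    if len(name) > DTYPE_NAME_LEN:
        # New behavior: fail loudly instead of silently truncating.
        raise RuntimeError(f"dtype name too long: {len(name)} > {DTYPE_NAME_LEN}")
    return name

err = None
try:
    set_dtype_name("x" * 100)
except RuntimeError as e:
    err = str(e)
```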

Reviewed By: ezyang

Differential Revision: D13094600

fbshipit-source-id: d0dbf8fdfa342630c31f4d8ca7230d5f24a1254a

5 years agoAdd filler for SparseLengthsWeightedSum (#13949)
Yinghai Lu [Fri, 16 Nov 2018 19:27:45 +0000 (11:27 -0800)]
Add filler for SparseLengthsWeightedSum (#13949)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13949

This diff adds support to fillers for `SparseLengthsWeight*` ops. It does 3 things:
1. Add the fillers for `SparseLengthsWeight*` ops
2. Add filling heuristics to consider the path of `LengthsRangeFill` -> `Gather` -> `SparseLengthsWeightedSum`, where the length input is shared by `LengthsRangeFill` and `SparseLengthsWeightedSum`. Therefore, we need to carefully bound the value of that length input so that `Gather` does not index out of bounds into its weight input.
3. Fix and simplify the logic of `math::RandFixedSum`, where we just keep rejecting the generated value if it violates the invariants.
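The rejection idea in point 3 can be sketched as: draw candidate values, and retry whenever the draw cannot be completed within the bounds. A simplified pure-Python version (not the actual `math::RandFixedSum` implementation):

```python
import random

def rand_fixed_sum(n, lo, hi, total, seed=0):
    # n integers in [lo, hi] that sum to `total`, by rejection sampling.
    assert n * lo <= total <= n * hi, "infeasible"
    rng = random.Random(seed)
    while True:
        vals = [rng.randint(lo, hi) for _ in range(n - 1)]
        last = total - sum(vals)
        if lo <= last <= hi:  # reject the draw if it breaks the bounds
            return vals + [last]

lengths = rand_fixed_sum(5, 0, 10, 23)
```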

Reviewed By: highker

Differential Revision: D13048216

fbshipit-source-id: bfe402e07e6421b28548047d18b298c148e0ec87

5 years agoUpdate ATen doc with optional syntax (#14086)
Wanchao Liang [Fri, 16 Nov 2018 18:01:09 +0000 (10:01 -0800)]
Update ATen doc with optional syntax (#14086)

Summary:
Update the readme to reflect the recent optional syntax change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14086

Differential Revision: D13096114

Pulled By: wanchaol

fbshipit-source-id: 713834d4d92021e1c7a31f3a56a00fb7da58c348

5 years agoAdd missing space in stft doc
Tongzhou Wang [Fri, 16 Nov 2018 17:54:44 +0000 (09:54 -0800)]
Add missing space in stft doc

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14092

Reviewed By: soumith

Differential Revision: D13100177

Pulled By: SsnL

fbshipit-source-id: 4eeaa3d0c04212516941d8d5a266aafb53bd9672

5 years agoPreemptively test for out-of-order length. (#13933)
Brian Vaughan [Fri, 16 Nov 2018 16:37:05 +0000 (08:37 -0800)]
Preemptively test for out-of-order length. (#13933)

Summary:
torch.nn.utils.rnn.pack_padded_sequence segfaults if lengths are not in
decreasing order (#13324)

We were seeing this segfault on throw; pre-emptively checking avoids
it:

*** Error in `/home/bvaughan/anaconda3/bin/python': double free or corruption (!prev): 0x00005555566e7510 ***
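At the time, `pack_padded_sequence` required lengths sorted in decreasing order, so callers typically sorted the batch up front and kept the inverse permutation to restore the original order afterwards. A sketch of that bookkeeping in plain Python (no torch needed to show the idea):

```python
lengths = [3, 7, 5]

# Sort descending, remembering where each sequence came from.
order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
sorted_lengths = [lengths[i] for i in order]

# Inverse permutation: position of original sequence i in the sorted batch.
inverse = [0] * len(order)
for pos, i in enumerate(order):
    inverse[i] = pos
```

Indexing the sorted batch with `inverse` recovers the original ordering after unpacking.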
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13933

Differential Revision: D13090389

Pulled By: nairbv

fbshipit-source-id: 6f6b319e74cb55830be799e9c46bc33aa59256d8

5 years agonomnigraph - support subgraph visualization (#13795)
Duc Ngo [Fri, 16 Nov 2018 16:04:58 +0000 (08:04 -0800)]
nomnigraph - support subgraph visualization (#13795)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13795

Add the ability to generate a dot string for a single subgraph, plus Python bindings (which is pretty useful for model exploration in Python)
Restructure the DotGenerator class a bit to make it easier to implement this feature

Reviewed By: bwasti

Differential Revision: D13010512

fbshipit-source-id: 825665438394b7e6968ab6da167b477af82a7b62

5 years agonomnigraph - easy - expose hasProduce(NodeRef) to python (#14075)
Duc Ngo [Fri, 16 Nov 2018 16:04:58 +0000 (08:04 -0800)]
nomnigraph - easy - expose hasProduce(NodeRef) to python (#14075)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14075

Expose hasProduce(NodeRef) to python

Reviewed By: bwasti

Differential Revision: D13092930

fbshipit-source-id: f1ec06e73e0f5f6a16ad0cbb7d2e3e499a861d8e