platform/upstream/pytorch.git
Ilia Cherniavskii [Thu, 22 Nov 2018 01:19:37 +0000 (17:19 -0800)]
Remove extra include

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14206

Reviewed By: dzhulgakov

Differential Revision: D13131318

fbshipit-source-id: 559b55b8d98cdf6b7d1d3e31237c5473edc5e462

Teng Li [Thu, 22 Nov 2018 00:54:36 +0000 (16:54 -0800)]
Removed redundant allreduce options in DDP (#14208)

Summary:
This somehow was not cleaned up after the C++ migration. It is unused and can be removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14208

Differential Revision: D13132492

Pulled By: teng-li

fbshipit-source-id: 0f05b6368174664ebb2560c037347c8eb45f7c38

David Riazati [Thu, 22 Nov 2018 00:30:43 +0000 (16:30 -0800)]
Add list inequality operator (#14129)

Summary:
This PR adds `aten::neq` for list inequality comparisons and converts
`nll_loss` to weak script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14129

Differential Revision: D13123894

Pulled By: driazati

fbshipit-source-id: 8c1edf7c163217ec00eb653f95d196db3998613f

Yinghai Lu [Wed, 21 Nov 2018 23:43:10 +0000 (15:43 -0800)]
Add onnxifi support to SparseLengthsWeightedSum (#14210)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14210

We left out `SparseLengthsWeightedSum` because the benchmark was not testing it due to an fp16 filler issue. The gap was flushed out by unit tests, hence we add the support here.

Reviewed By: bddppq

Differential Revision: D13132320

fbshipit-source-id: b21c30c185c9e1fbf3980641bc3cdc39e85af2e1

Gu, Jinghui [Wed, 21 Nov 2018 23:42:29 +0000 (15:42 -0800)]
Add "axis" and "axis_w" arguments in FC to support a customized axis to reduce dim. (#12971)

Summary:
Add "axis" and "axis_w" arguments in FC to support a customized axis to reduce dim.
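The reduction the argument describes can be illustrated with NumPy: FC coerces its input to 2D by collapsing all dimensions from `axis` onward before the matmul (a hedged sketch of the semantics, not caffe2 code):

```python
import numpy as np

# Sketch of FC's axis semantics: dims before `axis` form the batch dimension,
# dims from `axis` onward are flattened into the feature dimension.
X = np.arange(24, dtype=float).reshape(2, 3, 4)
axis = 2
outer = int(np.prod(X.shape[:axis]))   # 2 * 3 = 6
X2 = X.reshape(outer, -1)              # coerced input: (6, 4)
W = np.ones((5, 4))                    # weights: (num_output, inner)
Y = X2 @ W.T                           # output: (6, 5)
```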
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12971

Reviewed By: bddppq

Differential Revision: D12850675

Pulled By: yinghai

fbshipit-source-id: f1cde163201bd7add53b8475329db1f038a73019

Viswanath Sivakumar [Wed, 21 Nov 2018 21:42:04 +0000 (13:42 -0800)]
IDEEP fallback for ResizeNearest op (#14212)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14212

TSIA

Reviewed By: yinghai

Differential Revision: D13134134

fbshipit-source-id: e3c5c9c8756d6e25b213f8dde9d809a44373d7a3

zrphercule [Wed, 21 Nov 2018 21:12:18 +0000 (13:12 -0800)]
Fix ONNX_ATEN mode (#14239)

Summary:
Fix ONNX_ATEN mode by adding it to the validateBlock method.
Before this PR, validateBlock would throw an exception when using this mode.

I will add related test cases for ONNX_ATEN mode in a different PR once this is merged, since we don't have any currently.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14239

Differential Revision: D13145443

Pulled By: zrphercule

fbshipit-source-id: 60e7942aa126acfe67bdb428ef231ac3066234b1

Pieter Noordhuis [Wed, 21 Nov 2018 19:25:42 +0000 (11:25 -0800)]
Bump gloo (#14281)

Summary:
Includes more robust error handling and timeout support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14281

Differential Revision: D13158232

Pulled By: pietern

fbshipit-source-id: e80432799a020576d5abdcd9a21d66b629479caf

Jongsoo Park [Wed, 21 Nov 2018 17:37:58 +0000 (09:37 -0800)]
fix comment on dnnlowp op arguments (#14265)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14265

Fix comment

Reviewed By: hx89

Differential Revision: D13152106

fbshipit-source-id: fbe98906963cbd5cb20a583a737a792fbc38292e

Gregory Chanan [Wed, 21 Nov 2018 17:04:59 +0000 (09:04 -0800)]
native NN wrappers, including with buffers.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14256

Differential Revision: D13148783

Pulled By: gchanan

fbshipit-source-id: 4b6179033cf1df26061b6731eaaa4e008692e592

Pieter Noordhuis [Wed, 21 Nov 2018 16:43:14 +0000 (08:43 -0800)]
Remove header generated at configuration time (#14244)

Summary:
The build was picking up the empty stub header instead of the generated
one. Because of the large number of include paths we end up passing to
the compiler, it is brittle to have both an empty stub file and a
generated file and expect the compiler to pick up the right one.

With the recent change to compile everything from a single CMake run, we
can now use native CMake facilities to propagate macros that indicate
backend support. The target_compile_definitions stanzas with the
INTERFACE flag ensure that these macros are set only for downstream
consumers of the c10d target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14244

Reviewed By: teng-li

Differential Revision: D13144293

Pulled By: pietern

fbshipit-source-id: f49324220db689c68c126b159f4f00a8b9bc1252

Zachary DeVito [Wed, 21 Nov 2018 14:36:26 +0000 (06:36 -0800)]
Address jittering issues in python_print (#14064)

Summary:
export - print a method with python_print
import - import a method with import_method

We want to ensure:

    export(g) == export(import(export(g)))

That is, after exporting/importing once, the graph will stay exactly
the same. This is less strict than g == import(export(g)), which would
require us to maintain a lot more information about the structure of the
IR and about the names of debug symbols.

This PR addresses this with the following fixes:
* print out double-precision numbers with high enough precision such
  that they always parse in the same way
* when creating loop-carried dependencies, sort them
  by variable name, ensuring a consistent order
* parse nan correctly
* DCE: remove unused outputs of if statements, and loop-carried dependencies
  in loops that are dead both after the loop and inside the body of the
  loop.
* Do not set uniqueName for variables whose names are _[0-9]+; these
  are probably rare in user code, and we need a way to communicate
  that we do not care about a variable name when re-parsing the graph.
  Otherwise temporary variable names will jitter around.
* Expand the definition of a constant in printing code to None,
  and family.
* Allow re-treeing to work as long as the only thing in its way is a
  constant node. These do not have side effects but are sometimes
  inserted in a different order when tracing compared to how we print them.
* Print all constant nodes out first in the order in which they are used
  (or, if they are inlined, ensure they get assigned CONSTANT.cX numbers
  in a consistent order). Clean up tuples (this is done in the compiler,
  but not in the tracer, leading to some tuple indexing jitter if not
  done).
* use strtod_l, not std::stod which can throw exceptions

Other:
* Add REL_WITH_DEB_INFO to setup.py. It already existed for the
  cmake files. Threading it into setup.py allows us to turn on
  debug symbols with optimization everywhere.
* enable round trip testing for all generated graphs. This only adds
  ~6 seconds to total build time but tests printing for every graph.
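The round-trip invariant and the float-printing bullet can be sketched as follows (stand-in function names; the real entry points are python_print and import_method):

```python
# Sketch of the stability property the PR enforces; `export_fn` and
# `import_fn` stand in for python_print / import_method.
def round_trip_stable(graph, export_fn, import_fn):
    once = export_fn(graph)
    twice = export_fn(import_fn(once))
    return once == twice

# The double-precision bullet in action: 17 significant digits are enough
# for a printed double to parse back to the exact same value.
x = 0.1 + 0.2
assert float("%.17g" % x) == x
```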
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14064

Differential Revision: D13094637

Pulled By: zdevito

fbshipit-source-id: 0a1c6912194d965f15d6b0c6cf838ccc551f161d

svcscm [Wed, 21 Nov 2018 10:16:29 +0000 (02:16 -0800)]
Updating submodules

Reviewed By: cdelahousse

fbshipit-source-id: 27838fb2dad82c78906faf3cc2d124557c30e88f

svcscm [Wed, 21 Nov 2018 08:25:17 +0000 (00:25 -0800)]
Updating submodules

Reviewed By: cdelahousse

fbshipit-source-id: 3c17e12a579245a84e9a56b1d8a1641232150675

Lu Fang [Wed, 21 Nov 2018 07:33:30 +0000 (23:33 -0800)]
Add tensor table in ModelDef and use it for jit script serialization and deserialization (#13861)

Summary:
As we discussed, the tensors in the torch script will be associated with the tensor data in the serialized file. So let's add a table of tensors (actually a repeated TensorProto field) in the ModelDef. TensorProto.name will be the id.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13861

Reviewed By: dzhulgakov

Differential Revision: D13036940

Pulled By: zrphercule

fbshipit-source-id: ecb91b062ac4bc26af2a8d6d12c91d5614efd559

Tongzhou Wang [Wed, 21 Nov 2018 07:27:16 +0000 (23:27 -0800)]
c10d Automatically retry on EINTR (#14180)

Summary:
Probably fixes https://github.com/pytorch/pytorch/issues/14170

Actually, I probably shouldn't retry all `SYSCHECK` calls. I'll leave it to the reviewers to decide.
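The retry pattern under discussion, sketched generically in Python (the real change wraps c10d's C++ SYSCHECK'd syscalls):

```python
import errno

def retry_on_eintr(fn, *args, **kwargs):
    """Re-issue an interruptible call until it stops failing with EINTR."""
    while True:
        try:
            return fn(*args, **kwargs)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise  # any other error propagates; only EINTR is retried
```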
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14180

Reviewed By: pietern

Differential Revision: D13144741

Pulled By: SsnL

fbshipit-source-id: d73288f76b18cae14b1b43dad4e5e8d010a96d95

Teng Li [Wed, 21 Nov 2018 05:10:18 +0000 (21:10 -0800)]
Make NCCL backend support barrier op (#14142)

Summary:
This is a feature request from: https://github.com/pytorch/pytorch/issues/13573

As the title says, this PR makes NCCL backend support barrier op.

There are a couple of scenarios that need to be addressed:
(1) When an NCCL op has already happened, we need to record which GPU device(s) the previous op ran on and queue the allreduce barrier op on the same device(s).
(2) When there is no NCCL op yet, we will try to assign each process a separate single GPU as a best effort.

As for the async work, during wait we not only wait for the NCCL kernel to complete, but also block the thread until both the current stream and the NCCL stream return.

`test_distributed` should cover this. I also manually tested both scenarios.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14142

Differential Revision: D13113391

Pulled By: teng-li

fbshipit-source-id: 96c33d4d129e2977e6892d85d0fc449424c35499

Yinghai Lu [Wed, 21 Nov 2018 02:00:14 +0000 (18:00 -0800)]
Fix memory leakage in onnxifi transformer (#14245)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14245

tsia

Reviewed By: bddppq, rdzhabarov

Differential Revision: D13144783

fbshipit-source-id: 5e07bb7ab883ba1af68547a26272cd320967b9e3

David Riazati [Wed, 21 Nov 2018 00:42:00 +0000 (16:42 -0800)]
Allow undefined tensors as constants (#14120)

Summary:
This PR inserts `prim::None` constants for undefined tensors. This comes in the standard library if an `Optional[Tensor]` is statically determined to be `None`:

```python
@torch.jit.script
def fn(x=None):
    # type: (Optional[Tensor]) -> Tensor
    return torch.jit._unwrap_optional(x)

@torch.jit.script
def fn2():
    # type: () -> Tensor
    return fn()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14120

Differential Revision: D13124625

Pulled By: driazati

fbshipit-source-id: 9eaa82e478c49c503f68ed89d8c770e8273ea569

Wanchao Liang [Tue, 20 Nov 2018 22:09:27 +0000 (14:09 -0800)]
Export BatchNorm functional and module, add necessary JIT support (#14016)

Summary:
This PR did three things:

1. It exports the BatchNorm functional and module, and rewrites some of the components to stay aligned with the currently supported JIT features
2. In the process of exporting, it adds the necessary compiler support for in-place augmented-assignment ops
3. It changes the test_jit behavior in add_module_test to utilize a single RNG state during module initialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016

Differential Revision: D13112064

Pulled By: wanchaol

fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91

Thomas Viehmann [Tue, 20 Nov 2018 20:43:23 +0000 (12:43 -0800)]
Have PYTORCH_FUSION_DEBUG print C kernel source (#14213)

Summary:
- Move handling of the environment variable up from CPU only to all backends
- Introduce two levels, enabled with PYTORCH_FUSION_DEBUG=n:
  1: print C source
  2: print CPU assembly, too (the previous effect of PYTORCH_FUSION_DEBUG)

apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14213

Differential Revision: D13135393

Pulled By: soumith

fbshipit-source-id: befa4ebea3b3c97e471393a9f6402b93a6b24031

Tugrul Ates [Tue, 20 Nov 2018 20:23:14 +0000 (12:23 -0800)]
Delete backwards compatibility StorageImpl.h and TensorImpl.h (#14230)

Summary:
Since they directly include the real ones in core.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14230

Differential Revision: D13140323

Pulled By: tugrulates

fbshipit-source-id: d7e3b94e891b2d7fa273d01c0b7edfebdbd7e368

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
remove unused parameters from caffe2_dnnlowp_utils.cc (#14164)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14164

See title

Reviewed By: csummersea

Differential Revision: D13115470

fbshipit-source-id: d754f558cd06e5f4c1cd00315e912cdb7b50731a

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
use pragma once (#14163)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14163

Some of the names we were using to guard the header files were too short (e.g. DYNAMIC_HISTOGRAM_H), risking collisions with unrelated headers.

Reviewed By: csummersea

Differential Revision: D13115451

fbshipit-source-id: cef8c84c62922616ceea17effff7bdf8d67302a2

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
format python files (#14161)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14161

Formatting using Nuclide

Reviewed By: hx89

Differential Revision: D13115348

fbshipit-source-id: 7432ce6072a1822d7287b4ebcfcb6309282e15ac

Jongsoo Park [Tue, 20 Nov 2018 08:53:29 +0000 (00:53 -0800)]
clang-format (#14160)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14160

clang-format of C++ files

Reviewed By: hx89

Differential Revision: D13115201

fbshipit-source-id: d2ad65f66209e00578ef90f87f41272de2d24aa9

Hui Wu [Tue, 20 Nov 2018 06:54:19 +0000 (22:54 -0800)]
Add sigmoid op based on MKL-DNN

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13097

Differential Revision: D13105366

Pulled By: yinghai

fbshipit-source-id: d156e8fd519baeecf61c25dcd8fa2c2fa7351ef4

Daya S Khudia [Tue, 20 Nov 2018 06:45:00 +0000 (22:45 -0800)]
OSS build fix (#14192)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14192

We can only use C10_* in OSS. The build is only broken if built with USE_FBGEMM=ON

Reviewed By: jianyuh

Differential Revision: D13121781

fbshipit-source-id: f0ee9a75997766e63e1da8a53de7ddb98296a171

Lu Fang [Tue, 20 Nov 2018 06:12:16 +0000 (22:12 -0800)]
Make EncodeMethod in jit script serialization return a string (#14167)

Summary:
Nit

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14167

Reviewed By: ezyang

Differential Revision: D13116584

Pulled By: dzhulgakov

fbshipit-source-id: c0e7e71a81004031564bd2fc59f393041e1283d5

Jongsoo Park [Tue, 20 Nov 2018 05:44:29 +0000 (21:44 -0800)]
Create README.md of caffe2/quantization/server

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14217

Reviewed By: csummersea

Differential Revision: D13135086

Pulled By: jspark1105

fbshipit-source-id: bddf4f1c2dc5ec8ea6ebe9e265956f367e082d52

Will Feng [Tue, 20 Nov 2018 05:28:29 +0000 (21:28 -0800)]
CircleCI: fix NCCL install (#14172)

Summary:
The `$BUILD_ENVIRONMENT` checks work in `test.sh` but not in `build.sh`; this PR fixes the issue.

This replaces https://github.com/pytorch/pytorch/pull/14124.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14172

Differential Revision: D13135087

Pulled By: yf225

fbshipit-source-id: 42fff3926734778713d483d74ba0a89e5502dd9e

zrphercule [Tue, 20 Nov 2018 02:43:58 +0000 (18:43 -0800)]
Fix a bug in test case of onnx::If

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14209

Differential Revision: D13132607

Pulled By: zrphercule

fbshipit-source-id: b7f7ccc6a6cbdeb57a7f88a1971d15dd81e6fc81

Teng Li [Tue, 20 Nov 2018 02:25:00 +0000 (18:25 -0800)]
Tensor type checking and informative error messages for torch.distributed (#14204)

Summary:
This will address https://github.com/pytorch/pytorch/issues/13574

This error message should be more informative to the user for all the non-multi-GPU ops, since we always python-bind to the multi-GPU ops.

test_distributed should cover all of this. I also tested both RuntimeErrors:

```
>>> a = torch.ByteTensor([])
>>> b = [a, a]
>>> dist.all_reduce(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 809, in all_reduce
    _check_single_tensor(tensor, "tensor")
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 207, in _check_single_tensor
    "to be a torch.Tensor type".format(param_name))
RuntimeError: Invalid function argument. Expecting parameter: tensor to be a torch.Tensor type

>>> b = ["b"]
>>> dist.all_gather(b, a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 1006, in all_gather
    _check_tensor_list(tensor_list, "tensor_list")
  File "/private/home/tengli/pytorch/torch/distributed/distributed_c10d.py", line 225, in _check_tensor_list
    "to be a List[torch.Tensor] type".format(param_name))
RuntimeError: Invalid function argument. Expecting parameter: tensor_list to be a List[torch.Tensor] type
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14204

Differential Revision: D13131526

Pulled By: teng-li

fbshipit-source-id: bca3d881e41044a013a6b90fa187e722b9dd45f2

Edward Yang [Tue, 20 Nov 2018 01:01:34 +0000 (17:01 -0800)]
Move stream functions from CUDAContext to CUDAStream (#14110)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14110

I'm planning to move CUDAStream to c10/cuda, without also moving
CUDAContext, and so it's most convenient if these definitions
are in the actual header file in question.

Reviewed By: smessmer

Differential Revision: D13104693

fbshipit-source-id: 23ce492003091adadaa5ca6a17124213005046c2

Edward Yang [Tue, 20 Nov 2018 01:01:34 +0000 (17:01 -0800)]
Move CUDAStreamInternals inside detail namespace. (#14109)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14109

Previously it was at the top level, because the author was under
the impression that you could only refer to top-level C++ names
from C, but this is not true; you just need to make a stub struct
conditioned on __cplusplus.

Reviewed By: smessmer

Differential Revision: D13104694

fbshipit-source-id: ecb7ae6dcfa4ab4e062aad7a886937dca15fd1b2

Edward Yang [Tue, 20 Nov 2018 01:01:33 +0000 (17:01 -0800)]
Delete dependencies from CUDAStream; remove synchronize_with (#13920)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13920

I want to move CUDAStream and CUDAGuard to c10_cuda without also
bringing along CUDAContext or CUDAEvent for the ride (at least for
now).  To do this, I need to eliminate those dependencies.

There are a few functions in CUDAContext.h which don't really need
THCState, so they're separated out and put in the general-purpose
c10/cuda/CUDAFunctions.h.

Reviewed By: smessmer

Differential Revision: D13047468

fbshipit-source-id: 7ed9d5e660f95805ab39d7af25892327edae050e

Yavuz Yetim [Mon, 19 Nov 2018 23:57:28 +0000 (15:57 -0800)]
Fix race in AtomicFetchAdd. (#13479)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13479

Increases the lock scope to above Output() calls.

These calls potentially allocate the underlying blob/tensor
objects and multiple invocations race each other over the
same output blobs/tensors.
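A minimal sketch of the fix, with illustrative names (the real operator is caffe2's C++ AtomicFetchAdd; Python's threading stands in for the C++ mutex):

```python
import threading

class AtomicFetchAddSketch:
    # Sketch of the fix: the lock covers the (lazy) output allocation too,
    # not just the increment, so racing invocations cannot both allocate.
    def __init__(self):
        self._lock = threading.Lock()
        self._slot = None

    def _output(self):
        # Stands in for Output(): may allocate the shared output on first use.
        if self._slot is None:
            self._slot = [0]
        return self._slot

    def run(self, value):
        with self._lock:          # taken BEFORE _output(), per the fix
            out = self._output()
            out[0] += value
            return out[0]
```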

Reviewed By: bwasti

Differential Revision: D12891629

fbshipit-source-id: a6015cfdb08e352521a1f062eb9d94a971cfbdb0

Sebastian Messmer [Mon, 19 Nov 2018 23:35:18 +0000 (15:35 -0800)]
Remove API macros from intrusive_ptr (#14137)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14137

This is a templated header-only class and shouldn't need export/import macros.

Reviewed By: ezyang

Differential Revision: D13111712

fbshipit-source-id: c8c958e75b090d011d25156af22f37f9ca605196

Jerry Zhang [Mon, 19 Nov 2018 23:29:45 +0000 (15:29 -0800)]
Tensor construction: combine Resize+mutable_data - 1/4 (#13942)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13942

Codemod generated with clangr shard mode, 25 files per diff,
motivation: https://github.com/pytorch/pytorch/pull/12407

Reviewed By: smessmer

Differential Revision: D13054770

fbshipit-source-id: a9e86e5dfcb4f7cebf5243e1d359fad064561bed

Jerry Zhang [Mon, 19 Nov 2018 23:25:43 +0000 (15:25 -0800)]
Tensor construction: combine Resize+mutable_data - 3/4 (#13944)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13944

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13854

Codemod generated with clangr shard mode, 25 files per diff,
motivation: https://github.com/pytorch/pytorch/pull/12407

Reviewed By: ezyang

Differential Revision: D13054836

fbshipit-source-id: 5de07a156687f1ee607d0450410881d9176a87a7

Lu Fang [Mon, 19 Nov 2018 22:29:31 +0000 (14:29 -0800)]
Store the optimize flag in module (#14166)

Summary:
When saving/loading a script module, we store the optimize flag in the module instead of encoding it in each method.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14166

Reviewed By: ezyang

Differential Revision: D13117577

Pulled By: dzhulgakov

fbshipit-source-id: dc322948bda0ac5809d8ef9a345497ebb8f33a61

Junjie Bai [Mon, 19 Nov 2018 22:21:20 +0000 (14:21 -0800)]
Cleanup caffe2 hipify exclude patterns (#14198)

Summary:
depthwise_3x3_conv_op.cu does not exist
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14198

Differential Revision: D13127479

Pulled By: bddppq

fbshipit-source-id: ec6bd434055a49ea405c4b399bde8c074114f955

Gregory Chanan [Mon, 19 Nov 2018 22:10:47 +0000 (14:10 -0800)]
Support 'python_module' of 'nn' in native functions. (#14126)

Summary:
Also move mse_loss, binary_cross_entropy, l1_loss to use this functionality.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14126

Reviewed By: ezyang

Differential Revision: D13109975

Pulled By: gchanan

fbshipit-source-id: 0b29dc8cf222d25db14da7532d8dc096a988a0ec

Junjie Bai [Mon, 19 Nov 2018 21:25:32 +0000 (13:25 -0800)]
Use onnx proto_utils to support using protobuf-lite

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14150

Differential Revision: D13115586

Pulled By: bddppq

fbshipit-source-id: d6b6935a8deac60f6f58d62a71f6840182a72a51

Daya S Khudia [Mon, 19 Nov 2018 20:08:35 +0000 (12:08 -0800)]
Use fbgemm revision file added by shipit (#14105)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14105

Pull Request resolved: https://github.com/facebook/fbshipit/pull/62

Use the fbgemm revision file created by ShipIt to update the fbgemm revision for pytorch. We don't have to manually update the submodule now.

Reviewed By: yns88

Differential Revision: D13072074

fbshipit-source-id: bef9eabad50f7140179c370a60bd9ca73067b9b5

Your Name [Mon, 19 Nov 2018 19:26:38 +0000 (11:26 -0800)]
Setup sccache for PyTorch ROCm CI (#14153)

Summary:
Discovered a huge build time difference between the caffe2 ROCm build and the pytorch ROCm build (6 min vs. 30 min); it turns out the sccache setup present in the caffe2 docker images is not in the pytorch build script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14153

Differential Revision: D13115097

Pulled By: bddppq

fbshipit-source-id: 88414f164b980f0e667c8e138479b4a75ab7692e

Ailing Zhang [Mon, 19 Nov 2018 17:45:28 +0000 (09:45 -0800)]
allow empty index for scatter_* methods (#14077)

Summary:
Fixes #2027
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14077

Differential Revision: D13095788

Pulled By: ailzhang

fbshipit-source-id: ad2c8bbf83d36e07940782b9206fbdcde8905fd3

ArmenAg [Mon, 19 Nov 2018 17:18:45 +0000 (09:18 -0800)]
use at::Device throughout JIT (#14181)

Summary:
zdevito soumith

Sorry about the previous PR, I had some git issues. This is the exact same code as the previous PR, but updated w.r.t. pytorch/master.

fixes #13254
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14181

Differential Revision: D13117688

Pulled By: soumith

fbshipit-source-id: 044840b2c7a0101ef43dd16655fd9a0f9981f53f

Gregory Chanan [Mon, 19 Nov 2018 16:18:47 +0000 (08:18 -0800)]
Support named return arguments in native_functions. (#14100)

Summary:
Note there was a hacky way of doing this before by specifying "return:" lists manually; this makes the
return names part of the function declaration itself.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14100

Differential Revision: D13101810

Pulled By: gchanan

fbshipit-source-id: 1c80574cd4e8263764fc65126427b122fe36df35

Edward Yang [Mon, 19 Nov 2018 16:13:08 +0000 (08:13 -0800)]
Split out CUDAMultiStreamGuard from CUDAGuard (#13912)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13912

The implementation and API of CUDAMultiStreamGuard are less mature,
and it cannot be implemented generically (yet) in c10_cuda. This
might be a reasonable thing to do eventually, but not for now.

Reviewed By: smessmer

Differential Revision: D13046500

fbshipit-source-id: 4ea39ca1344f1ad5ae7c82c98617aa348c327848

Edward Yang [Mon, 19 Nov 2018 16:13:08 +0000 (08:13 -0800)]
Move AT_CUDA_CHECK to c10

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13910

Reviewed By: smessmer

Differential Revision: D13046201

fbshipit-source-id: 8d360a0e4d6c2edf070d130e600c6b04f0ee0058

Edward Yang [Mon, 19 Nov 2018 16:13:07 +0000 (08:13 -0800)]
Add c10 cuda library. (#13900)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13900

Add c10 cuda library.

Right now, this is not used by anything, and only tests that the CUDA
headers are available (and not, e.g., that linking works).

Extra changes:
- cmake/public/cuda.cmake is now correctly include-guarded, so you
  can include it multiple times without trouble.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Reviewed By: smessmer

Differential Revision: D13025313

fbshipit-source-id: fda85b4c35783ffb48ddd6bbb98dbd9154119d86

Marat Dukhan [Mon, 19 Nov 2018 07:55:01 +0000 (23:55 -0800)]
Switch Int8Add operator to QNNPACK (#14089)

Summary:
- Improved single-threaded performance due to optimized low-level micro-kernels
- Improved parallelization (previously was parallelized across images in a batch and pixels only, now within channels as well)
- Slightly different results due to a different implementation of fixed-point arithmetic (no accuracy loss expected)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14089

Differential Revision: D13110135

Pulled By: Maratyszcza

fbshipit-source-id: 1f149394af5c16940f79a3fd36e183bba1be2497

Teng Li [Sun, 18 Nov 2018 21:51:15 +0000 (13:51 -0800)]
No more -werror for c10d (#14155)

Summary:
As the title says
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14155

Differential Revision: D13115769

Pulled By: teng-li

fbshipit-source-id: 278deba090364544d92fa603621604ce37fa974e

Summer Deng [Sun, 18 Nov 2018 20:49:39 +0000 (12:49 -0800)]
Add ultra low precision options (#14133)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14133

Experiment with ultra low precisions on the Resnext-101 URU trunk model

Reviewed By: jspark1105

Differential Revision: D10108518

fbshipit-source-id: f04d74fbe1c9e75efafcd9845719bdb2efbbfe9c

Soumith Chintala [Sun, 18 Nov 2018 17:20:29 +0000 (09:20 -0800)]
Adds symbolic diff for THNN Conv2d and aten native BatchNorm (#13888)

Summary:
Adds symbolic diff and tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13888

Differential Revision: D13115548

Pulled By: soumith

fbshipit-source-id: ba75b01a95a5715a7761724dda018168b6188917

Your Name [Sun, 18 Nov 2018 08:09:25 +0000 (00:09 -0800)]
Print warning when ROCm memory leaking is detected in pytorch tests (#14151)

Summary:
We keep seeing random failures in CI because of ROCm memory leaks, e.g.:

https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/3102//console
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/3080//console

To make the CI more stable, turn it into a warning instead of a failure.

iotamudelta please help investigate the memory leaks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14151

Differential Revision: D13115096

Pulled By: bddppq

fbshipit-source-id: a13b68274ecba363d9d8436aa6a62ac40a77d78c

vishwakftw [Sun, 18 Nov 2018 06:25:39 +0000 (22:25 -0800)]
Remove debugging code in test_cholesky_batched (#14156)

Summary:
They didn't turn up in my tests because I use pytest, which doesn't
print debug statements if the tests pass.

Differential Revision: D13115227

Pulled By: soumith

fbshipit-source-id: 46a7d47da7412d6b071158a23ab21e7fb0c6e11b

5 years agoBack out "[reland][codemod][caffe2] Tensor construction: combine Resize+mutable_data...
Jerry Zhang [Sun, 18 Nov 2018 03:42:42 +0000 (19:42 -0800)]
Back out "[reland][codemod][caffe2] Tensor construction: combine Resize+mutable_data - 2/4" (#14154)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14154

Original commit changeset: e89c2e692178

Reviewed By: amateurcoffee

Differential Revision: D13115023

fbshipit-source-id: 8f9fb55842ae6c8139d5cd88ec6d0abb0c5cc5e7

5 years agoCostInference for 1D conv (#14009)
Martin Schatz [Sun, 18 Nov 2018 01:26:09 +0000 (17:26 -0800)]
CostInference for 1D conv (#14009)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14009

As title

Reviewed By: yinghai

Differential Revision: D13078718

fbshipit-source-id: 081e7b13ad6741c635ef413915b555f10f93bd33

5 years agoBatched cholesky decomposition (#14017)
vishwakftw [Sat, 17 Nov 2018 18:47:17 +0000 (10:47 -0800)]
Batched cholesky decomposition (#14017)

Summary:
Implements batching for the Cholesky decomposition.

Performance could be improved with a dedicated batched `tril` and `triu` op, which is also impeding autograd operations.

Changes made:
- batching code
- tests in `test_torch.py`, `test_cuda.py` and `test_autograd.py`.
- doc string modification
- autograd modification
- removal of `_batch_potrf` in `MultivariateNormal`.
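Semantically, a batched Cholesky factorization just applies the unbatched factorization independently over the leading batch dimension(s). A minimal pure-Python sketch for a batch of 2x2 SPD matrices (illustrative helper names only; this is not the actual implementation, which dispatches to LAPACK/MAGMA):

```python
import math

def cholesky_2x2(a):
    # a = [[a00, a01], [a10, a11]], symmetric positive definite.
    # Returns lower-triangular L with L @ L.T == a.
    l00 = math.sqrt(a[0][0])
    l10 = a[1][0] / l00
    l11 = math.sqrt(a[1][1] - l10 * l10)
    return [[l00, 0.0], [l10, l11]]

def batched_cholesky_2x2(batch):
    # Batched version: apply the factorization over the batch dimension.
    return [cholesky_2x2(a) for a in batch]

batch = [[[4.0, 2.0], [2.0, 3.0]],
         [[9.0, 3.0], [3.0, 5.0]]]
factors = batched_cholesky_2x2(batch)
```

The dedicated batched `tril`/`triu` mentioned above would let this loop be fused on the GPU instead of iterating per matrix.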
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14017

Differential Revision: D13087945

Pulled By: ezyang

fbshipit-source-id: 2386db887140295475ffc247742d5e9562a42f6e

5 years agoremove unnecessary file from avx2 list (#14012)
Jongsoo Park [Sat, 17 Nov 2018 18:26:56 +0000 (10:26 -0800)]
remove unnecessary file from avx2 list (#14012)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14012

conv_dnnlowp_op.cc doesn't need avx2 anymore.

Reviewed By: dskhudia

Differential Revision: D13079665

fbshipit-source-id: dbfe8d2213de4969b6334d54de81d51149268cbd

5 years agoChange from using enum to int to store data_type
Your Name [Sat, 17 Nov 2018 17:22:09 +0000 (09:22 -0800)]
Change from using enum to int to store data_type

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14140

Differential Revision: D13112937

Pulled By: bddppq

fbshipit-source-id: 124d9546bfbd1f9c207a21e40eb3646f7739bd58

5 years agoRevert "CircleCI: fix NCCL install (#14124)" (#14146)
Junjie Bai [Sat, 17 Nov 2018 08:20:44 +0000 (00:20 -0800)]
Revert "CircleCI: fix NCCL install (#14124)" (#14146)

Summary:
This reverts commit a1fa9d8cf9b2b0e7373ec420c2487d4dfd0e587c.

[pytorch_linux_trusty_py2_7_9_build](https://circleci.com/gh/pytorch/pytorch/270206?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link/console):
```
Nov 17 07:37:27 + sudo apt-get -qq update
Nov 17 07:37:30 W: Ignoring Provides line with DepCompareOp for package gdb-minimal
Nov 17 07:37:30 W: You may want to run apt-get update to correct these problems
Nov 17 07:37:30 + sudo apt-get -qq install --allow-downgrades --allow-change-held-packages openmpi-bin libopenmpi-dev
Nov 17 07:37:30 E: Command line option --allow-downgrades is not understood
Nov 17 07:37:30 + cleanup
Nov 17 07:37:30 + retcode=100
Nov 17 07:37:30 + set +x
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14146

Differential Revision: D13113912

Pulled By: bddppq

fbshipit-source-id: cd9d371cf72159f03d12a8b56ed5bd2060ebbe59

5 years agoRevert D10428917: [Caffe2] Add cost into profile observer
Junjie Bai [Sat, 17 Nov 2018 07:26:12 +0000 (23:26 -0800)]
Revert D10428917: [Caffe2] Add cost into profile observer

Differential Revision:
D10428917

Original commit changeset: 7c100e551bdd

fbshipit-source-id: 5164d9ba61cc103eccfdeb91a5cc140cea31a819

5 years agoRevert D10439558: Add cost for non-linear ops
Junjie Bai [Sat, 17 Nov 2018 07:26:12 +0000 (23:26 -0800)]
Revert D10439558: Add cost for non-linear ops

Differential Revision:
D10439558

Original commit changeset: 9aeb05bac8b5

fbshipit-source-id: f00977b4f95bdd500d254eb44fb5b0c816506ee4

5 years agoUpdate FXdiv submodule (#14128)
Marat Dukhan [Sat, 17 Nov 2018 05:57:42 +0000 (21:57 -0800)]
Update FXdiv submodule (#14128)

Summary:
Use the most recent version that disables inline assembly.
I suspect inline assembly causes miscompilation on some versions of gcc7.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14128

Reviewed By: bddppq

Differential Revision: D13112370

Pulled By: Maratyszcza

fbshipit-source-id: 36cc95dc51390a293b72c18ae982c3a515a11981

5 years agoRename neon2sse.h to NEON_2_SSE.h to match upstream repo
Marat Dukhan [Sat, 17 Nov 2018 05:21:40 +0000 (21:21 -0800)]
Rename neon2sse.h to NEON_2_SSE.h to match upstream repo

Summary:
- NEON2SSE is a header that implements NEON intrinsics on top of SSE intrinsics
- Upstream repo provides NEON_2_SSE.h header, but internally it was imported as neon2sse.h
- This patch fixes incompatibilities between internal and upstream versions

Reviewed By: hlu1

Differential Revision: D13096755

fbshipit-source-id: 65e1df9a2a5e74bd52c9aee9be27469ba938cd8c

5 years agoDisable QNNPACK for multi-architecture iOS builds (#14125)
Marat Dukhan [Sat, 17 Nov 2018 05:02:37 +0000 (21:02 -0800)]
Disable QNNPACK for multi-architecture iOS builds (#14125)

Summary:
QNNPACK contains assembly files, and CMake tries to build them for the wrong architectures in multi-arch builds. This patch has two effects:
- Disables QNNPACK in multi-arch iOS builds
- Specifies a single `IOS_ARCH=arm64` by default (covers most iPhones/iPads on the market)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14125

Differential Revision: D13112366

Pulled By: Maratyszcza

fbshipit-source-id: b369083045b440e41d506667a92e41139c11a971

5 years agoRegister caffe2 layer norm with c10 dispatcher (#13693)
Sebastian Messmer [Sat, 17 Nov 2018 04:10:31 +0000 (20:10 -0800)]
Register caffe2 layer norm with c10 dispatcher (#13693)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13693

We can't directly call the caffe2::Operator class from c10 yet because that class isn't deprotobuffed yet.
Instead, we factor out the kernel into a reusable static method and call it from the caffe2::Operator and
also register it with c10.

Reviewed By: ezyang

Differential Revision: D12912242

fbshipit-source-id: c57502f14cea7a8be281f9787b175bb6e402d00c

5 years agoAdd c10/core/ to cmake build (#14111)
Sebastian Messmer [Sat, 17 Nov 2018 04:10:30 +0000 (20:10 -0800)]
Add c10/core/ to cmake build (#14111)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14111

It was already in TARGETs, but we forgot it in cmake.

Reviewed By: ezyang

Differential Revision: D13105166

fbshipit-source-id: f09549e98ebca751339b5ada1150e00cc4cd9540

5 years agoUpdate atol scale in dnnlowp test (#14135)
Haixin Liu [Sat, 17 Nov 2018 03:08:49 +0000 (19:08 -0800)]
Update atol scale in dnnlowp test (#14135)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14135

Update the atol scale of the dnnlowp test. I can't reproduce the flaky test error from the task locally even after setting the same seed value, but according to the comments in check_quantized_results_close(), atol_scale should be 1/1.9 = 0.526315789473684, which is larger than the current value of 0.51. So increase atol_scale to 0.53.

Reviewed By: jspark1105

Differential Revision: D13108415

fbshipit-source-id: 1e8840659fdf0092f51b439cf499858795f9706a

5 years agofix sparse_adagrad param_size overflow error (#14049)
Jongsoo Park [Sat, 17 Nov 2018 02:49:08 +0000 (18:49 -0800)]
fix sparse_adagrad param_size overflow error (#14049)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14049

param_size should be passed as int64_t
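The failure mode is the usual 32-bit truncation: a parameter size above 2^31 - 1 wraps to a negative value if it is passed through a 32-bit signed int. A small sketch simulating that truncation (illustrative only; the real code is C++):

```python
def to_int32(x):
    # Simulate passing a value through a signed 32-bit integer.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

param_size = 3_000_000_000  # e.g. a very large embedding table
truncated = to_int32(param_size)  # wraps negative: int64_t avoids this
```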

Reviewed By: hyuen

Differential Revision: D13090511

fbshipit-source-id: 7892d315d7c82c7d7ca103fb36d30cdf1fe24785

5 years agoAdd cost for non-linear ops (#13327)
Haixin Liu [Sat, 17 Nov 2018 02:30:49 +0000 (18:30 -0800)]
Add cost for non-linear ops (#13327)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13327

Add cost inference function to non-linear ops. Since the actual flops of a non-linear operator depend on the implementation, we use the number of non-linear operations as a proxy for the analytical flops of non-linear operators.
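For an elementwise non-linearity this proxy reduces to the output element count, with bytes read and written proportional to it. A hedged sketch of such a cost function (hypothetical dict fields; Caffe2's actual cost struct lives in C++ OpSchema):

```python
from math import prod

def elementwise_cost(dims, itemsize=4):
    # Proxy: one "non-linear op" per element; read input, write output.
    n = prod(dims)
    return {"flops": n, "bytes_read": n * itemsize, "bytes_written": n * itemsize}

cost = elementwise_cost((17, 1, 2048))  # fp32 tensor shape
```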

Reviewed By: jspark1105

Differential Revision: D10439558

fbshipit-source-id: 9aeb05bac8b5c7ae5d351ebf365e0a81cf4fc227

5 years agoAdd cost into profile observer (#12793)
Haixin Liu [Sat, 17 Nov 2018 02:30:49 +0000 (18:30 -0800)]
Add cost into profile observer (#12793)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12793

Add analytical cost into profile observer. It includes the op level cost information for each op run and net level aggregated cost information for each op type.

It outputs the following information:
1. analytical flops
2. analytical bytes_read
3. analytical bytes_written

Example output at op level:
```I1017 14:58:14.245978 3686541 profile_observer_gpu.cc:26] --------- Starting operator FC op#24 ---------
I1017 14:58:14.246049 3686541 profile_observer_gpu.cc:33] Input 0: Tensor model1/embedded_encoder_inputs of type float. Dims: (17,1,256,):
I1017 14:58:14.246109 3686541 profile_observer_gpu.cc:33] Input 1: Tensor model1/encoder/layer0/fw/milstm/i2h_w of type float. Dims: (2048,256,):
I1017 14:58:14.246176 3686541 profile_observer_gpu.cc:33] Input 2: Tensor model1/encoder/layer0/fw/milstm/i2h_b of type float. Dims: (2048,):
I1017 14:58:14.246217 3686541 profile_observer_gpu.cc:44] Argument 0: name: "use_cudnn" i: 1
I1017 14:58:14.246271 3686541 profile_observer_gpu.cc:44] Argument 1: name: "cudnn_exhaustive_search" i: 0
I1017 14:58:14.246338 3686541 profile_observer_gpu.cc:44] Argument 2: name: "order" s: "NHWC"
I1017 14:58:14.246372 3686541 profile_observer_gpu.cc:44] Argument 3: name: "axis" i: 2
I1017 14:58:14.246418 3686541 profile_observer_gpu.cc:44] Argument 4: name: "quantization_scheme" i: 1
I1017 14:58:14.246470 3686541 profile_observer_gpu.cc:53] Output 0: Tensor model1/encoder/layer0/fw/milstm/i2h of type float. Dims: (17,1,2048,):
I1017 14:58:14.246596 3686541 profile_observer_gpu.cc:61] Cost (flops, bytes_read, bytes_written):
I1017 14:58:14.246649 3686541 profile_observer_gpu.cc:62]        17860608 2122752 139264
I1017 14:58:14.246677 3686541 profile_observer_gpu.cc:64] --------- Finished operator FC in 0.764221 ms ---------
```
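The FC numbers in the log above are easy to reproduce by hand: with M = 17 (batch x time), K = 256, N = 2048, the flops are 2*M*N*K for the matmul plus M*N for the bias add; bytes_read covers the fp32 input, weight, and bias; bytes_written is the fp32 output:

```python
M, K, N, itemsize = 17, 256, 2048, 4  # fp32

flops = 2 * M * N * K + M * N                # matmul + bias add
bytes_read = (M * K + N * K + N) * itemsize  # input + weight + bias
bytes_written = M * N * itemsize             # output
```

These match the logged `17860608 2122752 139264` exactly.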
Example output at net level:
```
I1017 11:13:44.675585 3146691 profile_observer_gpu.cc:165] ================ Detailed stats for net model0/encoder/layer0/bw/milstm ================
I1017 11:13:44.675662 3146691 profile_observer_gpu.cc:167] Cost (flops, bytes_read, bytes_written) per operator type:
I1017 11:13:44.675706 3146691 profile_observer_gpu.cc:169]        20992000 42045440 81920 FC
I1017 11:13:44.675745 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Mul
I1017 11:13:44.675824 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Sum
I1017 11:13:44.675878 3146691 profile_observer_gpu.cc:169]               0 0 0 ElementwiseLinear
I1017 11:13:44.675909 3146691 profile_observer_gpu.cc:169]               0 0 0 LSTMUnit
I1017 11:13:44.675958 3146691 profile_observer_gpu.cc:169]               0 0 0 rnn_internal_apply_link
```

Reviewed By: mdschatz

Differential Revision: D10428917

fbshipit-source-id: 7c100e551bdd3ac8d7c09be12c72d70a2d67cae1

5 years agoCircleCI: fix NCCL install (#14124)
Will Feng [Sat, 17 Nov 2018 02:28:55 +0000 (18:28 -0800)]
CircleCI: fix NCCL install (#14124)

Summary:
The `$BUILD_ENVIRONMENT` checks work in `test.sh` but not `build.sh`; this PR is trying to figure out why.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14124

Reviewed By: teng-li

Differential Revision: D13112483

Pulled By: yf225

fbshipit-source-id: 5f65997586648805cf52217a261389625b5535e1

5 years agoFixed MPI build with higher version of GCC (#14122)
Teng Li [Sat, 17 Nov 2018 02:02:13 +0000 (18:02 -0800)]
Fixed MPI build with higher version of GCC (#14122)

Summary:
This appeared once I enabled -Werror in the c10d build. Good to catch this and fix it.

Should fix https://github.com/pytorch/pytorch/issues/14078 and https://github.com/pytorch/pytorch/issues/13962
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14122

Differential Revision: D13110678

Pulled By: teng-li

fbshipit-source-id: f4c19e16976d65debbd33ed59e17ddbaa19f765a

5 years agomultiprocessing.spawn python version check (#14039)
Teng Li [Sat, 17 Nov 2018 01:49:56 +0000 (17:49 -0800)]
multiprocessing.spawn python version check (#14039)

Summary:
This will be super helpful to the user
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14039

Differential Revision: D13089200

Pulled By: teng-li

fbshipit-source-id: 29e7507bd8fe5a0c58a85c52f976bfca282b4c1b

5 years agoDon't python bind _thnn_ functions. (#14101)
Gregory Chanan [Sat, 17 Nov 2018 00:47:00 +0000 (16:47 -0800)]
Don't python bind _thnn_ functions. (#14101)

Summary:
This is needed for moving nn functions to native functions, but since some functions are already named
this way, I'm going to stop binding pre-emptively so we can check if there are any current dependencies.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14101

Differential Revision: D13102219

Pulled By: gchanan

fbshipit-source-id: 6bbcca33a03ab1bf648f1b73cadfe84339fa3050

5 years agoFix docs/cpp/requirements.txt (#14121)
Peter Goldsborough [Fri, 16 Nov 2018 22:53:19 +0000 (14:53 -0800)]
Fix docs/cpp/requirements.txt (#14121)

Summary:
soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14121

Differential Revision: D13108063

Pulled By: goldsborough

fbshipit-source-id: 35cf65ba776e8826c5cab7ae6d3a2d446f87e7cc

5 years agoAllow cooperative structured objects to be passed modules in tracing (#13961)
Thomas Viehmann [Fri, 16 Nov 2018 21:59:31 +0000 (13:59 -0800)]
Allow cooperative structured objects to be passed modules in tracing (#13961)

Summary:
Before this patch, the JIT does not allow Module's forward to take
structured objects.
This patch allows cooperative objects to do so.
Cooperative means:
- It has a method self._jit_unwrap() that returns (a list/tuple of)
  tensors. These are then used in _iter_tensors.
- It has a method self._jit_wrap(flattened_input) that takes
  (a list/tuple?) the flattened_input (potentially more than it needs)
  and returns itself (updated) and the unconsumed flattened_inputs.
  This is then used in the _unflatten mechanism.
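A minimal sketch of such a cooperative container (pure Python; `BoxPair` is a hypothetical class, and strings stand in for tensors — the real BoxList/ImageList carry tensors):

```python
class BoxPair:
    """Structured object the tracer can flatten and rebuild."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def _jit_unwrap(self):
        # Hand the tracer a flat list of the tensors we hold.
        return [self.a, self.b]

    def _jit_wrap(self, flattened_input):
        # Consume what we need; return (self, unconsumed leftovers).
        self.a, self.b = flattened_input[0], flattened_input[1]
        return self, flattened_input[2:]

box = BoxPair("t0", "t1")
flat = box._jit_unwrap() + ["extra"]
rebuilt, rest = BoxPair(None, None)._jit_wrap(flat)
```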

This is all it takes to permit maskrcnn-benchmark to use
its structured BoxList/ImageList types and trace it without calling
the .forward directly.
I'll push a model working with this patch in
https://github.com/facebookresearch/maskrcnn-benchmark/pull/138

I must admit I haven't fully checked whether there are ONNX changes needed before it, too, can profit, but I would be hopeful that anything currently usable remains so.

fmassa zdevito

So the main downside that I'm aware of is that people will later want to use more elaborate mechanisms, but I think this could be done by just amending what wrap/unwrap are returning / consuming.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13961

Differential Revision: D13103927

Pulled By: soumith

fbshipit-source-id: 2cbc724cc4b53197388b662f75d9e601a495c087

5 years agoAdd SharedDataset (#13800)
Peter Goldsborough [Fri, 16 Nov 2018 21:01:25 +0000 (13:01 -0800)]
Add SharedDataset (#13800)

Summary:
This PR adds a `SharedDataset` to the C++ frontend data API, which allows wrapping a shared_ptr to a dataset into a class that conforms to the `Dataset` interface (with `get_batch`). This enables use cases where a custom dataset is (1) thread-safe and (2) expensive to copy. All workers will reference a single instance of this dataset. No additional copies are incurred.

jaliyae apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13800

Differential Revision: D13075610

Pulled By: goldsborough

fbshipit-source-id: 4ffdfd7959d49b042c0e254110085f62a0bfeb6c

5 years agoremove dynamic initialization warning (#13913) (#13967)
jjsjann123 [Fri, 16 Nov 2018 20:59:01 +0000 (12:59 -0800)]
remove dynamic initialization warning (#13913) (#13967)

Summary:
removed assignment in default constructor.
removed static shared memory and used dynamic shared memory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13967

Differential Revision: D13089996

Pulled By: soumith

fbshipit-source-id: 2a218b909c849bed39636b45a02d10ebc279a0b0

5 years agoMissing .decode() after check_output in cpp_extensions (#13935)
Peter Goldsborough [Fri, 16 Nov 2018 20:12:01 +0000 (12:12 -0800)]
Missing .decode() after check_output in cpp_extensions (#13935)

Summary:
soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13935

Differential Revision: D13090852

Pulled By: goldsborough

fbshipit-source-id: 47da269d074fd1e7220e90580692d6ee489ec78b

5 years agoWindows shared build (#13550)
ArutyunovG [Fri, 16 Nov 2018 20:06:21 +0000 (12:06 -0800)]
Windows shared build (#13550)

Summary:
Hi guys,

I'd like to build Caffe2 with more supported options in Windows with Microsoft Visual Studios.
This is the first pull request.
Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015.
CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system.
Python is 3.5, Detectron works from python interface as well.
It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built.

What is disappointing is that the c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.

After this pull request the next step is to add Visual Studio 2017 support in the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc

5 years agoMake JOIN_TIMEOUT longer for ppc64le (#14107)
Freddie Mendoza [Fri, 16 Nov 2018 20:05:27 +0000 (12:05 -0800)]
Make JOIN_TIMEOUT longer for ppc64le (#14107)

Summary:
This should resolve the ppc64le issue with FAIL: test_proper_exit (__main__.TestDataLoader). It only happens when the CI build machine is very busy and the test fails with a timeout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14107

Differential Revision: D13103859

Pulled By: soumith

fbshipit-source-id: 268be80b59840853c5025f3211af272f68608fe5

5 years agoLog error from the net's run (#14035)
Ilia Cherniavskii [Fri, 16 Nov 2018 20:01:01 +0000 (12:01 -0800)]
Log error from the net's run (#14035)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14035

Log the error message in case of a net's run failure

Reviewed By: andrewwdye

Differential Revision: D13085431

fbshipit-source-id: d79f76782410cd3a5bd2d8d7f5fb1e535d821051

5 years agoChange hip filename extension to .hip (#14036)
Junjie Bai [Fri, 16 Nov 2018 19:50:29 +0000 (11:50 -0800)]
Change hip filename extension to .hip (#14036)

Summary:
xw285cornell

- To make hip files to have unique filename extension we change hip files from _hip.cc to .hip (it's the only blessing option other than .cu in hipcc https://github.com/ROCm-Developer-Tools/HIP/blob/3d51a1fb0105e2f2312d2523c20e0034339f6ada/bin/hipcc#L552).
- Change to use the host compiler to compile .cc|.cpp files. Previously we used hcc to compile them, which was unnecessary.
- Change the hipify script to not replace "gpu" with "hip" in the filenames of the generated hipified files. Previously we did this because hcc has a bug when linking files that have the same filename. We have now changed to use the host linker to do linking, so this is no longer necessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14036

Reviewed By: xw285cornell

Differential Revision: D13091813

Pulled By: bddppq

fbshipit-source-id: ea3d887751d8abb39d75f5d5104aa66ce66b9ee0

5 years agoEnable Caffe2 ROCm test on centos (#14090)
Your Name [Fri, 16 Nov 2018 19:47:02 +0000 (11:47 -0800)]
Enable Caffe2 ROCm test on centos (#14090)

Summary:
xw285cornell petrex ashishfarmer rohithkrn
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14090

Differential Revision: D13096874

Pulled By: bddppq

fbshipit-source-id: b471c6e4db95cd51567745a2f758d58bba7eafad

5 years agoEnable Caffe2 test on centos (#14091)
Junjie Bai [Fri, 16 Nov 2018 19:45:01 +0000 (11:45 -0800)]
Enable Caffe2 test on centos (#14091)

Summary:
Turns out we don't have any CentOS test CI jobs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14091

Differential Revision: D13104722

Pulled By: bddppq

fbshipit-source-id: 22fe92ad4b7f2c391eea16b8b95658fa1ee605e2

5 years agoRelax limits for gradients in test_jit's checkGraph (#14094)
Thomas Viehmann [Fri, 16 Nov 2018 19:36:06 +0000 (11:36 -0800)]
Relax limits for gradients in test_jit's checkGraph (#14094)

Summary:
- This should help TestJit.test_lstm_fusion_concat_cuda
  to be less flaky. (Checked on manual_seed 0..99)
  Fixes: #14026
- Revert the renaming of test_fused_abs that was introduced
  to game the order of tests to avoid the flakiness above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14094

Differential Revision: D13100174

Pulled By: soumith

fbshipit-source-id: 91bb63b07a960a81dddfc0bf25c67696c0f6c46d

5 years agoadd torch-python target (#12742)
Anders Papitto [Fri, 16 Nov 2018 19:32:51 +0000 (11:32 -0800)]
add torch-python target (#12742)

Summary:
This is the next minimal step towards moving _C into cmake. For now,
leave _C in setup.py, but reduce it to an empty stub file. All of its
sources are now part of the new torch-python cmake target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12742

Reviewed By: soumith

Differential Revision: D13089691

Pulled By: anderspapitto

fbshipit-source-id: 1c746fda33cfebb26e02a7f0781fefa8b0d86385

5 years agoalias annotation parsing #2 (#14053)
Michael Suo [Fri, 16 Nov 2018 19:32:34 +0000 (11:32 -0800)]
alias annotation parsing #2 (#14053)

Summary:
hopefully this one doesn't break master.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14053

Differential Revision: D13093406

Pulled By: suo

fbshipit-source-id: 8fed44f1a3d463748726cb14acac2ea53dedf29b

5 years agoMake THPDtype_New error instead of truncate (#14103)
Andy Chen [Fri, 16 Nov 2018 19:32:05 +0000 (11:32 -0800)]
Make THPDtype_New error instead of truncate (#14103)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14103

Addressing T34828781, we change THPDtype_New so that it throws a RuntimeError if the length of the name is greater than the buffer size (DTYPE_NAME_LEN), instead of truncating the string to fit the buffer.
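The behavior change is the classic truncate-vs-fail choice. Sketched in Python (the real check lives in C around a fixed char buffer; the value 64 is assumed here for illustration, not the actual DTYPE_NAME_LEN):

```python
DTYPE_NAME_LEN = 64  # assumed buffer size, for illustration only

def set_dtype_name(name):
    if len(name) > DTYPE_NAME_LEN:
        # New behavior: fail loudly instead of silently truncating.
        raise RuntimeError(f"dtype name too long: {len(name)} > {DTYPE_NAME_LEN}")
    return name

err = None
try:
    set_dtype_name("x" * 100)
except RuntimeError as e:
    err = str(e)
```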

Reviewed By: ezyang

Differential Revision: D13094600

fbshipit-source-id: d0dbf8fdfa342630c31f4d8ca7230d5f24a1254a

5 years agoAdd filler for SparseLengthsWeightedSum (#13949)
Yinghai Lu [Fri, 16 Nov 2018 19:27:45 +0000 (11:27 -0800)]
Add filler for SparseLengthsWeightedSum (#13949)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13949

This diff adds support to fillers for `SparseLengthsWeight*` ops. It does 3 things:
1. Add the fillers for `SparseLengthsWeight*` ops
2. Add filling heuristics to consider the path of `LengthsRangeFill` -> `Gather` -> `SparseLengthsWeightedSum`, where the length input is shared by `LengthsRangeFill` and `SparseLengthsWeightedSum`. Therefore, we need to carefully bound the value of that length input so that `Gather` does not index out of bounds into its weight input.
3. Fix and simplify the logic of `math::RandFixedSum`, where we just keep rejecting the generated value if it violates the invariants.
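The rejection idea in point 3 can be sketched as: draw candidate values, and retry whenever the draw cannot be completed within the bounds. A simplified pure-Python version (not the actual `math::RandFixedSum` implementation):

```python
import random

def rand_fixed_sum(n, lo, hi, total, seed=0):
    # n integers in [lo, hi] that sum to `total`, by rejection sampling.
    assert n * lo <= total <= n * hi, "infeasible"
    rng = random.Random(seed)
    while True:
        vals = [rng.randint(lo, hi) for _ in range(n - 1)]
        last = total - sum(vals)
        if lo <= last <= hi:  # reject the draw if it breaks the bounds
            return vals + [last]

lengths = rand_fixed_sum(5, 0, 10, 23)
```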

Reviewed By: highker

Differential Revision: D13048216

fbshipit-source-id: bfe402e07e6421b28548047d18b298c148e0ec87

5 years agoUpdate ATen doc with optional syntax (#14086)
Wanchao Liang [Fri, 16 Nov 2018 18:01:09 +0000 (10:01 -0800)]
Update ATen doc with optional syntax (#14086)

Summary:
Update the readme to reflect the recent optional syntax change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14086

Differential Revision: D13096114

Pulled By: wanchaol

fbshipit-source-id: 713834d4d92021e1c7a31f3a56a00fb7da58c348

5 years agoAdd missing space in stft doc
Tongzhou Wang [Fri, 16 Nov 2018 17:54:44 +0000 (09:54 -0800)]
Add missing space in stft doc

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14092

Reviewed By: soumith

Differential Revision: D13100177

Pulled By: SsnL

fbshipit-source-id: 4eeaa3d0c04212516941d8d5a266aafb53bd9672

5 years agoPreemptively test for out-of-order length. (#13933)
Brian Vaughan [Fri, 16 Nov 2018 16:37:05 +0000 (08:37 -0800)]
Preemptively test for out-of-order length. (#13933)

Summary:
torch.nn.utils.rnn.pack_padded_sequence segfaults if lengths are not in
decreasing order (#13324)

We were seeing this segfault on throw; pre-emptively checking avoids
it:

*** Error in `/home/bvaughan/anaconda3/bin/python': double free or corruption (!prev): 0x00005555566e7510 ***
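At the time, `pack_padded_sequence` required lengths sorted in decreasing order, so callers typically sorted the batch up front and kept the inverse permutation to restore the original order afterwards. A sketch of that bookkeeping in plain Python (no torch needed to show the idea):

```python
lengths = [3, 7, 5]

# Sort descending, remembering where each sequence came from.
order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
sorted_lengths = [lengths[i] for i in order]

# Inverse permutation: position of original sequence i in the sorted batch.
inverse = [0] * len(order)
for pos, i in enumerate(order):
    inverse[i] = pos
```

Indexing the sorted batch with `inverse` recovers the original ordering after unpacking.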
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13933

Differential Revision: D13090389

Pulled By: nairbv

fbshipit-source-id: 6f6b319e74cb55830be799e9c46bc33aa59256d8

5 years agonomnigraph - support subgraph visualization (#13795)
Duc Ngo [Fri, 16 Nov 2018 16:04:58 +0000 (08:04 -0800)]
nomnigraph - support subgraph visualization (#13795)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13795

Add the ability to generate a dot string for a single subgraph, plus Python bindings (which is pretty useful for model exploration in Python)
Restructure the DotGenerator class a bit to make it easier to implement this feature

Reviewed By: bwasti

Differential Revision: D13010512

fbshipit-source-id: 825665438394b7e6968ab6da167b477af82a7b62

5 years agonomnigraph - easy - expose hasProduce(NodeRef) to python (#14075)
Duc Ngo [Fri, 16 Nov 2018 16:04:58 +0000 (08:04 -0800)]
nomnigraph - easy - expose hasProduce(NodeRef) to python (#14075)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14075

Expose hasProduce(NodeRef) to python

Reviewed By: bwasti

Differential Revision: D13092930

fbshipit-source-id: f1ec06e73e0f5f6a16ad0cbb7d2e3e499a861d8e