platform/upstream/pytorch.git
Tao Xu [Tue, 14 Sep 2021 01:14:32 +0000 (18:14 -0700)]
[iOS][OSS][BE] Add Simulator tests for full JIT (#64851)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64851

ghstack-source-id: 137970229

Test Plan: CircleCI

Reviewed By: hanton, cccclai

Differential Revision: D30877963

fbshipit-source-id: 7bb8ade1959b85c3902ba9dc0660cdac8f558d64

Emad El-Haraty [Tue, 14 Sep 2021 00:59:11 +0000 (17:59 -0700)]
add acc_ops.max, acc_ops.maximum, consolidate acc_ops.min and acc_ops.minimum

Summary:
This diff adds `acc_ops.max` and `acc_ops.maximum` support.
It further consolidates the logic for `acc_ops.min` and `acc_ops.minimum` to match the logic for max.

torch.max has three behaviors:
```
1. max(input)
2. max(input, dim, keepdim=False, *, out=None)
3. max(input, other, *, out=None)
```

Likewise, `torch.min` has the same three behaviors.

I've chosen to implement each as an acc_op, then map to the appropriate one.

The third max function is effectively `torch.maximum`, so I've implemented it as that.
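
The mapping between the three call shapes could be sketched roughly as follows. This is a minimal illustration in plain Python, not the actual acc_ops code: `classify_max_call` and the labels it returns are hypothetical names, and plain lists stand in for tensors so the sketch runs without torch. Only the three call shapes themselves come from the summary above.

```python
def classify_max_call(input, *args, **kwargs):
    """Return a label for which of the three max call shapes was used.

    Hypothetical helper for illustration only; `input` is any stand-in
    for a tensor (a plain list here).
    """
    # The second argument may be passed positionally or by keyword.
    second = args[0] if args else kwargs.get("other", kwargs.get("dim"))
    if second is None:
        return "max_full_reduce"   # 1. max(input)
    if isinstance(second, int):
        return "max_dim_reduce"    # 2. max(input, dim, keepdim=False)
    return "maximum"               # 3. max(input, other)


print(classify_max_call([1, 2]))            # full reduction
print(classify_max_call([1, 2], 0))         # reduction over dim 0
print(classify_max_call([1, 2], [3, 4]))    # elementwise maximum
```

Note that the check is `second is None` rather than a truthiness test, so `dim=0` is still classified as a reduction over a dimension.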

Reviewed By: yinghai, jfix71, 842974287

Differential Revision: D30551464

fbshipit-source-id: 0a2eec10e5185cbf7d9984eec3fd399b23528b2a

CaoE [Tue, 14 Sep 2021 00:58:20 +0000 (17:58 -0700)]
Add BFloat16 support for cross, tril, triu, tril_indices, triu_indices and cumsum operators on CPU (#62454)

Summary:
Add BFloat16 support for cross, tril, triu, tril_indices, triu_indices and cumsum operators on CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62454

Reviewed By: albanD

Differential Revision: D30845805

Pulled By: heitorschueroff

fbshipit-source-id: f83836862e38109ec929e83567133e9e88096b8b

David Riazati [Tue, 14 Sep 2021 00:47:18 +0000 (17:47 -0700)]
Use RDS for build size tracking (#64303)

Summary:
This adds 2 utilities: `register_rds_table` and `rds_write`. `register_rds_table` needs to be called once with the schema for the data that `rds_write` will write. These go to a lambda called `rds-proxy`, which will write to/read from the DB as necessary. This data can then be arbitrarily queried via `rds-proxy` (for use in CI) or on metrics.pytorch.org (for analysis).

It also hooks these up for build size tracking (which previously was not working on GHA)

TODO:
* verify output in logs + clean up prints

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64303

Reviewed By: malfet, seemethere

Differential Revision: D30711934

Pulled By: driazati

fbshipit-source-id: 0af808ddf528a24875a378caeb1aa9cb0693f802

Nikita Shulga [Tue, 14 Sep 2021 00:10:30 +0000 (17:10 -0700)]
Add `skipIfTBB` decorator (#64942)

Summary:
Adds the decorator and replaces two existing usages in the codebase with it.
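
A decorator of this kind can be sketched with the standard `unittest.skipIf`. This is only an assumption about the general shape: `skip_if_tbb`, the `USING_TBB` flag, and the reason string are hypothetical names chosen for the example, not the names or signature used in the PyTorch test suite.

```python
import unittest

# Assumption: a flag saying whether the build uses the TBB threading
# backend. It is hard-coded here so the sketch is self-contained.
USING_TBB = True


def skip_if_tbb(reason="not supported with the TBB threading backend"):
    """Hypothetical stand-in for a skipIfTBB-style decorator."""
    return unittest.skipIf(USING_TBB, reason)


class ExampleTest(unittest.TestCase):
    @skip_if_tbb()
    def test_needs_non_tbb(self):
        self.fail("never runs while USING_TBB is True")

    def test_always_runs(self):
        self.assertEqual(1 + 1, 2)


def run_example():
    """Run ExampleTest and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Wrapping the condition in one named decorator keeps the skip reason in a single place instead of repeating the same `skipIf` condition at every call site.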

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64942

Reviewed By: jbschlosser

Differential Revision: D30906382

Pulled By: malfet

fbshipit-source-id: e7f20f53aff734b0379eded361255543dab4fa4b

Victor Quach [Mon, 13 Sep 2021 23:39:55 +0000 (16:39 -0700)]
Raise TypeError on assigned grad with wrong type (#64876)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/64813

Raises a TypeError when the value assigned to a grad is not a Tensor or
None.

Adds tests.
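
The behavior can be illustrated with a toy class in plain Python. This is a sketch of the idea only: `ToyTensor` and the error message are made up for the example, and the real check lives in PyTorch's own implementation of the `grad` attribute.

```python
class ToyTensor:
    """Toy stand-in for a tensor (not torch.Tensor), for illustration."""

    def __init__(self):
        self._grad = None

    @property
    def grad(self):
        return self._grad

    @grad.setter
    def grad(self, value):
        # Only another tensor or None is accepted; anything else is a
        # type error rather than being stored silently.
        if value is not None and not isinstance(value, ToyTensor):
            raise TypeError(
                "grad must be a ToyTensor or None, got "
                + type(value).__name__
            )
        self._grad = value


t = ToyTensor()
t.grad = ToyTensor()   # accepted
t.grad = None          # accepted
try:
    t.grad = 1.0       # rejected
except TypeError as err:
    print("rejected:", err)
```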

cc ezyang gchanan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64876

Reviewed By: anjali411

Differential Revision: D30901678

Pulled By: soulitzer

fbshipit-source-id: dbb3cb5fd0bbac6918e0b2e2f51d340daa43dee0

Natalia Gimelshein [Mon, 13 Sep 2021 23:31:07 +0000 (16:31 -0700)]
kill SkipInfo (#64878)

Summary:
Per offline discussion, replaces SkipInfo with DecorateInfo. SkipInfo class itself is not removed yet to give functorch time to replace its SkipInfos.
cc zou3519

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64878

Reviewed By: mruberry

Differential Revision: D30908052

Pulled By: ngimel

fbshipit-source-id: 5124180b25c6e32517722883b9f3a2b488e3fe20

Shirong Wu [Mon, 13 Sep 2021 22:53:20 +0000 (15:53 -0700)]
Fix TRTOperatorSupport (#64873)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64873

Fix TRTOperatorSupport's key naming to match the key generated by torch.fx.passes.tools_common.get_node_target. splitter_base uses get_node_target to check, by name, whether an operator is supported.

Test Plan:
Print out the supported operator dict and check the names.
Run TRTSplitter with lrm_split_model_generator and verify that the split result is correct, with all supported operators printed.
Current split result:
```
Supported node types in the model:
acc_ops.size: ((), {'input': torch.float32})
acc_ops.getitem: ((), {'input': torch.float32})
acc_ops.getitem: ((), {'input': None})
acc_ops.reshape: ((), {'input': torch.float32})
acc_ops.unsqueeze: ((), {'input': torch.float32})
acc_ops.linear: ((), {'input': torch.float32, 'weight': torch.float32})
acc_ops.linear: ((), {'input': torch.float32, 'weight': torch.float32, 'bias': torch.float32})
acc_ops.mul: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.cat: ((), {})
acc_ops.add: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.add: ((), {'input': torch.float32})
acc_ops.tanh: ((), {'input': torch.float32})
acc_ops.transpose: ((), {'input': torch.float32})
acc_ops.matmul: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.div: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.squeeze: ((), {'input': torch.float32})
acc_ops.noop: ((), {'input': torch.float32})
acc_ops.layer_norm: ((), {'input': torch.float32, 'weight': torch.float32, 'bias': torch.float32})
acc_ops.permute: ((), {'input': torch.float32})
acc_ops.sigmoid: ((), {'input': torch.float32})
acc_ops.flatten: ((), {'input': torch.float32})
acc_ops.softmax: ((), {'input': torch.float32})
acc_ops.sum: ((), {'input': torch.float32})

Unsupported node types in the model:
torch.ops.fb.pad_sequence_embeddings: ((), {'embeddings': torch.float32, 'offsets': torch.int32})
acc_ops.linalg_norm: ((), {'input': torch
```

Reviewed By: yinghai

Differential Revision: D30884463

fbshipit-source-id: 22442aa6a69cd148ce9bc8be8f62157dd6d19954

Eli Uriegas [Mon, 13 Sep 2021 22:21:51 +0000 (15:21 -0700)]
Revert D30878101: [pytorch][PR] Fix test report uploading

Test Plan: revert-hammer

Differential Revision:
D30878101 (https://github.com/pytorch/pytorch/commit/fba40bfc1ab45b4410504ec64b585c4df74b6f47)

Original commit changeset: 0730f17fa3f4

fbshipit-source-id: dad89e68b4daf656dd0b592bc9b2758f00af38c6

Vasiliy Kuznetsov [Mon, 13 Sep 2021 22:20:44 +0000 (15:20 -0700)]
torch.ao migration: fake_quantize.py, phase 1 (#64814)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64814

1. move the file
```
hg mv caffe2/torch/quantization/fake_quantize.py caffe2/torch/ao/quantization/
```

2. create a new file in the old location and copy the imports
3. fix all callsites inside `torch`

Test Plan:
```
buck test mode/dev //caffe2/test:quantization
```

Reviewed By: z-a-f

Differential Revision: D30866792

fbshipit-source-id: 7a221cb46c0ab01f1c5de9be061f09ecc83ce23e

Scott Wolchok [Mon, 13 Sep 2021 21:31:36 +0000 (14:31 -0700)]
[PyTorch] Reduce heap allocations in OperatorName::setNamespaceIfNotSet (#64673)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64673

We are now guaranteed to allocate at most one time in this function.
ghstack-source-id: 137786392

Test Plan: Previous diff adds test coverage for this function.

Reviewed By: dhruvbird

Differential Revision: D30813014

fbshipit-source-id: 17d844a1cc8c30574afcc6b0b41b219e62c0b723

Scott Wolchok [Mon, 13 Sep 2021 21:31:36 +0000 (14:31 -0700)]
[PyTorch] Add test for operator_name (#64672)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64672

Just a small struct missing test coverage. Next diff changes it.
ghstack-source-id: 137786388

Test Plan: CI

Reviewed By: dhruvbird

Differential Revision: D30813013

fbshipit-source-id: 05f39494bb9512a71a928bfe6fcfa710016bdf61

Emad El-Haraty [Mon, 13 Sep 2021 21:22:53 +0000 (14:22 -0700)]
handle the case in acc_ops.sum when dim == 0, differentiating it from the case when dim is None (#64869)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64869

handle the case in acc_ops.sum when dim == 0, differentiating it from the case when dim is None
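
The summary does not say what the original code looked like. One common way this kind of bug arises in Python is a truthiness check, since `0` is falsy just like `None`. The sketch below illustrates that pattern under this assumption, using lists of lists in place of tensors; `buggy_sum` and `fixed_sum` are hypothetical names.

```python
def _sum_over_axis(matrix, dim):
    """Sum a list-of-lists over axis 0 (columns) or axis 1 (rows)."""
    if dim == 0:
        return [sum(col) for col in zip(*matrix)]
    if dim == 1:
        return [sum(row) for row in matrix]
    raise ValueError("dim must be 0 or 1 in this sketch")


def buggy_sum(matrix, dim=None):
    # Wrong: `not dim` is also true for dim == 0, so a reduction over
    # the first axis is silently turned into a full reduction.
    if not dim:
        return sum(sum(row) for row in matrix)
    return _sum_over_axis(matrix, dim)


def fixed_sum(matrix, dim=None):
    # Right: only dim is None means "reduce over everything".
    if dim is None:
        return sum(sum(row) for row in matrix)
    return _sum_over_axis(matrix, dim)


m = [[1, 2], [3, 4]]
print(buggy_sum(m, 0))   # full reduction, not what was asked for
print(fixed_sum(m, 0))   # per-column sums
```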

Reviewed By: 842974287

Differential Revision: D30872739

fbshipit-source-id: 2755d3230804a16ef1c9289f804138c6dd7766b3

XiaobingSuper [Mon, 13 Sep 2021 20:21:23 +0000 (13:21 -0700)]
fix build error when system cmake3 version >=3.5 but <=3.10 (#64914)

Summary:
When building PyTorch from source with conda, https://github.com/pytorch/pytorch/blob/8535418a06d75025541370cc656a8b6a0330ca0d/CMakeLists.txt#L1 raises an error if the CMake version is < 3.10. This can be fixed by upgrading CMake in the conda environment. On CentOS, however, there is also a system CMake3: PyTorch first checks whether CMake3's version is >= 3.5, so if the system CMake3 is >= 3.5 but < 3.10, PyTorch will use it, which produces a build error like:
```
CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
  CMake 3.10 or higher is required.  You are running version 3.6.3

-- Configuring incomplete, errors occurred!
```

We also need to check that CMake3 is >= 3.10; if it is not, we check conda's CMake version instead.
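
The selection logic can be sketched as below. This is an illustration under stated assumptions, not the build script itself: `pick_cmake`, `parse_version`, and the labels `"cmake3"` and `"cmake"` are hypothetical, versions are assumed to be plain dotted numbers, and preferring the system CMake3 over conda's CMake when both qualify is a choice made for the example.

```python
MIN_CMAKE = (3, 10)  # minimum required by cmake_minimum_required


def parse_version(text):
    """Turn a dotted version such as '3.6.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def pick_cmake(cmake3_version, conda_cmake_version):
    """Return the label of the first candidate that is new enough.

    Either argument may be None when that executable is not installed.
    """
    candidates = (("cmake3", cmake3_version), ("cmake", conda_cmake_version))
    for label, version in candidates:
        if version is not None and parse_version(version) >= MIN_CMAKE:
            return label
    raise RuntimeError("no CMake >= 3.10 found")


# The failing case from the error message: system cmake3 is 3.6.3,
# so the conda CMake must be used instead.
print(pick_cmake("3.6.3", "3.19.6"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings, where `"3.6"` would sort after `"3.10"`.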

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64914

Reviewed By: jbschlosser

Differential Revision: D30901673

Pulled By: ezyang

fbshipit-source-id: 064e2c5bc0b9331d6ecd65cd700e5a42c3403790

driazati [Mon, 13 Sep 2021 20:21:09 +0000 (13:21 -0700)]
Fix test report uploading (#64846)

Summary:
Previously we just weren't uploading Windows test report XML files to S3, only to GitHub Actions. This was different from Linux, where we use both (though maybe we can kill the GHA upload in a follow-up PR since I don't think it's very useful anymore). This factors it all out into a macro so they both do the same thing. This also fixes the naming of uploaded files to include info about the job name (the full config, so they can be matched to the job visually or by the included job id).

See https://hud.pytorch.org/pr/64846 for results

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64846

Reviewed By: seemethere

Differential Revision: D30878101

Pulled By: driazati

fbshipit-source-id: 0730f17fa3f46a32c131f52669084c3103b0e616

Nikita Shulga [Mon, 13 Sep 2021 19:46:11 +0000 (12:46 -0700)]
Pin SciPy to 1.6.3 on Mac (take 2) (#64922)

Summary:
It's already pinned via the Docker install on Linux.

`scipy.stats.`[`poisson`|`geom`|`binom`] returns quite different results between the 1.6.x and 1.7+ versions of SciPy, which causes several distributions tests to fail their accuracy thresholds.

Reland of https://github.com/pytorch/pytorch/pull/64844, but limited to just the Mac platform.
A follow-up PR for Windows is coming as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64922

Reviewed By: janeyx99

Differential Revision: D30901257

Pulled By: malfet

fbshipit-source-id: 0543e7bae9d3bbeb8b6be7b3ecf605880f97665f

Don Jang [Mon, 13 Sep 2021 19:41:50 +0000 (12:41 -0700)]
[Deploy] Avoid use-after-free during autograd shutdown (#64620)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64620

`autograd` extension module's shutdown logic destructs `PyThreadState` by `pybind11::gil_scoped_acquire` using the RAII pattern.

The problem is that torch.deploy also destructs `PyThreadState` as part of its shutdown process (https://www.internalfb.com/phabricator/paste/view/P456363738), causing a double destruction and therefore a use-after-free.

This change adds `defined(USE_DEPLOY)` as a special case, alongside the existing special treatment for `IS_PYTHON_3_9_PLUS`, to avoid destructing `PyThreadState`.

Test Plan: Added `TorchpyTest.Autograd` unittest to ensure that torch.deploy can create multiple instances that use autograd without causing a crash.

Reviewed By: albanD

Differential Revision: D30779080

fbshipit-source-id: 4de3283cc2d394acc9b8141c17cacbfab5eea052

Jacob Szwejbka [Mon, 13 Sep 2021 17:54:08 +0000 (10:54 -0700)]
[Pytorch Edge] Quantized Ops Dtype Selective (#63680)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63680

Quantized ops were not covered by DType selectivity. Add the check, and adjust call sites to be constexpr friendly.

Test Plan: CI (this covers all model unit tests), verified that segmentation (a model that uses some of these quant ops) still works on instagram.

Reviewed By: dhruvbird, raymondethan

Differential Revision: D30457626

fbshipit-source-id: 5ba850d2b53a18558dfbb1cfaa78d8f53b5dbad8

Edward Yang [Mon, 13 Sep 2021 17:53:06 +0000 (10:53 -0700)]
Disable more of the pragma warning stuff (#64899)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64899

ghstack-source-id: 137882055

Test Plan: sandcastle, ossci

Reviewed By: malfet, ngimel

Differential Revision: D30893691

fbshipit-source-id: 67ec8cc9f212aa16a201771603236e429944b561

Scott Wolchok [Mon, 13 Sep 2021 17:48:55 +0000 (10:48 -0700)]
[PyTorch] Gate tls_local_dispatch_key_set off on iOS too (#64753)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64753

This may possibly be causing problems on iOS. (Maybe we should just revert inlining access to this thing? Really don't understand what's wrong with it, though.)
ghstack-source-id: 137830520

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D30826897

fbshipit-source-id: 0438dee9d49e7601c26cdca0e8540229c777eddb

VertexC [Mon, 13 Sep 2021 17:46:30 +0000 (10:46 -0700)]
typo fix (#64615)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64615

Reviewed By: jbschlosser

Differential Revision: D30884298

Pulled By: ngimel

fbshipit-source-id: 230f9d06aa85abcdd69828a1ea0a83f36cbfcb17

kshitij12345 [Mon, 13 Sep 2021 17:44:04 +0000 (10:44 -0700)]
[nn] no batch dim support: CosineEmbeddingLoss (#64590)

Summary:
Reference: https://github.com/pytorch/pytorch/issues/60585

TODO
* [x] Add tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64590

Reviewed By: H-Huang

Differential Revision: D30900775

Pulled By: jbschlosser

fbshipit-source-id: d24e72787017e79afbf8f04a94901a290485b81a

Rishi Puri [Mon, 13 Sep 2021 17:05:47 +0000 (10:05 -0700)]
Fixes failure in test_dataloader.py that occurs on jetson boards (#64757)

Summary:
CUDA IPC is not supported on Jetson boards.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64757

Reviewed By: jbschlosser

Differential Revision: D30900593

Pulled By: ejguan

fbshipit-source-id: c6b2e8a9746276fdb4a009b6412e47cc8aac69f2

Jane Xu [Mon, 13 Sep 2021 17:04:58 +0000 (10:04 -0700)]
.github: Always run chown workspace (#64854)

Summary:
In some workflow runs, like https://github.com/pytorch/pytorch/runs/3568714658, the chown workspace step is duplicated.

Is that intentional? Unfortunately it is pretty necessary since (w/ docker) the folder can sometimes be in a broken permission state before and after we run jobs.

So this PR makes the second chown workspace run always because that's the true intention of the step.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64854

Reviewed By: jbschlosser, seemethere

Differential Revision: D30879289

Pulled By: janeyx99

fbshipit-source-id: 4157ff826c86e8c912deb1ba0cb5c47ea7596529

Eli Uriegas [Mon, 13 Sep 2021 16:50:50 +0000 (09:50 -0700)]
Reland .circleci: Skip cuda /cudnn install if existing (#64880)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64880

This reverts commit 5836a116d0de214d6d759e70671f23150a5deaba.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30885675

Pulled By: seemethere

fbshipit-source-id: 8c96584d5a632170e29f91c5daf0206680a78661

Supriya Rao [Mon, 13 Sep 2021 15:38:41 +0000 (08:38 -0700)]
torch.ao migration: quantize_jit.py phase1 (#64860)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64860

ghstack-source-id: 137885395

Test Plan: buck test mode/dev //caffe2/test:quantization

Reviewed By: jerryzh168

Differential Revision: D30880574

fbshipit-source-id: 9629027dd3b00bb8d45633e1564fc03a866f8c31

Supriya Rao [Mon, 13 Sep 2021 15:38:41 +0000 (08:38 -0700)]
torch.ao migration: stubs.py phase 1 (#64861)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64861

1. move the file
```
hg mv caffe2/torch/quantization/stubs.py caffe2/torch/ao/quantization/
```

2. create a new file in the old location and copy the imports
3. fix all call sites inside `torch`

ghstack-source-id: 137885365

Test Plan: buck test mode/dev //caffe2/test:quantization

Reviewed By: jerryzh168

Differential Revision: D30879678

fbshipit-source-id: a2d24f25d01064212aca15e94e8c78240ba48953

Jiayi Sun [Mon, 13 Sep 2021 14:59:00 +0000 (07:59 -0700)]
add BFloat16 operators on CPU: cummax, cummin (#63307)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63307

Reviewed By: nikithamalgifb

Differential Revision: D30342002

Pulled By: anjali411

fbshipit-source-id: eee6e640da996ef0e983960119608d9c12405336

Xiaoyu Zhang [Mon, 13 Sep 2021 14:18:38 +0000 (07:18 -0700)]
fix quantization.rst doc (#64802)

Summary:
As the title says.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64802

Reviewed By: jbschlosser

Differential Revision: D30887210

Pulled By: vkuzo

fbshipit-source-id: 0267883d3065d724ea654a28db78f5fe5702ef06

Eddie Ren [Mon, 13 Sep 2021 13:45:39 +0000 (06:45 -0700)]
ND Embeddings benchmark - Standardize randomized inputs (#64707)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64707

Use torch.randn instead of torch.from_numpy to generate the tensor

Test Plan: buck run //caffe2/benchmarks/operator_benchmark/pt:qembedding_pack_test

Reviewed By: jingsh

Differential Revision: D30817302

fbshipit-source-id: 924c05517812b4b9f7df05a8999f9236cfe7b672

Heitor Schueroff [Mon, 13 Sep 2021 12:50:27 +0000 (05:50 -0700)]
Initial implementation of nanmean (#62671)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62671

Very crude first implementation of `torch.nanmean`. The current reduction kernels do not have good support for implementing nan* variants. Rather than implementing new kernels for each nan* operator, I will work on new reduction kernels with support for a `nan_policy` flag and then I will port `nanmean` to use that.
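
For readers unfamiliar with the nan* family, the intended semantics of a NaN-ignoring mean can be shown in plain Python. This sketch only illustrates what the operator computes for a flat sequence; `nanmean_1d` is a hypothetical name, and it says nothing about how the kernel in this PR is written.

```python
import math


def nanmean_1d(values):
    """Mean of the non-NaN entries; NaN if there are none."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        # Nothing left to average, mirroring a 0 / 0 division.
        return math.nan
    return sum(kept) / len(kept)


print(nanmean_1d([1.0, math.nan, 3.0]))
```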

**TODO**

- [x] Fix autograd issue

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30515181

Pulled By: heitorschueroff

fbshipit-source-id: 303004ebd7ac9cf963dc4f8e2553eaded5f013f0

Heitor Schueroff [Mon, 13 Sep 2021 03:04:19 +0000 (20:04 -0700)]
[Reland] Added reference tests to ReductionOpInfo (#64273)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64273

Reintroduced sample_inputs_prod and constrained the range of values for large reference tests.

This reverts commit e4fd2ab59ce8645f5ae9477c7724b6af82124b3b.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30672097

Pulled By: heitorschueroff

fbshipit-source-id: b44ed8dfd5eb0c74c194164dafc3242f6728a78f

Emilio Castillo [Mon, 13 Sep 2021 02:45:57 +0000 (19:45 -0700)]
Adds DLPack support (#57110)

Summary:
Partially Fixes https://github.com/pytorch/pytorch/issues/55090
Depends on https://github.com/pytorch/pytorch/issues/55365

Inspired by https://github.com/dmlc/dlpack/issues/57#issuecomment-774482973

Question: in PyTorch we can't create streams or easily synchronize them from just an integer. Should we add an [`ExternalStream`](https://docs.cupy.dev/en/stable/reference/generated/cupy.cuda.ExternalStream.html) object like the one we have in CuPy?

TODO: Add tests

Would like some feedback as this design needs quite a few iterations
rgommers leofang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57110

Reviewed By: saketh-are

Differential Revision: D30761481

Pulled By: mruberry

fbshipit-source-id: e85d78df3c1f8defc2a698878da89cd843cb1209

kshitij12345 [Mon, 13 Sep 2021 00:05:33 +0000 (17:05 -0700)]
[fix] fix test_python_dispatch with pytest (#64574)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/62501

Another approach for fixing the same issue

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64574

Reviewed By: ngimel

Differential Revision: D30867237

Pulled By: ezyang

fbshipit-source-id: c632a1e0b241effdc21ae929abe42fccec88aa24

Nikita Shulga [Sun, 12 Sep 2021 22:55:21 +0000 (15:55 -0700)]
Revert D30876591: [pytorch][PR] Pin scipy to 1.6.3 on Windows and Mac

Test Plan: revert-hammer

Differential Revision:
D30876591 (https://github.com/pytorch/pytorch/commit/39f2b9de2ac7fb14e4aaf61863e98d01a53bc875)

Original commit changeset: 4946e0922063

fbshipit-source-id: b8beff3d973b21fe09c158baef25344030f8fb08

Vasiliy Kuznetsov [Sun, 12 Sep 2021 18:59:44 +0000 (11:59 -0700)]
torch.ao migration: numeric suite, eager and fx (#64817)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64817

This migrates `torch.quantization._numeric_suite` to `torch.ao.ns._numeric_suite`, and `torch.quantization._numeric_suite_fx` to `torch.ao.ns._numeric_suite_fx`.

1. move the files
```
HG: move eager mode
hg mv caffe2/torch/quantization/_numeric_suite.py caffe2/torch/ao/ns/
HG: move fx
hg mv caffe2/torch/quantization/_numeric_suite_fx.py caffe2/torch/ao/ns/
hg mv caffe2/torch/quantization/ns/* caffe2/torch/ao/ns/fx/
```

2. create new versions of `_numeric_suite.py` and `_numeric_suite_fx.py` with
imports

3. update all FB callsites

Test Plan: buck test mode/dev //caffe2/test:quantization

Reviewed By: z-a-f

Differential Revision: D30867538

fbshipit-source-id: 120ee830434ca490c1183a187a518eebcbbaf22c

Nikita Shulga [Sun, 12 Sep 2021 17:52:28 +0000 (10:52 -0700)]
Pin scipy to 1.6.3 on Windows and Mac (#64844)

Summary:
It's already pinned via the Docker install on Linux.

`scipy.stats.`[`poisson`|`geom`|`binom`] returns quite different results in 1.7+ versions of SciPy.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64844

Reviewed By: driazati

Differential Revision: D30876591

Pulled By: malfet

fbshipit-source-id: 4946e0922063e9ac320c218a0b089f73486466f7

Nikita Shulga [Sun, 12 Sep 2021 17:29:10 +0000 (10:29 -0700)]
Revert D30867266: [pytorch][PR] TST Adds gradcheck and gradgradcheck to module info

Test Plan: revert-hammer

Differential Revision:
D30867266 (https://github.com/pytorch/pytorch/commit/67ebde56459557199b3c907b81b3c819f77500b9)

Original commit changeset: cbc073326151

fbshipit-source-id: 00234e01eafc45fb999f7c83a397f9d6b3e01e46

Martin Yuan [Sun, 12 Sep 2021 05:22:28 +0000 (22:22 -0700)]
[RFC] Modularize functions of parsing bytecode (#61862)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61862

Modularize the functions that parse bytecode tables so that they can be used as needed in situations other than the mobile lite interpreter.
* The decoupled functions are re-used by current lite interpreter loader.
* The bytecode can be serialized/deserialized from other formats.
* The decoupled functions have minimum dependencies on other PyTorch components.

Next:
Build a driver binary that includes the parser and interpreter but depends only on the necessary PyTorch components.
ghstack-source-id: 137867287

Test Plan:
As an example, a simple bytecode is parsed to a mobile function, and directly run in the added unit test, `RunTimeTest:ParseBytecode`. It contains basic control flow (if, else) and basic data orchestration (list construction).
CI

Reviewed By: larryliu0820

Differential Revision: D29798382

Pulled By: iseeyuan

fbshipit-source-id: 1c173a5f5d37097e3a97baec3f3e48e1eea1400f

Natalia Gimelshein [Sun, 12 Sep 2021 00:17:00 +0000 (17:17 -0700)]
Revert D30875977: [caffe2] [aten] Remove loose (unpaired) #pragma warning ( pop ) in TensorBase.h

Test Plan: revert-hammer

Differential Revision:
D30875977 (https://github.com/pytorch/pytorch/commit/1f35d20a894bb07e27691332af4beb097142762f)

Original commit changeset: bd593feb5a75

fbshipit-source-id: 4c82dbc857fdb28e0240dacc1a0e607a76552bb4

Tao Xu [Sat, 11 Sep 2021 18:21:42 +0000 (11:21 -0700)]
[iOS][OSS][BE] Update XCode to use 12.5.1 (#64850)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64850

ghstack-source-id: 137827895

Test Plan: CircleCI

Reviewed By: hanton

Differential Revision: D30877964

fbshipit-source-id: 803f2506a755b3815024704e6177c7826bc42de8

Tao Xu [Sat, 11 Sep 2021 18:21:42 +0000 (11:21 -0700)]
[iOS][OSS][BE] Remove unused files (#64849)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64849

ghstack-source-id: 137827893

Test Plan: CircleCI

Reviewed By: hanton

Differential Revision: D30877962

fbshipit-source-id: a76f7fe888b990ba6cad650f72be7f4a1e58a2f1

Mikhail Zolotukhin [Sat, 11 Sep 2021 17:21:42 +0000 (10:21 -0700)]
[TensorExpr] Move 2 graph passes from kernel.cpp to graph_opt.cpp (#64828)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64828

Also, make `removeUnusedSelfArgument` more consistent with other passes
by mutating the graph in-place rather than returning a copy.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30870776

Pulled By: ZolotukhinM

fbshipit-source-id: 4873f01b013921143a5aa43746d655a2d8d620c9

Mikhail Zolotukhin [Sat, 11 Sep 2021 16:23:52 +0000 (09:23 -0700)]
[TensorExpr] Add debug logging (store/load tracing) to IREval. (#64848)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64848

Test Plan: Imported from OSS

Reviewed By: Chillee

Differential Revision: D30878278

Pulled By: ZolotukhinM

fbshipit-source-id: bd946075336ba2e9786602161c236a0ff8a5a011

Mikhail Zolotukhin [Sat, 11 Sep 2021 16:23:10 +0000 (09:23 -0700)]
[TensorExpr] LLVMCodegen: fix lowering for UInt->Float casts. (#64862)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64862

Previously we were erroneously looking at the signedness of the destination
type instead of the source type. This was discovered when we tried to
implement quantize/dequantize ops.
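
Why the source signedness matters can be shown with a small plain-Python model: the same bit pattern converts to a different float depending on whether the integer is read as unsigned or signed (LLVM has separate `uitofp` and `sitofp` instructions for the two cases). The helpers below are hypothetical and only model an 8-bit value; they are not the codegen itself.

```python
def float_from_unsigned(bits, width=8):
    """Convert a bit pattern to float, reading it as an unsigned int."""
    return float(bits & ((1 << width) - 1))


def float_from_signed(bits, width=8):
    """Convert a bit pattern to float, reading it as two's complement."""
    value = bits & ((1 << width) - 1)
    if value >= 1 << (width - 1):
        value -= 1 << width
    return float(value)


# 0xC8 is 200 as a uint8 but -56 as an int8, so choosing the wrong
# conversion for a UInt source changes the result.
print(float_from_unsigned(0xC8), float_from_signed(0xC8))
```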

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30881696

Pulled By: ZolotukhinM

fbshipit-source-id: 34af842e5e52a3b6b5d2e70c4ef32f910a20341f

Elias Guestrin [Sat, 11 Sep 2021 07:41:51 +0000 (00:41 -0700)]
[caffe2] [aten] Remove loose (unpaired) #pragma warning ( pop ) in TensorBase.h (#64870)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64870

Remove loose (unpaired) #pragma warning ( pop ) in TensorBase.h
Issue started with D30728580 (https://github.com/pytorch/pytorch/commit/d701357d921ef167d42c125e65b6f7da6be3ad0f), was fixed with D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706), and brought back again with the reversion of D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706).

Reviewed By: H-Huang

Differential Revision: D30875977

fbshipit-source-id: bd593feb5a75245470e43ad568ebdd3f1738da7c

Jerry Zhang [Sat, 11 Sep 2021 05:24:10 +0000 (22:24 -0700)]
[quant][fx2trt] Add lowering support for reference linear/conv modules (#64368)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64368

Test Plan:
python torch/fx/experimental/fx2trt/example/quantized_resnet_test.py

Imported from OSS

Reviewed By: 842974287

Differential Revision: D30708738

fbshipit-source-id: 88142b7ce43ed96093597112dab03a2d277de993

Hui Guo [Sat, 11 Sep 2021 03:30:06 +0000 (20:30 -0700)]
[tensorexpr] Simplify x/100 -> 0 if x is a non-negative integer less than 100. (#64763)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64763

Simplification pattern:
  x/N -> 0; N is a constant positive integer and x is a for-loop index whose range is a subset of [0, N).
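
The condition behind this pattern can be checked with a few lines of plain Python. This is a sketch of the reasoning only, with a hypothetical helper name; the simplifier in the PR works on TensorExpr IR, not on Python ranges.

```python
def div_simplifies_to_zero(start, stop, divisor):
    """True if x // divisor == 0 for every loop index x in [start, stop).

    Holds when the divisor is a positive constant and the index range
    is a subset of [0, divisor).
    """
    return divisor > 0 and start >= 0 and stop <= divisor


# A loop index running over [0, 100) divided by 100 is always 0.
print(div_simplifies_to_zero(0, 100, 100))
print(all(x // 100 == 0 for x in range(0, 100)))
```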

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30845854

Pulled By: huiguoo

fbshipit-source-id: 814d69ed4be05e57405c222183cc1c6c526721cd

Eli Uriegas [Sat, 11 Sep 2021 01:55:07 +0000 (18:55 -0700)]
Revert D30869803: .circleci: Skip cuda /cudnn install if existing

Test Plan: revert-hammer

Differential Revision:
D30869803 (https://github.com/pytorch/pytorch/commit/717d267e191bcc1669acad21d87ffb70e6e89b90)

Original commit changeset: 9eb3bd20875d

fbshipit-source-id: bef8d0c693696307a3be7abd5331b7fa813d754a

Thomas J. Fan [Fri, 10 Sep 2021 23:25:21 +0000 (16:25 -0700)]
TST Adds gradcheck and gradgradcheck to module info (#64444)

Summary:
Follow up to https://github.com/pytorch/pytorch/issues/61935

cc albanD mruberry jbschlosser walterddr

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64444

Reviewed By: ngimel

Differential Revision: D30867266

Pulled By: jbschlosser

fbshipit-source-id: cbc0733261517dbfcdd3415d969b9e802b62b7ac

2 years agoPreserve types during empty container assignment (#58911)
Ansley Ussery [Fri, 10 Sep 2021 23:18:33 +0000 (16:18 -0700)]
Preserve types during empty container assignment (#58911)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58911

Stack from [ghstack](https://github.com/ezyang/ghstack):
* __->__ #58911

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D30785623

Pulled By: ansley

fbshipit-source-id: 4e05d6369318974290fea02ad2bc148293c25090

2 years agoAlways upload stats to S3 (#64853)
Jane Xu [Fri, 10 Sep 2021 22:17:07 +0000 (15:17 -0700)]
Always upload stats to S3 (#64853)

Summary:
It's not very useful when stats are only uploaded when the tests all pass.

Like for this failing run, the stats were not uploaded to Scribe or S3: https://github.com/pytorch/pytorch/runs/3568714658

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64853

Reviewed By: seemethere

Differential Revision: D30878361

Pulled By: janeyx99

fbshipit-source-id: 19a4c520efdd5575785a3ffbc60e6c09456b9e0d

2 years ago[DataPipe] Remove ZipArchiveReader's dependency on FileLoader (#64786)
Kevin Tse [Fri, 10 Sep 2021 21:22:36 +0000 (14:22 -0700)]
[DataPipe] Remove ZipArchiveReader's dependency on FileLoader (#64786)

Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* https://github.com/pytorch/pytorch/issues/64788
* __->__ https://github.com/pytorch/pytorch/issues/64786

This PR removes ZipArchiveReader's dependency on the FileLoader DataPipe by allowing it to take an IterDataPipe of path names as input, rather than tuples of a path name and a stream.

It also adds tests to ensure that the DataPipe functions properly when it is read multiple times or reset halfway through reading.

The whole stack fixes issues related to unclosed buffer streams (see https://github.com/pytorch/pytorch/issues/64281).

cc VitalyFedyunin ejguan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64786

Reviewed By: ngimel

Differential Revision: D30870968

Pulled By: NivekT

fbshipit-source-id: 64b04d1697b99772f2fa20fc141668e6b8e18c41

2 years ago.circleci: Skip cuda /cudnn install if existing (#64825)
Eli Uriegas [Fri, 10 Sep 2021 21:00:22 +0000 (14:00 -0700)]
.circleci: Skip cuda /cudnn install if existing (#64825)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64825

Rewrites this script to only install the CUDA tools if they are not already
pre-installed

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30869803

Pulled By: seemethere

fbshipit-source-id: 9eb3bd20875df0f2b18f5314ac825dbdf91637b5

2 years ago[doc][hackathon] To add Adadelta Optimizer to the documentation (#63255)
Ilqar Ramazanli [Fri, 10 Sep 2021 20:33:12 +0000 (13:33 -0700)]
[doc][hackathon] To add Adadelta Optimizer to the documentation (#63255)

Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. The following tracking issue lists all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.

In this PR we are adding a description of the AdaDelta algorithm to the documentation. For more details, we refer to the paper: https://arxiv.org/abs/1212.5701

<img width="654" alt="AdaDeltaalg" src="https://user-images.githubusercontent.com/73658284/132770544-82ccf90a-1d54-4ad5-8fc4-51c8dec63a12.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63255

Reviewed By: ngimel

Differential Revision: D30867589

Pulled By: iramazanli

fbshipit-source-id: 5ba602c20c724a4486bdd38b73e1b64c0e767bdc

2 years agoAdd more error checking in subclass creation (#64746)
Alban Desmaison [Fri, 10 Sep 2021 20:07:37 +0000 (13:07 -0700)]
Add more error checking in subclass creation (#64746)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64746

This extracts the error checking that used to be in the PR above.
We are not going to land the proposed fix there, but I think we want this error checking in right now, as the missing checks would lead to a memory leak and an arbitrary memory read/write, respectively.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30867569

Pulled By: albanD

fbshipit-source-id: bf468033fb8b49fcb26eed423f5fad82b4a46c56

2 years agoMove THPVariable_NewWithVar around (#64550)
Alban Desmaison [Fri, 10 Sep 2021 20:07:37 +0000 (13:07 -0700)]
Move THPVariable_NewWithVar around (#64550)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64550

Just moves a function around to make the next PR easier to read.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30867570

Pulled By: albanD

fbshipit-source-id: 99ae925568ed29ca7fdea059762c21d430d4a204

2 years ago[MicroBench] Added a log_vml version of the signed log1p kernel (#64205)
Raghavan Raman [Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)]
[MicroBench] Added a log_vml version of the signed log1p kernel (#64205)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64205

The log_vml version of the micro-bench is over **2x** faster than the log1p version. Here are the perf numbers:

```
---------------------------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------
SignedLog1pBench/ATen/10/1467           45915 ns        45908 ns        14506 GB/s=2.5564G/s
SignedLog1pBench/NNC/10/1467            40469 ns        40466 ns        17367 GB/s=2.9002G/s
SignedLog1pBench/NNCLogVml/10/1467      19560 ns        19559 ns        35902 GB/s=6.00016G/s
```
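
The "over 2x" figure follows from the Time column above. A small sketch of the arithmetic, using only the three timings from the table (the dictionary name `times_ns` is just for illustration):

```python
# Mean times in ns, copied from the Time column of the benchmark table.
times_ns = {"ATen": 45915, "NNC": 40469, "NNCLogVml": 19560}

for name in ("ATen", "NNC"):
    # Ratio of the baseline time to the log_vml time.
    speedup = times_ns[name] / times_ns["NNCLogVml"]
    print(f"NNCLogVml vs {name}: {speedup:.2f}x")
```

This gives roughly 2.35x relative to ATen and 2.07x relative to the NNC log1p version.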

Thanks to bertmaher for pointing this out.

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30644716

Pulled By: navahgar

fbshipit-source-id: ba2b32c79d4265cd48a2886b0c62d0e89ff69c19

2 years ago[nnc] Added an implementation of sign op (#64033)
Raghavan Raman [Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)]
[nnc] Added an implementation of sign op (#64033)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64033

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30579197

Pulled By: navahgar

fbshipit-source-id: f9f7fa7f2ffa109cf4e441eb1af821b8b891d4d3

2 years agoExtend 2Dim embedding bag benchmarking to include 3Dim benchmarks (#64647)
Eddie Ren [Fri, 10 Sep 2021 19:31:27 +0000 (12:31 -0700)]
Extend 2Dim embedding bag benchmarking to include 3Dim benchmarks (#64647)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64647

Add support for benchmarking of 8 bit quantizations of N-D batched embeddings. Currently only works for 3Dim embeddings and still requires thought on ramping up from 3Dim to NDim.

Test Plan: ```buck run //caffe2/benchmarks/operator_benchmark/pt:qembedding_pack_test```

Reviewed By: jingsh

Differential Revision: D30770085

fbshipit-source-id: 26659020f3458991592065a05366bde0f060494e

2 years agoRevert D30846958: [caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h
Howard Huang [Fri, 10 Sep 2021 18:48:43 +0000 (11:48 -0700)]
Revert D30846958: [caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h

Test Plan: revert-hammer

Differential Revision:
D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706)

Original commit changeset: 52a3fb66e426

fbshipit-source-id: 1d749f6981756f2169d6867538555a945cbb8ca6

2 years ago[DataPipe] fixing tests related fork() to remove warnings (#64827)
Kevin Tse [Fri, 10 Sep 2021 18:00:01 +0000 (11:00 -0700)]
[DataPipe] fixing tests related fork() to remove warnings (#64827)

Summary:
There are two warnings produced by `test_fork_datapipe`. This PR addresses the issues raised by those warnings without impacting the test cases.

cc VitalyFedyunin ejguan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64827

Reviewed By: ejguan

Differential Revision: D30870528

Pulled By: NivekT

fbshipit-source-id: 580a001c6fa3ff6f8b04a7e5183e58861938204b

2 years ago[tensorexpr] Add 'pre_alloc' argument in python API of tensorexpr kernel (#64718)
Hui Guo [Fri, 10 Sep 2021 16:59:25 +0000 (09:59 -0700)]
[tensorexpr] Add 'pre_alloc' argument in python API of tensorexpr kernel (#64718)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64718

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30826582

Pulled By: huiguoo

fbshipit-source-id: 6c173c8964f2643039273cdc83e64fb02bb5f381

2 years agoSkip conjugate and negate fallback for view ops and their in-place versions (#64392)
anjali411 [Fri, 10 Sep 2021 16:55:50 +0000 (09:55 -0700)]
Skip conjugate and negate fallback for view ops and their in-place versions (#64392)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64392

cc ezyang anjali411 dylanbespalko mruberry Lezcano nikitaved

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30866330

Pulled By: anjali411

fbshipit-source-id: 7b2f51486bf1d610ad2b1472306bab608ee69c37

2 years agoTo add Rprop documentation (#63866)
Ilqar Ramazanli [Fri, 10 Sep 2021 16:47:38 +0000 (09:47 -0700)]
To add Rprop documentation (#63866)

Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. The following tracking issue lists all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.

In this PR we are adding a description of Rprop to the documentation. For more details, we refer to the paper: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.1417

<img width="657" alt="Rpropalg" src="https://user-images.githubusercontent.com/73658284/132750009-a5ec059e-6d53-4c67-917b-57174c8ca27b.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63866

Reviewed By: ngimel

Differential Revision: D30867590

Pulled By: iramazanli

fbshipit-source-id: 0d2d4ffc6c4d939290bbbaa84d2c6e901ed8b54a

2 years ago[ROCm] define C10_WARP_SIZE to warpSize HIP constant (#64302)
Jeff Daily [Fri, 10 Sep 2021 16:36:26 +0000 (09:36 -0700)]
[ROCm] define C10_WARP_SIZE to warpSize HIP constant (#64302)

Summary:
warpSize is defined as a constexpr in HIP headers.  It is incorrect to assume warpSize 64.  This change fixes the C10_WARP_SIZE definition in torch sources similar to [how it was done in caffe2](https://github.com/pytorch/pytorch/blob/master/caffe2/utils/GpuDefs.cuh#L10-L14).

cc jeffdaily sunway513 jithunnair-amd ROCmSupport

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64302

Reviewed By: mrshenli

Differential Revision: D30785975

Pulled By: malfet

fbshipit-source-id: 68f8333182ad4d02bd0c8d02f1751a50bc5bafa7

2 years agofix typo in torch/onnx/utils.py (#63396)
Corey Levinson [Fri, 10 Sep 2021 16:35:06 +0000 (09:35 -0700)]
fix typo in torch/onnx/utils.py (#63396)

Summary:
fixes minor typo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63396

Reviewed By: pbelevich

Differential Revision: D30644295

Pulled By: SplitInfinity

fbshipit-source-id: c506f67383909aa2c0c7c533038446b4b3d76a3a

2 years agobuild: bump bazel to 4.2.1 (#64455)
rui [Fri, 10 Sep 2021 15:28:45 +0000 (08:28 -0700)]
build: bump bazel to 4.2.1 (#64455)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64455

Reviewed By: saketh-are

Differential Revision: D30752580

Pulled By: malfet

fbshipit-source-id: 4f5cc6f820396348181c09463f7e5628b5f69471

2 years agoROCm MIOpen NHWC Convolution support (#63617)
Aswin John Mathews [Fri, 10 Sep 2021 15:05:21 +0000 (08:05 -0700)]
ROCm MIOpen NHWC Convolution support (#63617)

Summary:
- Added 2D-Convolution NHWC support
  - on ROCm 4.3, with `PYTORCH_MIOPEN_SUGGEST_NHWC=1` flag
  - May need to force MIOpen to search for solutions ( see examples below for flags )

**PYTORCH_MIOPEN_SUGGEST_NHWC Environment Flag**
MIOpen does not officially support NHWC yet, although convolution support has been added to tip-of-tree of MIOpen. This flag is intended to be a short-lived flag to explicitly turn on NHWC support until ROCm officially supports NHWC and performance is verified.

**Examples**
1. Example usage 1 : Run test on ROCm4.3
`PYTORCH_TEST_WITH_ROCM=1 PYTORCH_MIOPEN_SUGGEST_NHWC=1 MIOPEN_FIND_ENFORCE=4 MIOPEN_DEBUG_CONV_GEMM=0 MIOPEN_FIND_MODE=1 pytest test_nn.py -v -k "test_conv_cudnn_nhwc" `
2. Example usage 2: Run the following with `PYTORCH_MIOPEN_SUGGEST_NHWC=1` on ROCm4.3.
```
#!/usr/bin/env python3
import torch
model = torch.nn.Conv2d(8, 4, 3).cuda().half()
model = model.to(memory_format=torch.channels_last)
input = torch.randint(1, 10, (2, 8, 4, 4), dtype=torch.float32, requires_grad=True)
input = input.to(device="cuda", memory_format=torch.channels_last, dtype=torch.float16)

# should print True for is_contiguous(channels_last), and strides must match NHWC format
print(input.is_contiguous(memory_format=torch.channels_last), input.shape, input.stride() )

out = model(input)

# should print True for is_contiguous(channels_last), and strides must match NHWC format
print("Contiguous channel last :", out.is_contiguous(memory_format=torch.channels_last), " out shape :",  out.shape, "out stride :", out.stride() )
```

See https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html for more examples.

cc jeffdaily sunway513 jithunnair-amd ROCmSupport

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63617

Reviewed By: saketh-are

Differential Revision: D30730800

Pulled By: ezyang

fbshipit-source-id: 61906a0f30be8299e6547d312ae6ac91cc7c3238

2 years agoLet all_reduce_coalesced and all_gather_coalesced return Future objects (#64722)
Shen Li [Fri, 10 Sep 2021 14:44:09 +0000 (07:44 -0700)]
Let all_reduce_coalesced and all_gather_coalesced return Future objects (#64722)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64722

`all_reduce_coalesced` and `all_gather_coalesced` were never publicly
released in our API docs, so I would assume the blast radius to be small.

The motivation for this change is to allow implementing
`all_reduce_coalesced` and `all_gather_coalesced` by re-using `allreduce`
and `allgather` C++ cores and perform flatten and copy only on the Python
side. With that, we can then remove `all_reduce_coalesced` and
`all_gather_coalesced` from C++ ProcessGroup APIs. For the async mode,
the copy-back logic after the communication will need to be chained
as a callback on the returned Future and use the chained child Future
as the return value (otherwise, we will need to wrap the child Future
into another work handle). This PR tries to test if we can directly
return a Future without breaking tests and internal use cases. If yes,
it will make the consolidation a lot easier.
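
The flatten, communicate, copy-back chaining described above can be sketched with the standard library alone. Everything here is an illustrative assumption: `then`, `all_reduce_coalesced_sketch`, and `fake_allreduce` are hypothetical names, plain lists stand in for tensors, and `concurrent.futures.Future` stands in for the Future returned by the process group (PyTorch's own `torch.futures.Future` offers a `then` method for this kind of chaining).

```python
from concurrent.futures import Future


def then(parent: Future, fn) -> Future:
    """Hypothetical helper: child Future completed with fn(parent's result)."""
    child = Future()

    def _on_done(done: Future) -> None:
        try:
            child.set_result(fn(done.result()))
        except Exception as exc:  # propagate failures to the child
            child.set_exception(exc)

    parent.add_done_callback(_on_done)
    return child


def all_reduce_coalesced_sketch(buffers, allreduce_flat):
    # Flatten all buffers into one list and remember each buffer's length.
    flat = [v for buf in buffers for v in buf]
    sizes = [len(buf) for buf in buffers]
    fut = allreduce_flat(flat)  # single collective on the flat buffer

    def copy_back(reduced):
        # Copy the reduced values back into the original buffers in place.
        offset = 0
        for buf, n in zip(buffers, sizes):
            buf[:] = reduced[offset:offset + n]
            offset += n
        return buffers

    # The chained child Future is what the caller receives.
    return then(fut, copy_back)


def fake_allreduce(flat):
    # Stand-in for a sum over 2 ranks holding identical data: doubles values.
    fut = Future()
    fut.set_result([2 * v for v in flat])
    return fut


bufs = [[1, 2], [3]]
result = all_reduce_coalesced_sketch(bufs, fake_allreduce).result()
print(result)
```

Returning the child Future, rather than the one produced by the collective, is what guarantees the caller only observes completion after the copy-back has run.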

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D30830994

Pulled By: mrshenli

fbshipit-source-id: dcde0ed9245e9e8fee357b3588b07d540a4b6318

2 years ago`torch.lu`: forward AD support (#64742)
Nikita Vedeneev [Fri, 10 Sep 2021 14:17:30 +0000 (07:17 -0700)]
`torch.lu`: forward AD support (#64742)

Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64742

Reviewed By: H-Huang

Differential Revision: D30841227

Pulled By: albanD

fbshipit-source-id: dc4d043ab94358594adb110fbbbb60750c98262a

2 years ago[const_fold] Keep around node.meta for replaced folded ops (#64782)
Jordan Fix [Fri, 10 Sep 2021 06:49:22 +0000 (23:49 -0700)]
[const_fold] Keep around node.meta for replaced folded ops (#64782)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64782

Previously, get_attrs that were added to the graph did not retain node.meta after folding. Add such support, and improve coverage in general here.

Test Plan: Added test coverage.

Reviewed By: protonu

Differential Revision: D30852704

fbshipit-source-id: ece87a61c69b2e68982964c6adc4dde14dae12c7

2 years ago[caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h (#64773)
Elias Guestrin [Fri, 10 Sep 2021 06:44:03 +0000 (23:44 -0700)]
[caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h (#64773)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64773

Remove loose `#pragma warning ( pop )` in TensorBase.h.

Reviewed By: ezyang

Differential Revision: D30846958

fbshipit-source-id: 52a3fb66e426bc16ef7bde2a13e26e8293969026

2 years agoAdd TRTSplitter (#64762)
Shirong Wu [Fri, 10 Sep 2021 04:02:15 +0000 (21:02 -0700)]
Add TRTSplitter (#64762)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64762

Extract and format TRTSplitter from the fx2trt_example code. The current implementation is tentative and subject to change based on feeds model lowering progress.

Test Plan:
Manual print of supported operators:
`{<class 'torch.nn.modules.activation.ReLU'>: None, <function relu at 0x7f9b1abd0790>: None, <class 'torch.nn.modules.activation.Sigmoid'>: None, <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>: None, <built-in method add of type object at 0x7f9b7f402498>: None, <built-in function add>: None, <built-in method add of PyCapsule object at 0x7f9b1a3dc690>: None, <built-in method add_relu of PyCapsule object at 0x7f9b1a34cf90>: None, <class 'torch.nn.modules.batchnorm.BatchNorm2d'>: None, <class 'torch.nn.quantized.modules.batchnorm.BatchNorm2d'>: None, <class 'torch.nn.modules.conv.Conv2d'>: None, <class 'torch.nn.quantized.modules.conv.Conv2d'>: None, <class 'torch.nn.intrinsic.quantized.modules.conv_relu.ConvReLU2d'>: None, <class 'torch.nn.modules.linear.Linear'>: None, <class 'torch.nn.quantized.modules.linear.Linear'>: None, <class 'torch.nn.modules.pooling.MaxPool2d'>: None, <built-in function mul>: None, <built-in method mul of type object at 0x7f9b7f402498>: None, <built-in method mul of PyCapsule object at 0x7f9b1a3dc6c0>: None, <built-in method flatten of type object at 0x7f9b7f402498>: None, <class 'torch.nn.quantized.modules.DeQuantize'>: None, <built-in method dequantize of type object at 0x7f9b7f402498>: None, 'dequantize': None, <class 'torch.nn.quantized.modules.Quantize'>: None, <built-in method quantize_per_tensor of type object at 0x7f9b7f402498>: None, <class 'torch.nn.modules.linear.Identity'>: None, <function conv2d at 0x7f9b1a1fe9d0>: None, <function flatten at 0x7f9b1a1f5ca0>: None, <function size at 0x7f9b1a1f5b80>: None, <function batch_norm at 0x7f9b1a1feaf0>: None, <function layer_norm at 0x7f9b1a1feb80>: None, <function softmax at 0x7f9b1a1f9550>: None, <function relu at 0x7f9b1a1fe040>: None, <function sin at 0x7f9b1a2030d0>: None, <function cos at 0x7f9b1a203160>: None, <function tan at 0x7f9b1a2031f0>: None, <function sinh at 0x7f9b1a1fe160>: None, <function cosh at 0x7f9b1a1fe280>: None, <function tanh at 0x7f9b1a1fe310>: None, 
<function asin at 0x7f9b1a1fe3a0>: None, <function acos at 0x7f9b1a1fe430>: None, <function atan at 0x7f9b1a1fe4c0>: None, <function exp at 0x7f9b1a1fe550>: None, <function log at 0x7f9b1a1fe5e0>: None, <function sqrt at 0x7f9b1a1fe670>: None, <function reciprocal at 0x7f9b1a1fe700>: None, <function abs at 0x7f9b1a1fe790>: None, <function neg at 0x7f9b1a1fe820>: None, <function floor at 0x7f9b1a1fe8b0>: None, <function ceil at 0x7f9b1a1fe940>: None, <function sum at 0x7f9b1a1f9c10>: None, <function max_pool2d at 0x7f9b1a1f5d30>: None, <function squeeze at 0x7f9b1a1f5c10>: None, <function add at 0x7f9b1a1f91f0>: None, <function sub at 0x7f9b1a1f9ca0>: None, <function div at 0x7f9b1a1f9dc0>: None, <function mul at 0x7f9b1a1f9d30>: None, <function pow at 0x7f9b1a1f9e50>: None, <function min_two_tensors_input at 0x7f9b1a1f9940>: None, <function unsqueeze at 0x7f9b1a1f9280>: None, <function topk at 0x7f9b1a203280>: None, <function adaptive_avg_pool2d at 0x7f9b1a1f5dc0>: None, <function avg_pool2d at 0x7f9b1a1f5e50>: None, <function reshape at 0x7f9b1a203550>: None, <function slice_tensor at 0x7f9b1a1fee50>: None, <function split at 0x7f9b1a1fec10>: None, <function linear at 0x7f9b1a1f51f0>: None, <function clamp at 0x7f9b1a1f93a0>: None, <function tuple_construct at 0x7f9b1a1fed30>: None, <function contiguous at 0x7f9b1a1f9430>: None, <function getitem at 0x7f9b1a203310>: None, <function cat at 0x7f9b1a1f9310>: None, <function transpose at 0x7f9b1a1f94c0>: None, <function matmul at 0x7f9b1a1f98b0>: None, <function sigmoid at 0x7f9b1a1fe1f0>: None, <function permute at 0x7f9b1a1f9670>: None, <function quantize_per_tensor at 0x7f9b1a1f9b80>: None, <function dequantize at 0x7f9b1a1f99d0>: None, <function sign at 0x7f9b1a1f5ee0>: None}`

Reviewed By: 842974287

Differential Revision: D30798047

fbshipit-source-id: 69076a550874425b7186fbbf2ecf03da4a99b42f

2 years ago[PyTorch] Fix missing move in torch::jit::Lexer::next (#64653)
Scott Wolchok [Fri, 10 Sep 2021 01:53:36 +0000 (18:53 -0700)]
[PyTorch] Fix missing move in torch::jit::Lexer::next (#64653)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64653

Saves shared_ptr refcount inc/dec in SourceRange.
ghstack-source-id: 137608457

Test Plan: Profiled startup of framework overheads benchmark from high_per_models; self time spent in next() is way down.

Reviewed By: dhruvbird

Differential Revision: D30739240

fbshipit-source-id: ac455678c9d46e657b111d3788d4369983028674

2 years ago[PyTorch] Use std::find in the JIT lexer (#64652)
Scott Wolchok [Fri, 10 Sep 2021 01:53:36 +0000 (18:53 -0700)]
[PyTorch] Use std::find in the JIT lexer (#64652)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64652

If nothing else, it is slightly clearer code.
ghstack-source-id: 137608456

Test Plan: CI

Reviewed By: dhruvbird

Differential Revision: D30739239

fbshipit-source-id: bc7917b59883ca4a33fc6916b4e422bad79cf04b

2 years ago[TensorExpr] Simplify TE IR before applying any transformations. (#64717)
Mikhail Zolotukhin [Fri, 10 Sep 2021 01:48:17 +0000 (18:48 -0700)]
[TensorExpr] Simplify TE IR before applying any transformations. (#64717)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64717

This also exposed several bugs, which are fixed in this PR.

Differential Revision:
D30826408

Test Plan: Imported from OSS

Reviewed By: navahgar

Pulled By: ZolotukhinM

fbshipit-source-id: a67ec5739aceed9ffdf0d24f77eb3787cefe4560

2 years ago[quant][fix] Fix quantization for sub_scalar (#64603)
Jerry Zhang [Fri, 10 Sep 2021 00:17:01 +0000 (17:17 -0700)]
[quant][fix] Fix quantization for sub_scalar (#64603)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64603

We'll insert an observer only when both the operator and the dtype are supported.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_sub_scalar

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D30797025

fbshipit-source-id: a77c21e2749405534fc245374cf33a0657a3d2c8

2 years ago[Android] print type name for IValues (#64602)
Linbin Yu [Thu, 9 Sep 2021 23:56:50 +0000 (16:56 -0700)]
[Android] print type name for IValues (#64602)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64602

Print the type name in the error message for easier debugging.

Test Plan:
Example:
java.lang.IllegalStateException: Expected IValue type Tensor, actual type TensorList

Reviewed By: beback4u

Differential Revision: D30782318

fbshipit-source-id: 60d88a659e7b4bb2b574b12c7652a28f0d5ad0d2

2 years ago[caffe2][tiny] Add logging to report what the current lengths are when mismatched...
Xinyi Zhang [Thu, 9 Sep 2021 23:43:55 +0000 (16:43 -0700)]
[caffe2][tiny] Add logging to report what the current lengths are when mismatched lengths are detected (#64768)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64768

as title

Differential Revision: D30846637

fbshipit-source-id: 266768c81b315fdebba854135ea2db1faf67fd6a

2 years ago[doc][hackathon] To add Adagrad Optimizer to the documentation (#63254)
Ilqar Ramazanli [Thu, 9 Sep 2021 22:37:44 +0000 (15:37 -0700)]
[doc][hackathon] To add Adagrad Optimizer to the documentation (#63254)

Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. The following tracking issue lists all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.

In this PR we are adding a description of Adagrad to the documentation. For more details, we refer to the paper:
http://jmlr.org/papers/v12/duchi11a.html

<img width="658" alt="AdaGradAlgo" src="https://user-images.githubusercontent.com/73658284/132743276-a52ea3fb-70a5-4788-94b7-f99367907a26.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63254

Reviewed By: albanD

Differential Revision: D30852139

Pulled By: iramazanli

fbshipit-source-id: 9e496560a97e92be8386585b01d9bd3bba4b0c66

2 years ago[Static Runtime] Fix resize_output_check warning coming from prim::VarConcat (#64765)
Harut Movsisyan [Thu, 9 Sep 2021 21:35:00 +0000 (14:35 -0700)]
[Static Runtime] Fix resize_output_check warning coming from prim::VarConcat (#64765)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64765

Test Plan: Tested the fix with BR v1 model predictor-replayer setup.

Reviewed By: ajyu

Differential Revision: D30846506

fbshipit-source-id: 3ef3c93f11285c7cd1e2b188ca298a7ab4fba579

2 years agoRename profiler metadata key (#63743)
Han Guangyun [Thu, 9 Sep 2021 20:02:56 +0000 (13:02 -0700)]
Rename profiler metadata key (#63743)

Summary:
Rename the metadata key to match the variable name.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63743

Reviewed By: albanD

Differential Revision: D30839501

Pulled By: gdankel

fbshipit-source-id: b9b4e670dcc9557b8d8d0730baea0ad39a1a0ca4

2 years agoAdd support for lowering info during serialize_module, and add padding/partial to...
Jordan Fix [Thu, 9 Sep 2021 19:59:54 +0000 (12:59 -0700)]
Add support for lowering info during serialize_module, and add padding/partial to it (#5810)

Summary:
Pull Request resolved: https://github.com/pytorch/glow/pull/5810

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64725

- Any info added to the dict in node.meta["lowering_info"] will be added to the node_rep during serialization.
- Use this to add annotations on placeholders that allow partial inputs and require padding.
- Check for these annotations and set them in the NNPICompiledFunction as expected
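
A rough sketch of the first bullet, using plain dicts. Only `node.meta["lowering_info"]` and `node_rep` come from the description above; the function `serialize_node_sketch` and the annotation keys are hypothetical and do not reflect the actual serializer.

```python
def serialize_node_sketch(name, meta):
    # Start with whatever the serializer already records for the node.
    node_rep = {"name": name}
    # Copy every entry from meta["lowering_info"], if the node has one.
    node_rep.update(meta.get("lowering_info", {}))
    return node_rep


# Hypothetical annotation keys on a placeholder node.
placeholder_meta = {
    "lowering_info": {"allow_partial": True, "requires_padding": True}
}

print(serialize_node_sketch("x", placeholder_meta))
print(serialize_node_sketch("y", {}))  # nodes without lowering info are unchanged
```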

Test Plan: Validated working on inline_cvr in stack. Additionally existing fx_glow end to end tests should still pass.

Reviewed By: 842974287

Differential Revision: D30824192

fbshipit-source-id: def64ef097aa35c337abb494415f7d437c6c7fa9

2 years agocat_shape_check: Fixes dimension in the error message for CUDA cat shape check and...
Palwisha Akhtar [Thu, 9 Sep 2021 19:49:03 +0000 (12:49 -0700)]
cat_shape_check: Fixes dimension in the error message for CUDA cat shape check and removes unnecessary offending index information (#64556)

Summary:
Fixes: https://github.com/pytorch/pytorch/issues/64207

Thank you, SsnL for providing the reproducing script.

cc ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64556

Reviewed By: albanD

Differential Revision: D30843859

Pulled By: ngimel

fbshipit-source-id: 457ebe80eaef793d9f5d35ee962b6697e5de1907

2 years agoEnable the on-demand performance PR testing to run on a specified TB branch (#64701)
Xu Zhao [Thu, 9 Sep 2021 19:35:36 +0000 (12:35 -0700)]
Enable the on-demand performance PR testing to run on a specified TB branch (#64701)

Summary:
This is to enable performance testing of experimental features such as LazyTensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64701

Test Plan:
TorchBench CI

RUN_TORCHBENCH: BERT_pytorch, mobilenet_v3_large
TORCHBENCH_BRANCH: v1.0

Reviewed By: seemethere

Differential Revision: D30847389

Pulled By: xuzhao9

fbshipit-source-id: 6853b368fa6f1ba8ffde517805c74bf318dcb35b

2 years ago.github: Remove add_annotations workflow (#64449)
Eli Uriegas [Thu, 9 Sep 2021 19:20:43 +0000 (12:20 -0700)]
.github: Remove add_annotations workflow (#64449)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64449

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS

Reviewed By: suo, janeyx99

Differential Revision: D30738460

Pulled By: seemethere

fbshipit-source-id: f1259fcba2f0c14a9bcfbe811ec0a4bf61106619

2 years ago[Dist/CI] Remove dist from target determinator (#64721)
Rohan Varma [Thu, 9 Sep 2021 19:05:26 +0000 (12:05 -0700)]
[Dist/CI] Remove dist from target determinator (#64721)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64721

There are a couple of PRs where distributed CI did not run although we
expected it to. Examples:

https://github.com/pytorch/pytorch/pull/64513/checks?check_run_id=3539190960,
https://github.com/pytorch/pytorch/pull/64113. All distributed tests should've
been run on these PRs, but we can see they were not:

```
Determination is skipping distributed/test_c10d_common
Determination is skipping distributed/test_c10d_gloo
Determination is skipping distributed/test_c10d_nccl
Determination is skipping distributed/test_c10d_spawn_gloo
Determination is skipping distributed/test_c10d_spawn_nccl
Running distributed/test_data_parallel without determination
Determination is skipping distributed/test_distributed_spawn
Determination is skipping distributed/test_jit_c10d
```

Since it is important to run distributed tests on PRs that touch distributed,
exclude distributed from target_det_list for now.
ghstack-source-id: 137654015

Test Plan: CI

Reviewed By: driazati, mrshenli

Differential Revision: D30830455

fbshipit-source-id: 8b0fdf5b57c2c647b0d82c48e2bb8e2bdbe4d307

2 years agofix acc topk's handling of the case when dim=0, fix tests as well (#64727)
Emad El-Haraty [Thu, 9 Sep 2021 17:32:22 +0000 (10:32 -0700)]
fix acc topk's handling of the case when dim=0, fix tests as well (#64727)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64727

The acc ops converter for topk has a subtle bug (I found this while trying to introduce max/min):
the code does not differentiate between dim == None and dim == 0, but these are different computations.
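
One common way this kind of bug arises in Python is a truthiness test on `dim`. Whether the converter used exactly this form is an assumption; the two helper functions below are hypothetical and only illustrate the distinction.

```python
def buggy_has_dim(dim):
    # Bug: 0 is falsy, so dim=0 is treated the same as dim=None.
    return bool(dim)


def fixed_has_dim(dim):
    # Compare against None explicitly so that dim=0 counts as a real axis.
    return dim is not None


for dim in (None, 0, 1):
    print(dim, buggy_has_dim(dim), fixed_has_dim(dim))
```

The two checks disagree only for `dim=0`, which is why the problem is easy to miss in tests that exercise other axes.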

Reviewed By: jfix71, 842974287

Differential Revision: D30833621

fbshipit-source-id: 6cd84e6ca4e95bb1a6d6465e61830b76808a9c78

2 years agoFix a shadowed variable (#64695)
Richard Barnes [Thu, 9 Sep 2021 17:30:59 +0000 (10:30 -0700)]
Fix a shadowed variable (#64695)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64695

Resolves this warning:
```
caffe2/aten/src/ATen/ParallelOpenMP.h:109:63: warning: declaration of 'int64_t begin' shadows a parameter [-Wshadow=compatible-local]
  109 |   internal::invoke_parallel(begin, end, grain_size, [&](int64_t begin, int64_t end) {
      |                                                       ~~~~~~~~^~~~~
caffe2/aten/src/ATen/ParallelOpenMP.h:86:1: note: shadowed declaration is here
   85 | inline scalar_t parallel_reduce(
      |                 ~~~~~~~~~~~~~~~~
   86 |     const int64_t begin,
      | ^   ~
```

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D30816128

fbshipit-source-id: 3adff6d94eea9fbd65885e88283cae10b87dba18

2 years ago Added more version comparison operations (#63848)
Nived P A [Thu, 9 Sep 2021 17:29:10 +0000 (10:29 -0700)]
Added more version comparison operations (#63848)

Summary:
Currently the [TorchVersion](https://github.com/pytorch/pytorch/blob/1022443168b5fad55bbd03d087abf574c9d2e9df/torch/torch_version.py#L13) only supports 'greater than' and 'equal to' operations for comparing torch versions, so something like `TorchVersion('1.5.0') < (1,5,1)` or `TorchVersion('1.5.0') >= (1,5)` will throw an error.

I have added 'less than' (`__lt__()`), 'greater than or equal to' (`__ge__()`) and 'less than or equal to' (`__le__()`) operations, so that the TorchVersion object can be useful for a wider range of version comparisons.
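
As a rough illustration of what a full set of comparison operators looks like, here is a minimal sketch. `SimpleVersion` is a hypothetical stand-in that only parses dotted integers; it is not the `TorchVersion` implementation and makes no claim about how that class parses or compares versions.

```python
import operator


class SimpleVersion:
    """Hypothetical stand-in for illustration; not torch's TorchVersion."""

    def __init__(self, text):
        self.parts = tuple(int(p) for p in text.split("."))

    def _compare(self, other, op):
        # Normalize the right-hand side to a tuple of ints.
        if isinstance(other, SimpleVersion):
            other = other.parts
        elif isinstance(other, str):
            other = tuple(int(p) for p in other.split("."))
        elif not isinstance(other, tuple):
            return NotImplemented
        return op(self.parts, other)

    def __eq__(self, other):
        return self._compare(other, operator.eq)

    def __lt__(self, other):
        return self._compare(other, operator.lt)

    def __le__(self, other):
        return self._compare(other, operator.le)

    def __gt__(self, other):
        return self._compare(other, operator.gt)

    def __ge__(self, other):
        return self._compare(other, operator.ge)

    def __hash__(self):
        return hash(self.parts)


if __name__ == "__main__":
    print(SimpleVersion("1.5.0") < (1, 5, 1))
    print(SimpleVersion("1.5.0") >= (1, 5))
```

Returning `NotImplemented` for unsupported operand types lets Python fall back to the other operand's reflected method instead of raising inside the comparison.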

cc seemethere zsol

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63848

Reviewed By: fmassa, heitorschueroff

Differential Revision: D30526996

Pulled By: seemethere

fbshipit-source-id: 1db6bee555043e0719fd541cec27810852590940

2 years ago Reverts cat and stack warning when out= is not the expected shape (#64714)
Mike Ruberry [Thu, 9 Sep 2021 17:02:03 +0000 (10:02 -0700)]
Reverts cat and stack warning when out= is not the expected shape (#64714)

Summary:
These warnings are being thrown too aggressively at the moment. See https://github.com/pytorch/pytorch/issues/64709 for a follow-up to reenable them once internal call sites are reviewed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64714

Reviewed By: ngimel

Differential Revision: D30822965

Pulled By: mruberry

fbshipit-source-id: 3ad7c92d381d42ac6187ed84afab477c579a8f35

2 years ago To add SequentialLR to PyTorch Core Schedulers (#64037)
Ilqar Ramazanli [Thu, 9 Sep 2021 16:32:36 +0000 (09:32 -0700)]
To add SequentialLR to PyTorch Core Schedulers (#64037)

Summary:
Partially resolves https://github.com/pytorch/vision/issues/4281

In this PR we propose a new scheduler, SequentialLR, which enables a list of different schedulers to be called in different periods of the training process.

The main motivation for this scheduler is the recently gained popularity of a warm-up phase during training. It has been shown that taking small steps in the initial stages of training can help convergence get faster.

With SequentialLR we can mainly apply a small constant (or linearly increasing) learning rate, followed by the actual target learning rate scheduler.

```python
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])

for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
```

This code snippet will use `ConstantLR` for the first 5 epochs and then follow up with `ExponentialLR` for the remaining epochs.

This scheduler can be used to chain any group of schedulers one after another. The main consideration is that every time we switch to a new scheduler, we assume the new scheduler starts from the beginning, i.e. the zeroth epoch.
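
The switching rule described above can be sketched in plain Python. This is not the PyTorch implementation: `active_schedule`, `lr_at`, `base_lr`, and the two lambda schedules are hypothetical names chosen for illustration, and the schedules are simplified stand-ins for a constant warm-up and an exponential decay.

```python
from bisect import bisect_right


def active_schedule(epoch, milestones):
    """Return (index of the active schedule, epoch local to that schedule)."""
    # Milestones are the epochs at which the next schedule takes over.
    idx = bisect_right(milestones, epoch)
    start = milestones[idx - 1] if idx > 0 else 0
    # The newly selected schedule restarts from its own zeroth epoch.
    return idx, epoch - start


base_lr = 1.0  # hypothetical base learning rate
schedules = [
    lambda e: base_lr * 0.1,       # simplified constant warm-up
    lambda e: base_lr * 0.9 ** e,  # simplified exponential decay
]


def lr_at(epoch, milestones=(5,)):
    idx, local_epoch = active_schedule(epoch, milestones)
    return schedules[idx](local_epoch)


if __name__ == "__main__":
    for epoch in range(8):
        print(epoch, active_schedule(epoch, (5,)), lr_at(epoch))
```

Passing the local epoch rather than the global one is what makes the second schedule start from the beginning at the milestone.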

We also add Chained Scheduler to `optim.rst` and `lr_scheduler.pyi` files here.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64037

Reviewed By: albanD

Differential Revision: D30841099

Pulled By: iramazanli

fbshipit-source-id: 94f7d352066ee108eef8cda5f0dcb07f4d371751

2 years ago [pytorch] Make qlinear weight packing thread safe (#63804)
John Shen [Thu, 9 Sep 2021 16:30:32 +0000 (09:30 -0700)]
[pytorch] Make qlinear weight packing thread safe (#63804)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63804

Adding a lock around weight packing section of qlinear + qlinear_dynamic
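
The general pattern of guarding a packing step with a lock can be sketched as follows. This is a Python analogy only, assuming packing happens lazily on first use; `LazyPackedWeight` and its `tuple(...)` "packing" are hypothetical and do not reflect the actual qlinear code.

```python
import threading


class LazyPackedWeight:
    """Hypothetical illustration of lock-guarded, one-time packing."""

    def __init__(self, raw):
        self._raw = raw
        self._packed = None
        self._lock = threading.Lock()
        self.pack_calls = 0  # counts how often packing actually ran

    def get(self):
        # Only one thread at a time may check and perform the packing.
        with self._lock:
            if self._packed is None:
                self.pack_calls += 1
                self._packed = tuple(self._raw)  # stand-in for real packing
            return self._packed


if __name__ == "__main__":
    weight = LazyPackedWeight([1, 2, 3])
    threads = [threading.Thread(target=weight.get) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(weight.pack_calls)
```

Because the check and the assignment both happen while the lock is held, concurrent callers cannot both observe an unpacked weight and pack it twice.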

Test Plan: automated tests

Reviewed By: kimishpatel

Differential Revision: D30340957

fbshipit-source-id: 1c9faf796c4ffbc74345396188a6f1154a76bea6

2 years ago `torch.lu_solve`: forward AD support (#64646)
Nikita Vedeneev [Thu, 9 Sep 2021 15:56:29 +0000 (08:56 -0700)]
`torch.lu_solve`: forward AD support (#64646)

Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64646

Reviewed By: VitalyFedyunin

Differential Revision: D30807898

Pulled By: albanD

fbshipit-source-id: 1f943c22357dd1b3662cfe0d2a26af68e3a2df4c

2 years ago [nnc] Handled cast in index expression during inlining (#64716)
Raghavan Raman [Thu, 9 Sep 2021 15:26:16 +0000 (08:26 -0700)]
[nnc] Handled cast in index expression during inlining (#64716)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64716

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30826388

Pulled By: navahgar

fbshipit-source-id: 7e446602f650527e0d954e437f0370602019e040

2 years ago [nnc] Updated indices during broadcast to use int64_t (#64627)
Raghavan Raman [Thu, 9 Sep 2021 15:26:16 +0000 (08:26 -0700)]
[nnc] Updated indices during broadcast to use int64_t (#64627)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64627

This fixes the root cause of S242719

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30801686

Pulled By: navahgar

fbshipit-source-id: b6d3ebdc7eb57116eaced53c2f35c7798bb17e80

2 years ago Revert D30745921: [DDP] Fix when buffers are reassigned in module
Howard Huang [Thu, 9 Sep 2021 15:20:40 +0000 (08:20 -0700)]
Revert D30745921: [DDP] Fix when buffers are reassigned in module

Test Plan: revert-hammer

Differential Revision:
D30745921 (https://github.com/pytorch/pytorch/commit/d59ecc02df70bad2273858c2fad2b4993133a3d3)

Original commit changeset: 25eb1edbf445

fbshipit-source-id: 343ead86bf1e2d0b2d4124be331ea2fa437303ad

2 years ago Revert D30745961: [DDP] Remove self.modules_params
Howard Huang [Thu, 9 Sep 2021 15:20:40 +0000 (08:20 -0700)]
Revert D30745961: [DDP] Remove self.modules_params

Test Plan: revert-hammer

Differential Revision:
D30745961 (https://github.com/pytorch/pytorch/commit/8c095102948c9601792a884dad56da5085c51bee)

Original commit changeset: 32d102502570

fbshipit-source-id: 59f7cc50d369b6cc2856cf4ebd0f58b96202336d

2 years ago Revert D30745960: [DDP] Remove SPMD from self.modules_buffers
Howard Huang [Thu, 9 Sep 2021 15:20:40 +0000 (08:20 -0700)]
Revert D30745960: [DDP] Remove SPMD from self.modules_buffers

Test Plan: revert-hammer

Differential Revision:
D30745960 (https://github.com/pytorch/pytorch/commit/15532595209d2daf34d35e10f8d3d3b64966aea2)

Original commit changeset: 66a8f9847e9f

fbshipit-source-id: d3f3fb813c45ac1b0ff15c6154b2e99e5dbab433