platform/upstream/pytorch.git
Edward Yang [Wed, 1 Sep 2021 00:55:23 +0000 (17:55 -0700)]
Convert mul to use opmath_gpu_kernel_with_scalars (#64019)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64019

Note that previously the functor operated on scalar_t and
this modifies it to operate on opmath_t, but this is not
a problem as half precision was implemented by performing the
compute in float anyway.
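The compute-in-opmath pattern can be illustrated in plain Python, emulating `Half` with `struct`'s IEEE binary16 format (a minimal sketch; function names are illustrative, not the kernel code):

```python
import struct

def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE binary16 value."""
    return struct.unpack('e', struct.pack('e', x))[0]

def mul_naive(a: float, b: float, c: float) -> float:
    # Functor on scalar_t: rounds to half after every intermediate op.
    return to_half(to_half(to_half(a) * to_half(b)) * to_half(c))

def mul_opmath(a: float, b: float, c: float) -> float:
    # Functor on opmath_t: computes in full precision, rounds to half once.
    return to_half(to_half(a) * to_half(b) * to_half(c))

a = b = c = 1.0 / 3.0
print(mul_naive(a, b, c), mul_opmath(a, b, c))
```

For a single multiply the two agree; the opmath form matters once several ops are fused, since it rounds only once at the end.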

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30575282

Pulled By: ezyang

fbshipit-source-id: cc6900ef996e755740afe48f9cb4d0366858dd47

soulitzer [Wed, 1 Sep 2021 00:51:55 +0000 (17:51 -0700)]
Use the correct overloaded name to skip boxed autograd not implemented kernel registration (#64182)

Summary:
Some internal use_count tests are failing for `dequantize_self`, because we only compare against the skip list using the base name `dequantize` when we should be comparing the full name, including the overload.
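The fix can be illustrated with a small sketch (the skip-list contents and the helper are hypothetical, not the actual registration code):

```python
# Hypothetical skip list keyed by fully qualified overload names.
SKIP_LIST = {"dequantize.self"}

def should_skip(base_name: str, overload_name: str = "") -> bool:
    # The bug was comparing only base_name ("dequantize"), which never
    # matches the fully qualified entry "dequantize.self".
    full_name = f"{base_name}.{overload_name}" if overload_name else base_name
    return full_name in SKIP_LIST
```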

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64182

Reviewed By: albanD

Differential Revision: D30639909

Pulled By: soulitzer

fbshipit-source-id: d4d22dd1a5c8f7180251ce7739830764cce6f151

Ray Peng [Wed, 1 Sep 2021 00:45:50 +0000 (17:45 -0700)]
[Static Runtime] Out version for softmax (#64243)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64243

Test Plan:
```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
...
V0830 16:35:22.524479 613839 impl.cpp:1410] Switch to out variant for node: %5 : Tensor = aten::softmax(%a.1, %dim.1, %dtype.1)
...
[       OK ] StaticRuntime.IndividualOps_Softmax (803 ms)
```

Reviewed By: hlu1

Differential Revision: D30656149

fbshipit-source-id: 115b7b4a75448fd6a5c526808080ca9a4251302c

Eli Uriegas [Wed, 1 Sep 2021 00:38:42 +0000 (17:38 -0700)]
.circleci: Remove already migrated CUDA configs (#64231)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64231

This removes the CUDA 11.1 and CUDA 10.2 configs that we had
previously migrated to GHA

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: zhouzhuojie

Differential Revision: D30683811

Pulled By: seemethere

fbshipit-source-id: 71b0761461557d871c26eb02f665a2e4d9b1d9fb

Eli Uriegas [Wed, 1 Sep 2021 00:38:42 +0000 (17:38 -0700)]
.github: Consolidate linux setup / teardown (#64229)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64229

Consolidates Linux setup / teardown into easy-to-use Jinja2 macros

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: zhouzhuojie, driazati

Differential Revision: D30683810

Pulled By: seemethere

fbshipit-source-id: 2578630df3e212fb79392a699090553baef44cc2

Nikita Shulga [Wed, 1 Sep 2021 00:33:11 +0000 (17:33 -0700)]
Add ciflow-tracking issue to pytorch-probot (#64125)

Summary:
Doesn't do anything yet...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64125

Reviewed By: zhouzhuojie

Differential Revision: D30620283

Pulled By: malfet

fbshipit-source-id: 91869d35c1b70a55e32261d2c32fb0136ec33960

Mikhail Zolotukhin [Wed, 1 Sep 2021 00:32:00 +0000 (17:32 -0700)]
[TensorExpr] Move declaration of buildErrorMessage to exception.h (#64301)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64301

Test Plan: Imported from OSS

Reviewed By: navahgar, huiguoo

Differential Revision: D30678215

Pulled By: ZolotukhinM

fbshipit-source-id: 599c83b3890450a0fb6526815f037eec9563661c

Jay Leverett [Wed, 1 Sep 2021 00:28:42 +0000 (17:28 -0700)]
Fix redundant class definition in GraphModule singleton constructor (#64274)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/63883

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64274

Reviewed By: jamesr66a

Differential Revision: D30675970

Pulled By: jayleverett

fbshipit-source-id: e74ef2a28013f0fa7c58d14f38e66cfe48d26b74

Nikita Shulga [Wed, 1 Sep 2021 00:19:11 +0000 (17:19 -0700)]
Discover new tests in run_tests.py (#64246)

Summary:
Introduce a `discover_tests` function that globs for all Python files
starting with `test_` in the test folder, excluding subfolders that are
executed differently
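The globbing approach can be sketched as follows (a simplified stand-in for the actual `run_test.py` logic; paths and the blocklist are made up):

```python
import tempfile
from pathlib import Path

def discover_tests(test_dir: Path, blocklist=()) -> list:
    """Glob for top-level test_*.py files, skipping blocklisted names.

    Path.glob is non-recursive, so subfolders (which are executed
    differently) are excluded automatically.
    """
    return sorted(
        p.stem for p in test_dir.glob("test_*.py") if p.stem not in blocklist
    )

# Build a throwaway test folder to demonstrate discovery.
root = Path(tempfile.mkdtemp())
for name in ["test_torch.py", "test_nn.py", "helper.py"]:
    (root / name).touch()
(root / "distributed").mkdir()
(root / "distributed" / "test_store.py").touch()

print(discover_tests(root))  # only top-level test_* files
```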

Fixes https://github.com/pytorch/pytorch/issues/64178

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64246

Reviewed By: walterddr, seemethere

Differential Revision: D30661652

Pulled By: malfet

fbshipit-source-id: a52e78ec717b6846add267579dd8d9ae75326bf9

Richard Zou [Tue, 31 Aug 2021 21:53:01 +0000 (14:53 -0700)]
Revert D30543236: Add python mode

Test Plan: revert-hammer

Differential Revision:
D30543236 (https://github.com/pytorch/pytorch/commit/4bd03b02424d93b72f15e28c542ede13f88ea929)

Original commit changeset: ef5444d96a5a

fbshipit-source-id: b0042ac2c22765fa11d6d00bf751f6a4489eb6d8

Kevin Tse [Tue, 31 Aug 2021 20:55:59 +0000 (13:55 -0700)]
[DataPipe] export fork, mux, demux for public usage (#64279)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64279

cc VitalyFedyunin ejguan

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30671971

Pulled By: NivekT

fbshipit-source-id: 056ac12ef7183b254d1eec341145594639e47ef6

Kevin Tse [Tue, 31 Aug 2021 20:55:59 +0000 (13:55 -0700)]
[DataPipe] adding description, __len__, tests for mux() (#64224)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64224

cc VitalyFedyunin ejguan

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30651551

Pulled By: NivekT

fbshipit-source-id: f8af98ba71a592900b992a8077432062ec57bb48

zhouzhuojie [Tue, 31 Aug 2021 20:48:28 +0000 (13:48 -0700)]
Try the forked checkout action with retry (#64120)

Summary:
Fixes #{issue number}

The main difference is:
https://github.com/zhouzhuojie/checkout/commit/ffc6f93ad4b6e3cdcdd1a34e8c896765002f9b34

Can test multiple times in this PR to see if it works, will make the `retry` number configurable if it's usable.
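The retry idea behind the forked action can be sketched generically (a hypothetical helper in Python, not the action's actual implementation):

```python
import time

def with_retry(fn, retries=3, delay=0.0):
    """Call fn, retrying up to `retries` times on any exception."""
    last_exc = None
    for _attempt in range(retries):
        try:
            return fn()
        except Exception as exc:  # illustrative catch-all
            last_exc = exc
            time.sleep(delay)
    raise last_exc

calls = {"n": 0}
def flaky_checkout():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("transient network error")
    return "checked out"

print(with_retry(flaky_checkout))  # succeeds on the third attempt
```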

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64120

Reviewed By: malfet

Differential Revision: D30656099

Pulled By: zhouzhuojie

fbshipit-source-id: a89932196bb0c44e412a34664ed6a061b02ef92e

Rishi Puri [Tue, 31 Aug 2021 20:47:29 +0000 (13:47 -0700)]
fix syntax error in bfloat16 PR (#64122)

Summary:
Fixes a syntax error from a prior PR. cc ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64122

Reviewed By: H-Huang

Differential Revision: D30643596

Pulled By: ngimel

fbshipit-source-id: 0a2d5a40fb6dc7339cd03112e57ef0e1bf8a000e

Michael Carilli [Tue, 31 Aug 2021 20:29:39 +0000 (13:29 -0700)]
[CUDA graphs] Prototype API and documentation (#63269)

Summary:
RFC: https://github.com/pytorch/pytorch/issues/61880

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63269

Reviewed By: mruberry

Differential Revision: D30596643

Pulled By: ngimel

fbshipit-source-id: b1f8061406364b667e2c2d4d30fbce1f0d8456be

Rohan Varma [Tue, 31 Aug 2021 19:51:20 +0000 (12:51 -0700)]
Remove ref to test_distributed_fork (#64197)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64197

Removes this line as the test is gone.
ghstack-source-id: 136986275

Test Plan: CI

Reviewed By: walterddr

Differential Revision: D30642929

fbshipit-source-id: a0c7dfdfb35a4a7f7ec1b881dbea53d85136012c

Eli Uriegas [Tue, 31 Aug 2021 19:50:11 +0000 (12:50 -0700)]
.circleci: Remove migrated jobs, move docs builds (#64222)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64222

Removes both backwards_compat and docs_test from the general
gcc5.4 config, and moves the docs build from being run on every PR to
only being run on master.

We can remove the docs builds when we migrate the docs push job (including
all secrets associated with it)

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30650953

Pulled By: seemethere

fbshipit-source-id: ac11da6a551a6c81f3dc1d47fd81846cbfe9975a

Raghuraman Krishnamoorthi [Tue, 31 Aug 2021 19:22:13 +0000 (12:22 -0700)]
[ao][docs] Clarify operator support for quantization (#63270)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63270

Add table to quantization main page showing supported modules
for static and dynamic quantization.
ghstack-source-id: 137087204

Test Plan: Imported from OSS

Reviewed By: HDCharles

Differential Revision: D30658654

fbshipit-source-id: a82c998e1db6370596d5b0ca4c7cc96c1c90f30e

Vasiliy Kuznetsov [Tue, 31 Aug 2021 19:09:59 +0000 (12:09 -0700)]
ns for fx: make layer types more readable (#64270)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64270

Before this PR, layer types were populated by doing
`str(module_instance)` and `str(function)`. This resulted
in moderately readable strings for modules, and poorly readable
strings for functions.

This PR switches the logic to use `torch.typename` utility instead.
The results are significantly more readable.

Example function type:

```
# before
'<built-in method linear of PyCapsule object at 0x7fe9b20ce7b0>'

# after
'torch._ops.quantized.PyCapsule.linear'
```

Example module type:

```
# before
"<class 'torch.nn.quantized.modules.conv.Conv2d'>"

# after
'torch.nn.quantized.modules.conv.Conv2d'
```
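A rough re-implementation shows why the `torch.typename`-style name reads better than `str(...)` (this is an approximation for illustration, not torch's actual helper):

```python
from collections import OrderedDict

def typename(obj) -> str:
    """Rough equivalent of torch.typename: a readable module-qualified name."""
    if hasattr(obj, "__module__") and hasattr(obj, "__qualname__"):
        return f"{obj.__module__}.{obj.__qualname__}"
    klass = type(obj)
    return f"{klass.__module__}.{klass.__qualname__}"

print(str(OrderedDict))       # "<class 'collections.OrderedDict'>" - noisy
print(typename(OrderedDict))  # "collections.OrderedDict" - readable
```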

Test Plan:
Manually inspect NS results for modules and functions, verify they are
more readable.

Imported from OSS

Differential Revision: D30669545

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 60959e5cafa0a4992b083bf99f5d8260f9acdac0

Shiyan Deng [Tue, 31 Aug 2021 18:29:07 +0000 (11:29 -0700)]
[fx2trt] Add acc_ops.sign and converter for it (#63876)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63876

Add `acc_ops.sign` which maps from `torch.sign`.

Add a plugin (does not currently support dynamic shape) for `acc_ops.sign`. The plugin calls `at::sign` directly.

Test Plan: buck test mode/opt -c python.package_style=inplace -c fbcode.nvcc_arch=a100 caffe2/torch/fb/fx2trt:test_unary_ops

Reviewed By: yinghai

Differential Revision: D30518081

fbshipit-source-id: a0b9e6c30deac0b04b8cb09a162579e229985330

Saketh Are [Tue, 31 Aug 2021 17:59:57 +0000 (10:59 -0700)]
Use stacklevel for floordiv deprecation warnings (#64034)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/60548

`Tensor.__floordiv__` was indirectly deprecated by deprecation of `torch.floor_divide` (see https://github.com/pytorch/pytorch/issues/43874). Deprecating it directly provides clearer feedback.

Repro:
```
import torch
x = torch.tensor(0)
x // 1
```

Before this change, a deprecation warning was triggered within the C++ implementation of floor_divide:
```
UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  ../aten/src/ATen/native/BinaryOps.cpp:571.)
  return torch.floor_divide(self, other)
```

After this change, the warning instead cites the user's offending line of Python code:
```
UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  x // 1
```
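The `stacklevel` mechanism that makes this work can be sketched with the standard `warnings` module (the wrapper name and message are illustrative):

```python
import warnings

def floordiv(a, b):
    # stacklevel=2 attributes the warning to the caller's line,
    # not to this wrapper's internals.
    warnings.warn(
        "__floordiv__ is deprecated, and its behavior will change "
        "in a future version of pytorch.",
        UserWarning,
        stacklevel=2,
    )
    return a // b

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = floordiv(7, 2)

print(result, caught[0].message)
```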

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64034

Reviewed By: mruberry

Differential Revision: D30658010

Pulled By: saketh-are

fbshipit-source-id: b0e6c5008d741897509d102f4a89efb47de4aa2a

Raghuraman Krishnamoorthi [Tue, 31 Aug 2021 16:45:28 +0000 (09:45 -0700)]
[ao][docs] Add description of qconfig and qengine to quantization page (#63582)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63582

Current quantization docs do not define qconfig and qengine. Added text to define these concepts before they are used.
ghstack-source-id: 137051719

Test Plan: Imported from OSS

Reviewed By: HDCharles

Differential Revision: D30658656

fbshipit-source-id: a45a0fcdf685ca1c3f5c3506337246a430f8f506

Kushashwa Ravi Shrimali [Tue, 31 Aug 2021 16:45:09 +0000 (09:45 -0700)]
Add OpInfo for `nn.functional.cosine_similarity` (#62959)

Summary:
Please see https://github.com/facebookresearch/functorch/issues/78 and https://github.com/pytorch/pytorch/issues/54261.

Notes:

* Some redundant tests from `test_nn.py` have been removed. I'm unsure whether the precision checks can be removed as well.
* Broadcasting is also checked in the OpInfo for `cosine_similarity`.

cc: mruberry zou3519 Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62959

Reviewed By: heitorschueroff

Differential Revision: D30520176

Pulled By: zou3519

fbshipit-source-id: 14e902eb4bcce875edab28a1669a2ea021052b9b

Kevin Tse [Tue, 31 Aug 2021 15:07:23 +0000 (08:07 -0700)]
[DataPipe] implementing __len__ for fork (no valid length for demux) (#64215)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64215

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30648672

Pulled By: NivekT

fbshipit-source-id: 4780f2f6a79ae15a4009092475e7d92f96dd09a2

Kevin Tse [Tue, 31 Aug 2021 15:07:23 +0000 (08:07 -0700)]
[DataPipe] implementing demux() (#63650)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63650

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30493944

Pulled By: NivekT

fbshipit-source-id: 0aa06dee8c7fb1744975b8f6a0694b90c11ef80d

Kevin Tse [Tue, 31 Aug 2021 15:07:23 +0000 (08:07 -0700)]
[DataPipe] implementing fork() (#63649)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63649

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30493945

Pulled By: NivekT

fbshipit-source-id: 40db7d4134facd266d86bc0dc2edf2729c4e5842

Kimish Patel [Tue, 31 Aug 2021 14:36:53 +0000 (07:36 -0700)]
Revert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling.

Test Plan: revert-hammer

Differential Revision:
D30327514 (https://github.com/pytorch/pytorch/commit/bc9277dca3a40d99147d4a1a3e0160a4a8e91f9f)

Original commit changeset: 3bb2f2daaaed

fbshipit-source-id: 0b2aa7c57d08de77c9aaa75e546a7d0938610f64

Harut Movsisyan [Tue, 31 Aug 2021 07:49:39 +0000 (00:49 -0700)]
[Static Runtime] Implement aten::nonzero out variant (#64126)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64126

Test Plan:
Confirm out variant is called:

```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
```

Reviewed By: mikeiovine

Differential Revision: D30617729

fbshipit-source-id: 752749638c8f467815efa57021cb3de5c728ab1b

Facebook Community Bot [Tue, 31 Aug 2021 04:31:11 +0000 (21:31 -0700)]
Automated submodule update: FBGEMM (#64213)

Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: https://github.com/pytorch/FBGEMM/commit/9d69998df6236d6714aa37ae6142a2a2d4fb2bf6

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64213

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jspark1105

Differential Revision: D30647878

fbshipit-source-id: b903b39441b4e28dda7eab226ac874e2227e750a

Kimish Patel [Tue, 31 Aug 2021 03:53:50 +0000 (20:53 -0700)]
[Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling. (#63367)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63367

This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wall-clock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the Kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)

Furthermore, we can possibly use a Python post-processing script to parse the
chrome trace and generate output similar to torch.profiler. (To be done)

Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and `--print_module_info true` (see Operator summary has now module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985

Reviewed By: raziel

Differential Revision: D30327514

fbshipit-source-id: 3bb2f2daaaedfb04bd6f5d9c91292783f9c4344f

Peter Bell [Tue, 31 Aug 2021 03:17:12 +0000 (20:17 -0700)]
Compile BatchLinearAlgebra without nvcc (#64146)

Summary:
These files only use CUDA library interfaces, so they don't actually need to be compiled with nvcc.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64146

Reviewed By: ezyang

Differential Revision: D30633189

Pulled By: ngimel

fbshipit-source-id: c9d0ae5259a10cb49332d31f0da89ad758736ea8

Bert Maher [Tue, 31 Aug 2021 03:08:15 +0000 (20:08 -0700)]
[nnc] Enable fusion of bfloat16 ops (#64196)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64196

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30643864

Pulled By: bertmaher

fbshipit-source-id: e95edeaf7089464d713ea1d1f951743d3e5f61c5

James Reed [Tue, 31 Aug 2021 02:54:50 +0000 (19:54 -0700)]
[WIP][FX] BC guarantees for 1.10 (#63888)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63888

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D30523133

Pulled By: jamesr66a

fbshipit-source-id: b04cc0d842a74862f42ecba98b757310cd2ec7b0

leslie-fang-intel [Tue, 31 Aug 2021 02:28:59 +0000 (19:28 -0700)]
add operation list for AutocastCPU (#63534)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63534

In this PR:
* We have changed the default dtype of `AutocastCPU` from `float16` to `bfloat16`, as discussed in https://github.com/pytorch/pytorch/pull/61002
* We also update the list of operations that need casting to `lower_precision_fp` or `float32`.
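The flavor of such an op list can be sketched as a simple policy table (the op names and dtype strings here are illustrative, not the actual autocast registrations):

```python
# Hypothetical cast policy table in the spirit of AutocastCPU's op lists.
LOWER_PRECISION_FP = "bfloat16"  # new default (was float16)
CAST_POLICY = {
    "conv2d": LOWER_PRECISION_FP,   # compute-bound: run in low precision
    "matmul": LOWER_PRECISION_FP,
    "pow": "float32",               # precision-sensitive: keep in float32
    "softmax": "float32",
}

def autocast_dtype(op: str, default: str = "float32") -> str:
    """Dtype an op's inputs would be cast to under this toy policy."""
    return CAST_POLICY.get(op, default)
```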

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D30644914

Pulled By: ezyang

fbshipit-source-id: 8b93485ba452b3759611e3f0ac88e920fe495ac1

oleshp [Tue, 31 Aug 2021 02:22:05 +0000 (19:22 -0700)]
Update contribution_guide.rst (#64142)

Summary:
Grammatical update.

Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64142

Reviewed By: mruberry

Differential Revision: D30639394

Pulled By: ezyang

fbshipit-source-id: cf1a4dfbd8e34b0772f1b09f5d820278e8ef8574

Santiago Castro [Tue, 31 Aug 2021 02:17:21 +0000 (19:17 -0700)]
Avoid an unnecessary list creation in `DataChunk` (#64111)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64111

Reviewed By: mruberry

Differential Revision: D30639383

Pulled By: ezyang

fbshipit-source-id: 96b243307413c99a67d55d862a71937e1ef210f4

Samantha Andow [Tue, 31 Aug 2021 02:15:16 +0000 (19:15 -0700)]
Add optional tensor arguments to (#63967)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/63435

Adds optional tensor arguments to the torch function handling checks. The only function in the functional file I didn't do this for was `multi_head_attention_forward`, since that already took care of some optional tensor arguments but not others, so it seemed like the arguments were specifically chosen

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63967

Reviewed By: albanD

Differential Revision: D30640441

Pulled By: ezyang

fbshipit-source-id: 5ef9554d2fb6c14779f8f45542ab435fb49e5d0f

CaoE [Tue, 31 Aug 2021 02:12:23 +0000 (19:12 -0700)]
add BFloat16 support for fold and unfold on CPU (#62880)

Summary:
Add BFloat16 support for fold and unfold operators on CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62880

Reviewed By: iramazanli

Differential Revision: D30576387

Pulled By: zou3519

fbshipit-source-id: c48f6e56702bfea34448db1b3a1634c49c5d8ec8

Edward Yang [Tue, 31 Aug 2021 02:08:45 +0000 (19:08 -0700)]
Add acc_gpu_kernel_with_scalars and port add to use it (#63884)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63884

See https://dev-discuss.pytorch.org/t/cuda-loops-case-study-code-generation-vs-templates/302
for explanation of what's going on here.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30545296

Pulled By: ezyang

fbshipit-source-id: f0da52153ae63599fe1d57e90e73f50ca2116939

Erjia Guan [Tue, 31 Aug 2021 01:41:08 +0000 (18:41 -0700)]
Modify inline doc for DataPipe (#64221)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64221

List of tasks in this PR
- [x]  Add inline doc for DataPipe
- [x] Improve the inline doc
- [x] Expose DataPipe to `datapipes.iter` (`UnBatcher`). Note: `Forker`, `Demux`, and `Mux` are exposed in another PR authored by Kevin
- [x] Add correct typing to DataPipe
- [x] Unify the argument to `datapipe` rather than `source_datapipe`

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30650541

Pulled By: ejguan

fbshipit-source-id: c09d1b9742b8097d8e645c15947cef80c876877b

Erjia Guan [Tue, 31 Aug 2021 01:41:08 +0000 (18:41 -0700)]
Replace group_by_key by group_by IterDataPipe (#64220)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64220

Remove `ByKeyGrouperIterDataPipe` due to duplicated functionality.
Fix a bug in `GrouperIterDataPipe` using the existing test.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30650542

Pulled By: ejguan

fbshipit-source-id: 666b4d28282fb4f49f3ff101b8d08be16a50d836

Richard Zou [Tue, 31 Aug 2021 01:39:50 +0000 (18:39 -0700)]
Add python mode (#63496)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63496

This PR adds a (private) enable_python_mode context manager.
(see torch/utils/_python_dispatch.py).
enable_python_mode accepts the type of a __torch_dispatch__ object
as its argument. Whenever an operator gets called inside of the
context manager, it dispatches to the __torch_dispatch__ of
the passed-in type.

Example usage:
```
with enable_python_mode(LoggingTensor):
    z = torch.empty([])
    assert isinstance(z, LoggingTensor)
```

There are quite a few changes that were made to support this.

First, we added TorchDispatchTypeObject, a C++ struct that represents the
type of a `__torch_dispatch__` object (e.g. LoggingTensor).
It holds both the PyObject* representing the class and a PyInterpreter*
so we know which Python interpreter it came from.

Next, we updated the concrete_dispatch_fn in python_variable.cpp to accept
a `const std::shared_ptr<TorchDispatchTypeObject>&` argument. When this
is null, dispatching happens as usual. When it is non-null, we prepend
the TorchDispatchTypeObject's PyObject* to the overloaded args list so that
it is considered first for dispatch.

To get that to work, we changed how `handle_torch_dispatch_no_python_arg_parser`
works. The "overloaded args list" previously only consisted of Tensor PyObjects,
but now it can have types in addition to Tensors!
- We renamed `append_overloaded_arg` to `append_overloaded_tensor`
- We added a new `append_overloaded_type` that appends a type to
overloaded_args
- We added special handling in `handle_torch_dispatch_no_python_arg_parser`
and `append_overloaded_arg` to handle types in addition to Tensors.

Then, there is PythonMode and PythonModeTLS.
- We reuse the DispatchKey::Python dispatch key as a mode key
- We use PythonMode::enter and PythonMode::exit to enable/disable
DispatchKey::Python and set the PythonModeTLS.
- PythonModeTLS stores a TorchDispatchTypeObject as metadata.
- PythonMode is in libtorch_python, and PythonModeTLS is in ATen.
This split is due to the libtorch_python library boundary (because we need
to save TLS in ATen/ThreadLocalState)
- We modify the PythonFallbackKernel to look up
the relevant TorchDispatchTypeObject (if Python Mode is active) and
dispatch using it.

There are two more miscellaneous changes:
- internal_new_from_data (torch/csrc/utils/tensor_new.cpp) gets an
exclude guard. enable_python_mode currently does not handle
torch.tensor and the exclude guard is to prevent a bug.

Future:
- This PR does not allow for the nesting of Python modes. In the future we
should be able to enable this with a more sane no_dispatch API and by changing
the TLS to a stack. For now I did not need this for CompositeImplicitAutograd testing.
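A minimal pure-Python sketch of the mode idea (a context-global "mode" consulted before normal dispatch; the names `dispatch`, `_mode`, and the hook shape are illustrative, not PyTorch's internals):

```python
from contextlib import contextmanager

_mode = None  # stand-in for PythonModeTLS

@contextmanager
def enable_python_mode(mode_type):
    """Make mode_type's __torch_dispatch__ win for ops run inside the block."""
    global _mode
    prev, _mode = _mode, mode_type  # save/restore the previous mode
    try:
        yield
    finally:
        _mode = prev

def dispatch(op_name, *args):
    # Stand-in for the Python fallback kernel: prefer the active mode.
    if _mode is not None:
        return _mode.__torch_dispatch__(op_name, args)
    return ("default", op_name, args)

class LoggingMode:
    @staticmethod
    def __torch_dispatch__(op_name, args):
        return ("logged", op_name, args)

with enable_python_mode(LoggingMode):
    print(dispatch("aten::empty"))  # handled by LoggingMode
print(dispatch("aten::empty"))      # back to default dispatch
```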

Test Plan: - new tests

Reviewed By: malfet, albanD

Differential Revision: D30543236

Pulled By: zou3519

fbshipit-source-id: ef5444d96a5a957d1657b7e37dce80f9a497d452

Bert Maher [Tue, 31 Aug 2021 01:36:33 +0000 (18:36 -0700)]
[nnc] Fix half2float conversion and re-enable float16 (#64199)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64199

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30643865

Pulled By: bertmaher

fbshipit-source-id: 9de6adca53bd08839328cbaf6364f7de9550264b

Harut Movsisyan [Mon, 30 Aug 2021 23:16:45 +0000 (16:16 -0700)]
[Static Runtime] Implement aten::cumsum out variant (#64159)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64159

Test Plan:
Confirm out variant is called for both versions:

```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
```
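The "out variant" idea — writing into a preallocated output instead of allocating a fresh tensor on every call — can be sketched in plain Python (illustrative only, not the actual C++ out-variant kernel):

```python
def cumsum_out(xs, out=None):
    """Out-variant pattern: write into a caller-provided buffer when possible."""
    if out is None or len(out) != len(xs):
        out = [0] * len(xs)  # fall back to allocating (the "functional" path)
    total = 0
    for i, v in enumerate(xs):
        total += v
        out[i] = total
    return out

buf = [0] * 4
print(cumsum_out([1, 2, 3, 4], buf))  # buf is reused, no new allocation
```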

Reviewed By: mikeiovine

Differential Revision: D30622819

fbshipit-source-id: a2c8c7f969dae5f507718fb3d513e1fb4f026736

Richard Zou [Mon, 30 Aug 2021 22:58:50 +0000 (15:58 -0700)]
OpInfo for nn.functional.interpolate (#61956)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61956

Each mode goes through a different implementation so they are listed as
different variants.

Test Plan: - run tests

Reviewed By: malfet

Differential Revision: D30013751

Pulled By: zou3519

fbshipit-source-id: 4253b40b55667d7486ef2d98b441c13d807ab292

Thomas J. Fan [Mon, 30 Aug 2021 22:03:40 +0000 (15:03 -0700)]
BUG Fixes regression for nllloss gradcheck (#64203)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/64163

This PR includes the fix and the opinfo from https://github.com/pytorch/pytorch/pull/63854/ for non-regression testing.

cc albanD mruberry jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64203

Reviewed By: albanD

Differential Revision: D30647522

Pulled By: jbschlosser

fbshipit-source-id: 2974d299763505908fa93532aca2bd5d5b71f2e9

Ivan Yashchuk [Mon, 30 Aug 2021 22:03:15 +0000 (15:03 -0700)]
Enable Half, BFloat16, and Complex dtypes for coo-coo sparse matmul [CUDA] (#59980)

Summary:
This PR enables Half, BFloat16, ComplexFloat, and ComplexDouble support for matrix-matrix multiplication of COO sparse matrices.
The change is applied only to CUDA 11+ builds.

`cusparseSpGEMM` also supports `CUDA_C_16F` (complex float16) and `CUDA_C_16BF` (complex bfloat16). PyTorch also supports the complex float16 dtype (`ScalarType::ComplexHalf`), but there is no convenient dispatch, so this dtype is omitted in this PR.
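The COO-times-COO product itself can be sketched in pure Python (a naive reference for what SpGEMM computes, not the `cusparseSpGEMM` path):

```python
from collections import defaultdict

def coo_matmul(a_entries, b_entries):
    """Multiply two sparse matrices given as {(row, col): value} dicts."""
    b_by_row = defaultdict(list)
    for (i, j), v in b_entries.items():
        b_by_row[i].append((j, v))
    out = defaultdict(float)
    # Each A entry (i, k) pairs with B entries in row k.
    for (i, k), va in a_entries.items():
        for j, vb in b_by_row[k]:
            out[(i, j)] += va * vb
    return {ij: v for ij, v in out.items() if v != 0}

A = {(0, 0): 2.0, (1, 2): 3.0}
B = {(0, 1): 4.0, (2, 0): 5.0}
print(coo_matmul(A, B))  # {(0, 1): 8.0, (1, 0): 15.0}
```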

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59980

Reviewed By: ngimel

Differential Revision: D29699456

Pulled By: cpuhrsch

fbshipit-source-id: 407ae53392acb2f92396a62a57cbaeb0fe6e950b

Alban Desmaison [Mon, 30 Aug 2021 21:56:35 +0000 (14:56 -0700)]
Revert D30561459: Fix bytes_written and bytes_read

Test Plan: revert-hammer

Differential Revision:
D30561459 (https://github.com/pytorch/pytorch/commit/e98173ff3423247c597e21c923c8f47470ef07ab)

Original commit changeset: 976fa5167097

fbshipit-source-id: 43f4c234ca400820fe6db5b4f37a25e14dc4b0dd

Alban Desmaison [Mon, 30 Aug 2021 21:46:50 +0000 (14:46 -0700)]
Back out "Added reference tests to ReductionOpInfo" (#64183)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64183

Original commit changeset: 6a1f82ac2819

Test Plan: CI

Reviewed By: soulitzer

Differential Revision: D30639835

fbshipit-source-id: e238043c6fbd0453317a9ed219e348298f98aaca

2 years ago[quant][graphmode][fx] Add reference quantized conv module (#63828)
Jerry Zhang [Mon, 30 Aug 2021 21:21:39 +0000 (14:21 -0700)]
[quant][graphmode][fx] Add reference quantized conv module (#63828)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63828

Added reference quantized conv module for the custom backend flow, the reference quantized module will
have the following code:
```
        w(float) -- quant - dequant \
        x(float) ------------- F.conv2d ---
```
In the full model, we will see
```
        w(float) -- quant - *dequant \
        x -- quant --- *dequant --  *F.conv2d --- *quant - dequant
```
and the backend should be able to fuse the ops with `*` into a quantized conv
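The quant - dequant pattern around the weight can be sketched in plain Python (hypothetical scale/zero-point values and an assumed affine int8 mapping; this is not the module's actual code): the weight is quantized and immediately dequantized, so the graph carries the quantization parameters while all ops stay in float.

```python
# Sketch of the reference quant - dequant pattern (illustrative values only).
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine quantization of a single float value to int8 range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Map the integer value back to float using the same affine params."""
    return (q - zero_point) * scale

scale, zp = 0.05, 0
w = 1.337
# The "w(float) -- quant - dequant" edge: the float weight a backend sees
# before it fuses the starred ops into a quantized conv.
w_ref = dequantize(quantize(w, scale, zp), scale, zp)
print(round(w_ref, 4))  # 1.35
```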

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_conv_linear_reference

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D30504749

fbshipit-source-id: e1d8c43a0e0d6d9ea2375b8ca59a9c0f455514fb

2 years agoBack out "[JIT] Add aten::slice optimization"
Daya Khudia [Mon, 30 Aug 2021 20:58:47 +0000 (13:58 -0700)]
Back out "[JIT] Add aten::slice optimization"

Summary:
Original commit changeset: d12ee39f6828
build-break
overriding_review_checks_triggers_an_audit_and_retroactive_review
Oncall Short Name: dskhudia

Test Plan: Local run succeeds

Differential Revision: D30633990

fbshipit-source-id: 91cf7cc0ad7e47d919347c2a1527688e062e0c62

2 years ago.github: Adding configuration for backwards_compat (#64204)
Eli Uriegas [Mon, 30 Aug 2021 20:55:19 +0000 (13:55 -0700)]
.github: Adding configuration for backwards_compat (#64204)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64204

Adds backwards_compat to our existing test matrix for github actions

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30646764

Pulled By: seemethere

fbshipit-source-id: f0da6027e29fab03aff058cb13466fae5dcf3678

2 years ago.github: Adding configuration for docs_test (#64201)
Eli Uriegas [Mon, 30 Aug 2021 20:55:19 +0000 (13:55 -0700)]
.github: Adding configuration for docs_test (#64201)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64201

Adds docs_test to our existing test matrix for github actions

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30646765

Pulled By: seemethere

fbshipit-source-id: 946adae01ff1f1f7ebe626e408e161b77b19a011

2 years agoMake name() part of IMethod interface (#63995)
Will Constable [Mon, 30 Aug 2021 20:29:51 +0000 (13:29 -0700)]
Make name() part of IMethod interface (#63995)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63995

JIT methods already have name() in their interface, and Py methods have names in their implementation.  I'm adding this for a particular case where someone tried to use name() on a JIT method that we're replacing with an IMethod.

Test Plan: add case to imethod API test

Reviewed By: suo

Differential Revision: D30559401

fbshipit-source-id: 76236721f5cd9a9d9d488ddba12bfdd01d679a2c

2 years agoFix type annotation in tools/nightly.py (#64202)
Nikita Shulga [Mon, 30 Aug 2021 20:26:00 +0000 (13:26 -0700)]
Fix type annotation in tools/nightly.py (#64202)

Summary:
`tempfile.TemporaryDirectory` is a generic only in Python 3.9 and above.

Workaround: wrap the type annotation in quotes so it is never evaluated at runtime.

Fixes https://github.com/pytorch/pytorch/issues/64017
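A minimal stdlib sketch of the workaround (hypothetical function name): on Python < 3.9 the class itself is not subscriptable, so an unquoted `tempfile.TemporaryDirectory[str]` annotation raises a `TypeError` when the function is defined, while the quoted form is a string that type checkers understand but the interpreter never evaluates.

```python
import tempfile

# Quoting the annotation defers its evaluation, so this definition works on
# every Python 3 version even though the subscript is only valid on 3.9+.
def make_tmp_dir() -> "tempfile.TemporaryDirectory[str]":
    return tempfile.TemporaryDirectory()

d = make_tmp_dir()
print(type(d).__name__)  # TemporaryDirectory
d.cleanup()
```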

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64202

Reviewed By: janeyx99

Differential Revision: D30644215

Pulled By: malfet

fbshipit-source-id: 3c16240b9fa899bd4d572c1732a7d87d3dd0fbd5

2 years agoImplements the orthogonal parametrization (#62089)
lezcano [Mon, 30 Aug 2021 20:10:23 +0000 (13:10 -0700)]
Implements the orthogonal parametrization (#62089)

Summary:
Implements an orthogonal / unitary parametrisation.

It passes the tests and I have trained a couple of models with this implementation, so I believe it should be correct. Now, the implementation is very subtle. I'm tagging nikitaved and IvanYashchuk as reviewers in case they have comments or see some room for optimisation of the code, in particular of the `forward` function.

Fixes https://github.com/pytorch/pytorch/issues/42243
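The idea of an orthogonal parametrisation can be sketched with a toy 2x2 Gram-Schmidt map in plain Python (illustrative only; the PR uses a different, more careful construction): an unconstrained matrix is mapped to an orthogonal one, so optimisation on the raw matrix always yields orthogonal weights.

```python
import math

def gram_schmidt_2x2(a):
    """Map an arbitrary 2x2 matrix (rows of columns) to one with
    orthonormal columns via classical Gram-Schmidt."""
    (a11, a12), (a21, a22) = a
    n1 = math.hypot(a11, a21)
    q1 = (a11 / n1, a21 / n1)                 # normalize column 1
    dot = q1[0] * a12 + q1[1] * a22           # project column 2 off q1
    v2 = (a12 - dot * q1[0], a22 - dot * q1[1])
    n2 = math.hypot(*v2)
    q2 = (v2[0] / n2, v2[1] / n2)
    return [[q1[0], q2[0]], [q1[1], q2[1]]]

Q = gram_schmidt_2x2([[3.0, 1.0], [4.0, 2.0]])
# Columns are orthonormal: Q^T Q ≈ I
col_dot = Q[0][0] * Q[0][1] + Q[1][0] * Q[1][1]
print(round(abs(col_dot), 10))  # 0.0
```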

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62089

Reviewed By: ezyang

Differential Revision: D30639063

Pulled By: albanD

fbshipit-source-id: 988664f333ac7a75ce71ba44c8d77b986dff2fe6

2 years agoFix bytes_written and bytes_read (#64040)
Tanvir Zaman [Mon, 30 Aug 2021 19:56:15 +0000 (12:56 -0700)]
Fix bytes_written and bytes_read (#64040)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64040

In operator cost inference functions, in many places we are using sizeof(x.data_type()). Since data_type() returns a 32-bit integer from [this enum](https://www.internalfb.com/code/fbsource/[15e7ffe4073cf08c61077c7c24a4839504b964a2]/fbcode/caffe2/caffe2/proto/caffe2.proto?lines=20), we are always getting 4 for sizeof(x.data_type()) regardless of the actual data type of x. Big thanks to Jack Langman for specifically pointing out this bug.

We now use the size in bytes based on the actual data type instead.
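The bug can be mimicked in plain Python (a hypothetical enum standing in for caffe2's `TensorProto.DataType`; not the C++ code): `sizeof` on the enum value is the size of an int32, never the size of the element type the enum names.

```python
from enum import IntEnum
import struct

class DataType(IntEnum):
    """Hypothetical stand-in for the proto dtype enum (values illustrative)."""
    FLOAT = 1   # 4-byte elements
    DOUBLE = 2  # 8-byte elements
    INT8 = 3    # 1-byte elements

# The buggy "size": the enum value is a 32-bit integer, so the C++
# sizeof(x.data_type()) always yielded 4.
ENUM_WIRE_SIZE = struct.calcsize("<i")  # 4

# The fix: look up the element size of the dtype the enum names.
DTYPE_BYTES = {DataType.FLOAT: 4, DataType.DOUBLE: 8, DataType.INT8: 1}

dt = DataType.DOUBLE
print(ENUM_WIRE_SIZE)   # 4 — wrong for DOUBLE
print(DTYPE_BYTES[dt])  # 8 — correct
```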

Test Plan:
Added unit tests BatchMatMulMemCostTest:

buck test //caffe2/caffe2/fb/fbgemm:batch_matmul_op_test -- BatchMatMulMemCostTest

Extended existing unit test test_columnwise_concat for different data types:

buck test //caffe2/caffe2/python/operator_test:concat_op_cost_test -- test_columnwise_concat

Differential Revision: D30561459

fbshipit-source-id: 976fa5167097a35af548498480001aafd7851d93

2 years agoremove componentwise comparison of complex values in torch.testing.assert_close ...
Philip Meier [Mon, 30 Aug 2021 19:28:39 +0000 (12:28 -0700)]
remove componentwise comparison of complex values in torch.testing.assert_close (#63841)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63841

Closes #61906.

cc ezyang gchanan

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D30633526

Pulled By: mruberry

fbshipit-source-id: ddb5d61838cd1e12d19d0093799e827344382cdc

2 years agoremove componentwise comparison of complex values in TestCase.assertEqual (#63572)
Philip Meier [Mon, 30 Aug 2021 19:28:39 +0000 (12:28 -0700)]
remove componentwise comparison of complex values in TestCase.assertEqual (#63572)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63572

Addresses #61906. Issue will be fixed later in the stack when `torch.testing.assert_close` got the same treatment.

cc ezyang gchanan

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D30633527

Pulled By: mruberry

fbshipit-source-id: c2002a4998a7a75cb2ab83f87190bde43a9d4f7c

2 years agoBring back old algorithm for sorting on small number of segments (#64127)
Xiang Gao [Mon, 30 Aug 2021 19:25:29 +0000 (12:25 -0700)]
Bring back old algorithm for sorting on small number of segments (#64127)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/63456
The code was copy-pasted from the previous commit without modification.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64127

Reviewed By: mruberry

Differential Revision: D30632090

Pulled By: ngimel

fbshipit-source-id: 58bbdd9b0423f01d4e65e2ec925ad9a3f88efc9b

2 years ago[Doc] `make_tensor` to `torch.testing` module (#63925)
Kushashwa Ravi Shrimali [Mon, 30 Aug 2021 19:16:23 +0000 (12:16 -0700)]
[Doc] `make_tensor` to `torch.testing` module (#63925)

Summary:
This PR aims to add `make_tensor` to the `torch.testing` module in PyTorch docs.

TODOs:

* [x] Add examples

cc: pmeier mruberry brianjo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63925

Reviewed By: ngimel

Differential Revision: D30633487

Pulled By: mruberry

fbshipit-source-id: 8e5a1f880c6ece5925b4039fee8122bd739538af

2 years agoFix bad use of channels last kernel in sync batch norm backward (#64100)
Peter Bell [Mon, 30 Aug 2021 19:14:09 +0000 (12:14 -0700)]
Fix bad use of channels last kernel in sync batch norm backward (#64100)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/64039

There are two distinct problems here.
1. If `grad_output` is channels last but `input` is not, then `input` would be read as if it were channels last, so the wrong values are read.
2. `use_channels_last_kernels` doesn't guarantee that `suggest_memory_format` will actually return channels last, so use `empty_like` instead so the strides always match.
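Problem 1 can be illustrated with plain stride arithmetic (a hypothetical helper, not PyTorch code): the same NCHW logical shape has different strides under contiguous vs channels-last storage, so indexing the data with the wrong strides returns the wrong elements.

```python
def strides_for(shape, channels_last=False):
    """Element strides for an NCHW logical shape under the two layouts."""
    n, c, h, w = shape
    if channels_last:  # NHWC storage order, NCHW logical shape
        return (h * w * c, 1, w * c, c)
    return (c * h * w, h * w, w, 1)

shape = (1, 2, 2, 2)
print(strides_for(shape))                      # (8, 4, 2, 1)
print(strides_for(shape, channels_last=True))  # (8, 1, 4, 2)
# Reading channels-last storage with contiguous strides (or vice versa)
# visits the wrong offsets for every non-trivial index.
```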

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64100

Reviewed By: mruberry

Differential Revision: D30622127

Pulled By: ngimel

fbshipit-source-id: e28cc57215596817f1432fcdd6c49d69acfedcf2

2 years ago[jit] Make operation call accept Stack& instead Stack* (#63414)
Zhengxu Chen [Mon, 30 Aug 2021 18:46:14 +0000 (11:46 -0700)]
[jit] Make operation call accept Stack& instead Stack* (#63414)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63414

Misuse of a raw pointer here: the stack is never null, so pass it by reference.
ghstack-source-id: 136938318

Test Plan:
compiles.

Imported from OSS

Reviewed By: ejguan

Differential Revision: D30375410

fbshipit-source-id: 9d65b620bb76d90d886c800f54308520095d58ee

2 years agoImprove performance of index_select by avoiding item (#63008)
= [Mon, 30 Aug 2021 16:43:25 +0000 (09:43 -0700)]
Improve performance of index_select by avoiding item (#63008)

Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/61788

From a CUDA perspective: item already pulls all Tensor content onto the host (albeit one-by-one), which incurs very expensive memory transfers. This way we'll do it all at once.
From a CPU perspective: item has a lot of overhead as a native function in comparison to simply using a pointer.

Overall there are still lots of performance gains to be had, but this is a small change that should take us into a more usable landscape. This doesn't land a separate benchmark; I postulate that's not necessary to decide on the benefit of this change (we'll also see if it shows up indirectly), though adding one is still a good follow-up item.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63008

Reviewed By: zou3519

Differential Revision: D30211160

Pulled By: cpuhrsch

fbshipit-source-id: 70b752be5df51afc66b5aa1c77135d1205520cdd

2 years ago[Static Runtime] aten::cat out version when it is not being replaced by prim::VarConc...
Harut Movsisyan [Mon, 30 Aug 2021 16:36:46 +0000 (09:36 -0700)]
[Static Runtime] aten::cat out version when it is not being replaced by prim::VarConcat (#64157)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64157

The UseVariadicCat optimization is not applied to aten::cat if the list input to the op cannot be moved to the position before the op (https://fburl.com/diffusion/l6kweimu). For these cases we will need an out version for SR.

Test Plan:
Confirm out variant is called:
```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
```

Reviewed By: d1jang

Differential Revision: D30598574

fbshipit-source-id: 74cfa8291dc8b5df4aef58adfb1ab2a16f10d90a

2 years ago[PyTorch] Fix missing move in unpickler (#63974)
Scott Wolchok [Mon, 30 Aug 2021 16:34:24 +0000 (09:34 -0700)]
[PyTorch] Fix missing move in unpickler (#63974)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63974

Saw some time spent in this for model loading, no reason not to move here.
ghstack-source-id: 136760979

Test Plan: Re-profile model loading on devserver; IValue copy ctor time has gone down

Reviewed By: dhruvbird

Differential Revision: D30548923

fbshipit-source-id: 42000f2e18582762b43353cca10ae094833de3b3

2 years ago[PyTorch] Reduce copies/refcount bumps in BytecodeDeserializer::parseMethods (#63961)
Scott Wolchok [Mon, 30 Aug 2021 16:34:24 +0000 (09:34 -0700)]
[PyTorch] Reduce copies/refcount bumps in BytecodeDeserializer::parseMethods (#63961)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63961

Saw a report that this function was slow and was doing unexplained vector copies. First pass to remove a bunch of copying.
ghstack-source-id: 136760976

Test Plan:
Pixel 3
before: https://our.intern.facebook.com/intern/aibench/details/461850118893980
after: https://www.internalfb.com/intern/aibench/details/48965886029524

MilanBoard failed to return data from simpleperf

Reviewed By: dhruvbird

Differential Revision: D30544551

fbshipit-source-id: 0e2b5471a10c0803d52c923e6fb5625f5542b99d

2 years ago[MicroBench] Added a micro benchmark for a signed log1p kernel. (#64032)
Raghavan Raman [Mon, 30 Aug 2021 16:26:20 +0000 (09:26 -0700)]
[MicroBench] Added a micro benchmark for a signed log1p kernel. (#64032)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64032

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D30579198

Pulled By: navahgar

fbshipit-source-id: a53d68225fba768b26491d14b535f8f2dcf50c0e

2 years agoAutomated submodule update: FBGEMM (#64149)
Facebook Community Bot [Mon, 30 Aug 2021 15:27:36 +0000 (08:27 -0700)]
Automated submodule update: FBGEMM (#64149)

Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: https://github.com/pytorch/FBGEMM/commit/f6dfed87a10ed5729bce83e98788e437a94cbda0

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64149

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jspark1105

Differential Revision: D30632209

fbshipit-source-id: aa1cebaf50169c3a93dbcb994fa47e29d6b6a0d7

2 years ago[DataLoader2] Adding Messages, Protocols, Loop wrappers (#63882)
Vitaly Fedyunin [Mon, 30 Aug 2021 14:54:11 +0000 (07:54 -0700)]
[DataLoader2] Adding Messages, Protocols, Loop wrappers (#63882)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63882

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30627452

Pulled By: VitalyFedyunin

fbshipit-source-id: 561ea2df07f3572e04401171946154024126387b

2 years agoremove one more distributed test (#64108)
Rong Rong (AI Infra) [Mon, 30 Aug 2021 14:49:27 +0000 (07:49 -0700)]
remove one more distributed test (#64108)

Summary:
Follow-up to https://github.com/pytorch/pytorch/issues/62896: one more place where we should remove a distributed test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64108

Reviewed By: janeyx99, soulitzer

Differential Revision: D30614062

Pulled By: walterddr

fbshipit-source-id: 6576415dc2d481d65419da19c5aa0afc37a86cff

2 years ago[nnc] Updated internal asserts to include more detailed error messages (#64118)
Raghavan Raman [Mon, 30 Aug 2021 11:38:00 +0000 (04:38 -0700)]
[nnc] Updated internal asserts to include more detailed error messages (#64118)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64118

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30616944

Pulled By: navahgar

fbshipit-source-id: 35289696cc0e7faa01599304243b86f0febc6daf

2 years ago[nnc] Fixed warning due to implicit parameter conversion (#64117)
Raghavan Raman [Mon, 30 Aug 2021 11:38:00 +0000 (04:38 -0700)]
[nnc] Fixed warning due to implicit parameter conversion (#64117)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64117

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30616945

Pulled By: navahgar

fbshipit-source-id: eaf69232ac4a684ab5f97a54a514971655f86ef3

2 years agoENH Adds label_smoothing to cross entropy loss (#63122)
Thomas J. Fan [Mon, 30 Aug 2021 06:31:42 +0000 (23:31 -0700)]
ENH Adds label_smoothing to cross entropy loss (#63122)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/7455

Partially resolves pytorch/vision#4281
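The smoothed target distribution can be sketched in plain Python (assumed semantics: `(1 - eps)` extra mass on the true class plus `eps / K` uniform mass over all `K` classes; illustrative, not the loss-kernel code):

```python
def smooth_targets(true_class, num_classes, eps):
    """Label-smoothed target distribution for a single example."""
    base = eps / num_classes           # uniform mass spread over all classes
    t = [base] * num_classes
    t[true_class] += 1.0 - eps         # remaining mass on the true class
    return t

t = smooth_targets(true_class=2, num_classes=4, eps=0.1)
print([round(v, 4) for v in t])  # [0.025, 0.025, 0.925, 0.025]
print(round(sum(t), 10))         # 1.0 — still a valid distribution
```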

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63122

Reviewed By: iramazanli

Differential Revision: D30586076

Pulled By: jbschlosser

fbshipit-source-id: 06afc3aa1f8b9edb07fe9ed68c58968ad1926924

2 years ago[Static Runtime] Out version for torch.linalg.norm (#64070)
Harut Movsisyan [Mon, 30 Aug 2021 03:58:45 +0000 (20:58 -0700)]
[Static Runtime] Out version for torch.linalg.norm (#64070)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64070

Test Plan:
Confirm out variant is called for both versions:

```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
```

Reviewed By: d1jang

Differential Revision: D30595816

fbshipit-source-id: e88d88d4fc698774e83a98efce66b8fa4e281563

2 years ago[quant] AO migration of the `quantize.py` (#64086)
Zafar Takhirov [Mon, 30 Aug 2021 03:28:32 +0000 (20:28 -0700)]
[quant] AO migration of the `quantize.py` (#64086)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.

This migrates the `quantize.py` from torch.quantization to `torch.ao.quantization`.

At this point both locations will be supported. Eventually the torch.quantization will be deprecated.

Test Plan: `buck test mode/opt //caffe2/test:quantization`

Reviewed By: jerryzh168, raghuramank100

Differential Revision: D30055886

fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f

2 years agoRemoves beta warning from the special module documentation (#64148)
Mike Ruberry [Mon, 30 Aug 2021 02:37:06 +0000 (19:37 -0700)]
Removes beta warning from the special module documentation (#64148)

Summary:
Updates documentation per feature review. torch.special is now stable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64148

Reviewed By: ngimel

Differential Revision: D30632049

Pulled By: mruberry

fbshipit-source-id: 8f6148ec7737e7b3a90644eeca23eb217eda513d

2 years agoadd channel last support for MaxUnpool2d (#49984)
mingfeima [Mon, 30 Aug 2021 01:35:37 +0000 (18:35 -0700)]
add channel last support for MaxUnpool2d (#49984)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49984

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D26007051

Pulled By: VitalyFedyunin

fbshipit-source-id: 6c54751ade4092e03c1651aaa60380f7d6e92f6b

2 years agoRevert D30620966: [pytorch][PR] Move Parallel[Native|TBB] to GHA
Nikita Shulga [Sun, 29 Aug 2021 22:49:59 +0000 (15:49 -0700)]
Revert D30620966: [pytorch][PR] Move Parallel[Native|TBB] to GHA

Test Plan: revert-hammer

Differential Revision:
D30620966 (https://github.com/pytorch/pytorch/commit/223f886032978487099da4f54e86e9e0549cde0c)

Original commit changeset: 9a23e4b3e168

fbshipit-source-id: b9248d377b9a7b850dfb3f10f3350fbc9855acfe

2 years ago[DOC] Add doc for maybe_wrap_dim (#63161)
Tugsbayasgalan (Tugsuu) Manlaibaatar [Sun, 29 Aug 2021 21:17:54 +0000 (14:17 -0700)]
[DOC] Add doc for maybe_wrap_dim (#63161)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63161

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D30629451

Pulled By: tugsbayasgalan

fbshipit-source-id: b03f030f197e10393a8ff223b240d23c30858028

2 years agoadd support for sending cpu sparse tensors over rpc (#62794)
Garrett Cramer [Sun, 29 Aug 2021 18:33:48 +0000 (11:33 -0700)]
add support for sending cpu sparse tensors over rpc (#62794)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62794

This PR updates JIT serialization to support pickling Sparse COO tensors.
It also updates message.cpp to support Sparse COO tensors.
A bug for this was filed a few years ago: https://github.com/pytorch/pytorch/issues/30807.

I tested the fix by adding sparse tensor tests to rpc_test.py and dist_autograd_test.py.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23 gmagogsfm

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D30608848

Pulled By: gcramer23

fbshipit-source-id: 629ba8e4a3d8365875a709c9b87447c7a71204fb

2 years ago[DOC] improve docstring for Optimizer.state_dict (#63153)
Tugsbayasgalan (Tugsuu) Manlaibaatar [Sun, 29 Aug 2021 17:19:56 +0000 (10:19 -0700)]
[DOC] improve docstring for Optimizer.state_dict (#63153)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63153

Fixes: https://github.com/pytorch/pytorch/issues/60121

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D30629462

Pulled By: tugsbayasgalan

fbshipit-source-id: a9160e02ac53bb1a6219879747d73aae9ebe4d2f

2 years agoAutomated submodule update: FBGEMM (#64141)
Facebook Community Bot [Sun, 29 Aug 2021 16:56:34 +0000 (09:56 -0700)]
Automated submodule update: FBGEMM (#64141)

Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: https://github.com/pytorch/FBGEMM/commit/9939bac9defab4d18fb7fdded7e1a76c0c2b49b4

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64141

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jspark1105

Differential Revision: D30629417

fbshipit-source-id: 1b1ad3d4caff925f798b86b358ab193554c9b8e0

2 years ago[nnc] Make 64-bit dimensions work (#64077)
Bert Maher [Sun, 29 Aug 2021 02:57:10 +0000 (19:57 -0700)]
[nnc] Make 64-bit dimensions work (#64077)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64077

We were assuming kernel dimensions fit in 32 bits (the old fuser made
this assumption too), but we should be able to support 64.
ghstack-source-id: 136933272

Test Plan: unit tests; new IR level test with huge sizes

Reviewed By: ZolotukhinM

Differential Revision: D30596689

fbshipit-source-id: 23b7e393a2ebaecb0c391a6b1f0c4b05a98bcc94

2 years agoParse int64 sizes/strides (#64076)
Bert Maher [Sun, 29 Aug 2021 02:57:10 +0000 (19:57 -0700)]
Parse int64 sizes/strides (#64076)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64076

We were parsing sizes into int32s, so if you had a tensor with more
than 2^32 elements, you couldn't represent it.
ghstack-source-id: 136933273

Test Plan: parseIR with size of 4e9

Reviewed By: ZolotukhinM

Differential Revision: D30521116

fbshipit-source-id: 1e28e462cba52d648e0e2acb4e234d86aae25a3e

2 years ago[nnc] Fix batchnorm implementation (#64112)
Bert Maher [Sun, 29 Aug 2021 02:18:10 +0000 (19:18 -0700)]
[nnc] Fix batchnorm implementation (#64112)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64112

Fixes #64062

Test Plan: Imported from OSS

Reviewed By: zhxchen17

Differential Revision: D30622897

Pulled By: bertmaher

fbshipit-source-id: 7d7c6131aa786e61fa1d0a517288396a0bdb1d22

2 years agoTo add RMSProp algorithm documentation (#63721)
Ilqar Ramazanli [Sat, 28 Aug 2021 22:54:53 +0000 (15:54 -0700)]
To add RMSProp algorithm documentation (#63721)

Summary:
It has been discussed before that adding description of Optimization algorithms to PyTorch Core documentation may result in a nice Optimization research tutorial. In the following tracking issue we mentioned about all the necessary algorithms and links to the originally published paper  https://github.com/pytorch/pytorch/issues/63236.

In this PR we are adding description of RMSProp to the documentation.  For more details, we refer to the paper   https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

<img width="464" alt="RMSProp" src="https://user-images.githubusercontent.com/73658284/131179226-3fb6fe5a-5301-4948-afbe-f38bf57f24ff.png">
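The update rule being documented can be sketched as a one-parameter step in plain Python (a sketch of the standard non-centered RMSProp recursion, not the torch.optim implementation): keep a running average of squared gradients and divide each step by its square root.

```python
import math

def rmsprop_step(p, g, sq_avg, lr=0.01, alpha=0.99, eps=1e-8):
    """One RMSProp update for a scalar parameter p with gradient g."""
    sq_avg = alpha * sq_avg + (1.0 - alpha) * g * g   # running avg of g^2
    p = p - lr * g / (math.sqrt(sq_avg) + eps)        # scale-invariant step
    return p, sq_avg

p, sq = 1.0, 0.0
for _ in range(3):
    g = 2.0 * p               # gradient of f(p) = p^2
    p, sq = rmsprop_step(p, g, sq)
print(p < 1.0)  # True — the parameter moves toward the minimum at 0
```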

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63721

Reviewed By: albanD

Differential Revision: D30612426

Pulled By: iramazanli

fbshipit-source-id: c3ac630a9658d1282866b53c86023ac10cf95398

2 years agoAutomated submodule update: FBGEMM (#64129)
Facebook Community Bot [Sat, 28 Aug 2021 18:50:49 +0000 (11:50 -0700)]
Automated submodule update: FBGEMM (#64129)

Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: https://github.com/pytorch/FBGEMM/commit/f14e79481460a7c0dedf452a258072231cb343e6

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64129

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jspark1105

Differential Revision: D30621549

fbshipit-source-id: 34c109e75c96a261bf370f7a06dbb8b9004860ab

2 years agoMove Parallel[Native|TBB] to GHA (#64123)
Nikita Shulga [Sat, 28 Aug 2021 18:46:40 +0000 (11:46 -0700)]
Move Parallel[Native|TBB] to GHA (#64123)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64123

Reviewed By: driazati

Differential Revision: D30620966

Pulled By: malfet

fbshipit-source-id: 9a23e4b3e16870f77bf18df4370cd468603d592d

2 years agoEnhancement for smart serialization for out schemas (#63096)
Tugsbayasgalan (Tugsuu) Manlaibaatar [Sat, 28 Aug 2021 18:44:58 +0000 (11:44 -0700)]
Enhancement for smart serialization for out schemas (#63096)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63096

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D30415255

Pulled By: tugsbayasgalan

fbshipit-source-id: eb40440a3b46258394d035479f5fc4a4baa12bcc

2 years ago[Light] Fix error message (#64010)
Priya Ramani [Sat, 28 Aug 2021 05:50:20 +0000 (22:50 -0700)]
[Light] Fix error message (#64010)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64010

Fixing typos in an error message

Test Plan:
Error message before fix:
Lite Interpreter verson number does not match. The model version must be between 3 and 5But the model version is 6

Error message after fix:
Lite Interpreter version number does not match. The model version must be between 3 and 5 but the model version is 6

Reviewed By: larryliu0820

Differential Revision: D30568367

fbshipit-source-id: 205f3278ee8dcf38579dbb828580a9e986ccacc1

2 years ago[quant][graphmode][fx] Add reference quantized linear module (#63627)
Jerry Zhang [Sat, 28 Aug 2021 03:58:20 +0000 (20:58 -0700)]
[quant][graphmode][fx] Add reference quantized linear module (#63627)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63627

Added reference quantized linear module for the custom backend flow, the reference quantized module will
have the following code:
```
        w(float) -- quant - dequant \
        x(float) ------------- F.linear ---
```
In the full model, we will see
```
        w(float) -- quant - *dequant \
        x -- quant --- *dequant --  *F.linear --- *quant - dequant
```
and the backend should be able to fuse the ops with `*` into a quantized linear

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_conv_linear_reference

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D30504750

fbshipit-source-id: 5729921745c2b6a0fb344efc3689f3b170e89500

2 years ago[iOS][GPU] Consolidate array and non-array kernel for hardswish (#63369)
Yuchen Huang [Sat, 28 Aug 2021 01:57:22 +0000 (18:57 -0700)]
[iOS][GPU] Consolidate array and non-array kernel for hardswish (#63369)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63369

ghstack-source-id: 136918152

(Note: this ignores all push blocking failures!)

Test Plan:
- `buck test pp-macos`
- Op tests in PyTorchPlayground app
- Run mobilenetv3 test

https://pxl.cl/1Ncls

Reviewed By: xta0

Differential Revision: D30354454

fbshipit-source-id: 88bf4f8b5871e63170161b3f3e44f99b8a3086c6

2 years agoTo add Nesterov Adam algorithm description to documentation (#63793)
Ilqar Ramazanli [Sat, 28 Aug 2021 01:51:09 +0000 (18:51 -0700)]
To add Nesterov Adam algorithm description to documentation (#63793)

Summary:
It has been discussed before that adding description of Optimization algorithms to PyTorch Core documentation may result in a nice Optimization research tutorial. In the following tracking issue we mentioned about all the necessary algorithms and links to the originally published paper  https://github.com/pytorch/pytorch/issues/63236.

In this PR we are adding description of Nesterov Adam Algorithm to the documentation.  For more details, we refer to the paper  https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ

<img width="439" alt="NAdam" src="https://user-images.githubusercontent.com/73658284/131185124-e81b2edf-33d9-4a9d-a7bf-f7e5eea47d7c.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63793

Reviewed By: NivekT

Differential Revision: D30617057

Pulled By: iramazanli

fbshipit-source-id: cd2054b0d9b6883878be74576e86e307f32f1435

2 years ago[Static Runtime] Optimize memory planner initialization (#64101)
Mike Iovine [Sat, 28 Aug 2021 00:37:05 +0000 (17:37 -0700)]
[Static Runtime] Optimize memory planner initialization (#64101)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64101

Checking `getOutOfPlaceOperation(n)` is a very expensive operation, especially in multithreaded environments, due to a lock acquisition when the NNC cache is queried. This slows down memory planner initialization, and by extension, the latency of the first static runtime inference.

There are two optimizations in this diff:
* Cache the result of `p_node->has_out_variant()` to avoid the call to `getOutOfPlaceOperation`. This speeds up calls to `canReuseInputOutputs`, which in turn speeds up `isOptimizableContainerType`
* Precompute all `isOptimizableContainerType` during static runtime initialization to avoid a pass over all of each node's inputs.
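The first optimization is essentially memoization; a plain-Python sketch with hypothetical names (not the Static Runtime code) shows the pattern — the expensive, lock-guarded lookup runs once per distinct op and every later query hits the cache:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def get_out_of_place_operation(op_name):
    """Stand-in for the expensive, lock-guarded NNC cache query."""
    global calls
    calls += 1
    return op_name.endswith("_out")

# Repeated queries during planner initialization now reuse the cached result.
for _ in range(100):
    get_out_of_place_operation("aten::softmax_out")
print(calls)  # 1
```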

Test Plan: All unit tests pass: `buck test caffe2/benchmarks/static_runtime/...`

Reviewed By: movefast1990

Differential Revision: D30595579

fbshipit-source-id: 70aaa7af9589c739c672788bf662f711731864f2

2 years ago[TensorExpr] Update tutorial. (#64109)
Mikhail Zolotukhin [Fri, 27 Aug 2021 23:15:55 +0000 (16:15 -0700)]
[TensorExpr] Update tutorial. (#64109)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64109

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30614050

Pulled By: ZolotukhinM

fbshipit-source-id: e8f9bd9ef2483e6eafbc0bd5394d311cd694c7b2

2 years ago.github: Add cpp_docs job to current gcc5 workflow (#64044)
Eli Uriegas [Fri, 27 Aug 2021 23:02:49 +0000 (16:02 -0700)]
.github: Add cpp_docs job to current gcc5 workflow (#64044)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64044

Adds the cpp_docs job to the current workflow, also modifies the scripts
surrounding building docs so that they can be powered through
environment variables with sane defaults rather than having to have
passed arguments.

Ideally should not break current jobs running in circleci but those
should eventually be turned off anyways.

Coincides with work from:
* https://github.com/seemethere/upload-artifact-s3/pull/1
* https://github.com/seemethere/upload-artifact-s3/pull/2

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30610010

Pulled By: seemethere

fbshipit-source-id: f67adeb1bd422bb9e24e0f1ec0098cf9c648f283

2 years agoUpdate codegen to use boxed kernel (#63459)
soulitzer [Fri, 27 Aug 2021 21:59:08 +0000 (14:59 -0700)]
Update codegen to use boxed kernel (#63459)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63459

 - Replaces the usual registration when `requires_derivative` is True (i.e., we still need a grad_fn) but `fn.info` is `None` (TODO: maybe also require that the number of differentiable inputs is > 0, to match `requires_derivative`).
 - Adds some (temporary?) fixes to some sparse functions. See: https://github.com/pytorch/pytorch/issues/63549
 - Removing the codegen that generates the NotImplemented node (though that should only be one line) is deferred, because some ops listed under `RESET_GRAD_ACCUMULATOR` have an extra function call. We would need to make this list of ops available to C++, which would mean either codegenning a list of strings or moving `RESET_GRAD_ACCUMULATOR` to C++. We could do this in a future PR if necessary.
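The registration decision described in the first bullet can be summarized as a small dispatch rule. This is a hypothetical sketch of the logic only (the function and return-value names are illustrative, not the actual codegen):

```python
def pick_autograd_registration(requires_derivative, has_derivative_info):
    """Choose which autograd kernel the codegen should register for an op.

    requires_derivative: the op still needs a grad_fn attached to outputs.
    has_derivative_info: a derivative formula (fn.info) exists for the op.
    """
    if requires_derivative and not has_derivative_info:
        # No formula: fall back to the boxed autograd-not-implemented kernel.
        return "boxed_not_implemented_fallback"
    if requires_derivative:
        # Formula exists: register the generated derivative kernel as usual.
        return "generated_derivative_kernel"
    # No grad_fn needed at all: plain registration.
    return "plain_registration"
```

The boxed fallback branch is the case this PR routes through the new kernel instead of codegenning a per-op NotImplemented node.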

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D30518571

Pulled By: soulitzer

fbshipit-source-id: 99a35cbced46292d1b4e51594ae4d534c2caf8b6

2 years agoAdd autograd not implemented boxed fallback (#63458)
soulitzer [Fri, 27 Aug 2021 21:59:08 +0000 (14:59 -0700)]
Add autograd not implemented boxed fallback (#63458)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63458

See description and discussion from https://github.com/pytorch/pytorch/pull/62450

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D30518572

Pulled By: soulitzer

fbshipit-source-id: 3b1504d49abb84560ae17077f0dec335749c9882

2 years agoRemoving references to ProcessGroupAgent in comments (#64051)
Jessica Choi [Fri, 27 Aug 2021 21:46:31 +0000 (14:46 -0700)]
Removing references to ProcessGroupAgent in comments (#64051)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64051

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D30587076

Pulled By: jaceyca

fbshipit-source-id: 414cb95faad0b4da0eaf2956c0668af057f93574