platform/upstream/pytorch.git
2 years agoFix zero-dim handling in torch.matmul (#63359)
Richard Zou [Tue, 17 Aug 2021 20:39:52 +0000 (13:39 -0700)]
Fix zero-dim handling in torch.matmul (#63359)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63359

Fixes #63352. The problem was that in e.g. `torch.matmul(A, B)` with A,
B having shapes [3, 2, 0] and [0, 2], the code attempts to call
`A.view(-1, 0)` which fails due to "-1 being ambiguous". The solution is
to manually compute what we want the shape of the view to be.
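The shape arithmetic can be illustrated without tensors. This is a minimal pure-Python sketch, not the code from the PR; `folded_shape` and `infer_minus_one` are hypothetical helper names used only for illustration.

```python
from math import prod

def folded_shape(shape):
    """Collapse all leading dims of `shape` into one, keeping the last dim."""
    *leading, last = shape
    return (prod(leading), last)

def infer_minus_one(numel, known):
    """Mimic inferring a -1 dim; impossible when the known dims multiply to 0."""
    if known == 0:
        raise ValueError("-1 is ambiguous: any size n satisfies n * 0 == 0")
    return numel // known

# Computing the view shape by hand sidesteps the ambiguous inference.
print(folded_shape([3, 2, 0]))  # (6, 0)
```

With the shape computed explicitly, no `-1` has to be inferred for an empty tensor.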

Test Plan: - new tests

Reviewed By: ngimel

Differential Revision: D30351583

Pulled By: zou3519

fbshipit-source-id: 7625691fe8b85d96a4073409596a932c303e3e8c

2 years ago[TensorExpr] Add a wrapper for all expr and stmt pointers. (#63195)
Mikhail Zolotukhin [Tue, 17 Aug 2021 20:39:36 +0000 (13:39 -0700)]
[TensorExpr] Add a wrapper for all expr and stmt pointers. (#63195)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63195

This helps us to later switch from using KernelArena with raw pointers
to shared pointers without having to change all our source files at
once.

The changes are mechanical and should not affect any functionality.

With this PR, we're changing the following:
 * `Add*` --> `AddPtr`
 * `new Add(...)` --> `alloc<Add>(...)`
 * `dynamic_cast<Add*>` --> `to<Add>`
 * `static_cast<Add*>` --> `static_to<Add>`

Due to some complications with args forwarding, some places became more
verbose, e.g.:
 * `new Block({})` --> `new Block(std::vector<ExprPtr>())`

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30292779

Pulled By: ZolotukhinM

fbshipit-source-id: 150301c7d2df56b608b035827b6a9a87f5e2d9e9

2 years agoOpInfo fix: `conv_transpose2d` (#63389)
Kushashwa Ravi Shrimali [Tue, 17 Aug 2021 20:35:32 +0000 (13:35 -0700)]
OpInfo fix: `conv_transpose2d` (#63389)

Summary:
Addresses comment: https://github.com/pytorch/pytorch/pull/62882#issuecomment-899679606.

cc: mruberry ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63389

Reviewed By: mruberry

Differential Revision: D30377481

Pulled By: ngimel

fbshipit-source-id: 0fa21acc3503c259c9b27463e8555247c43d9e2e

2 years ago[Static Runtime] Implement aten::append (#63350)
Mike Iovine [Tue, 17 Aug 2021 20:34:44 +0000 (13:34 -0700)]
[Static Runtime] Implement aten::append (#63350)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63350

Add a native implementation for `aten::append`, the list append op.

Test Plan: New unit test: `buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest -- Append`

Reviewed By: hlu1

Differential Revision: D30326461

fbshipit-source-id: 0dbdf6cc82e78c7c36db39583256f6b87385e3d3

2 years ago[vulkan] Add log_softmax (#63193)
Ivan Kobzarev [Tue, 17 Aug 2021 20:34:20 +0000 (13:34 -0700)]
[vulkan] Add log_softmax (#63193)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63193

Test Plan: Imported from OSS

Reviewed By: SS-JIA

Differential Revision: D30291987

fbshipit-source-id: 89c6560274e5a841e5af249f6963b67ef6826f4c

2 years ago[quant][fx] Ensure qconfig works for QAT with multiple modules (#63343)
Supriya Rao [Tue, 17 Aug 2021 18:39:16 +0000 (11:39 -0700)]
[quant][fx] Ensure qconfig works for QAT with multiple modules (#63343)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63343

The previous implementation had a bug where we were trying to modify an ordered dict value while iterating through it.
This fixes it by creating a copy before modifying it.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_qat_module_type

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D30346116

fbshipit-source-id: 0e33dad1163e8bff3fd363bfd04de8f7114d7a3a

2 years agoAdd return type hint and improve the docstring of consume_prefix_in_state_dict_if_pre...
Yi Wang [Tue, 17 Aug 2021 18:28:43 +0000 (11:28 -0700)]
Add return type hint and improve the docstring of consume_prefix_in_state_dict_if_present method (#63388)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63388

Context: https://discuss.pytorch.org/t/how-to-use-the-helper-function-consume-prefix-in-state-dict-if-present/129505/3

Make it clear that this method strips the prefix in place rather than returning a new value.
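The in-place behavior can be sketched on a plain dict. `strip_prefix_in_place` below is a hypothetical stand-in written for illustration, not the PyTorch helper itself.

```python
def strip_prefix_in_place(state_dict, prefix):
    """Remove `prefix` from matching keys of `state_dict`; returns None."""
    # Snapshot the keys because the dict is mutated inside the loop.
    for key in list(state_dict.keys()):
        if key.startswith(prefix):
            state_dict[key[len(prefix):]] = state_dict.pop(key)

weights = {"module.fc.weight": 1, "module.fc.bias": 2}
result = strip_prefix_in_place(weights, "module.")
print(result, sorted(weights))
```

The caller keeps using `weights`; assigning the return value would only yield `None`.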

Additional reformatting is also applied.
ghstack-source-id: 135973393

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D30360931

fbshipit-source-id: 1a0c7967a4c86f729e3c810686c21dec43d1dd7a

2 years agoAdd handling of ifs to shape propagation (#62914)
Elias Ellison [Tue, 17 Aug 2021 18:21:50 +0000 (11:21 -0700)]
Add handling of ifs to shape propagation (#62914)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62914

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30196945

Pulled By: eellison

fbshipit-source-id: 1c0c7f938c4547330fd1dba8ab7dd0b99a79b6a9

2 years agoSmall shape analysis changes (#62911)
Elias Ellison [Tue, 17 Aug 2021 18:21:50 +0000 (11:21 -0700)]
Small shape analysis changes (#62911)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62911

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D30196946

Pulled By: eellison

fbshipit-source-id: 2562bab323088d9c1440ae0431e533f9bcc513d3

2 years agoAdd a few peepholes (#62910)
Elias Ellison [Tue, 17 Aug 2021 18:21:50 +0000 (11:21 -0700)]
Add a few peepholes (#62910)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62910

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30196947

Pulled By: eellison

fbshipit-source-id: d88c92616d4de4f47ff4fcf5c1994e629ca20395

2 years agoPropagate symbolic dimensions through idioms like x.view(y.size()) (#61975)
Elias Ellison [Tue, 17 Aug 2021 18:21:50 +0000 (11:21 -0700)]
Propagate symbolic dimensions through idioms like x.view(y.size()) (#61975)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61975

Propagate symbolic dimensions through size calls. We did this by associating SymbolicSizes with integer inputs by looking through their constructors for `x.size(1)` or `x.size()` nodes.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30196948

Pulled By: eellison

fbshipit-source-id: 377fc1d2f6d396c52dc0e87fa814b15720f1414e

2 years ago[fx2trt] Refactor linear op to use mm + add
Jerry Zhang [Tue, 17 Aug 2021 17:41:38 +0000 (10:41 -0700)]
[fx2trt] Refactor linear op to use mm + add

Summary:
Previously, linear was translated to fully_connected, which only works when the weight is a constant.
This diff changes that to mm + add so that the weight can be an ITensor, which lets us have the weight - quantize - dequantize
pattern in the produced TensorRT network.
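The decomposition itself is just `linear(x, W, b) = x @ W^T + b`. A minimal sketch on nested lists, assuming the weight has shape [out_features, in_features]; the helper names are illustrative and unrelated to the TensorRT API.

```python
def mm(a, b):
    """Plain matrix multiply on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def linear(x, weight, bias):
    # mm step: x @ weight^T, then add step: broadcast the bias over rows.
    product = mm(x, transpose(weight))
    return [[v + b for v, b in zip(row, bias)] for row in product]

x = [[1.0, 2.0]]
weight = [[3.0, 4.0], [5.0, 6.0]]  # assumed shape: [out_features, in_features]
bias = [0.5, -0.5]
print(linear(x, weight, bias))  # [[11.5, 16.5]]
```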

Test Plan: buck run mode/opt caffe2/torch/fb/fx2trt:test_linear

Reviewed By: 842974287

Differential Revision: D30294751

fbshipit-source-id: 596fbd4c81caef8df41a002a2e14fbf22d9d2a80

2 years agoUpdates set_default_dtype documentation (#63233)
Mike Ruberry [Tue, 17 Aug 2021 17:37:57 +0000 (10:37 -0700)]
Updates set_default_dtype documentation (#63233)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/60560.

The description of set_default_dtype is updated to clarify that it affects the interpretation of Python numbers as either float32 (complex64) or float64 (complex128) and that default (floating) dtypes other than float32 or float64 are unsupported.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63233

Reviewed By: VitalyFedyunin

Differential Revision: D30306396

Pulled By: mruberry

fbshipit-source-id: bbee62f323c773b23b2fa45cb99122bc28197432

2 years agoRemove backend_debug from torch_core srcs and replace with library dependency (#63111)
Amy He [Tue, 17 Aug 2021 17:31:02 +0000 (10:31 -0700)]
Remove backend_debug from torch_core srcs and replace with library dependency (#63111)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63111

### Problem:
Buck contains at least two libraries which have `backend_debug_info.cpp` as a source, `torch_core` and `backend_interface_lib`. `backend_debug_info.cpp` registers BackendDebugInfo as a class. If targets contain both libraries (e.g. sparkAR debug build with NNAPI delegation), then BackendDebugInfo is registered twice, causing a runtime error.
### Solution:
These changes remove `backend_debug_info.cpp` and `backend_interface.cpp` as sources in `torch_core` and add `backend_interface_lib` as a dependency instead.

**build_variables.bzl:**
- Added a list that excludes `backend_debug_info.cpp` and `backend_interface.cpp` (both srcs already included by `backend_interface_lib`)

**buck:**
- torch_core: Removed `backend_debug_info.cpp` from srcs and added `backend_interface_lib` deps
- backend_interface_lib: Replaced `torch_mobile_core` dep with more specific deps
  - to avoid an indirect dep between `torch_core` and `torch_mobile_core`

ghstack-source-id: 135981061

Test Plan:
Build and run SparkAR internally with Android NNAPI Delegation (`buck build --show-output arstudioplayer_arm64_debug`)
and internal tests.

Reviewed By: iseeyuan

Differential Revision: D30259034

fbshipit-source-id: 0c14c827732f07fb9b9bd25a999828b51793cdcc

2 years agoMove Android Nnapi srcs from aten_native_cpu to aten_cpu (#62919)
Amy He [Tue, 17 Aug 2021 17:31:02 +0000 (10:31 -0700)]
Move Android Nnapi srcs from aten_native_cpu to aten_cpu (#62919)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62919

Move Android NNAPI srcs (nnapi_bind.cpp, nnapi_wrapper.cpp, nnapi_model_loader.cpp) from aten_native_cpu to aten_cpu, so that later the NNAPI delegate's execution library can depend on it.

aten_native_cpu is built selectively per app, but the srcs have no selective components and are required for the NNAPI delegate library in D30259033.

See Buck Dependencies: https://docs.google.com/document/d/17RuWkqWKCO6sc5fKzIDkGeNhhvMk7BvJOqeSnGsHZ8o/edit?usp=sharing
ghstack-source-id: 135981062

Test Plan: `buck build --show-output arstudioplayer_arm64_debug` and internal tests

Reviewed By: iseeyuan

Differential Revision: D30164867

fbshipit-source-id: 0beff481ff250e75664ce8393beabbeb9db66770

2 years ago[android][vulkan] Fix model loading for Vulkan backend (#63402)
Ivan Kobzarev [Tue, 17 Aug 2021 17:12:11 +0000 (10:12 -0700)]
[android][vulkan] Fix model loading for Vulkan backend (#63402)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63402

Test Plan: Imported from OSS

Reviewed By: SS-JIA

Differential Revision: D30370692

Pulled By: IvanKobzarev

fbshipit-source-id: 73311b9b767fe9ed3ae390db59d6aa2c4a98f06d

2 years agoAdvertise USE_PRECOMPILED_HEADERS in CONTRIBUTING.md (#62827)
Peter Bell [Tue, 17 Aug 2021 17:11:05 +0000 (10:11 -0700)]
Advertise USE_PRECOMPILED_HEADERS in CONTRIBUTING.md (#62827)

Summary:
This option was added in https://github.com/pytorch/pytorch/issues/61940 and fits with this section's theme of improving build times.

I've also changed it to a `cmake_dependent_option` instead of `FATAL_ERROR`ing for older CMake versions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62827

Reviewed By: astaff

Differential Revision: D30342102

Pulled By: malfet

fbshipit-source-id: 3095b44b7085aee8a884ec95cba9f8998d4442e7

2 years ago[fx] persist `tracer_cls` on `fx.Graph` when deep copying (#63353)
Bradley Davis [Tue, 17 Aug 2021 16:55:25 +0000 (09:55 -0700)]
[fx] persist `tracer_cls` on `fx.Graph` when deep copying (#63353)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63353

Custom deepcopy method copies all nodes but does not copy the tracer_cls attribute

Reviewed By: houseroad

Differential Revision: D30349424

fbshipit-source-id: 3e98bdac8a8a992eb0b4ec67fe80bb2e5cf3884d

2 years ago[PyTorch] Avoid using std::regex for device string parsing in Device.cpp (#63204)
Dhruv Matani [Tue, 17 Aug 2021 16:20:49 +0000 (09:20 -0700)]
[PyTorch] Avoid using std::regex for device string parsing in Device.cpp (#63204)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63204

Currently, `std::regex` is used for parsing device strings. This is undesirable for a few reasons.

1. Increases binary size
2. Slows down model loading
3. Potentially uses more memory at runtime
4. Takes marginally longer to build code that uses std::regex vs. code that does not

This change avoids the use of `std::regex` for parsing the device string since we don't need to.
ghstack-source-id: 136006963
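For illustration, a hand-rolled parser for strings of the form `type` or `type:index` could look like the sketch below. It is written in Python rather than the C++ of `Device.cpp`, and the accepted grammar is an assumption; it does not validate device type names.

```python
def parse_device(text):
    """Split 'type' or 'type:index' without regular expressions."""
    device_type, sep, index_text = text.partition(":")
    if not device_type:
        raise ValueError(f"missing device type in {text!r}")
    if not sep:
        return device_type, None
    # Accept only plain ASCII digits for the index.
    if not (index_text.isascii() and index_text.isdigit()):
        raise ValueError(f"invalid device index in {text!r}")
    return device_type, int(index_text)

print(parse_device("cuda:1"), parse_device("cpu"))
```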

Test Plan:
### AI Bench Runs

**Before this change:**
1. Model Load time: [252ms](https://www.internalfb.com/intern/aibench/details/332471502816548)
2. Model unload time: 3.5ms

**After this change:**
1. Model Load time: [240ms](https://www.internalfb.com/intern/aibench/details/652195589031318), which is an approx 5% reduction for the current model. I suspect percentage wise, it will be larger for smaller models since this is a fixed cost reduction.
2. Model unload time: 3.3ms (probably too small to be meaningfully impactful to an end user).

### BSB Results

```
D30281388-V1 (https://www.internalfb.com/intern/diff/D30281388/?dest_number=135713848)

messenger-pika-optimized-device: Succeeded
Change in Download Size for arm64 + 3x assets variation: -7.1 KiB
Change in Uncompressed Size for arm64 + 3x assets variation: -17.6 KiB

Mbex Comparison: https://our.intern.facebook.com/intern/mbex/bsb:551399955987465@base/bsb:551399955987465@diff/
```

Reviewed By: raziel

Differential Revision: D30281388

fbshipit-source-id: 4d998e9f313e6366d9d89a6a73cd090ddfb059fc

2 years ago[PyTorch] Add Device_test.cpp (#63203)
Dhruv Matani [Tue, 17 Aug 2021 16:20:49 +0000 (09:20 -0700)]
[PyTorch] Add Device_test.cpp (#63203)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63203

Currently, `c10::Device` isn't being tested - i.e. there's no test to ensure that the device string parsing works as expected. This diff adds very basic tests to assert that the stuff we expect to work works, and the stuff that we don't expect to work doesn't work.

ghstack-source-id: 136006962

Test Plan:
New test. Ran as:

```
cd fbsource/fbcode/
buck test //caffe2/c10:c10_test_0 -- -r '.*DeviceTest.*'
```

Reviewed By: dreiss, raziel

Differential Revision: D30286910

fbshipit-source-id: b5699068dcbba89d5d224dbaf74b175f3f785a00

2 years agochange with_callable_args to return a fresh _PartialWrapper (#63374)
Taylor Robie [Tue, 17 Aug 2021 16:09:59 +0000 (09:09 -0700)]
change with_callable_args to return a fresh _PartialWrapper (#63374)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/63326

Currently `get_callable_args` has the side effect of mutating the input _PartialWrapper. When that input is one of the global defaults, there are all sorts of lifetime issues that crop up. (Details in the linked issue.) So far as I can tell, we only need to make a constructor which is module (and by extension device) aware, so making a fresh one should have the same effect without leaking the last call's module.
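The idea of returning a fresh wrapper instead of mutating a shared one can be sketched as follows. `PartialWrapperSketch` is a hypothetical simplification and does not reproduce the real `_PartialWrapper`.

```python
from functools import partial

class PartialWrapperSketch:
    """Hypothetical stand-in for a constructor wrapper; not the PyTorch class."""

    def __init__(self, p, callable_args=None):
        self.p = p
        self.callable_args = dict(callable_args or {})

    def with_callable_args(self, **kwargs):
        # Return a new wrapper; `self` (possibly a global default) is untouched.
        merged = {**self.callable_args, **kwargs}
        return PartialWrapperSketch(self.p, merged)

    def __call__(self, *args, **kwargs):
        # Evaluate the deferred arguments only when constructing.
        resolved = {name: fn() for name, fn in self.callable_args.items()}
        return self.p(*args, **resolved, **kwargs)

default = PartialWrapperSketch(partial(dict, kind="observer"))
per_module = default.with_callable_args(device=lambda: "cpu")
print(default.callable_args, per_module())
```

Because `default` never stores the per-call arguments, it cannot keep the last caller's module alive.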

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63374

Test Plan: the repro in https://github.com/pytorch/pytorch/issues/63326 now reports no leaked Tensors, and all quantization tests pass locally.

Reviewed By: HDCharles

Differential Revision: D30359360

Pulled By: robieta

fbshipit-source-id: aef33261ac49952d8d90da868a57ab063dfc456e

2 years agoFix flaky test for dp saved tensor hooks (#63324)
Victor Quach [Tue, 17 Aug 2021 15:55:25 +0000 (08:55 -0700)]
Fix flaky test for dp saved tensor hooks (#63324)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63324

Fix for https://www.internalfb.com/tasks/?t=98258963
`catch_warnings` seems to trigger only once in certain cases where it
should trigger twice.
This test is only meant to check whether hooks are triggered or not,
so changing it to self.assertGreater is ok.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30340833

Pulled By: Varal7

fbshipit-source-id: 1bfb9437befe9e8ab8f95efe5f513337fa9bdc5c

2 years agoAdd mode to TarArchiveReader (#63332)
Erjia Guan [Tue, 17 Aug 2021 14:26:08 +0000 (07:26 -0700)]
Add mode to TarArchiveReader (#63332)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63332

Add a corresponding PR from [torchdata](https://github.com/facebookexternal/torchdata/pull/101)

Test Plan: Imported from OSS

Reviewed By: astaff

Differential Revision: D30350151

Pulled By: ejguan

fbshipit-source-id: bced4a1ee1ce89d4e91e678327342e1c095dbb9e

2 years agoadd torch.meshgrid() OpInfo (#62720)
Michael Dagitses [Tue, 17 Aug 2021 11:03:02 +0000 (04:03 -0700)]
add torch.meshgrid() OpInfo (#62720)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/62719

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62720

Reviewed By: astaff

Differential Revision: D30344574

Pulled By: dagitses

fbshipit-source-id: ed42d9fe20741df98018efb08e640fca370583fb

2 years agoExtends warning on norm docs (#63310)
Mike Ruberry [Tue, 17 Aug 2021 05:22:15 +0000 (22:22 -0700)]
Extends warning on norm docs (#63310)

Summary:
torch.norm has a couple of documentation issues, like https://github.com/pytorch/pytorch/issues/44552 and https://github.com/pytorch/pytorch/issues/38595, but since it's deprecated this PR simply clarifies that the documentation (and implementation) of torch.norm may be incorrect. This should be additional encouragement for users to migrate to torch.linalg.vector_norm and torch.linalg.matrix_norm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63310

Reviewed By: ngimel

Differential Revision: D30337997

Pulled By: mruberry

fbshipit-source-id: 0fdcc438f36e4ab29e21e0a64709e4f35a2467ba

2 years agoCleanup dead code (#63328)
Peter Bell [Tue, 17 Aug 2021 05:09:25 +0000 (22:09 -0700)]
Cleanup dead code (#63328)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63328

This code supported the old `at::_fft_with_size` operator which no longer exists.

Test Plan: Imported from OSS

Reviewed By: astaff

Differential Revision: D30343557

Pulled By: mruberry

fbshipit-source-id: 7a71585e013acb46c98f14fd40e15bdfbf026bac

2 years agoWorkaround for cuFFT bug (#63327)
Peter Bell [Tue, 17 Aug 2021 05:09:25 +0000 (22:09 -0700)]
Workaround for cuFFT bug (#63327)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63327

Fixes #63152

Test Plan: Imported from OSS

Reviewed By: astaff

Differential Revision: D30343558

Pulled By: mruberry

fbshipit-source-id: 68e17a07650f65f397e26efc417e97e2ab302f82

2 years agoAdd step to report code coverage from GHA (#63373)
Nikita Shulga [Tue, 17 Aug 2021 03:35:12 +0000 (20:35 -0700)]
Add step to report code coverage from GHA (#63373)

Summary:
Similar to the logic provided in https://github.com/pytorch/pytorch/blob/b2069e7d01814d776c417042e28133c6b0e5082f/.circleci/verbatim-sources/job-specs/pytorch-job-specs.yml#L197-L201

Fixes https://github.com/pytorch/pytorch/issues/63366

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63373

Reviewed By: walterddr

Differential Revision: D30357737

Pulled By: malfet

fbshipit-source-id: 20b115eb4d6412bd9895680308a9097742d2ae7b

2 years ago[TensorExpr] Remove test_train from tensorexpr tests. (#63194)
Mikhail Zolotukhin [Tue, 17 Aug 2021 03:34:49 +0000 (20:34 -0700)]
[TensorExpr] Remove test_train from tensorexpr tests. (#63194)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63194

This test implements functionality used nowhere, and the author no
longer works on that. This PR also adds test_approx to CMakeLists where
it's been missing before.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30292777

Pulled By: ZolotukhinM

fbshipit-source-id: ab6d98e729320a16f1b02ea0c69734f5e7fb2554

2 years ago[JIT] Set future's error to current exception as is when `--torch_jit_enable_rethrow_...
Don Jang [Tue, 17 Aug 2021 00:30:26 +0000 (17:30 -0700)]
[JIT] Set future's error to current exception as is when `--torch_jit_enable_rethrow_caught_exception=true` (#63348)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63348

This change addresses singlaiiit's comment on D30241792 (https://github.com/pytorch/pytorch/commit/61b49c8e41a2faf7fd40278ca72616c5d92963cb), which makes the JIT interpreter's behavior consistent whether or not `future` is set.

Test Plan: Enhanced `EnableRethrowCaughtExceptionTest.EnableRethrowCaughtExceptionTestRethrowsCaughtException` to cover the modified code path.

Reviewed By: singlaiiit

Differential Revision: D30347782

fbshipit-source-id: 79ce57283154ca4372e5341217d942398db21ac8

2 years ago[Static Runtime] Fix a bug that assigns multiple outputs to single storage (#63012)
Don Jang [Mon, 16 Aug 2021 23:50:30 +0000 (16:50 -0700)]
[Static Runtime] Fix a bug that assigns multiple outputs to single storage (#63012)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63012

This change fixes a bug where the static runtime's memory optimizer assigns multiple outputs of a node to the same storage. Fixing this bug enables the static runtime to run `inline_cvr` with its memory optimizer enabled.

A problematic line from `inline_cvr` was as follows:
```
  %7767 : Tensor, %getitem_6419.1 : Tensor = fb::gather_ranges(%tensor74.1, %7764)
```
where enabling the memory optimizer assigned `%7767` and `%getitem_6419.1` to the same storage, which corrupted their data during the 2nd iteration.

This change fixes the aforementioned bug by marking all inputs & outputs of a node as `alive` during our liveness analysis. By doing that, no inputs / outputs will collide with each other. I believe most ops' implementations already rely on this assumption, but it was missing from our analysis before this change.

Test Plan: - Added a unittest `StaticRuntime.ValuesShareSameStorageDoesNotContainOutputsFromSameNode` to cover the new code.

Reviewed By: hlu1

Differential Revision: D30202018

fbshipit-source-id: 10287a1bee9e86be16a5201e9a7cd7c7f046bab9

2 years ago[Model Averaging] Add a few member methods of PostLocalSGDOptimizer (#63340)
Yi Wang [Mon, 16 Aug 2021 23:33:21 +0000 (16:33 -0700)]
[Model Averaging] Add a few member methods of PostLocalSGDOptimizer (#63340)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63340

Some methods are needed such as accessing optimizer states. These are necessary for integration with PyTorch Lightning.

Proposal: https://github.com/pytorch/pytorch/issues/59699
ghstack-source-id: 135912246

Test Plan: buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_ddp_hook_parity_post_localSGD

Reviewed By: rohan-varma

Differential Revision: D30328794

fbshipit-source-id: e585b874313bd266fdc7c79936e2af98700c7bad

2 years ago[PyPer] Skip printing out per node time when do_profile is on (#63256)
Hao Lu [Mon, 16 Aug 2021 23:30:53 +0000 (16:30 -0700)]
[PyPer] Skip printing out per node time when do_profile is on (#63256)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63256

This suppresses printing out the per node time which is very long when the net has too many ops. It can be easily turned on by setting `--pt_sr_print_per_node_time=1`.

Reviewed By: ajyu, mikeiovine

Differential Revision: D30298331

fbshipit-source-id: 32b3f93b3fe19d335654168311fda93331a1e706

2 years agoRefactor NnapiCompilation registration into it's own file (#63183)
Amy He [Mon, 16 Aug 2021 22:42:14 +0000 (15:42 -0700)]
Refactor NnapiCompilation registration into it's own file (#63183)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63183

Move registration of NnapiCompilation into its own file, so that `nnapi_bind.cpp` (which contains the implementation of NnapiCompilation) can be moved to `aten_cpu`, while maintaining the selectiveness for registration.

`nnapi_bind.cpp` is moved to `aten_cpu` in https://github.com/pytorch/pytorch/pull/62919. See the PR for more details on why it's needed.

ghstack-source-id: 135900318

Test Plan: Nnapi unit tests: `python test/test_nnapi.py`

Reviewed By: iseeyuan

Differential Revision: D30288708

fbshipit-source-id: 6ed5967fa6bd018075469d18e68f844d413cf265

2 years agoAdd section to CONTRIBUTING.md explaining developer docs (#63228)
Richard Zou [Mon, 16 Aug 2021 22:35:05 +0000 (15:35 -0700)]
Add section to CONTRIBUTING.md explaining developer docs (#63228)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63228

It is a quick summary and links to a page on the Developer Wiki that has
more detail.

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D30347109

Pulled By: zou3519

fbshipit-source-id: a6242986d275e5279ca3f61ade2294a132d268c4

2 years agotest: Add ability to set CONTINUE_THROUGH_ERROR (#63357)
Eli Uriegas [Mon, 16 Aug 2021 22:30:24 +0000 (15:30 -0700)]
test: Add ability to set CONTINUE_THROUGH_ERROR (#63357)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63357

Adds the ability to set CONTINUE_THROUGH_ERROR as an environment
variable so that we can easily set it without having to add the flag
directly
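A minimal sketch of an environment-variable fallback for such a flag. The exact value that enables it is not stated here, so treating `"1"` as "on" is an assumption, and `continue_through_error` is a hypothetical helper name.

```python
import os

def continue_through_error(cli_flag=False, environ=os.environ):
    """Return True if the flag was passed or the env variable is set to '1'."""
    # Assumption: the variable is considered enabled only when it equals "1".
    return cli_flag or environ.get("CONTINUE_THROUGH_ERROR", "0") == "1"

print(continue_through_error(environ={"CONTINUE_THROUGH_ERROR": "1"}))  # True
```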

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS

Reviewed By: astaff

Differential Revision: D30351108

Pulled By: seemethere

fbshipit-source-id: 767fa9bd24e1399f359eb24d16f6cc985a2d7173

2 years agoAdd driver function to run test_sharded_tensor.py and test_sharding_spec.py (#63189)
Bo Wang [Mon, 16 Aug 2021 22:18:01 +0000 (15:18 -0700)]
Add driver function to run test_sharded_tensor.py and test_sharding_spec.py (#63189)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63189

Add main --> run_tests func in test file which is needed to launch the real test cases in OSS flow.

Test Plan:
Before:
$ python test/distributed/_sharding_spec/test_sharding_spec.py --v   ==> nothing happened
$ python test/distributed/_sharded_tensor/test_sharded_tensor.py --v ==> nothing happened

after:

$ python test/distributed/_sharding_spec/test_sharding_spec.py --v   ==>

test_chunked_sharding_spec (__main__.TestShardingSpec) ... ok
test_device_placement (__main__.TestShardingSpec) ... ok
test_enumerable_sharding_spec (__main__.TestShardingSpec) ... ok

$ python test/distributed/_sharded_tensor/test_sharded_tensor.py --v

test_complete_world_size (__main__.TestShardedTensorChunked) ... ok
test_insufficient_sharding_dims (__main__.TestShardedTensorChunked) ... ok
test_invalid_pg_rpc_ranks (__main__.TestShardedTensorChunked) ... [W tensorpipe_agent.cpp:699] RPC agent for worker2 encountered error when reading incoming request from worker0: eof (this error originated at tensorpipe/transport/shm/connection_impl.cc:259)
ok
test_invalid_sharding (__main__.TestShardedTensorChunked) ... ok
test_load_state_dict_errors (__main__.TestShardedTensorChunked) ... ok
test_multiple_local_shards (__main__.TestShardedTensorChunked) ... ok
test_new_group (__main__.TestShardedTensorChunked) ... ok
test_partial_world_size (__main__.TestShardedTensorChunked) ... ok
test_sharded_tensor_metadata (__main__.TestShardedTensorChunked) ... ok
test_sharded_tensor_sizes (__main__.TestShardedTensorChunked) ... ok
test_sharding_columns (__main__.TestShardedTensorChunked) ... ok
test_state_dict (__main__.TestShardedTensorChunked) ... ok
test_state_dict_new_group (__main__.TestShardedTensorChunked) ... ok
test_state_dict_no_sharded_tensors (__main__.TestShardedTensorChunked) ... ok
test_grid_sharding (__main__.TestShardedTensorEnumerable) ... ok
test_multiple_local_shards (__main__.TestShardedTensorEnumerable) ... ok
test_new_group (__main__.TestShardedTensorEnumerable) ... ok
test_partial_world_size (__main__.TestShardedTensorEnumerable) ... ok
test_sharded_tensor_metadata (__main__.TestShardedTensorEnumerable) ... ok
test_uneven_shards (__main__.TestShardedTensorEnumerable) ... ok
test_with_rpc_names (__main__.TestShardedTensorEnumerable) ... ok
test_init_from_local_shards (__main__.TestShardedTensorFromLocalShards) ... ok
test_init_from_local_shards_invalid_shards (__main__.TestShardedTensorFromLocalShards) ... ok
test_init_from_local_shards_invalid_shards_gaps (__main__.TestShardedTensorFromLocalShards) ...

Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30294094

fbshipit-source-id: 08f0431a12ea854abe00dc920205b10ba43ae6b6

2 years ago[fx2trt] add unsqueeze converter (#63355)
Shiyan Deng [Mon, 16 Aug 2021 22:16:51 +0000 (15:16 -0700)]
[fx2trt] add unsqueeze converter (#63355)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63355

Added converter for acc_ops.unsqueeze. Needed for ig model.

Didn't add support for input that has more than one dynamic dim. This is not needed right now and I feel it would be a rare case.

Test Plan: unit test

Reviewed By: yinghai

Differential Revision: D30138293

fbshipit-source-id: 899fe8eb68387de83195a2f6e199618d96f09a9e

2 years ago[Static Runtime] Implement prim::TupleUnpack (#63243)
Mike Iovine [Mon, 16 Aug 2021 21:50:27 +0000 (14:50 -0700)]
[Static Runtime] Implement prim::TupleUnpack (#63243)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63243

Add `prim::TupleUnpack` native op to static runtime.

Test Plan: Unit test: `buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest`

Reviewed By: hlu1

Differential Revision: D30306955

fbshipit-source-id: 21923d6cbd5545c144ac051b3d48b37ec6e610cf

2 years ago[fx2trt] Factor out add_matrix_multiply_layer
Jerry Zhang [Mon, 16 Aug 2021 21:07:43 +0000 (14:07 -0700)]
[fx2trt] Factor out add_matrix_multiply_layer

Summary: Factor out the function so that it can be reused in future diffs

Test Plan: buck run mode/opt caffe2/torch/fb/fx2trt:test_matmul

Reviewed By: 842974287

Differential Revision: D30322823

fbshipit-source-id: 069b945de2c744cdbcca1618b62827692dfb4174

2 years agoA re-open PR: Avoid re-creating the random number generator in RandomSampler (#63026)
MY_ [Mon, 16 Aug 2021 21:07:06 +0000 (14:07 -0700)]
A re-open PR: Avoid re-creating the random number generator in RandomSampler (#63026)

Summary:
More details can be found in the old pr: https://github.com/pytorch/pytorch/pull/53085

ejguan  Thanks for your guidance. I tried to reopen this PR following your instructions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63026

Reviewed By: anjali411

Differential Revision: D30224920

Pulled By: ejguan

fbshipit-source-id: 2fa83bd4a2661485e553447fe3e57ce723f2716d

2 years agoImprove pip package determination (#63321)
Nikita Shulga [Mon, 16 Aug 2021 20:50:44 +0000 (13:50 -0700)]
Improve pip package determination (#63321)

Summary:
Invoking `pip` or `pip3` lists the packages for whichever `pip` alias is on the path, rather than for the interpreter currently being executed. Changed `get_pip_packages` to use `sys.executable + '-mpip'`
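The difference comes down to which interpreter the command is bound to. A minimal sketch that only builds the command; `pip_list_command` is a hypothetical helper, not the function changed in this PR.

```python
import sys

def pip_list_command():
    """Build a pip command bound to the interpreter that is running now."""
    # `python -m pip` uses this interpreter's site-packages, unlike a bare
    # `pip` that is resolved through PATH.
    return [sys.executable, "-m", "pip", "list", "--format=freeze"]

print(pip_list_command())
```

The resulting list can be passed to `subprocess.run` when the output is actually needed.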

Also, add mypy to the list of packages of interest

Discovered while looking at https://github.com/pytorch/pytorch/issues/63279

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63321

Reviewed By: walterddr

Differential Revision: D30342099

Pulled By: malfet

fbshipit-source-id: fc8d17cf2ddcf18236cfde5c1b9edb4e72804ee0

2 years ago[Profiler] Change FLOP/s to Total FLOPs (#62779)
Lucas Kabela [Mon, 16 Aug 2021 20:34:56 +0000 (13:34 -0700)]
[Profiler] Change FLOP/s to Total FLOPs (#62779)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62779

Change from floating point operations per second to total floating point operations. This requires removing the division by execution time from the Kineto computed FLOPs and updating the necessary documentation.

Test Plan:
Running the following script:

```
import torch
from torch.profiler import profile
import torchvision.models as models

model = models.resnet18().eval()
inputs = torch.randn(5, 3, 224, 224)
with torch.no_grad():
    with profile(record_shapes=True, with_flops=True) as prof:
        model(inputs)
print(prof.key_averages().table(sort_by="cpu_time_total"))
```

Before diff results in:

{F636640118}

And after diff should be about `(27.78 * 10^9) FLOP/s * .652838 seconds = 18135839640 FLOP = 18.136 GFLOP`. Running the script again yields this answer:

{F636655686}
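The expected value quoted above can be reproduced with a few lines of arithmetic. This is only a check of the conversion from a rate to a total, using the two numbers given in the test plan; the variable names are illustrative.

```python
# Rate and duration as quoted in the test plan above.
flops_per_second = 27.78e9   # FLOP/s reported before the change
elapsed_seconds = 0.652838   # execution time used in the old division

# Total FLOPs is simply rate multiplied by time.
total_flops = flops_per_second * elapsed_seconds
total_gflop = total_flops / 1e9

print(f"{total_gflop:.3f} GFLOP")
```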

------------------------------------

Reviewed By: gdankel

Differential Revision: D29972997

fbshipit-source-id: 0f8d9f264b7d9f8f6bb3f10ab7c2c9794291e28b

2 years agoFix triage workflow when the card already exists in project (#63347)
zhouzhuojie [Mon, 16 Aug 2021 20:32:40 +0000 (13:32 -0700)]
Fix triage workflow when the card already exists in project (#63347)

Summary:
Fixes issues like https://github.com/pytorch/pytorch/runs/3336787242

```
RequestError [HttpError]: Validation Failed: {"resource":"ProjectCard","code":"unprocessable","field":"data","message":"Project already has the associated issue"}
Error: Unhandled error: HttpError: Validation Failed: {"resource":"ProjectCard","code":"unprocessable","field":"data","message":"Project already has the associated issue"}
    at /home/runner/work/_actions/actions/github-script/v2/dist/index.js:7531:23
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at async eval (eval at callAsyncFunction (/home/runner/work/_actions/actions/github-script/v2/dist/index.js:7985:56), <anonymous>:63:1)
    at async main (/home/runner/work/_actions/actions/github-script/v2/dist/index.js:8011:20) {
  name: 'HttpError',
  status: 422,

...
```

The card may already exist, so a `422` status code is treated as expected and ignored. Any other error is re-thrown.
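The handling described above can be sketched as follows. The real workflow is a JavaScript `github-script` step; this Python version only illustrates the pattern, and `HttpError`, `add_card` and `create_card` are hypothetical names.

```python
class HttpError(Exception):
    """Hypothetical stand-in for the HTTP error raised by an API client."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def add_card(create_card):
    """Call create_card(); tolerate 422 (card already exists), re-raise the rest."""
    try:
        create_card()
        return "created"
    except HttpError as err:
        if err.status == 422:
            # The project already has the associated issue: nothing to do.
            return "already exists"
        raise
```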

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63347

Reviewed By: malfet

Differential Revision: D30348529

Pulled By: zhouzhuojie

fbshipit-source-id: 36647837bfccad43ce01eb5dfe6642e685615037

2 years ago[opinfo] nn.functional.pad (#62814)
kshitij12345 [Mon, 16 Aug 2021 20:26:46 +0000 (13:26 -0700)]
[opinfo] nn.functional.pad (#62814)

Summary:
Reference: https://github.com/facebookresearch/functorch/issues/78

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62814

Reviewed By: VitalyFedyunin

Differential Revision: D30307492

Pulled By: zou3519

fbshipit-source-id: 4f6062eb4a3c91ed1795df1f82846afa0abafcdc

2 years agoAdd expecttest to requirements.txt (#63320)
Sam Estep [Mon, 16 Aug 2021 20:20:59 +0000 (13:20 -0700)]
Add expecttest to requirements.txt (#63320)

Summary:
This PR closes the developer environment gap left by https://github.com/pytorch/pytorch/issues/60658 by adding [expecttest](https://github.com/ezyang/expecttest) to `requirements.txt`. Thus it provides a solution to one of the short-term problems that https://github.com/pytorch/pytorch/issues/60697 tries to solve, but does not provide a long-term solution to https://github.com/pytorch/pytorch/issues/61375.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63320

Reviewed By: malfet

Differential Revision: D30340654

Pulled By: samestep

fbshipit-source-id: 26c8f8c9889cce4a94fafb1bf2f0d6df4c70503f

2 years agoadd comma to prevent syntax errors (#62492)
kyshel [Mon, 16 Aug 2021 19:12:45 +0000 (12:12 -0700)]
add comma to prevent syntax errors (#62492)

Summary:
Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62492

Reviewed By: VitalyFedyunin

Differential Revision: D30304684

Pulled By: ezyang

fbshipit-source-id: db08ca39bcecbfd79ea50df18536bf4e87f51e15

2 years agoRetry apt-get during setup_ci_workspace (#63319)
Bert Maher [Mon, 16 Aug 2021 19:10:50 +0000 (12:10 -0700)]
Retry apt-get during setup_ci_workspace (#63319)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63319

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D30346067

Pulled By: bertmaher

fbshipit-source-id: 2aafa97e78f9297553d772b2524d6f1c0ebaa46e

2 years agoMake `torch.lu` differentiable for wide/tall inputs + jit (#61564)
Nikita Vedeneev [Mon, 16 Aug 2021 18:39:04 +0000 (11:39 -0700)]
Make `torch.lu` differentiable for wide/tall inputs + jit (#61564)

Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61564

Reviewed By: astaff

Differential Revision: D30338136

Pulled By: mruberry

fbshipit-source-id: f01436fc90980544cdfa270feee16bb3dda21b93

2 years ago[Model Averaging] Allow subgroup to be None in PostLocalSGDState (#63277)
Yi Wang [Mon, 16 Aug 2021 17:05:47 +0000 (10:05 -0700)]
[Model Averaging] Allow subgroup to be None in PostLocalSGDState (#63277)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63277

`PostLocalSGDState` requires a subgroup. To initialize this subgroup, a global process group must be initialized. However, this imposes a restriction that a hook state can only be provided after distributed environment initialization, which is not compatible with lightning DDP plugin setup where hook state should be provided before distributed environment initialization.

Proposal: https://github.com/pytorch/pytorch/issues/59699
ghstack-source-id: 135848575

Test Plan: buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_ddp_hook_parity_post_localSGD

Reviewed By: cbalioglu

Differential Revision: D30325041

fbshipit-source-id: 7b870166d096d306c3f2f7c69816a705cec0bebd

2 years agoRevert "[docs] Update docs for NegativeBinomial (#45693)" (#63192)
Meghan Lele [Mon, 16 Aug 2021 16:12:57 +0000 (09:12 -0700)]
Revert "[docs] Update docs for NegativeBinomial (#45693)" (#63192)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63192

**Summary**
This reverts commit 402caaeba513929dcfe12df183c764b0ef43f688. As per the
discussion in #62178, this commit was not needed.

**Test Plan**
Continuous integration.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30293202

Pulled By: SplitInfinity

fbshipit-source-id: 91ee7ad0523a9880605d83fe9712c39df67384a8

2 years agoRefactor BucketBatch (#63185)
Erjia Guan [Mon, 16 Aug 2021 13:39:56 +0000 (06:39 -0700)]
Refactor BucketBatch (#63185)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63185

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D30288893

Pulled By: ejguan

fbshipit-source-id: b88b792d12a83c99d8ea9e516e3b4c54a82100f6

2 years agoReplace str by repr for DataChunk (#63184)
Erjia Guan [Mon, 16 Aug 2021 13:39:56 +0000 (06:39 -0700)]
Replace str by repr for DataChunk (#63184)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63184

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D30288892

Pulled By: ejguan

fbshipit-source-id: 45c88fdd3987e234f2c22ebbbfd8d5044983c34c

2 years ago[nnc] Updated IRMutator and IRSimplifier to perform in-place mutations. (#63246)
Raghavan Raman [Mon, 16 Aug 2021 07:07:51 +0000 (00:07 -0700)]
[nnc] Updated IRMutator and IRSimplifier to perform in-place mutations. (#63246)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63246

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30309636

Pulled By: navahgar

fbshipit-source-id: 409ea8d6982888cfee9127e6248044dd2ed9d8d4

2 years ago[docs][ao] Add overload information for fake_quantize_per_tensor_affine (#63258)
Supriya Rao [Mon, 16 Aug 2021 05:44:44 +0000 (22:44 -0700)]
[docs][ao] Add overload information for fake_quantize_per_tensor_affine (#63258)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63258

This function supports scalar and tensor qparams

Test Plan:
CI

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D30316432

fbshipit-source-id: 8b2f5582e7e095fdda22c17d178abcbc89a2d1fc

2 years ago[docs][ao] Add missing docstrings for quantized_max_pool1d and quantized_max_pool2d...
Supriya Rao [Mon, 16 Aug 2021 05:44:44 +0000 (22:44 -0700)]
[docs][ao] Add missing docstrings for quantized_max_pool1d and quantized_max_pool2d (#63242)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63242

These functions are part of the native functions namespace as well as the quantized namespace

Test Plan:
CI

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D30316430

fbshipit-source-id: cd9c839e5c1a961e3c6944e514c16fbc256a2f0c

2 years ago[docs][ao] Add missing documentation for torch.quantized_batch_norm (#63240)
Supriya Rao [Mon, 16 Aug 2021 05:44:44 +0000 (22:44 -0700)]
[docs][ao] Add missing documentation for torch.quantized_batch_norm (#63240)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63240

Op is exposed via torch.quantized_batch_norm to the end user without any existing documentation

Test Plan:
CI

Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30316431

fbshipit-source-id: bf2dc8b7b6f497cf73528eaa2bedef9f65029d84

2 years ago[OpInfo] Add expected_failure kwarg to SkipInfo (#62963)
Heitor Schueroff [Mon, 16 Aug 2021 01:06:41 +0000 (18:06 -0700)]
[OpInfo] Add expected_failure kwarg to SkipInfo (#62963)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62963

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30327199

Pulled By: heitorschueroff

fbshipit-source-id: 45231eca11d1697a4449d79849fb17264d128a6b

2 years agoSmall refactor for OpInfo decorators (#62713)
Heitor Schueroff [Mon, 16 Aug 2021 01:06:41 +0000 (18:06 -0700)]
Small refactor for OpInfo decorators (#62713)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62713

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30327200

Pulled By: heitorschueroff

fbshipit-source-id: 1899293990c8c0a66da88646714b38f1aae9179d

2 years ago[Pytorch Edge] Fix broken test post changes in error reporting format. (#63287)
Kimish Patel [Sun, 15 Aug 2021 23:12:47 +0000 (16:12 -0700)]
[Pytorch Edge] Fix broken test post changes in error reporting format. (#63287)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63287

Recent changes in https://github.com/pytorch/pytorch/pull/62419 changed
the way module hierarchy is reported. Now it includes information about
function names as well.

Test Plan:
python test/mobile/test_lite_script_module.py
TestLiteScriptModule.test_save_mobile_module_with_debug_info_with_trace

Imported from OSS

Reviewed By: iseeyuan

Differential Revision: D30328512

fbshipit-source-id: ddd6b11b9ab01cc725f4568a35eff7a92f17204b

2 years agoTo add warm-up scheduler to optim (#60836)
Ilqar Ramazanli [Sun, 15 Aug 2021 19:30:18 +0000 (12:30 -0700)]
To add warm-up scheduler to optim (#60836)

Summary:
Learning rate warmup was initially discussed by Priya Goyal et al. in the paper: https://arxiv.org/pdf/1706.02677.pdf .

In Section 2.2 of the paper they discuss and propose warming up learning rate schedules in order to prevent large variance / noise in the learning rate. The idea has been discussed further in the following papers:
  * Akilesh Gotmare et al. https://arxiv.org/abs/1810.13243
  * Bernstein et al  http://proceedings.mlr.press/v80/bernstein18a/bernstein18a.pdf
  * Liyuan Liu et al: https://arxiv.org/pdf/1908.03265.pdf

There are two popular types of learning rate warmup:
  * Constant warmup (start with a very small constant learning rate)
  * Linear warmup (start with a small learning rate and gradually increase it)

In this PR we add warmup as a learning rate scheduler. Note that learning rate schedulers are chainable, which means that the warmup scheduler can be combined with any other learning rate scheduler to build a more sophisticated schedule.

## Linear Warmup

Linear warmup multiplies the learning rate by a pre-defined constant, warmup_factor, in the first epoch (epoch 0), and then increases this multiplicative constant to one over warmup_iters epochs. Hence the multiplicative constant at the i-th step is:

                    warmup_factor + (1-warmup_factor) * i /  warmup_iters

Moreover, the ratio of this quantity at step i to its value at step i-1 is

           1 + (1.0 - warmup_factor) / [warmup_iters*warmup_factor+(i-1)*(1-warmup_factor)]

which is used in the get_lr() method in our implementation. Below is an example of how to use the linear warmup scheduler, together with its output to show how it works.

```python
import torch
from torch.nn import Parameter
from torch.optim import SGD
from torch.optim.lr_scheduler import WarmUpLR

model = [Parameter(torch.randn(2, 2, requires_grad=True))]
optimizer = SGD(model, 0.1)
scheduler = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=10, warmup_method="linear")

for epoch in range(15):

    print(epoch, scheduler.get_last_lr()[0])

    optimizer.step()
    scheduler.step()
```

```
0 0.010000000000000002
1 0.019000000000000003
2 0.028000000000000008
3 0.03700000000000001
4 0.04600000000000001
5 0.055000000000000014
6 0.06400000000000002
7 0.07300000000000002
8 0.08200000000000003
9 0.09100000000000004
10 0.10000000000000005
11 0.10000000000000005
12 0.10000000000000005
13 0.10000000000000005
14 0.10000000000000005
```
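The two formulas above can be cross-checked without the scheduler itself. The following is a plain-Python sketch, not the code from this PR: the function names are illustrative, and it assumes the factor stays at one once `warmup_iters` is reached. It applies the step-to-step ratio recursively and compares the result with the closed-form factor.

```python
def linear_warmup_factor(i, warmup_factor, warmup_iters):
    # Closed form: multiplicative constant at step i (assumed to be 1 after warmup).
    if i >= warmup_iters:
        return 1.0
    return warmup_factor + (1 - warmup_factor) * i / warmup_iters


def step_ratio(i, warmup_factor, warmup_iters):
    # Ratio of the factor at step i to the factor at step i-1.
    return 1 + (1.0 - warmup_factor) / (
        warmup_iters * warmup_factor + (i - 1) * (1 - warmup_factor)
    )


base_lr, warmup_factor, warmup_iters = 0.1, 0.1, 10

# Recursive form: start at base_lr * warmup_factor and multiply by the ratio.
lr = base_lr * warmup_factor
lrs = [lr]
for i in range(1, 15):
    if i <= warmup_iters:
        lr *= step_ratio(i, warmup_factor, warmup_iters)
    lrs.append(lr)

for epoch, value in enumerate(lrs):
    closed_form = base_lr * linear_warmup_factor(epoch, warmup_factor, warmup_iters)
    print(epoch, value, closed_form)
```

Up to floating point rounding, both columns follow the sequence printed above: the learning rate rises by 0.009 per epoch from 0.01 until it reaches 0.1 at epoch 10.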

## Constant Warmup

Constant warmup is straightforward: multiply the learning rate by warmup_factor until epoch warmup_iters is reached, then leave the learning rate unchanged for the following epochs.

```python
import torch
from torch.nn import Parameter
from torch.optim import SGD
from torch.optim.lr_scheduler import WarmUpLR

model = [Parameter(torch.randn(2, 2, requires_grad=True))]
optimizer = SGD(model, 0.1)
scheduler = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="constant")

for epoch in range(10):

    print(epoch, scheduler.get_last_lr()[0])

    optimizer.step()
    scheduler.step()
```

```
0 0.010000000000000002
1 0.010000000000000002
2 0.010000000000000002
3 0.010000000000000002
4 0.010000000000000002
5 0.10000000000000002
6 0.10000000000000002
7 0.10000000000000002
8 0.10000000000000002
9 0.10000000000000002
```
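The same schedule can be written as a one-line rule. This is a plain-Python sketch of the behavior shown in the output above, not the scheduler implementation; `constant_warmup_lr` is a hypothetical helper name.

```python
def constant_warmup_lr(epoch, base_lr, warmup_factor, warmup_iters):
    # Scaled learning rate during warmup, unmodified base rate afterwards.
    if epoch < warmup_iters:
        return base_lr * warmup_factor
    return base_lr


schedule = [constant_warmup_lr(e, 0.1, 0.1, 5) for e in range(10)]
print(schedule)
```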

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60836

Reviewed By: saketh-are

Differential Revision: D29537615

Pulled By: iramazanli

fbshipit-source-id: d910946027acc52663b301f9c56ade686e62cb69

2 years agoMove fx2trt and oss_acc_tracer to oss (#63101)
Shiyan Deng [Sun, 15 Aug 2021 18:52:20 +0000 (11:52 -0700)]
Move fx2trt and oss_acc_tracer to oss (#63101)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63101

Move internal fx2trt to torch/fx/experimental/fx2trt and merge the two TRT interpreters we have right now. cc: mortzur as this might affect uru exporting script.

Move oss_acc_tracer to torch/fx/experimental/fx_acc.

Test Plan: CI

Reviewed By: jerryzh168

Differential Revision: D30257909

fbshipit-source-id: 4e374965fbf88d72e91844d9e9b6ff9b98f467d1

2 years agoHide all symbols in llvm namespace (#63272)
Bert Maher [Sun, 15 Aug 2021 18:28:23 +0000 (11:28 -0700)]
Hide all symbols in llvm namespace (#63272)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63272

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D30331695

Pulled By: bertmaher

fbshipit-source-id: d35130c96f7e2a31fa86d9d80de59002e96301df

2 years agoAdd copy button to code snippets in docs (#63149)
anjali411 [Sun, 15 Aug 2021 13:22:53 +0000 (06:22 -0700)]
Add copy button to code snippets in docs (#63149)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63149

Test Plan: Imported from OSS

Reviewed By: navahgar, albanD

Differential Revision: D30308891

Pulled By: anjali411

fbshipit-source-id: ad51180ab2f27c4525682b2603bbf753bb8f1ce9

2 years ago[Pytorch Edge] Enable kineto profiler on mobile via EdgeKinetoProfiler (#62419)
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Edge] Enable kineto profiler on mobile via EdgeKinetoProfiler (#62419)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62419

This diff adds support for the CPU-only Kineto profiler on mobile, thus
enabling Chrome trace generation on mobile. This brings the C++ API for
mobile profiling on par with TorchScript.
This is done via:
1. Utilizing debug handle annotations in KinetoEvent.
2. Adding post-processing capability, via callbacks, to
KinetoThreadLocalState.
3. Creating a new RAII-style profiler, KinetoEdgeCPUProfiler, which can be
used in the surrounding scope of model execution. This will write the Chrome
trace to the location specified in the profiler constructor.

Test Plan:
MobileProfiler.ModuleHierarchy

Imported from OSS

Reviewed By: raziel

Differential Revision: D29993660

fbshipit-source-id: 0b44f52f9e9c5f5aff81ebbd9273c254c3c03299

2 years ago[Pytorch Mobile] Combing instructions and debug hanles in single struct (#62418)
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Mobile] Combing instructions and debug hanles in single struct (#62418)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62418

Debug handles have a one-to-one correspondence with instructions, so
combine them into a single struct.

Test Plan:
CI

Imported from OSS

Reviewed By: raziel

Differential Revision: D29993661

fbshipit-source-id: 125c7163174cf66624dd95f110fdc8208fea8a07

2 years ago[Pytorch Profiler] Introduce scopes to enableProfiler (#62417)
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Introduce scopes to enableProfiler (#62417)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62417

This diff adds an option to make enableProfiler enable callbacks only
for certain RecordScopes.
Why?
Profiling has some overhead when we repeatedly execute callbacks for
all scopes. On mobile, where we often have small quantized models,
this overhead can be large. We observed that by profiling only the
top-level op and skipping the other ATen ops called within it, we can
limit this overhead. For example, we profile at::conv2d but skip
at::convolution, at::convolution_ and any further ops, such as transpose,
that are called beneath it. Of course this limits visibility, but
at least this way we get a choice.

Test Plan: Imported from OSS

Reviewed By: ilia-cher

Differential Revision: D29993659

fbshipit-source-id: 852d3ae7822f0d94dc6e507bd4019b60d488ef69

2 years ago[Pytorch Profiler] Add debug_handles to KinetoEvent (#62228)
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Add debug_handles to KinetoEvent (#62228)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62228

This diff adds debug handles to events and provides a way to use
RECORD_FUNCTIONs that will pass debug_handles down to profiler, which
will record it in the events.

Why add debug_handles?
For PyTorch mobile, with the lite interpreter, we generate debug handles
that can be used to lazily symbolicate exception traces into a model-level
stack trace, similar to the model-level stack trace you get in
TorchScript models. The debug_handles also enable getting the module
hierarchy for lite interpreter models, support for which was added to
KinetoProfiler in previous diffs.

Followup plan:
1. Enable scope callbacks such that the lite interpreter can use them to
profile only top-level ops.
2. Enable post-processing callbacks that take KinetoEvents and populate
the module hierarchy using debug handles.

This will let us use KinetoProfiler for lite interpreter use cases on
mobile. The aim is to use an RAII guard to similarly generate Chrome traces
for mobile use cases as well, although only for top-level ops.

Test Plan:
test_misc : RecordDebugHandles.Basic

Imported from OSS

Reviewed By: ilia-cher

Differential Revision: D29935899

fbshipit-source-id: 4f06dc411b6b5fe0ffaebdd26d3274c96f8f389b

2 years ago[Pytorch Profiler] Move start timestamp to end of start callback (#62191)
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Move start timestamp to end of start callback (#62191)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62191

This moves start timestamping to the end of the callback. This way we don't
account for callstack/module hierarchy related overhead in op runtime.

Test Plan:
CI

Imported from OSS

Reviewed By: ilia-cher

Differential Revision: D29910519

fbshipit-source-id: f462031a81ae12b3db7993cf482e5ad93a35e096

2 years ago[Pytorch Profiler] Add support for adding module hierarchy to (#61792)
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Add support for adding module hierarchy to (#61792)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61792

KinetoEvent

This PR adds module hierarchy information to events.
What is module hierarchy information attached to events?
During profiling a TorchScript module, when events are added, we ask JIT
what is the module hierarchy associated with the node being
executed. At the time of execution of that node, there might be multiple
frames in the stack of interpreter. For each frame, we find
corresponding node and the corresponding module hierarchy is queried.
Module hierarchy corresponding to the node is associated with node's
InlinedCallStack. InlinedCallStack of node tracks the path via which the
node is inlined. Thus during the inlining process we annotate
module information corresponding to the CallMethod nodes being inlined.

With this PR, chrome trace will contain additional metadata:
"Module Hierarchy". This can look like this:
TOP(ResNet)::forward.SELF(ResNet)::_forward_impl.layer1(Sequential)::forward.0(BasicBlock)::forward.conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward
It contains module instance, type name and the method name in the
callstack.
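As an illustration of the format described above, the sample string can be split back into (instance, type, method) frames. This is a sketch only: the regular expression is inferred from the single example in this message, and `parse_hierarchy` is a hypothetical helper, not part of the profiler API.

```python
import re

# Sample "Module Hierarchy" string copied from the commit message above.
HIERARCHY = (
    "TOP(ResNet)::forward.SELF(ResNet)::_forward_impl."
    "layer1(Sequential)::forward.0(BasicBlock)::forward."
    "conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward"
)

# One frame looks like: instance(TypeName)::method
FRAME = re.compile(r"(\w+)\((\w+)\)::(\w+)")


def parse_hierarchy(text):
    # Returns a list of (module instance, type name, method name) tuples,
    # ordered from the outermost call to the innermost one.
    return [match.groups() for match in FRAME.finditer(text)]


for instance, type_name, method in parse_hierarchy(HIERARCHY):
    print(f"{instance:8} {type_name:12} {method}")
```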

Test Plan:
test_profiler

Imported from OSS

Reviewed By: raziel, ilia-cher

Differential Revision: D29745442

fbshipit-source-id: dc8dfaf7c5b8ab256ff0b2ef1e5ec265ca366528

2 years agoadd substract of max and testcase (#63132)
leslie-fang-intel [Sat, 14 Aug 2021 03:49:27 +0000 (20:49 -0700)]
add substract of max and testcase (#63132)

Summary:
As discussed in https://github.com/pytorch/pytorch/pull/62897, the BF16/non-last-dim Softmax path was missing the subtraction of the max value, which causes overflow in the `exp()` calculation when the input tensor contains large values, such as `1000.0`.
To avoid this issue, this PR adds the subtraction of the max value and the corresponding test cases.
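The effect of the max subtraction can be seen in a small pure-Python sketch. This is not the kernel code from this PR; it only illustrates the standard trick on a list of floats. Note that Python's `math.exp` raises `OverflowError` for large arguments, whereas floating point tensor kernels produce `inf` and then `nan`.

```python
import math


def naive_softmax(xs):
    # Overflows as soon as any input is large, e.g. 1000.0.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(xs):
    # Subtracting the max leaves the result unchanged mathematically,
    # but keeps every exponent <= 0 so exp() cannot overflow.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


inputs = [1000.0, 1000.0, 999.0]
try:
    naive_softmax(inputs)
except OverflowError as err:
    print("naive softmax failed:", err)

print(stable_softmax(inputs))
```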

Note that without the subtraction of the max value (for example after an accidental revert or change), the test case fails with the following error message:
```
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0.05 and atol=0.05, found 103984 element(s) (out of 126720) whose difference(s) exceeded the margin of error (including 103984 nan comparisons). The greatest difference was nan (0.0 vs. nan), which occurred at index (0, 0, 0, 1).
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63132

Reviewed By: VitalyFedyunin

Differential Revision: D30280792

Pulled By: cpuhrsch

fbshipit-source-id: 722821debf983bbb4fec878975fa8a4da0d1d866

2 years agoOpInfo: `nn.functional.conv_transpose2d` (#62882)
Kushashwa Ravi Shrimali [Sat, 14 Aug 2021 00:10:07 +0000 (17:10 -0700)]
OpInfo: `nn.functional.conv_transpose2d` (#62882)

Summary:
See https://github.com/facebookresearch/functorch/issues/78 and https://github.com/pytorch/pytorch/issues/54261.

cc: mruberry zou3519 Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62882

Reviewed By: bdhirsh

Differential Revision: D30280804

Pulled By: zou3519

fbshipit-source-id: e40cdf43e98c1f11e45df6b8bc13110b4d29c45f

2 years agorefactor fx2trt example script so it can be imported as a library (#63262)
Kefei Lu [Fri, 13 Aug 2021 23:57:47 +0000 (16:57 -0700)]
refactor fx2trt example script so it can be imported as a library (#63262)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63262

Just create a `__main__` guard.

Test Plan: run linter, sandcastle tests

Reviewed By: 842974287

Differential Revision: D30263617

fbshipit-source-id: 8044ce5d815b043c3778591384cb13d9a89d0048

2 years ago[iOS] Add `LibTorch-Lite-Nightly` pod (#63239)
Hanton Yang [Fri, 13 Aug 2021 23:20:22 +0000 (16:20 -0700)]
[iOS] Add `LibTorch-Lite-Nightly` pod (#63239)

Summary:
D30090760 (https://github.com/pytorch/pytorch/commit/e182b459d94fe77c1d9f623c94fc2621c8cc55de) was reverted by D30303292 because of a lint issue in `LibTorch-Lite-Nightly.podspec.template`. Resubmit the diff after fixing the issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63239

Test Plan: Imported from OSS

Reviewed By: xta0

Differential Revision: D30315690

Pulled By: hanton

fbshipit-source-id: f0fa719ffc3b8181ab28c123584ae5c1da8992c0

2 years agoAllow TransformerEncoder and TransformerDecoder to accept 0-dim batch sized tensors...
Sameer Deshmukh [Fri, 13 Aug 2021 23:08:01 +0000 (16:08 -0700)]
Allow TransformerEncoder and TransformerDecoder to accept 0-dim batch sized tensors. (#62800)

Summary:
This PR fixes part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.

This PR allows TransformerEncoder and TransformerDecoder (along with the inner `Layer` classes) to accept inputs whose batch dimension has size 0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62800

Reviewed By: VitalyFedyunin

Differential Revision: D30303240

Pulled By: jbschlosser

fbshipit-source-id: 8f8082a6f2a9f9d7ce0b22a942d286d5db62bd12

2 years ago[ROCm] Update HIP_VERSION to TORCH_HIP_VERSION (#62786)
Pruthvi Madugundu [Fri, 13 Aug 2021 21:57:17 +0000 (14:57 -0700)]
[ROCm] Update HIP_VERSION to TORCH_HIP_VERSION (#62786)

Summary:
- HIP_VERSION semantic versioning will change in ROCm4.3. The changes essentially remove the dependency on HIP_VERSION provided in the hip header to keep code compatible with older and newer versions of ROCm.
- TORCH_HIP_VERSION is derived from HIP_VERSION_MAJOR and HIP_VERSION_MINOR

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62786

Reviewed By: bdhirsh

Differential Revision: D30281682

Pulled By: seemethere

fbshipit-source-id: e41e69fb9e13de5ddd1af99ba5bbdcbb7b64b673

2 years agoRespect user-set CMAKE_PREFIX_PATH (#61904)
Can Balioglu [Fri, 13 Aug 2021 20:47:37 +0000 (13:47 -0700)]
Respect user-set CMAKE_PREFIX_PATH (#61904)

Summary:
Fixes the case where the `CMAKE_PREFIX_PATH` variable gets silently overwritten by a user specified environment variable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61904

Reviewed By: walterddr, malfet

Differential Revision: D29792014

Pulled By: cbalioglu

fbshipit-source-id: babacc8d5a1490bff1e14247850cc00c6ba9e6be

2 years agoRemove left-over print in test_diff_graph_inline_threshold (#63231)
gmagogsfm [Fri, 13 Aug 2021 20:06:08 +0000 (13:06 -0700)]
Remove left-over print in test_diff_graph_inline_threshold (#63231)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63231

Reviewed By: VitalyFedyunin

Differential Revision: D30305851

Pulled By: gmagogsfm

fbshipit-source-id: 43da3b5f49ad4a6a2d6d174acf792f3ccf41a463

2 years agoAdd CostInferenceFunction for SplitOp (#63133)
Tanvir Zaman [Fri, 13 Aug 2021 19:25:16 +0000 (12:25 -0700)]
Add CostInferenceFunction for SplitOp (#63133)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63133

SplitOp is costly but lacks a cost inference function, which hurts cost-based balancing. Changes are:
(1) Addition of CostInferenceFunction for SplitOp
(2) Small fix in CostInferenceFunction for ConcatOp

Test Plan:
Added unit tests:

buck test //caffe2/caffe2/python/operator_test:split_op_cost_test

buck test //caffe2/caffe2/python/operator_test:concat_op_cost_test

Reviewed By: smacke

Differential Revision: D30247360

fbshipit-source-id: 989e962f3a981acc85b73aac3fb23e603b7d1591

2 years ago[docs] Merge note block in `torch.lu` documentation (#63156)
Meghan Lele [Fri, 13 Aug 2021 19:08:28 +0000 (12:08 -0700)]
[docs] Merge note block in `torch.lu` documentation (#63156)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63156

**Summary**
This commit merges the four successive `Note` blocks that appear in the
documentation for `torch.lu`. Each one only has one line in it, so all
of them have been merged into one block with a bulleted list that
contains the original items.

**Test Plan**
Continuous integration.

*Before*
<img width="888" alt="Captura de Pantalla 2021-08-12 a la(s) 10 48 39 a  m" src="https://user-images.githubusercontent.com/4392003/129244443-b7d1594e-8833-4c20-a911-e1bf7ca88a8d.png">

*After*
<img width="932" alt="Captura de Pantalla 2021-08-12 a la(s) 10 48 46 a  m" src="https://user-images.githubusercontent.com/4392003/129244462-1f39dcdb-90e0-4fd9-a95f-343b0b6be1f1.png">

**Fixes**
This commit fixes #62339.

Test Plan: Imported from OSS

Reviewed By: navahgar, pbelevich

Differential Revision: D30292633

Pulled By: SplitInfinity

fbshipit-source-id: cb9071165629bfe7316b1d2fe952e4354c75d48f

2 years ago[docs] Remove `input` parameter from `Tensor.flatten` docs (#63180)
Meghan Lele [Fri, 13 Aug 2021 18:46:54 +0000 (11:46 -0700)]
[docs] Remove `input` parameter from `Tensor.flatten` docs (#63180)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63180

**Summary**
This commit removes the `input` parameter from the signature for
`Tensor.flatten` shown in its documentation. This parameter is accepted
by `torch.flatten` but not `Tensor.flatten` (since the input is the
`Tensor` on which `flatten` is invoked).

**Test Plan**
Continuous integration.

**Fixes**
This commit fixes #57478.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30293156

Pulled By: SplitInfinity

fbshipit-source-id: 4ad70d638af009fb6bdeb703433b306904d39a76

2 years ago[docs] Add cross references to `torch.transpose` and `torch.t` (#63177)
Meghan Lele [Fri, 13 Aug 2021 18:46:14 +0000 (11:46 -0700)]
[docs] Add cross references to `torch.transpose` and `torch.t` (#63177)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63177

**Summary**
This commit adds a link in the documentation for `torch.transpose` that
directs to `torch.t` and vice versa. These two functions are related and
it is useful for users of one to know about the other.

**Test Plan**
Continuous integration.

**Fixes**
This commit fixes #56267.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30292654

Pulled By: SplitInfinity

fbshipit-source-id: 8e60cd7a598ff8b4756cb30141399dfe8e118338

2 years ago[docs] Mention `vsplit`, `hsplit` and `tensor_split` in Tensor views doc (#63191)
Meghan Lele [Fri, 13 Aug 2021 18:43:05 +0000 (11:43 -0700)]
[docs] Mention `vsplit`, `hsplit` and `tensor_split` in Tensor views doc (#63191)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63191

**Summary**
This commit adds `vsplit`, `hsplit` and `tensor_split` to the list of
view ops on the Tensor Views documentation page.

**Test Plan**
Continuous integration.

*Before*
<img width="195" alt="Captura de Pantalla 2021-08-12 a la(s) 2 55 07 p  m" src="https://user-images.githubusercontent.com/4392003/129275921-c1cfdf6c-9f1f-45f3-98b6-1de7a0f0cc84.png">

*After*
<img width="197" alt="Captura de Pantalla 2021-08-12 a la(s) 2 55 15 p  m" src="https://user-images.githubusercontent.com/4392003/129275936-de4afde7-0143-4e1d-b38f-c86256f4896c.png">

**Fixes**
This commit fixes #62727.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D30293181

Pulled By: SplitInfinity

fbshipit-source-id: 283783a4ccc3ebc50cb0a427e55c7a6cb618ffd7

2 years agoAllow Average Pooling modules to accept tensors with 0-dim batch sizes. (#62025)
Sameer Deshmukh [Fri, 13 Aug 2021 18:27:47 +0000 (11:27 -0700)]
Allow Average Pooling modules to accept tensors with 0-dim batch sizes. (#62025)

Summary:
This PR fixes part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.

It introduces changes and tests that allow the Average Pooling layers to accept tensors with zero-sized batch dimensions and return meaningful results.
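
As a rough illustration of what a "meaningful result" means for an empty batch, the sketch below computes the expected output shape of a 2D average pooling layer in plain Python. It assumes a square kernel, `ceil_mode=False`, and the usual floor-based pooling size formula; the helper name `avg_pool2d_output_shape` is hypothetical and is not part of PyTorch.

```python
def avg_pool2d_output_shape(input_shape, kernel_size, stride=None, padding=0):
    """Expected (N, C, H_out, W_out) for a 2D average pool.

    Hypothetical helper for illustration only. Assumes a square kernel
    and floor rounding (ceil_mode=False).
    """
    n, c, h, w = input_shape
    # Pooling layers default the stride to the kernel size.
    stride = kernel_size if stride is None else stride

    def pooled(size):
        return (size + 2 * padding - kernel_size) // stride + 1

    # The batch dimension passes through unchanged, even when it is 0.
    return (n, c, pooled(h), pooled(w))


# An empty batch keeps its zero-sized leading dimension.
empty_batch = avg_pool2d_output_shape((0, 3, 8, 8), kernel_size=2)
regular_batch = avg_pool2d_output_shape((4, 3, 8, 8), kernel_size=2)
```

The point is that only the batch dimension is empty: the channel and spatial dimensions of the output are still well defined, so an empty tensor of that shape can be returned instead of raising an error.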

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62025

Reviewed By: VitalyFedyunin

Differential Revision: D30303256

Pulled By: jbschlosser

fbshipit-source-id: 5f727e62a7c58d2b8bb49fcc3bd7688474917ba5

2 years ago[skip ci] fix workflow code generation (#63235)
zhouzhuojie [Fri, 13 Aug 2021 17:37:07 +0000 (10:37 -0700)]
[skip ci] fix workflow code generation (#63235)

Summary:
Fixes a clean-git check failure in code generation that was introduced by https://github.com/pytorch/pytorch/pull/63148

`generated-win-vs2019-cuda10-py3.yml` was renamed to `generated-win-vs2019-cuda10.1-py3.yml`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63235

Reviewed By: VitalyFedyunin

Differential Revision: D30306474

Pulled By: zhouzhuojie

fbshipit-source-id: cbae1ace064e360e8ca0c0e997116bdb20d54d46

2 years ago[Static Runtime] Add pass to eliminate __getitem__/DictConstruct calls (#62429)
Mike Iovine [Fri, 13 Aug 2021 17:18:03 +0000 (10:18 -0700)]
[Static Runtime] Add pass to eliminate __getitem__/DictConstruct calls (#62429)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62429

Introduce a new pass to eliminate calls to `prim::DictConstruct/aten::__getitem__`. Given a graph like this:
```
%2 : Dict = prim::DictConstruct(%key, %value)
%3 : Tensor = aten::__getitem__(%2, %key)
%4 : Tensor = op(%3)
```
This pass produces a graph like this (after dead code elimination):
```
%4 : Tensor = op(%value)
```

This optimization is applied in the static runtime.
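
The rewrite described above can be sketched on a toy IR in plain Python. This is not the Static Runtime implementation: the `Node` class and the `eliminate_dict_getitem` function are hypothetical, the graph is assumed to be in topological order, and the sketch ignores dict mutation, aliasing, and graph outputs, all of which a real pass would have to check.

```python
from dataclasses import dataclass
from typing import Dict, List

DICT_CONSTRUCT = "prim::DictConstruct"
GETITEM = "aten::__getitem__"


@dataclass
class Node:
    out: str           # name of the value this node defines, e.g. "%2"
    op: str            # operator name
    inputs: List[str]  # names of the input values


def eliminate_dict_getitem(graph: List[Node]) -> List[Node]:
    """Toy version of the pass; assumes dicts are never mutated."""
    dicts: Dict[str, Dict[str, str]] = {}  # dict value -> {key: value}
    replacements: Dict[str, str] = {}      # removed getitem output -> value
    kept: List[Node] = []

    for node in graph:
        # Rewrite uses of values that were already replaced.
        inputs = [replacements.get(i, i) for i in node.inputs]
        if node.op == DICT_CONSTRUCT:
            # Inputs alternate key, value, key, value, ...
            dicts[node.out] = dict(zip(inputs[0::2], inputs[1::2]))
        elif node.op == GETITEM and inputs[1] in dicts.get(inputs[0], {}):
            # The lookup result is known statically: forward the value.
            replacements[node.out] = dicts[inputs[0]][inputs[1]]
            continue
        kept.append(Node(node.out, node.op, inputs))

    # Dead code elimination: drop DictConstruct nodes nobody reads anymore.
    used = {i for node in kept for i in node.inputs}
    return [n for n in kept if not (n.op == DICT_CONSTRUCT and n.out not in used)]


graph = [
    Node("%2", DICT_CONSTRUCT, ["%key", "%value"]),
    Node("%3", GETITEM, ["%2", "%key"]),
    Node("%4", "op", ["%3"]),
]
optimized = eliminate_dict_getitem(graph)
```

On the three-node graph from the commit message, `optimized` contains a single node equivalent to `%4 : Tensor = op(%value)`.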

Test Plan:
`buck test //caffe2/test:jit -- TestPeephole`

**local.forward performance summary**
About 3% runtime benefit. All `DictConstruct` calls optimized out, `__getitem__` calls reduced significantly (~50% of them are cut out)
P438354810

**local_request_only.forward performance summary**
About 14% runtime benefit. Again, all `DictConstruct` calls optimized out, 50% `__getitem__` calls removed.
P438359742

Runtime measurements show some variance, so take these numbers with a grain of salt. Also note that the benefit does not exist in the shrunk model, since it has no `DictConstruct` calls.

Reviewed By: hlu1

Differential Revision: D29995087

fbshipit-source-id: f376376a46ff808115afd2d60446e5db8f6f752f

2 years agoFixing user inputs for low, high in `make_tensor` (#61108)
Kushashwa Ravi Shrimali [Fri, 13 Aug 2021 17:12:01 +0000 (10:12 -0700)]
Fixing user inputs for low, high in `make_tensor` (#61108)

Summary:
**TODOs:**

* [x] Do not clamp inputs for low and high when given and valid.
* [x] Devise rules for modifying `low` and `high` when extremals/invalid values passed.
* [x] Testing with `test_references_numerics_hard` with the revised changes. _(I've tested locally; after offline discussion with mruberry, the changes will land in a separate PR.)_
* [x] Revise comments/documentation for `make_tensor`

See https://github.com/pytorch/pytorch/issues/61758 for tracker issue.

cc: mruberry pmeier

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61108

Reviewed By: VitalyFedyunin

Differential Revision: D30296167

Pulled By: mruberry

fbshipit-source-id: 67e8d15b173209a9c97ca013231494a5fa99f8c7

2 years ago[hackathon] fix benchmarking script in CONTRIBUTING (#63199)
Natalia Gimelshein [Fri, 13 Aug 2021 16:49:15 +0000 (09:49 -0700)]
[hackathon] fix benchmarking script in CONTRIBUTING (#63199)

Summary:
[skip ci]
Per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63199

Reviewed By: mruberry

Differential Revision: D30305487

Pulled By: ngimel

fbshipit-source-id: 2704c4f08ab976a55c9f8c2fe54cd4f3f39412cf

2 years ago[codemod][lint][caffe2] Extend BLACK coverage
Andres Suarez [Fri, 13 Aug 2021 16:26:38 +0000 (09:26 -0700)]
[codemod][lint][caffe2] Extend BLACK coverage

Test Plan: Sandcastle

Reviewed By: zsol

Differential Revision: D30302716

fbshipit-source-id: f9724d4f4d1b8950f581cc2c6c77eedf19b4b6fc

2 years agoENH Adds no_batch_dim to FractionalMaxPool2d (#62490)
Thomas J. Fan [Fri, 13 Aug 2021 15:43:04 +0000 (08:43 -0700)]
ENH Adds no_batch_dim to FractionalMaxPool2d (#62490)

Summary:
Towards https://github.com/pytorch/pytorch/issues/60585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62490

Reviewed By: bdhirsh

Differential Revision: D30287143

Pulled By: jbschlosser

fbshipit-source-id: 1b9dd932157f571adf3aa2c98c3c6b56ece8fa6e

2 years ago[JIT] Add a flag to rethrow caught exception in jit interpreter (#63073)
Don Jang [Fri, 13 Aug 2021 15:37:23 +0000 (08:37 -0700)]
[JIT] Add a flag to rethrow caught exception in jit interpreter (#63073)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63073

It turned out that printing a verbose stack trace in exception messages is less than ideal in high-QPS services (see the related task) with a non-significant failure rate: long stack traces get truncated, which loses the original exception message thrown from native code. In such a use case it is actually desirable to retain only the message of the original exception thrown directly from native code.

This change adds a new flag, `torch_jit_disable_exception_stacktrace`, to the PyTorch JIT interpreter to suppress the stack trace in the messages of exceptions thrown from the interpreter.
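
The failure mode can be illustrated with a small Python sketch. Everything here is hypothetical: the function names, the 64-character limit, and the assumption that the stack trace precedes the original message are made up for illustration and do not reflect the interpreter's actual code.

```python
def format_interpreter_error(original_message, stacktrace, disable_stacktrace=False):
    """Hypothetical formatter: optionally prepend the stack trace."""
    if disable_stacktrace:
        # Keep only the message thrown from native code.
        return original_message
    # Assumption: the trace comes first and the original message last.
    return stacktrace + "\n" + original_message


def truncate(message, limit):
    """Stand-in for a logging pipeline that caps message length."""
    return message[:limit]


original = "RuntimeError: shape mismatch"
trace = "frame\n" * 100  # a long, verbose stack trace

# With the trace included, truncation cuts off the original message.
verbose = truncate(format_interpreter_error(original, trace), 64)
# With the trace suppressed, the original message survives truncation.
concise = truncate(format_interpreter_error(original, trace, disable_stacktrace=True), 64)
```

In this sketch `verbose` contains only stack frames, while `concise` is exactly the original message.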

Reviewed By: Krovatkin

Differential Revision: D30241792

fbshipit-source-id: c340225c69286663cbd857bd31ba6f1736b1ac4c

2 years agoPort `norm` kernel to structured kernels. (#62711)
Yukio Siraichi [Fri, 13 Aug 2021 15:20:19 +0000 (08:20 -0700)]
Port `norm` kernel to structured kernels. (#62711)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62711

Tracking issue: #55070

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D30109866

Pulled By: ezyang

fbshipit-source-id: 894c9496894d059c7690a174b75bbd4db7ed6016

2 years agoPort `prod` kernel to structured kernels. (#62024)
Yukio Siraichi [Fri, 13 Aug 2021 15:20:19 +0000 (08:20 -0700)]
Port `prod` kernel to structured kernels. (#62024)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62024

Tracking issue: #55070

In this PR, I also broke down the meta functions of other reduction kernels (e.g. `all`,
`argmax`, `sum`) into the composition of common patterns.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D29847122

Pulled By: ezyang

fbshipit-source-id: a6680a6cf6e59bb46b8ffe7bf2a3a611d6e0fd14

2 years agoPort `mean` kernel to structured kernels. (#61643)
Yukio Siraichi [Fri, 13 Aug 2021 15:20:19 +0000 (08:20 -0700)]
Port `mean` kernel to structured kernels. (#61643)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61643

Tracking issue: #55070

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D29783866

Pulled By: ezyang

fbshipit-source-id: dc95baf593096c03fb5f292ee6c36de3cc7f2b35

2 years agoRemove req to call step() in training loop (#63164)
Andrew Gu [Fri, 13 Aug 2021 15:19:23 +0000 (08:19 -0700)]
Remove req to call step() in training loop (#63164)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63164

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D30284616

Pulled By: andwgu

fbshipit-source-id: afdb677fb08851b139178a9f6d782196f26773e1

2 years agoPass `_allow_empty_param_list` into func opt ctor (#63163)
Andrew Gu [Fri, 13 Aug 2021 15:19:23 +0000 (08:19 -0700)]
Pass `_allow_empty_param_list` into func opt ctor (#63163)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63163

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D30284615

Pulled By: andwgu

fbshipit-source-id: 4857f5b618ec5b007648737ab532ce605e5d70dc

2 years agoSimplify data structures, add uniform approximation, fix mem leak (#63162)
Andrew Gu [Fri, 13 Aug 2021 15:19:23 +0000 (08:19 -0700)]
Simplify data structures, add uniform approximation, fix mem leak (#63162)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63162

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D30284617

Pulled By: andwgu

fbshipit-source-id: 9bd9e5f89abcc0d3dac56b85d55cc88e843baa9f

2 years ago[docs][ao] update quantize_per_tensor to mention overloads (#63165)
Supriya Rao [Fri, 13 Aug 2021 14:58:38 +0000 (07:58 -0700)]
[docs][ao] update quantize_per_tensor to mention overloads (#63165)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63165

Add details about the overloads for
* list of tensors input
* supporting tensor scale/zero-point inputs

Test Plan:
CI

Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D30291045

fbshipit-source-id: 9fc6418792c5e3a35417eeb8d31de4a4bfcbb7a5

2 years agoMake saved tensors default hooks thread local (#62909)
Victor Quach [Fri, 13 Aug 2021 14:47:12 +0000 (07:47 -0700)]
Make saved tensors default hooks thread local (#62909)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62909

This PR makes saved tensors default hooks thread local.
This allows using default hooks in a multithreaded context.
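
The difference between global and thread-local hook state can be sketched with Python's `threading.local`. This is only an analogy for the behavior described above, not PyTorch's implementation; the names `set_hooks_for_this_thread`, `get_hooks_for_this_thread`, and `run_in_thread` are hypothetical.

```python
import threading

# Each thread sees its own independent attributes on this object.
_state = threading.local()


def set_hooks_for_this_thread(pack_hook, unpack_hook):
    """Register a hook pair that is visible only to the calling thread."""
    _state.hooks = (pack_hook, unpack_hook)


def get_hooks_for_this_thread():
    """Return the calling thread's hooks, or None if it never set any."""
    return getattr(_state, "hooks", None)


def run_in_thread(fn):
    """Run fn in a fresh thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(fn()))
    worker.start()
    worker.join()
    return result[0]


# Hooks set on the main thread...
set_hooks_for_this_thread(lambda t: t, lambda t: t)
main_hooks = get_hooks_for_this_thread()
# ...are not visible from another thread.
worker_hooks = run_in_thread(get_hooks_for_this_thread)
```

With state stored this way, one thread registering or clearing its hooks cannot interfere with another thread's setup, which is what makes the hooks usable in a multithreaded context.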

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30165416

Pulled By: Varal7

fbshipit-source-id: 10a7d580661d3d94bdaf398c4e076b7bea11c16b

2 years agoAllow 0-dim batch sizes for AdaptiveMaxPool and MaxPool. (#62088)
Sameer Deshmukh [Fri, 13 Aug 2021 14:31:42 +0000 (07:31 -0700)]
Allow 0-dim batch sizes for AdaptiveMaxPool and MaxPool. (#62088)

Summary:
This PR fixes part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.

It allows `MaxPool` and `AdaptiveMaxPool` to accept tensors whose batch size is 0. Some changes have been made to modernize the tests so that they show the name of the C++ function that throws an error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62088

Reviewed By: bdhirsh

Differential Revision: D30281285

Pulled By: jbschlosser

fbshipit-source-id: 52bffc67bfe45a78e11e4706b62cce1469eba1b9