platform/upstream/pytorch.git
2 years agoCI: Consolidate Build and Test naming for better stats collection (#65232)
Jane Xu [Fri, 17 Sep 2021 19:32:11 +0000 (12:32 -0700)]
CI: Consolidate Build and Test naming for better stats collection (#65232)

Summary:
All build pytorch steps should now be named "Build" and test steps named "Test" for workflows that test PyTorch on Linux and Windows.

I left the binary stuff alone as that build is different.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65232

Reviewed By: driazati, seemethere

Differential Revision: D31024232

Pulled By: janeyx99

fbshipit-source-id: 24b1a1e2b1b25aba70b7adc41603ec8fa4ce7dd6

2 years agoBack out "Revert D30745960: [DDP] Remove SPMD from self.modules_buffers" (#64778)
Rohan Varma [Fri, 17 Sep 2021 19:23:09 +0000 (12:23 -0700)]
Back out "Revert D30745960: [DDP] Remove SPMD from self.modules_buffers" (#64778)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64778

Original commit changeset: d3f3fb813c45
ghstack-source-id: 138326910

Test Plan: ci

Reviewed By: H-Huang

Differential Revision: D30849443

fbshipit-source-id: 15dab8a959a29d2e2fefac6ad52b8d8168eacc02

2 years agoBack out "Revert D30745961: [DDP] Remove self.modules_params" (#64777)
Rohan Varma [Fri, 17 Sep 2021 19:23:09 +0000 (12:23 -0700)]
Back out "Revert D30745961: [DDP] Remove self.modules_params" (#64777)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64777

Original commit changeset: 59f7cc50d369
ghstack-source-id: 138326909

Test Plan: ci

Reviewed By: H-Huang

Differential Revision: D30849442

fbshipit-source-id: bb87ba83935374d8a3ebbc29365df1417dd4f26f

2 years agoBack out "Revert D30745921: [DDP] Fix when buffers are reassigned in module" (#64776)
Rohan Varma [Fri, 17 Sep 2021 19:23:09 +0000 (12:23 -0700)]
Back out "Revert D30745921: [DDP] Fix when buffers are reassigned in module" (#64776)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64776

Original commit changeset: 343ead86bf1e
ghstack-source-id: 138326914

Test Plan: ci

Reviewed By: H-Huang

Differential Revision: D30849444

fbshipit-source-id: 9a72805416fe7d6c68e51bdcdb88f6e1fecb614d

2 years ago[xplat][pytorch]: Disabling too many logging. (#65170)
Sangbaek Park [Fri, 17 Sep 2021 19:16:10 +0000 (12:16 -0700)]
[xplat][pytorch]: Disabling too many logging. (#65170)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65170

Disabling too many logging. These are per frame logging
and outputting lots of logs in Skylight command line.

Test Plan:
```
cd ~/fbsource
buck build -c ndk.custom_libcxx=false -c pt.enable_qpl=0 //xplat/caffe2:pt_vulkan_api_test_binAndroid\#android-arm64 --show-output
adb push buck-out/gen/xplat/caffe2/pt_vulkan_api_test_binAndroid\#android-arm64 /data/local/tmp/vulkan_api_test
adb shell "/data/local/tmp/vulkan_api_test"
cd -
```

Reviewed By: SS-JIA

Differential Revision: D30778852

fbshipit-source-id: bcf75ec417dfe3e9ce3df92a1894352772bd663d

2 years agodelegate parallelism to Ninja when possible (#64733)
Michael Dagitses [Fri, 17 Sep 2021 19:07:19 +0000 (12:07 -0700)]
delegate parallelism to Ninja when possible (#64733)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64733

The previous implementation was wrong when CPU scheduling affinity is
set. In fact, it is still wrong if Ninja is not being used.

When there is CPU scheduling affinity set, the number of processors
available on the system likely exceeds the number of processors that
are usable to the build. We ought to use
`len(os.sched_getaffinity(0))` to determine the effective parallelism.

This change is more minimal and instead just delegates to Ninja (which
handles this correctly) when it is used.

Test Plan:
I verified this worked as correctly using Ninja on a 96-core machine
with 24 cores available for scheduling by checking:
 * the cmake command did not specify "-j"
 * the number of top-level jobs in top/pstree never exceeded 26 (24 +
   2)

And I verified we get the legacy behavior by specifying USE_NINJA=0 on
the build.

Reviewed By: jbschlosser, driazati

Differential Revision: D30968796

Pulled By: dagitses

fbshipit-source-id: 29547dd378fea793957bcc2f7d52d5def1ecace2

2 years agoadd test for number of jobs when building (#65162)
Michael Dagitses [Fri, 17 Sep 2021 19:07:19 +0000 (12:07 -0700)]
add test for number of jobs when building (#65162)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65162

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30998006

Pulled By: dagitses

fbshipit-source-id: 8b8d45668acf0e6c0f16df0f705a1af8c6d4f22d

2 years agoRemove CUDA 9.2 references conditionals and workarounds (#65070)
Jane Xu [Fri, 17 Sep 2021 18:45:11 +0000 (11:45 -0700)]
Remove CUDA 9.2 references conditionals and workarounds (#65070)

Summary:
Title says it all

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65070

Reviewed By: malfet

Differential Revision: D30966464

Pulled By: janeyx99

fbshipit-source-id: e454906fd5d7d321d390939ba5d237e1d9b150f8

2 years agofix torch.distributed.elastic event docs (#64974)
edward-io [Fri, 17 Sep 2021 18:34:36 +0000 (11:34 -0700)]
fix torch.distributed.elastic event docs (#64974)

Summary:
the example code wasn't working for me.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang cbalioglu gcramer23

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64974

Reviewed By: kiukchung, cbalioglu

Differential Revision: D30926481

Pulled By: edward-io

fbshipit-source-id: f5e32cc2b948b6ee30d84a8247856f39fc786f67

2 years ago[nnc] Updated inlining to handle cases when producer indices are constants after...
Raghavan Raman [Fri, 17 Sep 2021 17:50:43 +0000 (10:50 -0700)]
[nnc] Updated inlining to handle cases when producer indices are constants after eval (#65044)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65044

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30954655

Pulled By: navahgar

fbshipit-source-id: dfaedb5af710b2625ceec3a443a6c4e34158ab16

2 years ago[nnc] Updated inliner to remove assertions and exception (#64719)
Raghavan Raman [Fri, 17 Sep 2021 17:50:43 +0000 (10:50 -0700)]
[nnc] Updated inliner to remove assertions and exception (#64719)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64719

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30828583

Pulled By: navahgar

fbshipit-source-id: 9826a59085a210e44d101a843ff2cae440dfd633

2 years ago[ONNX] Do not use `numpy` in ONNX opsets (#65188)
Nikita Shulga [Fri, 17 Sep 2021 17:32:26 +0000 (10:32 -0700)]
[ONNX] Do not use `numpy` in ONNX opsets (#65188)

Summary:
Replace `torch.tensor([numpy.arange(a, b, c)])` with `torch.arange(a, b, c).unsqueeze(0)`
Replace `tuple(numpy.add(a, b))` with `tuple( x + y for (x, y) in zip(a, b)`

As `numpy` is an optional dependency, it shouldn't be used in PyTorch core by default

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65188

Reviewed By: mruberry

Differential Revision: D31009490

Pulled By: malfet

fbshipit-source-id: 528e48f055bf9ac1de1fd7e94c0be41915df9a0b

2 years ago[CoreML][OSS] Include Core ML in iOS/MacOS nightlies (#65075)
Tao Xu [Fri, 17 Sep 2021 17:31:18 +0000 (10:31 -0700)]
[CoreML][OSS] Include Core ML in iOS/MacOS nightlies (#65075)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65075

Need to drop one line at - https://github.com/pytorch/builder/blob/master/conda/pytorch-nightly/meta.yaml#L65
ghstack-source-id: 138324213

Test Plan:
- Check the iOS nightly builds
  - `pod install LibTorch-Lite-Nightly`

Reviewed By: hanton

Differential Revision: D30912269

fbshipit-source-id: b07679b75ecf89beae2975c37cf17d2449a3304f

2 years agoadd a test case for const fold (#65224)
Shiyan Deng [Fri, 17 Sep 2021 17:24:11 +0000 (10:24 -0700)]
add a test case for const fold (#65224)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65224

Add a test case for the fix D30996277 (https://github.com/pytorch/pytorch/commit/8c38d141df429459ea6891847950ce157ac82b2c).

Test Plan: buck test mode/opt -c python.package_style=inplace -c fbcode.nvcc_arch=v100,a100 -c fbcode.enable_gpu_sections=true -j 40 caffe2/test:fx_const_fold -- test_const_fold_module_attr

Reviewed By: jfix71

Differential Revision: D31000386

fbshipit-source-id: f444361839decc583bf93ac946cfe2049376719e

2 years ago[PyTorchEdge] promote prim ops by using ops table for mobile runtime (#64816)
Pavithran Ramachandran [Fri, 17 Sep 2021 17:22:41 +0000 (10:22 -0700)]
[PyTorchEdge] promote prim ops by using ops table for mobile runtime (#64816)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64816

## Context:
Promoting prim ops:
Certain prim ops are frequent than others (like tupleIndex, raiseException, ...). These ops are frequent that they are chosen to be promoted as first class instructions. To promote it requires multiple steps and support from TS team as it changes how the bytecode is serialized and deserialized. So to prevent multiple bytecode version bumps and provided stability while these changes happen, an iterim iterative process is proposed which uses a table to lookup for "promoted" op's function. This allows us to rapidly update the ops list and test on production model without having to change the bytecode. In case of failure, we can quickly revert this change.

## Observation
The ops are chosen based on the notebook N1135657 which examines the top frequent ops.

## Fix
An iterim solution of having a static table, which when given a prim op name returns a function to be applied on the stack. This helps us check in `function.cpp` to get the "promoted" op. As a fall back, the "promoted" op still resides in `register_prim_ops.cpp` so that the function of prim op is never missed.

ghstack-source-id: 138261338

Test Plan:
```
[pavithran@67109.od ~/fbsource/fbcode (eddab7da6)]$ buck test caffe2/test/cpp/jit:jit -- BackendTest.TestComposite
Building: finished in 5.4 sec (100%) 7284/7284 jobs, 0/7284 updated
  Total time: 5.8 sec
More details at https://www.internalfb.com/intern/buck/build/480191aa-a1ba-42ca-99e9-ee4bf2b06d65
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 867382eb-327f-43d7-a45c-875b7f484b15
Trace available for this run at /tmp/tpx-20210914-100224.283682/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/844425134506115
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (12.159)
    ✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestCompositeWithSetStates (0.797)
    ✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestComposite (0.779)
Summary
  Pass: 2
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/844425134506115
```

{F663491347}

Reviewed By: iseeyuan

Differential Revision: D30819926

fbshipit-source-id: 4cbe05d5761bdc9d62ef08e18172dcf64cb49526

2 years agoRevert D30993855: [pytorch][PR] OpInfo: nn.functional.conv2d
Michael Suo [Fri, 17 Sep 2021 17:21:43 +0000 (10:21 -0700)]
Revert D30993855: [pytorch][PR] OpInfo: nn.functional.conv2d

Test Plan: revert-hammer

Differential Revision:
D30993855 (https://github.com/pytorch/pytorch/commit/873255c6d95342d144e9d1b633c16410844b934e)

Original commit changeset: 7402f99addb4

fbshipit-source-id: b0539daa195dc6a3739bce5c264cb2177b7721ff

2 years ago[CoreML][OSS] Integrate with CMake (#64523)
Tao Xu [Fri, 17 Sep 2021 17:14:40 +0000 (10:14 -0700)]
[CoreML][OSS] Integrate with CMake (#64523)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64523

- Build Pytorch with CoreML delegate - ` USE_PYTORCH_METAL=ON python setup.py install --cmake`
- Build iOS static libs - `IOS_PLATFORM=SIMULATOR USE_COREML_DELEGATE=1  ./scripts/build_ios.sh`
ghstack-source-id: 138324216

Test Plan:
- Test the Helloword example

{F657778559}

Reviewed By: iseeyuan

Differential Revision: D30594041

fbshipit-source-id: 8cece0b2d4b3ef82d3ef4da8c1054919148beb16

2 years ago[Reland] [Model Averaging] Simplify PostLocalSGD Optimizer API (#65197)
Yi Wang [Fri, 17 Sep 2021 17:00:13 +0000 (10:00 -0700)]
[Reland] [Model Averaging] Simplify PostLocalSGD Optimizer API (#65197)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65197

1. The constructor accepts a local optimizer instance instead of the inputs of local optimizer constructor and the class type.
2. The parameters are read from local optimizer's param_groups instead of a separate input.

Proposal: https://github.com/pytorch/pytorch/issues/59699
ghstack-source-id: 138307226

Test Plan: buck test mode/dev-nosan //caffe2/test/distributed:distributed_nccl_spawn -- test_post_localSGD_optimizer_parity

Reviewed By: rohan-varma

Differential Revision: D31007439

fbshipit-source-id: bbb0526e6763ef76775b85088571506b3942c722

2 years agoBf16 matmul (#64619)
haozhe.zhu [Fri, 17 Sep 2021 16:52:47 +0000 (09:52 -0700)]
Bf16 matmul (#64619)

Summary:
Re-create PR to fix https://github.com/pytorch/pytorch/pull/61891.

Drop the support for addbmm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64619

Reviewed By: jbschlosser

Differential Revision: D30902995

Pulled By: VitalyFedyunin

fbshipit-source-id: dc318d73adff8f6974c9752d0d097e69276f8206

2 years agoTorchhub: rewrite commit hash check to avoid using unnecessary GitHub API credits...
Nicolas Hug [Fri, 17 Sep 2021 16:49:46 +0000 (09:49 -0700)]
Torchhub: rewrite commit hash check to avoid using unnecessary GitHub API credits (#64362)

Summary:
This PR adds more detailed error messages to torchhub if the commit hash validation goes wrong, providing suggestions to the users on how to resolve the issue.

It also documents why such validation is important.

EDIT: it also avoids validatating some stuff when we know "stuff" isn't a commit since there's no risk in this case

CC malfet mthrok

cc nairbv NicolasHug

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64362

Reviewed By: gchanan, malfet

Differential Revision: D30731191

Pulled By: NicolasHug

fbshipit-source-id: d1ee7c2ef2591dd7a5291977af1635ada2552d1b

2 years ago[FX] Ensure BC coverage for all of torch.fx.passes (#65081)
James Reed [Fri, 17 Sep 2021 16:26:37 +0000 (09:26 -0700)]
[FX] Ensure BC coverage for all of torch.fx.passes (#65081)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65081

Test Plan: Imported from OSS

Reviewed By: jbschlosser, khabinov

Differential Revision: D30967428

Pulled By: jamesr66a

fbshipit-source-id: 2ff83da728dc469f086cf504e71b43396db612d8

2 years ago[FX] Move graph_manipulation and param_fetch out of experimental and into passes...
James Reed [Fri, 17 Sep 2021 16:26:37 +0000 (09:26 -0700)]
[FX] Move graph_manipulation and param_fetch out of experimental and into passes (#65183)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65183

ghstack-source-id: 138309655

Test Plan: waitforsadcastle

Reviewed By: protonu

Differential Revision: D31007630

fbshipit-source-id: 77d14b284737aabbe2b9e6394177a0c2e40aafba

2 years ago[fx2trt] make gpu trace better (#65168)
Shiyan Deng [Fri, 17 Sep 2021 16:23:05 +0000 (09:23 -0700)]
[fx2trt] make gpu trace better (#65168)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65168

Add record_function to TRTModule and EngineHolder so each parts would appear on gpu trace.

Test Plan: CI

Reviewed By: wushirong

Differential Revision: D30997968

fbshipit-source-id: b90662f20a8c0d321846c222f3e8c8eb7e010eba

2 years ago[CoreML][iOS/MacOS] Add the CoreML executor (#64522)
Tao Xu [Fri, 17 Sep 2021 16:16:39 +0000 (09:16 -0700)]
[CoreML][iOS/MacOS] Add the CoreML executor (#64522)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64522

The `PTMCoreMLExecutor` serves as a bridge between the delegate APIs and Core ML runtime.
ghstack-source-id: 138324217

allow-large-files

Test Plan:
iOS:
Run the CoreML tests in the playground app

MacOS:

```
buck test pp-macos

PASS     633ms  1 Passed   0 Skipped   0 Failed   CoreMLTests
```

{F657776101}

Reviewed By: raziel, iseeyuan

Differential Revision: D30594042

fbshipit-source-id: a42a5307a24c2f364333829f3a84f7b9a51e1b3e

2 years agoAllow extra unused arguments in symbolic shape function (#65095)
Elias Ellison [Fri, 17 Sep 2021 15:32:05 +0000 (08:32 -0700)]
Allow extra unused arguments in symbolic shape function (#65095)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65095

The reason I didn't do this initially was because I was worried that matching one schema to another schema with an extra argument might change semantics, e.g. Add(Tensor, Tensor) to Add(Tensor, Tensor, Tensor) might be different. However we don't actually need to worry about this because the graph schema isn't used for node matching, unlike symbolic_script.cpp

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30972081

Pulled By: eellison

fbshipit-source-id: d4089e8feafc330df2ca158866fe779a7da0b073

2 years agoActually deprecate __torch_function__ as plain methods (#64843)
albanD [Fri, 17 Sep 2021 15:01:33 +0000 (08:01 -0700)]
Actually deprecate __torch_function__ as plain methods (#64843)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64843

Fix for https://github.com/pytorch/pytorch/issues/63767

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D30991425

Pulled By: albanD

fbshipit-source-id: 1214143b8aea87e6ff406c7fc13096bd15d1a768

2 years agoUpdate fx proxy to use classmethod for __torch_function__ (#64842)
albanD [Fri, 17 Sep 2021 15:01:33 +0000 (08:01 -0700)]
Update fx proxy to use classmethod for __torch_function__ (#64842)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64842

Change the `__torch_function__` to follow best guidelines of using classmethods.
I am not sure how to handle the case where multiple tracer objects are given as input but given that before we were getting an arbitrary tracer from there based on the "self" that was arbitrarily chosen by the torch_function caller, the new implementation is not worst?
Let me know what you think!

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D30991423

Pulled By: albanD

fbshipit-source-id: d28940df230b543952b278a0eb2d61cf7ae123ce

2 years agoUse classmethods for overrides (#64841)
albanD [Fri, 17 Sep 2021 15:01:33 +0000 (08:01 -0700)]
Use classmethods for overrides (#64841)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64841

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D30991424

Pulled By: albanD

fbshipit-source-id: 551e2119768f3a4292713f3bfa83930f5506adbd

2 years agoFix port allocation race condition for elastic test (#65149)
Howard Huang [Fri, 17 Sep 2021 14:55:01 +0000 (07:55 -0700)]
Fix port allocation race condition for elastic test (#65149)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65149

Fixes #64789

There is a race condition between when the free port is acquired to when it is used to create the store in which it may have been used. Since this test only tests that timeout is triggered for tcpstore, we can bind to any port on tcpstore creation.

This only affects the test on the server (since that is where the port is used), but I changed both tests for clarity

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang cbalioglu gcramer23

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D30993166

Pulled By: H-Huang

fbshipit-source-id: eac4f28d641ac87c4ebee89df83f90955144f2f1

2 years agoSmall improvements to compare_models_torch binary (#65171)
Stephen Jia [Fri, 17 Sep 2021 14:48:04 +0000 (07:48 -0700)]
Small improvements to compare_models_torch binary (#65171)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65171

Add the model comparison binary to BUCK, and also add some quality of life features such as controlling the input range.

Test Plan:
```
# Build the binary
cd ~/fbsource
buck build -c ndk.custom_libcxx=false -c pt.enable_qpl=0 //xplat/caffe2:ptmobile_compareAndroid\#android-arm64 --show-ou
# Push it to the device
adb push buck-out/gen/xplat/caffe2/ptmobile_compareAndroid\#android-arm64 /data/local/tmp/compare_models

# Run the benchmark binary
BENCH_CMD="/data/local/tmp/compare_models"
BENCH_CMD+=" --model=$PATH_TO_MODEL"
BENCH_CMD+=" --refmodel=$PATH_TO_REFERENCE_MODEL"
BENCH_CMD+=" --input_type=float --input_dims=$MODEL_INPUT_SIZE"
BENCH_CMD+=" --iter=100"
BENCH_CMD+=" --tolerance 1e-5"

```

Reviewed By: beback4u

Differential Revision: D30371322

fbshipit-source-id: 5e520aaf119c90985a1d5a135f76e4057148333b

2 years agoDisable autograd fallback tests on Windows (#65147)
Edward Yang [Fri, 17 Sep 2021 14:40:59 +0000 (07:40 -0700)]
Disable autograd fallback tests on Windows (#65147)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65147

I think they trigger an MSVC bug per https://github.com/pytorch/pytorch/issues/48763
ghstack-source-id: 138247203

Test Plan: breakpointed https://www.internalfb.com/intern/sandcastle/job/9007199738584981/ and sush'ed into the host and ran `buck build arvr/mode/win/opt //xplat/caffe2:autograd_libtorch_test_ovrsource` in `/cygdrive/d/ovrsource-null-hg`

Reviewed By: soulitzer

Differential Revision: D30992685

fbshipit-source-id: 06c6fb2c18d55490f89fc91ee5b7a4c5a7faf1c6

2 years agoimplement "xy" indexing for torch.meshgrid (#62724)
Michael Dagitses [Fri, 17 Sep 2021 14:32:32 +0000 (07:32 -0700)]
implement "xy" indexing for torch.meshgrid (#62724)

Summary:
This is step 4/7 of https://github.com/pytorch/pytorch/issues/50276. This allows the use of `"xy"` indexing but doesn't change any defaults.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62724

Reviewed By: heitorschueroff

Differential Revision: D30995290

Pulled By: dagitses

fbshipit-source-id: 08a6a6144b20bc019f68bc3c52e3bbf967976d8f

2 years agoAllow parametrization to be nested (#65167)
Alban Desmaison [Fri, 17 Sep 2021 13:28:41 +0000 (06:28 -0700)]
Allow parametrization to be nested (#65167)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/65163

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65167

Reviewed By: jbschlosser

Differential Revision: D31002318

Pulled By: albanD

fbshipit-source-id: b1f1c6c9efa9e83af9789ed13efc133f777f418e

2 years agoPass GITHUB_TOKEN to linux CI jobs and avoid skipping torchhub tests (#64807)
Nicolas Hug [Fri, 17 Sep 2021 10:27:23 +0000 (03:27 -0700)]
Pass GITHUB_TOKEN to linux CI jobs and avoid skipping torchhub tests (#64807)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/64760

This should hopefully put the torchhub tests back.

This also avoids skipping the torchhub tests: currently the tests are skipped if they fail, which pretty much defeats the purpose of having a test in the first place since we're never notified when they do fail.

cc ezyang seemethere malfet lg20987 pytorch/pytorch-dev-infra nairbv NicolasHug

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64807

Reviewed By: seemethere

Differential Revision: D30994585

Pulled By: NicolasHug

fbshipit-source-id: 561782c22462b5cfec99cca153eb59623db5660a

2 years ago[CoreML][fbcode] Add the `preprocess` python APIs (#64521)
Tao Xu [Fri, 17 Sep 2021 07:19:36 +0000 (00:19 -0700)]
[CoreML][fbcode] Add the `preprocess` python APIs (#64521)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64521

Add the preprocess part for the coreml delegate. Check out the `example.py` for the usage.
ghstack-source-id: 138324214

Test Plan:
```
(base) [taox@devvm2780.vll0 ~/fbsource/fbcode/caffe2/fb]  buck run coreml:example -- --model="/home/taox/mobilenetv2/mobilenetv2.pt" --out="/home/taox/mobilenetv2/mobilenetv2_coreml.pt"
Parsing buck files: finished in 0.5 sec
Downloaded 0/1 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 10.6 sec (100%) 12611/57623 jobs, 1/57623 updated
  Total time: 11.1 sec
Converting Frontend ==> MIL Ops: 100%|██████████████████████████████████████████▉| 382/383 [00:00<00:00, 692.58 ops/s]
Running MIL optimization passes: 100%|███████████████████████████████████████████| 18/18 [00:00<00:00, 45.55 passes/s]
Translating MIL ==> MLModel Ops: 100%|███████████████████████████████████████████| 704/704 [00:01<00:00, 468.56 ops/s]
input {
  name: "input_0"
  type {
    multiArrayType {
      shape: 1
      shape: 3
      shape: 224
      shape: 224
      dataType: FLOAT32
    }
  }
}
output {
  name: "645"
  type {
    multiArrayType {
      dataType: FLOAT32
    }
  }
}
metadata {
  userDefined {
    key: "com.github.apple.coremltools.source"
    value: "torch==1.10.0a0+fb"
  }
  userDefined {
    key: "com.github.apple.coremltools.version"
    value: "4.1"
  }
}

{'inputs': '[["input_0", "0", "[1, 3, 224, 224]"]]', 'outputs': '[["645", "0", "[1, 1000]"]]', 'config': '{"spec_ver": "4", "backend": "cpu", "allow_low_precision": "True"}', 'metadata': '{"coremltool_ver": "4.1", "torch_ver": "torch==1.10.0a0+fb"}'}
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0826 13:27:12.690302 2477051 backend_detail.cpp:376] Warning: Backend [coreml] is not available. Execution of this Module is still possible by saving and loading on a device where the backend is available. (function codegen_backend_module)
graph(%self.1 : torch.jit.LoweredModule.coreml.__torch__.torchvision.models.mobilenetv2.MobileNetV2,
      %x.1 : Tensor):
  %51 : str = prim::Constant[value="Exception: Backend is not available."]()
  %50 : str = prim::Constant[value="AssertionError: "]()
  %14 : str = prim::Constant[value="forward"]() # <string>:5:62
  %48 : Tensor = prim::Uninitialized()
  %44 : Tensor = prim::Uninitialized()
  %typed_inputs.1 : Any[] = prim::ListConstruct(%x.1)
  %__backend.3 : __torch__.torch.classes.__backends__.coreml = prim::GetAttr[name="__backend"](%self.1)
  %8 : bool = prim::CallMethod[name="is_available"](%__backend.3) # <string>:4:19
  %49 : Tensor = prim::If(%8) # <string>:4:16
    block0():
      %__backend : __torch__.torch.classes.__backends__.coreml = prim::GetAttr[name="__backend"](%self.1)
      %__handles : Dict(str, Any) = prim::GetAttr[name="__handles"](%self.1)
      %15 : Any = aten::__getitem__(%__handles, %14) # <string>:5:47
      %17 : Any[] = prim::CallMethod[name="execute"](%__backend, %15, %typed_inputs.1) # <string>:5:24
      %18 : Any = prim::ListUnpack(%17)
      %20 : bool = prim::isinstance[types=[Tensor]](%18)
      %39 : Tensor = prim::If(%20) # <string>:6:18
        block0():
          %22 : Tensor = prim::unchecked_cast(%18)
          -> (%22)
        block1():
           = prim::RaiseException(%50) # <string>:6:18
          -> (%44)
      -> (%39)
    block1():
       = prim::RaiseException(%51) # <string>:9:18
      -> (%48)
  return (%49)

```

Reviewed By: raziel

Differential Revision: D30585154

fbshipit-source-id: 66c7d2e931be6eaa3c43a0ee131ea8046452449d

2 years ago[Static Runtime] Introduce static_runtime::dict_unpack (#64771)
Don Jang [Fri, 17 Sep 2021 05:32:23 +0000 (22:32 -0700)]
[Static Runtime] Introduce static_runtime::dict_unpack (#64771)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64771

Test Plan:
- Added `StaticRuntime.RemoveImmutableInputDictLookupsWithImmutableInputDict`
- Added `StaticRuntime.RemoveImmutableInputDictLookupsWithMutableInputDict`
- TBD: Perf impact measurement

Reviewed By: mikeiovine

Differential Revision: D30685083

fbshipit-source-id: 050a92ef3b3ed0fdc0ab7a13a4b5dbfede9342a9

2 years ago[ONNX] Update submodule to 1.10.1 (#63716) (#64576)
BowenBao [Fri, 17 Sep 2021 04:39:10 +0000 (21:39 -0700)]
[ONNX] Update submodule to 1.10.1 (#63716) (#64576)

Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **https://github.com/pytorch/pytorch/issues/64576 [ONNX] Update submodule to 1.10.1 (https://github.com/pytorch/pytorch/issues/63716)**

* [ONNX] Update IR version to 7

* [ONNX] update submodule to 1.10.1

* Disable some tests in caffe2 that fail b/c caffe2 doesn't support the
  new ops.
* Update Bazel file.

* Update expect files for new ONNX IR version

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64576

Reviewed By: jansel

Differential Revision: D31006896

Pulled By: msaroufim

fbshipit-source-id: f3bf97709f23a5a2cd49c708e7363231f2c1961a

2 years ago[FX} Add torch.ops.profiler._record_function_{enter,exit} as stateful ops for DCE...
James Reed [Fri, 17 Sep 2021 03:31:03 +0000 (20:31 -0700)]
[FX} Add torch.ops.profiler._record_function_{enter,exit} as stateful ops for DCE (#65180)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65180

Test Plan: Imported from OSS

Reviewed By: jansel

Differential Revision: D31007115

Pulled By: jamesr66a

fbshipit-source-id: 823b15db712a382a4f2a4fd409983d47bc067150

2 years ago[quant] AO migration of the `torch/quantization/utils.py` (phase 1) (#64919)
Zafar Takhirov [Fri, 17 Sep 2021 03:29:05 +0000 (20:29 -0700)]
[quant] AO migration of the `torch/quantization/utils.py` (phase 1) (#64919)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64919

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly. This migrates the quantization utilities.
ghstack-source-id: 138303325

Test Plan: `buck test mode/dev //caffe2/test:quantization`

Reviewed By: jerryzh168

Differential Revision: D30899082

fbshipit-source-id: 85eb38c419e417147e71758b682cd095308dd0c9

2 years ago[acc_utils] Add print_model_info (#65045)
Jordan Fix [Fri, 17 Sep 2021 02:55:46 +0000 (19:55 -0700)]
[acc_utils] Add print_model_info (#65045)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65045

This is a useful tool for printing out all of the ops that are found in a model after acc_tracer. It assumes the provided model has no `call_module` or `call_method`, which is generally a reasonable assumption assuming a model has been successfully traced by the acc_tracer.

Test Plan:
Tested locally. Sample output:
```
Model Info:
> placeholder: 1184
> get_attr: 655
> output: 2
> torch.fx.experimental.fx_acc.acc_ops.add: 2
> torch.fx.experimental.fx_acc.acc_ops.cat: 23
> torch.fx.experimental.fx_acc.acc_ops.embedding_bag: 576
> torch.fx.experimental.fx_acc.acc_ops.layer_norm: 15
> torch.fx.experimental.fx_acc.acc_ops.linear: 27
> torch.fx.experimental.fx_acc.acc_ops.matmul: 3
> torch.fx.experimental.fx_acc.acc_ops.mul: 17
> torch.fx.experimental.fx_acc.acc_ops.permute: 2
> torch.fx.experimental.fx_acc.acc_ops.reshape: 419
> torch.fx.experimental.fx_acc.acc_ops.sigmoid: 16
> torch.fx.experimental.fx_acc.acc_ops.slice_tensor: 630
> torch.fx.experimental.fx_acc.acc_ops.sum: 4
> torch.fx.experimental.fx_acc.acc_ops.tanh: 315
```

Reviewed By: 842974287

Differential Revision: D30954829

fbshipit-source-id: 5c4f0770667b72859b74099d9f4575284fc48bd2

2 years agoAdd back the owning_module fix (#65159)
Yinghai Lu [Fri, 17 Sep 2021 02:26:36 +0000 (19:26 -0700)]
Add back the owning_module fix (#65159)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65159

This was a legit fix originally introduced in D30905949 (https://github.com/pytorch/pytorch/commit/446d95a7f64cb464d28d27c4c87c48900a9fde79). But we hesitated and removed it for some reason. Putting it back.

Reviewed By: 842974287

Differential Revision: D30996277

fbshipit-source-id: 3f5eede11dba2072e7cd5ae6ca7ac81d55fb75fa

2 years agoAdd dropout shape inference as no-op in acc_tracer (#65113)
Rui Zhu [Fri, 17 Sep 2021 01:08:00 +0000 (18:08 -0700)]
Add dropout shape inference as no-op in acc_tracer (#65113)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65113

Register dropout as no-op in acc_tracer & Add shape inference for no-op

Test Plan:
buck test glow/fb/fx/acc_tracer:test_acc_shape_inference --- test_unary_15_dropout_no_op
buck test glow/fb/fx/oss_acc_tracer:test_acc_tracer -- test_dropout

Reviewed By: jfix71

Differential Revision: D30880679

fbshipit-source-id: 592fe50e17137c94c12727658191dedf08daf8cf

2 years agoPin SciPy to 1.6.2 on Windows (#65017)
Nikita Shulga [Fri, 17 Sep 2021 00:36:14 +0000 (17:36 -0700)]
Pin SciPy to 1.6.2 on Windows (#65017)

Summary:
Re-enable previously disabled test_distributions

Note: conda does not have ScipPy-1.6.3, only 1.6.2

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65017

Reviewed By: seemethere

Differential Revision: D31003199

Pulled By: malfet

fbshipit-source-id: 96b9d2a833f703008bb1f4df9361db8ec6f8ccc6

2 years agoAdded logging for the Reducer's non-member functions. (#65023)
Avery Wang [Thu, 16 Sep 2021 23:37:52 +0000 (16:37 -0700)]
Added logging for the Reducer's non-member functions. (#65023)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65023

Added an optional logging parameter for non-member functions `compute_bucket_assignment_by_size` and `verify_replica0_across_processes`. If a logger is provided then `TORCH_CHECK` assertions are replaced with a wrapper that logs the error to the DDP reducer's logger before calling `TORCH_CHECK`. If a logger is not provided `TORCH_CHECK` is still called.

Modified python-side calls to `_compute_bucket_assignment_by_size` and `_verify_model_across_ranks` to include a logger whenever possible. A notable exception is when these non-member functions are called in DDP's constructor - we cannot pass in a logger as they may have not been initialized yet.

We also added 4 new tests: `test_compute_bucket_assignment_by_size_sparse_error_{with, without}_logger` which tests the `_compute_bucket_assignment_by_size` function to ensure that sparse tensors are rejected and the errors are logged.  `test_verify_model_across_rank_{with, without}_logger` calls `_verify_model_across_ranks` to ensure that ill-formed models (different ranks have different number of parameters compared to rank 0) are rejected and the errors are logged. The test `test_ddp_model_diff_across_ranks` remains unchanged - while it does construct a ill-formed DDP instance which triggers the error in `_verify_model_across_ranks`, we cannot check the logger because this error appears in the constructor.

Lastly, did some cleanup of the `test_ddp_model_diff_across_ranks` function to make the logic of choosing which context manager and error message to use more clean.

Test Plan:
**Build commands**
`buck build mode/dev-nosan //caffe2/test/distributed:distributed_nccl_spawn --keep-going`

`buck build mode/dev-nosan //caffe2/test/distributed:distributed_gloo_spawn --keep-going`

**Test commands**
Test for `_compute_bucket_assignment_by_size` (Python)/ `compute_bucket_assignment_by_size` (C++)
`BACKEND={nccl, gloo} WORLD_SIZE=2 ../buck-out/dev/gen/caffe2/test/distributed/distributed_{nccl, gloo}_spawn#binary.par -r test_compute_bucket_assignment_by_size_sparse_error_{with, without}_logger`

Test for `_verify_model_across_ranks` (Python)/`verify_replicas0_across_process` (C++)
`BACKEND={nccl, gloo} WORLD_SIZE=2 ../buck-out/dev/gen/caffe2/test/distributed/distributed_{nccl, gloo}_spawn#binary.par -r test_verify_model_across_ranks_{with, without}_logger`

Test that constructs an ill-formed DDP instance. Only did cleanup of this function.
`BACKEND={nccl, gloo} WORLD_SIZE=2 ../buck-out/dev/gen/caffe2/test/distributed/distributed_{nccl, gloo}_spawn#binary.par -r test_ddp_model_diff_across_ranks`

Reviewed By: rohan-varma

Differential Revision: D30924790

fbshipit-source-id: dae6fa82485a204a6a4b022f2d073417d07ebb2f

2 years agoOpInfo: nn.functional.conv2d (#63517)
kshitij12345 [Thu, 16 Sep 2021 21:20:43 +0000 (14:20 -0700)]
OpInfo: nn.functional.conv2d (#63517)

Summary:
Reference: https://github.com/pytorch/pytorch/issues/54261

Reference: https://github.com/facebookresearch/functorch/issues/78

Mostly inspired from https://github.com/pytorch/pytorch/issues/62882

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63517

Reviewed By: heitorschueroff

Differential Revision: D30993855

Pulled By: zou3519

fbshipit-source-id: 7402f99addb4ef8f19c2ce1a09ed9006e737cc7e

2 years agoRemove old references to 9.2 in documentation (#65059)
Jane Xu [Thu, 16 Sep 2021 20:22:02 +0000 (13:22 -0700)]
Remove old references to 9.2 in documentation (#65059)

Summary:
Removes references in .rst and README.md and comments in the Dockerfile

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65059

Reviewed By: malfet

Differential Revision: D30961110

Pulled By: janeyx99

fbshipit-source-id: 702a9a81bf08125ec4ac38bc656fc2c128c30018

2 years agoProvide function interface for `remove_duplicate_output_args` (#65134)
Kefei Lu [Thu, 16 Sep 2021 20:14:12 +0000 (13:14 -0700)]
Provide function interface for `remove_duplicate_output_args` (#65134)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65134

So that its implementation can be abstracted and replaced

Test Plan: Run linter, CI

Reviewed By: 842974287

Differential Revision: D30966916

fbshipit-source-id: 92ec78c7410d0be14faecb0ba1eafdc74bab5a5d

2 years agoAdd type annotation for `TRTInterpreter.run` (#65135)
Kefei Lu [Thu, 16 Sep 2021 20:14:12 +0000 (13:14 -0700)]
Add type annotation for `TRTInterpreter.run` (#65135)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65135

Opportunistically adding type annotation as I work through fx2trt code base.

Test Plan: run linter and CI

Reviewed By: houseroad, 842974287

Differential Revision: D30903185

fbshipit-source-id: 3f700b57f4433f2d312c1ff2e6b99948e3c8845c

2 years ago[quant]ao migration for quantization mappings and fuser method mappings hg mv (#64985)
Charles David Hernandez [Thu, 16 Sep 2021 19:55:55 +0000 (12:55 -0700)]
[quant]ao migration for quantization mappings and fuser method mappings hg mv (#64985)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64985

moving quantization_mappings.py and fuser_method_mappings.py to the ao folder while retaining backwards compatibility

also added dict test

ghstack-source-id: 138215312

Test Plan:
buck test mode/dev //caffe2/test:quantization

https://www.internalfb.com/intern/testinfra/testrun/7036874471986444

buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization

https://www.internalfb.com/intern/testinfra/testrun/5348024625792701

Reviewed By: z-a-f

Differential Revision: D30982551

fbshipit-source-id: 00f53bd44009d6012a7de852000aad6885131edb

2 years agoRemove CUDA 9.2 and older references from our cmake (#65065)
Jane Xu [Thu, 16 Sep 2021 19:53:12 +0000 (12:53 -0700)]
Remove CUDA 9.2 and older references from our cmake (#65065)

Summary:
Removes old CUDA references in our cuda.cmake

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65065

Reviewed By: malfet

Differential Revision: D30992673

Pulled By: janeyx99

fbshipit-source-id: 85b524089ed57e5acbc71720267cf05e24a8c20a

2 years agoDisable ParallelTBB (#65092)
Nikita Shulga [Thu, 16 Sep 2021 19:37:10 +0000 (12:37 -0700)]
Disable ParallelTBB (#65092)

Summary:
As ParallelTBB's `at::get_thread_num` is not compatible with general model used by OpenMP and ParallelNative (where it is an contiguous thread index within parallel loop), see https://github.com/pytorch/pytorch/issues/64571#issuecomment-914691883

More examples of similar regressions: https://github.com/pytorch/pytorch/runs/3612142217

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65092

Reviewed By: zhouzhuojie

Differential Revision: D30995936

Pulled By: malfet

fbshipit-source-id: db145b6a850d794f2c954f59f30249b291473e36

2 years agoIntroduce tensorRT as builtin module for torch::deploy. (#63818)
Zhengxu Chen [Thu, 16 Sep 2021 18:23:11 +0000 (11:23 -0700)]
Introduce tensorRT as builtin module for torch::deploy. (#63818)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63818

ghstack-source-id: 138156957

Test Plan: next diff

Reviewed By: wconstab

Differential Revision: D30499309

fbshipit-source-id: 4ab1bc9896243c0c1503afb18fbfb196fc37404e

2 years ago[JIT] Improve BatchMM mutability handling (#65097)
David Berard [Thu, 16 Sep 2021 17:44:33 +0000 (10:44 -0700)]
[JIT] Improve BatchMM mutability handling (#65097)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65097

Previously, BatchMM would skip any block containing any mutable
operators. Now it will avoid batching any operation whose inputs or
outputs are ever mutated. Specifically: consider a tree of ADD, T,
and MM nodes rooted at an ADD node.  If any input or output to any
node in the tree is ever mutated, then the entire tree will be ignored
by BatchMM.

Test Plan: python test/test_jit.py TestBatchMM

Reviewed By: eellison

Differential Revision: D30973515

Pulled By: davidberard98

fbshipit-source-id: 9d836faa1ef0c9e3fefe0ffc0bd265f275471f48

2 years ago[quant] ao migration of observer and qconfig (#64982)
Charles David Hernandez [Thu, 16 Sep 2021 17:31:21 +0000 (10:31 -0700)]
[quant] ao migration of observer and qconfig (#64982)

Summary:
(Had to recreate this diff so it wasn't dependent on the stack)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64982

migration of qconfig.py and observer.py to torch/ao/quantization using new test format
ghstack-source-id: 138215256

Test Plan:
buck test mode/opt //caffe2/test:quantization

https://www.internalfb.com/intern/testinfra/testconsole/testrun/8444249354294701/

buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization

https://www.internalfb.com/intern/testinfra/testrun/3940649742829796

Reviewed By: z-a-f

Differential Revision: D30982534

fbshipit-source-id: 48d08969b1984311ceb036eac0877c811cd6add9

2 years ago[Fix] Raise error when empty index tensor is passed (gather) (#65006)
Kushashwa Ravi Shrimali [Thu, 16 Sep 2021 17:12:50 +0000 (10:12 -0700)]
[Fix] Raise error when empty index tensor is passed (gather) (#65006)

Summary:
See https://github.com/pytorch/pytorch/pull/63312#issuecomment-919330081 for context.

cc: ezyang ysiraichi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65006

Reviewed By: mruberry

Differential Revision: D30937730

Pulled By: ezyang

fbshipit-source-id: a8f77b1f40d07e7e3bef6caaafa119685f297638

2 years ago[FX] Gate FXGraphDrawer on whether pydot is installed (#65088)
James Reed [Thu, 16 Sep 2021 17:00:59 +0000 (10:00 -0700)]
[FX] Gate FXGraphDrawer on whether pydot is installed (#65088)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65088

Test Plan: Imported from OSS

Reviewed By: khabinov

Differential Revision: D30967951

Pulled By: jamesr66a

fbshipit-source-id: dba2f13a47889b3d4187de925b4fe74ee90b7f79

2 years agoadd support for indexing to meshgrid (#62722)
Michael Dagitses [Thu, 16 Sep 2021 16:58:09 +0000 (09:58 -0700)]
add support for indexing to meshgrid (#62722)

Summary:
This is step 3/7 of https://github.com/pytorch/pytorch/issues/50276. It only adds support for the argument but doesn't implement new indexing modes yet.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62722

Test Plan:
Verified this is not FC breaking by adding logging to both meshgrid
overloads and then called meshgrid twice:

`meshgrid(*tensors)`
  and
`meshgrid(*tensors, indexing='ij')`

This confirmed that the former signature triggered the original native
function and the latter signature triggered the new native function.

Reviewed By: H-Huang

Differential Revision: D30394313

Pulled By: dagitses

fbshipit-source-id: e265cb114d8caae414ee2305dc463b34fdb57fa6

2 years ago[Reland] Add python mode (#64360)
Richard Zou [Thu, 16 Sep 2021 16:00:34 +0000 (09:00 -0700)]
[Reland] Add python mode (#64360)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64360

This PR adds a (private) enable_python_mode context manager.
(see torch/utils/_python_dispatch.py).
enable_python_mode accepts the type of a __torch_dispatch__ object
as its argument. Whenever an operator gets called inside of the
context manager, it dispatches to the __torch_dispatch__ of
the passed-in type.

Example usage:
```
with enable_python_mode(LoggingTensor):
    z = torch.empty([])
    assert isinstance(z, LoggingTensor)
```

There are quite a few changes that were made to support this.

First, we added TorchDispatchTypeObject, a C++ struct that represents the
type of a `__torch_dispatch__` object (e.g. LoggingTensor).
It holds both the PyObject* representing the class and a PyInterpreter*
so we know which Python interpreter it came from.

Next, we updated the concrete_dispatch_fn in python_variable.cpp to accept
a `const std::shared_ptr<TorchDispatchTypeObject>&` argument. When this
is null, dispatching happens as usual. When it is non-null, we prepend
the TorchDispatchTypeObject's PyObject* to the overloaded args list so that
it is considered first for dispatch.

To get that to work, we changed how `handle_torch_dispatch_no_python_arg_parser`
works. The "overloaded args list" previously only consisted of Tensor PyObjects,
but now it can have types in addition to Tensors!
- We renamed `append_overloaded_arg` to `append_overloaded_arg`
- We added a new `append_overloaded_type` that appends a type to
overloaded_args
- We added special handling in `handle_torch_dispatch_no_python_arg_parser`
and `append_overloaded_arg` to handle types in addition to Tensors.

Then, there is PythonMode and PythonModeTLS.
- We reuse the DispatchKey::Python dispatch key as a mode key
- We use PythonMode::enter and PythonMode::exit to enable/disable
DispatchKey::Python and set the PythonModeTLS.
- PythonModeTLS stores a TorchDispatchTypeObject as metadata.
- PythonMode is in libtorch_python, and PythonModeTLS is in ATen.
This split is due to the libtorch_python library boundary (because we need
to save TLS in ATen/ThreadLocalState)
- We modify the PythonFallbackKernel to look up
the relevant TorchDispatchTypeObject (if Python Mode is active) and
dispatch using it.

There are two more miscellaneous changes:
- internal_new_from_data (torch/csrc/utils/tensor_new.cpp) gets an
exclude guard. enable_python_mode currently does not handle
torch.tensor and the exclude guard is to prevent a bug.

Future:
- This PR does not allow for the nesting of Python modes. In the future we
should be able to enable this with a more sane no_dispatch API and by changing
the TLS to a stack. For now I did not need this for CompositeImplicitAutograd testing.

Test Plan: - new tests

Reviewed By: ezyang

Differential Revision: D30698082

Pulled By: zou3519

fbshipit-source-id: 7094a90eee6aa51f8b71bc4d91cfb6f49e9691f8

2 years agoRevert D30888794: [Model Averaging] Simplify PostLocalSGD Optimizer API
Alban Desmaison [Thu, 16 Sep 2021 13:36:29 +0000 (06:36 -0700)]
Revert D30888794: [Model Averaging] Simplify PostLocalSGD Optimizer API

Test Plan: revert-hammer

Differential Revision:
D30888794 (https://github.com/pytorch/pytorch/commit/3d312b3b8ee90f8b289c7d5601a13d0521b46b7e)

Original commit changeset: 21261b480f6b

fbshipit-source-id: 87abb7e8cd9ecaac909ec6c3ee053fa7c4ae1975

2 years agoImprove LSTM documentation for proj_size > 0 (#65102)
Rodrigo Berriel [Thu, 16 Sep 2021 13:33:40 +0000 (06:33 -0700)]
Improve LSTM documentation for proj_size > 0 (#65102)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/65053. Although the documentation states that:

https://github.com/pytorch/pytorch/blob/fe0f9d1dafb9791cb08635636a01128850d17538/torch/nn/modules/rnn.py#L500-L506

It seems that the definition of `weight_ih_l[k]` could be improved by specifying what happens when `k > 0` and `proj_size > 0`. As `proj_size` is only used in LSTM, no changes are needed for the other RNNs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65102

Reviewed By: supriyar

Differential Revision: D30975781

Pulled By: jbschlosser

fbshipit-source-id: 12df06e5e6a8d5de0ad10fb15e33c3e6311c11d3

2 years ago[Static Runtime] Use FastSet instead of std::set everywhere (#65114)
Scott Wolchok [Thu, 16 Sep 2021 04:43:09 +0000 (21:43 -0700)]
[Static Runtime] Use FastSet instead of std::set everywhere (#65114)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65114

There doesn't seem to be any reason to use std::set for sets of pointers, right?
ghstack-source-id: 138198504

Reviewed By: hlu1

Differential Revision: D30978450

fbshipit-source-id: 4599c6249fda3a89959f839d3bf6400c5891f82c

2 years agoReduce PyToch Warnings - Cast fixes from D26624430 (#65015)
Amr Elshennawy [Thu, 16 Sep 2021 04:19:03 +0000 (21:19 -0700)]
Reduce PyToch Warnings - Cast fixes from D26624430 (#65015)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65015

Split out the existing fixes into a diff we can land separately.

Test Plan:
pooled_embeddings_modules_test

Parsing buck files: finished in 8.3 sec
Creating action graph: finished in 38.3 sec
[RE] Metadata: Session ID=[https://fburl.com/b/reSessionID-9bea421c-875e-4168-9e00-7d67479b1a9f]
[RE] Waiting on 46 remote actions. Completed 905 actions remotely, action cache hit rate: 5.08%.
Downloaded 7002/8869 artifacts, 560.00 Mbytes, 11.6% cache miss (for updated rules)
Building: finished in 13:12.4 min (100%) 31964/31964 jobs, 17344/31964 updated
  Total time: 13:59.1 min
More details at https://www.internalfb.com/intern/buck/build/b9a58bba-e0aa-4c2b-8824-a0c4074b0954
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 28cbe2b1-6fbc-450c-91c9-c06a7ff1d53b
Trace available for this run at /tmp/tpx-20210914-114921.005504/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/1407375088325000
    ✓ ListingSuccess: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - main (23.849)
    {emoji:2702} Omit: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_1_cuda)
Test output:
> This test was disabled.
To run this test locally, add the command line flag --run-disabled to your test command (prefix with -- if using buck).
To view why this is disabled or re-enable this test in the test console, visit https://our.intern.facebook.com/intern/testinfra/testdetail/562949981577936
    ↻ Skip: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_0_cpu) (13.201)
Test output:
> Repro command : $(cat "/tmp/tpx-20210914-114921.005504/dc174692-8d92-4459-8b8f-201643c6ab7d/execution_command")
Skipped: CUDA is not available or no GPUs detected
stdout:

stderr:

    ↻ Skip: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation_autograd (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_1_cuda) (13.201)
Test output:
> Repro command : $(cat "/tmp/tpx-20210914-114921.005504/dc174692-8d92-4459-8b8f-201643c6ab7d/execution_command")
Skipped: CUDA is not available or no GPUs detected
stdout:

stderr:

    ✓ Pass: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_compatibility (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_1_cuda) (13.201)
    ↻ Skip: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation_autograd (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_0_cpu) (13.201)
Test output:
> Repro command : $(cat "/tmp/tpx-20210914-114921.005504/dc174692-8d92-4459-8b8f-201643c6ab7d/execution_command")
Skipped: CUDA is not available or no GPUs detected
stdout:

stderr:

    ✓ Pass: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_compatibility (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_0_cpu) (13.201)
    ✓ Pass: caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - main (13.201)
Summary
  Pass: 3
  Skip: 3
    ↻ caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_0_cpu)
    ↻ caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation_autograd (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_1_cuda)
    ↻ caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation_autograd (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_0_cpu)
  Omit: 1
    {emoji:2702} caffe2/torch/fb/sparsenn:pooled_embeddings_modules_test - test_permutation (caffe2.torch.fb.sparsenn.tests.pooled_embeddings_modules_test.PooledEmbeddingModulesTest_1_cuda)
  ListingSuccess: 1

shape_inference_mode_test

[amrelshennawy@devvm855.ftw0 /data/users/amrelshennawy/fbsource/fbcode] buck test caffe2/torch/fb/sparsenn:shape_inference_mode_test
Downloaded 6/18 artifacts, 11.69 Kbytes, 53.8% cache miss (for updated rules)
Building: finished in 1.6 sec (100%) 110/110 jobs, 26/110 updated
  Total time: 1.8 sec
More details at https://www.internalfb.com/intern/buck/build/0e5f45b2-5777-49e9-a3b0-09bd05687b2b
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 99509108-5ff3-4b1a-b7b3-2f43c4036209
Trace available for this run at /tmp/tpx-20210914-120119.723607/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/6192449502564504
    ✓ ListingSuccess: caffe2/torch/fb/sparsenn:shape_inference_mode_test - main (0.374)
    ✓ Pass: caffe2/torch/fb/sparsenn:shape_inference_mode_test - test_set_upper_bound_mode (torch.python.fb.shape_inference_mode_test.TestShapeInferenceMode) (0.249)
    ✓ Pass: caffe2/torch/fb/sparsenn:shape_inference_mode_test - test_set_upper_bound_settings (torch.python.fb.shape_inference_mode_test.TestShapeInferenceMode) (0.253)
Summary
  Pass: 2
  ListingSuccess: 1

test
[amrelshennawy@devvm855.ftw0 /data/users/amrelshennawy/fbsource/fbcode] buck test caffe2/torch/fb/sparsenn:test
Parsing buck files: finished in 1.1 sec
Creating action graph: finished in 38.6 sec
Downloaded 6/30 artifacts, 11.29 Kbytes, 66.7% cache miss (for updated rules)
Building: finished in 41.6 sec (100%) 26783/26783 jobs, 43/26783 updated
  Total time: 01:21.4 min
More details at https://www.internalfb.com/intern/buck/build/8f794eb0-3d3c-4ee3-9aec-5ec5cec1b0f4
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: a06164b5-d7d7-444c-a4ff-e312cb9970d9
Trace available for this run at /tmp/tpx-20210914-120428.464799/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/3377699789132066
    ✓ ListingSuccess: caffe2/torch/fb/sparsenn:test - main (16.637)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_dense_mlp_quantize_ops (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.870)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_ranges_shape_inference_mode (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.922)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_gather_ranges_to_dense_caffe2 (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.348)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_simple (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.370)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_recat_embedding_grad_output_mixed_D_batch (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.516)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_xl_embedding_bag_byte_rowwise_offsets (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.515)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_offsets_to_ranges (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.861)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_xl_embedding_bags (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.873)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_offsets_to_ranges_out (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.969)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_pack_segments_pad_minf (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.104)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_deprecated_multiple_runs (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.342)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_deprecated_sigrid_transform (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.664)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_offsets_to_ranges_out_empty_batch (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.745)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_lengths (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.771)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_multiple_runs_torch_bind (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.944)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_offsets_to_ranges_empty_batch (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.944)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_gather_ranges_shape_inference_mode (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (20.245)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_prior_correction_calibration_prediction_nonbinary (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (20.328)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_8bitfakefused (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (20.501)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_deprecated_ranges (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (20.608)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_lengths_inference_tests (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (22.403)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_broadcast_cat_out (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (23.025)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_lengths_negatives_tests (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (23.956)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_broadcast_cat (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (24.100)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_transform_torch_bind (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (17.384)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_expand_values_scores_tensor (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.672)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_expand_empty_values_scores_tensor (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.679)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_pack_segments (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.726)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_expand_ranges_tensor (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (17.567)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_batch_box_cox_all_zeros (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.036)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_rowwise_prune_op_32bit_indices (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.430)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_transform_torch_bind_upper_bound (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.176)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_expand_dense_feature_tensor (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.006)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_ranges_gather (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.555)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_xl_int_nbit_split_embedding_codegen_lookup_function (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.791)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_pack_segments_smaller_max_len (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.737)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_pos (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (20.212)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_xl_embedding_bag_2bit_rowwise_offsets (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.612)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_prior_correction_calibration_prediction_binary (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (20.858)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_tracing_torch_bind_upper_bound (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.002)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_deprecated_tracing (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (20.824)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_1d_counts (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.976)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_recat_embedding_grad_output_mixed_D (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.832)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_batch_one_hot_lengths (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.844)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_ranges (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.558)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_batch_box_cox_non_zeros (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.418)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_prior_correction_calibration_accumulate (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.222)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_unsqueeze_vector (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.327)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_xl_embedding_bag_4bit_rowwise_offsets (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.772)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.425)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_broadcast_cat_backward (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.956)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_expand_offsets_tensor (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (19.320)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_gather_ranges (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.923)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_batch_one_hot (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.549)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_deprecated_sigrid_transforms_create (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.932)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_clip_ranges_gather_lengths_to_offsets (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.807)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_length_to_row_idx (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (17.738)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_tracing_torch_bind (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (20.175)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_batch_box_cox_mixed (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.116)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_1d_bins (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.671)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_permute_out (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.002)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_create_sigrid_transforms_torch_bind (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (18.151)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_ranges_torch_bind (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (16.780)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_no_bins (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.185)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_cumsum (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.242)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_le_one (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.876)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_pack_and_unpack_segments (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (19.222)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_self_binning_histogram_quantile_dims (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (20.007)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_sigrid_hash_op (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.959)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_rowwise_prune_op_64bit_indices (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (18.601)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_ranges_torch_bind_upper_bound (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (17.977)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_broadcast_stack (caffe2.torch.fb.sparsenn.tests.sparsenn_operators_test.SparseNNOperatorsTest) (22.588)
    ✓ Pass: caffe2/torch/fb/sparsenn:test - test_multiple_runs_torch_bind_upper_bound (caffe2.torch.fb.sparsenn.tests.sigrid_transforms_test.SigridTransformsOpsTest) (15.342)
Summary
  Pass: 73
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/3377699789132066

Did not run (no GPU on my devserver):
gpu_test
cpp_gpu_test

Reviewed By: r-barnes

Differential Revision: D30940399

fbshipit-source-id: d867ca646723340775a49c1b983cdab64f2d67d8

2 years agoBug fix (#65105)
Priya Ramani [Thu, 16 Sep 2021 03:03:38 +0000 (20:03 -0700)]
Bug fix (#65105)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65105

Using buildErrorMessage in external_functions.cpp was breaking build target nnc_cpu_backend_lib as buildErrorMessage is defined in tensorexpr/kernel.cpp which is not included in mobile builds and we don't want to include it in mobile builds.
Also buildErrorMessage wraps error messages for fuser whereas nnc_aten_conv2d is now only used in AOT workflow and not called by the fuser. So wrapping assertion failures with fuser error message would be misleading for AOT workflow.

Test Plan:
Before fix:
```
+ buck build //xplat/caffe2/fb/lite_predictor:lite_predictor_nnc
Downloading... 3/3 artifacts, 24.81 Kbytes, 0.0% cache miss (for updated rules)
Building... 1.7 sec (99%) 4639/4641 jobs, 3/4641 updated
     - //xplat/caffe2/fb/lite_predictor:lite_predictor_nnc#binary... 0.7 sec (running c++ link[0.6 sec])
Command failed with exit code 1.

command: [/data/users/priyaramani/fbsource/buck-out/cells/fbcode/gen/aab7ed39/tools/build/buck/wrappers/__ld__/ld.sh, --ld=/data/users/priyaramani/fbsource/fbcode/third-party-buck/platform009/build/llvm-fb/9.0.0/bin/clang++, --cc=/data/users/priyaramani/fbsource/buck-out/cells/fbcode/gen/aab7ed39/tools/build/buck/wrappers/__fbc...
<truncated>
...

stderr: clang-9: warning: argument unused during compilation: '-pthread' [-Wunused-command-line-argument]
ld.lld: error: undefined symbol: torch::jit::tensorexpr::buildErrorMessage(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
>>> referenced by external_functions.cpp:69 (xplat/caffe2/torch/csrc/jit/tensorexpr/external_functions.cpp:69)
>>>               ../nnc_cpu_backend_lib#compile-external_functions.cpp.o50e02bc2,platform009-clang/torch/csrc/jit/tensorexpr/external_functions.cpp.o:(nnc_aten_conv2d) in archive /data/users/priyaramani/fbsource/buck-out/gen/aab7ed39/xplat/caffe2/nnc_cpu_backend_lib#platform009-clang,static/libnnc_cpu_backend_lib.a
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)

    When running <c++ link>.
    When building rule //xplat/caffe2/fb/lite_predictor:lite_predictor_nnc#binary (ovr_config//platform/linux:x86_64-fbcode).
clang-9: warning: argument unused during compilation: '-pthread' [-Wunused-command-line-argument]
ld.lld: error: undefined symbol: torch::jit::tensorexpr::buildErrorMessage(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
>>> referenced by external_functions.cpp:69 (xplat/caffe2/torch/csrc/jit/tensorexpr/external_functions.cpp:69)
>>>               ../nnc_cpu_backend_lib#compile-external_functions.cpp.o50e02bc2,platform009-clang/torch/csrc/jit/tensorexpr/external_functions.cpp.o:(nnc_aten_conv2d) in archive /data/users/priyaramani/fbsource/buck-out/gen/aab7ed39/xplat/caffe2/nnc_cpu_backend_lib#platform009-clang,static/libnnc_cpu_backend_lib.a
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)

Command failed with exit code 1.

command: [/data/users/priyaramani/fbsource/buck-out/cells/fbcode/gen/aab7ed39/tools/build/buck/wrappers/__ld__/ld.sh, --ld=/data/users/priyaramani/fbsource/fbcode/third-party-buck/platform009/build/llvm-fb/9.0.0[DEBUG kernel.cpp:2766]       }
```

After fix:
```
+ buck build //xplat/caffe2/fb/lite_predictor:lite_predictor_nnc
Action graph will be rebuilt because files have been added or removed.
clang-9: warning: argument unused during compilation: '-pthread' [-Wunused-command-line-argument]

Downloaded 11/15 artifacts, 78.37 Kbytes, 15.4% cache miss (for updated rules)
Building: finished in 7.4 sec (100%) 4718/4718 jobs, 46/4718 updated
  Total time: 7.5 sec
More details at https://www.internalfb.com/intern/buck/build/b87be016-340c-49f8-b832-0c1de70aae9e
```

Reviewed By: ZolotukhinM

Differential Revision: D30975952

fbshipit-source-id: 85c028cc6af63c03b505b51302f5158c23e1a047

2 years ago[acc_ops] Add support for torch variants of squeeze and mul (#65037)
Jordan Fix [Thu, 16 Sep 2021 02:39:41 +0000 (19:39 -0700)]
[acc_ops] Add support for torch variants of squeeze and mul (#65037)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65037

att

Test Plan: updated unit tests

Reviewed By: yuhc

Differential Revision: D30952224

fbshipit-source-id: aaf75b27b4fc6c0436ba7bfcf324f761b900171b

2 years agoAdd NNC AOT Compiler executable (#63994)
Priya Ramani [Thu, 16 Sep 2021 02:12:47 +0000 (19:12 -0700)]
Add NNC AOT Compiler executable (#63994)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63994

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30582149

Pulled By: priyaramani

fbshipit-source-id: 3bbf085428824c3cb308e006c18bb0a57f50fef6

2 years ago[quant] AO migration of the `_correct_bias.py`, `_equalize.py`, and `_learnable_fake_...
Zafar Takhirov [Thu, 16 Sep 2021 01:13:53 +0000 (18:13 -0700)]
[quant] AO migration of the `_correct_bias.py`, `_equalize.py`, and `_learnable_fake_quantize.py` (#64917)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64917

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates from torch.quantization to torch.ao.quantization the following files:
- `_correct_bias.py`
- `_equalize.py`
- `_learnable_fake_quantize.py`

**Note:** These file are migrated completely without any warning. The old location is thus silently deprecated.

Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestBiasCorrection`

Reviewed By: vkuzo

Differential Revision: D30898565

fbshipit-source-id: 1d39be2539dd1adfcb42e16bdcc0daf5c8316bbd

2 years ago.circleci/.jenkins: Remove 9.2 references in CI (#65024)
Jane Xu [Thu, 16 Sep 2021 01:03:19 +0000 (18:03 -0700)]
.circleci/.jenkins: Remove 9.2 references in CI (#65024)

Summary:
Removes 9.2 references in CI scripts and configs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65024

Reviewed By: driazati

Differential Revision: D30945948

Pulled By: janeyx99

fbshipit-source-id: 77890a00520c61500a934a90a74e3fcca84c09b5

2 years ago.github: GHA add retry for docker run in chown workspace step (#65104)
Jane Xu [Thu, 16 Sep 2021 01:00:24 +0000 (18:00 -0700)]
.github: GHA add retry for docker run in chown workspace step (#65104)

Summary:
This should help prevent further errors in GHA workflows during the Chown Workspace step such as https://github.com/pytorch/pytorch/runs/3614067053

I did not add retries to other steps with docker run

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65104

Reviewed By: seemethere

Differential Revision: D30976330

Pulled By: janeyx99

fbshipit-source-id: e403008548aa01c9a0a4ccebe56df0e889dd045c

2 years agoRevert D30752939: [pytorch][PR] nvfuser update
Eli Uriegas [Thu, 16 Sep 2021 00:37:10 +0000 (17:37 -0700)]
Revert D30752939: [pytorch][PR] nvfuser update

Test Plan: revert-hammer

Differential Revision:
D30752939 (https://github.com/pytorch/pytorch/commit/cfaecaf40bd6cabd3f4e0ef0d8c7252655349b61)

Original commit changeset: ce122e80f01b

fbshipit-source-id: 57685df8f9946032a06eff1de8a3d1498500d2d2

2 years ago[quant] AO migration of the `quant_types.py` (phase 1) (#64916)
Zafar Takhirov [Thu, 16 Sep 2021 00:24:09 +0000 (17:24 -0700)]
[quant] AO migration of the `quant_types.py` (phase 1) (#64916)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64916

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quant_type.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.

Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization`

Reviewed By: vkuzo

Differential Revision: D30898422

fbshipit-source-id: 3e6126b49f0565a4136d6928cea9eb25368927ff

2 years ago[quant] AO migration of the `fuse_modules.py` (phase 1) (#64913)
Zafar Takhirov [Thu, 16 Sep 2021 00:24:09 +0000 (17:24 -0700)]
[quant] AO migration of the `fuse_modules.py` (phase 1) (#64913)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64913

AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the fuse_module.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.

Test Plan: `buck test mode/dev //caffe2/test:quantization`

Reviewed By: vkuzo

Differential Revision: D30882819

fbshipit-source-id: 1926ad6aa49136aceb5b625dcef4bfde3a2860d4

2 years ago[TensorExpr] Add a method for sanitizing Var and Buf names in Stmt. (#65010)
Mikhail Zolotukhin [Thu, 16 Sep 2021 00:13:48 +0000 (17:13 -0700)]
[TensorExpr] Add a method for sanitizing Var and Buf names in Stmt. (#65010)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65010

This pass ensures all names are legal and not-duplicated.

Fixes #52727.

Test Plan: Imported from OSS

Reviewed By: bertmaher, navahgar

Differential Revision: D30939717

Pulled By: ZolotukhinM

fbshipit-source-id: 7dbe7f937de41f22ad49137a5e067d698443ed63

2 years ago.github: Enable only specific workflows for canary (#65099)
Eli Uriegas [Wed, 15 Sep 2021 23:51:34 +0000 (16:51 -0700)]
.github: Enable only specific workflows for canary (#65099)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65099

Utilizes ciflow to enable only specific workflows for
pytorch/pytorch-canary to reduce noise on that specific repository

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D30973691

Pulled By: seemethere

fbshipit-source-id: 371765535b42a00bd72c2551c4faebf733d759f0

2 years agoci: Disable jit legacy on circleci, enable on gha (#65106)
Eli Uriegas [Wed, 15 Sep 2021 23:09:57 +0000 (16:09 -0700)]
ci: Disable jit legacy on circleci, enable on gha (#65106)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65106

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: malfet, janeyx99

Differential Revision: D30976186

Pulled By: seemethere

fbshipit-source-id: 8958f821eab9aa284496c57915894ed70f6b2fff

2 years agoCI: Upgrade windows 10.1 jobs to 10.2 (#65080)
Jane Xu [Wed, 15 Sep 2021 22:59:21 +0000 (15:59 -0700)]
CI: Upgrade windows 10.1 jobs to 10.2 (#65080)

Summary:
This is first 2 steps in the following task:
1. Upgrade 10.1 to 10.2
2. Migrate force_on_cpu job to GHA

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65080

Test Plan: https://github.com/pytorch/pytorch/pull/65086

Reviewed By: seemethere

Differential Revision: D30973655

Pulled By: janeyx99

fbshipit-source-id: 67ab69ea99ff9e0336400a7173efef6d7daac07c

2 years agoReplace windows 10.2 smoke tests on PRs to be 11.3 (#65090)
Jane Xu [Wed, 15 Sep 2021 22:59:06 +0000 (15:59 -0700)]
Replace windows 10.2 smoke tests on PRs to be 11.3 (#65090)

Summary:
As we default to linux CUDA 11.3 on PRs, we should do the same thing with Windows (instead of having 10.2 be the default). This means that 10.2 will now be master only, and 11.3 windows smoke tests will run on every PR.

This also copies over the "run smoke tests only" config--removing that will be in a separate PR once there's more certain decision making.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65090

Reviewed By: seemethere

Differential Revision: D30968382

Pulled By: janeyx99

fbshipit-source-id: c73f9a2cc800b678909365c4d80627d29fc09f94

2 years agoRevert D30883290: [Static Runtime] Move MemoryPlanner out into memory_planner.cpp
Natalia Gimelshein [Wed, 15 Sep 2021 22:38:56 +0000 (15:38 -0700)]
Revert D30883290: [Static Runtime] Move MemoryPlanner out into memory_planner.cpp

Test Plan: revert-hammer

Differential Revision:
D30883290 (https://github.com/pytorch/pytorch/commit/0e11454d19e106ba6d5819c1147ca540cbce2943)

Original commit changeset: a37570f8d943

fbshipit-source-id: 65c57a2b0d2e3c7006765195dd519e8cf2472f72

2 years ago[quant] Removing hardcoded "torch.quantization.observer" for migration (#64981)
Charles David Hernandez [Wed, 15 Sep 2021 22:15:02 +0000 (15:15 -0700)]
[quant] Removing hardcoded "torch.quantization.observer" for migration (#64981)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64981

this would have cause errors when observer.py was moved to ao.

see: D30391189
ghstack-source-id: 138118430

Test Plan:
buck test mode/opt //caffe2/test:quantization -- --exact 'caffe2/test:quantization - test_dynamic_quant_multi_uses (quantization.jit.test_quantize_jit.TestQuantizeDynamicJitPasses)'

buck test mode/opt //caffe2/test:quantization -- --exact 'caffe2/test:quantization - test_save_load_state_dict_script (quantization.core.test_workflow_module.TestObserver)'

Reviewed By: supriyar

Differential Revision: D30432008

fbshipit-source-id: 754727a89c78f6ceada6f8ff92c304f3953f38fc

2 years ago[Caffe2][easy] Avoid spurious vector copy in TransposeOp (#64403)
Scott Wolchok [Wed, 15 Sep 2021 22:12:29 +0000 (15:12 -0700)]
[Caffe2][easy] Avoid spurious vector copy in TransposeOp (#64403)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64403

No need to copy to the heap here.
ghstack-source-id: 138033019

Test Plan: CI

Reviewed By: smacke

Differential Revision: D30712506

fbshipit-source-id: 5f4131b2569ebb1f5092262aaddb17215dea88f1

2 years ago[Caffe2] Don't pass vector by value in SqueezeOp (#64400)
Scott Wolchok [Wed, 15 Sep 2021 22:12:29 +0000 (15:12 -0700)]
[Caffe2] Don't pass vector by value in SqueezeOp (#64400)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64400

There appears to be no need to copy this vector.
ghstack-source-id: 138033020

Test Plan: CI

Reviewed By: smacke

Differential Revision: D30711014

fbshipit-source-id: b9fcf3d496a663b8478aa22d52b2c41f8f85e90f

2 years agoUse RDS for build size tracking (#64303)
David Riazati [Wed, 15 Sep 2021 21:46:11 +0000 (14:46 -0700)]
Use RDS for build size tracking (#64303)

Summary:
This adds 2 utilities: `register_rds_table` and `rds_write`. `register_rds_table` needs to be called once with the schema for the data that `rds_write` will write. These go to a lambda called `rds-proxy`, which will write to/read from the DB as necessary. This data can then be arbitrarily queried via `rds-proxy` (for use in CI) or on metrics.pytorch.org (for analysis).

It also hooks these up for build size tracking (which previously was not working on GHA)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64303

Reviewed By: mruberry

Differential Revision: D30941182

Pulled By: driazati

fbshipit-source-id: 12c5575ddd29902477464fc989ad76a052306b9b

2 years agonvfuser update (#63745)
jiej [Wed, 15 Sep 2021 21:40:18 +0000 (14:40 -0700)]
nvfuser update (#63745)

Summary:
Syncing nvfuser code base from devel branch, Listing a few of our development since last sync:

- Extends support to normalization and reduction kernels.
- Multiple kernel launch for single `CudaFusionGroup`. Hierarchical caching system has been updated to cache graph segmentation.
- profile_ivalue is enabled to convert dynamic scalar into compile time constants, which are required by the codegen. (e.g. reduction axes).

To keep this PR simple and relatively review-free. We stripped most external changes and submitted them as separate PRs, so this gigantic PR is easier to handle.

internal updates are files located in:
1. updates in nvfuser codegen `torch/csrc/jit/coddgen/cuda`
2. added nvfuser specific benchmarks `benchmarks/cpp/nvfuser`
3. nvfuser jit cpp tests `test/cpp/jit/test_gpu.cpp` `test/cpp/jit/test_gpu_shift.cpp` `test/cpp/jit/test_gpu_validator.h`

updates affecting integration:

1. profile_ivalue enabled for nvfuser. related changes are in `torch/csrc/jit/runtime/*`,
2. exposed a few more symbols `aten/src/ATen/core/*` used by codegen

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63745

Reviewed By: saketh-are

Differential Revision: D30752939

Pulled By: malfet

fbshipit-source-id: ce122e80f01bcd3865f5bd3c4dfde660665fd84c

2 years agoAdd embedding shape analysis (#64323)
Elias Ellison [Wed, 15 Sep 2021 20:43:12 +0000 (13:43 -0700)]
Add embedding shape analysis (#64323)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64323

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30738145

Pulled By: eellison

fbshipit-source-id: be12408330d671bc65cf645aa2c20fafd954e6a9

2 years agoMax Pool with indices (#64121)
Elias Ellison [Wed, 15 Sep 2021 20:43:12 +0000 (13:43 -0700)]
Max Pool with indices (#64121)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64121

Add support for aten operators which return multiple outputs

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30738142

Pulled By: eellison

fbshipit-source-id: 0d7e51187bd5e3e9b43f0fdb5178366a97aec943

2 years agoAdd Maxpool to shape analysis / Opinfo (#63530)
Elias Ellison [Wed, 15 Sep 2021 20:43:12 +0000 (13:43 -0700)]
Add Maxpool to shape analysis / Opinfo (#63530)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63530

how to review: pretty much just check that the inputs generated are a good representation of the op semantics, that should be sufficient for correctness, and then you can also double check the op size semantics by going to https://codebrowser.bddppq.com/pytorch/pytorch/ typing in native::{op_name} and looking at the op implementation as a bonus if you want

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30738147

Pulled By: eellison

fbshipit-source-id: cf52339e572ee04e0d6167fd95d8a82d58ea7706

2 years ago[quant][refactor] Change the structure of the ao migration tests (#64912)
Zafar Takhirov [Wed, 15 Sep 2021 20:11:58 +0000 (13:11 -0700)]
[quant][refactor] Change the structure of the ao migration tests (#64912)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64912

The test naming was confusing and ambiguous. The file was changed to reflect the framework that is being migrated ("quantization" instead of "quantize"). Also, the common testing class was extracted out
ghstack-source-id: 138157450

Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization`

Reviewed By: vkuzo

Differential Revision: D30898214

fbshipit-source-id: 017f95995271d35bcdf6ff6a1b3974b837543e84

2 years agoAdd retries to ECR login step (#65013)
David Riazati [Wed, 15 Sep 2021 20:10:02 +0000 (13:10 -0700)]
Add retries to ECR login step (#65013)

Summary:
Switch retry mode from `legacy` to `standard` (https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-retries.html#cli-usage-retries-configure) and up the number of retries.

Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65013

Reviewed By: zhouzhuojie, mruberry

Differential Revision: D30943292

Pulled By: driazati

fbshipit-source-id: 0a21e9b4eacbb77e6aca22f9256d94cd591b23cd

2 years agoTo add state dict and load_dict for Chained Scheduler (#65034)
Ilqar Ramazanli [Wed, 15 Sep 2021 20:07:59 +0000 (13:07 -0700)]
To add state dict and load_dict for Chained Scheduler (#65034)

Summary:
Adding state_dict() and load_state_dict() methods for Chained Scheduler

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65034

Reviewed By: prabhat00155, nateanl

Differential Revision: D30958207

Pulled By: datumbox

fbshipit-source-id: 1a587a330d34e0548e891a39f8fb5a3d251b71fa

2 years ago[ONNX] Enhance shape (two changes merged) (#64585)
BowenBao [Wed, 15 Sep 2021 19:56:33 +0000 (12:56 -0700)]
[ONNX] Enhance shape (two changes merged) (#64585)

Summary:
Enhanced shape inference by introducing typeReliableMap.
[ONNX] exporter changes for torch hub models (https://github.com/pytorch/pytorch/issues/62856)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64585

Reviewed By: ezyang

Differential Revision: D30870418

Pulled By: msaroufim

fbshipit-source-id: 87a294799cb87d649d1d13b6114a5cfbac9be15c

Co-authored-by: jiafatom <jiafa@microsoft.com>
2 years ago[Static Runtime] Move MemoryPlanner out into memory_planner.cpp (#65011)
Don Jang [Wed, 15 Sep 2021 19:50:22 +0000 (12:50 -0700)]
[Static Runtime] Move MemoryPlanner out into memory_planner.cpp (#65011)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65011

This change moves `MemoryPlanner` out of impl.cpp into memory_planner.cpp.

`MemoryPlanner` performs an independent sub-task of static analysis of a graph, and creating memory planning, and allocating/deallocating managed Tensors.

This change will reduce merge conflicts as I work on MemoryPlanner more actively for output Tensor support.

Test Plan: N/A

Reviewed By: mikeiovine

Differential Revision: D30883290

fbshipit-source-id: a37570f8d9430224a6987d2190bcf81cf875043d

2 years ago(torch.distributed.elastic) properly format traceback on error (#65041)
Kiuk Chung [Wed, 15 Sep 2021 19:48:28 +0000 (12:48 -0700)]
(torch.distributed.elastic) properly format traceback on error (#65041)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65041

Fixes a bug introduced in https://github.com/pytorch/pytorch/pull/64036 where the traceback of the error handler is printed out rather than the traceback of the actual exception.

Fixes https://github.com/pytorch/pytorch/issues/60910
Closes https://github.com/pytorch/pytorch/issues/60910

BEFORE (note that the `py_callstack` is NOT the traceback of the RuntimeError):
```
**************************************************************************************************************************************************************************************************************************************************
                                                                                                              run_script_path FAILED
==================================================================================================================================================================================================================================================
Root Cause:
[0]:
  time: 2021-09-14_22:01:06
  rank: 0 (local_rank: 0)
  exitcode: 1 (pid: 1092727)
  error_file: /tmp/torchelastic_aeyvjbpe/none_8zuih7tj/attempt_0/0/error.json
  msg:
    {
      "message": "RuntimeError: rasing error since --throw was specified",
      "extraInfo": {
        "py_callstack": [
          "  File \"<string>\", line 1, in <module>\n",
          "  File \"/usr/local/fbcode/platform009/lib/python3.8/multiprocessing/spawn.py\", line 116, in spawn_main\n    exitcode = _main(fd, parent_sentinel)\n",
          "  File \"/usr/local/fbcode/platform009/lib/python3.8/multiprocessing/spawn.py\", line 129, in _main\n    return self._bootstrap(parent_sentinel)\n",
          "  File \"/usr/local/fbcode/platform009/lib/python3.8/multiprocessing/process.py\", line 315, in _bootstrap\n    self.run()\n",
          "  File \"/usr/local/fbcode/platform009/lib/python3.8/multiprocessing/process.py\", line 108, in run\n    self._target(*self._args, **self._kwargs)\n",
          "  File \"/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/multiprocessing/spawn.py\", line 59, in _wrap\n    fn(i, *args)\n",
          "  File \"/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/elastic/multiprocessing/api.py\", line 382, in _wrap\n    ret = record(fn)(*args_)\n",
          "  File \"/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/elastic/multiprocessing/errors/__init__.py\", line 373, in wrapper\n    error_handler.record_exception(e)\n",
          "  File \"/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/elastic/multiprocessing/errors/error_handler.py\", line 86, in record_exception\n    _write_error(e, self._get_error_file_path())\n",
          "  File \"/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/elastic/multiprocessing/errors/error_handler.py\", line 26, in _write_error\n    \"py_callstack\": traceback.format_stack(),\n"
        ],
        "timestamp": "1631682066"
      }
    }

==================================================================================================================================================================================================================================================
Other Failures:
  <NO_OTHER_FAILURES>
**************************************************************************************************************************************************************************************************************************************************
```

AFTER (note the traceback is the traceback of the RuntimeError):
```
********************************************************************************
                             run_script_path FAILED
================================================================================
Root Cause:
[0]:
  time: 2021-09-14_21:49:25
  rank: 0 (local_rank: 0)
  exitcode: 1 (pid: 1014681)
  error_file: /tmp/torchelastic_q0zods2c/none_qwmz5dgj/attempt_0/0/error.json
  msg: Traceback (most recent call last):
    File "/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 361, in wrapper
      return f(*args, **kwargs)
    File "/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/run.py", line 671, in run_script_path
      runpy.run_path(sys.argv[0], run_name="__main__")
    File "/usr/local/fbcode/platform009/lib/python3.8/runpy.py", line 265, in run_path
      return _run_module_code(code, init_globals, run_name,
    File "/usr/local/fbcode/platform009/lib/python3.8/runpy.py", line 97, in _run_module_code
      _run_code(code, mod_globals, init_globals,
    File "/usr/local/fbcode/platform009/lib/python3.8/runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "/home/kiuk/tmp/test.py", line 55, in <module>
      main()
    File "/data/users/kiuk/fbsource/fbcode/buck-out/dev/gen/caffe2/run#link-tree/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 361, in wrapper
      return f(*args, **kwargs)
    File "/home/kiuk/tmp/test.py", line 25, in main
      raise RuntimeError("rasing error since --throw was specified")
  RuntimeError: rasing error since --throw was specified

================================================================================
Other Failures:
  <NO_OTHER_FAILURES>
********************************************************************************
```

Test Plan:
(see summary for before and after)

`test.py` contents:
```
import argparse
import os
import sys

import torch
import torch.distributed as dist
import torch.nn.functional as F

from torch.distributed.elastic.multiprocessing.errors import record

def parse_args(argv):
    parser = argparse.ArgumentParser(description="test script")
    parser.add_argument("--init_method", type=str, default="env://")
    parser.add_argument("--backend", type=str, default="gloo")
    parser.add_argument("--throw", action="store_true", default=False)
    parser.add_argument("--exit", action="store_true", default=False)
    return parser.parse_args()

record
def main():
    args = parse_args(sys.argv[1:])

    if args.throw:
        raise RuntimeError("rasing error since --throw was specified")

    if args.exit:
        sys.exit(1)

    init_method=args.init_method
    backend=args.backend

    world_size = int(os.environ["WORLD_SIZE"])
    rank = int(os.environ["RANK"])

    print(f"initializing `{backend}` process group with rank={rank}, world_size={world_size} at {init_method}")

    dist.init_process_group(
        backend=backend,
        init_method=init_method,
        world_size=world_size,
        rank=rank)

    print(f"successfully initialized process group with rank={dist.get_rank()}, world_size={dist.get_world_size()}")

    t = F.one_hot(torch.tensor(rank), num_classes=world_size)
    dist.all_reduce(t)
    derived_world_size = torch.sum(t).item()
    if derived_world_size != world_size:
        raise RuntimeError(f"derived world size: {derived_world_size} != actual world size: {world_size}")
    else:
        print(f"sucessfully derived world size: {derived_world_size} (expected: {world_size}). Exiting")

if __name__ == "__main__":
    main()
```

run it as:

```
$ python -m torch.distributed.run --nproc_per_node 2 test.py --throw
```

Reviewed By: cbalioglu

Differential Revision: D30953731

fbshipit-source-id: bbea04c59c2aec58969cf44d8e3723d5f8abe8a8

2 years agoRemove `run_functional_checks` from `test_autograd` and create necessary OpInfos...
soulitzer [Wed, 15 Sep 2021 19:43:54 +0000 (12:43 -0700)]
Remove `run_functional_checks` from `test_autograd` and create necessary OpInfos (#64993)

Summary:
OpInfo tracker: https://github.com/pytorch/pytorch/issues/54261

 - Eliminate duplicated testing logic in test_autograd
 - Moved tests that rely on this testing logic to use OpInfos
   - `cat` already has OpInfo (no action needed)
   - Created OpInfo for `block_diag` and `broadcast_tensors`

Running into some FX errors. Added op to skip-list and created an issue here: https://github.com/pytorch/pytorch/issues/64997
Both `block_diag` and `broadcast_tensors` are variadic, so skipping `test_variant_consistency_jit` (from comments on other OpInfos, it looks like JIT does not support variadic tensors)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64993

Reviewed By: jbschlosser

Differential Revision: D30961736

Pulled By: soulitzer

fbshipit-source-id: e169305384a683acae1178c4e12e9e214a67226a

2 years agoDispatch.h: Avoid including ivalue (#64165)
Peter Bell [Wed, 15 Sep 2021 19:15:01 +0000 (12:15 -0700)]
Dispatch.h: Avoid including ivalue (#64165)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64165

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D30728587

Pulled By: ezyang

fbshipit-source-id: d0d2e97491d9d5e2d2fc2d6e51420a4467c1bba4

2 years agoTo add state_dict and load_state_dict to SequentialLR (#65035)
Ilqar Ramazanli [Wed, 15 Sep 2021 18:55:53 +0000 (11:55 -0700)]
To add state_dict and load_state_dict to SequentialLR (#65035)

Summary:
To add state_dict() and load_state_dict() methods to SequentialLR

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65035

Reviewed By: prabhat00155, nateanl

Differential Revision: D30958204

Pulled By: datumbox

fbshipit-source-id: 65114e1b07146526ae2680233f5cd42b2534d67a

2 years ago[CircleCI] Disable pytorch_linux_xenial_cuda10_2 test jobs (#65071)
Nikita Shulga [Wed, 15 Sep 2021 18:52:06 +0000 (11:52 -0700)]
[CircleCI] Disable pytorch_linux_xenial_cuda10_2 test jobs (#65071)

Summary:
As all of them has been migrated to GHA:
- pytorch_linux_pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_distributed_test -> "linux-xenial-cuda11.3-py3.6-gcc7 / test (distributed, 1, 1, linux.8xlarge.nvidia.gpu)"
- pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1 -> "linux-xenial-cuda10.2-py3.6-gcc7 / test (default, 1, 2,
linux.8xlarge.nvidia.gpu)"
- pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 -> "linux-xenial-cuda10.2-py3.6-gcc7 / test (default, 2, 2,
linux.8xlarge.nvidia.gpu)"
- pytorch_linux_xenial_cuda10_2_cudnn7_py3_multigpu_test -> "linux-xenial-cuda10.2-py3.6-gcc7 / test (multigpu, 1, 1,
linux.16xlarge.nvidia.gpu)"
- pytorch_linux_xenial_cuda10_2_cudnn7_py3_nogpu_NO_AVX2_test -> "linux-xenial-cuda10.2-py3.6-gcc7 / test (nogpu_NO_AVX2, 1, 1, linux.2xlarge)"
- pytorch_linux_xenial_cuda10_2_cudnn7_py3_nogpu_NO_AVX_test -> "linux-xenial-cuda10.2-py3.6-gcc7 / test (nogpu_NO_AVX, 1, 1, linux.2xlarge)"
- pytorch_linux_xenial_cuda10_2_cudnn7_py3_slow_test -> "linux-xenial-cuda10.2-py3.6-gcc7 / test (slow, 1, 1, linux.8xlarge.nvidia.gpu)"

"pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_build" is still a holdout due to slow gradchecks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65071

Reviewed By: driazati, seemethere, janeyx99

Differential Revision: D30963413

Pulled By: malfet

fbshipit-source-id: d9a5188ce7eb2f60547b91b854a5db83af2b10e7

2 years agoStarter Task 1 (#64927)
Samuel Salas [Wed, 15 Sep 2021 18:49:28 +0000 (11:49 -0700)]
Starter Task 1 (#64927)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64927

Mypy error corrections

Test Plan: Corrected mypy errors to make code less prone to bugs by modifying types or adding lines that avoid special undesired cases e.g. asserting a variable to not None.

Reviewed By: wushirong

Differential Revision: D30901654

fbshipit-source-id: daae8692603b8b38203a98f673c455749c2fb855

2 years ago[ROCm] Update CI images for ROCm 4.3.1 (#64610)
Kyle Chen [Wed, 15 Sep 2021 18:48:33 +0000 (11:48 -0700)]
[ROCm] Update CI images for ROCm 4.3.1 (#64610)

Summary:
Signed-off-by: Kyle Chen <kylechen@amd.com>
reference:
https://github.com/pytorch/pytorch/issues/58017

jithunnair-amd
jeffdaily
arindamroy-eng

cc jeffdaily sunway513 jithunnair-amd ROCmSupport

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64610

Reviewed By: seemethere

Differential Revision: D30964582

Pulled By: malfet

fbshipit-source-id: a8335d3d32d7f1557d3cf6cb055ad0f9c49ef7aa

2 years agoPort `all` and `any` full reductions to structured kernels. (#64642)
Yukio Siraichi [Wed, 15 Sep 2021 18:05:14 +0000 (11:05 -0700)]
Port `all` and `any` full reductions to structured kernels. (#64642)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64642

Tracking issue: #55070

This PR creates out overloads for both `all` and `any` kernels (full reduction overload),
and ports them to structured kernels.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30867354

Pulled By: ezyang

fbshipit-source-id: 46bccaf6c94a09ed77cc6c724d1183c82f801751

2 years ago[PyTorch] remove string_view::operator[] bounds check (#64670)
Scott Wolchok [Wed, 15 Sep 2021 16:55:02 +0000 (09:55 -0700)]
[PyTorch] remove string_view::operator[] bounds check (#64670)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64670

Bounds checking is not required for `std::string_view`, and the checking hoses performance for the following performance prototype diff.
ghstack-source-id: 138037531

Test Plan: CI

Reviewed By: ezyang, bhosmer

Differential Revision: D30747515

fbshipit-source-id: 1f4374415a82dfdccce76ea2c6885c13cb93d369

2 years ago[PyTorch][easy] Add cbegin/cend to SmallVector (#64682)
Scott Wolchok [Wed, 15 Sep 2021 16:55:02 +0000 (09:55 -0700)]
[PyTorch][easy] Add cbegin/cend to SmallVector (#64682)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64682

Looks like it was forked from llvm before cbegin and cend existed.
ghstack-source-id: 138036981

Test Plan: CI

Reviewed By: dhruvbird

Differential Revision: D30814434

fbshipit-source-id: 9740fa8d3df1c90b77298a95ab9f1d0cf8c90320