platform/upstream/pytorch.git
3 years agoCI: Enable using labels to control GHA workflows (#64314)
Jane Xu [Thu, 2 Sep 2021 16:50:56 +0000 (09:50 -0700)]
CI: Enable using labels to control GHA workflows (#64314)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/62852

Sets a global environment variable containing a list of PR labels. For this PR, the PR_LABELS variable looks like:
```
[
  "cla signed",
  "ciflow/default"
]
```
confirmed in a run: https://github.com/pytorch/pytorch/runs/3490072161?check_suite_focus=true

This information can be used in other workflow steps to control the logic. For example, if I want to force a build, I can label my PR with "force-build" and do something like the following in my build script:
```
if [[ "${PR_LABELS}" = *force-build* ]]; then
   python setup.py install
else
   # use cached wheel or something
   :  # no-op placeholder: bash rejects a branch that contains only a comment
fi
```
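
Because `PR_LABELS` holds a JSON list, a substring match such as `*force-build*` would also fire on a label like `no-force-build`. The following is a minimal Python sketch of an exact-match check, assuming the variable is exported to the step's environment in the JSON form shown above. The helper name `has_label` is hypothetical and not part of this PR.

```python
import json
import os


def has_label(label, raw=None):
    """Return True if `label` is exactly one of the PR labels.

    `raw` defaults to the PR_LABELS environment variable, which is assumed
    to hold a JSON list of strings. A missing or empty value means "no labels".
    """
    if raw is None:
        raw = os.environ.get("PR_LABELS", "")
    labels = json.loads(raw) if raw.strip() else []
    return label in labels


if __name__ == "__main__":
    sample = '["cla signed", "ciflow/default"]'
    print(has_label("ciflow/default", sample))  # exact match on a whole label
    print(has_label("ciflow", sample))          # a substring is not a match
```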

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64314

Reviewed By: driazati

Differential Revision: D30714570

Pulled By: janeyx99

fbshipit-source-id: 80b060ee32643ddd22eb7b8ec548579c7ccf6441

3 years agoFixes and details to torchhub docs (#63783)
Nicolas Hug [Thu, 2 Sep 2021 16:27:44 +0000 (09:27 -0700)]
Fixes and details to torchhub docs (#63783)

Summary:
This PR:

- adds a few details regarding the newly added `skip_validation` parameter https://github.com/pytorch/pytorch/pull/62139
- uses double-backticks instead of single-backticks since this is rst, not markdown.
- adds a few minor doc nits here and there

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63783

Reviewed By: zou3519

Differential Revision: D30696658

Pulled By: NicolasHug

fbshipit-source-id: 6f01c7eb3cfcd7e17e4c33c09d193054fa18ad36

3 years agoTST Adds __repr__ and str to module info (#63737)
Thomas J. Fan [Thu, 2 Sep 2021 16:02:35 +0000 (09:02 -0700)]
TST Adds __repr__ and str to module info (#63737)

Summary:
Follow up to https://github.com/pytorch/pytorch/pull/61935

This PR adds `test_repr` to `test_modules`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63737

Reviewed By: gchanan

Differential Revision: D30729642

Pulled By: jbschlosser

fbshipit-source-id: c11a28bc0739abd3ed40727389dd28ed4069edad

3 years agoFix torch.istft length mismatch and window runtime error (#63469)
Zhaoheng Ni [Thu, 2 Sep 2021 15:59:53 +0000 (08:59 -0700)]
Fix torch.istft length mismatch and window runtime error (#63469)

Summary:
The PR fixes two issues:
- See https://github.com/pytorch/pytorch/issues/62747 and https://github.com/pytorch/audio/issues/1409. Fixes the length mismatch that occurs when the given ``length`` parameter is longer than expected, by adding padding logic consistent with librosa.
- See https://github.com/pytorch/pytorch/issues/62323. The current implementation checks whether the minimum value of window_envelop.abs() is greater than zero. librosa instead normalizes the signal only at the non-zero values, by indexing, like:
```
approx_nonzero_indices = ifft_window_sum > util.tiny(ifft_window_sum)
y[approx_nonzero_indices] /= ifft_window_sum[approx_nonzero_indices]
```
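
The indexing idea in the librosa snippet can be restated in plain Python as a rough sketch. This is not the `torch.istft` implementation: the function name is hypothetical, and the `tiny` threshold is an assumption standing in for `util.tiny`.

```python
import sys


def normalize_by_window_sum(y, window_sum, tiny=sys.float_info.min):
    """Divide y[i] by window_sum[i] only where the window sum is not ~0.

    Samples whose window sum is at or below `tiny` are left unchanged,
    so a zero in the window envelope no longer has to be an error.
    """
    if len(y) != len(window_sum):
        raise ValueError("y and window_sum must have the same length")
    return [
        sample / weight if weight > tiny else sample
        for sample, weight in zip(y, window_sum)
    ]
```

For instance, with `y = [2.0, 3.0, 4.0]` and `window_sum = [2.0, 0.0, 4.0]`, only the first and last samples are divided; the middle one is passed through untouched.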

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63469

Reviewed By: fmassa

Differential Revision: D30695827

Pulled By: nateanl

fbshipit-source-id: d034e53f0d65b3fd1dbd150c9c5acf3faf25a164

3 years ago[Static Runtime] Add sign/abs/log1p/mul fusion pass (#64209)
Mike Iovine [Thu, 2 Sep 2021 15:12:48 +0000 (08:12 -0700)]
[Static Runtime] Add sign/abs/log1p/mul fusion pass (#64209)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64209

Add a new fusion pass that transforms the following pattern:
```
graph(%input):
    %0 : Tensor = aten::sign(%input)
    %1 : Tensor = aten::abs(%input)
    %2 : Tensor = aten::log1p(%1)
    %res : Tensor = aten::mul(%0, %2)
    return (%res)
```
Into a single op:
```
graph(%input):
    %res : Tensor = static_runtime::signed_log1p(%input)
    return (%res)
```

The intent is to reduce the number of passes over the tensor. However, enabling this pass actually causes a performance regression, probably due to a lack of vectorization in the fused implementation. Because of this issue, this diff **does not** enable this pass.
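
To make the "number of passes" point concrete, here is a plain-Python sketch of the two shapes of computation, with lists standing in for tensors. It only illustrates the arithmetic `sign(x) * log1p(|x|)`; it is not the Static Runtime kernel, and the function names are chosen for illustration.

```python
import math


def sign(x):
    """Return -1, 0 or 1, matching the usual definition of sign."""
    return (x > 0) - (x < 0)


def unfused(values):
    """Four separate passes, one per op in the original graph."""
    signs = [sign(v) for v in values]                 # aten::sign
    mags = [abs(v) for v in values]                   # aten::abs
    logs = [math.log1p(m) for m in mags]              # aten::log1p
    return [s * lg for s, lg in zip(signs, logs)]     # aten::mul


def signed_log1p(values):
    """One pass that computes the same result element by element."""
    return [sign(v) * math.log1p(abs(v)) for v in values]
```

Both functions return identical values; the fused form simply touches each element once instead of four times.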

Followup: navahgar will add an NNC kernel which is faster than the unfused version and enable this pass. We still need this version as a fallback since the NNC kernel will not support all dtypes.

Test Plan:
`buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest -- SignedLog1p`

Test passed with new graph pass disabled and enabled.

Reviewed By: hlu1

Differential Revision: D30559929

fbshipit-source-id: e4e080cb2e6a705cfdde1fc98bee92b723f8132a

3 years ago[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT`
CodemodService FBSourceClangFormatLinterBot [Thu, 2 Sep 2021 15:10:37 +0000 (08:10 -0700)]
[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT`

Reviewed By: zertosh

Differential Revision: D30710635

fbshipit-source-id: e8dae05a7e3a19d656067a4f102aab4a3c93ac42

3 years agoFix broken caffe2 test: PlanExecutorTest.BlockingErrorPlan (#64401)
Seth Elliott [Thu, 2 Sep 2021 14:48:47 +0000 (07:48 -0700)]
Fix broken caffe2 test: PlanExecutorTest.BlockingErrorPlan (#64401)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64401

PlanExecutorTest.BlockingErrorPlan uses `ASSERT_DEATH` which internally performs a `fork()`. This can cause problems under certain configurations that use threads. This change updates this test to use the "threadsafe" style for GTest death tests in order to improve its quality in multithreaded environments.

Test Plan:
I confirmed that this change fixes the issue on my devvm with the following command:
```
buck test mode/dev //caffe2/caffe2:caffe2_test_cpu -- PlanExecutorTest.BlockingErrorPlan
```

Reviewed By: praihan

Differential Revision: D30709447

fbshipit-source-id: 12ffd9ad0371e2e5b43a9873c80568e5ab02d246

3 years agosimplify op name determination into a single forward pass (#64261)
Michael Dagitses [Thu, 2 Sep 2021 13:49:09 +0000 (06:49 -0700)]
simplify op name determination into a single forward pass (#64261)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64261

Note that this does not preserve byte-for-byte compatibility with
existing names.

Test Plan:
* Rely on CI to catch gross errors.
* Merge after release cut to catch subtle issues.

Reviewed By: albanD

Differential Revision: D30700647

Pulled By: dagitses

fbshipit-source-id: 7b02f34b8fae3041240cc78fbc6bcae498c3acd4

3 years agofix copy.deepcopy on LinearPackedParams (#64367)
Vasiliy Kuznetsov [Thu, 2 Sep 2021 13:12:07 +0000 (06:12 -0700)]
fix copy.deepcopy on LinearPackedParams (#64367)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64367

This is the same thing as https://github.com/pytorch/pytorch/pull/56154
but for quantized linear. It fixes the behavior of `copy.deepcopy` on
these modules. Before this PR, copied instances of `LinearPackedParams`
were not properly initialized, and inspecting them raised errors of
missing `_modules`. After this PR, inspecting and using the copies
works.
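
The failure mode can be reproduced without PyTorch. In the sketch below, `ToyModule` and `PackedParams` are hypothetical stand-ins (not the real `torch.nn.Module` or `LinearPackedParams`): the default `copy.deepcopy` path creates the copy with `__new__` and `__setstate__`, so anything set up only in `__init__` (here `_modules`) is missing. Adding a `__deepcopy__` that runs the base initializer first is one way to address that, shown here as an assumption about the general approach rather than the exact code in this PR.

```python
import copy


class ToyModule:
    """Stand-in for a base class whose __init__ creates bookkeeping state."""

    def __init__(self):
        self._modules = {}


class PackedParams(ToyModule):
    """Serializes only its payload, like a packed-params wrapper might."""

    def __init__(self, weight):
        super().__init__()
        self.weight = list(weight)

    def __getstate__(self):
        return {"weight": self.weight}

    def __setstate__(self, state):
        self.weight = list(state["weight"])


class FixedPackedParams(PackedParams):
    def __deepcopy__(self, memo):
        new = type(self).__new__(type(self))
        memo[id(self)] = new
        ToyModule.__init__(new)  # recreate the state that __init__ owns
        new.__setstate__(copy.deepcopy(self.__getstate__(), memo))
        return new


if __name__ == "__main__":
    broken = copy.deepcopy(PackedParams([1, 2]))
    fixed = copy.deepcopy(FixedPackedParams([1, 2]))
    print(hasattr(broken, "_modules"), hasattr(fixed, "_modules"))
```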

Test Plan:
```
python test/test_quantization.py TestStaticQuantizedModule.test_linear_api
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D30702667

fbshipit-source-id: 38c26d1e72663416eeb989985b77ffc2052c12b9

3 years ago[jit] shape propagation for prepack (#63585)
Ivan Kobzarev [Thu, 2 Sep 2021 12:27:59 +0000 (05:27 -0700)]
[jit] shape propagation for prepack (#63585)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63585

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30428905

Pulled By: IvanKobzarev

fbshipit-source-id: c18f6605a69b2e000bdf14a23e637c5a1c2ec64c

3 years agoextract TestAutogradComplex into its own test file (#63400)
Michael Dagitses [Thu, 2 Sep 2021 11:04:59 +0000 (04:04 -0700)]
extract TestAutogradComplex into its own test file (#63400)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63400

This is the first step to break up test_autograd.py for #63205.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30541499

Pulled By: dagitses

fbshipit-source-id: 8d9d32007938b9eade0e88f95a6a3190e7e2ef01

3 years agorequire that `TARGET_DET_LIST` is sorted (and sort it here) (#64102)
Michael Dagitses [Thu, 2 Sep 2021 11:04:59 +0000 (04:04 -0700)]
require that `TARGET_DET_LIST` is sorted (and sort it here) (#64102)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64102

We sort this list so that we may add comments to indicate the absence
of a file right where that file would need to be put. This makes it
difficult to wrongly add such a file.

The sorting itself was done programmatically to ensure that no entries
were inadvertently removed.

I printed the sorted list with:

```
for p in sorted(TARGET_DET_LIST):
    print(f'    "{p}",')
```

Then copied it back into the file.
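
A sortedness requirement like this can be enforced with a small check. The sketch below is one possible way to do it; the helper name `assert_sorted` is hypothetical and the PR's actual check may look different.

```python
def assert_sorted(entries):
    """Raise AssertionError naming the first out-of-order neighbouring pair."""
    for prev, cur in zip(entries, entries[1:]):
        assert prev <= cur, f"{cur!r} must come before {prev!r}"


if __name__ == "__main__":
    assert_sorted(["test_a", "test_b", "test_c"])  # passes silently
```

Reporting the offending pair, rather than only comparing against `sorted(entries)`, tells the contributor exactly where the new entry belongs.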

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30625076

Pulled By: dagitses

fbshipit-source-id: cf36fcb3e53e274b76d1f4aae83da1f53c03f9ed

3 years agoFix list() and help() torchhub functions for Windows (#63773)
Nicolas Hug [Thu, 2 Sep 2021 10:48:44 +0000 (03:48 -0700)]
Fix list() and help() torchhub functions for Windows (#63773)

Summary:
This PR fixes the help() and list() torchhub functions, which were probably failing on Windows since the `/` OS separator was hardcoded.
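
As a small illustration of why a hardcoded `/` is fragile, the sketch below builds the same path with the POSIX and Windows flavours of `os.path`. The helper `hubconf_path` and its `pathmod` parameter are hypothetical and exist only so both behaviours can be shown on any platform.

```python
import ntpath
import os
import posixpath


def hubconf_path(repo_dir, filename="hubconf.py", pathmod=os.path):
    """Join with the separator of the given path module instead of '/'."""
    return pathmod.join(repo_dir, filename)


if __name__ == "__main__":
    print(hubconf_path("/cache/repo", pathmod=posixpath))
    print(hubconf_path("C:\\cache\\repo", pathmod=ntpath))
```

In real code `pathmod` would simply be left at its default, so the running platform decides the separator.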

Before merging this I need to double check whether the CI actually runs the corresponding tests on Windows or not

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63773

Reviewed By: zou3519

Differential Revision: D30695664

Pulled By: NicolasHug

fbshipit-source-id: fac328163fd05db804a8186ae28f22b3cc3a6404

3 years agoRemove outdated comment in hub.py (#63757)
Nicolas Hug [Thu, 2 Sep 2021 10:46:59 +0000 (03:46 -0700)]
Remove outdated comment in hub.py (#63757)

Summary:
This PR removes an outdated comment about Python2 that was originally introduced in https://github.com/pytorch/pytorch/pull/25083/files. The code has changed since then, but the comment wasn't removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63757

Reviewed By: zou3519

Differential Revision: D30695656

Pulled By: NicolasHug

fbshipit-source-id: 431cf414588b9e5a1ad6acdae724ff5af1b16971

3 years agoUpdate hub.load() signature to avoid polluting kwargs param (#63755)
Nicolas Hug [Thu, 2 Sep 2021 10:45:06 +0000 (03:45 -0700)]
Update hub.load() signature to avoid polluting kwargs param (#63755)

Summary:
This PR addresses an old comment about Python2 EOL, directly putting some parameters in the function signature instead of in a `**kwargs` dict.

I believe the changes are fully backward compatible.
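
The pattern being replaced can be sketched as follows. Both functions are hypothetical stand-ins and the option names are illustrative; the point is only that keyword-only parameters behave the same for callers who pass options by keyword while making them visible in the signature.

```python
def load_old(repo, model, *args, **kwargs):
    """Python 2 friendly style: options are popped out of kwargs by hand."""
    source = kwargs.pop("source", "github")
    force_reload = kwargs.pop("force_reload", False)
    return source, force_reload, args, kwargs


def load_new(repo, model, *args, source="github", force_reload=False, **kwargs):
    """Python 3 style: options are keyword-only parameters in the signature."""
    return source, force_reload, args, kwargs


if __name__ == "__main__":
    call = dict(source="local", pretrained=True)
    print(load_old("repo", "model", 1, **call) == load_new("repo", "model", 1, **call))
```

In both versions, whatever remains in `kwargs` is what would be forwarded to the model's entry point.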

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63755

Reviewed By: zou3519

Differential Revision: D30695634

Pulled By: NicolasHug

fbshipit-source-id: 398f347c5a04bfb58e77e46773a869cb9d0eb225

3 years agoFix TRTModule not adding outputs in order (#64418)
Kefei Lu [Thu, 2 Sep 2021 08:17:56 +0000 (01:17 -0700)]
Fix TRTModule not adding outputs in order (#64418)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64418

In T99368564, we found that when running a TRT-lowered module, the output tensors are out of order compared to the output from the original, non-lowered module. It turns out that in `TRTModule.forward()`, we cannot rely on the natural index order of the `ICudaEngine` bindings to create the output tensors; instead, we should explicitly construct the output tensors from the bindings' names, in an order that we supply.
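
The ordering idea can be modelled in plain Python. This sketch does not use the TensorRT API; `collect_outputs` and its arguments are hypothetical, with lists standing in for the engine's bindings and their results.

```python
def collect_outputs(binding_names, results_by_binding, output_names):
    """Return results in the caller-supplied `output_names` order.

    binding_names:      names in the engine's own (arbitrary) index order
    results_by_binding: results aligned with `binding_names`
    output_names:       the order the original module returns its outputs in
    """
    index_of = {name: i for i, name in enumerate(binding_names)}
    return [results_by_binding[index_of[name]] for name in output_names]


if __name__ == "__main__":
    engine_order = ["out_b", "out_a"]
    results = ["tensor_b", "tensor_a"]
    print(collect_outputs(engine_order, results, ["out_a", "out_b"]))
```

Looking results up by name means the output order no longer depends on how the engine happened to number its bindings.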

Test Plan:
* Arc lint
* Run CI/sandcastle tests
* Run GPU lowering using commands and code changes in D30171741 and ensure we don't observe out-of-order outputs

Reviewed By: yinghai

Differential Revision: D30693545

fbshipit-source-id: 32a894ceeb148fcf4e8d279be3835c7d1f1aa2ba

3 years agoPort `gather` to structured kernel (#63312)
Kushashwa Ravi Shrimali [Thu, 2 Sep 2021 08:08:53 +0000 (01:08 -0700)]
Port `gather` to structured kernel (#63312)

Summary:
Will add a description once this is ready for review.

cc: ysiraichi ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63312

Reviewed By: iramazanli

Differential Revision: D30597447

Pulled By: ezyang

fbshipit-source-id: d36e59835c2f4b38e286032dd2a1111a7e16b7e5

3 years agoReplace std::unordered_map<c10::Device, c10::Device> with DeviceMap (#64393)
Pavel Belevich [Thu, 2 Sep 2021 07:57:39 +0000 (00:57 -0700)]
Replace std::unordered_map<c10::Device, c10::Device> with DeviceMap (#64393)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64393

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D30708384

Pulled By: pbelevich

fbshipit-source-id: 1c565727e4f09cd9e560874dd90aa403470b4a97

3 years ago[PyTorch Edge] Support default args with out arg, flag off (#63540)
Chen Lai [Thu, 2 Sep 2021 07:50:40 +0000 (00:50 -0700)]
[PyTorch Edge] Support default args with out arg, flag off (#63540)

Summary:
1. Allow consuming operators with default arguments and out arguments. The flag is off to keep the same behavior as v6; PR 63651 turns the flag on.
2. Add two unit tests to cover this type of operator.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63540

ghstack-source-id: 137211562

Test Plan:
```
caffe2/test/cpp/jit:jit - LiteInterpreterTest.DefaultArgsWithOutArg
caffe2/test/cpp/jit:jit - LiteInterpreterTest.DefaultArgsPinvWithOutArg
```

Reviewed By: raziel, iseeyuan, tugsbayasgalan

Differential Revision: D30414156

fbshipit-source-id: 0f3a219a22aee10ac53184cbd95940726c459d1f

3 years agoRemove unnecessary resize_output (#64272)
Edward Yang [Thu, 2 Sep 2021 07:48:03 +0000 (00:48 -0700)]
Remove unnecessary resize_output (#64272)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64272

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS

Reviewed By: H-Huang, bdhirsh

Differential Revision: D30686941

Pulled By: ezyang

fbshipit-source-id: de60e6f1115648f8cf7daaa1e652594fe8b06742

3 years agoMove graph util to fx2trt (#64064)
Shirong Wu [Thu, 2 Sep 2021 05:09:42 +0000 (22:09 -0700)]
Move graph util to fx2trt (#64064)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64064

Move the original util in torch2trt to the fx2trt dir since torch2trt is going to be deprecated. This is a follow-up diff for D30379124.

Test Plan: manual

Reviewed By: yinghai, mikekgfb

Differential Revision: D30591687

fbshipit-source-id: ae0e59dfbc2d2e2aa4f3ccea7cff2291c7deb388

3 years agoAdd a warning about DataLoader num_workers > 0 "memory leak" (#64337)
Edward Yang [Thu, 2 Sep 2021 04:48:36 +0000 (21:48 -0700)]
Add a warning about DataLoader num_workers > 0 "memory leak" (#64337)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64337

See https://github.com/pytorch/pytorch/issues/13246

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D30690320

Pulled By: ezyang

fbshipit-source-id: 2751aca05a94e63d25162599f458855988516fad

3 years ago[Dist CI] Move rest of distributed tests to their own CI job (#64253)
Rohan Varma [Thu, 2 Sep 2021 04:07:01 +0000 (21:07 -0700)]
[Dist CI] Move rest of distributed tests to their own CI job (#64253)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64253

Follow up to D30496178 (https://github.com/pytorch/pytorch/commit/f4aff3a346a0525e37d6071f318f7a4c54d5e1fb) to move the rest of distributed tests to their own jobs for Linux GHA.
ghstack-source-id: 137233785

Test Plan: CI

Reviewed By: walterddr

Differential Revision: D30662999

fbshipit-source-id: f7cfbc0d1223aca52120f17f9da987d70fda8de6

3 years ago[DDP] Log num threads (#64072)
Rohan Varma [Thu, 2 Sep 2021 01:12:02 +0000 (18:12 -0700)]
[DDP] Log num threads (#64072)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64072

Log gloo threads to DDP logging.
ghstack-source-id: 137119480

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D30596083

fbshipit-source-id: 2b4f6e762cb5d850be6056bcc5922029a1af3c91

3 years agoadd documentation to shape inference algorithm (#64312)
Zeina Migeed [Thu, 2 Sep 2021 01:04:19 +0000 (18:04 -0700)]
add documentation to shape inference algorithm (#64312)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64312

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D30709254

Pulled By: migeed-z

fbshipit-source-id: 3297d26fe6727c5b9ca176625b1683d787f59659

3 years ago[DDP Comm Hook] Add debugging communication hooks to ddp_comm_hooks.rst (#64352)
Yi Wang [Thu, 2 Sep 2021 00:32:39 +0000 (17:32 -0700)]
[DDP Comm Hook] Add debugging communication hooks to ddp_comm_hooks.rst (#64352)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64352

as title
ghstack-source-id: 137246253

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D30694089

fbshipit-source-id: a78110b11d59bb0718f43c99ede23f2fd8ab21d0

3 years ago[DDP Comm Hook] Create a noop hook for performance debugging (#64344)
Yi Wang [Thu, 2 Sep 2021 00:32:39 +0000 (17:32 -0700)]
[DDP Comm Hook] Create a noop hook for performance debugging (#64344)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64344

As title.

Additionally, avoid using numpy array in test_ddp_hooks.py.
ghstack-source-id: 137170449

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/algorithms/ddp_comm_hooks:test_ddp_hooks -- test_ddp_comm_hook_noop_hook

Reviewed By: rohan-varma

Differential Revision: D30693220

fbshipit-source-id: e17f0d1c6198863cf20a53566f586a6bff602522

3 years ago[DDP] Add more logging iterations (#64071)
Rohan Varma [Thu, 2 Sep 2021 00:04:37 +0000 (17:04 -0700)]
[DDP] Add more logging iterations (#64071)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64071

Adding more logging iterations to get additional data.
ghstack-source-id: 137119476

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D30579367

fbshipit-source-id: 57195266ada5e5926f0d8eaf4fb4e01dc98924d7

3 years agoFix incorrect DDP test (#64074)
Rohan Varma [Wed, 1 Sep 2021 23:25:00 +0000 (16:25 -0700)]
Fix incorrect DDP test (#64074)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64074

Previous PR https://github.com/pytorch/pytorch/pull/63831 did not actually test the error in https://github.com/pytorch/pytorch/issues/63812. Introduce a test
directly from the repro that simulates it.
ghstack-source-id: 137171460

Test Plan: CI

Reviewed By: SciPioneer

Differential Revision: D30569719

fbshipit-source-id: fd61250ef6d291c093607663d91d6d2cb5574eb7

3 years ago[c10d] Prefer use of torch_check (#63928)
Rohan Varma [Wed, 1 Sep 2021 23:21:31 +0000 (16:21 -0700)]
[c10d] Prefer use of torch_check (#63928)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63928

Throwing std::invalid_argument produces no stack trace even with
TORCH_SHOW_CPP_STACKTRACES=1, so prefer TORCH_CHECK here instead.
ghstack-source-id: 137135328

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D30533955

fbshipit-source-id: 33e5bf4f449e3043dec68da93f8022f6624d9675

3 years agoAdd fast path for addmm when the inputs are conjugate (#59380)
anjali411 [Wed, 1 Sep 2021 23:11:38 +0000 (16:11 -0700)]
Add fast path for addmm when the inputs are conjugate (#59380)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59380

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D28898374

Pulled By: anjali411

fbshipit-source-id: eab0e64d37bb57c18b54cabb8e5c00666338ba04

3 years ago[DDP Comm Hook] Add bf16 gradient compression to ddp_comm_hooks.rst (#64346)
Yi Wang [Wed, 1 Sep 2021 23:09:46 +0000 (16:09 -0700)]
[DDP Comm Hook] Add bf16 gradient compression to ddp_comm_hooks.rst (#64346)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64346

as title
ghstack-source-id: 137170288

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D30693513

fbshipit-source-id: 8c64b8404ff3b0322e1bbbd93f6ef051ea91307d

3 years ago[quant][graphmode][fx] Add fbgemm backend_config_dict (#64288)
Jerry Zhang [Wed, 1 Sep 2021 22:48:54 +0000 (15:48 -0700)]
[quant][graphmode][fx] Add fbgemm backend_config_dict (#64288)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64288

This is just to setup the file structure and unblock experimentation.
The format for backend_config_dict will change in the future

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: zou3519

Differential Revision: D30699457

fbshipit-source-id: 28211a4def05d34757850c045a36e311f54760fe

3 years agoMake datasets in `ConcatDataset` not need to be sized (#64114)
Santiago Castro [Wed, 1 Sep 2021 22:18:14 +0000 (15:18 -0700)]
Make datasets in `ConcatDataset` not need to be sized (#64114)

Summary:
`datasets` needs to be iterable, but it also had to be sized because its length was checked before it was converted to a list. Swapping the order of these two lines means it no longer needs to be sized.
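
The effect of swapping the two lines can be shown with a generator, which is iterable but has no `len()`. The two functions below are simplified, hypothetical stand-ins for the constructor logic, not the actual `ConcatDataset` code.

```python
def init_before(datasets):
    """Old order: the length check runs on the raw argument."""
    assert len(datasets) > 0, "datasets should not be an empty iterable"
    return list(datasets)


def init_after(datasets):
    """New order: materialize first, then check the list's length."""
    datasets = list(datasets)
    assert len(datasets) > 0, "datasets should not be an empty iterable"
    return datasets


if __name__ == "__main__":
    print(init_after(name for name in ["train", "val"]))
    try:
        init_before(name for name in ["train", "val"])
    except TypeError as err:
        print("old order fails:", err)
```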

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64114

Reviewed By: H-Huang

Differential Revision: D30641480

Pulled By: ejguan

fbshipit-source-id: 7e16548c2123afa65b83845f9929271fa07fe1e8

3 years agoRestore LayerNorm numerics test (#64385)
Richard Zou [Wed, 1 Sep 2021 22:12:05 +0000 (15:12 -0700)]
Restore LayerNorm numerics test (#64385)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64385

It was deleted in https://github.com/pytorch/pytorch/pull/63276.

The numerics test was meant to check LayerNorm behavior on large inputs,
but we deleted it without realizing that.

Test Plan: - wait for tests.

Reviewed By: ngimel

Differential Revision: D30702950

Pulled By: zou3519

fbshipit-source-id: a480e26c45ec38fb628938b70416cdb22d976a46

3 years ago[quant][graphmode][api] Add backend_config_dict to prepare_fx api (#64135)
Jerry Zhang [Wed, 1 Sep 2021 21:56:14 +0000 (14:56 -0700)]
[quant][graphmode][api] Add backend_config_dict to prepare_fx api (#64135)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64135

We want to start aligning the api with the design in https://github.com/pytorch/pytorch/wiki/Extending-PyTorch-Quantization-to-Custom-Backends

We plan to gradually move things from `prepare_custom_config_dict` and `convert_custom_config_dict`
to `backend_config_dict` and allow custom backend developer to define their own way of quantizing operators.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: zou3519

Differential Revision: D30699456

fbshipit-source-id: e3c068da8d3da2270f57719f7159cc71cafa8598

3 years agoSilent rm error for sccache log file (#64388)
zhouzhuojie [Wed, 1 Sep 2021 21:53:25 +0000 (14:53 -0700)]
Silent rm error for sccache log file (#64388)

Summary:
Sample reporting from dr.ci

![image](https://user-images.githubusercontent.com/658840/131724645-75afa04f-7554-4674-8e7c-cf139c84d994.png)

The `rm` command is not actually running into problems; we just need to silence the console output.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64388

Reviewed By: walterddr, malfet, seemethere

Differential Revision: D30704439

Pulled By: zhouzhuojie

fbshipit-source-id: ecd35531decf05b75cef30d08d46635f81112f67

3 years ago[xplat][metal] Add getters and setters for ivars in Conv2dOpContext (#57395)
Yuchen Huang [Wed, 1 Sep 2021 21:48:00 +0000 (14:48 -0700)]
[xplat][metal] Add getters and setters for ivars in Conv2dOpContext (#57395)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57395

As title
ghstack-source-id: 137223806

(Note: this ignores all push blocking failures!)

Test Plan:
### Lib Build
- `buck build caffe2:aten_metal_prepack`

### Integration Test
- `arc focus2 pp-ops -a ModelRunner`
- Click "Test Person/Hair Segmentation Model"

{F612831435}

- Image Classification Demo

{F614144868}

Reviewed By: xta0

Differential Revision: D28132020

fbshipit-source-id: 73560263a9d14e9ecfa39c69deb158a2ed8cb179

3 years ago[structured] Preserve computed elements from meta func to impl (#61746)
Meghan Lele [Wed, 1 Sep 2021 21:24:54 +0000 (14:24 -0700)]
[structured] Preserve computed elements from meta func to impl (#61746)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61746

**Summary**
This commit introduces a new feature for structured kernels that allows
kernels to declare quantities as "precomputed" in
`native_functions.yaml`, compute them once in the `meta` function and
reuse them again in the `impl`. The names and types of these quantities
are used to generate code for a struct containing them that the `meta`
function must return. In the case of a handful of surveyed kernels
(`all`, `any`, `avg_pool2d`), these quantities that are used both in
the `meta` and `impl` have the same meaning as certain kernel arguments
and in fact supersede them. Accordingly, the correspondence between a
kernel argument and the precomputed elements that supersede it is also
captured in `native_functions.yaml`. This information is used to unpack
the struct returned by `meta` and pass its contents correctly to the
`impl` function.

The primary goal is to avoid recomputation and improve the developer experience
(e.g. sometimes people can forget to compute these elements while
porting a kernel).
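
The `meta`/`impl` hand-off can be pictured with a toy Python model. Everything here is hypothetical (the real mechanism is generated C++ driven by `native_functions.yaml`): `meta` validates and normalizes `dim` once and returns it in a small struct, and `impl` consumes that struct instead of the raw argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precompute:
    dim: int  # normalized dimension; supersedes the raw `dim` argument


def meta(shape, dim):
    """Validate `dim` and compute its non-negative form exactly once."""
    ndim = len(shape)
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for {ndim} dimensions")
    return Precompute(dim=dim % ndim)


def impl(rows, pre):
    """Reduce a 2-D nested list with `any`, trusting the precomputed dim."""
    if pre.dim == 0:
        return [any(col) for col in zip(*rows)]
    return [any(row) for row in rows]


def any_along(rows, dim):
    """Glue code: unpack what `meta` returned and pass it on to `impl`."""
    pre = meta((len(rows), len(rows[0])), dim)
    return impl(rows, pre)
```

Here `impl` never sees a negative `dim`, so the normalization cannot be forgotten or repeated when a kernel is ported.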

Test Plan: Imported from OSS

Reviewed By: tugsbayasgalan

Differential Revision: D30407831

Pulled By: SplitInfinity

fbshipit-source-id: 00975525ea373721fe52d06f75cd4ac91f3dc556

3 years ago[Static Runtime] Make per-op latency readable by FAI-PEP (#64315)
Mike Iovine [Wed, 1 Sep 2021 21:19:21 +0000 (14:19 -0700)]
[Static Runtime] Make per-op latency readable by FAI-PEP (#64315)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64315

Add a new flag `generate_ai_pep_output` to `StaticRuntime::benchmark`. If set, produces per-op-kind average total latency in milliseconds in a JSON format recognized by [Facebook AI performance evaluation platform (FAI-PEP)](https://github.com/facebook/FAI-PEP).

This is useful for observing the impact of changes that make a big difference for a specific op, but do not affect the overall SR latency by more than a few percent.

Reviewed By: hlu1

Differential Revision: D30679352

fbshipit-source-id: c847fa6ea20774aaf1e7949b11db4421d1f70b7e

3 years agoUpdate optimize_for_mobile to preserve node's debug information (#63106)
Salil Desai [Wed, 1 Sep 2021 21:08:02 +0000 (14:08 -0700)]
Update optimize_for_mobile to preserve node's debug information (#63106)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63106

Propagate debug info to the re-written nodes in the graph.

Test Plan:
- Clone open source repo and build
- ``` python3 test/test_jit.py TestOptimizeForMobilePreserveDebugInfo ```
- Tests pass

Reviewed By: kimishpatel

Differential Revision: D28654659

fbshipit-source-id: 2d7c87f2fb95a3be53246375f35639bbd97c237e

3 years agoBreak up "@generated" string so Phabricator shows changes
David Reiss [Wed, 1 Sep 2021 20:41:37 +0000 (13:41 -0700)]
Break up "@generated" string so Phabricator shows changes

Summary: Created from CodeHub with https://fburl.com/edit-in-codehub

Test Plan:
CI

Sandcastle run

Reviewed By: larryliu0820

Differential Revision: D30701781

fbshipit-source-id: 3acab8b65a327c4ec7da90bc855ecf02f801c40a

3 years agoAdd forward AD support for custom Functions (#64061)
Alban Desmaison [Wed, 1 Sep 2021 20:34:48 +0000 (13:34 -0700)]
Add forward AD support for custom Functions (#64061)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64061

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D30640868

Pulled By: albanD

fbshipit-source-id: b0e6610430a879074d6d5306443772fc154b431f

3 years agoFix bytes_written and bytes_read (#64244)
Tanvir Zaman [Wed, 1 Sep 2021 20:31:45 +0000 (13:31 -0700)]
Fix bytes_written and bytes_read (#64244)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64244

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64040

In operator cost inference functions, in many places we are using sizeof(x.data_type()). Since data_type() returns a 32 bit integer from [this enum](https://www.internalfb.com/code/fbsource/[15e7ffe4073cf08c61077c7c24a4839504b964a2]/fbcode/caffe2/caffe2/proto/caffe2.proto?lines=20), we are basically always getting 4 for sizeof(x.data_type()) no matter what actual data type x has. Big thanks to Jack Langman for specifically pointing to this bug.

Instead, we now use the size in bytes based on the actual data type.
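
A small Python model of the bug may help. The enum, its values and the size table below are hypothetical and not the actual Caffe2 definitions; `SIZEOF_ENUM` models the constant 4 that `sizeof(x.data_type())` yields for a 32-bit integer.

```python
from enum import IntEnum


class DataType(IntEnum):
    """Hypothetical type tags; the numeric values are only identifiers."""
    FLOAT = 1
    UINT8 = 2
    INT64 = 3


# Bytes per element for each tag (an illustrative table).
ITEM_SIZE = {DataType.FLOAT: 4, DataType.UINT8: 1, DataType.INT64: 8}

SIZEOF_ENUM = 4  # size of the 32-bit tag itself, regardless of its value


def bytes_buggy(num_elements, dtype):
    """Measures the tag, not the element, so every dtype costs 4 bytes."""
    return num_elements * SIZEOF_ENUM


def bytes_fixed(num_elements, dtype):
    """Looks up the element size for the actual data type."""
    return num_elements * ITEM_SIZE[dtype]
```

The two versions agree for 4-byte types, which is one reason a bug of this kind can go unnoticed, and diverge for everything else.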

Test Plan:
Added unit tests BatchMatMulMemCostTest:

buck test //caffe2/caffe2/fb/fbgemm:batch_matmul_op_test -- BatchMatMulMemCostTest

Extended existing unit test test_columnwise_concat for different data types:

buck test //caffe2/caffe2/python/operator_test:concat_op_cost_test -- test_columnwise_concat

Reviewed By: CrazySherman

Differential Revision: D30656698

fbshipit-source-id: d42c0c9a0c5b0ddc5dba39e4994f1f85a5e618bf

3 years ago[Caffe2] Create fewer strings during argument fetching (#64285)
Scott Wolchok [Wed, 1 Sep 2021 20:24:11 +0000 (13:24 -0700)]
[Caffe2] Create fewer strings during argument fetching (#64285)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64285

With C++14 heterogeneous ordered container lookup, it is no longer necessary to create a `std::string` in order to look up elements of a `CaffeMap` keyed by std::string. Accordingly, this diff reworks the argument-getting operator functions to avoid that in favor of `c10::string_view`.
ghstack-source-id: 137139818

Test Plan: buildsizebot iOS apps -- code size win. Fewer strings is probably marginally good for perf, but this only happens at setup time anyway.

Reviewed By: dzhulgakov

Differential Revision: D26826676

fbshipit-source-id: ee653b14dc2c528bae8c90f0fc6a7a419cbca1d6

3 years agoBack out "Revert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for...
Kimish Patel [Wed, 1 Sep 2021 19:38:39 +0000 (12:38 -0700)]
Back out "Revert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling." (#64307)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64307

Original commit changeset: 0b2aa7c57d08

Restores original changes.
This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wall-clock
time.
This unifies the various profiling efforts that we have for benchmarking. In
production we will still use the observer-based mechanism, but the advantage of
using the Kineto profiler is that we get a few other things for free, such as:
- chrome trace generation
- operator-level memory profiling (to be added)
- flop counts (to be added)

Furthermore, we may be able to use a Python post-processing script to parse the
chrome trace and generate output similar to torch.profiler. (To be done)

This diff also removes some tests from test_lite_interpreter.cpp that were testing module hierarchy in debug info. They should be covered by test_mobile_profiler.cpp.

Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and --print_module_info true (see Operator summary has now module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985

Reviewed By: raziel

Differential Revision: D30680354

fbshipit-source-id: b6ba0d59c510c13d13d9935b1d8051cc82ffa4e9

3 years agoAdd a record scope around autograd::engine::evaluate_function (#63619)
Rohan Varma [Wed, 1 Sep 2021 19:28:23 +0000 (12:28 -0700)]
Add a record scope around autograd::engine::evaluate_function (#63619)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63619

Adds a RECORD_FUNCTION with the function that is being evaluated as part
of backwards execution. This has been useful in picking up some operations
in the backwards pass that otherwise would not show up, for example custom cpp
functions that use custom C++ code.
ghstack-source-id: 137041723

Test Plan:
CI

benchmark:
buck run mode/opt //scripts/rvarm1/ddp:bench

Reviewed By: albanD

Differential Revision: D30439492

fbshipit-source-id: 955917770cdf2a2edb0303223ace710b668ba388

3 years ago[Bootcamp] Include both python unittest and parser parameters in --help and -h flag...
Patrick Kan [Wed, 1 Sep 2021 19:20:50 +0000 (12:20 -0700)]
[Bootcamp] Include both python unittest and parser parameters in --help and -h flag (#64297)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/45945

Creates a new thread to run -h or --help with unittest.main if the help flag is present, and keeps the add_help default for parameters.

Includes both python unittest and parser parameters in --help and -h flag and will remain up to date since both messages are displayed.
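
The thread-based approach described above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `print_combined_help` and the `--seed` option in the usage are hypothetical. It relies on the fact that `unittest.main` prints its help text and then raises `SystemExit`, which ends only the worker thread when raised there.

```python
import argparse
import threading
import unittest


def print_combined_help(parser, prog="test_example.py"):
    """Print unittest's help first, then the custom parser's help."""

    def unittest_help():
        # unittest's argument parser prints its help and raises SystemExit.
        # Raised inside a worker thread, it terminates only that thread.
        unittest.main(argv=[prog, "--help"])

    worker = threading.Thread(target=unittest_help)
    worker.start()
    worker.join()
    # The custom parser keeps its default add_help behaviour.
    parser.print_help()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="test_example.py")
    parser.add_argument("--seed", type=int, default=1234)  # hypothetical option
    print_combined_help(parser)
```

Because both help messages are generated at run time, neither needs to be copied by hand into the other.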

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64297

Test Plan:
Imported from GitHub

`python test/test_spectral_ops.py --help`

Output:
```
% python test/test_spectral_ops.py --help
usage: test_spectral_ops.py [-h] [-v] [-q] [--locals] [-f] [-c] [-b] [-k TESTNAMEPATTERNS] [tests [tests ...]]

positional arguments:
  tests                a list of any number of test modules, classes and test methods.

optional arguments:
  -h, --help           show this help message and exit
  -v, --verbose        Verbose output
  -q, --quiet          Quiet output
  --locals             Show local variables in tracebacks
  -f, --failfast       Stop on first fail or error
  -c, --catch          Catch Ctrl-C and display results so far
  -b, --buffer         Buffer stdout and stderr during tests
  -k TESTNAMEPATTERNS  Only run tests which match the given substring

Examples:
  test_spectral_ops.py                           - run default set of tests
  test_spectral_ops.py MyTestSuite               - run suite 'MyTestSuite'
  test_spectral_ops.py MyTestCase.testSomething  - run MyTestCase.testSomething
  test_spectral_ops.py MyTestCase                - run all 'test*' test methods
                                       in MyTestCase

usage: test_spectral_ops.py [-h] [--subprocess] [--seed SEED] [--accept] [--jit_executor JIT_EXECUTOR] [--repeat REPEAT]
                            [--test_bailouts] [--save-xml [SAVE_XML]] [--discover-tests] [--log-suffix LOG_SUFFIX]
                            [--run-parallel RUN_PARALLEL] [--import-slow-tests [IMPORT_SLOW_TESTS]]
                            [--import-disabled-tests [IMPORT_DISABLED_TESTS]]

optional arguments:
  -h, --help            show this help message and exit
  --subprocess          whether to run each test in a subprocess
  --seed SEED
  --accept
  --jit_executor JIT_EXECUTOR
  --repeat REPEAT
  --test_bailouts
  --save-xml [SAVE_XML]
  --discover-tests
  --log-suffix LOG_SUFFIX
  --run-parallel RUN_PARALLEL
  --import-slow-tests [IMPORT_SLOW_TESTS]
  --import-disabled-tests [IMPORT_DISABLED_TESTS]
  ```

Also ran some other tests to make sure tests still worked, and other tests with --help or -h flag

Reviewed By: seemethere

Differential Revision: D30677776

Pulled By: PatrickKan

fbshipit-source-id: eb3d6e3fa677137ec703ec3a23808efb99acc896

3 years ago[FX] Fix python code generation for wrapped getattr() with default value (#64271)
Patrick Hu [Wed, 1 Sep 2021 17:49:39 +0000 (10:49 -0700)]
[FX] Fix python code generation for wrapped getattr() with default value (#64271)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64271

Closes #60417

Modified emit_node() in fx/graph.py to generate getattr() call with default value when len(node.args) != 2 instead of accessing the attribute.
Added test_torch_fx_getattr() in test/test_fx.py.
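
The rule described above can be sketched as follows. This is a simplified illustration, not the actual `emit_node()` code: `emit_getattr` is a hypothetical helper, and it assumes the first argument is already rendered as source text while the remaining arguments are plain Python values.

```python
def emit_getattr(args):
    """Render source code for a getattr call with the given arguments."""
    if len(args) == 2 and isinstance(args[1], str) and args[1].isidentifier():
        # Two-argument form: plain attribute access is equivalent.
        obj, name = args
        return f"{obj}.{name}"
    # Any other arity (e.g. with a default value) must stay a getattr() call,
    # otherwise the default would be silently dropped.
    rendered = [args[0]] + [repr(a) for a in args[1:]]
    return f"getattr({', '.join(rendered)})"


print(emit_getattr(("x", "shape")))
print(emit_getattr(("x", "missing", None)))
```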

Test Plan:
pytest test/test_fx.py

Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D30671265

fbshipit-source-id: f2db9ea47e0cb247547e200684f715aab006c374

3 years ago[nnc] Updated generic error message with info about turning off the fuser (#64316)
Raghavan Raman [Wed, 1 Sep 2021 17:28:02 +0000 (10:28 -0700)]
[nnc] Updated generic error message with info about turning off the fuser (#64316)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64316

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30683942

Pulled By: navahgar

fbshipit-source-id: d86607563672213f99a1436dcf4f5dc28053b713

3 years agoFixes reduction launch config (#64304)
Xiang Gao [Wed, 1 Sep 2021 17:17:52 +0000 (10:17 -0700)]
Fixes reduction launch config (#64304)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/48573
See also https://github.com/pytorch/pytorch/pull/64194

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64304

Reviewed By: janeyx99

Differential Revision: D30689600

Pulled By: ngimel

fbshipit-source-id: bf2103ca177fd3b6e27bc0324b81925234483a29

3 years agoOpInfo for `nn.functional.layer_norm` (#63276)
Kushashwa Ravi Shrimali [Wed, 1 Sep 2021 15:48:25 +0000 (08:48 -0700)]
OpInfo for `nn.functional.layer_norm` (#63276)

Summary:
Please see https://github.com/facebookresearch/functorch/issues/78 and https://github.com/pytorch/pytorch/issues/54261.

Note:

* This PR also adds a reference test inspired by existing tests in `test_nn.py`.

cc: mruberry zou3519

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63276

Reviewed By: ejguan

Differential Revision: D30452483

Pulled By: zou3519

fbshipit-source-id: 2578d01ca34e031668a41bd284db60c31ae1fba8

3 years agofix GradBucket.is_last() logic (#63768)
Nima Elyasi [Wed, 1 Sep 2021 15:47:44 +0000 (08:47 -0700)]
fix GradBucket.is_last() logic (#63768)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63768

Passes the number of buckets to the GradBucket constructor so that `.is_last()` can check whether the bucket index equals `num_buckets - 1`.
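
The check amounts to the following. This is a Python sketch with a hypothetical `GradBucketSketch` class, shown only to illustrate the logic; it is not the real GradBucket implementation.

```python
class GradBucketSketch:
    """Hypothetical stand-in that tracks its position among all buckets."""

    def __init__(self, index, num_buckets):
        self.index = index
        self.num_buckets = num_buckets

    def is_last(self):
        # Indices are zero-based, so the last bucket has index num_buckets - 1.
        return self.index == self.num_buckets - 1


flags = [GradBucketSketch(i, 3).is_last() for i in range(3)]
print(flags)
```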

Test Plan:
buck test mode/dev-nosan //caffe2/test/distributed/algorithms/ddp_comm_hooks:test_ddp_hooks

test output: https://www.internalfb.com/intern/testinfra/testconsole/testrun/8162774375985873/

Reviewed By: SciPioneer, mrshenli

Differential Revision: D30455913

fbshipit-source-id: 8c67ca69cbf191d6e189e09248407eb167bb24b6

3 years agoRevert D29699456: [pytorch][PR] Enable Half, BFloat16, and Complex dtypes for coo...
Richard Zou [Wed, 1 Sep 2021 14:16:55 +0000 (07:16 -0700)]
Revert D29699456: [pytorch][PR] Enable Half, BFloat16, and Complex dtypes for coo-coo sparse matmul [CUDA]

Test Plan: revert-hammer

Differential Revision:
D29699456 (https://github.com/pytorch/pytorch/commit/ad4848565e1d9f4d408c60614f213acb52035181)

Original commit changeset: 407ae53392ac

fbshipit-source-id: b6c70ba8bb28c0c38de47857030b69792a8470de

3 years ago[FX] Rename reduce functions back to their old, public names (#64324)
James Reed [Wed, 1 Sep 2021 05:20:41 +0000 (22:20 -0700)]
[FX] Rename reduce functions back to their old, public names (#64324)

Summary:
Unfortunately pickle serializes the names of these functions. Also put them under backward-compatibility enforcement.
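
A small stdlib-only illustration of why a rename matters: pickle stores a function as a reference to its module and name, not as code. `json.dumps` is used here purely as a convenient example and `pickled_reference` is a hypothetical helper; neither is taken from the PR.

```python
import json
import pickle


def pickled_reference(fn):
    """Return the pickle payload for a module-level function."""
    return pickle.dumps(fn)


payload = pickled_reference(json.dumps)
# The payload carries the module name and the function name as bytes.
print(b"json" in payload, b"dumps" in payload)
# Loading looks the name up again; it fails if the function was renamed.
print(pickle.loads(payload) is json.dumps)
```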

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64324

Test Plan: Local repro https://fb.workplace.com/groups/3440841732711443/permalink/4018921611570116/

Reviewed By: SplitInfinity, TailofJune

Differential Revision: D30684185

Pulled By: jamesr66a

fbshipit-source-id: 900701220155d15115cd0c07cf7774a2891bd04f

3 years ago[Metal][GPU] Enable metal for simulators and fix test failures if possible (#64322)
Yuchen Huang [Wed, 1 Sep 2021 05:00:11 +0000 (22:00 -0700)]
[Metal][GPU] Enable metal for simulators and fix test failures if possible (#64322)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64322

As title
ghstack-source-id: 137143877

Test Plan:
- `aibench-cli mobile`
- Select iOS -> `y` -> `1` -> `n` -> "--metal_op_test"
- Select all iPhone 6 + iPhone 7 + iPhone 8 and an iPhone X or 11 or 12
```
Benchmark Submitted. Find more details at: https://our.intern.facebook.com/intern/aibench/details/318120612514604
Benchmark Status:
        D10AP-12.0.1: DONE
        N71mAP-14.3: DONE
DUMMY latency:
        D10AP-12.0.1: 4319.3
        N71mAP-14.3: 8868.51
I0831 16:06:27.210558 605277 ClientSingletonManager.cpp:99] Shutting down Manifold ClientSingletonManager
```

Reviewed By: xta0

Differential Revision: D30147163

fbshipit-source-id: 2de6bbd9bd525e32ca92b2845eb435800855edcc

3 years ago[CUDA graphs] hotfix for test_graph_ (#64339)
Michael Carilli [Wed, 1 Sep 2021 04:43:25 +0000 (21:43 -0700)]
[CUDA graphs] hotfix for test_graph_ (#64339)

Summary:
Graphed workloads that try to capture a full backward pass must do warmup on a non-default stream. If warmup happens on the default stream, AccumulateGrad functions might tag themselves to run on the default stream, and therefore won't be capturable.

ngimel and I suspect some test_cuda.py tests run with the default stream as the ambient stream, which breaks `test_graph_grad_scaling` because `test_graph_grad_scaling` does warmup on the ambient stream _assuming_ the ambient stream is a non-default stream.

This PR explicitly sets a side stream for the warmup in `test_graph_grad_scaling`, which is what I should have done all along because it's what the new documentation recommends.

I pushed the PR branch straight to the main pytorch repo because we need to run ci-all on it, and I'm not sure what the requirements are these days.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64339

Reviewed By: mruberry

Differential Revision: D30690711

Pulled By: ngimel

fbshipit-source-id: 91ad75f46a11f311e25bc468ea184e22acdcc25a

3 years agoRemove outdated warning about RecursiveScriptModule not being copiable (#64085)
gmagogsfm [Wed, 1 Sep 2021 04:27:46 +0000 (21:27 -0700)]
Remove outdated warning about RecursiveScriptModule not being copiable (#64085)

Summary:
RecursiveScriptModule has its customized `__copy__` and `__deepcopy__` defined. The warning/error that says it is not copiable is outdated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64085

Reviewed By: rohan-varma

Differential Revision: D30598623

Pulled By: gmagogsfm

fbshipit-source-id: 0701d8617f42d818bc7b88244caee4cd47fbe976

3 years ago[TensorExpr] Wrap error messages with buildErrorMessage call. (#64330)
Mikhail Zolotukhin [Wed, 1 Sep 2021 03:27:44 +0000 (20:27 -0700)]
[TensorExpr] Wrap error messages with buildErrorMessage call. (#64330)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64330

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30687226

Pulled By: ZolotukhinM

fbshipit-source-id: ade1be2ad6847c6afbba60307ef854696821b4e3

3 years agoFix bug in ShardedTensorMetadata serde. (#63902)
Pritam Damania [Wed, 1 Sep 2021 03:19:55 +0000 (20:19 -0700)]
Fix bug in ShardedTensorMetadata serde. (#63902)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63902

The 'memory_format' field was not being serialized correctly and used
the same encoding for different fields.
ghstack-source-id: 137142406

Test Plan: waitforbuildbot

Reviewed By: bowangbj

Differential Revision: D30527324

fbshipit-source-id: f0f223e2d660ef6e4abae9649d9992acc36e1278

3 years agoDelete some dead code from RRefMessageBase (#64298)
Pavel Belevich [Wed, 1 Sep 2021 03:14:08 +0000 (20:14 -0700)]
Delete some dead code from RRefMessageBase (#64298)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64298

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23

Test Plan: Imported from OSS

Reviewed By: rohan-varma

Differential Revision: D30676702

Pulled By: pbelevich

fbshipit-source-id: 77dbc0f8064c3518376454ff573d45ed0274956b

3 years agodisallow empty named dims list to flatten(names, name) (#61953)
Matti Picus [Wed, 1 Sep 2021 01:54:44 +0000 (18:54 -0700)]
disallow empty named dims list to flatten(names, name) (#61953)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/61137 by raising an error if an empty tuple is passed in for the names:
```
>>> torch.empty((2, 3), names=['a', 'b']).flatten((), 'abc')
RuntimeError: flatten(tensor, dims, out_dim): dims cannot be empty
```

or from the original issue:
```
>>> torch.empty((2, 3)).flatten((), 'abc')
RuntimeError: flatten(tensor, dims, out_dim): dims cannot be empty
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61953

Reviewed By: iramazanli

Differential Revision: D30574571

Pulled By: malfet

fbshipit-source-id: e606e84458a8dd66e5da6d0eb1a260f37b4ce91b

3 years ago[caffe2][easy] Save heap allocation in ConcatOp (#63529)
Scott Wolchok [Wed, 1 Sep 2021 01:22:23 +0000 (18:22 -0700)]
[caffe2][easy] Save heap allocation in ConcatOp (#63529)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63529

Output() takes an IntArrayRef, so we can just use a std::initializer_list (stack-allocated array) instead of std::vector here.
ghstack-source-id: 137085908

Test Plan: existing CI

Reviewed By: mruberry

Differential Revision: D29687400

fbshipit-source-id: 9f2a7c6679f2552c098bb1bf7befaca18e0e5d4d

3 years agoConvert mul to use opmath_gpu_kernel_with_scalars (#64019)
Edward Yang [Wed, 1 Sep 2021 00:55:23 +0000 (17:55 -0700)]
Convert mul to use opmath_gpu_kernel_with_scalars (#64019)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64019

Note that previously the functor operated on scalar_t and
this modifies it to operate on opmath_t, but this is not
a problem as half precision was implemented by performing the
compute in float anyway.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D30575282

Pulled By: ezyang

fbshipit-source-id: cc6900ef996e755740afe48f9cb4d0366858dd47

3 years agoUse the correct overloaded name to skip boxed autograd not implemented kernel registr...
soulitzer [Wed, 1 Sep 2021 00:51:55 +0000 (17:51 -0700)]
Use the correct overloaded name to skip boxed autograd not implemented kernel registration (#64182)

Summary:
Some internal use_count tests are failing for `dequantize_self` because we only compare the skip list with the base name `dequantize` when we should be comparing with the full name including the overload
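
The difference between the two comparisons can be sketched as below. `should_skip` is a hypothetical helper, and writing the full name as `base.overload` is an assumption made for illustration.

```python
def should_skip(base_name, overload_name, skip_list):
    """Compare the full operator name, including overload, with the skip list."""
    full_name = f"{base_name}.{overload_name}" if overload_name else base_name
    return full_name in skip_list


skip_list = {"dequantize.self"}
# Comparing only the base name misses the entry.
print("dequantize" in skip_list)
# Comparing the full name matches exactly the intended overload.
print(should_skip("dequantize", "self", skip_list))
print(should_skip("dequantize", "tensors", skip_list))
```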

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64182

Reviewed By: albanD

Differential Revision: D30639909

Pulled By: soulitzer

fbshipit-source-id: d4d22dd1a5c8f7180251ce7739830764cce6f151

3 years ago[Static Runtime] Out version for softmax (#64243)
Ray Peng [Wed, 1 Sep 2021 00:45:50 +0000 (17:45 -0700)]
[Static Runtime] Out version for softmax (#64243)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64243

Test Plan:
```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
...
V0830 16:35:22.524479 613839 impl.cpp:1410] Switch to out variant for node: %5 : Tensor = aten::softmax(%a.1, %dim.1, %dtype.1)
...
[       OK ] StaticRuntime.IndividualOps_Softmax (803 ms)
```

Reviewed By: hlu1

Differential Revision: D30656149

fbshipit-source-id: 115b7b4a75448fd6a5c526808080ca9a4251302c

3 years ago.circleci: Remove already migrated CUDA configs (#64231)
Eli Uriegas [Wed, 1 Sep 2021 00:38:42 +0000 (17:38 -0700)]
.circleci: Remove already migrated CUDA configs (#64231)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64231

This removes the CUDA 11.1 and CUDA 10.2 configs that we had
previously migrated to GHA

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: zhouzhuojie

Differential Revision: D30683811

Pulled By: seemethere

fbshipit-source-id: 71b0761461557d871c26eb02f665a2e4d9b1d9fb

3 years ago.github: Consolidate linux setup / teardown (#64229)
Eli Uriegas [Wed, 1 Sep 2021 00:38:42 +0000 (17:38 -0700)]
.github: Consolidate linux setup / teardown (#64229)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64229

Consolidates linux setup / teardown into easy to use jinja2 macros

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: zhouzhuojie, driazati

Differential Revision: D30683810

Pulled By: seemethere

fbshipit-source-id: 2578630df3e212fb79392a699090553baef44cc2

3 years agoAdd ciflow-tracking issue to pytorch-probot (#64125)
Nikita Shulga [Wed, 1 Sep 2021 00:33:11 +0000 (17:33 -0700)]
Add ciflow-tracking issue to pytorch-probot (#64125)

Summary:
Doesn't do anything yet...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64125

Reviewed By: zhouzhuojie

Differential Revision: D30620283

Pulled By: malfet

fbshipit-source-id: 91869d35c1b70a55e32261d2c32fb0136ec33960

3 years ago[TensorExpr] Move declaration of buildErrorMessage to exception.h (#64301)
Mikhail Zolotukhin [Wed, 1 Sep 2021 00:32:00 +0000 (17:32 -0700)]
[TensorExpr] Move declaration of buildErrorMessage to exception.h (#64301)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64301

Test Plan: Imported from OSS

Reviewed By: navahgar, huiguoo

Differential Revision: D30678215

Pulled By: ZolotukhinM

fbshipit-source-id: 599c83b3890450a0fb6526815f037eec9563661c

3 years agoFix redundant class definition in GraphModule singleton constructor (#64274)
Jay Leverett [Wed, 1 Sep 2021 00:28:42 +0000 (17:28 -0700)]
Fix redundant class definition in GraphModule singleton constructor (#64274)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/63883

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64274

Reviewed By: jamesr66a

Differential Revision: D30675970

Pulled By: jayleverett

fbshipit-source-id: e74ef2a28013f0fa7c58d14f38e66cfe48d26b74

3 years agoDiscover new tests in run_tests.py (#64246)
Nikita Shulga [Wed, 1 Sep 2021 00:19:11 +0000 (17:19 -0700)]
Discover new tests in run_tests.py (#64246)

Summary:
Introduce `discover_tests` function that globs for all Python files
starting with `test_` in test folder excluding subfolders which are
executed differently
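
A minimal sketch of such a discovery function, assuming a recursive glob with a set of excluded top-level folders. The signature and the `excluded_dirs` parameter are illustrative and may differ from what the PR actually adds to `run_tests.py`.

```python
from pathlib import Path


def discover_tests(test_dir, excluded_dirs=()):
    """Return test names (relative paths without .py) found under test_dir."""
    root = Path(test_dir)
    found = []
    for path in root.rglob("test_*.py"):
        rel = path.relative_to(root)
        # Skip subfolders that are run through a different mechanism.
        if rel.parts[0] in excluded_dirs:
            continue
        found.append(rel.with_suffix("").as_posix())
    return sorted(found)
```

Globbing at run time means a newly added `test_*.py` file is picked up without editing a hand-maintained list.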

Fixes https://github.com/pytorch/pytorch/issues/64178

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64246

Reviewed By: walterddr, seemethere

Differential Revision: D30661652

Pulled By: malfet

fbshipit-source-id: a52e78ec717b6846add267579dd8d9ae75326bf9

3 years agoRevert D30543236: Add python mode
Richard Zou [Tue, 31 Aug 2021 21:53:01 +0000 (14:53 -0700)]
Revert D30543236: Add python mode

Test Plan: revert-hammer

Differential Revision:
D30543236 (https://github.com/pytorch/pytorch/commit/4bd03b02424d93b72f15e28c542ede13f88ea929)

Original commit changeset: ef5444d96a5a

fbshipit-source-id: b0042ac2c22765fa11d6d00bf751f6a4489eb6d8

3 years ago[DataPipe] export fork, mux, demux for public usage (#64279)
Kevin Tse [Tue, 31 Aug 2021 20:55:59 +0000 (13:55 -0700)]
[DataPipe] export fork, mux, demux for public usage (#64279)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64279

cc VitalyFedyunin ejguan

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30671971

Pulled By: NivekT

fbshipit-source-id: 056ac12ef7183b254d1eec341145594639e47ef6

3 years ago[DataPipe] adding description, __len__, tests for mux() (#64224)
Kevin Tse [Tue, 31 Aug 2021 20:55:59 +0000 (13:55 -0700)]
[DataPipe] adding description, __len__, tests for mux() (#64224)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64224

cc VitalyFedyunin ejguan

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30651551

Pulled By: NivekT

fbshipit-source-id: f8af98ba71a592900b992a8077432062ec57bb48

3 years agoTry the forked checkout action with retry (#64120)
zhouzhuojie [Tue, 31 Aug 2021 20:48:28 +0000 (13:48 -0700)]
Try the forked checkout action with retry (#64120)

Summary:
The main difference is:
https://github.com/zhouzhuojie/checkout/commit/ffc6f93ad4b6e3cdcdd1a34e8c896765002f9b34

Can test multiple times in this PR to see if it works, will make the `retry` number configurable if it's usable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64120

Reviewed By: malfet

Differential Revision: D30656099

Pulled By: zhouzhuojie

fbshipit-source-id: a89932196bb0c44e412a34664ed6a061b02ef92e

3 years agofix syntax error in bfloat16 PR (#64122)
Rishi Puri [Tue, 31 Aug 2021 20:47:29 +0000 (13:47 -0700)]
fix syntax error in bfloat16 PR (#64122)

Summary:
fixes prior syntax error from PR ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64122

Reviewed By: H-Huang

Differential Revision: D30643596

Pulled By: ngimel

fbshipit-source-id: 0a2d5a40fb6dc7339cd03112e57ef0e1bf8a000e

3 years ago[CUDA graphs] Prototype API and documentation (#63269)
Michael Carilli [Tue, 31 Aug 2021 20:29:39 +0000 (13:29 -0700)]
[CUDA graphs] Prototype API and documentation (#63269)

Summary:
RFC: https://github.com/pytorch/pytorch/issues/61880

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63269

Reviewed By: mruberry

Differential Revision: D30596643

Pulled By: ngimel

fbshipit-source-id: b1f8061406364b667e2c2d4d30fbce1f0d8456be

3 years agoRemove ref to test_distributed_fork (#64197)
Rohan Varma [Tue, 31 Aug 2021 19:51:20 +0000 (12:51 -0700)]
Remove ref to test_distributed_fork (#64197)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64197

Removes this line as test is gone.
ghstack-source-id: 136986275

Test Plan: CI

Reviewed By: walterddr

Differential Revision: D30642929

fbshipit-source-id: a0c7dfdfb35a4a7f7ec1b881dbea53d85136012c

3 years ago.circleci: Remove migrated jobs, move docs builds (#64222)
Eli Uriegas [Tue, 31 Aug 2021 19:50:11 +0000 (12:50 -0700)]
.circleci: Remove migrated jobs, move docs builds (#64222)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64222

Removes both backwards_compat as well as docs_test from the general
gcc5.4 config and moves the docs build from being run on every PR to
only being run on master.

We can remove docs builds when we migrate the docs push job (including
all secrets associated with that)

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet walterddr lg20987 pytorch/pytorch-dev-infra

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D30650953

Pulled By: seemethere

fbshipit-source-id: ac11da6a551a6c81f3dc1d47fd81846cbfe9975a

3 years ago[ao][docs] Clarify operator support for quantization (#63270)
Raghuraman Krishnamoorthi [Tue, 31 Aug 2021 19:22:13 +0000 (12:22 -0700)]
[ao][docs] Clarify operator support for quantization (#63270)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63270

Add table to quantization main page showing supported modules
for static and dynamic quantization.
ghstack-source-id: 137087204

Test Plan: Imported from OSS

Reviewed By: HDCharles

Differential Revision: D30658654

fbshipit-source-id: a82c998e1db6370596d5b0ca4c7cc96c1c90f30e

3 years agons for fx: make layer types more readable (#64270)
Vasiliy Kuznetsov [Tue, 31 Aug 2021 19:09:59 +0000 (12:09 -0700)]
ns for fx: make layer types more readable (#64270)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64270

Before this PR, layer types were populated by doing
`str(module_instance)` and `str(function)`. This resulted
in moderately readable strings for modules, and poorly readable
strings for functions.

This PR switches the logic to use `torch.typename` utility instead.
The results are significantly more readable.

Example function type:

```
# before
'<built-in method linear of PyCapsule object at 0x7fe9b20ce7b0>'

# after
'torch._ops.quantized.PyCapsule.linear'
```

Example module type:

```
# before
"<class 'torch.nn.quantized.modules.conv.Conv2d'>"

# after
'torch.nn.quantized.modules.conv.Conv2d'
```
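
The readability gap can be reproduced without torch. The sketch below is not `torch.typename` itself: `readable_type` is a hypothetical helper that only imitates the idea of building a dotted name from a class's module and qualified name.

```python
import collections


def readable_type(obj):
    """Return 'module.QualName' for a class or for an instance's class."""
    cls = obj if isinstance(obj, type) else type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


# str() wraps the name in "<class '...'>", which is noisy in reports.
print(str(collections.OrderedDict))
# The dotted form is the same for the class and for its instances.
print(readable_type(collections.OrderedDict))
print(readable_type(collections.OrderedDict()))
```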

Test Plan:
Manually inspect NS results for modules and functions, verify they are
more readable.

Imported from OSS

Differential Revision:
D30669545

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 60959e5cafa0a4992b083bf99f5d8260f9acdac0

3 years ago[fx2trt] Add acc_ops.sign and converter for it (#63876)
Shiyan Deng [Tue, 31 Aug 2021 18:29:07 +0000 (11:29 -0700)]
[fx2trt] Add acc_ops.sign and converter for it (#63876)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63876

Add `acc_ops.sign` which maps from `torch.sign`.

Add a plugin for `acc_ops.sign` (dynamic shapes are not currently supported). The plugin calls `at::sign` directly.

Test Plan: buck test mode/opt -c python.package_style=inplace -c fbcode.nvcc_arch=a100 caffe2/torch/fb/fx2trt:test_unary_ops

Reviewed By: yinghai

Differential Revision: D30518081

fbshipit-source-id: a0b9e6c30deac0b04b8cb09a162579e229985330

3 years agoUse stacklevel for floordiv deprecation warnings (#64034)
Saketh Are [Tue, 31 Aug 2021 17:59:57 +0000 (10:59 -0700)]
Use stacklevel for floordiv deprecation warnings (#64034)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/60548

`Tensor.__floordiv__` was indirectly deprecated by deprecation of `torch.floor_divide` (see https://github.com/pytorch/pytorch/issues/43874). Deprecating it directly provides clearer feedback.

Repro:
```
import torch
x = torch.tensor(0)
x // 1
```

Before this change, a deprecation warning was triggered within the C++ implementation of floor_divide:
```
UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  ../aten/src/ATen/native/BinaryOps.cpp:571.)
  return torch.floor_divide(self, other)
```

After this change, the warning instead cites the user's offending line of Python code:
```
UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  x // 1
```
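
The effect of `stacklevel` can be shown with the standard `warnings` module alone. `TruncInt` below is a hypothetical toy class, not the Tensor implementation; it mimics the truncating behaviour described in the warning text.

```python
import warnings


class TruncInt:
    """Toy number whose // rounds toward zero, like the deprecated behaviour."""

    def __init__(self, value):
        self.value = value

    def __floordiv__(self, other):
        # stacklevel=2 attributes the warning to the caller's `x // y` line
        # instead of to this line inside the library.
        warnings.warn(
            "__floordiv__ is deprecated (hypothetical message)",
            UserWarning,
            stacklevel=2,
        )
        return TruncInt(int(self.value / other))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = TruncInt(-7) // 2

# Truncation gives -3, whereas Python's own floor division gives -4.
print(result.value, -7 // 2)
print(caught[0].category.__name__)
```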

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64034

Reviewed By: mruberry

Differential Revision: D30658010

Pulled By: saketh-are

fbshipit-source-id: b0e6c5008d741897509d102f4a89efb47de4aa2a

3 years ago[ao][docs] Add description of qconfig and qengine to quantization page (#63582)
Raghuraman Krishnamoorthi [Tue, 31 Aug 2021 16:45:28 +0000 (09:45 -0700)]
[ao][docs] Add description of qconfig and qengine to quantization page (#63582)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63582

Current quantization docs do not define qconfig and qengine. Added text to define these concepts before they are used.
ghstack-source-id: 137051719

Test Plan: Imported from OSS

Reviewed By: HDCharles

Differential Revision: D30658656

fbshipit-source-id: a45a0fcdf685ca1c3f5c3506337246a430f8f506

3 years agoAdd OpInfo for `nn.functional.cosine_similarity` (#62959)
Kushashwa Ravi Shrimali [Tue, 31 Aug 2021 16:45:09 +0000 (09:45 -0700)]
Add OpInfo for `nn.functional.cosine_similarity` (#62959)

Summary:
Please see https://github.com/facebookresearch/functorch/issues/78 and https://github.com/pytorch/pytorch/issues/54261.

Notes:

* Some redundant tests from `test_nn.py` have been removed. I'm unsure about precision checks if they can be removed as well.
* Broadcasting is also checked in the OpInfo for `cosine_similarity`.

cc: mruberry zou3519 Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62959

Reviewed By: heitorschueroff

Differential Revision: D30520176

Pulled By: zou3519

fbshipit-source-id: 14e902eb4bcce875edab28a1669a2ea021052b9b

3 years ago[DataPipe] implementing __len__ for fork (no valid length for demux) (#64215)
Kevin Tse [Tue, 31 Aug 2021 15:07:23 +0000 (08:07 -0700)]
[DataPipe] implementing __len__ for fork (no valid length for demux) (#64215)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64215

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30648672

Pulled By: NivekT

fbshipit-source-id: 4780f2f6a79ae15a4009092475e7d92f96dd09a2

3 years ago[DataPipe] implementing demux() (#63650)
Kevin Tse [Tue, 31 Aug 2021 15:07:23 +0000 (08:07 -0700)]
[DataPipe] implementing demux() (#63650)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63650

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30493944

Pulled By: NivekT

fbshipit-source-id: 0aa06dee8c7fb1744975b8f6a0694b90c11ef80d

3 years ago[DataPipe] implementing fork() (#63649)
Kevin Tse [Tue, 31 Aug 2021 15:07:23 +0000 (08:07 -0700)]
[DataPipe] implementing fork() (#63649)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63649

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30493945

Pulled By: NivekT

fbshipit-source-id: 40db7d4134facd266d86bc0dc2edf2729c4e5842

3 years agoRevert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator...
Kimish Patel [Tue, 31 Aug 2021 14:36:53 +0000 (07:36 -0700)]
Revert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling.

Test Plan: revert-hammer

Differential Revision:
D30327514 (https://github.com/pytorch/pytorch/commit/bc9277dca3a40d99147d4a1a3e0160a4a8e91f9f)

Original commit changeset: 3bb2f2daaaed

fbshipit-source-id: 0b2aa7c57d08de77c9aaa75e546a7d0938610f64

3 years ago[Static Runtime] Implement aten::nonzero out variant (#64126)
Harut Movsisyan [Tue, 31 Aug 2021 07:49:39 +0000 (00:49 -0700)]
[Static Runtime] Implement aten::nonzero out variant (#64126)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64126

Test Plan:
Confirm out variant is called:

```
> buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --v=1
```

Reviewed By: mikeiovine

Differential Revision: D30617729

fbshipit-source-id: 752749638c8f467815efa57021cb3de5c728ab1b

3 years agoAutomated submodule update: FBGEMM (#64213)
Facebook Community Bot [Tue, 31 Aug 2021 04:31:11 +0000 (21:31 -0700)]
Automated submodule update: FBGEMM (#64213)

Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: https://github.com/pytorch/FBGEMM/commit/9d69998df6236d6714aa37ae6142a2a2d4fb2bf6

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64213

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jspark1105

Differential Revision: D30647878

fbshipit-source-id: b903b39441b4e28dda7eab226ac874e2227e750a

3 years ago[Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling. (#63367)
Kimish Patel [Tue, 31 Aug 2021 03:53:50 +0000 (20:53 -0700)]
[Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling. (#63367)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63367

This diff changes how operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wall-clock
time.
This unifies the various profiling efforts that we have for benchmarking. In
production we will still use the observer-based mechanism, but the advantage of
using the Kineto profiler is that we get a few other things for free, such as:
- chrome trace generation
- operator-level memory profiling (to be added)
- flop counts (to be added)

Furthermore, we can possibly use a Python post-processing script to parse the
chrome trace and generate output similar to torch.profiler. (To be done)
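
The post-processing idea mentioned above could look roughly like the sketch below. It is not the script this diff refers to (that is still to be done); it only assumes the trace follows the standard Chrome trace event format, where complete events have `"ph": "X"` and a `dur` field in microseconds. The function name `summarize_chrome_trace` and the sample events are made up for illustration.

```python
import json
from collections import defaultdict


def summarize_chrome_trace(trace_text):
    """Aggregate wall-clock time per event name from a Chrome trace JSON string."""
    data = json.loads(trace_text)
    # A Chrome trace is either a bare list of events or an object
    # holding the list under "traceEvents".
    events = data["traceEvents"] if isinstance(data, dict) else data

    totals = defaultdict(lambda: {"calls": 0, "total_us": 0.0})
    for event in events:
        # Only complete events ("X") carry a duration; skip metadata etc.
        if event.get("ph") != "X":
            continue
        entry = totals[event["name"]]
        entry["calls"] += 1
        entry["total_us"] += event.get("dur", 0)

    # Rows of (name, calls, total microseconds), slowest first.
    rows = [(name, s["calls"], s["total_us"]) for name, s in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical sample trace, not output from a real benchmark run.
sample = json.dumps({"traceEvents": [
    {"name": "aten::conv2d", "ph": "X", "ts": 0, "dur": 120},
    {"name": "aten::relu", "ph": "X", "ts": 130, "dur": 10},
    {"name": "aten::conv2d", "ph": "X", "ts": 150, "dur": 80},
    {"name": "process_name", "ph": "M", "args": {"name": "bench"}},
]})

for name, calls, total_us in summarize_chrome_trace(sample):
    print(f"{name:<16}{calls:>6}{total_us:>12.1f}")
```

Note that this sums total time per name, so nested events are counted in both the parent and the child; computing self time would need the `ts` ordering as well.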

Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and `--print_module_info true` (note that the Operator summary now has module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985

Reviewed By: raziel

Differential Revision: D30327514

fbshipit-source-id: 3bb2f2daaaedfb04bd6f5d9c91292783f9c4344f

3 years agoCompile BatchLinearAlgebra without nvcc (#64146)
Peter Bell [Tue, 31 Aug 2021 03:17:12 +0000 (20:17 -0700)]
Compile BatchLinearAlgebra without nvcc (#64146)

Summary:
These files only use CUDA library interfaces, so they don't actually need to be compiled with nvcc.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64146

Reviewed By: ezyang

Differential Revision: D30633189

Pulled By: ngimel

fbshipit-source-id: c9d0ae5259a10cb49332d31f0da89ad758736ea8

3 years ago[nnc] Enable fusion of bfloat16 ops (#64196)
Bert Maher [Tue, 31 Aug 2021 03:08:15 +0000 (20:08 -0700)]
[nnc] Enable fusion of bfloat16 ops (#64196)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64196

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30643864

Pulled By: bertmaher

fbshipit-source-id: e95edeaf7089464d713ea1d1f951743d3e5f61c5

3 years ago[WIP][FX] BC guarantees for 1.10 (#63888)
James Reed [Tue, 31 Aug 2021 02:54:50 +0000 (19:54 -0700)]
[WIP][FX] BC guarantees for 1.10 (#63888)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63888

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D30523133

Pulled By: jamesr66a

fbshipit-source-id: b04cc0d842a74862f42ecba98b757310cd2ec7b0

3 years agoadd operation list for AutocastCPU (#63534)
leslie-fang-intel [Tue, 31 Aug 2021 02:28:59 +0000 (19:28 -0700)]
add operation list for AutocastCPU (#63534)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63534

In this PR:
* We have changed the default dtype of `AutocastCPU` from `float16` to `bfloat16`, as discussed in `https://github.com/pytorch/pytorch/pull/61002`
* We have also updated the list of operations that need casting to `lower_precision_fp` or `float32`.
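
As a rough illustration of what a per-operation cast list means, the sketch below models the policy as a plain lookup. It is not the AutocastCPU implementation: the function `autocast_dtype` is hypothetical, and the op names in the two sets are placeholder entries, not the actual list from this PR. Only the `bfloat16` default and the two cast targets come from the description above.

```python
# Default lower-precision dtype for CPU autocast, per the description above.
DEFAULT_LOWER_PRECISION_FP = "bfloat16"

# Placeholder entries for illustration only; NOT the real operation list.
LOWER_PRECISION_FP_OPS = {"conv2d", "linear"}
FP32_OPS = {"softmax", "layer_norm"}


def autocast_dtype(op_name, input_dtype,
                   lower_precision_fp=DEFAULT_LOWER_PRECISION_FP):
    """Return the dtype an op's floating-point inputs would be cast to."""
    if op_name in LOWER_PRECISION_FP_OPS:
        return lower_precision_fp
    if op_name in FP32_OPS:
        return "float32"
    # Ops on neither list keep whatever dtype they were given.
    return input_dtype


for op in ("conv2d", "softmax", "add"):
    print(op, "->", autocast_dtype(op, "float32"))
```

Ops on neither list are left untouched in this sketch; the real policy may handle them differently.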

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D30644914

Pulled By: ezyang

fbshipit-source-id: 8b93485ba452b3759611e3f0ac88e920fe495ac1

3 years agoUpdate contribution_guide.rst (#64142)
oleshp [Tue, 31 Aug 2021 02:22:05 +0000 (19:22 -0700)]
Update contribution_guide.rst (#64142)

Summary:
Grammatical update.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64142

Reviewed By: mruberry

Differential Revision: D30639394

Pulled By: ezyang

fbshipit-source-id: cf1a4dfbd8e34b0772f1b09f5d820278e8ef8574

3 years agoAvoid an unnecessary list creation in `DataChunk` (#64111)
Santiago Castro [Tue, 31 Aug 2021 02:17:21 +0000 (19:17 -0700)]
Avoid an unnecessary list creation in `DataChunk` (#64111)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64111

Reviewed By: mruberry

Differential Revision: D30639383

Pulled By: ezyang

fbshipit-source-id: 96b243307413c99a67d55d862a71937e1ef210f4

3 years agoAdd optional tensor arguments to (#63967)
Samantha Andow [Tue, 31 Aug 2021 02:15:16 +0000 (19:15 -0700)]
Add optional tensor arguments to (#63967)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/63435

Adds optional tensor arguments to the torch function checks. The only function in the functional file I didn't do this for was `multi_head_attention_forward`: it already included some optional tensor arguments but not others, so it seemed like its arguments were specifically chosen.
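
The pattern can be sketched in plain Python as below. This is not the code from `torch.nn.functional`; `Tracked`, `find_override`, and this toy `linear` are hypothetical stand-ins that only mimic the shape of a `__torch_function__` check. The point is that the optional argument (`bias` here) is part of the tuple being checked, so an override carried only by the optional argument is not missed.

```python
class Tracked:
    """Hypothetical tensor-like type that opts in via __torch_function__."""

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        return f"overridden:{func.__name__}"


def find_override(candidates):
    """Return the first type defining __torch_function__, ignoring None."""
    for candidate in candidates:
        if candidate is not None and hasattr(type(candidate), "__torch_function__"):
            return type(candidate)
    return None


def linear(input, weight, bias=None):
    # Check the optional `bias` together with the required arguments;
    # checking only (input, weight) would skip an override on `bias`.
    candidates = (input, weight, bias)
    override = find_override(candidates)
    if override is not None:
        types = tuple(type(c) for c in candidates if c is not None)
        return override.__torch_function__(
            linear, types, (input, weight), {"bias": bias})
    # Toy fallback: a dot product over plain Python sequences.
    out = sum(i * w for i, w in zip(input, weight))
    return out if bias is None else out + bias


print(linear([1.0, 2.0], [3.0, 4.0], bias=0.5))
print(linear([1.0, 2.0], [3.0, 4.0], bias=Tracked()))
```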

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63967

Reviewed By: albanD

Differential Revision: D30640441

Pulled By: ezyang

fbshipit-source-id: 5ef9554d2fb6c14779f8f45542ab435fb49e5d0f