Raghavan Raman [Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)]
[MicroBench] Added a log_vml version of the signed log1p kernel (#64205)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64205
The log_vml version of the micro-bench is over **2x** faster than the log1p version. Here are the perf numbers:
```
---------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------
SignedLog1pBench/ATen/10/1467 45915 ns 45908 ns 14506 GB/s=2.5564G/s
SignedLog1pBench/NNC/10/1467 40469 ns 40466 ns 17367 GB/s=2.9002G/s
SignedLog1pBench/NNCLogVml/10/1467 19560 ns 19559 ns 35902 GB/s=6.00016G/s
```
Thanks to bertmaher for pointing this out.
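For reference, a minimal eager-mode sketch of what the signed log1p here is assumed to compute, namely sign(x) * log1p(|x|); the formula is an assumption, since this entry does not spell it out.
```python
import torch

# Assumed reference semantics of the benchmarked kernel: sign(x) * log1p(|x|).
def signed_log1p(x: torch.Tensor) -> torch.Tensor:
    return torch.sign(x) * torch.log1p(torch.abs(x))

x = torch.randn(10, 1467)  # shape mirroring the 10/1467 benchmark sizes
print(signed_log1p(x).shape)
```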
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30644716
Pulled By: navahgar
fbshipit-source-id: ba2b32c79d4265cd48a2886b0c62d0e89ff69c19
Raghavan Raman [Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)]
[nnc] Added an implementation of sign op (#64033)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64033
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30579197
Pulled By: navahgar
fbshipit-source-id: f9f7fa7f2ffa109cf4e441eb1af821b8b891d4d3
Eddie Ren [Fri, 10 Sep 2021 19:31:27 +0000 (12:31 -0700)]
Extend 2Dim embedding bag benchmarking to include 3Dim benchmarks (#64647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64647
Add support for benchmarking 8-bit quantization of N-D batched embeddings. Currently this only works for 3-dim embeddings and still requires thought on ramping up from 3-dim to N-dim.
Test Plan: ```buck run //caffe2/benchmarks/operator_benchmark/pt:qembedding_pack_test```
Reviewed By: jingsh
Differential Revision: D30770085
fbshipit-source-id: 26659020f3458991592065a05366bde0f060494e
Howard Huang [Fri, 10 Sep 2021 18:48:43 +0000 (11:48 -0700)]
Revert D30846958: [caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h
Test Plan: revert-hammer
Differential Revision: D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706)
Original commit changeset: 52a3fb66e426
fbshipit-source-id: 1d749f6981756f2169d6867538555a945cbb8ca6
Kevin Tse [Fri, 10 Sep 2021 18:00:01 +0000 (11:00 -0700)]
[DataPipe] fixing tests related to fork() to remove warnings (#64827)
Summary:
There are two warnings produced by `test_fork_datapipe`. This PR addresses the issues raised by those warnings without impacting the test cases.
cc VitalyFedyunin ejguan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64827
Reviewed By: ejguan
Differential Revision: D30870528
Pulled By: NivekT
fbshipit-source-id: 580a001c6fa3ff6f8b04a7e5183e58861938204b
Hui Guo [Fri, 10 Sep 2021 16:59:25 +0000 (09:59 -0700)]
[tensorexpr] Add 'pre_alloc' argument in python API of tensorexpr kernel (#64718)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64718
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D30826582
Pulled By: huiguoo
fbshipit-source-id: 6c173c8964f2643039273cdc83e64fb02bb5f381
anjali411 [Fri, 10 Sep 2021 16:55:50 +0000 (09:55 -0700)]
Skip conjugate and negate fallback for view ops and their in-place versions (#64392)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64392
cc ezyang anjali411 dylanbespalko mruberry Lezcano nikitaved
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30866330
Pulled By: anjali411
fbshipit-source-id: 7b2f51486bf1d610ad2b1472306bab608ee69c37
Ilqar Ramazanli [Fri, 10 Sep 2021 16:47:38 +0000 (09:47 -0700)]
To add Rprop documentation (#63866)
Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. In the following tracking issue we mention all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding a description of Rprop to the documentation. For more details, we refer to the paper http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.1417
<img width="657" alt="Rpropalg" src="https://user-images.githubusercontent.com/
73658284/
132750009-
a5ec059e-6d53-4c67-917b-
57174c8ca27b.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63866
Reviewed By: ngimel
Differential Revision: D30867590
Pulled By: iramazanli
fbshipit-source-id: 0d2d4ffc6c4d939290bbbaa84d2c6e901ed8b54a
Jeff Daily [Fri, 10 Sep 2021 16:36:26 +0000 (09:36 -0700)]
[ROCm] define C10_WARP_SIZE to warpSize HIP constant (#64302)
Summary:
warpSize is defined as a constexpr in HIP headers. It is incorrect to assume warpSize is always 64. This change fixes the C10_WARP_SIZE definition in torch sources, similar to [how it was done in caffe2](https://github.com/pytorch/pytorch/blob/master/caffe2/utils/GpuDefs.cuh#L10-L14).
cc jeffdaily sunway513 jithunnair-amd ROCmSupport
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64302
Reviewed By: mrshenli
Differential Revision: D30785975
Pulled By: malfet
fbshipit-source-id: 68f8333182ad4d02bd0c8d02f1751a50bc5bafa7
Corey Levinson [Fri, 10 Sep 2021 16:35:06 +0000 (09:35 -0700)]
fix typo in torch/onnx/utils.py (#63396)
Summary:
fixes minor typo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63396
Reviewed By: pbelevich
Differential Revision: D30644295
Pulled By: SplitInfinity
fbshipit-source-id: c506f67383909aa2c0c7c533038446b4b3d76a3a
rui [Fri, 10 Sep 2021 15:28:45 +0000 (08:28 -0700)]
build: bump bazel to 4.2.1 (#64455)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64455
Reviewed By: saketh-are
Differential Revision: D30752580
Pulled By: malfet
fbshipit-source-id: 4f5cc6f820396348181c09463f7e5628b5f69471
Aswin John Mathews [Fri, 10 Sep 2021 15:05:21 +0000 (08:05 -0700)]
ROCm MIOpen NHWC Convolution support (#63617)
Summary:
- Added 2D-Convolution NHWC support
- on ROCm 4.3, with `PYTORCH_MIOPEN_SUGGEST_NHWC=1` flag
- May need to force MIOpen to search for solutions (see examples below for flags)
**PYTORCH_MIOPEN_SUGGEST_NHWC Environment Flag**
MIOpen does not officially support NHWC yet, although convolution support has been added to tip-of-tree of MIOpen. This flag is intended to be a short-lived flag to explicitly turn on NHWC support until ROCm officially supports NHWC and performance is verified.
**Examples**
1. Example usage 1: Run test on ROCm 4.3
`PYTORCH_TEST_WITH_ROCM=1 PYTORCH_MIOPEN_SUGGEST_NHWC=1 MIOPEN_FIND_ENFORCE=4 MIOPEN_DEBUG_CONV_GEMM=0 MIOPEN_FIND_MODE=1 pytest test_nn.py -v -k "test_conv_cudnn_nhwc" `
2. Example usage 2: Run the following with `PYTORCH_MIOPEN_SUGGEST_NHWC=1` on ROCm 4.3.
```
#!/usr/bin/env python3
import torch
model = torch.nn.Conv2d(8, 4, 3).cuda().half()
model = model.to(memory_format=torch.channels_last)
input = torch.randint(1, 10, (2, 8, 4, 4), dtype=torch.float32, requires_grad=True)
input = input.to(device="cuda", memory_format=torch.channels_last, dtype=torch.float16)
# should print True for is_contiguous(channels_last), and strides must match NHWC format
print(input.is_contiguous(memory_format=torch.channels_last), input.shape, input.stride() )
out = model(input)
# should print True for is_contiguous(channels_last), and strides must match NHWC format
print("Contiguous channel last :", out.is_contiguous(memory_format=torch.channels_last), " out shape :", out.shape, "out stride :", out.stride() )
```
See https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html for more examples.
cc jeffdaily sunway513 jithunnair-amd ROCmSupport
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63617
Reviewed By: saketh-are
Differential Revision: D30730800
Pulled By: ezyang
fbshipit-source-id: 61906a0f30be8299e6547d312ae6ac91cc7c3238
Shen Li [Fri, 10 Sep 2021 14:44:09 +0000 (07:44 -0700)]
Let all_reduce_coalesced and all_gather_coalesced return Future objects (#64722)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64722
`all_reduce_coalesced` and `all_gather_coalesced` are never publicly
released in our API docs. So, I would assume the blast radius to be small.
The motivation for this change is to allow implementing
`all_reduce_coalesced` and `all_gather_coalesced` by re-using the `allreduce`
and `allgather` C++ cores and performing the flatten and copy only on the Python
side. With that, we can then remove `all_reduce_coalesced` and
`all_gather_coalesced` from the C++ ProcessGroup APIs. For the async mode,
the copy-back logic after the communication will need to be chained
as a callback on the returned Future and use the chained child Future
as the return value (otherwise, we will need to wrap the child Future
into another work handle). This PR tries to test if we can directly
return a Future without breaking tests and internal use cases. If yes,
it will make the consolidation a lot easier.
cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision: D30830994
Pulled By: mrshenli
fbshipit-source-id: dcde0ed9245e9e8fee357b3588b07d540a4b6318
Nikita Vedeneev [Fri, 10 Sep 2021 14:17:30 +0000 (07:17 -0700)]
`torch.lu`: forward AD support (#64742)
Summary:
As per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64742
Reviewed By: H-Huang
Differential Revision: D30841227
Pulled By: albanD
fbshipit-source-id: dc4d043ab94358594adb110fbbbb60750c98262a
Jordan Fix [Fri, 10 Sep 2021 06:49:22 +0000 (23:49 -0700)]
[const_fold] Keep around node.meta for replaced folded ops (#64782)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64782
Previously, get_attrs that were added to the graph did not retain node.meta after folding. Add such support, and improve coverage in general here.
Test Plan: Added test coverage.
Reviewed By: protonu
Differential Revision: D30852704
fbshipit-source-id: ece87a61c69b2e68982964c6adc4dde14dae12c7
Elias Guestrin [Fri, 10 Sep 2021 06:44:03 +0000 (23:44 -0700)]
[caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h (#64773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64773
Remove loose `#pragma warning ( pop )` in TensorBase.h.
Reviewed By: ezyang
Differential Revision: D30846958
fbshipit-source-id: 52a3fb66e426bc16ef7bde2a13e26e8293969026
Shirong Wu [Fri, 10 Sep 2021 04:02:15 +0000 (21:02 -0700)]
Add TRTSplitter (#64762)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64762
Extract and format TRTSplitter from the fx2trt_example code. The current implementation is tentative and subject to change based on the feeds model lowering progress.
Test Plan:
Manual print of supported operators:
`{<class 'torch.nn.modules.activation.ReLU'>: None, <function relu at 0x7f9b1abd0790>: None, <class 'torch.nn.modules.activation.Sigmoid'>: None, <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>: None, <built-in method add of type object at 0x7f9b7f402498>: None, <built-in function add>: None, <built-in method add of PyCapsule object at 0x7f9b1a3dc690>: None, <built-in method add_relu of PyCapsule object at 0x7f9b1a34cf90>: None, <class 'torch.nn.modules.batchnorm.BatchNorm2d'>: None, <class 'torch.nn.quantized.modules.batchnorm.BatchNorm2d'>: None, <class 'torch.nn.modules.conv.Conv2d'>: None, <class 'torch.nn.quantized.modules.conv.Conv2d'>: None, <class 'torch.nn.intrinsic.quantized.modules.conv_relu.ConvReLU2d'>: None, <class 'torch.nn.modules.linear.Linear'>: None, <class 'torch.nn.quantized.modules.linear.Linear'>: None, <class 'torch.nn.modules.pooling.MaxPool2d'>: None, <built-in function mul>: None, <built-in method mul of type object at 0x7f9b7f402498>: None, <built-in method mul of PyCapsule object at 0x7f9b1a3dc6c0>: None, <built-in method flatten of type object at 0x7f9b7f402498>: None, <class 'torch.nn.quantized.modules.DeQuantize'>: None, <built-in method dequantize of type object at 0x7f9b7f402498>: None, 'dequantize': None, <class 'torch.nn.quantized.modules.Quantize'>: None, <built-in method quantize_per_tensor of type object at 0x7f9b7f402498>: None, <class 'torch.nn.modules.linear.Identity'>: None, <function conv2d at 0x7f9b1a1fe9d0>: None, <function flatten at 0x7f9b1a1f5ca0>: None, <function size at 0x7f9b1a1f5b80>: None, <function batch_norm at 0x7f9b1a1feaf0>: None, <function layer_norm at 0x7f9b1a1feb80>: None, <function softmax at 0x7f9b1a1f9550>: None, <function relu at 0x7f9b1a1fe040>: None, <function sin at 0x7f9b1a2030d0>: None, <function cos at 0x7f9b1a203160>: None, <function tan at 0x7f9b1a2031f0>: None, <function sinh at 0x7f9b1a1fe160>: None, <function cosh at 0x7f9b1a1fe280>: None, <function tanh at 0x7f9b1a1fe310>: None, <function asin at 0x7f9b1a1fe3a0>: None, <function acos at 0x7f9b1a1fe430>: None, <function atan at 0x7f9b1a1fe4c0>: None, <function exp at 0x7f9b1a1fe550>: None, <function log at 0x7f9b1a1fe5e0>: None, <function sqrt at 0x7f9b1a1fe670>: None, <function reciprocal at 0x7f9b1a1fe700>: None, <function abs at 0x7f9b1a1fe790>: None, <function neg at 0x7f9b1a1fe820>: None, <function floor at 0x7f9b1a1fe8b0>: None, <function ceil at 0x7f9b1a1fe940>: None, <function sum at 0x7f9b1a1f9c10>: None, <function max_pool2d at 0x7f9b1a1f5d30>: None, <function squeeze at 0x7f9b1a1f5c10>: None, <function add at 0x7f9b1a1f91f0>: None, <function sub at 0x7f9b1a1f9ca0>: None, <function div at 0x7f9b1a1f9dc0>: None, <function mul at 0x7f9b1a1f9d30>: None, <function pow at 0x7f9b1a1f9e50>: None, <function min_two_tensors_input at 0x7f9b1a1f9940>: None, <function unsqueeze at 0x7f9b1a1f9280>: None, <function topk at 0x7f9b1a203280>: None, <function adaptive_avg_pool2d at 0x7f9b1a1f5dc0>: None, <function avg_pool2d at 0x7f9b1a1f5e50>: None, <function reshape at 0x7f9b1a203550>: None, <function slice_tensor at 0x7f9b1a1fee50>: None, <function split at 0x7f9b1a1fec10>: None, <function linear at 0x7f9b1a1f51f0>: None, <function clamp at 0x7f9b1a1f93a0>: None, <function tuple_construct at 0x7f9b1a1fed30>: None, <function contiguous at 0x7f9b1a1f9430>: None, <function getitem at 0x7f9b1a203310>: None, <function cat at 0x7f9b1a1f9310>: None, <function transpose at 0x7f9b1a1f94c0>: None, <function matmul at 0x7f9b1a1f98b0>: None, <function sigmoid at 
0x7f9b1a1fe1f0>: None, <function permute at 0x7f9b1a1f9670>: None, <function quantize_per_tensor at 0x7f9b1a1f9b80>: None, <function dequantize at 0x7f9b1a1f99d0>: None, <function sign at 0x7f9b1a1f5ee0>: None}`
Reviewed By: 842974287
Differential Revision: D30798047
fbshipit-source-id: 69076a550874425b7186fbbf2ecf03da4a99b42f
Scott Wolchok [Fri, 10 Sep 2021 01:53:36 +0000 (18:53 -0700)]
[PyTorch] Fix missing move in torch::jit::Lexer::next (#64653)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64653
Saves shared_ptr refcount inc/dec in SourceRange.
ghstack-source-id: 137608457
Test Plan: Profiled startup of framework overheads benchmark from high_per_models; self time spent in next() is way down.
Reviewed By: dhruvbird
Differential Revision: D30739240
fbshipit-source-id: ac455678c9d46e657b111d3788d4369983028674
Scott Wolchok [Fri, 10 Sep 2021 01:53:36 +0000 (18:53 -0700)]
[PyTorch] Use std::find in the JIT lexer (#64652)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64652
If nothing else, it is slightly clearer code.
ghstack-source-id: 137608456
Test Plan: CI
Reviewed By: dhruvbird
Differential Revision: D30739239
fbshipit-source-id: bc7917b59883ca4a33fc6916b4e422bad79cf04b
Mikhail Zolotukhin [Fri, 10 Sep 2021 01:48:17 +0000 (18:48 -0700)]
[TensorExpr] Simplify TE IR before applying any transformations. (#64717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64717
This also exposed several bugs, which are fixed in this PR.
Differential Revision: D30826408
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id: a67ec5739aceed9ffdf0d24f77eb3787cefe4560
Jerry Zhang [Fri, 10 Sep 2021 00:17:01 +0000 (17:17 -0700)]
[quant][fix] Fix quantization for sub_scalar (#64603)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64603
We'll insert an observer only when both the operator and the dtype are supported.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_sub_scalar
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D30797025
fbshipit-source-id: a77c21e2749405534fc245374cf33a0657a3d2c8
Linbin Yu [Thu, 9 Sep 2021 23:56:50 +0000 (16:56 -0700)]
[Android] print type name for IValues (#64602)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64602
Print the type name in the error message for easier debugging.
Test Plan:
Example:
java.lang.IllegalStateException: Expected IValue type Tensor, actual type TensorList
Reviewed By: beback4u
Differential Revision: D30782318
fbshipit-source-id: 60d88a659e7b4bb2b574b12c7652a28f0d5ad0d2
Xinyi Zhang [Thu, 9 Sep 2021 23:43:55 +0000 (16:43 -0700)]
[caffe2][tiny] Add logging to report what the current lengths are when mismatched lengths are detected (#64768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64768
as title
Differential Revision: D30846637
fbshipit-source-id: 266768c81b315fdebba854135ea2db1faf67fd6a
Ilqar Ramazanli [Thu, 9 Sep 2021 22:37:44 +0000 (15:37 -0700)]
[doc][hackathon] To add Adagrad Optimizer to the documentation (#63254)
Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. In the following tracking issue we mention all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding a description of Adagrad to the documentation. For more details, we refer to the paper
http://jmlr.org/papers/v12/duchi11a.html
<img width="658" alt="AdaGradAlgo" src="https://user-images.githubusercontent.com/
73658284/
132743276-
a52ea3fb-70a5-4788-94b7-
f99367907a26.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63254
Reviewed By: albanD
Differential Revision: D30852139
Pulled By: iramazanli
fbshipit-source-id: 9e496560a97e92be8386585b01d9bd3bba4b0c66
Harut Movsisyan [Thu, 9 Sep 2021 21:35:00 +0000 (14:35 -0700)]
[Static Runtime] Fix resize_output_check warning coming from prim::VarConcat (#64765)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64765
Test Plan: Tested the fix with BR v1 model predictor-replayer setup.
Reviewed By: ajyu
Differential Revision: D30846506
fbshipit-source-id: 3ef3c93f11285c7cd1e2b188ca298a7ab4fba579
Han Guangyun [Thu, 9 Sep 2021 20:02:56 +0000 (13:02 -0700)]
Rename profiler metadata key (#63743)
Summary:
Rename the metadata key to be the same as the variable name.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63743
Reviewed By: albanD
Differential Revision: D30839501
Pulled By: gdankel
fbshipit-source-id: b9b4e670dcc9557b8d8d0730baea0ad39a1a0ca4
Jordan Fix [Thu, 9 Sep 2021 19:59:54 +0000 (12:59 -0700)]
Add support for lowering info during serialize_module, and add padding/partial to it (#5810)
Summary:
Pull Request resolved: https://github.com/pytorch/glow/pull/5810
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64725
- Any info added to the dict in node.meta["lowering_info"] will be added to the node_rep during serialization.
- Use this to add annotations on placeholders that allow partial inputs and require padding.
- Check for these annotations and set them in the NNPICompiledFunction as expected
Test Plan: Validated working on inline_cvr in stack. Additionally existing fx_glow end to end tests should still pass.
Reviewed By: 842974287
Differential Revision: D30824192
fbshipit-source-id: def64ef097aa35c337abb494415f7d437c6c7fa9
Palwisha Akhtar [Thu, 9 Sep 2021 19:49:03 +0000 (12:49 -0700)]
cat_shape_check: Fixes dimension in the error message for CUDA cat shape check and removes unnecessary offending index information (#64556)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/64207
Thank you, SsnL, for providing the reproducing script.
cc ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64556
Reviewed By: albanD
Differential Revision: D30843859
Pulled By: ngimel
fbshipit-source-id: 457ebe80eaef793d9f5d35ee962b6697e5de1907
Xu Zhao [Thu, 9 Sep 2021 19:35:36 +0000 (12:35 -0700)]
Enable the on-demand performance PR testing to run on a specified TB branch (#64701)
Summary:
This is to enable performance testing of experimental features such as LazyTensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64701
Test Plan:
TorchBench CI
RUN_TORCHBENCH: BERT_pytorch, mobilenet_v3_large
TORCHBENCH_BRANCH: v1.0
Reviewed By: seemethere
Differential Revision: D30847389
Pulled By: xuzhao9
fbshipit-source-id: 6853b368fa6f1ba8ffde517805c74bf318dcb35b
Eli Uriegas [Thu, 9 Sep 2021 19:20:43 +0000 (12:20 -0700)]
.github: Remove add_annotations workflow (#64449)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64449
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: suo, janeyx99
Differential Revision: D30738460
Pulled By: seemethere
fbshipit-source-id: f1259fcba2f0c14a9bcfbe811ec0a4bf61106619
Rohan Varma [Thu, 9 Sep 2021 19:05:26 +0000 (12:05 -0700)]
[Dist/CI] Remove dist from target determinator (#64721)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64721
There are a couple of PRs where distributed CI did not run and we expected it to. Examples:
https://github.com/pytorch/pytorch/pull/64513/checks?check_run_id=3539190960,
https://github.com/pytorch/pytorch/pull/64113. All distributed tests should have
been run on these PRs, but we can see they were not:
```
Determination is skipping distributed/test_c10d_common
Determination is skipping distributed/test_c10d_gloo
Determination is skipping distributed/test_c10d_nccl
Determination is skipping distributed/test_c10d_spawn_gloo
Determination is skipping distributed/test_c10d_spawn_nccl
Running distributed/test_data_parallel without determination
Determination is skipping distributed/test_distributed_spawn
Determination is skipping distributed/test_jit_c10d
```
Since it is important to run distributed tests on PRs that touch distributed,
exclude distributed from target_det_list for now.
ghstack-source-id: 137654015
Test Plan: CI
Reviewed By: driazati, mrshenli
Differential Revision: D30830455
fbshipit-source-id: 8b0fdf5b57c2c647b0d82c48e2bb8e2bdbe4d307
Emad El-Haraty [Thu, 9 Sep 2021 17:32:22 +0000 (10:32 -0700)]
fix acc topk's handling of the case when dim=0, fix tests as well (#64727)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64727
The acc_ops converter for topk has a subtle bug (found while trying to introduce max/min):
the code does not differentiate between dim == None and dim == 0, but these are different computations.
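A hypothetical illustration of the pitfall described above; the helper names below are illustrative and not the actual acc_ops converter code. A truthiness check on `dim` conflates `dim=0` with `dim=None`:
```python
import torch

def topk_buggy(x, k, dim=None):
    # Buggy pattern: `if dim:` is False for both dim=None and dim=0,
    # so dim=0 silently falls back to the "no dim given" behaviour.
    if dim:
        return torch.topk(x, k, dim=dim)
    return torch.topk(x, k)  # defaults to the last dimension

def topk_fixed(x, k, dim=None):
    # Fixed pattern: explicitly distinguish None from 0.
    if dim is not None:
        return torch.topk(x, k, dim=dim)
    return torch.topk(x, k)

x = torch.randn(3, 5)
print(topk_buggy(x, 2, dim=0).values.shape)  # torch.Size([3, 2]) -- wrong axis
print(topk_fixed(x, 2, dim=0).values.shape)  # torch.Size([2, 5])
```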
Reviewed By: jfix71, 842974287
Differential Revision: D30833621
fbshipit-source-id: 6cd84e6ca4e95bb1a6d6465e61830b76808a9c78
Richard Barnes [Thu, 9 Sep 2021 17:30:59 +0000 (10:30 -0700)]
Fix a shadowed variable (#64695)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64695
Resolves this warning:
```
caffe2/aten/src/ATen/ParallelOpenMP.h:109:63: warning: declaration of 'int64_t begin' shadows a parameter [-Wshadow=compatible-local]
109 | internal::invoke_parallel(begin, end, grain_size, [&](int64_t begin, int64_t end) {
| ~~~~~~~~^~~~~
caffe2/aten/src/ATen/ParallelOpenMP.h:86:1: note: shadowed declaration is here
85 | inline scalar_t parallel_reduce(
| ~~~~~~~~~~~~~~~~
86 | const int64_t begin,
| ^ ~
```
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30816128
fbshipit-source-id: 3adff6d94eea9fbd65885e88283cae10b87dba18
Nived P A [Thu, 9 Sep 2021 17:29:10 +0000 (10:29 -0700)]
Added more version comparison operations (#63848)
Summary:
Currently the [TorchVersion](https://github.com/pytorch/pytorch/blob/1022443168b5fad55bbd03d087abf574c9d2e9df/torch/torch_version.py#L13) class only supports 'greater than' and 'equal to' operations for comparing torch versions, so something like `TorchVersion('1.5.0') < (1,5,1)` or `TorchVersion('1.5.0') >= (1,5)` will throw an error.
I have added 'less than' (`__lt__()`), 'greater than or equal to' (`__ge__()`), and 'less than or equal to' (`__le__()`) operations so that the TorchVersion object can be used for a wider range of version comparisons.
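A quick sketch of the comparisons this enables, based on the examples above (assuming `TorchVersion` is importable from `torch.torch_version` as in the file linked here):
```python
from torch.torch_version import TorchVersion

v = TorchVersion('1.5.0')
print(v < (1, 5, 1))   # True  -- __lt__ added in this PR
print(v >= (1, 5))     # True  -- __ge__ added in this PR
print(v <= '1.5.0')    # True  -- __le__ added in this PR
print(v > (1, 4))      # True  -- already supported
```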
cc seemethere zsol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63848
Reviewed By: fmassa, heitorschueroff
Differential Revision: D30526996
Pulled By: seemethere
fbshipit-source-id: 1db6bee555043e0719fd541cec27810852590940
Mike Ruberry [Thu, 9 Sep 2021 17:02:03 +0000 (10:02 -0700)]
Reverts cat and stack warning when out= is not the expected shape (#64714)
Summary:
These warnings are being thrown too aggressively at the moment. See https://github.com/pytorch/pytorch/issues/64709 for a follow-up to reenable them once internal call sites are reviewed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64714
Reviewed By: ngimel
Differential Revision: D30822965
Pulled By: mruberry
fbshipit-source-id: 3ad7c92d381d42ac6187ed84afab477c579a8f35
Ilqar Ramazanli [Thu, 9 Sep 2021 16:32:36 +0000 (09:32 -0700)]
To add SequentialLR to PyTorch Core Schedulers (#64037)
Summary:
Partially resolves https://github.com/pytorch/vision/issues/4281
In this PR we are proposing a new scheduler --SequentialLR-- which enables a list of different schedulers to be called in different periods of the training process.
The main motivation for this scheduler is the recently gained popularity of a warm-up phase at the start of training. It has been shown that taking small steps in the initial stages of training can help the convergence procedure go faster.
With the help of SequentialLR we mainly enable calling a small constant (or linearly increasing) learning rate scheduler followed by the actual target learning rate scheduler.
```python
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
```
This code snippet will call `ConstantLR` in the first 5 epochs and follow up with `ExponentialLR` in the subsequent epochs.
This scheduler can be used to call any group of schedulers one after another. The main consideration we should make is that every time we switch to a new scheduler, we assume that the new scheduler starts from the beginning, i.e. the zeroth epoch.
We also add the Chained Scheduler to the `optim.rst` and `lr_scheduler.pyi` files here.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64037
Reviewed By: albanD
Differential Revision: D30841099
Pulled By: iramazanli
fbshipit-source-id: 94f7d352066ee108eef8cda5f0dcb07f4d371751
John Shen [Thu, 9 Sep 2021 16:30:32 +0000 (09:30 -0700)]
[pytorch] Make qlinear weight packing thread safe (#63804)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63804
Adding a lock around weight packing section of qlinear + qlinear_dynamic
Test Plan: automated tests
Reviewed By: kimishpatel
Differential Revision: D30340957
fbshipit-source-id: 1c9faf796c4ffbc74345396188a6f1154a76bea6
Nikita Vedeneev [Thu, 9 Sep 2021 15:56:29 +0000 (08:56 -0700)]
`torch.lu_solve`: forward AD support (#64646)
Summary:
As per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64646
Reviewed By: VitalyFedyunin
Differential Revision: D30807898
Pulled By: albanD
fbshipit-source-id: 1f943c22357dd1b3662cfe0d2a26af68e3a2df4c
Raghavan Raman [Thu, 9 Sep 2021 15:26:16 +0000 (08:26 -0700)]
[nnc] Handled cast in index expression during inlining (#64716)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64716
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D30826388
Pulled By: navahgar
fbshipit-source-id: 7e446602f650527e0d954e437f0370602019e040
Raghavan Raman [Thu, 9 Sep 2021 15:26:16 +0000 (08:26 -0700)]
[nnc] Updated indices during broadcast to use int64_t (#64627)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64627
This fixes the root cause of S242719
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D30801686
Pulled By: navahgar
fbshipit-source-id: b6d3ebdc7eb57116eaced53c2f35c7798bb17e80
Howard Huang [Thu, 9 Sep 2021 15:20:40 +0000 (08:20 -0700)]
Revert D30745921: [DDP] Fix when buffers are reassigned in module
Test Plan: revert-hammer
Differential Revision: D30745921 (https://github.com/pytorch/pytorch/commit/d59ecc02df70bad2273858c2fad2b4993133a3d3)
Original commit changeset: 25eb1edbf445
fbshipit-source-id: 343ead86bf1e2d0b2d4124be331ea2fa437303ad
Howard Huang [Thu, 9 Sep 2021 15:20:40 +0000 (08:20 -0700)]
Revert D30745961: [DDP] Remove self.modules_params
Test Plan: revert-hammer
Differential Revision: D30745961 (https://github.com/pytorch/pytorch/commit/8c095102948c9601792a884dad56da5085c51bee)
Original commit changeset: 32d102502570
fbshipit-source-id: 59f7cc50d369b6cc2856cf4ebd0f58b96202336d
Howard Huang [Thu, 9 Sep 2021 15:20:40 +0000 (08:20 -0700)]
Revert D30745960: [DDP] Remove SPMD from self.modules_buffers
Test Plan: revert-hammer
Differential Revision: D30745960 (https://github.com/pytorch/pytorch/commit/15532595209d2daf34d35e10f8d3d3b64966aea2)
Original commit changeset: 66a8f9847e9f
fbshipit-source-id: d3f3fb813c45ac1b0ff15c6154b2e99e5dbab433
Elias Ellison [Thu, 9 Sep 2021 15:12:30 +0000 (08:12 -0700)]
[JIT] Add gradient check in constants (#64613)
Summary:
fixes internal issue
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64613
Reviewed By: Gamrix
Differential Revision: D30799016
Pulled By: eellison
fbshipit-source-id: 48ef52d1cac627919e6cd232216d24878a2a8b58
Edward Yang [Thu, 9 Sep 2021 14:17:26 +0000 (07:17 -0700)]
Filter out _disabled_torch_function_impl from handle_torch_function (#64689)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64689
This brings it in line with the C++ implementation.
Fixes https://github.com/pytorch/pytorch/issues/64687
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30816215
Pulled By: ezyang
fbshipit-source-id: ed36af6c35467ae678d9548197efd97c36d38dec
Ilqar Ramazanli [Thu, 9 Sep 2021 14:04:57 +0000 (07:04 -0700)]
To add Rectified Adam Description to Documentation (#63772)
Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. In the following tracking issue we mention all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding a description of the Rectified Adam algorithm to the documentation. For more details, we refer to the paper https://arxiv.org/abs/1908.03265
<img width="446" alt="RadamAlgo" src="https://user-images.githubusercontent.com/
73658284/
132587815-
4764b642-df53-4e41-975f-
72e0f40fdc48.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63772
Reviewed By: datumbox
Differential Revision: D30839694
Pulled By: iramazanli
fbshipit-source-id: 6f5629ce56e10c66a451433334b587b99eda1610
Ilqar Ramazanli [Thu, 9 Sep 2021 14:03:49 +0000 (07:03 -0700)]
[doc][hackathon] To add AdamW Optimizer to the documentation (#63252)
Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. In the following tracking issue we mention all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding a description of the AdamW algorithm to the documentation. For more details, we refer to the paper https://arxiv.org/abs/1711.05101
<img width="442" alt="AdamWalgo" src="https://user-images.githubusercontent.com/
73658284/
132589957-
6d381e96-cb62-40d0-990f-
82a32ec455be.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63252
Reviewed By: datumbox
Differential Revision: D30839685
Pulled By: iramazanli
fbshipit-source-id: 1a426c874ab86408d286a34f41aefcf5b21167c0
Ilqar Ramazanli [Thu, 9 Sep 2021 13:39:14 +0000 (06:39 -0700)]
To add Adamax algorithm to documentation (#63903)
Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. In the following tracking issue we mention all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding a description of the Adamax algorithm to the documentation. For more details, we refer to the paper https://arxiv.org/abs/1412.6980
<img width="447" alt="Adamx" src="https://user-images.githubusercontent.com/
73658284/
132577306-
878ce64c-627a-4086-808c-
d0482868d4a1.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63903
Reviewed By: albanD
Differential Revision: D30819055
Pulled By: iramazanli
fbshipit-source-id: 37f748cbea9f93bf37193ee30fc295fb1a1e9ffd
CodemodService FBSourceClangFormatLinterBot [Thu, 9 Sep 2021 11:21:58 +0000 (04:21 -0700)]
[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT`
Reviewed By: zertosh
Differential Revision: D30835585
fbshipit-source-id: a7d35319fd3ae3eddd29b69d299d842f68d587f6
Yinghai Lu [Thu, 9 Sep 2021 07:58:39 +0000 (00:58 -0700)]
Fix log1p lowering bug (#64724)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64724
`1` will introduce an int tensor instead of a float tensor, which doesn't work well with downstream (elementwise) operators. The error would look like:
```
[TensorRT] WARNING: IElementWiseLayer with inputs (Unnamed Layer* 1) [Unary]_output and (Unnamed Layer* 2) [Constant]_output: first input has type Float but second input has type Int32.
```
Changing the constant to be float type fixes this.
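A small eager-mode illustration of the dtype issue, assuming the lowering materializes the `1` as a constant tensor (the tensor names below are illustrative):
```python
import torch

x = torch.rand(4)  # float32 input

c_int = torch.tensor(1)      # dtype torch.int64 -> the problematic constant
c_float = torch.tensor(1.0)  # dtype torch.float32 -> matches the float input

print(c_int.dtype, c_float.dtype)
# Expressing log1p(x) as log(x + c) with the float constant keeps both
# elementwise inputs at the same (float) type.
out = torch.log(x + c_float)
print(torch.allclose(out, torch.log1p(x)))  # True
```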
Reviewed By: 842974287
Differential Revision: D30796959
fbshipit-source-id: 0538e4dd960df9ce87a2d4cafe8f1a0c061b6bad
Peter Bell [Thu, 9 Sep 2021 05:07:12 +0000 (22:07 -0700)]
Migrate uses of THCReduceApplyUtils to cuda_utils::BlockReduce (#64713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64713
Resubmit of #64442
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D30825646
Pulled By: ngimel
fbshipit-source-id: 66b06bd0b30b401833e337920681d19d96b11f9d
Rohan Varma [Thu, 9 Sep 2021 02:13:33 +0000 (19:13 -0700)]
[DDP] Remove SPMD from self.modules_buffers (#64474)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64474
No need for a nested list here.
ghstack-source-id: 137526312
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D30745960
fbshipit-source-id: 66a8f9847e9fe1e02c51b79647e93bf7665cf4d9
Rohan Varma [Thu, 9 Sep 2021 02:13:33 +0000 (19:13 -0700)]
[DDP] Remove self.modules_params (#64473)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64473
Unused after SPMD deprecated.
ghstack-source-id: 137526305
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D30745961
fbshipit-source-id: 32d102502570291e01579e5b47a6d74dc71013bb
Rohan Varma [Thu, 9 Sep 2021 02:13:33 +0000 (19:13 -0700)]
[DDP] Fix when buffers are reassigned in module (#64472)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64472
Sometimes, a user module can reassign a tensor buffer, as in:
```
self.buffer = torch.randn(1, 2) # in init
self.buffer += 1 # in forward
```
In this case, `self.modules_buffers` will become outdated and we should
repopulate `self.modules_buffers` if we need to sync module buffers.
See https://github.com/pytorch/pytorch/issues/63916 for a full description of the
issue.
ghstack-source-id: 137526309
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D30745921
fbshipit-source-id: 25eb1edbf445703a481802e07f3058d38ea6fc64
Scott Wolchok [Thu, 9 Sep 2021 01:30:14 +0000 (18:30 -0700)]
[PyTorch] Fix MobileDebugInfo vector copy (#64030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64030
ghstack-source-id: 137566816
Test Plan:
Pixel 3 before: https://our.intern.facebook.com/intern/aibench/details/320277034999340
Pixel 3 after: https://our.intern.facebook.com/intern/aibench/details/724509739115867
We can see the vector copy disappear in the flame graph. Overall mean decreased from 354 ms to 348 ms (though I'm not sure if this is outside usual noise).
Reviewed By: raziel
Differential Revision: D30559032
fbshipit-source-id: 6d8bb5396d3449cc63023ee7acf694b5d146ddc1
Scott Wolchok [Thu, 9 Sep 2021 01:30:14 +0000 (18:30 -0700)]
[PyTorch] move from input ivalues in ByteCodeDeserializer (#64029)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64029
This should save us a separate pass over the data structure to destroy it.
ghstack-source-id: 137566821
Test Plan:
Pixel 3
before: https://www.internalfb.com/intern/aibench/details/503337445067962
after: https://our.intern.facebook.com/intern/aibench/details/320277034999340
Overall mean time decreased from 373 ms to 358 ms. In the flame graph, we
can see that some time spent destroying a vector of IValues was moved
into parseMethods, and the new parseMethods time is less than the old
time plus the recursive destruction time.
Reviewed By: dhruvbird
Differential Revision: D30559530
fbshipit-source-id: d080295a846745ea03ac50f08f4f6c95f4eaf3d8
Scott Wolchok [Thu, 9 Sep 2021 01:30:14 +0000 (18:30 -0700)]
[PyTorch] Copy vectors less in Function::append_operator (#63977)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63977
Doesn't seem to be any reason to copy these argument vectors.
ghstack-source-id: 137566815
Test Plan: CI
Reviewed By: dhruvbird, raziel
Differential Revision: D30550301
fbshipit-source-id: 33c199f975e4fb62c50a8210dc08aa9bb7a3e2f2
Yinghai Lu [Thu, 9 Sep 2021 01:20:46 +0000 (18:20 -0700)]
[FX] make visualizer produce different formatted output (#64699)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64699
Previously we just hardcoded the svg format. We should give folks a choice of what format they want to see. If we are given a weird extension like .abc, this will error out, and we expect that to be the right behavior.
Reviewed By: houseroad
Differential Revision: D30718883
fbshipit-source-id: fe8827262f94ea6887999bb225de763d1909eef8
Nikita Shulga [Thu, 9 Sep 2021 01:06:04 +0000 (18:06 -0700)]
Re-enable nightly doc pushes (#64708)
Summary:
These were accidentally disabled by https://github.com/pytorch/pytorch/pull/64222.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64708
Reviewed By: seemethere
Differential Revision: D30822089
Pulled By: malfet
fbshipit-source-id: 056b5c006f236c78ffe8afa4a5eab2f35e1bce89
Jordan Fix [Wed, 8 Sep 2021 23:09:53 +0000 (16:09 -0700)]
[acc_tracer] Enable check_mutable_operations (#64456)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64456
att
Test Plan: CI
Reviewed By: protonu
Differential Revision: D30679174
fbshipit-source-id: 73f3a07d58380cd44fb3481aa97d463c0a964de8
Hui Guo [Wed, 8 Sep 2021 22:30:59 +0000 (15:30 -0700)]
[tensorexpr] Allocate intermediate buffers at compile time (#64227)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64227
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D30652220
Pulled By: huiguoo
fbshipit-source-id: cd75005cdfa42751318de7174b44e14a3a01634e
Hui Guo [Wed, 8 Sep 2021 22:30:59 +0000 (15:30 -0700)]
[tensorexpr] Add 'is_allocated' flag for buffers and use it to insert 'Alloc/Free' stmts (#64226)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64226
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D30652221
Pulled By: huiguoo
fbshipit-source-id: ef9bb0e3db2c444b476e5fc23956bc34ae0f0111
Jordan Fix [Wed, 8 Sep 2021 22:30:28 +0000 (15:30 -0700)]
[acc_normalizer] Improve error when kwarg normalization fails (#64408)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64408
att
Test Plan: NFC
Reviewed By: protonu
Differential Revision: D30716392
fbshipit-source-id: e1c3bb1afcd5363a9d502549d8a46b90226be40c
Hector Yuen [Wed, 8 Sep 2021 22:22:21 +0000 (15:22 -0700)]
Update breakpad to an existing commit: 7d188f6 (#64666)
Summary:
Fixes issue https://github.com/pytorch/pytorch/issues/64561
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64666
Reviewed By: driazati
Differential Revision: D30814127
Pulled By: hyuen
fbshipit-source-id: 511a30fc26153569b1cd39f34e4a1a6bb99cc5e4
Ilqar Ramazanli [Wed, 8 Sep 2021 22:20:52 +0000 (15:20 -0700)]
To add Stochastic Gradient Descent to Documentation (#63805)
Summary:
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation may result in a nice optimization research tutorial. In the following tracking issue we mention all the necessary algorithms and links to the originally published papers: https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding a description of Stochastic Gradient Descent to the documentation.
<img width="466" alt="SGDalgo" src="https://user-images.githubusercontent.com/
73658284/
132585881-
b351a6d4-ece0-4825-b9c0-
126d7303ed53.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63805
Reviewed By: albanD
Differential Revision: D30818947
Pulled By: iramazanli
fbshipit-source-id: 3812028e322c8a64f4343552b0c8c4582ea382f3
Eli Uriegas [Wed, 8 Sep 2021 21:40:03 +0000 (14:40 -0700)]
.github: Upgrade windows CUDA 10.1 -> 10.2 (#64658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64658
We don't release 10.1 anymore so let's bump to 10.2
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet, janeyx99
Differential Revision: D30811178
Pulled By: seemethere
fbshipit-source-id: c504ebf7f0d4c0d6229319d774f808b4ba0facd9
Shirong Wu [Wed, 8 Sep 2021 21:29:33 +0000 (14:29 -0700)]
Add plugin for linalg norm operation (#64611)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64611
Add a plugin for torch.linalg.norm. This plugin only supports norm operations without a batch_size change, so vector inputs or matrix inputs with dims including '0' are not supported by this plugin.
Test Plan: Unit test
Reviewed By: 842974287
Differential Revision: D30525958
fbshipit-source-id: 0d66b60a390bb6235166e5a80390090d0acf691a
Natalia Gimelshein [Wed, 8 Sep 2021 21:25:42 +0000 (14:25 -0700)]
Revert D30735341: Migrate uses of THCReduceApplyUtils to cuda_utils::BlockReduce
Test Plan: revert-hammer
Differential Revision: D30735341 (https://github.com/pytorch/pytorch/commit/a5ad08ec704a3f765814eacf5c393e871c0174e1)
Original commit changeset: 3cb58bed8f1f
fbshipit-source-id: 874dd0f93b24a99694db42a15714834069d402bc
Yinghai Lu [Wed, 8 Sep 2021 20:50:46 +0000 (13:50 -0700)]
[fx] make const fold code more pythonic (#64451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64451
No functional change.
Test Plan:
```
buck test caffe2/test:fx_const_fold
```
Reviewed By: jfix71, RoshanPAN, houseroad
Differential Revision: D30718255
fbshipit-source-id: 95f98561c7f33fcc6c839db68683c85eb152c949
Zafar Takhirov [Wed, 8 Sep 2021 20:32:29 +0000 (13:32 -0700)]
[quant] Enable jit tracing on quantizable LSTM (resubmission) (#64638)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64638
The quantizable LSTM didn't support jit tracing because it had several non-traceable paths. We sacrifice some of the user experience to enable tracing.
The main UX feature removed is a user-friendly message when trying to access the backwards path in a bidirectional LSTM: when the bidirectional flag is False, we used to throw a nice error message when the user tried accessing the backwards weights. Now the message is the default one (the properties were removed).
Test Plan: `buck test mode/dev //caffe2/test:quantization -- test_custom_module_lstm`
Reviewed By: HDCharles
Differential Revision: D30803753
fbshipit-source-id: a639955a96cee22538d9436f1c952a5d121f50f9
Peter Bell [Wed, 8 Sep 2021 20:25:42 +0000 (13:25 -0700)]
Factor out TensorBase that doesn't depend on native operators (#63612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63612
This makes Tensor inherit from a new class TensorBase that provides a subset of Tensor that doesn't
directly depend on native_functions.yaml. Code that only includes TensorBase.h will thus not need to
be rebuilt every time someone changes an operator signature.
Making `Tensor` inherit from this class means that `const TensorBase&` parameters will be callable
with an ordinary `Tensor`. I've also made `Tensor` constructible and assignable from `TensorBase` to
minimize friction in code mixing the two types.
To help enforce that `Tensor.h` and `Functions.h` aren't accidentally included, I've added an error
into `Operators.h` if `TORCH_ASSERT_NO_OPERATORS` is defined. We can either set this in the build
system for certain folders, or just define it at the top of any file.
I've also included an example of manually special-casing the commonly used `contiguous` operator.
The inline function's slow path defers to `TensorBase::__dispatch_contiguous` which is defined in
`Tensor.cpp`. I've made it so `OptionalTensorRef` is constructible from `TensorBase`, so I can
materialize a `Tensor` for use in dispatch without actually increasing its refcount.
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D30728580
Pulled By: ezyang
fbshipit-source-id: 2cbc8eee08043382ee6904ea8e743b1286921c03
David Riazati [Wed, 8 Sep 2021 18:35:42 +0000 (11:35 -0700)]
Make doc previews use its own S3 bucket (#64594)
Summary:
We had been using the gha-artifacts bucket (which previously only stored workflow artifacts) to keep the docs around. This makes it hard to see how our storage for artifacts vs docs is trending.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64594
Reviewed By: seemethere
Differential Revision: D30794328
Pulled By: driazati
fbshipit-source-id: 6b2721a3d76e8a273bde055783d56551f8409edd
Thomas J. Fan [Wed, 8 Sep 2021 18:00:11 +0000 (11:00 -0700)]
TST Adds inplace checks to module_info (#63739)
Summary:
Follow up to https://github.com/pytorch/pytorch/pull/61935
This PR adds inplace checks to `test_modules`. This version checks the constructor for `inplace` and performs the check automatically.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63739
Reviewed By: saketh-are
Differential Revision: D30737774
Pulled By: jbschlosser
fbshipit-source-id: 8813534511e9296c8424d1ca878412726ddd4043
Peter Bell [Wed, 8 Sep 2021 17:57:30 +0000 (10:57 -0700)]
Migrate uses of THCReduceApplyUtils to cuda_utils::BlockReduce (#64442)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64442
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D30735341
Pulled By: ngimel
fbshipit-source-id: 3cb58bed8f1f5aa32fd49fd37b10c8490bcc645a
Eli Uriegas [Wed, 8 Sep 2021 17:51:29 +0000 (10:51 -0700)]
.github: Run docker containers in detach mode (#64459)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64459
Should allow users to exec into the docker container if using with-ssh,
even if the build / test command has finished executing
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D30742797
Pulled By: seemethere
fbshipit-source-id: 969ed8799216c6051439c7d41ab709b2d40938ac
Animesh Jain [Wed, 8 Sep 2021 17:48:09 +0000 (10:48 -0700)]
[NNC] Add Softplus operator (#64589)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64589
Adding softplus operator lowering for NNC. Enabling element wise fusion as well.
Test Plan: Added a test in test_jit_fuser.py
Reviewed By: bertmaher
Differential Revision: D30736449
fbshipit-source-id: 6c5fc3bceb5cef2322ecd4449f827e4af018ea93
Horace He [Wed, 8 Sep 2021 16:59:04 +0000 (09:59 -0700)]
Add `__matmul__` to the magic methods for FX tracing (#64512)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64483
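A minimal sketch of an expression that now traces, using the standard `torch.fx.symbolic_trace` entry point:
```python
import torch
from torch.fx import symbolic_trace

class MatMul(torch.nn.Module):
    def forward(self, a, b):
        return a @ b  # __matmul__ now has a magic-method entry for tracing

traced = symbolic_trace(MatMul())
print(traced.graph)  # contains a call_function node for operator.matmul
```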
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64512
Reviewed By: mrshenli
Differential Revision: D30797265
Pulled By: Chillee
fbshipit-source-id: 7630e048a960e0b27c4309d04d85301abe325189
kshitij12345 [Wed, 8 Sep 2021 16:52:53 +0000 (09:52 -0700)]
update scatter formula (#64546)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63430
Already tested by OpInfo gradient tests:
https://github.com/pytorch/pytorch/blob/544c8e6a5d26efdf1cf679b313893fe119825930/torch/testing/_internal/common_methods_invocations.py#L8575-L8577
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64546
Reviewed By: saketh-are
Differential Revision: D30768759
Pulled By: albanD
fbshipit-source-id: 27d144971c51a956a232fc7d02df5c9d2706d565
Kevin Tse [Wed, 8 Sep 2021 16:42:22 +0000 (09:42 -0700)]
fixing trapezoid() comments for clarity (#64592)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64592
cc mruberry rgommers heitorschueroff
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D30785663
Pulled By: NivekT
fbshipit-source-id: e968687fbb83a59bb46ce6858c6caafa5aa04412
Ivan Yashchuk [Wed, 8 Sep 2021 16:34:46 +0000 (09:34 -0700)]
Add forward mode differentiation for torch.linalg.cholesky and transpose (#62159)
Summary:
This PR adds forward mode differentiation for `torch.linalg.cholesky`, `torch.linalg.cholesky_ex`, and `transpose` functions.
Complex tests for Cholesky fail because for some reason the gradcheck sends matrices full of zeros to `cholesky_jvp` function.
cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry heitorschueroff walterddr IvanYashchuk xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62159
Reviewed By: mrshenli
Differential Revision: D30776829
Pulled By: albanD
fbshipit-source-id: 32e5539ed6423eed8c18cce16271330ab0ea8d5e
Hojin Lee [Wed, 8 Sep 2021 16:33:23 +0000 (09:33 -0700)]
Fix typo embedding_renorm_cuda_ (#64542)
Summary:
Fixes #{issue number}
cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64542
Reviewed By: mrshenli
Differential Revision: D30792842
Pulled By: ngimel
fbshipit-source-id: c9a548256d02b3ce6fb77dd9fb058084f2c91608
Rohan Varma [Wed, 8 Sep 2021 16:17:49 +0000 (09:17 -0700)]
[c10d] Provide failure reason from ProcessGroup when aborting NCCL comm (#64241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64241
When things go wrong, PG NCCL aborts NCCL communicators via `ncclCommAbort`, but one issue is that often the error gets set to `ncclSystemError` (see https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/NCCLUtils.hpp#L176) when that might not be the true cause; the actual issue may be that some prior work timed out, the communicator was aborted on another rank, etc.
This results in a lot of confusion when debugging jobs with a large no. of processes as the current message for ncclSystemError is not very informative: https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/NCCLUtils.hpp#L22
The fix here is to pass in a string exception message from PG NCCL down to `NCCLUtils` which will aim to raise that as the actual issue and not the confusing `ncclSystemError` message.
Test Plan: CI
Reviewed By: pallab-zz, cbalioglu
Differential Revision: D30658855
fbshipit-source-id: 17661dbe0a1bb8cc5b87b637c47634b1f52f54e1
Sameer Deshmukh [Wed, 8 Sep 2021 15:40:01 +0000 (08:40 -0700)]
Change MaxUnpool to accept tensors with 0-dim batch sizes. (#64082)
Summary:
Part of the fix for https://github.com/pytorch/pytorch/issues/38115.
Changes the `MaxUnpool` module to work with a batch size of zero.
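A minimal sketch of the newly supported case (it assumes `MaxPool2d` already accepts an empty batch, which is not part of this PR):
```python
import torch

pool = torch.nn.MaxPool2d(2, return_indices=True)
unpool = torch.nn.MaxUnpool2d(2)

x = torch.randn(0, 3, 8, 8)    # batch size of zero
out, indices = pool(x)
restored = unpool(out, indices)
print(restored.shape)          # torch.Size([0, 3, 8, 8])
```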
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64082
Reviewed By: mrshenli
Differential Revision: D30793907
Pulled By: jbschlosser
fbshipit-source-id: d21aa665be5aa18f592b39ef7b4e3cbc632e21ed
johnlu [Wed, 8 Sep 2021 15:24:46 +0000 (08:24 -0700)]
Add Half conversion of bit cast for SYCL kernel (#64340)
Summary:
## Motivation
Enhance the performance of Half/float conversion in SYCL kernels.
## Solution
Add the native SYCL half type to help convert the half from/to float in the kernel code.
## Additional Context
`__SYCL_DEVICE_ONLY__` is a MACRO only valid when compiling the kernel code for SYCL backend.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64340
Reviewed By: gchanan
Differential Revision: D30720823
Pulled By: ezyang
fbshipit-source-id: e7e770d02df5b2d45da61d2fed3ba59383b3dc3a
Bert Maher [Wed, 8 Sep 2021 15:07:19 +0000 (08:07 -0700)]
[nnc] Provide helpful error messages about turning off the fuser (#64516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64516
If fuser compilation fails due to a bug (which should be highly
unlikely at this point), we want to direct the user on how to unblock themselves by
disabling fusion, in addition to requesting that they report a bug.
ghstack-source-id: 137398537
Test Plan: existing tests
Reviewed By: ZolotukhinM
Differential Revision: D30758051
fbshipit-source-id: 98be89f1b1d4fb3bc816f5b2634c618b9297930e
leslie-fang-intel [Wed, 8 Sep 2021 14:45:12 +0000 (07:45 -0700)]
Allow disabling cache in autocast (automatic mixed precision) (#63552)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63552
In this PR, we want to exclude these 2 cases from the `Autocast` weight cache usage:
- Using `torch.jit.trace` under `Autocast`
As reported in https://github.com/pytorch/pytorch/issues/50231 and several other discussions, when using `torch.jit.trace` under `Autocast`, the trace process hits Autocast's weight cache and fails. So we should disable the weight cache during the trace process.
- Using `Autocast` with `Grad mode`
- Usually we use `Grad mode` for training. Since the weights change at every step during training, we don't need to cache them.
- For the recommended `Autocast` training case in the [doc](https://pytorch.org/docs/stable/amp.html), `Autocast` clears the cache every step when leaving the context. We should disable it to save the clearing operations.
```
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)
for input, target in data:
    optimizer.zero_grad()
    with autocast():
        output = model(input)
        loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()
```
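Below is a minimal sketch of the trace-under-autocast case with the weight cache turned off. It assumes this PR exposes the switch as a `cache_enabled` flag on `torch.cuda.amp.autocast`; the flag name and the toy model are illustrative assumptions, not taken from this entry.
```python
import torch

# Assumption: cache_enabled=False disables the Autocast weight cache so that
# torch.jit.trace does not pick up stale cached casts of the weights.
model = torch.nn.Linear(8, 8).cuda()
example = torch.randn(4, 8, device="cuda")

with torch.cuda.amp.autocast(cache_enabled=False):
    traced = torch.jit.trace(model, example)
```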
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D30644913
Pulled By: ezyang
fbshipit-source-id: ad7bc87372e554e7aa1aa0795e9676871b3974e7
Protonu Basu [Wed, 8 Sep 2021 14:11:38 +0000 (07:11 -0700)]
Adding support for lowering 4Bit EmbeddingBag Operator (#5806)
Summary:
Pull Request resolved: https://github.com/pytorch/glow/pull/5806
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64001
Add the 4-bit EmbeddingBag operator in acc_ops.
Test Plan: Let CI run.
Reviewed By: jfix71
Differential Revision: D30532824
fbshipit-source-id: bf476c9710477792aae202dacf64e23539c33bd9
Freey0 [Wed, 8 Sep 2021 13:40:54 +0000 (06:40 -0700)]
restore test_inplace_comparison_ops_require_inputs_have_same_dtype Expected behavior (#64267)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64267
This test expects every operation to throw a runtime error.
Also reinsert the in-place operation test and fix a bug in the comparison operations.
Fixes: #64018
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D30720915
Pulled By: ezyang
fbshipit-source-id: 215a6556d20770f70f4ced1c1f9a9753933f1d37
Zafar Takhirov [Wed, 8 Sep 2021 11:57:28 +0000 (04:57 -0700)]
[quant] AO migration of the `quantize.py` (resubmission) (#64445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quantize.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: HDCharles
Differential Revision: D30734870
fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b
Mikhail Zolotukhin [Wed, 8 Sep 2021 07:22:05 +0000 (00:22 -0700)]
[TensorExpr] Don't rely on exceptions in Vectorizer. (#64609)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64609
We've been using exceptions to indicate whether vectorization succeeded
or not, but that posed some problems (e.g. we spent too much time
symbolicating these exceptions). This change converts this mechanism to
a standard error return code.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30795342
Pulled By: ZolotukhinM
fbshipit-source-id: 16e38b37bcdd78ceb438ac814cc377f35b058e17
Jordan Fix [Wed, 8 Sep 2021 05:43:04 +0000 (22:43 -0700)]
[fx_const_fold] Fix constant folding for attrs in submodule hierarchies (#64342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64342
Previously we weren't handling the case where an attribute was in a module that wasn't the root.
Test Plan: Added unit test coverage.
Reviewed By: yinghai
Differential Revision: D30691730
fbshipit-source-id: b39b5cf748c4c882f315a4f32b51ad88cc7a43ed
Hendrik Schröter [Wed, 8 Sep 2021 03:14:08 +0000 (20:14 -0700)]
Add __ge__ to TorchVersion (#64565)
Summary:
This PR adds a greater-or-equal comparison so that the base class's (str) comparison method is not used.
This is necessary for a correct comparison with a version string.
Previously, the following was the case:
```py
>>> torch.__version__
'1.10.0.dev20210830+cpu'
>>> torch.__version__>"1.9"
True
>>> torch.__version__>="1.9"
False # Wrong output since the base class (str) was used for __ge__ comparison
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64565
Reviewed By: raghuramank100
Differential Revision: D30790463
Pulled By: mrshenli
fbshipit-source-id: 79c680f8b448001b34d3e5d5332124a78bea4e34
Maksim Levental [Wed, 8 Sep 2021 02:57:12 +0000 (19:57 -0700)]
add out variant of linear (#61801)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61801
resubmitting because the last one was unrecoverable due to making changes incorrectly in the stack
Test Plan: Imported from OSS
Reviewed By: desertfire
Differential Revision: D29812510
Pulled By: makslevental
fbshipit-source-id: ba9685dc81b6699724104d5ff3211db5852370a6
Steven Jin [Wed, 8 Sep 2021 02:00:18 +0000 (19:00 -0700)]
Fix building docs instructions (#64508)
Summary:
Fixes #64507
Removed a duplicate instruction and linted the file a bit (consistent spacing around codeblocks/headers, adding code types in codeblocks, removing `$` from bash code blocks when unnecessary).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64508
Reviewed By: raghuramank100
Differential Revision: D30791164
Pulled By: mrshenli
fbshipit-source-id: a00db32dcfdd1ecc194c836f31174c806062eb6d
Nikita Shulga [Wed, 8 Sep 2021 01:46:58 +0000 (18:46 -0700)]
Fix quicklint (#64612)
Summary:
Fixes land-race introduced by https://github.com/pytorch/pytorch/commit/a22c936b6398f5cfd959b3e09622db4d90d61050
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64612
Reviewed By: ngimel
Differential Revision: D30798648
Pulled By: malfet
fbshipit-source-id: ca546f68141d44493deba7bbf840e5f9662e8558
Natalia Gimelshein [Wed, 8 Sep 2021 01:43:49 +0000 (18:43 -0700)]
Revert D29998114: [pytorch][PR] enable bf16 mkldnn path for gemm
Test Plan: revert-hammer
Differential Revision: D29998114 (https://github.com/pytorch/pytorch/commit/acc9f9afc8f2be70d7f5d3248ca1760e0336b3b8)
Original commit changeset: 459dc5874c63
fbshipit-source-id: 1994623a3afc22a94bd0cf5de766b023185f5238
Don Jang [Wed, 8 Sep 2021 01:21:41 +0000 (18:21 -0700)]
[JIT] Fix a bug of rejecting ops with AliasAnalysisKind::CONSERVATIVE (#64336)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64336
Currently AliasDB rejects any user-defined ops with `AliasAnalysisKind::CONSERVATIVE` if they do not have special treatment for alias analysis. For example, the following alias schema gets rejected:
```
m.def(torch::schema(
"namescope::my_op(...) -> ...",
c10::AliasAnalysisKind::CONSERVATIVE));
```
This rejection condition is contradictory: AliasDB can handle ops with `CONSERVATIVE` in a general way without any special casing at https://fburl.com/diffusion/op5u72sk calling https://fburl.com/diffusion/h3aws5dd which seems very appropriate to be conservative for alias analysis.
This change corrects the rejection condition so that it rejects ops that *have* special casing but also use `CONSERVATIVE`, since the two cannot be used simultaneously.
Test Plan:
Confirmed that
```
m.def(torch::schema(
"namescope::my_op(...) -> ...",
c10::AliasAnalysisKind::CONSERVATIVE));
```
gets accepted and `my_op`'s all inputs and outputs are put to point to wildcard(*) by AliasDB.
Reviewed By: eellison
Differential Revision: D30690121
fbshipit-source-id: 431cc1a84edd5227f52b44a0fd85d5eb16f3c288
Elias Ellison [Wed, 8 Sep 2021 01:19:14 +0000 (18:19 -0700)]
Add symbolic shape comparison optimization (#64300)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64300
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision: D30738146
Pulled By: eellison
fbshipit-source-id: 96287798535b367f23d3e9430d70fc02c59744ab
Elias Ellison [Wed, 8 Sep 2021 01:19:14 +0000 (18:19 -0700)]
Refactor to use shape arguments (#64299)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64299
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision: D30738141
Pulled By: eellison
fbshipit-source-id: 37ca30de81349ecf23d8656291863737b6ad6d96
Elias Ellison [Wed, 8 Sep 2021 01:19:14 +0000 (18:19 -0700)]
Add view with negative dim (#63516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63516
How to review: pretty much just check that the inputs generated are a good representation of the op semantics; that should be sufficient for correctness. As a bonus, you can also double-check the op size semantics by going to https://codebrowser.bddppq.com/pytorch/pytorch/, typing in native::{op_name}, and looking at the op implementation.
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision: D30738143
Pulled By: eellison
fbshipit-source-id: c7cd01cb2c8a13cb2664415f3d98aedec19a8e07