Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Add debug_handles to KinetoEvent (#62228)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62228
This diff adds debug handles to events and provides a way to use
RECORD_FUNCTIONs that pass debug_handles down to the profiler, which
records them in the events.
Why add debug_handles?
For pytorch mobile, with the lite interpreter, we generate debug handles
that can be used to lazily symbolicate exception traces into a model-level
stack trace, similar to the model-level stack trace you get in
TorchScript models. The debug_handles also enable getting the module
hierarchy for a lite interpreter model, support for which was added to
the KinetoProfiler in previous diffs.
Followup plan:
1. Enable scope callbacks such that the lite interpreter can use them to
profile only top-level ops.
2. Enable post-processing callbacks that take KinetoEvents and populate
the module hierarchy using debug handles.
This will let us use KinetoProfiler for lite interpreter use cases on
mobile. The aim is to use an RAII guard to similarly generate chrome traces
for mobile use cases as well, although only for top-level ops.
Test Plan:
test_misc : RecordDebugHandles.Basic
Imported from OSS
Reviewed By: ilia-cher
Differential Revision:
D29935899
fbshipit-source-id:
4f06dc411b6b5fe0ffaebdd26d3274c96f8f389b
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Move start timestamp to end of start callback (#62191)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62191
This moves the start timestamping to the end of the start callback. This way we don't
account for callstack/module-hierarchy-related overhead in the op runtime.
Test Plan:
CI
Imported from OSS
Reviewed By: ilia-cher
Differential Revision:
D29910519
fbshipit-source-id:
f462031a81ae12b3db7993cf482e5ad93a35e096
Kimish Patel [Sat, 14 Aug 2021 04:37:57 +0000 (21:37 -0700)]
[Pytorch Profiler] Add support for adding module hierarchy to KinetoEvent (#61792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61792
This PR adds module hierarchy information to events.
What is the module hierarchy information attached to events?
When profiling a TorchScript module, as events are added we ask the JIT
for the module hierarchy associated with the node being executed. At the
time that node executes there may be multiple frames on the interpreter's
stack; for each frame we find the corresponding node and query its module
hierarchy. The module hierarchy for a node is associated with the node's
InlinedCallStack, which tracks the path via which the node was inlined.
Thus, during the inlining process we annotate module information
corresponding to the CallMethod nodes being inlined.
With this PR, the chrome trace will contain additional metadata,
"Module Hierarchy", which can look like this:
TOP(ResNet)::forward.SELF(ResNet)::_forward_impl.layer1(Sequential)::forward.0(BasicBlock)::forward.conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward
It contains the module instance, the type name, and the method name at each
level of the callstack.
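For readers who want to inspect this metadata, here is a minimal sketch of reading it back out of an exported chrome trace. It assumes the trace was already exported (e.g. via export_chrome_trace) and that the metadata lands under each event's `args`, with the key name taken from this PR:
```python
import json

# Hedged sketch: only walks the JSON produced by a prior profiler run.
with open("trace.json") as f:
    trace = json.load(f)

events = trace["traceEvents"] if isinstance(trace, dict) else trace
for event in events:
    hierarchy = event.get("args", {}).get("Module Hierarchy")
    if hierarchy:
        print(event.get("name"), "->", hierarchy)
```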
Test Plan:
test_profiler
Imported from OSS
Reviewed By: raziel, ilia-cher
Differential Revision:
D29745442
fbshipit-source-id:
dc8dfaf7c5b8ab256ff0b2ef1e5ec265ca366528
leslie-fang-intel [Sat, 14 Aug 2021 03:49:27 +0000 (20:49 -0700)]
add subtraction of max and test case (#63132)
Summary:
As discussed in https://github.com/pytorch/pytorch/pull/62897, in the BF16/non-last-dim Softmax path we are missing the subtraction of the max value, which causes overflow in the `exp()` calculation when the values of the input tensor are large, such as `1000.0`.
To avoid this issue, we add the subtraction of the max value and the corresponding test cases in this PR.
Note that without the subtraction of the max value (accidental reverts or changes), we get the following error message from the test case:
```
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0.05 and atol=0.05, found 103984 element(s) (out of 126720) whose difference(s) exceeded the margin of error (including 103984 nan comparisons). The greatest difference was nan (0.0 vs. nan), which occurred at index (0, 0, 0, 1).
```
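For reference, the max-subtraction trick is the standard numerically stable softmax formulation; a minimal illustration (not the kernel code itself):
```python
import torch

x = torch.full((4,), 1000.0)  # large inputs

naive = torch.exp(x) / torch.exp(x).sum()               # exp(1000) overflows to inf, so inf/inf -> nan
shifted = x - x.max()                                   # subtract the max first
stable = torch.exp(shifted) / torch.exp(shifted).sum()  # mathematically identical, no overflow

print(naive)   # tensor([nan, nan, nan, nan])
print(stable)  # tensor([0.2500, 0.2500, 0.2500, 0.2500])
```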
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63132
Reviewed By: VitalyFedyunin
Differential Revision:
D30280792
Pulled By: cpuhrsch
fbshipit-source-id:
722821debf983bbb4fec878975fa8a4da0d1d866
Kushashwa Ravi Shrimali [Sat, 14 Aug 2021 00:10:07 +0000 (17:10 -0700)]
OpInfo: `nn.functional.conv_transpose2d` (#62882)
Summary:
See https://github.com/facebookresearch/functorch/issues/78 and https://github.com/pytorch/pytorch/issues/54261.
cc: mruberry zou3519 Chillee
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62882
Reviewed By: bdhirsh
Differential Revision:
D30280804
Pulled By: zou3519
fbshipit-source-id:
e40cdf43e98c1f11e45df6b8bc13110b4d29c45f
Kefei Lu [Fri, 13 Aug 2021 23:57:47 +0000 (16:57 -0700)]
refactor fx2trt example script so it can be imported as a library (#63262)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63262
Just create a `__main__` guard.
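The pattern is the usual one; a minimal sketch (the function name is illustrative, not the actual script contents):
```python
def run_example():
    # Script logic that used to run at module level goes here,
    # so importing this file as a library has no side effects.
    ...

if __name__ == "__main__":
    run_example()
```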
Test Plan: run linter, sandcastle tests
Reviewed By: 842974287
Differential Revision:
D30263617
fbshipit-source-id:
8044ce5d815b043c3778591384cb13d9a89d0048
Hanton Yang [Fri, 13 Aug 2021 23:20:22 +0000 (16:20 -0700)]
[iOS] Add `LibTorch-Lite-Nightly` pod (#63239)
Summary:
D30090760 (https://github.com/pytorch/pytorch/commit/e182b459d94fe77c1d9f623c94fc2621c8cc55de) was reverted by D30303292 because of a lint issue in `LibTorch-Lite-Nightly.podspec.template`. Resubmit the diff after fixing the issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63239
Test Plan: Imported from OSS
Reviewed By: xta0
Differential Revision:
D30315690
Pulled By: hanton
fbshipit-source-id:
f0fa719ffc3b8181ab28c123584ae5c1da8992c0
Sameer Deshmukh [Fri, 13 Aug 2021 23:08:01 +0000 (16:08 -0700)]
Allow TransformerEncoder and TransformerDecoder to accept 0-dim batch sized tensors. (#62800)
Summary:
This issue fixes a part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.
This PR allows TransformerEncoder and Decoder (along with the inner `Layer` classes) to accept inputs with 0-dimensional batch sizes.
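A minimal sketch of the newly accepted input shape (dimensions are illustrative):
```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=8, nhead=2)
encoder = nn.TransformerEncoder(layer, num_layers=1)

src = torch.randn(10, 0, 8)   # (seq_len, batch=0, d_model): zero-sized batch dimension
out = encoder(src)
print(out.shape)              # torch.Size([10, 0, 8])
```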
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62800
Reviewed By: VitalyFedyunin
Differential Revision:
D30303240
Pulled By: jbschlosser
fbshipit-source-id:
8f8082a6f2a9f9d7ce0b22a942d286d5db62bd12
Pruthvi Madugundu [Fri, 13 Aug 2021 21:57:17 +0000 (14:57 -0700)]
[ROCm] Update HIP_VERSION to TORCH_HIP_VERSION (#62786)
Summary:
- HIP_VERSION semantic versioning will change in ROCm 4.3. The changes essentially remove the dependency on HIP_VERSION provided in the hip header, keeping the code compatible with both older and newer versions of ROCm.
- TORCH_HIP_VERSION is derived from HIP_VERSION_MAJOR and HIP_VERSION_MINOR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62786
Reviewed By: bdhirsh
Differential Revision:
D30281682
Pulled By: seemethere
fbshipit-source-id:
e41e69fb9e13de5ddd1af99ba5bbdcbb7b64b673
Can Balioglu [Fri, 13 Aug 2021 20:47:37 +0000 (13:47 -0700)]
Respect user-set CMAKE_PREFIX_PATH (#61904)
Summary:
Fixes the case where the `CMAKE_PREFIX_PATH` variable gets silently overwritten by a user specified environment variable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61904
Reviewed By: walterddr, malfet
Differential Revision:
D29792014
Pulled By: cbalioglu
fbshipit-source-id:
babacc8d5a1490bff1e14247850cc00c6ba9e6be
gmagogsfm [Fri, 13 Aug 2021 20:06:08 +0000 (13:06 -0700)]
Remove left-over print in test_diff_graph_inline_threshold (#63231)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63231
Reviewed By: VitalyFedyunin
Differential Revision:
D30305851
Pulled By: gmagogsfm
fbshipit-source-id:
43da3b5f49ad4a6a2d6d174acf792f3ccf41a463
Tanvir Zaman [Fri, 13 Aug 2021 19:25:16 +0000 (12:25 -0700)]
Add CostInferenceFunction for SplitOp (#63133)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63133
SplitOp is costly but is missing a cost inference function, which hurts cost-based balancing. Changes are:
(1) Addition of CostInferenceFunction for SplitOp
(2) Small fix in CostInferenceFunction for ConcatOp
Test Plan:
Added unit tests:
buck test //caffe2/caffe2/python/operator_test:split_op_cost_test
buck test //caffe2/caffe2/python/operator_test:concat_op_cost_test
Reviewed By: smacke
Differential Revision:
D30247360
fbshipit-source-id:
989e962f3a981acc85b73aac3fb23e603b7d1591
Meghan Lele [Fri, 13 Aug 2021 19:08:28 +0000 (12:08 -0700)]
[docs] Merge note block in `torch.lu` documentation (#63156)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63156
**Summary**
This commit merges the four successive `Note` blocks that appear in the
documentation for `torch.lu`. Each one only has one line in it, so all
of them have been merged into one block with a bulleted list that
contains the original items.
**Test Plan**
Continuous integration.
*Before*
<img width="888" alt="Captura de Pantalla 2021-08-12 a la(s) 10 48 39 a m" src="https://user-images.githubusercontent.com/4392003/
129244443-
b7d1594e-8833-4c20-a911-
e1bf7ca88a8d.png">
*After*
<img width="932" alt="Captura de Pantalla 2021-08-12 a la(s) 10 48 46 a m" src="https://user-images.githubusercontent.com/4392003/
129244462-
1f39dcdb-90e0-4fd9-a95f-
343b0b6be1f1.png">
**Fixes**
This commit fixes #62339.
Test Plan: Imported from OSS
Reviewed By: navahgar, pbelevich
Differential Revision:
D30292633
Pulled By: SplitInfinity
fbshipit-source-id:
cb9071165629bfe7316b1d2fe952e4354c75d48f
Meghan Lele [Fri, 13 Aug 2021 18:46:54 +0000 (11:46 -0700)]
[docs] Remove `input` parameter from `Tensor.flatten` docs (#63180)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63180
**Summary**
This commit removes the `input` parameter from the signature for
`Tensor.flatten` shown in its documentation. This parameter is accepted
by `torch.flatten` but not `Tensor.flatten` (since the input is the
`Tensor` on which `flatten` is invoked).
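For illustration, the two call forms differ only in how the input tensor is supplied:
```python
import torch

x = torch.arange(6).reshape(2, 3)
torch.flatten(x, start_dim=0)   # free function: the tensor is passed explicitly as `input`
x.flatten(start_dim=0)          # method: the tensor it is called on is the input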
**Test Plan**
Continuous integration.
**Fixes**
This commit fixes #57478.
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision:
D30293156
Pulled By: SplitInfinity
fbshipit-source-id:
4ad70d638af009fb6bdeb703433b306904d39a76
Meghan Lele [Fri, 13 Aug 2021 18:46:14 +0000 (11:46 -0700)]
[docs] Add cross references to `torch.transpose` and `torch.t` (#63177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63177
**Summary**
This commit adds a link in the documentation for `torch.transpose` that
directs to `torch.t` and vice versa. These two functions are related and
it is useful for users of one to know about the other.
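For illustration, `torch.t` is just the 2-D special case of `torch.transpose`:
```python
import torch

m = torch.randn(2, 3)
print(torch.equal(torch.t(m), torch.transpose(m, 0, 1)))  # True
```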
**Test Plan**
Continuous integration.
**Fixes**
This commit fixes #56267.
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision:
D30292654
Pulled By: SplitInfinity
fbshipit-source-id:
8e60cd7a598ff8b4756cb30141399dfe8e118338
Meghan Lele [Fri, 13 Aug 2021 18:43:05 +0000 (11:43 -0700)]
[docs] Mention `vsplit`, `hsplit` and `tensor_split` in Tensor views doc (#63191)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63191
**Summary**
This commit adds `vsplit`, `hsplit` and `tensor_split` to the list of
view ops on the Tensor Views documentation page.
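A small sketch of the three ops, and of the fact that they return views:
```python
import torch

x = torch.arange(16.).reshape(4, 4)
top, bottom = torch.vsplit(x, 2)    # split along dim 0
left, right = torch.hsplit(x, 2)    # split along dim 1
a, b, c = torch.tensor_split(x, 3)  # uneven splits are allowed

top[0, 0] = 100.0                   # the results are views, so x is modified too
print(x[0, 0])                      # tensor(100.)
```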
**Test Plan**
Continuous integration.
*Before*
<img width="195" alt="Captura de Pantalla 2021-08-12 a la(s) 2 55 07 p m" src="https://user-images.githubusercontent.com/4392003/
129275921-
c1cfdf6c-9f1f-45f3-98b6-
1de7a0f0cc84.png">
*After*
<img width="197" alt="Captura de Pantalla 2021-08-12 a la(s) 2 55 15 p m" src="https://user-images.githubusercontent.com/4392003/
129275936-
de4afde7-0143-4e1d-b38f-
c86256f4896c.png">
**Fixes**
This commit fixes #62727.
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision:
D30293181
Pulled By: SplitInfinity
fbshipit-source-id:
283783a4ccc3ebc50cb0a427e55c7a6cb618ffd7
Sameer Deshmukh [Fri, 13 Aug 2021 18:27:47 +0000 (11:27 -0700)]
Allow Average Pooling modules to accept tensors with 0-dim batch sizes. (#62025)
Summary:
This issue fixes a part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.
It introduces changes and tests that allow the Average Pooling layers to accept tensors with 0-sized batch dimensions and return meaningful results.
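For example, after this change a zero-sized batch passes straight through the pooling layer (dimensions are illustrative):
```python
import torch
import torch.nn as nn

pool = nn.AvgPool2d(kernel_size=2)
x = torch.randn(0, 3, 8, 8)   # batch size 0
print(pool(x).shape)          # torch.Size([0, 3, 4, 4])
```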
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62025
Reviewed By: VitalyFedyunin
Differential Revision:
D30303256
Pulled By: jbschlosser
fbshipit-source-id:
5f727e62a7c58d2b8bb49fcc3bd7688474917ba5
zhouzhuojie [Fri, 13 Aug 2021 17:37:07 +0000 (10:37 -0700)]
[skip ci] fix workflow code generation (#63235)
Summary:
Fixes a clean git check with code generation introduced by https://github.com/pytorch/pytorch/pull/63148
`generated-win-vs2019-cuda10-py3.yml` was renamed as `generated-win-vs2019-cuda10.1-py3.yml`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63235
Reviewed By: VitalyFedyunin
Differential Revision:
D30306474
Pulled By: zhouzhuojie
fbshipit-source-id:
cbae1ace064e360e8ca0c0e997116bdb20d54d46
Mike Iovine [Fri, 13 Aug 2021 17:18:03 +0000 (10:18 -0700)]
[Static Runtime] Add pass to eliminate __getitem__/DictConstruct calls (#62429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62429
Introduce a new pass to eliminate calls to `prim::DictConstruct/aten::__getitem__`. Given a graph like this:
```
%2 : Dict = prim::DictConstruct(%key, %value)
%3 : Tensor = aten::__getitem__(%2, %key)
%4 : Tensor = op(%3)
```
This pass produces a graph like this (after dead code elimination):
```
%4 : Tensor = op(%value)
```
This optimization is applied in the static runtime.
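In user code, this pattern typically comes from a scripted forward along these lines (a hypothetical module, for illustration only):
```python
import torch

class M(torch.nn.Module):
    def forward(self, key: str, value: torch.Tensor):
        d = {key: value}       # prim::DictConstruct
        t = d[key]             # aten::__getitem__
        return torch.relu(t)   # after the pass (plus DCE), this consumes `value` directly

scripted = torch.jit.script(M())
print(scripted.graph)          # shows the DictConstruct/__getitem__ pair before optimization
```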
Test Plan:
`buck test //caffe2/test:jit -- TestPeephole`
**local.forward performance summary**
About 3% runtime benefit. All `DictConstruct` calls optimized out, `__getitem__` calls reduced significantly (~50% of them are cut out)
P438354810
**local_request_only.forward performance summary**
About 14% runtime benefit. Again, all `DictConstruct` calls optimized out, 50% `__getitem__` calls removed.
P438359742
There is some variance with runtime measurements, so take these numbers with a grain of salt. Also note that the benefit does not exist in the shrunk model since there are no `DictConstruct` calls
Reviewed By: hlu1
Differential Revision:
D29995087
fbshipit-source-id:
f376376a46ff808115afd2d60446e5db8f6f752f
Kushashwa Ravi Shrimali [Fri, 13 Aug 2021 17:12:01 +0000 (10:12 -0700)]
Fixing user inputs for low, high in `make_tensor` (#61108)
Summary:
**TODOs:**
* [x] Do not clamp inputs for low and high when given and valid.
* [x] Devise rules for modifying `low` and `high` when extremals/invalid values passed.
* [x] Testing with `test_references_numerics_hard` with the revised changes. _(I've tested locally, the changes will take place in a separate PR though after offline discussion with mruberry)_
* [x] Revise comments/documentation for `make_tensor`
See https://github.com/pytorch/pytorch/issues/61758 for tracker issue.
cc: mruberry pmeier
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61108
Reviewed By: VitalyFedyunin
Differential Revision:
D30296167
Pulled By: mruberry
fbshipit-source-id:
67e8d15b173209a9c97ca013231494a5fa99f8c7
Natalia Gimelshein [Fri, 13 Aug 2021 16:49:15 +0000 (09:49 -0700)]
[hackathon] fix benchmarking script in CONTRIBUTING (#63199)
Summary:
[skip ci]
Per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63199
Reviewed By: mruberry
Differential Revision:
D30305487
Pulled By: ngimel
fbshipit-source-id:
2704c4f08ab976a55c9f8c2fe54cd4f3f39412cf
Andres Suarez [Fri, 13 Aug 2021 16:26:38 +0000 (09:26 -0700)]
[codemod][lint][caffe2] Extend BLACK coverage
Test Plan: Sandcastle
Reviewed By: zsol
Differential Revision:
D30302716
fbshipit-source-id:
f9724d4f4d1b8950f581cc2c6c77eedf19b4b6fc
Thomas J. Fan [Fri, 13 Aug 2021 15:43:04 +0000 (08:43 -0700)]
ENH Adds no_batch_dim to FractionalMaxPool2d (#62490)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62490
Reviewed By: bdhirsh
Differential Revision:
D30287143
Pulled By: jbschlosser
fbshipit-source-id:
1b9dd932157f571adf3aa2c98c3c6b56ece8fa6e
Don Jang [Fri, 13 Aug 2021 15:37:23 +0000 (08:37 -0700)]
[JIT] Add a flag to rethrow caught exception in jit interpreter (#63073)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63073
It turned out to be less than ideal to print a verbose stacktrace in exception messages for high-QPS services (see the related task) with a non-negligible failure rate: long stacktraces get truncated, which results in losing the original exception message thrown from native code. In such a use case it is actually desirable to retain only the message of the original exception directly thrown from native code.
This change adds a new flag `torch_jit_disable_exception_stacktrace` to the pytorch jit interpreter to suppress stacktrace in the messages of exception thrown from the interpreter.
Reviewed By: Krovatkin
Differential Revision:
D30241792
fbshipit-source-id:
c340225c69286663cbd857bd31ba6f1736b1ac4c
Yukio Siraichi [Fri, 13 Aug 2021 15:20:19 +0000 (08:20 -0700)]
Port `norm` kernel to structured kernels. (#62711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62711
Tracking issue: #55070
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision:
D30109866
Pulled By: ezyang
fbshipit-source-id:
894c9496894d059c7690a174b75bbd4db7ed6016
Yukio Siraichi [Fri, 13 Aug 2021 15:20:19 +0000 (08:20 -0700)]
Port `prod` kernel to structured kernels. (#62024)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62024
Tracking issue: #55070
In this PR, I also broke down the meta functions of other reduction kernels (e.g. `all`,
`argmax`, `sum`) into the composition of common patterns.
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D29847122
Pulled By: ezyang
fbshipit-source-id:
a6680a6cf6e59bb46b8ffe7bf2a3a611d6e0fd14
Yukio Siraichi [Fri, 13 Aug 2021 15:20:19 +0000 (08:20 -0700)]
Port `mean` kernel to structured kernels. (#61643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61643
Tracking issue: #55070
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D29783866
Pulled By: ezyang
fbshipit-source-id:
dc95baf593096c03fb5f292ee6c36de3cc7f2b35
Andrew Gu [Fri, 13 Aug 2021 15:19:23 +0000 (08:19 -0700)]
Remove req to call step() in training loop (#63164)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63164
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision:
D30284616
Pulled By: andwgu
fbshipit-source-id:
afdb677fb08851b139178a9f6d782196f26773e1
Andrew Gu [Fri, 13 Aug 2021 15:19:23 +0000 (08:19 -0700)]
Pass `_allow_empty_param_list` into func opt ctor (#63163)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63163
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision:
D30284615
Pulled By: andwgu
fbshipit-source-id:
4857f5b618ec5b007648737ab532ce605e5d70dc
Andrew Gu [Fri, 13 Aug 2021 15:19:23 +0000 (08:19 -0700)]
Simplify data structures, add uniform approximation, fix mem leak (#63162)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63162
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision:
D30284617
Pulled By: andwgu
fbshipit-source-id:
9bd9e5f89abcc0d3dac56b85d55cc88e843baa9f
Supriya Rao [Fri, 13 Aug 2021 14:58:38 +0000 (07:58 -0700)]
[docs][ao] update quantize_per_tensor to mention overloads (#63165)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63165
Add details about the overloads (sketched below) for:
* list of tensors input
* supporting tensor scale/zero-point inputs
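A hedged sketch of the call forms being documented (the exact overload signatures are in the updated docs; the tensor-qparams form below follows this PR's description):
```python
import torch

x = torch.randn(4)

# Scalar scale / zero_point (pre-existing overload).
q0 = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)

# Tensor scale / zero_point inputs, as described above.
q1 = torch.quantize_per_tensor(x, torch.tensor(0.1), torch.tensor(0), torch.qint8)

# There is also a list-of-tensors overload; see the updated documentation for its signature.
```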
Test Plan:
CI
Imported from OSS
Reviewed By: bdhirsh
Differential Revision:
D30291045
fbshipit-source-id:
9fc6418792c5e3a35417eeb8d31de4a4bfcbb7a5
Victor Quach [Fri, 13 Aug 2021 14:47:12 +0000 (07:47 -0700)]
Make saved tensors default hooks thread local (#62909)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62909
This PR makes saved tensors default hooks thread local.
This allows using default hooks in a multithreaded context.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision:
D30165416
Pulled By: Varal7
fbshipit-source-id:
10a7d580661d3d94bdaf398c4e076b7bea11c16b
Sameer Deshmukh [Fri, 13 Aug 2021 14:31:42 +0000 (07:31 -0700)]
Allow 0-dim batch sizes for AdaptiveMaxPool and MaxPool. (#62088)
Summary:
This issue fixes a part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.
This PR allows `MaxPool` and `AdaptiveMaxPool` to accept tensors whose batch size is 0. Some changes have been made to modernize the tests so that they show the name of the C++ function that throws an error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62088
Reviewed By: bdhirsh
Differential Revision:
D30281285
Pulled By: jbschlosser
fbshipit-source-id:
52bffc67bfe45a78e11e4706b62cce1469eba1b9
AspenStars [Fri, 13 Aug 2021 13:40:41 +0000 (06:40 -0700)]
DOC Improve documentation for LayerNorm (#63144)
Summary:
In this [commit](https://github.com/pytorch/pytorch/pull/59178/commits/7026995f3ca253fbc19bf511d53f48f861799a4a) and [issue](https://github.com/pytorch/pytorch/pull/59178#issuecomment-897485295), [line 134](https://github.com/deniskokarev/pytorch/blob/47e286d024c183cb26a464447b34fde88b80d17d/torch/nn/modules/normalization.py#L134) overwrites the "embedding" variable, which causes an error when instantiating `nn.LayerNorm`.
I suggest renaming the "embedding" in [line 133](https://github.com/deniskokarev/pytorch/blob/47e286d024c183cb26a464447b34fde88b80d17d/torch/nn/modules/normalization.py#L133) to "embedding_dim".
The final example is:
```
batch, sentence_length, embedding_dim = 20, 5, 10
embedding = torch.randn(batch, sentence_length, embedding_dim)
layer_norm = nn.LayerNorm(embedding_dim)
```
Fixes #59178
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63144
Reviewed By: bdhirsh
Differential Revision:
D30288778
Pulled By: jbschlosser
fbshipit-source-id:
e74b11430e302dae5661bf6e830ee5ac6c1838c4
Alban Desmaison [Fri, 13 Aug 2021 13:36:27 +0000 (06:36 -0700)]
Revert D30090760: [iOS] Add podspec for libTorch-lite nightly build
Test Plan: revert-hammer
Differential Revision:
D30090760 (https://github.com/pytorch/pytorch/commit/e182b459d94fe77c1d9f623c94fc2621c8cc55de)
Original commit changeset:
361aa2ed24a1
fbshipit-source-id:
9c0dfee80a80eb012b142d3928204d6eb8025b0a
Kushashwa Ravi Shrimali [Fri, 13 Aug 2021 13:33:40 +0000 (06:33 -0700)]
OpInfo for `torch.nn.functional.normalize` (#62635)
Summary:
See https://github.com/facebookresearch/functorch/issues/78 and https://github.com/pytorch/pytorch/issues/54261
cc: mruberry zou3519 Chillee
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62635
Reviewed By: H-Huang
Differential Revision:
D30136503
Pulled By: zou3519
fbshipit-source-id:
258c069f30d9c2a51ed27dadf94f3703b9432a4a
Nikita Vedeneev [Fri, 13 Aug 2021 04:15:42 +0000 (21:15 -0700)]
Implements backward for `torch.lu_solve` (#61681)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/22620
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61681
Reviewed By: ngimel
Differential Revision:
D30063116
Pulled By: mruberry
fbshipit-source-id:
e095b0cadfb7c8b37a7ef91bae5b5dc170d8ef1c
Charles David Hernandez [Fri, 13 Aug 2021 03:57:54 +0000 (20:57 -0700)]
Moving getattr_from_fqn to torch.quantization.utils (#63107)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63107
moving this function because the functionality would be useful outside of ns
ghstack-source-id:
135727260
Test Plan: buck test //caffe2/test:quantization_fx mode/dev-nosan --keep-going --config client.id=nuclide --show-full-output -- suite
Reviewed By: supriyar
Differential Revision:
D30260735
fbshipit-source-id:
58deabdd0f3b03b0ee7ee92be0548a0945084d65
Thomas J. Fan [Fri, 13 Aug 2021 01:05:29 +0000 (18:05 -0700)]
ENH Migrate nll_loss2d from THC to ATen (#62826)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/24608
Fixes https://github.com/pytorch/pytorch/issues/24607
With the following benchmark, the backward pass runs a little slower. This is strange since the implementation should be exactly the same.
<details>
<summary>Benchmark script</summary>
```python
from itertools import product
import torch
import torch.nn as nn
import torch.nn.functional as F
import time
torch.manual_seed(0)
MS_PER_SECOND = 1000
def _time():
    torch.cuda.synchronize()
    return time.perf_counter() * MS_PER_SECOND

device = "cuda"
C = 3
n_runs = 30
reductions = ["none", "sum", "mean"]
Ns = [128, 256, 512]
Hs = [128, 256, 512]

for reduction, N, H in product(reductions, Ns, Hs):
    total_fwd_time = 0
    total_back_time = 0
    if reduction == "none":
        grad_out = torch.randn(N, H, H, device=device)
    else:
        grad_out = torch.randn(1)[0]
    for _ in range(n_runs):
        input = torch.randn(N, C, H, H, device=device, requires_grad=True)
        target = torch.rand(N, H, H, device=device).mul(3).floor().long()
        # forward
        start = _time()
        result = F.nll_loss(input, target, reduction=reduction)
        total_fwd_time += _time() - start
    result = F.nll_loss(input, target, reduction=reduction)
    for _ in range(n_runs):
        # backward
        start = _time()
        result.backward(grad_out, retain_graph=True)
        total_back_time += _time() - start
    fwd_avg = total_fwd_time / n_runs
    bwd_avg = total_back_time / n_runs
    print(
        f"input size({N}, {C}, {H}, {H}), reduction: {reduction}, fwd: {fwd_avg:.2f} (ms), back: {bwd_avg:.2f} (ms)"
    )
```
</details>
<details>
<summary>master results</summary>
```
input size(128, 3, 128, 128), reduction: none, fwd: 0.34 (ms), back: 0.57 (ms)
input size(128, 3, 256, 256), reduction: none, fwd: 2.56 (ms), back: 3.85 (ms)
input size(128, 3, 512, 512), reduction: none, fwd: 14.54 (ms), back: 16.62 (ms)
input size(256, 3, 128, 128), reduction: none, fwd: 1.26 (ms), back: 1.78 (ms)
input size(256, 3, 256, 256), reduction: none, fwd: 7.07 (ms), back: 8.22 (ms)
input size(256, 3, 512, 512), reduction: none, fwd: 29.38 (ms), back: 33.29 (ms)
input size(512, 3, 128, 128), reduction: none, fwd: 3.41 (ms), back: 4.05 (ms)
input size(512, 3, 256, 256), reduction: none, fwd: 14.32 (ms), back: 16.46 (ms)
input size(512, 3, 512, 512), reduction: none, fwd: 59.20 (ms), back: 66.68 (ms)
input size(128, 3, 128, 128), reduction: sum, fwd: 0.08 (ms), back: 0.21 (ms)
input size(128, 3, 256, 256), reduction: sum, fwd: 0.21 (ms), back: 0.73 (ms)
input size(128, 3, 512, 512), reduction: sum, fwd: 0.82 (ms), back: 2.86 (ms)
input size(256, 3, 128, 128), reduction: sum, fwd: 0.12 (ms), back: 0.39 (ms)
input size(256, 3, 256, 256), reduction: sum, fwd: 0.42 (ms), back: 1.45 (ms)
input size(256, 3, 512, 512), reduction: sum, fwd: 1.53 (ms), back: 5.66 (ms)
input size(512, 3, 128, 128), reduction: sum, fwd: 0.21 (ms), back: 0.74 (ms)
input size(512, 3, 256, 256), reduction: sum, fwd: 0.78 (ms), back: 2.86 (ms)
input size(512, 3, 512, 512), reduction: sum, fwd: 2.98 (ms), back: 11.23 (ms)
input size(128, 3, 128, 128), reduction: mean, fwd: 0.07 (ms), back: 0.21 (ms)
input size(128, 3, 256, 256), reduction: mean, fwd: 0.21 (ms), back: 0.73 (ms)
input size(128, 3, 512, 512), reduction: mean, fwd: 0.82 (ms), back: 2.86 (ms)
input size(256, 3, 128, 128), reduction: mean, fwd: 0.13 (ms), back: 0.39 (ms)
input size(256, 3, 256, 256), reduction: mean, fwd: 0.42 (ms), back: 1.45 (ms)
input size(256, 3, 512, 512), reduction: mean, fwd: 1.54 (ms), back: 5.65 (ms)
input size(512, 3, 128, 128), reduction: mean, fwd: 0.22 (ms), back: 0.74 (ms)
input size(512, 3, 256, 256), reduction: mean, fwd: 0.78 (ms), back: 2.87 (ms)
input size(512, 3, 512, 512), reduction: mean, fwd: 2.98 (ms), back: 11.23 (ms)
```
</details>
<details>
<summary>PR results</summary>
```
input size(128, 3, 128, 128), reduction: none, fwd: 0.33 (ms), back: 0.59 (ms)
input size(128, 3, 256, 256), reduction: none, fwd: 2.51 (ms), back: 3.92 (ms)
input size(128, 3, 512, 512), reduction: none, fwd: 14.52 (ms), back: 17.05 (ms)
input size(256, 3, 128, 128), reduction: none, fwd: 1.23 (ms), back: 1.85 (ms)
input size(256, 3, 256, 256), reduction: none, fwd: 7.07 (ms), back: 8.45 (ms)
input size(256, 3, 512, 512), reduction: none, fwd: 29.39 (ms), back: 34.21 (ms)
input size(512, 3, 128, 128), reduction: none, fwd: 3.40 (ms), back: 4.18 (ms)
input size(512, 3, 256, 256), reduction: none, fwd: 14.33 (ms), back: 16.90 (ms)
input size(512, 3, 512, 512), reduction: none, fwd: 59.04 (ms), back: 68.36 (ms)
input size(128, 3, 128, 128), reduction: sum, fwd: 0.07 (ms), back: 0.25 (ms)
input size(128, 3, 256, 256), reduction: sum, fwd: 0.21 (ms), back: 0.86 (ms)
input size(128, 3, 512, 512), reduction: sum, fwd: 0.82 (ms), back: 3.33 (ms)
input size(256, 3, 128, 128), reduction: sum, fwd: 0.12 (ms), back: 0.46 (ms)
input size(256, 3, 256, 256), reduction: sum, fwd: 0.42 (ms), back: 1.70 (ms)
input size(256, 3, 512, 512), reduction: sum, fwd: 1.53 (ms), back: 6.58 (ms)
input size(512, 3, 128, 128), reduction: sum, fwd: 0.21 (ms), back: 0.87 (ms)
input size(512, 3, 256, 256), reduction: sum, fwd: 0.78 (ms), back: 3.34 (ms)
input size(512, 3, 512, 512), reduction: sum, fwd: 2.98 (ms), back: 13.07 (ms)
input size(128, 3, 128, 128), reduction: mean, fwd: 0.07 (ms), back: 0.26 (ms)
input size(128, 3, 256, 256), reduction: mean, fwd: 0.21 (ms), back: 0.86 (ms)
input size(128, 3, 512, 512), reduction: mean, fwd: 0.82 (ms), back: 3.34 (ms)
input size(256, 3, 128, 128), reduction: mean, fwd: 0.12 (ms), back: 0.46 (ms)
input size(256, 3, 256, 256), reduction: mean, fwd: 0.42 (ms), back: 1.72 (ms)
input size(256, 3, 512, 512), reduction: mean, fwd: 1.53 (ms), back: 6.60 (ms)
input size(512, 3, 128, 128), reduction: mean, fwd: 0.21 (ms), back: 0.87 (ms)
input size(512, 3, 256, 256), reduction: mean, fwd: 0.78 (ms), back: 3.33 (ms)
input size(512, 3, 512, 512), reduction: mean, fwd: 2.98 (ms), back: 13.07 (ms)
```
</details>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62826
Reviewed By: bdhirsh
Differential Revision:
D30282279
Pulled By: ngimel
fbshipit-source-id:
4aa0ff3f8af0632957417931d332ec486a12b52d
Alexander Soare [Fri, 13 Aug 2021 00:35:02 +0000 (17:35 -0700)]
add autowrap_functions kwarg to fx.Tracer (#62106)
Summary:
Implements feature request https://github.com/pytorch/pytorch/issues/62021
Test it out with
```python
from torch import fx
from torch import nn
def fx_int(x):
    return int(x)

class MyModule(nn.Module):
    def forward(self, x):
        return fx_int(x.shape[0] / 2)
tracer = fx.Tracer(autowrap_functions=(fx_int,)) # or remove kwarg to demonstrate symbolic trace error
tracer.trace(MyModule())
```
First time contributor, so please advise if I could have done anything to make lives easier for next time.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62106
Reviewed By: SplitInfinity, driazati
Differential Revision:
D30080834
Pulled By: jamesr66a
fbshipit-source-id:
68fadf8c881ea7930e7afd62b642874010fe4903
Bradley Davis [Fri, 13 Aug 2021 00:27:08 +0000 (17:27 -0700)]
[fx] store Tracer class on Graph and GraphModule for package deserialization [v2, the re-do] (#63121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63121
Re-introducing this diff with a small change: skip setting the Tracer class on GraphModules when the Tracer class is not defined at module level (which prevents pickling).
Previously reverted Pull Request: https://github.com/pytorch/pytorch/pull/62497
Reviewed By: houseroad
Differential Revision:
D30252776
fbshipit-source-id:
42d2bc846e4b32d00563419c38c02b63cd0986e6
Karol Sputo [Thu, 12 Aug 2021 22:36:29 +0000 (15:36 -0700)]
Show warning in eager mode for empty containers (#62978)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54873
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62978
Reviewed By: navahgar
Differential Revision:
D30278343
Pulled By: ansley
fbshipit-source-id:
ebb19f7b8a10720f2612b99a2668d1ebbc1f2d16
Hanton Yang [Thu, 12 Aug 2021 22:32:25 +0000 (15:32 -0700)]
[iOS] Add podspec for libTorch-lite nightly build (#62691)
Summary:
The nightly pod version will be aliased with the [PyTorch nightly build version](https://github.com/pytorch/pytorch/blob/master/.circleci/scripts/binary_populate_env.sh#L88) and follow the [CocoaPods version specification](https://guides.cocoapods.org/using/the-podfile.html#specifying-pod-versions); the version format of the podspec is `PyTorch version + nightly build date`, like `1.10.0.20210812`.
Usage:
1. Add `pod 'LibTorch-Lite-Nightly'` to `Podfile`
2. Run `pod install` to install the nightly built lib
3. Run `pod update` to update the lib to the latest version
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62691
Test Plan:
* Test on [TestApp](https://github.com/pytorch/pytorch/tree/master/ios/TestApp) and [HelloWorld](https://github.com/pytorch/ios-demo-app):
Podfile: `pod 'LibTorch-Lite-Nightly'`
* Test on Private Pod:
{F642106928}
Reviewed By: xta0
Differential Revision:
D30090760
Pulled By: hanton
fbshipit-source-id:
361aa2ed24a11d6aced8374cb45f70f49bd5da52
Rong Rong (AI Infra) [Thu, 12 Aug 2021 21:40:29 +0000 (14:40 -0700)]
[BE] delete GHA generated workflow files before regen (#63148)
Summary:
Unlike CircleCI, where all workflows go in one file, legacy GHA generated workflow files will silently remain in one's PR, e.g. when we change the build_environment name, and that's not ideal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63148
Reviewed By: bdhirsh
Differential Revision:
D30283382
Pulled By: walterddr
fbshipit-source-id:
ffdd5bf9561dd38499052855a12ee5cf838a20b0
Tao Xu [Thu, 12 Aug 2021 20:18:42 +0000 (13:18 -0700)]
[iOS][GPU] Fix the clamp shader function for x86_64 (#63062)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63062
Previously, due to the need to support iOS 10.0, we used an fp16 version of the clamp kernel on Metal, which didn't work well on x86_64. Since we don't need to support 10.0 anymore, we can use the fp32 version, which works on both arm64 and x86_64.
ghstack-source-id:
135536785
Test Plan:
- `buck test pp-macos`
- Op tests in the playground app
{F641013793}
Reviewed By: husthyc
Differential Revision:
D30239931
fbshipit-source-id:
6ad1bf71422b537e052fbd7b7465ba8deb7ca0cf
Victor Quach [Thu, 12 Aug 2021 19:36:38 +0000 (12:36 -0700)]
Forbid inplace modification of a saved tensor's pack_hook input (#62717)
Summary:
When using saved tensors hooks (especially default hooks),
if the user defines a `pack_hook` that modifies its input,
it can cause some surprising behavior.
The goal of this PR is to prevent future user headaches by catching
in-place modifications of the input of `pack_hook` and raising an error where
applicable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62717
Reviewed By: albanD
Differential Revision:
D30255243
Pulled By: Varal7
fbshipit-source-id:
8d73f1e1b50b697a59a2849b5e21cf0aa7493b76
Howard Huang [Thu, 12 Aug 2021 19:22:06 +0000 (12:22 -0700)]
Update CONTRIBUTING.md to remove ProcessGroupAgent (#63160)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63160
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision:
D30284439
Pulled By: H-Huang
fbshipit-source-id:
53c31b6917ef5e2125e146fb0ed73ae3d76a8cf9
Edward Wang (EcoF) [Thu, 12 Aug 2021 19:10:50 +0000 (12:10 -0700)]
add use_strict_trace to tensorboard add_graph method (#63120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63120
FAIM returns dictionaries as the model output, which throws an error when trying to trace using add_graph. Pass in `strict` to the tracer to make this user configurable.
User post: https://fb.workplace.com/groups/pytorchLightning/permalink/1510194972650369/?comment_id=1510252919311241&reply_comment_id=1510281112641755
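A small usage sketch of the new knob (the model here is hypothetical; the keyword name follows this PR):
```python
import torch
from torch.utils.tensorboard import SummaryWriter

class DictOutputModel(torch.nn.Module):   # hypothetical model that returns a dict
    def forward(self, x):
        return {"out": x * 2}

writer = SummaryWriter()
# strict=False is forwarded to the tracer, so dict outputs no longer raise.
writer.add_graph(DictOutputModel(), torch.randn(1, 3), use_strict_trace=False)
writer.close()
```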
Test Plan: unit test
Reviewed By: Reubend
Differential Revision:
D30265890
fbshipit-source-id:
58b25d9500b875a29a664aa9ef4c1e7f13631fa1
Shen Li [Thu, 12 Aug 2021 18:39:31 +0000 (11:39 -0700)]
Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer
Differential Revision:
D30279364 (https://github.com/pytorch/pytorch/commit/b0043072529b81276a69df29e00555333117646c)
Original commit changeset:
c1ed77dfe43a
fbshipit-source-id:
eab50857675c51e0088391af06ec0ecb14e2347e
jiej [Thu, 12 Aug 2021 18:03:32 +0000 (11:03 -0700)]
LayerNorm Support in autodiff: (#50467)
Summary:
1. Extend autodiff by adding an entry for layer_norm in the symbolic script; we now use native_layer_norm_backward.
2. Add a backward function `layernorm_double_backward` for `native_layer_norm_backward`, preserving double backward support for LayerNorm in autodiff/ScriptModule (see the sketch after this list).
3. Add a Python test to verify autodiff on layer_norm with various configurations of optional tensors (verifying the fix in https://github.com/pytorch/pytorch/issues/49430).
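A minimal sketch exercising double backward through a scripted layer_norm (shapes are illustrative):
```python
import torch
import torch.nn.functional as F

@torch.jit.script
def ln(x, weight, bias):
    return F.layer_norm(x, [4], weight, bias)

x = torch.randn(2, 4, dtype=torch.double, requires_grad=True)
w = torch.randn(4, dtype=torch.double, requires_grad=True)
b = torch.randn(4, dtype=torch.double, requires_grad=True)

# Double backward still works with the new symbolic-script entry.
torch.autograd.gradgradcheck(ln, (x, w, b))
```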
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50467
Reviewed By: eellison
Differential Revision:
D30232864
Pulled By: jansel
fbshipit-source-id:
b9c33075386aff96afff7415df9f94388bfb474a
Co-authored-by: Ryan Spring <rspring@nvidia.com>
Co-authored-by: Jie <jiej@nvidia.com>
Zsolt Dollenstein [Thu, 12 Aug 2021 17:56:55 +0000 (10:56 -0700)]
[codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle
Reviewed By: zertosh
Differential Revision:
D30279364
fbshipit-source-id:
c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
Kushashwa Ravi Shrimali [Thu, 12 Aug 2021 16:45:17 +0000 (09:45 -0700)]
[reland] OpInfo: `adaptive_avg_pool2d` (#62935)
Summary:
This PR is an attempt to reland https://github.com/pytorch/pytorch/pull/62704.
**What has changed?**
The op has non-deterministic behavior, hence an appropriate `gradcheck` wrapper had to be added.
cc: mruberry zou3519 heitorschueroff kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62935
Reviewed By: anjali411
Differential Revision:
D30225095
Pulled By: zou3519
fbshipit-source-id:
644873cc21d44b19c8b68f9edff691913778de0e
Rong Rong (AI Infra) [Thu, 12 Aug 2021 15:13:01 +0000 (08:13 -0700)]
[BE] shorten CI name part2 (#63030)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62357
There's no need to specify the cuDNN version since it is already implied by the CUDA version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63030
Reviewed By: zhouzhuojie, driazati
Differential Revision:
D30226354
Pulled By: walterddr
fbshipit-source-id:
7e2dc577810e0ce80ee27569c25a814566250ab1
Rohan Varma [Thu, 12 Aug 2021 07:37:30 +0000 (00:37 -0700)]
Skip zero test on windows (#63087)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63087
Test failed on Windows unexpectedly, see
https://github.com/pytorch/pytorch/issues/63086. Skipping for now while we
investigate.
ghstack-source-id:
135631811
Test Plan: CI
Reviewed By: ngimel
Differential Revision:
D30251300
fbshipit-source-id:
8acb1ea8863c654c171fe989ac24446c321c085d
Peter Bell [Thu, 12 Aug 2021 06:46:12 +0000 (23:46 -0700)]
BatchNorm: Use resize_output and empty, instead of empty_like (#63084)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62967
This lets each of the three implementations choose which memory format
to use for the output, meaning channels_last can be used in more cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63084
Reviewed By: saketh-are
Differential Revision:
D30255740
Pulled By: ngimel
fbshipit-source-id:
48d42850952ec910b29521a1c4e530eb2b29df5e
Supriya Rao [Thu, 12 Aug 2021 05:05:30 +0000 (22:05 -0700)]
[quant] Make version 1 the default for get_default_qat_qconfig (#63043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63043
In version 1 we use the fused module/operator during QAT. Making this the default for all QAT runs going forward.
Older models saved after prepare_qat_fx can still load their state_dict into a model prepared using version 1.
The state_dict will still have the same attribute for the observer/fake_quant modules.
There may be some numerics difference between the old observer code in observer.py and the new fused module that was
re-written in C++/CUDA to perform observe + fake_quantize.
This PR also updates the test to check for the new module instead of the default FakeQuantize module.
Note: there are also some changes to make the operator work for multi-dim per-channel quantization + updated the test for that.
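A hedged usage sketch (assuming the `version` argument that this change builds on):
```python
from torch.quantization import get_default_qat_qconfig

qconfig = get_default_qat_qconfig("fbgemm")                  # now picks version 1 (fused observer + fake-quant)
qconfig_old = get_default_qat_qconfig("fbgemm", version=0)   # the previous behaviour remains selectable
```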
Test Plan:
python test/test_quantization.py TestSerialization.test_default_qat_qconfig
Imported from OSS
Reviewed By: raghuramank100
Differential Revision:
D30232222
fbshipit-source-id:
f3553a1926ab7c663bbeed6d574e30a7e90dfb5b
Pritam Damania [Thu, 12 Aug 2021 04:41:31 +0000 (21:41 -0700)]
Fix sharded tensor tests. (#63054)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63054
1) Ensure these tests are skipped in environments without any GPUs.
2) Add the test to run_test.py
ghstack-source-id:
135595698
Test Plan: waitforbuildbot
Reviewed By: wanchaol
Differential Revision:
D30239159
fbshipit-source-id:
21b543ba72e8d10182bc77e7ae1fd34fd4096509
Meghan Lele [Thu, 12 Aug 2021 04:01:28 +0000 (21:01 -0700)]
Port `log_softmax_backward_data` to structured kernel (#62372)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62372
Test Plan: Imported from OSS
Reviewed By: saketh-are
Differential Revision:
D30240242
Pulled By: SplitInfinity
fbshipit-source-id:
67d5e4b1543c2e43675e905ce18ca49c11e33748
Meghan Lele [Thu, 12 Aug 2021 04:01:28 +0000 (21:01 -0700)]
Port `log_softmax` to structured kernel (#57374)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57374
Test Plan: Imported from OSS
Reviewed By: saketh-are
Differential Revision:
D30240243
Pulled By: SplitInfinity
fbshipit-source-id:
de6617c75d16e26d607a884c25b8752b7b561737
zhouzhuojie [Thu, 12 Aug 2021 00:09:02 +0000 (17:09 -0700)]
Add ciflow_ruleset.json generator along with gha ci (#63097)
Summary:
- Add `.github/generated-ciflow-ruleset.json` for ciflow-bot (so that we can generate better comments)
- The lint job also checks for a dirty git state to make sure that the file is always in sync with the ciflow configs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63097
Reviewed By: saketh-are
Differential Revision:
D30263278
Pulled By: zhouzhuojie
fbshipit-source-id:
bad68105a228e892ba071b29ecfdf433e1038054
Jiewen Tan [Wed, 11 Aug 2021 23:42:34 +0000 (16:42 -0700)]
Improve IMethod::getArgumentNames to deal with empty argument names list (#62947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62947
This diff improves IMethod::getArgumentNames to deal with an empty argument names list.
Test Plan:
buck test mode/dev //caffe2/caffe2/fb/predictor:pytorch_predictor_test -- PyTorchDeployPredictor.GetEmptyArgumentNamesValidationMode
buck test mode/dev //caffe2/caffe2/fb/predictor:pytorch_predictor_test -- PyTorchDeployPredictor.GetEmptyArgumentNamesRealMode
Reviewed By: wconstab
Differential Revision:
D30179974
fbshipit-source-id:
c7aec35c360a73318867c5b77ebfec3affee47e3
Amy He [Wed, 11 Aug 2021 21:24:06 +0000 (14:24 -0700)]
Fix Nnapi backend execute's dangling pointer (#63092)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63092
Bug discovered while testing NNAPI Delegate on SparkAR.
Using
```
c10::IntArrayRef order = {0, 2, 3, 1};
fixed_inputs.push_back(tensorInp.get(i).permute(order).contiguous());
```
results in a garbage value for `order` in `permute()`.
Moving `order` inside the call to `permute()` fixes this issue. The problem is seemingly related to https://github.com/pytorch/pytorch/issues/44409, but luckily the solution in this case is simple.
The bug wasn't caught earlier, since regular unit tests weren't affected by the dangling pointer, and address sanitizer NNAPI tests are turned off due to a different failure (T95764916).
ghstack-source-id:
135526129
Test Plan:
Run Unit tests: `python test/test_jit.py`
Build and run SparkAR on an Android phone at the top of this diff stack (D30173959): `buck build --show-output arstudioplayer_arm64_debug -c pt.enable_nnapi=1`
Reviewed By: raziel, iseeyuan
Differential Revision:
D30237504
fbshipit-source-id:
c946d81feefc453b43d9295d8d6f509cafdcec03
Nikita Shulga [Wed, 11 Aug 2021 21:05:55 +0000 (14:05 -0700)]
Fix warnings (#62930)
Summary:
Add `-Wno-writable-strings` (which is clang's flavor of `-Wwrite-strings`) to the list of warnings ignored while compiling torch_python.
Avoid unnecessary copies in range loops.
Fix a number of signed-unsigned comparisons.
Found while building locally on M1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62930
Reviewed By: albanD
Differential Revision:
D30171981
Pulled By: malfet
fbshipit-source-id:
25bd43dab5675f927ca707e32737ed178b04651e
Tao Xu [Wed, 11 Aug 2021 20:28:09 +0000 (13:28 -0700)]
[iOS][GPU] Consolidate array and non-array kernel for upsampling_nearest2d (#63061)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63061
Cleanup the redundant shader code for the upsampling nearest kernel.
ghstack-source-id:
135524349
Test Plan:
- `buck test pp-macos`
- Op tests in PyTorchPlayground app
Reviewed By: husthyc
Differential Revision:
D30236905
fbshipit-source-id:
e1e001b446452b077e6db719b0519c9070f3300b
Richard Barnes [Wed, 11 Aug 2021 20:12:16 +0000 (13:12 -0700)]
irange-ify 13b (#62476)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62476
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision:
D30001445
fbshipit-source-id:
6f4525338c80e9f929695f47f36ca9c72d96a75d
CaoE [Wed, 11 Aug 2021 19:51:28 +0000 (12:51 -0700)]
Add BFloat16 support for unique and unique_consecutive on CPU (#62559)
Summary:
Add BFloat16 support for unique and unique_consecutive on CPU.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62559
Reviewed By: saketh-are
Differential Revision:
D30250675
Pulled By: ngimel
fbshipit-source-id:
26e48f971d87f3b86db237e8ad3a4b74eb3c1def
Alexander Grund [Wed, 11 Aug 2021 19:42:32 +0000 (12:42 -0700)]
Add Github action to upload full source releases (#63022)
Summary:
Those release tarballs include the submodules.
The action runs on every tag and master-branch push but will not upload anything;
this makes sure nothing is broken when an actual release happens.
On created releases the action runs and uploads the tarball.
Fixes https://github.com/pytorch/pytorch/issues/62708
As I don't have access rights here and testing is obviously hard (as a new release needs to be published), I set up a test at https://github.com/Flamefire/pytorch/releases/tag/testtag
See also the run(s) at https://github.com/Flamefire/pytorch/actions/workflows/create_release.yml
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63022
Reviewed By: saketh-are
Differential Revision:
D30256253
Pulled By: seemethere
fbshipit-source-id:
ab5fe131452de14ae3768b91c221e68c536cb3aa
Xiang Gao [Wed, 11 Aug 2021 19:34:58 +0000 (12:34 -0700)]
Embedding thrust->cub: unique (#63042)
Summary:
Followup of https://github.com/pytorch/pytorch/pull/62495
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63042
Reviewed By: saketh-are
Differential Revision:
D30231084
Pulled By: ngimel
fbshipit-source-id:
03b0a88107e8a2aee3570881d81bf2b676f525cd
Howard Cheng [Wed, 11 Aug 2021 19:32:10 +0000 (12:32 -0700)]
[PyTorch] Add flop count for addmm (#61895)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61895
* Add FLOP count for addmm, should be `2*m*n*k`.
Share the same code path for `addmm` and `mm`.
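For reference, the formula counts one multiply and one add per output element and reduction index; a tiny sketch:
```python
# addmm(bias, A, B) with A: (m, k) and B: (k, n)
m, k, n = 64, 128, 256
flops = 2 * m * n * k
print(flops)  # 4194304
```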
Test Plan:
Imported from OSS
`python test/test_profiler.py`
Run a sample profile and check that FLOPS for `aten::addmm` is correct.
`[chowar@devbig053.frc2 ~/local/pytorch/build] ninja bin/test_jit`
`[chowar@devbig053.frc2 ~/local/pytorch/build] ./bin/test_jit --gtest_filter='ComputeFlopsTest*'`
Reviewed By: dskhudia
Differential Revision:
D29785671
fbshipit-source-id:
d1512036202d7234a981bda897af1f75808ccbfe
Salil Desai [Wed, 11 Aug 2021 18:51:58 +0000 (11:51 -0700)]
XNNPack Input Pointer Caching Comment (#62818)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62818
Added a comment to explain why we no longer need to manually cache pointers/parameters for convolution, as removed in
D29777605 (https://github.com/pytorch/pytorch/commit/f5c6c3947e4618d30ebd68a414f1cfcda27bdcd4)
Test Plan: Sandcastle tests (no code changed)
Reviewed By: kimishpatel
Differential Revision:
D30113489
fbshipit-source-id:
d697f05816acbd367d59a4aced1925303c683d40
rusty1s [Wed, 11 Aug 2021 18:35:53 +0000 (11:35 -0700)]
`_convert_coo_to_csr` CPP and CUDA functionality (#61838)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/57381 and improves https://github.com/pytorch/pytorch/pull/61340 via dedicated `coo_to_csr` functionalities.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61838
Reviewed By: ezyang
Differential Revision:
D30132736
Pulled By: cpuhrsch
fbshipit-source-id:
a1fd074c0d70366a524d219a620b94f8bed71d7c
Pritam Damania [Wed, 11 Aug 2021 18:22:48 +0000 (11:22 -0700)]
Add a _RemoteDevice structure for ShardedTensor/ShardingSpec. (#62927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62927
As part of the ShardedTensor work, we realized we do need some sort of
_RemoteDevice structure that deals with our format of "workername/device" so
that users don't have to worry about parsing this string directly.
Right now this structure is just the bare minimum and is mostly a container for
describing a remote device. It is currently only used in ShardedTensor,
ShardingSpec and RemoteModule.
Once we actually have a consolidated remote device proposal, this class can be
extended appropriately if needed.
ghstack-source-id:
135534086
Test Plan:
1) unit tests
2) waitforbuildbot
Reviewed By: SciPioneer
Differential Revision:
D30170689
fbshipit-source-id:
1ac2e81c7a597dc40bf3fbf2c1168c382c66649f
Jacob Szwejbka [Wed, 11 Aug 2021 18:14:25 +0000 (11:14 -0700)]
[Pytorch Edge] Move RuntimeCompatibilityInfo Factory Method (#63005)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63005
Realized I forgot to move the Runtime half of these functions to be within the struct.
Test Plan: ci
Reviewed By: pavithranrao
Differential Revision:
D30205521
fbshipit-source-id:
ccd87d7d78450dd0dd23ba493bbb9d87be4640a5
Stephen Macke [Wed, 11 Aug 2021 18:09:02 +0000 (11:09 -0700)]
[easy] add an `inplace` argument to MutableNetProto.to_net() and core.Net() constructor (#63068)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63068
The caffe2 core.Net constructor can accept a caffe2_pb2.NetDef proto, but it always creates a copy. This is wasteful when we can prove that the proto being passed to it will not be used anywhere else. So we add an "inplace" argument to the `core.Net` constructor that allows clients to give away ownership of the passed proto without copying. We default this argument to `False`, ensuring that behavior does not change unless explicitly requested.
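A minimal usage sketch of the new argument (the net name is illustrative):
```python
from caffe2.proto import caffe2_pb2
from caffe2.python import core

proto = caffe2_pb2.NetDef()
proto.name = "example_net"

net_copy = core.Net(proto)                 # default behaviour: the proto is copied
net_owned = core.Net(proto, inplace=True)  # new: give away ownership, no copy is made
```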
Test Plan: Let CI run.
Differential Revision:
D29976510
fbshipit-source-id:
26e13ca76f3431b8ef0de51f08bbf263491d323e
zhouzhuojie [Wed, 11 Aug 2021 16:42:15 +0000 (09:42 -0700)]
Fix gha render-test-result mixed failure passthrough (#63056)
Summary:
To fix something like https://github.com/pytorch/pytorch/actions/runs/1114555082
![image](https://user-images.githubusercontent.com/658840/128956528-86997457-5e18-4ae1-83cc-aa7d0ca03c0e.png)
Not sure why `needs.test.result` doesn't capture the `failure` case before, so changed it to `needs.test.result != 'skipped' || failure()`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63056
Reviewed By: walterddr, tktrungna
Differential Revision:
D30240112
Pulled By: zhouzhuojie
fbshipit-source-id:
d159cc3f79ed5d604ae12583736b37ac28e8d87c
Yida Wang [Wed, 11 Aug 2021 16:36:49 +0000 (09:36 -0700)]
Fix issues with printing certain torch modules (#62447)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54420
When I tested on master with the testing code below, there were multiple objects on the garbage collector that could not be printed.
Testing code:
```
import torch
import gc
import os
import sys
print(torch.__version__)
a = torch.rand(10)
print(a)
objects = gc.get_objects()
for i in range(len(objects)):
    print(objects[i])
```
### 1
```
print(torch.classes)
```
As SplitInfinity mentioned in the GitHub issue, the solution here is to set `__file__` for `torch.classes` to something. Similar to [_ops.py](https://github.com/pytorch/pytorch/blob/master/torch/_ops.py#L69), where `__file__` is set to `_ops.py`, we could set `__file__` for torch.classes to `_classes.py`.
### 2
```
print(torch._ops.ops.quantized)
print(torch._ops.ops.atan)
```
When we try to print these two modules, it will call `_OpNamespace::__getattr__`, but the `op_name` is `__file__`. This becomes a problem when `torch._C._jit_get_operation(qualified_op_name)` [(link)](https://github.com/pytorch/pytorch/blob/master/torch/_ops.py#L60) tries to look for an actual op on the native C++ side.
Only when we get the attribute for an actual op, e.g. `print(torch._ops.ops.quantized.elu)`, the `op_name` becomes proper (e.g. `elu`).
My current solution is to return a hardcoded string (i.e. “torch.ops”) if `op_name` is `"__file__"`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62447
Reviewed By: saketh-are
Differential Revision:
D30234654
Pulled By: yidawang-oss
fbshipit-source-id:
de43a8f599739c749fb3307eea015cc61f1da60e
Peter Bell [Wed, 11 Aug 2021 15:44:08 +0000 (08:44 -0700)]
Shard python_functions.cpp (#62186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62186
This file takes 6 minutes on its own to compile and is the limiting factor for
building `libtorch_python` on a 32-core threadripper. This splits the file into
5 shards which take around 50 seconds each to compile.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision:
D29962046
Pulled By: albanD
fbshipit-source-id:
df13cfaebd54296f10609f67ae74a850c329bd37
Sze Wai Celeste Yuen [Wed, 11 Aug 2021 15:38:13 +0000 (08:38 -0700)]
Fix inconsistency between Python and JIT power operation (#62842)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62842
Test Plan:
Wrote unit test TestAtenPow to test behavior of aten::pow when:
1. base is int, exponent is int
2. base is int, exponent is float
3. base is float, exponent is int
4. base is float, exponent is float
Specifically, we test that when the base is zero and the exponent is negative, we raise an error. In all other cases, we expect the behavior to match the result returned by Python.
Because the C++ code relies on overloading, we need to make sure all combinations of types give us the expected result.
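A small sketch of the behavior being tested (hedged; the exact error on the JIT side follows the test plan above):
```python
import torch

@torch.jit.script
def jit_pow(base: float, exp: int):
    return base ** exp

print(jit_pow(2.0, 3))   # 8.0, same as Python's 2.0 ** 3
print(2.0 ** 3)          # 8.0
# jit_pow(0.0, -1) now raises, matching Python's ZeroDivisionError for 0.0 ** -1
```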
Reviewed By: zhxchen17
Differential Revision:
D30146115
Pulled By: szewaiyuen7
fbshipit-source-id:
dc661897ad38da286ee454120fbe41314b7f2995
Dmytro Dzhulgakov [Wed, 11 Aug 2021 08:08:45 +0000 (01:08 -0700)]
Fix CUDA_KERNEL_ASSERT ambiguous symbol in NDEBUG mode (#62527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62527
If NDEBUG is applied inconsistently in compilation we might get an 'ambiguous declaration' error. Let's make sure that the forward declaration matches glibc, including all specifiers.
Test Plan: sandcastle
Reviewed By: mdschatz
Differential Revision:
D30030051
fbshipit-source-id:
9f4d5f1d4e74f0a4eaeeaaaad76b93ee485d8bcd
Pritam Damania [Wed, 11 Aug 2021 05:37:14 +0000 (22:37 -0700)]
[4/N] Enable opt-asan for distributed unit tests. (#62051)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62051
The goal here is to enable opt-asan for "spawn" based unit tests since
this works for "spawn" unlike "dev-asan". As a result, we can run ASAN for
"spawn" unit tests as well.
This means we can completely remove fork unit tests from the code base since
the only purpose for these tests was to run ASAN.
ghstack-source-id:
135523770
Test Plan: waitforbuildbot
Reviewed By: SciPioneer
Differential Revision:
D29854514
fbshipit-source-id:
02a5bfcfae2afc21badecff77082c7a6ad83636b
Lu Fang [Wed, 11 Aug 2021 04:56:41 +0000 (21:56 -0700)]
Back out "[fx] store Tracer class on Graph and GraphModule for package deserialization" (#63053)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63053
Original commit changeset:
eca09424ad30
The original diff -
D30019214 (https://github.com/pytorch/pytorch/commit/
6286d338785c48a3e7a9b969e2bc3bd4d502851d) breaks the publish flow in model saving.
Test Plan: ci
Differential Revision:
D30236517
fbshipit-source-id:
3e05db02fc1cbbc2ed262c83bf56d555277abb34
Rishi Puri [Wed, 11 Aug 2021 03:02:07 +0000 (20:02 -0700)]
rebase for autocast updates to include device_type and dtype flags (#61002)
Summary:
Fixes #55374 (https://github.com/pytorch/pytorch/issues/55374)
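A hedged usage sketch of the flags mentioned in the title (assuming the `torch.autocast(device_type=..., dtype=...)` form this PR introduces; details of defaults and supported ops may differ):
```
import torch

model = torch.nn.Linear(8, 8)
x = torch.randn(2, 8)

# Device-generic autocast: device_type and dtype are explicit flags rather
# than being implied by a CUDA-only entry point.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)
print(y.dtype)  # expected: torch.bfloat16 inside the autocast region
```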
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61002
Reviewed By: malfet, mruberry
Differential Revision:
D30016812
Pulled By: ngimel
fbshipit-source-id:
6e09a29f539d28e9aea5cd9489b1e633cc588033
Wei-Sheng Chin [Wed, 11 Aug 2021 02:46:46 +0000 (19:46 -0700)]
Fix missing element types and shapes when autograd.Function has multiple tensor outputs (#57966)
Summary:
When generating IR for an autograd.Function with multiple outputs, a TupleUnpack may be inserted after the original function node, and PyTorch only assigns proper information (tensor element type and shape) to the TupleUnpack, forgetting the original function node. In contrast, if the autograd.Function produces only one output, the original function node may have the tensor element type and shape in its output schema. A minimal function that exercises the multi-output path is sketched after the before/after comparison below.
Before this PR:
- (simplified) IR for autograd.Function with one output: input (tensor, dtype=float32, shape=[2, 3]) -> PythonOp -> output (tensor, dtype=float32, shape=[4, 5])
- (simplified) IR for autograd.Function with multiple outputs: input (tensor, dtype=float32, shape=[2, 3]) -> PythonOp -> output_0 **(tensor)**, output_1 **(tensor)** -> TupleUnpack output_2 (tensor, dtype=float32, shape=[4, 5]), output_3 (tensor, dtype=float32, shape=[6, 7])
After this PR:
- (simplified) IR for autograd.Function with one output: input (tensor, dtype=float32, shape=[2, 3]) -> PythonOp -> output (tensor, dtype=float32, shape=[4, 5])
- (simplified) IR for autograd.Function with multiple outputs: input (tensor, dtype=float32, shape=[2, 3]) -> PythonOp -> output_0 **(tensor, dtype=float32, shape=[4, 5])**, output_1 **(tensor, dtype=float32, shape=[6, 7])** -> TupleUnpack output_2 (tensor, dtype=float32, shape=[4, 5]), output_3 (tensor, dtype=float32, shape=[6, 7])
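For context, a minimal sketch of the kind of multi-output autograd.Function this affects (illustrative shapes matching the IR above; not the test case from the PR):
```
import torch

class TwoOutputs(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Two tensor outputs: in the exported IR a TupleUnpack follows the
        # PythonOp, and both unpacked values should now carry dtype/shape.
        return x.new_full((4, 5), 1.0), x.new_full((6, 7), 2.0)

    @staticmethod
    def backward(ctx, grad_a, grad_b):
        (x,) = ctx.saved_tensors
        return torch.zeros_like(x)

x = torch.randn(2, 3, requires_grad=True)
out_a, out_b = TwoOutputs.apply(x)
print(out_a.shape, out_b.shape)  # torch.Size([4, 5]) torch.Size([6, 7])
```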
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57966
Reviewed By: zhxchen17
Differential Revision:
D30208207
Pulled By: gmagogsfm
fbshipit-source-id:
42a3d1f9c0932133112a85df0c49cf4ea0afa175
Natalia Gimelshein [Wed, 11 Aug 2021 01:39:45 +0000 (18:39 -0700)]
remove dead code (#63031)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63031
Reviewed By: mruberry
Differential Revision:
D30225094
Pulled By: ngimel
fbshipit-source-id:
3666a0fa120bea85225cd3ee04f89d64952d2862
Natalia Gimelshein [Wed, 11 Aug 2021 01:23:00 +0000 (18:23 -0700)]
Revert
D30199482: [pytorch][PR] Add BFloat16 support for unique and unique_consecutive on CPU
Test Plan: revert-hammer
Differential Revision:
D30199482 (https://github.com/pytorch/pytorch/commit/
fc0b8e60337ae46b90ed5d2f6d1f623f0f8d6581)
Original commit changeset:
6f2d9cc1a528
fbshipit-source-id:
39e9f202bcbd978525f792173d4f97b5b329b5b1
Richard Barnes [Wed, 11 Aug 2021 00:57:22 +0000 (17:57 -0700)]
Use `const auto` with irange (#62990)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62990
Test Plan: Sandcastle
Reviewed By: zhouzhuojie
Differential Revision:
D30199748
fbshipit-source-id:
284b208ffa3c6c4749e5ac9b1fccb28914590f2c
Eddie Yan [Wed, 11 Aug 2021 00:44:40 +0000 (17:44 -0700)]
change nccl version reporting (#62916)
Summary:
https://github.com/pytorch/pytorch/issues/62295
Previously the packing and unpacking of the NCCL version "integer" was done to have parity with the upstream NCCL version encoding. However, there doesn't seem to be any place where this integer is directly compared with a version integer sourced from upstream NCCL, and syncing the encoding seems to be error-prone (e.g., a recent change where a special case was added for minor versions >= 10 https://github.com/NVIDIA/nccl/blob/
7e515921295adaab72adf56ea71a0fafb0ecb5f3/src/nccl.h.in#L22).
This patch changes the reporting to return a tuple of version numbers instead (to preserve ease-of-use for comparisons) and tweaks the passing between C/Python to avoid the digit overflow problem.
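A small sketch of why tuples are the friendlier representation (the packing below roughly mirrors the upstream macro linked above and is only for illustration):
```
def pack(major, minor, patch):
    # Upstream-style encoding, including the special case for minor >= 10
    # that made keeping the integer encodings in sync error-prone.
    if major <= 2 and minor <= 8:
        return major * 1000 + minor * 100 + patch
    return major * 10000 + minor * 100 + patch

# Packed integers only compare correctly if both sides agree on the encoding:
assert pack(2, 10, 3) > pack(2, 8, 4)

# Tuples compare lexicographically and need no packing scheme at all:
assert (2, 10, 3) > (2, 8, 4)
```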
CC ngimel mcarilli
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62916
Reviewed By: anjali411
Differential Revision:
D30201069
Pulled By: mrshenli
fbshipit-source-id:
2e4e7c69f001c3f22bd04aa6df6a992e538bea45
tktrungna [Tue, 10 Aug 2021 23:24:57 +0000 (16:24 -0700)]
Update test_torch_deploy (#62838)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62838
Fixes #62380
* update test functions to use the wheel install folder {sitepackages}/torch instead of the build/ folder
* add symbolic links for the shared libraries called by the tests (this is a bit hacky and should instead be fixed by setting the rpath before compiling -- similar to https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/test.sh#L204-L208).
### Test plan
check if all ci workflows pass
Test Plan: Imported from OSS
Reviewed By: walterddr
Differential Revision:
D30193141
Pulled By: tktrungna
fbshipit-source-id:
72c2bd3a740fca0f72e4803df505240193692c44
tktrungna [Tue, 10 Aug 2021 23:24:57 +0000 (16:24 -0700)]
update test_libtorch (#62797)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62797
Fixes #62380
* update test functions to use the wheel install folder {sitepackages}/torch instead of the build/ folder
* add symbolic links for the shared libraries called by the tests (this is a bit hacky and should instead be fixed by setting the rpath before compiling -- similar to https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/test.sh#L204-L208).
### Test plan
check if all ci workflows pass
Test Plan: Imported from OSS
Reviewed By: walterddr
Differential Revision:
D30193140
Pulled By: tktrungna
fbshipit-source-id:
d8e54c403f42abbbbe4556abf40c22a7955df737
tktrungna [Tue, 10 Aug 2021 23:24:57 +0000 (16:24 -0700)]
update test distributed (#62796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62796
Fixes #62380
* update test functions to use the wheel install folder {sitepackages}/torch instead of the build/ folder
* add symbolic links for the shared libraries called by the tests (this is a bit hacky and should instead be fixed by setting the rpath before compiling -- similar to https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/test.sh#L204-L208).
### Test plan
check if all ci workflows pass
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision:
D30193142
Pulled By: tktrungna
fbshipit-source-id:
1247f9eda1c11c763c31c7383c77545b1ead1a60
tktrungna [Tue, 10 Aug 2021 23:24:57 +0000 (16:24 -0700)]
update test_vulkan (#62795)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62795
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision:
D30124421
Pulled By: tktrungna
fbshipit-source-id:
235ba166b02f7334e89cb2493024067851bf5b9b
tktrungna [Tue, 10 Aug 2021 23:24:57 +0000 (16:24 -0700)]
update test_rpc (#62781)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62781
Test Plan: Imported from OSS
Reviewed By: walterddr, zhouzhuojie
Differential Revision:
D30124391
Pulled By: tktrungna
fbshipit-source-id:
99c275d6c9f23b4f274fd0ca19a16879ed27afd5
Matej Sladek [Tue, 10 Aug 2021 23:19:39 +0000 (16:19 -0700)]
[ONNX] add support for prim::Uninitialized in lower_tuples pass (#56912)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56911
The code from the issue generates this TorchScript:
```
graph(%self : __torch__.MyModule,
%t.1 : Tensor):
%12 : None = prim::Constant()
%7 : str = prim::Constant[value="Negative input"]() # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:11:28
%3 : int = prim::Constant[value=0]() # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:10:15
%9 : int = prim::Constant[value=5]() # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:13:31
%33 : (Tensor, Tensor) = prim::Uninitialized()
%4 : Tensor = aten::lt(%t.1, %3) # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:10:11
%6 : bool = aten::Bool(%4) # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:10:11
%34 : (Tensor, Tensor) = prim::If(%6) # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:10:8
block0():
= prim::RaiseException(%7) # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:11:12
-> (%33)
block1():
%11 : int[] = prim::ListConstruct(%9)
%16 : Tensor = aten::zeros(%11, %12, %12, %12, %12) # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:13:19
%18 : int[] = prim::ListConstruct(%9)
%23 : Tensor = aten::zeros(%18, %12, %12, %12, %12) # /mnt/nvdl/usr/msladek/notes/python_code/unitialized.py:13:35
%24 : (Tensor, Tensor) = prim::TupleConstruct(%16, %23)
-> (%24)
return (%34)
```
The problem is that the ONNX exporter's lower_tuples pass doesn't support forwarding of tuples through prim::Uninitialized.
The solution is:
1. add prim::Uninitialized to supported_op in the lower_tuples pass
2. since prim::Uninitialized now has multiple outputs, call giveFreshAlias for every output
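An approximate reconstruction of the module behind that graph (a sketch based on the listing above, not the exact code from the issue):
```
import torch

class MyModule(torch.nn.Module):
    def forward(self, t):
        # The raise in one branch makes TorchScript emit prim::Uninitialized
        # for the tuple that this branch can never actually produce.
        if t < 0:
            raise Exception("Negative input")
        return torch.zeros(5), torch.zeros(5)

m = torch.jit.script(MyModule())
print(m.graph)  # shows the prim::Uninitialized()-fed prim::If from the listing above
```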
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56912
Reviewed By: nikithamalgifb
Differential Revision:
D29837200
Pulled By: SplitInfinity
fbshipit-source-id:
321fae6fe52b1523df5653dbb9ea73b998ef1cda
Howard Huang [Tue, 10 Aug 2021 22:56:18 +0000 (15:56 -0700)]
Remove process_group_agent and faulty_process_group_agent files (#62985)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62985
Remove the process_group_agent and faulty_process_group_agent code now that the PROCESS_GROUP backend has been deprecated for RPC (https://github.com/pytorch/pytorch/issues/55615). Discussed with xush6528 and agreed it was okay to remove ProcessGroupAgentTest and ProcessGroupAgentBench, which depended on process_group_agent.
Test Plan: CI tests
Reviewed By: pritamdamania87
Differential Revision:
D30195576
fbshipit-source-id:
8b4381cffadb868b19d481198015d0a67b205811
Natalia Gimelshein [Tue, 10 Aug 2021 22:44:09 +0000 (15:44 -0700)]
fix sort and topk with discontiguous out (#63029)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62645 and https://github.com/pytorch/pytorch/issues/62940. The root cause of those bugs is a bad interaction between `collapseDims` and setting the size of the sorting/topK dimension to 1. If all other dimensions happen to be 1, `collapseDims` treats that `1` dimension as collapsible (even though it was specifically marked to be preserved) and loses its stride information. If the dimension were really of size 1, the stride information would be unimportant, but since in reality that dimension is not 1 and was only set to 1 for convenience, the loss of stride information results in incorrect outputs.
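A hedged repro sketch of the discontiguous-`out` case this fixes (shapes are illustrative; the affected code path is the CUDA kernel):
```
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1, 1, 32, device=device)

# Discontiguous out buffers, obtained by slicing larger allocations.
vals = torch.empty(1, 1, 10, device=device)[..., ::2]
idxs = torch.empty(1, 1, 10, device=device, dtype=torch.long)[..., ::2]

expected_vals, expected_idxs = torch.topk(x, k=5, dim=-1)
torch.topk(x, k=5, dim=-1, out=(vals, idxs))

# Before the fix, the lost stride information could make the out= results diverge.
print(torch.equal(expected_vals, vals), torch.equal(expected_idxs, idxs))
```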
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63029
Reviewed By: heitorschueroff
Differential Revision:
D30224925
Pulled By: ngimel
fbshipit-source-id:
269dd375c5cd57c6007fe91f729f8c60a2e7a264
Hanton Yang [Tue, 10 Aug 2021 22:15:23 +0000 (15:15 -0700)]
[iOS] enable Metal in the nightly build (#62855)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62855
Test Plan: Test on Private Pod with the [HelloWorld](https://fburl.com/3hiwkkhm) demo
Reviewed By: xta0
Differential Revision:
D30174151
Pulled By: hanton
fbshipit-source-id:
22cd8663ac239811bf8ed1c3b6301460d798dbfa
Christian Puhrsch [Tue, 10 Aug 2021 22:14:00 +0000 (15:14 -0700)]
test_cudnn_convolution_relu skipCUDAIfRocm
Summary: skip rocm test for test_cudnn_convolution_relu
Test Plan: This skips a test
Reviewed By: ngimel
Differential Revision:
D30233620
fbshipit-source-id:
31eab8b03c3f15674e0d262a8f55965c1aa6b809
Victor Quach [Tue, 10 Aug 2021 21:58:16 +0000 (14:58 -0700)]
Add docstring for saved tensors default hooks (#62361)
Summary:
Add documentation for the saved tensors default hooks introduced in https://github.com/pytorch/pytorch/issues/61834 / https://github.com/pytorch/pytorch/issues/62563
Sister PR: https://github.com/pytorch/pytorch/issues/62362 (will add a link from autograd.rst to notes/autograd in whatever PR does not land first)
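A brief sketch of the pack/unpack hook mechanism being documented, written here against the `torch.autograd.graph.saved_tensors_hooks` context manager (the docs added in this PR cover the default-hook variant of the same idea):
```
import torch

def pack(tensor):
    # Called when autograd saves a tensor for backward; the return value is
    # stored in place of the tensor.
    print("packing", tuple(tensor.shape))
    return tensor

def unpack(packed):
    # Called when backward needs the saved tensor back.
    print("unpacking", tuple(packed.shape))
    return packed

x = torch.randn(3, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
    y = (x * x).sum()
y.backward()
```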
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62361
Reviewed By: zou3519
Differential Revision:
D30081997
Pulled By: Varal7
fbshipit-source-id:
cb923e943e1d96db9669c1d863d693af30910c62
Tao Xu [Tue, 10 Aug 2021 21:32:11 +0000 (14:32 -0700)]
[iOS][CI] Store every version of nightlies in S3 (#63039)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63039
Test Plan: Imported from OSS
Reviewed By: hanton
Differential Revision:
D30229385
Pulled By: xta0
fbshipit-source-id:
15b438a6326159258803ab97e67dc9ec5db50d59
Jerry Zhang [Tue, 10 Aug 2021 20:57:14 +0000 (13:57 -0700)]
[quant][graphmode] Reference pattern support for elu (#62607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62607
Removing the quantize handler for elu since it can be covered by DefaultNodeQuantizeHandler
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: iramazanli
Differential Revision:
D30053977
fbshipit-source-id:
426789443e928bb01a88907de616cbda5866f621