Salil Desai [Tue, 14 Sep 2021 19:09:45 +0000 (12:09 -0700)]
[PyTorch Edge][Model Loading] Operator Call De-dup at TorchScript Serialization Level [1/2] (#64268)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64268
If the same pair of operator name and num inputs have been used to add an instruction to the operator table previously (and the operator's schema is not vararg), use the same index as that instruction rather than creating a new one.
ghstack-source-id: 138014905
Test Plan: Phabricator tests, and test performance changes in next diff
Reviewed By: iseeyuan, tugsbayasgalan
Differential Revision: D30615434
fbshipit-source-id: f442f557f12412693a73004ce44733ccef063b82
Eli Uriegas [Tue, 14 Sep 2021 18:20:51 +0000 (11:20 -0700)]
.github: Add render test results step (#64937)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64937
Adds CLI output for rendered test results to go alongside test execution; users should be able to quickly diagnose test failures like so:
![fdsfdsfdsfdsf](https://user-images.githubusercontent.com/1700823/133156245-ba939cbf-8aa2-47a7-b1fb-7cc876ca75c4.png)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
cc ezyang seemethere malfet lg20987 pytorch/pytorch-dev-infra
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D30917897
Pulled By: seemethere
fbshipit-source-id: f51ea499462e3cfd64496cb711b84a93971c91bd
Natalia Gimelshein [Tue, 14 Sep 2021 18:19:07 +0000 (11:19 -0700)]
remove SkipInfo class (#64972)
Summary:
per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64972
Reviewed By: mruberry
Differential Revision: D30924598
Pulled By: ngimel
fbshipit-source-id: 1ac1ec8fd50ca27e3cd36c12a588d334e7466899
Scott Wolchok [Tue, 14 Sep 2021 17:35:04 +0000 (10:35 -0700)]
[PyTorch] Don't store multiple kernels per key on mobile (#64447)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64447
As the code comment says, we needn't worry about Jupyter notebooks on mobile.
ghstack-source-id: 137951718
Test Plan: Profiled startup of //caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:cpp_benchmark on devserver with -niter 0 -nrep 0 and `C10_DISPATCHER_ONE_KERNEL_PER_DISPATCH_KEY` defined. Time spent in sherwood_v3_table lookups went way down.
Reviewed By: ezyang, bhosmer
Differential Revision: D30736094
fbshipit-source-id: bcc22cd0d9adceba259a03898c992759d501fe89
Shiyan Deng [Tue, 14 Sep 2021 16:41:57 +0000 (09:41 -0700)]
[fx const fold] fix some cases with deep model hierarchy (#64945)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64945
In the const folding pass, we try to create `get_attr` nodes in submod_1 for `get_attr` nodes that are in the main graph. But we don't have the real attributes in submod_1. To fix this, we assign the main module as the owning module of submod_1's graph.
The fix above would cause problems for `call_module` nodes in submod_1, because during the split, modules get inlined (target changed from "mod.a.b" -> "mod_a_b") into submod_1. Changing the owning module would make those `call_module` nodes unable to find the referenced module. To fix this, we set the target module to the main module.
Reviewed By: jfix71
Differential Revision: D30905949
fbshipit-source-id: cd67bc8fe4b8ad4344ae97b8e36753fdce3ece6d
Yi Wang [Tue, 14 Sep 2021 16:41:13 +0000 (09:41 -0700)]
[Model Averaging] Revert #63895 (#64903)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64903
Fix the accuracy regression caused by https://github.com/pytorch/pytorch/pull/63895.
Test Plan:
buck test mode/dev-nosan //caffe2/test/distributed:distributed_nccl_spawn -- test_periodic_model_averager
buck test mode/dev-nosan //caffe2/test/distributed:distributed_nccl_spawn -- test_post_localSGD_optimizer_parity
Reviewed By: rohan-varma
Differential Revision: D30894688
fbshipit-source-id: fe00b8b23b860d9f806f87c1b6caba1d0b807485
Nick Kreeger [Tue, 14 Sep 2021 16:40:33 +0000 (09:40 -0700)]
Drop incremental linking on Windows with REL_WITH_DEB_INFO=1. (#64892)
Summary:
The library will no longer link properly on VS 2019 (14.29.30133). To
ensure that engineers building on Windows can use and debug with this
build type, incremental linking needs to be turned off for this build
flag.
Verified that this build type successfully builds, links, and provides
debuggable Python modules on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64892
Reviewed By: jbschlosser
Differential Revision: D30902565
Pulled By: malfet
fbshipit-source-id: e5286a4c6f45c7cbe4cdc1b98560129bd386970b
Nikita Shulga [Tue, 14 Sep 2021 16:38:34 +0000 (09:38 -0700)]
Disable target determination for now (#64921)
Summary:
There were several reports of the target determinator incorrectly skipping
tests, the most recent being https://github.com/pytorch/pytorch/issues/64902
Let's disable it until it can be further stabilized.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64921
Reviewed By: seemethere, janeyx99
Differential Revision: D30901186
Pulled By: malfet
fbshipit-source-id: 531afd2d390c6b51f727330d5dd1882d70b6fdde
Jane (Yuan) Xu [Tue, 14 Sep 2021 15:59:15 +0000 (08:59 -0700)]
print_test_stats.py: dedup test report upload name with TEST_CONFIG (#64948)
Summary:
Connected with issue https://github.com/pytorch/pytorch/issues/64845, takeover of https://github.com/pytorch/pytorch/issues/64091
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64948
Reviewed By: malfet, seemethere
Differential Revision: D30908592
Pulled By: janeyx99
fbshipit-source-id: dc31b0bbc9f4e35d23412aa14acbbab7422b4146
Richard Zou [Tue, 14 Sep 2021 15:07:01 +0000 (08:07 -0700)]
Make {select,slice,diagonal}_backward primitives wrt autograd (#64933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64933
Fixes https://github.com/facebookresearch/functorch/issues/108
This is a short-term fix. A longer-term fix would be to either:
1. have proper {select,slice,diagonal}_embed functions
2. have efficient {select,slice,diagonal}_scatter functions (and
efficient zero tensors).
NB: I didn't use diag_embed because diag_embed is slightly different
from diagonal_backward.
There are no BC concerns because TorchScript (luckily) does not
serialize the backwards graph.
Test Plan:
- run tests
- run benchmarks.
https://gist.github.com/zou3519/e7c0774d1ac97f32aa02ec44d81e60e1
Surprisingly the instruction count goes down. This is probably because
we create fewer autograd nodes now.
Reviewed By: ezyang
Differential Revision: D30909333
Pulled By: zou3519
fbshipit-source-id: 3b33e13010ba13b4d487b346aa9bee8a0e8c378c
Yukio Siraichi [Tue, 14 Sep 2021 14:55:13 +0000 (07:55 -0700)]
Replace composite dispatch with `CompositeExplicitAutograd` (#64641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64641
`sum`, `mean`, and `norm` were ported to structured kernels in #61642, #61643, and #62711,
respectively. Those PRs changed related overloads into composite kernels. However, their
dispatch section remained the same, when it really should be marked as
`CompositeExplicitAutograd`. This PR fixes this issue.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D30867122
Pulled By: ezyang
fbshipit-source-id: b951aee41a3cab9ca546df826a285d60013e3b3a
Edward Yang [Tue, 14 Sep 2021 13:07:52 +0000 (06:07 -0700)]
Revert D30711934: [pytorch][PR] Use RDS for build size tracking
Test Plan: revert-hammer
Differential Revision: D30711934 (https://github.com/pytorch/pytorch/commit/1cd0252eed8ddb26e4599ef2b0fec4d8843b8828)
Original commit changeset: 0af808ddf528
fbshipit-source-id: 6f67ed5cbaf333cc55729be2a23e385772e31b10
Mikhail Zolotukhin [Tue, 14 Sep 2021 07:19:57 +0000 (00:19 -0700)]
[TensorExpr] Remove 'Placeholder' class. (#64887)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64887
BufHandle has exactly the same functionality and should be used instead.
Differential Revision: D30889483
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id: 365fe8e396731b88920535a3de96bd3301aaa3f3
Mikhail Zolotukhin [Tue, 14 Sep 2021 07:19:57 +0000 (00:19 -0700)]
[TensorExpr] PyBinds: improve QoL of pybind users. (#64886)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64886
Bind methods for implicit conversions and constructors to avoid
boilerplate code.
Differential Revision: D30889193
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Pulled By: ZolotukhinM
fbshipit-source-id: 137c0c98f7f1576e1bb97c8de8a900b28407a30e
Peter Bell [Tue, 14 Sep 2021 06:15:10 +0000 (23:15 -0700)]
Fix use of deprecated tensor.type() in SegmentReduce.cpp (#64151)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64151
Reviewed By: mruberry
Differential Revision: D30917268
Pulled By: ngimel
fbshipit-source-id: 63427372b651ac495d48ef552eba5fbf0e4378e9
Supriya Rao [Tue, 14 Sep 2021 05:21:05 +0000 (22:21 -0700)]
[quant] handle empty input in fused_moving_avg_obs_fake_quant op (#64829)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64829
If an empty input is passed in, the aminmax operator fails with a runtime error like
```
RuntimeError: aminmax(): cannot compute aminmax over an empty dimension as the operation has no identity.
```
To avoid this during training, we simply return the input if it is empty.
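A minimal sketch of the guard described above, using the public `torch.aminmax` for illustration (not the actual kernel code):
```python
import torch

x = torch.empty(0)  # empty input tensor
# aminmax has no identity over an empty dimension, so observing an empty
# input would raise; the op now passes such inputs through unchanged.
if x.numel() == 0:
    out = x
else:
    lo, hi = torch.aminmax(x)
```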
Test Plan:
python test/test_quantization.py TestFusedObsFakeQuant
Imported from OSS
Reviewed By: jingsh
Differential Revision: D30870879
fbshipit-source-id: 0cb4b187449a45a37150a77510d2292f93a7d1cd
Ivan Yashchuk [Tue, 14 Sep 2021 04:13:56 +0000 (21:13 -0700)]
Add forward AD for torch.linalg.eigh (#62163)
Summary:
This PR adds forward mode differentiation for `torch.linalg.eigh` and a few other functions required for tests to pass.
For some reason running tests for `torch.linalg.eigvalsh` and complex `torch.linalg.eigh` hangs. These tests are skipped for now.
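For reference, a minimal usage sketch of forward-mode AD through `torch.linalg.eigh` via the `torch.autograd.forward_ad` API:
```python
import torch
import torch.autograd.forward_ad as fwAD

A = torch.randn(3, 3, dtype=torch.float64)
A = A + A.t()    # eigh expects a symmetric (Hermitian) matrix
dA = torch.randn(3, 3, dtype=torch.float64)
dA = dA + dA.t()

with fwAD.dual_level():
    dual_A = fwAD.make_dual(A, dA)             # primal plus tangent
    w, V = torch.linalg.eigh(dual_A)
    w_primal, w_tangent = fwAD.unpack_dual(w)  # tangent holds the JVP of the eigenvalues
```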
cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry heitorschueroff walterddr IvanYashchuk xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62163
Reviewed By: jbschlosser
Differential Revision: D30903988
Pulled By: albanD
fbshipit-source-id: d6a74adb9e6d2f4be8ac707848ecabf06d629823
Natalia Gimelshein [Tue, 14 Sep 2021 03:34:57 +0000 (20:34 -0700)]
[THC] remove TensorTypeUtils and TensorInfo (#64965)
Summary:
per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64965
Reviewed By: mruberry
Differential Revision: D30916754
Pulled By: ngimel
fbshipit-source-id: b24020d6a7ce8a05a5ab6c579d176dd94dd3b1d7
Xiang Gao [Tue, 14 Sep 2021 02:49:33 +0000 (19:49 -0700)]
EmbeddingBag sort thrust->cub (#64498)
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/57505
Also fixes a warning I found when compiling:
```
/home/gaoxiang/pytorch-cub/torch/csrc/distributed/c10d/quantization/quantization_gpu.cu(7): warning: inline qualifier ignored for "__global__" function
```
I also updated the bfloat16 guard to CUDA 11.5
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64498
Reviewed By: mruberry
Differential Revision: D30917077
Pulled By: ngimel
fbshipit-source-id: fb9df08fd469038478a563014b5af7452b4b28c0
Chiang, Yu-Hsun (oToToT) [Tue, 14 Sep 2021 01:59:13 +0000 (18:59 -0700)]
Speed up torch.unique_consecutive() (#64835)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62690
Like the way `unique_consecutive_cpu_template` is implemented, this PR reimplements `_unique_dim_cpu_impl` to get better performance.
Also, because the overhead of `unique_dim_consecutive_cpu` is quite large, we directly call `unique_consecutive_cpu_template` when we know the given input is a 1d-array.
## Benchmark
### Script
```python
import torch
import time
torch.manual_seed(0)
t = torch.randint(500, (10000000, ))
t = torch.sort(t)[0]
start = time.time()
uniques, inverse, counts = torch.unique_consecutive(t, dim=0, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive(dim=0) time:", end - start)
start = time.time()
uniques2, inverse2, counts2 = torch.unique_consecutive(t, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive() time:", end - start)
t = torch.randint(500, (10000000, 2))
t = torch.sort(t)[0]
start = time.time()
uniques, inverse, counts = torch.unique_consecutive(t, dim=0, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive(dim=0) time:", end - start)
start = time.time()
uniques, inverse, counts = torch.unique_consecutive(t, dim=1, return_inverse=True, return_counts=True)
end = time.time()
print("torch.unique_consecutive(dim=1) time:", end - start)
```
### Before
```
torch.unique_consecutive(dim=0) time: 78.64345622062683
torch.unique_consecutive() time: 0.029544353485107422
torch.unique_consecutive(dim=0) time: 91.49796152114868
torch.unique_consecutive(dim=1) time: 0.30872368812561035
```
### After
```
torch.unique_consecutive(dim=0) time: 0.08256125450134277
torch.unique_consecutive() time: 0.08162403106689453
torch.unique_consecutive(dim=0) time: 35.58408498764038
torch.unique_consecutive(dim=1) time: 1.6258199214935303
```
## System Information
```
Collecting environment information...
PyTorch version: 1.10.0a0+git7f1932e
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.16.3
Libc version: glibc-2.31
Python version: 3.8.10 (default, Jun 2 2021, 10:49:15) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.11.0-34-generic-x86_64-with-glibc2.29
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.10.0a0+gitbe09195
[conda] Could not collect
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64835
Reviewed By: jbschlosser
Differential Revision: D30894906
Pulled By: ngimel
fbshipit-source-id: 42ab76d638391ce6c4e589d9c71bdf7579310ad9
Vitaly Fedyunin [Tue, 14 Sep 2021 01:48:48 +0000 (18:48 -0700)]
[WIP] Example of DataPipes and DataFrames integration (#60840)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60840
Test Plan: Imported from OSS
Reviewed By: wenleix, ejguan
Differential Revision: D29461080
Pulled By: VitalyFedyunin
fbshipit-source-id: 4909394dcd39e97ee49b699fda542b311b7e0d82
driazati [Tue, 14 Sep 2021 01:34:40 +0000 (18:34 -0700)]
Re-land Fix test report uploading (#64958)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64958
This is a re-do of #64846, which was missing a path prefix for Windows test reports.
Test Plan: Imported from OSS
Reviewed By: seemethere
Differential Revision: D30915253
Pulled By: driazati
fbshipit-source-id: d14d0a64d2f8aabc335db9c4d0d2b63512887c66
Tao Xu [Tue, 14 Sep 2021 01:14:32 +0000 (18:14 -0700)]
[iOS][OSS][BE] Add Simulator tests for full JIT (#64851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64851
ghstack-source-id: 137970229
Test Plan: CircleCI
Reviewed By: hanton, cccclai
Differential Revision: D30877963
fbshipit-source-id: 7bb8ade1959b85c3902ba9dc0660cdac8f558d64
Emad El-Haraty [Tue, 14 Sep 2021 00:59:11 +0000 (17:59 -0700)]
add acc_ops.max, acc_ops.maximum, consolidate acc_ops.min and acc_ops.minimum
Summary:
This diff adds `acc_ops.max` and `acc_ops.maximum` support.
It further consolidates the logic for `acc_ops.min` and `acc_ops.minimum` to match the logic for max.
torch.max has three behaviors:
```
1. max(input)
2. max(input, dim, keepdim=False, *, out=None)
3. max(input, other, *, out=None)
```
Likewise, `torch.min` has three identical behaviors.
I've chosen to implement each as an acc_op, then map to the appropriate one.
The third max function is effectively `torch.maximum`, so I've implemented it as that.
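For illustration, the three behaviors side by side:
```python
import torch

t = torch.tensor([[1., 5.], [3., 2.]])

m = torch.max(t)                             # 1) reduce over all elements -> tensor(5.)
vals, idx = torch.max(t, dim=1)              # 2) reduce along a dim -> (values, indices)
ew = torch.max(t, torch.full_like(t, 2.5))   # 3) elementwise, same as torch.maximum
```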
Reviewed By: yinghai, jfix71, 842974287
Differential Revision: D30551464
fbshipit-source-id: 0a2eec10e5185cbf7d9984eec3fd399b23528b2a
CaoE [Tue, 14 Sep 2021 00:58:20 +0000 (17:58 -0700)]
Add BFloat16 support for cross, tril, triu, tril_indices, triu_indices and cumsum operators on CPU (#62454)
Summary:
Add BFloat16 support for cross, tril, triu, tril_indices, triu_indices and cumsum operators on CPU.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62454
Reviewed By: albanD
Differential Revision: D30845805
Pulled By: heitorschueroff
fbshipit-source-id: f83836862e38109ec929e83567133e9e88096b8b
David Riazati [Tue, 14 Sep 2021 00:47:18 +0000 (17:47 -0700)]
Use RDS for build size tracking (#64303)
Summary:
This adds 2 utilities: `register_rds_table` and `rds_write`. `register_rds_table` needs to be called once with the schema for the data that `rds_write` will write. These go to a lambda called `rds-proxy`, which will write to/read from the DB as necessary. This data can then be arbitrarily queried via `rds-proxy` (for use in CI) or on metrics.pytorch.org (for analysis).
It also hooks these up for build size tracking (which previously was not working on GHA)
TODO:
* verify output in logs + clean up prints
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64303
Reviewed By: malfet, seemethere
Differential Revision: D30711934
Pulled By: driazati
fbshipit-source-id: 0af808ddf528a24875a378caeb1aa9cb0693f802
Nikita Shulga [Tue, 14 Sep 2021 00:10:30 +0000 (17:10 -0700)]
Add `skipIfTBB` decorator (#64942)
Summary:
And replace two existing usages in the codebase with it
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64942
Reviewed By: jbschlosser
Differential Revision: D30906382
Pulled By: malfet
fbshipit-source-id: e7f20f53aff734b0379eded361255543dab4fa4b
Victor Quach [Mon, 13 Sep 2021 23:39:55 +0000 (16:39 -0700)]
Raise TypeError on assigned grad with wrong type (#64876)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64813
Raises a TypeError when the value assigned to `.grad` is not a Tensor or None.
Adds tests.
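A small sketch of the new behavior:
```python
import torch

x = torch.randn(3, requires_grad=True)
x.grad = torch.zeros(3)      # OK: a Tensor
x.grad = None                # OK: None
try:
    x.grad = "not a tensor"  # now raises TypeError instead of being accepted
except TypeError as e:
    print(e)
```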
cc ezyang gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64876
Reviewed By: anjali411
Differential Revision: D30901678
Pulled By: soulitzer
fbshipit-source-id: dbb3cb5fd0bbac6918e0b2e2f51d340daa43dee0
Natalia Gimelshein [Mon, 13 Sep 2021 23:31:07 +0000 (16:31 -0700)]
kill SkipInfo (#64878)
Summary:
Per offline discussion, replaces SkipInfo with DecorateInfo. SkipInfo class itself is not removed yet to give functorch time to replace its SkipInfos.
cc zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64878
Reviewed By: mruberry
Differential Revision: D30908052
Pulled By: ngimel
fbshipit-source-id: 5124180b25c6e32517722883b9f3a2b488e3fe20
Shirong Wu [Mon, 13 Sep 2021 22:53:20 +0000 (15:53 -0700)]
Fix TRTOperatorSupport (#64873)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64873
Fix TRTOperatorSupport's key naming to match the keys generated by torch.fx.passes.tools_common.get_node_target. `get_node_target` is used by `splitter_base` to check, by name, whether an operator is supported.
Test Plan:
Print out the supported operator dict and check the names.
Run TRTSplitter with lrm_split_model_generator and verify the split result is correct with all supported operators printed.
Current split result:
```
Supported node types in the model:
acc_ops.size: ((), {'input': torch.float32})
acc_ops.getitem: ((), {'input': torch.float32})
acc_ops.getitem: ((), {'input': None})
acc_ops.reshape: ((), {'input': torch.float32})
acc_ops.unsqueeze: ((), {'input': torch.float32})
acc_ops.linear: ((), {'input': torch.float32, 'weight': torch.float32})
acc_ops.linear: ((), {'input': torch.float32, 'weight': torch.float32, 'bias': torch.float32})
acc_ops.mul: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.cat: ((), {})
acc_ops.add: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.add: ((), {'input': torch.float32})
acc_ops.tanh: ((), {'input': torch.float32})
acc_ops.transpose: ((), {'input': torch.float32})
acc_ops.matmul: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.div: ((), {'input': torch.float32, 'other': torch.float32})
acc_ops.squeeze: ((), {'input': torch.float32})
acc_ops.noop: ((), {'input': torch.float32})
acc_ops.layer_norm: ((), {'input': torch.float32, 'weight': torch.float32, 'bias': torch.float32})
acc_ops.permute: ((), {'input': torch.float32})
acc_ops.sigmoid: ((), {'input': torch.float32})
acc_ops.flatten: ((), {'input': torch.float32})
acc_ops.softmax: ((), {'input': torch.float32})
acc_ops.sum: ((), {'input': torch.float32})
Unsupported node types in the model:
torch.ops.fb.pad_sequence_embeddings: ((), {'embeddings': torch.float32, 'offsets': torch.int32})
acc_ops.linalg_norm: ((), {'input': torch
```
Reviewed By: yinghai
Differential Revision: D30884463
fbshipit-source-id: 22442aa6a69cd148ce9bc8be8f62157dd6d19954
Eli Uriegas [Mon, 13 Sep 2021 22:21:51 +0000 (15:21 -0700)]
Revert D30878101: [pytorch][PR] Fix test report uploading
Test Plan: revert-hammer
Differential Revision: D30878101 (https://github.com/pytorch/pytorch/commit/fba40bfc1ab45b4410504ec64b585c4df74b6f47)
Original commit changeset: 0730f17fa3f4
fbshipit-source-id: dad89e68b4daf656dd0b592bc9b2758f00af38c6
Vasiliy Kuznetsov [Mon, 13 Sep 2021 22:20:44 +0000 (15:20 -0700)]
torch.ao migration: fake_quantize.py, phase 1 (#64814)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64814
1. move the file
```
hg mv caffe2/torch/quantization/fake_quantize.py caffe2/torch/ao/quantization/
```
2. create a new file in the old location and copy the imports (see the sketch after this list)
3. fix all callsites inside `torch`
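A minimal sketch of the forwarding shim created in step 2 (illustrative; the real file may differ):
```python
# torch/quantization/fake_quantize.py -- kept in the old location so that
# existing `torch.quantization.fake_quantize` imports keep working; it simply
# re-exports everything from the new torch.ao home.
from torch.ao.quantization.fake_quantize import *  # noqa: F401,F403
```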
Test Plan:
```
buck test mode/dev //caffe2/test:quantization
```
Reviewed By: z-a-f
Differential Revision: D30866792
fbshipit-source-id: 7a221cb46c0ab01f1c5de9be061f09ecc83ce23e
Scott Wolchok [Mon, 13 Sep 2021 21:31:36 +0000 (14:31 -0700)]
[PyTorch] Reduce heap allocations in OperatorName::setNamespaceIfNotSet (#64673)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64673
We are now guaranteed to allocate at most one time in this function.
ghstack-source-id: 137786392
Test Plan: Previous diff adds test coverage for this function.
Reviewed By: dhruvbird
Differential Revision: D30813014
fbshipit-source-id: 17d844a1cc8c30574afcc6b0b41b219e62c0b723
Scott Wolchok [Mon, 13 Sep 2021 21:31:36 +0000 (14:31 -0700)]
[PyTorch] Add test for operator_name (#64672)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64672
Just a small struct missing test coverage. Next diff changes it.
ghstack-source-id: 137786388
Test Plan: CI
Reviewed By: dhruvbird
Differential Revision: D30813013
fbshipit-source-id: 05f39494bb9512a71a928bfe6fcfa710016bdf61
Emad El-Haraty [Mon, 13 Sep 2021 21:22:53 +0000 (14:22 -0700)]
handle the case in acc_ops.sum when dim == 0, differentiating it from the case when dim is None (#64869)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64869
handle the case in acc_ops.sum when dim == 0, differentiating it from the case when dim is None
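An illustration of the distinction using `torch.sum` (the `acc_ops` variant mirrors these semantics); the falsy-zero pitfall noted in the comment is an assumed cause, not taken from the diff:
```python
import torch

t = torch.ones(2, 3)
print(torch.sum(t))         # dim is None: full reduction -> tensor(6.)
print(torch.sum(t, dim=0))  # dim == 0: reduce over rows only -> tensor([2., 2., 2.])

# A check like `if dim:` wrongly treats dim == 0 the same as dim is None,
# since 0 is falsy; `if dim is not None:` keeps the two cases distinct.
```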
Reviewed By: 842974287
Differential Revision: D30872739
fbshipit-source-id: 2755d3230804a16ef1c9289f804138c6dd7766b3
XiaobingSuper [Mon, 13 Sep 2021 20:21:23 +0000 (13:21 -0700)]
fix build error when system cmake3 version >=3.5 but <=3.10 (#64914)
Summary:
For PyTorch source builds using conda, an error is raised at https://github.com/pytorch/pytorch/blob/8535418a06d75025541370cc656a8b6a0330ca0d/CMakeLists.txt#L1 when the CMake version is < 3.10. It can be fixed by upgrading CMake in the conda env, but CentOS also ships cmake3. PyTorch first checks whether cmake3's version is >= 3.5, so if the user's system cmake3 is >= 3.5 (but < 3.10), PyTorch will use the system's cmake3, which will hit a build error like:
```
CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
CMake 3.10 or higher is required. You are running version 3.6.3
-- Configuring incomplete, errors occurred!
```
We need to check that cmake3 is also >= 3.10; if not, then check conda's CMake version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64914
Reviewed By: jbschlosser
Differential Revision: D30901673
Pulled By: ezyang
fbshipit-source-id: 064e2c5bc0b9331d6ecd65cd700e5a42c3403790
driazati [Mon, 13 Sep 2021 20:21:09 +0000 (13:21 -0700)]
Fix test report uploading (#64846)
Summary:
Previously we just weren't uploading Windows test report XML files to S3, only to GitHub Actions. This was different from Linux, where we use both (though maybe we can kill the GHA upload in a follow-up PR since I don't think it's very useful anymore). This factors it all out into a macro so they both do the same thing. This also fixes the naming of uploaded files to include info about the job name (the full config, so they can be matched to the job visually or by the included job id).
See https://hud.pytorch.org/pr/64846 for results
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64846
Reviewed By: seemethere
Differential Revision: D30878101
Pulled By: driazati
fbshipit-source-id: 0730f17fa3f46a32c131f52669084c3103b0e616
Nikita Shulga [Mon, 13 Sep 2021 19:46:11 +0000 (12:46 -0700)]
Pin SciPy to 1.6.3 on Mac (take 2) (#64922)
Summary:
It's already pinned via docker install on Linux.
`scipy.stats.`[`poisson`|`geom`|`binom`] returns quite different results between 1.6.x and 1.7+ versions of SciPy, which results in several distribution tests failing accuracy thresholds.
Reland of https://github.com/pytorch/pytorch/pull/64844 but limited to just the Mac platform.
A follow-up PR for Windows is coming as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64922
Reviewed By: janeyx99
Differential Revision: D30901257
Pulled By: malfet
fbshipit-source-id: 0543e7bae9d3bbeb8b6be7b3ecf605880f97665f
Don Jang [Mon, 13 Sep 2021 19:41:50 +0000 (12:41 -0700)]
[Deploy] Avoid use-after-free during autograd shutdown (#64620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64620
`autograd` extension module's shutdown logic destructs `PyThreadState` by `pybind11::gil_scoped_acquire` using the RAII pattern.
The problem is that torch.deploy also destructs `PyThreadState` as part of its shutdown process (https://www.internalfb.com/phabricator/paste/view/P456363738), causing a double destruction and use-after-free.
This change adds `defined(USE_DEPLOY)` as a special case, alongside the existing special treatment for `IS_PYTHON_3_9_PLUS`, to avoid destruction of `PyThreadState`.
Test Plan: Added `TorchpyTest.Autograd` unittest to ensure that torch.deploy can create multiple instances that use autograd without causing a crash.
Reviewed By: albanD
Differential Revision: D30779080
fbshipit-source-id: 4de3283cc2d394acc9b8141c17cacbfab5eea052
Jacob Szwejbka [Mon, 13 Sep 2021 17:54:08 +0000 (10:54 -0700)]
[Pytorch Edge] Quantized Ops Dtype Selective (#63680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63680
Quantized ops were not covered by dtype selectivity. Add the check, and adjust call sites to be constexpr-friendly.
Test Plan: CI (this covers all model unit tests); verified that segmentation (a model that uses some of these quant ops) still works on Instagram.
Reviewed By: dhruvbird, raymondethan
Differential Revision: D30457626
fbshipit-source-id: 5ba850d2b53a18558dfbb1cfaa78d8f53b5dbad8
Edward Yang [Mon, 13 Sep 2021 17:53:06 +0000 (10:53 -0700)]
Disable more of the pragma warning stuff (#64899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64899
ghstack-source-id: 137882055
Test Plan: sandcastle, ossci
Reviewed By: malfet, ngimel
Differential Revision: D30893691
fbshipit-source-id: 67ec8cc9f212aa16a201771603236e429944b561
Scott Wolchok [Mon, 13 Sep 2021 17:48:55 +0000 (10:48 -0700)]
[PyTorch] Gate tls_local_dispatch_key_set off on iOS too (#64753)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64753
This may possibly be causing problems on iOS. (Maybe we should just revert inlining access to this thing? Really don't understand what's wrong with it, though.)
ghstack-source-id: 137830520
Test Plan: CI
Reviewed By: iseeyuan
Differential Revision: D30826897
fbshipit-source-id: 0438dee9d49e7601c26cdca0e8540229c777eddb
VertexC [Mon, 13 Sep 2021 17:46:30 +0000 (10:46 -0700)]
typo fix (#64615)
Summary:
Fixes #{issue number}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64615
Reviewed By: jbschlosser
Differential Revision: D30884298
Pulled By: ngimel
fbshipit-source-id: 230f9d06aa85abcdd69828a1ea0a83f36cbfcb17
kshitij12345 [Mon, 13 Sep 2021 17:44:04 +0000 (10:44 -0700)]
[nn] no batch dim support: CosineEmbeddingLoss (#64590)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/60585
TODO
* [x] Add tests
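For reference, a usage sketch of the newly supported unbatched call, with shapes assumed per the no-batch-dim convention (inputs of shape `(D,)` and a scalar target):
```python
import torch
import torch.nn as nn

loss = nn.CosineEmbeddingLoss()
x1 = torch.randn(8)        # unbatched input: shape (D,)
x2 = torch.randn(8)
target = torch.tensor(1.)  # scalar target for the unbatched case
out = loss(x1, x2, target)
```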
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64590
Reviewed By: H-Huang
Differential Revision: D30900775
Pulled By: jbschlosser
fbshipit-source-id: d24e72787017e79afbf8f04a94901a290485b81a
Rishi Puri [Mon, 13 Sep 2021 17:05:47 +0000 (10:05 -0700)]
Fixes failure in test_dataloader.py that occurs on jetson boards (#64757)
Summary:
CUDA IPC is not supported on Jetson boards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64757
Reviewed By: jbschlosser
Differential Revision: D30900593
Pulled By: ejguan
fbshipit-source-id: c6b2e8a9746276fdb4a009b6412e47cc8aac69f2
Jane Xu [Mon, 13 Sep 2021 17:04:58 +0000 (10:04 -0700)]
.github: Always run chown workspace (#64854)
Summary:
In some workflow runs, like https://github.com/pytorch/pytorch/runs/3568714658, the chown workspace step is duplicated.
Is that intentional? Unfortunately it is pretty necessary since (w/ docker) the folder can sometimes be in a broken permission state before and after we run jobs.
So this PR makes the second chown workspace run always because that's the true intention of the step.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64854
Reviewed By: jbschlosser, seemethere
Differential Revision: D30879289
Pulled By: janeyx99
fbshipit-source-id: 4157ff826c86e8c912deb1ba0cb5c47ea7596529
Eli Uriegas [Mon, 13 Sep 2021 16:50:50 +0000 (09:50 -0700)]
Reland .circleci: Skip cuda /cudnn install if existing (#64880)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64880
This reverts commit 5836a116d0de214d6d759e70671f23150a5deaba.
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D30885675
Pulled By: seemethere
fbshipit-source-id: 8c96584d5a632170e29f91c5daf0206680a78661
Supriya Rao [Mon, 13 Sep 2021 15:38:41 +0000 (08:38 -0700)]
torch.ao migration: quantize_jit.py phase1 (#64860)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64860
ghstack-source-id: 137885395
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: jerryzh168
Differential Revision: D30880574
fbshipit-source-id: 9629027dd3b00bb8d45633e1564fc03a866f8c31
Supriya Rao [Mon, 13 Sep 2021 15:38:41 +0000 (08:38 -0700)]
torch.ao migration: stubs.py phase 1 (#64861)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64861
1. move the file
```
hg mv caffe2/torch/quantization/stubs.py caffe2/torch/ao/quantization/
```
2. create a new file in the old location and copy the imports
3. fix all call sites inside `torch`
ghstack-source-id: 137885365
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: jerryzh168
Differential Revision: D30879678
fbshipit-source-id: a2d24f25d01064212aca15e94e8c78240ba48953
Jiayi Sun [Mon, 13 Sep 2021 14:59:00 +0000 (07:59 -0700)]
add BFloat16 operators on CPU: cummax, cummin (#63307)
Summary:
Fixes #{issue number}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63307
Reviewed By: nikithamalgifb
Differential Revision: D30342002
Pulled By: anjali411
fbshipit-source-id: eee6e640da996ef0e983960119608d9c12405336
Xiaoyu Zhang [Mon, 13 Sep 2021 14:18:38 +0000 (07:18 -0700)]
fix quantization.rst doc (#64802)
Summary:
As titled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64802
Reviewed By: jbschlosser
Differential Revision: D30887210
Pulled By: vkuzo
fbshipit-source-id: 0267883d3065d724ea654a28db78f5fe5702ef06
Eddie Ren [Mon, 13 Sep 2021 13:45:39 +0000 (06:45 -0700)]
ND Embeddings benchmark - Standardize randomized inputs (#64707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64707
Use torch.randn instead of torch.from_numpy to generate the tensor
Test Plan: buck run //caffe2/benchmarks/operator_benchmark/pt:qembedding_pack_test
Reviewed By: jingsh
Differential Revision: D30817302
fbshipit-source-id: 924c05517812b4b9f7df05a8999f9236cfe7b672
Heitor Schueroff [Mon, 13 Sep 2021 12:50:27 +0000 (05:50 -0700)]
Initial implementation of nanmean (#62671)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62671
Very crude first implementation of `torch.nanmean`. The current reduction kernels do not have good support for implementing nan* variants. Rather than implementing new kernels for each nan* operator, I will work on new reduction kernels with support for a `nan_policy` flag and then I will port `nanmean` to use that.
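A reference sketch of the full-reduction semantics (illustrative, not the actual kernel):
```python
import torch

def nanmean_ref(t):
    # Mean over non-NaN elements only.
    return t.nansum() / (~t.isnan()).sum()

x = torch.tensor([1.0, float('nan'), 3.0])
print(torch.nanmean(x), nanmean_ref(x))  # both tensor(2.)
```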
**TODO**
- [x] Fix autograd issue
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D30515181
Pulled By: heitorschueroff
fbshipit-source-id: 303004ebd7ac9cf963dc4f8e2553eaded5f013f0
Heitor Schueroff [Mon, 13 Sep 2021 03:04:19 +0000 (20:04 -0700)]
[Reland] Added reference tests to ReductionOpInfo (#64273)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64273
Reintroduced sample_inputs_prod and constrained the range of values for large reference tests.
This reverts commit e4fd2ab59ce8645f5ae9477c7724b6af82124b3b.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30672097
Pulled By: heitorschueroff
fbshipit-source-id: b44ed8dfd5eb0c74c194164dafc3242f6728a78f
Emilio Castillo [Mon, 13 Sep 2021 02:45:57 +0000 (19:45 -0700)]
Adds DLPack support (#57110)
Summary:
Partially Fixes https://github.com/pytorch/pytorch/issues/55090
Depends on https://github.com/pytorch/pytorch/issues/55365
Inspired by https://github.com/dmlc/dlpack/issues/57#issuecomment-774482973
Questions, in PyTorch we can't create streams or easily synchronize them from just an integer. Should we add an [`ExternalStream`](https://docs.cupy.dev/en/stable/reference/generated/cupy.cuda.ExternalStream.html) object like the one we have in CuPy?
TODO: Add tests
Would like some feedback as this design needs quite a few iterations
rgommers leofang
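For context, a round-trip through the capsule-based DLPack utilities that predate this PR:
```python
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

x = torch.arange(4)
capsule = to_dlpack(x)    # export as a DLPack capsule
y = from_dlpack(capsule)  # zero-copy import; shares memory with x
y[0] = 41
print(x[0])               # tensor(41)
```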
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57110
Reviewed By: saketh-are
Differential Revision: D30761481
Pulled By: mruberry
fbshipit-source-id: e85d78df3c1f8defc2a698878da89cd843cb1209
kshitij12345 [Mon, 13 Sep 2021 00:05:33 +0000 (17:05 -0700)]
[fix] fix test_python_dispatch with pytest (#64574)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62501
Another approach for fixing the same issue
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64574
Reviewed By: ngimel
Differential Revision: D30867237
Pulled By: ezyang
fbshipit-source-id: c632a1e0b241effdc21ae929abe42fccec88aa24
Nikita Shulga [Sun, 12 Sep 2021 22:55:21 +0000 (15:55 -0700)]
Revert D30876591: [pytorch][PR] Pin scipy to 1.6.3 on Windows and Mac
Test Plan: revert-hammer
Differential Revision: D30876591 (https://github.com/pytorch/pytorch/commit/39f2b9de2ac7fb14e4aaf61863e98d01a53bc875)
Original commit changeset: 4946e0922063
fbshipit-source-id: b8beff3d973b21fe09c158baef25344030f8fb08
Vasiliy Kuznetsov [Sun, 12 Sep 2021 18:59:44 +0000 (11:59 -0700)]
torch.ao migration: numeric suite, eager and fx (#64817)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64817
This migrates `torch.quantization._numeric_suite` to `torch.ao.ns._numeric_suite`, and `torch.quantization._numeric_suite_fx` to `torch.ao.ns._numeric_suite_fx`.
1. move the files
```
HG: move eager mode
hg mv caffe2/torch/quantization/_numeric_suite.py caffe2/torch/ao/ns/
HG: move fx
hg mv caffe2/torch/quantization/_numeric_suite_fx.py caffe2/torch/ao/ns/
hg mv caffe2/torch/quantization/ns/* caffe2/torch/ao/ns/fx/
```
2. create new versions of `_numeric_suite.py` and `_numeric_suite_fx.py` with
imports
3. update all FB callsites
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: z-a-f
Differential Revision: D30867538
fbshipit-source-id: 120ee830434ca490c1183a187a518eebcbbaf22c
Nikita Shulga [Sun, 12 Sep 2021 17:52:28 +0000 (10:52 -0700)]
Pin scipy to 1.6.3 on Windows and Mac (#64844)
Summary:
It's already pinned via docker install on Linux,
as `scipy.stats.`[`poisson`|`geom`|`binom`] returns quite different results in 1.7+ versions of SciPy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64844
Reviewed By: driazati
Differential Revision: D30876591
Pulled By: malfet
fbshipit-source-id: 4946e0922063e9ac320c218a0b089f73486466f7
Nikita Shulga [Sun, 12 Sep 2021 17:29:10 +0000 (10:29 -0700)]
Revert D30867266: [pytorch][PR] TST Adds gradcheck and gradgradcheck to module info
Test Plan: revert-hammer
Differential Revision: D30867266 (https://github.com/pytorch/pytorch/commit/67ebde56459557199b3c907b81b3c819f77500b9)
Original commit changeset: cbc073326151
fbshipit-source-id: 00234e01eafc45fb999f7c83a397f9d6b3e01e46
Martin Yuan [Sun, 12 Sep 2021 05:22:28 +0000 (22:22 -0700)]
[RFC] Modularize functions of parsing bytecode (#61862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61862
Modularize functions of parsing bytecode tables so that they can be used as needed in situations other than mobile lite interpreter.
* The decoupled functions are re-used by current lite interpreter loader.
* The bytecode can be serialized/deserialized from other formats.
* The decoupled functions have minimum dependencies on other PyTorch components.
Next:
Build a driver binary to include the parser and interpreter, but only has necessary dependency on other PyTorch components.
ghstack-source-id: 137867287
Test Plan:
As an example, a simple bytecode is parsed to a mobile function, and directly run in the added unit test, `RunTimeTest:ParseBytecode`. It contains basic control flow (if, else) and basic data orchestration (list construction).
CI
Reviewed By: larryliu0820
Differential Revision: D29798382
Pulled By: iseeyuan
fbshipit-source-id: 1c173a5f5d37097e3a97baec3f3e48e1eea1400f
Natalia Gimelshein [Sun, 12 Sep 2021 00:17:00 +0000 (17:17 -0700)]
Revert D30875977: [caffe2] [aten] Remove loose (unpaired) #pragma warning ( pop ) in TensorBase.h
Test Plan: revert-hammer
Differential Revision: D30875977 (https://github.com/pytorch/pytorch/commit/1f35d20a894bb07e27691332af4beb097142762f)
Original commit changeset: bd593feb5a75
fbshipit-source-id: 4c82dbc857fdb28e0240dacc1a0e607a76552bb4
Tao Xu [Sat, 11 Sep 2021 18:21:42 +0000 (11:21 -0700)]
[iOS][OSS][BE] Update XCode to use 12.5.1 (#64850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64850
ghstack-source-id: 137827895
Test Plan: CircleCI
Reviewed By: hanton
Differential Revision: D30877964
fbshipit-source-id: 803f2506a755b3815024704e6177c7826bc42de8
Tao Xu [Sat, 11 Sep 2021 18:21:42 +0000 (11:21 -0700)]
[iOS][OSS][BE] Remove unused files (#64849)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64849
ghstack-source-id: 137827893
Test Plan: CircleCI
Reviewed By: hanton
Differential Revision: D30877962
fbshipit-source-id: a76f7fe888b990ba6cad650f72be7f4a1e58a2f1
Mikhail Zolotukhin [Sat, 11 Sep 2021 17:21:42 +0000 (10:21 -0700)]
[TensorExpr] Move 2 graph passes from kernel.cpp to graph_opt.cpp (#64828)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64828
Also, make `removeUnusedSelfArgument` more consistent with other passes
by mutating the graph in-place rather than returning a copy.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D30870776
Pulled By: ZolotukhinM
fbshipit-source-id: 4873f01b013921143a5aa43746d655a2d8d620c9
Mikhail Zolotukhin [Sat, 11 Sep 2021 16:23:52 +0000 (09:23 -0700)]
[TensorExpr] Add debug logging (store/load tracing) to IREval. (#64848)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64848
Test Plan: Imported from OSS
Reviewed By: Chillee
Differential Revision: D30878278
Pulled By: ZolotukhinM
fbshipit-source-id: bd946075336ba2e9786602161c236a0ff8a5a011
Mikhail Zolotukhin [Sat, 11 Sep 2021 16:23:10 +0000 (09:23 -0700)]
[TensorExpr] LLVMCodegen: fix lowering for UInt->Float casts. (#64862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64862
Previously we were erroneously looking at the destination's signedness. This was
discovered when we tried to implement quantize/dequantize ops.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D30881696
Pulled By: ZolotukhinM
fbshipit-source-id: 34af842e5e52a3b6b5d2e70c4ef32f910a20341f
Elias Guestrin [Sat, 11 Sep 2021 07:41:51 +0000 (00:41 -0700)]
[caffe2] [aten] Remove loose (unpaired) #pragma warning ( pop ) in TensorBase.h (#64870)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64870
Remove loose (unpaired) #pragma warning ( pop ) in TensorBase.h
Issue started with D30728580 (https://github.com/pytorch/pytorch/commit/d701357d921ef167d42c125e65b6f7da6be3ad0f), was fixed with D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706), and brought back again with the reversion of D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706).
Reviewed By: H-Huang
Differential Revision: D30875977
fbshipit-source-id: bd593feb5a75245470e43ad568ebdd3f1738da7c
Jerry Zhang [Sat, 11 Sep 2021 05:24:10 +0000 (22:24 -0700)]
[quant][fx2trt] Add lowering support for reference linear/conv modules (#64368)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64368
Test Plan:
python torch/fx/experimental/fx2trt/example/quantized_resnet_test.py
Imported from OSS
Reviewed By: 842974287
Differential Revision: D30708738
fbshipit-source-id: 88142b7ce43ed96093597112dab03a2d277de993
Hui Guo [Sat, 11 Sep 2021 03:30:06 +0000 (20:30 -0700)]
[tensorexpr] Simplify x/100 -> 0 if x is a non-negative integer less than 100. (#64763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64763
Simplification pattern:
x/N -> 0; N is a constant positive integer and x is a for-loop index whose range is a subset of [0, N).
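A one-line sanity check of the invariant behind this pattern:
```python
# Every loop index x in [0, N) satisfies x // N == 0 for a positive constant N,
# which is exactly what the simplifier exploits.
N = 100
assert all(x // N == 0 for x in range(N))
```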
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30845854
Pulled By: huiguoo
fbshipit-source-id: 814d69ed4be05e57405c222183cc1c6c526721cd
Eli Uriegas [Sat, 11 Sep 2021 01:55:07 +0000 (18:55 -0700)]
Revert D30869803: .circleci: Skip cuda /cudnn install if existing
Test Plan: revert-hammer
Differential Revision: D30869803 (https://github.com/pytorch/pytorch/commit/717d267e191bcc1669acad21d87ffb70e6e89b90)
Original commit changeset: 9eb3bd20875d
fbshipit-source-id: bef8d0c693696307a3be7abd5331b7fa813d754a
Thomas J. Fan [Fri, 10 Sep 2021 23:25:21 +0000 (16:25 -0700)]
TST Adds gradcheck and gradgradcheck to module info (#64444)
Summary:
Follow up to https://github.com/pytorch/pytorch/issues/61935
cc albanD mruberry jbschlosser walterddr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64444
Reviewed By: ngimel
Differential Revision: D30867266
Pulled By: jbschlosser
fbshipit-source-id: cbc0733261517dbfcdd3415d969b9e802b62b7ac
Ansley Ussery [Fri, 10 Sep 2021 23:18:33 +0000 (16:18 -0700)]
Preserve types during empty container assignment (#58911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58911
Stack from [ghstack](https://github.com/ezyang/ghstack):
* __->__ #58911
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D30785623
Pulled By: ansley
fbshipit-source-id: 4e05d6369318974290fea02ad2bc148293c25090
Jane Xu [Fri, 10 Sep 2021 22:17:07 +0000 (15:17 -0700)]
Always upload stats to S3 (#64853)
Summary:
It's not very useful when stats are uploaded only when all the tests pass.
For example, for this failing run the stats were not uploaded to Scribe or S3: https://github.com/pytorch/pytorch/runs/3568714658
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64853
Reviewed By: seemethere
Differential Revision: D30878361
Pulled By: janeyx99
fbshipit-source-id: 19a4c520efdd5575785a3ffbc60e6c09456b9e0d
Kevin Tse [Fri, 10 Sep 2021 21:22:36 +0000 (14:22 -0700)]
[DataPipe] Remove ZipArchiveReader's dependency on FileLoader (#64786)
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* https://github.com/pytorch/pytorch/issues/64788
* __->__ https://github.com/pytorch/pytorch/issues/64786
This PR removes ZipArchiveReader's dependency on the FileLoader DataPipe by allowing it to use an IterDataPipe of path names as input rather than a tuple of path name and stream.
It also adds additional tests to ensure that the DataPipe is functioning properly when it is read multiple times or reset half way through reading.
The whole stack fixes issues related to unclosed buffer stream (see https://github.com/pytorch/pytorch/issues/64281).
cc VitalyFedyunin ejguan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64786
Reviewed By: ngimel
Differential Revision: D30870968
Pulled By: NivekT
fbshipit-source-id: 64b04d1697b99772f2fa20fc141668e6b8e18c41
Eli Uriegas [Fri, 10 Sep 2021 21:00:22 +0000 (14:00 -0700)]
.circleci: Skip cuda /cudnn install if existing (#64825)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64825
Rewrites this script to only install the CUDA tools if they are not already
pre-installed
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D30869803
Pulled By: seemethere
fbshipit-source-id: 9eb3bd20875df0f2b18f5314ac825dbdf91637b5
Ilqar Ramazanli [Fri, 10 Sep 2021 20:33:12 +0000 (13:33 -0700)]
[doc][hackathon] To add Adadelta Optimizer to the documentation (#63255)
Summary:
It has been discussed before that adding description of Optimization algorithms to PyTorch Core documentation may result in a nice Optimization research tutorial. In the following tracking issue we mentioned about all the necessary algorithms and links to the originally published paper https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding description of AdaDelta Algorithm to the documentation. For more details, we refer to the paper here https://arxiv.org/abs/1212.5701
<img width="654" alt="AdaDeltaalg" src="https://user-images.githubusercontent.com/
73658284/
132770544-
82ccf90a-1d54-4ad5-8fc4-
51c8dec63a12.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63255
Reviewed By: ngimel
Differential Revision: D30867589
Pulled By: iramazanli
fbshipit-source-id: 5ba602c20c724a4486bdd38b73e1b64c0e767bdc
Alban Desmaison [Fri, 10 Sep 2021 20:07:37 +0000 (13:07 -0700)]
Add more error checking in subclass creation (#64746)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64746
This extracts the error checking that used to be in the PR above.
We are not going to land the proposed fix there, but I think we want this error checking in right now, as these issues would otherwise lead to a memory leak and arbitrary memory read/write, respectively.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D30867569
Pulled By: albanD
fbshipit-source-id: bf468033fb8b49fcb26eed423f5fad82b4a46c56
Alban Desmaison [Fri, 10 Sep 2021 20:07:37 +0000 (13:07 -0700)]
Move THPVariable_NewWithVar around (#64550)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64550
Just moves a function around to make the next PR easier to read.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D30867570
Pulled By: albanD
fbshipit-source-id: 99ae925568ed29ca7fdea059762c21d430d4a204
Raghavan Raman [Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)]
[MicroBench] Added a log_vml version of the signed log1p kernel (#64205)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64205
The log_vml version of the micro-bench is over **2x** faster than the log1p version. Here are the perf numbers:
```
---------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
---------------------------------------------------------------------------------------------
SignedLog1pBench/ATen/10/1467 45915 ns 45908 ns 14506 GB/s=2.5564G/s
SignedLog1pBench/NNC/10/1467 40469 ns 40466 ns 17367 GB/s=2.9002G/s
SignedLog1pBench/NNCLogVml/10/1467 19560 ns 19559 ns 35902 GB/s=6.00016G/s
```
Thanks to bertmaher for pointing this out.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30644716
Pulled By: navahgar
fbshipit-source-id: ba2b32c79d4265cd48a2886b0c62d0e89ff69c19
Raghavan Raman [Fri, 10 Sep 2021 19:35:24 +0000 (12:35 -0700)]
[nnc] Added an implementation of sign op (#64033)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64033
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D30579197
Pulled By: navahgar
fbshipit-source-id: f9f7fa7f2ffa109cf4e441eb1af821b8b891d4d3
Eddie Ren [Fri, 10 Sep 2021 19:31:27 +0000 (12:31 -0700)]
Extend 2Dim embedding bag benchmarking to include 3Dim benchmarks (#64647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64647
Add support for benchmarking 8-bit quantizations of N-dim batched embeddings. Currently this only works for 3-dim embeddings and still requires thought on ramping up from 3-dim to N-dim.
Test Plan: ```buck run //caffe2/benchmarks/operator_benchmark/pt:qembedding_pack_test```
Reviewed By: jingsh
Differential Revision: D30770085
fbshipit-source-id: 26659020f3458991592065a05366bde0f060494e
Howard Huang [Fri, 10 Sep 2021 18:48:43 +0000 (11:48 -0700)]
Revert D30846958: [caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h
Test Plan: revert-hammer
Differential Revision: D30846958 (https://github.com/pytorch/pytorch/commit/40098f48a1a37a06a456fd642d908ca522295706)
Original commit changeset: 52a3fb66e426
fbshipit-source-id: 1d749f6981756f2169d6867538555a945cbb8ca6
Kevin Tse [Fri, 10 Sep 2021 18:00:01 +0000 (11:00 -0700)]
[DataPipe] fixing tests related fork() to remove warnings (#64827)
Summary:
There are two warnings produced by `test_fork_datapipe`. This PR addresses the issues raised by those warnings without impacting the test cases.
cc VitalyFedyunin ejguan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64827
Reviewed By: ejguan
Differential Revision: D30870528
Pulled By: NivekT
fbshipit-source-id: 580a001c6fa3ff6f8b04a7e5183e58861938204b
Hui Guo [Fri, 10 Sep 2021 16:59:25 +0000 (09:59 -0700)]
[tensorexpr] Add 'pre_alloc' argument in python API of tensorexpr kernel (#64718)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64718
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D30826582
Pulled By: huiguoo
fbshipit-source-id: 6c173c8964f2643039273cdc83e64fb02bb5f381
anjali411 [Fri, 10 Sep 2021 16:55:50 +0000 (09:55 -0700)]
Skip conjugate and negate fallback for view ops and their in-place versions (#64392)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64392
cc ezyang anjali411 dylanbespalko mruberry Lezcano nikitaved
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30866330
Pulled By: anjali411
fbshipit-source-id: 7b2f51486bf1d610ad2b1472306bab608ee69c37
Ilqar Ramazanli [Fri, 10 Sep 2021 16:47:38 +0000 (09:47 -0700)]
To add Rprop documentation (#63866)
Summary:
It has been discussed before that adding description of Optimization algorithms to PyTorch Core documentation may result in a nice Optimization research tutorial. In the following tracking issue we mentioned about all the necessary algorithms and links to the originally published paper https://github.com/pytorch/pytorch/issues/63236.
In this PR we are adding description of Rprop to the documentation. For more details, we refer to the paper http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.1417
<img width="657" alt="Rpropalg" src="https://user-images.githubusercontent.com/
73658284/
132750009-
a5ec059e-6d53-4c67-917b-
57174c8ca27b.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63866
Reviewed By: ngimel
Differential Revision: D30867590
Pulled By: iramazanli
fbshipit-source-id: 0d2d4ffc6c4d939290bbbaa84d2c6e901ed8b54a
Jeff Daily [Fri, 10 Sep 2021 16:36:26 +0000 (09:36 -0700)]
[ROCm] define C10_WARP_SIZE to warpSize HIP constant (#64302)
Summary:
warpSize is defined as a constexpr in HIP headers. It is incorrect to assume a warpSize of 64. This change fixes the C10_WARP_SIZE definition in torch sources similar to [how it was done in caffe2](https://github.com/pytorch/pytorch/blob/master/caffe2/utils/GpuDefs.cuh#L10-L14).
cc jeffdaily sunway513 jithunnair-amd ROCmSupport
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64302
Reviewed By: mrshenli
Differential Revision: D30785975
Pulled By: malfet
fbshipit-source-id: 68f8333182ad4d02bd0c8d02f1751a50bc5bafa7
Corey Levinson [Fri, 10 Sep 2021 16:35:06 +0000 (09:35 -0700)]
fix typo in torch/onnx/utils.py (#63396)
Summary:
fixes minor typo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63396
Reviewed By: pbelevich
Differential Revision: D30644295
Pulled By: SplitInfinity
fbshipit-source-id: c506f67383909aa2c0c7c533038446b4b3d76a3a
rui [Fri, 10 Sep 2021 15:28:45 +0000 (08:28 -0700)]
build: bump bazel to 4.2.1 (#64455)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64455
Reviewed By: saketh-are
Differential Revision: D30752580
Pulled By: malfet
fbshipit-source-id: 4f5cc6f820396348181c09463f7e5628b5f69471
Aswin John Mathews [Fri, 10 Sep 2021 15:05:21 +0000 (08:05 -0700)]
ROCm MIOpen NHWC Convolution support (#63617)
Summary:
- Added 2D-Convolution NHWC support
- on ROCm 4.3, with `PYTORCH_MIOPEN_SUGGEST_NHWC=1` flag
- May need to force MIOpen to search for solutions ( see examples below for flags )
**PYTORCH_MIOPEN_SUGGEST_NHWC Environment Flag**
MIOpen does not officially support NHWC yet, although convolution support has been added to tip-of-tree of MIOpen. This flag is intended to be a short-lived flag to explicitly turn on NHWC support until ROCm officially supports NHWC and performance is verified.
**Examples**
1. Example usage 1 : Run test on ROCm4.3
`PYTORCH_TEST_WITH_ROCM=1 PYTORCH_MIOPEN_SUGGEST_NHWC=1 MIOPEN_FIND_ENFORCE=4 MIOPEN_DEBUG_CONV_GEMM=0 MIOPEN_FIND_MODE=1 pytest test_nn.py -v -k "test_conv_cudnn_nhwc" `
2. Example usage 2: Run the following with `PYTORCH_MIOPEN_SUGGEST_NHWC=1` on ROCm4.3.
```
#!/usr/bin/env python3
import torch
model = torch.nn.Conv2d(8, 4, 3).cuda().half()
model = model.to(memory_format=torch.channels_last)
input = torch.randint(1, 10, (2, 8, 4, 4), dtype=torch.float32, requires_grad=True)
input = input.to(device="cuda", memory_format=torch.channels_last, dtype=torch.float16)
# should print True for is_contiguous(channels_last), and strides must match NHWC format
print(input.is_contiguous(memory_format=torch.channels_last), input.shape, input.stride() )
out = model(input)
# should print True for is_contiguous(channels_last), and strides must match NHWC format
print("Contiguous channel last :", out.is_contiguous(memory_format=torch.channels_last), " out shape :", out.shape, "out stride :", out.stride() )
```
See https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html for more examples.
cc jeffdaily sunway513 jithunnair-amd ROCmSupport
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63617
Reviewed By: saketh-are
Differential Revision: D30730800
Pulled By: ezyang
fbshipit-source-id: 61906a0f30be8299e6547d312ae6ac91cc7c3238
Shen Li [Fri, 10 Sep 2021 14:44:09 +0000 (07:44 -0700)]
Let all_reduce_coalesced and all_gather_coalesced return Future objects (#64722)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64722
`all_reduce_coalesced` and `all_gather_coalesced` are never publicly
released in our API docs. So, I would assume the blast radius to be small.
The motivation for this change is to allow implementing
`all_reduce_coalesced` and `all_gather_coalesced` by re-using `allreduce`
and `allgather` C++ cores and perform flatten and copy only on the Python
side. With that, we can then remove `all_reduce_coalesced` and
`all_gather_coalesced` from C++ ProcessGroup APIs. For the async mode,
the copy-back logic after the communication will need to be chained
as a callback on the returned Future and use the chained child Future
as the return value (otherwise, we will need to wrap the child Future
into another work handle). This PR tries to test if we can directly
return a Future without breaking tests and internal use cases. If yes,
it will make the consolidation a lot easier.
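For illustration, a minimal sketch of the flatten / async-communicate / copy-back pattern described above; the `_flatten_dense_tensors`/`_unflatten_dense_tensors` helpers and the exact Future payload are assumptions here, not this PR's code:
```
import torch.distributed as dist
from torch._utils import _flatten_dense_tensors, _unflatten_dense_tensors

def all_reduce_coalesced_sketch(tensors):
    # Flatten on the Python side and reuse the plain C++ allreduce core.
    flat = _flatten_dense_tensors(tensors)
    fut = dist.all_reduce(flat, async_op=True).get_future()

    def copy_back(fut):
        # Copy the reduced flat buffer back into the original tensors.
        reduced = fut.value()[0]
        for t, r in zip(tensors, _unflatten_dense_tensors(reduced, tensors)):
            t.copy_(r)
        return tensors

    # Return the chained child Future directly, instead of wrapping it
    # into another work handle.
    return fut.then(copy_back)
```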
cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse agolynski SciPioneer H-Huang mrzzd cbalioglu gcramer23
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision:
D30830994
Pulled By: mrshenli
fbshipit-source-id:
dcde0ed9245e9e8fee357b3588b07d540a4b6318
Nikita Vedeneev [Fri, 10 Sep 2021 14:17:30 +0000 (07:17 -0700)]
`torch.lu`: forward AD support (#64742)
Summary:
As per title.
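For context, a minimal usage sketch of what forward-AD support for `torch.lu` enables, using the `torch.autograd.forward_ad` dual-number API (illustrative only):
```
import torch
import torch.autograd.forward_ad as fwAD

A = torch.randn(3, 3)
tangent = torch.randn(3, 3)  # perturbation direction for the JVP

with fwAD.dual_level():
    dual_A = fwAD.make_dual(A, tangent)
    LU, pivots = torch.lu(dual_A)
    # The LU factors now carry a forward-mode tangent: the JVP of the
    # factorization with respect to the input perturbation.
    primal, jvp = fwAD.unpack_dual(LU)
```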
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64742
Reviewed By: H-Huang
Differential Revision:
D30841227
Pulled By: albanD
fbshipit-source-id:
dc4d043ab94358594adb110fbbbb60750c98262a
Jordan Fix [Fri, 10 Sep 2021 06:49:22 +0000 (23:49 -0700)]
[const_fold] Keep around node.meta for replaced folded ops (#64782)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64782
Previously, `get_attr` nodes that were added to the graph did not retain `node.meta` after folding. Add such support, and improve test coverage in general here.
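A hypothetical sketch of the pattern being fixed (names are illustrative, not the pass's actual code): when a folded op is replaced by a `get_attr` node, the original node's metadata is copied onto the replacement.
```
import torch.fx as fx

def replace_with_get_attr(graph: fx.Graph, old_node: fx.Node, attr_name: str) -> fx.Node:
    # Insert the replacement get_attr where the folded op used to be.
    with graph.inserting_before(old_node):
        new_node = graph.get_attr(attr_name)
    # Keep node.meta around after folding (the behavior added here).
    new_node.meta = old_node.meta.copy()
    old_node.replace_all_uses_with(new_node)
    graph.erase_node(old_node)
    return new_node
```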
Test Plan: Added test coverage.
Reviewed By: protonu
Differential Revision:
D30852704
fbshipit-source-id:
ece87a61c69b2e68982964c6adc4dde14dae12c7
Elias Guestrin [Fri, 10 Sep 2021 06:44:03 +0000 (23:44 -0700)]
[caffe2/aten] Remove loose #pragma warning ( pop ) in TensorBase.h (#64773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64773
Remove loose `#pragma warning ( pop )` in TensorBase.h.
Reviewed By: ezyang
Differential Revision:
D30846958
fbshipit-source-id:
52a3fb66e426bc16ef7bde2a13e26e8293969026
Shirong Wu [Fri, 10 Sep 2021 04:02:15 +0000 (21:02 -0700)]
Add TRTSplitter (#64762)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64762
Extract and format TRTSplitter from the fx2trt_example code. The current implementation is tentative and subject to change based on the feeds model lowering progress.
Test Plan:
Manual print of supported operators:
`{<class 'torch.nn.modules.activation.ReLU'>: None, <function relu at 0x7f9b1abd0790>: None, <class 'torch.nn.modules.activation.Sigmoid'>: None, <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>: None, <built-in method add of type object at 0x7f9b7f402498>: None, <built-in function add>: None, <built-in method add of PyCapsule object at 0x7f9b1a3dc690>: None, <built-in method add_relu of PyCapsule object at 0x7f9b1a34cf90>: None, <class 'torch.nn.modules.batchnorm.BatchNorm2d'>: None, <class 'torch.nn.quantized.modules.batchnorm.BatchNorm2d'>: None, <class 'torch.nn.modules.conv.Conv2d'>: None, <class 'torch.nn.quantized.modules.conv.Conv2d'>: None, <class 'torch.nn.intrinsic.quantized.modules.conv_relu.ConvReLU2d'>: None, <class 'torch.nn.modules.linear.Linear'>: None, <class 'torch.nn.quantized.modules.linear.Linear'>: None, <class 'torch.nn.modules.pooling.MaxPool2d'>: None, <built-in function mul>: None, <built-in method mul of type object at 0x7f9b7f402498>: None, <built-in method mul of PyCapsule object at 0x7f9b1a3dc6c0>: None, <built-in method flatten of type object at 0x7f9b7f402498>: None, <class 'torch.nn.quantized.modules.DeQuantize'>: None, <built-in method dequantize of type object at 0x7f9b7f402498>: None, 'dequantize': None, <class 'torch.nn.quantized.modules.Quantize'>: None, <built-in method quantize_per_tensor of type object at 0x7f9b7f402498>: None, <class 'torch.nn.modules.linear.Identity'>: None, <function conv2d at 0x7f9b1a1fe9d0>: None, <function flatten at 0x7f9b1a1f5ca0>: None, <function size at 0x7f9b1a1f5b80>: None, <function batch_norm at 0x7f9b1a1feaf0>: None, <function layer_norm at 0x7f9b1a1feb80>: None, <function softmax at 0x7f9b1a1f9550>: None, <function relu at 0x7f9b1a1fe040>: None, <function sin at 0x7f9b1a2030d0>: None, <function cos at 0x7f9b1a203160>: None, <function tan at 0x7f9b1a2031f0>: None, <function sinh at 0x7f9b1a1fe160>: None, <function cosh at 0x7f9b1a1fe280>: None, <function tanh at 0x7f9b1a1fe310>: None, <function asin at 0x7f9b1a1fe3a0>: None, <function acos at 0x7f9b1a1fe430>: None, <function atan at 0x7f9b1a1fe4c0>: None, <function exp at 0x7f9b1a1fe550>: None, <function log at 0x7f9b1a1fe5e0>: None, <function sqrt at 0x7f9b1a1fe670>: None, <function reciprocal at 0x7f9b1a1fe700>: None, <function abs at 0x7f9b1a1fe790>: None, <function neg at 0x7f9b1a1fe820>: None, <function floor at 0x7f9b1a1fe8b0>: None, <function ceil at 0x7f9b1a1fe940>: None, <function sum at 0x7f9b1a1f9c10>: None, <function max_pool2d at 0x7f9b1a1f5d30>: None, <function squeeze at 0x7f9b1a1f5c10>: None, <function add at 0x7f9b1a1f91f0>: None, <function sub at 0x7f9b1a1f9ca0>: None, <function div at 0x7f9b1a1f9dc0>: None, <function mul at 0x7f9b1a1f9d30>: None, <function pow at 0x7f9b1a1f9e50>: None, <function min_two_tensors_input at 0x7f9b1a1f9940>: None, <function unsqueeze at 0x7f9b1a1f9280>: None, <function topk at 0x7f9b1a203280>: None, <function adaptive_avg_pool2d at 0x7f9b1a1f5dc0>: None, <function avg_pool2d at 0x7f9b1a1f5e50>: None, <function reshape at 0x7f9b1a203550>: None, <function slice_tensor at 0x7f9b1a1fee50>: None, <function split at 0x7f9b1a1fec10>: None, <function linear at 0x7f9b1a1f51f0>: None, <function clamp at 0x7f9b1a1f93a0>: None, <function tuple_construct at 0x7f9b1a1fed30>: None, <function contiguous at 0x7f9b1a1f9430>: None, <function getitem at 0x7f9b1a203310>: None, <function cat at 0x7f9b1a1f9310>: None, <function transpose at 0x7f9b1a1f94c0>: None, <function matmul at 0x7f9b1a1f98b0>: None, <function sigmoid at 
0x7f9b1a1fe1f0>: None, <function permute at 0x7f9b1a1f9670>: None, <function quantize_per_tensor at 0x7f9b1a1f9b80>: None, <function dequantize at 0x7f9b1a1f99d0>: None, <function sign at 0x7f9b1a1f5ee0>: None}`
Reviewed By:
842974287
Differential Revision:
D30798047
fbshipit-source-id:
69076a550874425b7186fbbf2ecf03da4a99b42f
Scott Wolchok [Fri, 10 Sep 2021 01:53:36 +0000 (18:53 -0700)]
[PyTorch] Fix missing move in torch::jit::Lexer::next (#64653)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64653
Saves shared_ptr refcount inc/dec in SourceRange.
ghstack-source-id:
137608457
Test Plan: Profiled startup of the framework overheads benchmark from high_perf_models; self time spent in next() is way down.
Reviewed By: dhruvbird
Differential Revision:
D30739240
fbshipit-source-id:
ac455678c9d46e657b111d3788d4369983028674
Scott Wolchok [Fri, 10 Sep 2021 01:53:36 +0000 (18:53 -0700)]
[PyTorch] Use std::find in the JIT lexer (#64652)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64652
If nothing else, it is slightly clearer code.
ghstack-source-id:
137608456
Test Plan: CI
Reviewed By: dhruvbird
Differential Revision:
D30739239
fbshipit-source-id:
bc7917b59883ca4a33fc6916b4e422bad79cf04b
Mikhail Zolotukhin [Fri, 10 Sep 2021 01:48:17 +0000 (18:48 -0700)]
[TensorExpr] Simplify TE IR before applying any transformations. (#64717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64717
This also exposed several bugs, which are fixed in this PR.
Differential Revision:
D30826408
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id:
a67ec5739aceed9ffdf0d24f77eb3787cefe4560
Jerry Zhang [Fri, 10 Sep 2021 00:17:01 +0000 (17:17 -0700)]
[quant][fix] Fix quantization for sub_scalar (#64603)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64603
We'll insert an observer only when both the operator and the dtype are supported.
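A minimal repro-style sketch (an assumed shape of the scenario, not the actual test) of quantizing a graph that subtracts a scalar with FX graph mode quantization:
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    def forward(self, x):
        return x - 0.5  # sub with a scalar operand

m = M().eval()
qconfig_dict = {"": get_default_qconfig("fbgemm")}
# Observers should be inserted only when both the op and its dtype are
# supported; previously sub-with-scalar could trip this up.
prepared = prepare_fx(m, qconfig_dict)
prepared(torch.randn(2, 3))  # calibration pass
quantized = convert_fx(prepared)
```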
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_sub_scalar
Imported from OSS
Reviewed By: vkuzo
Differential Revision:
D30797025
fbshipit-source-id:
a77c21e2749405534fc245374cf33a0657a3d2c8