Karen Zhou [Tue, 24 Aug 2021 17:17:28 +0000 (10:17 -0700)]
[pruner] modify base pruner to prune bias by default (#63202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63202
By default, the pruner now also prunes biases, so that the whole output channel is removed. Users can set `also_prune_bias` to False when calling `prepare` if they don't want the bias to be pruned.
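To illustrate why the bias matters here, the following is a minimal sketch in plain Python lists, not the pruner's actual implementation. The helper names `prune_output_channels` and `linear` are hypothetical, and masking a channel to zero is an assumption about what "removed" means; only the `also_prune_bias` name comes from the change described above.

```python
def prune_output_channels(weight, bias, pruned, also_prune_bias=True):
    """Zero the rows of `weight` listed in `pruned`; optionally zero the matching bias.

    Hypothetical helper: `weight` is a list of rows (one per output channel),
    `bias` has one entry per output channel, `pruned` is a set of row indices.
    """
    new_weight = [
        [0.0] * len(row) if i in pruned else list(row)
        for i, row in enumerate(weight)
    ]
    new_bias = [
        0.0 if (also_prune_bias and i in pruned) else b
        for i, b in enumerate(bias)
    ]
    return new_weight, new_bias


def linear(weight, bias, x):
    """Plain matrix-vector product plus bias, one output per channel."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


weight = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -0.5]
x = [1.0, 1.0]

w_all, b_all = prune_output_channels(weight, bias, {1})
w_keep, b_keep = prune_output_channels(weight, bias, {1}, also_prune_bias=False)
```

In this toy setup, leaving the bias in place means the pruned channel still emits its bias value, so the channel is not fully silenced.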
ghstack-source-id:
136466671
Test Plan:
`buck test mode/dev-nosan //caffe2/test:ao -- TestBasePruner`
https://pxl.cl/1MV32
modify `fusion_tests` according to API change
`buck test mode/opt //scripts/kazhou:fusion_tests`
https://pxl.cl/1NbKz
Reviewed By: z-a-f
Differential Revision:
D30294494
fbshipit-source-id:
c84655648bee0035559195ca855b98fb7edaa134
Karen Zhou [Tue, 24 Aug 2021 17:17:28 +0000 (10:17 -0700)]
[pruner] amend base pruner API to match base sparsifier (#63178)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63178
Update base pruner API to match base sparsifier API as defined in
D28970960 / PR58955
Changes include:
- `enable_mask_update = True` in `__init__`
- `prepare` takes model and config instead of constructor
- convert functionality renamed to `squash_mask`, `convert` method call now raises Error
- `activation_handles` and `bias_handles` initialized in `_prepare` instead of constructor
ghstack-source-id:
136467595
Test Plan:
Function names updated according to changes
`buck test mode/dev-nosan //caffe2/test:ao -- TestBasePruner`
https://pxl.cl/1MTgH
TODO will need to modify `fbcode/scripts/kazhou/fusion_tests.py` to use new API
Reviewed By: z-a-f
Differential Revision:
D30287179
fbshipit-source-id:
d4727bea1873b500f2d4bb784db26d532bf26cce
Karen Zhou [Tue, 24 Aug 2021 17:17:28 +0000 (10:17 -0700)]
[pruner] refactor `ActivationReconstruction` forward hooks (#63158)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63158
Combined functionality for `ActivationReconstruction` for both Linear and Conv2d in one class. The only difference between the old classes was the size and indexing of the reconstructed tensor -- that logic can be generalized by iterating over the size of `output`.
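The generalization described above can be sketched without any tensor library. This is a rough illustration under stated assumptions, not the hook's real code: `reconstruct` and `zeros_like` are hypothetical helpers, tensors are modelled as batch-first nested lists, and the channel dimension is assumed to be dim 1 for both layer types.

```python
def zeros_like(value):
    """Return a nested list of zeros with the same shape as `value`."""
    if isinstance(value, list):
        return [zeros_like(item) for item in value]
    return 0.0


def reconstruct(output, kept_channels, num_channels):
    """Scatter the channels of `output` back to full width along dim 1.

    `output` is a batch-first nested list: (N, C) for a Linear-style layer or
    (N, C, H, W) for a Conv2d-style layer. Pruned channels are filled with zeros.
    Assumes at least one channel was kept.
    """
    result = []
    for sample in output:
        # One zero block per original channel, shaped like a kept channel.
        full = [zeros_like(sample[0]) for _ in range(num_channels)]
        for src, dst in enumerate(kept_channels):
            full[dst] = sample[src]
        result.append(full)
    return result


linear_out = reconstruct([[1.0, 2.0]], kept_channels=[0, 2], num_channels=3)
conv_out = reconstruct([[[[1.0]], [[2.0]]]], kept_channels=[1, 2], num_channels=3)
```

Because the trailing dimensions are handled recursively, the same function covers both the 2-D and the 4-D case, which is the point of merging the two classes.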
ghstack-source-id:
136467465
Test Plan:
`buck test mode/dev-nosan //caffe2/test:ao -- TestBasePruner`
https://pxl.cl/1MSSv
Reviewed By: raghuramank100
Differential Revision:
D30282765
fbshipit-source-id:
08a1e4e0650511019fff85cf52b41dd818b0c7f8
Mike Iovine [Tue, 24 Aug 2021 16:38:25 +0000 (09:38 -0700)]
[Static Runtime] Implement prim::VarStack out variant (#63579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63579
Provide a static runtime out variant implementation for the new op introduced in D30426232 (https://github.com/pytorch/pytorch/commit/1385f9fb12e6607c98d2d9d5edaaaab2bc07386f).
Test Plan: `buck test //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- IndividualOps_VarStack`
Reviewed By: navahgar
Differential Revision:
D30410525
fbshipit-source-id:
bc59a3d8ad23e3d94561ec2dca9cc20687dbadf8
Xiang Gao [Tue, 24 Aug 2021 16:24:50 +0000 (09:24 -0700)]
[Reland] Embedding thrust->cub migration (#63806)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63427
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63806
Reviewed By: bdhirsh
Differential Revision:
D30498255
Pulled By: ngimel
fbshipit-source-id:
78b7085a92a168cf0163f53dcb712bac922f5235
mingfeima [Tue, 24 Aug 2021 15:54:36 +0000 (08:54 -0700)]
optimize BFloat16 elemwise operators CPU: sigmoid, sigmoid_backward, tanh_backward, addcmul, addcdiv (#55221)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55221
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision:
D28836797
Pulled By: VitalyFedyunin
fbshipit-source-id:
6b79098c902ffe65d228668118ef36fb49bab800
yanbing-j [Tue, 24 Aug 2021 15:32:33 +0000 (08:32 -0700)]
Enable BFloat16 LeakyReLU and RReLU in CPU path (#61514)
Summary:
Enable and optimize BFloat16 LeakyReLU and RReLU in CPU path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61514
Reviewed By: ejguan
Differential Revision:
D30257612
Pulled By: VitalyFedyunin
fbshipit-source-id:
8cc0d1faacd02dcc9827af724a86d95b6952748f
Thomas J. Fan [Tue, 24 Aug 2021 15:26:21 +0000 (08:26 -0700)]
ENH Adds no_batch_dim for NLLLoss (#62651)
Summary:
Towards https://github.com/pytorch/pytorch/issues/60585
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62651
Reviewed By: VitalyFedyunin
Differential Revision:
D30303340
Pulled By: jbschlosser
fbshipit-source-id:
7ab478cf63bf6cd1f850cad5fd101e74a2cfe3f5
mingfeima [Tue, 24 Aug 2021 15:22:47 +0000 (08:22 -0700)]
fix batchnorm2d issue when input is non contiguous (#63392)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63392
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D30476317
Pulled By: VitalyFedyunin
fbshipit-source-id:
03055a0aec21cf2c029b6f32315da2b09cb722d0
Mike Iovine [Tue, 24 Aug 2021 15:19:38 +0000 (08:19 -0700)]
[JIT] Add variadic stack op (#63578)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63578
Added a new op `prim::VarStack` and a pass that transforms instances of `aten::stack(list, dim)` into `prim::VarStack(list[0], ..., list[n], dim)`. Also provided a JIT interpreter implementation.
Most of the implementation/tests are the same as `prim::VarConcat`.
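As a rough picture of the rewrite described above, here is a toy pass over a list of dict-based nodes. It is only a sketch: the dict layout and the function name `use_variadic_stack` are hypothetical and do not reflect the JIT IR API, and the sketch assumes the list argument is produced by a `prim::ListConstruct` node in the same graph.

```python
def use_variadic_stack(nodes):
    """Rewrite aten::stack(list, dim) into prim::VarStack(elem0, ..., elemN, dim).

    Each node is a dict with "op", "inputs" (value names) and "output" (value name).
    """
    producers = {node["output"]: node for node in nodes}
    rewritten = []
    for node in nodes:
        if node["op"] == "aten::stack":
            list_name, dim = node["inputs"]
            producer = producers.get(list_name)
            if producer is not None and producer["op"] == "prim::ListConstruct":
                # Inline the list elements as direct inputs, followed by dim.
                node = {
                    "op": "prim::VarStack",
                    "inputs": producer["inputs"] + [dim],
                    "output": node["output"],
                }
        rewritten.append(node)

    # Drop list constructions that no longer have any users.
    used = {name for node in rewritten for name in node["inputs"]}
    return [
        node
        for node in rewritten
        if not (node["op"] == "prim::ListConstruct" and node["output"] not in used)
    ]


graph = [
    {"op": "prim::ListConstruct", "inputs": ["a", "b"], "output": "l"},
    {"op": "aten::stack", "inputs": ["l", "dim"], "output": "out"},
]
optimized = use_variadic_stack(graph)
```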
Test Plan: `buck test caffe2/test/cpp/jit:jit -- TestStackOpt`
Reviewed By: navahgar
Differential Revision:
D30426232
fbshipit-source-id:
9829a7db6e0a5038c9b7528c43c25b0c221aa2ce
Rong Rong (AI Infra) [Tue, 24 Aug 2021 15:01:36 +0000 (08:01 -0700)]
[BE] add distributed run_test options (#63147)
Summary:
Currently distributed tests are mixed within test_python.
We would like to run the distributed tests as their own batch, so we need to split them out.
This adds an option to include/exclude distributed tests with CUSTOM_HANDLERS.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63147
Test Plan:
- run locally with the additional run_test.py options.
- CI
Dependency: found a bug in mpiexec test and need https://github.com/pytorch/pytorch/issues/63580 to fix it first.
Reviewed By: bdhirsh
Differential Revision:
D30496178
Pulled By: walterddr
fbshipit-source-id:
7903a57b619f2425028028f944211938823918a6
Alban Desmaison [Tue, 24 Aug 2021 14:20:56 +0000 (07:20 -0700)]
Revert D30388099: Add a common autograd TLS state
Test Plan: revert-hammer
Differential Revision:
D30388099 (https://github.com/pytorch/pytorch/commit/83d9bad44a1e1e6202103cd22e4dbd2bd3d7dae0)
Original commit changeset:
8e03f940150f
fbshipit-source-id:
f6d60fec66e8292f5268335bb8a3e7e1a662f23b
Thomas J. Fan [Tue, 24 Aug 2021 13:58:05 +0000 (06:58 -0700)]
ENH Adds no_batch_dim tests/docs for LPPool1d and Identity (#62190)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/60585
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62190
Reviewed By: ejguan
Differential Revision:
D29942385
Pulled By: jbschlosser
fbshipit-source-id:
00df6f6f01ad039631bb8679f8de94863aac7650
albanD [Tue, 24 Aug 2021 13:52:38 +0000 (06:52 -0700)]
Add a common autograd TLS state (#63114)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63114
This PR collapses the GradMode and InferenceMode thread local booleans into a single thread local uint8.
This helps reduce the number of thread local variable accesses done when we propagate ThreadLocalStates.
Note that this is even more beneficial as we will add a forward mode AD TLS (similar to GradMode) higher in this stack and this new structure should reduce the perf impact of adding this new TLS.
Here is the full benchmark result between master and the top of this stack: https://gist.github.com/albanD/e421101e9ed344e94999bef3a54bf0f3
tl;dr: gives a benefit in most cases. It is never detrimental.
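The idea of folding several thread-local booleans into one small integer can be sketched in Python as follows. This is an illustration only, not the C++ implementation: the flag names, the bit layout, and the default state (grad mode on, inference mode off) are all assumptions made for the example.

```python
import threading

# Hypothetical bit layout for the packed state.
GRAD_MODE = 0b01
INFERENCE_MODE = 0b10

_tls = threading.local()


def _state():
    # Assumed default for a fresh thread: grad mode on, inference mode off.
    return getattr(_tls, "state", GRAD_MODE)


def set_flag(flag, enabled):
    """Set or clear one bit of the packed per-thread state."""
    state = _state()
    _tls.state = (state | flag) if enabled else (state & ~flag & 0xFF)


def is_set(flag):
    """Read one bit of the packed per-thread state."""
    return bool(_state() & flag)


def snapshot():
    """Capture every flag with a single thread-local read."""
    return _state()
```

With this layout, propagating the state to another thread means copying one integer rather than reading each flag separately, which is the saving the summary describes.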
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D30388099
Pulled By: albanD
fbshipit-source-id:
8e03f940150ff063c2edd792733663413ae2f486
Marjan Fariborz [Tue, 24 Aug 2021 08:43:33 +0000 (01:43 -0700)]
Separating quantization test from distributed_test (#63058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63058
Dedicating separate tests for different quantization methods. Currently supporting FP16 method.
ghstack-source-id:
136499767
Test Plan: buck test mode/dev //caffe2/test/distributed/algorithms/quantization:quantization_gloo_fork -- name_of_the_test
Reviewed By: wanchaol
Differential Revision:
D30142580
fbshipit-source-id:
3aacec1a231a662067d2b48c001f0c69fefcdd60
Mikhail Zolotukhin [Tue, 24 Aug 2021 07:29:22 +0000 (00:29 -0700)]
[TensorExpr] Nuke KernelArena and KernelScope. (#63587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63587
Now that there are no classes using KernelArena for memory management we
can remove it.
Differential Revision:
D30429115
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id:
375f6f9294d27790645eeb7cb5a8e87047a57544
Mikhail Zolotukhin [Tue, 24 Aug 2021 07:29:22 +0000 (00:29 -0700)]
[TensorExpr] Make 'Tensor' a value type. (#63586)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63586
This is another commit in transition from KernelArena memory management.
Tensor is essentially just a pair of <BufPtr, StmtPtr> and we don't need
to dynamically allocate it at all - it's cheap to pass it by value, and
that's what we're switching to in this commit.
After this change nothing uses KernelScope/KernelArena and they can be
safely removed.
Differential Revision:
D30429114
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id:
f90b859cfe863692b7beffbe9bd0e4143df1e819
Mikhail Zolotukhin [Tue, 24 Aug 2021 07:29:22 +0000 (00:29 -0700)]
[TensorExpr] Switch Exprs and Stmt from kernel-arena to shared_ptr. (#63216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63216
Currently there are three classes managed by KernelArena: Expr, Stmt,
and Tensor (and derived classes). KernelArena has been a long standing
painpoint for NNC devs and we're moving away from that memory management
model to ref-count based memory model (using shared_ptr). This commit
switches Expr and Stmt to shared_ptr and is the biggest change in this
transition. Later commits will detach Tensor from KernelArena and kill
the arena + scope altogether.
Differential Revision:
D30353195
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id:
9575225ada3d0fb65087ae40435f3dfea4792cae
Mikhail Zolotukhin [Tue, 24 Aug 2021 07:29:22 +0000 (00:29 -0700)]
[TensorExpr] More NFC changes like Expr* -> ExprPtr. (#63778)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63778
This is a preparation for a switch from raw pointers to shared pointers
as a memory model for TE expressions and statements.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision:
D30487425
Pulled By: ZolotukhinM
fbshipit-source-id:
9cbe817b7d4e5fc2f150b29bb9b3bf578868f20c
mingfeima [Tue, 24 Aug 2021 05:53:35 +0000 (22:53 -0700)]
add channels last for GroupNorm (#49821)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49821
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D26007053
Pulled By: VitalyFedyunin
fbshipit-source-id:
34a48d5d3b66a159febf3c3d96748fbaba1b9e31
Jane Xu [Tue, 24 Aug 2021 01:44:46 +0000 (18:44 -0700)]
Add ROCm as a platform for which tests can be disabled (#63813)
Summary:
Realized we were missing ROCm as a platform on which one could disable a flaky test. (like how this issue specifies windows https://github.com/pytorch/pytorch/issues/61655)
cc jeffdaily sunway513 jithunnair-amd ROCmSupport
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63813
Reviewed By: seemethere
Differential Revision:
D30498478
Pulled By: janeyx99
fbshipit-source-id:
f1abe8677e1ddd01de3291e1618272ad8e287dc4
Mike Iovine [Tue, 24 Aug 2021 01:43:17 +0000 (18:43 -0700)]
[Static Runtime] SR clones graph input (#63704)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63704
Previously SR did not clone the graph. This was leading to subtle bugs in `testStaticRuntime`; static runtime would modify its graph, and the graph used by the JIT interpreter would change as well. The JIT interpreter would then crash if SR-only ops were added!
Cloning the graph is more consistent with the behavior of the `Module` ctor.
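The aliasing problem and its fix can be shown with a toy stand-in. This is a sketch under assumptions, not Static Runtime code: `ToyRuntime` is hypothetical, the graph is modelled as a plain list of op names, and `static_runtime::only_op` is a made-up op name.

```python
import copy


class ToyRuntime:
    """Stand-in for a runtime that rewrites the graph it is given."""

    def __init__(self, graph):
        # Clone on construction so later rewrites never touch the caller's graph.
        self.graph = copy.deepcopy(graph)

    def optimize(self):
        # Pretend optimization: add an op that only this runtime understands.
        self.graph.append("static_runtime::only_op")


shared_graph = ["aten::add", "aten::mul"]
runtime = ToyRuntime(shared_graph)
runtime.optimize()
```

Without the `deepcopy`, `shared_graph` would also gain the runtime-only op, which mirrors the crash scenario described above.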
Test Plan: `buck test caffe2/benchmarks/static_runtime/...`
Reviewed By: hlu1
Differential Revision:
D30463294
fbshipit-source-id:
b771551a1f55f95fde79373b23babcf3e5ddf726
Shiyan Deng [Tue, 24 Aug 2021 01:17:20 +0000 (18:17 -0700)]
[fx2trt] Add acc op and converter for torch.pow (#63795)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63795
att
Test Plan: buck run mode/opt caffe2/torch/fb/fx2trt:test_binary_ops
Reviewed By: jackm321, wushirong
Differential Revision:
D30492488
fbshipit-source-id:
6d615770567b13720316f06fd2f866ea2fdc2995
Vitaly Fedyunin [Tue, 24 Aug 2021 01:07:37 +0000 (18:07 -0700)]
Adding DataLoader2 class as future replacement of DataLoader (#63742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63742
Supports sharding and batching on loader level
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D30494506
Pulled By: VitalyFedyunin
fbshipit-source-id:
6648e09d955055ac38e3a4e3973f701acefca762
Rohan Varma [Tue, 24 Aug 2021 00:45:39 +0000 (17:45 -0700)]
[BE] Enable PostLocalSGD tests on windows (#63463)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63463
Now that `torch.distributed.optim` gates DistributedOptimizer on RPC availability, local sgd optimizer can be used on windows.
ghstack-source-id:
136437632
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision:
D30358922
fbshipit-source-id:
9b56aebf1075f026637296d338805ad8851c9d40
Rohan Varma [Tue, 24 Aug 2021 00:45:39 +0000 (17:45 -0700)]
[BE] Enable functional optim tests for windows (#63462)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63462
Now that `torch.distributed.optim` gates DistributedOptimizer on RPC availability, these tests can be run on windows.
ghstack-source-id:
136437635
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision:
D30358923
fbshipit-source-id:
36739bdfe7214789f17de652d30c62c2bc124c73
Shiyan Deng [Tue, 24 Aug 2021 00:41:38 +0000 (17:41 -0700)]
[fx_acc] Add mapper for torch.log1p (#63792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63792
Map `torch.log1p` to `acc_ops.add` + `acc_ops.log`.
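The decomposition relies on the identity log1p(x) = log(x + 1). A small numeric sketch using only the standard `math` module follows; the function name is hypothetical, and the two steps merely stand in for `acc_ops.add` and `acc_ops.log`.

```python
import math


def log1p_via_add_and_log(x):
    """Compute log1p(x) as an add followed by a log."""
    shifted = x + 1.0          # stands in for acc_ops.add
    return math.log(shifted)   # stands in for acc_ops.log


approx = log1p_via_add_and_log(0.5)
exact = math.log1p(0.5)
tiny = log1p_via_add_and_log(1e-17)
```

One trade-off worth keeping in mind: for inputs very close to zero, `x + 1.0` rounds to exactly 1.0 in floating point, so the decomposed form returns 0.0 where a dedicated `log1p` would keep the small value.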
Test Plan: buck test mode/opt glow/fb/fx/oss_acc_tracer:test_acc_tracer -- test_log1p
Reviewed By: wushirong
Differential Revision:
D30491706
fbshipit-source-id:
bcbeddf06131113185d2019cfd7cf5e9193a8a78
Peter Bell [Tue, 24 Aug 2021 00:39:50 +0000 (17:39 -0700)]
Fix pocketfft include path in mobile build (#63714)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63714
PocketFFT was disabled for CMake < 3.9 but CMake 3.11 is the first version to support `INCLUDE_DIRECTORIES` as a target property. So updating to CMake 3.10 causes the mobile builds to fail. Instead of limiting the CMake support, this just adds the include directory to the entire target.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision:
D30498369
Pulled By: malfet
fbshipit-source-id:
83372e29c477c97e7015763b7c29d6d7e456bcef
Peter Bell [Tue, 24 Aug 2021 00:39:45 +0000 (17:39 -0700)]
Simplify ccache instructions in CONTRIBUTING.md (#62549)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62549
When building CUDA files with native CMake support, it will respect the
`CMAKE_CUDA_COMPILER_LAUNCHER` setting. So, there's no need for symlinks.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision:
D30498488
Pulled By: malfet
fbshipit-source-id:
71c2ae9d4570cfac2a64d777bc95cda3764332a0
driazati [Tue, 24 Aug 2021 00:30:51 +0000 (17:30 -0700)]
Skip archiving useless build artifacts (#63785)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63785
We currently zip up everything in `build/` which includes a lot of cruft (`.o` files, random things copied in from dependencies, etc). This makes the artifact bigger (slower upload/download times, and takes about 1.5 minutes to archive). This change makes archiving instead take ~15 seconds and removes the 50 second upload to GitHub step that isn't as useful now that we have the HUD PR page that lists out all artifacts.
Test Plan: Imported from OSS
Reviewed By: seemethere, janeyx99
Differential Revision:
D30494444
Pulled By: driazati
fbshipit-source-id:
93202dba7387daeb4859a938110b02ff2dc2ccc4
Bert Maher [Tue, 24 Aug 2021 00:28:33 +0000 (17:28 -0700)]
Fix some memory bugs in onnx passes (#63754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63754
Running onnx tests with ASAN uncovers several memory errors. These two are caused by: (1) iterating the uses list of a node after mutation, and (2) accessing the `blocks` attribute of a possibly deleted node.
To reproduce (this is on a CentOS 7 box):
```
DEBUG=1 CFLAGS="-fsanitize=address" CXXFLAGS="-fsanitize=address" USE_LLVM=$(realpath ../llvm-project/install) CMAKE_PREFIX_PATH=$CONDA_PREFIX python setup.py install
LD_PRELOAD=$(realpath /lib64/libasan.so.5) numactl -C3 pytest -v --cov --cov-report xml:test/coverage.xml --cov-append onnx/test_pytorch_onnx_onnxruntime.py::TestONNXRuntime_opset11 -s
```
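The first class of bug, iterating a uses list while mutating it, has a rough Python analogue. The sketch below is illustrative only and uses hypothetical function names and plain integers in place of graph uses. In C++ this pattern typically invalidates iterators; in Python it silently skips elements instead, but the remedy of iterating over a snapshot is the same.

```python
def remove_even_unsafe(uses):
    """Mutates `uses` while iterating over it, so some elements are skipped."""
    for use in uses:
        if use % 2 == 0:
            uses.remove(use)
    return uses


def remove_even_safe(uses):
    """Iterates over a snapshot, so mutation cannot disturb the loop."""
    for use in list(uses):
        if use % 2 == 0:
            uses.remove(use)
    return uses


unsafe_result = remove_even_unsafe([2, 4, 5, 6])
safe_result = remove_even_safe([2, 4, 5, 6])
```

In the unsafe version the element 4 survives, because removing 2 shifts it into a position the loop has already passed.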
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision:
D30493939
Pulled By: bertmaher
fbshipit-source-id:
e16e19dc9b4c9896e102ca8bf04c8bedfdde87af
Mike Iovine [Tue, 24 Aug 2021 00:26:27 +0000 (17:26 -0700)]
[JIT] Move UseVariadicCat internals (#63577)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63577
Since other variadic ops will have an almost identical implementation, we can generalize the `UseVariadicCat` implementation and put it in a common folder.
Also moved some test utilities that other variadic op tests will likely need.
Test Plan: `buck test caffe2/test/cpp/jit:jit -- ConcatOptTest`
Reviewed By: navahgar
Differential Revision:
D30409937
fbshipit-source-id:
925c11c27b58ce98cb8368d2a205e26ba66d3db9
Akshit Khurana [Mon, 23 Aug 2021 23:33:07 +0000 (16:33 -0700)]
Fix typo in NNAPI tests (#63797)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63797
nnapi memory format test has a typo
Test Plan:
pytest test/test_nnapi.py::TestNNAPI
Imported from OSS
Reviewed By: Amyh11325
Differential Revision:
D30495473
fbshipit-source-id:
8edad7c01a080847a64a2797e077ec4d6077552a
Don Jang [Mon, 23 Aug 2021 23:20:27 +0000 (16:20 -0700)]
[Static Runtime] Add an out variant op for aten::abs (#63675)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63675
This change adds an out variant implementation for `aten::abs`.
Test Plan:
- Observed `V0820 14:14:08.880342 101788 impl.cpp:1394] Switch to out variant for node: %3 : Tensor = aten::abs(%a.1)`
- Perf impact: TBD
Reviewed By: hlu1
Differential Revision:
D30461317
fbshipit-source-id:
0c0230bd40afe463ae1ccb222c2a1207ebcf4191
Rong Rong (AI Infra) [Mon, 23 Aug 2021 22:36:59 +0000 (15:36 -0700)]
fix git diff issue (#63408)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/60111, ideally we should merge this before https://github.com/pytorch/pytorch/issues/63360 but we can also test this with https://github.com/pytorch/pytorch/issues/63360 easily.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63408
Test Plan:
- This is confirmed working with a local test.sh run by setting PR_NUMBER
- should be validated by GHA CI as well
Concern:
- currently GHA CI is running into proxy 403 rate-limit exceeded issue consistently. However the worst case is not generating any git diff files, which is going to be exactly the same as current behavior.
- depends on https://github.com/pytorch/pytorch/issues/63770.
Reviewed By: driazati, janeyx99
Differential Revision:
D30489355
Pulled By: walterddr
fbshipit-source-id:
a638b7ae5820f29a7aca6cc40ff390ab253cb174
Eli Uriegas [Mon, 23 Aug 2021 22:02:10 +0000 (15:02 -0700)]
.github: Add ec2 information as a step (#63784)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63784
Also creates the common.yml.j2 file as a place to store common code
amongst the templates
Should look like:
![image](https://user-images.githubusercontent.com/1700823/130495226-f18b8c0f-1ea7-4097-8bbb-e998fabb71f2.png)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet, driazati
Differential Revision:
D30490682
Pulled By: seemethere
fbshipit-source-id:
18028b4acff938ef54cd6e4877561b2d830a11cf
Erjia Guan [Mon, 23 Aug 2021 21:32:56 +0000 (14:32 -0700)]
Rename DataPipe to Op-er (#63325)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63325
Rename each DataPipe to an operation name ending in "er". The functional API should remain a verb, such as `read_from_tar`, `shuffle`, ... (discussed [here](https://github.com/facebookexternal/torchdata/pull/97#discussion_r688553905))
- Batch -> Batcher
- Collate -> Collator
- Concat -> Concater
- GroupByKey -> ByKeyGrouper ?
- ListDirFiles -> FileLister
- LoadFilesFromDisk -> FileLoader
- Map -> Mapper
- ReadFilesFromTar -> TarArchiveReader
- ReadFilesFromZip -> ZipArchiveReader
- ReadLinesFromFile -> LineReader
- Shuffle -> Shuffler
- ToBytes -> StreamReader
- Transforms -> Transformer
- Zip -> Zipper
Let me know if you have a better name for any DataPipe
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision:
D30466950
Pulled By: ejguan
fbshipit-source-id:
72909dca7b3964ab83b965891f96cc1ecf62d049
Zeina Migeed [Mon, 23 Aug 2021 21:09:10 +0000 (14:09 -0700)]
Add equality constraints for some acc operations for symbolic inference (#63689)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63689
Test Plan:
buck run mode/opt-clang caffe2/torch/fb/model_transform/experimental:fx_ir_lower_inline_cvr -- \
--action=lower_and_run \
--filename=inline_cvr_7x_dec_2020.model \
--print_glow_glog=True
Reviewed By: jamesr66a
Differential Revision:
D30462113
fbshipit-source-id:
0b2a1ce9770561248527d47c07b80112491dc949
Hao Lu [Mon, 23 Aug 2021 19:53:42 +0000 (12:53 -0700)]
[Static Runtime] Remove unused fusion patterns (#63636)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63636
Reviewed By: d1jang
Differential Revision:
D30446573
fbshipit-source-id:
3abb7f697380f3b4e865b98c594de359b5e26b96
Bert Maher [Mon, 23 Aug 2021 19:41:32 +0000 (12:41 -0700)]
[nnc] Re-enable CPU fusion (#63665)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63665
This reverts commit 125e2d02e575612eb427104e7c67f1c28f090db8.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision:
D30471646
Pulled By: bertmaher
fbshipit-source-id:
4189869566f03b5f9ada78d78830f6a34946eed6
Peter Bell [Mon, 23 Aug 2021 19:05:51 +0000 (12:05 -0700)]
Kill THCUNN (#63429)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision:
D30441308
Pulled By: ngimel
fbshipit-source-id:
3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26
Rong Rong (AI Infra) [Mon, 23 Aug 2021 16:44:09 +0000 (09:44 -0700)]
fix mpi ssh runtime error (#63580)
Summary:
should fix https://github.com/pytorch/pytorch/issues/60756.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63580
Test Plan:
- this CI.
- validated by running on the bionic_cuda container: https://app.circleci.com/pipelines/github/pytorch/pytorch/366632/workflows/478602fb-698f-4210-ac09-d9c61af5c62b/jobs/15472104
Reviewed By: malfet
Differential Revision:
D30486472
Pulled By: walterddr
fbshipit-source-id:
d83ab88d163d4a468f03961a13d891b658668a7f
Rong Rong (AI Infra) [Mon, 23 Aug 2021 16:28:21 +0000 (09:28 -0700)]
hotfix clone issue (#63770)
Summary:
This was discovered during https://github.com/pytorch/pytorch/issues/63408. For some reason, this checkout action is the only one that does not set fetch-depth correctly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63770
Reviewed By: malfet, janeyx99
Differential Revision:
D30486110
Pulled By: walterddr
fbshipit-source-id:
a67395cca2487407ed0d49c8c89587935ca5f212
Gary Miguel [Mon, 23 Aug 2021 14:41:33 +0000 (07:41 -0700)]
[ONNX] add test images to repo (#63717)
Summary:
This is better than the status quo:
* Test doesn't download files from the internet -> faster and more
reliable.
* Test doesn't leave the git working directory dirty.
Rather than using the original images, I've copied some images from
the pytorch/vision repo. This will keep the tests in the two repos
in sync, while avoiding adding new assets to the vision repo.
See https://github.com/pytorch/vision/pull/4176.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63717
Reviewed By: janeyx99
Differential Revision:
D30466016
Pulled By: malfet
fbshipit-source-id:
2c56d4c11b5c74db1764576bf1c95ce4ae714574
Alban Desmaison [Mon, 23 Aug 2021 14:05:51 +0000 (07:05 -0700)]
Allow implementing either backward or vjp for Function (#63434)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63434
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D30431968
Pulled By: albanD
fbshipit-source-id:
0bb88664283486a9fd3364e6c3d79442a44625c2
Jithun Nair [Mon, 23 Aug 2021 05:29:04 +0000 (22:29 -0700)]
Update ROCm PyTorch persons of interest (#55206)
Summary:
cc jeffdaily sunway513
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55206
Reviewed By: VitalyFedyunin
Differential Revision:
D30296584
Pulled By: dzhulgakov
fbshipit-source-id:
6e5c610cc6b7c7fd58b80fa3f9de31f269341a88
Pritam Damania [Mon, 23 Aug 2021 01:55:45 +0000 (18:55 -0700)]
Remove `_fork_processes` from common_distributed.py (#63711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63711
This removes `_fork_process` from common_distributed.py and fixes all
other callpoints to use `spawn_process` instead.
ghstack-source-id:
136395719
Test Plan: waitforbuildbot
Reviewed By: xush6528
Differential Revision:
D30463834
fbshipit-source-id:
0c09e8a996d0e5b912c8cdd45488a39951bac4db
Horace He [Sun, 22 Aug 2021 00:13:27 +0000 (17:13 -0700)]
Made FuncTorchBatched decompose CompositeImplicitAutograd (#63616)
Summary:
See https://github.com/facebookresearch/functorch/issues/56
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63616
Reviewed By: zou3519
Differential Revision:
D30438316
Pulled By: Chillee
fbshipit-source-id:
e84446d9f68b87daa0cfff75b3b8a972f36ec85a
jiej [Sat, 21 Aug 2021 16:05:04 +0000 (09:05 -0700)]
BatchNorm autodiff re-enabled (#57321)
Summary:
Turns on BN in autodiff:
1. outputs an empty tensor for running stats to bypass an autodiff issue on None;
2. fixes BN inference backward in cudnn & miopen, where backward now falls back to the native batchnorm kernel instead;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57321
Reviewed By: albanD, ngimel
Differential Revision:
D30250419
Pulled By: jansel
fbshipit-source-id:
a62553789c20fb50a820003a056f40d9d642dfaa
Bert Maher [Sat, 21 Aug 2021 10:45:21 +0000 (03:45 -0700)]
Revert D30360382: [nnc] Support thread level parallelism in fused kernels
Test Plan: revert-hammer
Differential Revision:
D30360382 (https://github.com/pytorch/pytorch/commit/d6d86efb1c839ddafd1398d6dab9caa4f31a9f0b)
Original commit changeset:
29acf4e932c6
fbshipit-source-id:
e0531113135d30eabb172dc1537d5dd6d65dc438
Bert Maher [Sat, 21 Aug 2021 10:36:09 +0000 (03:36 -0700)]
Revert D30417127: Remove flag to toggle CPU fusion in the presence of parallelism
Test Plan: revert-hammer
Differential Revision:
D30417127 (https://github.com/pytorch/pytorch/commit/6600bc96517269c608ea47b76b6bda9476c7bcef)
Original commit changeset:
b77d7c68364f
fbshipit-source-id:
6b52fb83a84fe241945e3cb3eeb71050d1d9c8f1
Wanchao Liang [Sat, 21 Aug 2021 05:15:55 +0000 (22:15 -0700)]
[sharded_tensor] add readonly tensor properties (#63679)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63679
This PR adds read-only tensor properties to sharded tensor, to match the torch.Tensor behaviors.
Test Plan: test_sharded_tensor_metadata
Reviewed By: pritamdamania87
Differential Revision:
D30459343
fbshipit-source-id:
9aec8ecfe76479eed25f3b843495e5719ed2956d
Hao Lu [Sat, 21 Aug 2021 04:41:19 +0000 (21:41 -0700)]
[Static Runtime] Implement out variant for fb::quantized_linear (#63635)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63635
Reviewed By: ajyu
Differential Revision:
D30446234
fbshipit-source-id:
1ef014186ff725930a97d0159626f9233ee74030
Akshit Khurana [Sat, 21 Aug 2021 04:08:59 +0000 (21:08 -0700)]
NNAPI: Support const values in binary ops
Summary:
NNAPI converter previously failed when a binary op had one const value and one tensor
Code suggestions from dreiss
Test Plan:
pytest test/test_nnapi.py::TestNNAPI::test_pointwise_binary
Imported from OSS
Reviewed By: anshuljain1
Differential Revision:
D28893881
fbshipit-source-id:
59240373fb03c6fdafa4cb2fa4d8408dd20092f6
Peter Bell [Sat, 21 Aug 2021 01:27:33 +0000 (18:27 -0700)]
Migrate thnn_conv2d from THC to ATen (#63428)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63428
Closes gh-24644, closes gh-24645
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision:
D30441307
Pulled By: ngimel
fbshipit-source-id:
9c3dec469c0525831ae398df261cf41b7df7e373
Bo Wang [Sat, 21 Aug 2021 00:09:35 +0000 (17:09 -0700)]
Extend _sharded_tensor constructor to support other ops like torch.ones (#63378)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63378
a) Introduce InitCommonParams to wrap tensor creation params
b) Factor local tensor initiation into common_params so that tensor value is not hard specified in ShardedTensor constructor
c) Add _sharded_tensor.ones(...) to exemplify - Note memory_format arg is not provided, to be consistent with torch.ones
d) Follow up: more ops like torch.full, torch.zeros, torch.rand, ...
Test:
$ python test/distributed/_sharded_tensor/test_sharded_tensor.py TestCreateTensorFromParams --v
$ python test/distributed/_sharded_tensor/test_sharded_tensor.py TestShardedTensorChunked.test_create_sharded_tensor_with_ones --v
$ python test/distributed/_sharded_tensor/test_sharded_tensor.py TestShardedTensorEnumerable.test_create_sharded_tensor_with_ones --v
Test Plan: Imported from OSS
Reviewed By: pritamdamania87, wanchaol
Differential Revision:
D30359245
Pulled By: bowangbj
fbshipit-source-id:
85768fcb36e9d9d40213036884b1266930a91701
driazati [Fri, 20 Aug 2021 23:38:42 +0000 (16:38 -0700)]
[clang-tidy] Enable more folders (#63380)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63380
Crosses off some more of #62011, see the test in the stacked PR #63381
Test Plan: Imported from OSS
Reviewed By: malfet, seemethere
Differential Revision:
D30455843
Pulled By: driazati
fbshipit-source-id:
d473545d05ffa0b2476968f0b1c55f3a16a2c755
Yi Zhang [Fri, 20 Aug 2021 23:28:39 +0000 (16:28 -0700)]
enable incremental build for build_libtorch (#63074)
Summary:
Since issue https://github.com/pytorch/pytorch/issues/59859 is resolved, rerun_cmake in build_libtorch should not be hardcoded.
build_libtorch is necessary to generate debug version libtorch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63074
Reviewed By: VitalyFedyunin, seemethere
Differential Revision:
D30306705
Pulled By: malfet
fbshipit-source-id:
f2077d334191f4973da0681560937bc8bab730c1
北海若 [Fri, 20 Aug 2021 22:45:12 +0000 (15:45 -0700)]
[Doc] Deprecation notice for only_inputs argument (#63631)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63544.
Changed docstring accordingly. I'm new here, not sure if the style is okay. Please check.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63631
Reviewed By: ejguan
Differential Revision:
D30459439
Pulled By: soulitzer
fbshipit-source-id:
8df3c509d1dd39764815b099ab47229550126cbe
driazati [Fri, 20 Aug 2021 22:45:10 +0000 (15:45 -0700)]
Remove breakpad from docker image (#63598)
Summary:
As of https://github.com/pytorch/pytorch/issues/63186 we're doing this properly via a third_party cmake build, so we don't need it here anymore.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63598
Reviewed By: walterddr, malfet
Differential Revision:
D30432250
Pulled By: driazati
fbshipit-source-id:
d0d5db14355cf574e42c0d0ed786bb26230180bd
jiayisun [Fri, 20 Aug 2021 21:54:51 +0000 (14:54 -0700)]
add BFloat16 operators on CPU: range, sinh, cosh, frexp, nan_to_num (#61826)
Summary:
Added BFloat16 support for range, sinh, cosh, frexp, and nan_to_num on CPU, and collected benchmark data for these ops with the BFloat16 and Float32 data types, using PyTorch's operator_benchmark tool on an Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz.
Number of cores: 1 core, 28 cores (1 socket)
[cosh_sinh_benchmark.txt](https://github.com/pytorch/pytorch/files/6974313/cosh_sinh_benchmark.txt)
[frexp_benchmark.txt](https://github.com/pytorch/pytorch/files/6974315/frexp_benchmark.txt)
[nan_to_num_benchmark.txt](https://github.com/pytorch/pytorch/files/6974317/nan_to_num_benchmark.txt)
[range_benchmark.txt](https://github.com/pytorch/pytorch/files/6974318/range_benchmark.txt)
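For context on the data type being benchmarked: BFloat16 keeps the sign bit and the 8 exponent bits of Float32 but only 7 mantissa bits, so it can be viewed as the upper 16 bits of a Float32 value. The sketch below is a minimal stdlib-only illustration of that conversion with round-to-nearest-even. The helper names are hypothetical, and this is not the code path PyTorch uses.

```python
import struct


def to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (round to nearest even)."""
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep NaN a NaN by forcing a mantissa bit in the kept half.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add a bias so that dropping the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def from_bfloat16_bits(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


def round_to_bfloat16(x: float) -> float:
    """Round x to the nearest value representable in bfloat16."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


print(hex(to_bfloat16_bits(1.0)))
print(round_to_bfloat16(3.14159))
```

The reduced mantissa is why BFloat16 results are usually compared against Float32 with looser tolerances.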
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61826
Reviewed By: saketh-are
Differential Revision:
D30257259
Pulled By: VitalyFedyunin
fbshipit-source-id:
394cd713e6394050a8c90b2160633beb675d71dd
Jeff Daily [Fri, 20 Aug 2021 21:00:20 +0000 (14:00 -0700)]
empty caching allocator before test_avg_pool2d large subtest (#63528)
Summary:
Otherwise, unrecoverable OOM occurs on MI25. Fixes broken ROCm CI test1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63528
Reviewed By: malfet, zhouzhuojie
Differential Revision:
D30459151
Pulled By: walterddr
fbshipit-source-id:
63e205c4f486fcbdd514cfb0ed8e38584f894585
Nikita Shulga [Fri, 20 Aug 2021 20:13:54 +0000 (13:13 -0700)]
Include iostream in ProcessGroupMPI.cpp (#63656)
Summary:
`ProcessGroupMPI.cpp` uses `std::cerr`, so it needs `<iostream>`; the missing include resulted in a compilation regression introduced by https://github.com/pytorch/pytorch/pull/61500
Fixes https://github.com/pytorch/pytorch/issues/63653
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63656
Reviewed By: ejguan
Differential Revision:
D30455824
Pulled By: malfet
fbshipit-source-id:
29f316e7f7fd8e7dcbee2666e7a985f25bf56515
Scott Wolchok [Fri, 20 Aug 2021 19:56:01 +0000 (12:56 -0700)]
[easy]Unbreak caffe2benchmarking build (#63655)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63655
ghstack-source-id:
136324310
Test Plan: buck build //fbobjc/Apps/Internal/Caffe2Benchmarking:Caffe2Benchmarking fbobjc/mode/iphonesimulator
Reviewed By: hl475, JacobSzwejbka
Differential Revision:
D30455659
fbshipit-source-id:
b6da6be4f89b6e84753ef0849ffedea04785034a
BowenBao [Fri, 20 Aug 2021 19:44:29 +0000 (12:44 -0700)]
[ONNX] Support torch.dot and torch.nn.utils.spectral_norm (#62596) (#62765)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62765
Fixes #27723
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision:
D30375181
Pulled By: msaroufim
fbshipit-source-id:
715f4745899757ec405877980cd20c826028eb2c
Co-authored-by: BowenBao <bowbao@microsoft.com>
BowenBao [Fri, 20 Aug 2021 19:44:29 +0000 (12:44 -0700)]
[ONNX] Update repeat_interleave for dynamic repeats (#59979) (#62764)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62764
Fixes #58733
- Support dynamic interleave for cases with dynamic repeat values
- Moved repeat_interleave symbolic from opset 11 to opset 13, because this change needs sequences as output types for loop outputs
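As a reminder of the semantics being exported, the sketch below mimics element-wise `repeat_interleave` on plain Python sequences, where each value has its own repeat count. It is an illustration only: the function is a hypothetical stand-in, not the ONNX symbolic, and it assumes `values` and `repeats` have the same length.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def repeat_interleave(values: Sequence[T], repeats: Sequence[int]) -> List[T]:
    """Repeat values[i] exactly repeats[i] times, preserving order."""
    if len(values) != len(repeats):
        raise ValueError("values and repeats must have the same length")
    if any(r < 0 for r in repeats):
        raise ValueError("repeat counts must be non-negative")
    # The output length depends on the contents of `repeats`, not only on
    # its shape, which is what makes the dynamic case hard to export.
    return [v for v, r in zip(values, repeats) for _ in range(r)]


print(repeat_interleave([1, 2, 3], [2, 0, 1]))
```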
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision:
D30375179
Pulled By: msaroufim
fbshipit-source-id:
787f96bf91d124fd0483761088c5f4ae930d96a9
Co-authored-by: Shubham Bhokare <shubhambhokare@gmail.com>
BowenBao [Fri, 20 Aug 2021 19:44:29 +0000 (12:44 -0700)]
[ONNX] Fix an issue that optimizations might adjust graph inputs unexpectedly. (#61280) (#62763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62763
This PR fixes an issue where the graph inputs might be updated when a model is exported in inference mode.
When a model is exported in inference mode, some optimizations are applied. One side effect of these optimizations is that the inputs of the graph might be adjusted. Such optimizations include:
1. Conv and BatchNorm op fusion.
2. Constant folding.
If the user sets export_params=False or keep_initializers_as_inputs=True, it is highly likely that the user wants to provide the corresponding parameters or initializers as inputs of the graph.
In that situation, whether the model is exported in inference mode or training mode, the exporter needs to prevent the above optimizations from adjusting the graph inputs, so that the inputs of the graph match the inputs the user provided.
This PR adds a common check that decides whether the above optimizations should run: from the values of the export_params and keep_initializers_as_inputs arguments, it infers whether the graph inputs are allowed to be adjusted.
If not, these optimizations are skipped, even if their other requirements are met.
Besides these code changes, the comments of the parameters below have been updated to give users more guidance on how to use them for different purposes:
1. export_params
2. training
3. do_constant_folding
4. keep_initializers_as_inputs
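The check described in this summary can be sketched roughly as follows. This is a minimal illustration of the stated rule, not the exporter's actual code: both function names are hypothetical, and treating an unset `keep_initializers_as_inputs` as `None` is an assumption.

```python
from typing import Optional


def graph_inputs_may_be_adjusted(
    export_params: bool,
    keep_initializers_as_inputs: Optional[bool],
) -> bool:
    """Infer from the two arguments whether optimizations may change graph inputs."""
    if not export_params:
        # The user intends to supply the parameters as graph inputs.
        return False
    if keep_initializers_as_inputs is True:
        # The user intends to supply the initializers as graph inputs.
        return False
    return True


def should_run_optimization(
    other_requirements_met: bool,
    export_params: bool,
    keep_initializers_as_inputs: Optional[bool],
) -> bool:
    """Run Conv/BatchNorm fusion or constant folding only if inputs may change."""
    return other_requirements_met and graph_inputs_may_be_adjusted(
        export_params, keep_initializers_as_inputs
    )


print(should_run_optimization(True, export_params=False, keep_initializers_as_inputs=None))
```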
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision:
D30375183
Pulled By: msaroufim
fbshipit-source-id:
4db8b9695649eb32a3a0fefa950ee2e5651bdba0
Co-authored-by: fatcat-z <jiz@microsoft.com>
BowenBao [Fri, 20 Aug 2021 19:44:29 +0000 (12:44 -0700)]
[ONNX] Fix controlflow shape inference with contrib op (#60707) (#62762)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62762
`ONNXShapeTypeInference` for node `n` is skipped if `n` is non ONNX namespace, or if `n` contains any non ONNX namespace nodes. This prevents controlflow nodes containing contrib ops from running `SpecialPostProcess`, which sets up correct node output shape/type information in rare cases.
This PR depends on opset 14 export https://github.com/pytorch/pytorch/pull/59486
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision:
D30375180
Pulled By: msaroufim
fbshipit-source-id:
5deacec39f091deb4d75ddd9e660e12fca7f16c5
Co-authored-by: BowenBao <bowbao@microsoft.com>
Alban Desmaison [Fri, 20 Aug 2021 19:26:58 +0000 (12:26 -0700)]
Revert D30417370: [nnc] Enable CPU fusion
Test Plan: revert-hammer
Differential Revision:
D30417370 (https://github.com/pytorch/pytorch/commit/b9fc656cf26d60127bd695e4e5a7d27622f2563d)
Original commit changeset:
84ce7a578a36
fbshipit-source-id:
cd23774cdc3273fd72f8a05f1900eaf36f373e6b
Pritam Damania [Fri, 20 Aug 2021 19:09:49 +0000 (12:09 -0700)]
[8/N] Remove c10d/ddp fork tests. (#63454)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63454
Continuation of https://github.com/pytorch/pytorch/pull/63443, this
PR removes all fork tests from torch.distributed.
ghstack-source-id:
136285511
Test Plan: waitforbuildbot
Reviewed By: SciPioneer
Differential Revision:
D30387872
fbshipit-source-id:
f6d6313db126ae7b95b86f78a1e0726887c5c513
Alban Desmaison [Fri, 20 Aug 2021 19:05:32 +0000 (12:05 -0700)]
Revert D30426527: Adding DataLoader2 class as future replacement of DataLoader
Test Plan: revert-hammer
Differential Revision:
D30426527 (https://github.com/pytorch/pytorch/commit/5a7133b87fe2fd7d025d36855ed4cc06539a9299)
Original commit changeset:
e5905d3364c4
fbshipit-source-id:
794d8a4e9256ccff8cf894aee10eff6adc30d502
Philip Meier [Fri, 20 Aug 2021 18:43:07 +0000 (11:43 -0700)]
Add `BinaryUfuncOpInfo` and broadcasting tests (#61964)
Summary:
As proof of concept, this PR uses the new `BinaryUfuncOpInfo` in broadcasting tests for `add`, `sub`, `mul`, `div`, `floor_div`, and `true_div`.
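For readers unfamiliar with what a broadcasting test checks, the standard rule aligns shapes from the trailing dimension and lets a size-1 (or missing) dimension stretch to match the other operand. The helper below is a hypothetical stdlib-only sketch of that rule. It is not part of `BinaryUfuncOpInfo` or the test suite.

```python
from itertools import zip_longest
from typing import Tuple


def broadcast_shape(a: Tuple[int, ...], b: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the shape produced by broadcasting shapes a and b together."""
    result = []
    # Walk both shapes from the last dimension; missing dims count as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (1, 4)))
```

A binary op such as `add` applied to inputs of shapes `(3, 1)` and `(1, 4)` is expected to produce an output of the broadcast shape.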
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61964
Reviewed By: ngimel
Differential Revision:
D30407734
Pulled By: mruberry
fbshipit-source-id:
ada28994f43b0635f279f45a02ecba18bc8ee033
Bert Maher [Fri, 20 Aug 2021 18:11:49 +0000 (11:11 -0700)]
[nnc] Enable CPU fusion (#63545)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63545
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision:
D30417370
Pulled By: bertmaher
fbshipit-source-id:
84ce7a578a3678d5562bab99d1dc00330c4f72d1
Bert Maher [Fri, 20 Aug 2021 18:11:49 +0000 (11:11 -0700)]
Remove flag to toggle CPU fusion in the presence of parallelism (#63514)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63514
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision:
D30417127
Pulled By: bertmaher
fbshipit-source-id:
b77d7c68364f2af73570740540f3b1152313016e
Bert Maher [Fri, 20 Aug 2021 18:11:49 +0000 (11:11 -0700)]
[nnc] Support thread level parallelism in fused kernels (#63386)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63386
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision:
D30360382
Pulled By: bertmaher
fbshipit-source-id:
29acf4e932c669ce0f35823faea9099bcd8119b6
Aaron Bockover [Fri, 20 Aug 2021 18:11:47 +0000 (11:11 -0700)]
Add support for the ONNX Runtime Eager Mode backend (#58248)
Summary:
This PR implements the necessary hooks/stubs/enums/etc for complete ONNX Runtime (ORT) Eager Mode integration. The actual extension will live out of tree at https://github.com/pytorch/ort.
We have been [working on this at Microsoft](https://github.com/microsoft/onnxruntime-pytorch/tree/eager-ort/torch_onnxruntime) for the last few months, and are finally ready to contribute the PyTorch core changes upstream (nothing major or exciting, just the usual boilerplate for adding new backends).
The ORT backend will allow us to ferry [almost] all torch ops into granular ONNX kernels that ORT will eagerly execute against any devices it supports (therefore, we only need a single ORT backend from a PyTorch perspective).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58248
Reviewed By: astaff
Differential Revision:
D30344992
Pulled By: albanD
fbshipit-source-id:
69082b32121246340d686e16653626114b7714b2
Victor Quach [Fri, 20 Aug 2021 18:07:22 +0000 (11:07 -0700)]
Add docs describing saved tensor hooks (#62362)
Summary:
Add section to the Autograd mechanics docs to describe the recently
exposed saved tensors (https://github.com/pytorch/pytorch/issues/52451), how to register packing / unpacking
hooks (https://github.com/pytorch/pytorch/issues/60975) and how to use default hooks (https://github.com/pytorch/pytorch/issues/61834)
Sister PR: https://github.com/pytorch/pytorch/issues/62361 (will add a link from autograd.rst to notes/autograd in whatever PR does not land first)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62362
Reviewed By: soulitzer
Differential Revision:
D30453177
Pulled By: Varal7
fbshipit-source-id:
f5759977b069ff0ef36a47b08856d297691a6caa
Shiyan Deng [Fri, 20 Aug 2021 17:49:21 +0000 (10:49 -0700)]
[fx2trt] Add layernorm plugin for dynamic shape (#63620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63620
Added layernorm dynamic plugin, so that it works when explicit batch dim is required. Needed for ig model.
Changed how we create a plugin layer: instead of instantiating the plugin directly, we now use a plugin creator with `PluginFieldCollection`.
Follow ups:
Another way to convert layernorm is by breaking it down to supported trt layers. T97398182
Test Plan: layernorm unittest
Reviewed By: yinghai
Differential Revision:
D30138205
fbshipit-source-id:
aebe021d8de818e20376634f30e84579b9807f9b
Pavithran Ramachandran [Fri, 20 Aug 2021 16:34:53 +0000 (09:34 -0700)]
[PyTorch][Edge] Improve InflatableArgs for Bundled Inputs (#62368)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62368
# Context
Bundled inputs accept an expression in the form of a string, InflatableArg.fmt, that is applied to the inputs to inflate them. InflatableArg.fmt provides the flexibility to use a custom transformation for inflation. When the input arguments to a function are not of Tensor type, TorchScript casts the inputs from type T to Optional[T] and expects the function to handle the nullable (None) case as well. This becomes tricky to handle in one-line code or lambda functions.
We propose an alternative that allows InflatableArg to include the text of a TorchScript function that would be defined on the module as a helper, and then use that function in its inflation expression. This can be provided via InflatableArg.fmt_fn. Please refer to pytorch/test/test_bundled_inputs.py for an example of how to use it.
Also refer to JacobSzwejbka's comment on this [here](https://github.com/pytorch/pytorch/pull/62368#issuecomment-892012812)
# Mitigation
Allow InflatableArg to include the text of a TorchScript function that would be defined on the module as a helper, then use that in its inflation expression.
ghstack-source-id:
135158680
Test Plan:
To run `test_dict_args`
```
(base) [pavithran@devvm1803.vll0 /data/users/pavithran/fbsource/fbcode] buck test //caffe2/test:test_bundled_inputs -- test_dict_args
Action graph will be rebuilt because files have been added or removed.
Building: finished in 5.4 sec (100%) 12180/12180 jobs, 0/12180 updated
Total time: 5.8 sec
More details at https://www.internalfb.com/intern/buck/build/fafcf277-1095-4cba-978d-6022f0d391ad
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 5ef9de71-c1b1-406b-a6c0-3321c2368b8d
Trace available for this run at /tmp/tpx-20210727-163946.454212/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/7036874465805934
✓ ListingSuccess: caffe2/test:test_bundled_inputs - main (11.365)
✓ Pass: caffe2/test:test_bundled_inputs - test_dict_args (test_bundled_inputs.TestBundledInputs) (12.307)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/7036874465805934
```
To check the py code of TS module:
P433043973
Reviewed By: dreiss
Differential Revision:
D29950421
fbshipit-source-id:
c819ec5c94429b7fbf6c4beb0259457f169b08ec
Vitaly Fedyunin [Fri, 20 Aug 2021 16:00:23 +0000 (09:00 -0700)]
Adding DataLoader2 class as future replacement of DataLoader (#63523)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63523
Supports sharding and batching on loader level
* #63522 Adding IterableAsDataPipe IterDataPipe: useful for tests and simple cases
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D30426527
Pulled By: VitalyFedyunin
fbshipit-source-id:
e5905d3364c4880e720dd62fb066f08881c71a6e
albanD [Fri, 20 Aug 2021 15:42:31 +0000 (08:42 -0700)]
Small custom function refactor which doesn't change anything (#63433)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63433
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision:
D30431970
Pulled By: albanD
fbshipit-source-id:
905fa4d2ddeca18005b1bcb13dd6f8a080327e7c
Vitaly Fedyunin [Fri, 20 Aug 2021 15:36:14 +0000 (08:36 -0700)]
Adding IterableAsDataPipe IterDataPipe (#63522)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63522
Supports sharding and batching on loader level
* #63522 Adding IterableAsDataPipe IterDataPipe: useful for tests and simple cases
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision:
D30426528
Pulled By: VitalyFedyunin
fbshipit-source-id:
535b5cc1505bb58731fcca8170541ac5ee7bd417
Mike Iovine [Fri, 20 Aug 2021 13:14:13 +0000 (06:14 -0700)]
[Static Runtime] Enable RemoveListMutation (#63536)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63536
Enable a pass that transforms sequences like this:
```
li = []
li.append(1)
li.append(2)
```
into this:
```
li = [1, 2]
```
Initially I implemented this pass myself (D30387213), but I discovered that there is an existing pass that does the same thing.
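To illustrate the kind of rewrite such a pass performs, here is a toy version that works on Python source with the standard `ast` module. It is only a sketch of the idea: the real pass operates on TorchScript graph IR, the function names below are hypothetical, and `ast.unparse` assumes Python 3.9 or newer.

```python
import ast


def _is_append_to(stmt: ast.stmt, name: str) -> bool:
    """True if stmt is `name.append(arg)` and arg does not read `name`."""
    if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
        return False
    call = stmt.value
    func = call.func
    if not (
        isinstance(func, ast.Attribute)
        and func.attr == "append"
        and isinstance(func.value, ast.Name)
        and func.value.id == name
        and len(call.args) == 1
        and not call.keywords
        and not isinstance(call.args[0], ast.Starred)
    ):
        return False
    # Folding would change behavior if the appended value reads the list.
    return not any(
        isinstance(node, ast.Name) and node.id == name
        for node in ast.walk(call.args[0])
    )


def fold_list_appends(source: str) -> str:
    """Fold `x = []` plus directly following `x.append(v)` into one literal."""
    tree = ast.parse(source)
    body, new_body, i = tree.body, [], 0
    while i < len(body):
        stmt = body[i]
        i += 1
        is_empty_list_assign = (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
            and isinstance(stmt.value, ast.List)
            and not stmt.value.elts
        )
        if is_empty_list_assign:
            name = stmt.targets[0].id
            # Absorb the run of appends that immediately follows.
            while i < len(body) and _is_append_to(body[i], name):
                stmt.value.elts.append(body[i].value.args[0])
                i += 1
        new_body.append(stmt)
    tree.body = new_body
    return ast.unparse(tree)


print(fold_list_appends("li = []\nli.append(1)\nli.append(2)\n"))
```

Only top-level statements are handled, and an append whose argument reads the list itself is left untouched, since folding it would change the result.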
Reviewed By: hlu1
Differential Revision:
D30412970
fbshipit-source-id:
0810ef03480878d5039bd800a40f5fd31c2652ec
Don Jang [Fri, 20 Aug 2021 07:43:40 +0000 (00:43 -0700)]
[Static Runtime] Add native op for aten::detach (#63625)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63625
This change adds a Static Runtime native op implementation for the `aten::detach` op.
See the standard `aten::detach`'s implementation (https://codebrowser.bddppq.com/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp.html#_ZN2at6native6detachERKNS_6TensorE ) for comparison.
Test Plan:
- Added `StaticRuntime.IndividualOps_Detach`.
- Observed
```
V0819 18:55:33.181188 3092034 impl.cpp:1398] Switch to native impl for node: %a.1 : Tensor = aten::detach(%input.1)
```
Reviewed By: hlu1
Differential Revision:
D30443187
fbshipit-source-id:
d6e0eadb1b817e0a126c4fc97526abc276ee8a17
Nikita Shulga [Fri, 20 Aug 2021 06:42:24 +0000 (23:42 -0700)]
Update protobuf to 3.13.1 (#62571)
Summary:
Update bazel to 4.10.0
Update ASAN_SYMBOLIZER_PATH to llvm-7
Suppress `vptr` ubsan violations in `test_jit`
Fix ProtoBuf patching for ONNX which caused Windows builds to crash while attempting to free `std::string` allocated on stack
Fixes https://github.com/pytorch/pytorch/issues/62569
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62571
Reviewed By: walterddr
Differential Revision:
D30048685
Pulled By: malfet
fbshipit-source-id:
6462c1bef9c42318551d2cf906bbab41e1d4e1cd
Raghavan Raman [Fri, 20 Aug 2021 05:50:32 +0000 (22:50 -0700)]
[nnc] Updated sliceTail to do inplace mutation (#63532)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63532
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision:
D30412184
Pulled By: navahgar
fbshipit-source-id:
e7669d3b9d24e14501f3feb6505c88d1d42030c6
Raghavan Raman [Fri, 20 Aug 2021 05:50:32 +0000 (22:50 -0700)]
[nnc] Updated sliceHead to do inplace mutation (#63531)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63531
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision:
D30412183
Pulled By: navahgar
fbshipit-source-id:
47ee9482a36e606788d28d22eee4edaca45ffa50
Scott Wolchok [Fri, 20 Aug 2021 01:52:33 +0000 (18:52 -0700)]
[PyTorch] Remove unnecessary iostream includes in headers (#61500)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61500
libstdc++ defines a static variable called `std::__ioinit` in iostream that adds global constructor size overhead to each translation unit that includes iostream. To reduce the size overhead from that, we can often include ostream instead.
ghstack-source-id:
136163529
Test Plan: buildsizebot some mobile apps
Reviewed By: dhruvbird
Differential Revision:
D29648016
fbshipit-source-id:
9c3139712c71248513cc5032d21e77f3ecbae8fe
Scott Wolchok [Fri, 20 Aug 2021 01:52:33 +0000 (18:52 -0700)]
[PyTorch] Remove unused dump() methods in vec headers (#63533)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63533
These methods don't seem to be used, and they use std::cout, which incurs a small code size overhead on platforms using libstdc++ due to std::__ioinit (see #61500). Seems like we can just delete them?
ghstack-source-id:
136163409
Test Plan:
CI
Reviewers: #sentinel, dhruvbird
Reviewed By: dskhudia
Differential Revision:
D30412269
fbshipit-source-id:
380b9aa2f9aabc4107188b6b209d2afc1769c0ee
Pavithran Ramachandran [Fri, 20 Aug 2021 01:39:50 +0000 (18:39 -0700)]
[PyTorch][Edge] Support backtrace symbolication for Android builds (#63339)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63339
# Context
https://fb.workplace.com/groups/pytorch.dev/permalink/900474523864362/?comment_id=901125403799274&reply_comment_id=905023386742809
##### WHAT IS A STACK TRACE?
A stack trace (also called stack backtrace or stack traceback) is a report of the active stack frames at a certain point in time during the execution of a program.
Typically when an exception is thrown, one would expect to see the code (file:line) that threw the exception, and every intermediate frame up to and including the main function.
We are enabling Android stack traces to help with debugging on Android devices.
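To make the definition concrete, the sketch below captures the chain of active frames for a failing call in plain Python. It is only an analogy for the concept: the function names are hypothetical stand-ins, and the commit itself concerns C++ backtraces on Android, not Python tracebacks.

```python
import traceback
from typing import List


def load_model(path: str) -> None:
    # Stand-in for a model loader that fails to open its file.
    raise FileNotFoundError(f"open file failed, file path: {path}")


def benchmark(path: str) -> None:
    load_model(path)


def capture_frames(path: str) -> List[str]:
    """Return the function names that were active when the exception was raised."""
    try:
        benchmark(path)
    except FileNotFoundError as exc:
        # Python lists frames oldest call first; the C++ traces in this
        # commit message are printed most recent call first.
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []


frames = capture_frames("./detect.bc")
print(" -> ".join(frames))
```

Without frame information, only the final error message would be available, which corresponds to the "(no backtrace available)" case.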
Test Plan:
## Steps to test
```
buck build fbsource//xplat/caffe2/mode/aibench_pytorch_android -c pt.enable_qpl=0 -c pt.has_backtraces=1 fbsource//xplat/caffe2/fb/lite_predictor:lite_predictorAndroid#android-x86_64
one_world android emulator android-28
adb push ~/fbsource/buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictorAndroid#android-x86_64 /data/local/tmp
cd /data/local/tmp
./lite_predictorAndroid#android-x86_64
./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
```
## How the stack trace looks when the model file is not found:
### before
```
./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 2 threads
Run with 2 threads
Loading model...
terminating with uncaught exception of type c10::Error: open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
(no backtrace available)
Aborted
```
### after
```
134|generic_x86_64:/data/local/tmp $ ./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 2 threads
Run with 2 threads
Loading model...
terminating with uncaught exception of type c10::Error: open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
frame #0 c10::get_backtrace(unsigned long, unsigned long, bool)[0x59494274f10e]
frame #1 [0x5949427b1eee]
frame #2 [0x5949427b1eb2]
frame #3 [0x5949427b1cdc]
frame #4 std::__ndk1::function<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > ()>::operator()() const[0x5949427afc34]
frame #5 c10::Error::Error(c10::SourceLocation, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >)[0x5949427b05b1]
frame #6 c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949427aca5f]
frame #7 caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949426b37b2]
frame #8 caffe2::serialize::FileAdapter::FileAdapter(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949426b3903]
frame #9 torch::jit::_load_for_mobile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, c10::optional<c10::Device>, std::__ndk1::unordered_map<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::hash<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::equal_to<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::allocator<std::__ndk1::pair<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > >&)[0x5949422737bd]
frame #10 torch::jit::_load_for_mobile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, c10::optional<c10::Device>)[0x594942273769]
frame #11 benchmark(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, bool, int, int, int, bool, int, bool, int, double, bool, bool, bool, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x59494189b21d]
frame #12 main[0x594941882aff]
frame #13 __libc_init[0x7b699d08578d]
```
### what we get for os:linux
```
(base) [pavithran@devvm1803.vll0 /data/users/pavithran/fbsource] ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 24 threads
Run with 24 threads
Loading model...
terminate called after throwing an instance of 'c10::Error'
what(): open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
frame #0: ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor() [0x20cb7fe]
frame #1: ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor() [0x20cb6c6]
frame #2: std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const + 0x54 (0x20ca4e4 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #3: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x57 (0x20ca9a7 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #4: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x7a (0x20c823a in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #5: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x96 (0x206f3d6 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #6: caffe2::serialize::FileAdapter::FileAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x42 (0x206f502 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #7: torch::jit::_load_for_mobile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x30 (0x1be826c in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #8: torch::jit::_load_for_mobile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>) + 0x35 (0x1be8214 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #9: benchmark(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, int, int, int, bool, int, bool, int, double, bool, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x16d (0x12093ad in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #10: main + 0x25c (0x11f933c in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #11: __libc_start_main + 0x105 (0x7fc7b9f2ed95 in /usr/local/fbcode/platform009/lib/libc.so.6)
frame #12: _start + 0x2a (0x11f902a in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
Aborted (core dumped)
```
Reviewed By: dhruvbird
Differential Revision:
D30135947
fbshipit-source-id:
f50c634ef4545843305cad4b4a14a8776b1aec76
Nikita Shulga [Thu, 19 Aug 2021 23:46:31 +0000 (16:46 -0700)]
Revert D30359218: [pytorch][PR] [doc] pre-commit fix instructions
Test Plan: revert-hammer
Differential Revision:
D30359218 (https://github.com/pytorch/pytorch/commit/4e1d84ae8fae49995c8966ccbe0f34360978492f)
Original commit changeset:
61771babeac4
fbshipit-source-id:
c2ac0a4a7463fafa03ad0b20bfb0701a8c1476c4
zhouzhuojie [Thu, 19 Aug 2021 22:37:10 +0000 (15:37 -0700)]
Add concurrency group for more workflows (#63606)
Summary:
Fixes unnecessary duplicated workflow runs
![image](https://user-images.githubusercontent.com/658840/130146332-ecf54e49-3538-49c1-88de-b099f1c1e41f.png)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63606
Reviewed By: malfet, mruberry
Differential Revision:
D30436889
Pulled By: zhouzhuojie
fbshipit-source-id:
aafbad1edc45e3ab9bceb00e8f3b4204f18e43d0
Zeina Migeed [Thu, 19 Aug 2021 22:22:52 +0000 (15:22 -0700)]
acc type inference (#63119)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63119
Test Plan:
buck run mode/opt-clang caffe2/torch/fb/model_transform/experimental:fx_ir_lower_inline_cvr -- \
--action=lower_and_run \
--filename=inline_cvr_7x_dec_2020.model \
--print_glow_glog=True
Reviewed By: jamesr66a, jfix71, ansley
Differential Revision:
D30235895
fbshipit-source-id:
dab7f96e1799b99eeae0ee519cf0ddd636fddf2e
Sergei Vorobev [Thu, 19 Aug 2021 21:57:00 +0000 (14:57 -0700)]
Replace hardcoded values in IndexKernel.cu (#63372)
Summary:
This is a small change that helps maintain the Cruise PyTorch fork, since we use a different hardcoded value.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63372
Reviewed By: mruberry
Differential Revision:
D30396171
Pulled By: ejguan
fbshipit-source-id:
cc0023f58b5922d3d98c7283495e6dc8d35049b6
Adam J. Stewart [Thu, 19 Aug 2021 21:54:26 +0000 (14:54 -0700)]
DataLoader: allow non-integer Samplers (#63500)
Summary:
Not entirely sure how to use TypeVar but if someone could give me a hint it would be appreciated. Also let me know if you want me to add tests so we can make sure non-integer samplers actually work. It seems like `test/test_dataloader.py` is the correct location but that's a big file.
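One possible way to express a sampler whose keys are not integers is to make it generic over the key type with `TypeVar`. The sketch below is stdlib-only and purely illustrative: `KeySampler` and `load` are hypothetical names, and this is not the typing that was merged into `torch.utils.data`.

```python
from typing import Dict, Generic, Iterable, Iterator, List, TypeVar

K = TypeVar("K")  # sampler key type; deliberately not restricted to int
V = TypeVar("V")  # sample type stored in the dataset


class KeySampler(Generic[K]):
    """Yield dataset keys of any type in a fixed order."""

    def __init__(self, keys: Iterable[K]) -> None:
        self._keys = list(keys)

    def __iter__(self) -> Iterator[K]:
        return iter(self._keys)

    def __len__(self) -> int:
        return len(self._keys)


def load(dataset: Dict[K, V], sampler: KeySampler[K]) -> List[V]:
    """Fetch samples in sampler order; keys only need to index the dataset."""
    return [dataset[key] for key in sampler]


samples = load({"cat": 1.0, "dog": 2.0}, KeySampler(["dog", "cat"]))
print(samples)
```

Tying the sampler's key type to the dataset's index type lets a type checker flag a string sampler used with an integer-indexed dataset.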
Fixes https://github.com/pytorch/pytorch/issues/63483
ejguan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63500
Reviewed By: mruberry
Differential Revision:
D30403689
Pulled By: ejguan
fbshipit-source-id:
464e09e5aad3215b94a29cc5e21cb4b10ec136e3
Kimish Patel [Thu, 19 Aug 2021 20:32:26 +0000 (13:32 -0700)]
[Pytorch] Fix callstack pointer serialization bug (#63576)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63576
We serialize the function name associated with InlinedCallStackPtr. This is derived
by querying the Function* stored in InlinedCallStack. However, this is a raw
pointer that is not guaranteed to be valid when serialization happens. On
the other hand, we also store the function name separately when constructing
InlinedCallStack anyway. So this change just uniformly relies on function_name
instead of Function*.
Test Plan: Internal build's asan failure + CI
Reviewed By: larryliu0820
Differential Revision:
D30427029
fbshipit-source-id:
de9617482404785920ed2e67b72f38461590fba3
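The reasoning behind the callstack serialization fix above (#63576) can be illustrated by analogy in Python. This is a rough sketch, not the C++ code in the PR: `CallStackEntry` and `make_entry` are hypothetical names, and a weak reference stands in for the raw `Function*`. A dead weak reference returns `None`, whereas a dangling raw pointer in C++ is undefined behavior, so the analogy only covers the lifetime problem, not the failure mode.

```python
import gc
import weakref
from typing import Callable, Optional


class CallStackEntry:
    """Hypothetical stand-in for a call-stack entry (not the real InlinedCallStack)."""

    def __init__(self, fn: Callable[..., object]) -> None:
        # Non-owning reference: like a raw pointer, it can outlive its target.
        self._fn_ref = weakref.ref(fn)
        # Name copied at construction time: valid for as long as the entry lives.
        self.function_name: str = fn.__name__

    def name_via_pointer(self) -> Optional[str]:
        """Look the name up through the non-owning reference (may fail)."""
        fn = self._fn_ref()
        return None if fn is None else fn.__name__

    def serialize(self) -> str:
        """Rely only on the stored name, never on the function object."""
        return self.function_name


def make_entry() -> CallStackEntry:
    def forward() -> None:
        pass

    # `forward` goes out of scope when this function returns.
    return CallStackEntry(forward)


entry = make_entry()
gc.collect()  # make sure the function object has been reclaimed
name_from_pointer = entry.name_via_pointer()
serialized = entry.serialize()
```

By the time `serialize` runs, the function object is gone, so only the copy taken at construction is safe to use.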
Charles David Hernandez [Thu, 19 Aug 2021 20:04:48 +0000 (13:04 -0700)]
Updating the names of these functions (#63513)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63513
Updating these names per Jerry's nits in the previous PR.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision:
D30406710
fbshipit-source-id:
a9f1577a2b8c4a93f5005e0f6278b7d7348d8b66
Natalia Gimelshein [Thu, 19 Aug 2021 20:00:08 +0000 (13:00 -0700)]
Revert embedding thrust->cub migration (#63451)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63427
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63451
Reviewed By: mruberry
Differential Revision:
D30398482
Pulled By: ngimel
fbshipit-source-id:
e153786d204215555a6571688eabae712facad7e
Philip Meier [Thu, 19 Aug 2021 19:45:32 +0000 (12:45 -0700)]
Updates internal `assert_allclose` callsites in favor of `assert_close` (#61841)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61841
Redo of #60863.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision:
D30408145
Pulled By: mruberry
fbshipit-source-id:
0b34ebc7f23ba38ecd89640b61d8aca59b7eab58
Mike Ruberry [Thu, 19 Aug 2021 19:41:42 +0000 (12:41 -0700)]
Modernizes add and mul documentation (#63309)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39329.
The documentation for torch.add and torch.mul was sorely out of date and even included deprecated references. This PR modernizes their descriptions consistent with torch.sub.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63309
Reviewed By: ngimel
Differential Revision:
D30338004
Pulled By: mruberry
fbshipit-source-id:
ee1c2a8106af8341253cafb0003b06e8f652624d