Iurii Zdebskyi [Mon, 8 Apr 2019 20:46:52 +0000 (13:46 -0700)]
Renamed bool tensors into byte tensors (#19021)
Summary:
Renamed bool tensors to byte tensors so that the generated code reflects the correct type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19021
Differential Revision:
D14835188
Pulled By: izdeby
fbshipit-source-id:
0252d2c69dab35ac2f076cf9a87423463e902c76
Thomas Viehmann [Mon, 8 Apr 2019 20:35:34 +0000 (13:35 -0700)]
Handle None indexing in TorchScript (#18615)
Summary:
t[None], t[None, 1:, None] and friends for unsqueezing
Fixes: #12810
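The `None`-indexing semantics being added to TorchScript match NumPy's (and eager PyTorch's): each `None` in an index expression inserts a size-1 dimension at that position. A minimal sketch using NumPy to illustrate the semantics:

```python
import numpy as np

t = np.arange(12).reshape(3, 4)

# t[None] inserts a size-1 dimension in front, like unsqueeze(0)
assert t[None].shape == (1, 3, 4)

# None mixes freely with slices; each None unsqueezes at its position
assert t[None, 1:, None].shape == (1, 2, 1, 4)
```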
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18615
Differential Revision:
D14837039
Pulled By: wanchaol
fbshipit-source-id:
ab3862c41629f087b0a46b7c59c93dac4018e6fe
Junjie Bai [Mon, 8 Apr 2019 20:09:11 +0000 (13:09 -0700)]
Turn on mkldnn in most builds except rocm
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18965
Differential Revision:
D14836931
Pulled By: bddppq
fbshipit-source-id:
463a9bc5043a1f3194158f7bbfae3b71c6cd4b20
David Riazati [Mon, 8 Apr 2019 20:01:09 +0000 (13:01 -0700)]
Remove dead code in module.cpp (#19022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19022
ghimport-source-id:
cdf694c1b426eb9f82d4c148c9f2c2cfc180cedd
Reviewed By: eellison
Differential Revision:
D14833409
Pulled By: driazati
fbshipit-source-id:
8914c7227add7f3e07f56b21a513ba7727fb6800
Mikhail Zolotukhin [Mon, 8 Apr 2019 19:22:52 +0000 (12:22 -0700)]
Convert test_recursive_cse to use Filecheck inline annotations. (#19032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19032
ghimport-source-id:
58a146542deb08dd3057d099167ba530a5e51400
Differential Revision:
D14836689
Pulled By: ZolotukhinM
fbshipit-source-id:
e65ca5f09193eb7c16c204aedd50c474ea31210c
Mikhail Zolotukhin [Mon, 8 Apr 2019 19:06:55 +0000 (12:06 -0700)]
Add a document 'How to Write Tests Using FileCheck' (#19005)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19005
ghimport-source-id:
f9c3eff54adc8eef3ead2c77be62c44d88d22a00
Differential Revision:
D14826845
Pulled By: ZolotukhinM
fbshipit-source-id:
62cc3657ee89acc979403da15e39bd4cd09a866d
Duc Ngo [Mon, 8 Apr 2019 18:48:42 +0000 (11:48 -0700)]
caffe2 - Expose tensor filler util to Python (#18886)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18886
Expose tensor filler util to Python and add a unit test (both C++/Python)
Reviewed By: salexspb
Differential Revision:
D14784470
fbshipit-source-id:
bb8e013d1755c27c166e87d5a8491a97c65d3d8d
Jiakai Liu [Mon, 8 Apr 2019 17:54:59 +0000 (10:54 -0700)]
call build_android.sh from pytorch CI build script (#18918)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18918
ghimport-source-id:
98c63da263adbbc6ac74a69ac117740c852833cd
Reviewed By: dreiss
Differential Revision:
D14800727
Pulled By: ljk53
fbshipit-source-id:
4d06f845bb34bcdb74b0602404f2a0782f8c8783
Jon Malmaud [Mon, 8 Apr 2019 16:45:49 +0000 (09:45 -0700)]
Type annotations for `util.data`. (#18963)
Summary:
I haven't had a chance to rigorously try these out yet so don't merge yet.
Closes #18725.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18963
Differential Revision:
D14832897
Pulled By: ezyang
fbshipit-source-id:
4780e7a34126bc66ddbfd9d808dfc9e0edd77e68
Johannes M Dieterich [Mon, 8 Apr 2019 16:44:08 +0000 (09:44 -0700)]
ifdef guard some explicit pragma unrolls (#19018)
Summary:
The ROCm compiler cannot and will not satisfy these pragmas because the loop trip count is only known at runtime, causing compile-time warnings.
Some warnings remain, arising from other parts of the ROCm stack; tickets are filed and they will be resolved within those components.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19018
Differential Revision:
D14832859
Pulled By: ezyang
fbshipit-source-id:
0d66e4aebe4e56af14dd5e2967d3c374a82be25c
Summer Deng [Mon, 8 Apr 2019 16:26:37 +0000 (09:26 -0700)]
Fix a dev mode bug in activation distribution observer (#19004)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19004
Handle the corner case where the data has min 3.40282e+38 and max -3.40282e+38.
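A min of +3.40282e+38 (FLT_MAX) paired with a max of -FLT_MAX is the typical signature of a min/max observer that never saw any data. A hypothetical pure-Python sketch of that degenerate case (the function names here are illustrative, not the actual caffe2 code):

```python
# Observers are commonly initialized to (+FLT_MAX, -FLT_MAX); if no
# values are ever observed, min stays above max and downstream qparam
# math must special-case it.
FLT_MAX = 3.40282e+38

def observe(values, lo=FLT_MAX, hi=-FLT_MAX):
    for v in values:
        lo = min(lo, v)
        hi = max(hi, v)
    return lo, hi

lo, hi = observe([])   # nothing observed
assert lo > hi         # the "min 3.40282e+38 max -3.40282e+38" case

def safe_range(lo, hi):
    # Collapse the empty-observation case to a harmless zero range
    return (0.0, 0.0) if lo > hi else (lo, hi)

assert safe_range(lo, hi) == (0.0, 0.0)
assert safe_range(*observe([-1.0, 2.0])) == (-1.0, 2.0)
```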
Reviewed By: jspark1105
Differential Revision:
D14822193
fbshipit-source-id:
b9771d1584fdf8317f5b8c7f5806be5d27314386
Gregory Chanan [Mon, 8 Apr 2019 15:10:19 +0000 (08:10 -0700)]
Clean up some sparse code. (#18962)
Summary:
1) sparse_dispatches in native_parse was not used anymore; got rid of it.
2) Got rid of the overloaded sizes_ in SparseTensorImpl, which now just uses the base implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18962
Differential Revision:
D14811545
Pulled By: gchanan
fbshipit-source-id:
2fa60ef50456b5f605caa63beae1d8d2542fd527
Roy Li [Mon, 8 Apr 2019 06:56:02 +0000 (23:56 -0700)]
Remove tensorWithAllocator() from Type (#18780)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18780
ghimport-source-id:
7d18a11ce87d988bd32f6ebb96acd878ab8d61be
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18780 Remove tensorWithAllocator() from Type**
* #18779 Remove tensorFromBlob() from Type
Differential Revision:
D14739336
fbshipit-source-id:
429ab10bb9f6ac9f97b5a11c7a836b6b6336cb2d
Johannes M Dieterich [Mon, 8 Apr 2019 01:13:33 +0000 (18:13 -0700)]
Fix sparse mm for ROCm (#18985)
Summary:
* Annotate also two pass reduction with launch bounds
* ifdef some shortcomings of ROCm w.r.t. short-circuit returns - internal tickets filed
* while there, plug memory leak by destroying matrix descriptor after the sparse call (applicable to cuSPARSE)
* while there, fix types for cusparseXcoo2csr as per cuSPARSE documentation
* enable test_dsmm in test_sparse which now passes
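One of the bullets above fixes argument types for `cusparseXcoo2csr`, which compresses sorted COO row indices into a CSR row-pointer array. A pure-Python sketch of what that conversion computes (the helper name here is illustrative; the real work happens on the GPU):

```python
def coo_rows_to_csr(coo_row_ind, num_rows):
    """Compress sorted COO row indices into a CSR row-pointer array,
    i.e. the transformation cusparseXcoo2csr performs."""
    row_ptr = [0] * (num_rows + 1)
    for r in coo_row_ind:
        row_ptr[r + 1] += 1           # count nonzeros per row
    for i in range(num_rows):
        row_ptr[i + 1] += row_ptr[i]  # prefix-sum counts into offsets
    return row_ptr

# 3x3 matrix with nonzeros in rows [0, 0, 2]
assert coo_rows_to_csr([0, 0, 2], 3) == [0, 2, 2, 3]
```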
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18985
Differential Revision:
D14822009
Pulled By: bddppq
fbshipit-source-id:
757267a47a63ee56ef396c33059f7eca099f4833
Ilia Cherniavskii [Sun, 7 Apr 2019 22:03:21 +0000 (15:03 -0700)]
Check if profiler is disabled in push/pop event (#18908)
Summary:
Make sure to check whether the profiler is disabled in the push/pop and mark-event functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18908
Differential Revision:
D14791931
Pulled By: ilia-cher
fbshipit-source-id:
e4f5149e69999ee2b9238c21cccad6d27c6a714a
Nishant Pandit [Sun, 7 Apr 2019 16:09:33 +0000 (09:09 -0700)]
Implement Observer pass on simple model and validate stats (#18848)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18848
The Observer module is based on the eager-mode compute-qparam implementation.
The goal is to validate the QParam results for eager mode and script mode on a simple model.
At this point, observer stats are collected and qparams are computed only for activations.
Reviewed By: zafartahirov
Differential Revision:
D14720805
fbshipit-source-id:
cb2f321b4b9927b37905fdb8eb55c5610d41b351
Balint Cristian [Sun, 7 Apr 2019 15:23:10 +0000 (08:23 -0700)]
AVX2 with GCC9 fix. (#18991)
Summary:
Dear All,
The proposed patch fixes the test code snippets used in the cmake infrastructure and the resulting failure to properly set the ```CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS``` flag. Without the fix, libcaffe2.so ends up with unresolved (```UND```) AVX2-related references, rendering it unusable.
* Using GCC 9 test code from cmake build infra always fails:
```
$ gcc -O2 -g -pipe -Wall -m64 -mtune=generic -fopenmp -DCXX_HAS_AVX_1 -fPIE -o test.o -c test.c -mavx2
test.c: In function ‘main’:
test.c:11:26: error: incompatible type for argument 1 of ‘_mm256_extract_epi64’
11 | _mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
| ^
| |
| __m256 {aka __vector(8) float}
In file included from /usr/lib/gcc/x86_64-redhat-linux/9/include/immintrin.h:51,
from test.c:4:
/usr/lib/gcc/x86_64-redhat-linux/9/include/avxintrin.h:550:31: note: expected ‘__m256i’ {aka ‘__vector(4) long long int’} but argument is of type ‘__m256’ {aka ‘__vector(8) float’}
550 | _mm256_extract_epi64 (__m256i __X, const int __N)
|
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.0.1 20190328 (Red Hat 9.0.1-0.12) (GCC)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18991
Differential Revision:
D14821838
Pulled By: ezyang
fbshipit-source-id:
7eb3a854a1a831f6fda8ed7ad089746230b529d7
Roy Li [Sun, 7 Apr 2019 08:35:11 +0000 (01:35 -0700)]
Remove tensorFromBlob() from Type (#18779)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18779
ghimport-source-id:
e7453b74fcce0e4f4a9cbce0324992a85272a426
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18780 Remove tensorWithAllocator() from Type
* **#18779 Remove tensorFromBlob() from Type**
Differential Revision:
D14739335
fbshipit-source-id:
8a0619a5b412332efa3b2d60c1edebd53d089d50
James Reed [Sun, 7 Apr 2019 07:15:42 +0000 (00:15 -0700)]
Improve precision of emitted code for prim::Constant (#18817)
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/18815 and https://github.com/pytorch/pytorch/pull/18811.
This makes it so that we emit a higher-precision literal for float values in the fusion kernel and assign it to a `double` variable. This prevents us from losing precision for values such as `pi`; with the previous fixes, the value will still be downcast to `float` if downstream operations require it, so we should not lose performance to implicit promotions.
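The precision loss being avoided is easy to demonstrate: round-tripping a double through a 32-bit float discards about half the mantissa. A small sketch using the standard library:

```python
import math
import struct

# Pack pi as a 32-bit float and unpack it back to a Python double.
pi_as_float = struct.unpack('f', struct.pack('f', math.pi))[0]

assert pi_as_float != math.pi
# The error is ~1e-7, far above double's ~2.2e-16 epsilon, which is why
# emitting the literal as double (and downcasting only if needed) matters.
assert abs(pi_as_float - math.pi) > 1e-8
```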
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18817
Differential Revision:
D14820842
Pulled By: jamesr66a
fbshipit-source-id:
519671c6ca5e7adac746a4c4c72760a6d91e332f
Arunava [Sun, 7 Apr 2019 07:07:24 +0000 (00:07 -0700)]
convert_sync_batch_norm to SyncBatchNorm (#18787)
Summary:
Closes #18382
Please let me know if any changes are required.
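The conversion is naturally a recursive walk over the module tree that swaps each batch-norm child for a sync variant. A hypothetical dependency-free sketch of that pattern (the classes here are stand-ins for the real nn modules, and the real helper also copies running stats and parameters):

```python
class BatchNorm:                 # stand-in for nn.BatchNorm*d
    def __init__(self, num_features):
        self.num_features = num_features

class SyncBatchNorm(BatchNorm):  # stand-in for nn.SyncBatchNorm
    pass

class Module:                    # stand-in container module
    def __init__(self, **children):
        self.children = children

def convert(module):
    # Replace any plain BatchNorm with a SyncBatchNorm of the same width.
    if isinstance(module, BatchNorm) and not isinstance(module, SyncBatchNorm):
        return SyncBatchNorm(module.num_features)  # copy state here in practice
    if isinstance(module, Module):
        module.children = {k: convert(v) for k, v in module.children.items()}
    return module

net = Module(bn=BatchNorm(16), inner=Module(bn=BatchNorm(32)))
net = convert(net)
assert isinstance(net.children['bn'], SyncBatchNorm)
assert isinstance(net.children['inner'].children['bn'], SyncBatchNorm)
```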
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18787
Differential Revision:
D14821147
Pulled By: soumith
fbshipit-source-id:
edd98eab1b3f4151c4ae5148146435ddb2ae678d
Summer Deng [Sun, 7 Apr 2019 04:50:28 +0000 (21:50 -0700)]
fix bug when falling back to acc32 when weight is prepacked (#18974)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18974
When the weight is prepacked and it doesn't contain a prepacked weight for acc32, we shouldn't fall back to acc32.
Reviewed By: bddppq
Differential Revision:
D14814067
fbshipit-source-id:
aec917322de695e283f0aca1e930c5603d196404
Ailing Zhang [Sun, 7 Apr 2019 04:36:22 +0000 (21:36 -0700)]
move 2ops back to autodiff (#18969)
Summary:
Move these 2 ops back to autodiff to unblock the XLA CI.
I will leave cleaning up symbolic_variable for my next PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18969
Differential Revision:
D14816811
Pulled By: ailzhang
fbshipit-source-id:
dd8a7e133dcad29560d3d1d25691883960117299
Nishant Pandit [Sun, 7 Apr 2019 03:56:17 +0000 (20:56 -0700)]
Preserve naming for inputs/outputs with observer insertion (#18713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18713
- The quantizer observer node's output was hooked up to the following node, which mutated the input/output naming. This is neither desired nor required, because the observer op can be a sync node.
- The quantizer is aimed at quantizing tensors, so we should insert observer ops only for Values of tensor type.
Reviewed By: zafartahirov
Differential Revision:
D14715916
fbshipit-source-id:
feca04c65a43103b46084d3548998498b19ee599
James Reed [Sun, 7 Apr 2019 00:44:53 +0000 (17:44 -0700)]
Emit math functions specific to output type (#18815)
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/18811
This makes it so that we only emit the *f variants of math functions if the output value's type is FloatTensor; otherwise we call the double variants to prevent loss of precision. This fixes more numerical issues.
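The dispatch rule itself is a one-liner over C math-function names (`sinf` for float, `sin` for double). A hypothetical sketch of what such a codegen helper decides, with illustrative names:

```python
# Pick the C math function variant by output dtype: the 'f'-suffixed
# variants compute in single precision, the bare names in double.
def math_call(fn, out_dtype):
    return fn + 'f' if out_dtype == 'float' else fn

assert math_call('sin', 'float') == 'sinf'
assert math_call('sin', 'double') == 'sin'    # keeps full precision
assert math_call('exp', 'double') == 'exp'
```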
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18815
Differential Revision:
D14816965
Pulled By: jamesr66a
fbshipit-source-id:
464be644168875ede987142281fb2168f4041e81
Soumith Chintala [Sat, 6 Apr 2019 19:36:58 +0000 (12:36 -0700)]
add instructions for NVIDIA Jetson platforms (#18990)
Summary:
Thanks to dusty-nv, we now have stable and weekly wheels for the NVIDIA Jetson platform. They require JetPack 4.2.
He's also maintaining source build instructions.
This PR adds links to the binaries and the source build instructions to the README.
The links are dynamic, so when new stable/weekly wheels are available, Dustin will update the same URLs to point to the new files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18990
Differential Revision:
D14820158
Pulled By: soumith
fbshipit-source-id:
761a56557decb72ad9c1b9f8a2745667f558eec3
Nishant Pandit [Sat, 6 Apr 2019 19:34:33 +0000 (12:34 -0700)]
Quantizer pass to insert quant-dequant nodes into IR (#18446)
Summary:
- Quantizer pass to mutate the IR by inserting quant-dequant nodes before and after nodes which support quantized ops. This information will be used by the JIT compiler to substitute in quantized ops.
- This currently covers a simple model. It will be expanded later with subgraph pattern matching to cover more complex patterns.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18446
Differential Revision:
D14592265
Pulled By: nishantpdce
fbshipit-source-id:
c9ba6c12aa96cb9c117826e386721eec83a55ea6
Soumith Chintala [Sat, 6 Apr 2019 18:37:41 +0000 (11:37 -0700)]
add SyncBatchNorm to docs (#18988)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/18983
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18988
Differential Revision:
D14820042
Pulled By: soumith
fbshipit-source-id:
356169f554a42303b266d700d3379a5288f9671d
mooncake4132 [Sat, 6 Apr 2019 17:25:56 +0000 (10:25 -0700)]
Add c10_cuda to libraries in CUDAExtension for Windows (#18982)
Summary:
This change was necessary for me to compile [apex](https://github.com/NVIDIA/apex) on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18982
Differential Revision:
D14819818
Pulled By: soumith
fbshipit-source-id:
37ff9b93a72ab2b7c87f23a61e9f776c71c4c1a8
Gao, Xiang [Sat, 6 Apr 2019 16:09:52 +0000 (09:09 -0700)]
Remove Trainer from README.md (#18980)
Summary:
Trainer was removed a long time ago.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18980
Differential Revision:
D14819855
Pulled By: ezyang
fbshipit-source-id:
f62020e688ebf6663416aec7435bf1f531607941
Zachary DeVito [Sat, 6 Apr 2019 01:53:31 +0000 (18:53 -0700)]
Create Object that represents a Module (#18469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18469
ghimport-source-id:
73cb8b58f43f10b1dcfca805fd5b25c4fa977632
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18469 Create Object that represents a Module**
* #18468 slots with explicit value/setValue make more sense in future patches
* #18467 Make Object hold its ClassType
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
This changes the underlying storage for script::Module to hold
an ivalue::Object, which has slots for all the parameters and attributes.
NamedIValue and Slot are now merged into one class, Slot, which stores
the tuple (ivalue::Object, offset) and can be used to read the name, type,
or value of the slot, and also to set the value. This cleans up a bunch
of client uses.
This PR does not actually use the module object in any generated code.
A future PR will switch how code is generated to treat modules as
first class.
Differential Revision:
D14613508
fbshipit-source-id:
d853a7559f58d244de2ef54a781427fcd1060ed0
Gao, Xiang [Sat, 6 Apr 2019 01:13:39 +0000 (18:13 -0700)]
Add numpy like repeat as torch.repeat_interleave (#18395)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/14093
cc: SsnL
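As the title says, the new `torch.repeat_interleave` mirrors `numpy.repeat` semantics (element-wise repetition, unlike `torch.repeat` which tiles the whole tensor). A sketch of those semantics using NumPy:

```python
import numpy as np

# Scalar repeats: each element is repeated in place.
assert np.repeat([1, 2, 3], 2).tolist() == [1, 1, 2, 2, 3, 3]

# Per-element repeats: element i is repeated repeats[i] times.
assert np.repeat([1, 2], [1, 3]).tolist() == [1, 2, 2, 2]
```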
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18395
Differential Revision:
D14599509
Pulled By: umanwizard
fbshipit-source-id:
2391a1cc135fe5bab38475f1c8ed87c4a96222f3
Elias Ellison [Sat, 6 Apr 2019 00:52:12 +0000 (17:52 -0700)]
Fix interpolate trace (#18875)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/10654
The issue is that in tracing, `.size` returns an int tensor, and when an int tensor is multiplied by a scalar, the int type dominates and the scalar gets cast to 0.
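The failure mode can be emulated in plain Python: if integer type promotion wins, a fractional scale factor truncates to 0 before the multiply. A hypothetical sketch of the bug versus the intended behavior (not the actual tracing code):

```python
# Emulation of the bug: the int "wins" the promotion, so the float
# scale factor is truncated to an int before multiplying.
def buggy_scale(size, scale):
    return size * int(scale)

# Intended behavior: multiply in floating point, then convert.
def fixed_scale(size, scale):
    return int(size * scale)

assert buggy_scale(10, 0.5) == 0   # any scale < 1 collapses to 0
assert fixed_scale(10, 0.5) == 5
```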
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18875
Differential Revision:
D14814441
Pulled By: eellison
fbshipit-source-id:
a4e96a2698f2fcbf3ec4b2bb4c43a30250f30ad9
James Reed [Sat, 6 Apr 2019 00:10:13 +0000 (17:10 -0700)]
Code string API for fuser testing (#18884)
Summary:
This adds a C++ function `debugGetFusedKernelCode` as well as a Python binding `_jit_fuser_get_fused_kernel_code` that will, given a FusionGroup graph and a set of specified inputs, return the compiled kernel source code. We can then check the contents of this source code for verification of the fuser codegen backend.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18884
Differential Revision:
D14795508
Pulled By: jamesr66a
fbshipit-source-id:
8f6e9dd13ebbb517737d893b0b5f5e9aa06af124
Michael Suo [Fri, 5 Apr 2019 22:13:35 +0000 (15:13 -0700)]
remove unused func (#18712)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18712
ghimport-source-id:
e435150a501b20695a5276addee93d795e04b532
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18712 [jit][easy] remove unused func**
* #18711 [jit] fix side-effects and aliasing for custom ops
as title
Differential Revision:
D14730979
fbshipit-source-id:
381d16ea2a45779bf6d5fc6d90a4f8585461e902
Junjie Bai [Fri, 5 Apr 2019 20:56:34 +0000 (13:56 -0700)]
Revert D14778810: [caffe2/int8] fix bug when falling back to acc32 when weight is prepacked
Differential Revision:
D14778810
Original commit changeset:
d49a8c4b7c81
fbshipit-source-id:
15568b084848de74437582548bec42aadc74080d
Zachary DeVito [Fri, 5 Apr 2019 20:33:14 +0000 (13:33 -0700)]
slots with explicit value/setValue make more sense in future patches (#18468)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18468
ghimport-source-id:
d4b41c521f2269a695e03c8e7d05d5542731ee48
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18469 Create Object that represents a Module
* **#18468 slots with explicit value/setValue make more sense in future patches**
* #18467 Make Object hold its ClassType
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
Reviewed By: suo
Differential Revision:
D14613509
fbshipit-source-id:
9f2208d0efd01465c78cebdc3e8365a9e0adf9ff
Zachary DeVito [Fri, 5 Apr 2019 20:33:14 +0000 (13:33 -0700)]
Make Object hold its ClassType (#18467)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18467
ghimport-source-id:
d51bdd64d2529d08c634c58df1a0870b54ad49fb
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18469 Create Object that represents a Module
* #18468 slots with explicit value/setValue make more sense in future patches
* **#18467 Make Object hold its ClassType**
* #18379 Enforce single parent for script submodules
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
Currently it holds a symbol whose unqualified name is the name of the
class. This will get confusing when there are multiple possible registries,
and it makes getting the class type from the object difficult.
The pointer to the class is only 4 more bytes so this patch just puts
it in the object.
Reviewed By: suo
Differential Revision:
D14613510
fbshipit-source-id:
b35175ba4be83d2522deaa6dad5070d6ec691fed
Zachary DeVito [Fri, 5 Apr 2019 20:33:14 +0000 (13:33 -0700)]
Enforce single parent for script submodules (#18379) (#18860)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18860
ghimport-source-id:
96305349bf3db564f43df2263b1e5bddcc9e9dae
Reviewed By: suo
Differential Revision:
D14780421
Pulled By: zdevito
fbshipit-source-id:
2bdd89b35866ba035ebea0adab037e441c1006e2
Stas Bekman [Fri, 5 Apr 2019 19:46:44 +0000 (12:46 -0700)]
CUDA_NVCC_EXECUTABLE is not needed, as nvcc is in PATH (#18958)
Summary:
As indicated by f0k: https://github.com/pytorch/pytorch/pull/18495#issuecomment-480178763
nvcc via ccache is already first in the PATH in the instructions I provided, so CUDA_NVCC_EXECUTABLE is not needed.
I re-built to test that it's so.
Thank you!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18958
Differential Revision:
D14810732
Pulled By: ezyang
fbshipit-source-id:
3758ae2253c745c5d7cfccedd49fa00cc4629965
Ahmad Salim Al-Sibahi [Fri, 5 Apr 2019 19:45:37 +0000 (12:45 -0700)]
Fix precision issue with expansion that prefers 'probs' over 'logits' (#18614)
Summary:
I have experienced that sometimes both were in `__dict__`, but it chose to copy `probs`, which loses precision relative to `logits`. This is especially important when training (Bayesian) neural networks or doing other types of optimization, since the loss is heavily affected.
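The precision loss is concrete: for a moderately large logit, the corresponding probability rounds to exactly 1.0 in double precision, and the logit can no longer be recovered. A small self-contained demonstration:

```python
import math

logit = 40.0
prob = 1.0 / (1.0 + math.exp(-logit))  # sigmoid rounds to exactly 1.0
assert prob == 1.0                     # the logit's information is gone

# Trying to recover the logit from probs blows up; copying logits
# directly would have preserved the value 40.0.
try:
    recovered = math.log(prob / (1.0 - prob))
except ZeroDivisionError:
    recovered = float('inf')
assert recovered == float('inf')
```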
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18614
Differential Revision:
D14793486
Pulled By: ezyang
fbshipit-source-id:
d4ff5e34fbb4021ea9de9f58af09a7de00d80a63
Joakim Rishaug [Fri, 5 Apr 2019 19:44:49 +0000 (12:44 -0700)]
Method is supposed to be in-place (#18684)
Summary:
Tracing models that attempt to return this in-place value doesn't turn out well.
I haven't run any tests to confirm the results, to be honest, but regardless of the outcome the operation happens in place, so it should work as before.
Sample output from traced model attempting to set `max_norm` on `Embedding`:
```
a leaf Variable that requires grad has been used in an in-place operation. (check_inplace at /pytorch/torch/csrc/autograd/VariableTypeUtils.h:49)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f0ecc5cc021 in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f0ecc5cb8ea in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x38ab2f (0x7f0ecb55ab2f in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #3: torch::autograd::VariableType::embedding_renorm_(at::Tensor&, at::Tensor const&, double, double) const + 0x76 (0x7f0ecb5b5966 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #4: <unknown function> + 0x56c958 (0x7f0ecb73c958 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #5: <unknown function> + 0x672286 (0x7f0ecb842286 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #6: torch::jit::InterpreterState::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) + 0x22 (0x7f0ecb83d842 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #7: <unknown function> + 0x65c6ac (0x7f0ecb82c6ac in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #8: <unknown function> + 0x3c8ab4 (0x7f0f06bc0ab4 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x3ad2c3 (0x7f0f06ba52c3 in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0x11663e (0x7f0f0690e63e in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #39: python_call + 0x11 (0x5563c3c521c1 in uwsgi)
frame #40: uwsgi_request_wsgi + 0x100 (0x5563c3c54410 in uwsgi)
frame #41: wsgi_req_recv + 0xac (0x5563c3becabc in uwsgi)
frame #42: simple_loop_run + 0xc4 (0x5563c3c35be4 in uwsgi)
frame #43: simple_loop + 0x10 (0x5563c3c35a00 in uwsgi)
frame #44: uwsgi_ignition + 0x241 (0x5563c3c3a3a1 in uwsgi)
frame #45: uwsgi_worker_run + 0x275 (0x5563c3c3ec35 in uwsgi)
frame #46: <unknown function> + 0x8f22c (0x5563c3c3f22c in uwsgi)
frame #47: <unknown function> + 0x3c13e (0x5563c3bec13e in uwsgi)
frame #48: __libc_start_main + 0xf1 (0x7f0f138922e1 in /lib/x86_64-linux-gnu/libc.so.6)
frame #49: _start + 0x2a (0x5563c3bec16a in uwsgi)
:
operation failed in interpreter:
op_version_set = 0
def forward(self,
input_1: Tensor) -> Tensor:
_0 = torch.norm(self.item_embedding.weight, 2, 1, True)
_1 = torch.div(self.item_embedding.weight, _0)
m_weight = torch.t(_1)
input_2 = torch.contiguous(input_1)
weight_1 = torch.embedding_renorm_(self.item_embedding.weight, input_2, 1., 2.)
~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
x = torch.embedding(weight_1, input_2, -1, False, False)
input_3 = torch.div(x, torch.norm(x, 2, 2, True))
max_batch_size = ops.prim.NumToTensor(torch.size(input_3, 0))
hx = torch.zeros([2, int(max_batch_size), 70], dtype=6, layout=0, device=torch.device("cpu"))
_2 = [self.lstm_layer.weight_ih_l0, self.lstm_layer.weight_hh_l0, self.lstm_layer.weight_ih_l1, self.lstm_layer.weight_hh_l1]
input_4, _3, _4 = torch.lstm(input_3, [hx, hx], _2, False, 2, 0.10000000000000001, False, False, True)
input = torch.matmul(input_4, torch.t(self.rnn2item.weight))
tastevec = torch.div(input, torch.norm(input, 2, 2, True))
outputs = torch.matmul(tastevec, m_weight)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18684
Differential Revision:
D14782041
Pulled By: ezyang
fbshipit-source-id:
7b2fc19b7d5b6600263644498bb728319a19f39d
Summer Deng [Fri, 5 Apr 2019 19:44:09 +0000 (12:44 -0700)]
fix bug when falling back to acc32 when weight is prepacked (#18881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18881
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18878
When the weight is prepacked and it doesn't contain a prepacked weight for acc32, we shouldn't fall back to acc32.
TODO: add unit tests with better coverage
Reviewed By: feiyu1990
Differential Revision:
D14778810
fbshipit-source-id:
d49a8c4b7c815ab29b77feb53ee730ad63780488
Marek Kolodziej [Fri, 5 Apr 2019 19:43:02 +0000 (12:43 -0700)]
More numerically stable lerp (#18871)
Summary:
The C++ and CUDA implementations of the lerp are not numerically stable. This is discussed on Wikipedia [here](https://en.wikipedia.org/wiki/Linear_interpolation#Programming_language_support). I checked the GPU SASS output and there's no overhead from using the more precise implementation, from Kepler all the way to Turing. I haven't looked at CPU ASM though.
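The instability is visible even in plain Python doubles: the naive formula `a + t*(b - a)` is not exact at the endpoints when `a` and `b` differ wildly in magnitude, because `(b - a)` rounds away the small term. A minimal sketch of the two formulations:

```python
def naive_lerp(a, b, t):
    # (b - a) loses the small operand when |a| >> |b|
    return a + t * (b - a)

def stable_lerp(a, b, t):
    # Exact at t == 0 and t == 1
    return (1.0 - t) * a + t * b

a, b = 1e18, 1.0
# 1.0 - 1e18 rounds to -1e18 exactly, so the naive form returns 0.0
# at t = 1 instead of b.
assert naive_lerp(a, b, 1.0) == 0.0
assert stable_lerp(a, b, 1.0) == 1.0
```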
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18871
Differential Revision:
D14793438
Pulled By: ezyang
fbshipit-source-id:
2ddc2e026c5285466cae7d1b4101174253100445
Pieter Noordhuis [Fri, 5 Apr 2019 19:13:31 +0000 (12:13 -0700)]
Increase default c10d/ProcessGroupGloo test timeout (#18916)
Summary:
See #18659.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18916
Differential Revision:
D14808749
Pulled By: pietern
fbshipit-source-id:
9a9c8beddb2dbbb1bf4c5e575743d9e1fa3f07fa
Ailing Zhang [Fri, 5 Apr 2019 18:57:17 +0000 (11:57 -0700)]
remove symbolic variable part 1 (#17986)
Summary:
As discussed with gchanan, we should deduplicate symbolic_variable and symbolic_script to prepare for the future merge with derivatives.yaml.
This PR moves most of the easy formulas to symbolic_script.
TODO: run benchmarks to make sure there is no perf regression.
cc: apaszke zdevito wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17986
Differential Revision:
D14766412
Pulled By: ailzhang
fbshipit-source-id:
d95a3f876e256c0f505779a71587c985571d3b8f
Edward Yang [Fri, 5 Apr 2019 18:55:38 +0000 (11:55 -0700)]
Revert D14742020: Wrap workaround for cpp custom types a bit prettier and add an example
Differential Revision:
D14742020
Original commit changeset:
0f2fd83ae56a
fbshipit-source-id:
5640255aef0319b7d8996e07132e87213130d31c
Karl Ostmo [Fri, 5 Apr 2019 18:26:31 +0000 (11:26 -0700)]
Decompose more Windows scripts (#18917)
Summary:
This PR:
* pulls four distinct installation steps out of `build_pytorch.bat` and into their own scripts.
* eliminates the copy step for helper scripts called by `win-build.sh` and `win-test.sh`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18917
Differential Revision:
D14807236
Pulled By: kostmo
fbshipit-source-id:
03e91a5834dfd6d68903ad9725eacc099bbf6d53
Dmytro Dzhulgakov [Fri, 5 Apr 2019 18:14:11 +0000 (11:14 -0700)]
Wrap workaround for cpp custom types a bit prettier and add an example (#18791)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18791
As a temporary demonstration of how to extend this hack further until custom C types are ready.
Reviewed By: jamesr66a
Differential Revision:
D14742020
fbshipit-source-id:
0f2fd83ae56ab2abe16977a1829ed421e6abe74b
bddppq [Fri, 5 Apr 2019 18:09:15 +0000 (11:09 -0700)]
Remove cuda::compat functions in aten (#18905)
Summary:
Looks like the issue with using `std::` functions is fixed in the new ROCm version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18905
Differential Revision:
D14792943
Pulled By: bddppq
fbshipit-source-id:
af11acbb85872943f23b6e55415db1f0699e7b8f
Michael Suo [Fri, 5 Apr 2019 17:40:19 +0000 (10:40 -0700)]
fix side-effects and aliasing for custom ops (#18711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18711
ghimport-source-id:
c9caedc0660b2b7ba3730cd0e1a2e0e9c3cf422b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18711 [jit] fix side-effects and aliasing for custom ops**
Previously we didn't track aliasing, mutation, or side effects for
custom ops. This PR adds guards with the most conservative
assumptions possible: the op will
1) have side effects,
2) write to everything, and
3) produce a wildcard.
In order to tell whether a given operator is a custom op, this PR introduces
the concept of a "reserved" namespace (basically all our builtin namespaces).
Custom ops live in non-reserved namespaces, so a check on the namespace
is sufficient to tell whether a schema/node is "custom" or not.
This is just to get things correct for now. Follow-ups to this:
- Users should be able to specify aliasing/mutability without having to learn
the whole alias annotation schema.
- Relax assumptions a bit. In particular outputs can only alias input tensors,
they don't have to be wildcards.
Fixes #18490
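The namespace test described above is a simple set-membership check on the qualified operator name. A sketch of that check (the real reserved set lives in C++; the names below are illustrative):

```python
# Built-in namespaces are "reserved"; anything else is a custom op and
# gets the conservative aliasing/side-effect treatment.
RESERVED_NAMESPACES = {'aten', 'prim', 'onnx', 'quantized'}

def is_custom_op(qualified_name):
    namespace = qualified_name.split('::', 1)[0]
    return namespace not in RESERVED_NAMESPACES

assert not is_custom_op('aten::add')
assert is_custom_op('my_ops::warp_perspective')
```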
Differential Revision:
D14730978
fbshipit-source-id:
540b47a24ccf24145051609bdcc99c97e46e0fe0
Elias Ellison [Fri, 5 Apr 2019 17:37:58 +0000 (10:37 -0700)]
Expand the list of ops that mutate an inputs shape (#18812)
Summary:
Expand the list of ops that resize an input in place to include broadcasting ops and other ops that affect shape. Whoever is reviewing the PR, could you please look through PyTorch's in-place ops and see if I missed any.
Expanding the PR from: https://github.com/pytorch/pytorch/pull/17518
This is already being tested in test_resize_input_ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18812
Differential Revision:
D14793410
Pulled By: eellison
fbshipit-source-id:
125f4f5375ac1036fb96fabc9da2aaccc9adc778
J M Dieterich [Fri, 5 Apr 2019 17:11:43 +0000 (10:11 -0700)]
add launch bounds, enable more tests (#18909)
Summary:
Add launch bounds annotations for ROCm, derived from maxThreadsPerBlock and the thread counts actually applied.
Enable tests that now work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18909
Differential Revision:
D14801490
Pulled By: ezyang
fbshipit-source-id:
b81c97fc783a2627bc7e31b32036a364cfe40cc7
Yinghai Lu [Fri, 5 Apr 2019 17:09:14 +0000 (10:09 -0700)]
Add backward pass to infer a single missing input shape for Concat opportunistically (#18911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18911
Att.
Reviewed By: bddppq
Differential Revision:
D14791295
fbshipit-source-id:
4b7a775924f0eadb0cb73aa6c434a6a5be8b92be
Jiakai Liu [Fri, 5 Apr 2019 16:54:27 +0000 (09:54 -0700)]
change to use clang if NDK >= 18 (#18914)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18914
ghimport-source-id:
4d9d9322ee5559d96e13533ec37ff3be86a0227c
Reviewed By: ezyang
Differential Revision:
D14794162
Pulled By: ljk53
fbshipit-source-id:
caac55e12b1e62bf6ebcc6e2062d5ed122ad4e64
Zachary DeVito [Fri, 5 Apr 2019 16:46:10 +0000 (09:46 -0700)]
Revert
D14673459: [pytorch][PR] [jit] Replace Slot on script::Method with NamedIValue
Differential Revision:
D14673459
Original commit changeset:
21200180c47f
fbshipit-source-id:
9c01de4cf5bb7c87ac0c55705b901db990cd917b
Edward Yang [Fri, 5 Apr 2019 16:37:11 +0000 (09:37 -0700)]
Disable flaky test_proper_exit test. (#18950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18950
ghimport-source-id:
27bd575fd3c73a51ace1360aa020fa63a792a5d2
Differential Revision:
D14802009
Pulled By: ezyang
fbshipit-source-id:
051e1d038892c2c6e8337357fa80771b8dc42680
Edward Yang [Fri, 5 Apr 2019 16:33:08 +0000 (09:33 -0700)]
Checkout pytorch_sphinx_theme with https. (#18859)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18859
ghimport-source-id:
fbbcb8a2dd9c9f0a317de489b6bbb83e9071a7d8
Differential Revision:
D14801989
Pulled By: ezyang
fbshipit-source-id:
a9bc02e1383adafcac01994e6346b28551d95c71
Pieter Noordhuis [Fri, 5 Apr 2019 16:04:43 +0000 (09:04 -0700)]
Add tests for reducer class (#18845)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18845
This adds a few CPU only test cases for the reducer class.
Reviewed By: mrshenli
Differential Revision:
D14768432
fbshipit-source-id:
c008a52206826304e634a95bc14167ed94c97662
Owen Anderson [Fri, 5 Apr 2019 15:34:41 +0000 (08:34 -0700)]
Fix a few instances of notifying on a CV while holding the lock (#18857)
Summary:
Fix a few instances of notifying a condition variable while holding the lock, so that the lock is released before notifying. This avoids an extra thread suspension when the notified thread tries to grab the lock.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18857
Differential Revision:
D14779132
Pulled By: resistor
fbshipit-source-id:
b18a05c4c15be1426ebfdffac1c8f002b771cfd7
peter [Fri, 5 Apr 2019 14:44:43 +0000 (07:44 -0700)]
Unify caffe2 and libtorch build scripts on Windows (#18683)
Summary:
`scripts/build_windows.bat` is the original way to build caffe2 on Windows, but since it is merged into libtorch, the build scripts should be unified because they actually do the same thing except there are some different flags.
The follow-up is to add the tests. Looks like the CI job for caffe2 windows is defined [here](https://github.com/pytorch/ossci-job-dsl/blob/master/src/jobs/caffe2.groovy#L906). Could we make them a separate file, just like what we've done in `.jenkins/pytorch/win-build.sh`? There's a bunch of things we can do there, like using ninja and sccache to accelerate build.
cc orionr yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18683
Differential Revision:
D14730188
Pulled By: ezyang
fbshipit-source-id:
ea287d7f213d66c49faac307250c31f9abeb0ebe
Gregory Chanan [Fri, 5 Apr 2019 14:18:39 +0000 (07:18 -0700)]
Simplify storage wrapping in TH. (#18855)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18855
ghimport-source-id:
01faa229fa4db901ab8539d3778b716d909ba4cf
Reviewed By: dzhulgakov
Differential Revision:
D14790669
Pulled By: gchanan
fbshipit-source-id:
167b9bc9c9872743fa8f6040a26ddf7ff5789c27
Gregory Chanan [Fri, 5 Apr 2019 14:18:38 +0000 (07:18 -0700)]
Cache device on TensorImpl; clean up TensorImpl constructors. (#18833)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833
ghimport-source-id:
6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.**
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
1) We cache device on TensorImpl. This means we can access the device without a virtual function and allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves).
2) Clean up TensorImpl APIs. We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds. Instead, we just have two different constructors: one for types with a storage, one without.
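The device-caching idea in (1) and the two construction paths in (2) can be sketched in plain Python. The names below are hypothetical stand-ins, not the real TensorImpl API.

```python
# Sketch: cache the device on the impl object at construction time, so
# reading it is a cheap field access rather than a virtual call. Two
# constructors mirror the cleanup: one path for impls with a storage,
# one for storage-less impls with an explicit device.
from typing import Optional

class FakeStorage:
    def __init__(self, device: str):
        self.device = device

class FakeTensorImpl:
    def __init__(self, storage: Optional[FakeStorage], device: Optional[str] = None):
        if storage is not None:
            self._device = storage.device   # cached from the storage
        else:
            assert device is not None, "storage-less impl needs an explicit device"
            self._device = device

    def device(self) -> str:
        return self._device  # cheap cached read, no dispatch needed
```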
Reviewed By: dzhulgakov
Differential Revision:
D14766230
fbshipit-source-id:
745b8db84dcd6cb58f1a8675ad3ff8d033bc50df
Vitaly Fedyunin [Fri, 5 Apr 2019 13:19:58 +0000 (06:19 -0700)]
Revert "Adding pin_memory kwarg to zeros, ones, empty,... (#18854)
Summary:
This reverts commit
c484cf43a02863efd2f4a76aad43246fb0191ab5.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18854
Differential Revision:
D14778393
Pulled By: VitalyFedyunin
fbshipit-source-id:
4b5a1f5b1c091bbc4a8e75614734cc011d26b452
Sebastian Messmer [Fri, 5 Apr 2019 08:46:58 +0000 (01:46 -0700)]
Silence compiler warnings (#18912)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18912
We intentionally test a deprecated API, no need to show the warnings here.
Reviewed By: dzhulgakov
Differential Revision:
D14792617
fbshipit-source-id:
9ea2a4106d566064283726eed2c274b98f49a2e5
Dmytro Dzhulgakov [Fri, 5 Apr 2019 08:04:58 +0000 (01:04 -0700)]
ScriptModuleOp in caffe2 (#18716)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18716
Might be useful as an intermediate stage for some systems that currently use Caffe2 nets as an execution mechanism.
Not sure it's a good idea altogether, please comment.
Limitations:
- only Tensor types as inputs/outputs
- the entire module is serialized as a zip archive inside a proto in the Caffe2 db; it'd be subject to the 4 GB limit and is likely very slow. For small models it'd work, though.
- no autograd, though it can be attached in principle
- no way to retrieve parameters inside the script module from C2 runtime perspective (though they potentially can be alias-fetched and stored as individual blobs)
- after deserialization, python wrappers returned don't have correct type (as we don't do module_lookup trick)
Build-wise, I had to add dependency from pybind_state to libtorch.so. I don't think we build Caffe2 python frontend independently anymore, so it should be fine.
Reviewed By: amirshim, houseroad
Differential Revision:
D14339599
fbshipit-source-id:
88a37a8abd1f1c4703e5ef937031f222535d4080
Karl Ostmo [Fri, 5 Apr 2019 07:49:06 +0000 (00:49 -0700)]
flake8 fix on extracted python script
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18931
Differential Revision:
D14796114
Pulled By: kostmo
fbshipit-source-id:
25971be5a36fffc61e29db981af7298a0fe0ed8c
David Riazati [Fri, 5 Apr 2019 06:27:05 +0000 (23:27 -0700)]
Replace Slot on script::Method with NamedIValue (#18252)
Summary:
This refactor lets us track the types of initial values added onto a `Method`. The main motivation for this is the change in `module.cpp`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18252
Differential Revision:
D14673459
Pulled By: driazati
fbshipit-source-id:
21200180c47f25bb70898771adfb569856e6c34a
Karl Ostmo [Fri, 5 Apr 2019 04:05:13 +0000 (21:05 -0700)]
U/kostmo/windows offload scripts 3
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18754
Differential Revision:
D14794893
Pulled By: kostmo
fbshipit-source-id:
05187d9b53615ffbcc7253accdc692c4ecaf25d9
Tongzhou Wang [Fri, 5 Apr 2019 02:03:08 +0000 (19:03 -0700)]
fix lint in optim doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18883
Differential Revision:
D14793365
Pulled By: ezyang
fbshipit-source-id:
c1b46c98e3319badec3e0e772d0ddea24cbf9c89
Iurii Zdebskyi [Fri, 5 Apr 2019 01:23:38 +0000 (18:23 -0700)]
Fixed the comment to reference gist example instead of private repo (#18852)
Summary:
Replace link to a file in a private repo with a gist
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18852
Reviewed By: ezyang
Differential Revision:
D14778481
Pulled By: izdeby
fbshipit-source-id:
8389aa4bf115ddcfd85079cc2c861404efa678e7
Sepehr Sameni [Fri, 5 Apr 2019 01:07:54 +0000 (18:07 -0700)]
return missing keys from load_state_dict (#18668)
Summary:
return missing_keys and unexpected_keys from load_state_dict so the user can handle them when strict mode is off; also removed an unused variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18668
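The bookkeeping this change surfaces can be sketched as a comparison of key sets; plain dicts stand in for real modules and state dicts here.

```python
# Sketch: compute which of the module's own keys are absent from the
# incoming state dict (missing) and which incoming keys the module does
# not know about (unexpected).

def diff_state_dict_keys(own_state, state_dict):
    missing_keys = [k for k in own_state if k not in state_dict]
    unexpected_keys = [k for k in state_dict if k not in own_state]
    return missing_keys, unexpected_keys

own = {"conv.weight": 1, "conv.bias": 2}
incoming = {"conv.weight": 3, "fc.weight": 4}
missing, unexpected = diff_state_dict_keys(own, incoming)
# missing == ["conv.bias"], unexpected == ["fc.weight"]
```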
Differential Revision:
D14782073
Pulled By: ezyang
fbshipit-source-id:
ab3b855eb77bb7422594d971988067e86eef20f2
Junjie Bai [Fri, 5 Apr 2019 00:21:41 +0000 (17:21 -0700)]
Fix caffe2 miopen conv transpose gradient op for case of no dX gradient
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18809
Reviewed By: ezyang
Differential Revision:
D14759762
Pulled By: bddppq
fbshipit-source-id:
ff795b7e58c82f67a1d7284b5ab06b0e0e5fd3ae
Brennan Vincent [Fri, 5 Apr 2019 00:18:11 +0000 (17:18 -0700)]
don't attempt to multiply by a sparse matrix (#18737)
Summary:
Tested by running the script in #16562 , and there was no error.
Then:
```
>>> print(mat.grad)
tensor([[1., 2., 3.],
[1., 2., 3.],
[1., 2., 3.]])
```
which is correct.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18737
Differential Revision:
D14773078
Pulled By: umanwizard
fbshipit-source-id:
8aa36eb6f6aa104263a467d9ac91d61b3bfd05f5
Wanchao Liang [Fri, 5 Apr 2019 00:00:46 +0000 (17:00 -0700)]
add Fast-RNN to AI-PEP
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18885
Reviewed By: hl475
Differential Revision:
D14728854
fbshipit-source-id:
7e7a2946929551963f7c938e3d82a260a9efdfbd
Pieter Noordhuis [Thu, 4 Apr 2019 21:14:50 +0000 (14:14 -0700)]
Allow override of backend in dist.new_group() (#18595)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18595
There is no need to force the backend to be the same as the global
process group, as long as the backend is "nccl" or "gloo".
Reviewed By: mrshenli
Differential Revision:
D14657204
fbshipit-source-id:
868817b9f219e3be8db0761a487f0027ed46663b
Lara [Thu, 4 Apr 2019 20:15:18 +0000 (13:15 -0700)]
ONNX Export All Cases of Softmax
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18482
Reviewed By: zrphercule
Differential Revision:
D14630697
Pulled By: houseroad
fbshipit-source-id:
c06f1e3bead10a265c5f4ac3723d49f4caf46801
Iurii Zdebskyi [Thu, 4 Apr 2019 20:01:10 +0000 (13:01 -0700)]
Added bool and half support for resize_as_ and view methods (#18821)
Summary:
Enabled **resize_as_** and **view** methods for bool and half tensors.
tested via unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18821
Reviewed By: ezyang
Differential Revision:
D14762852
Pulled By: izdeby
fbshipit-source-id:
4312079fb4e893fea6f71ff4f163094b2674f1e8
Lu Fang [Thu, 4 Apr 2019 19:57:31 +0000 (12:57 -0700)]
update of fbcode/onnx to
079c2639f9bb79b1774d1e3bfa05b0c093816ca7 (#18841)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18841
Previous import was
f0d7df2c643c4e37f1fd7735ef02c972c4d19fb5
Included changes:
- **[
079c2639](https://github.com/onnx/onnx/commit/
079c2639)**: update the squeeze and unsqueeze doc (#1905) <Lu Fang>
- **[
a8b45d62](https://github.com/onnx/onnx/commit/
a8b45d62)**: fix the ir_version onnx-operators.proto (#1903) <Lu Fang>
Reviewed By: zrphercule
Differential Revision:
D14767158
fbshipit-source-id:
2d772fece45e25d72bf1d10fad156189397f3f86
James Reed [Thu, 4 Apr 2019 19:53:44 +0000 (12:53 -0700)]
Actually model scalar type promotion in shape analysis (#18811)
Summary:
This was causing some numerical issues in the fuser
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18811
Differential Revision:
D14767390
Pulled By: jamesr66a
fbshipit-source-id:
f1123d1aab5501abad850d2edc996f8aa8dafe04
Max Wang [Thu, 4 Apr 2019 19:42:12 +0000 (12:42 -0700)]
Add a .ctags.d/ toplevel directory (#18827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18827
ghimport-source-id:
38f857bc29b2c2c6a71069d00c4c69ed0bef1574
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18827 Add a .ctags.d/ toplevel directory**
Exclude build artifacts by default.
Reviewed By: ezyang
Differential Revision:
D14765721
fbshipit-source-id:
a785dbb2ef1df96af8e23cc65c8db2a6b67b4fce
Wanwannodao [Thu, 4 Apr 2019 19:40:46 +0000 (12:40 -0700)]
Fix typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18802
Differential Revision:
D14781874
Pulled By: ezyang
fbshipit-source-id:
0f94c40bd84c84558ea3329117580f6c749c019f
Xiaomeng Yang [Thu, 4 Apr 2019 18:46:37 +0000 (11:46 -0700)]
Add support for group ConvTranspose (#18794)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18794
Add support for group ConvTranspose
Reviewed By: houseroad
Differential Revision:
D14741327
fbshipit-source-id:
5d947ca044bf8495dd7f8f56122441ebbcc6c7e4
Gregory Chanan [Thu, 4 Apr 2019 18:12:13 +0000 (11:12 -0700)]
Disallow changing the device of a tensor via set_. (#18832)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18832
ghimport-source-id:
fde4ad90541ba52dfa02bdd83466f17e6541e535
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.
* **#18832 [STACK] Disallow changing the device of a tensor via set_.**
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
This is necessary to cache the device on a TensorImpl.
Differential Revision:
D14766231
fbshipit-source-id:
bba61634b2d6252ac0697b96033c9eea680956e8
Karl Ostmo [Thu, 4 Apr 2019 17:38:09 +0000 (10:38 -0700)]
U/kostmo/win test offload scripts
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18694
Differential Revision:
D14766339
Pulled By: kostmo
fbshipit-source-id:
a2300e72129979f866430ca5c09dd7fff6df0a89
Zachary DeVito [Thu, 4 Apr 2019 17:22:27 +0000 (10:22 -0700)]
Revert
D14603722: Enforce single parent for script submodules
Differential Revision:
D14603722
Original commit changeset:
63ab5d0cccf7
fbshipit-source-id:
2c4174def102eda4589e08c4dbd67ce8af975199
Edward Yang [Thu, 4 Apr 2019 16:20:20 +0000 (09:20 -0700)]
Fix deviceCount on FakeGuardImpl. (#18745)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18745
ghimport-source-id:
3ed111efe83b3061652869e33d9b5910b7daa732
Differential Revision:
D14759198
Pulled By: ezyang
fbshipit-source-id:
70a8db767f310fe0e0079c7b0693e9330d7cd472
Gregory Chanan [Thu, 4 Apr 2019 13:19:54 +0000 (06:19 -0700)]
Stop swapping in Storages of the wrong device for Tensors. (#18831)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18831
ghimport-source-id:
2741e0d70ebe2c2217572c3af54ddd9d2047e342
Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* **#18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.**
This is necessary to support device caching, see https://github.com/pytorch/pytorch/pull/18751 and https://github.com/pytorch/pytorch/pull/18578.
In library code, we potentially swap in Storages with the wrong device when device_guard is False. This happens as follows with "view-like" operations.
1) We allocate a tensor on the 'wrong' device (because device_guard is false).
2) We swap out the 'wrong' storage with the 'right' storage using e.g. THCTensor_setStorage.
Instead, we can just construct the Tensor with the correct Storage from the beginning. This is what we do with 'view'.
Note there are two other "view-like" cases where this happens:
1) unfold
2) set_()
Because these aren't performance critical, I just added the device_guard instead of applying the above correction.
For completeness, this also includes a test that all `device_guard: false` functions behave properly under these conditions.
Reviewed By: dzhulgakov
Differential Revision:
D14766232
fbshipit-source-id:
0865c3ddae3f415df5da7a9869b1ea9f210e81bc
Roy Li [Thu, 4 Apr 2019 09:21:09 +0000 (02:21 -0700)]
Pass ScalarType separately from Type in python constructors
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17786
Reviewed By: ezyang
Differential Revision:
D14379075
fbshipit-source-id:
3abf066563b789a30cafe5b0c868a41326f5b833
Roy Li [Thu, 4 Apr 2019 09:21:09 +0000 (02:21 -0700)]
Store ScalarType and Backend instead of Type in TensorIterator
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17601
Reviewed By: ezyang
Differential Revision:
D14274754
fbshipit-source-id:
b08880ae586b6ae57d4c0bbeb203796d087926c4
Roy Li [Thu, 4 Apr 2019 09:21:09 +0000 (02:21 -0700)]
Introduce DeprecatedTypeProperties class (#17991)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17991
changes:
-Breaks bc: Tensor::type() now returns DeprecatedTypeProperties& rather than Type&.
-Added DeprecatedTypeProperties, it serves as a temporary replacement for Type as the return value of Tensor::type(). This contributes to making Type just for dispatch purposes so that we can make it dtype agnostic.
-Tensor::dispatch_type() now returns Type& like Tensor::type() used to do.
-Changed callsites of Tensor::type() appropriately.
Reviewed By: ezyang
Differential Revision:
D14443117
fbshipit-source-id:
239ccb7a09626279a71d1a37f8f82e7f57bf7d9e
Bram Wasti [Thu, 4 Apr 2019 07:24:16 +0000 (00:24 -0700)]
Fix to handle null strides in DLPack tensor (#18510)
Summary:
DLPack can have non-strided tensors, which is represented by a nullptr in the place of dl_tensor.strides.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18510
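The fix's idea can be illustrated in pure Python: when a DLPack tensor reports no strides (a null pointer), treat it as compact and derive row-major strides from the shape. This is a sketch, not the C++ code from the PR.

```python
# Sketch: a null strides field means the tensor is contiguous, so the
# strides can be reconstructed from the shape (row-major layout).

def default_strides(shape):
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides

def resolve_strides(shape, strides):
    # `strides is None` stands in for dl_tensor.strides == nullptr
    return list(strides) if strides is not None else default_strides(shape)

# resolve_strides([2, 3, 4], None) == [12, 4, 1]
```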
Differential Revision:
D14647328
Pulled By: bwasti
fbshipit-source-id:
5364282810a5772cfc2319fc8133fe86fdd84dd1
Yinghai Lu [Thu, 4 Apr 2019 07:19:21 +0000 (00:19 -0700)]
Add shape inference function for Split (#18838)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18838
It turns out that we don't have a shape inference function for the `Split` op at all. This diff adds one.
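What a Split shape-inference function computes can be sketched as follows; the function name and signature here are illustrative, not the Caffe2 implementation.

```python
# Sketch: given the input shape, an axis, and either explicit split
# sizes or a number of equal outputs, produce one output shape per split.

def infer_split_shapes(input_shape, axis, split=None, num_outputs=None):
    dim = input_shape[axis]
    if split is None:
        assert num_outputs is not None and dim % num_outputs == 0
        split = [dim // num_outputs] * num_outputs
    assert sum(split) == dim, "split sizes must cover the axis exactly"
    out_shapes = []
    for s in split:
        shape = list(input_shape)
        shape[axis] = s
        out_shapes.append(shape)
    return out_shapes

# infer_split_shapes([6, 4], axis=0, num_outputs=3) == [[2, 4], [2, 4], [2, 4]]
```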
Reviewed By: bertmaher
Differential Revision:
D14766871
fbshipit-source-id:
535cb4f24bdada603c76579e00e7a39aee93e19f
Lu Fang [Thu, 4 Apr 2019 06:14:07 +0000 (23:14 -0700)]
Fix the duplication problem in _unique_state_dict (#18139)
Summary:
Since parameter.data creates a new torch.Tensor on each access, we currently get duplicate tensors when calling _unique_state_dict. Deduplicate before creating the new tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18139
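The deduplication idea can be sketched without torch: key the seen-set by the identity of the shared underlying buffer rather than by the wrapper object, since a fresh wrapper (like `parameter.data`) is created on each access. All names below are stand-ins.

```python
# Sketch: Param mimics a parameter whose .data returns a new wrapper
# each time while sharing the same underlying storage.

class Param:
    def __init__(self, storage):
        self.storage = storage           # shared underlying buffer

    @property
    def data(self):
        return Param(self.storage)       # new wrapper, same storage

def unique_params(named_params):
    seen = set()
    out = {}
    for name, p in named_params.items():
        key = id(p.storage)              # dedupe on the buffer, not the wrapper
        if key not in seen:
            seen.add(key)
            out[name] = p.data
    return out

buf = object()
params = {"a": Param(buf), "a_alias": Param(buf)}
# unique_params(params) keeps only one entry for the shared buffer
```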
Reviewed By: dzhulgakov
Differential Revision:
D14511262
Pulled By: houseroad
fbshipit-source-id:
cb69795d0b6509721220650bbb19edeb3459a503
Jongsoo Park [Thu, 4 Apr 2019 05:50:05 +0000 (22:50 -0700)]
fold col offset into bias; optimize A symmetric quant (#17026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17026
D14013931 was for FC. This diff is similar optimizations for Conv.
A subtle difference is that in FC, once we fold col_offset into bias during the pre-processing step, we can treat everything as if A_zero_offset == 0 (symmetric quantization of A).
In Conv, we can't do this because padding still needs to use the original A_zero_offset.
From requantization point of view, once col_offset folded into bias, we can treat as if we're doing symmetric A quantization.
But, for steps involving padding like im2col, im2col fused with packing, and direct conv for depth-wise/group convolution we still need to pass the original A_zero_offset.
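The folding trick can be checked with small-integer arithmetic. In quantized matmul, C[i,j] = sum_k (A[i,k] - a_zp) * (B[k,j] - b_zp); the term -a_zp * col_offset[j], with col_offset[j] = sum_k (B[k,j] - b_zp), is constant per output column, so it can be folded into the bias once during pre-processing, after which the main loop runs as if a_zp == 0. The code below is a numeric sketch of that identity, not the fbgemm kernels.

```python
# Reference: fully zero-point-corrected quantized matmul.
def quant_matmul_reference(A, B, a_zp, b_zp):
    K, N = len(B), len(B[0])
    return [[sum((A[i][k] - a_zp) * (B[k][j] - b_zp) for k in range(K))
             for j in range(N)] for i in range(len(A))]

# Folded: column offsets absorbed into a per-column bias up front;
# the main loop no longer references a_zp (symmetric in A).
def quant_matmul_folded(A, B, a_zp, b_zp):
    K, N = len(B), len(B[0])
    bias = [-a_zp * sum(B[k][j] - b_zp for k in range(K)) for j in range(N)]
    return [[bias[j] + sum(A[i][k] * (B[k][j] - b_zp) for k in range(K))
             for j in range(N)] for i in range(len(A))]
```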
Reviewed By: jianyuh
Differential Revision:
D14020276
fbshipit-source-id:
c29caefd1127bbc6aff0e9d535939bb0c1ecb66c
Michael Suo [Thu, 4 Apr 2019 05:18:09 +0000 (22:18 -0700)]
fix flake8 lint (#18835)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18835
ghimport-source-id:
7b1f433ae51232822704d62699233688072cbc23
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18835 fix flake8 lint**
* #18826 [jit] run cpp tests for non-cuda builds in test_jit.py
...again
Reviewed By: ZolotukhinM
Differential Revision:
D14766790
fbshipit-source-id:
29361a407589092831dfbc3c5d63d2834934cd02
Michael Suo [Thu, 4 Apr 2019 05:18:09 +0000 (22:18 -0700)]
run cpp tests for non-cuda builds in test_jit.py (#18826)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18826
ghimport-source-id:
7ffa3bc7ef7402a6d6eb6ba5849e197019d77bf8
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18826 [jit] run cpp tests for non-cuda builds in test_jit.py**
We did all the work of nicely separating our cpp tests that don't require
CUDA, but they aren't run from test_jit.py if CUDA is missing.
Reviewed By: ZolotukhinM
Differential Revision:
D14766287
fbshipit-source-id:
9326b3a5c90f6c20fc8cfaf1a1885a363b91f30a
Lu Fang [Thu, 4 Apr 2019 04:29:36 +0000 (21:29 -0700)]
Fix the linter (#18842)
Summary:
Remove extra empty line
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18842
Differential Revision:
D14767334
Pulled By: houseroad
fbshipit-source-id:
63224bc407949949e1eb5123d3f151e4ac8f6988
Zachary DeVito [Thu, 4 Apr 2019 03:21:27 +0000 (20:21 -0700)]
Enforce single parent for script submodules (#18379)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18379
ghimport-source-id:
9895ecc1ff7897e98853dc00675341f36726e7c7
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18379 Enforce single parent for script submodules**
* #18378 Unify namespace of script::Module
* #18314 Add ability to specialize class types to ArgumentSpec
* #18226 Add Slot type to abstract the raw pointers being used for slots.
The assumption that a ScriptModule has a single parent is present in
our serialization format, and likely a few other places. It is not
enforced on creation of script module hierarchies though, meaning that
problems (e.g. replicating a module twice in the output
format) will not be caught until much later in the development cycle.
This patch enforces the property when a submodule is registered.
It also removes NamedModule since it is no longer necessary in this regime.
This will also allow easy discovery of a module's fully-qualified name
without needing to traverse the Module hierarchy.
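The invariant being enforced can be sketched minimally in Python; the names below are illustrative, not the `script::Module` API.

```python
# Sketch: a submodule may be registered under at most one parent;
# registration checks and records the parent pointer.

class Module:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = {}

    def register_module(self, name, child):
        if child.parent is not None:
            raise RuntimeError(
                f"{child.name} already has a parent ({child.parent.name}); "
                "a script submodule must have a single parent")
        child.parent = self
        self.children[name] = child

root = Module("root")
leaf = Module("leaf")
root.register_module("leaf", leaf)
# registering `leaf` under a second parent now raises
```

With the parent pointer in place, a module's fully-qualified name can be built by walking parents upward instead of searching the hierarchy top-down.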
Differential Revision:
D14603722
fbshipit-source-id:
63ab5d0cccf7d66c7833e0adf9023024ca9607cb
Elias Ellison [Thu, 4 Apr 2019 00:09:37 +0000 (17:09 -0700)]
Allow ints, floats, and tensors in conditional (#18755)
Summary:
Per our offline discussion, allow Tensors, ints, and floats to be cast to bool when used in a conditional
Fix for https://github.com/pytorch/pytorch/issues/18381
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18755
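The implicit-bool rule described above can be sketched in plain Python; lists stand in for single-element tensors here, and the function is an illustration rather than the TorchScript implementation.

```python
# Sketch: ints and floats convert to bool by nonzero-ness; a stand-in
# "tensor" (a list) converts only if it has exactly one element.

def implicit_cast_to_bool(value):
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, list):  # stand-in for a one-element tensor
        assert len(value) == 1, "only single-element tensors convert to bool"
        return value[0] != 0
    raise TypeError(f"cannot use {type(value).__name__} in a conditional")
```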
Reviewed By: driazati
Differential Revision:
D14752476
Pulled By: eellison
fbshipit-source-id:
149960c92afcf7e4cc4997bccc57f4e911118ff1
Wanchao Liang [Wed, 3 Apr 2019 23:50:46 +0000 (16:50 -0700)]
Fix layernorm ad formula on weight and bias (#18233)
Summary:
Fix the layernorm formula when weight and bias passed in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18233
Differential Revision:
D14760375
Pulled By: wanchaol
fbshipit-source-id:
d6bd3b137bc04c391aa5c24d021d1f811ba2a877