platform/upstream/pytorch.git
James Reed [Sun, 10 Mar 2019 07:10:26 +0000 (23:10 -0800)]
Clarify JIT docs

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17846

Differential Revision: D14400363

Pulled By: jamesr66a

fbshipit-source-id: 862316b5fd95526b6edebeca19d2cc522779df11

Pritam Damania [Sun, 10 Mar 2019 05:31:42 +0000 (21:31 -0800)]
Add metadata for torch jit TracedModules. (#17640)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17640

Pull Request resolved: https://github.com/pytorch/pytorch/pull/17311

I've extended our model metadata framework in this diff to support
traced modules as well. Re-used a lot of components from the previous
implementation of ScriptModule metadata.

Tracing is a little different from scripting, since you can't just create a
subclass of TopLevelTracedModule (the type returned by torch.jit.trace) and attach
metadata the way we did for ScriptModule. As a result, I've introduced a
separate API, torch.fb.jit_trace, which returns an instance of
TracedModuleWithMetadata, a subclass of TopLevelTracedModule, so we can now
attach metadata to this instance.

Reviewed By: dzhulgakov

Differential Revision: D14117966

fbshipit-source-id: 3eee5eef733cb8d6a219c02e2f41d08698eca326

Konstantin Lopuhin [Sun, 10 Mar 2019 04:06:57 +0000 (20:06 -0800)]
Fix PySlice_Unpack not available on PyPy 3.6 yet (#17836)

Summary:
This is one of the fixes needed to support compilation on PyPy 3.6, see https://github.com/pytorch/pytorch/issues/17835
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17836

Differential Revision: D14399404

Pulled By: soumith

fbshipit-source-id: ca650a6e2066aed86ddd3314a95d0cb3c515c633

Ronan Lamy [Sat, 9 Mar 2019 19:38:05 +0000 (11:38 -0800)]
PyPy compatibility: let unmodified slots be inherited in the standard way (#17837)

Summary:
This is needed to fix a segfault on PyPy 3.6, see https://bitbucket.org/pypy/pypy/issues/2968/segfault-calling-cpyext_tp_new_tuple and https://github.com/pytorch/pytorch/issues/17835
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17837

Differential Revision: D14399408

Pulled By: soumith

fbshipit-source-id: 75328a30018313d3223dd3e3eef9240a416c049b

Junjie Bai [Sat, 9 Mar 2019 05:50:20 +0000 (21:50 -0800)]
Run fp16 resnet50 training in bench script (#17831)

Summary:
cc xw285cornell
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17831

Differential Revision: D14398532

Pulled By: bddppq

fbshipit-source-id: 37c03cc2eebe3a6083e05631cb6ff03474e4a8a2

Summer Deng [Sat, 9 Mar 2019 03:00:43 +0000 (19:00 -0800)]
Int8 FC performance debugging (#17700)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17700

Add performance debugging utilities in DNNLOWP FC operator and the python script

Reviewed By: amylittleyang

Differential Revision: D14321299

fbshipit-source-id: 50dbd7b352a1da5d2ecb659d8003e71e70750063

Xiaomeng Yang [Sat, 9 Mar 2019 01:35:17 +0000 (17:35 -0800)]
Optimize LayerNormOp (#17604)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17604

Optimize LayerNormOp

i-am-not-moving-c2-to-c10
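For reference, the computation LayerNormOp performs can be pinned down with a plain-Python sketch (a reference implementation of the math only, not the optimized operator):

```python
import math

# Reference layer normalization over one row: subtract the row mean and
# divide by the row standard deviation (with an epsilon for stability).
def layer_norm(row, eps=1e-5):
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in row]

out = layer_norm([1.0, 2.0, 3.0, 4.0])
```

The optimized operator fuses the mean/variance reduction and the normalization passes; the sketch only fixes the expected result.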

Reviewed By: houseroad

Differential Revision: D14274175

fbshipit-source-id: a7aa263a1b0eb109682d2be99306e7b2cdcc0faf

Roy Li [Sat, 9 Mar 2019 00:39:04 +0000 (16:39 -0800)]
Remove some simple use cases of Type::ScalarType()

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17529

Reviewed By: ezyang

Differential Revision: D14237932

fbshipit-source-id: be633a1fc19215d53cfe083fdd7196acf2b7dd2f

Roy Li [Sat, 9 Mar 2019 00:39:04 +0000 (16:39 -0800)]
Change Dispatch.h to use ScalarType over Type

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17527

Reviewed By: zou3519

Differential Revision: D14235395

fbshipit-source-id: 3f53e33f6794f1f14c2edf79014b8ef8397822c5

Lu Fang [Sat, 9 Mar 2019 00:27:00 +0000 (16:27 -0800)]
Revert D14361993: [pytorch][PR] [Onnx] - refactoring serialization of ONNX initializers to be name-based

Differential Revision:
D14361993

Original commit changeset: da93e945d557

fbshipit-source-id: 15eea001fbcd059ac13903405aeb9ea182c6ee8b

James Reed [Fri, 8 Mar 2019 23:33:34 +0000 (15:33 -0800)]
Open registration for c10 thread pool (#17788)

Summary:
1. Move ATen threadpool & open registration mechanism to C10
2. Move the `global_work_queue` to use this open registration mechanism, to allow users to substitute in their own
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17788

Reviewed By: zdevito

Differential Revision: D14379707

Pulled By: jamesr66a

fbshipit-source-id: 949662d0024875abf09907d97db927f160c54d45

David Riazati [Fri, 8 Mar 2019 23:26:25 +0000 (15:26 -0800)]
Cast nn.Upsample.scale_factor to a float (#17732)

Summary:
Fixes #17106
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17732

Differential Revision: D14388192

Pulled By: driazati

fbshipit-source-id: d9c9e87a7c6db63c1de3ddebbb8dcf619f0dc34d

Edward Yang [Fri, 8 Mar 2019 22:30:15 +0000 (14:30 -0800)]
Fix lint in run_test.py

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17815

Reviewed By: eellison

Differential Revision: D14390308

fbshipit-source-id: 22efd62a1bbd1fc8155a942d7160d5b7d3158e6b

Edward Yang [Fri, 8 Mar 2019 22:19:23 +0000 (14:19 -0800)]
Fix lint in test/common_utils.py

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17814

Reviewed By: eellison

Differential Revision: D14390194

fbshipit-source-id: b4b3bbe20a15d0b9ed127b255e01c0d6d0832c1b

Roy Li [Fri, 8 Mar 2019 22:05:01 +0000 (14:05 -0800)]
Replace tensor.type().scalarType() calls with tensor.scalar_type()

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17515

Reviewed By: ezyang

Differential Revision: D14233250

fbshipit-source-id: 6c7af8d2291c0c2b148001b30cf03834f34366c0

Yinghai Lu [Fri, 8 Mar 2019 21:15:05 +0000 (13:15 -0800)]
Catch exceptions in bound_shape_inference (#17775)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17775

Handles the user-supplied input shape hint properly.

Reviewed By: zrphercule

Differential Revision: D14368735

fbshipit-source-id: 504cd96589e47aa432617e56362aa6b01a25ba9b

Sebastian Messmer [Fri, 8 Mar 2019 20:33:31 +0000 (12:33 -0800)]
refactor caffe2 operator constructors - 11/9 (#17722)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17722

clangr codemod

Reviewed By: ezyang

Differential Revision: D14350584

fbshipit-source-id: adef54cedc9409b4fb365f6644e2621a9e47b2ff

Edward Yang [Fri, 8 Mar 2019 20:15:49 +0000 (12:15 -0800)]
Suppress C408 lint (don't use dict constructor) (#17813)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17813

We have a lot of manually written out dict() constructors,
and (1) I don't think use of curly brace syntax is much
of an improvement and (2) it seems like a waste of time to
fix them all.
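For context, the two spellings C408 distinguishes build identical mappings; the lint prefers the literal form, and this diff suppresses the check rather than rewriting every call site:

```python
# flake8's C408 flags the dict() constructor in favor of the {...} literal;
# both construct the same mapping.
a = dict(x=1, y=2)    # the form C408 complains about
b = {"x": 1, "y": 2}  # the form C408 wants
```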

Reviewed By: eellison

Differential Revision: D14390136

fbshipit-source-id: 6199bef4dea75b6079bcb9d9e8acf20a2e1a86e1

Christian Puhrsch [Fri, 8 Mar 2019 19:37:01 +0000 (11:37 -0800)]
Add matches_jit_signature to recent native functions

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17805

Differential Revision: D14388004

Pulled By: cpuhrsch

fbshipit-source-id: c50580b6fe1e9cfefed91aaa526376325d9f9c0d

peterjc123 [Fri, 8 Mar 2019 18:39:06 +0000 (10:39 -0800)]
Add /MD to prevent linking errors on Windows

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17799

Differential Revision: D14385777

Pulled By: ezyang

fbshipit-source-id: 8c1d9f80c48399087f5fae4474690e6d80d740e6

Dmytro Dzhulgakov [Fri, 8 Mar 2019 18:36:29 +0000 (10:36 -0800)]
Change message on unknown db type to be friendly (#17795)

Summary:
CreateDB actually returns nullptr when the db type is unknown, and throws when the file is missing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17795

Reviewed By: ezyang

Differential Revision: D14383226

Pulled By: dzhulgakov

fbshipit-source-id: 1dcf75a6b4ba8b64a24d4e5daf02db3189d56b7b

David Riazati [Fri, 8 Mar 2019 18:29:51 +0000 (10:29 -0800)]
Trace rnn max_batch_size (#17727)

Summary:
This causes the tracer to record the select / cast to int operation instead of just an int constant

Fixes #15319 but relies on a fix for #17583 first
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17727

Differential Revision: D14377886

Pulled By: driazati

fbshipit-source-id: 59453def54ba72756303f723993844dbeb5d2f8b

Sebastian Messmer [Fri, 8 Mar 2019 18:19:49 +0000 (10:19 -0800)]
Remove legacy way of exposing caffe2 operators to PyTorch (#17742)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17742

This path isn't used anymore, and is incompatible with the changes stacked on top of this diff.
Removing it.
cc bwasti to check and confirm these can really be deleted

Reviewed By: ezyang

Differential Revision: D14362426

fbshipit-source-id: 32cdc19f28c2a981ae1e204901420998367ee588

Gregory Chanan [Fri, 8 Mar 2019 17:41:33 +0000 (09:41 -0800)]
Remove 'Tensor' key from ATen codegen. (#17782)

Summary:
We used to have different ATen Tensor types, but we don't anymore.  This was just being maintained by a codegen'ed comment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17782

Reviewed By: ezyang

Differential Revision: D14378004

Pulled By: gchanan

fbshipit-source-id: 1bbf276393a391252d372cc385230c784bd78588

Gregory Chanan [Fri, 8 Mar 2019 17:39:03 +0000 (09:39 -0800)]
Remove ProcessorSpecificPlugin. (#17789)

Summary:
It doesn't seem to be used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17789

Reviewed By: ezyang

Differential Revision: D14382423

Pulled By: gchanan

fbshipit-source-id: 0ac3236c48979a1b2bcd615e307e55f10fd8eb77

Gregory Chanan [Fri, 8 Mar 2019 17:38:48 +0000 (09:38 -0800)]
Remove THPPlugin. (#17790)

Summary:
It doesn't seem to be used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17790

Reviewed By: ezyang

Differential Revision: D14380897

Pulled By: gchanan

fbshipit-source-id: 3c3884a08c3b6c1489347d439509b19e079c5861

Edward Yang [Fri, 8 Mar 2019 15:23:16 +0000 (07:23 -0800)]
Replace tens with hundreds.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17752

Differential Revision: D14366743

fbshipit-source-id: 39f6ac08180d780866e284024918d9abd197d239

Tim Khatkevich [Fri, 8 Mar 2019 13:43:17 +0000 (05:43 -0800)]
Support failback for more operators in ideep (#17747)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17747

RMACRegions, Normalize and RoIPooling

Reviewed By: dskhudia

Differential Revision: D14365096

fbshipit-source-id: dafcb7077515e03c2880832a442015b70fc7140d

Mikhail Zolotukhin [Fri, 8 Mar 2019 09:08:17 +0000 (01:08 -0800)]
Cleanup include files in jit/passes/common_subexpression_elimination.h.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17784

Differential Revision: D14381529

Pulled By: ZolotukhinM

fbshipit-source-id: e32e17ee644ef888a6d56a8ee3648e7ac21758bf

Christian Puhrsch [Fri, 8 Mar 2019 07:31:00 +0000 (23:31 -0800)]
Use return names in JIT operators

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17638

Differential Revision: D14295606

Pulled By: cpuhrsch

fbshipit-source-id: 62040ac65434411357808735f0fe6cd33cc1c30f

Jerry Zhang [Fri, 8 Mar 2019 02:31:33 +0000 (18:31 -0800)]
Change ConvPoolOp<Context>::SetOutputSize to ConvPoolOp<Context>::GetOutputSize (#17764)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17764

Original commit changeset: f1923fdca4a1

Reverting the int8 ops fixes the original runtime regression.
We'll ignore the memory regression, since it is flaky; see D14228484

Reviewed By: dzhulgakov

Differential Revision: D13885233

fbshipit-source-id: ccbe4b94acb44b7b4cb3ae4d73e3f6091e1e1195

Roy Li [Fri, 8 Mar 2019 00:16:43 +0000 (16:16 -0800)]
Clean up some old ScalarType stuff

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17755

Differential Revision: D14377135

Pulled By: li-roy

fbshipit-source-id: 35305760a1621340ba66c61a193ff61cfedfa7e8

Elias Ellison [Thu, 7 Mar 2019 23:23:16 +0000 (15:23 -0800)]
add reference to flake8-mypy in contributing.md

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17759

Differential Revision: D14376813

Pulled By: eellison

fbshipit-source-id: cca1128e967ef7368633b94a3fa3c8e76a4a16f4

vishwakftw [Thu, 7 Mar 2019 22:01:47 +0000 (14:01 -0800)]
Move lerp to ATen, add functionality for tensor weights (#17348)

Summary:
Changelog:
- Remove TH/THC bindings
- Add tensor weights for `lerp`
- Modify derivatives appropriately
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17348
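The semantics being extended can be sketched in plain Python: with tensor weights the interpolation factor is applied elementwise rather than as a single scalar (an illustration of the formula, not the ATen kernel):

```python
# lerp computes start + weight * (end - start); with tensor weights each
# element gets its own interpolation factor.
def lerp(start, end, weight):
    return [s + w * (e - s) for s, e, w in zip(start, end, weight)]

out = lerp([0.0, 0.0], [10.0, 10.0], [0.25, 0.75])
```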

Differential Revision: D14355845

Pulled By: soumith

fbshipit-source-id: eaede4c09ee589d77ba6cf52583510ea8e3a2fcf

Iurii Zdebskyi [Thu, 7 Mar 2019 21:38:59 +0000 (13:38 -0800)]
Refactor dispatcher (#17753)

Summary:
This is a side PR for a bool tensor feature. The idea of this change came from feedback received in this [PR](https://github.com/pytorch/pytorch/pull/17376).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17753

Differential Revision: D14367989

Pulled By: izdeby

fbshipit-source-id: 4fa380e56e20f18e480be68920170dbc3a4eb91c

Wanchao Liang [Thu, 7 Mar 2019 21:31:55 +0000 (13:31 -0800)]
add layernorm to AD

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17702

Differential Revision: D14368472

Pulled By: wanchaol

fbshipit-source-id: 8db390e39444078258ad1d34ba74d6ddafa5d02b

Hector Yuen [Thu, 7 Mar 2019 20:52:54 +0000 (12:52 -0800)]
move half<->float conversions to oss operators (#17548)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17548

expose half float operators to OSS

common/math/Float16.h is the original implementation
this is substituted by caffe2/c10/util/Half.h

From the comments, it seems like both implementations don't handle denormals
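Python's struct module can demonstrate the fp16<->fp32 round trip these operators perform (the 'e' format is IEEE 754 half precision):

```python
import struct

# Convert fp32 -> fp16 bit pattern and back using struct's half-precision
# 'e' format.
def f32_to_f16_bits(x):
    return struct.unpack('<H', struct.pack('<e', x))[0]

def f16_bits_to_f32(bits):
    return struct.unpack('<e', struct.pack('<H', bits))[0]

one_bits = f32_to_f16_bits(1.0)  # 0x3C00: sign 0, exponent 15, mantissa 0
# fp16 has only ~3 decimal digits of precision, so 0.1 rounds on conversion
tenth = f16_bits_to_f32(f32_to_f16_bits(0.1))
```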

Reviewed By: jspark1105

Differential Revision: D14244200

fbshipit-source-id: f90ba28c5bf6a2b451b429cc4925b8cc376ac651

Lu Fang [Thu, 7 Mar 2019 20:51:09 +0000 (12:51 -0800)]
Fix the update ONNX expect files (#17767)

Summary:
Fix the CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17767

Reviewed By: zrphercule

Differential Revision: D14370483

Pulled By: houseroad

fbshipit-source-id: e7b0bbde0797c41f5a010fa206fab80fe2792eb7

Mikhail Zolotukhin [Thu, 7 Mar 2019 19:13:48 +0000 (11:13 -0800)]
Cleanup testFusion/testOne: there are unused arguments.

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17737

Differential Revision: D14366584

Pulled By: ZolotukhinM

fbshipit-source-id: 3c2dd2aabfecca475909e4eec4a077d900795da9

Lu Fang [Thu, 7 Mar 2019 19:03:57 +0000 (11:03 -0800)]
Automatic update of fbcode/onnx to 96c58ceeacf0f2b73d752e413e4fd78787a12da3 (#17676)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17676

Previous import was e18bb41d255a23daf368ffd62a2645db55db4c72

Included changes:
- **[96c58ce](https://github.com/onnx/onnx/commit/96c58ce)**: Fix shape inference when auto_pad is notset again (#1830) <Li-Wen Chang>
- **[873ddbb](https://github.com/onnx/onnx/commit/873ddbb)**: More extendable Runner (#1809) <Michał Karzyński>

Reviewed By: zrphercule

Differential Revision: D14321241

fbshipit-source-id: 12de9021afc61f5435f1b719cccf7b0f4ad73a84

Lu Fang [Thu, 7 Mar 2019 18:51:29 +0000 (10:51 -0800)]
Set the default ONNX opset to the latest stable opset (i.e., 9) (#17736)

Summary:
1) The changes in the new opset won't affect the internal pipeline.
2) The CI won't be affected by the ONNX changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17736

Reviewed By: zrphercule

Differential Revision: D14358710

Pulled By: houseroad

fbshipit-source-id: 4ef15d2246b50f6875ee215ce37ecf92d555ca6a

David Riazati [Thu, 7 Mar 2019 18:41:13 +0000 (10:41 -0800)]
Add module attributes (#17309)

Summary:
Similar to `nn.Parameter`s, this PR lets you store any `IValue` on a module as an attribute on a `ScriptModule` (only from the Python front-end currently). To mark something as an attribute, it should wrapped in `jit.Attribute(value, type)` (ex. `self.table = torch.jit.Attribute(table, Dict[str, torch.Tensor])`)

Followup Work:
* (de)serializing for use in C++
* change `self.training` to be a `bool` attribute instead of a buffer
* mutable attributes
* string frontend support
* documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17309

Differential Revision: D14354316

Pulled By: driazati

fbshipit-source-id: 67e08ab5229366b67fbc837e67b58831a4fb3318

Spandan Tiwari [Thu, 7 Mar 2019 18:06:17 +0000 (10:06 -0800)]
- refactoring serialization of ONNX initializers to be name-based (#17420)

Summary:
Currently, serialization of model parameters in ONNX export depends on the order in which they are stored in a container (`list` on Python side and `std::vector` on C++ side). This has worked fine till now, but if we need to do any pass on that graph that mutates the parameter list, then strictly order-based serialization may not work.

This PR is the first in a set to bring in more passes (such as constant folding) related to ONNX export. This PR lays the groundwork by moving the serialization in ONNX export from order-based to name based approach, which is more amenable to some of the passes.

houseroad - As discussed this change uses a map for export, and removes the code from `export.cpp` that relies on the order to compute initializer names.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17420
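The difference between the two schemes can be sketched in plain Python (hypothetical parameter names, not the export.cpp code):

```python
# Order-based: initializer i is paired with parameter i, so any pass that
# inserts or removes a parameter shifts every later pairing.
names = ["conv.weight", "conv.bias"]
values = [[1.0], [0.5]]
order_based = list(zip(names, values))

# Name-based: each initializer carries its name, so passes that mutate the
# parameter list cannot mis-pair values.
name_based = dict(zip(names, values))
dropped = {k: v for k, v in name_based.items() if k != "conv.weight"}
```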

Differential Revision: D14361993

Pulled By: houseroad

fbshipit-source-id: da93e945d55755c126de06641f35df87d1648cc4

Lara Haidar-Ahmad [Thu, 7 Mar 2019 17:59:28 +0000 (09:59 -0800)]
ONNX Export for Max and Average Pooling in CEIL_MODE

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16769
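The effect of CEIL_MODE on the exported output shape follows the standard pooling output-size formula; a plain-Python sketch (illustrative, not the exporter code):

```python
import math

# Pooling output length: ((size + 2*pad - kernel) / stride) + 1, with the
# division floored normally and ceiled when ceil_mode is set, so ceil mode
# keeps a partial window at the end instead of dropping it.
def pool_output_size(size, kernel, stride, pad=0, ceil_mode=False):
    frac = (size + 2 * pad - kernel) / stride
    return (math.ceil(frac) if ceil_mode else math.floor(frac)) + 1

floor_len = pool_output_size(7, kernel=2, stride=2)
ceil_len = pool_output_size(7, kernel=2, stride=2, ceil_mode=True)
```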

Differential Revision: D14362175

Pulled By: houseroad

fbshipit-source-id: 65cfb1dfba6a43d39cc85374add368fe8e4e5645

Elias Ellison [Thu, 7 Mar 2019 17:12:35 +0000 (09:12 -0800)]
use flake8-mypy (#17721)

Summary:
Use flake8 installed with mypy checks so that our linter matches fbcode. Mypy type errors also provide a valuable signal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17721

Differential Revision: D14357778

Pulled By: eellison

fbshipit-source-id: d8c9ea3fe3b5f550c3b70fe259e0eabf95e4c92d

Jongsoo Park [Thu, 7 Mar 2019 10:17:42 +0000 (02:17 -0800)]
use fp16<->fp32 intrinsic (#17496)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17496

As title.

Reviewed By: hyuen

Differential Revision: D14222907

fbshipit-source-id: d5d6c032e725ca8b52aca2be7401ec3c59f6a242

Ahmed Aly [Thu, 7 Mar 2019 09:03:51 +0000 (01:03 -0800)]
Implement a Caffe2 standalone LSTM operator (#17726)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17726

Pull Request resolved: https://github.com/pytorch/pytorch/pull/17725

Pull Request resolved: https://github.com/pytorch/pytorch/pull/17461

Implementing a standalone LSTM operator in Caffe2, adopted from the ATen implementation in diffusion/FBS/browse/master/fbcode/caffe2/aten/src/ATen/native/RNN.cpp. The trickiest part of this exercise was that caffe2::Tensor has no copy constructor, which made it necessary to implement a custom templated copy constructor for the different tensor containers used in the code. There was also no easy way to use off-the-shelf C2 operators in my code, so I had to copy some utility code for basic matmul, cat, split, transpose, and linear operations.

Two things missing:

- Profiling this implementation against the current ONNXified LSTM op
- Make this operator available to use in PyTorch
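The recurrence such an LSTM operator implements can be written out in scalar form; a plain-Python illustration of the cell math (hypothetical scalar weights, not the Caffe2 operator):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One LSTM cell step: input/forget/output gates plus candidate cell state.
def lstm_cell(x, h, c, w):
    # w holds per-gate scalar weights for the input (w*) and hidden (u*) paths
    i = sigmoid(w["wi"] * x + w["ui"] * h)   # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h)   # forget gate
    g = math.tanh(w["wg"] * x + w["ug"] * h) # candidate cell state
    o = sigmoid(w["wo"] * x + w["uo"] * h)   # output gate
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

w = {k: 0.5 for k in ("wi", "ui", "wf", "uf", "wg", "ug", "wo", "uo")}
h, c = lstm_cell(1.0, 0.0, 0.0, w)
```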

Reviewed By: dzhulgakov

Differential Revision: D14351575

fbshipit-source-id: 3b99b53212cf593c7a49e45580b5a07b90809e64

Sebastian Messmer [Thu, 7 Mar 2019 07:50:14 +0000 (23:50 -0800)]
caffe2:libtorch_cuda depends on caffe2:caffe2_gpu (#17729)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17729

When doing "import torch" in fbcode, previously the caffe2 cuda kernels weren't loaded because libcaffe2_gpu.so wasn't loaded.
Once you also did "from caffe2.python import workspace", then the cuda kernels were loaded because that triggered a runtime mechanism for loading libcaffe2_gpu.so.

We want the cuda kernels to always be available, so this diff adds a dependency from caffe2:libtorch_cuda to caffe2:caffe2_gpu.

Reviewed By: ezyang

Differential Revision: D14353498

fbshipit-source-id: 76a9fe69f231b308ab40eac393bb216c6fad3658

Jongsoo Park [Thu, 7 Mar 2019 07:26:27 +0000 (23:26 -0800)]
add tensor and cost inference functions (#17684)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17684

Adding tensor and cost inference functions to more int8 operators.

Reviewed By: yinghai

Differential Revision: D14174746

fbshipit-source-id: dfad975fa75899565c8fb61f1b7747a9206ebd22

Lara Haidar [Thu, 7 Mar 2019 06:35:12 +0000 (22:35 -0800)]
ONNX Export Narrow op

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17550

Differential Revision: D14350401

Pulled By: houseroad

fbshipit-source-id: 4d88079bb7a8bbd270b0272009826eb3b202cc33

Yinghai Lu [Thu, 7 Mar 2019 03:55:39 +0000 (19:55 -0800)]
Keep the dim_type of hinted shape as BATCH if possible (#17734)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17734

If an input is not BATCH, we skip adjusting its batch size during the onnxifi transformation. So when we take hints, we take them as CONSTANT but later need to change the dim_type back to BATCH if possible.

Reviewed By: jackm321

Differential Revision: D14355983

fbshipit-source-id: 63eb54a44afb1565c71486fdd73db07ca0ac4fd4

jwu [Thu, 7 Mar 2019 03:37:03 +0000 (19:37 -0800)]
fix different round behavior on CPU and GPU #16498 (#17443)

Summary:
xxtemp, colesbury, bhushan23, zou3519: this converts the GPU round behavior to half-to-even, consistent with the torch CPU version and numpy. Your feedback is welcome.
See #16498
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17443
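Half-to-even ("banker's") rounding is the same rule Python's built-in round uses, which makes the intended behavior easy to check:

```python
# With half-to-even rounding, exact halves go to the nearest even integer,
# which is what the GPU kernel now matches (and what numpy.round does).
halves = [0.5, 1.5, 2.5, 3.5]
rounded = [round(v) for v in halves]
```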

Differential Revision: D14261786

Pulled By: VitalyFedyunin

fbshipit-source-id: 98156436b545d72769831a89e2775d43ad913ebc

zou3519 [Thu, 7 Mar 2019 01:37:13 +0000 (17:37 -0800)]
Warn about memory overlaps on expanded tensors (#17576)

Summary:
Eventually we should remove these when we're certain that all our ops
handle memory overlaps correctly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17576
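The hazard being warned about comes from expanded views aliasing one underlying buffer; a shared Python list makes the aliasing concrete (an analogy, not torch code):

```python
# An "expanded" tensor repeats the same underlying storage, so an in-place
# write through the view hits every logical element that aliases it. Here a
# single shared list stands in for the shared storage.
row = [0.0, 0.0]
expanded = [row, row, row]  # 3 "rows" aliasing one buffer
expanded[0][0] = 1.0        # write through one row...
```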

Differential Revision: D14349990

Pulled By: zou3519

fbshipit-source-id: c3a09f6113b9b1bf93e7f13c0b426c45b2cdf21f

Tongzhou Wang [Wed, 6 Mar 2019 23:35:25 +0000 (15:35 -0800)]
fix exp fam. formula

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17719

Differential Revision: D14349029

Pulled By: soumith

fbshipit-source-id: cf016756a9319436f7379e8377f8bd1e1b672b40

Sebastian Messmer [Wed, 6 Mar 2019 23:08:44 +0000 (15:08 -0800)]
refactor caffe2 operator constructors - 10/9 (#17659)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17659

clangr codemod

Reviewed By: ezyang

Differential Revision: D14304675

fbshipit-source-id: 45fbd84c50651a70ae29bf46df3322715e99d225

Lu Fang [Wed, 6 Mar 2019 22:59:16 +0000 (14:59 -0800)]
Improve ONNX symbolic for logsoftmax and softmax (#17672)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17672

support dtype in the onnx symbolic

Reviewed By: zrphercule

Differential Revision: D14313987

fbshipit-source-id: e9364621b3f795191d880599711dfbcb220d0e31

peter [Wed, 6 Mar 2019 22:40:05 +0000 (14:40 -0800)]
Enable using CMD when building cpp extensions on Windows

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706

Differential Revision: D14346482

Pulled By: ezyang

fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9

Yinghai Lu [Wed, 6 Mar 2019 22:24:02 +0000 (14:24 -0800)]
Do not rename net boundary inputs/outputs during ssaRewrite. (#17545)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17545

This diff avoids renaming the boundary inputs of the net during the onnxifi transform.
It also removes adding mappings for the initializers during onnxifi op creation,
and thus gets rid of the mapped ws creation during onnxifi op creation.

Reviewed By: zrphercule

Differential Revision: D14243161

fbshipit-source-id: 6eafa920c45f6a6bfacbbb443e8e84cf9778644c

Sebastian Messmer [Wed, 6 Mar 2019 21:47:27 +0000 (13:47 -0800)]
Reapply D14078519 (#17596)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17596

Was reverted before, now fixed version.

Reviewed By: ezyang

Differential Revision: D14270288

fbshipit-source-id: c72490b5d02cc6098cb60145fa9a842b3c9a24c5

eellison [Wed, 6 Mar 2019 21:41:13 +0000 (13:41 -0800)]
Batch of expect file removals (#17581)

Summary:
Another batch of removing expect files.

One note - I removed the Batched expect files without adding equivalent tests since they are already being tested in other ways, and we are no longer actively maintaining that project.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17581

Differential Revision: D14343578

Pulled By: eellison

fbshipit-source-id: ce0b1fd2b5b4ec80ad9003bab1b58f41645d3da6

jiej [Wed, 6 Mar 2019 21:36:14 +0000 (13:36 -0800)]
(#14267)

Summary:
- Summary:

Added synchronized batch normalization, which allows synchronization of stats across mini-batches between processes within a process group.
The current implementation uses a mixture of extended ATen native functions (cpp cuda extension) and torch.nn.modules (c10d python API)

- User-facing api:

1. torch.nn.utils.convert_sync_batchnorm(modules, process_group=None)

2. torch.nn.SyncBatchNorm(num_features, eps=1e-5, momentum=0.1, affine=True, track_running_stats=True, ***process_group=None***)

- supported use case:
DistributedDataParallel with ***single-gpu multi-process***

a. User creates model containing `torch.nn.SyncBatchNorm` layers through one of the ways listed below:

  1. use layers directly:

     torch.nn.SyncBatchNorm(...)

     similar API as with torch.nn.BatchNormXd(...)
     with added argument `process_group` which is used to limit the scope of
     synchronization within each process group. Default value is None, which
     implies synchronization across all GPUs

  2. use torch.nn.utils.convert_sync_batchnorm(modules, process_group)

     recursively convert all `torch.nn.BatchNormXd` into `torch.nn.SyncBatchNorm`
     preserving values of parameters/buffers.
     the utility function also allows user to specify process_group value to all
     converted layers.

b. user wraps their model with
   `torch.nn.parallel.DistributedDataParallel`; from this point, the user
   should follow the general guidelines in the DDP usage guide

- Error checking

For use cases not supported, we error out:

1. Application launched without ddp:
   > import torch
   > sbn = torch.nn.SyncBatchNorm(10).cuda()
   > inp = torch.randn(5, 10, 3, 3).cuda()
   > sbn(inp) --> Error!
   > AttributeError: SyncBatchNorm is only supported within torch.nn.parallel.DistributedDataParallel

2. Application launched using DDP with multi-GPU per-process:
   > ddp_module = nn.parallel.DistributedDataParallel(module, device_ids=device_ids, output_device=args.local_rank)
   > ValueError: SyncBatchNorm is only supported for DDP with single GPU per process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14267
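The stats synchronization at the heart of SyncBatchNorm reduces per-process sufficient statistics into a global mean/variance; a plain-Python sketch of that reduction (illustrative, not the c10d implementation):

```python
# Each process contributes (count, sum, sum of squares) for its local batch;
# all-reducing those three numbers yields the global mean and (biased)
# variance every process normalizes with.
def combine_stats(shards):
    n = sum(c for c, _, _ in shards)
    s = sum(sm for _, sm, _ in shards)
    ss = sum(sq for _, _, sq in shards)
    mean = s / n
    var = ss / n - mean * mean  # biased variance, as used for normalization
    return mean, var

# two "processes", each with its own local mini-batch
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0]
shards = [(len(x), sum(x), sum(v * v for v in x)) for x in (a, b)]
mean, var = combine_stats(shards)
```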

Differential Revision: D14270035

Pulled By: ezyang

fbshipit-source-id: 4956d8fa565c32e9df5408d53719ff9f945f4d6d

Tongzhou Wang [Wed, 6 Mar 2019 21:06:41 +0000 (13:06 -0800)]
Update ModuleDict doc about order

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17717

Differential Revision: D14346557

Pulled By: ezyang

fbshipit-source-id: 2484c7d8105f9aa8bce5567d1fa2d4f587cc9cc2

Pieter Noordhuis [Wed, 6 Mar 2019 20:30:05 +0000 (12:30 -0800)]
Update CODEOWNERS (#17720)

Summary:
teng-li is passing the baton to mrshenli. Thanks for all your work on distributed teng-li!! :tada:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17720

Differential Revision: D14350120

Pulled By: pietern

fbshipit-source-id: edfe784520c54630203cc8fbb296455d3dbf341b

5 years agoONNX Export Argmin and Argmax ops
Lara Haidar-Ahmad [Wed, 6 Mar 2019 20:05:48 +0000 (12:05 -0800)]
ONNX Export Argmin and Argmax ops

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17382

Differential Revision: D14338811

Pulled By: houseroad

fbshipit-source-id: be07548d8063d1aa94f1801c18137738365b85fb

5 years agoTurn atol to 1e-5 when comparing the end to end results (#17708)
Lu Fang [Wed, 6 Mar 2019 20:02:34 +0000 (12:02 -0800)]
Turn atol to 1e-5 when comparing the end to end results (#17708)

Summary:
results smaller than 1e-5 don't make sense.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17708

Differential Revision: D14348893

Pulled By: houseroad

fbshipit-source-id: 5e07c38e5b58b27b61fae63bfc3c21e2fe5629fe

5 years agoremove loop expects (#17695)
Elias Ellison [Wed, 6 Mar 2019 19:42:19 +0000 (11:42 -0800)]
remove loop expects (#17695)

Summary:
Replace loop unrolling expect files with assertions on the output IR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17695

Differential Revision: D14347105

Pulled By: eellison

fbshipit-source-id: 1703b4ca32bc1c67c01fc4330b0e6eb66feaa103

5 years agotypo fix
youkaichao [Wed, 6 Mar 2019 19:31:50 +0000 (11:31 -0800)]
typo fix

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17653

Differential Revision: D14302003

Pulled By: ezyang

fbshipit-source-id: 8ad90985a392b07127c7e315d4e74ce77962b573

5 years agoomit group conv NHWC test for GPU (#17715)
Deepali Chourasia [Wed, 6 Mar 2019 19:25:26 +0000 (11:25 -0800)]
omit group conv NHWC test for GPU (#17715)

Summary:
Observed the test `TestGroupConvolution.test_group_convolution` to fail with the following error:

```
Falsifying example: test_group_convolution(self=<caffe2.python.operator_test.group_conv_test.TestGroupConvolution testMethod=test_group_convolution>, stride=3, pad=0, kernel=5, size=8, group=4, input_channels_per_group=7, output_channels_per_group=8, batch_size=2, order='NHWC', engine='', use_bias=False, gc=, dc=[, device_type: 1])

You can reproduce this example by temporarily adding reproduce_failure('3.59.1', b'AAAA') as a decorator on your test case
```
This example, generated by hypothesis, has `group=2, order='NHWC' and dc=[, device_type: 1])`.
I think this example should be skipped.

I have mimicked the change corresponding to [PR#13554](https://github.com/pytorch/pytorch/pull/13554) to skip this example.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17715

Differential Revision: D14346642

Pulled By: ezyang

fbshipit-source-id: b1f1fef09f625fdb43d31c7213854e61a96381ba

5 years agofix tuple matching (#17687)
Elias Ellison [Wed, 6 Mar 2019 19:21:09 +0000 (11:21 -0800)]
fix tuple matching (#17687)

Summary:
Check for Tuple Matching in isSubvalueOf, since they may contain container types that need to be recursed within isSubvalueOf

Fix for https://github.com/pytorch/pytorch/issues/17650
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17687

Differential Revision: D14324642

Pulled By: eellison

fbshipit-source-id: 7f1e019875286b2640a3b9c003d1635dda8cf543

5 years agoTemporarily disable Upsample operator tests in pytorch-onnx tests (#17696)
Spandan Tiwari [Wed, 6 Mar 2019 19:03:32 +0000 (11:03 -0800)]
Temporarily disable Upsample operator tests in pytorch-onnx tests (#17696)

Summary:
As discussed with houseroad: the Upsample op is being updated in ONNX (https://github.com/onnx/onnx/pull/1773) and these tests are blocking it. These tests will be updated once the ONNX PR goes in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17696

Differential Revision: D14338845

Pulled By: houseroad

fbshipit-source-id: cfaf8cf1ab578ae69dd3bf21b1c0681b572b9b6f

5 years agoAdd check for x64 Python before setup (#17707)
peter [Wed, 6 Mar 2019 18:41:20 +0000 (10:41 -0800)]
Add check for x64 Python before setup (#17707)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/17657.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17707

Differential Revision: D14346705

Pulled By: ezyang

fbshipit-source-id: 5daafacdb99eb9a9c6517263d10f20c79f920d24

5 years agoReplace caffe2::DeviceGuard with c10::cuda::CUDAGuard (#17623)
Edward Yang [Wed, 6 Mar 2019 18:32:38 +0000 (10:32 -0800)]
Replace caffe2::DeviceGuard with c10::cuda::CUDAGuard (#17623)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17623

Despite its generic-sounding name, caffe2::DeviceGuard actually
only worked on CUDA devices.  Rename it to something that more
clearly spells out its applicability.

I'm not sure if it's the right call, but in this patch I added
'using CUDAGuard = c10::cuda::CUDAGuard', as this seems to be more
in-line with how the Caffe2 codebase is currently written.  More
idiomatic c10 namespace style would be to say cuda::CUDAGuard.
Willing to change this if people shout.

This is a respin of D13156470 (#14284)

Reviewed By: dzhulgakov

Differential Revision: D14285504

fbshipit-source-id: 93b8ab938b064572b3b010c307e1261fde0fff3d

5 years agoRemove nomscheduler (#17693)
Duc Ngo [Wed, 6 Mar 2019 18:31:00 +0000 (10:31 -0800)]
Remove nomscheduler (#17693)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17693

Remove nomscheduler tool

Reviewed By: yinghai

Differential Revision: D14328168

fbshipit-source-id: 674d0e18596a4dc2bbb6b8d321f4066c4fc454ab

5 years agoindex operation support for torch.HalfTensor (#17645)
bhushan [Wed, 6 Mar 2019 18:28:49 +0000 (10:28 -0800)]
index operation support for torch.HalfTensor (#17645)

Summary:
- Test cases added
1. indexing for half tensor
2. setting for half tensor

fixes #17161
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17645

Differential Revision: D14302069

Pulled By: ezyang

fbshipit-source-id: 100f141c07046f200c904e27c5882a9417bccda0

5 years agoRevert D14160172: Implement a Caffe2 standalone LSTM operator
Soumith Chintala [Wed, 6 Mar 2019 16:41:42 +0000 (08:41 -0800)]
Revert D14160172: Implement a Caffe2 standalone LSTM operator

Differential Revision:
D14160172

Original commit changeset: c33e3f9e8aea

fbshipit-source-id: cffe35d93f0ac75ca93aa98a3b82af3d372f2fc1

5 years agofix typo in hub doc
Tongzhou Wang [Wed, 6 Mar 2019 07:14:25 +0000 (23:14 -0800)]
fix typo in hub doc

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17705

Differential Revision: D14338380

Pulled By: ailzhang

fbshipit-source-id: d53eece30bede88a642e718ee6f829ba29c7d1c4

5 years agofix dropout AD & rename range to rangelist (#17691)
Ailing Zhang [Wed, 6 Mar 2019 04:47:02 +0000 (20:47 -0800)]
fix dropout AD & rename range to rangelist (#17691)

Summary:
fixes #17669
Address apaszke 's comments in #17523
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17691

Differential Revision: D14328083

Pulled By: ailzhang

fbshipit-source-id: 9ec4a54f13bfd1aaf4b1821dd00c31793ac07a44

5 years agoenable use of MIOpen for depthwise convolutions (#17685)
Chaitanya Sri Krishna Lolla [Wed, 6 Mar 2019 02:41:20 +0000 (18:41 -0800)]
enable use of MIOpen for depthwise convolutions (#17685)

Summary:
* added miopen conv mode to be used for setConvDescriptor
* added miopen depthwise convolutions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17685

Differential Revision: D14327811

Pulled By: bddppq

fbshipit-source-id: d5bdc1abafd5f39694fadf3f9275b9d880c5b115

5 years agoImplement a Caffe2 standalone LSTM operator (#17461)
Ahmed Aly [Wed, 6 Mar 2019 01:31:51 +0000 (17:31 -0800)]
Implement a Caffe2 standalone LSTM operator (#17461)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17461

Implementing a standalone LSTM operator in Caffe2, adapted from this Aten implementation: diffusion/FBS/browse/master/fbcode/caffe2/aten/src/ATen/native/RNN.cpp. The trickiest part of this exercise was that caffe2::Tensor has no copy constructor, which made it necessary to implement a custom templated copy constructor for the different Tensor containers used in the code. Also, there was no easy way to use off-the-shelf C2 operators in my code, so I had to copy some code implementing basic matmul, cat, split, transpose and linear as utility functions.

Two things missing:

- Profiling this implementation against the current ONNXified LSTM op
- Make this operator available to use in PyTorch

Reviewed By: dzhulgakov

Differential Revision: D14160172

fbshipit-source-id: c33e3f9e8aeae578b64d97593cb031a251216029

5 years agoFix nll_loss crash on cpu where ignore_index is out of bounds (#17328)
Soumith Chintala [Tue, 5 Mar 2019 22:26:20 +0000 (14:26 -0800)]
Fix nll_loss crash on cpu where ignore_index is out of bounds (#17328)

Summary:
Fixes https://github.com/pytorch/pytorch/issues/15508
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17328

Differential Revision: D14322629

Pulled By: soumith

fbshipit-source-id: 7d02f372be78794782c18affcfc109ce30b1e91c

5 years agoAdd '--hip-clang-launch' to favor <<<>>>-based launch. (#17686)
Johannes M Dieterich [Tue, 5 Mar 2019 20:49:25 +0000 (12:49 -0800)]
Add '--hip-clang-launch' to favor <<<>>>-based launch. (#17686)

Summary:
hip-clang uses triple chevron kernel dispatch syntax. Add an option to the hipification script to skip translating triple chevron to hipLaunchKernelGGL.

Once we switch to hip-clang, this option will be the default and subsequently removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17686

Differential Revision: D14327810

Pulled By: bddppq

fbshipit-source-id: 5e1512325077dd3ebb8fb9b5bf35fd1f8d9a4dc3

5 years agoImprove caching allocator for Pascal and newer GPUs. (#17120)
Sam Gross [Tue, 5 Mar 2019 17:38:23 +0000 (09:38 -0800)]
Improve caching allocator for Pascal and newer GPUs. (#17120)

Summary:
```
NVIDIA changed the CUDA allocation behavior on Pascal GPUs. The
page size increased from 1MB to 2MB and allocations larger than 1MB
are now always page-aligned. Previously, allocations larger than 1MB
were aligned to 128KB boundaries.

This interacted poorly with the caching allocator. The remaining
memory in a page could only be filled by small cudaMalloc calls, but
the caching allocator never cudaMalloc's a chunk smaller than 1MB.
This behavior could also cause a large discrepancy between the memory
usage reported by nvidia-smi and the memory usage reported by
PyTorch, because nvidia-smi counts a partially used page as "full",
while PyTorch only counts the actual memory requested.

This PR makes a few changes to the caching allocator to better support
Pascal and Volta GPUs:

 - All cudaMalloc calls are now multiples of 2MB (the page size)
 - Requests between 1-10MB allocate (and split) a 20MB block to
   reduce wasted space due to rounding
 - Small requests are now packed into 2MB blocks (instead of 1MB)

This improves Mask R-CNN memory usage by 10-20% in internal tests on
Volta GPUs. Maxwell performance seems to be largely unchanged, but
it's possible that some use cases suffer slightly.
```
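
The rounding rules in the list above amount to simple arithmetic. Below is a sketch of the policy as described, not the actual caching allocator code:

```python
MB = 1024 * 1024

def cuda_malloc_size(request_bytes):
    """Sketch of the block-size policy described above (not the real
    allocator): every cudaMalloc is a multiple of the 2MB page size,
    and 1-10MB requests grab a 20MB block to reduce rounding waste."""
    if request_bytes <= 1 * MB:
        # Small requests are packed into 2MB blocks.
        return 2 * MB
    if request_bytes <= 10 * MB:
        # Mid-size requests allocate (and split) a 20MB block.
        return 20 * MB
    # Everything else rounds up to the next 2MB page boundary.
    return ((request_bytes + 2 * MB - 1) // (2 * MB)) * (2 * MB)
```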
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17120

Differential Revision: D14301536

Pulled By: colesbury

fbshipit-source-id: a8282315ea8f7b8ca149b5066fdeaecd0d404edf

5 years agoTurn the Half::from_bits into a constexpr function to avoid unresolve… (#17661)
Davide Libenzi [Tue, 5 Mar 2019 15:24:27 +0000 (07:24 -0800)]
Turn the Half::from_bits into a constexpr function to avoid unresolve… (#17661)

Summary:
…d symbol errors when building in DEBUG mode.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17661

Differential Revision: D14319610

Pulled By: soumith

fbshipit-source-id: 6c508a37155e29260f403d7174f343aa1ff32385

5 years agoRemove Expect Files from python / tracing / script interop
Elias Ellison [Tue, 5 Mar 2019 06:38:41 +0000 (22:38 -0800)]
Remove Expect Files from python / tracing / script interop

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17622

Differential Revision: D14308307

Pulled By: eellison

fbshipit-source-id: bda249d38ac2570000a12b0ca328c26233ecefe8

5 years agoEnable apex on Windows
peterjc123 [Tue, 5 Mar 2019 05:50:53 +0000 (21:50 -0800)]
Enable apex on Windows

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17675

Differential Revision: D14320473

Pulled By: soumith

fbshipit-source-id: cb696984f5196f9b8b50722b4fe927bb6407c322

5 years agobump docker build to upgrade magma to 2.5.0 (#17674)
Soumith Chintala [Tue, 5 Mar 2019 04:28:06 +0000 (20:28 -0800)]
bump docker build to upgrade magma to 2.5.0 (#17674)

Summary:
upgrades magma in docker build.

vishwakftw
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17674

Differential Revision: D14320187

Pulled By: soumith

fbshipit-source-id: 7887f65fb703b802fc6231408b55ad9c4039882b

5 years agorefactor caffe2 operator constructors - 1/9 (#17082)
Sebastian Messmer [Mon, 4 Mar 2019 23:56:21 +0000 (15:56 -0800)]
refactor caffe2 operator constructors - 1/9 (#17082)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17082

clangr codemod

Reviewed By: ezyang

Differential Revision: D14078498

fbshipit-source-id: f7f65d6d81c7942293f53fdaa61f756d8b7360c1

5 years agoExpose cuda kernel for caffe2::GenerateProposals
Sebastian Messmer [Mon, 4 Mar 2019 22:53:55 +0000 (14:53 -0800)]
Expose cuda kernel for caffe2::GenerateProposals

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17066

Reviewed By: ezyang, wat3rBro

Differential Revision: D14071130

fbshipit-source-id: 6fe26503f6069c36ec31d6c09b549b932d5db242

5 years agoprint warnings when DNNLOWP_16 or DNNLOWP_ROWWISE_16 engine is used (#17176)
Jongsoo Park [Mon, 4 Mar 2019 22:25:19 +0000 (14:25 -0800)]
print warnings when DNNLOWP_16 or DNNLOWP_ROWWISE_16 engine is used (#17176)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17176

As title

Reviewed By: csummersea

Differential Revision: D14111616

fbshipit-source-id: 1282cb2452c4ad385fd2dc6d3f8c19e9fec715ff

5 years agoFix XOutput/XOutputTensor for ivalue based c2 operators (#17599)
Sebastian Messmer [Mon, 4 Mar 2019 22:17:11 +0000 (14:17 -0800)]
Fix XOutput/XOutputTensor for ivalue based c2 operators (#17599)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17599

XOutput/XOutputTensor was broken for ivalue based operators. This diff fixes that.

Reviewed By: ezyang

Differential Revision: D14274003

fbshipit-source-id: b99f020244c66c4e2551dbd32ae0f665cc91b338

5 years agoFix InputSize/OutputSize for ivalue based operators (#17579)
Sebastian Messmer [Mon, 4 Mar 2019 22:17:11 +0000 (14:17 -0800)]
Fix InputSize/OutputSize for ivalue based operators (#17579)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17579

These methods previously just returned 0 when it was not a legacy operator,
making it impossible to convert some operators.

Reviewed By: dzhulgakov

Differential Revision: D14253094

fbshipit-source-id: 72bfdcf6da291a4ab80d1e0ceb20984b86edc408

5 years agoFix clamp fusion on missing limits (#17533)
Wanchao Liang [Mon, 4 Mar 2019 21:04:53 +0000 (13:04 -0800)]
Fix clamp fusion on missing limits (#17533)

Summary:
Fixes #17449

Context: before #17186, we did not fuse `clamp` when the `min`/`max` inputs were missing, because they were `prim::None` nodes. After #17186, None became a `prim::Constant` node, which enables fusion for `clamp`. But codegen.cpp did not handle the case where a `prim::Constant` is not a Double/Int/Bool. This PR makes missing inputs handled correctly, in the following way:

1. emit nothing when you see `type? = prim::Constant()`
2. when emitting the RHS (emitRHS), special-case aten::clamp
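
The special-casing for aten::clamp boils down to skipping whichever bound is missing. In plain Python, a sketch of the semantics (not the codegen itself):

```python
def clamp(x, min=None, max=None):
    # Mirror of clamp semantics with optional bounds: a None limit
    # emits nothing, i.e. that side is left unclamped.
    if min is not None and x < min:
        x = min
    if max is not None and x > max:
        x = max
    return x
```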
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17533

Differential Revision: D14238450

Pulled By: wanchaol

fbshipit-source-id: 61a272154754b13e89021bb86002927f02cde19c

5 years agoint32 indexing for Tensor Iterator Reduction (#17428)
Jie [Mon, 4 Mar 2019 21:02:40 +0000 (13:02 -0800)]
int32 indexing for Tensor Iterator Reduction (#17428)

Summary:
1. Enabling int32 indexing for cases where TI cannot accumulate in output due to
incompatible data types (e.g. Welford).
2. Updating Welford kernel to use int32 instead of int64 indexing on GPU.

This change improves performance for torch.var / torch.std

Implementation:
1. Allocated extra buffer to handle accumulation between sub Tensor Iterators.
2. Removed int64 indexing in gpu_reduce_kernel
3. WelfordOps now supports the index type / combination type as a template parameter.
While the GPU uses int32_t and float, the CPU implementation uses int64_t and double.
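
Welford's accumulation, which this change re-indexes with int32 on GPU, can be sketched in plain Python (the actual kernel templates the index/combine types as described above):

```python
def welford(xs):
    """One-pass mean/variance via Welford's algorithm; the CUDA kernel
    performs the same accumulation with int32_t/float, versus the
    int64_t/double used on CPU."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    # Population variance; divide by (count - 1) for the unbiased version.
    return mean, m2 / count
```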
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17428

Differential Revision: D14264608

Pulled By: umanwizard

fbshipit-source-id: 3eb54451de925b469dbc1127e5ea7443c4431036

5 years agoRemoved all usages of TH_Index_Base (#17591)
Iurii Zdebskyi [Mon, 4 Mar 2019 20:43:28 +0000 (12:43 -0800)]
Removed all usages of TH_Index_Base (#17591)

Summary:
TH_Index_Base is hard coded to 0 and can be removed from the code base.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17591

Differential Revision: D14269273

Pulled By: izdeby

fbshipit-source-id: d844e261f4af7297bad8a81e7d6dcf0a391b94e6

5 years agoPyTorch/Caffe2 tensor interop in Python (#17190)
Dmytro Dzhulgakov [Mon, 4 Mar 2019 19:30:43 +0000 (11:30 -0800)]
PyTorch/Caffe2 tensor interop in Python (#17190)

Summary:
Because of two separate python extensions with different pybind
instances, I have to go through void* conversion. Since it's hidden from
the user, it's fine.

New APIs added on C2 side:
- workspace.FetchTorch('blob')
- workspace.Workspace.current.blobs['blob'].to_torch()
- workspace.FeedBlob('blob', pytorch_tensor)

Works on CPU and GPU.

The only glitches are with resizing because of variable/tensor split.
But data sharing works properly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17190

Reviewed By: ezyang

Differential Revision: D14163882

Pulled By: dzhulgakov

fbshipit-source-id: d18e5b8fcae026f393c842a1149e972515732de2

5 years agoFixed typo in aten/src/ATen/native_parse.py (#17641)
wkcn [Mon, 4 Mar 2019 18:08:04 +0000 (10:08 -0800)]
Fixed typo in aten/src/ATen/native_parse.py (#17641)

Summary:
Hi, there.
There is a typo in aten/src/ATen/native_parse.py, and I fixed it.
`std::aray` -> `std::array`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17641

Differential Revision: D14301981

Pulled By: ezyang

fbshipit-source-id: a37859cdedcbf6c29333b954486dfa086d6c2176

5 years agoRemove GPU dependency from ProfileObserver (#17592)
Martin Schatz [Mon, 4 Mar 2019 17:55:05 +0000 (09:55 -0800)]
Remove GPU dependency from ProfileObserver (#17592)

Summary:
Remove GPU dependency and register ProfileObserver.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17592

Reviewed By: ezyang

Differential Revision: D14265801

Pulled By: mdschatz

fbshipit-source-id: f98c0c32653c64a8b087c58ece4f864dfbe1d4b8

5 years agoDon't make factory methods create a tensor and then immediately copy it (#17565)
Brennan Vincent [Mon, 4 Mar 2019 06:13:27 +0000 (22:13 -0800)]
Don't make factory methods create a tensor and then immediately copy it (#17565)

Summary:
Create a `make_variable` override that moves out of a tensor instead of going through `shallow_copy_and_detach`. Call this override from factory methods like `empty` that create a brand new tensor, do nothing with it, and then copy it into a variable.

Will update this with actual numbers, but it seems to get rid of around 20-40% of the overhead of calling `torch.empty(0)`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17565

Differential Revision: D14266130

Pulled By: umanwizard

fbshipit-source-id: f57d5f2ca3f80ee8ee96d50f905e852fd10db941

5 years agoFixed typo in torch/functional.py w/r/t broadcast_tensors (#17642)
Jack Richter-Powell [Sun, 3 Mar 2019 18:05:36 +0000 (10:05 -0800)]
Fixed typo in torch/functional.py w/r/t broadcast_tensors (#17642)

Summary:
In reference to #17574
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17642

Differential Revision: D14297177

Pulled By: ezyang

fbshipit-source-id: 968176ea3b46a0153da0fd9e6b40db314d29e51c

5 years agoChange fake tqdm constructor to match real tqdm (#17636)
Bryan He [Sun, 3 Mar 2019 09:01:26 +0000 (01:01 -0800)]
Change fake tqdm constructor to match real tqdm (#17636)

Summary:
Currently, the fake tqdm implementation requires an input (whereas real tqdm does not).

This caused a problem in torchvision (https://github.com/pytorch/vision/pull/770), and seems likely to cause minor irritations elsewhere.
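
Matching real tqdm means the constructor's iterable must be optional. A minimal sketch of such a no-op stand-in (hypothetical code for illustration, not the actual file changed here):

```python
class FakeTqdm:
    """No-op progress bar whose constructor, like real tqdm's, makes
    the iterable optional instead of requiring it."""
    def __init__(self, iterable=None, **kwargs):
        self.iterable = iterable

    def __iter__(self):
        # Iterate over the wrapped iterable, or nothing if none was given.
        return iter(self.iterable if self.iterable is not None else [])

    def update(self, n=1):
        pass  # real tqdm advances the bar; the fake does nothing
```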
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17636

Differential Revision: D14296530

Pulled By: ezyang

fbshipit-source-id: bc077d898773c93dab34c985a7b30525a43e558a