Fixes error when too many parameters are passed to fused cuda kernel (#18063)
author Roy Ju <rju@nvidia.com>
Wed, 10 Apr 2019 05:29:33 +0000 (22:29 -0700)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Wed, 10 Apr 2019 05:37:09 +0000 (22:37 -0700)
commit a9a29dd63f2f64b1f1703ba2dbe64fd18e8ee528
tree 39f993b4ccbb774bd815a1269cd0b518bd2e2262
parent 496b0b03d988ccdb242f8674f1c5e176f2bef221

Summary:
Bug fix for https://github.com/pytorch/pytorch/issues/15043, where a large fusion in the JIT produces a kernel whose number of arguments exceeds the limit allowed by nvrtc on a CUDA device.
  The fix is to check the number of arguments before a CUDA kernel is generated. If the number exceeds the limit, take the runFallBack() path instead.
  Add a reduced version of the test from the original issue to keep the test time low. The test fails without this fix.
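
  The check described above can be sketched roughly as follows. This is an illustrative Python model only, not the C++ code from this change: MAX_KERNEL_ARGS, run_fused, run_fallback, and run_fusion are hypothetical names, and the limit value is an assumption, since the commit message does not state the actual number.

```python
# Hypothetical limit on kernel arguments; the real value is not given
# in the commit message and is assumed here purely for illustration.
MAX_KERNEL_ARGS = 128


def run_fallback(inputs):
    """Stand-in for the unfused path: compute the result op by op."""
    return [x + 1 for x in inputs]


def run_fused(inputs):
    """Stand-in for compiling and launching a single fused kernel."""
    return [x + 1 for x in inputs]


def run_fusion(inputs, num_outputs, limit=MAX_KERNEL_ARGS):
    """Count kernel arguments first; fall back if the count exceeds the limit."""
    num_args = len(inputs) + num_outputs
    if num_args > limit:
        # Too many arguments for one kernel: skip code generation entirely.
        return "fallback", run_fallback(inputs)
    return "fused", run_fused(inputs)
```

  Both paths are expected to produce the same values; only the execution strategy differs, which is why taking the fallback path is a safe response to an oversized argument list.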
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18063

Differential Revision: D14691401

Pulled By: soumith

fbshipit-source-id: b98829bc89ed7724e91eda82ae3a5a1151af721a
test/test_jit.py
torch/csrc/jit/fuser/compiler.cpp
torch/csrc/jit/passes/graph_fuser.cpp
torch/csrc/jit/passes/graph_fuser.h