Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15843
RNNs and LSTMs need only one bias vector in their standard definition, but our implementation uses two (b_ih and b_hh) to stay compatible with CuDNN. This diff adds a comment explaining that.
Reviewed By: ezyang
Differential Revision: D13602365
fbshipit-source-id: eef5bd9383d9f241dc0ef0472f753b4a44cc19b5
w_ih = Parameter(torch.Tensor(gate_size, layer_input_size))
w_hh = Parameter(torch.Tensor(gate_size, hidden_size))
b_ih = Parameter(torch.Tensor(gate_size))
+ # Second bias vector included for CuDNN compatibility. Only one
+ # bias vector is needed in standard definition.
b_hh = Parameter(torch.Tensor(gate_size))
layer_params = (w_ih, w_hh, b_ih, b_hh)
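
The claim that one bias vector suffices can be checked with a small sketch. Both biases are added to the same pre-activation, so they can be folded into a single vector b = b_ih + b_hh without changing the result. The code below is plain Python, not the PyTorch implementation: the helpers `matvec`, `add`, `preactivation_two_biases`, and `preactivation_one_bias` are hypothetical names, the numeric values are made up for illustration, and the nonlinearity is omitted because it is applied identically in both cases.

```python
# Illustrative sketch only: plain-Python lists stand in for the tensors in the
# diff, and every helper name here is hypothetical (not part of PyTorch).

def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def add(*vectors):
    """Element-wise sum of any number of equal-length vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def preactivation_two_biases(w_ih, w_hh, b_ih, b_hh, x, h):
    """CuDNN-style layout: w_ih @ x + b_ih + w_hh @ h + b_hh."""
    return add(matvec(w_ih, x), b_ih, matvec(w_hh, h), b_hh)


def preactivation_one_bias(w_ih, w_hh, b, x, h):
    """Single-bias layout: w_ih @ x + w_hh @ h + b."""
    return add(matvec(w_ih, x), matvec(w_hh, h), b)


# Made-up toy values (gate_size = 2, layer_input_size = 2, hidden_size = 2).
w_ih = [[1.0, 2.0], [0.0, -1.0]]
w_hh = [[0.5, 0.0], [1.0, 1.0]]
b_ih = [0.25, -0.5]
b_hh = [0.75, 1.5]
x = [1.0, 2.0]
h = [2.0, -1.0]

two = preactivation_two_biases(w_ih, w_hh, b_ih, b_hh, x, h)
# Fold the two biases into one vector and recompute.
one = preactivation_one_bias(w_ih, w_hh, add(b_ih, b_hh), x, h)

print(two, one)
```

Both layouts give the same pre-activation for these values, which is why the second bias is redundant mathematically and is kept only so the parameter layout matches CuDNN's.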