zero-init param diffs and accumulate gradients
author    Jonathan L Long <jonlong@cs.berkeley.edu>
          Tue, 12 Aug 2014 04:38:59 +0000 (21:38 -0700)
committer Evan Shelhamer <shelhamer@imaginarynumber.net>
          Wed, 27 May 2015 21:07:20 +0000 (14:07 -0700)
commit    41cf06cc6e40e1b41d04b5b26e19395611bdcf5d
tree      e36850e5a940d399f32c43be0135432f5a147847
parent    b12c17104cd3446b876346ea8342ecec8551fab4
zero-init param diffs and accumulate gradients

With layers whose backward pass accumulates gradients, this effectively
decouples the computational batch from the SGD minibatch: each
iteration accumulates gradients over iter_size batches, then the
parameters are updated once.
src/caffe/proto/caffe.proto
src/caffe/solver.cpp
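
The src/caffe/solver.cpp change implements this pattern: zero the
parameter diffs at the start of each iteration, run iter_size
forward/backward passes that each accumulate into the diffs, then apply
a single parameter update. Below is a minimal standalone C++ sketch of
that pattern, not the commit's actual code; a dummy gradient stands in
for a real backward pass.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    int main() {
      const int iter_size = 4;   // computational batches per SGD step
      const float lr = 0.1f;     // learning rate
      std::vector<float> param(3, 1.0f);
      std::vector<float> diff(3, 0.0f);

      for (int iter = 0; iter < 10; ++iter) {
        // zero-init param diffs so backward passes can accumulate
        std::fill(diff.begin(), diff.end(), 0.0f);
        for (int i = 0; i < iter_size; ++i) {
          // backward for one computational batch adds into diff;
          // a dummy gradient (the param itself) stands in for backprop
          for (size_t j = 0; j < param.size(); ++j)
            diff[j] += param[j];
        }
        // one SGD update on the averaged, accumulated gradient
        for (size_t j = 0; j < param.size(); ++j)
          param[j] -= lr * diff[j] / iter_size;
      }
      std::printf("param[0] after 10 updates: %f\n", param[0]);
      return 0;
    }

With a data layer batch size of B, each update then averages gradients
over an effective minibatch of iter_size * B, while only B examples are
ever resident in memory at once.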