[addition] Bug fix for addition layer calcDerivative
author Parichay Kapoor <pk.kapoor@samsung.com>
Wed, 16 Jun 2021 03:03:56 +0000 (12:03 +0900)
committer Jijoong Moon <jijoong.moon@samsung.com>
Wed, 23 Jun 2021 07:42:19 +0000 (16:42 +0900)
Bug fix for addition layer calcDerivative().
The addition layer assigned the same derivative tensor memory back to
the layers connected to it. If those connected layers operate in-place,
one layer's in-place update corrupts the derivative seen by the others,
which can lead to wrong results.
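
The snippet below is a minimal standalone sketch of the failure mode,
not nntrainer code: Tensor, assignRef() and scaleInplace() here are
hypothetical stand-ins for reference assignment, Tensor::copy(), and an
in-place downstream layer.

    #include <cstdio>
    #include <memory>
    #include <vector>

    // Hypothetical tensor that can either alias or own its buffer.
    struct Tensor {
      std::shared_ptr<std::vector<float>> buf;

      // Reference assignment: both tensors share one buffer (old behavior).
      void assignRef(const Tensor &src) { buf = src.buf; }

      // Deep copy: this tensor gets its own buffer (fixed behavior).
      void copy(const Tensor &src) {
        buf = std::make_shared<std::vector<float>>(*src.buf);
      }

      // Models an in-place layer scaling its incoming derivative in place.
      void scaleInplace(float s) {
        for (auto &v : *buf)
          v *= s;
      }
    };

    int main() {
      Tensor upstream;
      upstream.buf = std::make_shared<std::vector<float>>(3, 1.0f);

      // Old behavior: both inputs alias the upstream derivative.
      Tensor a, b;
      a.assignRef(upstream);
      b.assignRef(upstream);
      a.scaleInplace(0.5f); // the first in-place layer runs...
      // ...and b now reads 0.5 instead of 1.0: the bug.
      std::printf("aliased: b[0] = %.1f (expected 1.0)\n", (*b.buf)[0]);

      // Fixed behavior: each input receives its own copy.
      upstream.buf = std::make_shared<std::vector<float>>(3, 1.0f);
      Tensor c, d;
      c.copy(upstream);
      d.copy(upstream);
      c.scaleInplace(0.5f);
      std::printf("copied:  d[0] = %.1f\n", (*d.buf)[0]); // still 1.0
      return 0;
    }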

Signed-off-by: Parichay Kapoor <pk.kapoor@samsung.com>
nntrainer/layers/addition_layer.cpp
nntrainer/layers/concat_layer.cpp

index cd8c5901bf224cd114a7c090461f91041992d6d8..953d6e840ba9d8024bafbc2b8be6a4b81364bec4 100644 (file)
@@ -59,7 +59,12 @@ void AdditionLayer::forwarding(bool training) {
 void AdditionLayer::calcDerivative() {
 
   for (unsigned int i = 0; i < getNumInputs(); ++i) {
-    net_input[i]->getGradientRef() = net_hidden[0]->getGradientRef();
+    /**
+     * TODO: replace this with tensor assignment during optimization.
+     * Tensor assignment needs to ensure that the previously connected
+     * layers are not in-place.
+     */
+    net_input[i]->getGradientRef().copy(net_hidden[0]->getGradientRef());
   }
 }
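
Note on the trade-off: copy() adds one buffer traversal per connected
input where the old reference assignment was free; the TODO above
records that the assignment can be restored once the memory planner can
guarantee the connected layers are not in-place.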
 
index 0ab0bf16ef9eddcf060f9fdec6a6ce83b315bdb1..1c0f2bdd32e1b81fdd52bde3d805340413599bcd 100644 (file)
@@ -99,6 +99,7 @@ void ConcatLayer::calcDerivative() {
     TensorDim in_dim = input_dim[idx];
 
     for (unsigned int b = 0; b < in_dim.batch(); ++b) {
+      // TODO: replace with tensor::copy/fill
       memcpy(
         net_input[idx]->getGradient().getAddress(b * in_dim.getFeatureLen()),
         net_hidden[0]->getGradient().getAddress(b * d.getFeatureLen() +
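
For context on the TODO above: per batch, the memcpy scatters one
input's slice of the concatenated derivative back to that input. A
minimal plain-C++ sketch of that per-batch copy follows; the function
name and parameters are illustrative, not nntrainer API.

    #include <cstring>
    #include <vector>

    // Sketch: the concatenated derivative holds all inputs' feature
    // blocks back to back within each batch; each input copies out its
    // own block at a fixed offset.
    void scatterConcatGrad(const std::vector<float> &concat_grad,
                           std::vector<float> &input_grad,
                           unsigned int batch,      // number of batches
                           unsigned int concat_len, // feature len of concat
                           unsigned int in_len,     // feature len of input
                           unsigned int offset) {   // input's offset in concat
      for (unsigned int b = 0; b < batch; ++b)
        std::memcpy(input_grad.data() + b * in_len,
                    concat_grad.data() + b * concat_len + offset,
                    in_len * sizeof(float));
    }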