Fix the issue with Bahdanau attention when normalized=True and dtype = float16/32
author Yong Tang <yong.tang.github@outlook.com>
Sun, 1 Apr 2018 02:01:47 +0000 (02:01 +0000)
committer Yong Tang <yong.tang.github@outlook.com>
Mon, 16 Apr 2018 20:04:11 +0000 (20:04 +0000)
While revisiting 18016 I noticed that Bahdanau attention has a similar
dtype-mismatch issue when normalized=True. The issue comes from:
```
     g = variable_scope.get_variable(
         "attention_g", dtype=dtype,
         initializer=math.sqrt((1. / num_units)))
```
where the plain initializer value does not work well with different dtypes.
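
For context, a minimal repro sketch of the mismatch (not part of this change),
using the public TF 1.x aliases `tf.get_variable`/`tf.float16` rather than the
internal `variable_scope` module; the exact error text may vary by version:
```
import math
import tensorflow as tf

num_units = 8  # arbitrary value, for illustration only
# The plain Python float is converted to a float32 tensor, so requesting a
# non-default dtype such as float16 raises a dtype-mismatch ValueError.
with tf.variable_scope("repro"):
  g = tf.get_variable(
      "attention_g", dtype=tf.float16,
      initializer=math.sqrt(1. / num_units))
```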

This fix changes the initializer to `init_ops.constant_initializer`
to address the issue, and adds additional test cases for it.
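
A minimal sketch of the fixed pattern, again with the public TF 1.x aliases
(`tf.constant_initializer` stands in for `init_ops.constant_initializer`); the
initializer object defers the cast to variable-creation time, so the scalar
follows whatever dtype is requested:
```
import math
import tensorflow as tf

num_units = 8  # arbitrary value, for illustration only
for dtype in (tf.float16, tf.float32, tf.float64):
  with tf.variable_scope("fixed_%s" % dtype.name):
    # constant_initializer produces the value in the variable's dtype, so an
    # explicit scalar shape plus the initializer works for all float dtypes.
    g = tf.get_variable(
        "attention_g", shape=(), dtype=dtype,
        initializer=tf.constant_initializer(math.sqrt(1. / num_units)))
```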

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
tensorflow/contrib/seq2seq/python/ops/attention_wrapper.py

index 9ba541c..867e49b 100644
@@ -472,7 +472,7 @@ def _bahdanau_score(processed_query, keys, normalize):
     # Scalar used in weight normalization
     g = variable_scope.get_variable(
         "attention_g", dtype=dtype,
-        initializer=math.sqrt((1. / num_units)))
+        initializer=init_ops.constant_initializer(math.sqrt((1. / num_units))), shape=())
     # Bias added prior to the nonlinearity
     b = variable_scope.get_variable(
         "attention_b", [num_units], dtype=dtype,