The following equivalent combination is recognized and fused into a single Gelu op:
\f[
- Gelu(x) = 0.5*x*(1 + erf((x) / sqrt(2) )
+ Gelu(x) = 0.5*x*(1.0 + erf(x / \sqrt{2}))
\f]
-Similarly, the following Gelu approximation (typical for the TensorFlow*) is recognized and fused into single Gelu op
+Similarly, the following Gelu approximation (typical for TensorFlow*) is recognized and fused into a single Gelu op:
+
\f[
- Gelu(x) \approx 0.5*x*(1 + tanh((sqrt(2/pi)) * (x + 0.044715 * x ^ 3))
+ Gelu(x) \approx 0.5x(1.0 + tanh(\sqrt{2.0/\pi} * (x + 0.044715 * x^{3})))
\f]
+
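+As an illustration only (not part of the specification), the sketch below compares the exact erf-based formula with the tanh approximation in NumPy/SciPy; the helper names are placeholders, not OpenVINO API.
+
+```python
+# Illustration only: exact erf-based Gelu vs. the tanh approximation
+# recognized by the fusion (placeholder names, not OpenVINO API).
+import numpy as np
+from scipy.special import erf
+
+def gelu_exact(x):
+    # Gelu(x) = 0.5 * x * (1.0 + erf(x / sqrt(2)))
+    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))
+
+def gelu_tanh(x):
+    # Gelu(x) ~ 0.5 * x * (1.0 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
+    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))
+
+x = np.linspace(-5.0, 5.0, 101)
+print(np.max(np.abs(gelu_exact(x) - gelu_tanh(x))))  # difference stays well below 1e-2
+```
+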
**Inputs**:
* **1**: Multidimensional input tensor. Required.
--- /dev/null
+## Mish <a name="Mish"></a>
+
+**Versioned name**: *Mish-4*
+
+**Category**: *Activation*
+
+**Short description**: Mish is a Self Regularized Non-Monotonic Neural Activation Function.
+
+**Detailed description**: Mish is a self regularized non-monotonic neural activation function proposed in the [article](https://arxiv.org/abs/1908.08681).
+
+**Attributes**: *Mish* operation has no attributes.
+
+**Inputs**:
+
+* **1**: Input tensor *x* of any floating point type *T*. Required.
+
+**Outputs**:
+
+* **1**: Floating point tensor with shape and type matching the input tensor.
+
+**Types**
+
+* *T*: any floating point type.
+
+**Mathematical Formulation**
+
+   For each element from the input tensor, *Mish* calculates the corresponding
+ element in the output tensor with the following formula:
+ \f[
+ Mish(x) = x*tanh(ln(1.0+e^{x}))
+ \f]
+
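+As an illustration only (not part of the specification), the following NumPy sketch applies the element-wise formula above; `mish` is a placeholder name, not OpenVINO API.
+
+```python
+# Illustration only: element-wise Mish as defined above.
+import numpy as np
+
+def mish(x):
+    # Mish(x) = x * tanh(ln(1.0 + e^x)); np.logaddexp(0, x) evaluates ln(1 + e^x) stably.
+    return x * np.tanh(np.logaddexp(0.0, x))
+
+print(mish(np.array([-3.0, 0.0, 3.0])))  # approx. [-0.1456, 0.0, 2.9865]
+```
+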
+**Examples**
+
+```xml
+<layer ... type="Mish">
+ <input>
+ <port id="0">
+ <dim>256</dim>
+ <dim>56</dim>
+ </port>
+ </input>
+ <output>
+ <port id="3">
+ <dim>256</dim>
+ <dim>56</dim>
+ </port>
+ </output>
+</layer>
+```
\ No newline at end of file
* [MaxPool](pooling/MaxPool_1.md)
* [Maximum](arithmetic/Maximum_1.md)
* [Minimum](arithmetic/Minimum_1.md)
+* [Mish](activation/Mish_4.md)
* [Mod](arithmetic/Mod_1.md)
* [MVN](normalization/MVN_1.md)
* [Multiply](arithmetic/Multiply_1.md)