Hyperbolic Tangent Activation Function
A Hyperbolic Tangent Activation Function is a Neuron Activation Function that is based on a Hyperbolic Tangent Function.
- AKA: TanH Activation Function.
- Context:
- It can (typically) be used in the activation of Hyperbolic Tangent Neurons (see the sketch after this list).
- Example(s): torch.nn.Tanh (see the PyTorch reference below).
- Counter-Example(s): a Sigmoid Activation Function, a ReLU Activation Function.
- See: Artificial Neural Network, Artificial Neuron, Neural Network Topology, Neural Network Layer, Neural Network Learning Rate.
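The following is a minimal illustrative sketch (not taken from any of the references below; the helper name tanh_neuron and all input values are hypothetical) of how a tanh neuron applies the hyperbolic tangent to a weighted sum of its inputs:

import math

def tanh_neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias (the pre-activation),
    # squashed into the range (-1, 1) by the hyperbolic tangent.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(z)

print(tanh_neuron([0.5, -1.2, 3.0], [0.4, 0.1, -0.2], bias=0.1))  # ≈ -0.3969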
References
2018a
- (PyTorch, 2018) ⇒ http://pytorch.org/docs/master/nn.html#tanh Retrieved: 2018-02-10.
- QUOTE:
class torch.nn.Tanh
Applies element-wise, [math]\displaystyle{ f(x)=\dfrac{\exp(x)-\exp(-x)}{\exp(x)+\exp(-x)} }[/math]
Shape:
- Input: (N,∗), where ∗ means any number of additional dimensions
- Output: (N,∗), same shape as the input
- Examples:
>>> m = nn.Tanh()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
>>> print(m(input))
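The quoted example uses the pre-0.4 autograd.Variable API, which was later merged into torch.Tensor; a minimal sketch of the same example against a newer PyTorch API (assuming PyTorch >= 0.4) is:

import torch
import torch.nn as nn

m = nn.Tanh()           # module form of the activation
inp = torch.randn(2)    # two random inputs; no Variable wrapper needed
print(inp)
print(m(inp))           # element-wise tanh, same shape as the input
print(torch.tanh(inp))  # functional equivalent of nn.Tanh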
2018b
- (Santos, 2018) ⇒ Santos (2018). "Activation Functions". In: Neural Networks - Artificial Intelligence. Retrieved: 2018-01-28.
- QUOTE: After the neuron computes the dot product between its inputs and weights, it also applies a non-linearity to this result. This non-linear function is called the Activation Function.
In the past, the popular choices for activation functions were the sigmoid and tanh. Recently it was observed that ReLU layers perform better in deep neural networks, because the saturating sigmoid and tanh suffer from a problem called the vanishing gradient. So you can consider using only ReLU neurons.
sigmoid: [math]\displaystyle{ \sigma(x)=\dfrac{1}{1+e^{-x}} }[/math]
tanh: [math]\displaystyle{ \sigma(x)=\dfrac{e^x-e^{-x}}{e^x+e^{-x}} }[/math]
ReLU: [math]\displaystyle{ \sigma(x)=\max(0,x) }[/math]
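A short standard-library sketch (illustrative only; the derivative formulas σ'=σ(1-σ) and tanh'=1-tanh² are standard calculus facts, not from the quoted source) makes the vanishing-gradient contrast concrete:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x = 5.0  # a large pre-activation, where sigmoid and tanh saturate
print(sigmoid(x), math.tanh(x), max(0.0, x))
# Gradients at x: sigmoid and tanh are nearly flat, while ReLU's slope stays 1.
s = sigmoid(x)
print(s * (1.0 - s))            # ~0.0066
print(1.0 - math.tanh(x) ** 2)  # ~0.00018
print(1.0)                      # ReLU derivative for x > 0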
2018c
- (CS231n, 2018) ⇒ "Commonly used activation functions". In: CS231n Convolutional Neural Networks for Visual Recognition. Retrieved: 2018-01-28.
- QUOTE: Every activation function (or non-linearity) takes a single number and performs a certain fixed mathematical operation on it. There are several activation functions you may encounter in practice:
- Tanh. The tanh non-linearity squashes a real-valued number to the range [-1, 1]. Like the sigmoid neuron, its activations saturate, but unlike the sigmoid neuron its output is zero-centered. Therefore, in practice the tanh non-linearity is always preferred to the sigmoid non-linearity. Also note that the tanh neuron is simply a scaled sigmoid neuron, in particular the following holds: [math]\displaystyle{ \tanh(x)=2\sigma(2x)-1 }[/math].
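The scaled-sigmoid identity is easy to check numerically; a minimal sketch (the test points are chosen arbitrarily):

>>> import math
>>> sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
>>> all(abs(math.tanh(x) - (2 * sigmoid(2 * x) - 1)) < 1e-12 for x in (-2.0, -0.5, 0.0, 0.5, 2.0))
True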
2017
- (Mate Labs, 2017) ⇒ Mate Labs (2017, Aug 23). "Secret Sauce behind the beauty of Deep Learning: Beginners guide to Activation Functions".
- QUOTE: Hyperbolic tangent (TanH): it looks like a scaled sigmoid function. Its output is centered around zero, so the derivatives will be higher. Tanh converges faster than the sigmoid and logistic activation functions.
[math]\displaystyle{ f(x)=\tanh(x)=\dfrac{2}{1+e^{-2x}}-1 }[/math]
Range: [math]\displaystyle{ (-1, 1) }[/math]
Examples: [math]\displaystyle{ \tanh(2) = 0.9640,\; \tanh(-0.567) = -0.5131, \; \tanh(0) = 0 }[/math]
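These example values can be reproduced with Python's standard library (note that math.tanh(-0.567) ≈ -0.51315, so the quoted -0.5131 appears to be truncated rather than rounded):

>>> import math
>>> ["%.4f" % math.tanh(x) for x in (2, -0.567, 0)]
['0.9640', '-0.5132', '0.0000']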
2005
- (Golda, 2005) ⇒ Adam Golda (2005). "Introduction to neural networks".
- QUOTE: Functions that more accurately describe the non-linear characteristic of the biological neuron activation function are:
- (...) hyperbolic tangent function: [math]\displaystyle{ y=\operatorname{tgh}\left(\dfrac{\alpha\varphi}{2}\right)=\dfrac{1-\exp({-\alpha\varphi})}{1+\exp({-\alpha\varphi})} }[/math], where [math]\displaystyle{ \alpha }[/math] is a parameter.
The next picture presents the graphs of particular activation functions:
- a. linear function,
- b. threshold function,
- c. sigmoid function.
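Golda's parameterized form follows from the standard half-argument identity tanh(x/2) = (1 - e^{-x}) / (1 + e^{-x}) with x = αφ; a quick numeric check (the alpha and phi values below are arbitrary illustrative choices):

>>> import math
>>> alpha, phi = 1.5, 0.8
>>> lhs = math.tanh(alpha * phi / 2)
>>> rhs = (1 - math.exp(-alpha * phi)) / (1 + math.exp(-alpha * phi))
>>> abs(lhs - rhs) < 1e-12
True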