Bent Identity Activation Function
A Bent Identity Activation Function is a neuron activation function that is based on the mathematical function: [math]\displaystyle{ f(x)=\frac{\sqrt{x^2 + 1} - 1}{2} + x }[/math].
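A minimal NumPy sketch of this function and its derivative (the formulas are the ones given above and in the comparison table below; the function names are illustrative, not from any library):

```python
import numpy as np

def bent_identity(x):
    """Bent identity: f(x) = (sqrt(x^2 + 1) - 1) / 2 + x."""
    return (np.sqrt(x * x + 1.0) - 1.0) / 2.0 + x

def bent_identity_derivative(x):
    """Its derivative: f'(x) = x / (2 * sqrt(x^2 + 1)) + 1, which stays in (1/2, 3/2)."""
    return x / (2.0 * np.sqrt(x * x + 1.0)) + 1.0

x = np.array([-2.0, 0.0, 2.0])
print(bent_identity(x))             # approx. [-1.382, 0.0, 2.618]
print(bent_identity_derivative(x))  # approx. [ 0.553, 1.0, 1.447]
```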
- Context:
- It can (typically) be used in the activation of Bent Identity Neurons.
- Example(s):
- ...
- Counter-Example(s):
- a Softmax-based Activation Function,
- a Rectified-based Activation Function,
- a Heaviside Step Activation Function,
- a Ramp Function-based Activation Function,
- a Logistic Sigmoid-based Activation Function,
- a Hyperbolic Tangent-based Activation Function,
- a Gaussian-based Activation Function,
- a Softsign Activation Function,
- a Softshrink Activation Function,
- an Adaptive Piecewise Linear Activation Function,
- a Maxout Activation Function.
- See: Artificial Neural Network, Artificial Neuron, Neural Network Topology, Neural Network Layer, Neural Network Learning Rate.
References
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions Retrieved:2018-2-12.
- The following table compares the properties of several activation functions that are functions of a single fold x from the previous layer or layers:
Name | Equation | Derivative (with respect to x) | Range | Order of continuity | Monotonic | Derivative Monotonic | Approximates identity near the origin
---|---|---|---|---|---|---|---
Identity | [math]\displaystyle{ f(x)=x }[/math] | [math]\displaystyle{ f'(x)=1 }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | Yes
Binary step | [math]\displaystyle{ f(x) = \begin{cases} 0 & \text{for } x \lt 0\\ 1 & \text{for } x \ge 0\end{cases} }[/math] | [math]\displaystyle{ f'(x) = \begin{cases} 0 & \text{for } x \ne 0\\ ? & \text{for } x = 0\end{cases} }[/math] | [math]\displaystyle{ \{0,1\} }[/math] | [math]\displaystyle{ C^{-1} }[/math] | Yes | No | No
Logistic (a.k.a. Soft step) | [math]\displaystyle{ f(x)=\frac{1}{1+e^{-x}} }[/math] | [math]\displaystyle{ f'(x)=f(x)(1-f(x)) }[/math] | [math]\displaystyle{ (0,1) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | No | No
(...) | (...) | (...) | (...) | (...) | (...) | (...) | (...)
Adaptive piecewise linear (APL)[1] | [math]\displaystyle{ f(x) = \max(0,x) + \sum_{s=1}^{S}a_i^s \max(0, -x + b_i^s) }[/math] | [math]\displaystyle{ f'(x) = H(x) - \sum_{s=1}^{S}a_i^s H(-x + b_i^s) }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^0 }[/math] | No | No | No
SoftPlus[2] | [math]\displaystyle{ f(x)=\ln(1+e^x) }[/math] | [math]\displaystyle{ f'(x)=\frac{1}{1+e^{-x}} }[/math] | [math]\displaystyle{ (0,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | No
Bent identity | [math]\displaystyle{ f(x)=\frac{\sqrt{x^2 + 1} - 1}{2} + x }[/math] | [math]\displaystyle{ f'(x)=\frac{x}{2\sqrt{x^2 + 1}} + 1 }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | Yes
SoftExponential[3] | [math]\displaystyle{ f(\alpha,x) = \begin{cases} -\frac{\ln(1-\alpha (x + \alpha))}{\alpha} & \text{for } \alpha \lt 0\\ x & \text{for } \alpha = 0\\ \frac{e^{\alpha x} - 1}{\alpha} + \alpha & \text{for } \alpha \gt 0\end{cases} }[/math] | [math]\displaystyle{ f'(\alpha,x) = \begin{cases} \frac{1}{1-\alpha (\alpha + x)} & \text{for } \alpha \lt 0\\ e^{\alpha x} & \text{for } \alpha \ge 0\end{cases} }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | Yes iff [math]\displaystyle{ \alpha = 0 }[/math]
Sinusoid[4] | [math]\displaystyle{ f(x)=\sin(x) }[/math] | [math]\displaystyle{ f'(x)=\cos(x) }[/math] | [math]\displaystyle{ [-1,1] }[/math] | [math]\displaystyle{ C^\infty }[/math] | No | No | Yes
Sinc | [math]\displaystyle{ f(x)=\begin{cases} 1 & \text{for } x = 0\\ \frac{\sin(x)}{x} & \text{for } x \ne 0\end{cases} }[/math] | [math]\displaystyle{ f'(x)=\begin{cases} 0 & \text{for } x = 0\\ \frac{\cos(x)}{x} - \frac{\sin(x)}{x^2} & \text{for } x \ne 0\end{cases} }[/math] | [math]\displaystyle{ [\approx -0.217234, 1] }[/math] | [math]\displaystyle{ C^\infty }[/math] | No | No | No
Gaussian | [math]\displaystyle{ f(x)=e^{-x^2} }[/math] | [math]\displaystyle{ f'(x)=-2xe^{-x^2} }[/math] | [math]\displaystyle{ (0,1] }[/math] | [math]\displaystyle{ C^\infty }[/math] | No | No | No
Here, H is the Heaviside step function.
α is a stochastic variable sampled from a uniform distribution at training time and fixed to the expectation value of the distribution at test time.
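As an informal sanity check on the derivative column of the table above, the following sketch (assuming NumPy; all helper names are illustrative) compares the closed-form derivatives of SoftPlus and Bent identity against central finite differences:

```python
import numpy as np

# Closed-form activations and derivatives, as tabulated above.
def softplus(x):
    return np.log1p(np.exp(x))                     # f(x) = ln(1 + e^x)

def softplus_prime(x):
    return 1.0 / (1.0 + np.exp(-x))                # f'(x) = 1 / (1 + e^{-x})

def bent_identity(x):
    return (np.sqrt(x * x + 1.0) - 1.0) / 2.0 + x  # f(x) = (sqrt(x^2+1) - 1)/2 + x

def bent_identity_prime(x):
    return x / (2.0 * np.sqrt(x * x + 1.0)) + 1.0  # f'(x) = x / (2 sqrt(x^2+1)) + 1

def numeric_grad(f, x, h=1e-5):
    """Central finite difference: (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x = np.linspace(-3.0, 3.0, 13)
assert np.allclose(softplus_prime(x), numeric_grad(softplus, x), atol=1e-6)
assert np.allclose(bent_identity_prime(x), numeric_grad(bent_identity, x), atol=1e-6)
```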
2017
- (Mate Labs, 2017) ⇒ Mate Labs (2017-08-23). "Secret Sauce behind the beauty of Deep Learning: Beginners guide to Activation Functions".
- QUOTE: Bent identity
Range: [math]\displaystyle{ (-\infty,+\infty) }[/math]
[math]\displaystyle{ f(x)=\frac{\sqrt{x^2 + 1} - 1}{2} + x }[/math]
- ↑ Agostinelli, Forest; Hoffman, Matthew; Sadowski, Peter; Baldi, Pierre (2014-12-21). "Learning Activation Functions to Improve Deep Neural Networks". arXiv:1412.6830 [cs.NE].
- ↑ Glorot, Xavier; Bordes, Antoine; Bengio, Yoshua (2011). "Deep sparse rectifier neural networks" (PDF). International Conference on Artificial Intelligence and Statistics.
- ↑ Godfrey, Luke B.; Gashler, Michael S. (2016-02-03). "A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks". 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management: KDIR 1602: 481–486. arXiv:1602.01321. Bibcode:2016arXiv160201321G.
- ↑ Gashler, Michael S.; Ashmore, Stephen C. (2014-05-09). "Training Deep Fourier Neural Networks To Fit Time-Series Data". arXiv:1405.2262 [cs.NE].