Bent Identity Activation Function
A Bent Identity Activation Function is a neuron activation function that is based on the mathematical function: [math]\displaystyle{ f(x)=\frac{\sqrt{x^2 + 1} - 1}{2} + x }[/math].
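A minimal NumPy sketch of this function and its derivative (the formulas are the ones given above and in the comparison table below; the function names are illustrative, not from any library):

```python
import numpy as np

def bent_identity(x):
    """Bent identity: f(x) = (sqrt(x^2 + 1) - 1) / 2 + x."""
    return (np.sqrt(x * x + 1.0) - 1.0) / 2.0 + x

def bent_identity_derivative(x):
    """Its derivative: f'(x) = x / (2 * sqrt(x^2 + 1)) + 1, which stays in (1/2, 3/2)."""
    return x / (2.0 * np.sqrt(x * x + 1.0)) + 1.0

x = np.array([-2.0, 0.0, 2.0])
print(bent_identity(x))             # approx. [-1.382, 0.0, 2.618]
print(bent_identity_derivative(x))  # approx. [ 0.553, 1.0, 1.447]
```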
- Context:
- It can (typically) be used in the activation of Bent Identity Neurons.
- Example(s):
- ...
- Counter-Example(s):
- a Softmax-based Activation Function,
- a Rectified-based Activation Function,
- a Heaviside Step Activation Function,
- a Ramp Function-based Activation Function,
- a Logistic Sigmoid-based Activation Function,
- a Hyperbolic Tangent-based Activation Function,
- a Gaussian-based Activation Function,
- a Softsign Activation Function,
- a Softshrink Activation Function,
- an Adaptive Piecewise Linear Activation Function,
- a Maxout Activation Function.
- See: Artificial Neural Network, Artificial Neuron, Neural Network Topology, Neural Network Layer, Neural Network Learning Rate.
References
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions Retrieved:2018-2-12.
- The following table compares the properties of several activation functions that are functions of a single fold x from the previous layer or layers:
Name | Equation | Derivative (with respect to x) | Range | Order of continuity | Monotonic | Derivative Monotonic | Approximates identity near the origin
---|---|---|---|---|---|---|---
Identity | [math]\displaystyle{ f(x)=x }[/math] | [math]\displaystyle{ f'(x)=1 }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | Yes
Binary step | [math]\displaystyle{ f(x) = \begin{cases} 0 & \text{for } x \lt 0\\ 1 & \text{for } x \ge 0\end{cases} }[/math] | [math]\displaystyle{ f'(x) = \begin{cases} 0 & \text{for } x \ne 0\\ ? & \text{for } x = 0\end{cases} }[/math] | [math]\displaystyle{ \{0,1\} }[/math] | [math]\displaystyle{ C^{-1} }[/math] | Yes | No | No
Logistic (a.k.a. Soft step) | [math]\displaystyle{ f(x)=\frac{1}{1+e^{-x}} }[/math] | [math]\displaystyle{ f'(x)=f(x)(1-f(x)) }[/math] | [math]\displaystyle{ (0,1) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | No | No
(...) | (...) | (...) | (...) | (...) | (...) | (...) | (...)
Adaptive piecewise linear (APL)[1] | [math]\displaystyle{ f(x) = \max(0,x) + \sum_{s=1}^{S}a_i^s \max(0, -x + b_i^s) }[/math] | [math]\displaystyle{ f'(x) = H(x) - \sum_{s=1}^{S}a_i^s H(-x + b_i^s) }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^0 }[/math] | No | No | No
SoftPlus[2] | [math]\displaystyle{ f(x)=\ln(1+e^x) }[/math] | [math]\displaystyle{ f'(x)=\frac{1}{1+e^{-x}} }[/math] | [math]\displaystyle{ (0,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | No
Bent identity | [math]\displaystyle{ f(x)=\frac{\sqrt{x^2 + 1} - 1}{2} + x }[/math] | [math]\displaystyle{ f'(x)=\frac{x}{2\sqrt{x^2 + 1}} + 1 }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | Yes
SoftExponential[3] | [math]\displaystyle{ f(\alpha,x) = \begin{cases} -\frac{\ln(1-\alpha (x + \alpha))}{\alpha} & \text{for } \alpha \lt 0\\ x & \text{for } \alpha = 0\\ \frac{e^{\alpha x} - 1}{\alpha} + \alpha & \text{for } \alpha \gt 0\end{cases} }[/math] | [math]\displaystyle{ f'(\alpha,x) = \begin{cases} \frac{1}{1-\alpha (\alpha + x)} & \text{for } \alpha \lt 0\\ e^{\alpha x} & \text{for } \alpha \ge 0\end{cases} }[/math] | [math]\displaystyle{ (-\infty,\infty) }[/math] | [math]\displaystyle{ C^\infty }[/math] | Yes | Yes | Yes iff [math]\displaystyle{ \alpha = 0 }[/math]
Sinusoid[4] | [math]\displaystyle{ f(x)=\sin(x) }[/math] | [math]\displaystyle{ f'(x)=\cos(x) }[/math] | [math]\displaystyle{ [-1,1] }[/math] | [math]\displaystyle{ C^\infty }[/math] | No | No | Yes
Sinc | [math]\displaystyle{ f(x)=\begin{cases} 1 & \text{for } x = 0\\ \frac{\sin(x)}{x} & \text{for } x \ne 0\end{cases} }[/math] | [math]\displaystyle{ f'(x)=\begin{cases} 0 & \text{for } x = 0\\ \frac{\cos(x)}{x} - \frac{\sin(x)}{x^2} & \text{for } x \ne 0\end{cases} }[/math] | [math]\displaystyle{ [\approx -0.217234, 1] }[/math] | [math]\displaystyle{ C^\infty }[/math] | No | No | No
Gaussian | [math]\displaystyle{ f(x)=e^{-x^2} }[/math] | [math]\displaystyle{ f'(x)=-2xe^{-x^2} }[/math] | [math]\displaystyle{ (0,1] }[/math] | [math]\displaystyle{ C^\infty }[/math] | No | No | No
Here, H is the Heaviside step function.
α is a stochastic variable sampled from a uniform distribution at training time and fixed to the expectation value of the distribution at test time.
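As an informal sanity check on the derivative column of the table above, the following sketch (assuming NumPy; all helper names are illustrative) compares the closed-form derivatives of SoftPlus and Bent identity against central finite differences:

```python
import numpy as np

# Closed-form activations and derivatives, as tabulated above.
def softplus(x):
    return np.log1p(np.exp(x))                     # f(x) = ln(1 + e^x)

def softplus_prime(x):
    return 1.0 / (1.0 + np.exp(-x))                # f'(x) = 1 / (1 + e^{-x})

def bent_identity(x):
    return (np.sqrt(x * x + 1.0) - 1.0) / 2.0 + x  # f(x) = (sqrt(x^2+1) - 1)/2 + x

def bent_identity_prime(x):
    return x / (2.0 * np.sqrt(x * x + 1.0)) + 1.0  # f'(x) = x / (2 sqrt(x^2+1)) + 1

def numeric_grad(f, x, h=1e-5):
    """Central finite difference: (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x = np.linspace(-3.0, 3.0, 13)
assert np.allclose(softplus_prime(x), numeric_grad(softplus, x), atol=1e-6)
assert np.allclose(bent_identity_prime(x), numeric_grad(bent_identity, x), atol=1e-6)
```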
2017
- (Mate Labs, 2017) ⇒ Mate Labs (2017-08-23). "Secret Sauce behind the beauty of Deep Learning: Beginners guide to Activation Functions".
- QUOTE: Bent identity
Range: [math]\displaystyle{ (-\infty,+\infty) }[/math]
[math]\displaystyle{ f(x)=\frac{\sqrt{x^2 + 1} - 1}{2} + x }[/math]
- ↑ Agostinelli, Forest; Hoffman, Matthew; Sadowski, Peter; Baldi, Pierre (2014-12-21). "Learning Activation Functions to Improve Deep Neural Networks". arXiv:1412.6830 [cs.NE].
- ↑ Glorot, Xavier; Bordes, Antoine; Bengio, Yoshua (2011). "Deep sparse rectifier neural networks" (PDF). International Conference on Artificial Intelligence and Statistics.
- ↑ Godfrey, Luke B.; Gashler, Michael S. (2016-02-03). "A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks". 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management: KDIR 1602: 481–486. arXiv:1602.01321. Bibcode:2016arXiv160201321G.
- ↑ Gashler, Michael S.; Ashmore, Stephen C. (2014-05-09). "Training Deep Fourier Neural Networks To Fit Time-Series Data". arXiv:1405.2262 [cs.NE].