Parametric Rectified Linear Activation Function
A Parametric Rectified Linear Activation Function is a Rectified-based Activation Function that is based on the mathematical function: [math]\displaystyle{ f(x)=\max(0,x)+\alpha \cdot \min(0,x) }[/math], where [math]\displaystyle{ \alpha }[/math] is a Neural Network Learnable Parameter.
- AKA: PReLU.
- Context:
- It can (typically) be used in the activation of Parametric Rectified Linear Neurons.
- Example(s):
- a torch.nn.PReLU (a PyTorch implementation),
- a chainer.functions.prelu (a Chainer implementation),
- …
- Counter-Example(s):
- a Clipped Rectifier Unit Activation Function,
- a Concatenated Rectified Linear Activation Function,
- an Exponential Linear Activation Function,
- a Leaky Rectified Linear Activation Function,
- a Noisy Rectified Linear Activation Function,
- a Randomized Leaky Rectified Linear Activation Function,
- a Scaled Exponential Linear Activation Function,
- a Softplus Activation Function,
- a S-shaped Rectified Linear Activation Function,
- a Maxout Activation Function.
- See: Artificial Neural Network, Artificial Neuron, Neural Network Topology, Neural Network Layer, Neural Network Learning Rate.
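The function above can be illustrated with a minimal sketch (an illustrative NumPy implementation, not taken from any particular library; the names prelu and alpha are chosen here only for exposition):
import numpy as np

def prelu(x, alpha=0.25):
    # f(x) = max(0, x) + alpha * min(0, x): identity for x >= 0, slope alpha for x < 0.
    return np.maximum(0.0, x) + alpha * np.minimum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(prelu(x))             # -> -0.5, -0.125, 0.0, 1.5
print(prelu(x, alpha=0.1))  # -> -0.2, -0.05, 0.0, 1.5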
References
2018a
- (PyTorch, 2018) ⇒ http://pytorch.org/docs/master/nn.html#prelu Retrieved: 2018-2-18.
- QUOTE:
class torch.nn.PReLU(num_parameters=1, init=0.25)
Applies element-wise the function [math]\displaystyle{ PReLU(x)=\max(0,x)+a \cdot \min(0,x) }[/math]. Here “[math]\displaystyle{ a }[/math]” is a learnable parameter. When called without arguments, nn.PReLU() uses a single parameter “[math]\displaystyle{ a }[/math]” across all input channels. If called with nn.PReLU(nChannels), a separate “[math]\displaystyle{ a }[/math]” is used for each input channel.
Note: weight decay should not be used when learning “[math]\displaystyle{ a }[/math]” for good performance.
Parameters:
- num_parameters – number of “[math]\displaystyle{ a }[/math]” to learn. Default: 1
- init – the initial value of “[math]\displaystyle{ a }[/math]”. Default: 0.25
- Shape:
- Input: (N,∗) where * means, any number of additional dimensions
- Output: (N,∗), same shape as the input
- Examples:
>>> m = nn.PReLU()
>>> input = autograd.Variable(torch.randn(2))
>>> print(input)
>>> print(m(input))
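A slightly fuller usage sketch of the quoted module (an illustration based on the API described above, not part of the quoted documentation; in current PyTorch releases tensors are passed directly and the autograd.Variable wrapper is no longer needed):
import torch
import torch.nn as nn

m_shared = nn.PReLU()                       # one shared "a" for all channels (init=0.25)
m_per_channel = nn.PReLU(num_parameters=3)  # one "a" per channel for 3-channel inputs

x = torch.randn(2, 3, 4, 4)                 # (N, C, H, W) batch of 3-channel inputs
y = m_per_channel(x)
print(y.shape)                              # torch.Size([2, 3, 4, 4]) -- same shape as the input
print(m_per_channel.weight.shape)           # torch.Size([3]) -- the learnable "a" values
print(m_shared(torch.randn(5)).shape)       # torch.Size([5])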
2018b
- (Chainer, 2018) ⇒ http://docs.chainer.org/en/stable/reference/generated/chainer.functions.prelu.html Retrieved: 2018-2-18.
- QUOTE:
chainer.functions.prelu(x, W)
It accepts two arguments: an input x and a weight array W, and computes the output as [math]\displaystyle{ PReLU(x)=\max(x, W \ast x) }[/math], where [math]\displaystyle{ \ast }[/math] is an elementwise multiplication for each sample in the batch.
When the PReLU function is combined with two-dimensional convolution, the elements of parameter W are typically shared across the same filter of different pixels. In order to support such usage, this function supports the shape of parameter array that indicates leading dimensions of input arrays except the batch dimension. For example, if [math]\displaystyle{ W }[/math] has the shape of [math]\displaystyle{ (2,3,4) }[/math], [math]\displaystyle{ x }[/math] must have the shape of [math]\displaystyle{ (B,2,3,4,S_1,\ldots,S_N) }[/math] where [math]\displaystyle{ B }[/math] is the batch size and the number of trailing [math]\displaystyle{ S }[/math]'s, [math]\displaystyle{ N }[/math], is an arbitrary non-negative integer.
Parameters:
- x (Variable) – Input variable. Its first argument is assumed to be the minibatch dimension.
- W (Variable) – Weight variable.
- Returns: Output variable.
- Return type: Variable.
- See also: PReLU
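A minimal usage sketch of the quoted function (an illustration under the shape convention described above, assuming Chainer and NumPy are installed; the slope value 0.25 is an arbitrary choice):
import numpy as np
import chainer.functions as F

x = np.random.randn(4, 3).astype(np.float32)   # batch of 4 samples with 3 channels
W = np.full((3,), 0.25, dtype=np.float32)      # one learnable slope per channel
y = F.prelu(x, W)
print(y.shape)  # (4, 3) -- same shape as the input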
2018c
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Leaky_ReLUs Retrieved:2018-2-4.
- Leaky ReLUs allow a small, non-zero gradient when the unit is not active.[1] : [math]\displaystyle{ f(x) = \begin{cases} x & \mbox{if } x \gt 0 \\ 0.01x & \mbox{otherwise} \end{cases} }[/math]
Parametric ReLUs take this idea further by making the coefficient of leakage into a parameter that is learned along with the other neural network parameters.[2] : [math]\displaystyle{ f(x) = \begin{cases} x & \mbox{if } x \gt 0 \\ a x & \mbox{otherwise} \end{cases} }[/math]
Note that for [math]\displaystyle{ a\leq1 }[/math], this is equivalent to : [math]\displaystyle{ f(x) = \max(x, ax) }[/math] and thus has a relation to "maxout" networks.
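The equivalence noted for [math]\displaystyle{ a\leq1 }[/math] can be checked numerically with a short sketch (illustrative code, not part of the quoted article):
import numpy as np

def prelu_piecewise(x, a):
    # x for x > 0, a * x otherwise
    return np.where(x > 0, x, a * x)

def prelu_max(x, a):
    return np.maximum(x, a * x)

x = np.linspace(-3.0, 3.0, 13)
for a in (0.0, 0.1, 0.25, 1.0):
    assert np.allclose(prelu_piecewise(x, a), prelu_max(x, a))  # both forms agree for a <= 1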
2017
- (Mate Labs, 2017) ⇒ Mate Labs Aug 23, 2017. Secret Sauce behind the beauty of Deep Learning: Beginners guide to Activation Functions
- QUOTE: Parametric Rectified Linear Unit (PReLU) — It makes the coefficient of leakage into a parameter that is learned along with the other neural network parameters. Alpha (α) is the coefficient of leakage here.
For [math]\displaystyle{ \alpha\leq 1 \quad f(x) = max(x, \alpha x) }[/math]
Range:[math]\displaystyle{ (-\infty, +\infty) }[/math]
[math]\displaystyle{ f(\alpha, x) = \begin{cases} \alpha x, & \mbox{for } x \lt 0 \\ x, & \mbox{for } x \geq 0 \end{cases} }[/math]
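Since [math]\displaystyle{ \alpha }[/math] is learned jointly with the other parameters, its gradient follows directly from the piecewise form above: [math]\displaystyle{ \partial f/\partial\alpha = x }[/math] for [math]\displaystyle{ x \lt 0 }[/math] and [math]\displaystyle{ 0 }[/math] otherwise, i.e. [math]\displaystyle{ \min(0,x) }[/math]. A short PyTorch sketch (an illustration, not from the quoted post) verifying this with autograd:
import torch
import torch.nn as nn

m = nn.PReLU(init=0.25)                 # a single learnable slope alpha
x = torch.tensor([-2.0, -0.5, 3.0])
m(x).sum().backward()

# d f / d alpha summed over the batch equals sum(min(0, x)) = -2.5 here.
print(m.weight.grad)                                # tensor([-2.5000])
print(torch.minimum(torch.zeros_like(x), x).sum())  # tensor(-2.5000)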
2015
- (He et al., 2015) ⇒ He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE International Conference on computer vision (pp. 1026-1034).
- ABSTRACT: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.
- ↑ Andrew L. Maas, Awni Y. Hannun, Andrew Y. Ng (2014). Rectifier Nonlinearities Improve Neural Network Acoustic Models
- ↑ He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian (2015). “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification". arXiv:1502.01852 [cs.CV].