Delta Rule


A Delta Rule is a gradient descent learning rule for updating the weights of an artificial neuron with a differentiable activation function. For a neuron [math]\displaystyle{ j \, }[/math] with activation function [math]\displaystyle{ g(x) \, }[/math], the delta rule for [math]\displaystyle{ j \, }[/math]'s [math]\displaystyle{ i \, }[/math]th weight [math]\displaystyle{ w_{ji} \, }[/math] is given by [math]\displaystyle{ \Delta w_{ji}=\alpha(t_j-y_j) g'(h_j) x_i }[/math].
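The formula above can be read as one step of gradient descent on a squared error. The short derivation below is an illustrative sketch, assuming the standard error function [math]\displaystyle{ E = \frac{1}{2}(t_j - y_j)^2 \, }[/math], which is not stated explicitly on this page:

```latex
% Illustrative sketch: deriving the delta rule as gradient descent on a
% squared-error loss for a single neuron j (the choice of E is an assumption).
\documentclass{article}
\usepackage{amsmath}
\begin{document}
With $E = \tfrac{1}{2}(t_j - y_j)^2$, $y_j = g(h_j)$, and $h_j = \sum_i x_i w_{ji}$,
the chain rule gives
\begin{align*}
\frac{\partial E}{\partial w_{ji}}
  &= \frac{\partial E}{\partial y_j}\,
     \frac{\partial y_j}{\partial h_j}\,
     \frac{\partial h_j}{\partial w_{ji}}
   = -(t_j - y_j)\, g'(h_j)\, x_i, \\
\Delta w_{ji} &= -\alpha\, \frac{\partial E}{\partial w_{ji}}
   = \alpha\, (t_j - y_j)\, g'(h_j)\, x_i .
\end{align*}
\end{document}
```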



References

2016

  • (Wikipedia, 2016) ⇒ https://en.wikipedia.org/wiki/delta_rule Retrieved:2016-8-2.
    • In machine learning, the delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network. It is a special case of the more general backpropagation algorithm. For a neuron [math]\displaystyle{ j \, }[/math] with activation function [math]\displaystyle{ g(x) \, }[/math], the delta rule for [math]\displaystyle{ j \, }[/math]'s [math]\displaystyle{ i \, }[/math]th weight [math]\displaystyle{ w_{ji} \, }[/math] is given by: [math]\displaystyle{ \Delta w_{ji}=\alpha(t_j-y_j) g'(h_j) x_i \, }[/math], where:
      • [math]\displaystyle{ \alpha \, }[/math] is a small constant called the learning rate,
      • [math]\displaystyle{ g(x) \, }[/math] is the neuron's activation function,
      • [math]\displaystyle{ g' \, }[/math] is the derivative of [math]\displaystyle{ g \, }[/math],
      • [math]\displaystyle{ t_j \, }[/math] is the target output,
      • [math]\displaystyle{ h_j \, }[/math] is the weighted sum of the neuron's inputs,
      • [math]\displaystyle{ y_j \, }[/math] is the actual output, and
      • [math]\displaystyle{ x_i \, }[/math] is the [math]\displaystyle{ i \, }[/math]th input.

      It holds that [math]\displaystyle{ h_j=\sum x_i w_{ji} \, }[/math] and [math]\displaystyle{ y_j=g(h_j) \, }[/math] .
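
      For concreteness, here is a minimal sketch of one delta-rule update in Python; the sigmoid activation, the function names, and the toy values are illustrative assumptions, not part of the article:

```python
import numpy as np

def sigmoid(h):
    # One possible differentiable activation g(h); chosen here only for illustration.
    return 1.0 / (1.0 + np.exp(-h))

def sigmoid_prime(h):
    # Derivative g'(h) of the sigmoid.
    s = sigmoid(h)
    return s * (1.0 - s)

def delta_rule_update(w_j, x, t_j, alpha=0.1):
    # w_j[i] plays the role of w_ji, x[i] of x_i, and t_j is the target output.
    h_j = np.dot(w_j, x)          # h_j = sum_i x_i * w_ji
    y_j = sigmoid(h_j)            # y_j = g(h_j)
    return w_j + alpha * (t_j - y_j) * sigmoid_prime(h_j) * x

# One update step on toy values:
w = np.array([0.2, -0.4, 0.1])
x = np.array([1.0, 0.5, -1.0])
w = delta_rule_update(w, x, t_j=1.0)
```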

      The delta rule is commonly stated in simplified form for a neuron with a linear activation function as: [math]\displaystyle{ \Delta w_{ji}=\alpha(t_j-y_j) x_i \, }[/math]

      While the delta rule is similar to the perceptron's update rule, the derivation is different. The perceptron uses the Heaviside step function as the activation function [math]\displaystyle{ g(h) }[/math], which means that [math]\displaystyle{ g'(h) }[/math] does not exist at zero and is equal to zero elsewhere. This makes a direct application of the delta rule impossible, as the sketch below illustrates.
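
      As a rough illustration of that contrast (illustrative names and values, not from the article), the sketch below shows the simplified linear-activation update next to the perceptron update; the formulas look alike, but only the linear case is a genuine instance of the delta rule, because [math]\displaystyle{ g'(h) = 1 }[/math] everywhere:

```python
import numpy as np

def linear_delta_update(w, x, t, alpha=0.1):
    # Linear activation: y = h, so g'(h) = 1 and the rule reduces to alpha*(t - y)*x.
    y = np.dot(w, x)
    return w + alpha * (t - y) * x

def perceptron_update(w, x, t, alpha=0.1):
    # Heaviside step activation: the update looks the same, but it is not a
    # delta rule, since g'(h) is zero or undefined and cannot enter the formula.
    y = 1.0 if np.dot(w, x) >= 0 else 0.0
    return w + alpha * (t - y) * x
```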