Multi-Layer Neural Network Training Algorithm
A Multi-Layer Neural Network Training Algorithm is a neural network training algorithm that can be applied by a Multi-Layer Neural Network Training System (to solve a multi-layer neural network training task, i.e. to produce a trained multi-layer neural network).
- AKA: Multilevel ANN Training Method.
- Example(s): a Backpropagation Training Algorithm applied to a multi-layer (deep) feedforward network.
- Counter-Example(s): a Single-Layer Perceptron Training Algorithm.
- See: Recurrent NNet Algorithm, Single Hidden-Layer Neural Network.
References
2015
- https://www.quora.com/What-is-the-difference-between-deep-learning-and-usual-machine-learning/answer/Sebastian-Raschka-1?srid=uuoZN
- QUOTE: Now, if you add multiple hidden layers to this MLP, you'd also call the network "deep." The problem with such "deep" networks is that it becomes harder and harder to learn "good" weights for this network. When you start training your network, you typically assign random values as initial weights, which can be terribly off from the "optimal" solution you want to find. During training, you then use the popular backpropagation algorithm (think of it as reverse-mode autodifferentiation) to propagate the "errors" from right to left and calculate the partial derivatives with respect to each weight to take a step into the opposite direction of the cost (or "error") gradient. Now, the problem is the so-called "vanishing gradient" -- the more layers you add, the harder it becomes to "update" your weights because the signal becomes weaker and weaker. Since your network's weights can be terribly off in the beginning (random initialization) it can become almost impossible to parameterize a "deep" neural network with backpropagation.
Now, this is where "deep learning" comes into play. Roughly speaking, you can think of deep learning as "clever" tricks or algorithms that can help you with training such "deep" neural network structures. There are many, many different neural network architectures, but to continue with the example of the MLP, let me introduce the idea of convolutional neural networks (ConvNets). You can think of those as an "addon" to your MLP that helps you to detect features as "good" inputs for your MLP.
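The quoted passage describes training a multi-layer network by propagating errors backwards with backpropagation and stepping against the cost gradient. The following minimal NumPy sketch illustrates that loop on a toy XOR problem; it is not taken from the quoted source, and the layer sizes, sigmoid activations, learning rate, and toy data are illustrative assumptions.

<pre>
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: XOR, a task that needs at least one hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Random initial weights, as described in the quote ("terribly off" at first).
sizes = [2, 8, 8, 1]                      # input, two hidden layers, output
W = [rng.normal(0, 0.5, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((1, n)) for n in sizes[1:]]

lr = 1.0
for epoch in range(5000):
    # Forward pass: store the activations of every layer.
    activations = [X]
    for Wl, bl in zip(W, b):
        activations.append(sigmoid(activations[-1] @ Wl + bl))

    # Backward pass: propagate the error from the output layer to the left
    # and compute the partial derivative of the squared-error cost
    # with respect to each weight.
    delta = (activations[-1] - y) * activations[-1] * (1 - activations[-1])
    for l in reversed(range(len(W))):
        grad_W = activations[l].T @ delta
        grad_b = delta.sum(axis=0, keepdims=True)
        if l > 0:  # push the error signal one layer further left
            delta = (delta @ W[l].T) * activations[l] * (1 - activations[l])
        # Step in the opposite direction of the cost gradient.
        W[l] -= lr * grad_W
        b[l] -= lr * grad_b

print(np.round(activations[-1], 2))  # should approach [[0], [1], [1], [0]]
</pre>

With only two hidden layers and sigmoid units the error signal already shrinks as it is multiplied by the derivative terms at each layer, which is the "vanishing gradient" effect the quote refers to; adding many more layers to this sketch would make the early-layer gradients, and hence the weight updates, progressively smaller.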