Multi-layer Perceptron (MLP) Classification System
A Multi-layer Perceptron (MLP) Classification System is a Multilayer Feedforward Neural Network Training System that implements a Multi-layer Perceptron Classification Algorithm to solve a Multi-layer Perceptron Classification Task.
- AKA: Multi-layer Perceptron (MLP) Classifier.
- Context:
- It is a supervised learning system that is based on the backpropagation algorithm.
- Example(s):
- sklearn.neural_network.MLPClassifier.
- sknn.mlp.Classifier(), from the Multi-Layer Perceptrons of the scikit-neuralnetwork Neural Network Module.
- DL4J MLPClassifierLinear().
- …
- Counter-Example(s):
- See: Supervised Neural Network, Natural Language Processing, Feedforward Neural Network, Artificial Neural Network, Activation Function, Supervised Learning, Backpropagation, Perceptron, Linear Separability.
References
2017a
- (sklearn,2017) ⇒ http://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification Retrieved:2017-12-3.
- QUOTE: Class MLPClassifier implements a multi-layer perceptron (MLP) algorithm that trains using Backpropagation. MLP trains on two arrays: array X of size (n_samples, n_features), which holds the training samples represented as floating point feature vectors; and array y of size (n_samples,), which holds the target values (class labels) for the training samples:
>>> from sklearn.neural_network import MLPClassifier
>>> X = [[0., 0.], [1., 1.]]
>>> y = [0, 1]
>>> clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
>>> clf.fit(X, y)
>>> clf.predict([[2., 2.], [-1., -2.]])
- MLP can fit a non-linear model to the training data. clf.coefs_ contains the weight matrices that constitute the model parameters:
>>> [coef.shape for coef in clf.coefs_]
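As a minimal sketch of what the weight matrices in clf.coefs_ represent, the following reproduces the classifier's probability output with a manual forward pass. It assumes the default hidden activation (relu) and a binary problem, for which scikit-learn applies a logistic output unit. The helper name manual_forward is hypothetical and not part of scikit-learn.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array([[0., 0.], [1., 1.]])
y = [0, 1]
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1)
clf.fit(X, y)

# One weight matrix per pair of adjacent layers: 2 inputs -> 5 -> 2 -> 1 output.
shapes = [coef.shape for coef in clf.coefs_]


def manual_forward(model, samples):
    """Hypothetical helper: assumes relu hidden units and a logistic output."""
    a = np.asarray(samples, dtype=float)
    last = len(model.coefs_) - 1
    for i, (W, b) in enumerate(zip(model.coefs_, model.intercepts_)):
        z = a @ W + b
        # Logistic on the output layer, relu on every hidden layer.
        a = 1.0 / (1.0 + np.exp(-z)) if i == last else np.maximum(z, 0.0)
    return a.ravel()


queries = [[2., 2.], [-1., -2.]]
manual = manual_forward(clf, queries)
builtin = clf.predict_proba(queries)[:, 1]   # probability of class 1

print(shapes)
print(np.allclose(manual, builtin))
```

If a different activation is passed to MLPClassifier, the hidden-layer line in the helper would need to change accordingly.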
- Currently, MLPClassifier supports only the Cross-Entropy loss function, which allows probability estimates by running the predict_proba method. MLP trains using Backpropagation. More precisely, it trains using some form of gradient descent, and the gradients are calculated using Backpropagation. For classification, it minimizes the Cross-Entropy loss function, giving a vector of probability estimates P(y|x) per sample x:
>>> clf.predict_proba([[2., 2.], [1., 2.]])
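To make the link between predict_proba and the Cross-Entropy loss concrete, the sketch below computes the mean cross-entropy on the training samples by hand and compares it with sklearn.metrics.log_loss. It reuses the toy data from the quoted example. The variable names are illustrative, and the value is the unregularized loss only, so it is not claimed to equal the exact objective the solver minimizes (which also includes the alpha penalty).

```python
import numpy as np
from sklearn.metrics import log_loss
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1)
clf.fit(X, y)

proba = clf.predict_proba(X)       # shape (n_samples, n_classes)
row_sums = proba.sum(axis=1)       # each row is a distribution over classes

# Mean cross-entropy: -(1/n) * sum_i log P(y_i | x_i)
true_class_proba = proba[np.arange(len(y)), y]
manual_ce = -np.mean(np.log(true_class_proba))
sklearn_ce = log_loss(y, proba, labels=clf.classes_)

print(row_sums)
print(manual_ce, sklearn_ce)
```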
- MLPClassifier supports multi-class classification by applying Softmax as the output function. Further, the model supports multi-label classification, in which a sample can belong to more than one class. For each class, the raw output passes through the logistic function. Values greater than or equal to 0.5 are rounded to 1, otherwise to 0. For a predicted output of a sample, the indices where the value is 1 represent the assigned classes of that sample:
>>> X = [[0., 0.], [1., 1.]]
>>> y = [[0, 1], [1, 1]]
>>> clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(15,), random_state=1)
>>> clf.fit(X, y)
>>> clf.predict([[1., 2.]])
>>> clf.predict([[0., 0.]])
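The thresholding rule described above can be checked with a short sketch: in the multi-label case predict_proba returns one independent logistic score per label, and rounding those scores at 0.5 should reproduce predict. The names queries, thresholded, and assigned are illustrative, and the comparison assumes no score lands exactly on 0.5.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

X = [[0., 0.], [1., 1.]]
y = [[0, 1], [1, 1]]               # multi-label indicator matrix
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(15,), random_state=1)
clf.fit(X, y)

queries = [[1., 2.], [0., 0.]]
scores = clf.predict_proba(queries)    # one logistic score per label
labels = clf.predict(queries)          # 0/1 indicator per label

# Apply the quoted rule by hand: scores of at least 0.5 become 1, others 0.
thresholded = (scores >= 0.5).astype(int)

# Indices where the indicator is 1 are the assigned classes of each sample.
assigned = [np.flatnonzero(row).tolist() for row in labels]

print(clf.out_activation_)
print(assigned)
```

Unlike the multi-class case, the rows of scores here need not sum to 1, because each label is scored independently.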
2017b
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Multilayer_perceptron Retrieved:2017-12-3.
- A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. MLP utilizes a supervised learning technique called backpropagation for training.[1][2] Its multiple layers and non-linear activation distinguish MLP from a linear perceptron. It can distinguish data that is not linearly separable.[3] Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single hidden layer.[4]
- [1] Rosenblatt, Frank. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, Washington DC, 1961.
- [2] Rumelhart, David E., Geoffrey E. Hinton, and R. J. Williams. "Learning Internal Representations by Error Propagation". David E. Rumelhart, James L. McClelland, and the PDP research group (editors), Parallel distributed processing: Explorations in the microstructure of cognition, Volume 1: Foundation. MIT Press, 1986.
- [3] Cybenko, G. 1989. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems, 2(4), 303–314.
- [4] Hastie, Trevor. Tibshirani, Robert. Friedman, Jerome. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, NY, 2009.
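The claim that an MLP can separate data that is not linearly separable can be illustrated with XOR, a standard example that no single linear perceptron can classify. The sketch below is an assumption-laden illustration and is not taken from the cited sources: the weights W1, b1, and W2 are hand-picked rather than learned by backpropagation, and the hidden units use ReLU.

```python
import numpy as np

# XOR truth table: the two classes cannot be split by a single line.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Hand-picked (not learned) weights for a 2-2-1 network with ReLU hidden units.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])        # both hidden units receive x1 + x2
b1 = np.array([0.0, -1.0])         # second unit fires only when both inputs are 1
W2 = np.array([1.0, -2.0])         # output = h1 - 2 * h2

hidden = np.maximum(X @ W1 + b1, 0.0)    # nonlinear activation
output = hidden @ W2                     # linear read-out of the hidden layer
pred = (output > 0.5).astype(int)

print(pred)
```

Removing the np.maximum call collapses the network into a single linear map, which is why the nonlinear activation is what distinguishes an MLP from a linear perceptron.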