Accuracy Estimation Algorithm
An Accuracy Estimation Algorithm is a Validation Algorithm (and an Estimation Algorithm) that can solve a Predictive Function Accuracy Estimation Task.
- AKA: Predictive Function Accuracy Estimation Algorithm.
- Context:
- It is based on the following mathematical expression for accuracy:
$Accuracy = \dfrac{TP+TN}{TP+TN+FP+FN}$
where TP, TN, FP, and FN are the counts of True Positive, True Negative, False Positive, and False Negative predictions made by a binary classifier (a minimal code sketch of this expression appears just before the References section).
- Example(s): a Cross-Validation-based Accuracy Estimation Algorithm, a Bootstrap-based Accuracy Estimation Algorithm, a Holdout Test Set-based Accuracy Estimation Algorithm.
- Counter-Example(s):
- See: Classification Accuracy Metric, Confusion Matrix, Resubstitution Accuracy, Precision, Recall, F-Measure, Error Rate, Statistical Significance, Cross-Validation, Bootstrap, Resampling, Classification Task, Task Performance.
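The confusion-matrix expression in the Context above translates directly into code. Below is a minimal Python sketch; the function and argument names are our own, chosen for illustration.

```python
def binary_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = (TP + TN) / (TP + TN + FP + FN) for a binary classifier."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("Confusion-matrix counts sum to zero.")
    return (tp + tn) / total

# Illustrative counts: 40 TP, 45 TN, 10 FP, 5 FN.
print(binary_accuracy(tp=40, tn=45, fp=10, fn=5))  # 0.85
```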
References
2018
- (ML Glossary, 2018) ⇒ (2018). “Accuracy.” In: Machine Learning Glossary. https://developers.google.com/machine-learning/glossary/ Retrieved: 2018-04-22.
- QUOTE: The fraction of predictions that a classification model got right. In multi-class classification, accuracy is defined as follows:
[math]\displaystyle{ \text{Accuracy} =\frac{\text{Correct Predictions}} {\text{Total Number Of Examples}} }[/math]
In binary classification, accuracy has the following definition:
[math]\displaystyle{ \text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{Total Number Of Examples}} }[/math]
See true positive and true negative.
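The multi-class definition quoted above (correct predictions over the total number of examples) can be sketched as follows; the function name and label values are illustrative, not from the glossary.

```python
from typing import Hashable, Sequence

def multiclass_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true labels, for any number of classes."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("Label sequences must be non-empty and of equal length.")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Illustrative labels only: three of the four predictions are correct.
print(multiclass_accuracy(["cat", "dog", "cat", "bird"],
                          ["cat", "dog", "bird", "bird"]))  # 0.75
```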
2017
- (Sammut & Webb, 2017) ⇒ Claude Sammut, and Geoffrey I. Webb (eds.). (2017). “Accuracy.” In: Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA.
- QUOTE: Accuracy refers to a measure of the degree to which the predictions of a model matches the reality being modeled. The term accuracy is often applied in the context of classification models. In this context, [math]\displaystyle{ accuracy = P(\lambda(X) = Y ) }[/math], where [math]\displaystyle{ XY }[/math] is a joint distribution and the classification model [math]\displaystyle{ \lambda }[/math] is a function [math]\displaystyle{ X \rightarrow Y }[/math]. Sometimes, this quantity is expressed as a percentage rather than a value between 0.0 and 1.0.
The accuracy of a model is often assessed or estimated by applying it to test data for which the labels ([math]\displaystyle{ Y }[/math] values) are known. The accuracy of a classifier on test data may be calculated as number of correctly classified objects/total number of objects. Alternatively, a smoothing function may be applied, such as a Laplace estimate or an m-estimate.
Accuracy is directly related to error rate, such that [math]\displaystyle{ accuracy = 1.0 – error\; rate }[/math] (or when expressed as a percentage, [math]\displaystyle{ accuracy = 100 – error\; rate }[/math]).
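A minimal sketch of the test-set estimate and the accuracy = 1.0 − error rate relation described above. The Laplace-smoothed variant uses the common add-one form (correct + 1) / (n + 2), which the entry does not spell out, so treat that formula as an assumption.

```python
def test_set_accuracy(correct: int, n: int, laplace: bool = False) -> float:
    """Accuracy as correctly classified objects / total objects on a test set.

    laplace=True applies add-one smoothing, (correct + 1) / (n + 2); this is
    one common reading of the Laplace estimate, assumed here for illustration.
    """
    if n <= 0:
        raise ValueError("Need a non-empty test set.")
    return (correct + 1) / (n + 2) if laplace else correct / n

accuracy = test_set_accuracy(correct=90, n=100)
error_rate = 1.0 - accuracy                      # accuracy = 1.0 - error rate
print(round(accuracy, 3), round(error_rate, 3))  # 0.9 0.1
print(round(test_set_accuracy(90, 100, laplace=True), 3))  # 0.892
```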
2002
- (Melli, 2002) ⇒ Gabor Melli. (2002). “PredictionWorks' Data Mining Glossary.”
- Accuracy: The measure of a model's ability to correctly label a previously unseen test case. If the label is categorical (classification), accuracy is commonly reported as the rate which a case will be labeled with the right category. For example, a model may be said to predict whether a customer responds to a promotional campaign with 85.5% accuracy. If the label is continuous, accuracy is commonly reported as the average distance between the predicted label and the correct value. For example, a model may be said to predict the amount a customer will spend on a given month within $55. See also Accuracy Estimation, Classification, Estimation, Model, and Statistical Significance.
- Accuracy Estimation: The use of a validation process to approximate the true value of a model's accuracy based on a data sample. See also Accuracy, RMS, Resampling Techniques and Validation.
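The categorical reading in the Melli (2002) entries (the rate of correctly labeled cases) matches the accuracy sketches earlier on this page; for the continuous reading (average distance between predicted and correct values), a minimal sketch with invented spend figures follows.

```python
def mean_absolute_distance(y_true, y_pred) -> float:
    """Average absolute distance between predicted and correct continuous values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("Value sequences must be non-empty and of equal length.")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented monthly-spend predictions (dollars), reported as "within $25 on average".
print(mean_absolute_distance([120.0, 80.0], [150.0, 60.0]))  # 25.0
```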
1998
- (Kohavi & Provost, 1998) ⇒ Ron Kohavi, and Foster Provost. (1998). “Glossary of Terms.” In: Machine Learning, 30(2-3).
- Accuracy (error rate): The rate of correct (incorrect) predictions made by the model over a data set (cf. coverage). Accuracy is usually estimated by using an independent test set that was not used at any time during the learning process. More complex accuracy estimation techniques, such as cross-validation and the bootstrap, are commonly used, especially with data sets containing a small number of instances.
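A hedged sketch of one common bootstrap-style accuracy estimate (average accuracy on out-of-bag instances over resamples). The resampling variant and the train_fn interface, which is assumed to return a predict(x) callable, are illustrative choices rather than a procedure taken from the glossary.

```python
import random
from typing import Callable, Hashable, Sequence

def bootstrap_accuracy(xs: Sequence, ys: Sequence[Hashable],
                       train_fn: Callable[[list, list], Callable],
                       n_resamples: int = 50, seed: int = 0) -> float:
    """Average accuracy on out-of-bag instances over bootstrap resamples.

    train_fn(train_xs, train_ys) is assumed to return a predict(x) callable.
    """
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        boot_idx = [rng.randrange(n) for _ in range(n)]   # sample indices with replacement
        in_bag = set(boot_idx)
        oob_idx = [i for i in range(n) if i not in in_bag]
        if not oob_idx:                                   # rare unless n is tiny
            continue
        predict = train_fn([xs[i] for i in boot_idx], [ys[i] for i in boot_idx])
        correct = sum(predict(xs[i]) == ys[i] for i in oob_idx)
        estimates.append(correct / len(oob_idx))
    if not estimates:
        raise ValueError("No out-of-bag instances; dataset too small.")
    return sum(estimates) / len(estimates)
```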
1995
- (Kohavi, 1995) ⇒ Ron Kohavi. (1995). “A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection.” In: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI 1995).
- ABSTRACT: We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment --- over half a million runs of the C4.5 algorithm and a Naive-Bayes algorithm --- to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
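A minimal sketch of k-fold, optionally stratified, cross-validation accuracy estimation in the spirit of the comparison above. The fold-assignment scheme and the train_fn interface (returning a predict(x) callable) are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence

def cross_val_accuracy(xs: Sequence, ys: Sequence[Hashable],
                       train_fn: Callable[[list, list], Callable],
                       k: int = 10, stratified: bool = True, seed: int = 0) -> float:
    """Estimate a learner's accuracy by averaging held-out accuracy over k folds."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    if stratified:
        # Deal each class's (shuffled) indices round-robin so folds keep similar class ratios.
        by_class = defaultdict(list)
        for i, y in enumerate(ys):
            by_class[y].append(i)
        j = 0
        for cls_indices in by_class.values():
            rng.shuffle(cls_indices)
            for i in cls_indices:
                folds[j % k].append(i)
                j += 1
    else:
        order = list(range(len(xs)))
        rng.shuffle(order)
        for j, i in enumerate(order):
            folds[j % k].append(i)

    fold_accuracies = []
    for test_idx in (f for f in folds if f):              # skip empty folds when k > n
        test_set = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in test_set]
        predict = train_fn([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct = sum(predict(xs[i]) == ys[i] for i in test_idx)
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / len(fold_accuracies)

# Usage with a trivial majority-class learner (illustrative only).
def train_majority(train_xs, train_ys):
    majority = max(set(train_ys), key=train_ys.count)
    return lambda x: majority

xs = list(range(20))
ys = [0] * 12 + [1] * 8
print(cross_val_accuracy(xs, ys, train_majority, k=10))   # about 0.6 on this toy data
```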