Testing Record Set
(Redirected from Evaluation set)
Jump to navigation
Jump to search
A Testing Record Set is a dataset used for evaluations by a learning system.
- AKA: Unseen Test Data, Test Dataset, Holdout Set.
- Context:
- It can range from being an Unlabeled Testing Dataset to being a Labeled Testing Dataset.
- It can range from being a Numerical Test dataset to being a Categorical Test Dataset.
- It can be an Input to a Supervised Learning Task to Test the Performance of a Predictive Function; but it must not be seen by the Supervised Learning Algorithm during training.
- …
- Counter-Example(s):
- See: Test Set, Evaluation Data.
References
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Training,_test,_and_validation_sets#Test_dataset Retrieved:2018-4-8.
- A test dataset is a dataset that is independent of the training dataset, but that follows the same probability distribution as the training dataset. If a model fit to the training dataset also fits the test dataset well, minimal overfitting has taken place (see figure below). A better fitting of the training dataset as opposed to the test dataset usually points to overfitting.
A test set is therefore a set of examples used only to assess the performance (i.e. generalization) of a fully specified classifier.
- A test dataset is a dataset that is independent of the training dataset, but that follows the same probability distribution as the training dataset. If a model fit to the training dataset also fits the test dataset well, minimal overfitting has taken place (see figure below). A better fitting of the training dataset as opposed to the test dataset usually points to overfitting.
2017
- (Sammut & Webb, 2017) ⇒ (2017) Test Set. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA
- QUOTE: A test set is a data set containing data that are used for evaluation by a learning system. Where the training set and the test set contain disjoint sets of data, the test set is known as a holdout set.
2009
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Test_set
- In machine learning and genetic programming it is common to train the system using known examples (the training set). In GP the set of examples used to determine performance is commonly called the test set. A program's fitness is often based solely upon its performance on the test set. A common performance measure is the number of examples it gets right. Each (approximately) correct answer is known as a hit.
- (Chen et al., 2009) ⇒ Bo Chen, Wai Lam, Ivor Tsang, and Tak-Lam Wong. (2009). “Extracting Discrimininative Concepts for Domain Adaptation in Text Mining.” In: Proceedings of ACM SIGKDD Conference (KDD-2009). doi:10.1145/1557019.1557045
- One common predictive modeling challenge occurs in text mining problems is that the training data and the operational (testing) data are drawn from different underlying distributions.
2008
- (Wick et al., 2008) ⇒ Michael Wick, Khashayar Rohanimanesh, Karl Schultz, and Andrew McCallum. (2008). “A Unified Approach for Schema Matching, Coreference, and Canonicalization.” In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008).
- … In the training data we used the first two schemas and in the testing data we used one of the schemas from training, and also the third schema. This way we train a model on one schema but test it on another schema. ... The testing data was created similarly, but for a different set of data records.