Skewed Training Dataset
A Skewed Training Dataset is a training record set whose target attribute has a Skewed Distribution.
- AKA: Imbalanced Dataset.
- Context:
- It can require the use of an Imbalanced Supervised Classification Algorithm.
- Example(s):
- Counter-Example(s):
- See: Continuous Variable, Small Training Set, Imbalanced Classification Task.
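For a categorical target attribute, the degree of skew can be checked before an algorithm is chosen. The following is a minimal sketch, not a definitive implementation: the function names and the 95/5 toy labels are hypothetical, and the inverse-frequency weighting shown is only one common heuristic for compensating imbalance, not a method prescribed by the sources cited on this page.

```python
from collections import Counter


def class_distribution(labels):
    """Return the relative frequency of each target class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def imbalance_ratio(labels):
    """Return majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this is one common reweighting heuristic; rarer classes
    receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical toy target column: 95 negative records, 5 positive records.
labels = ["negative"] * 95 + ["positive"] * 5

distribution = class_distribution(labels)
ratio = imbalance_ratio(labels)
weights = balanced_class_weights(labels)
```

A ratio near 1 indicates a roughly balanced target attribute. Larger ratios suggest that an imbalance-aware classification algorithm, resampling, or class weighting may be worth considering.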
References
2006
- http://www.maassmedia.com/blog/three-statistical-concepts-all-marketing-professionals-should-know-and-use/
- QUOTE: In the image below, three different distributions are presented: symmetric, right-skewed, and left-skewed. A symmetric distribution has the same mean and median. A right skewed distribution has a greater volume of data toward the left (skewed right refers to the direction of the tail), so the mean is greater than the median. The opposite is true of a left skewed distribution. ...
2004
- (Chawla et al., 2004) ⇒ Nitesh Chawla, Nathalie Japkowicz, Aleksander Kolcz. (2004). “Editorial: Special issue on learning from imbalanced data sets.” In: ACM SIGKDD Explorations Newsletter, 6(1). doi:10.1145/1007730.1007733
- (Wu & Chang, 2004) ⇒ Gang Wu, and Edward Y. Chang. (2004). “Aligning Boundary in Kernel Space for Learning Imbalanced Dataset.” In: Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM 2004). doi:10.1109/ICDM.2004.10106
1999
- (Morik et al., 1999) ⇒ Katharina Morik, Peter Brockhausen, and Thorsten Joachims. (1999). “Combining Statistical Learning with a Knowledge-based Approach - A Case Study in Intensive Care Monitoring.” In: Proceedings of the Sixteenth International Conference on Machine Learning (ICML 1999).