2004 TheoreticalComparBetGiniAndIGain
- (Raileanu & Stoffel, 2004) ⇒ Laura Elena Raileanu, and Kilian Stoffel. (2004). “Theoretical Comparison between the Gini Index and Information Gain Criteria.” In: Annals of Mathematics and Artificial Intelligence, 41(1). doi:10.1023/B:AMAI.0000018580.96245.c6
Subject Headings: Impurity Function, Gini Impurity Index, Information Gain Criteria.
Notes
Cited By
Quotes
Abstract
Knowledge Discovery in Databases (KDD) is an active and important research area with the promise for a high payoff in many business and scientific applications. One of the main tasks in KDD is classification. A particular efficient method for classification is decision tree induction. The selection of the attribute used at each node of the tree to split the data (split criterion) is crucial in order to correctly classify objects. Different split criteria were proposed in the literature (Information Gain, Gini Index, etc.). It is not obvious which of them will produce the best decision tree for a given data set. A large amount of empirical tests were conducted in order to answer this question. No conclusive results were found. In this paper we introduce a formal methodology, which allows us to compare multiple split criteria. This permits us to present fundamental insights into the decision process. Furthermore, we are able to present a formal description of how to select between split criteria for a given data set. As an illustration we apply the methodology to two widely used split criteria: Gini Index and Information Gain.
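The two split criteria named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of the Gini index and of entropy-based information gain, not the paper's formal methodology. The function names (`gini`, `entropy`, `split_gain`) and the toy label lists are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini index of a label list: 1 - sum(p_i^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (base 2) of a label list: -sum(p_i * log2(p_i))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children.

    With impurity=entropy this is Information Gain; with impurity=gini it is
    the reduction in Gini index (often called Gini gain).
    """
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


# Hypothetical toy data: 10 objects, two classes, two candidate splits.
parent = ["yes"] * 5 + ["no"] * 5
split_a = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]
split_b = [["yes"] * 5 + ["no"] * 3, ["no"] * 2]

for name, split in (("A", split_a), ("B", split_b)):
    print(name,
          "gini gain:", round(split_gain(parent, split, gini), 4),
          "information gain:", round(split_gain(parent, split, entropy), 4))
```

On this toy data both criteria rank split A above split B. Whether the two criteria agree in general is exactly the kind of question the paper's comparison addresses, so a single example like this one should not be read as a general result.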
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2004 TheoreticalComparBetGiniAndIGain | Laura Elena Raileanu, Kilian Stoffel | 41(1) | | Theoretical Comparison between the Gini Index and Information Gain Criteria | | Annals of Mathematics and Artificial Intelligence | http://www2.unine.ch/files/content/sites/imi/files/shared/documents/papers/Gini index fulltext.pdf | 10.1023/B:AMAI.0000018580.96245.c6 | | 2004 |