2008 PartitionedLogisticRegressionfo
- (Chang et al., 2008) ⇒ Ming-wei Chang, Wen-tau Yih, and Christopher Meek. (2008). “Partitioned Logistic Regression for Spam Filtering.” In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008). doi:10.1145/1401890.1401907
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%22Partitioned+logistic+regression+for+spam+filtering%22+2008
- http://portal.acm.org/citation.cfm?doid=1401890.1401907&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Naive Bayes and logistic regression perform well in different regimes. While the former is a very simple generative model which is efficient to train and performs well empirically in many applications, the latter is a discriminative model which often achieves better accuracy and can be shown to outperform naive Bayes asymptotically. In this paper, we propose a novel hybrid model, partitioned logistic regression, which has several advantages over both naive Bayes and logistic regression. This model separates the original feature space into several disjoint feature groups. Individual models on these groups of features are learned using logistic regression and their predictions are combined using the naive Bayes principle to produce a robust final estimation. We show that our model is better both theoretically and empirically. In addition, when applying it in a practical application, email spam filtering, it improves the normalized AUC score at 10% false-positive rate by 28.8% and 23.6% compared to naive Bayes and logistic regression, when using the exact same training examples.
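The procedure outlined in the abstract (split the features into disjoint groups, fit one logistic regression per group, combine the per-group predictions with the naive Bayes principle) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one common reading of the combination step: if the feature groups are conditionally independent given the class, the combined log-odds equal the sum of the per-group log-odds minus (K - 1) times the prior log-odds, where K is the number of groups. The function names, the gradient-descent trainer, its hyperparameters, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, epochs=300, lr=0.5, l2=1e-3):
    """Fit an L2-regularized logistic regression by batch gradient descent.

    Returns (weights, bias). Hyperparameters are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def train_partitioned(X, y, groups):
    """Train one logistic regression per disjoint group of feature indices."""
    prior = sum(y) / len(y)  # assumes both classes occur in y
    models = [train_logreg([[row[j] for j in g] for row in X], y) for g in groups]
    return {
        "groups": groups,
        "models": models,
        "prior_logit": math.log(prior / (1.0 - prior)),
    }

def predict_proba(model, x):
    """Combine per-group log-odds under the conditional independence assumption:
    logit P(y=1|x) = sum_k logit P(y=1|x_k) - (K - 1) * logit P(y=1).
    """
    total = 0.0
    for g, (w, b) in zip(model["groups"], model["models"]):
        total += sum(wj * x[j] for wj, j in zip(w, g)) + b
    k = len(model["groups"])
    return sigmoid(total - (k - 1) * model["prior_logit"])

# Synthetic demo data (an assumption, not from the paper): four noisy features,
# each centered at +1 for the positive class and -1 for the negative class.
random.seed(0)
y = [random.randint(0, 1) for _ in range(200)]
X = [[(2 * yi - 1) + random.gauss(0.0, 1.0) for _ in range(4)] for yi in y]

model = train_partitioned(X, y, groups=[[0, 1], [2, 3]])
probs = [predict_proba(model, xi) for xi in X]
accuracy = sum((p >= 0.5) == (yi == 1) for p, yi in zip(probs, y)) / len(y)
print("training accuracy:", accuracy)
```

In this sketch the per-group models never see each other's features, so each can be trained independently. The prior correction term matters when the classes are imbalanced, since otherwise the class prior would be counted once per group.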
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2008 PartitionedLogisticRegressionfo | Wen-tau Yih, Ming-wei Chang, Christopher Meek | | | Partitioned Logistic Regression for Spam Filtering | | KDD-2008 Proceedings | | 10.1145/1401890.1401907 | | 2008 |