1999 UsingMaximumEntropyforTextClass


Subject Headings: Maximum Entropy Algorithm; Text Classification Algorithm.

Notes

Cited By

Quotes

Abstract

This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle of maximum entropy is that, without external knowledge, one should prefer distributions that are uniform. Constraints on the distribution, derived from labeled training data, inform the technique where to be minimally non-uniform. The maximum entropy formulation has a unique solution, which can be found by the improved iterative scaling algorithm. In this paper, maximum entropy is used for text classification by estimating the conditional distribution of the class variable given the document. In experiments on several text datasets, we compare accuracy to naive Bayes and show that maximum entropy is sometimes significantly better, but also sometimes worse. Much future work remains, but the results indicate that maximum entropy is a promising technique for text classification.
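To make the approach concrete, below is a minimal sketch of a maximum entropy text classifier in the spirit of the abstract: the conditional distribution P(c|d) of the class given the document is modeled as a log-linear distribution over (word, class) features, with weights fit to labeled training data. The paper trains with improved iterative scaling; this sketch substitutes plain gradient ascent on the conditional log-likelihood, which converges to the same unique solution. The names used here (train_maxent, predict, features) are illustrative, not from the paper.

```python
# Minimal maximum entropy text classifier: P(c|d) ∝ exp(sum_i lambda_i * f_i(d, c)).
# Assumption: the paper uses improved iterative scaling; this sketch uses
# gradient ascent, which reaches the same unique optimum of the concave
# conditional log-likelihood.
import math
from collections import defaultdict

def features(doc, label):
    """Binary word-presence features f_i(d, c), keyed by (word, class)."""
    return {(word, label): 1.0 for word in set(doc.split())}

def predict(weights, doc, labels):
    """Return P(c | d) for each class c under the current weights."""
    scores = {c: sum(weights[f] * v for f, v in features(doc, c).items())
              for c in labels}
    z = sum(math.exp(s) for s in scores.values())   # partition function Z(d)
    return {c: math.exp(s) / z for c, s in scores.items()}

def train_maxent(data, labels, iters=100, lr=0.1):
    """Gradient ascent on the conditional log-likelihood of the labeled data."""
    weights = defaultdict(float)
    for _ in range(iters):
        grad = defaultdict(float)
        for doc, gold in data:
            probs = predict(weights, doc, labels)
            # Gradient = observed feature values minus model-expected values.
            for f, v in features(doc, gold).items():
                grad[f] += v
            for c in labels:
                for f, v in features(doc, c).items():
                    grad[f] -= probs[c] * v
        for f, g in grad.items():
            weights[f] += lr * g
    return weights

# Toy usage with two classes.
data = [("the team won the game", "sports"),
        ("the election results are in", "politics"),
        ("a great goal in the match", "sports"),
        ("the senate passed the bill", "politics")]
labels = ["sports", "politics"]
w = train_maxent(data, labels)
print(predict(w, "the team scored a goal", labels))
```

Because the conditional log-likelihood is concave, gradient ascent and improved iterative scaling reach the same unique solution mentioned in the abstract; the choice between them affects convergence speed, not the fitted model.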



References

(Nigam et al., 1999) ⇒ Kamal Nigam, John D. Lafferty, and Andrew McCallum. (1999). "Using Maximum Entropy for Text Classification." In: Proceedings of the IJCAI-99 Workshop on Machine Learning for Information Filtering.