Supervised Multilabel Text Classification Task
A Supervised Multilabel Text Classification Task is a text classification task that is a supervised multilabel classification task.
- AKA: Multi-Label Text Categorization.
- Context:
- It can be solved by a Supervised Multilabel Text Classification Algorithm.
- Example(s):
- Each document may belong to several predefined document topics, such as "Health", "Sports", "Local Politics", "National Politics", "Global Politics".
- …
- Counter-Example(s):
- See: Multilabel Text Classification Algorithm, Unilabel Text Classification Task.
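As a minimal sketch of the example above, a document's set of topics can be encoded as a binary indicator vector over the predefined topic list. The helper name `to_indicator` and the sample label set are hypothetical and only illustrate the representation.

```python
# Predefined topics, taken from the example above.
TOPICS = ["Health", "Sports", "Local Politics", "National Politics", "Global Politics"]


def to_indicator(labels, topics=TOPICS):
    """Encode a set of topic labels as a 0/1 vector aligned with `topics`.

    `to_indicator` is a hypothetical helper used only for illustration.
    """
    unknown = set(labels) - set(topics)
    if unknown:
        raise ValueError(f"labels outside the predefined topics: {sorted(unknown)}")
    return [1 if topic in labels else 0 for topic in topics]


# A document that belongs to two topics at once (hypothetical labels).
doc_labels = {"Health", "National Politics"}
vector = to_indicator(doc_labels)
```

A unilabel task would always produce exactly one 1 in this vector; a multilabel task allows any number of them.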
References
2011
- (Bekkerman & Gavish, 2011) ⇒ Ron Bekkerman, and Matan Gavish. (2011). “High-precision Phrase-based Document Classification on a Modern Scale.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). doi:10.1145/2020408.2020449
- QUOTE: The problem of multilabel text classification is defined as follows. Each document from an unlabeled corpus [math]D[/math] is to be categorized into one or more classes from a class set [math]C[/math]. More formally, a rule [math]L : D \rightarrow 2^C[/math] is to be learned from training data. Also available is a labeled test set [math]D_{test} = \{( d_i, C^*_i )\}[/math], where [math]\vert D_{test}\vert \ll \vert D \vert[/math] and each [math]C^*_i \subset C[/math] is a set of [math]d_i[/math]’s ground truth classes. Performance of the classification rule [math]L[/math] is evaluated on the test set by comparing each [math]L(d_i)[/math] with [math]C^*_i[/math].
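The evaluation step in the quote above can be sketched as follows. The quote only says that each predicted label set is compared with the ground truth set; the specific measures used here (micro-averaged precision and recall, plus exact-match rate) are an assumption, as are the toy `keyword_rule` and the three test documents.

```python
def evaluate(rule, test_set):
    """Compare rule(d_i) with the ground truth set C*_i for every test document.

    The choice of micro-averaged precision/recall and exact-match rate is an
    assumption; the quoted definition does not fix a particular measure.
    """
    tp = fp = fn = exact = 0
    for doc, truth in test_set:
        predicted = set(rule(doc))
        truth = set(truth)
        tp += len(predicted & truth)   # labels predicted and correct
        fp += len(predicted - truth)   # labels predicted but wrong
        fn += len(truth - predicted)   # labels missed
        exact += predicted == truth    # whole label set matches
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "exact_match": exact / len(test_set) if test_set else 0.0,
    }


def keyword_rule(doc):
    """Hypothetical rule L: assign a class when its keyword occurs in the text."""
    keywords = {"Health": "vaccine", "Sports": "match"}
    text = doc.lower()
    return {label for label, word in keywords.items() if word in text}


# Hypothetical labeled test set of (document, ground truth classes) pairs.
test_set = [
    ("Vaccine rollout delays the match", {"Health", "Sports"}),
    ("Match report from the final", {"Sports"}),
    ("New vaccine trial results", {"Health", "Sports"}),
]
scores = evaluate(keyword_rule, test_set)
```

Because the rule returns a set rather than a single class, a prediction can be partly right, which is why set-overlap counts are used instead of plain accuracy.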
2006
- (Zhang & Zhou, 2006) ⇒ Min-Ling Zhang, and Zhi-Hua Zhou. (2006). “Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization.” In: IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 10, pp. 1338-1351, October 2006. doi:10.1109/TKDE.2006.162
- QUOTE: Applications to two real-world multilabel learning problems, i.e., functional genomics and text categorization, show that the performance of BP-MLL is superior to that of some well-established multilabel learning algorithms.
1999
- (McCallum, 1999) ⇒ Andrew McCallum. (1999). “Multi-label Text Classification with a Mixture Model Trained by EM.” In: AAAI 99 Workshop on Text Learning.
- QUOTE: In many important document classification tasks, documents may each be associated with multiple class labels. ... Text classification is the problem of assigning a text document into one or more topic categories or classes. In multiclass document classification, as distinguished from binary document classification, there are more than two classes. In multi-label classification each document may have more than one class label. For example, given classes N. America, S. America, Europe, Asia and Australia, a news article about U.S. troops in Bosnia may be labeled with both the N. America and Europe classes.
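The distinction drawn in the quote above can be illustrated with a small sketch: given per-class scores for one document, a single-label decision keeps only the best class, while a multi-label decision may keep several. The scores, the 0.5 threshold, and the thresholding strategy itself are assumptions for illustration and are not McCallum's mixture model.

```python
# Classes taken from the quoted example.
CLASSES = ["N. America", "S. America", "Europe", "Asia", "Australia"]


def single_label(scores):
    """Single-label decision: return the one highest-scoring class."""
    return max(scores, key=scores.get)


def multi_label(scores, threshold=0.5):
    """Multi-label decision: return every class scoring at or above the threshold.

    Thresholding is an assumed strategy, chosen only to show the contrast.
    """
    return {cls for cls, score in scores.items() if score >= threshold}


# Hypothetical scores for a news article about U.S. troops in Bosnia.
scores = {
    "N. America": 0.81,
    "S. America": 0.05,
    "Europe": 0.77,
    "Asia": 0.10,
    "Australia": 0.02,
}
one_class = single_label(scores)
label_set = multi_label(scores)
```

Under these assumed scores the single-label decision has to drop Europe, while the multi-label decision keeps both N. America and Europe, matching the labeling described in the quote.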