2012 AutomaticTaxonomyConstructionfr
- (Liu et al., 2012) ⇒ Xueqing Liu, Yangqiu Song, Shixia Liu, and Haixun Wang. (2012). “Automatic Taxonomy Construction from Keywords.” In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2012). ISBN:978-1-4503-1462-6 doi:10.1145/2339530.2339754
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222012%22+Automatic+Taxonomy+Construction+from+Keywords
- http://dl.acm.org/citation.cfm?id=2339530.2339754&preflayout=flat#citedby
Quotes
Author Keywords
- Bayesian rose tree; data mining; hierarchical clustering; knowledgebase; learning; nonparametric statistics; query understanding; taxonomy building
Abstract
Taxonomies, especially the ones in specific domains, are becoming indispensable to a growing number of applications. State-of-the-art approaches assume there exists a text corpus to accurately characterize the domain of interest, and that a taxonomy can be derived from the text corpus using information extraction techniques. In reality, neither assumption is valid, especially for highly focused or fast-changing domains. In this paper, we study a challenging problem: Deriving a taxonomy from a set of keyword phrases. A solution can benefit many real life applications because i) keywords give users the flexibility and ease to characterize a specific domain; and ii) in many applications, such as online advertisements, the domain of interest is already represented by a set of keywords. However, it is impossible to create a taxonomy out of a keyword set itself. We argue that additional knowledge and contexts are needed. To this end, we first use a general purpose knowledgebase and keyword search to supply the required knowledge and context. Then we develop a Bayesian approach to build a hierarchical taxonomy for a given set of keywords. We reduce the complexity of previous hierarchical clustering approaches from O(n² log n) to O(n log n), so that we can derive a domain specific taxonomy from one million keyword phrases in an hour. Finally, we conduct comprehensive large scale experiments to show the effectiveness and efficiency of our approach. A real life example of building an insurance-related query taxonomy illustrates the usefulness of our approach for specific domains.
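The abstract's pipeline (attach context to each keyword, then cluster the keywords hierarchically) can be illustrated with a rough sketch. This is not the paper's Bayesian rose tree algorithm: it swaps in plain greedy agglomerative clustering over cosine similarity, produces a binary tree rather than a multi-branch one, and runs in roughly cubic time rather than O(n log n). The function names and the toy context terms below are hypothetical and stand in for the knowledgebase concepts and search-result context the abstract mentions.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_taxonomy(contexts):
    """Greedy agglomerative clustering (illustrative stand-in only).

    contexts: dict mapping keyword -> list of context terms (toy data,
    assumed here; the paper derives context from a knowledgebase and search).
    Returns a nested tuple whose leaves are the keywords.
    """
    # Each cluster is a (tree, term-count vector) pair.
    clusters = [(kw, Counter(terms)) for kw, terms in contexts.items()]
    while len(clusters) > 1:
        # Naive scan over all pairs for the most similar two clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(clusters[i][1], clusters[j][1])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        # Merge the pair: combine subtrees and add their term counts.
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical context terms for four insurance-related keywords.
contexts = {
    "auto insurance": ["insurance", "car", "policy"],
    "car insurance quote": ["insurance", "car", "quote"],
    "health insurance": ["insurance", "medical", "policy"],
    "dental plan": ["medical", "dental", "plan"],
}
tree = build_taxonomy(contexts)
print(tree)
```

On this toy input the two car-related keywords merge first, then "health insurance" joins them, and "dental plan" attaches last. The all-pairs scan is what makes the naive approach expensive at scale, which is the cost the abstract says the paper reduces.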
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2012 AutomaticTaxonomyConstructionfr | Haixun Wang, Shixia Liu, Yangqiu Song, Xueqing Liu | | | Automatic Taxonomy Construction from Keywords | | | | 10.1145/2339530.2339754 | | 2012 |