2008 KnowledgeDiscoveryofSemanticRel
- (Sato et al., 2008) ⇒ Issei Sato, Minoru Yoshida, and Hiroshi Nakagawa. (2008). “Knowledge Discovery of Semantic Relationships Between Words Using Nonparametric Bayesian Graph Model.” In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008). doi:10.1145/1401890.1401962
Subject Heading(s): Word Similarity Learning Task, Graph Clustering Algorithm
Notes
- Paper Summary
- The paper proposes a Probabilistic Generative Graph Clustering Algorithm that discovers clusters of nodes/vertices within a Disassortative Graph and applies the algorithm to the Word Similarity Learning Task.
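- The algorithm is based on a nonparametric Bayesian generative model of the graph, built from the hierarchical Dirichlet process and the Pitman-Yor process (per the abstract).
- The proposed algorithm differs from previous algorithms in its ability to detect soft/inclusive clusters, and possibly in not requiring the number of clusters to be specified in advance (to be verified).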
- The proposed algorithm is applied to the task of grouping nouns with similar meaning and verbs with similar meaning.
- The proposed algorithm is empirically tested on its ability to find words that WordNet considers to be similar.
- In general the proposed algorithm performed better than the baseline algorithms.
- Questions
- Define an Assortative Graph.
- It'd be good to check in with the graph-mining community for other relevant references.
- It'd be good to check in with the NLP community for other relevant references.
- Could it discover WordNet synsets?
- How could this information be used for other semantic analysis?
- If the threshold(?) were high enough then it could discover synonyms.
- It could discover words with more than one meaning (homonyms or polysemous words).
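On the assortative/disassortative question: in an assortative graph, edges mostly connect nodes in the same group; in a disassortative graph, edges mostly connect nodes in different groups. A noun-verb graph built from subject-predicate pairs is an extreme case, since every edge joins a noun to a verb. The sketch below is only an illustration of that idea, not the paper's method; the word pairs and the helper names (`build_graph`, `same_group_edge_fraction`, `shared_neighbours`) are made up for this note.

```python
from collections import defaultdict

# Hypothetical toy (subject noun, predicate verb) pairs; not taken from the paper's corpus.
pairs = [
    ("dog", "bark"), ("dog", "run"), ("cat", "run"),
    ("cat", "sleep"), ("car", "run"), ("car", "stop"),
]

def build_graph(pairs):
    """Return adjacency sets for an undirected noun-verb graph.

    Nodes are (part_of_speech, word) tuples so a word form used both
    as a noun and as a verb stays two distinct nodes.
    """
    adj = defaultdict(set)
    for noun, verb in pairs:
        adj[("noun", noun)].add(("verb", verb))
        adj[("verb", verb)].add(("noun", noun))
    return adj

def same_group_edge_fraction(adj):
    """Fraction of edges whose two endpoints share a part-of-speech label.

    Values near 1 indicate an assortative graph; values near 0 indicate
    a disassortative one.
    """
    edges = {frozenset((u, v)) for u, nbrs in adj.items() for v in nbrs}
    same = sum(1 for edge in edges if len({node[0] for node in edge}) == 1)
    return same / len(edges)

def shared_neighbours(adj, a, b):
    """Nodes adjacent to both a and b (e.g. verbs that two nouns share)."""
    return adj[a] & adj[b]

graph = build_graph(pairs)
print(same_group_edge_fraction(graph))  # 0.0: no noun-noun or verb-verb edges
print(shared_neighbours(graph, ("noun", "dog"), ("noun", "cat")))  # {('verb', 'run')}
```

Because nouns never link to each other directly, grouping similar nouns has to rely on indirect evidence such as shared verbs, which is why clustering methods that assume dense within-cluster links are a poor fit for this kind of graph.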
Cited By
- http://scholar.google.com/scholar?q=%22Knowledge+discovery+of+semantic+relationships+between+words+using+nonparametric+bayesian+graph+model%22+2008
- http://portal.acm.org/citation.cfm?doid=1401890.1401962&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
We developed a model based on nonparametric Bayesian modeling for automatic discovery of semantic relationships between words taken from a corpus. It is aimed at discovering semantic knowledge about words in particular domains, which has become increasingly important with the growing use of text mining, information retrieval, and speech recognition. The subject-predicate structure is taken as a syntactic structure with the noun as the subject and the verb as the predicate. This structure is regarded as a graph structure. The generation of this graph can be modeled using the hierarchical Dirichlet process and the Pitman-Yor process. The probabilistic generative model we developed for this graph structure consists of subject-predicate structures extracted from a corpus. Evaluation of this model by measuring the performance of graph clustering based on WordNet similarities demonstrated that it outperforms other baseline models.
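The abstract names the Pitman-Yor process as one building block. As background only, the following is a minimal sketch of the standard Chinese restaurant representation of a Pitman-Yor process, not the paper's graph model: with n customers already seated at K tables, the next customer joins table k with probability (n_k - d) / (n + θ) and opens a new table with probability (θ + K·d) / (n + θ). The function name `pitman_yor_crp` and the parameter names `discount` (d) and `concentration` (θ) are choices made for this note.

```python
import random

def pitman_yor_crp(n_customers, discount, concentration, seed=0):
    """Sample table sizes from the Chinese restaurant view of a Pitman-Yor process.

    discount (d) must satisfy 0 <= d < 1 and concentration (theta) must
    satisfy theta > -d. Setting d = 0 gives the Dirichlet process case.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must be greater than -discount")

    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers seated at table k
    for _ in range(n_customers):
        if not tables:
            # The first customer always opens the first table.
            tables.append(1)
            continue
        # Unnormalised weights: (n_k - d) per existing table,
        # (theta + K * d) for a new table.
        weights = [size - discount for size in tables]
        weights.append(concentration + discount * len(tables))
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)
        else:
            tables[k] += 1
    return tables

sizes = pitman_yor_crp(200, discount=0.5, concentration=1.0, seed=1)
print(len(sizes), "tables; largest:", sorted(sizes, reverse=True)[:5])
```

The number of tables is not fixed in advance and grows with the data, which is the property that makes such priors attractive when the number of word clusters is unknown.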
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2008 KnowledgeDiscoveryofSemanticRel | Hiroshi Nakagawa, Issei Sato, Minoru Yoshida | | | Knowledge Discovery of Semantic Relationships Between Words Using Nonparametric Bayesian Graph Model | | KDD-2008 Proceedings | http://www.r.dl.itc.u-tokyo.ac.jp/~nakagawa/academic-res/KDD2008.pdf | 10.1145/1401890.1401962 | | 2008 |