Neural Semantic Indexing Algorithm
A Neural Semantic Indexing Algorithm is a semantic indexing algorithm that is based on a neural network model.
- Context:
- It can be implemented by a Neural Semantic Indexing System to solve a Neural Semantic Indexing Task.
- It can use techniques like word embeddings, hierarchical classification, and multi-label classification to capture the nuances of language and the relationships between different terms.
- It can (often) be applied in information retrieval, document classification, and content analysis tasks to improve the accuracy of search results by understanding the semantic context of words and phrases.
- ...
- Example(s):
- Counter-Example(s):
- See: Semantic Analysis, Vectorial Semantics, Artificial Neural Network, Synonymy, Polysemy, Word Hashing, Neural Natural Language Processing System.
References
2016
- (Yan et al., 2016) ⇒ Yan Yan, Xu-Cheng Yin, Bo-Wen Zhang, Chun Yang, and Hong-Wei Hao. (2016). “Semantic Indexing with Deep Learning: A Case Study.” In: Big Data Analytics Journal, 1(7). doi:10.1186/s41044-016-0007-z
- QUOTE: Considering the numerous classes of the documents and the uneven distribution of samples, we introduce a hierarchical CNN-based framework (HC) to conduct biomedical document semantic indexing for both multiple labels and correlated labels. The architecture of our proposed framework is summarized in Fig. 1. The model consists of three parts: feature representation, CNN model, and multi-label hierarchical classification. Hierarchical indexing achieves far superior performance compared with flat classification when processing large number of classes. Moreover, the coarse clustering step is an effective way to remove noise from the unevenly distributed samples. In addition, we also design suitable loss functions for the learning of this framework. The details of these three parts are described in the following subsections.
Fig. 1: A hierarchical CNNs-based framework with multi-label classification for semantic indexing (HC)
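The coarse-then-fine decision described in the (Yan et al., 2016) quote can be illustrated with a minimal sketch. It assumes the CNN has already produced a score per coarse cluster and a logit per fine label; the function names, the 0.5 threshold, the toy labels, and the binary cross-entropy loss are all illustrative assumptions, not the loss functions or clustering procedure of the paper.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def hierarchical_index(coarse_scores, fine_logits, threshold=0.5):
    """Pick the best-scoring coarse cluster, then keep every fine label
    in that cluster whose sigmoid probability reaches the threshold.

    coarse_scores: dict cluster -> score (assumed CNN output)
    fine_logits:   dict cluster -> dict label -> logit (assumed CNN output)
    """
    cluster = max(coarse_scores, key=coarse_scores.get)
    labels = sorted(
        label
        for label, z in fine_logits[cluster].items()
        if sigmoid(z) >= threshold
    )
    return cluster, labels


def multilabel_loss(logits, gold_labels):
    """Binary cross-entropy summed over labels: a common multi-label
    loss, used here as a stand-in for the paper's own loss design."""
    loss = 0.0
    for label, z in logits.items():
        p = sigmoid(z)
        loss -= math.log(p) if label in gold_labels else math.log(1.0 - p)
    return loss


# Toy scores for one biomedical abstract (hypothetical values).
coarse_scores = {"diseases": 2.1, "chemicals": 0.4}
fine_logits = {
    "diseases": {"Neoplasms": 1.3, "Diabetes Mellitus": -0.7, "Hypertension": 0.2},
    "chemicals": {"Insulin": 0.9},
}

cluster, labels = hierarchical_index(coarse_scores, fine_logits)
loss = multilabel_loss(fine_logits[cluster], {"Neoplasms"})
```

Restricting the fine-grained decision to one coarse cluster keeps the number of candidate labels per document small, which is the practical motivation the quote gives for hierarchical over flat classification.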
2013
- (Huang et al., 2013) ⇒ Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. (2013). “Learning Deep Structured Semantic Models for Web Search Using Clickthrough Data.” In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. ISBN:978-1-4503-2263-8 doi:10.1145/2505515.2505665
- QUOTE: In this study, extending from both research lines discussed above, we propose a series of Deep Structured Semantic Models (DSSM) for Web search. More specifically, our best model uses a deep neural network (DNN) to rank a set of documents for a given query as follows. First, a non-linear projection is performed to map the query and the documents to a common semantic space. Then, the relevance of each document given the query is calculated as the cosine similarity between their vectors in that semantic space. The neural network models are discriminatively trained using the clickthrough data such that the conditional likelihood of the clicked document given the query is maximized.
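The three steps in the (Huang et al., 2013) quote (non-linear projection, cosine relevance, likelihood of the clicked document) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a single tanh layer stands in for the deep network, the weights and term vectors are toy values, and the softmax with smoothing factor `gamma` is an assumed form of the conditional likelihood.

```python
import math


def project(x, weights):
    """One non-linear layer, tanh(W x), standing in for the deep network."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def click_posterior(query_vec, doc_vecs, gamma=10.0):
    """Softmax over scaled cosine similarities: P(doc | query).
    gamma is an assumed smoothing factor."""
    scores = [gamma * cosine(query_vec, d) for d in doc_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def clickthrough_loss(query_vec, doc_vecs, clicked_index, gamma=10.0):
    """Negative log-likelihood of the clicked document given the query."""
    return -math.log(click_posterior(query_vec, doc_vecs, gamma)[clicked_index])


# Toy shared projection and term vectors (hypothetical values).
W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
query = [1.0, 0.0, 1.0]
docs = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

q_vec = project(query, W)
d_vecs = [project(d, W) for d in docs]
posterior = click_posterior(q_vec, d_vecs)
ranking = sorted(range(len(docs)), key=lambda i: posterior[i], reverse=True)
loss = clickthrough_loss(q_vec, d_vecs, clicked_index=0)
```

Minimizing `clickthrough_loss` over many (query, clicked document) pairs is what "discriminatively trained" amounts to in this sketch; in practice the gradient would be taken with respect to the projection weights.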