Convolutional Latent Semantic Model (CLSM) Algorithm
A Convolutional Latent Semantic Model (CLSM) Algorithm is a Neural Latent Semantic Indexing Algorithm that is based on a Convolutional Neural Network Training Algorithm.
- Context:
- It was first developed by Shen et al. (2014).
- It can be implemented by a CLSM System to solve a CLSM Task.
- Example(s):
- …
- Counter-Example(s):
- See: Convolutional Neural Network, Semantic Analysis, Vectorial Semantics, Artificial Neural Network, Synonymy, Polysemy, Word Hashing, Neural Natural Language Processing System.
References
- (Shen et al., 2014) ⇒ Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Grégoire Mesnil. (2014). “A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval.” In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management. ISBN:978-1-4503-2598-1 doi:10.1145/2661829.2661935
- QUOTE: The architecture of the CLSM is illustrated in Figure 1. The model contains (1) a word-n-gram layer obtained by running a contextual sliding window over the input word sequence (i.e., a query or a document), (2) a letter-trigram layer that transforms each word-trigram into a letter-trigram representation vector, (3) a convolutional layer that extracts contextual features for each word with its neighboring words defined by a window, e.g., a word-n-gram, (4) a max-pooling layer that discovers and combines salient word-n-gram features to form a fixed-length sentence-level feature vector, and (5) a semantic layer that extracts a high-level semantic feature vector for the input word sequence. In what follows, we describe these components in detail, using the annotation illustrated in Figure 1.
Figure 1: The CLSM maps a variable-length word sequence to a low-dimensional vector in a latent semantic space. A word contextual window size (i.e. the receptive field) of three is used in the illustration. Convolution over word sequence via learned matrix [math]\displaystyle{ W_C }[/math] is performed implicitly via the earlier layer’s mapping with a local receptive field. The dimensionalities of the convolutional layer and the semantic layer are set to 300 and 128 in the illustration, respectively. The max operation across the sequence is applied for each of 300 feature dimensions separately. (Only the first dimension is shown to avoid figure clutter.)
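The quoted passage walks through five components: word-n-gram windowing, letter-trigram word hashing, convolution, max pooling, and a semantic projection. The following is a minimal NumPy sketch of a single forward pass through those components, not the authors' implementation: the weight names W_c and W_s follow the paper's notation, the tanh activations and the 300/128 dimensionalities match the illustration in Figure 1, while the tiny trigram vocabulary and random weights are purely hypothetical (in the paper the weights are learned from click-through data, which is out of scope here).

```python
import numpy as np

def letter_trigrams(word, vocab):
    """Word hashing: map a word to a bag-of-letter-trigrams count vector."""
    marked = f"#{word}#"                      # word-boundary symbols, per the paper
    vec = np.zeros(len(vocab))
    for i in range(len(marked) - 2):
        idx = vocab.get(marked[i:i + 3])      # vocab: letter trigram -> index
        if idx is not None:
            vec[idx] += 1.0
    return vec

def clsm_forward(words, vocab, W_c, W_s, n=3):
    """Map a variable-length word sequence to a latent semantic vector.

    words : non-empty list of tokens (the query or document text)
    W_c   : (K, n * T) convolution matrix, K = 300 in Figure 1
    W_s   : (L, K) semantic projection matrix, L = 128 in Figure 1
    T     : size of the letter-trigram vocabulary
    """
    T = len(vocab)
    # (1)+(2) letter-trigram vectors per word, zero-padded so every word
    # sits at the center of a full n-word contextual window
    tri = [letter_trigrams(w, vocab) for w in words]
    pad = [np.zeros(T)] * (n // 2)
    tri = pad + tri + pad
    # (3) convolutional layer: one contextual feature vector h_t per position
    h = np.stack([np.tanh(W_c @ np.concatenate(tri[t:t + n]))
                  for t in range(len(words))])
    # (4) max pooling across positions, separately for each feature dimension
    v = h.max(axis=0)
    # (5) semantic layer: high-level semantic feature vector
    return np.tanh(W_s @ v)

# Toy usage with a hypothetical trigram vocabulary and random weights.
vocab = {t: i for i, t in enumerate(
    ["#be", "bes", "est", "st#", "#ca", "car", "ar#"])}
rng = np.random.default_rng(0)
W_c = 0.1 * rng.standard_normal((300, 3 * len(vocab)))
W_s = 0.1 * rng.standard_normal((128, 300))
y = clsm_forward(["best", "car"], vocab, W_c, W_s)   # y.shape == (128,)
```

In retrieval, a query vector and a document vector produced this way would be compared by cosine similarity, which is how the paper scores query-document pairs.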