Neural-based Language Model (LM) Training Algorithm
A Neural-based Language Model (LM) Training Algorithm is a language modeling algorithm that is a Neural-based NLP Algorithm.
- Context:
- It can be implemented by a Neural-based LM System.
- It can range from being a Neural Word-level LM Algorithm to being a Neural Character-level LM Algorithm.
- …
- Example(s):
- an RNN-based LM Algorithm, such as an LSTM-based LM Algorithm (see the sketch below).
- a Convolutional NNet-based LM Algorithm.
- …
- Counter-Example(s):
- a Count-based LM Algorithm, such as a Kneser-Ney n-Gram LM Algorithm.
- See: Neural Language Generation.
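The following is a minimal sketch of the LSTM-based word-level case referenced above, assuming PyTorch; the vocabulary size, dimensions, and toy batch are illustrative and not taken from any of the cited works.

```python
# Minimal word-level LSTM language model sketch (assumes PyTorch is installed).
# Vocabulary size, embedding/hidden dimensions, and the toy batch are illustrative.
import torch
import torch.nn as nn

class WordLSTMLM(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        hidden_states, _ = self.lstm(self.embed(token_ids))
        return self.proj(hidden_states)  # (batch, seq_len, vocab_size) logits

model = WordLSTMLM()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step: predict each next word from its left context.
batch = torch.randint(0, 10_000, (4, 21))        # toy token ids
inputs, targets = batch[:, :-1], batch[:, 1:]    # shift targets by one position
optimizer.zero_grad()
logits = model(inputs)
loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
loss.backward()
optimizer.step()
```

The training objective is next-word prediction: each position's target is the following token, scored with cross-entropy over the vocabulary.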
References
2017
- (Dauphin et al., 2017) ⇒ Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. (2017). “Language Modeling with Gated Convolutional Networks.” In: Proceedings of the 34th International Conference on Machine Learning (ICML 2017).
- QUOTE: ... The pre-dominant approach to language modeling to date is based on recurrent neural networks. Their success on this task is often linked to their ability to capture unbounded context. In this paper we develop a finite context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens. ...
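The quoted finite-context idea can be illustrated with a single gated convolutional block; the sketch below assumes PyTorch, and the channel width, kernel size, and causal left-padding scheme are illustrative choices rather than the paper's configuration.

```python
# Sketch of one gated convolutional LM block in the spirit of Dauphin et al. (2017),
# assuming PyTorch; dimensions and the causal left-padding are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvBlock(nn.Module):
    def __init__(self, channels=128, kernel_size=4):
        super().__init__()
        self.kernel_size = kernel_size
        # Produce 2*channels so the output can split into a signal and a gate.
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size)

    def forward(self, x):
        # x: (batch, channels, seq_len); pad on the left so position t
        # only sees tokens <= t (finite context, no access to future tokens).
        padded = F.pad(x, (self.kernel_size - 1, 0))
        return F.glu(self.conv(padded), dim=1)  # h = A * sigmoid(B)

# Toy usage: stacked blocks over embedded tokens form a finite-context LM backbone.
x = torch.randn(2, 128, 20)
block = GatedConvBlock()
print(block(x).shape)  # torch.Size([2, 128, 20])
```

Stacking several such blocks over token embeddings, followed by a softmax projection, yields a convolutional LM whose positions can be computed in parallel over the sequence, which is the efficiency point the quote makes.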
2016
- (Kim et al., 2016) ⇒ Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush. (2016). “Character-Aware Neural Language Models.” In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-2016).
- QUOTE: We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art despite having 60% fewer parameters. ...
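A compact sketch of the quoted pipeline (character CNN, highway layer, word-level LSTM LM) is given below, assuming PyTorch; all layer sizes and the toy input are illustrative and do not reproduce the authors' settings.

```python
# Compact sketch of a character-aware LM: char CNN -> highway layer -> word-level LSTM.
# Assumes PyTorch; all sizes and the toy input are illustrative.
import torch
import torch.nn as nn

class CharAwareLM(nn.Module):
    def __init__(self, n_chars=60, char_dim=16, n_filters=64,
                 kernel_size=3, hidden_dim=128, vocab_size=5000):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, char_dim)
        self.char_conv = nn.Conv1d(char_dim, n_filters, kernel_size)
        # Highway layer: y = t * g(W x) + (1 - t) * x
        self.highway_transform = nn.Linear(n_filters, n_filters)
        self.highway_gate = nn.Linear(n_filters, n_filters)
        self.lstm = nn.LSTM(n_filters, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)  # predictions at the word level

    def forward(self, char_ids):
        # char_ids: (batch, n_words, n_chars_per_word) character indices
        b, w, c = char_ids.shape
        chars = self.char_embed(char_ids.view(b * w, c))          # (b*w, c, char_dim)
        conv = torch.relu(self.char_conv(chars.transpose(1, 2)))  # (b*w, filters, c')
        word_vecs = conv.max(dim=2).values                        # max-over-time pooling
        t = torch.sigmoid(self.highway_gate(word_vecs))
        word_vecs = t * torch.relu(self.highway_transform(word_vecs)) + (1 - t) * word_vecs
        hidden, _ = self.lstm(word_vecs.view(b, w, -1))
        return self.proj(hidden)  # (batch, n_words, vocab_size) word-level logits

# Toy usage: 2 sentences, 7 words each, 10 characters per word.
logits = CharAwareLM()(torch.randint(0, 60, (2, 7, 10)))
print(logits.shape)  # torch.Size([2, 7, 5000])
```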
2013
- (Chelba et al., 2013) ⇒ Ciprian Chelba, Tomáš Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. (2013). “One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling.” Technical Report, Google Research.
- QUOTE: ... We show performance of several well-known types of language models, with the best results achieved with a recurrent neural network based language model. The baseline unpruned Kneser-Ney 5-gram model achieves perplexity 67.6; a combination of techniques leads to 35% reduction in perplexity, or 10% reduction in cross-entropy (bits), over that baseline. ...
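The quoted figures can be checked with the standard relation between perplexity and cross-entropy, PPL = 2^H with H in bits per word; the short computation below (plain Python) shows why a 35% perplexity reduction over the 67.6 baseline corresponds to roughly a 10% reduction in bits.

```python
# Quick check of the perplexity / cross-entropy relationship quoted above
# (PPL = 2 ** H, with H in bits per word); the 35% figure comes from the quote.
import math

baseline_ppl = 67.6
improved_ppl = baseline_ppl * (1 - 0.35)           # 35% perplexity reduction

baseline_bits = math.log2(baseline_ppl)            # ~6.08 bits per word
improved_bits = math.log2(improved_ppl)            # ~5.46 bits per word
relative_bit_reduction = 1 - improved_bits / baseline_bits

print(f"{relative_bit_reduction:.1%}")             # ~10%, as the quote states
```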