Neural Natural Language Processing (NNLP) System
A Neural Natural Language Processing (NNLP) System is a Data-Driven Natural Language Processing (NLP) System that implements a Neural Natural Language Processing (NNLP) Algorithm to solve a Neural Natural Language Processing (NNLP) Task.
- Example(s):
- Counter-Example(s):
- See: Artificial Neural Network, Natural Language Processing System, Natural Language Understanding System, Natural Language Generation System, Natural Language User Interface System, Sentiment Analysis System, Part-of-Speech Tagging System, Text Chunking System, Named Entity Recognition System, Semantic Role Labeling System, Natural Language Parsing System, Word Sense Disambiguation System, Multi-Task Deep Neural Network.
References
2019
- (Liu et al., 2019) ⇒ Xiaodong Liu, Pengcheng He, Weizhu Chen, and Jianfeng Gao. (2019). “Multi-Task Deep Neural Networks for Natural Language Understanding.” arXiv:1901.11504
2018a
- (Devlin et al., 2018) ⇒ Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. (2018). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv:1810.04805
2018b
- (Sriram et al., 2018) ⇒ Anuroop Sriram, Heewoo Jun, Sanjeev Satheesh, and Adam Coates. (2018). “Cold Fusion: Training Seq2seq Models Together with Language Models.” In: Proceedings of the Sixth International Conference on Learning Representations (ICLR-2018).
2017a
- (See et al., 2017) ⇒ Abigail See, Peter J. Liu, and Christopher D. Manning. (2017). “Get To The Point: Summarization with Pointer-Generator Networks.” In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- QUOTE: Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text).
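The sketch below is a minimal, hedged illustration of the general encoder-decoder (sequence-to-sequence) idea the quote refers to; the class name Seq2SeqSummarizer and all hyperparameters are hypothetical placeholders, and it is not the pointer-generator architecture of See et al. (2017).
```python
# Minimal sketch of a seq2seq (encoder-decoder) summarizer.
# Seq2SeqSummarizer and the dimensions below are illustrative, not See et al.'s model.
import torch
import torch.nn as nn

class Seq2SeqSummarizer(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)     # projects to vocabulary logits

    def forward(self, src_ids, tgt_ids):
        # Encode the source document into a hidden state.
        _, h = self.encoder(self.embed(src_ids))
        # Decode the summary conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.embed(tgt_ids), h)
        return self.out(dec_out)                      # (batch, tgt_len, vocab_size)

# Usage sketch: token ids for a source document and a reference summary.
model = Seq2SeqSummarizer(vocab_size=10000)
src = torch.randint(0, 10000, (2, 50))    # two articles, 50 tokens each
tgt = torch.randint(0, 10000, (2, 12))    # two reference summaries, 12 tokens each
logits = model(src, tgt)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 10000), tgt.reshape(-1))
```
Because the decoder generates tokens from the full vocabulary rather than copying spans, such a model can produce abstractive summaries, which is the point the quote makes.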
2017b
- (Young et al., 2017) ⇒ Tom Young, Devamanyu Hazarika, Soujanya Poria, and Erik Cambria. (2017). “Recent Trends in Deep Learning Based Natural Language Processing.” In: IEEE Computational Intelligence Magazine Journal, 13(3). DOI: 10.1109/MCI.2018.2840738 arXiv:1708.02709
2016
- (Goldberg, 2016) ⇒ Yoav Goldberg. (2016). “A Primer on Neural Network Models for Natural Language Processing.” In: Journal of Artificial Intelligence Research, 57(1). DOI: 10.1613/jair.4992 arXiv:1510.00726
2015a
- (Mesnil et al., 2015) ⇒ Grégoire Mesnil, Yann Dauphin, Kaisheng Yao, Yoshua Bengio, Li Deng, Dilek Hakkani-Tur, Xiaodong He, Larry Heck, Gokhan Tur, Dong Yu, and Geoffrey Zweig. (2015). “Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding.” In: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP) Journal, 23(3). doi:10.1109/TASLP.2014.2383614
2015b
- (Hermann et al., 2015) ⇒ Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. (2015). “Teaching Machines to Read and Comprehend.” In: Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS'15). arXiv:1506.03340v3
2011
- (Collobert et al., 2011) ⇒ Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. (2011). “Natural Language Processing (Almost) from Scratch.” In: The Journal of Machine Learning Research, 12.
- QUOTE: All the NLP tasks above can be seen as tasks assigning labels to words. The traditional NLP approach is: extract from the sentence a rich set of hand-designed features which are then fed to a standard classification algorithm, for example, a Support Vector Machine (SVM), often with a linear kernel. The choice of features is a completely empirical process, mainly based first on linguistic intuition, and then trial and error, and the feature selection is task dependent, implying additional research for each new NLP task. Complex tasks like SRL then require a large number of possibly complex features (e.g., extracted from a parse tree) which can impact the computational cost which might be important for large-scale applications or applications requiring real-time response.
Instead, we advocate a radically different approach: as input we will try to pre-process our features as little as possible and then use a multilayer neural network (NN) architecture, trained in an end-to-end fashion. The architecture takes the input sentence and learns several layers of feature extraction that process the inputs. The features computed by the deep layers of the network are automatically trained by backpropagation to be relevant to the task. We describe in this section a general multilayer architecture suitable for all our NLP tasks, which is generalizable to other NLP tasks as well.
Our architecture is summarized in Figure 1 and Figure 2. The first layer extracts features for each word. The second layer extracts features from a window of words or from the whole sentence, treating it as a sequence with local and global structure (i.e., it is not treated like a bag of words). The following layers are standard NN layers.
Figure 1: Window approach network. Figure 2: Sentence approach network.
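As a rough illustration of the “window approach” described in the quote, the following is a hedged PyTorch sketch: a lookup-table layer produces per-word features, the features in a fixed window around the word of interest are concatenated, and standard feed-forward layers assign a label to the center word. The class name WindowTagger and all hyperparameters are assumptions for illustration, not the paper's exact configuration.
```python
# Hedged sketch of the window-approach network described above:
# layer 1 = word lookup table, layer 2 = features over a word window,
# followed by a standard output layer that scores each tag for the center word.
# WindowTagger and the dimensions are placeholders, not Collobert et al.'s setup.
import torch
import torch.nn as nn

class WindowTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=50, window=5, hidden=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)      # layer 1: word features
        self.hidden = nn.Linear(window * emb_dim, hidden)   # layer 2: window features
        self.out = nn.Linear(hidden, num_tags)              # output layer: tag scores

    def forward(self, window_ids):
        # window_ids: (batch, window) token ids centered on the word to be tagged
        e = self.embed(window_ids)            # (batch, window, emb_dim)
        e = e.reshape(e.size(0), -1)          # concatenate the window's features
        h = torch.tanh(self.hidden(e))        # the paper uses hard-tanh; tanh here
        return self.out(h)                    # (batch, num_tags)

# Usage sketch: tag the center word of each 5-token window.
model = WindowTagger(vocab_size=20000, num_tags=45)
windows = torch.randint(0, 20000, (8, 5))
scores = model(windows)                       # (8, 45) per-tag scores
```
Training such a network end-to-end with backpropagation lets the intermediate layers learn task-relevant features, in place of the hand-designed feature sets fed to an SVM in the traditional pipeline the quote contrasts against.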
2008
- (Collobert & Weston, 2008) ⇒ Ronan Collobert, and Jason Weston. (2008). “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning.” In: Proceedings of the 25th International Conference on Machine learning. ISBN:978-1-60558-205-4 doi:10.1145/1390156.1390177
2003
- (Bengio et al., 2003a) ⇒ Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. (2003). “A Neural Probabilistic Language Model.” In: The Journal of Machine Learning Research, 3.
1992
- (Kimura et al., 1992) ⇒ Kazuhiro Kimura, Takashi Suzuoka, and Sin-ya Amano. (1992). “Association-based Natural Language Processing with Neural Networks.” In: Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics. doi:10.3115/981967.981996