SENNA Word Embedding System
A SENNA Word Embedding System is a Word Embedding System that produces the word embeddings used to train the SENNA system's NLP Tasks.
- AKA: Collobert-and-Weston Embeddings.
- Example(s):
- Counter-Example(s):
- See: Natural Language Processing System, Part-Of-Speech Tagging (POS) System, Word Relatedness Modeling Task, Word Analogy Task, Word Similarity Task.
References
2021
- (Collobert, 2021) ⇒ https://ronan.collobert.com/senna/ Retrieved:2021-05-16.
- QUOTE: SENNA is a software distributed under a non-commercial license, which outputs a host of Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), name entity recognition (NER), semantic role labeling (SRL) and syntactic parsing (PSG).
SENNA is fast because it uses a simple architecture, self-contained because it does not rely on the output of existing NLP system, and accurate because it offers state-of-the-art or near state-of-the-art performance.
(...)
Here are the main changes compared to SENNA v2.0:
- Syntactic parsing.
- We now include our original word embeddings, used to train each task.
- Bug correction: now outputs correctly tokens made of numbers (instead of replacing numbers by "0").
- Option -offsettags, which outputs start/end offsets (in the sentence) of each token.
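The embeddings mentioned in the change list above ship with the SENNA distribution as plain text files. A minimal Python sketch for loading them into a dictionary, assuming the v3.0 layout with a hash/words.lst word list and a matching embeddings/embeddings.txt matrix of whitespace-separated 50-dimensional vectors (the file names and dimensionality are assumptions about the distribution, not confirmed by the quote):
```python
# Minimal sketch: load SENNA's distributed word embeddings.
# Assumes the v3.0 layout: one word per line in hash/words.lst,
# and the matching 50-dim vector per line in embeddings/embeddings.txt.
# Both file names and the dimensionality are assumptions.

def load_senna_embeddings(words_path="hash/words.lst",
                          vectors_path="embeddings/embeddings.txt"):
    with open(words_path, encoding="utf-8") as f:
        words = [line.strip() for line in f]
    embeddings = {}
    with open(vectors_path, encoding="utf-8") as f:
        for word, line in zip(words, f):
            embeddings[word] = [float(x) for x in line.split()]
    return embeddings

if __name__ == "__main__":
    emb = load_senna_embeddings()
    print(len(emb), "words,", len(next(iter(emb.values()))), "dimensions")
```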
2013a
- (Al-Rfou et al., 2013) ⇒ Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. (2013). “Polyglot: Distributed Word Representations for Multilingual NLP.” In: Proceedings of the Seventeenth Conference on Computational Natural Language Learning (CoNLL 2013).
- QUOTE: Collobert and Weston (2008) shows that word embeddings can almost substitute NLP common features on several tasks. The system they built, SENNA, offers part of speech tagging, chunking, named entity recognition, semantic role labeling and dependency parsing (Collobert, 2011). The system is built on top of word embeddings and performs competitively compared to state of art systems. In addition to pure performance, the system has a faster execution speed than comparable NLP pipelines (Al-Rfou and Skiena, 2012).
To speed up the embedding generation process, SENNA embeddings are generated through a procedure that is different from language modeling. The representations are acquired through a model that distinguishes between phrases and corrupted versions of them. In doing this, the model avoids the need to normalize the scores across the vocabulary to infer probabilities. (Chen et al., 2013) shows that the embeddings generated by SENNA perform well in a variety of term-based evaluation tasks. Given the training speed and prior performance on NLP tasks in English, we generate our multilingual embeddings using a similar network architecture to the one SENNA used.
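The corrupted-phrase training described in the quote above is a pairwise ranking objective: the network is trained so that a genuine text window scores at least a fixed margin higher than the same window with its center word replaced by a random vocabulary word, which avoids ever normalizing scores over the vocabulary. A minimal NumPy sketch of that hinge loss (the window size, margin of 1, and center-word corruption follow Collobert & Weston, 2008; the linear scorer here is a hypothetical stand-in for their deeper network, and all sizes are toy values):
```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, WIN = 1000, 50, 5                  # toy sizes for illustration only
E = rng.normal(scale=0.1, size=(VOCAB, DIM))   # word embedding table (learned in training)
w = rng.normal(scale=0.1, size=WIN * DIM)      # hypothetical linear scorer standing in
                                               # for the deeper Collobert & Weston network

def score(window_ids):
    """Score a window of word ids: concatenate their embeddings, apply the scorer."""
    return float(w @ E[window_ids].reshape(-1))

def ranking_loss(window_ids):
    """Hinge loss: a genuine window should outscore, by a margin of 1,
    the same window with its center word replaced by a random word.
    No normalization over the vocabulary is ever computed."""
    corrupted = window_ids.copy()
    corrupted[WIN // 2] = rng.integers(VOCAB)  # corrupt the center word
    return max(0.0, 1.0 - score(window_ids) + score(corrupted))

window = rng.integers(VOCAB, size=WIN)
print(ranking_loss(window))
```
In the full training loop the gradient of this hinge is backpropagated into both the scorer and the embedding table, which is how the embeddings themselves are learned.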
2013b
- (Chen et al., 2013) ⇒ Yanqing Chen, Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. (2013). “The Expressive Power of Word Embeddings.” In: CoRR, abs/1301.3226.
2011
- (Collobert et al., 2011) ⇒ Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. (2011). “Natural Language Processing (Almost) from Scratch.” In: The Journal of Machine Learning Research, 12.
2008
- (Collobert & Weston, 2008) ⇒ Ronan Collobert, and Jason Weston. (2008). “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning.” In: Proceedings of the 25th International Conference on Machine learning. ISBN:978-1-60558-205-4 doi:10.1145/1390156.1390177