SEW-Embed System
A SEW-Embed System is a Multilingual and Cross-lingual Semantic Word Similarity System that is based on the SEW Corpus and word embedding representations.
- Context:
- It was the 3rd best performing semantic word similarity system in the global ranking of SemEval-2017 Task 2 with a score of 0.56.
- It was initially developed by Bovi & Raganato (2017).
- Example(s):
- the one proposed in Bovi & Raganato (2017),
- …
- Counter-Example(s):
- See: SemEval, SemEval-2017 Task, Semantic Word Similarity Benchmark Task, Semantic Textual Similarity Benchmark Task, Semantic Similarity Modelling System, Semantic Similarity Measure, Semantic Relatedness Measure.
References
2017a
- (Camacho-Collados et al., 2017) ⇒ Jose Camacho-Collados, Mohammad Taher Pilehvar, Nigel Collier, and Roberto Navigli. (2017). “SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity.” In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval@ACL 2017).
- QUOTE: The global ranking for this subtask was computed by averaging the results of the six datasets on which each system performed best. The global rankings are displayed in Table 9. Luminoso was the only system outperforming the baseline, achieving the best overall results. OoO achieved the second best overall performance using an extension of the Bilingual Bag-of-Words without Alignments (BilBOWA) approach of Gouws et al. (2015) on the shared Europarl corpus. The third overall system was SEW, which leveraged Wikipedia-based concept vectors (Raganato et al., 2016) and pre-trained word embeddings for learning language-independent concept embeddings.
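The averaging rule described in this quote can be illustrated with a minimal Python sketch. The function name `global_score`, the `top_k` parameter, and the per-dataset scores in the usage note are hypothetical and are not taken from the task paper; only the "average over the six best datasets" rule comes from the quote.

```python
def global_score(dataset_scores, top_k=6):
    """Average a system's top_k best per-dataset scores.

    dataset_scores: iterable of per-dataset results for one system.
    top_k: number of best datasets to average (six, per the quote above).
    """
    scores = sorted(dataset_scores, reverse=True)
    if len(scores) < top_k:
        # Assumption of this sketch: a system with fewer results is not ranked.
        raise ValueError(f"need at least {top_k} dataset scores")
    best = scores[:top_k]
    return sum(best) / len(best)
```

For example, with the made-up scores `[0.5, 0.6, 0.7, 0.4, 0.55, 0.65, 0.3, 0.2]`, the two lowest results are discarded and the remaining six are averaged.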
2017b
- (Bovi & Raganato, 2017) ⇒ Claudio Delli Bovi, and Alessandro Raganato. (2017). “Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia.” In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval@ACL 2017).
- QUOTE: In this paper we propose SEW-EMBED, an embedded augmentation of SEW's original representations in which sparse vectors, defined in the high-dimensional space of Wikipedia pages, are mapped to continuous vector representations via a weighted average of embedded vectors from an arbitrary, pre-specified word (or sense) representation. Regardless of the particular representation used, the resulting vectors are still defined at the concept level, and hence immediately expendable in a multilingual and cross-lingual setting.
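The mapping described in this quote, from a sparse vector over Wikipedia pages to a dense vector via a weighted average of pre-trained embeddings, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the dictionary-based data layout, the handling of pages without an embedding, and the toy vectors in the usage note are all hypothetical.

```python
import math


def embed_concept(sparse_vector, embeddings):
    """Map a sparse concept vector to a dense one by weighted averaging.

    sparse_vector: dict mapping a Wikipedia page title to its weight
                   (the non-zero dimensions of the sparse representation).
    embeddings:    dict mapping a page title to a dense vector (list of
                   floats) from some pre-trained word or sense embedding.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for page, weight in sparse_vector.items():
        vec = embeddings.get(page)
        if vec is None:
            # Assumption of this sketch: pages without an embedding are skipped.
            continue
        for i, x in enumerate(vec):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all-zero vector when nothing could be embedded
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

With toy embeddings `{"Dog": [1.0, 0.0], "Cat": [0.0, 1.0], "Pet": [1.0, 1.0]}`, the sparse vector `{"Dog": 3.0, "Pet": 1.0}` is mapped to `[1.0, 0.25]`. Two words can then be compared by taking the cosine similarity of their concept vectors, which works across languages because the vectors are defined at the concept level rather than the word level.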
2016
- (Raganato et al., 2016) ⇒ Alessandro Raganato, Claudio Delli Bovi, and Roberto Navigli. (2016). “Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia.” In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI-16).
- QUOTE: Our approach for building a Semantically Enriched Wikipedia (SEW) takes as input a Wikipedia dump and outputs a sense-annotated corpus, built upon the original Wikipedia text, where mentions are annotated according to the sense inventory of BabelNet (...)