2017 HighRiskLearningAcquiringNewWor

Subject Headings: Distributional Semantics Model; Out-of-Vocabulary (OOV) Word; OOV Modelling System; Herbelot-Baroni Distributional Semantics Model.

Notes

Cited By

Quotes

Abstract

Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn 'a good vector' for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences' worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.
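The strategy the abstract describes lends itself to a compact illustration. Below is a minimal sketch in Python/NumPy, not the authors' implementation: the unseen word's vector is initialised additively from the pretrained vectors of its few context words, then refined with skip-gram-style negative-sampling updates at a deliberately high, decaying learning rate, while the background semantic space stays frozen. The toy vocabulary, function names, and hyperparameters are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
DIM = 50

# Hypothetical pretrained background space (word -> vector); in practice
# this would come from a Word2Vec model trained on a large corpus.
background = {w: rng.standard_normal(DIM)
              for w in ["a", "small", "furry", "animal", "that", "purrs"]}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def learn_nonce(context_words, background, n_negative=5,
                lr=1.0, decay=0.5, epochs=5):
    # Keep only context words the background space actually knows.
    known = [w for w in context_words if w in background]
    vocab = list(background)
    # Additive initialisation: start from the centroid of the
    # pretrained context vectors.
    v = sum((background[w] for w in known), np.zeros(DIM)) / max(1, len(known))
    for _ in range(epochs):
        for w in known:
            # Positive step: pull the nonce towards an observed context word.
            ctx = background[w]
            v += lr * (1.0 - sigmoid(v @ ctx)) * ctx
            # Negative steps: push it away from random vocabulary words.
            # Only v is updated; the background vectors stay frozen.
            for neg in rng.choice(vocab, size=n_negative):
                if neg not in known:
                    v -= lr * sigmoid(v @ background[neg]) * background[neg]
        lr *= decay  # take less risk as evidence accumulates
    return v

# Infer a vector for an unseen word from one definition-like sentence's
# worth of context.
vec = learn_nonce(["a", "small", "furry", "animal", "that", "purrs"], background)
print(vec.shape)  # (50,)

The high initial learning rate is the 'high-risk' element: with only a handful of occurrences, the model must move the new vector aggressively on each example rather than averaging over many, and the decay schedule limits the damage once some evidence has been absorbed.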

References

BibTeX

@inproceedings{2017_HighRiskLearningAcquiringNewWor,
  author    = {Aur\'{e}lie Herbelot and
               Marco Baroni},
  editor    = {Martha Palmer and
               Rebecca Hwa and
               Sebastian Riedel},
  title     = {High-risk learning: acquiring new word vectors from tiny data},
  booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural
               Language Processing (EMNLP 2017)},
  pages     = {304--309},
  publisher = {Association for Computational Linguistics},
  year      = {2017},
  url       = {https://doi.org/10.18653/v1/d17-1030},
  doi       = {10.18653/v1/d17-1030},
}


Author: Marco Baroni, Aurélie Herbelot
Title: High-risk Learning: Acquiring New Word Vectors from Tiny Data
Year: 2017