Few-Shot Natural Language Processing (NLP) Task
A Few-Shot Natural Language Processing (NLP) Task is an in-context learning NLP task that is also a few-shot learning task (i.e., one for which only a few NLP training examples are provided).
- Context:
- Optional Input: Auxiliary Information that can assist in defining the task or providing additional context.
- Optional Input: Semantic Spaces that can provide a representation of semantic relationships between words or phrases.
- It can be solved by a Few-Shot NLP Learning System (that implements a few-shot NLP learning algorithm).
- It can range from being a Few-Shot Natural Language Understanding Task, to being a Few-Shot Natural Language Generation Task.
- …
- Example(s):
- a One-Shot NLP Learning Task, where only one example is given for a specific NLP task, such as classifying a novel category of text.
- a Few-Shot In-Context Learning Task, where a few examples are given within the context of the task (see the prompt-construction sketch after this list), such as in dialogue systems where the response is based on a few previous exchanges.
- a Chain-of-Thought Learning Task, where each in-context demonstration includes intermediate reasoning steps that the model reproduces before making its final prediction.
- a Few-Shot Benchmark Learning Task.
- a Few-Shot End-of-Sentence Detection Task, where only a few sentence end examples are provided.
- a Few-Shot Missing Word Prediction Task, where only a few missing word examples are provided.
- a Few-Shot Named Entity Recognition Task, where only a few text/named entity examples are provided.
- a Few-Shot Information Extraction Task, where only a few text/populated record examples are provided.
- a Few-Shot Document Summarization Task, where only a few text/text summary examples are provided.
- a Few-Shot Question Answering Task, where only a few question/answer examples are provided.
- …
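The following is a minimal, illustrative sketch (not drawn from any of the referenced papers) of how a Few-Shot In-Context Learning Task can be posed: a handful of labeled demonstrations and the unlabeled query are concatenated into a single prompt, and the language model is expected to complete the label without any gradient updates. The label set, review texts, and function name are hypothetical.

```python
# Minimal sketch: a few-shot NLP task posed as an in-context prompt.
# The labeled "training" examples live only in the prompt; no fine-tuning occurs.

FEW_SHOT_EXAMPLES = [  # hypothetical sentiment-labeled demonstrations (k = 3)
    ("The plot was gripping from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
    ("A serviceable but forgettable sequel.", "neutral"),
]

def build_few_shot_prompt(query: str) -> str:
    """Concatenate the k demonstrations plus the unlabeled query into one prompt."""
    lines = ["Classify the sentiment of each movie review."]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")  # the model completes the label
    return "\n\n".join(lines)

if __name__ == "__main__":
    # The resulting string would be sent to a language model as-is.
    print(build_few_shot_prompt("An ambitious mess, but I could not look away."))
```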
- Counter-Example(s):
- a Zero-Shot NLP Task, where no training examples are provided for a specific class or task.
- a Supervised NLP Task, where a large amount of labeled data is available for training.
- a Few-Shot Computer Vision Task, which involves image-based inputs rather than text.
- a Few-Shot Sound Processing Task, which involves audio signals rather than text.
- …
- See: Transfer Learning, Meta-Learning, Multi-task Learning, NLP Task, Machine Learning.
References
2021
- (Yang, 2021) ⇒ Mengde Yang. (2021). “A Survey on Few-shot Learning in Natural Language Processing.” In: 2021 International Conference on Artificial Intelligence and Electromechanical Automation (AIEA), pp. 294-297. IEEE.
- ABSTRACT: The annotated dataset is the foundation for Supervised Natural Language Processing. However, the cost of obtaining dataset is high. In recent years, the Few-Shot Learning has gradually attracted the attention of researchers. From the definition, in this paper, we conclude the difference in Few-Shot Learning between Natural Language Processing and Computer Vision. On that basis, the current Few-Shot Learning on Natural Language Processing is summarized, including Transfer Learning, Meta Learning and Knowledge Distillation. Furthermore, we conclude the solutions to Few-Shot Learning in Natural Language Processing, such as the method based on Distant Supervision, Meta Learning and Knowledge Distillation. Finally, we present the challenges facing Few-Shot Learning in Natural Language Processing.
2020
- (Yin et al., 2020) ⇒ Wenpeng Yin, Nazneen Fatema Rajani, Dragomir Radev, Richard Socher, and Caiming Xiong. (2020). “Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment As a Start.” arXiv preprint arXiv:2010.02584.
- ABSTRACT: A standard way to address different NLP problems is by first constructing a problem-specific dataset, then building a model to fit this dataset. To build the ultimate artificial intelligence, we desire a single machine that can handle diverse new problems, for which task-specific annotations are limited. We bring up textual entailment as a unified solver for such NLP problems. However, current research of textual entailment has not spilled much ink on the following questions: (i) How well does a pretrained textual entailment system generalize across domains with only a handful of domain-specific examples? and (ii) When is it worth transforming an NLP task into textual entailment? We argue that the transforming is unnecessary if we can obtain rich annotations for this task. Textual entailment really matters particularly when the target NLP task has insufficient annotations.
Universal NLP can be probably achieved through different routines. In this work, we introduce Universal Few-shot textual Entailment (UFO-Entail). We demonstrate that this framework enables a pretrained entailment model to work well on new entailment domains in a few-shot setting, and show its effectiveness as a unified solver for several downstream NLP tasks such as question answering and coreference resolution when the end-task annotations are limited. Code: this https URL
- (Brown, Mann et al., 2020) ⇒ Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, et al. (2020). “Language Models Are Few-Shot Learners.” In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
- ABSTRACT: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
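The zero-shot, one-shot, and few-shot regimes described in this abstract differ only in how many demonstrations are placed in the prompt; the model's weights are never updated. The sketch below (an assumption-laden illustration, not code from Brown et al.) contrasts the three settings with a hypothetical English-to-French word-translation prompt; the demonstration pairs and the make_prompt helper are invented for illustration.

```python
# Minimal sketch contrasting zero-shot (k = 0), one-shot (k = 1), and
# few-shot (k > 1) prompts: only the number of in-context demonstrations
# changes; the language model itself is never fine-tuned.

DEMOS = [  # hypothetical English -> French demonstration pairs
    ("cheese", "fromage"),
    ("house", "maison"),
    ("book", "livre"),
]

def make_prompt(word: str, k: int) -> str:
    """Build a translation prompt containing the first k demonstrations."""
    header = "Translate English to French."
    demos = [f"{en} => {fr}" for en, fr in DEMOS[:k]]
    return "\n".join([header, *demos, f"{word} =>"])

if __name__ == "__main__":
    for k in (0, 1, 3):  # zero-shot, one-shot, few-shot
        print(f"--- k = {k} ---")
        print(make_prompt("cat", k))
        print()
```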