Few-Shot In-Context Learning (FS-ICL) Task
A Few-Shot In-Context Learning (FS-ICL) Task is an in-context learning task with only a few training examples.
- Context:
- Optional Inputs: Auxiliary Information that encodes observable distinguishing properties of objects.
- Optional Inputs: Semantic Spaces that represent the high-level feature space in which the model makes predictions.
- It can be solved by a Few-Shot Learning System or a Few-Shot In-Context Learning (ICL) System that implements a few-shot ICL algorithm or a few-shot learning algorithm.
- It can range from being a Few-Shot Computer Vision Learning Task, to being a Few-Shot Natural Language Processing Task, to being a Few-Shot Sound Processing Task.
- It can range from being a Few-Shot Classification Task, to being a Few-Shot Regression Task, to being a Few-Shot Ordering Task.
- …
- Example(s):
- One-Shot Learning, where only a single example is provided.
- Few-Shot NLP Task, such as:
- Few-Shot Information Extraction, such as: extracting named entities from text given only a few annotated example sentences.
- …
- Few-Shot Vision Task, such as:
- Object Recognition with only a single example of each object class.
- a Few-Shot Benchmark Learning Task, such as: the Omniglot dataset for handwriting recognition.
- …
- Counter-Example(s):
- Zero-Shot Learning, where no examples are provided.
- Many-Shot Learning, where many examples are provided.
- See: Zero-Shot ICL, Object Categorization Problem, Transfer Learning, Meta-Learning, Multi-task Learning.
References
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/One-shot_learning Retrieved:2023-2-26.
- One-shot learning is an object categorization problem, found mostly in computer vision. Whereas most machine learning-based object categorization algorithms require training on hundreds or thousands of examples, one-shot learning aims to classify objects from one, or only a few, examples. The term few-shot learning is also used for these problems, especially when more than one example is needed.
2022
- (Liu et al., 2022) ⇒ Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin A. Raffel. (2022). “Few-shot Parameter-efficient Fine-tuning is Better and Cheaper Than in-context Learning.” Advances in Neural Information Processing Systems 35
- ABSTRACT Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously-unseen task without any gradient-based training by feeding a small number of training examples as part of the input. ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made. Parameter-efficient fine-tuning (PEFT) (e.g. adapter modules, prompt tuning, sparse update methods, etc.) offers an alternative paradigm where a small set of parameters are trained to enable a model to perform the new task. In this paper, we rigorously compare few-shot ICL and PEFT and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
2022
- (Wei, Tay et al., 2022) ⇒ Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, and William Fedus. (2022). “Emergent Abilities of Large Language Models.” In: Transactions on Machine Learning Research, 08/2022 (TMLR).
- QUOTE: ... Brown et al. (2020) proposed few-shot prompting, which includes a few input-output examples in the model’s context (input) as a preamble before asking the model to perform the task for an unseen inference-time example. An example prompt is shown in Figure 1. The ability to perform a task via few-shot prompting is emergent when a model has random performance until a certain scale, after which performance increases to well above random. Figure 2 shows eight such emergent abilities spanning five language model families from various work.
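The few-shot prompting format described above (labeled input-output examples as a preamble, followed by the unseen query) can be sketched as plain prompt assembly. The sentiment task, example sentences, and labels below are illustrative placeholders, not drawn from any benchmark:

```python
# Assemble a few-shot prompt: demonstration examples as a preamble,
# then the unseen inference-time input with an open "Output:" slot
# for the language model to complete.
def build_few_shot_prompt(examples, query, instruction="Classify the sentiment."):
    lines = [instruction]
    for text, label in examples:
        lines.append(f"Input: {text}\nOutput: {label}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

demos = [
    ("I loved this movie.", "positive"),
    ("The plot was dull.", "negative"),
]
prompt = build_few_shot_prompt(demos, "An absolute delight.")
print(prompt)
```

Note that, as Liu et al. (2022) point out above, this entire prompt (including every demonstration) must be re-processed by the model for each new prediction, which is the source of ICL's computational cost.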
2020
- (Wang et al., 2020) ⇒ Yaqing Wang, Quanming Yao, James T. Kwok, and Lionel M. Ni. (2020). “Generalizing from a Few Examples: A Survey on Few-shot Learning.” ACM Computing Surveys (CSUR), 53(3).
- ABSTRACT: Machine learning has been highly successful in data-intensive applications but is often hampered when the data set is small. Recently, Few-shot Learning (FSL) is proposed to tackle this problem. Using prior knowledge, FSL can rapidly generalize to new tasks containing only a few samples with supervised information. In this article, we conduct a thorough survey to fully understand FSL. Starting from a formal definition of FSL, we distinguish FSL from several relevant machine learning problems. We then point out that the core issue in FSL is that the empirical risk minimizer is unreliable. Based on how prior knowledge can be used to handle this core issue, we categorize FSL methods from three perspectives: (i) data, which uses prior knowledge to augment the supervised experience; (ii) model, which uses prior knowledge to reduce the size of the hypothesis space; and (iii) algorithm, which uses prior knowledge to alter the search for the best hypothesis in the given hypothesis space. With this taxonomy, we review and discuss the pros and cons of each category. Promising directions, in the aspects of the FSL problem setups, techniques, applications, and theories, are also proposed to provide insights for future research.
2020
- (Brown, Mann et al., 2020) ⇒ Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. (2020). “Language Models Are Few-Shot Learners.” In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
2017
- (Snell et al., 2017) ⇒ Jake Snell, Kevin Swersky, and Richard Zemel. (2017). “Prototypical Networks for Few-shot Learning.” Advances in Neural Information Processing Systems 30
- ABSTRACT: We propose Prototypical Networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical Networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve excellent results. We provide an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning. We further extend Prototypical Networks to zero-shot learning and achieve state-of-the-art results on the CU-Birds dataset.
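The nearest-prototype classification rule at the core of Prototypical Networks can be sketched with NumPy. For brevity the toy 2-D points below stand in for learned embeddings (in the actual method, an embedding network is meta-trained over episodes; that training loop is omitted here):

```python
import numpy as np

def prototypes(support_x, support_y):
    """Compute one prototype per class: the mean embedding of its support examples."""
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_x, classes, protos):
    """Assign each query embedding to the class of its nearest prototype (Euclidean)."""
    dists = np.linalg.norm(query_x[:, None, :] - protos[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]

# Toy 2-way, 2-shot episode with 2-D "embeddings".
support_x = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
support_y = np.array([0, 0, 1, 1])
classes, protos = prototypes(support_x, support_y)

queries = np.array([[0.2, 0.4], [4.8, 5.5]])
print(classify(queries, classes, protos))  # each query gets its nearest prototype's label
```

The simple inductive bias the abstract refers to is visible here: classification reduces to distances against one mean vector per class, so adding a new class at test time only requires averaging its few support embeddings.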