In-Context-based (ICL) LLM Task

An In-Context-based (ICL) LLM Task is an LLM-based NLP task that uses the LLM input context to specify the inference task.

AKA: Prompt-based Learning.
Context:
- Input: Pre-Trained Large Language Model.
- Measure: In-Context Learning Performance Measure (such as ICL recall).
- ...
- It can range from being an In-Context Zero-Shot Learning Task to being an In-Context Few-Shot Learning Task to being an In-Context Many-Shot Learning Task, depending on the number of provided examples.
- It can range from being an In-Text Context Learning to being an In-Image Context Learning to being an In-Image-and-Text Context Learning, depending on the modality of the input context.
- It can range from being a Single-Prompt In-Context Learning Task to being a Multi-Prompt In-Context Learning Task to being a Dynamic-Prompt In-Context Learning Task, depending on the prompt structure and whether multiple prompts or a dynamic prompt strategy are used.
- It can range from being an In-Context Conversational Task to being an In-Context Document Processing Task to being an In-Context Multimedia Understanding Task, depending on the type of content and interaction required.
- It can range from being a Task-Specific In-Context Learning Task to being a Domain-Specific In-Context Learning Task to being a General-Purpose In-Context Learning Task, depending on the target domain or generalization scope.
- ...
- It can utilize Prompt Engineering techniques to enhance performance, where the prompts are designed to elicit the most relevant and coherent responses based on the context.
- It can be supported by an In-Context Learning System (that implements an in-context learning algorithm).
- ...
Example(s):
- In-Context Information Retrieval, such as zero-shot IR.
- In-Context Information Extraction, such as zero-shot IE.
- In-Context Question-Answering (QA), such as open-domain QA.
- In-Context Text Summarization, such as real-time news summarization.
- In-Context Machine Translation, such as cross-lingual translation in-context.
- In-Context Sentiment Analysis, such as social media sentiment analysis.
- In-Context Text Completion, such as autocompletion for emails.
- In-Context Code Generation, such as function completion in programming.
- In-Context Creative Writing, such as story continuation.
- ...
Counter-Example(s):
- Simple Command Execution, where tasks require only straightforward, single-step commands without the need for understanding broader context.
- LLM Independent Research Tasks, where the LLM is tasked with generating information or answers without any specific contextual grounding.
- Background Knowledge Q/A Task.
See: Contextual Understanding, Adaptive Learning Systems, Human-Computer Interaction, Language Model Fine-Tuning, Zero-Shot Learning, Prompt Engineering.
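
The range from zero-shot to many-shot settings described above can be illustrated with a minimal Python sketch. The function names `shot_setting` and `build_icl_prompt`, the `Input:`/`Output:` layout, and the many-shot threshold of 100 are all assumptions made for illustration; none of them is a standard definition.

```python
from typing import List, Tuple


def shot_setting(num_examples: int, many_shot_threshold: int = 100) -> str:
    """Name the ICL setting for a given number of in-context examples.

    The many-shot threshold is an illustrative assumption, not a standard value.
    """
    if num_examples < 0:
        raise ValueError("num_examples must be non-negative")
    if num_examples == 0:
        return "zero-shot"
    if num_examples == 1:
        return "one-shot"
    if num_examples < many_shot_threshold:
        return "few-shot"
    return "many-shot"


def build_icl_prompt(
    instruction: str, examples: List[Tuple[str, str]], query: str
) -> str:
    """Assemble an instruction, optional demonstrations, and the query.

    The task is specified entirely by this text; no model weights change.
    """
    lines = [instruction]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    # The final "Output:" is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```

Passing an empty `examples` list yields a zero-shot prompt; adding demonstrations moves the same task toward the few-shot and many-shot end of the range.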

References

2024

(Kamath et al., 2024) ⇒ Uday Kamath, Kevin Keenan, Garrett Somers, and Sarah Sorenson. (2024). “Large Language Models: A Deep Dive.” Springer.
- NOTES:
  - The book extensively discusses **In-Context Learning (ICL)** in **Chapter 3: Prompt-based Learning**, highlighting how LLMs use input context to perform inference tasks without explicit fine-tuning.
  - It provides practical strategies for **prompt engineering** to enhance ICL performance, offering insights into designing prompts that elicit relevant and coherent responses based on context.
  - The authors explore various types of ICL tasks, ranging from **zero-shot** to **few-shot learning**, and explain how LLMs adapt to different amounts of contextual information.
  - The book addresses challenges related to ICL, such as context length limitations and maintaining coherence in generated outputs, and proposes solutions.
  - The versatility of ICL is showcased through applications in **question-answering**, **machine translation**, **code generation**, and more, demonstrating the broad applicability of in-context-based LLM tasks.
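
One way to picture the context length limitation mentioned in these notes is a budgeted selection of demonstrations. The sketch below is not taken from the book; `count_tokens` (a whitespace word count) and `fit_examples_to_budget` are hypothetical helpers, and a real system would count tokens with the model's own tokenizer.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    """Rough token count: whitespace-separated words.

    This is an assumption for illustration; real tokenizers give other counts.
    """
    return len(text.split())


def fit_examples_to_budget(
    examples: List[Tuple[str, str]], query: str, max_tokens: int
) -> List[Tuple[str, str]]:
    """Keep demonstrations, in order, until the next one would exceed the budget."""
    used = count_tokens(query)  # the query itself always has to fit
    kept: List[Tuple[str, str]] = []
    for source, target in examples:
        cost = count_tokens(source) + count_tokens(target)
        if used + cost > max_tokens:
            break
        kept.append((source, target))
        used += cost
    return kept
```

With a small budget the function may return no demonstrations at all, which effectively falls back to a zero-shot prompt.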

2023

chat
- In-context learning refers to the process by which large language models (LLMs) like GPT-4 learn to perform tasks and generate responses based on the context provided in the input text. Instead of being explicitly trained for specific tasks with labeled data, LLMs learn to understand and respond to prompts by using the context from the text and their vast pre-training on diverse text corpora.
  The idea behind in-context learning is that the model can adapt and generalize to a wide range of tasks without explicit fine-tuning, simply by leveraging the contextual information provided in the input. This is a key aspect of the capabilities of LLMs, as it allows them to perform well on various tasks, from answering questions and generating content to providing summaries or translations.
  In-context learning also means that the model can improve its performance as the input context becomes richer and more informative. This is because LLMs are designed to capture and exploit relationships between words, phrases, and concepts in the text, which helps them generate more accurate and relevant responses based on the given context.
- In-context learning with large language models (LLMs) can be used across various tasks and domains. Here are some examples of different types of in-context learning applications with LLMs:
  - Question-Answering (QA): LLMs can be provided with a context, such as a paragraph or an article, along with a question related to the text. The model will then generate an answer based on the given context.
  - Text Summarization: LLMs can be given a long text and tasked with generating a concise summary. The model will extract and condense the most important information from the input text using the context.
  - Machine Translation: LLMs can translate text from one language to another by understanding the context of the source text and generating a corresponding translation in the target language.
  - Sentiment Analysis: LLMs can analyze the sentiment of a given text, such as a product review or a social media post, by examining the context and determining the positive, negative, or neutral tone of the content.
  - Text Completion: LLMs can be given an incomplete sentence or paragraph and asked to complete it in a coherent and contextually appropriate manner.
  - Code Generation: LLMs can be given a natural language description of a programming task and generate the corresponding code, taking into account the context provided in the description.
  - Creative Writing: LLMs can be given a writing prompt and asked to generate a story, poem, or other creative text based on the context of the prompt.
  - Conversational Agents: LLMs can be used to build chatbots and virtual assistants that can engage in contextually relevant and coherent conversations with users.
- These are just a few examples of the many possible applications of in-context learning with LLMs. The key in each of these examples is that the LLM uses the context provided in the input to understand the task and generate an appropriate response.
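
The point that the input context alone defines the task can be sketched with a small set of prompt templates. The template wording, the `TASK_TEMPLATES` dictionary, and `render_task_prompt` are illustrative assumptions, not a fixed or recommended format.

```python
# Hypothetical templates: each one turns task inputs into a prompt whose
# trailing label ("Answer:", "Summary:", ...) the model is expected to complete.
TASK_TEMPLATES = {
    "question_answering": "Context: {context}\nQuestion: {question}\nAnswer:",
    "summarization": "Text: {text}\nSummary:",
    "translation": (
        "Translate from {source_lang} to {target_lang}.\n"
        "Text: {text}\nTranslation:"
    ),
    "sentiment_analysis": (
        "Review: {text}\nSentiment (positive, negative, or neutral):"
    ),
}


def render_task_prompt(task: str, **fields: str) -> str:
    """Fill the template for `task` with the given fields."""
    if task not in TASK_TEMPLATES:
        raise KeyError(f"unknown task: {task}")
    return TASK_TEMPLATES[task].format(**fields)


# Usage: the same model receives a different task purely through its input.
qa_prompt = render_task_prompt(
    "question_answering",
    context="Paris is the capital of France.",
    question="What is the capital of France?",
)
```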

2023

chat
- The “Language Models are Few-Shot Learners” (LMaFSL) paper evaluated GPT-3 on a range of natural language processing tasks, using zero-shot, one-shot, and few-shot learning settings. Here are the tasks and examples for each setting:
  - Translation: For English-German, English-French, and English-Romanian translation tasks, GPT-3 was evaluated in zero-shot, one-shot, and few-shot settings.
    - Example (English-French, one-shot):
      - Input: "Translate the following English text to French: '{example English text}' : '{example French text}' ||| Translate the following English text to French: 'I do not want to go.'"
      - Output: "Je ne veux pas y aller."
  - Summarization: The model was tested on summarizing text using the CNN/Daily Mail dataset. The evaluation included zero-shot, one-shot, and few-shot settings.
    - Example (zero-shot):
      - Input: "summarize: '{long text}'"
      - Output: "{summarized text}"
  - Sentiment Analysis: GPT-3 was evaluated on sentiment classification using the Stanford Sentiment Treebank and the IMDb datasets in zero-shot, one-shot, and few-shot settings.
    - Example (one-shot):
      - Input: "Review sentiment: 'I absolutely loved this movie. It's fantastic!' : Positive ||| Review sentiment: 'This movie was a waste of time. I did not enjoy it at all.'"
      - Output: "Negative"
  - Question Answering: GPT-3 was tested on closed-book question answering using the Natural Questions, WebQuestions, and TriviaQA datasets in zero-shot, one-shot, and few-shot settings.
    - Example (zero-shot):
      - Input: "What is the capital of France?"
      - Output: "Paris"
  - Commonsense Reasoning: GPT-3's performance in commonsense reasoning was assessed using the Winograd Schema Challenge and the COPA dataset in zero-shot, one-shot, and few-shot settings.
    - Example (Winograd Schema, zero-shot):
      - Input: "The trophy would not fit in the brown suitcase because it was too big. What was too big?"
      - Output: "the trophy"
  - Reading Comprehension: GPT-3 was tested on the SuperGLUE benchmark, which includes various subtasks like BoolQ, MultiRC, ReCoRD, and WiC, using zero-shot, one-shot, and few-shot learning settings.
    - Example (BoolQ, one-shot):
      - Input: "Question: 'Is Finland part of Scandinavia?' Answer: Yes ||| Question: 'Does the sun orbit the Earth?'"
      - Output: "Answer: No"
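
The example inputs above share one layout: labeled demonstrations joined by `|||`, followed by an unlabeled query. A minimal sketch of that layout follows; `build_demo_prompt` and `SEPARATOR` are hypothetical names, and the format mirrors the examples in this entry rather than the exact prompts used in the paper.

```python
from typing import List, Tuple

SEPARATOR = " ||| "  # delimiter between demonstrations, as in the examples above


def build_demo_prompt(
    task_prefix: str, demonstrations: List[Tuple[str, str]], query: str
) -> str:
    """Join labeled demonstrations and an unlabeled query into one prompt.

    Zero demonstrations gives a zero-shot prompt, one gives one-shot, and so on.
    """
    parts = [
        f"{task_prefix}: '{text}' : {label}" for text, label in demonstrations
    ]
    # The query has no label; the model is expected to supply it.
    parts.append(f"{task_prefix}: '{query}'")
    return SEPARATOR.join(parts)


one_shot = build_demo_prompt(
    "Review sentiment",
    [("I absolutely loved this movie. It's fantastic!", "Positive")],
    "This movie was a waste of time. I did not enjoy it at all.",
)
```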

A Few-Shot Information Extraction (IE) Task is an in-context information extraction task that is a few-shot NLP task.

Context:
- It can (typically) be a Few-Shot IE Task from Text.
- ...
Example(s):
Counter-Example(s):
- Zero-Shot Information Extraction.
See: Information Extraction, Zero-Shot Learning, Natural Language Processing, Machine Learning.

References

2022

(Agrawal et al., 2022) ⇒ Monica Agrawal, Stefan Hegselmann, Hunter Lang, Yoon Kim, and David Sontag. (2022). “Large Language Models are Few-Shot Clinical Information Extractors.” In: Proceedings of the EMNLP-2022.
- QUOTE: ... In prompt-based learning (also known as in-context learning), a pretrained language model is adapted to different tasks via priming on natural language prompts — pieces of text that are combined with an input and then fed to the language model to produce an output for that task. This paradigm has been successful for few-shot and zero-shot learning at many general-domain tasks (Brown et al., 2020; Liu et al., 2021; Wei et al., 2021; Sanh et al., 2021).
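
The idea in this quote, a prompt combined with an input and fed to the model to produce an output, can be sketched for few-shot extraction as follows. This is not the authors' method: the `Text:`/`Entities:` layout, the helper names, and the comma-separated output convention are assumptions, and `model` stands for any callable that maps a prompt string to a completion string.

```python
from typing import Callable, List, Tuple


def build_ie_prompt(
    examples: List[Tuple[str, List[str]]], text: str, field_name: str = "Entities"
) -> str:
    """Prime the model with solved examples, then append the new input."""
    blocks = [
        f"Text: {example_text}\n{field_name}: {', '.join(entities)}"
        for example_text, entities in examples
    ]
    blocks.append(f"Text: {text}\n{field_name}:")
    return "\n\n".join(blocks)


def parse_entities(completion: str) -> List[str]:
    """Read a comma-separated list from the first line of the completion."""
    stripped = completion.strip()
    if not stripped:
        return []
    first_line = stripped.splitlines()[0]
    return [item.strip() for item in first_line.split(",") if item.strip()]


def extract(
    model: Callable[[str], str],
    examples: List[Tuple[str, List[str]]],
    text: str,
) -> List[str]:
    """Run one few-shot extraction: build the prompt, call the model, parse."""
    return parse_entities(model(build_ie_prompt(examples, text)))
```

Only the first line of the completion is parsed because a model may keep generating further `Text:` blocks after answering.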

2021

(Liu et al., 2021) ⇒ Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. (2021). “What Makes Good In-Context Examples for GPT-3?” arXiv preprint arXiv:2101.06804

2021

(Rubin et al., 2021) ⇒ Ohad Rubin, Jonathan Herzig, and Jonathan Berant. (2021). “Learning to Retrieve Prompts for In-Context Learning.” arXiv preprint arXiv:2112.08633

2020

(Brown et al., 2020) ⇒ Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. (2020). “Language Models Are Few-Shot Learners.” In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
- QUOTE: ... There are many approaches to building multi-task models. Giving task instructions in natural language was first formalized in a supervised setting with [MKXS18] and used in [RWC+19] for in-context learning and in [RSR+19] for multi-task fine-tuning. ...
- Figure 2.1: Zero-shot, one-shot and few-shot, contrasted with traditional fine-tuning. The panels above show four methods for performing a task with a language model – fine-tuning is the traditional method, whereas zero-, one-, and few-shot, which we study in this work, require the model to perform the task with only forward passes at test time. We typically present the model with a few dozen examples in the few shot setting. Exact phrasings for all task descriptions, examples and prompts can be found in Appendix G.

2019

(Radford et al., 2019) ⇒ Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. (2019). “Language Models Are Unsupervised Multitask Learners.” In: OpenAI Blog, 1(8).