In-Context-based (ICL) LLM Task

An In-Context-based (ICL) LLM Task is an LLM-based NLP task that uses the LLM input context (instructions, demonstrations, or both) to specify the inference task, without updating the model's parameters.



References

2024

  • (Kamath et al., 2024) ⇒ Uday Kamath, Kevin Keenan, Garrett Somers, and Sarah Sorenson. (2024). “Large Language Models: A Deep Dive.” Springer.
    • NOTES:
      • The book extensively discusses **In-Context Learning (ICL)** in **Chapter 3: Prompt-based Learning**, highlighting how LLMs use input context to perform inference tasks without explicit fine-tuning.
      • It provides practical strategies for **prompt engineering** to enhance ICL performance, offering insights into designing prompts that elicit relevant and coherent responses based on context.
      • The authors explore various types of ICL tasks, ranging from **zero-shot** to **few-shot learning**, and explain how LLMs adapt to different amounts of contextual information.
      • Challenges related to ICL, such as context length limitations and maintaining coherence in generated outputs, are addressed with proposed solutions in the book.
      • The versatility of ICL is showcased through applications in **question-answering**, **machine translation**, **code generation**, and more, demonstrating the broad applicability of in-context-based LLM tasks.
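
The zero-shot to few-shot range and the context-length limit mentioned in the notes above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the book: the `Input:`/`Output:` layout, the helper names, and especially `count_tokens`, which counts whitespace-separated words as a crude stand-in for a real tokenizer.

```python
from typing import List, Tuple

Demo = Tuple[str, str]  # (input text, expected output)


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_prompt(instruction: str, demos: List[Demo], query: str) -> str:
    """Instruction first, then k worked examples, then the unanswered query."""
    blocks = [instruction]
    for source, target in demos:
        blocks.append(f"Input: {source}\nOutput: {target}")
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


def fit_to_budget(instruction: str, demos: List[Demo], query: str, max_tokens: int) -> str:
    """Drop demonstrations from the end until the prompt fits the token budget."""
    kept = list(demos)
    while kept and count_tokens(build_prompt(instruction, kept, query)) > max_tokens:
        kept.pop()
    return build_prompt(instruction, kept, query)


instruction = "Classify the sentiment of each input."
demos = [("great film", "positive"), ("dull plot", "negative"), ("loved it", "positive")]
query = "not bad at all"

zero_shot = build_prompt(instruction, [], query)        # no demonstrations
one_shot = build_prompt(instruction, demos[:1], query)  # one demonstration
few_shot = build_prompt(instruction, demos, query)      # several demonstrations
trimmed = fit_to_budget(instruction, demos, query, max_tokens=25)
```

The only thing that changes between the three settings is how many demonstrations are placed in the context; the model itself is untouched. Dropping demonstrations from the end is one simple truncation policy among many, chosen here only to keep the sketch short.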

2023

  • chat
    • In-context learning refers to the process by which large language models (LLMs) like GPT-4 learn to perform tasks and generate responses based on the context provided in the input text. Instead of being explicitly trained for specific tasks with labeled data, LLMs learn to understand and respond to prompts by using the context from the text and their vast pre-training on diverse text corpora.

      The idea behind in-context learning is that the model can adapt and generalize to a wide range of tasks without explicit fine-tuning, simply by leveraging the contextual information provided in the input. This is a key aspect of the capabilities of LLMs, as it allows them to perform well on various tasks, from answering questions and generating content to providing summaries or translations.

      In-context learning also means that the model can improve its performance as the input context becomes richer and more informative. This is because LLMs are designed to capture and exploit relationships between words, phrases, and concepts in the text, which helps them generate more accurate and relevant responses based on the given context.

    • In-context learning with large language models (LLMs) can be used across various tasks and domains. Here are some examples of different types of in-context learning applications with LLMs:
      • Question-Answering (QA): LLMs can be provided with a context, such as a paragraph or an article, along with a question related to the text. The model will then generate an answer based on the given context.
      • Text Summarization: LLMs can be given a long text and tasked with generating a concise summary. The model will extract and condense the most important information from the input text using the context.
      • Machine Translation: LLMs can translate text from one language to another by understanding the context of the source text and generating a corresponding translation in the target language.
      • Sentiment Analysis: LLMs can analyze the sentiment of a given text, such as a product review or a social media post, by examining the context and determining the positive, negative, or neutral tone of the content.
      • Text Completion: LLMs can be given an incomplete sentence or paragraph and asked to complete it in a coherent and contextually appropriate manner.
      • Code Generation: LLMs can be given a natural language description of a programming task and generate the corresponding code, taking into account the context provided in the description.
      • Creative Writing: LLMs can be given a writing prompt and asked to generate a story, poem, or other creative text based on the context of the prompt.
      • Conversational Agents: LLMs can be used to build chatbots and virtual assistants that can engage in contextually relevant and coherent conversations with users.
    • These are just a few examples of the many possible applications of in-context learning with LLMs. The key in each of these examples is that the LLM uses the context provided in the input to understand the task and generate an appropriate response.
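
The point that the input context alone specifies the task can be sketched as a small template registry. The template wording, the field names, and the `render_task` helper are hypothetical and chosen only for illustration; they are not a standard prompt format.

```python
# Hypothetical prompt templates: one per application named in the list above.
TASK_TEMPLATES = {
    "question_answering": "Context: {context}\nQuestion: {question}\nAnswer:",
    "summarization": "Summarize the following text in one sentence.\n\nText: {text}\nSummary:",
    "translation": "Translate the following {source_lang} text to {target_lang}.\n\n{text}\nTranslation:",
    "sentiment_analysis": "Classify the review as positive, negative, or neutral.\n\nReview: {text}\nSentiment:",
}


def render_task(task: str, **fields: str) -> str:
    """Fill in the template for `task`; the result is the full LLM input context."""
    if task not in TASK_TEMPLATES:
        raise KeyError(f"no template for task {task!r}")
    return TASK_TEMPLATES[task].format(**fields)


qa_prompt = render_task(
    "question_answering",
    context="Paris is the capital of France.",
    question="What is the capital of France?",
)
sentiment_prompt = render_task("sentiment_analysis", text="The battery died after a week.")
```

Both prompts would be sent to the same model; switching from question answering to sentiment analysis requires only a different context, not a different model or a fine-tuning run.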

2023

  • chat
    • The “Language Models are Few-Shot Learners” (LMaFSL) paper evaluated GPT-3 on a range of natural language processing tasks, using zero-shot, one-shot, and few-shot learning settings. Here are the tasks and examples for each setting:
      • Translation: For English-German, English-French, and English-Romanian translation tasks, GPT-3 was evaluated in zero-shot, one-shot, and few-shot settings.
        • Example (English-French, one-shot):
          • Input: "Translate the following English text to French: '{example English text}' : '{example French text}' ||| Translate the following English text to French: 'I do not want to go.'"
          • Output: "Je ne veux pas y aller."
      • Summarization: The model was tested on summarizing text using the CNN/Daily Mail dataset. The evaluation included zero-shot, one-shot, and few-shot settings.
        • Example (zero-shot):
          • Input: "summarize: '{long text}'"
          • Output: "{summarized text}"
      • Sentiment Analysis: GPT-3 was evaluated on sentiment classification using the Stanford Sentiment Treebank and the IMDb datasets in zero-shot, one-shot, and few-shot settings.
        • Example (one-shot):
          • Input: "Review sentiment: 'I absolutely loved this movie. It's fantastic!' : Positive ||| Review sentiment: 'This movie was a waste of time. I did not enjoy it at all.'"
          • Output: "Negative"
      • Question Answering: GPT-3 was tested on closed-book question answering using the Natural Questions, WebQuestions, and TriviaQA datasets in zero-shot, one-shot, and few-shot settings.
        • Example (zero-shot):
          • Input: "What is the capital of France?"
          • Output: "Paris"
      • Commonsense Reasoning: GPT-3's performance in commonsense reasoning was assessed using the Winograd Schema Challenge and the COPA dataset in zero-shot, one-shot, and few-shot settings.
        • Example (Winograd Schema, zero-shot):
          • Input: "The trophy would not fit in the brown suitcase because it was too big. What was too big?"
          • Output: "the trophy"
      • Reading Comprehension: GPT-3 was tested on the SuperGLUE benchmark, which includes various subtasks like BoolQ, MultiRC, ReCoRD, and WiC, using zero-shot, one-shot, and few-shot learning settings.
        • Example (BoolQ, one-shot):
          • Input: "Question: 'Is Finland part of Scandinavia?' Answer: Yes ||| Question: 'Does the sun orbit the Earth?'"
          • Output: "Answer: No"
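
The one-shot inputs shown above join a solved demonstration and an unsolved query with `|||`. A minimal sketch of that layout follows; the separator and the `prompt : answer` pairing mirror the examples in this entry rather than any format confirmed from the paper, and the function names are hypothetical.

```python
from typing import List, Tuple

SEPARATOR = " ||| "  # delimiter used in the example inputs above


def shot_setting(k: int) -> str:
    """Name the evaluation setting for k in-context demonstrations."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return "zero-shot"
    if k == 1:
        return "one-shot"
    return "few-shot"


def format_k_shot(demos: List[Tuple[str, str]], query: str) -> str:
    """Join solved demonstrations and the unsolved query into one input string."""
    parts = [f"{prompt} : {answer}" for prompt, answer in demos]
    parts.append(query)
    return SEPARATOR.join(parts)


demos = [("Review sentiment: 'I absolutely loved this movie. It's fantastic!'", "Positive")]
query = "Review sentiment: 'This movie was a waste of time. I did not enjoy it at all.'"

prompt = format_k_shot(demos, query)
setting = shot_setting(len(demos))
```

With an empty `demos` list the function returns the query unchanged, which corresponds to the zero-shot inputs in the examples above.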

A Few-Shot Information Extraction (IE) Task is an in-context information extraction task that is a few-shot NLP task.
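
A few-shot IE prompt can be sketched as a handful of text-to-structure demonstrations followed by the text to process. The JSON output format, the entity types, and the helper names below are assumptions made for illustration; the model call itself is left out, and `raw_output` is a hand-written stand-in for whatever a model might return.

```python
import json
from typing import Dict, List, Tuple

Extraction = Dict[str, List[str]]  # entity type -> extracted strings


def build_ie_prompt(demos: List[Tuple[str, Extraction]], text: str) -> str:
    """Show a few solved extractions, then ask for the entities in `text`."""
    blocks = ["Extract the person and organization names as JSON."]
    for source, extraction in demos:
        blocks.append(f"Text: {source}\nEntities: {json.dumps(extraction, sort_keys=True)}")
    blocks.append(f"Text: {text}\nEntities:")
    return "\n\n".join(blocks)


def parse_ie_output(raw: str) -> Extraction:
    """Parse the model's JSON reply; return an empty result if it is malformed."""
    try:
        parsed = json.loads(raw.strip())
    except json.JSONDecodeError:
        return {}
    if not isinstance(parsed, dict):
        return {}
    return {str(k): [str(v) for v in vals] for k, vals in parsed.items() if isinstance(vals, list)}


demos = [
    ("Grace Hopper worked at Remington Rand.",
     {"person": ["Grace Hopper"], "organization": ["Remington Rand"]}),
    ("Alan Turing joined the National Physical Laboratory.",
     {"person": ["Alan Turing"], "organization": ["National Physical Laboratory"]}),
]
prompt = build_ie_prompt(demos, "Ada Lovelace corresponded with Charles Babbage.")

# Hand-written stand-in for a model reply (no model is called in this sketch).
raw_output = ' {"person": ["Ada Lovelace", "Charles Babbage"], "organization": []} '
entities = parse_ie_output(raw_output)
```

Returning an empty dictionary on malformed output is one defensive choice; a stricter pipeline might instead retry the request or raise an error.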


