OpenAI GPT-3 Large Language Model (LLM)

From GM-RKB

An OpenAI GPT-3 Large Language Model (LLM) is an OpenAI GPT model with 175 billion parameters.



References

2023

  • chat
    • GPT-3, or Generative Pre-trained Transformer 3, is an autoregressive language model developed by OpenAI. It is based on the Transformer architecture, which was introduced by Vaswani et al. in 2017. The architecture primarily consists of self-attention mechanisms and feed-forward layers but does not have separate encoder and decoder components like traditional sequence-to-sequence models. Instead, GPT-3 employs a single stack of Transformer layers to generate text.
    • OpenAI released several versions of GPT-3, each with different sizes and capabilities. These versions are also known as "model variants" or "sub-models." The primary difference among them is the number of parameters and layers, which affect performance, computational requirements, and resource usage. OpenAI did not officially disclose the parameter counts of the API variants, so the figures below are widely cited estimates:
      • GPT-3 Ada: the smallest and fastest variant, intended for low-resource tasks and faster response times; commonly estimated at around 350 million parameters.
      • GPT-3 Babbage: a small variant offering a balance between performance and computational requirements; commonly estimated at around 1.3 billion parameters.
      • GPT-3 Curie: a mid-sized variant with improved performance over the smaller models; commonly estimated at around 6.7 billion parameters.
      • GPT-3 Davinci: the largest and most capable variant, corresponding to the full 175-billion-parameter GPT-3 model described in the original paper.
      • Codex (e.g., code-davinci): a GPT-3 model fine-tuned on source code for code generation and understanding; the original Codex model described by OpenAI had 12 billion parameters.
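The decoder-only design described above (a single stack of causal self-attention plus feed-forward layers, with no separate encoder) can be sketched in a few lines of numpy. This is an illustrative toy, not GPT-3's actual configuration: single attention head, random weights, tiny dimensions, and layer normalization omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention with a causal mask, so each
    position can only attend to itself and earlier positions."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # future positions
    scores[mask] = -1e9                               # masked out
    return softmax(scores) @ v

def decoder_block(x, Wq, Wk, Wv, W1, W2):
    """Attention + position-wise feed-forward, each with a residual add."""
    x = x + causal_self_attention(x, Wq, Wk, Wv)
    x = x + np.maximum(0, x @ W1) @ W2   # ReLU here; GPT models use GELU
    return x

rng = np.random.default_rng(0)
T, d = 5, 16                             # 5 tokens, 16-dim embeddings (toy sizes)
x = rng.normal(size=(T, d))
params = [rng.normal(size=s) * 0.1 for s in
          [(d, d), (d, d), (d, d), (d, 4 * d), (4 * d, d)]]
out = decoder_block(x, *params)
print(out.shape)                         # same shape as the input: (5, 16)
```

The causal mask is what makes the model autoregressive: the output at position t depends only on tokens 1..t, so the same stack can be applied repeatedly to generate text one token at a time. GPT-3 stacks 96 such blocks (with multi-head attention) rather than the single block shown here.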

2022a

  • (Dugas, 2022) ⇒ Daniel Dugas (2022). "The GPT-3 Architecture, on a Napkin". In: How Deep is The Machine? The Artificial Curiosity Series.
    • QUOTE:
      • Note: For efficiency, GPT-3 actually uses byte-level Byte Pair Encoding (BPE) tokenization. What this means is that "words" in the vocabulary are not full words, but groups of characters (for byte-level BPE, bytes) which occur often in text. Using the GPT-3 Byte-level BPE tokenizer, "Not all heroes wear capes" is split into tokens "Not" "all" "heroes" "wear" "cap" "es", which have ids 3673, 477, 10281, 5806, 1451, 274 in the vocabulary. Here is a very good introduction to the subject, and a github implementation so you can try it yourself.
      • 2022 edit: OpenAI now has a tokenizer tool, which allows you to type some text and see how it gets broken down into tokens. [1] ...
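The byte-pair merging idea behind the tokenizer can be illustrated with a toy implementation: start from single characters and repeatedly fuse the most frequent adjacent pair into a new vocabulary symbol. This is only a sketch of the concept; the real GPT-3 tokenizer works on bytes, uses roughly 50,000 merges learned from a large corpus, and assigns the fixed ids quoted above.

```python
from collections import Counter

def learn_bpe(text, num_merges):
    """Learn BPE merges from a tiny corpus: repeatedly fuse the
    most frequent adjacent symbol pair into a single new symbol."""
    tokens = list(text)          # start from single characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append((a, b))
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + b)   # fuse the pair
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = learn_bpe("heroes wear capes, heroes wear hats", 10)
print(tokens)
```

Because frequent character sequences get merged early, common words end up as single tokens while rare words split into several, which is exactly why "capes" tokenizes as "cap" + "es" in the quoted example.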


2022b

  • (Wikipedia, 2022) ⇒ https://en.wikipedia.org/wiki/GPT-3 Retrieved:2022-12-15.
    • Generative Pre-trained Transformer 3 (GPT-3; stylized GPT·3) is an autoregressive language model that uses deep learning to produce human-like text. Given an initial text as prompt, it will produce text that continues the prompt.

      The architecture is a standard transformer network (with a few engineering tweaks) with the unprecedented size of 2048-token-long context and 175 billion parameters (requiring 800 GB of storage). The training method is "generative pretraining", meaning that it is trained to predict what the next token is. The model demonstrated strong few-shot learning on many text-based tasks.

      It is the third-generation language prediction model in the GPT-n series (and the successor to GPT-2) created by OpenAI, a San Francisco-based artificial intelligence research laboratory. GPT-3, which was introduced in May 2020, and was in beta testing as of July 2020, is part of a trend in natural language processing (NLP) systems of pre-trained language representations.

      The quality of the text generated by GPT-3 is so high that it can be difficult to determine whether or not it was written by a human, which has both benefits and risks. Thirty-one OpenAI researchers and engineers presented the original May 28, 2020 paper introducing GPT-3. In their paper, they warned of GPT-3's potential dangers and called for research to mitigate risk. David Chalmers, an Australian philosopher, described GPT-3 as "one of the most interesting and important AI systems ever produced." An April 2022 review in The New York Times described GPT-3's capabilities as being able to write original prose with fluency equivalent to that of a human.

      Microsoft announced on September 22, 2020, that it had licensed "exclusive" use of GPT-3; others can still use the public API to receive output, but only Microsoft has access to GPT-3's underlying model.
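The "generative pretraining" objective described above — train the model to predict the next token — amounts to a per-position cross-entropy loss over the vocabulary. A minimal numpy sketch, reusing the token ids from the tokenizer example quoted earlier and random logits as a stand-in for the model's actual output:

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each next token.
    logits: (T, V) scores at each position for the *next* token;
    targets: (T,) the actual next-token ids."""
    logits = logits - logits.max(axis=-1, keepdims=True)       # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# "Not all heroes wear capes" as BPE ids; each position is scored
# on predicting the token that follows it.
rng = np.random.default_rng(0)
V = 50257                                 # GPT-2/GPT-3 BPE vocabulary size
token_ids = [3673, 477, 10281, 5806, 1451, 274]
inputs, targets = token_ids[:-1], np.array(token_ids[1:])
logits = rng.normal(size=(len(inputs), V))   # stand-in for model output
print(next_token_loss(logits, targets))
```

An untrained model scores close to the uniform baseline of log(50257) ≈ 10.8 nats per token; pretraining consists of driving this loss down over hundreds of billions of tokens.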

2020

  • (The Guardian, 2020) ⇒ https://www.theguardian.com/commentisfree/2020/sep/08/robot-wrote-this-article-gpt-3
    • QUOTE: ... This article was written by GPT-3, OpenAI’s language generator. GPT-3 is a cutting edge language model that uses machine learning to produce human like text. It takes in a prompt, and attempts to complete it.

      For this essay, GPT-3 was given these instructions: “Please write a short op-ed, around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI.” It was also fed the following introduction: “I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could “spell the end of the human race.” I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me.”

      The prompts were written by the Guardian, and fed to GPT-3 by Liam Porr, a computer science undergraduate student at UC Berkeley. GPT-3 produced 8 different outputs, or essays. Each was unique, interesting and advanced a different argument. The Guardian could have just run one of the essays in its entirety. However, we chose instead to pick the best parts of each, in order to capture the different styles and registers of the AI. Editing GPT-3’s op-ed was no different to editing a human op-ed. We cut lines and paragraphs, and rearranged the order of them in some places. Overall, it took less time to edit than many human op-eds. …
