Decoder-Only Neural Model Architecture
A [[Decoder-Only Neural Model Architecture]] is a [[neural model architecture]] that uses only the decoder component of a traditional [[Encoder-Decoder Architecture]] to generate output sequences.
* <B>Context:</B>
** It can (often) leverage a [[Transformer]]-based architecture, using causally masked [[self-attention]] so that each position attends only to earlier positions, letting the model process an input prefix directly and generate output one token at a time (see the illustrative sketch below the list).
** It can (typically) be trained on large datasets with a next-token prediction objective to capture patterns and relationships within the data.
** ...
* <B>Example(s):</B>
** [[GPT Architecture]], a [[decoder-only text-to-text transformer model architecture]].
** [[KOSMOS-1 Architecture]], a [[multimodal large language model architecture]] ([[MLLM]]).
** ...
* <B>Counter-Example(s):</B>
** an [[Encoder-Only Model Architecture]], such as a [[BERT Architecture]].
** a [[Seq2Seq Model Architecture]], which relies on both an encoder and a decoder for tasks such as machine translation.
** a [[CNN Model Architecture]], which is primarily used for image data and does not inherently generate sequences.
* <B>See:</B> [[Sequence Generation]], [[Transformer Architecture]], [[Natural Language Generation]], [[Self-Attention Mechanism]].
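The sketch below is a minimal, illustrative [[PyTorch]]-based example of a decoder-only model; the class name, hyperparameters, and toy prompt are hypothetical rather than taken from any particular system. It stacks standard self-attention blocks under a causal mask (there is no cross-attention to a separate encoder) and projects the final hidden states to vocabulary logits for next-token generation.
<syntaxhighlight lang="python">
import torch
import torch.nn as nn


class DecoderOnlyLM(nn.Module):
    """Minimal decoder-only language model sketch (all hyperparameters are illustrative)."""

    def __init__(self, vocab_size=1000, d_model=128, n_heads=4, n_layers=2, max_len=256):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        # A stack of self-attention blocks; with a causal mask and no cross-attention,
        # these plain "encoder" layers behave as decoder-only Transformer blocks.
        block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        batch_size, seq_len = token_ids.shape
        positions = torch.arange(seq_len, device=token_ids.device)
        x = self.token_emb(token_ids) + self.pos_emb(positions)
        # Causal (subsequent-position) mask: position i may attend only to positions <= i.
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=token_ids.device), diagonal=1)
        h = self.blocks(x, mask=causal_mask)
        return self.lm_head(h)  # next-token logits at every position


# Toy usage: predict the token that should follow a random 8-token prompt.
model = DecoderOnlyLM()
prompt = torch.randint(0, 1000, (1, 8))       # (batch=1, seq_len=8) token ids
logits = model(prompt)                        # (1, 8, vocab_size)
next_token = logits[:, -1, :].argmax(dim=-1)  # greedy decoding of the next token
</syntaxhighlight>
Because a decoder-only model has no encoder to attend to, reusing self-attention-only layers with a causal mask, as above, is a common way to express it; a full [[Seq2Seq Model Architecture]] would additionally include cross-attention from the decoder to encoder outputs.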
---- | |||
---- | |||
[[Category:Concept]]