Decoder-only Transformer-based Model
A Decoder-only Transformer-based Model is a transformer-based model whose neural network consists solely of decoder components (it has no separate encoder).
- Context:
- It can (typically) be employed in natural language generation tasks.
- It can generate text in an autoregressive manner, where each new output is conditioned on the previous outputs.
- It can (often) use techniques such as masked self-attention, so that each position attends only to earlier elements of the sequence when predicting the next element.
- It can be less suited than encoder-based models to tasks that benefit from bidirectional encoding of the input, as it lacks an encoder component and each position attends only to earlier positions.
- ...
- Example(s):
- GPT (Generative Pre-trained Transformer) models, such as GPT-3 and GPT-4.
- Decoder-only Language Models used in creative writing applications.
- ...
- Counter-Example(s):
- See: Autoregressive Model, Natural Language Generation, Masked Self-Attention, Transformer Architecture.
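
The masked self-attention and autoregressive generation described in the Context items can be illustrated with a minimal sketch in plain Python. It is not taken from any particular model: it assumes a single attention head, no learned projections, and a hypothetical `next_token_fn` callback standing in for a trained network. The names `causal_mask`, `masked_self_attention`, and `generate` are illustrative only.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def softmax(xs):
    """Numerically stable softmax over a list of floats (may contain -inf)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def masked_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Each argument is a list of equal-length float vectors, one per position.
    Returns (outputs, weights); weights[i][j] is 0.0 for every j > i.
    """
    n = len(queries)
    d = len(queries[0])
    mask = causal_mask(n)
    outputs, all_weights = [], []
    for i in range(n):
        scores = []
        for j in range(n):
            if mask[i][j]:
                dot = sum(q * k for q, k in zip(queries[i], keys[j]))
                scores.append(dot / math.sqrt(d))
            else:
                # Future positions get -inf, so softmax assigns them weight 0.
                scores.append(float("-inf"))
        weights = softmax(scores)
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


def generate(next_token_fn, prompt, max_new_tokens):
    """Autoregressive loop: each new token is conditioned on all tokens so far.

    next_token_fn is a hypothetical stand-in for a trained decoder-only model.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(next_token_fn(tokens))
    return tokens


# Toy usage: three positions with 2-dimensional vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = masked_self_attention(x, x, x)

# Toy "model" that always predicts the previous token plus one, modulo 10.
sequence = generate(lambda toks: (toks[-1] + 1) % 10, [1, 2], 3)
```

In this sketch the first position can attend only to itself, so its output equals its own value vector, while later positions mix information from every earlier position. The generation loop re-feeds the growing sequence at each step, which is the sense in which each new output is conditioned on the previous outputs.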