Encoder-only Transformer-based Large Language Model (LLM)
An Encoder-only Transformer-based Large Language Model (LLM) is a transformer-based LLM that uses only the encoder stack of the Transformer architecture.
- Context:
- It can focus on encoding input text into meaningful Contextualized Representations.
- It can support Natural Language Understanding (NLU) Tasks.
- It can use pretraining methods such as Masked Language Modeling and Next-Sentence Prediction (a minimal sketch of masked-token prediction appears after this definition list).
- It can be fine-tuned on smaller datasets for specific tasks such as sentiment analysis, question-answering, and named entity recognition.
- It can incorporate Bidirectional Attention, enabling each token's representation to draw on the full left and right context of a sentence or text.
- ...
- Example(s):
- BERT (Bidirectional Encoder Representations from Transformers).
- ...
- Counter-Example(s):
- a Decoder-only Transformer-based LLM, such as a GPT model.
- an Encoder-Decoder Transformer-based LLM, such as a T5 model.
- See: Auto-Encoding Transformer-based LLM, Bidirectional Encoder Representations from Transformers, Natural Language Understanding.
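The masked-language-modeling pretraining objective mentioned in the Context can be illustrated with a short sketch. The example below is an illustrative, minimal use of a pretrained encoder-only model for masked-token prediction; it assumes the Hugging Face transformers library and the publicly released bert-base-uncased checkpoint, neither of which is prescribed by this definition.

```python
# Minimal sketch of masked-token prediction with an encoder-only model.
# Assumes the Hugging Face `transformers` library and the public
# `bert-base-uncased` checkpoint (illustrative choices, not requirements).
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask one token; bidirectional attention lets the encoder use context
# from both sides of the [MASK] position when predicting it.
text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Locate the masked position and take the highest-scoring vocabulary entry.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))  # typically decodes to "paris"
```

During pretraining, a fraction of the input tokens is masked in this way and the encoder is trained to recover them from the surrounding bidirectional context.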
References
2023
- chat
- Encoder-only models focus on encoding input text into meaningful contextualized representations. These models are particularly well-suited for tasks where the main goal is to understand and extract information from the input text. BERT (Bidirectional Encoder Representations from Transformers) is a prominent example of an encoder-only LLM. BERT uses the Transformer's encoder layers to pretrain deep bidirectional representations, capturing the context from both left and right directions. This allows BERT to excel in tasks like sentence classification, named entity recognition, and question-answering.
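As a companion to the description above, the following is a minimal sketch of extracting contextualized representations from BERT's encoder; it again assumes the Hugging Face transformers library and the bert-base-uncased checkpoint. The [CLS] vector shown at the end is the representation typically passed to a task-specific classification head during fine-tuning.

```python
# Minimal sketch of obtaining contextualized token and sentence representations
# from BERT's encoder. Assumes the Hugging Face `transformers` library and the
# public `bert-base-uncased` checkpoint (illustrative choices, not requirements).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
encoder.eval()

inputs = tokenizer("BERT encodes the whole sentence at once.", return_tensors="pt")

with torch.no_grad():
    outputs = encoder(**inputs)

# One contextualized vector per input token; each vector reflects both the
# left and right context of that token (hidden size 768 for bert-base).
token_embeddings = outputs.last_hidden_state        # (1, seq_len, 768)

# The [CLS] vector at position 0 is commonly used as a whole-sequence
# representation for downstream classifiers during fine-tuning.
sentence_embedding = outputs.last_hidden_state[:, 0, :]
print(token_embeddings.shape, sentence_embedding.shape)
```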