Encoder-only Transformer-based Large Language Model (LLM)
An Encoder-only Transformer-based Large Language Model (LLM) is a transformer-based LLM that uses only the encoder stack of the Transformer architecture.
- Context:
- It can focus on encoding input text into meaningful Contextualized Representations.
- It can support Natural Language Understanding (NLU) Tasks.
- It can use pretraining methods such as Masked Language Modeling and Next-Sentence Prediction (a minimal sketch of masked-token prediction appears after this definition list).
- It can be fine-tuned on smaller datasets for specific tasks such as sentiment analysis, question-answering, and named entity recognition.
- It can incorporate Bidirectional Attention, enabling each token's representation to draw on the full left and right context of a sentence or text.
- ...
- Example(s):
- BERT (Bidirectional Encoder Representations from Transformers).
- ...
- Counter-Example(s):
- a Decoder-only Transformer-based LLM, such as a GPT model.
- an Encoder-Decoder Transformer-based LLM, such as a T5 model.
- See: Auto-Encoding Transformer-based LLM, Bidirectional Encoder Representations from Transformers, Natural Language Understanding.
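The masked-language-modeling pretraining objective mentioned in the Context can be illustrated with a short sketch. The example below is an illustrative, minimal use of a pretrained encoder-only model for masked-token prediction; it assumes the Hugging Face transformers library and the publicly released bert-base-uncased checkpoint, neither of which is prescribed by this definition.

```python
# Minimal sketch of masked-token prediction with an encoder-only model.
# Assumes the Hugging Face `transformers` library and the public
# `bert-base-uncased` checkpoint (illustrative choices, not requirements).
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask one token; bidirectional attention lets the encoder use context
# from both sides of the [MASK] position when predicting it.
text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Locate the masked position and take the highest-scoring vocabulary entry.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))  # typically decodes to "paris"
```

During pretraining, a fraction of the input tokens is masked in this way and the encoder is trained to recover them from the surrounding bidirectional context.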
References
2023
- chat
- Encoder-only models focus on encoding input text into meaningful contextualized representations. These models are particularly well-suited for tasks where the main goal is to understand and extract information from the input text. BERT (Bidirectional Encoder Representations from Transformers) is a prominent example of an encoder-only LLM. BERT uses the Transformer's encoder layers to pretrain deep bidirectional representations, capturing the context from both left and right directions. This allows BERT to excel in tasks like sentence classification, named entity recognition, and question-answering.
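As a companion to the description above, the following is a minimal sketch of extracting contextualized representations from BERT's encoder; it again assumes the Hugging Face transformers library and the bert-base-uncased checkpoint. The [CLS] vector shown at the end is the representation typically passed to a task-specific classification head during fine-tuning.

```python
# Minimal sketch of obtaining contextualized token and sentence representations
# from BERT's encoder. Assumes the Hugging Face `transformers` library and the
# public `bert-base-uncased` checkpoint (illustrative choices, not requirements).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
encoder.eval()

inputs = tokenizer("BERT encodes the whole sentence at once.", return_tensors="pt")

with torch.no_grad():
    outputs = encoder(**inputs)

# One contextualized vector per input token; each vector reflects both the
# left and right context of that token (hidden size 768 for bert-base).
token_embeddings = outputs.last_hidden_state        # (1, seq_len, 768)

# The [CLS] vector at position 0 is commonly used as a whole-sequence
# representation for downstream classifiers during fine-tuning.
sentence_embedding = outputs.last_hidden_state[:, 0, :]
print(token_embeddings.shape, sentence_embedding.shape)
```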