Instruction-Tuned Large Language Model (LLM)
An Instruction-Tuned Large Language Model (LLM) is a fine-tuned large language model that has been refined on an instruction-following dataset (composed of instruction input-output pairs) so that it follows instructions more accurately.
- Context:
- It can (typically) be created by an LLM Instruction-Tuning System (that solves an LLM instruction-tuning task to adapt a base LLM).
- It can generate more precise and focused responses (than base LLMs).
- It can understand and respond to complex instructions.
- It can be compared to a specialized entry-level professional who has received additional targeted training to perform specific tasks efficiently.
- ...
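The data-preparation step described in the context above can be sketched as follows. This is a minimal illustration, not any library's actual API: the prompt template and field names are assumptions chosen for clarity.

```python
# Minimal sketch of preparing instruction-following training examples.
# The "### Instruction / Input / Response" template and the dataset field
# names are illustrative assumptions, not a standard.

def format_example(instruction: str, inp: str, output: str) -> str:
    """Render one (instruction, input, output) pair as a single training string."""
    prompt = f"### Instruction:\n{instruction}\n"
    if inp:  # some instructions have no accompanying input
        prompt += f"### Input:\n{inp}\n"
    prompt += f"### Response:\n{output}"
    return prompt

dataset = [
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
    {"instruction": "Name a primary color.", "input": "", "output": "Red"},
]

train_texts = [
    format_example(d["instruction"], d["input"], d["output"]) for d in dataset
]
# A base LLM would then be fine-tuned on these strings with a standard
# causal language-modeling objective, yielding an instruction-tuned model.
```

The design choice here is that instruction, optional input, and target response are concatenated into one sequence, so ordinary next-token-prediction fine-tuning suffices.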
- Example(s):
- a Conversational LLM.
- ChatGPT Model, a variant of the GPT (Generative Pre-trained Transformer) model, specifically fine-tuned for understanding and generating human-like conversational responses.
- Dolly 2.0, an open-source, instruction-following LLM developed by Databricks, fine-tuned on databricks-dolly-15k, a dataset of roughly 15,000 human-generated instruction-response records.
- FLAN (Fine-tuned LAnguage Net), an instruction-tuned model developed by Google that improves zero-shot performance on a wide range of natural language processing tasks.
- ...
- Counter-Example(s):
- A Base LLM that is not fine-tuned on instruction data and so may not follow detailed instructions accurately.
- ...
- See: Large Language Model, Reinforcement Learning, Natural Language Processing.
References
2023
- (Wang, Kordi et al., 2023) ⇒ Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, and Hannaneh Hajishirzi. (2023). “Self-Instruct: Aligning Language Models with Self-Generated Instructions.” doi:10.48550/arXiv.2212.10560
- NOTE:
- It can generate a large and diverse synthetic instruction dataset by prompting a language model.
- It can improve language models' ability to follow instructions by finetuning them on the synthetic data.
- It can help build better instruction-following models with minimal human effort.
- It provides a new benchmark for evaluating instruction-following.
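The bootstrapping idea noted above can be sketched as a toy loop: an instruction pool is repeatedly grown by generating candidates from existing instructions and keeping only sufficiently novel ones. The `generate_instruction` function below is a hypothetical stand-in for prompting an actual LLM, and the novelty check stands in for the ROUGE-based filtering used in practice.

```python
import random

def generate_instruction(pool: list[str]) -> str:
    """Hypothetical stand-in for prompting an LLM with in-context seed
    instructions; a real pipeline would call a language model here."""
    seeds = random.sample(pool, k=min(2, len(pool)))
    return "Paraphrase the following tasks: " + " / ".join(seeds)

def self_instruct(seed_instructions: list[str], rounds: int) -> list[str]:
    """Grow an instruction pool from a small human-written seed set."""
    pool = list(seed_instructions)
    for _ in range(rounds):
        candidate = generate_instruction(pool)
        if candidate not in pool:  # crude novelty filter (real systems
            pool.append(candidate)  # use similarity thresholds, e.g. ROUGE-L)
    return pool

seeds = ["Write a poem about rain.", "List three uses of copper."]
grown = self_instruct(seeds, rounds=5)
```

The grown pool (paired with model-generated responses) would then serve as synthetic fine-tuning data, which is the core of the Self-Instruct recipe.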
- ABSTRACT: Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks.
- (Brooks et al., 2023) ⇒ T. Brooks, A. Holynski, A. A. Efros (2023). “InstructPix2Pix: Learning to Follow Image Editing Instructions", Proceedings of the IEEE. [Link to the article](https://openaccess.thecvf.com/content/ICCV2023/papers/xxxxx.pdf)
- QUOTE: "...Using this data, we fine-tuned the GPT-3 Davinci …"
- (Liu et al., 2023) ⇒ H. Liu, C. Li, Q. Wu, YJ. Lee (2023). “Visual instruction tuning", arXiv preprint arXiv:2304.08485. [Link to the article](https://arxiv.org/pdf/2304.08485.pdf)
- QUOTE: "...large language models (LLMs) using machine-generated instruction-following data has ... multimodal language-image instruction-following data. By instruction tuning on such generated ..."
- (Longpre et al., 2023) ⇒ S. Longpre, L. Hou, T. Vu, A. Webson, HW. Chung ... (2023). “The Flan Collection: Designing Data and Methods for Effective Instruction Tuning", arXiv preprint arXiv:2301.13688. [Link to the article](https://arxiv.org/pdf/2301.13688.pdf)
- QUOTE: "...available instruction tuning methods, … instruction-tuned models as more computationally-efficient starting checkpoints for new tasks..."
- (Peng et al., 2023) ⇒ B. Peng, C. Li, P. He, M. Galley, J. Gao (2023). “Instruction tuning with gpt-4", arXiv preprint arXiv:2304.03277. [Link to the article](https://arxiv.org/pdf/2304.03277.pdf)
- QUOTE: "...This paper demonstrates the effectiveness of instruction tuning using GPT-4..."
2022
- (Chung et al., 2022) ⇒ Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, et al. (2022). “Scaling Instruction-finetuned Language Models.” arXiv preprint arXiv:2210.11416
- ABSTRACT: Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation). For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PaLM 540B by a large margin (+9.4% on average). Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.
- (Wang et al., 2022) ⇒ Y. Wang, S. Mishra, P. Alipoormolabashi, Y. Kordi ... (2022). “Benchmarking generalization via in-context instructions on 1,600+ language tasks", arXiv preprint arXiv. [Link to the article](https://arxiv.org/pdf/xxxxx.pdf)
- QUOTE: "... Instruction-tuned models. We evaluate on language models that are fine-tuned to follow language instructions..."
- (Ouyang et al., 2022) ⇒ L. Ouyang, J. Wu, X. Jiang, D. Almeida ... (2022). “Training Language Models to Follow Instructions with Human Feedback.”, Advances in Neural Information Processing Systems. [Link to the article](https://proceedings.neurips.cc/paper/2022/file/xxxxx.pdf)
- QUOTE: "...results show that fine-tuning with human feedback is a ..."
- (Wang et al., 2022) ⇒ Y. Wang, S. Mishra, P. Alipoormolabashi ... (2022). “Super-naturalinstructions: Generalization via declarative instructions on 1600+ nlp tasks", Proceedings of the. [Link to the article](https://aclanthology.org/2022.xxxx.pdf)
- QUOTE: "...instructions show stronger generalization to unseen tasks. In particular, our model that is fine-tuned on a ..."
2021
- (Wei et al., 2021) ⇒ J. Wei, M. Bosma, VY. Zhao, K. Guu, AW. Yu ... (2021). “Finetuned language models are zero-shot learners", arXiv preprint arXiv. [Link to the article](https://arxiv.org/pdf/xxxxx.pdf)
- QUOTE: "...We show that instruction tuning—finetuning … in instruction tuning..."
- (Mishra et al., 2021) ⇒ S. Mishra, D. Khashabi, C. Baral, H. Hajishirzi (2021). “Cross-task generalization via natural language crowdsourcing instructions", arXiv preprint arXiv. [Link to the article](https://arxiv.org/pdf/xxxxx.pdf)
- QUOTE: "...For comparison, we evaluate GPT3 which uses no finetuning, unlike BART that is fine-tuned with the Tseen tasks..."
- (Mishra et al., 2021) ⇒ S. Mishra, D. Khashabi, C. Baral, Y. Choi ... (2021). “Reframing Instructional Prompts to GPTk's Language", arXiv preprint arXiv. [Link to the article](https://arxiv.org/pdf/xxxxx.pdf)
- QUOTE: "...Reframing can be particularly helpful in ..."