Orca 2 Large Language Model (LLM)
An Orca 2 Large Language Model (LLM) is an open-source large language model developed by Microsoft Research.
- Context:
- It can (typically) be based on high-quality Synthetic Training Data.
- It can (typically) be a LLAMA 2 Fine-Tuned LLM.
- It can have Reasoning Abilities, matching or surpassing models of similar size and even models 5-10 times larger on complex reasoning tasks.
- It can have training involving various Reasoning Strategies.
- It can be evaluated on benchmarks covering Language Understanding and Common Sense Reasoning.
- ...
- Example(s):
- Orca 2 7B.
- Orca 2 13B.
- ..
- Counter-Example(s):
- See: ....
References
2023
- GBard
- Orca 2 is an open-source large language model from Microsoft Research. It is a 13B parameter model that is trained on a massive dataset of text and code. Orca 2 is capable of performing many tasks, including generating text, translating languages, writing different kinds of creative content, and answering your questions in an informative way. It has been shown to outperform much larger models on a variety of benchmark tasks, such as question answering and natural language inference.
Here are some of the key features of Orca 2:
- High performance: Orca 2 has been shown to outperform much larger models on a variety of benchmark tasks.
- Efficient: Orca 2 is a relatively small model, which makes it more efficient to train and run than larger models.
- Open-source: Orca 2 is an open-source model, so anyone can use and modify it (a minimal usage sketch follows below).
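The following is a minimal usage sketch, assuming the 13B checkpoint is published on Hugging Face under the microsoft/Orca-2-13b identifier and follows a ChatML-style prompt format; the official model card documents the exact conventions.
```python
# Minimal sketch: loading and querying Orca 2 with Hugging Face transformers.
# Assumes the checkpoint id "microsoft/Orca-2-13b" and a ChatML-style prompt
# format; consult the official model card for the exact template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Orca-2-13b"  # assumed Hugging Face identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",
)

system = "You are a cautious assistant that reasons step by step."
question = "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"
prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{question}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```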
2023
- (Mitra et al., 2023) ⇒ Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agrawal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, and Ahmed Awadallah. (2023). “Orca 2: Teaching Small Language Models How to Reason.” In: arXiv preprint arXiv:2311.11045. doi:10.48550/arXiv.2311.11045.
- NOTE:
- It introduces Orca 2, the latest version of Microsoft's smaller language model aimed at enhancing reasoning abilities. Orca 2 significantly surpasses models of similar size and attains performance levels comparable to or better than models 5-10 times larger on complex reasoning tasks.
- It comes in two sizes - 7 billion and 13 billion parameters. Both are created by fine-tuning the LLAMA 2 base models using tailored, high-quality synthetic data that teaches various reasoning techniques.
- Orca 2 training data was generated so that it equips the model to choose different solution strategies based on the task, such as step-by-step processing, recall-generate, extract-generate, etc. The data is obtained from a more capable teacher model (a schematic sketch of this data-generation pattern appears after this list).
- Evaluation using 15 diverse benchmarks covering language understanding, common sense reasoning, etc. shows Orca 2 matches or surpasses the performance of larger models. It has limitations common to language models but shows potential for reasoning improvements in smaller models.
- The key insight is that training smaller models on tailored synthetic data covering diverse reasoning strategies allows them to attain capabilities typically seen only in much larger models, underscoring their value for balancing efficiency and capability.
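A schematic sketch of the teacher-driven data-generation idea described above, not the authors' actual pipeline: a capable teacher model answers each task under a detailed, strategy-specific instruction, while the student is trained on the resulting answer paired with only a generic prompt, so it must learn to pick the strategy itself. The teacher_generate function and the strategy prompts are hypothetical placeholders.
```python
# Schematic sketch of Orca 2-style synthetic data generation (not the authors'
# actual pipeline). The teacher sees a detailed, strategy-specific system
# instruction; the student's training example keeps only a generic prompt
# alongside the strategy-shaped answer ("prompt erasure").

STRATEGY_PROMPTS = {
    "step_by_step": "Solve the problem by reasoning step by step before answering.",
    "recall_generate": "First recall the relevant facts, then compose the answer.",
    "extract_generate": "Extract the relevant passage from the context, then answer.",
    "direct_answer": "Answer directly and concisely.",
}

GENERIC_STUDENT_PROMPT = "You are a helpful assistant."  # strategy hint erased


def teacher_generate(system_prompt: str, task: str) -> str:
    """Hypothetical call to a more capable teacher model (e.g., via an API)."""
    raise NotImplementedError("Replace with an actual teacher-model call.")


def build_training_example(task: str, strategy: str) -> dict:
    # The teacher answers under the detailed strategy instruction...
    answer = teacher_generate(STRATEGY_PROMPTS[strategy], task)
    # ...but the student's example pairs that answer with a generic system
    # prompt, so the student must infer the appropriate strategy from the task.
    return {"system": GENERIC_STUDENT_PROMPT, "user": task, "assistant": answer}
```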