Meta Llama LLM Model

From GM-RKB

A Meta Llama LLM Model is an open-source auto-regressive transformer-based LLM model developed by Meta AI.

Context:
- It can advance Meta AI's capabilities in the generative AI field, competing with prominent models such as GPT-4.
- It can aim to be more responsive and to handle a broader range of complex queries, addressing limitations observed in earlier versions.
- It can range from smaller, more accessible models to large, high-capacity models that set new benchmarks in AI performance.
- It can lead benchmarks for its scale, showing strong capabilities in tasks such as proving mathematical theorems and predicting protein structures.
- It can enhance tool use and coding comprehension, improve generalization and conversation abilities, and offer fine-tuning for application-specific enhancements.
- It can pose challenges due to compute and energy constraints as training scales up.
- ...
Example(s):
- A Llama 1.
- A Llama 2.
- A Llama 3, such as Llama 3.1.
- ...
Counter-Example(s):
- OpenAI LLM.
- Google LLM.
See: InstructGPT, BERT, Transformer Neural Networks.
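
The term "auto-regressive" in the definition above means that the model produces one token at a time, each conditioned on the tokens already generated. The following is a minimal sketch of that loop using greedy decoding. The score table `TOY_SCORES` and the functions `next_token_scores` and `greedy_generate` are hypothetical stand-ins invented for illustration; they are not part of any Llama release, and a real model would score the next token with a transformer over the whole prefix rather than a lookup on the last token.

```python
from typing import Dict, List

# Hypothetical toy "model": maps the most recent token to next-token scores.
# Assumption for illustration only; a real Llama model conditions on the
# entire prefix using a transformer network.
TOY_SCORES: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.9, "a": 0.1},
    "the": {"llama": 0.8, "model": 0.2},
    "llama": {"model": 0.7, "</s>": 0.3},
    "model": {"</s>": 1.0},
}


def next_token_scores(prefix: List[str]) -> Dict[str, float]:
    """Return scores for the next token given the tokens so far."""
    # Unknown tokens fall back to ending the sequence.
    return TOY_SCORES.get(prefix[-1], {"</s>": 1.0})


def greedy_generate(prefix: List[str], max_new_tokens: int = 10) -> List[str]:
    """Auto-regressive loop: append the highest-scoring token, then repeat."""
    tokens = list(prefix)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)
        tokens.append(best)
        if best == "</s>":  # stop at the end-of-sequence marker
            break
    return tokens


print(greedy_generate(["<s>"]))
```

Greedy selection is used here only because it is the simplest decoding rule; sampling-based strategies follow the same loop and differ only in how the next token is picked from the scores.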

References

2024

(AI@Meta Llama Team, 2024) ⇒ AI@Meta Llama Team. (2024). “The Llama 3 Herd of Models.” In: Meta AI Research.
- NOTE: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
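
To give a sense of scale for the 405B parameter count quoted in the note above, the following back-of-the-envelope sketch estimates the memory needed just to store the weights at a few numeric precisions. The helper name `weight_memory_gb` is hypothetical, and the precisions listed are assumptions for illustration; the note itself does not state which precision is used. The estimate ignores activations, the key-value cache for the context window, and optimizer state, so actual requirements would likely be higher.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS_405B = 405e9  # parameter count of the largest model, from the note

# Assumed precisions for illustration: 4 bytes (fp32), 2 bytes (bf16/fp16),
# and 1 byte (8-bit quantization).
for label, nbytes in [("fp32", 4), ("bf16", 2), ("int8", 1)]:
    print(f"{label}: {weight_memory_gb(PARAMS_405B, nbytes):.0f} GB")
```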

Retrieved from "http://www.gabormelli.com/RKB/index.php?title=Meta_Llama_LLM_Model&oldid=919764"
