Foundational Pre-Trained Neural Model
A Foundational Pre-Trained Neural Model is a pre-trained large neural model that serves as a starting point for various downstream artificial intelligence (AI) tasks.
- Context:
- It can (typically) be trained on large Training Datasets.
- It can be the basis for a Fine-Tuned Neural Model.
- …
- Example(s):
- Foundational Encoder-Only Text Models, such as:
- BERT, RoBERTa, ALBERT, DistilBERT.
- ...
- Foundational LLMs, such as:
- Foundational Base LLM, such as GPT-3
- Foundational Instruct LLM, such as a LLaMA-Instruct model.
- ...
- Foundational Computer Vision Models, such as:
- Foundational ___ CV Model, such as: ResNet, a convolutional neural network model for image recognition.
- Foundational Object Detection CV Model, such as: YOLO, a real-time object detection system.
- Counter-Example(s):
- See: Large Language Models, BERT (Language Model), Generative Pre-Trained Transformer.
References
2023
- chat
- A foundation model is a large pre-trained neural network model that serves as a starting point for various downstream artificial intelligence (AI) tasks. It is typically trained on massive amounts of diverse data (often from the Web), enabling it to learn and capture a wide range of language features and nuances. However, a foundation model is not typically fine-tuned for any specific task or application. If needed however, it is amenable to fine-tuning on a smaller dataset for some specific tasks. Examples of foundation models include BERT, GPT, and T5.
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Foundation_models Retrieved:2023-4-22.
- A foundation model (also called base model)[1] is a large artificial intelligence (AI) model trained on a vast quantity of data at scale (often by self-supervised learning or semi-supervised learning)[2] resulting in a model that can be adapted to a wide range of downstream tasks.[3][4] Foundation models have helped bring about a major transformation in how AI systems are built, such as by powering prominent chatbots and other user-facing AI. The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) popularized the term.[3]
Early examples of foundation models were pre-trained large language models (LLMs) including Google's BERT[5] and OpenAI's GPT-n series. Such broad models can in turn be used for task and/or domain specific models using sequences of other kinds of tokens, such as medical codes.[6]
Beyond text, several visual and multimodal foundation models have been produced, including DALL-E, Flamingo,[7] Florence,[8] and NOOR.[9] Visual foundation models (VFMs) have been combined with text-based LLMs to develop sophisticated task-specific models.[10]
- ↑ https://time.com/6271657/a-to-z-of-artificial-intelligence/
- ↑ https://analyticsindiamag.com/self-supervised-learning-vs-semi-supervised-learning-how-they-differ/
- ↑ 3.0 3.1 Introducing the Center for Research on Foundation Models (CRFM), 2022, Stanford HAI, accessed on 11 June 2022
- ↑ Goldman, Sharon, Foundation models: 2022's AI paradigm shift, 2022, VentureBeat, accessed on 2022-10-24
- ↑ Rogers, A., Kovaleva, O., Rumshisky, A., A Primer in BERTology: What we know about how BERT works, 2020, arXiv:2002.12327
- ↑ Steinberg, Ethan, Jung, Ken, Fries, Jason A., Corbin, Conor K., Pfohl, Stephen R., Shah, Nigam H., "Language models are an effective representation learning technique for electronic health record data", January 2021, Journal of Biomedical Informatics, volume 113, pages 103637, doi:10.1016/j.jbi.2020.103637, ISSN: 1532-0480, PMID: 33290879, PMC: 7863633
- ↑ Tackling multiple tasks with a single visual language model, 2022, DeepMind, accessed on 13 June 2022
- ↑ Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang, Florence: A New Foundation Model for Computer Vision, 2022, arXiv, cs.CV
- ↑ Technology Innovation Institute Announces Launch of NOOR, the World's Largest Arabic NLP Model, 2023, Yahoo Finance
- ↑ https://arxiv.org/pdf/2303.04671.pdf