Foundational Pre-Trained Neural Model

From GM-RKB

A Foundational Pre-Trained Neural Model is a pre-trained large neural model that serves as a starting point for various downstream artificial intelligence (AI) tasks.



References

2023

  • (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Foundation_models Retrieved:2023-4-22.
    • A foundation model (also called a base model)[1] is a large artificial intelligence (AI) model trained on a vast quantity of data at scale (often by self-supervised or semi-supervised learning),[2] resulting in a model that can be adapted to a wide range of downstream tasks.[3][4] Foundation models have helped bring about a major transformation in how AI systems are built, for example by powering prominent chatbots and other user-facing AI. The term was popularized by the Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM).[3]

      Early examples of foundation models were pre-trained large language models (LLMs), including Google's BERT[5] and OpenAI's GPT-n series. Such broad models can in turn be used to build task- and/or domain-specific models over sequences of other kinds of tokens, such as medical codes.[6]

      Beyond text, several visual and multimodal foundation models have been produced, including DALL-E, Flamingo,[7] Florence,[8] and NOOR.[9] Visual foundation models (VFMs) have been combined with text-based LLMs to develop sophisticated task-specific models.[10]
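
  The adapt-one-model-to-many-tasks pattern described in the excerpt above can be illustrated with a minimal sketch: a shared (mock) pre-trained backbone is kept frozen, and only a small task-specific head is adapted per downstream task. All names here are illustrative, not any real library's API, and the "backbone" returns hand-rolled features rather than learned embeddings so the example stays self-contained.

  ```python
  # Sketch of the foundation-model pattern: one frozen backbone, many heads.
  # Hypothetical classes for illustration; not a real framework's API.

  class PretrainedBackbone:
      """Stands in for a large model pre-trained on broad data."""
      def encode(self, text: str) -> list[float]:
          # A real backbone would return learned embeddings; here we
          # compute trivial fixed features so the sketch is runnable.
          return [float(len(text)), float(text.count(" "))]

  class TaskHead:
      """Small task-specific layer: the only part adapted per task."""
      def __init__(self, threshold: float):
          self.threshold = threshold  # the single "trained" parameter

      def predict(self, features: list[float]) -> str:
          return "long" if features[0] > self.threshold else "short"

  backbone = PretrainedBackbone()       # shared and frozen across tasks
  length_head = TaskHead(threshold=10)  # adapted for one downstream task

  features = backbone.encode("foundation models")
  print(length_head.predict(features))  # prints "long"
  ```

  The design point mirrors the article: pre-training the backbone is the expensive, one-time step, while each downstream task only requires fitting (or prompting) a comparatively tiny task-specific component against the backbone's shared representations.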

  1. https://time.com/6271657/a-to-z-of-artificial-intelligence/
  2. https://analyticsindiamag.com/self-supervised-learning-vs-semi-supervised-learning-how-they-differ/
  3. Introducing the Center for Research on Foundation Models (CRFM), 2022, Stanford HAI, accessed 11 June 2022
  4. Goldman, Sharon, Foundation models: 2022's AI paradigm shift, 2022, VentureBeat, accessed 24 October 2022
  5. Rogers, A., Kovaleva, O., Rumshisky, A., A Primer in BERTology: What We Know About How BERT Works, 2020, arXiv:2002.12327
  6. Steinberg, Ethan, Jung, Ken, Fries, Jason A., Corbin, Conor K., Pfohl, Stephen R., Shah, Nigam H., "Language models are an effective representation learning technique for electronic health record data", January 2021, Journal of Biomedical Informatics, volume 113, page 103637, doi:10.1016/j.jbi.2020.103637, ISSN: 1532-0480, PMID: 33290879, PMC: 7863633
  7. Tackling multiple tasks with a single visual language model, 2022, DeepMind, accessed 13 June 2022
  8. Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang, Florence: A New Foundation Model for Computer Vision, 2022, arXiv, cs.CV
  9. Technology Innovation Institute Announces Launch of NOOR, the World's Largest Arabic NLP Model, 2023, Yahoo Finance
  10. https://arxiv.org/pdf/2303.04671.pdf