Medical Language Model (LM)
A Medical Language Model (LM) is a domain-specific LM for medical language.
- Context:
- It can range from being a Pure Medical LM to being a Fine-Tuned Medical LM.
- It can range from being a Large Medical LM to being a Small Medical LM.
- It can be applied to tasks such as Medical QA.
- It can be related to a Health LM.
- ...
- Example(s):
- a Fine-Tuned Large Medical LM, such as: Med-PaLM 2.
- a Pure Small Medical LM.
- BioBERT, PubMedBERT.
- BioGPT.
- ...
- Counter-Example(s):
- a Clinical Study LM.
- a Legal LM.
- See: Medical Question Answering.
References
2023
- (Singhal, Tu et al., 2023) ⇒ Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, et al. (2023). “Towards Expert-Level Medical Question Answering with Large Language Models.” doi:10.48550/arXiv.2305.09617
- QUOTE: Language is at the heart of health and medicine, underpinning interactions between people and care providers. Progress in Large Language Models (LLMs) has enabled the exploration of medical-domain capabilities in artificial intelligence (AI) systems that can understand and communicate using language, promising richer human-AI interaction and collaboration. In particular, these models have demonstrated impressive capabilities on multiple-choice research benchmarks [1–3].
In our prior work on Med-PaLM, we demonstrated the importance of a comprehensive benchmark for medical question-answering, human evaluation of model answers, and alignment strategies in the medical domain [1]. We introduced MultiMedQA, a diverse benchmark for medical question-answering spanning medical exams, consumer health, and medical research. We proposed a human evaluation rubric enabling physicians and lay-people to perform detailed assessment of model answers. Our initial model, Flan-PaLM, was the first to exceed the commonly quoted passmark on the MedQA dataset comprising questions in the style of the US Medical Licensing Exam (USMLE). However, human evaluation revealed that further work was needed to ensure the AI output, including long-form answers to