Google Med-PaLM 2 Language Model
A Google Med-PaLM 2 Language Model is a fine-tuned medical large language model (LLM) that is a Google LM.
- Context:
- It can (typically) be a Fine-Tuned PaLM 2 Model.
- It can perform at “expert” level on U.S. Medical Licensing Exam-style questions.
- It can achieve state-of-the-art results on medical competency benchmarks (such as MedQA and MedMCQA).
- …
- Counter-Example(s):
- Med-PaLM 1.
- BioMedLM.
- …
- See: Vertex AI Model Garden, Medical QA Benchmark, Domain-Specific LM.
References
2023
- https://cloud.google.com/blog/topics/healthcare-life-sciences/sharing-google-med-palm-2-medical-large-language-model 2023-03-23
- QUOTE: ... Today, we're sharing exciting progress on these initiatives, with the announcement of limited access to Google’s medical large language model, or LLM, called Med-PaLM 2. It will be available in coming weeks to a select group of Google Cloud customers for limited testing, to explore use cases and share feedback as we investigate safe, responsible, and meaningful ways to use this technology.
Med-PaLM 2 harnesses the power of Google’s LLMs, aligned to the medical domain to more accurately and safely answer medical questions. As a result, Med-PaLM 2 was the first LLM to perform at an “expert” test-taker level performance on the MedQA dataset of US Medical Licensing Examination (USMLE)-style questions, reaching 85%+ accuracy, and it was the first AI system to reach a passing score on the MedMCQA dataset comprising Indian AIIMS and NEET medical examination questions, scoring 72.3%.
Industry-tailored LLMs like Med-PaLM 2 are part of a burgeoning family of generative AI technologies that have the potential to significantly enhance healthcare experiences. We’re looking forward to working with our customers to understand how Med-PaLM 2 might be used to facilitate rich, informative discussions, answer complex medical questions, and find insights in complicated and unstructured medical texts. They might also explore its utility to help draft short- and long-form responses and summarize documentation and insights from internal data sets and bodies of scientific knowledge. ...
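The "85%+ accuracy on MedQA" figure above refers to exact-match accuracy on USMLE-style multiple-choice items. As an illustration only, the sketch below shows how such a score might be computed over a MedQA-style dataset; the `query_model` callable, the sample item, and the prompt format are hypothetical placeholders, not part of any published Med-PaLM 2 interface.
```python
# Minimal sketch of scoring a MedQA-style multiple-choice benchmark.
# `query_model` is a hypothetical stand-in for however the model is exposed
# (e.g., an endpoint returned to a Cloud customer); it is NOT a real API.
from typing import Callable

# One MedQA-style item: a USMLE-style question, lettered options, and the key.
SAMPLE_ITEMS = [
    {
        "question": "Which vitamin deficiency classically causes scurvy?",
        "options": {"A": "Vitamin A", "B": "Vitamin B12",
                    "C": "Vitamin C", "D": "Vitamin D"},
        "answer": "C",
    },
]

def format_prompt(item: dict) -> str:
    """Render a question with its lettered options, asking for a single letter."""
    options = "\n".join(f"({k}) {v}" for k, v in item["options"].items())
    return f"{item['question']}\n{options}\nAnswer with a single letter:"

def accuracy(items: list[dict], query_model: Callable[[str], str]) -> float:
    """Fraction of items whose predicted letter matches the answer key."""
    correct = 0
    for item in items:
        prediction = query_model(format_prompt(item)).strip().upper()[:1]
        correct += prediction == item["answer"]
    return correct / len(items)

if __name__ == "__main__":
    # Toy stand-in model that always answers "C", just to exercise the harness.
    print(accuracy(SAMPLE_ITEMS, lambda prompt: "C"))  # -> 1.0
```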
2023
- https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
- QUOTE: Med-PaLM 2, trained by our health research teams with medical knowledge, can answer questions and summarize insights from a variety of dense medical texts. It achieves state-of-the-art results in medical competency, and was the first large language model to perform at “expert” level on U.S. Medical Licensing Exam-style questions. We're now adding multimodal capabilities to synthesize information like x-rays and mammograms to one day improve patient outcomes. Med-PaLM 2 will open up to a small group of Cloud customers for feedback later this summer to identify safe, helpful use cases.
2023
- (Singhal, Tu et al., 2023) ⇒ Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, et al. (2023). “Towards Expert-Level Medical Question Answering with Large Language Models.” doi:10.48550/arXiv.2305.09617
- QUOTE:
- … We show that Med-PaLM 2 exhibits strong performance in both multiple-choice and long-form medical question answering, including popular benchmarks and challenging new adversarial datasets. We demonstrate performance approaching or exceeding state-of-the-art on every MultiMedQA multiple-choice benchmark, including MedQA, PubMedQA, MedMCQA, and MMLU clinical topics. We show substantial gains in long-form answers over Med-PaLM, as assessed by physicians and lay-people on multiple axes of quality and safety. Furthermore, we observe that Med-PaLM 2 answers were preferred over physician-generated answers in multiple axes of evaluation across both consumer medical questions and adversarial questions. ...