Medical Question Answering (QA) Benchmark Task
A Medical Question Answering (QA) Benchmark Task is a domain-specific QA benchmark task that is a medical QA task.
- Context:
- It can (often) be used to assess and compare the performance of different Medical QA Systems and algorithms.
- It can be a Medical Corpus-Specific QA Benchmark Task.
- It can range from being a Broad Medical Topic Benchmark QA Task to being a Narrow Medical Topic Benchmark QA Task.
- ...
- Example(s):
- MedQA (USMLE).
- MedMCQA.
- PubMedQA.
- ... (a minimal dataset-loading sketch follows the See Also line below)
- Counter-Example(s):
- a non-medical Domain-Specific QA Benchmark Task, such as a Legal QA Benchmark Task.
- an Open-Domain QA Benchmark Task.
- See Also: Medical Question Answering (QA) Task, Medical QA System.
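The example benchmarks listed above are commonly distributed through the Hugging Face Hub. The following is a minimal sketch of loading and inspecting one of them with the `datasets` library; the Hub id (`qiaojin/PubMedQA`), config name (`pqa_labeled`), and field names are assumptions based on a common community mirror and may differ from the official release.

```python
# Minimal sketch: load and inspect PubMedQA via the Hugging Face `datasets`
# library. The Hub id, config name, and field names below are assumptions
# (they follow a common community mirror) and may differ from the official release.
from datasets import load_dataset

pubmedqa = load_dataset("qiaojin/PubMedQA", "pqa_labeled", split="train")

record = pubmedqa[0]
print(record["question"])        # the biomedical research question
print(record["final_decision"])  # gold label: "yes", "no", or "maybe"
print(len(pubmedqa))             # pqa_labeled is the small expert-labeled subset
```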
References
2023
- (Singhal, Tu et al., 2023) ⇒ Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, et al. (2023). “Towards Expert-Level Medical Question Answering with Large Language Models.” doi:10.48550/arXiv.2305.09617
- QUOTE:
- A majority of these approaches involve smaller language models trained using domain specific data (BioLinkBert [11], DRAGON [12], PubMedGPT [13], PubMedBERT [14], BioGPT [15]), resulting in a steady improvement in state-of-the-art performance on benchmark datasets such as MedQA (USMLE) [16], MedMCQA [17], and PubMedQA [18].
- For evaluation on multiple-choice questions, we used the MedQA [16], MedMCQA [17], PubMedQA [18] and MMLU clinical topics [31] datasets.
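The multiple-choice evaluation described in the quoted passage reduces to exact-match accuracy over predicted option letters. The sketch below illustrates that scoring protocol on a MedQA-style record; the record layout and the placeholder predict_option function are assumptions for illustration only, not the evaluation harness used by Singhal, Tu et al. (2023).

```python
# Minimal sketch of multiple-choice QA scoring (exact-match accuracy).
# The record layout and the `predict_option` stub are illustrative assumptions,
# not the harness from Singhal, Tu et al. (2023).
from typing import Dict, List

# A MedQA-style item: question text, lettered options, and a gold answer key.
items: List[Dict] = [
    {
        "question": "Which electrolyte disturbance most commonly causes torsades de pointes?",
        "options": {"A": "Hyperkalemia", "B": "Hypomagnesemia", "C": "Hypernatremia", "D": "Hypercalcemia"},
        "answer": "B",
    },
]

def predict_option(question: str, options: Dict[str, str]) -> str:
    """Placeholder model: always picks the alphabetically first option letter.

    A real system would score each option with an LLM or a fine-tuned
    biomedical model and return the highest-scoring letter.
    """
    return sorted(options)[0]

correct = sum(
    predict_option(item["question"], item["options"]) == item["answer"]
    for item in items
)
accuracy = correct / len(items)
print(f"accuracy = {accuracy:.3f}")
```

Replacing predict_option with an actual model and running the loop over a benchmark's full test split yields the accuracy figure typically reported for these datasets.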