Medical Question Answering (QA) Benchmark Task
A Medical Question Answering (QA) Benchmark Task is a domain-specific QA benchmark task that is a medical QA task.
- Context:
- It can (often) be used to assess and compare the performance of different Medical QA Systems and algorithms.
- It can be a Medical Corpus-Specific QA Benchmark Task.
- It can range from being a Broad Medical Topic Benchmark QA Task to being a Narrow Medical Topic Benchmark QA Task.
- ...
- Example(s):
- MedQA (USMLE).
- MedMCQA.
- PubMedQA.
- ... (a minimal dataset-loading sketch follows the See Also line below)
- Counter-Example(s):
- a non-medical Domain-Specific QA Benchmark Task, such as a Legal QA Benchmark Task.
- an Open-Domain QA Benchmark Task.
- See Also: Medical Question Answering (QA) Task, Medical QA System.
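The example benchmarks listed above are commonly distributed through the Hugging Face Hub. The following is a minimal sketch of loading and inspecting one of them with the `datasets` library; the Hub id (`qiaojin/PubMedQA`), config name (`pqa_labeled`), and field names are assumptions based on a common community mirror and may differ from the official release.

```python
# Minimal sketch: load and inspect PubMedQA via the Hugging Face `datasets`
# library. The Hub id, config name, and field names below are assumptions
# (they follow a common community mirror) and may differ from the official release.
from datasets import load_dataset

pubmedqa = load_dataset("qiaojin/PubMedQA", "pqa_labeled", split="train")

record = pubmedqa[0]
print(record["question"])        # the biomedical research question
print(record["final_decision"])  # gold label: "yes", "no", or "maybe"
print(len(pubmedqa))             # pqa_labeled is the small expert-labeled subset
```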
References
2023
- (Singhal, Tu et al., 2023) ⇒ Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, et al. (2023). “Towards Expert-Level Medical Question Answering with Large Language Models.” doi:10.48550/arXiv.2305.09617
- QUOTE:
- A majority of these approaches involve smaller language models trained using domain specific data (BioLinkBert [11], DRAGON [12], PubMedGPT [13], PubMedBERT [14], BioGPT [15]), resulting in a steady improvement in state-of-the-art performance on benchmark datasets such as MedQA (USMLE) [16], MedMCQA [17], and PubMedQA [18].
- For evaluation on multiple-choice questions, we used the MedQA [16], MedMCQA [17], PubMedQA [18] and MMLU clinical topics [31] datasets.
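The multiple-choice evaluation described in the quoted passage reduces to exact-match accuracy over predicted option letters. The sketch below illustrates that scoring protocol on a MedQA-style record; the record layout and the placeholder predict_option function are assumptions for illustration only, not the evaluation harness used by Singhal, Tu et al. (2023).

```python
# Minimal sketch of multiple-choice QA scoring (exact-match accuracy).
# The record layout and the `predict_option` stub are illustrative assumptions,
# not the harness from Singhal, Tu et al. (2023).
from typing import Dict, List

# A MedQA-style item: question text, lettered options, and a gold answer key.
items: List[Dict] = [
    {
        "question": "Which electrolyte disturbance most commonly causes torsades de pointes?",
        "options": {"A": "Hyperkalemia", "B": "Hypomagnesemia", "C": "Hypernatremia", "D": "Hypercalcemia"},
        "answer": "B",
    },
]

def predict_option(question: str, options: Dict[str, str]) -> str:
    """Placeholder model: always picks the alphabetically first option letter.

    A real system would score each option with an LLM or a fine-tuned
    biomedical model and return the highest-scoring letter.
    """
    return sorted(options)[0]

correct = sum(
    predict_option(item["question"], item["options"]) == item["answer"]
    for item in items
)
accuracy = correct / len(items)
print(f"accuracy = {accuracy:.3f}")
```

Replacing predict_option with an actual model and running the loop over a benchmark's full test split yields the accuracy figure typically reported for these datasets.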