SearchQA Dataset
A SearchQA Dataset is a large-scale reading comprehension and question answering dataset in which each question-answer pair is augmented with text snippets retrieved from the Google search engine.
- Context:
- It consists of more than 140k question-answer pairs, with each pair having 49.6 snippets on average.
- Example(s):
- …
- Counter-Example(s):
- a CoQA Dataset,
- a CNN-Daily Mail Dataset,
- a FastQA Dataset,
- a MS COCO Dataset,
- a NarrativeQA Dataset,
- a NewsQA Dataset,
- a RACE Dataset,
- a SQuAD Dataset,
- a TriviaQA Dataset.
- See: Question-Answering System, Natural Language Processing Task, Natural Language Understanding Task, Natural Language Generation Task.
References
2017
- (Dunn et al., 2017) ⇒ Matthew Dunn, Levent Sagun, Mike Higgins, V. Ugur Guney, Volkan Cirik, and Kyunghyun Cho. (2017). “SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine.” In: ePrint: abs/1704.05179.
- QUOTE: Following this approach, we built SearchQA, which consists of more than 140k question-answer pairs with each pair having 49.6 snippets on average. Each question-answer-context tuple of the SearchQA comes with additional meta-data such as the snippet's URL, which we believe will be valuable resources for future research. We conduct human evaluation as well as test two baseline methods, one simple word selection and the other deep learning based, on the SearchQA.
(...) A major goal of the new dataset is to build and provide to the public a machine comprehension dataset that better reflects a noisy information retrieval system. In order to achieve this goal, we need to introduce a natural, realistic noise to the context of each question-answer pair. We use a production-level search engine –Google– for this purpose.
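The quote above describes SearchQA records as question-answer-context tuples, where the context is a set of search-engine snippets carrying metadata such as the snippet's URL. The following is a minimal Python sketch of how such records might be traversed; the JSON-lines layout, the field names (question, answer, snippets, url, text), and the file name searchqa_train.jsonl are illustrative assumptions, not the dataset's official distribution format.

<pre>
# A minimal sketch of iterating over SearchQA-style records.
# NOTE: the field names and JSON-lines layout below are assumptions
# for illustration, not the dataset's official schema.
import json

def load_searchqa(path):
    """Yield (question, answer, snippets) tuples from a JSON-lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            # Each snippet is assumed to carry its text plus metadata
            # such as the source URL mentioned in the paper.
            yield record["question"], record["answer"], record["snippets"]

if __name__ == "__main__":
    # Example: compute snippets per pair to check against the
    # ~49.6 average reported in the paper.
    total_pairs = 0
    total_snippets = 0
    for question, answer, snippets in load_searchqa("searchqa_train.jsonl"):
        total_pairs += 1
        total_snippets += len(snippets)
    print(f"average snippets per pair: {total_snippets / total_pairs:.1f}")
</pre>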