2017 TriviaQAALargeScaleDistantlySup

Subject Headings: TriviaQA Dataset; Reading Comprehension Dataset; Question Answering Dataset.

Notes

Cited By

Quotes

Abstract

We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high-quality distant supervision for answering the questions. We show that, in comparison to other recently introduced large-scale datasets, TriviaQA (1) has relatively complex, compositional questions, (2) has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and (3) requires more cross-sentence reasoning to find answers. We also present two baseline algorithms: a feature-based classifier and a state-of-the-art neural network that performs well on SQuAD reading comprehension. Neither approach comes close to human performance (23% and 40% vs. 80%), suggesting that TriviaQA is a challenging test-bed that is worth significant future study.
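
The reported accuracies compare predicted answer strings against reference answers. As a point of reference, below is a minimal sketch of a SQuAD-style normalized exact-match scorer of the kind commonly used for such comparisons; the function names and normalization rules are assumptions for illustration, not the paper's official evaluation code.

import re
import string

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.
    (SQuAD-style normalization; the exact rules here are an assumption.)"""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, references):
    """True if the normalized prediction equals any normalized reference;
    trivia answers often have several acceptable aliases."""
    return any(normalize(prediction) == normalize(ref) for ref in references)

print(exact_match("The Eiffel Tower", ["Eiffel Tower"]))   # True
print(exact_match("Eifel Tower", ["Eiffel Tower"]))        # False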

1. Introduction

(...)

TriviaQA contains over 650K question-answer-evidence triples, derived by combining 95K trivia-enthusiast-authored question-answer pairs with, on average, six supporting evidence documents per question. To our knowledge, TriviaQA is the first dataset in which full-sentence questions are authored organically (i.e., independently of an NLP task) and evidence documents are collected retrospectively from Wikipedia and the Web. This decoupling of question generation from evidence collection allows us to control for potential bias in question style or content, while offering organically generated questions on a variety of topics. Designed to engage humans, TriviaQA presents a new challenge for RC models: they should be able to deal with large amounts of text from varied sources such as news articles, encyclopedic entries, and blog articles, and should handle inference over multiple sentences.
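
The construction described above amounts to pairing each authored question-answer pair with its independently retrieved documents and keeping a document as evidence when the answer string occurs in it (the distant-supervision step, which yields noisy but useful labels). The following is a minimal sketch under that reading; the data structures and helper names are hypothetical, not the authors' actual pipeline.

from dataclasses import dataclass

@dataclass
class Triple:
    question: str
    answer: str
    evidence: str   # one supporting document (Wikipedia page or web result)

def build_triples(qa_pairs, retrieve):
    """Cross each QA pair with its independently retrieved documents,
    keeping a document only if the answer string appears in it.
    Occurrence of the answer is a weak (distant) label: a kept document
    may mention the answer without actually supporting it."""
    triples = []
    for question, answer in qa_pairs:
        for doc in retrieve(question):
            if answer.lower() in doc.lower():
                triples.append(Triple(question, answer, doc))
    return triples

With roughly 95K question-answer pairs and about six surviving documents each, such a cross product is consistent with the 650K triples reported.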

(...)

2. Overview

3. Dataset Collection

4. Dataset Analysis

5. Baseline Methods

6. Experiments

7. Related Work

Acknowledgments

References

BibTeX

@inproceedings{2017_TriviaQAALargeScaleDistantlySup,
  author    = {Mandar Joshi and
               Eunsol Choi and
               Daniel S. Weld and
               Luke Zettlemoyer},
  editor    = {Regina Barzilay and
               Min-Yen Kan},
  title     = {TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for
               Reading Comprehension},
  booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational
               Linguistics (ACL 2017) Volume 1: Long Papers},
  pages     = {1601--1611},
  publisher = {Association for Computational Linguistics},
  year      = {2017},
  url       = {https://doi.org/10.18653/v1/P17-1147},
  doi       = {10.18653/v1/P17-1147},
}


Author: Daniel S. Weld, Luke Zettlemoyer, Mandar Joshi, Eunsol Choi
Title: TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Year: 2017