2015 SurveyorASystemforGeneratingCoh
- (Jha et al., 2015) ⇒ Rahul Jha, Reed Coke, and Dragomir Radev. (2015). “Surveyor: A System for Generating Coherent Survey Articles for Scientific Topics.” In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
Subject Headings: Scientific Topic Document Summarization, Surveyor System, Coherent Document, Readable Document.
Notes
Cited By
Quotes
Abstract
We investigate the task of generating coherent survey articles for scientific topics. We introduce an extractive summarization algorithm that combines a content model with a discourse model to generate coherent and readable summaries of scientific topics using text from scientific articles relevant to the topic. Human evaluation on 15 topics in computational linguistics shows that our system produces significantly more coherent summaries than previous systems. Specifically, our system improves the ratings for coherence by 36% in human evaluation compared to C-Lexrank, a state of the art system for scientific article summarization.
Introduction
This paper is about generating coherent summaries of scientific topics. Given a set of input papers that are relevant to a specific topic such as question answering, our system called Surveyor extracts and organizes text segments from these papers into a coherent and readable survey of the topic. There are many applications for automated surveys thus generated. Human surveys do not exist for all topics and quickly become outdated in rapidly growing fields like computer science. Therefore, an automated system for this task can be very useful for new graduate students and cross-disciplinary researchers who need to quickly familiarize themselves with a new topic.
Our work builds on previous work on summarization of scientific literature (Mohammad et al. 2009; Qazvinian and Radev 2008). Prior systems for generating survey articles for scientific topics such as C-Lexrank have focused on building informative summaries but no attempt has been made to ensure the coherence and readability of the output summaries. Surveyor on the other hand focuses on generating survey articles that contain well defined subtopics presented in a coherent order. Figure 1 shows part of the output of Surveyor for the topic of question answering. Our experimental results on a corpus of computational linguistics topics show that Surveyor produces survey articles that are substantially more coherent and readable compared to previous work. The main contributions of this paper are:
- We propose a summarization algorithm that combines a content model and a discourse model in a modular way to build coherent summaries.
- We introduce the notion of Minimum Independent Discourse Contexts as a way of flexibly modeling discourse relationships in a summarization system.
- We conducted several experiments for evaluating coherence and informativeness of Surveyor on a dataset of 15 topics in computational linguistics with 297 articles and 30 human-written gold summaries (2 per topic). All data used for our experiments is available at http://clair.si.umich.edu/corpora/surveyor_aaai_15.tgz.
We first give an overview of our summarization approach. This is followed by details about our experimental setup and a discussion of results. Finally, we summarize the related work and conclude the paper with pointers for future work.
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015 SurveyorASystemforGeneratingCoh | Dragomir Radev, Rahul Jha, Reed Coke | | | Surveyor: A System for Generating Coherent Survey Articles for Scientific Topics | | | | | | 2015 |