2009 ExploringContentModelsforMultiD

From GM-RKB
Subject Headings:

Notes

Cited By

Quotes

Abstract

We present an exploration of generative probabilistic models for multi-document summarization. Beginning with a simple word-frequency-based model (Nenkova and Vanderwende, 2005), we construct a sequence of models, each injecting more structure into the representation of document set content and exhibiting ROUGE gains along the way. Our final model, HierSum, utilizes a hierarchical LDA-style model (Blei et al., 2004) to represent content specificity as a hierarchy of topic vocabulary distributions. On the task of producing generic DUC-style summaries, HierSum yields state-of-the-art ROUGE performance and, in pairwise user evaluation, strongly outperforms the state-of-the-art discriminative system of Toutanova et al. (2007). We also explore HierSum's capacity to produce multiple 'topical summaries' in order to facilitate content discovery and navigation.
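The word-frequency starting point mentioned in the abstract can be illustrated with a minimal sketch. It is loosely in the spirit of the frequency-based approach of Nenkova and Vanderwende (2005), not the authors' implementation: the function names (`tokenize`, `frequency_summary`), the regex tokenizer, the sentence-count budget, and the squaring update used to discourage redundancy are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def frequency_summary(sentences, max_sentences=2):
    """Greedily pick sentences whose words are frequent in the input.

    Hypothetical helper: a simplified frequency-based summarizer,
    not the model evaluated in the paper.
    """
    tokens = [tokenize(s) for s in sentences]

    # Unigram probabilities estimated over the whole document set.
    counts = Counter(w for toks in tokens for w in toks)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}

    chosen = []
    remaining = [i for i, toks in enumerate(tokens) if toks]
    while remaining and len(chosen) < max_sentences:
        # Score each candidate by the average probability of its words.
        best = max(
            remaining,
            key=lambda i: sum(prob[w] for w in tokens[i]) / len(tokens[i]),
        )
        chosen.append(best)
        remaining.remove(best)

        # Assumed redundancy control: square the probability of every
        # word already covered, so later picks favor new content.
        for w in set(tokens[best]):
            prob[w] = prob[w] ** 2

    # Sentences are returned in the order they were selected.
    return [sentences[i] for i in chosen]
```

Calling `frequency_summary(docs, max_sentences=2)` on a list of sentence strings returns up to two sentences. Because covered words are down-weighted after each pick, a second sentence that mostly repeats the first tends to lose out to one that introduces new vocabulary.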

References

2009 ExploringContentModelsforMultiD
Author: Lucy Vanderwende, Aria Haghighi
Title: Exploring Content Models for Multi-document Summarization
Year: 2009