2004 LexPageRank
- (Erkan & Radev, 2004) ⇒ Günes Erkan, Dragomir R. Radev. (2004). “LexPageRank: Prestige in Multi-Document Text Summarization.” In: Proceedings of Empirical Methods in Natural Language Processing (EMNLP 2004).
Subject Headings: Multi-Document Extractive Summarization Algorithm.
Notes
- It proposes a Multi-Document Extractive Graph-based Text Summarization Algorithm named LexPageRank.
- It reports that LexPageRank outperforms the Centroid-based Summarization Algorithm on the DUC 2004 Benchmark Task.
Cited By
2007
- (Wan, Yang & Xiao, 2007) ⇒ Xiaojun Wan, Jianwu Yang, and Jianguo Xiao. (2007). “Manifold-Ranking Based Topic-Focused Multi-Document Summarization.” In: Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI 2007).
- QUOTE: Recently, graph-based methods have been proposed to rank sentences or passages. Websumm [Mani and Bloedorn, 2000], LexPageRank (Erkan and Radev, 2004) and Mihalcea and Tarau [2005] are three such systems using algorithms similar to PageRank and HITS to compute sentence importance.
2006
- (Agirre et al., 2006) ⇒ Eneko Agirre, David Martínez, Oier Lopez de Lacalle, Aitor Soroa. (2006). “Two Graph-based Algorithms for State-of-the-Art WSD.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2006).
- QUOTE: Graph-based methods have gained attention in several areas of NLP, including knowledge-based WSD (Mihalcea, 2005; Navigli and Velardi, 2005) and summarization (Erkan and Radev, 2004; Mihalcea and Tarau, 2004).
Quotes
Abstract
Multidocument extractive summarization relies on the concept of sentence centrality to identify the most important sentences in a document. Centrality is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We are now considering an approach for computing sentence importance based on the concept of eigenvector centrality (prestige) that we call LexPageRank. In this model, a sentence connectivity matrix is constructed based on cosine similarity. If the cosine similarity between two sentences exceeds a particular predefined threshold, a corresponding edge is added to the connectivity matrix. We provide an evaluation of our method on DUC 2004 data. The results show that our approach outperforms centroid-based summarization and is quite successful compared to other summarization systems.
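The pipeline described in the abstract (cosine similarity between sentences, a thresholded connectivity matrix, then eigenvector centrality) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `cosine` and `lexpagerank_sketch`, the whitespace tokenizer, the use of raw term counts as sentence vectors, the exclusion of self-links, the damping factor of 0.85, and the default threshold of 0.1 are all assumptions made for illustration; the abstract only states that a predefined threshold is used.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexpagerank_sketch(sentences, threshold=0.1, damping=0.85,
                       max_iters=100, tol=1e-8):
    """Score sentences by PageRank-style centrality on a thresholded
    cosine-similarity graph. All default values are illustrative assumptions."""
    # Assumption: raw term counts over whitespace tokens stand in for
    # whatever term weighting the original system uses.
    vecs = [Counter(s.lower().split()) for s in sentences]
    n = len(vecs)
    if n == 0:
        return []

    # Connectivity: link sentence i to sentence j when their cosine
    # similarity exceeds the threshold (self-links omitted here).
    neighbours = [
        [j for j in range(n)
         if j != i and cosine(vecs[i], vecs[j]) > threshold]
        for i in range(n)
    ]

    # Power iteration with a uniform random-jump term.
    scores = [1.0 / n] * n
    for _ in range(max_iters):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            if neighbours[i]:
                share = damping * scores[i] / len(neighbours[i])
                for j in neighbours[i]:
                    new[j] += share
            else:
                # A sentence with no links spreads its score uniformly.
                share = damping * scores[i] / n
                for j in range(n):
                    new[j] += share
        delta = sum(abs(x - y) for x, y in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "the cat sat on the rug",
        "stock prices fell sharply today",
    ]
    for sentence, score in zip(docs, lexpagerank_sketch(docs)):
        print(f"{score:.3f}  {sentence}")
```

In this toy input the three overlapping sentences link to one another and the unrelated sentence has no links, so it should receive the lowest score. An extractive summarizer would then select the top-scoring sentences, subject to a length limit.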
References
- Ron Brandow, Karl Mitze, and Lisa F. Rau. (1995). Automatic condensation of electronic publications by sentence selection. Information Processing and Management, 31(5):675–685.
- Jaime Carbonell and Jade Goldstein. (1998). The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Research and Development in Information Retrieval, pages 335–336.
- Chin-Yew Lin and E.H. Hovy. (2003). Automatic evaluation of summaries using n-gram co-occurrence. In: Proceedings of 2003 Language Technology Conference (HLT-NAACL 2003), Edmonton, Canada, May 27 - June 1.
- L. Page, S. Brin, Rajeev Motwani, and T. Winograd. (1998). The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University, Stanford, CA.
- Dragomir Radev, Hongyan Jing, and Malgorzata Budzikowska. (2000). Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. In ANLP/NAACL Workshop on Summarization, Seattle, WA, April.
- Dragomir Radev, Sasha Blair-Goldensohn, and Zhu Zhang. (2001). Experiments in single and multidocument summarization using MEAD. In First Document Understanding Conference, New Orleans, LA, September.
- Dragomir Radev. (2000). A common theory of information fusion from multiple text sources, step one: Cross-document structure. In: Proceedings, 1st ACL SIGDIAL Workshop on Discourse and Dialogue, Hong Kong, October.